Pay-i Proxy Configuration
Overview
This guide explains how to configure Pay-i to route your GenAI API calls through its proxy service. While the standard Pay-i Instrumentation is sufficient for most use cases, proxy configuration is required when you need to implement Block limits that prevent requests from being sent to providers when a budget is exceeded.
When to Use Proxy Configuration
Pay-i's proxy configuration should be used when:
- You need to implement Block limits to prevent API calls when budgets are exceeded
- You want real-time cost visibility within API responses
- You need to enforce spending limits directly at the API call level
If you're only tracking usage, applying Allow limits, or adding business context to your calls, the standard Pay-i Instrumentation approach is recommended as it's simpler to implement and adds no latency to your requests.
How it Works
When configured to use Pay-i as a proxy:
- Your application sends API requests to Pay-i instead of directly to the provider
- Pay-i receives the request and checks if any applicable Block limits are in "overrun" or "blocked" state
- If limits are in "ok" or "exceeded" state (spend <= max), Pay-i forwards the request to the provider
- Pay-i receives the response from the provider, augments it with cost information, and returns it to your application
- If any Block limits are in "overrun" or "blocked" state (spend > max), Pay-i prevents the request from reaching the provider and returns an error response
Important: A common misconception is that the "exceeded" state means requests are blocked. This is incorrect. The "exceeded" state only indicates that spending has reached the threshold but is still under or equal to the max value. For Block limits, only the "overrun" state (when spend > max) or "blocked" state (subsequent requests after hitting overrun) will prevent requests from reaching the provider.
This approach allows Pay-i to enforce spending constraints in real-time, before costs are incurred.
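The forwarding rule described above can be sketched as a small pure function. This is only an illustration of the decision logic, not Pay-i's implementation: the function name `should_forward` and the `state` field are hypothetical, while the state strings and the Block/Allow limit types come from this guide.

```python
# Illustrative sketch only; should_forward and the "state" key are hypothetical.
# The state names ("ok", "exceeded", "overrun", "blocked") are the ones this guide describes.
BLOCKING_STATES = {"overrun", "blocked"}


def should_forward(limits):
    """Return True when no Block limit is in a blocking state."""
    for limit in limits:
        if limit["limit_type"] == "Block" and limit["state"] in BLOCKING_STATES:
            return False
    return True


limits = [
    # "exceeded" means spend <= max, so this Block limit does not stop the request.
    {"name": "TeamBudget", "limit_type": "Block", "state": "exceeded"},
    # Allow limits never stop a request, whatever their state.
    {"name": "Experiment", "limit_type": "Allow", "state": "overrun"},
]
print(should_forward(limits))  # True
```

Changing the first limit's state to "overrun" or "blocked" would make the function return False, which corresponds to the proxy returning an error instead of contacting the provider.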
Configuring Proxy Mode
Setting up proxy mode involves two main steps:
- Initialize Pay-i instrumentation with proxy mode enabled
- Configure your provider client to route requests through Pay-i
1. Initialize Instrumentation with Proxy Mode
First, initialize Pay-i instrumentation with proxy mode explicitly enabled:
from payi.lib.instrument import payi_instrument
# Initialize Pay-i instrumentation with proxy mode
payi_instrument(config={"proxy": True})
2. Configure Provider Client
Next, configure your provider client to route requests through Pay-i instead of directly to the provider. Here's an example with OpenAI:
import os
from openai import OpenAI
from payi.lib.helpers import payi_openai_url
# Configure OpenAI client to use Pay-i as a proxy
client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=payi_openai_url(),  # Uses Pay-i's URL instead of OpenAI's
    default_headers={"xProxy-api-key": os.getenv("PAYI_API_KEY")},  # Authenticate with Pay-i
)
For detailed configuration instructions for other providers, see the provider-specific guides. Each provider has unique requirements for authentication and configuration when using proxy mode.
LangChain Note
Important: LangChain integration with Pay-i uses a callback handler approach and does not support proxy configuration or Block limits. For LangChain integration, see the LangChain Configuration guide.
Helper Functions
Pay-i provides URL helper functions to simplify configuration:
Helper Function | Description |
---|---|
payi_openai_url() | Generates the correct proxy URL for OpenAI |
payi_azure_openai_url() | Generates the correct proxy URL for Azure OpenAI |
payi_anthropic_url() | Generates the correct proxy URL for Anthropic |
payi_aws_bedrock_url() | Generates the correct proxy URL for AWS Bedrock |
These helpers automatically use the PAYI_BASE_URL environment variable if set, or default to the standard Pay-i API endpoint.
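The environment-variable fallback can be pictured with a short sketch. The function `resolve_base_url` and the placeholder default below are hypothetical and are not part of the payi package; only the PAYI_BASE_URL variable name comes from this guide.

```python
import os

# Placeholder value for illustration; this is NOT the real Pay-i endpoint.
HYPOTHETICAL_DEFAULT_BASE_URL = "https://payi.example.invalid"


def resolve_base_url(environ=os.environ, default=HYPOTHETICAL_DEFAULT_BASE_URL):
    """Prefer PAYI_BASE_URL when it is set and non-empty; otherwise use the default."""
    value = environ.get("PAYI_BASE_URL")
    return value if value else default


# A plain dict stands in for the process environment to keep the example deterministic.
print(resolve_base_url({}))
print(resolve_base_url({"PAYI_BASE_URL": "https://payi.internal.example"}))
```

In practice this means setting PAYI_BASE_URL before the helpers are called is enough to point every provider client at a different Pay-i endpoint.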
Request and Response Differences
When using proxy configuration, requests and responses differ from standard instrumentation:
Request Headers
Requests must include special headers for the proxy to work correctly:
- xProxy-api-key: Your Pay-i API key (required)
- Provider-specific headers (described in the provider-specific guides)
You can still include any business context annotations in extra_headers like you would with standard instrumentation.
Response Format
The most significant difference is in the response format. With proxy routing, responses are augmented with an xproxy_result object that contains real-time information about the request:
{
  "xproxy_result": {
    "cost": 0.00342,
    "limits": [
      {
        "limit_id": "your-limit-id",
        "name": "Example Limit",
        "limit_type": "Allow",
        "current": 0.45,
        "max": 10.0,
        "remaining": 9.55
      }
    ]
  }
}
This object provides immediate visibility into:
- The exact cost of the current API call
- The status of any limits applied to the request
- How much budget remains for each limit
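As a minimal sketch of reading these fields, the snippet below parses a body shaped like the sample xproxy_result object and extracts the cost and per-limit remaining budget. It assumes the response is available as a JSON-decoded dictionary; how a particular provider SDK exposes the field is not covered here, and `summarize_limits` is a hypothetical helper name.

```python
import json

# Body shaped like the sample xproxy_result object shown in this guide.
body = json.loads("""
{
  "xproxy_result": {
    "cost": 0.00342,
    "limits": [
      {
        "limit_id": "your-limit-id",
        "name": "Example Limit",
        "limit_type": "Allow",
        "current": 0.45,
        "max": 10.0,
        "remaining": 9.55
      }
    ]
  }
}
""")


def summarize_limits(response_body):
    """Return (cost, [(name, limit_type, remaining), ...]) from a decoded body."""
    result = response_body.get("xproxy_result", {})
    rows = [
        (limit["name"], limit["limit_type"], limit["remaining"])
        for limit in result.get("limits", [])
    ]
    return result.get("cost"), rows


cost, rows = summarize_limits(body)
print(cost)  # 0.00342
print(rows)  # [('Example Limit', 'Allow', 9.55)]
```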
Error Handling
When a Block limit is in "overrun" or "blocked" state (spending has gone over the max value), the proxy returns an error response instead of forwarding the request to the provider:
{
  "error": {
    "message": "Limit exceeded: ProjectBudget",
    "type": "LimitExceeded",
    "code": 429,
    "limit_id": "your-limit-id"
  }
}
For details on handling proxy errors, see Handling Errors.
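One way to recognize this error in application code is sketched below. It assumes the error body has already been decoded into a dictionary with the shape of the sample above; `blocked_limit_id` is a hypothetical helper, and the way your provider SDK surfaces the error body (for example, through an exception) may differ.

```python
def blocked_limit_id(response_body):
    """Return the limit_id when the body is a LimitExceeded error, else None."""
    error = response_body.get("error")
    if isinstance(error, dict) and error.get("type") == "LimitExceeded":
        return error.get("limit_id")
    return None


# Body shaped like the sample error response shown in this guide.
error_body = {
    "error": {
        "message": "Limit exceeded: ProjectBudget",
        "type": "LimitExceeded",
        "code": 429,
        "limit_id": "your-limit-id",
    }
}

print(blocked_limit_id(error_body))  # your-limit-id
print(blocked_limit_id({"xproxy_result": {"cost": 0.00342}}))  # None
```

Returning the limit ID rather than a boolean lets the caller report which budget stopped the request.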
Important Considerations
Mixing Warning
IMPORTANT: Do not mix proxy configuration and standard instrumentation for the same API calls. Using both approaches for the same requests will cause double-counting. Configure your application consistently with one approach or the other for a given workflow.
Performance Impact
Proxy routing adds a small amount of latency (typically 10-50ms) to each request. While this is negligible compared to the typical latency of GenAI API calls (often 1000+ ms), it's something to be aware of in high-performance applications.
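To put those figures in proportion, the arithmetic below computes the relative overhead for the quoted range against a 1000 ms provider call. The function name is hypothetical and the numbers are simply the ones stated above, not measurements.

```python
def relative_overhead_percent(overhead_ms, base_ms):
    """Proxy latency as a percentage of the provider call's own latency."""
    return 100 * overhead_ms / base_ms


low = relative_overhead_percent(10, 1000)
high = relative_overhead_percent(50, 1000)
print(low, high)  # 1.0 5.0
```

For a 1000 ms call, the stated range therefore corresponds to roughly 1% to 5% of added latency; the share shrinks further for slower calls.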
Related Resources
- Auto-Instrumentation - Standard instrumentation approach
- Operational Approaches - Advanced explanation of the underlying technical approaches
- Handling Successes - Working with the xproxy_result object
- Handling Errors - Managing error responses in proxy mode