Operational Modes
Overview
This advanced guide explains the two underlying technical approaches that power Pay-i's tracking and monitoring capabilities. Most users will simply use Pay-i Instrumentation without needing to understand these details, but this information may be valuable for advanced users, architects, and those with special requirements.
Two Technical Approaches
At a fundamental level, Pay-i offers two distinct technical approaches for processing GenAI API calls:
1. Direct Provider Call with Telemetry
The standard and recommended approach is direct provider calls with asynchronous telemetry:
- Your application communicates directly with the GenAI provider
- Pay-i's SDK hooks into these calls to capture and report usage data asynchronously
- This instrumentation happens behind the scenes with minimal code changes
- In Python applications, the Python SDK submits telemetry data to Pay-i internally
- For non-Python applications, you need to call the REST Ingest API directly
- All your provider-specific features, authentication, and workflows remain unchanged
This approach offers the most straightforward integration with minimal disruption to existing architectures, while providing comprehensive tracking and analytics. It has no impact on request latency, as telemetry occurs asynchronously.
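The asynchronous pattern described above can be sketched in plain Python. This is a minimal illustration of why background telemetry does not delay the provider response, not Pay-i's actual implementation: `TelemetryReporter`, `call_provider`, and `slow_submit` are hypothetical names invented for this example, and the provider call is a placeholder.

```python
import queue
import threading
import time


class TelemetryReporter:
    """Hypothetical stand-in for an SDK's background telemetry submitter."""

    def __init__(self, submit):
        self._submit = submit  # callable that delivers one usage record
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, usage):
        # Enqueue and return immediately; the caller never waits on delivery.
        self._queue.put(usage)

    def _drain(self):
        # Runs on the background thread and delivers records one at a time.
        while True:
            usage = self._queue.get()
            try:
                self._submit(usage)
            finally:
                self._queue.task_done()

    def flush(self):
        # Block until every queued record has been submitted.
        self._queue.join()


def call_provider(prompt):
    # Placeholder for a direct call to the GenAI provider (assumption).
    return {"text": prompt.upper(), "input_tokens": len(prompt.split()), "output_tokens": 1}


submitted = []


def slow_submit(usage):
    time.sleep(0.05)  # simulate network latency to a telemetry endpoint
    submitted.append(usage)


reporter = TelemetryReporter(slow_submit)


def instrumented_call(prompt):
    response = call_provider(prompt)
    reporter.record({
        "input_tokens": response["input_tokens"],
        "output_tokens": response["output_tokens"],
    })
    return response  # returned unchanged, before telemetry is delivered


result = instrumented_call("hello world")
reporter.flush()  # only needed here so the example can inspect `submitted`
```

The caller receives the provider response as soon as the provider returns; the simulated 50 ms submission happens on the worker thread.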
2. Proxy Routing
The alternative approach routes calls through Pay-i:
- Your application sends the API call to Pay-i, which forwards it to the provider and returns the results
- Pay-i sits in the middle of the request/response flow
- Responses include additional data like xproxy_result showing real-time costs
- This approach can enforce Block limits that prevent requests from being sent to providers when a limit is exceeded
This approach adds a small amount of latency (typically 10-50ms) but enables real-time budget enforcement through Block limits.
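The request flow above can be sketched as a small in-process simulation. It is an illustration under stated assumptions, not Pay-i's proxy: `BlockLimit`, `LimitExceeded`, `provider`, and `proxy_call` are hypothetical names, the costs are made up, and the shape of the `xproxy_result` dictionary shown here is assumed rather than taken from the Pay-i API.

```python
from dataclasses import dataclass


@dataclass
class BlockLimit:
    """Hypothetical budget with a hard cap."""
    max_cost: float
    spent: float = 0.0

    def exceeded(self):
        return self.spent >= self.max_cost


class LimitExceeded(Exception):
    """Raised when a request is blocked before reaching the provider."""


forwarded = []  # prompts that actually reached the placeholder provider


def provider(prompt):
    # Placeholder provider with a made-up fixed cost per call.
    forwarded.append(prompt)
    return {"text": f"echo: {prompt}", "cost": 0.5}


def proxy_call(prompt, limit, forward=provider):
    # Check the limit first, so a blocked request is never forwarded.
    if limit.exceeded():
        raise LimitExceeded(f"spent {limit.spent} of {limit.max_cost}")
    response = forward(prompt)
    limit.spent += response["cost"]
    # Attach cost data to the response (field layout is an assumption).
    return {
        "text": response["text"],
        "xproxy_result": {"cost": response["cost"], "limit_spent": limit.spent},
    }


limit = BlockLimit(max_cost=1.0)
first = proxy_call("one", limit)
second = proxy_call("two", limit)

try:
    proxy_call("three", limit)
    blocked = False
except LimitExceeded:
    blocked = True
```

Because the proxy sits in the request path, it can refuse the third call before any provider cost is incurred. This is the capability that asynchronous telemetry cannot offer.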
Technical Implementation
The approach you use is determined by how you configure your application:
Aspect | Direct Provider Call with Telemetry | Proxy Routing |
---|---|---|
SDK Initialization | payi_instrument() | payi_instrument(config={"proxy": True}) |
API Client Configuration | Normal provider endpoints | Points to Pay-i URLs (e.g., payi_openai_url()) |
Request Headers | Optional extra_headers with tracking annotations | Required proxy headers (xProxy-api-key, etc.) + optional extra_headers with tracking annotations |
Response Handling | Standard provider responses | Responses include xproxy_result with cost data |
Decorator Usage | @ingest decorator | @proxy decorator |
When to Use Each Approach
While the Direct Provider Call with Telemetry is our recommended default for most scenarios, there are specific situations where proxy routing is necessary:
Feature/Requirement | Direct Provider Call with Telemetry | Proxy Routing |
---|---|---|
Track usage and costs | ✓ | ✓ |
Tag requests with business context | ✓ | ✓ |
Apply Allow limits (track but don't block) | ✓ | ✓ |
Apply Block limits (prevent over-budget calls) | ✗ | ✓ |
Real-time cost visibility in responses | ✗ | ✓ |
Minimal architecture changes | ✓ | ✗ |
Zero added latency | ✓ | ✗ |
If you need to enforce Block limits that prevent requests from being sent when a budget is exceeded, you must use the proxy routing approach. For detailed implementation, see Pay-i Proxy Configuration.
IMPORTANT: Choose either Direct Provider Call with Telemetry or Proxy Routing for your application - do not mix them within the same workflow. Using both approaches for the same API calls will cause double-counting. Configure your application consistently with one approach or the other.
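To make the double-counting risk concrete, the sketch below records the same request through both paths and compares the reported total with the true spend. The event layout, including the `request_id` and `source` fields, is hypothetical and exists only for this illustration.

```python
from collections import Counter


def total_cost(events):
    # Sum every recorded event, as a naive usage report would.
    return sum(event["cost"] for event in events)


def find_double_counted(events):
    # Request IDs that were recorded more than once.
    counts = Counter(event["request_id"] for event in events)
    return sorted(rid for rid, n in counts.items() if n > 1)


events = [
    # req-1 was sent through the proxy AND reported via telemetry.
    {"request_id": "req-1", "source": "telemetry", "cost": 0.25},
    {"request_id": "req-1", "source": "proxy", "cost": 0.25},
    {"request_id": "req-2", "source": "telemetry", "cost": 0.5},
]

reported = total_cost(events)  # inflated by the duplicate record
unique = {event["request_id"]: event["cost"] for event in events}
actual = sum(unique.values())  # one cost per distinct request
duplicates = find_double_counted(events)
```

Here the reported total is 1.0 while the true spend is 0.75, because `req-1` is counted once per path.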
Decorator Considerations
When using the @ingest decorator with Direct Provider Call with Telemetry:
- The decorator first executes your function (which makes a direct provider call)
- Then it automatically calls the Ingest API behind the scenes to submit data
- The xproxy_result object (which contains cost data) is not currently returned or easily accessible by code calling the @ingest-decorated function
If you need access to real-time cost data with Direct Provider Call with Telemetry, please contact [email protected] for assistance.
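The decorator behavior described in this section can be sketched as follows. This is not the real @ingest decorator: `ingest_like` and `submit_to_ingest` are hypothetical stand-ins, and the cost formula is invented. The sketch only shows why a caller of the decorated function never sees the cost data.

```python
import functools

ingested = []  # stands in for records sent to an ingest endpoint


def submit_to_ingest(record):
    # Hypothetical submission that returns a cost result (made-up pricing).
    ingested.append(record)
    return {"cost": 0.01 * record["tokens"]}


def ingest_like(func):
    """Illustrative decorator: run the function, then submit usage data."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Step 1: execute the wrapped function (the direct provider call).
        response = func(*args, **kwargs)
        # Step 2: submit usage data; the cost result stays inside the wrapper.
        _cost_result = submit_to_ingest({"tokens": response["tokens"]})
        # Step 3: return the provider response unchanged, without cost data.
        return response

    return wrapper


@ingest_like
def ask(prompt):
    # Placeholder for a direct provider call.
    return {"text": prompt[::-1], "tokens": len(prompt)}


reply = ask("abc")
```

The caller gets back exactly what the wrapped function returned; the cost result exists only as a local variable inside the wrapper.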
Additional Considerations
Privacy and Security
When using the proxy routing approach:
- Pay-i never stores any data from the API calls that it proxies unless you explicitly opt in
- Any stored data is encrypted at rest and in transit
- Pay-i maintains security standards for data handling and transmission
For organizations with strict compliance or security requirements, Pay-i also offers a private deployment option that can be used with either approach. Contact [email protected] for more information about private deployments.
Related Resources
- Auto-Instrumentation - The standard way to configure Pay-i
- Proxy Configuration - Setup instructions for proxy routing (for Block limits)
- Custom Instrumentation - Adding business context to your tracking
- Manual Event Submission - Submitting custom events via API (both Python SDK and REST API options available)