Operational Modes

Overview

This advanced guide explains the two underlying technical approaches that power Pay-i's tracking and monitoring capabilities. Most users will simply use Pay-i Instrumentation without needing to understand these details, but this information may be valuable for advanced users, architects, and those with special requirements.

Two Technical Approaches

At a fundamental level, Pay-i offers two distinct technical approaches for processing GenAI API calls:

1. Direct Provider Call with Telemetry

The standard and recommended approach is direct provider calls with asynchronous telemetry:

  • Your application communicates directly with the GenAI provider
  • Pay-i's SDK hooks into these calls to capture and report usage data asynchronously
  • This instrumentation happens behind the scenes with minimal code changes
  • In Python applications, the Python SDK submits telemetry data to Pay-i internally
  • For non-Python applications, you would need to implement calls to the REST Ingest API directly
  • All your provider-specific features, authentication, and workflows remain unchanged

This approach offers the most straightforward integration with minimal disruption to existing architectures, while providing comprehensive tracking and analytics. It has no impact on request latency, as telemetry occurs asynchronously.
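To make the "asynchronous telemetry" idea concrete, here is a minimal sketch of the general pattern, not the Pay-i SDK's actual implementation. It assumes a background worker thread and a queue. The names call_provider, submit_usage, and tracked_call are hypothetical stand-ins invented for this illustration.

```python
import queue
import threading
import time

# Hypothetical stand-ins: none of these names come from the Pay-i SDK.
submitted = []  # usage records the pretend telemetry backend has received


def call_provider(prompt):
    """Stand-in for a direct call to a GenAI provider."""
    return {
        "text": prompt.upper(),
        "input_tokens": len(prompt.split()),
        "output_tokens": 3,
    }


def submit_usage(record):
    """Stand-in for a telemetry submission; deliberately slow."""
    time.sleep(0.05)
    submitted.append(record)


telemetry_queue = queue.Queue()


def _telemetry_worker():
    # Drains the queue in the background so callers never wait on telemetry.
    while True:
        record = telemetry_queue.get()
        try:
            submit_usage(record)
        finally:
            telemetry_queue.task_done()


threading.Thread(target=_telemetry_worker, daemon=True).start()


def tracked_call(prompt):
    """Call the provider directly, then enqueue usage data without blocking."""
    response = call_provider(prompt)
    telemetry_queue.put(
        {
            "input_tokens": response["input_tokens"],
            "output_tokens": response["output_tokens"],
        }
    )
    return response  # returned before the usage record is submitted
```

Because tracked_call returns as soon as the record is enqueued, the slow submission never sits on the request path. Calling telemetry_queue.join() before shutdown waits for any pending records to be flushed.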

2. Proxy Routing

The alternative approach routes calls through Pay-i:

  • Your application sends the API call to Pay-i, which forwards it to the provider and returns the results
  • Pay-i sits in the middle of the request/response flow
  • Responses include additional data, such as xproxy_result, which shows real-time costs
  • This approach can enforce Block limits that prevent requests from being sent to providers when a limit is exceeded

This approach adds a small amount of latency (typically 10-50ms) but enables real-time budget enforcement through Block limits.
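The following sketch illustrates why sitting in the request path makes Block limits enforceable: the check happens before anything is forwarded. It is a toy model under stated assumptions, not Pay-i's proxy. BlockLimit, LimitExceeded, forward_to_provider, and proxy_call are hypothetical names, and the fields shown inside xproxy_result are invented for the example.

```python
from dataclasses import dataclass

provider_calls = []  # prompts that actually reached the pretend provider


class LimitExceeded(Exception):
    """Raised when a Block limit stops a request before it is forwarded."""


@dataclass
class BlockLimit:
    max_cents: int
    spent_cents: int = 0


def forward_to_provider(prompt):
    """Stand-in for the provider; every call costs a fixed 40 cents."""
    provider_calls.append(prompt)
    return {"text": prompt[::-1], "cost_cents": 40}


def proxy_call(prompt, limit):
    """Check the limit first, then forward and attach cost data."""
    if limit.spent_cents >= limit.max_cents:
        # The request never leaves the proxy, so no provider cost is incurred.
        raise LimitExceeded("block limit reached; request not forwarded")
    response = forward_to_provider(prompt)
    limit.spent_cents += response["cost_cents"]
    return {
        "text": response["text"],
        # Assumed shape; the real xproxy_result fields are not shown here.
        "xproxy_result": {
            "cost_cents": response["cost_cents"],
            "limit_spent_cents": limit.spent_cents,
        },
    }
```

In this model a limit is checked against spend recorded so far, so the request that crosses the threshold still goes through and the next one is blocked. Whether Pay-i evaluates limits the same way is not stated in this guide.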

Technical Implementation

The approach you use is determined by how you configure your application:

| Aspect | Direct Provider Call with Telemetry | Proxy Routing |
| --- | --- | --- |
| SDK Initialization | payi_instrument() | payi_instrument(config={"proxy": True}) |
| API Client Configuration | Normal provider endpoints | Points to Pay-i URLs (e.g., payi_openai_url()) |
| Request Headers | Optional extra_headers with tracking annotations | Required proxy headers (xProxy-api-key, etc.) + optional extra_headers with tracking annotations |
| Response Handling | Standard provider responses | Responses include xproxy_result with cost data |
| Decorator Usage | @ingest decorator | @proxy decorator |
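One way to keep these settings consistent is to derive them from a single mode value. The sketch below is a hypothetical helper, not part of the Pay-i SDK. It encodes only what the comparison above states: the payi_instrument() arguments for each mode and the xProxy-api-key header that proxy routing requires. Any other required proxy headers are not listed in this guide and are left out.

```python
from enum import Enum


class Mode(Enum):
    TELEMETRY = "telemetry"
    PROXY = "proxy"


def instrument_kwargs(mode):
    """Keyword arguments to pass to payi_instrument() for the given mode."""
    if mode is Mode.PROXY:
        return {"config": {"proxy": True}}
    return {}


def missing_proxy_headers(mode, headers):
    """Return required header names absent from `headers` (case-insensitive).

    Only xProxy-api-key is named in this guide; the list is an assumption
    and may be incomplete.
    """
    required = ["xProxy-api-key"] if mode is Mode.PROXY else []
    present = {name.lower() for name in headers}
    return [name for name in required if name.lower() not in present]


# Intended use (requires the Pay-i SDK, so it is shown as a comment only):
#   payi_instrument(**instrument_kwargs(Mode.PROXY))
```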

When to Use Each Approach

Direct Provider Call with Telemetry is the recommended default for most scenarios, but there are specific situations where Proxy Routing is necessary:

| Feature/Requirement | Direct Provider Call with Telemetry | Proxy Routing |
| --- | --- | --- |
| Track usage and costs | Yes | Yes |
| Tag requests with business context | Yes | Yes |
| Apply Allow limits (track but don't block) | Yes | Yes |
| Apply Block limits (prevent over-budget calls) | No | Yes |
| Real-time cost visibility in responses | No | Yes |
| Minimal architecture changes | Yes | No |
| Zero added latency | Yes | No |

If you need to enforce Block limits that prevent requests from being sent when a budget is exceeded, you must use the proxy routing approach. For detailed implementation, see Pay-i Proxy Configuration.

IMPORTANT: Choose either Direct Provider Call with Telemetry or Proxy Routing for your application, and do not mix them within the same workflow. Using both approaches for the same API calls causes usage and costs to be double-counted. Configure your application consistently with one approach or the other.
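A small worked example shows how double-counting arises when one call is reported through both paths. The event records, the request_id field, and the helper functions are all hypothetical. They illustrate the arithmetic only and do not reflect how Pay-i stores usage data.

```python
from collections import Counter


def total_cost_cents(events):
    """Sum the reported cost across all usage events."""
    return sum(event["cost_cents"] for event in events)


def double_reported(events):
    """Return request IDs that were reported more than once."""
    counts = Counter(event["request_id"] for event in events)
    return sorted(rid for rid, n in counts.items() if n > 1)


# One application, two provider calls, a single consistent mode.
consistent = [
    {"request_id": "req-1", "source": "telemetry", "cost_cents": 40},
    {"request_id": "req-2", "source": "telemetry", "cost_cents": 25},
]

# The same two calls, but req-1 was also routed through the proxy.
mixed = consistent + [
    {"request_id": "req-1", "source": "proxy", "cost_cents": 40},
]
```

Here total_cost_cents(consistent) is 65, while total_cost_cents(mixed) is 105 even though the provider billed for the same two calls. double_reported(mixed) identifies req-1 as the duplicate.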

Decorator Considerations

When using the @ingest decorator with Direct Provider Call with Telemetry:

  • The decorator first executes your function (which makes a direct provider call)
  • Then it automatically calls the Ingest API behind the scenes to submit data
  • The xproxy_result object (which contains cost data) is not currently returned or easily accessible by code calling the @ingest-decorated function

If you need access to real-time cost data with Direct Provider Call with Telemetry, please contact the Pay-i team for assistance.
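The decorator behavior described in this section can be sketched as follows. This is a simplified model rather than the @ingest implementation: ingest_like, submit_usage, and the pricing rule are hypothetical names and values chosen for the example. It shows why the caller sees only the provider response.

```python
import functools

captured = []  # cost results produced by the pretend telemetry submission


def submit_usage(record):
    """Stand-in for a telemetry submission that returns cost data."""
    cost = {"cost_cents": record["tokens"] * 2}  # invented pricing rule
    captured.append(cost)
    return cost


def ingest_like(func):
    """Run the wrapped function, then report usage behind the scenes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        response = func(*args, **kwargs)  # 1. make the direct provider call
        submit_usage({"tokens": response["tokens"]})  # 2. report; result discarded
        return response  # 3. the caller receives only the provider response

    return wrapper


@ingest_like
def summarize(text):
    """Stand-in for a function that calls a provider directly."""
    return {"text": text[:10], "tokens": len(text.split())}
```

Because the wrapper discards what submit_usage returns, code calling summarize has no handle on the cost data, which mirrors the limitation noted in the list above.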

Additional Considerations

Privacy and Security

When using the proxy routing approach:

  • Pay-i never stores any data from the API calls that it proxies unless you explicitly opt in
  • Any stored data is encrypted at rest and in transit
  • Pay-i maintains security standards for data handling and transmission

For organizations with strict compliance or security requirements, Pay-i also offers a private deployment option that can be used with either approach. Contact the Pay-i team for more information about private deployments.
