Manual Event Submission (Ingest API)
Overview
The Ingest API (accessed via the /requests/ingest endpoint) provides a mechanism for explicitly submitting event data to Pay-i after a provider interaction has occurred or for non-provider events. While Direct Provider Call instrumentation uses the Ingest API under the hood (via the @ingest decorator), this documentation covers how to use the Ingest API directly. Only Proxy Routing uses in-context logging during the request flow.
The Ingest API is particularly valuable for these key use cases:
- Tracking Custom Resources - Submit usage data for custom resources not directly supported by Pay-i
- Integration-Resistant Providers - Submit data for providers that don't work well with automatic instrumentation (e.g., AWS Bedrock in some scenarios)
- Non-Python Clients - Submit usage from applications written in languages without Pay-i SDK support
- Submitting Historical Data - Record individual past events for complete usage visibility
- External System Reconciliation - Align Pay-i data with other tracking systems
When using the Ingest API directly, the call happens after the provider interaction has completed, meaning your service already has all the necessary usage data (tokens, latency metrics, etc.) available.
Important: When working with streaming responses, ensure you've read the stream to the end before submitting usage data. Pay-i needs the complete token information to accurately track usage and calculate costs. If you don't read the entire stream, you'll have incomplete data for submission.
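As a minimal sketch of the point above, the helper below drains a stream completely before deriving the units to submit. The chunk shape (dicts with a `text` key and an optional `usage` key holding `input_tokens` and `output_tokens`) and both function names are hypothetical stand-ins, not any provider's real streaming format.

```python
from __future__ import annotations

from typing import Iterable, Optional


def drain_stream(chunks: Iterable[dict]) -> tuple[str, Optional[dict]]:
    """Consume every chunk and return (full_text, final_usage).

    Hypothetical chunk shape: {"text": str, "usage": {...}} where "usage"
    may be absent. Some providers attach usage only to a late chunk, so
    stopping early can leave usage as None.
    """
    parts: list[str] = []
    usage: Optional[dict] = None
    for chunk in chunks:
        parts.append(chunk.get("text", ""))
        if chunk.get("usage") is not None:
            usage = chunk["usage"]  # keep the most recent usage report
    return "".join(parts), usage


def units_from_usage(usage: Optional[dict]) -> dict:
    """Map the final usage report onto the `units` structure of the Ingest body."""
    if usage is None:
        raise ValueError("stream ended without usage data; nothing to submit")
    return {
        "text": {
            "input": usage["input_tokens"],
            "output": usage["output_tokens"],
        }
    }
```

Raising when no usage was seen is a deliberate choice here: submitting an event with guessed token counts would produce a misleading cost record.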
Manual event submission supports all the same features as standard Pay-i Instrumentation, except for Block limits. This is because Pay-i cannot block requests it doesn't directly proxy. Pay-i will still calculate the costs based on the reported units and associate the data with any provided tags, use cases, and Allow limits for tracking and analysis.
When the Ingest API is called, it returns an xproxy_result containing cost information and limit status.
Ingest Fields
The Ingest API takes the following inputs to submit an event:
# | Input | Description | Body/Header | Required |
---|---|---|---|---|
1 | Category | The Category of the Resource used in the request. | Body | Y |
2 | Resource | The request Resource, used to calculate pricing. | Body | Y |
3 | Units | The number of input and output units for each of the request's Unit Types. | Body | Y |
4 | E2EL | End to end latency of the request. | Body | N |
5 | TTFT | The time to first token, which is equivalent to the E2EL for non-streaming scenarios. If both E2EL and TTFT are provided, then the Inter-Token Latency (ITL) and Output Tokens Per Second (OTPS) will automatically be calculated. | Body | N |
6 | Provider URI | The endpoint used for the request, shown in DevOps views. | Body | N |
7 | Request Prompt | The JSON sent to the Provider as part of the request. If Logging is disabled for this application, this will not be saved. | Body | N |
8 | Request Headers | An array of header names and values that were sent to the Provider. | Body | N |
9 | Provider Response | The JSON returned from the Provider after the request has completed. If Logging is disabled for this application, this will not be saved. | Body | N |
10 | Response Headers | An array of header names and values that were received from the Provider with the response. | Body | N |
11 | event_timestamp | The ISO 8601 timestamp of when the request was sent to the provider. If not provided, the current time is used. Refer to event_timestamp for more details. | Body | N |
12 | HTTP Status Code | The status code of the HTTP request to the Provider, e.g., "200" or "400", used for failure tracking. | Body | N |
13 | Properties | Coming soon | Body | N |
14 | Limit IDs | Comma separated list of limit-ids to associate with the request. Note that blocking limits are not supported and will result in an error. | Header | N |
15 | Request Tags | Comma separated list of request tags to associate with the request. | Header | N |
16 | User ID | The user-id associated with the request. | Header | N |
17 | Use Case ID | The use-case-id associated with the request. | Header | N |
18 | Use Case Name | The name of the use-case-type used in the request. As with instrumented requests, if a use_case_name is provided and a use-case-id is not, then one will automatically be generated. | Header | N |
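The split between body and header inputs in the table can be sketched as a small helper. The body keys (`category`, `resource`, `units`, `end_to_end_latency_ms`, and so on) follow the Ingest Example in this page. The header names in `HEADER_NAMES` and the function name `build_ingest_request` are hypothetical placeholders, since the table does not state the actual header names; consult the API reference for the real ones.

```python
import json

# Hypothetical header names, for illustration only. The table says these
# inputs are sent as headers but does not name them.
HEADER_NAMES = {
    "limit_ids": "X-Example-Limit-IDs",
    "request_tags": "X-Example-Request-Tags",
    "user_id": "X-Example-User-ID",
    "use_case_id": "X-Example-Use-Case-ID",
    "use_case_name": "X-Example-Use-Case-Name",
}


def build_ingest_request(category, resource, units, *, limit_ids=None,
                         request_tags=None, user_id=None, use_case_id=None,
                         use_case_name=None, **optional_body):
    """Return (json_body, headers) for a single event."""
    # Category, Resource, and Units are the only required inputs.
    if not (category and resource and units):
        raise ValueError("category, resource, and units are required")

    body = {"category": category, "resource": resource, "units": units}
    # Optional body fields (e.g. end_to_end_latency_ms) are included only
    # when a value was actually supplied.
    body.update({k: v for k, v in optional_body.items() if v is not None})

    header_values = {
        # Limit IDs and Request Tags are comma separated lists.
        "limit_ids": ",".join(limit_ids) if limit_ids else None,
        "request_tags": ",".join(request_tags) if request_tags else None,
        "user_id": user_id,
        "use_case_id": use_case_id,
        "use_case_name": use_case_name,
    }
    headers = {HEADER_NAMES[k]: v for k, v in header_values.items() if v}
    return json.dumps(body), headers
```

The returned pair can be handed to whichever HTTP client your service already uses; authentication and the base URL are intentionally left out of this sketch.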
Ingest Example
{
  "category": "system.openai",
  "resource": "gpt-4o-mini",
  "event_timestamp": "2024-05-13T00:00:00",
  "end_to_end_latency_ms": 12450,
  "time_to_first_token_ms": 1143,
  "http_status_code": 200,
  "provider_uri": "https://api.openai.com/v1/chat/completions",
  "provider_prompt": "{ \"request\": \"Your request JSON here\" }",
  "units": {
    "text": {
      "input": 156,
      "output": 1746
    },
    "text_cache_read": {
      "input": 60,
      "output": 0
    },
    "vision": {
      "input": 3512,
      "output": 0
    }
  },
  "provider_request_headers": {
    "RequestHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "RequestHeader2": [
      "HeaderValue"
    ]
  },
  "provider_response": [
    "{ \"response\": \"Provider response JSON here\" }"
  ],
  "provider_response_headers": {
    "ResponseHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "ResponseHeader2": [
      "HeaderValue"
    ]
  },
  "properties": {
    "system.failure": "invalid_json"
  }
}
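The example supplies both `end_to_end_latency_ms` and `time_to_first_token_ms`, which is the case where ITL and OTPS are derived automatically. This page does not give the formulas Pay-i uses, so the sketch below applies one common pair of definitions as an assumption: ITL as the generation time divided by the number of gaps between output tokens, and OTPS as output tokens divided by the generation time in seconds. The function name is hypothetical.

```python
from typing import Optional


def derived_latency_metrics(e2el_ms: float, ttft_ms: float,
                            output_tokens: int) -> dict:
    """Estimate ITL (ms) and OTPS under assumed, commonly used definitions.

    These are NOT confirmed to be the formulas Pay-i applies.
    """
    itl_ms: Optional[float] = None
    otps: Optional[float] = None
    generation_ms = e2el_ms - ttft_ms  # time spent after the first token
    if output_tokens >= 2 and generation_ms > 0:
        # N tokens have N - 1 gaps between them.
        itl_ms = generation_ms / (output_tokens - 1)
        otps = output_tokens / (generation_ms / 1000.0)
    return {"itl_ms": itl_ms, "otps": otps}


# Values taken from the example body: E2EL, TTFT, and text output units.
metrics = derived_latency_metrics(12450, 1143, 1746)
```

For non-streaming requests, where TTFT equals E2EL, the generation time is zero and both values are left as `None` in this sketch.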
API Response
When you submit events via the Ingest API, Pay-i returns an IngestResponse containing:
- event_timestamp - The timestamp of when the event occurred
- ingest_timestamp - The timestamp of when Pay-i received and processed the event
- request_id - A unique identifier for the submitted request
- xproxy_result - Detailed information about the request processing and results
The xproxy_result object contains:
- request_id - Unique identifier for tracking the request in Pay-i systems
- cost - Detailed cost breakdown including:
  - Currency information (e.g., "usd")
  - Input costs (units and pricing details)
  - Output costs (units and pricing details)
  - Total cost calculation
- limits - Status of any limits associated with the request, with states like "ok", "exceeded", or "blocked"
- blocked_limit_ids - List of any limit IDs that are blocking the request
- request_tags - The tags associated with the request for business tracking
- use_case_id - The use case identifier associated with the request
- user_id - The user identifier associated with the request
- resource_id - The identifier for the resource used
This response structure allows your application to programmatically verify that events were properly processed, monitor limit usage, track costs, and maintain alignment with your business tracking needs.
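A minimal sketch of that kind of programmatic check follows. The top-level field names come from the lists above, but the nested shapes are assumptions: `limits` is treated as a mapping from limit ID to an object with a `state` key, and `cost.currency` as a plain string. The sample values and the function name are made up for illustration.

```python
def summarize_ingest_response(response: dict) -> dict:
    """Pull out the fields most useful for monitoring a submitted event."""
    result = response.get("xproxy_result", {})
    cost = result.get("cost", {})
    # Assumed shape: {limit_id: {"state": "ok" | "exceeded" | "blocked"}}
    limits = result.get("limits", {})
    needs_attention = sorted(
        limit_id for limit_id, info in limits.items()
        if info.get("state") != "ok"
    )
    return {
        "request_id": response.get("request_id"),
        "currency": cost.get("currency"),
        "limits_needing_attention": needs_attention,
        "blocked_limit_ids": result.get("blocked_limit_ids", []),
    }


# Made-up response for illustration; not captured from the real API.
sample_response = {
    "request_id": "req-123",
    "xproxy_result": {
        "request_id": "req-123",
        "cost": {"currency": "usd"},
        "limits": {
            "limit-a": {"state": "ok"},
            "limit-b": {"state": "exceeded"},
        },
        "blocked_limit_ids": [],
    },
}
summary = summarize_ingest_response(sample_response)
```

Using `.get` with defaults keeps the helper tolerant of fields that are absent, which matters because the exact response schema is not spelled out here.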
Event Timestamp
The event_timestamp specifies when the ingested event occurred. Specifying an event_timestamp is optional; if omitted, it defaults to the current UTC time (UTC.Now).
An event_timestamp can be specified for any time in the past, and up to 5 minutes in the future to account for timing differences between your service and the Pay-i service.
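A client-side pre-check for this window might look like the sketch below. The function name is hypothetical, and treating naive datetimes as UTC is an assumption made for this example rather than documented behavior. The server remains the authority, since its clock may differ from yours.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Allowed clock skew into the future, as stated above.
MAX_FUTURE_SKEW = timedelta(minutes=5)


def check_event_timestamp(ts: datetime,
                          now: Optional[datetime] = None) -> str:
    """Validate ts against the 5-minute future window and return ISO 8601 text."""
    if ts.tzinfo is None:
        # Assumption: naive values are interpreted as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    if ts > now + MAX_FUTURE_SKEW:
        raise ValueError("event_timestamp is more than 5 minutes in the future")
    return ts.isoformat()
```

Passing `now` explicitly makes the check deterministic in tests; in production code it can be left as `None`.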
When submitting an event with a historical timestamp, Pay-i will automatically use the price of the resource at that time when calculating the costs of the event. If you are submitting a request for a custom resource, then the appropriate Resource Version is automatically selected.
If the event_timestamp refers to a point in time for which there is no pricing information (e.g., trying to ingest an event for gpt-4 before the model was released), then the request fails with an error.
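Conceptually, time-based pricing is a lookup of the latest price whose effective date is not after the event. The sketch below illustrates that idea only; it is not Pay-i's implementation, and the function name, price values, and dates are invented for the example.

```python
import bisect
from datetime import datetime, timezone


def price_at(versions, ts):
    """Return the price in effect at ts.

    versions: list of (effective_from, price) tuples sorted by effective_from.
    Raises LookupError when ts precedes the first known price, mirroring the
    "no pricing information" error case described above.
    """
    starts = [start for start, _ in versions]
    # Index of the last version whose start is <= ts.
    idx = bisect.bisect_right(starts, ts) - 1
    if idx < 0:
        raise LookupError("no pricing information for this timestamp")
    return versions[idx][1]


# Invented price history, for illustration only.
history = [
    (datetime(2024, 1, 1, tzinfo=timezone.utc), 0.50),
    (datetime(2024, 6, 1, tzinfo=timezone.utc), 0.40),
]
may_price = price_at(history, datetime(2024, 5, 13, tzinfo=timezone.utc))
```

A custom resource with several Resource Versions could be modeled the same way, with one tuple per version.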
Bulk Event Submission
Pay-i allows you to send thousands of events as part of a single network request, to reduce network overhead. This makes it easy to provide Pay-i with historical data or to handle high-traffic situations.
If you would like to use this feature, please contact [email protected].
Related Resources
- Operational Modes - Understanding different instrumentation approaches
- Custom Categories and Resources - Setting up tracking for custom models
- Historical Data Backfill - Strategies for importing historical usage data
- Ingest API Documentation - Complete API reference for single event submission
- Bulk Ingest API Documentation - API reference for submitting multiple events