Overview

The Ingest API (the /requests/ingest endpoint) lets you explicitly submit event data to Pay-i, either after a provider interaction has completed or for events that involve no provider at all. Direct Provider Call instrumentation already uses the Ingest API under the hood (via the @ingest decorator); this documentation covers how to call the Ingest API directly. Proxy Routing is the only mode that instead logs in-context, during the request flow.

The Ingest API is particularly valuable for these key use cases:

Tracking Custom Resources - Submit usage data for custom resources not directly supported by Pay-i
Integration-Resistant Providers - Submit data for providers that don't work well with automatic instrumentation (e.g., AWS Bedrock in some scenarios)
Non-Python Clients - Submit usage from applications written in languages without Pay-i SDK support
Submitting Historical Data - Record individual past events for complete usage visibility
External System Reconciliation - Align Pay-i data with other tracking systems

When using the Ingest API directly, the call happens after the provider interaction has completed, meaning your service already has all the necessary usage data (tokens, latency metrics, etc.) available.

Important: When working with streaming responses, ensure you've read the stream to the end before submitting usage data. Pay-i needs the complete token information to accurately track usage and calculate costs. If you don't read the entire stream, you'll have incomplete data for submission.
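The streaming note above can be sketched as follows. This is a minimal illustration, not Pay-i or provider code: the chunk shape (a "delta" string plus a "usage" object on the final chunk) and the helper name drain_stream are assumptions made for the example, and real provider streams differ in where they report usage.

```python
from typing import Iterable, Iterator


def drain_stream(chunks: Iterable[dict]) -> dict:
    """Consume every chunk, then return the full text and the reported units.

    The chunk layout used here is hypothetical; adapt the key names to
    whatever your provider's streaming response actually contains.
    """
    text_parts = []
    usage = None
    for chunk in chunks:  # iterate to the very end of the stream
        text_parts.append(chunk.get("delta", ""))
        if chunk.get("usage") is not None:
            usage = chunk["usage"]  # typically only present on the last chunk
    if usage is None:
        raise ValueError("stream ended without usage data; nothing reliable to submit")
    return {
        "text": "".join(text_parts),
        "units": {
            "text": {
                "input": usage["input_tokens"],
                "output": usage["output_tokens"],
            }
        },
    }


def fake_stream() -> Iterator[dict]:
    """Stand-in for a provider stream, used only to exercise the helper."""
    yield {"delta": "Hello"}
    yield {"delta": ", world"}
    yield {"delta": "", "usage": {"input_tokens": 12, "output_tokens": 3}}


result = drain_stream(fake_stream())
```

Only after drain_stream returns would the units be placed in the Ingest request body; submitting earlier risks reporting partial token counts.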

Manual event submission supports all the same features as standard Pay-i Instrumentation, except for Block limits. This is because Pay-i cannot block requests it doesn't directly proxy. Pay-i will still calculate the costs based on the reported units and associate the data with any provided tags, use cases, and Allow limits for tracking and analysis.

When the Ingest API is called, it returns a response whose xproxy_result contains cost information and limit status.

Ingest Fields

The Ingest API takes the following inputs to submit an event:

#	Input	Description	Body/Header	Required
1	Category	The Category of the Resource used in the request.	Body	Y
2	Resource	The request Resource, used to calculate pricing.	Body	Y
3	Units	The number of input and output units for each of the request's Unit Types.	Body	Y
4	E2EL	End to end latency of the request.	Body	N
5	TTFT	The time to first token, which is equivalent to the E2EL for non-streaming scenarios. If both E2EL and TTFT are provided, then the Inter-Token Latency (ITL) and Output Tokens Per Second (OTPS) will automatically be calculated.	Body	N
6	Provider URI	The endpoint used for the request, shown in DevOps views.	Body	N
7	Request Prompt	The JSON sent to the Provider as part of the request. If Logging is disabled for this application, this will not be saved.	Body	N
8	Request Headers	An array of header names and values that were sent to the Provider.	Body	N
9	Provider Response	The JSON returned from the Provider after the request has completed. If Logging is disabled for this application, this will not be saved.	Body	N
10	Response Headers	An array of header names and values that were received from the Provider with the response.	Body	N
11	event_timestamp	The ISO 8601 timestamp of when the request was sent to the provider. If not provided, the current time is used. Refer to event_timestamp for more details.	Body	N
12	HTTP Status Code	The status code of the HTTP request to the Provider, e.g., "200" or "400", used for failure tracking.	Body	N
13	Properties	Coming soon	Body	N
14	Limit IDs	Comma separated list of `limit-id`s to associate with the request. Note that blocking limits are not supported and will result in an error.	Header	N
15	Request Tags	Comma separated list of request tags to associate with the request.	Header	N
16	User ID	The `user-id` associated with the request.	Header	N
17	Use Case ID	The `use-case-id` associated with the request.	Header	N
18	Use Case Name	The `name` of the `use-case-type` used in the request. As with instrumented requests, if a `use_case_name` is provided and a `use-case-id` is not, then one will automatically be generated.	Header	N
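To make the TTFT row more concrete, the sketch below shows one common way ITL and OTPS can be derived from E2EL, TTFT, and the output token count. The source does not state the exact formulas Pay-i uses, so treat these definitions (and the function name derived_latency_metrics) as assumptions for illustration only.

```python
from typing import Optional


def derived_latency_metrics(
    e2el_ms: float, ttft_ms: float, output_tokens: int
) -> Optional[dict]:
    """Estimate ITL and OTPS from end-to-end latency and time to first token.

    Assumed definitions (not confirmed by the Pay-i documentation):
      ITL  = generation time / number of gaps between output tokens
      OTPS = output tokens / generation time in seconds
    """
    generation_ms = e2el_ms - ttft_ms  # time spent producing tokens after the first
    if generation_ms <= 0 or output_tokens < 2:
        # Non-streaming calls (TTFT == E2EL) or single-token outputs
        # leave nothing to divide by.
        return None
    return {
        "itl_ms": generation_ms / (output_tokens - 1),
        "otps": output_tokens / (generation_ms / 1000.0),
    }


# Latency and output-token figures borrowed from the example event on this page.
metrics = derived_latency_metrics(e2el_ms=12450, ttft_ms=1143, output_tokens=1746)
```

For a non-streaming request, where TTFT equals E2EL, this sketch returns None rather than a misleading value.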

Ingest Example

{
  "category": "system.openai",
  "resource": "gpt-4o-mini",
  "event_timestamp": "2024-05-13T00:00:00",
  "end_to_end_latency_ms": 12450,
  "time_to_first_token_ms": 1143,
  "http_status_code": 200,
  "provider_uri": "https://api.openai.com/v1/chat/completions",
  "provider_prompt": "{ \"request\": \"Your request JSON here\" }",
  "units": {
    "text": {
      "input": 156,
      "output": 1746
    },
    "text_cache_read": {
      "input": 60,
      "output": 0
    },
    "vision": {
      "input": 3512,
      "output": 0
    }
  },
  "provider_request_headers": {
    "RequestHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "RequestHeader2": [
      "HeaderValue"
    ]
  },
  "provider_response": [
    "{ \"response\": \"Provider response JSON here\" }"
  ],
  "provider_response_headers": {
    "ResponseHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "ResponseHeader2": [
      "HeaderValue"
    ]
  },
  "properties": {
    "system.failure": "invalid_json"
  }
}
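A body like the one above could be packaged into an HTTP request roughly as follows. This is a hedged sketch using only the Python standard library: the base URL is a placeholder, and the header names (xProxy-api-key, xProxy-Limit-IDs, xProxy-Request-Tags, xProxy-User-ID) are assumptions that should be checked against the Ingest API reference. Only the /requests/ingest path comes from this page.

```python
import json
import urllib.request
from typing import Optional, Sequence

PAYI_BASE_URL = "https://payi.example.com"  # placeholder, not a real Pay-i host
API_KEY_HEADER = "xProxy-api-key"           # assumed header name


def build_ingest_request(
    api_key: str,
    event: dict,
    *,
    limit_ids: Sequence[str] = (),
    request_tags: Sequence[str] = (),
    user_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ingest endpoint."""
    headers = {"Content-Type": "application/json", API_KEY_HEADER: api_key}
    # The fields table lists these inputs as comma-separated header values;
    # the header names themselves are assumptions.
    if limit_ids:
        headers["xProxy-Limit-IDs"] = ",".join(limit_ids)
    if request_tags:
        headers["xProxy-Request-Tags"] = ",".join(request_tags)
    if user_id is not None:
        headers["xProxy-User-ID"] = user_id
    return urllib.request.Request(
        PAYI_BASE_URL + "/requests/ingest",
        data=json.dumps(event).encode("utf-8"),
        headers=headers,
        method="POST",
    )


event = {
    "category": "system.openai",
    "resource": "gpt-4o-mini",
    "units": {"text": {"input": 156, "output": 1746}},
}
req = build_ingest_request(
    "my-api-key", event, limit_ids=["limit-a", "limit-b"], user_id="user-42"
)
```

Once a real host and key are in place, the request could be sent with urllib.request.urlopen(req) and the JSON response decoded from the returned body.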

API Response

When you submit events via the Ingest API, Pay-i returns an IngestResponse containing:

event_timestamp - The timestamp of when the event occurred
ingest_timestamp - The timestamp of when Pay-i received and processed the event
request_id - A unique identifier for the submitted request
xproxy_result - Detailed information about the request processing and results

The xproxy_result object contains:

request_id - Unique identifier for tracking the request in Pay-i systems
cost - Detailed cost breakdown including:
- Currency information (e.g., "usd")
- Input costs (units and pricing details)
- Output costs (units and pricing details)
- Total cost calculation
limits - Status of any limits associated with the request, with states like "ok", "exceeded", or "blocked"
blocked_limit_ids - List of any limit IDs that are blocking the request
request_tags - The tags associated with the request for business tracking
use_case_id - The use case identifier associated with the request
user_id - The user identifier associated with the request
resource_id - The identifier for the resource used

This response structure allows your application to programmatically verify that events were properly processed, monitor limit usage, track costs, and maintain alignment with your business tracking needs.
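As an illustration of that kind of programmatic check, the sketch below walks a decoded response and reports any limits that are not in the "ok" state. The exact JSON layout is an assumption: here limits is taken to be a mapping from limit ID to an object with a "state" field, and the sample values are invented for the example.

```python
def summarize_ingest_response(response: dict) -> dict:
    """Pull out the request ID and any limits that need attention.

    Assumes xproxy_result["limits"] maps limit IDs to {"state": ...};
    adjust the lookups if the real payload is shaped differently.
    """
    result = response.get("xproxy_result") or {}
    limits = result.get("limits") or {}
    not_ok = sorted(
        limit_id
        for limit_id, status in limits.items()
        if status.get("state") != "ok"
    )
    return {
        "request_id": response.get("request_id"),
        "limits_needing_attention": not_ok,
        "blocked_limit_ids": list(result.get("blocked_limit_ids") or []),
    }


# Invented sample data in the assumed shape.
sample_response = {
    "event_timestamp": "2024-05-13T00:00:00Z",
    "ingest_timestamp": "2024-05-13T00:00:02Z",
    "request_id": "req-123",
    "xproxy_result": {
        "request_id": "req-123",
        "limits": {
            "limit-a": {"state": "ok"},
            "limit-b": {"state": "exceeded"},
        },
        "blocked_limit_ids": [],
    },
}
summary = summarize_ingest_response(sample_response)
```

A caller might log summary["limits_needing_attention"] or raise an alert when it is non-empty.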

Event Timestamp

The event_timestamp specifies when the ingested event occurred. Specifying an event_timestamp is optional; if it is omitted, it defaults to the current UTC time.

An event_timestamp can be specified for any time in the past, and up to 5 minutes in the future to account for timing differences between your service and the Pay-i service.
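A client can check this window before submitting. The sketch below is a client-side convenience under stated assumptions, not part of the Pay-i API: the function name check_event_timestamp is hypothetical, and whether the service accepts the "+00:00" offset form produced here should be confirmed against the API reference.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_FUTURE_SKEW = timedelta(minutes=5)  # window described in the text above


def check_event_timestamp(event_time: datetime, now: Optional[datetime] = None) -> str:
    """Validate an event time and return it as an ISO 8601 UTC string."""
    if event_time.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    now = now or datetime.now(timezone.utc)
    if event_time > now + MAX_FUTURE_SKEW:
        raise ValueError("event_timestamp is more than 5 minutes in the future")
    # Any time in the past is allowed, so no lower bound is enforced here.
    return event_time.astimezone(timezone.utc).isoformat()


reference_now = datetime(2024, 5, 13, 12, 0, tzinfo=timezone.utc)
stamp = check_event_timestamp(
    datetime(2024, 5, 13, 0, 0, tzinfo=timezone.utc), now=reference_now
)
```

Passing now explicitly, as in the usage line, keeps the check deterministic in tests; in production code it can be left out so the current UTC time is used.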

When submitting an event with a historical timestamp, Pay-i automatically uses the price of the resource at that time when calculating the cost of the event. If the event is for a custom resource, the appropriate Resource Version is selected automatically.

If the event_timestamp refers to a point in time for which there is no pricing information (e.g., trying to ingest an event for gpt-4 before the model was released), then the request fails with an error.

Bulk Event Submission

Pay-i allows you to send thousands of events as part of a single network request, to reduce network overhead. This makes it easy to provide Pay-i with historical data or to handle high-traffic situations.

If you would like to use this feature, please contact the Pay-i team.

Related Resources

Operational Modes - Understanding different instrumentation approaches
Custom Categories and Resources - Setting up tracking for custom models
Historical Data Backfill - Strategies for importing historical usage data
Ingest API Documentation - Complete API reference for single event submission
Bulk Ingest API Documentation - API reference for submitting multiple events