Manual Event Submission (Ingest API)

Overview

The Ingest API (accessed via the /requests/ingest endpoint) provides a mechanism for explicitly submitting event data to Pay-i after a provider interaction has occurred or for non-provider events. While Direct Provider Call instrumentation uses the Ingest API under the hood (via the @ingest decorator), this documentation covers how to use the Ingest API directly. Only Proxy Routing uses in-context logging during the request flow.

The Ingest API is particularly valuable for these key use cases:

  • Tracking Custom Resources - Submit usage data for custom resources not directly supported by Pay-i
  • Integration-Resistant Providers - Submit data for providers that don't work well with automatic instrumentation (e.g., AWS Bedrock in some scenarios)
  • Non-Python Clients - Submit usage from applications written in languages without Pay-i SDK support
  • Submitting Historical Data - Record individual past events for complete usage visibility
  • External System Reconciliation - Align Pay-i data with other tracking systems

When using the Ingest API directly, the call happens after the provider interaction has completed, meaning your service already has all the necessary usage data (tokens, latency metrics, etc.) available.

Important: When working with streaming responses, ensure you've read the stream to the end before submitting usage data. Pay-i needs the complete token information to accurately track usage and calculate costs. If you don't read the entire stream, you'll have incomplete data for submission.
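To illustrate the point about draining the stream, here is a minimal sketch that reads a streaming response to the end while recording the time to first token and the end-to-end latency. The chunk shape, `fake_stream`, and `consume_stream` are hypothetical stand-ins rather than part of any provider SDK or of Pay-i. The sketch assumes the provider reports token usage on the final chunk, which is common but not universal.

```python
import time
from typing import Iterable, Iterator


def fake_stream() -> Iterator[dict]:
    """Stand-in for a provider's streaming response (hypothetical chunk shape)."""
    yield {"text": "Hello"}
    yield {"text": ", world"}
    # Assumption: usage counts arrive only on the last chunk.
    yield {"text": "", "usage": {"input": 12, "output": 3}}


def consume_stream(chunks: Iterable[dict]) -> dict:
    """Read every chunk, then return the text, usage, and latency metrics."""
    start = time.monotonic()
    first_chunk_at = None
    text_parts = []
    usage = None
    for chunk in chunks:
        if first_chunk_at is None:
            first_chunk_at = time.monotonic()
        text_parts.append(chunk.get("text", ""))
        # Keep the most recent usage block seen so far.
        usage = chunk.get("usage", usage)
    end = time.monotonic()
    if usage is None or first_chunk_at is None:
        # Without final usage data there is nothing accurate to submit.
        raise ValueError("stream ended without usage data")
    return {
        "text": "".join(text_parts),
        "units": {"text": usage},
        "time_to_first_token_ms": round((first_chunk_at - start) * 1000),
        "end_to_end_latency_ms": round((end - start) * 1000),
    }


result = consume_stream(fake_stream())
print(result["units"])
```

Only after `consume_stream` returns does the caller hold complete token counts, so that is the earliest point at which the event should be submitted.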

Manual event submission supports all the same features as standard Pay-i Instrumentation, except for Block limits. This is because Pay-i cannot block requests it doesn't directly proxy. Pay-i will still calculate the costs based on the reported units and associate the data with any provided tags, use cases, and Allow limits for tracking and analysis.

A call to the Ingest API returns a response that includes an xproxy_result containing cost information and limit status.

Ingest Fields

The Ingest API takes the following inputs to submit an event:

| # | Input | Description | Body/Header | Required |
|---|---|---|---|---|
| 1 | Category | The Category of the Resource used in the request. | Body | Y |
| 2 | Resource | The request Resource, used to calculate pricing. | Body | Y |
| 3 | Units | The number of input and output units for each of the request's Unit Types. | Body | Y |
| 4 | E2EL | End to end latency of the request. | Body | N |
| 5 | TTFT | The time to first token, which is equivalent to the E2EL for non-streaming scenarios. If both E2EL and TTFT are provided, then the Inter-Token Latency (ITL) and Output Tokens Per Second (OTPS) will automatically be calculated. | Body | N |
| 6 | Provider URI | The endpoint used for the request, shown in DevOps views. | Body | N |
| 7 | Request Prompt | The JSON sent to the Provider as part of the request. If Logging is disabled for this application, this will not be saved. | Body | N |
| 8 | Request Headers | An array of header names and values that were sent to the Provider. | Body | N |
| 9 | Provider Response | The JSON returned from the Provider after the request has completed. If Logging is disabled for this application, this will not be saved. | Body | N |
| 10 | Response Headers | An array of header names and values that were received from the Provider with the response. | Body | N |
| 11 | event_timestamp | The ISO 8601 timestamp of when the request was sent to the provider. If not provided, the current time is used. Refer to event_timestamp for more details. | Body | N |
| 12 | HTTP Status Code | The status code of the HTTP request to the Provider, e.g., "200" or "400", used for failure tracking. | Body | N |
| 13 | Properties | Coming soon | Body | N |
| 14 | Limit IDs | Comma separated list of limit-ids to associate with the request. Note that blocking limits are not supported and will result in an error. | Header | N |
| 15 | Request Tags | Comma separated list of request tags to associate with the request. | Header | N |
| 16 | User ID | The user-id associated with the request. | Header | N |
| 17 | Use Case ID | The use-case-id associated with the request. | Header | N |
| 18 | Use Case Name | The name of the use-case-type used in the request. As with instrumented requests, if a use_case_name is provided and a use-case-id is not, then one will automatically be generated. | Header | N |
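The TTFT row notes that ITL and OTPS are derived when both E2EL and TTFT are supplied. This section does not give the formulas Pay-i uses, so the sketch below relies on commonly used definitions as an assumption: ITL as the generation time divided by the gaps between output tokens, and OTPS as output tokens per second of generation time. The function name `derived_latency_metrics` is hypothetical.

```python
from typing import Optional


def derived_latency_metrics(e2el_ms: float, ttft_ms: float, output_tokens: int) -> dict:
    """Estimate ITL and OTPS from E2EL and TTFT.

    Assumed definitions (not confirmed as Pay-i's own):
      ITL  = (E2EL - TTFT) / (output_tokens - 1), in milliseconds per token
      OTPS = output_tokens / ((E2EL - TTFT) / 1000), in tokens per second
    """
    generation_ms = e2el_ms - ttft_ms
    itl_ms: Optional[float] = None
    otps: Optional[float] = None
    if generation_ms > 0:
        # ITL needs at least two tokens, since it measures gaps between tokens.
        if output_tokens > 1:
            itl_ms = generation_ms / (output_tokens - 1)
        if output_tokens > 0:
            otps = output_tokens / (generation_ms / 1000)
    return {"itl_ms": itl_ms, "otps": otps}


metrics = derived_latency_metrics(e2el_ms=12450, ttft_ms=1143, output_tokens=1746)
print(metrics)
```

For a non-streaming request, where TTFT equals E2EL, the generation time is zero and both values come back as `None` under these definitions.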

Ingest Example

{
  "category": "system.openai",
  "resource": "gpt-4o-mini",
  "event_timestamp": "2024-08-13T00:00:00",
  "end_to_end_latency_ms": 12450,
  "time_to_first_token_ms": 1143,
  "http_status_code": 200,
  "provider_uri": "https://api.openai.com/v1/chat/completions",
  "provider_prompt": "{ \"request\": \"Your request JSON here\" }",
  "units": {
    "text": {
      "input": 156,
      "output": 1746
    },
    "text_cache_read": {
      "input": 60,
      "output": 0
    },
    "vision": {
      "input": 3512,
      "output": 0
    }
  },
  "provider_request_headers": {
    "RequestHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "RequestHeader2": [
      "HeaderValue"
    ]
  },
  "provider_response": [
    "{ \"response\": \"Provider response JSON here\" }"
  ],
  "provider_response_headers": {
    "ResponseHeader1": [
      "HeaderValue",
      "HeaderValue2"
    ],
    "ResponseHeader2": [
      "HeaderValue"
    ]
  },
  "properties": {
    "system.failure": "invalid_json"
  }
}
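A body like the one above can also be assembled in code. The following is a minimal sketch that uses only field names appearing in the example; the helper name `build_ingest_body` is hypothetical. Authentication, the base URL, and the header names for limits, tags, and user IDs are not specified in this section, so the sketch stops at producing the JSON body rather than sending it.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_ingest_body(
    category: str,
    resource: str,
    units: dict,
    *,
    event_timestamp: Optional[datetime] = None,
    end_to_end_latency_ms: Optional[int] = None,
    time_to_first_token_ms: Optional[int] = None,
    http_status_code: Optional[int] = None,
) -> dict:
    """Build an ingest request body, leaving out optional fields that are unset."""
    if not units:
        raise ValueError("units is required and must not be empty")
    # Category, Resource, and Units are the three required inputs.
    body = {"category": category, "resource": resource, "units": units}
    optional = {
        "event_timestamp": event_timestamp.isoformat() if event_timestamp else None,
        "end_to_end_latency_ms": end_to_end_latency_ms,
        "time_to_first_token_ms": time_to_first_token_ms,
        "http_status_code": http_status_code,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_ingest_body(
    "system.openai",
    "gpt-4o-mini",
    {"text": {"input": 156, "output": 1746}},
    event_timestamp=datetime(2024, 8, 13, tzinfo=timezone.utc),
    end_to_end_latency_ms=12450,
    http_status_code=200,
)
print(json.dumps(body, indent=2))
```

Omitting unset optional fields, instead of sending nulls, lets the service apply its documented defaults, such as using the current time when event_timestamp is absent.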

API Response

When you submit events via the Ingest API, Pay-i returns an IngestResponse containing:

  • event_timestamp - The timestamp of when the event occurred
  • ingest_timestamp - The timestamp of when Pay-i received and processed the event
  • request_id - A unique identifier for the submitted request
  • xproxy_result - Detailed information about the request processing and results

The xproxy_result object contains:

  • request_id - Unique identifier for tracking the request in Pay-i systems
  • cost - Detailed cost breakdown including:
    • Currency information (e.g., "usd")
    • Input costs (units and pricing details)
    • Output costs (units and pricing details)
    • Total cost calculation
  • limits - Status of any limits associated with the request, with states like "ok", "exceeded", or "blocked"
  • blocked_limit_ids - List of any limit IDs that are blocking the request
  • request_tags - The tags associated with the request for business tracking
  • use_case_id - The use case identifier associated with the request
  • user_id - The user identifier associated with the request
  • resource_id - The identifier for the resource used

This response structure allows your application to programmatically verify that events were properly processed, monitor limit usage, track costs, and maintain alignment with your business tracking needs.
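As a rough illustration of that kind of programmatic check, the sketch below scans a parsed response for limits that are not in the "ok" state. The exact JSON shape of `limits` is not shown in this section, so the sample response, and the assumption that `limits` maps each limit ID to an object with a `state` field, are hypothetical. The helper name `limits_not_ok` is also hypothetical.

```python
from typing import List


def limits_not_ok(ingest_response: dict) -> List[str]:
    """Return the IDs of limits whose state is anything other than "ok"."""
    result = ingest_response.get("xproxy_result") or {}
    # Assumed shape: {"<limit-id>": {"state": "ok" | "exceeded" | "blocked"}}
    limits = result.get("limits") or {}
    return sorted(
        limit_id for limit_id, info in limits.items() if info.get("state") != "ok"
    )


# Hypothetical parsed response, for illustration only.
sample_response = {
    "request_id": "example-request",
    "xproxy_result": {
        "limits": {
            "limit-a": {"state": "ok"},
            "limit-b": {"state": "exceeded"},
        }
    },
}

flagged = limits_not_ok(sample_response)
print(flagged)
```

An application could log or alert on the returned IDs; adjust the field access to match the actual response schema before relying on it.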

Event Timestamp {#event_timestamp}

The event_timestamp specifies when the ingested event occurred. It is optional; if omitted, it defaults to the current UTC time.

An event_timestamp can be specified for any time in the past, and up to 5 minutes in the future to account for timing differences between your service and the Pay-i service.
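A client can check this window before submitting. The sketch below is a client-side pre-check only, assuming the 5-minute bound is inclusive (the text does not say); the name `is_acceptable_event_timestamp` is hypothetical, and the service remains the authority on what it accepts.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Allowed clock skew into the future, per the rule described above.
MAX_FUTURE_SKEW = timedelta(minutes=5)


def is_acceptable_event_timestamp(ts: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if ts is in the past or at most 5 minutes in the future."""
    if ts.tzinfo is None:
        # Naive datetimes are ambiguous, so require an explicit time zone.
        raise ValueError("event timestamp must be timezone-aware")
    current = now if now is not None else datetime.now(timezone.utc)
    return ts <= current + MAX_FUTURE_SKEW


reference = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_acceptable_event_timestamp(reference - timedelta(days=365), now=reference))
print(is_acceptable_event_timestamp(reference + timedelta(minutes=6), now=reference))
```

Passing `now` explicitly keeps the check deterministic in tests; in production the default of the current UTC time applies.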

When submitting an event with a historical timestamp, Pay-i will automatically use the price of the resource at that time when calculating the costs of the event. If you are submitting a request for a custom resource, then the appropriate Resource Version is automatically selected.

If the event_timestamp refers to a point in time for which there is no pricing information (e.g., trying to ingest an event for gpt-4 before the model was released), the request fails with an error.

Bulk Event Submission

Pay-i allows you to send thousands of events as part of a single network request, to reduce network overhead. This makes it easy to provide Pay-i with historical data or to handle high-traffic situations.

If you would like to use this feature, please contact [email protected].
