Custom Instrumentation
Overview
While auto-instrumentation gives you immediate tracking of your GenAI usage with minimal code changes, custom instrumentation goes further by adding business context to your API calls.
This guide explains how to enhance your GenAI application with custom Annotations that connect technical usage to business outcomes. It builds on Auto-Instrumentation and allows you to:
- Associate API calls with specific business use cases
- Track usage by individual users or customer segments
- Apply spending limits and controls
- Organize requests with meaningful tags for analysis
- Measure business ROI and value metrics
For an understanding of the business value these capabilities enable, see Value of Pay-i Instrumentation in the Pay-i Concepts section.
Annotation Methods
Once you have auto-instrumentation configured, you can add custom business context through Annotations. Pay-i offers two main approaches for adding these Annotations, each suited to different scenarios:
Approach | Description | Scope | Best For |
---|---|---|---|
Custom Headers | Add annotations directly to API calls using extra_headers | Per individual request | Fine-grained control with different annotations for each API call |
Track Decorator | Use the Python @track decorator that applies to functions and their call trees | Applies to entire call tree | Functions that make multiple API calls that should share the same context |
The key difference is scope: custom headers apply to individual requests, while the @track decorator automatically applies to everything in a function's call tree. For detailed information about how the decorator propagates annotations through call trees, see the Track Decorator documentation.
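The call-tree propagation idea can be illustrated with a small self-contained sketch. This is illustrative only, not Pay-i's actual implementation: a simplified `track` decorator stores annotations in a context variable so that every call made inside the decorated function inherits them, while calls outside the decorated function see none.

```python
import contextvars
import functools

# Simplified illustration of call-tree annotation propagation.
# NOT the Pay-i implementation - just a sketch of the concept.
_annotations = contextvars.ContextVar("annotations", default={})

def track(**annotations):
    """Attach annotations to every call made inside the decorated function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            token = _annotations.set({**_annotations.get(), **annotations})
            try:
                return fn(*args, **kwargs)
            finally:
                _annotations.reset(token)
        return wrapper
    return decorator

def fake_api_call():
    # A real instrumented client would read these annotations
    # and send them along with the request.
    return dict(_annotations.get())

@track(use_case_name="customer_service", user_id="jane_doe")
def handle_request():
    return fake_api_call()  # inherits the decorator's annotations

print(handle_request())  # {'use_case_name': 'customer_service', 'user_id': 'jane_doe'}
print(fake_api_call())   # {} - outside the decorated call tree, no annotations
```

Because the context variable is restored when the decorated function returns, annotations never leak outside the call tree, which mirrors the scoping behavior described above.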
Implementation Examples
Here's how to add custom annotations to your API calls:
Using Custom Headers (Per-Request)
```python
import os
from openai import OpenAI
from payi.lib.instrument import payi_instrument
from payi.lib.helpers import create_headers

# Initialize Pay-i instrumentation
payi_instrument()

# Configure OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Make a request with annotations - only applies to this specific request
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    extra_headers=create_headers(
        use_case_name="customer_service",
        user_id="jane_doe",
        limit_ids=["department_budget"],
        request_tags=["greeting"]
    )
)
```
For provider-specific implementation details, see:
- OpenAI Custom Headers
- Azure OpenAI Custom Headers
- Anthropic Custom Headers
- AWS Bedrock Custom Headers
- LangChain Integration
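Conceptually, create_headers turns annotation keyword arguments into a dictionary of HTTP headers that the provider SDK forwards with the request. The sketch below illustrates that idea only; the `x-example-*` header names and the `make_annotation_headers` helper are invented for this sketch and are not Pay-i's actual header names or API.

```python
import json

# Hypothetical illustration of the idea behind create_headers:
# annotation kwargs become extra HTTP headers on the outgoing request.
# The "x-example-*" names below are invented for this sketch.
def make_annotation_headers(use_case_name=None, user_id=None,
                            limit_ids=None, request_tags=None):
    headers = {}
    if use_case_name:
        headers["x-example-use-case"] = use_case_name
    if user_id:
        headers["x-example-user-id"] = user_id
    if limit_ids:
        headers["x-example-limit-ids"] = json.dumps(limit_ids)
    if request_tags:
        headers["x-example-request-tags"] = json.dumps(request_tags)
    return headers

print(make_annotation_headers(use_case_name="customer_service",
                              user_id="jane_doe"))
# {'x-example-use-case': 'customer_service', 'x-example-user-id': 'jane_doe'}
```

Because the annotations ride along as plain request headers, they apply only to the single call that carries them, which is what makes this approach per-request.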
Using the Track Decorator (Call Tree)
```python
import os
from openai import OpenAI
from payi.lib.instrument import payi_instrument, track

# Initialize Pay-i instrumentation
payi_instrument()

# Configure OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Use decorator to add annotations - applies to all API calls made within this function
# and any other functions it calls (the entire call tree)
@track(
    use_case_name="customer_service",
    user_id="jane_doe",
    limit_ids=["department_budget"],
    request_tags=["greeting"]
)
def handle_greeting(user_message):
    # These annotations will apply to ALL API calls made within this function
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_message}]
    )
    return response.choices[0].message.content
```
Note: If you're using Pay-i Proxy Configuration for Block limits, make sure you've initialized with `config={"proxy": True}`, as described in the proxy configuration documentation.
Implementation Approaches
Review the detailed documentation for each approach to decide which best fits your needs:
- Custom Headers - For direct, per-request annotation of API calls
- Track Decorator - For function-level annotations that propagate through call trees
Both approaches can be combined in the same application if needed, giving you maximum flexibility.
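When the two approaches are combined, function-level annotations from the decorator and per-request annotations from headers must be merged. The sketch below illustrates one plausible merge, with per-request values taking precedence; that precedence order is an assumption made for this example, not documented Pay-i behavior.

```python
# Illustrative sketch of combining function-level and per-request annotations.
# The merge order shown (per-request overrides function-level) is an
# assumption for this example, not documented Pay-i behavior.
def merge_annotations(function_level, per_request):
    merged = dict(function_level)
    merged.update(per_request)  # per-request values win in this sketch
    return merged

function_level = {"use_case_name": "customer_service", "user_id": "jane_doe"}
per_request = {"request_tags": ["greeting"], "user_id": "john_smith"}

print(merge_annotations(function_level, per_request))
# {'use_case_name': 'customer_service', 'user_id': 'john_smith',
#  'request_tags': ['greeting']}
```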
Provider-Specific Implementation
The approach to adding annotations varies depending on the provider SDK's capabilities:
Direct Header Support
These providers support annotations via the extra_headers parameter directly in their inference calls:
Provider | Implementation Method | Notes |
---|---|---|
OpenAI | extra_headers parameter | Native support in API calls |
Azure OpenAI | extra_headers parameter | Native support in API calls |
Anthropic | extra_headers parameter | Native support in API calls |
Alternative Approaches
These providers require different approaches since they don't natively support extra_headers:
Provider | Implementation Method | How It Works |
---|---|---|
AWS Bedrock | track_context() | Use for per-inference calls without needing boto3 callback registration |
LangChain | PayiCallbackHandler | Uses a callback handler pattern instead of headers |
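The callback-handler pattern used for LangChain can be sketched generically. This is a simplified stand-in, not the actual PayiCallbackHandler: a handler object is registered with the framework, which invokes it around each LLM call, giving the handler a place to attach annotations instead of passing headers on each request.

```python
# Generic sketch of the callback-handler pattern (not the actual
# PayiCallbackHandler): the framework invokes registered handlers
# around each LLM call, and the handler attaches annotations there.
class AnnotationHandler:
    def __init__(self, **annotations):
        self.annotations = annotations
        self.recorded = []

    def on_llm_start(self, prompt):
        # In a real handler, this is where annotations would be
        # attached to the outgoing request.
        self.recorded.append({"prompt": prompt, **self.annotations})

class TinyFramework:
    """Stand-in for a framework that accepts callback handlers."""
    def __init__(self, callbacks):
        self.callbacks = callbacks

    def run(self, prompt):
        for cb in self.callbacks:
            cb.on_llm_start(prompt)
        return f"response to: {prompt}"

handler = AnnotationHandler(use_case_name="customer_service", user_id="jane_doe")
framework = TinyFramework(callbacks=[handler])
framework.run("Hello")
print(handler.recorded[0]["use_case_name"])  # customer_service
```

The advantage of this pattern is that annotations are configured once, on the handler, and then apply to every call the framework makes, without touching each call site.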
See our provider-specific guides for complete examples and implementation details for each provider:
- OpenAI Implementation
- Azure OpenAI Implementation
- Anthropic Implementation
- AWS Bedrock Implementation
- LangChain Implementation
Related Resources
- Auto-Instrumentation - Basic setup and configuration
- Pay-i Proxy Configuration - For when you need Block limits
- Pay-i Concepts - Core concepts explanation
- Use Cases - Creating and managing use cases
- Budget Limits - Setting and monitoring budget limits