LangChain Provider Configuration

Overview

This guide explains how to configure LangChain to work with Pay-i Instrumentation for tracking token usage and costs. LangChain works with various Providers and Resources that Pay-i can track.

Prerequisites

To use Pay-i with LangChain, you'll need:

  • LangChain Libraries:

    • langchain-core (pip install langchain-core)
    • langchain-openai (pip install langchain-openai) for OpenAI/Azure OpenAI
    • Other provider-specific LangChain packages as needed
  • Pay-i:

    • The Pay-i Python SDK (pip install payi)
    • A Pay-i API key
  • Provider API Keys:

    • API key for at least one supported provider (OpenAI, Azure OpenAI, Anthropic, or AWS Bedrock)

SDK Support

The examples in this guide use the Pay-i Python SDK with LangChain. The integration uses a custom callback handler that sends token usage data to Pay-i after LLM calls complete.

Pay-i with LangChain

The LangChain integration with Pay-i uses a callback handler. The handler captures token usage when each LLM call completes and sends the data to Pay-i for tracking. Unlike Pay-i's direct provider integrations, LangChain calls are not routed through Pay-i; the integration only reports usage data after each call completes.

Note: Because the LangChain integration works through the callback handler mechanism, it doesn't support Block limits that prevent calls from being made. It can track usage against Allow limits, but it cannot enforce spending caps in real time.

Basic Setup

Let's start with the basic setup for Pay-i with LangChain:

import os
from payi import Payi
from payi.lib.instrument import payi_instrument
from payi.lib.helpers import PayiCategories
from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Initialize Pay-i instrumentation
payi_instrument()

# API keys from environment variables
openai_key = os.getenv("OPENAI_API_KEY", "YOUR_OPENAI_KEY")

Creating a Custom Callback Handler

The key to the LangChain integration is a custom callback class, PayiHandler. While payi_instrument() initializes the global Pay-i instrumentation, the callback handler needs a direct client instance to make API calls:

class PayiHandler(BaseCallbackHandler):
    def __init__(self, client, params):
        self.name = "custom_handler"
        self.client = client
        # Copy the params so later changes by the caller don't affect the handler
        self.params = {**params}

    def on_llm_end(self, response, **kwargs):
        llm_output = response.llm_output
        if llm_output and 'token_usage' in llm_output:
            token_usage = llm_output['token_usage']
            prompt_tokens = token_usage.get('prompt_tokens', 0)
            completion_tokens = token_usage.get('completion_tokens', 0)

            if prompt_tokens == 0 and completion_tokens == 0:
                print(f"{self.name}: no token usage in LLM output", response)
                return

            try:
                # Send token usage data to Pay-i
                result = self.client.ingest.units(
                    category=self.params['category'],
                    resource=self.params['resource'],
                    units={ "text": { "input": prompt_tokens, "output": completion_tokens} },
                    limit_ids=self.params.get('limit_ids', []),
                    request_tags=self.params.get('request_tags', [])
                )
                print(f'ingest result: {result.model_dump_json(indent=4)}')
            except Exception as e:
                print(f"{self.name}: error sending usage info", e)

Using the Handler with LangChain

Here's how to use the PayiHandler with a LangChain model:

# Create Pay-i client for the handler to use
payi_client = Payi(api_key=os.getenv("PAYI_API_KEY"))

# Configuration parameters for the Pay-i handler
params = {
    'category': PayiCategories.openai,  # Provider category
    'resource': 'gpt-3.5-turbo',        # Model identifier
    'limit_ids': ['your-limit-id'],     # Optional: limits to track against
    'request_tags': ['langchain', 'example']  # Optional: tags for organization
}

# Create the Pay-i handler
handler = PayiHandler(client=payi_client, params=params)

# Create the LangChain model with the handler
model = ChatOpenAI( 
    model=params['resource'],
    api_key=openai_key,
    callbacks=[handler]  # Register the Pay-i handler
)

# Create and run a simple chain
prompt = ChatPromptTemplate.from_messages([("human", "Say this: {text}")])
chain = prompt | model
response = chain.invoke({"text": "Hello, world!"})
print(response.content)
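
You can also attach the handler per invocation instead of at model construction, using LangChain's standard config argument. This is a minimal sketch reusing the prompt, params, and handler from above; it is useful when different calls should report with different tags or limits:

# Create the model without callbacks
model = ChatOpenAI(model=params['resource'], api_key=openai_key)
chain = prompt | model

# The handler applies only to this invocation
response = chain.invoke(
    {"text": "Hello, world!"},
    config={"callbacks": [handler]}
)
print(response.content)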

Supported LangChain Providers

This approach works with any LangChain integration that reports token usage in llm_output, though the exact key names can vary by provider (see the note after the examples below). Examples include:

  • OpenAI models
  • Azure OpenAI models
  • Anthropic Claude models
  • AWS Bedrock models

To use a different provider, update the category and resource parameters to match the provider you're using:

# For Azure OpenAI
params = {
    'category': PayiCategories.azure_openai,
    'resource': 'your-deployment-name',
    'limit_ids': ['your-limit-id'],
    'request_tags': ['langchain', 'azure']
}

# For Anthropic
params = {
    'category': PayiCategories.anthropic,
    'resource': 'claude-3-haiku-20240307',  # the "-v1:0" suffix belongs to Bedrock model IDs
    'limit_ids': ['your-limit-id'],
    'request_tags': ['langchain', 'anthropic']
}
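
Note that the PayiHandler above reads OpenAI-style token_usage keys (prompt_tokens/completion_tokens), and not every integration reports usage under those names. For example, langchain-anthropic typically reports a usage entry with input_tokens and output_tokens instead. The sketch below handles both shapes and pairs the Anthropic params with the matching LangChain chat model; the exact key names are assumptions to verify against your installed package versions:

from langchain_anthropic import ChatAnthropic  # pip install langchain-anthropic

class MultiProviderPayiHandler(PayiHandler):
    def on_llm_end(self, response, **kwargs):
        llm_output = response.llm_output or {}

        # OpenAI-style: {'token_usage': {'prompt_tokens': ..., 'completion_tokens': ...}}
        usage = llm_output.get('token_usage')
        if usage:
            input_tokens = usage.get('prompt_tokens', 0)
            output_tokens = usage.get('completion_tokens', 0)
        else:
            # Anthropic-style (assumed): {'usage': {'input_tokens': ..., 'output_tokens': ...}}
            usage = llm_output.get('usage') or {}
            input_tokens = usage.get('input_tokens', 0)
            output_tokens = usage.get('output_tokens', 0)

        if input_tokens == 0 and output_tokens == 0:
            print(f"{self.name}: no token usage in LLM output")
            return

        self.client.ingest.units(
            category=self.params['category'],
            resource=self.params['resource'],
            units={"text": {"input": input_tokens, "output": output_tokens}},
            limit_ids=self.params.get('limit_ids', []),
            request_tags=self.params.get('request_tags', [])
        )

# Pair the Anthropic params with the matching chat model;
# ChatAnthropic reads ANTHROPIC_API_KEY from the environment by default
anthropic_model = ChatAnthropic(
    model=params['resource'],
    callbacks=[MultiProviderPayiHandler(client=payi_client, params=params)]
)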

Advanced Usage: Customizing the Handler

You can extend the PayiHandler class to add more functionality. For example, you might want to:

  • Track additional metadata
  • Add more detailed logging
  • Include user IDs for attribution
  • Handle streaming responses (see the streaming sketch after the extended example below)

Here's an extended example with user ID tracking:

class ExtendedPayiHandler(BaseCallbackHandler):
    def __init__(self, client, params, user_id=None):
        self.name = "extended_handler"
        self.client = client
        self.params = {**params}
        self.user_id = user_id
        
    def on_llm_end(self, response, **kwargs):
        llm_output = response.llm_output
        if llm_output and 'token_usage' in llm_output:
            token_usage = llm_output['token_usage']
            prompt_tokens = token_usage.get('prompt_tokens', 0)
            completion_tokens = token_usage.get('completion_tokens', 0)
            
            try:
                # Include user_id if provided
                ingest_params = {
                    'category': self.params['category'],
                    'resource': self.params['resource'],
                    'units': { "text": { "input": prompt_tokens, "output": completion_tokens} },
                    'limit_ids': self.params.get('limit_ids', []),
                    'request_tags': self.params.get('request_tags', [])
                }
                
                if self.user_id:
                    ingest_params['user_id'] = self.user_id
                    
                result = self.client.ingest.units(**ingest_params)
                print(f'ingest result: {result.model_dump_json(indent=4)}')
            except Exception as e:
                print(f"{self.name}: error sending usage info", e)
