
LangChain Provider Configuration

Overview

This guide explains how to configure LangChain to work with Pay-i for tracking token usage and costs. LangChain integrations cover many of the Providers and Resources that Pay-i can track.

SDK Support

The examples in this guide use the Pay-i Python SDK with LangChain. The integration uses a custom callback handler that sends token usage data to Pay-i after LLM calls complete.

LangChain Integration Approach

Unlike other providers where Pay-i can operate in either Proxy or Ingest mode, LangChain integration works exclusively through an Ingest approach using a custom callback handler. This handler captures token usage after calls complete and sends the data to Pay-i's ingest API.
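
To make the Ingest approach concrete, here is the underlying call that the callback handler automates. This is a minimal sketch using the same ingest.units call shown later in this guide, with hard-coded placeholder token counts:

from payi import Payi
from payi.lib.helpers import PayiCategories

payi_client = Payi(api_key="YOUR_PAYI_API_KEY")

# Report one completed call's token usage directly to Pay-i's ingest API.
# The callback handler below does exactly this, with the real token counts
# taken from the LangChain response.
result = payi_client.ingest.units(
    category=PayiCategories.openai,
    resource="gpt-3.5-turbo",
    units={"text": {"input": 50, "output": 100}},  # placeholder counts
)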

Creating a Custom Callback Handler

The key to LangChain integration is creating a PayiHandler custom callback class:

import os
from payi import Payi
from langchain_core.callbacks import BaseCallbackHandler

class PayiHandler(BaseCallbackHandler):
    def __init__(self, client, params):
        self.name = "custom_handler"
        self.client = client      # Pay-i client used for ingest calls
        self.params = {**params}  # category, resource, and optional limit/tag settings

    def on_llm_end(self, response, **kwargs):
        llm_output = response.llm_output
        if llm_output and 'token_usage' in llm_output:
            token_usage = llm_output['token_usage']
            prompt_tokens = token_usage.get('prompt_tokens', 0)
            completion_tokens = token_usage.get('completion_tokens', 0)

            if prompt_tokens == 0 and completion_tokens == 0:
                print(f"{self.name}: no token usage in LLM output", response)
                return

            try:
                # Send token usage data to Pay-i
                result = self.client.ingest.units(
                    category=self.params['category'],
                    resource=self.params['resource'],
                    units={"text": {"input": prompt_tokens, "output": completion_tokens}},
                    limit_ids=self.params.get('limit_ids', []),
                    request_tags=self.params.get('request_tags', [])
                )
                print(f'ingest result: {result.model_dump_json(indent=4)}')
            except Exception as e:
                print(f"{self.name}: error sending usage info", e)

Setting Up and Using the Handler

Here's how to set up and use the PayiHandler with LangChain:

import os
from payi import Payi
from payi.lib.helpers import PayiCategories
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Read API keys from environment variables
openai_key = os.getenv("OPENAI_API_KEY", "YOUR_OPENAI_KEY")
payi_api_key = os.getenv("PAYI_API_KEY", "YOUR_PAYI_API_KEY")

# Create Pay-i client
payi_client = Payi(
    api_key=payi_api_key
)

# Create a limit to track usage against (optional)
limit_response = payi_client.limits.create(
    limit_name='LangChain example limit', 
    max=12.50,  # $12.50 USD
    limit_type="Allow",
    limit_tags=["langchain_example"]
)
limit_id = limit_response.limit.limit_id

# Configuration parameters for the Pay-i handler
params = {
    'category': PayiCategories.openai,  # Provider category
    'resource': 'gpt-3.5-turbo',        # Model identifier
    'limit_ids': [limit_id],            # Optional: limits to track against
    'request_tags': ['langchain', 'example']  # Optional: tags for organization
}

# Create the Pay-i handler
handler = PayiHandler(client=payi_client, params=params)

# Create the LangChain model with the handler
model = ChatOpenAI( 
    model=params['resource'],
    api_key=openai_key,
    callbacks=[handler]  # Register the Pay-i handler
)

# Create and run a simple chain
prompt = ChatPromptTemplate.from_messages(["Say this: {text}"])
chain = prompt | model
response = chain.invoke({"text": "Hello, world!"})
print(response.content)
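
Because the handler fires on every on_llm_end event, it also works unchanged when the chain is invoked multiple times. For example, using the standard Runnable batch method:

# Each input produces its own LLM call and on_llm_end event,
# so each one is ingested to Pay-i separately.
responses = chain.batch([
    {"text": "Hello, world!"},
    {"text": "Goodbye, world!"},
])
for response in responses:
    print(response.content)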

Supported LangChain Providers

This approach works with any LangChain integration that properly reports token usage in the llm_output. Examples include:

  • OpenAI models
  • Azure OpenAI models
  • Anthropic Claude models
  • AWS Bedrock models

To use a different provider, simply update the category and resource parameters to match the provider you're using:

# For Azure OpenAI
params = {
    'category': PayiCategories.azure_openai,
    'resource': 'your-deployment-name',
    'limit_ids': [limit_id],
    'request_tags': ['langchain', 'azure']
}

# For Anthropic
params = {
    'category': PayiCategories.anthropic,
    'resource': 'claude-3-haiku-20240307',  # direct Anthropic model ID (the -v1:0 suffix is Bedrock-specific)
    'limit_ids': [limit_id],
    'request_tags': ['langchain', 'anthropic']
}
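
Note that PayiHandler reads OpenAI-style keys (token_usage with prompt_tokens and completion_tokens) from llm_output, and not every provider reports usage under those names. If a provider's counts aren't being picked up, one option is to read LangChain's standardized usage_metadata from the generated message instead. The following is a sketch, assuming a recent langchain-core version that populates usage_metadata on chat generations:

    def on_llm_end(self, response, **kwargs):
        # usage_metadata is LangChain's standardized usage report; it is only
        # populated on recent langchain-core versions and supported providers.
        generation = response.generations[0][0]
        message = getattr(generation, "message", None)  # present for chat generations
        usage = getattr(message, "usage_metadata", None) or {}
        prompt_tokens = usage.get("input_tokens", 0)
        completion_tokens = usage.get("output_tokens", 0)
        # ... then call self.client.ingest.units as in PayiHandler above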

Advanced Usage: Customizing the Handler

You can build on the PayiHandler pattern to add more functionality. For example, you might want to:

  • Track additional metadata
  • Add more detailed logging
  • Include user IDs for attribution
  • Handle streaming responses

Here's an extended example with user ID tracking:

class ExtendedPayiHandler(BaseCallbackHandler):
    def __init__(self, client, params, user_id=None):
        self.name = "extended_handler"
        self.client = client
        self.params = {**params}
        self.user_id = user_id
        
    def on_llm_end(self, response, **kwargs):
        llm_output = response.llm_output
        if llm_output and 'token_usage' in llm_output:
            token_usage = llm_output['token_usage']
            prompt_tokens = token_usage.get('prompt_tokens', 0)
            completion_tokens = token_usage.get('completion_tokens', 0)
            
            try:
                # Include user_id if provided
                ingest_params = {
                    'category': self.params['category'],
                    'resource': self.params['resource'],
                    'units': {"text": {"input": prompt_tokens, "output": completion_tokens}},
                    'limit_ids': self.params.get('limit_ids', []),
                    'request_tags': self.params.get('request_tags', [])
                }
                
                if self.user_id:
                    ingest_params['user_id'] = self.user_id
                    
                result = self.client.ingest.units(**ingest_params)
                print(f'ingest result: {result.model_dump_json(indent=4)}')
            except Exception as e:
                print(f"{self.name}: error sending usage info", e)

Related Resources