Overview

This guide provides a simple, working demonstration of Pay-i's integration with Azure OpenAI, showcasing cost tracking, limits, tags, and streaming. For an even simpler introduction, see our Getting Started guide. For more advanced examples with different Providers, check out our Pay-i-Quickstarts GitHub repository.

In this quickstart, we'll walk through a basic example step-by-step, showing you how to see immediate value from Pay-i with minimal setup.

What This Quickstart Shows

This quickstart showcases:

Setting up Pay-i with Azure OpenAI
Creating and monitoring usage limits
Adding request tags for custom analytics
Making standard (non-streaming) API requests
Making streaming API requests
Tracking costs and usage in real-time

Prerequisites

Python 3.7+
Azure OpenAI API key
Azure OpenAI endpoint URL
Azure OpenAI API version (e.g., "2023-05-15")
Azure OpenAI deployment name (e.g., "gpt-35-turbo")
Azure deployment type (optional: "global", "datazone", or "region")
Pay-i API key

Installation

From a terminal:

pip install payi openai

From a Jupyter notebook:

%pip install payi openai

Note: If you have an older version of the Pay-i SDK installed, use pip install --upgrade payi to get the latest version.

We'll first run the quickstart to see it in action, then break down each portion of the code to understand how it works.

Running the Quickstart

Download the code from our GitHub repository and save it as azure_openai_quickstart.py

Set your environment variables.

On macOS or Linux:

export AZURE_OPENAI_API_KEY="your-azure-openai-api-key"
export AZURE_OPENAI_ENDPOINT="your-azure-openai-endpoint"
export AZURE_OPENAI_MODEL="gpt-4o-2024-05-13"
export AZURE_OPENAI_DEPLOYMENT="your-deployment-name"
export PAYI_API_KEY="your-payi-api-key"

On Windows (Command Prompt):

set AZURE_OPENAI_API_KEY=your-azure-openai-api-key
set AZURE_OPENAI_ENDPOINT=your-azure-openai-endpoint
set AZURE_OPENAI_MODEL=gpt-4o-2024-05-13
set AZURE_OPENAI_DEPLOYMENT=your-deployment-name
set PAYI_API_KEY=your-payi-api-key

Run the quickstart:
```
python azure_openai_quickstart.py
```
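
Before running the script, it can help to confirm that every variable from the step above is actually set in the current shell. The following is a minimal sketch, not part of the Pay-i quickstart itself: the helper name `missing_env_vars` is hypothetical, and the variable list simply mirrors the names used above.

```python
import os

# Variable names taken from the environment setup step above.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL",
    "AZURE_OPENAI_DEPLOYMENT",
    "PAYI_API_KEY",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in isolation.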

Understanding the Code

Now that you've seen the quickstart in action, let's break down how it works:

1. Setup and Imports

First, we import the necessary libraries and set up our configuration:

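import os

from openai import AzureOpenAI
from payi import Payi  # Pay-i client, used below for limit management
from payi.lib.helpers import create_headers  # Builds the request-tag and limit headers used below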

# Read the API keys from the environment; replace the default values with your own keys if needed
AZURE_API_KEY = os.getenv("AZURE_OPENAI_API_KEY", "YOUR_AZURE_OPENAI_API_KEY")
PAYI_API_KEY = os.getenv("PAYI_API_KEY", "YOUR_PAYI_API_KEY")

API_VERSION = "2024-02-15-preview"

# Azure OpenAI model name, e.g. "gpt-4o-2024-05-13"
# Note that the full model name is required by Azure OpenAI
AZURE_MODEL = os.getenv("AZURE_OPENAI_MODEL", "YOUR_AZURE_OPENAI_MODEL")

# Azure OpenAI deployment name, e.g. "test-4o"
AZURE_DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT", "YOUR_AZURE_OPENAI_DEPLOYMENT")

# Deployed Azure OpenAI endpoint URI
AZURE_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT", "YOUR_AZURE_OPENAI_ENDPOINT")

# Replace with one of the following values: "global", "datazone", or "region" depending on your azure deployment type
AZURE_DEPLOYMENT_TYPE = None

2. Client Initialization

Next, we set up our Pay-i and Azure OpenAI clients:

from payi.lib.instrument import payi_instrument

# Initialize Pay-i client for limit management
payi_client = Payi(api_key=PAYI_API_KEY)

# Enable Pay-i instrumentation
payi_instrument()

# Initialize Azure OpenAI client
azure_client = AzureOpenAI(
    api_key=AZURE_API_KEY,
    api_version=API_VERSION,
    azure_endpoint=AZURE_ENDPOINT  # Direct connection to Azure
)

3. Limit Creation

Creating a limit is optional but helpful for cost control:

# Create a limit (optional)
try:
    limit_name = "AzureQuickStart Limit"
    limit_response = payi_client.limits.create(
        limit_name=limit_name,
        max=10.00  # $10 USD limit
    )
    limit_id = limit_response.limit.limit_id  # Store limit ID to track costs against it
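except Exception as e:
    # Limit creation is optional, so report the failure and continue without a limit
    print(f"Could not create limit: {e}")
    limit_id = None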

4. Making a Standard API Request

Now we make a regular (non-streaming) request to Azure OpenAI, which Pay-i's instrumentation tracks:

# Create request tags for custom analytics and track costs against limit
tags = ["azure-standard-request"]
headers = create_headers(request_tags=tags, limit_ids=[limit_id] if limit_id else None)

# Make a standard API call, just like we would with regular Azure OpenAI
response = azure_client.chat.completions.create(
    model=AZURE_DEPLOYMENT,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what Pay-i does in one sentence."}
    ],
    extra_headers=headers
)


# Print the result
print("\nResponse:")
print(f"---\n{response.choices[0].message.content}\n---")

# Pay-i automatically captures tracking information like cost and request ID
# (guarded in case the attribute is absent on the response)
xproxy_result = getattr(response, "xproxy_result", None)
if xproxy_result:
    cost_info = xproxy_result.get('cost', {})
    request_id = xproxy_result.get('request_id', 'N/A')
    print("\nPay-i tracking information:")
    print(f"- Request ID: {request_id}")
    print(f"- Cost: ${cost_info}")
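
The `cost` value printed above is a nested dictionary rather than a single number. As a rough sketch, assuming the dictionary has the shape shown in this guide's sample output (a `currency` entry plus `input`, `output`, and `total` entries that each hold a `base` amount), a small helper can reduce it to a readable total. The names `total_cost` and `format_cost` are hypothetical and not part of the Pay-i SDK.

```python
def total_cost(cost_info):
    """Return the total base cost as a float, or 0.0 if it is missing.

    Assumes the dictionary shape seen in this guide's sample output.
    """
    return float(cost_info.get("total", {}).get("base", 0.0))


def format_cost(cost_info):
    """Format the total cost with six decimals and an upper-case currency code."""
    currency = str(cost_info.get("currency", "usd")).upper()
    return f"${total_cost(cost_info):.6f} {currency}"


# Values copied from the sample output in this guide.
sample = {
    "currency": "usd",
    "input": {"base": 8.5e-06},
    "output": {"base": 3.45e-05},
    "total": {"base": 4.3e-05},
}
print(format_cost(sample))
```

Falling back to `0.0` for missing keys is a design choice for the sketch; stricter code might prefer to raise instead.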

5. Limit Monitoring

After making requests, we can check our limit usage:

if limit_id:
    status = payi_client.limits.retrieve(limit_id=limit_id)  # Retrieve current limit status
    
    # Get the total cost from the limit status
    total_cost = status.limit.totals.cost.total.base
        
    usage_percent = (total_cost / status.limit.max) * 100  # Calculate usage percentage
    
    print(f"✓ Current usage: ${total_cost:.6f} of ${status.limit.max:.2f} ({usage_percent:.2f}%)")
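
The percentage calculation above is simple arithmetic, but two details are easy to miss: a limit whose `max` is zero would raise a division error, and very small costs round to `0.00%` at two decimal places. A minimal sketch follows; the function name `usage_percent` is hypothetical, and the numbers come from the sample output in this guide.

```python
def usage_percent(total_cost, limit_max):
    """Return spend as a percentage of the limit, or 0.0 if the limit is not positive."""
    if not limit_max or limit_max <= 0:
        return 0.0
    return (total_cost / limit_max) * 100


pct = usage_percent(0.000043, 10.00)
print(f"{pct:.2f}%")  # two decimals, as in the quickstart: 0.00%
print(f"{pct:.5f}%")  # more decimals reveal the actual value: 0.00043%
```

Returning `0.0` for a non-positive limit is one possible convention; depending on the application, raising an error may be more appropriate.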

6. Streaming API Requests

Pay-i also works with streaming responses:

# Tags for analytics and tracking against our limit
tags = ["azure-streaming-request"]
headers = create_headers(request_tags=tags, limit_ids=[limit_id] if limit_id else None)

# Make streaming request
stream = azure_client.chat.completions.create(
    model=AZURE_DEPLOYMENT,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short poem about AI cost efficiency."}
    ],
    max_tokens=100,
    stream=True,  # Enable streaming
    extra_headers=headers
)


# Process the streaming response
print("\nStreaming response:")
print("---")

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        content = chunk.choices[0].delta.content
        print(content, end="")

print("\n---")

# Check final limit status
if limit_id:
    status = payi_client.limits.retrieve(limit_id=limit_id)  # Retrieve current limit status
    
    # Get the total cost from the limit status
    total_cost = status.limit.totals.cost.total.base
        
    usage_percent = (total_cost / status.limit.max) * 100  # Calculate usage percentage
    
    print(f"\nChecking final limit status...")
    print(f"✓ Final usage: ${total_cost:.6f} of ${status.limit.max:.2f} ({usage_percent:.2f}%)")

print("Now you can check your Pay-i dashboard to see detailed metrics and costs.")
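
The streaming loop above prints each chunk as it arrives. If the full text is also needed afterwards (for logging or post-processing, say), the pieces can be collected while streaming. The sketch below uses stand-in chunk objects built with `types.SimpleNamespace` so that it runs without any network call; they only mimic the `choices[0].delta.content` attribute path used in the loop above, and `collect_stream_text` and `fake_chunk` are hypothetical helper names.

```python
from types import SimpleNamespace


def collect_stream_text(stream):
    """Print streamed content as it arrives and return the concatenated text."""
    pieces = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            print(content, end="", flush=True)
            pieces.append(content)
    print()
    return "".join(pieces)


def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a stream chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk("Hello"),
    SimpleNamespace(choices=[]),  # chunk with no choices
    fake_chunk(None),             # chunk with an empty delta
    fake_chunk(", world"),
]
full_text = collect_stream_text(fake_stream)
```

With a real request, the `stream` object returned by `azure_client.chat.completions.create(..., stream=True)` would take the place of `fake_stream`.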

What to Look For

Limit creation: Notice how a limit is created and assigned a unique ID
Cost tracking: Each request reports its cost, which is then reflected in the limit status
Streaming support: Pay-i works seamlessly with both standard and streaming requests
Request metadata: Tags are applied to each request for better organization
Error handling: Limit creation is wrapped in a try/except block, so the quickstart continues without a limit if creation fails

Next Steps

After running this quickstart, you can:

Check your Pay-i dashboard to see detailed metrics about the requests
Explore the request tags and analytics in the Pay-i interface
Integrate similar code into your own applications
Explore more advanced features like Use Cases and Decorators

For more detailed information about configuring Pay-i with different GenAI Providers, see the Auto-Instrumentation guide or the Azure OpenAI Configuration guide.

Expected Output

The quickstart will output something similar to:

Response:
---
Pay-i is a comprehensive cost tracking and monitoring platform for AI APIs that helps businesses control spending and optimize their AI usage.
---

Pay-i tracking information:
- Request ID: 2682789
- Cost: ${'currency': 'usd', 'input': {'base': 8.5e-06}, 'output': {'base': 3.45e-05}, 'total': {'base': 4.3e-05}}
✓ Current usage: $0.000043 of $10.00 (0.00%)

Streaming response:
---
In the realm of Azure's cloud,
Where AI costs could grow unbounded and proud,
Pay-i stands vigilant and true,
Tracking each token, old and new.

Efficiency becomes the golden key,
Optimizing costs for all to see,
With limits in place and usage clear,
Financial surprises need not appear.
---

Checking final limit status...
✓ Final usage: $0.000165 of $10.00 (0.00%)
Now you can check your Pay-i dashboard to see detailed metrics and costs.