OpenAI Quickstart

Overview

This guide provides a simple, working demonstration of Pay-i's integration with OpenAI, showcasing cost tracking, limits, tags, and streaming. For an even simpler introduction, see our Getting Started guide. For more advanced examples with different Providers, check out our Pay-i-Quickstarts GitHub repository.

In this quickstart, we'll walk through a basic example step-by-step, showing you how to see immediate value from Pay-i with minimal setup.

What This Quickstart Shows

This quickstart showcases:

  • Setting up Pay-i with OpenAI
  • Creating and monitoring usage limits
  • Adding request tags for custom analytics
  • Making API requests
  • Making streaming API requests
  • Tracking costs and usage in real-time

Prerequisites

  • Python 3.7+
  • OpenAI API key
  • Pay-i API key

Installation

  1. Clone the Pay-i QuickStarts repository:
git clone https://github.com/Pay-i/Pay-i-Quickstarts.git
  2. Navigate to the OpenAI quickstart directory:
cd Pay-i-Quickstarts/quickstarts/openai

Note: The repository contains two versions of each quickstart in separate folders:

  • ingest: Examples using the standard instrumentation approach (recommended for most use cases)
  • proxy: Examples using the proxy configuration approach (for implementing Block limits)

This quickstart uses the ingest version. For more information on the differences between these approaches, see Operational Approaches.

  3. Install the required packages:
pip install --upgrade payi openai
In a Jupyter notebook, use the magic command instead:
%pip install --upgrade payi openai

We'll first run the quickstart to see it in action, then break down each portion of the code to understand how it works.

Running the Quickstart

  1. Set up your API keys using a .env file (recommended):

    a. Install python-dotenv:

    pip install python-dotenv
    

    b. Create a .env file with your API keys:

    OPENAI_API_KEY=your-openai-api-key
    PAYI_API_KEY=your-payi-api-key
    

    c. Add .env to your .gitignore file to prevent accidentally committing your API keys.

    For non-Python environments, see Environment Variables for alternatives.

  2. Run the quickstart:

    python openai_quickstart.py
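
If the script fails with an authentication error, one quick check is whether both keys from step 1 are actually visible to the Python process. The following is a minimal sketch, not part of the Pay-i SDK: `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here, and the helper only inspects a mapping such as `os.environ`.

```python
import os

# Variable names taken from the .env example in step 1.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PAYI_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not (env.get(name) or "").strip()]


if __name__ == "__main__":
    # If the keys live in a .env file, call load_dotenv() before this check.
    missing = missing_keys(os.environ)
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All required API keys are set.")
```

Because the helper accepts any mapping, it can be exercised with a plain dictionary without touching the real environment.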
    

Understanding the Code

Now that you've seen the quickstart in action, let's break down how it works:

1. Setup and Imports

First, we import the necessary libraries and set up our configuration:

import os
from dotenv import load_dotenv

# Import required libraries
from openai import OpenAI
from payi import Payi
from payi.lib.helpers import create_headers
from payi.lib.instrument import payi_instrument

# Load environment variables from .env file
load_dotenv()

2. Client Initialization

Next, we set up our Pay-i and OpenAI clients:

# Initialize Pay-i client
payi_client = Payi()

# Enable Pay-i instrumentation
payi_instrument()

# Initialize OpenAI client
openai_client = OpenAI()

3. Limit Creation

Creating a limit is optional but helpful for cost control:

# Create a limit (optional)
try:
    limit_name = "QuickStart Limit"
    limit_response = payi_client.limits.create(
        limit_name=limit_name,
        max=10.00  # $10 USD limit
    )
    limit_id = limit_response.limit.limit_id  # Store limit ID to track costs against it
except Exception as e:
    exit(f"Error creating limit: {e}")

4. Making an API Request

Now we make a request to OpenAI through Pay-i:

# Create request tags for custom analytics and track costs against limit
tags = ["standard-request"]
headers = create_headers(request_tags=tags, limit_ids=[limit_id])

# Make a standard API call, just like we would with regular OpenAI
response = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain what Pay-i does in one sentence."}],
    max_tokens=50,
    extra_headers=headers
)

# Print the result
print("\nResponse:")
print(f"---\n{response.choices[0].message.content}\n---")

# Pay-i automatically captures tracking information like cost and request ID
if hasattr(response, 'xproxy_result'):
    cost_info = response.xproxy_result.get('cost', {})
    request_id = response.xproxy_result.get('request_id', 'N/A')
    print("\nPay-i tracking information:")
    print(f"- Request ID: {request_id}")
    print(f"- Cost: ${cost_info}")
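
The snippet above prints the entire cost dictionary after the dollar sign. If a single dollar figure is preferred, the total can be pulled out first. This is a sketch that assumes the cost mapping has the shape shown in the Expected Output section (`total` containing `base`); `format_cost` is a hypothetical helper, not a Pay-i function.

```python
def format_cost(cost_info):
    """Format a cost mapping as a dollar string, or 'N/A' if no total is present."""
    total = (cost_info or {}).get("total", {}).get("base")
    if total is None:
        return "N/A"
    return f"${total:.6f}"


# Sample values copied from the Expected Output section of this guide.
sample = {
    "currency": "usd",
    "input": {"base": 8.5e-06},
    "output": {"base": 3.45e-05},
    "total": {"base": 4.3e-05},
}
print(f"- Cost: {format_cost(sample)}")
```

In the quickstart itself, the same helper could be applied to `cost_info` in place of printing the raw dictionary.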

5. Limit Monitoring

After making requests, we can check our limit usage:

status = payi_client.limits.retrieve(limit_id=limit_id)  # Retrieve current limit status

# Get the total cost from the limit status
total_cost = status.limit.totals.cost.total.base        
usage_percent = (total_cost / status.limit.max) * 100  # Calculate usage percentage
    
print(f"✓ Current usage: ${total_cost:.6f} of ${status.limit.max:.2f} ({usage_percent:.2f}%)")
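
The percentage calculation above divides by the limit's maximum, so it can be worth isolating in a small function that guards against a zero or missing maximum. This is a sketch only: `usage_percent` is a hypothetical name, and returning 0.0 when no maximum is set is one possible choice, not Pay-i behavior.

```python
def usage_percent(total_cost, limit_max):
    """Return total_cost as a percentage of limit_max (0.0 if no maximum is set)."""
    if not limit_max:
        return 0.0
    return (total_cost / limit_max) * 100


# Values from the Expected Output section: $0.000043 spent against a $10.00 limit.
percent = usage_percent(0.000043, 10.0)
print(f"{percent:.2f}%")  # rounds to two decimals, as the quickstart does
print(f"{percent:.5f}%")  # more decimals make very small usage visible
```

With a $10.00 limit, a single short request uses well under a hundredth of a percent, which is why the two-decimal figure in the sample output reads as zero.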

6. Streaming API Requests

Pay-i also works with streaming responses:

# Tags for analytics and tracking against our limit
tags = ["streaming-request"]
headers = create_headers(request_tags=tags, limit_ids=[limit_id])

# Make streaming request
stream = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem about AI cost efficiency."}],
    max_tokens=100,
    stream=True,  # Enable streaming
    extra_headers=headers
)

# Process the streaming response
print("\nStreaming response:")
print("---")

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        content = chunk.choices[0].delta.content
        print(content, end="")

print("\n---")

# Check final limit status
status = payi_client.limits.retrieve(limit_id=limit_id)  # Retrieve current limit status
    
# Get the total cost from the limit status
total_cost = status.limit.totals.cost.total.base        
usage_percent = (total_cost / status.limit.max) * 100  # Calculate usage percentage
    
print("\nChecking final limit status...")
print(f"✓ Final usage: ${total_cost:.6f} of ${status.limit.max:.2f} ({usage_percent:.2f}%)")

print("Now you can check your Pay-i dashboard to see detailed metrics and costs.")
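
The streaming loop above prints each piece of content and then discards it. If the complete text is needed afterwards (for logging, for example), the pieces can be accumulated as they arrive. The sketch below uses stand-in objects built with `SimpleNamespace` so that it runs without an API call; `collect_stream` and `fake_chunk` are hypothetical helpers, and the stand-ins expose only the attributes the quickstart loop reads.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Print each content delta as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        # Same guard as the quickstart: skip chunks with no choices or no content.
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            print(content, end="", flush=True)
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object with the attributes the loop reads."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Stand-in chunks, including an empty-choices chunk and a None-content chunk.
demo = [
    fake_chunk("Hello"),
    SimpleNamespace(choices=[]),
    fake_chunk(None),
    fake_chunk(", world"),
]
text = collect_stream(demo)
```

In the quickstart, `stream` would be passed in place of `demo`.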

What to Look For

  1. Limit creation: Notice how a limit is created and assigned a unique ID
  2. Cost tracking: Each request reports its cost, which is then reflected in the limit status
  3. Streaming support: Pay-i works seamlessly with both standard and streaming requests
  4. Request metadata: Tags are applied to each request for better organization
  5. Error handling: Limit creation is wrapped in a try/except block that exits with a message on failure; production code would need broader error handling

Next Steps

After running this quickstart, you can:

  1. Check your Pay-i dashboard to see detailed metrics about the requests
  2. Explore the request tags and analytics in the Pay-i interface
  3. Integrate similar code into your own applications
  4. Explore more advanced features like Use Cases and Decorators

For more detailed information about configuring Pay-i with different GenAI Providers, see the Auto-Instrumentation guide.

Expected Output

The quickstart will output something similar to:

Response:
---
Pay-i is a financial technology company that provides a platform for businesses to easily manage and automate their payroll processes.
---

Pay-i tracking information:
- Request ID: 2682615
- Cost: ${'currency': 'usd', 'input': {'base': 8.5e-06}, 'output': {'base': 3.45e-05}, 'total': {'base': 4.3e-05}}
✓ Current usage: $0.000043 of $10.00 (0.00%)

Streaming response:
---
In the world of AI, efficiency is key,
Cutting costs and saving time to set us free.
Algorithms and data work hand in hand,
Bringing savings to businesses across the land.

Automation and insights streamline the process,
Making tasks easier, with no need to second guess.
AI cost efficiency is the future now,
Bringing value and savings to all, somehow.
---

Checking final limit status...
✓ Final usage: $0.000165 of $10.00 (0.00%)
Now you can check your Pay-i dashboard to see detailed metrics and costs.
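
The figures in this sample output can be cross-checked with a little arithmetic. The sketch below assumes the limit was newly created for this run and that only the two requests shown were counted against it. The displayed usage values are rounded to six decimals, so the derived streaming cost is approximate.

```python
from decimal import Decimal

# Per-request cost breakdown from the sample output (first request).
input_cost = Decimal("8.5e-06")
output_cost = Decimal("3.45e-05")
first_total = input_cost + output_cost

# Final usage reported after the streaming request.
final_usage = Decimal("0.000165")
streaming_cost = final_usage - first_total

print(f"First request:     ${first_total:.6f}")
print(f"Streaming request: ${streaming_cost:.6f}")
```

`Decimal` is used here so that adding very small values does not pick up binary floating-point rounding noise.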