Day 4: Applying Limits with track_context

📋 Overview

Welcome to Day 4 of our 5-day learning path! In Day 3, you learned how to assign specific Use Cases to your GenAI calls using the @track(use_case_name="...") decorator. Today, we'll go a step further and apply spending limits to control your GenAI costs.

Today's Goal: Learn how to apply existing Pay-i spending limits to your GenAI functions using both the @track decorator and the track_context function.

Why Apply Limits?

Applying spending limits to your GenAI calls provides several important benefits:

  • Cost control: Prevent unexpected or runaway expenditure
  • Budget management: Allocate specific budgets to different features or users
  • Risk mitigation: Protect against potential abuse or bugs
  • Departmental budgeting: Assign spending allowances to different teams or departments
  • Client billing: Enforce spending caps for clients on specific pricing tiers

Recap: Annotations So Far

In previous days, we've covered:

  • Day 1: Basic auto-instrumentation with payi_instrument()
  • Day 2: Adding user context with track_context(user_id="...")
  • Day 3: Assigning Use Cases with @track(use_case_name="...")

Today, we'll focus on the limit_ids parameter, which both the @track decorator and the track_context function accept.

🔍 Prerequisites: Creating Limits

Before you can apply limits to your code, those limits must first be created in Pay-i. You can create limits in two ways:

  1. Via the Pay-i Portal: Create and configure limits in the web interface
  2. Via the SDK: Create limits programmatically using client.limits.create()

For this example, let's create two limits using the Python SDK:

import os
from payi import Client

# Initialize Pay-i client
client = Client(api_key=os.getenv("PAYI_API_KEY"))

# Create a department-level limit for the marketing team
marketing_limit = client.limits.create(
    limit_name="Marketing Department Budget",
    max=50.0,  # $50 budget
    limit_type="block"  # Block requests when the limit is reached
)
marketing_limit_id = marketing_limit.limit.limit_id  # Save this ID for later use

# Create a per-user budget limit
user_limit = client.limits.create(
    limit_name="Premium User Budget",
    max=10.0,  # $10 per user
    limit_type="allow",  # Log but allow requests that exceed the limit
    threshold=0.9  # Alert at 90% usage
)
user_limit_id = user_limit.limit.limit_id  # Save this ID for later use

print(f"Marketing limit ID: {marketing_limit_id}")
print(f"User limit ID: {user_limit_id}")

The limit_id values returned (typically in the format lim_1234567890) are what you'll use in your code to apply these limits to your GenAI calls.
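To make the difference between the two limit types concrete, here is a minimal local sketch of the behavior described in the comments above. ToyLimit is a hypothetical stand-in written for this guide, not part of the Pay-i SDK, and it assumes that a "block" limit rejects requests once spending has reached max, while an "allow" limit only raises an alert.

```python
from dataclasses import dataclass


@dataclass
class ToyLimit:
    """Local stand-in for a spending limit; not part of the Pay-i SDK."""
    max: float              # budget in dollars, mirroring the `max` argument above
    limit_type: str         # "block" or "allow"
    threshold: float = 1.0  # fraction of `max` at which an alert fires
    spent: float = 0.0      # running total charged against this limit

    def record(self, cost: float) -> dict:
        """Try to charge `cost`; report whether it was allowed and whether to alert."""
        if self.limit_type == "block" and self.spent >= self.max:
            # Budget already used up: reject and leave the total unchanged.
            return {"allowed": False, "alert": True}
        self.spent += cost
        return {"allowed": True, "alert": self.spent >= self.threshold * self.max}


marketing = ToyLimit(max=50.0, limit_type="block")
premium = ToyLimit(max=10.0, limit_type="allow", threshold=0.9)

first = premium.record(8.0)     # below 90% of $10: allowed, no alert
second = premium.record(1.5)    # total 9.5 crosses the 90% threshold: alert
third = premium.record(2.0)     # total 11.5 is over budget but still allowed

marketing.record(50.0)          # uses the whole $50 budget
blocked = marketing.record(0.01)  # rejected because the limit has been reached
```

In the real service, this accounting happens on the Pay-i side. The sketch only illustrates what to expect from each limit_type.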

Note: Once created, you can also find these limit IDs in the Pay-i Portal by navigating to the Limits section and viewing the details of each limit.

🔑 Core Concept: Applying Limits with @track and track_context

Pay-i offers two approaches for applying limits to your GenAI calls, matching the patterns we've already learned:

1. Static Limits with @track

The @track decorator accepts a limit_ids parameter that lets you apply specific limits to all GenAI calls made within a function:

from payi.lib.instrument import track

@track(limit_ids=["lim_marketing_dept_monthly"])  # Using a hardcoded, known limit ID
def generate_marketing_content(prompt):
    # All GenAI calls in this function will be tracked against this static limit ID
    response = client.chat.completions.create(...)
    return response.choices[0].message.content

This approach is best for static limits that you want to apply consistently to a specific feature or function. Python evaluates decorator arguments once, when the function is defined, so the limit IDs must already be known at that point and cannot change from one call to the next.

Important: Use @track only for limit IDs that stay fixed for the lifetime of the function, whether written as a literal string or stored in a variable that is set before the function is defined. For limits chosen per call (for example, by user or subscription tier), use track_context instead.
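The reason a decorator suits only static limits is a property of Python itself: decorator arguments are evaluated once, when the function is defined. The sketch below demonstrates this with fake_track, a hypothetical stand-in decorator written for illustration. It is not Pay-i's @track and makes no claims about how @track works internally.

```python
import functools


def fake_track(limit_ids):
    """Hypothetical stand-in for a decorator that accepts limit IDs."""
    bound_ids = list(limit_ids)  # captured once, when the decorator is applied

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Every call sees the same IDs that were captured at definition time.
            return bound_ids, func(*args, **kwargs)
        return wrapper
    return decorator


current_limit = "lim_marketing_dept_monthly"


@fake_track(limit_ids=[current_limit])  # argument is evaluated here, once
def generate(prompt):
    return prompt.upper()


current_limit = "lim_something_else"  # rebinding later does not affect `generate`
ids, text = generate("hello")
```

After the rebinding, ids still holds the original limit ID, which is why a per-call choice of limit needs a construct that runs on every call.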

2. Dynamic Limits with track_context

The track_context function also accepts a limit_ids parameter for applying limits dynamically at runtime:

from payi.lib.instrument import track_context

def generate_content_for_user(prompt, user_id, user_tier):
    # Determine which limit to apply based on user tier
    if user_tier == "premium":
        limit_id = f"lim_{user_id}_premium"  # This would be a pre-created limit ID
    else:
        limit_id = "lim_basic_users"  # This would be a pre-created limit ID
    
    # Apply the limit dynamically at runtime
    with track_context(limit_ids=[limit_id]):
        response = client.chat.completions.create(...)
    
    return response.choices[0].message.content

This approach is ideal for dynamic limits that change based on runtime variables like user identity, subscription tier, or other context-specific information.
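One way to keep the selection logic testable is to move it into a small helper that returns the list passed as limit_ids. The following is a sketch under stated assumptions: TIER_LIMIT_IDS, resolve_limit_ids, and the ID values are hypothetical names invented for this example, and the IDs would have to correspond to limits that already exist in Pay-i.

```python
from typing import Dict, List, Optional

# Hypothetical registry of limit IDs that were created ahead of time.
TIER_LIMIT_IDS: Dict[str, str] = {
    "premium": "lim_premium_users",
    "basic": "lim_basic_users",
}


def resolve_limit_ids(user_tier: str, user_limit_id: Optional[str] = None) -> List[str]:
    """Return the limit IDs to apply to one request."""
    # Unknown tiers fall back to the basic limit rather than running without one.
    limit_ids = [TIER_LIMIT_IDS.get(user_tier, TIER_LIMIT_IDS["basic"])]
    if user_limit_id:
        # Optionally stack a limit that belongs to this specific user.
        limit_ids.append(user_limit_id)
    return limit_ids


premium_ids = resolve_limit_ids("premium", "lim_user_12345")
fallback_ids = resolve_limit_ids("trial")
```

The result can then be used as `with track_context(limit_ids=resolve_limit_ids(user_tier)):` around the GenAI call.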

💻 Practical Examples

Example 1: Applying a Static Limit with @track

Here's a complete example showing how to apply a static limit to all GenAI calls in a function:

import os
from openai import OpenAI
from payi.lib.instrument import payi_instrument, track

# Initialize Pay-i instrumentation
payi_instrument()

# Configure OpenAI client
client = OpenAI()  # Uses OPENAI_API_KEY environment variable

# Apply the marketing limit created in the Prerequisites section.
# marketing_limit_id must already be defined when this function is defined.
@track(limit_ids=[marketing_limit_id])
def generate_marketing_content(prompt):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a marketing content specialist."},
            {"role": "user", "content": prompt}
        ]
    )
    return response.choices[0].message.content

# Generate some marketing content
slogan = generate_marketing_content("Create a catchy slogan for a new smartphone")
print(f"Marketing slogan: {slogan}")

In this example, all GenAI calls within the generate_marketing_content function are tracked against the marketing limit we created earlier. Once this limit is reached, further requests are blocked because we created it with limit_type="block".
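Because the example depends on marketing_limit_id being defined before the function is, an application that creates its limits in a separate setup step needs some way to hand the ID over. A minimal sketch, assuming the ID is supplied through an environment variable: the variable name MARKETING_LIMIT_ID and the helper load_limit_id are hypothetical choices for this example, not Pay-i conventions.

```python
import os


def load_limit_id(env_var: str) -> str:
    """Read a limit ID from the environment, failing loudly if it is missing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"Set {env_var} to the ID of a limit created in Pay-i")
    return value


# Placeholder value, set here only so the sketch runs on its own.
os.environ.setdefault("MARKETING_LIMIT_ID", "lim_1234567890")

marketing_limit_id = load_limit_id("MARKETING_LIMIT_ID")
```

Failing at startup when the variable is missing is a deliberate choice here: it avoids silently running a function without the limit it was meant to have.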

Example 2: Applying a Dynamic User Limit with track_context

Here's how to apply the user limit we created earlier:

import os
from openai import OpenAI
from payi.lib.instrument import payi_instrument, track_context

# Initialize Pay-i instrumentation
payi_instrument()

# Configure OpenAI client
client = OpenAI()  # Uses OPENAI_API_KEY environment variable

# Using the user limit we created earlier
premium_user_limit_id = user_limit_id  # This was saved from our earlier limit creation

# Apply user-specific limits dynamically
def generate_premium_content(prompt, user_id):
    # Use the user ID and our pre-created limit
    with track_context(user_id=user_id, limit_ids=[premium_user_limit_id]):
        response = client.chat.completions.create(
            model="gpt-4",  # Using a more expensive model for premium content
            messages=[{"role": "user", "content": prompt}]
        )
    return response.choices[0].message.content

# Generate premium content for a specific user
premium_content = generate_premium_content(
    "Write a detailed analysis of recent AI developments",
    "user_12345"
)
print(f"Premium content: {premium_content[:50]}...")

This approach tracks the user's spending against our pre-created "Premium User Budget" limit. Because that limit was created with limit_type="allow" and threshold=0.9, requests that exceed the budget are still allowed (and logged), and an alert is triggered at 90% usage.

Example 3: Combining Use Cases, User IDs, and Limits

You can combine all three tracking methods to create a comprehensive tracking solution:

import os
from openai import OpenAI
from payi.lib.instrument import payi_instrument, track, track_context

# Initialize Pay-i instrumentation
payi_instrument()

# Configure OpenAI client
client = OpenAI()  # Uses OPENAI_API_KEY environment variable

# Using both our previously created limits
department_limit_id = marketing_limit_id  # Reusing our marketing limit for department tracking
user_budget_limit_id = user_limit_id      # Reusing our user limit for user tracking

# Static annotations for use case and department-level budget limit
@track(
    use_case_name="customer_support",  # Associate with the customer support use case
    limit_ids=[department_limit_id]    # Apply the department budget limit
)
def generate_support_response(ticket_id, question, user_id):
    # Dynamic tracking for user ID and user-specific limit
    with track_context(
        user_id=user_id,               # Track which user this is for
        limit_ids=[user_budget_limit_id]  # Apply the user's budget limit
    ):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful customer support assistant."},
                {"role": "user", "content": f"Ticket ID: {ticket_id}\nQuestion: {question}"}
            ]
        )
    return response.choices[0].message.content

# Generate a support response
support_reply = generate_support_response(
    "TKT-12345",
    "How do I reset my password?",
    "customer_67890"
)
print(f"Support response: {support_reply}")

In this example:

  1. The function is associated with the customer_support use case
  2. All calls are tracked against the department budget limit (department_limit_id)
  3. Each call is also tracked against the user budget limit (user_budget_limit_id)
  4. The user ID is recorded for user-level analytics

This layered approach gives you maximum visibility and control over your GenAI spending.
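The layering described in the list above can be pictured as nested scopes that each add limit IDs to the set applied to a call. The sketch below is a toy model built on Python's contextvars module. The names toy_context and active_limits are hypothetical, and it assumes that inner scopes add to the outer limits rather than replace them. It does not describe Pay-i's internals.

```python
import contextvars
from contextlib import contextmanager

# Limit IDs active in the current scope (toy model only).
_active_limits = contextvars.ContextVar("active_limits", default=())


@contextmanager
def toy_context(limit_ids):
    """Add `limit_ids` to the active set for the duration of the block."""
    current = _active_limits.get()
    merged = current + tuple(i for i in limit_ids if i not in current)
    token = _active_limits.set(merged)
    try:
        yield
    finally:
        # Restore whatever was active before this scope was entered.
        _active_limits.reset(token)


def active_limits():
    """Return the limit IDs that would apply to a call made right now."""
    return list(_active_limits.get())


with toy_context(["lim_department"]):      # plays the role of @track
    with toy_context(["lim_user"]):        # plays the role of track_context
        inner = active_limits()
    outer = active_limits()
after = active_limits()
```

In this model, a call made in the inner scope counts against both limits, and leaving a scope removes only the limits that scope added.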

✅ Verification: Checking Limit Application

To verify that your limits are being applied correctly:

  1. Run your application and make several AI calls that should be tracked against your limits
  2. Log in to developer.pay-i.com
  3. Navigate to your application dashboard
  4. Click on Limits in the left sidebar
  5. You'll see a list of all your limits with:
    • Their current usage
    • Percentage of the limit used
    • Status indicators (green, yellow, or red based on usage)

For limits that have exceeded their threshold but not the maximum, you'll see yellow warning indicators. For limits that have reached their maximum (in "Block" mode), you'll see that requests have been blocked.

You can click on any limit to view detailed information about the requests that have been tracked against it, including timestamps, costs, and associated use cases and users.

➡️ Next Steps

Congratulations! You've learned how to apply spending limits to your GenAI calls using both the @track decorator for static limits and the track_context function for dynamic limits.

Tomorrow in Day 5, we'll bring everything together and explore more advanced patterns for combining all the techniques we've learned throughout this learning path.

💡 Additional Resources