Advanced Concepts

Choose the simple pattern before using the APIs directly

We strongly recommend the simpler approach: specify limit IDs when calling @track or track_context, then inspect the limit state in the resulting xproxy_result.

This guide explains advanced concepts for managing usage limits and budgets in Pay-i by using the Python SDK directly. Limits provide fine-grained cost control and budget enforcement for your AI applications. For a conceptual understanding of limits and their role in the Pay-i platform, please refer to the Limits Concepts page.

When to Use Limits

You'll work with limits in the Pay-i Python SDK for several important purposes:

  1. Budget Management: Enforce spending caps for AI model usage to prevent unexpected costs
  2. Cost Control: Implement tiered spending thresholds with notification alerts
  3. Resource Allocation: Distribute AI budgets across teams, projects, or client accounts
  4. Usage Guardrails: Prevent runaway costs from unexpected usage spikes or misconfigurations
  5. Compliance: Meet organizational or regulatory requirements for budget enforcement

Common Workflows

Working with the Pay-i limits API involves several common patterns for creating, monitoring, and enforcing spending limits. This section walks you through these workflows with practical code examples.

The examples below demonstrate how to:

  1. Create a budget limit with appropriate parameters
  2. Find existing limits by ID or name
  3. Monitor usage across multiple requests
  4. Check limit status and react to exceeded limits
  5. Reset limits at the end of billing periods

For demonstration purposes, these examples use payi.ingest.units() to manually record GenAI usage metrics. The ingest.units() method returns limit status information that your application must check to determine if limits have been exceeded. In typical applications, you would use the SDK's decorators or context managers for more streamlined usage tracking.

Note: These examples use the Python SDK's client objects (Payi and AsyncPayi), which provide a resource-based interface to the Pay-i API. For details on client initialization and configuration, see the Pay-i Client Initialization guide.

Note for returning users: If you've previously run through the examples in this guide, you may want to clean up any existing limits before proceeding. This will help prevent errors due to limits with the same name but different parameters, as limit creation is only idempotent when all parameters exactly match.
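
A minimal cleanup sketch, assuming that limits.list(limit_name=...) and limits.delete(limit_id=...) behave the way they are used in the multi-layered example later in this guide. The helper name delete_limits_by_name is hypothetical, and the client is passed in as an argument so the function can be exercised without network access.

```python
def delete_limits_by_name(client, names):
    """Delete every limit whose name appears in `names`; return the deleted IDs.

    `client` is expected to be a Payi instance. The list/delete calls mirror
    the ones used in the multi-layered example in this guide (assumption:
    the list result exposes matching limits under `.items`).
    """
    deleted = []
    for name in names:
        existing = client.limits.list(limit_name=name)
        for limit in existing.items:
            client.limits.delete(limit_id=limit.limit_id)
            deleted.append(limit.limit_id)
    return deleted
```

For the limits created in this guide, a call might look like delete_limits_by_name(Payi(), ["Monthly Budget", "Daily Safety Cap", "Premium Model Usage"]). Names that match no existing limit are skipped silently.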

Creating a Monthly Budget Limit

When you need to establish a fixed spending cap for your AI usage, creating a monthly budget limit ensures you won't exceed your allocated budget. Note that limit creation is idempotent only when all parameters match exactly: if you attempt to create a limit with the same name but different parameters, the operation fails with an error. The following example creates such a limit:

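pip install payi python-dotenv

from payi import Payi
from dotenv import load_dotenv

# Load environment variables (such as API credentials) from a .env file
load_dotenv()

payi = Payi()
limit_id = "monthly_allow_budget"

# Create a monthly budget limit with "allow" type for use-case level tracking
response = payi.limits.create(
  limit_name="Monthly Budget",
  limit_id=limit_id,
  max=100.0,           # Budget cap in dollars
  limit_type="allow",  # Track usage without blocking requests
  threshold=0.80       # Notification threshold at 80% of max
)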
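
To confirm what was stored, the creation response can be summarized in one line. This is a sketch that assumes the response exposes the limit under .limit with limit_name, max, limit_type, and threshold attributes, as the longer example in this guide does; summarize_limit is a hypothetical helper name, not part of the SDK.

```python
def summarize_limit(response):
    """Return a one-line summary of the limit carried by a create/retrieve response.

    Assumption: `response.limit` has `limit_name`, `max`, `limit_type`,
    and `threshold` attributes, matching their use elsewhere in this guide.
    """
    limit = response.limit
    return (
        f"'{limit.limit_name}' (${limit.max:.2f}, {limit.limit_type} type, "
        f"{limit.threshold * 100:.0f}% threshold)"
    )
```

Calling print(summarize_limit(response)) right after payi.limits.create(...) gives a quick check that the name, cap, type, and threshold are the ones you intended.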

Real-World Example: Multi-Layered Budget Control System

Let's walk through a comprehensive example of setting up a tiered limit structure that provides granular cost control for an enterprise AI application. This example demonstrates a common pattern in production environments:

Budget tracking with non-blocking limits: these monitor usage against targets without disrupting service, so you can track progress toward goals and receive notifications.

This mixed approach gives you both protection against runaway costs and visibility into usage patterns:

from payi import Payi
from dotenv import load_dotenv

load_dotenv()

# Initialize the Pay-i client
payi = Payi()

# Step 1: Set up a multi-layered budget control system with different scopes
def setup_budget_control_system():
  """
  Create a sophisticated budget control system with daily, monthly,
  and model-specific limits to provide defense-in-depth cost control.

  This uses a combination of:
  - "allow" limits for tracking usage and goals without blocking requests
  - "block" limits as safety caps against runaway costs
  """
  # Dictionary to store limit IDs
  stored_limit_ids = {}
    
  # Helper function to ensure a limit exists with exact desired parameters
  # If a limit with same name but different parameters exists, it deletes and recreates it
  def ensure_limit_exists(name, id, max_value, limit_type, threshold_value):
    # First check if limit with this name already exists
    existing_limits = payi.limits.list(limit_name=name)
        
    if existing_limits.items:
      existing_limit = existing_limits.items[0]
      existing_id = existing_limit.limit_id
            
      # Check if existing limit has different important parameters
      # If so, we need to delete and recreate it since these can't be updated
      if (existing_limit.limit_type != limit_type or
          existing_limit.threshold != threshold_value):
        print(f"Found limit '{name}' with different immutable parameters.")
        print(f"  Existing: type={existing_limit.limit_type}, threshold={existing_limit.threshold*100:.0f}%")
        print(f"  Desired: type={limit_type}, threshold={threshold_value*100:.0f}%")
        print(f"  Deleting and recreating limit...")
                
        # Delete the existing limit
        payi.limits.delete(limit_id=existing_id)
        print(f"  Deleted limit: {name} (ID: {existing_id})")
                
        # Create new limit with desired parameters
        limit_response = payi.limits.create(
          limit_name=name,
          limit_id=id,
          max=max_value,
          limit_type=limit_type,
          threshold=threshold_value
        )
        limit_obj = limit_response.limit
        print(f"  Created new limit: '{limit_obj.limit_name}' (${limit_obj.max:.2f}, {limit_obj.limit_type} type, {limit_obj.threshold * 100:.0f}% threshold)")
        return limit_obj.limit_id
            
      else:
        # Limit type and threshold match, we only need to update the max value if it changed
        if existing_limit.max != max_value:
          print(f"Found limit '{name}' with matching immutable parameters but different max value.")
          update_response = payi.limits.update(
            limit_id=existing_id,
            max=max_value
          )
          updated_limit = update_response.limit
          print(f"  Updated max value: ${existing_limit.max:.2f} → ${updated_limit.max:.2f}")
          return updated_limit.limit_id
        else:
          print(f"Limit '{name}' already exists with identical parameters. Using existing limit.")
          return existing_id

    else:
      # Limit doesn't exist, create a new one
      limit_response = payi.limits.create(
        limit_name=name,
        limit_id=id,
        max=max_value,
        limit_type=limit_type,
        threshold=threshold_value
      )
      limit_obj = limit_response.limit
      print(f"Created new limit: '{limit_obj.limit_name}' (${limit_obj.max:.2f}, {limit_obj.limit_type} type, {limit_obj.threshold * 100:.0f}% threshold)")
      return limit_obj.limit_id

  # Create daily usage safety cap
  daily_limit_id = ensure_limit_exists(
    name="Daily Safety Cap",
    id="daily",
    max_value=50.0,        # $50 per day
    limit_type="allow",    # Track without blocking requests
    threshold_value=0.90   # Notify at 90% of max
  )
  stored_limit_ids["daily"] = daily_limit_id

  # Create monthly budget tracker
  monthly_limit_id = ensure_limit_exists(
    name="Monthly Budget",
    id="monthly",
    max_value=1000.0,      # $1,000 per month
    limit_type="allow",    # Track without blocking requests
    threshold_value=0.80   # Notify at 80% of max
  )
  stored_limit_ids["monthly"] = monthly_limit_id

  # Create premium model usage tracker
  premium_limit_id = ensure_limit_exists(
    name="Premium Model Usage",
    id="premium",
    max_value=300.0,       # $300 for premium models
    limit_type="allow",    # Track without blocking requests
    threshold_value=0.83   # Notify at 83% of max
  )
  stored_limit_ids["premium"] = premium_limit_id

  return stored_limit_ids

# Step 2: Create a function that uses these limits in requests
def process_ai_request(prompt, model, user_id, limit_ids):
  """
  Process an AI request using the specified model, applying all budget limits.
  Returns both the AI response and budget status information.
  """
  # Use typical token counts for this request
  # In a real app, you might estimate based on the model and prompt
  import random
  input_tokens = random.randint(50, 200)  # Simulate input size
  output_tokens = random.randint(100, 150)  # Simulate output size
    
  # Apply the daily and monthly limits to every request.
  # Note: this example creates both with limit_type="allow", so they track
  # usage and notify rather than block. Use "block" for a hard safety cap.
  applied_limits = [limit_ids["daily"], limit_ids["monthly"]]
    
  # For premium models, also track against the premium model budget
  # Define premium models with their correct categories
  premium_model_categories = {
    "claude-opus-4-8": "system.anthropic",
    "gpt-4o": "system.openai",
    "gemini-2.5-pro": "system.google.vertex"
  }
    
  # Check if the model is in our premium list
  if model in premium_model_categories:
    applied_limits.append(limit_ids["premium"])
    category = premium_model_categories[model]
  else:
    # Default to OpenAI for this example
    category = "system.openai"
    
  print(f"Processing request with {len(applied_limits)} active limits for {category}/{model}...")
    
  # Make the request with limits applied
  response = payi.ingest.units(
    category=category,
    resource=model,
    units={"text": {"input": input_tokens, "output": output_tokens}},
    limit_ids=applied_limits,  # Apply all relevant limits
    user_id=user_id
  )
    
  # Check limit status and handle accordingly
  # We'll collect status information for all limits in this list
  # to return a comprehensive budget status to the calling code
  budget_status = []
  for limit_id in response.xproxy_result.limits:
    # Retrieve the limit using its ID
    limit_response = payi.limits.retrieve(limit_id=limit_id)
        
    # Access the limit object from the response
    limit = limit_response.limit
        
    # Get current usage value from the totals.cost.total.base path
    current_usage = limit.totals.cost.total.base
    max_limit = limit.max
        
    # Check if limit was exceeded
    usage_ratio = current_usage / max_limit
    if usage_ratio >= 1.0:
      budget_status.append({
        "limit_name": limit.limit_name,
        "status": "EXCEEDED",
        "current": current_usage,
        "max": max_limit,
        "percent": usage_ratio * 100  # Include percentage even for exceeded limits
      })
    else:
      usage_pct = usage_ratio * 100
      status = "WARNING" if usage_pct > 75 else "OK"
      budget_status.append({
        "limit_name": limit.limit_name,
        "status": status,
        "current": current_usage,
        "max": max_limit,
        "percent": usage_pct
      })
    
  return {
    "success": not any(item["status"] == "EXCEEDED" for item in budget_status),
    "budget_status": budget_status,
    "response": response
  }

# Step 3: Example usage in a real application
def main():
  # Step 1: Set up budget control system once and store the IDs
  # In a production app, you would do this setup once and store IDs in a database
  limit_ids = setup_budget_control_system()
    
  # Process requests with budget enforcement
  user_request = "Write a comprehensive analysis of quantum computing applications in finance"
    
  result = process_ai_request(
    prompt=user_request,
    model="claude-opus-4-8",  # Premium model
    user_id="user_abc123",
    limit_ids=limit_ids
  )
    
  # Handle the result
  if result["success"]:
    print("Request processed successfully")
    # Process the model's response
  else:
    print("Request blocked due to budget constraints")
    print("Budget status:")
    for limit in result["budget_status"]:
      if limit["status"] == "EXCEEDED":
        print(f"  ❌ {limit['limit_name']}: ${limit['current']:.5f}/${limit['max']:.2f} ({limit['percent']:.1f}%) - LIMIT EXCEEDED")

if __name__ == "__main__":
    main()
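
The status classification inside process_ai_request can also be factored into a pure helper, which makes the thresholds easy to unit-test without calling Pay-i. This is a sketch rather than part of the SDK: classify_usage and its warn_pct parameter are hypothetical names, and the 75% warning level simply mirrors the value used in the example above.

```python
def classify_usage(current, max_limit, warn_pct=75.0):
    """Classify spend against a limit as OK, WARNING, or EXCEEDED.

    Mirrors the checks in process_ai_request: a usage ratio at or above 1.0
    is EXCEEDED, a percentage above `warn_pct` is WARNING, anything else is OK.
    """
    if max_limit <= 0:
        raise ValueError("max_limit must be positive")
    ratio = current / max_limit
    percent = ratio * 100
    if ratio >= 1.0:
        status = "EXCEEDED"
    elif percent > warn_pct:
        status = "WARNING"
    else:
        status = "OK"
    return {"status": status, "current": current, "max": max_limit, "percent": percent}
```

Inside the loop over response.xproxy_result.limits, the if/else block could then be replaced by a single call such as budget_status.append({"limit_name": limit.limit_name, **classify_usage(current_usage, max_limit)}).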

Best Practices

When working with limits in the Pay-i Python SDK, consider these best practices:

  1. Defense in Depth: Implement multiple layers of limits (daily, monthly, per-model) to prevent unexpected cost spikes.

  2. Set Appropriate Thresholds: Configure notification thresholds (75-90% of the limit) to receive warnings before limits are reached.

  3. Track with High Precision: Use 5 decimal places for usage costs (e.g., ${usage:.5f}) and 2 decimal places for budget limits (e.g., ${limit:.2f}). This precision is crucial for GenAI micropayments, where even $0.00001 differences matter when accumulated over millions of requests.

  4. Use Properties for Organization: Apply consistent properties to limits to make them easier to find, filter, and manage.

  5. Reset on Appropriate Cycles: Establish a clear reset schedule that aligns with your billing or budget cycles.

  6. Monitor Regularly: Check limit status periodically to track usage patterns and adjust limits proactively.

  7. Store Dynamically Created Limit IDs: Always store limit IDs that Pay-i creates on behalf of your application. When possible, prefer the simpler approach of using deterministic, known limit IDs.

  8. Graceful Degradation: When limits are reached, your agent should adjust accordingly (for example, switch to a cheaper model or exit the use case gracefully).
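
As one illustration of graceful degradation, the sketch below picks a model from the budget status list produced by process_ai_request in the example above. The function name choose_model, the policy itself (fall back on WARNING, stop on EXCEEDED), and the fallback model name in the usage note are assumptions made for illustration, not Pay-i behavior.

```python
def choose_model(preferred, fallback, budget_status):
    """Choose which model to call based on the current budget status.

    `budget_status` is a list of dicts with a "status" key set to
    "OK", "WARNING", or "EXCEEDED", as built by process_ai_request.

    Returns:
      - None if any limit is EXCEEDED (the caller should exit the use case)
      - `fallback` if any limit is in WARNING
      - `preferred` otherwise
    """
    statuses = {item["status"] for item in budget_status}
    if "EXCEEDED" in statuses:
        return None
    if "WARNING" in statuses:
        return fallback
    return preferred
```

A caller might use model = choose_model("gpt-4o", "gpt-4o-mini", result["budget_status"]) and skip the request entirely when the result is None. Returning None instead of raising keeps the decision about how to exit in the calling code.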