Limits

Limits are used to control your GenAI traffic and costs.

What are Limits?

Limits are used to track and optionally control spending. There are two types of limits: Allow and Block.

  • A limit tracks your spending against a defined max amount.
  • Each limit can optionally have a threshold, which helps you monitor when spending is approaching the max.
  • Allow limits let spending go beyond the max, tracking any excess as overrun.
  • Block limits halt spending once the max is exceeded, preventing further usage.

    Note
    Block limits can only be applied to requests where Pay-i acts as a proxy.

Put simply, a limit is an accumulator with state.
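The "accumulator with state" idea can be illustrated with a minimal in-memory sketch. This is not the Pay-i SDK or its API: the `Limit` class, its field names, the `record` method, and the state labels are all hypothetical and exist only to show how a max, an optional threshold, and the Allow/Block distinction might fit together. It also assumes a Block limit starts refusing requests only after spend has gone past the max.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Limit:
    """Hypothetical illustration of a limit; not the Pay-i implementation."""
    max_amount: Decimal
    limit_type: str = "allow"             # assumed labels: "allow" or "block"
    threshold: Optional[Decimal] = None   # optional early-warning level
    spent: Decimal = Decimal("0.00")      # the accumulator

    @property
    def state(self) -> str:
        # State is derived from the accumulated spend.
        if self.spent > self.max_amount:
            return "exceeded"
        if self.threshold is not None and self.spent >= self.threshold:
            return "threshold"
        return "ok"

    @property
    def overrun(self) -> Decimal:
        # Spend beyond the max, or zero if the max has not been passed.
        return max(self.spent - self.max_amount, Decimal("0.00"))

    def record(self, cost: Decimal) -> bool:
        """Add a cost. Returns False when a block limit refuses it."""
        if self.limit_type == "block" and self.spent > self.max_amount:
            return False  # assumption: block once the max is already exceeded
        self.spent += cost
        return True


# An allow limit keeps accumulating and reports the excess as overrun.
allow = Limit(max_amount=Decimal("10.00"), threshold=Decimal("8.00"))
allow.record(Decimal("9.00"))
state_after_first = allow.state
allow.record(Decimal("3.00"))

# A block limit refuses further spend once the max has been exceeded.
block = Limit(max_amount=Decimal("10.00"), limit_type="block")
first_accepted = block.record(Decimal("11.00"))
second_accepted = block.record(Decimal("1.00"))
```

In this sketch the allow limit ends with spend above its max and a non-zero overrun, while the block limit accepts the request that crosses the max and rejects the next one.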

Limits can be applied at any scope in your application: total application spend, cumulative spend for a use case, per use case instance, per request, per user, per account, and so on. Limits are composable, so you can implement your business requirements in a simple and straightforward manner.

Limits can be reset, which sets their spend amount back to $0.00 but maintains all prior spending history for review in the Pay-i dashboard.
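As a rough illustration of reset behavior, the sketch below zeroes the running total while keeping earlier totals available for review. The `ResettableLimit` class and its `history` list are hypothetical stand-ins; in Pay-i the prior spending history is kept by the service and viewed in the dashboard, not in application code.

```python
from decimal import Decimal


class ResettableLimit:
    """Hypothetical limit that archives its total on reset."""

    def __init__(self, max_amount: Decimal) -> None:
        self.max_amount = max_amount
        self.spent = Decimal("0.00")
        self.history: list[Decimal] = []  # totals from earlier periods

    def record(self, cost: Decimal) -> None:
        self.spent += cost

    def reset(self) -> None:
        # Keep the old total for later review, then start again from zero.
        self.history.append(self.spent)
        self.spent = Decimal("0.00")


quarterly = ResettableLimit(Decimal("500.00"))
quarterly.record(Decimal("120.50"))
quarterly.reset()                     # spend returns to 0.00, history is kept
quarterly.record(Decimal("30.00"))
```

After these calls, the current spend reflects only the new period, and the previous period's total remains in `history` for comparison.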

Summary

Limits can be applied to any application scope you require. You can use limits to track and compare time-based spend (for example, quarterly spend), restrict a user's usage when they near their budget, or decide when to change the logic executing a use case instance (for example, switching to a human representative when the chat cost exceeds $1).

More information can be found in the limits development documentation.