Limits
Limits are used to control your GenAI traffic and costs.
What are Limits?
Limits are used to track and optionally control spending. There are two types of limits, Allow and Block.
- A limit tracks your spending against a defined max amount.
- Each limit can optionally have a threshold, which helps you monitor when spending is approaching the max.
- Allow limits let spending go beyond the max, tracking any excess as overrun.
- Block limits halt spending once the max is exceeded, preventing further usage.
Note: Block can only be applied to requests where Pay-i is a proxy.
Simply, a limit is an accumulator with state.
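To make the "accumulator with state" idea concrete, here is a minimal in-memory sketch. It is not the Pay-i SDK: the `Limit` class, its field names, and the `record` method are hypothetical and exist only for illustration. It also assumes that a Block limit rejects new spend once the accumulated total has reached the max; the exact boundary behavior in Pay-i may differ.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Limit:
    """Hypothetical in-memory model of a limit; not the Pay-i SDK."""

    max_amount: Decimal
    limit_type: str = "allow"            # "allow" or "block"
    threshold: Optional[Decimal] = None  # optional early-warning level
    total: Decimal = Decimal("0")        # accumulated spend (the state)

    @property
    def overrun(self) -> Decimal:
        # Spend beyond max_amount; zero while the total is within it.
        return max(self.total - self.max_amount, Decimal("0"))

    @property
    def threshold_reached(self) -> bool:
        return self.threshold is not None and self.total >= self.threshold

    def record(self, cost: Decimal) -> bool:
        """Add cost to the total; return False if a block limit rejects it."""
        # Assumption: a block limit stops accepting spend once total >= max_amount.
        if self.limit_type == "block" and self.total >= self.max_amount:
            return False
        self.total += cost
        return True


# An allow limit keeps accumulating and reports the excess as overrun.
chat = Limit(max_amount=Decimal("1.00"), threshold=Decimal("0.80"))
chat.record(Decimal("0.90"))
chat.record(Decimal("0.30"))

# A block limit refuses further spend after the max has been reached.
hard = Limit(max_amount=Decimal("1.00"), limit_type="block")
first_accepted = hard.record(Decimal("1.00"))
second_accepted = hard.record(Decimal("0.30"))
```

In this sketch the threshold does not stop anything by itself; it only gives the caller a signal to act on before the max is reached.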
Limits can be applied at any scope in your application: application total spend, cumulative for a use case, per use case instance, request, user, account, etc. Limits are composable and enable you to implement your business requirements in a simple and straightforward manner.
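One way to picture composition is a single request cost being charged against several limits at once, one per scope. The sketch below is illustrative only: the scope names, the `limits` dictionary, and the `charge` function are hypothetical and are not Pay-i identifiers or API calls.

```python
from decimal import Decimal

# Hypothetical scope names, each with its own accumulated spend and max.
limits = {
    "application": {"spent": Decimal("0"), "max": Decimal("500.00")},
    "user:alice": {"spent": Decimal("0"), "max": Decimal("5.00")},
    "use_case_instance:chat-42": {"spent": Decimal("0"), "max": Decimal("1.00")},
}


def charge(cost, scopes):
    """Add cost to every named limit and return the scopes now over their max."""
    exceeded = []
    for scope in scopes:
        entry = limits[scope]
        entry["spent"] += cost
        if entry["spent"] > entry["max"]:
            exceeded.append(scope)
    return exceeded


# One request counts against the application, the user, and the chat instance.
over = charge(
    Decimal("1.25"),
    ["application", "user:alice", "use_case_instance:chat-42"],
)
```

Here only the per-instance limit is exceeded, so the application could react at that scope while the user and application totals continue to accumulate normally.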
Limits can be reset, which sets their spend amount back to $0.00 but maintains all prior spending history for review in the Pay-i dashboard.
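The reset behavior can be sketched as follows, assuming a simple model in which each reset archives the current total before zeroing it. The `ResettableLimit` class and its `history` list are hypothetical; how Pay-i stores prior spending history is not described here.

```python
from decimal import Decimal


class ResettableLimit:
    """Hypothetical model of a resettable limit; not the Pay-i SDK."""

    def __init__(self, max_amount):
        self.max_amount = max_amount
        self.total = Decimal("0.00")
        self.history = []  # totals from earlier periods, kept for review

    def record(self, cost):
        self.total += cost

    def reset(self):
        # Archive the current total, then start the next period from zero.
        self.history.append(self.total)
        self.total = Decimal("0.00")


quarterly = ResettableLimit(Decimal("1000.00"))
quarterly.record(Decimal("640.25"))  # spend in the first quarter
quarterly.reset()                    # a new quarter begins
quarterly.record(Decimal("12.50"))   # spend in the second quarter
```

Keeping earlier totals alongside the current one is what makes period-over-period comparisons, such as quarterly spend, possible.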
Summary
Limits can be applied to any application scope you require. You can use limits to track and compare time-based spend (for example, quarterly spend), to restrict a user's usage when they near their budget, or to decide when to change the logic executing a use case instance (for example, switching to a human representative when the chat cost exceeds $1).
More information can be found in the limits development documentation.