Managing Limits
Limits are used to control your GenAI traffic and costs.
Overview
Limits can be created either via the API or in the Pay-i UI.
After you create a limit, the API or Pay-i UI provides a limit-id, which you use when making requests. Use the optional header xProxy-Limit-IDs to specify any number of limit-ids as a comma-separated list. When you make a request with one or more limit-ids, those limits are evaluated to see if their configuration allows the request to proceed.
If a limit blocks the request, then a call to the Provider is not made (and thus no cost is incurred), and the resulting xproxy_result object will indicate which limit prevented the call from proceeding. If none of the limits block the request, the request becomes a normal API call to the Provider.
After a request is completed, the total cost of that request is added to each limit’s totals. The resulting xproxy_result object will also detail the state of each limit, and which, if any, are over their thresholds and/or maximums.
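As a minimal sketch of the header format described above, the helper below joins limit-ids into the comma-separated value for xProxy-Limit-IDs. The function name `limit_headers` and the sample limit-ids are hypothetical, not part of the Pay-i API; only the header name comes from this page.

```python
def limit_headers(limit_ids):
    """Build the optional limits header from an iterable of limit-id strings.

    Hypothetical helper: only the header name is taken from the docs.
    """
    # Drop empty entries and surrounding whitespace before joining.
    ids = [lid.strip() for lid in limit_ids if lid and lid.strip()]
    if not ids:
        # The header is optional, so send nothing when no limits apply.
        return {}
    return {"xProxy-Limit-IDs": ",".join(ids)}


# Hypothetical limit-ids, as returned when the limits were created.
headers = limit_headers(["limit-abc", "limit-def"])
print(headers)
```

The resulting dictionary can be merged into whatever headers your HTTP client already sends with the request.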
Limit Creation
Limits are created with the following characteristics:
- name: A friendly name used to visualize the limit in the Pay-i UI.
- max: The maximum spend in USD.
  - Any spend in excess of the max is tracked and reported in the Pay-i dashboard and in the limit API as Limit Overrun.
  - To see how it is tracked in the API, see Understanding Limit Details.
- threshold
  - The threshold is expressed as a fraction of the limit's max, between 0.75 and 0.99 (75% to 99%).
  - The Risk Threshold is the threshold multiplied by the max.
    - For example, with a $10.00 max and a threshold of 0.8: Risk Threshold = $10.00 * 0.8 = $8.00.
  - If a threshold is not specified it defaults to 1.0, meaning the Risk Threshold is simply equal to the max.
  - To see how Risk Threshold is used, see Limit States.
- limit_type
  - The limit_type defines what happens when spend exceeds the limit's max.
  - There are two limit types: Allow and Block.
    - Allow limits will allow all requests to proceed, even if the max has been reached or exceeded.
    - Block limits will prevent requests from proceeding if the max has been reached or exceeded.
      - The request which causes the limit to reach or exceed its maximum may carry Limit Overrun costs.
      - All requests made after the limit has reached or exceeded the maximum will be blocked.
- Tags
  - See Limit Tags.
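The Risk Threshold arithmetic above can be sketched in a few lines. The `Limit` class here is a hypothetical local model for illustration, not the Pay-i SDK's type; it only mirrors the characteristics listed in this section and uses `Decimal` to avoid floating-point drift in currency values.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Limit:
    """Hypothetical local model of a limit (not a Pay-i SDK class)."""
    name: str
    max: Decimal                          # maximum spend in USD
    limit_type: str                       # "Allow" or "Block"
    threshold: Decimal = Decimal("1.0")   # defaults to 1.0 when unspecified

    @property
    def risk_threshold(self) -> Decimal:
        # Risk Threshold = threshold multiplied by max
        return self.max * self.threshold


limit = Limit(name="demo", max=Decimal("10.00"), limit_type="Block",
              threshold=Decimal("0.8"))
print(limit.risk_threshold)  # numerically equal to $8.00
```

With the default threshold of 1.0, `risk_threshold` is simply equal to `max`, matching the description above.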
Limit States
The Pay-i API returns a state object in the xproxy_result for each limit used in a request. Your code can use these states to choose an appropriate action that reduces the risk of spend exceeding the limit's max, such as switching to a cheaper end-to-end solution, stopping further requests, or up-selling your user.
This section describes the possible values for the state object and when they occur.
Important: A common misconception is that when a limit state is "exceeded" the requests are blocked. This is not correct. The "exceeded" state simply means spending has reached or passed the threshold but is still below or equal to the max value. Requests are still allowed in the "exceeded" state. Only the "blocked" state (for Block limits) actually prevents requests. The mathematical definitions for each state are provided below.
| State | Description |
|---|---|
| ok | The amount spent is still below the limit's Risk Threshold. Mathematically: spend < max*threshold |
| exceeded | The amount spent has reached or exceeded the limit's Risk Threshold but is less than or equal to the limit's max. Mathematically: max*threshold ≤ spend ≤ max |
| overrun | The amount of spend has exceeded the limit's max. Mathematically: spend > max. For an Allow limit, later requests still proceed and remain in this state; for a Block limit, only the request that pushes spend past the max reports it. |
| blocked | This limit has blocked the request. This only occurs when a Block limit has reached or exceeded its max. |
| blocked_external | |

All reported limit states are stored and can be reviewed with the Get Limit Details API. Refer to Understanding Limit Details.
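One way to act on these states is to pick the most conservative action across every limit attached to a request. The sketch below assumes the states have already been extracted from the xproxy_result into a plain dictionary of limit-id to state string; that shape, the severity ordering, and the action names are all assumptions made for illustration, not the documented response format.

```python
# Assumed severity ordering; unknown states are treated as most severe.
SEVERITY = {
    "ok": 0,
    "exceeded": 1,
    "overrun": 2,
    "blocked_external": 3,
    "blocked": 3,
}

# Hypothetical action names mirroring the options mentioned above.
ACTIONS = {
    0: "proceed",
    1: "switch_to_cheaper_solution",
    2: "upsell_user",
    3: "stop_requests",
}


def choose_action(limit_states):
    """Return the action for the worst state among all limits.

    limit_states: dict mapping limit-id -> state string (assumed shape).
    """
    if not limit_states:
        return "proceed"
    worst = max(SEVERITY.get(state, 3) for state in limit_states.values())
    return ACTIONS[worst]


print(choose_action({"limit-abc": "ok", "limit-def": "exceeded"}))
```

Which action fits each state is a product decision; the mapping here is only a starting point.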
Limit State Diagram
* For Block limits, the request that causes the limit to exceed its maximum will have the overrun state. All further requests will have the blocked state.
Limit State Examples
Allow
- max = $10.00
- threshold = 0.8
  - Risk Threshold = $8.00
| # | Spend (Before) | Request Cost | Spend (After) | State | Overrun | Explanation |
|---|---|---|---|---|---|---|
| 1. | $7.80 | $0.19 | $7.99 | ok | $0.00 | $7.99 < $8.00 (Risk Threshold), so state is "ok" |
| 2. | $7.99 | $2.00 | $9.99 | exceeded | $0.00 | $9.99 > $8.00 (Risk Threshold) but $9.99 ≤ $10.00 (max), so state is "exceeded" |
| 3. | $9.99 | $0.30 | $10.29 | overrun | $0.29 | $10.29 > $10.00 (max), so state is "overrun" |
| 4. | $10.29 | $0.50 | $10.79 | overrun | $0.79 | $10.79 > $10.00 (max), so state remains "overrun" |
Block
- max = $10.00
- threshold = 0.8
  - Risk Threshold = $8.00
| # | Spend (Before) | Request Cost | Spend (After) | State | Overrun | Explanation |
|---|---|---|---|---|---|---|
| 1. | $7.80 | $0.19 | $7.99 | ok | $0.00 | $7.99 < $8.00 (Risk Threshold), so state is "ok" |
| 2. | $7.99 | $2.00 | $9.99 | exceeded | $0.00 | $9.99 > $8.00 (Risk Threshold) but $9.99 ≤ $10.00 (max), so state is "exceeded" (requests still allowed) |
| 3. | $9.99 | $0.30 | $10.29 | overrun | $0.29 | $10.29 > $10.00 (max), so state is "overrun" for this final request that exceeds max |
| 4. | $10.29 | N/A | $10.29 | blocked | $0.29 | Spend is already > max, so state is "blocked" and request is denied |
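The two example tables above can be replayed with a small local simulation. This is a sketch of the state rules as described on this page, not Pay-i's implementation; the function names are hypothetical, and it assumes a Block limit denies a request once prior spend has reached or exceeded the max.

```python
from decimal import Decimal

ZERO = Decimal("0")


def apply_request(spend, cost, max_spend, threshold, limit_type):
    """Return (new_spend, state, overrun) for one request against one limit."""
    # Assumption: a Block limit denies the request once spend >= max.
    if limit_type == "Block" and spend >= max_spend:
        return spend, "blocked", max(spend - max_spend, ZERO)

    new_spend = spend + cost
    risk_threshold = max_spend * threshold
    if new_spend > max_spend:
        state = "overrun"
    elif new_spend >= risk_threshold:
        state = "exceeded"
    else:
        state = "ok"
    return new_spend, state, max(new_spend - max_spend, ZERO)


def replay(costs, limit_type, start=Decimal("7.80")):
    """Apply each cost in order with max = $10.00 and threshold = 0.8."""
    spend, rows = start, []
    for cost in costs:
        spend, state, overrun = apply_request(
            spend, cost, Decimal("10.00"), Decimal("0.8"), limit_type
        )
        rows.append((spend, state, overrun))
    return rows


costs = [Decimal("0.19"), Decimal("2.00"), Decimal("0.30"), Decimal("0.50")]
for limit_type in ("Allow", "Block"):
    print(limit_type, [(str(s), st, str(o)) for s, st, o in replay(costs, limit_type)])
```

For the Block limit, the fourth request's cost is never added to the spend because the request is denied before it reaches the Provider.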
Related APIs
