Limits (Continued)
Limits are used to control your GenAI traffic and costs.
Overview
Limits can be created either via the API or in the Pay-i UI.
After creating a limit, the API or Pay-i UI will provide a limit-id, which is used when making requests. You use the optional header xProxy-Limit-IDs to specify any number of limit-ids as a comma-separated list. When you make a request with one or more limit-ids, those limits are evaluated to see if their configuration allows the request to proceed.
If a limit blocks the request, then a call to the Provider is not made (and thus no cost is incurred), and the resulting xproxy_result object will indicate which limit prevented the call from proceeding. If none of the limits block the request, the request becomes a normal API call to the Provider.
After a request is completed, the total cost of that request is added to each limit’s totals. The resulting xproxy_result object will also detail the state of each limit, and which, if any, are over their thresholds and/or maximums.
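As a minimal sketch of the header described above, the helper below joins limit-ids into the comma-separated value. The function name limit_headers and the sample limit-id values are hypothetical; only the header name comes from this page, and how the header is attached to a request depends on your HTTP client or SDK.

```python
def limit_headers(limit_ids):
    """Build the optional header that attaches one or more limits to a request.

    `limit_ids` is any iterable of limit-id strings (hypothetical values here).
    Returns an empty dict when no limits are given, since the header is optional.
    """
    ids = [str(limit_id).strip() for limit_id in limit_ids]
    ids = [limit_id for limit_id in ids if limit_id]  # drop empty entries
    if not ids:
        return {}
    return {"xProxy-Limit-IDs": ",".join(ids)}


# Hypothetical limit-ids, merged into whatever headers the request already has.
headers = {"Content-Type": "application/json"}
headers.update(limit_headers(["limit-id-1", "limit-id-2"]))
```

Returning an empty dict for an empty list keeps the call site uniform: the same update works whether or not any limits apply to the request.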
Limit Creation
Limits are created with the following characteristics:
- name
  - A friendly name used to visualize the limit in the Pay-i UI.
- max
  - The maximum spend in USD.
  - Any spend in excess of the max is tracked and reported in the Pay-i dashboard and in the limit API as Limit Overrun.
    - To see how it is tracked in the API, see Understanding Limit Details.
- threshold
  - The threshold is expressed as a fraction of the limit's maximum, between 0.75 and 0.99 (75% to 99%).
  - The Risk Threshold is the threshold multiplied by the max.
    - For example, with a $10.00 max and a threshold of 0.8:
      - Risk Threshold = $10.00 * 0.8 = $8.00.
  - If a threshold is not specified it defaults to 1.0, meaning the Risk Threshold is simply equal to the max.
  - To see how Risk Threshold is used, see Limit States.
- limit_type
  - The limit_type defines what happens when spend exceeds the limit's max.
  - There are two limit types: Allow and Block.
    - Allow limits will allow all requests to proceed, even if the max has been reached or exceeded.
    - Block limits will prevent requests from proceeding if the max has been reached or exceeded.
      - The request which causes the limit to reach or exceed its maximum may carry Limit Overrun costs.
      - All requests made after the limit has reached or exceeded the maximum will be blocked.
- Tags
  - See Limit Tags.
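The Risk Threshold arithmetic above can be sketched in a few lines. The function name risk_threshold is hypothetical, and Decimal is used here only to avoid floating-point rounding on currency values; this is not a statement about how Pay-i computes it internally.

```python
from decimal import Decimal


def risk_threshold(max_spend, threshold="1.0"):
    """Return the Risk Threshold in USD: threshold multiplied by max.

    `threshold` defaults to 1.0, in which case the Risk Threshold equals max.
    Values are passed through str() so floats such as 0.8 convert exactly.
    """
    return Decimal(str(max_spend)) * Decimal(str(threshold))


# The worked example from the list above: $10.00 max, threshold of 0.8.
example = risk_threshold("10.00", "0.8")
```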
Limit States
The Pay-i API will return a state object for each limit used in a request in the xproxy_result. These states can be used by your code to choose the appropriate actions to de-risk spend from exceeding the limit's max, such as by switching to a cheaper end-to-end solution, stopping further requests, or up-selling your user.
This section describes the possible values for the state object and when they occur.
State | Description |
---|---|
ok | The amount spent is still below the limit's Risk Threshold. |
exceeded | The amount spent has exceeded the limit's Risk Threshold but is less than the limit's max. |
overrun | The amount of spend has exceeded the limit's max. For an Allow limit, all further requests will result in the overrun limit state. For a Block limit, only the request which causes the spend to exceed the limit's max will have the overrun limit state. All further requests will result in the blocked limit state. |
blocked | This limit has blocked the request. |
blocked_external | The request was blocked by another limit that was part of the same request. This state can only appear when multiple limits are used for a single request, and at least one of them is of type Block. |

The following example illustrates when blocked_external is used.
- Limit A, type: Allow
- Limit B, type: Block

A request is made declaring both limits, A and B. Limit A allows the request. Limit B blocks the request. The state of the request for Limit B is blocked. The state of the request for Limit A is blocked_external, indicating the request was blocked by another limit that was part of the same request.

All reported limit states are stored and can be reviewed with the Get Limit Details API. See the Understanding Limit Details section for more information.
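The blocked_external example can be sketched as a small function. This is an illustrative model of the behavior described on this page, not the Pay-i implementation: the function name blocked_states, the tuple layout, and the spend figures are hypothetical, and a Block limit is assumed to block once its spend has reached or exceeded its max.

```python
def blocked_states(limits):
    """Model which limits report blocked vs. blocked_external for one request.

    `limits` maps a limit-id to (limit_type, spend, max_spend).
    Returns {} when no limit blocks, meaning the request would proceed.
    """
    # Assumption: only Block limits at or over their max stop a request.
    blockers = {
        limit_id
        for limit_id, (limit_type, spend, max_spend) in limits.items()
        if limit_type == "Block" and spend >= max_spend
    }
    if not blockers:
        return {}
    # The blocking limits report blocked; every other declared limit
    # reports blocked_external.
    return {
        limit_id: "blocked" if limit_id in blockers else "blocked_external"
        for limit_id in limits
    }


# Limit A (Allow) and Limit B (Block), with B already over its max.
states = blocked_states({
    "A": ("Allow", 4.00, 10.00),
    "B": ("Block", 10.29, 10.00),
})
```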
Limit State Diagram
* For Block limits, the request which causes the limit to exceed its maximum will have the overrun state. All further requests will have the blocked state.
Limit State Examples
Allow
- max = $10.00
- threshold = 0.8
  - Risk Threshold = $8.00
# | Spend (Before) | Request Cost | Spend (After) | State | Overrun |
---|---|---|---|---|---|
1. | $7.80 | $0.19 | $7.99 | ok | $0.00 |
2. | $7.99 | $2.00 | $9.99 | exceeded | $0.00 |
3. | $9.99 | $0.30 | $10.29 | overrun | $0.29 |
4. | $10.29 | $0.50 | $10.79 | overrun | $0.79 |
Block
- max = $10.00
- threshold = 0.8
  - Risk Threshold = $8.00
# | Spend (Before) | Request Cost | Spend (After) | State | Overrun |
---|---|---|---|---|---|
1. | $7.80 | $0.19 | $7.99 | ok | $0.00 |
2. | $7.99 | $2.00 | $9.99 | exceeded | $0.00 |
3. | $9.99 | $0.30 | $10.29 | overrun | $0.29 |
4. | $10.29 | N/A | $10.29 | blocked | $0.29 |
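The two tables above can be replayed with a short simulation. This is a sketch of the state rules as this page describes them, not Pay-i's implementation; the function name simulate is hypothetical. The page does not say which state applies when spend lands exactly on the Risk Threshold or exactly on the max, so the code assumes that spend equal to the Risk Threshold counts as exceeded and that only spend strictly above the max counts as overrun.

```python
from decimal import Decimal


def simulate(limit_type, max_spend, threshold, start_spend, costs):
    """Replay request costs against one limit.

    Returns one (state, spend_after, overrun) tuple per request.
    Monetary arguments are strings so Decimal parses them exactly.
    """
    max_spend = Decimal(max_spend)
    risk = max_spend * Decimal(threshold)  # Risk Threshold
    spend = Decimal(start_spend)
    rows = []
    for cost in costs:
        if limit_type == "Block" and spend >= max_spend:
            # Blocked: no Provider call is made, so no cost is added.
            state = "blocked"
        else:
            spend += Decimal(cost)
            if spend > max_spend:
                state = "overrun"
            elif spend >= risk:  # boundary handling is an assumption
                state = "exceeded"
            else:
                state = "ok"
        overrun = max(spend - max_spend, Decimal("0"))
        rows.append((state, spend, overrun))
    return rows


costs = ["0.19", "2.00", "0.30", "0.50"]
allow_rows = simulate("Allow", "10.00", "0.8", "7.80", costs)
block_rows = simulate("Block", "10.00", "0.8", "7.80", costs)
```

With these inputs, the fourth request is the only one where the two limit types differ: the Allow limit keeps accumulating spend, while the Block limit leaves spend unchanged.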