Rate Limits
Control request rates, token budgets, key expiry, and response timeouts.
Overview
Rate limits cap your LLM spend and protect availability. All limits are enforced at the gateway, before requests reach the provider.
Key Features
- Per-key rate limits - Requests per minute, hour, or day
- Token limits - Input and output token budgets per request or over time
- API key expiration - Configurable epoch time for automatic key expiry
- Response timeout - Maximum wait time to prevent hung requests from consuming resources
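The sketch below illustrates how a gateway might evaluate these limits before forwarding a request. It is not the gateway's actual implementation or configuration schema: `KeyPolicy`, `Gateway`, `check`, the field names, and the rejection strings are all hypothetical, and a fixed one-minute window is assumed for the per-key request limit. Response timeouts are omitted because they apply after a request has been forwarded.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class KeyPolicy:
    """Hypothetical per-key limits; field names are illustrative only."""
    requests_per_minute: int
    max_input_tokens: int
    expires_at: Optional[float] = None  # epoch seconds; None means no expiry


@dataclass
class KeyState:
    """Fixed-window counter for one API key."""
    window_start: float = 0.0
    count: int = 0


class Gateway:
    """Checks a request against its key's policy before it reaches the provider."""

    def __init__(self) -> None:
        self._state: Dict[str, KeyState] = {}

    def check(self, key: str, policy: KeyPolicy, input_tokens: int, now: float) -> str:
        """Return "ok", or the reason the request is rejected."""
        # Key expiry: reject once the configured epoch time has passed.
        if policy.expires_at is not None and now >= policy.expires_at:
            return "key_expired"
        # Per-request token budget.
        if input_tokens > policy.max_input_tokens:
            return "token_limit_exceeded"
        # Per-key request rate, counted in fixed 60-second windows.
        state = self._state.setdefault(key, KeyState(window_start=now))
        if now - state.window_start >= 60.0:
            state.window_start = now
            state.count = 0
        if state.count >= policy.requests_per_minute:
            return "rate_limited"
        state.count += 1
        return "ok"
```

In this sketch the caller passes the current time as `now` (for example `time.time()`), which keeps the logic deterministic and easy to test. A fixed window is the simplest option; a sliding window or token bucket would smooth out bursts at window boundaries.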