Token Saving
Reduce token usage by compressing input content automatically.
How It Works
Request Arrives
{"name": "John", "age": 30}
14 input tokens
QuilrAI Compresses
name:John|age:30
8 input tokens
Sent to LLM
43% tokens saved
Same response quality ✓
QuilrAI
- Request Arrives - Your app sends a normal API call
- Gateway Compresses - Content is transformed to use fewer tokens
- Forwarded to LLM - Optimized content sent - same accuracy, lower cost
Compression Methods
Smart JSON Compression - Up to 20% savings
Converts JSON objects in LLM inputs to TOON format - ideal for tool call responses and structured data.
HTML to Text
Strips HTML tags and extracts clean text - removes markup overhead from scraped pages or rich content.
Markdown to Text
Removes Markdown syntax characters that consume tokens without adding meaning for the LLM.
Seamless and Input-Only
Compression is applied only to input tokens before they reach the LLM. Responses are returned untouched. Your application code stays exactly the same - no SDK changes, no prompt rewrites, just lower costs.