Skip to main content

Quick Start

Get up and running with the LLM Gateway in 4 steps.

Create API Key
Provider: OpenAI
Models: gpt-4o, gpt-4o-mini
Key: sk-quilr-•••
Swap Base URL
base_url → guardrails-usa-2.quilr.ai
api_key → sk-quilr-•••
SDK code: unchanged
Configure
PII detection: ON
Rate limit: 100 req/min
Routing: weighted group
Monitor
Requests: 1,247
Cost: $12.40
Avg latency: 340ms
QuilrAI

1. Create an API Key

Go to the LLM Gateway tab and click Create New Key. Select your provider (OpenAI, Anthropic, Azure, Bedrock, Bedrock Runtime boto3, Vertex AI, OpenAI Responses, OpenAI Realtime, Copilot Studio, or any OpenAI-compatible endpoint), choose which models to expose, and generate your key.

Your provider API key is stored securely - developers only see the QuilrAI proxy key.

Pick the right provider for the endpoint you want to hit

Each QuilrAI endpoint is served only by matching provider types. A plain OpenAI chat-completions key cannot hit /openai_responses/ or /openai_realtime/ by swapping the URL - create the key with the OpenAI Responses / OpenAI Realtime provider (or their Azure variants), or add one as an additional provider on an existing key. See the Provider Support matrix.

For chat completions, a bedrock key can also use /openai_compatible/v1/chat/completions. QuilrAI translates the OpenAI-compatible request to Bedrock Converse, so OpenAI SDKs and OpenAI-compatible wrappers can call selected Bedrock models directly.

2. Swap the Base URL

Replace your provider's base URL with the closest QuilrAI regional gateway URL and use your QuilrAI key. Everything else - SDK, parameters, response format - stays exactly the same.

# Point the client to QuilrAI's gateway
client = OpenAI(
base_url='https://guardrails-usa-2.quilr.ai/openai_compatible/',
api_key='sk-quilr-xxx'
)

# Everything below stays exactly the same
resp = client.chat.completions.create(
model='gpt-4o',
messages=[{'role': 'user', 'content': 'Hello!'}]
)

Replace sk-quilr-xxx with the API key you created in the dashboard. The example uses the US East endpoint; choose the nearest regional endpoint from the Integration Guide for production traffic.

3. Configure Your Key

Sane defaults are selected automatically. Change them when setting up the key or edit them later.

SettingDescription
Security GuardrailsPII/PHI/PCI detection, adversarial blocking
Guardian AgentDependency safety guidance and task-adherence checks
Rate LimitsRequests per min/hr/day, token budgets
Request RoutingMulti-provider load balancing and failover
Token SavingJSON compression, HTML/MD to text
Prompt StoreCentralized system prompts
Identity AwarePer-user auth and tracking

4. Monitor Requests

Every request through the gateway is logged with cost, latency, token counts, and guardrail actions. Check your Logs tab to verify requests are flowing through, or use the LLM Gateway Log Export API to stream logs into your own data platform.


Next step: See the Integration Guide for full code examples with cURL, JavaScript, region options, and more.