LLM Gateway Log Export API

Use the Log Export API to pull LLM Gateway request logs into your own data platform, SIEM, warehouse, or scheduled export job.

The API returns newline-delimited JSON. Each response line is one complete JSON object, so clients can stream, parse, and checkpoint logs incrementally.

GET https://guardrails.quilr.ai/llmgateway/logs/export

Response content type:

Content-Type: application/x-ndjson

Authentication

Pass a log export key from the QuilrAI LLM Gateway UI:

X-Quilr-Log-Export-Key: sk-export-...

Do not use your QuilrAI gateway API key as the request credential for this endpoint. The log export key is separate from model-call authentication.

The UI exposes two export scopes:

| Export key | Scope |
| --- | --- |
| log_export_key | Exports logs only for the underlying QuilrAI API key it belongs to. |
| all_apps_log_export_key | Exports logs for all active, non-expired LLM Gateway apps in the tenant. |

Both scopes use the same endpoint, header, query parameters, pagination model, and response format. In all-apps exports, each llmgateway.request event still includes the concrete app name in app.name.

Query Parameters

All query parameters are optional.

| Parameter | Description |
| --- | --- |
| start_time | ISO 8601 lower bound for exported logs. Naive timestamps are treated as UTC. |
| end_time | ISO 8601 upper bound for exported logs. Naive timestamps are treated as UTC. |
| cursor | Opaque cursor from the previous checkpoint.next_cursor. When provided, it wins over start_time. |
| limit | Maximum request rows to export in this response. Default 1000. Values above 5000 are silently clamped to 5000. Values below 1 or non-integer values return 400. |

Logs are available for a maximum of 15 days. Choose start_time within that retention window when backfilling. Requests with an effective start_time, end_time, or cursor timestamp before the retention window fail with 400.

If neither start_time nor cursor is provided, the API exports a default 24-hour window ending at the effective export end time.
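As a rough sketch of how a client might assemble these parameters, the Python helper below builds the request URL. The names build_export_url and MAX_LIMIT are illustrative, and clamping limit on the client is an assumption made for convenience; the server applies its own validation regardless.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

EXPORT_URL = "https://guardrails.quilr.ai/llmgateway/logs/export"
MAX_LIMIT = 5000  # documented ceiling; larger values are clamped by the server


def build_export_url(start_time=None, end_time=None, cursor=None, limit=None):
    """Return the export URL carrying only the parameters that were supplied."""
    params = {}
    if cursor is not None:
        # A cursor wins over start_time, so start_time is simply not sent.
        params["cursor"] = cursor
    elif start_time is not None:
        params["start_time"] = _iso_utc(start_time)
    if end_time is not None:
        params["end_time"] = _iso_utc(end_time)
    if limit is not None:
        # The API answers 400 for non-integers and values below 1.
        if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
            raise ValueError("limit must be an integer >= 1")
        params["limit"] = min(limit, MAX_LIMIT)
    return EXPORT_URL + ("?" + urlencode(params) if params else "")


def _iso_utc(moment):
    """Format a datetime as ISO 8601 in UTC; naive values are taken as UTC."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Calling build_export_url() with no arguments returns the bare endpoint, which corresponds to the default 24-hour window described above.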

Export Lag

The API does not export logs newer than 15 minutes. Gateway logs and prediction payloads are written asynchronously, so this lag keeps exported rows stable.

If end_time is newer than now - 15 minutes, the server clamps it to the maximum exportable time. The request still succeeds. The export_started and checkpoint events include the effective export bounds.
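The clamping rule can be reproduced on the client to predict what the server will report. The sketch below is illustrative only: effective_end_time is a hypothetical helper, and treating an omitted end_time as the maximum exportable time (without setting the clamped flag) is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone

EXPORT_LAG = timedelta(minutes=15)  # documented export lag


def effective_end_time(requested_end, now):
    """Return (effective_end, clamped) for a requested end_time.

    Both arguments are timezone-aware datetimes; requested_end may be None.
    """
    max_exportable = now - EXPORT_LAG
    if requested_end is None:
        # Assumption: with no end_time, the window ends at the lag boundary.
        return max_exportable, False
    if requested_end > max_exportable:
        return max_exportable, True
    return requested_end, False


now = datetime(2026, 5, 14, 11, 0, tzinfo=timezone.utc)
too_recent = datetime(2026, 5, 14, 10, 50, tzinfo=timezone.utc)
print(effective_end_time(too_recent, now))
```

With now at 11:00 UTC, a requested end_time of 10:50 UTC is pulled back to 10:45 UTC and flagged as clamped.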

Request Examples

Start an export window:

curl -N \
-H "X-Quilr-Log-Export-Key: sk-export-..." \
"https://guardrails.quilr.ai/llmgateway/logs/export?start_time=2026-05-14T00:00:00Z&end_time=2026-05-14T01:00:00Z&limit=1000"

Resume from the previous checkpoint:

curl -N \
-H "X-Quilr-Log-Export-Key: sk-export-..." \
"https://guardrails.quilr.ai/llmgateway/logs/export?cursor=<next_cursor>"

When resuming with cursor, you do not need to pass start_time or end_time.
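For clients that prefer Python over curl, a minimal standard-library equivalent might look like the sketch below. The helper names make_request and stream_events are hypothetical, and the 60-second timeout is an arbitrary choice. Note that urllib raises urllib.error.HTTPError for non-2xx responses, so errors returned before streaming starts surface as exceptions.

```python
import json
import urllib.request

EXPORT_URL = "https://guardrails.quilr.ai/llmgateway/logs/export"


def make_request(export_key, query=""):
    """Build a GET request that carries the log export key header."""
    url = EXPORT_URL + ("?" + query if query else "")
    return urllib.request.Request(
        url, headers={"X-Quilr-Log-Export-Key": export_key}
    )


def stream_events(request, timeout=60):
    """Yield one parsed JSON object per non-empty NDJSON response line."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:  # the response object iterates line by line
            line = raw_line.strip()
            if line:
                yield json.loads(line)
```

A caller would iterate over stream_events(make_request("sk-export-...", "cursor=<next_cursor>")) and handle each event as it arrives instead of buffering the whole response.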

Pagination

Rows are ordered by:

timestamp ASC, request_id ASC

The cursor is opaque. Store it exactly as returned in checkpoint.next_cursor and send it back as the cursor query parameter on the next request.

If checkpoint.has_more is true, call the endpoint again immediately with cursor=<next_cursor>.

If checkpoint.has_more is false, there are no more rows in the current effective window. Store next_cursor and poll later with that cursor to continue incremental export.

When an initial request (no cursor supplied) returns zero rows, the API returns a checkpoint cursor pinned to the effective end time. This lets exporters store one cursor value even for empty windows.

When a request with a cursor returns zero rows, checkpoint.next_cursor echoes the inbound cursor unchanged and has_more is false. Re-poll later with the same cursor.
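The pagination rules above can be sketched as a small loop. This is one possible client-side shape, not a reference implementation: drain, fetch_page, and handle_event are hypothetical names, and fetch_page is assumed to be a callable that takes a dict of query parameters and returns an iterable of parsed events.

```python
def drain(fetch_page, handle_event, cursor=None, initial_params=None):
    """Follow cursors until has_more is false and return the cursor to store."""
    # With a stored cursor, send only the cursor; otherwise use the initial window.
    params = {"cursor": cursor} if cursor else dict(initial_params or {})
    while True:
        checkpoint = None
        for event in fetch_page(params):
            kind = event.get("type")
            if kind == "llmgateway.request":
                handle_event(event)
            elif kind == "checkpoint":
                checkpoint = event
            elif kind == "error":
                raise RuntimeError(event["error"]["message"])
        if checkpoint is None:
            # No checkpoint means the page did not complete successfully.
            raise RuntimeError("response ended without a checkpoint")
        cursor = checkpoint["next_cursor"]
        if not checkpoint["has_more"]:
            return cursor  # persist this value and poll again later
        params = {"cursor": cursor}
```

The returned cursor is the value to persist. On the next scheduled run, pass it back as cursor so the export continues where it stopped, including after an empty page.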

Coverage

The export covers LLM Gateway traffic for the selected export scope, including:

| Traffic type | Exported |
| --- | --- |
| OpenAI-compatible chat completions | Yes |
| Anthropic Messages | Yes |
| OpenAI Responses | Yes |
| OpenAI Realtime session logs | Yes |
| OpenAI speech-to-text | Yes |
| OpenAI text-to-speech | Yes |
| Embeddings | Yes |
| Rerank | Yes |
| AWS Bedrock Runtime boto3 | Yes |
| Vertex AI Gemini | Yes |
| Streaming requests | Yes |
| SDK mode checks | Yes |
| Copilot Studio checks | Yes |

Response Events

Every successful response starts with export_started, contains zero or more llmgateway.request events, and ends with checkpoint.
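A client can check this envelope before trusting a page. The sketch below buffers one response in memory, which suits tests and small pages more than large streams; parse_response is a hypothetical helper name.

```python
import json


def parse_response(lines):
    """Split one successful NDJSON response into (started, rows, checkpoint)."""
    events = [json.loads(line) for line in lines if line.strip()]
    if not events or events[0].get("type") != "export_started":
        raise ValueError("first event must be export_started")
    if len(events) < 2 or events[-1].get("type") != "checkpoint":
        raise ValueError("last event must be checkpoint")
    rows = events[1:-1]
    if any(row.get("type") != "llmgateway.request" for row in rows):
        raise ValueError("unexpected event between export_started and checkpoint")
    if events[-1].get("rows") != len(rows):
        raise ValueError("checkpoint.rows does not match the rows received")
    return events[0], rows, events[-1]
```

Comparing checkpoint.rows with the number of rows actually received is a cheap way to detect a truncated page before the cursor is advanced.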

export_started

The first line describes the effective export window.

{
  "type": "export_started",
  "schema_version": "v1",
  "scope": "app",
  "app_name": "my-app",
  "app_count": 1,
  "effective_start_time": "2026-05-14T00:00:00.000Z",
  "effective_end_time": "2026-05-14T01:00:00.000Z",
  "max_exportable_time": "2026-05-14T10:45:00.000Z",
  "end_time_clamped": false,
  "limit": 1000
}

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always export_started. |
| schema_version | string | Event schema version. Current value is v1. |
| scope | string | app for a single-key export, or all_apps for a tenant-wide all-apps export. |
| app_name | string or null | LLM Gateway app name for a single-key export, or "" if that app has no configured name. null for all-apps exports because the export spans multiple apps; the concrete per-request app name is on each llmgateway.request event at app.name. |
| app_count | number | Number of active, non-expired apps included in the export scope. |
| effective_start_time | string | ISO 8601 timestamp where this export starts. |
| effective_end_time | string | ISO 8601 timestamp where this export ends. |
| max_exportable_time | string | Newest timestamp eligible for export after the 15-minute lag. |
| end_time_clamped | boolean | true when the requested end_time was newer than max_exportable_time. |
| limit | number | Maximum request rows returned in this response. |

For an all-apps export, the first line has this shape:

{
  "type": "export_started",
  "schema_version": "v1",
  "scope": "all_apps",
  "app_name": null,
  "app_count": 3,
  "effective_start_time": "2026-05-14T00:00:00.000Z",
  "effective_end_time": "2026-05-14T01:00:00.000Z",
  "max_exportable_time": "2026-05-14T10:45:00.000Z",
  "end_time_clamped": false,
  "limit": 1000
}

llmgateway.request

Each request row is emitted as one llmgateway.request event.

{
  "type": "llmgateway.request",
  "schema_version": "v1",
  "cursor": "<opaque-cursor>",
  "app": {
    "name": "my-app"
  },
  "request": {
    "id": "request-id",
    "timestamp": "2026-05-14T00:00:01.123Z",
    "endpoint": "/openai_compatible/v1/chat/completions",
    "model": "gpt-4.1",
    "provider": "openai",
    "stream": false,
    "status_code": 200,
    "error_type": null,
    "error_message": null
  },
  "tokens": {
    "request": 100,
    "response": 200,
    "cache_read": 0,
    "cache_write": null,
    "reasoning": null,
    "max_requested": 1000
  },
  "latency_ms": {
    "upstream": 800,
    "quilr_processing": 120,
    "guardrails": 90,
    "first_response": 920,
    "total": 950
  },
  "guardrails": {
    "outcome": "normal",
    "is_blocked": false,
    "is_anonymized": false,
    "actions_and_categories": {},
    "request_predictions": [],
    "response_predictions": []
  },
  "payload": {
    "hydration_status": "complete",
    "request_text": {},
    "response_text": {}
  },
  "metadata": {
    "user_email": null,
    "conversation_id": null,
    "client_ip": "203.0.113.10",
    "extra_data": {},
    "sdk": null
  },
  "routing": {
    "group_id": null,
    "mode": null
  },
  "telemetry": {
    "processing_times": null,
    "chunk_funnel": null
  }
}

Top-Level Fields

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always llmgateway.request. |
| schema_version | string | Event schema version. Current value is v1. |
| cursor | string | Opaque cursor for this request row. |
| app | object | App metadata. |
| request | object | Gateway request metadata. |
| tokens | object | Token counts and token limits. |
| latency_ms | object | Latency measurements in milliseconds. |
| guardrails | object | Guardrail outcome and prediction metadata. |
| payload | object | Hydrated request and response payloads when available. |
| metadata | object | User, client, SDK, and extra request metadata. |
| routing | object | Routing group metadata when routing is used. |
| telemetry | object | Additional processing telemetry. |

app

| Field | Type | Description |
| --- | --- | --- |
| name | string | LLM Gateway app name. |

request

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique request ID. |
| timestamp | string | Request timestamp in ISO 8601 format. |
| endpoint | string | Gateway endpoint path used by the request. |
| model | string or null | Requested model or routing group name. |
| provider | string or null | Provider selected for the request. |
| stream | boolean | Whether the request used a streaming response mode. |
| status_code | number or null | HTTP status code returned to the client. |
| error_type | string or null | Error category when the request failed. |
| error_message | string or null | Error message when the request failed. Credential-shaped substrings are redacted before export. |

tokens

| Field | Type | Description |
| --- | --- | --- |
| request | number or null | Input token count. |
| response | number or null | Output token count. |
| cache_read | number | Tokens read from provider prompt cache. 0 when the provider did not report a cache read. |
| cache_write | number or null | Tokens written to provider prompt cache, when available. |
| reasoning | number or null | Reasoning token count, when reported by the provider. |
| max_requested | number or null | Maximum output tokens requested by the client. |

latency_ms

| Field | Type | Description |
| --- | --- | --- |
| upstream | number or null | Time spent waiting on the upstream provider. |
| quilr_processing | number or null | Time spent in QuilrAI gateway processing. |
| guardrails | number or null | Time spent evaluating guardrails. |
| first_response | number or null | Time to first response token or first response byte, when available. |
| total | number or null | Total gateway request duration. |

guardrails

| Field | Type | Description |
| --- | --- | --- |
| outcome | string or null | Final guardrail outcome, such as normal, blocked, or another configured outcome. |
| is_blocked | boolean | Whether the request or response was blocked. |
| is_anonymized | boolean | Whether anonymization was applied. |
| actions_and_categories | object | Guardrail actions grouped by detected categories. |
| request_predictions | array | Request-side prediction results. |
| response_predictions | array | Response-side prediction results. |

Guardian Agent findings are included in these same prediction arrays with match_type: "guardian". Guardian request and response categories are also available under metadata.extra_data.guardian_agent when present.
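To separate Guardian Agent findings from other predictions, a consumer might filter both arrays on match_type. The sketch below assumes each prediction is a JSON object; the full prediction schema is not described here, so guardian_findings (a hypothetical helper) reads only the match_type key and passes everything else through untouched.

```python
def guardian_findings(event):
    """Return (side, prediction) pairs whose match_type is "guardian"."""
    guardrails = event.get("guardrails") or {}
    findings = []
    for side in ("request_predictions", "response_predictions"):
        for prediction in guardrails.get(side) or []:
            # Assumption: predictions are objects; anything else is skipped.
            if isinstance(prediction, dict) and prediction.get("match_type") == "guardian":
                findings.append((side, prediction))
    return findings
```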

payload

| Field | Type | Description |
| --- | --- | --- |
| hydration_status | string | complete when payload data is available, or missing_prediction when the request log exists but payload hydration is unavailable. |
| request_text | object, array, string, or null | Hydrated request payload. The field name matches the dashboard concept and is not limited to plain strings. |
| response_text | object, array, string, or null | Hydrated response payload. The field name matches the dashboard concept and is not limited to plain strings. |

When hydration is unavailable, the payload object uses this shape:

{
  "hydration_status": "missing_prediction",
  "request_text": null,
  "response_text": null
}
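Because request_text and response_text can be objects, arrays, strings, or null, downstream code usually needs to normalize them. One possible approach, assuming the destination stores text columns, is sketched below; payload_texts is a hypothetical helper, and serializing non-string payloads with json.dumps is a design choice rather than anything the API requires.

```python
import json


def payload_texts(event):
    """Return (request_text, response_text) as strings, or None if not hydrated."""
    payload = event["payload"]
    if payload["hydration_status"] != "complete":
        return None  # for example missing_prediction: no payload to store
    return (_as_text(payload["request_text"]), _as_text(payload["response_text"]))


def _as_text(value):
    """Keep strings and null as they are; serialize objects and arrays as JSON."""
    if value is None or isinstance(value, str):
        return value
    return json.dumps(value, ensure_ascii=False, sort_keys=True)
```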

metadata

| Field | Type | Description |
| --- | --- | --- |
| user_email | string or null | User email associated with the request, when identity-aware tracking is configured. Also present in extra_data when populated. |
| conversation_id | string or null | Conversation ID from X-Conversation-Id, when provided. Also present in extra_data when populated. |
| client_ip | string or null | Client IP observed by the gateway. Also present in extra_data when populated. |
| extra_data | object | Additional request metadata. The hoisted fields above (user_email, conversation_id, client_ip) are not removed from this object. The jwt_claims field is always stripped. |
| sdk | object or null | SDK metadata when the request came from SDK mode or a tracked SDK client. |

routing

| Field | Type | Description |
| --- | --- | --- |
| group_id | string or null | Routing group identifier when request routing is used. |
| mode | string or null | Routing mode used for the request. |

telemetry

| Field | Type | Description |
| --- | --- | --- |
| processing_times | object or null | Additional internal processing timings, when available. |
| chunk_funnel | object or null | Streaming chunk telemetry, when available. |

checkpoint

The final line of a successful response is a checkpoint event.

{
  "type": "checkpoint",
  "schema_version": "v1",
  "next_cursor": "<opaque-cursor>",
  "rows": 1000,
  "has_more": true,
  "effective_end_time": "2026-05-14T01:00:00.000Z",
  "max_exportable_time": "2026-05-14T10:45:00.000Z"
}

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always checkpoint. |
| schema_version | string | Event schema version. Current value is v1. |
| next_cursor | string | Opaque cursor to store and use on the next request. |
| rows | number | Number of llmgateway.request events emitted in this response. |
| has_more | boolean | true when another page is available for the same effective export window. |
| effective_end_time | string | Effective upper bound used for this export response. |
| max_exportable_time | string | Newest timestamp eligible for export after the 15-minute lag. |
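Since next_cursor is the only state an exporter has to keep, it is worth persisting it atomically so that a crash mid-write cannot corrupt it. The sketch below stores the cursor in a local file; the file-based storage and the helper names save_cursor and load_cursor are assumptions for illustration, and a database row or object store would serve equally well.

```python
import os
import tempfile


def save_cursor(path, cursor):
    """Write the cursor to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cursor-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(cursor)  # stored exactly as returned, no trimming
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_cursor(path):
    """Return the stored cursor, or None when nothing has been saved yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read() or None
    except FileNotFoundError:
        return None
```

A cautious exporter saves the cursor only after the rows from that page have been durably written to the destination, so that a crash leads to re-reading a page rather than skipping one.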

Redaction

The export endpoint applies a best-effort credential scrub before emitting any row. Expect the following to be missing or rewritten in exported events:

  • extra_data.jwt_claims is removed from every row.
  • The value of any object key named headers, request_headers, response_headers, or http_headers is replaced with the string [REDACTED_HEADERS].
  • The values of object keys that name a credential (such as authorization, api_key, x_api_key, quilr_api_key, access_token, refresh_token, client_secret, password, private_key, and AWS credential field names) and of any key suffixed with _api_key, _apikey, _access_token, _refresh_token, _client_secret, or _private_key are replaced with [REDACTED].
  • String values are scanned for common credential patterns. Matches are rewritten to placeholders such as [REDACTED_API_KEY], [REDACTED_QUILR_API_KEY], [REDACTED_LOG_EXPORT_KEY], or Bearer [REDACTED].

Redaction is applied recursively to payload.request_text, payload.response_text, guardrails.actions_and_categories, guardrails.request_predictions, guardrails.response_predictions, metadata.extra_data, metadata.sdk, telemetry.processing_times, telemetry.chunk_funnel, and the request.error_message field. This is a safety layer, not a formal DLP pass over exported payloads.
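Because the server-side scrub is best-effort, some teams may want a second pass on the client before forwarding rows to a lower-trust sink. The sketch below mirrors only the key-name rules listed above and is not the gateway's implementation. It omits the AWS credential field names and the string pattern scan, since neither is spelled out here, and matching key names case-insensitively is an assumption.

```python
HEADER_KEYS = {"headers", "request_headers", "response_headers", "http_headers"}
CREDENTIAL_KEYS = {
    "authorization", "api_key", "x_api_key", "quilr_api_key", "access_token",
    "refresh_token", "client_secret", "password", "private_key",
}
CREDENTIAL_SUFFIXES = (
    "_api_key", "_apikey", "_access_token",
    "_refresh_token", "_client_secret", "_private_key",
)


def scrub(value):
    """Recursively replace values stored under header and credential keys."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            name = str(key).lower()  # assumption: case-insensitive key match
            if name in HEADER_KEYS:
                cleaned[key] = "[REDACTED_HEADERS]"
            elif name in CREDENTIAL_KEYS or name.endswith(CREDENTIAL_SUFFIXES):
                cleaned[key] = "[REDACTED]"
            else:
                cleaned[key] = scrub(item)
        return cleaned
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value  # strings and other scalars pass through unchanged
```

The function returns a new structure and leaves its input unmodified, which keeps the original event available for debugging in a trusted environment.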

Errors

Errors are returned as NDJSON too.

{"type":"error","error":{"message":"<message>","code":"<code>"}}

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always error. |
| error.message | string | Human-readable error message. |
| error.code | string | Machine-readable error code. |

Errors before streaming starts return an HTTP error status with a single NDJSON error line as the response body. Errors after streaming has started return HTTP 200 and emit an error event line in the body because the HTTP response has already been committed.
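Since an error can arrive either with an HTTP error status or as a line inside a 200 response, clients need to inspect every line and not only the status code. A minimal sketch follows; ExportError and iter_events are hypothetical names, and the caller is assumed to supply the HTTP status code together with the raw response lines.

```python
import json


class ExportError(Exception):
    """Raised when the export stream contains an error event."""

    def __init__(self, code, message, mid_stream):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.mid_stream = mid_stream  # True when the error followed HTTP 200


def iter_events(status_code, lines):
    """Yield parsed events and raise ExportError on the first error event."""
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        if event.get("type") == "error":
            error = event["error"]
            raise ExportError(
                error["code"], error["message"], mid_stream=(status_code == 200)
            )
        yield event
```

When mid_stream is true, the response ended without a checkpoint, so a cautious client discards the partial page and retries from the last cursor it stored.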