Agent Monitoring
Correlate every gateway LLM call with the agent run, trace, workflow, or conversation that produced it - using standard distributed-tracing headers, Quilr agent headers, or provider request-body metadata. No Quilr SDK required.
How It Works
- Your agent makes a call through the gateway, carrying tracing context on the request - in headers, in the provider request body, or both.
- The gateway normalizes the recognized signals into one stable shape and discards everything it does not recognize. Raw headers, raw baggage, and secrets are never stored.
- You correlate related calls in the request logs and log export by their shared trace_id, agent_run_id, conversation_id, or workflow_run_id.
The capture is non-blocking. Malformed or unrecognized tracing data is dropped silently - it never rejects or delays a request.
What Gets Captured
Recognized signals are normalized into a single observability object stored under each request's extra_data (schema_version 1.0). The object is included in the Log Export API response as metadata.extra_data.observability.
Only known, bounded, sanitized values are copied. The gateway does not persist raw request headers, raw baggage, or secrets.
{
  "observability": {
    "schema_version": "1.0",
    "trace": {
      "source": "w3c",
      "trace_id": "0af7651916cd43dd8448eb211c80319c",
      "incoming_parent_span_id": "b7ad6b7169203331",
      "trace_flags": "01",
      "sampled": true
    },
    "correlation": { "conversation_id": "thread-123", "external_request_id": "client-req-123" },
    "agent": { "run_id": "run-123", "name": "Support Agent", "framework": "langgraph" },
    "workflow": { "id": "support-triage", "run_id": "workflow-run-123" },
    "baggage": { "agent.run_id": "run-123", "agent.framework": "langgraph" },
    "gateways": { "helicone": { "session_id": "sess-1" } },
    "request": { "metadata": { "workflow": "support-triage" }, "user": "end-user-7" },
    "upstream": { "provider": "openai", "request_id": "req_abc123" }
  }
}
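As an illustration of how the object above can be used for correlation, the following sketch groups exported log records by one of the shared IDs. It assumes each record is a dict carrying the object at metadata.extra_data.observability, as described above; the record "id" field and the helper names correlation_ids and group_by are hypothetical and not part of the gateway.

```python
from collections import defaultdict


def correlation_ids(record: dict) -> dict:
    """Pull the four grouping IDs out of one exported log record."""
    metadata = record.get("metadata") or {}
    extra = metadata.get("extra_data") or {}
    obs = extra.get("observability") or {}
    return {
        "trace_id": (obs.get("trace") or {}).get("trace_id"),
        "agent_run_id": (obs.get("agent") or {}).get("run_id"),
        "conversation_id": (obs.get("correlation") or {}).get("conversation_id"),
        "workflow_run_id": (obs.get("workflow") or {}).get("run_id"),
    }


def group_by(records: list, field: str) -> dict:
    """Group records by one correlation field, skipping records without it."""
    groups = defaultdict(list)
    for record in records:
        value = correlation_ids(record)[field]
        if value is not None:
            groups[value].append(record)
    return dict(groups)
```

Calling group_by(records, "trace_id") returns one list of records per trace, which is usually enough to reconstruct the sequence of LLM calls made by a single agent run.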
Correlation Channels
Real agent frameworks split correlation between HTTP headers and the provider request body, so the gateway reads both.
W3C Trace Context
The primary, recommended channel. Send the standard headers and the gateway extracts the trace and the incoming caller span.
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: vendorname=opaqueValue
The incoming span_id is the caller's span - it is stored as incoming_parent_span_id. Malformed or all-zero values are dropped.
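The sketch below shows one way a client might build a traceparent value, and how a header in that format can be validated and mapped onto the trace fields shown under What Gets Captured. It follows the W3C version-00 layout (version, 32-hex trace ID, 16-hex span ID, 2-hex flags). It is an approximation for illustration, not the gateway's actual parser, and the function names are hypothetical.

```python
import re
import secrets

# version - trace-id - parent-id - trace-flags, all lowercase hex
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent with random trace and span IDs."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str):
    """Return normalized trace fields, or None if the header is unusable."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    if version == "ff":
        return None  # version ff is invalid in the W3C format
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid and are dropped
    return {
        "source": "w3c",
        "trace_id": trace_id,
        "incoming_parent_span_id": span_id,  # the caller's span
        "trace_flags": flags,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

A value produced by make_traceparent() can be sent as the traceparent header on the gateway call; most HTTP clients and provider SDKs offer a way to attach extra request headers.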
Vendor Tracing Headers
If you already run a different tracing system, send its headers and the gateway normalizes them into the same trace shape, keeping the original values under vendor_trace_context for debugging.
When more than one trace system is present, the primary trace is chosen by precedence - W3C, then B3, Datadog, AWS X-Ray, Google Cloud Trace, Sentry - and the rest are retained in vendor_trace_context.
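The precedence rule can be pictured with a small sketch. The system labels other than "w3c" (which appears as a source value in the example object) are hypothetical, as is the choose_primary helper; the ordering itself is the one stated above.

```python
# Highest priority first. Labels other than "w3c" are illustrative only.
PRECEDENCE = ["w3c", "b3", "datadog", "xray", "gcp", "sentry"]


def choose_primary(candidates: dict):
    """Pick the primary trace from {system_label: normalized_trace_fields}.

    Returns (primary, rest): primary is the winning trace tagged with its
    source, and rest holds the remaining recognized systems, which would be
    retained under vendor_trace_context.
    """
    recognized = {k: v for k, v in candidates.items() if k in PRECEDENCE}
    for system in PRECEDENCE:
        if system in recognized:
            primary = {"source": system, **recognized[system]}
            rest = {k: v for k, v in recognized.items() if k != system}
            return primary, rest
    return None, {}
```

For example, a request carrying both B3 and Datadog headers would yield a primary trace with source "b3", with the Datadog context kept alongside for debugging.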
Quilr Agent Headers
There is no universal HTTP standard for agent/run identity, so the gateway defines a canonical X-Quilr-Agent-* set and accepts short X-Agent-* aliases for compatibility. Both forms map to the same fields.
Conversation, Correlation, and Session
Lightweight correlation IDs for grouping and idempotency. Conversation IDs remain backward compatible with Conversation Grouping.
The gateway's own internal request_id is never overwritten - inbound request IDs are stored only as external correlation.
W3C Baggage
The gateway parses the baggage header but stores only allowlisted keys - raw baggage is never persisted, and any key containing a sensitive term is dropped.
baggage: agent.run_id=run-123,agent.framework=langgraph,workflow.id=claims-review
Allowlisted keys:
agent.run_id, agent.id, agent.name, agent.version, agent.framework, agent.thread_id, agent.step.name, agent.step.type
workflow.id, workflow.run_id
session.id, session.previous_id
user.id, user.email, conversation.id
OpenTelemetry GenAI semantic-convention correlation keys (IDs and names only - never the content-bearing gen_ai.* message or argument attributes):
gen_ai.conversation.id, gen_ai.agent.id, gen_ai.agent.name, gen_ai.data_source.id, gen_ai.tool.name, gen_ai.tool.call.id
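The allowlist behavior can be approximated as follows. This is a sketch, not the gateway's parser: it splits the baggage header into list members, ignores any ";property" suffixes, percent-decodes values, and keeps only the keys listed above. The sensitive-term list is the one given under Limits and Safety; the function name parse_baggage is hypothetical.

```python
from urllib.parse import unquote

ALLOWLIST = frozenset({
    "agent.run_id", "agent.id", "agent.name", "agent.version",
    "agent.framework", "agent.thread_id", "agent.step.name", "agent.step.type",
    "workflow.id", "workflow.run_id",
    "session.id", "session.previous_id",
    "user.id", "user.email", "conversation.id",
    "gen_ai.conversation.id", "gen_ai.agent.id", "gen_ai.agent.name",
    "gen_ai.data_source.id", "gen_ai.tool.name", "gen_ai.tool.call.id",
})

SENSITIVE_TERMS = (
    "authorization", "auth", "api_key", "apikey", "token", "secret",
    "cookie", "password", "credential", "key", "jwt",
)


def parse_baggage(header: str) -> dict:
    """Keep only allowlisted, non-sensitive baggage entries."""
    kept = {}
    for member in header.split(","):
        entry = member.split(";", 1)[0]  # discard optional properties
        key, sep, value = entry.partition("=")
        key = key.strip()
        if not sep or not key:
            continue  # malformed member: drop silently
        if any(term in key.lower() for term in SENSITIVE_TERMS):
            continue  # defensive check, applied before the allowlist
        if key not in ALLOWLIST:
            continue
        kept[key] = unquote(value.strip())
    return kept
```

Applied to the example header above, this returns the three entries agent.run_id, agent.framework, and workflow.id; anything else on the header is discarded rather than stored.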
Request Body Correlation - the no-header path
Most agent frameworks (LangChain/LangGraph, LlamaIndex, CrewAI, OpenAI Agents) do not put correlation in HTTP headers on the provider call. When a developer sets it, it lands in provider-native request-body fields. The gateway parses these too, so a stock OpenAI or Anthropic SDK call through the gateway can correlate with zero custom headers.
client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    user="end-user-7",
    extra_body={"metadata": {"agent_run_id": "run-123", "workflow": "support-triage"}},
)
Body correlation is captured on the OpenAI-compatible, Anthropic Messages, OpenAI Responses, and Copilot Studio surfaces. Pairs whose key contains a sensitive term are dropped; opaque values such as run-monkey-3 are kept. Only scalar values (string, number, boolean) are captured; nested objects are skipped.
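A minimal sketch of the filtering rules for body metadata, assuming the sensitive terms and the 16-pair cap listed under Limits and Safety. The helper name filter_metadata is hypothetical, and the real gateway may apply the rules in a different order.

```python
SENSITIVE_TERMS = (
    "authorization", "auth", "api_key", "apikey", "token", "secret",
    "cookie", "password", "credential", "key", "jwt",
)
MAX_PAIRS = 16  # cap on free-form metadata maps


def filter_metadata(metadata: dict) -> dict:
    """Keep scalar pairs whose key contains no sensitive term."""
    kept = {}
    for key, value in metadata.items():
        if len(kept) >= MAX_PAIRS:
            break
        if not isinstance(key, str):
            continue
        if any(term in key.lower() for term in SENSITIVE_TERMS):
            continue  # only the key is inspected, never the value
        if not isinstance(value, (str, int, float, bool)):
            continue  # nested objects, lists, and nulls are skipped
        kept[key] = value
    return kept
```

Because only keys are matched, a value such as run-monkey-3 survives even though it contains the substring "key". The substring match does mean that a key such as "author" is dropped, since it contains "auth", so it is worth choosing metadata key names that avoid these terms.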
Migrating From Another Gateway
If you are moving from another LLM gateway, keep your existing instrumentation - the gateway captures the common inbound conventions under gateways, preserving provenance.
Upstream Provider Request IDs
For request-to-provider traceability, the gateway also records the request ID returned by the upstream provider (for example OpenAI x-request-id, Anthropic request-id, Bedrock x-amzn-requestid) under observability.upstream. When a request makes more than one upstream call (preflight or retries), the most recent call is kept as the flattened values and a bounded list of recent calls is retained alongside it.
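One way to picture the "most recent plus bounded history" behavior is a fixed-length queue. In this sketch the provider and request_id field names come from the example object; the recent_calls key, the bound of 5, and the UpstreamTracker class are assumptions made for illustration.

```python
from collections import deque


class UpstreamTracker:
    """Track upstream provider request IDs for a single gateway request."""

    def __init__(self, max_recent: int = 5):
        # deque(maxlen=...) silently evicts the oldest entry when full
        self._recent = deque(maxlen=max_recent)

    def record(self, provider: str, request_id: str) -> None:
        self._recent.append({"provider": provider, "request_id": request_id})

    def snapshot(self) -> dict:
        """Latest call flattened at the top level, plus the bounded history."""
        if not self._recent:
            return {}
        latest = self._recent[-1]
        return {**latest, "recent_calls": list(self._recent)}
```

With this shape, a reader who only needs the final provider request ID reads the top-level request_id, while the history remains available for debugging retries.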
Framework Guidance
These signals let you correlate the LLM calls an agent makes. They do not capture the agent's own tool calls, retrieval steps, or memory operations - the gateway cannot infer those reliably from provider traffic. Use your framework's tracing exporter for those spans.
Limits and Safety
The capture is deliberately bounded and privacy-preserving:
- Sensitive keys are dropped. Any header, baggage key, or metadata key containing authorization, auth, api_key, apikey, token, secret, cookie, password, credential, key, or jwt is discarded.
- Values are sanitized. Stored values are ASCII-safe with NUL bytes removed, capped at 512 bytes each.
- Maps are bounded. Free-form metadata and JSON-header maps are capped at 16 key/value pairs.
- Total size is capped at 8 KB. If the normalized object would exceed this, the lowest-priority sections are dropped in order (vendor trace context, then competing-gateway data, then body metadata, then baggage) so the most valuable trace, correlation, and agent IDs are preserved.
- Header lookup is case-insensitive, and malformed tracing data is dropped rather than stored.
- Identity is never trusted from telemetry. Tenant, app, API key, and user authorization are always derived from gateway auth - never from inbound tracing headers or body fields.
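The value and size limits above can be sketched as follows. Several details are assumptions made for illustration: that non-ASCII characters are dropped rather than escaped, that 8 KB means 8192 bytes of serialized JSON, and that the droppable sections live at the top-level keys vendor_trace_context, gateways, request, and baggage. The function names are hypothetical.

```python
import json

MAX_VALUE_BYTES = 512
MAX_TOTAL_BYTES = 8 * 1024

# Lowest-priority sections first: vendor trace context, competing-gateway
# data, body metadata, then baggage. Key names are assumed for this sketch.
DROP_ORDER = ["vendor_trace_context", "gateways", "request", "baggage"]


def sanitize_value(value: str) -> str:
    """Strip NUL bytes and non-ASCII characters, then cap at 512 bytes."""
    cleaned = value.replace("\x00", "")
    ascii_bytes = cleaned.encode("ascii", errors="ignore")
    return ascii_bytes[:MAX_VALUE_BYTES].decode("ascii")


def enforce_size_cap(observability: dict) -> dict:
    """Drop low-priority sections until the serialized object fits."""
    trimmed = dict(observability)
    for section in DROP_ORDER:
        if len(json.dumps(trimmed).encode("utf-8")) <= MAX_TOTAL_BYTES:
            break
        trimmed.pop(section, None)
    return trimmed
```

Dropping whole sections in a fixed order, rather than truncating arbitrary fields, keeps the result predictable: the trace, correlation, and agent sections are never touched by this step.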
Related
- Conversation Grouping - the X-Conversation-Id header reused here for conversation correlation.
- Identity Aware - per-user identity that complements agent/run correlation.
- Log Export API - where the observability object is delivered in metadata.extra_data.