Agent Monitoring

Correlate every gateway LLM call with the agent run, trace, workflow, or conversation that produced it - using standard distributed-tracing headers, Quilr agent headers, or provider request-body metadata. No Quilr SDK required.

How It Works

Diagram: Agent Makes a Call (traceparent: 00-0af7...-b7ad...-01, X-Quilr-Agent-Run-Id: run-123, metadata: { workflow: triage }) → Gateway Normalizes (trace_id and span_id extracted; agent / workflow IDs captured; secrets and raw headers dropped) → Correlate in Logs (extra_data.observability; join calls by trace_id / run_id; included in log export)

  1. Your agent makes a call through the gateway, carrying tracing context on the request - in headers, in the provider request body, or both.
  2. The gateway normalizes the recognized signals into one stable shape and discards everything it does not recognize. Raw headers, raw baggage, and secrets are never stored.
  3. You correlate related calls in the request logs and log export by their shared trace_id, agent_run_id, conversation_id, or workflow_run_id.

The capture is non-blocking. Malformed or unrecognized tracing data is dropped silently - it never rejects or delays a request.

What Gets Captured

Recognized signals are normalized into a single observability object stored under each request's extra_data (schema_version 1.0). The object is included in the Log Export API response as metadata.extra_data.observability.

Only known, bounded, sanitized values are copied. The gateway does not persist raw request headers, raw baggage, or secrets.

{
  "observability": {
    "schema_version": "1.0",
    "trace": {
      "source": "w3c",
      "trace_id": "0af7651916cd43dd8448eb211c80319c",
      "incoming_parent_span_id": "b7ad6b7169203331",
      "trace_flags": "01",
      "sampled": true
    },
    "correlation": { "conversation_id": "thread-123", "external_request_id": "client-req-123" },
    "agent": { "run_id": "run-123", "name": "Support Agent", "framework": "langgraph" },
    "workflow": { "id": "support-triage", "run_id": "workflow-run-123" },
    "baggage": { "agent.run_id": "run-123", "agent.framework": "langgraph" },
    "gateways": { "helicone": { "session_id": "sess-1" } },
    "request": { "metadata": { "workflow": "support-triage" }, "user": "end-user-7" },
    "upstream": { "provider": "openai", "request_id": "req_abc123" }
  }
}
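
As a rough illustration of how this object can be used downstream, the sketch below groups exported log records by their shared trace_id. It assumes each record is a dict that exposes the object at metadata.extra_data.observability as described above; the record list and the "id" field are hypothetical and only there to make the example readable.

```python
from collections import defaultdict


def group_by_trace(records):
    """Group exported log records by observability.trace.trace_id."""
    groups = defaultdict(list)
    for record in records:
        extra_data = record.get("metadata", {}).get("extra_data", {})
        trace = extra_data.get("observability", {}).get("trace", {})
        trace_id = trace.get("trace_id")
        if trace_id:  # records without tracing context are skipped
            groups[trace_id].append(record)
    return dict(groups)


def _record(record_id, trace_id=None):
    """Build a hypothetical, heavily trimmed log-export record."""
    extra_data = {}
    if trace_id:
        extra_data["observability"] = {"trace": {"trace_id": trace_id}}
    return {"id": record_id, "metadata": {"extra_data": extra_data}}


TRACE_ID = "0af7651916cd43dd8448eb211c80319c"
records = [_record("a", TRACE_ID), _record("b", TRACE_ID), _record("c")]
calls_by_trace = group_by_trace(records)
```

The same pattern applies to agent.run_id, correlation.conversation_id, or workflow.run_id by changing the lookup path.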

Correlation Channels

Real agent frameworks split correlation between HTTP headers and the provider request body, so the gateway reads both.

W3C Trace Context

The primary, recommended channel. Send the standard headers and the gateway extracts the trace and the incoming caller span.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: vendorname=opaqueValue

| Header | Captured as |
| --- | --- |
| traceparent | trace.trace_id, trace.incoming_parent_span_id, trace.trace_flags, trace.sampled |
| tracestate | trace.tracestate (stored bounded) |

The incoming span_id is the caller's span - it is stored as incoming_parent_span_id. Malformed or all-zero values are dropped.
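
To make the mapping concrete, here is a minimal sketch of how a traceparent value could be turned into the trace fields shown above. It is not the gateway's implementation: it assumes the four-field version 00 layout from the W3C Trace Context specification, and the function name parse_traceparent is hypothetical.

```python
import re

# version - trace-id - parent-id - trace-flags, lowercase hex only (W3C Trace Context).
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value):
    """Return normalized trace fields, or None when the header is unusable."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None  # malformed: dropped, never an error
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return {
        "source": "w3c",
        "trace_id": trace_id,
        "incoming_parent_span_id": span_id,  # the caller's span
        "trace_flags": flags,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

Returning None instead of raising mirrors the non-blocking behavior described earlier: a bad header simply yields no trace section.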

Vendor Tracing Headers

If you already run a different tracing system, send its headers and the gateway normalizes them into the same trace shape, keeping the original values under vendor_trace_context for debugging.

| System | Headers |
| --- | --- |
| Zipkin / B3 | b3, x-b3-traceid, x-b3-spanid, x-b3-parentspanid, x-b3-sampled, x-b3-flags |
| Datadog | x-datadog-trace-id, x-datadog-parent-id, x-datadog-sampling-priority, x-datadog-origin, x-datadog-tags |
| AWS X-Ray | x-amzn-trace-id |
| Google Cloud Trace | x-cloud-trace-context |
| Sentry | sentry-trace |

When more than one trace system is present, the primary trace is chosen by precedence - W3C, then B3, Datadog, AWS X-Ray, Google Cloud Trace, Sentry - and the rest are retained in vendor_trace_context.
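
The precedence rule can be sketched as follows. This is an illustration only: the source labels other than "w3c" (such as "xray" or "gcp") are hypothetical names, the input is assumed to be already-parsed per-system values, and the sketch models only the "rest are retained" part of vendor_trace_context.

```python
# Precedence order from the paragraph above; labels other than "w3c" are hypothetical.
PRECEDENCE = ["w3c", "b3", "datadog", "xray", "gcp", "sentry"]


def choose_primary(candidates):
    """Pick the primary trace by precedence.

    candidates maps a source label to the values parsed from that system's headers.
    Returns (primary_trace_or_None, vendor_trace_context).
    """
    primary = None
    vendor_trace_context = {}
    for source in PRECEDENCE:
        if source not in candidates:
            continue
        if primary is None:
            primary = {"source": source, **candidates[source]}
        else:
            vendor_trace_context[source] = candidates[source]
    return primary, vendor_trace_context


primary, others = choose_primary(
    {"datadog": {"trace_id": "123"}, "b3": {"trace_id": "abc"}}
)
```

Here B3 wins over Datadog because it appears earlier in the precedence list, and the Datadog values are kept alongside for debugging.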

Quilr Agent Headers

There is no universal HTTP standard for agent/run identity, so the gateway defines a canonical X-Quilr-Agent-* set and accepts short X-Agent-* aliases for compatibility. Both forms map to the same fields.

| Canonical header | Alias | Captured as |
| --- | --- | --- |
| X-Quilr-Agent-Run-Id | X-Agent-Run-Id | agent.run_id |
| X-Quilr-Agent-Id | X-Agent-Id | agent.id |
| X-Quilr-Agent-Name | X-Agent-Name | agent.name |
| X-Quilr-Agent-Version | X-Agent-Version | agent.version |
| X-Quilr-Agent-Framework | X-Agent-Framework | agent.framework |
| X-Quilr-Agent-Thread-Id | X-Agent-Thread-Id | agent.thread_id |
| X-Quilr-Agent-Span-Id | X-Agent-Span-Id | agent.span_id |
| X-Quilr-Agent-Parent-Span-Id | X-Agent-Parent-Span-Id | agent.parent_span_id |
| X-Quilr-Agent-Step-Name | X-Agent-Step-Name | agent.step_name |
| X-Quilr-Agent-Step-Type | X-Agent-Step-Type | agent.step_type |
| X-Quilr-Workflow-Id | X-Workflow-Id | workflow.id |
| X-Quilr-Workflow-Run-Id | X-Workflow-Run-Id | workflow.run_id |
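
The table above can be read as a simple lookup. The sketch below shows one way the canonical and alias forms could resolve to the same fields, with case-insensitive header matching. The function name is hypothetical, and the choice that the canonical header wins when both forms are sent is an assumption; the documentation does not state which takes priority.

```python
# Header suffix -> (section, field), following the table above.
FIELDS = {
    "Agent-Run-Id": ("agent", "run_id"),
    "Agent-Id": ("agent", "id"),
    "Agent-Name": ("agent", "name"),
    "Agent-Version": ("agent", "version"),
    "Agent-Framework": ("agent", "framework"),
    "Agent-Thread-Id": ("agent", "thread_id"),
    "Agent-Span-Id": ("agent", "span_id"),
    "Agent-Parent-Span-Id": ("agent", "parent_span_id"),
    "Agent-Step-Name": ("agent", "step_name"),
    "Agent-Step-Type": ("agent", "step_type"),
    "Workflow-Id": ("workflow", "id"),
    "Workflow-Run-Id": ("workflow", "run_id"),
}


def capture_agent_fields(headers):
    """Map X-Quilr-* headers and their X-* aliases to agent/workflow fields."""
    lowered = {name.lower(): value for name, value in headers.items()}
    captured = {}
    for suffix, (section, field) in FIELDS.items():
        canonical = lowered.get(f"x-quilr-{suffix}".lower())
        alias = lowered.get(f"x-{suffix}".lower())
        value = canonical or alias  # assumption: canonical wins over alias
        if value:
            captured.setdefault(section, {})[field] = value
    return captured
```

For example, a request carrying x-agent-run-id: run-123 and X-Quilr-Workflow-Id: support-triage would yield an agent section with run_id and a workflow section with id.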

Conversation, Correlation, and Session

Lightweight correlation IDs for grouping and idempotency. Conversation IDs remain backward compatible with Conversation Grouping.

| Header(s) | Captured as |
| --- | --- |
| X-Conversation-Id, Conversation-Id, conversation_id | correlation.conversation_id (also top-level extra_data.conversation_id) |
| X-Request-Id, Request-Id | correlation.external_request_id |
| X-Correlation-Id, Correlation-Id | correlation.correlation_id |
| X-Session-Id, Session-Id | correlation.session_id |
| Idempotency-Key, X-Idempotency-Key | correlation.idempotency_key |

The gateway's own internal request_id is never overwritten - inbound request IDs are stored only as external correlation.

W3C Baggage

The gateway parses the baggage header but stores only allowlisted keys - raw baggage is never persisted, and any key containing a sensitive term is dropped.

baggage: agent.run_id=run-123,agent.framework=langgraph,workflow.id=claims-review

Allowlisted keys:

  • agent.run_id, agent.id, agent.name, agent.version, agent.framework, agent.thread_id, agent.step.name, agent.step.type
  • workflow.id, workflow.run_id
  • session.id, session.previous_id
  • user.id, user.email, conversation.id

OpenTelemetry GenAI semantic-convention correlation keys (IDs and names only - never the content-bearing gen_ai.* message or argument attributes):

  • gen_ai.conversation.id, gen_ai.agent.id, gen_ai.agent.name, gen_ai.data_source.id, gen_ai.tool.name, gen_ai.tool.call.id
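
A minimal sketch of allowlist-based baggage parsing, under stated assumptions: the allowlist is copied from the lists above, baggage properties after ";" are ignored, values are percent-decoded as the W3C Baggage format allows, and parse_baggage is a hypothetical name rather than a gateway function.

```python
from urllib.parse import unquote

ALLOWED_KEYS = {
    "agent.run_id", "agent.id", "agent.name", "agent.version", "agent.framework",
    "agent.thread_id", "agent.step.name", "agent.step.type",
    "workflow.id", "workflow.run_id",
    "session.id", "session.previous_id",
    "user.id", "user.email", "conversation.id",
    "gen_ai.conversation.id", "gen_ai.agent.id", "gen_ai.agent.name",
    "gen_ai.data_source.id", "gen_ai.tool.name", "gen_ai.tool.call.id",
}


def parse_baggage(header):
    """Keep only allowlisted baggage members; everything else is discarded."""
    kept = {}
    for member in header.split(","):
        pair = member.split(";", 1)[0]  # drop optional properties after ';'
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or key not in ALLOWED_KEYS:
            continue  # malformed or non-allowlisted members are skipped
        kept[key] = unquote(value.strip())
    return kept


captured = parse_baggage(
    "agent.run_id=run-123,agent.framework=langgraph,workflow.id=claims-review"
)
```

Because only allowlisted keys survive, a member such as api_key=... never reaches storage even before the sensitive-term check is considered.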

Request Body Correlation - the no-header path

Most agent frameworks (LangChain/LangGraph, LlamaIndex, CrewAI, OpenAI Agents SDK) do not put correlation data in HTTP headers on the provider call. When a developer sets correlation values, they land in provider-native request-body fields instead. The gateway parses these fields too, so a stock OpenAI or Anthropic SDK call through the gateway can be correlated with zero custom headers.

| Provider | Body field | Captured as |
| --- | --- | --- |
| OpenAI (Chat Completions / Responses) | metadata (object, up to 16 pairs) | request.metadata |
| OpenAI | user | request.user |
| OpenAI | safety_identifier | request.safety_identifier |
| OpenAI | prompt_cache_key | request.prompt_cache_key |
| Anthropic (Messages) | metadata.user_id | request.metadata.user_id |

from openai import OpenAI

# Assumes the client is configured to send requests through the gateway.
client = OpenAI()

client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    user="end-user-7",
    extra_body={"metadata": {"agent_run_id": "run-123", "workflow": "support-triage"}},
)

Body correlation is captured on the OpenAI-compatible, Anthropic Messages, OpenAI Responses, and Copilot Studio surfaces. Pairs whose key contains a sensitive term are dropped; opaque values such as run-monkey-3 are kept. Only scalar values (string, number, boolean) are captured; nested objects are skipped.
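
The filtering rules in the paragraph above can be sketched like this. It is an illustration, not the gateway's code: the sensitive terms are taken from the Limits and Safety list, the function name is hypothetical, and applying the 16-pair cap after filtering (rather than before) is an assumption.

```python
SENSITIVE_TERMS = (
    "authorization", "auth", "api_key", "apikey", "token", "secret",
    "cookie", "password", "credential", "key", "jwt",
)
MAX_PAIRS = 16


def filter_metadata(metadata):
    """Keep scalar pairs whose key contains no sensitive term, up to MAX_PAIRS."""
    kept = {}
    for key, value in metadata.items():
        if len(kept) >= MAX_PAIRS:
            break
        if any(term in key.lower() for term in SENSITIVE_TERMS):
            continue  # the check is on the key, so a value like "run-monkey-3" is fine
        if not isinstance(value, (str, int, float, bool)):
            continue  # nested objects and lists are skipped
        kept[key] = value
    return kept


filtered = filter_metadata({
    "agent_run_id": "run-monkey-3",
    "api_key": "sk-not-stored",
    "nested": {"a": 1},
    "attempt": 2,
    "dry_run": False,
})
```

Note that the value run-monkey-3 survives even though it contains the substring "key", because only keys are matched against the sensitive terms.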

Migrating From Another Gateway

If you are moving from another LLM gateway, keep your existing instrumentation - the gateway captures the common inbound conventions under gateways, preserving provenance.

| Gateway | Headers |
| --- | --- |
| Helicone | Helicone-Session-Id, Helicone-Session-Name, Helicone-Session-Path, Helicone-User-Id, Helicone-Request-Id |
| Portkey | x-portkey-trace-id, x-portkey-span-id, x-portkey-span-name, x-portkey-metadata (JSON object, bounded) |
| Cloudflare AI Gateway | cf-aig-metadata (JSON object, bounded) |

Upstream Provider Request IDs

For request-to-provider traceability, the gateway also records the upstream provider's request ID, read from the provider's response headers (for example OpenAI x-request-id, Anthropic request-id, Bedrock x-amzn-requestid), under observability.upstream. When a request makes more than one upstream call (preflight or retries), the most recent call is kept in the flattened fields and a bounded list of recent calls is retained.
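
A small sketch of that bookkeeping, with loud caveats: the recent_calls field name and the limit of 5 entries are hypothetical (the documentation only says the list is bounded), and the provider labels are illustrative. Only the header names come from the paragraph above.

```python
# Response header carrying the provider's request ID, per the examples above.
REQUEST_ID_HEADERS = {
    "openai": "x-request-id",
    "anthropic": "request-id",
    "bedrock": "x-amzn-requestid",
}
MAX_RECENT = 5  # hypothetical bound


def record_upstream(upstream, provider, response_headers):
    """Return an updated upstream section after one provider response."""
    header = REQUEST_ID_HEADERS.get(provider)
    lowered = {name.lower(): value for name, value in response_headers.items()}
    request_id = lowered.get(header) if header else None
    if request_id is None:
        return upstream  # nothing recognizable to record
    call = {"provider": provider, "request_id": request_id}
    recent = (upstream.get("recent_calls", []) + [call])[-MAX_RECENT:]
    # The latest call is flattened at the top level; "recent_calls" is a hypothetical name.
    return {**call, "recent_calls": recent}


upstream = record_upstream({}, "openai", {"X-Request-Id": "req_abc123"})
upstream = record_upstream(upstream, "openai", {"x-request-id": "req_def456"})
```

After a retry, the top-level request_id reflects the second call while both calls remain visible in the bounded list.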

Framework Guidance

| Framework | Lowest-friction correlation |
| --- | --- |
| LangGraph / LangChain | Pass extra_body={"metadata": {"agent_run_id": ..., "workflow": ...}} and user=... on the model call. To use headers, map thread_id to X-Conversation-Id and the run ID to X-Quilr-Agent-Run-Id. |
| OpenAI Agents SDK | Propagate the run/trace IDs via W3C traceparent or X-Quilr-Agent-* headers. |
| LlamaIndex | Pass conversation/run IDs as headers on LLM calls routed through the gateway. |
| CrewAI / custom | Use W3C trace headers when available, and X-Quilr-Agent-* for run/agent identity. |

Header correlation vs. full agent tracing

These signals let you correlate the LLM calls an agent makes. They do not capture the agent's own tool calls, retrieval steps, or memory operations - the gateway cannot infer those reliably from provider traffic. Use your framework's tracing exporter for those spans.
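
For the header-based route in the framework guidance above, a helper along these lines could build the headers once per agent run. The function name is hypothetical, and minting a fresh trace ID here is a simplification: if your application already has an active trace, propagate its traceparent instead of generating a new one.

```python
import secrets


def correlation_headers(thread_id, run_id):
    """Build correlation headers for LLM calls made during one agent run."""
    trace_id = secrets.token_hex(16)  # 32 lowercase hex characters
    span_id = secrets.token_hex(8)    # 16 lowercase hex characters
    return {
        "traceparent": f"00-{trace_id}-{span_id}-01",
        "X-Conversation-Id": thread_id,
        "X-Quilr-Agent-Run-Id": run_id,
    }


headers = correlation_headers("thread-123", "run-123")
```

With the OpenAI Python SDK, a dict like this can be passed per request through the extra_headers argument of client.chat.completions.create; other clients typically offer an equivalent default-headers option.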

Limits and Safety

The capture is deliberately bounded and privacy-preserving:

  • Sensitive keys are dropped. Any header, baggage key, or metadata key containing authorization, auth, api_key, apikey, token, secret, cookie, password, credential, key, or jwt is discarded.
  • Values are sanitized. Stored values are ASCII-safe with NUL bytes removed, capped at 512 bytes each.
  • Maps are bounded. Free-form metadata and JSON-header maps are capped at 16 key/value pairs.
  • Total size is capped at 8 KB. If the normalized object would exceed this, the lowest-priority sections are dropped in order (vendor trace context, then competing-gateway data, then body metadata, then baggage) so the most valuable trace, correlation, and agent IDs are preserved.
  • Header lookup is case-insensitive, and malformed tracing data is dropped rather than stored.
  • Identity is never trusted from telemetry. Tenant, app, API key, and user authorization are always derived from gateway auth - never from inbound tracing headers or body fields.
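
Two of the limits above, value sanitization and the total size cap, can be sketched as follows. This is an approximation under assumptions: "8 KB" is taken as 8192 bytes of serialized JSON, non-ASCII characters are simply removed, and "competing-gateway data" and "body metadata" are assumed to correspond to the gateways and request sections of the object.

```python
import json

MAX_VALUE_BYTES = 512
MAX_TOTAL_BYTES = 8 * 1024  # assumption: "8 KB" means 8192 bytes
# Lowest-priority sections first, following the drop order described above.
DROP_ORDER = ["vendor_trace_context", "gateways", "request", "baggage"]


def sanitize_value(value):
    """Strip NUL bytes and non-ASCII characters, then cap at 512 bytes."""
    text = str(value).replace("\x00", "")
    ascii_bytes = text.encode("ascii", errors="ignore")  # assumption: non-ASCII is removed
    return ascii_bytes[:MAX_VALUE_BYTES].decode("ascii")


def enforce_size_cap(observability):
    """Drop low-priority sections until the serialized object fits the cap."""
    trimmed = dict(observability)
    for section in DROP_ORDER:
        if len(json.dumps(trimmed).encode("utf-8")) <= MAX_TOTAL_BYTES:
            break
        trimmed.pop(section, None)
    return trimmed


oversized = {
    "trace": {"trace_id": "0af7651916cd43dd8448eb211c80319c"},
    "baggage": {"agent.run_id": "run-123"},
    "gateways": {"helicone": {"session_id": "s" * 9000}},
}
fitted = enforce_size_cap(oversized)
```

In this example only the gateways section is dropped, so the trace and baggage data remain intact.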

Related

  • Conversation Grouping - the X-Conversation-Id header reused here for conversation correlation.
  • Identity Aware - per-user identity that complements agent/run correlation.
  • Log Export API - where the observability object is delivered in metadata.extra_data.