OpenAI to Bedrock Translation
Use OpenAI-compatible chat clients with AWS Bedrock models. QuilrAI accepts an OpenAI-style /chat/completions request, calls Bedrock Converse or ConverseStream, and returns an OpenAI-shaped chat completion response.
Last verified: May 13, 2026
Scope
This page covers the bedrock provider on:
/openai_compatible/v1/chat/completions
It does not cover:
- AWS Bedrock Runtime boto3 routes such as /bedrock-runtime/model/{model_id}/converse
- Anthropic Messages on Bedrock
- Bedrock embeddings or rerank
- OpenAI Responses API
Use this mode when your application already speaks OpenAI Chat Completions and you want to call selected Bedrock models without switching to boto3.
Request Flow
- Create an LLM Gateway key with provider bedrock.
- Select one or more Bedrock chat models that support Converse.
- Point your OpenAI SDK or OpenAI-compatible wrapper at the closest regional endpoint, such as https://guardrails-usa-2.quilr.ai/openai_compatible/.
- Send model as the Bedrock model ID or inference profile ID.
- QuilrAI translates the OpenAI-style request to Bedrock Converse or ConverseStream.
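from openai import OpenAI

# Point the OpenAI SDK at the QuilrAI gateway instead of the default OpenAI endpoint.
client = OpenAI(
    base_url="https://guardrails-usa-2.quilr.ai/openai_compatible/",
    api_key="sk-quilr-xxx",  # LLM Gateway key created with provider bedrock
)

# model is the Bedrock model ID or inference profile ID.
response = client.chat.completions.create(
    model="amazon.nova-lite-v1:0",
    messages=[{"role": "user", "content": "Summarize this in one sentence."}],
    max_tokens=256,
)
print(response.choices[0].message.content)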
The normal gateway behavior still applies: authentication, provider and model routing, prompt-store substitution, request-side DLP, response-side DLP for non-streaming responses, logging, rate limits, token estimates, and performance metrics.
Supported Parameters
The Bedrock translator uses an allowlist. Unknown OpenAI parameters are rejected instead of silently dropped.
max_tokens and max_completion_tokens can both be present only when they have the same value. Token limits must be positive integers. stop must be a string or a list of strings.
Common rejected parameters include frequency_penalty, presence_penalty, logit_bias, logprobs, top_logprobs, seed, store, service_tier, modalities, audio, prediction, and reasoning_effort.
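The rules in this section can be checked on the client before a request is sent. The following is a minimal sketch, not QuilrAI's implementation: the function name check_request_params, the KNOWN_REJECTED set (copied from the list above, which is not the full allowlist), and the problem messages are all hypothetical.

```python
# Parameters this page lists as commonly rejected (not the complete allowlist).
KNOWN_REJECTED = {
    "frequency_penalty", "presence_penalty", "logit_bias", "logprobs",
    "top_logprobs", "seed", "store", "service_tier", "modalities",
    "audio", "prediction", "reasoning_effort",
}


def check_request_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks pass."""
    problems = []

    # Flag parameters that are documented as rejected.
    for key in sorted(k for k in params if k in KNOWN_REJECTED):
        problems.append(f"{key} is not supported on this path")

    # Token limits must be positive integers. bool is excluded explicitly
    # because bool is a subclass of int in Python.
    for key in ("max_tokens", "max_completion_tokens"):
        if key in params:
            value = params[key]
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"{key} must be a positive integer")

    # Both limits may be present only when they have the same value.
    if "max_tokens" in params and "max_completion_tokens" in params:
        if params["max_tokens"] != params["max_completion_tokens"]:
            problems.append("max_tokens and max_completion_tokens must match")

    # stop must be a string or a list of strings.
    if "stop" in params:
        stop = params["stop"]
        is_str_list = isinstance(stop, list) and all(isinstance(s, str) for s in stop)
        if not (isinstance(stop, str) or is_str_list):
            problems.append("stop must be a string or a list of strings")

    return problems
```

A check like this only catches the cases named on this page. The gateway's allowlist remains the source of truth for everything else.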
Message Support
Content is text-only:
- String content is supported.
- Content arrays are supported only when every part is text-like.
- Images, audio, files, Bedrock document blocks, and Bedrock image blocks are not supported on this OpenAI-to-Bedrock path.
Do not use this surface for multimodal Bedrock calls. Use a native Bedrock Runtime route when you need provider-native request shapes.
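A client can test message content against the text-only rule before sending. This sketch assumes that "text-like" means a part of the form {"type": "text", "text": "..."}; the gateway may accept other text part shapes that this page does not list. The helper name is_text_only is hypothetical.

```python
def is_text_only(content) -> bool:
    """Return True when content is a string or a list of text parts."""
    if isinstance(content, str):
        return True
    if isinstance(content, list):
        # Assumption: only {"type": "text", "text": <str>} counts as text-like.
        return all(
            isinstance(part, dict)
            and part.get("type") == "text"
            and isinstance(part.get("text"), str)
            for part in content
        )
    # Anything else (for example None on a tool-call-only assistant message)
    # carries no text content and is outside the scope of this check.
    return False
```

For example, a content array that mixes a text part with an image_url part returns False, which matches the mixed multimodal case that this path rejects.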
Tools and Function Calling
OpenAI tools are supported when each tool is a function tool:
Legacy OpenAI functions and function_call are also supported. For legacy function-call history, QuilrAI uses the function name as the Bedrock toolUseId so the later role: "function" result can be matched consistently.
Modern assistant.tool_calls entries must include id. QuilrAI rejects tool calls without IDs because the client would have no stable identifier to send back in the later tool result.
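To make the two request shapes concrete, the sketch below converts an OpenAI function tool into a Bedrock Converse toolSpec entry, and a legacy function_call into a Bedrock toolUse block whose toolUseId is the function name, as described above. It illustrates the public request shapes only; the helper names are hypothetical, and the default empty input schema is an assumption, not documented gateway behavior.

```python
import json


def openai_tool_to_bedrock(tool: dict) -> dict:
    """Map one OpenAI function tool to a Bedrock Converse tool entry."""
    if tool.get("type") != "function":
        raise ValueError("only function tools are supported")
    fn = tool["function"]
    spec = {
        "name": fn["name"],
        # Assumption: fall back to an empty object schema when none is given.
        "inputSchema": {"json": fn.get("parameters", {"type": "object", "properties": {}})},
    }
    if fn.get("description"):
        spec["description"] = fn["description"]
    return {"toolSpec": spec}


def legacy_function_call_to_tool_use(function_call: dict) -> dict:
    """Map a legacy assistant function_call to a Bedrock toolUse block."""
    name = function_call["name"]
    return {
        "toolUse": {
            "toolUseId": name,  # legacy calls have no id, so the name is reused
            "name": name,
            "input": json.loads(function_call.get("arguments") or "{}"),
        }
    }
```

Reusing the name as the ID works because the later role: "function" message also identifies its result by function name.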
Tool Choice
If a request requires a tool but no tools are present, QuilrAI rejects the request before calling Bedrock.
Parallel Tool Results
OpenAI-compatible clients often send one role: "tool" message per tool call:
[
{"role": "assistant", "tool_calls": [{"id": "call_a"}, {"id": "call_b"}]},
{"role": "tool", "tool_call_id": "call_a", "content": "result A"},
{"role": "tool", "tool_call_id": "call_b", "content": "result B"}
]
Bedrock expects all matching tool results for that assistant turn in the next user message. QuilrAI groups consecutive OpenAI role: "tool" or role: "function" messages into one Bedrock user message containing multiple toolResult blocks.
This grouping matters for parallel tool calls. Sending each tool result as a separate Bedrock user turn can produce a Bedrock validation error about missing toolResult blocks.
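The grouping step can be sketched as follows. This is an illustration under stated assumptions rather than the gateway's code: the function name group_tool_results is hypothetical, non-tool messages are passed through untouched to keep the example short, and tool content is assumed to be plain text.

```python
def group_tool_results(messages: list[dict]) -> list[dict]:
    """Merge each run of consecutive tool/function results into one user turn."""
    out = []
    pending = []  # toolResult blocks collected from the current run

    def flush():
        # Emit one Bedrock user message for the whole run, if any.
        if pending:
            out.append({"role": "user", "content": list(pending)})
            pending.clear()

    for msg in messages:
        role = msg.get("role")
        if role == "tool":
            tool_use_id = msg["tool_call_id"]
        elif role == "function":
            tool_use_id = msg["name"]  # legacy: function name doubles as the ID
        else:
            flush()
            out.append(msg)  # not converted in this sketch
            continue
        pending.append({
            "toolResult": {
                "toolUseId": tool_use_id,
                "content": [{"text": str(msg.get("content") or "")}],
            }
        })

    flush()
    return out
```

Applied to the two-result history shown earlier in this section, the output has one assistant entry followed by a single user message that holds both toolResult blocks in their original order.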
Structured Output
response_format support:
For json_schema, QuilrAI maps:
json_schema.strict is validated as a boolean, but it is not separately mapped. Bedrock structured output is schema-constrained through outputConfig.textFormat.
Bedrock structured output is model-dependent. Newer Claude models can support it, while older Claude and some Nova models may reject outputConfig. QuilrAI does not maintain a separate model allowlist for this field; unsupported models return the upstream Bedrock error.
json_object is rejected because Bedrock structured output requires a concrete JSON Schema. A generic object mode would not preserve OpenAI-equivalent semantics.
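On the client side, a json_schema request needs a concrete schema and, if strict is set, a boolean value. The helper below is a small sketch that builds the standard OpenAI response_format object and enforces those two points; the name json_schema_response_format is hypothetical, and the Bedrock-side mapping is left to the gateway.

```python
def json_schema_response_format(name: str, schema: dict, strict=None) -> dict:
    """Build an OpenAI-style response_format of type json_schema."""
    if not isinstance(schema, dict):
        raise TypeError("schema must be a JSON Schema object (dict)")
    body = {"name": name, "schema": schema}
    if strict is not None:
        # strict is validated as a boolean; it is not mapped separately.
        if not isinstance(strict, bool):
            raise TypeError("strict must be a boolean")
        body["strict"] = strict
    return {"type": "json_schema", "json_schema": body}
```

The returned dict can be passed as response_format to client.chat.completions.create. Whether the call succeeds still depends on the selected Bedrock model supporting outputConfig.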
Streaming Responses
Streaming uses Bedrock ConverseStream and returns OpenAI-compatible server-sent events.
Streaming response-side DLP is not applied on this path. QuilrAI performs request-side scanning, forwards chunks, and accumulates text and tool-call data for logging.
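A client that consumes the stream usually needs to reassemble text and tool calls from deltas. The sketch below assumes the chunks follow the standard OpenAI chat.completion.chunk shape, with tool-call fragments keyed by index; the exact chunk boundaries the gateway emits are not specified here, and accumulate_stream is a hypothetical name.

```python
def accumulate_stream(chunks):
    """Fold OpenAI-style stream chunks (as dicts) into text and tool calls."""
    text_parts = []
    tool_calls = {}  # index -> {"id", "name", "arguments"}

    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
            for tc in delta.get("tool_calls") or []:
                slot = tool_calls.setdefault(
                    tc["index"], {"id": None, "name": None, "arguments": ""}
                )
                if tc.get("id"):
                    slot["id"] = tc["id"]
                fn = tc.get("function") or {}
                if fn.get("name"):
                    slot["name"] = fn["name"]
                # Arguments arrive as JSON string fragments; concatenate them.
                slot["arguments"] += fn.get("arguments") or ""

    ordered = [tool_calls[i] for i in sorted(tool_calls)]
    return "".join(text_parts), ordered
```

With the OpenAI Python SDK, each streamed chunk is a model object, so something like accumulate_stream(c.model_dump() for c in stream) would feed it dicts. Because response-side DLP is skipped while streaming, any checks on the assembled output have to run in the client after accumulation.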
Non-Streaming Responses
Non-streaming Bedrock responses are converted back to OpenAI chat completions:
- Bedrock text blocks are joined into choices[].message.content.
- Bedrock toolUse blocks become OpenAI choices[].message.tool_calls.
- Tool-use-only responses return message.content: null.
- Bedrock usage maps to OpenAI prompt_tokens, completion_tokens, and total_tokens.
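The conversion above can be sketched against the public Bedrock Converse response shape. This is an illustration, not the gateway's code: the function name bedrock_to_openai is hypothetical, joining text blocks with an empty separator is an assumption, and finish-reason mapping is left out.

```python
import json


def bedrock_to_openai(bedrock_response: dict) -> dict:
    """Convert a Bedrock Converse response into an OpenAI-style message and usage."""
    blocks = bedrock_response["output"]["message"]["content"]

    # Text blocks are joined; the empty separator is an assumption.
    texts = [b["text"] for b in blocks if "text" in b]

    # Each toolUse block becomes one OpenAI tool call with JSON-encoded arguments.
    tool_calls = [
        {
            "id": b["toolUse"]["toolUseId"],
            "type": "function",
            "function": {
                "name": b["toolUse"]["name"],
                "arguments": json.dumps(b["toolUse"]["input"]),
            },
        }
        for b in blocks
        if "toolUse" in b
    ]

    # Tool-use-only responses carry content: null (None in Python).
    message = {"role": "assistant", "content": "".join(texts) if texts else None}
    if tool_calls:
        message["tool_calls"] = tool_calls

    usage = bedrock_response.get("usage", {})
    return {
        "message": message,
        "usage": {
            "prompt_tokens": usage.get("inputTokens", 0),
            "completion_tokens": usage.get("outputTokens", 0),
            "total_tokens": usage.get("totalTokens", 0),
        },
    }
```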
Finish reasons map as follows:
Unknown Bedrock stop reasons pass through unchanged.
Unsupported Features
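Unsupported request features are rejected before the upstream call. The exception is model-dependent support, such as structured output: those requests are forwarded, and an unsupported model returns the upstream Bedrock error.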
Unsupported content:
- Image content
- Audio content
- File content
- Bedrock document blocks
- Mixed multimodal content arrays
Unsupported request features:
- Multiple choices with n > 1
- Log probabilities
- Token bias
- Seed control
- Frequency and presence penalties
- OpenAI JSON object mode
- Audio input or output modes
- Prediction hints
- Service tier controls
- Storage flags
- Reasoning-effort controls
Errors
QuilrAI returns OpenAI-shaped error responses for adapter validation failures and wraps Bedrock validation errors without hiding the upstream message.
Common adapter error codes include:
Guardrail Behavior
Request-side DLP scans user text before the Bedrock call. Non-streaming responses are scanned before they are returned to the client.
Streaming responses are different: request-side DLP still runs, but response-side DLP is skipped so chunks can pass through as they arrive.
Tool messages are carried through without changing tool IDs or result ordering. Changing a tool_call_id, dropping a role: "tool" message, or reordering tool results can break Bedrock's strict tool-result validation.
Expected Good Scenarios
These scenarios are covered by the translator:
- Plain text chat
- System, developer, user, and assistant text messages
- Non-streaming text responses
- Streaming text responses
- Bedrock tool calls translated back to OpenAI tool_calls
- Tool-call deltas in streaming responses
- Consecutive OpenAI tool result messages grouped into one Bedrock tool-result user message
- Legacy OpenAI functions, function_call, and role: "function"
- Strict function tools when the deployed boto3/botocore Bedrock Runtime model supports ToolSpecification.strict
- response_format: json_schema on Bedrock models that support outputConfig
Expected Failures
These failures are intentional:
- response_format: json_object
- response_format: json_schema on Bedrock models that do not support outputConfig
- OpenAI image, audio, or file content
- Modern assistant.tool_calls entries without id
- Tool result messages missing tool_call_id
- A user message immediately after assistant tool calls without matching tool results
- Parallel tool results that are not consecutive in the OpenAI message history and therefore cannot be grouped into one Bedrock user turn
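Several of the tool-history failures above can be caught before a request leaves the client. The following pre-flight check is a rough sketch that covers only modern tool_calls and role: "tool" messages; the function name lint_tool_history and its messages are hypothetical, and the gateway's own validation may differ in detail.

```python
def lint_tool_history(messages: list[dict]) -> list[str]:
    """Return problems in an OpenAI message history that this path would reject."""
    problems = []
    expected = set()  # tool call IDs still waiting for a result

    for i, msg in enumerate(messages):
        role = msg.get("role")

        if role == "tool":
            call_id = msg.get("tool_call_id")
            if not call_id:
                problems.append(f"message {i}: tool result is missing tool_call_id")
            elif call_id in expected:
                expected.discard(call_id)
            else:
                # Also triggers when a run of results was interrupted earlier.
                problems.append(f"message {i}: tool result {call_id!r} has no pending tool call")
            continue

        # Any non-tool message ends the run of results for the previous turn.
        if expected:
            missing = ", ".join(sorted(expected))
            problems.append(f"message {i}: arrives before tool results for {missing}")
            expected.clear()

        if role == "assistant":
            for call in msg.get("tool_calls") or []:
                if call.get("id"):
                    expected.add(call["id"])
                else:
                    problems.append(f"message {i}: assistant.tool_calls entry without id")

    return problems
```

An empty list means none of these specific problems were found. It does not guarantee that Bedrock will accept the request.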