The Responses API is not a breaking version bump — it’s a new endpoint format. Your existing Chat Completions code will continue to work during the transition period, but all new features and documentation target the Responses API.
What Changed
| Concept | Chat Completions (old) | Responses API (new) |
|---|---|---|
| Endpoint | POST /api/v1/chat/completions | POST /api/v1/responses |
| SDK method | client.chat.completions.create() | client.responses.create() |
| Input | messages: [{role, content}] | input: "string" or input: [{role, content}] |
| System message | {role: "system", content: "..."} in messages | instructions parameter |
| Output text | response.choices[0].message.content | response.output[0].content[0].text |
| Streaming (Python) | stream=True + for chunk in stream: | client.responses.stream() context manager |
| Streaming (JS) | stream: true + for await | client.responses.stream() + .on() events |
| Tool format | {type, function: {name, description, parameters}} | {type, name, description, parameters} (flat) |
| Tool call output | response.choices[0].message.tool_calls | Items in response.output with type == "function_call" |
| Tool result | {role: "tool", tool_call_id, content} | {type: "function_call_output", call_id, output} |
| Structured output | response_format: {type, json_schema} | text: {format: {type, json_schema}} via extra_body |
| Parsed output | response.choices[0].message.parsed | response.parsed |
| Response ID | chatcmpl_... | resp_... |
| Response object | "chat.completion" | "response" |
| Token fields | prompt_tokens / completion_tokens | input_tokens / output_tokens |
| Finish signal | choices[0].finish_reason: "stop" | status: "completed" |
| SSE events | Generic data: chunks | Named events (response.created, response.output_text.delta, response.completed) |
| Credits | usage.credits_used | usage.credits_used (no change) |
Step-by-Step Migration
1. Update the SDK Method
The SDK method changes fromchat.completions.create() to responses.create(). The input format also changes from messages to input. No SDK version change is required — the OpenAI SDK already supports both.
2. Move System Messages to instructions
System messages are no longer part of the messages array. Use the top-level instructions parameter instead. Instructions are appended to the persona’s built-in system prompt.
3. Update Streaming
The streaming interface changes from a flag-based approach to a dedicated stream method with named events.4. Update Structured Outputs
Theresponse_format parameter is replaced by text format configuration passed via extra_body. The parsed result moves from response.choices[0].message.parsed to response.parsed.
5. Update Tool Calling
Tool definitions change from a nested format to a flat format. Tool call results change from{role: "tool"} messages to {type: "function_call_output"} items.
Tool Definitions
Reading Tool Calls
Sending Tool Results
6. Update Response Parsing
The response shape changes significantly. Update all code that reads from the response object.| Chat Completions (old) | Responses API (new) |
|---|---|
response.choices[0].message.content | response.output[0].content[0].text |
response.choices[0].message.parsed | response.parsed |
response.choices[0].finish_reason | response.status |
response.usage.prompt_tokens | response.usage.input_tokens |
response.usage.completion_tokens | response.usage.output_tokens |
response.usage.credits_used | response.usage.credits_used (unchanged) |
response.id (prefix chatcmpl_) | response.id (prefix resp_) |
response.object ("chat.completion") | response.object ("response") |
7. Update Error Handling
Error responses use the same format, but the retry logic should reference the new SDK method.Migration Checklist
Migration checklist
Migration checklist
Use this checklist to verify your migration is complete:
- Replace
client.chat.completions.create()withclient.responses.create() - Replace
messagesparameter withinputparameter - Move system messages from
messagesarray toinstructionsparameter - Update streaming to use
client.responses.stream()and named events - Update
response_formattotext: {format: {...}}inextra_body - Flatten tool definitions (remove
functionwrapper) - Update tool result messages to
{type: "function_call_output", call_id, output} - Update response parsing:
.choices[0].message.contentto.output[0].content[0].text - Update parsed access:
.choices[0].message.parsedto.parsed - Update token field references:
prompt_tokenstoinput_tokens,completion_tokenstooutput_tokens - Update finish detection:
finish_reason == "stop"tostatus == "completed" - Update cURL endpoints from
/chat/completionsto/responses - Update error handling and retry logic to use new SDK method
- Test all code paths (basic, streaming, tools, structured outputs, analysis mode)
Common Gotchas
System messages are ignored in the input array
System messages are ignored in the input array
The Responses API does not support
{role: "system"} in the input array. Use the instructions parameter instead. If you include a system message in input, it will be ignored or cause an error.Tool definitions must be flat
Tool definitions must be flat
The old
{type: "function", function: {name, ...}} nested format will not work. Use the flat format: {type: "function", name: "...", description: "...", parameters: {...}}.Tool results use call_id, not tool_call_id
Tool results use call_id, not tool_call_id
The field name changed from
tool_call_id to call_id, and the message type changed from {role: "tool"} to {type: "function_call_output"}.Streaming uses a context manager in Python
Streaming uses a context manager in Python
You can no longer use
stream=True as a parameter. Instead, use client.responses.stream() which returns a context manager. Iterate over events and check event.type == 'response.output_text.delta'.Structured output config key changed
Structured output config key changed
response_format is replaced by text in extra_body. The schema structure inside is the same, but the wrapping key is different: text: {format: {type: "json_schema", json_schema: {...}}}.Response shape is different
Response shape is different
There are no more
choices — the output is in response.output[]. Each output item has a type field ("message" for text, "function_call" for tool calls). Text content is at response.output[0].content[0].text.persona_id passing is unchanged
persona_id passing is unchanged
In Python,
persona_id is still passed via extra_body. In JavaScript, it’s still a top-level field with // @ts-ignore. This did not change.See Also
Responses API
Full Responses API reference and usage guide
Migrate OpenAI to Mavera
Migrate from OpenAI to Mavera (base URL + persona)
Streaming Guide
Deep dive into streaming patterns
Function Calling Guide
Tool calling patterns and best practices