OpenAI SDK compatible. Any code written for
client.responses.create() works with Mavera. Just change base_url and add a persona_id.Key Features
Persona Intelligence
Inject specialized personas to get audience-aware responses from 50+ demographics
OpenAI Compatible
Works with any OpenAI SDK — Python, JavaScript, Go, Rust, and more
Streaming
Real-time token streaming via Server-Sent Events with named event types
Analysis Mode
Get structured insights with confidence scores, emotional analysis, and bias detection
Structured Outputs
Request JSON responses with custom schemas for type-safe integrations
Vision Support
Analyze images with multimodal capabilities via input arrays
Tool Calling
Define custom functions with a flat tool format, plus built-in server tools
Credits Tracking
Every response includes
usage.credits_used for cost monitoringRequest Flow
Basic Usage
Persona Integration
Every Responses API request should include apersona_id to activate Mavera’s audience intelligence. The persona shapes the model’s perspective, language, values, and decision-making patterns.
persona_id, the API still works but returns generic responses without audience-specific perspective.
Streaming
Enable real-time streaming for better UX in chat interfaces. Mavera streams tokens via Server-Sent Events (SSE) using named event types likeresponse.output_text.delta.
SSE Event Types
| Event | Description |
|---|---|
response.created | The response object has been created |
response.output_text.delta | A chunk of output text — access the text via event.delta |
response.completed | The response is fully complete |
The stream context manager in Python automatically handles connection lifecycle. In JavaScript, use
stream.on() for event-based iteration and await stream.finalResponse() to get the completed response object.Analysis Mode
Enable analysis mode for structured, quantified insights alongside the natural-language response. This is unique to Mavera — it layers behavioral analysis on top of the persona’s answer.analysis object:
| Field | Type | Description |
|---|---|---|
confidence | number (1–10) | How confident the persona is in this response |
emotion.emotional_valence | number (1–10) | Positive vs negative emotional tone |
emotion.arousal | number (1–10) | Intensity of emotional activation |
emotion.dominance | number (1–10) | Feeling of control or influence |
biases | array | Cognitive biases that may affect the response (e.g., anchoring, confirmation bias) |
key_insights | array | Structured takeaways from the response |
Structured Outputs
Request JSON responses with a specific structure using thetext format parameter passed via extra_body. This provides type-safe, predictable outputs for downstream code.
Simple JSON Mode
Force the model to return valid JSON:JSON Schema Mode
Define a strict schema for predictable, typed responses. This is ideal for pipelines where downstream code depends on a specific structure.Tool Calling (Function Calling)
Enable the model to call custom functions you define. The Responses API uses a flat tool format — tool properties likename, description, and parameters are top-level fields instead of nested under a function wrapper.
Defining Tools
Pass an array of tool definitions with JSON Schema parameters:Handling Tool Calls
When the response containsfunction_call items, execute the functions locally and send results back using function_call_output:
Tool Choice
Control how the model uses tools:Built-in Server Tools
Mavera provides server-side tools that execute automatically without client roundtrips:| Tool | Description |
|---|---|
tavily_search | Web search for current information |
tavily_extract | Extract structured content from URLs |
generateImage | Generate images from text prompts (DALL·E) |
editImage | Edit uploaded images |
generateVideo | Generate videos from text (Sora) |
imageToVideo | Convert images to video (RunwayML) |
server_tool_calls array of the response.
Image / Vision Input
Include images for visual analysis using theinput array with input_image content blocks:
Images must be publicly accessible URLs. Supported formats: PNG, JPEG, GIF, WebP. Maximum file size: 20 MB.
Multi-turn Conversations
Use theinput array with role-based messages for multi-turn conversations:
Error Handling
The API uses standard HTTP status codes and returns structured error responses.Error Response Format
Common Error Codes
| Status | Code | Description | Resolution |
|---|---|---|---|
400 | invalid_request | Malformed request body or missing fields | Check request format and required fields |
401 | unauthorized | Invalid or missing API key | Verify your mvra_live_ or mvra_test_ key |
403 | forbidden | Key lacks permission for this operation | Check workspace access and key permissions |
404 | persona_not_found | The specified persona_id doesn’t exist | List personas to find valid IDs |
422 | validation_error | Request fields fail validation | Check field types and constraints |
429 | rate_limit_exceeded | Too many requests | Back off and retry with exponential delay |
500 | internal_error | Server-side failure | Retry after a brief delay |
503 | service_unavailable | Temporary overload | Retry with exponential backoff |
Retry Logic
Rate Limits and Credits
Rate Limits
| Plan | Requests / minute | Requests / day |
|---|---|---|
| Free | 10 | 100 |
| Pro | 60 | 5,000 |
| Business | 300 | 50,000 |
| Enterprise | Custom | Custom |
Credit Costs
| Operation | Credits |
|---|---|
| Standard response | 1–3 |
| Response with analysis mode | 3–5 |
| Response with tool calls | 2–4 per turn |
| Response with image input | 3–5 |
| Streaming (same as non-streaming) | 1–3 |
response.usage.credits_used. Monitor this to stay within budget.
Advanced Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
model | string | — | Required. Always "mavera-1" |
input | string or array | — | Required. A string for simple queries, or an array of {role, content} messages for multi-turn |
persona_id | string | — | ID of the persona to use |
instructions | string | — | System-level instructions appended to the persona’s system prompt |
analysis_mode | boolean | false | Enable structured analysis output |
text | object | — | Structured output format config (e.g. {format: {type: "json_schema", ...}}) |
tools | array | — | Custom function definitions (flat format) |
tool_choice | string/object | "auto" | Control tool usage |
reasoning_effort | string | "medium" | "low", "medium", or "high" |
verbosity | string | "medium" | "low", "medium", or "high" |
stream | boolean | false | Enable streaming (or use client.responses.stream()) |
temperature | number | 1 | Randomness (0–2) |
max_output_tokens | integer | model default | Maximum tokens in the response |
top_p | number | 1 | Nucleus sampling threshold |
frequency_penalty | number | 0 | Penalize repeated tokens (−2 to 2) |
presence_penalty | number | 0 | Penalize tokens already present (−2 to 2) |
The
instructions parameter replaces system messages. Instructions are appended to the persona’s built-in system prompt, so you don’t need to repeat persona context — just add task-specific guidance.Response Format
Standard Response
Response with Analysis
Response with Tool Calls
Best Practices
Always include a persona
Always include a persona
Mavera’s value comes from persona intelligence. Without a
persona_id, you get generic responses. Match the persona to your target audience for the most useful outputs.Use instructions for task context
Use instructions for task context
The
instructions parameter is appended to the persona’s built-in system prompt. Use it to set task context — for example: "You are a market research analyst interviewing this persona about brand loyalty." This gives the model both audience perspective and task direction without overriding the persona.Stream long responses
Stream long responses
Use
client.responses.stream() for any response that might exceed a few sentences. This dramatically improves perceived latency in user-facing applications.Use structured outputs for pipelines
Use structured outputs for pipelines
When the response feeds into downstream code, use
text format with json_schema to guarantee a predictable structure. This eliminates parsing errors.Handle errors gracefully
Handle errors gracefully
Implement retry logic with exponential backoff for
429 and 5xx errors. Log usage.credits_used to monitor costs.Choose reasoning effort wisely
Choose reasoning effort wisely
Use
reasoning_effort: "high" for complex analysis and "low" for simple Q&A. This controls both quality and credit cost.Use flat tool format
Use flat tool format
The Responses API uses a flat tool definition —
name, description, and parameters are top-level fields in each tool object. Don’t nest them under a function key.Next Steps
Personas
Explore 50+ pre-built personas and create custom ones
Focus Groups
Run simulated audience research at scale
Migration Guide
Migrate from the Responses API
API Reference
Full API specification