The Responses API is Mavera’s core endpoint for generating AI responses enhanced with persona intelligence. It produces audience-aware outputs that reflect how real demographics think, decide, and respond. The API follows the OpenAI Responses format, so you can use the OpenAI SDK with a two-line configuration change.
OpenAI SDK compatible. Any code written for client.responses.create() works with Mavera. Just change base_url and add a persona_id.

Key Features

Persona Intelligence

Inject specialized personas to get audience-aware responses from 50+ demographics

OpenAI Compatible

Works with any OpenAI SDK — Python, JavaScript, Go, Rust, and more

Streaming

Real-time token streaming via Server-Sent Events with named event types

Analysis Mode

Get structured insights with confidence scores, emotional analysis, and bias detection

Structured Outputs

Request JSON responses with custom schemas for type-safe integrations

Vision Support

Analyze images with multimodal capabilities via input arrays

Tool Calling

Define custom functions with a flat tool format, plus built-in server tools

Credits Tracking

Every response includes usage.credits_used for cost monitoring

Request Flow

Basic Usage

1. Install the OpenAI SDK

pip install openai

2. Initialize the client with Mavera's base URL

from openai import OpenAI

client = OpenAI(
    api_key="mvra_live_your_key_here",
    base_url="https://app.mavera.io/api/v1",
)

3. Make a response request with a persona

response = client.responses.create(
    model="mavera-1",
    input="How do Gen Z consumers view sustainability?",
    instructions="You are a helpful assistant.",
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

print(response.output[0].content[0].text)
print(f"Credits used: {response.usage.credits_used}")
In the Python SDK, pass Mavera-specific fields like persona_id via extra_body. In the JavaScript SDK, pass them as top-level properties with a // @ts-ignore comment — the SDK forwards unknown fields automatically.

Persona Integration

Every Responses API request should include a persona_id to activate Mavera’s audience intelligence. The persona shapes the model’s perspective, language, values, and decision-making patterns.
response = client.responses.create(
    model="mavera-1",
    input="What makes a brand authentic?",
    extra_body={"persona_id": "gen_z_consumer"},
)
# Response reflects Gen Z values: transparency, social responsibility, UGC over polished ads
Without a persona_id, the API still works but returns generic responses without audience-specific perspective.
persona_id is technically optional, but most Mavera features lose their value without it. Always pass a persona for audience-aware outputs.

Streaming

Enable real-time streaming for better UX in chat interfaces. Mavera streams tokens via Server-Sent Events (SSE) using named event types like response.output_text.delta.
with client.responses.stream(
    model="mavera-1",
    input="Write a product description for eco-friendly sneakers",
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)

SSE Event Types

| Event | Description |
|---|---|
| response.created | The response object has been created |
| response.output_text.delta | A chunk of output text; access the text via event.delta |
| response.completed | The response is fully complete |
The stream context manager in Python automatically handles connection lifecycle. In JavaScript, use stream.on() for event-based iteration and await stream.finalResponse() to get the completed response object.

Analysis Mode

Enable analysis mode for structured, quantified insights alongside the natural-language response. This is unique to Mavera — it layers behavioral analysis on top of the persona’s answer.
response = client.responses.create(
    model="mavera-1",
    input="How do millennials feel about remote work?",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "analysis_mode": True,
    },
)

print(response.output[0].content[0].text)

analysis = response.analysis
print(f"Confidence: {analysis['confidence']}/10")
print(f"Emotional valence: {analysis['emotion']['emotional_valence']}/10")
print(f"Arousal: {analysis['emotion']['arousal']}/10")
print(f"Dominance: {analysis['emotion']['dominance']}/10")
print(f"Biases detected: {[b['name'] for b in analysis['biases']]}")
print(f"Key insights: {analysis['key_insights']}")
Analysis mode returns these fields in the analysis object:
| Field | Type | Description |
|---|---|---|
| confidence | number (1–10) | How confident the persona is in this response |
| emotion.emotional_valence | number (1–10) | Positive vs negative emotional tone |
| emotion.arousal | number (1–10) | Intensity of emotional activation |
| emotion.dominance | number (1–10) | Feeling of control or influence |
| biases | array | Cognitive biases that may affect the response (e.g., anchoring, confirmation bias) |
| key_insights | array | Structured takeaways from the response |
Structured outputs (text format) and analysis_mode cannot be used together. Use text format for custom schemas, or analysis_mode for Mavera’s built-in analysis structure.
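If you build requests programmatically, a small guard can catch this conflict before the API rejects it. This is our own sketch (check_extra_body is not part of any SDK):

```python
def check_extra_body(extra_body):
    """Raise early when analysis_mode and a text format are combined,
    since the API rejects that pairing."""
    if extra_body.get("analysis_mode") and "text" in extra_body:
        raise ValueError(
            "analysis_mode and text format are mutually exclusive; drop one"
        )
    return extra_body
```

Call it at the request site, e.g. `client.responses.create(..., extra_body=check_extra_body(body))`, so the mistake surfaces as a local exception rather than a 422 from the server.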

Structured Outputs

Request JSON responses with a specific structure using the text format parameter passed via extra_body. This provides type-safe, predictable outputs for downstream code.

Simple JSON Mode

Force the model to return valid JSON:
response = client.responses.create(
    model="mavera-1",
    input="List 3 benefits of exercise as JSON with 'benefits' array",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "text": {
            "format": {"type": "json_object"}
        }
    },
)

import json
data = json.loads(response.output[0].content[0].text)
print(data["benefits"])

JSON Schema Mode

Define a strict schema for predictable, typed responses. This is ideal for pipelines where downstream code depends on a specific structure.
response = client.responses.create(
    model="mavera-1",
    input="Review this product: Great quality, fast shipping!",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "text": {
            "format": {
                "type": "json_schema",
                "json_schema": {
                    "name": "product_review",
                    "strict": True,
                    "schema": {
                        "type": "object",
                        "properties": {
                            "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
                            "score": {"type": "number"},
                            "summary": {"type": "string"}
                        },
                        "required": ["sentiment", "score", "summary"]
                    }
                }
            }
        }
    },
)

data = response.parsed
print(f"Sentiment: {data['sentiment']}")
print(f"Score: {data['score']}/10")
print(f"Summary: {data['summary']}")
When using json_schema mode, the response includes both output[0].content[0].text (the JSON as a string) and parsed (the already-parsed JSON object). Use parsed to skip the parsing step.
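If you want a single accessor that works whether or not parsed is present, here is a defensive sketch (parsed_or_text is our own helper name, not an SDK method):

```python
import json

def parsed_or_text(response):
    """Prefer the server-parsed JSON object; otherwise parse the
    output text, which holds the same JSON as a string."""
    parsed = getattr(response, "parsed", None)
    if parsed is not None:
        return parsed
    return json.loads(response.output[0].content[0].text)
```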

Tool Calling (Function Calling)

Enable the model to call custom functions you define. The Responses API uses a flat tool format — tool properties like name, description, and parameters are top-level fields instead of nested under a function wrapper.

Defining Tools

Pass an array of tool definitions with JSON Schema parameters:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state"}
            },
            "required": ["location"]
        }
    }
]

response = client.responses.create(
    model="mavera-1",
    input="What's the weather in San Francisco?",
    tools=tools,
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

for item in response.output:
    if item.type == "function_call":
        print(f"Function: {item.name}, Args: {item.arguments}")

Handling Tool Calls

When the response contains function_call items, execute the functions locally and send results back using function_call_output:
import json

def get_weather(location):
    return {"temp": 72, "unit": "fahrenheit", "condition": "sunny"}

for item in response.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        result = get_weather(**args)

        follow_up = client.responses.create(
            model="mavera-1",
            input=[
                *response.output,
                {
                    "type": "function_call_output",
                    "call_id": item.call_id,
                    "output": json.dumps(result)
                }
            ],
            tools=tools,
            extra_body={"persona_id": "YOUR_PERSONA_ID"},
        )

        print(follow_up.output[0].content[0].text)

Tool Choice

Control how the model uses tools:
# Let the model decide (default)
tool_choice="auto"

# Disable tool calling for this request
tool_choice="none"

# Force the model to use at least one tool
tool_choice="required"

# Force a specific function
tool_choice={"type": "function", "name": "get_weather"}
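Combining tool_choice with dispatch logic, here is a minimal sketch of routing function_call items to local implementations. TOOL_REGISTRY and dispatch_function_calls are our names, not SDK features, and for clarity the sketch treats output items as plain dicts; with the SDK's typed objects, use attribute access (item.name, item.call_id, item.arguments) instead:

```python
import json

def get_weather(location):
    # Stub implementation for the sketch
    return {"temp": 72, "unit": "fahrenheit", "condition": "sunny"}

TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_function_calls(output_items):
    """Turn each function_call item into a function_call_output item
    by running the matching local implementation."""
    outputs = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        args = json.loads(item["arguments"])
        result = TOOL_REGISTRY[item["name"]](**args)
        outputs.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return outputs

# With tool_choice={"type": "function", "name": "get_weather"} the model
# must emit at least one get_weather call; feed the outputs back:
# follow_up = client.responses.create(
#     model="mavera-1",
#     input=[*response.output, *dispatch_function_calls(response.output)],
#     tools=tools,
#     extra_body={"persona_id": "YOUR_PERSONA_ID"},
# )
```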

Built-in Server Tools

Mavera provides server-side tools that execute automatically without client roundtrips:
| Tool | Description |
|---|---|
| tavily_search | Web search for current information |
| tavily_extract | Extract structured content from URLs |
| generateImage | Generate images from text prompts (DALL·E) |
| editImage | Edit uploaded images |
| generateVideo | Generate videos from text (Sora) |
| imageToVideo | Convert images to video (RunwayML) |
Server tool results appear in the server_tool_calls array of the response.
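When working with the raw JSON body (for example, in logging middleware), a tiny helper can list which server tools ran. The exact shape of each entry is an assumption on our part; only the server_tool_calls array name comes from the docs above:

```python
def server_tool_names(response_body):
    """Names of the built-in tools that ran server-side, in order.
    Expects the response as a plain dict (e.g. parsed raw JSON)."""
    return [call.get("name") for call in response_body.get("server_tool_calls", [])]
```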

Image / Vision Input

Include images for visual analysis using the input array with input_image content blocks:
response = client.responses.create(
    model="mavera-1",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What is in this image? How would our target audience react to it?"},
                {"type": "input_image", "image_url": "https://your-bucket.s3.amazonaws.com/campaign-visual.png"}
            ]
        }
    ],
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

print(response.output[0].content[0].text)
Images must be publicly accessible URLs. Supported formats: PNG, JPEG, GIF, WebP. Maximum file size: 20 MB.
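The message shape above is easy to factor into a helper when you analyze many images. image_message is our convenience wrapper, not an SDK function:

```python
def image_message(text, image_url):
    """Build a user message pairing an input_text prompt with an
    input_image URL, in the Responses API content-block format."""
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": text},
            {"type": "input_image", "image_url": image_url},
        ],
    }

# response = client.responses.create(
#     model="mavera-1",
#     input=[image_message("How would our audience react?", url)],
#     extra_body={"persona_id": "YOUR_PERSONA_ID"},
# )
```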

Multi-turn Conversations

Use the input array with role-based messages for multi-turn conversations:
response = client.responses.create(
    model="mavera-1",
    input=[
        {"role": "user", "content": "What's the most important factor in brand loyalty?"},
        {"role": "assistant", "content": "For Gen Z, authenticity and social responsibility top the list."},
        {"role": "user", "content": "How does that compare to millennials?"}
    ],
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)
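To continue a conversation, append each assistant reply and the next user message to the input array before the next call. A minimal sketch (next_input is our helper name, not part of the SDK):

```python
def next_input(history, assistant_text, user_text):
    """Extend a role-based message history with the assistant's last
    reply and the user's next question."""
    return history + [
        {"role": "assistant", "content": assistant_text},
        {"role": "user", "content": user_text},
    ]

history = [{"role": "user", "content": "What's the most important factor in brand loyalty?"}]
# reply = client.responses.create(model="mavera-1", input=history,
#                                 extra_body={"persona_id": "YOUR_PERSONA_ID"})
# history = next_input(history, reply.output[0].content[0].text,
#                      "How does that compare to millennials?")
```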

Error Handling

The API uses standard HTTP status codes and returns structured error responses.

Error Response Format

{
  "error": {
    "message": "Invalid persona_id: persona_xyz does not exist",
    "type": "invalid_request_error",
    "code": "persona_not_found"
  }
}

Common Error Codes

| Status | Code | Description | Resolution |
|---|---|---|---|
| 400 | invalid_request | Malformed request body or missing fields | Check request format and required fields |
| 401 | unauthorized | Invalid or missing API key | Verify your mvra_live_ or mvra_test_ key |
| 403 | forbidden | Key lacks permission for this operation | Check workspace access and key permissions |
| 404 | persona_not_found | The specified persona_id doesn't exist | List personas to find valid IDs |
| 422 | validation_error | Request fields fail validation | Check field types and constraints |
| 429 | rate_limit_exceeded | Too many requests | Back off and retry with exponential delay |
| 500 | internal_error | Server-side failure | Retry after a brief delay |
| 503 | service_unavailable | Temporary overload | Retry with exponential backoff |

Retry Logic

import time
from openai import OpenAI, RateLimitError, APIStatusError

client = OpenAI(
    api_key="mvra_live_your_key_here",
    base_url="https://app.mavera.io/api/v1",
)

def respond_with_retry(input_text, persona_id, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.responses.create(
                model="mavera-1",
                input=input_text,
                extra_body={"persona_id": persona_id},
            )
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited. Retrying in {wait}s...")
            time.sleep(wait)
        except APIStatusError as e:
            # APIStatusError carries .status_code; the broader APIError
            # base class (e.g. connection errors) does not.
            if e.status_code >= 500:
                wait = 2 ** attempt
                print(f"Server error. Retrying in {wait}s...")
                time.sleep(wait)
            else:
                raise
    raise Exception("Max retries exceeded")

Rate Limits and Credits

Rate Limits

| Plan | Requests / minute | Requests / day |
|---|---|---|
| Free | 10 | 100 |
| Pro | 60 | 5,000 |
| Business | 300 | 50,000 |
| Enterprise | Custom | Custom |
Rate limit headers are included in every response:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1706345700
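To read these headers from the Python SDK, use .with_raw_response, which exposes the HTTP headers alongside a .parse() method for the usual response object. The should_throttle helper below is our own sketch:

```python
def should_throttle(headers, threshold=5):
    """True when X-RateLimit-Remaining has dropped below the threshold.
    Defaults to not throttling when the header is absent."""
    remaining = headers.get("X-RateLimit-Remaining")
    return remaining is not None and int(remaining) < threshold

# With the OpenAI Python SDK:
# raw = client.responses.with_raw_response.create(
#     model="mavera-1", input="ping",
#     extra_body={"persona_id": "YOUR_PERSONA_ID"})
# if should_throttle(raw.headers):
#     time.sleep(1.0)          # ease off before the limit is hit
# response = raw.parse()       # the usual response object
```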

Credit Costs

| Operation | Credits |
|---|---|
| Standard response | 1–3 |
| Response with analysis mode | 3–5 |
| Response with tool calls | 2–4 per turn |
| Response with image input | 3–5 |
| Streaming | 1–3 (same as non-streaming) |
Credits used are always returned in response.usage.credits_used. Monitor this to stay within budget.
Use reasoning_effort: "low" and verbosity: "low" for cheaper, faster responses when you don’t need deep analysis.
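A minimal in-process budget tracker built on usage.credits_used (our own sketch; Mavera itself only reports the per-response figure):

```python
class CreditsTracker:
    """Accumulate usage.credits_used across calls against a fixed budget."""

    def __init__(self, budget):
        self.budget = budget
        self.used = 0

    def record(self, credits_used):
        """Add one response's cost; returns the remaining budget."""
        self.used += credits_used
        return self.budget - self.used

tracker = CreditsTracker(budget=100)
# response = client.responses.create(...)
# remaining = tracker.record(response.usage.credits_used)
```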

Advanced Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | | Required. Always "mavera-1" |
| input | string or array | | Required. A string for simple queries, or an array of {role, content} messages for multi-turn |
| persona_id | string | | ID of the persona to use |
| instructions | string | | System-level instructions appended to the persona's system prompt |
| analysis_mode | boolean | false | Enable structured analysis output |
| text | object | | Structured output format config (e.g. {format: {type: "json_schema", ...}}) |
| tools | array | | Custom function definitions (flat format) |
| tool_choice | string/object | "auto" | Control tool usage |
| reasoning_effort | string | "medium" | "low", "medium", or "high" |
| verbosity | string | "medium" | "low", "medium", or "high" |
| stream | boolean | false | Enable streaming (or use client.responses.stream()) |
| temperature | number | 1 | Randomness (0–2) |
| max_output_tokens | integer | model default | Maximum tokens in the response |
| top_p | number | 1 | Nucleus sampling threshold |
| frequency_penalty | number | 0 | Penalize repeated tokens (−2 to 2) |
| presence_penalty | number | 0 | Penalize tokens already present (−2 to 2) |
The instructions parameter replaces system messages. Instructions are appended to the persona’s built-in system prompt, so you don’t need to repeat persona context — just add task-specific guidance.

Response Format

Standard Response

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "id": "msg_abc123def456",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Based on my understanding as a Gen Z consumer..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 42,
    "output_tokens": 156,
    "total_tokens": 198,
    "credits_used": 3
  }
}

Response with Analysis

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "id": "msg_abc123def456",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Remote work is something I strongly value..."
        }
      ]
    }
  ],
  "analysis": {
    "confidence": 8,
    "emotion": {
      "emotional_valence": 7,
      "arousal": 5,
      "dominance": 6
    },
    "biases": [
      {
        "name": "recency_bias",
        "description": "Overweighing recent remote work experiences"
      }
    ],
    "key_insights": [
      "Strong preference for hybrid models over fully remote",
      "Values flexibility more than salary increases"
    ]
  },
  "usage": {
    "credits_used": 4
  }
}

Response with Tool Calls

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "type": "function_call",
      "id": "fc_abc123",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\": \"San Francisco, CA\"}"
    }
  ],
  "usage": {
    "credits_used": 2
  }
}

Best Practices

Mavera’s value comes from persona intelligence. Without a persona_id, you get generic responses. Match the persona to your target audience for the most useful outputs.
The instructions parameter is appended to the persona’s built-in system prompt. Use it to set task context — for example: "You are a market research analyst interviewing this persona about brand loyalty." This gives the model both audience perspective and task direction without overriding the persona.
Use client.responses.stream() for any response that might exceed a few sentences. This dramatically improves perceived latency in user-facing applications.
When the response feeds into downstream code, use text format with json_schema to guarantee a predictable structure. This eliminates parsing errors.
Implement retry logic with exponential backoff for 429 and 5xx errors. Log usage.credits_used to monitor costs.
Use reasoning_effort: "high" for complex analysis and "low" for simple Q&A. This controls both quality and credit cost.
The Responses API uses a flat tool definition — name, description, and parameters are top-level fields in each tool object. Don’t nest them under a function key.

Next Steps

Personas

Explore 50+ pre-built personas and create custom ones

Focus Groups

Run simulated audience research at scale

Migration Guide

Migrate from the Responses API

API Reference

Full API specification