The Responses API is Mavera’s core endpoint for generating AI responses enhanced with persona intelligence. It produces audience-aware outputs that reflect how real demographics think, decide, and respond. The API follows the OpenAI Responses format, so you can use the OpenAI SDK with a two-line configuration change.
OpenAI SDK compatible. Any code written for client.responses.create() works with Mavera. Just change base_url and add a persona_id.

Key Features

Persona Intelligence

Inject specialized personas to get audience-aware responses from 50+ demographics

OpenAI Compatible

Works with any OpenAI SDK — Python, JavaScript, Go, Rust, and more

Streaming

Real-time token streaming via Server-Sent Events with named event types

Analysis Mode

Get structured insights with confidence scores, emotional analysis, and bias detection

Structured Outputs

Request JSON responses with custom schemas for type-safe integrations

Vision Support

Analyze images with multimodal capabilities via input arrays

Tool Calling

Define custom functions with a flat tool format, plus built-in server tools

Credits Tracking

Every response includes usage.credits_used for cost monitoring

Request Flow

Basic Usage

1. Install the OpenAI SDK

pip install openai

2. Initialize the client with Mavera's base URL

from openai import OpenAI

client = OpenAI(
    api_key="mvra_live_your_key_here",
    base_url="https://app.mavera.io/api/v1",
)

3. Make a response request with a persona

response = client.responses.create(
    model="mavera-1",
    input="How do Gen Z consumers view sustainability?",
    instructions="You are a helpful assistant.",
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

print(response.output[0].content[0].text)
print(f"Credits used: {response.usage.credits_used}")
In the Python SDK, pass Mavera-specific fields like persona_id via extra_body. In the JavaScript SDK, pass them as top-level properties with a // @ts-ignore comment — the SDK forwards unknown fields automatically.

Persona Integration

Every Responses API request should include a persona_id to activate Mavera’s audience intelligence. The persona shapes the model’s perspective, language, values, and decision-making patterns.
response = client.responses.create(
    model="mavera-1",
    input="What makes a brand authentic?",
    extra_body={"persona_id": "gen_z_consumer"},
)
# Response reflects Gen Z values: transparency, social responsibility, UGC over polished ads
Without a persona_id, the API still works but returns generic responses without audience-specific perspective.
persona_id is technically optional, but most Mavera features lose their value without it. Always pass a persona for audience-aware outputs.

Streaming

Enable real-time streaming for better UX in chat interfaces. Mavera streams tokens via Server-Sent Events (SSE) using named event types like response.output_text.delta.
with client.responses.stream(
    model="mavera-1",
    input="Write a product description for eco-friendly sneakers",
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)

SSE Event Types

| Event | Description |
|---|---|
| response.created | The response object has been created |
| response.output_text.delta | A chunk of output text; access the text via event.delta |
| response.completed | The response is fully complete |
The stream context manager in Python automatically handles connection lifecycle. In JavaScript, use stream.on() for event-based iteration and await stream.finalResponse() to get the completed response object.

Analysis Mode

Enable analysis mode for structured, quantified insights alongside the natural-language response. This is unique to Mavera — it layers behavioral analysis on top of the persona’s answer.
response = client.responses.create(
    model="mavera-1",
    input="How do millennials feel about remote work?",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "analysis_mode": True,
    },
)

print(response.output[0].content[0].text)

analysis = response.analysis
print(f"Confidence: {analysis['confidence']}/10")
print(f"Emotional valence: {analysis['emotion']['emotional_valence']}/10")
print(f"Arousal: {analysis['emotion']['arousal']}/10")
print(f"Dominance: {analysis['emotion']['dominance']}/10")
print(f"Biases detected: {[b['name'] for b in analysis['biases']]}")
print(f"Key insights: {analysis['key_insights']}")
Analysis mode returns these fields in the analysis object:
| Field | Type | Description |
|---|---|---|
| confidence | number (1–10) | How confident the persona is in this response |
| emotion.emotional_valence | number (1–10) | Positive vs negative emotional tone |
| emotion.arousal | number (1–10) | Intensity of emotional activation |
| emotion.dominance | number (1–10) | Feeling of control or influence |
| biases | array | Cognitive biases that may affect the response (e.g., anchoring, confirmation bias) |
| key_insights | array | Structured takeaways from the response |
Structured outputs (text format) and analysis_mode cannot be used together. Use text format for custom schemas, or analysis_mode for Mavera’s built-in analysis structure.
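If you build requests programmatically, a small guard can catch this conflict before the API rejects it. This is our own sketch (check_extra_body is not part of any SDK):

```python
def check_extra_body(extra_body):
    """Raise early when analysis_mode and a text format are combined,
    since the API rejects that pairing."""
    if extra_body.get("analysis_mode") and "text" in extra_body:
        raise ValueError(
            "analysis_mode and text format are mutually exclusive; drop one"
        )
    return extra_body
```

Call it at the request site, e.g. `client.responses.create(..., extra_body=check_extra_body(body))`, so the mistake surfaces as a local exception rather than a 422 from the server.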

Structured Outputs

Request JSON responses with a specific structure using the text format parameter passed via extra_body. This provides type-safe, predictable outputs for downstream code.

Simple JSON Mode

Force the model to return valid JSON:
response = client.responses.create(
    model="mavera-1",
    input="List 3 benefits of exercise as JSON with 'benefits' array",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "text": {
            "format": {"type": "json_object"}
        }
    },
)

import json
data = json.loads(response.output[0].content[0].text)
print(data["benefits"])

JSON Schema Mode

Define a strict schema for predictable, typed responses. This is ideal for pipelines where downstream code depends on a specific structure.
response = client.responses.create(
    model="mavera-1",
    input="Review this product: Great quality, fast shipping!",
    extra_body={
        "persona_id": "YOUR_PERSONA_ID",
        "text": {
            "format": {
                "type": "json_schema",
                "json_schema": {
                    "name": "product_review",
                    "strict": True,
                    "schema": {
                        "type": "object",
                        "properties": {
                            "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
                            "score": {"type": "number"},
                            "summary": {"type": "string"}
                        },
                        "required": ["sentiment", "score", "summary"]
                    }
                }
            }
        }
    },
)

data = response.parsed
print(f"Sentiment: {data['sentiment']}")
print(f"Score: {data['score']}/10")
print(f"Summary: {data['summary']}")
When using json_schema mode, the response includes both output[0].content[0].text (the JSON as a string) and parsed (the already-parsed JSON object). Use parsed to skip the parsing step.
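If you want a single accessor that works whether or not parsed is present, here is a defensive sketch (parsed_or_text is our own helper name, not an SDK method):

```python
import json

def parsed_or_text(response):
    """Prefer the server-parsed JSON object; otherwise parse the
    output text, which holds the same JSON as a string."""
    parsed = getattr(response, "parsed", None)
    if parsed is not None:
        return parsed
    return json.loads(response.output[0].content[0].text)
```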

Tool Calling (Function Calling)

Enable the model to call custom functions you define. The Responses API uses a flat tool format — tool properties like name, description, and parameters are top-level fields instead of nested under a function wrapper.

Defining Tools

Pass an array of tool definitions with JSON Schema parameters:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state"}
            },
            "required": ["location"]
        }
    }
]

response = client.responses.create(
    model="mavera-1",
    input="What's the weather in San Francisco?",
    tools=tools,
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

for item in response.output:
    if item.type == "function_call":
        print(f"Function: {item.name}, Args: {item.arguments}")

Handling Tool Calls

When the response contains function_call items, execute the functions locally and send results back using function_call_output:
import json

def get_weather(location):
    return {"temp": 72, "unit": "fahrenheit", "condition": "sunny"}

for item in response.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        result = get_weather(**args)

        follow_up = client.responses.create(
            model="mavera-1",
            input=[
                *response.output,
                {
                    "type": "function_call_output",
                    "call_id": item.call_id,
                    "output": json.dumps(result)
                }
            ],
            tools=tools,
            extra_body={"persona_id": "YOUR_PERSONA_ID"},
        )

        print(follow_up.output[0].content[0].text)

Tool Choice

Control how the model uses tools:
# Let the model decide (default)
tool_choice="auto"

# Disable tool calling for this request
tool_choice="none"

# Force the model to use at least one tool
tool_choice="required"

# Force a specific function
tool_choice={"type": "function", "name": "get_weather"}
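Combining tool_choice with dispatch logic, here is a minimal sketch of routing function_call items to local implementations. TOOL_REGISTRY and dispatch_function_calls are our names, not SDK features, and for clarity the sketch treats output items as plain dicts; with the SDK's typed objects, use attribute access (item.name, item.call_id, item.arguments) instead:

```python
import json

def get_weather(location):
    # Stub implementation for the sketch
    return {"temp": 72, "unit": "fahrenheit", "condition": "sunny"}

TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_function_calls(output_items):
    """Turn each function_call item into a function_call_output item
    by running the matching local implementation."""
    outputs = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        args = json.loads(item["arguments"])
        result = TOOL_REGISTRY[item["name"]](**args)
        outputs.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return outputs

# With tool_choice={"type": "function", "name": "get_weather"} the model
# must emit at least one get_weather call; feed the outputs back:
# follow_up = client.responses.create(
#     model="mavera-1",
#     input=[*response.output, *dispatch_function_calls(response.output)],
#     tools=tools,
#     extra_body={"persona_id": "YOUR_PERSONA_ID"},
# )
```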

Built-in Server Tools

Mavera provides server-side tools that execute automatically without client roundtrips:
| Tool | Description |
|---|---|
| tavily_search | Web search for current information |
| tavily_extract | Extract structured content from URLs |
| generateImage | Generate images from text prompts (DALL·E) |
| editImage | Edit uploaded images |
| generateVideo | Generate videos from text (Sora) |
| imageToVideo | Convert images to video (RunwayML) |
Server tool results appear in the server_tool_calls array of the response.
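When working with the raw JSON body (for example, in logging middleware), a tiny helper can list which server tools ran. The exact shape of each entry is an assumption on our part; only the server_tool_calls array name comes from the docs above:

```python
def server_tool_names(response_body):
    """Names of the built-in tools that ran server-side, in order.
    Expects the response as a plain dict (e.g. parsed raw JSON)."""
    return [call.get("name") for call in response_body.get("server_tool_calls", [])]
```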

Image / Vision Input

Include images for visual analysis using the input array with input_image content blocks:
response = client.responses.create(
    model="mavera-1",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What is in this image? How would our target audience react to it?"},
                {"type": "input_image", "image_url": "https://your-bucket.s3.amazonaws.com/campaign-visual.png"}
            ]
        }
    ],
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)

print(response.output[0].content[0].text)
Images must be publicly accessible URLs. Supported formats: PNG, JPEG, GIF, WebP. Maximum file size: 20 MB.
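The message shape above is easy to factor into a helper when you analyze many images. image_message is our convenience wrapper, not an SDK function:

```python
def image_message(text, image_url):
    """Build a user message pairing an input_text prompt with an
    input_image URL, in the Responses API content-block format."""
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": text},
            {"type": "input_image", "image_url": image_url},
        ],
    }

# response = client.responses.create(
#     model="mavera-1",
#     input=[image_message("How would our audience react?", url)],
#     extra_body={"persona_id": "YOUR_PERSONA_ID"},
# )
```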

Multi-turn Conversations

Use the input array with role-based messages for multi-turn conversations:
response = client.responses.create(
    model="mavera-1",
    input=[
        {"role": "user", "content": "What's the most important factor in brand loyalty?"},
        {"role": "assistant", "content": "For Gen Z, authenticity and social responsibility top the list."},
        {"role": "user", "content": "How does that compare to millennials?"}
    ],
    extra_body={"persona_id": "YOUR_PERSONA_ID"},
)
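To continue a conversation, append each assistant reply and the next user message to the input array before the next call. A minimal sketch (next_input is our helper name, not part of the SDK):

```python
def next_input(history, assistant_text, user_text):
    """Extend a role-based message history with the assistant's last
    reply and the user's next question."""
    return history + [
        {"role": "assistant", "content": assistant_text},
        {"role": "user", "content": user_text},
    ]

history = [{"role": "user", "content": "What's the most important factor in brand loyalty?"}]
# reply = client.responses.create(model="mavera-1", input=history,
#                                 extra_body={"persona_id": "YOUR_PERSONA_ID"})
# history = next_input(history, reply.output[0].content[0].text,
#                      "How does that compare to millennials?")
```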

Error Handling

The API uses standard HTTP status codes and returns structured error responses.

Error Response Format

{
  "error": {
    "message": "Invalid persona_id: persona_xyz does not exist",
    "type": "invalid_request_error",
    "code": "persona_not_found"
  }
}

Common Error Codes

| Status | Code | Description | Resolution |
|---|---|---|---|
| 400 | invalid_request | Malformed request body or missing fields | Check request format and required fields |
| 401 | unauthorized | Invalid or missing API key | Verify your mvra_live_ or mvra_test_ key |
| 403 | forbidden | Key lacks permission for this operation | Check workspace access and key permissions |
| 404 | persona_not_found | The specified persona_id doesn't exist | List personas to find valid IDs |
| 422 | validation_error | Request fields fail validation | Check field types and constraints |
| 429 | rate_limit_exceeded | Too many requests | Back off and retry with exponential delay |
| 500 | internal_error | Server-side failure | Retry after a brief delay |
| 503 | service_unavailable | Temporary overload | Retry with exponential backoff |

Retry Logic

import time
from openai import OpenAI, RateLimitError, APIStatusError

client = OpenAI(
    api_key="mvra_live_your_key_here",
    base_url="https://app.mavera.io/api/v1",
)

def respond_with_retry(input_text, persona_id, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.responses.create(
                model="mavera-1",
                input=input_text,
                extra_body={"persona_id": persona_id},
            )
        except RateLimitError:
            wait = 2 ** attempt
            print(f"Rate limited. Retrying in {wait}s...")
            time.sleep(wait)
        except APIStatusError as e:
            # APIStatusError carries .status_code; the broader APIError
            # base class (e.g. connection errors) does not.
            if e.status_code >= 500:
                wait = 2 ** attempt
                print(f"Server error. Retrying in {wait}s...")
                time.sleep(wait)
            else:
                raise
    raise Exception("Max retries exceeded")

Rate Limits and Credits

Rate Limits

| Plan | Requests / minute | Requests / day |
|---|---|---|
| Free | 10 | 100 |
| Pro | 60 | 5,000 |
| Business | 300 | 50,000 |
| Enterprise | Custom | Custom |
Rate limit headers are included in every response:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1706345700
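To read these headers from the Python SDK, use .with_raw_response, which exposes the HTTP headers alongside a .parse() method for the usual response object. The should_throttle helper below is our own sketch:

```python
def should_throttle(headers, threshold=5):
    """True when X-RateLimit-Remaining has dropped below the threshold.
    Defaults to not throttling when the header is absent."""
    remaining = headers.get("X-RateLimit-Remaining")
    return remaining is not None and int(remaining) < threshold

# With the OpenAI Python SDK:
# raw = client.responses.with_raw_response.create(
#     model="mavera-1", input="ping",
#     extra_body={"persona_id": "YOUR_PERSONA_ID"})
# if should_throttle(raw.headers):
#     time.sleep(1.0)          # ease off before the limit is hit
# response = raw.parse()       # the usual response object
```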

Credit Costs

| Operation | Credits |
|---|---|
| Standard response | 1–3 |
| Response with analysis mode | 3–5 |
| Response with tool calls | 2–4 per turn |
| Response with image input | 3–5 |
| Streaming | 1–3 (same as non-streaming) |
Credits used are always returned in response.usage.credits_used. Monitor this to stay within budget.
Use reasoning_effort: "low" and verbosity: "low" for cheaper, faster responses when you don’t need deep analysis.
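A minimal in-process budget tracker built on usage.credits_used (our own sketch; Mavera itself only reports the per-response figure):

```python
class CreditsTracker:
    """Accumulate usage.credits_used across calls against a fixed budget."""

    def __init__(self, budget):
        self.budget = budget
        self.used = 0

    def record(self, credits_used):
        """Add one response's cost; returns the remaining budget."""
        self.used += credits_used
        return self.budget - self.used

tracker = CreditsTracker(budget=100)
# response = client.responses.create(...)
# remaining = tracker.record(response.usage.credits_used)
```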

Advanced Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | | Required. Always "mavera-1" |
| input | string or array | | Required. A string for simple queries, or an array of {role, content} messages for multi-turn |
| persona_id | string | | ID of the persona to use |
| instructions | string | | System-level instructions appended to the persona's system prompt |
| analysis_mode | boolean | false | Enable structured analysis output |
| text | object | | Structured output format config (e.g. {format: {type: "json_schema", ...}}) |
| tools | array | | Custom function definitions (flat format) |
| tool_choice | string/object | "auto" | Control tool usage |
| reasoning_effort | string | "medium" | "low", "medium", or "high" |
| verbosity | string | "medium" | "low", "medium", or "high" |
| stream | boolean | false | Enable streaming (or use client.responses.stream()) |
| temperature | number | 1 | Randomness (0–2) |
| max_output_tokens | integer | model default | Maximum tokens in the response |
| top_p | number | 1 | Nucleus sampling threshold |
| frequency_penalty | number | 0 | Penalize repeated tokens (−2 to 2) |
| presence_penalty | number | 0 | Penalize tokens already present (−2 to 2) |
The instructions parameter replaces system messages. Instructions are appended to the persona’s built-in system prompt, so you don’t need to repeat persona context — just add task-specific guidance.

Response Format

Standard Response

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "id": "msg_abc123def456",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Based on my understanding as a Gen Z consumer..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 42,
    "output_tokens": 156,
    "total_tokens": 198,
    "credits_used": 3
  }
}

Response with Analysis

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "id": "msg_abc123def456",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Remote work is something I strongly value..."
        }
      ]
    }
  ],
  "analysis": {
    "confidence": 8,
    "emotion": {
      "emotional_valence": 7,
      "arousal": 5,
      "dominance": 6
    },
    "biases": [
      {
        "name": "recency_bias",
        "description": "Overweighing recent remote work experiences"
      }
    ],
    "key_insights": [
      "Strong preference for hybrid models over fully remote",
      "Values flexibility more than salary increases"
    ]
  },
  "usage": {
    "credits_used": 4
  }
}

Response with Tool Calls

{
  "id": "resp_abc123def456abc123def456",
  "object": "response",
  "created_at": 1706345678,
  "status": "completed",
  "model": "mavera-1",
  "output": [
    {
      "type": "function_call",
      "id": "fc_abc123",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\": \"San Francisco, CA\"}"
    }
  ],
  "usage": {
    "credits_used": 2
  }
}

Best Practices

Mavera’s value comes from persona intelligence. Without a persona_id, you get generic responses. Match the persona to your target audience for the most useful outputs.
The instructions parameter is appended to the persona’s built-in system prompt. Use it to set task context — for example: "You are a market research analyst interviewing this persona about brand loyalty." This gives the model both audience perspective and task direction without overriding the persona.
Use client.responses.stream() for any response that might exceed a few sentences. This dramatically improves perceived latency in user-facing applications.
When the response feeds into downstream code, use text format with json_schema to guarantee a predictable structure. This eliminates parsing errors.
Implement retry logic with exponential backoff for 429 and 5xx errors. Log usage.credits_used to monitor costs.
Use reasoning_effort: "high" for complex analysis and "low" for simple Q&A. This controls both quality and credit cost.
The Responses API uses a flat tool definition — name, description, and parameters are top-level fields in each tool object. Don’t nest them under a function key.

Next Steps

Personas

Explore 50+ pre-built personas and create custom ones

Focus Groups

Run simulated audience research at scale

Migration Guide

Migrate from the Responses API

API Reference

Full API specification