Podcast Analysis Pipeline

Scenario

Transcribe podcast episodes with topic detection, paragraph segmentation, and utterance timing. Feed the structured transcripts to Mavera for content strategy: which topics resonate, what derivative content to create, and where the editorial calendar has gaps. Flow: Deepgram POST /v1/listen?model=nova-3&detect_topics=true&paragraphs=true&utterances=true&diarize=true → topic-tagged transcript → Mavera POST /mave/chat → content strategy

Code

import os
import time

import requests

DG = os.environ["DEEPGRAM_API_KEY"]
MV = os.environ["MAVERA_API_KEY"]
MV_BASE = "https://app.mavera.io/api/v1"
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}

# Episodes to transcribe: local audio file plus a display title
EPISODES = [
    {"file": "ep-47-ai-marketing.mp3", "title": "EP47: AI in Marketing"},
    {"file": "ep-48-brand-strategy.mp3", "title": "EP48: Brand Strategy 2026"},
    {"file": "ep-49-content-ops.mp3", "title": "EP49: Content Ops at Scale"},
]
# Deepgram query parameters for /v1/listen
params = {"model": "nova-3", "smart_format": "true", "punctuate": "true",
    "detect_topics": "true", "paragraphs": "true", "utterances": "true",
    "diarize": "true", "language": "en"}

all_eps = []
for ep in EPISODES:
    with open(ep["file"], "rb") as f:
        resp = requests.post("https://api.deepgram.com/v1/listen", params=params,
            headers={"Authorization": f"Token {DG}", "Content-Type": "audio/mpeg"},
            data=f, timeout=180)
    resp.raise_for_status()
    r = resp.json()
    alt = r["results"]["channels"][0]["alternatives"][0]
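    # Flatten topic labels across segments, dropping repeats while keeping first-seen order
    topics = list(dict.fromkeys(t["topic"] for s in r["results"].get("topics",{}).get("segments",[])
              for t in s.get("topics",[])))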
    paragraphs = alt.get("paragraphs",{}).get("paragraphs",[])
    moments = [{"time": f"{int(p['start']//60)}:{int(p['start']%60):02d}",
        "text": " ".join(s.get("text","") for s in p.get("sentences",[]))[:200]}
        for p in paragraphs[:10]]
    dur = r.get("metadata",{}).get("duration",0)
    all_eps.append({"title": ep["title"], "min": round(dur/60,1),
        "words": len(alt["transcript"].split()), "topics": topics[:8], "moments": moments})
    print(f"{ep['title']} — {round(dur/60,1)}min | {len(alt['transcript'].split())} words | {len(topics)} topics")
    time.sleep(2)

corpus = ""
for e in all_eps:
    corpus += f"\n### {e['title']} ({e['min']}min, {e['words']} words)\n"
    corpus += f"Topics: {', '.join(e['topics'][:5])}\n"
    for m in e["moments"][:5]:
        corpus += f"  [{m['time']}] {m['text'][:150]}\n"

time.sleep(1)
strategy = requests.post(f"{MV_BASE}/mave/chat", headers=MV_H, json={
    "message": f"Podcast content strategist. {len(all_eps)} episodes:\n\n{corpus[:10000]}\n\n"
        "Produce:\n1. **TOPIC HEATMAP** — Topics across episodes\n"
        "2. **RESONANCE SIGNALS** — Longest discussion topics\n"
        "3. **CONTENT DERIVATIVES** — 10 blogs, 5 social threads, 3 video clips\n"
        "4. **GUEST INSIGHTS** — Key quotes to repurpose\n"
        "5. **EDITORIAL CALENDAR** — Next 4 episode topics\n"
}, timeout=120)
strategy.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(strategy.json().get("content", "")[:4000])

Example Output

EP47 — 42.3min | 6,847 words | 6 topics
EP48 — 38.7min | 5,921 words | 5 topics
EP49 — 51.2min | 8,334 words | 7 topics

TOPIC HEATMAP
  "AI content generation" — 3/3 episodes (14 min total)
  "brand consistency"     — 2/3 episodes (9 min)

CONTENT DERIVATIVES
  Blog: "Why AI Won't Replace Copywriters (But Will Replace Bad Ones)"
  Clip: EP47 [6:12–8:30] — Hot take on AI content quality

EDITORIAL CALENDAR
  EP50: Content Measurement (gap: only 4 min coverage)
  EP51: AI + Brand Voice (connect EP47 + EP48 themes)

Error Handling

Long podcast episodes

Deepgram handles files up to 2 GB. Set timeout=180 for episodes over 60 minutes. Process batches sequentially with 2-second delays.
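
A minimal sketch of sequential batch processing with a fixed pause between episodes. The retry-with-backoff step is an addition not present in the code above, and process_sequentially, handler, and retry_on are hypothetical names. The handler would wrap the Deepgram request for one episode.

```python
import time

def process_sequentially(items, handler, delay=2.0, retries=2,
                         retry_on=(Exception,), sleep=time.sleep):
    """Run handler on each item in order, pausing between items.

    A failing item is retried up to `retries` times with a doubling wait.
    The retry behaviour is an assumption layered on top of the fixed delay.
    """
    results = []
    for index, item in enumerate(items):
        for attempt in range(retries + 1):
            try:
                results.append(handler(item))
                break
            except retry_on:
                if attempt == retries:
                    raise  # out of retries: let the caller see the error
                sleep(delay * (2 ** attempt))  # wait delay, then 2*delay, ...
        if index < len(items) - 1:
            sleep(delay)  # fixed pause before the next episode
    return results
```

In practice, narrowing retry_on to requests.RequestException avoids retrying on programming errors.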

Topic detection returns few results

Topic detection needs 5+ minutes of diverse content. Monologue-heavy episodes may return 1-2 topics. Supplement with Mavera analysis of the raw transcript.
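
One way to supplement sparse topic results is to build a follow-up prompt for Mavera from the raw transcript. This is a sketch under assumptions: build_topic_fallback is a hypothetical helper, the min_topics threshold of 3 is arbitrary, and the 10,000-character cap mirrors the truncation used for the corpus above.

```python
def build_topic_fallback(transcript, topics, min_topics=3, max_chars=10000):
    """Return a prompt asking for topics when too few were detected, else None."""
    distinct = list(dict.fromkeys(topics))  # drop repeats, keep first-seen order
    if len(distinct) >= min_topics:
        return None  # enough topics detected, no fallback needed
    found = ", ".join(distinct) or "none"
    return (
        f"Deepgram detected only these topics: {found}.\n"
        "List the main topics discussed in this transcript, one per line:\n\n"
        f"{transcript[:max_chars]}"
    )
```

When the helper returns a prompt, it can be sent as the "message" field of the same /mave/chat request used in the main script.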

Diarization accuracy

For interviews with clear turn-taking, accuracy is 95%+. For panels with crosstalk, process each microphone track separately.
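
If each microphone track is transcribed separately, the per-track utterances need to be merged back into one timeline. The sketch below assumes every track shares the same time origin (recorded simultaneously) and that each utterance dict carries the start, end, and transcript fields returned when utterances=true is set. merge_tracks and the track labels are hypothetical.

```python
def merge_tracks(tracks):
    """Merge per-track utterances into one timeline ordered by start time.

    tracks maps a speaker label (one per microphone track) to that
    track's list of utterance dicts.
    """
    timeline = [
        {"speaker": label, "start": u["start"], "end": u["end"], "text": u["transcript"]}
        for label, utterances in tracks.items()
        for u in utterances
    ]
    return sorted(timeline, key=lambda turn: turn["start"])

# Illustrative data only, not real transcription output
tracks = {
    "host": [
        {"start": 0.0, "end": 4.2, "transcript": "Welcome to the show."},
        {"start": 9.0, "end": 12.0, "transcript": "Great point."},
    ],
    "guest": [
        {"start": 4.5, "end": 8.8, "transcript": "Thanks for having me."},
    ],
}
for turn in merge_tracks(tracks):
    print(f"[{turn['start']:.1f}s] {turn['speaker']}: {turn['text']}")
```

Because the speaker label comes from the track rather than from diarization, crosstalk on one microphone does not affect attribution on another.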
