how to · 4 min read

transcribe a podcast for show notes, captions, and citation.

podcast transcription has three downstream uses — show notes, accessibility captions, and quotable pull-out lines — and most tools force you to fight your transcript before any of them work. here's the workflow that doesn't.

three jobs, one transcript

a podcast transcript has to do three jobs at once: show notes, accessibility captions, and quotable pull-quotes. most generic tools make at least two of them painful.

the same transcript should serve all three. the editor should let you switch between them — clean or verbatim, paragraphed or row-per-turn, full transcript or just the highlighted quotes — without re-transcribing or re-formatting from scratch.

the workflow

  1. upload the episode. mp3, m4a, wav, anything. for video podcasts the audio is extracted automatically. up to 5 GB per file (about 8 hours of single-channel audio).
  2. transcription runs. on a 60-minute episode, the first pass is ready in 1–3 minutes (cloud mode) or roughly real-time (on-device private mode for sensitive episodes — embargoed announcements, off-the-record passages, privacy-conscious guests).
  3. fix labels in bulk. "speaker 1" becomes "host" and "speaker 2" becomes the guest's name, once. the change propagates through every row in the transcript. proper nouns (company names, product names, people mentioned) are fixed once and remembered across future episodes.
  4. edit for show notes. optionally remove filler in one pass — the editor flags "uh," "um," "you know" patterns and lets you accept or reject in batch. paragraph breaks land on logical boundaries by default; adjust where editorial taste calls for it.
  5. highlight pull-quotes. mark the lines that will become tweets or graphics. each highlighted quote exports with a timestamp link back to the audio so you can verify before posting. the editor's click-word-to-replay-audio feature is the verification tool.
  6. export everything at once. show-notes .docx (or markdown for substack), .vtt for the episode player, .srt for the youtube version, json for the website's transcript page, plus a separate "highlights" file with the pull-quotes and their timestamps.
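
steps 3 to 5 can be sketched in a few lines. this is a minimal illustration, not the editor's actual data model: the `Row` shape, the function names, and the filler pattern are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Row:
    """Hypothetical transcript row: one speaker turn."""
    start: float              # seconds from the start of the episode
    speaker: str
    text: str
    highlighted: bool = False

# Filler patterns named in step 4. A real editor would let you review
# each match; this sketch removes them all.
FILLER = re.compile(r"\b(?:uh|um|you know)\b,?\s*", re.IGNORECASE)

def relabel(rows, names):
    """Apply a speaker-label mapping to every row in one pass."""
    return [replace(r, speaker=names.get(r.speaker, r.speaker)) for r in rows]

def strip_filler(text):
    """Remove filler words and collapse the leftover spacing."""
    return re.sub(r"\s{2,}", " ", FILLER.sub("", text)).strip()

def timestamp(seconds):
    """Format seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

def highlights(rows):
    """Return (timestamp, speaker, text) for each highlighted row."""
    return [(timestamp(r.start), r.speaker, r.text) for r in rows if r.highlighted]
```

calling `relabel(rows, {"speaker 1": "host"})` renames every matching row at once, and `highlights(rows)` yields the pull-quotes with timestamps you can check against the audio before posting. note that a blanket filler pass can leave stray commas behind, which is one reason to review matches rather than accept them blindly.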

the privacy case for podcasters

most podcast transcription is fine on a cloud tool. some isn't: embargoed announcements, off-the-record passages, and episodes with privacy-conscious guests.

for these episodes, run the file in private mode. the audio stays on your laptop, the transcript stays on your laptop, the export is local. no vendor in the chain.

show-notes formatting that gets indexed

search engines index podcast show notes, and they reward clean structure and penalize walls of text. our show-notes export uses paragraph breaks at logical thought boundaries (typically 30–90 seconds of audio per paragraph), bolded speaker turns where structurally meaningful, and a heading hierarchy you can edit. the .docx imports cleanly into most publishing platforms (wordpress, ghost, substack, podbean) without paragraph-style corruption.
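
to make the paragraphing idea concrete, here is a rough heuristic stand-in. it is not the export's actual logic: it assumes segments arrive as `(start, end, speaker, text)` tuples, and it approximates a "thought boundary" with a speaker change or a pause, which is only a proxy. the 30 and 90 second thresholds come from the range above; the one-second pause is an assumption.

```python
def paragraphs(segments, min_len=30.0, max_len=90.0, pause=1.0):
    """Group (start, end, speaker, text) segments into paragraphs.

    A break is placed on a speaker change, on a pause of at least `pause`
    seconds once the paragraph already spans `min_len` seconds, or when
    adding the segment would push the paragraph past `max_len` seconds.
    """
    groups = []
    current = []
    for seg in segments:
        start, end, speaker, _ = seg
        if current:
            first_start = current[0][0]
            prev_end, prev_speaker = current[-1][1], current[-1][2]
            speaker_changed = speaker != prev_speaker
            long_pause = (start - prev_end >= pause
                          and prev_end - first_start >= min_len)
            too_long = end - first_start > max_len
            if speaker_changed or long_pause or too_long:
                groups.append(current)
                current = []
        current.append(seg)
    if current:
        groups.append(current)
    # One (start time, speaker, joined text) tuple per paragraph.
    return [(g[0][0], g[0][2], " ".join(s[3] for s in g)) for g in groups]
```

each returned paragraph keeps its start time, so timestamps survive the regrouping.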

the timestamps stay embedded. listeners reading along can click any timestamp to jump to that point in the embedded player.
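
a clickable timestamp is just a link that carries an offset. the sketch below builds a markdown link using the `#t=` temporal fragment from the W3C media fragments syntax; whether a given embedded player honors that fragment is an assumption, and the episode url is a placeholder.

```python
def timestamp_link(seconds, episode_url):
    """Return a markdown link whose label is the time and whose
    target jumps to that offset via a #t=<seconds> fragment."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        label = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        label = f"{minutes}:{secs:02d}"
    return f"[{label}]({episode_url}#t={total})"
```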

captions: complete vs clean

accessibility captions should ideally include every spoken word; that is the usual reading of the WCAG captioning criteria that AA conformance covers. some podcasts publish "clean" captions (filler removed) for editorial reasons; others publish complete. our caption export supports both modes: .srt-complete (verbatim) and .srt-clean (filler removed, paragraph-level). pick one or export both.
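
the difference between the two modes is easy to see in code. this sketch writes standard .srt cues from `(start, end, text)` tuples and, in clean mode, strips filler and drops cues that were nothing but filler. the function names and filler pattern are assumptions for illustration, and the paragraph-level regrouping that the clean export also does is left out.

```python
import re

# Illustrative filler pattern; not the product's actual list.
FILLER = re.compile(r"\b(?:uh|um|you know)\b,?\s*", re.IGNORECASE)

def srt_time(seconds):
    """Format seconds as the SRT timecode HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues, clean=False):
    """Render (start, end, text) cues as SRT text.

    clean=False keeps every word; clean=True removes filler and
    skips cues that end up empty, renumbering the rest.
    """
    blocks = []
    for start, end, text in cues:
        if clean:
            text = re.sub(r"\s{2,}", " ", FILLER.sub("", text)).strip()
            if not text:
                continue  # the cue was nothing but filler
        index = len(blocks) + 1
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

calling `to_srt(cues)` and `to_srt(cues, clean=True)` on the same cue list gives both files from one transcript, with timings untouched in either mode.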

pricing for podcasters

$0.25 per minute. a 30-minute episode is $7.50. a 60-minute episode is $15. private mode and cloud mode are the same price. no subscription, no minimum. for podcasts with steady weekly volume, batch pricing arrives after launch. write to hello@audiohighlight.com and tell us your shape.
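
if you want to budget a season, the arithmetic is one multiplication. the sketch below works in cents to avoid floating-point drift; how partial minutes are billed is not stated here, so rounding to the nearest cent is an assumption.

```python
RATE_CENTS_PER_MINUTE = 25  # $0.25 per minute, from the pricing above

def episode_cost(minutes):
    """Return the cost of an episode as a dollar string.

    Assumes fractional minutes are billed pro rata and rounded
    to the nearest cent.
    """
    cents = round(minutes * RATE_CENTS_PER_MINUTE)
    return f"${cents // 100}.{cents % 100:02d}"
```

for example, `episode_cost(45)` gives `$11.25`.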

lifetime deal while we're in beta.

join the waitlist to get a lifetime deal — your first month free, plus 50% off forever. private invite when we ship; no drip campaign.