how to transcribe dictation — talk a draft instead of typing

file-based, not real-time

there are two shapes of dictation tooling. real-time tools like wisprflow, aqua, and macwhisper-style dictation listeners stream your voice into whatever text field you're focused on — slack, gmail, the address bar — replacing the keyboard. they're great for quick messages and short captures. they fall over on long-form drafting:

you can't pause to think without the model deciding you're done.
you can't go back and re-say a paragraph without breaking the focused text field.
you don't get a clean audio artifact to keep — the words stream out, the audio is gone.
punctuation and paragraphing drift at length. a 40-minute real-time dictation produces a wall of text that the model never paragraphs correctly.

file-based dictation flips the tradeoff. you record a self-contained audio file — twenty minutes of you talking through a chapter, a blog post outline, a podcast script draft — and transcribe it as one job. the audio file is preserved. the paragraphing is done in one pass on the full context, not turn-by-turn. you can re-dictate a section by recording a second file and pasting it in. for anything longer than a single message, file-based wins.

the workflow

record yourself. use a phone voice memo, mac voice memos, a field recorder, or our voice recorder, which records straight in the browser with no install and no upload until you decide. save as m4a or wav. there's no length limit that matters in practice; we've seen 90-minute single-take dictations.
upload the file. up to 5 GB per file. the voice recorder's "transcribe this" button hands the file straight to us; from any other recorder, drag the file into the upload box.
transcription runs. on a 30-minute dictation, the first pass is ready in 1–2 minutes in cloud mode, or in roughly real time (about 30 minutes) in on-device private mode, which is meant for unpublished drafts you don't want on a third-party server.
fix proper nouns once. character names, place names, technical terms specific to your work. fixed once, remembered across future dictations in the same project.
polish the paragraphs. this is where the dictation workflow earns its place. the editor lets you accept or reject filler ("um," "uh," "you know," "so anyway") in batches. the paragraph breaks already land on logical thought boundaries; tighten where taste calls for it. a 30-minute dictation typically yields 3,000–4,000 polished words after about 10 minutes of editing.
export the draft. .docx for word/google docs, markdown for substack/ghost, plain text for whatever editor you draft in.
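as a rough planning aid, the figures quoted in the steps above can be scaled to other recording lengths. this is a sketch only: it assumes everything scales linearly from the 30-minute example, which the steps don't promise, and the function name `estimate_dictation` is made up for illustration.

```python
# Reference figures quoted in the workflow for a 30-minute dictation.
REFERENCE_MINUTES = 30
CLOUD_FIRST_PASS_MINUTES = (1, 2)   # cloud mode: first pass ready in 1-2 minutes
POLISHED_WORDS = (3000, 4000)       # typical polished word count
EDITING_MINUTES = 10                # typical editing time


def estimate_dictation(minutes):
    """Scale the 30-minute reference figures to another recording length.

    Assumption: linear scaling. Real turnaround and word counts will vary
    with speaking pace, audio quality, and hardware.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    scale = minutes / REFERENCE_MINUTES
    return {
        "cloud_first_pass_minutes": tuple(m * scale for m in CLOUD_FIRST_PASS_MINUTES),
        # On-device mode is described as roughly real time.
        "on_device_minutes": minutes,
        "polished_words": tuple(round(w * scale) for w in POLISHED_WORDS),
        "editing_minutes": EDITING_MINUTES * scale,
    }


if __name__ == "__main__":
    for length in (20, 30, 90):
        print(length, estimate_dictation(length))
```

treat the output as a ballpark for planning a recording session, not as a guarantee.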

audio paragraph polishing is the win

the reason file-based dictation works for long-form is that the model can paragraph the whole transcript at once. it sees the full arc of your thinking, not the sentence you're currently saying. paragraph breaks land where your thought actually shifted, not where you happened to inhale.

the editor surfaces the filler-removal pass as a single accept-or-reject decision per pattern. "remove all 'um' and 'uh' across the file?" — yes. "remove 'you know' (47 instances)?" — yes for most, no for the three where it's part of a quoted passage. five seconds of decisions, the transcript tightens by 8–12%.
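the per-pattern pass can be sketched in a few lines of python. this is not the editor's implementation; the filler list comes from the examples above, and the function names and the regex are assumptions for illustration.

```python
import re

# Filler phrases taken from the examples in the text; a real list would be longer.
FILLERS = ["um", "uh", "you know", "so anyway"]


def _filler_pattern(filler):
    # Whole-word, case-insensitive match, plus an optional trailing comma
    # and any whitespace that follows the filler.
    words = r"\s+".join(re.escape(word) for word in filler.split())
    return re.compile(rf"\b{words}\b,?\s*", re.IGNORECASE)


def count_fillers(text, fillers=FILLERS):
    """Return how many times each filler appears, to drive the yes/no prompts."""
    return {filler: len(_filler_pattern(filler).findall(text)) for filler in fillers}


def remove_fillers(text, accepted):
    """Remove every instance of the fillers the writer accepted."""
    for filler in accepted:
        text = _filler_pattern(filler).sub("", text)
    # Collapse any doubled spaces left behind. Capitalization is not repaired.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    draft = "Um, so the chapter opens, uh, in the harbor. You know, it rains."
    print(count_fillers(draft))
    print(remove_fillers(draft, accepted=["um", "uh"]))
```

this sketch decides per pattern only. keeping the few instances that sit inside a quoted passage would need a per-instance decision, which it leaves out.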

the result reads like a rough draft, not a transcript. for podcasters this is the script for the next episode. for bloggers it's the post in 80% form. for authors it's a chapter outline with the connective tissue already there.

when not to use file-based dictation

we are not the right tool for slack messages, email replies, or quick captures into the address bar. for that workflow, use a real-time tool. wisprflow, aqua, and the built-in mac dictation are all fine for it. file-based dictation starts to make sense around the five-minute mark — long enough that paragraphing matters, the audio is worth keeping, and the model wants the full context to do its job.

private mode for unpublished drafts

first drafts should stay private. you don't want a chapter of your novel sitting on a transcription vendor's server, even briefly. run dictation files in private mode: audio stays on your laptop, transcript stays on your laptop, the export is local, and there's no cloud round-trip. on-device mode is roughly real-time on a recent macbook (M2 or later). same price as cloud mode. see /private-transcription/podcasting for the long version.

pricing for talking-out-a-draft work

$0.25 per minute of audio. a 20-minute dictation is $5. a 60-minute dictation is $15. you pay for each file by its length, not per word. for a weekly post or a chapter draft this lands at maybe $20–60 a month depending on how much you talk. no subscription, no minimum.

waitlist signups get the lifetime deal: first month free, 50% off forever after. for an author drafting a book by dictation, that adds up.
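the arithmetic above is simple enough to check in code. a minimal sketch, assuming the rate is prorated linearly by audio length and that the waitlist discount is a flat 50% off each file after the free month; the function names are hypothetical, and the page does not specify rounding rules.

```python
RATE_PER_MINUTE = 0.25  # USD per minute of audio, from the pricing text


def dictation_cost(minutes, discount=0.0):
    """Cost of one file in USD. `discount` is a fraction, e.g. 0.5 for 50% off."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(minutes * RATE_PER_MINUTE * (1.0 - discount), 2)


def monthly_cost(minutes_per_file, files_per_month, discount=0.0):
    """Cost of a month of dictation at a steady cadence."""
    return round(dictation_cost(minutes_per_file, discount) * files_per_month, 2)


if __name__ == "__main__":
    print(dictation_cost(20))                 # one 20-minute dictation
    print(dictation_cost(60))                 # one 60-minute dictation
    print(monthly_cost(20, 4))                # weekly 20-minute posts
    print(monthly_cost(60, 4, discount=0.5))  # weekly 60-minute chapters at 50% off
```

four weekly files of 20 to 60 minutes each come out at $20 to $60 a month at the full rate, which matches the range quoted above.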

transcribe dictation when you'd rather talk than type.

lifetime deal while we're in beta.