about audiohighlight — why we built it

what we're building

audiohighlight is an AI transcription tool with a browser workspace. you drop a file in, you get back a transcript that doesn't need cleanup, and when you do need to verify a quote you click the word and hear the second of audio it came from.

speaker labels you fix in bulk, not one at a time. custom vocabulary you set per account so "Bayesian" stops becoming "Beijing." exports to .docx, .srt, .vtt, plain text, JSON. one price per file. no subscription. no setup.
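
for the curious, here's roughly what an .srt export involves once every word carries a timestamp. this is a minimal sketch, not the production exporter: the `Word` record, the `to_srt` helper, and the seven-words-per-cue default are all hypothetical, and it assumes word timings arrive in seconds. the only fixed part is the SubRip cue layout itself (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, a blank line).

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its position in the audio (hypothetical shape)."""
    text: str
    start: float  # seconds from the start of the file
    end: float    # seconds from the start of the file


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp SubRip expects."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into fixed-size cues and render them as SubRip text."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        timing = f"{srt_timestamp(chunk[0].start)} --> {srt_timestamp(chunk[-1].end)}"
        text = " ".join(w.text for w in chunk)
        cues.append(f"{len(cues) + 1}\n{timing}\n{text}\n")
    # Each cue ends with a newline, so joining on "\n" leaves a blank line between cues.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        Word("the", 0.0, 0.2),
        Word("Bayesian", 0.2, 0.9),
        Word("prior", 0.9, 1.4),
    ]
    print(to_srt(sample, max_words=2))
```

a real exporter would break cues on pauses and punctuation rather than a fixed word count; the fixed count just keeps the sketch short.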

why this exists

the category has no winner on the actual job. every existing tool either hands you a .docx that takes hours to clean up, locks the good editor behind a subscription that doesn't fit how individuals buy transcription, or insists on joining your meetings as a bot when the audio you actually need to transcribe is the file you already recorded.

the job people actually hire transcription for is to get from "I have audio" to "I have a verified, citable, editable record" in less time than transcribing it themselves. we measure ourselves against that job, not against word error rate. our public commitment is a benchmark on cleanup-time-per-audio-hour, on a published audio corpus, reproducible.

what we explicitly aren't

not a meeting bot. no calendar OAuth, no auto-join. file uploads only. journalists who refuse meeting bots for source-protection reasons can still use us; podcasters who recorded the meeting in their DAW can upload that file directly.
not enterprise software. no SSO, no SCIM, no procurement cycles at launch. the buyer is the individual who needs a transcript today.
not a transcription marketplace. we don't recruit human transcribers and we don't list jobs for them. that's TranscribeMe's and Scribie's market.

who we are

one person, mostly. a guy who loves words and software and occasionally confuses the two. former local journalist, current language obsessive (currently on Norwegian; previous tours through Mandarin and Russian, with a long detour through Spanish that overlapped with a few years living south of a border that shall remain unnamed). along the way I kept ending up in transcripts — interview transcripts for stories, court-record transcripts for fact-checks, language-learning transcripts for shadowing exercises, the occasional deposition because someone needed a second pair of eyes.

the through-line was always the same. recordings of people talking. transcripts that took longer to clean up than to record. and tools that pretended the cleanup didn't exist. on a long story about local housing I sat down to audit my own workflow — the number of minutes I spent fixing speaker labels, fixing surnames the model heard wrong, scrubbing the timeline to verify the quote that mattered. counted them. realized the cleanup tax was bigger than the writing. spent two years building a tool I'd actually want to use. and now I'm shipping it.

what I'm not pretending

this isn't a 30-person company. it's a small operation running on personal savings, a borrowed couch in a cofounder's spare bedroom, and the patience of three early advisors who keep telling me to ship sooner. the marketing copy says "we," but not because a marketing team wrote it. it says "we" because that's the convention, and dropping the convention to write about myself feels worse than borrowing it. when you write to hello@audiohighlight.com, I'm the one who answers.

I don't have venture funding. I don't have a sales team. I don't have a content marketer. I have a working ffmpeg.wasm chain in production, a benchmark methodology I trust, an opinionated stack of free tools that happen to also be the demo, and a positioning thesis that survived a 50%-budget opus pondering session and a series of arguments with smarter people. that's the entire competitive surface. the goal is to ship a transcription product that I would have gladly paid $30 a month for during the housing-court story, and to keep shipping it at a rate that lets me be the person on the other end of every support email.

what I publish

what I measure, how I measure it, and the corpus I measure on. the benchmark page isn't a marketing fixture; it's the contract. word error rate is a useless buyer metric (long version), so the benchmark measures cleanup time per audio hour instead. the corpus, the methodology, the tested tools, the per-tool delivered transcripts, all public, all reproducible. if the number changes, I publish the correction.
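
to make the metric concrete, here's a minimal sketch of how cleanup time per audio hour could be computed from timed editing sessions. the `Trial` record and the pooling rule (sum each tool's cleanup time, then divide by its total audio) are assumptions for illustration, and the numbers in the usage example are made up; the published methodology is the authority on how it's actually done.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One benchmark run: one tool's transcript of one recording (hypothetical shape)."""
    tool: str
    audio_seconds: float    # length of the source recording
    cleanup_seconds: float  # editing time until the transcript is verified


def cleanup_minutes_per_audio_hour(trials: list[Trial]) -> dict[str, float]:
    """Pool each tool's trials and return cleanup minutes per hour of audio."""
    audio: dict[str, float] = {}
    cleanup: dict[str, float] = {}
    for t in trials:
        if t.audio_seconds <= 0:
            raise ValueError(f"trial for {t.tool} has no audio")
        audio[t.tool] = audio.get(t.tool, 0.0) + t.audio_seconds
        cleanup[t.tool] = cleanup.get(t.tool, 0.0) + t.cleanup_seconds
    # Pooling weights long recordings more heavily than averaging per-file ratios would.
    return {
        tool: (cleanup[tool] / 60.0) / (audio[tool] / 3600.0)
        for tool in audio
    }


if __name__ == "__main__":
    # Illustrative numbers only, not benchmark results.
    runs = [
        Trial("tool_a", audio_seconds=1800, cleanup_seconds=600),
        Trial("tool_a", audio_seconds=5400, cleanup_seconds=1200),
        Trial("tool_b", audio_seconds=3600, cleanup_seconds=2700),
    ]
    for tool, minutes in cleanup_minutes_per_audio_hour(runs).items():
        print(f"{tool}: {minutes:.1f} cleanup minutes per audio hour")
```

pooling is one defensible choice; averaging per-file ratios is another, and the two can rank tools differently when file lengths vary a lot.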

I also publish what I won't do. no meeting-bot product. no tier upgrade for bulk speaker fix. no training on customer audio. no AI ghostwriting of your prose. these constraints aren't preferences; they're load-bearing for the audiences this is built for. when I describe a feature on this site, the rule is that there's a working demo or a public methodology behind it, not a roadmap fantasy.

where I came from

a small Texas town, then a chemistry degree I never used, then a public-radio internship that turned into a year of freelance reporting on local government, then a ten-year drift through software engineering jobs at companies with adult supervision (an embedded-systems shop in Austin, a payments processor in Denver, a brief mistake in San Francisco). I learned that I could write code well, that shipping at small companies was vastly more interesting than shipping at large ones, and that the journalism instinct — chase the story, verify the source, publish the methodology — translated into building a product almost unchanged.

the language-learning thing started in college and never stopped. about two thousand hours of accumulated practice across four languages, the kind of obsession that gives you opinions about transcription notation in three writing systems. several of the more specialized export formats exist because I built them for myself first.

what I'm working on this week

the live-now things: the free tools (extract-audio, voice recorder, more shipping each week), the full content surface (manifesto, benchmark, comparisons, blog posts), and the founding-member waitlist for the full transcription product.

the not-yet-live things: the transcription product itself (target: this quarter), the cross-corpus search that's the biggest single bet in the roadmap, and a Remotion-based highlight-clip generator that I'm increasingly convinced is the most fun feature I'll get to build.

if you're a working journalist with a deadline, or a podcaster who needs a clean clip pack, corrected guest names, and a blog draft from your last episode — the tool is being built for you specifically. the founding-member rate is the way to be first in line.

contact

questions, corrections, press: hello@audiohighlight.com. one human, a reply within a day, no support form.
