A YouTuber paid me $300 to "fix" a 20-minute video. The audio had 47 "ums," a dog barking in the background, and he wanted it narrated in Spanish. I delivered in 3 hours.

Category: Monetization Guide

Excerpt:

Descript lets you edit audio by editing text—delete filler words with backspace, fix mistakes by retyping. Murf.ai generates realistic voiceovers in 20+ languages. Together, they handle audio cleanup, voiceover generation, and even complete audio "rescues" where the original recording is unusable. Here's the workflow, pricing, and client acquisition strategy.

Last Updated: March 12, 2026
Stack: Descript (edit audio like a doc) + Murf.ai (120+ AI voices)
Audio Studio
No recording booth needed
$50-150 per project

The problem: bad audio kills content
The fix: edit text, not waveforms
The money: done-for-you audio

A YouTuber paid me $300 to "fix" a 20-minute video. The audio had 47 "ums," a dog barking in the background, and he wanted it narrated in Spanish. I delivered in 3 hours.

Before you ask: no, I don't speak Spanish. And before this workflow, I would have quoted 3 days and subcontracted the voiceover. Here's what changed.

Descript turned his messy recording into clean, editable text. I deleted the "ums" by hitting backspace. I cut the dog bark by finding "bark" in the transcript and pressing delete. Then I used Descript's AI to regenerate his voice for minor fixes — no re-recording needed.

Murf.ai handled the Spanish narration. I pasted the translated script, picked a voice that matched his energy, and generated studio-quality audio in 15 minutes. The whole project took 3 hours. My effective rate: $100/hour.

What you're actually selling

Audio cleanup without the headache
Clients send you messy recordings. You send back clean, professional audio. They don't need to know you edited a transcript, not waveforms.

Voiceover in 20+ languages
Murf has 120+ voices in 20+ languages. You can offer narration in Spanish, French, German, or Hindi without knowing any of them.

Filler word removal as a service
"Remove all my ums and uhs" is a legitimate offering. One click in Descript. Clients will pay $30-50 just for this.

You're not an audio engineer. You're a content processor.

Who this is NOT for: Music production, sound design, or anything requiring professional DAW-level control. Descript and Murf are for spoken content: podcasts, voiceovers, narrations, courses. If a client needs their song mixed, refer them elsewhere.

The Pain: why most people's audio sounds terrible

The scenarios I see constantly

Podcaster: Records 45 minutes. Realizes afterward there's construction noise outside. Doesn't want to re-record. Posts anyway with an apology in the description.

Course creator: Has 12 videos ready to sell. Each one has 30+ filler words. Knows it sounds unprofessional. Doesn't have the budget to hire an editor.

YouTube narrator: Hates their own voice. Wants someone else to read the script. Professional voice actors quote $200+ per video.

Business owner: Recorded a webinar in their kitchen. Fridge humming, phone notifications pinging. Needs it cleaned up for a lead magnet.

None of these people need a professional studio. They need someone who can take their mess and make it usable.

What they've tried (and why it failed)
Audacity: Free but intimidating. They look at waveforms and have no idea what to do.
Fiverr voice actors: Great quality, but $50-200 per project. Adds up fast for regular content.
DIY noise removal: Makes audio sound like it's underwater. They give up.
Ignoring it: The most common solution. Content goes out with flaws. Audience judges quietly.

Tool Breakdown: what each one handles

Descript

Edit audio by editing text. Upload any audio/video file, and Descript transcribes it. Then you can:

  • Delete filler words with one click (ums, uhs, likes)
  • Cut sections by deleting text
  • Fix mistakes by retyping — AI regenerates the speaker's voice
  • Remove background noise automatically
  • Generate transcripts and captions
Free tier: 1 hour of transcription/month. Paid: from $12-16/month. Use for: editing, cleanup, transcription.

Murf.ai

Generate realistic voiceovers from text. 120+ voices in 20+ languages. You can:

  • Choose voices by tone (professional, casual, energetic)
  • Adjust speed, pitch, and emphasis
  • Add pauses and pronunciation guides
  • Sync voiceover to video timeline
  • Generate same script in multiple languages
Free tier: 10 mins/month, watermarked. Paid: from $19/month. Use for: voiceover generation, narration, dubbing.

How they work together

Workflow 1 — Audio cleanup: Client sends messy audio → Descript transcribes and cleans → You deliver clean file.

Workflow 2 — Full narration: Client sends script → Murf generates voiceover → Descript syncs to video if needed.

Workflow 3 — The combo: Client's audio is unusable → Transcribe in Descript → Edit script → Generate new voice in Murf → Sync back to video. This is the $150+ project.
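One way to picture the three workflows as a single decision is the rough Python sketch below. The function name and its two intake flags (`has_recording`, `recording_usable`) are hypothetical, made up for illustration. Neither Descript nor Murf exposes anything like this; it only mirrors the routing described above.

```python
from enum import Enum


class Workflow(Enum):
    CLEANUP = "Workflow 1: audio cleanup (Descript only)"
    NARRATION = "Workflow 2: full narration (Murf, then Descript to sync)"
    COMBO = "Workflow 3: the combo (Descript transcript, Murf voice, sync)"


def pick_workflow(has_recording: bool, recording_usable: bool) -> Workflow:
    """Map what the client sent to one of the three workflows.

    Both flags are hypothetical intake questions, not tool features.
    """
    if not has_recording:
        # Client sent only a script, so there is nothing to clean.
        return Workflow.NARRATION
    if recording_usable:
        # Messy but salvageable audio: clean it in place.
        return Workflow.CLEANUP
    # A recording exists but cannot be saved: re-voice it from the transcript.
    return Workflow.COMBO


print(pick_workflow(has_recording=True, recording_usable=False).value)
```

In practice the "usable" call is a judgment you make after listening to the first minute, so treat the boolean as a stand-in for that judgment.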

The Workflow: 3 common project types

Type 1: Filler Word Removal (15-20 min)
1. Upload to Descript
Drag and drop audio/video file. Descript transcribes automatically. Takes about 1 minute per 10 minutes of audio.
2. Auto-detect filler words
Descript highlights all "um," "uh," "like," "you know" automatically. Review the highlights — sometimes "like" is used correctly. Delete the rest with one click.
3. Export clean audio
Descript stitches the audio back together seamlessly. Export as MP3, WAV, or whatever format the client needs. Done.
Charge: $25-50 for this service alone. It's simple, but clients don't know it's one click.
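To see why step 2 says to review the highlights before deleting, here is a small standalone sketch. It is not Descript's detection logic or API, just a plain regex pass over transcript text, and the word lists and the sample sentence are assumptions chosen for illustration.

```python
import re
from typing import Dict, List

# Words that are almost always filler vs. words that need a human look.
ALWAYS_FILLER = ("um", "uh")
NEEDS_REVIEW = ("like", "you know")


def _find(words, transcript: str) -> List[str]:
    # \b keeps "um" from matching inside words such as "umpire".
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return [m.group(0).lower() for m in re.finditer(pattern, transcript, re.IGNORECASE)]


def flag_fillers(transcript: str) -> Dict[str, List[str]]:
    """Split filler candidates into safe-to-delete and review-first."""
    return {
        "delete": _find(ALWAYS_FILLER, transcript),
        "review": _find(NEEDS_REVIEW, transcript),
    }


sample = "Um, so I like this mic, and, uh, you know, it was like fifty bucks."
flags = flag_fillers(sample)
print("delete:", flags["delete"])
print("review:", flags["review"])
```

The sample contains "like" twice: once as a real verb ("I like this mic") and once as filler ("it was like fifty bucks"). A text match cannot tell them apart, which is the reason for the manual review pass.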
Type 2: Full Voiceover Generation (30-45 min)
1. Get the script from client
They send a Google Doc, PDF, or just text. Ask for any pronunciation notes (names, brands, technical terms).
2. Choose voice in Murf
Browse voices by gender, accent, and tone. Let client pick 2-3 options, then generate a sample paragraph. They approve one voice before you do the full script.
3. Generate and fine-tune
Paste the script into Murf. Adjust the speed (0.9x usually sounds more natural for narration). Add pauses between paragraphs. Preview and tweak emphasis on key words.
4. Export
Download as MP3 or WAV. If syncing to video, you can do that in Murf directly or send the audio file for the client to handle.
Charge: $50-100 for a 5-10 minute voiceover. Pro voice actors charge $200-500 for the same. You're not undercutting — you're serving clients who can't afford pros.
Type 3: The Full Rescue (1-2 hours)
1. Transcribe the unusable audio
Client has a recording with terrible quality — noise, echo, bad mic. Upload to Descript anyway. You need the transcript, not the audio.
2. Clean up the transcript
The transcript will have errors from bad audio quality. Fix them. This is your editing step — remove digressions, tighten sentences, fix grammar.
3. Generate new voice in Murf
Paste cleaned transcript. Generate voiceover. If client wants to sound like themselves, Murf can clone voices — but that's a higher-tier feature and requires their consent.
4. Sync to video (if applicable)
Import video to Descript, replace old audio track with new voiceover. Adjust timing. Export final video.
Charge: $100-200 for this full-service rescue. You're saving them from re-recording entirely.
⚠️ Ethical note on voice cloning
Murf offers voice cloning, but only use it with explicit client consent. Never generate a voice that impersonates someone without permission. For most projects, you'll use Murf's stock voices — they're realistic enough that clients won't know it's AI.

Pricing: what to charge

| Service | What You Do | Time | Price |
|---|---|---|---|
| Filler Word Removal | Remove ums/uhs, basic cleanup | 15-20 min | $25-50 |
| Audio Cleanup | Noise removal, filler words, light editing, transcript | 30-45 min | $50-75 |
| Voiceover Generation ⭐ | Script to voiceover, your choice of voice, basic sync | 30-60 min | $50-100 |
| Full Audio Rescue | Bad audio → transcript → clean voiceover → synced | 1-2 hours | $100-200 |
| Podcast Edit (per episode) | Full cleanup, filler removal, intro/outro, transcript | 1-2 hours | $75-150 |

Pro voice actors charge $200-500 for a 5-minute recording. Podcast editors charge $50-150 per episode. You're not competing on quality with pros — you're serving the market below that price point.
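The time and price columns in the table above imply an hourly rate for each service. The sketch below works it out; the numbers are copied from the table, while the worst-case and best-case framing (lowest price at the slowest pace, highest price at the fastest) is an assumption added for illustration.

```python
# (service, minutes_low, minutes_high, price_low, price_high), from the pricing table
SERVICES = [
    ("Filler Word Removal", 15, 20, 25, 50),
    ("Audio Cleanup", 30, 45, 50, 75),
    ("Voiceover Generation", 30, 60, 50, 100),
    ("Full Audio Rescue", 60, 120, 100, 200),
    ("Podcast Edit (per episode)", 60, 120, 75, 150),
]


def hourly_range(min_lo, min_hi, price_lo, price_hi):
    """Return (worst, best) effective dollars per hour.

    Worst case: lowest price at the slowest pace.
    Best case: highest price at the fastest pace.
    """
    worst = price_lo * 60 / min_hi
    best = price_hi * 60 / min_lo
    return worst, best


for name, min_lo, min_hi, price_lo, price_hi in SERVICES:
    worst, best = hourly_range(min_lo, min_hi, price_lo, price_hi)
    print(f"{name}: ${worst:.0f}-{best:.0f}/hour")
```

The same arithmetic reproduces the opening example: $300 for 3 hours (180 minutes) is 300 * 60 / 180 = $100/hour.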

The retainer opportunity

Podcasters need this weekly. Offer a monthly package: 4 episodes, $300-500/month. That's $75-125 per episode, and you have predictable income. Once you're in their workflow, they won't switch.
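A quick way to sanity-check a retainer offer is to compute the per-episode price and what a few retainers add up to. This is a minimal sketch; the client count of 3 is a made-up figure for illustration, not a number from this guide.

```python
def retainer_summary(monthly_fee, episodes_per_month, clients=1):
    """Return (price per episode, total monthly income)."""
    per_episode = monthly_fee / episodes_per_month
    monthly_income = monthly_fee * clients
    return per_episode, monthly_income


# The two ends of the $300-500/month package, with a hypothetical 3 clients.
for fee in (300, 500):
    per_episode, income = retainer_summary(fee, episodes_per_month=4, clients=3)
    print(f"${fee}/month package: ${per_episode:.0f} per episode, ${income}/month from 3 clients")
```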

The multilingual upsell

Murf supports 20+ languages. Offer: "Same video narrated in Spanish/French/German for +$50." You don't need to speak the language — just paste translated text. Huge value add.

First Client: where to find them

Who needs this right now
New podcasters
Check Apple Podcasts for shows with fewer than 20 episodes. Many have bad audio and don't know how to fix it.
Course creators on Udemy/Skillshare
Filter by recent courses. Many have audio issues. DM them: "I noticed the audio in your course — I can clean that up for $50."
YouTube channels with narration
Documentary-style channels, explainer channels, meditation channels. They often need voiceover work.
Local businesses with webinars
Real estate agents, coaches, consultants. They record webinars and never do anything with them because the audio is rough.

Your free sample strategy

For your first 5 potential clients, offer a free 2-minute sample. Take their audio, remove filler words, and send back the clean version. It takes you 5 minutes. They get immediate value.

What to say
"I ran your latest episode through my audio cleanup process — here's a 2-minute sample with all the filler words removed. If you like how it sounds, I can do your full back catalog."

Cold DM template
Hey [name] — been listening to [podcast/channel name]. 

Quick unsolicited feedback: your content is solid but the audio 
could be cleaner. I noticed some background noise and filler words 
that might be distracting listeners.

I do audio cleanup as a service — basically I remove the "ums," 
clean up noise, and make things sound more professional. 

Here's a free sample: I took 2 minutes from your latest episode 
and cleaned it up.
[link]

If you like it, I can do full episodes for $X each. If not, 
you still got a free sample out of this message. 

— [your name]

Send 5 of these. Personalize each one. Expect 2-3 replies.
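If you keep the template in a file, filling in the blanks can be scripted so no message goes out with a leftover placeholder. The sketch below uses Python's standard `string.Template` on a shortened version of the DM above. The prospect details and the example.com link are invented for illustration, and the personal observation about each show still has to be written by hand.

```python
from string import Template

# Shortened version of the DM; $name, $show, $link and $price are the blanks.
# "$$" is how Template writes a literal dollar sign.
DM = Template(
    "Hey $name, been listening to $show.\n\n"
    "I do audio cleanup as a service. Here's a free sample: I took 2 minutes\n"
    "from your latest episode and cleaned it up.\n"
    "$link\n\n"
    "If you like it, I can do full episodes for $$${price} each.\n"
)


def render_dm(prospect: dict) -> str:
    # substitute() raises KeyError if any blank is missing,
    # so a half-filled message never gets sent.
    return DM.substitute(prospect)


prospects = [  # made-up example data
    {"name": "Dana", "show": "The Side Desk",
     "link": "https://example.com/sample-1", "price": 40},
]

for prospect in prospects:
    print(render_dm(prospect))
```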

Build portfolio pieces first
Before outreach, make 3 sample "before/after" clips. Take any Creative Commons audio, process it in Descript, show the difference. This becomes your portfolio. Clients want to hear what you can do, not read about it.

Start with free tiers today

What I wish I'd known: You're not selling "audio engineering." You're selling "done-for-you cleanup." Clients don't care about the tools. They care that their podcast doesn't sound like it was recorded in a bathroom. Focus on the outcome, not the process.