Podcasts are 100 percent audio. No image, no caption, no graphic on screen to rescue a flat line. The voice is the entire product, and ChatGPT prose has none of the natural pauses, contractions, half-thoughts, or asides that make speech sound human. Listeners hear every robotic phrase and skip ahead. TextSight rewrites the script for oral rhythm before you record, fixes the cold open, and keeps the host voice your subscribers came back for. Honest framing: spoken voice plus authenticity, never tricking the listener.
A page lets the reader set the pace. A microphone takes that control away. Three rhythm patterns separate written prose from spoken prose, and AI defaults to the written shape every time.
The model is trained to produce text that scans well silently. Long subordinate clauses are fine on a screen because the eye can pause where it wants. The same clause read into a mic runs 18 seconds without a breath and the delivery sounds rushed within a minute, even with a great voice. Podcast scripts need oral rhythm. ChatGPT does not produce it by default.
Spoken English averages 8 to 14 words per clause with a real pause every 6 to 10 words. AI prose averages 22 to 30 words per sentence with no breaks. Break every long sentence into two or three short ones, and mark pauses with an ellipsis so the host knows where the breath belongs. The script becomes a delivery guide, not a wall of text.
Real hosts interrupt themselves mid-sentence. They shift gears, contradict the last line, add a quick story, then come back. AI never does this. Every sentence in an AI draft completes itself cleanly, which is the opposite of how people actually talk. One aside per minute is the cheapest way to add humanity back into the script.
Native speech contracts in roughly 95 percent of slots. Don't, can't, it's, you're, I'm. Real hosts also use small fillers like right, you know, and look, and address the listener as you. AI strips every one of these. Adding them back closes most of the gap between rewritten output and a script that sounds like a person made it.
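One rough way to find the lines that need breaking is to count words per sentence before the read-aloud pass. The sketch below is illustrative only: the 14-word cutoff is an assumption taken from the upper end of the spoken-clause range quoted above, the splitter is naive (it also breaks on an ellipsis), and the function names are hypothetical rather than part of any tool.

```python
import re

# Assumption: 14 words is the cutoff, borrowed from the upper end of the
# spoken-clause range quoted above. Tune it to your own delivery.
MAX_SPOKEN_WORDS = 14


def split_sentences(text):
    """Naive splitter: break after ., !, or ? when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def flag_long_sentences(text, limit=MAX_SPOKEN_WORDS):
    """Return (word_count, sentence) pairs for sentences over the limit."""
    flagged = []
    for sentence in split_sentences(text):
        count = len(sentence.split())
        if count > limit:
            flagged.append((count, sentence))
    return flagged


draft = (
    "Many organisations have transitioned to fully distributed models, yet "
    "they continue to grapple with persistent challenges that undermine "
    "productivity and team cohesion. Not because she was bad. Because I was."
)

for count, sentence in flag_long_sentences(draft):
    print(f"{count} words: {sentence}")
```

Anything it flags is a candidate for splitting into two or three short sentences. It is no substitute for reading the line aloud.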
Invisible on the page. Obvious the moment the mic turns on. Read every paragraph aloud and rewrite before recording.
Today on the show. Welcome back. In today's episode, we explore. Make sure to subscribe! These stock openers carry 80 percent of AI-drafted podcasts, and listeners skip them within twelve seconds. Open mid-confession, on a concrete number, or on the actual news instead. No setup, no welcome.
AI loves a single sentence that runs through three commas and a conjunction. The page handles it. The mouth does not. Hosts stumble, the cadence cracks, and the audience notices. Cut every long clause into two or three short ones, and rehearse the seam.
Utilise, leverage, navigate, robust, comprehensive, holistic. Words nobody says in conversation, dropped into a sentence you now have to read into a microphone. The audio reveals the source instantly. Swap for use, get, work through, strong, full, whole.
AI prose runs in identical paragraph lengths with no breath markers. Real scripts vary paragraph length wildly and mark the breath. Add ellipses where the pause belongs, let one paragraph be a single sentence, let another run long.
AI defaults to do not, cannot, it is, you are. Spoken English uses don't, can't, it's, you're roughly 95 percent of the time. Full forms in audio sound like reading a press release aloud. Host voice also lives in throwaway asides: the mid-sentence opinion, the small story, the side note. AI strips both.
AI outros restate every point, thank the listener, then ask for a rating in the same flat cadence. Most listeners have already left. End on a sharp line, a question, or a teaser. Put the rating ask at the 30 percent mark of the episode, not the end.
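Several of the tells above are plain string patterns, so a quick pre-flight scan can list them before the read-aloud pass. This is a minimal sketch, not how any particular tool works: the phrase lists are copied from the text above, while the function name, the tuple layout, and the sample lines are assumptions made for illustration.

```python
import re

# Phrases and swaps copied from the tells described above.
STOCK_OPENERS = [
    "today on the show",
    "welcome back",
    "in today's episode",
    "make sure to subscribe",
]
WRITTEN_ONLY = {
    "utilise": "use",
    "leverage": "get",
    "navigate": "work through",
    "robust": "strong",
    "comprehensive": "full",
    "holistic": "whole",
}
FULL_FORMS = {
    "do not": "don't",
    "cannot": "can't",
    "it is": "it's",
    "you are": "you're",
}


def find_tells(script):
    """Return (kind, match, suggestion) tuples for each tell found."""
    # Lowercase and normalise curly apostrophes so "today's" matches.
    lowered = script.lower().replace("\u2019", "'")
    tells = []
    for phrase in STOCK_OPENERS:
        if phrase in lowered:
            tells.append(("opener", phrase, "cut it"))
    for word, swap in WRITTEN_ONLY.items():
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            tells.append(("written-only word", word, swap))
    for full, contracted in FULL_FORMS.items():
        if re.search(rf"\b{re.escape(full)}\b", lowered):
            tells.append(("full form", full, contracted))
    return tells


sample = (
    "Welcome back to the show. In today's episode, we explore the most "
    "common pitfalls that remote teams encounter in modern workplaces. "
    "Throughout the next thirty minutes, we will be examining the root "
    "causes and providing actionable insights to help leaders navigate "
    "this complex landscape."
)

for kind, match, suggestion in find_tells(sample):
    print(f"{kind}: '{match}' -> {suggestion}")
```

The match is exact-word only, so inflected forms such as "navigates" slip through. Treat the output as a checklist for the manual pass, not a verdict.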
Every block that ends up read into a mic, including the brand-supplied ones, sits inside the same workflow.
Solo episodes live or die on monologue energy. AI scripts have none. The cadence is flat, the asides are missing, and the host sounds bored of the show. Run the full draft through Balanced, then do a manual read-aloud pass. Expect to rewrite the cold open from scratch most of the time.
Pre-written questions that sound AI-generated are show killers. A great guest hears within 30 seconds whether the host wrote the questions or pasted them from a chatbot, and a disengaged guest gives flat, short answers. Rewrite every question, read each aloud, treat them as conversation anchors.
The first 60 seconds and last 30 seconds carry the episode. Cold open gets the most attention. Outro should end on a sharp line, a question, or a hand-off to the next episode. Boilerplate closes get muted.
Brand-supplied copy is the single biggest AI residue source. Marketing teams generate talking points in ChatGPT and ship them to the host mostly unedited. Rewrite the flagged lines into the host register while keeping the brand's required claims intact, then read the cleaned version into the mic. Light mode is usually the right setting because exact wording matters here.
The throwaway lines between blocks are where AI rhythm leaks back in. They are short, so they get skipped in the rewrite pass. Listeners hear the seams. Scan and rewrite every transition segment as one block, not as throwaway.
Show notes are written text in support of the audio, but still part of the channel voice. Subscribers read them in the podcast app when deciding whether to play the episode. Templated descriptions read as suspect and lower the click rate. Scan the description and the chapter labels as one block before publishing.
Free fits a weekly solo show on short scripts. Pro fits daily releases, interview prep, and sponsor reads. Business fits production studios and podcast networks. Full details on the pricing page.
Billed $89.88/year — Save $30
Billed $179.88/year — Save $60
Billed $359.88/year — Save $120
Yearly billing saves 25%. View full pricing →
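The 25% figure can be checked against the three billed amounts. The sketch below uses only the numbers listed above. The undiscounted yearly price is inferred as billed plus saving, which is an assumption and not a listed price.

```python
from decimal import Decimal

# (billed per year, quoted saving), exactly as listed above.
tiers = [
    (Decimal("89.88"), Decimal("30")),
    (Decimal("179.88"), Decimal("60")),
    (Decimal("359.88"), Decimal("120")),
]


def yearly_discount_percent(billed, saving):
    """Saving as a share of the implied undiscounted yearly price."""
    undiscounted = billed + saving  # inferred, not a listed figure
    return (saving / undiscounted * 100).quantize(Decimal("0.1"))


for billed, saving in tiers:
    print(billed, saving, yearly_discount_percent(billed, saving))
```

Each tier works out to roughly 25 percent, so the quoted savings are consistent with the yearly discount.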
Podcast scripts vary wildly by length and approval constraints. The right mode depends on the block, not the host.
Balanced preserves intent, fixes oral rhythm, restores contractions, and breaks long clauses. The right setting for solo episodes, interview prep, and most longer scripts where you trust the structure but the voice needs lifting. Target an Authenticity Score of 85 or higher.
Smaller edits, intent preserved aggressively. Use Light on sponsor reads where exact claim wording is locked, intro and outro segments that are short enough that meaning shift matters, and transition patter. Light is also right for any block under 200 characters where Maximum could swing too hard.
Aggressive rewrite. The right setting for daily news shows on tight turnarounds, briefing formats that read like a press release out of the model, and narrative or documentary podcasts where the polished production layer demands an Authenticity Score of 90 or above. Always do a read-aloud pass after Maximum to swap out any written-only words that crept back in.
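The mode guidance above reduces to a small lookup. This is a sketch under stated assumptions: the block-type keys and the fallback to Balanced for unknown blocks are hypothetical, while the mode names, the 200-character rule, and the score targets come from the descriptions above. No target score is stated for Light, so none is returned.

```python
# Block-type keys are hypothetical labels chosen for this sketch.
MODE_BY_BLOCK = {
    "solo_episode": "Balanced",
    "interview_prep": "Balanced",
    "sponsor_read": "Light",
    "intro_outro": "Light",
    "transition": "Light",
    "daily_news": "Maximum",
    "briefing": "Maximum",
    "narrative": "Maximum",
}

# Targets as described above; no target is stated for Light.
TARGET_SCORE = {"Balanced": 85, "Maximum": 90}

SHORT_BLOCK_CHARS = 200  # blocks under this length go to Light


def pick_mode(block_type, text):
    """Return (mode, target_score) for one script block."""
    if len(text) < SHORT_BLOCK_CHARS:
        return "Light", None
    # Assumption: unknown block types fall back to Balanced.
    mode = MODE_BY_BLOCK.get(block_type, "Balanced")
    return mode, TARGET_SCORE.get(mode)


print(pick_mode("sponsor_read", "x" * 500))
print(pick_mode("narrative", "x" * 500))
```

Choosing per block rather than per show matches the point above: one episode can mix a Balanced monologue with a Light sponsor read.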
Words that read fine on a page trip the tongue when spoken into a mic. Ten minutes per script. Applied to every recording.
Let ChatGPT produce the rough script. Do not fix AI prose in the draft stage. Keep the structure, the beats, the bullet points. Throw the wording away. The AI rewriter is the right place to fix wording, not the prompt.
Paste into app.textsight.ai and pick Balanced for solo and interview shows, Maximum for daily news or narrative, Light for sponsor reads. Target an Authenticity Score of 85 on regular shows, 90 on narrative work. Read the sentence-level highlights and accept the rewrite suggestions on the flagged spans.
Read the script aloud, not in your head. Full voice, at the speed you would actually record. You will not catch problems in anything you read silently. Mark every stumble, slow spot, breathless line, and any place that did not sound like you. Each mark is a cue that the script is not yet speech.
Swap utilise for use. Cut a long clause into two short ones. Add a contraction. Drop a filler where you would actually say one. Read the cold open three more times and time it; if it runs over 60 seconds, cut. The AI rewriter handled oral rhythm. You handle voice.
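A stopwatch is the real test for the cold open, but a word count gives a rough first estimate. The sketch below assumes a speaking pace of 150 words per minute, which is an illustrative default and not a figure from this guide. Replace it with your own measured pace. The function names are hypothetical.

```python
WORDS_PER_MINUTE = 150        # assumption: swap in your own measured pace
COLD_OPEN_LIMIT_SECONDS = 60  # the cut-off given in the step above


def estimated_seconds(text, wpm=WORDS_PER_MINUTE):
    """Rough read time in seconds from word count alone."""
    return len(text.split()) / wpm * 60


def words_to_cut(text, limit=COLD_OPEN_LIMIT_SECONDS, wpm=WORDS_PER_MINUTE):
    """How many words over the time budget the text runs (0 if it fits)."""
    budget = int(limit * wpm / 60)
    return max(0, len(text.split()) - budget)


cold_open = (
    "I fired my third remote hire last Tuesday. Not because she was bad. "
    "Because I was."
)
print(round(estimated_seconds(cold_open), 1), "seconds,",
      words_to_cut(cold_open), "words to cut")
```

The estimate ignores pauses and ellipses, so a timed read usually runs longer than the number printed here.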
Opening 60 seconds of a business podcast on remote-team failure modes, first as ChatGPT drafted it, then the rewrite the host actually recorded. First-90-second drop-off moved from 47 percent to 19 percent on the published episode.
"Welcome back to the show. In today's episode, we explore the most common pitfalls that remote teams encounter in modern workplaces. Many organisations have transitioned to fully distributed models, yet they continue to grapple with persistent challenges that undermine productivity and team cohesion. Throughout the next thirty minutes, we will be examining the root causes and providing actionable insights to help leaders navigate this complex landscape."
"I fired my third remote hire last Tuesday. Not because she was bad. Because I was. And I think most founders running distributed teams are making the exact same mistake I just made... and they don't see it yet. So here's what happened, what I think it means, and the one rule I'm trying next. Pour a coffee. This one's personal."
What changed: dropped Welcome back, In today's episode, transitioned, and navigate. Opened on a concrete confession (fired a hire last Tuesday). Added contractions everywhere. Inserted a pause marker (the ellipsis), a half-thought (Not because she was bad. Because I was), and direct address (Pour a coffee). Authenticity Score moved 73 points and first-90-second drop-off more than halved.
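The retention change for this episode can be worked out from the two drop-off figures quoted for it. The sketch uses only those two numbers. The variable names are illustrative.

```python
before, after = 47, 19  # first-90-second drop-off in percent, as quoted

point_change = before - after                   # percentage points
relative_cut = (before - after) / before * 100  # share of the original drop-off removed

print(f"{point_change} points, {relative_cut:.1f}% lower")
```

That is a fall of 28 percentage points, or a cut of roughly 60 percent in relative terms.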
Spoken-word writing for video. Long-form scripts, Shorts hooks, and on-camera reads.
For YouTube scripts →
Sister guide for Substack and beehiiv issues. Voice consistency across subscriber sends.
For newsletters →
The 0-100 score explained, with target bands for spoken vs written formats.
Read the explainer →
The standalone AI rewriter tool. Three modes, sentence-level highlights, voice-preserving rewrites.
Open the AI rewriter →
Free to try. No card. Rewrite the cold open, restore spoken cadence, run the read-aloud loop, and protect listener retention before you press record. Your first scan in about six seconds.