Em-Dash Overuse: How AI Learned to Sound Smart (And How It Gives Itself Away)

GPT-4o uses em-dashes at 4–6x the rate of average human writing. Here's why, and what other punctuation tells are hiding in plain sight.

The em-dash is having a crisis.

A perfectly good punctuation mark — used for centuries to create emphasis, insert asides, and break rhythm — has been colonized. AI models discovered that the em-dash performs sophistication. It signals nuance. It gestures toward complexity. And so they use it constantly, at 4–6x the rate of average human writing in comparable registers.

The result: any piece of AI text over 300 words now reads like it was written by someone who just discovered em-dashes and can't stop.

Why AI Loves the Em-Dash

Here's what the em-dash actually does rhetorically: it creates an interruption. An aside. A qualifying thought that breaks the main sentence's flow. Used well, it creates the feeling of a writer thinking in real time — "here's my main point — wait, this matters too — and here's why."

That's exactly the quality AI is trying to approximate. Spontaneous thought. Real-time processing. The performance of a mind at work.

But AI can't actually think in real time. It generates token by token according to probability. The em-dash isn't emerging from genuine spontaneous qualification — it's a learned pattern for creating the texture of spontaneous qualification without any actual spontaneous thought behind it.

This is why every other sentence in a ChatGPT response has one. The model has learned: em-dash = sophisticated. So it reaches for em-dashes the way a nervous student reaches for semicolons. A lot. Too much. In places where they don't do anything.

Compare: GPT-4o averages 1 em-dash per 47 words in analytical writing. Human essayists writing in the same register? Closer to 1 per 215 words. The gap is enormous, and once you start looking for it, you can't unsee it.
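To check the same ratio on your own text, here is a minimal Python sketch. The function name `words_per_em_dash` and the whitespace-based word count are assumptions made for illustration; the figures above may have been measured differently.

```python
import re

EM_DASH = "\u2014"  # the em-dash character, written as an escape


def words_per_em_dash(text: str) -> float:
    """Average number of words per em-dash; infinity if the text has none."""
    # Treat each em-dash as a separator so "word\u2014word" counts as two words.
    words = len(re.findall(r"\S+", text.replace(EM_DASH, " ")))
    dashes = text.count(EM_DASH)
    return words / dashes if dashes else float("inf")


# Ratio implied by the two figures quoted above (1 per 47 vs 1 per 215 words).
ratio = 215 / 47
print(round(ratio, 1))  # 4.6
```

A result near 47 puts a text at the quoted GPT-4o rate; a result near 215 puts it at the quoted human-essayist rate. Short samples give noisy numbers, so this is more useful on a few hundred words or more.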

What the Em-Dash Is Supposed to Do

In good writing, the em-dash earns its place. There are roughly three legitimate uses:

The interrupting aside. "The study — which had a sample size of 4 — found exactly nothing useful." The em-dash brackets a piece of information the reader needs but that would disrupt the flow if placed elsewhere. The aside is genuinely interruptive — the sentence would be complete without it, which is the point.

The pivot. "She trained for two years, slept five hours a night, gave up basically everything — and finished in fourth place." The em-dash here creates a beat before the anticlimax. It's doing rhythmic work. You feel the pause before the disappointment lands. A comma wouldn't do it. A period would lose the connection. The em-dash is the only mark that holds the sentence together while also splitting it in two.

The dramatic addition. "There's only one thing you need to do — stop." Short, emphatic. The em-dash creates a micro-pause that makes the final word hit harder.

All of these uses have something in common: they're doing specific rhetorical work. The em-dash is placed deliberately to create a specific effect. Pull it out and something is lost.

AI uses em-dashes when there's no effect to create — just the appearance of one. The em-dash sits in the sentence like a stage prop. Remove it, replace it with a comma, and nothing changes.
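The comma test can be mechanized as a first pass. The sketch below is an illustration only, and `comma_variant` is a hypothetical helper name: it produces the comma version of a sentence so you can read the two side by side. Deciding whether anything was lost is still a human judgment.

```python
import re

EM_DASH = "\u2014"
# An em-dash together with any spaces around it.
_DASH_WITH_SPACES = re.compile(r"\s*" + EM_DASH + r"\s*")


def comma_variant(sentence: str) -> str:
    """Return the sentence with every em-dash swapped for a comma."""
    return _DASH_WITH_SPACES.sub(", ", sentence)


original = (
    "Artificial intelligence has transformed many industries \u2014 "
    "including healthcare, education, and finance."
)
print(comma_variant(original))
```

If the comma version reads the same as the original, the em-dash was cosmetic. If the rhythm or the emphasis changes, it was probably doing real work.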

Before and After: What AI Em-Dash Abuse Looks Like

Let me show you the difference concretely.

AI-generated paragraph with em-dash abuse:

"Artificial intelligence has transformed many industries — including healthcare, education, and finance. The technology offers significant opportunities — from improved efficiency to better decision-making capabilities. However, it also presents challenges — particularly around data privacy and ethical considerations. Organizations need to approach AI adoption carefully — balancing innovation with responsibility."

Four em-dashes in four sentences. Count them. Not one of those em-dashes is doing anything a comma couldn't do. They're all "introducing a list" or "adding a qualifier" — both uses where a comma or colon would be cleaner.

Rewritten without em-dash abuse:

"Artificial intelligence has transformed healthcare, education, and finance. The efficiency gains are real. But so is the mess it's created around data privacy — and that's the tradeoff most organizations aren't having an honest conversation about yet. You can't adopt AI carefully without first deciding what you're willing to trade."

One em-dash. It earns its place by creating a genuine pivot from the efficiency point to the privacy critique. Everything else uses cleaner punctuation for what it's actually doing.

The rewritten version runs tighter and has a sharper point. That's usually what happens when you remove cosmetic em-dashes: you discover what the sentences were actually trying to say.

The Colon and Semicolon AI Tells

While we're here: the em-dash isn't the only punctuation mark AI has learned to weaponize.

The AI colon. Human writers use colons to introduce lists, quotations, or explanations that couldn't grammatically stand alone. AI uses colons to introduce perfectly structured three-part lists that feel suspiciously balanced. "There are three key considerations: clarity, consistency, and communication." The colon is fine. The tell is what follows it: three alliterating items in identical form.

If you see a colon followed by exactly three items, each of similar length and similar grammatical form, that's AI. Humans make lists of two items, or five, or one very long item followed by a short one. We don't naturally generate grammatically parallel three-part lists without trying. We especially don't do it when they alliterate.
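A rough way to surface candidates for this check is sketched below. It is a heuristic under stated assumptions: `colon_triplets` is a hypothetical name, items are split naively on commas and "and"/"or", and only the item count and shared first letter are tested, not length or grammatical form.

```python
import re

# Text between a colon and the end of the sentence.
_AFTER_COLON = re.compile(r":\s*([^.:;!?\n]+)[.!?]")
# Naive list separators: commas (optionally followed by and/or), or a bare and/or.
_SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def colon_triplets(text: str) -> list[tuple[list[str], bool]]:
    """Find colons followed by exactly three list items.

    Returns (items, alliterates) pairs, where alliterates is True when
    all three items start with the same letter.
    """
    results = []
    for match in _AFTER_COLON.finditer(text):
        items = [p.strip() for p in _SEPARATOR.split(match.group(1)) if p.strip()]
        if len(items) == 3:
            initials = {item[0].lower() for item in items}
            results.append((items, len(initials) == 1))
    return results


sample = "There are three key considerations: clarity, consistency, and communication."
print(colon_triplets(sample))
```

Expect false positives (times like 10:30, items that contain "and"), so treat each hit as something to read, not something to delete.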

The AI semicolon. Humans use semicolons to join closely related independent clauses — usually when the relationship is too tight for a period but too loose for a conjunction. AI uses semicolons to create false balance. "Remote work increases flexibility; it also introduces new challenges." These two clauses aren't really in tension. They're not illuminated by juxtaposition. The semicolon is making the sentence look more sophisticated than it is.

Real balance would be: "Every hour I save not commuting, I lose to Slack notifications." That's actually in tension — same person, same time block, opposite outcomes. The AI version just sounds like it's in tension because the semicolon creates a visual pause between two claims.
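No script can tell whether two clauses are really in tension, but one can collect every semicolon sentence for review. This is a minimal sketch, assuming a naive sentence splitter that breaks on ., ! or ? followed by whitespace; `semicolon_sentences` is a hypothetical name.

```python
import re


def semicolon_sentences(text: str) -> list[str]:
    """Return every sentence that contains a semicolon, in document order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if ";" in s]


draft = (
    "Remote work increases flexibility; it also introduces new challenges. "
    "Short one. A; B."
)
for sentence in semicolon_sentences(draft):
    print(sentence)
```

For each sentence it prints, ask the question from this section: does putting the two clauses side by side reveal anything, or would two plain sentences say the same thing?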

The AI parenthetical. This one is subtler. AI uses parenthetical remarks to perform intellectual humility and qualification — "(though this varies by context)" or "(it's worth noting that exceptions exist)" — in positions where the qualification doesn't actually add anything. The parenthetical appears as a learned signal that the writer is being nuanced. Often it's just noise.

How Often Is Too Often?

A rule of thumb: if you can count more than 2–3 em-dashes in 500 words, you've crossed into suspicious territory. In 1000 words, 4 is acceptable. 8 is not.

Same principle for the three-part balanced list: once per 800 words. If it's appearing every other paragraph, that's a pattern.

The semicolon-as-false-balance construction: once per piece, maximum. Used twice, it's a quirk. Used four times, it's a fingerprint.

None of these are absolute rules. Good writers break every rule for good reasons. The tell isn't the construction itself — it's the frequency and the purposelessness. When every other sentence has an em-dash and none of them are doing anything the sentence couldn't do without one, that's AI.
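The em-dash rule of thumb translates into a small check. The sketch below is one possible reading of it: the names are hypothetical, and treating everything between 4 and 8 per 1000 words as "suspicious" is an assumption, since the rule only pins down the two ends.

```python
EM_DASH = "\u2014"


def em_dash_verdict(text: str) -> str:
    """Classify em-dash density using the per-1000-word rule of thumb."""
    # Count words with em-dashes treated as separators, not as words.
    words = len(text.replace(EM_DASH, " ").split())
    if words == 0:
        return "no text"
    per_1000 = text.count(EM_DASH) * 1000 / words
    if per_1000 <= 4:
        return "acceptable"
    if per_1000 < 8:
        return "suspicious"  # assumed middle band, not stated in the rule
    return "excessive"
```

Because the rate is scaled to 1000 words, very short passages can swing wildly: one em-dash in a 50-word paragraph already works out to 20 per 1000. Run it on whole documents, not single paragraphs.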

What This Does to Your TextSight Score

Detection systems like TextSight pick up punctuation patterns as part of the Humanization Score signal. Em-dash density correlates with lower scores — not because detectors are specifically programmed to penalize em-dashes, but because high em-dash density co-occurs with other AI patterns: low sentence length variance, passive-voice hedging, topic-sentence-first paragraph structure.

If your text is scoring in the 30–50 range and you're using em-dashes heavily, it's worth running a simple audit: search your document for "—" and look at every instance. For each one, ask whether it's doing any of the three legitimate jobs (interrupting aside, pivot, dramatic addition). If it isn't — if you could remove it and replace it with a comma, a period, or nothing — cut it.

This edit alone typically moves a score 5–12 points, depending on how many cosmetic em-dashes were in the original. Paired with sentence length variation work and passive voice reduction, the effect is substantially larger.
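If your editor's search is awkward for this, the audit can be scripted. A minimal sketch follows; `find_em_dashes` and the 40-character context window are choices made for illustration, and nothing here reflects how TextSight itself works.

```python
EM_DASH = "\u2014"


def find_em_dashes(text: str, context: int = 40) -> list[tuple[int, str]]:
    """Return (position, surrounding snippet) for every em-dash in the text."""
    hits = []
    pos = text.find(EM_DASH)
    while pos != -1:
        left = max(0, pos - context)
        snippet = text[left:pos + 1 + context].replace("\n", " ")
        hits.append((pos, snippet))
        pos = text.find(EM_DASH, pos + 1)
    return hits


draft = "a \u2014 b \u2014 c"
for position, snippet in find_em_dashes(draft):
    print(position, repr(snippet))
```

Each snippet gives enough surrounding text to decide whether the mark is an aside, a pivot, a dramatic addition, or none of those.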

A Practical Editing Workflow

When editing AI output for detection and readability, here's the punctuation pass I run:

Step 1: Find and flag every em-dash. Search for "—" in your document. Mark each one.

Step 2: Categorize each one. Is it an interrupting aside (sentence would be complete without the content between the dashes)? A pivot (beat before an anticlimax or contrast)? A dramatic addition (short emphatic statement after the dash)? Or none of the above?

Step 3: Cut the "none of the above" ones. Replace with whatever punctuation actually fits: comma, period, colon. If nothing fits cleanly, that's a sign the sentence structure is wrong — rewrite it.

Step 4: Check your colons. Look at everything after each colon. If it's a list, how many items? Three identically structured items is a red flag. If there are fewer or more, or the items vary in form, you're probably fine.

Step 5: Check your semicolons. Are the two joined clauses genuinely in tension or relationship? Or is the semicolon just making two medium sentences look like one sophisticated one?
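Steps 2 and 3 can be given a rough first pass in code before you read each sentence yourself. Everything in this sketch is an assumption layered on the three categories: two em-dashes in a sentence are read as an aside, a tail of three words or fewer as a dramatic addition, and a tail opening with "and", "but", or "yet" as a pivot. Real sentences will break these guesses, so treat the labels as suggestions.

```python
EM_DASH = "\u2014"
PIVOT_WORDS = ("and", "but", "yet")  # assumed pivot markers, not exhaustive


def classify_em_dash(sentence: str) -> str:
    """Guess which job the em-dash is doing in a single sentence."""
    parts = sentence.split(EM_DASH)
    if len(parts) == 1:
        return "no em-dash"
    if len(parts) == 3:
        # A pair of dashes bracketing content in the middle of the sentence.
        return "aside"
    tail_words = parts[-1].split()
    if len(parts) == 2 and tail_words:
        if len(tail_words) <= 3:
            return "dramatic addition"
        if tail_words[0].lower() in PIVOT_WORDS:
            return "pivot"
    return "none of the above"


examples = [
    "The study \u2014 which had a sample size of 4 \u2014 found exactly nothing useful.",
    "She gave up basically everything \u2014 and finished in fourth place.",
    "There's only one thing you need to do \u2014 stop.",
    "AI has transformed many industries \u2014 including healthcare, education, and finance.",
]
for example in examples:
    print(classify_em_dash(example))
```

Anything labeled "none of the above" is a candidate for Step 3. Anything given one of the three labels still deserves a read, because the heuristic only looks at shape, not at whether the effect lands.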

This whole pass takes about 10 minutes on a 1000-word document. It's not the most important edit — sentence length variance matters more — but it's quick, it improves readability, and it removes some of the structural AI fingerprinting.
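Since sentence length variance is the edit that matters more, here is a minimal sketch for measuring it. The naive sentence splitter and the use of population standard deviation are assumptions; detectors may compute this differently, and no threshold for "low" variance is given here.

```python
import re
from statistics import mean, pstdev


def sentence_length_stats(text: str) -> tuple[float, float]:
    """Return (mean, standard deviation) of sentence lengths in words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if not lengths:
        return 0.0, 0.0
    return mean(lengths), pstdev(lengths)


avg, spread = sentence_length_stats("One two three. One. One two three four five.")
print(avg, round(spread, 2))
```

The number is most useful as a before-and-after comparison on the same document: if the spread grows after an editing pass, the sentences have become less uniform.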

The Bigger Point

Punctuation tells matter because punctuation is where writing style lives at its most granular level. You can paraphrase AI content, restructure paragraphs, change vocabulary. But if you don't touch the punctuation patterns, the structural fingerprints remain.

The em-dash situation tells you something important about how AI writing works: the models have learned to mimic the appearance of good writing without understanding what makes good writing work. They use em-dashes because em-dashes look sophisticated in the training data. They don't know — and can't know — whether sophistication is what a particular sentence actually needs.

That gap between mimicking quality and understanding it is exactly where detection systems operate. It's also where careful human editors can still produce work that reads as unmistakably theirs.

Go count your em-dashes. If you find more than four in your last 1000 words, half of them probably don't belong there.


Dipak Bhosale

Founder & CEO · TextSight

Writing about AI detection, humanization, and the strange new craft of writing in 2026. Operates Lacewing Technologies from Maharashtra, India.
