Authenticity Score — How Natural Does Your Writing Read?

Methodology

How the score is calculated

The Authenticity Score is a weighted blend of five language signals. None of them is new individually, but together they produce a single number you can act on:

Burstiness: Variation in sentence length and complexity. Human writers naturally mix short and long sentences; AI tends toward a regular rhythm.
Perplexity: How surprising each next word is to a language model. AI-generated text picks the high-probability next word more often than human writing does, so it tends to score lower on perplexity.
Lexical diversity: Range and rarity of vocabulary. AI tends to repeat preferred phrases ("delve into," "moreover," "in conclusion").
Structural markers: Repeated paragraph shapes, predictable transitions, and the listicle / triplet patterns AI loves.
Model-specific fingerprints: Lexical signatures characteristic of GPT-4, Claude, Gemini, or Llama 3.

Each signal contributes a weighted component to the final 0-100 score. Weights are tuned against a benchmark of human-vs-AI-authored content. Like every AI detection signal, the score is probabilistic, so we publish the methodology, the benchmark setup, and the caveats openly.
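As a rough illustration of the weighted blend described above, here is a minimal Python sketch. The weights, the dictionary keys, and the assumption that each signal has already been normalised to a 0.0-1.0 "human-likeness" value are hypothetical. The tuned weights TextSight actually uses are not stated in this section.

```python
# Hypothetical weights for illustration only; the real tuned weights are not given here.
ASSUMED_WEIGHTS = {
    "burstiness": 0.25,
    "perplexity": 0.25,
    "lexical_diversity": 0.20,
    "structural_markers": 0.15,
    "model_fingerprints": 0.15,
}


def blend_signals(signals: dict[str, float],
                  weights: dict[str, float] = ASSUMED_WEIGHTS) -> int:
    """Combine per-signal values (0.0-1.0, higher = more human-like) into a 0-100 score."""
    missing = set(weights) - set(signals)
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Clamp each signal to 0.0-1.0 so one out-of-range value cannot push the score past 0-100.
    weighted = sum(weights[name] * min(max(signals[name], 0.0), 1.0) for name in weights)
    return round(100 * weighted / total_weight)
```

Dividing by the total weight keeps the result on the 0-100 scale even if the assumed weights do not sum to exactly 1.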

Read the full methodology at /accuracy-methodology.

Questions

Frequently asked

What is the Authenticity Score?

A 0-100 measurement of how natural and human-like a piece of text reads. Computed by TextSight on every AI Detector scan and every AI Rewriter rewrite. Higher means more human-like; lower means more AI fingerprints.

How is the score calculated?

A weighted blend of burstiness, perplexity, lexical diversity, structural patterns, and model-specific fingerprints. Weights tuned against a benchmark of human-vs-AI content. Read the full methodology.

What's a good Authenticity Score?

Depends on the stakes. 61+ (Mostly Human) for personal writing, 80+ for academic submissions, 81+ (Original) for compliance, legal, or journalism. At 40 or below, most readers and detectors will flag the text as AI-generated.

How is it different from the AI probability score?

AI probability answers "how likely is this to be AI-generated?" (higher = more AI). Authenticity Score answers "how natural does this read?" (higher = more human). The two are usually inversely correlated, but not perfectly: a sentence can be obviously AI-written and still flow well, or vice versa.

Will a high score guarantee my text passes a specific detector?

No. The score is computed against TextSight's own detector. A high score correlates with passing most other detectors, but no number guarantees a pass on any specific third-party tool. If you need to pass a specific detector, verify by re-scanning on that tool.

Can the score be wrong?

Yes. Like every AI detection signal, it's probabilistic. Heavily edited AI text can score high; deliberately stilted human writing can score low. We flag low-confidence scores on short or unusual text. Use it as a benchmark, not a verdict.

Where does the score appear?

Every AI Detector scan, every AI Rewriter rewrite, and every output from the 20+ free writing tools at /tools/. Anywhere TextSight processes text, you get the score.

Score Bands

Understanding the 5 Authenticity Score bands

Every 0-100 score lands in one of five colour-coded bands. The band is the verdict at a glance. The number tells you how much rewriting still separates the draft from your target. Same labels on Free, Starter, Pro, and Business.

Band 1 of 5

81 to 100

Original

The writing carries the kind of variation, voice, and small surprises that come from a human author who actually thought about it. Safe to ship for academic submissions, journalism, legal copy, or anything that needs to read as authentically authored. Most third-party detectors will also clear text in this band.

Band 2 of 5

61 to 80

Mostly Human

Reads as natural prose with a few sentences that lean toward AI rhythm. Fine for blog posts, internal docs, marketing copy, and most general use. If you are submitting to a strict detector or an editor who looks at AI signals, rewrite the highlighted sentences and aim for the Original band.

Band 3 of 5

41 to 60

Mixed

Roughly half AI-flavoured, half human. Common after a single AI rewriter pass on raw GPT or Claude output. Some detectors will flag, some will not. Treat this as a checkpoint, not a finish line. Run another rewrite pass or edit the orange and red sentences by hand.

Band 4 of 5

21 to 40

Likely AI

The text shows strong AI fingerprints: predictable transitions, repeated phrasing, even sentence rhythm, and the giveaway vocabulary models reach for. Most readers who read closely will sense it. Most detectors will catch it. Do not ship without a substantial rewrite.

Band 5 of 5

0 to 20

AI Generated

Reads as raw model output. Uniform sentence length, predictable structure, and the full vocabulary stack ("delve," "moreover," "in conclusion," "tapestry," "navigate"). Effectively every detector on the market will flag it. Send through the AI Rewriter or rewrite from scratch before any submission.
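The five bands above amount to a simple lookup from score to label. A minimal Python sketch, assuming the score arrives as a whole number (the function and constant names are illustrative, not part of any TextSight API):

```python
# (lower bound, label) pairs taken from the five bands described above, highest first.
BANDS = [
    (81, "Original"),
    (61, "Mostly Human"),
    (41, "Mixed"),
    (21, "Likely AI"),
    (0, "AI Generated"),
]


def band_for(score: int) -> str:
    """Return the band label for a 0-100 Authenticity Score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the last band starts at 0")
```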

Reference

How to read it

Interpreting your score

Use the band first, the number second. The band tells you what action to take. The number tells you how close you are to the next band. A 79 is two points short of Original, typically one rewrite pass away. A 62 is inside Mostly Human but only one point above Mixed. Pair every score with the sentence highlights so you know which lines to fix, not just how far you have to climb.
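The "how close to the next band" reading is plain arithmetic against the band boundaries. A small Python sketch, using the band lower bounds from the Score Bands section (the helper names are made up for illustration):

```python
from typing import Optional

# Lower bounds of the five bands, taken from the Score Bands section.
BAND_FLOORS = [0, 21, 41, 61, 81]


def points_to_next_band(score: int) -> Optional[int]:
    """Points needed to reach the next band up, or None if already in the top band."""
    higher_floors = [floor for floor in BAND_FLOORS if floor > score]
    return min(higher_floors) - score if higher_floors else None


def points_above_band_floor(score: int) -> int:
    """Points separating the score from the band below (0 means it sits on the floor)."""
    return score - max(floor for floor in BAND_FLOORS if floor <= score)


print(points_to_next_band(79))      # 2: two points short of Original
print(points_above_band_floor(62))  # 1: one point above the Mixed band
```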

Plan Picker

Plan picker: Authenticity Score on every tier

The Authenticity Score appears on every plan: Free, Starter, Pro, and Business. Tiers differ on monthly volume, file and URL upload, REST API access, and team seats. The score itself is the same calibrated 0-100 number with the same five bands and the same sentence-level highlights, wherever you scan.

Free

Best for: Trying the score on a handful of drafts before committing. Students checking one or two essays a week. Writers sanity-checking a single piece.

3 scans/day, 10,000-character lifetime cap on AI rewriter before signup. Full Authenticity Score with all five bands and sentence highlights on every scan.

Starter

$9.99/mo

Best for: Active students with 3 to 5 essays per week. Casual bloggers shipping a few posts weekly. Anyone who wants the score plus plagiarism risk on every piece.

20 scans/day, 20,000 AI rewriter words/month, Chrome extension, plagiarism risk indicator. Same Authenticity Score and bands as every other tier.

Frequently asked about the Authenticity Score

How exactly is the Authenticity Score calculated?

The score blends five weighted language signals into one 0-100 number: burstiness (variation in sentence length and structure), perplexity (how predictable each next word is for a strong language model), lexical diversity (range and rarity of vocabulary), structural markers (repeated paragraph shapes, predictable transitions, listicle and triplet patterns), and model-specific fingerprints (lexical signatures characteristic of GPT-4, Claude, Gemini, or Llama 3). Each signal contributes a weighted component, the weights are tuned against a benchmark of human-vs-AI authored content, and the result is the 0-100 score you see. The full methodology and benchmark setup live at /accuracy-methodology.
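To make three of these signals concrete, here is a Python sketch of textbook proxies for them: the coefficient of variation of sentence length for burstiness, the exponential of the average negative log-probability for perplexity, and a type-token ratio for lexical range. These are common definitions, not necessarily the formulas TextSight uses, and the token probabilities are assumed to come from some language model that is not shown.

```python
import math
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! and ? (a naive splitter) and count the words in each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length: standard deviation divided by mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


def perplexity(token_probs: list[float]) -> float:
    """exp of the average negative log-probability a language model assigned to each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words; captures range of vocabulary, not rarity."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0
```

Under these definitions, evenly sized sentences give a burstiness near 0, and text where every token was highly predictable gives a perplexity near 1.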

Can the Authenticity Score be wrong?

Yes. The score is a calibrated probabilistic estimate, not absolute truth. Heavily edited AI text can score high. Stilted or highly formal human writing (especially in formal academic registers, ESL contexts, or technical documentation) can score low. We surface a low-confidence flag on very short or unusual text so you know when to weight the number less. Use the score as a benchmark and a target, never as the sole basis for a high-stakes decision such as a grade, an invoice dispute, or a publication kill.

Why is my Authenticity Score 35 when I wrote every word myself?

Human writing scores low when it accidentally resembles AI output. The usual culprits: a very even sentence rhythm, heavy reliance on Latin-derived verbs and academic transitions ("moreover," "furthermore," "in conclusion"), parallel triplet structures, hedge phrases ("it is important to note that"), and uniform paragraph shapes. Open the sentence-level highlights, see which lines were flagged, and rewrite those specifically. Vary length, drop one or two transitions, and let one sentence run longer than feels comfortable. The score usually moves into the Mostly Human band after a single pass.
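For a quick self-check before rescanning, you can count how often the phrases quoted above appear in a draft. A minimal Python sketch; the phrase list contains only the examples from this answer, and a real detector would presumably use a much larger and weighted vocabulary:

```python
import re

# Only the phrases quoted in the text above; this list is illustrative, not exhaustive.
FLAGGED_PHRASES = [
    "moreover",
    "furthermore",
    "in conclusion",
    "it is important to note that",
]


def count_flagged_phrases(text: str) -> dict[str, int]:
    """Return each flagged phrase that occurs in the text with its number of occurrences."""
    lowered = text.lower()
    counts = {}
    for phrase in FLAGGED_PHRASES:
        # Word boundaries stop "moreover" from matching inside a longer word.
        hits = len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        if hits:
            counts[phrase] = hits
    return counts
```

A phrase showing up several times in a short draft is a reasonable place to start cutting transitions.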

How does the score handle mixed AI-and-human text?

The 0-100 number is computed across the whole passage, so a half-AI half-human draft typically lands somewhere in the Mixed band (41 to 60). The sentence-level highlights are where mixed text becomes useful: each sentence is colour-coded separately, so you can see which sentences read as human and which read as model output. Rewrite the flagged sentences, rescan, and watch the overall score climb. This is the most common workflow for editors handling AI-assisted drafts from contributors.
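The rewrite-and-rescan loop can be pictured as filtering sentences by their individual reading. The Python sketch below assumes each sentence carries its own 0-100 value and that anything below the Mostly Human floor (61) counts as flagged. Both are assumptions made for illustration: the text above only says that sentences are colour-coded.

```python
def sentences_to_rewrite(scored_sentences: list[tuple[str, int]],
                         threshold: int = 61) -> list[str]:
    """Return sentences whose (assumed) per-sentence score is below the threshold, in order."""
    return [sentence for sentence, score in scored_sentences if score < threshold]


# Hypothetical per-sentence readings for a three-sentence draft.
draft = [
    ("I missed the bus, so I walked.", 88),
    ("Moreover, it is important to note that walking offers numerous benefits.", 24),
    ("My shoes did not survive.", 91),
]
print(sentences_to_rewrite(draft))
```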

How does the Authenticity Score compare to GPTZero's score?

GPTZero reports an "AI probability" plus separate burstiness and perplexity figures. The Authenticity Score is the inverse of an AI probability (higher means more human), blended with additional signals such as vocabulary fingerprints and structural markers. Both tools draw on overlapping research, so scores often agree directionally: text TextSight labels Original usually clears GPTZero, and text TextSight labels AI Generated usually fails it. They are not interchangeable, though. Each detector is calibrated against a different benchmark, so do not assume one number maps onto the other. If you need to pass GPTZero specifically, verify there.

Is the score equally accurate across different content types?

No, and we publish the caveats. The score is most accurate on long-form prose (essays, articles, blog posts) of 300 words or more. Accuracy drops on very short text (one or two sentences), code, lyrics, formulaic copy (product descriptions, legal boilerplate), and ESL writing that follows tight stylistic conventions. Technical documentation often scores lower than its quality deserves because dense procedural prose naturally resembles AI rhythm. Treat scores on those categories as directional. A full type-by-type accuracy breakdown is on the methodology page.
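If you are triaging scores yourself, the caveats above can be turned into a simple "treat as directional" check. In this Python sketch the 300-word figure comes from the text. The content-type labels and the rule itself are assumptions, and this is not a description of how TextSight's own low-confidence flag works.

```python
MIN_RELIABLE_WORDS = 300  # the long-form threshold mentioned above

# Hypothetical labels for the categories the text lists as less reliable.
LESS_RELIABLE_TYPES = {"code", "lyrics", "formulaic_copy", "technical_docs"}


def treat_as_directional(text: str, content_type: str = "prose") -> bool:
    """True when the score should be read as a rough indication rather than a firm result."""
    too_short = len(text.split()) < MIN_RELIABLE_WORDS
    return too_short or content_type in LESS_RELIABLE_TYPES
```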

Can I export an Authenticity Score report?

Yes. Every scan can be exported as a PDF that includes the overall 0-100 score, the band label, the sentence-level breakdown with colour coding, the timestamp, and the model fingerprints detected. Pro and Business plans get the export unbranded (Business adds white-label support for client deliverables). Free and Starter exports carry a small TextSight footer. Reports are also available via the REST API on the Business plan for teams piping the score into their own dashboards or grading workflows.

What counts as "passing" an Authenticity Score check?

There is no universal pass mark. The right target depends on the stakes. For personal writing or internal docs, any score in the Mostly Human band (61 to 80) is fine. For most blog posts, marketing copy, and Substack pieces, target 70 or higher. For academic submissions, aim for 80 or higher and reread the highlighted sentences before submitting. For compliance, legal, journalism, or anywhere the writing must read as clearly human-authored, target the Original band (81 to 100). And remember: a high TextSight score correlates with passing other detectors but does not guarantee it. If a specific third-party tool is the gate, verify on that tool too.
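The targets in this answer can be kept as a small lookup so each draft is checked against the right bar. A Python sketch: the numbers are the ones stated above, while the use-case keys and function name are made up for illustration.

```python
# Minimum scores per use case, as stated in the answer above; the keys are illustrative.
TARGET_SCORES = {
    "personal_or_internal": 61,     # anywhere in the Mostly Human band
    "blog_or_marketing": 70,
    "academic": 80,
    "compliance_legal_journalism": 81,  # the Original band
}


def meets_target(score: int, use_case: str) -> bool:
    """Check a 0-100 score against the target for the given use case."""
    if use_case not in TARGET_SCORES:
        raise KeyError(f"unknown use case: {use_case!r}")
    return score >= TARGET_SCORES[use_case]
```

Meeting the target here says nothing about a third-party detector; if one of those is the gate, the draft still needs checking there.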

What the score is not