How AI Detectors Work: the signals behind a verdict.

An AI detector does not know who wrote your text. It measures statistical patterns (how predictable each word is, how much your sentences vary, the fingerprint of your style), and a model then converts those patterns into a probability that the passage was machine-generated. Below: what a detector actually does, the core signals it reads (perplexity, burstiness, stylometry, word distribution), the model types that turn signals into a score, why detectors disagree, and how TextSight scores text sentence by sentence so you can read a verdict critically.

The basics

What an AI detector actually does.

A detector is a classifier. It sorts text into "more likely human" or "more likely machine" by measuring how the words sit next to each other. It is not a lie detector and it cannot see authorship.

When you paste text into an AI detector, the tool is answering one narrow statistical question: do the patterns in this passage look more like the human writing it was trained on, or more like the machine-generated writing it was trained on? It outputs a probability, usually shown as a percentage, and a verdict if that probability crosses a threshold. That is the whole job. The tool has no record of who typed the words, no access to your draft history, and no way to "know" the answer the way a witness would.

This matters because the output is often read as a finding when it is really a measurement. A 78 percent AI score means the text shares enough surface features with the detector's machine-writing training set to land above its cutoff. It does not mean a person did or did not write it. Understanding the difference between a probability and a proof is the single most useful thing to take from this page, and it is why TextSight shows per-sentence evidence rather than only a headline number.

Every detector works in two stages. First it extracts features from the text, the statistical signals covered in the next section. Then it feeds those features into a model that has learned, from millions of labelled examples, where the boundary between human and machine writing tends to fall. The quality of a detector is the quality of those two stages, and its honesty lies in whether it tells you how it measured.
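To make the two stages concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any real detector's code: the two features, the `WEIGHTS` and `BIAS` values, and the 0.5 threshold are all invented for the example. A real tool learns its weights from labelled data.

```python
import math
import re


def extract_features(text: str) -> dict[str, float]:
    """Stage 1: turn raw text into a few numeric signals (toy versions)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        raise ValueError("text contains no sentences")
    lengths = [len(s.split()) for s in sentences]
    mean_len = sum(lengths) / len(lengths)
    variance = sum((n - mean_len) ** 2 for n in lengths) / len(lengths)
    return {"mean_sentence_len": mean_len, "sentence_len_std": math.sqrt(variance)}


# Hypothetical weights and bias: a real detector learns these from labelled examples.
WEIGHTS = {"mean_sentence_len": 0.05, "sentence_len_std": -0.40}
BIAS = 0.5


def ai_probability(features: dict[str, float]) -> float:
    """Stage 2: a logistic model maps the features to a probability."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def verdict(probability: float, threshold: float = 0.5) -> str:
    """The verdict is only the probability compared against a cutoff."""
    return "likely AI" if probability >= threshold else "likely human"


even = "The cat sat on the mat. The dog lay on the rug. The bird sat on the wire."
p = ai_probability(extract_features(even))
print(round(p, 2), verdict(p))
```

Note that nothing in either stage looks at who typed the text. Change the threshold and the same probability produces a different verdict.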

The mechanism

The core signals detectors measure.

Almost every detector reads some combination of four families of signal. None of them is proof on its own. Together they form the pattern a model is trained to recognise.

1. Perplexity

How predictable each next word is, given the words before it. AI writing tends to choose the statistically likely next word, so it has low perplexity. Detectors read low, smooth predictability as a generation signal. Careful human writers can produce low perplexity too, which is where false flags begin.
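Perplexity has a standard definition: the exponential of the average negative log-probability a language model assigns to each token. The sketch below assumes you already have those per-token probabilities; the two lists are invented numbers for illustration, not output from any real model.

```python
import math


def perplexity(token_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Invented probabilities, for illustration only.
predictable = [0.9, 0.8, 0.85, 0.9]  # the model expected each word
surprising = [0.2, 0.05, 0.3, 0.1]   # the model was often surprised

print(perplexity(predictable))  # close to 1: very predictable
print(perplexity(surprising))   # several times higher: less predictable
```

The value depends entirely on which language model supplies the probabilities, which is one reason two tools can measure the same text differently.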

2. Burstiness

The variance in sentence length and complexity across a passage. Humans are bursty: a four-word sentence next to a thirty-eight-word one. Machine writing is more even. Low burstiness, sentences of similar length and rhythm, is one of the strongest classical AI signals.
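There is no single agreed formula for burstiness. One simple proxy, assumed here for illustration, is the coefficient of variation of sentence length (standard deviation divided by mean). The sentence splitter is deliberately naive.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Word count of each sentence, using a naive punctuation-based split."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length: std / mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # one sentence has no variation to measure
    return statistics.pstdev(lengths) / statistics.mean(lengths)


even = "The cat sat on the mat. The dog lay on the rug. The bird sat on the wire."
bursty = "No. The storm that rolled over the hills kept everyone awake all night. We waited."

print(burstiness(even))    # 0.0: every sentence has the same length
print(burstiness(bursty))  # clearly above zero: short and long sentences mixed
```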

3. Stylometry

The style fingerprint of the text: vocabulary richness, punctuation habits, and function-word ratios (how often "the", "of", "and" appear). Human authors have idiosyncratic fingerprints. Machine output tends toward a flatter, more average profile, which a model can learn to spot.
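A toy fingerprint can be computed with nothing but the standard library. The three features below follow the ones named above; the short `FUNCTION_WORDS` list and the feature names are assumptions made for the example, and real stylometric systems use far larger feature sets.

```python
import re
import string

# Small illustrative list; real stylometry uses hundreds of function words.
FUNCTION_WORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "it"}


def style_fingerprint(text: str) -> dict[str, float]:
    """Return three simple style ratios for a passage."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    punctuation = sum(1 for ch in text if ch in string.punctuation)
    return {
        # vocabulary richness: distinct words / total words
        "type_token_ratio": len(set(words)) / len(words),
        # share of words that are common function words
        "function_word_ratio": sum(w in FUNCTION_WORDS for w in words) / len(words),
        # punctuation marks per word
        "punctuation_per_word": punctuation / len(words),
    }


print(style_fingerprint("The cat and the dog."))
```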

4. Word distribution and n-grams

The frequency of specific words and short word sequences (n-grams). Over-use of transition phrases like "Furthermore", "Moreover", and "It is important to note" shifts the distribution toward patterns common in machine output and formulaic instruction.
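Counting n-grams and stock transitions is simple to sketch. The phrase list below contains only the three examples named above, and the "per 100 words" rate is an assumed normalisation, not a standard one.

```python
import re
from collections import Counter

TRANSITIONS = ("furthermore", "moreover", "it is important to note")


def ngram_counts(text: str, n: int = 2) -> Counter:
    """Count every run of n consecutive words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def transition_rate(text: str) -> float:
    """Occurrences of the listed transition phrases per 100 words."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    # Plain substring counting: crude, but enough for an illustration.
    hits = sum(lowered.count(phrase) for phrase in TRANSITIONS)
    return 100 * hits / len(words)


sample = "Furthermore, the results hold. Moreover, it is important to note the limits."
print(ngram_counts(sample).most_common(3))
print(transition_rate(sample))  # 3 phrases in 12 words -> 25.0
```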

The key thing to hold onto is that these are statistical properties of text, not stylistic crimes. Low perplexity is not cheating; it is just consistency. Low burstiness is not dishonesty; it is just uniform rhythm, which formal academic training actively rewards. A detector cannot tell whether your even, vocabulary-consistent paragraph came from a language model or from a meticulous student who edits hard. That ambiguity is exactly why these signals produce errors, a topic we return to below in the section on why detectors disagree.

Under the hood

The three model types that turn signals into a score.

Once the signals are extracted, something has to convert them into a verdict. Detectors broadly fall into three families, and the family largely determines how a tool behaves on edge cases.

1. Perplexity and statistical scorers

The earliest and simplest detectors measure perplexity and burstiness directly and apply a threshold. They are fast, transparent, and need no large training set, but they are also the most fragile. They were largely calibrated on older model output, so they degrade as new models write with more variance, and they over-flag the careful human writing that happens to score low on the same axes.
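A threshold scorer of this kind fits in a few lines, which is both its appeal and its weakness. The cutoffs `max_ppl` and `max_burst` below are hypothetical values chosen for the example, not figures from any real tool.

```python
def statistical_verdict(perplexity: float, burstiness: float,
                        max_ppl: float = 20.0, max_burst: float = 0.3) -> str:
    """Flag text only when BOTH signals fall under their (hypothetical) cutoffs."""
    if perplexity < max_ppl and burstiness < max_burst:
        return "flag as likely AI"
    return "no flag"


# A careful human writer with smooth, even prose lands under both cutoffs...
print(statistical_verdict(perplexity=15.0, burstiness=0.25))  # flag as likely AI
# ...while a newer model writing with more variance slips past them.
print(statistical_verdict(perplexity=45.0, burstiness=0.80))  # no flag
```

The rule has no way to tell the two cases apart, because it only sees the two numbers.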

2. Fine-tuned transformer classifiers

The modern mainstream. A transformer model (often a BERT-family encoder) is fine-tuned on millions of paired human and AI samples to predict an AI probability directly, rather than reading a single hand-picked statistic. These classifiers learn subtler combinations of the signals above and generally outperform pure perplexity scorers on long-form text, at the cost of needing a large, well-curated training corpus and periodic retraining as models evolve.

3. Ensembles

The most robust approach combines signals rather than betting on one. In practice the strongest workflow is also an ensemble at the human level: running a passage through two independent detectors, where agreement is the strongest evidence and a single-tool verdict is the weakest. As TextSight's methodology puts it, ensemble use of two detectors remains the most reliable workflow.
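The two-detector workflow can be written down as a small decision rule. This is a sketch of the idea only: the function name, the default thresholds and the wording of the outcomes are assumptions, and each real tool has its own cutoff.

```python
def ensemble_verdict(score_a: float, score_b: float,
                     threshold_a: float = 0.5, threshold_b: float = 0.5) -> str:
    """Combine two independent detector scores into one cautious outcome."""
    flag_a = score_a >= threshold_a
    flag_b = score_b >= threshold_b
    if flag_a and flag_b:
        return "both flag: stronger evidence of AI"
    if not flag_a and not flag_b:
        return "neither flags: stronger evidence of human"
    return "tools disagree: inconclusive, review by hand"


print(ensemble_verdict(0.91, 0.84))
print(ensemble_verdict(0.78, 0.31))
```

Even when both tools agree, the result is still evidence rather than proof of authorship.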

TextSight sits in the transformer-classifier family: a fine-tuned encoder that predicts AI probability directly from the text, which is part of why its false positive rate stays low. It pairs that model with sentence-level analysis so the verdict comes with evidence, and it recommends ensemble agreement across two tools for any decision that carries consequences. We describe the architecture this way because it is how the rest of the site already describes it; we do not invent capabilities the product does not have.

Granularity

Document-level versus sentence-level scoring.

Most detectors return one number for the whole passage. That number averages over everything, which hides where the signal actually lives.

A document-level score is the easiest thing to produce and the hardest thing to act on. If a 1,200-word essay comes back at 64 percent AI, you have no idea whether that is one heavily machine-like paragraph dragging up four perfectly human ones, or an even spread across the whole text. The headline number cannot tell you, and neither can it tell a reviewer where to look. Averaging is exactly the operation that makes a verdict feel authoritative while removing the detail that would let you check it.
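A little arithmetic shows how two very different essays collapse to the same headline. The paragraph scores below are invented, and the length-weighted average is an assumption about how a document score might be formed; real tools may aggregate differently.

```python
def weighted_score(paragraphs: list[tuple[int, float]]) -> float:
    """Length-weighted mean of (word_count, ai_probability) pairs."""
    total_words = sum(words for words, _ in paragraphs)
    return sum(words * p for words, p in paragraphs) / total_words


# Two invented 1,200-word essays with very different shapes.
one_hot_spot = [(800, 0.92), (100, 0.08), (100, 0.08), (100, 0.08), (100, 0.08)]
even_spread = [(240, 0.64)] * 5

print(round(weighted_score(one_hot_spot), 2))  # 0.64
print(round(weighted_score(even_spread), 2))   # 0.64
```

Both essays report 64 percent, yet only the first one tells a reviewer which paragraph to read closely, and only if the per-paragraph numbers are shown.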

Sentence-level scoring solves this by running the analysis at a finer grain and highlighting which specific sentences carry the AI signal and which read as human. That turns an opaque percentage into something a person can actually read and challenge. For a writer, it points at the passages worth revising for clarity and voice. For an educator, it shows where to ask a follow-up question rather than where to accuse. TextSight's AI detector is built around this idea: the per-sentence breakdown is the product, and the headline score is just the summary.
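In outline, sentence-level scoring is a loop: split, score each sentence, mark the ones above a cutoff. The sketch below is not TextSight's implementation. `score_fn` stands for whatever model produces a probability, and `toy_score` is a deliberately crude stand-in so the example runs on its own.

```python
import re
from typing import Callable


def score_sentences(text: str, score_fn: Callable[[str], float],
                    threshold: float = 0.5) -> list[tuple[str, float, bool]]:
    """Return (sentence, probability, flagged) for every sentence in the text."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    results = []
    for sentence in sentences:
        probability = score_fn(sentence)
        results.append((sentence, probability, probability >= threshold))
    return results


def toy_score(sentence: str) -> float:
    """Stand-in scorer: NOT a real detector, it only reacts to stock transitions."""
    return 0.9 if sentence.lower().startswith(("furthermore", "moreover")) else 0.2


text = "I missed the bus. Furthermore, the data is robust. We laughed."
for sentence, probability, flagged in score_sentences(text, toy_score):
    marker = ">>" if flagged else "  "
    print(marker, f"{probability:.2f}", sentence)
```

The output points at one sentence instead of averaging it away, which is what lets a reader challenge the result.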

The honest part

Why detectors disagree, and where they fail.

No detector is perfect, and any vendor claiming zero errors is misrepresenting the problem. Knowing the failure modes is what lets you read a verdict for what it is.

False positives on formal and ESL writing

The biggest limit. Second-language academic writers and highly polished native writers both tend to produce low-perplexity, low-burstiness prose, the exact signals detectors read as machine output. The result is human writing flagged as AI, with second-language writers carrying the most risk. This is a structural property of the signals, not a malfunction. We cover it in depth on the AI detector false positives page.

Paraphrasing and humanizers

Running machine text through a paraphraser raises its perplexity and adds variance, which can lower a detector score. Detectors respond by training on paraphraser output, so the advantage erodes with every model update. Chasing a lower number is a moving target, which is why TextSight frames the work around understanding and improving honest writing, not evading a tool.

Short text

Below roughly 250 words a detector has too little signal to average over, and the score swings with a single rephrase. A short reply can read very differently on two scans with no edits at all. Short passages are where two tools disagree most, and where any single verdict deserves the least trust.
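One rough way to see why length matters is the standard error of an average, which shrinks with the square root of the sample size. Treating each word as an independent noisy measurement is a simplifying assumption made only for this illustration; it is not how any particular detector is built.

```python
import math


def standard_error(per_word_std: float, n_words: int) -> float:
    """Standard error of a mean over n independent measurements."""
    return per_word_std / math.sqrt(n_words)


# per_word_std = 1.0 is an arbitrary unit, chosen only to compare lengths.
for n_words in (50, 250, 1000):
    print(n_words, round(standard_error(1.0, n_words), 3))
```

Under this assumption a 50-word reply is more than twice as noisy as a 250-word passage, which fits the observation that short text is where scores swing most.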

Evolving models

Every new generation of language model shifts the statistical fingerprint a detector was trained to recognise. A detector tuned on last year's output performs worse on this year's, which is why measured accuracy drifts and why responsible tools re-test and retrain on a schedule rather than publishing one number forever.

Put together, these failure modes explain why two reputable detectors can return different scores on the identical passage. Different model, different training set, different threshold. The practical takeaway is the one TextSight states in its methodology: ensemble agreement across two independent tools is the strongest evidence, and a single-tool verdict is the weakest.

Our approach

How TextSight approaches detection.

The same mechanics, applied with sentence-level transparency and an honest account of the limits.

TextSight uses a fine-tuned transformer classifier that predicts AI probability directly from the text, and reports 99.2 percent accuracy on its public 1,000-document benchmark. We publish the methodology behind that number, including sample composition and threshold logic, because a number without a method is just marketing. The model is tuned with second-language academic prose deliberately in the training mix, which is part of why our false positive rate stays low on the population most often wrongly flagged.
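Sample composition matters because one accuracy figure can hide very different error rates. The two confusion matrices below are invented for illustration. They are not TextSight's results, and the source does not report how its errors split. Each has 992 correct calls out of 1,000.

```python
def rates(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy and error rates from a confusion matrix (AI = positive class)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),  # humans wrongly flagged
        "false_negative_rate": fn / (fn + tp),  # AI text missed
    }


# Hypothetical benchmark A: 500 AI and 500 human documents.
balanced = rates(tp=496, fp=4, tn=496, fn=4)
# Hypothetical benchmark B: 900 AI and only 100 human documents.
skewed = rates(tp=900, fp=8, tn=92, fn=0)

print(balanced)  # accuracy 0.992, false positive rate 0.008
print(skewed)    # accuracy 0.992, false positive rate 0.08
```

Both invented benchmarks score 99.2 percent, yet the second wrongly flags ten times as many human writers per hundred. The headline figure only means something alongside the published sample composition and threshold logic.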

What we do differently is refuse to stop at a headline percentage. Every scan returns a per-sentence breakdown so you can see exactly which sentences carry the AI signal, read the verdict critically, and decide what to do with it. We also tell you plainly that no detector is infallible and that ensemble agreement across two tools beats any single score. The goal is understanding and better honest writing, not evading a tool. If you want the full measurement detail, the accuracy methodology page documents how we test and re-test.

FAQ

How AI detectors work, frequently asked.

Can AI detectors be wrong?
Yes. Every AI detector is a probabilistic classifier, so it produces both false positives (human writing flagged as AI) and false negatives (AI writing it misses). The error rate is structural, not a bug that can be patched out, because the signals detectors rely on (low perplexity, even sentence length, formulaic transitions) also appear in genuine human writing, especially polished academic prose and second-language writing. Any single verdict should be read as a probability against a calibration set, not as proof of authorship.
Do AI detectors store my text?
It depends on the tool, and you should read each vendor's privacy policy. A detector has to send your text to a model to score it, but storage and reuse for training are separate decisions. TextSight processes text to return a verdict and does not sell your content or use scanned passages to train models without consent. If data handling matters for your workflow, check the provider's privacy and data-retention documentation before pasting confidential material into any detector.
Can AI detectors tell which model wrote the text?
Mostly no, and you should be sceptical of tools that claim otherwise. Detectors are trained to separate human from machine writing, not to attribute a specific model like GPT-4 or Claude. Some tools guess a likely source family, but accuracy drops sharply across model boundaries and as new models are released, because the statistical fingerprint shifts. Treat any "written by model X" label as a low-confidence guess rather than a finding.
Are AI detectors accurate?
Accuracy varies widely by tool, by text length, and by who wrote it. On clean native-English long-form writing, leading classifier-based detectors perform well. On short passages, technical writing, and second-language prose, false positive rates climb. TextSight reports 99.2 percent accuracy on its public 1,000-document benchmark, and still recommends treating any single verdict as one signal. The honest claim for a detector is that it is calibrated and transparent, not that it is perfect.
Can paraphrasing or a humanizer beat a detector?
Paraphrasing can lower a detector score because it raises perplexity and adds variance, but it does not change who wrote the underlying ideas, and modern detectors are increasingly trained on paraphraser output. Chasing a lower score is a losing game as models update. A more durable approach is to use sentence-level feedback to understand which parts of your own writing read as machine-like, then revise for clarity and voice rather than to evade a tool.
Why do two AI detectors give different scores on the same text?
Each detector uses a different model, a different training set, and a different flag threshold, so the same passage can land above the cutoff on one tool and below it on another. Short text amplifies the disagreement because the score swings with a single rephrase. This is exactly why ensemble agreement across two independent detectors is stronger evidence than any single verdict, and why TextSight shows per-sentence highlights so a human can read the result critically.
See the signals on your own text. Free, no signup.

TextSight's free tier gives you three scans a day at 5,000 characters per scan, with sentence-level highlights so you can see exactly which sentences carry the AI signal and why. No card, no email, no commitment.

Sentence-level highlights · ESL-aware calibration · Transparent methodology · No signup for the free tier