Overview
The short version:
- An AI-detection score is a probability estimate, not a definitive verdict.
- Both false positives (human writing flagged as AI) and false negatives (AI writing read as human) are possible with any detector.
- A score should be treated as evidence to review, not as proof.
- It should never be the sole basis for an academic or employment penalty.
TextSight's detector estimates how likely a piece of text was generated by an AI model, based on patterns in the writing. It's a useful signal — but like every AI detector on the market, it works in probabilities, not certainties. This page explains exactly what our results can and cannot tell you, so you can use them fairly.
What the score means
A TextSight result is an estimated likelihood, expressed as a score, that text shows patterns commonly associated with AI-generated writing. We also surface sentence-level signals so you can see which parts of a document contributed to the result.
- A higher score means the text more closely resembles patterns typical of AI-generated content.
- A lower score means it more closely resembles patterns typical of human writing.
- The score is best used as a starting point for review — a prompt to look closer, ask questions, or request a draft history — not as a conclusion on its own.
What it does not mean
A high score is not proof that text was written by AI, and a low score is not proof that it wasn't. No AI detector can prove authorship.
- It is not a verdict, a confession, or evidence of intent.
- It is not grounds for automatic action — failing a student, rejecting a candidate, or taking down content — without human review.
- It does not measure the quality, correctness, or originality of the writing.
Why false positives happen
A false positive occurs when genuinely human writing is flagged as AI-like. This happens because some human writing naturally shares the statistical patterns that detectors look for. Common situations include:
- Non-native English writers, whose phrasing can be more measured and formulaic.
- Formulaic, technical, or templated writing — legal, scientific, academic, or instructional text that follows strict conventions.
- Short texts, where there isn't enough signal to judge confidently.
- Heavily edited or polished writing, including text run through grammar tools.
- Common, widely used phrasing that appears frequently in the text AI models are trained on.
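To see why even a small false-positive rate matters, here is a rough arithmetic sketch. All three input figures are invented for illustration; they are not measured TextSight rates, and the function name is hypothetical. It estimates what share of flagged texts would turn out to be human-written.

```python
def human_share_of_flags(ai_share: float, detection_rate: float,
                         false_positive_rate: float) -> float:
    """Fraction of flagged texts that were actually written by a human."""
    # AI-written texts that the detector correctly flags.
    flagged_ai = ai_share * detection_rate
    # Human-written texts that the detector flags by mistake.
    flagged_human = (1.0 - ai_share) * false_positive_rate
    total = flagged_ai + flagged_human
    if total == 0:
        raise ValueError("no texts would be flagged under these rates")
    return flagged_human / total


# Invented figures for illustration only: 10% of texts are AI-written,
# 90% of those are flagged, and 2% of human texts are flagged by mistake.
share = human_share_of_flags(ai_share=0.10, detection_rate=0.90,
                             false_positive_rate=0.02)
print(f"{share:.1%} of flagged texts are human-written")
```

Under these assumed figures, roughly one flagged text in six belongs to a human writer, which is why a flag on its own cannot settle who wrote something.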
For more detail and examples, see AI detector false positives.
Why false negatives happen
A false negative occurs when AI-generated text is read as human. This limitation is just as real as false positives. Common situations include:
- Edited or paraphrased AI output — once a person rewrites parts of AI text, it can read as human.
- Mixed authorship — documents that blend human and AI writing are inherently harder to judge.
- Newer or unusual models whose patterns differ from what any detector has seen.
Using results responsibly
Our recommendation: treat the score as one input among many. Pair it with context, your own judgment, and, where appropriate, a conversation with the writer.
- Always apply human review. A person should make the final call, not the score.
- Look at context — drafts, version history, the writer's track record, and the assignment or brief.
- Give the writer a chance to explain. A high score can be a reason to ask, not to accuse.
- Don't rely on a single tool. Different detectors disagree; no result is final.
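For teams that wire detection scores into a workflow, the guidance above could be sketched roughly as follows. This is a minimal illustration, not TextSight's API: the 0-to-1 score scale, the threshold, the minimum word count, and every name in the block are assumptions. The point it shows is that a score can only ever route a text to a person, never trigger a penalty.

```python
from dataclasses import dataclass

# Hypothetical values chosen for illustration; not TextSight's thresholds.
REVIEW_THRESHOLD = 0.8  # assumes scores are reported on a 0-to-1 scale
MIN_WORDS = 150         # below this, treat the result as inconclusive


@dataclass
class Triage:
    action: str  # either "human_review" or "no_action"
    reason: str


def triage(score: float, word_count: int) -> Triage:
    """Map a detection score to a next step for a human reviewer."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if word_count < MIN_WORDS:
        # Short texts carry too little signal to judge confidently.
        return Triage("no_action", "text too short to judge confidently")
    if score >= REVIEW_THRESHOLD:
        # A high score is a reason to ask, not to accuse.
        return Triage("human_review",
                      "check drafts and context, then talk to the writer")
    return Triage("no_action", "score alone gives no reason to look closer")


print(triage(0.95, 600).action)
print(triage(0.95, 40).action)
```

There is deliberately no "penalize" or "reject" outcome: the strongest result the sketch can produce is a request for human review, and the reviewer still makes the final call.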
For educators & institutions
If you use TextSight in a classroom or organization, we strongly encourage you to build a fair process around it:
- Do not use a detection score as the sole basis for an academic-integrity decision or disciplinary action.
- Combine it with your institution's own policies, evidence, and a human review step.
- Be transparent with students and staff about how detection is used.
See our Responsible AI Use policy for how we ask all users to apply these tools.
Further reading