Paste your text and get a 0-100 AI probability score, sentence-level highlights ranked by suspicion, and an Authenticity Score that tells you how natural it reads. One scan covers GPT-4, Claude, Gemini, and Llama.
No credit card · No signup for your first scan · Up to 99.2% accuracy in our internal benchmark
Paste your draft, upload a PDF or DOCX, or point at a URL. We score it across 50+ language signals trained on GPT-4, Claude 3.x, Gemini, and Llama 3 — and ship a one-click AI rewriter in the same scan to fix what's flagged.
Open the AI Detector →
Drop text directly, upload a PDF or DOCX file, or paste a URL. Anything under 5,000 characters runs free. Larger drafts run on paid plans.
50+ language signals — burstiness, perplexity, lexical patterns, and structural markers — checked against OpenAI, Anthropic, Google, and Meta output fingerprints.
You get an overall AI-probability score, an Authenticity Score, sentence-level highlights ranked by suspicion, and a built-in AI rewriter that rewrites only the flagged sentences.
Run your draft before submission. See which sentences look AI-generated and rewrite them in the same scan — without uploading to a third-party AI rewriter first.
Check guest posts and agency-written drafts before publishing. The Authenticity Score gives you a measurable benchmark to track improvements as you rewrite.
Sentence-level highlights give you specific evidence to discuss with students — not a single overall score that's hard to defend.
Bulk-scan deliverables before they ship. API access on the Business plan ($39.99/mo, or $29.99/mo billed annually) covers 10,000 calls/mo, about 6× cheaper than Originality.ai's Enterprise tier.
Up to 99.2% accuracy in our internal benchmark against GPT-4, GPT-4o, Claude 3.x, Gemini, and Llama 3 outputs. Results may vary with content type, writing style, AI model, and editing level. AI detection is probabilistic — no detector is perfect, and we publish our caveats openly.
What works well: long-form English text (100+ words), recent commercial models (GPT-4/4o, Claude 3.5/3.7, Gemini, Llama 3), and unedited or lightly edited AI output.
Where to be cautious: short snippets (under 50 words), heavily human-edited AI text, non-English languages, and text that intentionally mimics a specific human author's voice. We flag low-confidence scores in the result so you know when not to rely on them.
TextSight analyzes your text against signals trained on GPT-4, GPT-4o, Claude 3.x, Gemini, and Llama 3 outputs. It returns an overall AI-probability score (0-100), sentence-level highlights ranked by suspicion, and an Authenticity Score that tells you how natural the text reads. The scan typically completes in under three seconds.
Yes. The free tier gives you 3 scans per day with the same detector and Authenticity Score as the paid plans. No credit card and no signup are required for your first scan. Higher daily limits start at $9.99/mo on the Starter plan.
Up to 99.2% accuracy on our internal benchmark of GPT-4, Claude 3.x, Gemini, and Llama 3 outputs. Results may vary with content type, writing style, AI model, and editing level. AI detection is probabilistic — read /accuracy-methodology for our test setup, false-positive rate, and the caveats we apply to every score.
Yes. The detector is trained across all four major model families — OpenAI (GPT-3.5, GPT-4, GPT-4o), Anthropic (Claude 3, 3.5, 3.7), Google (Gemini), and Meta (Llama 3). Newer model releases are added to the training set as they ship. We also publish a dedicated ChatGPT detector page focused on OpenAI-specific signals.
Detection tells you which parts of your text read as AI-generated. Authenticity rewrites the flagged sentences so they read more naturally — TextSight ships both in the same scan. After you get a score, one click runs the AI Rewriter on just the flagged sentences and re-scores.
The detector is English-focused. It works on other Latin-alphabet languages but accuracy drops outside English. For comprehensive multilingual detection, dedicated multilingual tools are a better fit.
Free scans are processed in-memory and discarded; we keep only the score and metadata in your history. Paid plans retain the full text for a window that ranges from 30 days to unlimited, depending on the plan. See /privacy for the full retention table.
The detector needs at least 50 words of signal to be reliable. 100+ words is the recommended minimum for confident scoring. Below 50 words, the score is shown but flagged with low confidence.
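As a rough illustration of the length rule above, the Python sketch below labels an input by word count. The 50-word and 100-word cutoffs come from the text; the function name, the label strings, and whitespace-based word counting are assumptions made for the example, not TextSight's actual implementation.

```python
LOW_CONFIDENCE_BELOW = 50   # from the text: under 50 words is flagged low confidence
RECOMMENDED_MINIMUM = 100   # from the text: 100+ words recommended for confident scoring


def confidence_label(text: str) -> str:
    """Return a hypothetical confidence label based only on word count."""
    word_count = len(text.split())  # assumption: words are whitespace-separated
    if word_count < LOW_CONFIDENCE_BELOW:
        return "low"
    if word_count < RECOMMENDED_MINIMUM:
        return "moderate"  # assumed label for the 50-99 word range
    return "confident"


if __name__ == "__main__":
    sample = "This is a short snippet."
    print(confidence_label(sample))
```

In practice a detector would likely combine length with other signals; this only shows how the two published cutoffs partition inputs.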
Checking text from a specific model? Each page is tuned to the signals that model leaves behind.
ESL-aware detection tuned to how English is taught and written in each market.
Role- and model-specific guidance for the way you actually work.
Step-by-step guides for checking and spotting AI text.
Most detectors hand you one opaque score and ask you to trust it. TextSight shows the signals, the bands, and the limits. Here are the six design choices that drive every scan.
The detector is calibrated on outputs from every major commercial model family: GPT-4 and GPT-4o (OpenAI), Claude 3.5 and 3.7 (Anthropic), Gemini (Google), Llama 3 (Meta), and Mistral. Single-vendor detectors can miss Claude or Mistral content because they were trained only on GPT samples. TextSight covers all five families in one scan.
A single overall score hides the truth. TextSight scores every sentence independently and colour-codes the result, so you can see which lines actually triggered the verdict. That matters when a student needs to defend an essay or an editor needs to rewrite only the offending sections instead of starting over.
Human writing varies. Sentences expand, contract, surprise. AI writing tends to settle into a steady, predictable rhythm. The detector measures perplexity (how surprising each token is) and burstiness (how much sentence length and complexity vary), alongside dozens of lexical and structural markers. It is statistical pattern recognition, not magic.
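To make the two signals concrete, here is a minimal Python sketch using common textbook definitions: perplexity as the exponential of the average negative log-probability per token, and burstiness as the coefficient of variation of sentence lengths. These formulas, the function names, and the naive sentence splitter are assumptions for illustration; the text does not specify how TextSight computes either signal, and the token probabilities would have to come from a language model that is not shown here.

```python
import math
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on sentence-ending punctuation and count words per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (0.0 means perfectly uniform)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


def perplexity(token_probs: list[float]) -> float:
    """exp of the mean negative log-probability of the observed tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


if __name__ == "__main__":
    varied = "Short one. This sentence is quite a bit longer than the first."
    print(round(burstiness(varied), 3))
    print(perplexity([0.5, 0.5]))
```

Under these definitions, lower perplexity and lower burstiness both point toward more uniform, predictable text, which is the pattern the paragraph above associates with AI writing.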
A 50-word snippet and a 5,000-word essay are not the same problem. Short text is noisy; long text accumulates signal. The detector adapts its confidence threshold based on input length, flags low-confidence scores below 50 words, and processes documents up to the upload cap on paid plans without losing the cross-sentence context that drives accuracy.
Academic prose reads differently from a Substack post, and both differ from a Slack message. The training set spans academic essays, journalistic features, conversational blog posts, technical documentation, and creative fiction. That breadth keeps false positives lower on ESL writing, STEM papers, and tightly edited corporate copy than detectors trained on a single register.
Every scan returns one of five clear bands: Original, Mostly Human, Mixed, Likely AI, AI Generated. Borderline scores are labelled as such, low-confidence inputs are flagged before you read the verdict, and the methodology page documents the test set and false-positive rate. You see the number and the caveat together, not the number alone.
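The banding step can be pictured as a simple lookup from the 0-100 score to one of the five labels. The band names below are taken from the text, but the cutoff values (evenly spaced at 20, 40, 60, 80) and the 3-point borderline margin are assumptions chosen for the sketch; the actual thresholds are not stated here.

```python
from bisect import bisect_right

BANDS = ["Original", "Mostly Human", "Mixed", "Likely AI", "AI Generated"]
CUTOFFS = [20, 40, 60, 80]   # assumed band boundaries, not published values
BORDERLINE_MARGIN = 3        # assumed: within 3 points of a boundary is borderline


def classify(score: float) -> tuple[str, bool]:
    """Map an AI-probability score to (band, is_borderline)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    band = BANDS[bisect_right(CUTOFFS, score)]
    borderline = any(abs(score - cutoff) <= BORDERLINE_MARGIN for cutoff in CUTOFFS)
    return band, borderline


if __name__ == "__main__":
    for score in (5, 41, 50, 97):
        print(score, classify(score))
```

Returning the borderline flag alongside the band mirrors the idea of showing the number and the caveat together.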
Four tiers, one detector. The right plan usually comes down to two questions: how many documents do you scan per week, and do you need the REST API for product or agency use?
Best for: students sanity-checking essays a few times a week, writers spot-checking a single piece before submission, and anyone evaluating the detector before committing.
3 scans/day, no signup for your first scan, full sentence-level highlights, Authenticity Score, and the one-click AI rewriter up to a 10,000-character lifetime cap.
Best for: active students with three to five essays a week, casual bloggers running drafts through detection before publish, and teachers running ad-hoc spot checks.
20 scans/day, 20,000 AI rewriter words/month, Chrome extension, file and URL upload, plagiarism risk indicator, full history retention.
Best for: freelance content writers shipping 10+ deliverables a week, daily newsletter authors, and SEO writers running every draft through detection before invoicing.
Unlimited scans, 50,000 AI rewriter words/month, file and URL upload, priority support, white-label PDF export, and 30-day full-text retention.
Best for: SEO agencies auditing client work, EdTech and SaaS teams embedding detection via API, and educators scanning whole-class submission batches.
100,000 AI rewriter words/month, REST API access with 10,000 calls/month, 5 team seats, white-label reports, and unlimited full-text history.
Annual billing saves 25%, dropping Pro to $14.99/mo and Business to $29.99/mo. See full pricing →
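The discount arithmetic can be checked with a short Python sketch. The 25% saving and the $39.99 monthly Business price appear in the text; the $19.99 monthly Pro price is inferred by working backwards from the $14.99 annual figure and should be treated as an assumption, as should the function name.

```python
from decimal import Decimal, ROUND_HALF_UP

ANNUAL_DISCOUNT = Decimal("0.25")  # from the text: annual billing saves 25%


def annual_monthly_price(monthly_price: str) -> Decimal:
    """Effective per-month price under annual billing, rounded to cents."""
    discounted = Decimal(monthly_price) * (Decimal("1") - ANNUAL_DISCOUNT)
    return discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Pro monthly price is inferred (assumption); Business monthly is from the text.
    for plan, monthly in {"Pro": "19.99", "Business": "39.99"}.items():
        print(plan, annual_monthly_price(monthly))
```

Decimal is used instead of float so that the cent rounding is exact.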
The questions that matter most when picking a detector. Direct answers, no marketing dodges.
On long-form English content over 300 words against current commercial models (GPT-4, GPT-4o, Claude 3.5/3.7, Gemini, Llama 3, Mistral), our internal benchmark sits at roughly 88 to 92 percent accuracy. On short-form text under 100 words it drops to roughly 70 to 78 percent because there is simply less signal to analyse. We report the band-level verdict so you can interpret confidence directly. Anyone advertising 99 percent accuracy across all content lengths is overclaiming, and we publish the test methodology so you can verify the numbers.
Our internal false-positive rate on human-written text is approximately 2 to 4 percent across general writing samples. That is not zero, which is why TextSight ships sentence-level evidence rather than a single overall verdict. When a borderline score comes back, you can see which specific sentences triggered the signal and decide whether they actually look AI-generated or whether the writer simply has a formal style. The methodology page documents how we measure the rate and on which test corpora.
Better than detectors trained only on native-English corpora, but ESL writing still carries higher false-positive risk because non-native English often shares some surface features with AI text (simpler sentence structures, more consistent vocabulary, fewer idioms). The training set includes a substantial ESL sample, which keeps the false-positive rate lower than that of competing detectors, but we still recommend ESL writers rely on the sentence-level highlights and Authenticity Score rather than the overall band when stakes are high.
Technical writing is harder for every detector because dense vocabulary, formulaic structures, and conventionalised phrasing look statistically similar to AI output. TextSight is calibrated against a technical-writing subset (research papers, documentation, white papers) so the false-positive penalty is reduced, but STEM authors should treat borderline scores as a prompt to inspect the highlighted sentences rather than as final judgements. The Authenticity Score is often more useful than the AI probability for technical content.
The detector is retrained on a rolling basis as major model releases land. When OpenAI, Anthropic, Google, Meta, or Mistral ship a new flagship version, we capture sample outputs across a range of prompts and writing styles, evaluate detection drift, and roll an updated detector typically within 4 to 8 weeks of release. The methodology page lists the currently-supported model versions and the date of the most recent training refresh.
The detector is English-first. It runs on other Latin-script languages (Spanish, French, German, Portuguese, Italian) but accuracy drops because the training corpus is heavily English. For Cyrillic, Arabic, CJK, or Devanagari scripts we recommend a dedicated multilingual detector. If you need detection in a specific non-English language, contact support and we will share the calibration data we have for that language so you can decide whether it meets your bar.
Yes. REST API access is included on the Business plan ($39.99/mo) with 10,000 calls per month, full sentence-level response payloads, SSE streaming for the AI rewriter endpoint, and documented rate limits. EdTech platforms, LMS vendors, agency CMS integrations, and content-quality tools all use the same endpoints the dashboard uses. The full reference is at api-docs, including authentication, request/response schemas, and example payloads.
Detection accuracy is strongest on the major commercial models that dominate the training set: GPT-4, GPT-4o, Claude 3.5, Claude 3.7, Gemini 1.5/2.0, Llama 3, and Mistral Large. Accuracy is solid on smaller variants (Claude Haiku, GPT-4o-mini, Gemini Flash) and degrades gradually on heavily fine-tuned open-source models and on outputs that have been passed through an aggressive third-party AI rewriter. The result panel labels the highest-likelihood model family when the signature is clear, so you know which model the text most resembles.
Detect AI. Fix AI. One tool. 3 free scans/day, no card required.