No detector hands you a single-shot free scan on a 50,000 word thesis, and the ones that pretend to are quietly truncating your input or hiding the result behind a credit-card form. TextSight free is upfront about the constraint and gives you a working path through it. You get 3 scans per day with up to 5,000 characters per scan, an 11-format file extractor that takes the whole document up front, and a chunking workflow that splits a long file into 5,000 character sections you scan over a day or across several days. Each chunk returns the same Authenticity Score, band, and sentence-level highlights that Pro returns. Average the per-chunk scores and you have a document-level read on a file that no honest free tier would scan in one click.
Free is the right starting point for a one-off large file. For routine large-file work, Pro lifts the per-scan ceiling and the daily cap. Yearly billing saves 25%. Full details on the pricing page.
Yearly totals by tier: $89.88/year (save $30), $179.88/year (save $60), or $359.88/year (save $120).
Large is a slippery word across detector tools. We measure in characters because that is what the classifier reads, and we say plainly what fits a single free scan and what does not.
A small file is a single blog draft, a cover letter, an email reply, a class assignment, or a short text-layer essay of about three double-spaced pages: anything up to 5,000 characters. The whole document fits in one free scan, no chunking required, and the daily 3-scan budget covers a draft, a revision, and a final.
A medium file is a 1,000 to 5,000 word article, a long-form pitch script, a ten-page paper, or a research-note bundle: roughly 6,000 to 30,000 characters, or two to six chunks. A motivated free user covers a medium file in one or two days, spending the three-scan daily budget at most twice.
Large files are research papers past 5,000 words, full theses at 50,000 to 80,000 words (roughly 300,000 to 500,000 characters), dissertation chapters, long reports, RFP responses, white papers, and book manuscripts. None fits a single free scan. All are scannable through the chunking workflow across multiple days, and all are one-upload territory on Pro.
No teaser, no countdown timer, no paywall on the result. Just a clearly bounded budget that maps onto real chunked workflows.
Three full scans a day, each up to 5,000 characters of input. The budget resets at midnight UTC. A single 5,000 character chunk is about 750 to 900 English words, which is roughly three pages of double-spaced essay text or one and a half to two pages of single-spaced body copy. The daily budget covers up to 15,000 characters of scanned text per day before you hit either ceiling.
The /api/file/extract endpoint built on officeparser v7 accepts .docx, .pdf, .txt, .md, .rtf, .odt, .epub, .html, .pptx, .xlsx, and .csv. Extraction is metered separately from detection, so you can upload the full large file once, watch the textarea fill with the extracted prose, then chunk it into 5,000 character pieces inside the editor without burning a detection scan.
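A quick client-side pre-check can save an upload attempt on an unsupported file. The sketch below is only an illustration built from the format list above; the names `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical and are not part of TextSight or officeparser.

```python
from pathlib import Path

# The 11 formats listed for the file extractor (taken from the text above).
SUPPORTED_EXTENSIONS = {
    ".docx", ".pdf", ".txt", ".md", ".rtf", ".odt",
    ".epub", ".html", ".pptx", ".xlsx", ".csv",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("Thesis.DOCX"))  # case-insensitive match on the extension
print(is_supported("scan.png"))
```

The check looks at the extension only. It says nothing about whether a given file, such as an image-only PDF, contains extractable text.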
Each chunked scan runs the same classifier that powers Pro, the Chrome extension, and the WordPress plugin. The Authenticity Score, the band label, the sentence-level colour map, and the top AI tells render identically. There is no quality tier on the model itself; what changes between free and Pro is the daily cap and the per-scan ceiling, not the depth of the result.
Past 5,000 characters in a single paste the input stops accepting. Past three scans in a day the next request prompts for an upgrade or a wait until midnight UTC. Nothing is silently truncated; you always see exactly how many characters the classifier read.
A repeatable workflow that scales to any document length on the free tier. Patient by design, identical in per-chunk depth to a Pro scan.
Open app.textsight.ai and drop the large file onto the upload area. The 11-format file-extract endpoint pulls clean text from the document and fills the textarea. This step costs nothing against the detection budget. For a typical 100,000 character report the textarea fills in about three to four seconds.
Structure-first splits hold up better than blind character splits because they preserve sentence boundaries. For an academic paper that is abstract, introduction, literature review, methodology, results, discussion, conclusion. For a report it is executive summary, sections, appendices. Any section longer than 5,000 characters splits once more by subsection heading or by paragraph block until each chunk is under the ceiling.
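For anyone who prefers to script the split, here is a minimal sketch of the idea. It assumes paragraphs are separated by blank lines and packs whole paragraphs into chunks of at most 5,000 characters, falling back to sentence ends and then to a hard character cut only when a single paragraph is too long. The function names are hypothetical, and heading-aware splitting by section is left to the reader.

```python
import re

CHUNK_LIMIT = 5_000  # free-tier per-scan ceiling, in characters


def split_oversized(block: str, limit: int) -> list[str]:
    """Break one over-long block at sentence ends, hard-cutting only if needed."""
    pieces, current = [], ""
    for sentence in re.split(r"(?<=[.!?])\s+", block):
        # A single sentence longer than the limit gets a hard character cut.
        while len(sentence) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces


def split_into_chunks(text: str, limit: int = CHUNK_LIMIT) -> list[str]:
    """Pack blank-line-separated paragraphs into chunks of at most `limit` characters."""
    chunks, current = [], ""
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= limit:
            parts = [paragraph]
        else:
            parts = split_oversized(paragraph, limit)
        for part in parts:
            candidate = f"{current}\n\n{part}" if current else part
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = part
    if current:
        chunks.append(current)
    return chunks
```

Greedy packing keeps paragraphs whole wherever possible, so sentence boundaries survive in all but the rare case of a single sentence longer than the ceiling.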
Paste the first chunk into the scan box. The classifier returns an Authenticity Score in roughly 30 seconds. Save the result with the section name as the tag. Repeat for the second and third chunks of the day. The daily budget resets at midnight UTC; on day two you scan the next three chunks.
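The schedule is simple arithmetic, sketched below under the limits stated above (5,000 characters per scan, three scans per day). The function name `plan_scans` is hypothetical, and the result is a lower bound: structure-first splitting usually produces a few more chunks than a pure character count suggests.

```python
import math

CHARS_PER_SCAN = 5_000  # free-tier per-scan ceiling
SCANS_PER_DAY = 3       # free-tier daily budget


def plan_scans(total_chars: int) -> tuple[int, int]:
    """Return (minimum chunk count, minimum days) for a document of total_chars."""
    chunks = math.ceil(total_chars / CHARS_PER_SCAN)
    days = math.ceil(chunks / SCANS_PER_DAY)
    return chunks, days


chunks, days = plan_scans(100_000)  # the 100,000 character report used as an example
print(f"{chunks} chunks over {days} days")
```

For the 100,000 character report this gives at least 20 chunks spread over at least 7 days.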
A two-column note works fine: section name, Authenticity Score. After a full pass you have a per-section map. The map alone is often more useful than the average, because AI generation patterns concentrate in predictable places (lit reviews, conclusions, abstracts) and the per-chunk view shows you exactly where to focus the rewrite.
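If the note lives in a script rather than a spreadsheet, ranking it takes a few lines. The sketch below uses made-up section scores and a hypothetical helper, `rank_sections`. It assumes that higher scores mark the sections to revisit first, in line with the guidance on high-scoring sections.

```python
def rank_sections(section_scores: dict[str, float], top: int = 3) -> list[tuple[str, float]]:
    """Return the `top` sections with the highest scores, highest first."""
    ranked = sorted(section_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Hypothetical per-section map: section name -> Authenticity Score.
section_map = {
    "abstract": 71,
    "introduction": 48,
    "literature review": 83,
    "methodology": 35,
    "conclusion": 77,
}

for name, score in rank_sections(section_map):
    print(f"{name}: {score}")
```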
Simple average for equal-sized chunks; character-weighted average for uneven chunks (multiply each score by its character count, sum, divide by total characters). The resulting document-level number is what a single-shot scan would have returned if the per-scan ceiling were not a constraint, within a small classifier variance.
Four document types that come up repeatedly in free-tier large-file traffic, with the chunk count to expect on each.
A typical journal article runs 2,000 to 10,000 words. Split by IMRaD structure (introduction, methods, results, discussion) and each section usually lands between 3,000 and 12,000 characters. Three to twelve chunks total, scannable across one to four days on free. The per-chunk map shows you which section the AI signal lives in, which is usually the discussion or the lit review.
A full thesis at 50,000 to 80,000 words is sixty to one hundred chunks. Free covers this across three to five weeks of patient daily scanning, which fits a defence calendar comfortably. Most students chunk by chapter, then by subsection inside chapters longer than 5,000 characters. Pro covers a chapter in a single upload and shortens the timeline to days instead of weeks.
An industry report, an internal audit, a research white paper, a long RFP response. Ten to thirty chunks. Structure-first split usually maps to the report's existing table of contents, so the chunking work is mostly already done. The per-chunk map is the artifact a procurement team actually wants to see.
A short non-fiction book at 50,000 words; a longer trade book at 80,000 to 100,000. Sixty to one hundred and twenty chunks. The chunking workflow runs but the volume starts to argue for Pro. Most book-length free-tier scans are first chapters or sample sections rather than full manuscripts; if you are scanning a full book regularly, Pro pays for itself in time saved on day one.
Two reasonable methods, when each one applies, and why the per-chunk map is often more useful than the average.
Sum the Authenticity Scores across chunks and divide by chunk count. Works well when you split a long document into uniformly sized 5,000 character blocks. A ten-chunk document with scores 62, 58, 71, 55, 60, 64, 59, 68, 57, 66 averages to 62. The simple average is fast and good enough for a sanity check.
Multiply each chunk's Authenticity Score by its character count, sum the products, then divide by total characters. Use this when you split by structure and the chunk lengths differ (an abstract at 1,500 characters versus results chunks close to the 5,000 character ceiling). The weighted average gives a more accurate document-level read because long chunks have more influence than short ones, which is also how a single-shot scan would have weighted them.
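Both averages fit in a few lines. The sketch below reuses the ten scores from the simple-average example. The weighted example uses two made-up chunks to show how a short, high-scoring abstract pulls a simple average upward. The function names are illustrative only.

```python
def simple_average(scores: list[float]) -> float:
    """Plain mean, suitable for equal-sized chunks."""
    return sum(scores) / len(scores)


def weighted_average(chunks: list[tuple[float, int]]) -> float:
    """Character-weighted mean; each item is (score, character_count)."""
    total_chars = sum(chars for _, chars in chunks)
    return sum(score * chars for score, chars in chunks) / total_chars


# The ten equal-sized chunks from the example above.
uniform_scores = [62, 58, 71, 55, 60, 64, 59, 68, 57, 66]
print(simple_average(uniform_scores))

# Hypothetical uneven split: a 1,500 character abstract and a 4,500 character section.
uneven = [(80, 1_500), (50, 4_500)]
print(weighted_average(uneven))
```

For the uneven pair the simple average would be 65, while the weighted figure is 57.5, because the longer chunk carries three times the weight of the abstract.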
If chunk scores cluster within 10 to 15 points the document is uniformly AI or uniformly human and the average is reliable. If chunk scores span a 40 point range the document is mixed, and the average smooths out exactly the information you need. In the mixed case, open the per-chunk map and act on the high-scoring sections directly rather than chasing a misleading single number.
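The spread check can be automated as a rough guide. The sketch below takes the 15 point and 40 point figures from the paragraph above as thresholds. The "in between" label and the function name `describe_spread` are assumptions added for illustration, since the text does not say how to treat ranges between the two.

```python
def describe_spread(scores: list[float], tight: float = 15, wide: float = 40) -> str:
    """Label a set of chunk scores by the gap between the highest and lowest."""
    spread = max(scores) - min(scores)
    if spread <= tight:
        return "uniform"      # the average is a reliable summary
    if spread >= wide:
        return "mixed"        # read the per-chunk map instead of the average
    return "in between"       # assumption: check both the average and the map


print(describe_spread([62, 58, 60, 64]))  # hypothetical tightly clustered scores
print(describe_spread([20, 35, 70, 62]))  # hypothetical widely spread scores
```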
If you have a Pro subscription on hand or are evaluating the upgrade, run the full document through a single-shot scan after the chunked pass. The averaged number should land within a few points of the single-shot number; if it does not, the gap is usually a heavy section the structure-split missed. This is a useful confidence check before submitting the chunked methodology in an academic-integrity report.
Free is built for the one-off large file. There is a real point where the chunking math gets tedious and Pro or Business pays for itself in time saved.
One thesis-sized scan a month is fine on free. Two large files a week start to consume your daily budget faster than the chunks finish. Pro at $19.99 per month, or $14.99 per month billed yearly, removes the daily ceiling and lifts the per-scan character cap, so a chapter fits in one upload.
A submission deadline does not wait for midnight UTC. Pro fits scenarios where the document needs to be scored before the end of the workday, because unlimited scans plus a larger per-scan cap cover any reasonable single document in a single session.
Teams running large-file detection at scale (LMS moderation, agency content QA, batch academic-integrity review) belong on Business at $39.99 per month, or $29.99 per month billed yearly. Business adds the REST batch endpoint at api.textsight.ai/scan/v1/batch, which takes an array of documents and returns an array of scored results in one call, plus a 5-seat workspace with a shared audit log. Webhook callbacks support asynchronous pipelines, so a CMS can submit a batch and forget the connection.
Free scan history is session-bound. Pro extends history to 90 days with search and CSV export, which matters for documented integrity reviews that an institution or client needs to keep as an audit artifact. Free is enough for personal use; Pro is the tier that produces archive-quality records.
Most free tiers paywall large files first, with the exact mechanism varying by tool. TextSight free offers a documented chunking workflow instead.
The pattern is consistent. Free large-file scanning is the first feature most detectors paywall, and the ones that pretend to offer it either truncate silently or hide the result behind a credit-card prompt. TextSight free admits the same constraint exists and exposes a deliberate workflow that fits inside it rather than hiding the limit.