AI Detector for Microsoft Copilot Output

Built for Microsoft Copilot

Tuned for M365 Copilot and GitHub Copilot in a single multi-model scan.

Microsoft Copilot is one of the most widely deployed AI assistants in the enterprise because it ships inside tools people already use every day. Its output is distinctive enough to be recognisable, but most detectors trained primarily on OpenAI samples underrate Copilot content. TextSight is trained on multi-model data and weights Copilot-specific patterns alongside ChatGPT and Gemini signals.

TextSight reads both faces of Microsoft Copilot. M365 Copilot lives inside Word, Outlook, Teams, PowerPoint, and Excel, drafting emails, documents, meeting recaps, and slide notes in an enterprise register. GitHub Copilot lives inside the editor and command line, drafting code comments, README files, commit messages, and inline documentation. The two surfaces share the same stylistic spine, a terse, instructional, compliance-aware cadence, so the classifier reads them as one model family.

One scan across the whole work doc

There is no model picker to set first. The classifier scores each sentence on its own pattern, which is what a real workplace document needs: a status report where one section came out of Copilot in Word, a paragraph was reworded in ChatGPT, and the opening was typed by the author gets pulled apart line by line. The Copilot-shaped procedure block flags while the genuinely human framing stays clear, instead of the whole doc collapsing to one ambiguous number.
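As a rough illustration of what scoring each sentence on its own looks like, the sketch below splits a document into sentences and scores them independently. The `toy_score` function and its marker list are hypothetical stand-ins that only make the example runnable; they are not TextSight's classifier.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical marker phrases, used only to make the sketch runnable.
TOY_MARKERS = ("here is how you can", "follow these steps", "please note")


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_score(sentence: str) -> float:
    """Stand-in scorer: 1.0 if a marker phrase appears, else 0.0."""
    lowered = sentence.lower()
    return 1.0 if any(marker in lowered for marker in TOY_MARKERS) else 0.0


def score_document(
    text: str, scorer: Callable[[str], float] = toy_score
) -> List[Tuple[str, float]]:
    """Score every sentence separately instead of the document as a whole."""
    return [(sentence, scorer(sentence)) for sentence in split_sentences(text)]


doc = (
    "I ran late on this one, sorry. "
    "Here is how you can export the report. "
    "Please note that settings may vary."
)
results = score_document(doc)
for sentence, score in results:
    print(f"{score:.1f}  {sentence}")
```

The point of the shape is that the human opening and the templated lines get separate scores, so a mixed document does not collapse into one number.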

Sentence-level highlights tuned to Copilot tells

Colour-coded sentence highlights point to specific lines that carry Copilot markers: terse list-and-step framing, numbered procedures, bulleted instructions, flat office-neutral phrasing, and README-style code comments. Reviewers see exactly which sentences drove the score rather than guessing from a single percentage.

Office or editor surface, same signal

Output coming through Copilot in Word or Outlook, the Copilot web app, or GitHub Copilot in the editor carries the same fingerprints. The classifier treats Copilot as a model family, not as a product surface, so detection works regardless of which Microsoft tool the user pasted from.

Microsoft Copilot voice patterns

What makes Copilot prose recognisable to a trained classifier.

Copilot has its own register, and it is the enterprise-and-code register. It writes for the workplace: terse, procedural, professionally neutral, and tuned to be safe in a compliance-aware environment. The patterns are consistent enough that a classifier trained on Copilot samples picks them up reliably. The most useful tells fall into five families.

Terse list-and-step structure

This is the strongest Copilot tell. Copilot reaches for a procedure almost reflexively. A short framing line, then a "Steps:" header, then numbered actions, then a closing tip, shows up even when the task is a single sentence of advice that a human would just answer in a clause. Where ChatGPT often discusses, Copilot enumerates. The numbered-procedure shape is helpful in a work setting but recognisably templated from one response to the next, and it is more durable than most tells because converting a numbered list back into flowing prose is real work that most users skip.
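The scaffold described above can be spotted with a simple heuristic. The sketch below is only an assumption about how one might check for it: the function name, the regular expressions, and the `min_items` threshold are all hypothetical and are not taken from TextSight.

```python
import re

# Hypothetical patterns: a line that is just "Steps:" and a numbered list item.
STEPS_HEADER = re.compile(r"^\s*steps\s*:\s*$", re.IGNORECASE)
NUMBERED_ITEM = re.compile(r"^\s*\d+[.)]\s+\S")


def has_step_scaffold(text: str, min_items: int = 2) -> bool:
    """Return True if a 'Steps:' header is directly followed by numbered items."""
    lines = [line for line in text.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if not STEPS_HEADER.match(line):
            continue
        count = 0
        for following in lines[i + 1:]:
            if NUMBERED_ITEM.match(following):
                count += 1
            else:
                break
        if count >= min_items:
            return True
    return False


templated = (
    "Here is how you can share a file.\n"
    "Steps:\n"
    "1. Open the file.\n"
    "2. Select Share.\n"
    "Tip: check permissions first."
)
casual = "Just hit Share in the top right."
print(has_step_scaffold(templated), has_step_scaffold(casual))
```

A structural check like this is only one signal; on its own it would also fire on human-written runbooks.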

Imperative how-to openings

Copilot opens with task-framing phrases like "Here is how you can", "To do this, follow these steps", "You can accomplish this by", or simply "Here's a draft". The opening orients the reader toward an action rather than a discussion. This imperative, get-to-the-point cadence is distinct from ChatGPT's softer discursive lead-ins and lands in a narrow stylistic band the classifier learns quickly.
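A minimal sketch of an opener check, assuming a fixed phrase list taken from the examples above. The function name and the exact matching rule are hypothetical; a trained classifier would learn these patterns rather than match them literally.

```python
# Opening phrases quoted in the paragraph above, lowercased for matching.
OPENERS = (
    "here is how you can",
    "to do this, follow these steps",
    "you can accomplish this by",
    "here's a draft",
)


def opens_with_task_framing(text: str) -> bool:
    """Return True if the text starts with one of the listed openers."""
    # Normalise curly apostrophes so "Here's" matches either way.
    start = text.lstrip().lower().replace("\u2019", "'")
    return start.startswith(OPENERS)


print(opens_with_task_framing("Here is how you can reset the password."))
print(opens_with_task_framing("Honestly, I'd just email her."))
```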

Flat, office-neutral affect

Copilot is tuned for the workplace, so the affect is deliberately flat and professional. There is little personality, almost no humour, and a steady, even tone whether the topic is a quarterly summary or a code refactor. Sentences are clean and businesslike, with stock corporate connectives ("Additionally", "To summarize", "In conclusion", "Please note"). That low-variance, neutral register is itself a fingerprint because human workplace writing carries more individual voice and more inconsistency.

Compliance-aware, safe phrasing

Copilot hedges toward caution in a corporate-policy way. It adds qualifiers like "ensure you follow your organization's policies", "consult your administrator", "this may vary depending on your configuration", and "please review before sending". These safety wrappers are appropriate in an enterprise tool but recur in predictable places, especially at the end of an answer, and the classifier reads them as a Copilot marker.
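Because these qualifiers cluster at the end of an answer, position matters as much as presence. The sketch below checks whether a listed qualifier falls in the closing stretch of a text. The function name and the `tail_chars` window of 120 characters are assumptions made for illustration, not documented TextSight behaviour.

```python
# Qualifier phrases quoted in the paragraph above, lowercased for matching.
QUALIFIERS = (
    "ensure you follow your organization's policies",
    "consult your administrator",
    "this may vary depending on your configuration",
    "please review before sending",
)


def ends_with_safety_wrapper(text: str, tail_chars: int = 120) -> bool:
    """Return True if a listed qualifier appears in the last `tail_chars` characters."""
    normalised = text.strip().lower().replace("\u2019", "'")
    tail = normalised[-tail_chars:]
    return any(qualifier in tail for qualifier in QUALIFIERS)


closing = "Open Settings and choose Export. Please review before sending."
early = "Please review before sending. " + "Then open Settings and choose Export. " * 5
print(ends_with_safety_wrapper(closing), ends_with_safety_wrapper(early))
```

In the second example the qualifier is present but sits at the start of a longer text, so the tail check does not fire.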

README and code-comment cadence when technical

On the GitHub Copilot side, the prose around code reads like documentation. README-style headings (Installation, Usage, Configuration), instructional bullet lists, and tidy explanatory comments above functions carry a uniform rhythm. Inline comments tend to restate what the code does in plain, complete sentences rather than the terse shorthand a human developer typically writes. When that documentation prose gets pasted into a wiki or article, the cadence travels with it and the highlights pick it out.

Plans & pricing

Pricing for solo reviewers and detection teams.

Pro at $19.99 a month standard, $14.99 a month on yearly, is the right fit for solo editors, instructors, and reviewers running steady individual scans. Business at $39.99 a month standard, $29.99 a month on yearly, fits teams scanning fifty or more pieces a month with shared history and REST API access. Full details on the pricing page.

Free

$0/forever

Try a Copilot scan. No card, no email.

3 scans / day
5,000 chars per scan
Sentence-level highlights
2 lifetime AI rewriter uses


Starter

$7.49/month

Billed $89.88/year — Save $30

Light reviewers running a few scans a week.

20 scans / day
20,000 AI rewriter words/mo
Chrome extension
Email support


Why other detectors underrate Copilot content.

Detector disagreement on Copilot is common. The first generation of AI detectors trained primarily on OpenAI ChatGPT output because that was the dominant model in 2023. Microsoft Copilot output, especially its terse procedural and code-documentation styles, was under-represented in those training sets, so the classifiers learned ChatGPT patterns deeply and Copilot patterns shallowly.

Training distribution skew

Detectors trained mostly on discursive ChatGPT prose learn its uniform sentence cadence and stock transitional phrasing. A Copilot answer that is mostly a numbered procedure, a tight how-to opening, and a flat office-neutral closing does not look like a flowing GPT essay, so it does not light up the same features. The detector reads it as low confidence and returns a human-ish score even when the prose is straightforwardly Copilot. Short, list-heavy text is also where many detectors are weakest, which compounds the gap.

What multi-model training changes

TextSight was trained on samples from Microsoft Copilot, OpenAI ChatGPT, Google Gemini, and other large language models. Copilot-specific markers, including the list-and-step structure, imperative how-to openings, and compliance-aware safety wrappers, activate the right signals. Cross-model scoring stays calibrated rather than collapsing to whichever model the training set leaned on.

How to read a Copilot disagreement

It is common for a GPT-tuned detector to wave a Copilot-drafted email through as human while TextSight flags it. That split is a calibration gap, not a contradiction: the GPT-trained tool never learned that a tidy numbered procedure wrapped in office-neutral phrasing is a model fingerprint, because in its world the strongest signal is flowing GPT cadence and Copilot deliberately does not write that way. The practical move is to ignore the headline number on both sides and look at the highlights. If the red sits on the "Steps:" block and the safe sign-off, the Copilot read holds.

Re-fit cadence tracks Office and GitHub releases

Microsoft ships Copilot updates across Office and GitHub on a fast schedule, and each model bump nudges the stylistic distribution. TextSight refits against fresh Copilot samples on a regular cadence so the procedural and compliance-aware tells stay calibrated as the product evolves. No detector reaches certainty on every passage, which is exactly why the workflow leans on sentence-level evidence rather than a single number.

Where Microsoft Copilot shows up

Work email, Office docs, and code documentation.

Copilot output appears wherever Microsoft's tools are, which is most of the modern workplace: work emails and business documents where the neutral professional tone fits, technical wikis and code documentation where the step-by-step framing maps onto explanation, and internal procedures where the numbered-list format is the whole point. Each context calls for a slightly different read of the scan.

Work emails and Outlook drafts

M365 Copilot drafts and rewrites email inside Outlook, so a lot of workplace correspondence now starts as Copilot prose. Recipients and managers reviewing it see the same flat office-neutral tone, the tidy two-or-three-point structure, and the polite compliance-aware sign-offs. Sentence highlights make the pattern explicit, which is more useful than a single percentage when you want to know whether a message was actually written by the person who sent it.

Office documents and meeting recaps

Teams use Copilot in Word and Teams for business documents, status reports, and meeting summaries because the output is clean and businesslike out of the gate. The same uniformity is the tell. The list-and-step structure recurs, the connective phrasing repeats, and the affect stays flat across sections a human would vary. Reviewers running a pre-publication scan catch these before a document goes wide.

Code documentation and wikis

Engineering teams use GitHub Copilot to draft README files, API references, commit messages, and inline comments. The README cadence and instructional bullets fit the format, but the prose around the code reads identifiably Copilot. Detection here is less about misconduct and more about flagging documentation that no human has read before it shipped to a wiki, which is a separate quality concern worth catching.

Internal procedures and policy drafts

Copilot is a natural fit for onboarding docs, runbooks, and policy drafts because it produces numbered procedures and safe phrasing by default. That is fine internally, but it creates problems when those notes get lifted into public-facing pages, customer-facing knowledge bases, or published guides without editing. A quick scan catches the lift-and-paste case.

What you see in a Copilot scan

Sentence highlights, paragraph cards, perplexity, and burstiness.

A single percentage is not a fix path or an evidence trail. The TextSight result panel surfaces which sentences carried Copilot markers and why, with paragraph-level rollups for longer pieces, so reviewers can point to specific lines rather than negotiating headline numbers.

Sentence-level highlights on the procedural spine

Every sentence is colour-coded by its own AI-likeness score, and on Copilot content the red tends to cluster in a predictable place: the numbered steps, the "Here is how you can" opener, and the compliance-aware sign-off. A reviewer can usually tell at a glance whether a work email or pull-request description was written by the person who sent it, because the templated middle glows while a genuine human aside in between stays green. That contrast is the read, not the headline percentage. The full mechanics are covered in how AI detectors work.

Paragraph cards for long docs and READMEs

Word documents, policy drafts, and README files run long, so the paragraph-level rollup on Pro matters here more than on a short reply. It points straight at the section dragging the score, which on Copilot output is almost always the "Steps:" block or an Installation/Usage heading lifted out of GitHub Copilot. Reviewing the most heavily flagged section first is the fastest way to decide whether a doc had a human editing pass before it shipped to the wiki.

Perplexity reads the office-neutral vocabulary

Perplexity measures how predictable word choices are to a language model. Copilot's deliberately flat workplace vocabulary and its stock corporate connectives ("Additionally", "To summarize", "Please note") are exactly the words a model finds most predictable, so the per-sentence number runs low across business prose and drops further inside templated procedure language. On Pro this is read-only context that helps separate real Copilot residue from a genuinely formulaic piece of human business writing.
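For readers who want the arithmetic, perplexity is conventionally the exponential of the average negative log-probability a language model assigns to each token. The sketch below applies that standard definition to made-up probabilities; the numbers are hypothetical, and how TextSight computes its per-sentence figure is not specified here.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """exp of the mean negative log-probability over the tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical per-token probabilities, for illustration only.
stock_phrase = [0.9, 0.8, 0.9]      # e.g. a predictable corporate connective
unusual_phrase = [0.1, 0.2, 0.05]   # e.g. an idiosyncratic human aside

print(perplexity(stock_phrase) < perplexity(unusual_phrase))
```

Predictable wording gets high token probabilities, which is why stock connectives pull the number down.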

Burstiness flattens in procedural reply mode

Burstiness measures sentence-length variance, and Copilot's even, businesslike cadence keeps it low to begin with. Short numbered actions and one-line instructions flatten it further, so a passage where the list-and-step and imperative how-to fingerprints both fire also tends to read as near-zero variance. That combination, low burstiness plus the procedural tells, is one of the more reliable Copilot reads because the variance collapsed for the same reason the structure appeared: the model was answering in its work-assistant procedure mode.
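Taking the definition above at face value, a minimal sketch can measure burstiness as the variance of sentence lengths in words. The naive sentence splitter and the choice of population variance are assumptions for illustration; the exact formula TextSight uses is not stated.

```python
import re
import statistics
from typing import List


def sentence_lengths(text: str) -> List[int]:
    """Word count per sentence, using a naive punctuation-based splitter."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Population variance of sentence lengths; 0.0 for fewer than two sentences."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return float(statistics.pvariance(lengths))


procedural = "Open the file. Select the tab. Click the button."
varied = "No. I spent most of Tuesday chasing that bug through three services. Fixed now."
print(burstiness(procedural), burstiness(varied) > burstiness(procedural))
```

Three instructions of equal length give zero variance, which is the flattening effect the paragraph describes.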

FAQ

Frequently asked questions about Microsoft Copilot detection.

Is TextSight built to detect Microsoft Copilot output specifically?

Yes. TextSight is trained on multi-model data that includes samples from Microsoft Copilot, covering both M365 Copilot in Word, Outlook, and Teams and GitHub Copilot in code and documentation, alongside ChatGPT, Gemini, and other models. Copilot-specific markers such as terse list-and-step structure, Office-flavoured professional neutrality, and README-style code-comment cadence are part of the classifier's signal set. You do not need to tell the scanner which model produced the text; the classifier identifies Copilot-shaped prose by its own patterns.

Does TextSight detect both M365 Copilot and GitHub Copilot?

Yes. Microsoft Copilot spans two main surfaces and TextSight reads both. M365 Copilot drafts emails, Word documents, Teams summaries, and Excel narratives in an enterprise register. GitHub Copilot writes code comments, README files, commit messages, and inline documentation in a developer register. The two surfaces share a Copilot spine, the terse instructional cadence and compliance-aware neutrality, so a single scan flags either without you choosing a surface first.

How does Microsoft Copilot's writing style differ from ChatGPT?

Copilot leans harder into terse, list-and-step structure than ChatGPT. It opens with phrases like "Here is how you can" and "Steps:" and breaks answers into numbered procedures and bulleted instructions even in prose contexts. The affect is flatter and more office-neutral, tuned for work documents and compliance-aware phrasing. ChatGPT is more discursive and uses more transitional connective tissue. The two models have distinct fingerprints, and TextSight reads both in one scan rather than asking you to pick a model first.

What does Copilot's list-and-step structure tell a detector?

Copilot reaches for procedural scaffolding fast. A short "Here is how" preamble, a "Steps:" header, numbered actions, and a closing tip recur across responses regardless of whether the task needed a procedure. That templated instructional shape is a strong calibration signal, especially when the surrounding prose stays flat and office-neutral. Structure alone is not a verdict; the classifier weighs it alongside burstiness, perplexity, and lexical patterns. But it is one of the more reliable Copilot tells, and it survives light editing because reformatting a numbered list back into flowing prose is more work than most users do.
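To make "structure alone is not a verdict" concrete, the sketch below combines several signals through a weighted logistic function. Everything in it is hypothetical: the signal names, the weights, and the bias are invented for illustration and do not describe TextSight's actual model.

```python
import math
from typing import Dict

# Hypothetical weights and bias, chosen only so the example is readable.
WEIGHTS: Dict[str, float] = {
    "step_scaffold": 1.5,
    "low_burstiness": 1.0,
    "low_perplexity": 1.0,
    "stock_phrases": 0.8,
}
BIAS = -2.0


def combined_score(signals: Dict[str, float]) -> float:
    """Logistic combination of signal strengths, each expected in [0, 1]."""
    z = BIAS + sum(
        weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


structure_only = combined_score({"step_scaffold": 1.0})
all_signals = combined_score({name: 1.0 for name in WEIGHTS})
print(structure_only < 0.5 < all_signals)
```

With these made-up weights, a numbered procedure by itself stays below the midpoint, and the score only climbs when the other signals agree.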

Does TextSight detect Copilot alongside ChatGPT and Gemini in one scan?

Yes. The classifier is multi-model by design. A single scan flags Microsoft Copilot, OpenAI ChatGPT, Google Gemini, and other large language models without you needing to pre-select a target. This matters for mixed-source content where one section was drafted in Copilot inside Word, another reworded in ChatGPT, and a third paragraph written by hand. Sentence-level highlights show which lines reacted regardless of the source model.

Where does Microsoft Copilot output usually show up?

M365 Copilot is embedded across Word, Outlook, Teams, PowerPoint, and Excel, so it shows up in work emails, business documents, meeting summaries, and internal procedures. GitHub Copilot shows up in code comments, README files, API documentation, commit messages, and technical wikis. Output also reaches knowledge-base articles, policy drafts, and onboarding docs. TextSight reads the prose regardless of which Microsoft surface produced it.

How accurate is TextSight on Copilot compared to OpenAI models?

Detection accuracy is broadly comparable across model families in our testing, with sentence-level highlights performing well on Copilot because the list-and-step structure and flat office register concentrate the tells in predictable places. No detector is perfect; on English text written by native speakers, false positives are uncommon but not zero, and heavily edited or template-driven business writing can read borderline either way. The classifier is re-fit on a regular cadence against fresh samples from all major models so it tracks distribution drift on both sides.

Which TextSight tier fits Microsoft Copilot detection workloads?

Pro at $19.99 a month standard, or $14.99 a month on yearly, is the right fit for solo reviewers, editors, and team leads reviewing a steady flow of Copilot-drafted emails, docs, and pull requests. It unlocks unlimited scans, a 10,000 character cap per scan, 90-day scan history, file upload, and the integrated AI rewriter. Business at $39.99 a month standard, or $29.99 a month on yearly, fits teams scanning fifty or more pieces a month with five seats, REST API access, an audit log, and white-label PDFs.

AI detector built to catch Microsoft Copilot output across docs and code.