Detect Microsoft Copilot content across M365 Copilot and GitHub Copilot in a single scan. Copilot writes in an enterprise register, terse list-and-step structure, flat office-neutral affect, and README-style code-comment cadence when technical. TextSight reads the prose, flags Copilot-shaped sentences with colour-coded highlights, and runs the same scan against ChatGPT and Gemini with no extra step. Free to try. No card.
Microsoft Copilot is one of the most widely deployed AI assistants in the enterprise because it ships inside tools people already use every day. Its output is distinctive enough to be recognisable, but most detectors trained primarily on OpenAI samples underrate Copilot content. TextSight is trained on multi-model data and weights Copilot-specific patterns alongside ChatGPT and Gemini signals.
TextSight reads both faces of Microsoft Copilot. M365 Copilot lives inside Word, Outlook, Teams, PowerPoint, and Excel, drafting emails, documents, meeting recaps, and slide notes in an enterprise register. GitHub Copilot lives inside the editor and command line, drafting code comments, README files, commit messages, and inline documentation. The two surfaces share the same stylistic spine, a terse, instructional, compliance-aware cadence, so the classifier reads them as one model family.
There is no model picker to set first. The classifier scores each sentence on its own pattern, which is what a real workplace document needs: a status report where one section came out of Copilot in Word, a paragraph was reworded in ChatGPT, and the opening was typed by the author gets pulled apart line by line. The Copilot-shaped procedure block flags while the genuinely human framing stays clear, instead of the whole doc collapsing to one ambiguous number.
Colour-coded sentence highlights point to specific lines that carry Copilot markers: terse list-and-step framing, numbered procedures, bulleted instructions, flat office-neutral phrasing, and README-style code comments. Reviewers see exactly which sentences drove the score rather than guessing from a single percentage.
Output from Copilot in Word or Outlook, the Copilot web app, or GitHub Copilot in the editor carries the same fingerprints. The classifier treats Copilot as a model family, not as a product surface, so detection works regardless of which Microsoft tool the user pasted from.
Copilot has its own register, and it is the enterprise-and-code register. It writes for the workplace: terse, procedural, professionally neutral, and tuned to be safe in a compliance-aware environment. The patterns are consistent enough that a classifier trained on Copilot samples picks them up reliably. The most useful tells fall into five families.
The list-and-step structure is the strongest Copilot tell. Copilot reaches for a procedure almost reflexively. The same sequence (a short framing line, a "Steps:" header, numbered actions, a closing tip) shows up even when the task calls for a single sentence of advice that a human would answer in a clause. Where ChatGPT often discusses, Copilot enumerates. The numbered-procedure shape is helpful in a work setting but recognisably templated from one response to the next, and it is more durable than most tells because converting a numbered list back into flowing prose is real work that most users skip.
Copilot opens with task-framing phrases like "Here is how you can", "To do this, follow these steps", "You can accomplish this by", or simply "Here's a draft". The opening orients the reader toward an action rather than a discussion. This imperative, get-to-the-point cadence is distinct from ChatGPT's softer discursive lead-ins and lands in a narrow stylistic band the classifier learns quickly.
Copilot is tuned for the workplace, so the affect is deliberately flat and professional. There is little personality, almost no humour, and a steady, even tone whether the topic is a quarterly summary or a code refactor. Sentences are clean and businesslike, with stock corporate connectives ("Additionally", "To summarize", "In conclusion", "Please note"). That low-variance, neutral register is itself a fingerprint because human workplace writing carries more individual voice and more inconsistency.
Copilot hedges toward caution in a corporate-policy way. It adds qualifiers like "ensure you follow your organization's policies", "consult your administrator", "this may vary depending on your configuration", and "please review before sending". These safety wrappers are appropriate in an enterprise tool but recur in predictable places, especially at the end of an answer, and the classifier reads them as a Copilot marker.
On the GitHub Copilot side, the prose around code reads like documentation. README-style headings (Installation, Usage, Configuration), instructional bullet lists, and tidy explanatory comments above functions carry a uniform rhythm. Inline comments tend to restate what the code does in plain, complete sentences rather than the terse shorthand a human developer typically writes. When that documentation prose gets pasted into a wiki or article, the cadence travels with it and the highlights pick it out.
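To make the five families concrete, here is a minimal keyword sketch in Python. It is not how TextSight works: a trained classifier learns these patterns statistically, while this toy only matches the example phrases quoted above. The `TELL_PATTERNS` table, the family names, and the `tag_line` helper are all hypothetical names introduced for illustration.

```python
import re

# Hypothetical phrase lists taken from the example tells described above.
# A real classifier learns such patterns from data; this is only a keyword sketch.
TELL_PATTERNS = {
    "list_and_step": re.compile(r"^\s*(steps:|\d+[.)]\s)", re.IGNORECASE),
    "how_to_opening": re.compile(
        r"^\s*(here is how you can|to do this, follow these steps"
        r"|you can accomplish this by|here's a draft)",
        re.IGNORECASE,
    ),
    "corporate_connective": re.compile(
        r"^\s*(additionally|to summarize|in conclusion|please note)\b",
        re.IGNORECASE,
    ),
    "safety_wrapper": re.compile(
        r"(follow your organization's policies|consult your administrator"
        r"|may vary depending on your configuration|please review before sending)",
        re.IGNORECASE,
    ),
    "readme_heading": re.compile(
        r"^\s*#*\s*(installation|usage|configuration)\s*$", re.IGNORECASE
    ),
}


def tag_line(line: str) -> list[str]:
    """Return the name of every tell family whose pattern matches the line."""
    return [name for name, pattern in TELL_PATTERNS.items() if pattern.search(line)]
```

Calling `tag_line` on each line of a pasted document gives a rough map of where the procedural and compliance-aware phrasing sits. A keyword match on its own proves nothing about authorship, since people write "Please note" too, which is why a statistical classifier is needed for the real job.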
Pro, at $19.99 a month on standard billing or $14.99 a month billed yearly, is the right fit for solo editors, instructors, and reviewers running steady individual scans. Business, at $39.99 a month on standard billing or $29.99 a month billed yearly, fits teams scanning fifty or more pieces a month and adds shared history and REST API access. Full details are on the pricing page.
Billed $89.88/year — Save $30
Billed $179.88/year — Save $60
Billed $359.88/year — Save $120
Yearly billing saves 25%. View full pricing →
Detector disagreement on Copilot is common. The first generation of AI detectors trained primarily on OpenAI ChatGPT output because that was the dominant model in 2023. Microsoft Copilot output, especially its terse procedural and code-documentation styles, was under-represented in those training sets, so the classifiers learned ChatGPT patterns deeply and Copilot patterns shallowly.
Detectors trained mostly on discursive ChatGPT prose learn its uniform sentence cadence and stock transitional phrasing. A Copilot answer that is mostly a numbered procedure, a tight how-to opening, and a flat office-neutral closing does not look like a flowing GPT essay, so it does not light up the same features. The detector reads it as low confidence and returns a human-ish score even when the prose is straightforwardly Copilot. Short, list-heavy text is also where many detectors are weakest, which compounds the gap.
TextSight was trained on samples from Microsoft Copilot, OpenAI ChatGPT, Google Gemini, and other large language models. Copilot-specific markers, including the list-and-step structure, imperative how-to openings, and compliance-aware safety wrappers, activate the right signals. Cross-model scoring stays calibrated rather than collapsing to whichever model the training set leaned on.
It is common for a GPT-tuned detector to wave a Copilot-drafted email through as human while TextSight flags it. That split is a calibration gap, not a contradiction: the GPT-trained tool never learned that a tidy numbered procedure wrapped in office-neutral phrasing is a model fingerprint, because in its world the strongest signal is flowing GPT cadence and Copilot deliberately does not write that way. The practical move is to ignore the headline number on both sides and look at the highlights. If the red sits on the "Steps:" block and the safe sign-off, the Copilot read holds.
Microsoft ships Copilot updates across Office and GitHub on a fast schedule, and each model bump nudges the stylistic distribution. TextSight refits against fresh Copilot samples on a regular cadence so the procedural and compliance-aware tells stay calibrated as the product evolves. No detector reaches certainty on every passage, which is exactly why the workflow leans on sentence-level evidence rather than a single number.
Copilot output appears wherever Microsoft's tools are, which is most of the modern workplace: work emails and business documents where the neutral professional tone fits, technical wikis and code documentation where the step-by-step framing maps onto explanation, and internal procedures where the numbered-list format is the whole point. Each context calls for a slightly different read of the scan.
M365 Copilot drafts and rewrites email inside Outlook, so a lot of workplace correspondence now starts as Copilot prose. Recipients and managers reviewing it see the same flat office-neutral tone, the tidy two-or-three-point structure, and the polite compliance-aware sign-offs. Sentence highlights make the pattern explicit, which is more useful than a single percentage when you want to know whether a message was actually written by the person who sent it.
Teams use Copilot in Word and Teams for business documents, status reports, and meeting summaries because the output is clean and businesslike out of the gate. The same uniformity is the tell. The list-and-step structure recurs, the connective phrasing repeats, and the affect stays flat across sections a human would vary. Reviewers running a pre-publication scan catch these before a document goes wide.
Engineering teams use GitHub Copilot to draft README files, API references, commit messages, and inline comments. The README cadence and instructional bullets fit the format, but the prose around the code reads identifiably Copilot. Detection here is less about misconduct and more about flagging documentation that no human read before it shipped to a wiki, which is a separate quality concern worth catching.
Copilot is a natural fit for onboarding docs, runbooks, and policy drafts because it produces numbered procedures and safe phrasing by default. That is fine internally, but it creates problems when those notes get lifted into public-facing pages, customer-facing knowledge bases, or published guides without editing. A quick scan catches the lift-and-paste case.
A single percentage is not a fix path or an evidence trail. The TextSight result panel surfaces which sentences carried Copilot markers and why, with paragraph-level rollups for longer pieces, so reviewers can point to specific lines rather than negotiating headline numbers.
Every sentence is colour-coded by its own AI-likeness score, and on Copilot content the red tends to cluster in a predictable place: the numbered steps, the "Here is how you can" opener, and the compliance-aware sign-off. A reviewer can usually tell at a glance whether a work email or pull-request description was written by the person who sent it, because the templated middle glows while a genuine human aside in between stays green. That contrast is the read, not the headline percentage. The full mechanics are covered in how AI detectors work.
Word documents, policy drafts, and README files run long, so the paragraph-level rollup on Pro matters here more than on a short reply. It points straight at the section dragging the score, which on Copilot output is almost always the "Steps:" block or an Installation/Usage heading lifted out of GitHub Copilot. Reviewing the most heavily flagged section first is the fastest way to decide whether a doc had a human editing pass before it shipped to the wiki.
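One plausible way to picture a paragraph-level rollup is an average of sentence scores per section, sorted so the most flagged section comes first. The sketch below assumes per-sentence AI-likeness scores between 0.0 and 1.0 already exist; the `rollup` function, the section names, and the numbers are invented for illustration and are not TextSight output or its actual aggregation method.

```python
from statistics import mean


def rollup(paragraphs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Average hypothetical per-sentence AI-likeness scores (0.0 to 1.0) for
    each paragraph and return paragraphs sorted from most to least flagged."""
    averages = [(name, mean(scores)) for name, scores in paragraphs.items() if scores]
    return sorted(averages, key=lambda item: item[1], reverse=True)


# Made-up scores for illustration only; not produced by any real detector.
doc = {
    "intro": [0.10, 0.20],
    "steps_block": [0.90, 0.80, 0.85],
    "sign_off": [0.70],
}
ranked = rollup(doc)
```

With these made-up numbers the procedure block sorts to the top and the human-written intro to the bottom, which mirrors the review order suggested above.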
Perplexity measures how predictable word choices are to a language model. Copilot's deliberately flat workplace vocabulary and its stock corporate connectives ("Additionally", "To summarize", "Please note") are exactly the words a model finds most predictable, so the per-sentence number runs low across business prose and drops further inside templated procedure language. On Pro this is read-only context that helps separate real Copilot residue from a genuinely formulaic piece of human business writing.
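For readers who want to see the arithmetic, perplexity is the exponential of the average negative log-probability a model assigns to each word. The sketch below is a deliberately toy version: it swaps in an add-one-smoothed unigram word-count model where a real detector would use a neural language model's context-dependent probabilities. The function name `unigram_perplexity` and the reference text are assumptions made for the example.

```python
import math
from collections import Counter


def unigram_perplexity(sentence: str, reference: str) -> float:
    """Perplexity of a sentence under an add-one-smoothed unigram model.

    Toy stand-in: real detectors use a neural language model's
    context-dependent token probabilities, not raw word counts.
    """
    ref_words = reference.lower().split()
    counts = Counter(ref_words)
    words = sentence.lower().split()
    if not words:
        raise ValueError("sentence must contain at least one word")
    # Vocabulary covers the reference plus any unseen words in the sentence,
    # so the smoothed probabilities still sum to one.
    vocab_size = len(set(ref_words) | set(words))
    total = len(ref_words)
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)
    return math.exp(-log_prob / len(words))


# Hypothetical reference text standing in for "what the model expects".
reference = "please note the steps please note the policy"
predictable = unigram_perplexity("please note the steps", reference)
surprising = unigram_perplexity("budget meeting got weird", reference)
```

Here `predictable` comes out lower than `surprising` because every word in the first sentence is one the toy model has already seen, which is the same effect, in miniature, as stock connectives pulling the per-sentence number down.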
Burstiness measures sentence-length variance, and Copilot's even, businesslike cadence keeps it low to begin with. Short numbered actions and one-line instructions flatten it further, so a passage where the list-and-step and imperative how-to fingerprints both fire also tends to read as near-zero variance. That combination, low burstiness plus the procedural tells, is one of the more reliable Copilot reads because the variance collapsed for the same reason the structure appeared: the model was answering in its work-assistant procedure mode.
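As a rough illustration, sentence-length variance can be computed in a few lines. This sketch assumes a naive split on sentence-ending punctuation and counts whitespace-separated words; the `burstiness` function is a hypothetical helper, and a production detector may define and normalise the measure differently.

```python
import re
from statistics import pvariance


def burstiness(text: str) -> float:
    """Population variance of sentence lengths, measured in words.

    Sentences are split naively after '.', '!' or '?', which is an
    assumption made for this sketch rather than a robust tokenizer.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if not lengths:
        return 0.0
    return float(pvariance(lengths))


flat = "Open the app. Click the button. Save the file."
varied = "No. I spent most of the afternoon chasing that bug through three services. Fixed now."
```

The three-word instructions in `flat` give a variance of zero, while `varied` mixes a one-word sentence with a twelve-word one and scores far higher, which is the contrast the paragraph above describes.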
More LLM-specific detection guides.
OpenAI ChatGPT detection with the same multi-model classifier and sentence highlights. For ChatGPT →
The main detector page covering accuracy, methodology, and the multi-model classifier. Main detector →
Light, Balanced, and Maximum modes for editing Copilot-shaped passages without losing voice. Read the guide →
Google Gemini detection across 1.5 Pro, Flash, Ultra, and Gemini in Workspace. For Gemini →
Field comparison of the leading detectors and where a multi-model classifier wins. Read the roundup →
Free to try. No card. Pro at $14.99 a month on yearly for solo reviewers; Business at $29.99 a month on yearly for detection teams.