How Do AI Detectors Actually Work?

AI detectors don't read meaning — they measure how statistically predictable your text is. Here's how perplexity, burstiness, and the LLM signature actually drive the score.

AI detectors work by measuring how statistically predictable your writing is, not by understanding what it says. Tools like GPTZero, Originality.ai, Turnitin, and Sapling score each stretch of text on how closely it matches the smooth, average word patterns that large language models tend to produce. The output is a probability, never a fact — and that distinction is the whole story.

How AI detectors score text

AI detectors score text by estimating the likelihood that each word was the “obvious” next choice. They run your writing through a language model of their own and ask, in effect, how surprised would a model be to see this word here? Human writing tends to surprise the model more often; AI writing tends to land on the safe, high-probability option again and again.

That estimate is built from two core signals, perplexity and burstiness, plus a trained classifier that has seen large samples of both human and machine text. None of these signals reads your argument or checks your facts. They only describe the shape of the prose: its predictability and its rhythm. If your genuine writing happens to be smooth and even, the detector can’t tell it apart from machine output, which is why real authors get flagged as AI more often than you’d expect.
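To make the combination concrete, here is a minimal Python sketch of how two signals could be folded into a single probability. The logistic form, the weights, and the bias are all invented for illustration. No vendor publishes its model, and real classifiers almost certainly use many more features.

```python
import math

# Hypothetical weights, chosen only so that lower perplexity and lower
# burstiness push the score toward "AI". Not taken from any real detector.
WEIGHTS = {"perplexity": -0.08, "burstiness": -3.0}
BIAS = 3.0


def ai_probability(perplexity, burstiness):
    """Squash a weighted sum of the two signals into a 0..1 probability."""
    z = (BIAS
         + WEIGHTS["perplexity"] * perplexity
         + WEIGHTS["burstiness"] * burstiness)
    return 1.0 / (1.0 + math.exp(-z))


smooth_text = ai_probability(perplexity=10, burstiness=0.1)
spiky_text = ai_probability(perplexity=60, burstiness=0.8)
print(f"smooth, even prose: {smooth_text:.2f}")
print(f"surprising, varied prose: {spiky_text:.2f}")
```

Nothing in the function knows who wrote the text. Smooth human prose with the same two numbers gets exactly the same score.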

What perplexity measures

Perplexity measures how predictable your word choices are to a language model — low perplexity means the next word was easy to guess, high perplexity means it was surprising. ChatGPT, Claude, and Gemini are optimized to produce fluent, likely text, so their output sits at low perplexity by design.

A detector breaks your text into tokens — the sub-word units models actually process — and computes how confidently a model would predict each one. String together a lot of low-surprise tokens and the detector reads a strong AI signal. The catch: clear, conventional, grammatically clean human writing also has low perplexity. A textbook paragraph, a polished email, or careful prose from a non-native English writer can all look “too predictable” for reasons that have nothing to do with AI.
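As a rough sketch of the arithmetic, perplexity is commonly defined as the exponential of the average negative log-probability per token. The probabilities below are made-up numbers standing in for what a scoring model would assign. A real detector would get them from its own language model.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability over all tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical per-token probabilities from a scoring model.
predictable = [0.9, 0.8, 0.95, 0.85]   # each token was an easy guess
surprising = [0.1, 0.05, 0.3, 0.02]    # each token was a long shot

low = perplexity(predictable)
high = perplexity(surprising)
print(f"predictable text: {low:.2f}")
print(f"surprising text: {high:.2f}")
```

A handy sanity check: if every token had probability 0.5, the perplexity would be exactly 2, as if the model were choosing between two equally likely words at each step.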

What burstiness measures

Burstiness measures how much your sentence length and complexity vary across a passage. Human writing is bursty — a long, winding sentence followed by a short one, then a fragment for emphasis. AI text tends to be uniform, with sentences of similar length and even cadence.

Detectors quantify that variation and treat low burstiness as another AI marker. The logic is sound on average: models trained to produce balanced, readable prose smooth out the spikes that human attention and emotion create. But the signal is noisy. Edited, formulaic, or template-driven human writing — lab reports, legal boilerplate, SEO content — can be just as flat, which is part of why detectors struggle with structured genres.
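One simple way to put a number on that variation is the coefficient of variation of sentence lengths: the standard deviation divided by the mean. Treat this as an assumption for illustration. Vendors don't publish their exact formula, and the naive sentence splitter here would stumble on abbreviations and decimals.

```python
import re
import statistics


def sentence_lengths(text):
    """Word count per sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Standard deviation of sentence length divided by the mean length."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = ("Stop. The storm rolled in over the hills while we were "
          "still packing the car. We ran.")

print(sentence_lengths(uniform), burstiness(uniform))
print(sentence_lengths(varied), burstiness(varied))
```

The uniform passage scores zero because every sentence has the same length. That is the same flat profile a lab report or a block of legal boilerplate can produce.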

Why the LLM statistical signature exists

The LLM signature exists because large language models pick words by probability, sampling toward the most likely continuations of a prompt. Averaged over thousands of words, that produces a recognizable fingerprint: low perplexity, low burstiness, conventional phrasing, and a tendency to reach for the same connective words and balanced sentence structures.
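A toy sketch of that sampling step, assuming a model that outputs raw scores (logits) for three candidate next words. The words, scores, and temperature values are hypothetical, and real decoders layer further tricks on top.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidates for the next word and their raw scores.
candidates = ["important", "crucial", "thorny"]
logits = [2.0, 1.0, 0.1]

warm = softmax(logits, temperature=1.0)
cool = softmax(logits, temperature=0.5)

for word, p_warm, p_cool in zip(candidates, warm, cool):
    print(f"{word:10s} T=1.0: {p_warm:.2f}  T=0.5: {p_cool:.2f}")
```

At the lower temperature the top candidate absorbs most of the probability mass, so the same safe word wins far more often. Repeated across thousands of tokens, that is one plausible source of the low-perplexity fingerprint.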

That fingerprint is real, and it’s why detection works at all above chance. But it’s a statistical tendency, not a watermark. The signature gets fainter the more a model is steered with a strong prompt, the more the text is edited by a human, and the more recent the model — newer ChatGPT and Claude releases write with more variation than the 2023-era output most detectors were originally trained on. The signature also overlaps heavily with plain, competent human writing, which is the root cause of nearly every false positive.

Why detectors are probabilistic, not proof

Detectors are probabilistic because they output a likelihood, not a verdict — a 90% AI score means the text resembles AI patterns, not that a machine wrote it. Every serious vendor, including Turnitin and Originality.ai, says some version of this in their own documentation, and tools like Sapling report a confidence percentage rather than a yes/no.

This is the part people skip, and it’s the most important. A classifier trained on patterns will assign a high score to anything that matches those patterns, whether or not the cause was AI. That’s exactly how an honest, original essay gets flagged, and why no detector score should stand alone as evidence of misconduct. If you want to see what a specific tool keys on, our breakdowns of Turnitin, GPTZero, ZeroGPT, and Sapling walk through each one’s mechanics and blind spots.
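A quick worked example shows why a flag on its own is weak evidence. The numbers below are hypothetical, not measured rates for any tool: suppose 10% of submitted essays are AI-written, the detector flags 90% of those, and it also wrongly flags 5% of human essays.

```python
def prob_ai_given_flag(base_rate, true_positive_rate, false_positive_rate):
    """Bayes' rule: share of flagged texts that are actually AI-written."""
    flagged_ai = true_positive_rate * base_rate
    flagged_human = false_positive_rate * (1.0 - base_rate)
    return flagged_ai / (flagged_ai + flagged_human)


# All three inputs are assumptions for illustration only.
posterior = prob_ai_given_flag(
    base_rate=0.10,
    true_positive_rate=0.90,
    false_positive_rate=0.05,
)
print(f"chance a flagged essay is really AI: {posterior:.0%}")
```

Under these assumptions, roughly one flagged essay in three was written by a person. Change the inputs and the answer moves, which is exactly why the score needs context before anyone acts on it.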

Can you change the score?

You can lower a detector score by making text more varied and specific, but no method removes the risk entirely. Raising burstiness (mixing sentence lengths, adding concrete detail, breaking up even cadence) pushes prose away from the AI average. Naive synonym-swapping tends to backfire: it leaves artifacts of its own that newer detectors catch.

This is the space genuine rewriting works in, and where we’re careful not to overpromise: there’s a real difference between “reads more human” and “guaranteed to beat a detector,” which we get into in our guide on how to make AI writing read more naturally. For specific workflows, see the guides for students, researchers, and content writers.

Raising burstiness and cutting predictable phrasing lower the score. That moves the signal; it doesn't guarantee a pass.

The honest bottom line

AI detectors work by measuring perplexity and burstiness — how predictable and how uniform your writing is — and comparing that signature against patterns learned from machine text. They’re useful signals and they get a lot right, but they output probabilities, they flag clean human writing, and they were never proof. Understand what they measure and you’ll read any score with the right amount of skepticism.

Humanizer is a native Mac and iPhone app that rewrites text to read more naturally and shows you a detector score on every result. No guaranteed bypass — just a clearer picture and a more human rewrite.