AI Detection Is a Lie Detector That Is a Liar
If someone tells you they can “prove” your school paper, employee report, or investor update was written by ChatGPT — let me save you time: they can’t.
Here’s what’s actually being sold, who’s making money, and why the real liability sits with the buyers, not the bots.
⸻
The Tools Everyone’s Pretending Work
Turnitin — the education monopoly, bundled into LMS systems.
GPTZero — the viral teacher’s pet turned “enterprise solution.”
Copyleaks — promises “99% accuracy” to schools and employers.
Originality.ai — beloved by publishers and SEO shops.
ZeroGPT / Winston AI / Sapling / Grammarly Pro — budget detectors, marketed to SMBs and individuals.
They sell one thing: certainty.
⸻
The Pitch vs. The Punchline
Turnitin: “<1% false positives.”
Copyleaks: “Over 99% accurate.”
Sapling: “97% accuracy.”
Originality.ai: “Highest accuracy in the market.”
Sounds airtight. Until you realize they’re measuring vibes with math.
⸻
How the Math Actually Works
These systems don’t detect “ChatGPT.” They detect probability patterns.
Perplexity: Was this sentence too predictable?
Burstiness: Are the sentence lengths too consistent?
Token probability: Does this look like model output?
In theory, humans are messy. In practice (see the sketch after this list):
Non-native English writers = “too smooth” → flagged.
Edited AI text = messy enough → passes as human.
Humans who binge on ChatGPT writing start mimicking the style unconsciously → flagged as bots.
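For intuition, here's what those signals reduce to. This is a toy sketch in plain Python, not any vendor's actual model; the unigram stand-in for perplexity and the example text are illustrative assumptions.

```python
# Toy sketch of the two signals detectors lean on. Not any vendor's
# actual model; just the shape of the math.
import re
import statistics

def burstiness(text: str) -> float:
    """Variation in sentence length (stdev / mean, in words).
    Low value = uniform sentences, which detectors read as 'AI'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

def predictability(text: str) -> float:
    """Crude stand-in for perplexity: average probability of each word
    under a unigram model of the text itself. Real detectors score
    token probabilities under a large language model instead."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = {w: words.count(w) for w in set(words)}
    return sum(counts[w] / len(words) for w in words) / len(words)

sample = ("The report was finished on time. The data was clean. "
          "The results were clear. The team was pleased.")
print(f"burstiness:     {burstiness(sample):.2f}")      # low -> 'AI-like'
print(f"predictability: {predictability(sample):.4f}")  # high -> 'AI-like'
```

Notice what's missing: nothing in there knows anything about ChatGPT. It's a statistical smell test, and smooth human prose trips it just as easily.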
⸻
Even OpenAI killed its own detector in 2023, admitting it was too inaccurate to keep online.
And that should tell you everything.
If the company that built GPT-4 — with the most insight into how these systems actually behave — couldn’t make a detector good enough to keep public, what do you think a startup is selling you when it claims “99% accuracy”?
OpenAI had all the advantages: insider access to model weights, world-class researchers, unlimited compute. And still, the detector produced so many false positives on human writing and false negatives on AI writing that the company quietly pulled it down. Their statement was blunt: detection “wasn’t reliable.”
No hedge. No “future update coming.” Just a shutdown.
Which means the only thing that has improved since 2023 isn’t the tech — it’s the marketing.
⸻
The Meme Version: Em Dashes, Triads, and the Last-Line Twist
You’ve seen the TikToks: “Too many em dashes? A triadic list? A paragraph that ends with ‘It’s not X, it’s Y’? Must be AI.”
That’s not forensic analysis. That’s punctuation palm-reading.
Em dashes aren’t AI fingerprints — they’re just punctuation with better PR than semicolons.
Triads? Humans have been writing in threes since Moses, Cicero, and the Declaration of Independence.
“Not X but Y” is older than Aristotle.
If style were proof, the New York Times opinion section would be a mass grave of AI violations.
⸻
The Business Model
This isn’t a feature. It’s an industry.
GPTZero is profitable and projects roughly $16M in ARR.
Turnitin rakes in $200M+ annually.
Copyleaks / Originality.ai chase enterprise contracts by promising “compliance risk reduction.”
The pitch is simple: Pay us, or you’ll miss the AI cheaters.
The reality: they’re selling an illusion of certainty that creates new liability instead of reducing it.
⸻
Case Study: Copyleaks’ “99% Accuracy” Claim
Let’s take one concrete example.
Copyleaks markets its detector as “over 99% accurate.” That’s the line plastered across its site, sales decks, and enterprise pitches.
But here’s what’s buried in the fine print (and surfaced in independent tests):
That number comes from lab conditions, not real-world writing.
It assumes long-form samples with 20%+ AI content.
It does not reflect short essays, lightly edited drafts, or human–AI mixes.
When independent reviewers ran the tool on actual student essays and business memos, the results fell closer to 50–60% accuracy — basically a coin flip. Worse, non-native English writers were disproportionately flagged as AI.
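Want to see the trap in numbers? Run the base-rate math. The sketch below is illustrative: the 10% base rate is an assumption, the 99%/1% line mirrors the vendor claim, and the 55% line loosely mirrors the independent tests above.

```python
# Back-of-the-envelope Bayes: what does a detector "flag" actually mean?
# All rates below are illustrative assumptions, not measurements.

def p_ai_given_flag(sensitivity: float, false_positive_rate: float,
                    base_rate: float) -> float:
    """P(text really is AI | detector flagged it), via Bayes' rule."""
    true_flags = sensitivity * base_rate
    false_flags = false_positive_rate * (1 - base_rate)
    return true_flags / (true_flags + false_flags)

base_rate = 0.10  # assume 1 in 10 submissions is actually AI-written

# Vendor-style claim: 99% sensitivity, 1% false positives.
print(f"claimed:  {p_ai_given_flag(0.99, 0.01, base_rate):.0%}")  # ~92%

# Coin-flip territory from independent tests (~55% accuracy).
print(f"observed: {p_ai_given_flag(0.55, 0.45, base_rate):.0%}")  # ~12%

# Human cost of even the *claimed* numbers at scale:
essays, fpr = 10_000, 0.01
print(f"falsely flagged per {essays:,} essays: ~{round(essays * (1 - base_rate) * fpr)}")
```

Even taking the marketing at face value, that's roughly 90 innocent writers flagged per 10,000 essays. At the error rates independent reviewers actually observed, most flags point at humans.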
That’s not 99%. That’s marketing fiction.
And legally? That’s not just embarrassing. It’s dangerous.
The FTC doesn’t treat “99% accurate” as puffery. It treats it as an objective claim — one that requires competent, reliable evidence. If your evidence collapses outside of cherry-picked conditions, you’re not advertising. You’re deceiving.
Which means Copyleaks (and anyone else making the same claim) isn’t just selling software. They’re inviting regulators to make them a test case.
⸻
Why “99% Accuracy” Is a Legal Trap
Even if you skip the teardown, the pattern is the same across the industry. Those glossy numbers always come from:
Controlled lab tests
Long-form samples (short text is nearly impossible to classify; see the simulation after this list)
Cases where the doc is mostly AI, not lightly assisted
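The short-text problem isn't a tuning issue; it's statistics. Burstiness is a variance estimate, and variance estimated from three sentences is mostly noise. Here's a toy simulation; the uniform 5-to-40-word "human writer" is an assumption for illustration, not data.

```python
# Why short text is nearly impossible to classify: the same human
# writer produces wildly different "burstiness" scores on small samples.
import random
import statistics

random.seed(0)

def burstiness_estimate(n_sentences: int) -> float:
    # Simulated human: sentence lengths drawn uniformly from 5-40 words
    # (an illustrative assumption, not real data).
    lengths = [random.randint(5, 40) for _ in range(n_sentences)]
    return statistics.stdev(lengths) / statistics.mean(lengths)

for n in (3, 10, 100):
    estimates = [burstiness_estimate(n) for _ in range(2_000)]
    print(f"{n:>3} sentences: score swings +/- {statistics.stdev(estimates):.2f} "
          f"around {statistics.mean(estimates):.2f}")
```

On a three-sentence sample, the same "messy human" can land anywhere, including squarely in the "AI-uniform" zone. Any threshold a detector draws will flip on short text purely by chance.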
That’s not reality. That’s a marketing deck. And it opens three fronts:
1️⃣ False Advertising (FTC)
The FTC already forced one AI detection company to retract inflated accuracy claims. Section 5 doesn’t care if you had “internal benchmarks.” If your 99% turns into 53% in practice, you’re exposed.
2️⃣ Discrimination (EEOC, Title VI)
Detectors flag non-native English writers at higher rates. That’s disparate impact. Schools and employers who discipline on detector results could face bias complaints.
3️⃣ Defamation & Due Process
Accusing someone of cheating or misconduct based on a statistical guess isn’t just awkward — it’s actionable. And “the algorithm said so” isn’t a defense.
⸻
Regulators Are Already Circling
FTC is targeting AI accuracy claims (Workado, Cleo, AccessiBe all got slapped this year).
DOE is under pressure to investigate biased edtech tools in schools.
EEOC has AI discrimination in its enforcement playbook.
If you’re building a business on “AI detection,” your ARR is effectively a countdown clock until regulators make your slide deck Exhibit A.
And if you’re buying these tools, congratulations — you just imported liability into your compliance stack.
⸻
The Signal Is Weak. The Risk Is Strong.
Used carefully, detectors can raise a flag:
Sudden jumps in polish → maybe check the draft history.
Entirely uniform text → maybe worth a conversation.
But as evidence? They’re worthless. That’s not contrarian — that’s the consensus.
⸻
The Real Strategy
If you’re a founder, operator, or investor, don’t build policy on “99% accuracy.” Build on what can’t be gamed:
Draft history & versioning — Google Docs, Notion, Git: actual proof of authorship (sketch below).
Explainability tests — ask the writer to walk you through the work live.
Policy clarity — define what counts as “AI-assisted” vs. “AI-authored.”
That’s enforceable. That’s defensible. That’s not vibes.
⸻
Bottom Line
AI detection isn’t a lie detector. It’s a guess — one that costs millions, creates bias, and hands plaintiffs an easy lawsuit.
It’s not evidence. It’s a vibes-based alarm system dressed up as compliance tech.
🧠 Treat it like a smoke alarm, not a fire marshal. Use it to check the room. Never to convict the arsonist.
And if you’re an investor betting on “AI detection” as the moat? You’re not funding compliance. You’re underwriting liability.
⸻
🤖 Subscribe to AnaGPT
Every week, I break down the latest legal developments in AI and tech, minus the jargon. Whether you’re a founder, creator, or lawyer, this newsletter will help you stay two steps ahead of the lawsuits.
➡️ Forward this post to someone working on AI. They’ll thank you later.
➡️ Follow Ana on Instagram @anajuneja
➡️ Add Ana on LinkedIn @anajuneja