AI Test Debt

AI test debt is the accumulated quality liability that results when AI coding agents author production code faster than tests can keep up — including untested code paths, brittle scripts that survived a redesign by accident, and quarantined tests left unfixed. Like financial debt, it compounds.

In one sentence

AI test debt is the testing-layer analogue of technical debt — quality liabilities that accrue when the code-authoring layer outruns the test-authoring layer, specifically because AI coding agents ship faster than humans verify.

Three components

Component | What it is | Symptom
Coverage debt | New code paths shipped without tests | Coverage decay trending positive
Stability debt | Flaky tests accumulating, often quarantined without fixes | Quarantine list grows, never shrinks
Resolution debt | Tests that pass against UI changes by accident: wrong elements selected, real regressions hidden | Test pass-rate stays high but production incidents rise

Why AI coding agents specifically

Before AI coding agents, development was rate-limited by human authoring throughput, so test authoring roughly kept pace. With AI agents, the code loop accelerates 2–5× while the test loop stays human-bound unless tests are also AI-generated. The asymmetry creates debt by default.
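The asymmetry can be illustrated with a little arithmetic. The sketch below is a toy model, not a measurement: it assumes one test covers one code path, a fixed human test-writing rate, and a hypothetical `speedup` multiplier on code output. The function name and all numbers are illustrative assumptions.

```python
def untested_backlog(weeks, code_paths_per_week, tests_per_week, speedup):
    """Return the number of untested code paths at the end of each week.

    Toy assumptions: each test covers exactly one code path, and the
    test-writing rate stays fixed (human-bound) while code output is
    multiplied by `speedup`.
    """
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += code_paths_per_week * speedup  # code paths shipped this week
        backlog -= min(backlog, tests_per_week)   # paths that received a test
        history.append(backlog)
    return history


# Human-paced code: tests keep up and the backlog stays at zero.
print(untested_backlog(4, 10, 10, 1))  # [0, 0, 0, 0]
# Code loop 3x faster, test loop unchanged: the backlog grows every week.
print(untested_backlog(4, 10, 10, 3))  # [20, 40, 60, 80]
```

Under these assumptions the backlog grows by the difference between the two rates every week, which is why the debt appears even when nobody decides to skip tests.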

How to measure

You cannot fix what you cannot see. Track:

  • Changed-surface-area coverage for the last 30 days (see coverage decay).
  • Quarantine list size and average age — growing list and rising age signal stability debt.
  • Test pass-rate vs production incident rate divergence — if tests are increasingly green while incidents rise, resolution debt is the likely cause.
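The three signals above could be computed roughly as follows. This is a minimal sketch under stated assumptions: the `QuarantinedTest` record, the function names, and the input shapes (sets of changed and covered line identifiers, per-period pass rates and incident counts) are hypothetical and not taken from any particular coverage or CI tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class QuarantinedTest:
    # Hypothetical record: a test name and the day it was quarantined.
    name: str
    quarantined_on: date


def changed_surface_coverage(changed_lines: set, covered_lines: set) -> float:
    """Fraction of recently changed lines that at least one test executes."""
    if not changed_lines:
        return 1.0  # nothing changed in the window, so nothing is uncovered
    return len(changed_lines & covered_lines) / len(changed_lines)


def quarantine_stats(tests: list, today: date) -> tuple:
    """Return (list size, average age in days) for the quarantine list."""
    if not tests:
        return 0, 0.0
    ages = [(today - t.quarantined_on).days for t in tests]
    return len(tests), sum(ages) / len(ages)


def diverging(pass_rates: list, incident_counts: list) -> bool:
    """True when the pass rate did not fall while incidents rose.

    Compares only the first and last period of the window, which is a
    deliberately crude stand-in for a real trend analysis.
    """
    return pass_rates[-1] >= pass_rates[0] and incident_counts[-1] > incident_counts[0]
```

In practice the changed lines would come from version-control diffs for the last 30 days and the covered lines from a coverage report; how those are obtained depends on the toolchain and is left out here.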

How to pay it down

  • Generate tests in the same loop as code — the coding agent calls an agent-native QA tool to author tests for its own changes.
  • Time-box quarantine — tests that don't return to the blocking suite within 30 days are reviewed for deletion or rewrite.
  • Validate self-healing diffs: whenever a UI redesign causes a self-healing test to heal, the resulting diff must appear in PR review, so resolution debt is visible.
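The quarantine time box is the easiest of these to automate. The sketch below assumes the quarantine list is available as a mapping from test name to quarantine date; that shape, the function name, and the test names in the usage example are hypothetical. Only the 30-day limit comes from the text above.

```python
from datetime import date, timedelta

QUARANTINE_LIMIT = timedelta(days=30)  # the 30-day time box described above


def overdue_quarantines(quarantined: dict, today: date) -> list:
    """Return names of tests quarantined longer than the limit, oldest first."""
    overdue = [
        name for name, since in quarantined.items()
        if today - since > QUARANTINE_LIMIT
    ]
    return sorted(overdue, key=lambda name: quarantined[name])


# Hypothetical quarantine list for illustration.
quarantined = {
    "test_login": date(2024, 1, 1),
    "test_cart": date(2024, 2, 20),
    "test_search": date(2023, 12, 1),
}
print(overdue_quarantines(quarantined, date(2024, 3, 1)))
# ['test_search', 'test_login']
```

A scheduled CI job could run a check like this and open a review ticket for each overdue test, so that deletion or rewrite becomes a deliberate decision rather than something that never happens.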

What AI test debt is not

  • Not the same as low coverage: a team can have low coverage and no growing debt if the code is not changing (for example, a stable, idle codebase).
  • Not the same as flake count — flakes are one component; coverage and resolution debt are distinct categories.
