How Many End-to-End Tests Do You Actually Need?
Updated on April 19, 2026
Most teams do not have an end-to-end testing problem. They have a scope problem.
The usual failure mode is familiar: a team starts with a few high-value browser tests, gets burned by a production bug, then responds by automating everything it can see. Six months later the suite is slow, flaky, and politically radioactive. Engineers stop trusting failures. Product stops trusting releases. Nobody can tell whether the problem is the app or the test harness.
That is backwards. End-to-end tests should be the most selective tests in your stack, not the biggest pile in it. Martin Fowler’s testing pyramid still holds up here: keep far more tests lower in the stack, and use UI tests sparingly because they are slower and more expensive to maintain. Google has made the same point bluntly for years: do not solve every quality gap by adding more end-to-end tests.
A healthy browser suite is not trying to prove every detail of the product. It is trying to prove that the product’s most important user journeys still work.
If a failure in a workflow would block revenue, onboarding, authentication, permissions, or core task completion, that workflow deserves end-to-end coverage. If a behavior can be verified more cheaply at the unit, component, or API layer, it usually belongs there instead. Playwright’s own guidance pushes teams to test what matters to users and avoid implementation details. That advice sounds simple, but it eliminates a huge amount of waste.
A practical rule is this: if losing a flow would trigger an incident, support surge, or rollback, automate it in the browser. If losing it would be annoying but containable, look for a lower-level test first.
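The rule above can be sketched as a small checklist function. This is an illustrative sketch only; the type names, fields, and return values are assumptions made for this example, not part of any testing library.

```typescript
// Sketch of the "incident-level" rule. Every name here is illustrative.
type WorkflowRisk = {
  blocksRevenue: boolean;        // e.g. checkout, billing
  blocksOnboarding: boolean;     // e.g. sign-up
  blocksAuthentication: boolean; // e.g. login, session handling
  blocksPermissions: boolean;    // e.g. role and access checks
  blocksCoreTask: boolean;       // the thing users came to do
};

// Returns "browser" when losing the flow would trigger an incident,
// support surge, or rollback; otherwise points at a cheaper layer first.
function testLayerFor(risk: WorkflowRisk): "browser" | "unit/component/api" {
  const incidentLevel =
    risk.blocksRevenue ||
    risk.blocksOnboarding ||
    risk.blocksAuthentication ||
    risk.blocksPermissions ||
    risk.blocksCoreTask;
  return incidentLevel ? "browser" : "unit/component/api";
}

// Example: checkout blocks revenue, so it earns a browser test.
const layer = testLayerFor({
  blocksRevenue: true,
  blocksOnboarding: false,
  blocksAuthentication: false,
  blocksPermissions: false,
  blocksCoreTask: false,
});
console.log(layer); // → "browser"
```

The point of writing it down, even as pseudocode, is that the decision becomes a team convention instead of a per-PR argument.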
Here is the short list most teams should start with: authentication, onboarding, the checkout or whatever flow drives revenue, permission boundaries, and the product’s core task completion.
This is the discipline strong QA cultures adopt. They do not confuse “important” with “must be a browser test.” They reserve browser automation for flows where only a real browser can provide credible confidence.
You almost certainly have too many E2E tests if any of these are true: the suite is too slow to run on every change, failures are flaky enough that reruns are routine, small UI changes break large numbers of tests, or the main reason you keep adding tests is to feel safer.
That last point matters most. Bigger suites feel safer, but they often reduce signal. Google’s warning against piling on more end-to-end tests is really a warning against buying the wrong kind of confidence.
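These symptoms are measurable. A minimal sketch of a suite health check follows; the metric names and thresholds are assumptions chosen for illustration, and real teams should calibrate them against their own release cadence.

```typescript
// Illustrative E2E suite health check. Thresholds are assumptions, not standards.
type SuiteMetrics = {
  runtimeMinutes: number;         // wall-clock time for a full run
  flakeRate: number;              // fraction of failures that pass on rerun
  testsBrokenPerUiChange: number; // average tests touched by a routine UI change
};

// Collects human-readable warnings; an empty list means no obvious smells.
function suiteWarnings(m: SuiteMetrics): string[] {
  const warnings: string[] = [];
  if (m.runtimeMinutes > 30) {
    warnings.push("suite too slow to run on every change");
  }
  if (m.flakeRate > 0.05) {
    warnings.push("failures no longer carry signal");
  }
  if (m.testsBrokenPerUiChange > 5) {
    warnings.push("tests are coupled to implementation details");
  }
  return warnings;
}

// Example: a slow, flaky suite trips two of the three warnings.
const report = suiteWarnings({
  runtimeMinutes: 90,
  flakeRate: 0.2,
  testsBrokenPerUiChange: 2,
});
console.log(report.length); // → 2
```

Tracking even crude numbers like these turns "the suite feels bad" into a trend you can act on.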
For fast-moving product teams, especially teams shipping UI changes daily, the goal is not maximum browser coverage. The goal is maximum trust per browser test.
That means each end-to-end test should do one valuable job: prove, in a real browser, that a critical user journey still works for a real user.
Everything else should move down the stack or out of automation entirely.
This is also why the best teams now talk more about verification than raw test counts. A smaller suite of durable, user-centered checks beats a giant suite that mostly reenacts implementation details. That philosophy is visible in how Shiplight AI talks about quality: verification belongs inside the development loop, and maintenance overhead is a quality problem in its own right.
Do not ask, “Can we automate this?”
Ask, “Is the browser the cheapest place to learn this?”
If the answer is no, do not write the test there. If the answer is yes, make it count.
That single decision keeps test suites lean, trustworthy, and fast enough to matter. And in practice, it usually means you need fewer end-to-end tests than your team thinks, but much better ones.