Recommendations for Running Targeted Subsets of Tests Before Production Deploys
Updated on April 28, 2026
Full regression suites are a luxury most teams cannot afford on every deploy. Even when you can run everything, you often should not. A long, noisy pipeline creates two failure modes that look different but feel the same in production: teams start merging because “it’s probably fine,” and real risk slips through behind a wall of low-signal failures.
The practical alternative is targeted testing: running the smallest set of end-to-end checks that meaningfully reduces release risk right before you deploy. Most teams call this a critical-path suite, but the label matters less than the discipline behind it: choose tests based on user impact, change impact, and operational risk, not tradition.
Below are field-tested recommendations for building targeted subsets that stay trustworthy over time, plus how Shiplight AI helps teams keep these suites fast, readable, and enforceable.
A critical-path suite is not “the top 20 tests.” It is the set of user journeys that, if broken, immediately harms revenue, retention, trust, or compliance. For many products, that includes:
The key is that “critical path” is defined by blast radius, not by the engineering team’s guess of what is “likely to break.” The best suites represent how the business fails when the product fails.
Most mature CI/CD setups converge on a small set of suites that run at different times for different reasons.
Targeted testing works best when these subsets are explicit artifacts in your workflow, not ad hoc selections someone curates during an incident.
“Tests must pass” is not a gate. It is a slogan. A real gate defines what happens when reality is messy:
Teams that ship reliably decide this before the incident. A good default is: critical-path failures block deploys, but only after flakiness is actively managed and triaged. If you accept overrides, require a short, logged rationale.
Shiplight AI helps here by making failures easier to interpret. With intent-based steps and AI-powered assertions that validate UI rendering and context, teams get fewer false alarms and more actionable failures, which makes “block on red” realistic instead of aspirational.
Traditional end-to-end automation often binds to selectors, brittle DOM structure, or fragile waits. That fragility pushes teams into a pattern of keeping the critical path small because maintenance is painful.
Invert that dynamic: author the suite in user language and let the tooling handle UI churn.
Shiplight’s intent-based execution is designed for this. Tests are expressed as user actions (“click the login button,” “fill shipping address”), then executed in real browsers with self-healing behavior when UI elements move or change. The outcome is a critical-path suite that stays stable even as your UI evolves.
Targeted testing breaks down when “critical” becomes a vibe. Make it a tag, and treat tags as part of the system:
Once tagging exists, enforcement becomes straightforward: the pre-deploy stage runs tests matching “critical-path AND pre-deploy,” and the post-deploy stage runs “critical-path AND post-deploy.”
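A sketch of that AND-of-tags selection, assuming tests carry a set of tag strings (the test names and tags below are illustrative, not tied to any specific tool):

```python
def select(tests: dict[str, set[str]], required: set[str]) -> list[str]:
    """Select tests carrying every tag in `required` (AND semantics)."""
    return sorted(name for name, tags in tests.items() if required <= tags)

# Hypothetical suite: each test maps to its tags.
suite = {
    "login_flow":       {"critical-path", "pre-deploy"},
    "checkout_flow":    {"critical-path", "pre-deploy", "post-deploy"},
    "profile_settings": {"post-deploy"},
}
```

With this in place, each pipeline stage is just a different `required` set against the same test inventory, so nothing is duplicated per stage.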
Shiplight supports test suite organization and targeted execution, so teams can run exactly the slices they intend without duplicating tests or maintaining separate files for each pipeline stage.
Critical-path coverage should be stable, but pre-deploy confidence should also be adaptive. The best pre-deploy strategy is often:
Shiplight’s auto-generated tests from pull requests are built for this pattern. When the PR introduces or changes a user-facing flow, generating targeted coverage tied to the diff helps you catch regressions that a static critical path cannot anticipate, without inflating the permanent suite.
Targeted suites that creep from 10 minutes to 45 minutes tend to get bypassed. Protect speed explicitly:
Shiplight’s cloud runners make parallel execution a default, which is often the difference between “we always run this before deploy” and “we run it when we have time.”
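To see why parallelism protects the time budget, it helps to estimate suite wall-clock time under sharding. The sketch below uses greedy longest-first scheduling over per-test durations (in seconds); the durations and 10-minute budget are illustrative assumptions.

```python
def shard_runtime(durations: list[float], workers: int) -> float:
    """Greedy longest-first scheduling: assign each test to the
    least-loaded worker; return the estimated wall-clock time."""
    loads = [0.0] * workers
    for d in sorted(durations, reverse=True):
        loads[loads.index(min(loads))] += d
    return max(loads)

def within_budget(durations: list[float], workers: int,
                  budget_minutes: float = 10.0) -> bool:
    return shard_runtime(durations, workers) <= budget_minutes * 60
```

For example, five tests totaling 15 minutes serially can fit a 10-minute gate on three workers, because the wall clock is bounded by the most loaded shard rather than the sum.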
A flaky test in the critical path is not annoying. It is a broken safety mechanism. If teams learn they can ignore red, the pipeline stops being a control.
Operationally, you want two things:
Shiplight’s live dashboards and reporting help teams track suite health and identify which tests are reliable enough to be part of a deploy gate. Combined with self-healing and evidence-rich debugging artifacts, the path from “this is flaky” to “this is fixed” shortens dramatically.
A practical day one setup many teams adopt looks like this:
Shiplight fits naturally into this model because it reduces the two costs that typically kill targeted testing programs: authoring friction and maintenance. When tests are easy to create in plain language, resilient to UI change, and straightforward to organize into suites that match your release process, “targeted before deploy” stops being a best-practice slide and becomes a habit.
The most reliable critical-path suites reflect how your users experience value. That means product, engineering, and QA need a shared definition of “must not break,” and an execution layer that can keep up with the pace of UI change.
If you want to make pre-deploy gates faster, clearer, and harder to bypass, Shiplight AI is built for exactly that: intent-based end-to-end testing in real browsers, with self-healing reliability and suite management that makes targeted execution the default instead of the exception.