Locators Are a Cache: The Mental Model for E2E Tests That Survive UI Change
January 1, 1970
End-to-end testing has a reputation problem. Not because E2E is the wrong level of validation, but because too many teams build E2E suites on a fragile foundation: selectors treated as truth.
That foundation collapses the moment a product team does what product teams are supposed to do: iterate. A button label changes, a layout shifts, a component gets refactored. Suddenly your “reliable” suite becomes a maintenance queue.
A better approach starts with a reframing:
Locators should be a performance cache, not a hard dependency.
That mental model is baked into Shiplight AI’s test authoring and execution system, where tests are expressed as intent (what the user is trying to do), then accelerated with deterministic locators when it makes sense. When the UI moves, Shiplight can fall back to intent, recover the step, and keep the suite operational.
Below is a practical, implementation-minded guide to building E2E coverage that stays fast, readable, and resilient as your product evolves.
Most flaky suites are not flaky because browsers are unpredictable. They are flaky because we encode incidental details (DOM structure, CSS selectors, brittle IDs) into tests as if those details were requirements.
Your requirements are things like:
Your requirements are not:
Shiplight’s approach is to keep the test’s meaning stable even when the interface is not. Shiplight runs on top of Playwright, but it adds an intent layer so tests are authored as user actions and outcomes, not selector plumbing.
Write tests as natural language intent, enrich them with deterministic locators for speed, and treat those locators as a cache that can be healed when the UI changes.
In Shiplight’s YAML-based tests, you can mix three important types of statements:
VERIFY: (asserting outcomes in plain language)
Here is what that looks like at a simple starting point:
goal: Verify user can create a new project
url: https://app.example.com/projects
statements:
- Click the "New Project" button
- Enter "My Test Project" in the project name field
- Click "Create"
- "VERIFY: Project page shows title 'My Test Project'"
As you refine the test, you can enrich steps with explicit Playwright locators for deterministic replay:
- description: Click Create
  action_entity:
    locator: "getByRole('button', { name: 'Create' })"
    action_data:
      action_name: click
The key detail is not the syntax. It is the philosophy: the locator accelerates the intent, but does not replace it. When a locator goes stale, Shiplight can recover by falling back to the natural language description and finding the correct element. In Shiplight Cloud, the platform can then update the cached locator after a successful heal, so future runs stay fast.
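The cache-and-heal behavior described above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not Shiplight's implementation: the `PageElement` model, the `resolveByIntent` matcher (a naive label match standing in for AI-based resolution), and `resolveStep` are all hypothetical names invented for this example.

```typescript
// Hypothetical model of a rendered page: each element has a selector and a visible label.
interface PageElement {
  selector: string;
  label: string;
}

// Cache of step description -> last selector that worked for that step.
type LocatorCache = Record<string, string>;

interface Resolution {
  element: PageElement;
  source: "cache" | "healed";
}

// Stand-in for intent-based resolution (an assumption for this sketch).
// It simply looks for an element whose label appears in the description.
function resolveByIntent(description: string, page: PageElement[]): PageElement | undefined {
  const wanted = description.toLowerCase();
  for (const el of page) {
    if (wanted.indexOf(el.label.toLowerCase()) !== -1) return el;
  }
  return undefined;
}

function resolveStep(description: string, page: PageElement[], cache: LocatorCache): Resolution {
  const cached = cache[description];
  if (cached !== undefined) {
    // Fast path: the cached selector still matches an element on the page.
    for (const el of page) {
      if (el.selector === cached) return { element: el, source: "cache" };
    }
  }
  // Slow path: the cache missed or went stale, so fall back to the intent.
  const healed = resolveByIntent(description, page);
  if (healed === undefined) {
    throw new Error(`No element matches intent: ${description}`);
  }
  // Write the working selector back so the next run takes the fast path.
  cache[description] = healed.selector;
  return { element: healed, source: "healed" };
}
```

The description stays the source of truth throughout; the selector is consulted first only because it is cheaper to check.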
Self-healing is only useful if it is predictable. Shiplight’s AI SDK exposes a step() method that wraps Playwright actions with intent. Your code runs normally, but if it throws (selector not found, timeout, UI shift), Shiplight uses the step description to recover and attempt an alternative path to the same goal.
That design encourages a best practice many teams miss:
Describe what you are trying to accomplish, not how the DOM currently happens to implement it.
This is how you keep tests aligned with product behavior, even when implementation details churn.
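To make the pattern concrete, here is a minimal sketch of an intent-carrying step wrapper. It is not the Shiplight SDK's `step()` signature, which is not shown here: `runStep`, `Recover`, and `StepOutcome` are hypothetical names, and the wrapper is kept synchronous so it runs without a browser (a real Playwright action is asynchronous and would be awaited).

```typescript
// Whether the step succeeded directly or only after recovery.
type StepOutcome = "passed" | "recovered";

// Hypothetical recovery hook: receives the step's intent and the original failure.
type Recover = (description: string, cause: Error) => void;

function runStep(description: string, action: () => void, recover: Recover): StepOutcome {
  try {
    // Normal path: the scripted action runs exactly as written.
    action();
    return "passed";
  } catch (err) {
    const cause = err instanceof Error ? err : new Error(String(err));
    try {
      // Failure path: hand the intent to the recovery hook, not the selector.
      recover(description, cause);
      return "recovered";
    } catch (recoveryErr) {
      const detail = recoveryErr instanceof Error ? recoveryErr.message : String(recoveryErr);
      // Report both failures so the original cause is not lost.
      throw new Error(`Step "${description}" failed: ${cause.message}; recovery failed: ${detail}`);
    }
  }
}
```

Because recovery only ever sees the description, a step described as "Click Create" gives it something to work with, while a step described as "click div.btn-3" would not.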
Resilient execution matters, but teams still need to understand failures quickly. Shiplight invests heavily in “debugging as a first-class workflow,” both locally and in the cloud.
Shiplight provides a VS Code extension that lets you run and debug .test.yaml files in an interactive webview panel. You can step through statements, edit action entities inline, watch the browser session in real time, and rerun immediately.
In the cloud test editor, debugging includes step-by-step execution, “run until” partial execution, a live browser view, a screenshot gallery with before and after comparisons, and console plus context panels for logs and variables.
This is the difference between “a test failed” and “here is exactly what the user saw, what the system did, and where behavior diverged.”
Even with strong debugging tools, teams waste time translating raw failures into decisions. Shiplight Cloud includes AI Test Summary for failed runs, generating a structured explanation: root cause analysis, expected vs actual behavior, recommendations, and visual analysis of screenshots when available. Summaries are generated when first viewed and then cached for fast subsequent access.
The practical outcome is lower mean time to diagnosis, especially for teams running many suites across multiple environments.
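The "generated when first viewed, then cached" behavior is ordinary lazy memoization. A minimal sketch, assuming a hypothetical `generate` function that stands in for the expensive analysis; the `FailureSummary` fields mirror the parts listed above, but the type and `createSummaryStore` are invented for illustration:

```typescript
// Illustrative shape only; field names are assumptions based on the parts listed above.
interface FailureSummary {
  rootCause: string;
  expected: string;
  actual: string;
  recommendations: string[];
}

// Hypothetical generator standing in for the expensive analysis of a failed run.
type Generate = (runId: string) => FailureSummary;

function createSummaryStore(generate: Generate): (runId: string) => FailureSummary {
  const cache: Record<string, FailureSummary> = {};
  return function view(runId: string): FailureSummary {
    if (!Object.prototype.hasOwnProperty.call(cache, runId)) {
      // First view of this run: pay the generation cost once.
      cache[runId] = generate(runId);
    }
    // Every later view returns the stored summary.
    return cache[runId];
  };
}
```

Runs that nobody opens never incur the generation cost, which matters when many suites run across many environments.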
Many E2E programs quietly avoid email-driven journeys because they are annoying to automate. Yet those flows are often the highest-leverage ones to validate.
Shiplight supports Email Content Extraction so tests can read forwarded emails and extract verification codes, activation links, or custom content using an LLM-based extractor, without regex-heavy parsing. In Shiplight, you configure a forwarded address (for example xxxx@forward.shiplight.ai) and then use an EXTRACT_EMAIL_CONTENT step that outputs variables like email_otp_code or email_magic_link for later steps.
That unlocks reliable coverage for password resets, MFA, sign-in links, onboarding, and billing notifications.
Shiplight Cloud integrates with GitHub Actions via an API token stored as a GitHub secret (SHIPLIGHT_API_TOKEN). Shiplight’s documentation outlines the workflow: create a token in Shiplight, store it in GitHub secrets, and wire suites into your PR and deployment pipelines.
This is where the “locators are a cache” model pays dividends. You can gate releases on E2E without turning your team into full-time test maintainers.
Shiplight is built as a verification platform for AI-native development, connecting to coding agents via MCP so agents can verify UI changes in a real browser while building, then turn those verifications into regression tests.
For teams with enterprise requirements, Shiplight also positions itself as SOC 2 Type II certified with a 99.99% uptime SLA and support for private cloud and VPC deployments.
If your E2E suite breaks every time your product improves, the issue is not your team’s discipline. It is the model.
Treat intent as the source of truth. Treat locators as a cache. Invest in debugging and diagnosis. Cover the hard flows, including email. Then connect it all to the development loop so verification happens where software is built.
That is the path to E2E coverage that scales with your roadmap instead of fighting it.