Locators Are a Cache: The Mental Model for E2E Tests That Survive UI Change

Shiplight AI Team

Updated on April 1, 2026

End-to-end testing has a reputation problem. Not because E2E is the wrong level of validation, but because too many teams build E2E suites on a fragile foundation: selectors treated as truth.

That foundation collapses the moment a product team does what product teams are supposed to do: iterate. A button label changes, a layout shifts, a component gets refactored. Suddenly your “reliable” suite becomes a maintenance queue.

A better approach starts with a reframing:

Locators should be a performance cache, not a hard dependency.

That mental model is baked into Shiplight AI’s test authoring and execution system, where tests are expressed as intent (what the user is trying to do), then accelerated with deterministic locators when it makes sense. When the UI moves, Shiplight can fall back to intent, recover the step, and keep the suite operational.

Below is a practical, implementation-minded guide to building E2E coverage that stays fast, readable, and resilient as your product evolves.

The core failure mode: turning UI structure into “requirements”

Most flaky suites are not flaky because browsers are unpredictable. They are flaky because we encode incidental details (DOM structure, CSS selectors, brittle IDs) into tests as if those details were requirements.

Your requirements are things like:

  • A user can log in.
  • A checkout completes.
  • A permission boundary is enforced.
  • A magic link signs a user in.

Your requirements are not:

  • This button must be the third element inside the second container.
  • This class name must never change.

Shiplight’s approach is to keep the test’s meaning stable even when the interface is not. Shiplight runs on top of Playwright, but it adds an intent layer so tests are authored as user actions and outcomes, not selector plumbing.

Shiplight’s execution model in one sentence

Write tests as natural language intent, enrich them with deterministic locators for speed, and treat those locators as a cache that can be healed when the UI changes.

In Shiplight’s YAML-based tests, you can mix three important types of steps:

  1. Natural language steps (Shiplight’s web agent resolves actions at runtime)
  2. Deterministic “action entities” with locators (fast replay, typically around a second per step)
  3. AI-powered assertions using VERIFY: (asserting outcomes in plain language)

Here is what that looks like at a simple starting point:

goal: Verify user journey
statements:
  - intent: Navigate to the application
  - intent: Perform the user action
  - VERIFY: the expected result

As you refine the test, you can enrich steps with explicit Playwright locators for deterministic replay:

  - description: Click Create
    step:
      locator: "getByRole('button', { name: 'Create' })"
      action_data:
        action_name: click

The key detail is not the syntax. It is the philosophy: the locator accelerates the intent, but does not replace it. When a locator goes stale, Shiplight can recover by falling back to the natural language description and finding the correct element. In Shiplight Cloud, the platform can then update the cached locator after a successful heal, so future runs stay fast.
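To make the cache analogy concrete, here is a minimal TypeScript sketch of the pattern. It is not Shiplight's implementation: `LocatorCache`, `Page`, and `IntentResolver` are hypothetical names invented for illustration, the page is a fake, and the resolver stands in for whatever agent maps an intent to a fresh locator.

```typescript
// Hypothetical sketch of "locators are a cache". None of these names come
// from Shiplight's SDK; the page and the resolver are stand-ins.
type Locator = string;

interface Page {
  // Returns true if the locator matched an element and the click succeeded.
  click(locator: Locator): boolean;
}

// Slow path: stands in for an agent that turns an intent into a fresh locator.
type IntentResolver = (intent: string, page: Page) => Locator | undefined;

interface StepResult {
  locator: Locator;
  healed: boolean;
}

class LocatorCache {
  private entries = new Map<string, Locator>();
  private resolve: IntentResolver;

  constructor(resolve: IntentResolver) {
    this.resolve = resolve;
  }

  // Record a known-good locator for an intent (for example, from authoring).
  seed(intent: string, locator: Locator): void {
    this.entries.set(intent, locator);
  }

  click(intent: string, page: Page): StepResult {
    const cached = this.entries.get(intent);
    // Fast path: the cached locator still works, so the resolver is not called.
    if (cached !== undefined && page.click(cached)) {
      return { locator: cached, healed: false };
    }
    // Cache miss or stale entry: fall back to the intent itself.
    const fresh = this.resolve(intent, page);
    if (fresh === undefined || !page.click(fresh)) {
      throw new Error(`Could not complete step: ${intent}`);
    }
    // Successful heal: update the cache so the next run takes the fast path.
    this.entries.set(intent, fresh);
    return { locator: fresh, healed: true };
  }
}

// Fake page in which the button was renamed from "Create" to "New project".
const renamed = "getByRole('button', { name: 'New project' })";
const page: Page = { click: (locator) => locator === renamed };

let resolverCalls = 0;
const cache = new LocatorCache(() => {
  resolverCalls += 1;
  return renamed;
});
cache.seed("Click Create", "getByRole('button', { name: 'Create' })");

const first = cache.click("Click Create", page);  // stale locator, healed via intent
const second = cache.click("Click Create", page); // served from the updated cache
```

In this sketch the first call pays for one resolver round trip and the second does not, which is the trade the cache framing describes: the intent stays authoritative, and the locator is only a faster way to reach the same element.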

Self-healing that is grounded in intent, not guesswork

Self-healing is only useful if it is predictable. Shiplight’s AI SDK exposes a step() method that wraps Playwright actions with intent. Your code runs normally, but if it throws (selector not found, timeout, UI shift), Shiplight uses the step description to recover and attempt an alternative path to the same goal.
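The shape of such a wrapper can be sketched in a few lines of TypeScript. This is an assumption-laden illustration, not the signature of Shiplight's `step()`: the `recover` callback, the `Action` and `Recovery` types, and the return value are all hypothetical, and the recovery here is a stub rather than an AI agent.

```typescript
// Hypothetical sketch of an intent-wrapping step helper. This is NOT the
// signature of Shiplight's step(); the recover callback is a stand-in.
type Action<T> = () => Promise<T>;
type Recovery<T> = (description: string, cause: unknown) => Promise<T>;

async function step<T>(
  description: string,
  action: Action<T>,
  recover: Recovery<T>,
): Promise<T> {
  try {
    // Normal case: the scripted action runs unchanged.
    return await action();
  } catch (cause) {
    // Failure case: recovery receives the intent and the original error,
    // not the selector that just failed.
    return recover(description, cause);
  }
}

const log: string[] = [];

// The scripted action fails the way a stale selector would.
const outcome = step<string>(
  "Click the button that creates a new project",
  async () => {
    throw new Error("locator timed out");
  },
  async (description, cause) => {
    const reason = cause instanceof Error ? cause.message : String(cause);
    log.push(`recovering "${description}" after: ${reason}`);
    return "recovered";
  },
);
```

The point of the design is visible in what `recover` receives: a description written in terms of the DOM ("click the third div") gives recovery nothing to work with, while a description of the goal does.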

That design encourages a best practice many teams miss:

Describe what you are trying to accomplish, not how the DOM currently happens to implement it.

This is how you keep tests aligned with product behavior, even when implementation details churn.

Debugging without the context switching tax

Resilient execution matters, but teams still need to understand failures quickly. Shiplight invests heavily in “debugging as a first-class workflow,” both locally and in the cloud.

In VS Code: debug .test.yaml visually

Shiplight provides a VS Code extension that lets you run and debug .test.yaml files in an interactive webview panel. You can step through statements, edit action entities inline, watch the browser session in real time, and rerun immediately.

In Shiplight Cloud: live view, screenshots, logs, and context

In the cloud test editor, debugging includes step-by-step execution, “run until” partial execution, a live browser view, a screenshot gallery with before and after comparisons, and console plus context panels for logs and variables.

This is the difference between “a test failed” and “here is exactly what the user saw, what the system did, and where behavior diverged.”

Making failures actionable with AI summaries

Even with strong debugging tools, teams waste time translating raw failures into decisions. Shiplight Cloud includes AI Test Summary for failed runs, generating a structured explanation: root cause analysis, expected vs actual behavior, recommendations, and visual analysis of screenshots when available. Summaries are generated when first viewed and then cached for fast subsequent access.

The practical outcome is lower mean time to diagnosis, especially for teams running many suites across multiple environments.

Do not skip the hard flows: email verification and magic links

Many E2E programs quietly avoid email-driven journeys because they are annoying to automate. Yet those flows are often the highest-leverage ones to validate.

Shiplight supports Email Content Extraction so tests can read forwarded emails and extract verification codes, activation links, or custom content using an LLM-based extractor, without regex-heavy parsing. In Shiplight, you configure a forwarded address (for example xxxx@forward.shiplight.ai) and then use an EXTRACT_EMAIL_CONTENT step that outputs variables like email_otp_code or email_magic_link for later steps.

That unlocks reliable coverage for password resets, MFA, sign-in links, onboarding, and billing notifications.

Bring it into CI with GitHub Actions

Shiplight Cloud integrates with GitHub Actions via an API token stored as a GitHub secret (SHIPLIGHT_API_TOKEN). Shiplight’s documentation outlines the workflow: create a token in Shiplight, store it in GitHub secrets, and wire suites into your PR and deployment pipelines.

This is where the “locators are a cache” model pays dividends. You can gate releases on E2E without turning your team into full-time test maintainers.

Where Shiplight fits

Shiplight is built as a verification platform for AI-native development, connecting to coding agents via MCP so agents can verify UI changes in a real browser while building, then turn those verifications into regression tests.

For teams with enterprise requirements, Shiplight also positions itself as SOC 2 Type II certified with a 99.99% uptime SLA and support for private cloud and VPC deployments.

The takeaway

If your E2E suite breaks every time your product improves, the issue is not your team’s discipline. It is the model.

Treat intent as the source of truth. Treat locators as a cache. Invest in debugging and diagnosis. Cover the hard flows, including email. Then connect it all to the development loop so verification happens where software is built.

That is the path to E2E coverage that scales with your roadmap instead of fighting it.

Key Takeaways

  • Verify in a real browser during development. Shiplight's MCP server lets AI coding agents validate UI changes before code review.
  • Generate stable regression tests automatically. Verifications become YAML test files that self-heal when the UI changes.
  • Reduce maintenance with AI-driven self-healing. Cached locators keep execution fast; AI resolves only when the UI has changed.
  • Integrate E2E testing into CI/CD as a quality gate. Tests run on every PR, catching regressions before they reach staging.

Frequently Asked Questions

What is AI-native E2E testing?

AI-native E2E testing uses AI agents to create, execute, and maintain browser tests automatically. Unlike traditional test automation that requires manual scripting, AI-native tools like Shiplight interpret natural language intent and self-heal when the UI changes.

How do self-healing tests work?

Self-healing tests use AI to adapt when UI elements change. Shiplight uses an intent-cache-heal pattern: cached locators provide deterministic speed, and AI resolution kicks in only when a cached locator fails, which combines speed with resilience.

What is MCP testing?

MCP (Model Context Protocol) lets AI coding agents connect to external tools. Shiplight's MCP server enables agents in Claude Code, Cursor, or Codex to open a real browser, verify UI changes, and generate tests during development.

How do you test email and authentication flows end-to-end?

Shiplight supports testing full user journeys including login flows and email-driven workflows. Tests can interact with real inboxes and authentication systems, verifying the complete path from UI to inbox.
