AI-native teams are shipping UI changes faster than traditional QA workflows can realistically absorb. When pull requests multiply and interfaces evolve daily, the bottleneck is no longer writing code. It is proving, continuously and credibly, that what shipped still works.
Shiplight AI is built for that new reality: verifying changes in real browsers during development, turning those verifications into stable regression tests, and running them with the operational rigor required to gate releases. Below is a service-by-service guide to what Shiplight offers, who each service is for, and how teams use them together to keep quality high without turning test maintenance into a second job.
Shiplight Plugin for AI coding agents
If your team builds with Claude Code, Cursor, Codex, or GitHub Copilot, Shiplight’s Plugin is designed to put verification into the same loop where code is written. The Plugin combines a Browser MCP server (the “eyes and hands” that can navigate a real browser) with built-in Skills that orchestrate workflows like verification and test creation.
What it includes:
- Browser MCP server that gives your agent a real browser to see, click, type, and navigate like a user.
- Built-in Skills such as /verify (visually confirm UI changes) and /create_e2e_tests (spec-driven E2E test creation), plus /cloud to sync and run tests using cloud capabilities.
- Triage workflow via /triage to reproduce failures, diagnose root causes, and fix YAML tests when the test is broken or report a bug when the app is broken.
Who it is for:
- Engineering teams shipping quickly with AI agents and needing verification to happen during implementation, not after a handoff.
Value it provides:
- A tighter feedback loop: verify the UI change immediately, then convert that verified flow into regression coverage that compounds over time.
YAML E2E test format for durable, reviewable regression coverage
Many test suites rot because they are coupled to DOM details instead of user intent. Shiplight’s YAML test format is built to keep tests readable in code review while staying resilient as the UI changes.
What it includes:
- Intent-driven YAML steps that read like user stories, where each step expresses the user goal (intent) rather than brittle selectors.
- Locator caching with self-healing behavior: locators can exist as a performance optimization, but when the UI shifts, Shiplight can re-derive the action from intent instead of requiring manual repairs.
- Compatibility with existing Playwright workflows, including incremental adoption alongside current Playwright configuration.
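The locator caching and self-healing idea described above can be sketched in a few lines. This is a toy illustration, not Shiplight's implementation: the `Step` type, `resolve_locator`, and `derive_by_label` are hypothetical names, the "page" is modelled as a plain dictionary from selector to visible label, and the label-matching function merely stands in for whatever intent resolution the product actually performs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical model: a page maps each selector to the element's visible label.
Page = Dict[str, str]


@dataclass
class Step:
    intent: str                           # the user goal, e.g. "Click the Sign in button"
    cached_locator: Optional[str] = None  # optimization only; allowed to go stale


def resolve_locator(step: Step, page: Page,
                    derive: Callable[[str, Page], str]) -> str:
    """Prefer the cached locator; re-derive from intent when it no longer matches."""
    if step.cached_locator is not None and step.cached_locator in page:
        return step.cached_locator        # fast path: cache is still valid
    selector = derive(step.intent, page)  # slow path: fall back to the intent
    step.cached_locator = selector        # refresh the cache for the next run
    return selector


def derive_by_label(intent: str, page: Page) -> str:
    """Toy stand-in for intent resolution: match an element label inside the intent."""
    for selector, label in page.items():
        if label.lower() in intent.lower():
            return selector
    raise LookupError(f"no element matches intent: {intent!r}")


step = Step(intent="Click the Sign in button", cached_locator="#login-btn")
before_redesign = {"#login-btn": "Sign in"}
after_redesign = {"button.auth-submit": "Sign in"}

first = resolve_locator(step, before_redesign, derive_by_label)   # served from cache
healed = resolve_locator(step, after_redesign, derive_by_label)   # re-derived from intent
```

In this sketch the test only fails when no element satisfies the intent at all, which is the distinction the format draws between a behavioral change and superficial UI drift.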
Who it is for:
- Teams that want tests in version control, reviewable in pull requests, and understandable by developers, QA, and product stakeholders.
Value it provides:
- Better signal and lower maintenance: behavioral changes fail tests, while superficial UI drift does not have to.
AI-enhanced Test Editor for creating and refining tests with control
Shiplight’s Test Editor is where teams operationalize coverage quickly, without forcing every contribution through hand-written scripts.
What it includes:
- Natural language mode and JavaScript code mode, including direct Playwright code editing for advanced scenarios.
- AI Mode vs Fast Mode: Fast Mode uses cached, pre-generated Playwright actions for speed; AI Mode dynamically evaluates descriptions against the current browser state for flexibility.
- Recording that captures real browser interactions and converts them into executable steps, useful for fast capture of complex workflows and bug reproduction.
- Auto-healing behavior during debugging and cloud execution, where Fast Mode failures can retry in AI Mode to recover from UI drift.
Who it is for:
- QA professionals and engineers who want a fast path to coverage, plus precise control when correctness matters.
- PMs and designers who want to contribute to test intent and verification without needing to become automation experts.
Value it provides:
- Higher throughput without sacrificing standards: you can move quickly in natural language, then harden critical paths by converting and refining steps as needed.
Shiplight Cloud for running, organizing, and learning from test results
Once tests become part of how you decide what ships, execution and reporting cannot be an afterthought. Shiplight Cloud centralizes test case storage, execution across environments, and the artifacts teams need to debug with confidence.
What it includes:
- Cloud test case management and execution tools that can create/update test cases from YAML flows, trigger runs across environments, and retrieve step-by-step results with screenshots and runner logs.
- Suites to bundle related test cases for convenience, tracking, and bulk operations.
- Schedules that run tests automatically using cron expressions, with environment selection and reporting on results, pass rates, and performance metrics.
- Results analysis with rich artifacts, including full video recordings and Playwright trace files for interactive debugging (network activity, DOM snapshots, console logs).
- AI Test Summary that analyzes failed test results using steps, errors, and screenshots, then provides root-cause guidance and recommendations, cached after the first view.
- Webhooks to send test run outcomes to your systems, including options like “Failed,” “Pass→Fail,” and “Fail→Pass,” plus signature verification guidance.
- Hooks (before/after templates) to standardize setup and teardown, reduce duplication, and keep changes centralized across many tests.
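For the webhook item above, a receiver typically does two things: check that the payload really came from the sender, and decide whether the run matches a configured trigger. The sketch below shows one common way to do both. It assumes an HMAC-SHA256 hex signature over the raw request body, which is a widespread convention but not confirmed as Shiplight's scheme; the function names and the trigger semantics are likewise assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Optional, Set


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare signatures in constant time to avoid leaking timing information."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def matching_triggers(previous_passed: Optional[bool],
                      current_passed: bool) -> Set[str]:
    """Map a run outcome to the trigger options named in the text.

    Assumed semantics: "Failed" fires on every failing run, while the two
    transition triggers fire only when the outcome differs from the prior run.
    previous_passed is None when there is no earlier run to compare against.
    """
    triggers: Set[str] = set()
    if not current_passed:
        triggers.add("Failed")
        if previous_passed is True:
            triggers.add("Pass→Fail")
    elif previous_passed is False:
        triggers.add("Fail→Pass")
    return triggers


secret = b"example-shared-secret"          # hypothetical value
body = b'{"run_id": "example", "status": "failed"}'
signature = sign(secret, body)             # what the sender would attach

is_authentic = verify_signature(secret, body, signature)
is_tampered = verify_signature(secret, body + b" ", signature)
```

Verifying against the raw bytes, before any JSON parsing, matters because re-serializing the payload can change whitespace or key order and invalidate an otherwise correct signature.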
Who it is for:
- Teams that need consistent execution, shared visibility, and fast triage across multiple environments and stakeholders.
Value it provides:
- Fewer blind spots: failures come with the context needed to act, not just a red status light.
Desktop App and VS Code Extension for local-first debugging and contribution
When you want speed, nothing beats local debugging. Shiplight supports local-first workflows without cutting you off from cloud collaboration.
What it includes:
- Shiplight Desktop, a native macOS app that loads the Shiplight web UI while running the browser sandbox and AI agent worker locally for fast debugging. It also includes a bundled MCP server so IDEs can connect without installing the MCP npm package separately.
- System requirements for Desktop: macOS on Apple Silicon (M1 or later), a Shiplight account, and a Google or Anthropic API key for the AI agent.
- VS Code Extension that lets you create, run, and debug *.test.yaml files inside VS Code with an interactive visual debugger, including step-through execution and inline edits.
Who it is for:
- Developers and QA who want the shortest possible loop from “something failed” to “I understand exactly why.”
Value it provides:
- Less context switching and faster stabilization of tests, especially in high-change UI areas.
Enterprise security and operational control
For high-growth and enterprise organizations, tooling must meet security and reliability expectations before it can become a true release gate.
What it includes:
- SOC 2 Type II certification, encryption in transit and at rest, role-based access control, and immutable audit logs.
- Scalable infrastructure with a 99.99% uptime SLA and a described high-availability posture across regions.
- Integration points across CI/CD and collaboration tooling, including GitHub Actions and additional CI options listed for enterprise customers.
Who it is for:
- Teams that need QA automation to be auditable, reliable, and supportable as a platform capability, not a best-effort script collection.
Value it provides:
- Confidence that your verification system can scale with your organization and still be trusted when it matters most.
Bringing it together: choosing the right entry point
Most teams start in one of two places:
- If your workflow is agent-driven, start with the Shiplight Plugin and build coverage as a byproduct of verification.
- If your team wants durable tests that stay readable and resilient, standardize on YAML intent tests, then use the Test Editor, Cloud, and CI integrations to operationalize execution and gating.
Shiplight’s services are designed to connect cleanly across the lifecycle: verify as you build, promote verified behavior into regression coverage, and run it with the artifacts and governance required to ship with confidence.