From Human-First to Agent-First Testing: What a Year of Building Taught Us
Feng
Updated on March 30, 2026
Shiplight Cloud is a fully managed, cloud-based natural language testing platform designed to multiply human productivity. Teams author tests visually, the platform handles execution, and results are stored and managed in the cloud. It continues to serve teams that need managed test authoring and execution.
By late 2025, the landscape around us had shifted in ways that called for a different product.
In addition to Shiplight Cloud, we built Shiplight Plugins as a new product for developers and automation engineers who work with AI agents. The core principle: AI handles test creation, execution, and maintenance, while the system produces clear evidence at every step for humans to understand and trust.
Here's how this comes together in practice.
Any MCP-compatible coding agent connects to the Shiplight browser MCP server, gaining the ability to open a browser, navigate the app, interact with elements, take screenshots, and observe network activity.
It goes beyond launching a fresh browser: attach to an existing Chrome DevTools URL to test against a running dev environment with real data and authenticated state. A relay server supports remote and headless setups.
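To make the agent-to-server connection concrete: MCP is built on JSON-RPC 2.0, and a client invokes a server tool with a `tools/call` request. The sketch below builds such requests for a short browsing sequence. The tool names (`browser_navigate`, `browser_click`, `browser_screenshot`) and their arguments are hypothetical placeholders, not Shiplight's published tool list; a real agent would discover the available tools through `tools/list`.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Returns a builder that produces sequentially numbered tools/call requests.
function createToolCaller(): (name: string, args: Record<string, unknown>) => ToolCallRequest {
  let nextId = 1;
  return (name, args) => ({
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  });
}

// Hypothetical tool names; the real server advertises its own via tools/list.
const call = createToolCaller();
const session: ToolCallRequest[] = [
  call("browser_navigate", { url: "https://your-app.com/login" }),
  call("browser_click", { locator: "getByRole('button', { name: 'Sign In' })" }),
  call("browser_screenshot", {}),
];

// Print each request as it would be serialized on the wire.
for (const request of session) {
  console.log(JSON.stringify(request));
}
```

Each request carries its own `id`, so the agent can match the server's responses to the actions it issued.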
The AI agent navigates the application as a human would, producing a structured test as output.
We designed Shiplight tests around natural language in YAML format to solve the readability and maintenance problems with AI-generated Playwright scripts:
goal: Verify that a user can log in and create a new project
base_url: https://your-app.com
statements:
  - URL: /login
  - intent: Enter email address
    action: input_text
    locator: "getByPlaceholder('Email')"
    text: "{{TEST_EMAIL}}"
  - intent: Enter the password
    action: input_text
    locator: "getByPlaceholder('Password')"
    text: "{{TEST_PASSWORD}}"
  - intent: Click Sign In
    action: click
    locator: "getByRole('button', { name: 'Sign In' })"
  - VERIFY: The dashboard is visible with a welcome message
  - intent: Click "New Project" in the sidebar
    action: click
    locator: "getByRole('link', { name: 'New Project' })"
  - VERIFY: The project creation form is displayed

Each test describes the flow in human terms. The same person who specified the feature can review the test without understanding test code. Files live in the repo, are reviewed in PRs, and produce clean diffs. Intent-based steps resolve via AI at runtime or use cached locators for deterministic replay. Custom logic (API calls, database queries, setup) embeds inline as JavaScript.
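Two of these ideas, template variables such as `{{TEST_EMAIL}}` and cache-first locator resolution, can be sketched in a few lines. This is a minimal illustration under assumptions, not Shiplight's implementation: the `ActionStep` shape mirrors the YAML fields above, while `renderTemplate`, `resolveLocator`, and the `aiResolve` callback are hypothetical names, the AI lookup is a stub, and storing the cache in the step's `locator` field is a guess based on the sample.

```typescript
// Shape of an action step, mirroring the YAML fields shown above.
interface ActionStep {
  intent: string;
  action: "click" | "input_text";
  locator?: string; // cached locator; assumed absent until the step is resolved once
  text?: string;
}

// Replace {{NAME}} placeholders with values from a variables map.
// Throws on unknown names so a missing secret fails loudly.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value = vars[name];
    if (value === undefined) throw new Error(`Missing variable: ${name}`);
    return value;
  });
}

// Cache-first resolution: reuse a stored locator for deterministic replay,
// otherwise ask the (hypothetical) AI resolver and store its answer on the step.
function resolveLocator(step: ActionStep, aiResolve: (intent: string) => string): string {
  if (step.locator === undefined) {
    step.locator = aiResolve(step.intent);
  }
  return step.locator;
}

// Stub standing in for a model call; counts how often it is invoked.
let aiCalls = 0;
const stubResolver = (intent: string): string => {
  aiCalls += 1;
  return `getByText('${intent}')`;
};

const cached: ActionStep = {
  intent: "Enter email address",
  action: "input_text",
  locator: "getByPlaceholder('Email')",
  text: "{{TEST_EMAIL}}",
};
const uncached: ActionStep = { intent: "Click Sign In", action: "click" };

console.log(resolveLocator(cached, stubResolver));   // uses the cached locator
console.log(resolveLocator(uncached, stubResolver)); // falls back to the stub
console.log(resolveLocator(uncached, stubResolver)); // replays the stored answer
console.log(renderTemplate(cached.text ?? "", { TEST_EMAIL: "qa@example.com" }));
```

The design point is that the intent stays the source of truth while the locator acts as a cache: if a cached locator stops matching after a UI change, a runner could discard it and resolve the intent again.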
The shiplight test command runs tests locally. The shiplight debug command opens an interactive debugger for stepping through a test one statement at a time, inspecting browser state, and editing steps in place.

After a run, Shiplight generates an HTML report. We retained the best of Playwright (video recording, trace data) and addressed what was lacking. Instead of cryptic selectors and programmatic steps, reports show natural language steps paired with screenshots.

On failure, the report shows a screenshot of the actual page state, the expected behavior, and an AI-generated explanation, for example: "Expected a welcome message, but the page displays 'Session Expired'." Anyone on the team can read it without code context.
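As a rough picture of what one failure entry carries, here is a sketch of a record and a plain-text rendering of it. The field names, the `formatFailure` helper, and the screenshot path are assumptions for illustration; Shiplight's actual report schema is not described in this post.

```typescript
// Hypothetical shape of one failed verification in a report.
interface FailureEvidence {
  step: string;           // natural language step that failed
  expected: string;       // expected behavior
  explanation: string;    // AI-generated explanation of the mismatch
  screenshotPath: string; // screenshot of the actual page state
}

// Render the evidence as a short, code-free summary.
function formatFailure(f: FailureEvidence): string {
  return [
    `FAILED: ${f.step}`,
    `  Expected:    ${f.expected}`,
    `  Explanation: ${f.explanation}`,
    `  Screenshot:  ${f.screenshotPath}`,
  ].join("\n");
}

const example: FailureEvidence = {
  step: "VERIFY: The dashboard is visible with a welcome message",
  expected: "A welcome message on the dashboard",
  explanation: "Expected a welcome message, but the page displays 'Session Expired'.",
  screenshotPath: "report/screenshots/step-5.png", // illustrative path only
};

console.log(formatFailure(example));
```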
Tests are YAML files in the repo, and the CLI runs anywhere Node.js runs. GitHub Actions, GitLab CI, and CircleCI require minimal configuration: add a step and point it at the test directory.
Shiplight Cloud features (scheduled runs, team dashboards, historical trends, hosted reports) are available when needed. But the core loop works entirely with the CLI and existing CI. No lock-in.
A year ago we built a platform to help humans test more productively. Now we are building for a world where one person, operating AI, designs, builds, and verifies a feature in a single session.
The role of testing is not disappearing; it is shifting. The tooling needs to reflect that: verification integrated into the development flow, evidence clear enough to trust without redoing the work, and tests that maintain themselves as the product evolves.
We are building Shiplight to be that layer.