
Test Authoring Methods Compared: 5 Ways Automated Tests Are Written in 2026

Shiplight AI Team

Updated on April 20, 2026

[Illustration: the five test authoring methods side by side: code-first script, record-and-playback, plain English, AI-generated from spec, and intent-based YAML]

Test authoring is how automated tests get created — the process of translating what a product should do into executable checks that run in CI. In 2026, five methods coexist, each with distinct tradeoffs in speed, readability, maintenance, and who on the team can participate.

---

A test framework like Playwright or Selenium is only half the story. The other half is authoring — how you get the tests into existence in the first place. In 2026, five authoring methods dominate:

  1. Code-first (Playwright, Selenium, Cypress scripts)
  2. Record-and-playback
  3. Plain English / NLP test steps
  4. AI-generated tests from specs or UI exploration
  5. Intent-based YAML

None of these is universally best. The right method depends on who writes the tests, how often the product changes, and whether AI coding agents are part of your development workflow. This guide covers all five with concrete examples and a decision framework.

Method 1: Code-First Test Authoring

Code-first authoring means engineers write tests directly in a programming language — TypeScript, JavaScript, Python, Groovy — using a test framework's API to interact with the browser.

This is the original model. Playwright, Selenium, Cypress, and WebDriver all target this approach.

```ts
import { test, expect } from '@playwright/test';

test('user can complete checkout', async ({ page }) => {
  await page.goto('https://app.example.com');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('password123');
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.getByRole('link', { name: 'Add to cart' }).click();
  await page.getByRole('button', { name: 'Checkout' }).click();
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```

Strengths: Maximum control over browser behavior, deterministic execution, full access to framework features, works well in CI.

Weaknesses: Engineers-only — product managers, designers, and QA analysts without coding skills cannot contribute. Tests break frequently when locators change, creating high maintenance cost. Authoring a new test from scratch takes hours.

Best for: Engineering-heavy teams with dedicated test infrastructure and the headcount to maintain it.

Method 2: Record-and-Playback Test Authoring

Record-and-playback test authoring means the tool observes your manual browser interactions and generates a runnable test script from them. You click through the flow, the tool captures each action, and the output is an executable test.

This approach is ~20 years old — Selenium IDE pioneered it, and most modern no-code tools (Katalon, some modes of ACCELQ) still use variants of it. AI-augmented record-and-playback adds smart locator generation and auto-healing.

Typical flow:

  1. Click "Record" in the tool
  2. Perform the test manually — log in, click buttons, fill forms
  3. Tool generates a test with steps mirroring your actions
  4. Replay to verify
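The brittleness problem is easiest to see in what a recorder emits. Here is a minimal sketch of how captured browser events become a script, assuming a simplified event log and selector heuristic; the names and fallback rules are illustrative, not any specific tool's implementation:

```typescript
// Illustrative shape of an event captured during recording.
interface RecordedEvent {
  type: 'click' | 'input';
  tagName: string;
  id?: string;
  cssPath?: string; // structural selector captured at record time
  value?: string;
}

// Prefer a stable id; otherwise fall back to the structural CSS path.
// The fallback is what makes recorded tests brittle: it encodes the DOM
// layout at record time, not the user's intent.
function selectorFor(e: RecordedEvent): string {
  return e.id ? `#${e.id}` : e.cssPath ?? e.tagName;
}

function generateScript(events: RecordedEvent[]): string[] {
  return events.map((e) =>
    e.type === 'click'
      ? `await page.click('${selectorFor(e)}');`
      : `await page.fill('${selectorFor(e)}', '${e.value}');`
  );
}

const script = generateScript([
  { type: 'input', tagName: 'input', id: 'email', value: 'test@example.com' },
  { type: 'click', tagName: 'button', cssPath: 'div.nav > form > button:nth-child(3)' },
]);
```

The second generated step pins the test to `div.nav > form > button:nth-child(3)`, so any layout refactor breaks it even though the button still exists with the same purpose.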

Strengths: Fast initial authoring. Non-engineers can produce test drafts. No coding required.

Weaknesses: Generated tests are often brittle — recorded click coordinates or CSS selectors break when the UI changes. Tests drift from user intent because what was recorded was a specific execution, not a specification of behavior. Difficult to maintain at scale.

Best for: Quick initial coverage, documenting existing workflows, or onboarding non-engineers into test creation.

Codeless E2E testing covers how modern record-and-playback has evolved.

Method 3: Plain English / NLP Test Authoring

Plain English test authoring means writing tests as natural-language sentences that the tool interprets and translates into browser actions at runtime.

No code, no YAML, no selectors. Just prose.

```
Go to https://app.example.com/login
Enter "admin@example.com" into "Email"
Enter "password123" into "Password"
Click "Sign In"
Check that the page contains "Welcome, Admin"
```

testRigor pioneered this model. Virtuoso QA, Functionize, and ACCELQ offer similar authoring experiences in parts of their platforms.

Strengths: Anyone who can write a bulleted list can create a test. Highest accessibility for non-technical team members — business analysts, product managers, support staff. Tests read like documentation.

Weaknesses: Ambiguity — "Click Sign In" assumes the tool can resolve which element is "Sign In" when there might be multiple. Complex flows with dynamic content, custom components, or non-standard UI patterns challenge natural-language resolution. Debugging unclear tests is harder than debugging code.
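The runtime-interpretation step is where both the accessibility and the ambiguity come from. A toy sketch of how a runner might map each sentence to a browser action (the patterns and action names are invented for illustration, not testRigor's or any vendor's implementation):

```typescript
// Each plain-English step is parsed into a structured action at runtime.
type Action =
  | { kind: 'goto'; url: string }
  | { kind: 'fill'; label: string; value: string }
  | { kind: 'click'; label: string }
  | { kind: 'assertText'; text: string };

// Pattern table: sentence shape -> action builder.
const patterns: Array<[RegExp, (m: RegExpMatchArray) => Action]> = [
  [/^Go to (\S+)$/, (m) => ({ kind: 'goto', url: m[1] })],
  [/^Enter "(.+)" into "(.+)"$/, (m) => ({ kind: 'fill', label: m[2], value: m[1] })],
  [/^Click "(.+)"$/, (m) => ({ kind: 'click', label: m[1] })],
  [/^Check that the page contains "(.+)"$/, (m) => ({ kind: 'assertText', text: m[1] })],
];

function parseStep(step: string): Action {
  for (const [re, build] of patterns) {
    const m = step.match(re);
    if (m) return build(m);
  }
  // A sentence that matches no pattern, or a label that matches several
  // elements on the page, is exactly the ambiguity described above.
  throw new Error(`Unrecognized or ambiguous step: "${step}"`);
}

const action = parseStep('Enter "admin@example.com" into "Email"');
// action: { kind: 'fill', label: 'Email', value: 'admin@example.com' }
```

Real tools use far richer language models than a pattern table, but the failure mode is the same: resolution happens at run time, so an unresolvable sentence fails during execution rather than at authoring time.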

Best for: Non-technical QA teams, business-rule-driven testing, environments where tests need to be readable by non-engineers.

See no-code testing for non-technical teams for a deeper guide.

Method 4: AI-Generated Tests from Specs or UI Exploration

AI-generated test authoring means the AI produces test cases automatically from inputs like product specifications, user stories, or autonomous application exploration — with no manual step-by-step authoring.

Three input types are common:

From specifications

You feed the AI a user story, acceptance criteria, or PRD section. It generates a test covering the described behavior.

> User story: "As a signed-in user, I can add items to my cart and complete checkout with a saved payment method."
>
> → AI produces a 10-step test covering login, navigation, add-to-cart, checkout form, payment confirmation.
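The pipeline shape behind spec-to-test generation can be sketched in two parts: build a prompt from the story and its acceptance criteria, then parse the model's structured reply into steps. The prompt wording and JSON schema below are assumptions for illustration; no specific vendor's API is shown, and the model call itself is omitted:

```typescript
// Hypothetical shape of one generated test step.
interface GeneratedStep { action: string; target?: string }

// Build a prompt asking for structured output the runner can consume.
function buildPrompt(userStory: string, acceptanceCriteria: string[]): string {
  return [
    'Generate an end-to-end test as a JSON array of steps',
    '({"action": ..., "target": ...}) covering this user story:',
    userStory,
    'Acceptance criteria:',
    ...acceptanceCriteria.map((c) => `- ${c}`),
  ].join('\n');
}

// Parsing is where spec clarity matters: a vague story yields vague steps,
// and malformed model output must be rejected rather than executed.
function parseSteps(modelReply: string): GeneratedStep[] {
  const steps = JSON.parse(modelReply);
  if (!Array.isArray(steps)) throw new Error('Expected a JSON array of steps');
  return steps;
}

const prompt = buildPrompt(
  'As a signed-in user, I can add items to my cart and complete checkout.',
  ['Cart total updates when an item is added', 'Saved payment method is preselected']
);
```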

From UI exploration

The AI navigates your running application, discovers flows, and generates tests for what it finds. Mabl and some Functionize modes work this way. No input required beyond a URL.

From session recordings

The AI observes real user traffic and generates tests reflecting actual usage patterns. Checksum is the primary example.

Strengths: Scales — coverage grows without human authoring effort. Captures flows that engineers wouldn't think to write tests for. Integrates naturally with AI coding agent workflows.

Weaknesses: Generated tests may include redundant or low-value cases. Spec-to-test accuracy depends on spec clarity. Autonomous exploration can miss business-critical edge cases that aren't obvious from the UI.

Best for: Teams with limited QA headcount, SaaS products with established user bases, or engineering organizations that want coverage to scale with development velocity.

See AI testing tools that automatically generate test cases for a tool-by-tool comparison.

Method 5: Intent-Based YAML Test Authoring

Intent-based YAML test authoring means writing tests as structured YAML files where each step describes user intent in natural language, with AI resolving intent to browser actions at runtime.

This is the approach Shiplight is built around. It combines the readability of plain English with the structure and version-control friendliness of code.

```yaml
goal: Verify user can complete checkout
steps:
  - intent: Log in as a test user
  - intent: Navigate to the product catalog
  - intent: Add the first product to the cart
  - intent: Proceed to checkout
  - intent: Enter shipping address
  - intent: Complete payment with test card
  - VERIFY: order confirmation page shows order number
```

Tests are readable by anyone who can follow a bulleted list, yet structured enough to live in git, appear in pull request diffs, and run in CI. When the UI changes, Shiplight resolves each intent step from scratch rather than failing on a stale selector — the intent-cache-heal pattern.
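The cache-then-heal idea can be sketched as: try a previously resolved selector first for speed, and if it no longer matches the page, re-resolve the intent from scratch and refresh the cache. This is a hedged sketch of the pattern, not Shiplight's actual implementation; `resolveIntent` stands in for the AI resolution step and `selectorExists` for a live-page probe:

```typescript
type Resolver = (intent: string) => string;

class IntentCache {
  private cache = new Map<string, string>();

  constructor(
    private resolveIntent: Resolver,                   // stand-in for AI resolution
    private selectorExists: (sel: string) => boolean   // stand-in for a live-page probe
  ) {}

  selectorFor(intent: string): string {
    const cached = this.cache.get(intent);
    // Fast path: reuse the cached selector while it still matches the page.
    if (cached && this.selectorExists(cached)) return cached;
    // Heal path: the UI changed, so resolve the intent again and re-cache.
    const fresh = this.resolveIntent(intent);
    this.cache.set(intent, fresh);
    return fresh;
  }
}

// Simulate a UI change: the checkout button's selector flips between runs.
let pageSelectors = new Set(['#checkout-v1']);
let aiResolutions = 0;
const resolver = new IntentCache(
  () => { aiResolutions++; return pageSelectors.has('#checkout-v1') ? '#checkout-v1' : '#checkout-v2'; },
  (sel) => pageSelectors.has(sel)
);

resolver.selectorFor('Proceed to checkout'); // resolves and caches: 1 AI call
resolver.selectorFor('Proceed to checkout'); // cache hit: still 1 AI call
pageSelectors = new Set(['#checkout-v2']);   // the UI changed
resolver.selectorFor('Proceed to checkout'); // stale cache, heal: 2 AI calls
```

A conventional script would fail at the stale selector; here the stale selector only triggers a re-resolution, which is why intent-based tests tend to survive UI churn.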

Intent-based YAML is the primary authoring model in Shiplight Plugin, which exposes /create_e2e_tests as an MCP tool so Claude Code, Cursor, Codex, and GitHub Copilot can generate intent-based tests during development.

Strengths: Readable like plain English, structured like code. Survives UI changes via intent-based self-healing. Version-controlled, reviewable in PRs, portable across environments. Can be generated by AI coding agents or written by non-engineers.

Weaknesses: Requires basic YAML familiarity (less than a scripting language, more than plain prose). Newer format with smaller ecosystem than Playwright or Selenium scripts.

Best for: Teams using AI coding agents, mixed-skill engineering organizations, and any team that wants tests as a first-class artifact in their git workflow.

Test Authoring Methods: Side-by-Side Comparison

| Method | Who Authors | Format | Readability | Maintenance | AI Agent Support |
| --- | --- | --- | --- | --- | --- |
| Code-first | Engineers | Code (TS/JS/Python) | Low (non-engineers) | Manual | Limited |
| Record-and-playback | Anyone | Recorded script | Medium | Fragile | No |
| Plain English / NLP | Anyone | Natural language | High | Self-healing typical | Limited |
| AI-generated | AI | Varies (code or proprietary) | Varies | Self-healing typical | Partial |
| Intent-based YAML | Anyone or AI | YAML with intent steps | High | Intent-based self-healing | Native (MCP) |

How to Choose a Test Authoring Method

By team profile

| Team profile | Recommended method |
| --- | --- |
| All engineers, need max control | Code-first (Playwright) |
| QA team with no coding | Plain English / NLP or intent-based YAML |
| Engineers + AI coding agents | Intent-based YAML (Shiplight) |
| Want coverage without authoring | AI-generated (exploration or session-based) |
| Need to onboard non-engineers gradually | Record-and-playback, graduate to YAML |

By application change velocity

  • Stable UI, rare changes: Code-first or record-and-playback both work
  • High change velocity: Self-healing methods (plain English, intent-based YAML, AI-generated)
  • AI coding agents driving changes: Intent-based YAML with MCP integration

By review requirements

  • Tests reviewed by product managers: Plain English or intent-based YAML
  • Tests reviewed by engineers only: Any method works
  • Regulated industries (audit trail required): Intent-based YAML (git-native, version-controlled, human-readable)

FAQ

What is test authoring?

Test authoring is the process of creating automated tests — translating what a product should do into executable checks that run in a test framework. It is distinct from test execution (which runs the tests) and test maintenance (which fixes them when they break).

Is record-and-playback still used in 2026?

Yes, but it has evolved. Modern AI-augmented record-and-playback tools add smart locator generation and self-healing to reduce the brittleness that made the original approach unreliable. It remains useful for quick initial coverage and onboarding non-engineers, but has been displaced for production suites by intent-based and AI-generated methods.

What is the difference between plain English test authoring and intent-based YAML?

Plain English tests are unstructured prose — the tool parses each sentence and infers actions. Intent-based YAML is structured: each step is a YAML key-value pair with a clear intent field, making it version-control-friendly and unambiguous to parse. Intent-based YAML is a middle ground between the flexibility of plain English and the rigor of code.

Can AI coding agents generate tests directly?

Yes, with the right authoring format and integration. Shiplight Plugin exposes test generation as an MCP tool that Claude Code, Cursor, Codex, and GitHub Copilot can call during development — the coding agent generates intent-based YAML tests as part of the same task it uses to implement a feature.

Should I use multiple authoring methods in one project?

It's common. Many teams use code-first Playwright tests for infrastructure-level flows, intent-based YAML for UI-level E2E, and AI-generated tests for coverage breadth. The key is consistency within each category — don't mix authoring methods for the same type of test.

---

Conclusion

The choice of test authoring method is a higher-leverage decision than most teams realize. It determines who on the team can contribute, how often tests break, and whether your test suite scales with development velocity or against it.

For teams building with AI coding agents, intent-based YAML is the strongest fit — it combines the readability non-engineers need with the structure AI agents can generate, and the self-healing that makes tests survive high-velocity UI changes. See best AI automation tools for software testing for a platform-by-platform comparison across the full AI automation tool category.

Try intent-based YAML testing with Shiplight Plugin — installs into Claude Code, Cursor, Codex, and GitHub Copilot in a few minutes.