
Test Authoring Methods Compared: 5 Ways Automated Tests Are Written in 2026

Shiplight AI Team

Updated on April 20, 2026

[Illustration: the five test authoring methods side by side: code-first script, record-and-playback, plain English, AI-generated from spec, and intent-based YAML]

Test authoring is how automated tests get created — the process of translating what a product should do into executable checks that run in CI. In 2026, five methods coexist, each with distinct tradeoffs in speed, readability, maintenance, and who on the team can participate.

---

A test framework like Playwright or Selenium is only half the story. The other half is authoring — how you get the tests into existence in the first place. In 2026, five authoring methods dominate:

  1. Code-first (Playwright, Selenium, Cypress scripts)
  2. Record-and-playback
  3. Plain English / NLP test steps
  4. AI-generated tests from specs or UI exploration
  5. Intent-based YAML

None of these is universally best. The right method depends on who writes the tests, how often the product changes, and whether AI coding agents are part of your development workflow. This guide covers all five with concrete examples and a decision framework.

Method 1: Code-First Test Authoring

Code-first authoring means engineers write tests directly in a programming language — TypeScript, JavaScript, Python, Groovy — using a test framework's API to interact with the browser.

This is the original model. Playwright, Selenium, Cypress, and WebDriver all target this approach.

```ts
import { test, expect } from '@playwright/test';

test('user can complete checkout', async ({ page }) => {
  await page.goto('https://app.example.com');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('password123');
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.getByRole('link', { name: 'Add to cart' }).click();
  await page.getByRole('button', { name: 'Checkout' }).click();
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```

Strengths: Maximum control over browser behavior, deterministic execution, full access to framework features, works well in CI.

Weaknesses: Engineers-only — product managers, designers, and QA analysts without coding skills cannot contribute. Tests break frequently when locators change, creating high maintenance cost. Authoring a new test from scratch takes hours.

Best for: Engineering-heavy teams with dedicated test infrastructure and the headcount to maintain it.

Method 2: Record-and-Playback Test Authoring

Record-and-playback test authoring means the tool observes your manual browser interactions and generates a runnable test script from them. You click through the flow, the tool captures each action, and the output is an executable test.

This approach is ~20 years old — Selenium IDE pioneered it, and most modern no-code tools (Katalon, some modes of ACCELQ) still use variants of it. AI-augmented record-and-playback adds smart locator generation and auto-healing.

Typical flow:

  1. Click "Record" in the tool
  2. Perform the test manually — log in, click buttons, fill forms
  3. Tool generates a test with steps mirroring your actions
  4. Replay to verify
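The brittleness problem is easiest to see in what a recorder emits. Here is a minimal sketch of how captured browser events become a script, assuming a simplified event log and selector heuristic; the names and fallback rules are illustrative, not any specific tool's implementation:

```typescript
// Illustrative shape of an event captured during recording.
interface RecordedEvent {
  type: 'click' | 'input';
  tagName: string;
  id?: string;
  cssPath?: string; // structural selector captured at record time
  value?: string;
}

// Prefer a stable id; otherwise fall back to the structural CSS path.
// The fallback is what makes recorded tests brittle: it encodes the DOM
// layout at record time, not the user's intent.
function selectorFor(e: RecordedEvent): string {
  return e.id ? `#${e.id}` : e.cssPath ?? e.tagName;
}

function generateScript(events: RecordedEvent[]): string[] {
  return events.map((e) =>
    e.type === 'click'
      ? `await page.click('${selectorFor(e)}');`
      : `await page.fill('${selectorFor(e)}', '${e.value}');`
  );
}

const script = generateScript([
  { type: 'input', tagName: 'input', id: 'email', value: 'test@example.com' },
  { type: 'click', tagName: 'button', cssPath: 'div.nav > form > button:nth-child(3)' },
]);
```

The second generated step pins the test to `div.nav > form > button:nth-child(3)`, so any layout refactor breaks it even though the button still exists with the same purpose.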

Strengths: Fast initial authoring. Non-engineers can produce test drafts. No coding required.

Weaknesses: Generated tests are often brittle — recorded click coordinates or CSS selectors break when the UI changes. Tests drift from user intent because what was recorded was a specific execution, not a specification of behavior. Difficult to maintain at scale.

Best for: Quick initial coverage, documenting existing workflows, or onboarding non-engineers into test creation.

Codeless E2E testing covers how modern record-and-playback has evolved.

Method 3: Plain English / NLP Test Authoring

Plain English test authoring means writing tests as natural-language sentences that the tool interprets and translates into browser actions at runtime.

No code, no YAML, no selectors. Just prose.

```
Go to https://app.example.com/login
Enter "admin@example.com" into "Email"
Enter "password123" into "Password"
Click "Sign In"
Check that the page contains "Welcome, Admin"
```

testRigor pioneered this model. Virtuoso QA, Functionize, and ACCELQ offer similar authoring experiences in parts of their platforms.

Strengths: Anyone who can write a bulleted list can create a test. Highest accessibility for non-technical team members — business analysts, product managers, support staff. Tests read like documentation.

Weaknesses: Ambiguity — "Click Sign In" assumes the tool can resolve which element is "Sign In" when there might be multiple. Complex flows with dynamic content, custom components, or non-standard UI patterns challenge natural-language resolution. Debugging unclear tests is harder than debugging code.
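The runtime-interpretation step is where both the accessibility and the ambiguity come from. A toy sketch of how a runner might map each sentence to a browser action (the patterns and action names are invented for illustration, not testRigor's or any vendor's implementation):

```typescript
// Each plain-English step is parsed into a structured action at runtime.
type Action =
  | { kind: 'goto'; url: string }
  | { kind: 'fill'; label: string; value: string }
  | { kind: 'click'; label: string }
  | { kind: 'assertText'; text: string };

// Pattern table: sentence shape -> action builder.
const patterns: Array<[RegExp, (m: RegExpMatchArray) => Action]> = [
  [/^Go to (\S+)$/, (m) => ({ kind: 'goto', url: m[1] })],
  [/^Enter "(.+)" into "(.+)"$/, (m) => ({ kind: 'fill', label: m[2], value: m[1] })],
  [/^Click "(.+)"$/, (m) => ({ kind: 'click', label: m[1] })],
  [/^Check that the page contains "(.+)"$/, (m) => ({ kind: 'assertText', text: m[1] })],
];

function parseStep(step: string): Action {
  for (const [re, build] of patterns) {
    const m = step.match(re);
    if (m) return build(m);
  }
  // A sentence that matches no pattern, or a label that matches several
  // elements on the page, is exactly the ambiguity described above.
  throw new Error(`Unrecognized or ambiguous step: "${step}"`);
}

const action = parseStep('Enter "admin@example.com" into "Email"');
// action: { kind: 'fill', label: 'Email', value: 'admin@example.com' }
```

Real tools use far richer language models than a pattern table, but the failure mode is the same: resolution happens at run time, so an unresolvable sentence fails during execution rather than at authoring time.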

Best for: Non-technical QA teams, business-rule-driven testing, environments where tests need to be readable by non-engineers.

See no-code testing for non-technical teams for a deeper guide.

Method 4: AI-Generated Tests from Specs or UI Exploration

AI-generated test authoring means the AI produces test cases automatically from inputs like product specifications, user stories, or autonomous application exploration — with no manual step-by-step authoring.

Three input types are common:

From specifications

You feed the AI a user story, acceptance criteria, or PRD section. It generates a test covering the described behavior.

> User story: "As a signed-in user, I can add items to my cart and complete checkout with a saved payment method."
>
> → AI produces a 10-step test covering login, navigation, add-to-cart, checkout form, payment confirmation.
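The pipeline shape behind spec-to-test generation can be sketched in two parts: build a prompt from the story and its acceptance criteria, then parse the model's structured reply into steps. The prompt wording and JSON schema below are assumptions for illustration; no specific vendor's API is shown, and the model call itself is omitted:

```typescript
// Hypothetical shape of one generated test step.
interface GeneratedStep { action: string; target?: string }

// Build a prompt asking for structured output the runner can consume.
function buildPrompt(userStory: string, acceptanceCriteria: string[]): string {
  return [
    'Generate an end-to-end test as a JSON array of steps',
    '({"action": ..., "target": ...}) covering this user story:',
    userStory,
    'Acceptance criteria:',
    ...acceptanceCriteria.map((c) => `- ${c}`),
  ].join('\n');
}

// Parsing is where spec clarity matters: a vague story yields vague steps,
// and malformed model output must be rejected rather than executed.
function parseSteps(modelReply: string): GeneratedStep[] {
  const steps = JSON.parse(modelReply);
  if (!Array.isArray(steps)) throw new Error('Expected a JSON array of steps');
  return steps;
}

const prompt = buildPrompt(
  'As a signed-in user, I can add items to my cart and complete checkout.',
  ['Cart total updates when an item is added', 'Saved payment method is preselected']
);
```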

From UI exploration

The AI navigates your running application, discovers flows, and generates tests for what it finds. Mabl and some Functionize modes work this way. No input required beyond a URL.

From session recordings

The AI observes real user traffic and generates tests reflecting actual usage patterns. Checksum is the primary example.

Strengths: Scales — coverage grows without human authoring effort. Captures flows that engineers wouldn't think to write tests for. Integrates naturally with AI coding agent workflows.

Weaknesses: Generated tests may include redundant or low-value cases. Spec-to-test accuracy depends on spec clarity. Autonomous exploration can miss business-critical edge cases that aren't obvious from the UI.

Best for: Teams with limited QA headcount, SaaS products with established user bases, or engineering organizations that want coverage to scale with development velocity.

See AI testing tools that automatically generate test cases for a tool-by-tool comparison.

Method 5: Intent-Based YAML Test Authoring

Intent-based YAML test authoring means writing tests as structured YAML files where each step describes user intent in natural language, with AI resolving intent to browser actions at runtime.

This is the approach Shiplight is built around. It combines the readability of plain English with the structure and version-control friendliness of code.

```yaml
goal: Verify user can complete checkout
steps:
  - intent: Log in as a test user
  - intent: Navigate to the product catalog
  - intent: Add the first product to the cart
  - intent: Proceed to checkout
  - intent: Enter shipping address
  - intent: Complete payment with test card
  - VERIFY: order confirmation page shows order number
```

Tests are readable by anyone who can follow a bulleted list, yet structured enough to live in git, appear in pull request diffs, and run in CI. When the UI changes, Shiplight resolves each intent step from scratch rather than failing on a stale selector — the intent-cache-heal pattern.
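The cache-then-heal idea can be sketched as: try a previously resolved selector first for speed, and if it no longer matches the page, re-resolve the intent from scratch and refresh the cache. This is a hedged sketch of the pattern, not Shiplight's actual implementation; `resolveIntent` stands in for the AI resolution step and `selectorExists` for a live-page probe:

```typescript
type Resolver = (intent: string) => string;

class IntentCache {
  private cache = new Map<string, string>();

  constructor(
    private resolveIntent: Resolver,                   // stand-in for AI resolution
    private selectorExists: (sel: string) => boolean   // stand-in for a live-page probe
  ) {}

  selectorFor(intent: string): string {
    const cached = this.cache.get(intent);
    // Fast path: reuse the cached selector while it still matches the page.
    if (cached && this.selectorExists(cached)) return cached;
    // Heal path: the UI changed, so resolve the intent again and re-cache.
    const fresh = this.resolveIntent(intent);
    this.cache.set(intent, fresh);
    return fresh;
  }
}

// Simulate a UI change: the checkout button's selector flips between runs.
let pageSelectors = new Set(['#checkout-v1']);
let aiResolutions = 0;
const resolver = new IntentCache(
  () => { aiResolutions++; return pageSelectors.has('#checkout-v1') ? '#checkout-v1' : '#checkout-v2'; },
  (sel) => pageSelectors.has(sel)
);

resolver.selectorFor('Proceed to checkout'); // resolves and caches: 1 AI call
resolver.selectorFor('Proceed to checkout'); // cache hit: still 1 AI call
pageSelectors = new Set(['#checkout-v2']);   // the UI changed
resolver.selectorFor('Proceed to checkout'); // stale cache, heal: 2 AI calls
```

A conventional script would fail at the stale selector; here the stale selector only triggers a re-resolution, which is why intent-based tests tend to survive UI churn.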

Intent-based YAML is the primary authoring model in Shiplight Plugin, which exposes /create_e2e_tests as an MCP tool so Claude Code, Cursor, Codex, and GitHub Copilot can generate intent-based tests during development.

Strengths: Readable like plain English, structured like code. Survives UI changes via intent-based self-healing. Version-controlled, reviewable in PRs, portable across environments. Can be generated by AI coding agents or written by non-engineers.

Weaknesses: Requires basic YAML familiarity (less than a scripting language, more than plain prose). Newer format with smaller ecosystem than Playwright or Selenium scripts.

Best for: Teams using AI coding agents, mixed-skill engineering organizations, and any team that wants tests as a first-class artifact in their git workflow.

Test Authoring Methods: Side-by-Side Comparison

| Method | Who Authors | Format | Readability | Maintenance | AI Agent Support |
| --- | --- | --- | --- | --- | --- |
| Code-first | Engineers | Code (TS/JS/Python) | Low (non-engineers) | Manual | Limited |
| Record-and-playback | Anyone | Recorded script | Medium | Fragile | No |
| Plain English / NLP | Anyone | Natural language | High | Self-healing typical | Limited |
| AI-generated | AI | Varies (code or proprietary) | Varies | Self-healing typical | Partial |
| Intent-based YAML | Anyone or AI | YAML with intent steps | High | Intent-based self-healing | Native (MCP) |

How to Choose a Test Authoring Method

By team profile

| Team profile | Recommended method |
| --- | --- |
| All engineers, need max control | Code-first (Playwright) |
| QA team with no coding | Plain English / NLP or intent-based YAML |
| Engineers + AI coding agents | Intent-based YAML (Shiplight) |
| Want coverage without authoring | AI-generated (exploration or session-based) |
| Need to onboard non-engineers gradually | Record-and-playback, graduate to YAML |

By application change velocity

  • Stable UI, rare changes: Code-first or record-and-playback both work
  • High change velocity: Self-healing methods (plain English, intent-based YAML, AI-generated)
  • AI coding agents driving changes: Intent-based YAML with MCP integration

By review requirements

  • Tests reviewed by product managers: Plain English or intent-based YAML
  • Tests reviewed by engineers only: Any method works
  • Regulated industries (audit trail required): Intent-based YAML (git-native, version-controlled, human-readable)

FAQ

What is test authoring?

Test authoring is the process of creating automated tests — translating what a product should do into executable checks that run in a test framework. It is distinct from test execution (which runs the tests) and test maintenance (which fixes them when they break).

Is record-and-playback still used in 2026?

Yes, but it has evolved. Modern AI-augmented record-and-playback tools add smart locator generation and self-healing to reduce the brittleness that made the original approach unreliable. It remains useful for quick initial coverage and onboarding non-engineers, but has been displaced for production suites by intent-based and AI-generated methods.

What is the difference between plain English test authoring and intent-based YAML?

Plain English tests are unstructured prose — the tool parses each sentence and infers actions. Intent-based YAML is structured: each step is a YAML key-value pair with a clear intent field, making it version-control-friendly and unambiguous to parse. Intent-based YAML is a middle ground between the flexibility of plain English and the rigor of code.

Can AI coding agents generate tests directly?

Yes, with the right authoring format and integration. Shiplight Plugin exposes test generation as an MCP tool that Claude Code, Cursor, Codex, and GitHub Copilot can call during development — the coding agent generates intent-based YAML tests as part of the same task it uses to implement a feature.

Should I use multiple authoring methods in one project?

It's common. Many teams use code-first Playwright tests for infrastructure-level flows, intent-based YAML for UI-level E2E, and AI-generated tests for coverage breadth. The key is consistency within each category — don't mix authoring methods for the same type of test.

---

Conclusion

The choice of test authoring method is a higher-leverage decision than most teams realize. It determines who on the team can contribute, how often tests break, and whether your test suite scales with development velocity or against it.

For teams building with AI coding agents, intent-based YAML is the strongest fit — it combines the readability non-engineers need with the structure AI agents can generate, and the self-healing that makes tests survive high-velocity UI changes. See best AI automation tools for software testing for a platform-by-platform comparison across the full AI automation tool category.

Try intent-based YAML testing with Shiplight Plugin — installs into Claude Code, Cursor, Codex, and GitHub Copilot in a few minutes.