The best way to trigger on-demand test runs from a dashboard or API

Updated on April 14, 2026

On-demand test runs are easy to add and surprisingly hard to get right.

Most teams start with the obvious goal: click Run in the dashboard, or hit an endpoint from CI. That works, until it doesn’t. Soon you have ad hoc runs with unclear scope, repeated retries that hide real regressions, and results nobody trusts because the run context was never captured.

The best way to trigger on-demand test runs is to treat a run trigger as a product surface, not a button. That means your dashboard and API should create the same thing every time: a run with explicit intent, a defined scope, traceable context, and predictable outputs. Shiplight AI is built for this kind of workflow: intent-based end-to-end testing in real browsers, with on-demand execution from the dashboard, CLI, or API, plus the reporting and debugging you need to turn a run into a decision.

Below is a practical model you can implement immediately, whether you are triggering runs from a UI, from automation, or from both.

Define what “on-demand” is for

On-demand runs are not a smaller version of CI. They serve different moments:

  • Developer proof: validate a UI change before asking for review.
  • Release triage: confirm whether a regression is real, and whether it is still happening.
  • Targeted confidence: run the flows that matter to a specific feature, experiment, or customer journey.
  • Fast reproduction: rerun a failed scenario in a clean, known environment with artifacts attached.

The mistake is letting on-demand become “run everything whenever I feel nervous.” That is slow, expensive, and it trains teams to ignore results.

A strong on-demand trigger system makes the small, intentional run the default.

Use a single run contract for both dashboard and API

If the dashboard creates one kind of run and the API creates another, you will get two cultures: manual runs that are “real,” and automated runs that are “just checks.” You want one run contract that everything uses, with a consistent set of fields.

A solid run contract answers four questions:

  1. What should run? (suite, tags, or specific tests)
  2. In what context? (branch, commit SHA, environment, base URL)
  3. With what inputs? (variables like locale, user role, seed data)
  4. With what expectations? (blocking vs informational, required checks, timeouts)
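The four questions above can be sketched as a single shared data structure. This is an illustrative sketch, not a Shiplight schema: every field name here is an assumption, but the point stands that the dashboard, CLI, and API should all construct the same object.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass(frozen=True)
class RunContract:
    """One contract for every trigger: dashboard, CLI, or API. (Hypothetical fields.)"""
    # 1. What should run? (suite, tags, or specific tests)
    suite: str
    tags: Tuple[str, ...] = ()
    # 2. In what context? (branch, commit SHA, environment)
    branch: str = "main"
    commit_sha: str = ""
    environment: str = "staging"
    # 3. With what inputs? (locale, user role, seed data)
    inputs: Dict[str, str] = field(default_factory=dict)
    # 4. With what expectations? (blocking vs informational, timeouts)
    blocking: bool = False
    timeout_minutes: int = 20

def describe(run: RunContract) -> str:
    """Human-readable answer to: what ran, where, and in what mode."""
    mode = "blocking" if run.blocking else "informational"
    return f"{run.suite} on {run.environment}@{run.commit_sha or run.branch} ({mode})"

# The same contract, whether the click came from a UI or a script.
run = RunContract(suite="smoke", tags=("checkout",), commit_sha="abc123",
                  inputs={"locale": "en-US"}, blocking=True)
print(describe(run))  # smoke on staging@abc123 (blocking)
```

Making the contract a frozen value object is deliberate: a run request should be immutable once created, so a rerun can reuse it byte for byte.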

In Shiplight, this maps cleanly to how teams already work: tests expressed in readable YAML, organized into suites and tags, executed in real browsers, and verified with intent-based steps and AI-powered assertions. The run trigger should simply select and parameterize that work.

Make scope selection explicit and opinionated

The best trigger experience is one where a user cannot accidentally run the wrong thing.

A practical hierarchy looks like this:

  • Default: run a curated smoke suite (fast, high-signal, low-flake).
  • Common options: feature suite (e.g., “Checkout”), change-aware subset (e.g., flows impacted by this PR), or tagged run (e.g., tier:critical).
  • Advanced: custom selection for investigation (specific tests, specific browsers, specific data profiles).
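That hierarchy can be enforced in a small scope resolver where the most specific selection wins and the curated smoke suite is the fallback. The catalog, suite names, and tags below are hypothetical examples, not a real test inventory.

```python
# Hypothetical test catalog; in practice this would come from your test repo.
SUITES = {
    "smoke": ["login", "checkout-happy-path", "search"],
    "checkout": ["checkout-happy-path", "refund-flow"],
}
TAGS = {
    "login": {"tier:critical"},
    "checkout-happy-path": {"checkout", "tier:critical"},
    "search": set(),
    "refund-flow": {"checkout"},
}

def resolve_scope(suite="smoke", tags=None, tests=None):
    """Most specific selection wins; the curated smoke suite is the default."""
    if tests:                                    # advanced: explicit tests
        return sorted(tests)
    if tags:                                     # common: tag-based run
        wanted = set(tags)
        return sorted(name for name, t in TAGS.items() if wanted & t)
    return sorted(SUITES[suite])                 # default: curated suite

print(resolve_scope())                           # the safe default
print(resolve_scope(tags=["tier:critical"]))     # a targeted subset
```

Because the default path requires no arguments, the cheapest thing for a user to do is also the right thing.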

Shiplight teams typically get the most value when they treat on-demand runs as surgical verification. Use intent-based execution and self-healing to reduce brittle selector maintenance, but still keep the scope small enough that the results are immediately actionable.

If you are designing the dashboard UX, use guardrails:

  • Put the default suite first.
  • Show estimated duration.
  • Require a reason when someone runs “full regression” from the UI.
  • Offer “Run again with same inputs” to reproduce precisely, instead of encouraging random retries.

Treat parameters as first-class, not an afterthought

On-demand runs fail in the real world because they are missing context. A run without parameters is not reproducible.

At minimum, capture:

  • Environment (preview, staging, production-like)
  • Build identity (commit SHA, build number, artifact version)
  • Base URL and feature flags
  • Test data profile (synthetic seed, sandbox account, locale, role)

Shiplight’s approach to readable test definitions and modular composition makes parameterization easier to sustain. When variables live alongside the test definition in version control, you reduce the temptation to “just run it manually and hope.”
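One way to sustain this is to merge a version-controlled default profile with explicit per-run overrides, and refuse to start a run that is missing the fields that make it reproducible. A minimal sketch, with assumed field names:

```python
# Defaults live in version control next to the tests; overrides come from
# the trigger. Field names here are illustrative assumptions.
DEFAULT_PROFILE = {
    "environment": "staging",
    "baseUrl": "https://staging.example.com",
    "locale": "en-US",
    "userRole": "standard",
}

REQUIRED = ("environment", "baseUrl", "commitSha")

def build_run_params(overrides: dict) -> dict:
    """Merge defaults with per-run overrides; reject non-reproducible runs."""
    params = {**DEFAULT_PROFILE, **overrides}
    missing = [k for k in REQUIRED if not params.get(k)]
    if missing:
        raise ValueError(f"run is not reproducible; missing: {missing}")
    return params

params = build_run_params({"commitSha": "abc123", "locale": "de-DE"})
print(params["locale"])       # the override wins
print(params["environment"])  # the default fills the gap
```

The hard failure on missing build identity is the point: a run you cannot tie to a commit is a run you cannot trust later.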

Design for idempotency and safe retries

Retries can be useful, but only if the system makes the difference between “flaky” and “fixed” obvious.

Best practices:

  • Idempotency keys: ensure an API-triggered run is not duplicated if the caller retries a request.
  • Structured reruns: rerun the same tests with the same inputs, not a slightly different selection.
  • Retry policy transparency: show whether a pass required retries, and how many.

This is where self-healing and high-quality assertions matter. A test that “passes after three retries” is not confidence. A run trigger should encourage a single clean proof, backed by artifacts and clear failure modes.
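The idempotency-key practice above is easy to sketch: derive the key from the canonical run request, so a retried HTTP call maps back to the existing run instead of creating a duplicate. This is a generic pattern, not a Shiplight endpoint; the storage here is an in-memory stand-in.

```python
import hashlib
import json

_runs = {}  # idempotency key -> run id (stand-in for a real database)

def idempotency_key(payload: dict) -> str:
    """Stable key for a run request: same payload, same key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def trigger_run(payload: dict) -> str:
    """Create a run, or return the existing one if the caller retried."""
    key = idempotency_key(payload)
    if key in _runs:
        return _runs[key]          # duplicate request: no duplicate run
    run_id = f"run-{len(_runs) + 1}"
    _runs[key] = run_id
    return run_id

payload = {"suite": "smoke", "commitSha": "abc123"}
first = trigger_run(payload)
second = trigger_run(payload)      # network retry with the same request
print(first == second)             # True: one run, not two
```

Callers can also supply their own key in a header (the common `Idempotency-Key` convention) when the same payload may legitimately run twice.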

Emit results in a way teams can operationalize

Triggering is only half the job. The output must land where decisions happen.

A strong on-demand run system supports:

  • A run URL that is shareable and stable (so “can you look?” is one click)
  • Artifacts for fast debugging (screenshots, DOM snapshots, console/network logs)
  • Webhooks that can route outcomes to Slack, Jira, or incident tooling
  • A concise summary that explains what failed and what changed, without forcing someone to scroll through noise

Shiplight’s live dashboards, reporting, debugging tools, and AI test summarization are designed for exactly this: turning a test run into a triage-ready narrative so the team can act quickly.
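The "concise summary" point is worth making concrete. A webhook consumer should collapse a run result into one triage-ready line; the result shape below is an assumed generic payload, not a documented Shiplight webhook format.

```python
def summarize(result: dict) -> str:
    """Compact, chat-ready summary of a run result. (Assumed payload shape.)"""
    failed = [t["name"] for t in result["tests"] if t["status"] == "failed"]
    total = len(result["tests"])
    if not failed:
        return f"PASS {result['suite']}: all {total} tests passed"
    return (f"FAIL {result['suite']}: {len(failed)}/{total} failed "
            f"({', '.join(failed)}) see {result['runUrl']}")

result = {
    "suite": "smoke",
    "runUrl": "https://example.com/runs/42",
    "tests": [
        {"name": "login", "status": "passed"},
        {"name": "checkout", "status": "failed"},
    ],
}
print(summarize(result))
```

Note the stable run URL rides along in the failure line, so "can you look?" stays one click.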

A practical API triggering pattern you can copy

You do not need a complex API to do this well. You need a clear request model.

Here is an illustrative payload structure (not a Shiplight-specific endpoint) that reflects a run contract built for reproducibility:

{
  "suite": "smoke",
  "tags": ["checkout", "tier:critical"],
  "environment": "staging",
  "build": {
    "commitSha": "abc123",
    "branch": "feature/new-cart"
  },
  "inputs": {
    "baseUrl": "https://staging.example.com",
    "locale": "en-US",
    "userRole": "standard"
  },
  "execution": {
    "browsers": ["chromium"],
    "parallelism": 4,
    "timeoutMinutes": 20
  },
  "reporting": {
    "webhookUrl": "https://hooks.example.com/test-results",
    "notifyOn": ["failed", "error"]
  }
}

The key is not the shape of the JSON. The key is that any engineer can look at the run request and answer: what ran, where, and why.
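Sending a payload like this is a few lines of client code. A minimal sketch, assuming a placeholder endpoint and the common `Idempotency-Key` header convention; it builds the request rather than sending it, so you can slot in whatever HTTP client you use.

```python
import json
import uuid

def build_trigger_request(payload: dict, token: str) -> dict:
    """Assemble a run-trigger request. URL and header names are placeholders."""
    return {
        "method": "POST",
        "url": "https://api.example.com/v1/runs",   # placeholder endpoint
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": str(uuid.uuid4()),   # safe for the caller to retry
        },
        "body": json.dumps(payload),
    }

req = build_trigger_request({"suite": "smoke", "environment": "staging"}, "TOKEN")
print(req["method"], req["url"])
```

Logging the request body alongside the run ID closes the loop: the answer to "what ran, where, and why" is written down before the run even starts.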

Why Shiplight AI is a strong fit for on-demand runs

Traditional automation stacks often make on-demand runs painful because they depend on brittle selectors, heavy scripting, and constant maintenance. That friction pushes teams toward fewer runs, bigger runs, and slower feedback.

Shiplight is designed for frequent, targeted verification:

  • Intent-based execution so tests reflect what the user is trying to do, not how the DOM happens to be structured.
  • Self-healing to reduce the maintenance burden that kills trust in on-demand testing.
  • AI-powered assertions that validate real UI behavior, not just the presence of an element.
  • Cloud runners, dashboards, and integrations so a run can be triggered on demand and turned into a decision quickly.

If your goal is to make on-demand runs a daily habit for developers and a reliable tool for release owners, the best way is simple: make every trigger produce a run that is small, explicit, reproducible, and easy to act on. Shiplight is built to support that standard without adding process overhead or turning QA into a specialist bottleneck.