ATE — Agent Test Engine

Extract → Generate → Run → Report → Heal, plus an optional Impact stage. A six-stage pipeline that turns the interaction graph of your app into runnable Playwright specs and suggests diffs when they fail.

since v0.22

ATE is Mandu's Agent Test Engine. It looks at the code the agent just wrote, infers what the user can do in that code, generates Playwright end-to-end tests for those interactions, runs them, and, when they fail, proposes unified diffs that might fix the failure.

Five stages form the contract, with a sixth (impact) to trim the work when only a subset of routes changed.

The six stages

  ┌────────────┐   ┌────────────┐   ┌────────────┐
  │  Extract   │──►│  Generate  │──►│    Run     │
  └────────────┘   └────────────┘   └─────┬──────┘
                                          ▼
  ┌────────────┐   ┌────────────┐   ┌────────────┐
  │   Impact   │   │    Heal    │◄──│   Report   │
  └────────────┘   └────────────┘   └────────────┘

Stage     Module                Output
Extract   extractor (ts-morph)  .mandu/interaction-graph.json
Generate  codegen               .mandu/scenarios.json, tests/e2e/auto/*.spec.ts
Run       runner (Playwright)   .mandu/reports/run-<id>/
Report    report                summary.json, HTML report
Heal      heal                  selector/diff suggestions
Impact    impact (git diff)     subset of affected routeIds

Stage boundaries are real — each writes to disk and each later stage accepts a runId. That is what lets the agent resume after a crash or CI retry.
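
Because every stage leaves an artifact on disk, the resume point can be inferred from what already exists. The sketch below is a hypothetical helper, not part of @mandujs/ate: it takes a predicate that reports whether a path exists and returns the first stage whose output is missing. The paths follow the table above; the location of summary.json inside the run directory is an assumption.

```typescript
// Hypothetical helper, not an ATE export.
type Stage = "extract" | "generate" | "run" | "report" | "heal";

// Returns the first stage whose output is missing on disk.
// `exists` is injected so the logic can be exercised without touching the file system.
function nextStage(exists: (path: string) => boolean, runId?: string): Stage {
  if (!exists(".mandu/interaction-graph.json")) return "extract";
  if (!exists(".mandu/scenarios.json")) return "generate";
  if (!runId || !exists(`.mandu/reports/run-${runId}`)) return "run";
  // Assumption: summary.json is written inside the run directory.
  if (!exists(`.mandu/reports/run-${runId}/summary.json`)) return "report";
  return "heal";
}

// An in-memory set stands in for the file system in this example.
const onDisk = new Set<string>([
  ".mandu/interaction-graph.json",
  ".mandu/scenarios.json",
  ".mandu/reports/run-abc123",
]);

console.log(nextStage((p) => onDisk.has(p), "abc123")); // "report"
```

In a real project the predicate could wrap existsSync from node:fs, resolved against repoRoot.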

Stage 1 — Extract

ateExtract walks the project's AST with ts-morph and builds an interaction graph of routes, modals, and actions. It recognises:

  • <Link href="<path>"> (Next.js)
  • <ManduLink to="<path>">
  • mandu.navigate("<path>")
  • mandu.modal.open("name")
  • mandu.action.run("name")

import { ateExtract } from "@mandujs/ate";

await ateExtract({
  repoRoot: process.cwd(),
  routeGlobs: ["app/**/page.tsx"],
  buildSalt: "dev",
});

Output lands at .mandu/interaction-graph.json with nodes (routes, modals, actions) and edges (navigate, openModal, runAction).
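
To make the node and edge vocabulary concrete, here is a minimal sketch of how such a graph could be typed and traversed. The field names (id, kind, from, to) are assumptions for illustration; the actual schema of interaction-graph.json may differ.

```typescript
// Assumed shape; the real file may use different or additional fields.
type NodeKind = "route" | "modal" | "action";
type EdgeKind = "navigate" | "openModal" | "runAction";

interface GraphNode { id: string; kind: NodeKind }
interface GraphEdge { from: string; to: string; kind: EdgeKind }
interface InteractionGraph { nodes: GraphNode[]; edges: GraphEdge[] }

// Breadth-first walk over navigate edges only, starting from one route.
function reachableRoutes(graph: InteractionGraph, start: string): string[] {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const edge of graph.edges) {
      if (edge.kind === "navigate" && edge.from === current && !seen.has(edge.to)) {
        seen.add(edge.to);
        queue.push(edge.to);
      }
    }
  }
  return Array.from(seen);
}

const sampleGraph: InteractionGraph = {
  nodes: [
    { id: "/", kind: "route" },
    { id: "/settings", kind: "route" },
    { id: "confirm-delete", kind: "modal" },
  ],
  edges: [
    { from: "/", to: "/settings", kind: "navigate" },
    { from: "/settings", to: "confirm-delete", kind: "openModal" },
  ],
};

console.log(reachableRoutes(sampleGraph, "/")); // ["/", "/settings"]
```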

Stage 2 — Generate

ateGenerate reads the interaction graph and synthesises Playwright specs. The oracle level controls how strict the generated assertions are:

Level  Meaning
L0     Smoke only: page loads, no console errors, no 5xx
L1     L0 + structural signals (one <main>, correct status codes)
L2     L1 + contract schema validation (Zod shape match)
L3     L2 + full behavioural checks (side effects, state, error paths)

import { ateGenerate } from "@mandujs/ate";

ateGenerate({ repoRoot: process.cwd(), oracleLevel: "L1" });

The generator emits tests/e2e/auto/route___.spec.ts style files plus a playwright.config.ts if one does not exist.
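
The oracle levels are cumulative: each level keeps everything below it and adds its own checks. The following sketch models only that layering. The check labels are taken from the table above as plain strings and are not the generator's actual assertion names.

```typescript
type OracleLevel = "L0" | "L1" | "L2" | "L3";

const LEVEL_ORDER: OracleLevel[] = ["L0", "L1", "L2", "L3"];

// Checks introduced at each level (illustrative labels only).
const ADDED_AT: Record<OracleLevel, string[]> = {
  L0: ["page loads", "no console errors", "no 5xx"],
  L1: ["one <main>", "correct status codes"],
  L2: ["contract schema validation"],
  L3: ["side effects", "state", "error paths"],
};

// Collects the checks of the requested level and every level below it.
function checksFor(level: OracleLevel): string[] {
  const upTo = LEVEL_ORDER.indexOf(level);
  return LEVEL_ORDER.slice(0, upTo + 1).reduce<string[]>(
    (acc, l) => acc.concat(ADDED_AT[l]),
    [],
  );
}

console.log(checksFor("L1").length); // 5
```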

Stage 3 — Run

ateRun shells out to Playwright and collects results by runId:

import { ateRun } from "@mandujs/ate";

const run = await ateRun({
  repoRoot: process.cwd(),
  baseURL: "http://localhost:3333",
  ci: false,
  headless: true,
  browsers: ["chromium"],
});

It returns { runId, startedAt, finishedAt, exitCode }. Artifacts — traces, screenshots, JSON results — land in .mandu/reports/run-<runId>/.

Stage 4 — Report

ateReport composes the oracle-aware summary:

import { ateReport } from "@mandujs/ate";

await ateReport({
  repoRoot: process.cwd(),
  runId: run.runId,
  startedAt: run.startedAt,
  finishedAt: run.finishedAt,
  exitCode: run.exitCode,
  oracleLevel: "L1",
  format: "both", // json | html | both
});

summary.json includes per-route status, oracle level, and — if impact was used — the changed files and selected routes.
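
A consumer of summary.json might read it along these lines. This is a sketch under stated assumptions: the field names (routes, status, impact, changedFiles) are illustrative guesses based on the description above, not the documented schema.

```typescript
// Assumed shape of summary.json; field names are illustrative.
type RouteStatus = "passed" | "failed" | "skipped";

interface AteSummary {
  runId: string;
  oracleLevel: "L0" | "L1" | "L2" | "L3";
  routes: { routeId: string; status: RouteStatus }[];
  // Present only when impact analysis selected the routes.
  impact?: { changedFiles: string[]; selectedRoutes: string[] };
}

// Lists the routes whose generated specs failed in this run.
function failedRoutes(summary: AteSummary): string[] {
  return summary.routes
    .filter((r) => r.status === "failed")
    .map((r) => r.routeId);
}

const exampleSummary: AteSummary = {
  runId: "abc123",
  oracleLevel: "L1",
  routes: [
    { routeId: "/", status: "passed" },
    { routeId: "/settings", status: "failed" },
  ],
};

console.log(failedRoutes(exampleSummary)); // ["/settings"]
```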

Stage 5 — Heal

When tests fail, ateHeal parses the traces and proposes fixes. Each suggestion carries a title, a classification (selector stale, schema mismatch, missing handler, wrong status) and a unified diff. Heal never commits code on its own.

import { ateHeal } from "@mandujs/ate";

const healing = ateHeal({ repoRoot: process.cwd(), runId: run.runId });

for (const s of healing.suggestions) {
  console.log(s.title);
  console.log(s.diff);
}

Apply a specific suggestion only after human (or agent) review:

import { applyHeal } from "@mandujs/ate";

applyHeal({
  repoRoot: process.cwd(),
  runId: run.runId,
  healIndex: 0,
  createBackup: true, // default
});

applyHeal takes a snapshot through the Change system first. If the fix regresses, run bunx mandu change rollback.

Stage 6 — Impact (optional subset)

ateImpact reads git diff, maps changed files back to routes, and returns only the selectedRoutes that need to be tested. Use it to skip a full suite in CI when only a handful of pages changed:

import { ateImpact } from "@mandujs/ate";

const impact = await ateImpact({ repoRoot: process.cwd() });
// Pass impact.selectedRoutes as onlyRoutes to ateGenerate / ateRun
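
The core idea behind impact selection can be sketched without git: given a list of changed files and a map from each route to the files it depends on, keep the routes that touch at least one changed file. The dependency map and the function below are hypothetical; ateImpact derives its inputs from git diff and the project itself.

```typescript
// Hypothetical input: routeId -> source files that route depends on.
type RouteDeps = Record<string, string[]>;

// Keeps every route that depends on at least one changed file.
function selectRoutes(changedFiles: string[], deps: RouteDeps): string[] {
  const changed = new Set<string>(changedFiles);
  return Object.keys(deps)
    .filter((routeId) => (deps[routeId] ?? []).some((file) => changed.has(file)))
    .sort();
}

const exampleDeps: RouteDeps = {
  "/": ["app/page.tsx", "components/Header.tsx"],
  "/settings": ["app/settings/page.tsx", "components/Header.tsx"],
  "/about": ["app/about/page.tsx"],
};

console.log(selectRoutes(["app/settings/page.tsx"], exampleDeps)); // ["/settings"]
console.log(selectRoutes(["components/Header.tsx"], exampleDeps)); // ["/", "/settings"]
```

A change to a shared component fans out to every route that imports it, which is why a small diff can still select several routes.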

CLI and MCP shortcuts

The whole pipeline is exposed at three layers: the library functions shown above, the CLI, and MCP tools.

# One-shot CLI (L0 oracle, optional impact)
bunx mandu test:auto --impact --baseURL http://localhost:3333

# Heal a run (minimal skeleton around ateHeal)
bunx mandu test:heal

Through MCP, the same functionality is available as individual tools — mandu.ate.extract, mandu.ate.generate, mandu.ate.run, mandu.ate.report, mandu.ate.heal, mandu.ate.impact, mandu.ate.feedback, mandu.ate.apply_heal — or as mandu.ate.auto_pipeline for the full cycle in one call, with optional useImpactAnalysis and autoHeal flags.

CI integration

A minimal GitHub Actions step starts the dev server and lets test:auto do the rest. Impact analysis is most effective when base points at origin/main on PRs; caching .mandu/interaction-graph.json between jobs trims extract time.

- run: bunx mandu dev &
- run: bunx mandu test:auto --impact --ci --baseURL http://localhost:3333

Do not do this

  • Do not auto-apply heal suggestions without review. Only selector-map changes are safe to auto-apply; contract or handler changes always require review (mandu.ate.feedback reports which is which).
  • Do not discard .mandu/reports/run-*. Heal and feedback both depend on run artifacts, and the report stage references them by runId.
  • Do not raise the oracle level above the code's maturity. L3 on a new route without stable contracts produces a wall of red. Start at L1, climb as invariants harden.
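
The first rule above can be enforced mechanically by sorting suggestions before anything is applied. The sketch below assumes the suggestion fields named in Stage 5 (title, classification, diff). The classification string values, and the mapping of "selector stale" to a safe selector-map change, are assumptions for illustration.

```typescript
// Classification values are hypothetical spellings of the categories listed in Stage 5.
type Classification =
  | "selector-stale"
  | "schema-mismatch"
  | "missing-handler"
  | "wrong-status";

interface HealSuggestion {
  title: string;
  classification: Classification;
  diff: string;
}

// Splits suggestion indexes into those that may be auto-applied and those needing review.
// Assumption: only stale-selector fixes count as selector-map changes.
function partitionSuggestions(suggestions: HealSuggestion[]) {
  const autoApplicable: number[] = [];
  const needsReview: number[] = [];
  suggestions.forEach((s, index) => {
    (s.classification === "selector-stale" ? autoApplicable : needsReview).push(index);
  });
  return { autoApplicable, needsReview };
}

const exampleSuggestions: HealSuggestion[] = [
  { title: "Update submit button selector", classification: "selector-stale", diff: "" },
  { title: "Align response with contract", classification: "schema-mismatch", diff: "" },
];

console.log(partitionSuggestions(exampleSuggestions));
// { autoApplicable: [0], needsReview: [1] }
```

If healIndex refers to a position in the suggestions array, which is an assumption here, the autoApplicable indexes could be passed to applyHeal one at a time while the rest wait for review.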

🤖 Agent Prompt — ATE — Agent Test Engine
Apply the guidance from the Mandu docs page at https://mandujs.com/docs/build-with-agents/ate to my project.

Summary of the page:
ATE is @mandujs/ate. Pipeline: ateExtract → ateGenerate → ateRun → ateReport → ateHeal with ateImpact for subset selection. CLI: bunx mandu test:auto. MCP: mandu.ate.* tools plus mandu.ate.auto_pipeline. Oracle levels L0-L3 gate assertion depth.

Required invariants — must hold after your changes:
- Every stage writes to .mandu/ate or .mandu/reports and can be resumed by runId
- ate.heal produces diff suggestions only; it never auto-commits code
- ate.apply_heal always creates a snapshot via the change system before writing
- oracleLevel determines assertion depth: L0 smoke, L1 HTTP, L2 contract, L3 behavioral

Then:
1. Make the change in my codebase consistent with the page.
2. Run `bun run guard` and `bun run check` to verify nothing
   in src/ or app/ breaks Mandu's invariants.
3. Show me the diff and any guard violations.

For Agents

Guard scope: agent-workflow