ATE — Agent Test Engine
Extract → Generate → Run → Report → Heal, with optional Impact. A six-stage pipeline that turns your app's interaction graph into runnable Playwright specs and suggests diffs when they fail.
ATE is Mandu's Agent Test Engine. It looks at the code the agent just wrote, infers what the user can do in that code, generates Playwright end-to-end tests for those interactions, runs them, and, when they fail, proposes unified diffs that might fix the failure.
Five stages form the contract, with a sixth (impact) to trim the work when
only a subset of routes changed.
The six stages
┌────────────┐   ┌────────────┐   ┌────────────┐
│  Extract   │──►│  Generate  │──►│    Run     │
└────────────┘   └────────────┘   └─────┬──────┘
                                        ▼
┌────────────┐   ┌────────────┐   ┌────────────┐
│   Impact   │   │    Heal    │◄──│   Report   │
└────────────┘   └────────────┘   └────────────┘
| Stage | Module | Output |
|---|---|---|
| Extract | extractor (ts-morph) | .mandu/interaction-graph.json |
| Generate | codegen | .mandu/scenarios.json, tests/e2e/auto/*.spec.ts |
| Run | runner (Playwright) | .mandu/reports/run-<id>/ |
| Report | report | summary.json, HTML report |
| Heal | heal | selector/diff suggestions |
| Impact | impact (git diff) | subset of affected routeIds |
Stage boundaries are real — each writes to disk and each later stage accepts a
runId. That is what lets the agent resume after a crash or CI retry.
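The resume behaviour can be pictured with a small sketch. This is not part of @mandujs/ate: the `Stage` type and `nextStage` helper are hypothetical and only model the stage order from the table above, assuming a caller already knows which stages have written their output.

```typescript
// Hypothetical model of the five contract stages, in pipeline order.
type Stage = "extract" | "generate" | "run" | "report" | "heal";

const STAGE_ORDER: Stage[] = ["extract", "generate", "run", "report", "heal"];

// Returns the first stage whose output is still missing, or null when all are done.
function nextStage(completed: ReadonlySet<Stage>): Stage | null {
  for (const stage of STAGE_ORDER) {
    if (!completed.has(stage)) return stage;
  }
  return null;
}

// Example: a CI job crashed after the run stage had written its artifacts.
const resumeFrom = nextStage(new Set<Stage>(["extract", "generate", "run"]));
console.log(resumeFrom); // "report"
```

Because each stage persists its output, a retry only needs the runId and the list of completed stages to pick up where it stopped.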
Stage 1 — Extract
ateExtract walks the project's AST with ts-morph and builds an
interaction graph of routes, modals, and actions. It recognises:
- <Link href="<path>"> (Next.js)
- <ManduLink to="<path>">
- mandu.navigate("<path>")
- mandu.modal.open("name")
- mandu.action.run("name")
import { ateExtract } from "@mandujs/ate";
await ateExtract({
  repoRoot: process.cwd(),
  routeGlobs: ["app/**/page.tsx"],
  buildSalt: "dev",
});
Output lands at .mandu/interaction-graph.json with nodes (routes, modals,
actions) and edges (navigate, openModal, runAction).
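To make the node and edge vocabulary concrete, here is a minimal sketch of how such a graph might be modelled and queried. The field names (`id`, `kind`, `from`, `to`) and the `reachableRoutes` helper are assumptions for illustration; the actual JSON layout may differ.

```typescript
// Assumed shape of the interaction graph; field names are illustrative only.
type NodeKind = "route" | "modal" | "action";
type EdgeKind = "navigate" | "openModal" | "runAction";

interface GraphNode { id: string; kind: NodeKind; }
interface GraphEdge { from: string; to: string; kind: EdgeKind; }
interface InteractionGraph { nodes: GraphNode[]; edges: GraphEdge[]; }

// Breadth-first walk over "navigate" edges: which routes can a user reach from `start`?
function reachableRoutes(graph: InteractionGraph, start: string): string[] {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const edge of graph.edges) {
      if (edge.kind === "navigate" && edge.from === current && !seen.has(edge.to)) {
        seen.add(edge.to);
        queue.push(edge.to);
      }
    }
  }
  return Array.from(seen);
}

const sampleGraph: InteractionGraph = {
  nodes: [
    { id: "/", kind: "route" },
    { id: "/settings", kind: "route" },
    { id: "confirm-delete", kind: "modal" },
  ],
  edges: [
    { from: "/", to: "/settings", kind: "navigate" },
    { from: "/settings", to: "confirm-delete", kind: "openModal" },
  ],
};

console.log(reachableRoutes(sampleGraph, "/")); // ["/", "/settings"]
```

The modal is not listed because only navigate edges are followed; a generator would treat openModal and runAction edges as interactions to exercise on the route that owns them.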
Stage 2 — Generate
ateGenerate reads the interaction graph and synthesises Playwright specs.
The oracle level controls how strict the generated assertions are:
| Level | Meaning |
|---|---|
| L0 | Smoke only — page loads, no console errors, no 5xx |
| L1 | L0 + structural signals (one <main>, correct status codes) |
| L2 | L1 + contract schema validation (Zod shape match) |
| L3 | L2 + full behavioural checks (side effects, state, error paths) |
ateGenerate({ repoRoot: process.cwd(), oracleLevel: "L1" });
The generator emits tests/e2e/auto/route___.spec.ts style files plus a
playwright.config.ts if one does not exist.
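The oracle levels are cumulative, which a short sketch can show. The `CHECKS_ADDED_AT` table and `checksFor` helper below are hypothetical; the check names are paraphrased from the table above and are not the generator's real identifiers.

```typescript
type OracleLevel = "L0" | "L1" | "L2" | "L3";

// Checks each level adds on top of the previous one (names are illustrative).
const CHECKS_ADDED_AT: Record<OracleLevel, string[]> = {
  L0: ["page loads", "no console errors", "no 5xx"],
  L1: ["structural signals"],
  L2: ["contract schema validation"],
  L3: ["behavioural checks"],
};

const LEVELS: OracleLevel[] = ["L0", "L1", "L2", "L3"];

// A level includes its own checks plus every check from the levels below it.
function checksFor(level: OracleLevel): string[] {
  const checks: string[] = [];
  for (const l of LEVELS.slice(0, LEVELS.indexOf(level) + 1)) {
    checks.push(...CHECKS_ADDED_AT[l]);
  }
  return checks;
}

console.log(checksFor("L1"));
// ["page loads", "no console errors", "no 5xx", "structural signals"]
```

Raising the level never drops a lower-level check, so a route that fails at L0 also fails at every level above it.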
Stage 3 — Run
ateRun shells out to Playwright and collects results by runId:
const run = await ateRun({
  repoRoot: process.cwd(),
  baseURL: "http://localhost:3333",
  ci: false,
  headless: true,
  browsers: ["chromium"],
});
It returns { runId, startedAt, finishedAt, exitCode }. Artifacts — traces,
screenshots, JSON results — land in .mandu/reports/run-<runId>/.
Stage 4 — Report
ateReport composes the oracle-aware summary:
await ateReport({
  repoRoot: process.cwd(),
  runId: run.runId,
  startedAt: run.startedAt,
  finishedAt: run.finishedAt,
  exitCode: run.exitCode,
  oracleLevel: "L1",
  format: "both", // json | html | both
});
summary.json includes per-route status, oracle level, and — if impact was
used — the changed files and selected routes.
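As a rough illustration of consuming the summary, the sketch below models a simplified summary object and lists the failing routes. The `Summary` and `RouteResult` shapes are assumptions based only on the description above; the real summary.json may use different field names.

```typescript
// Assumed, simplified shape of summary.json (field names are hypothetical).
interface RouteResult {
  routeId: string;
  status: "passed" | "failed" | "skipped";
}

interface Summary {
  runId: string;
  oracleLevel: "L0" | "L1" | "L2" | "L3";
  routes: RouteResult[];
  // Present only when impact analysis selected the routes.
  impact?: { changedFiles: string[]; selectedRoutes: string[] };
}

// Collects the routeIds whose tests failed in this run.
function failedRoutes(summary: Summary): string[] {
  return summary.routes
    .filter((route) => route.status === "failed")
    .map((route) => route.routeId);
}

const sampleSummary: Summary = {
  runId: "example-run",
  oracleLevel: "L1",
  routes: [
    { routeId: "/", status: "passed" },
    { routeId: "/settings", status: "failed" },
  ],
};

console.log(failedRoutes(sampleSummary)); // ["/settings"]
```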
Stage 5 — Heal
When tests fail, ateHeal parses the traces and proposes fixes. Each
suggestion carries a title, a classification (selector stale, schema
mismatch, missing handler, wrong status) and a unified diff. Heal never
commits code on its own.
const healing = ateHeal({ repoRoot: process.cwd(), runId: run.runId });
for (const s of healing.suggestions) {
  console.log(s.title);
  console.log(s.diff);
}
Apply a specific suggestion only after human (or agent) review:
import { applyHeal } from "@mandujs/ate";
applyHeal({
  repoRoot: process.cwd(),
  runId: run.runId,
  healIndex: 0,
  createBackup: true, // default
});
applyHeal takes a snapshot through the Change system first. If the fix
regresses, run bunx mandu change rollback.
Stage 6 — Impact (optional subset)
ateImpact reads git diff, maps changed files back to routes, and returns
only the selectedRoutes that need to be tested. Use it to skip a full
suite in CI when only a handful of pages changed:
import { ateImpact } from "@mandujs/ate";
const impact = await ateImpact({ repoRoot: process.cwd() });
// Pass impact.selectedRoutes as onlyRoutes to ateGenerate / ateRun
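The file-to-route mapping can be sketched as a simple lookup. This is a simplified stand-in, not ateImpact's actual algorithm: the `RouteDeps` index and `selectRoutes` helper are hypothetical, and the sketch assumes the list of changed files has already been obtained from git diff.

```typescript
// Hypothetical index: each route and the source files it depends on.
interface RouteDeps {
  routeId: string;
  files: string[];
}

// Selects every route that depends on at least one changed file.
function selectRoutes(index: RouteDeps[], changedFiles: string[]): string[] {
  const changed = new Set(changedFiles);
  return index
    .filter((route) => route.files.some((file) => changed.has(file)))
    .map((route) => route.routeId);
}

const routeIndex: RouteDeps[] = [
  { routeId: "/", files: ["app/page.tsx", "components/Header.tsx"] },
  { routeId: "/settings", files: ["app/settings/page.tsx", "components/Header.tsx"] },
  { routeId: "/about", files: ["app/about/page.tsx"] },
];

console.log(selectRoutes(routeIndex, ["app/settings/page.tsx"])); // ["/settings"]
console.log(selectRoutes(routeIndex, ["components/Header.tsx"])); // ["/", "/settings"]
```

A change to a shared component fans out to every route that uses it, which is why a small diff can still select several routes.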
CLI and MCP shortcuts
The whole pipeline is exposed at three layers: the @mandujs/ate functions shown above, the CLI, and MCP.
# One-shot CLI (L0 oracle, optional impact)
bunx mandu test:auto --impact --baseURL http://localhost:3333
# Heal a run (minimal skeleton around ateHeal)
bunx mandu test:heal
Through MCP, the same functionality is available as individual tools —
mandu.ate.extract, mandu.ate.generate, mandu.ate.run, mandu.ate.report,
mandu.ate.heal, mandu.ate.impact, mandu.ate.feedback,
mandu.ate.apply_heal — or as mandu.ate.auto_pipeline for the full cycle in
one call, with optional useImpactAnalysis and autoHeal flags.
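One way to picture how the two flags shape a full cycle is the sketch below. It is only a guess at the ordering: the `planStages` helper is hypothetical, and placing impact between extract and generate is an assumption drawn from the note that selected routes are passed on to generate and run.

```typescript
type StageName = "extract" | "impact" | "generate" | "run" | "report" | "heal";

// Flag names taken from the text above; their exact semantics are assumed here.
interface AutoPipelineFlags {
  useImpactAnalysis?: boolean;
  autoHeal?: boolean;
}

// Builds the ordered stage list for one full cycle (assumed ordering).
function planStages(flags: AutoPipelineFlags = {}): StageName[] {
  const stages: StageName[] = ["extract"];
  if (flags.useImpactAnalysis) stages.push("impact");
  stages.push("generate", "run", "report");
  if (flags.autoHeal) stages.push("heal");
  return stages;
}

console.log(planStages());
// ["extract", "generate", "run", "report"]
console.log(planStages({ useImpactAnalysis: true, autoHeal: true }));
// ["extract", "impact", "generate", "run", "report", "heal"]
```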
CI integration
A minimal GitHub Actions step starts the dev server and lets test:auto do
the rest. Impact analysis is most effective when base points at
origin/main on PRs; caching .mandu/interaction-graph.json between jobs
trims extract time.
- run: bunx mandu dev &
- run: bunx mandu test:auto --impact --ci --baseURL http://localhost:3333
Do not do this
- Do not auto-apply heal suggestions without review. Only selector-map changes are safe to auto-apply; contract or handler changes always require review (mandu.ate.feedback reports which is which).
- Do not discard .mandu/reports/run-*. Heal and feedback both depend on run artifacts, and the report stage references them by runId.
- Do not raise the oracle level above the code's maturity. L3 on a new route without stable contracts produces a wall of red. Start at L1, climb as invariants harden.
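The first rule can be sketched as a triage step that runs before anything is applied. The classification strings and the `partitionForReview` helper are hypothetical, and treating "selector stale" as the selector-map case is an assumption; in practice mandu.ate.feedback is the source of truth for what is safe.

```typescript
// Classification labels paraphrased from the Heal stage (spelling is assumed).
type HealClassification =
  | "selector-stale"
  | "schema-mismatch"
  | "missing-handler"
  | "wrong-status";

interface HealSuggestion {
  title: string;
  classification: HealClassification;
  diff: string;
}

// Splits suggestion indices into auto-apply candidates and ones needing review.
function partitionForReview(suggestions: HealSuggestion[]) {
  const autoApplicable: number[] = [];
  const needsReview: number[] = [];
  suggestions.forEach((suggestion, index) => {
    if (suggestion.classification === "selector-stale") {
      autoApplicable.push(index);
    } else {
      needsReview.push(index);
    }
  });
  return { autoApplicable, needsReview };
}

const triage = partitionForReview([
  { title: "Update button selector", classification: "selector-stale", diff: "" },
  { title: "Response shape changed", classification: "schema-mismatch", diff: "" },
]);

console.log(triage); // { autoApplicable: [0], needsReview: [1] }
```

The indices line up with the position of each suggestion in the list, so a reviewed entry can be passed on as the healIndex.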
🤖 Agent Prompt
Apply the guidance from the Mandu docs page at https://mandujs.com/docs/build-with-agents/ate to my project.
Summary of the page:
ATE is @mandujs/ate. Pipeline: ateExtract → ateGenerate → ateRun → ateReport → ateHeal with ateImpact for subset selection. CLI: bunx mandu test:auto. MCP: mandu.ate.* tools plus mandu.ate.auto_pipeline. Oracle levels L0-L3 gate assertion depth.
Required invariants — must hold after your changes:
- Every stage writes to .mandu/ate or .mandu/reports and can be resumed by runId
- ate.heal produces diff suggestions only; it never auto-commits code
- ate.apply_heal always creates a snapshot via the change system before writing
- oracleLevel determines assertion depth: L0 smoke, L1 HTTP, L2 contract, L3 behavioral
Then:
1. Make the change in my codebase consistent with the page.
2. Run `bun run guard` and `bun run check` to verify nothing
in src/ or app/ breaks Mandu's invariants.
3. Show me the diff and any guard violations.