Validation for AI coding agents

Stop shipping code that almost works.

Your agent checks its own work against your real config, before it says it's done.

Get started Read the docs

coding-agent · ~/app · validity-mcp

How it works

Find out if your agent's UI actually works — before merge, not after.

The agent thinks it's done.

Your AI agent edits a component, calls it finished, and hands the PR back. Validity catches it before you do.

Validity renders it for real.

The change mounts in your own Vite or Next config — your providers, your wrappers, your CSS. No mocks, no stubs.

You get a verdict.

Every requirement comes back pass, fail, or partial, with a one-line note on what's broken. Hand it back to the agent or merge.

In your loop

One config. The agent and CI read it the same way.

validity.config.ts

import { defineConfig } from "validity";

export default defineConfig({
  project: "./src",
  render: {
    vite: "./vite.config.ts",
    wrap: "./src/test/wrap.tsx",
  },
  criteria: {
    strict: true,        // fail on any unmet
    screenshots: "local", // never leave disk
  },
  mcp: {
    name: "validity",
    tools: ["validate"],
  },
});

From the agent

Your agent calls validity__validate as a native MCP tool. No prompt scaffolding.

From your machine

Renders and screenshots stay in node_modules/.validity. Only the verdict syncs.

From CI

validity validate exits non-zero on a fail. Gate PRs on the same checks your agent ran.

Who it's for

For teams that have stopped pretending the agent got it right.

Coding agents

Any MCP-compatible harness. One endpoint, every tool surface.

AI-assisted dev

Pair-programming setups where the human reviews PRs the agent already “validated.” Now they actually are.

Eval teams

Drop Validity into the harness. Each generation is judged against the brief it received, not a fresh rubric.

RAG and tool-use stacks

Make sure the retrieved context renders, the tool ran, and the UI reflected it — before the user does.

Trust & safety

Every run is an audit trail. You can prove what the agent claimed worked — and what didn't.

Agentic startups

Make “validated” a merge requirement. The verdict moves with the PR.