Chad — a coding engine that doesn't lie about progress

01 · Is this for me?

Three doors. Pick yours.

If you write code for a living

Yes. This is for you.

Chad is the agent that earns your trust by showing receipts. Run it locally, point it at your repo, watch it work. The verifier won't lie even if you ask it to.

If you've used ChatGPT to make small things

Yes — exactly for you.

If you've gotten frustrated when a chatbot says "done" but the result is broken, Chad is the friend who actually checks the work before handing it over.

If you've never written code

Not yet.

Chad is the engine, and the engine still wants a terminal. A no-command-line companion is in the works for you, but it's not ready. For now, sit tight.

PHASES · ARCADE BUILD

14 / 15

Phases completed on a real 8-game arcade build. Chad w/ qwen3:14b, fifteen minutes, no babysitting.

FAMILIES · DRAWPAD

3 model families

Perfect drawpad runs. qwen3, gemma3, ministral — all hit the spec when run with Chad.

UNVERIFIED PHASES

Phases marked done without verification. The number, exactly.

02 · We let him cook

One model. With Chad and without.

Same spec. Same weights. Same prompt. The only thing that changed was whether Chad was running the loop.

bare qwen3:14b SAID DONE · UNVERIFIED

arcade-bare/
├── index.html    # 1.4 kb · stub
├── arcade.js     # 1.8 kb · 2 functions
└── styles.css    # 0.4 kb

3 files · 3.6 kb
games shipped · 0 of 8
verifier ran · no
model said · "done."

"Said it was done." 3.6 KB · 3 files

Chad w/ qwen3:14b VERIFIED · 14/15 PHASES

arcade-chad/
├── index.html
├── arcade.js              # router · 8 modes
├── games/
│   ├── snake.js           # spec ✓
│   ├── pong.js            # spec ✓
│   ├── tetris.js          # spec ✓
│   ├── breakout.js        # spec ✓
│   ├── memory-match.js    # spec ✓
│   ├── tic-tac-toe.js     # spec ✓
│   ├── simon.js           # spec ✓
│   └── whack-a-mole.js    # spec ✓
├── HONESTY.md             # auto-emitted
└── KNOWN_LIMITS.md        # auto-emitted

15 files
games shipped · 8 of 8
verifier ran · 24× · 14 pass · 1 surfaced

"Verified." 15 files · 8 games

Same weights. Same prompt. Chad is the product.

03 · How the cook-off worked

We gave them a recipe. We let them cook.

Eight different AI models, same recipe, two tries each. One on its own. One with Chad in the kitchen.

First try: on its own. We handed each model the recipe and accepted whatever came back. Like ordering a meal and eating whatever the chef plates.

Second try: with Chad. Same model. Same recipe. But Chad runs the kitchen. He breaks the recipe into steps, watches the model work each step, and won't let it move on until that step actually works.

Same ingredients. Same recipe. Two ways of cooking.

On the hardest recipe — an 8-game arcade in one webpage — Chad with qwen3:14b read the spec and said: "this needs 15 steps." He cooked all 15. Step 8 took three tries. Step 12 took four. Everything else, first attempt. Fifteen minutes later, eight working games came out of the oven.

The same model, on its own, wrote 3.6 KB of code in 82 seconds and called it done. None of the eight games actually ran.

A bigger model, qwen3:32b on its own, also failed — three files, none of which loaded without errors.

Codestral 22B is the cautionary tale. On its own, it shipped a JavaScript file that starts with a markdown fence. Line one of the file is the literal text ```javascript. The browser never even loaded the page. Codestral said "done." It was lying.

04 · What Chad does

What Chad does that nothing else does.

Six specifics. None of them are taglines. Each one is the reason a build that should have failed didn't.

A checker that won't be skipped.

Every step runs checks. If checks fail, Chad retries. He doesn't lie about it.

Plain English: every step gets checked before Chad moves to the next one.
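
A minimal sketch of that loop, in Python. Everything here is illustrative: `run_step`, `build`, `check`, and `max_attempts` are hypothetical names, not Chad's actual API. The point is the shape: a step cannot be reported as verified unless its check passed, and a step that never passes is surfaced rather than hidden.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    name: str
    status: str      # "verified" or "surfaced"
    attempts: int


def run_step(name: str,
             build: Callable[[int], str],
             check: Callable[[str], bool],
             max_attempts: int = 4) -> StepResult:
    """Build, check, retry. Never report 'verified' without a passing check."""
    for attempt in range(1, max_attempts + 1):
        output = build(attempt)      # hypothetical: ask the model for this step
        if check(output):            # hypothetical: run the step's checks
            return StepResult(name, "verified", attempt)
    # Out of retries: report the failure instead of claiming success.
    return StepResult(name, "surfaced", max_attempts)


if __name__ == "__main__":
    # A fake step that only produces working output on its third try.
    result = run_step("step-8",
                      build=lambda n: "ok" if n >= 3 else "broken",
                      check=lambda out: out == "ok")
    print(result)
```

The retry cap is an assumption too. Without one, a step that can never pass would loop forever instead of being surfaced.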

Discovery before code.

Chad asks clarifying questions before he plans. Most agents skip this. Most builds suffer for it.

Plain English: like a contractor — asks questions before quoting the job.

Design summary checkpoint.

Chad restates what he heard before he writes a line. You catch misreads in 30 seconds, not 30 minutes.

Plain English: he repeats back what you asked for. Catches misunderstandings early.

A receipt with every project.

Every Chad project ships with a plain-English file describing what works, what doesn't, and what was skipped. No hidden caveats.

Plain English: read the receipt and you'll know exactly what you got.
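
What such a receipt might contain, sketched in Python. The function name `render_receipt`, the three section titles, and the layout are assumptions for illustration; the text above only says the file covers what works, what doesn't, and what was skipped.

```python
def render_receipt(works: list[str], broken: list[str], skipped: list[str]) -> str:
    """Render a plain-text receipt with one section per outcome."""
    sections = [("What works", works),
                ("What doesn't", broken),
                ("What was skipped", skipped)]
    lines: list[str] = []
    for title, items in sections:
        lines.append(title)
        # An empty section is stated outright, never silently omitted.
        lines.extend(f"  - {item}" for item in items or ["nothing"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder entries, not results from a real run.
    print(render_receipt(["example feature"], [], ["example ticket"]))
```

Printing "nothing" for an empty section is a design choice in this sketch: a reader can tell "no failures" apart from "failures not recorded".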

Fixes his own bugs first.

Chad repairs his own engine on a separate copy and asks before merging. You stay in control of your tools.

Plain English: shows you the fix before it goes live. You say yes.

Confidence, calibrated.

"87% on web apps, 34% on Tauri." If he doesn't know the domain, he tells you before you spend an hour.

Plain English: tells you upfront how good he is at your kind of project.
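
One plausible way to arrive at numbers like these is a pass rate per domain over past runs. The sketch below assumes that approach; the text does not say how Chad computes confidence, and `confidence_by_domain` and the sample history are hypothetical.

```python
from collections import defaultdict


def confidence_by_domain(history: list[tuple[str, bool]]) -> dict[str, float]:
    """Map each domain to the fraction of its past runs that passed verification."""
    passed: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for domain, ok in history:
        total[domain] += 1
        passed[domain] += int(ok)
    return {domain: passed[domain] / total[domain] for domain in total}


if __name__ == "__main__":
    # Made-up history, for illustration only.
    history = [("web", True), ("web", True), ("web", False), ("tauri", False)]
    for domain, rate in confidence_by_domain(history).items():
        print(f"{domain}: {rate:.0%}")
```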

05 · How it works

Three stages. You approve twice. Chad does the rest.

No magic, and nothing decided behind your back. Two checkpoints to align before any code is written, then a verifier loop that won't shut up until each step passes or the failure is surfaced. Then it writes the receipt.

STAGE 01

Discovery

Clarifying questions, design summary, you approve.

STAGE 02

Plan

Decomposed tickets, calibrated confidence, you approve.

STAGE 03

Build, verify, write the receipt

Chad writes, Chad checks, Chad retries — until pass or surfaced failure. Then a plain-English receipt of what he did.
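
The control flow is simple enough to sketch. In the Python below, `run_pipeline`, `approve`, and `build` are hypothetical stand-ins, not Chad's interface. What it shows: nothing gets built until both approvals come back yes.

```python
from typing import Callable


def run_pipeline(design_summary: str,
                 plan: list[str],
                 approve: Callable[[str], bool],
                 build: Callable[[str], str]) -> list[str]:
    """Two approval gates, then one build call per ticket in the plan."""
    # Stage 01: the design summary must be approved before planning proceeds.
    if not approve(f"Design summary: {design_summary}"):
        return ["stopped at discovery"]
    # Stage 02: the plan must be approved before any code is written.
    if not approve("Plan: " + "; ".join(plan)):
        return ["stopped at plan"]
    # Stage 03: build each ticket; `build` is expected to verify and retry itself.
    return [f"{ticket}: {build(ticket)}" for ticket in plan]


if __name__ == "__main__":
    # Auto-approve both gates and pretend every ticket verifies.
    print(run_pipeline("one-page drawpad", ["canvas", "brush"],
                       approve=lambda message: True,
                       build=lambda ticket: "verified"))
```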

06 · Without a verifier

What "done" looks like without a verifier.

Codestral 22B was asked to build a drawpad. It wrote a .js file. Line 1 of the .js file is a markdown fence.

drawpad.js · bare codestral:22b SHIPPED · UNVERIFIED

 1  ```javascript
 2  // Drawpad — generated by codestral:22b (bare)
 3  const canvas = document.getElementById('pad');
 4  const ctx = canvas.getContext('2d');
 5  canvas.addEventListener('mousedown', e => {
 6    ctx.beginPath();
 7    ctx.moveTo(e.offsetX, e.offsetY);
 8  });
 9  // (lines 11–24 elided — none of it works either)
10  ```

Codestral 22B, asked to build a drawpad. It said it was done. The .js file starts with a markdown fence on line one. The verifier would have caught this. There was no verifier.
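
The kind of check that catches this is small. A sketch in Python, assuming a hypothetical helper named `starts_with_markdown_fence`; this is not Chad's verifier, only an illustration of looking at the file before anyone calls it done.

```python
FENCE = "`" * 3  # three backticks, the markdown code-fence marker


def starts_with_markdown_fence(source: str) -> bool:
    """Return True if the first non-blank line of a source file is a markdown fence."""
    for line in source.splitlines():
        if line.strip():
            return line.lstrip().startswith(FENCE)
    return False  # empty file: no fence here, though a real check should flag it too


if __name__ == "__main__":
    bad = FENCE + "javascript\nconst canvas = document.getElementById('pad');\n" + FENCE + "\n"
    good = "const canvas = document.getElementById('pad');\n"
    print(starts_with_markdown_fence(bad))   # True
    print(starts_with_markdown_fence(good))  # False
```

A fuller check would also try to parse the file; this one only catches the exact failure shown above.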

07 · Where Chad doesn't help much

The honest part of the story.

On simple, single-file apps (a Snake game, a basic calculator, a one-page form), modern models do fine on their own. We ran the same Snake spec against eight models, with and without Chad. Both sides nailed it.

Chad's value shows up at scale. The harder the recipe and the more steps it has, the bigger the gap. That's what we built him for.

08 · Install

Three ways. All local. Pick one.

Chad runs on your machine. Uses the model you already have. Talks to your editor over localhost:8766.

READY TO INSTALL? — YOU'LL NEED A TERMINAL.

MACOS · LINUX

$ curl -fsSL https://usechad.dev/install.sh | sh

PYTHON · PIP

$ pip install chad-engine

DOCKER

$ docker run -p 8766:8766 \
    ghcr.io/usechad/chad:latest
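
Once one of these has run, a quick way to confirm something is listening on the port is a plain TCP connect. This Python sketch relies only on the port number given above (8766); it does not call any Chad endpoint, and `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8766 is the port named in the text; adjust if you mapped a different one.
    print("listening" if port_is_open("127.0.0.1", 8766) else "nothing on 8766")
```

An open port only says a process is listening there. It does not prove that process is Chad.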

Chad runs locally. Your code never leaves your machine. No telemetry. No phone-home. Bring your own model.