The Oracle Shift
AI doesn't think like you. Stop forcing human patterns onto it.
Learn how it actually works, then build your workflow around that.
The Problem
Story-driven AI fails. Here's why.
How most people use AI: ask for a fix, get back a confident story, move on.
RESULT: "I fixed it" = unverified claim. Sounds right, might be wrong.
Why this breaks:
- AI confidence ≠ correctness — models sound certain even when wrong
- Narratives don't scale — you can't review every claim by reading
- No feedback loop — if it's wrong, you find out in production
- Human bottleneck — you become the verification layer
"I implemented the feature" tells you nothing.
"Here's the test that proves it works" tells you everything.
The Insight
Replace narration with verification. Roles with constraints.
The old model: divide labor by role
The AI-native model: divide verification by oracle
Oracles, not org charts.
Front-end, back-end, QA—these are human throughput partitions. AI doesn't need them. It needs constraints it can check against.
Story-driven (the old model):
- Goal: write code that sounds right
- Trust: model confidence + explanation
- Success: "I implemented it"
- Strategy: big plan → big change
- QA: a phase at the end
- Uncertainty: hidden in confident tone

Proof-driven (the AI-native model):
- Goal: produce verifiable change
- Trust: compilers, tests, benchmarks
- Success: "Here's the passing test"
- Strategy: small diff → fast falsifier
- QA: always-on judge in the loop
- Uncertainty: explicit + falsifiable
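One way to make "trust: compilers" concrete is to put the contract where the type checker can see it. A minimal TypeScript sketch, with a hypothetical Cents brand and parseAmount function; a change that breaks the contract now fails npm run typecheck instead of surviving on a confident explanation:

// All names here are illustrative. The brand makes "this number is cents"
// a claim the compiler checks rather than a comment it ignores.
type Cents = number & { readonly __brand: "Cents" };

function toCents(dollars: number): Cents {
  return Math.round(dollars * 100) as Cents;
}

function parseAmount(input: string): Cents {
  // Returning a plain number here would be a type error:
  // the only sanctioned way to produce Cents is toCents.
  return toCents(Number.parseFloat(input));
}

console.log(parseAmount("19.99")); // 1999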
The Framework
How to operationalize this insight.
3A The Loop
Every change follows the same cycle: Propose → Patch → Prove → Pack. No exceptions.
3B Control Theory Lens
Think: sensors, controller, actuators. Most AI setups fail because they have actuators but weak sensors.
Sensors (truth sources):
- Tests, type checks, linters
- CI results, telemetry

Controller (the loop):
- Propose → Patch → Prove → Pack
- Stop at first failure

Actuators (changes):
- Code edits, configs
- Rollbacks, deploys
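A minimal sketch of that controller in TypeScript, assuming the sensors are npm scripts (the script names are assumptions); it runs the fastest falsifiers first and stops at the first failure, the same shape the verify script later in this lesson implements in bash:

// Controller: poll the sensors in order, stop the loop at the first failure.
// Swap in whatever truth sources the repo actually has.
import { execSync } from "node:child_process";

const sensors = [
  "npm run format:check",
  "npm run lint",
  "npm run typecheck",
  "npm run test",
];

for (const command of sensors) {
  console.log(`== ${command} ==`);
  try {
    execSync(command, { stdio: "inherit" });
  } catch {
    console.error(`FAIL: ${command}. Fix this before proposing more changes.`);
    process.exit(1);
  }
}

console.log("== verify OK ==");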
3C The Funnel
LLM proposes many options. Oracles filter down to what actually works.
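A minimal sketch of the funnel, using hypothetical candidate implementations of a slugify helper and a tiny executable oracle; the shape is what matters: several proposals go in, only the ones that survive the checks come out:

// The LLM proposes candidates; the oracle, not the narrative, decides which survive.
// All names and test cases here are illustrative.
type Candidate = { label: string; slugify: (title: string) => string };

const candidates: Candidate[] = [
  { label: "A", slugify: (t) => t.toLowerCase() },
  { label: "B", slugify: (t) => t.toLowerCase().replace(/\s+/g, "-") },
  { label: "C", slugify: (t) => t.trim().toLowerCase().replace(/\s+/g, "-") },
];

// The oracle is executable: input/expected pairs the repo owns.
const oracle: Array<[input: string, expected: string]> = [
  ["Hello World", "hello-world"],
  ["  Oracle Shift  ", "oracle-shift"],
];

const survivors = candidates.filter(({ slugify }) =>
  oracle.every(([input, expected]) => slugify(input) === expected),
);

console.log(survivors.map((c) => c.label)); // [ 'C' ] in this sketch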
Apply It
Copy these three artifacts into your project: a system prompt, an AGENTS.md, and a verify script. Start proof-driven today.
You are an engineering agent in a proof-driven repo.
Prime directive: If it doesn't run, it doesn't exist.
Never claim fixed/works/done without executable evidence.
Operate in a closed loop:
1) Propose: Goal + non-goals + proof plan (exact commands)
2) Patch: small diffs only; no rewrites unless required
3) Prove: run fastest falsifiers first (format → lint → typecheck → tests)
4) Pack: add regression tests for fixes; update docs
Output format every time:
- Plan (Goal, Non-goals, Proof plan)
- Changes (files + why)
- Proof (commands + pass/fail)
- Risks & mitigations
# AGENTS.md — Proof-Driven Agent Operating Rules
> If it doesn't run, it doesn't exist.
Non-negotiables:
1) No proof, no claim
2) Regression test mandate for bug fixes
3) Don't disable truth gates
4) Interfaces are physics
5) Externalize uncertainty
Loop: Propose → Patch → Prove → Pack
Proof ladder: format → lint → typecheck → unit tests → integration
Truth command: npm run verify
#!/usr/bin/env bash
# Proof ladder, fastest falsifiers first; set -e stops the run at the first failure.
# Wire this up as the repo's single truth command (e.g. npm run verify).
set -euo pipefail
echo "== format ==" && npm run format:check
echo "== lint ==" && npm run lint
echo "== typecheck ==" && npm run typecheck
echo "== test ==" && npm run test
echo "== verify OK =="
Lesson Summary
The problem: Story-driven AI fails because narratives don't scale and confidence ≠ correctness.
The insight: Replace roles with oracles. Verification beats narration.
The framework: Propose → Patch → Prove → Pack. Every claim ends in executable evidence.
The phrase: If it doesn't run, it doesn't exist.