AI-NATIVE CURRICULUM

The Oracle Shift

AI doesn't think like you. Stop forcing human patterns onto it.
Learn how AI works, then do that.

CORE INSIGHT

Verification replaces narration

YOU'LL LEARN

Why oracles beat org charts for AI work

TAKEAWAY

A framework you can paste into any project

PART 1

The Problem

Story-driven AI fails. Here's why.

How most people use AI

1. Describe what you want
2. AI generates a narrative
3. Trust the story

RESULT: "I fixed it" = unverified claim. Sounds right, might be wrong.

Why this breaks:

  • AI confidence ≠ correctness — models sound certain even when wrong
  • Narratives don't scale — you can't review every claim by reading
  • No feedback loop — if it's wrong, you find out in production
  • Human bottleneck — you become the verification layer
"I implemented the feature" tells you nothing.
"Here's the test that proves it works" tells you everything.
PART 2

The Insight

Replace narration with verification. Roles with constraints.

The old model: divide labor by role

The AI-native model: divide verification by oracle

THE PHRASE

Oracles, not org charts.

Front-end, back-end, QA: these are partitions of human throughput. AI doesn't need them. It needs constraints it can check against.

mindset.diff  +6 -6

STORY-DRIVEN
- 1. Goal: write code that sounds right
- 2. Trust: model confidence + explanation
- 3. Success: "I implemented it"
- 4. Strategy: big plan → big change
- 5. QA: a phase at the end
- 6. Uncertainty: hidden in confident tone

PROOF-DRIVEN
+ 1. Goal: produce verifiable change
+ 2. Trust: compilers, tests, benchmarks
+ 3. Success: "Here's the passing test"
+ 4. Strategy: small diff → fast falsifier
+ 5. QA: always-on judge in the loop
+ 6. Uncertainty: explicit + falsifiable

PART 3

The Framework

How to operationalize this insight.

3A The Loop

Every change follows this cycle. No exceptions.

Propose: goal, non-goals, proof plan
Patch: small diff, focused change
Prove: run the fastest falsifier first
Pack: regression test + docs

PROOF LADDER (stop at first failure):
format → lint → typecheck → unit tests → integration
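
The ladder is just fail-fast chaining. A minimal sketch, assuming npm script names like those in the Part 4 templates (test:unit and test:integration are hypothetical splits):

npm run format:check \
  && npm run lint \
  && npm run typecheck \
  && npm run test:unit \
  && npm run test:integration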

3B Control Theory Lens

Think: sensors, controller, actuators. Most AI setups fail because they have actuators but weak sensors; the sketch below makes the difference concrete.

Sensors (truth sources)
  • Tests, type checks, linters
  • CI results, telemetry

Controller (the loop)
  • Propose → Patch → Prove → Pack
  • Stop at first failure

Actuators (changes)
  • Code edits, configs
  • Rollbacks, deploys
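
A sensor must be able to say no; a check that cannot fail is not a sensor. A sketch, assuming the npm scripts from the Part 4 templates:

npm run test || true     # swallowed failure: an actuator with no feedback
echo "looks correct"     # narration, not measurement

npm run typecheck        # a real sensor: exits nonzero on violation
npm run test             # a real sensor: exits nonzero on failure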

3C The Funnel

LLM proposes many options. Oracles filter down to what actually works.

LLM PROPOSES: patch A, patch B, patch C

ORACLES JUDGE:
  patch A ✗ fails unit test
  patch B ✗ fails typecheck
  patch C ✓ passes all checks

SHIP: patch C
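
The funnel as a script. A sketch, assuming candidate diffs in patches/ and the npm run verify truth command defined in Part 4:

#!/usr/bin/env bash
set -euo pipefail

for patch in patches/*.diff; do       # LLM proposes
  git apply "$patch"
  if npm run verify; then             # oracles judge
    echo "SHIP: $patch passed every check"
    exit 0
  fi
  git apply -R "$patch"               # reject: roll the patch back
done
echo "no candidate survived the oracles" >&2
exit 1
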
PART 4

Apply It

Copy these into your project. Start proof-driven today.

Claude Project Instructions
You are an engineering agent in a proof-driven repo.

Prime directive: If it doesn't run, it doesn't exist.
Never claim fixed/works/done without executable evidence.

Operate in a closed loop:
1) Propose: Goal + non-goals + proof plan (exact commands)
2) Patch: small diffs only; no rewrites unless required
3) Prove: run fastest falsifiers first (format → lint → typecheck → tests)
4) Pack: add regression tests for fixes; update docs

Output format every time:
- Plan (Goal, Non-goals, Proof plan)
- Changes (files + why)
- Proof (commands + pass/fail)
- Risks & mitigations

AGENTS.md (drop in repo root)
# AGENTS.md — Proof-Driven Agent Operating Rules

> If it doesn't run, it doesn't exist.

Non-negotiables:
1) No proof, no claim
2) Regression test mandate for bug fixes
3) Don't disable truth gates
4) Interfaces are physics
5) Externalize uncertainty

Loop: Propose → Patch → Prove → Pack
Proof ladder: format → lint → typecheck → unit tests → integration
Truth command: npm run verify

verify.sh (template)
#!/usr/bin/env bash
# Proof ladder, cheapest check first. set -e stops at the first failure.
set -euo pipefail

echo "== format =="    && npm run format:check
echo "== lint =="      && npm run lint
echo "== typecheck ==" && npm run typecheck
echo "== test =="      && npm run test
echo "== verify OK =="

Lesson Summary

1. The problem: Story-driven AI fails because narratives don't scale and confidence ≠ correctness.

2. The insight: Replace roles with oracles. Verification beats narration.

3. The framework: Propose → Patch → Prove → Pack. Every claim ends in executable evidence.

4. The phrase: If it doesn't run, it doesn't exist.