Filed: 25 May 2026
Reading: 8 minutes
Tags: Spec-Driven Development · Agentic Engineering · AI Workflow Automation

Spec-Driven Development: Let the AI Obsess Over the Code

Spec-driven development is the next way to build with AI: obsess over a detailed spec, then let the agent build against it. How it works, and what it costs.

I Wrote 31 Specs Before AI Built a Single Screen

I wrote thirty-one specifications for HabitCycles before I let AI build a single screen.

Thirty-one documents. Problem statements, user stories, acceptance criteria written out in plain English, all version-controlled, all sitting in a specs/ folder that mirrors the codebase. It took an unreasonable amount of time, and for most of it I had nothing to show. No app. No screens. Just documents.

It felt mad. Then I handed the whole thing to an AI agent and watched it build, and I stopped thinking it was mad.

This is how I build with AI now. I want to walk you through what it is, why it works, and where it quietly costs you, because I think it’s the next serious way to work with these tools, and almost nobody is doing it yet.

The Problem With Prompt-and-Magic

The default way to build with AI is intoxicating. You type a sentence into a prompt and something appears. A screen, a function, a whole feature. It feels like magic, and honestly, it’s fun.

It also falls apart the moment the thing you’re building gets complicated.

When you’re prompting and accepting and re-prompting, you’re in the loop. You’re the bottleneck. Every decision routes through you, in the moment, with whatever context happens to be in the chat window. The AI doesn’t know how this feature connects to the other six. It doesn’t know the rule you established three sessions ago. It’s guessing, confidently, and you’re correcting, constantly. (I pulled apart that “in the loop versus on the loop” distinction after a meetup talk that rewired how I think about it.)

I’d already learned a version of this lesson with my App Store work: a chat conversation is stateless, and the fix was to give AI a persistent, structured knowledge base instead of re-explaining everything every time. Spec-driven development is the same lesson, pointed at how you actually build the product.

What Spec-Driven Development Actually Is

A spec, the way I write it, is two documents fused into one.

The top half is a product document: the problem, the goals, who it’s for, the user stories. The same things any decent PM writes before a feature gets built. The bottom half is a technical delivery document: the architecture, the data model, the acceptance criteria, all written out in human words. It doesn’t say “implement decisionToStatus”; it says “when someone reviews a cycle and chooses to replace the habit, the cycle closes as replaced and a fresh cycle opens under the new one.” Plain English that says exactly what should happen, so there’s something concrete to build and verify against.
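
To make that plain-English criterion concrete, here is a minimal Python sketch of how it could translate into behaviour and a check. This is not the HabitCycles implementation: `Decision`, `Cycle`, `review_cycle` and the status strings are hypothetical names chosen for illustration, and the `KEEP` branch is a placeholder assumption, since the criterion above only describes replacing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Decision(Enum):
    KEEP = "keep"        # assumption: only "replace" is described in the spec text
    REPLACE = "replace"


@dataclass
class Cycle:
    habit: str
    status: str = "open"


def review_cycle(cycle: Cycle, decision: Decision,
                 new_habit: Optional[str] = None) -> List[Cycle]:
    """Apply a review decision; return the reviewed cycle plus any cycle it opens."""
    if decision is Decision.REPLACE:
        if not new_habit:
            raise ValueError("replacing a habit requires the new habit")
        cycle.status = "replaced"                   # the cycle closes as replaced
        return [cycle, Cycle(habit=new_habit)]      # a fresh cycle opens under the new habit
    return [cycle]                                  # placeholder: leave the cycle untouched


old, fresh = review_cycle(Cycle(habit="run"), Decision.REPLACE, new_habit="swim")
print(old.status, fresh.habit, fresh.status)  # replaced swim open
```

The sentence in the spec and the final check say the same thing, which is what gives an agent something concrete to verify against.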

Here’s the part that trips people up: I don’t write these from scratch. I write them with AI. I’ll draft the problem, the AI proposes structure and edge cases, I push back, it revises. I obsess over the document until I genuinely understand how the feature works and I’m confident it’s right. The obsession is the point. By the time the spec is done, the hard thinking is done.

And I don’t read each spec in isolation. I read them as a system: how the cycle spec connects to the habit spec, how the data model feeds the screens. Then I check the whole thing hangs together before any code exists. In HabitCycles the specs live in folders that mirror the codebase exactly, so finding the spec for a piece of code is just walking the same path. Thirty-one of them, across the domain logic, the marketing site and the mobile app.
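
"Walking the same path" can be sketched in a few lines of Python. The layout here is an assumption for illustration: a `src/` code root, a `specs/` root, one Markdown spec per source file, and the example path are all hypothetical, not the actual HabitCycles structure.

```python
from pathlib import PurePosixPath


def spec_path_for(code_path: str, code_root: str = "src",
                  spec_root: str = "specs") -> PurePosixPath:
    """Map a source file to the spec that sits at the mirrored location."""
    relative = PurePosixPath(code_path).relative_to(code_root)
    return PurePosixPath(spec_root) / relative.with_suffix(".md")


print(spec_path_for("src/domain/cycles/review.ts"))
# specs/domain/cycles/review.md
```

Because the mapping is mechanical, a person or an agent can find the governing spec without searching.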

Then, and only then, I let the AI build.

Let the Agent Run

This is where it gets close to a Ralph Wiggum loop: you point an agent at the work and let it grind away on repeat. The difference is what it’s grinding against. It isn’t improvising from a one-line prompt. It has a precise, detailed specification it can check its own work against, acceptance criterion by acceptance criterion.
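
The shape of that loop, reduced to a toy in Python: build, check every acceptance criterion, repeat until nothing fails or a retry budget runs out. Everything here is a stand-in. `run_until_spec_passes`, `toy_agent` and the criterion names are hypothetical, and a real agent would be editing code and running tests rather than flipping flags in a dictionary.

```python
from typing import Callable, Dict, List, Tuple

Criterion = Callable[[dict], bool]


def run_until_spec_passes(build_step: Callable[[dict, List[str]], dict],
                          criteria: Dict[str, Criterion],
                          max_iterations: int = 5) -> Tuple[int, dict]:
    """Loop: check every criterion, hand the failures to the builder, repeat."""
    state: dict = {}
    failing: List[str] = []
    for attempt in range(max_iterations + 1):
        failing = [name for name, check in criteria.items() if not check(state)]
        if not failing:
            return attempt, state          # number of build steps it took
        if attempt < max_iterations:
            state = build_step(state, failing)
    raise RuntimeError(f"still failing after {max_iterations} attempts: {failing}")


def toy_agent(state: dict, failing: List[str]) -> dict:
    """Stand-in for the agent: 'fixes' the first failing criterion each pass."""
    fixed = dict(state)
    fixed[failing[0]] = True
    return fixed


criteria = {
    "cycle_closes_as_replaced": lambda s: s.get("cycle_closes_as_replaced", False),
    "fresh_cycle_opens": lambda s: s.get("fresh_cycle_opens", False),
}
attempts, final_state = run_until_spec_passes(toy_agent, criteria)
print(attempts)  # 2
```

The retry budget is a design choice in this sketch: the loop stops and reports what still fails instead of grinding forever.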

FIG.A · THE SPEC-DRIVEN LOOP
A left-to-right flow in three steps. First, draft with AI: you and the AI iterate on the spec until it's right. Second, the spec itself: product and technical sections, connected and verified into one system. Third, the agent builds, verifying its work against the spec. A bracket under the first two steps reads "you obsess here, the spec"; a bracket under the third reads "the AI obsesses here, the code".

The agent was exceptional at it. Because the context was complete (the requirements, the connections, the constraints, all written down), it didn’t have to guess. It built against the spec, verified against the spec, and the output was the most reliable I’ve had from AI on anything complex.

That’s the trade I keep coming back to: obsess over the spec as if it were the code, and the AI will obsess over the code as much as you obsessed over the spec.

The Changelog of Why

There’s a second payoff I didn’t expect to value as much as I do.

When something changes, I change the spec first, and that edit ships in its own commit, before any implementation. So the version history of the spec becomes a record of how my thinking evolved: what the feature used to do, what it does now, and why it changed. Product changes bump a version number. Smaller technical iterations get logged as short decision records sitting right next to the spec.
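
The "spec change ships in its own commit" rule is easy to check mechanically. A minimal Python sketch, assuming specs live under a `specs/` prefix; `classify_commit` and the example paths are hypothetical.

```python
from typing import List


def classify_commit(changed_paths: List[str], spec_root: str = "specs/") -> str:
    """Label a commit by what it touches: only specs, only code, or both."""
    if not changed_paths:
        return "empty"
    touches_spec = any(path.startswith(spec_root) for path in changed_paths)
    touches_code = any(not path.startswith(spec_root) for path in changed_paths)
    if touches_spec and touches_code:
        return "mixed"   # breaks the rule: spec and implementation in one commit
    return "spec" if touches_spec else "code"


print(classify_commit(["specs/domain/cycles/review.md"]))                 # spec
print(classify_commit(["specs/domain/cycles/review.md", "src/review.ts"]))  # mixed
```

One way to use it would be from a pre-commit hook fed by `git diff --cached --name-only`, rejecting anything classified as `mixed`.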

When I come back to a feature in two months, I’m not reverse-engineering my own intentions from the code. I read the spec and its history. And so does the AI, which matters more than it sounds.

Why the AI Actually Builds Better

Give an AI a vague prompt and a large codebase, and a huge amount of its effort goes into archaeology. It has to read through dozens of files to work out how things connect before it can safely change anything. Miss a connection and you get a subtle bug.

The spec removes the archaeology. The connections are already written down, in one place, in human language. When I ask for a change, the agent doesn’t have to grep its way across thirty files to understand the blast radius. The spec already says what touches what. It scopes the work accurately, plans it without holes, and gets on with it. The spec is context engineering made concrete.
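
If each spec declares what depends on it, the blast radius of a change becomes a graph walk instead of a search through files. A small Python sketch under that assumption; the `touches` map, the spec names and `blast_radius` are invented for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(touches: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return every spec reachable from `changed` via 'what touches what' links."""
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in touches.get(current, []):
            if dependent != changed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical links: spec -> specs that depend on it
touches = {
    "data-model": ["cycle", "habit"],
    "cycle": ["review-screen"],
    "habit": ["review-screen", "habit-list-screen"],
}
print(sorted(blast_radius(touches, "data-model")))
# ['cycle', 'habit', 'habit-list-screen', 'review-screen']
```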

The Gotcha Nobody Warns You About

Here’s the one that bit me, and the fix I’d hand to anyone trying this.

AI wants to write into your spec, and not just the product thinking. It adds its own technical notes, implementation scratch, the running commentary it generates as it works. Left unchecked, it quietly turns your clean, human-readable document into a mixed log of product intent and machine notes. The thing that was meant to be read by a person becomes a slog.

The fix is to decide ownership up front and bake it into the document. I split every spec with a hard rule. Above the line sits the product section, which I own, written for people. Below the line sits the technical section, where the AI keeps its own notes and iterates over time. The frontmatter even names who owns each half. Draw that line from day one and the human document stays human.

FIG.B · ANATOMY OF A SPEC · TWO OWNERS, ONE FILE
A single spec file split into two owned sections by a hard dividing rule. A frontmatter strip at the top names the owners and version: productOwner: jamie, technicalOwner: claude, version: 3. Above the line sits the product spec, which Jamie owns, written for people: the problem, goals and non-goals, user stories, and acceptance criteria in plain English. The dividing line is the rule you draw on day one. Below it sits the technical spec, which the AI owns and iterates on over time: architecture and data model, test strategy and technical criteria, and decision records (ADRs).
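
The ownership rule can also be enforced rather than just agreed. A rough Python sketch: parse the frontmatter, split the file at the dividing line, and reject any AI edit that changes what sits above it. The frontmatter keys come from the figure; the `<!-- technical-spec -->` divider, the sample spec text and the function names are assumptions made for this example.

```python
from typing import Dict, Tuple

DIVIDER = "<!-- technical-spec -->"  # hypothetical marker for "the line"

SPEC = (
    "---\n"
    "productOwner: jamie\n"
    "technicalOwner: claude\n"
    "version: 3\n"
    "---\n"
    "Product spec\n"
    "A reviewer can replace a habit at the end of a cycle.\n"
    + DIVIDER + "\n"
    "Technical spec\n"
    "Notes: none yet\n"
)


def parse_spec(text: str) -> Tuple[Dict[str, str], str, str]:
    """Return (frontmatter, product section, technical section)."""
    _, frontmatter, body = text.split("---\n", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, value = line.split(":", 1)
        meta[key.strip()] = value.strip()
    product, technical = body.split(DIVIDER, 1)
    return meta, product.strip(), technical.strip()


def ai_edit_allowed(before: str, after: str) -> bool:
    """An AI edit passes only if frontmatter and product section are unchanged."""
    meta_before, product_before, _ = parse_spec(before)
    meta_after, product_after, _ = parse_spec(after)
    return meta_before == meta_after and product_before == product_after


below = SPEC.replace("Notes: none yet", "Notes: draft data model")
above = SPEC.replace("replace a habit", "delete a habit")
print(ai_edit_allowed(SPEC, below), ai_edit_allowed(SPEC, above))  # True False
```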

It’s a workaround. Maybe the tools handle this better one day. For now, that line is the difference between a spec a teammate will actually read and one nobody wants to open.

The Honest Cost

I won’t pretend this is free.

It’s less magical. There’s no dopamine hit of typing a sentence and watching an app appear, because for the whole first stretch the thing you’re producing is a document, not a product. You’re thinking, connecting, planning, and you don’t get to see it run yet. The prompt-and-magic loop is genuinely more fun. Some days I miss it.

It also feels slower, right up until it isn’t. You spend real time at the painful planning stage with nothing visible to show for it. But once the spec is solid, the AI runs flat out against it, and you ship more complex, more reliable systems faster than any amount of chat-window iterating would get you. The slowness is front-loaded. The speed is on the other side of it.

Who This Is For

If you’re a product manager building anything agentic, this might be the most useful habit you pick up this year. The spec is a human-readable source of truth. Another PM can read it. An engineer can understand what you’re building without reading the code. A new joiner can onboard from it, at whatever depth they need. You stop being the single point of context for your own project.

And if you’re an indie builder like me, it’s simply a more powerful way to build, even if it’s a little less fun. You trade some magic for reliability, and past a certain level of complexity that’s a trade worth making every time.

I think this is where serious AI development is heading. We spent the last couple of years learning to prompt better. The next move is to stop prompting and start specifying: to obsess over the document the way we used to obsess over the code, and let the AI take the part it’s genuinely better at.

I wrote thirty-one specs before I built a screen. I’d do it again tomorrow.

If you’re building this way, or you think I’ve got it wrong, I’d like to hear it. What would you spec first?


Key Takeaways

  • Spec-driven development means obsessing over a detailed specification first, then letting an AI agent build against it. It works reliably because the requirements and connections are all written down.
  • Write the spec as two fused documents: a product layer you own, and a technical delivery layer with acceptance criteria in plain English.
  • Version-control the spec and change it before the code. Its history becomes a changelog of why the product evolved.
  • Specs make AI build better by removing “archaeology”: the connections are written down, so the agent scopes changes without reading dozens of files.
  • Split every spec into a human-owned section and an AI-owned section, or the AI will quietly pollute your readable document with machine notes.
  • The cost is real: less magical, more upfront thinking, slower to first sight of anything running. The payoff is speed and reliability on complex systems, and a source of truth any teammate can read.

FAQ

Questions, answered.

What is spec-driven development with AI?

It's a way of working where you write a detailed specification for a feature before any code exists, then hand that spec to an AI agent to build against. The spec combines a product document (problem, goals, users, user stories) with a technical delivery document (architecture and acceptance criteria written in plain English). You obsess over the spec; the AI obsesses over the code.

Do you write the specs yourself or with AI?

With AI. I draft the problem and direction, the AI proposes structure, edge cases and technical detail, and I iterate until I'm confident the feature is right. I own the document and the product thinking; the AI accelerates the drafting and pressure-tests the edges.

How is a spec different from a traditional PRD?

A PRD usually stops at the product layer: the what and the why. A spec carries through to a technical delivery layer with acceptance criteria written in human words, so it's directly buildable and verifiable by an AI agent. It's also version-controlled and lives next to the code, so it stays a living source of truth rather than a document that goes stale after kickoff.

Isn't writing specs first just waterfall?

No. The spec isn't frozen up front. When requirements change you change the spec first, ship that change in its own commit, and let the version history record what changed and why. It's iterative. You iterate on the specification as the primary artefact, with the code following it, rather than discovering the design by editing code.
