Working with OpenSpec: the two-thirds roast

Tags: spec-driven-development, harness-engineering, documentation, ai

Published 5 June 2026 · 8 min read

Image: a Sunday pork roast with one end neatly sliced off, sitting beside a small vintage oven

My dad has a story he has been telling at the dinner table for as long as I can remember. As a kid he used to visit a distant cousin, and on Sundays the family did a pork roast. Every time, before it went in, the mother of the house would take a knife and cut a good third off one end of the roast, set that piece aside, and cook the two pieces as if it were the most normal thing in the world. They always ended up with two roasts: one big, one small.

One day my dad asked the aunt why she did it. She thought about it, shrugged, and said the only honest thing she had: "We have always done it like that. My mother did it that way too." That was the whole answer. The knife came out, the end came off, every single Sunday, because that was simply how a roast was done.

A few weeks later she rang my dad up, laughing. She had finally asked her own mother why they cut the end off the roast. And the reason, it turned out, had nothing to do with the meat at all. When her mother was little, the family had a tiny oven, and a full roast would not fit in it. So she cut a third off to make it squeeze in. That was it. The oven had been gone for forty years, the kitchens had changed three times over, and the family had kept dutifully sacrificing a third of every roast to an oven that no longer existed.

A lot of how I work with AI on a codebase is an effort to never ship code that conforms to constraints that no longer exist. When a project loses the reason behind the things it does, or hangs on to an outdated idea, it keeps doing those things anyway. The code is the knife coming down on the roast. The reason - the tiny oven - is the part that goes missing first. OpenSpec is how I keep the reason written down next to the work, so the next person to open the kitchen is not cooking two-thirds roasts for no reason. This is a short, practical record of that workflow.

The spec is the source of truth

The core idea of OpenSpec is that the spec, not the code, is the source of truth. A unit of work - a change - lives in its own folder under openspec/changes/<change-id>/ as a small set of artifacts: a proposal, a design, a task list, and one or more delta specs describing exactly what is going to change. You write down what reality should look like, and only then do you go and make the code match.

When the work is done and actually verified, those deltas get merged into the living specs under openspec/specs/ and the change is archived. The arrangement is deliberately two-tiered: a temporary folder for the change in flight, and a permanent record for the system as it stands.

openspec/
├── specs/      # the living source of truth - the system as it is now
└── changes/
    └── <change-id>/
        ├── proposal.md   # why this change exists, what it touches
        ├── design.md     # decisions and trade-offs
        ├── tasks.md      # the implementation checklist
        └── specs/        # delta specs: what this change adds or alters
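
That layout is easy to check mechanically. Here is a minimal Python sketch that assumes only the file and folder names shown in the tree. The helper missing_artifacts is hypothetical and not part of OpenSpec; it just reports which planning artifacts a change folder still lacks.

```python
from pathlib import Path

# Artifact names come from the layout above; the helper itself is hypothetical.
REQUIRED_FILES = ("proposal.md", "design.md", "tasks.md")
REQUIRED_DIRS = ("specs",)


def missing_artifacts(change_dir: Path) -> list[str]:
    """Return the planning artifacts a change folder is still missing."""
    missing = [name for name in REQUIRED_FILES if not (change_dir / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (change_dir / name).is_dir()]
    return missing
```

Calling missing_artifacts(Path("openspec/changes/add-login")), where add-login is a made-up change id, returns an empty list once the proposal, design, task list, and delta spec folder are all in place.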

I drive all of this with a mix of slash commands and the CLI. The slash commands - /opsx:explore, /opsx:propose, /opsx:apply, /opsx:sync, /opsx:archive - do the heavy lifting. The openspec CLI (list, show, validate, archive) is for browsing, checking, and finalising. The whole point is simple to say and harder to live by: the spec should describe reality before the code does, including when I find problems halfway through testing.

The flow, end to end

An idea does not go straight to code. It travels a loop, and most of the loop is spent on the spec rather than the implementation. Here is the path from a half-formed thought to an archived change.

Notice where the arrows loop back. They never loop back into nothing - every correction goes through the spec first. That is the single habit the whole workflow is built to enforce.
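
One way to read the loop is as a small transition table. The sketch below is my own reconstruction from the stages described in this post, not an OpenSpec API, and the stage names and the is_valid_path helper are illustrative assumptions. The point it encodes is that a failed manual test can only lead to amending the spec, never straight back to the code.

```python
# Stage names follow the post; the table itself is an illustrative assumption.
TRANSITIONS = {
    "explore": {"propose"},
    "propose": {"review_spec"},
    "review_spec": {"explore", "propose", "apply"},  # iterate until the spec holds up
    "apply": {"manual_test"},
    "manual_test": {"amend_spec", "sync"},  # a gap goes to the spec, not to the code
    "amend_spec": {"apply"},
    "sync": {"archive"},
    "archive": set(),
}


def is_valid_path(stages: list[str]) -> bool:
    """True if every consecutive pair of stages is an allowed transition."""
    return all(b in TRANSITIONS.get(a, set()) for a, b in zip(stages, stages[1:]))
```

A path that goes from manual_test directly to apply is rejected, while the same path with amend_spec in between is accepted.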

Walking the loop, stage by stage

Each stage has a job, and a couple of them are where I deliberately spend the most time.

Explore, always first. Every idea starts with /opsx:explore, even the ones that feel obvious. I use it to investigate the problem space and pin down requirements. I do not keep exploration as a throwaway chat or a side document - the thinking flows directly into the proposal, so nothing useful is lost.
Propose. /opsx:propose turns that conversation into a change folder and its planning artifacts in one step: the proposal, the design, the tasks, and the delta specs describing what changes.
Read the spec, and iterate heavily. This is where the real time goes. I read the generated spec critically and then run further explore or propose rounds to add detail to it, usually several times over, before a line of code exists. If the spec is fuzzy, I would much rather fix it here than discover the gap mid-implementation. I lean on openspec show and openspec validate to sanity-check the structure while iterating.
Apply. Once the spec holds up under that scrutiny, /opsx:apply implements the tasks one box at a time.
Manual test. I do not trust a change until I have run the app and exercised the actual feature by hand, alongside the automated suite and any new tests the change brought with it. Green CI is necessary, not sufficient.
Append to spec, then fix. When testing surfaces a gap or the wrong behaviour, I work spec-first: the correction or the new requirement goes into the spec, and only then do I implement the fix. That keeps the spec authoritative instead of letting the code quietly drift ahead of it. Then back through apply and manual test as many times as it takes.
Sync and archive. A change is ready when manual and automated tests pass and the spec reflects the final behaviour. Then /opsx:sync merges the delta specs into the main specs, and /opsx:archive (or openspec archive) folds the deltas into the living specs and moves the change to history.

Iterate on the spec, not just the code. The spec leads - exploration feeds it, testing corrects it, and archiving only happens once the code and the spec finally agree.
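
Part of that gate can be checked mechanically. The sketch below assumes the boxes in tasks.md are written as Markdown checkboxes ("- [ ]" and "- [x]"), which is an assumption about the file format rather than something this post specifies. The open_tasks helper is hypothetical. It only tells me whether boxes are still open, and it does not replace running the app by hand.

```python
import re

# Assumption: tasks are written as Markdown checkboxes ("- [ ]" / "- [x]").
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def open_tasks(tasks_markdown: str) -> list[str]:
    """Return the text of every task whose box is still unchecked."""
    remaining = []
    for line in tasks_markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            remaining.append(match.group(2).strip())
    return remaining
```

If open_tasks returns anything for a change's tasks.md, the change is not ready to sync or archive, whatever CI says.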

The main spec captures intent

The main spec is everything under openspec/specs/, organised by capability - auth-login/spec.md, auth-session/spec.md, and so on. Unlike a change folder, which is temporary and describes a single delta, the main spec is permanent and describes the system as it is right now. Every archived change merges its deltas into it, so over time it grows into a complete, version-controlled picture of every capability the system has and the requirements behind them. It is the document I read to answer "what does this thing do, and why?"

That second half - the why - is the most valuable thing the main spec preserves, and it is exactly the part the roast family lost. When I explore and propose, I am not only describing what to build. I am recording why it should exist, what problem it solves, and which trade-offs I accepted. Syncing and archiving fold that reasoning into the main spec, so the why outlives the change folder and becomes permanent context.

Code tells you what the system does. Only the captured intent tells you why, and the why is the lens that makes maintaining the spec worth the effort:

Troubleshoot against intent. When something misbehaves, I compare the code's actual behaviour to the behaviour the spec says was intended, instead of guessing what it was supposed to do.
Recover the why of a feature. Months later, I (or a teammate, or an agent) can read the reasoning behind a decision rather than reverse-engineering it from the code, which only ever shows the how.
Find similar features and prior intent. Before building something new, I scan the specs for existing capabilities and the reasons behind them, so I can reuse, extend, or stay consistent rather than quietly contradict a past decision.
Onboard people and agents with context. The main spec is the fastest accurate briefing on why the system is the way it is. An agent grounded in intent makes far better changes than one inferring it from code.
Review intent, not just diffs. A change gets judged against the spec's stated purpose, so I am checking whether the behaviour is right for the reason it exists, not only whether the code compiles.
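
Scanning the specs for prior intent does not need special tooling. Here is a minimal sketch, assuming the capability-per-folder layout described above (openspec/specs/<capability>/spec.md). The find_in_specs helper is hypothetical and does a plain case-insensitive substring search, nothing smarter.

```python
from pathlib import Path


def find_in_specs(specs_root: Path, keyword: str) -> list[tuple[str, int, str]]:
    """Return (capability, line number, line) for each line mentioning keyword."""
    needle = keyword.lower()
    hits = []
    for spec in sorted(specs_root.glob("*/spec.md")):
        lines = spec.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((spec.parent.name, number, line.strip()))
    return hits
```

Before proposing a new retry policy, for example, find_in_specs(Path("openspec/specs"), "retry") shows every capability that already says something about retries, along with the line to go and read.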

Anything you can surface in a spec - behaviour, constraints, edge cases, and above all the intent behind them - is something you no longer have to rediscover from the code later. A neglected main spec quietly loses that intent, which is precisely why "tested and the spec matches" is my gate before anything gets archived.

Don't cut the end off the roast

The family kept cutting a third off every roast for forty years because the reason left the kitchen but the habit stayed. Codebases do the same thing constantly: a flag set for a load problem that was solved two rewrites ago, a retry loop guarding an API that no longer exists, a workaround nobody dares delete because nobody remembers what it was for. The code is still cutting the end off, and the oven is long gone.

Working spec-first is how I refuse to inherit that. Explore so the reason is understood before anything is built. Propose so the reason is written down. Read and iterate so the reason actually holds up. Test by hand, correct the spec before the code, and only archive once the two agree. The reward is not really speed - it is that six months from now, when someone asks why a thing is the way it is, the answer is sitting in the spec in plain language instead of in someone's fading memory.

So the next time you reach for the knife on your own codebase, the only question worth asking is the one my dad's aunt eventually asked her mother.

Do you still have the tiny oven, or are you just cutting the end off the roast?