OpenSpec: the missing LEGO manual


Published 9 May 2026 · 7 min read

A built LEGO Technic Expert Builder 853 chassis on a beige carpet next to its dog-eared instruction manual

When I was nine, school holidays at my grandmother's place meant pulling out the LEGO Technic tub - beams and axles and gears and little black pins of three different lengths. My brother and I would sit on the carpet, tip the tub out, and try to make a thing. A car. A helicopter. A clock. Whatever we had a vague picture of in our heads.

Most of those builds did not survive past lunch - the car had steering but mismatched wheels, the helicopter had a rotor that could spin but nowhere for the minifig to sit. The clock could tick but had used up all the cogs.

Then one summer my uncle walked in and opened a drawer in grandma's wall unit that I had never noticed, and there it was: a stack of dog-eared manuals, every Technic instruction manual she had kept since 1979. He pulled the F1 racer out, slid it across the carpet, and an hour later I had a steering rack that actually steered. Same tub. Same pieces. The manual was the difference.

Right now, ad-hoc prompting is exactly like building LEGO Technic from a picture in your mind. The pieces are all there, the AI is willing, and you have a vague idea of what you want. Every prompt is another go at the tub. Every reprompt is another sigh of "no, not like that". And the manual - the spec - is sitting in a drawer you have not opened yet.

The drawer of manuals is right there

Ad-hoc prompting is the loop you fall into when you skip the manual. You type "add a login form" and the AI gives you something close. You re-prompt. "No, use NextAuth." Re-prompt. "No, the callback should redirect to /dashboard, not /." Three rounds in, you have spent more on tokens than you would have spent writing a one-paragraph spec, and the AI still does not know that the dashboard is gated by an org check.

OpenSpec is the manual. You describe the change in plain language once, the AI proposes a design and a task list, you read it, push back where it is wrong, and only then does the AI write code.

The first time you use it, the part that surprises you is not the speed. It is how rarely you reprompt. The AI built what the spec said, and the spec said what you meant, because you understood the spec before you let it touch the code.

OpenSpec ships four commands you run, in order, on every change:

/opsx:explore: read the codebase, summarise what it found, ask the questions you have not thought to ask yet.
/opsx:propose: write the manual - a proposal, a design, a task list, and one spec file per capability.
/opsx:apply: read the specs and the tasks, write the code, tick each box off as it goes.
/opsx:archive: close the change out and roll its spec deltas into the long-lived specs/ folder.

OpenSpec's opsx: profile also exposes the same four commands under /openspec:*.

Setting up the workshop

OpenSpec is a global npm CLI plus a folder of conventions. You install it once, then run openspec init in whatever project you want it to know about. It requires Node 20.19 or newer; this walkthrough was tested with OpenSpec 1.3.1.

npm install -g @fission-ai/openspec@latest
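If you are not sure which Node you are on, a quick check before installing saves a confusing error later. The snippet below is a minimal sketch, not something OpenSpec ships: meetsMinimum is a hypothetical helper, and the only fact taken from above is the 20.19 minimum.

```javascript
// Minimal sketch: compare a Node version string against the 20.19 minimum.
// meetsMinimum is a hypothetical helper, not part of OpenSpec.
function meetsMinimum(version, minimum) {
  // "v20.19.1" -> [20, 19, 1]; anything unparseable counts as 0.
  const parse = (v) =>
    v.replace(/^v/, "").split(".").map((part) => Number.parseInt(part, 10) || 0);
  const a = parse(version);
  const b = parse(minimum);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const left = a[i] ?? 0;
    const right = b[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // identical versions satisfy the minimum
}

const MINIMUM_NODE = "20.19.0";

// Only warn when actually running under Node.
if (typeof process !== "undefined" && !meetsMinimum(process.versions.node, MINIMUM_NODE)) {
  console.warn(`Node ${process.versions.node} is older than ${MINIMUM_NODE} - upgrade before installing OpenSpec.`);
}
```

Save it as a scratch file and run it with node; silence means you are good to go.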

Now spin up a fresh project. We will build a tiny todo app called todone.

mkdir todone && cd todone
openspec init

From here, treat this as a copy-paste tutorial. In your AI chat, run the OpenSpec commands in order and paste the matching prompt for each step.

After init, the openspec/ folder appears. It is the manual drawer. Three things in it matter:

openspec/
├── specs/        # current shipped behaviour, one folder per capability
├── changes/      # active proposals, designs, tasks, spec deltas
└── config.yaml   # which AI tools and profiles you want wired in

specs/: what the system already does. Empty in a fresh init - it fills in as changes land and get archived.
changes/: active proposals. Each one is a folder with proposal.md, design.md, tasks.md, and one or more spec deltas under specs/. This is where the manual lives while you are building.
config.yaml: which AI tools (Claude Code, Cursor, Copilot, and friends) you want OpenSpec to wire its slash commands into.

A terminal recording of openspec init laying down the scaffold.
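The shape of a change folder is simple enough to check by eye, but a sketch makes the convention concrete. This assumes only what is described above: a change holds proposal.md, design.md, tasks.md, and at least one spec delta at specs/<capability>/spec.md. The helper name missingChangeParts is made up for illustration and is not an OpenSpec API.

```javascript
// Hypothetical helper: given the relative paths found inside one
// openspec/changes/<name>/ folder, report which expected pieces are missing.
const REQUIRED_FILES = ["proposal.md", "design.md", "tasks.md"];

function missingChangeParts(relativePaths) {
  const paths = new Set(relativePaths);
  const missing = REQUIRED_FILES.filter((file) => !paths.has(file));

  // Assumption: a spec delta lives at specs/<capability>/spec.md.
  const hasSpecDelta = relativePaths.some((p) => /^specs\/[^/]+\/spec\.md$/.test(p));
  if (!hasSpecDelta) missing.push("specs/<capability>/spec.md");

  return missing;
}

// A change that has not had its design written yet:
console.log(missingChangeParts(["proposal.md", "tasks.md", "specs/task-management/spec.md"]));
// -> [ 'design.md' ]
```

An empty array means the manual has all its pages.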

/opsx:explore

explore is the cheapest part of an OpenSpec change. You run it with a hand-wavy idea - "a todo app with completed and archived states" - and the AI does three things. It reads whatever already exists in the project. It writes back what it understood, often with a little ASCII diagram of how it thinks the pieces fit together. Then it asks the questions you have not thought to ask yet.

This is the moment where the spec gets its bones. If the AI's understanding is wrong, you correct it now, in chat, before anything has been written down. If a question reveals a decision you have not actually made yet ("do archived items expire after thirty days?"), you make it now, on the carpet, instead of three days later in a code review.

Prompt used in the walkthrough:

/opsx:explore we are building a todo app called "todone" we want to show the user a list of their tasks, and have a way to add more, and then also archive completed tasks. this will be using localstorage in the browser only. We will use plain HTML, CSS and vanilla JavaScript - no framework, no build step

What comes back is a summary of what the AI understood, a list of clarifying questions at the bottom, and a single closing line: "where do you want to push on first?" Do not answer that line. Instead, reply with one word:

AskUserQuestion

Claude picks it up, drops the wall-of-bullets format, and walks you through the same questions one at a time with multiple-choice answers, like a wizard. Much easier to think about each decision in isolation, and you can still type a free-form answer when the choices do not fit.

Note: AskUserQuestion is a Claude Code tool, not an OpenSpec command. In a different harness (Cursor, Copilot, Aider) the tool itself may not exist, so the one-word reply may not trigger the same behaviour. If it does not, just ask the AI to question you one at a time.

/opsx:explore walking the todone idea, with AskUserQuestion handling clarifying questions one at a time.

/opsx:propose

propose is the moment the conversation becomes a document. The AI takes the explore-mode chat, opens openspec/changes/<name>/, and writes four things: a proposal.md, a design.md, a tasks.md, and one specs/<capability>/spec.md per named capability. Open proposal.md and read through it. You will see four sections.

Prompt used in the walkthrough:

/opsx:propose

/opsx:propose writes four kinds of file. The proposal.md from our todone change is reproduced in full after this list.

proposal.md: the why, what changes, capabilities, and impact at a glance. The one-page pitch.
design.md: the technical decisions and trade-offs the AI considered. Alternatives, rejections, and the reasoning that survives long after the commit message is forgotten.
tasks.md: the implementation checklist that /opsx:apply works through one box at a time.
specs/<capability>/spec.md: one per named capability. The manual for that single behaviour.

## Why

We want a small, self-contained todo app called "todone" that runs entirely in the browser from a static folder - open `index.html`, type tasks, close the tab, come back tomorrow and the tasks are still there. No build step, no framework, no server. Small enough to spec end-to-end on a single page, and a good vehicle for exercising a complete task lifecycle (active, completed, archived) inside one local-only project.

## What Changes

- Add a new standalone project under `todone/` containing `index.html`, `styles.css`, and `app.js`. The project is self-contained: opening `todone/index.html` directly in any modern browser loads the working app.
- The app supports the full task lifecycle: add, complete, uncheck, edit (active only), archive (per-item and sweep-all), restore, hard-delete, plus a "Reset" action that wipes all of the app's state.
- Persist the app's state in `localStorage` under the versioned key `todone:v1:tasks`. The namespace prefix avoids collisions with anything else that might share the origin.
- Use plain HTML, plain CSS, and vanilla JavaScript. No bundler, no transpiler, no runtime dependencies. `app.js` is loaded as a single `<script type="module">` so it can use modern ES syntax directly.
- Frame the app inside a clearly bordered container with a small "todone" title, a "Reset" button, a one-line trust note ("Your tasks save in your browser only - never sent anywhere."), the add-task form, and three lists (active / completed / archived).

## Capabilities

### New Capabilities
- `task-management`: the full task lifecycle inside the app - adding, listing, editing on active, completing and unchecking, archiving (per-item and sweep), restoring, and hard-deleting tasks across the active / completed / archived states, plus a reset action that clears all app state.
- `task-persistence`: loading and saving the app's task collection in browser `localStorage` under the versioned key `todone:v1:tasks`, including defensive reads for corrupt data, defensive writes for unavailable storage, and an explicit reset path that wipes the persisted value.

### Modified Capabilities
<!-- None. The capabilities above are net-new. -->

## Impact

- **New code**: a small `todone/` folder containing exactly three files (`index.html`, `styles.css`, `app.js`). No `package.json`, no `node_modules`, no build pipeline.
- **No new runtime dependencies**: the app runs in any modern browser with no install step.
- **Storage**: a single `localStorage` key (`todone:v1:tasks`) on whatever origin the page is served from (or `file://` if opened from disk). No cookies, no network, no analytics, no telemetry.
- **Out of scope for v1**: drag-to-reorder, due dates, priorities, multi-device sync, undo for delete, and any kind of authentication or sharing.
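Stepping outside the proposal text for a moment: the task lifecycle it describes (active, completed, archived) is small enough to sketch as a transition table. This is my own illustration, not the generated app.js. The proposal names the actions but not every transition, so two rules here are assumptions and are marked as such: that only completed tasks can be archived, and that restore puts a task back in completed.

```javascript
// Sketch of the task-management lifecycle. Not the generated app.js.
const TRANSITIONS = new Map([
  ["complete", { from: "active", to: "completed" }],
  ["uncheck", { from: "completed", to: "active" }],
  // Assumption: only completed tasks can be archived.
  ["archive", { from: "completed", to: "archived" }],
  // Assumption: restore undoes archive and lands back in completed.
  ["restore", { from: "archived", to: "completed" }],
]);

function applyAction(task, action) {
  const rule = TRANSITIONS.get(action);
  if (!rule) throw new Error(`Unknown action: ${action}`);
  // Actions that do not apply in the current state leave the task untouched.
  if (task.state !== rule.from) return task;
  return { ...task, state: rule.to };
}

// The proposal allows editing on active tasks only.
function editText(task, text) {
  return task.state === "active" ? { ...task, text } : task;
}

// "Sweep-all" archive: archive every task that is eligible.
function sweepArchive(tasks) {
  return tasks.map((task) => applyAction(task, "archive"));
}
```

Writing the table down is exactly the kind of thing explore forces: the two assumptions above are decisions somebody has to make, and it is cheaper to make them in the spec than in a code review.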

/opsx:propose generating the proposal, design, tasks, and per-capability spec files.
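To make the task-persistence capability concrete, here is a minimal sketch of what "defensive reads, defensive writes, and an explicit reset" could look like. It is not the code OpenSpec generated. The key todone:v1:tasks comes from the proposal; the function names, the storage parameter, and the choice to fall back to an empty list are my assumptions.

```javascript
// Sketch of the task-persistence capability. Not the generated app.js.
const STORAGE_KEY = "todone:v1:tasks"; // versioned key from the proposal

// Each function accepts an optional storage object (assumption, for testability)
// and otherwise falls back to the browser's localStorage.
function loadTasks(storage) {
  try {
    // Resolved inside try: reading localStorage can itself throw when storage is blocked.
    const store = storage ?? globalThis.localStorage;
    const raw = store.getItem(STORAGE_KEY);
    if (raw === null) return []; // nothing saved yet
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : []; // wrong shape counts as corrupt
  } catch {
    return []; // corrupt JSON or unavailable storage
  }
}

function saveTasks(tasks, storage) {
  try {
    const store = storage ?? globalThis.localStorage;
    store.setItem(STORAGE_KEY, JSON.stringify(tasks));
    return true;
  } catch {
    return false; // quota exceeded or storage unavailable
  }
}

function resetTasks(storage) {
  try {
    const store = storage ?? globalThis.localStorage;
    store.removeItem(STORAGE_KEY);
    return true;
  } catch {
    return false;
  }
}
```

In the browser you would call loadTasks() and saveTasks(tasks) with no storage argument. Returning false from a failed write, rather than throwing, lets the UI decide how to tell the user their tasks were not saved.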

/opsx:apply

apply is the part that feels like a magic trick the first time. You run /opsx:apply <change-name>, the AI opens tasks.md, reads the first unchecked box, does the work, ticks the box, and moves to the next one. Each tick is written to the file as it happens, not at the end.

Because the ticks are persistent, /opsx:apply is resumable. If your context window fills up halfway through (or you close the laptop and go to lunch), you start a new session, run /opsx:apply again, and it picks up at the first unchecked box. No "what was I doing?" Just one boring step at a time, until the list is done.
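The resume behaviour is easy to picture if you think of tasks.md as the only state that matters. The sketch below is my own illustration of "find the first unchecked box, tick it", not OpenSpec's implementation. It assumes tasks.md uses Markdown checkboxes ("- [ ]" for pending, "- [x]" for done), and the task names in the example are invented.

```javascript
// Illustration only: how "pick up at the first unchecked box" could work.
// Assumes Markdown checkboxes: "- [ ] pending" and "- [x] done".
const CHECKBOX = /^\s*[-*] \[( |x|X)\] (.*)$/;

function firstUnchecked(tasksMarkdown) {
  const lines = tasksMarkdown.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = CHECKBOX.exec(lines[i]);
    if (match && match[1] === " ") {
      return { line: i + 1, text: match[2].trim() }; // 1-based line number
    }
  }
  return null; // every box is ticked
}

function tick(tasksMarkdown, lineNumber) {
  const lines = tasksMarkdown.split(/\r?\n/);
  const index = lineNumber - 1;
  if (lines[index] === undefined) return tasksMarkdown; // out of range: change nothing
  lines[index] = lines[index].replace("[ ]", "[x]");
  return lines.join("\n");
}

// Hypothetical tasks.md content, invented for this example.
const tasks = [
  "## 1. Scaffold",
  "- [x] 1.1 Create index.html",
  "- [ ] 1.2 Create styles.css",
  "- [ ] 1.3 Create app.js",
].join("\n");

console.log(firstUnchecked(tasks));
// -> { line: 3, text: '1.2 Create styles.css' }
```

Because the next step is always derived from the file, a fresh session needs no memory of the previous one.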

When a task fails - usually because the spec turned out to be wrong about something - stop, fix the underlying spec or task, and re-run. Do not let the AI guess past it. The whole point of the manual is that you can edit it before the next page.

Prompt used in the walkthrough:

/opsx:apply add-todone-app

/opsx:apply ticking through tasks.md and resuming cleanly after an interruption.

/opsx:archive

Once the change is built and validated, archive closes it out. OpenSpec moves the accepted delta into long-lived specs and marks the change complete so the drawer stays clean for the next feature.

Prompt used in the walkthrough:

/opsx:archive add-todone-app

Refactoring

On a green-field, three-file feature OpenSpec feels like ceremony. On a refactor, it earns its keep within the first ten minutes. Running /opsx:explore on an existing piece of code surfaces every assumption the AI is making and every gap in its understanding - including the question you would have forgotten to ask.

And it leaves a record. Six months from now, when somebody asks why the storage key is namespaced with v1, the answer is in design.md, in plain language, with the alternatives that were considered and rejected. No archaeology through commit messages, no Slack search, no "I think Mikkel did it that way for a reason".

Here is what that looks like on a real refactor. The component below was specced and built end-to-end through OpenSpec - explore, propose, apply. Go ahead, try it.

An embedded, interactive todone demo, showing the trust note "Your tasks save in your browser only - never sent anywhere." and the empty-state message "Nothing to do yet - add your first task above."

That component is the manual made physical. Scroll back up and you are reading the exact proposal.md that, alongside design.md, tasks.md, and the capability specs, produced it - no edits, no curation. The videos below show the same loop on a refactor where the architecture actually changed mid-stream, and the manual kept up.

A real refactor turn: /opsx:explore catches a forgotten edge case and the proposal adjusts before any code lands.

And here we use /opsx:explore to completely change the architecture of our todo app, turning the standalone HTML build into a React component so it works correctly within the website.

Open the drawer

On Monday morning: install OpenSpec, run openspec init in the project that has been giving you the most reprompt-fatigue, and run /opsx:explore on a feature you have been putting off. Not the feature itself. Just the explore.

If you come away from explore knowing things about the change you did not know an hour earlier - and you will - keep going. /opsx:propose, read the manual, push back where it is wrong, /opsx:apply. The pieces have not changed. Your tub has not changed. The manual is the difference.

Which feature have you been re-prompting in circles, instead of using the manual?