Your LLM Has Amnesia. Build It a Wiki.

I spent 2–3 days vibe-coding a web app with Claude Code and Opus — Anthropic’s strongest coding model. Agent, skills, prompts describing the stack, the business logic, the data models, the UI elements, the forms. It scaffolded the frontend, the backend, the database, and wired nice-looking forms straight through to the schema. Everything worked.

Then I checked the token meter.

Burning fast. And every few hours the agent lost the thread — I’d re-paste conventions I’d written days earlier, and sometimes catch it repeating a mistake I’d already corrected. Almost always the root cause was mine: a rule I’d written vaguely, or never written at all. I tried /clear to drop the bloated history. That saved tokens, and immediately created a different problem: Claude Code had to re-read the codebase from scratch to figure out which files mattered for the next change. The loop only got worse — longer context, more tokens, more re-derivation.

Then Andrej Karpathy’s gist landed in my feed. His concept clicked with the way I already used my Obsidian graph. I followed it and rebuilt the project around a structured markdown wiki the agent reads on demand. This post is what came out of that — the folder layout I now use across projects, and why each piece earns its place.

What We’ll Cover

The four problems a project wiki fixes
The folder layout — epics, stories, tasks, bugs, releases, retros, tech-docs
Frontmatter as the LLM’s API
Wikilinks: how the agent walks a graph instead of grepping
The brainstorming checklist that catches spec bugs unit tests miss
Token and time tracking per task and per story
The release lifecycle, day by day
The afternoon starter kit

Environment: any plain-text editor works; I use Obsidian for the wikilink graph and Dataview for auto-rolled release pages. Files are plain Markdown with YAML frontmatter on disk — no proprietary format.

1. The Four Problems a Project Wiki Fixes

Working with an LLM agent on a real codebase has four predictable failure modes:

Token re-derivation. Each new chat re-asks “what’s the stack? what’s the URL scheme? what’s the data model?” You pay for the same paragraph every time.
Vendor lock-in by accident. If your project memory lives in one tool’s hidden context, switching models — Claude to GPT to a local Llama — wipes it.
Cold-start drift. Without a single source of truth, the agent re-invents naming, re-discovers conventions, picks slightly different patterns each session. The codebase grows inconsistent under you.
Retro amnesia. You learn something the hard way in sprint one. By sprint three, nobody — human or model — remembers the lesson.

The wiki sits between you, the agent, and the code. It is the system prompt you only pay to write once.

Markdown Is the Portable Format

The same files work in Obsidian, Cursor, Claude Code, VS Code, or plain cat. Switch your AI vendor next quarter and your project memory comes with you. No export, no migration, no lock-in.

2. The Folder Layout

Eight top-level folders, each holding one document type with a strict naming pattern and typed frontmatter:

Folder	Doc type	Role
`epics/`	Epic	A high-level goal, a group of stories
`stories/`	Story	One testable user-facing outcome
`tasks/`	Task	A single execution unit, frontend or backend
`bugs/`	Bug	A defect with traceback to its task or story
`releases/`	Release	A deployable bundle, auto-rolled from frontmatter
`retrospectives/`	Retro	Root cause analysis and action items
`tech-docs/`	Reference	Stable architecture facts
`_assets/`	Images	Diagrams and screenshots

File naming is parseable, so the agent derives relationships from filenames alone:

epics/{Word}.md
stories/{epic-key}-{seq}--{slug}.md
tasks/{story-id}-T{seq}--{slug}.md
bugs/{story-id}-B{seq}--{slug}.md
releases/v{major}.{minor}.md
retrospectives/retro-{release}--{slug}.md

The id EPC-01-T1 tells you instantly: epic EPC, story 01, task 1. No directory crawl, no fuzzy match.
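To make "parseable" concrete, here is a minimal sketch of how a script or agent might split a file name into its parts. The regular expression is my reading of the patterns above, and the `DocRef` type and `parse_doc_name` helper are hypothetical names. It assumes alphabetic epic keys and lowercase slugs.

```python
import re
from typing import NamedTuple, Optional

# Assumed shapes, taken from the naming patterns above:
#   {epic-key}-{seq}      -> story  (EPC-01)
#   {story-id}-T{seq}     -> task   (EPC-01-T1)
#   {story-id}-B{seq}     -> bug    (EPC-01-B2)
# An optional "--slug" and ".md" suffix are accepted and the suffix is ignored.
ID_PATTERN = re.compile(
    r"^(?P<epic>[A-Za-z]+)-(?P<story>\d+)"
    r"(?:-(?P<kind>[TB])(?P<seq>\d+))?"
    r"(?:--(?P<slug>[a-z0-9-]+))?(?:\.md)?$"
)


class DocRef(NamedTuple):
    doc_type: str        # "story", "task", or "bug"
    epic_key: str
    story_id: str
    seq: Optional[int]   # task or bug sequence number; None for a story
    slug: Optional[str]


def parse_doc_name(name: str) -> Optional[DocRef]:
    """Split a wiki file name or id into its parts; None if it does not fit."""
    m = ID_PATTERN.match(name)
    if m is None:
        return None
    doc_type = {"T": "task", "B": "bug", None: "story"}[m.group("kind")]
    story_id = f"{m.group('epic')}-{m.group('story')}"
    seq = int(m.group("seq")) if m.group("seq") else None
    return DocRef(doc_type, m.group("epic"), story_id, seq, m.group("slug"))
```

Calling `parse_doc_name("EPC-01-T1--first-task.md")` yields a task whose `story_id` is `EPC-01`, with no file read at all.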

3. Frontmatter as the LLM’s API

Every doc starts with typed YAML. This is the contract both humans and agents read against. Stories carry the richest schema:

type: story
epic: "[[Epic-Name]]"
id: EPC-01
status: in-progress
priority: high
story_point: 8
actor: end-user
goal: short verb-phrase goal
business_value: why this matters
tasks:
  - "[[EPC-01-T1--first-task]]"
  - "[[EPC-01-T2--second-task]]"
release: v1.3
created: 2026-04-15
used_tokens:
time_spent:

Two audiences read this:

You, in your editor, where queries turn it into rolled-up tables.
The agent, which can grep, filter, and reason about it without parsing prose.

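A query like “show all in-progress stories in the next release” is one Dataview block on the human side and a pair of greps on the agent side: grep -l 'release: v1.3' stories/*.md lists the release's stories, and piping that list through xargs grep -l 'status: in-progress' narrows it to the ones still open. Same source, two consumers. That is the whole point of typed frontmatter.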
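If you would rather filter in a script than in the shell, the same idea fits in a few lines of standard-library Python. This is a rough sketch, not a real YAML parser. It assumes the frontmatter sits between `---` lines at the top of each file and only contains flat `key: value` pairs and simple `- item` lists. `read_frontmatter` and `find_docs` are names I made up for the example.

```python
from pathlib import Path
from typing import Dict, List, Union

Value = Union[str, List[str]]


def read_frontmatter(text: str) -> Dict[str, Value]:
    """Parse a small YAML subset: `key: value` scalars and `- item` lists.

    Assumes the block is delimited by `---` lines at the top of the file.
    Use a real YAML library for anything nested.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    data: Dict[str, Value] = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":          # end of the frontmatter block
            break
        if stripped.startswith("- ") and current_key is not None:
            items = data[current_key]
            if not isinstance(items, list):
                items = []             # first item: turn the empty scalar into a list
                data[current_key] = items
            items.append(stripped[2:].strip().strip('"'))
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            data[current_key] = value.strip().strip('"')
    return data


def find_docs(folder: Path, **wanted: str) -> List[Path]:
    """Return the .md files in `folder` whose frontmatter matches every key=value."""
    hits = []
    for path in sorted(folder.glob("*.md")):
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        if all(meta.get(key) == value for key, value in wanted.items()):
            hits.append(path)
    return hits
```

With that in place, `find_docs(Path("stories"), release="v1.3", status="in-progress")` answers the question above without reading any prose.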

One Paste Beats Ten Re-Explanations

Front-load your _index.md once at the start of a session — stack, URL scheme, data model, naming patterns. The agent stops asking “what’s the framework?” for the rest of the conversation. That single paste is worth thousands of tokens across a sprint.

4. Wikilinks: The Graph the Agent Walks

Wikilinks are not decoration. They are how the agent navigates context without a full-vault grep.

Figure: Obsidian graph view of the project wiki, showing clusters of epics, stories, tasks, bugs, and releases connected by wikilinks.

A story lists its tasks as wikilinks. A bug links back to the task that introduced it. A release links to its retrospective. A retro links to the stories it reviewed. Each arrow in the chain below is one hop, so whatever the agent needs next is usually one or two links from where it already is:

epic -> story -> task -> bug -> release -> retro -> action item -> _index.md

The agent doesn’t need to load the whole vault. It loads the story, follows one link, lands in the right task or bug, and stops. Tokens spent: minimal. Context fidelity: high. This is the difference between “search the codebase for X” (expensive, noisy) and “read this one file and follow the link” (cheap, surgical).
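Here is one way "read this one file and follow the link" could look as code. It is a sketch under a few assumptions: links use the `[[target]]` form (optionally with `|alias` or `#heading`), each target matches a file name somewhere in the vault, and the first match wins, which is simpler than Obsidian's own link resolution. The function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def extract_links(text: str) -> List[str]:
    """Return wikilink targets in order of appearance, without duplicates."""
    seen: Dict[str, None] = {}
    for target in WIKILINK.findall(text):
        seen.setdefault(target.strip(), None)
    return list(seen)


def resolve(vault: Path, target: str) -> Optional[Path]:
    """Find `<target>.md` anywhere under the vault (first match in sorted order)."""
    return next(iter(sorted(vault.rglob(f"{target}.md"))), None)


def neighbors(vault: Path, doc: Path) -> List[Path]:
    """One hop: the files that `doc` links to and that exist on disk."""
    text = doc.read_text(encoding="utf-8")
    found = (resolve(vault, target) for target in extract_links(text))
    return [path for path in found if path is not None]
```

Starting from a story, `neighbors(vault, story_path)` returns its task files and nothing else, which is the whole context budget for that step.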

5. The Brainstorming Checklist That Catches Spec Bugs

Unit tests catch logic bugs. They do not catch spec bugs: the wrong screen, the missing config knob, a value that violates a database constraint nobody documented. In my experience, most LLM-written code that ships broken is broken for spec reasons, not logic ones.

Before any story moves from brainstorming to todo, the wiki forces four questions:

Step-by-step UI flow. What does the user see at each step? What page loads? What are the error and loading states? Don’t say “user verifies email” — specify which page, what URL, what shows on success and failure.
Backend integration points. Which service handles each step? How are tokens or sessions passed between server and client? Who creates the session, who reads it?
External config requirements. What dashboard settings, env vars, DNS records, or third-party config are needed? List them in an ## External Config section.
Data entity values. Cross-check every column value mentioned in the story against the actual schema constraints — CHECK, enum, NOT NULL. If the spec says role = 'parent', verify that string exists in the constraint.

Each of these questions takes minutes to answer up front and saves hours of debugging later. They are on the list because retros keep tracing bugs back to exactly these gaps.
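The fourth question is the easiest to automate. The sketch below pulls the allowed values out of `CHECK (column IN (...))` constraints in a schema dump and flags any value in the story that the database would reject. The table, the allowed roles, and the helper names are all invented for illustration, and the regex only handles this one constraint shape.

```python
import re
from typing import Dict, List, Set, Tuple

# Finds constraints of the form: CHECK (column IN ('a', 'b', ...))
CHECK_IN = re.compile(r"CHECK\s*\(\s*(\w+)\s+IN\s*\(([^)]*)\)\s*\)", re.IGNORECASE)


def allowed_values(ddl: str) -> Dict[str, Set[str]]:
    """Map each column with a CHECK ... IN (...) constraint to its allowed strings."""
    allowed: Dict[str, Set[str]] = {}
    for column, raw in CHECK_IN.findall(ddl):
        allowed[column.lower()] = set(re.findall(r"'([^']*)'", raw))
    return allowed


def spec_violations(spec: Dict[str, str], ddl: str) -> List[Tuple[str, str]]:
    """Return (column, value) pairs from the spec that the schema would reject."""
    allowed = allowed_values(ddl)
    return [
        (column, value)
        for column, value in spec.items()
        if column.lower() in allowed and value not in allowed[column.lower()]
    ]


# Hypothetical schema and story value, for illustration only.
DDL = """
CREATE TABLE profiles (
    id   INTEGER PRIMARY KEY,
    role TEXT NOT NULL CHECK (role IN ('guardian', 'student', 'teacher'))
);
"""

print(spec_violations({"role": "parent"}, DDL))  # prints [('role', 'parent')]
```

A story that says role = 'parent' fails here in the brainstorming stage instead of at the first insert.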

6. Token and Time Tracking

Two fields on every task and every story:

used_tokens:
time_spent:

You fill them in when the work flips to done. Tasks record their own actuals. Stories sum their tasks. Over a few sprints you have real per-story cost data — token spend by feature area, by agent, by complexity tier.
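"Stories sum their tasks" can be done by hand, but it is also a small loop. The sketch below assumes each task's frontmatter has already been loaded into a dict, that task ids follow the `{story-id}-T{seq}` pattern, and that `time_spent` is a number of hours. The unit is my assumption, so use whatever you record. `story_of` and `roll_up` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def story_of(task_id: str) -> str:
    """EPC-01-T1 -> EPC-01 (drops the trailing -T{seq})."""
    return task_id.rsplit("-T", 1)[0]


def roll_up(tasks: Iterable[Mapping[str, object]]) -> Dict[str, Dict[str, float]]:
    """Sum used_tokens and time_spent of done tasks, grouped by story id."""
    totals = defaultdict(lambda: {"used_tokens": 0, "time_spent": 0.0})
    for task in tasks:
        if task.get("status") != "done":
            continue  # only finished work has real actuals
        story = totals[story_of(str(task["id"]))]
        # Empty or missing fields count as zero rather than raising.
        story["used_tokens"] += int(task.get("used_tokens") or 0)
        story["time_spent"] += float(task.get("time_spent") or 0)
    return dict(totals)
```

Feed it every task in `tasks/` and you get one row per story, ready to compare across feature areas.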

You cannot optimize what you don’t measure. “The AI is expensive” is a feeling. “Auth stories cost 3× what CRUD stories cost” is a number you can act on.

A Wiki You Don't Update Is a Wiki That Lies

used_tokens, completed, and status must be filled at done-time, not “later.” A wiki that drifts out of sync with reality rots into vibes — you and the agent both stop trusting it, and you’re back to re-deriving context from scratch.
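A tiny check can enforce the done-time rule before a commit. This sketch assumes the frontmatter is already parsed into a dict and treats a missing key or an empty string as "not filled". The field list mirrors the rule above, and `missing_actuals` is a name I made up.

```python
from typing import List, Mapping

# Fields that must be filled once a doc's status is "done".
REQUIRED_WHEN_DONE = ("used_tokens", "completed")


def missing_actuals(meta: Mapping[str, object]) -> List[str]:
    """Return the required fields that are empty on a doc marked done."""
    if meta.get("status") != "done":
        return []
    return [field for field in REQUIRED_WHEN_DONE if meta.get(field) in (None, "")]
```

Run it over every task and story, and refuse to cut a release while any list comes back non-empty.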

7. The Release Lifecycle, Day by Day

The whole point of the wiki is that it accumulates. A single release page proves it:

Epic drafted. A short page in epics/ names a goal and gets a key.
Story brainstormed. The four-item checklist runs. Status moves to todo.
Tasks executed. Each task gets a completed date and a commit hash. Status done.
Bugs filed. When something breaks in QA or production, a bug links back via related: to the task that introduced it.
Release cut. A releases/v{major}.{minor}.md page (here, releases/v1.3.md) rolls up everything tagged release: v1.3 via Dataview. No hand-maintained tables.
Retro written. After the release ships, a retrospectives/retro-*.md page analyzes the bugs by root cause category and produces action items.
Action items feed back. The action items update _index.md itself — new rules, new checklist items, new conventions. The wiki gets smarter every sprint.

Every step leaves a typed, linked artifact. Tomorrow’s session reads them and is up to speed in seconds.

8. The Afternoon Starter Kit

You don’t need the full structure on day one. Minimum viable wiki:

_index.md — your project overview, tech stack, URL scheme, data model. The one file every session reads first.
epics/, stories/, tasks/ — three folders, that’s it. Skip bugs, releases, retros until you have something to ship.
The four-item brainstorming checklist pasted into _index.md. Use it on the first story. You’ll feel the difference.

Add the rest as you grow into them: bugs the first time something breaks, releases when you cut your first version, retros after the first release ships. Dataview can wait until release two — by then the value is obvious.
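If you want the starter kit as a script, something like the following would do. It is only a sketch: the section headings in the `_index.md` skeleton are my guess at a reasonable layout, not a required format, and `scaffold` is a hypothetical helper. It never overwrites an existing `_index.md`.

```python
from pathlib import Path

STARTER_FOLDERS = ("epics", "stories", "tasks")

# Skeleton only; the headings are an assumed layout, fill them in by hand.
INDEX_TEMPLATE = """\
# Project Overview

## Tech Stack

## URL Scheme

## Data Model

## Naming Patterns

## Brainstorming Checklist
1. Step-by-step UI flow
2. Backend integration points
3. External config requirements
4. Data entity values
"""


def scaffold(root: Path) -> Path:
    """Create the starter folders and an _index.md skeleton; never overwrites."""
    for name in STARTER_FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)
    index = root / "_index.md"
    if not index.exists():
        index.write_text(INDEX_TEMPLATE, encoding="utf-8")
    return index
```

`scaffold(Path("wiki"))` is safe to re-run, so you can add folder names to `STARTER_FOLDERS` later as the project grows into them.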

What You Have Now

You used to	You now
Re-paste the stack each session	The agent reads `_index.md` once
Lose decisions when you switch models	The wiki survives the model swap
Discover spec gaps in QA	The brainstorming checklist surfaces them pre-code
Guess what the agent costs	`used_tokens` per story, summed from tasks
Repeat sprint mistakes	Retro action items land back in `_index.md`

Next Steps

Create the folders: the three starter ones today, the full eight as you grow into them. Write _index.md with your stack, URL scheme, and data model. One paste, ten minutes.
Adopt the four-item brainstorming checklist. Apply it to the next story you start. Ship it. Notice what would have slipped through.
After your first bug, write a retro. After your second sprint, you will have a wiki that pays for itself every session.

Markdown plus frontmatter is not glamorous. That’s the point — boring formats outlive vendors. Build memory you own.