I Thought /grill-me Was the Perfect Skill. Then I Tried /grill-with-docs.
Matt Pocock built two of the most powerful alignment tools in the AI coding ecosystem. Here's why both belong in your workflow — and why one alone isn't enough.
🔥 The Problem Nobody Talks About
You fire up Claude Code. You describe the feature. You hit enter.
The agent is off — writing files, creating types, wiring up logic. Fast. Confident. Impressive.
Then you look at what it built.
It's not wrong, exactly. But it's not quite right either. It misunderstood your data model. It invented a concept you don't use. It made a structural decision that conflicts with three other things it has no idea about.
This is the misalignment problem — and it's the most common failure mode in AI-assisted development.
"The most common failure mode in software development is misalignment. You think the dev knows what you want. Then you see what they've built — and you realise it didn't understand you at all. This is just the same in the AI age."
— Matt Pocock,
mattpocock/skillsREADME
Pocock's answer to this isn't a fancier model or a longer prompt. It's friction — deliberate, structured friction that forces alignment before the agent writes a single line.
He built two skills for this: /grill-me and /grill-with-docs. And the upgrade from one to the other is something you feel the first time you run it.
🧱 First: What Are Claude Code Skills?
Before diving in — a quick primer if you haven't used the skills system yet.
A skill is just a markdown file with a name, description, and a body of instructions. You drop it in .claude/skills/, and your agent reads it when you invoke it with a slash command. There's no execution layer, no API wrapper. The entire power is in the words.
.claude/
└── skills/
├── grill-me/
│ └── SKILL.md
└── grill-with-docs/
├── SKILL.md
├── CONTEXT-FORMAT.md
└── ADR-FORMAT.md
Install both from Matt's repo with one command:
npx skills@latest add mattpocock/skills
That's it. Now let's look at what you actually get.
🎯 /grill-me — Deep Detail, One Branch at a Time
Here is the entire skill. All of it:
name: grill-me
description: Interview the user relentlessly about a plan or design
until reaching shared understanding, resolving each branch of the
decision tree. Use when user wants to stress-test a plan, get
grilled on their design, or mentions "grill me".
Interview me relentlessly about every aspect of this plan until
we reach a shared understanding. Walk down each branch of the design
tree resolving dependencies between decisions one by one.
If a question can be answered by exploring the codebase, explore
the codebase instead.
For each question, provide your recommended answer.
Three sentences of instruction. Over 100,000 stars on GitHub.
What actually happens
You describe something — a feature, a system, a vague idea — and the agent stops. It doesn't write code. It asks a question. Then another. Then another.
You: I want to add a webhook retry queue with exponential backoff.
Claude: Before I start, I want to make sure I understand the design.
Q1: Should retries be processed in-process (same server), or via
a separate background worker?
My recommendation: Separate background worker — keeps the main
request path clean and allows independent scaling.
> Your answer:
The phrase that makes this work is "walk down each branch of the design tree." It forces Claude to treat your feature as a hierarchy of decisions, not a flat prompt. Upstream choices get resolved before downstream ones. You never end up in a situation where the agent asks about caching strategy before it knows what database you're using.
Sessions typically run 40–45 minutes. By the end, you have something genuinely valuable: a rich, structured conversation that captures your actual intent — not what the agent guessed.
Why it's brilliant for detail extraction
The "provide your recommended answer" instruction is the masterstroke. Instead of facing a blank question, you're reviewing a draft answer. You can just say "yes" and move on, or correct it if it's wrong. The cognitive load drops dramatically.
Claude: Q7: How should we handle the case where a webhook endpoint
returns a 200 but with an error body?
My recommendation: Treat as failure — require explicit success
signals. A 200 with error body indicates a broken contract on
their end, not ours.
You: Agreed. But also log the response body for debugging.
Claude: Noted. Q8: ...
This is rubber ducking, automated and made argumentative. You're not talking to yourself — you're talking to something that pushes back, has opinions, and doesn't let you skip the hard questions.
Where it shines — and where it stops
/grill-me is phenomenal at getting the detail out of your head and into a shared context. It works on anything: features, architecture, course plans, business decisions, even eulogies (yes, someone used it for that, and Pocock noted it worked wonderfully).
But it has a gap.
Every session starts fresh. The agent has no memory of the decisions you made last week, the terminology your team uses, or the architectural choices baked into your codebase. If you run /grill-me on a mature project, it will ask questions your CHANGELOG could answer. It might use "user" when your codebase calls it "customer." It might propose a pattern you already tried and rejected.
This is where the second skill comes in.
📂 /grill-with-docs — Detail Plus Structure
Here's the key addition in the grill-with-docs skill definition:
name: grill-with-docs
description: Grilling session that challenges your plan against the
existing domain model, sharpens terminology, and updates
documentation (CONTEXT.md, ADRs) inline as decisions crystallise.
Interview me relentlessly about every aspect of this plan until
we reach a shared understanding. Walk down each branch of the design
tree, resolving dependencies between decisions one-by-one.
For each question, provide your recommended answer.
Ask the questions one at a time, waiting for feedback on each
question before continuing.
If a question can be answered by exploring the codebase, explore
the codebase instead. During codebase exploration, also look for
existing documentation:
/
├── CONTEXT.md
├── docs/
│ └── adr/
│ ├── 0001-event-sourced-orders.md
│ └── 0002-postgres-for-write-model.md
└── src/
It starts with the same relentless interview. But now, before the questions begin, the agent reads your CONTEXT.md (your project glossary) and your ADR directory (your past architectural decisions). It already knows your language. It already knows what you chose, and why.
The three things it does differently
1. Challenges you against your own model
If your CONTEXT.md says a "Workspace" is a multi-tenant unit and you start talking about "organisations," the agent will flag it:
Claude: You used the word "organisation" — your domain model calls
this a Workspace. Should we add an alias to CONTEXT.md, or was
this intentional?
This is the ubiquitous language principle from Domain-Driven Design, baked directly into the interview flow.
2. Sharpens your terminology in real time
The before/after example from Pocock's own repo says it better than any description:
| Example | |
|---|---|
| ❌ Before | "There's a problem when a lesson inside a section of a course is made 'real' (i.e. given a spot in the file system)" |
| ✅ After | "There's a problem with the materialization cascade" |
Twelve words become three. That concision pays back on every subsequent session, every PR review, every standup.
3. Writes ADRs only when they're earned
The skill is disciplined about when to create an Architecture Decision Record. It only offers to write one when all three conditions hold:
- ⛔ Hard to reverse — the cost of changing your mind later is meaningful
- ⛔ Surprising without context — a future reader will wonder "why did they do this?"
- ⛔ Result of a real trade-off — there were genuine alternatives, you picked one for specific reasons
If any condition is missing, it skips the ADR entirely.
After a session, you might end up with something like this written automatically:
# ADR 0003: Exponential backoff capped at 24 hours
## Status
Accepted
## Context
Webhook delivery failures can be transient (endpoint downtime) or
permanent (misconfigured URLs). We need a retry strategy that
handles both without overwhelming downstream systems.
## Decision
Use exponential backoff starting at 1 minute, doubling each attempt,
capped at 24 hours. After 7 failed attempts, mark as permanently
failed and notify the workspace owner.
## Consequences
- Reduces thundering herd on recovery
- 24-hour cap means we notify owners within a day of failure
- Permanently failed webhooks require manual reactivation
That document didn't exist before the session. Now it does — and every future agent session that touches webhooks will read it.
⚖️ The Real Difference: Depth vs. Depth + Structure
Here's how I think about the two now:
/grill-me |
/grill-with-docs |
|
|---|---|---|
| Best for | New projects, greenfield, any domain | Existing codebases, mature domains, team projects |
| Reads | Codebase only | Codebase + CONTEXT.md + ADRs |
| Writes | Nothing | Updates CONTEXT.md, creates ADRs |
| Output | Rich conversation | Conversation + living documentation |
| Great at | Extracting detail from your head | Detail + structural decisions that stick |
| Setup required | None | CONTEXT.md + /setup-matt-pocock-skills |
/grill-me is the standalone, zero-setup version. You can run it against a vague idea with nothing in place and get 45 minutes of structured, useful output. It's the one Pocock says he uses "outside of coding too."
/grill-with-docs is the codebase-native version. It needs the infrastructure — a CONTEXT.md, an ADR directory, ideally a run of /setup-matt-pocock-skills first. But once it's in place, the sessions are qualitatively different. The agent isn't just asking good questions. It's asking questions calibrated to your project's specific language, history, and constraints.
🔄 How They Fit Into the Bigger Picture
Pocock describes his full development loop like this:
grill → spec → slice → ship → refactor
Both grilling skills slot into the first step. But /grill-with-docs feeds directly into the refactor phase too — because every session leaves the documentation in better shape than it found it.
A full session with a mature project might look like:
# Align on the feature
/grill-with-docs "Add a webhook retry queue with exponential backoff"
# → 8-15 questions
# → Refined plan
# → Updated CONTEXT.md (with "retry queue" defined)
# → New ADR (backoff strategy decision)
# Turn the plan into work
/to-prd
/to-issues
# Implement with feedback loops
/tdd
# Weekly architecture hygiene (reads your updated CONTEXT.md)
/improve-codebase-architecture
Each skill in the chain benefits from the documentation that /grill-with-docs keeps current.
🧠 Which One Should You Start With?
Starting a new project → /grill-me. No setup. No dependencies. Run it before every non-trivial feature and let the detail extraction do its job.
Working in an existing codebase → Set up the infrastructure once (/setup-matt-pocock-skills), write a first-pass CONTEXT.md, then switch to /grill-with-docs for everything that touches your domain model.
On a team → /grill-with-docs almost certainly. The shared CONTEXT.md means every engineer (human and AI) is speaking the same language.
The good news: you don't choose between them permanently. Start with /grill-me. When you feel the friction of repeating context session after session, that's the signal to level up.
🏁 The Bottom Line
/grill-me is a masterclass in how three sentences can change a workflow. It solves the misalignment problem by making requirement specification a structured, collaborative step rather than an afterthought.
/grill-with-docs takes that same interrogation and grounds it in your project's lived reality. It doesn't just help you think through a feature — it makes the thinking permanent, so you never have to repeat it.
I thought /grill-me was the perfect skill. It was the best tool I'd used for getting an idea out of my head and into a form an agent could build from.
Then I tried /grill-with-docs on a project that had been running for three months — one with a real domain, real architectural decisions, real terminology baked into the codebase.
The session was different. The agent already knew the language. It caught a terminology slip in my second sentence. By the end, the CONTEXT.md had two new entries and the ADR directory had a document I should have written six weeks ago.
The detail is still there. Now it sticks.
→ Install both skills: npx skills@latest add mattpocock/skills
→ Full repo: github.com/mattpocock/skills
→ Matt's original write-up: aihero.dev/my-grill-me-skill-has-gone-viral