Automation

I Thought /grill-me Was Perfect. Then I Found /grill-with-docs.

Brendan Tack Brendan Tack · · 4 min read
I Thought /grill-me Was Perfect. Then I Found /grill-with-docs.

I Thought /grill-me Was the Perfect Skill. Then I Tried /grill-with-docs.

Matt Pocock built two of the most powerful alignment tools in the AI coding ecosystem. Here's why both belong in your workflow — and why one alone isn't enough.


🔥 The Problem Nobody Talks About

You fire up Claude Code. You describe the feature. You hit enter.

The agent is off — writing files, creating types, wiring up logic. Fast. Confident. Impressive.

Then you look at what it built.

It's not wrong, exactly. But it's not quite right either. It misunderstood your data model. It invented a concept you don't use. It made a structural decision that conflicts with three other things it has no idea about.

This is the misalignment problem — and it's the most common failure mode in AI-assisted development.

"The most common failure mode in software development is misalignment. You think the dev knows what you want. Then you see what they've built — and you realise it didn't understand you at all. This is just the same in the AI age."

— Matt Pocock, mattpocock/skills README

Pocock's answer to this isn't a fancier model or a longer prompt. It's friction — deliberate, structured friction that forces alignment before the agent writes a single line.

He built two skills for this: /grill-me and /grill-with-docs. And the upgrade from one to the other is something you feel the first time you run it.


🧱 First: What Are Claude Code Skills?

Before diving in — a quick primer if you haven't used the skills system yet.

A skill is just a markdown file with a name, description, and a body of instructions. You drop it in .claude/skills/, and your agent reads it when you invoke it with a slash command. There's no execution layer, no API wrapper. The entire power is in the words.

CODE
.claude/
└── skills/
    ├── grill-me/
    │   └── SKILL.md
    └── grill-with-docs/
        ├── SKILL.md
        ├── CONTEXT-FORMAT.md
        └── ADR-FORMAT.md

Install both from Matt's repo with one command:

BASH
npx skills@latest add mattpocock/skills

That's it. Now let's look at what you actually get.


🎯 /grill-me — Deep Detail, One Branch at a Time

Here is the entire skill. All of it:

CODE
name: grill-me
description: Interview the user relentlessly about a plan or design
  until reaching shared understanding, resolving each branch of the
  decision tree. Use when user wants to stress-test a plan, get
  grilled on their design, or mentions "grill me".

Interview me relentlessly about every aspect of this plan until
we reach a shared understanding. Walk down each branch of the design
tree resolving dependencies between decisions one by one.

If a question can be answered by exploring the codebase, explore
the codebase instead.

For each question, provide your recommended answer.

Three sentences of instruction. Over 100,000 stars on GitHub.

What actually happens

You describe something — a feature, a system, a vague idea — and the agent stops. It doesn't write code. It asks a question. Then another. Then another.

CODE
You: I want to add a webhook retry queue with exponential backoff.

Claude: Before I start, I want to make sure I understand the design.

  Q1: Should retries be processed in-process (same server), or via
  a separate background worker?

  My recommendation: Separate background worker — keeps the main
  request path clean and allows independent scaling.

  > Your answer:

The phrase that makes this work is "walk down each branch of the design tree." It forces Claude to treat your feature as a hierarchy of decisions, not a flat prompt. Upstream choices get resolved before downstream ones. You never end up in a situation where the agent asks about caching strategy before it knows what database you're using.

Sessions typically run 40–45 minutes. By the end, you have something genuinely valuable: a rich, structured conversation that captures your actual intent — not what the agent guessed.

Why it's brilliant for detail extraction

The "provide your recommended answer" instruction is the masterstroke. Instead of facing a blank question, you're reviewing a draft answer. You can just say "yes" and move on, or correct it if it's wrong. The cognitive load drops dramatically.

CODE
Claude: Q7: How should we handle the case where a webhook endpoint
  returns a 200 but with an error body?

  My recommendation: Treat as failure — require explicit success
  signals. A 200 with error body indicates a broken contract on
  their end, not ours.

You: Agreed. But also log the response body for debugging.

Claude: Noted. Q8: ...

This is rubber ducking, automated and made argumentative. You're not talking to yourself — you're talking to something that pushes back, has opinions, and doesn't let you skip the hard questions.

Where it shines — and where it stops

/grill-me is phenomenal at getting the detail out of your head and into a shared context. It works on anything: features, architecture, course plans, business decisions, even eulogies (yes, someone used it for that, and Pocock noted it worked wonderfully).

But it has a gap.

Every session starts fresh. The agent has no memory of the decisions you made last week, the terminology your team uses, or the architectural choices baked into your codebase. If you run /grill-me on a mature project, it will ask questions your CHANGELOG could answer. It might use "user" when your codebase calls it "customer." It might propose a pattern you already tried and rejected.

This is where the second skill comes in.


📂 /grill-with-docs — Detail Plus Structure

Here's the key addition in the grill-with-docs skill definition:

CODE
name: grill-with-docs
description: Grilling session that challenges your plan against the
  existing domain model, sharpens terminology, and updates
  documentation (CONTEXT.md, ADRs) inline as decisions crystallise.

Interview me relentlessly about every aspect of this plan until
we reach a shared understanding. Walk down each branch of the design
tree, resolving dependencies between decisions one-by-one.

For each question, provide your recommended answer.
Ask the questions one at a time, waiting for feedback on each
question before continuing.

If a question can be answered by exploring the codebase, explore
the codebase instead. During codebase exploration, also look for
existing documentation:

/
├── CONTEXT.md
├── docs/
│   └── adr/
│       ├── 0001-event-sourced-orders.md
│       └── 0002-postgres-for-write-model.md
└── src/

It starts with the same relentless interview. But now, before the questions begin, the agent reads your CONTEXT.md (your project glossary) and your ADR directory (your past architectural decisions). It already knows your language. It already knows what you chose, and why.

The three things it does differently

1. Challenges you against your own model

If your CONTEXT.md says a "Workspace" is a multi-tenant unit and you start talking about "organisations," the agent will flag it:

CODE
Claude: You used the word "organisation" — your domain model calls
  this a Workspace. Should we add an alias to CONTEXT.md, or was
  this intentional?

This is the ubiquitous language principle from Domain-Driven Design, baked directly into the interview flow.

2. Sharpens your terminology in real time

The before/after example from Pocock's own repo says it better than any description:

Example
❌ Before "There's a problem when a lesson inside a section of a course is made 'real' (i.e. given a spot in the file system)"
✅ After "There's a problem with the materialization cascade"

Twelve words become three. That concision pays back on every subsequent session, every PR review, every standup.

3. Writes ADRs only when they're earned

The skill is disciplined about when to create an Architecture Decision Record. It only offers to write one when all three conditions hold:

If any condition is missing, it skips the ADR entirely.

After a session, you might end up with something like this written automatically:

MARKDOWN
# ADR 0003: Exponential backoff capped at 24 hours

## Status
Accepted

## Context
Webhook delivery failures can be transient (endpoint downtime) or
permanent (misconfigured URLs). We need a retry strategy that
handles both without overwhelming downstream systems.

## Decision
Use exponential backoff starting at 1 minute, doubling each attempt,
capped at 24 hours. After 7 failed attempts, mark as permanently
failed and notify the workspace owner.

## Consequences
- Reduces thundering herd on recovery
- 24-hour cap means we notify owners within a day of failure
- Permanently failed webhooks require manual reactivation

That document didn't exist before the session. Now it does — and every future agent session that touches webhooks will read it.


⚖️ The Real Difference: Depth vs. Depth + Structure

Here's how I think about the two now:

/grill-me /grill-with-docs
Best for New projects, greenfield, any domain Existing codebases, mature domains, team projects
Reads Codebase only Codebase + CONTEXT.md + ADRs
Writes Nothing Updates CONTEXT.md, creates ADRs
Output Rich conversation Conversation + living documentation
Great at Extracting detail from your head Detail + structural decisions that stick
Setup required None CONTEXT.md + /setup-matt-pocock-skills

/grill-me is the standalone, zero-setup version. You can run it against a vague idea with nothing in place and get 45 minutes of structured, useful output. It's the one Pocock says he uses "outside of coding too."

/grill-with-docs is the codebase-native version. It needs the infrastructure — a CONTEXT.md, an ADR directory, ideally a run of /setup-matt-pocock-skills first. But once it's in place, the sessions are qualitatively different. The agent isn't just asking good questions. It's asking questions calibrated to your project's specific language, history, and constraints.


🔄 How They Fit Into the Bigger Picture

Pocock describes his full development loop like this:

CODE
grill → spec → slice → ship → refactor

Both grilling skills slot into the first step. But /grill-with-docs feeds directly into the refactor phase too — because every session leaves the documentation in better shape than it found it.

A full session with a mature project might look like:

BASH
# Align on the feature
/grill-with-docs "Add a webhook retry queue with exponential backoff"
# → 8-15 questions
# → Refined plan
# → Updated CONTEXT.md (with "retry queue" defined)
# → New ADR (backoff strategy decision)

# Turn the plan into work
/to-prd
/to-issues

# Implement with feedback loops
/tdd

# Weekly architecture hygiene (reads your updated CONTEXT.md)
/improve-codebase-architecture

Each skill in the chain benefits from the documentation that /grill-with-docs keeps current.


🧠 Which One Should You Start With?

Starting a new project/grill-me. No setup. No dependencies. Run it before every non-trivial feature and let the detail extraction do its job.

Working in an existing codebase → Set up the infrastructure once (/setup-matt-pocock-skills), write a first-pass CONTEXT.md, then switch to /grill-with-docs for everything that touches your domain model.

On a team/grill-with-docs almost certainly. The shared CONTEXT.md means every engineer (human and AI) is speaking the same language.

The good news: you don't choose between them permanently. Start with /grill-me. When you feel the friction of repeating context session after session, that's the signal to level up.


🏁 The Bottom Line

/grill-me is a masterclass in how three sentences can change a workflow. It solves the misalignment problem by making requirement specification a structured, collaborative step rather than an afterthought.

/grill-with-docs takes that same interrogation and grounds it in your project's lived reality. It doesn't just help you think through a feature — it makes the thinking permanent, so you never have to repeat it.

I thought /grill-me was the perfect skill. It was the best tool I'd used for getting an idea out of my head and into a form an agent could build from.

Then I tried /grill-with-docs on a project that had been running for three months — one with a real domain, real architectural decisions, real terminology baked into the codebase.

The session was different. The agent already knew the language. It caught a terminology slip in my second sentence. By the end, the CONTEXT.md had two new entries and the ADR directory had a document I should have written six weeks ago.

The detail is still there. Now it sticks.


→ Install both skills: npx skills@latest add mattpocock/skills

→ Full repo: github.com/mattpocock/skills

→ Matt's original write-up: aihero.dev/my-grill-me-skill-has-gone-viral

Want to talk about your business?

Book a free Reverse Demo — we'll show you what your operation could look like with the right automations in place.

Book a Reverse Demo