Do You Actually Need an AI Agent?

Right now, every LinkedIn feed is shouting the same thing: you need agents. Multi-agent teams. Autonomous workflows. AI doing all the things while you sip coffee.

But strip the hype away and ask the honest question: does the average business actually need an agent? Or is ChatGPT — plus a couple of well-built workflows — already doing 90% of the job?

Spoiler: most of the time, you don't need an agent. You need to know the difference between three things, and pick the cheapest one that solves your problem.

The three options most people conflate

Anthropic's engineering team published one of the clearest breakdowns of this, and it's worth borrowing. Their core distinction is between workflows and agents; add the plain single model call as a baseline and you get three buckets:

Task — A single model call. Summarise this email. Extract these fields. Classify this ticket. Predictable cost, bounded failure modes.
Workflow — Multiple model calls in a predefined control flow. You design the steps; the model fills them in. Route → draft → check policy → send. You own the plumbing.
Agent — The model owns the plumbing. It decides what to do, in what order, which tools to call, and when to stop. Loops, retries, exploration, surprises.
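
To make the difference concrete, here is a minimal Python sketch of the three shapes. Everything in it is illustrative: `model` is a hypothetical stand-in for whichever provider's API you call, and the prompts, the "DONE:" convention, and the tool format are assumptions rather than any vendor's actual interface.

```python
from typing import Callable

# Hypothetical stand-in for a real model API call: prompt in, text out.
Model = Callable[[str], str]
Tool = Callable[[str], str]


def run_task(model: Model, email: str) -> str:
    """Task: one model call, one answer."""
    return model(f"Summarise this email: {email}")


def run_workflow(model: Model, ticket: str) -> str:
    """Workflow: several model calls, but the code fixes the order."""
    route = model(f"Route this ticket: {ticket}")
    draft = model(f"Draft a reply for the {route} team: {ticket}")
    verdict = model(f"Check this draft against policy: {draft}")
    return draft if verdict == "ok" else "escalate to human"


def run_agent(model: Model, goal: str, tools: dict[str, Tool], max_steps: int = 10) -> str:
    """Agent: the model picks the next tool and decides when to stop."""
    context = goal
    for _ in range(max_steps):
        decision = model(f"Goal and history so far:\n{context}\nNext action?")
        if decision.startswith("DONE:"):
            return decision.removeprefix("DONE:").strip()
        # Assumed format: "<tool name> <argument>"
        tool_name, _, arg = decision.partition(" ")
        if tool_name in tools:
            result = tools[tool_name](arg)
        else:
            result = f"unknown tool {tool_name}"
        # The whole history is fed back in on the next iteration.
        context += f"\n{decision} -> {result}"
    return "stopped: step budget exhausted"
```

Notice where the control flow lives. In `run_workflow` you can read the steps off the page; in `run_agent` the only thing you control is the step budget.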

Barry Zhang from Anthropic put the rule plainly at the AI Engineer Summit: "Don't build agents for everything." The hype cycle wants you to. Your vendors definitely want you to. Resist.

The reason is simple: workflows are predictable, cheap, and easy to debug. Agents are flexible but expensive, stochastic, and a nightmare to fix when they go wrong. If you can map the decision tree, you don't need an agent — you need a workflow.

Start with the goal, not the tool

Before you go anywhere near building anything, write down what you actually want to achieve. Not "I want an AI agent." Not "I want to automate stuff." The actual outcome:

Faster lead response time?
Fewer hours reconciling spreadsheets?
Cleaner client onboarding?

If you can't write the outcome in one sentence, you're not ready to build anything yet. Goal first, tool last.

The audit: a 4-step filter

Once the goal is clear, run every candidate task through this filter.

Step 1 — Inventory. Write down every recurring task in the business. Order them by total time consumed per week.

Step 2 — Importance. For each task that takes more than 20 minutes, ask: if this stopped happening tomorrow, would the business suffer? If the answer is no, kill the task. Don't automate it. Just stop.

Step 3 — Determinism. For tasks that survive Step 2, ask:

Does the input vary predictably, or wildly?
Are the steps the same every time, or does each case need judgement?
Is the output verifiable, or is it subjective?

If the steps are predictable and the output is verifiable, you're in workflow territory. n8n, Zapier, a deterministic pipeline. No agent needed.

Step 4 — Ambiguity test. Only the leftover tasks — the ones with genuine ambiguity, multi-step reasoning, and a path that changes case-by-case — are real agent candidates. And even then, you have to look at the cost before you commit.
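
If it helps to see the filter as logic rather than prose, the four steps can be sketched in a few lines of Python. The field names and verdict labels are made up for illustration, and two things are assumptions on top of the steps above: the 20-minute threshold is applied to the weekly total, and tasks under the threshold are simply left alone.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    minutes_per_week: int
    business_suffers_if_stopped: bool
    steps_same_every_time: bool
    output_verifiable: bool


def audit(tasks: list[Task], min_minutes: int = 20) -> dict[str, str]:
    """Return a verdict per task, biggest time sinks first."""
    verdicts: dict[str, str] = {}
    # Step 1: inventory, ordered by time consumed per week.
    for task in sorted(tasks, key=lambda t: t.minutes_per_week, reverse=True):
        if task.minutes_per_week <= min_minutes:
            # Assumption: small tasks are not worth auditing further.
            verdicts[task.name] = "leave as is"
        elif not task.business_suffers_if_stopped:
            # Step 2: importance. Don't automate it, just stop.
            verdicts[task.name] = "stop doing it"
        elif task.steps_same_every_time and task.output_verifiable:
            # Step 3: determinism. Predictable steps, checkable output.
            verdicts[task.name] = "workflow"
        else:
            # Step 4: whatever is left is a candidate, not a commitment.
            verdicts[task.name] = "agent candidate (check cost first)"
    return verdicts


if __name__ == "__main__":
    example = [
        Task("weekly status report nobody reads", 90, False, True, True),
        Task("invoice data entry", 240, True, True, True),
        Task("researching inbound leads", 180, True, False, False),
    ]
    for name, verdict in audit(example).items():
        print(f"{name}: {verdict}")
```

Most lists run through this come out with one or two agent candidates at most, and those still have to survive the cost check.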

The cost reality nobody's telling you

This is where the hype meets a brick wall.

A 2025 study from Stanford's Digital Economy Lab analysed frontier model behaviour on agentic coding benchmarks and found something uncomfortable: agentic tasks burned roughly a thousand times more tokens than equivalent code reasoning or chat tasks, with input tokens — not outputs — driving most of the cost. The reason is mechanical: agents re-read their entire context on every loop iteration. Original prompt, plus every previous response, plus every new tool result. The context snowball is real, and it grows fast.
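
The snowball is easy to see with back-of-envelope arithmetic. The sketch below is a simplified model, not a reproduction of the study: it assumes the full context is re-sent as input on every step and that each step appends a fixed number of tokens. The specific figures (a 2,000-token prompt, 1,500 tokens added per step) are invented for illustration.

```python
def total_input_tokens(prompt_tokens: int, tokens_added_per_step: int, steps: int) -> int:
    """Input tokens summed across a run that re-sends its whole context each step."""
    total = 0
    context = prompt_tokens
    for _ in range(steps):
        total += context                  # the entire context is read again
        context += tokens_added_per_step  # response plus tool result get appended
    return total


if __name__ == "__main__":
    single_call = total_input_tokens(2_000, 1_500, steps=1)
    agent_run = total_input_tokens(2_000, 1_500, steps=20)
    print(single_call)  # 2000
    print(agent_run)    # 325000
```

Twenty steps is not twenty times the input of one call. Under these assumptions it is over 160 times, because the total grows with the square of the step count.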

Worse, agent costs are wildly unpredictable. The same study found runs on the same task could vary by up to 30x in total token spend. You can't reliably forecast what an agent will cost until after it's run.

For SMBs, this matters. Industry data puts moderate agent deployments at roughly $1,000–$5,000 per month, with high-usage agents burning 5–10 million tokens monthly. Get the architecture wrong and you're paying agent prices for what should have been a $20-a-month workflow.

Before you commit, get honest about:

Cost per execution. Estimate token usage, multiply by realistic monthly volume, then double it.
Tool access. Every external API the agent touches is an attack surface and a potential failure point.
Data sensitivity. If the agent handles PII, payment data, or anything regulated, you've just inherited a compliance scope you may not be ready for.
Error cost. A wrong email is recoverable. A wrong invoice, an incorrectly cancelled subscription, or a deleted record is not.
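
The cost-per-execution estimate is simple enough to put in a function. This is a rough sketch: the token counts, run volume, and per-million-token prices below are placeholder assumptions, not any provider's actual pricing, so plug in your own numbers.

```python
def monthly_cost_estimate(
    input_tokens_per_run: int,
    output_tokens_per_run: int,
    runs_per_month: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
    safety_factor: float = 2.0,  # "then double it"
) -> float:
    """Estimated monthly spend in USD, padded by the safety factor."""
    cost_per_run = (
        input_tokens_per_run * usd_per_million_input
        + output_tokens_per_run * usd_per_million_output
    ) / 1_000_000
    return cost_per_run * runs_per_month * safety_factor


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    estimate = monthly_cost_estimate(
        input_tokens_per_run=325_000,
        output_tokens_per_run=10_000,
        runs_per_month=500,
        usd_per_million_input=3.0,
        usd_per_million_output=15.0,
    )
    print(f"${estimate:,.2f} per month")  # $1,125.00 per month
```

Given how much runs can vary, treat the result as a floor for budgeting conversations, not a forecast.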

When an agent actually makes sense

Agents earn their keep when all four of these are true:

The task is genuinely ambiguous. You can't pre-map the decision tree.
The output is verifiable. Tests, schemas, or a human review gate — something has to confirm the agent did the right thing.
The value justifies the spend. A task worth £5 isn't worth £2 of tokens plus the engineering overhead.
The cost of error is low or recoverable. Read-only access and human-in-the-loop checkpoints help here.
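
As a checklist, the four conditions might look like the sketch below. The names are hypothetical, and the `min_value_to_cost` ratio of 3 is an arbitrary assumption standing in for "the value justifies the spend"; pick a margin that suits your business.

```python
from dataclasses import dataclass


@dataclass
class AgentCandidate:
    decision_tree_mappable: bool  # can you pre-map the steps?
    has_verification: bool        # tests, schemas, or a human review gate
    value_per_run: float
    cost_per_run: float           # tokens plus amortised engineering overhead
    errors_recoverable: bool


def failed_conditions(c: AgentCandidate, min_value_to_cost: float = 3.0) -> list[str]:
    """Return the conditions a candidate fails; an empty list means all four hold."""
    failures = []
    if c.decision_tree_mappable:
        failures.append("not ambiguous: build a workflow instead")
    if not c.has_verification:
        failures.append("output is not verifiable")
    if c.value_per_run < min_value_to_cost * c.cost_per_run:
        failures.append("value does not justify the spend")
    if not c.errors_recoverable:
        failures.append("errors are costly or irreversible")
    return failures


if __name__ == "__main__":
    # A task worth 5 that costs 2 in tokens plus 1 in overhead per run.
    candidate = AgentCandidate(False, True, value_per_run=5.0, cost_per_run=3.0, errors_recoverable=True)
    print(failed_conditions(candidate))  # ['value does not justify the spend']
```

Returning the failures rather than a yes or no keeps the conversation honest: you can see which condition sank the idea.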

This is why agentic coding tools like Claude Code and Cursor actually work — code is verifiable through tests, the problem space is bounded, and the value per task is high. Most other "agent" use cases don't pass all four conditions, which is why most of them quietly get rebuilt as workflows six months later.

The honest answer for most businesses

For 80% of SMBs, the right stack looks like this:

ChatGPT or Claude for one-off thinking, drafting, and analysis.
Deterministic workflows (n8n, Zapier, Make) for the repeatable stuff.
An agent — if, and only if, you have a single high-value task that genuinely can't be solved any other way.

Build the simplest thing that solves the problem. Add complexity only when you can prove the simpler option failed.

If you're staring at a list of tasks and you're not sure which bucket each one falls into, that's exactly the conversation we have on a Valdris consultation call. Free, no pitch deck, no agent-shaped hammer looking for nails. We'll map your tasks against the framework above and tell you honestly whether you need an agent, a workflow, or just a sharper prompt.

[Book a call →] and we'll work it out together.