Three words, four definitions — why AI conversations keep failing

The meeting seemed productive. Everyone agreed: the solution would use an agent with custom skills and automation to handle the repetitive parts. Great. Aligned. Moving forward.

Six weeks later, nothing shipped. The developer built a Claude Code workflow with local skill files. The product lead expected something closer to an autonomous assistant running in the background. The operations manager assumed it meant a Make scenario triggered by a webhook.

They had been talking about the same three words for six weeks. They were never talking about the same thing.

This is not a communication failure. It is a vocabulary failure, and it is almost universal in AI conversations right now. The words skill, agent, and automation each carry two to four incompatible definitions, depending on the ecosystem, the mental model, and the tool the speaker has been using.

No one lies. No one oversimplifies deliberately. The words just mean different things.


"Skill" — three separate concepts

Concept 1: A local workflow file (Claude Code)

In Claude Code — Anthropic's CLI environment — a skill is a folder on disk containing a markdown prompt file. It runs in a local session. It cannot be triggered externally. It is personal tooling, bound to one machine and one user's session. Powerful for the person who built it. Invisible to the rest of the organization.

Concept 2: A self-generated workflow module (OpenClaw)

OpenClaw, an open-source agent framework, uses the word skill differently. Here, a skill is auto-generated from a workflow the user builds — and it can be triggered externally via webhook. Another system sends a POST request; the agent wakes up and executes. This is meaningfully different from a filesystem-bound session script.

Concept 3: An API-invokable capability (Claude API)

In the Claude API, skills are uploaded capabilities with unique identifiers. External systems trigger them via skill_id. Designed for integration and automation at scale, completely decoupled from any local session.

Why this matters: When someone says "we'll build a skill for that," they might mean any of the three. A developer who has been working with Claude Code thinks local workflow, no external trigger. A builder who has been working with OpenClaw thinks webhook-invokable module. A system architect thinks API-registered capability. All three agree to the plan. No two of them are building the same thing.

The practical test: Which kind of skill, in which ecosystem? That question ends the ambiguity immediately.


"Agent" — four axes that slide into each other

The word agent carries even more interpretive weight because it spans both technical architecture and organizational metaphor.

Product axis: Scout (Microsoft), Hermes (Nous Research), OpenClaw — these are specific named products with distinct architectures. Scout runs under an Entra identity with audit trails, designed for M365 organizations. Hermes builds a closed learning loop after each task — depth over breadth. OpenClaw is gateway-first, built for breadth of integration. Using "agent" to mean any of these three products leads to meaningless comparisons.

Architecture axis: Technically, an agent is an LLM in a loop with tools and memory. That is an architectural description, not a product. A Claude API conversation with tool use is architecturally agentic. So is a simple Python script that calls an LLM, checks output, and loops. Calling both "agents" obscures everything important about how they actually work.
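
To make "an LLM in a loop with tools and memory" concrete, here is a minimal sketch in Python. It is illustrative only: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and `fake_model` is a hard-coded stand-in for a real model call, not any vendor's API.

```python
from typing import Callable

# Hypothetical stand-in for an LLM call (an assumption, not a real API):
# it reads the memory so far and returns a tool request or a final answer.
def fake_model(memory: list) -> dict:
    if not any(entry.startswith("tool:add") for entry in memory):
        return {"action": "tool", "name": "add", "args": (2, 3)}
    return {"action": "finish", "answer": memory[-1].split("=")[-1].strip()}

# Tools: plain functions the loop is allowed to call on the model's behalf.
TOOLS: dict = {"add": lambda a, b: a + b}

def run_agent(task: str, model: Callable = fake_model, max_steps: int = 5) -> str:
    memory = [f"task: {task}"]            # memory: everything seen so far
    for _ in range(max_steps):            # loop: bounded so it always stops
        decision = model(memory)
        if decision["action"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["name"]](*decision["args"])   # tool use
        memory.append(f"tool:{decision['name']} = {result}")
    raise RuntimeError("step budget exhausted without a final answer")

print(run_agent("add 2 and 3"))
```

Everything that distinguishes one "agent" from another lives in the parts this sketch stubs out: which model decides, which tools it may call, and what the memory retains between steps.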

Autonomy axis: This is the one that confuses organizational discussions most. "Agent" can mean a scripted sequence of steps that looks autonomous (low autonomy, high predictability) or a genuinely self-directed process that decides its own next action based on intermediate results (high autonomy, lower predictability). The trust requirements for these two things are fundamentally different.
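
The two ends of the autonomy axis can be sketched side by side. This is a simplified illustration under stated assumptions: `policy` is a hypothetical hand-written rule standing in for a model's decision, and all function names are invented for the example.

```python
def scripted_run(steps, value):
    """Low autonomy: the order of steps is fixed before anything runs."""
    trace = []
    for step in steps:
        value = step(value)
        trace.append(step.__name__)
    return value, trace

def self_directed_run(choose_next, actions, value, max_steps=10):
    """Higher autonomy: each next action is chosen from the intermediate result."""
    trace = []
    for _ in range(max_steps):
        name = choose_next(value)
        if name is None:          # the policy decides when the work is done
            break
        value = actions[name](value)
        trace.append(name)
    return value, trace

def double(x):
    return x * 2

def increment(x):
    return x + 1

# Hypothetical policy (an assumption): stands in for a model's choice.
def policy(value):
    if value >= 10:
        return None
    return "double" if value % 2 == 0 else "increment"

print(scripted_run([double, increment], 3))
print(self_directed_run(policy, {"double": double, "increment": increment}, 3))
```

The scripted trace can be read off the code before it runs. The self-directed trace depends on the input, which is why reviewing the code alone is not enough to establish trust in it.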

Metaphor axis: "Digital employee." "Staff on a subscription." "Hired on retainer." These are sales metaphors that project human accountability structures onto software. The metaphor is powerful because it is legible to decision-makers who don't track the technical distinctions. It is also misleading because it suggests a transfer of accountability that does not actually happen.

Why this matters: An organization that says "we want an agent to handle our onboarding flow" might mean any combination of these four axes. Before building anything, the conversation needs to answer: What product or architecture? What autonomy level? Who remains accountable when it goes wrong?


"Automation" — different status in different mental models

This one is subtler because the word carries different category status depending on who is speaking.

For a practitioner: Automation is a first-class category. Make, Zapier, n8n — these are automation platforms. An automation is a defined pipeline with a trigger, steps, and an outcome. It is a stable, testable thing. The principle of automation before agents — build solid, machine-safe pipelines that an agent can start rather than building an agent that replaces the pipeline — treats automation as the dependable foundation layer.
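
In the practitioner's sense, an automation can be written down as data: a trigger, an ordered list of steps, and an outcome you can check. The sketch below is a plain-Python illustration of that shape, not how Make, Zapier, or n8n represent scenarios internally; the `Automation` class and the step functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Automation:
    name: str
    trigger: str                                   # e.g. "webhook" or "schedule"
    steps: List[Callable[[dict], dict]] = field(default_factory=list)

    def run(self, payload: dict) -> dict:
        data = dict(payload)                       # never mutate the caller's input
        for step in self.steps:                    # fixed order, reviewed in advance
            data = step(data)
        return data                                # the outcome

# Hypothetical steps for an invented signup pipeline.
def normalize_email(data: dict) -> dict:
    return {**data, "email": data["email"].strip().lower()}

def tag_source(data: dict) -> dict:
    return {**data, "source": "signup_form"}

signup = Automation("new_signup", "webhook", [normalize_email, tag_source])
print(signup.run({"email": "  Ada@Example.com "}))
```

Because the steps are fixed and deterministic, the pipeline can be tested with a known input and an expected output, which is what makes it usable as a foundation layer.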

For a process thinker: Automation is not a meaningful category. There are just processes, divided into process steps. Some steps are human, some are programmatic. Calling the programmatic ones "automation" adds no useful information — it is just describing the substrate of execution. From this perspective, "we have zero automations" might mean "we have not set up any Make scenarios" or might mean "we do not think in terms of automation as a category at all."

These are not incompatible worldviews. But they make conversations fail constantly.

When a consultant says "you need to automate this," the practitioner hears specific platform, defined trigger, testable pipeline. The process thinker hears a vague recommendation to make things more programmatic. Both nod. Neither acts on the same thing.

The Mindtastic distinction: The reason we separate automation from agents in the AI Orchestration framework is that they have different trust profiles. An automation is a workflow you have reviewed, defined, and can validate. An agent operates within that workflow, or outside it with higher autonomy and therefore higher accountability requirements. Conflating the two leads to either over-engineering (treating every Make scenario as an agent decision) or under-engineering (treating every agent decision as a simple automation with no accountability layer).


The practical consequence

False consensus costs real decisions.

An organization that approves "an agent-based approach with skills and automation" has approved almost nothing. The words are compatible enough that everyone nods, and incompatible enough that three teams can build three different things with an identical mandate and budget.

The fix is not a glossary. Glossaries require people to check them, and they don't, especially under deadline.

The fix is one question, asked early: Which kind of X, in which ecosystem?

  • Which kind of skill — local session, webhook-triggered, or API-registered?
  • Which kind of agent — which product, what autonomy level, who stays accountable?
  • Which kind of automation — a defined platform pipeline, or general programmatic processing?
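
For teams that want the answers on record, the three questions can be captured as a small structure that reports what is still undecided. This is one possible sketch, not part of any framework; every class and field name here is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class SkillKind(Enum):
    LOCAL_SESSION = "local session"
    WEBHOOK_TRIGGERED = "webhook-triggered"
    API_REGISTERED = "API-registered"

class AutomationKind(Enum):
    PLATFORM_PIPELINE = "defined platform pipeline"
    GENERAL_PROGRAMMATIC = "general programmatic processing"

@dataclass
class AgentSpec:
    product: Optional[str] = None       # which product or architecture
    autonomy: Optional[str] = None      # e.g. "scripted" or "self-directed"
    accountable: Optional[str] = None   # who answers when it goes wrong

@dataclass
class Mandate:
    skill: Optional[SkillKind] = None
    agent: Optional[AgentSpec] = None
    automation: Optional[AutomationKind] = None

    def open_questions(self) -> List[str]:
        questions = []
        if self.skill is None:
            questions.append("Which kind of skill?")
        if self.agent is None:
            questions.append("Which kind of agent?")
        else:
            for name in ("product", "autonomy", "accountable"):
                if getattr(self.agent, name) is None:
                    questions.append(f"Agent: {name} not decided")
        if self.automation is None:
            questions.append("Which kind of automation?")
        return questions

print(Mandate().open_questions())
```

An approval such as "an agent-based approach with skills and automation" corresponds to an empty `Mandate()`: all three questions are still open.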

The question takes thirty seconds. The answer tells you whether the conversation was real alignment or apparent consensus waiting to collapse.

In AI right now, apparent consensus is the norm. The technology moves fast enough that everyone is slightly behind, and no one wants to be the person who admits they are not tracking the distinctions. So people agree on words and diverge on reality.

Asking which kind is not a sign of confusion. It is the most useful thing you can do in most AI conversations.

