
Agent Skill: Three-Layer Architecture and Design Philosophy

Aaron

Introduction

I’ve been using Claude Code a lot lately, and there’s one feature I keep coming back to: Agent Skill. At first I treated it as just a “prompt archive,” but the more I used it, the more I realized the design is far more elegant than that. In this post, I’ll walk through what Agent Skill is, how to use it, the thinking behind its architecture, and how it compares to MCP, so you can pick the right tool for your use case.

What Is Agent Skill

The simplest way to think about it: Agent Skill is a reference manual that a large language model can consult at any time.

For example, if you’re building a smart customer service bot, you can write rules in a Skill like “calm the user down first when handling complaints, and never make promises you can’t keep.” If you want meeting summaries, you can specify “output must follow the format: attendees, topics, decisions.” No more pasting the same long prompt every time; the model just looks up the manual and gets to work.[1]

That said, “reference manual” is a beginner-friendly simplification. Skill can do a lot more, as we’ll see.

Basic Usage: Building a Meeting Summary Assistant

Let’s use Claude Code as an example. The first step is creating a Skill.

Claude Code expects Skills to live in the .claude/skills/ directory under your home folder. Create a folder called “meeting-summary-assistant”—the folder name becomes the Skill name. Inside, create a skill.md file.

The file has two parts:

The header is metadata, wrapped in triple dashes, with name and description fields. name must match the folder name. description tells the model what this Skill does.

The rest is the instruction, which describes the rules the model should follow. In my case, I specified that summaries must cover attendees, topics, and decisions, and included an input/output example to make sure the model really understands.
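Putting those two parts together, a minimal skill.md for this assistant might look like the following. The exact wording is my own; only the name and description frontmatter fields are required by the format:

```markdown
---
name: meeting-summary-assistant
description: Summarizes meeting transcripts. Use when the user asks to summarize a meeting.
---

When asked to summarize a meeting transcript, always output three sections, in this order:

1. Attendees: everyone recorded as present
2. Topics: one bullet per subject discussed
3. Decisions: concrete decisions made, with owners where mentioned

Example input: "Alice and Bob met to pick a database. They chose Postgres."
Example output: Attendees: Alice, Bob. Topics: database selection. Decisions: use Postgres.
```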

Once created, open Claude Code and ask “what Skills do you have?” It’ll list the one you just made. Then paste in a meeting transcript and ask it to summarize. Claude Code will recognize the request matches your “meeting-summary-assistant,” ask for permission to use it, read the skill.md, and produce a formatted summary.

Pretty intuitive.

Under the Hood: How Skill Works

Now that we’ve seen the basics, let’s think about what actually happened.

There are three actors in the flow: the user, Claude Code (the host), and the LLM behind it. Here’s the sequence:

  1. The user sends a request
  2. Claude Code sends the request along with the names and descriptions of all Skills to the LLM
  3. The LLM recognizes the request matches “meeting-summary-assistant” and tells Claude Code
  4. Claude Code reads the full skill.md of the matched Skill
  5. Claude Code sends the user request and the complete skill.md content to the LLM
  6. The LLM generates a response following the Skill’s rules

The key detail: step 2 only sends names and descriptions; step 4 reads the full content. Even if you have a dozen Skills installed, the model starts with a lightweight directory. This is Skill’s first core mechanism: lazy loading.
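To make the two-phase loading concrete, here is a toy sketch in Python of what the host does. It is an illustration of the mechanism only, not Claude Code’s actual implementation:

```python
# Toy model of lazy loading: step 2 sends only the catalog of names and
# descriptions; step 4 reads the full skill body only after the LLM picks a match.
from pathlib import Path

def build_catalog(skills_dir: Path) -> dict[str, str]:
    """Layer sent up front: one description per installed Skill."""
    catalog = {}
    for skill in skills_dir.iterdir():
        md = skill / "skill.md"
        if md.is_file():
            # Naive frontmatter scan: grab the "description:" line.
            for line in md.read_text().splitlines():
                if line.startswith("description:"):
                    catalog[skill.name] = line.split(":", 1)[1].strip()
    return catalog

def load_skill(skills_dir: Path, name: str) -> str:
    """Loaded lazily: the full skill.md body, read only on a match."""
    return (skills_dir / name / "skill.md").read_text()
```

The point of the split is visible in the signatures: `build_catalog` touches only one line per Skill, while `load_skill` pays the full token cost and is called for at most one Skill per request.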

Advanced Usage I: Reference (Conditional File Loading)

Lazy loading already saves tokens, but it can go further.

Suppose your meeting summary assistant gets more sophisticated: when a meeting involves spending money, it should flag whether expenses comply with financial policies; when contracts come up, it should note legal risks. To do this, the Skill needs to know the relevant financial rules and legal provisions. If you stuff all of that into skill.md, the file becomes bloated—even a simple technical retrospective would force the model to load pages of irrelevant financial clauses.

Can we load files with even finer granularity? For example, only load financial policies when the meeting actually discusses money?

That’s exactly what Reference solves.

Create a file like company-finance-handbook.md with expense reimbursement standards, then add a rule in skill.md: only trigger when keywords like “budget,” “procurement,” or “expense” appear; when triggered, read the handbook and flag any amounts that exceed limits.

In practice: if the meeting transcript mentions budgets, Claude Code reads skill.md, detects the financial relevance, loads the handbook, and includes financial reminders in the summary. If it’s a money-free technical retrospective, the handbook stays on disk without consuming a single token.

Reference’s core property: conditional activation. Load only when needed, stay completely untouched otherwise.
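In skill.md, such a conditional rule might read like this (the trigger keywords and the filename come from the example above; the section heading is my own):

```markdown
## Financial review (conditional)

Only when the transcript mentions keywords like "budget", "procurement",
or "expense": read company-finance-handbook.md in this Skill's folder and
flag any amount that exceeds the limits it defines. Otherwise, do not
read the handbook at all.
```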

Advanced Usage II: Script (Code Execution)

Reading files is just the first step. The real automation kicks in when Skill can run code directly. That’s where Script comes in.

Create an upload.py script for uploading files, then add a rule in skill.md: if the user mentions “upload,” “sync,” or “send to server,” the script must be executed.

When testing, Claude Code generates the meeting summary and then directly executes upload.py. Here’s the interesting part: when Claude Code requests script execution, it does not read the script’s source code. It only cares about how to run it and what the result is.

This means even a 10,000-line script with complex business logic consumes essentially zero model context.
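For illustration, a hypothetical upload.py might look like this. The endpoint URL and JSON shape are placeholders I made up, not part of the post’s setup:

```python
# Hypothetical upload.py: Claude Code executes this without ever reading it,
# so its length costs essentially no model context.
import json
import sys
import urllib.request

UPLOAD_URL = "https://example.internal/api/summaries"  # placeholder endpoint

def build_payload(path: str, text: str) -> bytes:
    """Wrap the summary file in the JSON body the (hypothetical) API expects."""
    return json.dumps({"filename": path, "content": text}).encode("utf-8")

def upload(path: str) -> int:
    with open(path, encoding="utf-8") as f:
        body = build_payload(path, f.read())
    req = urllib.request.Request(
        UPLOAD_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:  # network call to the placeholder server
        return resp.status

if __name__ == "__main__":
    print(upload(sys.argv[1]))
```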

So while Reference and Script are both advanced features, their impact on model context is fundamentally different:

  • Reference reads: loads file content into context, consuming tokens
  • Script runs: executes without reading, nearly zero context overhead

Progressive Disclosure: The Three-Layer Architecture

Tying it all together, Skill’s design is a carefully layered progressive disclosure structure with three tiers:

Layer 1: Metadata. All Skill names and descriptions, always loaded—essentially a catalog. The model scans this before every response to determine if the user’s request matches any Skill.

Layer 2: Instruction. Everything in skill.md beyond the metadata. Only loaded when the model identifies a match, hence lazy loading.

Layer 3: Resources. Includes Reference and Script (the official spec also mentions Assets, but it overlaps with Reference so we’ll skip it here). This layer only activates when the model determines specific resources are needed based on the instruction layer—it’s lazy loading on top of lazy loading, or “lazy within lazy.”

Each layer builds on the judgment of the one above it, keeping token consumption to an absolute minimum.
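Mapped onto the running example, the three layers line up with the Skill’s folder contents roughly like this (keeping the resource files at the top level is my choice, not a requirement):

```text
~/.claude/skills/meeting-summary-assistant/
├── skill.md                     # Layer 1: frontmatter, + Layer 2: instruction
├── company-finance-handbook.md  # Layer 3: Reference, loaded only when triggered
└── upload.py                    # Layer 3: Script, executed without being read
```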

How Skill Relates to Prompt Engineering

This brings up another common question: what’s the relationship between Skill and Prompt Engineering? Both seem to be about “teaching the model what to do”—so what’s the difference?

My take: they solve problems at different levels.

Prompt Engineering is about “how to think.” Its job is guiding the model toward correct understanding and reasoning—defining roles, providing context, formatting outputs, reducing hallucinations. It operates at the cognitive layer, deciding what the model should do, how to decompose problems, and whether external capabilities are needed. But prompts themselves don’t execute any real actions.

Skill is about “how to act.” It turns model decisions into executable behavior—calling functions, running scripts, reading and writing files. Skill doesn’t participate in thinking; it takes instructions and gets things done.[2]

An imperfect but memorable analogy: Prompt Engineering is like writing an onboarding manual for a new hire, explaining how to judge different situations. Skill is like handing them a toolbox so they can act on those judgments. One is the brain, the other is the hands.

With this in mind, the three-layer architecture clicks into place: Skill’s instruction layer carries the output of Prompt Engineering, while the resource layer is where real execution lives.

Skill vs. MCP: Which One to Use

After all this, you might be thinking: Skill and MCP seem kind of similar—both connect the model to the outside world.

Anthropic nailed the distinction in one sentence:

MCP connects Claude to data. Skills teach Claude what to do with that data.

MCP supplies data to the model—querying sales records, fetching shipping status. Skill teaches the model how to process that data—requiring meeting summaries to include topics, demanding reports to cite specific numbers.

You might ask: Skill can also connect to data via scripts, so why not just use Skill for everything?

Sure, it can, but “can” doesn’t mean “should.” A Swiss Army knife can chop vegetables, but nobody actually uses it for cooking. MCP is fundamentally a standalone service; Skill is fundamentally a set of instructions. They differ significantly in security, stability, and suitable use cases. Skill is better suited for lightweight scripts and simple logic, while MCP is more reliable for complex data connections.[3]

In practice, you’ll often combine the two: MCP handles data connections, Skill defines processing rules—each doing what it does best.



[1] Strictly speaking, Skill is more than a static reference manual. It supports conditional file loading and code execution, giving it dynamic capabilities.

[2] “Doesn’t participate in thinking” is relative to Prompt Engineering. Skill’s instruction layer does incorporate Prompt Engineering, but the resource layer’s Reference and Script are purely about execution, not reasoning.

[3] Skill scripts are executed directly by Claude Code without the sandboxing and permission controls that MCP provides, making them unsuitable for sensitive or high-risk data operations.