TL;DR
When building AI-powered workflows, you have two powerful patterns:
- Skills: Add capabilities to your AI (like installing plugins)
- Sub-Agents: Delegate tasks to AI specialists (like hiring consultants)
This post explores when to use each, backed by Anthropic research in which a multi-agent system outperformed a single agent by 90.2%.
Key Takeaway: Use skills for persistent knowledge (cached documentation), sub-agents for complex workflows (validation, analysis), and hybrid for best results.
The Problem: Choosing the Wrong Abstraction
I recently faced a common challenge: My AI assistant kept searching the web for the same Next.js patterns, FastAPI best practices, and Anthropic documentation every single conversation.
Each search cost:
- ⏱️ 5-10 seconds of latency
- 🎫 ~8,000 tokens per query
- 🔄 Inconsistent results (search rankings change)
- 💻 Requires internet connection
After 10 queries, I’d burned 80,000 tokens on redundant searches.
I knew I needed to cache this documentation, but the question was: What’s the right architecture?
Should I:
- Create a sub-agent that validates code against cached docs?
- Build a skill that makes docs available in every conversation?
- Use a hybrid approach?
Turns out, understanding the distinction between Skills and Sub-Agents completely changed my approach.
What Are Skills? (New from Anthropic, October 2025)
Anthropic introduced Agent Skills in October 2025 as a way to equip AI agents with specialized knowledge and capabilities.
The Mental Model
Think of skills like installing a library or plugin:
Before: Your code
After: Your code + New Library = More capabilities
Skills extend what the AI can do by providing:
- 📚 Domain knowledge (cached documentation)
- 🛠️ Executable tools (validation scripts, formatters)
- 🧠 Specialized expertise (security patterns, API references)
How Skills Work
.claude/skills/
└── framework-docs/
    ├── SKILL.md              # Skill definition
    ├── nextjs/
    │   ├── standards.md      # Cached Next.js docs
    │   └── patterns.md
    ├── fastapi/
    │   └── standards.md      # Cached FastAPI docs
    └── scripts/
        └── validate.sh       # Executable validation
When you ask: “What’s the Next.js App Router pattern?”
The AI:
- Loads the `framework-docs` skill
- Reads `nextjs/standards.md` (progressive disclosure)
- Answers instantly from cached knowledge
- No web search needed
The Key Innovation: Progressive Disclosure
Skills don’t dump everything into context. They load only what’s needed:
User: "How do I fetch data in Next.js?"
Loads ONLY:
├── SKILL.md (500 tokens) # Overview
└── nextjs/data-fetching.md (2,000 tokens) # Relevant section
SKIPS:
├── nextjs/routing.md
├── nextjs/server-components.md
└── fastapi/* (not needed for this query)
Result: 94% reduction in context usage (50,000 tokens → 3,000 tokens)
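The selective loading above can be sketched in a few lines. This is a hypothetical illustration, not the actual skill loader: the keyword index and file names are assumptions, standing in for whatever mapping SKILL.md defines.

```python
# Hypothetical keyword -> doc-file index; a real skill would derive this
# from SKILL.md or from the cached file names.
KEYWORD_TO_DOC = {
    "fetch": "nextjs/data-fetching.md",
    "rout": "nextjs/routing.md",
    "server component": "nextjs/server-components.md",
}

def select_docs(query: str) -> list[str]:
    """Return only the doc files relevant to the query, plus the overview."""
    q = query.lower()
    selected = ["SKILL.md"]  # always load the small overview first
    selected += [doc for kw, doc in KEYWORD_TO_DOC.items() if kw in q]
    return selected

print(select_docs("How do I fetch data in Next.js?"))
# -> ['SKILL.md', 'nextjs/data-fetching.md']
```

The point is the shape, not the matching logic: the overview is always cheap to load, and everything else stays on disk until a query actually needs it.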
What Are Sub-Agents?
Sub-agents are specialized AI workers that execute complex tasks independently and return condensed results.
The Mental Model
Think of sub-agents like delegating to a specialist consultant:
You: "I need a comprehensive code review"
Consultant: *Goes away, does deep analysis*
Consultant: *Returns executive summary*
Sub-agents:
- 🧠 Have their own context window (isolation)
- 🎯 Specialize in specific tasks (code review, validation, refactoring)
- ⚡ Can run in parallel (multiple specialists working simultaneously)
- 📊 Return condensed summaries (not full details)
How Sub-Agents Work
.claude/agents/
├── validate-nextjs.md # Autonomous validation
├── coderabbit-pr-fixer.md # Fix PR issues systematically
└── refresh-docs.md # Update cached documentation
When you ask: “Validate my Next.js code against best practices”
The main AI:
- Spawns a `validate-nextjs` sub-agent
- The sub-agent runs independently with its own context
- The sub-agent reads files and validates code (20,000 tokens in its context)
- Returns a condensed report (2,000 tokens)
- The main conversation stays clean (87.5% context savings)
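The arithmetic behind that savings figure is worth making explicit. The 500-token spawn overhead is my assumption for how the 87.5% is reached; the 20,000 and 2,000 figures come from the workflow above.

```python
# The sub-agent burns ~20,000 tokens in its own isolated window, but the
# main conversation only pays for the spawn prompt and the condensed
# report (assumed split: 500 + 2,000 tokens).
sub_agent_context = 20_000      # isolated, discarded after the task
main_context = 500 + 2_000      # spawn instruction + returned summary
savings = 1 - main_context / sub_agent_context
print(f"{savings:.1%}")         # -> 87.5%
```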
The Research: Why Multi-Agent Matters
Anthropic’s research on multi-agent systems revealed a stunning finding:
Multi-agent systems with Claude Opus 4 (orchestrator) + Claude Sonnet 4 (sub-agents) outperformed single-agent Claude Opus 4 by 90.2%
Why Sub-Agents Outperform Single Agents
Context Management:
- Single agent: Maintains full context across entire task (burns tokens)
- Sub-agents: Isolated context per task, return only summaries
Parallelization:
- Single agent: Sequential execution (slow)
- Sub-agents: Run multiple tasks simultaneously (fast)
Specialization:
- Single agent: Generalist (okay at everything)
- Sub-agents: Specialists (excellent at one thing)
Example: Code review across 3 languages
Single Agent:
Review TypeScript → Review Python → Review Go
Total time: 30 seconds
Context: 60,000 tokens (all 3 reviews in context)
Multi-Agent:
Spawn 3 sub-agents in parallel:
├── validate-typescript.md
├── validate-python.md
└── validate-go.md
Total time: 10 seconds (3x faster)
Context: 6,000 tokens (only summaries in main context)
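The parallel speedup is easy to demonstrate with any independent tasks. This is an illustrative sketch, not the sub-agent API: `validate` is a stand-in that sleeps instead of reviewing code, and threads stand in for spawned agents.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def validate(language: str) -> str:
    time.sleep(0.1)  # stand-in for one language's review work
    return f"{language}: ok"

languages = ["typescript", "python", "go"]

start = time.perf_counter()
sequential = [validate(lang) for lang in languages]  # one after another
seq_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(validate, languages))   # all three at once
par_time = time.perf_counter() - start

assert sequential == parallel  # same results, roughly a third of the time
print(f"sequential {seq_time:.2f}s, parallel {par_time:.2f}s")
```

Because the three reviews share nothing, the wall-clock time collapses to the slowest single task, which is exactly the 30s-to-10s claim above.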
When to Use Skills vs Sub-Agents
Use Skills When You Need:
✅ Persistent Capabilities
- Documentation access in every conversation
- Company-specific knowledge base
- Internal API references
✅ Reducing Redundant Operations
- Stop searching web for same patterns
- Cached framework best practices
- Reusable code snippets
✅ Progressive Knowledge
- Large knowledge bases (load sections on demand)
- API documentation
- Design system components
Example Use Cases:
✓ Framework documentation (Next.js, React, FastAPI)
✓ Company coding standards
✓ Internal API references
✓ Security best practices
✓ Design system patterns
Use Sub-Agents When You Need:
✅ Complex Workflows
- Multi-step validation
- Code review with prioritization
- Iterative fix-validate loops
✅ Parallelization
- Run tests + build + deploy simultaneously
- Validate multiple languages at once
- Process large datasets in chunks
✅ Context Isolation
- Keep main conversation clean
- Process large PR comments (tens of thousands of tokens)
- Deep analysis without polluting context
Example Use Cases:
✓ Code validation pipelines
✓ PR review and fix workflows
✓ Security audits
✓ Performance optimization
✓ Codebase refactoring
The Hybrid Architecture: Best of Both Worlds
For my documentation caching problem, neither approach alone was optimal:
- Skills only: Great for access, but who updates the cache?
- Sub-agents only: Great for validation, but can’t access docs in normal chat
Solution: Combine them.
The Architecture
.claude/
├── skills/
│   └── framework-docs/            # ← Skill: Knowledge access
│       ├── SKILL.md
│       ├── nextjs/standards.md
│       ├── fastapi/standards.md
│       └── anthropic/agent-standards.md
└── agents/
    ├── refresh-docs.md            # ← Sub-agent: Maintenance
    └── validate-all.md            # ← Sub-agent: Uses skill for validation
How It Works Together
Interactive Use (Skill shines):
User: "What's the Next.js data fetching pattern?"
AI: *Uses framework-docs skill*
AI: "Here's the pattern from cached docs..."
Time: Instant
Tokens: 3,000 (vs 8,000 with web search)
Automated Validation (Sub-agent + Skill):
User: "Validate my code against all standards"
Main AI: *Spawns validate-all sub-agent*
Sub-agent: *Loads framework-docs skill*
Sub-agent: *Validates in isolated context*
Sub-agent: *Returns condensed report*
Time: Fast (parallel processing)
Tokens: 2,500 in main context (vs 20,000 without sub-agent)
Maintenance (Sub-agent maintains Skill):
User: "Refresh framework docs" (monthly)
Main AI: *Spawns refresh-docs sub-agent*
Sub-agent: *Fetches latest from web*
Sub-agent: *Updates framework-docs skill cache*
Result: Skill now has fresh data for next conversation
Real-World Example: Documentation Caching
Let me show you the before/after of implementing this architecture.
Before: Web Search Hell
Conversation 1:
User: "How do I fetch data in Next.js?"
AI: *WebSearch: "Next.js data fetching 2025"*
AI: *Parses 5,000 tokens of search results*
AI: "Here's what I found..."
Cost: 8,000 tokens, 5 seconds
Conversation 2 (tomorrow):
User: "What's the Next.js App Router pattern?"
AI: *WebSearch again* (doesn't remember)
AI: *Parses results again*
AI: "Here's what I found..."
Cost: 8,000 tokens, 5 seconds
10 queries: 80,000 tokens, 50 seconds
After: Hybrid Architecture
One-Time Setup:
# Create skill structure
.claude/skills/framework-docs/
├── SKILL.md
└── nextjs/
    ├── standards.md
    └── patterns.md
# Create refresh agent
.claude/agents/refresh-docs.md
# Initial fetch (run once)
User: "Refresh framework docs"
AI: *Spawns refresh-docs sub-agent*
Sub-agent: *Fetches Next.js docs*
Sub-agent: *Saves to framework-docs skill*
Every Conversation After:
Conversation 1:
User: "How do I fetch data in Next.js?"
AI: *Loads framework-docs skill*
AI: *Reads nextjs/standards.md*
AI: "Here's the pattern from cached docs..."
Cost: 3,000 tokens, instant
Conversation 2 (tomorrow):
User: "What's the Next.js App Router pattern?"
AI: *Already has skill loaded*
AI: *Reads nextjs/patterns.md*
AI: "Here's the pattern..."
Cost: 1,000 tokens (skill already loaded), instant
10 queries: 12,000 tokens (85% reduction), instant
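Checking the token math above for 10 queries, using the per-query costs from the two conversations (8,000 with web search; 3,000 for the first skill load, then 1,000 per cached read):

```python
web_search_tokens = 10 * 8_000       # every query re-searches
skill_tokens = 3_000 + 9 * 1_000     # first load, then cheap cached reads
reduction = 1 - skill_tokens / web_search_tokens
print(web_search_tokens, skill_tokens, f"{reduction:.0%}")
# -> 80000 12000 85%
```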
Monthly Maintenance:
User: "Refresh framework docs"
AI: *Spawns refresh-docs sub-agent*
Sub-agent: *Updates cache*
Result: Fresh data for next month
Performance Comparison
Let’s quantify the impact:
Token Usage (10 Queries)
| Approach | Tokens | vs Baseline |
|---|---|---|
| Web search every time | 80,000 | Baseline |
| Skill (cached docs) | 12,000 | 85% reduction |
| Sub-agent (batch validation) | 2,500 | 97% reduction |
Time (10 Queries)
| Approach | Time | vs Baseline |
|---|---|---|
| Web search every time | 50 seconds | Baseline |
| Skill (cached docs) | Near-instant | ~100% reduction |
| Sub-agent (parallel) | 5 seconds | 90% reduction |
Accuracy
| Approach | Consistency | Offline |
|---|---|---|
| Web search | ❌ Varies by search ranking | ❌ No |
| Skill (cached docs) | ✅ Always same | ✅ Yes |
| Sub-agent | ✅ Uses skill data | ✅ Yes |
Implementation Guide
Step 1: Create the Skill
.claude/skills/framework-docs/SKILL.md:
---
name: framework-docs
description: Access cached framework documentation
version: 1.0.0
---
# Framework Documentation Skill
This skill provides instant access to frequently-used framework documentation.
## Available Frameworks
- Next.js (App Router, Server Components, Data Fetching)
- FastAPI (Async patterns, Pydantic, Testing)
- Anthropic (Agent authoring best practices)
## Usage
Reference docs by framework:
- Next.js: See `nextjs/standards.md`
- FastAPI: See `fastapi/standards.md`
- Anthropic: See `anthropic/agent-standards.md`
## Last Updated
Check `last-updated.json` for cache freshness.
Step 2: Create the Refresh Sub-Agent
.claude/agents/refresh-docs.md:
---
name: refresh-docs
description: Update framework documentation cache
tools: Read, Write, WebFetch, WebSearch, Bash
---
You are a documentation caching specialist. Your job is to fetch the latest
framework documentation and update the framework-docs skill cache.
## Steps
1. Check current cache dates in `last-updated.json`
2. For each framework:
- Fetch latest documentation from official sources
- Extract best practices and patterns
- Update corresponding markdown file
- Update timestamp in `last-updated.json`
3. Report what was updated and cache freshness
## Sources
- Next.js: https://nextjs.org/docs
- FastAPI: https://fastapi.tiangolo.com
- Anthropic: https://docs.anthropic.com
## Output Format
Provide summary:
- ✅ Framework updated
- 📅 Last fetched date
- 📝 Major changes (if any)
Step 3: Create the Validation Sub-Agent
.claude/agents/validate-all.md:
---
name: validate-all
description: Validate code against all framework standards
tools: Read, Bash, Grep
---
You are a code validation specialist. Validate code against cached
framework standards from the framework-docs skill.
## Steps
1. Load the framework-docs skill
2. Identify which frameworks are used in the project
3. Read relevant standard files from the skill
4. Validate code against standards
5. Categorize issues by severity (Critical/Major/Minor)
6. Return condensed report with specific line numbers
## Output Format
**Framework**: Next.js
**Files Validated**: 12
**Issues Found**: 3
Critical:
- file.ts:42 - Using Pages Router instead of App Router
Major:
- file.ts:15 - Not using Server Components for data fetching
Minor:
- file.ts:8 - Could use parallel data fetching pattern
Step 4: Use It
# One-time setup
User: "Refresh framework docs"
AI: *Runs refresh-docs sub-agent*
# Interactive use (any conversation)
User: "What's the Next.js data fetching pattern?"
AI: *Uses framework-docs skill*
AI: "Here's the pattern..."
# Validation workflow
User: "Validate my Next.js code"
AI: *Spawns validate-all sub-agent*
AI: *Sub-agent uses framework-docs skill*
AI: "Here's your validation report..."
# Monthly maintenance
User: "Refresh framework docs"
AI: *Updates cache for next month*
Lessons Learned
1. Progressive Disclosure is Key
Don’t load everything into context. Use skills’ progressive disclosure:
❌ Bad: Load all 50,000 tokens of docs
✅ Good: Load overview (500 tokens), then section on demand (2,000 tokens)
2. Let Sub-Agents Maintain Skills
Skills provide capabilities. Sub-agents do work:
✅ Skill: framework-docs (provides knowledge)
✅ Sub-agent: refresh-docs (maintains the skill)
✅ Sub-agent: validate-all (uses the skill)
3. Hybrid Beats Solo
Neither skills nor sub-agents alone were optimal:
Skills only: Great access, but manual updates
Sub-agents only: Great workflows, but no persistent knowledge
Hybrid: Best of both (85%+ efficiency gains)
4. Version Your Cache
Track when docs were fetched:
{
"nextjs": {
"fetched": "2025-10-18",
"version": "15.0",
"ttl_days": 60
}
}
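Given that schema, a staleness check is a one-liner against `ttl_days`. A minimal sketch, assuming the entry format above:

```python
from datetime import date, timedelta

def is_stale(entry: dict, today: date) -> bool:
    """True when the cache entry is older than its TTL."""
    fetched = date.fromisoformat(entry["fetched"])
    return today - fetched > timedelta(days=entry["ttl_days"])

entry = {"fetched": "2025-10-18", "version": "15.0", "ttl_days": 60}
print(is_stale(entry, date(2025, 11, 1)))  # within 60 days -> False
print(is_stale(entry, date(2026, 1, 1)))   # 75 days later  -> True
```

The refresh-docs sub-agent can run this check first and skip frameworks that are still fresh, so a monthly refresh only fetches what actually expired.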
5. Parallelize with Sub-Agents
In Anthropic's research, multi-agent systems outperformed a single agent by 90.2%:
Sequential: Validate TS → Validate Py → Validate Go (30s)
Parallel: Spawn 3 sub-agents simultaneously (10s, 3x faster)
When NOT to Use This Architecture
This hybrid approach isn’t always the answer:
Skip Skills If:
- ❌ Information changes constantly (don't cache volatile data)
- ❌ One-time use (not worth the setup overhead)
- ❌ Simple queries (web search is fine for rare lookups)
Skip Sub-Agents If:
- ❌ Task is trivial (single-step, no complexity)
- ❌ You need full context (sub-agents return summaries)
- ❌ You can't parallelize (sequential dependencies)
Skip Hybrid If:
- ❌ Simple use case (don't over-engineer)
- ❌ No maintenance needed (static documentation)
- ❌ Rare access (setup cost > benefit)
The Future: Agent Skills Ecosystem
Anthropic introduced skills in October 2025. This is just the beginning.
I predict we’ll see:
Skill Marketplaces:
- "react-native-docs" skill (community-maintained)
- "company-api-standards" skill (enterprise)
- "security-patterns" skill (security teams)
Skill Composition:
.claude/skills/
├── framework-docs/ # General docs
├── company-kb/ # Company-specific
└── project-patterns/ # Project-specific
AI: *Composes all three for comprehensive knowledge*
Skill Versioning:
framework-docs@2.0.0 (latest)
framework-docs@1.5.0 (stable)
Automated Skill Updates:
Sub-agent: refresh-docs runs automatically monthly
Skill: framework-docs always fresh without manual intervention
Conclusion
When I started this journey, I was frustrated by redundant web searches burning tokens and time.
The solution wasn’t choosing Skills OR Sub-Agents.
It was understanding:
- Skills = Capabilities (persistent knowledge, tools)
- Sub-Agents = Tasks (workflows, validation)
- Hybrid = Synergy (skills provide knowledge, agents do work)
Results:
- ✅ 85% reduction in tokens (80,000 → 12,000)
- ✅ 100% faster responses (instant vs 5-10 seconds)
- ✅ Offline capable (cached docs)
- ✅ Consistent answers (not affected by search ranking)
- ✅ Automated maintenance (refresh-docs sub-agent)
The Key Insight:
“Skills extend what your AI can do. Sub-agents delegate what your AI should do. Use skills for knowledge, sub-agents for work, and hybrid for comprehensive solutions.”
If you’re building AI-powered workflows and facing similar challenges, I hope this architecture saves you as much time and tokens as it saved me.
Resources
- Anthropic: Equipping Agents with Skills
- Claude Agent SDK
- My Knowledge Base Pattern (ACE Framework)
- Implementation Example
About the Author
Bishnu Bista is a software engineer building AI-powered developer tools. He maintains an open-source knowledge base following the ACE (Agentic Context Engineering) framework, documenting learnings from 5+ projects on agent-based workflows.
Discussion: What’s your experience with Skills vs Sub-Agents? Have you found other patterns that work well? Share in the comments below!
Updates: This post will be updated as Skills pattern matures and new best practices emerge. Last updated: October 18, 2025.