TL;DR

When building AI-powered workflows, you have two powerful patterns:

  • Skills: Add capabilities to your AI (like installing plugins)
  • Sub-Agents: Delegate tasks to AI specialists (like hiring consultants)

Anthropic's research found that a multi-agent system outperformed a single agent by 90.2% on its internal evaluation. Use skills for persistent knowledge, sub-agents for complex workflows, and a hybrid of the two for best results.


The Problem

My AI assistant kept searching the web for the same Next.js patterns and FastAPI best practices every conversation. Each search cost 5-10 seconds and ~8,000 tokens. After 10 queries, I’d burned 80,000 tokens on redundant searches.

The question: Should I create a sub-agent for validation, a skill for documentation access, or both?


What Are Skills?

Anthropic introduced Agent Skills in October 2025 to equip AI agents with specialized knowledge.

Think of skills like installing a library:

Before: Your code
After:  Your code + New Library = More capabilities

Skills extend what the AI can do by providing:

  • 📚 Domain knowledge (cached documentation)
  • 🛠️ Executable tools (validation scripts, formatters)
  • 🧠 Specialized expertise (security patterns, API references)
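
For example, a documentation skill can be little more than a folder of Markdown files with a small entry point. A hypothetical layout, modeled on the framework-docs skill used later in this post:

framework-docs/
├── SKILL.md                  # Overview and index, loaded first (~500 tokens)
├── nextjs/
│   ├── data-fetching.md      # Detailed sections, loaded only on demand
│   └── routing.md
└── fastapi/
    └── best-practices.md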

Key Innovation: Progressive Disclosure

Skills load only what’s needed, not everything at once:

User: "How do I fetch data in Next.js?"

Loads: SKILL.md (500 tokens) + nextjs/data-fetching.md (2,000 tokens)
Skips: Everything else

Result: 94% reduction in context usage (50,000 → 3,000 tokens)
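
A minimal sketch of that loading logic in Python, assuming the framework-docs cache from the layout above exists on disk. The file names come from this post's examples; the keyword index and load_skill_context helper are hypothetical, not part of any official Skills API:

from pathlib import Path

SKILL_DIR = Path(".claude/skills/framework-docs")

# Illustrative index mapping topics to detailed sections of the cached docs
SECTION_INDEX = {
    "fetch data": "nextjs/data-fetching.md",
    "routing": "nextjs/routing.md",
    "dependencies": "fastapi/best-practices.md",
}

def load_skill_context(query: str) -> str:
    """Load the small overview plus only the one section relevant to the query."""
    parts = [(SKILL_DIR / "SKILL.md").read_text()]           # always ~500 tokens
    for keyword, section in SECTION_INDEX.items():
        if keyword in query.lower():
            parts.append((SKILL_DIR / section).read_text())  # ~2,000 tokens
            break                                            # skip everything else
    return "\n\n".join(parts)

# "How do I fetch data in Next.js?" -> SKILL.md + nextjs/data-fetching.md only
context = load_skill_context("How do I fetch data in Next.js?")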

What Are Sub-Agents?

Sub-agents are specialized AI workers that execute complex tasks independently and return condensed results.

Think of sub-agents like delegating to a specialist consultant:

You: "I need a comprehensive code review"
Consultant: *Goes away, does deep analysis*
Consultant: *Returns executive summary*

Sub-agents:

  • 🧠 Have their own context window (isolation)
  • 🎯 Specialize in specific tasks
  • ⚡ Can run in parallel
  • 📊 Return condensed summaries

Example: Code validation

  1. Main AI spawns validate-nextjs sub-agent
  2. Sub-agent processes 20,000 tokens in isolated context
  3. Returns a condensed 2,500-token report
  4. Main conversation stays clean (87.5% context savings)
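
In code, the pattern looks roughly like the sketch below. This is illustrative Python, not the actual Claude Code API; run_model and summarize are placeholder stand-ins for however your platform calls the model and condenses its output:

def run_model(system: str, prompt: str) -> str:
    """Placeholder for a model call made inside the sub-agent's own context."""
    return f"[{system}] reviewed {len(prompt):,} characters of input"

def summarize(findings: str, max_tokens: int) -> str:
    """Placeholder for condensing raw findings into a short report."""
    return findings[: max_tokens * 4]           # rough 4-chars-per-token heuristic

def run_subagent(task: str, payload: str) -> str:
    """Run a specialist over a large payload in isolation; return only a summary."""
    raw = run_model(
        system="You are a Next.js validation specialist.",
        prompt=f"{task}\n\n{payload}",          # the full ~20,000-token payload
    )
    return summarize(raw, max_tokens=2_500)     # only this reaches the main chat

report = run_subagent("Validate against Next.js best practices", payload="...project source...")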

The Research: Why Multi-Agent Matters

In Anthropic's internal evaluation, a multi-agent system with Claude Opus 4 as the orchestrator and Claude Sonnet 4 as sub-agents outperformed single-agent Claude Opus 4 by 90.2%.

Why Sub-Agents Outperform:

  • Context Management: Isolated context per task, return only summaries
  • Parallelization: Run multiple tasks simultaneously
  • Specialization: Each agent is excellent at one thing rather than merely okay at everything

Example: Code review across 3 languages

Single Agent:  30 seconds, 60,000 tokens
Multi-Agent:   10 seconds, 6,000 tokens (3x faster, 90% fewer tokens)
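
A rough sketch of the parallel version, using asyncio with a simulated validate helper in place of a real sub-agent call (the sleep just stands in for each review's runtime):

import asyncio

async def validate(language: str) -> str:
    """Stand-in for spawning one validation sub-agent."""
    await asyncio.sleep(1)                      # pretend each review takes a while
    return f"{language}: no critical issues"

async def review_sequentially() -> list[str]:
    # Single agent: total time is the sum of all three reviews.
    return [await validate(lang) for lang in ("TypeScript", "Python", "Go")]

async def review_in_parallel() -> list[str]:
    # Three sub-agents at once: total time is the slowest single review.
    return await asyncio.gather(*(validate(lang) for lang in ("TypeScript", "Python", "Go")))

print(asyncio.run(review_in_parallel()))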

When to Use Skills vs Sub-Agents

Use Skills When You Need:

Persistent Capabilities

  • Documentation access in every conversation
  • Company-specific knowledge base
  • Internal API references

Reducing Redundant Operations

  • Stop searching web for same patterns
  • Cached framework best practices
  • Reusable code snippets

Progressive Knowledge

  • Large knowledge bases (load sections on demand)
  • API documentation
  • Design system components

Example Use Cases: Framework docs, coding standards, API references, security best practices


Use Sub-Agents When You Need:

Complex Workflows

  • Multi-step validation
  • Code review with prioritization
  • Iterative fix-validate loops

Parallelization

  • Run tests + build + deploy simultaneously
  • Validate multiple languages at once
  • Process large datasets in chunks

Context Isolation

  • Keep main conversation clean
  • Process large PR comments
  • Deep analysis without polluting context

Example Use Cases: Code validation pipelines, PR review workflows, security audits, performance optimization


The Hybrid Architecture

Neither approach alone was optimal for my documentation caching problem:

  • Skills only: Great for access, but who updates the cache?
  • Sub-agents only: Great for validation, but can’t access docs in normal chat

Solution: Combine them

.claude/
├── skills/
│   └── framework-docs/         # Skill: Knowledge access
└── agents/
    ├── refresh-docs.md         # Sub-agent: Maintenance
    └── validate-all.md         # Sub-agent: Uses skill

How It Works:

Interactive Use:
User: "What's the Next.js data fetching pattern?"
AI: *Uses framework-docs skill* → Instant, 3,000 tokens

Automated Validation:
User: "Validate my code"
Main AI: *Spawns validate-all sub-agent*
Sub-agent: *Uses framework-docs skill* → 2,500 tokens

Monthly Maintenance:
User: "Refresh framework docs"
AI: *Spawns refresh-docs sub-agent* → Updates skill cache
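
Conceptually, the refresh-docs sub-agent's job fits in a few lines of Python. This is a sketch: the source URLs are placeholders, and distill stands in for however you condense raw pages into cache-friendly Markdown:

from pathlib import Path
from urllib.request import urlopen

SKILL_DIR = Path(".claude/skills/framework-docs")

# Placeholder sources -- point these at the docs you actually want to cache.
DOC_SOURCES = {
    "nextjs/data-fetching.md": "https://example.com/nextjs/data-fetching",
    "fastapi/best-practices.md": "https://example.com/fastapi/best-practices",
}

def distill(raw_page: str) -> str:
    """Placeholder: condense a raw page into short, cache-friendly Markdown."""
    return raw_page[:8_000]

def refresh_docs() -> None:
    """Re-fetch each source and overwrite the skill's cached section files."""
    for relative_path, url in DOC_SOURCES.items():
        target = SKILL_DIR / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as response:
            target.write_text(distill(response.read().decode("utf-8")))

if __name__ == "__main__":
    refresh_docs()    # run monthly, or let the refresh-docs sub-agent invoke it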

Performance Comparison

Token Usage (10 Queries)

Approach                        Tokens    vs Baseline
Web search every time           80,000    Baseline
Skill (cached docs)             12,000    85% reduction
Sub-agent (batch validation)    2,500     97% reduction

Time (10 Queries)

Approach                  Time          vs Baseline
Web search every time     50 seconds    Baseline
Skill (cached docs)       Instant       100% faster
Sub-agent (parallel)      5 seconds     90% faster

Accuracy

Approach                 Consistency                    Offline
Web search               ❌ Varies by search ranking    ❌ No
Skill (cached docs)      ✅ Always same                 ✅ Yes
Sub-agent                ✅ Uses skill data             ✅ Yes

Key Lessons

1. Progressive Disclosure is Essential

❌ Bad: Load all 50,000 tokens of docs
✅ Good: Load overview (500 tokens), then section on demand (2,000 tokens)

2. Let Sub-Agents Maintain Skills

✅ Skill: framework-docs (provides knowledge)
✅ Sub-agent: refresh-docs (maintains the skill)
✅ Sub-agent: validate-all (uses the skill)

3. Hybrid Beats Solo

Skills only: Great access, manual updates
Sub-agents only: Great workflows, no persistent knowledge
Hybrid: Best of both (85%+ efficiency gains)

4. Parallelize with Sub-Agents

In Anthropic's evaluation, the multi-agent setup was 90.2% better than a single agent:

Sequential: Validate TS → Validate Py → Validate Go (30s)
Parallel: Spawn 3 sub-agents simultaneously (10s, 3x faster)

When NOT to Use This Architecture

Skip Skills If:

❌ Information changes constantly
❌ One-time use (not worth setup overhead)
❌ Simple queries (web search is fine for rare lookups)

Skip Sub-Agents If:

❌ Task is trivial (single-step, no complexity)
❌ Need full context (sub-agents return summaries)
❌ Can’t parallelize (sequential dependencies)

Skip Hybrid If:

❌ Simple use case (don’t over-engineer)
❌ No maintenance needed (static documentation)
❌ Rare access (setup cost > benefit)


Conclusion

The solution wasn’t choosing Skills OR Sub-Agents. It was understanding:

  • Skills = Capabilities (persistent knowledge, tools)
  • Sub-Agents = Tasks (workflows, validation)
  • Hybrid = Synergy (skills provide knowledge, agents do work)

Results:

  • ✅ 85% reduction in tokens (80,000 → 12,000)
  • ✅ 100% faster responses (instant vs 5-10 seconds)
  • ✅ Offline capable (cached docs)
  • ✅ Consistent answers
  • ✅ Automated maintenance

Key Insight:

“Skills extend what your AI can do. Sub-agents delegate what your AI should do. Use skills for knowledge, sub-agents for work, and hybrid for comprehensive solutions.”