Why Copying Others' Claude Code Skills Doesn't Work
Cargo-culting Claude Code configurations leads to context window bloat, degraded tool selection, and mismatched workflows. A data-backed guide to intentional AI tool configuration with token budget math and progressive enhancement.
Abstract
The Claude Code ecosystem now includes 500+ community skills, 1,200+ agent skills, and dozens of MCP server directories. The temptation to clone someone's "awesome" setup is strong. But copying configurations leads to three compounding problems. First, context window bloat that leaves less room for actual code. Second, tool sprawl that degrades the model's ability to pick the right tool. Third, workflows that don't match your codebase. This post provides the token math, research data, and a practical framework for building configurations intentionally.
The Problem: Copy-Paste Configuration
Claude Code loads configuration from several sources every session. These include system prompt, tool definitions, MCP server schemas, CLAUDE.md files, skills, and memory files. Each source consumes tokens from a shared context window. The math is straightforward but often overlooked.
Token Budget: Minimal vs. Bloated
Compare what the /context command reveals for two real setups: a minimal one (2 MCP servers, a focused CLAUDE.md) and a bloated one (8+ MCP servers, copy-pasted config). The bloated setup gives up 91K tokens to overhead before the conversation starts, which is nearly half of a standard 200K context window. Those 91K tokens could have been your codebase, your error logs, or your documentation.
The Token Economics Table
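The exact figures depend on your setup; as illustrative arithmetic for a standard 200K-token window (not measured data), the economics look like this:
- 20K overhead: 180K left for your work, 90% effective capacity
- 50K overhead: 150K left, 75% effective capacity
- 91K overhead (the bloated setup above): 109K left, roughly 55% effective capacity
- 188K overhead: 12K left, 6% effective capacity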
At 6% effective capacity, the model has barely enough room to hold a single file and your question.
Why This Is Different: Silent Degradation
Unlike traditional tool configuration where wrong settings cause visible errors, AI configuration problems are silent:
- The model does not crash. It gives worse answers.
- Extra context does not throw exceptions. It subtly degrades output quality.
- Wrong skills do not fail. They produce plausible-but-wrong suggestions.
- Too many tools do not error. The model picks the wrong one more often.
This is the core challenge. There is no stack trace to debug. You just notice that Claude's suggestions are less precise, its tool choices less accurate, and its responses more generic. Most developers blame "the model having a bad day" rather than their own configuration.
Tool Selection Degradation: The Research
The relationship between tool count and accuracy is well-documented. Here is what the research shows.
TaskBench (NeurIPS 2024)
The TaskBench benchmark measured LLM accuracy across tool graphs of varying complexity:
- Single tool: 96.16% accuracy
- 6 tools: 39.31% accuracy
- 8 tools: 25.00% accuracy
Note: TaskBench measures tool graph accuracy (complete multi-step tool chains), not isolated tool selection. This makes the degradation even more relevant. Complexity compounds.
Anthropic's MCP Evaluations
Anthropic's own benchmarks on Opus 4 with MCP tools show a similar pattern:
- 50+ tools (without Tool Search): 49% accuracy
- 50+ tools (with Tool Search): 74% accuracy
Tool Search improved things, but a 26% error rate is still significant. One in four tool selections is wrong even with the mitigation enabled.
The Sweet Spot
Industry experience and benchmarks suggest 5-7 tools as the range for consistent, accurate tool selection. Beyond that, each additional tool increases the probability that the model picks the wrong one.
Tool Search: Safety Net, Not Solution
Anthropic shipped MCP Tool Search in Claude Code 2.1.7 (January 2026). When tool definitions exceed 10% of the context window, Tool Search lazy-loads tools on demand instead of including all definitions upfront. The results are meaningful:
- Token overhead reduced by 46.9-85%
- Opus 4 accuracy improved from 49% to 74% on MCP evaluations
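For scale: in a standard 200K-token window, the 10% trigger corresponds to roughly 20K tokens of tool definitions. Below that threshold nothing changes, and every definition still loads upfront as before.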
But Tool Search is a mitigation, not an invitation to install everything:
- The model still searches through tool descriptions to find matches
- Each search step costs tokens and adds latency
- Incorrect search matches waste a turn
- The model must still understand which server has the right tool
Think of Tool Search like a garbage collector. It helps manage memory pressure, but you should not write memory-leaking code just because the GC exists.
The Progressive Enhancement Approach
Instead of starting with someone else's configuration and pruning, start with nothing and add with evidence.
Phase 1: Minimal Foundation
Start with zero MCP servers and a minimal CLAUDE.md. Include only:
- Build, test, and lint commands for your project
- Architecture decisions Claude cannot infer from code
- Coding style rules specific to your team
- File naming conventions
Anthropic recommends keeping CLAUDE.md under 200 lines, ideally under 100. Every line is read every session.
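As a rough sketch, a minimal CLAUDE.md covering those four items might look like the following; the commands, package names, and rules are placeholders for your own project, not recommendations:

```markdown
# CLAUDE.md (project root)

## Commands
- Build: make build
- Test: make test (run before proposing any change)
- Lint: make lint

## Architecture
- api/ never imports from ui/; shared types live in core/.
- All database access goes through the repository layer.

## Style
- New files use snake_case names.
- Prefer table-driven tests over one-off assertions.
```

At well under 100 lines, this costs on the order of a few hundred tokens per session instead of thousands.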
Phase 2: Add with Evidence
Add an MCP server only when you find yourself repeatedly describing the same external system to Claude. That repeated description is the signal. Not a trending awesome list.
The 3-server rule is a useful starting point:
- One for version control (GitHub/GitLab)
- One for your primary data source (database, API, cloud provider)
- One domain-specific (project management, documentation, monitoring)
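When the evidence does arrive, the addition can stay small. Here is a sketch of a project-scoped .mcp.json, assuming the standard mcpServers map that Claude Code reads from the project root; the server name and package are illustrative, and auth/env configuration is omitted:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```

One entry, added because you were repeatedly pasting issue and PR context by hand. The database and domain-specific slots stay empty until they earn their place the same way.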
Phase 3: Write Custom Skills
Write skills only for tasks you have done 5+ times with similar prompts. A skill should encode your workflow, not someone else's.
Anti-pattern: Copy a generic "code review" skill from a community list. Better: Write a skill that reflects your team's review checklist, your test requirements, your architectural patterns.
The value of a skill is in its specificity. A generic skill provides no more value than Claude's built-in capabilities.
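As a sketch of what that specificity looks like, here is a hypothetical team review skill, assuming the documented SKILL.md layout of YAML frontmatter (name, description) followed by markdown instructions; every checklist item below is a placeholder for your own process:

```markdown
---
name: review-api-change
description: Review a pull request that touches the api/ package against the team checklist.
---

When reviewing an api/ change:

1. Confirm every new endpoint has an integration test under api/tests/.
2. Check that request and response types live in core/, not inline in handlers.
3. Flag any handler that touches the database without going through the repository layer.
4. Summarize findings as a checklist, most severe first.
```

None of this is useful to anyone else's team, which is exactly the point.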
Phase 4: Audit Regularly
Use /context and /mcp to track overhead. Key metrics:
- Context overhead: MCP tools + system prompt + memory as % of total context (target: under 20%)
- Tool utilization: Tools actually used vs. tools registered (target: above 60%)
- Compaction frequency: How often auto-compaction triggers in a session (fewer is better)
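As a worked example with made-up numbers: if /context reports 28K tokens of MCP tools, 4K of system prompt, and 3K of memory in a 200K window, overhead is 35K / 200K = 17.5%, just under the 20% target. If only 5 of 12 registered tools show up in a week of transcripts, utilization is about 42%, and the unused servers are candidates for disabling.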
Decision Framework
When to Add an MCP Server
Add one only when you find yourself repeatedly describing the same external system to Claude and you expect to use it at least weekly. If you have not needed it in the past week, skip it.
When to Write a Skill vs. Use a Prompt
Write a skill only for tasks you have done 5+ times with similar prompts and that follow a repeatable process. For anything less frequent, a plain prompt is enough.
When to Add to CLAUDE.md
Add only what Claude cannot infer from the code itself: build/test/lint commands, architecture decisions, team-specific style rules, and naming conventions. Task-specific instructions belong in skills or scoped CLAUDE.md files, not the root file.
Common Pitfalls
"I'll Need It Eventually"
Installing MCP servers for tools you do not use daily. A Docker MCP server with 135 tools consumes ~126K tokens. That is 63% of a standard 200K context window for a tool you use twice a week. Even with 1M token windows now available, 126K tokens of overhead is still 126K tokens not spent on your actual work.
Fix: Enable MCP servers per-session when needed, not globally. Use /mcp to toggle.
"My CLAUDE.md Covers Everything"
Writing a comprehensive CLAUDE.md that covers every possible scenario. Claude reads it every session, but only a fraction applies to any given task.
Fix: Use scoped CLAUDE.md files in subdirectories. Keep root CLAUDE.md to universal rules only. Move task-specific instructions to separate markdown files or skills.
"The Trending Setup Must Be Best"
Copying a viral Claude Code configuration without understanding why each piece exists. The original author built it for their specific needs over weeks of iteration.
Fix: Read each skill and config before adding it. Ask: "Have I needed this in the last week?" If not, do not add it.
"Skills Are Just Better Prompts"
Treating skills as prompt templates rather than workflow encoders. A good skill reflects a specific, repeatable process with your tools, your conventions, and your quality bar.
Fix: Only create skills for tasks you have done 5+ times. Base them on your actual workflow, not someone else's.
"Tool Search Fixes Everything"
Relying on Tool Search to handle unlimited MCP servers. Tool Search is a mitigation, not a solution. It still costs search turns and can return wrong tools.
Fix: Even with Tool Search, aim for fewer than 10 MCP servers with focused tool sets. Quality of tool descriptions matters more than quantity.
The CLAUDE.md Scoping Problem
CLAUDE.md files follow a hierarchy:
- ~/.claude/CLAUDE.md (global) -- loaded in every project
- ./CLAUDE.md (project root) -- loaded for this project
- ./packages/api/CLAUDE.md (subdirectory) -- loaded when working here
Copy-pasting someone's global CLAUDE.md means their React component rules apply when you are writing Go microservices. Their "always use pnpm" instruction conflicts with your project's yarn workspace. Their "write tests with Vitest" does not match your Jest setup.
What belongs in CLAUDE.md:
- Build/test/lint commands for this project
- Architecture decisions Claude cannot infer from code
- Coding style rules specific to your team
- File naming conventions
What does not belong:
- Generic programming advice ("write clean code")
- Tool instructions that belong in skills
- Rules for stacks you are not using
- Verbose explanations. Claude reads code well.
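A sketch of what scoping looks like in practice, with placeholder paths and rules: the root CLAUDE.md keeps the universal commands, and a subdirectory file carries rules that only matter there.

```markdown
# packages/api/CLAUDE.md

- Handlers return problem+json error bodies; use the shared error helper.
- Integration tests sit next to the handler they cover.
- Never import from packages/ui.
```

These three lines cost tokens only when Claude is working inside packages/api.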
Practical Audit Guide
Step 1: Check Your Token Budget
Run /context in Claude Code. Look at the breakdown. If MCP tools consume more than 15% of your context window, you are likely over-configured.
Step 2: Audit MCP Servers
Run /mcp to see per-server tool counts and token costs. Identify servers you have not used in the last week. Disable them.
Step 3: Review CLAUDE.md
Read your CLAUDE.md files (global and project-level). For each instruction, ask: "Does Claude actually need this for the tasks I do?" Remove everything that is aspirational rather than practical.
Step 4: Evaluate Skills
List the skills in your .claude/skills/ directory. For each one, ask: "Did I write this, or did I copy it? Does it reflect my workflow or someone else's?" Replace copied skills with custom ones based on your actual process.
Step 5: Set a Target
Aim for under 20% total overhead. Run /context after each change to measure progress.
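With a standard 200K window, that 20% target works out to roughly 40K tokens of combined system prompt, tool definitions, memory files, and CLAUDE.md content. Anything above that line should be earning its place every session.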
Conclusion
The best Claude Code configuration is the one you built yourself. Community skills, shared dotfiles, and awesome lists are useful for inspiration. They are not meant to be copied wholesale. Every skill, CLAUDE.md instruction, and MCP server consumes tokens from a finite budget. The research is clear: more tools means worse tool selection. More context overhead means less room for actual work. Generic configurations produce generic outputs.
Start with nothing. Add with evidence. Audit regularly. Your configuration should grow from experience, not from awesome lists.
References
- Best Practices for Claude Code - Anthropic's official guidance on CLAUDE.md structure, recommending under 200 lines
- Context Windows - Claude API Docs - Official documentation on context window mechanics and token budgeting
- Model Context Protocol and the "Too Many Tools" Problem - Analysis of how MCP tool sprawl degrades LLM performance
- Having Multiple MCP Servers Running Eats into Context Window - Issue #3036 - Community report of 67K+ tokens consumed by 7 MCP servers
- Claude Code Just Cut MCP Context Bloat by 46.9% - Analysis of Tool Search reducing token overhead from 51K to 8.5K
- AI Tool Overload: Why More Tools Mean Worse Performance - Research showing 5-7 tools as the practical upper limit for consistent accuracy
- Your MCP Servers Are Eating Your Context - Practical analysis of per-server token costs and reduction strategies
- Extend Claude with Skills - Claude Code Docs - Official documentation on SKILL.md format and skill creation best practices
- Skill Authoring Best Practices - Claude API Docs - Anthropic's guidance on writing effective skills
- TaskBench: Benchmarking Large Language Models for Task Automation (NeurIPS 2024) - Academic research showing tool graph accuracy dropping from 96% (1 tool) to 25% (8 tools)
- Feature Request: Lazy Loading for MCP Servers - Issue #7336 - Community discussion that led to the Tool Search feature
- MCP Tool Search: How Claude Code Fixed Context Window Bloat - Technical deep-dive on Tool Search internals
- MCP Tools Consume 50% of Context Tokens - Issue #13717 - User report of 98.7K tokens consumed by MCP tools
- Optimising MCP Server Context Usage in Claude Code - Developer walkthrough of auditing and optimizing MCP server token consumption