Why Copying Others' Claude Code Skills Doesn't Work
Cargo-culting Claude Code configurations leads to context window bloat, degraded tool selection, and mismatched workflows. A data-backed guide to intentional AI tool configuration with token budget math and progressive enhancement.
Abstract
The Claude Code ecosystem now includes 500+ community skills, 1,200+ agent skills, and dozens of MCP server directories. The temptation to clone someone's "awesome" setup is strong. But copying configurations leads to three compounding problems. First, context window bloat that leaves less room for actual code. Second, tool sprawl that degrades the model's ability to pick the right tool. Third, workflows that don't match your codebase. This post provides the token math, research data, and a practical framework for building configurations intentionally.
The Problem: Copy-Paste Configuration
Claude Code loads configuration from several sources every session. These include system prompt, tool definitions, MCP server schemas, CLAUDE.md files, skills, and memory files. Each source consumes tokens from a shared context window. The math is straightforward but often overlooked.
Token Budget: Minimal vs. Bloated
Compare what the /context command reveals for two real setups: a minimal one (2 MCP servers, a focused CLAUDE.md) and a bloated one (8+ MCP servers, copy-pasted config). The bloated setup gives up 91K tokens to overhead before the conversation starts, which is nearly half of a standard 200K context window. Those 91K tokens could have been your codebase, your error logs, or your documentation.
The Token Economics Table
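The exact figures depend on your setup; as illustrative arithmetic for a standard 200K-token window (not measured data), the economics look like this:
- 20K overhead: 180K left for your work, 90% effective capacity
- 50K overhead: 150K left, 75% effective capacity
- 91K overhead (the bloated setup above): 109K left, roughly 55% effective capacity
- 188K overhead: 12K left, 6% effective capacity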
At 6% effective capacity, the model has barely enough room to hold a single file and your question.
Why This Is Different: Silent Degradation
Unlike traditional tool configuration where wrong settings cause visible errors, AI configuration problems are silent:
- The model does not crash. It gives worse answers.
- Extra context does not throw exceptions. It subtly degrades output quality.
- Wrong skills do not fail. They produce plausible-but-wrong suggestions.
- Too many tools do not error. The model picks the wrong one more often.
This is the core challenge. There is no stack trace to debug. You just notice that Claude's suggestions are less precise, its tool choices less accurate, and its responses more generic. Most developers blame "the model having a bad day" rather than their own configuration.
Tool Selection Degradation: The Research
The relationship between tool count and accuracy is well-documented. Here is what the research shows.
TaskBench (NeurIPS 2024)
The TaskBench benchmark measured LLM accuracy across tool graphs of varying complexity:
- Single tool: 96.16% accuracy
- 6 tools: 39.31% accuracy
- 8 tools: 25.00% accuracy
Note: TaskBench measures tool graph accuracy (complete multi-step tool chains), not isolated tool selection. This makes the degradation even more relevant. Complexity compounds.
Anthropic's MCP Evaluations
Anthropic's own benchmarks on Opus 4 with MCP tools show a similar pattern:
- 50+ tools (without Tool Search): 49% accuracy
- 50+ tools (with Tool Search): 74% accuracy
Tool Search improved things, but a 26% error rate is still significant. One in four tool selections is wrong even with the mitigation enabled.
The Sweet Spot
Industry experience and benchmarks suggest 5-7 tools as the range for consistent, accurate tool selection. Beyond that, each additional tool increases the probability that the model picks the wrong one.
Tool Search: Safety Net, Not Solution
Anthropic shipped MCP Tool Search in Claude Code 2.1.7 (January 2026). When tool definitions exceed 10% of the context window, Tool Search lazy-loads tools on demand instead of including all definitions upfront. The results are meaningful:
- Token overhead reduced by 46.9-85%
- Opus 4 accuracy improved from 49% to 74% on MCP evaluations
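For scale: in a standard 200K-token window, the 10% trigger corresponds to roughly 20K tokens of tool definitions. Below that threshold nothing changes, and every definition still loads upfront as before.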
But Tool Search is a mitigation, not an invitation to install everything:
- The model still searches through tool descriptions to find matches
- Each search step costs tokens and adds latency
- Incorrect search matches waste a turn
- The model must still understand which server has the right tool
Think of Tool Search like a garbage collector. It helps manage memory pressure, but you should not write memory-leaking code just because the GC exists.
The Progressive Enhancement Approach
Instead of starting with someone else's configuration and pruning, start with nothing and add with evidence.
Phase 1: Minimal Foundation
Start with zero MCP servers and a minimal CLAUDE.md. Include only:
- Build, test, and lint commands for your project
- Architecture decisions Claude cannot infer from code
- Coding style rules specific to your team
- File naming conventions
Anthropic recommends keeping CLAUDE.md under 200 lines, ideally under 100. Every line is read every session.
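As a rough sketch, a minimal CLAUDE.md covering those four items might look like the following; the commands, package names, and rules are placeholders for your own project, not recommendations:

```markdown
# CLAUDE.md (project root)

## Commands
- Build: make build
- Test: make test (run before proposing any change)
- Lint: make lint

## Architecture
- api/ never imports from ui/; shared types live in core/.
- All database access goes through the repository layer.

## Style
- New files use snake_case names.
- Prefer table-driven tests over one-off assertions.
```

At well under 100 lines, this costs on the order of a few hundred tokens per session instead of thousands.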
Phase 2: Add with Evidence
Add an MCP server only when you find yourself repeatedly describing the same external system to Claude. That repeated description is the signal. Not a trending awesome list.
The 3-server rule is a useful starting point:
- One for version control (GitHub/GitLab)
- One for your primary data source (database, API, cloud provider)
- One domain-specific (project management, documentation, monitoring)
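When the evidence does arrive, the addition can stay small. Here is a sketch of a project-scoped .mcp.json, assuming the standard mcpServers map that Claude Code reads from the project root; the server name and package are illustrative, and auth/env configuration is omitted:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```

One entry, added because you were repeatedly pasting issue and PR context by hand. The database and domain-specific slots stay empty until they earn their place the same way.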
Phase 3: Write Custom Skills
Write skills only for tasks you have done 5+ times with similar prompts. A skill should encode your workflow, not someone else's.
Anti-pattern: Copy a generic "code review" skill from a community list. Better: Write a skill that reflects your team's review checklist, your test requirements, your architectural patterns.
The value of a skill is in its specificity. A generic skill provides no more value than Claude's built-in capabilities.
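As a sketch of what that specificity looks like, here is a hypothetical team review skill, assuming the documented SKILL.md layout of YAML frontmatter (name, description) followed by markdown instructions; every checklist item below is a placeholder for your own process:

```markdown
---
name: review-api-change
description: Review a pull request that touches the api/ package against the team checklist.
---

When reviewing an api/ change:

1. Confirm every new endpoint has an integration test under api/tests/.
2. Check that request and response types live in core/, not inline in handlers.
3. Flag any handler that touches the database without going through the repository layer.
4. Summarize findings as a checklist, most severe first.
```

None of this is useful to anyone else's team, which is exactly the point.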
Phase 4: Audit Regularly
Use /context and /mcp to track overhead. Key metrics:
- Context overhead: MCP tools + system prompt + memory as % of total context (target: under 20%)
- Tool utilization: Tools actually used vs. tools registered (target: above 60%)
- Compaction frequency: How often auto-compaction triggers in a session (fewer is better)
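As a worked example with made-up numbers: if /context reports 28K tokens of MCP tools, 4K of system prompt, and 3K of memory in a 200K window, overhead is 35K / 200K = 17.5%, just under the 20% target. If only 5 of 12 registered tools show up in a week of transcripts, utilization is about 42%, and the unused servers are candidates for disabling.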
Decision Framework
When to Add an MCP Server
Add one only when you find yourself repeatedly describing the same external system to Claude and you expect to use it at least weekly. If you have not needed it in the past week, skip it.
When to Write a Skill vs. Use a Prompt
Write a skill only for tasks you have done 5+ times with similar prompts and that follow a repeatable process. For anything less frequent, a plain prompt is enough.
When to Add to CLAUDE.md
Add only what Claude cannot infer from the code itself: build/test/lint commands, architecture decisions, team-specific style rules, and naming conventions. Task-specific instructions belong in skills or scoped CLAUDE.md files, not the root file.
Common Pitfalls
"I'll Need It Eventually"
Installing MCP servers for tools you do not use daily. A Docker MCP server with 135 tools consumes ~126K tokens. That is 63% of a standard 200K context window for a tool you use twice a week. Even with 1M token windows now available, 126K tokens of overhead is still 126K tokens not spent on your actual work.
Fix: Enable MCP servers per-session when needed, not globally. Use /mcp to toggle.
"My CLAUDE.md Covers Everything"
Writing a comprehensive CLAUDE.md that covers every possible scenario. Claude reads it every session, but only a fraction applies to any given task.
Fix: Use scoped CLAUDE.md files in subdirectories. Keep root CLAUDE.md to universal rules only. Move task-specific instructions to separate markdown files or skills.
"The Trending Setup Must Be Best"
Copying a viral Claude Code configuration without understanding why each piece exists. The original author built it for their specific needs over weeks of iteration.
Fix: Read each skill and config before adding it. Ask: "Have I needed this in the last week?" If not, do not add it.
"Skills Are Just Better Prompts"
Treating skills as prompt templates rather than workflow encoders. A good skill reflects a specific, repeatable process with your tools, your conventions, and your quality bar.
Fix: Only create skills for tasks you have done 5+ times. Base them on your actual workflow, not someone else's.
"Tool Search Fixes Everything"
Relying on Tool Search to handle unlimited MCP servers. Tool Search is a mitigation, not a solution. It still costs search turns and can return wrong tools.
Fix: Even with Tool Search, aim for fewer than 10 MCP servers with focused tool sets. Quality of tool descriptions matters more than quantity.
The CLAUDE.md Scoping Problem
CLAUDE.md files follow a hierarchy:
- ~/.claude/CLAUDE.md (global) -- loaded in every project
- ./CLAUDE.md (project root) -- loaded for this project
- ./packages/api/CLAUDE.md (subdirectory) -- loaded when working here
Copy-pasting someone's global CLAUDE.md means their React component rules apply when you are writing Go microservices. Their "always use pnpm" instruction conflicts with your project's yarn workspace. Their "write tests with Vitest" does not match your Jest setup.
What belongs in CLAUDE.md:
- Build/test/lint commands for this project
- Architecture decisions Claude cannot infer from code
- Coding style rules specific to your team
- File naming conventions
What does not belong:
- Generic programming advice ("write clean code")
- Tool instructions that belong in skills
- Rules for stacks you are not using
- Verbose explanations. Claude reads code well.
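A sketch of what scoping looks like in practice, with placeholder paths and rules: the root CLAUDE.md keeps the universal commands, and a subdirectory file carries rules that only matter there.

```markdown
# packages/api/CLAUDE.md

- Handlers return problem+json error bodies; use the shared error helper.
- Integration tests sit next to the handler they cover.
- Never import from packages/ui.
```

These three lines cost tokens only when Claude is working inside packages/api.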
Practical Audit Guide
Step 1: Check Your Token Budget
Run /context in Claude Code. Look at the breakdown. If MCP tools consume more than 15% of your context window, you are likely over-configured.
Step 2: Audit MCP Servers
Run /mcp to see per-server tool counts and token costs. Identify servers you have not used in the last week. Disable them.
Step 3: Review CLAUDE.md
Read your CLAUDE.md files (global and project-level). For each instruction, ask: "Does Claude actually need this for the tasks I do?" Remove everything that is aspirational rather than practical.
Step 4: Evaluate Skills
List the skills in your .claude/skills/ directory. For each one, ask: "Did I write this, or did I copy it? Does it reflect my workflow or someone else's?" Replace copied skills with custom ones based on your actual process.
Step 5: Set a Target
Aim for under 20% total overhead. Run /context after each change to measure progress.
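With a standard 200K window, that 20% target works out to roughly 40K tokens of combined system prompt, tool definitions, memory files, and CLAUDE.md content. Anything above that line should be earning its place every session.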
Conclusion
The best Claude Code configuration is the one you built yourself. Community skills, shared dotfiles, and awesome lists are useful for inspiration. They are not meant to be copied wholesale. Every skill, CLAUDE.md instruction, and MCP server consumes tokens from a finite budget. The research is clear: more tools means worse tool selection. More context overhead means less room for actual work. Generic configurations produce generic outputs.
Start with nothing. Add with evidence. Audit regularly. Your configuration should grow from experience, not from awesome lists.
References
- Best Practices for Claude Code - Anthropic's official guidance on CLAUDE.md structure, recommending under 200 lines
- Context Windows - Claude API Docs - Official documentation on context window mechanics and token budgeting
- Model Context Protocol and the "Too Many Tools" Problem - Analysis of how MCP tool sprawl degrades LLM performance
- Having Multiple MCP Servers Running Eats into Context Window - Issue #3036 - Community report of 67K+ tokens consumed by 7 MCP servers
- Claude Code Just Cut MCP Context Bloat by 46.9% - Analysis of Tool Search reducing token overhead from 51K to 8.5K
- AI Tool Overload: Why More Tools Mean Worse Performance - Research showing 5-7 tools as the practical upper limit for consistent accuracy
- Your MCP Servers Are Eating Your Context - Practical analysis of per-server token costs and reduction strategies
- Extend Claude with Skills - Claude Code Docs - Official documentation on SKILL.md format and skill creation best practices
- Skill Authoring Best Practices - Claude API Docs - Anthropic's guidance on writing effective skills
- TaskBench: Benchmarking Large Language Models for Task Automation (NeurIPS 2024) - Academic research showing tool graph accuracy dropping from 96% (1 tool) to 25% (8 tools)
- Feature Request: Lazy Loading for MCP Servers - Issue #7336 - Community discussion that led to the Tool Search feature
- MCP Tool Search: How Claude Code Fixed Context Window Bloat - Technical deep-dive on Tool Search internals
- MCP Tools Consume 50% of Context Tokens - Issue #13717 - User report of 98.7K tokens consumed by MCP tools
- Optimising MCP Server Context Usage in Claude Code - Developer walkthrough of auditing and optimizing MCP server token consumption