Why Copying Others' Claude Code Skills Doesn't Work

Cargo-culting Claude Code configurations leads to context window bloat, degraded tool selection, and mismatched workflows. A data-backed guide to intentional AI tool configuration with token budget math and progressive enhancement.

Abstract

The Claude Code ecosystem now includes 500+ community skills, 1,200+ agent skills, and dozens of MCP server directories. The temptation to clone someone's "awesome" setup is strong. But copying configurations leads to three compounding problems. First, context window bloat that leaves less room for actual code. Second, tool sprawl that degrades the model's ability to pick the right tool. Third, workflows that don't match your codebase. This post provides the token math, research data, and a practical framework for building configurations intentionally.

The Problem: Copy-Paste Configuration

Claude Code loads configuration from several sources every session. These include system prompt, tool definitions, MCP server schemas, CLAUDE.md files, skills, and memory files. Each source consumes tokens from a shared context window. The math is straightforward but often overlooked.

Token Budget: Minimal vs. Bloated

Here is what the /context command reveals for two real setups:

Minimal setup (2 MCP servers, focused CLAUDE.md):

System prompt:        2.6K tokens   (1.3%)
System tools:        11.6K tokens   (5.8%)
MCP tools:            5.7K tokens   (2.8%)
Custom agents:          69 tokens   (0.0%)
Memory files:          743 tokens   (0.4%)
Skills:                 61 tokens   (0.0%)
Autocompact buffer:  33.0K tokens  (16.5%)
Free space:         146.2K tokens  (73.1%)

Bloated setup (8+ MCP servers, copy-pasted config):

System prompt:        2.6K tokens   (1.3%)
System tools:        17.6K tokens   (8.8%)
MCP tools:           82.0K tokens  (41.0%)
Custom agents:        1.3K tokens   (0.7%)
Memory files:         7.4K tokens   (3.7%)
Skills:               1.0K tokens   (0.5%)
Autocompact buffer:  33.0K tokens  (16.5%)
Free space:          55.1K tokens  (27.5%)

The bloated setup loses 91K tokens to overhead. That is nearly half the context window consumed before the conversation starts. Those 91K tokens could have been your codebase, your error logs, or your documentation.

The Token Economics Table

| Configuration | MCP Token Cost | Available for Work | Effective Capacity |
|---|---|---|---|
| Zero MCP servers | 0 tokens | ~155K tokens | 100% |
| 2-3 focused servers | ~6K tokens | ~149K tokens | 96% |
| 8+ copy-pasted servers | ~82K tokens | ~73K tokens | 47% |
| 15+ "awesome list" servers | ~145K tokens | ~10K tokens | 6% |

At 6% effective capacity, the model has barely enough room to hold a single file and your question.
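The table's percentages follow from simple arithmetic. Here is a minimal sketch, assuming the numbers from the /context dumps above: a 200K window, a 33K autocompact buffer, and roughly 12K of system prompt plus built-in tool definitions.

```python
# Sketch of the effective-capacity math behind the table above.
# Assumption: ~12K of fixed system prompt + built-in tools, per the
# /context breakdowns earlier in this post.

WINDOW = 200_000
AUTOCOMPACT = 33_000
SYSTEM = 12_000  # system prompt + built-in tool definitions

def effective_capacity(mcp_tokens: int) -> float:
    """Fraction of workable context left after MCP tool schemas load."""
    usable = WINDOW - AUTOCOMPACT - SYSTEM  # ~155K with zero MCP servers
    return (usable - mcp_tokens) / usable

for mcp in (0, 6_000, 82_000, 145_000):
    print(f"{mcp // 1000:>3}K MCP overhead -> {effective_capacity(mcp):.0%} effective")
```

Running this reproduces the table's right-hand column: 100%, 96%, 47%, and 6%.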

Why This Is Different: Silent Degradation

Unlike traditional tool configuration where wrong settings cause visible errors, AI configuration problems are silent:

  • The model does not crash. It gives worse answers.
  • Extra context does not throw exceptions. It subtly degrades output quality.
  • Wrong skills do not fail. They produce plausible-but-wrong suggestions.
  • Too many tools do not error. The model picks the wrong one more often.

This is the core challenge. There is no stack trace to debug. You just notice that Claude's suggestions are less precise, its tool choices less accurate, and its responses more generic. Most developers blame "the model having a bad day" rather than their own configuration.

Tool Selection Degradation: The Research

The relationship between tool count and accuracy is well-documented. Here is what the research shows.

TaskBench (NeurIPS 2024)

The TaskBench benchmark measured LLM accuracy across tool graphs of varying complexity:

  • Single tool: 96.16% accuracy
  • 6 tools: 39.31% accuracy
  • 8 tools: 25.00% accuracy

Note: TaskBench measures tool graph accuracy (complete multi-step tool chains), not isolated tool selection. This makes the degradation even more relevant. Complexity compounds.

Anthropic's MCP Evaluations

Anthropic's own benchmarks on Opus 4 with MCP tools show a similar pattern:

  • 50+ tools (without Tool Search): 49% accuracy
  • 50+ tools (with Tool Search): 74% accuracy

Tool Search improved things, but a 26% error rate is still significant. One in four tool selections is wrong even with the mitigation enabled.

The Sweet Spot

Industry experience and benchmarks suggest 5-7 tools as the range for consistent, accurate tool selection. Beyond that, each additional tool increases the probability that the model picks the wrong one.

Tool Search: Safety Net, Not Solution

Anthropic shipped MCP Tool Search in Claude Code 2.1.7 (January 2026). When tool definitions exceed 10% of the context window, Tool Search lazy-loads tools on demand instead of including all definitions upfront. The results are meaningful:

  • Token overhead reduced by 46.9-85%
  • Opus 4 accuracy improved from 49% to 74% on MCP evaluations

But Tool Search is a mitigation, not an invitation to install everything:

  • The model still searches through tool descriptions to find matches
  • Each search step costs tokens and adds latency
  • Incorrect search matches waste a turn
  • The model must still understand which server has the right tool

Think of Tool Search like a garbage collector. It helps manage memory pressure, but you should not write memory-leaking code just because the GC exists.

The Progressive Enhancement Approach

Instead of starting with someone else's configuration and pruning, start with nothing and add with evidence.

Phase 1: Minimal Foundation

Start with zero MCP servers and a minimal CLAUDE.md. Include only:

  • Build, test, and lint commands for your project
  • Architecture decisions Claude cannot infer from code
  • Coding style rules specific to your team
  • File naming conventions

Anthropic recommends keeping CLAUDE.md under 200 lines, ideally under 100. Every line is read every session.
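As an illustration, a minimal CLAUDE.md covering those four categories might look like the following sketch. The commands, paths, and rules are hypothetical; substitute your own.

```markdown
# CLAUDE.md

## Commands
- Build: `pnpm build`
- Test: `pnpm test` (single file: `pnpm test path/to/file.test.ts`)
- Lint: `pnpm lint --fix`

## Architecture
- `packages/api` talks to Postgres only through `packages/db`; never import `pg` directly.
- Feature flags live in `config/flags.ts` and are read-only at runtime.

## Style
- Prefer named exports; default exports only for page components.
- Test files sit next to source files as `*.test.ts`.
```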

Phase 2: Add with Evidence

Add an MCP server only when you find yourself repeatedly describing the same external system to Claude. That repeated description is the signal. Not a trending awesome list.

The 3-server rule is a useful starting point:

  1. One for version control (GitHub/GitLab)
  2. One for your primary data source (database, API, cloud provider)
  3. One domain-specific (project management, documentation, monitoring)
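Wiring up the 3-server rule with the `claude mcp add` command might look like the sketch below. The transports, endpoints, and package names are illustrative examples only; check each server's own documentation for the exact invocation.

```shell
# 1. Version control: a hosted GitHub MCP endpoint over HTTP (URL illustrative)
claude mcp add --transport http github https://api.githubcopilot.com/mcp/

# 2. Primary data source: a Postgres server launched on demand via stdio
claude mcp add postgres -- npx -y @modelcontextprotocol/server-postgres "$DATABASE_URL"

# 3. Domain-specific: your issue tracker, docs, or monitoring (one, not all three)
claude mcp add linear --transport sse https://mcp.linear.app/sse
```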

Phase 3: Write Custom Skills

Write skills only for tasks you have done 5+ times with similar prompts. A skill should encode your workflow, not someone else's.

Anti-pattern: Copy a generic "code review" skill from a community list. Better: Write a skill that reflects your team's review checklist, your test requirements, your architectural patterns.

The value of a skill is in its specificity. A generic skill provides no more value than Claude's built-in capabilities.
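As a sketch of what "encoding your workflow" means in practice, here is a hypothetical team-specific skill. The layout follows Claude Code's `.claude/skills/<name>/SKILL.md` convention; the checklist items are invented examples of team-specific conventions.

```markdown
---
name: review-api-endpoint
description: Review a new API endpoint against our team checklist
---

When reviewing an endpoint in this repo:

1. Check the handler validates input with our shared `validate()` helper.
2. Confirm an integration test exists under `tests/api/`.
3. Verify the OpenAPI spec in `docs/openapi.yaml` was updated.
4. Flag any new query that lacks an index listed in `docs/indexes.md`.
```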

Phase 4: Audit Regularly

Use /context and /mcp to track overhead. Key metrics:

  • Context overhead: MCP tools + system prompt + memory as % of total context (target: under 20%)
  • Tool utilization: Tools actually used vs. tools registered (target: above 60%)
  • Compaction frequency: How often auto-compaction triggers in a session (fewer is better)
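The first two metrics are easy to compute from numbers read off /context and /mcp. A minimal sketch, using the bloated setup's figures from earlier in this post (the utilization inputs are illustrative):

```python
# Audit metrics from the /context and /mcp readouts (entered manually).

def context_overhead(mcp: int, system_prompt: int, memory: int,
                     total: int = 200_000) -> float:
    """MCP tools + system prompt + memory as % of total context (target < 20)."""
    return (mcp + system_prompt + memory) / total * 100

def tool_utilization(used: int, registered: int) -> float:
    """Share of registered tools actually invoked (target > 60)."""
    return used / registered * 100

# Bloated setup from earlier: 82.0K MCP, 2.6K system prompt, 7.4K memory
print(f"overhead: {context_overhead(82_000, 2_600, 7_400):.0f}%")
# Hypothetical: 9 of 60 registered tools actually used this week
print(f"utilization: {tool_utilization(9, 60):.0f}%")
```

For the bloated setup this reports 46% overhead, more than double the 20% target.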

Decision Framework

When to Add an MCP Server

| Signal | Action |
|---|---|
| You describe the same external system to Claude 3+ times/week | Add a focused MCP server for it |
| You saw it on an awesome list and it looks interesting | Skip it |
| Your team all needs the same external tool access | Add it to project config |
| You used it once in the last month | Disable or remove it |
| A single server registers 50+ tools | Look for a more focused alternative |

When to Write a Skill vs. Use a Prompt

| Signal | Action |
|---|---|
| You have done this task 5+ times with similar prompts | Write a custom skill |
| Someone shared a skill that matches your workflow | Adapt it to your conventions, do not copy it |
| The task is generic (e.g., "review this code") | Use Claude's built-in capabilities |
| The task requires your project's specific conventions | Write a skill that encodes those conventions |
| The skill would exceed 200 lines | Split into focused sub-skills |

When to Add to CLAUDE.md

| Signal | Action |
|---|---|
| Claude keeps making the same mistake | Add a specific rule to prevent it |
| The instruction applies to every task in this project | Keep it in root CLAUDE.md |
| The instruction applies to one subdirectory only | Move it to a scoped CLAUDE.md |
| You copied it from another project's CLAUDE.md | Remove it unless it applies to your project |
| Your CLAUDE.md exceeds 200 lines | Audit and move non-universal rules to skills |

Common Pitfalls

"I'll Need It Eventually"

Installing MCP servers for tools you do not use daily. A Docker MCP server with 135 tools consumes ~126K tokens. That is 63% of a standard 200K context window for a tool you use twice a week. Even with 1M token windows now available, 126K tokens of overhead is still 126K tokens not spent on your actual work.

Fix: Enable MCP servers per-session when needed, not globally. Use /mcp to toggle.

"My CLAUDE.md Covers Everything"

Writing a comprehensive CLAUDE.md that covers every possible scenario. Claude reads it every session, but only a fraction applies to any given task.

Fix: Use scoped CLAUDE.md files in subdirectories. Keep root CLAUDE.md to universal rules only. Move task-specific instructions to separate markdown files or skills.

"It Worked for Them"

Copying a viral Claude Code configuration without understanding why each piece exists. The original author built it for their specific needs over weeks of iteration.

Fix: Read each skill and config before adding it. Ask: "Have I needed this in the last week?" If not, do not add it.

"Skills Are Just Better Prompts"

Treating skills as prompt templates rather than workflow encoders. A good skill reflects a specific, repeatable process with your tools, your conventions, and your quality bar.

Fix: Only create skills for tasks you have done 5+ times. Base them on your actual workflow, not someone else's.

"Tool Search Fixes Everything"

Relying on Tool Search to handle unlimited MCP servers. Tool Search is a mitigation, not a solution. It still costs search turns and can return wrong tools.

Fix: Even with Tool Search, aim for fewer than 10 MCP servers with focused tool sets. Quality of tool descriptions matters more than quantity.

The CLAUDE.md Scoping Problem

CLAUDE.md files follow a hierarchy:

  • ~/.claude/CLAUDE.md (global) -- loaded in every project
  • ./CLAUDE.md (project root) -- loaded for this project
  • ./packages/api/CLAUDE.md (subdirectory) -- loaded when working here
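Pictured as a repository layout (directory names are illustrative):

```
~/.claude/CLAUDE.md            # global: loaded in every project
repo/
├── CLAUDE.md                  # project root: universal build/test/style rules
└── packages/
    └── api/
        └── CLAUDE.md          # scoped: loaded only when working in packages/api/
```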

Copy-pasting someone's global CLAUDE.md means their React component rules apply when you are writing Go microservices. Their "always use pnpm" instruction conflicts with your project's yarn workspace. Their "write tests with Vitest" does not match your Jest setup.

What belongs in CLAUDE.md:

  • Build/test/lint commands for this project
  • Architecture decisions Claude cannot infer from code
  • Coding style rules specific to your team
  • File naming conventions

What does not belong:

  • Generic programming advice ("write clean code")
  • Tool instructions that belong in skills
  • Rules for stacks you are not using
  • Verbose explanations. Claude reads code well.

Practical Audit Guide

Step 1: Check Your Token Budget

Run /context in Claude Code. Look at the breakdown. If MCP tools consume more than 15% of your context window, you are likely over-configured.

Step 2: Audit MCP Servers

Run /mcp to see per-server tool counts and token costs. Identify servers you have not used in the last week. Disable them.

Step 3: Review CLAUDE.md

Read your CLAUDE.md files (global and project-level). For each instruction, ask: "Does Claude actually need this for the tasks I do?" Remove everything that is aspirational rather than practical.

Step 4: Evaluate Skills

List the skills in your .claude/skills/ directory. For each one, ask: "Did I write this, or did I copy it? Does it reflect my workflow or someone else's?" Replace copied skills with custom ones based on your actual process.

Step 5: Set a Target

Aim for under 20% total overhead. Run /context after each change to measure progress.

Conclusion

The best Claude Code configuration is the one you built yourself. Community skills, shared dotfiles, and awesome lists are useful for inspiration. They are not meant to be copied wholesale. Every skill, CLAUDE.md instruction, and MCP server consumes tokens from a finite budget. The research is clear: more tools means worse tool selection. More context overhead means less room for actual work. Generic configurations produce generic outputs.

Start with nothing. Add with evidence. Audit regularly. Your configuration should grow from experience, not from awesome lists.
