Context Engineering
π Core Principle
Find the smallest possible set of high-signal tokens that maximize the likelihood of your desired outcome.
Quality of agent output is directly proportional to quality of context you provide.
ποΈ Description
The practice of designing and optimizing the context fed to LLM agents. Includes project instructions (CLAUDE.md), tool configurations (MCP servers), reusable skills, and architectural decisions about what information goes where.
Context Engineering vs Prompt Engineering:
- Prompt Engineering β writing/organizing LLM instructions for optimal outcomes (one-time task)
- Context Engineering β curating and maintaining the optimal set of tokens during inference across multiple turns (iterative process)
Context engineering manages: system instructions, tools, MCP, external data, message history, runtime data retrieval.
π§© Key components
- CLAUDE.md / project instructions β persistent context about the project: conventions, architecture, workflows, safety rules
- MCP servers β tools that give agents access to external systems (databases, APIs, documentation)
- Skills β dynamically loaded instruction packages (Agent Skills) for specific tasks
- Sub-agents β delegating tasks to focused agents with isolated context windows
- Hooks β automated responses to agent events (pre-commit checks, post-edit validation)
βοΈ The attention budget β context rot
LLMs have an βattention budgetβ depleted as context grows:
- Every token attends to every other token (nΒ² relationships)
- As context length increases, model accuracy decreases
- Models have less training experience with longer sequences
- Context must be treated as finite resource with diminishing marginal returns
Solution: fresh sub-agents for complex tasks; treat context as currency.
System prompts: the Goldilocks zone
- Too prescriptive β β hardcoded if-else, brittle, high maintenance
- Too vague β β falsely assumes shared context, lacks direction
- Just right β β specific guidance + flexible heuristics, minimal but sufficient
Best practices: simple/direct language, distinct sections (<background_information>, <instructions>, ## Tool guidance), XML or Markdown structure, start minimal then add based on failure modes. Minimal β short.
Tools: minimal and clear
- Self-contained β single, clear purpose
- Robust to error β handle edge cases gracefully
- Extremely clear β unambiguous intended use
- Token-efficient β relevant info without bloat
- Descriptive parameters β
user_idnotuser
If a human engineer canβt definitively say which tool to use, an AI agent canβt be expected to do better.
Avoid: bloated tool sets, overlapping purposes, ambiguous decision points.
Examples: diverse, not exhaustive
β Curate diverse canonical examples β pictures worth a thousand words β Stuff in laundry list of edge cases, articulate every rule
Context retrieval strategies
- Just-in-time context (recommended for agents) β maintain lightweight identifiers (paths, queries, links), load data dynamically. Mirrors human cognition. See Progressive Disclosure.
- Pre-inference retrieval (RAG) β embedding-based retrieval before inference. Use for static content.
- Hybrid β retrieve some upfront, enable autonomous exploration. Example: Claude Code loads CLAUDE.md upfront, uses glob/grep just-in-time.
Rule of thumb: βDo the simplest thing that works.β
Long-horizon tasks: three techniques
1. Compaction
Summarize conversation nearing context limit, reinitiate with summary. Preserve architectural decisions, bugs, implementation; discard redundant tool outputs. Tune by maximizing recall first, then improving precision.
2. Structured note-taking (agentic memory)
Agent writes notes persisted outside context (to-do lists, NOTES.md, project logs). Persistent memory with minimal overhead. Best for iterative development.
3. Sub-agent architectures
Specialized sub-agents with clean context windows. Main agent coordinates plan; sub-agents explore (tens of thousands of tokens) and return condensed summaries (1-2k tokens). See Harness Engineering.
Quick decision framework
| Scenario | Recommended approach |
|---|---|
| Static content | Pre-inference retrieval or hybrid |
| Dynamic exploration | Just-in-time context |
| Extended back-and-forth | Compaction |
| Iterative development | Structured note-taking |
| Complex research | Sub-agent architectures |
β οΈ Anti-patterns
- β Cramming everything into prompts
- β Brittle if-else logic
- β Bloated tool sets
- β Exhaustive edge cases as examples
- β Assuming larger context windows solve everything
- β Ignoring context pollution over long interactions
π Key takeaways
- Context is finite β treat as precious resource with attention budget
- Think holistically β consider entire state available to LLM
- Stay minimal β more context isnβt always better
- Be iterative β context curation happens each turn
- Design for autonomy β as models improve, let them act intelligently
- Start simple β test minimal setup, add based on failure modes
Even as models improve, the challenge of maintaining coherence across extended interactions will remain central to building more effective agents.
π Related concepts
- Progressive Disclosure β the practical pattern for just-in-time context
- Harness Engineering β the practical implementation of context engineering
- Agent Skills β dynamically loaded instruction packages
- Token Optimization for Claude Code β tools that reduce context spend 40β98%
- Context window management β knowing what fits and what to prioritize
π Further reading
Agentic Coding LLM Knowledge Bases Claude Code Claude Code Best Practice Skills 2.0 Testing Software 3.0 Agentic Engineering
Source: Anthropic, Effective context engineering for AI agents (September 2025).
Template: knowledge_note_info