Context Engineering
The practice of engineering what goes INTO the LLM: optimizing inputs for information density, token efficiency, and model performance. The term was popularized through entities/12-factor-agents (Dex / HumanLayer) and is closely related to Andrej Karpathy’s framing of prompt engineering as an engineering discipline.
What It Is
Context Engineering is the systematic practice of designing, optimizing, and managing the input context passed to an LLM. It goes beyond simple “prompt engineering” to encompass everything the LLM sees:
- System prompts and instructions
- Conversation history / thread representation
- RAG-retrieved documents
- Tool call results
- Memory and knowledge snippets
- Structured output schemas
The core insight: the LLM’s input is fundamentally “here’s what happened so far, what’s next?” — and how you represent that “what happened so far” directly determines output quality.
Key Principles
1. Custom Formats Over Standard Message Roles
Don’t default to the OpenAI/Anthropic message format (system/user/assistant/tool). Build your own context format tailored to your use case. Examples:
<!-- XML-style format with semantic tags -->
<slack_message>From: @alex, Text: Can you deploy the backend?</slack_message>
<list_git_tags>intent: "list_git_tags"</list_git_tags>
<list_git_tags_result>tags: [{name: "v1.2.3", ...}]</list_git_tags_result>
<!-- YAML format for dense information -->
thread:
  - event: slack_message
    from: alex
    text: Can you deploy the backend?
  - event: tool_call
    intent: list_git_tags
  - event: tool_result
    tags:
      - name: v1.2.3
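One way to produce the XML-style format above is a small serializer that turns a list of events into a single prompt string. This is a minimal sketch, not a reference implementation: the event schema (a `type` plus a `data` dict) and the function names `render_event` and `render_thread` are assumptions made for illustration.

```python
def render_event(event: dict) -> str:
    # The event type doubles as the tag name, so the tag itself carries meaning.
    # One "key: value" line per field keeps the body dense and readable.
    body = "\n".join(f"{key}: {value}" for key, value in event["data"].items())
    return f"<{event['type']}>\n{body}\n</{event['type']}>"


def render_thread(events: list) -> str:
    # The whole thread becomes one string that can be sent as a single message.
    return "\n\n".join(render_event(event) for event in events)


# Hypothetical thread mirroring the example above.
thread = [
    {"type": "slack_message", "data": {"from": "@alex", "text": "Can you deploy the backend?"}},
    {"type": "list_git_tags", "data": {"intent": "list_git_tags"}},
    {"type": "list_git_tags_result", "data": {"tags": ["v1.2.3"]}},
]

prompt = render_thread(thread) + "\n\nWhat's the next step?"
```

Because the format lives in one function, switching to YAML or trimming fields per event type is a local change rather than a rewrite of the agent.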
2. Information Density First
Every token in the context window has a cost (both monetary and in model attention). Maximize signal-to-noise ratio:
- Summarize verbose tool outputs
- Drop irrelevant conversation turns
- Compress repetitive patterns
- Redact sensitive information (with markers)
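The four tactics above can be sketched as a single compaction pass over the thread. Everything here is an illustrative assumption: the `chitchat` event type, the 200-character budget, the email-only redaction rule, and plain truncation standing in for real summarization.

```python
import re

MAX_TOOL_CHARS = 200  # assumed per-result budget; tune for the use case
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def truncate(text: str, limit: int = MAX_TOOL_CHARS) -> str:
    # Stand-in for summarization: keep the head and mark how much was cut.
    if len(text) <= limit:
        return text
    return text[:limit] + f" [truncated {len(text) - limit} chars]"


def redact(text: str) -> str:
    # Leave a marker so the model knows something was removed.
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def compact(events: list) -> list:
    out = []
    for event in events:
        if event["type"] == "chitchat":  # hypothetical label for irrelevant turns
            continue
        text = redact(event["text"])
        if event["type"] == "tool_result":
            text = truncate(text)
        # Collapse consecutive duplicates into one entry with a repeat counter.
        if out and out[-1]["type"] == event["type"] and out[-1]["text"] == text:
            out[-1]["repeat"] = out[-1].get("repeat", 1) + 1
            continue
        out.append({"type": event["type"], "text": text})
    return out
```

The markers (`[truncated ...]`, `[REDACTED_EMAIL]`, `repeat`) matter as much as the removal itself: they tell the model that information was dropped rather than never present.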
3. Context Is State
Following entities/12-factor-agents Factor 5, the context window IS your execution state. If you need to know the current step, retry count, or any metadata — it should be derivable from the context, not from a separate tracking system.
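As a rough illustration of deriving state from context, the helpers below compute a retry count and the current step directly from the event list instead of reading separate counters. The event shape (`type`, `intent`) and both function names are hypothetical.

```python
from typing import Optional


def consecutive_errors(events: list) -> int:
    # Retry count = number of trailing error events; any other event resets it.
    count = 0
    for event in reversed(events):
        if event["type"] != "error":
            break
        count += 1
    return count


def last_intent(events: list) -> Optional[str]:
    # The "current step" is the most recent tool call recorded in the thread.
    for event in reversed(events):
        if event["type"] == "tool_call":
            return event["intent"]
    return None


# Hypothetical thread: one tool call followed by two failed attempts.
thread = [
    {"type": "slack_message", "text": "Can you deploy the backend?"},
    {"type": "tool_call", "intent": "list_git_tags"},
    {"type": "error", "text": "timeout"},
    {"type": "error", "text": "timeout"},
]

retries = consecutive_errors(thread)
step = last_intent(thread)
```

With this approach, serializing the thread is enough to pause and resume an agent, since nothing it needs lives outside the event list.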
4. Pre-fetch vs. Ask
From Factor 13: If you know which tools the agent will need, call them deterministically before the loop starts. This saves a full LLM round-trip and gives the model more information upfront. Then remove the pre-fetched tools from the options offered to the model, since their results are already in context.
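A minimal sketch of this pattern, assuming tools are plain Python callables held in a dict and that the pre-fetched ones take no arguments. The tool bodies, the `prepare` helper, and the `<name>_result` event naming are placeholders rather than part of any real framework.

```python
def list_git_tags() -> list:
    # Placeholder for a real tool implementation.
    return ["v1.2.3"]


def deploy_backend(tag: str) -> str:
    # Placeholder for a real tool implementation.
    return f"deployed {tag}"


TOOLS = {
    "list_git_tags": list_git_tags,
    "deploy_backend": deploy_backend,
}


def prepare(tools: dict, prefetch: list):
    """Run the named tools up front; return seed events and the remaining tools."""
    # Results go straight into the thread before the first LLM call.
    events = [{"type": f"{name}_result", "data": tools[name]()} for name in prefetch]
    # Pre-fetched tools are withheld from the model so it does not call them again.
    remaining = {name: fn for name, fn in tools.items() if name not in prefetch}
    return events, remaining


seed_events, loop_tools = prepare(TOOLS, ["list_git_tags"])
```

The agent loop would then start from `seed_events` and expose only `loop_tools` to the model.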
Relationship to Other Concepts
- concepts/agent-loop-architecture — Context Engineering is the craft of what goes INTO the loop’s prompt
- concepts/context-compression — A specific technique within Context Engineering: compressing long context to fit within model limits
- concepts/memory-system — Memory is a type of context that persists across sessions; Context Engineering handles the session-level representation
- entities/12-factor-agents Factor 3 specifically — “Own Your Context Window” is the definitive statement of this concept
- concepts/rag-systems — RAG is a source of context; how you format and inject retrieved docs is a Context Engineering decision
- entities/karpathy-llm101n — Karpathy’s work on prompt engineering as engineering
Why It Matters
The difference between a mediocre agent and a great one is often not the model, the tools, or the architecture — it’s how the context is engineered. Two agents with the same LLM and same tools can produce wildly different results based on:
- How tool results are formatted in context
- How conversation history is summarized or retained
- What information is made prominent vs. buried
- How errors are presented for self-healing (entities/12-factor-agents Factor 9)
As LLMs improve, Context Engineering becomes more important, not less — because the bottleneck shifts from model capability to input quality.