Context Engineering

Context engineering plays a foundational role in how AI agents interpret inputs, plan actions, and generate output.

Why Context Engineering Is Crucial in AI Agent Design

Context Improves Instruction Fidelity

Clear, well-structured context helps the AI agent understand what it's supposed to do and how to do it.

Without this, the agent may misunderstand goals, produce irrelevant output, or make faulty assumptions.

Context Reduces Ambiguity

Ambiguous or underspecified prompts can lead to hallucination or generic responses.

Context engineering introduces domain-specific language, task framing, and examples that constrain the model's output toward accurate, goal-relevant responses.

Context Supports Task Switching and Memory

Agents operating in dynamic environments must maintain and recall prior interactions, goals, or state.

Proper context scaffolding, such as summaries, embeddings, or state stores, allows agents to manage memory effectively and adapt their behavior over time.

Context Enables Multi-Agent Coordination

In swarm or multi-agent systems, context can be shared through a world model or a message bus.

With these methods, each agent interprets its role and tasks relative to the common context.

These methods can be key to avoiding conflicts and ensuring alignment.
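
The message-bus idea can be illustrated with a minimal sketch, assuming a simple in-process publish/subscribe design. The names `MessageBus`, `subscribe`, and `publish` are hypothetical and are not taken from any particular framework.

```python
from collections import defaultdict
from typing import Callable


class MessageBus:
    """Hypothetical in-process bus: agents publish context updates by topic."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        self.log: list[tuple[str, dict]] = []  # shared, ordered record of every message

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        self.log.append((topic, message))
        for handler in self._subscribers[topic]:
            handler(message)


bus = MessageBus()
planner_view: list[dict] = []
coder_view: list[dict] = []

# Both agents subscribe to the same topic, so they see the same decisions.
bus.subscribe("decisions", planner_view.append)
bus.subscribe("decisions", coder_view.append)

bus.publish("decisions", {"agent": "planner", "style": "pixel-art"})
```

Because every subscriber receives the same message, each agent can interpret its own task against a common record rather than a private copy.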

Context Aligns with Human Expectations

Agents such as co-pilots and assistants work with or on behalf of humans, so they must behave predictably and transparently.

Context engineering allows injection of user intent, constraints, tone, style, and preferences, which leads to more aligned and trustworthy behavior.

Context Optimizes Token Efficiency

LLMs have token limits, so the context should include only relevant data, such as recent chat history or task-relevant facts. This keeps prompts within the limit and avoids overloading the model with irrelevant detail.

Chunking strategies and retrieval-augmented generation (RAG) serve the same goal: they supply only the passages relevant to the current request.
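
One way to keep only recent chat history within a limit is sketched below. It assumes token counts can be approximated by whitespace-separated words, which is a simplification; a real system would use the model's own tokenizer. The function names `count_tokens` and `trim_history` are hypothetical.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined size fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "user: hello there",
    "agent: hi, how can I help?",
    "user: summarize the Q3 report",
]
recent = trim_history(history, budget=11)
```

With a budget of 11 words, the two newest messages fit and the oldest one is dropped.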

Context Enables Tool Use and Function Calling

Agents that can call tools or APIs need to understand when and how to invoke them.

Placing tool instructions and examples in the context, either directly or through structured prompt templates, makes function calls more reliable.
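
As a rough sketch of this idea, the example below describes a tool in a small registry, serializes that description for inclusion in the prompt, and validates a model-proposed call before running it. The registry layout, the `get_weather` tool, and the call format are all assumptions made for illustration, not the format of any specific LLM API.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call an external API."""
    return f"Weather for {city}: unknown (stub)"


TOOLS = {
    "get_weather": {
        "function": get_weather,
        "description": "Look up current weather for a city.",
        "parameters": {"city": "string"},
    }
}


def render_tool_context() -> str:
    """Serialize tool names, descriptions, and parameters for the prompt."""
    specs = [
        {"name": name, "description": t["description"], "parameters": t["parameters"]}
        for name, t in TOOLS.items()
    ]
    return json.dumps(specs, indent=2)


def dispatch(call_json: str) -> str:
    """Validate a model-proposed call, then run the matching tool."""
    call = json.loads(call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    args = call.get("arguments", {})
    missing = set(tool["parameters"]) - set(args)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return tool["function"](**args)


result = dispatch('{"name": "get_weather", "arguments": {"city": "Lisbon"}}')
```

Validating the call before execution means a malformed or unknown request fails with a clear error instead of reaching the tool.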

Context Supports Personalization and Autonomy

To act autonomously or reflect a user's personality or business process, the agent needs contextual grounding.

This may include user profile data, task history, and strategic goals.

Context Provides Robustness to Environment Changes

Context can include descriptions of the environment, such as UI layout, data schema, and system status.

When this context is structured properly, agents can adapt to changes without requiring retraining or hardcoding.

Context Enables Long-Term Strategic Behavior

Agents designed to work over long time horizons must reason not just tactically, but strategically.

Context engineering can guide them with mission statements, OKRs, or evolving roadmaps.

Consider the Ideas of Authorities on Context Engineering

Here is a look at context engineering from the viewpoints of leading authorities.

Andrej Karpathy

Karpathy emphasizes that the primary goal of context engineering is to "make the task look solvable" to the model by properly framing it within the provided input.

This includes not only supplying the right data but also packaging it with instructions, goals, and signals that maximize the model's ability to infer intent.

The benefit is an LLM response that is highly relevant, aligned with expectations, and far less prone to hallucinations.

For Karpathy, context engineering is less about hacking prompts and more about systematically shaping input to simulate understanding.

Gustavo del Rio

Gustavo del Rio positions context engineering as the architecture that underpins intelligent systems. He likens context engineering to software architecture in traditional engineering disciplines.

The benefit of this view is that it makes AI systems more robust, interpretable, and auditable.

His perspective encourages developers to think of context as a long-term asset: something you maintain, version, and evolve like any other critical component of a system.

Shashi Jagtap

Jagtap views context engineering as essential for enabling autonomous agent behavior in dynamic and uncertain environments.

Through the lens of agent architecture and his IMPACT framework, the goal is to give each agent an evolving, layered understanding of its mission, environment, and memory.

The aim is to enable situational awareness.

The IMPACT framework for designing agent architectures includes:

Integrated LLMs
Meaningful intent & goals
Plan‑driven control flows
Adaptive planning loops
Centralized persistent memory
Trust & observability mechanisms

The benefits are multifold: agents become more adaptable, rely less on hard-coded behavior, and can coordinate with other agents through shared context structures.

Fernando Peres & Daniel Lozovsky

Peres and Lozovsky argue that context engineering is replacing prompt engineering in real-world AI applications.

Their goal is to promote structured input management, such as injecting state, memory, rules, and personalization, to ensure the AI system behaves as expected across changing use cases.

The benefit is twofold: better performance with fewer retries or adjustments, and more trust from users who see consistent, useful results.

LangChain Team (incl. Mehul Gupta)

The LangChain team sees context engineering as a formal discipline that organizes the various input layers an agent must reason over.

Context engineering organizes system instructions, chat history, retrieval results, tool schemas, and dynamic memory.

One goal of the LangChain team is to provide APIs and design patterns that let developers cleanly compose and maintain this structure.

The benefit is a modular, testable agent that doesn't fall apart as it scales across tools, sessions, or tasks.

Jan Daniel Semrau

Semrau focuses on dynamic context management, particularly separating what needs to persist (long-term memory) from what should be transient (short-term working memory).

His goal is to build agents that don't overconsume tokens or confuse state, especially when interfacing with changing environments or evolving objectives.
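
The persistent versus transient split can be sketched as follows. This is an illustrative assumption about how such a separation might look, not Semrau's own implementation; the class `AgentMemory` and its methods are hypothetical.

```python
from collections import deque


class AgentMemory:
    """Hypothetical split between transient and persistent memory."""

    def __init__(self, working_size: int = 3) -> None:
        # Short-term: bounded, so old turns fall off automatically.
        self.working: deque[str] = deque(maxlen=working_size)
        # Long-term: explicit key/value facts that survive across sessions.
        self.persistent: dict[str, str] = {}

    def observe(self, turn: str) -> None:
        self.working.append(turn)

    def remember(self, key: str, value: str) -> None:
        self.persistent[key] = value

    def build_context(self) -> str:
        facts = "\n".join(f"{k}: {v}" for k, v in sorted(self.persistent.items()))
        recent = "\n".join(self.working)
        return f"Known facts:\n{facts}\n\nRecent turns:\n{recent}"


memory = AgentMemory(working_size=2)
memory.remember("preferred_language", "Python")
for turn in ["turn 1", "turn 2", "turn 3"]:
    memory.observe(turn)
```

The bounded `deque` caps how many turns enter the prompt, while facts written with `remember` stay available no matter how long the session runs.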

Juan Pavón

Pavón's agent-oriented methodology (such as INGENIAS) treats context as an explicit part of agent models, including goals, beliefs, and interaction protocols.

His goal is to formalize how agents represent and reason about their world, so that coordination and decision-making are consistent and transparent.

The benefit is traceability: teams can understand why an agent made a decision by inspecting its context and role at the time.

Context Documentation Guidelines for AI Systems

These guidelines synthesize best practices from leading experts in context engineering.

Developers should use these principles to produce clear, modular, and maintainable context documentation for LLM-powered agents.

State the Agent's Purpose and Role

Clearly define the agent's task domain, responsibilities, and intended behavior. Include how the agent is expected to interact with users, tools, or other agents.

Describe the Layers of Context

Break down the context into structured sections such as:

System Prompt / Instructions
User Input or History
Retrieved Knowledge or Tools
Memory (Short-term / Long-term)
Environment State / Sensor Data (if applicable)
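
The layers above could be represented and assembled roughly as in the following sketch. The `ContextLayers` class, its field names, and the bracketed section labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextLayers:
    """Hypothetical container mirroring the layers listed above."""
    system: str
    history: list[str] = field(default_factory=list)
    retrieved: list[str] = field(default_factory=list)
    memory: list[str] = field(default_factory=list)
    environment: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Join non-empty layers under labeled headers, in a fixed order."""
        sections = [
            ("SYSTEM", self.system),
            ("HISTORY", "\n".join(self.history)),
            ("RETRIEVED", "\n".join(self.retrieved)),
            ("MEMORY", "\n".join(self.memory)),
            ("ENVIRONMENT", "\n".join(f"{k}={v}" for k, v in self.environment.items())),
        ]
        return "\n\n".join(f"[{name}]\n{body}" for name, body in sections if body)


layers = ContextLayers(
    system="You are a billing support agent.",
    history=["user: Why was I charged twice?"],
    environment={"account_status": "active"},
)
prompt = layers.render()
```

Keeping each layer as a separate field makes it easier to document, test, and update one layer without touching the others.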

Specify Context Update Mechanisms

Document how and when the context is updated:

Time-based (e.g., session expiry)
Event-driven (e.g., user action, tool call)
Agent-driven (e.g., memory writes, learning events)

Include details about summarization, truncation, or compression methods.
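
A minimal sketch of two of these mechanisms, event-driven writes and time-based expiry, is shown below. The `SessionContext` class and the 60-second TTL are hypothetical; timestamps are passed in explicitly so the behavior is easy to follow.

```python
import time
from typing import Optional


class SessionContext:
    """Hypothetical context store with time-based and event-driven updates."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.entries: list[tuple[float, str]] = []  # (timestamp, text)

    def on_event(self, text: str, now: Optional[float] = None) -> None:
        """Event-driven update: record a user action or tool result."""
        stamp = time.time() if now is None else now
        self.entries.append((stamp, text))

    def expire(self, now: Optional[float] = None) -> int:
        """Time-based update: drop entries older than the TTL; return count dropped."""
        current = time.time() if now is None else now
        before = len(self.entries)
        self.entries = [(t, s) for t, s in self.entries if current - t <= self.ttl_seconds]
        return before - len(self.entries)


ctx = SessionContext(ttl_seconds=60)
ctx.on_event("user opened an invoice", now=0.0)
ctx.on_event("tool returned invoice total", now=50.0)
dropped = ctx.expire(now=100.0)
```

At time 100 the first entry is 100 seconds old and is dropped, while the second is 50 seconds old and is kept.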

Define Context Scope and Boundaries

Clearly outline what context is included vs. excluded. Define what data persists, what is ephemeral, and what is public vs. private in multi-agent environments.
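
The public versus private boundary in a multi-agent setting might be enforced with a filter like the one sketched here. The `ContextItem` type, the visibility labels, and the agent names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    visibility: str  # "public" or "private" (hypothetical labels)
    owner: str       # agent that produced the item


def visible_to(items: list[ContextItem], agent: str) -> list[str]:
    """Return public items plus the agent's own private items."""
    return [
        item.text
        for item in items
        if item.visibility == "public" or item.owner == agent
    ]


items = [
    ContextItem("Project goal: ship v2 docs", "public", "planner"),
    ContextItem("Planner scratch notes", "private", "planner"),
    ContextItem("Reviewer draft comments", "private", "reviewer"),
]
reviewer_context = visible_to(items, "reviewer")
```

Here the reviewer sees the shared project goal and its own drafts, but not the planner's private notes.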

Include Real Examples

Provide actual examples of:

Final constructed prompts
Tool invocation formats
Retrieval payloads

Link to Versioned Artifacts

Maintain version control for context-related artifacts:

Prompt templates
Retrieval pipelines
Tool schemas
Memory structures
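
Alongside ordinary version control, one lightweight option is to stamp each artifact with a content hash so logs can record exactly which template produced a given prompt. The sketch below assumes this hashing approach; the helper `template_version` and the template text are hypothetical.

```python
import hashlib


def template_version(template: str) -> str:
    """Derive a short, stable identifier from the template's exact text."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


PROMPT_TEMPLATE = "You are a {role}. Answer in a {tone} tone."

# Record the version next to the rendered prompt for later auditing.
record = {
    "artifact": "prompt_template",
    "version": template_version(PROMPT_TEMPLATE),
    "rendered": PROMPT_TEMPLATE.format(role="support agent", tone="concise"),
}
```

Any change to the template text, even a single character, yields a different identifier, which makes silent edits easy to detect.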

Include Context Validation Rules

Define what constitutes a valid context:

Maximum token limits
Mandatory fields
Input formatting (JSON, YAML, Markdown)
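
Rules like these can be checked mechanically before a context is sent to the model. The following sketch assumes a JSON-formatted context, a small illustrative token limit, and two mandatory fields; the constants and the `validate_context` function are hypothetical.

```python
import json

MAX_TOKENS = 50                       # assumed budget for this example
REQUIRED_FIELDS = {"system", "user"}  # assumed mandatory fields


def validate_context(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the context is valid."""
    try:
        context = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(context, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    missing = REQUIRED_FIELDS - set(context)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    # Word count is a crude stand-in for a real tokenizer.
    size = sum(len(str(value).split()) for value in context.values())
    if size > MAX_TOKENS:
        problems.append(f"context too large: {size} > {MAX_TOKENS}")
    return problems


ok = validate_context('{"system": "Be brief.", "user": "What is RAG?"}')
bad = validate_context('{"system": "Be brief."}')
```

Returning a list of problems rather than raising on the first one lets a caller report every violation at once.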

Document Assumptions and Dependencies

List key assumptions such as:

Few-shot vs. zero-shot behavior expectations
Token and latency tolerances
External tools, APIs, or services required for execution

Explain Human Interpretability Considerations

Identify which context elements are user-visible and explain how audit trails, logs, or transparency features are derived from context.

Cross-reference Use Cases and Personas

Map context components to specific:

User goals or jobs to be done
Personas interacting with the agent
Workflows or conversational patterns

Key Insights from "Don't Build Multi‑Agents"

Context Engineering Matters Most

Context engineering is about designing systems that manage and carry context intelligently across turns in long-running agent workflows; this goes beyond basic prompt engineering (cognition.ai/blog/dont-build-multi-agents).

Two Core Principles to Avoid Fragility

Share full context including agent traces: Subagents must receive the complete conversation history and decision path, not just individual task messages, to avoid miscommunication.
Actions carry implicit decisions: Without shared assumptions, subagents make conflicting decisions (e.g. mismatched visual style in the Flappy Bird example) that are hard to reconcile.
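
The first principle can be sketched as follows: every subagent prompt is built from the whole shared trace rather than from its task message alone. The `Trace` class, the `subagent_prompt` helper, and the specific decisions recorded are made up for illustration and are not taken from the Cognition article.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical shared record of messages and decisions, in order."""
    events: list[str] = field(default_factory=list)

    def add(self, actor: str, text: str) -> None:
        self.events.append(f"{actor}: {text}")


def subagent_prompt(trace: Trace, task: str) -> str:
    """Give the subagent the whole trace, not only its own task message."""
    history = "\n".join(trace.events)
    return f"Full history so far:\n{history}\n\nYour task: {task}"


trace = Trace()
trace.add("user", "Build a Flappy Bird clone.")
trace.add("lead", "Decision: use a retro pixel-art style.")
trace.add("subagent-1", "Built the background in pixel-art style.")

prompt = subagent_prompt(trace, "Build the bird sprite.")
```

Because the second subagent sees the earlier style decision and what the first subagent built, it has what it needs to produce a matching sprite.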

Risks of Naïve Multi-Agent Designs

If subagents interpret tasks differently, you can end up with incoherent or incompatible outputs (as in the Flappy Bird clone scenario) because each makes decisions without global context.

Prefer Single-Threaded Agents Initially

A "single-threaded linear agent" is often simpler, more reliable, easier to debug, and sufficient for many real-world tasks.

As tasks grow longer or more complex, context compression (e.g. using a smaller LLM to summarize earlier history) can help without requiring multi-agent architectures.
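
A rough sketch of context compression is given below. The summarizer is passed in as a plain function so that a call to a smaller model could be substituted; the `naive_summarizer` used here is only a placeholder and does not call any model.

```python
from typing import Callable


def compress_history(
    turns: list[str],
    keep_recent: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace all but the newest `keep_recent` turns with one summary line."""
    if len(turns) <= keep_recent:
        return list(turns)
    split = len(turns) - keep_recent
    older, recent = turns[:split], turns[split:]
    return [f"Summary of earlier turns: {summarize(older)}"] + recent


def naive_summarizer(turns: list[str]) -> str:
    """Placeholder for a smaller model: keeps the first 4 words of each turn."""
    return " | ".join(" ".join(turn.split()[:4]) for turn in turns)


turns = [
    "user: please refactor the parser module for speed",
    "agent: done, the parser now streams tokens",
    "user: now add unit tests",
]
compressed = compress_history(turns, keep_recent=1, summarize=naive_summarizer)
```

The newest turn is kept verbatim, and the two older turns collapse into a single summary line, so the history stays within a single agent's context.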

When Multi-Agent Might Make Sense

In contrast, Anthropic's multi-agent research system is optimized for parallel "read" or research tasks over large context spaces.

These systems use multiple agents to explore different aspects of a problem simultaneously, but they come with heavy token costs (roughly 15× those of a chat interaction, per Anthropic) and require careful orchestration and context alignment.

Strategic Takeaways

The article title "Don't Build Multi‑Agents" is intentionally provocative; the article itself critiques naïve multi-agent systems rather than ruling out all uses of them.

The recommended approach: Start with a disciplined single-agent architecture that maintains cumulative context and reasoning continuity. Only adopt multi-agent structures when truly needed, for example when tasks are highly parallel or exploratory ("read"-heavy scenarios), or when the required context truly exceeds what a single context window can manage.

Always design for shared context, traceability, and decision alignment, regardless of architecture.
