Summary
- This presentation by Lance from LangChain provides an overview of "context engineering" in AI agent development, emphasizing its importance in managing the context window to optimize agent performance.
- The discussion covers four primary strategies: writing, selecting, compressing, and isolating context, with real-world examples and references to popular agent frameworks.
- The latter part of the talk highlights how LangGraph, a low-level orchestration framework, supports these context management techniques, offering tools for state management, memory, tool selection, and modular agent design.
Action Items
(no explicit action items, assignments, or deadlines were discussed in the transcript)
Introduction to Context Engineering
- Context engineering is described as the art and science of efficiently filling the context window with the most relevant information for each agent step.
- The need for context engineering arises due to the limited capacity (context window) of large language models (LLMs), which must be managed to prevent failures such as confusion or hallucination.
- Context types include instructions, memories, few-shot examples, tool feedback, and external knowledge.
Challenges in Agent Context Management
- Agents often handle long-running or complex tasks and frequently use tool calls, leading to increased context size and management challenges.
- Accumulating tool feedback and maintaining relevant task history can saturate the context window.
- Potential failure modes include context poisoning, context distraction, context confusion, and context clash, all of which can degrade agent responses.
Four Core Strategies for Context Engineering
1. Writing Context
- Involves saving information outside the context window (e.g., using scratch pads or persistent memory) for agents to reference later.
- Examples include:
- Scratch pads: Temporary note-taking for ongoing tasks (e.g., Anthropic’s multi-agent researcher).
- Memories: Persisting data across sessions (e.g., Generative Agents, ChatGPT’s memory, Cursor, Windsurf).
2. Selecting Context
- Pertains to retrieving and injecting the most relevant information into the context window to support task completion.
- Types of memories (semantic, episodic, procedural) can be selectively pulled based on task needs.
- Selection strategies include:
- Rules files or style guidelines for instructions.
- Embedding-based similarity search or graph databases for facts and memory retrieval.
- Semantic search over tool descriptions to manage large tool sets (retrieval-augmented generation, or RAG).
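Semantic tool selection can be sketched as a similarity search over tool descriptions. A real system would use learned embeddings; here a toy bag-of-words vector and cosine similarity stand in, and the tool names and descriptions are invented for illustration.

```python
# Toy sketch of selecting tools via similarity search over descriptions.
# Bag-of-words vectors stand in for real embeddings.

import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

TOOLS = {
    "web_search": "search the web for recent news and articles",
    "calculator": "evaluate arithmetic and math expressions",
    "file_reader": "read and summarize local files",
}

def select_tools(query: str, k: int = 1) -> list[str]:
    # Rank all tools by similarity to the query; expose only the top k.
    ranked = sorted(TOOLS, key=lambda name: cosine(embed(query), embed(TOOLS[name])), reverse=True)
    return ranked[:k]

print(select_tools("what is 2 plus 2 arithmetic"))  # selects the calculator
```

Only the selected tool descriptions are then placed in the prompt, keeping large tool sets out of the context window.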
3. Compressing Context
- Focuses on retaining only essential tokens, mainly through summarization and trimming.
- Examples:
- Summarizing session histories to stay within token limits (e.g., automatic compaction in Claude Code).
- Narrower summarization for specific agent outputs or at the interfaces between agents.
- Trimming old or irrelevant context messages, using heuristics or LLM-based methods.
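The trim-plus-summarize pattern can be sketched as follows. The summarizer is stubbed out (a real agent would call an LLM for an abstractive summary), and the token counter is a crude word count; none of this is the talk's actual implementation.

```python
# Sketch of compressing a message history under a token budget:
# keep the most recent messages that fit, summarize the rest.

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def summarize(messages: list[str]) -> str:
    # Stub: a real system would ask an LLM for an abstractive summary.
    return f"[summary of {len(messages)} earlier messages]"

def compress(history: list[str], budget: int) -> list[str]:
    kept: list[str] = []
    used = 0
    # Walk backwards so the newest messages are kept verbatim.
    for msg in reversed(history):
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.insert(0, msg)
        used += cost
    dropped = history[: len(history) - len(kept)]
    return ([summarize(dropped)] if dropped else []) + kept

history = ["user: long question " * 5, "agent: long answer " * 5, "user: follow-up?"]
print(compress(history, budget=10))
```

The same idea applies at agent boundaries: a sub-agent's long trace is compressed to a summary before being handed back to the caller.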
4. Isolating Context
- Entails splitting context into compartments to maintain agent efficiency and avoid overload.
- Multi-agent systems allow agents to have independent context windows, enabling parallel exploration of sub-tasks (e.g., OpenAI’s Swarm, Anthropic’s multi-agent research).
- Other isolation techniques include:
- Sandboxing executable code and maintaining state outside the primary context window.
- Using structured runtime state objects (e.g., pydantic models) to organize and control context exposure.
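A structured runtime state object can be sketched with standard-library dataclasses (standing in for the pydantic models mentioned above). The point is that only a chosen slice of state is ever rendered into the prompt; heavy tool output and sandbox contents stay isolated. Field names here are invented for illustration.

```python
# Sketch of context isolation via a structured state object: the schema
# controls exactly what the LLM sees at each step.

from dataclasses import dataclass, field

@dataclass
class AgentState:
    messages: list[str] = field(default_factory=list)            # exposed to the LLM
    raw_tool_output: str = ""                                    # kept out of the prompt
    sandbox_files: dict[str, str] = field(default_factory=dict)  # isolated environment state

    def prompt_view(self) -> str:
        # Only the message slice reaches the model; token-heavy tool
        # output and sandbox contents never enter the context window.
        return "\n".join(self.messages)

state = AgentState()
state.messages.append("user: summarize the report")
state.raw_tool_output = "... thousands of tokens of raw HTML ..."
state.sandbox_files["report.txt"] = "full document body"
print(state.prompt_view())  # only the message, not the raw output
```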
LangGraph Capabilities for Context Engineering
- LangGraph provides a low-level framework tailored for agent orchestration, supporting all four context management strategies:
- State objects (scratch pads) are accessible and modifiable within any agent node, supporting both temporary and checkpointed context.
- Long-term memory is a first-class citizen, enabling persistent storage and retrieval across sessions.
- Flexible selection mechanisms, including embedding-based retrieval for tools and memories, are available.
- Utilities for message summarization and custom logic for token-heavy tool call post-processing are supported.
- Native support for multi-agent architectures and sandboxed, persistent environments allows for robust context isolation.
- State schema design enables granular control over what context is exposed to LLMs at different agent stages.
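The node-and-state pattern described above can be sketched in plain Python. This mimics the spirit of a LangGraph-style state graph (nodes read and update a shared state object, and each node selects only the fields it needs), not the actual `langgraph` API.

```python
# Plain-Python sketch in the spirit of a state-graph orchestrator:
# each node is a function from state to updated state.

from typing import Callable

State = dict
Node = Callable[[State], State]

def research(state: State) -> State:
    # Writes findings into state (the scratch pad) rather than the prompt.
    return {**state, "notes": "found 3 relevant docs"}

def write_answer(state: State) -> State:
    # Selects only the fields it needs from state.
    return {**state, "answer": f"Based on: {state['notes']}"}

def run_graph(nodes: list[Node], state: State) -> State:
    # A real framework adds branching, checkpointing, and persistence.
    for node in nodes:
        state = node(state)
    return state

final = run_graph([research, write_answer], {"question": "what is context engineering?"})
print(final["answer"])  # Based on: found 3 relevant docs
```

Checkpointing the state between nodes is what turns this in-memory scratch pad into persistent, resumable context.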
Decisions
- No explicit decisions were made during this session.
Open Questions / Follow-Ups
- None noted. The session was informational and did not raise pending issues or require follow-up.