The meeting, led by Lance from LangChain, provided an overview of "context engineering" in the design of AI agents, outlining its importance and the main strategies involved.
Four primary strategies were discussed: writing, selecting, compressing, and isolating context, with examples from current agent applications.
The session also detailed how LangGraph, a framework by LangChain, supports these context management approaches.
Best practices, practical techniques, and references to additional resources were shared.
Action Items
None noted in the transcript.
Introduction to Context Engineering
"Context engineering" is defined as the art and science of filling an agent's context window with the right information for each step of its operation.
The need for effective context engineering arises from the limited capacity of LLMs' context windows and the complexity added by long-running, tool-using agents.
Types of Context in Agent Design
Context sources include instructions (prompt engineering), memories (short-term and long-term), few-shot examples, tool descriptions, and external knowledge.
Agents pose particular context-management challenges because of long task horizons and accumulating tool feedback.
Challenges and Failures in Context Management
As context grows, risks include confusion, hallucination, distraction, and conflicting information, making curation critical.
Effective context engineering is considered essential for engineers building AI agents.
Key Context Engineering Strategies
1. Writing Context
Involves saving data outside the LLM's context window for future retrieval.
Scratch pads allow agents to take notes within a session, while memories persist relevant information across sessions.
Example: Anthropic’s multi-agent researcher saves plans to memory for recall beyond token limits.
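The scratchpad/memory split described above can be sketched in plain Python. This is an illustrative stand-in, not any specific framework's API; the class and method names are hypothetical.

```python
# Minimal sketch of "writing context": a hypothetical agent keeps
# per-session notes in a scratchpad and promotes selected facts to a
# long-term store that outlives the session. All names are illustrative.

class AgentMemory:
    def __init__(self):
        self.scratchpad = []   # short-term: notes within one session
        self.long_term = {}    # long-term: persists across sessions

    def note(self, text):
        """Write an intermediate thought outside the LLM context window."""
        self.scratchpad.append(text)

    def remember(self, key, value):
        """Promote information to cross-session memory."""
        self.long_term[key] = value

    def end_session(self):
        """Session notes are discarded; long-term memory survives."""
        self.scratchpad.clear()

mem = AgentMemory()
mem.note("step 1: drafted a research plan")
mem.remember("user_preference", "concise answers")
mem.end_session()
```

The point of the design is that notes and memories live outside the context window and are only read back in when needed.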
2. Selecting Context
Entails pulling only relevant data into the context window as needed.
Different memory types (procedural/instructions, semantic/facts, episodic/examples) can be pulled in depending on the task.
Techniques include using files for procedural instructions, embedding-based retrieval for facts, and semantic search for relevant tool descriptions or large toolsets.
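Selection can be sketched as similarity-ranked retrieval over stored facts. The sketch below substitutes a bag-of-words cosine for a real embedding model (an assumption made so the example stays self-contained); the function names are hypothetical.

```python
import math
from collections import Counter

# Illustrative sketch of "selecting context": rank stored facts by
# similarity to the current query and pull only the top matches into
# the prompt. A bag-of-words cosine stands in for embeddings here.

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_context(query, facts, k=2):
    qv = vectorize(query)
    ranked = sorted(facts, key=lambda f: cosine(qv, vectorize(f)), reverse=True)
    return ranked[:k]   # only the k most relevant facts enter the window

facts = [
    "the user prefers metric units",
    "tool search_web takes a query string",
    "episodic example: summarized a PDF last week",
]
top = select_context("which tool handles web search", facts, k=1)
```

The same ranking idea applies to tool descriptions: with a large toolset, retrieve only the handful of tools relevant to the current step.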
3. Compressing Context
Involves summarizing or trimming tokens to stay within context window limits.
Approaches include overall session summarization, targeted summarization (only for completed work or between sub-agents), and token pruning using heuristics or LLMs.
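A simple heuristic pruner might look like the following sketch. Token counts are approximated by word count (an assumption; real systems use the model's tokenizer), and the message format is illustrative.

```python
# Sketch of heuristic token pruning: drop the oldest messages until the
# history fits a token budget, always preserving the system message.

def count_tokens(msg):
    # crude approximation of tokens by whitespace-separated words
    return len(msg["content"].split())

def prune(messages, budget):
    system, history = messages[0], list(messages[1:])
    total = count_tokens(system) + sum(count_tokens(m) for m in history)
    while history and total > budget:
        total -= count_tokens(history.pop(0))   # drop oldest first
    return [system] + history

msgs = [
    {"role": "system", "content": "You are a helpful agent"},
    {"role": "user", "content": "first question about context windows"},
    {"role": "assistant", "content": "a long answer " * 10},
    {"role": "user", "content": "latest question"},
]
trimmed = prune(msgs, budget=20)
```

An LLM-based variant would replace the dropped messages with a short summary instead of discarding them outright.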
4. Isolating Context
Involves partitioning or sandboxing context to avoid overload and maintain focus.
Multi-agent systems assign separate context windows to sub-agents, enabling parallel task processing.
Sandboxes or state objects can persist information across turns or house token-heavy data outside the LLM context window.
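The isolation pattern can be sketched as a supervisor that gives each sub-agent its own context and sees only compact results. The LLM call is stubbed out; in a real system each sub-agent would invoke a model over its private window. All names are hypothetical.

```python
# Sketch of context isolation: each sub-agent builds its own context
# window, and only short results flow back to the supervisor, so no
# single window accumulates everything.

def run_subagent(task, shared_facts):
    context = [f"task: {task}"] + shared_facts  # private window per agent
    # stand-in for an LLM call over this agent's isolated context
    return f"result for {task} (context size: {len(context)})"

def supervisor(tasks, shared_facts):
    # sub-tasks could run in parallel; the supervisor sees summaries only
    return [run_subagent(t, shared_facts) for t in tasks]

results = supervisor(
    ["search literature", "draft outline"],
    ["topic: context engineering"],
)
```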
How Langraph Supports Context Engineering
LangGraph provides a low-level orchestration framework where agents are organized as nodes and edges, well suited to context management.
State objects act as scratch pads, accessible and modifiable in every node.
Long-term memory is built in, allowing both per-session persistence and cross-session recall.
Tool and knowledge selection is facilitated through embedding-based search and flexible data store models.
Utilities for summarizing, trimming, and post-processing are available or can be custom-implemented within nodes.
LangGraph supports multi-agent patterns and sandboxed execution for advanced context isolation.
Users are encouraged to use tracing/evaluation (e.g., LangSmith) to monitor token usage and measure the effects of context management strategies.
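The nodes-and-state pattern described in this section can be sketched in plain Python. This is a stand-in for the idea, not the LangGraph API: each node reads and updates a shared state object, so the state doubles as a scratchpad across steps.

```python
# Hedged sketch of the nodes-and-state pattern (not LangGraph itself):
# every node receives the shared state, may append notes to it, and
# passes it along the edges of the graph.

def plan_node(state):
    state["notes"].append("made a plan")
    return state

def act_node(state):
    state["notes"].append("executed the plan")
    state["result"] = "done"
    return state

def run_graph(nodes, state):
    for node in nodes:   # edges here are just sequential order
        state = node(state)
    return state

final = run_graph([plan_node, act_node], {"notes": [], "result": None})
```

In the real framework, long-term memory and checkpointing extend this same state beyond a single run.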