Overview
This lecture explores context engineering for large language models (LLMs), contrasting deterministic and probabilistic context and proposing principles for building agents that are more effective, secure, and accurate.
What is Context Engineering?
- Context engineering expands on prompt engineering by considering all model inputs, not just the prompt.
- LLMs process prompts, system instructions, rules, and uploaded documents as context.
- The goal is to ensure all context provided leads to desired model outcomes.
Deterministic vs. Probabilistic Context
- Deterministic context includes static prompts, documentation, and data that are directly controlled.
- Most current advice focuses on optimizing the deterministic context for efficiency and cost (e.g., token usage).
- Probabilistic context includes external, dynamic sources like web data or large internal databases.
- When models access the web, probabilistic context can overwhelm deterministic data due to sheer volume.
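The volume imbalance above can be made concrete with a rough token count. This is a minimal sketch: whitespace splitting stands in for a real tokenizer, and the prompt, document, and page contents are invented placeholders, not data from the lecture.

```python
# Sketch: comparing deterministic vs. probabilistic context volume.
# Whitespace splitting approximates tokenization (an assumption; real
# systems would use the model's own tokenizer).

def count_tokens(text: str) -> int:
    """Rough token count via whitespace splitting."""
    return len(text.split())

# Deterministic context: directly controlled inputs.
system_prompt = "You are a research assistant. Cite your sources."
uploaded_doc = "internal design doc " * 200

# Probabilistic context: five hypothetical retrieved web pages.
web_results = ["scraped page content " * 1000] * 5

deterministic = count_tokens(system_prompt) + count_tokens(uploaded_doc)
probabilistic = sum(count_tokens(page) for page in web_results)

print(f"deterministic tokens: {deterministic}")
print(f"probabilistic tokens: {probabilistic}")
print(f"ratio: {probabilistic / deterministic:.1f}x")
```

Even with these toy numbers, the retrieved content outweighs the controlled inputs by more than an order of magnitude, which is why optimizing only the deterministic side has limited effect.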
Challenges and Risks with Probabilistic Context
- A user's prompt only partially shapes which external information the model retrieves and attends to.
- It's difficult to ensure model responses use high-quality, reliable sources.
- Probabilistic context increases security risks, notably prompt (LLM) injection attacks delivered through retrieved content.
- Traditional evaluation metrics such as precision and recall are less effective for probabilistic context, because the set of relevant documents is open-ended and unknown in advance.
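To see why precision and recall strain under probabilistic context, here is a sketch of source-level precision/recall computed against a reference allowlist. The domain names are illustrative; the practical problem is that for open-web retrieval no such reference set exists, so these numbers often cannot be computed at all.

```python
# Sketch: source-level precision and recall against a reference set of
# known-relevant sources. For open-web retrieval, the reference set is
# usually unavailable -- the core reason these metrics break down.

def source_precision_recall(used: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision/recall over the sources an agent actually consulted."""
    true_positives = len(used & relevant)
    precision = true_positives / len(used) if used else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical run: sources the agent used vs. a curated reference set.
used = {"docs.python.org", "random-blog.example", "wikipedia.org"}
relevant = {"docs.python.org", "wikipedia.org", "peps.python.org"}

p, r = source_precision_recall(used, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

When the agent can reach billions of pages, `relevant` is unbounded, so recall in particular loses meaning.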
Principles for Context Engineering
- Expect and design for information discovery in the context, not just static input.
- Monitor and audit the quality and reliability of information sources used by models.
- Take security precautions against injection attacks in open or semi-open environments.
- Score sources for relevance so that decision accuracy can be assessed against what the model actually consulted.
- Version and test prompts systematically to maintain performance.
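The relevance-scoring principle above can be sketched with a naive keyword-overlap (Jaccard) score over candidate sources. This is a toy stand-in: the URLs and texts are invented, and a production system would use embeddings or a learned ranker rather than word overlap.

```python
# Sketch: naive relevance scoring of candidate sources against a task
# description, using Jaccard similarity over word sets. Illustrative only;
# real systems would use embedding-based similarity.

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two word sets, 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def score_sources(task: str, sources: dict[str, str]) -> list[tuple[str, float]]:
    """Rank sources by word overlap with the task description."""
    task_terms = set(task.lower().split())
    scored = [(url, jaccard(task_terms, set(text.lower().split())))
              for url, text in sources.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical candidate sources an agent might retrieve.
sources = {
    "https://example.com/llm-context": "llm context engineering principles",
    "https://example.com/cooking": "recipes for pasta and sauce",
}
ranked = score_sources("context engineering for llm agents", sources)
print(ranked[0][0])  # highest-scoring source first
```

Logging these scores alongside agent outputs gives an audit trail: when a decision is wrong, the scores show whether low-relevance sources dominated the context.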
Future Directions
- Evaluation methods must adapt to consider the impact of probabilistic context.
- Engineers should focus on shaping the context the agent explores, not just minimizing tokens.
Key Terms & Definitions
- Prompt Engineering — designing effective inputs (prompts) for LLMs.
- Context Engineering — shaping all aspects of LLM input, including prompts, system rules, and external data.
- Deterministic Context — static, controlled input data provided to the LLM.
- Probabilistic Context — dynamic, external or web-based information accessible to the LLM.
- Token — smallest unit of text processed by an LLM.
- LLM Injection Attack (Prompt Injection) — an attack in which malicious content placed in the context alters the behavior of an LLM.
Action Items / Next Steps
- Review and document the sources your LLM agents use during research tasks.
- Implement version control for prompts and context strategies.
- Investigate relevance scoring and audit methods for improving decision accuracy.
- Study security best practices for probabilistic context environments.