Lecture Notes: Anthropic's Release of Claude 37
Introduction to Claude 37
- Presenter: Lang Cham
- Organization: Anthropic
- Model Release: Claude 37
- Key Feature: First explicit reasoning model by Anthropic.
Usage of Claude 37
- Installation: Install L Anthropic and set the API key.
- Model Selection: Choose Claude 37 latest.
- Parameters:
- Max Tokens: Set response token limit.
- Thinking Parameter: New setting for thinking tokens.
Key Features
- Thinking and Response Blocks:
- Outputs include a "thinking" block and a "response" block.
- Transparency: Users can see Claude's thought process.
- Latency:
- Full output latency: 20 seconds.
Compatibility with Agents
- Agent Building:
- Compatible with building agents.
- Example: Simple agent with arithmetic tools.
- Tool Usage: Shows thinking around when and why to use a specific tool.
Reasoning Models vs. Chat Models
- Scaling Paradigms:
- Chat Models: Next-to prediction.
- Reasoning Models: Reinforcement Learning on Chain of Thought.
- Interaction Modes:
- Chat Models: Strong for short chats.
- Reasoning Models: Better for long reasoning tasks.
Performance and Capabilities
- Post-Tuning: Claude 37 trained with reinforcement learning.
- Thinking Traces: Exposed to users for transparency.
- Control Over Thinking:
- Users can set thinking budget of tokens.
- Knowledge Cut-off: October 2024.
- Token Output: Up to 128,000 tokens.
- Performance: Improved on software engineering tasks (e.g., sbench scores).
Pricing
- Costs:
- Input Tokens: $3 per million.
- Output Tokens: $15 per million.
- Output Tokens: More frequent due to thinking processes.
Tips for Usage
- Usage Recommendations:
- Suitable for challenging STEM problems.
- Consider 16,000+ tokens for complex tasks.
- Latency Considerations: Higher thinking tokens increase latency.
- Configurable Outputs:
- Detailed outlines with specific word counts.
- Prompting Strategies:
- Avoid predetermined instructions.
- Encourage thorough and detailed thinking.
Model Parameters
- Budget Tokens: Important for model efficiency.
- Response Blocks:
- Contain thinking and response segments.
- Signature Field: Cryptographic token for verification.
Conclusion
- Overall Performance: High emphasis on coding and tool calling.
- Potential for Experimentation: Encouraged to explore different configurations and tasks.
Feel free to leave any comments or questions.