Understanding Prompting and Fine-Tuning Methods
Aug 8, 2024
Lecture Notes: Prompting, Instruction Fine-Tuning, and RLHF
Introduction
Lecturer: Jesse Mu, PhD student in the CS Department (NLP group)
Topic: Prompting, Instruction Fine-Tuning, and RLHF (Reinforcement Learning from Human Feedback)
Relevance: Key concepts behind the training of modern chatbots like ChatGPT and Bing
Course Logistics
Project Proposals: Due recently; mentors are being assigned.
Assignment 5: Due Friday at midnight; suggested tools include Colab, AWS, Azure, or Kaggle for GPU access.
Course Feedback Survey: Posted on Ed; due Sunday by 11:59 pm.
Lecture Overview
Large Language Models (LLMs)
Increase in compute and data for LLMs over the years.
Pre-training helps LLMs learn features like syntax, co-reference, sentiment, and more.
LLMs act as rudimentary world models due to vast internet data.
Examples of abilities: math reasoning, code generation, medical text comprehension.
Zero-shot and Few-shot Learning
Zero-shot learning: Perform tasks without task-specific training or examples, e.g., question answering by predicting the next token.
Few-shot learning: Specify tasks by giving example inputs/outputs in the prompt, improving performance (see the prompt sketch after this list).
Models:
GPT (2018): 117 million parameters, trained on books.
GPT-2 (2019): 1.5 billion parameters, trained on 40 GB of web text.
GPT-3 (2020): 175 billion parameters, enables few-shot learning.
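The short Python sketch below illustrates the zero-shot vs. few-shot distinction purely as prompt construction; the translation examples and the build_few_shot_prompt helper are illustrative, and no model is actually called.

```python
# Sketch: constructing zero-shot vs. few-shot prompts.
# No model is called here; a completion model would simply be asked to
# continue the prompt text by predicting the next tokens.

def build_few_shot_prompt(examples, query):
    """Concatenate input/output demonstrations followed by the new input."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

# Zero-shot: the task is specified only by an instruction, with no examples.
zero_shot_prompt = "Translate English to French:\ncheese =>"

# Few-shot: a handful of demonstrations specify the task in-context;
# the model's parameters are never updated.
few_shot_prompt = build_few_shot_prompt(
    [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")],
    "cheese",
)
print(few_shot_prompt)
```

A completion-style model would then continue the few-shot prompt (e.g., with "fromage") without any gradient updates.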
Prompt Engineering
Chain-of-Thought Prompting: Demonstrate intermediate reasoning steps in the prompt to improve task performance.
Zero-shot Chain-of-Thought Prompting: A simple trigger phrase like "Let's think step by step" can improve results without any demonstrations (both styles are sketched at the end of this section).
Prompt Engineering: An emerging field that involves constructing effective prompts for various tasks.
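As a concrete illustration, here is a minimal sketch contrasting a standard prompt, a chain-of-thought prompt, and a zero-shot chain-of-thought prompt; the word problems and exact wording are illustrative.

```python
# Sketch: standard vs. chain-of-thought vs. zero-shot chain-of-thought prompts.

question = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

# Standard prompt: the model is expected to answer directly.
standard_prompt = f"Q: {question}\nA:"

# Chain-of-thought prompt: the demonstration spells out intermediate
# reasoning steps, encouraging the model to reason before answering.
cot_demonstration = (
    "Q: A juggler has 16 balls. Half of the balls are golf balls. "
    "How many golf balls are there?\n"
    "A: There are 16 balls in total. Half of 16 is 8. The answer is 8.\n\n"
)
cot_prompt = cot_demonstration + f"Q: {question}\nA:"

# Zero-shot chain-of-thought: no demonstrations, just a trigger phrase.
zero_shot_cot_prompt = f"Q: {question}\nA: Let's think step by step."
```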
Instruction Fine-Tuning
Objective: Align language models with user intent by fine-tuning on instruction-output pairs (see the training-loop sketch after this list).
Datasets: Large collections such as Super-NaturalInstructions (~1.6k tasks, ~3 million examples).
Evaluation Benchmarks: MMLU and BIG-bench for assessing performance on diverse tasks.
Benefits: Generalizes to unseen tasks; smaller instruction-tuned models can outperform larger untuned models.
Challenges: Human data collection is expensive; creative/open-ended tasks have no single correct output; the token-level loss penalizes all errors equally regardless of severity.
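A minimal sketch of the idea, assuming a Hugging Face causal language model and a toy two-example dataset; the model name, prompt template, learning rate, and data are illustrative rather than the lecture's recipe.

```python
# Sketch: instruction fine-tuning as ordinary next-token prediction on
# (instruction, output) pairs. Model name, prompt template, learning rate,
# and the toy dataset are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

instruction_data = [
    ("Translate to French: cheese", "fromage"),
    ("Answer the question: What is the capital of France?", "Paris"),
]

model.train()
for instruction, output in instruction_data:
    text = f"Instruction: {instruction}\nResponse: {output}{tokenizer.eos_token}"
    enc = tokenizer(text, return_tensors="pt")
    # Standard language-modeling loss over the whole sequence; many recipes
    # instead mask the instruction tokens so only the response is penalized.
    loss = model(**enc, labels=enc["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```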
Reinforcement Learning from Human Feedback (RLHF)
Objective: Maximize the expected reward (human preference) of language model outputs.
Method (sketched after this list):
Train a reward model to predict human preferences.
Use policy gradient methods to optimize the language model's parameters.
Include a penalty term to prevent divergence from the pre-trained model.
Challenges: Human feedback is expensive, noisy, and miscalibrated; reward hacking; over-optimization of the learned reward model.
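The toy tensors below sketch the two ingredients just listed: a pairwise loss for training the reward model on preference comparisons, and a per-sample objective combining the reward with a KL-style penalty toward the pre-trained (reference) model. All numeric values are made up.

```python
# Sketch: (1) pairwise reward-model loss on human preference comparisons,
# (2) KL-penalized objective that policy-gradient methods (e.g., PPO) maximize.
# All tensor values are made-up placeholders.
import torch
import torch.nn.functional as F

# (1) Reward model: score the human-preferred response above the other one.
r_chosen = torch.tensor([1.3, 0.2])    # scores for preferred responses
r_rejected = torch.tensor([0.4, 0.9])  # scores for dispreferred responses
rm_loss = -F.logsigmoid(r_chosen - r_rejected).mean()

# (2) Policy objective for one sampled response: reward minus a penalty
# that grows as the tuned model drifts from the pre-trained model.
beta = 0.1                          # penalty strength (illustrative)
reward = torch.tensor(0.8)          # reward-model score of the response
logp_policy = torch.tensor(-12.0)   # log-prob under the model being tuned
logp_ref = torch.tensor(-11.0)      # log-prob under the pre-trained model
objective = reward - beta * (logp_policy - logp_ref)

print(f"reward-model loss: {rm_loss.item():.3f}")
print(f"penalized objective: {objective.item():.3f}")
```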
Advanced Concepts and Future Directions
Constitutional AI: Use AI feedback, guided by a set of written principles, to critique and improve language model outputs (a rough sketch follows this list).
Self-Improvement: Fine-tune models on their own generated outputs, particularly chain-of-thought reasoning.
Challenges: High data requirements, reward hacking, hallucination, and security (jailbreaking).
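A rough sketch of the critique-and-revise step behind Constitutional AI, assuming a hypothetical generate function standing in for sampling from a language model; the prompt wording and principle text are illustrative.

```python
# Sketch: critique-and-revise loop in the spirit of Constitutional AI.
# `generate` is a hypothetical stand-in for sampling from a language model.

def generate(prompt: str) -> str:
    """Hypothetical placeholder for a language-model sampling call."""
    return "<model completion for: " + prompt[:40] + "...>"

def critique_and_revise(user_prompt: str, principle: str) -> str:
    response = generate(user_prompt)
    critique = generate(
        f"{user_prompt}\n{response}\n"
        f"Critique the response above according to this principle: {principle}"
    )
    revision = generate(
        f"{user_prompt}\n{response}\nCritique: {critique}\n"
        "Rewrite the response so that it addresses the critique:"
    )
    # Revised outputs can then serve as fine-tuning targets, and AI preference
    # labels can replace human labels in the RLHF pipeline.
    return revision
```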
Conclusion
Current State: Instruction fine-tuning and RLHF have significantly improved LLM capabilities but still face challenges.
Future Work: Exploring safer and more efficient methods to align AI models with human values and preferences.
Open Questions: Addressing fundamental limitations like hallucination and data efficiency for RLHF.
Final Remarks
Exciting Time: Fast-paced developments in LLM research, requiring continual updates and innovations.
End of Notes