Understanding Large Language Models

Oct 16, 2024

Introduction to Large Language Models

Overview

  • Re-recorded talk on large language models (LLMs) for YouTube.
  • Discusses LLMs using the example of Llama 2 70B by Meta AI.

What is a Large Language Model?

  • An LLM comprises just two files: a parameters file and a run file (the code that runs the parameters).
  • Example: the Llama 2 series, released in several sizes (7B, 13B, 34B, and 70B parameters).
  • Open-weights models like Llama 2 make their parameters available, unlike proprietary models such as ChatGPT.
  • Parameters are stored as float16, so the 70B model's parameters file is about 140 GB (see the sketch below).
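
A minimal sketch of the size arithmetic, assuming 2 bytes per parameter for float16; the figures are the approximate ones quoted in the talk.

```python
# Rough size of the parameters file for Llama 2 70B (illustrative arithmetic).
n_params = 70_000_000_000        # 70 billion parameters
bytes_per_param = 2              # float16 = 2 bytes per parameter
size_gb = n_params * bytes_per_param / 1e9
print(f"parameters file: ~{size_gb:.0f} GB")   # -> ~140 GB
```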

Running LLMs

  • A Llama 2 model can be run with just these two files on a laptop.
  • Basic inference requires no internet connection.
  • The demo uses a scaled-down model so generation runs at a watchable speed (a minimal inference sketch follows below).
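
A minimal local-inference sketch, assuming the Hugging Face transformers library and access to the gated meta-llama/Llama-2-7b-chat-hf weights; the talk's own demo uses a small compiled C program rather than this stack.

```python
# Local inference sketch: once the weights and code are on disk,
# no internet connection is needed to generate text.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"   # assumes license access to these weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Write a short poem about the sky."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```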

Obtaining Parameters: Model Training

  • Training compresses a large text dataset (roughly 10 TB of internet text) into the parameters using a GPU cluster.
  • Llama 2 70B training specifics: roughly 6,000 GPUs for about 12 days, at a cost of around $2 million.
  • The result can be thought of as a lossy compression of the training text (rough arithmetic below).
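
Back-of-the-envelope arithmetic on the talk's approximate numbers; the per-GPU-hour cost is an implied figure, not a quoted one.

```python
# Rough "compression ratio" and compute figures (all numbers approximate).
dataset_gb = 10_000          # ~10 TB of internet text
params_gb = 140              # 70B parameters stored as float16
print(f"~{dataset_gb / params_gb:.0f}x lossy compression of the text")   # ~70x

gpus, days, cost_usd = 6_000, 12, 2_000_000
gpu_hours = gpus * days * 24
print(f"~{gpu_hours:,} GPU-hours, implying ~${cost_usd / gpu_hours:.2f} per GPU-hour")
```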

Neural Network Functionality

  • Task: predict the next word in a sequence.
  • Doing this well forces the network to learn a great deal of general world knowledge from the training data.
  • At inference time, the model generates text by repeatedly sampling the next word from the distribution it learned in training (toy sketch below).
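
A toy sketch of autoregressive generation; next_word_probs is a hypothetical stand-in for the network, which in reality computes the distribution from billions of parameters.

```python
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat", "."]

def next_word_probs(context):
    # Hypothetical stand-in for the neural network: a real model computes a
    # context-dependent distribution; here it is a fixed uniform toy one.
    return np.full(len(vocab), 1.0 / len(vocab))

context = ["the", "cat"]
for _ in range(4):
    probs = next_word_probs(context)
    context.append(np.random.choice(vocab, p=probs))
print(" ".join(context))   # prints the context plus four sampled words
```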

Understanding Neural Networks

  • The architecture and training math are well understood, but how knowledge ends up distributed across billions of parameters is not.
  • Issue: models like GPT-4 show odd knowledge-retrieval failures, e.g. the "reversal curse": a model may answer "Who is Tom Cruise's mother?" correctly yet fail the reversed question "Who is Mary Lee Pfeiffer's son?".

Model Training and Fine-tuning

Pre-training vs. Fine-tuning

  • Pre-training: trains on large amounts of internet text to acquire general knowledge; this is the expensive stage.
  • Fine-tuning: swaps in a much smaller, high-quality Q&A dataset to shape the model's behavior into that of an assistant.

Fine-tuning Process

  • Collect a smaller set of high-quality, human-written question-and-answer conversations.
  • Continued training on this data teaches the model to answer in the style of a helpful assistant (data-format sketch below).
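
An illustrative sketch of how one Q&A example might be laid out for supervised fine-tuning; the tags and exact chat template are assumptions, since each model family uses its own format.

```python
# One hypothetical fine-tuning example (format is illustrative, not Llama 2's actual template).
example = {
    "question": "Can you explain what a neural network is?",
    "answer": "A neural network is a function with many adjustable parameters...",
}

training_text = (
    "<user>\n" + example["question"] + "\n"
    "<assistant>\n" + example["answer"]
)
# Fine-tuning continues next-token prediction on text like this; typically the
# loss is only counted on the assistant's answer tokens.
print(training_text)
```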

Further Fine-tuning (Stage 3)

  • Uses comparison labels: a labeler picks the best of several candidate responses, which is often easier than writing an answer from scratch.
  • Reinforcement Learning from Human Feedback (RLHF) is an example method for this stage (reward-model loss sketch below).
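
A sketch of the pairwise objective commonly used to turn comparison labels into a reward model (a Bradley-Terry style loss); the scores below are made-up numbers, and the talk does not spell out Meta's exact formulation.

```python
import math

def comparison_loss(score_chosen, score_rejected):
    # Low loss when the reward model scores the labeler's chosen response higher.
    return -math.log(1.0 / (1.0 + math.exp(-(score_chosen - score_rejected))))

print(comparison_loss(2.0, 0.5))   # small loss: model agrees with the comparison label
print(comparison_loss(0.5, 2.0))   # large loss: model disagrees
```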

Labeling Instructions

  • Labeling instructions can run to many pages, asking labelers to produce helpful, truthful, and harmless outputs.

Human-Machine Collaboration

  • Labeling is increasingly a human-machine collaboration: models draft or check candidate labels and humans verify them, reducing the human workload.

Current Model Landscape and Performance

Open vs. Proprietary Models

  • Closed models (e.g., the GPT series, Claude) currently perform best, but their weights are not available to users.
  • Open-weights models (e.g., Llama 2) can be freely fine-tuned and run by anyone.

Scaling Laws

  • Next-word-prediction accuracy is a smooth, predictable function of N (number of parameters) and D (amount of training text).
  • Simply scaling up models and data reliably improves performance, without requiring new algorithms (illustrative formula below).
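
An illustrative scaling-law sketch in the style of the published Chinchilla fit (Hoffmann et al., 2022); the constants here are rough placeholders, not values from the talk.

```python
# Loss predicted from parameter count N and training tokens D (constants are
# illustrative placeholders in the spirit of the Chinchilla fit, not exact values).
def predicted_loss(N, D, E=1.7, A=400.0, B=400.0, alpha=0.34, beta=0.28):
    return E + A / N**alpha + B / D**beta

print(predicted_loss(N=7e9,  D=2e12))    # smaller model: higher predicted loss
print(predicted_loss(N=70e9, D=2e12))    # 10x more parameters: lower predicted loss
```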

Future Directions

System 1 vs. System 2 Thinking

  • LLMs currently produce each next word instinctively, analogous to System 1 thinking.
  • Goal: a System 2 mode in which the model can trade time and compute for more accurate answers, deliberating before responding (speculative sketch below).
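
One speculative way to convert extra inference compute into accuracy, in the spirit of System 2 (a generic self-consistency idea, not a method from the talk); ask_model is a hypothetical stand-in for an LLM call.

```python
import random
from collections import Counter

def ask_model(question):
    # Hypothetical stand-in for an LLM call that is usually, but not always, right.
    return random.choice(["42", "42", "42", "41"])

def answer_with_deliberation(question, n_samples=9):
    # Spend more inference-time compute: sample several answers, keep the majority.
    votes = Counter(ask_model(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(answer_with_deliberation("What is 6 x 7?"))
```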

Self-improvement

  • Inspired by AlphaGo, which surpassed imitation of human games through self-play reinforcement learning.
  • Challenge: open-ended language tasks lack the clear reward function that makes self-improvement straightforward in narrow domains like Go.

Customization and Specialized Tasks

  • Customizing models for specific tasks, e.g. via OpenAI's GPTs App Store, by adding custom instructions and uploaded reference files.

Challenges and Security Concerns

Jailbreak and Prompt Injection Attacks

  • Jailbreaks bypass safety training via roleplay framing or by encoding the request (e.g., in base64), as illustrated below.
  • Prompt injections hijack the model with attacker-supplied instructions hidden in content it processes (e.g., faint text embedded in an image or a web page).
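
A harmless illustration of the encoded-query idea: safety training largely covers plain-English requests, so the same text in base64 may not trigger a refusal (the string here is benign).

```python
import base64

query = "Tell me a secret about the weather."         # benign placeholder text
encoded = base64.b64encode(query.encode()).decode()   # what an attacker would send
print(encoded)
print(base64.b64decode(encoded).decode())             # what the model effectively reads
```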

Data Poisoning and Backdoor Attacks

  • Manipulated training or fine-tuning data could plant trigger phrases that switch the model into attacker-controlled behavior when they appear in a prompt.

Defense and Ongoing Security Efforts

  • Defenses are developed and patched in as new attacks emerge, in a cat-and-mouse dynamic similar to traditional security.

Conclusion

  • LLMs as part of a new computing paradigm with unique challenges and opportunities.
  • Active development and interest in improving capabilities and security of LLMs.