
Understanding Large Language Models and MLPs

Feb 1, 2025

Lecture Notes on Large Language Models and Multilayer Perceptrons (MLPs)

Overview

  • Large language models (LLMs) like GPT-3 store knowledge in their parameters, exemplified by correctly completing a prompt like "Michael Jordan plays the sport of ___" with "basketball".
  • Researchers from Google DeepMind explored how facts are stored in LLMs, focusing on matching athletes to their sports.
  • A main finding suggests that such facts reside within a specific part of the network: the Multilayer Perceptrons (MLPs).

Transformer Architecture

  • Transformers consist of two main components:
    • Attention Mechanism: Lets token vectors exchange information with one another (see the sketch after this list).
    • Multilayer Perceptrons (MLPs): Hypothesized to store facts; simpler to compute than attention, but harder to interpret.
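
A minimal NumPy sketch of single-head self-attention, to make the "vectors exchange information" idea concrete (no masking, layer norm, multiple heads, or output projection; all sizes and weights below are made up for illustration):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head self-attention: each token's vector gathers information
    from the other tokens, weighted by query/key similarity."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # relevance of token j to token i
    return softmax(scores, axis=-1) @ V      # weighted mixture of the value vectors

# Toy run: 4 tokens, model dimension 8 (arbitrary toy sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8): one updated vector per token
```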

High-Dimensional Spaces

  • Vectors in the model live in high-dimensional spaces, where individual directions can encode distinct meanings (e.g., a direction for gender); a toy illustration follows this list.
  • MLPs can store facts by leveraging directions in this high-dimensional space, such as encoding the knowledge that "Michael Jordan plays basketball".
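
A toy illustration of "directions encode meanings": a feature can be read off a vector by taking its dot product with the corresponding direction. The dimension and the "gender" direction below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # toy embedding dimension (assumed)

# Hypothetical unit direction standing in for a feature such as gender.
gender_dir = rng.normal(size=d)
gender_dir /= np.linalg.norm(gender_dir)

def feature_strength(vec, direction):
    """How strongly `vec` expresses the feature encoded by `direction` (a dot product)."""
    return float(vec @ direction)

carries_feature = gender_dir + 0.1 * rng.normal(size=d)  # mostly points along the direction
unrelated = rng.normal(size=d)
unrelated /= np.linalg.norm(unrelated)                   # random unit vector

print(feature_strength(carries_feature, gender_dir))  # ≈ 1
print(feature_strength(unrelated, gender_dir))        # ≈ 0: random directions are nearly orthogonal
```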

MLP Operations

  • Each MLP is composed of two matrix multiplications with a rectified linear unit (ReLU) non-linearity in between.
  • Up Projection Matrix (W_up): Maps each vector into a higher-dimensional space.
  • Down Projection Matrix (W_down): Maps the result back down to the original dimension.
  • A neuron activates when the incoming vector aligns with the direction encoded by its row of W_up; an active neuron then adds specific features (like "basketball") to the vector (see the code sketch below).
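
A minimal NumPy sketch of this two-matrix, ReLU-in-between structure, applied to a single token vector (toy dimensions; details such as layer normalization are omitted):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_block(x, W_up, b_up, W_down, b_down):
    """One transformer MLP applied to a single token vector x.

    W_up   : (d_mlp, d_model) -- each row asks "does x point in my direction?"
    W_down : (d_model, d_mlp) -- each column is a feature an active neuron can add
    """
    neuron_activations = relu(W_up @ x + b_up)
    return x + W_down @ neuron_activations + b_down  # output is added back onto x (residual connection)

# Toy sizes; GPT-style transformers typically use d_mlp = 4 * d_model.
d_model, d_mlp = 8, 32
rng = np.random.default_rng(0)
x = rng.normal(size=d_model)
out = mlp_block(x,
                rng.normal(size=(d_mlp, d_model)), np.zeros(d_mlp),
                rng.normal(size=(d_model, d_mlp)), np.zeros(d_model))
print(out.shape)  # (8,) -- same dimension as the input
```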

Example: Storing 'Michael Jordan plays basketball'

  • Assume directions in the space for first name "Michael", last name "Jordan", and "basketball".
  • Vectors align with these directions to encode complete facts by triggering the right neurons, as the toy example below illustrates.
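
A toy numerical version of this story (the directions and the single "Michael Jordan" neuron are invented for illustration; in real models such facts appear to be spread across many neurons and layers):

```python
import numpy as np

d = 4_096  # toy embedding dimension (assumed)
rng = np.random.default_rng(1)

def unit(v):
    return v / np.linalg.norm(v)

# Assumed, roughly orthogonal directions for the relevant features.
michael, jordan, basketball = (unit(rng.normal(size=d)) for _ in range(3))

# One hypothetical neuron:
#  - its input row fires only when BOTH name directions are present (hence the -1 bias),
#  - its output column writes the "basketball" direction into the vector.
w_in, bias, w_out = michael + jordan, -1.0, basketball

def neuron(x):
    activation = max(0.0, float(w_in @ x) + bias)  # ReLU
    return x + activation * w_out

full_name = michael + jordan  # vector encoding "Michael Jordan"
first_only = michael          # vector encoding just "Michael"

print(round(float(neuron(full_name) @ basketball), 2))   # ≈ 1: the fact gets written in
print(round(float(neuron(first_only) @ basketball), 2))  # ≈ 0: the neuron stays (almost) silent
```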

Parameters and Scaling

  • GPT-3 example: Its MLPs account for about 116 billion parameters, a substantial portion of the 175 billion total (see the back-of-the-envelope count after this list).
  • Thanks to superposition, high-dimensional spaces can potentially store more features than they have dimensions, which may help explain how well LLMs scale.
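
The 116-billion figure can be reproduced as a quick back-of-the-envelope check from GPT-3's published architecture sizes (96 layers, model dimension 12,288, MLP width 4x the model dimension):

```python
# Rough count of GPT-3's MLP parameters.
d_model  = 12_288       # dimension of each token vector
d_mlp    = 4 * d_model  # 49,152 neurons per MLP layer
n_layers = 96

per_layer = d_model * d_mlp + d_mlp * d_model  # W_up plus W_down (biases ignored; they add only ~6M)
total_mlp = per_layer * n_layers

print(f"{total_mlp:,}")  # 115,964,116,992 -> roughly 116 billion of the ~175 billion total
```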

Superposition and Interpretability

  • Superposition hypothesis: Features might be stored along nearly perpendicular directions, allowing models to represent more features than they have dimensions (demonstrated numerically below).
  • Interpreting models is challenging because features may be encoded in combinations of neurons rather than in individual neurons.
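
A quick numerical illustration of why superposition is plausible: randomly chosen directions in a high-dimensional space are already nearly perpendicular to one another, so far more "almost orthogonal" feature directions fit than there are dimensions (all sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_features = 1_000, 10_000  # far more "features" than dimensions

# Random unit vectors in a d-dimensional space.
features = rng.normal(size=(n_features, d))
features /= np.linalg.norm(features, axis=1, keepdims=True)

# Sample random pairs and measure the angles between them.
i, j = rng.integers(n_features, size=(2, 5_000))
keep = i != j
cosines = np.einsum("nd,nd->n", features[i[keep]], features[j[keep]])
angles = np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

print(f"mean angle: {angles.mean():.1f} deg, std: {angles.std():.1f} deg")
# ≈ 90 degrees with only a small spread: the 10,000 vectors are all nearly perpendicular.
```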

Future Topics

  • Training process for LLMs includes backpropagation and specific cost functions.
  • Fine-tuning methods like reinforcement learning with human feedback.
  • Scaling laws and their impact on model performance.
  • Upcoming non-machine learning topics will be discussed in future videos.

Note: These notes capture the core concepts discussed in the lecture about how large language models work and store knowledge, focusing on the architecture and function of MLPs. More details on training processes will be covered in future lectures.