State Space Models and Mamba: A New Architecture for Language Models
Jul 18, 2024
Introduction
Presenter: Luis Serrano from Serrano Academy
Topic: State space models and Mamba
Significance: A new architecture enhancing large language models
Origin: Introduced by Gu and Dao
Combining RNNs and CNNs
State space models merge concepts from recurrent neural networks (RNNs) and convolutional neural networks (CNNs): the same linear model can be run step by step like an RNN or unrolled into a convolution like a CNN
Aim: generate language efficiently
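To make that duality concrete, here is a minimal sketch in Python with made-up scalar parameters (A, B, C; D omitted): running the recurrence step by step and convolving the input with the unrolled kernel K = [CB, CAB, CA^2B, ...] produce the same outputs.

```python
import numpy as np

A, B, C = 0.9, 0.5, 2.0             # illustrative scalar SSM parameters
x = np.array([1.0, 0.0, 3.0, 2.0])  # toy input sequence

# Recurrent (RNN-style) view: step through time, carrying a state.
h, y_rec = 0.0, []
for x_t in x:
    h = A * h + B * x_t
    y_rec.append(C * h)

# Convolutional (CNN-style) view: unroll the recurrence into a kernel
# K = [CB, CAB, CA^2B, ...] and convolve it with the input.
K = np.array([C * A**k * B for k in range(len(x))])
y_conv = [np.dot(K[:t + 1][::-1], x[:t + 1]) for t in range(len(x))]

print(np.allclose(y_rec, y_conv))  # True: both views agree
```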
Example: Race Car
Variables and Measures
Input (x): Maintenance (e.g., topping off fluids, general maintenance)
State (h): Vehicle health (e.g., gas level, oil level, tire condition, motor condition)
Output (y): Performance (e.g., speed)
Functions/Matrices
A: Describes the natural wear and tear of the car (state transition matrix)
B: Effects of maintenance on the car's state (control matrix)
C: Performance based on the car's state (observation matrix)
D: Direct effect of maintenance on performance (usually neglected in language models)
Equations
State transition: h_t = A * h_{t-1} + B * x_t
Output: y_t = C * h_t + D * x_t
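As a quick illustration, here is one step of these two equations in Python; the matrix shapes and values below are assumptions for demonstration, not numbers from the video.

```python
import numpy as np

A = np.array([[0.9, 0.1],
              [0.0, 0.8]])   # state transition matrix
B = np.array([[0.5],
              [0.3]])        # control matrix
C = np.array([[1.0, 2.0]])   # observation matrix
D = np.array([[0.0]])        # direct term (often dropped in language models)

h_prev = np.array([[1.0],
                   [0.5]])   # previous state h_{t-1}
x_t = np.array([[2.0]])      # current input x_t

h_t = A @ h_prev + B @ x_t   # state transition: h_t = A h_{t-1} + B x_t
y_t = C @ h_t + D @ x_t      # output:           y_t = C h_t + D x_t
print(h_t.ravel(), y_t.ravel())
```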
Numerical Example
Gas: 0.9 of previous day
Oil: 0.95 of previous day
Tires: 0.8 of previous day
Motor: 0.85 of previous day
State Transition Matrix: Accounts for mutual influences, like good tires saving gas
Control Matrix: Influence of input on state
Observation Matrix: Converts state into performance
Direct Action Matrix: Influence of maintenance on performance
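Putting the numbers together, a hedged sketch: only the four diagonal decay factors come from the notes; the tire-to-gas coupling and the B, C, and state values are made-up assumptions for illustration.

```python
import numpy as np

# Each component decays daily by the factors above; the 0.05 entry is a
# hypothetical coupling standing in for "good tires saving gas".
A = np.array([
    [0.90, 0.00, 0.05, 0.00],  # gas: 0.9 of previous day, helped by tires
    [0.00, 0.95, 0.00, 0.00],  # oil: 0.95 of previous day
    [0.00, 0.00, 0.80, 0.00],  # tires: 0.8 of previous day
    [0.00, 0.00, 0.00, 0.85],  # motor: 0.85 of previous day
])
B = np.array([[1.0], [1.0], [0.2], [0.1]])  # effect of one unit of maintenance
C = np.array([[0.3, 0.2, 0.2, 0.3]])        # weights turning state into speed

h = np.array([[0.8], [0.9], [0.7], [0.95]])  # today: gas, oil, tires, motor
x = np.array([[0.1]])                        # a little maintenance today

h_next = A @ h + B @ x   # tomorrow's vehicle health
speed = C @ h_next       # tomorrow's performance
print(h_next.ravel(), speed.ravel())
```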
Language Generation
Variables for Language Models
Context (State): Big vector encapsulating the details of the discussion
Last Word (Input): The last word in the context (e.g., …)
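A hypothetical end-to-end sketch of this loop: the context is a state vector updated once per token, and the input is the embedding of the last word. All sizes, matrices, and the greedy decoding choice below are assumptions for illustration, not details from the video.

```python
import numpy as np

rng = np.random.default_rng(0)
d_state, d_embed, vocab = 16, 8, 100

A = rng.normal(scale=0.1, size=(d_state, d_state))  # context transition
B = rng.normal(size=(d_state, d_embed))             # how a word updates context
C = rng.normal(size=(vocab, d_state))               # context -> next-word scores
E = rng.normal(size=(vocab, d_embed))               # toy word embeddings

h = np.zeros(d_state)   # context: big vector encapsulating the discussion
token = 42              # arbitrary starting word id
for _ in range(5):
    x = E[token]                    # embed the last word (the input)
    h = A @ h + B @ x               # fold it into the context (the state)
    logits = C @ h                  # score every word in the vocabulary
    token = int(np.argmax(logits))  # pick the next word greedily
    print(token)
```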