MIT 6.S191 Lecture: Deep Learning Foundations
Instructor: Alexander Amini
Introduction
- Focus: Foundations of AI and Deep Learning
- Field: Rapidly changing (past 8 years of teaching the course)
- AI solving human-level problems
- Intro courses in AI change frequently due to rapid advancements
AI and Deep Learning Progression
- Example: Course intro video generated with AI (went viral; originally cost roughly $10,000 in compute, now commoditized)
- AI can generate hyper-realistic media and software code from plain-English prompts
- Objective: Understand the foundations needed to create future AI technologies
Fundamental Concepts
Intelligence
- Processing information to inform future decisions
- AI: Giving computers the ability to process info and make decisions
- Machine Learning (ML): Subset of AI - teaching computers to process info from data
- Deep Learning (DL): Subset of ML - uses neural networks (NNs) to process raw data
Neural Networks Overview
- Neural Network Basics: Composed of perceptrons (neurons)
- Steps: Dot product -> Add bias -> Apply nonlinearity
- Activation functions introduce nonlinearity, necessary for handling complex real-world data
Key Concepts in Neural Networks
Perceptron Model
- Single neuron ingesting inputs (X1, X2, ..., Xn) -> Weights (W1, W2, ..., Wn) -> Nonlinearity (activation function such as sigmoid or ReLU)
- Importance of nonlinearity: Enables handling nonlinear data
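The perceptron's three steps (dot product, add bias, apply nonlinearity) can be sketched in a few lines of NumPy; the input values and weights below are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b):
    """Single perceptron: dot product -> add bias -> apply nonlinearity."""
    z = np.dot(w, x) + b   # weighted sum of inputs plus bias
    return sigmoid(z)      # nonlinearity (activation function)

x = np.array([1.0, 2.0])    # inputs X1, X2 (illustrative)
w = np.array([0.5, -0.25])  # weights W1, W2
b = 0.0                     # bias
y = perceptron(x, w, b)     # sigmoid(0.5*1 - 0.25*2 + 0) = sigmoid(0) = 0.5
print(y)
```

Swapping `sigmoid` for ReLU or tanh changes only the final step; the dot-product-plus-bias structure stays the same.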
Building Neural Networks
- Layers: Input layer -> Hidden layer -> Output layer
- Multi-layer neural networks: Each layer transforms input progressively
- Code Implementation: Layers and nonlinearity easily defined using frameworks like TensorFlow
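As a rough NumPy sketch of the layered structure above (in a framework like TensorFlow each of these layers would be a one-line declaration; the dimensions and random-seed values here are arbitrary):

```python
import numpy as np

def relu(z):
    """ReLU activation: zero out negative values."""
    return np.maximum(0.0, z)

class DenseLayer:
    """One fully connected layer: y = activation(W x + b)."""
    def __init__(self, in_dim, out_dim, activation, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(out_dim, in_dim))  # random init
        self.b = np.zeros(out_dim)
        self.activation = activation

    def __call__(self, x):
        return self.activation(self.W @ x + self.b)

# Input (3 features) -> hidden layer (4 units) -> output layer (1 unit);
# each layer progressively transforms its input
hidden = DenseLayer(3, 4, relu, seed=1)
output = DenseLayer(4, 1, lambda z: z, seed=2)  # linear output

x = np.array([1.0, -1.0, 0.5])
y = output(hidden(x))
print(y.shape)
```

Stacking more `DenseLayer` calls gives a deeper network; the output of each layer is simply the input to the next.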
Training Neural Networks
- Compute loss using a loss function
- Example: Binary classification using cross-entropy loss
- Gradient descent to minimize loss: Compute gradients & adjust weights iteratively (backpropagation)
- Use optimizers for efficient training: plain SGD, or adaptive-learning-rate methods such as Adam
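The training loop described above can be sketched end to end for a one-weight binary classifier: forward pass, cross-entropy loss, gradients, and iterative weight updates. The toy dataset and learning rate are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary-classification data (illustrative): label is 1 when x > 0
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b, lr = 0.0, 0.0, 0.5   # weight, bias, learning rate

for step in range(200):
    p = sigmoid(w * X + b)                                      # forward pass
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))    # cross-entropy
    grad_w = np.mean((p - y) * X)   # dL/dw via chain rule (backpropagation)
    grad_b = np.mean(p - y)         # dL/db
    w -= lr * grad_w                # gradient descent: step against the gradient
    b -= lr * grad_b

print(loss)   # loss shrinks as the decision boundary is learned
```

In a real network, the framework computes `grad_w`-style terms for every weight automatically; the update rule is the same idea repeated across all layers.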
Practical Tips for Training
Batching Data
- Use mini-batches to reduce computational cost per step while keeping reasonably accurate gradient estimates
- Parallelization using GPUs for accelerated computation
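A minimal sketch of mini-batching: shuffle the dataset once per epoch, then hand out fixed-size chunks, each of which would feed one gradient step. The helper name and batch size are illustrative:

```python
import numpy as np

def minibatches(X, y, batch_size, seed=0):
    """Shuffle the data, then yield successive mini-batches."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # random order each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

X = np.arange(10, dtype=float).reshape(10, 1)
y = np.arange(10)
batches = list(minibatches(X, y, batch_size=4))
print([len(xb) for xb, _ in batches])   # three batches: 4, 4, and a final 2
```

On a GPU, each batch's forward and backward pass is computed for all examples in parallel, which is where the speedup comes from.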
Addressing Overfitting
- Overfitting: Model performs well on training data but poorly on test data
- **Regularization Techniques:**
- Dropout: Randomly shutting down neurons during training
- Early Stopping: Monitoring training and test loss curves and stopping training at the point where test loss starts to rise
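Dropout, as described above, can be sketched in a few lines. This is the common "inverted dropout" formulation, which rescales the surviving units so the expected activation is unchanged (the rate and array size here are illustrative):

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate` during
    training, scaling survivors by 1/(1-rate) to preserve the mean."""
    keep = rng.random(activations.shape) >= rate   # Boolean survival mask
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones(10000)                       # pretend hidden-layer activations
h_dropped = dropout(h, rate=0.5, rng=rng)
print((h_dropped == 0).mean())           # about half the units are shut off
print(h_dropped.mean())                  # expected activation stays near 1.0
```

At test time, dropout is simply turned off; because of the inverted scaling, no further correction is needed.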
Syllabus Overview: Labs and Projects
Software Labs
- Cover various applications: Music generation (Lab 1), Computer vision (Lab 2), Large language models (new lab)
- Coupled with lectures and prizes for top solutions
Final Project
- Shark Tank-style project pitch competition with prizes
- Emphasis on hands-on practice and application of learned concepts
Closing Points
- Foundations to build scalable and advanced AI models
- Deep learning libraries do backpropagation automatically
- Next lecture: Sequence modeling with RNNs and Transformers by Ava
Resources and Support
- Slides, course materials available online
- Piazza for questions and discussions
- Reach out to instructors and TAs for help
Note: This is an intensive one-week course. Engage, practice, and utilize resources efficiently for maximum learning.