
MIT 6.S191 Lecture: Deep Learning Foundations

Jun 29, 2024

Instructor: Alexander Amini

Introduction

  • Focus: Foundations of AI and deep learning
  • Field is rapidly changing; the course has been taught for the past 8 years
  • AI is now solving problems once thought to require human-level intelligence
  • Intro AI courses must be updated frequently to keep pace with these advancements

AI and Deep Learning Progression

  • Example: the course's intro video was created using AI (went viral; once cost roughly $10,000 in compute, now commoditized)
  • AI can generate hyper-realistic media and working software code from plain-English prompts
  • Objective: understand the foundations needed to create future AI technologies

Fundamental Concepts

Intelligence

  • Intelligence: Processing information to inform future decisions
  • AI: Giving computers the ability to process information and make decisions
  • Machine Learning (ML): Subset of AI - teaching computers to perform a task directly from data rather than from hand-coded rules
  • Deep Learning (DL): Subset of ML - uses neural networks (NNs) to extract patterns from raw data

Neural Networks Overview

  • Neural Network Basics: Composed of perceptrons (artificial neurons)
  • Perceptron steps: Dot product of inputs and weights -> Add bias -> Apply nonlinearity
  • Activation functions introduce nonlinearity, which is necessary for handling complex, nonlinear real-world data
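The activation functions named later in the notes (sigmoid and ReLU) can be sketched in a few lines. This is a minimal NumPy illustration, not course code (the labs use TensorFlow):

```python
import numpy as np

def sigmoid(z):
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Passes positive inputs through unchanged; zeroes out negatives."""
    return np.maximum(0.0, z)

print(sigmoid(0.0))           # 0.5
print(relu(-2.0), relu(3.0))  # 0.0 3.0
```

Both functions are nonlinear, which is the property that lets stacked layers model nonlinear data.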

Key Concepts in Neural Networks

Perceptron Model

  • A single neuron ingests inputs (X1, X2, ..., Xn), multiplies them by weights (W1, W2, ..., Wn), sums, adds a bias, and applies a nonlinearity (an activation function such as sigmoid or ReLU)
  • Importance of nonlinearity: without it, stacked layers collapse into a single linear function and cannot model nonlinear data
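The perceptron steps above (dot product, bias, nonlinearity) can be sketched directly. The input, weight, and bias values below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b):
    # Dot product of inputs and weights, add bias, apply nonlinearity.
    return sigmoid(np.dot(w, x) + b)

x = np.array([1.0, 2.0])    # inputs X1, X2 (assumed values)
w = np.array([0.5, -0.5])   # weights W1, W2 (assumed values)
b = 0.5                     # bias

y = perceptron(x, w, b)     # sigmoid(0.5*1 - 0.5*2 + 0.5) = sigmoid(0) = 0.5
print(y)                    # 0.5
```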

Building Neural Networks

  • Layers: Input layer -> Hidden layer(s) -> Output layer
  • Multi-layer ("deep") networks: each layer progressively transforms its input into a more useful representation
  • Code Implementation: Layers and nonlinearities are easily defined using frameworks like TensorFlow
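A two-layer forward pass (input -> hidden -> output) is just the perceptron computation applied layer by layer. This NumPy sketch uses assumed layer sizes and random weights; the course labs would express the same structure with TensorFlow layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b, activation):
    """One fully connected layer: matrix multiply, add bias, apply nonlinearity."""
    return activation(x @ W + b)

relu = lambda z: np.maximum(0.0, z)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Input layer (2 features) -> hidden layer (3 units) -> output layer (1 unit).
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

x = np.array([[1.0, -1.0]])     # one example with two features (assumed)
h = dense(x, W1, b1, relu)      # hidden representation
y = dense(h, W2, b2, sigmoid)   # output squashed into (0, 1)
print(y.shape)                  # (1, 1)
```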

Training Neural Networks

  • Compute the loss with a loss function that measures how far predictions are from the true labels
  • Example: Binary classification using cross-entropy loss
  • Gradient descent minimizes the loss: compute gradients via backpropagation, then iteratively adjust the weights in the opposite direction
  • Use optimizers such as SGD (and adaptive-learning-rate variants like Adam) for efficient training
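The training loop above (forward pass, cross-entropy loss, gradient step) can be demonstrated end to end on a single sigmoid neuron. The toy dataset (the AND function), learning rate, and step count below are assumptions for illustration; real training would use a framework's automatic differentiation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary-classification data: the logical AND function (assumed example).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

w = np.zeros(2)
b = 0.0
lr = 0.5  # learning rate (assumed value)

for _ in range(2000):
    p = sigmoid(X @ w + b)           # forward pass
    # Cross-entropy loss: -mean(y*log(p) + (1-y)*log(1-p)).
    # Its gradient w.r.t. the pre-activation is simply (p - y).
    grad = p - y
    w -= lr * (X.T @ grad) / len(y)  # gradient descent step on weights
    b -= lr * grad.mean()            # gradient descent step on bias

preds = (sigmoid(X @ w + b) > 0.5).astype(float)
print(preds)  # the neuron learns AND: [0. 0. 0. 1.]
```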

Practical Tips for Training

Batching Data

  • Use mini-batches rather than the full dataset so each gradient step is cheaper and training is more efficient
  • Mini-batches also parallelize well on GPUs for accelerated computation
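Mini-batching amounts to shuffling the dataset and slicing it into fixed-size chunks each epoch. A minimal sketch (the array sizes below are assumed for illustration):

```python
import numpy as np

def minibatches(X, y, batch_size, rng):
    """Shuffle the data, then yield it in fixed-size chunks."""
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(10, 2)  # 10 examples, 2 features
y = np.arange(10)

sizes = [len(xb) for xb, _ in minibatches(X, y, batch_size=4, rng=rng)]
print(sizes)  # [4, 4, 2] - the last batch holds the remainder
```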

Addressing Overfitting

  • Overfitting: Model performs well on training data but poorly on held-out test data
  • **Regularization Techniques:**
    • Dropout: Randomly deactivating a fraction of neurons during training
    • Early Stopping: Monitoring training and test loss curves and stopping training before the test loss begins to rise
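Dropout can be sketched as a random binary mask over a layer's activations. This shows the common "inverted dropout" formulation (an implementation choice, not stated in the lecture), which scales surviving activations so their expected value is unchanged:

```python
import numpy as np

def dropout(h, rate, rng, training=True):
    """Inverted dropout: randomly zero activations during training,
    scaling the survivors by 1/(1-rate) so expected values stay the same."""
    if not training or rate == 0.0:
        return h
    keep = rng.random(h.shape) >= rate
    return h * keep / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones((1, 8))  # a layer of activations (assumed all 1.0 for clarity)

out_train = dropout(h, rate=0.5, rng=rng)                  # some units zeroed, rest scaled to 2.0
out_eval = dropout(h, rate=0.5, rng=rng, training=False)   # unchanged at inference time
print(out_train)
print(out_eval)
```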

Syllabus Overview: Labs and Projects

Software Labs

  • Cover various applications: Music generation (Lab 1), Computer vision (Lab 2), Large language models (new lab)
  • Labs are coupled with the lectures, and prizes are awarded for top solutions

Final Project

  • Shark Tank-style project pitch competition with prizes
  • Emphasis on hands-on practice and application of learned concepts

Closing Points

  • Foundations to build scalable and advanced AI models
  • Deep learning libraries do backpropagation automatically
  • Next lecture: Sequence modeling with RNNs and Transformers by Ava

Resources and Support

  • Slides, course materials available online
  • Piazza for questions and discussions
  • Reach out to instructors and TAs for help

Note: This is an intensive one-week course. Engage, practice, and utilize resources efficiently for maximum learning.