🚗

Deep Learning for Self-Driving Cars - Lecture Notes

Jul 4, 2024

Introduction

  • Course: MIT 6.S094, Deep Learning for Self-Driving Cars, part of a series of MIT courses on deep learning.
  • Resources: Lecture videos and slides are available at deeplearning.mit.edu.
  • Assignments: Emailed to registered students.
  • Contact: [email protected] for questions and comments.

Deep Learning Basics

Overview

  • Deep Learning (DL): Extracts useful patterns from data with minimal human effort.
  • Key Concept: Optimization of neural networks using languages and libraries such as Python and TensorFlow.
  • Challenge: Asking good questions and obtaining good data.

Historical Context

  • Digitization: Easy access to data in digital form.
  • Hardware: Advancements in CPUs, GPUs, TPUs enable large-scale execution of DL algorithms.
  • Community and Tools: Collaboration and tools like GitHub, TensorFlow, PyTorch speed up problem-solving.
  • Applications: Face recognition, scene understanding, NLP, medical diagnosis, autonomous vehicles, ads, recommender systems, games.

Key Developments in Neural Networks

Historical Milestones

  • 1940s: Early neural network concepts.
  • 1950s: Perceptron implementation.
  • 1970s–80s: Backpropagation, restricted Boltzmann machines, RNNs.
  • 1990s: CNNs, MNIST dataset, LSTM, bi-directional RNNs.
  • 2006: Deep Learning rebranded with Deep Belief Nets.
  • 2009: ImageNet dataset.
  • 2012: AlexNet and major improvements in DL.
  • 2014: GANs introduced, DeepFace for face recognition.
  • 2016-17: AlphaGo and AlphaZero achievements.
  • 2018: Year of NLP breakthroughs (e.g., Google's BERT).

Tooling Evolution

  • 1960s to present: Tooling has progressed from hand-built perceptrons to frameworks such as TensorFlow and PyTorch.
  • Importance: Tools minimize human effort needed to reach solutions.

Practical Applications and Challenges

Deep Learning in Real-world Applications

  • Humanoid Robotics: Limited DL use, mostly traditional methods.
  • Autonomous Vehicles: Predominantly non-DL methods except for perception.

Ethical Issues and Safety

  • AI Safety: Need for ethical considerations, human oversight.
  • Unexpected Consequences: Optimizing for a narrow objective can produce unforeseen behaviors, e.g., game-playing agents exploiting reward loopholes to maximize score.

Theoretical Foundations of Deep Learning

Learning Representations

  • Core Idea: Form higher-level abstractions of raw data so that interpretation and classification become easier.
  • Human-Driven Goal: Simplify complex problems into compact representations (in the spirit of Einstein's view that explanations should be as simple as possible).
  • Compression and Simplicity: Find simple yet effective representations (e.g., the heliocentric model greatly simplified descriptions of planetary motion).

Removing Human Input

  • Reduced Human Role: DL automates feature extraction, reducing need for expert input.
  • Limitations & Trade-offs: Balancing excitement and realism (Gartner Hype Cycle).

Technical Aspects of Neural Networks

Fundamental Unit: Neuron

  • Basic Structure: Input weights, bias, activation function, output.
  • Comparison with Biological Neurons: Simplicity of artificial neurons vs. biological complexity.
  • Efficiency Issues: The brain is far more power-efficient, and biological learning differs from backpropagation.
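
In code, a neuron reduces to a weighted sum passed through a nonlinearity. A minimal NumPy sketch (the weights, bias, and inputs below are made-up values for illustration):

    import numpy as np

    def neuron(x, w, b):
        """One artificial neuron: weighted sum of inputs plus bias,
        squashed by a sigmoid activation."""
        z = np.dot(w, x) + b              # pre-activation: w . x + b
        return 1.0 / (1.0 + np.exp(-z))   # sigmoid maps z to (0, 1)

    x = np.array([0.5, -1.2, 3.0])   # inputs
    w = np.array([0.4, 0.1, -0.7])   # weights (learned in practice)
    b = 0.2                          # bias (learned in practice)
    print(neuron(x, w, b))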

Network Architecture

  • Layers: Input, hidden, output; stacked neurons form deep networks.
  • Universal Approximation: A network with a single hidden layer can, in principle, approximate any continuous function; in practice, depth makes learning tractable.
  • Parallelizability: Efficient execution on GPUs and TPUs.
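
As a concrete sketch, a small fully connected network with one hidden layer in Keras (layer sizes here are arbitrary choices, not values from the lecture):

    import tensorflow as tf

    # Input -> hidden -> output; stacking more hidden layers makes the network "deep".
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),                    # e.g. a flattened 28x28 image
        tf.keras.layers.Dense(128, activation='relu'),   # hidden layer
        tf.keras.layers.Dense(10, activation='softmax')  # output: 10 class probabilities
    ])
    model.summary()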

Training Neural Networks

  • Activation Functions: Key to non-linearity and learning (e.g., ReLU, sigmoid).
  • Loss Functions: MSE for regression, cross-entropy for classification.
  • Backpropagation: Adjusts weights based on error gradients.
  • Optimization Algorithms: Variants such as SGD and momentum-based methods drive the learning process.
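
A minimal sketch of this loop, gradient descent on mean squared error for a single linear unit, in pure NumPy (the toy data and learning rate are made up for illustration):

    import numpy as np

    # Toy regression data: y = 2x + 1 plus noise
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=100)
    y = 2 * X + 1 + 0.1 * rng.standard_normal(100)

    w, b, lr = 0.0, 0.0, 0.1
    for epoch in range(200):
        y_hat = w * X + b              # forward pass
        err = y_hat - y
        grad_w = 2 * np.mean(err * X)  # dMSE/dw
        grad_b = 2 * np.mean(err)      # dMSE/db
        w -= lr * grad_w               # gradient descent update
        b -= lr * grad_b

    print(w, b)  # should approach 2 and 1

Backpropagation generalizes these hand-derived gradients to networks of arbitrary depth via the chain rule.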

Regularization Techniques

  • Overfitting: Regularization prevents the network from memorizing the training data instead of generalizing.
  • Validation & Early Stopping: Monitor performance on validation set to avoid overfitting.
  • Dropout: Randomly removes neurons during training.
  • Normalization: Input and batch normalization to stabilize learning.
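
A hedged Keras sketch combining these techniques, dropout, batch normalization, and early stopping on a validation set (all hyperparameters are illustrative):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.BatchNormalization(),   # normalize activations to stabilize learning
        tf.keras.layers.Dropout(0.5),           # randomly drop half the units during training
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

    # Stop training when validation loss stops improving, keeping the best weights.
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=5, restore_best_weights=True)
    # model.fit(x_train, y_train, validation_split=0.2, callbacks=[early_stop])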

Deep Learning Techniques and Models

Convolutional Neural Networks (CNNs)

  • Spatial Invariance: Uses filters to detect features regardless of their position.
  • Key Architectures: AlexNet, ResNet, GoogLeNet, SENet.
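
A minimal convolutional stack in Keras, far shallower than AlexNet or ResNet but showing the pattern of convolution, pooling, and a final classifier (filter counts and sizes are arbitrary):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),                 # small RGB image
        tf.keras.layers.Conv2D(16, 3, activation='relu'),  # 16 learned 3x3 filters slide over the image
        tf.keras.layers.MaxPooling2D(),                    # downsample, keeping strongest responses
        tf.keras.layers.Conv2D(32, 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation='softmax')    # classify from extracted features
    ])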

Object Detection and Semantic Segmentation

  • Object Detection: Identifies and classifies objects within an image (e.g., Faster R-CNN, SSD, YOLO).
  • Semantic Segmentation: Pixel-level classification of images.
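
Detectors like Faster R-CNN, SSD, and YOLO are commonly scored by intersection over union (IoU) between predicted and ground-truth boxes; a small illustrative sketch, assuming boxes in (x1, y1, x2, y2) format:

    def iou(box_a, box_b):
        """Intersection over union of two axis-aligned boxes, each (x1, y1, x2, y2)."""
        x1 = max(box_a[0], box_b[0])               # overlap rectangle
        y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2])
        y2 = min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.14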

Transfer Learning

  • Concept: Fine-tuning pre-trained models on new datasets.
  • Use Cases: Specialized tasks like pedestrian detection.
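
A hedged sketch of the fine-tuning recipe in Keras: load an ImageNet-pretrained backbone, freeze it, and train a new head (the choice of MobileNetV2 and the binary pedestrian head are illustrative assumptions, not from the lecture):

    import tensorflow as tf

    # Pretrained feature extractor, without its original ImageNet classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights='imagenet')
    base.trainable = False   # freeze pretrained weights; only the new head learns

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation='sigmoid')  # e.g. pedestrian vs. no pedestrian
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy')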

Autoencoders and Representations

  • Autoencoders: Use bottleneck architecture to compress data into meaningful representations.
  • Embeddings: Efficient representations for large datasets.
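
A minimal bottleneck autoencoder sketch in Keras (the 32-dimensional code is an arbitrary choice; the network is trained to reproduce its own input):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(784,))
    code = tf.keras.layers.Dense(32, activation='relu')(inputs)        # bottleneck: compressed representation
    outputs = tf.keras.layers.Dense(784, activation='sigmoid')(code)   # reconstruction of the input

    autoencoder = tf.keras.Model(inputs, outputs)
    autoencoder.compile(optimizer='adam', loss='mse')
    # autoencoder.fit(x, x, ...)  # note: the target is the input itself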

Generative Adversarial Networks (GANs)

  • Method: Generator and discriminator networks compete to produce realistic data.
  • Applications: Image generation, video consistency, high-resolution image creation.
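
A structural sketch of the two competing networks in Keras (layer sizes are illustrative; a full training loop alternates updates between the two):

    import tensorflow as tf

    # Generator: maps a random noise vector to a fake sample (here, a flattened image).
    generator = tf.keras.Sequential([
        tf.keras.Input(shape=(100,)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(784, activation='tanh')
    ])

    # Discriminator: classifies a sample as real (1) or generated (0).
    discriminator = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])

    # Training alternates: improve the discriminator on real vs. generated samples,
    # then update the generator to fool the (temporarily frozen) discriminator.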

Natural Language Processing (NLP)

  • Word Embeddings: Word2Vec for meaningful word representations.
  • Recurrent Neural Networks (RNNs): Handle sequence data, capture temporal dependencies.
  • LSTMs: Manage long-term dependencies in sequential data.
  • Attention Mechanisms: Improve context understanding in sequence data.
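
As a sketch, an embedding layer feeding an LSTM for sequence classification in Keras (vocabulary size, sequence length, and dimensions are illustrative):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(50,), dtype='int32'),                 # sequence of 50 token ids
        tf.keras.layers.Embedding(input_dim=10000, output_dim=64),  # learned word vectors
        tf.keras.layers.LSTM(64),                                   # carries context across the sequence
        tf.keras.layers.Dense(1, activation='sigmoid')              # e.g. sentiment label
    ])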

Automated Machine Learning (AutoML)

  • Neural Architecture Search: Automates discovery of effective neural network architectures.
  • Application: Streamlines DL processes by minimizing human intervention.
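
Full neural architecture search is beyond a snippet, but its core loop, sample an architecture, evaluate it, keep the best, can be sketched with random search (the search space and placeholder scoring below are stand-ins, not a real NAS system):

    import random

    search_space = {
        'layers': [1, 2, 3],
        'units': [32, 64, 128],
        'activation': ['relu', 'tanh'],
    }

    def sample_architecture():
        """Draw one candidate architecture from the search space."""
        return {key: random.choice(options) for key, options in search_space.items()}

    def evaluate(arch):
        """Placeholder: in practice, build and briefly train a model from `arch`
        and return its validation accuracy."""
        return random.random()

    best = max((sample_architecture() for _ in range(20)), key=evaluate)
    print(best)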

Deep Reinforcement Learning

  • Concept: Agents learn optimal actions through rewards in environments.
  • Applications: Robotics, video gaming, autonomous systems.
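
The reward-driven idea is easiest to see in tabular Q-learning, the precursor that deep RL extends by replacing the table with a neural network; this toy chain environment (reach the rightmost state) is purely illustrative:

    import numpy as np

    n_states, n_actions = 5, 2              # chain world: action 0 = left, 1 = right
    Q = np.zeros((n_states, n_actions))     # table of state-action values
    alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration

    rng = np.random.default_rng(0)
    for episode in range(500):
        s = 0
        while s != n_states - 1:            # rightmost state is the goal
            # Epsilon-greedy: mostly exploit the table, sometimes explore.
            a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0   # reward only at the goal
            # Update Q(s, a) toward r + gamma * max_a' Q(s', a').
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next

    print(Q)  # values should come to favor action 1 (move right) in every state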

Conclusion

  • Goal: Progressing from theory to practical applications in DL, emphasizing ethical considerations.
  • Resources: All materials available at deeplearning.mit.edu.

Thank you for attending!