
Reinforcement Learning Notes

Jul 4, 2024

Reinforcement Learning Course Lecture by Nicholas Renotte

Introduction

  • Trainer: Nicholas Renotte
  • Course Goal: To progress from beginner to being adept at applying reinforcement learning
  • Topics Covered:
    • Setting up environment
    • Working with algorithms
    • Testing and building custom environments

Course Outline

Step 1: RL (Reinforcement Learning) Overview

  • How it works: Teaching agents through trial and error
  • Key components (a minimal interaction loop is sketched after this list):
    • Agent
    • Environment
    • Actions
    • Rewards/Observations
  • Applications:
    • Autonomous driving (e.g. the CARLA simulator)
    • Securities Trading
    • Neural network architecture search
    • Robotics
    • Gaming
  • Limitations:
    • Overkill for simple problems
    • Environment assumptions
    • Training duration and stability
  • Examples:
    • Training models to balance poles, drive cars, etc.
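
The key components above can be seen in a minimal sketch of the agent-environment loop, assuming the classic OpenAI Gym API (4-tuple step return) and the CartPole environment; the random action choice stands in for a trained policy.

```python
import gym

# One episode of the agent-environment loop: the agent picks an action,
# the environment answers with an observation and a reward.
env = gym.make("CartPole-v1")
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()           # random "agent", purely for illustration
    obs, reward, done, info = env.step(action)   # classic gym 4-tuple API
    total_reward += reward
print(f"Episode reward: {total_reward}")
env.close()
```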

Step 2: Environment Setup

  • Tools Required:
    • OpenAI Gym
    • MuJoCo (a physics simulator) for robotics tasks
    • Various libraries based on application
  • Environment Representation: Spaces (each type is sketched after this list)
    • Box: a continuous range of values
    • Discrete: a fixed set of integer values
    • Tuple: a combination of other spaces
    • Dict: a named (dictionary-style) combination of spaces
    • MultiBinary and MultiDiscrete: vectors of binary or discrete values
  • Environment Testing: Using OpenAI Gym
    • Observation/Action Spaces interaction
    • Testing environments (Cartpole example)
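
A quick sketch of the space types and the CartPole check described above, assuming gym's spaces module; the shapes and ranges are arbitrary examples rather than values from the lecture.

```python
from gym.spaces import Box, Discrete, Tuple, Dict, MultiBinary, MultiDiscrete

Discrete(3).sample()                                     # one of {0, 1, 2}
Box(low=0, high=100, shape=(1,)).sample()                # continuous value(s) in a range
Tuple((Discrete(2), Box(0, 100, shape=(1,)))).sample()   # combination of spaces
Dict({"height": Discrete(2),
      "speed": Box(0, 100, shape=(1,))}).sample()        # named combination of types
MultiBinary(4).sample()                                  # e.g. array([0, 1, 1, 0])
MultiDiscrete([5, 2, 2]).sample()                        # several discrete ranges at once

# Inspecting an existing environment's spaces (CartPole example):
import gym
env = gym.make("CartPole-v1")
print(env.observation_space)   # Box of 4 values: cart position/velocity, pole angle/velocity
print(env.action_space)        # Discrete(2): push left or right
```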

Step-by-step Implementation

Step 3: Training Models

  • Model-free RL algorithms: A2C, PPO, DQN, HER, SAC, TD3
  • Algorithm Suitability: Based on the type of action space
    • Discrete
    • Continuous
  • Evaluation Metrics: Rewards, Entropy, etc.
  • Setting up and training models: using PyTorch-based libraries such as Stable Baselines3 (see the sketch after this list)
  • Saving/Loading Models: Methods and techniques for deployment
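
A minimal training sketch, assuming Stable Baselines3 (PyTorch backend), CartPole, and an illustrative timestep budget; the save path is hypothetical.

```python
import gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")

# MlpPolicy suits low-dimensional vector observations; SB3 wraps the env
# in a vectorised environment (DummyVecEnv) automatically.
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=20_000)

# Saving and reloading for deployment or continued training.
model.save("ppo_cartpole")          # hypothetical path
del model
model = PPO.load("ppo_cartpole", env=env)
```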

Step 4: Testing and Evaluation

  • Policy Evaluation: measuring the trained model's performance (e.g. mean episode reward)
  • TensorBoard: for metric visualization (both are sketched below)
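
A sketch of both evaluation and logging, assuming Stable Baselines3's evaluate_policy helper; the log directory name is an assumption. Launch TensorBoard with `tensorboard --logdir ./training_logs` to view the curves.

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("CartPole-v1")

# tensorboard_log points SB3 at a directory for TensorBoard event files
# (directory name here is an assumption, not from the lecture).
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="./training_logs")
model.learn(total_timesteps=20_000)

# evaluate_policy rolls out the policy for n episodes and reports
# the mean and standard deviation of the episode reward.
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"Mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")
```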

Step 5: Callbacks, Algorithms, Architectures

  • Reward Thresholds: Stop training when threshold reached
  • Using Callbacks: To monitor training and save the best models (see the sketch after this list)
  • Custom Architectures: Changing neural network setups
  • Alternate Algorithms: Implementing various RL algorithms
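
A sketch tying the ideas above together, assuming Stable Baselines3's StopTrainingOnRewardThreshold and EvalCallback; the threshold, layer sizes, and save path are illustrative, and on older SB3 versions net_arch may need to be wrapped in a list.

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnRewardThreshold

env = gym.make("CartPole-v1")

# Stop training once the evaluation reward crosses a threshold,
# saving the best model seen so far along the way.
stop_callback = StopTrainingOnRewardThreshold(reward_threshold=190, verbose=1)
eval_callback = EvalCallback(
    env,
    callback_on_new_best=stop_callback,
    eval_freq=10_000,
    best_model_save_path="./best_model",   # hypothetical path
    verbose=1,
)

# Custom architecture: four 128-unit layers for both the policy (pi)
# and value (vf) networks (sizes are an example only).
policy_kwargs = dict(net_arch=dict(pi=[128, 128, 128, 128], vf=[128, 128, 128, 128]))

model = PPO("MlpPolicy", env, policy_kwargs=policy_kwargs, verbose=1)
model.learn(total_timesteps=100_000, callback=eval_callback)

# Alternate algorithms (e.g. DQN for discrete actions) follow the same pattern:
# from stable_baselines3 import DQN; model = DQN("MlpPolicy", env, verbose=1)
```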

Projects

Project 1: Reinforcement Learning for Atari Games

  • Setup Dependencies: Gym, A2C, and vectorized-environment (vec_env) utilities
  • ROM Installation: For Atari environments
  • Implementing Environment: Using gym’s tools
  • Training: Using CNN policies and testing via random actions (see the sketch below)
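
A training sketch for an Atari game, assuming Stable Baselines3's make_atari_env and VecFrameStack helpers and that the Atari ROMs are already installed; the environment id ("Breakout-v0") and timestep count are illustrative.

```python
from stable_baselines3 import A2C
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

# Four parallel Atari environments with four stacked frames, so the
# CNN policy can infer motion from consecutive screens.
# The exact id depends on the installed ALE/ROM versions.
env = make_atari_env("Breakout-v0", n_envs=4, seed=0)
env = VecFrameStack(env, n_stack=4)

model = A2C("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("a2c_breakout")          # hypothetical path
```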

Project 2: Reinforcement Learning for Autonomous Driving

  • Setup Dependencies: SWIG, Pyglet
  • Environment Setup: Box2D racing car
  • Training: PPO algorithm
  • Testing: Using the trained model, adjusting action spaces (see the sketch below)
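
A sketch of the racing-car setup, assuming the Box2D CarRacing environment (the exact id, "CarRacing-v0" or "CarRacing-v2", depends on the installed gym version) and PPO with a CNN policy over pixel observations.

```python
import gym
from stable_baselines3 import PPO

# CarRacing has pixel observations and a continuous (steer, gas, brake)
# action space, so a CNN policy with PPO is a natural fit.
env = gym.make("CarRacing-v0")
model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Testing the trained driver for one episode (classic gym 4-tuple API).
obs = env.reset()
done = False
while not done:
    action, _states = model.predict(obs)
    obs, reward, done, info = env.step(action)
    env.render()
env.close()
```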

Project 3: Custom Environment Creation

  • Example: Shower temperature environment
  • Setup Dependencies: importing the necessary libraries
  • Environment Coding: defining the state, step logic, and rewards (sketched after this list)
  • Training: PPO algorithm
  • Evaluation: Testing and refining model
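
A sketch of a shower-temperature environment built on the classic gym.Env interface (4-tuple step return); the comfort band, episode length, and reward values are illustrative.

```python
import random
import numpy as np
import gym
from gym import spaces
from stable_baselines3 import PPO

class ShowerEnv(gym.Env):
    """Toy environment: keep the shower temperature in a comfortable band."""

    def __init__(self):
        self.action_space = spaces.Discrete(3)   # 0 = turn down, 1 = hold, 2 = turn up
        self.observation_space = spaces.Box(low=0, high=100, shape=(1,), dtype=np.float32)
        self.reset()

    def step(self, action):
        self.state += action - 1                          # map {0,1,2} to {-1,0,+1} degrees
        self.shower_length -= 1
        reward = 1 if 37 <= self.state <= 39 else -1      # reward staying in the comfort band
        done = self.shower_length <= 0
        return np.array([self.state], dtype=np.float32), reward, done, {}

    def reset(self):
        self.state = 38 + random.randint(-3, 3)           # start near the target temperature
        self.shower_length = 60                           # fixed episode length
        return np.array([self.state], dtype=np.float32)

env = ShowerEnv()
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=40_000)
```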

Summary

  • Learning Resources: Courses by experts like David Silver or books by pioneers (Richard Sutton)
  • Next Steps:
    • Hyperparameters, detailed environments, end-to-end implementation
  • Final tips: Importance of understanding required metrics, training duration, algorithm choices