Long Short-Term Memory (LSTM)

Jul 20, 2024

Introduction

  • Speaker: Fathorrahman
  • Topic: Long Short-Term Memory (LSTM), part of Recurrent Neural Network (RNN)
  • Practical applications: Speech Recognition, Forecasting, Anomaly Detection

Difference Between Feedforward Networks and Recurrent Neural Networks (RNN)

  • Feedforward Networks: Input is processed in one direction, from input layer to output layer
  • RNN: Sequential data processing; feedback connections between units create memory blocks to retain information
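The one-direction flow can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the lecture: the layer sizes and weights are arbitrary, and the point is only that a feedforward network maps each input to an output with no state carried between calls.

```python
import numpy as np

def feedforward(x, W1, b1, W2, b2):
    """One-direction pass: input -> hidden -> output, no memory of past inputs."""
    h = np.tanh(W1 @ x + b1)   # hidden layer
    return W2 @ h + b2         # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=3)                        # one input vector (3 features)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
y = feedforward(x, W1, b1, W2, b2)            # output depends only on this x
```

Calling the function twice with the same input always yields the same output, which is exactly what an RNN's feedback connections change.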

Structure of RNN

  • Memory Blocks: Chain-like blocks that store and pass information along the sequence
  • Data Flow: Information from each previous block affects the next block
  • Example: Using sequence data for predictions (e.g., time steps)
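The chain of memory blocks above amounts to one update rule applied at every time step: the new hidden state mixes the current input with the previous hidden state. A minimal sketch (sizes and weights are assumptions for illustration):

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One memory block: combines the current input with the previous state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(1)
seq = rng.normal(size=(5, 3))   # sequence: 5 time steps, 3 features each
Wx = rng.normal(size=(4, 3))    # input-to-hidden weights
Wh = rng.normal(size=(4, 4))    # hidden-to-hidden (feedback) weights
b = np.zeros(4)

h = np.zeros(4)                 # initial hidden state
for x_t in seq:                 # chain: each block's output feeds the next
    h = rnn_step(x_t, h, Wx, Wh, b)
```

After the loop, `h` summarizes the whole sequence, which is how information from earlier blocks affects later ones.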

Working with Data

  • Single Data Point: Represented by multiple variables
  • Sequential Data: Several variables with multiple time steps

Implementation

  • Feedforward Network: Each sample is a flat tuple of variables
  • RNN: Array-like structure of sequence data processed over multiple time steps
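The two input layouts can be made concrete with small arrays. The values here are made up; what matters is the shape convention, which frameworks such as Keras express as `(samples, timesteps, features)`:

```python
import numpy as np

# Feedforward input: one flat vector of variables per sample
x_ff = np.array([0.3, 1.2, -0.5])             # shape (3,): 3 features

# RNN/LSTM input: the same variables observed over several time steps
x_seq = np.array([[0.3, 1.2, -0.5],
                  [0.4, 1.1, -0.4],
                  [0.5, 1.0, -0.3]])          # shape (3, 3): 3 time steps x 3 features

# Batched for a framework: (samples, timesteps, features)
batch = x_seq[np.newaxis, ...]                # shape (1, 3, 3)
```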

Long Short-Term Memory (LSTM)

  • Difference from RNN: Introduction of gates to manage information flow
    • Forget Gate: Decides what information to discard
    • Input Gate: Adds new information
    • Cell State: Tracks the state over sequences and adds ability to remember long-term dependencies
  • Computation: Uses activation functions like sigmoid and tanh to manage information flow
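The gate computations above can be written out directly. This is a from-scratch sketch of the standard LSTM equations, not the lecture's code; the stacked-parameter layout and sizes are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the parameters of the four gates
    in the order: forget, input, candidate, output."""
    z = W @ x_t + U @ h_prev + b
    n = h_prev.size
    f = sigmoid(z[0*n:1*n])    # forget gate: what to discard from the cell state
    i = sigmoid(z[1*n:2*n])    # input gate: how much new information to add
    g = np.tanh(z[2*n:3*n])    # candidate values for the cell state
    o = sigmoid(z[3*n:4*n])    # output gate: what part of the cell to expose
    c = f * c_prev + i * g     # cell state: carries long-term dependencies
    h = o * np.tanh(c)         # hidden state passed to the next step
    return h, c

rng = np.random.default_rng(2)
n, d = 4, 3                    # hidden size, input features
W = rng.normal(size=(4*n, d))
U = rng.normal(size=(4*n, n))
b = np.zeros(4*n)
h, c = np.zeros(n), np.zeros(n)
for x_t in rng.normal(size=(5, d)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Note how the cell state `c` is updated only by elementwise gating, which is what lets gradients (and information) survive over long sequences.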

Error Functions and Training

  • Feature Scaling: Data normalization (e.g., using Min-Max Scaler) to improve model performance
  • Model Training: Splitting data into training and test sets
  • Hyperparameter Tuning: Adjusting parameters like number of units and dropout rate for optimal performance
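The first two steps above can be sketched in a few lines. The scaling formula matches what a Min-Max scaler does; the data and the 80/20 ratio are illustrative assumptions, and the split is chronological because shuffling would leak future values into the training set of a time series:

```python
import numpy as np

def min_max_scale(x, lo=0.0, hi=1.0):
    """Min-Max scaling: linearly map values into [lo, hi]."""
    x = np.asarray(x, dtype=float)
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

data = np.array([10.0, 20.0, 15.0, 40.0, 25.0])
scaled = min_max_scale(data)             # min -> 0.0, max -> 1.0

# Chronological train/test split (no shuffling for time series)
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]
```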

Practical Example: Energy Consumption Forecasting

  • Data Set: Electric power consumption over 10 years
  • Data Preprocessing: Sorting, date conversion, normalization
  • Model Building: Using Keras and TensorFlow libraries to create LSTM architecture
  • Sliding Window: Converting sequence data into subsequences for modeling
  • Evaluation Metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE)
  • Visualization: Plotting predicted vs actual values to assess model performance
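The sliding-window step and the two evaluation metrics can be sketched as below. The series here is a stand-in for the scaled consumption values, and the window length of 3 is arbitrary; in the actual pipeline the resulting `X` would be fed to a Keras LSTM model, which is omitted here:

```python
import numpy as np

def sliding_window(series, window):
    """Turn a 1-D series into (samples, window) inputs and next-step targets."""
    X = np.array([series[i:i+window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

series = np.arange(10, dtype=float)      # stand-in for scaled consumption values
X, y = sliding_window(series, window=3)  # X: (7, 3), y: (7,)

# Evaluation metrics from the notes
def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))
```

Each row of `X` is one subsequence and the matching entry of `y` is the value that immediately follows it, which is exactly what the forecasting model is trained to predict.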

Key Takeaways

  • LSTM is powerful for handling sequential data due to its memory cell structure
  • Proper Data Preparation is crucial (normalization, data splitting)
  • Model Evaluation: Visualizing results helps in understanding model accuracy and performance