Lecture Notes: Long Short-Term Memory (LSTM)
Jul 20, 2024
Introduction
Speaker: Fathorrahman
Topic: Long Short-Term Memory (LSTM), a variant of the Recurrent Neural Network (RNN)
Practical applications: Speech Recognition, Forecasting, Anomaly Detection
Difference Between Feedforward Networks and Recurrent Neural Networks (RNN)
Feedforward Networks: Input is processed in one direction, from the input layer to the output layer
RNN: Processes sequential data; feedback connections between units create memory blocks that retain information across time steps
Structure of RNN
Memory Blocks: Chain-like blocks that store and pass information along the sequence
Data Flow: Information from each previous block affects the next block
Example: Using sequence data for predictions (e.g., time steps)
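The chain of memory blocks described above is a simple recurrence: each step combines the current input with the hidden state passed along from the previous block. A minimal NumPy sketch (sizes and weight initialization are arbitrary assumptions, not from the lecture):

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b):
    """Run a vanilla RNN over a sequence.

    x_seq: array of shape (time steps, input variables).
    Returns the hidden state after the last time step.
    """
    h = np.zeros(W_h.shape[0])           # initial memory is empty
    for x_t in x_seq:                    # each block feeds the next
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h

rng = np.random.default_rng(0)
x_seq = rng.normal(size=(5, 3))          # 5 time steps, 3 variables each
W_x = rng.normal(size=(4, 3)) * 0.1      # input-to-hidden weights
W_h = rng.normal(size=(4, 4)) * 0.1      # hidden-to-hidden (feedback) weights
b = np.zeros(4)
h_last = rnn_forward(x_seq, W_x, W_h, b)
```

The feedback term `W_h @ h` is what distinguishes this from a feedforward pass: information from earlier time steps influences every later step.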
Working with Data
Single Data Point: Represented by multiple variables
Sequential Data: Several variables observed over multiple time steps
Implementation:
Feedforward Network: Tuples of variables, one vector per sample
RNN: Array-like structure of sequence data processed over multiple time steps
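This distinction shows up directly in array shapes: a feedforward network consumes one feature vector per sample, while an RNN consumes a sequence of such vectors. A small shape sketch (the sizes are arbitrary, chosen only for illustration):

```python
import numpy as np

# Feedforward: each sample is a single tuple of variables
ff_batch = np.zeros((32, 8))         # (samples, variables)

# RNN: each sample is a sequence of variable tuples over time
rnn_batch = np.zeros((32, 10, 8))    # (samples, time steps, variables)
```

The extra time-step axis is exactly what the recurrent memory blocks iterate over.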
Long Short-Term Memory (LSTM)
Difference from RNN: Introduction of gates to manage information flow
Forget Gate: Decides what information to discard
Input Gate: Adds new information
Cell State: Tracks the state over sequences and adds the ability to remember long-term dependencies
Computation: Uses activation functions such as sigmoid and tanh to manage information flow
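The gate computations can be written out explicitly. A minimal single-step LSTM cell in NumPy (weight shapes and initialization are assumptions for illustration; production implementations typically fuse the four gate matrices into one):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W/U/b hold parameters for the
    forget (f), input (i), candidate (g), and output (o) gates."""
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])  # forget gate: what to discard
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])  # input gate: what to add
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])  # candidate cell update
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])  # output gate
    c = f * c_prev + i * g           # cell state carries long-term memory
    h = o * np.tanh(c)               # hidden state passed to the next step
    return h, c

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4
W = {k: rng.normal(size=(n_hid, n_in)) * 0.1 for k in "figo"}
U = {k: rng.normal(size=(n_hid, n_hid)) * 0.1 for k in "figo"}
b = {k: np.zeros(n_hid) for k in "figo"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
```

Note how the sigmoid gates output values in (0, 1) and act as soft switches on the information flow, while tanh squashes the candidate values and the cell state into (-1, 1).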
Error Functions and Training
Feature Scaling: Data normalization (e.g., using a Min-Max Scaler) to improve model performance
Model Training: Splitting data into training and test sets
Hyperparameter Tuning: Adjusting parameters such as the number of units and the dropout rate for optimal performance
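Min-max scaling and a chronological train/test split take only a few lines. This sketch uses plain NumPy rather than scikit-learn's MinMaxScaler; the toy series and the 80/20 split ratio are assumptions, not values from the lecture:

```python
import numpy as np

series = np.arange(100, dtype=float)   # stand-in for the real time series

# Min-max scaling to the [0, 1] range
lo, hi = series.min(), series.max()
scaled = (series - lo) / (hi - lo)

# Chronological split: no shuffling, since order matters for time series
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]
```

Keeping the split chronological matters here: shuffling a time series before splitting would leak future values into the training set.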
Practical Example: Energy Consumption Forecasting
Data Set: Electric power consumption over 10 years
Data Preprocessing: Sorting, date conversion, normalization
Model Building: Using the Keras and TensorFlow libraries to create the LSTM architecture
Sliding Window: Converting sequence data into subsequences for modeling
Evaluation Metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE)
Visualization: Plotting predicted vs. actual values to assess model performance
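The sliding-window step and both evaluation metrics can be sketched directly. The window length, the toy series, and the naive last-value predictor below are assumptions for illustration; the lecture's actual Keras model would supply the real predictions:

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (X, y) pairs: each window of past
    values is used to predict the next value."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

series = np.sin(np.linspace(0, 10, 50))
X, y = make_windows(series, window=5)   # X: (45, 5), y: (45,)

# Stand-in prediction: repeat the last value of each window
y_pred = X[:, -1]
mse = np.mean((y - y_pred) ** 2)        # Mean Squared Error
mae = np.mean(np.abs(y - y_pred))       # Mean Absolute Error
```

In the forecasting example, `X` and `y` would be fed to the LSTM (after adding the trailing feature axis Keras expects), and the same MSE/MAE computation would score its predictions against the held-out test set.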
Key Takeaways
LSTM is powerful for handling sequential data due to its memory cell structure
Proper data preparation is crucial (normalization, data splitting)
Model Evaluation: Visualizing results helps in understanding model accuracy and performance