Understanding Denoising Diffusion Models

Aug 22, 2024

Diffusion Models: Denoising Diffusion Probabilistic Models (DDPM)

Overview

  • Focus on denoising diffusion probabilistic models (DDPM).
  • Deep dive into forward and reverse diffusion processes.
  • Understanding the mathematical formulation of the training objective.

Key Concepts

Forward Process

  • Gradual destruction of image information through noise addition.
  • Each time step t adds Gaussian noise; after T steps the image is reduced to essentially pure random noise.
  • Moving from state X(t-1) to X(t).
  • After many steps, the result mimics a sample from a standard normal distribution.
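
Concretely, in the standard DDPM formulation each forward step is a small Gaussian transition controlled by a noise variance beta_t:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)
```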

Reverse Process

  • Learning the reverse of the forward process via a neural network.
  • The model learns to iteratively remove noise from a random sample until it resembles a sample from the original data distribution.
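
In the DDPM setup, each learned reverse step is also a Gaussian, with its mean (and optionally its variance) produced by the neural network with parameters theta:

```latex
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
```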

Intuition Behind Diffusion

  • Diffusion: movement from regions of higher concentration to regions of lower concentration; the analogy here is transforming a complex data distribution into a simple one.
  • Comparison with Variational Autoencoders (VAEs):
    • VAEs encode images into a Gaussian latent distribution in a single step and reconstruct images using a decoder.
    • Diffusion models reach a similar outcome over many small steps, with stochastic (noise-driven) transitions playing the central role.

Mathematical Framework

  • Diffusion as a stochastic Markov process:
    • Current state depends only on the previous state.
    • Transitions are modeled through equations that combine a deterministic (drift) term with a stochastic (noise) term.
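
The Markov property means the whole forward trajectory factorizes into per-step transitions:

```latex
q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})
```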

Transition Function

  • Key to moving from the complex distribution (images) to the simple distribution (standard normal).
  • The schedule parameters alpha and beta, related by alpha = 1 - beta at each step, must be chosen so that transitions stay smooth, without abrupt changes (see the rewritten form below).
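
With alpha_t = 1 - beta_t, the same forward transition can be written in the form used later for the cumulative-product shortcut:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{\alpha_t}\,x_{t-1},\ (1-\alpha_t)\,\mathbf{I}\right), \qquad \alpha_t = 1 - \beta_t
```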

Variance Schedule

  • Instead of a fixed variance, a linear variance schedule is applied: the noise variance beta increases from a small value at the first step to a larger value at the last step.
    • This allows larger denoising jumps early in the reverse process and smaller, finer adjustments closer to the final image.
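
A minimal sketch of such a linear schedule in PyTorch; the range 1e-4 to 0.02 over 1000 steps follows the original DDPM paper, but the function name and exact constants here are illustrative:

```python
import torch

def linear_beta_schedule(num_steps: int = 1000,
                         beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> torch.Tensor:
    """Linearly increasing noise variances beta_1, ..., beta_T."""
    return torch.linspace(beta_start, beta_end, num_steps)

betas = linear_beta_schedule()
alphas = 1.0 - betas                       # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative products, used for one-shot noising
```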

Efficient Computation

  • The transition from the clean image directly to its noisy version at any step t can be computed in one shot using cumulative product terms (the product of the alphas up to step t).
  • When each noise step is small, the true reverse transition is itself approximately Gaussian, so approximating it with a Gaussian distribution is justified.
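
Composing the Gaussian transitions gives a closed form for noising the clean image x_0 directly to any step t, using the cumulative product of the alphas:

```latex
\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\right),
\qquad \text{i.e.}\quad x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ \ \epsilon \sim \mathcal{N}(0, \mathbf{I})
```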

Learning the Reverse Process

  • As with VAEs, the goal is to maximize the likelihood of the observed data:
    • In practice this means maximizing a variational lower bound, i.e. minimizing Kullback-Leibler divergences between the learned and the true distributions.
  • The key difference is that, during training, the true reverse transition is conditioned on the original image, which makes it tractable and lets the network approximate the reverse distribution more effectively.
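
Conditioned on x_0, the true reverse transition has a closed-form Gaussian expression (the standard DDPM posterior), which is what the learned reverse step is trained to match:

```latex
q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\!\left(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t \mathbf{I}\right),
\quad
\tilde{\mu}_t = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,x_0
  + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t,
\quad
\tilde{\beta}_t = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t
```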

Loss Function

  • The loss measures how well the model recovers the noise added during the forward process:
    • The squared difference between the ground-truth noise and the noise predicted by the network.
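
Written out, this is the simplified DDPM training objective, where epsilon_theta is the network's noise prediction given the noised input and the time step:

```latex
L_{\text{simple}} = \mathbb{E}_{x_0,\, t,\, \epsilon}\left[\,
  \left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) \right\rVert^2
\right]
```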

Training and Generation

  • Steps in training:

    1. Sample an image from the dataset and a random time step.
    2. Create a noisy version of the image using the forward process.
    3. Train the neural network to predict the noise added.
  • Generation process:

    1. Start with a random noise sample.
    2. Iteratively predict the noise and take a denoising step, working from the final time step back to the first, until a clean image resembling the training data emerges (see the sketch below).
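
A compact sketch of both loops in PyTorch, assuming a noise-prediction network model(x_t, t) (e.g. a U-Net) is available; the schedule values match those discussed above, and the helper names are illustrative rather than any particular library's API:

```python
import torch
import torch.nn.functional as F

# Noise schedule (see the variance-schedule sketch above); T = 1000 steps.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def training_step(model, x0, optimizer):
    """One DDPM training step on a batch of clean images x0 (N, C, H, W)."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)            # random time step per image
    eps = torch.randn_like(x0)                                  # ground-truth noise
    ab = alpha_bars.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps                # one-shot forward noising
    loss = F.mse_loss(model(x_t, t), eps)                       # predict the added noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(model, shape, device="cpu"):
    """Generate new images by iteratively denoising pure Gaussian noise."""
    x = torch.randn(shape, device=device)                       # start from x_T ~ N(0, I)
    for i in reversed(range(T)):
        t = torch.full((shape[0],), i, device=device, dtype=torch.long)
        eps_hat = model(x, t)
        a, ab, beta = alphas[i], alpha_bars[i], betas[i]
        mean = (x - beta / (1 - ab).sqrt() * eps_hat) / a.sqrt()
        noise = torch.randn_like(x) if i > 0 else torch.zeros_like(x)
        x = mean + beta.sqrt() * noise                          # reverse-step variance set to beta_t
    return x
```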

Conclusion

  • The lecture emphasizes the importance of understanding diffusion models in depth.
  • Acknowledgment of resources and contributors that aided learning.
  • Future discussions will cover architecture and implementation of DDPM.