Understanding Denoising Diffusion Models

Aug 22, 2024

Diffusion Models: Denoising Diffusion Probabilistic Models (DDPM)

Overview

  • Focus on denoising diffusion probabilistic models (DDPM).
  • Deep dive into forward and reverse diffusion processes.
  • Understanding the mathematical formulation of the training objective.

Key Concepts

Forward Process

  • Gradual destruction of image information through noise addition.
  • Each time step t adds Gaussian noise; after T steps the image is reduced to essentially pure random noise.
  • Moving from state X(t-1) to X(t).
  • After many steps, the result mimics a sample from a standard normal distribution.
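
Concretely, in the standard DDPM formulation each forward step is a small Gaussian transition controlled by a noise variance beta_t:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right)
```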

Reverse Process

  • Learning the reverse of the forward process via a neural network.
  • The model learns to iteratively remove noise from a random sample until it resembles a sample from the original data distribution.
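
In the DDPM setup, each learned reverse step is also a Gaussian, with its mean (and optionally its variance) produced by the neural network with parameters theta:

```latex
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
```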

Intuition Behind Diffusion

  • Diffusion: movement from regions of higher concentration to regions of lower concentration; the analogy here is transforming a complex data distribution into a simple one.
  • Comparison with Variational Autoencoders (VAEs):
    • VAEs encode images into a Gaussian latent distribution in a single step and reconstruct images using a decoder.
    • Diffusion models reach a similar outcome over many small steps, with stochastic (noise-driven) transitions playing the central role.

Mathematical Framework

  • Diffusion as a stochastic Markov process:
    • Current state depends only on the previous state.
    • Transitions are modeled through equations that combine a deterministic (drift) term with a stochastic (noise) term.
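
The Markov property means the whole forward trajectory factorizes into per-step transitions:

```latex
q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})
```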

Transition Function

  • Key to moving from the complex distribution (images) to the simple distribution (standard normal).
  • The schedule parameters alpha and beta, related by alpha = 1 - beta at each step, must be chosen so that transitions stay smooth, without abrupt changes (see the rewritten form below).
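
With alpha_t = 1 - beta_t, the same forward transition can be written in the form used later for the cumulative-product shortcut:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{\alpha_t}\,x_{t-1},\ (1-\alpha_t)\,\mathbf{I}\right), \qquad \alpha_t = 1 - \beta_t
```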

Variance Schedule

  • Instead of a fixed variance, a linear variance schedule is applied: the noise variance beta increases from a small value at the first step to a larger value at the last step.
    • This allows larger denoising jumps early in the reverse process and smaller, finer adjustments closer to the final image.
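
A minimal sketch of such a linear schedule in PyTorch; the range 1e-4 to 0.02 over 1000 steps follows the original DDPM paper, but the function name and exact constants here are illustrative:

```python
import torch

def linear_beta_schedule(num_steps: int = 1000,
                         beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> torch.Tensor:
    """Linearly increasing noise variances beta_1, ..., beta_T."""
    return torch.linspace(beta_start, beta_end, num_steps)

betas = linear_beta_schedule()
alphas = 1.0 - betas                       # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative products, used for one-shot noising
```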

Efficient Computation

  • The transition from the clean image directly to its noisy version at any step t can be computed in one shot using cumulative product terms (the product of the alphas up to step t).
  • When each noise step is small, the true reverse transition is itself approximately Gaussian, so approximating it with a Gaussian distribution is justified.
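
Composing the Gaussian transitions gives a closed form for noising the clean image x_0 directly to any step t, using the cumulative product of the alphas:

```latex
\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\right),
\qquad \text{i.e.}\quad x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ \ \epsilon \sim \mathcal{N}(0, \mathbf{I})
```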

Learning the Reverse Process

  • As with VAEs, the goal is to maximize the likelihood of the observed data:
    • In practice this means maximizing a variational lower bound, i.e. minimizing Kullback-Leibler divergences between the learned and the true distributions.
  • The key difference is that, during training, the true reverse transition is conditioned on the original image, which makes it tractable and lets the network approximate the reverse distribution more effectively.
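
Conditioned on x_0, the true reverse transition has a closed-form Gaussian expression (the standard DDPM posterior), which is what the learned reverse step is trained to match:

```latex
q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\!\left(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t \mathbf{I}\right),
\quad
\tilde{\mu}_t = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,x_0
  + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t,
\quad
\tilde{\beta}_t = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t
```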

Loss Function

  • The loss measures how well the model recovers the noise added during the forward process:
    • The squared difference between the ground-truth noise and the noise predicted by the network.
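
Written out, this is the simplified DDPM training objective, where epsilon_theta is the network's noise prediction given the noised input and the time step:

```latex
L_{\text{simple}} = \mathbb{E}_{x_0,\, t,\, \epsilon}\left[\,
  \left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) \right\rVert^2
\right]
```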

Training and Generation

  • Steps in training:

    1. Sample an image from the dataset and a random time step.
    2. Create a noisy version of the image using the forward process.
    3. Train the neural network to predict the noise added.
  • Generation process:

    1. Start with a random noise sample.
    2. Iteratively predict the noise and take a denoising step, working from the final time step back to the first, until a clean image resembling the training data emerges (see the sketch below).
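
A compact sketch of both loops in PyTorch, assuming a noise-prediction network model(x_t, t) (e.g. a U-Net) is available; the schedule values match those discussed above, and the helper names are illustrative rather than any particular library's API:

```python
import torch
import torch.nn.functional as F

# Noise schedule (see the variance-schedule sketch above); T = 1000 steps.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def training_step(model, x0, optimizer):
    """One DDPM training step on a batch of clean images x0 (N, C, H, W)."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)            # random time step per image
    eps = torch.randn_like(x0)                                  # ground-truth noise
    ab = alpha_bars.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps                # one-shot forward noising
    loss = F.mse_loss(model(x_t, t), eps)                       # predict the added noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(model, shape, device="cpu"):
    """Generate new images by iteratively denoising pure Gaussian noise."""
    x = torch.randn(shape, device=device)                       # start from x_T ~ N(0, I)
    for i in reversed(range(T)):
        t = torch.full((shape[0],), i, device=device, dtype=torch.long)
        eps_hat = model(x, t)
        a, ab, beta = alphas[i], alpha_bars[i], betas[i]
        mean = (x - beta / (1 - ab).sqrt() * eps_hat) / a.sqrt()
        noise = torch.randn_like(x) if i > 0 else torch.zeros_like(x)
        x = mean + beta.sqrt() * noise                          # reverse-step variance set to beta_t
    return x
```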

Conclusion

  • The lecture emphasizes the importance of understanding diffusion models in depth.
  • Acknowledgment of resources and contributors that aided learning.
  • Future discussions will cover architecture and implementation of DDPM.