Generative Adversarial Networks - Part Two

Jul 21, 2024

Overview

  • Focus on deeper understanding and theoretical proof of GANs
  • Goal: Understand the key equation, the algorithm used to optimize it, and the proof that the optimal generator recovers the real data distribution
  • Sign-up mentioned for early access and exclusive content on the presenter’s blog

Data Flow in GANs

  • Components: Noise vector (Z), Generator (G), Real Data (X), Discriminator (D)
  • Generator (G): Transforms noise (Z) into fake samples (G(Z))
  • Discriminator (D): Takes either real (X) or fake (G(Z)) samples and outputs a probability of the sample being real
    • D(X): Probability that real sample (X) is real
    • D(G(Z)): Probability that fake sample (G(Z)) is real
  • Labels: 1 for real samples, 0 for fake samples
  • Supervised Learning Transformation: The unsupervised learning problem is converted into a supervised one by labeling the real and generated samples (a minimal sketch of this data flow follows below)
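
A minimal sketch of this data flow, assuming a toy fully connected generator and discriminator in PyTorch; the layer sizes, the 100-dimensional noise vector, and the 784-dimensional samples are illustrative assumptions, not values from the talk:

    import torch
    import torch.nn as nn

    latent_dim = 100   # size of the noise vector Z (assumed)
    data_dim = 784     # size of a flattened real sample X (assumed)

    # Generator G: transforms noise Z into a fake sample G(Z)
    G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                      nn.Linear(256, data_dim), nn.Tanh())

    # Discriminator D: maps a sample (real or fake) to the probability it is real
    D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                      nn.Linear(256, 1), nn.Sigmoid())

    z = torch.randn(16, latent_dim)    # a batch of noise vectors Z
    fake = G(z)                        # fake samples G(Z)
    real = torch.rand(16, data_dim)    # placeholder batch standing in for real data X

    d_real = D(real)   # D(X): probability each real sample is real (target label 1)
    d_fake = D(fake)   # D(G(Z)): probability each fake sample is real (target label 0)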

Cost Function

  • Two main terms:
    1. Discriminator on Real Data: Expectation of the log of D(X)
    2. Discriminator on Fake Data: Expectation of log(1 - D(G(Z)))
  • Discriminator Goals:
    • Wants D(X) to be large for real samples
    • Wants D(G(Z)) to be small for fake samples
  • Generator Goals:
    • Maximize D(G(Z)) for fooling the discriminator
  • Adversarial Framework (the full minimax objective is written out below this list):
    • Discriminator seeks to maximize the cost function
    • Generator seeks to minimize the cost function
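
Written out in full, this is the value function from the original GAN paper: the two expectation terms above, maximized over D and minimized over G:

    \min_G \max_D V(D, G) =
        \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
        + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]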

Training Algorithm

  1. Discriminator Loop:
    • Pull M noise samples → generate M fake data samples
    • Sample M real data samples
    • Label real samples (1) and fake samples (0)
    • Calculate the loss on the labeled outputs
    • Update the discriminator's parameters to maximize the cost function (take gradients and update)
  2. Generator Loop:
    • Pull M noise samples → generate M fake samples
    • Calculate the reduced cost function using only the fake-sample term (no real data needed)
    • Update the generator's parameters to minimize the cost function (take gradients and update); a minimal training-loop sketch follows this list
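
A minimal, self-contained sketch of the two alternating loops in PyTorch; the network sizes, batch size, learning rates, step count, and the random placeholder standing in for the real dataset are all illustrative assumptions:

    import torch
    import torch.nn as nn

    latent_dim, data_dim, m = 100, 784, 64        # illustrative sizes (assumed)
    G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                      nn.Linear(256, data_dim), nn.Tanh())
    D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                      nn.Linear(256, 1), nn.Sigmoid())
    opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
    opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
    real_data = torch.rand(10_000, data_dim)      # placeholder standing in for the real dataset

    for step in range(1000):                       # number of steps is an arbitrary choice
        # --- Discriminator loop: maximize E[log D(x)] + E[log(1 - D(G(z)))] ---
        z = torch.randn(m, latent_dim)             # pull M noise samples
        fake = G(z).detach()                       # M fake samples; detach so G is not updated here
        real = real_data[torch.randint(0, len(real_data), (m,))]   # M real samples
        # labels 1 (real) and 0 (fake) determine which term each batch enters;
        # minimizing the negated value function is a gradient-ascent step on it
        d_loss = -(torch.log(D(real) + 1e-8).mean() +
                   torch.log(1 - D(fake) + 1e-8).mean())
        opt_D.zero_grad(); d_loss.backward(); opt_D.step()

        # --- Generator loop: only the fake-sample term is needed (no real data) ---
        z = torch.randn(m, latent_dim)
        g_loss = torch.log(1 - D(G(z)) + 1e-8).mean()   # generator minimizes this term
        opt_G.zero_grad(); g_loss.backward(); opt_G.step()

In practice the generator is often trained to maximize log D(G(z)) instead (the non-saturating variant from the original paper), but the version above follows the cost function as described here.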

Theoretical Proof

  • Objective: Prove that optimal generator matches real data distribution
  • Optimal Discriminator:
    • Expressed in terms of the real and generated data distributions; it outputs 1/2 everywhere when the two distributions match (see the expression below)
    • With the optimal discriminator plugged in, the cost function reaches its minimum value of minus log 4 exactly when the distributions match
    • Derived with basic calculus and algebra by maximizing the integrand pointwise
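
For a fixed generator whose samples have distribution p_g, maximizing the value function pointwise (a term of the form a log y + b log(1 - y) is maximized at y = a / (a + b)) gives the optimal discriminator from the original paper:

    D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},
    \qquad D^{*}(x) = \tfrac{1}{2} \ \text{for all } x \ \text{when } p_g = p_{\mathrm{data}}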

Jensen-Shannon Divergence

  • Objective: Minimize the JS Divergence
  • JS Divergence as a distance measure between real and fake data distributions
  • Cost Function Rewritten (see the expression below):
    • Equals a constant (minus log 4) plus twice the JS divergence between the real and generated distributions
    • The minimum JS divergence is 0, reached exactly when the distributions match
    • Minimizing the cost therefore forces the generator's distribution to match the real data distribution
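
Substituting the optimal discriminator back into the value function gives the generator's criterion: a constant plus a scaled Jensen-Shannon divergence:

    C(G) = \max_D V(G, D) = -\log 4 + 2 \cdot \mathrm{JSD}\big(p_{\mathrm{data}} \,\|\, p_g\big)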

Recap

  • GANs Architecture and Training Process
  • Using random noise vectors to generate fake samples
  • Discriminator training framed as a binary (real vs. fake) classification problem
  • Alternating training loops for discriminator and generator
  • Theoretical proof validating the approach

Additional Resources

  • Mention of a three-part blog series on GANs
  • Links to the original paper and the blog sign-up are provided in the description