Generative Adversarial Networks - Part Two
Jul 21, 2024
Overview
Focus on deeper understanding and theoretical proof of GANs
Goal: Understand the key equation, the algorithm for solving it, and the proof that solving it recovers the perfect generative model
Sign-up mentioned for early access and exclusive content on the presenter’s blog
Data Flow in GANs
Components: Noise vector (Z), Generator (G), Real Data (X), Discriminator (D)
Generator (G): Transforms noise (Z) into fake samples (G(Z))
Discriminator (D): Takes either real (X) or fake (G(Z)) samples and outputs a probability of the sample being real
D(X): Probability that a real sample (X) is real
D(G(Z)): Probability that a fake sample (G(Z)) is real
Labels: 1 for real samples, 0 for fake samples
Supervised Learning Transformation: The unsupervised learning problem is converted into a supervised one by labeling the samples
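A minimal sketch of this data flow in PyTorch; the fully connected architectures, layer sizes, and dimensions are illustrative assumptions, since the notes do not specify any particular networks:

```python
# Sketch of the GAN data flow: noise Z -> G -> fake samples, real/fake -> D -> probability.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # hypothetical sizes

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())

z = torch.randn(16, latent_dim)   # noise vector Z
fake = G(z)                       # G(Z): fake samples
x = torch.randn(16, data_dim)     # stand-in for real data X

d_real = D(x)      # D(X): probability that a real sample is real (label 1)
d_fake = D(fake)   # D(G(Z)): probability that a fake sample is real (label 0)
```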
Cost Function
Two main terms:
Discriminator on Real Data: Expectation of the log of D(X)
Discriminator on Fake Data: Expectation of log(1 - D(G(Z)))
Discriminator Goals:
Wants D(X) to be large for real samples
Wants D(G(Z)) to be small for fake samples
Generator Goals:
Maximize D(G(Z)) to fool the discriminator
Adversarial Framework:
Discriminator seeks to maximize the cost function
Generator seeks to minimize the cost function
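Written out, the two expectation terms and the adversarial structure above form the value function of the original GAN paper (a sketch; p_data denotes the real data distribution and p_z the noise distribution):

```latex
% Value function: the two expectation terms described above
V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
        + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% Adversarial framework: the discriminator maximizes V, the generator minimizes it
\min_G \max_D V(D, G)
```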
Training Algorithm
Discriminator Loop (see the code sketch after the Generator Loop):
Pull M noise samples → Generate M fake data samples
Sample M real data samples
Label real samples (1) and fake samples (0)
Calculate loss with the labeled outputs
Update discriminator's parameters to maximize cost function (take gradients and update)
Generator Loop:
Pull M noise samples → Generate M fake samples
Calculate reduced cost function (no need for real data)
Update generator's parameters to minimize cost function (take gradients and update)
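A hedged sketch of the two alternating loops above in PyTorch; the architectures, optimizer, learning rate, batch size, and the real_batch stand-in are assumptions, and only the sampling, labeling, and gradient steps come from the notes:

```python
import torch
import torch.nn as nn

latent_dim, data_dim, M = 64, 784, 32   # M = minibatch size (hypothetical)

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()   # binary cross-entropy = negative of the log terms in the cost

def real_batch(m):
    # Stand-in for "sample M real data samples"; replace with a real data loader.
    return torch.randn(m, data_dim)

for step in range(1000):
    # Discriminator loop: maximize E[log D(x)] + E[log(1 - D(G(z)))]
    z = torch.randn(M, latent_dim)              # pull M noise samples
    fake = G(z).detach()                        # generate M fake samples (no generator gradient)
    real = real_batch(M)                        # sample M real data samples
    d_loss = (bce(D(real), torch.ones(M, 1))    # label real samples 1
              + bce(D(fake), torch.zeros(M, 1)))  # label fake samples 0
    opt_d.zero_grad()
    d_loss.backward()                           # minimizing BCE here maximizes the cost function
    opt_d.step()

    # Generator loop: minimize E[log(1 - D(G(z)))] -- no real data needed
    z = torch.randn(M, latent_dim)
    g_loss = torch.log(1 - D(G(z)) + 1e-8).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    # In practice the non-saturating variant (maximize log D(G(z))) is often used instead.
```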
Theoretical Proof
Objective: Prove that the optimal generator matches the real data distribution
Optimal Discriminator:
Expressed in terms of the real and generated probability distributions (outputs 1/2 when the distributions match)
With the optimal discriminator, the cost function reaches its minimum of minus log 4 when the distributions match
Derived using calculus and algebra, as sketched below
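A sketch of that derivation, following the original GAN paper: fix the generator, write the cost as an integral over x, and maximize pointwise over D(x) (p_g denotes the distribution of the generated samples G(z)):

```latex
% For a fixed generator, the value function can be written as an integral over x:
V(D, G) = \int_x \Big[\, p_{\mathrm{data}}(x) \log D(x) + p_g(x) \log\big(1 - D(x)\big) \,\Big]\, dx

% For each x, the integrand a \log y + b \log(1 - y) is maximized at y = a / (a + b), giving
D^*_G(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% If p_g = p_{\mathrm{data}}, then D^*_G(x) = 1/2 for every x.
```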
Jensen-Shannon Divergence
Objective: Minimize the JS Divergence
JS Divergence as a distance measure between real and fake data distributions
Cost Function Rewritten:
Involves JS Divergence term + constant (minus log 4)
Minimum JS Divergence is 0 (when distributions match)
When minimized, generator distribution matches real data distribution
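Substituting the optimal discriminator back into the cost function gives the rewritten form referred to above (a sketch following the original paper):

```latex
% Cost function with the optimal discriminator plugged in:
C(G) = \max_D V(D, G) = -\log 4 + 2\, \mathrm{JSD}\!\left(p_{\mathrm{data}} \,\|\, p_g\right)

% JSD is nonnegative and zero iff the two distributions are equal, so the minimum
% of C(G) is -\log 4, attained exactly when p_g = p_{\mathrm{data}}.
```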
Recap
GANs Architecture and Training Process
Using random noise vectors to generate fake samples
Discriminator as a classification problem
Alternating training loops for discriminator and generator
Theoretical proof validating the approach
Additional Resources
Mention of a three-part blog series on GANs
Links to the original paper and the blog sign-up are provided in the description