GAN stands for Generative Adversarial Network.
- GANs are inspired by game theory: two nets (a Generator and a Discriminator) play against each other, getting stronger with each round.
- A GAN is an implicit generative model: the Generator uses the signal (loss) from the Discriminator (a classifier) to implicitly approximate its intractable cost function.
- $x$: a real data example.
- $G(z)$: a fake data example from the generator.
- $z$: noise input, usually sampled from a uniform distribution.
- $y$: a label, $y = 1$ for real and $y = 0$ for fake.
- $D$: a discriminator net to estimate $P(y = 1 \mid x)$, the probability that a sample is real.
- $G$: a generator net to output a fake example $G(z)$.
- $p_z$: the assumed distribution over the noise input $z$.
- $p_g$: the generator distribution over samples $x$.
- $p_r$: the ‘real’ data distribution over real samples $x$.
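To make the notation concrete, here is a minimal sketch of the pieces above, assuming PyTorch (the layer sizes and batch size are hypothetical):

```python
import torch
import torch.nn as nn

dim_z, dim_x = 16, 784  # hypothetical dimensions for z and x

# G: generator net, maps noise z to a fake example G(z)
G = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(), nn.Linear(128, dim_x))
# D: discriminator net, estimates P(y = 1 | x)
D = nn.Sequential(nn.Linear(dim_x, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())

z = torch.rand(32, dim_z)          # noise input z ~ p_z (uniform)
x_fake = G(z)                      # fake examples, distributed as p_g
x_real = torch.randn(32, dim_x)    # stand-in for real samples x ~ p_r

# minimax value: E_{x~p_r}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))]
# D is trained to maximize it, G to minimize it
value = torch.log(D(x_real)).mean() + torch.log(1 - D(x_fake)).mean()
```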
KL divergence and JS divergence
- KL (Kullback–Leibler) divergence measures how one probability distribution $p$ diverges from a second, expected probability distribution $q$: $D_{KL}(p \,\|\, q) = \int_x p(x) \log \frac{p(x)}{q(x)} \, dx$
- KL is not symmetric: $D_{KL}(p \,\|\, q) \neq D_{KL}(q \,\|\, p)$ in general; the two directions are called the forward and the reverse KL
- JS (Jensen–Shannon) divergence is a symmetric and smoother measure between two probability distributions: $D_{JS}(p \,\|\, q) = \frac{1}{2} D_{KL}\big(p \,\|\, \frac{p+q}{2}\big) + \frac{1}{2} D_{KL}\big(q \,\|\, \frac{p+q}{2}\big)$
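A quick numeric illustration of both properties, as a sketch in NumPy on small discrete distributions:

```python
import numpy as np

def kl(p, q):
    """D_KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js(p, q):
    """D_JS(p || q) via the mixture m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])
print(kl(p, q), kl(q, p))  # two different values: KL is not symmetric
print(js(p, q), js(q, p))  # identical values: JS is symmetric (and <= log 2)
```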
Analyzing the loss function (1): the optimal D
- the GAN minimax loss is $L(D, G) = \mathbb{E}_{x \sim p_r}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$, where the second expectation equals $\mathbb{E}_{x \sim p_g}[\log(1 - D(x))]$
- when we fix $G$, what is the optimal $D$?
- set the partial derivative of the loss w.r.t. $D(x)$ to 0
- we get $D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)}$
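Spelled out, the step above writes the loss as an integral over $x$ and maximizes the integrand pointwise:

```latex
L(G, D) = \int_x \Big( p_r(x)\,\log D(x) + p_g(x)\,\log\big(1 - D(x)\big) \Big)\,dx

\frac{\partial}{\partial D(x)}\Big( p_r(x)\,\log D(x) + p_g(x)\,\log\big(1 - D(x)\big) \Big)
  = \frac{p_r(x)}{D(x)} - \frac{p_g(x)}{1 - D(x)} = 0

\Longrightarrow\quad D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)} \in [0, 1]
```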
Analyzing the loss function (2): the loss under the optimal D
- when we have the optimal $D^*$, the loss for $\min_G$ becomes $L(G, D^*) = 2\,D_{JS}(p_r \,\|\, p_g) - 2\log 2$
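The substitution behind this result, plugging $D^*$ back into the loss (a standard step from the original GAN paper):

```latex
L(G, D^*)
  = \int_x \Big( p_r(x)\,\log\frac{p_r(x)}{p_r(x) + p_g(x)}
               + p_g(x)\,\log\frac{p_g(x)}{p_r(x) + p_g(x)} \Big)\,dx

  = D_{KL}\Big(p_r \,\Big\|\, \frac{p_r + p_g}{2}\Big)
  + D_{KL}\Big(p_g \,\Big\|\, \frac{p_r + p_g}{2}\Big) - 2\log 2

  = 2\,D_{JS}(p_r \,\|\, p_g) - 2\log 2
```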
Problem 1: gradient vanishing
- now we know that with the optimal $D$, minimizing $G$'s loss is the same as minimizing $D_{JS}(p_r \,\|\, p_g)$
- there are 3 different cases to consider when we plug the two densities into the JS measure:
- $p_r(x) \to 0$ and $p_g(x) \to 0$: both terms vanish, contributing 0
- $p_r(x) \to 0$ and $p_g(x) \neq 0$, or $p_r(x) \neq 0$ and $p_g(x) \to 0$: the surviving term tends to $\log 2$
- $p_r(x) \neq 0$ and $p_g(x) \neq 0$ barely happens (the supports are almost disjoint, as argued below), so its contribution is negligible
- consequence: whenever the supports of $p_r$ and $p_g$ are disjoint, $D_{JS}(p_r \,\|\, p_g) = \log 2$ is constant, so the gradient passed to $G$ vanishes (checked numerically below)
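A numeric check of that consequence, reusing the NumPy kl/js helpers sketched earlier: two point masses with disjoint supports give $D_{JS} = \log 2$ no matter how far apart they sit, so moving $p_g$ around produces no gradient signal.

```python
import numpy as np

def kl(p, q):
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js(p, q):
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

n = 10
p_r = np.zeros(n); p_r[0] = 1.0         # real data: a point mass at bin 0
for k in range(1, n):
    p_g = np.zeros(n); p_g[k] = 1.0     # generator: a point mass at bin k
    print(k, js(p_r, p_g))              # always log 2 ~ 0.6931, flat in k
```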
The data distribution lies close to a low-dimensional manifold. Example: consider image data:
- Very high dimensional (1,000,000D)
- A randomly generated image will almost certainly not look like any real world scene
- The space of images that occur in nature is almost completely empty
- Hypothesis: real world images lie on a smooth, low-dimensional manifold
Assumption: the supports of $p_r$ and $p_g$ lie on low-dimensional manifolds
Support: the support of a real-valued function $f$ is the subset of the domain containing those elements which are not mapped to zero.
- we now assume the support of $p_r$ lives on a low-dimensional manifold embedded in a higher-dimensional (input) space
- now think about what the generator net does:
- we first randomly sample $z$, with $\dim(z) \ll \dim(x)$
- we use $G(z)$ as a non-linear mapping from the $\dim(z)$-dimensional space to the $\dim(x)$-dimensional space
- so what does $p_g$ represent eventually?
- since manifold learning is an approach to non-linear dimensionality reduction,
- $p_g$ is what you get by running manifold learning in reverse: $G$ maps a low-dimensional representation back into the high-dimensional input space, so the support of $p_g$ is (at most) a $\dim(z)$-dimensional manifold (see the sketch below)
- we now assume the supports of both $p_r$ and $p_g$ lie on low-dimensional manifolds
- this means neither manifold comes close to filling up the whole high-dimensional space
- so they are almost certainly disjoint; the case where they overlap has negligible measure
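A small numeric sketch of the dimensionality argument, with a random fixed map standing in for a trained generator (the architecture is hypothetical): samples $G(z)$ around a base point span only about $\dim(z)$ directions of the input space.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_z, dim_x = 2, 100                  # dim(z) << dim(x)

# a fixed random non-linear map standing in for a trained G
W1 = rng.normal(size=(dim_z, 50))
W2 = rng.normal(size=(50, dim_x))
def G(z):
    return np.tanh(z @ W1) @ W2

# sample z in a small neighborhood; look at the local dimensionality of G(z)
z0 = rng.uniform(-1.0, 1.0, size=dim_z)
Z = z0 + 1e-3 * rng.normal(size=(500, dim_z))
X = G(Z)
s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
print(s[:4])  # only the first dim(z)=2 singular values are non-negligible:
              # supp(p_g) is locally a 2-dimensional surface inside R^100
```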
Problem 2: mode collapse, unstable gradient updates
Alternative loss for $\min_G$: to dodge the vanishing gradient, the generator is often trained to minimize $-\mathbb{E}_{x \sim p_g}[\log D(x)]$ instead of $\mathbb{E}_{x \sim p_g}[\log(1 - D(x))]$.
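Under the optimal discriminator $D^*$, this alternative loss decomposes as (a standard identity from Arjovsky & Bottou's analysis):

```latex
\mathbb{E}_{x \sim p_g}\!\left[ -\log D^*(x) \right]
  = D_{KL}(p_g \,\|\, p_r) - 2\,D_{JS}(p_r \,\|\, p_g)
    + 2\log 2 + \mathbb{E}_{x \sim p_r}\!\left[ \log D^*(x) \right]
```

The last two terms do not depend on $G$, so the generator effectively minimizes the reverse KL $D_{KL}(p_g \,\|\, p_r)$ while simultaneously maximizing the JS divergence; the two objectives pull in opposite directions, which makes the gradient updates unstable. Moreover, the reverse KL heavily penalizes generating implausible samples but barely penalizes dropping modes of $p_r$, which encourages mode collapse.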