Lec19 - GANs
Lec19 - GANs
Lec19 - GANs
Networks (GANs)
From Ian Goodfellow et al.
A short tutorial by :-
Binglin, Shashank & Bhargav
D D(x)
G
z
G(z)
D(G(z))
https://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-generative-models-and-adversarial-training-upc-2016
Training Generator
https://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-generative-models-and-adversarial-training-upc-2016
GAN’s formulation
•
•D
Discriminator
updates
Generator
updates
Vanishing gradient strikes back again…
•
• =
Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
DCGAN: Bedroom images
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv:1511.06434 (2015).
Deep Convolutional GANs (DCGANs)
Key ideas:
• Replace FC hidden layers with
Generator Architecture Convolutions
• Generator: Fractional-Strided
convolutions
• Inside Generator
• Use ReLU for hidden layers
• Use Tanh for the output layer
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv:1511.06434
(2015).
Part 2
• Training Challenges
• Non-Convergence
• Mode-Collapse
• Proposed Solutions
• Supervision with Labels
• Mini-Batch GANs
• Modification of GAN’s losses
• Discriminator (EB-GAN)
• Generator (InfoGAN)
Non-Convergence
•• Deep
Learning models (in general) involve a single player
• The player tries to maximize its reward (minimize its loss).
• Use SGD (with Backpropagation) to find the optimal parameters.
• SGD has convergence guarantees (under certain conditions).
• Problem: With non-convexity, we might converge to local optima.
Goodfellow, Ian. "NIPS 2016 Tutorial: Generative Adversarial Networks." arXiv preprint arXiv:1701.00160 (2016).
Mode-Collapse
• Generator fails to output diverse samples
Target
Expected
Output
Metz, Luke, et al. "Unrolled Generative Adversarial Networks." arXiv preprint arXiv:1611.02163 (2016).
How to reward sample diversity?
• At Mode Collapse,
• Generator produces good samples, but a very few of them.
• Thus, Discriminator can’t tag them as fake.
• More formally,
• Let the Discriminator look at the entire batch instead of single examples
• If there is lack of diversity, it will mark the examples as fake
• Thus,
• Generator will be forced to produce diverse samples.
Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems. 2016.
Mini-Batch GANs
• Extract features that capture diversity in the mini-batch
• For e.g. L2 norm of the difference between all pairs from the batch
• This in turn,
• Will force the Generator to match those feature values with the real data
• Will generate diverse batches
Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems. 2016.
Supervision with Labels
• Label information of the real data might help
Car
Dog
Real
Human
D D
Fake
Fake
Salimans, Tim, et al. "Improved techniques for training gans." Advances in Neural Information Processing Systems. 2016.
Alternate view of GANs
•
• Alternatively, we can flip the binary classification labels i.e. Fake = 1, Real = 0
• Now, we can replace cross-entropy with any loss function (Hinge Loss)
Zhao, Junbo, Michael Mathieu, and Yann LeCun. "Energy-based generative adversarial network." arXiv preprint arXiv:1609.03126 (2016)
Energy-Based GANs
• Modified game plans
• Generator will try to generate samples with
low values
• Discriminator will try to assign high scores to
fake values
Zhao, Junbo, Michael Mathieu, and Yann LeCun. "Energy-based generative adversarial network." arXiv preprint arXiv:1609.03126 (2016)
More Bedrooms…
Zhao, Junbo, Michael Mathieu, and Yann LeCun. "Energy-based generative adversarial network." arXiv preprint arXiv:1609.03126 (2016)
Feature parameterization
3D Faces
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN: Interpretable Representation Learning by Information Maximization Generative
Adversarial Nets, NIPS (2016).
How to reward Disentanglement?
• Disentanglement means individual dimensions
independently capturing key attributes of the image
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN: Interpretable Representation Learning by Information Maximization Generative Adversarial Nets
Mutual Information
• Mutual
Information captures the mutual dependence between two variables
min max 𝑉 ( 𝐷 , 𝐺 ) =𝑉 ( 𝐷 ,𝐺 ) − 𝜆 𝐼 ( 𝑐 ; 𝐺 ( 𝑧 , 𝑐 ))
𝐼
𝐺 𝐷
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN: Interpretable Representation Learning by Information Maximization Generative
Adversarial Nets, NIPS (2016).
InfoGAN
•Mutual
Information’s Variational Lower bound
𝐷
𝑄
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN: Interpretable Representation Learning by Information Maximization Generative
Adversarial Nets, NIPS (2016).
Part 3
• Conditional GANs
• Applications
• Image-to-Image Translation
• Text-to-Image Synthesis
• Face Aging
• Advanced GAN Extensions
• Coupled GAN
• LAPGAN – Laplacian Pyramid of Adversarial Networks
• Adversarially Learned Inference
• Summary
Conditional GANs
• Simple modification to the original GAN
framework that conditions the model on
additional information for better multi-modal
learning.
Image Credit: Figure 2 in Odena, A., Olah, C. and Shlens, J., 2016. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585.
Mirza, Mehdi, and Simon Osindero. “Conditional generative adversarial nets”. arXiv preprint arXiv:1411.1784 (2014).
Conditional GANs
MNIST digits generated conditioned on their class label.
MNIST digits
Mirza, Mehdi, and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).
Image-to-Image Translation
Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. “Generative adversarial text to image synthesis”. ICML (2016).
Text-to-Image Synthesis
Antipov, G., Baccouche, M., & Dugelay, J. L. (2017). “Face Aging With Conditional Generative Adversarial Networks”. arXiv preprint arXiv:1702.01983.
Face Aging with Conditional GANs
Antipov, G., Baccouche, M., & Dugelay, J. L. (2017). “Face Aging With Conditional Generative Adversarial Networks”. arXiv preprint arXiv:1702.01983.
Part 3
• Conditional GANs
• Applications
• Image-to-Image Translation
• Text-to-Image Synthesis
• Face Aging
• Advanced GAN Extensions
• LAPGAN – Laplacian Pyramid of Adversarial Networks
• Adversarially Learned Inference
• Summary
Laplacian Pyramid of Adversarial
Networks
Denton, E.L., Chintala, S. and Fergus, R., 2015. “Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks”. NIPS (2015)
Laplacian Pyramid of Adversarial
Networks
Denton, E.L., Chintala, S. and Fergus, R., 2015. “Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks”. NIPS (2015)
Laplacian Pyramid of Adversarial
Networks
Training Procedure:
Models at each level are trained independently to learn the required representation.
Denton, E.L., Chintala, S. and Fergus, R., 2015. “Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks”. NIPS (2015)
Adversarially Learned Inference
• Basic idea is to learn an encoder/inference network along with the
generator network.
generator distribution
Discriminator Network
• min max 𝔼 𝑞 ( 𝑥 ) ¿ ¿ ¿
𝐺 𝐷
• Joint:
• Marginals: and
• Conditionals: and
Applications:
• Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004. (2016).
• Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. Generative adversarial text to image synthesis. JMLR (2016).
• Antipov, G., Baccouche, M., & Dugelay, J. L. (2017). Face Aging With Conditional Generative Adversarial Networks. arXiv preprint arXiv:1702.01983.
Questions?