Best Resources for Getting Started With Generative Adversarial Networks (GANs)

Generative Adversarial Networks, or GANs, are a type of deep learning technique for generative modeling.

GANs are the techniques behind the startlingly photorealistic generation of human faces, as well as impressive image translation tasks such as photo colorization, face de-aging, super-resolution, and more.

It can be very challenging to get started with GANs. This is both because the field is very young, starting with the first paper in 2014, and because of the vast number of papers and applications published every month on the topic.

In this post, you will discover the best resources that you can use to learn about generative adversarial networks.

After reading this post, you will know:

  • What a generative adversarial network is and examples of specific applications for the technique.
  • Video tutorials and lectures on GANs presented by the inventor of the technique.
  • Reading list including the most read papers on GANs and books on deep generative models.

Let’s get started.

Overview

This tutorial is divided into five parts; they are:

  1. What Are GANs?
  2. GAN Applications
  3. GAN Video Presentations
  4. GAN Paper Reading List
  5. GAN Books

What Are Generative Adversarial Networks?

A Generative Adversarial Network, or GAN, is a type of neural network architecture for generative modeling.

Generative modeling involves using a model to generate new examples that plausibly come from an existing distribution of samples, such as generating new photographs that are generally similar but specifically different from a dataset of existing photographs.
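
As a rough illustration of this idea (not a GAN), the sketch below fits a very simple generative model, a single Gaussian, to a handful of numbers and then draws new values that are similar to, but not copies of, the originals. The data values and the function names `fit_gaussian` and `sample_new` are made up for this example.

```python
import random
import statistics


def fit_gaussian(samples):
    # "Training": estimate the mean and standard deviation of the data.
    return statistics.mean(samples), statistics.stdev(samples)


def sample_new(model, n, seed=0):
    # "Generation": draw new values from the fitted distribution.
    mean, std = model
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]


# Made-up dataset of existing examples (illustrative values only).
heights_cm = [158.0, 162.5, 171.0, 175.5, 180.0, 166.0, 169.5]

model = fit_gaussian(heights_cm)
new_heights = sample_new(model, 3)
print(new_heights)
```

A GAN plays the same role as `fit_gaussian` and `sample_new` here, but learns a far more flexible distribution, such as one over photographs.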

A GAN is a generative model that is trained using two neural network models. One model is called the “generator” or “generative network” model, which learns to generate new plausible samples. The other model is called the “discriminator” or “discriminative network” and learns to differentiate generated examples from real examples.

The two models are set up in a contest or a game (in the game-theoretic sense): the discriminator is shown both real and generated examples and tries to tell them apart, while the generator seeks to produce examples that fool the discriminator.
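
For reference, the 2014 paper that introduced GANs writes this game as a two-player minimax problem. The notation below follows that paper rather than this post: D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for a noise input z, p_data is the data distribution, and p_z is the noise distribution.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make V large by scoring real examples near 1 and generated examples near 0, while the generator tries to make V small by producing examples that the discriminator scores near 1.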

Overview of a Generative Adversarial Network

After training, the generative model can then be used to create new plausible samples on demand.
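
The contest described above, and the sampling step that follows it, can be sketched with a deliberately tiny example in plain Python. Everything here is an assumption made for illustration: the real data is a 1D Gaussian (mean 4.0, standard deviation 0.5), the generator is a linear function of noise, the discriminator is a logistic regression, the gradients are derived by hand, and the generator uses the commonly used "non-saturating" loss, -log D(G(z)). The names `train_toy_gan` and `generate` are hypothetical. Real GANs use deep networks and a framework such as Keras, and this sketch makes no claim about how well it converges.

```python
import math
import random


def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters:     G(z) = a * z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # one real example (assumed distribution)
        z = rng.gauss(0.0, 1.0)       # noise input to the generator
        x_fake = a * z + b            # one generated example

        # Discriminator step: minimize -[log D(x_real) + log(1 - D(x_fake))].
        s_real = sigmoid(w * x_real + c)
        s_fake = sigmoid(w * x_fake + c)
        grad_w = (s_real - 1.0) * x_real + s_fake * x_fake
        grad_c = (s_real - 1.0) + s_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(G(z)), i.e. try to fool the
        # freshly updated discriminator.
        s_fake = sigmoid(w * x_fake + c)
        grad_a = (s_fake - 1.0) * w * z
        grad_b = (s_fake - 1.0) * w
        a -= lr * grad_a
        b -= lr * grad_b
    return (a, b), (w, c)


def generate(generator, n, seed=1):
    # After training, only the generator is needed to create new samples.
    a, b = generator
    rng = random.Random(seed)
    return [a * rng.gauss(0.0, 1.0) + b for _ in range(n)]


generator, discriminator = train_toy_gan()
print(generate(generator, 5))
```

Note that the discriminator is discarded at the end: it exists only to provide a training signal for the generator.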

Applications of Generative Adversarial Networks

The majority of the research and applications of GANs have focused on the domain of computer vision.

The reason for this is the great success of deep learning models such as Convolutional Neural Networks (CNNs) in computer vision over the last 5 to 7 years, including state-of-the-art results on challenging tasks like object detection and face recognition.

The canonical example of a GAN is the generation of new realistic-looking photographs, most startlingly demonstrated by the generation of photorealistic faces.

Example of Photorealistic Human Faces Generated by a GAN. Taken from “A Style-Based Generator Architecture for Generative Adversarial Networks.”

There are many “generate new examples” problems, such as:

  • Generating new anime characters.
  • Generating new logos.
  • Generating new Pokemon.
  • Generating new clothing.

GANs can also be used for surprising image processing tasks on photographs and videos. Broadly, these are referred to as image translation tasks, such as:

  • Translating a photo of summer to winter.
  • Translating a sketch to a photograph.
  • Translating a photo of daytime to nighttime.

Some more specific image translation examples include:

  • Automatic aging or de-aging of photographs of faces.
  • Automatic colorization of black and white photographs.
  • Automatic resolution enhancement of photographs.
  • Automatic style transfer (e.g. apply a painting style to photos).
  • Automatic image inpainting (e.g. filling in obscured parts of an image).

GANs can also be used to generate sequences of images or video, for tasks such as automatically predicting sequences of video frames and even hallucinating scenarios for use in training reinforcement learning models.

Beyond image processing, the technique can be used generally for data augmentation where entirely new plausible samples can be generated as input when training a model.

For more examples of interesting applications of GANs, see:

Video Presentations on Generative Adversarial Networks

A good way to get a gentle introduction to GANs, how they work, and applications is to watch a video presentation.

Ian Goodfellow, credited as the inventor of the technique, has given many lecture and tutorial presentations that are freely available on YouTube. Ian is an excellent communicator and provides a crisp presentation of the technique.

I recommend watching Ian’s 2016 tutorial at NIPS (now NeurIPS).

The video is about two hours long and includes a detailed review of GANs, theory, and applications, with questions and answers with the audience at the end.

I would also strongly encourage reading the accompanying slides and paper version of the tutorial:

If you’re interested in a more focused presentation (about 28 minutes) of the same material with less theory, I recommend Ian’s 2016 presentation for “AI With the Best,” an online conference.

More recently, Ian gave a presentation at AAAI in 2019 on the broader topic of Adversarial Machine Learning that also covers GANs. This presentation is also highly recommended.

If you are looking for a more academic presentation of GANs, then I would recommend the lecture on Generative Models from the Stanford course on Convolutional Neural Networks.

This lecture provides a useful context for GANs as well as coverage of the related techniques of Variational Autoencoders and PixelRNN.

Paper Reading List for Generative Adversarial Networks

GANs are a very new area of study.

I’ve tried to separate this reading list from the broader list of papers on GAN applications, focusing on the development of the theory and training of GAN models.

The first paper specifically on GANs as a generative model was published by Ian Goodfellow, et al. in 2014 titled “Generative Adversarial Networks.”

The paper presents the general technique and demonstrates it with some simple examples of generating images from MNIST (handwritten digits), CIFAR-10 (small photographs), and faces.

Alec Radford, et al. in their 2015 paper titled “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks” provide an updated version of GANs using modern configuration and training practices for convolutional neural networks, referred to as Deep Convolutional Generative Adversarial Networks, or DCGANs.

This was an important paper because it demonstrated how the power of the technique can be unlocked with examples such as generating photorealistic rooms and faces.

After the DCGAN paper, a rash of papers was written providing improvements to the inherently unstable process of training the GAN models. Perhaps the most important of these papers include:

Some more recent high-quality papers on the challenge of training and evaluating GANs include:

Beyond these papers, a high-level overview of the history of related generative models can be seen on the Wikipedia page for GANs.

There are a number of GAN survey papers that can help to get a feeling for the extent of the field. A select few include:

Many people have tried to put together reading lists for GANs, and it is very challenging given both the newness of the field and the pace of new papers. Some other paper reading lists include:

Books on Generative Adversarial Networks

There is some coverage of GANs in modern books on deep learning.

Perhaps the most important starting point is the Deep Learning textbook written by Goodfellow, et al. Chapter 20 is titled “Deep Generative Models” and provides a useful summary of a range of techniques, including GANs, covered in Section 20.10.4.

Francois Chollet, the author of the Keras deep learning framework, provides a chapter on deep generative models in his 2017 book titled “Deep Learning with Python.” Specifically, Section 8.5, titled “Introduction to generative adversarial networks,” covers GANs and how to train a DCGAN in Keras.

At the time of writing, there are also two interesting books on deep learning for generative modeling in the works that are projected to be released later in the year. They are:

It will be exciting to see what these books cover.

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Books

Papers

  • NIPS 2016 Tutorial: Generative Adversarial Networks, 2016.
  • Generative Adversarial Networks, 2014.
  • Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015.
  • Wasserstein GAN, 2017.
  • Are GANs Created Equal? A Large-Scale Study, 2017.
  • The GAN Landscape: Losses, Architectures, Regularization, and Normalization, 2018.
  • Generative Adversarial Networks: An Overview, 2017.
  • Generative Adversarial Networks: Introduction and Outlook, 2017.

Videos

  • Generative Adversarial Networks, Ian Goodfellow, NIPS, 2016.
  • Generative Adversarial Networks, Ian Goodfellow, AIWTB, 2016.
  • Adversarial Machine Learning, Ian Goodfellow, AAAI, 2019.
  • Generative Models, Convolutional Neural Networks for Visual Recognition, 2017.

Articles

Summary

In this post, you discovered the best resources that you can use to learn about generative adversarial networks.

Specifically, you learned:

  • What a generative adversarial network is and examples of specific applications for the technique.
  • Video tutorials and lectures on GANs presented by the inventor of the technique.
  • Reading list including the most read papers on GANs and books on deep generative models.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.


