Unsupervised representation learning with deep convolutional generative adversarial networks
A. Radford, L. Metz and S. Chintala, arXiv, Jan 2016.
[Python] [TensorFlow]
GANs are unstable to train, and the generated images often suffer from being noisy and incomprehensible. LAPGAN produced higher-quality images, but objects still looked wobbly because of noise introduced by chaining multiple models. To train stably across datasets, at higher resolutions, and with deeper generative models, DCGAN identifies a family of architectures and makes several changes to the standard CNN architecture: (1) replacing max-pooling with strided convolutions, which lets the network learn its own spatial downsampling; (2) removing fully connected hidden layers; (3) using batch normalization, which stabilizes learning by normalizing the input to each unit and mitigates training problems that arise from poor initialization; (4) using the ReLU activation in the generator for all layers except the output layer, which uses tanh, while the discriminator uses leaky ReLU throughout.
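The downsampling change in (1) and the activations in (4) can be illustrated with a small sketch (pure NumPy; the 0.2 leak slope is the value the paper uses for the discriminator, and the shape formula is the standard one for strided convolutions):

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # leaky ReLU with slope 0.2 on the negative side, as in the DCGAN discriminator
    return np.where(x > 0, x, alpha * x)

def conv_out_size(n, k, s, p):
    # output spatial size of a convolution: floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

# a 4x4 convolution with stride 2 and padding 1 halves the spatial size,
# i.e. a learned downsampling that replaces a fixed 2x2 max-pool
print(conv_out_size(64, 4, 2, 1))  # 32

print(leaky_relu(np.array([-1.0, 2.0])))  # [-0.2  2. ]
```

The generator uses the mirror operation (fractionally strided convolutions) to upsample, e.g. from a projected 4x4 feature map up to the 64x64 output image.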
No preprocessing is applied to the training images other than scaling them to the range [-1, 1] of the tanh activation function, and training is done with mini-batch stochastic gradient descent. Weights are initialized from a zero-centered normal distribution (standard deviation 0.02), and the Adam optimizer is used with a learning rate of 0.0002 and a momentum term β1 of 0.5 for stable training.
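A minimal sketch of this preprocessing and initialization (pure NumPy; the helper name and the weight-tensor shape are illustrative, while the 0.02 standard deviation is the value reported in the paper):

```python
import numpy as np

def scale_images(x_uint8):
    # map 8-bit pixel values [0, 255] to [-1, 1], matching the generator's tanh range
    return x_uint8.astype(np.float32) / 127.5 - 1.0

# weights drawn from a zero-centered normal distribution with std 0.02
rng = np.random.default_rng(0)
w = rng.normal(loc=0.0, scale=0.02, size=(4, 4, 3, 64))  # e.g. a 4x4 conv kernel

img = np.array([[0, 127, 255]], dtype=np.uint8)
print(scale_images(img))  # [[-1. -0.00392157  1.]]
```

Keeping the inputs in the same range as the generator's output means real and generated images are on an equal footing when fed to the discriminator.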
The quality of the representations a DCGAN learns can be evaluated by using its discriminator as a feature extractor on supervised datasets.
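That evaluation can be sketched roughly as follows, assuming access to the discriminator's intermediate activations; the pooling scheme (a global spatial max-pool per layer) and the layer shapes here are illustrative simplifications, and a linear classifier would then be trained on the resulting vectors:

```python
import numpy as np

def pool_features(layer_activations):
    """Build one feature vector per image from discriminator conv activations.

    layer_activations: list of (N, H, W, C) arrays, one per conv layer.
    Each layer is max-pooled over its spatial dimensions and the pooled
    vectors are concatenated channel-wise.
    """
    pooled = [a.max(axis=(1, 2)) for a in layer_activations]  # (N, C) each
    return np.concatenate(pooled, axis=1)                      # (N, sum of C)

# e.g. activations for a batch of 8 images from two discriminator layers
rng = np.random.default_rng(0)
acts = [rng.standard_normal((8, 16, 16, 64)), rng.standard_normal((8, 8, 8, 128))]
features = pool_features(acts)
print(features.shape)  # (8, 192)
```

Strong classification accuracy from a simple linear model on these fixed features indicates that the unsupervised GAN training learned useful representations.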