How many fake examples are required for training a GAN? - pytorch

I recently trained a GAN with ~1000 real images and 64 fake images. This obviously wasn't enough fake images, so most of the "latent space" in the GAN just creates the same image. How many fake images are GANs usually trained with in order to make their latent space usable?

The number of fake images you generate at each step is set by the batch dimension of the noise you sample. Ideally, in a single training step, you would sample a noise tensor of shape batch_size x noise_dim and use the generator to produce batch_size fake images. Similarly, your discriminator would see batch_size real images in that step.
This way, at every step, your discriminator sees an equal number of fake and real images, and over one epoch the total number of fake images it sees equals the total number of images in your training set.
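A minimal sketch of what one such step can look like in PyTorch; the tiny fully connected generator and discriminator and the random "real" batch are placeholders for illustration, not code from the question:

    import torch
    import torch.nn as nn

    # Toy placeholder models and data, only to illustrate the per-step bookkeeping.
    noise_dim, batch_size, img_dim = 100, 64, 28 * 28

    generator = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                              nn.Linear(256, img_dim), nn.Tanh())
    discriminator = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                                  nn.Linear(256, 1))

    real_images = torch.randn(batch_size, img_dim)   # stand-in for a batch of real data

    # One training step: sample a *fresh* batch of noise, so the generator produces
    # new fake images every step instead of reusing a fixed set of 64.
    noise = torch.randn(batch_size, noise_dim)
    fake_images = generator(noise)                   # batch_size fake images this step

    d_real = discriminator(real_images)              # batch_size real images...
    d_fake = discriminator(fake_images.detach())     # ...and batch_size fake images
    # compute the usual BCE-with-logits losses on d_real / d_fake and step the optimizers

Over an epoch of roughly 1000 real images, the discriminator therefore also sees roughly 1000 freshly generated fakes, not a fixed pool of 64.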

Related

Can't overcome Overfitting - GrayScale Images from Numerical Arrays and CNN with PyTorch

I am trying to implement an image classification task for grayscale images that were converted from sensor readings. I initially had time-series data, e.g. acceleration or displacement, and transformed it into images. Before the transformation, I applied normalization across the data. Each image has dimension 1000x9, where 1000 is the total number of time steps and 9 is the number of data points. The split ratio is 70%, 15%, and 15% for the training, validation, and test sets. There are 10 different labels, each with 100 images, so it is a multi-class classification task.
An example of my array before image conversion is:
As you can see above, the values differ only at fine precision. When I convert them into images, I can see the dark and light parts of the image;
Imagine that I have directories D1 to D9 (damage cases) and UN (healthy case), each containing many images like this.
Then I have a CNN whose goal is classification, but there is a significant overfitting issue and whatever I do doesn't work out. One of the architectures I've been working on;
Model summary;
I also augment the data. After 250 epochs, this is what I get;
So, what I wonder is this: I tried to apply some regularization and augmentation, but they do not give me solid results. I experimented by changing the number of hidden units, layers, etc. Do you think I need to completely change my architecture? I basically use two CNN blocks and FC layers at the end. This is not the first time I've worked with images like this, but I cannot mitigate this overfitting issue. I would appreciate any solid suggestions that would help me get smooth results. I was also thinking of using some pre-trained models for transfer learning, but the image dimensions cause some problems. Do you know if I can use any of those pre-trained models with a 1000x9 image dimension? I know there are some overfitting topics in the forum, but since these images come from numerical arrays and I could not make it work, I wanted to create a new thread. Thank you!
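For concreteness, here is a hypothetical sketch of the kind of two-conv-block model described above (the asker's actual code and model summary are not shown here); the layer sizes, batch norm, global pooling, and dropout are illustrative choices, not the original architecture:

    import torch
    import torch.nn as nn

    # Hypothetical two-block CNN for single-channel 1000x9 inputs and 10 classes.
    # Global average pooling keeps the fully connected part small, and dropout
    # adds some regularization against overfitting.
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
        nn.MaxPool2d(2),                              # -> (16, 500, 4)
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
        nn.MaxPool2d(2),                              # -> (32, 250, 2)
        nn.AdaptiveAvgPool2d(1),                      # -> (32, 1, 1)
        nn.Flatten(),
        nn.Dropout(0.5),
        nn.Linear(32, 10),
    )

    x = torch.randn(8, 1, 1000, 9)                    # a batch of 8 converted "images"
    print(model(x).shape)                             # torch.Size([8, 10])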

Torch model forward with a different image size

I am testing some well-known models for computer vision: UNet, FC-DenseNet103, this implementation.
I train them with 224x224 randomly cropped patches and do the same on the validation set.
Now when I run inference on some videos, I pass it the frames directly (1280x640) and it works. It runs the same operations on different image sizes and never gives an error. It actually gives a nice output, but the quality of the output depends on the image size...
Now, it's been a long time since I've worked with neural nets, but when I was using TensorFlow I remember I had to crop the input images to the training crop size.
Why don't I need to do this anymore? What's happening under the hood?
It seems that the models you are using have no linear layers, so the output of the convolutional layers goes straight into the softmax function. Softmax does not require a specific input shape, so the model will accept images of any size; however, the accuracy will probably be much worse for image sizes different from the one you trained on.
There is always a specific input size in the documentation of the model, and you should use that size; it is a limitation of the current models.
For UNets the constraint may only be a ratio (e.g. each dimension divisible by a power of two, so the downsampling and upsampling paths line up); I think it depends on the implementation.
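A toy fully convolutional model shows the point (the two-layer network below is only an illustration, not one of the models mentioned above):

    import torch
    import torch.nn as nn

    # No Linear layers, so no fixed input size is baked into the architecture.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 2, kernel_size=1),              # e.g. a 2-class segmentation head
    )

    for h, w in [(224, 224), (640, 1280)]:            # training crop vs. full video frame
        out = model(torch.randn(1, 3, h, w))
        print(out.shape)                              # the spatial size simply follows the input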
Just a note on resize:
transform.Resize((h,w))
transform.Resize(d)
In the case of (h, w), the output size is matched to it exactly.
In the second case, a single size d, the smaller edge of the image is matched to d.
For example, if height > width, the image is rescaled to (d * height / width, d).
The idea is to not distort the aspect ratio of the image.
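For reference, a quick check of the two call forms with torchvision (the 400x200 dummy image is only for illustration):

    from PIL import Image
    from torchvision import transforms

    img = Image.new("RGB", (400, 200))                 # PIL size is (width, height)

    print(transforms.Resize((224, 224))(img).size)     # (224, 224): exact (h, w), ratio ignored
    print(transforms.Resize(128)(img).size)            # (256, 128): smaller edge -> 128, ratio kept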

How multiple images are processed in a CNN

In a normal ANN, each training sample is represented by a row of a matrix, and in that way batches of training data can be processed. But in a CNN, how are multiple images processed?
The same as with an ANN: you can stack the images into an n-dimensional tensor to be processed together.
For CNNs trained on images, say your dataset consists of RGB (3-channel) images that are 256x256 pixels. A single image can be represented by a 3 x 256 x 256 tensor. If you set your batch size to 10, you are stacking 10 images together into a 10 x 3 x 256 x 256 tensor.
Tuning the batch size is one of the aspects of getting training right: if your batch size is too small, there will be a lot of variance within a batch and your training loss curve will bounce around a lot; if it's too large, your GPU will run out of memory to hold it, or training will progress too slowly for you to see whether the optimization is diverging early on.
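A minimal PyTorch illustration of such a stacked batch passing through a single convolutional layer:

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

    batch = torch.randn(10, 3, 256, 256)   # 10 RGB 256x256 images stacked along the batch dimension
    print(conv(batch).shape)               # torch.Size([10, 8, 256, 256]) -- all 10 processed together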

Does the input size to a CNN change depending on the batch size?

Usually, the input to a convolutional neural network (CNN) is described as an image of a given width * height * channels. Will the number of input nodes also be different for different batch sizes? That is, does it become batch_size * width * height * channels?
From this excellent answer, the batch size defines the number of samples that will be propagated through the network. Batch size does not affect the network's architecture, including the number of inputs.
Let's say you have 1,000 RGB images of size 32x32 and you set the batch size to 100. Your convolutional network should have an input shape of 32x32x3. To train the network, the training algorithm picks a subset of 100 images out of the total 1,000 and propagates it through the network. That is your 'batch'. The network architecture doesn't care whether the batch has 100, 200 or 1,000 images; it only depends on the shape of a single image, because the batch size never appears in the layer definitions. When the network has processed one batch of 100 images, the parameters are updated and one iteration (step) is complete; when it has seen all 1,000 images (10 batches here), it has completed one epoch.
The training algorithm picks a different batch of images at each step, but the above always holds true: the layer shapes must match the shape of a single image, with no regard for how many images are in that particular batch.
As to why we use mini-batches instead of training on the entire set at once (i.e. setting the batch size to 100% of the data), batches fit in GPU memory and keep the hardware busy, and the parameters are updated many times per epoch rather than once, which usually makes convergence much faster in practice.
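A small PyTorch sketch of the same point: the toy model below is defined only by the per-image shape, and the batch size lives entirely in the data loading:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Toy data: 1,000 RGB 32x32 images with 10 class labels.
    data = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))

    # The architecture only references the single-image shape (3 x 32 x 32).
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

    for batch_size in (100, 200, 1000):               # different batch sizes, same model
        images, _ = next(iter(DataLoader(data, batch_size=batch_size, shuffle=True)))
        print(images.shape, model(images).shape)      # only the leading batch dimension changes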

Value of steps_per_epoch passed to the Keras fit_generator function

What is the need for setting the steps_per_epoch value when calling fit_generator(), when ideally it should be (total number of samples) / (batch size)?
Keras' generators are infinite.
Because of this, Keras cannot know by itself how many batches the generators should yield to complete one epoch.
When you have a static number of samples, it makes perfect sense to use samples // batch_size for one epoch. But you may want to use a generator that performs random data augmentation, for instance; because of the random process, you will never have two identical training epochs, so there isn't a clear limit.
So, these parameters in fit_generator allow you to control the yields per epoch as you wish, although in standard cases you'll probably keep to the most obvious option: samples//batch_size.
Without data augmentation, the number of samples is static, as Daniel mentioned. The number of samples used for training is then steps_per_epoch * batch_size.
By using ImageDataGenerator in Keras, we create additional training data through augmentation, so the number of samples seen per epoch is up to you.
If you want twice the training data per epoch, just set steps_per_epoch to (original sample size * 2) / batch_size.
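A small runnable sketch of both suggestions with toy data, using the current tf.keras fit API (fit_generator is deprecated in newer versions); the array shapes, model, and augmentation here are illustrative only:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Toy dataset: 1,000 RGB 32x32 images with 10 class labels.
    x = np.random.rand(1000, 32, 32, 3).astype("float32")
    y = np.random.randint(0, 10, size=(1000,))
    batch_size = 100

    model = keras.Sequential([
        keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
        keras.layers.GlobalAveragePooling2D(),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    datagen = ImageDataGenerator(horizontal_flip=True)    # random augmentation -> endless batches

    # Standard case: one epoch = one pass over the original samples.
    model.fit(datagen.flow(x, y, batch_size=batch_size),
              steps_per_epoch=len(x) // batch_size,       # 10 steps per epoch
              epochs=2)

    # For "twice the data" per epoch, as suggested above:
    # steps_per_epoch = (len(x) * 2) // batch_size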
