Hello, for a customized CNN that I train on a picture dataset with the fit_generator method, I don't understand why it doesn't work with a low batch size but works when I increase the batch_size parameter. Could somebody explain what's wrong?
nb_train_samples = 700
nb_validation_samples = 70
epochs = 50
batch_size = 5  # the CNN training loop doesn't work when the batch size is too low (e.g. 5)
TL;DR: In simple words, your steps_per_epoch and validation_steps cannot be more than len(train_generator) if the generator is already batched (assuming it is not repeated).
The generator is already batched. You are trying to pass more steps than len(training_set) by using nb_train_samples // batch_size, which equals 140.
You don't need to set steps_per_epoch when using generators, unless you want fewer steps.
Example:
train_generator = train_datagen.flow_from_directory(
    ...
    batch_size=20)
train_generator.samples # returns 2000
So in this case len(train_generator) returns 100. If you want to use fewer data points, you can specify steps_per_epoch like:
steps_per_epoch=train_generator.samples // 32  # equals 62
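A minimal sketch of both cases, assuming a recent Keras where fit accepts generators, with model a compiled Keras model and train_generator the iterator above (2000 samples, batch_size=20, so len(train_generator) == 100):
# No steps_per_epoch: Keras uses len(train_generator) == 100 steps per epoch.
model.fit(train_generator, epochs=10)

# Fewer batches per epoch: pass anything smaller than len(train_generator).
model.fit(train_generator,
          steps_per_epoch=train_generator.samples // 32,  # 2000 // 32 == 62 < 100, so this is valid
          epochs=10)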
Related
The code below is based on François Chollet. He uses it to show that when the training set is small (2000 images), data augmentation improves the classification performance on the validation set (which is true!).
My questions are:
If the model.fit_generator method uses steps_per_epoch = 2000 // batch_size, are we using 2000 images per epoch?
If yes, what is the point of data augmentation if I use an augmented sample size equal to the original one?
batch_size = 32

# Train data augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

# Validation data generation (rescaling only, no augmentation)
validation_datagen = ImageDataGenerator(rescale=1./255)

validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

# Training and validation
history = model.fit_generator(
    train_generator,
    steps_per_epoch=2000 // batch_size,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=500 // batch_size)
The code that you posted is quite dated, but serves the purpose for the explanation.
Indeed, we have 2000 images and we use all of them, but the number of steps performed in that epoch is 2000 // batch_size, since you update the weights of the network after each batch of batch_size samples. Therefore you perform 2000 // batch_size steps.
At the same time, think of augmentation as enrichment at run time. When you use augmentation you do not create new examples that are physically stored on your drive; the images are transformed when the batch is loaded into memory. This means that, out of a batch of batch_size elements, some of them are modified (augmented) and then fed into your network. Each augmentation has an associated probability, i.e. there is an N% chance (which you can even set manually if you want) that your image is subjected to that specific augmentation.
But this means that as training progresses and the number of epochs increases, your network gets to see many more images than the initial 2000.
In the snippet that you provided, steps_per_epoch = 2000 // batch_size essentially means that the model will see all 2000 images during an epoch, but many of those images will be replaced by their augmented counterparts, either according to a probability you provide or by random selection.
For example, consider building a dogs-vs-cats classifier whose dataset contains only right-facing dogs and left-facing cats. In this case, if you don't apply augmentation (horizontal flipping), the model might learn that all left-facing animals are cats, which leads to incorrect results when it is given an image of a left-facing dog.
Augmentation here (specifically horizontal flipping) will randomly flip images of cats and dogs, enabling the model to reach a better solution and hence making it more robust!
Augmentation happens in place; no new images are generated.
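To see that the augmentation is applied on the fly in memory rather than written to disk, here is a rough sketch using ImageDataGenerator.random_transform on a made-up image array (the shape and the augmentation parameters are purely illustrative):
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

image = np.random.rand(150, 150, 3)  # hypothetical single image (height, width, channels)

datagen = ImageDataGenerator(shear_range=0.2, zoom_range=0.2, horizontal_flip=True)

# Each call draws new random transform parameters, so the two results generally differ.
aug_1 = datagen.random_transform(image)
aug_2 = datagen.random_transform(image)

print(np.allclose(aug_1, aug_2))  # usually False: the variants exist only in memory, no files are written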
If we set steps_per_epoch (when fitting with an ImageDataGenerator) higher than the total number of possible batches (total_samples / batch_size), will the model revisit the same data points from the start, or will it ignore the extra steps?
Ex:
Flattened image shape which will go to the Dense layer: (2000*1)
Batch size: 20
Total number of possible batches: 100 (2000 / 20)
steps_per_epoch: 1000 (set explicitly)
As far as I know, steps_per_epoch is independent of the 'real' epoch (which is number_of_inputs / batch_size). Let's use an example similar to what you want to know, with 2000 data points and a batch_size of 20 (which means 2000 / 20 = 100 steps for one 'real' epoch):
If you set steps_per_epoch = 1000: Keras runs a loop of 1000 batches, which basically means 10 'real' epochs (i.e. 10 full traversals of the data).
If you set steps_per_epoch = 50: Keras runs a loop of 50 batches, and the remaining 50 batches of one 'real' epoch are visited in the next loop.
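The arithmetic behind those two cases, as a quick sketch:
total_samples = 2000
batch_size = 20
batches_per_pass = total_samples // batch_size  # 100 batches for one 'real' epoch

# steps_per_epoch = 1000 -> 1000 / 100 = 10 full traversals of the data per Keras epoch
# steps_per_epoch = 50   -> 50 / 100  = half the data; the other half is seen in the next Keras epoch
print(1000 / batches_per_pass, 50 / batches_per_pass)  # 10.0 0.5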
I am having some trouble understanding some arguments of the model.fit function in Keras.
In my problem I have a total of 1147 samples, and I have split those samples into training and validation sets (80% for training and 20% for validation). I am using the same batch size for training and validation. So I have:
Total_Samples = 1147
Training_Samples = 918
Validation_Samples = 229
Batch_Size = 16 # For Training and Validation
1st question: Is steps_per_epoch = Total_Samples / Batch_Size?
2nd question: Is validation_steps = Validation_Samples / Batch_Size?
Thanks in advance!
The steps_per_epoch will be Training_Samples (not Total_Samples) divided by Batch_Size. Similarly, the validation_steps will be Validation_Samples divided by Batch_Size.
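With the numbers from the question, a quick sketch (math.ceil keeps the final partial batch; use // instead if you want to drop it):
import math

Training_Samples = 918
Validation_Samples = 229
Batch_Size = 16

steps_per_epoch = math.ceil(Training_Samples / Batch_Size)     # 58 (57 full batches + 1 partial one)
validation_steps = math.ceil(Validation_Samples / Batch_Size)  # 15 (14 full batches + 1 partial one)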
When using Keras fit_generator, steps_per_epoch should be equivalent to the total number of available samples divided by the batch_size.
But how does the generator or fit_generator react if I choose a batch_size that does not divide evenly into the number of samples? Does it yield samples until it cannot fill a whole batch anymore, or does it just use a smaller batch for the last yield?
Why I ask: I divide my data into train/validation/test sets of different sizes (different percentages), but I use the same batch size for the train and validation sets, and especially for the train and test sets. As they differ in size, I cannot guarantee that the batch size divides evenly into the total number of samples.
If it's your generator with yield
You are the one who creates the generator, so the behavior is defined by you.
If steps_per_epoch is greater than the expected number of batches, fit will not notice anything; it will simply keep requesting batches until it reaches the number of steps.
The only thing is: you must make sure your generator is infinite.
Do this with while True: at the beginning, for instance.
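A minimal sketch of such an infinite generator (the names are purely illustrative):
import numpy as np

def my_generator(x, y, batch_size):
    # Hypothetical custom generator: loops forever over (x, y) in batches.
    num_samples = len(x)
    while True:  # infinite loop, so fit can request as many steps as it wants
        for start in range(0, num_samples, batch_size):
            end = start + batch_size
            yield x[start:end], y[start:end]  # the last batch may be smaller

# Usage sketch (x_train, y_train and model are assumed to exist):
# model.fit(my_generator(x_train, y_train, 32),
#           steps_per_epoch=len(x_train) // 32,
#           epochs=10)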
If it's a generator from ImageDataGenerator
If the generator is from an ImageDataGenerator, it's actually a keras.utils.Sequence and it has the length property: len(generatorInstance).
Then you can check yourself what happens:
remainingSamples = total_samples % batch_size  # confirm that this is greater than 0
wholeBatches = total_samples // batch_size
totalBatches = wholeBatches + 1

if len(generator) == wholeBatches:
    print("missing the last batch")
elif len(generator) == totalBatches:
    print("last batch included")
else:
    print("weird behavior")
And check the size of the last batch:
lastBatchX, lastBatchY = generator[len(generator) - 1]  # each item is an (inputs, targets) pair
if lastBatchX.shape[0] == remainingSamples:
    print("last batch contains the remaining samples")
else:
    print("last batch is different")
If you assign N to the parameter steps_per_epoch of fit_generator(), Keras will basically call your generator N times before considering one epoch done. It's up to your generator to yield all your samples in N batches.
Note that since for most models it is fine to have different batch sizes each iteration, you could fix steps_per_epoch = ceil(dataset_size / batch_size) and let your generator output a smaller batch for the last samples.
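A quick sketch of that bookkeeping with made-up numbers:
import math

dataset_size = 2007  # hypothetical: not a multiple of the batch size
batch_size = 32

steps_per_epoch = math.ceil(dataset_size / batch_size)               # 63 steps
last_batch_size = dataset_size - (steps_per_epoch - 1) * batch_size  # 23 samples in the final, smaller batch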
I was facing the same logical error and solved it by defining steps_per_epoch:
BS = 32
steps_per_epoch = len(trainX) // BS

history = model.fit(train_batches,
                    epochs=initial_epochs,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=validation_batches)
I want to create a stateful LSTM in Keras. I gave it a command like this:
model.add(LSTM(300,input_dim=4,activation='tanh',stateful=True,batch_input_shape=(19,13,4),return_sequences=True))
where the batch size is 19. But on running it, it gives this error:
Exception: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 8816 samples. Batch size: 32.
I did not specify a batch size of 32 anywhere in my script, and 8816 is divisible by 19.
model.fit() does the batching (as opposed to model.train_on_batch for example). Consequently it has a batch_size parameter which defaults to 32.
Change this to your input batch size and it should work as expected.
Example:
batch_size = 19
model = Sequential()
model.add(LSTM(300,input_dim=4,activation='tanh',stateful=True,batch_input_shape=(19,13,4),return_sequences=True))
model.fit(x, y, batch_size=batch_size)
There are two cases where the batch_size error could occur:
model.fit(train_x, train_y, batch_size=n_batch, shuffle=True, verbose=2)
trainPredict = model.predict(train_x, batch_size=n_batch)
# or
testPredict = model.predict(test_x, batch_size=n_batch)
In both cases, you have to specify the same batch size.
Note: since we need to predict on both the train and test data, the best practice is to split them such that the batch size divides both sizes evenly in the stateful=True case.
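One hedged way to pick such a batch size, assuming a hypothetical test-set size:
import math

train_samples = 8816  # from the question
test_samples = 2204   # hypothetical test-set size, just for illustration

# Any common divisor of both sizes is a valid stateful batch size.
common = math.gcd(train_samples, test_samples)  # 2204 here
n_batch = 19                                    # 19 divides 2204, so it divides both sets

assert train_samples % n_batch == 0 and test_samples % n_batch == 0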
To dynamically size your data and batches:
Size data and training sample split:
data_size = int(len(supervised_values))
train_size_initial = int(data_size * train_split)
x_samples = supervised_values[-data_size:, :]
Size number of training samples to batch size:
if train_size_initial < batch_size_div:
    batch_size = 1
else:
    batch_size = int(train_size_initial / batch_size_div)
train_size = int(int(train_size_initial / batch_size) * batch_size)  # provide even division of training / batches
val_size = int(int((data_size - train_size) / batch_size) * batch_size)  # provide even division of val / batches
print('Data Size: {}  Train Size: {}  Batch Size: {}'.format(data_size, train_size, batch_size))
Split the data into train and validation sets:
train, val = x_samples[0:train_size, 0:-1], x_samples[train_size:train_size + val_size, 0:-1]
Both the training and validation data need to be divisible by the batch size. Make sure that every part of the model that uses the batch size takes the same number (e.g. batch_input_shape in your LSTM layer, and batch_size in model.fit() and model.predict()). Down-sample the training and validation data if need be to make this work.
e.g.
>>> batch_size = 100
>>> print(x_samples_train.shape)
(42028, 24, 14)
>>> print(x_samples_validation.shape)
(10451, 24, 14)

# Down-sample so training and validation are both divisible by batch_size
>>> x_samples_train_ds = x_samples_train[-42000:]
>>> print(x_samples_train_ds.shape)
(42000, 24, 14)
>>> y_samples_train_ds = y_samples_train[-42000:]
>>> print(y_samples_train_ds.shape)
(42000,)
>>> x_samples_validation_ds = x_samples_validation[-10000:]
>>> print(x_samples_validation_ds.shape)
(10000, 24, 14)
>>> y_samples_validation_ds = y_samples_validation[-10000:]
>>> print(y_samples_validation_ds.shape)
(10000,)
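With both arrays now divisible by 100, a rough sketch of keeping the batch size consistent everywhere (model is assumed to be a compiled stateful network whose first layer uses batch_input_shape=(100, 24, 14)):
batch_size = 100

model.fit(x_samples_train_ds, y_samples_train_ds,
          batch_size=batch_size,  # same number as in batch_input_shape
          epochs=10,
          validation_data=(x_samples_validation_ds, y_samples_validation_ds),
          shuffle=False)          # typical for stateful LSTMs

predictions = model.predict(x_samples_validation_ds, batch_size=batch_size)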