I have a multi-layer perceptron network in Keras with two hidden layers.
While trying to train the network, I get the following error from fit_generator:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
My code is:
import numpy as np
import keras
from keras import layers
from keras import Sequential
# Define window size (color images)
img_window = (32, 32, 3)
# Flatten the window shape
input_shape = np.prod(img_window)
print(input_shape)
# Define MLP with two hidden layers
simpleMLP = Sequential(
    [layers.Input(shape=img_window),
     layers.Flatten(),  # Flattens the input to a 1-D vector; does not affect the batch size
     layers.Dense(input_shape // 2, activation="relu"),
     layers.Dense(input_shape // 2, activation="relu"),
     layers.Dense(2, activation="sigmoid"),
    ])
# After the model is built, call its summary() method to display its contents
simpleMLP.summary()
# Initialization
batch_size = 128  # Size of the data batches; adjust depending on available RAM
epochs = 5
# Compile the MLP model with three arguments: loss function, optimizer, and a metric to judge model performance
simpleMLP.compile(loss="binary_crossentropy", optimizer="adam", metrics=["binary_accuracy"])  # BCE
# Create ImageDataGenerators for the training and validation datasets
from keras.preprocessing.image import ImageDataGenerator
train_dir = '/content/train'
train_datagen = ImageDataGenerator(
    rescale=1./255,  # rescaling factor
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest')

valid_dir = '/content/valid'
valid_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest')

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_window[:2],
    batch_size=batch_size,
    class_mode='binary',
    color_mode='rgb')

validation_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size=img_window[:2],
    batch_size=batch_size,
    class_mode='binary',
    color_mode='rgb')
# Train the MLP model
simpleMLP.fit_generator(
    train_generator,
    steps_per_epoch=8271 // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=2072 // batch_size)
Can you please advise me how to resolve this problem? Thanks in advance.
Your problem is simply that you have labels of shape (N, 1) while the loss is binary_crossentropy. This means you should have a single output node in the last layer, but your model outputs two classes.
simpleMLP = Sequential(
[...
layers.Dense(2,activation="sigmoid"),
]
)
Simply change it to:
simpleMLP = Sequential(
[...
layers.Dense(1,activation="sigmoid"),
]
)
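Alternatively, if you want to keep two output units, the labels have to match that shape. Here is a sketch of that variant (equivalent for a two-class problem): switch the head to softmax, the loss to categorical cross-entropy, and the generators to one-hot labels.
simpleMLP = Sequential(
    [layers.Input(shape=img_window),
     layers.Flatten(),
     layers.Dense(input_shape // 2, activation="relu"),
     layers.Dense(input_shape // 2, activation="relu"),
     layers.Dense(2, activation="softmax"),  # two units need one-hot labels
    ])
simpleMLP.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_window[:2],
    batch_size=batch_size,
    class_mode='categorical',  # yields labels of shape (None, 2)
    color_mode='rgb')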
Related
I'm trying to perform transfer learning with MobileNetV2 to classify the 196 classes of cars from Stanford's cars-196 dataset.
My work environment is a Google Colab notebook.
I use Keras's ImageDataGenerator to load the images for training and validation.
On the training images I also perform data augmentation.
The following code shows how I do it:
# To load the dataset from the drive
from google.colab import drive
drive.mount('/content/drive')
import math
from keras.models import Sequential, Model
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout, ReLU, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input
BATCH_SIZE = 196
train_datagen = ImageDataGenerator(
    rotation_range=20,  # Rotate the augmented image by up to 20 degrees
    zoom_range=0.2,  # Zoom by 20% more or less
    horizontal_flip=True,  # Allow horizontal flips of augmented images
    brightness_range=[0.8, 1.2],  # Lighter and darker images by up to 20%
    width_shift_range=0.1,
    height_shift_range=0.1,
    preprocessing_function=preprocess_input
)

img_data_iterator = train_datagen.flow_from_directory(
    # Where to take the data from; the classes are the subfolder names
    '/content/drive/My Drive/Datasets/cars-196/car_data/train',
    class_mode="categorical",  # classes are one-hot encoded; this is the default, but just to point it out
    shuffle=True,  # shuffle the data; this is the default, but just to point it out
    batch_size=BATCH_SIZE,
    target_size=(224, 224)  # This size is the default input size of the MobileNet NN
)

validation_img_data_iterator = ImageDataGenerator().flow_from_directory(
    '/content/drive/My Drive/Datasets/cars-196/car_data/test',
    class_mode="categorical",
    shuffle=True,
    batch_size=BATCH_SIZE,
    target_size=(224, 224)
)
base_model = MobileNetV2(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(196, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=preds)
# Disable training of the already-trained layers
for layer in model.layers[:-3]:
    layer.trainable = False
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
# define the checkpoint
from keras.callbacks import ModelCheckpoint
filepath = "/content/drive/My Drive/Datasets/cars-196/model.h5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
history = model.fit(
    img_data_iterator,
    steps_per_epoch=math.ceil(8144/BATCH_SIZE),  # 8144 is the number of training images
    validation_steps=math.ceil(8062/BATCH_SIZE),  # 8062 is the number of validation images
    validation_data=validation_img_data_iterator,
    epochs=100,
    callbacks=callbacks_list
)
About the batch size: based on a Stack Overflow question, I decided to set the batch size equal to the number of labels, but it did not change anything in terms of val_accuracy.
I've added a dropout of 0.5 between the fully connected layers that I added, but again, no change in the validation accuracy.
My accuracy on the training set reaches about 92% while the validation accuracy stays at about 0.7%.
My guess is that the ImageDataGenerator is acting weird and hurting the accuracy, but I have not found any solution, so at the moment I don't have a clue what the reason behind it is.
Does anyone have any guesses as to what the problem might be?
----- EDIT
The train and test folders each contain subfolders named after the labels (the different cars I want to identify), and each subfolder holds images of that car; this is just how the cars-196 dataset is organized. The ImageDataGenerator attaches the right label to each image depending on which subfolder the image is in.
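As a quick sanity check on that mapping, the inferred label dictionary can be printed straight off the iterator (a minimal sketch using the generator defined above):
# Maps subfolder name -> integer class index; should list all 196 car classes
print(img_data_iterator.class_indices)
print(img_data_iterator.num_classes)  # expected: 196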
The problem was that I did not apply the preprocess_input function to the validation data image generator.
Instead of
validation_img_data_iterator = ImageDataGenerator().flow_from_directory(
    '/content/drive/My Drive/Datasets/cars-196/car_data/test',
    class_mode="categorical",
    shuffle=True,
    batch_size=BATCH_SIZE,
    target_size=(224, 224)
)
I changed it to:
validation_img_data_iterator = ImageDataGenerator(
    preprocessing_function=preprocess_input
).flow_from_directory(
    '/content/drive/My Drive/Datasets/cars-196/car_data/test',
    class_mode="categorical",
    shuffle=True,
    batch_size=BATCH_SIZE,
    target_size=(224, 224)
)
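For context: MobileNetV2's preprocess_input maps pixel values from [0, 255] into [-1, 1], so a validation generator without it feeds the network a completely different input distribution than the one it was trained (and augmented) with. A minimal sketch of the effect:
import numpy as np
from keras.applications.mobilenet_v2 import preprocess_input

raw = np.array([[0., 127.5, 255.]])
print(preprocess_input(raw))  # [[-1. 0. 1.]] -- the range the network expects
# Without preprocessing, the same pixels stay in [0, 255], far outside that range.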
I am working on a project to classify waste as plastics and non-plastics, using only images to train the model. However, I still don't know what features the model takes into account while classifying them. I am using a CNN, but the prediction accuracy is still not up to the mark.
The reason I went with a CNN is that there is no specific hand-crafted feature that distinguishes plastics from other waste. Is there any other way to approach this problem?
For example, if I train on images of cats, my neural network learns what a cat is even though I do not explicitly give it features. Is the same valid here too?
Suppose you want to extract features from the pre-trained convolutional neural network VGG16 (from the VGGNet family).
The code to reuse the convolutional base is:
from keras.applications import VGG16
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))  # This is the size of your images
The final feature map has shape (4, 4, 512). That’s the feature on top of which you’ll stick a densely connected classifier.
There are two ways to extract features:
FAST FEATURE EXTRACTION WITHOUT DATA AUGMENTATION: Run the convolutional base over your dataset, record its output to a NumPy array on disk, and then use this data as input to a standalone, densely connected classifier. This solution is fast and cheap to run, because it only requires running the convolutional base once for every input image, and the convolutional base is by far the most expensive part of the pipeline. But for the same reason, this technique won't allow you to use data augmentation.
Code for extracting features using this method is shown below:
import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

base_dir = '/Users/fchollet/Downloads/cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Generators loop indefinitely, so break once every sample is seen
            break
    return features, labels

train_features, train_labels = extract_features(train_dir, 2000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
test_features, test_labels = extract_features(test_dir, 1000)

# Flatten the extracted features so they can feed a Dense classifier
train_features = np.reshape(train_features, (2000, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (1000, 4 * 4 * 512))
test_features = np.reshape(test_features, (1000, 4 * 4 * 512))
from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(train_features, train_labels,
                    epochs=30,
                    batch_size=20,
                    validation_data=(validation_features, validation_labels))
Training is very fast because you only have to deal with two Dense layers; an epoch takes less than one second, even on a CPU.
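Since the test-set features were already extracted and flattened above, the same classifier can be evaluated on them directly; a minimal check:
# Returns [test loss, test accuracy] for the held-out features
test_loss, test_acc = model.evaluate(test_features, test_labels)
print('Test accuracy:', test_acc)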
FEATURE EXTRACTION WITH DATA AUGMENTATION: Extend the model you have (conv_base) by adding Dense layers on top, and run the whole thing end to end on the input data. This allows you to use data augmentation, because every input image goes through the convolutional base every time it's seen by the model. But for the same reason, this technique is far more expensive than the first.
Code for the same is shown below:
from keras import models
from keras import layers
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# Freeze the convolutional base so its pretrained weights are not updated during training
conv_base.trainable = False

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

# The validation data should not be augmented, only rescaled
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),
                                                    batch_size=20,
                                                    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        target_size=(150, 150),
                                                        batch_size=20,
                                                        class_mode='binary')

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

history = model.fit_generator(train_generator,
                              steps_per_epoch=100,
                              epochs=30,
                              validation_data=validation_generator,
                              validation_steps=50)
For more details, please refer to Section 5.3.1 of the book Deep Learning with Python by François Chollet, the creator of Keras.
I spent two days trying to use Neural Structured Learning to adapt my CNN model. I use ImageDataGenerator and flow_from_directory, and when I call model.fit_generator I get this error message:
ValueError:
When passing input data as arrays, do not specify
steps_per_epoch/steps argument. Please use batch_size instead.
I use Keras 2.3.1 and TensorFlow 2.0 as the backend.
This is a snippet of my code:
num_classes = 4
img_rows, img_cols = 224, 224
batch_size = 16

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.3,
    height_shift_range=0.3,
    horizontal_flip=True,
    fill_mode='nearest')

validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_rows, img_cols),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_rows, img_cols),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical')

def vgg():
    model1 = Sequential([ ])
    return model1

base_model = vgg()
I adapt the generated data from the (x, y) format to a dictionary format:
def convert_training_data_generator():
    for x, y in train_generator:
        return {'feature': x, 'label': y}

def convert_testing_data_generator():
    for x, y in validation_generator:
        return {'feature': x, 'label': y}
adv_config = nsl.configs.make_adv_reg_config(multiplier=0.2, adv_step_size=0.05)
model = nsl.keras.AdversarialRegularization(base_model, adv_config=adv_config)

train = convert_training_data_generator()
test = convert_testing_data_generator()

history = model.fit_generator(train,
                              steps_per_epoch=nb_train_samples // batch_size,
                              epochs=epochs,
                              callbacks=callbacks,
                              validation_data=test,
                              validation_steps=nb_validation_samples // batch_size)
I think the error message points at the actual problem: your convert functions use return inside the for loop, so each one returns a single dictionary of arrays (the first batch) rather than a generator. Keras therefore treats the input as plain arrays and rejects the steps_per_epoch argument. You could consider using the model.fit() function instead; in that case you define your train inputs, your train labels, and the batch_size. To see the difference between fit and fit_generator, check the Keras documentation.
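If you want to keep the generator-based pipeline instead, here is a minimal sketch (assuming the base model's input layer is actually named 'feature', as your dictionary keys suggest): replace return with yield so the functions produce true generators that convert every batch rather than returning only the first one.
def convert_to_dict_generator(image_generator):
    # Loops forever, like a Keras data generator, yielding one converted batch at a time
    for x, y in image_generator:
        yield {'feature': x, 'label': y}

train = convert_to_dict_generator(train_generator)
test = convert_to_dict_generator(validation_generator)

history = model.fit_generator(train,
                              steps_per_epoch=nb_train_samples // batch_size,
                              epochs=epochs,
                              callbacks=callbacks,
                              validation_data=test,
                              validation_steps=nb_validation_samples // batch_size)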
Hi guys, I have a problem with VGG16 in Keras.
I am trying to get higher accuracy, but nothing has worked.
I only have 46 training samples, 12 classes, and 26 validation samples.
Currently, the highest accuracy I can get is 0.18.
I tried changing the batch size to 2, but the result was worse than I expected.
I don't think I should set the number of training samples higher than the data I actually have.
What should I do to increase the accuracy?
This is my actual code:
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model, Sequential
from keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator

# dimensions of our images
img_width, img_height = 224, 224
train_data_dir = 'database/train'
validation_data_dir = 'database/validation'
nb_train_samples = 46
nb_validation_samples = 26
epochs = 50
batch_size = 4

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

# Get back the convolutional part of a VGG network trained on ImageNet
vgg_conv = VGG16(weights='imagenet', include_top=True)
vgg_conv.summary()
print('VGG Pretrained Model loaded.')

# Add a layer whose input is the output of the second-to-last layer
x = Dense(12, activation='softmax', name='predictions')(vgg_conv.layers[-2].output)
model = Model(inputs=vgg_conv.input, outputs=x)  # 'inputs'/'outputs': the old 'input'/'output' keywords are deprecated

# The weights and layers from the VGG part appear in the summary, and they will be updated during training
model.summary()
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,  # pixel values run 0-255, so rescale by 1/255 (1/224 was a typo)
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing: only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
# compile the model
# model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizers.RMSprop(lr=2e-4), metrics=['accuracy'])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

# Train the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,  # integer division: steps must be whole numbers
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

# Save the model
model.save('vgg16_pretrained_5.h5')
# Check Performance
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'b', label='Training acc')
plt.plot(epochs, val_acc, 'r', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'b', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
Since you have 12 classes and only 46 observations, that is roughly 4 observations per class (just an estimate, without even looking at the data set). With this little data, the model cannot even pick up the pattern of the data and will eventually fail to generalize. So you need at least 2k observations for better results.
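Beyond collecting more data, one common mitigation for tiny datasets (a sketch building on the code in the question, not a guaranteed fix) is to freeze the pretrained VGG layers so only the new 12-way head is trained, which drastically reduces the number of trainable parameters:
# Freeze every layer except the new 12-way softmax head added in the question
for layer in model.layers:
    if layer.name != 'predictions':
        layer.trainable = False

# Re-compile so the trainability change takes effect
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])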
I'm trying to adapt Deep Learning with Python, section 5.3 (feature extraction with data augmentation), to a 3-class problem with ResNet50 (ImageNet weights).
Full code at https://github.com/morenoh149/plantdisease
from keras import models
from keras import layers
from keras.applications.resnet50 import ResNet50
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator

input_shape = (224, 224, 3)
target_size = (224, 224)
batch_size = 20

conv_base = ResNet50(weights='imagenet', input_shape=input_shape, include_top=False)

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))

conv_base.trainable = False

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'input/train',
    target_size=target_size,
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    'input/validation',
    target_size=target_size,
    batch_size=batch_size,
    class_mode='categorical')

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

history = model.fit_generator(
    train_generator,
    steps_per_epoch=96,
    epochs=30,
    verbose=2,
    validation_data=validation_generator,
    validation_steps=48)
Questions:
the book doesn't go much into ImageDataGenerator and selecting steps_per_epoch and validation_steps. What should these values be?
I have 3 classes, 1000 images each. I've split it 60/20/20 train/validation/test.
I was able to get a validation accuracy of 60% without data augmentation. Above, I've simplified the ImageDataGenerator to only rescale, yet this model has a validation accuracy of 30%. Why?
What changes do I need to make to the data-augmented version of this script to match the accuracy with no augmentation?
UPDATE:
This may be an issue with keras itself
https://github.com/keras-team/keras/issues/9214
https://github.com/keras-team/keras/pull/9965
To answer your first question: steps_per_epoch is the number of batches the training generator should yield before an epoch is considered finished. If you have 600 training images and a batch size of 20, that is 30 steps per epoch, and so on. validation_steps applies the same logic to the validation data generator, which runs at the end of each epoch.
In general, steps_per_epoch is the size of your dataset divided by the batch size.
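As a concrete sketch using the generators defined above: the iterators returned by flow_from_directory expose a samples attribute, so neither value needs to be hard-coded.
import math

# With the 60/20/20 split described above (1800 train, 600 validation images):
steps_per_epoch = math.ceil(train_generator.samples / batch_size)        # 1800 / 20 = 90
validation_steps = math.ceil(validation_generator.samples / batch_size)  # 600 / 20 = 30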