I'm trying to divide my images (a dataset of bunnies and dogs) into x_train, x_val, y_train, y_val, and a test set.
The following is what I did:
I placed the photos of each class (dogs/bunnies) in separate folders inside two folders: training and testing.
Training directory-> Bunny directory -> bunny images
Training directory-> Puppy directory -> puppy images
Testing directory-> Bunny directory -> bunny images
Testing directory-> Puppy directory -> puppy images
I used the following code to get the images from the folders:
training_data = train_datagen.flow_from_directory('./images/train',
                                                  target_size=(28, 28),
                                                  batch_size=86,
                                                  class_mode='binary',
                                                  color_mode='rgb',
                                                  classes=None)
test_data = test_datagen.flow_from_directory('./images/test',
                                             target_size=(28, 28),
                                             batch_size=86,
                                             class_mode='binary',
                                             color_mode='rgb',
                                             classes=None)
Which gives me the following output:
Found 152 images belonging to 2 classes.
Found 23 images belonging to 2 classes.
Question 1: I wasn't sure how to define my labels here (y_val/y_train), or whether I even need to (but it appears that most models use y_val/y_train).
Question 2: I tried to run
x_train, x_val = train_test_split(training_data, test_size=0.1)
In order to at least split my training data into training and validation sets. But when I tried to run my model, it gave me the following error:
history = classifier.fit_generator(x_train,
                                   steps_per_epoch=(8000 / 86),
                                   epochs=2,
                                   validation_data=x_val,
                                   validation_steps=8000 / 86,
                                   callbacks=[learning_rate_reduction])
ValueError: validation_data should be a tuple (val_x, val_y, val_sample_weight) or (val_x, val_y).
Found: [(array([[[[0.5058095 , 0.46913707, 0.42369673],...
Question 1:
In my experience there are no strict constraints on naming the x/y variables. For example, in this kernel a person uses the names y_train and y_test for the labels, and here a person uses train_Y. The only real rule is to choose names that make clear what the variable holds.
Question 2:
I would recommend using the validation_split parameter of ImageDataGenerator (doc) to set the fraction of images reserved for validation, and then using the subset parameter of flow_from_directory (doc) to define train_generator and validation_generator variables. (Note that flow_from_directory returns a generator, not data.)
So your code would look like:
data_generator = ImageDataGenerator(
    validation_split=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)
train_generator = data_generator.flow_from_directory(
    './images/train',
    target_size=(28, 28),
    batch_size=86,
    class_mode='binary',
    color_mode='rgb',
    classes=None,
    subset='training',
)
validation_generator = data_generator.flow_from_directory(
    './images/train',
    target_size=(28, 28),
    batch_size=86,
    class_mode='binary',
    color_mode='rgb',
    classes=None,
    subset='validation',
)
history = classifier.fit_generator(
    train_generator,
    steps_per_epoch=(8000 / 86),
    epochs=2,
    validation_data=validation_generator,
    validation_steps=8000 / 86,
    callbacks=[learning_rate_reduction],
)
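One caveat with this approach: since both subsets come from the same ImageDataGenerator, the validation images go through the same random augmentations (shear, zoom, flips) as the training images. If you want an unaugmented validation set, a common workaround, sketched below, is to create a second, rescale-only generator with the same validation_split and draw the validation subset from it; the split is deterministic per class, so the two generators partition the same files.

# A minimal sketch: a separate rescale-only generator for validation.
# Both generators must use the same validation_split so they select
# the same files for each subset.
plain_generator = ImageDataGenerator(
    validation_split=0.2,
    rescale=1./255,
)
validation_generator = plain_generator.flow_from_directory(
    './images/train',
    target_size=(28, 28),
    batch_size=86,
    class_mode='binary',
    subset='validation',
)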
I am trying to overfit my model on a single batch to check the model's integrity. I am using Keras and TensorFlow for the implementation of my model in this project.
I know how to get a single batch and overfit the model in PyTorch, but I have no idea how to do it in Keras.
To get a single batch in PyTorch I used:
# Grab a single batch from the loader once, then train on it repeatedly.
data, target = next(iter(train_loader))
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
for epoch in range(epochs):
    print(f"Epoch [{epoch}/{epochs}]")
    data, target = data.to(device), target.to(device)
    data = data.reshape(data.shape[0], -1)
    # forward
    score = model(data)
    loss = criterion(score, target)
    print(f"Loss: {loss.item()}")
    # backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
How can I do this in Keras? Any helpful material would be appreciated.
Thank you everyone for coming here. I found a solution and here it is:
datagen = ImageDataGenerator(rescale=1 / 255.0,
                             rotation_range=20,
                             zoom_range=0.2,
                             width_shift_range=0.05,
                             height_shift_range=0.05,
                             shear_range=0.2,
                             horizontal_flip=True,
                             # preprocessing_function=preprocess_input,
                             fill_mode="nearest")

# Declare an image generator for validation & testing without augmentation
test_datagen = ImageDataGenerator(rescale=1./255)  # preprocessing_function=preprocess_input

# Declare generators for training, validation, and testing from directories
train_gen = datagen.flow_from_directory(directory_train,
                                        target_size=(512, 512),
                                        color_mode='rgb',
                                        batch_size=BATCH_SIZE,
                                        class_mode='binary',
                                        shuffle=True)
val_gen = test_datagen.flow_from_directory(directory_val,
                                           target_size=(512, 512),
                                           color_mode='rgb',
                                           batch_size=BATCH_SIZE,
                                           class_mode='binary',
                                           shuffle=False)
test_gen = test_datagen.flow_from_directory(directory_test,
                                            target_size=(512, 512),
                                            color_mode='rgb',
                                            batch_size=BATCH_SIZE,
                                            class_mode='binary',
                                            shuffle=False)
train_images, train_labels = next(iter(train_gen))
val_images, val_labels = next(iter(val_gen))
test_images, test_labels = next(iter(test_gen))
# Check the shapes of the selected batch
print("Length of Train images : {}".format(len(train_images)))
print("shape of Train images : {}".format(train_images.shape))
print("shape of Train labels : {}".format(train_labels.shape))
Length of Train images : 32
shape of Train images : (32, 512, 512, 3)
shape of Train labels : (32,)
history = model.fit(train_images, train_labels,
                    use_multiprocessing=True,
                    workers=16,
                    epochs=100,
                    class_weight=class_weights,
                    validation_data=(val_images, val_labels),
                    shuffle=True,
                    callbacks=call_backs)
I was trying to train my model on my training set, but every time I run it, it pops this error: "Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 12000 batches). You may need to use the repeat() function when building your dataset."
# Convolutional Neural Network

# Importing the libraries
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
tf.__version__

# Part 1 - Data Preprocessing
# Generating images for the Training set
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
# Generating images for the Test set
test_datagen = ImageDataGenerator(rescale=1./255)
# Creating the Training set
training_set = train_datagen.flow_from_directory('dataset/train',
                                                 target_size=(64, 64),
                                                 batch_size=32,
                                                 class_mode='binary')
# Creating the Test set
test_set = test_datagen.flow_from_directory('dataset/test',
                                            target_size=(64, 64),
                                            batch_size=32,
                                            class_mode='binary')
# Part 2 - Building the CNN
# Initialising the CNN
cnn = tf.keras.models.Sequential()
# Step 1 - Convolution
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding="same", activation="relu", input_shape=[64, 64, 3]))
# Step 2 - Pooling
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding='valid'))
# Adding a second convolutional layer
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding="same", activation="relu"))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding='valid'))
# Step 3 - Flattening
cnn.add(tf.keras.layers.Flatten())
# Step 4 - Full Connection
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))
# Step 5 - Output Layer
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
# Part 3 - Training the CNN
# Compiling the CNN
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Training the CNN on the Training set and evaluating it on the Test set
cnn.fit_generator(training_set,
                  steps_per_epoch=4000,
                  epochs=3,
                  validation_data=test_set,
                  validation_steps=2000)
steps_per_epoch sets the number of batches drawn in each epoch, so you actually need to verify that you have at least steps_per_epoch * batch_size images in your dataset (and not steps_per_epoch * epochs as you wrote). This is true for both the training and validation datasets.
A common choice is steps_per_epoch = floor(len(dataset) / batch_size). flow_from_directory uses a default batch_size of 32; you can change it by passing the batch_size parameter.
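As a concrete sketch using the generators from the question: the iterator returned by flow_from_directory exposes samples and batch_size attributes, so you can derive the step counts instead of hard-coding 4000 and 2000:

# Derive step counts from the generators instead of hard-coding them;
# training_set.samples is the number of images found on disk.
steps_per_epoch = training_set.samples // training_set.batch_size
validation_steps = test_set.samples // test_set.batch_size

cnn.fit_generator(training_set,
                  steps_per_epoch=steps_per_epoch,
                  epochs=3,
                  validation_data=test_set,
                  validation_steps=validation_steps)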
I spent two days trying to use Neural Structured Learning to adapt my CNN model. I use ImageDataGenerator and flow_from_directory, and when I use model.fit_generator I get this error message:
ValueError:
When passing input data as arrays, do not specify
steps_per_epoch/steps argument. Please use batch_size instead.
I use Keras 2.3.1 and TensorFlow 2.0 as backend
This is a snippet of my code:
num_classes = 4
img_rows, img_cols = 224, 224
batch_size = 16

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.3,
    height_shift_range=0.3,
    horizontal_flip=True,
    fill_mode='nearest')

validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_rows, img_cols),
    batch_size=batch_size, shuffle=True,
    class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_rows, img_cols),
    batch_size=batch_size, shuffle=True,
    class_mode='categorical')

def vgg():
    model1 = Sequential([ ])
    return model1

base_model = vgg()
I adapt the data generated from (x, y) format to a dictionary format:
def convert_training_data_generator():
    for x, y in train_generator:
        return {'feature': x, 'label': y}

def convert_testing_data_generator():
    for x, y in validation_generator:
        return {'feature': x, 'label': y}
adv_config = nsl.configs.make_adv_reg_config(multiplier=0.2, adv_step_size=0.05)
model = nsl.keras.AdversarialRegularization(base_model, adv_config=adv_config)

train = convert_training_data_generator()
test = convert_testing_data_generator()

history = model.fit_generator(train,
                              steps_per_epoch=nb_train_samples // batch_size,
                              epochs=epochs,
                              callbacks=callbacks,
                              validation_data=test,
                              validation_steps=nb_validation_samples // batch_size)
I think the same error applies here. Consider using the model.fit() function instead; in that case you define your training inputs, your training labels, and the batch_size. It is worth reading up on the difference between fit and fit_generator.
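One likely culprit in the snippet above: the conversion functions use return inside the for loop, so each call produces a single dict (the first batch) rather than a generator. A minimal sketch of a fix, assuming the 'feature'/'label' keys are what your AdversarialRegularization wrapper expects, is to yield each converted batch; depending on your Keras/TF versions you may also need to wrap this in a tf.data.Dataset, which is what the NSL tutorials do:

# A sketch, not a drop-in fix: yield converts every batch lazily,
# so the model receives a stream of dicts rather than one dict.
def convert_training_data_generator():
    for x, y in train_generator:
        yield {'feature': x, 'label': y}

def convert_testing_data_generator():
    for x, y in validation_generator:
        yield {'feature': x, 'label': y}

history = model.fit(convert_training_data_generator(),
                    steps_per_epoch=nb_train_samples // batch_size,
                    epochs=epochs,
                    callbacks=callbacks,
                    validation_data=convert_testing_data_generator(),
                    validation_steps=nb_validation_samples // batch_size)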
I have trained a basic CNN model for image classification. While training the model I used ImageDataGenerator from the Keras API. After the model was trained, I used a test data generator with the flow_from_directory method for testing, and everything went well. Then I saved the model for future use.
Now I am using the same model with the predict method from the Keras API on a single image, but the prediction is very different every time I test with different images.
Could you please suggest a solution?
training_augmentation = ImageDataGenerator(rescale=1 / 255.0)
validation_testing_augmentation = ImageDataGenerator(rescale=1 / 255.0)

# Initialize the training generator
training_generator = training_augmentation.flow_from_directory(
    JPG_TRAIN_IMAGE_DIR,
    class_mode="categorical",
    target_size=(32, 32),
    color_mode="rgb",
    shuffle=True,
    batch_size=batch_size
)

# Initialize the validation generator
validation_generator = validation_testing_augmentation.flow_from_directory(
    JPG_VAL_IMAGE_DIR,
    class_mode="categorical",
    target_size=(32, 32),
    color_mode="rgb",
    shuffle=False,
    batch_size=batch_size
)

# Initialize the testing generator
testing_generator = validation_testing_augmentation.flow_from_directory(
    JPG_TEST_IMAGE_DIR,
    class_mode="categorical",
    target_size=(32, 32),
    color_mode="rgb",
    shuffle=False,
    batch_size=batch_size
)

history = model.fit_generator(
    training_generator,
    steps_per_epoch=total_img_count_dict['train'] // batch_size,
    validation_data=validation_generator,
    validation_steps=total_img_count_dict['val'] // batch_size,
    epochs=epochs,
    callbacks=callbacks)

testing_generator.reset()
prediction_stats = model.predict_generator(testing_generator, steps=(total_img_count_dict['test'] // batch_size) + 1)

### Trying to use the predict method
img_file = '/content/drive/My Drive/Traffic_Sign_Recognition/to_display_output/Copy of 00003_00019.jpg'
img = cv2.imread(img_file)
img = cv2.resize(img, (32, 32))
img = img / 255.0
a = np.reshape(img, (1, 32, 32, 3))
model = load_model('/content/drive/My Drive/Traffic_Sign_Recognition/basic_cnn.h5')
prediction = model.predict(a)
When I try to use predict, the prediction comes out wrong every time. Any leads will be appreciated.
The Keras generator uses PIL for reading images, which loads them from disk as RGB. You are using OpenCV for reading, which loads images as BGR. You have to convert your image from BGR to RGB:
img = cv2.imread(img_file)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
...
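Putting it together with the rest of the preprocessing from your snippet, a minimal sketch of the corrected prediction path (paths and input size taken from your code):

import cv2
import numpy as np
from keras.models import load_model

model = load_model('/content/drive/My Drive/Traffic_Sign_Recognition/basic_cnn.h5')

img = cv2.imread(img_file)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # match PIL's RGB channel order
img = cv2.resize(img, (32, 32))             # match the generators' target_size
img = img / 255.0                           # match the rescale=1/255.0 step
a = np.reshape(img, (1, 32, 32, 3))         # add the batch dimension

prediction = model.predict(a)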
I'm trying to do an image segmentation problem where I want to segment 5 objects in an image. I'm using a U-net architecture. My final layer looks like this:
conv_final = Conv2D(OUTPUT_MASK_CHANNELS, (1, 1))(up_conv_224)
conv_final = Activation('sigmoid')(conv_final)
model = Model(inputs, conv_final, name="ZF_UNET_224")
However I get an error saying:
ValueError: Error when checking target: expected conv2d_24 to have shape (224, 224, 5) but got array with shape (224, 224, 3)
This is the generator that I'm using:
image_generator = train_datagen.flow_from_directory(
    'data/train',  # this is the target directory
    target_size=(224, 224),  # all images will be resized to 224x224
    color_mode='rgb',
    batch_size=batch_size,
    class_mode=None,  # the masks supply the targets, so no class labels here
    seed=1)

# This is a similar generator, for the masks
mask_generator = mask_datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=batch_size,
    class_mode=None,
    seed=1)

train_generator = zip(image_generator, mask_generator)
What can I do to fix this? Any help appreciated!
You have to convert the mask data into one-hot encoded format with one channel per class (5). Use to_categorical from keras.utils:
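A minimal sketch, under the assumption that each mask pixel stores an integer class id 0-4 in a single channel (load the masks with color_mode='grayscale' and make sure mask_datagen does not rescale them); if your masks encode classes as RGB colors, you must first map each color to a class id:

import numpy as np
from keras.utils import to_categorical

def train_pairs(image_generator, mask_generator, num_classes=5):
    # Pair image batches with one-hot encoded mask batches.
    for images, masks in zip(image_generator, mask_generator):
        # masks: (batch, 224, 224, 1) with integer class ids per pixel
        masks = to_categorical(masks[..., 0].astype('uint8'), num_classes)
        # masks is now (batch, 224, 224, 5), matching the model's output
        yield images, masks

train_generator = train_pairs(image_generator, mask_generator)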