Keras model ValueError: Error when checking model target: - keras

I have not coded in years, forgive me. I am trying to do something that may be impossible. I have 38 videos of people performing the same basic movement. I want to train a model to identify the people doing it correctly vs. not correctly.
I am using color for now, because grayscale did not work either and I wanted to test like the example I used. I used the model as defined in that example (linked in the code below).
Keras,
Python 3.5 in Anaconda (64-bit),
TensorFlow backend,
on Windows 10 (64-bit)
I was hoping to try different models on the problem and use grayscale to reduce memory, but I can't get past the first step!
Thanks!!!
Here is my code:
import time
import numpy as np
import sys
import os
import cv2
import keras
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, BatchNormalization
from keras.layers import Conv3D, Conv2D, MaxPooling2D, GRU, ConvLSTM2D, TimeDistributed

y_cat = np.zeros(40, np.float)
good = "Good"
bad = "Bad"

batch_size = 32
num_classes = 1
epochs = 1

nvideos = 38
nframes = 130
nrows = 240
ncols = 320
nchan = 3

x_learn = np.zeros((nvideos, nframes, nrows, ncols, nchan), np.int32)
x_learn = np.load(".\\train\\datasetcolor.npy")

with open(".\\train\\tags.txt") as ft:
    y_learn = ft.readlines()
y_learn = [x.strip() for x in y_learn]

# transform string tags to numeric.
for i in range(0, len(y_learn)):
    if y_learn[i] == good:
        y_cat[i] = 1
    elif y_learn[i] == bad:
        y_cat[i] = 0
# build model
# duplicating from https://github.com/fchollet/keras/blob/master/examples/conv_lstm.py
model = Sequential()
model.image_dim_ordering = 'tf'
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     input_shape=(nframes, nrows, ncols, nchan),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(filters=40, kernel_size=(3, 3),
                     padding='same', return_sequences=True))
model.add(BatchNormalization())
model.add(Conv3D(filters=1, kernel_size=(3, 3, 3),
                 activation='sigmoid',
                 padding='same', data_format='channels_last'))
model.compile(loss='binary_crossentropy', optimizer='adadelta')
print(model.summary())

# fit with first 3 videos because I don't have the horsepower yet
history = model.fit(x_learn[:3], y_learn[:3],
                    batch_size=batch_size,
                    epochs=epochs)
print(history)
print (history)
Results:
Layer (type) Output Shape Param #
=================================================================
conv_lst_m2d_5 (ConvLSTM2D) (None, 130, 240, 320, 40) 62080
_________________________________________________________________
batch_normalization_5 (Batch (None, 130, 240, 320, 40) 160
_________________________________________________________________
conv_lst_m2d_6 (ConvLSTM2D) (None, 130, 240, 320, 40) 115360
_________________________________________________________________
batch_normalization_6 (Batch (None, 130, 240, 320, 40) 160
_________________________________________________________________
conv_lst_m2d_7 (ConvLSTM2D) (None, 130, 240, 320, 40) 115360
_________________________________________________________________
batch_normalization_7 (Batch (None, 130, 240, 320, 40) 160
_________________________________________________________________
conv_lst_m2d_8 (ConvLSTM2D) (None, 130, 240, 320, 40) 115360
_________________________________________________________________
batch_normalization_8 (Batch (None, 130, 240, 320, 40) 160
_________________________________________________________________
conv3d_1 (Conv3D) (None, 130, 240, 320, 1) 1081
=================================================================
Total params: 409,881.0
Trainable params: 409,561
Non-trainable params: 320.0
_________________________________________________________________
None
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-d909d285f474> in <module>()
82 history = model.fit(x_learn[:3], y_learn[:3],
83 batch_size=batch_size,
---> 84 epochs=epochs)
85
86 print (history)
ValueError: Error when checking model target: expected conv3d_1 to have 5 dimensions, but got array with shape (3, 1)

"Target" means that the problem is in the output of your model versus the format of y_learn.
The array y_learn should have exactly the same shape as the model's output, because the model outputs a "guess" while y_learn is the "correct answer". The system can only compare the guess with the correct answer if they have the same dimensions.
See the difference:
Model Output (seen in the summary): (None,130,240,320,1)
y_learn: (None,1)
Where "None" is the batch size. You gave y_learn[:3], then your batch size is 3 for this training session.
In order to correct it properly, we need to understand what y_learn is.
If I understood correctly, you've got only a single number, 0 or 1, for each video. If that's so, your y_learn is totally OK, and what you need is for your model to output things like (None, 1).
A very simple way to do that (perhaps not the best, and I couldn't be of more help here...) is to add a final Dense layer with just one neuron:
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
Now, when you do model.summary(), you will see the final output as (None, 1).
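For reference, here is a minimal sketch of how those two lines slot into the model built above; note this assumes the numeric y_cat array is used as the training target, since y_learn still holds the raw strings:

# ... the ConvLSTM2D / BatchNormalization stack and Conv3D layer exactly as above ...
model.add(Flatten())                       # collapses (None, 130, 240, 320, 1) to (None, 9984000)
model.add(Dense(1, activation='sigmoid'))  # (None, 1): one good/bad score per video
model.compile(loss='binary_crossentropy', optimizer='adadelta')

# train against the numeric labels (y_cat), not the raw strings in y_learn
history = model.fit(x_learn[:3], y_cat[:3],
                    batch_size=batch_size,
                    epochs=epochs)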

Related

LSTM for Video Input

I am a newbie trying out LSTM.
I am basically using an LSTM to determine the action type (5 different actions) like running, dancing, etc. My input is 60 frames per action and roughly 120 such videos:
train_x.shape = (120, 192, 192, 60)
where 120 is the number of sample videos for training, 192x192 is the frame size, and 60 is the number of frames.
train_y.shape = (120, 5), one-hot encoded ([1 0 0 0 0] ..... [0 0 0 0 1])
I am not clear on how to pass 3D parameters to the LSTM (timesteps and features).
model.add(LSTM(100, input_shape=(train_x.shape[1],train_x.shape[2])))
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dense(len(uniquesegments), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, epochs=100, batch_size=batch_size, verbose=1)
I get the following error:
Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 192, 192, 60)
Training data algorithm:

Loop through videos
    Loop through each frame of a video
        logic
        append to array
    convert to numpy array
    roll axis to convert (60, 192, 192) to (192, 192, 60)
    add to training list
convert training list to numpy array

training list shape: (120, 192, 192, 60)
First, you should know that video classification tasks are better suited to a convolutional RNN than to an LSTM or any plain RNN cell, just as a CNN is better suited to image classification than an MLP.
RNN cells (e.g. LSTM, GRU) expect inputs with shape (samples, timesteps, features); since you are dealing with inputs of shape (samples, timesteps, width, height, channels), you should use tf.keras.layers.ConvLSTM2D instead.
The following example code shows how to build a model that can handle your video classification task:
import tensorflow as tf
from tensorflow.keras import models, layers

timesteps = 60
width = 192
height = 192
channels = 1
action_num = 5

model = models.Sequential(
    [
        layers.Input(
            shape=(timesteps, width, height, channels)
        ),
        layers.ConvLSTM2D(
            filters=64, kernel_size=(3, 3), padding="same", return_sequences=True, dropout=0.1, recurrent_dropout=0.1
        ),
        layers.MaxPool3D(
            pool_size=(1, 2, 2), strides=(1, 2, 2), padding="same"
        ),
        layers.BatchNormalization(),
        layers.ConvLSTM2D(
            filters=32, kernel_size=(3, 3), padding="same", return_sequences=True, dropout=0.1, recurrent_dropout=0.1
        ),
        layers.MaxPool3D(
            pool_size=(1, 2, 2), strides=(1, 2, 2), padding="same"
        ),
        layers.BatchNormalization(),
        layers.ConvLSTM2D(
            filters=16, kernel_size=(3, 3), padding="same", return_sequences=False, dropout=0.1, recurrent_dropout=0.1
        ),
        layers.MaxPool2D(
            pool_size=(2, 2), strides=(2, 2), padding="same"
        ),
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(256, activation='relu'),
        layers.Dense(action_num, activation='softmax')
    ]
)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
Outputs:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv_lst_m2d (ConvLSTM2D) (None, 60, 192, 192, 64) 150016
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 60, 96, 96, 64) 0
_________________________________________________________________
batch_normalization (BatchNo (None, 60, 96, 96, 64) 256
_________________________________________________________________
conv_lst_m2d_1 (ConvLSTM2D) (None, 60, 96, 96, 32) 110720
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 60, 48, 48, 32) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 60, 48, 48, 32) 128
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D) (None, 48, 48, 16) 27712
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 24, 24, 16) 0
_________________________________________________________________
batch_normalization_2 (Batch (None, 24, 24, 16) 64
_________________________________________________________________
flatten (Flatten) (None, 9216) 0
_________________________________________________________________
dense (Dense) (None, 256) 2359552
_________________________________________________________________
dense_1 (Dense) (None, 5) 1285
=================================================================
Total params: 2,649,733
Trainable params: 2,649,509
Non-trainable params: 224
_________________________________________________________________
Beware that you should reorder your data to the shape (samples, timesteps, width, height, channels) before feeding it to the model above (i.e. not with np.reshape, but with something like np.moveaxis). In your case the shape should be (120, 60, 192, 192, 1); then you can split your 120 videos into batches and feed them to the model.
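A minimal sketch of that reordering, assuming train_x has the (120, 192, 192, 60) layout described in the question:

import numpy as np

# dummy stand-in for the real data: (samples, width, height, timesteps)
train_x = np.random.randn(120, 192, 192, 60).astype(np.float32)

# move the timesteps axis (position 3) to position 1: (samples, timesteps, width, height)
train_x = np.moveaxis(train_x, 3, 1)

# add the trailing channels axis: (samples, timesteps, width, height, channels)
train_x = train_x[..., np.newaxis]

print(train_x.shape)  # (120, 60, 192, 192, 1)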
From the docs, it seems like LSTM isn't even intended to take a 4D input shape like this. And that makes sense, because typically you should be feeding it a 1D feature vector per timestep. That's why the docs say:
inputs: A 3D tensor with shape [batch, timesteps, feature]
What you're trying to do won't work (I've also left you a comment explaining why you probably shouldn't be trying to do it that way).
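For illustration only, this is what forcing the data into the (batch, timesteps, features) layout a plain LSTM expects would look like, flattening each frame into one feature vector; a sketch, not a recommendation, for the reasons above:

import numpy as np
from tensorflow.keras import models, layers

# (samples, timesteps, width, height) -> (samples, timesteps, width*height)
train_x = np.random.randn(120, 60, 192, 192).astype(np.float32)
train_x = train_x.reshape(120, 60, 192 * 192)  # 36864 features per timestep

model = models.Sequential([
    layers.Input(shape=(60, 192 * 192)),
    layers.LSTM(100),
    layers.Dense(5, activation='softmax'),
])
model.summary()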

Shapes Incompatible in Keras with CNN

I am implementing a network that takes a 2D image and outputs a 3D binary voxel grid for it.
I am using an autoencoder with an LSTM module.
The current shapes of the images and voxels are as follows:
print(x_train.shape)
print(y_train.shape)
>>> (792, 127, 127, 3)
>>> (792, 32, 32, 32)
792 RGB images 127 x 127
792 corresponding voxels with 3D Binary Tensor (32 x 32 x 32)
Running the following encoder model:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, LeakyReLU, MaxPooling2D, Dense, Flatten, Conv3D, MaxPool3D, GRU, Reshape, UpSampling3D
from tensorflow import keras
enc_filter = [96, 128, 256, 256, 256, 256]
fc_filters = [1024]
model = Sequential()
epochs = 5
batch_size = 24
input_shape=(127,127,3)
model.add(Conv2D(enc_filter[0], kernel_size=(7, 7), strides=(1,1),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(LeakyReLU(alpha=0.1))
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.01),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs)
yields the following:
ValueError: Shapes (24, 32, 32, 32) and (24, 1024) are incompatible
Can someone explain why the shapes are incompatible? I tried removing layers and testing other configurations, but everything yields compatibility issues.
Your model has a dense layer with 1024 outputs, but you are passing a (32, 32, 32)-shaped array as the target.
You need to reshape your model's output so that it has the proper shape.
This is a dummy model; you need to change the parameters to find a suitable architecture.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, LeakyReLU, MaxPooling2D, Dense, Flatten, Conv3D, MaxPool3D, GRU, Reshape, UpSampling3D
from tensorflow import keras
import numpy as np
# dummy data
x_train = np.random.randn(792, 127, 127, 3)
y_train = np.random.randn(792, 32, 32, 32)
enc_filter = [96, 128, 256, 2]
fc_filters = [1024]
model = Sequential()
epochs = 5
batch_size = 24
input_shape=(127,127,3)
model.add(Conv2D(enc_filter[0], kernel_size=(7, 7), strides=(1,1),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(enc_filter[1], kernel_size=(7, 7), strides=(1,1),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(enc_filter[2], kernel_size=(7, 7), strides=(1,1),activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(enc_filter[3], kernel_size=(7, 7), strides=(1,1),activation='relu',input_shape=input_shape)) # bottleneck
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(LeakyReLU(alpha=0.1))
model.add(Flatten())
model.add(Dense(32*32*32, activation='relu'))
model.add(Reshape((32,32,32)))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=0.01),
              metrics=['accuracy'])
model.summary()
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs)
Model: "sequential_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_24 (Conv2D) (None, 121, 121, 96) 14208
_________________________________________________________________
max_pooling2d_24 (MaxPooling (None, 60, 60, 96) 0
_________________________________________________________________
leaky_re_lu_24 (LeakyReLU) (None, 60, 60, 96) 0
_________________________________________________________________
conv2d_25 (Conv2D) (None, 54, 54, 128) 602240
_________________________________________________________________
max_pooling2d_25 (MaxPooling (None, 27, 27, 128) 0
_________________________________________________________________
leaky_re_lu_25 (LeakyReLU) (None, 27, 27, 128) 0
_________________________________________________________________
conv2d_26 (Conv2D) (None, 21, 21, 256) 1605888
_________________________________________________________________
max_pooling2d_26 (MaxPooling (None, 10, 10, 256) 0
_________________________________________________________________
leaky_re_lu_26 (LeakyReLU) (None, 10, 10, 256) 0
_________________________________________________________________
conv2d_27 (Conv2D) (None, 4, 4, 2) 25090
_________________________________________________________________
max_pooling2d_27 (MaxPooling (None, 2, 2, 2) 0
_________________________________________________________________
leaky_re_lu_27 (LeakyReLU) (None, 2, 2, 2) 0
_________________________________________________________________
flatten_10 (Flatten) (None, 8) 0
_________________________________________________________________
dense_1 (Dense) (None, 32768) 294912
_________________________________________________________________
reshape_10 (Reshape) (None, 32, 32, 32) 0
=================================================================
Total params: 2,542,338
Trainable params: 2,542,338
Non-trainable params: 0
In the summary, you can see that I added a dense layer with 32x32x32 neurons and then reshaped it to (32, 32, 32).
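As a quick sanity check of that tail-end sizing: the bottleneck flattens to 2*2*2 = 8 values, so the Dense layer's parameter count works out exactly as Keras reports:

flat = 2 * 2 * 2            # flattened output of the last pooling block
dense_units = 32 * 32 * 32  # one unit per output voxel
params = flat * dense_units + dense_units  # weights + biases
print(params)  # 294912, matching dense_1 in the summary above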

keras copying VGG16 pretrained weights layer by layer

I want to copy some of the VGG16 layer weights, layer by layer, to another smaller network with similar layers, but I get an error that says:
File "/home/d/Desktop/s/copyweights.py", line 78, in <module>
list(f["model_weights"].keys())
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.5/dist-packages/h5py/_hl/group.py", line 262, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'model_weights' doesn't exist)"
The file is definitely at that path; I downloaded it again just to make sure it is not corrupted (and it doesn't cause an error when I use model.load_weights in general). I also have HDF5 installed.
Here is the code:
from keras import applications
from keras import backend as k
from keras import layers
from keras import models
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras.regularizers import l2
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping, ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import os
epochs = 50
callbacks = []
#schedule = None
decay = 0.0
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
mcp_save = ModelCheckpoint('.mdl_wts.hdf5', save_best_only=True, monitor='val_loss', mode='min')
reduce_lr_loss = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1, epsilon=1e-5, mode='min')
base_model = models.Sequential()
base_model.add(layers.Conv2D(64, (3, 3), activation='relu', name='block1_conv1', input_shape=(256, 256, 3)))
base_model.add(layers.Conv2D(64, (3, 3), activation='relu', name='block1_conv2'))
base_model.add(layers.MaxPooling2D((2, 2)))
#model.add(Dropout(0.2))
base_model.add(layers.Conv2D(128, (3, 3), activation='relu', name='block2_conv1'))
base_model.add(layers.Conv2D(128, (3, 3), activation='relu', name='block2_conv2'))
base_model.add(layers.MaxPooling2D((2, 2), name='block2_pool'))
#model.add(Dropout(0.2))
base_model.summary()
"""
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 256, 256, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 256, 256, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 256, 256, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 128, 128, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 128, 128, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 128, 128, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 64, 64, 128) 0
=================================================================
Total params: 260,160.0
Trainable params: 260,160.0
Non-trainable params: 0.0
"""
layer_dict = dict([(layer.name, layer) for layer in base_model.layers])
[layer.name for layer in base_model.layers]
"""
['input_1',
'block1_conv1',
'block1_conv2',
'block1_pool',
'block2_conv1',
'block2_conv2',
'block2_pool']
"""
import h5py
weights_path = '/home/d/Desktop/s/vgg16_weights_new.h5'  # (https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5)
f = h5py.File(weights_path)
list(f["model_weights"].keys())
"""
['block1_conv1',
'block1_conv2',
'block1_pool',
'block2_conv1',
'block2_conv2',
'block2_pool',
'block3_conv1',
'block3_conv2',
'block3_conv3',
'block3_conv4',
'block3_pool',
'block4_conv1',
'block4_conv2',
'block4_conv3',
'block4_conv4',
'block4_pool',
'block5_conv1',
'block5_conv2',
'block5_conv3',
'block5_conv4',
'block5_pool',
'dense_1',
'dense_2',
'dense_3',
'dropout_1',
'global_average_pooling2d_1',
'input_1']
"""
# list all the layer names which are in the model.
layer_names = [layer.name for layer in base_model.layers]
"""
# Here we are extracting model_weights for each and every layer from the .h5 file
>>> f["model_weights"]["block1_conv1"].attrs["weight_names"]
array([b'block1_conv1/kernel:0', b'block1_conv1/bias:0'],
dtype='|S21')
# we are assigning this array to weight_names below
>>> f["model_weights"]["block1_conv1"]["block1_conv1/kernel:0"]
<HDF5 dataset "kernel:0": shape (3, 3, 3, 64), type "<f4">
# The list comprehension (weights) stores these two weights and bias of both the layers
>>>layer_names.index("block1_conv1")
1
>>> model.layers[1].set_weights(weights)
# This will set the weights for that particular layer.
With a for loop we can set_weights for the entire network.
"""
for i in layer_dict.keys():
    weight_names = f["model_weights"][i].attrs["weight_names"]
    weights = [f["model_weights"][i][j] for j in weight_names]
    index = layer_names.index(i)
    base_model.layers[index].set_weights(weights)
base_model.add(layers.Flatten())
base_model.add(layers.Dropout(0.5)) #Dropout for regularization
base_model.add(layers.Dense(256, activation='relu'))
base_model.add(layers.Dense(1, activation='sigmoid')) #Sigmoid function at the end because we have just two classes
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
base_model.compile(loss='binary_crossentropy',
                   optimizer=optimizers.Adam(lr=1e-4, decay=decay),
                   metrics=['accuracy'])
os.environ["CUDA_VISIBLE_DEVICES"]="0"
train_dir = '/home/d/Desktop/s/data/train'
eval_dir = '/home/d/Desktop/s/data/eval'
test_dir = '/home/d/Desktop/s/data/test'
# create a data generator
train_datagen = ImageDataGenerator(rescale=1./255,  # scale the image between 0 and 1
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)   # we do not augment validation data, only rescale
test_datagen = ImageDataGenerator(rescale=1./255)  # we do not augment test data, only rescale
# load and iterate training dataset
train_generator = train_datagen.flow_from_directory(train_dir, target_size=(224,224), class_mode='binary', batch_size=16, shuffle=True, seed=42)
# load and iterate validation dataset
val_generator = val_datagen.flow_from_directory(eval_dir, target_size=(224,224), class_mode='binary', batch_size=16, shuffle=True, seed=42)
# load and iterate test dataset
test_generator = test_datagen.flow_from_directory(test_dir, target_size=(224,224), class_mode=None, batch_size=1, shuffle=False, seed=42)
#The training part
#We train for 64 epochs with about 100 steps per epoch
history = base_model.fit_generator(train_generator,
                                   steps_per_epoch=train_generator.n // train_generator.batch_size,
                                   epochs=epochs,
                                   validation_data=val_generator,
                                   validation_steps=val_generator.n // val_generator.batch_size,
                                   callbacks=[earlyStopping, mcp_save, reduce_lr_loss])
#Save the model
#base_model.save_weights('/home/d/Desktop/s/base_model_weights.h5')
#base_model.save('/home/d/Desktop/s/base_model_keras.h5')
# let's plot the train and val curves
# get the details from the history object
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
#Train and validation accuracy
plt.plot(epochs, acc, 'b', label='Training accuracy')
plt.plot(epochs, val_acc, 'r', label='Validation accuracy')
plt.title('Training and Validation accuracy')
plt.legend()
plt.figure()
#Train and validation loss
plt.plot(epochs, loss, 'b', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and Validation loss')
plt.legend()
plt.show()
Take the VGG16 model directly:
from keras.applications import VGG16

vgg = VGG16(weights='imagenet', include_top=False)  # pick the parameters you need
for layer in vgg.layers:
    layer_weights_list = layer.get_weights()
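Building on that, here is a sketch of the layer-by-layer copy the question is after, matching layers by name instead of reading the HDF5 file by hand; it assumes the small model's layers are named exactly like their VGG16 counterparts, as in the question's base_model:

from keras.applications import VGG16

# reference model with pretrained ImageNet weights, convolutional part only
vgg = VGG16(weights='imagenet', include_top=False)
vgg_layers = {layer.name: layer for layer in vgg.layers}

# copy weights into every layer of base_model that shares a name with a VGG16 layer
for layer in base_model.layers:
    if layer.name in vgg_layers and layer.get_weights():
        layer.set_weights(vgg_layers[layer.name].get_weights())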

Memory usage of neural network, Keras

I am trying to develop a model for denoising images. I've been reading up on how to calculate memory usage of a neural network and the standard approach seems to be:
params_n = depth_n x (kernel_width x kernel_height) x depth_(n-1) + depth_n
By summing all the parameters in my network, I end up with 1,038,097, which is approximately 4.2 MB. It seems I have made a slight miscalculation in the last layer, since Keras ends up with 1,038,497 params; nevertheless, this is a small difference. 4.2 MB covers just the parameters, and I've seen somewhere that one should multiply by 3 to include backprop and the other needed calculations, which would approximate to 13 MB.
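As a sanity check, that formula reproduces Keras's count for conv2d_2 in the summary below (64 filters of 3x3 on a 1024-channel input):

# params_n = depth_n * (kernel_w * kernel_h) * depth_(n-1) + depth_n
params = 64 * (3 * 3) * 1024 + 64
print(params)  # 589888, matching conv2d_2 in the summary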
I have approximately 11 GB of GPU memory to work with, yet this model exhausts it. Where does all the extra needed memory come from? What am I missing? I know this post might be labeled as a duplicate, but none of the others seems to cover the topic I am asking about.
My model:
def network(self):
    weights = RandomUniform(minval=-0.05, maxval=0.05, seed=None)

    input_img = Input(shape=(self.img_rows, self.img_cols, self.channels))
    conv1 = Conv2D(1024, (3, 3), activation='tanh', kernel_initializer=weights,
                   padding='same', use_bias=True)(input_img)
    conv2 = Conv2D(64, (3, 3), activation='tanh', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv1)
    conv3 = Conv2D(64, (3, 3), activation='tanh', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv2)
    conv4 = Conv2D(64, (3, 3), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv3)
    conv5 = Conv2D(64, (7, 7), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv4)
    conv6 = Conv2D(64, (5, 5), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv5)
    conv7 = Conv2D(32, (5, 5), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv6)
    conv8 = Conv2D(32, (3, 3), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv7)
    # note: conv9 is defined but never used; 'decoded' is wired to conv8
    conv9 = Conv2D(16, (3, 3), activation='relu', kernel_initializer=weights,
                   padding='same', use_bias=True)(conv8)
    decoded = Conv2D(1, (5, 5), kernel_initializer=weights,
                     padding='same', activation='sigmoid', use_bias=True)(conv8)
    return input_img, decoded

def compiler(self):
    self.model.compile(optimizer='RMSprop', loss='mse')
    self.model.summary()
I assume my model is silly in a lot of ways and that there are multiple things to improve (dropout, other filter sizes and counts, optimizers, etc.), and all suggestions are gladly received, but the actual question still remains: why does this model consume so much memory? Is it due to the extremely high depth of conv1?
Model summary:
Using TensorFlow backend.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 1751, 480, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 1751, 480, 1024) 10240
_________________________________________________________________
conv2d_2 (Conv2D) (None, 1751, 480, 64) 589888
_________________________________________________________________
conv2d_3 (Conv2D) (None, 1751, 480, 64) 36928
_________________________________________________________________
conv2d_4 (Conv2D) (None, 1751, 480, 64) 36928
_________________________________________________________________
conv2d_5 (Conv2D) (None, 1751, 480, 64) 200768
_________________________________________________________________
conv2d_6 (Conv2D) (None, 1751, 480, 64) 102464
_________________________________________________________________
conv2d_7 (Conv2D) (None, 1751, 480, 32) 51232
_________________________________________________________________
conv2d_8 (Conv2D) (None, 1751, 480, 32) 9248
_________________________________________________________________
conv2d_10 (Conv2D) (None, 1751, 480, 1) 801
=================================================================
Total params: 1,038,497
Trainable params: 1,038,497
Non-trainable params: 0
_________________________________________________________________
You are correct, this is due to the number of filters in conv1. What you must compute is the memory required to store the activations:
As shown by your model.summary(), the output size of this layer is (None, 1751, 480, 1024). For a single image, this is a total of 1751*480*1024 pixels. As your image is likely in float32, each pixel takes 4 bytes to store. So the output of this layer requires 1751*480*1024*4 bytes, which is around 3.2 GB per image just for this layer.
If you were to change the number of filters to, say, 64, you would only need around 200 MB per image.
Either change the number of filters or change the batch size to 1.
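A back-of-the-envelope version of that calculation; it counts only the forward activations at float32 and ignores parameters, gradients, and framework overhead:

# bytes needed to hold one image's activations for a layer with output (H, W, C) in float32
def activation_bytes(h, w, c, bytes_per_value=4):
    return h * w * c * bytes_per_value

gib = 1024 ** 3
print(activation_bytes(1751, 480, 1024) / gib)  # ~3.2 GiB for conv1 as defined above
print(activation_bytes(1751, 480, 64) / gib)    # ~0.2 GiB with 64 filters instead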

How to fix expected dense_11 to have 4 dimensions error

I got the following error when I was building a convolutional neural network with Keras:
Error when checking target: expected dense_11 to have 4 dimensions,
but got array with shape (48986, 12)
Since I lack the background knowledge, I have no idea what to fix. Can someone explain the reason and also suggest a solution?
input_shape = (99, 81, 1)
nclass = 12
model = Sequential()
model.add(Dense(32, input_shape=input_shape))
model.add(Convolution2D(8,3,3,activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
model.add(Dense(128, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(nclass, activation='softmax'))
x_train, x_valid, y_train, y_valid = train_test_split(x_train, y_train, test_size=0.1, random_state=2017)
#vgg
batch_size = 128
nb_epoch = 1
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
#model.fit(x_train,y_train,nb_epoch= nb_epoch,batch_size = batch_size , validation_split=0.1)
model.fit(x_train, y_train, batch_size=16, validation_data=(x_valid, y_valid), epochs=3, shuffle=True, verbose=2)
model.save(os.path.join(model_path, 'vgg16.model'))
Each x_train sample has a shape of (99, 81, 1), and the nclass output should be 12.
Look at the error again:
"Error when checking target: expected dense_11 to have 4 dimensions, but got array with shape (48986, 12)" - target=labels/output
Meaning, there is some kind of problem with your output shape.
Lets print the model summary to check what is the expected output shape:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 99, 81, 32) 64
_________________________________________________________________
conv2d_1 (Conv2D) (None, 97, 79, 8) 2312
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 48, 39, 8) 0
_________________________________________________________________
dense_2 (Dense) (None, 48, 39, 128) 1152
_________________________________________________________________
dense_3 (Dense) (None, 48, 39, 128) 16512
_________________________________________________________________
dense_4 (Dense) (None, 48, 39, 12) 1548
=================================================================
Total params: 21,588
Trainable params: 21,588
Non-trainable params: 0
_________________________________________________________________
The final layer outputs predictions with shape (None, 48, 39, 12).
You can see this happening because the Dense layer gets input with shape (None, 48, 39, 8), and in the Keras implementation a Dense layer is applied on top of the last dimension only; meaning, a Dense layer with 128 nodes that gets input with shape (None, 48, 39, 8) will output (None, 48, 39, 128).
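You can verify that behavior in two lines (a sketch using tf.keras):

from tensorflow.keras import Input, layers

x = Input(shape=(48, 39, 8))
y = layers.Dense(128)(x)
print(y.shape)  # (None, 48, 39, 128): Dense acts on the last axis only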
The solution depends on what you want to do and on the shape of your labels (what the output should be).
For example, if the output shape of your model should be (nclass, 1), then you can Flatten the data after the MaxPool layer.
If it should be something else, then change your labels' shape to (None, 48, 39, 12).
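A sketch of the first option, flattening after the pooling layer so the model ends in (None, 12) and matches the (48986, 12) labels (the old-style Convolution2D call is rewritten as Conv2D here):

from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten

input_shape = (99, 81, 1)
nclass = 12

model = Sequential()
model.add(Dense(32, input_shape=input_shape))    # (None, 99, 81, 32)
model.add(Conv2D(8, (3, 3), activation='relu'))  # (None, 97, 79, 8)
model.add(MaxPooling2D((2, 2), strides=(2, 2)))  # (None, 48, 39, 8)
model.add(Flatten())                             # (None, 14976): collapses to 2D
model.add(Dense(128, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(nclass, activation='softmax'))   # (None, 12), matching the labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])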
Good luck :)
