I am trying to build a 1D CNN but I can't get the right dimensions passed to my last dense layer
The architecture of my model is
model_CNN=Sequential()
model_CNN.add(Conv1D(14, 29, activation='relu', input_shape=(X_train.shape[1], 1)))
model_CNN.add(Conv1D(30, 22, activation='relu'))
model_CNN.add(Flatten())
model_CNN.add(Dense(176,activation='relu'))
model_CNN.add(Dense(Y_train.shape[1],activation='linear'))
With a summary that looks like
Layer (type) Output Shape Param #
=================================================================
conv1d_71 (Conv1D) (None, 3304, 14) 420
_________________________________________________________________
conv1d_72 (Conv1D) (None, 3283, 30) 9270
_________________________________________________________________
flatten_18 (Flatten) (None, 98490) 0
_________________________________________________________________
dense_102 (Dense) (None, 176) 17334416
_________________________________________________________________
dense_103 (Dense) (None, 5) 885
=================================================================
Total params: 17,344,991
Trainable params: 17,344,991
Non-trainable params: 0
When I try to fit my model, I confirm that my input shape is correct (240, 3332, 1), but then I get the following error
ValueError: Error when checking target: expected dense_103
to have 2 dimensions, but got array with shape (240, 5, 1)
So my flatten function is not creating a 1D array, but also somehow the input only fails on the second dense layer, not the first. What's going on?
Related
I trained and load a cnn+dense model:
# load model
cnn_model = load_model('my_cnn_model.h5')
cnn_model.summary()
The output is this (I have images dimension 2 X 3600):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 2, 3600, 32) 128
_________________________________________________________________
conv2d_2 (Conv2D) (None, 2, 1800, 32) 3104
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 2, 600, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 2, 600, 64) 6208
_________________________________________________________________
conv2d_4 (Conv2D) (None, 2, 300, 64) 12352
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 2, 100, 64) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 2, 100, 128) 24704
_________________________________________________________________
conv2d_6 (Conv2D) (None, 2, 50, 128) 49280
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 16, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 4096) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 4195328
_________________________________________________________________
dense_2 (Dense) (None, 1024) 1049600
_________________________________________________________________
dense_3 (Dense) (None, 3) 3075
=================================================================
Total params: 5,343,779
Trainable params: 5,343,779
Non-trainable params: 0
Now, what I want is to leave weights up to flatten and replace dense layers with LSTM to train the added LSTM part.
I just wrote:
# freeze model
base_model = cnn_model(input_shape=(2, 3600, 1))
#base_model = cnn_model
base_model.trainable = False
# Adding the first lstm layer
x = LSTM(1024,activation='relu',return_sequences='True')(base_model.output)
# Adding the second lstm layer
x = LSTM(1024, activation='relu',return_sequences='False')(x)
# Adding the output
output = Dense(3,activation='linear')(x)
# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])
But I obtained:
base_model = cnn_model(input_shape=(2, 3600, 1))
TypeError: __call__() missing 1 required positional argument: 'inputs'
I know I have to add TimeDistributed ideally in the Flatten layer, but I do not know how to do.
Moreover I'm not sure about base_model.trainable = False if it do exactly what I want.
Can you please help me to do the job?
Thank you very much!
You can't directly take the output from Flatten(), LSTM needs 2-d features (time, filters). You have to reshape your tensors.
You can take the output from the layer before flatten (max-pooling), let's say this layer has index i in the model, we can take the output from that layer and reshape it based on our needs and pass it to LSTM.
before_flatten = base_model.layers[i].output # i is the index of the layer from which you want to take the model output
conv2lstm_reshape = Reshape((-1, 2))(before_flatten) # you have to select it, the temporal dim and filters
# Adding the first lstm layer
x = LSTM(1024,activation='relu',return_sequences='True')(conv2lstm_reshape)
# Adding the second lstm layer
x = LSTM(1024, activation='relu',return_sequences='False')(x)
# Adding the output
output = Dense(3,activation='linear')(before_flatten)
# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])
model.summary()
I have a custom network of a Keras Xception base with added regression head:
pretrained_model = tf.keras.applications.Xception(input_shape=[244, 244, 3], include_top=False, weights='imagenet')
pretrained_model.trainable = True
model = tf.keras.Sequential([
pretrained_model,
tf.keras.layers.GlobalAveragePooling2D(),
tf.keras.layers.Dropout(0.5),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dropout(0.5),
tf.keras.layers.Dense(1, activation='tanh')
])
The model summary:
Layer (type) Output Shape Param #
=================================================================
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
global_average_pooling2d_3 ( (None, 2048) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 2048) 0
_________________________________________________________________
dense_6 (Dense) (None, 32) 65568
_________________________________________________________________
dropout_5 (Dropout) (None, 32) 0
_________________________________________________________________
dense_7 (Dense) (None, 1) 33
=================================================================
Total params: 20,927,081
Trainable params: 20,872,553
Non-trainable params: 54,528
I want to get the last activations from the xception(model) layer.
The details of xception:
Model: "xception"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
block1_conv1 (Conv2D) (None, 111, 111, 32) 864 input_4[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
block14_sepconv2 (SeparableConv (None, 7, 7, 2048) 3159552 block14_sepconv1_act[0][0]
__________________________________________________________________________________________________
block14_sepconv2_bn (BatchNorma (None, 7, 7, 2048) 8192 block14_sepconv2[0][0]
__________________________________________________________________________________________________
block14_sepconv2_act (Activatio (None, 7, 7, 2048) 0 block14_sepconv2_bn[0][0]
==================================================================================================
Total params: 20,861,480
Trainable params: 20,806,952
Non-trainable params: 54,528
To reference the last activation layer I have to use :
model.layers[0].get_layer('block14_sepconv2_act').output
since explicitly my 'model' does not contain the 'block14_sepconv2_act' layer.
To access activation I want to use the code below:
activations = tf.keras.Model(model.inputs,model.layers[0].get_layer('block14_sepconv2_act').output)
activations(sample)
but I get the error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_4_1:0", shape=(None, 224, 224, 3), dtype=float32) at layer "input_4". The following previous layers were accessed without issue: []
My question is how can I access the intermediate layer outputs of a pretrained model if added in this way to the custom model?
I am trying to create a CNN with tensorflow, my images are 64x64x1 images and I have an array of 3662 images which I am using for training. I have total 5 labels which I have one-hot encoded. I am getting this error everytime:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [3662,5] and labels shape [18310]
[[{{node loss_2/dense_5_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
my neural network structure is this:
def cnn_model():
model = models.Sequential()
# model.add(layers.Dense(128, activation='relu', ))
model.add(layers.Conv2D(128, (3, 3), activation='relu',input_shape=(64, 64, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu',padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(5, activation='softmax'))
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
print(model.summary())
return model
My model summary is this:
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_9 (Conv2D) (None, 62, 62, 128) 1280
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 31, 31, 128) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 31, 31, 64) 73792
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 15, 15, 64) 0
_________________________________________________________________
conv2d_11 (Conv2D) (None, 15, 15, 64) 36928
_________________________________________________________________
dense_4 (Dense) (None, 15, 15, 64) 4160
_________________________________________________________________
flatten_2 (Flatten) (None, 14400) 0
_________________________________________________________________
dense_5 (Dense) (None, 5) 72005
=================================================================
Total params: 188,165
Trainable params: 188,165
Non-trainable params: 0
my output array is of the shape (3662,5,1). I have seen other answers to same questions but I can't figure out the problem with mine. Where am I wrong?
Edit: My labels are stored in one hot encoded form using these:
df = pd.get_dummies(df)
diag = np.array(df)
diag = np.reshape(diag,(3662,5,1))
I have tried as numpy array and after converting to tensor(same for input as per documentation)
The problem lines within the choice of the loss function tf.keras.losses.SparseCategoricalCrossentropy(). According to what you are trying to achieve you should use tf.keras.losses.CategoricalCrossentropy(). Namely, the documentation of tf.keras.losses.SparseCategoricalCrossentropy() states:
Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers.
On the other hand, the documentation of tf.keras.losses.CategoricalCrossentropy() states:
We expect labels to be provided in a one_hot representation.
And because your labels are encoded as one-hot, you should use tf.keras.losses.CategoricalCrossentropy().
model.fit produces exception:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot update variable with shape [] using a Tensor with shape [32], shapes must be equal.
[[{{node metrics/accuracy/AssignAddVariableOp}}]]
[[loss/dense_loss/categorical_crossentropy/weighted_loss/broadcast_weights/assert_broadcastable/AssertGuard/pivot_f/_50/_63]] [Op:__inference_keras_scratch_graph_1408]
Model definition:
model = tf.keras.Sequential()
model.add(tf.keras.layers.InputLayer(
input_shape=(360, 7)
))
model.add(tf.keras.layers.Conv1D(32, 1, activation='relu', input_shape=(360, 7)))
model.add(tf.keras.layers.Conv1D(32, 1, activation='relu'))
model.add(tf.keras.layers.MaxPooling1D(3))
model.add(tf.keras.layers.Conv1D(512, 1, activation='relu'))
model.add(tf.keras.layers.Conv1D(1048, 1, activation='relu'))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(32, activation='softmax'))
Input Features Shape
(105, 360, 7)
Input Labels Shape
(105, 32, 1)
Compile statement
model.compile(optimizer='adam',
loss=tf.keras.losses.CategoricalCrossentropy(),
metrics=['accuracy'])
Model.fit statement
model.fit(features,
labels,
epochs=50000,
validation_split=0.2,
verbose=1)
Any help would be much appreciated
You can use model.summary() to see your model architecture.
print(model.summary())
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d (Conv1D) (None, 360, 32) 256
_________________________________________________________________
conv1d_1 (Conv1D) (None, 360, 32) 1056
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 120, 32) 0
_________________________________________________________________
conv1d_2 (Conv1D) (None, 120, 512) 16896
_________________________________________________________________
conv1d_3 (Conv1D) (None, 120, 1048) 537624
_________________________________________________________________
global_average_pooling1d (Gl (None, 1048) 0
_________________________________________________________________
dropout (Dropout) (None, 1048) 0
_________________________________________________________________
dense (Dense) (None, 32) 33568
=================================================================
Total params: 589,400
Trainable params: 589,400
Non-trainable params: 0
_________________________________________________________________
None
The shape of your output layer is required to be (None,32), but the shape of your labels is (105,32,1). So you need to change the shape to (105,32). np.squeeze() function is used when we want to remove single-dimensional entries from the shape of an array.
Use Flatten() before the Dense Layers.
I have a similar problem to Keras replacing input layer, however I need to remove also the next layer, and that will require different input shape.
Here is a simplification of what I'm trying to do:
a = Input(shape=(64,))
b = Dense(32)(a)
c = Dense(16)(b)
d = Dense(8)(c)
model = Model(inputs=a, outputs=d)
print(model.summary())
print('input shape = ' + str(model.input_shape))
model.layers.pop(0)
model.layers.pop(0)
print(model.summary())
print('input shape = ' + str(model.input_shape))
new_input = Input(shape=(32,))
new_output = model(new_input)
new_model = Model(new_input, new_output)
print(new_model.summary())
But the input shape of the model remains the same:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 64) 0
_________________________________________________________________
dense_1 (Dense) (None, 32) 2080
_________________________________________________________________
dense_2 (Dense) (None, 16) 528
_________________________________________________________________
dense_3 (Dense) (None, 8) 136
=================================================================
Total params: 2,744
Trainable params: 2,744
Non-trainable params: 0
_________________________________________________________________
None
input shape = (None, 64)
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_2 (Dense) (None, 16) 528
_________________________________________________________________
dense_3 (Dense) (None, 8) 136
=================================================================
Total params: 664
Trainable params: 664
Non-trainable params: 0
_________________________________________________________________
None
input shape = (None, 64)
And that prevents me from creating new model, so the code above fails with:
ValueError: Dimensions must be equal, but are 32 and 64 for 'model_1/dense_1/MatMul' (op: 'MatMul') with input shapes: [?,32], [64,32].
Any ideas how to do that?
It might not be possible to do in the way that you describe. The accepted answer on this post explains it a little.
how-to-change-input-shape-in-sequential-model-in-keras?
Their solution was to rebuild the layer with the correct input shape, then load the pre-trained weights for that specific layer.