Remove last layer from PyTorch x3d_l model

I want to remove the last softmax layer from the PyTorch x3d_l model.
This is what I tried:
model = torch.hub.load('facebookresearch/pytorchvideo', 'x3d_l', pretrained=True)
model.blocks[5] = nn.Sequential(*list(model.blocks[5].children())[:-2])
The shape of my input is torch.Size([2, 3, 16, 320, 320]).
When I pass it to the model, I get this error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x1 and 2048x400)
The last block looks like this: (module printout omitted)
After removing the last two layers, it looks like this: (module printout omitted)
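For reference, here is a sketch of one way around this. It assumes blocks[5] follows pytorchvideo's ResNetBasicHead layout and stores the softmax in an activation attribute; rebuilding the head as an nn.Sequential drops the head's custom forward (including the permute it applies before the linear projection), which is what causes the shape mismatch above.
import torch
import torch.nn as nn

model = torch.hub.load('facebookresearch/pytorchvideo', 'x3d_l', pretrained=True)

# Replace only the softmax with a no-op instead of rebuilding the whole head,
# so the head's forward (pooling, permute, projection) stays intact.
model.blocks[5].activation = nn.Identity()

x = torch.randn(2, 3, 16, 320, 320)  # (batch, channels, frames, height, width)
with torch.no_grad():
    out = model(x)  # raw class logits, expected shape (2, 400)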

Related

ValueError Input 0 of layer sequential_13 is incompatible with the layer: expected ndim=3, found ndim=4 Full shape received: (None, None, None, None)

I am trying to use a simple RNN to predict Parkinson's gait using the PhysioNet database. I am feeding the RNN images of height 240 and width 16 pixels. I am also using ModelCheckpoint, monitoring validation accuracy, to save the best weights. While setting the input shape for the RNN I get this error:
ValueError: Input 0 of layer sequential_13 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, None, None, None)
RNN model:
model = Sequential()
model.add(SimpleRNN(24, kernel_initializer='glorot_uniform', input_shape=(64,240), return_sequences = True))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(2))
model.add(Activation('softmax'))
opt = optimizers.RMSprop(learning_rate=0.001, decay=1e-6)
epoch=10
early_stopping = EarlyStopping(monitor='val_accuracy', patience=60, verbose=1, mode='auto')
checkpoint = ModelCheckpoint("model_parkinsons.h5",
                             monitor='val_accuracy', verbose=0, save_best_only=True,
                             save_weights_only=False, mode='auto', save_freq='epoch')
model.compile(loss='binary_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
Batch size: 64
Height of the image: 240
a.shape
Output: (64, 16, 240, 1)
I tried to feed the input shape as a.shape[1:]
But I get an error saying it expected 3 dimensions but got 4.
Please help me resolve this.
In your first layer, you specified the input shape of your network. This shape does not include the batch size. So if you specify input_shape=(64, 240), your final input needs to have the shape (batch_size, 64, 240). Since 64 is actually your batch size, something clearly went wrong there. In addition, your input has four dimensions, (64, 16, 240, 1), but your first layer takes three-dimensional inputs.
I do not quite understand what you want to achieve with your model, but it should work if you feed a[:, :, :, 0] into the model instead of a, and set input_shape=(16, 240) in the first layer. With these two changes, the model uses the RNN to process one column of the image at a time. That approach does not make much sense to me (RNNs are not normally used for image processing in this form), but I do not see any other way to interpret what you already did.
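A minimal sketch of those two changes, reusing the layers from the question and assuming a is the (64, 16, 240, 1) array described above (labels is a hypothetical target array):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Activation, Dropout, Flatten, Dense

model = Sequential()
# input_shape excludes the batch dimension: 16 "timesteps" of 240 features each
model.add(SimpleRNN(24, kernel_initializer='glorot_uniform',
                    input_shape=(16, 240), return_sequences=True))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

# Drop the trailing channel axis so each batch has shape (batch_size, 16, 240)
model.fit(a[:, :, :, 0], labels, batch_size=64, epochs=10)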

ValueError: Input 0 of layer simple_rnn_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 50]

I am new to TensorFlow and I am trying to build a multivariate (two features per time step), multi-step (forecast 12 time steps into the future) forecasting model.
I created a TensorFlow dataset to feed to my model.
When I print the shape of my dataset, I see the following:
<PrefetchDataset shapes: ((None, None, 2), (None, None)), types: (tf.float32, tf.float32)>
This is how I understand it:
(None, None, 2) = input tensor "features": (batch size, input time steps, features per time step)
(None, None) = output tensor "label": (batch size, forecast time steps)
I then create my model as follows:
keras.backend.clear_session()
tf.random.set_seed(42)
np.random.seed(42)
model = keras.models.Sequential([
    keras.layers.SimpleRNN(50),
    keras.layers.SimpleRNN(100),
    keras.layers.Dense(12),
])
optimizer = keras.optimizers.SGD(lr=1.5e-6, momentum=0.9)
model.compile(loss="mae",
              optimizer=optimizer,
              metrics=["mae"])
When I fit the model with
model.fit(train_set, epochs=5,
          validation_data=valid_set)
I get the following error:
ValueError: Input 0 of layer simple_rnn_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 50]
I understand that a SimpleRNN layer expects a three-dimensional tensor, but I think my input already has that shape.
Thanks a lot for the help.
If you need me to share how I am creating my dataset, I would gladly do it.
The issue was coming from the second layer, not the first one. The first layer outputs a single vector rather than a sequence, so its output has rank 2, e.g. (a, b), but the second layer requires a three-dimensional input. To solve this I added return_sequences=True to the first RNN layer.
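A sketch of that fix, reusing the model from the question (learning_rate is the newer name for the lr argument):
import numpy as np
import tensorflow as tf
from tensorflow import keras

keras.backend.clear_session()
tf.random.set_seed(42)
np.random.seed(42)

model = keras.models.Sequential([
    # return_sequences=True makes the first RNN emit the full sequence,
    # a 3-D tensor (batch, time steps, 50) that the second RNN can consume.
    keras.layers.SimpleRNN(50, return_sequences=True),
    keras.layers.SimpleRNN(100),
    keras.layers.Dense(12),
])

optimizer = keras.optimizers.SGD(learning_rate=1.5e-6, momentum=0.9)
model.compile(loss="mae", optimizer=optimizer, metrics=["mae"])
model.fit(train_set, epochs=5, validation_data=valid_set)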
If train_set is a numpy array, pass train_set.reshape((1,50)) to model.fit()
model.fit(train_set.reshape((1, 50)), epochs=5,
          validation_data=valid_set)
Then you wouldn't need to apply return_sequences=True to the first RNN cell either.

How to solve "logits and labels must have the same first dimension" error

I'm trying out different neural network architectures for word-based NLP.
So far I've used bidirectional models, embedding layers, and models with GRUs, guided by this tutorial: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571 and it all worked out well.
When I try using LSTMs, however, I get an error saying:
logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]
How can I solve this?
My source and target datasets consist of 7200 sample sentences. They are integer-tokenized and embedded. The source dataset is post-padded to match the length of the target dataset.
Here is my model and the relevant code:
lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1], input_shape=X.shape[1:]))
lstm_model.add(LSTM(128, return_sequences=False, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add((Dense(target_vocab_size, activation='softmax')))
lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = lstm_model.fit(X, Y, batch_size = 32, callbacks=CALLBACK, epochs = 100, validation_split = 0.25) #At this line the error is raised!
With the shapes:
X.shape = (7200, 147)
Y.shape = (7200, 147, 1)
src_vocab_size = 188
target_vocab_size = 186
I've looked at similar questions on here already and tried adding a Reshape layer
simple_lstm_model.add(Reshape((-1,)))
but this only causes the following error:
"TypeError: __int__ returned non-int (type NoneType)"
It's really strange, since I preprocess the dataset the same way for all models and it works just fine for every model except the one above.
You should pass return_sequences=True and return_state=False when calling the LSTM constructor.
In your snippet, the LSTM only returns its last output instead of the sequence of outputs for every input embedding. In theory, you could have spotted it from the error message:
logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]
The logits should be three-dimensional: batch size × sequence length × number of classes. The sequence length is 147, and indeed 32 × 147 = 4704 (the number of your labels). This tells you that the sequence-length dimension has disappeared.
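A sketch of the corrected model, reusing the variables from the question (X, Y, src_vocab_size, target_vocab_size, CALLBACK) and keeping the original layer sizes. With return_sequences=True, the Dense layers are applied per time step, so the logits come out as (batch, 147, target_vocab_size) and line up with the (batch, 147, 1) labels under sparse_categorical_crossentropy:
lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1]))
# return_sequences=True keeps one output per token instead of only the last one
lstm_model.add(LSTM(128, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))
lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
history = lstm_model.fit(X, Y, batch_size=32, callbacks=CALLBACK, epochs=100,
                         validation_split=0.25)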

How to set Weights and Gradient Weights in a layer of non-Sequential() Keras model

I have some pre-trained weights (both for layer and gradient) as Numpy Arrays, and I need to set them in a network I recreated.
Example of part of my network:
X_input = Input((4,256,256))
# batchSize is 4
# size so far: (batchSize,4,256,256)
X = Conv2D(96,(11,11), strides=(4,4), data_format = 'channels_first')(X_input)
# output of the convolution has size: (batchSize, 96, 62, 62)
X = BatchNormalization(axis = 1)(X)
X = Activation('relu')(X)
X = MaxPooling2D((3, 3), strides=(2, 2), data_format='channels_first')(X)
The np.array of weights I should set in the Conv2D layer has shape: (96, 4, 11, 11)
I can actually call the set_weights() function as with a Sequential() model like:
model.get_layer('layerName').set_weights(myNpArrayWeights)
But if I do so, this gives the error:
ValueError: You called `set_weights(weights)` on layer "step2_conv1" with a
weight list of length 96, but the layer was expecting 2 weights.
Provided weights: [[[[ 3.87499551e-03 1.32818555e-03 2.97062146e-0...
as if the shape were incorrect?
So I tried passing 2 test weights using np.array([1, 2]).
This is the error reported:
ValueError: Fetch argument <tf.Variable 'step2_conv1_4/kernel:0'
shape=(11, 11, 4, 96) dtype=float32_ref> cannot be interpreted as a Tensor.
(Tensor Tensor("step2_conv1_4/kernel:0", shape=(11, 11, 4, 96), dtype=float32_ref)
is not an element of this graph.)
How do I solve this?
How can I also set the weights for the gradient?
Python version: 3.6.5
Keras version: 2.2.4
Tensorflow version: 1.13.1
EDIT
For the first ValueError:
In the Conv2D layer, set use_bias=False so that it expects only one array of weights; if use_bias is True, the layer expects an additional weights array for the bias.
For the second ValueError:
Before instantiating the model, it is necessary to clear the session, because you might have built the model multiple times (as I did) and TensorFlow apparently gets confused by the multiple graphs present.
To clear the session, run:
keras.backend.clear_session()
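A sketch of the weight assignment under those assumptions, using the (96, 4, 11, 11) array from the question and the layer name from the error message; the transpose assumes the array is stored as (out_channels, in_channels, kernel_h, kernel_w), and the zero bias is just a placeholder if no pre-trained bias is available:
import numpy as np
import keras

keras.backend.clear_session()  # avoid "not an element of this graph" errors
# ... rebuild the functional model exactly as above ...

layer = model.get_layer('step2_conv1')

# Keras stores Conv2D kernels as (kernel_h, kernel_w, in_channels, out_channels),
# so the (96, 4, 11, 11) array must be transposed to (11, 11, 4, 96).
kernel = np.transpose(myNpArrayWeights, (2, 3, 1, 0))

# With use_bias=True the layer expects [kernel, bias]; with use_bias=False, only [kernel].
layer.set_weights([kernel, np.zeros(96, dtype=np.float32)])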

Wrong input shape to neural network layer

I am trying to classify handwritten digits using the MNIST dataset to train my model. My model trained successfully and hit an accuracy of 98.9%. But when I try to input a custom image, it shows me the following error:
Error when checking : expected conv2d_4_input to have shape (None, 28, 28, 1) but got array with shape (1, 1, 28, 28)
conv2d_4 is the first convolutional layer, i.e. the input layer.
What can I do to resolve this issue?
This is my Convolutional Neural Network :
conv_model = Sequential()
conv_model.add(Conv2D(filters, kernel_size[0], input_shape=(28, 28, 1)))
conv_model.add(Activation(act))
conv_model.add(Conv2D(filters, kernel_size[0]))
conv_model.add(Activation(act))
conv_model.add(MaxPool2D(pool_size=(2,2)))
conv_model.add(Dropout(0.25))
conv_model.add(Flatten())
conv_model.add(Dense(128))
conv_model.add(Activation(act))
conv_model.add(Dropout(0.5))
conv_model.add(Dense(10))
conv_model.add(Activation('softmax'))
#conv_model.summary()
Compilation Details :
conv_model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
COMPLETE SOURCE CODE:
https://github.com/tanmay-edgelord/HandwrittenDigitRecognition
The image: (screenshot omitted)
If any further details are required please comment.
The error message is pretty straightforward:
Your first layer expects data with shape (None, 28, 28, 1), where None can be any number (it's the batch size, i.e. how many examples you have).
Your data, on the other hand, has shape (1, 1, 28, 28).
The confusion is a common one: Keras puts the channels in the last dimension, while your data has the channels in the first.
Solution:
Just reshape your data into the correct format: (1, 28, 28, 1).
But are you trying to give that entire image to the model? If so, it won't work very well; the model expects images of 28 x 28 pixels.
You will have to separate each digit into its own 28 x 28 image. You must also take into account the possibility of your image being inverted in terms of what is black and what is white: MNIST data usually has a black background (0 values) with a white digit (1 values).
The problem was solved by passing the image array to reshape with the correct input size:
roi2 = roi.reshape(1, 28, 28, 1)
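If the custom image has a white background with a dark digit (the opposite of MNIST), it may also need to be inverted and rescaled before prediction. A hedged sketch, assuming roi is a grayscale uint8 crop of shape (28, 28) and conv_model is the trained model from the question:
import numpy as np

roi = 255 - roi                      # invert so the background is black, like MNIST
roi = roi.astype('float32') / 255.0  # scale pixel values to [0, 1]
roi2 = roi.reshape(1, 28, 28, 1)     # (batch, height, width, channels)
prediction = conv_model.predict(roi2)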
