I have a simple single-layer CNN network that I have trained. I would now like to use this model without the top in order to visualize the CNN filters, as illustrated in the Keras.io blog post here.
This is my simple model:
model = Sequential()
model.add(Conv2D(24, (5, 5), padding='valid',input_shape=(1, 17,17), activation='relu',name='conv2d_1'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(8, activation='relu',name='dens_1'))
model.add(Dense(2, activation='softmax',name='dens_2'))
I would like to load the model using load_model, and then create a topless model with a generic input size (1, None,None), such as the model illustrated below. Is this possible? I prefer to not have to explicitly re-define all layers in the model.
model_no_top = Sequential()
model_no_top.add(Conv2D(24, (5, 5), padding='valid',input_shape=(1, None, None), activation='relu',name='conv2d_1'))
model_no_top.add(MaxPooling2D(pool_size=(2, 2)))
model_no_top.add(Dropout(0.5))
I've loaded the model and grabbed the layers that I want, but the input_shape is the same as the original model.
model = load_model(model_path)
model_no_top = Sequential()
for layer in model.layers:
if 'flatten' not in layer.name and 'dens' not in layer.name:
model_no_top.add(layer)
Related
I'm trying to construct an encoder to get the latent space in order to plot it. I don't really know if I can get it from the RepeatVector or if I have to add a Dense layer.
Here is my code:
model = Sequential()
model.add(LSTM(16, activation='relu', return_sequences=True, input_shape= (x_train.shape[1], 1)))
model.add(LSTM(4, activation='relu', return_sequences=False)) #Encoder
model.add(RepeatVector(X_train.shape[1])) #Latent
model.add(LSTM(4, activation='relu', return_sequences=True)) #Decoder
model.add(LSTM(16, activation='relu', return_sequences=False)) #Decoder
model.add(TimeDistributed(Dense(X_train.shape[2]))) #Decoder
You need to separate the model into two parts (encoder and decoder).
Then, build the encoder using the input and output of the encoder part.
btw the output would be the last layer before using RepeatVector.
encoder = Model(inputs, output_from_encoder)
I have a pre-trained sequential CNN model which I trained on images of 224x224x3. The following is the architecture:
model = Sequential()
model.add(Conv2D(filters = 64, kernel_size = (5, 5), strides = 1, activation = 'relu', input_shape = (224, 224, 3)))
model.add(MaxPool2D(pool_size = (3, 3)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 128, kernel_size = (3, 3), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters = 256, kernel_size = (2, 2), strides = 1, activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation = 'relu', use_bias=False))
model.add(Dense(num_classes, activation = 'softmax'))
model.summary()
For reference, here is the model summary: model summary
I want to retrain this model on images of size 40x40x3. However, I am facing the following error: "ValueError: Input 0 of layer dense_12 is incompatible with the layer: expected axis -1 of input shape to have value 200704 but received input with shape (None, 256)".
What should I do to resolve this error?
Note: I am using Tensorflow version 2.4.1
The problem is, in your pre-trained model you have a flattened shape of 200704 as input shape (line no 4 from last), and then the output size is 128 for the dense layer (line 3 from the last). And now you wanna use the same pre-trained model for the image of 40X40, it will not work. The reasons are :
1- Your model is input image shape-dependent. it's not an end-to-end conv model, as you use dense layers in between, which makes the model input image size-dependent.
2- The flatten size of the 40x40 image after all the conv layers are 256, not 200704.
Solution
1- Either you change the flatten part with adaptive average pooling layer and then your last dense layer with softmax is fine. And again retrain your old model on 224x224 images. Following that you can train on your 40x40 images.
2- Or the easiest way is to just use a subset of your pre-trained model till the flatten part (exclude the flatten part) and then add a flatten part with dense layer and classification layer (layer with softmax). For this method you have to write a custom model, like here, just the first part will be the subset of the pre-trained model, and flatten and classification part will be additional. And then you can train the whole model over the new dataset. You can also take the benefit of transfer-learning using this method, by allowing the backward gradient to flow only through the newly created linear layer and not through the pre-trained layers.
I have a code below which implements an architecture (in grid search), to yield appropriate parameters for input, nodes, epochs, batch size and differenced time series input.
The challenge I have is to convert the neural network from just having one LSTM hidden layer, to multiple LSTM hidden layers.
At the moment, I could only run the code with Dense-type hidden layers, without having any errors thrown, otherwise I get dimension errors, tuple errors and so on.
The problem is only persistent in the neural network architecture section.
Original code that works:
def model_fit(train, config):
# unpack config
n_input, n_nodes, n_epochs, n_batch, n_diff = config
# Data
if n_diff > 0:
train = difference(train, n_diff)
# Time series to supervised format
data = series_to_supervised(train, n_in=n_input)
train_x, train_y = data[:, :-1], data[:, -1]
# Reshaping input data into [samples, timesteps, features]
n_features = 1
train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(1))
# Compile model (Grid search architecture)
model.compile(loss='mse', optimizer='adam')
# fit model
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
return model
Modified LSTM-hidden layer code, that fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone with any suggestion please help me ?
Try to set return_sequences=False at the last layer.
Model is producing only one output for all testing images, though the test set has all possible classes.
I've already tried using different optimizers and loss functions.
model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(64, (3, 3), input_shape=x.shape[1:], activation='relu'),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=10, batch_size=32)
Expected result: Predict parasitized or uninfected sample for the given image.
Actual result: Predicting always the same class. Either all images as parasitized or all images as uninfected.
tf.keras.layers.Dense(1, activation='sigmoid')
You can also do it in another way by treating like a categorical problem so your code should be like this tf.keras.layers.Dense(2, activation=' softmax') and also loss should be categorical cross-entropy.
Problem: Your model seems to be a much simpler one which is unable to generalize the data to make a prediction, instead use a pre-trained model and use the feature extraction and finetune method if you have less data. hope this will help you good luck:
I want to train my model on video data for gesture recognition, proposed using LSTM's and TimeDistributed layers. Would this be ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data LSTM networks are generally the right choice. If you want to analyze video then a combination with 2d convolutions sounds reasonable to me. However, you have to apply TimeDistributed on all layers which dont expect sequence data. In your example that means all laysers expect LSTM.
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense Layer can stay this way because the final lstm doesnt output a sequence.