When I try to connect an LSTM to a Dense layer, I get an error at training time:
input = Input(shape=(x_train.shape[1], None))
X = Embedding(num_words, max_article_len)(input)
X = LSTM(128, return_sequences=True, dropout = 0.5)(X)
X = LSTM(128)(X)
X = Dense(32, activation='softmax')(X)
model = Model(inputs=[input], outputs=[X])
...
>>> ValueError: Error when checking target: expected dense to have shape (32,) but got array with shape (1,)
I tried different connection options, but the error persists:
X, h, c = LSTM(128, return_sequences=False, return_state=True, dropout = 0.5)(X)
X = Dense(32, activation='softmax')(X)
>>> ValueError: Error when checking target: expected dense to have shape (32,) but got array with shape (1,)
Are there any solutions using the functional API or the Sequential model?
Data conversion code:
train = pd.read_csv('train.csv')
articles = train['text']
y_train = train['lang']
num_words = 50000
max_article_len = 20
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(articles)
sequences = tokenizer.texts_to_sequences(articles)
x_train = pad_sequences(sequences, maxlen=max_article_len, padding='post')
x_train.shape
>>> (18974, 100)
y_train.shape
>>> (18974,)
return_sequences must be set to False in the last LSTM layer:
X = LSTM(128, return_sequences=True, dropout = 0.5)(X)
X = LSTM(128, return_sequences=False)(X)
If you still have issues after that, the problem is most likely with your input shape.
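For context, here is a minimal end-to-end sketch of that suggestion, reusing num_words, max_article_len and the 32-way softmax from the question; the embedding size, the optimizer, and the one-hot encoding of the labels are assumptions (and, as in the related question below, the labels must be integer class ids for to_categorical to work):
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense
from keras.utils import to_categorical

inp = Input(shape=(max_article_len,))                 # padded integer sequences
x = Embedding(num_words, 128)(inp)                    # 128 is an assumed embedding size
x = LSTM(128, return_sequences=True, dropout=0.5)(x)
x = LSTM(128, return_sequences=False)(x)              # last LSTM returns a single vector
out = Dense(32, activation='softmax')(x)

model = Model(inputs=inp, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# with a 32-way softmax the targets must be one-hot, e.g. (assuming integer class ids):
# model.fit(x_train, to_categorical(y_train, num_classes=32), epochs=5, batch_size=64)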
Related
I'm creating a multi-input model where I concatenate a CNN model and an LSTM model. The LSTM model contains the last 5 events and the CNN contains a picture of the last event. Both are organized so that each element k in the NumPy arrays matches the 5 events and the corresponding picture, as do the output labels, which are the 'next' event that the model should predict.
chanDim = -1
inputs = Input(shape=inputShape)
x = inputs
x = Dense(128)(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.3)(x)
x = Flatten()(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.1)(x)
x = Activation("relu")(x)
model_cnn = Model(inputs, x)
This creates the CNN model; the following code defines the LSTM model:
hidden1 = LSTM(128)(visible)
hidden2 = Dense(64, activation='relu')(hidden1)
output = Dense(10, activation='relu')(hidden2)
model_lstm = Model(inputs=visible, outputs=output)
Now, when I combine these models and extend them with a simple Dense layer to make the multiclass prediction over 14 classes, all the inputs match and I can concatenate the (None, 10) and (None, 10) outputs into a (None, 20) tensor for the MLP:
x = Dense(14, activation="softmax")(x)
model_mlp = Model(inputs=[model_lstm.input, model_cnn.input], outputs=x)
This all works fine until I try to compile the model, when it gives me an error concerning the last Dense layer of the MLP model:
ValueError: Error when checking target: expected dense_121 to have shape (14,) but got array with shape (1,)
Do you know how this is possible? If you need more information, I'm happy to provide it.
Your target must be (None, 14) dimensional. With a softmax output you have to one-hot encode the labels.
Try this:
y = pd.get_dummies(np.concatenate([y_train, y_test])).values
y_train = y[:len(y_train)]
y_test = y[len(y_train):]
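Equivalently, if y_train and y_test already hold integer class ids, keras.utils.to_categorical gives the same (None, 14) one-hot targets (a small sketch under that assumption):
from keras.utils import to_categorical

# assuming the labels are integer class ids in the range 0..13
y_train = to_categorical(y_train, num_classes=14)
y_test = to_categorical(y_test, num_classes=14)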
In an attempt to apply 3 different activation functions to the 3 columns of the last Dense layer, I'm using Lambda layers. First I split and then reassemble the output. No actual activation is applied yet, because the following already gives me an error at the model fit step:
def createModel():
    in_put = Input(shape=(3300,))
    layer1 = Dense(512, activation='relu')(in_put)
    layer2 = Dense(1024, activation='relu')(layer1)
    layer3 = Dense(32, activation='relu')(layer2)
    out_put = Dense(3)(layer3)
    # splitting output into 3 columns for further activation
    theta = Lambda(lambda x: x[:, 0], output_shape=(1,))(out_put)
    phi = Lambda(lambda x: x[:, 1], output_shape=(1,))(out_put)
    r = Lambda(lambda x: x[:, 2], output_shape=(1,))(out_put)
    # combining the activated layers (activation has not been implemented yet)
    out_put_a = Lambda(lambda x: K.stack([x[0], x[1], x[2]]), output_shape=(3,), name="output")([theta, phi, r])
    model = Model(inputs=in_put, outputs=out_put_a)
    return model
my_network = createModel()
batch_size = 1000
epochs = 1
my_network.compile(optimizer='rmsprop', loss='mean_squared_error')
history = my_network.fit_generator(generator=training_generator, epochs=epochs,
                                   validation_data=testing_generator)
It crashes with the following error:
InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Incompatible shapes: [3] vs. [1000]
[[{{node training/RMSprop/gradients/loss/output_loss/mul_grad/BroadcastGradientArgs}}]]
[[loss/mul/_55]]
(1) Invalid argument: Incompatible shapes: [3] vs. [1000]
[[{{node training/RMSprop/gradients/loss/output_loss/mul_grad/BroadcastGradientArgs}}]]
0 successful operations.
0 derived errors ignored.
At the same time, the code below (with no Lambda layers) works just fine:
def createModel():
    in_put = Input(shape=(3300,))
    layer1 = Dense(512, activation='relu')(in_put)
    layer2 = Dense(1024, activation='relu')(layer1)
    layer3 = Dense(32, activation='relu')(layer2)
    out_put = Dense(3)(layer3)
    model = Model(inputs=in_put, outputs=out_put)
    return model
my_network = createModel()
batch_size = 1000
epochs = 1
my_network.compile(optimizer='rmsprop', loss='mean_squared_error')
history = my_network.fit_generator(generator=training_generator, epochs=epochs,
                                   validation_data=testing_generator)
Could anyone point out what's wrong with the Lambda layers?
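For what it's worth, one likely culprit is that K.stack adds the new axis at position 0 by default, so the reassembled tensor has shape (3, batch_size) rather than (batch_size, 3), which matches the [3] vs. [1000] mismatch in the error. A minimal sketch of the split-and-reassemble step with the batch axis kept first (reusing out_put from createModel above; activations still omitted):
from keras.layers import Lambda, Concatenate

# slice with ranges so each piece keeps shape (batch, 1) instead of (batch,)
theta = Lambda(lambda x: x[:, 0:1], output_shape=(1,))(out_put)
phi = Lambda(lambda x: x[:, 1:2], output_shape=(1,))(out_put)
r = Lambda(lambda x: x[:, 2:3], output_shape=(1,))(out_put)

# concatenate along the feature axis -> shape (batch, 3)
out_put_a = Concatenate(axis=-1, name="output")([theta, phi, r])
# alternatively: K.stack([x[0], x[1], x[2]], axis=1) instead of the default axis=0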
I have a nested model which has an input layer and some final dense layers before the output. Here is the code for it:
image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='softmax', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])
Now, say I wanted to extract the DenseNet's weights from this model and perform transfer learning to another, larger model which also has the same DenseNet nested inside but adds some more layers after it, such as:
image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])
Would I just need to do modelB.load_weights(<weights.hdf5>, by_name=True)? Also, should I name the internal DenseNet, and if so, how?
Before using the nested model, you can keep it in a variable.
That makes everything a lot easier:
densenet = DenseNet121(input_shape=shape, include_top=False,
                       weights=None, backend=keras.backend,
                       layers=keras.layers,
                       models=keras.models,
                       utils=keras.utils)
image_input = Input(shape, name='image_input')
x = densenet(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
......
Now it's super simple to:
weights = densenet.get_weights()
another_densenet.set_weights(weights)
The loaded file
You can also print a model.summary() of your loaded model. The DenseNet will be the first or second layer (you must check this).
You can then get it like densenet = loaded_model.layers[i].
You can then transfer these weights to the new DenseNet, either with the method in the previous answer or with new_model.layers[i].set_weights(densenet.get_weights()).
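Putting those steps together, a short sketch (the file name and the layer indices are placeholders; check model.summary() to confirm where the nested DenseNet actually sits in each model):
from keras.models import load_model

loaded_model = load_model('classification_model.h5')    # placeholder path to the trained model
loaded_model.summary()                                   # locate the nested DenseNet

densenet = loaded_model.layers[1]                        # index 1 is only an example, verify it
new_model.layers[1].set_weights(densenet.get_weights())  # same caveat for the larger model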
Perhaps the easiest way to go about this is to use the trained model itself, without trying to load the model weights from a file. Say you have trained the initial model (copied and pasted from the provided source code with minimal edits to variable names):
image_input = Input(shape, name='image_input')
# ... intermediary layers elided
x = BatchNormalization()(x)
output = Dropout(0.5)(x)
model_output = Dense(num_class, activation='softmax', name='image_output')(output)
smaller_model = Model(inputs=[image_input], outputs=[model_output])
To use the trained weights of this model for a larger model, we can simply declare another model that uses the trained weights, then use that newly defined model as a component of the larger model.
new_model = Model(image_input, output) # Model that uses trained weights
main_input = Input(shape, name='main_input')
x = new_model(main_input)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
final_model = Model(inputs=[main_input], outputs=[output])
If anything is unclear, I'd be more than happy to elaborate.
I just need to concatenate a flatten layer and a feature vector in Keras. This is the code:
#custom parameters
n_features = 38
vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
last_layer = vgg_model.get_layer('pool5').output
x = Flatten(name='flatten')(last_layer)
# feature vector
feature_vector = Input(shape = (n_features,))
conc = concatenate(([x, feature_vector]), axis=1)
layer_intermediate = Dense(128, activation='relu', name='fc6')(conc)
layer_intermediate1 = Dense(32, activation='relu', name='fc7')(layer_intermediate)
out = Dense(5, activation='softmax', name='fc8')(layer_intermediate1)
custom_vgg_model = Model(vgg_model.input, out)
But I'm getting this error:
---> 20 custom_vgg_model = Model(vgg_model.input, out)
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_88:0", shape=(?, 38), dtype=float32) at layer "input_88". The following previous layers were accessed without issue: ['input_87', 'conv1_1', 'conv1_2', 'pool1', 'conv2_1', 'conv2_2', 'pool2', 'conv3_1', 'conv3_2', 'conv3_3', 'pool3', 'conv4_1', 'conv4_2', 'conv4_3', 'pool4', 'conv5_1', 'conv5_2', 'conv5_3', 'pool5', 'flatten']
By the way, the output shape of the Flatten layer is (None, 25088).
Your feature_vector is also an Input, so add it to the inputs when you define the Model:
custom_vgg_model = Model([vgg_model.input,feature_vector], out)
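With the model defined over two inputs, fit then needs a list of two arrays in the same order (the variable names below are placeholders for your actual data):
# images: (n, 224, 224, 3), features: (n, 38), labels: (n, 5) one-hot
custom_vgg_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
custom_vgg_model.fit([images, features], labels, epochs=10, batch_size=32)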
I'm trying to build a very simple LSTM to classify text.
def encoded(texts):
    res = [one_hot(text, 100000, filters='!"#$%&()*+,-./:;<=>?#[\]^_`{|}~', split=' ') for text in texts]
    return res

def train(X, y, X_t, y_t):
    X = encoded(X)
    X_t = encoded(X_t)
    model = Sequential()
    model.add(Embedding(100000, 100))
    model.add(Bidirectional(LSTM(20, return_sequences=True), merge_mode='ave'))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    model.fit(np.array(X), np.array(y), batch_size=16, epochs=8)
    score = model.evaluate(np.array(X_t), np.array(y_t), batch_size=16)
    print(score)
However, I got this error:
ValueError: setting an array element with a sequence.
It seems like the Embedding layer didn't create vectors of the right dimension, or something is wrong with the format of the input X (and X_t).
Any idea?
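For reference, one likely cause is that one_hot returns lists of different lengths for different texts, so np.array(X) becomes a ragged object array that Keras cannot feed to the Embedding layer; padding to a fixed length (as done with pad_sequences in the first question above) gives a proper 2-D integer array. A minimal sketch, with the maximum length chosen arbitrarily:
from keras.preprocessing.sequence import pad_sequences

max_len = 100  # assumed maximum sequence length
X = pad_sequences(encoded(X), maxlen=max_len, padding='post')
X_t = pad_sequences(encoded(X_t), maxlen=max_len, padding='post')
# X and X_t are now dense (n_samples, max_len) arrays that model.fit() can consume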