Here is my model:
from keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from keras.models import Model

n_teams = 10888

# Shared embedding: one scalar "strength" per team
team_lookup = Embedding(input_dim=n_teams,
                        output_dim=1,
                        input_length=1,
                        name='Team-Strength')

teamid_in = Input(shape=(1,))
strength_lookup = team_lookup(teamid_in)
strength_lookup_flat = Flatten()(strength_lookup)
team_strength_model = Model(teamid_in, strength_lookup_flat, name='Team-Strength-Model')

team_in_1 = Input(shape=(1,), name='Team-1-In')
team_in_2 = Input(shape=(1,), name='Team-2-In')
home_in = Input(shape=(1,), name='Home-In')

team_1_strength = team_strength_model(team_in_1)
team_2_strength = team_strength_model(team_in_2)

out = Concatenate()([team_1_strength, team_2_strength, home_in])
out = Dense(1)(out)
model = Model([team_in_1, team_in_2, home_in], out)  # model construction implied by the summary below
When I fit the model with 10,888 inputs and run summary, I get 10,892 total parameters. Please explain:
1) Where do the extra 4 parameters come from?
2) The Team-Strength model is connected to both team inputs; why are its 10,888 parameters counted only once?
Here is the summary of the model:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
Team-1-In (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
Team-2-In (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
Team-Strength (Model) (None, 1) 10888 Team-1-In[0][0]
Team-2-In[0][0]
__________________________________________________________________________________________________
Home-In (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 3) 0 Team-Strength[1][0]
Team-Strength[2][0]
Home-In[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 1) 4 concatenate_1[0][0]
==================================================================================================
Total params: 10,892
Trainable params: 10,892
Non-trainable params: 0
__________________________________________________________________________________________________
To answer your questions:
1) The 4 comes from number_parameters = output_size * (input_size + 1). The Dense layer receives 3 values from concatenate_1[0][0] plus 1 bias, hence 1 * (3 + 1) = 4.
2) 10888 is the size of the embedding layer to which Team-1 and Team-2 are both connected: it is the total "vocabulary" (n_teams) times output_dim, and has nothing to do with how many times the layer is used. Because both team inputs go through the same shared Team-Strength model, its weights exist only once and are therefore counted only once.
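If it helps, here is a minimal, self-contained sketch (standalone Keras assumed) that rebuilds the model and verifies both counts:
from keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from keras.models import Model

n_teams = 10888
tid = Input(shape=(1,))
strength = Flatten()(Embedding(n_teams, 1, input_length=1)(tid))
shared = Model(tid, strength)
print(shared.count_params())  # 10888 = n_teams * output_dim

t1, t2, home = Input(shape=(1,)), Input(shape=(1,)), Input(shape=(1,))
merged = Concatenate()([shared(t1), shared(t2), home])  # same weights reused twice
model = Model([t1, t2, home], Dense(1)(merged))         # Dense adds 1 * (3 + 1) = 4
print(model.count_params())   # 10892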
I hope it makes sense.
I have a problem with a hierarchical LSTM in Keras. It works well when the data is 2-dimensional, but when I changed it to three dimensions it does not work. My data has shape (25, 10, 2).
I want to build a hierarchical LSTM: the first-layer LSTM converts each item of shape (10, 2) into a vector, and the 25 resulting vectors are fed into the second-layer LSTM. The input to the first-layer LSTM has shape (10, 2). I use two embeddings and multiply them. I would appreciate any help.
from keras.layers import (Input, Lambda, Reshape, Embedding, Multiply,
                          LSTM, TimeDistributed, Dropout, Dense)
from keras.models import Model

def H_LSTM():
    single_input = Input(shape=(10, 2), dtype='int32')
    # split the two id columns apart
    in_sentence = Lambda(lambda x: single_input[:, :, 0:1], output_shape=(maxlen,))(single_input)
    in_sentence = Reshape((maxlen,), input_shape=(maxlen, 1))(in_sentence)
    in_drug = Lambda(lambda x: single_input[:, :, 1:2], output_shape=(maxlen,))(single_input)  # 1:2, not 1:1, to select the second column
    in_drug = Reshape((maxlen,), input_shape=(maxlen, 1))(in_drug)

    embedded_sentence = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                                  input_length=maxlen, trainable=True, mask_zero=False)(in_sentence)
    embedded_drug = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                              input_length=maxlen, trainable=True, mask_zero=False)(in_drug)
    embedded_sequences = Multiply()([embedded_sentence, embedded_drug])
    lstm_sentence = LSTM(100)(embedded_sequences)
    encoded_model = Model(inputs=single_input, outputs=lstm_sentence)

    sequence_input = Input(shape=(25, 10, 2), dtype='int32')
    seq_encoded = TimeDistributed(encoded_model)(sequence_input)
    seq_encoded = Dropout(0.2)(seq_encoded)
    # Encode entire sentence
    seq_encoded = LSTM(100)(seq_encoded)
    # Prediction
    prediction = Dense(2, activation='softmax')(seq_encoded)

    model = Model(inputs=sequence_input, outputs=prediction)
    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['acc'])
    return model
Model Summary:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) (None, 10, 2) 0
__________________________________________________________________________________________________
lambda_3 (Lambda) (None, 10) 0 input_3[0][0]
__________________________________________________________________________________________________
lambda_4 (Lambda) (None, 10) 0 input_3[0][0]
__________________________________________________________________________________________________
reshape_3 (Reshape) (None, 10) 0 lambda_3[0][0]
__________________________________________________________________________________________________
reshape_4 (Reshape) (None, 10) 0 lambda_4[0][0]
__________________________________________________________________________________________________
embedding_3 (Embedding) (None, 10, 128) 4895744 reshape_3[0][0]
__________________________________________________________________________________________________
embedding_4 (Embedding) (None, 10, 128) 4895744 reshape_4[0][0]
__________________________________________________________________________________________________
multiply_2 (Multiply) (None, 10, 128) 0 embedding_3[0][0]
embedding_4[0][0]
__________________________________________________________________________________________________
lstm_3 (LSTM) (None, 100) 91600 multiply_2[0][0]
==================================================================================================
Total params: 9,883,088
Trainable params: 9,883,088
Non-trainable params: 0
__________________________________________________________________________________________________
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) (None, 25, 10, 2) 0
_________________________________________________________________
time_distributed_2 (TimeDist (None, 25, 100) 9883088
_________________________________________________________________
dropout_2 (Dropout) (None, 25, 100) 0
_________________________________________________________________
lstm_4 (LSTM) (None, 100) 80400
_________________________________________________________________
dense_2 (Dense) (None, 2) 202
=================================================================
Total params: 9,963,690
Trainable params: 9,963,690
Non-trainable params: 0
Error Message:
InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype int32 and shape [?,10,2]
[[node input_3 (defined at D:\Users\Jinhe.Shi\AppData\Local\Continuum\anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_6214]
Function call stack:
keras_scratch_graph
Update: the overall framework is shown in the figure below; the difference is that there is no attention layer, and I added two embeddings in the lower-layer LSTM.
[Figure omitted: hierarchical network framework]
Model fit:
The error happens during model fitting.
model2 = H_LSTM()
print("model fitting - Hierarchical network")
model2.fit(X_train, Y_train, epochs=3, batch_size=100, validation_data=(X_test, Y_test))
[Figure omitted: a sample of the input data]
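One possible culprit (my assumption, not a confirmed diagnosis): the Lambda closures capture the outer single_input tensor instead of their argument x, so when TimeDistributed re-calls the nested model on a new input, the graph still references the original input_3 placeholder. A minimal sketch of slicing through the Lambda argument instead:
# Hedged sketch: slice via the Lambda argument `x`, not the captured outer
# tensor; this keeps the nested model re-callable under TimeDistributed.
in_sentence = Lambda(lambda x: x[:, :, 0:1], output_shape=(10, 1))(single_input)
in_drug     = Lambda(lambda x: x[:, :, 1:2], output_shape=(10, 1))(single_input)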
I'm trying to build an NLP model that uses two different types of embedding, but I can't get the concatenation layer to function properly.
From what I can tell, the layer output shapes are correct, and they should be able to concatenate along the last axis. Am I calling something improperly?
Code sample:
from tensorflow.keras import layers
x_embed = layers.Embedding(90000, 100, input_length=9000)
feat_embed = layers.Embedding(90000, 8, input_length=9000)
layerlist = [x_embed, feat_embed]
concat = layers.Concatenate()(layerlist)
Error:
Exception has occurred: ValueError
A Concatenate layer should be called on a list of at least 2 inputs
You need Input tensors in your model: an Embedding layer object is not a tensor until it is called on an input. You can also specify the concatenation axis explicitly.
from tensorflow.keras import layers
from tensorflow.keras import models
ip1 = layers.Input((9000,))
ip2 = layers.Input((9000,))
x_embed = layers.Embedding(90000, 100, input_length=9000)(ip1)
feat_embed = layers.Embedding(90000, 8, input_length=9000)(ip2)
layerlist = [x_embed, feat_embed]
concat = layers.Concatenate(axis = -1)(layerlist)
model = models.Model([ip1, ip2], concat)
model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) [(None, 9000)] 0
__________________________________________________________________________________________________
input_6 (InputLayer) [(None, 9000)] 0
__________________________________________________________________________________________________
embedding_6 (Embedding) (None, 9000, 100) 9000000 input_5[0][0]
__________________________________________________________________________________________________
embedding_7 (Embedding) (None, 9000, 8) 720000 input_6[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 9000, 108) 0 embedding_6[0][0]
embedding_7[0][0]
==================================================================================================
Total params: 9,720,000
Trainable params: 9,720,000
Non-trainable params: 0
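For completeness, a quick usage sketch with hypothetical toy data (the array names and sizes are made up, not from the question):
import numpy as np
# two hypothetical samples of 9000 token ids each, one array per input
tokens = np.random.randint(0, 90000, size=(2, 9000))
feats = np.random.randint(0, 90000, size=(2, 9000))
out = model.predict([tokens, feats])
print(out.shape)  # (2, 9000, 108): the 100- and 8-dim embeddings concatenated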
I have a list of Dense layers with the same output shape [batch, 1]. If I combine the outputs of these layers with keras.layers.concatenate(), what will the shape be?
dense_layers = [Dense(1), Dense(1), Dense(1)]  # some dense layers
merged_output = keras.layers.concatenate(dense_layers)
Will the shape of merged_output be (batch, 3) or (3, 1)?
The answer is (batch, 3). To see this, you can build a model and print model.summary():
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# define three inputs, one feature each
# (the batch dimension is implicit and never goes in shape)
input1 = Input(shape=(1,))
input2 = Input(shape=(1,))
input3 = Input(shape=(1,))

# define three dense layers
layer1 = Dense(1)(input1)
layer2 = Dense(1)(input2)
layer3 = Dense(1)(input3)

# concatenate the layer outputs along the last axis
dense_layers = [layer1, layer2, layer3]
merged_output = concatenate(dense_layers)

# create a model and check the output shape
model = Model(inputs=[input1, input2, input3], outputs=merged_output)
model.summary()
Layer (type)                 Output Shape     Param #     Connected to
=============================================================================
input_1 (InputLayer)         (None, 1)        0
_____________________________________________________________________________
input_2 (InputLayer)         (None, 1)        0
_____________________________________________________________________________
input_3 (InputLayer)         (None, 1)        0
_____________________________________________________________________________
dense_1 (Dense)              (None, 1)        2           input_1[0][0]
_____________________________________________________________________________
dense_2 (Dense)              (None, 1)        2           input_2[0][0]
_____________________________________________________________________________
dense_3 (Dense)              (None, 1)        2           input_3[0][0]
_____________________________________________________________________________
concatenate_1 (Concatenate)  (None, 3)        0           dense_1[0][0]
                                                          dense_2[0][0]
                                                          dense_3[0][0]
=============================================================================
Total params: 6
Trainable params: 6
Non-trainable params: 0
_____________________________________________________________________________
I am building an LSTM network for multivariate time series classification, with 2 categorical features for which I have created Embedding layers in Keras. The model compiles, and the architecture is displayed below along with the code.
I am getting ValueError: all the input array dimensions except for the concatenation axis must match exactly. This is strange to me, because the model compiles and the output shapes appear to match (3-D tensors concatenated along axis=-1). The x argument to model.fit is a list of 3 inputs: the first categorical variable array, the second categorical variable array, and the 3-D multivariate time series array for the LSTM.
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
input_5 (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
VAR_1 (Embedding) (None, 46, 5) 50 input_4[0][0]
__________________________________________________________________________________________________
VAR_2 (Embedding) (None, 46, 13) 338 input_5[0][0]
__________________________________________________________________________________________________
time_series (InputLayer) (None, 46, 11) 0
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 46, 18) 0 VAR_1[0][0]
VAR_2[0][0]
__________________________________________________________________________________________________
concatenate_4 (Concatenate) (None, 46, 29) 0 time_series[0][0]
concatenate_3[0][0]
__________________________________________________________________________________________________
lstm_2 (LSTM) (None, 46, 100) 52000 concatenate_4[0][0]
__________________________________________________________________________________________________
attention_2 (Attention) (None, 100) 146 lstm_2[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 1) 101 attention_2[0][0]
==================================================================================================
Total params: 52,635
Trainable params: 52,635
Non-trainable params: 0
from keras.layers import Input, Embedding, Concatenate, LSTM, Dense
from keras.models import Model
from keras.optimizers import Adam
# Attention is a custom layer defined elsewhere (146 params in the summary)

n_timesteps = 46
n_features = 11

def EmbeddingNet(cat_vars, n_timesteps, n_features, embedding_sizes):
    inputs = []
    embed_layers = []
    for (c, (in_size, out_size)) in zip(cat_vars, embedding_sizes):
        i = Input(shape=(1,))
        o = Embedding(in_size, out_size, input_length=n_timesteps, name=c)(i)
        inputs.append(i)
        embed_layers.append(o)
    embed = Concatenate()(embed_layers)

    time_series_input = Input(batch_shape=(None, n_timesteps, n_features), name='time_series')
    inputs.append(time_series_input)
    concatenated_inputs = Concatenate(axis=-1)([time_series_input, embed])

    lstm_layer1 = LSTM(units=100, return_sequences=True)(concatenated_inputs)
    attention = Attention()(lstm_layer1)
    output_layer = Dense(1, activation="sigmoid")(attention)

    opt = Adam(lr=0.001)
    model = Model(inputs=inputs, outputs=output_layer)
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    model.summary()
    return model

model = EmbeddingNet(cat_vars, n_timesteps, n_features, embedding_sizes)

history = model.fit(x=[x_train_cat_array[0], x_train_cat_array[1], x_train_input],
                    y=y_train_input,
                    batch_size=8, epochs=1, verbose=1,
                    validation_data=([x_val_cat_array[0], x_val_cat_array[1], x_val_input], y_val_input),
                    shuffle=False)
I'm trying to do the very same thing. You should concatenate over axis 2. Please check HERE.
Let me know if this works on your dataset, because categorical features are not giving any benefit to me.
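For illustration, a minimal sketch of an axis-2 concatenation where the shapes actually line up (this fix is my assumption, not part of the answer above): an Embedding fed ids of shape (batch, 1) produces (None, 1, dim), so one way to match the time series is to repeat the per-sample embedding across the 46 timesteps. Sizes mirror VAR_1 from the summary (vocabulary 10, dimension 5):
# Hedged sketch: repeat a per-sample embedding across the time axis so it can
# be concatenated with the (None, 46, 11) time series along axis=2.
from keras.layers import Input, Embedding, Reshape, RepeatVector, Concatenate

n_timesteps, n_features = 46, 11
cat_in = Input(shape=(1,))
emb = Embedding(10, 5)(cat_in)              # (None, 1, 5)
emb = Reshape((5,))(emb)                    # (None, 5)
emb = RepeatVector(n_timesteps)(emb)        # (None, 46, 5)
ts_in = Input(shape=(n_timesteps, n_features))
merged = Concatenate(axis=2)([ts_in, emb])  # (None, 46, 16)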
I am using Keras 1.1.1 on Windows 7 with the TensorFlow backend.
I am trying to prepend an image downsampler to the stock pretrained ResNet50 model. Below is my code.
from keras.applications.resnet50 import ResNet50
import keras.layers
# this could also be the output a different Keras model or layer
input = keras.layers.Input(shape=(400, 400, 1)) # this assumes K.image_dim_ordering() == 'tf'
x1 = keras.layers.AveragePooling2D(pool_size=(2,2))(input)
x2 = keras.layers.Flatten()(x1)
x3 = keras.layers.RepeatVector(3)(x2)
x4 = keras.layers.Reshape((200, 200, 3))(x3)
x5 = keras.layers.ZeroPadding2D(padding=(12,12))(x4)
m = keras.models.Model(input, x5)
model = ResNet50(input_tensor=m.output, weights='imagenet', include_top=False)
but I get an error which I am unsure how to fix.
builtins.Exception: Graph disconnected: cannot obtain value for tensor
Output("input_2:0", shape=(?, 400, 400, 1), dtype=float32) at layer
"input_2". The following previous layers were accessed without issue:
[]
You can use either the Functional API or the Sequential approach to solve this. See a working example of both below:
from keras.applications.resnet50 import ResNet50
from keras.models import Sequential, Model
from keras.layers import AveragePooling2D, Flatten, RepeatVector, Reshape, ZeroPadding2D, Input, Dense
pretrained = ResNet50(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
# Sequential method
model_1 = Sequential()
model_1.add(AveragePooling2D(pool_size=(2,2),input_shape=(400, 400, 1)))
model_1.add(Flatten())
model_1.add(RepeatVector(3))
model_1.add(Reshape((200, 200, 3)))
model_1.add(ZeroPadding2D(padding=(12,12)))
model_1.add(pretrained)
model_1.add(Dense(1))
# functional API method
input = Input(shape=(400, 400, 1))
x = AveragePooling2D(pool_size=(2,2),input_shape=(400, 400, 1))(input)
x = Flatten()(x)
x = RepeatVector(3)(x)
x = Reshape((200, 200, 3))(x)
x = ZeroPadding2D(padding=(12,12))(x)
x = pretrained(x)
preds = Dense(1)(x)
model_2 = Model(input,preds)
model_1.summary()
model_2.summary()
The summaries (note: I happened to run these with Xception, so read ResNet50 where the nested-model row says xception):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
average_pooling2d_1 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_1 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 400, 400, 1) 0
_________________________________________________________________
average_pooling2d_2 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_2 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_2 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_2 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
Both approaches work fine. If you plan on freezing the pretrained model and letting the pre/post layers learn, and afterward fine-tuning the whole model, the approach I found to work goes like so:
from keras.models import load_model

# given the same resnet model as before...
model = load_model('modelname.h5')

# pull out the nested pretrained model (here assumed to sit at index 5)
nested_model = model.layers[5]

# unfreeze its layers; this mutates the nested model in place, so there is
# no need to assign it back into model.layers
for l in nested_model.layers:
    l.trainable = True
Note that changes to trainable only take effect once the model is compiled again.
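And a hedged sketch of the full freeze-then-finetune cycle, using the pretrained and model_2 names from the functional example above (the optimizer settings and loss are hypothetical placeholders, not from the original answer):
# Hedged sketch: freeze the pretrained block, train the wrapper layers,
# then unfreeze and re-compile with a smaller learning rate to fine-tune.
from keras.optimizers import Adam

for l in pretrained.layers:
    l.trainable = False
model_2.compile(optimizer=Adam(lr=1e-3), loss='mse')  # hypothetical loss
# ... fit the pre/post layers here ...

for l in pretrained.layers:
    l.trainable = True
model_2.compile(optimizer=Adam(lr=1e-5), loss='mse')  # re-compile is required
# ... continue fine-tuning ...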