Graph disconnected issue in Keras

Architecture I want to implement
I wish to implement this architecture with the Keras functional API. I am new to this, and here is my code so far (which gets stuck at concatenating the inputs).
# Arbitrary dimension for all embeddings
embedding_dim = 10
# Quarter hour of the day embedding
input_quarter_hour = Input(shape=(1,))
embed_quarter_hour = Embedding(metadata['n_quarter_hours'], embedding_dim, input_length=1)(input_quarter_hour)
embed_quarter_hour = Reshape((embedding_dim,))(embed_quarter_hour)
# Day of the week embedding
input_day_of_week = Input(shape=(1,))
embed_day_of_week = Embedding(metadata['n_days_per_week'], embedding_dim, input_length=1)(input_day_of_week)
embed_day_of_week = Reshape((embedding_dim,))(embed_day_of_week)
# Week of the year embedding
input_week_of_year = Input(shape=(1,))
embed_week_of_year = Embedding(metadata['n_weeks_per_year'], embedding_dim, input_length=1)(input_week_of_year)
embed_week_of_year = Reshape((embedding_dim,))(embed_week_of_year)
# Client ID embedding
input_client_ids = Input(shape=(1,))
embed_client_ids = Embedding(metadata['n_client_ids'], embedding_dim, input_length=1)(input_client_ids)
embed_client_ids = Reshape((embedding_dim,))(embed_client_ids)
# Taxi ID embedding
input_taxi_ids = Input(shape=(1,))
embed_taxi_ids = Embedding(metadata['n_taxi_ids'], embedding_dim, input_length=1)(input_taxi_ids)
embed_taxi_ids = Reshape((embedding_dim,))(embed_taxi_ids)
# Taxi stand ID embedding
input_stand_ids = Input(shape=(1,))
embed_stand_ids = Embedding(metadata['n_stand_ids'], embedding_dim, input_length=1)(input_stand_ids)
embed_stand_ids = Reshape((embedding_dim,))(embed_stand_ids)
# GPS coordinates (5 first lat/long and 5 latest lat/long, therefore 20 values)
coords_in = Input(shape=(20,))
coords_out = Dense(1, input_dim=20, init='normal')(coords_in)
#model = Sequential()
concatenated = concatenate([
    embed_quarter_hour,
    embed_day_of_week,
    embed_week_of_year,
    embed_client_ids,
    embed_taxi_ids,
    embed_stand_ids,
    coords_out
])
out = Dense(500, activation='relu')(concatenated)
out = Dense(len(clusters),activation='softmax',name='output_layer')(out)
cast_clusters = K.cast_to_floatx(clusters)
def destination(probabilities):
    return tf.matmul(probabilities, cast_clusters)
out = Activation(destination)(out)
model = Model(concatenated,out)
I am getting this error:
Graph disconnected: cannot obtain value for tensor
Tensor("input_64:0", shape=(?, 1), dtype=float32) at layer "input_64".
The following previous layers were accessed without issue: [].
I am guessing the problem stems from the size of my tensors... but I don't know how to debug this kind of code.

You should pass a list of all inputs to the model when creating a Keras Model instance. The variable concatenated that you are using in your code does not contain the inputs but instead contains the outputs of certain layers. Moreover, you should not concatenate your inputs but simply use a list.
The following code should work:
inputs = [
    input_quarter_hour,
    input_day_of_week,
    input_week_of_year,
    input_client_ids,
    input_taxi_ids,
    input_stand_ids,
    coords_in
]
model = Model(inputs=inputs, outputs=out)
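When fitting, you then pass one array per input, in the same order as this list. A minimal sketch, where the array names and the loss are hypothetical placeholders rather than anything from the original post:
model.compile(optimizer='adam', loss='mse')
# one NumPy array per Input layer, in the same order as `inputs`
model.fit([quarter_hours, days, weeks, client_ids, taxi_ids, stand_ids, coords], targets)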

Related

Keras, Tensorflow: Merge two different model outputs into one

I am working on a deep learning model where I am trying to combine two different models' outputs. The overall structure is like this:
So the first model takes one matrix, for example [10 x 30]:
#input 1
input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
model_a = Model(inputs=[input_text], outputs=embedding)
# shape : [10,50]
Now the second model takes two input matrices:
X_in = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,32])))
M_in = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,10])))
md_1 = New_model()([X_in, M_in]) #new_model defined somewhere
model_s = Model(inputs=[X_in, M_in], outputs=md_1)
# shape : [10,50]
I want to make these two matrices trainable. In TensorFlow I was able to do this with:
matrix_a = tf.get_variable(name='matrix_a',
                           shape=[10,10],
                           dtype=tf.float32,
                           initializer=tf.constant_initializer(np.array(matrix_a)),
                           trainable=True)
I have no clue how to make matrix_a and matrix_b trainable, or how to merge the outputs of both networks and then feed in the input.
I went through this question, but couldn't find an answer because their problem statement is different from mine.
What I have tried so far is :
#input 1
input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
model_a = Model(inputs=[input_text], outputs=embedding)
# shape : [10,50]
X_in = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,10])))
M_in = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,100])))
md_1 = New_model()([X_in, M_in]) #new_model defined somewhere
model_s = Model(inputs=[X_in, M_in], outputs=md_1)
# [10,50]
#transpose second model output
transpose = Lambda(lambda x: K.transpose(x))
agglayer = transpose(md_1)
# concat first and second model output
dott = Lambda(lambda x: K.dot(x[0], x[1]))
kmean_layer = dott([embedding, agglayer])
# input
final_model = Model(inputs=[input_text, X_in, M_in], outputs=kmean_layer, name='Final_output')
final_model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
final_model.summary()
Overview of the model:
Update:
Model b
X = np.random.uniform(0,9,[10,32])
M = np.random.uniform(1,-1,[10,10])
X_in = layers.Input(tensor=K.variable(X))
M_in = layers.Input(tensor=K.variable(M))
layer_one = Model_b()([M_in, X_in])
dropout2 = Dropout(dropout_rate)(layer_one)
layer_two = Model_b()([layer_one, X_in])
model_b_ = Model([X_in, M_in], layer_two, name='model_b')
Model a
length = 150
dic_size = 100
embed_size = 12
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)
embedding = LSTM(5)(embedding)
embedding = Dense(10)(embedding)
model_a = Model(input_text, embedding, name = 'model_a')
I am merging like this:
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, model_b_.output])
final_model = Model(inputs=[model_b_.input[0],model_b_.input[1],model_a.input], outputs=mult)
Is this the right way to matmul two Keras models?
I don't know if I am merging the outputs correctly and whether the model is correct.
I would greatly appreciate it if anyone could kindly give me some advice on how to make those matrices trainable and how to merge the models' outputs correctly, then feed the input.
Thanks in advance!
Trainable weights
Ok. Since you are going to have custom trainable weights, the way to do this in Keras is to create a custom layer.
Now, since your custom layer has no inputs, we will need a hack that will be explained later.
So, this is the layer definition for the custom weights:
from keras.layers import *
from keras.models import Model
from keras.initializers import get as get_init, serialize as serial_init
import keras.backend as K
import tensorflow as tf

class TrainableWeights(Layer):

    #you can pass keras initializers when creating this layer
    #kwargs will take base layer arguments, such as name and others if you want
    def __init__(self, shape, initializer='uniform', **kwargs):
        super(TrainableWeights, self).__init__(**kwargs)
        self.shape = shape
        self.initializer = get_init(initializer)

    #build is where you define the weights of the layer
    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel',
                                      shape=self.shape,
                                      initializer=self.initializer,
                                      trainable=True)
        self.built = True

    #call is the layer operation - due to keras limitation, we need an input
    #warning, I'm supposing the input is a tensor with value 1 and no shape or shape (1,)
    def call(self, x):
        return x * self.kernel

    #for keras to build the summary properly
    def compute_output_shape(self, input_shape):
        return self.shape

    #only needed for saving/loading this layer in model.save()
    def get_config(self):
        config = {'shape': self.shape, 'initializer': serial_init(self.initializer)}
        base_config = super(TrainableWeights, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
Now, this layer should be used like this:
dummyInputs = Input(tensor=K.constant([1]))
trainableWeights = TrainableWeights(shape)(dummyInputs)
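To sanity-check that the kernel is actually registered as trainable, you can wrap the layer in a small model and inspect it. A minimal sketch, using an example shape of (10, 10):
check_model = Model(dummyInputs, TrainableWeights((10, 10))(dummyInputs))
check_model.summary()                 # should report 100 trainable parameters
print(check_model.trainable_weights)  # the 'kernel' created in build()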
Model A
Having the layer defined, we can start modeling.
First, let's see the model_a side:
#general vars
length = 150
dic_size = 100
embed_size = 12
#for the model_a segment
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)
#the following two lines are just a resource to reach the desired shape
embedding = LSTM(5)(embedding)
embedding = Dense(50)(embedding)
#creating model_a here is optional, only if you want to use model_a independently later
model_a = Model(input_text, embedding, name = 'model_a')
Model B
For this, we are going to use our TrainableWeights layer.
But first, let's simulate a New_model() as mentioned.
#simulates New_model() #notice the explicit batch_shape for the matrices
newIn1 = Input(batch_shape = (10,10))
newIn2 = Input(batch_shape = (10,30))
newOut1 = Dense(50)(newIn1)
newOut2 = Dense(50)(newIn2)
newOut = Add()([newOut1, newOut2])
new_model = Model([newIn1, newIn2], newOut, name='new_model')
Now the entire branch:
#the matrices
dummyInput = Input(tensor = K.constant([1]))
X_in = TrainableWeights((10,10), initializer='uniform')(dummyInput)
M_in = TrainableWeights((10,30), initializer='uniform')(dummyInput)
#the output of the branch
md_1 = new_model([X_in, M_in])
#optional, only if you want to use model_s independently later
model_s = Model(dummyInput, md_1, name='model_s')
The whole model
Finally, we can join the branches in a whole model.
Notice how I didn't have to use model_a or model_s here. You can do it if you want, but those submodels are not needed, unless you later want to retrieve them individually for other uses. (Even if you created them, you don't need to change the code below to use them; they're already part of the same graph.)
#I prefer tf.matmul because it's clear and understandable while K.dot has weird behaviors
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, md_1])
#final model
model = Model([input_text, dummyInput], mult, name='full_model')
Now train it:
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(np.random.randint(0, dic_size, size=(128, length)),
          np.ones((128, 10)))
Since the output is 2D now, there is no problem with 'categorical_crossentropy'; my earlier comment was only due to doubts about the output shape.

Why is there only one input in the attention model in Keras?

In this code, the author defined 2 inputs, but only one input is fed to the model. There should be some bug; however, I can run it. I wonder why this code runs successfully.
def han():
    # refer to 4.2 in the paper while reading the following code
    # Input for one day : max article per day = 40, dim_vec = 200
    input1 = Input(shape=(40, 200), dtype='float32')
    # Attention Layer
    dense_layer = Dense(200, activation='tanh')(input1)
    softmax_layer = Activation('softmax')(dense_layer)
    attention_mul = multiply([softmax_layer, input1])
    # end attention layer
    vec_sum = Lambda(lambda x: K.sum(x, axis=1))(attention_mul)
    pre_model1 = Model(input1, vec_sum)
    pre_model2 = Model(input1, vec_sum)
    # Input of the HAN, shape (None, 11, 40, 200)
    # 11 = window size = N in the paper, 40 = max articles per day, dim_vec = 200
    input2 = Input(shape=(11, 40, 200), dtype='float32')
    # TimeDistributed is used to apply a layer to every temporal slice of an input
    # So we use it here to apply our attention layer (pre_model) to every article in one day
    # to focus on the most critical article
    pre_gru = TimeDistributed(pre_model1)(input2)
    # bidirectional gru
    l_gru = Bidirectional(GRU(100, return_sequences=True))(pre_gru)
    # We apply attention layer to every day to focus on the most critical day
    post_gru = TimeDistributed(pre_model2)(l_gru)
    # MLP to perform classification
    dense1 = Dense(100, activation='tanh')(post_gru)
    dense2 = Dense(3, activation='tanh')(dense1)
    final = Activation('softmax')(dense2)
    final_model = Model(input2, final)
    final_model.summary()
    return final_model
Keras models can be used as layers. In the code above, input1 is used to define pre_model1 and pre_model2. These models are then called below inside the model named final_model.
final_model has a single input layer.
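As a minimal illustration of this pattern (a toy example assumed here, not taken from the question), calling a Model on a tensor works exactly like calling a layer, and the inner model's Input stops being an input of the outer model:
inner_in = Input(shape=(4,))
inner_model = Model(inner_in, Dense(2)(inner_in))   # a model...
outer_in = Input(shape=(7, 4))
outer_out = TimeDistributed(inner_model)(outer_in)  # ...used as a layer
outer_model = Model(outer_in, outer_out)            # outer_in is the only model input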

Convert code to new keras version (functional API) or how to concatenate 2 models

Merge doesn't work anymore. I tried the new functional API (concatenate, add, multiply) but it doesn't work for models. How to implement it?
lower_model = [self.build_network(self.model_config['critic_lower'],
                                  input_shape=(self.history_length, self.n_stock, 1))
               for _ in range(1 + self.n_smooth + self.n_down)]
merged = Merge(lower_model, mode='concat')
# upper layer
upper_model = self.build_network(self.model_config['critic_upper'], model=merged)
# action layer
action = self.build_network(self.model_config['critic_action'], input_shape=(self.n_stock,), is_conv=False)
# output layer
merged = Merge([upper_model, action], mode='mul')
model = Sequential()
model.add(merged)
model.add(Dense(1))
return model
I cannot really give you an exact answer, because your question is not detailed enough, but I can provide an example where layers are concatenated. The common fix is to import Concatenate (or the concatenate function) and use it where Merge was used in previous versions.
nlp_input = Input(shape=(seq_length,), name='nlp_input')
meta_input = Input(shape=(10,), name='meta_input')
emb = Embedding(output_dim=embedding_size, input_dim=100, input_length=seq_length)(nlp_input)
nlp_out = Bidirectional(LSTM(128, dropout=0.3, recurrent_dropout=0.3, kernel_regularizer=regularizers.l2(0.01)))(emb)
x = concatenate([nlp_out, meta_input])
x = Dense(classifier_neurons, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[nlp_input , meta_input], outputs=[x])
This is a dirty workaround that shows how to get input and output tensors from models and use concatenate layers with them, and also how to use Dense and other layers on tensors to create functional API models.
Ideally, you should rewrite everything inside build_network for clean and optimized code. (Perhaps this doesn't even work, depending on the content of that function, but this is the idea.)
lower_model = [self.build_network(
                   self.model_config['critic_lower'],
                   input_shape=(self.history_length, self.n_stock, 1))
               for _ in range(1 + self.n_smooth + self.n_down)]

#for building models you need input and output tensors
lower_inputs = [model.input for model in lower_model]
lower_outputs = [model.output for model in lower_model]
#these lines assume each model in the list has only one input and output

#using a concatenate layer on a list of tensors
merged_tensor = Concatenate()(lower_outputs) #or Concatenate(axis=...)(lower_outputs)

#this is a workaround for compatibility.
#ideally you should work just with tensors, not create unnecessary intermediate models
merged_model = Model(lower_inputs, merged_tensor) #make a model from input tensors to outputs

# upper layer
upper_model = self.build_network(self.model_config['critic_upper'], model=merged_model)
# action layer
action = self.build_network(self.model_config['critic_action'], input_shape=(self.n_stock,), is_conv=False)

# output layer - get the output tensors from the models
upper_out = upper_model.output
action_out = action.output

#apply the Multiply layer on the list of tensors
merged_tensor = Multiply()([upper_out, action_out])

#apply the Dense layer on the merged tensor
out = Dense(1)(merged_tensor)

#get input tensors to create a model
upper_inputs = upper_model.inputs #should be a list
action_inputs = action.inputs #if not a list, append to the previous list
inputs = upper_inputs + action_inputs

model = Model(inputs, out)
return model

keras lstm model in a bidirectional wrapper -- pass values to next layer

I am working on this Keras model. It is derived from NMT examples and is meant to be a chatbot. The specific question is about the Bidirectional wrapper around LSTM a. The Bidirectional wrapper returns 5 items: one is the output sequence, and the other four are h and c states, one pair for the forward direction and one for the backward direction. Somehow I want to feed them to the next LSTM layer, and I don't seem to be doing it effectively. To start with, I don't know which is h, which is c, and which two are forward and which two are backward. The method I'm using to pass the values to the next LSTM layer is not good. Can anyone help me out? I am using the TensorFlow 1.6.0 backend and Keras 2.1.5 on a Linux computer.
def embedding_model_lstm(words,
                         embedding_weights_a=None,
                         trainable=False,
                         skip_embed=False,
                         return_sequences_b=False):
    lstm_unit_a = units
    lstm_unit_b = units # * 2
    embed_unit = 100 # int(hparams['embed_size'])
    x_shape = (tokens_per_sentence,)
    valid_word_a = Input(shape=x_shape)
    valid_word_b = Input(shape=x_shape)
    embeddings_a = Embedding(words, embed_unit,
                             weights=[embedding_weights_a],
                             input_length=tokens_per_sentence,
                             trainable=trainable
                             )
    embed_a = embeddings_a(valid_word_a)
    ### encoder for training ###
    lstm_a = Bidirectional(LSTM(units=lstm_unit_a,
                                return_sequences=True,
                                return_state=True,
                                #recurrent_dropout=0.2,
                                input_shape=(None,),
                                ), merge_mode='ave')
    recurrent_a, rec_a_1, rec_a_2, rec_a_3, rec_a_4 = lstm_a(embed_a)
    concat_a_1 = Average()([rec_a_1, rec_a_3])
    concat_a_2 = Average()([rec_a_2, rec_a_4])
    lstm_a_states = [concat_a_1, concat_a_2]
    embed_b = embeddings_a(valid_word_b)
    lstm_b = LSTM(units=lstm_unit_b,
                  #recurrent_dropout=0.2,
                  return_sequences=return_sequences_b,
                  return_state=True
                  )
    recurrent_b, inner_lstmb_h, inner_lstmb_c = lstm_b(embed_b, initial_state=lstm_a_states)
    dense_b = Dense(embed_unit, input_shape=(tokens_per_sentence,),
                    activation='relu', #softmax or relu
                    )
    decoder_b = dense_b(recurrent_b)
    model = Model([valid_word_a, valid_word_b], decoder_b)
    ### encoder for inference ###
    model_encoder = Model(valid_word_a, lstm_a_states)
    ### decoder for inference ###
    input_h = Input(shape=(None,))
    input_c = Input(shape=(None,))
    inputs_inference = [input_h, input_c]
    embed_b = embeddings_a(valid_word_b)
    outputs_inference, outputs_inference_h, outputs_inference_c = lstm_b(embed_b,
                                                                         initial_state=inputs_inference)
    outputs_states = [outputs_inference_h, outputs_inference_c]
    dense_outputs_inference = dense_b(outputs_inference)
    ### inference model ###
    model_inference = Model([valid_word_b] + inputs_inference,
                            [dense_outputs_inference] + outputs_states)
    return model, model_encoder, model_inference
I am using Python 3.
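For reference, in Keras 2 a Bidirectional RNN with return_state=True returns the output first, then the forward states, then the backward states; for an LSTM each state pair is (h, c). Under that assumption, the encoder call above can be unpacked with explicit names:
# output sequence, then forward (h, c), then backward (h, c)
recurrent_a, fwd_h, fwd_c, bwd_h, bwd_c = lstm_a(embed_a)
state_h = Average()([fwd_h, bwd_h]) # averaging the states matches merge_mode='ave'
state_c = Average()([fwd_c, bwd_c])
lstm_a_states = [state_h, state_c]  # initial_state for an LSTM expects [h, c]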

Keras: Multiple inputs and multiple outputs for fit_generator using flow_from_directory

My multi-task learning model accepts three inputs. I am using the Keras data generator. Is it possible to pass three data generators to the model.fit_generator function?
Problem Definition
I am working on a classification problem. The dataset I am using is Painter by Numbers, a competition hosted by Kaggle. The task is to identify the painter, style, and genre of given paintings.
I have developed individual models to perform each task. Now I would like to incorporate multi-task learning and see if it outperforms the individual models.
Model                                       No. of classes (softmax)
-----------------------------------------   ------------------------
Model predicting painter given paintings    8
Model predicting style given paintings      10
Model predicting genre given paintings      23
The table above details the individual models and the number of output classes for each model.
Now I want to do multi-task learning, so I came up with the simple architecture below.
Multi Task Learning Architecture
style = Input(shape=(64,64,3))
genre = Input(shape=(64,64,3))
painter = Input(shape=(64,64,3))

shared_conv = Convolution2D(
    filters = 5, # 5 feature maps
    kernel_size = (5,5),
    strides = 1)

shared_conv_layer_A = shared_conv(style)
shared_conv_layer_B = shared_conv(genre)
shared_conv_layer_C = shared_conv(painter)

merged_layer = keras.layers.concatenate([shared_conv_layer_A, shared_conv_layer_B, shared_conv_layer_C], axis=-1)

pooling = MaxPooling2D(
    pool_size = (2,2),
    strides = 2
)(merged_layer)

dense = Flatten()(pooling)

out_style = Dense(
    no_classes_style,
    kernel_initializer = glorot_normal(seed=seed_val),
    bias_initializer = 'zero',
    kernel_regularizer = l2(l=0.0001),
    activation = 'softmax',
)(dense)

out_genre = Dense(
    no_classes_genre,
    kernel_initializer = glorot_normal(seed=seed_val),
    bias_initializer = 'zero',
    kernel_regularizer = l2(l=0.0001),
    activation = 'softmax',
)(dense)

out_painter = Dense(
    no_classes_painter,
    kernel_initializer = glorot_normal(seed=seed_val),
    bias_initializer = 'zero',
    kernel_regularizer = l2(l=0.0001),
    activation = 'softmax',
)(dense)

multi_tasking_model = Model(inputs=[style, genre, painter], outputs=[out_style, out_genre, out_painter])
multi_tasking_model.summary()

multi_tasking_model.compile(
    loss = 'categorical_crossentropy',
    optimizer = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=0.00000001),
    metrics = ['accuracy']
)
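Since the model has three softmax heads, Keras also accepts one loss (and an optional weight) per output. A possible variant, assuming the output layers are given name= arguments so they can be addressed by key (the names and weights below are hypothetical):
# e.g. out_style = Dense(no_classes_style, ..., name='style_out')(dense)
multi_tasking_model.compile(
    loss = {'style_out': 'categorical_crossentropy',
            'genre_out': 'categorical_crossentropy',
            'painter_out': 'categorical_crossentropy'},
    loss_weights = {'style_out': 1.0, 'genre_out': 1.0, 'painter_out': 1.0},
    optimizer = Adam(lr=0.0001),
    metrics = ['accuracy']
)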
Now I want to pass three Keras image data generators, so I came up with a custom data generator:
def create_data_generator(style_generator, genre_generator, painter_generator):
    # Input
    _style_generator = style_generator[0]
    _genre_generator = genre_generator[0]
    _painter_generator = painter_generator[0]
    # Label
    _lstyle_generator = style_generator[1]
    _lgenre_generator = genre_generator[1]
    _lpainter_generator = painter_generator[1]
    return [_style_generator, _genre_generator, _painter_generator], [_lstyle_generator, _genre_generator, _painter_generator]

train_mulitle_data_generator = create_data_generator(trainStyleDataGenerator, trainGenreDataGenerator, trainPainterDataGenerator)
valid_mulitle_data_generator = create_data_generator(validationStyleDataGenerator, validationGenreDataGenerator, validationPainterDataGenerator)
history = multi_tasking_model.fit_generator(
    generator = train_mulitle_data_generator,
    steps_per_epoch = len(train_mulitle_data_generator),
    epochs = no_epoch,
    validation_data = valid_mulitle_data_generator,
)
The error I encountered:
'tuple' object has no attribute 'ndim'
Is there an alternative way to pass multiple inputs and multiple outputs? Any suggestions or tips would be greatly appreciated.
At the moment create_data_generator does not define a generator. Try this:
def create_data_generator(style_generator, genre_generator, painter_generator):
    while True:
        _style_generator, _lstyle_generator = next(style_generator)
        _genre_generator, _lgenre_generator = next(genre_generator)
        _painter_generator, _lpainter_generator = next(painter_generator)
        yield [_style_generator, _genre_generator, _painter_generator], [_lstyle_generator, _lgenre_generator, _lpainter_generator]
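Note that a plain Python generator has no len(), so steps_per_epoch must come from the underlying generators. A possible call, reusing the generator names from the question:
train_gen = create_data_generator(trainStyleDataGenerator, trainGenreDataGenerator, trainPainterDataGenerator)
valid_gen = create_data_generator(validationStyleDataGenerator, validationGenreDataGenerator, validationPainterDataGenerator)
history = multi_tasking_model.fit_generator(
    generator = train_gen,
    steps_per_epoch = len(trainStyleDataGenerator), # batches per epoch, from one underlying generator
    epochs = no_epoch,
    validation_data = valid_gen,
    validation_steps = len(validationStyleDataGenerator),
)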
