Keras multi input one shared embedding layer - keras

Is it possible to share one embedding layer across a single input that contains multiple features?
I would like to avoid creating 34 input layers (one per feature).
The goal is to pass a sequence of 34 features through one embedding layer to get 34 embedded vector sequences, concatenate them into one large feature vector sequence, and then feed that to an LSTM.
input shape (None,100,34) -> Embedding_layer_size_64 -> (None,100, 34*64) -> LSTM -> softmax
I hope that's clear.

The Solution:
from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model
# Shared embedding (one set of weights reused for every feature)
embedding_layer = Embedding(input_dim = vocab_size+1, output_dim = emb_dim, input_length = nb_timesteps, mask_zero = True)
# Each feature has its own input
feature_inputs = [Input(shape=(nb_timesteps, ), name='feature_' + str(i + 1)) for i in range(nb_features)]
# Apply the same embedding layer to every feature input
feature_embeddings = [embedding_layer(f) for f in feature_inputs]
# Concatenate the embedding outputs along the last axis
concatenated_embeddings = concatenate(feature_embeddings, axis=-1)
lstm_1 = LSTM(output_dim)(concatenated_embeddings)
output_layer = Dense(nb_classes, activation='softmax')(lstm_1)
model = Model(inputs=feature_inputs, outputs=output_layer, name="Multi_feature_Embedding_LSTM")
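The solution above still uses one input per feature. As a shape check for the single-input variant the question asks about, here is a minimal NumPy sketch, not Keras code. It assumes an embedding lookup behaves like integer indexing into a weight matrix, so a (batch, 100, 34) integer input yields (batch, 100, 34, 64), which can be reshaped to (batch, 100, 34*64). The sizes and the name embedding_table are made up for illustration, and mask handling is not covered.

```python
import numpy as np

# Illustrative sizes (assumptions, not taken from the original post)
batch, nb_timesteps, nb_features = 2, 100, 34
vocab_size, emb_dim = 50, 64

rng = np.random.default_rng(0)
# One shared table of embedding vectors; row 0 would be the padding id
embedding_table = rng.normal(size=(vocab_size + 1, emb_dim))
# A single integer input holding all 34 features per time step
ids = rng.integers(0, vocab_size + 1, size=(batch, nb_timesteps, nb_features))

# The lookup adds a trailing embedding axis: (2, 100, 34, 64)
embedded = embedding_table[ids]
# Flatten the feature and embedding axes: (2, 100, 34 * 64)
merged = embedded.reshape(batch, nb_timesteps, nb_features * emb_dim)

print(embedded.shape, merged.shape)
```

If the Keras Embedding layer accepts a rank-3 integer input in the same way, following it with a Reshape((nb_timesteps, nb_features * emb_dim)) layer would give the LSTM input described in the question. Treat that as something to verify against your Keras version.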

Related

How to stack same RNN for every layer?

I would like to know how to stack many RNN layers where every layer is the same RNN, so that all layers share the same weights. I have read about stacked LSTMs and RNNs, but in those examples each layer is a different RNN.
1 layer code:
inputs = keras.Input(shape=(maxlen,), batch_size = batch_size)
Emb_layer = layers.Embedding(max_features,word_dim)
Emb_output = Emb_layer(inputs)
first_layer = layers.SimpleRNN(n_hidden,use_bias=True,return_sequences=False,stateful =False)
first_layer_output = first_layer(Emb_output)
dense_layer = layers.Dense(1, activation='sigmoid')
dense_output = dense_layer(first_layer_output )
model = keras.Model(inputs=inputs, outputs=dense_output)
model.summary()
[Image: RNN 1 layer]
inputs = keras.Input(shape=(maxlen,), batch_size = batch_size)
Emb_layer = layers.Embedding(max_features,word_dim)
Emb_output = Emb_layer(inputs)
first_layer = layers.SimpleRNN(n_hidden,use_bias=True,return_sequences=True,stateful =True)
first_layer_output = first_layer(Emb_output)
first_layer_state = first_layer.states
second_layer = layers.SimpleRNN(n_hidden,use_bias=True,return_sequences=False,stateful =False)
second_layer_set_state = second_layer(first_layer_output, initial_state=first_layer_state)
dense_layer = layers.Dense(1, activation='sigmoid')
dense_output = dense_layer(second_layer_set_state )
model = keras.Model(inputs=inputs, outputs=dense_output)
model.summary()
[Image: Stack RNN 2 layer]
For example, I want to build a two-layer RNN in which the first and second layers have the same weights, so that updating the weights in the first layer also updates the second layer. As far as I know, TF has RNN.state, which returns the value from the previous layer. However, when I use it, each layer still seems to be treated independently. The 2-layer RNN that I want should have the same number of trainable parameters as the 1-layer one, since the weights are shared, but this did not work.
You can view the layer object as a container for the weights that knows how to apply the weights. You can use the layer object as many times as you want. Assuming the embedding and the RNN dimension are the same, you can do:
states = Emb_layer(inputs)
first_layer = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=True)
for _ in range(10):
    states = first_layer(states)
There is no reason to set stateful to True. It is used when you split long sequences into multiple batches and want the RNN to remember the state between batches, so that you do not have to set the initial states manually. You can get the final state of the RNN (the one you want to use for classification) by simply indexing the last position of states.
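To make the weight sharing concrete, here is a minimal NumPy sketch of the same idea, not Keras code. One set of RNN weights is applied several times in a row, and the last time step is taken as the final state. The function simple_rnn and the names w_in, w_rec and b are made up for illustration; as in the answer, the input size is assumed to equal the hidden size.

```python
import numpy as np

def simple_rnn(x, w_in, w_rec, b):
    """Run a tanh RNN over x of shape (batch, time, dim) and return all hidden states."""
    batch, time, _ = x.shape
    h = np.zeros((batch, w_rec.shape[0]))
    outputs = []
    for t in range(time):
        h = np.tanh(x[:, t, :] @ w_in + h @ w_rec + b)
        outputs.append(h)
    return np.stack(outputs, axis=1)  # (batch, time, hidden)

rng = np.random.default_rng(0)
dim = 8  # input size == hidden size, so the output can be fed back in
w_in = rng.normal(size=(dim, dim)) * 0.1
w_rec = rng.normal(size=(dim, dim)) * 0.1
b = np.zeros(dim)

states = rng.normal(size=(4, 5, dim))  # stand-in for the embedding output
for _ in range(3):
    # The same weights are reused on every pass, so no new parameters appear
    states = simple_rnn(states, w_in, w_rec, b)

final_state = states[:, -1, :]  # last time step, used for classification
n_params = w_in.size + w_rec.size + b.size
print(states.shape, final_state.shape, n_params)
```

However many times the loop runs, n_params stays at the size of a single layer, which is the property the question asks for.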

Convert code to new keras version (functional API) or how to concatenate 2 models

Merge doesn't work anymore. I tried the new functional API (concatenate, add, multiply), but it doesn't work on models. How should this be implemented?
lower_model = [self.build_network(self.model_config['critic_lower'], input_shape=(self.history_length, self.n_stock, 1))
               for _ in range(1 + self.n_smooth + self.n_down)]
merged = Merge(lower_model, mode='concat')
# upper layer
upper_model = self.build_network(self.model_config['critic_upper'], model=merged)
# action layer
action = self.build_network(self.model_config['critic_action'], input_shape=(self.n_stock,), is_conv=False)
# output layer
merged = Merge([upper_model, action], mode='mul')
model = Sequential()
model.add(merged)
model.add(Dense(1))
return model
I cannot give you an exact answer because your question is not detailed enough, but I can provide an example in which layers are concatenated. A common mistake is to import Concatenate and use it as in previous versions.
nlp_input = Input(shape=(seq_length,), name='nlp_input')
meta_input = Input(shape=(10,), name='meta_input')
emb = Embedding(output_dim=embedding_size, input_dim=100, input_length=seq_length)(nlp_input)
nlp_out = Bidirectional(LSTM(128, dropout=0.3, recurrent_dropout=0.3, kernel_regularizer=regularizers.l2(0.01)))(emb)
x = concatenate([nlp_out, meta_input])
x = Dense(classifier_neurons, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[nlp_input , meta_input], outputs=[x])
This is a dirty workaround that shows how to get the input and output tensors from models and apply concatenate layers to them. It also shows how to use Dense and other layers with tensors to create functional API models.
Ideally, you should rewrite everything inside build_network for clean and optimized code. (This may not even work, depending on the content of that function, but this is the idea.)
lower_model = [self.build_network(
                   self.model_config['critic_lower'],
                   input_shape=(self.history_length, self.n_stock, 1))
               for _ in range(1 + self.n_smooth + self.n_down)]
#for building models you need input and output tensors
lower_inputs = [model.input for model in lower_model]
lower_outputs = [model.output for model in lower_model]
#these lines assume each model in the list has only one input and output
#using a concatenate layer on a list of tensors
merged_tensor = Concatenate()(lower_outputs) #or Concatenate(axis=...)(lower_outputs)
#this is a workaround for compatibility.
#ideally you should work just with tensors, not create unnecessary intermediate models
merged_model = Model(lower_inputs, merged_tensor) #make model from input tensors to outputs
# upper layer
upper_model = self.build_network(self.model_config['critic_upper'], model=merged_model)
# action layer
action = self.build_network(self.model_config['critic_action'], input_shape=(self.n_stock,), is_conv=False)
# output layer - get the output tensors from the models
upper_out = upper_model.output
action_out = action.output
#apply the Multiply layer on the list of tensors
merged_tensor = Multiply()([upper_out, action_out])
#apply the Dense layer on the merged tensor
out = Dense(1)(merged_tensor)
#get input tensors to create a model
upper_inputs = upper_model.inputs #should be a list
action_inputs = action.inputs #if not a list, append to the previous list
inputs = upper_inputs + action_inputs
model = Model(inputs, out)
return model

How to connect custom Keras layer with multiple outputs

I defined a custom Keras layer custom_layer with two outputs: output_1 and output_2. Next, I want two independent layers A and B to connect to output_1 and output_2 respectively. How to implement this kind of network?
Using the Keras functional API, you can create any network architecture.
In your case a possible solution is
input_layer = Input(shape=(100,1))
custom_layer = Dense(10)(input_layer)
# layer A model
layer_a = Dense(10, activation='relu')(custom_layer)
output1 = Dense(1, activation='sigmoid')(layer_a)
# layer B model
layer_b = Dense(10, activation='relu')(custom_layer)
output2 = Dense(1, activation='sigmoid')(layer_b)
# define model input and output
model = Model(inputs=input_layer, outputs=[output1, output2])
If the custom layer has two output tensors (i.e. it returns a list of output tensors) when applied on one input, then:
custom_layer_output = CustomLayer(...)(input_tensor)
layer_a_output = LayerA(...)(custom_layer_output[0])
layer_b_output = LayerB(...)(custom_layer_output[1])
But if it is applied on two different input tensors, then:
custom_layer = CustomLayer(...)
out1 = custom_layer(input1)
out2 = custom_layer(input2)
layer_a_output = LayerA(...)(out1)
layer_b_output = LayerB(...)(out2)
# alternative way
layer_a_output = LayerA(...)(custom_layer.get_output_at(0))
layer_b_output = LayerB(...)(custom_layer.get_output_at(1))
Keras supports custom layers with multiple outputs. There is a merge that will update the documentation soon.
The basic idea is to work with lists. Everything you have to return in your custom layer (like layers and shapes) has to be returned as a list.
If you implement your custom layer in the right way the rest is simple:
output_1, output_2 = custom_layer()(input_layer)
layer_a_output = layer_a()(output_1)
layer_b_output = layer_b()(output_2)

How to input mask value to Convolution1D layer

I need to feed variable length sequences into my model.
My model is Embedding + LSTM + Conv1d + Maxpooling + softmax.
When I set mask_zero = True in Embedding, I fail to compile at Conv1d.
How can I input mask value in Conv1d or is there another solution?
The Masking layer expects every downstream layer to support masking, which is not the case of the Conv1D layer. Fortunately, there is another way to apply masking, using the Functional API:
inputs = Input(...)
mask = Masking().compute_mask(inputs) # <= Compute the mask
embed = Embedding(...)(inputs)
lstm = LSTM(...)(embed, mask=mask) # <= Apply the mask
conv = Conv1D(...)(lstm)
...
model = Model(inputs=[inputs], outputs=[...])
The Conv1D layer does not support masking at this time. There is an open issue about it on the Keras repo.
Depending on the task you might be able to get away with embedding the mask_value just like the other values in the sequence and apply global pooling (as you're doing now).
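As a rough illustration of what the mask means here, the NumPy sketch below (not Keras code) derives a padding mask from zero-padded token ids and then applies a global max pooling that skips the padded steps. This is a swapped-in variant for comparison: the suggestion above is simply to pool over all steps, padding included. The features array is a random stand-in for a Conv1D output and is assumed to keep the input sequence length (as with padding='same'); all names are made up for illustration.

```python
import numpy as np

# Zero-padded token ids; 0 is assumed to be the padding value
ids = np.array([[5, 2, 9, 0, 0],
                [7, 1, 0, 0, 0]])
mask = ids != 0  # True for real steps, False for padding

rng = np.random.default_rng(0)
# Stand-in for per-step features of shape (batch, time, channels)
features = rng.normal(size=(2, 5, 3))

# Plain global max pooling: padded steps take part in the maximum
pooled_plain = features.max(axis=1)

# Masked variant: padded steps are set to -inf so they can never win
masked = np.where(mask[:, :, None], features, -np.inf)
pooled_masked = masked.max(axis=1)

print(pooled_plain.shape, pooled_masked.shape)
```

Whether the two results differ enough to matter depends on the task, which is the trade-off the last answer points to.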

How to combine FCNN and RNN in Tensorflow?

I want to build a neural network that has recurrence (for example, an LSTM) at some layers and normal fully connected (FC) layers at others.
I cannot find a way to do it in Tensorflow.
It works if I have only FC layers, but I don't see how to add just one recurrent layer properly.
I create the network in the following way:
with tf.variable_scope("autoencoder_variables", reuse=None) as scope:
    for i in xrange(self.__num_hidden_layers + 1):
        # Train weights
        name_w = self._weights_str.format(i + 1)
        w_shape = (self.__shape[i], self.__shape[i + 1])
        a = tf.multiply(4.0, tf.sqrt(6.0 / (w_shape[0] + w_shape[1])))
        w_init = tf.random_uniform(w_shape, -1 * a, a)
        self[name_w] = tf.Variable(w_init,
                                   name=name_w,
                                   trainable=True)
        # Train biases
        name_b = self._biases_str.format(i + 1)
        b_shape = (self.__shape[i + 1],)
        b_init = tf.zeros(b_shape)
        self[name_b] = tf.Variable(b_init, trainable=True, name=name_b)
        if i+1 == self.__recurrent_layer:
            # Create an LSTM cell
            lstm_size = self.__shape[self.__recurrent_layer]
            self['lstm'] = tf.contrib.rnn.BasicLSTMCell(lstm_size)
It should process the batches in sequential order. I have a function that processes just one time step; it is called later by a function that processes the whole sequence:
def single_run(self, input_pl, state, just_middle = False):
    """Get the output of the autoencoder for a single batch
    Args:
      input_pl: tf placeholder for ae input data of size [batch_size, DoF]
      state: current state of LSTM memory units
      just_middle : will indicate if we want to extract only the middle layer of the network
    Returns:
      Tensor of output
    """
    last_output = input_pl
    # Pass through the network
    for i in xrange(self.num_hidden_layers+1):
        if(i!=self.__recurrent_layer):
            w = self._w(i + 1)
            b = self._b(i + 1)
            last_output = self._activate(last_output, w, b)
        else:
            last_output, state = self['lstm'](last_output,state)
    return last_output
The following function should take a sequence of batches as input and produce a sequence of batches as output:
def process_sequences(self, input_seq_pl, dropout, just_middle = False):
    """Get the output of the autoencoder
    Args:
      input_seq_pl: input data of size [batch_size, sequence_length, DoF]
      dropout: dropout rate
      just_middle : indicate if we want to extract only the middle layer of the network
    Returns:
      Tensor of output
    """
    # `not` instead of `~`: ~True and ~False are both non-zero, so `~just_middle` is always truthy
    if not just_middle: # if not middle layer
        numb_layers = self.__num_hidden_layers+1
    else:
        numb_layers = FLAGS.middle_layer
    with tf.variable_scope("process_sequence", reuse=None) as scope:
        # Initial state of the LSTM memory.
        state = initial_state = self['lstm'].zero_state(FLAGS.batch_size, tf.float32)
        tf.get_variable_scope().reuse_variables() # THIS IS IMPORTANT LINE
        # First - Apply Dropout
        the_whole_sequences = tf.nn.dropout(input_seq_pl, dropout)
        # Take batches for every time step and run them through the network
        # Stack all their outputs
        with tf.control_dependencies([tf.convert_to_tensor(state, name='state') ]): # do not let the loop be parallelized
            stacked_outputs = tf.stack( [ self.single_run(the_whole_sequences[:,time_st,:], state, just_middle) for time_st in range(self.sequence_length) ])
        # Transpose output from the shape [sequence_length, batch_size, DoF] into [batch_size, sequence_length, DoF]
        output = tf.transpose(stacked_outputs , perm=[1, 0, 2])
    return output
The issue is with variable scopes and their "reuse" property.
If I run this code as it is, I get the following error:
' Variable Train/process_sequence/basic_lstm_cell/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope? '
If I comment out the line that tells it to reuse variables ( tf.get_variable_scope().reuse_variables() ), I get the following error:
'Variable Train/process_sequence/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?'
It seems that we need "reuse=None" for the weights of the LSTM cell to be initialized, and "reuse=True" in order to call the LSTM cell.
Please help me figure out how to do this properly.
I think the problem is that you're creating variables with tf.Variable. Please, use tf.get_variable instead -- does this solve your issue?
It seems that I have solved this issue using the hack from the official Tensorflow RNN example (https://www.tensorflow.org/tutorials/recurrent) with the following code
with tf.variable_scope("RNN"):
    for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        (cell_output, state) = cell(inputs[:, time_step, :], state)
        outputs.append(cell_output)
The hack is that when we run the LSTM for the first time, tf.get_variable_scope().reuse is False, so the new LSTM cell variables are created. On every later call, we set tf.get_variable_scope().reuse to True, so that we use the LSTM that was already created.
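The two error messages and the fix can be reproduced with a toy model of the reuse flag. The pure-Python sketch below is not TensorFlow's implementation; ToyVariableStore and toy_cell are hypothetical names that only mimic the rule "create when reuse is False, look up when reuse is True".

```python
class ToyVariableStore:
    """Toy stand-in for a variable scope with a reuse flag (not TensorFlow code)."""

    def __init__(self):
        self.variables = {}
        self.reuse = False

    def get_variable(self, name, init):
        if self.reuse:
            # Reuse mode: the variable must already exist
            if name not in self.variables:
                raise ValueError("Variable %s does not exist" % name)
            return self.variables[name]
        # Create mode: the variable must not exist yet
        if name in self.variables:
            raise ValueError("Variable %s already exists" % name)
        self.variables[name] = init
        return init


def toy_cell(store, x, state):
    """A one-weight 'recurrent cell' that asks the store for its weight on every call."""
    w = store.get_variable("rnn/weights", 0.5)
    new_state = w * (x + state)
    return new_state, new_state


store = ToyVariableStore()
state = 0.0
outputs = []
for time_step, x in enumerate([1.0, 2.0, 3.0]):
    if time_step > 0:
        store.reuse = True  # same switch as reuse_variables() in the snippet above
    cell_output, state = toy_cell(store, x, state)
    outputs.append(cell_output)

print(outputs, len(store.variables))
```

Setting reuse to True before the first call triggers the "does not exist" error, and leaving it False on the second call triggers the "already exists" error. These are the two failures described in the question.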
