What would be the most straightforward way to implement a hypernetwork in Keras? That is, where one leg of the network creates the weights for another? In particular, I would like to do template matching where I feed the template in to a CNN leg that generates a convolutional kernel for a leg that operates on the main image. The part I'm unsure of is where I have a CNN layer that is fed weights externally, yet the gradients still flow through properly for training.
The weights leg:
For the weights leg, just create a regular network as you would with Keras.
Be sure that its output(s) have shape like (spatial_kernel_size1, spatial_kernel_size2, input_channels, output_channels)
Using the functional API you can create a few weight tensors, for instance:
inputs = Input((imgSize1, imgSize2, imgChannels))
w1 = Conv2D(desired_channels, ....)(inputs)
w2 = Conv2D(desired_channels2, ....)(inputs or w1)
....
You should apply some kind of pooling here, since your outputs would otherwise be huge, and you probably want filters with small spatial sizes such as 3 or 5.
w1 = GlobalAveragePooling2D()(w1) #maybe GlobalMaxPooling2D
w2 = GlobalAveragePooling2D()(w2)
If you're using fixed image sizes, you could also use other kinds of pooling or flatten and dense, etc.
Make sure you reshape the weights to the correct kernel shape:
w1 = Reshape((size1,size2,input_channels, output_channels))(w1)
w2 = Reshape((sizeA, sizeB, input_channels2, output_channels2))(w2)
....
The choice of the number of channels is up to you to optimize
The convolutional leg:
Now, this leg will only use "non-trainable" convolutions; they can be found directly in the backend and used inside Lambda layers:
out1 = Lambda(lambda x: K.conv2d(x[0], x[1]))([inputs,w1])
out2 = Lambda(lambda x: K.conv2d(x[0], x[1]))([out1,w2])
Now, how you're going to interleave the layers, how many weights, etc., is also something you should optimize for yourself.
Create the model:
model = Model(inputs, out2)
Interleaving
You may take an output from this leg as input for the weight generator leg too:
w3 = Conv2D(filters, ...)(out2)
w3 = GlobalAveragePooling2D()(w3)
w3 = Reshape((sizeI, sizeII, inputC, outputC))(w3)
out3 = Lambda(lambda x: K.conv2d(x[0], x[1]))([out2,w3])
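Putting the pieces together, here is a minimal end-to-end sketch for the template-matching setup in the question. It assumes two separate inputs (template and image), a batch size of 1 so the generated kernel can be squeezed down to the 4D shape K.conv2d expects, and placeholder sizes everywhere:
from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dense, Reshape, Lambda
from keras.models import Model
import keras.backend as K

kernel_size, in_ch, out_ch = 3, 3, 8               # assumed sizes
template_input = Input((32, 32, in_ch))            # weights leg input (the template)
image_input = Input((128, 128, in_ch))             # convolutional leg input (the main image)

# weights leg: template -> pooled features -> flat vector -> 4D kernel
w = Conv2D(32, 3, activation='relu')(template_input)
w = GlobalAveragePooling2D()(w)
w = Dense(kernel_size * kernel_size * in_ch * out_ch)(w)
w = Reshape((kernel_size, kernel_size, in_ch, out_ch))(w)

# convolutional leg: apply the generated kernel with a backend convolution;
# gradients flow back into the weights leg because the kernel is part of the graph
def dynamic_conv(tensors):
    image, kernel = tensors
    kernel = K.squeeze(kernel, axis=0)             # drop the batch dimension (batch size 1)
    return K.conv2d(image, kernel, padding='same')

out = Lambda(dynamic_conv)([image_input, w])
model = Model([template_input, image_input], out)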
Related
I'm building an NN model using Keras, and I wish to impose a constraint on it that doesn't (directly) have to do with the weights. I would be very grateful for some help or pointers towards some relevant keywords to look up. The constraint I wish to impose is a bit complex, but it can be simplified as follows: I wish to impose a constraint on the net's outputs for certain inputs. For the sake of simplicity, let's say the constraint looks like NN(3) + NN(4) < 10, where NN is the neural net, which can be seen as a function. How can I impose such a constraint? Thank you very much in advance for any help on the subject!
edit: A more detailed explanation of what I'm trying to do and why.
The theoretical model I'm building is this:
I'm feeding the output of the first net into the input of the second net, along with additive Gaussian noise.
The constraint I wish to impose is on the output of the first NN (g). Why? Without a constraint, the net maps the inputs to outputs as high as it possibly can in order to make the additive noise as insignificant as possible. And rightly so; this is the optimal encoding function g, but it's not very interesting :) And so I wish to impose a constraint on the output of the first NN (g). More specifically, the constraint is on the total power of the function: integral{ fX(x) * g(x)^2 dx }. But this can be simplified, more or less, to a function that looks something like what I described earlier: g(3) + g(4) < 10. More specifically, the constraint is sum{ fX(i) * g(i)^2 * dx } < max_power, for some sampled inputs i.
This is the problem, now here's how I attempted to implement it:
model = Sequential([
    Dense(300, input_dim=1, activation='relu'),
    Dense(300, activation='relu'),
    Dense(1, activation='linear', name='encoder_output'),
    GaussianNoise(nvar, name='noise'),
    Dense(300, activation='relu', name='decoder_input'),
    Dense(300, activation='relu'),
    Dense(1, activation='linear', name='decoder_output'),
])
Note that this is nominally a single neural net, not really two (although there is obviously no practical difference).
The important things to note are the input dim 1, output dim 1 (x and y in the diagram), and the Gaussian noise in the middle. The hidden layers are not very interesting right now; I'll optimize them at a later point.
In this model, I wish to impose a constraint on the output of a (supposedly) hidden layer named encoder_output. Hope this clarifies things.
You could use a multi-input/multi-output model with shared-weight layers. The model could, for example, look like this:
from keras.layers import Input, Dense, Add
from keras.models import Model
# Shared weights layers
hidden = Dense(10, activation='relu')
nn_output = Dense(1, activation='relu')
x1 = Input(shape=(1,))
h1 = hidden(x1)
y1 = nn_output(h1)
x2 = Input(shape=(1,))
h2 = hidden(x2)
y2 = nn_output(h2)
# Your constraint
# In case it should be more complicated, you can implement
# a custom keras layer
constraint_sum = Add()([y1, y2])
model = Model(inputs=[x1, x2], outputs=[y1, y2, constraint_sum])
model.compile(optimizer='sgd', loss='mse')
X_train_1 = [3,4]
X_train_2 = [4,3]
y_train_1 = [123,456] # your expected output
y_train_2 = [456,123] # your expected output
s = [10,10] # expected sums
model.fit([X_train_1, X_train_2], [y_train_1, y_train_2, s], epochs=10)
If you have no exact value for your constraint that can be used as an expected output, you can remove it from the outputs and write a simple custom regularizer that would be used on it. There is a simple example for a custom regularizer in the Keras documentation.
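If you go that route, one concrete possibility (a sketch, not tested against your exact setup) is to skip the regularizer object and add the penalty directly with model.add_loss; the limit of 10 and the penalty weight of 1.0 below are assumptions:
import keras.backend as K

# rebuild the model without the sum as a supervised output and add the
# constraint violation as an extra loss term instead
constrained_model = Model(inputs=[x1, x2], outputs=[y1, y2])
excess = K.relu(constraint_sum - 10.0)                 # how far y1 + y2 exceeds the limit
constrained_model.add_loss(1.0 * K.mean(excess))       # 1.0 is an assumed penalty weight
constrained_model.compile(optimizer='sgd', loss='mse')
constrained_model.fit([X_train_1, X_train_2], [y_train_1, y_train_2], epochs=10)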
I need to create multiple dense layers in a for loop; the number of iterations depends on the number of labels. I want to create one dense layer for each label. Each label has a different set of features, so I want to predict each label separately with its corresponding feature set in each dense layer. Is that possible? The following code is my attempt.
layers = []
for i in range(num_labels):
    h1 = Dense(num_genes_per + 10, kernel_initializer='normal', input_dim=num_genes_per, activation='relu')(inputs)
    h2 = Dense(int(num_genes_per / 2), kernel_initializer='normal', activation='relu')(h1)
    output = Dense(1, kernel_initializer='normal', activation='linear')(h2)
    layers.append(output)

merged_output = concatenate(layers, axis=1)
model = Model(inputs, merged_output)
The output of each per-label output layer will have shape [batch, 1], and the merged_output will have shape [batch, num_labels]. Is there any error in the above code?
I know it is not efficient, but if I concatenate the different sets of features into one input tensor and use only one dense layer to predict all labels at the same time, would it harm the prediction accuracy?
It depends on how you defined the features and labels. If features 1, 2 and 3 are used to predict label 1 and have no relation to label 2, it does not make sense to include them when inferring label 2.
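If each label truly has its own feature subset, one possible variation (a sketch, not the asker's code; feature_slices and all layer sizes below are made-up placeholders) is to slice the shared input per label before its private head:
from keras.layers import Input, Dense, Lambda, concatenate
from keras.models import Model

feature_slices = [(0, 10), (10, 25), (25, 40)]      # assumed column range per label
inputs = Input(shape=(40,))                          # assumed total number of features

heads = []
for start, end in feature_slices:
    sub = Lambda(lambda x, s=start, e=end: x[:, s:e])(inputs)   # this label's features only
    h1 = Dense(20, activation='relu')(sub)
    h2 = Dense(10, activation='relu')(h1)
    heads.append(Dense(1, activation='linear')(h2))

merged_output = concatenate(heads, axis=1)           # shape (batch, num_labels)
model = Model(inputs, merged_output)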
I intend to feed the outputs of all timesteps from an LSTM into a fully-connected layer. However, the following code fails. How can I reduce the 3D output of the LSTM to 2D by concatenating the outputs of all timesteps?
X = LSTM(units=128,return_sequences=True)(input_sequence)
X = Dropout(rate=0.5)(X)
X = LSTM(units=128,return_sequences=True)(X)
X = Dropout(rate=0.5)(X)
X = Concatenate()(X)
X = Dense(n_class)(X)
X = Activation('softmax')(X)
You can use the Flatten layer to flatten the 3D output of the LSTM layer to a 2D shape.
As a side note, it is better to use the dropout and recurrent_dropout arguments of the LSTM layer instead of using a Dropout layer directly with recurrent layers.
In addition to @today's answer:
It seems like you want to use return_sequences just to concatenate it into a Dense layer. If you have not already tried it with return_sequences=False, I would recommend you do so. The main purpose of return_sequences is to stack LSTMs or to make seq2seq predictions. In your case it should be enough to just use the LSTM's last output (return_sequences=False).
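Putting both suggestions together, a minimal sketch (the input shape and class count are assumed placeholders) could look like this:
from keras.layers import Input, LSTM, Flatten, Dense, Activation
from keras.models import Model

n_class = 10                                   # assumed number of classes
input_sequence = Input(shape=(50, 16))         # assumed (timesteps, features)

X = LSTM(128, return_sequences=True, dropout=0.5, recurrent_dropout=0.5)(input_sequence)
X = LSTM(128, return_sequences=True, dropout=0.5, recurrent_dropout=0.5)(X)
X = Flatten()(X)                               # (batch, timesteps * units)
X = Dense(n_class)(X)
X = Activation('softmax')(X)

model = Model(input_sequence, X)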
I would like to merge a forward LSTM and a backward LSTM in Keras. The input array of the backward LSTM is different from that of the forward LSTM. Thus, I cannot use keras.layers.Bidirectional.
The forward input is (10, 4).
The backward input is (12, 4) and it is reversed before being put into the model. I would like to reverse it again after the LSTM and merge it with the forward output.
The simplified model is as follows.
from lambdawithmask import Lambda as MaskLambda
def reverse_func(x, mask=None):
return tf.reverse(x, [False, True, False])
forward = Sequential()
backward = Sequential()
model = Sequential()
forward.add(LSTM(input_shape = (10, 4), output_dim = 4, return_sequences = True))
backward.add(LSTM(input_shape = (12, 4), output_dim = 4, return_sequences = True))
backward.add(MaskLambda(function=reverse_func, mask_function=reverse_func))
model.add(Merge([forward, backward], mode = "concat", concat_axis = 1))
When I run this, the error message is:
Tensors in list passed to 'values' of 'ConcatV2' Op have types [bool, float32] that don't all match.
Could anyone help me? I coded in Python 3.5.2 with Keras (2.0.5) and the backend is tensorflow (1.2.1).
First of all, if you have two different inputs, you cannot use a Sequential model. You must use the functional API Model:
from keras.models import Model
The first two models can be Sequential, no problem, but the junction must be a regular Model. When it comes to concatenating, I also use the functional approach (create the layer, then pass the inputs):
junction = Concatenate(axis=1)([forward.output,backward.output])
Why axis=1? You can only concatenate things with the same shape. Since you have 10 and 12 timesteps, they're only compatible if you merge along this exact axis, which is the second (time) axis, considering your tensors are shaped (BatchSize, TimeSteps, Units).
For creating the final model, use the Model, specify the inputs and outputs:
model = Model([forward.input,backward.input], junction)
In the model to be reversed, simply use a Lambda layer. A MaskLambda does more than just the function you want. I also suggest you use the Keras backend instead of TensorFlow functions:
import keras.backend as K
#instead of the MaskLambda:
backward.add(Lambda(lambda x: K.reverse(x, axes=[1]), output_shape=(12, ?)))
Here, the ? is the number of units your LSTM layers have. See the PS at the end.
PS: I'm not sure output_dim is useful in the LSTM layer (it's the old name for units). It's necessary to give an output shape in Lambda layers, but I never use output_dim anywhere else. Shapes are natural consequences of the number of "units" you put in your layers. Strangely, you didn't specify the number of units.
PS2: How exactly do you want to concatenate two sequences with different sizes?
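For reference, the whole junction could look like this with the functional API (a sketch; the 16 units and the fixed shapes are assumptions):
from keras.models import Model
from keras.layers import Input, LSTM, Lambda, Concatenate
import keras.backend as K

units = 16
forward_input = Input(shape=(10, 4))
backward_input = Input(shape=(12, 4))                # already reversed before being fed in

forward_seq = LSTM(units, return_sequences=True)(forward_input)
backward_seq = LSTM(units, return_sequences=True)(backward_input)
backward_seq = Lambda(lambda x: K.reverse(x, axes=1))(backward_seq)   # undo the reversal

junction = Concatenate(axis=1)([forward_seq, backward_seq])           # (batch, 10 + 12, units)
model = Model([forward_input, backward_input], junction)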
As said in the above answer, using the functional API offers you much flexibility in the case of multi-input/output models. You can simply set the go_backwards argument to True to reverse the traversal of the input sequence by the LSTM layer.
I have defined the smart_merge function below, which merges the forward and backward LSTM layers together and also handles the single-traversal case.
from keras.models import Model
from keras.layers import Input, LSTM, merge

def smart_merge(vectors, **kwargs):
    return vectors[0] if len(vectors) == 1 else merge(vectors, **kwargs)
input1 = Input(shape=(10, 4), dtype='float32')
input2 = Input(shape=(12, 4), dtype='float32')
LtoR_LSTM = LSTM(56, return_sequences=False)
LtoR_LSTM_vector = LtoR_LSTM(input1)
RtoL_LSTM = LSTM(56, return_sequences=False, go_backwards=True)
RtoL_LSTM_vector = RtoL_LSTM(input2)
BidireLSTM_vector = [LtoR_LSTM_vector]
BidireLSTM_vector.append(RtoL_LSTM_vector)
BidireLSTM_vector = smart_merge(BidireLSTM_vector, mode='concat')
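If it helps, the merged vector can then be wrapped into a trainable model in the usual way; the optimizer and loss below are placeholders:
model = Model(inputs=[input1, input2], outputs=BidireLSTM_vector)
model.compile(optimizer='adam', loss='mse')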
I need to feed variable length sequences into my model.
My model is Embedding + LSTM + Conv1d + Maxpooling + softmax.
When I set mask_zero=True in the Embedding layer, compilation fails at the Conv1D layer.
How can I pass the mask to Conv1D, or is there another solution?
The Masking layer expects every downstream layer to support masking, which is not the case for the Conv1D layer. Fortunately, there is another way to apply masking, using the functional API:
inputs = Input(...)
mask = Masking().compute_mask(inputs) # <= Compute the mask
embed = Embedding(...)(inputs)
lstm = LSTM(...)(embed, mask=mask) # <= Apply the mask
conv = Conv1D(...)(lstm)
...
model = Model(inputs=[inputs], outputs=[...])
The Conv1D layer does not support masking at this time. Here is an open issue on the Keras repo.
Depending on the task, you might be able to get away with embedding the mask value just like the other values in the sequence and applying global pooling (as you're doing now).
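For completeness, a minimal sketch of that fallback (vocabulary size, dimensions and class count are placeholders): drop mask_zero, let the padding index 0 learn its own embedding, and rely on global pooling after the convolution:
from keras.layers import Input, Embedding, LSTM, Conv1D, GlobalMaxPooling1D, Dense
from keras.models import Model

inp = Input(shape=(None,), dtype='int32')            # variable-length, zero-padded sequences
x = Embedding(input_dim=1000, output_dim=64)(inp)    # no mask_zero
x = LSTM(64, return_sequences=True)(x)
x = Conv1D(128, 3, padding='same', activation='relu')(x)
x = GlobalMaxPooling1D()(x)
out = Dense(10, activation='softmax')(x)
model = Model(inp, out)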