I am trying to learn a multimodal distribution using a neural network and a Gaussian mixture model.
However, in my model summary the output layer has a None output shape, which comes from using MixtureSameFamily.
Can you help me resolve this?
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

tfd = tfp.distributions

def zero_inf(out):
    # Split the 6 Dense outputs into 2 locations, 2 scales and 2 mixture logits
    loc, scale, probs = tf.split(out, num_or_size_splits=3, axis=-1)
    scale = tf.nn.softplus(scale)  # keep the scales positive
    probs = tf.nn.softmax(probs)   # mixture weights sum to 1
    return tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(probs=probs),  # D
        components_distribution=tfd.Normal(loc=loc, scale=scale))

## Definition of the custom parametrized distribution
inputs = tf.keras.layers.Input(shape=(5,))
out = Dense(6)(inputs)  # A
p_y_zi = tfp.layers.DistributionLambda(lambda t: zero_inf(t))(out)
model_zi = Model(inputs=inputs, outputs=p_y_zi)
# def NLL(y_true, y_hat):
#     return -y_hat.log_prob(tf.reshape(y_true, (-1,)))
# model_zi.compile(optimizer="adam", loss=NLL)
model_zi.summary()
Below is the model summary.
Model: "model_70"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_30 (InputLayer)        [(None, 5)]               0
_________________________________________________________________
dense_130 (Dense)            (None, 6)                 36
_________________________________________________________________
distribution_lambda_36 (Dist ((None,), (None,))        0
=================================================================
Total params: 36
Trainable params: 36
Non-trainable params: 0
_________________________________________________________________
Because of this empty/None output, I am unable to train the model, as I get shape-related errors:
InvalidArgumentError: Cannot update variable with shape [] using a Tensor with shape [32], shapes must be equal.
[[{{node metrics_60/mae/AssignAddVariableOp}}]]
Tensorflow version: 2.6.2
Tensorflow-probability: 0.14.0
Python: 3.6
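For reference, here is a minimal NumPy sketch of what the zero_inf parameterization in the question computes, assuming the 6 Dense outputs are split into 2 locations, 2 scales and 2 mixture logits. The helper names (softplus, softmax, mixture_log_prob) are illustrative and are not part of TensorFlow Probability. The sketch suggests that the mixture yields one scalar log-density per example, so the (None,) in the summary is probably just the batch dimension of a scalar-valued distribution rather than a missing output.

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), computed in a numerically stable way
    return np.logaddexp(0.0, x)

def softmax(x):
    # Subtract the row maximum before exponentiating for stability
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mixture_log_prob(out, y):
    """Log-density of y under a per-row 2-component Gaussian mixture.

    `out` has shape (batch, 6); `y` has shape (batch,).
    """
    loc, scale, logits = np.split(out, 3, axis=-1)  # each has shape (batch, 2)
    scale = softplus(scale)
    probs = softmax(logits)
    # Normal log-density of y under each component, shape (batch, 2)
    comp = (-0.5 * ((y[:, None] - loc) / scale) ** 2
            - np.log(scale) - 0.5 * np.log(2.0 * np.pi))
    # log sum_k probs_k * N(y | loc_k, scale_k), shape (batch,)
    return np.logaddexp.reduce(np.log(probs) + comp, axis=-1)

rng = np.random.default_rng(0)
out = rng.normal(size=(4, 6))   # stand-in for the Dense(6) output
y = rng.normal(size=(4,))       # stand-in for scalar targets
log_prob = mixture_log_prob(out, y)
print(out.shape, log_prob.shape)  # (4, 6) (4,)
```

Under these assumptions, a negative log-likelihood loss would average -log_prob over the batch, which is what the commented-out NLL function in the question does.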
I'm new to Keras model implementation, and despite looking for answers in
https://stackoverflow.com/questions/60991253/invalidargumenterror-input-must-be-a-vector-got-shape
https://stackoverflow.com/questions/41736677/how-could-keras-model-predict-only-one-sample
I still couldn't make it work :(
I successfully trained my Keras model (word embeddings using "https://tfhub.dev/google/universal-sentence-encoder/4").
But when I try to predict with:
test_text = ["We are looking for Data Scientists"]
test_text = np.array(test_text , dtype=object)[:, np.newaxis]
predicts = model.predict(test_text , batch_size=32)
predicts
But I get the following error:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-150-078cff510ad4> in <module>
----> 1 predicts = model.predict(test_text, batch_size=32)
2
3 predicts
1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
53 ctx.ensure_initialized()
54 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 55 inputs, attrs, num_outputs)
56 except core._NotOkStatusException as e:
57 if name is not None:
InvalidArgumentError: Graph execution error:
input must be a vector, got shape: []
[[{{node text_preprocessor/tokenize/StringSplit/StringSplit}}]] [Op:__inference_predict_function_838262]
Below are the model summary and the required batch input shape, but I really can't get it to work :(
Any help is more than welcome!
Thanks
>> model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_18 (InputLayer)        [(None, 1)]               0
lambda_14 (Lambda)           (None, 512)               0
dense_1 (Dense)              (None, 256)               131328
dense_2 (Dense)              (None, 788)               202516
=================================================================
Total params: 333,844
Trainable params: 333,844
Non-trainable params: 0
config = model.get_config()  # returns the full model configuration as a dict
print(config["layers"][0]["config"]["batch_input_shape"])  # batch input shape of the first (input) layer
(None, 1)
Looking at some answers, such as "Keras model input shape wrong", I also tried reshaping like this:
predicts = model.predict(test_text.reshape((1, -1)))
predicts
But I get exactly the same error as before.
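A quick NumPy check (a sketch that only inspects array shapes, not the model) shows why the reshape changes nothing: the array built with np.newaxis already has shape (1, 1), and reshape((1, -1)) returns the same shape. The variable names as_column, reshaped and flat are illustrative.

```python
import numpy as np

test_text = ["We are looking for Data Scientists"]

# The array passed to model.predict in the first attempt
as_column = np.array(test_text, dtype=object)[:, np.newaxis]

# The array passed to model.predict in the second attempt
reshaped = as_column.reshape((1, -1))

# For comparison: the same data without the extra axis
flat = np.array(test_text, dtype=object)

print(as_column.shape, reshaped.shape, flat.shape)  # (1, 1) (1, 1) (1,)
```

Both attempts therefore feed the model identical input. Whether a different input shape avoids the error depends on how the Lambda text-embedding layer was written, which the question does not show.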
I'm building an LSTM, for a report, and would like to summarize things about it. However, I've seen two different ways to build an LSTM in Keras that yield two different values for the number of parameters.
I'd like to understand why the parameters differ in this way.
This question correctly shows why this code
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
model = Sequential()
model.add(LSTM(256, input_dim=4096, input_length=16))
model.summary()
results in 4457472 parameters.
From what I can tell, the following two LSTMs should be the same:
m2 = Sequential()
m2.add(LSTM(1, input_dim=5, input_length=1))
m2.summary()
m3 = Sequential()
m3.add(LSTM((1),batch_input_shape=(None,5,1)))
m3.summary()
However, m2 has 28 parameters, while m3 has 12. Why?
How is 12 calculated for a 1-unit LSTM with a 5-dim input?
I've included the warning messages below in case they are helpful.
Output
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_1 (LSTM)                (None, 256)               4457472
=================================================================
Total params: 4,457,472
Trainable params: 4,457,472
Non-trainable params: 0
_________________________________________________________________
Warning (from warnings module):
File "difparam.py", line 11
m2.add(LSTM(1, input_dim=5, input_length=1))
UserWarning: The `input_dim` and `input_length` arguments in recurrent layers are deprecated. Use `input_shape` instead.
Warning (from warnings module):
File "difparam.py", line 11
m2.add(LSTM(1, input_dim=5, input_length=1))
UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(1, input_shape=(1, 5))`
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_2 (LSTM)                (None, 1)                 28
=================================================================
Total params: 28
Trainable params: 28
Non-trainable params: 0
_________________________________________________________________
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_3 (LSTM)                (None, 1)                 12
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
m2 was built based on info from the Stack Overflow question, and m3 was built based on this video from YouTube.
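Because the two models do not have the same input shape: batch_input_shape=(None, 5, 1) in m3 means input_length = 5 and input_dim = 1, whereas m2 uses input_dim = 5 and input_length = 1.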
It's even written in the warning you received, where the input shape for m2 is different from the one used in m3:
UserWarning: Update your LSTM call to the Keras 2 API: LSTM(1, input_shape=(1, 5))
It's highly recommended that you use the suggestion in the warning.
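The three parameter counts in the summaries can be reproduced with the usual formula for a standard LSTM layer: four gates, each with an input kernel, a recurrent kernel and a bias. The function below is a small sketch in plain Python; the name lstm_param_count is illustrative, and it assumes the default configuration with biases enabled.

```python
def lstm_param_count(units, input_dim, use_bias=True):
    """Trainable parameters of a standard LSTM layer.

    Each of the 4 gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units) and, optionally, a bias (units).
    The sequence length does not affect the count.
    """
    per_gate = units * input_dim + units * units + (units if use_bias else 0)
    return 4 * per_gate

print(lstm_param_count(256, 4096))  # 4457472
print(lstm_param_count(1, 5))       # 28, as in m2 (input_dim = 5)
print(lstm_param_count(1, 1))       # 12, as in m3 (input_dim = 1)
```

The count depends on the feature dimension but not on the number of time steps, which is why swapping the two values changes the result.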
I am not able to solve the following error. Please accept my apologies if this sounds naive; I am very new to Keras.
The output of the encoder is actually a complex value, so each output is real and imaginary part, input_h1 is also a complex value with real and imaginary parts represented as a vector. I want to multiply both of them.
# Input bits
input_bits1 = Input(shape=(2,))
input_bits2 = Input(shape=(2,))
# Input Channels
input_h1 = Input(shape=(2,))
input_h2 = Input(shape=(2,))
# Concatenate both inputs
input_bits = keras.layers.concatenate([input_bits1, input_bits2], axis=1)
print(input_bits)
# Create Encoder
m1 = Dense(64, activation='relu')(input_bits)
m2 = Dense(128, activation='relu')(m1)
encoded1 = Dense(2, activation='linear')(m2)
# Normalize the encoded value
encoded = Lambda(lambda x: K.l2_normalize(x, axis=1))(encoded1)
# The output of the encoder is actually a complex value, so each output is real and imaginary part,input_h1 is also a complex value with real and imaginary parts represented as a vector. I want to multiply both of them.
# mt1 is the real part of complex number multiplication
mt1 = encoded[:,0:1]*input_h1[:,0:1] - encoded[:,1:2]*input_h1[:,1:2]
print(mt1)
# nt1 is the imaginary part of the complex number multiplication
nt1 = encoded[:,0:1]*input_h1[:,1:2] + encoded[:,1:2]*input_h1[:,0:1]
print(nt1)
# Concatenate real and imaginary parts to feed into the decoder
mnt2 = keras.layers.concatenate([mt1, nt1], axis=1)
print(mnt2)
# Decoder 1
x5 = Dense(1024, activation='relu')(mnt2)
x6 = Dense(512, activation='relu')(x5)
x7 = Dense(64, activation='relu')(x6)
decoded_UP1 = Dense(2, activation='tanh')(x7)
# Decoder 2
a3 = Dense(1024, activation='relu')(mnt2)
a4 = Dense(512, activation='relu')(a3)
a5 = Dense(64, activation='relu')(a4)
decoded_UP2 = Dense(2, activation='tanh')(a5)
decoded = keras.layers.concatenate([decoded_UP1, decoded_UP2], axis=1)
autoencoder = Model([input_bits1, input_bits2, input_h1, input_h2], decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.summary()
I am getting the following output/error :
AttributeError Traceback (most recent call last)
<ipython-input-9-c3710aa7e060> in <module>()
35 decoded = keras.layers.concatenate([decoded_UP1, decoded_UP2], axis=1)
36
---> 37 autoencoder = Model([input_bits1, input_bits2, input_h1, input_h2], decoded)
38 autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
39 autoencoder.summary()
AttributeError: 'NoneType' object has no attribute '_inbound_nodes'
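As a sanity check on the arithmetic only (it says nothing about how the Keras graph is built), the expressions used for mt1 and nt1 match the usual rule for multiplying two complex numbers stored as (real, imaginary) pairs. The sketch below uses plain Python; the function name complex_mul_parts and the sample values are made up for illustration.

```python
def complex_mul_parts(a, b):
    """Multiply two complex numbers given as (real, imag) pairs."""
    real = a[0] * b[0] - a[1] * b[1]  # same form as mt1
    imag = a[0] * b[1] + a[1] * b[0]  # same form as nt1
    return real, imag

encoded = (0.6, 0.8)    # hypothetical encoder output (real, imag)
channel = (1.5, -2.0)   # hypothetical input_h1 value (real, imag)

real, imag = complex_mul_parts(encoded, channel)

# Compare against Python's built-in complex type
reference = complex(*encoded) * complex(*channel)
print(abs(real - reference.real) < 1e-12, abs(imag - reference.imag) < 1e-12)
```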
To improve the clarity of the code, you could split it into multiple models, for instance one for the encoder and one for the decoder. That way you can call model.summary() on each.
Example for the bits encoder:
from keras.layers import Input, Dense, concatenate
from keras.models import Model
# Input
input_bits1 = Input(shape=(2,))
input_bits2 = Input(shape=(2,))
input_bits = concatenate([input_bits1, input_bits2], axis=1)
# Hidden Layers
encoder_bits_h1 = Dense(64, activation='relu')(input_bits)
encoder_bits_h2 = Dense(128, activation='relu')(encoder_bits_h1)
encoder_bits_h3 = Dense(2, activation='linear')(encoder_bits_h2)
# Create the model
bits_encoder = Model(inputs=[input_bits1, input_bits2], outputs=[encoder_bits_h3])
bits_encoder.summary()
Which returns the bits encoder configuration:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_5 (InputLayer)            (None, 2)            0
__________________________________________________________________________________________________
input_6 (InputLayer)            (None, 2)            0
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 4)            0           input_5[0][0]
                                                                 input_6[0][0]
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 64)           320         concatenate_2[0][0]
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 128)          8320        dense_4[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 2)            258         dense_5[0][0]
==================================================================================================
Total params: 8,898
Trainable params: 8,898
Non-trainable params: 0
__________________________________________________________________________________________________
I built a Sequential model with the VGG16 network at the initial base, for example:
from keras.applications import VGG16
conv_base = VGG16(weights='imagenet',
                  # do not include the top, fully-connected Dense layers
                  include_top=False,
                  input_shape=(150, 150, 3))
from keras import models
from keras import layers
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
# the 3 corresponds to the three output classes
model.add(layers.Dense(3, activation='sigmoid'))
My model looks like this:
model.summary()
Layer (type)                 Output Shape              Param #
=================================================================
vgg16 (Model)                (None, 4, 4, 512)         14714688
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0
_________________________________________________________________
dense_7 (Dense)              (None, 256)               2097408
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 771
=================================================================
Total params: 16,812,867
Trainable params: 16,812,867
Non-trainable params: 0
_________________________________________________________________
Now, I want to get the layer names associated with the vgg16 Model portion of my network. I.e. something like:
layer_name = 'block3_conv1'
filter_index = 0
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
However, since the VGG16 convolutional base is shown as a Model and its layers are not exposed, I get the error:
ValueError: No such layer: block3_conv1
How do I do this?
The key is to first do .get_layer on the Model object, then do another .get_layer on that specifying the specific vgg16 layer, THEN do .output:
layer_output = model.get_layer('vgg16').get_layer('block3_conv1').output
To get the names of the layers from the VGG16 instance, use the following code:
for layer in conv_base.layers:
    print(layer.name)
The names should be the same inside your model. To show this, you could do the following:
print([layer.name for layer in model.get_layer('vgg16').layers])
As Ryan showed, to reach a VGG16 layer you must first get the vgg16 model from your model using the get_layer method.
One can simply store the names of the layers in a list for further use:
layer_names=[layer.name for layer in base_model.layers]
This worked for me :)
for idx in range(len(model.layers)):
    print(model.get_layer(index = idx).name)
Use the layer's summary:
model.get_layer('vgg16').summary()
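The nested lookup described in these answers can be tried without downloading VGG16 weights. The sketch below assumes TensorFlow 2.x with tf.keras and uses a small stand-in sub-model named tiny_base in place of vgg16; the layer names and shapes are made up for illustration.

```python
import tensorflow as tf

# Hypothetical stand-in for the VGG16 base: a small named sub-model
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(8, 8, 3)),
        tf.keras.layers.Conv2D(4, 3, name="block1_conv1"),
        tf.keras.layers.Conv2D(4, 3, name="block1_conv2"),
    ],
    name="tiny_base",
)

# Outer model that wraps the base, as in the question
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(8, 8, 3)),
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(3, activation="sigmoid", name="head"),
    ]
)

# The outer model only lists the sub-model, not its inner layers
outer_names = [layer.name for layer in model.layers]
inner_names = [layer.name for layer in model.get_layer("tiny_base").layers]

# A direct lookup of an inner layer on the outer model fails
try:
    model.get_layer("block1_conv1")
    found_at_top_level = True
except ValueError:
    found_at_top_level = False

# The two-step lookup succeeds
nested = model.get_layer("tiny_base").get_layer("block1_conv1")
print(inner_names, found_at_top_level, nested.name)
```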
I'm trying to implement an encoder-decoder type network in Keras, with Bidirectional GRUs.
The following code seems to be working:
src_input = Input(shape=(5,))
ref_input = Input(shape=(5,))
src_embedding = Embedding(output_dim=300, input_dim=vocab_size)(src_input)
ref_embedding = Embedding(output_dim=300, input_dim=vocab_size)(ref_input)
encoder = Bidirectional(
    GRU(2, return_sequences=True, return_state=True)
)(src_embedding)
decoder = GRU(2, return_sequences=True)(ref_embedding, initial_state=encoder[1])
But when I change the decoder to use the Bidirectional wrapper, the encoder and src_input layers no longer show up in model.summary(). The new decoder looks like this:
decoder = Bidirectional(
    GRU(2, return_sequences=True)
)(ref_embedding, initial_state=encoder[1:])
The output of model.summary() with the Bidirectional decoder.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, 5)                 0
_________________________________________________________________
embedding_2 (Embedding)      (None, 5, 300)            6610500
_________________________________________________________________
bidirectional_2 (Bidirection (None, 5, 4)              3636
=================================================================
Total params: 6,614,136
Trainable params: 6,614,136
Non-trainable params: 0
_________________________________________________________________
Question: Am I missing something when I pass initial_state to the Bidirectional decoder? How can I fix this? Is there any other way to make this work?
It's a bug. The RNN layer implements __call__ so that tensors in initial_state can be collected into a model instance. However, the Bidirectional wrapper did not implement it. So topological information about the initial_state tensors is missing and some strange bugs happen.
I wasn't aware of it when I was implementing initial_state for Bidirectional. It should be fixed now, after this PR. You can install the latest master branch on GitHub to fix it.