I am new to Keras and deep learning and was working with MNIST on Keras. When I created a model using
model = models.Sequential()
model.add(layers.Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(layers.Dense(32,activation ='relu'))
and then printed it, the output was:
<keras.engine.sequential.Sequential at 0x7f3d554f6710>
My question: is there a way to get a more informative printout of a Keras model? For example, when I print the model I would like to see that it has 3 hidden layers, the first with 784 input units and 512 hidden units, the second with 512 input units and 32 hidden units, and so on.
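You can also try plot_model(), which draws the model as a graph:
import tensorflow as tf
# Import from tensorflow.keras so that tf.keras and standalone keras are not mixed
from tensorflow.keras.utils import plot_model
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(tf.keras.layers.Dense(32,activation ='relu'))
# plot_model needs pydot and Graphviz to be installed
plot_model(model, show_shapes=True, show_layer_names=True)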
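model.summary() will print a layer-by-layer summary of the entire model for you. (The output below also lists a third layer, dense_2, with 10 units.)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(512,activation = 'relu',input_shape=(28*28,)))
model.add(Dense(32,activation ='relu'))
model.summary()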
Model: "sequential_1"
Layer (type)       Output Shape    Param #
dense (Dense)      (None, 512)     401920
dense_1 (Dense)    (None, 32)      16416
dense_2 (Dense)    (None, 10)      330
Total params: 418,666
Trainable params: 418,666
Non-trainable params: 0
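If it helps to read the Param # column: for a Dense layer the count is one weight per (input, unit) pair plus one bias per unit. Below is a minimal pure-Python sketch that reproduces the numbers in the summary above. The helper name dense_params is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer widths read off the summary: 784 inputs, then 512, 32 and 10 units.
widths = [28 * 28, 512, 32, 10]

# Pair each layer's input width with its number of units.
per_layer = [dense_params(n_in, n_out) for n_in, n_out in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer)  # one entry per Dense layer
print(total)      # should match "Total params"
```

The same arithmetic is a quick sanity check for any Dense-only model: if a count looks off, the input width of that layer is probably not what you expected.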
I am testing something that involves building a fully connected network (FCNN) dynamically. The idea is to set the number of layers and their neurons from a given list. The dummy code is:
neurons = [10,20,30] # first Dense has 10 neurons, second has 20 and third has 30
inputs = keras.Input(shape=(1024,))
x = Dense(10,activation='relu')(inputs)
for n in neurons:
x = Dense(n,activation='relu')(x)
out = Dense(1,activation='sigmoid')(x)
model = Model(inputs,out)
for layer in model.layers:
To my surprise, it shows nothing. I even compiled the model and ran the functions again, and still nothing came out.
model.summary() always shows the number of trainable and non-trainable params, but not the model structure or the layer names. Why is this happening? Or is this normal?
Regarding model.summary(): don't mix TF 2.x (tf.keras) and standalone Keras in the same program. If I run your model in TF 2.x, I get the expected results.
from tensorflow.keras.layers import *
from tensorflow.keras import Model
from tensorflow import keras
# your code ...
Model: "model"
Layer (type)            Output Shape       Param #
input_1 (InputLayer)    [(None, 1024)]     0
dense (Dense)           (None, 10)         10250
dense_1 (Dense)         (None, 10)         110
dense_2 (Dense)         (None, 20)         220
dense_3 (Dense)         (None, 30)         630
dense_4 (Dense)         (None, 1)          31
Total params: 11,241
Trainable params: 11,241
Non-trainable params: 0
Regarding plotting the model, there are a couple of options you can use when plotting a Keras model. Here is one example:
keras.utils.plot_model(model, show_dtype=True,
show_layer_names=True, show_shapes=True,
I am currently working on a question answering system. I created a synthetic dataset in which the answers contain multiple words, but the answers are not spans of the given context.
Initially, I am planning to test it using a deep learning-based model. But I have some problems building the model.
This is how I vectorized data.
def vectorize(data, word2idx, story_maxlen, question_maxlen, answer_maxlen):
    """ Create the story and question vectors and the label """
    Xs, Xq, Y = [], [], []
    for story, question, answer in data:
        xs = [word2idx[word] for word in story]
        xq = [word2idx[word] for word in question]
        y = [word2idx[word] for word in answer]
        #y = np.zeros(len(word2idx) + 1)
        #y[word2idx[answer]] = 1
        # Collect the encoded sequences; without these appends the lists stay empty
        Xs.append(xs)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(Xs, maxlen=story_maxlen),
            pad_sequences(Xq, maxlen=question_maxlen),
            pad_sequences(Y, maxlen=answer_maxlen))
Below is how I create the model.
# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()
facts_encoder.add(Merge([story_encoder, question_encoder],
mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))
## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder],
mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(Dense(vocab_size,activation= "softmax"))
answer.compile(optimizer="rmsprop", loss="categorical_crossentropy",
answer.fit([Xs_train, Xq_train], Y_train,
batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
validation_data=([Xs_test, Xq_test], Y_test))
And this is the summary of the model:
Layer (type)            Output Shape       Param #
merge_46 (Merge)        (None, 5, 616)     0
lstm_23 (LSTM)          (None, 5, 32)      83072
dropout_69 (Dropout)    (None, 5, 32)      0
flatten_9 (Flatten)     (None, 160)        0
dense_22 (Dense)        (None, 37)         5957
Total params: 93,765.0
Trainable params: 93,765.0
Non-trainable params: 0.0
It gives the following error.
ValueError: Error when checking model target: expected dense_22 to have shape (None, 37) but got array with shape (1000, 2)
I think the error is related to Y_train and Y_test. I probably need to encode them as categorical values, but the answers are sequences of words rather than spans of the context, and I don't know how to do that.
How can I fix it? Any ideas?
When I use sparse_categorical_crossentropy as the loss and add Reshape((2, -1)), the summary is:
Layer (type)             Output Shape       Param #
merge_94 (Merge)         (None, 5, 616)     0
lstm_65 (LSTM)           (None, 5, 32)      83072
dropout_139 (Dropout)    (None, 5, 32)      0
reshape_22 (Reshape)     (None, 2, 80)      0
dense_44 (Dense)         (None, 2, 37)      2997
Total params: 90,805.0
Trainable params: 90,805.0
Non-trainable params: 0.0
The model after modifications
# story encoder. Output dim: (None, story_maxlen, EMBED_HIDDEN_SIZE)
story_encoder = Sequential()
# question encoder. Output dim: (None, question_maxlen, EMBED_HIDDEN_SIZE)
question_encoder = Sequential()
# episodic memory (facts): story * question
# Output dim: (None, question_maxlen, story_maxlen)
facts_encoder = Sequential()
facts_encoder.add(Merge([story_encoder, question_encoder],
mode="dot", dot_axes=[2, 2]))
facts_encoder.add(Permute((2, 1)))
## combine response and question vectors and do logistic regression
answer = Sequential()
answer.add(Merge([facts_encoder, question_encoder],
mode="concat", concat_axis=-1))
answer.add(LSTM(LSTM_OUTPUT_SIZE, return_sequences=True))
answer.add(keras.layers.Reshape((2, -1)))
answer.add(Dense(vocab_size,activation= "softmax"))
answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
answer.fit([Xs_train, Xq_train], Y_train,
batch_size=BATCH_SIZE, nb_epoch=NBR_EPOCHS,
validation_data=([Xs_test, Xq_test], Y_test))
It still gives
ValueError: Error when checking model target: expected dense_46 to have 3 dimensions, but got array with shape (1000, 2)
As far as I understand, Y_train and Y_test consist of indices (not one-hot vectors). If so, change the loss to sparse_categorical_crossentropy:
answer.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
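To make the difference between the two target formats concrete, here is a small pure-Python sketch. The sample indices are made up; only the vocabulary size of 37 is taken from the Dense layer in the summaries above. Index targets have shape (samples, answer_len), while one-hot targets, which is what categorical_crossentropy compares against the softmax output, have shape (samples, answer_len, vocab_size).

```python
def one_hot(index, vocab_size):
    """Return a list of zeros with a single 1 at position `index`."""
    row = [0] * vocab_size
    row[index] = 1
    return row


def to_one_hot_sequences(targets, vocab_size):
    """Turn sequences of word indices into sequences of one-hot vectors."""
    return [[one_hot(i, vocab_size) for i in seq] for seq in targets]


vocab_size = 37                   # width of the final Dense layer in the question
y_sparse = [[4, 12], [0, 36]]     # hypothetical answers: 2 samples, 2 words each

y_one_hot = to_one_hot_sequences(y_sparse, vocab_size)

sparse_shape = (len(y_sparse), len(y_sparse[0]))
one_hot_shape = (len(y_one_hot), len(y_one_hot[0]), len(y_one_hot[0][0]))

print(sparse_shape)    # index targets
print(one_hot_shape)   # one-hot targets
```

Whichever format is used, the target array has to line up with what the loss expects for the model's output shape, which is what both error messages in the question are complaining about.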
As far as I understand, Y_train and Y_test also have a sequence dimension, and the length of the questions (5) does not equal the length of the answers (2). Flatten() removes this dimension. Try replacing Flatten() with Reshape():
# answer.add(Flatten())
answer.add(tf.keras.layers.Reshape((2, -1)))
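For reference, this is roughly how the -1 in Reshape((2, -1)) gets resolved: the total number of elements per sample must stay the same, so the missing dimension is the total divided by the known ones. The function below is a hypothetical pure-Python helper written for illustration, not Keras code.

```python
import math


def resolve_reshape(input_shape, target_shape):
    """Fill in a single -1 in target_shape so the element count is preserved."""
    if list(target_shape).count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    total = math.prod(input_shape)
    known = math.prod(d for d in target_shape if d != -1)
    if -1 in target_shape:
        if total % known != 0:
            raise ValueError("cannot reshape %s into %s" % (input_shape, target_shape))
        return tuple(total // known if d == -1 else d for d in target_shape)
    if known != total:
        raise ValueError("cannot reshape %s into %s" % (input_shape, target_shape))
    return tuple(target_shape)


lstm_output = (5, 32)                      # per-sample LSTM output from the summary
reshaped = resolve_reshape(lstm_output, (2, -1))
print(reshaped)                            # per-sample shape after Reshape
```

Note that this only regroups the 160 values per sample into two rows of 80, so each row mixes features from several time steps. It fixes the shape, but whether that grouping is meaningful for the task is a separate question.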
I'm trying to make a prediction with my model, where the shape of the array is (3084, 32, 32).
I get a ValueError (the error message was posted as an image).
Here is my model:
model.add(Dense(1028, input_shape = (3084,), activation = "sigmoid"))
model.add(Dense(514, activation="sigmoid"))
model.add(Dense(len(lb.classes_), activation="softmax"))
Model: "sequential_21"
Layer (type)        Output Shape     Param #
dense_57 (Dense)    (None, 1028)     3171380
dense_58 (Dense)    (None, 514)      528906
dense_59 (Dense)    (None, 4)        2060
Total params: 3,702,346
Trainable params: 3,702,346
Non-trainable params: 0
I am trying to fit it using:
opt = SGD(lr = 0.01)
model.compile(loss = "categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
H = model.fit(train_X, train_Y, validation_data = (test_X, test_Y), epochs = 75, batch_size = 32)
You need to specify the input shape correctly. The following model should work:
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
model = Sequential()
model.add(Dense(1028, input_shape = (32,32), activation = "sigmoid"))
model.add(Dense(514, activation="sigmoid"))
model.add(Dense(4, activation="softmax"))
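One thing worth checking with input_shape=(32, 32): a Keras Dense layer acts on the last axis only, so every leading axis is carried through to the output. The pure-Python sketch below traces the per-sample shapes (the helper names are made up for illustration). If the labels are a single vector of 4 classes per image, flattening each image to 1024 values first is one possible alternative; that is an assumption about the data, since the question does not show train_Y.

```python
def dense_output_shape(input_shape, units):
    """Dense replaces the last axis with `units` and keeps the other axes."""
    return tuple(input_shape[:-1]) + (units,)


def trace(input_shape, layer_units):
    """Per-sample output shape after a stack of Dense layers."""
    shape = tuple(input_shape)
    for units in layer_units:
        shape = dense_output_shape(shape, units)
    return shape


units = [1028, 514, 4]                      # layer widths from the answer

stacked = trace((32, 32), units)            # Dense applied to each of the 32 rows
flattened = trace((32 * 32,), units)        # image flattened to 1024 values first

print(stacked)     # one 4-way prediction per row
print(flattened)   # one 4-way prediction per image
```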
I need to build a model that takes a 2D binary matrix as input, (37, 10) for instance, and returns a real-valued 2D matrix of the same shape. I wrote this code, but I am not sure what X (in the output layer) should be equal to.
Please let me know whether you think my model is correctly defined and what to write instead of X.
Thank you
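I updated your code so that the output has the same shape as the input. We need to add a Flatten layer at the start of the model and a Reshape layer at the end. In short, X should equal the number of elements in input_shape.
from tensorflow.keras.layers import Dense, Flatten,Reshape
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
input_shape = (37, 10)  # shape given in the question
num_elm =input_shape[0]*input_shape[1]
model = Sequential()
# Flatten and Reshape lines reconstructed from the description above and the summary below
model.add(Flatten(input_shape=input_shape))
model.add(Dense(32, activation='linear'))
model.add(Dense(32, activation='linear'))
model.add(Dense(num_elm, activation='linear'))
model.add(Reshape(input_shape))
model.summary()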
Model: "sequential_5"
Layer (type)           Output Shape       Param #
flatten_4 (Flatten)    (None, 370)        0
dense_14 (Dense)       (None, 32)         11872
dense_15 (Dense)       (None, 32)         1056
dense_16 (Dense)       (None, 370)        12210
reshape (Reshape)      (None, 37, 10)     0
Total params: 25,138
Trainable params: 25,138
Non-trainable params: 0
X should be 10, although fully connected layers may not be well suited to 2-D data in the first place. Also, are you sure the metric should be accuracy?
Here's your model with the correct output shape.
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
import tensorflow as tf
import numpy as np
Model: "sequential_3"
Layer (type)        Output Shape       Param #
dense_8 (Dense)     (None, 37, 32)     352
dense_9 (Dense)     (None, 37, 32)     1056
dense_10 (Dense)    (None, 37, 10)     330
Total params: 1,738
Trainable params: 1,738
Non-trainable params: 0
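The two answers give the same output shape but quite different models. A rough way to compare them is to count parameters: with Flatten every output cell can depend on every input cell, while Dense applied directly to the (37, 10) input reuses one set of weights for each of the 37 rows. The sketch below redoes the arithmetic behind both summaries in plain Python; the helper names are hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def stack_params(n_inputs, layer_units):
    """Total parameters of consecutive Dense layers fed with n_inputs features."""
    total = 0
    for units in layer_units:
        total += dense_params(n_inputs, units)
        n_inputs = units  # this layer's output width feeds the next layer
    return total


rows, cols = 37, 10

# Flatten -> Dense(32) -> Dense(32) -> Dense(370) -> Reshape
flatten_total = stack_params(rows * cols, [32, 32, rows * cols])

# Dense(32) -> Dense(32) -> Dense(10), applied along the last axis of each row
per_row_total = stack_params(cols, [32, 32, cols])

print(flatten_total, per_row_total)
```

Which one is appropriate depends on whether rows of the matrix should influence each other; the per-row version cannot model that at all.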
I am trying to understand model.summary() in Keras. I have the following code:
model = Sequential([
When I call print(model.summary()), I get this output:
Layer (type)        Output Shape    Param #
dense_16 (Dense)    (None, 3)       21
dense_17 (Dense)    (None, 3)       12
dense_18 (Dense)    (None, 1)       4
Total params: 37
Trainable params: 37
Non-trainable params: 0
I cannot understand what dense_16, dense_17 and dense_18 mean in relation to the layers of the model I described.
Those are just the names of the layers, which Keras generated automatically. To name layers manually, pass the keyword argument name='my_custom_name' to each layer that you want to name. Note that layer names must be unique within a model.
Layer names are useful for debugging and to get specific layers in code, for example using model.get_layer(layer_name).
These are just the names of your layers. If you do not explicitly specify the layer names, they will just be named and numbered automatically.
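As a mental model only, automatic naming can be pictured as a counter per layer type. The toy class below is not Keras's implementation; NameRegistry and unique_name are invented names, and the "dense, dense_1, dense_2" pattern is copied from the TF 2.x summaries shown earlier. A name like dense_16 presumably just means that earlier Dense layers had already been created in the same session.

```python
from collections import defaultdict


class NameRegistry:
    """Toy stand-in for automatic layer naming (illustrative only)."""

    def __init__(self):
        self._counts = defaultdict(int)  # how many auto names were issued per prefix
        self._used = set()               # every name handed out so far

    def unique_name(self, prefix, name=None):
        # An explicit name is used as-is, but it must not clash with another layer.
        if name is not None:
            if name in self._used:
                raise ValueError("duplicate layer name: %r" % name)
            self._used.add(name)
            return name
        # Otherwise number the prefix: dense, dense_1, dense_2, ...
        while True:
            n = self._counts[prefix]
            self._counts[prefix] += 1
            candidate = prefix if n == 0 else "%s_%d" % (prefix, n)
            if candidate not in self._used:
                self._used.add(candidate)
                return candidate


registry = NameRegistry()
names = [registry.unique_name("dense") for _ in range(3)]
custom = registry.unique_name("dense", name="my_custom_name")
print(names, custom)
```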