I'm trying to replicate the example on Keras's website:
# as the first layer in a Sequential model
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
# now model.output_shape == (None, 32)
# note: `None` is the batch dimension.
# for subsequent layers, no need to specify the input size:
model.add(LSTM(16))
But when I run the following:
# only lines I've added:
from keras.models import Sequential
from keras.layers import Dense, LSTM
# all else is the same:
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
model.add(LSTM(16))
I get the following error:
ValueError: Input 0 is incompatible with layer lstm_4: expected ndim=3, found ndim=2
Versions:
Keras: '2.0.5'
Python: '3.4.3'
Tensorflow: '1.2.1'
By default, an LSTM layer returns only the last output of the sequence, so the 3D (batch, timesteps, features) structure is lost and the second LSTM receives a 2D input. To change that, pass return_sequences=True:
model.add(LSTM(32, input_shape=(10, 64), return_sequences=True))
This makes the LSTM return the whole sequence of outputs, which the next LSTM layer can consume.
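For reference, here is a minimal sketch of the full stacked model with that fix applied (same layer sizes as in the example above):
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
# First LSTM returns the whole sequence: output shape (None, 10, 32)
model.add(LSTM(32, input_shape=(10, 64), return_sequences=True))
# Second LSTM now receives a 3D input and returns only its last output: (None, 16)
model.add(LSTM(16))
model.summary()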
I'm having difficulty printing model.summary() after using the Sequential class in Keras to build a structure like so:
embedding_inputs* numerical_input
\ /
\ /
-- CONCATENATE--
|
DENSE (50) #1
DENSE (50) #2
DENSE (50) #3
DENSE (50) #4
DENSE (1) #output
* embedding_inputs are a bunch of concatenated sequential models from
categorical variables. For the sake of simplicity,
let's pretend there is only one.
I know that without the embedding layer(s), my model works and looks fine. But after adding an embedding layer and a Concatenate layer, I'm told that I need to build the model, or that my output tensors "must be the output of a Keras Layer."
I'm utterly confused at this point. (I'm used to the functional API but, embarrassingly, am having trouble with the Sequential one and would like to learn.)
categorical = Sequential()
categorical.add(Embedding(
input_dim=len(df_train['mon'].astype('category').cat.categories),
output_dim=2,
input_length=1))
categorical.add(Flatten())
numeric = Sequential()
numeric.add(InputLayer(input_shape=(1, len(numeric_column_names)), dtype='float32', name='numerical_in'))
model = Sequential()
model.add(Concatenate([numeric, categorical]))
model.add(Dense(50, input_dim=50, kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=50, kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=50, kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=50, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal')) #output layer (1 number)
If I attempt to use model.summary() without a build:
ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data. Or specify input_shape or batch_input_shape in the first layer for automatic build.
If I attempt to use model.build() first, I get a message like:
ValueError: Output tensors to a Sequential must be the output of a Keras `Layer` (thus holding past layer metadata). Found: None
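For comparison only, and since the question mentions being more comfortable with the functional API, here is a minimal functional-API sketch of the same architecture. This is my own illustration, not part of the question: the input sizes 12 and 3 are hypothetical placeholders for the real category count of df_train['mon'] and len(numeric_column_names).
from keras.layers import Input, Embedding, Flatten, Dense, Concatenate
from keras.models import Model

# Hypothetical placeholder sizes (see note above)
categorical_in = Input(shape=(1,), name='categorical_in')
embedded = Embedding(input_dim=12, output_dim=2, input_length=1)(categorical_in)
embedded = Flatten()(embedded)

numerical_in = Input(shape=(3,), name='numerical_in')

merged = Concatenate()([embedded, numerical_in])
hidden = merged
for _ in range(4):
    hidden = Dense(50, kernel_initializer='normal', activation='relu')(hidden)
output = Dense(1, kernel_initializer='normal')(hidden)

model = Model(inputs=[categorical_in, numerical_in], outputs=output)
model.summary()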
I am trying to train a convolutional neural network on Google Colab for a medical classification problem. The data set is 89 images of size 256x256x256 for training and 11 for testing. When I try to train the model, I get an error. Here is my code:
import keras
from keras import optimizers
import keras.models
from keras.models import Sequential
import keras.layers
from keras.layers.convolutional import Conv3D
from keras.layers.convolutional import MaxPooling3D
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import Dense
from keras import metrics
model = Sequential()
model.add(Conv3D(64, kernel_size=(3,3,3),
activation='relu',
input_shape=(10,1,256,256,256)))
model.add(Conv3D(64, (2,2,2), activation='relu'))
model.add(MaxPooling3D(pool_size=(2,2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
opt=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(opt, loss='categorical_crossentropy', metrics=['mae','acc'])
model.fit(x=train_data, y=train_labels,epochs=100, batch_size=10, verbose=2 ,callbacks=None, validation_split=0.0, validation_data=(validation_data,validation_labels), shuffle=True)
This is the error I get:
ValueError: Input 0 is incompatible with layer conv3d_56: expected ndim=5, found ndim=6
Assuming you are using the channels_first data_format, the input_shape argument to the first Conv3D layer should be (CHANNELS, HEIGHT, WIDTH, DEPTH). Your input shape tuple has length 5, which is not what the Conv3D layer expects. Assuming the batch size (of 10) was included by mistake, making the following change should fix the problem:
model.add(Conv3D(64, kernel_size=(3,3,3),
activation='relu',
input_shape=(1,256,256,256)))
Edit
If you are using the channels_last data format, your input_shape should be (HEIGHT, WIDTH, DEPTH, CHANNELS). Assuming your images have 1 channel, the line above becomes:
model.add(Conv3D(64, kernel_size=(3,3,3),
activation='relu',
input_shape=(256,256,256, 1)))
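As a sanity check on the data side (the arrays below are illustrative placeholders, assuming channels_last and 89 single-channel training volumes as described in the question), the training data passed to fit() should carry the channel axis explicitly, while Keras adds the batch axis itself:
import numpy as np

# Hypothetical placeholder arrays, just to show the expected shapes
train_data = np.zeros((89, 256, 256, 256, 1), dtype='float32')  # (samples, H, W, D, channels)
train_labels = np.zeros((89, 2), dtype='float32')               # one-hot labels for 2 classes
# input_shape=(256, 256, 256, 1) excludes the batch dimension, so each batch
# fed to the network has ndim=5, which is exactly what Conv3D expects.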
I have defined a SimpleRNN in Keras with the following code:
# define RNN architecture
from keras.layers import Input
from keras.models import Model
from keras.layers import SimpleRNN
from keras.models import Sequential
model = Sequential()
model.add(SimpleRNN(units = 10,
return_sequences=False,
unroll=True,
input_shape=(6, 2)))
model.compile(loss='mse',
optimizer='rmsprop',
metrics=['accuracy'])
model.summary()
Then I feed it input data of shape (batch_size, 6, 2), i.e. 6 timesteps, each with two features. I therefore expect 6 SimpleRNN cells.
When launching the training, I get the following error message:
Error when checking target: expected simple_rnn_2 to have shape (10,) but got array with shape (1,)
and I don't understand why.
My understanding is that each RNN cell, except the first one, is fed both the output of the previous RNN cell and the input for the new timestep.
So in this case, I expect the second RNN cell to receive a vector of shape (10,) from the first RNN cell, since units = 10. How come it gets a (1,)-sized vector?
What is strange is that adding a Dense layer to the model solves the issue. So the following architecture:
# define RNN architecture
from keras.layers import Input
from keras.models import Model
from keras.layers import SimpleRNN, Dense
from keras.models import Sequential
model = Sequential()
model.add(SimpleRNN(units = 10,
return_sequences=False,
unroll=False,
input_shape=(6, 2)))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse',
optimizer='rmsprop',
metrics=['accuracy'])
model.summary()
does not throw an error. Any idea why?
Assuming you are actually training the model (you did not include that code), the problem is that you are feeding it target outputs of shape (1,), while the SimpleRNN produces an output of shape (10,) and therefore expects targets of that shape. You can look up the docs here: https://keras.io/layers/recurrent/
The docs state that the output size of the SimpleRNN equals units, which is 10; each unit produces one output.
The second sample works because the added Dense layer reduces the output size to (1,). Now the model can accept your training targets, and they are backpropagated through the network.
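A small sketch of the shapes involved (the arrays are illustrative placeholders; the batch size of 32 is my own assumption):
import numpy as np

x = np.random.random((32, 6, 2))         # (batch, timesteps, features) fed to the SimpleRNN
y_rnn_only = np.random.random((32, 10))  # without a Dense layer the model outputs (batch, 10),
                                         # so the targets must have that shape as well
y_with_dense = np.random.random((32, 1)) # with the extra Dense(1) the output is (batch, 1),
                                         # so one scalar target per sample is fine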
I'm new to Keras.
My neural network structure is shown here:
[image: neural network structure]
My idea is:
import keras.backend as KBack
import tensorflow as tf
#...some code here
model = Sequential()
hidden_units = 4
layer1 = Dense(
hidden_units,
input_dim=len(InputIndex),
activation='sigmoid'
)
model.add(layer1)
# layer1_bias = layer1.get_weights()[1][0]
layer2 = Dense(
1, activation='sigmoid',
use_bias=False
)
model.add(layer2)
# KBack.bias_add(model.output, layer1_bias[0])
I know this doesn't work because layer1_bias[0] is not a tensor, but I have no idea how to fix it. Or maybe somebody has another solution.
Thanks.
You get the error because bias_add expects a Tensor and you are passing it a float (the actual value of the bias). Also, be aware that your hidden layer actually has 3 biases (one for each node). If you want to add the bias of the first node to your output layer, this should work:
import keras.backend as K
from keras.layers import Dense, Activation
from keras.models import Sequential
model = Sequential()
layer1 = Dense(3, input_dim=2, activation='sigmoid')
layer2 = Dense(1, activation=None, use_bias=False)
activation = Activation('sigmoid')
model.add(layer1)
model.add(layer2)
K.bias_add(model.output, layer1.bias[0:1]) # slice like this to not lose a dimension
model.add(activation)
print(model.summary())
Note that, to be 'correct' (according to the definition of what a dense layer does), you should add the bias first, then the activation.
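To illustrate why the order matters (my own numeric example, not from the answer): a dense layer computes activation(Wx + b), so adding the bias after the activation would compute activation(Wx) + b instead, which is a different function:
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, w, b = 2.0, 0.5, 1.0
print(sigmoid(w * x + b))  # bias before activation: ~0.88
print(sigmoid(w * x) + b)  # bias after activation: ~1.73, no longer bounded by 1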
Also, your code is not really in line with the picture of your network. In the picture, one single shared bias is added to each of the nodes in the network. You can do this with the functional API. The idea is to disable the use of biases in the hidden layer and the output layers, and to manually add a bias variable that you define yourself and that will be shared by the layers. I'm using tensorflow for tf.add() since that supports broadcasting:
from keras.layers import Dense, Lambda, Input, Add
from keras.models import Model
import keras.backend as K
import tensorflow as tf
# Define the shared bias as a custom keras variable
shared_bias = K.variable(value=[0], name='shared_bias')
input_layer = Input(shape=(2,))
# Disable biases in the hidden layer
dense_1 = Dense(units=3, use_bias=False, activation=None)(input_layer)
# Manually add the shared bias
dense_1 = Lambda(lambda x: tf.add(x, shared_bias))(dense_1)
# Disable bias in output layer
output_layer = Dense(units=1, use_bias=False)(dense_1)
# Manually add the bias variable
output_layer = Lambda(lambda x: tf.add(x, shared_bias))(output_layer)
model = Model(inputs=input_layer, outputs=output_layer)
print(model.summary())
This assumes that your shared bias is not trainable though.
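If you do want the shared bias to be trainable, one possible sketch (my own assumption, not part of the answer above) is a small custom layer holding a single weight; sharing the layer instance shares that weight between both positions:
from keras.layers import Layer, Dense, Input
from keras.models import Model

class SharedBias(Layer):
    # Adds one trainable scalar bias to every unit of its input.
    def build(self, input_shape):
        self.bias = self.add_weight(name='shared_bias', shape=(1,),
                                    initializer='zeros', trainable=True)
        super(SharedBias, self).build(input_shape)

    def call(self, inputs):
        return inputs + self.bias  # broadcasts the scalar over all units

shared_bias = SharedBias()  # one instance, so the weight is shared

input_layer = Input(shape=(2,))
hidden = Dense(units=3, use_bias=False, activation=None)(input_layer)
hidden = shared_bias(hidden)
output_layer = Dense(units=1, use_bias=False)(hidden)
output_layer = shared_bias(output_layer)

model = Model(inputs=input_layer, outputs=output_layer)
print(model.summary())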
I am running Keras on top of TensorFlow, trying to implement a multi-dimensional LSTM network to predict a linear continuous target variable, a single value for each example (return_sequences=False).
My sequence length is 10 and the number of features (dim) is 11.
This is what I run:
import pprint, pickle
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
# Input sequence
wholeSequence = [[0,0,0,0,0,0,0,0,0,2,1],
[0,0,0,0,0,0,0,0,2,1,0],
[0,0,0,0,0,0,0,2,1,0,0],
[0,0,0,0,0,0,2,1,0,0,0],
[0,0,0,0,0,2,1,0,0,0,0],
[0,0,0,0,2,1,0,0,0,0,0],
[0,0,0,2,1,0,0,0,0,0,0],
[0,0,2,1,0,0,0,0,0,0,0],
[0,2,1,0,0,0,0,0,0,0,0],
[2,1,0,0,0,0,0,0,0,0,0]]
# Preprocess Data:
wholeSequence = np.array(wholeSequence, dtype=float) # Convert to NP array.
data = wholeSequence
target = np.array([20])
# Reshape training data for Keras LSTM model
data = data.reshape(1, 10, 11)
target = target.reshape(1, 1, 1)
# Build Model
model = Sequential()
model.add(LSTM(11, input_shape=(10, 11), unroll=True, return_sequences=False))
model.add(Dense(11))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(data, target, nb_epoch=1, batch_size=1, verbose=2)
and I get the error: ValueError: Error when checking target: expected activation_1 to have 2 dimensions, but got array with shape (1, 1, 1)
I'm not sure what shape the activation layer should receive.
Any help appreciated,
thanks
If you just want a single linear output neuron, you can simply use a Dense layer with one unit and supply the activation there. Your target can then be a plain vector, without the reshape. I adjusted your example code to make it work:
wholeSequence = np.array(wholeSequence, dtype=float) # Convert to NP array.
data = wholeSequence
target = np.array([20])
# Reshape training data for Keras LSTM model
data = data.reshape(1, 10, 11)
# Build Model
model = Sequential()
model.add(LSTM(11, input_shape=(10, 11), unroll=True, return_sequences=False))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(data, target, nb_epoch=1, batch_size=1, verbose=2)
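For completeness, a quick sanity check of the shapes involved in the adjusted model (my own addition, not part of the original answer):
print(model.predict(data).shape)  # (1, 1): one sample, one linear output
print(target.shape)               # (1,): one scalar target per sample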