This question is about TensorFlow (and TensorBoard) version 2.2rc3, but I have experienced the same issue with 2.1.
Consider the following weird code:
from datetime import datetime
import tensorflow as tf
from tensorflow import keras
inputs = keras.layers.Input(shape=(784, ))
x1 = keras.layers.Dense(32, activation='relu', name='Model/Block1/relu')(inputs)
x1 = keras.layers.Dropout(0.2, name='Model/Block1/dropout')(x1)
x1 = keras.layers.Dense(10, activation='softmax', name='Model/Block1/softmax')(x1)
x2 = keras.layers.Dense(32, activation='relu', name='Model/Block2/relu')(inputs)
x2 = keras.layers.Dropout(0.2, name='Model/Block2/dropout')(x2)
x2 = keras.layers.Dense(10, activation='softmax', name='Model/Block2/softmax')(x2)
x3 = keras.layers.Dense(32, activation='relu', name='Model/Block3/relu')(inputs)
x3 = keras.layers.Dropout(0.2, name='Model/Block3/dropout')(x3)
x3 = keras.layers.Dense(10, activation='softmax', name='Model/Block3/softmax')(x3)
x4 = keras.layers.Dense(32, activation='relu', name='Model/Block4/relu')(inputs)
x4 = keras.layers.Dropout(0.2, name='Model/Block4/dropout')(x4)
x4 = keras.layers.Dense(10, activation='softmax', name='Model/Block4/softmax')(x4)
outputs = x1 + x2 + x3 + x4
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
model.compile(loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=keras.optimizers.RMSprop(),
metrics=['accuracy'])
logdir = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
model.fit(x_train, y_train,
batch_size=64,
epochs=5,
validation_split=0.2,
callbacks=[tensorboard_callback])
When running it and looking at the graph created in TensorBoard, you will see that the addition operations are rendered in a really ugly way.
When replacing the line
outputs = x1 + x2 + x3 + x4
with the lines:
outputs = keras.layers.add([x1, x2], name='Model/add/add1')
outputs = keras.layers.add([outputs, x3], name='Model/add/add2')
outputs = keras.layers.add([outputs, x4], name='Model/add/add3')
a much nicer graph is created by TensorBoard (in the second screenshot, the Model as well as one of the inner blocks is shown in detail).
The difference between the two representations of the model is that in the second one, we could name the addition operations and group them.
I could not find any way to name these operations except by using keras.layers.add(). In this model the problem does not look that critical, as the model is simple and it is easy to replace + with keras.layers.add(). However, in more complex models it can become a real pain. For example, operations such as t[:, start:end] get translated into complex calls to tf.strided_slice(), so my models' representations are quite messy, with plenty of cryptic gather, stride and concat operations.
I wonder if there is a way to wrap / group such operations to allow nicer graphs in TensorBoard.
One suggested alternative is to use the built-in Add layer:
outputs = keras.layers.Add()([x1, x2, x3, x4])
Following the hint from Marco Cerliani, a Lambda layer is indeed very useful here. The following code will group the + operations nicely:
outputs = keras.layers.Lambda(lambda x: x[0] + x[1], name='Model/add/add1')([x1, x2])
outputs = keras.layers.Lambda(lambda x: x[0] + x[1], name='Model/add/add2')([outputs, x3])
outputs = keras.layers.Lambda(lambda x: x[0] + x[1], name='Model/add/add3')([outputs, x4])
Or, if you need to wrap slicing operations, the following code will group the t[...] calls nicely:
x1 = keras.layers.Lambda(lambda x: x[:, 0:5], name='Model/stride_concat/stride1')(x1) # instead of x1 = x1[:, 0:5]
x2 = keras.layers.Lambda(lambda x: x[:, 5:10], name='Model/stride_concat/stride2')(x2) # instead of x2 = x2[:, 5:10]
outputs = keras.layers.concatenate([x1, x2], name='Model/stride_concat/concat')
This answers the question asked. But actually, there is still an open issue that is described in another question: 'TensorFlowOpLayer messes up the TensorBoard graphs'
Related
I have two inputs X1, X2 and corresponding labels Y. I want to split the data into training and validation sets using scikit-learn's train_test_split. My X1 is of shape (1920, 12) and X2 is of shape (1920, 51, 5). The code I use is:
from sklearn.model_selection import train_test_split
X1 = np.load('x_train.npy')
X2 = np.load('oneHot.npy')
y_train = np.load('y_train.npy')
X = np.array(list(zip(X1, X2))) ### To zip the two inputs.
X_train, X_valid, y_train, y_valid = train_test_split(X, y_train,test_size=0.2)
X1_train, oneHot_train = X_train[:, 0], X_train[:, 1]
However, when I check the shapes of X1_train and oneHot_train, they are (1536,), whereas X1_train should be (1536, 12) and oneHot_train should be (1536, 51, 5). What am I doing wrong here? Insights will be appreciated.
train_test_split can take any number of arrays to split, so you can feed x1 and x2 directly, like below. (Zipping two arrays of different shapes builds an object array of shape (1920, 2), which is why the column slices come back as (1536,) with the inner dimensions lost.)
x1 = np.random.rand(1920,12)
x2 = np.random.rand(1920,51,5)
y = np.random.choice([0,1], 1920)
x1_train, x1_test, x2_train, x2_test, y_train, y_test = train_test_split(\
x1, x2, y ,test_size=0.2)
x1_train.shape, x1_test.shape
# ((1536, 12), (384, 12))
x2_train.shape, x2_test.shape
# ((1536, 51, 5), (384, 51, 5))
y_train.shape, y_test.shape
# ((1536,), (384,))
For a concrete example, here is some code:
from keras.models import Model
from keras.layers import Input, LSTM, Flatten, Dense, concatenate
input1 = Input(shape = (3,2))
x1 = LSTM(8)(input1)
x1 = Flatten()(x1)
x1 = Dense(10)(x1)
input2 = Input(shape = (3,2))
x2 = LSTM(8)(input2)
x2 = Flatten()(x2)
x2 = Dense(10)(x2)
x = concatenate([x1, x2])
y = Dense(1)(x)
model = Model([input1, input2], y)
model.compile(optimizer='Adam',
loss='mean_squared_error')
Is it possible to use a Keras generator to fit this model with two numpy arrays?
E.g.
X1 = np.random.randn(100, 2)
X2 = np.random.randn(100, 2)
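One way to do this is with a Python generator that yields batches in the ([input1_batch, input2_batch], y_batch) format that multi-input Keras models expect. A minimal sketch (the function name, batch size, and array shapes are assumptions, not from the question):

import numpy as np

def two_input_generator(X1, X2, y, batch_size=32):
    # Yield ([x1_batch, x2_batch], y_batch) indefinitely; the list of two
    # arrays matches the model's two Input layers.
    n = len(y)
    while True:
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            yield [X1[batch], X2[batch]], y[batch]

# Hypothetical usage, assuming X1 and X2 have shape (n_samples, 3, 2) to match
# the model above and y has shape (n_samples,):
# model.fit_generator(two_input_generator(X1, X2, y),
#                     steps_per_epoch=len(y) // 32, epochs=5)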
I have a dataset that contains 1 value of y_true per case. I want to build a DNN that outputs 3 coefficients that will later be used as follows to create y_pred
y_pred = 4*coeff_1 + 5*coeff_2 + 6*coeff_3
I am using Keras, and when I tried to define a custom objective like this
from keras.callbacks import ModelCheckpoint
from keras.layers import advanced_activations
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
import keras.backend as K
def custom_objective(layer):
    return K.sum(layer.output)
NN_model = Sequential()
# The Input Layer :
NN_model.add(Dense(X_train.shape[1], kernel_initializer='normal',input_dim = X_train.shape[1], activation='relu'))
# The Hidden Layers :
NN_model.add(Dense(20, kernel_initializer='normal',activation='elu'))
NN_model.add(Dense(20, kernel_initializer='normal',activation='elu'))
output_layer = Dense(1, kernel_initializer='normal',activation='linear')
# The Output Layer :
NN_model.add(output_layer)
# Compile the network :
NN_model.compile(loss=custom_objective(output_layer), optimizer='Adamax', metrics=['mean_absolute_error'])
NN_model.summary()
NN_model.fit(X_train, y_train, epochs=10,verbose = 1)
print('NN train = ', mean_absolute_error(y_train , NN_model.predict(X_train)))
predictions = NN_model.predict(X_test)
MAE = mean_absolute_error(y_test , predictions)
print('NN MAE = ', MAE)
I get the following error:
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use
if t is not None: instead of if t: to test if a tensor is defined,
and use TensorFlow ops such as tf.cond to execute subgraphs
conditioned on the value of a tensor.
So my question is: how can I define a DNN that takes 1 y_true per data point, outputs 3 values, and combines them linearly into a y_pred that is used to compute the loss and train the network?
Thank you for your time.
How about something along these lines?
from keras.models import Model
from keras.layers import Dense, Input, Add, Lambda
def model(inp_size):
    inp = Input(shape=(inp_size, 1))
    # Three independent branches, one per coefficient.
    x1 = Dense(20, activation='elu')(inp)
    x1 = Dense(20, activation='elu')(x1)
    x1 = Dense(1, activation='linear')(x1)
    x2 = Dense(20, activation='elu')(inp)
    x2 = Dense(20, activation='elu')(x2)
    x2 = Dense(1, activation='linear')(x2)
    x3 = Dense(20, activation='elu')(inp)
    x3 = Dense(20, activation='elu')(x3)
    x3 = Dense(1, activation='linear')(x3)
    # Scale each branch by its fixed constant and sum them:
    # y_pred = 4*coeff_1 + 5*coeff_2 + 6*coeff_3
    x1 = Lambda(lambda x: x * 4.0)(x1)
    x2 = Lambda(lambda x: x * 5.0)(x2)
    x3 = Lambda(lambda x: x * 6.0)(x3)
    out = Add()([x1, x2, x3])
    return Model(inputs=inp, outputs=out)
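One detail to note: with Input(shape=(inp_size, 1)) the Dense layers act on the last axis, so the model outputs inp_size values per sample. If a single y_pred per case is wanted, as the question describes, a flat input gives that. A hedged sketch under this assumption, reusing the imports above (X_train and y_train are hypothetical names):

def coeff_model(inp_size):
    # Same idea as above (branches shortened), but with a flat feature vector
    # as input so the final Add() yields one value per sample:
    # y_pred = 4*coeff_1 + 5*coeff_2 + 6*coeff_3
    inp = Input(shape=(inp_size,))
    c1 = Dense(20, activation='elu')(inp)
    c1 = Dense(1, activation='linear')(c1)
    c2 = Dense(20, activation='elu')(inp)
    c2 = Dense(1, activation='linear')(c2)
    c3 = Dense(20, activation='elu')(inp)
    c3 = Dense(1, activation='linear')(c3)
    out = Add()([Lambda(lambda t: 4.0 * t)(c1),
                 Lambda(lambda t: 5.0 * t)(c2),
                 Lambda(lambda t: 6.0 * t)(c3)])
    m = Model(inputs=inp, outputs=out)
    m.compile(optimizer='adam', loss='mean_squared_error')
    return m

# Hypothetical usage: coeff_model(X_train.shape[1]).fit(X_train, y_train, epochs=10)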
I have designed this toy problem to understand the working of SimpleRNN in Keras.
My input sequence is:
[x1,x2,x3,x4,x5]
and the corresponding output is:
[(0+x1)%2,(x1+x2)%2,(x2+x3)%2,(x3+x4)%2,(x4+x5)%2)]
My code is:
import numpy as np
import random
from scipy.ndimage.interpolation import shift
def generate_sequence():
    max_len = 5
    x = np.random.randint(1, 100, max_len)
    shifted_x = shift(x, 1, cval=0)
    y = (x + shifted_x) % 2
    return x.reshape(max_len, 1), y.reshape(max_len, 1), shifted_x.reshape(max_len, 1)
X_train = np.zeros((100,5,1))
y_train = np.zeros((100,5,1))
for i in range(100):
    x, y, z = generate_sequence()
    X_train[i] = x
    y_train[i] = y
X_test = np.zeros((100,5,1))
y_test = np.zeros((100,5,1))
for i in range(100):
    x, y, z = generate_sequence()
    X_test[i] = x
    y_test[i] = y
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
model = Sequential()
model.add(SimpleRNN(3,input_shape=(5,1),return_sequences=True,name='rnn'))
model.add(Dense(1,activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
optimizer='sgd',
metrics=['accuracy'])
print('Train...')
model.fit(X_train, y_train,
batch_size=70,
epochs=200,verbose=0,validation_split=0.3)
score, acc = model.evaluate(X_test, y_test,
batch_size=70)
print('Test score:', score)
print('Test accuracy:', acc)
When I train this SimpleRNN I only get an accuracy of 50%, even though each item in the sequence only depends on the previous item. Why is the RNN struggling to learn this?
100/100 [==============================] - 0s 37us/step
Test score: 0.6975522041320801
Test accuracy: 0.5120000243186951
UPDATE:
It turns out the mod function is very hard to model. When I switched to a simpler data generation strategy, such as y[t] = x[t] < x[t-1], the model reached about 80% binary accuracy.
def generate_rnn_sequence():
    max_len = 5
    x = np.random.randint(1, 100, max_len)
    shifted_x = shift(x, 1, cval=0)
    y = (x < shifted_x).astype(float)
    return x.reshape(5, 1), y.reshape(5, 1)
So how do I model a mod function using an RNN?
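One possible direction (a hedged suggestion, not from the original thread): since y[t] is just the XOR of the parities of x[t] and x[t-1], feeding the network the parity x % 2 instead of the raw integer removes the hard part (extracting parity from a magnitude) and leaves a relation a small RNN can represent:

def generate_parity_sequence():
    max_len = 5
    x = np.random.randint(1, 100, max_len)
    shifted_x = shift(x, 1, cval=0)
    y = (x + shifted_x) % 2
    # Use the parity of each element as the input feature; the target is then
    # a pure XOR over consecutive time steps.
    return (x % 2).astype('float32').reshape(max_len, 1), y.reshape(max_len, 1)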
I am using a Siamese architecture in my model for a classification task: deciding whether the two inputs are similar.
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, concatenate
in1 = Input(shape=(None,), dtype='int32', name='in1')
x1 = Embedding(output_dim=dim, input_dim=n_symbols, input_length=None,
weights=[embedding_weights], name='x1')(in1)
in2 = Input(shape=(None,), dtype='int32', name='in2')
x2 = Embedding(output_dim=dim, input_dim=n_symbols, input_length=None,
weights=[embedding_weights], name='x2')(in2)
l = Bidirectional(LSTM(units=100, return_sequences=False))
y1 = l(x1)
y2 = l(x2)
y = concatenate([y1, y2])
out = Dense(1, activation='sigmoid')(y)
model = Model(inputs=[in1, in2], outputs=[out])
It works correctly, as the number of weights to be trained remains the same even when I use a single input. The thing that confused me, though, was the TensorBoard visualization of the model.
(screenshot: TensorBoard graph)
Shouldn't both x1 and x2 map to the same bidirectional node?
Also, what do the 18 and 32 tensors signify?