How can I feed data one by one to train with tensorflow 2.0 ver - keras

I have work on Autoencoder typed model with the attention method. Around 10000 batches of data are fed into the model and each batch contains 30 images (30 is the "step_size" in ConvLSTM) with a shape of (5, 5, 3 [R,G,B]).
Therefore, the array is of shape (10000, 30, 5, 5, 3) (batch_size, step_size, image_height, image_width, scale).
I intentionally made an output array shape as (1,5,5,3), because each image has to be handled independently to apply attention method to.
When I link all operations with tf.keras.Model such that its input has the shape of (10000,30,5,5,3) and the output shape of (1,5,5,3).
history = model.fit(train_data, train_data, batch_size = 1, epochs = 3)
I am trying to modify arguments in Model module, but it seems not working because the output shape is not the same as the input.
Are there any possible ways to feed data one by one?
I am eventually running a code something like:
model = keras.Model(intput, output)
model.compile(optimizer='adam',loss= tf.keras.losses.MSE)
history = model.fit(train_data, train_data, batch_size = 1, epochs = 3)

It could've done with GradientTape, feeding one by one.
def train(loss, model, opt, x_inp):
with tf.GradientTape() as tape:
gradients = tape.gradient(loss(model, x_inp), model.trainable_variables)
gradient_variables = zip(gradients, model.trainable_variables)
opt.apply_gradients(gradient_variables)
opt = tf.optimizers.Adam(learning_rate=learning_rate)
import datetime
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
train_summary_writer = tf.summary.create_file_writer(train_log_dir)
epochs = 3
with train_summary_writer.as_default():
with tf.summary.record_if(True):
for epoch in range(epochs):
for train_id in range(0, len(batch_data)):
x_inp = np.reshape(np.asarray(batch_data), [-1, step_max, sensor_n, sensor_n, scale_n])
train(loss, model, opt, x_inp)
loss_values = loss(model, x_inp)
reconstructed = np.reshape(model(x_inp), [1, sensor_n, sensor_n, scale_n])
print("loss : {}".format(loss_values.numpy()))

Related

How to merge new features at later stage of model?

I have training data in the form of numpy arrays, that I will use in ConvLSTM.
Following are dimensions of array.
trainX = (5000, 200, 5) where 5000 are number of samples. 200 is time steps per sample, and 8 is number of features per timestep. (samples, timesteps, features).
out of these 8 features, 3 features remains the same throghout all timesteps in a sample (In other words, these features are directly related to samples). for example, day of the week, month number, weekday (these changes from sample to sample). To reduce the complexity, I want to keep these three features separate from initial training set and merge them with the output of convlstm layer before applying dense layer for classication (softmax activiation). e,g
Intial training set dimension would be (7000, 200, 5) and auxiliary input dimensions to be merged would be (7000, 3) --> because these 3 features are directly related to sample. How can I implement this using keras?
Following is my code that I write using Functional API, but don't know how to merge these two inputs.
#trainX.shape=(7000,200,5)
#trainy.shape=(7000,4)
#testX.shape=(3000,200,5)
#testy.shape=(3000,4)
#trainMetadata.shape=(7000,3)
#testMetadata.shape=(3000,3)
verbose, epochs, batch_size = 1, 50, 256
samples, n_features, n_outputs = trainX.shape[0], trainX.shape[2], trainy.shape[1]
n_steps, n_length = 4, 50
input_shape = (n_steps, 1, n_length, n_features)
model_input = Input(shape=input_shape)
clstm1 = ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',return_sequences = True)(model_input)
clstm1 = BatchNormalization()(clstm1)
clstm2 = ConvLSTM2D(filters=128, kernel_size=(1,3), activation='relu',return_sequences = False)(clstm1)
conv_output = BatchNormalization()(clstm2)
metadata_input = Input(shape=trainMetadata.shape)
merge_layer = np.concatenate([metadata_input, conv_output])
dense = Dense(100, activation='relu', kernel_regularizer=regularizers.l2(l=0.01))(merge_layer)
dense = Dropout(0.5)(dense)
output = Dense(n_outputs, activation='softmax')(dense)
model = Model(inputs=merge_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit([trainX, trainMetadata], trainy, validation_data=([testX, testMetadata], testy), epochs=epochs, batch_size=batch_size, verbose=verbose)
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
y = model.predict(testX)
but I am getting Value error at merge_layer statement. Following is the ValueError
ValueError: zero-dimensional arrays cannot be concatenated
What you are saying can not be done using the Sequential mode of Keras.
You need to use the Model class API Guide to Keras Model.
With this API you can build the complex model you are looking for
Here you have an example of how to use it: How to Use the Keras Functional API for Deep Learning

Numpy and tensorflow RNN shape representation mismatch

I'm building my first RNN in tensorflow. After understanding all the concepts regarding the 3D input shape, I came across with this issue.
In my numpy version (1.15.4), the shape representation of 3D arrays is the following: (panel, row, column). I will make each dimension different so that it is clearer:
In [1]: import numpy as np
In [2]: arr = np.arange(30).reshape((2,3,5))
In [3]: arr
Out[3]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
In [4]: arr.shape
Out[4]: (2, 3, 5)
In [5]: np.__version__
Out[5]: '1.15.4'
Here my understanding is: I have two timesteps with each timestep having 3 observations with 5 features in each observation.
However, in tensorflow "theory" (which I believe it is strongly based in numpy) RNN cells expect tensors (i.e. just n-dimensional matrices) of shape [batch_size, timesteps, features], which could be translated to: (row, panel, column) in the numpy "jargon".
As can be seen, the representation doesn't match, leading to errors when feeding numpy data into a placeholder, which in most of the examples and theory is defined like:
x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')
np.reshape() doesn't solve the issue because it just rearranges the dimensions, but messes up with the data.
I'm using for the first time the Dataset API, but I encounter the problems once into the session, not in the Dataset API ops.
I'm using the static_rnn method, and everything works well until I have to feed the data into the placeholder, which obviously results in a shape error.
I have tried to change the placeholder shape to shape=[N_TIMESTEPS_X, None, N_FEATURES]. HOWEVER, I'm using the dataset API, and I get errors when making the initializer if I change the Xplaceholder to the shape=[N_TIMESTEPS_X, None, N_FEATURES].
So, to summarize:
First problem: Shape errors with different shape representations.
Second problem: Dataset error when equating the shape representations (I think that either static_rnn or dynamic_rnn would function if this is resolved).
My question is:
¿Is there anything I'm missing in regard to this different representation logic which makes the practice confusing?
¿Could the solution be attained to switching to dynamic_rnn? (although the problems about the shape I encounter are related to the dataset API initializer being fed with shape [N_TIMESTEPS_X, None, N_FEATURES], not with the RNN cell itself.
Thank you very much for your time.
Full code:
'''The idea is to create xt, yt, xval and yval. My numpy arrays to
be fed are of the following shapes:
The 3D xt array has a shape of: (11, 69579, 74)
The 3D xval array has a shape of: (11, 7732, 74)
The yt array has a shape of: (69579, 3)
The yval array has a shape of: (7732, 3)
'''
N_TIMESTEPS_X = xt.shape[0] ## The stack number
BATCH_SIZE = 256
#N_OBSERVATIONS = xt.shape[1]
N_FEATURES = xt.shape[2]
N_OUTPUTS = yt.shape[1]
N_NEURONS_LSTM = 128 ## Number of units in the LSTMCell
N_NEURONS_DENSE = 64 ## Number of units in the Dense layer
N_EPOCHS = 600
LEARNING_RATE = 0.1
### Define the placeholders anda gather the data.
train_data = (xt, yt)
validation_data = (xval, yval)
## We define the placeholders as a trick so that we do not break into memory problems, associated with feeding the data directly.
'''As an alternative, you can define the Dataset in terms of tf.placeholder() tensors, and feed the NumPy arrays when you initialize an Iterator over the dataset.'''
batch_size = tf.placeholder(tf.int64)
x = tf.placeholder(tf.float32, shape=[None, N_TIMESTEPS_X, N_FEATURES], name='XPlaceholder')
y = tf.placeholder(tf.float32, shape=[None, N_OUTPUTS], name='YPlaceholder')
# Creating the two different dataset objects.
train_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x,y)).batch(BATCH_SIZE)
# Creating the Iterator type that permits to switch between datasets.
itr = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_init_op = itr.make_initializer(train_dataset)
validation_init_op = itr.make_initializer(val_dataset)
next_features, next_labels = itr.get_next()
### Create the graph
cellType = tf.nn.rnn_cell.LSTMCell(num_units=N_NEURONS_LSTM, name='LSTMCell')
inputs = tf.unstack(next_features, N_TIMESTEPS_X, axis=0)
'''inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size]'''
RNNOutputs, _ = tf.nn.static_rnn(cell=cellType, inputs=inputs, dtype=tf.float32)
predictionsLayer = tf.layers.dense(inputs=tf.layers.batch_normalization(RNNOutputs[-1]), units=N_NEURONS_DENSE, activation=None, name='Dense_Layer')
### Define the cost function, that will be optimized by the optimizer.
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=predictionsLayer, labels=next_labels, name='Softmax_plus_Cross_Entropy'))
optimizer_type = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='AdamOptimizer')
optimizer = optimizer_type.minimize(cost)
### Model evaluation
correctPrediction = tf.equal(tf.argmax(predictionsLayer,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correctPrediction,tf.float32))
#confusionMatrix = tf.confusion_matrix(next_labels, predictionsLayer, num_classes=3, name='ConfMatrix')
N_BATCHES = train_data[0].shape[0] // BATCH_SIZE
## Saving variables so that we can restore them afterwards.
saver = tf.train.Saver()
save_dir = '/home/zmlaptop/Desktop/tfModels/{}_{}'.format(cellType.__class__.__name__, datetime.now().strftime("%Y%m%d%H%M%S"))
os.mkdir(save_dir)
varDict = {'nTimeSteps':N_TIMESTEPS_X, 'BatchSize': BATCH_SIZE, 'nFeatures':N_FEATURES,
'nNeuronsLSTM':N_NEURONS_LSTM, 'nNeuronsDense':N_NEURONS_DENSE, 'nEpochs':N_EPOCHS,
'learningRate':LEARNING_RATE, 'optimizerType': optimizer_type.__class__.__name__}
varDicSavingTxt = save_dir + '/varDict.txt'
modelFilesDir = save_dir + '/modelFiles'
os.mkdir(modelFilesDir)
logDir = save_dir + '/TBoardLogs'
os.mkdir(logDir)
acc_summary = tf.summary.scalar('Accuracy', accuracy)
loss_summary = tf.summary.scalar('Cost_CrossEntropy', cost)
summary_merged = tf.summary.merge_all()
with open(varDicSavingTxt, 'w') as outfile:
outfile.write(repr(varDict))
with tf.Session() as sess:
tf.set_random_seed(2)
sess.run(tf.global_variables_initializer())
train_writer = tf.summary.FileWriter(logDir + '/train', sess.graph)
validation_writer = tf.summary.FileWriter(logDir + '/validation')
# initialise iterator with train data
sess.run(train_init_op, feed_dict = {x : train_data[0], y: train_data[1], batch_size: BATCH_SIZE})
print('¡Training starts!')
for epoch in range(N_EPOCHS):
batchAccList = []
tot_loss = 0
for batch in range(N_BATCHES):
optimizer_output, loss_value, summary = sess.run([optimizer, cost, summary_merged])
accBatch = sess.run(accuracy)
tot_loss += loss_value
batchAccList.append(accBatch)
if batch % 10 == 0:
train_writer.add_summary(summary, batch)
epochAcc = tf.reduce_mean(batchAccList)
if epoch%10 == 0:
print("Epoch: {}, Loss: {:.4f}, Accuracy: {}".format(epoch, tot_loss / N_BATCHES, epochAcc))
#confM = sess.run(confusionMatrix)
#confDic = {'confMatrix': confM}
#confTxt = save_dir + '/confMDict.txt'
#with open(confTxt, 'w') as outfile:
# outfile.write(repr(confDic))
#print(confM)
# initialise iterator with validation data
sess.run(validation_init_op, feed_dict = {x : validation_data[0], y: validation_data[1], batch_size:len(validation_data[0])})
print('Validation Loss: {:4f}, Validation Accuracy: {}'.format(sess.run(cost), sess.run(accuracy)))
summary_val = sess.run(summary_merged)
validation_writer.add_summary(summary_val)
saver.save(sess, modelFilesDir)
Is there anything I'm missing in regard to this different
representation logic which makes the practice confusing?
In fact, you made a mistake about the input shapes of static_rnn and dynamic_rnn. The input shape of static_rnn is [timesteps,batch_size, features](link),which is a list of 2D tensors of shape [batch_size, features]. But The input shape of dynamic_rnn is either [timesteps,batch_size, features] or [batch_size,timesteps, features] depending on time_major is True or False(link).
Could the solution be attained to switching to dynamic_rnn?
The key is not that you use static_rnn or dynamic_rnn, but that your data shape matches the required shape. The general format of placeholder is like your code is [None, N_TIMESTEPS_X, N_FEATURES]. It's also convenient for you to use dataset API.
You can use transpose()(link) instead of reshape().transpose() will permute the dimensions of an array and won't messes up with the data.
So your code needs to be modified.
# permute the dimensions
xt = xt.transpose([1,0,2])
xval = xval.transpose([1,0,2])
# adjust shape,axis=1 represents timesteps
inputs = tf.unstack(next_features, axis=1)
Other errors should have nothing to do with rnn shape.

Changing batch_size parameter in keras leads to broadcast error

I am running a simple encoder-decoder setup to train a representation for a one dimensional image. In this sample the input are lines with varying slopes and in the encoded layer we would expect something that resembles the slope. My setup is keras with a tensorflow backend. I am very new to this as well.
It all works fine, at least until I move away from steps_per_epoch to batch_size in the model.fit() method. Certain values of the batch_size, such as 1,2,3, 8 and 16 do work, for others I get a value error. My initial guess was 2^n, but that did not work.
The error I get for batch_size = 5
ValueError: operands could not be broadcast together with shapes (5,50) (3,50) (5,50)
I am trying to understand which relation between batch_size and training data is valid such that it always passes. I assumed that the training set would be simply divided into floor(N/batch_size) batches and the remainder would be processed as such.
My questions are:
What is the relation between size of data set and batch_size that are allowed.
What exactly is the keras/tensorflow trying to do such that the batch_size is important?
Thank you very much for the help.
The code to reproduce this is
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Conv1D, Concatenate
from keras.losses import mse
from keras.optimizers import Adam
INPUT_DIM = 50
INTER_DIM = 15
LATENT_DIM = 1
# Prepare Sample Data
one_line = np.linspace(1, 30, INPUT_DIM).reshape(1, INPUT_DIM)
test_array = np.repeat(one_line, 1000, axis=0)
slopes = np.linspace(0, 1, 1000).reshape(1000, 1)
data = test_array * slopes
# Train test split
train_mask = np.where(np.random.sample(1000) < 0.8, 1, 0).astype('bool')
x_train = data[train_mask].reshape(-1, INPUT_DIM, 1)
x_test = data[~train_mask].reshape(-1, INPUT_DIM, 1)
# Define Model
input = Input(shape=(INPUT_DIM, 1), name='input')
conv_layer_small = Conv1D(filters=1, kernel_size=[3], padding='same')(input)
conv_layer_medium = Conv1D(filters=1, kernel_size=[5], padding='same')(input)
merged_convs = Concatenate()(
[conv_layer_small, conv_layer_medium])
latent = Dense(LATENT_DIM, name='latent_layer',
activation='relu')(merged_convs)
encoder = Model(input, latent)
decoder_int = Dense(INTER_DIM, name='dec_int_layer', activation='relu')(latent)
output = Dense(INPUT_DIM, name='output', activation='linear')(decoder_int)
encoder_decoder = Model(input, output, name='encoder_decoder')
# Add Loss
reconstruction_loss = mse(input, output)
encoder_decoder.add_loss(reconstruction_loss)
encoder_decoder.compile(optimizer='adam')
if __name__ == '__main__':
epochs = 100
encoder_decoder.fit(
x_train,
epochs=epochs,
batch_size=4,
verbose=2
)

Deep learning Keras model CTC_Loss gives loss = infinity

i've a CRNN model for text recognition, it was published on Github, trained on english language,
Now i'm doing the same thing using this algorithm but for arabic.
My ctc function is:
def ctc_lambda_func(args):
y_pred, labels, input_length, label_length = args
# the 2 is critical here since the first couple outputs of the RNN
# tend to be garbage:
y_pred = y_pred[:, 2:, :]
return K.ctc_batch_cost(labels, y_pred, input_length, label_length)
My Model is:
def get_Model(training):
img_w = 128
img_h = 64
# Network parameters
conv_filters = 16
kernel_size = (3, 3)
pool_size = 2
time_dense_size = 32
rnn_size = 128
if K.image_data_format() == 'channels_first':
input_shape = (1, img_w, img_h)
else:
input_shape = (img_w, img_h, 1)
# Initialising the CNN
act = 'relu'
input_data = Input(name='the_input', shape=input_shape, dtype='float32')
inner = Conv2D(conv_filters, kernel_size, padding='same',
activation=act, kernel_initializer='he_normal',
name='conv1')(input_data)
inner = MaxPooling2D(pool_size=(pool_size, pool_size), name='max1')(inner)
inner = Conv2D(conv_filters, kernel_size, padding='same',
activation=act, kernel_initializer='he_normal',
name='conv2')(inner)
inner = MaxPooling2D(pool_size=(pool_size, pool_size), name='max2')(inner)
conv_to_rnn_dims = (img_w // (pool_size ** 2), (img_h // (pool_size ** 2)) * conv_filters)
inner = Reshape(target_shape=conv_to_rnn_dims, name='reshape')(inner)
# cuts down input size going into RNN:
inner = Dense(time_dense_size, activation=act, name='dense1')(inner)
# Two layers of bidirectional GRUs
# GRU seems to work as well, if not better than LSTM:
gru_1 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru1')(inner)
gru_1b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru1_b')(inner)
gru1_merged = add([gru_1, gru_1b])
gru_2 = GRU(rnn_size, return_sequences=True, kernel_initializer='he_normal', name='gru2')(gru1_merged)
gru_2b = GRU(rnn_size, return_sequences=True, go_backwards=True, kernel_initializer='he_normal', name='gru2_b')(gru1_merged)
# transforms RNN output to character activations:
inner = Dense(num_classes+1, kernel_initializer='he_normal',
name='dense2')(concatenate([gru_2, gru_2b]))
y_pred = Activation('softmax', name='softmax')(inner)
Model(inputs=input_data, outputs=y_pred).summary()
labels = Input(name='the_labels', shape=[30], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')
# Keras doesn't currently support loss funcs with extra parameters
# so CTC loss is implemented in a lambda layer
loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length])
# clipnorm seems to speeds up convergence
# the loss calc occurs elsewhere, so use a dummy lambda func for the loss
if training:
return Model(inputs=[input_data, labels, input_length, label_length], outputs=loss_out)
return Model(inputs=[input_data], outputs=y_pred)
Then i compile it with SGD optimizer (Tried SGD,adam)
sgd = SGD(lr=0.0000002, decay=1e-6, momentum=0.9, nesterov=True, clipnorm=5)
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)
Then i fit the model with my training set (Images of words up to 30 characters) into (sequence of labels of 30
model.fit_generator(generator=tiger_train.next_batch(),
steps_per_epoch=int(tiger_train.n / batch_size),
epochs=30,
callbacks=[checkpoint],
validation_data=tiger_val.next_batch(),
validation_steps=int(tiger_val.n / val_batch_size))
Once it starts, it give me loss = inf, after many searches, i didn't find any similar problem.
So my questions is, how can i solve this, what can make a ctc_loss compute an infinite cost?
Thanks in advance
I found the problem, it was dimensions problem,
For R-CNN OCR using CTC layer, if you are detecting a sequence with length n, you should have an image with at least a width of (2*n-1). The more the better till you reach the best image/timesteps ratio to let the CTC layer able to recognize the letter correctly. If image with is less than (2*n-1), it will give a nan loss.
This error is happened when image text have two equal characters in the same sequence e.g happen --> pp. for so that you can remove data that has this characteristic.

Keras LSTM layers input shape

I am trying to feed a sequence with 20 featuresto an LSTM network as shown in the code. But I get an error that my Input0 is incompatible with LSTM input. Not sure how to change my layer structure to fit the data.
def build_model(features, aux1=None, aux2=None):
# create model
features[0] = np.asarray(features[0])
main_input = Input(shape=features[0].shape, dtype='float32', name='main_input')
main_out = LSTM(40, activation='relu')
aux1_input = Input(shape=(len(aux1[0]),), dtype='float32', name='aux1_input')
aux1_out = Dense(len(aux1[0]))(aux1_input)
aux2_input = Input(shape=(len(aux2[0]),), dtype='float32', name='aux2_input')
aux2_out = Dense(len(aux2[0]))(aux2_input)
x = concatenate([aux1_out, main_out, aux2_out])
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(1, activation='sigmoid', name='main_output')(x)
model = Model(inputs=[aux1_input, aux2_input, main_input], outputs= [output])
return model
Features variable is an array of shape (1456, 20) I have 1456 days and for each day I have 20 variables.
Your main_input should be of shape (samples, timesteps, features)
and then you should define main_input like this:
main_input = Input(shape=(timesteps,)) # for stateless RNN (your one)
or main_input = Input(batch_shape=(batch_size, timesteps,)) for stateful RNN (not the one you are using in your example)
if your features[0] is a 1-dimensional array of various features (1 timestep), then you also have to reshape features[0] like this:
features[0] = np.reshape(features[0], (1, features[0].shape))
and then do it to features[1], features[2] etc
or better reshape all your samples at once:
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
LSTM layers are designed to work with "sequences".
You say your sequence has 20 features, but how many time steps does it have?? Do you mean 20 time steps instead?
An LSTM layer requires input shapes such as (BatchSize, TimeSteps, Features).
If it's the case that you have 1 feature in each of the 20 time steps, you must shape your data as:
inputData = someData.reshape(NumberOfSequences, 20, 1)
And the Input tensor should take this shape:
main_input = Input((20,1), ...) #yes, it ignores the batch size

Resources