I'm trying to learn Keras and am using an LSTM for a classification problem. I want to plot the
accuracy and loss and update the plot during training. For that I'm using a callback.
For some reason the accuracy and loss I receive in the callback do not match
the accuracy and loss printed by the fit function.
Here are the relevant lines of my code:
class PlotCallbacks(Callback):
    def on_batch_end(self, batch, logs={}):
        print(logs)
        return
# Create the model
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,input_length=max_conv_length))
model.add(LSTM(300))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, callbacks=[PlotCallbacks()], nb_epoch=1, batch_size=3, verbose=1)
When running the program I get this output (first row of each batch is printed by callback, second is printed by model.fit):
Epoch 1/1
{'acc': 0.0, 'loss': 1.1038421, 'batch': 0, 'size': 3}
3/25 [==>...........................] - ETA: 27s - loss: 1.1038 - acc: 0.0000e+00
{'acc': 1.0, 'loss': 1.0622898, 'batch': 1, 'size': 3}
6/25 [======>.......................] - ETA: 19s - loss: 1.0831 - acc: 0.5000
{'acc': 1.0, 'loss': 0.91526389, 'batch': 2, 'size': 3}
9/25 [=========>....................] - ETA: 13s - loss: 1.0271 - acc: 0.6667
{'acc': 1.0, 'loss': 0.36570337, 'batch': 3, 'size': 3}
12/25 [=============>................] - ETA: 11s - loss: 0.8618 - acc: 0.7500
{'acc': 1.0, 'loss': 0.1433304, 'batch': 4, 'size': 3}
15/25 [=================>............] - ETA: 9s - loss: 0.7181 - acc: 0.8000
{'acc': 1.0, 'loss': 0.041385528, 'batch': 5, 'size': 3}
18/25 [====================>.........] - ETA: 6s - loss: 0.6053 - acc: 0.8333
{'acc': 1.0, 'loss': 0.011424608, 'batch': 6, 'size': 3}
21/25 [========================>.....] - ETA: 3s - loss: 0.5205 - acc: 0.8571
{'acc': 1.0, 'loss': 0.0034991663, 'batch': 7, 'size': 3}
24/25 [===========================>..] - ETA: 1s - loss: 0.4558 - acc: 0.8750
{'acc': 1.0, 'loss': 0.0012318328, 'batch': 8, 'size': 1}
25/25 [==============================] - 26s - loss: 0.4377 - acc: 0.8800
I have tried printing logs.get('acc'), as well as saving the accuracies to a list in the PlotCallbacks object and printing the list, but the problem remains.
Does anyone have a clue what the problem may be?
Thanks
The on_batch_end() callback receives the accuracy of the batch that was just trained, whereas the values printed by Keras are the average over all the batches it has seen so far in the current epoch. You can see this in your logs: the first two batches had accuracies of 0.0 and 1.0, so the accuracy reported after two batches is 0.5000. Here is exactly where the average is calculated.
Also, accuracy as a metric is usually reported per epoch, so you could move the logic to on_epoch_end().
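As a quick check of the explanation above, the sketch below recomputes the progress bar values from the per-batch logs in the question. It assumes that each batch is weighted by its 'size' entry, which is consistent with the printed numbers; the helper name running_averages is made up for this illustration and is not part of Keras.

```python
# Per-batch values copied from the callback output in the question.
batch_logs = [
    {"acc": 0.0, "loss": 1.1038421, "size": 3},
    {"acc": 1.0, "loss": 1.0622898, "size": 3},
    {"acc": 1.0, "loss": 0.91526389, "size": 3},
    {"acc": 1.0, "loss": 0.36570337, "size": 3},
    {"acc": 1.0, "loss": 0.1433304, "size": 3},
    {"acc": 1.0, "loss": 0.041385528, "size": 3},
    {"acc": 1.0, "loss": 0.011424608, "size": 3},
    {"acc": 1.0, "loss": 0.0034991663, "size": 3},
    {"acc": 1.0, "loss": 0.0012318328, "size": 1},
]


def running_averages(logs):
    """Return (samples_seen, mean_loss, mean_acc) after each batch.

    Hypothetical helper: every batch contributes in proportion to its size,
    so the small last batch counts less than the full ones.
    """
    seen = 0
    loss_sum = 0.0
    acc_sum = 0.0
    averages = []
    for log in logs:
        seen += log["size"]
        loss_sum += log["loss"] * log["size"]
        acc_sum += log["acc"] * log["size"]
        averages.append((seen, loss_sum / seen, acc_sum / seen))
    return averages


averages = running_averages(batch_logs)
for seen, loss, acc in averages:
    print(f"{seen}/25 - loss: {loss:.4f} - acc: {acc:.4f}")
```

The last line printed corresponds to the final progress bar line in the question (loss 0.4377, acc 0.8800). Note that a plain unweighted mean of the per-batch values would differ slightly whenever the last batch is smaller than the others.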
Here is a simple callback that records the accuracy in the way defined by Keras' progbar (average over all the batches seen so far in the current epoch):
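import numpy as np
from keras.callbacks import Callback

class AccHistory(Callback):
    def on_train_begin(self, logs={}):
        self.accs = []     # per-batch accuracies (in percent) for the current epoch
        self.acc_avg = []  # running mean after each batch
    def on_batch_end(self, batch, logs={}):
        self.accs.append(round(1e2*float(logs.get('acc')),4))
        self.acc_avg.append(round(np.mean(self.accs,dtype=np.float64),4))
    def on_epoch_end(self, epoch, logs={}):
        self.accs = []  # reset so the next epoch's average starts fresh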
I'm trying to create a model for time series analysis using an LSTM layer, but the accuracy is very low, even when using Dense layers and no LSTM.
The data is a time series (synthetic spectrum) that depends on 4 parameters. Changing the parameters makes it possible to generate datasets of different sizes, where each sample differs more or less from the others. But no matter the size of the dataset, accuracy is always as low as 0.0 - 0.32 %.
Model with LSTM:
print(trainset.shape)
print(testset.shape)
print(trainlabels.shape)
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(trainset.shape[1], trainset.shape[2])))
model.add(LSTM(10, activation='relu', stateful=False, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(len(trainlabels), activation='relu'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='Adam', metrics=['accuracy'])
print(model.summary())
model.fit(trainset, trainlabels, validation_data=(testset, testlabels),
          epochs=3, batch_size=10)
scores = model.evaluate(testset, testlabels, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
OUTPUT:
(2478, 600, 1)
(620, 600, 1)
(2478,)
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
masking_7 (Masking) (None, 600, 1) 0
_________________________________________________________________
lstm_7 (LSTM) (None, 10) 480
_________________________________________________________________
dropout_7 (Dropout) (None, 10) 0
_________________________________________________________________
dense_7 (Dense) (None, 2478) 27258
=================================================================
Total params: 27,738
Trainable params: 27,738
Non-trainable params: 0
_________________________________________________________________
None
Train on 2478 samples, validate on 620 samples
Epoch 1/3
2478/2478 [==============================] - 53s 22ms/step - loss: 8.9022 - accuracy: 4.0355e-04 - val_loss: 7.8152 - val_accuracy: 0.0016
Epoch 2/3
2478/2478 [==============================] - 54s 22ms/step - loss: 7.8152 - accuracy: 4.0355e-04 - val_loss: 7.8152 - val_accuracy: 0.0016
Epoch 3/3
2478/2478 [==============================] - 53s 21ms/step - loss: 7.8152 - accuracy: 4.0355e-04 - val_loss: 7.8152 - val_accuracy: 0.0016
Accuracy: 0.16%
Some values in training data are 0.0, therefore Masking is used.
I have tried playing around with different loss, optimizer, activations, Dropout and layer parameters. The result is always the same even after adding more Dense layers, or changing batch size.
Model with Dense:
Data format is 2D instead of 3D.
model = keras.Sequential([
    keras.layers.Masking(mask_value=0.0, input_shape=(trainset.shape[1],)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(len(labels), activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(np.uint8(trainset), np.uint8(trainlabels), epochs=100)
test_loss, test_acc = model.evaluate(np.uint8(testset), np.uint8(testlabels),
                                     verbose=2)
print(test_acc)
OUTPUT:
Train on 1239 samples
Epoch 1/100
1239/1239 [==============================] - 1s 1ms/sample - loss: 8.5421 - accuracy: 0.0033
Epoch 2/100
1239/1239 [==============================] - 0s 371us/sample - loss: 6.2039 - accuracy: 0.0025
Epoch 3/100
1239/1239 [==============================] - 0s 347us/sample - loss: 5.6502 - accuracy: 0.0033
****
Epoch 97/100
1239/1239 [==============================] - 0s 380us/sample - loss: 0.1472 - accuracy: 0.9746
Epoch 98/100
1239/1239 [==============================] - 0s 364us/sample - loss: 0.1562 - accuracy: 0.9680
Epoch 99/100
1239/1239 [==============================] - 1s 408us/sample - loss: 0.1511 - accuracy: 0.9721
Epoch 100/100
1239/1239 [==============================] - 0s 378us/sample - loss: 0.1719 - accuracy: 0.9680
310/1 - 0s - loss: 18.6845 - accuracy: 0.0000e+00
0.0
With this model the training loss is very low, but the test accuracy is 0.
What kind of model architecture should be used for my data?
Thanks in advance for helping to learn this stuff!
I am trying to train on grayscale images. The batch_size is 32 and the image size is 48x48.
I define my network with input_shape = (48,48,1). I get the error below when I train the network.
Error:
ValueError: Error when checking input: expected conv2d_17_input to have 4 dimensions, but got array with shape (32, 48, 48)
model.add(Conv2D(32, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=(48,48,1)
                 )
          )
Let's say you have 1000 training images, each 48x48 greyscale. After loading the images into a numpy array, you end up with the shape (1000, 48, 48).
This means the array has 1000 elements, each of which is a 48x48 matrix.
To feed this data to a CNN, you have to reshape the array to (1000, 48, 48, 1), where 1 is the channel dimension. Since the images are greyscale, the channel dimension is 1. For RGB images it would be 3.
Consider the toy example below:
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

x_train = np.random.rand(1000, 48, 48) # images
y_train = np.array([np.random.randint(0, 2) for x in range(1000)]) # labels
# simple model
model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=(48,48,1)
                 )
          )
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
# fitting model
model.fit(x_train, y_train, epochs=10, batch_size=32)
This throws the error:
Error when checking input: expected conv2d_3_input to have 4 dimensions, but got array with shape (1000, 48, 48)
To fix it, reshape x_train like this:
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
Now fit the model:
model.fit(x_train, y_train, epochs=10, batch_size=32)
Epoch 1/10
1000/1000 [==============================] - 1s 1ms/step - loss: 0.7177
Epoch 2/10
1000/1000 [==============================] - 1s 882us/step - loss: 0.6762
Epoch 3/10
1000/1000 [==============================] - 1s 870us/step - loss: 0.5882
Epoch 4/10
1000/1000 [==============================] - 1s 888us/step - loss: 0.4588
Epoch 5/10
1000/1000 [==============================] - 1s 906us/step - loss: 0.3272
Epoch 6/10
1000/1000 [==============================] - 1s 910us/step - loss: 0.2228
Epoch 7/10
1000/1000 [==============================] - 1s 895us/step - loss: 0.1607
Epoch 8/10
1000/1000 [==============================] - 1s 879us/step - loss: 0.1172
Epoch 9/10
1000/1000 [==============================] - 1s 886us/step - loss: 0.0935
Epoch 10/10
1000/1000 [==============================] - 1s 888us/step - loss: 0.0638
Your input should have 4 dimensions even if it is grayscale. So you can use np.reshape(input, (32,48,48,1)) or np.expand_dims(input, axis=3).
Your images should be reshaped to (sample_length, 48, 48, 1), with input_shape = (48, 48, 1):
x_train = x_train.reshape(x_train.shape[0], 48, 48, 1)
x_test = x_test.reshape(x_test.shape[0], 48, 48, 1)
input_shape = (48, 48, 1)
You can look at the MNIST example here, which is similar to your case.
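The reshaping options mentioned in these answers can be compared side by side. This is a minimal sketch using random data as a stand-in for one batch of 32 images; the variable names are made up for illustration, and only NumPy is needed.

```python
import numpy as np

# Stand-in for one batch of 32 greyscale 48x48 images (random values, illustration only).
batch = np.random.rand(32, 48, 48)

# Three ways to add the trailing channel axis expected by Conv2D with input_shape=(48, 48, 1).
with_reshape = batch.reshape(batch.shape[0], 48, 48, 1)
with_expand_dims = np.expand_dims(batch, axis=3)
with_newaxis = batch[..., np.newaxis]

print(with_reshape.shape, with_expand_dims.shape, with_newaxis.shape)

# All three hold the same values; only the shape changes.
same_values = (np.array_equal(with_reshape, with_expand_dims)
               and np.array_equal(with_reshape, with_newaxis))
print(same_values)
```

Using batch.shape[0] (or expand_dims) instead of a hard-coded 32 keeps the code working when the last batch is smaller than the others.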
I'm a beginner studying neural networks.
I'm doing the Kaggle project Bike Sharing Demand.
I want to use a simple neural network with Keras, but the loss is not decreasing.
What should I do?
-----------code--------------
# dataset from pandas
feature_names = ["season", "holiday", "workingday", "weather",
                 "temp", "atemp", "humidity", "windspeed",
                 "datetime_year", "datetime_hour", "datetime_dayofweek"]
label_name = ["count"]
X_train = train[feature_names] #shape (10886, 11)
Y_train = train[label_name] #shape (10886, 1)
X_test = test[feature_names]
# layers
model = Sequential()
model.add(Dense(units = 50, kernel_initializer = 'uniform', activation = 'relu', input_dim=11))
model.add(Dropout(0.3))
model.add(Dense(units = 50, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.3))
model.add(Dense(units = 5, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.3))
model.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics = ['accuracy'])
# Train
model.fit(X_train, Y_train, batch_size = 100, epochs = 200)
---------result------------
Epoch 1/200
10886/10886 [==============================] - 4s 325us/step - loss: 69206.2478 - acc: 0.0094
Epoch 2/200
10886/10886 [==============================] - 1s 93us/step - loss: 69184.5435 - acc: 0.0096
Epoch 3/200
10886/10886 [==============================] - 1s 89us/step - loss: 69181.6330 - acc: 0.0096
Epoch 4/200
10886/10886 [==============================] - 1s 93us/step - loss: 69179.0222 - acc: 0.0096
Epoch 5/200
10886/10886 [==============================] - 1s 91us/step - loss: 69175.7442 - acc: 0.0096
Epoch 6/200
10886/10886 [==============================] - 1s 109us/step - loss: 69171.9052 - acc: 0.0096
Epoch 7/200
10886/10886 [==============================] - 1s 122us/step - loss: 69171.6164 - acc: 0.0096
Epoch 8/200
10886/10886 [==============================] - 1s 92us/step - loss: 69167.6923 - acc: 0.0096
Epoch 9/200
10886/10886 [==============================] - 1s 91us/step - loss: 69166.2911 - acc: 0.0096
Epoch 10/200
10886/10886 [==============================] - 1s 94us/step - loss: 69164.1145 - acc: 0.0096
...
Try setting the learning rate explicitly on the optimizer passed to model.compile. Start with 0.0001, then try 0.001 and 0.01, and see how the loss behaves.
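To illustrate what trying several learning rates looks like, here is a toy sketch in plain Python. It is not the Keras model from the question: it runs ordinary gradient descent on a made-up one-parameter loss, (w - 3)^2, and the function name final_loss and the step count are assumptions chosen for the example. In Keras the rate is an argument of the optimizer object (for instance keras.optimizers.Adam); the argument name has varied between versions, so check the documentation for the one you use.

```python
def final_loss(learning_rate, steps=200):
    """Run plain gradient descent on the toy loss (w - 3)^2 and return the last loss."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)     # derivative of (w - 3)^2 with respect to w
        w -= learning_rate * grad  # gradient descent update
    return (w - 3.0) ** 2


# Same rates as suggested above, each run for the same number of steps.
results = {lr: final_loss(lr) for lr in (0.0001, 0.001, 0.01)}
for lr, loss in results.items():
    print(f"learning rate {lr}: final loss {loss:.4f}")
```

On this toy problem the smallest rate barely moves the loss within 200 steps, while the largest of the three gets close to the minimum. A real network can behave differently (too large a rate can also diverge), which is why it is worth comparing several values.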
I face two problems when implementing a 1D convnet for multi-channel sequential data
(224 samples x 300 time steps x 19 channels).
1) I set batch_size to 7, but the progress output jumps by several times that:
not 7, 14, 21, 28, but 7, 56, 105, 147... What is wrong?
2) When I look at the accuracy, the model seems to learn nothing.
Is it impossible to implement a classifier for multi-channel sequential data with Conv1D?
If it is possible, can you give me some advice on my code?
#result
x_train shape: (224, 300, 19)
224 train samples
28 test samples
Train on 224 samples, validate on 28 samples
Epoch 1/50
7/224 [..............................] - ETA: 68s - loss: 0.6945 - acc: 0.5714
56/224 [======>.......................] - ETA: 6s - loss: 0.6993 - acc: 0.4464
105/224 [=============>................] - ETA: 2s - loss: 0.6979 - acc: 0.4381
147/224 [==================>...........] - ETA: 1s - loss: 0.6968 - acc: 0.4422
189/224 [========================>.....] - ETA: 0s - loss: 0.6953 - acc: 0.4444
224/224 [==============================] - 2s - loss: 0.6953 - acc: 0.4420 - val_loss: 0.6956 - val_acc: 0.5000
Epoch 2/50
7/224 [..............................] - ETA: 0s - loss: 0.6759 - acc: 0.5714
63/224 [=======>......................] - ETA: 0s - loss: 0.6924 - acc: 0.5556
133/224 [================>.............] - ETA: 0s - loss: 0.6905 - acc: 0.5338
203/224 [==========================>...] - ETA: 0s - loss: 0.6903 - acc: 0.5567
224/224 [==============================] - 0s - loss: 0.6923 - acc: 0.5357 - val_loss: 0.6968 - val_acc: 0.5000
# code
from __future__ import print_function
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D, Conv1D, MaxPooling1D
import numpy as np
batch_size = 7
num_classes = 2
epochs = 50
# input data dimensions : 300 sequential x 19 channels
eeg_rows, num_ch = 300, 19
x_train = np.load('eeg_train.npy')
y_train = np.load('label_train.npy')
x_test = np.load('eeg_test.npy')
y_test = np.load('label_test.npy')
x_valid = np.load('eeg_valid.npy')
y_valid = np.load('label_valid.npy')
x_train = x_train.reshape(x_train.shape[0], eeg_rows, num_ch)
x_test = x_test.reshape(x_test.shape[0], eeg_rows,num_ch)
x_valid = x_valid.reshape(x_valid.shape[0], eeg_rows, num_ch)
input_shape = (eeg_rows, num_ch)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_valid = x_test.astype('float32')
x_train /= 100
x_test /= 100
x_valid /= 100
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# model
conv = Sequential()
conv.add(Conv1D(32, 3, input_shape=input_shape, activation='relu', padding='same'))
conv.add(Conv1D(32, 3, activation='relu', padding='same'))
conv.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))
conv.add(Dropout(0.2))
conv.add(Flatten())
conv.add(Dense(16, activation='relu'))
conv.add(Dropout(0.5))
conv.add(Dense(2, activation='softmax'))
conv.compile(loss='categorical_crossentropy',
             optimizer=keras.optimizers.Adam(),
             metrics=['accuracy'])
# train
conv.fit(x_train, y_train,
         batch_size=batch_size,
         epochs=epochs,
         verbose=1,
         validation_data=(x_valid, y_valid))
score = conv.evaluate(x_valid, y_valid, verbose=0)
print(conv.summary())
print(conv.input_shape)
print(conv.output_shape)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
I am trying to train a model to play Chrome Dino (the offline game).
The idea was to take the last 6 screenshots of the game, run a CNN on each separately (to extract features), and then feed those features as timesteps into an LSTM.
My training data is X = [6 timestep game screenshots] -> y = [1,0] (keep running, jump)
Timestep example
I have even split the database so it has 50% jump examples and 50% keep running examples.
Sadly I am stuck at 50% accuracy and loss is stuck too.
198/198 [==============================] - 0s - loss: 0.6944 - acc: 0.4596
Epoch 91/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 92/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 93/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 94/100
198/198 [==============================] - 0s - loss: 0.6933 - acc: 0.5000
Epoch 95/100
198/198 [==============================] - 0s - loss: 0.6942 - acc: 0.5000
Epoch 96/100
198/198 [==============================] - 0s - loss: 0.6939 - acc: 0.5000
Epoch 97/100
198/198 [==============================] - 0s - loss: 0.6935 - acc: 0.5000
I have tried many model hyperparams with different layers, but I always get the same result.
Current model
model = Sequential()
model.add(TimeDistributed(Convolution2D(64, 3, 3, activation='relu'), input_shape=(FRAMES_TO_PROCESS, FRAME_HEIGHT,FRAME_WIDTH, FRAME_FILTERS )))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(64, 3, 3, activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=(2,2))))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(128, 3, 3, activation='relu')))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(128, 3, 3, activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=(2,2))))
model.add(Activation('relu'))
model.add(TimeDistributed(Flatten()))
model.add(Dropout(0.1))
model.add(LSTM(120, return_sequences=False))
model.add(Dense(2, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
Any idea what went wrong?