Problem with Keras to learn multiplication by two

I'm just trying to play around with Keras, but I'm running into some trouble trying to teach it a basic function (multiply by two). My setup is as follows. Since I'm new to this, I added in comments what I believe to be happening at each step.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_train = np.linspace(1, 1000, 1000)
y_train = x_train * 2
model = Sequential()
model.add(Dense(32, input_dim=1, activation='sigmoid')) #add a 32-node layer
model.add(Dense(32, activation='sigmoid')) #add a second 32-node layer
model.add(Dense(1, activation='sigmoid')) #add a final output layer
model.compile(loss='mse',
              optimizer='rmsprop') #compile it with loss being mean squared error
model.fit(x_train, y_train, epochs=10, batch_size=100) #train
score = model.evaluate(x_train, y_train, batch_size=100)
print(score)
I get the following output:
1000/1000 [==============================] - 0s 355us/step - loss: 1334274.0375
Epoch 2/10
1000/1000 [==============================] - 0s 21us/step - loss: 1333999.8250
Epoch 3/10
1000/1000 [==============================] - 0s 29us/step - loss: 1333813.4062
Epoch 4/10
1000/1000 [==============================] - 0s 28us/step - loss: 1333679.2625
Epoch 5/10
1000/1000 [==============================] - 0s 27us/step - loss: 1333591.6750
Epoch 6/10
1000/1000 [==============================] - 0s 51us/step - loss: 1333522.0000
Epoch 7/10
1000/1000 [==============================] - 0s 23us/step - loss: 1333473.7000
Epoch 8/10
1000/1000 [==============================] - 0s 24us/step - loss: 1333440.6000
Epoch 9/10
1000/1000 [==============================] - 0s 29us/step - loss: 1333412.0250
Epoch 10/10
1000/1000 [==============================] - 0s 21us/step - loss: 1333390.5000
1000/1000 [==============================] - 0s 66us/step
['loss']
1333383.1143554687
It seems like the loss is extremely high for this basic function, and I'm confused why it's not able to learn it. Am I confused, or have I done something wrong?

Using a sigmoid activation on the output layer constrains the output to the range (0, 1). Your targets lie in the range [2, 2000], so the network cannot fit them. Try a relu activation instead.
Try adam rather than rmsprop when debugging; it almost always works better.
Train longer.
Putting it all together, I get the following output:
Epoch 860/1000
1000/1000 [==============================] - 0s 29us/step - loss: 5.1868e-08
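To see why the loss in the question stalls near 1.33 million, it helps to compute the best MSE that any sigmoid-output model could reach on this data. The sketch below is plain Python (no Keras) and assumes the ideal case in which the network outputs the sigmoid's upper limit of 1 for every input. The helper name mse and the constant are illustrative, not part of the original code.

```python
# Targets from the question: y = 2 * x for x = 1, 2, ..., 1000.
xs = range(1, 1001)
ys = [2 * x for x in xs]

# A sigmoid output can never exceed 1, and every target is >= 2,
# so the closest such a layer can get is a prediction of (just under) 1.
SIGMOID_UPPER_LIMIT = 1.0  # assumption: the idealized best case


def mse(targets, prediction):
    """Mean squared error of one constant prediction against all targets."""
    return sum((t - prediction) ** 2 for t in targets) / len(targets)


best_possible_loss = mse(ys, SIGMOID_UPPER_LIMIT)
print(best_possible_loss)  # 1333333.0
```

The loss in the question (1333390.5 after epoch 10) is already close to this floor, so more epochs with a sigmoid output cannot help; the output activation has to change.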

Related

Best model weights in neural network in case of early stopping

I am training a model with the following code
model=Sequential()
model.add(Dense(100, activation='relu',input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(2,activation='softmax'))
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
early_stopping_monitor = EarlyStopping(patience=3)
model.fit(X_train_np,target,validation_split=0.3, epochs=100, callbacks=[early_stopping_monitor])
This is designed to stop training if val_loss does not improve for 3 consecutive epochs. The result is shown below. My question is: will the model end up with the weights from epoch 8 or from epoch 7? Performance got worse in epoch 8, so training stopped, but that means the model ran one more epoch and finished on worse weights than the earlier ones (epoch 7). Do I need to retrain the model for 7 epochs?
Train on 623 samples, validate on 268 samples
Epoch 1/100
623/623 [==============================] - 1s 1ms/step - loss: 4.0365 - accuracy: 0.5923 - val_loss: 1.2208 - val_accuracy: 0.6231
Epoch 2/100
623/623 [==============================] - 0s 114us/step - loss: 1.4412 - accuracy: 0.6356 - val_loss: 0.7193 - val_accuracy: 0.7015
Epoch 3/100
623/623 [==============================] - 0s 103us/step - loss: 1.4335 - accuracy: 0.6260 - val_loss: 1.3778 - val_accuracy: 0.7201
Epoch 4/100
623/623 [==============================] - 0s 106us/step - loss: 3.5732 - accuracy: 0.6324 - val_loss: 2.7310 - val_accuracy: 0.6194
Epoch 5/100
623/623 [==============================] - 0s 111us/step - loss: 1.3116 - accuracy: 0.6372 - val_loss: 0.5952 - val_accuracy: 0.7351
Epoch 6/100
623/623 [==============================] - 0s 98us/step - loss: 0.9357 - accuracy: 0.6645 - val_loss: 0.8047 - val_accuracy: 0.6828
Epoch 7/100
623/623 [==============================] - 0s 105us/step - loss: 0.7671 - accuracy: 0.6934 - val_loss: 0.9918 - val_accuracy: 0.6679
Epoch 8/100
623/623 [==============================] - 0s 126us/step - loss: 2.2968 - accuracy: 0.6629 - val_loss: 1.7789 - val_accuracy: 0.7425
Use restore_best_weights=True, with monitor set to the quantity you care about. The best weights are then restored automatically when early stopping ends the training.
early_stopping_monitor = EarlyStopping(patience=3,
monitor='val_loss', # assuming it's val_loss
restore_best_weights=True )
From docs:
restore_best_weights: whether to restore model weights from the epoch with the best value of the monitored quantity ('val_loss' here). If False, the model weights obtained at the last step of training are used (default False).
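To make the patience rule concrete, here is a small plain-Python sketch that replays the val_loss values from the log in the question. It is a simplified re-implementation of the idea (stop after patience epochs without a new best value), written under that assumption rather than copied from the Keras source; the function name simulate_early_stopping is hypothetical.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch), both 1-based.

    Simplified patience rule (an assumption, not the Keras source):
    stop once `patience` epochs in a row fail to beat the best loss so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# val_loss values copied from the training log in the question
val_losses = [1.2208, 0.7193, 1.3778, 2.7310, 0.5952, 0.8047, 0.9918, 1.7789]
stopped_epoch, best_epoch = simulate_early_stopping(val_losses, patience=3)
print(stopped_epoch, best_epoch)  # 8 5
```

Under this rule the lowest val_loss in the log is at epoch 5 (0.5952), not epoch 7, so those are the weights restore_best_weights=True would be expected to bring back.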
All the code below is for TensorFlow 2.0. Another option is the ModelCheckpoint callback, which saves the model during training. Its main arguments are:
filepath: a string that can contain formatting options such as the epoch number. A common example is weights.{epoch:02d}-{val_loss:.2f}.hdf5
monitor: the quantity to monitor (typically 'val_loss' or 'val_accuracy')
mode: whether the monitored value should be minimized or maximized (typically 'min' or 'max')
save_best_only: if True, the model is saved at the end of an epoch only if its monitored value is better than any seen before. If False, the model is saved after every epoch, regardless of whether it improved on previous epochs.
Code
model=Sequential()
model.add(Dense(100, activation='relu',input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(2,activation='softmax'))
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
fname = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"
checkpoint = tf.keras.callbacks.ModelCheckpoint(fname, monitor="val_loss",mode="min", save_best_only=True, verbose=1)
model.fit(X_train_np,target,validation_split=0.3, epochs=100, callbacks=[checkpoint])
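The filepath template uses ordinary Python format fields, so its effect can be previewed with str.format alone. This is a sketch of the naming only, assuming the template is filled with the epoch number and the logged val_loss; the sample values are taken from epoch 5 of the log in the question.

```python
# Template from the answer above
template = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Epoch 5 of the log in the question had val_loss 0.5952
filename = template.format(epoch=5, val_loss=0.5952)
print(filename)  # weights.05-0.60.hdf5
```

Because the name changes with every improvement, save_best_only=True leaves one file per new best epoch; the file with the lowest val_loss in its name is the one to load.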

Keras - Why is the accuracy of my CNN model not being affected by the hyper-parameters?

As the title describes, the accuracy of my simple CNN model is not affected by the hyper-parameters, or even by the presence of layers such as Dropout and MaxPooling. I implemented the model using Keras. What could be the reason behind this odd behavior? The relevant part of the code is below:
input_dim = X_train.shape[1]
nb_classes = Y_train.shape[1]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(input_dim, 1)))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(40, activation='relu'))
model.add(Dense(nb_classes, activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
P.S. The input data (X_train and X_test) contains vectors generated by Word2Vec. The output is binary.
Edit: a sample training log is below:
Train on 3114 samples, validate on 347 samples
Epoch 1/10
- 1s - loss: 0.6917 - accuracy: 0.5363 - val_loss: 0.6901 - val_accuracy: 0.5476
Epoch 2/10
- 1s - loss: 0.6906 - accuracy: 0.5369 - val_loss: 0.6896 - val_accuracy: 0.5476
Epoch 3/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6895 - val_accuracy: 0.5476
Epoch 4/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6903 - val_accuracy: 0.5476
Epoch 5/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6899 - val_accuracy: 0.5476
Epoch 6/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6901 - val_accuracy: 0.5476
Epoch 7/10
- 1s - loss: 0.6905 - accuracy: 0.5369 - val_loss: 0.6896 - val_accuracy: 0.5476
Epoch 8/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6897 - val_accuracy: 0.5476
Epoch 9/10
- 1s - loss: 0.6905 - accuracy: 0.5369 - val_loss: 0.6892 - val_accuracy: 0.5476
Epoch 10/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6900 - val_accuracy: 0.5476
First you need to change the last layer to
model.add(Dense(1, activation='sigmoid'))
You also need to change the loss function to
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
I assume that you have multi-class classification, right?
Then your loss is not appropriate: you should use 'categorical_crossentropy' not 'mean_squared_error'.
Also, try adding several Conv+Drop+MaxPool (3 sets) in order to clearly verify the robustness of your network.
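The two answers describe two ways of encoding a two-class problem: a single sigmoid unit with binary_crossentropy, or two softmax units with categorical_crossentropy on one-hot labels. The plain-Python sketch below, with a made-up prediction of 0.8, shows that both losses give the same value for the same prediction, so either pairing is consistent as long as the labels match the output layer. The function names mirror the Keras loss names but are written here from the standard formulas, for a single sample.

```python
import math


def binary_crossentropy(y_true, p):
    """Loss for one sample; p is the predicted probability of class 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def categorical_crossentropy(one_hot, probs):
    """Loss for one sample with a one-hot label and per-class probabilities."""
    return -sum(t * math.log(q) for t, q in zip(one_hot, probs))


p = 0.8  # hypothetical predicted probability of the positive class

# Setup 1: label 1, one sigmoid output
bce = binary_crossentropy(1, p)
# Setup 2: label [0, 1], two softmax outputs that sum to 1
cce = categorical_crossentropy([0, 1], [1 - p, p])

print(bce, cce)  # both are -log(0.8)
```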

LSTM only train once using keras

I'm trying to train an LSTM model to predict temperature, but the model only seems to train during the first epoch.
I collected the CPU usage and temperature from a server over about twenty hours as the dataset. I want to predict the CPU temperature 10 minutes ahead from the previous 10 minutes of data, so I reshaped the dataset to (1301, 10, 2): 1301 samples, 10 timesteps, and 2 features. I then split it into 1201 training samples and 100 validation samples.
I checked the dataset manually, so it should be right.
I created the LSTM model as below:
model = Sequential()
model.add(LSTM(10, activation="relu", input_shape=(train_x.shape[1], train_x.shape[2]),return_sequences=True))
model.add(Flatten())
model.add(Dense(1, activation="softmax"))
model.compile(loss='mean_absolute_error', optimizer='RMSprop')
and try to fit it
model.fit(train_x, train_y, epochs=50, batch_size=32, validation_data=(test_x, test_y), verbose=2)
I got the log like this:
Epoch 1/50
- 1s - loss: 0.8016 - val_loss: 0.8147
Epoch 2/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 3/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 4/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 5/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 6/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 7/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 8/50
- 0s - loss: 0.8016 - val_loss: 0.8147
Epoch 9/50
- 0s - loss: 0.8016 - val_loss: 0.8147
The training time of each epoch is 0 except for the first epoch, and the loss never decreases. I tried changing the number of LSTM cells, the loss function, and the optimizer, but it still doesn't work.
Changing the activation function of the last layer from softmax to sigmoid makes the model work. Thanks to @giser_yugang and @Ashwin Geet D'Sa.
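The reason the fix works can be checked without Keras. Softmax normalizes across the units of a layer, so with a single unit, as in Dense(1, activation="softmax"), the output is 1.0 whatever the input, and the loss cannot move. The sketch below uses hand-written softmax and sigmoid helpers and arbitrary sample inputs, all of them illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Arbitrary pre-activation values for the single output unit
for z in (-5.0, 0.0, 3.0):
    # softmax over one unit is always [1.0]; sigmoid still depends on z
    print(z, softmax([z]), round(sigmoid(z), 4))
```

With the output pinned at 1.0, the mean absolute error reduces to a constant that depends only on the targets, which matches the identical loss on every epoch in the log.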

Predicting the price of the natural gas using LSTM neural network

I want to build a model using Keras to predict the price of natural gas.
The dataset contains the daily and monthly gas prices since 1997 and is available here.
The following graph shows the prices over a sequence of days. X is days and Y is the price.
I have tried an LSTM with 4, 50, and 100 cells in the hidden layer, but the accuracy was still bad and the model failed to predict future prices.
I also added two more fully connected hidden layers with 100 and 128 cells, but that did not work either.
This is the model and the result from the training process:
num_units = 100
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 5
num_epochs = 10
log_file_name = f"{SEQ_LEN}-SEQ-{1}-PRED-{int(time.time())}"
# Initialize the model (of a Sequential type)
model = Sequential()
# Adding the input layer and the LSTM layer
model.add(LSTM(units = num_units, activation = activation_function,input_shape=(None, 1)))
# Adding the output layer
model.add(Dense(units = 1))
# Compiling the RNN
model.compile(optimizer = optimizer, loss = loss_function, metrics=['accuracy'])
# Using the training set to train the model
history = model.fit(train_x, train_y, batch_size = batch_size, epochs =num_epochs,validation_data=(test_x, test_y))
and the output is :
Train on 4362 samples, validate on 1082 samples
Epoch 1/10
4362/4362 [==============================] - 11s 3ms/step - loss: 0.0057 - acc: 2.2925e-04 - val_loss: 0.0016 - val_acc: 0.0018
Epoch 2/10
4362/4362 [==============================] - 9s 2ms/step - loss: 6.2463e-04 - acc: 4.5851e-04 - val_loss: 0.0013 - val_acc: 0.0018
Epoch 3/10
4362/4362 [==============================] - 9s 2ms/step - loss: 6.1073e-04 - acc: 2.2925e-04 - val_loss: 0.0014 - val_acc: 0.0018
Epoch 4/10
4362/4362 [==============================] - 8s 2ms/step - loss: 5.4403e-04 - acc: 4.5851e-04 - val_loss: 0.0014 - val_acc: 0.0018
Epoch 5/10
4362/4362 [==============================] - 7s 2ms/step - loss: 5.4765e-04 - acc: 4.5851e-04 - val_loss: 0.0012 - val_acc: 0.0018
Epoch 6/10
4362/4362 [==============================] - 8s 2ms/step - loss: 5.1991e-04 - acc: 4.5851e-04 - val_loss: 0.0013 - val_acc: 0.0018
Epoch 7/10
4362/4362 [==============================] - 7s 2ms/step - loss: 5.7324e-04 - acc: 2.2925e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 8/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.4248e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 9/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.3868e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 10/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.6654e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
How do I choose the number of layers and cells for a problem like this? Can anyone suggest a network structure that can solve it?

Keras model output information/log level

I am using Keras to build a neural network model:
model_keras = Sequential()
model_keras.add(Dense(4, input_dim=input_num, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model_keras.add(Dense(1, activation='linear',kernel_regularizer=regularizers.l2(0.01)))
sgd = optimizers.SGD(lr=0.01, clipnorm=0.5)
model_keras.compile(loss='mean_squared_error', optimizer=sgd)
model_keras.fit(X_norm_train, y_norm_train, batch_size=20, epochs=100)
The output looks like the log below. Is it possible to output the loss, say, every 10 epochs instead of every epoch? Thanks!
Epoch 1/200
20/20 [==============================] - 0s - loss: 0.2661
Epoch 2/200
20/20 [==============================] - 0s - loss: 0.2625
Epoch 3/200
20/20 [==============================] - 0s - loss: 0.2590
Epoch 4/200
20/20 [==============================] - 0s - loss: 0.2556
Epoch 5/200
20/20 [==============================] - 0s - loss: 0.2523
Epoch 6/200
20/20 [==============================] - 0s - loss: 0.2490
Epoch 7/200
20/20 [==============================] - 0s - loss: 0.2458
Epoch 8/200
20/20 [==============================] - 0s - loss: 0.2427
Epoch 9/200
20/20 [==============================] - 0s - loss: 0.2397
Epoch 10/200
20/20 [==============================] - 0s - loss: 0.2367
Epoch 11/200
20/20 [==============================] - 0s - loss: 0.2338
Epoch 12/200
20/20 [==============================] - 0s - loss: 0.2309
Epoch 13/200
20/20 [==============================] - 0s - loss: 0.2281
Epoch 14/200
20/20 [==============================] - 0s - loss: 0.2254
Epoch 15/200
20/20 [==============================] - 0s - loss: 0.2228
:
It is not possible to reduce frequency of logging to stdout, however, passing verbose=0 argument to fit() method would turn logging completely off.
Since the loop over epochs is not exposed in the Keras' sequential model, one way to collect scalar variable summaries with a custom frequency would be using Keras callbacks. In particular, you could use TensorBoard (assuming you are running with tensorflow backend) or CSVLogger (any backend) callbacks to collect any scalar variable summaries (training loss, in your case):
from keras.callbacks import TensorBoard
model_keras = Sequential()
model_keras.add(Dense(4, input_dim=input_num, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model_keras.add(Dense(1, activation='linear',kernel_regularizer=regularizers.l2(0.01)))
sgd = optimizers.SGD(lr=0.01, clipnorm=0.5)
model_keras.compile(loss='mean_squared_error', optimizer=sgd)
TB = TensorBoard(histogram_freq=10, batch_size=20)
model_keras.fit(X_norm_train, y_norm_train, batch_size=20, epochs=100, callbacks=[TB])
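Note that histogram_freq=10 controls how often (in epochs) weight histograms are computed; scalar summaries such as the loss are written to the TensorBoard logs every epoch by default.
EDIT: passing validation_data=(...) to the fit method also makes validation metrics available.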
Create a Keras callback to reduce the number of log lines. By default, Keras prints a log line for every epoch. The following code prints roughly 10 log lines, regardless of the number of epochs.
import tensorflow as tf

class callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        loss = logs["loss"]
        if epoch % Lafte == Lafte - 1:  # log after every Lafte epochs
            print(f"Average batch loss: {loss:.9f}")
        if epoch == Epochs - 1:
            print(f"Fin-avg batch loss: {loss:.9f}")  # final average

Model = model()  # model() stands for your own model-building function
Model.compile(...)
Dsize = ...  # number of samples in training data
Bsize = ...  # number of samples to process in 1 batch
Steps = 1000  # number of batches to use to train
Epochs = round(Steps / (Dsize / Bsize))
Lafte = round(Epochs / 10)  # log about 10 times, regardless of number of epochs
if Lafte == 0:
    Lafte = 1  # avoid modulus by zero in on_epoch_end
Model.fit(Data, epochs=Epochs, steps_per_epoch=round(Dsize / Bsize),
          callbacks=[callback()], verbose=0)
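The modulus rule in the callback above can be tried on its own to see which epochs produce a log line. The sketch below reproduces only that arithmetic in plain Python; logged_epochs and target_lines are hypothetical names introduced for illustration.

```python
def logged_epochs(total_epochs, target_lines=10):
    """1-based epochs at which the periodic log line would be printed."""
    log_every = round(total_epochs / target_lines)
    if log_every == 0:
        log_every = 1  # same guard as in the answer: avoid modulus by zero
    return [e + 1 for e in range(total_epochs)
            if e % log_every == log_every - 1]


print(logged_epochs(100))  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(logged_epochs(5))    # [1, 2, 3, 4, 5]
```

The count is exactly 10 only when the number of epochs is a multiple of 10; for other values the rounding of the interval makes it somewhat more or fewer.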

Resources