Keras deep learning output format issue - python-3.x

I'm a newbie in deep learning and Keras. I really hope folks with experience in this field can help me answer the following question.
I downloaded the cifar10_cnn.py example from the Keras GitHub repository. I ran it with Python 3.5.2 and Keras 2.0.2, and tried both backends, TensorFlow 0.12.0-rc0 and Theano 0.9.0. Unfortunately, both of them print output like this:
Epoch 1/200
1/1562 [..............................] - ETA: 92s - loss: 2.2861 - acc: 0.1562
3/1562 [..............................] - ETA: 65s - loss: 2.3133 - acc: 0.1354
5/1562 [..............................] - ETA: 59s - loss: 2.3202 - acc: 0.1125
7/1562 [..............................] - ETA: 57s - loss: 2.3168 - acc: 0.1071
What I expect is something like this:
Epoch 1/200
32/50000 [..............................] - ETA: 3138s - loss: 2.3238 - acc: 0.0625
64/50000 [..............................] - ETA: 1579s - loss: 2.3165 - acc: 0.0625
96/50000 [..............................] - ETA: 1059s - loss: 2.3091 - acc: 0.0625
128/50000 [..............................] - ETA: 798s - loss: 2.3070 - acc: 0.0781
160/50000 [..............................] - ETA: 643s - loss: 2.3056 - acc: 0.0750
You can see that 50000/32 = 1562.5, but I don't know why the output changed like that. It's very confusing for a newcomer to see the numerator start at 1 and the denominator be 1562. Is this change related to Python 3?
Another thing that confuses me is where this output comes from. Which API produces it?
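A likely explanation (worth verifying against the exact cifar10_cnn.py you downloaded) is that the Keras 2 version of the example trains with fit_generator, and for generator-based training the progress bar counts batches (steps) per epoch rather than samples. A rough sketch of the arithmetic, assuming the script derives its step count by integer division:
num_samples = 50000   # CIFAR-10 training images
batch_size = 32

steps_per_epoch = num_samples // batch_size
print(steps_per_epoch)  # 1562 -- the denominator in the new-style progress bar

# Keras 1 style fit() counted samples (32/50000, 64/50000, ...),
# while fit_generator counts steps (1/1562, 2/1562, ...).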

Related

Can keras progress bar show instantaneous metrics rather than running average?

By "progress bar" I mean the standard progress bar that shows up with tf.keras.Model.fit
As I understand, it shows a running average of your selected metrics (over the current epoch), but I want it to show the value at the last completed iteration.
Is there a built-in way to make this change? And if not, what would be the easiest way?
I made a callback a while ago to solve this problem.
from keras.callbacks import Callback

class print_on_end(Callback):
    def on_batch_end(self, batch, logs={}):
        print()
You want to call it like this:
model.fit(training_dataset, steps_per_epoch=num_training_samples, epochs=EPOCHS, validation_data=validation_dataset, callbacks=[print_on_end()])
But this callback prints the averages, just on different lines, so I don't think it's what you want.
This instead:
class LossAndErrorPrintingCallback(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        print("For batch {}, loss is {:7.2f}.".format(batch, logs["loss"]))
This callback prints the loss of every batch, so it should be what you are looking for.
(If you need a metric instead of the loss, just replace logs["loss"] with logs["name of the metric"], e.g. logs["mean_absolute_error"].)
EDIT:
To check the name of the metric inside logs, you can print the keys of the logs dict and find the one you are looking for:
class PrintKeys(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        keys = list(logs.keys())
        print(keys)
In that method you should only find the keys for the loss and the metric.
Source: https://keras.io/guides/writing_your_own_callbacks/
An example of the class hierarchy of the MeanSquaredError metric is
MeanSquaredError->MeanMetricWrapper->Mean->Reduce->Metric
Is there a built-in way?
The main problem is that all metrics are subclasses of the Reduce metric, which performs the aggregation, and there is nothing foreseen to change the behaviour of the Reduce base class.
How to achieve this most easily
Given the hierarchy above, you can achieve what you want by creating a new metric that subclasses MeanMetricWrapper and overrides its update_state method so that it first calls self.reset_state and then MeanMetricWrapper.update_state. That way, the aggregates in the underlying Reduce base class only ever cover a single value. Working example below:
#! /usr/bin/env python
import numpy as np
import keras
from keras.metrics import MeanMetricWrapper

x = np.linspace(0, 1, 20000)[:, np.newaxis, np.newaxis]
y = np.sin(x * 2 * np.pi)

model = keras.Sequential()
model.add(keras.layers.Dense(4, activation="tanh", input_shape=(1, 1)))
model.add(keras.layers.Dense(4, activation="tanh"))
model.add(keras.layers.Dense(4))

#####
# Here the Instantaneous metric variant
class InstMetric(MeanMetricWrapper):
    def __init__(self, fn, **kwargs):
        """fn is the callable loss function you want to use in your metric"""
        super().__init__(fn=fn, **kwargs)

    def update_state(self, y_true, y_pred, sample_weight=None):
        self.reset_states()
        return super().update_state(y_true, y_pred, sample_weight=sample_weight)
#####

model.compile(optimizer='adam', loss='mean_squared_error',
              metrics=[
                  keras.metrics.MeanSquaredError(name="MSE"),
                  InstMetric(keras.metrics.mean_squared_error, name="IMSE")
              ])

model.fit(x=x, y=y, epochs=1, batch_size=5, steps_per_epoch=1000)
Storing this script as inst_demo.py and running it through tr to unfold the progress bar in the terminal, you get:
$> ./inst_demo.py | tr \\r \\n
1/1000 [..............................] - ETA: 8:07 - loss: 0.4656 - MSE: 0.4656 - IMSE: 0.4656
42/1000 [>.............................] - ETA: 1s - loss: 0.4874 - MSE: 0.4874 - IMSE: 0.4133
87/1000 [=>............................] - ETA: 1s - loss: 0.4685 - MSE: 0.4685 - IMSE: 0.4764
132/1000 [==>...........................] - ETA: 1s - loss: 0.4627 - MSE: 0.4627 - IMSE: 0.5445
175/1000 [====>.........................] - ETA: 0s - loss: 0.4558 - MSE: 0.4558 - IMSE: 0.7689
217/1000 [=====>........................] - ETA: 0s - loss: 0.4443 - MSE: 0.4443 - IMSE: 0.1058
264/1000 [======>.......................] - ETA: 0s - loss: 0.4258 - MSE: 0.4258 - IMSE: 0.4162
311/1000 [========>.....................] - ETA: 0s - loss: 0.4090 - MSE: 0.4090 - IMSE: 0.1716
356/1000 [=========>....................] - ETA: 0s - loss: 0.3889 - MSE: 0.3889 - IMSE: 0.3417
400/1000 [===========>..................] - ETA: 0s - loss: 0.3707 - MSE: 0.3707 - IMSE: 0.1271
445/1000 [============>.................] - ETA: 0s - loss: 0.3532 - MSE: 0.3532 - IMSE: 0.0729
489/1000 [=============>................] - ETA: 0s - loss: 0.3383 - MSE: 0.3383 - IMSE: 0.2310
535/1000 [===============>..............] - ETA: 0s - loss: 0.3248 - MSE: 0.3248 - IMSE: 0.1228
580/1000 [================>.............] - ETA: 0s - loss: 0.3143 - MSE: 0.3143 - IMSE: 0.2670
625/1000 [=================>............] - ETA: 0s - loss: 0.3048 - MSE: 0.3048 - IMSE: 0.1762
671/1000 [===================>..........] - ETA: 0s - loss: 0.2962 - MSE: 0.2962 - IMSE: 0.0751
715/1000 [====================>.........] - ETA: 0s - loss: 0.2896 - MSE: 0.2896 - IMSE: 0.0650
756/1000 [=====================>........] - ETA: 0s - loss: 0.2831 - MSE: 0.2831 - IMSE: 0.2332
799/1000 [======================>.......] - ETA: 0s - loss: 0.2773 - MSE: 0.2773 - IMSE: 0.1026
841/1000 [========================>.....] - ETA: 0s - loss: 0.2721 - MSE: 0.2721 - IMSE: 0.1238
888/1000 [=========================>....] - ETA: 0s - loss: 0.2673 - MSE: 0.2673 - IMSE: 0.1471
936/1000 [===========================>..] - ETA: 0s - loss: 0.2631 - MSE: 0.2631 - IMSE: 0.2242
986/1000 [============================>.] - ETA: 0s - loss: 0.2580 - MSE: 0.2580 - IMSE: 0.2704
1000/1000 [==============================] - 2s 1ms/step - loss: 0.2574 - MSE: 0.2574 - IMSE: 0.2773
So you get an instantaneous value each time the progress bar is updated.
You can also derive the instantaneous metric directly from one of the available Keras metric classes if you don't need to choose the wrapped function at construction time.
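For instance, a minimal sketch of what a hard-wired instantaneous MSE could look like (assuming keras.metrics.MeanSquaredError as the base class; newer Keras versions name the reset method reset_state rather than reset_states):
import keras

class InstMSE(keras.metrics.MeanSquaredError):
    # Instantaneous MSE: drop the running aggregate before each update,
    # so the reported value reflects only the most recent batch.
    def update_state(self, y_true, y_pred, sample_weight=None):
        self.reset_states()  # reset_state() on newer Keras versions
        return super().update_state(y_true, y_pred, sample_weight=sample_weight)

# usage: model.compile(..., metrics=[keras.metrics.MeanSquaredError(name="MSE"), InstMSE(name="IMSE")])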

What's the meaning of the number before the progress bar when tensorflow is training

Could anyone tell me the meaning of the '10' and '49' in the following TensorFlow log?
Many thanks.
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 5.899410247802734 secs
10/10 [==============================] - 23s 2s/step - loss: 2.6726 - acc: 0.1459
49/49 [==============================] - 108s 2s/step - loss: 2.3035 - acc: 0.2845 - val_loss: 2.6726 - val_acc: 0.1459
Epoch 2/100
10/10 [==============================] - 1s 133ms/step - loss: 2.8799 - acc: 0.1693
49/49 [==============================] - 17s 337ms/step - loss: 1.9664 - acc: 0.4042 - val_loss: 2.8799 - val_acc: 0.1693
10 and 49 correspond to the number of batches your dataset has been divided into in each epoch.
For example, if your train dataset has 10,000 images in total and your batch size is 64, then there will be math.ceil(10000/64) = 157 batches in each epoch.
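As a small sketch of the same arithmetic (the numbers are the illustrative ones from this answer, not from the question's log):
import math

num_samples = 10000   # total training images (illustrative)
batch_size = 64

steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 157 -- the number shown before the progress bar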

'loss: nan' during training of Neural Network in Keras

I am training a neural net in Keras. During the first epoch the loss value is reported normally, then suddenly becomes loss: nan before the first epoch ends, and the accuracy drops significantly. From the second epoch onwards the loss: nan continues but the accuracy is 0. This goes on for the rest of the epochs.
The frustrating bit is that there seems to be no consistency in the output each time I train; the loss: nan shows up at different points in the first epoch.
There have been a couple of questions on this site that give "guides" for problems similar to this, but I haven't seen one handled so explicitly in Keras.
I am trying to get my neural network to classify a 1 or a 0.
Here are some things I have done; following this will be my output and code.
Standardization // Normalization
I posted a question about my data here. I was able to figure it out and applied sklearn's StandardScaler() and MinMaxScaler() to my dataset. Neither standardization nor normalization helped my issue.
Learning Rate
The optimizers I have tried are adam and SGD. In both cases I tried lowering the standard learning rate to see if that would help; the same issue arose.
Activations
I thought that it was pretty standard to use relu, but I saw someone on the internet talking about using tanh; I tried it, no dice.
Batch Size
Tried 32, 50, 128, 200. 50 got me the farthest into the 1st epoch, everything else didn't help.
Combating Overfitting
Put a dropout layer in and tried a whole bunch of numbers.
Other Observations
The epochs train really really fast for the dimensions of the data (I could be wrong).
loss: nan could have something to do with my loss function being binary_crossentropy and maybe some values are giving that loss function a hard time.
kernel_initializer='uniform' has been untouched and unconsidered in my quest to figure this out.
The internet also told me that there could be a nan value in my data but I think that was for an error that broke their script.
from sklearn.preprocessing import MinMaxScaler
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense, Dropout

sc = MinMaxScaler()
X_train_total_scale = sc.fit_transform(X_train)
X_test_total_scale = sc.transform(X_test)

print(X_train_total_scale.shape)  # (4140, 2756)
print(y_train.shape)  # (4140,)

## NN
# adam = keras.optimizers.Adam(lr=0.0001)
sgd = optimizers.SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)

classifier = Sequential()
classifier.add(Dense(output_dim=1379, kernel_initializer='uniform', activation='relu', input_dim=2756))
classifier.add(Dropout(0.6))
classifier.add(Dense(output_dim=1379, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(output_dim=1, kernel_initializer='uniform', activation='sigmoid'))

classifier.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])

classifier.fit(X_train_total_scale, y_train, validation_data=(X_test_total_scale, y_test), batch_size=50, epochs=100)
(batch size 200 shown to avoid too-big-a text block)
200/4140 [>.............................] - ETA: 7s - loss: 0.6866 - acc: 0.5400
400/4140 [=>............................] - ETA: 4s - loss: 0.6912 - acc: 0.5300
600/4140 [===>..........................] - ETA: 2s - loss: nan - acc: 0.5300
800/4140 [====>.........................] - ETA: 2s - loss: nan - acc: 0.3975
1000/4140 [======>.......................] - ETA: 1s - loss: nan - acc: 0.3180
1200/4140 [=======>......................] - ETA: 1s - loss: nan - acc: 0.2650
1400/4140 [=========>....................] - ETA: 1s - loss: nan - acc: 0.2271
1600/4140 [==========>...................] - ETA: 1s - loss: nan - acc: 0.1987
1800/4140 [============>.................] - ETA: 1s - loss: nan - acc: 0.1767
2000/4140 [=============>................] - ETA: 0s - loss: nan - acc: 0.1590
2200/4140 [==============>...............] - ETA: 0s - loss: nan - acc: 0.1445
2400/4140 [================>.............] - ETA: 0s - loss: nan - acc: 0.1325
2600/4140 [=================>............] - ETA: 0s - loss: nan - acc: 0.1223
2800/4140 [===================>..........] - ETA: 0s - loss: nan - acc: 0.1136
3000/4140 [====================>.........] - ETA: 0s - loss: nan - acc: 0.1060
3200/4140 [======================>.......] - ETA: 0s - loss: nan - acc: 0.0994
3400/4140 [=======================>......] - ETA: 0s - loss: nan - acc: 0.0935
3600/4140 [=========================>....] - ETA: 0s - loss: nan - acc: 0.0883
3800/4140 [==========================>...] - ETA: 0s - loss: nan - acc: 0.0837
4000/4140 [===========================>..] - ETA: 0s - loss: nan - acc: 0.0795
4140/4140 [==============================] - 2s 368us/step - loss: nan - acc: 0.0768 - val_loss: nan - val_acc: 0.0000e+00
Epoch 2/100
200/4140 [>.............................] - ETA: 1s - loss: nan - acc: 0.0000e+00
400/4140 [=>............................] - ETA: 0s - loss: nan - acc: 0.0000e+00
600/4140 [===>..........................] - ETA: 0s - loss: nan - acc: 0.0000e+00
800/4140 [====>.........................] - ETA: 0s - loss: nan - acc: 0.0000e+00
1000/4140 [======>.......................] - ETA: 0s - loss: nan - acc: 0.0000e+00
1200/4140 [=======>......................] - ETA: 0s - loss: nan - acc: 0.0000e+00
1400/4140 [=========>....................] - ETA: 0s - loss: nan - acc: 0.0000e+00
1600/4140 [==========>...................] - ETA: 0s - loss: nan - acc: 0.0000e+00
... and so on...
I hope to be able to get a full training done (duh) but I would also like to learn about some of the intuition people have to figure out these problems on their own!
Firstly, check for NaN or inf values in your dataset.
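For instance, a quick sketch of such a check with NumPy (report_bad_values is a hypothetical helper; X_train_total_scale and y_train are the arrays from the question):
import numpy as np

def report_bad_values(name, arr):
    # Count NaN / inf entries in an array
    arr = np.asarray(arr, dtype=np.float64)
    print(name, "-", np.isnan(arr).sum(), "NaNs,", np.isinf(arr).sum(), "infs")

# e.g. with the arrays from the question:
# report_bad_values("X_train", X_train_total_scale)
# report_bad_values("y_train", y_train)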
You could try different optimizers, e.g. rmsprop.
Learning rate could be smaller, though I haven't used anything lower than 0.0001 (which is what you're using) myself.
I thought that it was pretty standard to use relu but I saw on the internet somewhere someone talking about using tanh, tried it, no dice
Try leaky relu or elu if you're concerned about this.
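As a sketch of how that swap could look with the layers from the question (in Keras, LeakyReLU is added as its own layer rather than passed as an activation string; the sizes below simply mirror the question's model):
from keras.models import Sequential
from keras.layers import Dense, Dropout, LeakyReLU

classifier = Sequential()
classifier.add(Dense(1379, kernel_initializer='uniform', input_dim=2756))
classifier.add(LeakyReLU(alpha=0.1))   # leaky relu in place of plain relu
classifier.add(Dropout(0.6))
classifier.add(Dense(1379, kernel_initializer='uniform'))
classifier.add(LeakyReLU(alpha=0.1))
classifier.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))

# elu, by contrast, can be passed directly: Dense(..., activation='elu')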

In Keras model fit which parameters can tell whether Data is wrong or model is not good

I am training a simple model in Keras for a label classification task with the following code.
This dataset has 5 classes, so the final layer of the network has 5 outputs.
The labels are also one-hot encoded. Here are my results:
32/4000 [..............................] - ETA: 0s - loss: 0.2264 - acc: 0.8750
2176/4000 [===============>..............] - ETA: 0s - loss: 0.3092 - acc: 0.8755
4000/4000 [==============================] - 0s 26us/step - loss: 0.2870 - acc: 0.8805 - val_loss: 15.9636 - val_acc: 0.0070
Epoch 99/100
32/4000 [..............................] - ETA: 0s - loss: 0.1408 - acc: 0.9688
2176/4000 [===============>..............] - ETA: 0s - loss: 0.2696 - acc: 0.8824
4000/4000 [==============================] - 0s 25us/step - loss: 0.2729 - acc: 0.8868 - val_loss: 15.9731 - val_acc: 0.0070
Epoch 100/100
32/4000 [..............................] - ETA: 0s - loss: 0.2299 - acc: 0.9375
2176/4000 [===============>..............] - ETA: 0s - loss: 0.2861 - acc: 0.8787
4000/4000 [==============================] - 0s 25us/step - loss: 0.2763 - acc: 0.8865 - val_loss: 15.9791 - val_acc: 0.0070
10/1000 [..............................] - ETA: 0s
1000/1000 [==============================] - 0s 26us/step
32/5000 [..............................] - ETA: 0s
5000/5000 [==============================] - 0s 9us/step
When I run tests at the end of training I get almost 100% error on the test data.
I have looked at many related posts, but could not figure out what is wrong.
Any advice?

weird root mean squared error behavior in a CNN regression task in Keras

I am using a CNN similar to AlexNet for an image-related regression task. I defined an RMSE loss function. However, during training the loss returned a huge value in the first epoch, but from the second epoch onwards it dropped to a meaningful value. Here it is:
 1/51 [..............................] - ETA: 847s - loss: 104.1821 - acc: 0.2500 - root_mean_squared_error: 104.1821
 2/51 [>.............................] - ETA: 470s - loss: 5277326.0910 - acc: 0.5938 - root_mean_squared_error: 5277326.0910
 3/51 [>.............................] - ETA: 345s - loss: 3518246.7337 - acc: 0.5000 - root_mean_squared_error: 3518246.7337
 4/51 [=>............................] - ETA: 281s - loss: 2640801.3379 - acc: 0.6094 - root_mean_squared_error: 2640801.3379
 5/51 [=>............................] - ETA: 241s - loss: 2112661.3062 - acc: 0.5000 - root_mean_squared_error: 2112661.3062
 6/51 [==>...........................] - ETA: 214s - loss: 1760566.4758 - acc: 0.4375 - root_mean_squared_error: 1760566.4758
 7/51 [===>..........................] - ETA: 194s - loss: 1509067.6495 - acc: 0.4464 - root_mean_squared_error: 1509067.6495
 8/51 [===>..........................] - ETA: 178s - loss: 1320442.6319 - acc: 0.4570 - root_mean_squared_error: 1320442.6319
 9/51 [====>.........................] - ETA: 165s - loss: 1173734.9212 - acc: 0.4792 - root_mean_squared_error: 1173734.9212
10/51 [====>.........................] - ETA: 155s - loss: 1056369.3193 - acc: 0.4875 - root_mean_squared_error: 1056369.3193
11/51 [=====>........................] - ETA: 146s - loss: 960343.5998 - acc: 0.4943 - root_mean_squared_error: 960343.5998
12/51 [======>.......................] - ETA: 139s - loss: 880320.3762 - acc: 0.5052 - root_mean_squared_error: 880320.3762
13/51 [======>.......................] - ETA: 131s - loss: 812608.7112 - acc: 0.5216 - root_mean_squared_error: 812608.7112
14/51 [=======>......................] - ETA: 125s - loss: 754570.1939 - acc: 0.5402 - root_mean_squared_error: 754570.1939
15/51 [=======>......................] - ETA: 120s - loss: 704269.2443 - acc: 0.5479 - root_mean_squared_error: 704269.2443
16/51 [========>.....................] - ETA: 114s - loss: 660256.3035 - acc: 0.5508 - root_mean_squared_error: 660256.3035
17/51 [========>.....................] - ETA: 109s - loss: 621420.7248 - acc: 0.5607 - root_mean_squared_error: 621420.7248
18/51 [=========>....................] - ETA: 104s - loss: 586900.8398 - acc: 0.5712 - root_mean_squared_error: 586900.8398
19/51 [==========>...................] - ETA: 100s - loss: 556014.6719 - acc: 0.5806 - root_mean_squared_error: 556014.6719
20/51 [==========>...................] - ETA: 95s - loss: 528216.9077 - acc: 0.5875 - root_mean_squared_error: 528216.9077
21/51 [===========>..................] - ETA: 91s - loss: 503065.7743 - acc: 0.5967 - root_mean_squared_error: 503065.7743
22/51 [===========>..................] - ETA: 87s - loss: 480206.3521 - acc: 0.6094 - root_mean_squared_error: 480206.3521
23/51 [============>.................] - ETA: 83s - loss: 459331.8636 - acc: 0.6114 - root_mean_squared_error: 459331.8636
24/51 [=============>................] - ETA: 80s - loss: 440196.2991 - acc: 0.6159 - root_mean_squared_error: 440196.2991
25/51 [=============>................] - ETA: 76s - loss: 422590.8381 - acc: 0.6162 - root_mean_squared_error: 422590.8381
26/51 [==============>...............] - ETA: 73s - loss: 406339.5179 - acc: 0.6178 - root_mean_squared_error: 406339.5179
27/51 [==============>...............] - ETA: 69s - loss: 391292.6992 - acc: 0.6238 - root_mean_squared_error: 391292.6992
28/51 [===============>..............] - ETA: 66s - loss: 377319.9851 - acc: 0.6306 - root_mean_squared_error: 377319.9851
29/51 [===============>..............] - ETA: 63s - loss: 364310.7557 - acc: 0.6336 - root_mean_squared_error: 364310.7557
30/51 [================>.............] - ETA: 60s - loss: 352169.1059 - acc: 0.6385 - root_mean_squared_error: 352169.1059
31/51 [=================>............] - ETA: 57s - loss: 340810.8854 - acc: 0.6401 - root_mean_squared_error: 340810.8854
32/51 [=================>............] - ETA: 53s - loss: 330162.1334 - acc: 0.6455 - root_mean_squared_error: 330162.1334
33/51 [==================>...........] - ETA: 50s - loss: 320158.7622 - acc: 0.6553 - root_mean_squared_error: 320158.7622
34/51 [==================>...........] - ETA: 47s - loss: 310744.0080 - acc: 0.6645 - root_mean_squared_error: 310744.0080
35/51 [===================>..........] - ETA: 44s - loss: 301866.8259 - acc: 0.6714 - root_mean_squared_error: 301866.8259
36/51 [====================>.........] - ETA: 41s - loss: 293483.0129 - acc: 0.6762 - root_mean_squared_error: 293483.0129
37/51 [====================>.........] - ETA: 39s - loss: 285552.8197 - acc: 0.6757 - root_mean_squared_error: 285552.8197
38/51 [=====================>........] - ETA: 36s - loss: 278039.4488 - acc: 0.6752 - root_mean_squared_error: 278039.4488
39/51 [=====================>........] - ETA: 33s - loss: 270911.4670 - acc: 0.6795 - root_mean_squared_error: 270911.4670
40/51 [======================>.......] - ETA: 30s - loss: 264140.2391 - acc: 0.6820 - root_mean_squared_error: 264140.2391
41/51 [=======================>......] - ETA: 27s - loss: 257699.1895 - acc: 0.6852 - root_mean_squared_error: 257699.1895
42/51 [=======================>......] - ETA: 25s - loss: 251564.6846 - acc: 0.6890 - root_mean_squared_error: 251564.6846
43/51 [========================>.....] - ETA: 22s - loss: 245715.4124 - acc: 0.6933 - root_mean_squared_error: 245715.4124
44/51 [========================>.....] - ETA: 19s - loss: 240131.9916 - acc: 0.6960 - root_mean_squared_error: 240131.9916
45/51 [=========================>....] - ETA: 16s - loss: 234796.6948 - acc: 0.7007 - root_mean_squared_error: 234796.6948
46/51 [=========================>....] - ETA: 14s - loss: 229693.3717 - acc: 0.7045 - root_mean_squared_error: 229693.3717
47/51 [==========================>...] - ETA: 11s - loss: 224807.2748 - acc: 0.7055 - root_mean_squared_error: 224807.2748
48/51 [===========================>..] - ETA: 8s - loss: 220125.0731 - acc: 0.7077 - root_mean_squared_error: 220125.0731
49/51 [===========================>..] - ETA: 5s - loss: 215634.5638 - acc: 0.7117 - root_mean_squared_error: 215634.5638
50/51 [============================>.] - ETA: 3s - loss: 211323.1692 - acc: 0.7144 - root_mean_squared_error: 211323.1692
51/51 [============================>.] - ETA: 0s - loss: 207180.6328 - acc: 0.7151 - root_mean_squared_error: 207180.6328
52/51 [==============================] - 143s - loss: 203253.6237 - acc: 0.7157 - root_mean_squared_error: 203253.6237 - val_loss: 44.4203 - val_acc: 0.9878 - val_root_mean_squared_error: 44.4203
Epoch 2/128
 1/51 [..............................] - ETA: 117s - loss: 52.6087 - acc: 0.7188 - root_mean_squared_error: 52.6087
How should I understand this behavior? Here is my implementation. First, define the RMSE function:
from keras import backend as K
def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
Then for the model:
model.compile(optimizer="rmsprop", loss=root_mean_squared_error, metrics=['accuracy', root_mean_squared_error])
Then fit the model:
estimator = alexmodel()

datagen = ImageDataGenerator()
datagen.fit(x_train)

start = time.time()
history = estimator.fit_generator(datagen.flow(x_train, x_train, batch_size=batch_size, shuffle=True),
                                  epochs=epochs,
                                  steps_per_epoch=x_train.shape[0] / batch_size,
                                  validation_data=(x_test, y_test))
end = time.time()
Can anyone tell me why that is? Is anything potentially wrong?
So - it's important to normalize your data. It seems that you haven't normalized your target, and since a network is usually initialized in such a way that it produces small values at the beginning, this is what made your loss so huge during the first epoch. I still advise you to normalize your target (using either StandardScaler or MinMaxScaler), because needing to produce values on a large scale forces the weights of your network to take much higher absolute values, which is something you should prevent.
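A minimal sketch of scaling the target as suggested (y_train, y_test and the other variable names are assumptions about the asker's code, and the target is assumed to be one value per sample):
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training target only, then reuse it for the test target
y_scaler = StandardScaler()
y_train_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1))
y_test_scaled = y_scaler.transform(y_test.reshape(-1, 1))

# ... train on y_train_scaled instead of y_train ...

# Undo the scaling when predictions are needed back in the original units:
# preds = y_scaler.inverse_transform(model.predict(x_test))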

Resources