resnet50 - increasing training speed? keras

How do I increase the speed of this? The loss is moving down by hairs. HAIRS.
Epoch 1/30
4998/4998 [==============================] - 307s 62ms/step - loss: 0.6861 - acc: 0.6347
Epoch 2/30
4998/4998 [==============================] - 316s 63ms/step - loss: 0.6751 - acc: 0.6387
Epoch 3/30
4998/4998 [==============================] - 357s 71ms/step - loss: 0.6676 - acc: 0.6387
Epoch 4/30
4998/4998 [==============================] - 376s 75ms/step - loss: 0.6625 - acc: 0.6387
Epoch 5/30
4998/4998 [==============================] - 354s 71ms/step - loss: 0.6592 - acc: 0.6387
Epoch 6/30
4998/4998 [==============================] - 345s 69ms/step - loss: 0.6571 - acc: 0.6387
Epoch 7/30
4998/4998 [==============================] - 349s 70ms/step - loss: 0.6559 - acc: 0.6387
Model architecture:
ResNet50 (a CNN with skip connections), except that instead of one fully connected layer I have two, and I changed the softmax output to a sigmoid for binary classification.
Number of positive training examples: 1806
Number of negative training examples: 3192
My output is a 1 or 0 for each example ([0, 0, 1, 1, ...]).
batches = 40, num epochs = 30, but that doesn't matter because the loss stopped improving.
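One thing worth checking in the log above: the accuracy is stuck at 0.6387, which is exactly the share of negative examples (3192 / 4998). That suggests the network may be predicting "negative" for everything. The sketch below only does the arithmetic on the counts given in the question. The class weights use a common "balanced" heuristic and are an assumption, not something the original poster used; they could be passed to Keras through the `class_weight` argument of `fit`.

```python
import math

# Counts taken from the question.
num_pos = 1806
num_neg = 3192
total = num_pos + num_neg

# Accuracy of a model that always predicts the majority class.
majority_baseline = max(num_pos, num_neg) / total

# Binary cross-entropy of a model that always outputs the positive rate.
p = num_pos / total
constant_loss = -(p * math.log(p) + (1 - p) * math.log(1 - p))

# "Balanced" class weights (heuristic, an assumption for illustration):
# each class contributes the same total weight to the loss.
class_weight = {0: total / (2 * num_neg), 1: total / (2 * num_pos)}

print(f"majority baseline accuracy: {majority_baseline:.4f}")
print(f"loss of a constant predictor: {constant_loss:.4f}")
print(f"class weights: {class_weight}")
```

If the reported loss is settling just above the constant-predictor value, the model has learned little beyond the class prior, and the inputs, labels, and learning rate are worth checking before tuning for speed.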

Related

What's the meaning of the number before the progress bar when tensorflow is training

Could anyone tell me the meaning of '10' and '49' in the following TensorFlow log?
Many thanks.
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 5.899410247802734 secs
10/10 [==============================] - 23s 2s/step - loss: 2.6726 - acc: 0.1459
49/49 [==============================] - 108s 2s/step - loss: 2.3035 - acc: 0.2845 - val_loss: 2.6726 - val_acc: 0.1459
Epoch 2/100
10/10 [==============================] - 1s 133ms/step - loss: 2.8799 - acc: 0.1693
49/49 [==============================] - 17s 337ms/step - loss: 1.9664 - acc: 0.4042 - val_loss: 2.8799 - val_acc: 0.1693

10 and 49 correspond to the number of batches your dataset is divided into in each epoch. Judging by the matching loss and val_loss values in the log, the 10-step bar is the validation pass and the 49-step bar is the training pass.
For example, if your training dataset has 10000 images and your batch size is 64, there will be math.ceil(10000/64) = 157 batches in each epoch.
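The arithmetic in this answer can be sketched as follows. The function name `batches_per_epoch` is made up for illustration; it is not a Keras or TensorFlow API.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once.

    The last batch may be smaller than batch_size, hence the ceiling.
    """
    return math.ceil(num_samples / batch_size)

# The example from the answer: 10000 images, batch size 64.
print(batches_per_epoch(10000, 64))
```

When the sample count divides evenly by the batch size, the ceiling changes nothing, for example 6400 samples at batch size 64 give exactly 100 batches.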

Keras loss function does not decrease on mean squared error

I implemented a neural network with Keras to predict the rating of an item. I consider each rating as a class, so this is my code (outputY is categorical):
inputLayerU = Input(shape=(features,))
inputLayerM = Input(shape=(features,))
dense1 = Dense(features, activation='relu')
denseU = dense1(inputLayerU)
denseM = dense1(inputLayerM)
concatLayer = concatenate([denseU, denseM], axis = 1)
denseLayer = Dense(features*2, activation='relu')(concatLayer)
outputLayer = Dense(5, activation='softmax')(denseLayer)
model = Model(inputs=[inputLayerU, inputLayerM], outputs=outputLayer)
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01), metrics=['accuracy'])
model.fit([inputU, inputM],outputY , epochs=10, steps_per_epoch=10)
When I train this network I get the following result which is fine:
10/10 [==============================] - 2s 187ms/step - loss: 1.4778 - acc: 0.3209
Epoch 2/10
10/10 [==============================] - 0s 49ms/step - loss: 1.4058 - acc: 0.3625
Epoch 3/10
10/10 [==============================] - 1s 54ms/step - loss: 1.3825 - acc: 0.3824
Epoch 4/10
10/10 [==============================] - 0s 47ms/step - loss: 1.3614 - acc: 0.3923
Epoch 5/10
10/10 [==============================] - 0s 48ms/step - loss: 1.3372 - acc: 0.4060
Epoch 6/10
10/10 [==============================] - 0s 45ms/step - loss: 1.3138 - acc: 0.4202
Epoch 7/10
10/10 [==============================] - 0s 46ms/step - loss: 1.2976 - acc: 0.4266
Epoch 8/10
10/10 [==============================] - 0s 48ms/step - loss: 1.2842 - acc: 0.4325
Epoch 9/10
10/10 [==============================] - 1s 62ms/step - loss: 1.2729 - acc: 0.4402
Epoch 10/10
10/10 [==============================] - 1s 54ms/step - loss: 1.2631 - acc: 0.4464
Then I treat the problem as regression and try to predict the value of the user ratings (I need to calculate the error both ways). So this is my code:
inputLayerU = Input(shape=(features,))
inputLayerM = Input(shape=(features,))
dense1 = Dense(features, activation='relu')
denseU = dense1(inputLayerU)
denseM = dense1(inputLayerM)
concatLayer = concatenate([denseU, denseM], axis = 1)
denseLayer = Dense(features*2, activation='relu')(concatLayer)
outputLayer = Dense(1, activation='softmax')(denseLayer)
model = Model(inputs=[inputLayerU, inputLayerM], outputs=outputLayer)
model.compile(loss='mean_squared_error', optimizer=Adam(lr=0.01), metrics=['accuracy'])
model.fit([inputU, inputM],outputY , epochs=10, steps_per_epoch=10)
and I get these results:
Epoch 1/10
10/10 [==============================] - 9s 894ms/step - loss: 7.9451 - acc: 0.0563
Epoch 2/10
10/10 [==============================] - 7s 711ms/step - loss: 7.9447 - acc: 0.0563
Epoch 3/10
10/10 [==============================] - 7s 709ms/step - loss: 7.9446 - acc: 0.0563
Epoch 4/10
10/10 [==============================] - 7s 710ms/step - loss: 7.9446 - acc: 0.0563
Epoch 5/10
10/10 [==============================] - 7s 702ms/step - loss: 7.9446 - acc: 0.0563
Epoch 6/10
10/10 [==============================] - 7s 706ms/step - loss: 7.9446 - acc: 0.0563
Epoch 7/10
10/10 [==============================] - 7s 701ms/step - loss: 7.9446 - acc: 0.0563
Epoch 8/10
10/10 [==============================] - 7s 702ms/step - loss: 7.9446 - acc: 0.0563
Epoch 9/10
10/10 [==============================] - 7s 717ms/step - loss: 7.9446 - acc: 0.0563
Epoch 10/10
10/10 [==============================] - 7s 700ms/step - loss: 7.9446 - acc: 0.0563
As you can see, it decreases only a little, and sometimes it doesn't change at all.
So what's wrong with my regression?

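First, I don't think softmax should be applied to the output of a regression model: softmax over a single unit always outputs 1.0, so the prediction never changes and the loss cannot decrease. Use a linear output (no activation) instead. Second, try using the Adam optimizer with its default parameters.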
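To see why softmax on a one-unit output layer cannot learn a regression target, here is a small NumPy sketch. The `softmax` helper is written out by hand for illustration, and the logits and ratings are made-up values, not data from the question.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Three samples, one output unit each, with very different pre-activations.
logits = np.array([[-3.0], [0.0], [7.5]])
predictions = softmax(logits)
print(predictions.ravel())  # every entry is 1.0

# With ratings as targets the prediction is stuck at 1.0, so the
# mean squared error does not depend on the weights at all.
ratings = np.array([[1.0], [3.0], [5.0]])
mse = float(np.mean((ratings - predictions) ** 2))
print(mse)
```

Because the output is constant, the gradient with respect to the weights is zero, which matches a loss that barely moves across epochs. In the model from the question, the corresponding change would be a final `Dense(1)` layer with no activation.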

Predicting the price of the natural gas using LSTM neural network

I want to build a model using Keras to predict the price of natural gas.
The dataset contains daily and monthly gas prices since 1997 and is available here.
The following graph shows the prices over a sequence of days. X is days and Y is the price.
I have tried an LSTM with 4, 50, and 100 cells in the hidden layer, but the accuracy was still bad and the model failed to predict future prices.
I also added two more hidden layers (fully connected) with 100 and 128 cells, but that did not work either.
This is the model and the result from the training process:
num_units = 100
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 5
num_epochs = 10
log_file_name = f"{SEQ_LEN}-SEQ-{1}-PRED-{int(time.time())}"
# Initialize the model (of a Sequential type)
model = Sequential()
# Adding the input layer and the LSTM layer
model.add(LSTM(units = num_units, activation = activation_function,input_shape=(None, 1)))
# Adding the output layer
model.add(Dense(units = 1))
# Compiling the RNN
model.compile(optimizer = optimizer, loss = loss_function, metrics=['accuracy'])
# Using the training set to train the model
history = model.fit(train_x, train_y, batch_size = batch_size, epochs =num_epochs,validation_data=(test_x, test_y))
and the output is :
Train on 4362 samples, validate on 1082 samples
Epoch 1/10
4362/4362 [==============================] - 11s 3ms/step - loss: 0.0057 - acc: 2.2925e-04 - val_loss: 0.0016 - val_acc: 0.0018
Epoch 2/10
4362/4362 [==============================] - 9s 2ms/step - loss: 6.2463e-04 - acc: 4.5851e-04 - val_loss: 0.0013 - val_acc: 0.0018
Epoch 3/10
4362/4362 [==============================] - 9s 2ms/step - loss: 6.1073e-04 - acc: 2.2925e-04 - val_loss: 0.0014 - val_acc: 0.0018
Epoch 4/10
4362/4362 [==============================] - 8s 2ms/step - loss: 5.4403e-04 - acc: 4.5851e-04 - val_loss: 0.0014 - val_acc: 0.0018
Epoch 5/10
4362/4362 [==============================] - 7s 2ms/step - loss: 5.4765e-04 - acc: 4.5851e-04 - val_loss: 0.0012 - val_acc: 0.0018
Epoch 6/10
4362/4362 [==============================] - 8s 2ms/step - loss: 5.1991e-04 - acc: 4.5851e-04 - val_loss: 0.0013 - val_acc: 0.0018
Epoch 7/10
4362/4362 [==============================] - 7s 2ms/step - loss: 5.7324e-04 - acc: 2.2925e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 8/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.4248e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 9/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.3868e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
Epoch 10/10
4362/4362 [==============================] - 7s 2ms/step - loss: 4.6654e-04 - acc: 4.5851e-04 - val_loss: 0.0011 - val_acc: 0.0018
How do I choose the number of layers and cells for a problem like this? Can anyone suggest a network structure that can solve this problem?
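Before changing the architecture, it may be worth noting that acc is being reported for a regression target. The sketch below assumes the accuracy metric compares each target with the rounded prediction, as Keras' binary accuracy did at the time. That assumption is consistent with the log, where 4.5851e-04 equals 2/4362. The arrays are made-up illustration data, not the gas prices.

```python
import numpy as np

# Made-up targets scaled to [0, 1], and predictions that are off by only 0.01.
y_true = np.linspace(0.0, 1.0, 101)
y_pred = np.clip(y_true + 0.01, 0.0, 1.0)

# "Accuracy" as rounded-prediction equality: only targets that are
# exactly 0 or 1 can ever match.
rounded_match_accuracy = float(np.mean(y_true == np.round(y_pred)))

# Error metrics that are meaningful for regression.
mae = float(np.mean(np.abs(y_true - y_pred)))
mse = float(np.mean((y_true - y_pred) ** 2))

print(f"accuracy: {rounded_match_accuracy:.4f}")
print(f"MAE: {mae:.4f}  MSE: {mse:.6f}")
```

Under that assumption, accuracy stays near zero even for a very good fit, so the loss (or a metric such as mean absolute error) is a better guide to whether the model is learning.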

Very low accuracy on Digit recognition dataset with images having 4 channels, using Convolutional Neural Networks

I am currently working on a digit recognition challenge by Analytics Vidhya, the link to which is https://datahack.analyticsvidhya.com/contest/practice-problem-identify-the-digits/ .
The images in the dataset for this challenge have dimensions 28*28*4 (28 = length = width, 4 = number of channels). The code I have implemented is:
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')
import numpy as np
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# define the larger model
def larger_model():
    # create model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), input_shape=(4, 28, 28),activation='relu',padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(15, (3, 3), activation='relu',padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(200, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
def loadImages(path):
    # return array of images
    imagesList = listdir(path)
    loadedImages = []
    for image in imagesList:
        img = io.imread(path + "/" + image,as_grey = False)
        loadedImages.append(np.array(img))
    return loadedImages
path = "C:/Users/Farz Jamal/Downloads/mnist/Train/Images/train" #path_to_train_dataset
import pandas as pd
df = pd.read_csv("C:/Users/Farz Jamal/Downloads/mnist/Train/train.csv") #path_to_class_labels
y = np.array(df['label'])
from sklearn.cross_validation import train_test_split as ttt
x_train,x_val,y_train,y_val = ttt(imgs,y,test_size = 0.2)
Continued Code:
x_vall,x_test,y_vall,y_test = ttt(x_val,y_val,test_size = 0.4)
x_train,x_vall,x_test = np.array(x_train).astype('float32'),np.array(x_vall).astype('float32'),np.array(x_test).astype('float32')
# normalize inputs from 0-255 to 0-1
x_train = x_train / 255.0
x_vall = x_vall / 255.0
x_test = x_test / 255.0
y_train = np_utils.to_categorical(y_train)
y_vall = np_utils.to_categorical(y_vall)
y_test = np_utils.to_categorical(y_test)
num_classes = y_vall.shape[1] #10
#fitting_and_evaluating
model = larger_model()
# Fit the model
model.fit(x_train, y_train, validation_data=(x_vall, y_vall), epochs=50, batch_size=200)
# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
The output is as follows (from the 16th epoch to the 37th epoch):
Epoch 16/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.3013 - acc: 0.1135 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 17/50
39200/39200 [==============================] - 275s 7ms/step - loss: 2.3011 - acc: 0.1128 - val_loss: 2.3014 - val_acc: 0.1095
Epoch 18/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.3011 - acc: 0.1124 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 19/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3012 - acc: 0.1131 - val_loss: 2.3017 - val_acc: 0.1095
Epoch 20/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3011 - acc: 0.1130 - val_loss: 2.3018 - val_acc: 0.1111
Epoch 21/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.3010 - acc: 0.1127 - val_loss: 2.3013 - val_acc: 0.1095
Epoch 22/50
39200/39200 [==============================] - 281s 7ms/step - loss: 2.3006 - acc: 0.1133 - val_loss: 2.3015 - val_acc: 0.1097
Epoch 23/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3005 - acc: 0.1136 - val_loss: 2.3018 - val_acc: 0.1099
Epoch 24/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.3005 - acc: 0.1135 - val_loss: 2.3022 - val_acc: 0.1116
Epoch 25/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2998 - acc: 0.1155 - val_loss: 2.3025 - val_acc: 0.1071
Epoch 26/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2996 - acc: 0.1156 - val_loss: 2.3021 - val_acc: 0.1100
Epoch 27/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.2981 - acc: 0.1168 - val_loss: 2.3024 - val_acc: 0.1078
Epoch 28/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2970 - acc: 0.1187 - val_loss: 2.3035 - val_acc: 0.1065
Epoch 29/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2945 - acc: 0.1218 - val_loss: 2.3061 - val_acc: 0.1041
Epoch 30/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2935 - acc: 0.1223 - val_loss: 2.3059 - val_acc: 0.1003
Epoch 31/50
39200/39200 [==============================] - 274s 7ms/step - loss: 2.2906 - acc: 0.1268 - val_loss: 2.3067 - val_acc: 0.1014
Epoch 32/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.2873 - acc: 0.1278 - val_loss: 2.3078 - val_acc: 0.1073
Epoch 33/50
39200/39200 [==============================] - 292s 7ms/step - loss: 2.2806 - acc: 0.1368 - val_loss: 2.3118 - val_acc: 0.1034
Epoch 34/50
39200/39200 [==============================] - 301s 8ms/step - loss: 2.2744 - acc: 0.1404 - val_loss: 2.3160 - val_acc: 0.1022
Epoch 35/50
39200/39200 [==============================] - 289s 7ms/step - loss: 2.2662 - acc: 0.1486 - val_loss: 2.3172 - val_acc: 0.1029
Epoch 36/50
39200/39200 [==============================] - 295s 8ms/step - loss: 2.2557 - acc: 0.1543 - val_loss: 2.3162 - val_acc: 0.1087
Epoch 37/50
39200/39200 [==============================] - 308s 8ms/step - loss: 2.2459 - acc: 0.1632 - val_loss: 2.3275 - val_acc: 0.1083
As can be seen, both the training and the validation accuracy are very low.
I have tried reducing Dropout (previously it was 0.5 for one of the layers), but it had no effect. I doubled the neurons in the last hidden layer (previously there were 100), still with no effect. It seems to have something to do with the preprocessing of the images and with the input parameters for the images.
What can be done?

Copied in from comments as the answer:
In fact your model isn't learning anything, which usually points to a bug. I don't see anything overtly wrong. A common error is inputting garbage to the network accidentally. Take the first few images that you're feeding to the network and display them in a debugger before your fit step and print out the labels and make sure they match. Do a sanity check on your inputs.
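The sanity check described above could look something like the sketch below. The helper `summarize_inputs` and the random stand-in data are made up for illustration. The real check would run on `x_train` and `y_train` just before `fit`. The sketch assumes the images load channels-last as (28, 28, 4), matching the dimensions stated in the question, while the model declares `input_shape=(4, 28, 28)`.

```python
import numpy as np

def summarize_inputs(x, y, expected_sample_shape):
    """Collect basic facts about a batch so mismatches are easy to spot."""
    x = np.asarray(x)
    y = np.asarray(y)
    values, counts = np.unique(y, return_counts=True)
    return {
        "x_shape": x.shape,
        "shape_ok": x.shape[1:] == tuple(expected_sample_shape),
        "x_min": float(x.min()),
        "x_max": float(x.max()),
        "same_length": len(x) == len(y),
        "label_counts": {int(v): int(c) for v, c in zip(values, counts)},
    }

# Stand-in data (hypothetical): 8 random channels-last images and labels.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(8, 28, 28, 4)).astype("float32") / 255.0
y_demo = np.arange(8)

report = summarize_inputs(x_demo, y_demo, expected_sample_shape=(4, 28, 28))
print(report)

# If the shapes disagree, moving the channel axis to the front is one fix.
x_fixed = np.transpose(x_demo, (0, 3, 1, 2))
print(summarize_inputs(x_fixed, y_demo, (4, 28, 28))["shape_ok"])
```

It is also worth confirming that the images and the labels are in the same order, since a directory listing does not necessarily follow the row order of the CSV file.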

image_dim_ordering - what am I missing here?

EDIT: Could not reproduce this issue using cuda 8.0 and using titan X (Pascal)
Using the TensorFlow backend for Keras, I have issues related to image_dim_ordering.
When I use image_dim_ordering='th' in the Keras config file, everything works well, but when I use 'tf', training barely improves from 0.5 accuracy.
The motivation is that my live augmentations are currently very costly, and I'd love to remove the unneeded reshape from the Theano dim order convention to the TensorFlow one.
I tried recreating the issue with simple code so that other people can reproduce it and help me understand what I am doing wrong. I'm well aware of the different channel, height, width conventions, and I think I handle them correctly.
While I didn't fully reproduce my problem in the compact example (maybe because it's a trivial task), the training results are repeatedly different, and much worse for the 'tf' case, even when I try different seed values.
Note - in this reproducing code, all that the network needs to do is to tell apart full patches of -1.0 from full patches of 1.0
This is my '~/.keras/keras.json'
{
"floatx": "float32",
"epsilon": 1e-07,
"backend": "tensorflow",
"image_dim_ordering": "th"
}
My tensorflow version is '0.11.0rc0' (it happened on 0.10 as well).
My keras is the latest git pull as of today.
Using 'th' for image_dim_ordering I get accuracy >= 0.99 at epoch 4 for three different seeds.
Using 'tf' for the dim order I get accuracy >= 0.9 much later, only at around epoch 24, as you can see in the log below.
The following is a standalone code that should reproduce the problem:
from keras import backend as K
import keras.optimizers
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input
from keras.models import Model
import numpy as np
def make_model(input_dim_size):
    if K.image_dim_ordering() == 'tf':
        input_shape = (input_dim_size, input_dim_size,1)
    else:
        input_shape = (1, input_dim_size, input_dim_size)
    img_input = Input(shape=input_shape)
    x = Convolution2D(64,5,5,border_mode='same')(img_input)
    x = Activation('relu')(x)
    x = MaxPooling2D((2,2),strides=(2,2))(x)
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(128, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(128, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(1024*2)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(1024 * 2)(x)
    x = Activation('relu')(x)
    x = Dropout(0.75)(x)
    x = Dense(200)(x)
    x = Activation('relu')(x)
    x = Dropout(0.75)(x)
    x = Dense(1,activation='sigmoid')(x)
    model = Model(img_input, x)
    learning_rate = 0.01
    sgd = keras.optimizers.sgd(lr=learning_rate, momentum=0.9, nesterov=True)
    model.summary()
    model.compile(loss='binary_crossentropy',
                  optimizer=sgd,
                  metrics=['accuracy']
                  )
    return model
np.random.seed(456)
def dummy_generator(mini_batch_size=64, block_size=100):
    if K.image_dim_ordering() == 'tf':
        tensor_X_shape = (mini_batch_size,block_size, block_size,1)
    else:
        tensor_X_shape = (mini_batch_size, 1, block_size, block_size)
    X = np.zeros(tensor_X_shape, dtype=np.float32)
    y = np.zeros((mini_batch_size, 1))
    while True:
        for b in range(mini_batch_size):
            X[b, :, :, :] = (float(b % 2) * 2.0) - 1.0
            y[b, :] = float(b % 2)
        yield X,y
with K.tf.device('/gpu:2'):
    K.set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)))
    MINI_BATCH_SIZE = 64
    PATCH_SIZE = 100
    model = make_model(PATCH_SIZE)
    gen = dummy_generator(mini_batch_size=MINI_BATCH_SIZE,block_size=PATCH_SIZE)
    model.fit_generator(gen, MINI_BATCH_SIZE*10,
                        100, verbose=1,
                        callbacks=[],
                        validation_data=None,
                        nb_val_samples=None,
                        max_q_size=1,
                        nb_worker=1, pickle_safe=False)
For the 'tf' case this is the training log: (and looks very similar on different seeds):
Epoch 1/100
640/640 [==============================] - 1s - loss: 0.6932 - acc: 0.4781
Epoch 2/100
640/640 [==============================] - 0s - loss: 0.6932 - acc: 0.4938
Epoch 3/100
640/640 [==============================] - 0s - loss: 0.6921 - acc: 0.5203
Epoch 4/100
640/640 [==============================] - 0s - loss: 0.6920 - acc: 0.5469
Epoch 5/100
640/640 [==============================] - 0s - loss: 0.6935 - acc: 0.4875
Epoch 6/100
640/640 [==============================] - 0s - loss: 0.6941 - acc: 0.4969
Epoch 7/100
640/640 [==============================] - 0s - loss: 0.6937 - acc: 0.5047
Epoch 8/100
640/640 [==============================] - 0s - loss: 0.6931 - acc: 0.5312
Epoch 9/100
640/640 [==============================] - 0s - loss: 0.6923 - acc: 0.5250
Epoch 10/100
640/640 [==============================] - 0s - loss: 0.6929 - acc: 0.5281
Epoch 11/100
640/640 [==============================] - 0s - loss: 0.6934 - acc: 0.4953
Epoch 12/100
640/640 [==============================] - 0s - loss: 0.6918 - acc: 0.5234
Epoch 13/100
640/640 [==============================] - 0s - loss: 0.6930 - acc: 0.5125
Epoch 14/100
640/640 [==============================] - 0s - loss: 0.6939 - acc: 0.4797
Epoch 15/100
640/640 [==============================] - 0s - loss: 0.6936 - acc: 0.5047
Epoch 16/100
640/640 [==============================] - 0s - loss: 0.6917 - acc: 0.4922
Epoch 17/100
640/640 [==============================] - 0s - loss: 0.6945 - acc: 0.4891
Epoch 18/100
640/640 [==============================] - 0s - loss: 0.6948 - acc: 0.5000
Epoch 19/100
640/640 [==============================] - 0s - loss: 0.6968 - acc: 0.4594
Epoch 20/100
640/640 [==============================] - 0s - loss: 0.6919 - acc: 0.5391
Epoch 21/100
640/640 [==============================] - 0s - loss: 0.6904 - acc: 0.5172
Epoch 22/100
640/640 [==============================] - 0s - loss: 0.6881 - acc: 0.5906
Epoch 23/100
640/640 [==============================] - 0s - loss: 0.6804 - acc: 0.6359
Epoch 24/100
640/640 [==============================] - 0s - loss: 0.6470 - acc: 0.8219
Epoch 25/100
640/640 [==============================] - 0s - loss: 0.4134 - acc: 0.9625
Epoch 26/100
640/640 [==============================] - 0s - loss: 0.2347 - acc: 0.9953
Epoch 27/100
640/640 [==============================] - 0s - loss: 0.1231 - acc: 1.0000
And for the 'th' case the training log is (and looks very similar on different seeds):
Epoch 1/100
640/640 [==============================] - 3s - loss: 0.6891 - acc: 0.5594
Epoch 2/100
640/640 [==============================] - 2s - loss: 0.6079 - acc: 0.7328
Epoch 3/100
640/640 [==============================] - 2s - loss: 0.3166 - acc: 0.9422
Epoch 4/100
640/640 [==============================] - 2s - loss: 0.1767 - acc: 0.9969
I find it suspicious that it's so fast in the tensorflow case (0s), but after adding debug printing to the generator it does seem to get called.
I thought that maybe it's related to keras not needing to reshape anything, but 2-3 seconds sounds like too much time for this amount of reshaping.
If anyone can try to reproduce the results that I see and help me understand what the heck I am missing here, I'd be grateful :)

This thread is a bit old, but I am replying in case someone faces the same issue.
The problem is caused by an inconsistent Keras backend configuration:
{
"floatx": "float32",
"epsilon": 1e-07,
"backend": "tensorflow",
"image_dim_ordering": "th"
}
The configuration uses TensorFlow as the backend but Theano's image dimension ordering. Change image_dim_ordering to tf and that should solve the issue:
"image_dim_ordering": "tf"

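A minimal sketch of the check this answer suggests, assuming the legacy key name `image_dim_ordering` shown in the question. The helper `ordering_matches_backend` and the pairing table are made up for illustration. The table lists the conventional ordering for each backend rather than a hard requirement, since the question reports that 'th' ran fine on the TensorFlow backend.

```python
import json

# Conventional pairing of backend and dimension ordering (an assumption
# used for this check, not a rule enforced by Keras).
CONVENTIONAL_ORDERING = {"tensorflow": "tf", "theano": "th"}

def ordering_matches_backend(config_text):
    """Return True if the config pairs the backend with its usual ordering."""
    cfg = json.loads(config_text)
    expected = CONVENTIONAL_ORDERING.get(cfg.get("backend"))
    return expected is not None and cfg.get("image_dim_ordering") == expected

# The configuration quoted in the question.
question_config = """
{
  "floatx": "float32",
  "epsilon": 1e-07,
  "backend": "tensorflow",
  "image_dim_ordering": "th"
}
"""

# The same configuration with the change proposed in the answer.
answer_config = question_config.replace('"th"', '"tf"')

print(ordering_matches_backend(question_config))
print(ordering_matches_backend(answer_config))
```

Whichever ordering is configured, the array shapes fed to the model have to follow it: (channels, height, width) for 'th' and (height, width, channels) for 'tf'.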