I need to compute a weighted F1-score so that errors on my least popular label are penalized more heavily (a typical binary classification problem with an imbalanced dataset).
Unfortunately, I don't get a valid F1-score.
The following are my metric functions:
from keras import backend as K

def sensitivity(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + K.epsilon())

def specificity(y_true, y_pred):
    true_negatives = K.sum(K.round(K.clip((1 - y_true) * (1 - y_pred), 0, 1)))
    possible_negatives = K.sum(K.round(K.clip(1 - y_true, 0, 1)))
    return true_negatives / (possible_negatives + K.epsilon())

def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        return true_positives / (possible_positives + K.epsilon())

    def precision(y_true, y_pred):
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        return true_positives / (predicted_positives + K.epsilon())

    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2 * ((precision * recall) / (precision + recall))
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(0.001),
              metrics=[sensitivity, specificity, 'accuracy', f1])
and here is how I train and evaluate the model:
model.fit(x_train, y_train, epochs=12, batch_size=32, verbose=1, class_weight=class_weights_dict, validation_split=0.3)
classes = model.predict(x_test)
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128, verbose=1)
I always get nan as the f1 score - is something wrong conceptually or programmatically? The data are the same I used with a scikit-learn classifier (an SVM), and that one succeeded.
These are the results:
Epoch 1/12
5133/5133 [==============================] - 5s 976us/step - loss: 0.6955 - sensitivity: 0.0561 - specificity: 0.9377 - acc: 0.8712 - f1: nan - val_loss: 0.6884 - val_sensitivity: 0.8836 - val_specificity: 0.0000e+00 - val_acc: 0.0723 - val_f1: nan
Epoch 2/12
5133/5133 [==============================] - 5s 894us/step - loss: 0.6954 - sensitivity: 0.3865 - specificity: 0.5548 - acc: 0.5398 - f1: nan - val_loss: 0.6884 - val_sensitivity: 0.0000e+00 - val_specificity: 1.0000 - val_acc: 0.9277 - val_f1: nan
Epoch 3/12
5133/5133 [==============================] - 5s 925us/step - loss: 0.6953 - sensitivity: 0.3928 - specificity: 0.5823 - acc: 0.5696 - f1: nan - val_loss: 0.6884 - val_sensitivity: 0.0000e+00 - val_specificity: 1.0000 - val_acc: 0.9277 - val_f1: nan
Epoch 4/12
5133/5133 [==============================] - 5s 935us/step - loss: 0.6954 - sensitivity: 0.1309 - specificity: 0.8504 - acc: 0.7976 - f1: nan - val_loss: 0.6884 - val_sensitivity: 0.0000e+00 - val_specificity: 1.0000 - val_acc: 0.9277 - val_f1: nan
etc.
Final result:
[0.6859536773606656, 0.0, 1.0, 0.9321705426356589, nan]
Regarding the nan in your f1 metric:
If you look at the log, your validation sensitivity is 0, which means your precision and recall are both zero as well. So in the f1 calculation you are dividing by zero and getting a nan.
Add K.epsilon() to the denominator, as you have done in the other functions.
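For example, the last line of f1 becomes:

return 2 * ((precision * recall) / (precision + recall + K.epsilon()))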
On a side note, judging by your loss, which showed only a negligible improvement on the training set, your network has learnt nothing. I'd advise you to start by increasing the number of epochs, making the network deeper, and not passing anything to the class_weight argument for now (you mention not using weighted computation yet, but your code does set class weights).
Also check whether one of the batches has an f1 score equal to nan.
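One way to spot that (a sketch using a standard Keras callback; batch-level logs expose the running metric values under the metric function's name, here 'f1'):

import numpy as np
from keras.callbacks import Callback

class NanF1Logger(Callback):
    # Print the index of any batch whose running f1 becomes nan.
    def on_batch_end(self, batch, logs=None):
        logs = logs or {}
        f1_value = logs.get('f1')
        if f1_value is not None and np.isnan(f1_value):
            print('nan f1 at batch', batch)

# usage: model.fit(..., callbacks=[NanF1Logger()])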
Related
I have a fairly simple script for classifying intents from natural-language queries that works pretty well, and I want to add a word-embedding layer from a pre-trained custom model with 200 dimensions. I'm following this tutorial: Keras pretrained_word_embeddings. But with what I have achieved so far, training is very, very slow, and even worse, the model doesn't learn: accuracy doesn't improve from epoch to epoch. I think I haven't configured the layers correctly, or the parameters are wrong. Could you help with this?
with open("tf-kr_esp.json") as f:
rows = json.load(f)
for row in rows["utterances"]:
w = nltk.word_tokenize(row["text"])
words.extend(w)
documents.append((w, row["intent"]))
if row["intent"] not in classes:
classes.append(row["intent"])
words = sorted(list(set(words)))
classes = sorted(list(set(classes)))
word_index = dict(zip(words, range(len(words))))
embeddings_index = {}
with open('embeddings.txt') as f:
    for line in f:
        word, coefs = line.split(maxsplit=1)
        coefs = np.fromstring(coefs, "f", sep=" ")
        embeddings_index[word] = coefs
num_tokens = len(words) + 2
embedding_dim = 200
hits = 0
misses = 0
# Prepare embedding matrix
embedding_matrix = np.zeros((num_tokens, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # Words not found in the embedding index will be all-zeros.
        # This includes the representation for "padding" and "OOV".
        embedding_matrix[i] = embedding_vector
        hits += 1
    else:
        misses += 1
print("Converted %d words (%d misses)" % (hits, misses))
embedding_layer = Embedding(
    num_tokens,
    embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
    trainable=False,
)
# create our training data
training = []
output_empty = [0] * len(classes)
for doc in documents:
    bag = []
    pattern_words = doc[0]
    for w in words:
        bag.append(1 if w in pattern_words else 0)
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1
    training.append([bag, output_row])
random.shuffle(training)
training = np.array(training, dtype="object")
train_x = list(training[:,0])
train_y = list(training[:,1])
int_sequences_input = tf.keras.Input(shape=(None,), dtype="int64")
embedded_sequences = embedding_layer(int_sequences_input)
x = layers.Conv1D(128, 5, activation="relu")(embedded_sequences)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(128, 5, activation="relu")(x)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(128, 5, activation="relu")(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
preds = layers.Dense(69, activation="softmax")(x)
model = tf.keras.Model(int_sequences_input, preds)
model.summary()
#sgd = SGD(learning_rate=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model.fit(np.array(train_x), np.array(train_y), epochs=20, batch_size=128, verbose=1)
Epoch 1/20
116/116 [==============================] - 279s 2s/step - loss: 4.2157 - accuracy: 0.0485
Epoch 2/20
116/116 [==============================] - 279s 2s/step - loss: 4.1861 - accuracy: 0.0550
Epoch 3/20
116/116 [==============================] - 281s 2s/step - loss: 4.1607 - accuracy: 0.0550
Epoch 4/20
116/116 [==============================] - 283s 2s/step - loss: 4.1387 - accuracy: 0.0550
Epoch 5/20
116/116 [==============================] - 286s 2s/step - loss: 4.1202 - accuracy: 0.0550
Epoch 6/20
116/116 [==============================] - 284s 2s/step - loss: 4.1047 - accuracy: 0.0550
Epoch 7/20
116/116 [==============================] - 286s 2s/step - loss: 4.0915 - accuracy: 0.0550
Epoch 8/20
116/116 [==============================] - 283s 2s/step - loss: 4.0806 - accuracy: 0.0550
Epoch 9/20
116/116 [==============================] - 280s 2s/step - loss: 4.0716 - accuracy: 0.0550
Epoch 10/20
116/116 [==============================] - 283s 2s/step - loss: 4.0643 - accuracy: 0.0550
Can you mention how many classes you have?
Also, an embedding dimension of 200 is fine, but in practice training on top of pre-trained vectors takes a long time. To make it faster, you can lower the number of input features in the convolutional layers. You can also use Adam as the optimizer instead of SGD, as SGD converges much more slowly than Adam.
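A sketch of those two changes (the 64/32 filter counts and the adam string are illustrative choices, not values from the post):

x = layers.Conv1D(64, 5, activation="relu")(embedded_sequences)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(32, 5, activation="relu")(x)
x = layers.GlobalMaxPooling1D()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
preds = layers.Dense(len(classes), activation="softmax")(x)  # 69 classes in the post

model = tf.keras.Model(int_sequences_input, preds)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])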
Hoping for a quick second pair of eyes before I officially give up hope on applying deep learning to stock prediction.
The goal is to use an LSTM to predict one of two classes. The positive class corresponds to a sequence that led to a price increase of 5% or greater over the next six periods; the negative class corresponds to a sequence that did not. As expected, this has led to a bit of class imbalance, with the ratio being about 6:1 negative to positive. The problem right now, though, is that the model shows the same accuracy across all epochs and only predicts the negative class. This makes me think that I may have a problem with the structure of my model. The input is a dataframe which includes price data and a few moving averages:
price_open price_high price_low price_close ma_8 ma_13 ma_21 ma_55 6prd_pctchange entry_flag
time_period_start
11-02-2016 23:00 10.83280 10.98310 10.72591 10.96000 10.932415 10.855693 10.960608 11.087525 0.008535 0.0
11-03-2016 03:00 10.96016 11.02560 10.96000 11.00003 10.937569 10.873219 10.948081 11.075059 0.004544 0.0
11-03-2016 07:00 11.00007 11.14997 10.91000 11.00006 10.954170 10.919378 10.929689 11.062878 -0.007442 0.0
11-03-2016 11:00 11.05829 11.14820 10.90001 10.99208 10.959396 10.923376 10.912183 11.057317 0.008392 0.0
11-03-2016 15:00 10.90170 11.03112 10.70000 10.91529 10.938490 10.933783 10.890906 11.048504 0.006289 0.0
11-03-2016 19:00 10.89420 10.95000 10.82460 10.94980 10.944640 10.947429 10.882745 11.041227 0.005234 0.0
11-03-2016 23:00 10.94128 11.08475 10.88404 11.08475 10.974350 10.957118 10.888859 11.032288 0.011382 0.0
11-04-2016 03:00 11.02761 11.22778 10.94360 10.99813 10.987517 10.967185 10.893531 11.023518 -0.000173 0.0
11-04-2016 07:00 10.95076 11.01814 10.92000 10.92100 10.982642 10.964934 10.904055 11.011691 -0.007187 0.0
11-04-2016 11:00 10.94511 11.06298 10.89000 10.99557 10.982085 10.958244 10.914692 11.000365 0.000318 0.0
and has been converted into numpy arrays that are 6 periods in length and normalized using scikit-learn's MinMaxScaler. As an example, the first sequence looks like this:
array([[0. , 0.16552483, 0.09965385, 0.52742716, 0. ,
0. , 1. , 1. ],
[0.5648144 , 0.37805671, 1. , 0.9996461 , 0.19101228,
0.19104958, 0.83911884, 0.73073358],
[0.74180673, 1. , 0.80769231, 1. , 0.80630067,
0.69421501, 0.60290376, 0.46764059],
[1. , 0.99114867, 0.76926923, 0.90586292, 1. ,
0.73780155, 0.37807623, 0.34751414],
[0.30555679, 0.40566085, 0. , 0. , 0.22515636,
0.85124563, 0.104818 , 0.15716305],
[0.27229589, 0. , 0.47923077, 0.40710157, 0.45309243,
1. , 0. , 0. ]])
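For reference, a minimal sketch of how such windows can be built and scaled (assuming a (n_samples, n_features) feature array and a window length of 6; how the label is aligned to each window is an assumption here, since the post does not show this step):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_windows(features, labels, window=6):
    # Slice the feature array into overlapping windows, scaling each
    # window to [0, 1] independently, as in the example array above.
    X, y = [], []
    for start in range(len(features) - window):
        win = features[start:start + window]
        X.append(MinMaxScaler().fit_transform(win))
        y.append(labels[start + window - 1])
    return np.array(X), np.array(y)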
When I build, compile, and fit a model on these sequences, my results quickly plateau and the model ends up only predicting the negative class.
# Constants:
loss = 'binary_crossentropy'
optimizer = 'adam'
epochs = 12
batch_size = 300
# Compile model:
model = Sequential()
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])
results = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=1, validation_data=(X_test, y_test), shuffle=False)
model.summary()
It outputs:
Epoch 1/12
22/22 [==============================] - 0s 16ms/step - loss: 0.5696 - accuracy: 0.8410 - val_loss: 0.3953 - val_accuracy: 0.8885
Epoch 2/12
22/22 [==============================] - 0s 10ms/step - loss: 0.4355 - accuracy: 0.8473 - val_loss: 0.3569 - val_accuracy: 0.8885
Epoch 3/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4379 - accuracy: 0.8473 - val_loss: 0.3612 - val_accuracy: 0.8885
Epoch 4/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4320 - accuracy: 0.8473 - val_loss: 0.3554 - val_accuracy: 0.8885
Epoch 5/12
22/22 [==============================] - 0s 10ms/step - loss: 0.4338 - accuracy: 0.8473 - val_loss: 0.3577 - val_accuracy: 0.8885
Epoch 6/12
22/22 [==============================] - 0s 10ms/step - loss: 0.4297 - accuracy: 0.8473 - val_loss: 0.3554 - val_accuracy: 0.8885
Epoch 7/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4303 - accuracy: 0.8473 - val_loss: 0.3570 - val_accuracy: 0.8885
Epoch 8/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4273 - accuracy: 0.8473 - val_loss: 0.3558 - val_accuracy: 0.8885
Epoch 9/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4285 - accuracy: 0.8473 - val_loss: 0.3577 - val_accuracy: 0.8885
Epoch 10/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4254 - accuracy: 0.8473 - val_loss: 0.3565 - val_accuracy: 0.8885
Epoch 11/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4270 - accuracy: 0.8473 - val_loss: 0.3581 - val_accuracy: 0.8885
Epoch 12/12
22/22 [==============================] - 0s 9ms/step - loss: 0.4243 - accuracy: 0.8473 - val_loss: 0.3569 - val_accuracy: 0.8885
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_6 (LSTM) (None, 100) 42400
_________________________________________________________________
dense_6 (Dense) (None, 1) 101
=================================================================
And a quick check shows that it is only predicting the negative class:
predictions = model.predict(X_test)
predictions_round = [1 if x > 0.5 else 0 for x in predictions]
pd.Series(predictions_round).value_counts()
0 1641
dtype: int64
I'll be the first to say that this may be because predicting a stock price entry point is a task full of noise. BUT I also expected the model to at least make a handful of wrong guesses instead of simply guessing the same class every time. To me, that seems like an issue with the way I built the model or structured the inputs.
X_train.shape and y_train.shape give me (6561, 6, 8) and (6561, ) respectively.
Thanks in advance for any help!
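A common first step for an imbalance like the 6:1 one described above is to weight the minority class more heavily in the loss; a minimal sketch (the weights here are derived from the stated ratio, not taken from the post):

# Weight the positive (minority) class ~6x, matching the stated 6:1 ratio.
class_weight = {0: 1.0, 1: 6.0}
results = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
                    verbose=1, validation_data=(X_test, y_test),
                    shuffle=False, class_weight=class_weight)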
I'm implementing a Bidirectional LSTM in Keras. During training, both training accuracy and validation accuracy are stuck at about 0.83, and both losses at about 0.45.
Epoch 1/50
32000/32000 [==============================] - 597s 19ms/step - loss: 0.4611 - accuracy: 0.8285 - val_loss: 0.4515 - val_accuracy: 0.8316
Epoch 2/50
32000/32000 [==============================] - 589s 18ms/step - loss: 0.4563 - accuracy: 0.8299 - val_loss: 0.4514 - val_accuracy: 0.8320
Epoch 3/50
32000/32000 [==============================] - 584s 18ms/step - loss: 0.4561 - accuracy: 0.8299 - val_loss: 0.4513 - val_accuracy: 0.8318
Epoch 4/50
32000/32000 [==============================] - 612s 19ms/step - loss: 0.4560 - accuracy: 0.8300 - val_loss: 0.4513 - val_accuracy: 0.8319
Epoch 5/50
32000/32000 [==============================] - 572s 18ms/step - loss: 0.4559 - accuracy: 0.8299 - val_loss: 0.4512 - val_accuracy: 0.8318
This is my model:
model = tf.keras.Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(Bidirectional(LSTM(units=100, return_sequences=True), input_shape=(timesteps, features)))
model.add(Dropout(0.7))
model.add(Dense(1, activation='sigmoid'))
I normalized my dataset using scikit-learn's StandardScaler.
I have a custom loss:
def get_top_one_probability(vector):
    return K.exp(vector) / K.sum(K.exp(vector))

def listnet_loss(real_labels, predicted_labels):
    return -K.sum(get_top_one_probability(real_labels) * tf.math.log(get_top_one_probability(predicted_labels)))
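Note that K.exp(vector) / K.sum(K.exp(vector)) normalizes over every element of the tensor at once, so all lists in a batch share a single softmax. If each sample is meant to be its own list, a per-list softmax along the last axis may be what is intended (a sketch, not a confirmed cause of the problem):

def get_top_one_probability(vector):
    # Softmax along the last axis: each list is normalized independently.
    return K.softmax(vector, axis=-1)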
These are the model.compile and model.fit settings:
model.compile(loss=listnet_loss,
              optimizer=keras.optimizers.Adadelta(learning_rate=1.0, rho=0.95),
              metrics=["accuracy"])
model.fit(training_dataset, training_dataset_labels, validation_split=0.2, batch_size=1,
          epochs=number_of_epochs, workers=10, verbose=1,
          callbacks=[SaveModelCallback(), keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)])
This is my test phase:
scaler = StandardScaler()
scaler.fit(test_dataset)
test_dataset = scaler.transform(test_dataset)
test_dataset = test_dataset.reshape((int(test_dataset.shape[0]/20), 20, test_dataset.shape[1]))
# Read model
json_model_file = open('/content/drive/My Drive/Tesi_magistrale/LSTM/models_padded_2/model_11.json', 'r')
loaded_model_json = json_model_file.read()
json_model_file.close()
model = model_from_json(loaded_model_json)
model.load_weights("/content/drive/My Drive/Tesi_magistrale/LSTM/models_weights_padded_2/model_11_weights.h5")
with open("/content/drive/My Drive/Tesi_magistrale/LSTM/predictions/padded/en_ewt-padded.H.pred", "w+") as predictions_file:
predictions = model.predict(test_dataset)
I also rescaled the test set. After the line predictions = model.predict(test_dataset) I have some business logic to process my predictions (the same logic is also used in the training phase).
I get very bad results on the test set, even though the results during training are good.
What am I doing wrong?
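One thing worth double-checking in the test phase above (a note, not a confirmed cause): scaler.fit(test_dataset) computes fresh statistics from the test data, so the test features end up on a different scale than the training features. The usual pattern is to fit the scaler once on the training data and only transform the test set, e.g. by persisting it (joblib is an illustrative choice here):

import joblib
from sklearn.preprocessing import StandardScaler

# At training time: fit once and persist the statistics.
scaler = StandardScaler().fit(training_dataset)
training_dataset = scaler.transform(training_dataset)
joblib.dump(scaler, 'scaler.pkl')

# At test time: reuse the training statistics instead of refitting.
scaler = joblib.load('scaler.pkl')
test_dataset = scaler.transform(test_dataset)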
Somehow, the Keras image generator works well when combined with fit() or fit_generator(), but fails miserably when combined with predict_generator() or predict().
Since I'm using the Plaid-ML Keras backend for an AMD processor, I would rather loop through all the test images one by one and get a prediction for each image in each iteration.
import os
from PIL import Image
import keras
import numpy

# Code for creating and training the model is not included.

print("Prediction result:")
dir = "/path/to/test/images"
files = os.listdir(dir)
correct = 0
total = 0

# Dictionary mapping each class index to its label.
classes = {
    0: 'This is Cat',
    1: 'This is Dog',
}

for file_name in files:
    total += 1
    image = Image.open(dir + "/" + file_name).convert('RGB')
    image = image.resize((100, 100))
    image = numpy.expand_dims(image, axis=0)
    image = numpy.array(image)
    image = image / 255
    pred = model.predict_classes([image])[0]
    animals_category = classes[pred]
    # Compare in lowercase so "cat"/"dog" match the capitalized labels.
    if ("cat" in file_name) and ("cat" in animals_category.lower()):
        print(correct, ". ", file_name, animals_category)
        correct += 1
    elif ("dog" in file_name) and ("dog" in animals_category.lower()):
        print(correct, ". ", file_name, animals_category)
        correct += 1

print("accuracy: ", (correct / total))
I got an error while fitting the ELMo embedding model with a training set of dimensions x_tr = (43163, 50) and y_tr = (43163, 50, 1):
InvalidArgumentError: Incompatible shapes: [1600] vs. [32,50]
[[{{node metrics/acc/Equal}} = Equal[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](metrics/acc/Reshape, metrics/acc/Cast)]]
How can I solve this error?
I tried making the number of training samples divisible by the batch size.
training set for fitting the model:
X_tr=np.array(X_tr)
print(X_tr.shape)
y_tr = np.array(y_tr).reshape(len(y_tr), max_len, 1)
print(y_tr.shape)
(43163, 50)
(43163, 50, 1)
making the model:
input_text = Input(shape=(max_len,), dtype=tf.string)
embedding = Lambda(ElmoEmbedding, output_shape=(None, 1024))(input_text)
x = Bidirectional(LSTM(units=512, return_sequences=True,
                       recurrent_dropout=0.2, dropout=0.2))(embedding)
x_rnn = Bidirectional(LSTM(units=512, return_sequences=True,
                           recurrent_dropout=0.2, dropout=0.2))(x)
x = add([x, x_rnn])  # residual connection to the first biLSTM
out = TimeDistributed(Dense(n_tags, activation="softmax"))(x)
model = Model(input_text, out)
compiling the model:
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
fitting the model:
fit_model = model.fit(np.array(X_tr), np.array(y_tr).reshape(len(y_tr), max_len, 1),
                      validation_split=0.1, batch_size=batch_size, epochs=5, verbose=1)
ERROR:
InvalidArgumentError: Incompatible shapes: [1600] vs. [32,50]
[[{{node metrics/acc/Equal}} = Equal[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](metrics/acc/Reshape, metrics/acc/Cast)]]
The expected result would be something like:
Train on 38816 samples, validate on 4320 samples
Epoch 1/5
38816/38816 [==============================] - 433s 11ms/step - loss: 0.0625 - acc: 0.9818 - val_loss: 0.0459 - val_acc: 0.9858
Epoch 2/5
38816/38816 [==============================] - 430s 11ms/step - loss: 0.0404 - acc: 0.9869 - val_loss: 0.0421 - val_acc: 0.9865
Epoch 3/5
38816/38816 [==============================] - 429s 11ms/step - loss: 0.0334 - acc: 0.9886 - val_loss: 0.0426 - val_acc: 0.9868
Epoch 4/5
38816/38816 [==============================] - 429s 11ms/step - loss: 0.0275 - acc: 0.9904 - val_loss: 0.0431 - val_acc: 0.9868
Epoch 5/5
38816/38816 [==============================] - 430s 11ms/step - loss: 0.0227 - acc: 0.9920 - val_loss: 0.0461 - val_acc: 0.9867
Solved:
I fixed the issue by removing metrics=['accuracy'],
but I still don't know why the accuracy metric causes this error.
If anyone knows, please help me out.
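A possible explanation: 1600 = 32 × 50, so the built-in accuracy metric appears to compare a flattened [1600] tensor against labels still shaped [32, 50]. One workaround (a sketch, not verified against this exact TF/Keras version) is a custom accuracy that flattens both sides explicitly:

from keras import backend as K

def flat_accuracy(y_true, y_pred):
    # y_true: (batch, max_len, 1) label ids; y_pred: (batch, max_len, n_tags) softmax.
    true_ids = K.cast(K.flatten(y_true), 'int64')
    pred_ids = K.flatten(K.argmax(y_pred, axis=-1))
    return K.mean(K.cast(K.equal(true_ids, pred_ids), 'float32'))

# model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
#               metrics=[flat_accuracy])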
I'm currently trying to use a pre-trained network and test it on this dataset.
Originally, I used VGG19 and fine-tuned only the classifier at the end to fit my 120 classes. I left all layers trainable, hoping that deeper training might improve performance. The problem is that the model is very slow (even letting it run for a night, I only got through a couple of epochs and reached an accuracy of around 45% - I have a GTX 1070 GPU).
Then my thinking was to freeze all layers of this model, as I have only 10k images, and train only the last few Dense layers, but it's still not really fast.
After watching this video (at around 2 min 30 s), I decided to replicate the principle of transfer values with InceptionResNetV2.
I processed every picture and saved the output in a numpy matrix with the following code.
# Loading the pre-trained model + freezing its layers
model = applications.inception_resnet_v2.InceptionResNetV2(
    include_top=False,
    weights='imagenet',
    pooling='avg')
for layer in model.layers:
    layer.trainable = False

# Extracting features and saving them
a = True
for filename in glob.glob('train/resized/*.jpg'):
    name_img = os.path.basename(filename)[:-4]
    class_ = label[label["id"] == name_img]["breed"].values[0]
    input_img = np.expand_dims(np.array(Image.open(filename)), 0)
    pred = model.predict(input_img)
    if a:
        X = np.array(pred)
        y = np.array(class_)
        a = False
    else:
        X = np.vstack((X, np.array(pred)))
        y = np.vstack((y, class_))

np.savez_compressed('preprocessed.npz', X=X, y=y)
X is a matrix of shape (10222, 1536) and y is (10222, 1).
Afterwards, I designed my classifier (I tried several topologies), and I have no idea why it is not able to learn anything.
# One-hot-encode the labels to shape (10222, 120)
label_binarizer = sklearn.preprocessing.LabelBinarizer()
y = label_binarizer.fit_transform(y)

model = Sequential()
model.add(Dense(512, input_dim=X.shape[1]))
# model.add(Dense(2048, activation="relu"))
# model.add(Dropout(0.5))
# model.add(Dense(256))
model.add(Dense(120, activation='softmax'))

model.compile(
    loss="categorical_crossentropy",
    optimizer="Nadam",  # I tried several optimizers
    metrics=["accuracy"]
)

model.fit(X, y, epochs=100, batch_size=64,
          callbacks=[early_stop], verbose=1,
          shuffle=True, validation_split=0.10)
Below you can find the output from the model:
Train on 9199 samples, validate on 1023 samples
Epoch 1/100
9199/9199 [==============================] - 2s 185us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 2/100
9199/9199 [==============================] - 1s 100us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 3/100
9199/9199 [==============================] - 1s 98us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 4/100
9199/9199 [==============================] - 1s 96us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 5/100
9199/9199 [==============================] - 1s 99us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 6/100
9199/9199 [==============================] - 1s 96us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
I tried changing topologies and activation functions and adding dropout, but nothing improves the results.
I have no idea what is wrong with my approach. Is the X matrix incorrect? Is it not allowed to use the pre-trained model only as a feature extractor and then perform the classification with a second model?
Many thanks for your feedback,
Regards,
Nicolas
You'll need to call preprocess_input before feeding the image array to the model. It normalizes the values of input_img from [0, 255] into [-1, 1], which is the desired input range for InceptionResNetV2.
input_img = np.expand_dims(np.array(Image.open(filename)), 0)
input_img = applications.inception_resnet_v2.preprocess_input(input_img.astype('float32'))
pred = model.predict(input_img)