"Your input ran out of data" always at step # of epochs - keras

I have been trying to make my first data generator for a model.fit() with Keras. The dataset I'm trying to make has two inputs, an image and a float value. All of my image names and values are stored in a csv file. I believe I made my generator incorrectly because no matter what my batch size is, I always get the error "Your input ran out of data" at the step equal to my epochs. So if my epochs are set to 100 my model will run until it reaches step 100. My dataset is about 100000 images/values big. If anyone could help me find a solution that would be great.
I am currently using:
python 3.8
tf-gpu 2.4.0rc1
keras 2.4.3
pandas 1.1.4
Code:
IMG_SIZE = 400
Version = 1
batch_size = 64
val = .05
val_aug = ImageDataGenerator(rescale=1/255)
aug = ImageDataGenerator(
rescale=1/255,
rotation_range=30,
width_shift_range=0.1,
height_shift_range=0.1,
shear_range=0.2,
zoom_range=0.2,
channel_shift_range=25,
horizontal_flip=True,
fill_mode='constant')
df = pd.read_csv('F:/DATA/Vote/Vote_Age.csv')
df = df.sample(frac = 1)
cut = int(len(df) * val)
train_df = df[cut:]
val_df = df[0:cut]
print(f'Training dataset: {len(train_df)}')
print(f'Val dataset: {len(val_df)}')
train_steps = int(len(train_df) / batch_size)
val_steps = int(len(val_df) / batch_size)
def data(df, generator, batch_size, IMG_SIZE):
z = 0
while True:
df = df.sample(frac = 1)
for i in range(int(len(df) / batch_size)):
images, ages, votes = [], [], []
for x in range(batch_size):
csv_row = df.iloc[(z), :]
z += 1
image_path = f'F:/DATA/Vote/Images/{int(csv_row[0])}.jpg'
image = cv2.resize(cv2.imread(image_path), (int(IMG_SIZE), int(IMG_SIZE)))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = image.reshape(-1, IMG_SIZE, IMG_SIZE, 3)
generator.fit(image)
image = generator.flow(image, batch_size=1)
image = image.next()
image = image.reshape(IMG_SIZE, IMG_SIZE, 3)
images.append(image)
ages.append(csv_row[1])
votes.append(int(csv_row[2]))
images = np.array(images)
ages = np.array(ages)
votes = np.array(votes)
return [[images, ages], [votes]]
#########
#Model was very big and unnecessary to include
#########
train_dataset = train_data(train_df, batch_size, IMG_SIZE)
val_dataset = val_data(val_df, batch_size, IMG_SIZE)
model.fit(
x = train_dataset[0],
y = train_dataset[1],
validation_data=(val_dataset[0], val_dataset[1]),
steps_per_epoch=train_steps,
validation_steps=val_steps,
callbacks=earlyStop,
epochs=100, batch_size=batch_size,
workers=multiprocessing.cpu_count(),
verbose=1)
model.save(f'F:/DATA/Vote/Models/YiffModel{Version}')
Ouput:
...
83/1484 [>.............................] - ETA: 4:46 - loss: 7578.7731
84/1484 [>.............................] - ETA: 4:46 - loss: 7575.5172
85/1484 [>.............................] - ETA: 4:46 - loss: 7572.2818
86/1484 [>.............................] - ETA: 4:46 - loss: 7569.0662
87/1484 [>.............................] - ETA: 4:46 - loss: 7565.8702
88/1484 [>.............................] - ETA: 4:45 - loss: 7562.6932
89/1484 [>.............................] - ETA: 4:45 - loss: 7559.5349
90/1484 [>.............................] - ETA: 4:45 - loss: 7556.3948
91/1484 [>.............................] - ETA: 4:45 - loss: 7553.2726
92/1484 [>.............................] - ETA: 4:44 - loss: 7550.1679
93/1484 [>.............................] - ETA: 4:44 - loss: 7547.0802
94/1484 [>.............................] - ETA: 4:44 - loss: 7544.0094
95/1484 [>.............................] - ETA: 4:44 - loss: 7540.9549
96/1484 [>.............................] - ETA: 4:43 - loss: 7537.9164
97/1484 [>.............................] - ETA: 4:43 - loss: 7534.8937
98/1484 [>.............................] - ETA: 4:43 - loss: 7531.8863
99/1484 [=>............................] - ETA: 4:43 - loss: 7528.8939
100/1484 [=>............................] - ETA: 4:43 - loss: 7525.9163
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 148400 batches). You may need to use the repeat() function when building your dataset.
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 78 batches). You may need to use the repeat() function when building your dataset.
1484/1484 [==============================] - 30s 15ms/step - loss: 7250.9988 - val_loss: 13595.9355
C:\Users\Tristan\anaconda3\envs\tf2\lib\site-packages\tensorflow\python\keras\engine\training.py:2325: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`Model.state_updates` will be removed in a future version. '
2021-01-05 15:37:04.425018: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
C:\Users\Tristan\anaconda3\envs\tf2\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1402: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`layer.updates` will be removed in a future version. '
WARNING:tensorflow:FOR KERAS USERS: The object that you are saving contains one or more Keras models or layers. If you are loading the SavedModel with `tf.keras.models.load_model`, continue reading (otherwise, you may ignore the following instructions). Please change your code to save with `tf.keras.models.save_model` or `model.save`, and confirm that the file "keras.metadata" exists in the export directory. In the future, Keras will only load the SavedModels that have this file. In other words, `tf.saved_model.save` will no longer write SavedModels that can be recovered as Keras models (this will apply in TF 2.5).
FOR DEVS: If you are overwriting _tracking_metadata in your class, this property has been used to save metadata in the SavedModel. The metadta field will be deprecated soon, so please move the metadata to a different file.
libpng warning: iCCP: known incorrect sRGB profile

Related

Keras: high training and validation accuracy but bad predictions

I'm implementing a Bidirectional LSTM in Keras. During the training, either training accuracy and validation accuracy are 0.83 and also losses are 0.45.
Epoch 1/50
32000/32000 [==============================] - 597s 19ms/step - loss: 0.4611 - accuracy: 0.8285 - val_loss: 0.4515 - val_accuracy: 0.8316
Epoch 2/50
32000/32000 [==============================] - 589s 18ms/step - loss: 0.4563 - accuracy: 0.8299 - val_loss: 0.4514 - val_accuracy: 0.8320
Epoch 3/50
32000/32000 [==============================] - 584s 18ms/step - loss: 0.4561 - accuracy: 0.8299 - val_loss: 0.4513 - val_accuracy: 0.8318
Epoch 4/50
32000/32000 [==============================] - 612s 19ms/step - loss: 0.4560 - accuracy: 0.8300 - val_loss: 0.4513 - val_accuracy: 0.8319
Epoch 5/50
32000/32000 [==============================] - 572s 18ms/step - loss: 0.4559 - accuracy: 0.8299 - val_loss: 0.4512 - val_accuracy: 0.8318
This is my model:
model = tf.keras.Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(Bidirectional(LSTM(units=100, return_sequences=True), input_shape=(timesteps, features)))
model.add(Dropout(0.7))
model.add(Dense(1, activation='sigmoid'))
I normalized my dataset through scikit-learn StandardScaler.
I have a custom loss:
def get_top_one_probability(vector):
return (K.exp(vector) / K.sum(K.exp(vector)))
def listnet_loss(real_labels, predicted_labels):
return -K.sum(get_top_one_probability(real_labels) * tf.math.log(get_top_one_probability(predicted_labels)))
These are the model.compile and model.fit settings:
model.compile(loss=listnet_loss, optimizer=keras.optimizers.Adadelta(learning_rate=1.0, rho=0.95), metrics=["accuracy"])
model.fit(training_dataset, training_dataset_labels, validation_split=0.2, batch_size=1,
epochs=number_of_epochs, workers=10, verbose=1,
callbacks=[SaveModelCallback(), keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)])
This is my test phase:
scaler = StandardScaler()
scaler.fit(test_dataset)
test_dataset = scaler.transform(test_dataset)
test_dataset = test_dataset.reshape((int(test_dataset.shape[0]/20), 20, test_dataset.shape[1]))
# Read model
json_model_file = open('/content/drive/My Drive/Tesi_magistrale/LSTM/models_padded_2/model_11.json', 'r')
loaded_model_json = json_model_file.read()
json_model_file.close()
model = model_from_json(loaded_model_json)
model.load_weights("/content/drive/My Drive/Tesi_magistrale/LSTM/models_weights_padded_2/model_11_weights.h5")
with open("/content/drive/My Drive/Tesi_magistrale/LSTM/predictions/padded/en_ewt-padded.H.pred", "w+") as predictions_file:
predictions = model.predict(test_dataset)
I rescaled also the test set. After line predictions = model.predict(test_dataset) I put some business logic to process my predictions (this logic is also used in the training phase).
I get very bad results on test set, also if the results in training are good.
What I do in a wrong way?
Somehow, the image generator of Keras works well when combined with fit() or fit_generator() function, but fails miserably when combined
with predict_generator() or the predict() function.
When using Plaid-ML Keras back-end for AMD processor, I would rather loop through all test images one-by-one and get the prediction for each image in each iteration.
import os
from PIL import Image
import keras
import numpy
# code for creating dan training model is not included
print("Prediction result:")
dir = "/path/to/test/images"
files = os.listdir(dir)
correct = 0
total = 0
#dictionary to label all animal category class.
classes = {
0:'This is Cat',
1:'This is Dog',
}
for file_name in files:
total += 1
image = Image.open(dir + "/" + file_name).convert('RGB')
image = image.resize((100,100))
image = numpy.expand_dims(image, axis=0)
image = numpy.array(image)
image = image/255
pred = model.predict_classes([image])[0]
animals_category = classes[pred]
if ("cat" in file_name) and ("cat" in sign):
print(correct,". ", file_name, animals_category)
correct+=1
elif ("dog" in file_name) and ("dog" in animals_category):
print(correct,". ", file_name, animals_category)
correct+=1
print("accuracy: ", (correct/total))

How to implement a multi-label text classifier in Keras?

I've been trying to create a multi-label text classifier that uses GloVe embeddings using Keras. I'm currently experimenting with the TREC-6 data set available at http://cogcomp.org/Data/QA/QC/. I'm only considering the 5 broad labels for my classification problem and am ignoring the sub-labels.
Since it's a multi-label classification problem, given a sentence, my neural network should output all labels with a probability greater than 0.1. Issue is that the network almost always classifies a question with only 1 label, which is fine when I'm asking a question that belongs to only one category. When I combine questions from different categories however, it still gives only one label with a high confidence most of the time, although I want all relevant labels to be identified.
I'm absolutely sure the pre-processing steps are correct. I get the feeling that there's some issue in my model.
I started experimenting with only CNNs in the beginning by referring to the paper "Convolutional Neural Networks for Sentence Classification" at https://www.aclweb.org/anthology/D14-1181.pdf, but on a teacher's advice and after seeing that they fail for long questions with different relevant topics, I tried experimenting with LSTMs and BiLSTMS. I started with this approach https://www.kaggle.com/c/quora-insincere-questions-classification/discussion/80568 and kept modifying some parameters / adding and removing layers hoping to get a good result but I've failed so far.
I tried copy pasting some code for Attention mechanism and adding that after my LSTM layers as well but it doesn't help.
My current model looks somewhat like this. I'll paste most of the rest of my code for clarity. The model and training code is present in the sentence_classifier() function.
class SentenceClassifier:
def __init__(self):
self.MAX_SEQUENCE_LENGTH = 200
self.EMBEDDING_DIM = 100
self.LABEL_COUNT = 0
self.WORD_INDEX = dict()
self.LABEL_ENCODER = None
def clean_str(self, string):
"""
Cleans each string and convert to lower case.
"""
string = re.sub(r"\'s", "", string)
string = re.sub(r"\'ve", "", string)
string = re.sub(r"n\'t", " not", string)
string = re.sub(r"\'re", "", string)
string = re.sub(r"\'d", "", string)
string = re.sub(r"\'ll", "", string)
string = re.sub(r"[^A-Za-z0-9]", " ", string)
string = re.sub(r"\s{2,}", " ", string)
return string.strip().lower()
def loader_encoder(self, table, type="json"):
"""
Load and encode data from dataset.
type = "sql" means get data from MySQL database.
type = "json" means get data from .json file. """
if type == "json":
with open('data/' + table + '.json', 'r', encoding='utf8') as f:
datastore = json.load(f)
questions = []
tags = []
for row in datastore:
questions.append(row['question'])
tags.append(row['tags'].split(','))
tokenizer = Tokenizer(lower=True, char_level=False)
tokenizer.fit_on_texts(questions)
self.WORD_INDEX = tokenizer.word_index
questions_encoded = tokenizer.texts_to_sequences(questions)
questions_encoded_padded = pad_sequences(questions_encoded, maxlen=self.MAX_SEQUENCE_LENGTH, padding='post')
for i, ele in enumerate(tags):
for j, tag in enumerate(ele):
if len(tag) == 0 or tag == ',':
del tags[i][j]
encoder = MultiLabelBinarizer()
encoder.fit(tags)
self.LABEL_ENCODER = encoder
tags_encoded = encoder.fit_transform(tags)
self.LABEL_COUNT = len(tags_encoded[0]) #No. of labels
print("\tUnique Tokens in Training Data: ", len(self.WORD_INDEX))
return questions_encoded_padded, tags_encoded
def load_embeddings(self, EMBED_PATH='./embeddings/glove.6B.100d.txt'):
"""
Load pre-trained embeddings into memory.
"""
embeddings_index = {}
try:
f = open(EMBED_PATH, encoding='utf-8')
except FileNotFoundError:
print("Embeddings missing.")
sys.exit()
for line in f:
values = line.rstrip().rsplit(' ')
word = values[0]
vec = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = vec
f.close()
print("\tNumber of tokens in embeddings file: ", len(embeddings_index))
return embeddings_index
def create_embedding_matrix(self, embeddings_index):
"""
Creates an embedding matrix for all the words(vocab) in the training data with shape (vocab, EMBEDDING_DIM).
Out-of-vocab words will be randomly initialized to values between +0.25 and -0.25.
"""
words_not_found = []
vocab = len(self.WORD_INDEX) + 1
embedding_matrix = np.random.uniform(-0.25, 0.25, size=(vocab, self.EMBEDDING_DIM))
for word, i in self.WORD_INDEX.items():
if i >= vocab:
continue
embedding_vector = embeddings_index.get(word)
if (embedding_vector is not None) and len(embedding_vector) > 0:
embedding_matrix[i] = embedding_vector
else:
words_not_found.append(word)
# print('Number of null word embeddings: %d' % np.sum(np.sum(embedding_matrix, axis=1) == 0))
print("\tShape of embedding matrix: ", str(embedding_matrix.shape))
print("\tNo. of words not found in pre-trained embeddings: ", len(words_not_found))
return embedding_matrix
def sentence_classifier_cnn(self, embedding_matrix, x, y, table, load_saved=0):
"""
A static CNN model.
Makes uses of Keras functional API for constructing the model.
If load_saved=1, THEN load old model, ELSE train new model
model_name = table + ".model.h5"
if load_saved == 1 and os.path.exists('./saved/' + model_name):
print("\nLoading saved model...")
model = load_model('./saved/' + model_name)
print("Model Summary")
print(model.summary())
"""
print("\nTraining model...")
inputs = Input(shape=(self.MAX_SEQUENCE_LENGTH,), dtype='int32')
embedding = Embedding(input_dim=(len(self.WORD_INDEX) + 1), output_dim=self.EMBEDDING_DIM,
weights=[embedding_matrix],
input_length=self.MAX_SEQUENCE_LENGTH)(inputs)
X = keras.layers.SpatialDropout1D(0.3)(embedding)
X = keras.layers.Bidirectional(keras.layers.CuDNNLSTM(64, return_sequences = True))(X)
#X2 = keras.layers.Bidirectional(keras.layers.CuDNNGRU(128, return_sequences = False))(X)
X = keras.layers.Conv1D(32, kernel_size=2, padding='valid', kernel_initializer='normal')(X)
X = keras.layers.GlobalMaxPooling1D()(X)
#X = Attention(self.MAX_SEQUENCE_LENGTH)(X)
X = Dropout(0.5)(X)
X = keras.layers.Dense(16, activation="relu")(X)
X = Dropout(0.5)(X)
X = keras.layers.BatchNormalization()(X)
output = Dense(units=self.LABEL_COUNT, activation='sigmoid')(X)
model = Model(inputs=inputs, outputs=output, name='intent_classifier')
print("Model Summary")
print(model.summary())
cbk = OutputObserver(model, classifier)
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(x, y,
batch_size=30,
epochs=23,
verbose=2,
callbacks = [cbk])
#keras.utils.vis_utils.plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
return model
def tag_question(self, model, question):
question = self.clean_str(question)
question_encoded = [[self.WORD_INDEX[w] for w in question.split(' ') if w in self.WORD_INDEX]]
question_encoded_padded = pad_sequences(question_encoded, maxlen=self.MAX_SEQUENCE_LENGTH, padding='post')
predictions = model.predict(question_encoded_padded)
possible_tags = []
for i, probability in enumerate(predictions[0]):
if probability >= 0.01:
possible_tags.append([self.LABEL_ENCODER.classes_[i], probability])
possible_tags.sort(reverse=True, key=lambda x:x[1]) #sort in place on the basis of the probability in each sub-list in descending order
print(possible_tags)
def setup_classifier(self, table):
'''
'''
print("Loading Data Set...")
x, y = self.loader_encoder(table)
embeddings_index = self.load_embeddings()
print("\nGenerating embedding matrix...")
embedding_matrix = self.create_embedding_matrix(embeddings_index)
#Loading / Training model
model = self.sentence_classifier_cnn(embedding_matrix, x, y, table, load_saved=1)
return model, embeddings_index
def connect_to_db(self):
mydb = mysql.connector.connect(host="localhost", user="root", passwd="root", database="questiondb")
cursor = mydb.cursor()
return mydb, cursor
As you can see, I've used a callback to print my predictions after each step.
I've tried to predict labels for all kinds of questions but I get good results only for questions that fall under one category.
For instance,
classifier.tag_question(model, "how many days before new year?")
gives
[['numeric', 0.99226487]]
as the output. But a more complex question like
classifier.tag_question(model, "who is the prophet of the muslim people and where is india located and how much do fruits costs there?")
gives something like
[['human', 0.9990531]]
as the output although labels like 'location' and 'numeric' are also relevant.
I used a callback to predict the prediction for that question after every epoch and I see something like this.
Epoch 1/23
- 19s - loss: 0.6581 - acc: 0.6365
[['human', 0.69752634], ['location', 0.40014982], ['entity', 0.32047516], ['abbreviation', 0.23877779], ['numeric', 0.23324837], ['description', 0.15995058]]
Epoch 2/23
- 12s - loss: 0.4525 - acc: 0.8264
[['human', 0.7437608], ['location', 0.18141672], ['entity', 0.14474556], ['numeric', 0.09171515], ['description', 0.053900182], ['abbreviation', 0.05283475]]
Epoch 3/23
- 12s - loss: 0.3854 - acc: 0.8478
[['human', 0.86335427], ['location', 0.12673976], ['entity', 0.09847507], ['numeric', 0.064431995], ['description', 0.035599917], ['abbreviation', 0.02441895]]
Epoch 4/23
- 12s - loss: 0.3634 - acc: 0.8509
[['human', 0.90795004], ['location', 0.10085008], ['entity', 0.09804481], ['numeric', 0.050411616], ['description', 0.032810867], ['abbreviation', 0.014970899]]
Epoch 5/23
- 13s - loss: 0.3356 - acc: 0.8582
[['human', 0.8365586], ['entity', 0.1130701], ['location', 0.10253032], ['numeric', 0.039931685], ['description', 0.02874279]]
Epoch 6/23
- 13s - loss: 0.3142 - acc: 0.8657
[['human', 0.95577633], ['entity', 0.088555306], ['location', 0.055004593], ['numeric', 0.015950901], ['description', 0.01428318]]
Epoch 7/23
- 13s - loss: 0.2942 - acc: 0.8750
[['human', 0.89538944], ['entity', 0.130977], ['location', 0.06350105], ['description', 0.023014158], ['numeric', 0.019377537]]
Epoch 8/23
- 13s - loss: 0.2739 - acc: 0.8802
[['human', 0.9725125], ['entity', 0.061141968], ['location', 0.026945814], ['description', 0.010931551]]
Epoch 9/23
- 13s - loss: 0.2579 - acc: 0.8914
[['human', 0.9797143], ['entity', 0.042518377], ['location', 0.027904237]]
Epoch 10/23
- 13s - loss: 0.2380 - acc: 0.9020
[['human', 0.7897601], ['entity', 0.14315197], ['location', 0.07439863], ['description', 0.019453615], ['numeric', 0.010681627]]
Epoch 11/23
- 13s - loss: 0.2250 - acc: 0.9104
[['human', 0.9886158], ['entity', 0.024878502], ['location', 0.015951043]]
Epoch 12/23
- 13s - loss: 0.2131 - acc: 0.9178
[['human', 0.9677731], ['entity', 0.03698206], ['location', 0.026153017]]
Epoch 13/23
- 13s - loss: 0.2029 - acc: 0.9204
[['human', 0.9514474], ['entity', 0.053581357], ['location', 0.029657435]]
Epoch 14/23
- 13s - loss: 0.1915 - acc: 0.9285
[['human', 0.9706739], ['entity', 0.0328649], ['location', 0.013876333]]
Epoch 15/23
- 13s - loss: 0.1856 - acc: 0.9300
[['human', 0.9328136], ['location', 0.05573874], ['entity', 0.025918543]]
Epoch 16/23
- 13s - loss: 0.1802 - acc: 0.9318
[['human', 0.9895527], ['entity', 0.014941782], ['location', 0.011972391]]
Epoch 17/23
- 13s - loss: 0.1717 - acc: 0.9373
[['human', 0.9426272], ['entity', 0.03754583], ['location', 0.023379702]]
Epoch 18/23
- 13s - loss: 0.1614 - acc: 0.9406
[['human', 0.99186605]]
Epoch 19/23
- 13s - loss: 0.1573 - acc: 0.9432
[['human', 0.9926062]]
Epoch 20/23
- 13s - loss: 0.1511 - acc: 0.9448
[['human', 0.9993554]]
Epoch 21/23
- 13s - loss: 0.1591 - acc: 0.9426
[['human', 0.9964465]]
Epoch 22/23
- 13s - loss: 0.1507 - acc: 0.9451
[['human', 0.999688]]
Epoch 23/23
- 13s - loss: 0.1524 - acc: 0.9436
[['human', 0.9990531]]
I've tried varying my parameters hundreds of times, especially my network size, batch size and epochs to try and avoid over-fitting.
I know my question is ridiculously long but I'm running out of patience and any help would be appreciated.
Here's the link to my colab notebook - https://colab.research.google.com/drive/1EOklUw7efOv69HvWKpuKVy1LSzcvTTCk.

Keras - Classifier not learning from Transfer-Values of a Pre-Trained Model

I'm currently trying to use a pre-trained network and test in on this dataset.
Originally, I used VGG19 and just fine-tuned only the classifier at the end to fit with my 120 classes. I let all layers trainable to maybe improve performance by having a deeper training. The problem is that the model is very slow (even if I let it run for a night, I only got couple of epochs and reach an accuracy of around 45% - I have a GPU GTX 1070).
Then, my thinking was to freeze all layers from this model as I have only 10k images and only train the few last Denses layers but it's still not realy fast.
After watching this video (at around 2 min 30s), I decided to replicate the principle of Transfer-Values with InceptionResnetv2.
I processed every pictures and saved the output in a numpy matrix with the following code.
# Loading pre-trained Model + freeze layers
model = applications.inception_resnet_v2.InceptionResNetV2(
include_top=False,
weights='imagenet',
pooling='avg')
for layer in model.layers:
layer.trainable = False
# Extraction of features and saving
a = True
for filename in glob.glob('train/resized/*.jpg'):
name_img = os.path.basename(filename)[:-4]
class_ = label[label["id"] == name_img]["breed"].values[0]
input_img = np.expand_dims(np.array(Image.open(filename)), 0)
pred = model.predict(input_img)
if a:
X = np.array(pred)
y = np.array(class_)
a = False
else:
X = np.vstack((X, np.array(pred)))
y = np.vstack((y, class_))
np.savez_compressed('preprocessed.npz', X=X, y=y)
X is a matrix of shape (10222, 1536) and y is (10222, 1).
After, I designed my classifier (several topologies) and I have no idea why it is not able to perform any learning.
# Just to One-Hot-Encode labels properly to (10222, 120)
label_binarizer = sklearn.preprocessing.LabelBinarizer()
y = label_binarizer.fit_transform(y)
model = Sequential()
model.add(Dense(512, input_dim=X.shape[1]))
# model.add(Dense(2048, activation="relu"))
# model.add(Dropout(0.5))
# model.add(Dense(256))
model.add(Dense(120, activation='softmax'))
model.compile(
loss = "categorical_crossentropy",
optimizer = "Nadam", # I tried several ones
metrics=["accuracy"]
)
model.fit(X, y, epochs=100, batch_size=64,
callbacks=[early_stop], verbose=1,
shuffle=True, validation_split=0.10)
Below you can find the output from the model :
Train on 9199 samples, validate on 1023 samples
Epoch 1/100
9199/9199 [==============================] - 2s 185us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 2/100
9199/9199 [==============================] - 1s 100us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 3/100
9199/9199 [==============================] - 1s 98us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 4/100
9199/9199 [==============================] - 1s 96us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 5/100
9199/9199 [==============================] - 1s 99us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
Epoch 6/100
9199/9199 [==============================] - 1s 96us/step - loss: 15.9639 - acc: 0.0096 - val_loss: 15.8975 - val_acc: 0.0137
I tried to change topologies, activation functions, add dropouts but nothing creates any improvements.
I have no idea what is wrong in my way of doing this. Is the X matrix incorrect ? Isn't it allowed to use the pre-trained model only as feature extractor then perform the classification with a second model ?
Many thanks for your feedbacks,
Regards,
Nicolas
You'll need to call preprocess_input before feeding the image array to the model. It normalizes the values of input_img from [0, 255] into [-1, 1], which is the desired input range for InceptionResNetV2.
input_img = np.expand_dims(np.array(Image.open(filename)), 0)
input_img = applications.inception_resnet_v2.preprocess_input(input_img.astype('float32'))
pred = model.predict(input_img)

How to interpret Keras model.fit output?

I've just started using Keras. The sample I'm working on has a model and the following snippet is used to run the model
from sklearn.preprocessing import LabelBinarizer
label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, nb_epoch=3, validation_split=0.2)
I get the following response:
Using TensorFlow backend. Train on 80 samples, validate on 20 samples Epoch 1/3
32/80 [===========>..................] - ETA: 0s - loss: 1.5831 - acc:
0.4062 80/80 [==============================] - 0s - loss: 1.3927 - acc:
0.4500 - val_loss: 0.7802 - val_acc: 0.8500 Epoch 2/3
32/80 [===========>..................] - ETA: 0s - loss: 0.9300 - acc:
0.7500 80/80 [==============================] - 0s - loss: 0.8490 - acc:
0.8000 - val_loss: 0.5772 - val_acc: 0.8500 Epoch 3/3
32/80 [===========>..................] - ETA: 0s - loss: 0.6397 - acc:
0.8750 64/80 [=======================>......] - ETA: 0s - loss: 0.6867 - acc:
0.7969 80/80 [==============================] - 0s - loss: 0.6638 - acc:
0.8000 - val_loss: 0.4294 - val_acc: 0.8500
The documentation says that fit returns
A History instance. Its history attribute contains all information
collected during training.
Does anyone know how to interpret the history instance?
For example, what does 32/80 mean? I assume 80 is the number of samples but what is 32? ETA: 0s ??
ETA = Estimated Time of Arrival.
80 is the size of your training set, 32/80 and 64/80 mean that your batch size is 32 and currently the first batch (or the second batch respectively) is being processed.
loss and acc refer to the current loss and accuracy of the training set.
At the end of each epoch your trained NN is evaluated against your validation set. This is what val_loss and val_acc refer to.
The history object returned by model.fit() is a simple class with some fields, e.g. a reference to the model, a params dict and, most importantly, a history dict. It stores the values of loss and acc (or any other used metric) at the end of each epoch. For 2 epochs it will look like this:
{
'val_loss': [16.11809539794922, 14.12947562917035],
'val_acc': [0.0, 0.0],
'loss': [14.890108108520508, 12.088571548461914],
'acc': [0.0, 0.25]
}
This comes in very handy if you want to visualize your training progress.
Note: if your validation loss/accuracy starts increasing while your training loss/accuracy is still decreasing, this is an indicator of overfitting.
Note 2: at the very end you should test your NN against some test set that is different from you training set and validation set and thus has never been touched during the training process.
32 is your batch size. 32 is the default value that you can change in your fit function if you wish to do so.
After the first batch is trained Keras estimates the training duration (ETA: estimated time of arrival) of one epoch which is equivalent to one round of training with all your samples.
In addition to that you get the losses (the difference between prediction and true labels) and your metric (in your case the accuracy) for both the training and the validation samples.

Training Keras model with HDF5Matrix results in very slow learning

I am using HDF5Matrix to load a dataset and train my model with it. In the first epoch I obtain about 10% of accuracy.
At the moment, my dataset is not very large, so I can copy the contents of the HDF5Matrix to a numpy array and train with it. I reinitialise the model, and this time, in the first epoch I obtain a 40% accuracy.
For more information about the HDF5Matrix, see this example.
I understand in the fit method, the parameters shuffle must be either False or 'batch'. I get the same behaviour either way.
Does anybody have the same problem? Could you tell me if there is something I am doing wrong?
This is a snippet of the code:
using HDF5Matrix
from keras.utils.io_utils import HDF5Matrix
x_train = HDF5Matrix('../data/default_data.h5', 'data')
y_train = HDF5Matrix('../data/default_data.h5', 'labels') # create the model ...
# train the model
model.fit(x_train, y_train, epochs=200, batch_size=2048, shuffle='batch') # which outputs:
Epoch 1/200
1758510/1758510 [==============================] - 42s - loss: 2.5574 - categorical_accuracy: 0.1032
Epoch 2/200
1758510/1758510 [==============================] - 41s - loss: 2.3145 - categorical_accuracy: 0.1553
Epoch 3/200
1758510/1758510 [==============================] - 41s - loss: 2.1931 - categorical_accuracy: 0.2067
Epoch 4/200
694272/1758510 [==========>...................] - ETA: 24s - loss: 2.1055 - categorical_accuracy: 0.2328
Using numpy array
# create the model again
...
# copy the HDF5Matrix to a numpy array
X_training = x_train[0:1758510]
Y_training = y_train[0:1758510]
# check X_training is equal to x_train
...
# train the model again
model.fit(X_training,
Y_training,
epochs=200,
batch_size=256,
shuffle=True)
# which outputs
Epoch 1/200
1758510/1758510 [==============================] - 27s - loss: 1.5019 - categorical_accuracy: 0.4710
Epoch 2/200
89600/1758510 [>.............................] - ETA: 26s - loss: 1.2786 - categorical_accuracy: 0.5523
Thank you very much

Resources