Is it possible to extend python imageai pretrained model for more classes? - python-3.x

I am working on a project which uses imageai with YOLOv3 which works fast and accurately for my purpose. However this model is able to detect only 80 classes out of which I want some of them but want to add some more classes as well.
I referred to https://imageai.readthedocs.io/en/latest/customdetection/index.html to train my own custom model with 3 more classes. However, I am unable to detect the 80 classes that were provided by YOLOv3. Is there a way to generate a model that extends the existing YOLOv3 and can detect all 80 classes + extra classes that I want?
P.S. I am new to tensorflow and imageai so I don't know too much. Please bear with me.

I have not yet found a way to extend an existing model, but I can assure you that training your own model is far more efficient than carrying around all the classes no one wants.
If it still interests you, this person had a similar question: Loading a trained Keras model and continue training
This is his finished code example:
"""
Model by: http://machinelearningmastery.com/
"""
import numpy
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import load_model
numpy.random.seed(7)
def baseline_model():
    # num_pixels and num_classes are module-level names set in the main block below
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255
    # one hot encode outputs (to_categorical is imported directly above)
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)
    num_classes = y_test.shape[1]
    # build the model
    model = baseline_model()
    # Partly train model on the first 3000 samples
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, epochs=10, batch_size=200, verbose=2)
    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))
    # Save partly trained model
    model.save('partly_trained.h5')
    del model
    # Reload model
    model = load_model('partly_trained.h5')
    # Continue training on the remaining samples
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, epochs=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

Related

GridsearchCV loss doesn't equal model.fit() loss values

I am confused as to which metric GridSearchCV is using in its parameter search. My understanding is that my model object feeds it a metric and this is what is used to determine the "best_params". But this doesn't appear to be the case. I thought that scoring=None is the default and that, as a result, the first metric given in the metrics option of model.compile() was used. So in my case the scoring function used should be the mean_squared_error. My explanation for this issue is described next.
Here is what I am doing. I simulated some regression data using sklearn with 10 features on 10,000 observations. I am playing around with keras because I typically used pytorch in the past and never really dabbled with keras until now. I am noticing a discrepancy in the loss function output from my GridSearchCV call vs the model.fit() call after I have my optimal set of parameters. Now I know I can just set refit=True and not re-fit the model again, but I am trying to get a feel for the output of the keras and sklearn GridSearchCV functions.
To be explicit about the discrepancy here is what I am seeing. I simulated some data using sklearn as follows:
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X, y = make_regression(n_samples=N, n_features=feats, n_informative=2, noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
I have created a "create_model" function that is looking to tune which activation function I am using (again this is a simple example for a proof of concept).
def create_model(activation_fn):
    # create model
    model = Sequential()
    model.add(Dense(30, input_dim=feats, activation=activation_fn,
                    kernel_initializer='normal'))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation=activation_fn))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='linear'))
    # Compile model
    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['mean_squared_error','mae'])
    return model
Performing the grid search I get the following output
model = KerasRegressor(build_fn=create_model, epochs=50, batch_size=200, verbose=0)
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, verbose=1)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
Best: -21.163454 using {'activation_fn': 'linear'}
Ok, so the best metric is the mean squared error of 21.16 (I understand they flip the sign to create a maximization problem). So, when I fit the model using the activation_fn = 'linear' the MSE I get is totally different.
best_model = create_model('linear')
history = best_model.fit(X_train, y_train, epochs=50, batch_size=200, verbose=1)
.....
.....
Epoch 49/50
8000/8000 [==============================] - 0s 48us/step - loss: 344.1636 - mean_squared_error: 344.1636 - mean_absolute_error: 12.2109
Epoch 50/50
8000/8000 [==============================] - 0s 48us/step - loss: 326.4524 - mean_squared_error: 326.4524 - mean_absolute_error: 11.9250
history.history['mean_squared_error']
Out[723]:
[10053.778002929688,
9826.66806640625,
......
......
344.16363830566405,
326.45237121582034]
The difference is in 326.45 vs. 21.16. Any insight as to what I am misunderstanding would be greatly appreciated. I would be more comfortable if they were within a reasonable neighborhood of each other, given one is the error from one fold vs the entire training data set. But 21 is nowhere near 326. Thanks!
The entire code is seen here.
import pandas as pd
import numpy as np
from keras import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
from keras.constraints import maxnorm
from sklearn import preprocessing
from sklearn.preprocessing import scale
from sklearn.datasets import make_regression
from matplotlib import pyplot as plt
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X, y = make_regression(n_samples=N, n_features=feats, n_informative=2, noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
def create_model(activation_fn):
    # create model
    model = Sequential()
    model.add(Dense(30, input_dim=feats, activation=activation_fn,
                    kernel_initializer='normal'))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation=activation_fn))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='linear'))
    # Compile model
    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['mean_squared_error','mae'])
    return model
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# create model
model = KerasRegressor(build_fn=create_model, epochs=50, batch_size=200, verbose=0)
# define the grid search parameters
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, verbose=1)
best_model = create_model('linear')
history = best_model.fit(X_train, y_train, epochs=50, batch_size=200, verbose=1)
history.history.keys()
plt.plot(history.history['mean_absolute_error'])
# summarize results
grid_result.cv_results_
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
The large loss reported in your output (326.45237121582034) is the training loss. If you need a metric to be compared with the grid_result.best_score_ (in the GridSearchCV) and the MSE (in the best_model.fit), you have to request the validation loss (cf. code below).
Now to the question: why is the validation loss lower than the training loss? In your case it is essentially because of dropout (which is applied during training but not during validation/test) - that is why the difference between training and validation losses disappears when you remove dropout. You can find a detailed explanation here of the possible reasons for a lower validation loss.
In short, the performance (MSE) of your model is given by the grid_result.best_score_ (21.163454 in your example).
import numpy as np
from keras import Sequential
from keras.layers import Dense, Dropout
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.datasets import make_regression
import tensorflow as tf
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
tf.random.set_seed(42)
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X, y = make_regression(n_samples=N, n_features=feats, n_informative=2, noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
def create_model(activation_fn):
    # create model
    model = Sequential()
    model.add(Dense(30, input_dim=feats, activation=activation_fn,
                    kernel_initializer='normal'))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation=activation_fn))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='linear'))
    # Compile model
    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['mean_squared_error','mae'])
    return model
# create model
model = KerasRegressor(build_fn=create_model, epochs=50, batch_size=200, verbose=0)
# define the grid search parameters
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, verbose=1, validation_data=(X_test, y_test))
best_model = create_model('linear')
history = best_model.fit(X_train, y_train, epochs=50, batch_size=200, verbose=1, validation_data=(X_test, y_test))
history.history.keys()
# plt.plot(history.history['mae'])
# summarize results
print(grid_result.cv_results_)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
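The dropout effect described in this answer can be reproduced without Keras. The following is a minimal NumPy sketch, not the asker's network: it assumes a plain linear model with known weights and applies inverted dropout to the inputs, which is a simplification of dropout on hidden layers. The names `forward` and `mse` are illustrative only. The same weights are scored twice, once in "training mode" (dropout active) and once in "inference mode" (dropout off).

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two 1-D arrays."""
    return float(np.mean((y_true - y_pred) ** 2))

def forward(X, w, drop_rate=0.0, rng=None):
    """Linear prediction; inverted dropout on the inputs when drop_rate > 0."""
    if drop_rate > 0.0:
        keep_mask = rng.random(X.shape) >= drop_rate
        # Rescale kept inputs so the expected activation is unchanged
        X = X * keep_mask / (1.0 - drop_rate)
    return X @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
w = rng.normal(size=10) * 10.0
y = X @ w + rng.normal(scale=3.0, size=2000)  # noise level chosen for illustration

# Same weights, two modes: dropout active (training) vs. disabled (validation/test)
train_mode_loss = mse(y, forward(X, w, drop_rate=0.2, rng=rng))
eval_mode_loss = mse(y, forward(X, w))
print("loss with dropout active:", train_mode_loss)
print("loss with dropout off:   ", eval_mode_loss)
```

Even though the weights are identical, the loss measured with dropout active is much larger, which is the same kind of gap as the one between the loss printed during best_model.fit() and the score reported by the grid search.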

How to use KerasClassifier validation split with scikit-learn GridSearchCV

I want to test some hyperparameters, so I want to use GridSearchCV, since that seems to be the way to do it.
But I also want to use a validation split, so that I can use callbacks like EarlyStopping and/or ReduceLROnPlateau. So my question is:
How do I implement GridSearchCV + validation_split correctly, so that none of the data in the validation split is used for training and the whole training set is used to train my model?
As far as I know, GridSearchCV splits my remaining training data (which is 1 - validation_split) again. I get fairly high accuracy, and I suspect that I am not splitting the data correctly.
model = KerasClassifier(build_fn=create_model,verbose=2, validation_split=0.1)
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform',
        #'normal',
        'uniform',
        'he_normal',
        #'lecun_normal',
        #'he_uniform'
        ]
epochs = [3] #5,8,10,30
batches = [64] #32,64
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, Y_train)
You can use your own validation data by passing an extra argument to the grid.fit() function, namely validation_data=(X_test, Y_test). The documentation states that grid.fit() accepts all valid arguments that can be passed to the actual model.fit() function of the default Keras model. Therefore, you can pass the validation data through grid.fit(). You may also pass the callback functions there.
I am adding a working code below (applied on MNIST-digit dataset). Notice how I added the validation data on grid.fit() and removed the 'validation_split':
import tensorflow as tf
import numpy as np
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
Y_train = to_categorical(Y_train, 10)
Y_test = to_categorical(Y_test, 10)
X_train = np.expand_dims(X_train, 3)
X_test = np.expand_dims(X_test, 3)
def create_model(optimizer, init):
    model = tf.keras.Sequential([
        tf.keras.layers.Convolution2D(32, 3, input_shape=(28, 28, 1),
                                      activation='relu', kernel_initializer=init),
        tf.keras.layers.Convolution2D(32, 3, activation='relu',
                                      kernel_initializer=init),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(12, activation='relu',
                              kernel_initializer=init),
        tf.keras.layers.Dense(10, activation='softmax',
                              kernel_initializer=init),
    ])
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer, metrics=['accuracy'])
    return model
model = KerasClassifier(build_fn=create_model, verbose=2,)
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform',
        #'normal',
        'uniform',
        'he_normal',
        #'lecun_normal',
        #'he_uniform'
        ]
epochs = [4,]
batches = [32, 64]
param_grid = dict(optimizer=optimizers, epochs=epochs,
                  batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, Y_train, validation_data=(X_test, Y_test))
Hope this helps. Thanks.
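To see why combining validation_split with GridSearchCV shrinks the data actually used for training, here is a rough index-level sketch in NumPy. It assumes plain, unshuffled k-fold splitting (the real splitter may be stratified or shuffled) and relies on the documented Keras behaviour that validation_split holds out the last fraction of whatever samples are passed to fit(). The helper names are hypothetical.

```python
import numpy as np

def kfold_train_indices(n_samples, n_splits):
    """Yield the training indices of each fold for contiguous, unshuffled k-fold."""
    folds = np.array_split(np.arange(n_samples), n_splits)
    for held_out in range(n_splits):
        yield np.concatenate([f for j, f in enumerate(folds) if j != held_out])

def keras_validation_split(train_indices, validation_split):
    """Mimic Keras: the last fraction of the samples given to fit() is held out."""
    split_at = int(len(train_indices) * (1.0 - validation_split))
    return train_indices[:split_at], train_indices[split_at:]

def effective_sizes(n_samples, n_splits, validation_split):
    """Per fold: (samples trained on, Keras validation samples, CV test samples)."""
    sizes = []
    for train_idx in kfold_train_indices(n_samples, n_splits):
        fit_idx, val_idx = keras_validation_split(train_idx, validation_split)
        sizes.append((len(fit_idx), len(val_idx), n_samples - len(train_idx)))
    return sizes

for fit_n, val_n, test_n in effective_sizes(100, 5, 0.1):
    print("trained on %d, Keras validation %d, CV test fold %d" % (fit_n, val_n, test_n))
```

With 100 samples, 5 folds and validation_split=0.1, each fit sees only 72 samples, which is why passing a fixed validation_data set instead keeps the whole CV training fold available for training.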

How do I implement multilabel classification neural network with keras

I am attempting to implement a neural network using Keras with a problem that involves multilabel classification. I understand that one way to tackle the problem is to transform it to several binary classification problems. I have implemented one of these, but am not sure how to proceed with the others, mostly how do I go about combining them? My data set has 5 input variables and 5 labels. Generally a single sample of data would have 1-2 labels. It is rare to have more than two labels.
Here is my code (thanks to machinelearningmastery.com):
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("Realdata.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:5].astype(float)
Y = dataset[:,5]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(5, input_dim=5, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    scores = model.evaluate(X, encoded_Y)
    print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    #Make predictions....change the model.predict to whatever you want instead of X
    predictions = model.predict(X)
    # round predictions
    rounded = [round(x[0]) for x in predictions]
    print(rounded)
    return model
# evaluate model with standardized dataset
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, encoded_Y, cv=kfold)
print("Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
The approach you are referring to is the one-versus-all or the one-versus-one strategy for multi-label classification. However, when using a neural network, the easiest solution for a multi-label classification problem with 5 labels is to use a single model with 5 output nodes. With keras:
model = Sequential()
model.add(Dense(5, input_dim=5, kernel_initializer='normal', activation='relu'))
model.add(Dense(5, kernel_initializer='normal', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd')
You can provide the training labels as binary-encoded vectors of length 5. For instance, an example that corresponds to classes 2 and 3 would have the label [0 1 1 0 0].
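As a small illustration of the label format described above, the sketch below builds the binary-encoded target vectors and turns sigmoid outputs back into label sets. The function names, the 1-based class numbering and the 0.5 threshold are assumptions made for this example; they are not part of the Keras API.

```python
def encode_labels(active_classes, num_classes):
    """Build a multi-hot vector; classes are numbered from 1 as in the text above."""
    vector = [0] * num_classes
    for c in active_classes:
        if not 1 <= c <= num_classes:
            raise ValueError("class %d is outside 1..%d" % (c, num_classes))
        vector[c - 1] = 1
    return vector

def decode_predictions(probabilities, threshold=0.5):
    """Return the 1-based classes whose sigmoid output reaches the threshold."""
    return [i + 1 for i, p in enumerate(probabilities) if p >= threshold]

target = encode_labels([2, 3], num_classes=5)
print(target)
# Hypothetical sigmoid outputs for one sample, one value per label
print(decode_predictions([0.10, 0.90, 0.70, 0.20, 0.05]))
```

Because each output node has its own sigmoid, any number of labels (including none) can be active for a sample, which is what distinguishes this setup from a softmax classifier.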

MLP classifier_for multi class

I am new to Keras.
I am trying to follow the Keras tutorial for Multilayer Perceptron (MLP) multi-class softmax classification, using my own data set.
My data has 3 classes and only one feature, but I don't understand why the result always shows an accuracy of just 0.3 and the model predicts all training data as the first class. The confusion matrix looks like this:
Confusion matrix
Here the coding:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import pandas as pd
import numpy as np
# Importing the dataset
dataset = pd.read_csv('StatusAll.csv')
X = dataset.iloc[:, 1:].values
y = dataset.iloc[:, 0:1].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, a single feature (input_dim=1).
model.add(Dense(64, activation='tanh', input_dim=1))
model.add(Dropout(0.5))
model.add(Dense(64, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
history = model.fit(x_train, y_train,
epochs=100,
batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
print('Test score:', score[0])
print('Test accuracy:', score[1])
from sklearn import metrics
prediction = model.predict(x_test)
prediction = np.around(prediction)
y_test_non_category = [ np.argmax(t) for t in y_test ]
y_predict_non_category = [ np.argmax(t) for t in prediction ]
from sklearn.metrics import confusion_matrix
conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)
print (conf_mat)
I hope I can get some advice, thanksss.
[Image: sample of x_train]
[Image: y_train before conversion to categorical]
Your final Dense layer has 4 outputs, so it looks like you are classifying 4 classes instead of 3.
model.add(Dense(3, activation='softmax')) # Number of classes 3
It would be helpful to see sample data from x_train and y_train to make sure the pre-processing is correct. Because you have only 1 feature, an MLP might be overkill. A decision tree would be simpler unless you want to experiment with MLPs.
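Two pre-processing details are worth checking alongside the layer size; the NumPy sketch below illustrates both. It assumes, without having seen the data, that the labels are stored as 1, 2, 3: in that case to_categorical produces 4 columns (it sizes the output as the largest label plus one), so remapping labels to 0..2 first gives 3 columns. It also shows that rounding softmax outputs before argmax turns any row without a probability above 0.5 into all zeros, which argmax then reports as class 0. The helper names are hypothetical.

```python
import numpy as np

def labels_to_one_hot(labels):
    """Remap arbitrary label values to 0..K-1 and one-hot encode them."""
    classes, indices = np.unique(np.ravel(labels), return_inverse=True)
    one_hot = np.eye(len(classes), dtype="float32")[indices]
    return classes, one_hot

def predicted_classes(probabilities):
    """Pick the most probable class per row, without rounding first."""
    return np.argmax(probabilities, axis=1)

# Assumed label values 1..3, shaped (n, 1) like dataset.iloc[:, 0:1].values
classes, y_one_hot = labels_to_one_hot(np.array([[1], [2], [3], [1]]))
print(classes, y_one_hot.shape)

# Hypothetical softmax outputs: the first row has no value above 0.5
probs = np.array([[0.20, 0.45, 0.35],
                  [0.10, 0.30, 0.60]])
rounded_then_argmax = np.argmax(np.around(probs), axis=1)
direct_argmax = predicted_classes(probs)
print(rounded_then_argmax, direct_argmax)
```

If the confusion matrix shows everything in the first class, comparing these two ways of reading the predictions is a quick way to tell whether the model or the post-processing is responsible.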

How do I test my own hand written digits or one of data from MNIST dataset using CLI

I'm studying machine learning and I'm totally new to this. I have been given a task to build a simple command line program that takes in a handwritten digit image
and outputs a prediction of which digit the computer thinks the image contains, using the MNIST dataset. I found some code that uses Keras.
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(),
metrics=['accuracy'])
history = model.fit(x_train, y_train,
batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
After I execute this code, how can I turn it into a simple CLI program that can receive a picture and give me a prediction of which digit it most likely is?
For example, I saw in a YouTube tutorial how to classify a flower (rose, daisy, dandelion, sunflower, or tulip) just by executing this command:
# In Docker
python /tf_files/label_image.py /tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg
After restarting Docker, it shows the computer's confidence. So what command can I use to test my own image, or an image from the MNIST dataset, and get a prediction?
It looks like this code is learning how to identify the digits but when it's finished the model disappears. If you want to be able to use the model later you'll want to try model.save(filepath). (More information on how to save and load here:https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)
Then you could create a separate script, image_label.py for example, that loads the model and runs the second argument through the network. You're going to need to do some preprocessing of handwritten image files to run them through a network trained for MNIST images. If you want to test it on MNIST sample images it might be a little easier.
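A rough sketch of such a script follows. It assumes the training script ended with model.save('mnist_mlp.h5') (a hypothetical file name), that Pillow is installed for reading images, and that the picture shows a single, roughly centred digit. The inversion rule in `preprocess` is a simple heuristic chosen for this example, since MNIST digits are light strokes on a dark background.

```python
import argparse
import numpy as np

def preprocess(pixels):
    """Turn a 28x28 grayscale array (0-255) into the (1, 784) float input of the MLP."""
    pixels = np.asarray(pixels, dtype="float32")
    if pixels.shape != (28, 28):
        raise ValueError("expected a 28x28 grayscale image")
    # Heuristic: a mostly bright image is assumed to be dark ink on paper, so invert it
    if pixels.mean() > 127:
        pixels = 255.0 - pixels
    return (pixels / 255.0).reshape(1, 784)

def main(argv):
    parser = argparse.ArgumentParser(description="Predict the digit in an image.")
    parser.add_argument("image", help="path to an image of a single digit")
    parser.add_argument("--model", default="mnist_mlp.h5",
                        help="saved Keras model (hypothetical default name)")
    args = parser.parse_args(argv)

    # Imported here so that preprocess() can be used without Keras or Pillow
    from keras.models import load_model
    from PIL import Image

    image = Image.open(args.image).convert("L").resize((28, 28))
    model = load_model(args.model)
    probabilities = model.predict(preprocess(np.array(image)))[0]
    digit = int(np.argmax(probabilities))
    print("Predicted digit: %d (confidence %.2f)" % (digit, probabilities[digit]))
    return 0

# To use this as a script, end the file with:
#     import sys
#     if __name__ == "__main__":
#         raise SystemExit(main(sys.argv[1:]))
```

Saved as image_label.py with that guard added, it could be called as `python image_label.py my_digit.png`. Real handwriting photos usually need more careful cropping and centring than this sketch does, so results on MNIST test images will be better than on arbitrary pictures.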
