Issues with eli5 for feature importance in Keras - python-3.x

The end goal is to determine important features in a Neural Network model built within tensorflow keras OR kerasRegressor. The logic has been explained in this question, which utilizes eli5 by introducing noise for variables and measuring outcome.
I have been attempting for hours to implement this with no luck.
Question:
Why won't eli5 for feature importance work on either of my models?
My Current Error:
ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets
I have built the model in both tf.keras && kerasRegressor reading somewhere that eli5 doesn't work with tensorflow.keras. I admit I do not truly understanding the difference b/n kerasRegressor & tf.keras.
Code for Keras Regressor Model:
def base_model():
# 1- Instantiate Model
modelNEW = keras.Sequential()
# 2- Specify Shape of First Layer
modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
# 3- Add the layers
modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
return modelNEW
# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
currentModel = KerasRegressor(build_fn=base_model, epochs=100, batch_size=50, shuffle='True')
history = currentModel.fit(xTrain, yTrain)
Code for tf.Keras model:
Only change is model name
modelNEW = keras.Sequential()
modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
*** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
history = modelNEW.fit(xTrain, yTrain, epochs=100, batch_size=50, shuffle="True")
Attempting to implement eli5:
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance
from eli5 import show_weights
perm = PermutationImportance(currentModel, scoring="accuracy", random_state=1).fit(xTrain,yTrain)
eli5.show_weights(perm, feature_names = xTrain.columns.tolist())

Related

Evalulate Tensorflow Keras VS KerasRegressor Neural Network

I'm attempting to find variable importance on a Neural Network I've built. Using tensorflow, it seems you can use either the tensorflow.keras way, or the kerasRegressor way. Admittedly, I have been reading documentation / stack overflow for hours and am confused on the differences. They seem to perform similarly but have slightly different pros/cons.
One issue I'm running into is when I use tf.keras to build the model, I am able to clearly compare my training data to my validation/testing data, and get an 'accuracy score'. But, when using kerasRegressor, I am not.
The difference here is the .evaluate() function, which kerasRegressor doesn't seem to have.
Questions:
How to evaluate performance of kerasRegressor model w/ same output as tf.keras.evaluate()?
kerasRegressor Code:
K.clear_session()
def base_model():
# 1- Instantiate Model
modelNEW = keras.Sequential()
# 2- Specify Shape of First Layer
modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
# 3- Add the layers
modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
return modelNEW
# *** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
currentModel = KerasRegressor(build_fn=base_model, epochs=100, batch_size=50, shuffle='True')
history = currentModel.fit(xTrain, yTrain)
Now if I want to test the accuracy, I have to use .predict()
prediction = currentModel.predict(xValidation)
# print(prediction)
# train_error = np.abs(yValidation - prediction)
# mean_error = np.mean(train_error)
# min_error = np.min(train_error)
# max_error = np.max(train_error)
# std_error = np.std(train_error)```
tf.Keras neural Network:
modelNEW = keras.Sequential()
modelNEW.add(layers.Dense(512, activation = 'relu', input_shape = ourInputShape))
modelNEW.add(layers.Dense(3, activation= 'softmax')) #softmax returns array of probability scores (num prior), and in this case we have to predict either CSCANCEL, MEMBERCANCEL, ACTIVE)
modelNEW.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
*** THIS IS SUPPOSED TO PREVENT OVERFITTING ***
from tensorflow.keras.callbacks import EarlyStopping
callbacks = [
EarlyStopping(patience=2)
]
yTrain = keras.utils.to_categorical(yTrain, 3)
yValidation = keras.utils.to_categorical(yValidation, 3)
history = modelNEW.fit(xTrain, yTrain, epochs=100, batch_size=50, shuffle="True")
This is the evaluation I need to see, and cannot with kerasRegressor:
# 6- Model evaluation with test data
test_loss, test_acc = modelNEW.evaluate(xValidation, yValidation)
print('test_acc:', test_acc)
Possible Workaround, still error:
# predictionTrain = currentModel.predict(xTrain)
predictionValidation = currentModel.predict(xValidation)
# print('Train Accuracy = ',accuracy_score(yTrain,np.argmax(pred_train, axis=1)))
print('Test Accuracy = ',accuracy_score(yValidation,np.argmax(predictionValidation, axis=1)))
: Classification metrics can't handle a mix of multilabel-indicator and binary targets

Is it possible to extend python imageai pretrained model for more classes?

I am working on a project which uses imageai with YOLOv3 which works fast and accurately for my purpose. However this model is able to detect only 80 classes out of which I want some of them but want to add some more classes as well.
I referred to https://imageai.readthedocs.io/en/latest/customdetection/index.html to train my own custom model with 3 more classes. However, I am unable to detect the 80 classes that were provided by YOLOv3. Is there a way to generate a model that extends the existing YOLOv3 and can detect all 80 classes + extra classes that I want?
P.S. I am new to tensorflow and imageai so I don't know too much. Please bear with me.
I have not yet found a way to extend an existing model, but i can assure you that training your own model is far more efficient than using all the classes noones wants.
If it still interests you, this person had a similar question: Loading a trained Keras model and continue training
This is his finished code example:
"""
Model by: http://machinelearningmastery.com/
"""
import numpy
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import load_model
numpy.random.seed(7)
def baseline_model():
model = Sequential()
model.add(Dense(num_pixels, input_dim=num_pixels, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
if __name__ == '__main__':
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
# build the model
model = baseline_model()
#Partly train model
dataset1_x = X_train[:3000]
dataset1_y = y_train[:3000]
model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))
#Save partly trained model
model.save('partly_trained.h5')
del model
#Reload model
model = load_model('partly_trained.h5')
#Continue training
dataset2_x = X_train[3000:]
dataset2_y = y_train[3000:]
model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))

How to use KerasClassifier validation split and using scitkit learn GridSearchCV

I want to try to test some hyperparameters, thats i want to use the GridSearchCV, because it seems like thats the way to do it.
But i also want to use the validation split. To use Callsbacks like EarlyStopping or/and ReduceLROnPlateau. So my question is:
How do i implement GridSearchCV + validation_split correctly that none of the data in validation split is using for training and the whole training set is used to train my model?
Afaik GridSearchCV split again my remaining train data (which is 1-validation_split) and split it again? I get kinda high accuracy and im thinking that i dont split the data correctly
model = KerasClassifier(build_fn=create_model,verbose=2, validation_split=0.1)
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform',
#'normal',
'uniform',
'he_normal',
#'lecun_normal',
#'he_uniform'
]
epochs = [3] #5,8,10,30
batches = [64] #32,64
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, Y_train)
You can use your self-defined validation data by passing an extra argument to the grid.fit() function that is validation_data=(X_test, Y_test). The documentation, states that grid.fit() function accepts all valid arguments that can be passed to the actual model.fit() function of the default Keras model. Therefore, you can pass the validation data through the grid.fit() function. You may also pass the callback functions there.
I am adding a working code below (applied on MNIST-digit dataset). Notice how I added the validation data on grid.fit() and removed the 'validation_split':
import tensorflow as tf
import numpy as np
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
Y_train = to_categorical(Y_train, 10)
Y_test = to_categorical(Y_test, 10)
X_train = np.expand_dims(X_train, 3)
X_test = np.expand_dims(X_test, 3)
def create_model(optimizer, init):
model = tf.keras.Sequential([
tf.keras.layers.Convolution2D(32, 3, input_shape=(28, 28, 1),
activation='relu', kernel_initializer=init),
tf.keras.layers.Convolution2D(32, 3, activation='relu',
kernel_initializer=init),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(12, activation='relu',
kernel_initializer=init),
tf.keras.layers.Dense(10, activation='softmax',
kernel_initializer=init),
])
model.compile(loss='categorical_crossentropy',
optimizer=optimizer, metrics=['accuracy'])
return model
model = KerasClassifier(build_fn=create_model, verbose=2,)
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform',
#'normal',
'uniform',
'he_normal',
#'lecun_normal',
#'he_uniform'
]
epochs = [4,]
batches = [32, 64]
param_grid = dict(optimizer=optimizers, nb_epoch=epochs,
batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X_train, Y_train, validation_data=(X_test, Y_test))
Hope this helps. Thanks.

Fine-tune InceptionV3 model not working as expected

Trying to use transfer learning (fine tuning) with InceptionV3, removing the last layer, keeping training for all the layers off, and adding a single dense layer. When I look at the summary again, I do not see my added layer, and getting expectation.
RuntimeError: You tried to call count_params on dense_7, but the
layer isn't built. You can build it manually via:
dense_7.build(batch_input_shape).
from keras import applications
pretrained_model = applications.inception_v3.InceptionV3(weights = "imagenet", include_top=False, input_shape = (299, 299, 3))
from keras.layers import Dense
for layer in pretrained_model.layers:
layer.trainable = False
pretrained_model.layers.pop()
layer = (Dense(2, activation='sigmoid'))
pretrained_model.layers.append(layer)
Looking at summary again gives above exception.
pretrained_model.summary()
Wanted to train compile and fit model, but
pretrained_model.compile(optimizer=RMSprop(lr=0.0001),
loss = 'sparse_categorical_crossentropy', metrics = ['acc'])
Above line gives this error,
Could not interpret optimizer identifier:
You are using pop to pop the fully connected layer like Dense at the end of the network. But this is already accomplished by the argument include top = False. So you just need to initialize Inception with include_top = False, add the final Dense layer. In addition, since it's InceptionV3, I suggest you to add GlobalAveragePooling2D() after output of InceptionV3 to reduce overfitting. Here is a code,
from keras import applications
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
pretrained_model = applications.inception_v3.InceptionV3(weights = "imagenet", include_top=False, input_shape = (299, 299, 3))
x = pretrained_model.output
x = GlobalAveragePooling2D()(x) #Highly reccomended
layer = Dense(2, activation='sigmoid')(x)
model = Model(input=pretrained_model.input, output=layer)
for layer in pretrained_model.layers:
layer.trainable = False
model.summary()
This should give you the desired model to fine tune.

how to reuse last layers' bias in next layers in Keras with tensorflow Backend

I'm new to Keras
my neural network structure is here:
neural network structure
my idea is :
import keras.backend as KBack
import tensorflow as tf
#...some code here
model = Sequential()
hidden_units = 4
layer1 = Dense(
hidden_units,
input_dim=len(InputIndex),
activation='sigmoid'
)
model.add(layer1)
# layer1_bias = layer1.get_weights()[1][0]
layer2 = Dense(
1, activation='sigmoid',
use_bias=False
)
model.add(layer2)
# KBack.bias_add(model.output, layer1_bias[0])
I know this is not working cause layer1_bias[0] is not tensor, but I have no idea how to fix it. Or somebody has other solution.
Thanks.
You get the error because bias_add expects a Tensor and you are passing it a float (the actual value of the bias). Also, be aware that your hidden layer actually has 3 biases (one for each node). If you want to add the bias of the first node to your output layer, this should work:
import keras.backend as K
from keras.layers import Dense, Activation
from keras.models import Sequential
model = Sequential()
layer1 = Dense(3, input_dim=2, activation='sigmoid')
layer2 = Dense(1, activation=None, use_bias=False)
activation = Activation('sigmoid')
model.add(layer1)
model.add(layer2)
K.bias_add(model.output, layer1.bias[0:1]) # slice like this to not lose a dimension
model.add(activation)
print(model.summary())
Note that, to be 'correct' (according to the definition of what a dense layer does), you should add the bias first, then the activation.
Also, your code is not really in line with the picture of your network. In the picture, one single shared bias is added to each of the nodes in the network. You can do this with the functional API. The idea is to disable the use of biases in the hidden layer and the output layers, and to manually add a bias variable that you define yourself and that will be shared by the layers. I'm using tensorflow for tf.add() since that supports broadcasting:
from keras.layers import Dense, Lambda, Input, Add
from keras.models import Model
import keras.backend as K
import tensorflow as tf
# Define the shared bias as a custom keras variable
shared_bias = K.variable(value=[0], name='shared_bias')
input_layer = Input(shape=(2,))
# Disable biases in the hidden layer
dense_1 = Dense(units=3, use_bias=False, activation=None)(input_layer)
# Manually add the shared bias
dense_1 = Lambda(lambda x: tf.add(x, shared_bias))(dense_1)
# Disable bias in output layer
output_layer = Dense(units=1, use_bias=False)(dense_1)
# Manually add the bias variable
output_layer = Lambda(lambda x: tf.add(x, shared_bias))(output_layer)
model = Model(inputs=input_layer, outputs=output_layer)
print(model.summary())
This assumes that your shared bias is not trainable though.

Resources