GridSearch for MultiClass KerasClassifier - keras

I am trying to do a grid search for multiclass classification with Keras. Here is a section of the code.
Some properties of the data are below:
y_
array(['fast', 'immobile', 'immobile', ..., 'slow',
       'immobile', 'slow'],
      dtype='<U17')

y_onehot = pd.get_dummies(y_).values
y_onehot
array([[1, 0, 0],
       [0, 0, 1],
       [0, 0, 1],
       ...,
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0]], dtype=uint8)
#Do train-test split
y_train.shape
(1904,)
y_train_onehot.shape
(1904, 3)
And the model...
# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # create model
    model = Sequential()
    model.add(Dense(2048, input_dim=X_train.shape[1], kernel_initializer=init, activation='relu'))
    model.add(Dense(512, kernel_initializer=init, activation='relu'))
    model.add(Dense(y_train_onehot.shape[1], kernel_initializer=init, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, verbose=0)

# grid search epochs, batch size and optimizer
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)

grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='accuracy')
grid_result = grid.fit(X_train, y_train_onehot)
And here is the error:
--> grid_result = grid.fit(X_train, y_train_onehot)
ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets
The code was for a binary model but I am hoping to modify it for a multiclass data set. Kindly assist. Thanks!

The error is in the softmax layer.
I think you mean y_train_onehot.shape[1] instead of y_train_onehot[1]
Update 1: This is strange, but your second problem seems to be y_train_onehot. Would you mind trying two things:
try the same model without the one-hot encoding on y_train;
if that alone doesn't work, change the loss to sparse_categorical_crossentropy.
Also, if you drop the one-hot encoding, make sure to replace y_train_onehot.shape[1] in the softmax layer with the number of classes.
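For reference, here is a minimal sketch of that second suggestion (integer labels plus sparse_categorical_crossentropy); the names y_train_int and n_classes are mine, not from the original code. The point is that GridSearchCV's 'accuracy' scorer expects 1-D integer class labels, while one-hot targets look like multilabel indicators to scikit-learn, which is what triggers the error.

# Sketch only; Sequential, Dense, KerasClassifier and GridSearchCV are imported as in the question above.
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
y_train_int = encoder.fit_transform(y_train)   # e.g. 'fast'/'immobile'/'slow' -> 0/1/2
n_classes = len(encoder.classes_)

def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(2048, input_dim=X_train.shape[1], kernel_initializer=init, activation='relu'))
    model.add(Dense(512, kernel_initializer=init, activation='relu'))
    model.add(Dense(n_classes, kernel_initializer=init, activation='softmax'))
    # sparse_categorical_crossentropy accepts integer class labels directly
    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='accuracy')
grid_result = grid.fit(X_train, y_train_int)   # integer labels, not one-hot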

Related

Keras MaxPooling3D Layer: Negative dimension size

I was just trying to design a 3D CNN for image classification.
Here the input shape is (?, 50, 50, 3, 1), RGB pixel data. I tried adding data_format, but it didn't help me out.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv3D(64, (3,3,3), input_shape = x_train.shape[1:], activation = tf.nn.relu))
model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2)))
model.add(tf.keras.layers.Conv3D(64, (3,3,3), activation = tf.nn.relu))
model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(64, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(1, activation = tf.nn.softmax))
model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
model.fit(x_train, y_train, epochs = 10)
Getting This Error:
InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for '{{node max_pooling3d/MaxPool3D}} = MaxPool3D[T=DT_FLOAT, data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1]](conv3d_1/Relu)' with input shapes: [?,48,48,1,64].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-10-04d154198bb1> in <module>
2
3 model.add(tf.keras.layers.Conv3D(64, (3,3,3), input_shape = x_train.shape[1:], activation = tf.nn.relu))
----> 4 model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2)))
5
6 model.add(tf.keras.layers.Conv3D(64, (3,3,3), activation = tf.nn.relu))
ValueError: Negative dimension size caused by subtracting 2 from 1 for '{{node max_pooling3d/MaxPool3D}} = MaxPool3D[T=DT_FLOAT, data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1]](conv3d_1/Relu)' with input shapes: [?,48,48,1,64].
As pooling subsamples your input, there is a point where the output of the previous layer is too small to be pooled again. So, depending on your input shape, remove the second
model.add(tf.keras.layers.Conv3D(64, (3,3,3), activation = tf.nn.relu))
model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2)))
block and try it again.
If you want to get more information about pooling, I recommend this introduction:
https://machinelearningmastery.com/pooling-layers-for-convolutional-neural-networks/#:~:text=Maximum%20pooling%2C%20or%20max%20pooling,the%20case%20of%20average%20pooling.
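To see why the second block fails here, you can check the 'valid'-padding output sizes by hand; this is a generic back-of-the-envelope calculation, not code from the question:

# Output length of a 'valid' convolution or pooling along one axis:
#   out = floor((in - window) / stride) + 1
def out_size(size, window, stride):
    return (size - window) // stride + 1

# Input spatial shape (50, 50, 3):
print(out_size(50, 3, 1))   # 48 -> after Conv3D with a (3,3,3) kernel
print(out_size(3, 3, 1))    # 1  -> the third spatial axis collapses to 1
# MaxPooling3D with pool_size=(2,2,2) then needs a window of 2 on an axis of
# size 1, which is exactly the "subtracting 2 from 1" negative-dimension error.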
I used this to change the shape of the input,
x_train = np.array(x_train).reshape(-1, 3, 50, 50, 1)
and after this I added a new parameter to the MaxPooling3D layer,
data_format = 'channels_first'
Before this I was getting the error in the 1st pooling layer; after the change, the same error came up, but in the 2nd pooling layer.
As @Yannick Funk suggested, I removed the 2nd block and it worked fine.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv3D(64, (3,3,3), input_shape = x_train.shape[1:], activation = tf.nn.relu))
model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2), data_format = 'channels_first'))
#model.add(tf.keras.layers.Conv3D(64, (3,3,3), activation = tf.nn.relu))
#model.add(tf.keras.layers.MaxPooling3D(pool_size=(2,2,2), data_format = 'channels_first'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(64, activation = tf.nn.relu))
model.add(tf.keras.layers.Dense(1, activation = tf.nn.softmax))
model.compile(optimizer = 'adam',
              loss = 'binary_crossentropy',
              metrics = ['accuracy'])
model.fit(x_train, y_train, epochs = 10)

Why is my model giving the same result even with >93% accuracy? result >> array([[1., 0., 0.]], dtype=float32)

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
training_set = train_datagen.flow_from_directory('images',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
# Found 27659 images belonging to 3 classes.

classifier = Sequential()
# Step 1 - Convolution
classifier.add(Convolution2D(32, kernel_size=(3,3), strides=(1,1), input_shape = (64, 64, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Convolution2D(64, (3,3), 2, activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Flatten())
classifier.add(Dense(128, activation = 'relu'))
classifier.add(Dense(3, activation = 'softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit_generator(training_set,
                         steps_per_epoch = 50,
                         epochs = 30)

test = image.load_img("./image.png", target_size=(64,64))
test = image.img_to_array(test)
import numpy as np
test = np.expand_dims(test, axis=0)
result = classifier.predict(test)
result
# result is always same as below
# array([[1., 0., 0.]], dtype=float32)
Why am I getting the same answer all the time? I've increased the epochs, but the result is still the same. Why is this happening? For 2 classes I got it working, but for 3 or more classes it does not work. Alternatively, could you show me other code that predicts 3 or more classes?
Another question is how to set the labels based on my directories, for example
images---
-----cat folder
-----dog folder
-----fish folder
but in the labeling it will be like [0,0,...,222]; how do I know whether 0 is cat or dog?
I just tried your code with two classes (cats and dogs); I modified it to work in binary mode, in particular the last Dense layer (softmax to sigmoid) and the loss function, to make it work.
However, I could see the following possible improvements:
1. Increase the resolution of the images.
2. Increase the network size (width and depth).
3. Add validation data to check the model's performance.
It is always better to rely on validation accuracy rather than training accuracy. If validation accuracy is good, your model will likely do well on the test data set.
I do not really think you need to worry about labeling in this case: as long as your data sits in the respective class folders, ImageDataGenerator parses that folder structure and generates the labels automatically.
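To answer the labeling part of the question: flow_from_directory builds the labels from the folder names and exposes the mapping in class_indices, so you can check it directly. A short sketch (the folder names are taken from the question; label_of is my own helper name):

# class_indices maps each folder name to the integer label assigned by
# flow_from_directory (the exact values depend on your folders).
print(training_set.class_indices)
# e.g. {'cat': 0, 'dog': 1, 'fish': 2}

# Inverting the mapping lets you decode a prediction back to a folder name.
import numpy as np
label_of = {v: k for k, v in training_set.class_indices.items()}
pred = classifier.predict(test)              # e.g. array([[1., 0., 0.]])
print(label_of[int(np.argmax(pred[0]))])     # name of the most likely class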

Hyperparameter tuning with GridSearchCV for multi-output data (neural network) using Keras

I have data with 2 output variables and multiple input variables. I used a neural network to build the model, and now I want to do hyperparameter tuning with grid search.
I wrote the code for the grid search, but grid.fit(X_train, Y_train) raises ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[ 1.62970054, 2.33817343],
I then made 2 separate arrays and passed them as grid.fit(X_train, [Y[0], Y[1]]), since there are 2 outputs, but now it raises ValueError: Found input variables with inconsistent numbers of samples: [10000, 2]. Is there any way to correct this code, or does GridSearchCV not accept multiple output values? I built the Keras model with the Keras functional API and wrapped it in KerasRegressor so that I could use GridSearchCV. I am stuck on what the code should be so that grid.fit() will run, or whether there is another method for the hyperparameter tuning.
from keras.models import Model
from keras.layers import Input, Dense, Dropout
from keras.constraints import maxnorm
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV

def create_model(activation='relu', dropout_rate=0.0, neurons=10, optimizer='Adam', weight_constraint=0):
    main_input = Input(shape=(17,), name='main_input')
    hidden = Dense(neurons, activation=activation, name='hidden', kernel_constraint=maxnorm(weight_constraint))(main_input)
    hidden = Dropout(dropout_rate)(hidden)
    out1 = Dense(1, activation='linear', name='out1')(hidden)
    out2 = Dense(1, activation='linear', name='out2')(hidden)
    model = Model(inputs=main_input, outputs=[out1, out2])
    model.compile(optimizer=optimizer, loss={'out1': 'mean_squared_error', 'out2': 'mean_squared_error'})
    return model

model = KerasRegressor(build_fn=create_model, batch_size=32, epochs=100)

activation = ['relu', 'tanh', 'sigmoid', 'linear']
dropout_rate = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
weight_constraint = [1, 2, 3, 4, 5]
neurons = [10, 30, 50, 70, 90]
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam']
epochs = [50, 100, 150, 200]
batch_size = [10, 20, 30, 40]
param_grid = dict(neurons=neurons, activation=activation, dropout_rate=dropout_rate, weight_constraint=weight_constraint, epochs=epochs, batch_size=batch_size, optimizer=optimizer)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)
With the last line as grid_result = grid.fit(X_train, Y_train), the error is ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[ 1.62970054, 2.33817343],
[ 3.44504814, 6.05534019],
[ 1.58155862, 0.8296778 ],
...,
[ 1.27446578, 6.71978433],
[ 7.99909866, 17.82736535],
[ 1.4...
The error when I make 2 separate arrays for the 2 outputs and rewrite the last line as grid.fit(X_train, [Y[0], Y[1]]) is ValueError: Found input variables with inconsistent numbers of samples: [10000, 2]
My data has 10000 observations with 17 input variables and 2 output variables.
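The error itself points at the mismatch: GridSearchCV passes y as a single array, while this model expects a list of two target arrays. One common workaround, sketched below under the assumption that both outputs share the same loss, is to merge the two heads into a single Dense(2) output so that a (10000, 2) target array can be passed directly; the helper name create_model_single_head is mine.

def create_model_single_head(activation='relu', dropout_rate=0.0, neurons=10,
                             optimizer='Adam', weight_constraint=1):
    main_input = Input(shape=(17,), name='main_input')
    hidden = Dense(neurons, activation=activation, name='hidden',
                   kernel_constraint=maxnorm(weight_constraint))(main_input)
    hidden = Dropout(dropout_rate)(hidden)
    # One 2-unit head instead of two 1-unit heads: mean_squared_error is then
    # averaged over both outputs, and y can be a single (n_samples, 2) array.
    out = Dense(2, activation='linear', name='out')(hidden)
    model = Model(inputs=main_input, outputs=out)
    model.compile(optimizer=optimizer, loss='mean_squared_error')
    return model

model = KerasRegressor(build_fn=create_model_single_head, batch_size=32, epochs=100)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)   # Y_train shaped (10000, 2)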

Multiclass predictions with a categorical_crossentropy loss

Let's say this example implements a simple binary classification.
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[0],[1],[0]])
...
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X, y, epochs=50, verbose=0)
# new instance where we do not know the answer
Xnew = array([[4, 5, 6]])
# make a prediction
ynew = model.predict(Xnew)
#show the inputs and predicted outputs
print("X=%s, Predicted=%s" % (Xnew[0], ynew[0]))
...
results
X=[4, 5, 6], Predicted=[0 or 1]
And this one implements multiclass classification.
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[4],[5],[6]])
...
model.compile(loss='categorical_crossentropy', optimizer='adam')
# fit model
model.fit(X, y, epochs=50, verbose=2)
model.reset_states()
# evaluate model on new data
yhat = model.predict((X))
...
results decoded
X=[4, 5, 6], Predicted=[4, 5, 6]
How to implement multiclass classification with single output to get something like this? (similar to forecasting time series)
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[4],[5],[6]])
# new instance where we do not know the answer
Xnew = array([[4, 5, 6]])
yhat = model.predict_classes(Xnew)
results decoded
X=[4, 5, 6], Predicted=[7]
What you are looking for is loss='sparse_categorical_crossentropy', which assumes the integer targets are class labels. So if your model has 7 outputs and you give target 2, sparse_categorical_crossentropy converts 2 into [0,0,1,0,0,0,0] as the target and applies categorical_crossentropy as usual.
In this case, your output layer's activation function should be softmax and the number of outputs should equal the number of classes. Most likely something like Dense(num_classes, activation='softmax').
If your integer classes are just [4, 5, 6], then you need to shift them to [0, 1, 2] to satisfy the condition max(Y_targets) < num_classes.
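Putting that together, here is a minimal sketch of the recipe above, not the asker's actual model: shift the integer targets [4, 5, 6] down to [0, 1, 2], train with sparse_categorical_crossentropy, then shift the predictions back.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
y_raw = np.array([4, 5, 6])
offset = y_raw.min()                # 4
y = y_raw - offset                  # [0, 1, 2]
num_classes = y.max() + 1           # 3

model = Sequential()
model.add(Dense(16, input_dim=3, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))  # one output unit per class
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(X, y, epochs=50, verbose=0)

# new instance where we do not know the answer
Xnew = np.array([[4, 5, 6]])
pred_class = np.argmax(model.predict(Xnew), axis=-1) + offset  # decode back to the original labels
print(pred_class)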

Grid search and XGBClassifier using class weights

I am trying to use scikit-learn GridSearchCV together with XGBoost XGBClassifier wrapper for my unbalanced multi-class classification problem. So far I have used a list of class weights as an input for the scale_pos_weight argument, but this does not seem to work as all my predictions are for the majority class. This is probably because in the documentation of the XGBClassifier it is mentioned that scale_pos_weight can only be used for binary classification problems.
So my question is, how can I input sample/class weights for a multi-class classification task using scikit-learn GridSearchCV?
My code is below:
import numpy as np
import xgboost as xgb
from sklearn.model_selection import GridSearchCV
from sklearn.utils.class_weight import compute_class_weight

class_weights = compute_class_weight('balanced', np.unique(training_targets),
                                     training_targets[target_label[0]])
random_state = np.random.randint(0, 1000)

parameters = {
    'max_depth': [3, 4, 5],
    'learning_rate': [0.1, 0.2, 0.3],
    'n_estimators': [50, 100, 150],
    'gamma': [0, 0.1, 0.2],
    'min_child_weight': [0, 0.5, 1],
    'max_delta_step': [0],
    'subsample': [0.7, 0.8, 0.9, 1],
    'colsample_bytree': [0.6, 0.8, 1],
    'colsample_bylevel': [1],
    'reg_alpha': [0, 1e-2, 1, 1e1],
    'reg_lambda': [0, 1e-2, 1, 1e1],
    'base_score': [0.5]
}

xgb_model = xgb.XGBClassifier(scale_pos_weight = class_weights, silent = True,
                              random_state = random_state)
clf = GridSearchCV(xgb_model, parameters, scoring = 'f1_micro', n_jobs = -1, cv = 5)
clf.fit(training_features, training_targets.values[:, 0])
model = clf.best_estimator_
The scale_pos_weight parameter is only for binary classification, so it won't work for multi-class classification tasks.
For your case it is more advisable to use the weight parameter, as described here (https://xgboost.readthedocs.io/en/latest/python/python_api.html). The argument is an array in which each element represents the weight you assign to the corresponding data point.
The idea is essentially to manually assign different weights to different classes. There is no standard for how you assign the weights; it is up to you. The more weight a sample is assigned, the more it affects the objective function during training.
However, if you use the scikit-learn API format, you cannot specify the weight parameter or use the DMatrix format. Thankfully, xgboost has its own cross-validation function, whose details you can find here: https://xgboost.readthedocs.io/en/latest/python/python_api.html
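For illustration, here is a rough sketch of that native-API route with per-sample weights; the weight values, the assumed 3 classes, and the parameter choices are placeholders, not recommendations from the question.

import numpy as np
import xgboost as xgb

# one weight per training row, e.g. up-weighting an assumed minority class 0
per_sample_weights = np.where(training_targets.values[:, 0] == 0, 4.0, 1.0)

dtrain = xgb.DMatrix(training_features,
                     label=training_targets.values[:, 0],
                     weight=per_sample_weights)

params = {'objective': 'multi:softprob', 'num_class': 3,  # num_class assumed 3 here
          'max_depth': 4, 'eta': 0.1}
cv_results = xgb.cv(params, dtrain, num_boost_round=100, nfold=5,
                    metrics='mlogloss', seed=42)
print(cv_results.tail())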
I suggest that you use the compute_sample_weight() function and set weights for each sample by looking at your labels. This solves your problem in the most elegant way. See below for 3 classes (-1, 0, 1):
sample_weights = compute_sample_weight({-1: 4, 0: 1, 1: 4}, Train_Labels)
random_search = RandomizedSearchCV(model, param_distributions=params, n_iter=param_comb,
                                   return_train_score=True, scoring=score, cv=ps,
                                   n_jobs=-1, verbose=3, random_state=1001)
random_search.fit(Train, Train_Labels, sample_weight=sample_weights)
In a multi-class setup we need to pass a sample_weight parameter with a list of values (weights) matching the number of data points (for example, the number of rows in X_train) to the fit() of XGBClassifier. Check the docs.
When using XGBClassifier with scikit-learn GridSearchCV, you can pass sample_weight directly to the fit() of GridSearchCV.
Note: Tried in scikit-learn version 1.1.1. Not sure from which version onwards this is supported.
For example:
def get_weights(cls):
    class_weights = {
        # class-labels based on your dataset.
        0: 1,
        1: 4,
        2: 1,
    }
    return [class_weights[cl] for cl in cls]

grid = {
    "max_depth": [3, 4, 5, 6],
    "n_estimators": range(20, 70, 10),
    "learning_rate": np.arange(0.25, 0.50, 0.05),
}

xgb_clf = XGBClassifier(random_state=42, n_jobs=-1)
xgb_cvm = GridSearchCV(estimator=xgb_clf, param_grid=grid, n_jobs=-1, cv=5)
xgb_cvm.fit(X, y, sample_weight=get_weights(y))
