From SKLearn to Keras - What is the difference?

I'm trying to go from SKLearn to Keras in order to make specific improvements to my models.
However, I can't get the same performance I had with my SKLearn model:
mlp = MLPClassifier(
    solver='adam', activation='relu',
    beta_1=0.9, beta_2=0.999, learning_rate='constant',
    alpha=0, hidden_layer_sizes=(238,),
    max_iter=300
)
dev_score(mlp)
This gives a score of ~0.65 every time.
Here is my corresponding Keras code:
def build_model(alpha):
    level_moreargs = {'kernel_regularizer': l2(alpha), 'kernel_initializer': 'glorot_uniform'}
    model = Sequential()
    model.add(Dense(units=238, input_dim=X.shape[1], **level_moreargs))
    model.add(Activation('relu'))
    model.add(Dense(units=class_names.shape[0], **level_moreargs))  # output layer
    model.add(Activation('softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,  # like sklearn
                  optimizer=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0),
                  metrics=['accuracy'])
    return model
k_dnn = KerasClassifier(build_fn=build_model, epochs=300, batch_size=200, validation_data=None, shuffle=True, alpha=0.5, verbose=0)
dev_score(k_dnn)
From looking at the documentation (and digging into the SKLearn code), these two should correspond to exactly the same thing.
However, I get ~0.5 accuracy when I run this model, which is very bad.
And if I set alpha to 0, SKLearn's score barely changes (0.63), while Keras's fluctuates randomly between 0.2 and 0.4.
What is the difference between these models? Why is Keras, which is supposed to be at least as capable as SKLearn, outperformed by such a wide margin here? What's my mistake?
Thanks,
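One concrete difference worth checking is how the L2 penalty is scaled: SKLearn's MLPClassifier adds 0.5 * alpha * ||W||^2 / n_samples to its loss, while keras.regularizers.l2(alpha) adds alpha * sum(W^2) to every batch's loss with no per-sample scaling. Below is a minimal sketch of rescaling the Keras factor so the two penalties are at least comparable (the exact scaling is an assumption; verify it against your SKLearn and Keras versions):
n_samples = X.shape[0]
sk_alpha = 0.5                                   # the alpha passed to MLPClassifier
keras_l2_factor = sk_alpha / (2.0 * n_samples)   # rough SKLearn-equivalent l2 factor (assumption)

k_dnn = KerasClassifier(build_fn=build_model, epochs=300, batch_size=200,
                        shuffle=True, alpha=keras_l2_factor, verbose=0)
dev_score(k_dnn)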

Related

What to expect from model.predict in Keras?

I am new to Keras and trying to write my first code. I want to understand what 'model.predict' should return. Consider a simple model below.
model = keras.Sequential()
model.add(keras.layers.Dense(12, input_dim=232, activation='relu'))
model.add(keras.layers.Dense(232, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(vSignal, vLabels, epochs=15, batch_size=100 )
# evaluate the keras model
_, accuracy = model.evaluate(vSignal, vLabels)
print('Accuracy: %.2f' % (accuracy*100))
pred=model.predict(vSignalT)
Suppose we train the model with vSignal and vLabels as shown above, and that the accuracy of the model as given by model.evaluate is 100%. Now, if we give the same data vSignal to model.predict, should we get vLabels back?
pred = model.predict(vSignalT) returns a NumPy array of predictions.
With a sigmoid output, each row is the predicted probability (a value between 0 and 1) for the corresponding sample rather than a hard label.
For more information refer to here.
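For example, the probabilities usually need to be thresholded to recover the 0/1 labels; a minimal sketch (the 0.5 cutoff is an assumption, not something from the original answer):
import numpy as np

pred = model.predict(vSignalT)                   # shape (n_samples, 1), values in [0, 1]
pred_labels = (pred > 0.5).astype(int).ravel()   # threshold the probabilities to 0/1 labels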
Save the return value of the fit function:
hist = model.fit(vSignal, vLabels, epochs=15, batch_size=100 );
and then check
hist.history["accuracy"]
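The history object can then be inspected or plotted per epoch, for example (a minimal sketch; on older Keras versions the key is 'acc' instead of 'accuracy'):
import matplotlib.pyplot as plt

plt.plot(hist.history['accuracy'], label='train accuracy')  # 'acc' on older Keras versions
plt.plot(hist.history['loss'], label='train loss')
plt.xlabel('epoch')
plt.legend()
plt.show()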

How to get 90%+ test accuracy on IMDB data?

I was trying to train a model on the IMDB data. I am getting the expected training accuracy of about 96%+, but I am not satisfied with the test accuracy. My expectation is to get 90%+ accuracy on the test data. I tried several classifiers, but each time I get 84% to 89% accuracy on the test data. Below are some of the classifiers I already tried. In most cases I did some parameter tuning by increasing the number of epochs or changing the optimizer. My concern is how I can increase the test accuracy to 90%+.
Classifiers I tried so far:
First:
model = Sequential()
model.add(Embedding(vocab_size, 32, input_length = max_words))
model.add(Bidirectional(LSTM(32, return_sequences = True)))
model.add(GlobalMaxPool1D())
model.add(Dense(20, activation="relu"))
model.add(Dropout(0.05))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train,y_train,validation_data=(x_test, y_test),epochs=10,batch_size=100)
Second:
model = Sequential([
    Embedding(vocab_size, 32, input_length=max_words),
    Dropout(0.2),
    ZeroPadding1D(padding=1),
    Convolution1D(64, 5, activation='relu'),
    Dropout(0.2),
    MaxPooling1D(),
    Flatten(),
    Dense(100, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train,y_train,validation_data=(x_test, y_test),epochs=10,batch_size=100)
Judging from the state-of-the-art results on the IMDB dataset, I don't think you can get to 90%+ with simple models like the ones you are using. However, you may try using a pretrained embedding such as GloVe instead of training your own embedding. Also, I found that this repo has a BERT implementation in Keras, with a demo of IMDB classification; it is able to get ~99% accuracy.
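A minimal sketch of plugging pretrained GloVe vectors into the first Embedding layer (the file name, the 100-dimensional vectors, and word_index from your Tokenizer are assumptions about your setup):
import numpy as np

embedding_dim = 100
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:   # hypothetical path to the GloVe file
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# word_index is assumed to come from the Tokenizer used to build x_train
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in word_index.items():
    if i < vocab_size and word in embeddings_index:
        embedding_matrix[i] = embeddings_index[word]

# replaces Embedding(vocab_size, 32, input_length=max_words) in the models above
model.add(Embedding(vocab_size, embedding_dim, input_length=max_words,
                    weights=[embedding_matrix], trainable=False))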

Find Most Important Input from a Neural Network

I trained a neural network with 37 inputs. It has around 85% accuracy. Is it possible to find out which input has the most effect? I tried this code, but I cannot figure out how to find the most important input:
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
One possible solution is to wrap your model with keras.wrappers.scikit_learn and then use recursive feature elimination (RFE) in scikit-learn:
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(512, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=128, verbose=0)
rfe = RFE(estimator=model, n_features_to_select=1, step=1)
rfe.fit(X, y)
ranking = rfe.ranking_.reshape(digits.images[0].shape)

# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
If you need to visualize the weights, see here.
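Alternatively, a much cheaper heuristic is to rank the 37 inputs directly from the first-layer weights the question already extracts. A minimal sketch (it only looks at the first layer and ignores interactions between features, so treat the ranking as a rough indication):
import numpy as np

weights = model.layers[0].get_weights()[0]   # shape (37, n_hidden)
importance = np.abs(weights).mean(axis=1)    # mean absolute outgoing weight per input
ranking = np.argsort(importance)[::-1]       # indices of the most influential inputs first
print(ranking[:5], importance[ranking[:5]])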

Am I doing this right on Keras?

I am doing a Keras implementation of sequence classification using 2 BLSTM and 2 FC layers. My data shape is (2655, 219, 835), where 219 is the number of steps and 835 is the number of features. Training uses 1800 instances and testing uses 855 instances. There are 4 classification groups (0, 1, 2, 3), and they are converted to vectors using the to_categorical function. The network implementation code is below:
model = Sequential()
model.add(Masking(mask_value=0., input_shape=(219,835)))
model.add(Bidirectional(LSTM(128,return_sequences=True)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(256,activation='relu'))
model.add(Dense(256,activation='relu'))
model.add(Dense(4, activation='softmax'))
adam=keras.optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
history=model.fit(train_data, label_train, epochs=30, validation_split=0.33, batch_size=128, verbose=1)
pred=model.predict(test_data)
Now, the problem is that I am getting a much lower accuracy than the accuracy reported in the paper. Could you please help me find out whether I am making any mistake here?
Thanks!
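One quick sanity check, independent of the paper, is whether the predictions collapse onto a single class. A minimal sketch (label_test is an assumed name for the one-hot encoded test labels):
import numpy as np

pred = model.predict(test_data)
pred_classes = np.argmax(pred, axis=1)           # back from softmax outputs to class ids
true_classes = np.argmax(label_test, axis=1)     # label_test assumed to be one-hot encoded

print(np.bincount(pred_classes, minlength=4))    # predicted class distribution
print((pred_classes == true_classes).mean())     # test accuracy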

Keras GridSearch scikit learn freezes

I am struggling to implement grid search in Keras using scikit-learn. Based on this tutorial, I wrote the following code:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
def create_model():
    model = Sequential()
    model.add(Dense(100, input_shape=(max_len, len(alphabet)), kernel_regularizer=regularizers.l2(0.001)))
    model.add(Dropout(0.85))
    model.add(LSTM(100, input_shape=(100,)))
    model.add(Dropout(0.85))
    model.add(Dense(num_output_classes, activation='softmax'))
    adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, decay=1e-6)
    model.compile(loss='categorical_crossentropy',
                  optimizer=adam,
                  metrics=['accuracy'])
    return model
seed = 7
np.random.seed(seed)
model = KerasClassifier(build_fn=create_model, epochs=10, verbose=0)
batch_size = [10,20]
param_grid = dict(batch_size=batch_size)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(train_data_reduced, train_labels_reduced)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
It does not give me any error message, but it just runs on and on forever without printing out anything. I deliberately tried it with very few epochs, very few training examples and very few hyperparameters to search. Without grid search, one epoch runs through extremely fast, so I don’t think that I just need to give it more time. It simply doesn’t do anything.
Can anyone point out what I am missing?
Thanks a lot!
I had the same problem.
Removing n_jobs=-1 from your parameter list might help! Also try not doing the one-hot encoding.
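A minimal sketch of the first suggestion, simply running the search in a single process (the assumption being that the forked workers are what hangs with Keras):
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(train_data_reduced, train_labels_reduced)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))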
