Accuracy never changes when training an ANN in Keras - python-3.x

I am trying to build a simple ANN that learns whether two images are similar, using two distance measurements. Here is how I set things up. For each triplet of images (1: an anchor, 2: a positive sample, 3: a negative sample) I computed two different distances between image pairs: one using ResNet features and another using HOG features. The two distances are saved along with the two image paths and the correct label (0 = same, 1 = not the same).
Now I am trying to build an ANN that learns from these two values whether two images are similar. But nothing happens when I train the ANN. I think there are two possibilities:
1: I didn't set up the ANN correctly.
2: There is no relationship in the data at all.
Please help me see what the issue is.
Here is my code:
# Load the Pandas libraries with alias 'pd'
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# fix random seed for reproducibility
np.random.seed(7)
import csv
data = pd.read_csv("encoding.csv")
print(data.columns)
X = data[['resnet', 'hog','label']]
x = X[['resnet', 'hog']]
y = X[['label']]
model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
#add model layers
model.add(Dense(16, activation='relu', input_shape=(n_cols,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation= 'softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
epochs=30,
batch_size=32,
validation_split=0.10)
Right now all it does is this over and over again:
167/167 [==============================] - 0s 3ms/step - loss: 8.0189 - acc: 0.4970 - val_loss: 7.5517 - val_acc: 0.5263
Here is the csv file that I am using:
EDIT
So I have changed the setup a bit, and now it does bounce up to 73% val accuracy. But then it bounces around and ends at 40%. What does that mean?
Here is the new model:
from keras.layers import BatchNormalization, Dropout  # layers used below
model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(BatchNormalization())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(1, activation= 'sigmoid'))
#sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
#model.compile(loss = "binary_crossentropy", optimizer = sgd, metrics=['accuracy'])
model.compile(loss = "binary_crossentropy", optimizer = 'rmsprop', metrics=['accuracy'])
#model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
epochs=100,
batch_size=64,
validation_split=0.10)

This makes no sense:
model.add(Dense(1, activation= 'softmax'))
Softmax with one neuron will produce a constant value of 1.0 due to the normalization. For binary classification with the binary_crossentropy loss, you should use one neuron with sigmoid activation.
model.add(Dense(1, activation= 'sigmoid'))
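To see why a single softmax unit cannot learn anything, here is a minimal plain-Python sketch (not Keras code; the logit values are made up for illustration). Softmax divides each exponentiated logit by the sum over all units, so with one unit the result is always 1.0, whereas sigmoid still depends on the input.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function, the usual output activation for binary labels."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical pre-activation values of the single output neuron.
logits = [-3.0, 0.0, 2.5]

# Softmax over a one-element vector normalizes the value by itself.
softmax_outputs = [softmax([z])[0] for z in logits]
sigmoid_outputs = [sigmoid(z) for z in logits]

print(softmax_outputs)  # [1.0, 1.0, 1.0]
print(sigmoid_outputs)  # values that vary with the logit
```

Because the softmax output never changes, the gradient with respect to the weights carries no useful signal, which would be consistent with the loss and accuracy staying flat in the question's log.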

Two things to try:
First, add complexity to your network. It is pretty simple right now; add more layers/neurons in order to capture more information from your data.
Start with something like this and see if it changes anything:
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation= 'sigmoid'))
Second, consider training for more epochs; an ANN can take a long time to converge.
Update
More things to try:
Normalize and scale your data.
The dataset may be too small; the more data you have, the better your model is likely to be.
Try different hyperparameters: decrease the learning rate to something like 1e-4 or 1e-5, try different batch sizes, and so on.
Add more regularization: try dropout between each layer.
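As an illustration of the "normalize and scale" suggestion, here is a minimal sketch of standardization (zero mean, unit variance per column) in plain Python. The column values are hypothetical; in practice a tool such as scikit-learn's StandardScaler does the same job, and the statistics should be computed on the training split only.

```python
import statistics

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]

# Hypothetical distance values on very different scales.
resnet = [0.2, 0.4, 0.6, 0.8]
hog = [10.0, 30.0, 50.0, 70.0]

resnet_scaled = standardize(resnet)
hog_scaled = standardize(hog)

print(resnet_scaled)
print(hog_scaled)
```

After scaling, both features live on a comparable range, so neither dominates the first layer's weighted sum simply because of its units.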

Related

what does [1] mean in model.evaluate(X, Y)[1]

The following code is from a textbook called 'Deeplearning for everybody'; it predicts diabetes based on data from Pima Indians. I wonder what the [1] at the end of the code means.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
import tensorflow as tf
np.random.seed(3)
tf.random.set_seed(3)
dataset = np.loadtxt(r'.\dataset\pima-indians-diabetes.csv', delimiter=',')  # raw string keeps the backslashes literal
X = dataset[:, 0:8]
Y = dataset[:, 8]
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X, Y, epochs=200, batch_size=10)
print('\n Accuracy: %.4f' % (model.evaluate(X, Y)[1])) # <---------
In Keras, model.evaluate() returns a list of aggregated values: the loss first, followed by the metrics. Say you want to measure the loss, the accuracy, and the F1 score on your test data; you would then compile your model something like this: model.compile(optimizer, loss, metrics=['accuracy', custom_f1_function], ..). These are calculated for every sample (or batch) in the dataset and then reduced, usually by taking the average. In the end you get a list with three elements: aggregated loss, aggregated accuracy, aggregated F1 score. In your code you are accessing the second element of this list, namely the accuracy.
(The order in metrics=[..] determines the order in the output list!)
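The shape of that return value can be sketched without Keras. The function below is a plain-Python stand-in, not the library's implementation: it averages binary cross-entropy and accuracy over some made-up predictions and returns them in the same [loss, accuracy] order, so indexing with [1] picks out the accuracy.

```python
import math

def evaluate(y_true, y_prob):
    """Stand-in for model.evaluate: returns [mean loss, mean accuracy]."""
    eps = 1e-7  # clip probabilities so log() stays finite
    losses, hits = [], []
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
        hits.append(1.0 if (p > 0.5) == (y == 1) else 0.0)
    n = len(y_true)
    return [sum(losses) / n, sum(hits) / n]

# Hypothetical labels and predicted probabilities.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]

results = evaluate(y_true, y_prob)
loss, accuracy = results

print('loss:', results[0])
print('accuracy:', results[1])  # 0.75
```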

Keras NN loss is 1

I am getting started with a simple NN, but my loss remains 1 at each iteration. Can somebody point out what I'm doing wrong here?
This is from a Kaggle introductory course, and my modified training set contains shop id, category id, item id, month and revenue. I'm basically trying to predict revenue per shop per category for the following month.
I've scaled revenue and trained a simple NN with 2 hidden layers; however, the training does not seem to be working, as the loss remains constant. I haven't done anything with the categorical features (i.e. shop ids, category ids), but I would still expect the loss to change on each iteration.
If you have some comments on coding practice, I would be interested as well.
Thanks.
X_train = grouped_train.drop('revenue', axis=1)
y_train = grouped_train['revenue']
print('X & y trains')
print(X_train.head())
print(y_train.head())
scaler = StandardScaler()
y_train = pd.DataFrame(scaler.fit_transform(y_train.values.reshape(-1,1)))
print('Scaled y train')
print(y_train.head())
keras.backend.clear_session()
model = Sequential()
model.add(Dense(30, activation='relu', input_shape=(4,)))
model.add(Dense(30, activation='relu'))
model.add(Dense(1, activation='relu'))
model.summary()
print('Compile & fit')
model.compile(loss='mean_squared_error', optimizer='RMSprop')
model.fit(X_train, scaled_data, batch_size=128, epochs=13)
predictions = pd.DataFrame(model.predict(test))
print('Scaled predictions')
print(predictions.head())
print('Unscaled predictions')
print(pd.DataFrame(scaler.inverse_transform(predictions)).head())
It looks like you are using the wrong activation for the final layer. This is a regression problem, so the standard final activation is activation='linear'. Replace
model.add(Dense(1, activation='relu'))
with
model.add(Dense(1, activation='linear'))
Edit:
Additionally, model.fit is given 'scaled_data', which is never defined in the code shown. Shouldn't scaled_data be replaced with y_train?
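One possible explanation for a loss stuck at exactly 1, offered as an assumption rather than a diagnosis of the asker's run: a standardized target has mean 0 and variance 1, and a ReLU output can never be negative. If the output unit ends up emitting 0 for every sample, the mean squared error equals the target's variance, which is 1. The sketch below uses made-up revenue values to show this.

```python
import math
import statistics

# Hypothetical revenue values, standardized the way StandardScaler does
# (subtract the mean, divide by the population standard deviation).
revenue = [120.0, 80.0, 200.0, 40.0, 160.0]
mean = statistics.fmean(revenue)
std = statistics.pstdev(revenue)
scaled = [(r - mean) / std for r in revenue]

def relu(z):
    return max(0.0, z)

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Targets below the mean become negative and are unreachable for ReLU.
unreachable = sum(1 for s in scaled if s < 0)

# A "dead" ReLU output: negative pre-activation, so the prediction is 0.
dead_predictions = [relu(-1.0) for _ in scaled]
mse_dead = mse(scaled, dead_predictions)

print(unreachable)  # 2
print(mse_dead)     # approximately 1.0
```

A linear output has no such floor, so it can fit negative targets and move the loss below 1.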

Find Most Important Input from a Neural Network

I trained a neural network with 37 inputs. It has around 85% accuracy. Is it possible to find out which input has the most effect? I tried this code, but I cannot figure out how to find the most important input:
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
One possible solution is to wrap your model with keras.wrappers.scikit_learn and then use Recursive Feature elimination in scikit-learn:
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(512, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=128, verbose=0)
rfe = RFE(estimator=model, n_features_to_select=1, step=1)
rfe.fit(X, y)
ranking = rfe.ranking_.reshape(digits.images[0].shape)
# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
If you need to visualize weights see here.
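Note that RFE ranks features through the estimator's coef_ or feature_importances_ attribute (or a custom importance_getter), which a wrapped Keras model may not expose. A model-agnostic alternative, and a different technique from RFE, is permutation importance: shuffle one input column at a time and measure how much the accuracy drops. The sketch below is plain Python; the predict function is a hypothetical stand-in for a trained model and the data is randomly generated.

```python
import random

def accuracy(predict, rows, labels):
    correct = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return correct / len(rows)

def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        importances.append(baseline - accuracy(predict, shuffled, labels))
    return importances

# Hypothetical stand-in for a trained model: only feature 0 matters.
def predict(row):
    return 1 if row[0] > 0.5 else 0

data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [predict(row) for row in rows]

importances = permutation_importance(predict, rows, labels, n_features=2)
most_important = max(range(2), key=lambda j: importances[j])

print(importances)
print(most_important)  # 0
```

With a real Keras model, predict would wrap model.predict and threshold or argmax its output; the larger the accuracy drop for a column, the more the model relies on that input.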

Regularization strategy in Keras

I am trying to set up a non-linear regression problem in Keras. Unfortunately, the results show that overfitting is occurring. Here is the code:
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation = 'relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu',kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu',kernel_regularizer=regularizers.l2(0)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2 , initial_epoch=0)
The results without regularization are shown here (figure: Without regularization). The training mean absolute error is much lower than the validation error, and the two keep a fixed gap, which is a sign of overfitting.
L2 regularization was then specified for each layer, like so:
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation = 'relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu',kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu',kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2 , initial_epoch=0)
The results are shown here (figure: L2 regularized result). The test MAE is close to the training MAE, which is good. However, the training MAE is poor at 0.03 (without regularization it was much lower, at 0.0028).
What can I do to reduce the training MAE with regularization?
Based on your results, it looks like you need to find the right amount of regularization to balance training accuracy with good generalization to the test set. This may be as simple as reducing the L2 parameter. Try reducing lambda from 0.001 to 0.0001 and comparing your results.
If you can't find a good parameter setting for L2, you could try dropout regularization instead. Just add model.add(Dropout(0.2)) between each pair of dense layers, and experiment with the dropout rate if necessary. A higher dropout rate corresponds to more regularization.
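To make the effect of the L2 parameter concrete, here is a small plain-Python sketch of the penalty term that gets added to the data loss, assuming the usual form lambda times the sum of squared weights. The weight values are made up. Reducing lambda from 0.001 to 0.0001 shrinks the penalty tenfold for the same weights, leaving the optimizer freer to fit the training data.

```python
def l2_penalty(weights, lam):
    """L2 regularization term: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

# Hypothetical weights of one layer.
weights = [0.5, -1.0, 2.0]

penalty_strong = l2_penalty(weights, 0.001)
penalty_weak = l2_penalty(weights, 0.0001)

print(penalty_strong)
print(penalty_weak)
print(penalty_strong / penalty_weak)  # roughly 10
```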

From SKLearn to Keras - What is the difference?

I'm trying to go from SKLearn to Keras in order to make specific improvements to my models.
However, I can't get the same performance I had with my SKLearn model:
mlp = MLPClassifier(
solver='adam', activation='relu',
beta_1=0.9, beta_2=0.999, learning_rate='constant',
alpha=0, hidden_layer_sizes=(238,),
max_iter=300
)
dev_score(mlp)
This gives a score of ~0.65 every time.
Here is my corresponding Keras code:
def build_model(alpha):
    level_moreargs = {'kernel_regularizer':l2(alpha), 'kernel_initializer': 'glorot_uniform'}
    model = Sequential()
    model.add(Dense(units=238, input_dim=X.shape[1], **level_moreargs))
    model.add(Activation('relu'))
    model.add(Dense(units=class_names.shape[0], **level_moreargs)) # output
    model.add(Activation('softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy, # like sklearn
                  optimizer=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0),
                  metrics=['accuracy'])
    return model
k_dnn = KerasClassifier(build_fn=build_model, epochs=300, batch_size=200, validation_data=None, shuffle=True, alpha=0.5, verbose=0)
dev_score(k_dnn)
From looking at the documentation (and digging into the SKLearn code), this should correspond to exactly the same thing.
However, I get ~0.5 accuracy when I run this model, which is very bad.
And if I set alpha to 0, SKLearn's score barely changes (0.63), while Keras's varies randomly between 0.2 and 0.4.
What is the difference between these models? Why is Keras, which is supposed to be better than SKLearn, outperformed by so much here? What is my mistake?
Thanks,
