Keras NN loss stays at 1

I'm getting started with a simple NN, but my loss stays at one on every iteration. Can somebody point out what I'm doing wrong here?
This is from a Kaggle introductory course and my modified training set contains shop id, category id, item id, month and revenue. I'm basically trying to predict revenue per shop per category for the following month.
I've scaled the revenue and trained a simple NN with 2 hidden layers; however, training doesn't seem to be working, as the loss remains constant. I haven't done anything with the categorical columns (i.e. shop ids, category ids), but I would still expect the loss to change on each iteration.
If you have some comments on coding practice, I would be interested as well.
Thanks.
X_train = grouped_train.drop('revenue', axis=1)
y_train = grouped_train['revenue']
print('X & y trains')
print(X_train.head())
print(y_train.head())
scaler = StandardScaler()
y_train = pd.DataFrame(scaler.fit_transform(y_train.values.reshape(-1,1)))
print('Scaled y train')
print(y_train.head())
keras.backend.clear_session()
model = Sequential()
model.add(Dense(30, activation='relu', input_shape=(4,)))
model.add(Dense(30, activation='relu'))
model.add(Dense(1, activation='relu'))
model.summary()
print('Compile & fit')
model.compile(loss='mean_squared_error', optimizer='RMSprop')
model.fit(X_train, scaled_data, batch_size=128, epochs=13)
predictions = pd.DataFrame(model.predict(test))
print('Scaled predictions')
print(predictions.head())
print('Unscaled predictions')
print(pd.DataFrame(scaler.inverse_transform(predictions)).head())

It looks like you are using the wrong activation on the final layer. You have a regression problem, so the standard final-layer activation is 'linear' (relu zeroes out all negative outputs, which matters here because the scaled targets can be negative). Replace
model.add(Dense(1, activation='relu'))
with
model.add(Dense(1, activation='linear'))
Edit:
Additionally, model.fit is being passed scaled_data, which is never defined; it should be the scaled target, y_train.
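Putting both fixes together, a minimal sketch of the corrected model (reusing X_train and the scaled y_train from the question):
model = Sequential()
model.add(Dense(30, activation='relu', input_shape=(4,)))
model.add(Dense(30, activation='relu'))
model.add(Dense(1, activation='linear'))  # linear output for regression

model.compile(loss='mean_squared_error', optimizer='RMSprop')
model.fit(X_train, y_train, batch_size=128, epochs=13)  # fit on y_train, not the undefined scaled_data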

Related

What to expect from model.predict in Keras?

I am new to Keras and trying to write my first code. I want to understand what 'model.predict' should return. Consider a simple model below.
model = keras.Sequential()
model.add(keras.layers.Dense(12, input_dim=232, activation='relu'))
model.add(keras.layers.Dense(232, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(vSignal, vLabels, epochs=15, batch_size=100)
# evaluate the keras model
_, accuracy = model.evaluate(vSignal, vLabels)
print('Accuracy: %.2f' % (accuracy*100))
pred=model.predict(vSignalT)
Consider that we train the "model" with "vSignal" and "vLabels" as shown above. Now suppose the accuracy of the model as given by model.evaluate is 100%. If we then feed the same data 'vSignal' to 'model.predict', should we get 'vLabels' back?
pred = model.predict(vSignalT) returns a numpy array of predictions.
Since the final layer is a single sigmoid unit, each row is the predicted probability of the positive class, not the label itself; you get labels by thresholding (e.g. at 0.5).
For more information, see the Keras documentation for model.predict.
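As a quick sketch (assuming the same model and vSignalT from the question), converting the predicted probabilities into 0/1 labels is one line:
pred = model.predict(vSignalT)
labels = (pred > 0.5).astype(int)  # threshold probabilities at 0.5 to get class labels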
To inspect training accuracy over the epochs, save the return value of fit:
hist = model.fit(vSignal, vLabels, epochs=15, batch_size=100)
then check
hist.history["accuracy"]

Issue with accuracy never changing for an ANN in Keras

I am trying to build a simple ANN to learn whether two images are similar, using two distance measures. Here is how I set things up: I computed distances for triplets of images (1: an anchor, 2: a positive sample, 3: a negative sample), producing two different distance measurements, one using ResNet features and another using HOG features. The two distance measurements are then saved with the two picture paths as well as the correct label (0/1), where 0 = same and 1 = not the same.
Now I am trying to build out my ANN to learn the difference between the two values and tell me whether two images are similar. But nothing happens when I train the ANN. I think there are two possibilities:
1: I didn't set up the ANN correctly.
2: There is no connection at all.
Please help me see what the issue is:
Here is my code:
# Load the Pandas libraries with alias 'pd'
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# fix random seed for reproducibility
np.random.seed(7)
import csv
data = pd.read_csv("encoding.csv")
print(data.columns)
X = data[['resnet', 'hog','label']]
x = X[['resnet', 'hog']]
y = X[['label']]
model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
#add model layers
model.add(Dense(16, activation='relu', input_shape=(n_cols,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation= 'softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
          epochs=30,
          batch_size=32,
          validation_split=0.10)
Right now all it does is this over and over again:
167/167 [==============================] - 0s 3ms/step - loss: 8.0189 - acc: 0.4970 - val_loss: 7.5517 - val_acc: 0.5263
EDIT
So I have changed the setup a bit, and now the validation accuracy does bounce up to 73%. But then it bounces around and ends at 40%; what does that mean?
Here is the new model:
model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(BatchNormalization())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(1, activation= 'sigmoid'))
#sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
#model.compile(loss = "binary_crossentropy", optimizer = sgd, metrics=['accuracy'])
model.compile(loss = "binary_crossentropy", optimizer = 'rmsprop', metrics=['accuracy'])
#model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
          epochs=100,
          batch_size=64,
          validation_split=0.10)
This makes no sense:
model.add(Dense(1, activation= 'softmax'))
Softmax with one neuron will produce a constant value of 1.0 due to the normalization. For binary classification with the binary_crossentropy loss, you should use one neuron with sigmoid activation.
model.add(Dense(1, activation= 'sigmoid'))
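You can see why from the definition: with a single logit x, softmax returns e^x / e^x = 1 regardless of x. A quick numpy sketch to confirm:
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

print(softmax(np.array([3.7])))    # [1.] -- always 1.0 for a single logit
print(softmax(np.array([-42.0])))  # [1.]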
Two things to try:
First, add complexity to your network; it is pretty simple, so add more layers/neurons in order to capture more information from your data.
Start with something like this, and see if it changes anything:
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation= 'sigmoid'))
Second, consider training for more epochs; an ANN can take a long time to converge.
Update
More things to try:
Normalize and scale your data.
Maybe the dataset is too small -> the more data you get, the better your model will be.
Try different hyperparameters; maybe decrease your learning rate to something like 1e-4 or 1e-5, and try different batch sizes.
Add more regularization: try dropout between each layer, as in the sketch below.
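A minimal sketch of that last suggestion, assuming the same n_cols as in the question:
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(Dropout(0.2))  # dropout between each layer
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))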

Neural network only predicts one class in binary classification

My task is to detect defective items in a factory, i.e. to classify goods as defective or fine. This leads to a problem where one class dominates the other (one class is 99.7% of the data), as defective items are very rare. Training accuracy is 0.9971 and validation accuracy is 0.9970. That sounds amazing.
But the problem is that the model predicts class 0, which is fine goods, for everything. That means it fails to classify any defective goods.
How can I solve this problem? I have checked other questions and tried their suggestions, but I still have the same situation. The total data is 122400 rows with 5 features.
In the end, my confusion matrix of the test set is like this
array([[30508,     0],
       [   92,     0]], dtype=int64)
which does a terrible job.
My code is as below:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

le = LabelEncoder()
y = le.fit_transform(y)
ohe = OneHotEncoder(sparse=False)
y = y.reshape(-1,1)
y = ohe.fit_transform(y)
scaler = StandardScaler()
x = scaler.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=777)

#DNN Modelling
epochs = 15
batch_size = 128
Learning_rate_optimizer = 0.001

model = Sequential()
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu',
                input_shape=(5,)))
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(8,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(2,
                kernel_initializer='glorot_uniform',
                activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=Learning_rate_optimizer),
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))

y_pred = model.predict(x_test)
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
Thank you
It sounds like you have a highly imbalanced dataset, so the model is only learning how to classify fine goods.
You can try one of the approaches listed here:
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
The best approach would be to first take almost equal portions of data from both classes, split them into train-test-val, train the classifier, and do thorough testing on your complete dataset. You can also try data augmentation techniques on the minority class to get more data. Keep iterating, and maybe even try changing your loss function to suit your problem.
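Another common option, as a sketch reusing the fit call from the question: weight the rare class more heavily via Keras' class_weight argument. The weights below are hypothetical and should be tuned, e.g. to the inverse class frequencies.
# weight class 1 (defective, ~0.3% of the data) much more than class 0
class_weights = {0: 1.0, 1: 300.0}

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    class_weight=class_weights,
                    validation_data=(x_test, y_test))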

Find Most Important Input from a Neural Network

I trained a neural network with 37 inputs. It has around 85% accuracy. Is it possible to find out which input has the most effect? I tried the code below, but I cannot figure out how to get the most important input from the raw weights.
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
One possible solution is to wrap your model with keras.wrappers.scikit_learn and then use Recursive Feature Elimination (RFE) in scikit-learn:
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(512, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=128, verbose=0)
rfe = RFE(estimator=model, n_features_to_select=1, step=1)
rfe.fit(X, y)

# 'digits' here is from scikit-learn's digits example; reshape the
# ranking to match your own input layout (e.g. (37,) for 37 inputs)
ranking = rfe.ranking_.reshape(digits.images[0].shape)
# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
If you need to visualize weights see here.

deep learning data preparation

I have a text dataset that contains 6 classes. For each sample, I have a percentage value per class, and the sum of the 6 percentages is 100% (the features are related to each other). For example:
{A:16, B:35, C:7, D:0, E:3, F:40}
How can I feed a deep learning algorithm this dataset?
I actually want the predictions to be in exactly the same shape as the training data.
Here is what you can do:
First of all, normalize all of your labels by scaling them between 0 and 1.
Use a softmax layer for prediction.
Here is some code in Keras for intuition:
model = Sequential()
model.add(Dense(100, input_dim = x.shape[1], activation='relu'))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
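A sketch of the label-preparation step (assuming y holds raw percentages like the example above): dividing by 100 turns each row into a distribution that sums to 1, which matches the softmax output.
import numpy as np

# hypothetical labels in percent, one row per sample, each summing to 100
y_raw = np.array([[16, 35, 7, 0, 3, 40],
                  [10, 10, 20, 20, 20, 20]], dtype=float)
y = y_raw / 100.0      # each row now sums to 1, matching the softmax output
print(y.sum(axis=1))   # [1. 1.]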
