How do I calculate accuracy of my ANN in this case - python-3.x

I am running the following code. I want to calculate the accuracy of my ANN on the test data. I am using the Windows platform, Python 3.5.
import numpy
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
dataset=pd.read_csv('main.csv')
dataset=dataset.fillna(0)
X=dataset.iloc[:, 0:6].values
#X = X[numpy.logical_not(numpy.isnan(X))]
y=dataset.iloc[:, 6:8].values
#y = y[numpy.logical_not(numpy.isnan(y))]
#regr = LinearRegression()
#regr.fit(numpy.transpose(numpy.matrix(X)), numpy.transpose(numpy.matrix(y)))
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test=train_test_split(X,y, test_size=0.24,random_state=0)
# create model
model = Sequential()
model.add(Dense(4, input_dim=6, kernel_initializer='normal', activation='relu'))
model.add(Dense(4, kernel_initializer='normal', activation='relu'))
model.add(Dense(2, kernel_initializer='normal'))
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, batch_size=5, epochs=5)
y_pred=model.predict(X_test)
Now, I want to calculate the accuracy of y_pred. Any help will be appreciated.
The above code is self-explanatory. I am currently using only 5 epochs just for experimenting.

Keras already implements metrics such as accuracy, so you just need to change the model.compile line to:
model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=['accuracy'])
Then the training and validation accuracy (in the [0, 1] range) will be shown in the progress bar during training, and you can also compute accuracy with model.evaluate, which returns a tuple of loss and metrics (accuracy in this case).
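For example, a minimal sketch using the model and data from the question (my addition; recompile before fitting so the metric is tracked):
model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=5, epochs=5)
loss, acc = model.evaluate(X_test, y_test)  # returns [loss, accuracy] with this compile
print('Test loss: %.4f, test accuracy: %.4f' % (loss, acc))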

Besides the suggestion of using Keras, you can compute the accuracy using scikit-learn as follows:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
For more information, check the documentation: sklearn.metrics.accuracy_score
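One caveat (my addition, not part of the original answer): accuracy_score expects discrete class labels on both sides, so with the continuous outputs of the model above you would first have to round or threshold the predictions, for example:
import numpy as np
y_pred_classes = np.around(y_pred)       # round continuous outputs to the nearest label
accuracy_score(y_test, y_pred_classes)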

Although in a narrow technical sense both answers already provided are correct, there is a more general issue with your question which affects the essence of it: are you in a regression or a classification context?
If you are in a regression context (as implied by your loss='mean_squared_error' and the linear activation in your output layer), then the simple augmentation of model compilation
model.compile(loss='mean_squared_error', optimizer='adam',
metrics = ["accuracy"])
will, as Matias says, provide the accuracy. Nevertheless, accuracy is meaningless in a regression setting; see the answer & discussion here for more details.
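In a regression setting you would report error metrics instead; a short sketch using scikit-learn (my addition, assuming y_test and y_pred from the question):
from sklearn.metrics import mean_squared_error, r2_score
print(mean_squared_error(y_test, y_pred))  # mean squared error: lower is better
print(r2_score(y_test, y_pred))            # R^2 score: closer to 1 is better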
If you are in a classification context (as implied by your wish to calculate the accuracy, which is meaningful only in classification), then your loss function should not be the MSE but the cross-entropy, and the activation of your last layer should not be linear.
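A minimal sketch of what that would look like (my addition; it assumes y_train holds a single column of integer class labels, which the question's two-column y would first need to be reduced to):
from keras.utils import to_categorical
y_train_cat = to_categorical(y_train)              # one-hot encode integer class labels
model = Sequential()
model.add(Dense(4, input_dim=6, activation='relu'))
model.add(Dense(y_train_cat.shape[1], activation='softmax'))  # non-linear output activation
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])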

To compute accuracy, we can use the model.evaluate function (provided the model was compiled with an accuracy metric).

Related

Polynomial Regression using Keras

Hi, I am new to Keras and I just wanted to know: are ANNs good for polynomial regression tasks, or should we just use sklearn? For example, I wrote this script:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
x=np.arange(1, 100)
y=x**2
model = Sequential()
model.add(Dense(units=200, activation = 'relu',input_dim=1))
model.add(Dense(units=200, activation= 'relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error',optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y,epochs=2000)
but after testing it on some numbers I didn't get good results, for example:
model.predict([300])
array([[3360.9023]], dtype=float32)
Is there any problem in my code, or should I just not use ANNs for polynomial regression?
Thank you.
I'm not 100 percent sure, but I think the reason you are getting such bad predictions is that you did not scale your data. Neural networks are trained with gradient-based optimizers, which struggle when inputs and targets span very different scales, so scaling is a must. Scale your data as shown below:
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
from sklearn.preprocessing import StandardScaler
x = np.arange(1, 100).reshape(-1, 1)  # reshape to 2D: StandardScaler expects (n_samples, n_features)
y = x ** 2
sc_x = StandardScaler()
x = sc_x.fit_transform(x)
sc_y = StandardScaler()
y = sc_y.fit_transform(y)
model = Sequential()
model.add(Dense(units=5, activation='relu', input_dim=1))
model.add(Dense(units=5, activation='relu'))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error',
              optimizer=keras.optimizers.SGD(learning_rate=0.001))
model.fit(x, y, epochs=75, batch_size=10)
# scale the query point with the input scaler, then undo the output scaling
prediction = sc_y.inverse_transform(model.predict(sc_x.transform([[300]])))
print(prediction)
Note that I changed the number of epochs from 2000 to 75. 2000 epochs is way too high for this problem and requires a lot of time to train. Your dataset contains only 99 values, so 75 epochs is about the maximum I would suggest.
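As an aside (my addition, not part of the original answer), instead of hand-picking an epoch count you can let Keras stop training once the loss stops improving:
from keras.callbacks import EarlyStopping
es = EarlyStopping(monitor='loss', patience=20, restore_best_weights=True)
model.fit(x, y, epochs=2000, batch_size=10, callbacks=[es])  # stops early in practice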
Furthermore, I also changed the number of neurons in each hidden layer from 200 to 5. 200 neurons is far too many for most datasets, let alone a small dataset of length 99.
These changes should ensure that your neural network produces more accurate predictions.
Hope that helped.

Check if the way of evaluating a Keras model on unseen data is correct

I studied Keras and created my first neural network model as follows:
from keras.layers import Dense
import keras
from keras import Sequential
from sklearn.metrics import accuracy_score
tr_X, tr_y = getTrainingData()
# NN Architecture
model = Sequential()
model.add(Dense(16, input_dim=tr_X.shape[1]))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(16))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(1, activation='sigmoid'))
# Compile the Model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the Model
model.fit(tr_X, tr_y, epochs=1000, batch_size=200, validation_split=0.2)
# ----- Evaluate the Model (Using UNSEEN data) ------
ts_X, ts_y = getTestingData()
yhat_classes = model.predict_classes(ts_X, verbose=0)[:, 0]
accuracy = accuracy_score(ts_y, yhat_classes)
print(accuracy)
I am not sure about the last portion of my code, i.e., the model evaluation using model.predict_classes(), where new data are loaded via a custom method getTestingData(). My goal is to test the final model using new UNSEEN data to evaluate its predictions. My question is about this part: am I evaluating the model correctly?
Thank you,
Yes, that is correct. You can use predict or predict_classes to get the predictions on test data. If you need the loss & metrics directly, you can use the evaluate method by feeding ts_X and ts_y.
y_pred = model.predict(ts_X)
loss, accuracy = model.evaluate(ts_X, ts_y)
https://keras.io/models/model/#predict
https://keras.io/models/model/#evaluate
Difference between predict & predict_classes: What is the difference between "predict" and "predict_class" functions in keras?
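In short (a sketch of the distinction, my addition): with a single sigmoid output, predict returns probabilities, while predict_classes thresholds them at 0.5, which you can reproduce manually:
probs = model.predict(ts_X)                    # probabilities in (0, 1)
classes = (probs > 0.5).astype('int32')[:, 0]  # same result as predict_classes here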

Why does sigmoid function outperform tanh and softmax in this case?

The sigmoid function gives better results than tanh or softmax for the neural network below.
If I change the activation function from sigmoid to tanh or softmax, the error increases and accuracy decreases, although I have learned that tanh and softmax are better than sigmoid. Could someone help me understand this?
The datasets I used are Iris and the Pima Indians Diabetes Database. I have used TensorFlow 1.5 and Keras 2.2.4.
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
import numpy as np
dataset = np.genfromtxt('diabetes.csv', dtype=float, delimiter=',')
X = dataset[1:, 0:8]
Y = dataset[1:, 8]
xtrain, xtest, ytrain, ytest = train_test_split(X, Y, test_size=0.2, random_state=42)
model = Sequential()
model.add(Dense(10, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(xtrain, ytrain, epochs=50, batch_size=20)
print(model.metrics_names)
print(model.evaluate(xtest, ytest))
The tanh's value range is between -1 and 1, but that's not necessarily a problem. By learning suitable weights, the tanh can fit the target range [0, 1] using the bias. Therefore both the sigmoid and the tanh can be used here; only softmax is not possible, for the reasons mentioned (a single-unit softmax always outputs a constant 1). See the code below:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
X = np.hstack((np.linspace(0, 0.45, num=50), np.linspace(0.55, 1, num=50)))
Y = (X > 0.5).astype('float').T
model = Sequential()
model.add(Dense(1, input_dim=1, activation='tanh'))
model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
model.fit(X, Y, epochs=100)
print(model.evaluate(X, Y, verbose=False))
Whenever someone says you should always prefer foo over bar in machine learning, it's probably an oversimplification. There are anti-patterns one can warn people about, things that never work, like the softmax in the example above. If the rest were that simple, AutoML would be a very boring field of research ;). PS: I'm not actually working on AutoML.
The softmax activation function is generally used for categorical outputs, because softmax squashes the outputs into the range (0, 1) in such a way that they always sum to 1. If your output layer has only one unit/neuron, it will therefore always output a constant 1.
Tanh, or the hyperbolic tangent, is a logistic function that maps the outputs to the range (-1, 1). Tanh can be used in binary classification between two classes; when using it, remember to label the data accordingly with [-1, 1].
The sigmoid function is another logistic function like tanh, but it maps any real-valued input to the range (0, 1). This makes sigmoid a great function for predicting a probability for something.
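These ranges are easy to verify numerically (a quick sketch, my addition):
import numpy as np
z = np.array([-5.0, 0.0, 5.0])
print(1 / (1 + np.exp(-z)))   # sigmoid: all values in (0, 1)
print(np.tanh(z))             # tanh: all values in (-1, 1)
e = np.exp(np.array([2.7]))
print(e / e.sum())            # softmax over a single unit: always [1.]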
So, all in all, the output activation function is usually not a choice of model performance but actually is dependent on the task and network architecture you are working with.

MLP classifier for multi-class

I am a newbie with Keras.
I am trying to follow the Keras tutorial on a Multilayer Perceptron (MLP) for multi-class softmax classification, using my own data set.
My data has 3 classes and only one feature, but I don't understand why the result always shows an accuracy of just 0.3, and the model predicts all training data as the first class. The confusion matrix then looks like this:
[Image: confusion matrix]
Here is the code:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import pandas as pd
import numpy as np
# Importing the dataset
dataset = pd.read_csv('StatusAll.csv')
X = dataset.iloc[:, 1:].values
y = dataset.iloc[:, 0:1].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='tanh', input_dim=1))
model.add(Dropout(0.5))
model.add(Dense(64, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    epochs=100,
                    batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
print('Test score:', score[0])
print('Test accuracy:', score[1])
from sklearn import metrics
prediction = model.predict(x_test)
prediction = np.around(prediction)
y_test_non_category = [ np.argmax(t) for t in y_test ]
y_predict_non_category = [ np.argmax(t) for t in prediction ]
from sklearn.metrics import confusion_matrix
conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)
print (conf_mat)
I hope I can get some advice, thanks.
[Images: a sample of x_train, and y_train before conversion to categorical]
Your final Dense layer has 4 outputs; it seems like you are classifying into 4 classes instead of 3.
model.add(Dense(3, activation='softmax')) # Number of classes 3
It would be helpful to see sample data from x_train and y_train to make sure the pre-processing is correct. Because you have only 1 feature, an MLP might be overkill; a decision tree would be simpler, unless you want to experiment with MLPs. A minimal baseline is sketched below.
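A minimal decision-tree baseline for comparison (my addition; it assumes the same train/test split from the question, with hypothetical names y_train_labels/y_test_labels for the original integer labels, i.e. before to_categorical):
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=0)
clf.fit(x_train, y_train_labels.ravel())  # integer class labels, not one-hot
print(clf.score(x_test, y_test_labels))   # mean accuracy on the test set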

How to use LSTM to predict values from a different range than it was trained on

I am trying to train a simple LSTM to predict the next number in a sequence (1, 2, 3, 4, 5 --> 6).
from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt
xs = [[[(j+i)/100] for j in range(5)] for i in range(100)]
ys = [(i+5)/100 for i in range(100)]
x_train, x_test, y_train, y_test = train_test_split(xs, ys)
model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
training = model.fit(x_train, y_train, epochs=200)
new_xs = np.array(xs)*5
new_ys = np.array(ys)*5
pred = model.predict(new_xs)
plt.scatter(range(len(pred)), pred, c='r')
plt.scatter(range(len(new_ys)), new_ys, c='b')
In order for the net to learn anything, I had to normalize the training data (divide it by 100). It did indeed work for data from the range it was trained on.
I want it to be able to predict numbers from outside the range it was trained on, but as soon as it leaves that range, it starts to diverge.
When I increased the number of units in both LSTM layers to 30, it looked a little better, but it was still diverging.
Is LSTM capable of learning that task without adding an infinite number of units?
I was looking for an answer to the same question, and I came across the following paper from 2019 and the corresponding Git repo. In particular, see section 5.3 in the paper. It seems like ABBA-LSTM is the solution, though it depends on the time series problem you're trying to solve.
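One standard workaround worth mentioning (my addition, separate from the paper): train on relative differences instead of absolute values, so the inputs stay in the training range no matter where the sequence sits:
import numpy as np
seq = np.array([500.0, 501.0, 502.0, 503.0, 504.0])  # far outside the training range
deltas = np.diff(seq)  # [1. 1. 1. 1.] -- a range-independent representation
# train the LSTM on windows of deltas and predict the next delta,
# then reconstruct: next_value = seq[-1] + predicted_delta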
