The following code is from a textbook called 'Deeplearning for everybody', and it predicts diabetes based on data from the Pima Indians. I wonder what the [1] at the end of the code means.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
import tensorflow as tf
np.random.seed(3)
tf.random.set_seed(3)
dataset = np.loadtxt('./dataset/pima-indians-diabetes.csv', delimiter=',')
X = dataset[:, 0:8]
Y = dataset[:, 8]
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X, Y, epochs=200, batch_size=10)
print('\n Accuracy: %.4f' % (model.evaluate(X, Y)[1])) # <---------
In Keras, model.evaluate() returns a list of aggregated metric values. Say you want to measure the loss, the accuracy, and an F1 score on your test data; you would then compile your model with something like model.compile(optimizer, loss, metrics=['accuracy', custom_f1_function], ...). These metrics are calculated for every sample (or batch) in the dataset and then reduced, usually by taking the average. In the end you get a list with three elements: aggregated loss, aggregated accuracy, aggregated F1 score. Your code is accessing the second element of this list, namely the accuracy.
(The order in metrics=[...] determines the order in the output list!)
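For instance, a minimal sketch of that ordering, reusing the model from the question:

# compile() fixes the metric order; evaluate() returns values in that same order
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
loss, accuracy = model.evaluate(X, Y)   # -> [loss, accuracy]
print(model.metrics_names)              # -> ['loss', 'accuracy']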
First of all, I'm very new to ML and I have this task:
I need to build an ML model that gives each client a list of the 10 professions that best fit their data: bachelor's degree type, favourite subjects, favourite professional sectors, etc.
My team and I have already extracted the info from an SQL database, created a dataframe with all the relevant information, and one-hot encoded it. This is the result:
df_all_dumm.shape
(773, 1029)
So I have 773 clients and 1029 columns (a lot of them, but we thought it necessary because all the columns are numeric categorical). Most of the columns are one-hot-encoded profession columns (columns 99 to 998), holding a "1" if the profession has been suggested to the client and a "0" if not.
I'm a bit lost as to whether this dataset approach is fine for multi-label classification, and about which method to use (NN, random-forest classifiers, scikit-multilearn models...). I have already tried some multi-learn models like MLkNN or BRkNNaClassifier, but the results are very poor (F1 score = 0.1-0.2).
This is the dataset (it doesn't contain any private data, so I think there is no problem uploading it; also, I don't know if this is the proper way to paste a link, sorry again):
https://drive.google.com/file/d/1nID4q7EfpoiNKdWz6N4FRUgIQEniwFRV/view?usp=sharing
EDIT:
I have created a Sequential Keras model:
# Slicing target columns from the rest of the df
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

data = pd.read_csv('df_all_dumm.csv')
data_c = data.copy()
data_in = data_c.copy()
data_out = data_c.iloc[:, 99:999]          # the 900 one-hot profession labels
data_in = data_in.drop(data_out.columns, axis=1)
data_in = data_in.drop(['id'], axis=1)
X_train, X_test, y_train, y_test = train_test_split(data_in,
data_out,
test_size = 0.3,
random_state = 42)
print("{0:2.2f}% of data in train set".format(len(X_train)/len(data.index)*100))
print("{0:2.2f}% of data in test set".format(len(X_test)/len(data.index)*100))
# Dataframes to tensors
X_train_tf = tf.convert_to_tensor(X_train.values)
X_test_tf = tf.convert_to_tensor(X_test.values)
y_train_tf = tf.convert_to_tensor(y_train.values)
y_test_tf = tf.convert_to_tensor(y_test.values)
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

# get the model
def get_model(n_inputs, n_outputs):
    model = Sequential()
    # only the first layer needs input_dim; it is ignored on later layers
    model.add(Dense(512, input_dim=n_inputs, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(600, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(700, activation='relu'))
    model.add(Dense(900, activation='relu'))
    model.add(Dense(n_outputs, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# load dataset
X, y = X_train_tf, y_train_tf
n_inputs, n_outputs = X.shape[1], y.shape[1]
# get model
model = get_model(n_inputs, n_outputs)
# fit the model on all data
model.fit(X, y, validation_split=0.33, epochs=100, batch_size=10)
prec = model.evaluate(X_test_tf, y_test_tf)[1]
print("The network's accuracy is: {} %".format(round(prec * 100, 2)))
The network's accuracy is: 4.74 %
So, this is the model we created. I think the main problem here is that we have 900 different output labels while the input data has only around 500 training samples...
We also thought about applying a clustering algorithm first to cluster the professions into, say, 5 groups, training a NN for every group, and then predicting.
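A rough sketch of that clustering idea (illustrative only; it assumes the data_out label dataframe from the code above and scikit-learn's KMeans):

from sklearn.cluster import KMeans

# each profession column becomes a point described by which clients it was
# suggested to; professions suggested to similar clients land in one group
kmeans = KMeans(n_clusters=5, random_state=42)
profession_groups = kmeans.fit_predict(data_out.values.T)
# profession_groups[i] in {0..4} is the group of the i-th profession column;
# one model per group could then be trained on that subset of the labels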
from numpy import loadtxt
from keras.models import Sequential
from keras.layers import Dense
from matplotlib import pyplot
from sklearn.preprocessing import MinMaxScaler
# load the dataset
dataset = loadtxt('train.csv', delimiter=',')
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(dataset)
normalized = scaler.transform(dataset)
for row in normalized:
    # split into input (X) and output (y) variables
    X = row[0:13]
    y = row[13]
# define the keras model
model = Sequential()
model.add(Dense(12, input_dim=13, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='relu'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['mse', 'mae', 'mape'])
history = model.fit(X, y, epochs=20, batch_size=1, verbose=2)
pyplot.plot(history.history['mse'])
pyplot.plot(history.history['mae'])
pyplot.plot(history.history['mape'])
pyplot.show()
As per the error message, the shape of your input X in model.fit is not correct. The model expects each input sample to be an array of shape (13,), but it is receiving a (1,)-shaped array. Print the X and y values and verify whether they are being stored as lists of lists, as in X = [[1, 2, 3, 4, ..., 13]] and y = [[0]]. In that case, pass X[0] and y[0] to the model.fit method.
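A minimal sketch of that check and fix (illustrative only; it assumes the X, y produced by the loop above):

import numpy as np

X = np.asarray(X)
y = np.asarray(y)
print(X.shape, y.shape)  # verify what model.fit actually receives

# a single 13-feature row must be passed as a batch of one sample
history = model.fit(X.reshape(1, 13), y.reshape(1, 1),
                    epochs=20, batch_size=1, verbose=2)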
I am trying to build a simple ANN to learn to tell whether two images are similar, using two distance equations. Here is how I set things up: I took 3 images (1: an anchor, 2: a positive sample, 3: a negative sample) and computed two different distance measurements between them, one using ResNet features and another using HOG features. The two distance measurements are then saved together with the two picture paths and the correct label (0/1): 0 = same, 1 = not the same.
Now I am trying to build out my ANN to learn the difference between the two values and see if this will let me tell whether two images are similar. But nothing happens when I train the ANN. I think there are two possibilities:
1: I didn't set up the ANN correctly.
2: There is no connection at all.
Please help me see what the issue is:
Here is my code:
# Load the Pandas libraries with alias 'pd'
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# fix random seed for reproducibility
np.random.seed(7)
import csv
data = pd.read_csv("encoding.csv")
print(data.columns)
X = data[['resnet', 'hog','label']]
x = X[['resnet', 'hog']]
y = X[['label']]
model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
#add model layers
model.add(Dense(16, activation='relu', input_shape=(n_cols,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation= 'softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
epochs=30,
batch_size=32,
validation_split=0.10)
Right now all it does is this over and over again:
167/167 [==============================] - 0s 3ms/step - loss: 8.0189 - acc: 0.4970 - val_loss: 7.5517 - val_acc: 0.5263
Here is the csv file that I am using:
EDIT
So I have changed the setup a bit, and now it does bounce up to 73% validation accuracy. But then it bounces around and ends at 40%. What does that mean?
Here is the new model:
from keras.layers import BatchNormalization, Dropout

model = Sequential()
#get number of columns in training data
n_cols = x.shape[1]
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(BatchNormalization())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(1, activation= 'sigmoid'))
#sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
#model.compile(loss = "binary_crossentropy", optimizer = sgd, metrics=['accuracy'])
model.compile(loss = "binary_crossentropy", optimizer = 'rmsprop', metrics=['accuracy'])
#model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y,
epochs=100,
batch_size=64,
validation_split=0.10)
This makes no sense:
model.add(Dense(1, activation= 'softmax'))
Softmax with one neuron will produce a constant value of 1.0 due to the normalization. For binary classification with the binary_crossentropy loss, you should use one neuron with a sigmoid activation:
model.add(Dense(1, activation= 'sigmoid'))
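A quick numeric check of why a one-unit softmax is constant (a standalone sketch, not part of the original answer):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # numerically stabilized softmax
    return e / e.sum()

print(softmax(np.array([-3.7])))  # [1.] regardless of the logit value
print(softmax(np.array([42.0])))  # [1.]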
Two things to try:
First, add complexity to your network. It is pretty simple; add more layers/neurons in order to capture more information from your data.
Start with something like this, and see if it changes anything:
model.add(Dense(256, activation='relu', input_shape=(n_cols,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation= 'sigmoid'))
Second, consider training for more epochs; ANNs can take a long time to converge.
Update
More things to try:
Normalize and scale your data (see the sketch after this list)
Maybe the dataset is too small -> the more data you get, the better your model will be
Try different hyperparameters; maybe decrease your learning rate to something like 1e-4 or 1e-5, and try different batch sizes
Add more regularization: try dropout between each layer
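A minimal sketch combining the scaling, lower-learning-rate, and extra-dropout suggestions (illustrative only; it assumes the x and y dataframes from the question and standalone Keras):

from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import optimizers

# zero mean, unit variance per feature
scaler = StandardScaler()
x_scaled = scaler.fit_transform(x)

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(x_scaled.shape[1],)))
model.add(Dropout(0.3))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(lr=1e-4),  # lowered learning rate
              metrics=['accuracy'])
model.fit(x_scaled, y, epochs=100, batch_size=64, validation_split=0.10)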
I am attempting to implement a neural network using Keras for a problem that involves multi-label classification. I understand that one way to tackle the problem is to transform it into several binary classification problems. I have implemented one of these, but I am not sure how to proceed with the others, mostly how to go about combining them. My data set has 5 input variables and 5 labels. A single sample generally has 1-2 labels; it is rare to have more than two.
Here is my code (thanks to machinelearningmastery.com):
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("Realdata.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:5].astype(float)
Y = dataset[:,5]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(5, input_dim=5, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    scores = model.evaluate(X, encoded_Y)
    print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    # Make predictions....change the model.predict to whatever you want instead of X
    predictions = model.predict(X)
    # round predictions
    rounded = [round(x[0]) for x in predictions]
    print(rounded)
    return model
# evaluate model with standardized dataset
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, encoded_Y, cv=kfold)
print("Results: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
The approach you are referring to is the one-versus-all (binary relevance) strategy for multi-label classification. However, when using a neural network, the easiest solution for a multi-label classification problem with 5 labels is to use a single model with 5 output nodes. With Keras:
model = Sequential()
model.add(Dense(5, input_dim=5, kernel_initializer='normal', activation='relu'))
model.add(Dense(5, kernel_initializer='normal', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd')
You can provide the training labels as binary-encoded vectors of length 5. For instance, an example that corresponds to classes 2 and 3 would have the label [0 1 1 0 0].
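For example, a minimal sketch of building such multi-hot label vectors with scikit-learn (the class lists here are made up for illustration):

from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer(classes=[1, 2, 3, 4, 5])
Y = mlb.fit_transform([[2, 3], [1], [4, 5]])
print(Y)
# [[0 1 1 0 0]
#  [1 0 0 0 0]
#  [0 0 0 1 1]]
# model.fit(X, Y, ...) then trains all 5 sigmoid outputs jointly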
I have programmed a convolutional neural network in Keras to predict whether an image is of a cat or a dog.
I got an accuracy of around 80%.
I checked my code's predictions on many images, and it always outputs the same one of the two classes (dog or cat) as the prediction.
I will walk you through my code:
The first part of my code is Data Preparation:
#Labels
Y = np.concatenate([np.array([[1,0]]*len(cats)),
np.array([[0,1]]*len(dogs))])
print(Y)
#Features
import scipy.misc
from keras.models import model_from_json
X=[]
for name in cats:
converted_image = scipy.misc.imresize(scipy.misc.imread(name), (64, 64))
X.append(converted_image)
for name in dogs:
converted_image = scipy.misc.imresize(scipy.misc.imread(name), (64, 64))
X.append(converted_image)
X= np.array(X)
print(len(X))
print(X.shape)
Now I have divided the data into training and testing
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.10,random_state=10)
X_train=X_train.astype("float32")
X_test = X_test.astype("float32")
print("Shape Of the Training Features are :",X_train.shape)
print("Shape Of Testing Features are :",X_test.shape)
print("Number of Training Samples : ",X_train.shape[0])
print("Number of Testing Samples :",X_test.shape[0])
Now, I will show my complete CNN architecture below:
from keras.models import Sequential
from keras.layers import Dense,Dropout,Activation,Lambda,Flatten
from keras.layers import Conv2D,MaxPooling2D
def layer(input_shape=(64, 64, 3)):
    model = Sequential()
    model.add(Lambda(lambda x: x/127.5 - 1., input_shape=input_shape, output_shape=input_shape))
    model.add(Conv2D(32, (3, 3), activation='relu', name='conv1', input_shape=input_shape, padding="valid"))
    model.add(Conv2D(64, (3, 3), activation='relu', name='conv2', padding='valid'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(128, (3, 3), activation='relu', name='conv3', padding='valid'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    #model.add(Dense(25088,activation='relu'))
    #model.add(Dense(12000,activation='relu'))
    #model.add(Dense(6000,activation='relu'))
    model.add(Dense(500, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    model.summary()
    return model

model = layer()
model.compile(loss='mse',optimizer='adadelta',metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=128, epochs=5, verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
My accuracy and score from the output is as follows:
Test score: 0.1459069953918457
Test accuracy: 0.7876
Saving and loading the model:
#saving
model_json = model.to_json()
open('cd.json','w').write(model_json)
#Save Weights
model.save_weights('cd_weights.h5',overwrite=True)
#loading
model_architecture = 'cd.json'
model_weights = 'cd_weights.h5'
model = model_from_json(open(model_architecture).read())
model.load_weights(model_weights)
Now prediction part
img_names = ['d1.jpg','c1.jpg']
imgs = [np.transpose(scipy.misc.imresize(scipy.misc.imread(img_name), (64, 64)),
(1, 0, 2)).astype('float32')
for img_name in img_names]
imgs = np.array(imgs) / 255
# compile the loaded model
from keras.optimizers import SGD
optim = SGD()
model.compile(loss='categorical_crossentropy', optimizer=optim,
metrics=['accuracy'])
predictions = model.predict_classes(imgs)
print(predictions)
Output : [0 0]
The images are taken from the training set (c1 = cat, d1 = dog), and for every image it outputs zero as the prediction.
I think there is some problem with the model.predict_classes() function.
You can use the following code to do the same thing as model.predict_classes().
This gives us the predictions in terms of probabilities:
predictions = model.predict(imgs)
e.g.:
1) [0.7865448 0.21910465] means there is a 78% chance of the picture being the first class (cat) and a 21% chance of it being the second class (dog)
2) [0.7856076 0.22000787] likewise: a 78% chance of the first class (cat) and a 21% chance of the second (dog)
To convert these probabilities to classes, we use the np.argmax function (with axis=1, since predictions holds one row per image):
print(np.argmax(predictions, axis=1))
The output will be 0, indicating the first class (cat), or 1, indicating the second class (dog), for each image.