Multiclass predictions with a categorical_crossentropy loss - python-3.x

Let's say this example implements a simple binary classification.
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[0],[1],[0]])
...
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X, y, epochs=50, verbose=0)
# new instance where we do not know the answer
Xnew = array([[4, 5, 6]])
# make a prediction
ynew = model.predict(Xnew)
#show the inputs and predicted outputs
print("X=%s, Predicted=%s" % (Xnew[0], ynew[0]))
...
results
X=[4, 5, 6], Predicted=[0 or 1]
And this one implements multiclass classification.
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[4],[5],[6]])
...
model.compile(loss='categorical_crossentropy', optimizer='adam')
# fit model
model.fit(X, y, epochs=50, verbose=2)
model.reset_states()
# evaluate model on new data
yhat = model.predict(X)
...
results decoded
X=[4, 5, 6], Predicted=[4, 5, 6]
How can I implement multiclass classification with a single output to get something like this? (similar to forecasting a time series)
X = array([[1,2,3],[2,3,4],[3,4,5]])
y = array([[4],[5],[6]])
# new instance where we do not know the answer
Xnew = array([[4, 5, 6]])
yhat = model.predict_classes(Xnew)
results decoded
X=[4, 5, 6], Predicted=[7]

What you are looking for is loss='sparse_categorical_crossentropy', which assumes that the integer targets are class labels. So if your model has 7 outputs and you give target 2, sparse_categorical_crossentropy will convert 2 into [0,0,1,0,0,0,0] as the target and apply categorical_crossentropy as usual.
In this case, your output layer activation function should be softmax and the number of outputs should equal the number of classes, most likely something like Dense(num_classes, activation='softmax').
If your integer classes are just [4,5,6] then you need to shift them to [0,1,2] to satisfy the condition max(Y_targets) < num_classes.
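A minimal sketch of how that could look for the example above, assuming the labels [4, 5, 6] are shifted to [0, 1, 2] before training and shifted back after prediction (the hidden-layer size is arbitrary):
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
y = np.array([4, 5, 6])
# shift the labels so that max(y_shifted) < num_classes
y_shifted = y - y.min()                  # [0, 1, 2]
num_classes = len(np.unique(y_shifted))  # 3

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=3))   # hidden size is illustrative
model.add(Dense(num_classes, activation='softmax'))    # one output per class
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(X, y_shifted, epochs=50, verbose=0)

# predict a class index and map it back to the original label range
Xnew = np.array([[4, 5, 6]])
yhat = np.argmax(model.predict(Xnew), axis=-1) + y.min()
print("X=%s, Predicted=%s" % (Xnew[0], yhat[0]))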

Related

Keras: How to define input shape for 1st DENSE layer?

I am new to deep learning and Keras.
Refer to the code below.
I want to confirm the terminology. Before the TimeseriesGenerator creates batches, there are 10 samples. Am I correct to say that after processing by the generator there are 8 samples in 1 batch?
I don't understand why yhat differs when I define the 1st layer's input shape with 'input_shape' vs 'input_dim'. yhat should only be (1, 1) - a single value.
If instead I use a SimpleRNN layer as my 1st layer, what should the input shape be?
Thank you
# univariate one step problem with mlp
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.preprocessing.sequence import TimeseriesGenerator
# define dataset
series = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) # 10 samples before processing by the Generator
# define generator
timestep = 2
generator = TimeseriesGenerator(series, series, length=timestep, batch_size=8)
# number of batch
print('Batches: %d' % len(generator))
# OUT --> Batches: 1
# print each batch
for i in range(len(generator)):
    x, y = generator[i]
    print('%s => %s' % (x, y))
#OUT:
[[1 2]
[2 3]
[3 4]
[4 5]
[5 6]
[6 7]
[7 8]
[8 9]] => [ 3 4 5 6 7 8 9 10]
#After processing by the Generator, there are 8 samples in 1 batch.
x, y = generator[0]
print(x.shape)
# define model
model = Sequential()
#TensorFlow assumes the first dimension is the batch_size which can have any size so you don't need to define it. The 2nd D is the number of time steps. The 3rd D is the number of features
#1st LAYER with input shape defined by input_shape
#model.add(Dense(100, activation='relu', input_shape= (timestep,1)))
#1st LAYER with input shape defined by input_dim
model.add(Dense(100, activation='relu', input_dim=timestep))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit_generator(generator, steps_per_epoch=1, epochs=200, verbose=0)
# make a one step prediction out of sample
x_input = array([9, 10]).reshape((1, timestep))
print(x_input.shape)
yhat = model.predict(x_input, verbose=0)
print(yhat)
# OUT: [[9.3066435, 10.239568]] if 1st layer's shape is input_shape
# OUT: [[11.545249]] if 1st layer's shape is input_dim
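The difference in yhat is likely because input_shape=(timestep, 1) declares a 3-D input, so the Dense layers are applied along the last axis and the model no longer produces a single value per sample, while input_dim=timestep builds the intended 2-D model. For the SimpleRNN question: an RNN layer expects 3-D input of shape (batch, timesteps, features), so both the data fed to the generator and the test input need a trailing feature axis. A minimal sketch under that assumption (not the original code):
from numpy import array
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
from keras.preprocessing.sequence import TimeseriesGenerator

series = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
timestep = 2
# reshape to (n_samples, n_features) so each generated window is (timestep, 1)
series_2d = series.reshape((len(series), 1))
generator = TimeseriesGenerator(series_2d, series_2d, length=timestep, batch_size=8)

model = Sequential()
model.add(SimpleRNN(100, activation='relu', input_shape=(timestep, 1)))  # 3-D input
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit_generator(generator, steps_per_epoch=1, epochs=200, verbose=0)

# one-step out-of-sample prediction; note the extra feature axis
x_input = array([9, 10]).reshape((1, timestep, 1))
yhat = model.predict(x_input, verbose=0)   # shape (1, 1)
print(yhat)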

Hyperparameter tuning with GridSearchCV for multi-output data (Neural Network) using Keras

I have data with 2 output variables and multiple input variables. I used a neural network to build the model, but now I want to do hyperparameter tuning.
For tuning I am using grid search. I wrote the code for the grid search, but there is an error when I try grid.fit(X_train, Y_train). The error is ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[ 1.62970054, 2.33817343],
Then I made 2 different arrays and passed them in as grid.fit(X_train, [Y[0], Y[1]]), as there are 2 outputs; now it shows ValueError: Found input variables with inconsistent numbers of samples: [10000, 2]. Is there any way I can correct this code, or does GridSearchCV not accept multiple output values? I made the Keras model using the Keras functional API and then passed it through KerasRegressor so that I could use GridSearchCV. I am stuck at what the code should be so that grid.fit() will run... Or is there any other method by which I can do hyperparameter tuning?
def create_model(activation='relu', dropout_rate=0.0, neurons=10, optimizer='Adam', weight_constraint=0):
    main_input = Input(shape=(17,), name='main_input')
    hidden = Dense(neurons, activation=activation, name='hidden', kernel_constraint=maxnorm(weight_constraint))(main_input)
    hidden = Dropout(dropout_rate)(hidden)
    out1 = Dense(1, activation='linear', name='out1')(hidden)
    out2 = Dense(1, activation='linear', name='out2')(hidden)
    model = Model(inputs=main_input, outputs=[out1, out2])
    model.compile(optimizer=optimizer, loss={'out1': 'mean_squared_error', 'out2': 'mean_squared_error'})
    return model
model = KerasRegressor(build_fn=create_model, batch_size=32, epochs=100)
activation = ['relu', 'tanh', 'sigmoid', 'linear']
dropout_rate = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
weight_constraint=[1, 2, 3, 4, 5]
neurons = [10, 30, 50, 70, 90]
optimizer = [ 'SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam']
epochs = [50, 100, 150, 200]
batch_size = [10, 20, 30, 40]
param_grid = dict(neurons=neurons, activation=activation, dropout_rate=dropout_rate, weight_constraint=weight_constraint, epochs=epochs, batch_size=batch_size, optimizer=optimizer)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)
The error when the last line of the above code is grid_result = grid.fit(X_train, Y_train) is ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[ 1.62970054, 2.33817343],
[ 3.44504814, 6.05534019],
[ 1.58155862, 0.8296778 ],
...,
[ 1.27446578, 6.71978433],
[ 7.99909866, 17.82736535],
[ 1.4...
The error when I make 2 different arrays for the 2 outputs and rewrite the last line as grid.fit(X_train, [Y[0], Y[1]]) is ValueError: Found input variables with inconsistent numbers of samples: [10000, 2]
My data is 10000 observations with 17 input variables and 2 output variables.
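One possible workaround (a sketch, not a verified fix): since both targets are plain regression values, the model can expose a single Dense(2) output instead of two separate one-unit outputs, so that Y_train stays one array of shape (10000, 2), which KerasRegressor and GridSearchCV handle directly. X_train and Y_train are assumed to be the arrays described above, and the param_grid below is trimmed for illustration:
from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.constraints import maxnorm
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV

def create_model(activation='relu', dropout_rate=0.0, neurons=10, optimizer='Adam', weight_constraint=1):
    main_input = Input(shape=(17,), name='main_input')
    hidden = Dense(neurons, activation=activation, name='hidden',
                   kernel_constraint=maxnorm(weight_constraint))(main_input)
    hidden = Dropout(dropout_rate)(hidden)
    # a single 2-unit output instead of two separate 1-unit outputs
    out = Dense(2, activation='linear', name='out')(hidden)
    model = Model(inputs=main_input, outputs=out)
    model.compile(optimizer=optimizer, loss='mean_squared_error')
    return model

model = KerasRegressor(build_fn=create_model, batch_size=32, epochs=100)
param_grid = dict(neurons=[10, 30], dropout_rate=[0.0, 0.2], optimizer=['Adam', 'RMSprop'])
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=5)
grid_result = grid.fit(X_train, Y_train)   # Y_train is a single array of shape (10000, 2)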

Tensorflow error : Dimensions must be equal

I have a dataset of 25000 colored pictures 100*100(*3), and I am trying to build a simple neural network with one convolutional layer. They are pictures of cells that are infected or not by malaria, so my output has 2 classes.
But it seems like I have a dimension mismatch, and I don't know where my error comes from.
My neural network :
def simple_nn(X_training, Y_training, X_test, Y_test):
    input = 100*100*3
    batch_size = 25
    X = tf.placeholder(tf.float32, [batch_size, 100, 100, 3])
    #Was:
    # W = tf.Variable(tf.zeros([input, 2]))
    # b = tf.Variable(tf.zeros([2]))
    #Now:
    W = tf.Variable(tf.truncated_normal([4, 4, 3, 3], stddev=0.1))
    B = tf.Variable(tf.ones([3])/10) # What should I put here ??
    init = tf.global_variables_initializer()
    # model
    #Was:
    # Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, input]), W) + b)
    #Now:
    stride = 1 # output is still 28x28
    Ycnv = tf.nn.conv2d(X, W, strides=[1, stride, stride, 1], padding='SAME')
    Y = tf.nn.relu(Ycnv + B)
    # placeholder for correct labels
    Y_ = tf.placeholder(tf.float32, [None, 2])
    # loss function
    cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))
    # % of correct answers found in batch
    is_correct = tf.equal(tf.argmax(Y,1), tf.argmax(Y_,1))
    accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
    learning_rate = 0.00001
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_step = optimizer.minimize(cross_entropy)
    sess = tf.Session()
    sess.run(init)
    #Training here...
My error :
Traceback (most recent call last):
File "neural_net.py", line 135, in <module>
simple_nn(X_training, Y_training, X_test, Y_test)
File "neural_net.py", line 69, in simple_nn
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))
...
ValueError: Dimensions must be equal, but are 2 and 3 for 'mul' (op: 'Mul') with input shapes: [?,2], [25,100,100,3].
I used a simple layer before, and it was working. I changed my weights and biases, and honestly, I don't know why my biases are set up like this; I followed a tutorial (https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#11) but it is not explained there.
I also replaced my Y with a conv2d.
And I don't know what my output should be if I want to get a vector of size 2*1 as a result.
You have correctly defined your labels as
Y_ = tf.placeholder(tf.float32, [None, 2])
So the last dimension is 2. However, the output of the convolution step is not directly suitable for comparison with the labels. What I mean is the following: if you do
Ycnv = tf.nn.conv2d(X, W, strides=[1, stride, stride, 1], padding='SAME')
Y = tf.nn.relu(Ycnv + B)
The output of this has four dimensions, as the error says:
ValueError: Dimensions must be equal, but are 2 and 3 for 'mul' (op: 'Mul') with input shapes: [?,2], [25,100,100,3].
So it is impossible to directly multiply (or otherwise operate on) the output of the convolution with the labels. What I recommend is to flatten the output of the convolution (one feature vector per sample) and pass it to a fully connected layer of 2 units (as many units as you have classes). Like this:
Y = tf.reshape(Y, [batch_size, -1])   # flatten per sample, keeping the batch dimension
logits = tf.layers.dense(Y, units=2)
and you can pass this to the loss.
Also, I recommend you change the loss to a more appropriate version. For example, tf.losses.sigmoid_cross_entropy.
Also, the way you use convolutions is strange. Why do you put hand-made filters in the convolution? Besides, you would have to initialize them and add them to a collection beforehand. In conclusion, I recommend you delete all the following code:
W = tf.Variable(tf.truncated_normal([4, 4, 3, 3], stddev=0.1))
B = tf.Variable(tf.ones([3])/10) # What should I put here ??
init = tf.global_variables_initializer()
# model
#Was:
# Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, input]), W) + b)
#Now:
stride = 1 # output is still 28x28
Ycnv = tf.nn.conv2d(X, W, strides=[1, stride, stride, 1], padding='SAME')
Y = tf.nn.relu(Ycnv + B)
and substitute it by:
conv1 = tf.layers.conv2d(X, filters=64, kernel_size=3,
                         strides=1, padding='SAME',
                         activation=tf.nn.relu, name="conv1")
Also, the init = tf.global_variables_initializer() should be at the end of the graph construction because, if not, there will be variables it won't catch.
My final working code is:
def simple_nn():
    inp = 100*100*3
    batch_size = 2
    X = tf.placeholder(tf.float32, [batch_size, 100, 100, 3])
    Y_ = tf.placeholder(tf.float32, [None, 2])
    #Was:
    # W = tf.Variable(tf.zeros([input, 2]))
    # b = tf.Variable(tf.zeros([2]))
    #Now:
    # model
    #Was:
    # Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, input]), W) + b)
    #Now:
    stride = 1 # output is still 28x28
    conv1 = tf.layers.conv2d(X, filters=64, kernel_size=3,
                             strides=1, padding='SAME',
                             activation=tf.nn.relu, name="conv1")
    Y = tf.reshape(conv1, [batch_size, -1])   # flatten per sample, keeping the batch dimension
    logits = tf.layers.dense(Y, units=2, activation=tf.nn.relu)
    # placeholder for correct labels
    # loss function
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y_, logits=logits)
    loss = tf.reduce_mean(cross_entropy)
    # % of correct answers found in batch
    is_correct = tf.equal(tf.argmax(logits, 1), tf.argmax(Y_, 1))
    accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
    learning_rate = 0.00001
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_step = optimizer.minimize(loss)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        ...
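The training part elided above would then be a standard feed_dict loop, roughly like this (a sketch; X_training and Y_training are the arrays from the question, the epoch count is illustrative, and the batching assumes their length is a multiple of batch_size):
# inside the `with tf.Session() as sess:` block, after sess.run(init)
for epoch in range(10):
    for start in range(0, len(X_training), batch_size):
        batch_x = X_training[start:start + batch_size]    # shape (batch_size, 100, 100, 3)
        batch_y = Y_training[start:start + batch_size]    # one-hot labels, shape (batch_size, 2)
        _, batch_loss = sess.run([train_step, loss],
                                 feed_dict={X: batch_x, Y_: batch_y})
    print('epoch %d, last batch loss %.4f' % (epoch, batch_loss))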

Dividing my dataset (CSV format) using stratified k-fold sampling and saving the output of each fold in a separate CSV file

My dataset has around 5000 samples and 3 classes (one-hot encoded), and I am interested in creating samples using stratified k-fold. Moreover, in the end, I want to split each output file (from the k-fold) into train and test.
I tried the following suggestion from the sklearn documentation, but I want to retain the shape of my dataset.
from sklearn.model_selection import StratifiedShuffleSplit
X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])
sss = StratifiedShuffleSplit(n_splits=3, test_size=0.5, random_state=0)
sss.get_n_splits(X, y)
print(sss)
for train_index, test_index in sss.split(X, y):
print("TRAIN:", train_index, "TEST:", test_index)
X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]
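A sketch of how this could look with StratifiedKFold while keeping the original rows intact; the file name data.csv and the one-hot label columns class_0, class_1, class_2 are assumptions, as is the 80/20 train-test split:
import pandas as pd
from sklearn.model_selection import StratifiedKFold, train_test_split

df = pd.read_csv('data.csv')                               # ~5000 rows, assumed file name
label_cols = ['class_0', 'class_1', 'class_2']             # assumed one-hot label columns
y = df[label_cols].values.argmax(axis=1)                   # StratifiedKFold wants integer labels
X = df.drop(columns=label_cols).values

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (_, fold_idx) in enumerate(skf.split(X, y)):
    fold_df = df.iloc[fold_idx]                            # keeps the original columns/shape
    train_df, test_df = train_test_split(fold_df, test_size=0.2,
                                         stratify=y[fold_idx], random_state=0)
    train_df.to_csv('fold_%d_train.csv' % fold, index=False)
    test_df.to_csv('fold_%d_test.csv' % fold, index=False)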

GridSearch for MultiClass KerasClassifier

I am trying to do a grid search for a multiclass classification with Keras. Here is a section of the code:
Some properties of the data are below:
y_
array(['fast', 'immobile', 'immobile', ..., 'slow',
'immobile', 'slow'],
dtype='<U17')
y_onehot = pd.get_dummies(y_).values
y_onehot
array([[1, 0, 0],
[0, 0, 1],
[0, 0, 1],
...
[0, 1, 0],
[0, 0, 1],
[0, 1, 0]], dtype=uint8)
#Do train-test split
y_train.shape
(1904,)
y_train_onehot.shape
(1904, 3)
And the model...
# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # create model
    model = Sequential()
    model.add(Dense(2048, input_dim=X_train.shape[1], kernel_initializer=init, activation='relu'))
    model.add(Dense(512, kernel_initializer=init, activation='relu'))
    model.add(Dense(y_train_onehot.shape[1], kernel_initializer=init, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
# create model
model = KerasClassifier(build_fn=create_model, verbose=0)
# grid search epochs, batch size and optimizer
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='accuracy')
grid_result = grid.fit(X_train, y_train_onehot)
And here is the error:
--> grid_result = grid.fit(X_train, y_train_onehot)
ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets
The code was for a binary model but I am hoping to modify it for a multiclass data set. Kindly assist. Thanks!
The error is in the softmax layer.
I think you mean y_train_onehot.shape[1] instead of y_train_onehot[1]
Update 1: This is strange, but your second problem seems to be y_train_onehot. Would you mind trying 2 things:
try the same model without the onehot encoding on y_train.
if that alone doesn't work, change the loss to sparse_categorical_crossentropy
Also make sure to change y_train_onehot.shape[1] to the number of classes in the softmax layer
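A sketch combining both suggestions (integer-encoded labels plus sparse_categorical_crossentropy); X_train, param_grid, Sequential, Dense, KerasClassifier, and GridSearchCV are assumed to be the same as in the question, and y_train_int is the training portion of the integer-encoded labels:
import numpy as np
from sklearn.preprocessing import LabelEncoder

# integer-encode the string labels instead of one-hot encoding them
y_int = LabelEncoder().fit_transform(y_)       # e.g. 'fast' -> 0, 'immobile' -> 1, 'slow' -> 2
num_classes = len(np.unique(y_int))            # 3
# ... redo the train-test split on y_int so y_train_int has shape (1904,)

def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(2048, input_dim=X_train.shape[1], kernel_initializer=init, activation='relu'))
    model.add(Dense(512, kernel_initializer=init, activation='relu'))
    model.add(Dense(num_classes, kernel_initializer=init, activation='softmax'))
    # sparse_categorical_crossentropy takes integer class labels directly
    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='accuracy')
grid_result = grid.fit(X_train, y_train_int)   # integer targets avoid the multilabel-indicator error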

Resources