How to get classification probabilities in Keras?

I am trying to get classification probabilities out of my trained Keras model but when I use the model.predict (or model.predict_proba) method, all I get is an array of this form:
array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], dtype=float32)
So basically I get a one hot encoded float array. The "1" is mostly in the right place so the training seems to have worked fine. But why can't I get the probabilities out? See code for architecture used.
First I read in the data:
mnist_train = pd.read_csv('data/mnist_train.csv')
mnist_test = pd.read_csv('data/mnist_test.csv')
mnist_train_images = mnist_train.iloc[:, 1:].values
mnist_train_labels = mnist_train.iloc[:, :1].values
mnist_test_images = mnist_test.iloc[:, 1:].values
mnist_test_labels = mnist_test.iloc[:, :1].values
mnist_train_images = mnist_train_images.astype('float32')
mnist_test_images = mnist_test_images.astype('float32')
mnist_train_images /= 255
mnist_test_images /= 255
mnist_train_labels = keras.utils.to_categorical(mnist_train_labels, 10)
mnist_test_labels = keras.utils.to_categorical(mnist_test_labels, 10)
mnist_train_images = mnist_train_images.reshape(60000,28,28,1)
mnist_test_images = mnist_test_images.reshape(10000,28,28,1)
Then I build my model and train:
num_classes = mnist_test_labels.shape[1]
model = Sequential()
model.add(Conv2D(64, (5, 5), input_shape=(28, 28, 1), activation='relu', data_format="channels_last", padding="same"))
model.add(Conv2D(64, (5, 5), input_shape=(28, 28, 1), activation='relu', data_format="channels_last", padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', data_format="channels_last", padding="same"))
model.add(Conv2D(128, (3, 3), activation='relu', data_format="channels_last", padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(mnist_train_images, mnist_train_labels, validation_data=(mnist_test_images, mnist_test_labels), epochs=20, batch_size=256, verbose=2)
scores = model.evaluate(mnist_test_images, mnist_test_labels, verbose=0)
print("CNN Error: %.2f%%" % (100-scores[1]*100))
model.save('mnist-weights.model')
model.save_weights("mnist-model.h5")
model_json = model.to_json()
with open("mnist-model.json", "w") as json_file:
    json_file.write(model_json)
But when I then load the model in another application and try to predict probabilities like this, I get the one-hot output described above. What am I doing wrong?
json_file = open('alphabet_keras/mnist_model.json', 'r')
model_json = json_file.read()
model = model_from_json(model_json)
model.load_weights("alphabet_keras/mnist_model.h5")
letter = cv2.cvtColor(someImg, cv2.COLOR_BGR2GRAY)
letter = fitSquare(letter,28,2) # proprietary function, doesn't matter
letter_expanded = np.expand_dims(letter, axis=0)
letter_expanded = np.expand_dims(letter_expanded, axis=3)
model.predict_proba(letter_expanded)#[0]
The output is as follows:
array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], dtype=float32)
I expect something like:
array([[0.1, 0.34, 0.2, 0.8, 0.1, 0.62, 0.67, 1.0, 0.31, 0.59]], dtype=float32)
There are no error messages of any kind. Please help :)

Your expected output is not correct. For classification, the output of a neural network is a probability distribution over the labels, which means that the probabilities lie between 0 and 1 and sum to 1.0. The values you show sum to more than 1.0.
As for your specific problem, it looks like the probabilities are saturated. This happens because you are not normalizing the pixel values of the new image by dividing by 255, as you did for the training and test sets. This inconsistency saturates the output neurons.
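To illustrate the saturation effect, here is a minimal sketch in plain Python (no Keras required). The logits are made-up values, and the factor of 255 is only an assumed stand-in for what happens when unnormalized 0-255 pixels are fed to a network trained on 0-1 inputs; the real effect depends on the trained weights.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for an input scaled to [0, 1], as during training.
logits = [0.5, 1.2, 0.3, 2.0]
normalized = softmax(logits)

# Assumption for illustration: raw 0-255 pixels inflate the logits by about 255x.
saturated = softmax([255.0 * v for v in logits])

print([round(p, 3) for p in normalized])  # a spread-out distribution that sums to 1
print([round(p, 3) for p in saturated])   # rounds to a one-hot vector
```

In the prediction code from the question, this would presumably mean scaling the image the same way as the training data before calling predict, for example letter_expanded.astype('float32') / 255, assuming fitSquare returns raw 0-255 values.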

Related

How to use "LeakyRelu" and Parametric Leaky Relu "PReLU" in Keras Tuner

I am using Keras Tuner with RandomSearch() to tune the hyperparameters of my regression model. I can tune over "relu" and "selu", but I am unable to do the same for Leaky ReLU. I understand that "relu" and "selu" work because string aliases exist for them, whereas no string alias is available for Leaky ReLU. I tried passing a LeakyReLU layer object instead (see my example below), but it doesn't seem to work. Can you please advise me how to do that? I have the same issue with Parametric Leaky ReLU (PReLU).
Thank you in advance!
def build_model(hp):
    model = Sequential()
    model.add(
        Dense(
            units = 18,
            kernel_initializer = 'normal',
            activation = 'relu',
            input_shape = (18, )
        )
    )
    for i in range(hp.Int( name = "num_layers", min_value = 1, max_value = 5)):
        model.add(
            Dense(
                units = hp.Int(
                    name = "units_" + str(i),
                    min_value = 18,
                    max_value = 180,
                    step = 18),
                kernel_initializer = 'normal',
                activation = hp.Choice(
                    name = 'dense_activation',
                    values=['relu', 'selu', LeakyReLU(alpha=0.01) ],
                    default='relu'
                )
            )
        )
    model.add( Dense( units = 1 ) )
    model.compile(
        optimizer = tf.keras.optimizers.Adam(
            hp.Choice(
                name = "learning_rate", values = [1e-2, 1e-3, 1e-4]
            )
        ),
        loss = 'mse'
    )
    return model
As a work-around, you can add another activation function to the tf.keras.activations.* module by modifying its source file (activations.py).
Here's the code for tf.keras.activations.relu as it appears in activations.py:
@keras_export('keras.activations.relu')
@dispatch.add_dispatch_support
def relu(x, alpha=0., max_value=None, threshold=0):
    """Applies the rectified linear unit activation function.
    With default values, this returns the standard ReLU activation:
    `max(x, 0)`, the element-wise maximum of 0 and the input tensor.
    Modifying default parameters allows you to use non-zero thresholds,
    change the max value of the activation,
    and to use a non-zero multiple of the input for values below the threshold.
    For example:
    >>> foo = tf.constant([-10, -5, 0.0, 5, 10], dtype = tf.float32)
    >>> tf.keras.activations.relu(foo).numpy()
    array([ 0., 0., 0., 5., 10.], dtype=float32)
    >>> tf.keras.activations.relu(foo, alpha=0.5).numpy()
    array([-5. , -2.5, 0. , 5. , 10. ], dtype=float32)
    >>> tf.keras.activations.relu(foo, max_value=5).numpy()
    array([0., 0., 0., 5., 5.], dtype=float32)
    >>> tf.keras.activations.relu(foo, threshold=5).numpy()
    array([-0., -0., 0., 0., 10.], dtype=float32)
    Arguments:
        x: Input `tensor` or `variable`.
        alpha: A `float` that governs the slope for values lower than the
            threshold.
        max_value: A `float` that sets the saturation threshold (the largest value
            the function will return).
        threshold: A `float` giving the threshold value of the activation function
            below which values will be damped or set to zero.
    Returns:
        A `Tensor` representing the input tensor,
        transformed by the relu activation function.
        Tensor will be of the same shape and dtype of input `x`.
    """
    return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
Copy this code and paste it just below the original. Change @keras_export('keras.activations.relu') to @keras_export('keras.activations.leaky_relu'), rename the function to leaky_relu so that it does not shadow the original relu, and change the default value of alpha to 0.2, like this:
@keras_export('keras.activations.leaky_relu')
@dispatch.add_dispatch_support
def leaky_relu(x, alpha=0.2, max_value=None, threshold=0):
    """Applies the rectified linear unit activation function.
    With default values, this returns the standard ReLU activation:
    `max(x, 0)`, the element-wise maximum of 0 and the input tensor.
    Modifying default parameters allows you to use non-zero thresholds,
    change the max value of the activation,
    and to use a non-zero multiple of the input for values below the threshold.
    For example:
    >>> foo = tf.constant([-10, -5, 0.0, 5, 10], dtype = tf.float32)
    >>> tf.keras.activations.relu(foo).numpy()
    array([ 0., 0., 0., 5., 10.], dtype=float32)
    >>> tf.keras.activations.relu(foo, alpha=0.5).numpy()
    array([-5. , -2.5, 0. , 5. , 10. ], dtype=float32)
    >>> tf.keras.activations.relu(foo, max_value=5).numpy()
    array([0., 0., 0., 5., 5.], dtype=float32)
    >>> tf.keras.activations.relu(foo, threshold=5).numpy()
    array([-0., -0., 0., 0., 10.], dtype=float32)
    Arguments:
        x: Input `tensor` or `variable`.
        alpha: A `float` that governs the slope for values lower than the
            threshold.
        max_value: A `float` that sets the saturation threshold (the largest value
            the function will return).
        threshold: A `float` giving the threshold value of the activation function
            below which values will be damped or set to zero.
    Returns:
        A `Tensor` representing the input tensor,
        transformed by the relu activation function.
        Tensor will be of the same shape and dtype of input `x`.
    """
    return K.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
You can then use the string alias for keras.activations.leaky_relu.
# Custom activation function
from keras.layers import Activation, LeakyReLU
from keras import backend as K
from keras.utils.generic_utils import get_custom_objects
## Add leaky-relu so we can use it as a string
get_custom_objects().update({'leaky-relu': Activation(LeakyReLU(alpha=0.2))})
## Main activation functions available to use
activation_functions = ['sigmoid', 'relu', 'elu', 'leaky-relu', 'selu', 'gelu',"swish"]
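Both answers rely on the same idea: hp.Choice only accepts plain values such as strings or numbers, so the tuner should pick a name and the model-building code should turn that name into the actual activation. Below is a minimal sketch of that pattern using pure-Python stand-ins so it runs without TensorFlow. ACTIVATIONS and resolve_activation are hypothetical names, and in a real build_model the 'leaky_relu' entry would presumably map to a LeakyReLU layer added after a Dense layer that has no activation.

```python
def relu(x):
    """Standard ReLU on a single float."""
    return max(x, 0.0)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU on a single float: small slope alpha for negative inputs."""
    return x if x > 0 else alpha * x

# Hypothetical registry: the tuner only ever sees the string keys.
ACTIVATIONS = {"relu": relu, "leaky_relu": leaky_relu}

def resolve_activation(name):
    """Turn a tuned string into the callable it stands for."""
    try:
        return ACTIVATIONS[name]
    except KeyError:
        raise ValueError(f"unknown activation: {name!r}")

# Stand-in for the string a tuner would return from a choice over ACTIVATIONS keys.
chosen = "leaky_relu"
activation = resolve_activation(chosen)
print([activation(v) for v in [-10.0, 0.0, 5.0]])
```

The design choice here is to keep the search space serializable (strings only) and to defer object construction to the point where the model is built.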

How is dropout implemented in Keras MobileNetV3 (ImageNet weights) during transfer learning when some layers are frozen (made untrainable)?

I am working on an image classification problem using Keras MobileNetV3 pre-trained on ImageNet. About 90% of the layers are frozen and the remaining 10% are made trainable, with a dropout rate of 0.2. I was wondering how dropout is handled in the backend in this setup.
MobileNetV3Small(input_shape=(IMG_HEIGHT, IMG_WIDTH, DEPTH),
                 alpha=1.0,
                 minimalistic=False,
                 include_top=False,
                 weights='imagenet',
                 input_tensor=None,
                 pooling='max',
                 dropout_rate=0.2)
If a Dropout layer is called with training=False, as it is when you predict, it does nothing. Let's start with some input:
import tensorflow as tf
rate = 0.4
dropout = tf.keras.layers.Dropout(rate)
x = tf.cast(tf.reshape(tf.range(1, 10), (3, 3)), tf.float32)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]], dtype=float32)>
Now, let's call the dropout layer in training mode:
dropout(x, training=True)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[ 0. , 3.3333333, 0. ],
[ 6.6666665, 8.333333 , 0. ],
[11.666666 , 13.333333 , 15. ]], dtype=float32)>
As you can see, the dropped values are set to 0 and all the remaining values are multiplied by 1/(1-p), where p is the dropout rate (here 1/(1-0.4) ≈ 1.667). Now let's call the layer with training=False:
dropout(x, training=False)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]], dtype=float32)>
Nothing happens.
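The behaviour above can be reproduced with a short pure-Python sketch of inverted dropout. This is an illustration of the technique, not the Keras implementation; the function name, the flat-list input, and the seed are assumptions made for the example.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout on a flat list of floats (illustrative sketch)."""
    if not training:
        return list(values)  # inference: the input passes through unchanged
    keep_scale = 1.0 / (1.0 - rate)  # rescale survivors so the expected sum is preserved
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

x = [float(v) for v in range(1, 10)]  # same values 1..9 as the tensor above
rng = random.Random(0)                # arbitrary seed, only for repeatability

train_out = dropout(x, rate=0.4, training=True, rng=rng)
infer_out = dropout(x, rate=0.4, training=False, rng=rng)

print(train_out)  # each entry is either 0.0 or the input times 1/(1-0.4)
print(infer_out)  # identical to x
```

Since a dropout layer has no weights of its own, freezing layers should not change this behaviour; what matters is whether the layer runs in training or inference mode.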

Why does my model always give the same result even with >93% accuracy? Result: array([[1., 0., 0.]], dtype=float32)

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
training_set = train_datagen.flow_from_directory('images',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
#Found 27659 images belonging to 3 classes.
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Convolution2D(32, kernel_size=(3,3), strides=(1,1), input_shape = (64, 64, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Convolution2D(64, (3,3), 2, activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Flatten())
classifier.add(Dense(128, activation = 'relu'))
classifier.add(Dense(3, activation = 'softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit_generator(training_set,
                         steps_per_epoch = 50,
                         epochs = 30)
test=image.load_img("./image.png",target_size=(64,64))
test=image.img_to_array(test)
import numpy as np
test=np.expand_dims(test,axis=0)
result=classifier.predict(test)
result
# result is always same as below
#array([[1., 0., 0.]], dtype=float32)
Why am I getting the same answer every time? I have increased the number of epochs, but the result is still the same. This approach worked for 2 classes, but it does not work for 3 or more classes. Alternatively, could you give me other code that can predict more than 3 classes?
Another question: how are labels assigned based on the directory structure? For example:
images---
-----cat folder
-----dog folder
-----fish folder
In the labels it will be like [0,0,...,222], so how do I know whether 0 is cat or dog?
I just tried your code with two classes (cats and dogs). I modified it to work in binary mode, in particular changing the last dense layer from softmax to sigmoid and changing the loss function accordingly.
However, I can see that you could make the following improvements:
1. Increase the resolution of the images.
2. Increase the network size (both width and depth).
3. Add validation data to check the model's performance.
It is always better to rely on validation accuracy than on training accuracy. If validation accuracy is good, your model is likely to do well on the test dataset.
I do not think you need to worry about labeling in this case: since your data is already in the respective folders, ImageDataGenerator parses that folder structure and generates the labels automatically.
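On the question of which index means which class: flow_from_directory assigns indices to the subfolder names in sorted order and exposes the mapping as the class_indices attribute of the returned iterator (training_set.class_indices in the question's code). The following plain-Python sketch mirrors that ordering without needing Keras; the folder names come from the question and the prediction vector is made up.

```python
# Subfolder names under 'images', deliberately listed out of order.
class_names = ["fish", "cat", "dog"]

# Mirror the sorted ordering: index = position of the name in sorted order.
class_indices = {name: idx for idx, name in enumerate(sorted(class_names))}

# Invert the mapping to turn a predicted index back into a folder name.
index_to_class = {idx: name for name, idx in class_indices.items()}

prediction = [0.1, 0.7, 0.2]  # made-up softmax output for one image
predicted_index = max(range(len(prediction)), key=prediction.__getitem__)

print(class_indices)                    # {'cat': 0, 'dog': 1, 'fish': 2}
print(index_to_class[predicted_index])  # dog
```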

Keras Softmax issues

I am new to Keras and am a bit confused at the moment:
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1000, input_shape = (1000,), activation='relu', kernel_initializer = 'glorot_uniform'),
        tf.keras.layers.Dense(1000, activation='relu', kernel_initializer = 'glorot_uniform'),
        tf.keras.layers.Dense(41, activation='softmax', kernel_initializer = 'glorot_uniform')
    ])
    model.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
I then call my model as follows:
model = get_compiled_model()
for i in range(10):
    model.fit(train_object, epochs=10)
    test_loss, test_acc = model.evaluate(test_object, verbose=2)
I keep getting 0 accuracy even after a lot of training. I think it is because the model is hardmaxing from the start:
for row in test_object.take(1):
    row
    print(model.predict(row[0])[0])
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0.], dtype=float32)
This behavior happens even at the beginning which is confusing since we would expect something with decimals rather than all 0's and 1's.
Any help would be appreciated. To rephrase the question: I am confused as to why the model is hardmaxing instead of softmaxing.
UPDATE: messed around with the size of the model, decreasing the size of the model gave us:
array([0.02439025, 0.02439031, 0.02439018, 0.02439029, 0.02439014, 0.02438815, 0.02439025, 0.02439022, 0.02439038, 0.02439022, 0.02439025, 0.02439038, 0.0243915 , 0.02439023, 0.02439109, 0.02438496, 0.02439068, 0.02439134, 0.02439025, 0.02439033, 0.02438724, 0.02439025, 0.02439067, 0.02439027, 0.02439025, 0.02439088, 0.02439021, 0.02439019, 0.02439023, 0.02439035, 0.02439059, 0.02439025, 0.02439438, 0.02439116, 0.02439019, 0.02439001, 0.02439013, 0.02439059, 0.02439025, 0.02439023, 0.02439026], dtype=float32)
which is the desired effect. Any idea why a larger net causes it to hardmax?
UPDATE 2:
def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(124, input_shape = (1000,), activation='relu',
                              kernel_initializer = 'glorot_uniform'),
        tf.keras.layers.Dropout(0.8),
        tf.keras.layers.Dense(256, input_shape = (1000,), activation='relu', kernel_initializer = 'glorot_uniform'),
        tf.keras.layers.Dropout(0.8),
        tf.keras.layers.Dense(41, activation='relu', kernel_initializer = 'glorot_uniform'),
        tf.keras.layers.Softmax(-1)
    ])
    model.compile(optimizer='Adam',
                  loss='categorical_crossentropy',
                  metrics=['categorical_accuracy'])
    return model
The current issue is that it converges to giving the same weight to all the options all the time.
What is train_object? It's possible you forgot to specify the targets, i.e., the y parameter of the fit function:
fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False)

keras multi output softmax model input shape

I have the following model:
from keras.layers import Activation, Input, Dense
from keras.models import Model
from keras.layers.merge import Concatenate, concatenate
input_ = Input(batch_shape=(512, 36))
x = input_
x1 = Dense(4)(x)
x2 = Dense(4)(x)
x3 = Dense(4)(x)
x4 = Dense(4)(x)
model = Model(inputs=input_, outputs=[x1, x2, x3, x4])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X, test, epochs=20, batch_size=512, verbose=2, shuffle=False, validation_data=[X, test])
My Y has the following format:
col1 col2 ... col 4
1 0 2
0 0 2
2 1 1
and is reshaped via:
y = to_categorical(Y).reshape(4, -1, 3)
However, when running the fit command, I get the following error:
ValueError: Error when checking model target: the list of Numpy arrays that
you are passing to your model is not the size the model expected. Expected
to see 4 array(s), but instead got the following list of 1 arrays:
[array([[[1., 0., 0.],
[1., 0., 0.],
[1., 0., 0.],
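Assuming Y is a NumPy array with one column per output, try building a list with one one-hot array per column. Passing num_classes=3 keeps the shape correct even if a column does not contain every class:
y = [to_categorical(Y[:, col_numb], num_classes=3) for col_numb in range(Y.shape[1])]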
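To make the expected target structure concrete, here is a small plain-Python sketch of the same idea: one list of one-hot rows per output column. The one_hot helper is a hypothetical stand-in for to_categorical, and Y holds the three sample rows from the question.

```python
def one_hot(label, num_classes):
    """Return a one-hot row for a single integer label."""
    row = [0.0] * num_classes
    row[label] = 1.0
    return row

# Sample labels from the question: one row per sample, one column per output.
Y = [[1, 0, 2],
     [0, 0, 2],
     [2, 1, 1]]
num_classes = 3

# One entry per model output, each with shape (n_samples, num_classes).
targets = [[one_hot(row[col], num_classes) for row in Y]
           for col in range(len(Y[0]))]

print(len(targets))  # number of outputs the model must have
print(targets[0])    # one-hot labels for the first output
```

The number of entries in this list has to match the number of model outputs, which is what the "Expected to see 4 array(s)" error is complaining about.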
