Tensorflow structured data model.predict() returns incorrect probabilities - python-3.x

I'm trying to follow a TensorFlow tutorial (I'm a beginner) for structured data models, with some changes along the way.
My goal is to create a model to which I provide data (in CSV format) that looks something like this (the example has only 2 features, but I want to extend it once I figure this out):
power_0,power_1,result
0.2,0.3,draw
0.8,0.1,win
0.3,0.1,draw
0.7,0.2,win
0.0,0.4,lose
I created the model using the following code:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

def get_labels(df, label, mapping):
    raw_y_true = df.pop(label)
    y_true = np.zeros((len(raw_y_true)))
    for i, raw_label in enumerate(raw_y_true):
        y_true[i] = mapping[raw_label]
    return y_true

tf.compat.v1.enable_eager_execution()

mapping_to_numbers = {'win': 0, 'draw': 1, 'lose': 2}

data_frame = pd.read_csv('data.csv')
data_frame.head()

train, test = train_test_split(data_frame, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)

train_labels = np.array(get_labels(train, label='result', mapping=mapping_to_numbers))
val_labels = np.array(get_labels(val, label='result', mapping=mapping_to_numbers))
test_labels = np.array(get_labels(test, label='result', mapping=mapping_to_numbers))

train_features = np.array(train)
val_features = np.array(val)
test_features = np.array(test)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation='sigmoid'),
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)

epochs = 10
batch_size = 100

history = model.fit(
    train_features,
    train_labels,
    epochs=epochs,
    validation_data=(val_features, val_labels))

input_data_frame = pd.read_csv('input.csv')
input_data_frame.head()
input_data = np.array(input_data_frame)

print(model.predict(input_data))
input.csv looks like this:
power_0,power_1
0.8,0.1
0.7,0.2
And the actual result is:
[[0.00604381 0.00242573 0.00440606]
 [0.01321151 0.00634229 0.01041476]]
I expected to get the probability of each label ('win', 'draw' and 'lose'). Can anyone please help me with this?
Thanks in advance

Use a softmax activation in this line: tf.keras.layers.Dense(3, activation='sigmoid').
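A minimal sketch of that change (my addition; it trims the dropout layers and assumes the two-feature CSV from the question, with its integer label mapping, so the sparse loss applies):

import tensorflow as tf

# Softmax turns the three outputs into a probability distribution that sums
# to 1, unlike three independent sigmoid outputs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(2,)),  # 2 features
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])

# Labels are the integers 0/1/2 (win/draw/lose), so use the sparse variant.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])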

This works well for me with your example:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(train_features.shape[-1],)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
(This version uses a Flatten layer as the input layer.)

I have to write my suggestions here because I can't comment yet.
@zihaozhihao is right: you have to use softmax instead of sigmoid, because you are not working with a binary problem. Another problem might be your loss function, which is:
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],
    run_eagerly=True)
Try loss='categorical_crossentropy', because you are working with a multiclass classification problem. You can read more about multiclass classification here and here.
As for your probability question: you get the probability of each class for your two test inputs. For example:
   win        draw       lose
[[0.00604381 0.00242573 0.00440606]
 [0.01321151 0.00634229 0.01041476]]
The problem is your loss function and activation function, which lead to these strange probability values. You might want to check this post for more information.
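A small illustrative sketch (my addition, not from the original answer) of why those values look strange: independent sigmoid outputs do not sum to 1, while softmax outputs do:

import numpy as np

logits = np.array([2.0, 0.5, -1.0])  # example raw scores for win/draw/lose

sigmoid = 1 / (1 + np.exp(-logits))
softmax = np.exp(logits) / np.exp(logits).sum()

print(sigmoid, sigmoid.sum())  # three independent values, the sum is not 1
print(softmax, softmax.sum())  # a proper distribution, the sum is 1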
Hope this helps a little and feel free to ask.

Since it is a multiclass classification problem, please use categorical_crossentropy instead of binary_crossentropy as the loss function, and use softmax instead of sigmoid as the activation function of the last layer.
Also, you should increase the number of epochs to get better convergence.
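One caveat (my addition, not part of the answer above): categorical_crossentropy expects one-hot encoded targets, while the labels produced by get_labels in the question are plain integers. A sketch of both options, reusing the names from the question:

# Option A: keep the integer labels (0/1/2) and use the sparse loss.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Option B: one-hot encode the labels, then categorical_crossentropy applies.
train_labels_onehot = tf.keras.utils.to_categorical(train_labels, num_classes=3)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])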

Related

tensorflow sequential model outputting nan

Why is my code outputting nan? I'm using a sequential model with a 30x1 input vector and a single value output. I'm using TensorFlow and Python. This is one of my first…
import tensorflow as tf
from tensorflow import keras

while True:
    # Define a simple sequential model
    def create_model():
        model = tf.keras.Sequential([
            keras.layers.Dense(30, activation='relu', input_shape=(30,)),
            keras.layers.Dense(12, activation='relu'),
            keras.layers.Dropout(0.2),
            keras.layers.Dense(7, activation='relu'),
            keras.layers.Dense(1, activation='sigmoid')
        ])
        model.compile(optimizer='adam',
                      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
        return model

    # Create a basic model instance
    model = create_model()

    # Display the model's architecture
    model.summary()

    train_labels = [1]
    test_labels = [1]
    train_images = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]]
    test_images = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]]

    model.fit(train_images,
              train_labels,
              epochs=10,
              validation_data=(test_images, test_labels),
              verbose=1)

    print('predicted:', model.predict(train_images))
You are using SparseCategoricalCrossentropy. It expects labels to be integers starting from 0. You have only one label, 1, but that implies at least two categories, 0 and 1, so you need at least two neurons in the last layer:
keras.layers.Dense(2, activation='sigmoid')
(If your goal is classification, you may want to consider using softmax instead of sigmoid, without from_logits=True; see the sketch below.)
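A minimal sketch of that suggestion (my addition, keeping the rest of the question's model unchanged): two output neurons with softmax, and a loss that consumes probabilities rather than logits:

import tensorflow as tf
from tensorflow import keras

def create_model():
    model = tf.keras.Sequential([
        keras.layers.Dense(30, activation='relu', input_shape=(30,)),
        keras.layers.Dense(12, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(7, activation='relu'),
        # One neuron per class (0 and 1); softmax outputs are probabilities.
        keras.layers.Dense(2, activation='softmax')
    ])
    # Softmax already yields probabilities, so from_logits stays at its
    # default of False.
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                  metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
    return model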
You're using the wrong loss function for those labels. You need to use BinaryCrossentropy.
Change:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
To:
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])

Hyperparameter optimization in images with Talos and flow_from_directory

I tried to optimize the hyperparameters of my Keras CNN for image classification. I considered using grid search from sklearn and the Talos optimizer (https://github.com/autonomio/talos). I overcame the fundamental difficulty of building x and y from flow_from_directory (code below), but it still does not work. Any ideas? Maybe someone has faced the same problem.
# Imports assumed for this snippet (older standalone Keras, matching the
# samples_per_epoch / nb_val_samples arguments used with fit_generator):
import talos as ta
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.callbacks import ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator

def talos_model(train_flow, validation_flow, nb_train_samples, nb_validation_samples, params):
    model = Sequential()
    model.add(Conv2D(6, (5, 5), activation="relu", padding="same",
                     input_shape=(img_width, img_height, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(params['dropout']))
    model.add(Conv2D(16, (5, 5), activation="relu"))
    model.add(MaxPooling2D((2, 2)))
    model.add(Dropout(params['dropout']))
    model.add(Flatten())
    model.add(Dense(120, activation='relu', kernel_initializer=params['kernel_initializer']))
    model.add(Dropout(params['dropout']))
    model.add(Dense(84, activation='relu', kernel_initializer=params['kernel_initializer']))
    model.add(Dropout(params['dropout']))
    model.add(Dense(10, activation='softmax'))

    model.compile(loss=params['loss'],
                  optimizer=params['optimizer'],
                  metrics=['categorical_accuracy'])

    checkpointer = ModelCheckpoint(filepath='talos_cnn.h5py',
                                   monitor='val_categorical_accuracy', save_best_only=True)

    history = model.fit_generator(generator=train_flow,
                                  samples_per_epoch=nb_train_samples,
                                  validation_data=validation_flow,
                                  nb_val_samples=nb_validation_samples,
                                  callbacks=[checkpointer],
                                  verbose=1,
                                  epochs=params['epochs'])
    return history, model

train_generator = ImageDataGenerator(rescale=1/255)
validation_generator = ImageDataGenerator(rescale=1/255)

# Retrieve images and their classes for train and validation sets
train_flow = train_generator.flow_from_directory(directory=train_data_dir,
                                                 batch_size=batch_size,
                                                 target_size=(img_height, img_width))

validation_flow = validation_generator.flow_from_directory(directory=validation_data_dir,
                                                           batch_size=batch_size,
                                                           target_size=(img_height, img_width),
                                                           shuffle=False)

# here I make x and y for talos
(X_train, Y_train) = train_flow.next()

# starting an experiment with talos
t = ta.Scan(x=X_train,
            y=Y_train,
            model=talos_model,
            params=params,
            dataset_name='landmarks',
            experiment_no='1')
The error occurs in the last line:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Neural network only predicts one class from binary class

My task is to detect defective items in a factory, i.e. to classify goods as defective or fine. This leads to a problem where one class dominates the other (one class is 99.7% of the data), as defective items are very rare. Training accuracy is 0.9971 and validation accuracy is 0.9970. It sounds amazing.
But the problem is that the model predicts everything as class 0, which is fine goods. That means it fails to classify any defective goods.
How can I solve this problem? I have checked other questions and tried their suggestions, but I still have the same situation. The data has 122,400 rows in total and 5 features.
In the end, my confusion matrix on the test set looks like this:
array([[30508,     0],
       [   92,     0]], dtype=int64)
which does a terrible job.
My code is as below:
# Imports assumed for this snippet:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

le = LabelEncoder()
y = le.fit_transform(y)

ohe = OneHotEncoder(sparse=False)
y = y.reshape(-1, 1)
y = ohe.fit_transform(y)

scaler = StandardScaler()
x = scaler.fit_transform(x)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=777)

# DNN modelling
epochs = 15
batch_size = 128
Learning_rate_optimizer = 0.001

model = Sequential()
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu',
                input_shape=(5,)))
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(8,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(2,
                kernel_initializer='glorot_uniform',
                activation='softmax'))

model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=Learning_rate_optimizer),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))

y_pred = model.predict(x_test)
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
Thank you
It sounds like you have a highly imbalanced dataset; the model is learning only how to classify fine goods.
You can try one of the approaches listed here:
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
The best approach would be to first take roughly equal portions of data from both classes, split them into train/validation/test sets, train the classifier, and then do thorough testing on your complete dataset. You can also apply data augmentation techniques to the minority class to get more data from the same set. Keep iterating, and perhaps also change your loss function to suit your situation; a class-weighting sketch follows below.
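One concrete tactic from that list is class weighting. A hedged sketch (my addition, reusing the variable names from the question's code; the exact weights are an assumption and should be tuned):

# Weight the rare "defective" class (index 1) much more heavily than the
# majority class (index 0). With roughly 99.7% vs 0.3% of the data, weights
# inversely proportional to class frequency are a common starting point.
class_weight = {0: 1.0, 1: 300.0}

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    class_weight=class_weight,
                    verbose=1,
                    validation_data=(x_test, y_test))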

Find Most Important Input from a Neural Network

I trained a neural network with 37 inputs. It has around 85% accuracy. Is it possible for me to find out which input has the most effect? I tried this code, but I cannot figure out how to find the most important input from it:
weights = model.layers[0].get_weights()[0]
biases = model.layers[0].get_weights()[1]
One possible solution is to wrap your model with keras.wrappers.scikit_learn and then use Recursive Feature Elimination (RFE) from scikit-learn:
# Imports assumed for this example (standalone Keras and scikit-learn):
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.feature_selection import RFE
import matplotlib.pyplot as plt

def create_model():
    # create model
    model = Sequential()
    model.add(Dense(512, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=128, verbose=0)

rfe = RFE(estimator=model, n_features_to_select=1, step=1)
rfe.fit(X, y)

# `digits` comes from sklearn's handwritten-digits example; with tabular
# inputs, adjust (or skip) the reshape to match your own feature layout.
ranking = rfe.ranking_.reshape(digits.images[0].shape)

# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()
If you need to visualize the weights, see here.
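As a usage note (my addition, not part of the answer): for the 37 tabular inputs in the question, rfe.ranking_ holds one rank per input column (1 = most important), so rather than reshaping it into an image it can be paired with the column names; feature_names below is an assumed list of those 37 names:

# `feature_names` is assumed: the 37 input column names, in the same order
# as the columns of X passed to rfe.fit(X, y).
for rank, name in sorted(zip(rfe.ranking_, feature_names)):
    print(rank, name)  # rank 1 is the most important input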

Keras translate example to use functional API

I have a working example in Keras that I want to translate to make use of the functional API. I am missing some detail: when I run the code, I get ValueError: total size of new array must be unchanged when using embeddings.
Maybe someone can see what I am doing wrong.
The working code looks like this:
EMBEDDING_DIM=100
embeddingMatrix = mu.loadPretrainedEmbedding('englishEmb100.txt', EMBEDDING_DIM, wordMap)
###############
# Build Model #
###############
model = Sequential()
model.add(Embedding(vocabSize+1, EMBEDDING_DIM, weights=[embeddingMatrix], trainable=False))
#model.add(Embedding(vocabSize+1, EMBEDDING_DIM))
model.add(Bidirectional(LSTM(EMBEDDING_DIM, return_sequences=True)))
model.add(TimeDistributed(Dense(maximal_value)))
model.add(Activation('relu'))
# try using different optimizers and different optimizer configs
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
My attempt to re-write it with the functional API:
EMBEDDING_DIM=100
embeddingMatrix = mu.loadPretrainedEmbedding('englishEmb100.txt', EMBEDDING_DIM, wordMap)
input_layer = Input(shape=(longest_sequence,), dtype='int32')
emb = Embedding(vocabSize+1, EMBEDDING_DIM, weights=[embeddingMatrix]) (input_layer)
forwards = LSTM(EMBEDDING_DIM, return_sequences=True) (emb)
backwards = LSTM(EMBEDDING_DIM, return_sequences=True, go_backwards=True) (emb)
common = merge([forwards, backwards], mode='concat', concat_axis=-1)
dense = TimeDistributed(Dense(EMBEDDING_DIM, activation='tanh')) (common)
out = TimeDistributed(Dense(len(labelMap), activation='softmax')) (dense)
model = Model(input=input_layer, output=out)
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Somewhere there is an operation that changes the size, but I am not sure where or why this is happening. I would be happy if someone could help me with this.
