Large Accuracy Difference: Conv2D and ConvLSTM2D - scikit-learn

I was trying to compare Conv2D and ConvLSTM2D architecture to estimate high resolution image from low resolution ones. But the predictions showed large difference between two architectures. What is causing such erroneous predictions? Is it due to incorrect implementation of the architectures?
In case of ConvLSTM2D:
import numpy as np, scipy.ndimage, matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, ConvLSTM2D, MaxPooling2D, UpSampling2D
from sklearn.metrics import accuracy_score, confusion_matrix, cohen_kappa_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler
np.random.seed(123)
raw = np.arange(96).reshape(8,3,4)
data1 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=1, mode='nearest') #low res
print (data1.shape)
#(8, 300, 400)
data2 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=3, mode='nearest') #high res
print (data2.shape)
#(8, 300, 400)
X_train = data1.reshape(1, data1.shape[0], data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(1, data2.shape[0], data2.shape[1], data2.shape[2], 1)
model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
model.add(ConvLSTM2D(16, kernel_size=(3, 3), activation='sigmoid', padding='same',input_shape=input_shape,return_sequences=True))
model.add(ConvLSTM2D(1, kernel_size=(3, 3), activation='sigmoid', padding='same',return_sequences=True))
model.compile(loss='mse', optimizer='adam')
model.fit(X_train, Y_train,
batch_size=1, epochs=10, verbose=1)
y_predict = model.predict(X_train)
y_predict = y_predict.reshape(data1.shape[0], data1.shape[1], data1.shape[2])
slope, intercept, r_value, p_value, std_err = linregress(data2[0,:,:].reshape(-1), y_predict[0,:,:].reshape(-1))
print (r_value**2)
0.26
In case of Conv2D:
X_train = data1.reshape(data1.shape[0], data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], data2.shape[1], data2.shape[2], 1)
model = Sequential()
input_shape = (data1.shape[1], data1.shape[2], 1)
model.add(Convolution2D(64, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))
model.add(Convolution2D(1, kernel_size=(3,3), activation='sigmoid',padding='same'))
model.compile(loss='mse', optimizer='adam')
model.fit(X_train, Y_train,
batch_size=1, epochs=10, verbose=1)
y_predict = model.predict(X_train)
y_predict = y_predict.reshape(data1.shape[0], data1.shape[1], data1.shape[2])
slope, intercept, r_value, p_value, std_err = linregress(data2[0,:,:].reshape(-1), y_predict[0,:,:].reshape(-1))
print (r_value**2)
0.93

Two important things may be severely affecting the results:
You have 64 Conv2D filters against 16 ConvLSTM2D filters
The LSTM layer is trying to understand a "movie" with all images in sequence, this is certainly more complicated than just processing individual images.
For the second point, you may try a shape of (8,1,300,400,1) instead. This will eliminate the time steps (and should make the LSTM work exactly as the Conv2D if we understand them correctly). But then this is useless as a recurrent layer.
Unfortunately, this is the only way to "compare" them. The LSTM layers are good for "movies" (images are frames in sequence), but this does not seem to be your case.

Related

Why does my model predict the same label?

I am training a small network and the training seems to go fine, the val loss decreases, I reach validation accuracy around 80, and it actually stops training once there is no more improvement (patience=10). It trained for 40 epochs. However, it keeps predicting only one class for every test image! I tried to initialize the conv layers randomly, I added regularizers, I switched from Adam to SGD, I added clipvalue, I added dropouts. I also switched to softmax (I have only two labels but I saw some recommendation on using softmax and Dense layer with 2 neurons). Some or one of these helped with the overfitting, but nothing worked for the prediction problem. The data is balanced, though it is a small dataset, so it doesn't make sense that it reaches 80% if it predicts the same labels for evaluation set as well.
What is wrong with my model and how can I fix it? Any comments are welcome.
#Import some packages to use
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
import os
from keras.regularizers import l2
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.initializers import RandomNormal
os.environ["CUDA_VISIBLE_DEVICES"]="0"
epochs = 200
callbacks = []
#schedule = None
decay = 0.0
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
mcp_save = ModelCheckpoint('.mdl_wts.hdf5', save_best_only=True, monitor='val_loss', mode='min')
reduce_lr_loss = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1, epsilon=1e-5, mode='min')
train_dir = '/home/d/Desktop/s/data/train'
eval_dir = '/home/d/Desktop/s/data/eval'
test_dir = '/home/d/Desktop/s/data/test'
# create a data generator
train_datagen = ImageDataGenerator(rescale=1./255, #Scale the image between 0 and 1
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,)
val_datagen = ImageDataGenerator(rescale=1./255) #We do not augment validation data. we only perform rescale
test_datagen = ImageDataGenerator(rescale=1./255) #We do not augment validation data. we only perform rescale
# load and iterate training dataset
train_generator = train_datagen.flow_from_directory(train_dir, target_size=(224,224),class_mode='categorical', batch_size=16, shuffle='True', seed=42)
# load and iterate validation dataset
val_generator = val_datagen.flow_from_directory(eval_dir, target_size=(224,224),class_mode='categorical', batch_size=16, shuffle='True', seed=42)
# load and iterate test dataset
test_generator = test_datagen.flow_from_directory(test_dir, target_size=(224,224), class_mode=None, batch_size=1, shuffle='False', seed=42)
#We will use a batch size of 32. Note: batch size should be a factor of 2.***4,8,16,32,64...***
#batch_size = 4
#from keras import layers
from keras import models
from keras import optimizers
#from keras.layers import Dropout
#from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array, load_img
model = models.Sequential()
model.add(Conv2D(64, (3, 3), activation='relu', name='block1_conv1', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05), input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu', name='block1_conv2', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(128, (3, 3), activation='relu', name='block2_conv1', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(Conv2D(128, (3, 3), activation='relu', name='block2_conv2',kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(MaxPooling2D((2, 2), name='block2_pool'))
model.add(Dropout(0.2))
model.add(Conv2D(256, (3, 3), activation='relu', name='block3_conv1', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(Conv2D(256, (3, 3), activation='relu', name='block3_conv2', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(Conv2D(256, (3, 3), activation='relu', name='block3_conv3', kernel_initializer=RandomNormal(
mean=0.0, stddev=0.05), bias_initializer=RandomNormal(mean=0.0, stddev=0.05)))
model.add(MaxPooling2D((2, 2), name='block3_pool'))
model.add(Dropout(0.2))
#model.add(layers.Conv2D(512, (3, 3), activation='relu', name='block4_conv1'))
#model.add(layers.Conv2D(512, (3, 3), activation='relu', name='block4_conv2'))
#model.add(layers.Conv2D(512, (3, 3), activation='relu', name='block4_conv3'))
#model.add(layers.MaxPooling2D((2, 2), name='block4_pool'))
model.add(Flatten())
model.add(Dense(256, kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01), activation='relu', kernel_initializer='he_uniform'))
model.add(Dropout(0.5))
model.add(Dense(2, kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01), activation='softmax'))
#Lets see our model
model.summary()
#We'll use the RMSprop optimizer with a learning rate of 0.0001
#We'll use binary_crossentropy loss because its a binary classification
#model.compile(loss='binary_crossentropy', optimizer=optimizers.SGD(lr=1e-5, momentum=0.9), metrics=['acc'])
model.compile(loss='categorical_crossentropy',
#optimizer=optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=decay),
optimizer=optimizers.SGD(lr= 0.0001, clipvalue = 0.5, decay=1e-6, momentum=0.9, nesterov=True),
metrics=['accuracy'])
#The training part
#We train for 64 epochs with about 100 steps per epoch
history = model.fit_generator(train_generator,
steps_per_epoch=train_generator.n // train_generator.batch_size,
epochs=epochs,
validation_data=val_generator,
validation_steps=val_generator.n // val_generator.batch_size,
callbacks=[earlyStopping, mcp_save]) #, reduce_lr_loss])
#Save the model
model.save_weights('/home/d/Desktop/s/categorical_weights.h5')
model.save('/home/d/Desktop/s/categorical_model_keras.h5')
#lets plot the train and val curve
#get the details form the history object
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
#Train and validation accuracy
plt.plot(epochs, acc, 'b', label='Training accuracy')
plt.plot(epochs, val_acc, 'r', label='Validation accuracy')
plt.title('Training and Validation accuracy')
plt.legend()
plt.figure()
#Train and validation loss
plt.plot(epochs, loss, 'b', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and Validation loss')
plt.legend()
plt.show()
model.evaluate_generator(generator=val_generator, steps=val_generator.n // val_generator.batch_size)
STEP_SIZE_TEST=test_generator.n//test_generator.batch_size
test_generator.reset()
pred=model.predict_generator(test_generator,
steps=STEP_SIZE_TEST,
verbose=1)
predicted_class_indices=np.argmax(pred,axis=1)
labels = (train_generator.class_indices)
np.save('/home/d/Desktop/s/classes', labels)
labels = dict((v,k) for k,v in labels.items())
predictions = [labels[k] for k in predicted_class_indices]
filenames=test_generator.filenames
results=pd.DataFrame({"Filename":filenames,
"Predictions":predictions})
results.to_csv("categorical_results.csv",index=False)
One of the problems that could lead to such behavior is imbalanced dataset. Your model found out that if it predicts the dominant class each time, it would get a good results.
There are many ways to tackle an imbalance dataset. Here is a good tutorial.
One of the easiest yet powerful solution is to apply higher penalty to your loss if it wrongly predicted the smaller class. This can be implemented in keras by setting the parameter class_weight in the fitor fit_generator function.
It can be a dictionary of example:
class_weight = {0: 0.75, 1: 0.25} # does not necessarily add to up 1.
history = model.fit_generator(train_generator,
steps_per_epoch=train_generator.n // train_generator.batch_size,
epochs=epochs,
class_weight= class_weight, # this is the important part
validation_data=val_generator,
validation_steps=val_generator.n // val_generator.batch_size,
callbacks=[earlyStopping, mcp_save]) #, reduce_lr_loss])
Adding to Coderji's answer, it might also prove advantageous to counter class imbalance using stratified k-fold cross-validation, with k = 5 being common practice. This basically splits your data set up into k splits like regular cross-validation, but also stratifies these splits. In the case of class imbalance, each of these splits contain over-/undersampled classes compensating for their lower/higher occurence within the data set.
As of yet Keras does not have it's own way to use stratified k-fold cross-validation. Instead it's advised to use sklearn's StratifiedKFold. This article gives a detailed overview how to achieve this in Keras,
with the gist of it being:
from sklearn.model_selection import StratifiedKFold# Instantiate the cross validator
skf = StratifiedKFold(n_splits=kfold_splits, shuffle=True)# Loop through the indices the split() method returns
for index, (train_indices, val_indices) in enumerate(skf.split(X, y)):
print "Training on fold " + str(index+1) + "/10..." # Generate batches from indices
xtrain, xval = X[train_indices], X[val_indices]
ytrain, yval = y[train_indices], y[val_indices] # Clear model, and create it
model = None
model = create_model()
# Debug message I guess
# print "Training new iteration on " + str(xtrain.shape[0]) + " training samples, " + str(xval.shape[0]) + " validation samples, this may be a while..."
history = train_model(model, xtrain, ytrain, xval, yval)
accuracy_history = history.history['acc']
val_accuracy_history = history.history['val_acc']
print "Last training accuracy: " + str(accuracy_history[-1]) + ", last validation accuracy: " + str(val_accuracy_history[-1])
create_model() returns a compiled Keras model
train_model() returns last history object of its last model.fit() operation

ValueError: Input arrays should have the same number of samples as target arrays LSTM Keras

Here is the part for data preparing - I just want the data to be in the correct shape
x_train , x_test, y_train, y_test = train_test_split(input_data, y , test_size = 0.2 , random_state = 33)
print(x_train.shape)
print(y_train.shape)
(200, 3)
(200, 1)
#Converting them into numpy arrays
input_x_train = x_train.as_matrix()
input_y_train = y_train.as_matrix()
print(input_x_train.shape)
print(input_y_train.shape)
(200, 3)
(200, 1)
input_x_test = x_test.as_matrix()
input_y_test = y_test.as_matrix()
print(input_x_test.shape)
print(input_y_test.shape)
(51, 3)
(51, 1)
#Reshaping into LSTM input format
input_x = input_x_train.reshape((1, input_x_train.shape[0], input_x_train.shape[1]))
print(input_x.shape)
(1, 200, 3)
Then I built my model like this:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers.recurrent import LSTM
from keras.layers.normalization import BatchNormalization
model = Sequential()
model.add(LSTM(32, input_shape=(200, 3)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(input_x, input_y_train, epochs=1, batch_size=16)
But I am getting this error
ValueError: Input arrays should have the same number of samples as
target arrays. Found 1 input samples and 200 target samples.
The input_shape parameter should not include your batch size. Given that each sample of yours dataset has three features, you should set input_shape=(3,).
In addition, you should not reshape your batch to (1, batch_size, 3). I'm not sure why you're doing that, but as far as I can tell that will break things. Remove that line entirely.

Multi-layer autoencoder using keras, specifying different optimizers

Currently I'm trying to implement a multi-layer autoencoder using Keras, working on the Mnist dataset (handwritten digits). My code is looking like this:
from keras.layers import Input, Dense, initializers
from keras.models import Model
import numpy as np
from Dataset import Dataset
import matplotlib.pyplot as plt
from keras import optimizers, losses
from keras import backend as K
import tensorflow as tf
from keras.callbacks import TensorBoard
from keras.layers import Dropout
from keras.models import Sequential
from keras import models
from keras import layers
import keras
from keras.optimizers import Adam
#global variables
d = Dataset()
num_features = d.X_train.shape[1]
low_dim = 32
def autoencoder(epochs):
w = initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)
model = Sequential()
#First autoencoder
model.add(Dense(400, activation='relu', kernel_initializer=w, input_dim=num_features, name='hidden'))
model.add(Dropout(0.2))
model.add(Dense(num_features, activation='sigmoid', input_dim = 400, name = 'output'))
#Second autoencoder
model.add(Dense(100, activation='relu', kernel_initializer=w, input_dim=num_features, name='hidden2'))
model.add(Dropout(0.2))
model.add(Dense(num_features, activation = 'sigmoid', input_dim = 100, name='output2'))
#Third autoencoder
model.add(Dense(50, activation='relu', kernel_initializer=w, input_dim=num_features, name='hidden3'))
model.add(Dropout(0.2))
model.add(Dense(num_features, activation='sigmoid', input_dim=10, name='output3'))
model.compile(optimizer=Adam(lr=0.01), loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(d.X_train, d.X_train,
epochs=epochs,
batch_size=64,
shuffle=True,
validation_data=(d.X_test, d.X_test))
model.test_on_batch(d.X_test, d.X_test)
print(history.history.keys())
plt.plot(history.history['acc'])
print(history.history['acc'])
plt.show()
return model
def finding_index():
elements, index = np.unique(d.Y_test, return_index = True)
return elements, index
def plotting():
ae = autoencoder(2)
elements, index = finding_index()
y_proba = ae.predict(d.X_test)
plt.figure(figsize=(20, 4))
#size = 20
for i in range(len(index)):
ax = plt.subplot(2, len(index), i + 1)
plt.imshow(d.X_test[index[i]].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
ax = plt.subplot(2, len(index), i + 1 + len(index))
plt.imshow(y_proba[index[i]].reshape(28, 28))
plt.gray()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.show()
plotting()
I have two questions, is it supposed to be like this when you stack autoencoders or should I let one layer reduce dimensions to let's say 400 and then the next to a 100 and so on, or the way I have done it? The second one is, can you different optimizers (in my case Adam) for different layers? I would like to use SGD (stochastic gradient descent) for the last layer. Thanks in advance!
You should not do it the way you've done it, but the way you described it in the question. Also you should go down first and then up again (e.g 400, 100, 50, 25, 10, 25, 50, 100, 400) in granular steps.
For the second question is the answer that it depends. You could train the model with Adam first and then freeze all but the last layer to train this further with SGD. But you can't tell Keras to use different classifiers for different layers.

How to solve this problem of Memory error?

So I have this error message that ruins all the fun with my work:
Traceback (most recent call last):
File "C:\Python\Python36\Scripts\Masterarbeit-1308\CNN - Kopie.py", line 97, in <module>
model.fit(np.asarray(X_train), np.asarray(Y_train), batch_size=32, epochs=100, verbose=1, validation_data=(np.asarray(X_test), np.asarray(Y_test)))
File "C:\Users\\****\AppData\Roaming\Python\Python36\site-packages\numpy\core\numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
MemoryError
Does anyone has a solution for this?
I work on a machine i7 7th generation with 16 GB RAM.
To explain more, That's my code, It take al list of arrays (.npy) converted from sounds spectograms to .npy and saved in Input-CNN:
import os, numpy as np
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Flatten, Conv2D, Dropout, Dense
from keras.layers.normalization import BatchNormalization
import tensorflow as tf
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
from keras.utils import to_categorical
folder = 'D:\InputCNN - Copie'
folder1 = 'C:\Python\Python36\Scripts\Masterarbeit-1308\Data'
from keras import backend as K
My_Data = os.listdir(folder)
num_data= len(My_Data)
Classnames = os.listdir(folder1)
class_num = len(Classnames)
arr =[np.load(os.path.join(folder, filename), fix_imports=True) for filename in os.listdir(folder)]
labels = np.ones((num_data,))
labels[0:31]= 0
labels[31:80] = 1
labels[80:128] = 2
labels[128:131] = 3
labels[131:143] = 4
labels[143:157] = 5
labels[157:209] = 6
labels[209:] = 7
Y = to_categorical(labels,class_num)
x, y = shuffle(arr, Y, random_state=2)
dataset = tf.data.Dataset.from_tensor_slices(My_Data)
X_train, X_test, Y_train, Y_test = train_test_split(x, Y, test_size=0.2)
##
def build_model(idx,X,Y,nb_classes):
K.set_image_data_format('channels_last')
nb_filters = 64 # number of convolutional filters to use
pool_size = (2, 2) # size of pooling area for max pooling
kernel_size = (3, 3) # convolution kernel size
nb_layers = 4
input_shape = (X[idx].shape[1], X[idx].shape[2], X[idx].shape[3])
model = Sequential()
model.add(Conv2D(nb_filters, kernel_size, padding='valid', input_shape=input_shape))
model.add(BatchNormalization(axis=1))
model.add(Activation('relu'))
for layer in range(nb_layers-1):
model.add(Conv2D(nb_filters, kernel_size, padding='valid', input_shape=input_shape))
model.add(BatchNormalization(axis=1))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(Dense(nb_classes, activation='sigmoid'))
return model
for idx in range(len(X_train)-1):
model = build_model(idx,X_train,Y_train, class_num)
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer='adadelta',
metrics=['accuracy'])
model.fit(np.array(X_train), np.array(Y_train), batch_size=8, epochs=100, verbose=1, validation_data=(np.array(X_test), np.array(Y_test))) #Here I have the problem
score = model.evaluate(np.array(X_test), np.array(Y_test), verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
The model fit function is the problem in my code, that should train my preconfigured model and returns an history object (A record of the training). I tried np.array and np.asarray and I got the same error message.
If someone think that the model`s summary can be helpful, I'll post it.
I solved this issue. Actually I changed the shape of my data in the list "X_train" (from (218,128,740,1) to (128,740,1)).
I found that, thanks to Keras, it will add automatically another axis with the number of my data injected to the network, and np.asarray works well even with more data.

MLP classifier_for multi class

I am newbie on keras,
I try to follow the Keras tutorial for Multilayer Perceptron (MLP) for multi-class softmax classification, using my data set.
My data has 3 classes and only one feature, but I don't understand why the result always show just 0,3 of accuracy and the model predicted all training data as first class. then the confusion matrix is like this.
Confusion matrix
Here the coding:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
import pandas as pd
import numpy as np
# Importing the dataset
dataset = pd.read_csv('StatusAll.csv')
X = dataset.iloc[:, 1:].values
y = dataset.iloc[:, 0:1].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='tanh', input_dim=1))
model.add(Dropout(0.5))
model.add(Dense(64, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
history = model.fit(x_train, y_train,
epochs=100,
batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
print('Test score:', score[0])
print('Test accuracy:', score[1])
from sklearn import metrics
prediction = model.predict(x_test)
prediction = np.around(prediction)
y_test_non_category = [ np.argmax(t) for t in y_test ]
y_predict_non_category = [ np.argmax(t) for t in prediction ]
from sklearn.metrics import confusion_matrix
conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)
print (conf_mat)
I hope I can get some advice, thanksss.
The x_train example
x_train
y_train before converted to categorical
enter image description here
Your final Dense layer has 4 outputs, it seems like you are classifying 4 instead of 3.
model.add(Dense(3, activation='softmax')) # Number of classes 3
It would be helpful to see sample data from x_train and y_train to make sure the pre-processing is correct. Because you have only 1 feature, a MLP might be overkill. A decision tree would be simpler unless you want to experiment with MLPs.

Resources