Training Keras MobileNetV2 on CIFAR-100 (from scratch)

I want to train MobileNetV2 from scratch on CIFAR-100, but the model just stops learning after a while.
My code is below. I would like to see at least 60-70% validation accuracy, and I wonder whether I have to pre-train on ImageNet or whether the problem is that CIFAR-100 images are only 32x32x3.
Due to some restrictions, I am using Keras 2.2.4 with TensorFlow 1.12.0.
from keras.applications.mobilenet_v2 import MobileNetV2
[..]
(x_train, y_train), (x_test, y_test) = cifar100.load_data()
x_train = x_train / 255
x_test = x_test / 255
y_train = np_utils.to_categorical(y_train, 100)
y_test = np_utils.to_categorical(y_test, 100)
input_tensor = Input(shape=(32,32,3))
x = MobileNetV2(include_top=False,
weights=None,
classes=100)(input_tensor)
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(100, activation='softmax')(x)
model = Model(inputs=[input_tensor], outputs=[preds])
optimizer = Adam(lr=1e-3)
model.compile(loss="categorical_crossentropy",
optimizer=optimizer,
metrics=['accuracy'])
epochs = 300
batch_size = 64
callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=np.sqrt(0.1), cooldown=0, patience=10, min_lr=1e-6)]
generator = ImageDataGenerator(rotation_range=15,
width_shift_range=5. / 32,
height_shift_range=5. / 32,
horizontal_flip=True)
generator.fit(x_train)
model.fit_generator(generator.flow(x_train, y_train),
validation_data=(x_test, y_test),
steps_per_epoch=(len(x_train) // batch_size),
epochs=epochs, verbose=1,
callbacks=callbacks)

MobileNets, like the other ImageNet-based models, downsample the image 5 times (224 -> 7) and then apply GlobalAveragePooling2D followed by the output layers.
I think feeding 32x32 images to these models directly won't give you a good result, because the feature map is already 1x1 before GlobalAveragePooling2D.
You could try resizing the images to something like 96x96, or removing the first stride=2. For reference, the NASNet paper uses 4 poolings in both the CIFAR and ImageNet versions, and only the ImageNet version has stride=2 in the first convolution layer.
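The downsampling arithmetic in this answer can be checked with a few lines of plain Python. This is only a sketch: final_feature_map_size is a hypothetical helper, and it assumes each of the five stages halves the spatial size with 'same'-style rounding up.

```python
def final_feature_map_size(input_size, num_stride2_stages=5):
    """Spatial size left after repeated stride-2 downsampling (rounding up)."""
    size = input_size
    for _ in range(num_stride2_stages):
        size = (size + 1) // 2  # one stride-2 stage halves the side length
    return size


for side in (224, 96, 32):
    print(side, "->", final_feature_map_size(side))
# 224 -> 7
# 96 -> 3
# 32 -> 1
```

With only four stride-2 stages (the first stride removed), a 32x32 input would still leave a 2x2 map for the pooling layer to average.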

Related

Calculate gradient of validation error w.r.t inputs using Keras/Tensorflow or autograd

I need to calculate the gradient of the validation error w.r.t inputs x. I'm trying to see how much the validation error changes when I perturb one of the training samples.
The validation error (E) explicitly depends on the model weights (W).
The model weights explicitly depend on the inputs (x and y).
Therefore, the validation error implicitly depends on the inputs.
I'm trying to calculate the gradient of E w.r.t x directly.
An alternative approach would be to calculate the gradient of E w.r.t W (can easily be calculated) and the gradient of W w.r.t x (cannot do at the moment), which would allow the gradient of E w.r.t x to be calculated.
I have attached a toy example. Thanks in advance!
import numpy as np
import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
import tensorflow as tf
from autograd import grad
train_images = mnist.train_images()
train_labels = mnist.train_labels()
test_images = mnist.test_images()
test_labels = mnist.test_labels()
# Normalize the images.
train_images = (train_images / 255) - 0.5
test_images = (test_images / 255) - 0.5
# Flatten the images.
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
# Build the model.
model = Sequential([
Dense(64, activation='relu', input_shape=(784,)),
Dense(64, activation='relu'),
Dense(10, activation='softmax'),
])
# Compile the model.
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
# Train the model.
model.fit(
train_images,
to_categorical(train_labels),
epochs=5,
batch_size=32,
)
model.save_weights('model.h5')
# Load the model's saved weights.
# model.load_weights('model.h5')
calculate_mse = tf.keras.losses.MeanSquaredError()
test_x = test_images[:5]
test_y = to_categorical(test_labels)[:5]
train_x = train_images[:1]
train_y = to_categorical(train_labels)[:1]
train_y = tf.convert_to_tensor(train_y, np.float32)
train_x = tf.convert_to_tensor(train_x, np.float64)
with tf.GradientTape() as tape:
    tape.watch(train_x)
    model.fit(train_x, train_y, epochs=1, verbose=0)
    valid_y_hat = model(test_x, training=False)
    mse = calculate_mse(test_y, valid_y_hat)
de_dx = tape.gradient(mse, train_x)
print(de_dx)
# approach 2 - does not run
def calculate_validation_mse(x):
    model.fit(x, train_y, epochs=1, verbose=0)
    valid_y_hat = model(test_x, training=False)
    mse = calculate_mse(test_y, valid_y_hat)
    return mse
train_x = train_images[:1]
train_y = to_categorical(train_labels)[:1]
validation_gradient = grad(calculate_validation_mse)
de_dx = validation_gradient(train_x)
print(de_dx)
Here's how you can do this. The code applies the chain rule through a single gradient step on the training sample.
A few things to note:
I reduced the feature size from 784 to 256 because I was running out of memory in Colab (see f = 256 in the code). Some memory profiling may be needed to find out why.
I only computed gradients for the first layer. This is easily extendable to other layers.
Disclaimer: this derivation is correct to the best of my knowledge. Please do some research and verify that this is the case. You will run into memory issues for larger inputs and layer sizes.
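The derivation itself is not reproduced in the text, so the following is a reconstruction from the code rather than the answerer's own formula. It assumes one plain gradient-descent step with learning rate \eta (the code's lr), not the Adam update the model is compiled with, and it evaluates the validation gradient at the current weights, as the code does.

```latex
\begin{aligned}
W' &= W - \eta \, \frac{\partial E_{tr}(x, W)}{\partial W} \\
\frac{\partial E_{val}}{\partial x}
  &= \frac{\partial E_{val}}{\partial W'} \, \frac{\partial W'}{\partial x}
   \approx -\eta \, \frac{\partial E_{val}}{\partial W} \,
      \frac{\partial^{2} E_{tr}}{\partial W \, \partial x}
\end{aligned}
```

In the code, grads_1 corresponds to -\eta \, \partial^2 E_{tr} / (\partial W \, \partial x) and grads_2 to \partial E_{val} / \partial W, with W restricted to the first layer's kernel.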
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
import tensorflow as tf
f = 256
model = Sequential([
Dense(64, activation='relu', input_shape=(f,)),
Dense(64, activation='relu'),
Dense(10, activation='softmax'),
])
# Compile the model.
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'],
)
w = model.weights[0]
# Inputs and labels
x_tr = tf.Variable(np.random.normal(size=(1,f)), shape=(1, f), dtype='float32')
y_tr = np.random.choice([0,1,2,3,4,5,6,7,8,9], size=(1,1))
y_tr_onehot = tf.keras.utils.to_categorical(y_tr, num_classes=10).astype('float32')
x_v = tf.Variable(np.random.normal(size=(1,f)), shape=(1, f), dtype='float32')
y_v = np.random.choice([0,1,2,3,4,5,6,7,8,9], size=(1,1))
y_v_onehot = tf.keras.utils.to_categorical(y_v, num_classes=10).astype('float32')
# In the context of GradientTape
with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        y_tr_pred = model(x_tr)
        tr_loss = tf.keras.losses.MeanSquaredError()(y_tr_onehot, y_tr_pred)
    tmp_g = tape2.gradient(tr_loss, w)
    print(tmp_g.shape)
# d(dE_tr/d(theta))/dx
# Warning this step consumes lot of memory for large layers
lr = 0.001
grads_1 = -lr * tape1.jacobian(tmp_g, x_tr)
with tf.GradientTape() as tape3:
    y_v_pred = model(x_v)
    v_loss = tf.keras.losses.MeanSquaredError()(y_v_onehot, y_v_pred)
# dE_val/d(theta)
grads_2 = tape3.gradient(v_loss, w)[tf.newaxis, :]
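# Collapse the dimensions to get the final desired shape of (1, f).
# grads_1 has shape (f, 64, 1, f); with a single training sample it flattens to
# (f*64, f) in the same (f, 64) order as grads_2, so the two operands line up.
grad = tf.matmul(tf.reshape(grads_2, [1, -1]), tf.reshape(grads_1, [-1, f]))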

fit_generator issue using Neural Structured learning

I have spent two days trying to use Neural Structured Learning with my CNN model. I use ImageDataGenerator and flow_from_directory, and when I call model.fit_generator I get this error message:
ValueError:
When passing input data as arrays, do not specify
steps_per_epoch/steps argument. Please use batch_size instead.
I use Keras 2.3.1 with TensorFlow 2.0 as the backend.
This is a snippet of my code:
num_classes = 4
img_rows, img_cols = 224, 224
batch_size = 16
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=30,
width_shift_range=0.3,
height_shift_range=0.3,
horizontal_flip=True,
fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_rows, img_cols),
batch_size=batch_size, shuffle=True,
class_mode='categorical')
validation_generator = validation_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_rows, img_cols),
batch_size=batch_size, shuffle=True,
class_mode='categorical')
def vgg():
    model1 = Sequential([ ])
    return model1
base_model = vgg()
I adapted the generated data from the (x, y) format to a dictionary format:
def convert_training_data_generator():
    for x ,y in train_generator:
        return {'feature': x, 'label':y}
def convert_testing_data_generator():
    for x ,y in validation_generator:
        return {'feature': x, 'label': y}
adv_config = nsl.configs.make_adv_reg_config(multiplier=0.2, adv_step_size=0.05)
model = nsl.keras.AdversarialRegularization(base_model, adv_config=adv_config)
train= convert_training_data_generator()
test= convert_testing_data_generator()
history = model.fit_generator(train,
steps_per_epoch= nb_train_samples // batch_size,
epochs = epochs,
callbacks = callbacks,
validation_data = test,
validation_steps = nb_validation_samples // batch_size)
I think this is the same error. Consider using model.fit() instead; in that case you pass your training inputs, your training labels and the batch_size. To figure out the difference between fit and fit_generator, you can follow that link.
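One likely contributor to the ValueError, judging only from the posted code: both convert_* functions use return inside the loop, so each call hands back a single dict of arrays instead of a generator, which matches the "input data as arrays" wording of the message. The sketch below shows the difference in plain Python. fake_batches is a hypothetical stand-in for the directory iterator, and whether AdversarialRegularization accepts such a generator through fit_generator in this version is not confirmed here.

```python
def fake_batches():
    """Hypothetical stand-in for a Keras iterator: yields (x, y) batches forever."""
    batch_id = 0
    while True:
        yield [batch_id] * 2, [batch_id % 4] * 2
        batch_id += 1


def convert_with_return(batches):
    # Mirrors the question: `return` exits on the first batch,
    # so the caller receives one dict, not a generator.
    for x, y in batches:
        return {'feature': x, 'label': y}


def convert_with_yield(batches):
    # `yield` makes this a generator that produces one dict per batch.
    for x, y in batches:
        yield {'feature': x, 'label': y}


single = convert_with_return(fake_batches())
stream = convert_with_yield(fake_batches())
print(type(single).__name__)  # dict
print(type(stream).__name__)  # generator
```

Only the yield version keeps producing new batches when the training loop asks for the next step.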

Keras fit_generator() not working due to shape error

I am running MNIST prediction using Keras with the TensorFlow backend.
I have code that trains batch by batch using Keras fit():
(X_train, y_train), (X_test, y_test) = mnist.load_data()
N1 = X_train.shape[0]
N2 = X_test.shape[0]
h = X_train.shape[1]
w = X_train.shape[2]
num_pixels = h*w
# reshape N1 samples to num_pixels
x_train = X_train.reshape(N1, num_pixels).astype('float32') # shape is now (60000,784)
x_test = X_test.reshape(N2, num_pixels).astype('float32') # shape is now (10000,784)
x_train = x_train / 255
x_test = x_test / 255
y_train = np_utils.to_categorical(y_train) #(60000,10)
y_test = np_utils.to_categorical(y_test) # (10000,10):
num_classes = y_test.shape[1]
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
    model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = baseline_model()
batch_size = 200
epochs = 20
max_batches = 2 * len(x_train) / batch_size # 2*60000/200
# reshape to be [samples][width][height][ channel] for ImageDataGenerator
x_t = X_train.reshape(N1, w, h, 1).astype('float32')
datagen = ImageDataGenerator(rescale= 1./255)
train_gen = datagen.flow(x_t, y_train, batch_size=batch_size)
for e in range(epochs):
    batches = 0
    for x_batch, y_batch in train_gen:
        # x_batch is of size [batch_sz,w,h,ch]: resize to [bth_sz,pixel_sz]: (200,28,28,1)-> (200,784)
        # for model.fit
        x_batch = np.reshape(x_batch, [-1, num_pixels])
        model.fit(x_batch, y_batch,validation_split=0.15,verbose=0)
        batches += 1
        print("Epoch %d/%d, Batch %d/%d" % (e+1, epochs, batches, max_batches))
        if batches >= max_batches:
            break
scores = model.evaluate(x_test, y_test, verbose=0)
However, when I try to implement similar code using fit_generator(), I get an error.
The code is below:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# separate data into train and validation
from sklearn.model_selection import train_test_split
# Split the data
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.15, shuffle= True)
# number of training samples
N1 = X_train.shape[0] # training size
N2 = X_test.shape[0] # test size
N3 = X_valid.shape[0] # valid size
h = X_train.shape[1]
w = X_train.shape[2]
num_pixels = h*w
y_train = np_utils.to_categorical(y_train)
y_valid = np_utils.to_categorical(y_valid)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
    model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = baseline_model()
batch_size = 200
epochs = 20
steps_per_epoch_tr = int(N1/ batch_size) # 51000/200
steps_per_epoch_val = int(N3/batch_size)
# reshape to be [samples][width][height][ channel] for ImageData Gnerator->datagen.flow
x_t = X_train.reshape(N1, w, h, 1).astype('float32')
x_v = X_valid.reshape(N3, w, h, 1).astype('float32')
# define data preparation
datagen = ImageDataGenerator(rescale=1./255) # scales x_t/x_v
train_gen = datagen.flow(x_t, y_train, batch_size=batch_size)
valid_gen = datagen.flow(x_v,y_valid, batch_size=batch_size)
model.fit_generator(train_gen,steps_per_epoch = steps_per_epoch_tr,validation_data = valid_gen,
validation_steps = steps_per_epoch_val,epochs=epochs)
This gives an error about the expected input dimensions, but I am not sure where or how to fix it. Any help is greatly appreciated.
Thanks
sedy
In the model.fit() case, this line flattened the input before feeding it for training.
x_batch = np.reshape(x_batch, [-1, num_pixels])
But in the generator case, nothing flattens the input before it reaches the first Dense layer, so a model that expects batches of shape (batch, 784) receives batches of shape (batch, 28, 28, 1). Adding a Flatten() layer to the model should do the trick, as shown below.
def baseline_model():
    # create model (requires: from keras.layers import Flatten)
    model = Sequential()
    # Flatten (28, 28, 1) images to 784-vectors inside the model
    model.add(Flatten(input_shape=(28,28,1)))
    # input_dim is no longer needed here; Flatten already fixes the input size
    model.add(Dense(num_pixels, kernel_initializer='normal', activation='relu'))
    model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
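The mismatch can be seen from the shapes alone. The following is a small sketch in plain Python: flattened_shape is a hypothetical helper that mimics what Flatten() does to a batch shape, and the batch size of 200 is taken from the question.

```python
import math


def flattened_shape(batch_shape):
    """Shape after collapsing every axis except the batch axis."""
    return (batch_shape[0], math.prod(batch_shape[1:]))


generator_batch = (200, 28, 28, 1)  # what datagen.flow() yields per step
dense_expects = (200, 784)          # what Dense(..., input_dim=784) accepts

print(generator_batch == dense_expects)                   # False
print(flattened_shape(generator_batch) == dense_expects)  # True
```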

Using tf.data.Dataset as training input to Keras model NOT working

I have a simple code, which DOES work, for training a Keras model in Tensorflow using numpy arrays as features and labels. If I then wrap these numpy arrays using tf.data.Dataset.from_tensor_slices in order to train the same Keras model using a tensorflow dataset, I get an error. I haven't been able to figure out why (it may be a tensorflow or keras bug, but I may also be missing something). I'm on python 3, tensorflow is 1.10.0, numpy is 1.14.5, no GPU involved.
OBS1: The possibility of using tf.data.Dataset as a Keras input is shown in https://www.tensorflow.org/guide/keras, under "Input tf.data datasets".
OBS2: In the code below, the code under "#Train with numpy arrays" is being executed, using numpy arrays. If this code is commented and the code under "#Train with tf.data datasets" is used instead, the error will be reproduced.
OBS3: In line 13, which is commented and starts with "###WORKAROUND 1###", if the comment is removed and the line is used for tf.data.Dataset inputs, the error changes, even though I can't completely understand why.
The complete code is:
import tensorflow as tf
import numpy as np
np.random.seed(1)
tf.set_random_seed(1)
print(tf.__version__)
print(np.__version__)
#Import mnist dataset as numpy arrays
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()#Import
x_train, x_test = x_train / 255.0, x_test / 255.0 #normalizing
###WORKAROUND 1###y_train, y_test = (y_train.astype(dtype='float32'), y_test.astype(dtype='float32'))
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1]*x_train.shape[2])) #reshaping 28 x 28 images to 1D vectors, similar to Flatten layer in Keras
batch_size = 32
#Create a tf.data.Dataset object equivalent to this data
tfdata_dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
tfdata_dataset_train = tfdata_dataset_train.batch(batch_size).repeat()
#Creates model
keras_model = tf.keras.models.Sequential([
tf.keras.layers.Dense(512, activation=tf.nn.relu),
tf.keras.layers.Dropout(0.2, seed=1),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
#Compile the model
keras_model.compile(optimizer='adam',
loss=tf.keras.losses.sparse_categorical_crossentropy,
metrics=['accuracy'])
#Train with numpy arrays
keras_training_history = keras_model.fit(x_train,
y_train,
initial_epoch=0,
epochs=1,
batch_size=batch_size
)
#Train with tf.data datasets
#keras_training_history = keras_model.fit(tfdata_dataset_train,
# initial_epoch=0,
# epochs=1,
# steps_per_epoch=60000//batch_size
# )
print(keras_training_history.history)
The error observed when using tf.data.Dataset as input is:
(...)
ValueError: Tensor conversion requested dtype uint8 for Tensor with dtype float32: 'Tensor("metrics/acc/Cast:0", shape=(?,), dtype=float32)'
During handling of the above exception, another exception occurred:
(...)
TypeError: Input 'y' of 'Equal' Op has type float32 that does not match type uint8 of argument 'x'.
The error when removing the comment from line 13, as commented above in OBS3, is:
(...)
tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] is not a matrix
[[Node: dense/MatMul = MatMul[T=DT_FLOAT, _class=["loc:#training/Adam/gradients/dense/MatMul_grad/MatMul_1"], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_sequential_input_0_0, dense/MatMul/ReadVariableOp)]]
Any help would be appreciated, including comments that you were able to reproduce the errors, so I can report the bug if it is the case.
I just upgraded to TensorFlow 1.10 to execute this code. I think the answer is the one that is also discussed in the other Stack Overflow thread.
This code executes, but only if I remove the normalization, because that line seems to use too much CPU memory (I see messages indicating that). I also reduced the number of cores.
import tensorflow as tf
import numpy as np
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout, Input
np.random.seed(1)
tf.set_random_seed(1)
batch_size = 128
NUM_CLASSES = 10
print(tf.__version__)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
#x_train, x_test = x_train / 255.0, x_test / 255.0 #normalizing
def tfdata_generator(images, labels, is_training, batch_size=128):
    '''Construct a data generator using tf.Dataset'''
    def preprocess_fn(image, label):
        '''A transformation function to preprocess raw data
        into trainable input. '''
        x = tf.reshape(tf.cast(image, tf.float32), (28, 28, 1))
        y = tf.one_hot(tf.cast(label, tf.uint8), NUM_CLASSES)
        return x, y
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    if is_training:
        dataset = dataset.shuffle(1000) # depends on sample size
    # Transform and batch data at the same time
    dataset = dataset.apply(tf.contrib.data.map_and_batch(
        preprocess_fn, batch_size,
        num_parallel_batches=2, # cpu cores
        drop_remainder=True if is_training else False))
    dataset = dataset.repeat()
    dataset = dataset.prefetch(tf.contrib.data.AUTOTUNE)
    return dataset
training_set = tfdata_generator(x_train, y_train,is_training=True, batch_size=batch_size)
testing_set = tfdata_generator(x_test, y_test, is_training=False, batch_size=batch_size)
inputs = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='valid')(inputs)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = Dense(NUM_CLASSES, activation='softmax')(x)
keras_model = tf.keras.Model(inputs, outputs)
#Compile the model
keras_model.compile('adam', 'categorical_crossentropy', metrics=['acc'])
#Train with tf.data datasets
keras_training_history = keras_model.fit(
training_set.make_one_shot_iterator(),
steps_per_epoch=len(x_train) // batch_size,
epochs=5,
validation_data=testing_set.make_one_shot_iterator(),
validation_steps=len(x_test) // batch_size,
verbose=1)
print(keras_training_history.history)
Installing the tf-nightly build, together with changing dtypes of some tensors (the error changes after installing tf-nightly), solved the problem, so it is an issue which (hopefully) will be solved in 1.11.
Related material: https://github.com/tensorflow/tensorflow/issues/21894
I am wondering how Keras is able to do 5 epochs when make_one_shot_iterator() only supports iterating once through a dataset?
It could be given something like iterations = len(y_train) * epochs (shown here for TF 1.x).
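On the question of how five epochs are possible: in the code posted in this thread the dataset ends with dataset.repeat(), so the single pass made by the one-shot iterator never runs out, and steps_per_epoch decides where each epoch ends. Below is a plain-Python analogy, with itertools.cycle as a stand-in for repeat(). run_epochs is a hypothetical helper, not Keras's actual training loop.

```python
import itertools


def run_epochs(batches, steps_per_epoch, epochs):
    """Consume one iterator across several epochs, steps_per_epoch batches at a time."""
    iterator = iter(batches)  # created once, like a one-shot iterator
    seen = []
    for _ in range(epochs):
        seen.append(list(itertools.islice(iterator, steps_per_epoch)))
    return seen


finite = ["b0", "b1", "b2"]         # a single pass over a three-batch dataset
repeated = itertools.cycle(finite)  # stand-in for dataset.repeat()

print(run_epochs(finite, steps_per_epoch=3, epochs=2))
# [['b0', 'b1', 'b2'], []]
print(run_epochs(repeated, steps_per_epoch=3, epochs=2))
# [['b0', 'b1', 'b2'], ['b0', 'b1', 'b2']]
```

Without the repeat, the second epoch has nothing left to read; with it, every epoch gets its full number of steps.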
The code from Mohan Radhakrishnan still works in TF 2.x after small corrections for objects that have moved to new modules, which brings the code up to date. make_one_shot_iterator() is no longer needed.
# >> author: Mohan Radhakrishnan
import tensorflow as tf
import tensorflow.keras
import numpy as np
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout, Input
np.random.seed(1)
tf.random.set_seed(1)
batch_size = 128
NUM_CLASSES = 10
print(tf.__version__)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
#x_train, x_test = x_train / 255.0, x_test / 255.0 #normalizing
def tfdata_generator(images, labels, is_training, batch_size=128):
    '''Construct a data generator using tf.Dataset'''
    def preprocess_fn(image, label):
        '''A transformation function to preprocess raw data
        into trainable input. '''
        x = tf.reshape(tf.cast(image, tf.float32), (28, 28, 1))
        y = tf.one_hot(tf.cast(label, tf.uint8), NUM_CLASSES)
        return x, y
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    if is_training:
        dataset = dataset.shuffle(1000) # depends on sample size
    # Transform and batch data at the same time
    dataset = dataset.apply( tf.data.experimental.map_and_batch(
        preprocess_fn, batch_size,
        num_parallel_batches=2, # cpu cores
        drop_remainder=True if is_training else False))
    dataset = dataset.repeat()
    dataset = dataset.prefetch( tf.data.experimental.AUTOTUNE)
    return dataset
training_set = tfdata_generator(x_train, y_train,is_training=True, batch_size=batch_size)
testing_set = tfdata_generator(x_test, y_test, is_training=False, batch_size=batch_size)
inputs = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='valid')(inputs)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = MaxPool2D(pool_size=(2, 2))(x)
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = Dense(NUM_CLASSES, activation='softmax')(x)
keras_model = tf.keras.Model(inputs, outputs)
#Compile the model
keras_model.compile('adam', 'categorical_crossentropy', metrics=['acc'])
#Train with tf.data datasets
# training_set.make_one_shot_iterator() - 'PrefetchDataset' object has no attribute 'make_one_shot_iterator'
keras_training_history = keras_model.fit(
training_set,
steps_per_epoch=len(x_train) // batch_size,
epochs=5,
validation_data=testing_set,
validation_steps=len(x_test) // batch_size,
verbose=1)
print(keras_training_history.history)
The data is not loaded locally, just streamed through a simple data flow, which is very convenient. Thanks a lot; I hope my corrections are proper.

Stacked Autoencoder for classification using mnist

I am aware that the autoencoder container has been removed in newer Keras. My aim is to extract the encoded representation of an input and feed it as input to the next layer, i.e. a stacked autoencoder for classification using three hidden layers. I got this error:
Exception: Error when checking model target: expected dense_160 to have shape (None, 500) but got array with shape (60000L, 784L).
I am not too sure whether this code achieves my intention. I would appreciate any guidance on how to resolve this. Thanks.
The code is pasted below:
from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
nb_classes = 10
nb_epoch=200
batch_size=256
hidden_layer1=784
hidden_layer2=600
hidden_layer3=500
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print('Train samples: {}'.format(x_train.shape[0]))
print('Test samples: {}'.format(x_test.shape[0]))
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
input_img = Input(shape=(784,))
encoded = Dense(hidden_layer1, activation='relu')(input_img)
encoded = Dense(hidden_layer2, activation='relu')(encoded)
encoded=Dense(hidden_layer3,activation='softmax')(encoded)
model = Model(input=input_img , output=encoded)
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit(x_train, x_train,
nb_epoch=nb_epoch,
batch_size=batch_size,
shuffle=True,
validation_data=(x_test, x_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test score before fine tuning:', score[0])
print('Test accuracy after fine tuning:', score[1])
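I modified your code and tested it; it works. The last encoder layer now has 10 units to match the 10 classes, and the model is trained against y_train instead of x_train, so the target shape matches the output shape.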
nb_classes = 10
nb_epoch = 5
batch_size = 256
hidden_layer1 = 128
hidden_layer2 = 64
hidden_layer3 = 10 # because you have 10 categories
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print('Train samples: {}'.format(x_train.shape[0]))
print('Test samples: {}'.format(x_test.shape[0]))
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
input_img = Input(shape=(784,))
encoded = Dense(hidden_layer1, activation='relu')(input_img)
encoded = Dense(hidden_layer2, activation='relu')(encoded)
encoded = Dense(hidden_layer3, activation='softmax')(encoded)
decoded = Dense(hidden_layer2, activation='relu')(encoded)
decoded = Dense(hidden_layer1, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)
model = Model(input=input_img, output=encoded)
model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit(x_train, y_train,
nb_epoch=nb_epoch,
batch_size=batch_size,
shuffle=True,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=1)
print('\n')
print('Test score:', score[0])
print('Test accuracy:', score[1])
