I am using Keras autoencoders with the Theano backend, and I want to build an autoencoder for 720x1080 RGB images.
This is my code:
from keras.datasets import mnist
import numpy as np
from keras.layers import Input, LSTM, RepeatVector, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from PIL import Image
x_train = []
x_train_noisy = []
for i in range(5, 1000):
    image = Image.open('data/trailerframes/frame' + str(i) + '.jpg', 'r')
    x_train.append(np.array(image))
    image = Image.open('data/trailerframes_avg/frame' + str(i) + '.jpg', 'r')
    x_train_noisy.append(np.array(image))
x_train = np.array(x_train)
x_train = x_train.astype('float32') / 255.
x_train_noisy = np.array(x_train_noisy)
x_train_noisy = x_train_noisy.astype('float32') / 255.
input_img = Input(shape=(720, 1080, 3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), data_format="channels_last", activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), data_format="channels_last", activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), data_format="channels_last", activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(x_train_noisy, x_train,
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(x_train_noisy, x_train))
But it gives me an error:
ValueError: Error when checking input: expected input_7 to have shape (None, 720, 1080, 3) but got array with shape (995, 720, 1280, 3)
Input error:
It's as simple as this:
You defined your input as (720, 1080, 3).
You're trying to train your model with data of shape (720, 1280, 3).
One of them is wrong, and I think it's a typo in the input:
# change 1080 to 1280
input_img = Input(shape=(720, 1280, 3))
Output error (target):
Now, your target data is shaped like (720,1280,3), and your last layer outputs (720,1280,1)
A simple fix is:
decoded = Conv2D(3, (3, 3), data_format="channels_last", activation='sigmoid', padding='same')(x)
Using the encoder:
After training that model, you can create submodels to use only the encoder or only the decoder:
encoderModel = Model(input_img, encoded)

decoderInput = Input(shape=(180, 320, 32))  # shape of the encoder output after the two poolings
decoderOutput = decoderInput
for layer in autoencoder.layers[-5:]:  # reapply the five decoder layers
    decoderOutput = layer(decoderOutput)
decoderModel = Model(decoderInput, decoderOutput)
These models share the exact same weights as the full autoencoder, so training one of them will affect all three.
To use them without training, call model.predict(data), which runs inference only.
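For example, after fitting (a sketch; the shapes assume the corrected 1280-wide input, so the encoder output is (180, 320, 32)):
# compress images to the latent representation, then reconstruct them
encoded_imgs = encoderModel.predict(x_train_noisy)  # shape (N, 180, 320, 32)
decoded_imgs = decoderModel.predict(encoded_imgs)   # shape (N, 720, 1280, 3)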
Related
I have been following the Keras documentation to build a CNN autoencoder:
https://blog.keras.io/building-autoencoders-in-keras.html
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
I have noticed that it uses Conv2D in its decoding layers instead of Conv2DTranspose. But some other articles explain CNN autoencoders using Conv2DTranspose as a replacement for UpSampling2D and Conv2D. I have seen several questions about Conv2DTranspose itself, but I haven't found an answer to my question.
My question is: can I use Conv2DTranspose instead of the UpSampling2D and Conv2D layers? If so, why haven't the authors of the Keras documentation used it themselves? Does it make any difference?
Transposed convolutions often produce checkerboard artifacts: small adjacent squares that are easily distinguishable from each other. These make it very easy for humans to tell fake images from real ones.
You can read this article for more information.
In short, using resizing (UpSampling2D) + Conv2D instead of Conv2DTranspose minimizes these checkerboard artifacts.
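To make the comparison concrete, here is a minimal sketch (my own, not from the blog post) of the same style of autoencoder with each UpSampling2D + Conv2D pair replaced by a strided Conv2DTranspose; it assumes a hypothetical 32x32 input so that every 2x step divides evenly (with 28x28 you would additionally need cropping or output padding):
from keras.layers import Input, Conv2D, Conv2DTranspose, MaxPooling2D
from keras.models import Model

# hypothetical 32x32 input so each 2x pooling/upsampling step divides evenly
input_img = Input(shape=(32, 32, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)  # (4, 4, 8)
# each strided Conv2DTranspose both upsamples (2x) and convolves,
# replacing an UpSampling2D((2, 2)) followed by a Conv2D
x = Conv2DTranspose(8, (3, 3), strides=(2, 2), activation='relu', padding='same')(encoded)
x = Conv2DTranspose(8, (3, 3), strides=(2, 2), activation='relu', padding='same')(x)
x = Conv2DTranspose(16, (3, 3), strides=(2, 2), activation='relu', padding='same')(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
The two variants train the same way; the difference is that the upsampling filter is learned, which is precisely what can introduce the checkerboard artifacts described above.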
I implemented a vanilla autoencoder in Keras like this:
import numpy as np
from keras.preprocessing.image import load_img
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from my_callbacks import Histories

input_img = Input(shape=(256, 256, 3))  # adapt this if using `channels_first` image data format
img = load_img('ss.png', target_size=(256, 256))
# scale to [0, 1] so the target matches the decoder's sigmoid output
img = np.array(img, dtype=np.float32) / 255.
img = np.expand_dims(img, axis=0)
x = Conv2D(128, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (32, 32, 32)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu',padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
h=Histories()
autoencoder.fit(img, img,
                epochs=300,
                batch_size=10, callbacks=[h])
I successfully made a GIF by saving each predicted image from the autoencoder with a keras.callbacks callback and then using imageio to build the GIF, with the help of this post: programmatically-generate-video-or-animated-gif-in-python. The code was:
**my_callbacks.py**
import keras
import numpy as np
from matplotlib import pyplot as plt
from keras.preprocessing.image import load_img

class Histories(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        img = load_img('ss.png', target_size=(256, 256))
        # same [0, 1] scaling as in the training script
        img = np.array(img, dtype=np.float32) / 255.
        img = np.expand_dims(img, axis=0)
        pixel = self.model.predict(img)
        pixel = pixel.reshape((256, 256, 3))
        print("saved image for epoch {0}".format(epoch))
        plt.imsave('all_gif/pixelhash{0}.png'.format(epoch), pixel)
And here is the code to generate a GIF from the images in a folder. But sadly this code doesn't keep the order of the images. Any solution?
import imageio
import os

filenames = os.listdir()
images = []
for filename in filenames:
    images.append(imageio.imread(filename))
imageio.mimsave('movie.gif', images)
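One way to fix the ordering (a sketch, assuming the pixelhash{epoch}.png naming from the callback above, with the PNGs in all_gif/) is to sort the filenames numerically on the embedded epoch number:
import os
import re
import imageio

# lexicographic order puts 'pixelhash10.png' before 'pixelhash2.png',
# so sort on the integer epoch embedded in each filename instead
filenames = sorted(
    (f for f in os.listdir('all_gif') if f.endswith('.png')),
    key=lambda f: int(re.search(r'\d+', f).group()))
images = [imageio.imread(os.path.join('all_gif', f)) for f in filenames]
imageio.mimsave('movie.gif', images)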
But is there any way to generate the GIF in Python without saving the images at all, e.g. via matplotlib?
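One possible approach is matplotlib.animation (a sketch; frames below is placeholder data standing in for the predictions you would collect in a list inside on_epoch_end instead of calling plt.imsave):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# placeholder frames; in practice, append self.model.predict(...) results
# to a shared list inside the callback instead of saving PNGs
frames = [np.random.rand(256, 256, 3) for _ in range(10)]
fig = plt.figure()
plt.axis('off')
artists = [[plt.imshow(f, animated=True)] for f in frames]
ani = animation.ArtistAnimation(fig, artists, interval=100)
ani.save('movie.gif', writer='pillow')  # needs a matplotlib version with PillowWriter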
I have images of shape 391 x 400. I attempted to use the autoencoder as described here.
Specifically, I have used the following code:
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(391, 400, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (49, 50, 8)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
I am getting the following:
ValueError: Error when checking target: expected conv2d_37 to have shape (None, 392, 400, 1) but got array with shape (500, 391, 400, 1)
What I need: a layer that would drop/crop/reshape the last layer from 392 x 400 to 391 x 400.
Thank you for any help.
There's a layer called Cropping2D. To crop the last layer's output from 392 x 400 to 391 x 400, you can use it like this:
from keras.layers import Cropping2D

cropped = Cropping2D(cropping=((1, 0), (0, 0)))(decoded)
autoencoder = Model(input_img, cropped)
The tuple ((1, 0), (0, 0)) means cropping 1 row from the top. If you want to crop from the bottom, use ((0, 1), (0, 0)) instead. See the documentation for a more detailed description of the cropping argument.
I am trying to reconstruct images using a convolutional autoencoder, but I get an error related to dimensions. Could you find a solution? Thanks.
Basically, first I want to test the model on reconstructing the same input data (images); then, if the model works fine, I need to map images to maps.
In this case, can I just change the data from image_data to map_data as shown in the code below?
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
from keras.callbacks import TensorBoard
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.utils import np_utils
import matplotlib.pyplot as plt
import matplotlib
import os
from PIL import Image
from numpy import *
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
from keras.preprocessing.image import ImageDataGenerator,array_to_img, img_to_array, load_img
image_data='C:/Users/user_PC/Desktop/Image2Map/Samples'
map_data='C:/Users/user_PC/Desktop/Image2Map/Samples'
K.set_image_dim_ordering('tf')
input_img = Input(batch_shape=(1024, 106, 106, 3))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    image_data,
    target_size=(106, 106),
    batch_size=4,
    class_mode=None)
validation_generator = test_datagen.flow_from_directory(
    image_data,
    target_size=(106, 106),
    batch_size=4,
    class_mode=None)
imgs = np.concatenate([train_generator.next()[0] for i in range(1024)])
autoencoder.fit_generator(generator=(imgs, imgs),
                          samples_per_epoch=1024 // 4,
                          epochs=10,
                          validation_data=(imgs, imgs),
                          validation_steps=1024 // 4)
decoded_imgs = autoencoder.predict(image_data)
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(image_data[i].reshape(106, 106))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(106, 106))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
Change class_mode from None to 'input'. It should work; read more here.
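For example (a sketch based on the question's generators; with class_mode='input' each batch is an (x, x) pair, so it can be passed straight to fit_generator without the manual np.concatenate step — note the model would also need Input(shape=(106, 106, 3)) instead of a fixed batch_shape of 1024, and an output layer matching the 3-channel 106x106 target, for the shapes to line up):
train_generator = train_datagen.flow_from_directory(
    image_data,
    target_size=(106, 106),
    batch_size=4,
    class_mode='input')  # yields (batch, batch) pairs as autoencoder targets
autoencoder.fit_generator(
    train_generator,
    steps_per_epoch=1024 // 4,
    epochs=10)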
I'm getting an error in Keras where the dimension of the output is different than the dimension of the input. I don't understand where the 20 comes from. All my dimensions seem to be showing 18. Also, the Output Shape for convolution2d_70 just says 'multiple', so I'm not sure what that means. Any ideas?
Exception: Error when checking model target: expected convolution2d_70 to have shape (None, 1, 36L, 20L) but got array with shape (49L, 1L, 36L, 18L)
from keras.layers import Dense, Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from os import listdir
import os.path
import numpy as np
import re
versions = !pip freeze
for line in versions:
    if re.search('Keras', line):
        print line
samples = 100
x = np.ndarray([samples,36,18])
i = 0
for i in range(samples):
    x[i] = np.random.randint(15, size=(36, 18))
    i += 1
#TODO: CREATE A REAL TEST SET
x_train = x[:49]
x_test = x[50:]
print x_train.shape
print x_test.shape
#INPUT LAYER
input_img = Input(shape=(1,x_train.shape[1],x_train.shape[2]))
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(input_img)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
encoded = MaxPooling2D((2, 2), border_mode='same')(x)
# at this point the representation is (8, 5, 3)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, 3, 3, activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same')(x)
#MODEL
autoencoder = Model(input=input_img, output=decoded)
#SEPERATE ENCODER MODEL
encoder = Model(input=input_img, output=encoded)
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(8, 4, 4))
# retrieve the last layer of the autoencoder model
decoder_layer1 = autoencoder.layers[-3]
decoder_layer2 = autoencoder.layers[-2]
decoder_layer3 = autoencoder.layers[-1]
print decoder_layer3.get_config()
# create the decoder model
decoder = Model(input=encoded_input, output=decoder_layer3(decoder_layer2(decoder_layer1(encoded_input))))
#COMPILER
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.summary()
x_train = x_train.astype('float32') / 15.
x_test = x_test.astype('float32') / 15.
x_test = np.reshape(x_test, (len(x_test), 1, x_test.shape[1], x_test.shape[2]))
x_train = np.reshape(x_train, (len(x_train), 1, x_train.shape[1], x_train.shape[2]))
autoencoder.fit(x_train,
                x_train,
                nb_epoch=5,
                batch_size=1,
                verbose=True,
                shuffle=True,
                validation_data=(x_test, x_test))
Notice that the third convolutional layer in the decoder is missing border_mode='same'. That 'valid' convolution is where the 20 comes from: the columns go 3 → 6 → 6 → 12 → 10 (the 3x3 valid convolution trims 2) → 20 after the final upsampling, instead of matching the 18 columns of your data. Note also that since 36x18 is not divisible by 8, three rounds of pooling and upsampling cannot restore the input shape with 'same' padding alone; you have to crop the surplus at the end.
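A possible way to make the shapes line up (a sketch, not the only fix; it assumes a Keras version that provides Cropping2D and the 'th' dim ordering from the question) is to use border_mode='same' throughout the decoder and crop the surplus rows and columns at the end:
from keras.layers import Cropping2D

# decoder: encoded (8, 5, 3) grows to (1, 40, 24) with 'same' padding
# everywhere, then 4 rows and 6 columns are cropped to recover (1, 36, 18)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same')(x)
decoded = Cropping2D(cropping=((2, 2), (3, 3)))(x)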