Stacking multiple RGB images in numpy array for CNN implementation - python-3.x

I have 1000 RGB images that I want to read from the current directory and store in a numpy array of shape (1000,3,32,32) for use in a CNN.
To do this, I read a sample image, resized it to 32 * 32, and tried to append it to an array 'a' that I created with zeros in the shape (1000,3,32,32). But I get the error "'numpy.ndarray' object has no attribute 'append'". How can this be solved? I am open to a different approach if one is needed.
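import cv2
import matplotlib.pyplot as plt
import numpy as np
# Read the image in color and resize it to 32 x 32
reshapedimage = cv2.resize(cv2.imread("0 (1).png", 1), (32, 32))
a = np.zeros((1000,3,32,32))
a.append(reshapedimage)  # raises AttributeError: ndarray has no 'append'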

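NumPy arrays have no append method. I think you want to collect the images in a Python list and convert the list to an array at the end, like this: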
import numpy as np
# Create dummy image-like thing
w, h = 32, 32
im=np.arange(h*w*3).reshape((3,h,w))
# Create empty list
stack=[]
# Append the image to the stack 5 times
stack.append(im)
stack.append(im)
stack.append(im)
stack.append(im)
stack.append(im)
# Make Numpy array and check size
v = np.array(stack)
print(v.shape)
Output
(5, 3, 32, 32)
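If you prefer to keep the preallocated array from the question, you can also assign each image by index. The sketch below is only an illustration: it uses synthetic arrays in place of real files, and the helper name to_chw is made up for this example. It assumes each loaded image arrives as (height, width, channels), which is the layout cv2.imread returns, so the axes are reordered to channels-first before being stored.

```python
import numpy as np

def to_chw(img_hwc):
    """Reorder an (H, W, C) image to (C, H, W). Hypothetical helper for this sketch."""
    return np.transpose(img_hwc, (2, 0, 1))

n_images, h, w = 5, 32, 32

# Preallocate the whole batch once, channels-first
batch = np.zeros((n_images, 3, h, w), dtype=np.uint8)

for i in range(n_images):
    # Synthetic stand-in for a resized image of shape (32, 32, 3)
    fake_img = np.full((h, w, 3), i, dtype=np.uint8)
    # Assign into slot i instead of appending
    batch[i] = to_chw(fake_img)

print(batch.shape)  # (5, 3, 32, 32)
```

Preallocating avoids building a temporary list, but it requires knowing the number of images and their size up front.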

Related

How to write Python console output (Image to RGB matrix) into a txt file?

I am new to Python. I am using 3.8.3 version. My code is:
if __name__ == '__main__':
    import imageio
    import matplotlib.pyplot as plt
    pic = imageio.imread('Photo_Vikash_pandey.jpg')
    plt.figure(figsize = (15,15))
    plt.imshow(pic)
    print('Value of only R channel {}'.format(pic[ 100, 50, 0]))
    print('Value of only G channel {}'.format(pic[ 100, 50, 1]))
    print('Value of only B channel {}'.format(pic[ 100, 50, 2]))
I can print the RGB values of any desired pixel, but how do I output the complete RGB matrix for the entire image? The image size is 200x150 pixels, and I want to write the output to a .txt file.
The problem is that an image is a 3D matrix, i.e. in your case it's a 200x150x3 matrix (3 being the color channels, RGB).
So, you could do something like:
import cv2
import numpy as np
image = cv2.imread("path_to_your_image", 1)
# OpenCV loads channels in BGR order, so i = 0 is blue and i = 2 is red
for i in range(0,3):
    np.savetxt(f"image_channel_{i}.txt", image[:,:,i])
Basically, this way you're considering each time the matrix of values for each color channel, which is a normal 2D matrix, and you can write each of them to a separate file.
You do that by using the notation:
image[:,:,i]
which means taking all rows and columns of the matrix (the : between the brackets) for channel index i, which goes from 0 to 2 in the loop (I'm spelling this out because you said you're new to Python).
Or, you can write the 3 matrices one after another in the same file, as you prefer.
The main point is that np.savetxt only handles 1D and 2D arrays, so you can't write the whole 3D matrix to a file as it is.
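As a rough sketch of the single-file option, one way is to flatten the image to one pixel per row before saving. The example below uses a random synthetic image and a temporary file; the file name image_rgb.txt and the header text are arbitrary choices for illustration, not anything from the question.

```python
import os
import tempfile
import numpy as np

# Synthetic stand-in for an RGB image with 150 rows and 200 columns
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(150, 200, 3), dtype=np.uint8)

# One pixel per row, one column per channel: shape (150 * 200, 3)
flat = image.reshape(-1, 3)

path = os.path.join(tempfile.mkdtemp(), "image_rgb.txt")
np.savetxt(path, flat, fmt="%d", header="R G B")

# loadtxt skips the '#' header line; reshape restores the 3D layout
restored = np.loadtxt(path, dtype=np.uint8).reshape(image.shape)
print(np.array_equal(image, restored))  # True
```

The text file does not store the original height and width, so you need to keep them somewhere to reshape the data when reading it back.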

how to convert 4d numpy array to PIL image?

I'm doing some image machine learning with Keras. If I feed one picture, converted to a numpy.array, into my model, it returns a 4D numpy array (the predicted picture).
I want to convert that array to an image using Image.fromarray from the PIL library,
but Image.fromarray only accepts 2D or 3D arrays.
My predicted picture's array shape is (1, 256, 256, 3), where 1 is the number of samples,
so the 1 is useless for the image. I want to convert the array to (256, 256, 3) without damaging the image data. What should I do? Thanks for your time.
The 1 is not useless data; it is a singleton dimension. You can drop it without changing the amount of data.
You can do that with numpy.squeeze.
Also, make sure that your data has the right dtype; for an RGB image, Image.fromarray expects uint8.
Example:
import numpy as np
from PIL import Image
data = np.ones((1,16,16,3))
for i in range(16):
    data[0,i,i,1] = 0.0
print("size: %s, type: %s"%(data.shape, data.dtype))
# size: (1, 16, 16, 3), type: float64
data_img = (data.squeeze()*255).astype(np.uint8)
print("size: %s, type: %s"%(data_img.shape, data_img.dtype))
# size: (16, 16, 3), type: uint8
img = Image.fromarray(data_img, mode='RGB')
img.show()
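A slightly more defensive variant, sketched here with NumPy only, squeezes just the batch axis and clips the values before the cast. The function name batch_to_uint8_image is hypothetical, and the sketch assumes the model output is meant to lie in the [0, 1] range; adjust the scaling if your model outputs something else.

```python
import numpy as np

def batch_to_uint8_image(pred):
    """Convert a (1, H, W, C) float array in [0, 1] to an (H, W, C) uint8 array."""
    # Squeeze only axis 0; this raises ValueError if the batch size is not 1
    img = np.squeeze(pred, axis=0)
    # Clip first so values slightly outside [0, 1] do not wrap around in uint8
    img = np.clip(img, 0.0, 1.0) * 255.0
    return np.round(img).astype(np.uint8)

# Synthetic "prediction" with some values outside [0, 1]
pred = np.random.default_rng(0).uniform(-0.1, 1.1, size=(1, 256, 256, 3))
img = batch_to_uint8_image(pred)
print(img.shape, img.dtype)  # (256, 256, 3) uint8
```

Passing axis=0 matters when the image itself has a singleton dimension: a plain squeeze on a (1, 256, 256, 1) array would also remove the channel axis.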

Invalid dimension for image data in plt.imshow()

I am using the MNIST dataset to train a capsule network in Keras.
After training, I want to display an image from the dataset. The images are loaded with mnist.load_data(), which returns the data as (x_train, y_train), (x_test, y_test).
To visualize an image, I use the following code:
img_path = x_test[1]
print(img_path.shape)
plt.imshow(img_path)
plt.show()
The code gives output as follows:
(28, 28, 1)
and the error on plt.imshow(img_path) as follows:
TypeError: Invalid dimensions for image data
How can I display the image? Help!
As @sdcbr noted in the comments, np.squeeze removes the unnecessary dimension. imshow works on 2D arrays of shape (h, w), and on 3D arrays only when the last dimension is 3 (RGB) or 4 (RGBA), so an image of shape (h, w, 1) needs its trailing singleton dimension removed. A single np.squeeze call removes every dimension of size 1. Higher-dimensional data whose extra dimensions are larger than 1 has to be reduced to 2 dimensions some other way, for example by indexing.
import numpy as np
import matplotlib.pyplot as plt
img_path = x_test[1]
print(img_path.shape)
if(len(img_path.shape) == 3):
    plt.imshow(np.squeeze(img_path))
elif(len(img_path.shape) == 2):
    plt.imshow(img_path)
else:
    print("Higher dimensional data")
Example:
plt.imshow(test_images[0])
TypeError: Invalid shape (28, 28, 1) for image data
Correction:
plt.imshow((tf.squeeze(test_images[0])))
The plot now shows the digit 7.
You can use tf.squeeze for removing dimensions of size 1 from the shape of a tensor.
print(tf.shape(tf.squeeze(x_train)))  # check the shape after squeezing
Check out TF2.0 example
matplotlib.pyplot.imshow() does not support images of shape (h, w, 1). Just remove the last dimension by reshaping the image to (h, w): newimage = np.reshape(img, (h, w)).
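For reference, here is a small NumPy-only sketch of the options mentioned in these answers, applied to a synthetic array shaped like one MNIST image. All three produce the same (28, 28) array, which can then be passed to plt.imshow.

```python
import numpy as np

# Synthetic stand-in for x_test[1]: a single-channel 28 x 28 image
img = np.zeros((28, 28, 1), dtype=np.uint8)
img[10:18, 10:18, 0] = 255

a = np.squeeze(img, axis=-1)      # drop only the trailing channel axis
b = img[..., 0]                   # index the single channel
c = img.reshape(img.shape[:2])    # reshape to (h, w)

print(a.shape, b.shape, c.shape)  # (28, 28) (28, 28) (28, 28)
```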

Keras Feature Extraction - expected input_1 to have 4 dimensions, but got array with shape (1, 46)

I have an issue with Keras when extracting image features.
I already added a fourth dimension
with this code:
# Add a fourth dimension (since Keras expects a list of images)
image_array = np.expand_dims(image_array, axis=0)
But it still gives me an error.
This is my actual code:
from pathlib import Path
import numpy as np
import joblib
from keras.preprocessing import image
from keras.applications import vgg16
import os.path
# Path to folders with training data
img_db = Path("database") / "train"
images = []
labels = []
# Load all the not-dog images
for file in img_db.glob("*/*.jpg"):
    file = str(file)
    # split path with filename
    pathname, filename = os.path.split(file)
    person = pathname.split("\\")[-1]
    print("Processing file: {}".format(file))
    # Load the image from disk
    img = image.load_img(file)
    # Convert the image to a numpy array
    image_array = image.img_to_array(img)
    # Add a fourth dimension (since Keras expects a list of images)
    # image_array = np.expand_dims(image_array, axis=0)
    # Add the image to the list of images
    images.append(image_array)
    # For each 'not dog' image, the expected value should be 0
    labels.append(person)
# Create a single numpy array with all the images we loaded
x_train = np.array(images)
# Also convert the labels to a numpy array
y_train = np.array(labels)
# Apply VGG16 preprocessing (RGB-to-BGR conversion and ImageNet mean subtraction)
x_train = vgg16.preprocess_input(x_train)
input_shape = (250, 250, 3)
# Load a pre-trained neural network to use as a feature extractor
pretrained_nn = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=input_shape)
# Extract features for each image (all in one pass)
features_x = pretrained_nn.predict(x_train)
# Save the array of extracted features to a file
joblib.dump(features_x, "x_train.dat")
# Save the matching array of expected values to a file
joblib.dump(y_train, "y_train.dat")
Error
Traceback (most recent call last):
  File "C:/Users/w024029h/PycharmProjects/keras_pretrained/pretrained_vgg16.py", line 57, in <module>
    features_x = pretrained_nn.predict(x_train)
  File "C:\Users\w024029h\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1817, in predict
    check_batch_axis=False)
  File "C:\Users\w024029h\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking : expected input_1 to have 4 dimensions, but got array with shape (1, 46)
After adding an extra dimension, image_array has a shape such as (1, 3, 250, 250) or (1, 250, 250, 3), depending on your backend (assuming 3-channel images).
When you call images.append(image_array), you build a list of 4D arrays. Conceptually that list is a 5D array, and when you convert it back to a numpy array, numpy has no way to know that you want a single 4D batch instead.
You can use np.vstack() to concatenate the individual 4D arrays along the first axis, which gives one 4D array.
Replace these lines in your code:
# Create a single numpy array with all the images we loaded
x_train = np.array(images)
with:
x_train = np.vstack(images)
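The difference is easy to check on dummy data. This is a minimal sketch that uses zero-filled arrays in place of real images and assumes every image has the same 250 x 250 x 3 shape; np.vstack raises an error if the trailing dimensions differ.

```python
import numpy as np

# Four dummy "images", each already expanded to shape (1, 250, 250, 3)
images = [
    np.expand_dims(np.zeros((250, 250, 3), dtype=np.float32), axis=0)
    for _ in range(4)
]

as_array = np.array(images)   # keeps the extra axis
stacked = np.vstack(images)   # concatenates along the first axis

print(as_array.shape)  # (4, 1, 250, 250, 3)
print(stacked.shape)   # (4, 250, 250, 3)
```

If you skip the expand_dims step and append plain (250, 250, 3) arrays, np.array(images) already yields a 4D array of shape (4, 250, 250, 3).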

How to match cv2.imread to the keras image.load_img output

I'm studying deep learning and have trained an image classification model. The problem is that for training I loaded the images with:
test_image = image.load_img('some.png', target_size = (64, 64))
test_image = image.img_to_array(test_image)
While for actual application I use:
test_image = cv2.imread('trick.png')
test_image = cv2.resize(test_image, (64, 64))
But I found that the two give different ndarrays (different data).
Last entries from load_img:
[ 64. 71. 66.]
[ 64. 71. 66.]
[ 62. 69. 67.]]]
Last entries from cv2.imread:
[ 15 23 27]
[ 16 24 28]
[ 14 24 28]]]
As a result, the model does not work. Is there a way to make the output of one match the other?
OpenCV reads images in BGR format whereas in keras, it is represented in RGB. To get the OpenCV version to correspond to the order we expect (RGB), simply reverse the channels:
test_image = cv2.imread('trick.png')
test_image = cv2.resize(test_image, (64, 64))
test_image = test_image[...,::-1] # Added
The last line reverses the channels to be in RGB order. You can then feed this into your keras model.
Another point I'd like to add is that cv2.imread usually reads in images in uint8 precision. Examining the output of your keras loaded image, you can see that the data is in floating point precision so you may also want to convert to a floating-point representation, such as float32:
import numpy as np
# ...
# ...
test_image = test_image[...,::-1].astype(np.float32)
As a final point, depending on how you trained your model, it's customary to normalize the image pixel values to a [0,1] range. If you did this with your keras model, make sure you divide by 255 the values of the image you read in through OpenCV:
import numpy as np
# ...
# ...
test_image = (test_image[...,::-1].astype(np.float32)) / 255.0
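The three steps above can be checked on a synthetic array without OpenCV installed. In this sketch the function name bgr_uint8_to_rgb_float and its scale flag are made up for illustration; the input is a hand-built array that mimics what cv2.imread returns (uint8, BGR order).

```python
import numpy as np

def bgr_uint8_to_rgb_float(img_bgr, scale=True):
    """Reverse the channel order, cast to float32, and optionally scale to [0, 1]."""
    rgb = img_bgr[..., ::-1].astype(np.float32)
    return rgb / 255.0 if scale else rgb

# Synthetic BGR image: pure blue, i.e. channel 0 is 255
bgr = np.zeros((64, 64, 3), dtype=np.uint8)
bgr[..., 0] = 255

rgb = bgr_uint8_to_rgb_float(bgr)
print(rgb.dtype, rgb[0, 0])  # float32 [0. 0. 1.]
```

Whether to pass scale=True depends on whether the training pipeline divided by 255; the preprocessing at inference time should mirror it exactly.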
I recently came across the same issue. I tried converting the color channels and resizing the image with OpenCV, but PIL and OpenCV resize images very differently.
Here is a solution that worked for me.
This function takes an image file path, resizes the image to the target size, and prepares it for the Keras model:
import cv2
import keras
import numpy as np
from keras.preprocessing import image
from PIL import Image
def prepare_image (file):
    im_resized = image.load_img(file, target_size = (224,224))
    img_array = image.img_to_array(im_resized)
    image_array_expanded = np.expand_dims(img_array, axis = 0)
    return keras.applications.mobilenet.preprocess_input(image_array_expanded)
# execute the function
PIL_image = prepare_image ("lena.png")
If you have an OpenCV image then the function will be like this -
def prepare_image2 (img):
    # convert the color from BGR to RGB, then convert to a PIL image
    cvt_image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    im_pil = Image.fromarray(cvt_image)
    # resize the PIL image; nearest-neighbor is requested explicitly to match
    # the load_img default, since recent Pillow versions default to bicubic
    im_resized = im_pil.resize((224, 224), Image.NEAREST)
    img_array = image.img_to_array(im_resized)
    image_array_expanded = np.expand_dims(img_array, axis = 0)
    return keras.applications.mobilenet.preprocess_input(image_array_expanded)
# execute the function
img = cv2.imread("lena.png")
cv2_image = prepare_image2 (img)
# finally check if it is working
np.array_equal(PIL_image, cv2_image)
>> True
Besides CV2 using the BGR format and Keras (with PIL as a backend) using the RGB format, the resize methods of CV2 and PIL also differ significantly, even with the same parameters.
Multiple references can be found on the internet, but the general idea is that the two resize algorithms use subtly different pixel coordinate systems, and they may also differ in how they cast to float as an intermediate step of the interpolation. The end result is a visually similar image that is slightly shifted or perturbed between versions.
It is a good illustration of how, as with adversarial examples, small input differences can cause large differences in accuracy.
