I know how to use ImageDataGenerator to augment my data by translating, flipping, rotating, shearing, etc. My question: suppose I have both a training image and the corresponding segmentation image, and I would like to augment both. For example, if I rotate a training image by 45 degrees, I also want to rotate the segmentation image by 45 degrees. In essence, I want to apply the identical set of transforms to two datasets. Is that possible with ImageDataGenerator, or do I have to write all the augmentation functions from scratch? Thanks very much in advance.
You can apply the augmentations inside tf.data.Dataset.map and return the augmented image twice, so that both outputs share the same random transform. I don't know of any way to do this with ImageDataGenerator.
import tensorflow as tf
import matplotlib.pyplot as plt
from skimage import data
cats = tf.concat([data.chelsea()[None, ...] for i in range(24)], axis=0)
test = tf.data.Dataset.from_tensor_slices(cats)
def augment(image):
    image = tf.cast(x=image, dtype=tf.float32)
    image = tf.divide(x=image, y=tf.constant(255.))
    image = tf.image.random_hue(image=image, max_delta=5e-1)
    image = tf.image.random_brightness(image=image, max_delta=2e-1)
    return image, image
test = test.batch(1).map(augment)
fig = plt.figure()
plt.subplots_adjust(wspace=.1, hspace=.2)
images = next(iter(test.take(1)))
for index, image in enumerate(images):
    ax = plt.subplot(1, 2, index + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.imshow(tf.clip_by_value(tf.squeeze(image), clip_value_min=0, clip_value_max=1))
plt.show()
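For geometric transforms such as flips and rotations, the key point is to draw the random parameters once and reuse them for both the image and its segmentation mask. The following is a minimal NumPy sketch of that idea, not an ImageDataGenerator feature; the function name augment_pair and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply the same random flip and 90-degree rotation to image and mask."""
    # Draw the random parameters once, then reuse them for both arrays.
    flip = rng.random() < 0.5
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    if flip:
        image = image[:, ::-1]  # horizontal flip (width axis)
        mask = mask[:, ::-1]
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return image, mask

rng = np.random.default_rng(0)
image = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)  # toy H x W x C image
mask = (image[..., 0] > 20).astype(np.uint8)                     # toy H x W mask
aug_image, aug_mask = augment_pair(image, mask, rng)
```

With tf.image operations, one option that follows the same idea is to concatenate the image and mask along the channel axis, apply the random geometric op once, and split the result afterwards.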
Can someone help me increase the display size of the feature maps I extracted? I recently ran a CNN on a set of images and would like to see the extracted features. I managed to extract them, but I cannot actually see them because they are plotted too small.
My code:
from matplotlib import pyplot
# summarize feature map shapes
for i in range(len(cnn.layers)):
    layer = cnn.layers[i]
    # check for conv layer
    if 'conv' not in layer.name:
        continue
    print(i, layer.name, layer.output.shape)
from keras import models
from keras.preprocessing import image
model_new = models.Model(inputs=cnn.inputs, outputs=cnn.layers[1].output)
img_path = 'train/1/2NbeGPsQf2Q - 4 0.jpg'
img = image.load_img(img_path, target_size=(img_rows, img_cols))
import numpy as np
from keras.applications.imagenet_utils import decode_predictions, preprocess_input
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
features = model_new.predict(img)
square = 10
ix = 1
for _ in range(square):
    for _ in range(square):
        # specify subplot and turn off axis ticks
        ax = pyplot.subplot(square, square, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot filter channel in colour
        pyplot.imshow(features[0, :, :, ix-1], cmap='viridis')
        ix += 1
# show the figure
pyplot.show()
The result is attached (output of the feature map for layer 1).
It's too small. How can I make it bigger so I can see what is actually there?
I'd appreciate any input. Thanks!
I am using the MNIST dataset to train a capsule network in Keras.
After training, I want to display an image from the MNIST dataset. The images are loaded with mnist.load_data(), and the data is stored as (x_train, y_train), (x_test, y_test).
To visualize an image, my code is as follows:
img_path = x_test[1]
print(img_path.shape)
plt.imshow(img_path)
plt.show()
The code gives the following output:
(28, 28, 1)
and the following error on plt.imshow(img_path):
TypeError: Invalid dimensions for image data
How can I show the image in PNG format? Help!
As per the comment by @sdcbr, np.squeeze removes the unnecessary dimension. If the image has 2 dimensions, imshow works fine. If it has 3 dimensions with a trailing axis of size 1, you have to remove that extra dimension. Higher-dimensional data must also be reduced to 2 dimensions: np.squeeze removes all size-1 axes in one call, and for axes larger than 1 you need indexing or some other dimension-reduction function.
import numpy as np
import matplotlib.pyplot as plt
img_path = x_test[1]
print(img_path.shape)
if len(img_path.shape) == 3:
    plt.imshow(np.squeeze(img_path))
elif len(img_path.shape) == 2:
    plt.imshow(img_path)
else:
    print("Higher dimensional data")
Example:
plt.imshow(test_images[0])
TypeError: Invalid shape (28, 28, 1) for image data
Correction:
plt.imshow((tf.squeeze(test_images[0])))
(Output: an image of the digit 7.)
You can use tf.squeeze for removing dimensions of size 1 from the shape of a tensor.
plt.imshow(tf.squeeze(x_train[0]))
Check out TF2.0 example
matplotlib.pyplot.imshow() does not support images of shape (h, w, 1). Just remove the last dimension by reshaping the image to (h, w): newimage = np.reshape(img, (h, w)).
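As a small illustration of the point above, the sketch below uses a zero-filled NumPy array as a stand-in for one MNIST sample (an assumption; any array of shape (28, 28, 1) behaves the same) and shows that squeezing the trailing axis and reshaping give the same (h, w) result.

```python
import numpy as np

img = np.zeros((28, 28, 1), dtype=np.uint8)  # stand-in for one MNIST sample

squeezed = np.squeeze(img, axis=-1)    # drop only the trailing channel axis
reshaped = img.reshape(img.shape[:2])  # same result via reshape
print(squeezed.shape, reshaped.shape)
```

Passing axis=-1 makes the intent explicit and raises an error if the last axis is not of size 1, which can catch an unexpected RGB input early.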
I would like to generate a skeleton from an image. Because the edges generated with skimage from the original image are not smooth, the skeleton obtained from the binary image has disconnected edges with knots.
import skimage
from skimage import data,io,filters
import numpy as np
import cv2
import matplotlib.pyplot as plt
from skimage.filters import threshold_mean
from skimage.morphology import binary_dilation
from skimage import feature
from skimage.morphology import skeletonize_3d
imgfile = "edit.jpg"
im = cv2.imread(imgfile)  # load the image from disk (BGR order)
image = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
thresh = threshold_mean(image)
binary = image > thresh
edges = filters.sobel(binary)
dilate = feature.canny(binary,sigma=0)
skeleton = skeletonize_3d(binary)
fig, axes = plt.subplots(nrows=2,ncols=2, figsize=(8, 2))
ax = axes.ravel()
ax[0].imshow(binary, cmap=plt.cm.gray)
ax[0].set_title('binarize')
ax[1].imshow(edges, cmap=plt.cm.gray)
ax[1].set_title('edges')
ax[2].imshow(dilate, cmap=plt.cm.gray)
ax[2].set_title('dilates')
ax[3].imshow(skeleton, cmap=plt.cm.gray)
ax[3].set_title('skeleton')
for a in ax:
    a.axis('off')
plt.show()
I tried using dilation to smooth the jagged edges, but the contours in the skeleton then have two edges instead of the desired single edge.
I would like suggestions on how to smooth the edges to avoid knots and disconnected edges in the resulting skeleton.
Input image
Output images
Edit: After applying Gaussian smoothing
binary = image > thresh
gaussian = skimage.filters.gaussian(binary)
skeleton = skeletonize_3d(gaussian)
This median filter should do the job on your binary image before skeletonization.
import numpy as np
import scipy.signal
# cast the boolean mask to uint8 before applying the 3x3 median filter
binary_smoothed = scipy.signal.medfilt(binary.astype(np.uint8), 3)
For the borders, I would probably use the Canny implementation below and play with the parameters as shown in this link:
https://claudiovz.github.io/scipy-lecture-notes-ES/advanced/image_processing/auto_examples/plot_canny.html
from image_source_canny import canny
borders = canny (binary_smoothed, 3, 0.3, 0.2)
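To make the effect of the 3x3 median filter concrete, here is a NumPy-only sketch of what such a filter does to a binary mask. It is an illustration under assumptions (zero padding at the border, a hypothetical helper named median3x3), not the scipy implementation itself.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median3x3(binary):
    """3x3 median filter with zero padding, for a 2-D binary array."""
    padded = np.pad(binary.astype(np.uint8), 1, mode="constant")
    windows = sliding_window_view(padded, (3, 3))  # shape (H, W, 3, 3)
    return np.median(windows, axis=(-2, -1)).astype(np.uint8)

binary = np.zeros((7, 7), dtype=bool)
binary[2:5, 2:5] = True   # solid 3x3 blob
binary[0, 6] = True       # isolated noise pixel
smoothed = median3x3(binary)
print(smoothed)
```

The isolated pixel disappears because it is outvoted by its neighbours, while the interior of the blob survives. The blob's corners are rounded off as well, which is the smoothing that helps before skeletonization.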
I'm studying deep learning and have trained an image classifier. The problem is that, to load the images for training, I used:
test_image = image.load_img('some.png', target_size = (64, 64))
test_image = image.img_to_array(test_image)
While for actual application I use:
test_image = cv2.imread('trick.png')
test_image = cv2.resize(test_image, (64, 64))
But I found that these produce different ndarrays (different data):
Last entries from load_img:
[ 64. 71. 66.]
[ 64. 71. 66.]
[ 62. 69. 67.]]]
Last entries from cv2.imread:
[ 15 23 27]
[ 16 24 28]
[ 14 24 28]]]
As a result, the system does not work. Is there a way to make the results of one match the other?
OpenCV reads images in BGR channel order, whereas Keras loads them as RGB. To get the OpenCV version into the order we expect (RGB), simply reverse the channels:
test_image = cv2.imread('trick.png')
test_image = cv2.resize(test_image, (64, 64))
test_image = test_image[...,::-1] # Added
The last line reverses the channels to be in RGB order. You can then feed this into your keras model.
Another point I'd like to add is that cv2.imread usually reads in images in uint8 precision. Examining the output of your keras loaded image, you can see that the data is in floating point precision so you may also want to convert to a floating-point representation, such as float32:
import numpy as np
# ...
# ...
test_image = test_image[...,::-1].astype(np.float32)
As a final point, depending on how you trained your model, it is customary to normalize the image pixel values to the [0,1] range. If you did this with your keras model, make sure you also divide the values of the image read through OpenCV by 255:
import numpy as np
# ...
# ...
test_image = (test_image[...,::-1].astype(np.float32)) / 255.0
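The three steps above (channel reversal, float conversion, optional scaling) can be checked on a tiny array without any image file. This is a sketch that assumes a single hand-written pixel as a stand-in for the result of cv2.imread.

```python
import numpy as np

# Stand-in for a cv2.imread result: one pixel, channels in B, G, R order
bgr = np.array([[[27, 23, 15]]], dtype=np.uint8)

rgb = bgr[..., ::-1]                # reverse the channel axis: BGR -> RGB
rgb_float = rgb.astype(np.float32)  # floating-point, like the Keras-loaded array
rgb_scaled = rgb_float / 255.0      # only if the model was trained on [0, 1] inputs
print(rgb[0, 0], rgb_float.dtype)
```

Only the last axis is reversed, so the spatial layout of the image is untouched.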
Recently, I came across the same issue. I tried converting the color channels and resizing the image with OpenCV. However, PIL and OpenCV resize images in very different ways.
Here is a solution that worked for me.
This function takes an image file path, resizes the image to the target size, and prepares it for the Keras model:
import cv2
import keras
import numpy as np
from keras.preprocessing import image
from PIL import Image
def prepare_image(file):
    im_resized = image.load_img(file, target_size=(224, 224))
    img_array = image.img_to_array(im_resized)
    image_array_expanded = np.expand_dims(img_array, axis=0)
    return keras.applications.mobilenet.preprocess_input(image_array_expanded)
# execute the function
PIL_image = prepare_image ("lena.png")
If you have an OpenCV image then the function will be like this -
def prepare_image2(img):
    # convert the color from BGR to RGB then convert to PIL array
    cvt_image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    im_pil = Image.fromarray(cvt_image)
    # resize the array (image) then PIL image
    im_resized = im_pil.resize((224, 224))
    img_array = image.img_to_array(im_resized)
    image_array_expanded = np.expand_dims(img_array, axis=0)
    return keras.applications.mobilenet.preprocess_input(image_array_expanded)
# execute the function
img = cv2.imread("lena.png")
cv2_image = prepare_image2 (img)
# finally check if it is working
np.array_equal(PIL_image, cv2_image)
>> True
Besides OpenCV using the BGR format and Keras (using PIL as a backend) using RGB, there are also significant differences between the resize methods of OpenCV and PIL, even with the same parameters.
Multiple references can be found on the internet, but the general idea is that the two resize algorithms use subtly different pixel coordinate systems, and that casting to float as an intermediate step of the interpolation algorithm can also be handled differently. The end result is a visually similar image that is slightly shifted or perturbed between versions.
This resembles an adversarial perturbation: small differences in the input can cause large differences in accuracy.
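To see how a coordinate convention alone can change resized values, the sketch below linearly resamples a 1-D signal under two common conventions: pixel centres at half-integer positions, and exactly aligned end samples. It is an illustration only; the function resize_1d is hypothetical and does not reproduce the actual algorithm of either OpenCV or PIL.

```python
import numpy as np

def resize_1d(signal, out_len, half_pixel):
    """Linearly resample a 1-D signal under one of two coordinate conventions."""
    n = len(signal)
    out = np.arange(out_len)
    if half_pixel:
        # Pixel centres sit at i + 0.5; map output centres back to input space.
        src = (out + 0.5) * (n / out_len) - 0.5
    else:
        # First and last samples of input and output are aligned exactly.
        src = out * (n - 1) / (out_len - 1)
    return np.interp(src, np.arange(n), signal)  # clamps outside the input range

signal = np.array([0.0, 10.0, 20.0, 30.0])
a = resize_1d(signal, 8, half_pixel=True)
b = resize_1d(signal, 8, half_pixel=False)
print(np.abs(a - b).max())
```

Both outputs look like the same ramp, yet the interior samples differ, which is the kind of small shift described above.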
I have a 315x581 image. I want to crop it into 28x28 tiles from top left to bottom right, and then save each 28x28 tile to a folder.
So far I can crop just one tile, from y1=0 to y2=28 and x1=0 to x2=28.
First problem: I used cv2.imwrite("cropped.jpg", cropped) to save this small image, but it does not get saved, even though the same call works a few lines above.
Second problem: how can I write code that keeps cropping the image into 28x28 tiles, left to right and top to bottom, and saves each sub-image?
I used a for loop, but I don't know how to complete it.
Thank you so much for any help.
Here is my code:
import cv2
import numpy as np
from PIL import Image
import PIL.Image
import os
import gzip
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
#%%
image1LL='C:/Users/Tala/Documents/PythonProjects/Poster-OpenCV-MaskXray/CHNCXR_0001_0_LL.jpg'
mask1LL='C:/Users/Tala/Documents/PythonProjects/Poster-OpenCV-MaskXray/CHNCXR_0001_0_threshLL.jpg'
#finalsSave='C:/Users/Tala/Documents/PythonProjects/Poster-OpenCV-MaskXray/Xray Result'
# load the image
img = cv2.imread(image1LL,0)
mask = cv2.imread(mask1LL,0)
# combine foreground+background
final1LL = cv2.bitwise_and(img,img,mask = mask)
cv2.imshow('final1LL',final1LL)
cv2.waitKey(100)
final1LL.size
final1LL.shape
# Save the image
cv2.imwrite('final1LL.jpg',final1LL)
# crop the image using array slices -- it's a NumPy array
# after all!
y1=0
x1=0
for y2 in range(0,580,28):
    for x2 in range(0,314,28):
        cropped = final1LL[0:28, 0:28]
        cv2.imshow('cropped', cropped)
        cv2.waitKey(100)
        cv2.imwrite("cropped.jpg", cropped)
Your approach is good, but it needs some fine-tuning. The following code should help:
import cv2
filename = 'p1.jpg'
img = cv2.imread(filename, 1)
interval = 100
stride = 100
count = 0
print(img.shape)
for i in range(0, img.shape[0], interval):      # i walks down the rows
    for j in range(0, img.shape[1], interval):  # j walks across the columns
        print(j)
        cropped_img = img[i:i + stride, j:j + stride]  #--- Notice this part where you have to add the stride as well ---
        count += 1
        cv2.imwrite('cropped_image_' + str(count) + '_.jpg', cropped_img)  #--- Also take note of how you would save all the cropped images by incrementing the count variable ---
cv2.waitKey()
My result (images omitted): the original image and three of the cropped images.
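For the 28x28 case in the question, the same loop structure can be sketched with NumPy alone. The helper iter_tiles is hypothetical, the zero-filled array is a stand-in for the real image, and the sketch assumes the image is 581 pixels high and 315 wide (matching the question's loop bounds) and that partial tiles at the right and bottom edges are skipped.

```python
import numpy as np

def iter_tiles(img, tile=28):
    """Yield (top, left, patch) for every full tile x tile patch, row by row."""
    height, width = img.shape[:2]
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield top, left, img[top:top + tile, left:left + tile]

img = np.zeros((581, 315), dtype=np.uint8)  # stand-in with the question's dimensions
tiles = list(iter_tiles(img))
print(len(tiles))  # 20 rows x 11 columns of full tiles
```

To save the tiles, one could call cv2.imwrite with a file name built from top and left for each patch, so that no tile overwrites another.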
If you are using PyTorch as your deep learning framework, this task is quite easy and needs no external image processing library such as OpenCV. The code below converts a single image into a stack of crops in the form of a PyTorch tensor. If you want plain images instead, remove the line "transforms.ToTensor()" and save the "tens" variable in the code as an image using matplotlib.
Note: the bird image used here has dimensions 32 x 32 x 3; the crops are 5 x 5 x 3 with stride = 1.
import torch
from PIL import Image
from torchvision import transforms

# resize to 32 pixels and convert to a (C, H, W) tensor
trans = transforms.Compose([transforms.Resize(32),
                            transforms.ToTensor(),
                            ])
image = Image.open('bird.png')
tens = trans(image)  # transform once, outside the loops
stride = 1
crop_height = 5
crop_width = 5
img_height = 32
img_width = 32
tens_list = []
# + 1 so that the last window, which ends at the image border, is included
for i in range(0, img_width - crop_width + 1, stride):
    for j in range(0, img_height - crop_height + 1, stride):
        tens1 = tens[:, j:j + crop_height, i:i + crop_width]
        tens_list.append(tens1)
all_tens = torch.stack(tens_list)
print(all_tens.size())
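For a 32 x 32 image, 5 x 5 crops and stride 1, each axis has (32 - 5) / 1 + 1 = 28 window positions, so a full sliding window yields 28 x 28 = 784 crops. The sketch below checks that arithmetic with NumPy's sliding_window_view (available in NumPy 1.20 and later) on a toy channel-first array; it is an alternative illustration, not the PyTorch code above, and the crops come out in row-major order.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

img = np.arange(3 * 32 * 32, dtype=np.float32).reshape(3, 32, 32)  # toy C x H x W image
crop = 5

# Windows over the two spatial axes: shape (3, 28, 28, 5, 5)
windows = sliding_window_view(img, (crop, crop), axis=(1, 2))
# Reorder to (rows, cols, C, 5, 5) and flatten the window positions
patches = windows.transpose(1, 2, 0, 3, 4).reshape(-1, 3, crop, crop)
print(patches.shape)
```

The view itself shares memory with the input; the final reshape makes a copy, which is fine for small images but worth keeping in mind for large ones.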