I have been using the Structural Similarity Index (SSIM), through TensorFlow, to compare images, but it takes too long. I was wondering if there is an alternative technique that doesn't take so much time. It would also be fine if someone could point out a more efficient Python implementation of SSIM than TensorFlow's.
My reason for using SSIM is that, given a reference image (A) and a set of images (B), I need to determine which image in B is most similar to the reference image A.
UPDATE 01-02-2021
I decided to explore some other Python modules that could be used for Image comparison. I also wanted to use concurrent.futures, which I hadn't used before.
I created two GitHub Gists with the code that I wrote.
skimage ssim image comparison
ImageHash aHash image comparison
The ImageHash module compared 100 images in 0.29 seconds, while the skimage module took 1.2 seconds for the same task.
ORIGINAL POST
I haven't tested the speed of the code in this answer, because I have only used the code in some image testing that I posted to GitHub:
facial similarities
facial prediction
The code below will produce a similarity score between reference image (A) and set of images (B).
The complete code is located in my GitHub repository
import os
from os import walk
import numpy as np
from PIL import Image
def get_image_files(directory_of_images):
    """
    This function is designed to traverse a directory tree and extract all
    the image names contained in the directory.
    :param directory_of_images: the name of the target directory containing
    the images to be trained on.
    :return: list of images to be processed.
    """
    images_to_process = []
    for (dirpath, dirnames, filenames) in walk(directory_of_images):
        for filename in filenames:
            accepted_extensions = ('.bmp', '.gif', '.jpg', '.jpeg', '.png', '.svg', '.tiff')
            if filename.endswith(accepted_extensions):
                images_to_process.append(os.path.join(dirpath, filename))
    return images_to_process
def pre_process_images(image_one, image_two, additional_resize=False, max_image_size=1000):
    """
    This function is designed to resize the images using the Pillow module.
    :param image_one: primary image to evaluate against a secondary image
    :param image_two: secondary image to evaluate against the primary image
    :param additional_resize: when True, also shrink both images so the longest side
    does not exceed max_image_size
    :param max_image_size: maximum allowable image size in pixels
    :return: resized images
    """
    # Bring both images to the smaller width and the smaller height of the pair.
    lower_boundary_size = (min(image_one.size[0], image_two.size[0]), min(image_one.size[1], image_two.size[1]))
    # reference: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize
    # reference: https://pillow.readthedocs.io/en/stable/handbook/concepts.html#PIL.Image.LANCZOS
    image_one = image_one.resize(lower_boundary_size, resample=Image.LANCZOS)
    image_two = image_two.resize(lower_boundary_size, resample=Image.LANCZOS)
    if max(image_one.size) > max_image_size and additional_resize:
        resize_factor = max_image_size / max(image_one.size)
        image_one = image_one.resize((int(lower_boundary_size[0] * resize_factor),
                                      int(lower_boundary_size[1] * resize_factor)), resample=Image.LANCZOS)
        image_two = image_two.resize((int(lower_boundary_size[0] * resize_factor),
                                      int(lower_boundary_size[1] * resize_factor)), resample=Image.LANCZOS)
    return image_one, image_two
def get_ssim_similarity(image_one_name, image_two_name, window_size=7, dynamic_range=255):
    """
    The Structural Similarity Index (SSIM) is a method for measuring the similarity between two images.
    The SSIM index can be viewed as a quality measure of one of the images being compared, provided the
    other image is regarded as of perfect quality.
    :param image_one_name: primary image to evaluate against a secondary image
    :param image_two_name: secondary image to evaluate against the primary image
    :param window_size: The side-length of the sliding window used in comparison. Must be an odd value.
    :param dynamic_range: Dynamic range of the input image, specified as a positive scalar.
    The default dynamic range is 255 for images of data type uint8.
    :return: similarity score as a percentage, rounded to two decimal places
    """
    image_one = Image.open(image_one_name)
    image_two = Image.open(image_two_name)
    # Every image dimension must hold at least one full window.
    if min(list(image_one.size) + list(image_two.size)) < window_size:
        raise Exception("One of the images was too small to process using the SSIM approach")
    image_one, image_two = pre_process_images(image_one, image_two, True)
    # Mode 'I' stores one channel of 32-bit integer pixels.
    image_one, image_two = image_one.convert('I'), image_two.convert('I')
    c1 = (dynamic_range * 0.01) ** 2
    c2 = (dynamic_range * 0.03) ** 2
    pixel_length = window_size ** 2
    ssim = 0.0
    # Ignore the right and bottom margins that do not fill a whole window.
    adjusted_width = image_one.size[0] // window_size * window_size
    adjusted_height = image_one.size[1] // window_size * window_size
    for i in range(0, adjusted_width, window_size):
        for j in range(0, adjusted_height, window_size):
            crop_box = (i, j, i + window_size, j + window_size)
            crop_box_one = image_one.crop(crop_box)
            crop_box_two = image_two.crop(crop_box)
            np_array_one, np_array_two = np.array(crop_box_one).flatten(), np.array(crop_box_two).flatten()
            np_variable_one, np_variable_two = np.var(np_array_one), np.var(np_array_two)
            np_average_one, np_average_two = np.average(np_array_one), np.average(np_array_two)
            # Covariance of the two windows: E[xy] - E[x] * E[y]
            cov = (np.sum(np_array_one * np_array_two) - (np.sum(np_array_one) *
                   np.sum(np_array_two) / pixel_length)) / pixel_length
            ssim += ((2.0 * np_average_one * np_average_two + c1) * (2.0 * cov + c2)) / \
                    ((np_average_one ** 2 + np_average_two ** 2 + c1) * (np_variable_one + np_variable_two + c2))
    # Average the per-window scores and express the result as a percentage.
    similarity_percent = (ssim * pixel_length / (adjusted_height * adjusted_width)) * 100
    return round(similarity_percent, 2)
target_image = 'a.jpg'
image_directory = 'b_images'
images = get_image_files(image_directory)
for image in images:
    ssim_result = get_ssim_similarity(target_image, image)
    # Show the score so the candidates can be compared against each other.
    print(image, ssim_result)
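For reference, the quantity accumulated per window in get_ssim_similarity above corresponds to the usual SSIM expression. The symbols below are chosen here for illustration and do not appear in the code: $\mu$ is a window mean, $\sigma^2$ a window variance, $\sigma_{xy}$ the covariance of the two windows, and $L$ the dynamic range.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + c_1)\,(2\sigma_{xy} + c_2)}
       {(\mu_x^2 + \mu_y^2 + c_1)\,(\sigma_x^2 + \sigma_y^2 + c_2)},
\qquad c_1 = (0.01\,L)^2, \quad c_2 = (0.03\,L)^2
```

The function then averages this value over all non-overlapping windows and multiplies by 100 to report a percentage.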
I would also recommend looking at the Python module ImageHash. I have multiple code examples and test cases published here.
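To illustrate why a hash-based comparison can be cheap, here is a minimal, dependency-free sketch of the average-hash idea. It is not the ImageHash implementation: the function names are hypothetical, and the inputs are assumed to be small grayscale grids (lists of rows) that have already been downscaled, which a real aHash would do first.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def most_similar(reference, candidates):
    """Return the name of the candidate whose hash is closest to the reference."""
    reference_hash = average_hash(reference)
    return min(
        candidates,
        key=lambda name: hamming_distance(reference_hash, average_hash(candidates[name])),
    )


# Tiny made-up grids standing in for downscaled grayscale images.
reference = [[10, 200], [220, 30]]
candidates = {
    "near": [[20, 180], [210, 40]],
    "far": [[200, 10], [30, 220]],
}
best_match = most_similar(reference, candidates)
print(best_match)
```

Comparing two hashes is only a bit count, so ranking a whole set B against a reference A costs far less than a windowed SSIM pass, at the price of a coarser similarity measure.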
I was using the Albumentations library to perform some data augmentations on an object detection dataset that I intended to train a YOLOv5 model on.
I have to perform the augmentations separately and save the images locally to disk, but when I do, I notice that some of the output bounding boxes aren't generated properly.
I have my augmentations set up in a separate aug.py file, shown below (augmentations purposefully removed in debugging attempts, see below):
import albumentations as A
import cv2
PROB = 0.5
bbp = A.BboxParams(format="yolo")
horizontal_flip_transform = A.Compose([
], bbox_params = bbp)
vertical_flip_transform = A.Compose([
], bbp)
pixel_dropout_transform = A.Compose([
], bbox_params = bbp)
random_rotate = A.Compose([
], bbox_params = bbp )
#NOTE: THIS METHOD IMPLIES THAT THE IMAGE WIDTHS MUST BE AT LEAST 50 PIXELS
#Remove this aug to remove this constraint
random_crop = A.Compose([
], bbox_params = bbp)
augs = [horizontal_flip_transform, vertical_flip_transform, pixel_dropout_transform, random_rotate, random_crop]
def get_augmentations():
    return augs
And the relevant parts of my implementation for performing the augmentations and saving them to disk are below:
def run_augments_on_image(img_name, bboxes, max_images_to_generate = 500):
    ret = []
    img = np.array(Image.open(img_name), dtype=np.uint8)
    transforms = get_augmentations()
    for i in range(min(len(transforms), max_images_to_generate)):
        transformed = transforms[i](image=img, bboxes = bboxes)
        ret.append((transformed["image"] , transformed["bboxes"]))
    return ret

def run_and_save_augments_on_image_sets(batch_img_names, bboxes_urls, max_images_to_generate, dataset_dir, trainval):
    num_images = 0
    for i in range(len(batch_img_names)):
        bboxes = []
        with open(os.path.join(dataset_dir, trainval, 'labels', bboxes_urls[i]), 'r') as f:
            for row in f:
                x = row.strip().split(' ')
                x.append(row[0])
                x.pop(0)
                x[0] = float(x[0])
                x[1] = float(x[1])
                x[2] = float(x[2])
                x[3] = float(x[3])
                bboxes.append(x)
        trans = run_augments_on_image(os.path.join(dataset_dir, trainval, 'images', batch_img_names[i]), bboxes)
        img_index = len(os.listdir(os.path.join(dataset_dir, 'train' , 'images'))) + len(os.listdir(os.path.join(dataset_dir, 'valid', 'images'))) + 1
        for j in range(len(trans)):
            img_trans, bboxes_trans = trans[j]
            p = Image.fromarray(img_trans).save(os.path.join(dataset_dir, trainval, 'images' , f'image-{img_index}.{batch_img_names[j].split(".")[-1]}'))
            with open(os.path.join(dataset_dir, trainval, 'labels', f'image-{img_index}.txt'), 'w') as f:
                for boxs in bboxes_trans:
                    print(f'{boxs[-1]} {boxs[0]} {boxs[1]} {boxs[2]} {boxs[3]}', file=f)
            num_images += 1
            img_index += 1
            if num_images >= max_images_to_generate:
                break
        if num_images >= max_images_to_generate:
            break
For testing purposes (some of the bounding boxes were off), I removed all the actual augmentations, expecting the input image label (one augmented image example shown below) to be equal to the augmented label, since there were no augmentations. But, as you can see, the two labels are different.
img-original.txt
0 0.5662285714285714 0.2740066225165563 0.5297714285714286 0.4837913907284769
img-augmented.txt
0 0.51488 0.47173333333333334 0.6405099999999999 0.6527333333333334
(The labels above are in normalized xywh YOLO format)
Why is albumentations altering the labels? None of the augmentations in augs.py contain anything.
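For readers unfamiliar with the label format, here is a minimal sketch of how a normalized xywh YOLO line maps to pixel corners. It only illustrates the format and does not diagnose the mismatch above; the helper names and the 200 x 100 image size are assumptions made up for the example.

```python
def parse_yolo_line(line):
    """Split 'class x_center y_center width height' into (class_id, box)."""
    class_id, *values = line.split()
    return int(class_id), tuple(float(value) for value in values)


def yolo_to_pixel_box(x_center, y_center, width, height, image_width, image_height):
    """Convert a normalized YOLO box to (x_min, y_min, x_max, y_max) in pixels."""
    box_width = width * image_width
    box_height = height * image_height
    x_min = x_center * image_width - box_width / 2
    y_min = y_center * image_height - box_height / 2
    return (x_min, y_min, x_min + box_width, y_min + box_height)


# Hypothetical label and image size, chosen so the arithmetic is easy to follow.
class_id, box = parse_yolo_line("0 0.5 0.5 0.5 0.25")
pixel_box = yolo_to_pixel_box(*box, image_width=200, image_height=100)
print(class_id, pixel_box)
```

Running the same conversion on the original and augmented label files is one way to see how far apart the two boxes actually are in pixels.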
TL;DR: When concatenating 10 MB worth of images into one large image, the resulting image takes 1 GB of memory before I save/optimize it to disk. How can I make this in-memory size smaller?
I am working on a project where I take a list of lists of Python PIL image objects (image tiles) and glue them together to:
1. Generate a list of images that have been concatenated together into columns
2. Take #1 and make a full-blown image out of all the tiles
This post has been great at providing a function that accomplishes 1 and 2 by:
- Figuring out the final image size
- Creating a blank canvas for images to be added to
- Adding all the images, in sequence, to the canvas we just generated
However, the issue I am encountering with the code is this:
The size of the original objects in the list of lists is ~50 MB.
When I do the first pass over the list of lists of image objects to generate the list of column images, the memory increases by 1 GB. And when I make the final image, the memory increases by another 1 GB.
Since the resulting image is 105,985 x 2560 pixels, the 1 GB is somewhat expected: (105984*2560)*3 /1024 /1024 is roughly 800 MB.
My hunch is that the canvases being created are unoptimized and hence take up a lot of space (pixels * 3 bytes), whereas the image tile objects I am pasting onto the canvas are optimized for size.
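The back-of-the-envelope figure above can be reproduced with a few lines. This is a sketch that assumes an uncompressed buffer with a fixed number of bytes per pixel (3 for RGB); the helper name is hypothetical.

```python
def raw_image_megabytes(width, height, bytes_per_pixel=3):
    """Size of an uncompressed pixel buffer in megabytes (1 MB = 1024 * 1024 bytes)."""
    return width * height * bytes_per_pixel / 1024 / 1024


# Dimensions taken from the calculation in the question.
rgb_megabytes = raw_image_megabytes(105984, 2560)
rgba_megabytes = raw_image_megabytes(105984, 2560, bytes_per_pixel=4)
print(f"RGB: {rgb_megabytes:.2f} MB, RGBA: {rgba_megabytes:.2f} MB")
```

The in-memory size depends only on the pixel dimensions and the mode, not on how well the source files compress on disk.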
Hence my question: using PIL/Python 3, is there a better way to concatenate images together while keeping their original sizes/optimizations? After I process/re-optimize the image via
.save(DiskLocation, optimize=True, quality=94)
the resulting image is ~30 MB (roughly the size of the original list of lists containing PIL objects).
For reference, from the post linked above, this is the function that I use to concatenate images together:
from PIL import Image
#written by teekarna
# https://stackoverflow.com/questions/30227466/combine-several-images-horizontally-with-python
def append_images(images, direction='horizontal',
                  bg_color=(255,255,255), aligment='center'):
    """
    Appends images in horizontal/vertical direction.
    Args:
        images: List of PIL images
        direction: direction of concatenation, 'horizontal' or 'vertical'
        bg_color: Background color (default: white)
        aligment: alignment mode if images need padding;
            'left', 'right', 'top', 'bottom', or 'center'
    Returns:
        Concatenated image as a new PIL image object.
    """
    widths, heights = zip(*(i.size for i in images))
    if direction=='horizontal':
        new_width = sum(widths)
        new_height = max(heights)
    else:
        new_width = max(widths)
        new_height = sum(heights)
    new_im = Image.new('RGB', (new_width, new_height), color=bg_color)
    offset = 0
    for im in images:
        if direction=='horizontal':
            y = 0
            if aligment == 'center':
                y = int((new_height - im.size[1])/2)
            elif aligment == 'bottom':
                y = new_height - im.size[1]
            new_im.paste(im, (offset, y))
            offset += im.size[0]
        else:
            x = 0
            if aligment == 'center':
                x = int((new_width - im.size[0])/2)
            elif aligment == 'right':
                x = new_width - im.size[0]
            new_im.paste(im, (x, offset))
            offset += im.size[1]
    return new_im
While I do not have an explanation for what was causing my runaway memory issue, I was able to tack on some code that seems to have fixed it.
For each tile that I am trying to glue together, I ran a 'resizing' script (below). Somehow, this fixed the issue I was having ¯\_(ツ)_/¯
Resize images script:
def resize_image(image, ImageQualityReduction = .78125):
    #resize image, by percentage
    width, height = image.size
    #print(width,height)
    new_width = int(round(width * ImageQualityReduction))
    new_height = int(round(height * ImageQualityReduction))
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    resized_image = image.resize((new_width, new_height), Image.LANCZOS)
    return resized_image#, new_width, new_height
The problem here is that this is only used for one image, and I need to adapt it so that multiple images can be stored (their width, height, etc.).
I am not fluent in Python. I worked with it about 4 years ago, but by now I have forgotten most of the syntax.
def __init__(self, im):
    self.image = im
    self.height, self.width, self.nbchannels = im.shape
    self.size = self.width * self.height
    self.maskONEValues = [1,2,4,8,16,32,64,128]
    #Mask used to put one ex:1->00000001, 2->00000010 .. associated with OR bitwise
    self.maskONE = self.maskONEValues.pop(0) #Will be used to do bitwise operations
    self.maskZEROValues = [254,253,251,247,239,223,191,127]
    #Mask used to put zero ex:254->11111110, 253->11111101 .. associated with AND bitwise
    self.maskZERO = self.maskZEROValues.pop(0)
    self.curwidth = 0 # Current width position
    self.curheight = 0 # Current height position
    self.curchan = 0 # Current channel position
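As a side note on the two mask lists in the constructor above, here is a minimal sketch of what the comments describe: OR with a "one" mask forces a bit to 1, and AND with the matching "zero" mask forces it to 0. The constant and function names are hypothetical and only the lowest bit is shown.

```python
MASK_ONE = 0b00000001   # first entry of maskONEValues (1)
MASK_ZERO = 0b11111110  # first entry of maskZEROValues (254)


def set_lowest_bit(value, bit):
    """Return value with its lowest bit forced to the given bit (0 or 1)."""
    if bit:
        return value | MASK_ONE   # OR turns the masked bit on
    return value & MASK_ZERO      # AND turns the masked bit off


# 200 is 0b11001000, so only the last bit changes.
with_one = set_lowest_bit(200, 1)
with_zero = set_lowest_bit(201, 0)
print(with_one, with_zero)
```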
I want to store multiple images (their width, height, etc.) from a file path (that contains these images) in an array.
Try:
from PIL import Image
import os
# This variable will store the data of the images
Image_data = []
dir_path = r"C:\Users\vasudeos\Pictures"
for file in os.listdir(dir_path):
    if file.lower().endswith(".png"):
        # Creating the image file object
        img = Image.open(os.path.join(dir_path, file))
        # Getting Dimensions of the image
        x, y = img.size
        # Getting channels of the image
        channel = img.mode
        img.close()
        # Adding the data of the image file to our list
        Image_data.append(tuple([channel, (x, y)]))
print(Image_data)
Just change the dir_path variable to the directory of your image files. This code stores the color mode and dimensions of each image in a tuple unique to that file, and adds the tuple to a list.
P.S.:
Tuple format = (channels, dimensions)
I am trying to generate a heat map, or probability map, for Whole Slide Images (WSIs) using probability values. I have coordinate points (which determine areas on the WSIs) and corresponding probability values.
Basic introduction to WSIs: WSIs are large in size (almost 100000 x 100000 pixels), so they can't be opened with a normal image viewer. WSIs are processed using the OpenSlide software.
I have seen previous posts on Stack Overflow related to heat maps, but as WSIs are processed in a different way, I am unable to figure out how to apply these solutions. Some examples that I followed: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, etc.
To generate a heat map on WSIs, follow the instructions below:
First of all, extract image patches and save the coordinates. Use the code below for patch extraction. The code requires some changes as per your requirements. The code has been copied from: patch extraction code link
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import logging
try:
    import Image
except ImportError:
    from PIL import Image
import math
import numpy as np
import openslide
import os
from time import strftime,gmtime
parser = argparse.ArgumentParser(description='Extract a series of patches from a whole slide image')
parser.add_argument("-i", "--image", dest='wsi', nargs='+', required=True, help="path to a whole slide image")
parser.add_argument("-p", "--patch_size", dest='patch_size', default=299, type=int, help="pixel width and height for patches")
parser.add_argument("-b", "--grey_limit", dest='grey_limit', default=0.8, type=float, help="greyscale value to determine if there is sufficient tissue present [default: `0.8`]")
parser.add_argument("-o", "--output", dest='output_name', default="output", help="Name of the output file directory [default: `output/`]")
parser.add_argument("-v", "--verbose",
                    dest="logLevel",
                    choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'],
                    default="INFO",
                    help="Set the logging level")
args = parser.parse_args()
if args.logLevel:
    logging.basicConfig(level=getattr(logging, args.logLevel))
wsi=' '.join(args.wsi)
""" Set global variables """
mean_grey_values = args.grey_limit * 255
number_of_useful_regions = 0
wsi=os.path.abspath(wsi)
outname=os.path.abspath(args.output_name)
basename = os.path.basename(wsi)
level = 0
def main():
    img,num_x_patches,num_y_patches = open_slide()
    logging.debug('img: {}, num_x_patches = {}, num_y_patches: {}'.format(img,num_x_patches,num_y_patches))
    for x in range(num_x_patches):
        for y in range(num_y_patches):
            img_data = img.read_region((x*args.patch_size,y*args.patch_size),level, (args.patch_size, args.patch_size))
            print_pics(x*args.patch_size,y*args.patch_size,img_data,img)
    # number_of_useful_regions counts the saved patches, so the uninformative share is the remainder
    pc_uninformative = 100 - number_of_useful_regions/(num_x_patches*num_y_patches)*100
    pc_uninformative = round(pc_uninformative,2)
    logging.info('Completed patch extraction of {} images.'.format(number_of_useful_regions))
    logging.info('{}% of the image is uninformative\n'.format(pc_uninformative))
def print_pics(x_top_left,y_top_left,img_data,img):
    global number_of_useful_regions
    if x_top_left % 100 == 0 and y_top_left % 100 == 0 and x_top_left != 0:
        pc_complete = round(x_top_left /img.level_dimensions[0][0],2) * 100
        logging.info('{:.2f}% Complete at {}'.format(pc_complete,strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime())))
        # NOTE: as written, this stops the script at the first progress checkpoint;
        # remove the exit() to process the whole slide
        exit()
    img_data_np = np.array(img_data)
    """ Convert to grayscale"""
    grey_img = rgb2gray(img_data_np)
    # Only patches darker than the threshold (i.e. containing tissue) are counted and saved
    if np.mean(grey_img) < mean_grey_values:
        logging.debug('Image grayscale = {} compared to threshold {}'.format(np.mean(grey_img),mean_grey_values))
        number_of_useful_regions += 1
        wsi_base = os.path.basename(wsi)
        wsi_base = wsi_base.split('.')[0]
        img_name = wsi_base + "_" + str(x_top_left) + "_" + str(y_top_left) + "_" + str(args.patch_size)
        #write_img_rotations(img_data_np,img_name)
        logging.debug('Saving {} {} {}'.format(x_top_left,y_top_left,np.mean(grey_img)))
        save_image(img_data_np,1,img_name)
def gen_x_and_y(xlist,ylist,img):
    for x in xlist:
        for y in ylist:
            img_data = img.read_region((x*args.patch_size,y*args.patch_size),level, (args.patch_size, args.patch_size))
            yield (x, y,img_data)
def open_slide():
    """
    The first level is always the main image
    Get width and height tuple for the first level
    """
    logging.debug('img: {}'.format(wsi))
    img = openslide.OpenSlide(wsi)
    img_dim = img.level_dimensions[0]
    """
    Determine what the patch size should be, and how many iterations it will take to get through the WSI
    """
    num_x_patches = int(math.floor(img_dim[0] / args.patch_size))
    num_y_patches = int(math.floor(img_dim[1] / args.patch_size))
    remainder_x = img_dim[0] % num_x_patches
    remainder_y = img_dim[1] % num_y_patches
    logging.debug('The WSI shape is {}'.format(img_dim))
    logging.debug('There are {} x-patches and {} y-patches to iterate through'.format(num_x_patches,num_y_patches))
    return img,num_x_patches,num_y_patches
def validate_dir_exists():
    if os.path.isdir(outname) == False:
        os.mkdir(outname)
    logging.debug('Validated {} directory exists'.format(outname))
    if os.path.exists(wsi):
        logging.debug('Found the file {}'.format(wsi))
    else:
        logging.debug('Could not find the file {}'.format(wsi))
        exit()
def rgb2gray(rgb):
    """Converts an RGB image into grayscale """
    r, g, b = rgb[:,:,0], rgb[:,:,1], rgb[:,:,2]
    gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
    return gray
def save_image(img,j,img_name):
    tmp = os.path.join(outname,img_name+"_"+str(j)+".png")
    try:
        im = Image.fromarray(img)
        im.save(tmp)
    except:
        print('Could not print {}'.format(tmp))
        exit()
if __name__ == '__main__':
    validate_dir_exists()
    main()
Secondly, generate the probability value of each patch.
Finally, replace all the pixel values within each patch's coordinates with the corresponding probability value and display the result using a color map.
This is the basic idea of generating a heat map on WSIs. You can modify the code and concept to get a heat map as per your wish.
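The last two steps can be sketched as follows. This is a minimal, dependency-free illustration rather than a definitive implementation: it assumes the patch coordinates are the top-left corners saved during extraction, it stores one cell per patch instead of one value per pixel (to avoid allocating a slide-sized array), and the function name and sample values are hypothetical.

```python
def build_heatmap(patches, patch_size, slide_width, slide_height, background=0.0):
    """Build a grid with one probability per patch position.

    patches: iterable of (x_top_left, y_top_left, probability) in level-0 pixels.
    """
    columns = slide_width // patch_size
    rows = slide_height // patch_size
    heatmap = [[background] * columns for _ in range(rows)]
    for x_top_left, y_top_left, probability in patches:
        column = x_top_left // patch_size
        row = y_top_left // patch_size
        # Skip coordinates that fall outside the grid of whole patches.
        if 0 <= row < rows and 0 <= column < columns:
            heatmap[row][column] = probability
    return heatmap


# Made-up example: a 897 x 598 slide cut into 299-pixel patches (3 columns, 2 rows).
sample_patches = [(0, 0, 0.9), (598, 299, 0.2)]
heatmap = build_heatmap(sample_patches, patch_size=299, slide_width=897, slide_height=598)
print(heatmap)
```

The resulting grid is small enough to display with an ordinary plotting library and a color map, and it can be upscaled or overlaid on a slide thumbnail if a pixel-aligned view is needed.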
We have developed a python package for processing whole-slide-images:
https://github.com/amirakbarnejad/PyDmed
Here is a tutorial for getting heatmaps for whole-slide-images:
https://amirakbarnejad.github.io/Tutorial/tutorial_section5.html.
Also here is a sample notebook that gets heatmaps for WSIs using PyDmed:
Link to the sample notebook.
The benefit of PyDmed is that it is multi-processed. The dataloader sends a stream of patches to GPU(s), and the StreamWriter writes to disk in a separate process. Therefore, it is highly efficient. The running time of course depends on the machine, the size of WSIs, etc. On a good machine with a good GPU, PyDmed can generate heatmaps for ~120 WSIs in one day.
I have 100 points and I want to divide them into 10 different groups based on their distance from 10 reference points and write each group to a file.
I wrote my program as:
from numpy import *
from math import *
from time import *
a=1.0
b=1.0
nx=10 # number of mesh in x
ny=10 # number of mesh in y
dx=a/nx
dy=b/ny
data=loadtxt("cvt_squate.txt",float)
n=data.shape
fids = []
for i in range(n[0]):
    ii=str(i)
    fids.append(open('file' + ii + '.txt', 'w'))
def calculateDistance(x1,y1,x2,y2):
    dist = sqrt((x2 - x1)**2 + (y2 - y1)**2)
    return dist
for i in range(nx) :
    for j in range(ny) :
        distance=10.0
        grain=1000
        x=(i+0.5)*dx
        y=(j+0.5)*dy
        for k in range (n[0]):
            d = calculateDistance(x,y,data[k,0],data[k,1])
            if d<distance:
                distance=d
                grain=k
        print(grain)
        kk=str(grain)
        outdata = vstack((x,y)).T
        savetxt('file' + kk + 'txt', outdata)
But in my results, I have one point in each file instead of a group of points.
Without any sample data it's difficult to see how your code is supposed to work. But first, I'd recommend rewriting your imports as:
import numpy as np
import math
I can't see where you use the time module.
If you define your results as a list outside your for k loop and append to it all the points that are near the reference points, something like:
outdata = []
for k in range(n[0]):
    d = calculateDistance(x,y,data[k,0],data[k,1])
    if d<distance:
        outdata.append([data[k,0],data[k,1]])
that should get you nearer to where you want to be.
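Building on that idea, here is one possible sketch of the full grouping step. It makes assumptions the question does not state: each mesh point belongs to the single nearest reference point, and each group is written once after all points have been assigned. The function names and sample coordinates are hypothetical.

```python
import math
from collections import defaultdict


def group_by_nearest(points, references):
    """Map each reference index to the list of points closest to that reference."""
    groups = defaultdict(list)
    for x, y in points:
        nearest = min(
            range(len(references)),
            key=lambda k: math.hypot(x - references[k][0], y - references[k][1]),
        )
        groups[nearest].append((x, y))
    return dict(groups)


def write_groups(groups, prefix="file"):
    """Write every group to its own text file, one 'x y' pair per line."""
    for index, members in groups.items():
        with open(f"{prefix}{index}.txt", "w") as handle:
            for x, y in members:
                handle.write(f"{x} {y}\n")


# Small made-up example with two reference points.
references = [(0.0, 0.0), (1.0, 1.0)]
points = [(0.1, 0.1), (0.9, 0.8), (0.2, 0.0)]
groups = group_by_nearest(points, references)
print(groups)
```

Calling write_groups(groups) afterwards writes each file exactly once, so earlier points are not overwritten by later ones the way repeated savetxt calls on the same filename would do.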