How to make space for stitching multiple images in OpenCV - Python3 [duplicate] - python-3.x

I'm trying to stitch two images together by using template matching to find three sets of points, which I pass to cv2.getAffineTransform() to get a warp matrix, which I then pass to cv2.warpAffine() to align my images.
However, when I join my images, the majority of my affine'd image isn't shown. I've tried using different techniques to select points, changed the order of arguments, etc., but I can only ever get a thin sliver of the affine'd image to be shown.
Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.
This is the final result that I get. Here are the original images (1, 2) and the code that I use:
EDIT: Here are the contents of the variable trans:
array([[ 1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
[ 4.84461775e-03, 1.30769231e+00, 9.61912797e+02]])
And here are the points passed to cv2.getAffineTransform: unified_pair1
array([[ 671., 1024.],
[ 15., 979.],
[ 15., 962.]], dtype=float32)
unified_pair2
array([[ 669., 45.],
[ 18., 13.],
[ 18., 0.]], dtype=float32)
import cv2
import numpy as np
def showimage(image, name="No name given"):
    cv2.imshow(name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return
image_a = cv2.imread('image_a.png')
image_b = cv2.imread('image_b.png')
def get_roi(image):
    roi = cv2.selectROI(image) # spacebar to confirm selection
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # crop from the image that was passed in (roi is x, y, width, height)
    crop = image[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])]
    return crop
temp_1 = get_roi(image_a)
temp_2 = get_roi(image_a)
temp_3 = get_roi(image_a)
def find_template(template, search_image_a, search_image_b):
    ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED)
    template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max())
    ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED)
    template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max())
    return template_loc_a, template_loc_b
coord_a1, coord_b1 = find_template(temp_1, image_a, image_b)
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b)
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b)
def unnest_list(coords_list):
    coords_list = [a[0] for a in coords_list]
    return coords_list
coord_a1 = unnest_list(coord_a1)
coord_b1 = unnest_list(coord_b1)
coord_a2 = unnest_list(coord_a2)
coord_b2 = unnest_list(coord_b2)
coord_a3 = unnest_list(coord_a3)
coord_b3 = unnest_list(coord_b3)
def unify_coords(coords1,coords2,coords3):
    unified = []
    unified.extend([coords1, coords2, coords3])
    return unified
# Create 2 lists, each containing 3 pairs of coordinates
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3)
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3)
# Convert elements of lists to numpy arrays with data type float32
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32)
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32)
# Get result of the affine transformation
trans = cv2.getAffineTransform(unified_pair1, unified_pair2)
# Apply the affine transformation to original image
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0]))
result[0:image_b.shape[0], image_b.shape[1]:] = image_b
showimage(result)
cv2.imwrite('result.png', result)
Sources: Approach based on advice received here, this tutorial and this example from the docs.

July 12 Edit:
This post inspired GitHub repos providing functions to accomplish this task; one for a padded warpAffine() and another for a padded warpPerspective(). Check out the Python version or the C++ version.
Transformations shift the location of pixels
What any transformation does is take your point coordinates (x, y) and map them to new locations (x', y'):
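$$
\begin{bmatrix} s\,x' \\ s\,y' \\ s \end{bmatrix}
=
\begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
$$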
where s is some scaling factor. You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies---(3, 3) transformation matrices---you don't need to scale for affine transformations (you don't even need to use homogeneous coordinates...but it's better to keep this discussion general).
Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So during this process, these new locations get recorded at some point. We'll need those locations to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.
Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully, (-10,0) if logic makes any sense. And in fact, it does:
transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]])
homg_pt = np.array([0,0,1])
new_homg_pt = transf.dot(homg_pt)
new_homg_pt /= new_homg_pt[2]
# new_homg_pt = [-10. 0. 1.]
Perfect! So we can figure out where all points map with a little linear algebra. We will need to get all the (x,y) points and put them into a huge array so that every single point is in its own column. Let's pretend our image is only 4x4.
h, w = src.shape[:2] # 4, 4
indY, indX = np.indices((h,w)) # similar to meshgrid/mgrid
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))
These lin_homg_pts now contain every homogeneous point:
[[ 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
Then we can do matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.
trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]
And now we have the transformed points:
[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
As we can see, everything is working as expected: we have shifted the x-values only, by -10.
Pixels can be shifted outside of your image bounds
Notice that these pixel locations are negative---they're outside of the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we'll get some pixel values way outside our original bounds. We don't care about every pixel value though, we just need to know how far the farthest pixels are that are outside the original image pixel locations, so that we can pad the original image that far out, before displaying the warped image on it.
theta = 45*np.pi/180
transf = np.array([
[ np.cos(theta),np.sin(theta),0],
[-np.sin(theta),np.cos(theta),0],
[0.,0.,1.]])
print(transf)
trans_lin_homg_pts = transf.dot(lin_homg_pts)
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,
So we see that we can get pixel locations well outside our original image, both in the negative and positive directions. The minimum x value doesn't change because when a homography applies a rotation, it rotates about the top-left corner. One thing to note here is that I've applied the transformation to all pixels in the image. This is really unnecessary; you can simply warp the four corner points and see where they land.
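A minimal sketch of the corner-only shortcut, using NumPy alone. The helper name `warped_bounds` is made up for illustration. It uses the corner pixel indices (0 to w-1 and 0 to h-1) so the numbers match the all-pixels computation above, and it assumes the transformation does not send any point of the image to infinity, in which case the extremes of the warped image are attained at the corners.

```python
import numpy as np

def warped_bounds(transf, w, h):
    """Return (min_x, min_y, max_x, max_y) of the warped corner pixels.

    transf is a 3x3 matrix; w and h are the source width and height.
    (Hypothetical helper, not part of OpenCV.)
    """
    # Corner pixel coordinates as homogeneous column vectors
    corners = np.array([[0, w - 1, w - 1, 0],
                        [0, 0, h - 1, h - 1],
                        [1, 1, 1, 1]], dtype=np.float64)
    warped = transf.dot(corners)
    warped /= warped[2, :]  # a no-op for affine matrices, needed for homographies
    return (warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max())

# Same 45 degree rotation and 4x4 image as in the example above
theta = 45 * np.pi / 180
rot = np.array([[np.cos(theta), np.sin(theta), 0.],
                [-np.sin(theta), np.cos(theta), 0.],
                [0., 0., 1.]])
bounds = warped_bounds(rot, 4, 4)
```

For this rotation the four corners give the same minX, minY, maxX and maxY as warping all sixteen pixels.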
Padding the destination image
Note that when you call cv2.warpAffine() you have to input the destination size. These transformed pixel values reference that size. So if a pixel gets mapped to (-10,0), it won't show up in the destination image. That means that we'll have to make another homography with translations which shift all pixel locations to be positive, and then we can pad the image matrix to compensate for our shift. We'll also have to pad the original image on the bottom and the right if the homography moves points to positions bigger than the image.
In the recent example, the min x value is the same, so we need no horizontal shift. However, the min y value has dropped by about two pixels, so we need to shift the image two pixels down. First, let's create the padded destination image.
pad_sz = list(src.shape) # in case three channel
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# pad_sz = [6, 4, 3]
As we can see, the height increased from the original by two pixels to account for that shift.
Add translation to the transformation to shift all pixel locations to positive
Now, we need to create a new homography matrix to translate the warped image by the same amount that we shifted by. To apply both transformations (the original and this new shift) we have to compose the two homographies (for an affine transformation, you can simply add the translation, but not for a homography). Additionally, we need to divide by the last entry to make sure the scales are still proper (again, only for homographies):
anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0:
    anchorX = np.round(-minX).astype(int)
    # shift right by anchorX so the smallest x lands at 0
    transl_transf[0,2] += anchorX
if minY < 0:
    anchorY = np.round(-minY).astype(int)
    # shift down by anchorY so the smallest y lands at 0
    transl_transf[1,2] += anchorY
new_transf = transl_transf.dot(transf)
new_transf /= new_transf[2,2]
I also created here the anchor points for where we will place the destination image into the padded matrix; it's shifted by the same amount the homography will shift the image. So let's place the destination image inside the padded matrix:
dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
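As an aside on the remark that an affine transformation lets you simply add the translation while a homography does not, here is a small NumPy check. The matrices and the shift values are made up for illustration; only the structure of the bottom row matters.

```python
import numpy as np

def translation(tx, ty):
    """3x3 homogeneous translation matrix (illustrative helper)."""
    return np.array([[1., 0., tx],
                     [0., 1., ty],
                     [0., 0., 1.]])

tx, ty = 5., 7.

# Affine case: the bottom row is [0, 0, 1]
affine = np.array([[0.9, -0.2, 3.],
                   [0.2, 0.9, -4.],
                   [0., 0., 1.]])
affine_added = affine.copy()
affine_added[0, 2] += tx
affine_added[1, 2] += ty
affine_composed = translation(tx, ty).dot(affine)
affine_same = np.allclose(affine_added, affine_composed)

# Homography case: nonzero perspective terms in the bottom row
homog = affine.copy()
homog[2, :2] = [0.001, 0.002]
homog_added = homog.copy()
homog_added[0, 2] += tx
homog_added[1, 2] += ty
homog_composed = translation(tx, ty).dot(homog)
homog_same = np.allclose(homog_added, homog_composed)

print(affine_same, homog_same)
```

Composing multiplies the translation into the bottom row's perspective terms, which is why the two results only agree when those terms are zero.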
Warp with the new transformation into the padded image
All we have left to do is apply the new transformation to the source image (with the padded destination size), and then we can overlay the two images.
warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))
alpha = 0.3
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
Putting it all together
Let's create a function for this since we were creating quite a few variables we don't need at the end here. For inputs we need the source image, the destination image, and the original homography. And for outputs we simply want the padded destination image, and the warped image. Note that in the examples we used a 3x3 homography so we better make sure we send in 3x3 transforms instead of 2x3 affine or Euclidean warps. You can just add the row [0,0,1] to any affine warp at the bottom and you'll be fine.
def warpPerspectivePadded(src, dst, transf):
    # warp the four corners of src to find the extent of the warped image
    src_h, src_w = src.shape[:2]
    lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]])
    trans_lin_homg_pts = transf.dot(lin_homg_pts)
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:]
    minX = np.min(trans_lin_homg_pts[0,:])
    minY = np.min(trans_lin_homg_pts[1,:])
    maxX = np.max(trans_lin_homg_pts[0,:])
    maxY = np.max(trans_lin_homg_pts[1,:])
    # calculate the needed padding and create a blank image to place dst within
    dst_sz = list(dst.shape)
    pad_sz = dst_sz.copy() # to get the same number of channels
    pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int)
    pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int)
    dst_pad = np.zeros(pad_sz, dtype=np.uint8)
    # add translation to the transformation matrix to shift to positive values
    anchorX, anchorY = 0, 0
    transl_transf = np.eye(3,3)
    if minX < 0:
        anchorX = np.round(-minX).astype(int)
        transl_transf[0,2] += anchorX
    if minY < 0:
        anchorY = np.round(-minY).astype(int)
        transl_transf[1,2] += anchorY
    new_transf = transl_transf.dot(transf)
    new_transf /= new_transf[2,2]
    # place dst inside the padded canvas, then warp src onto a canvas of the same size
    dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
    warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))
    return dst_pad, warped
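Since the function expects a 3x3 matrix, a 2x3 affine result such as the one returned by cv2.getAffineTransform() has to be extended with the row [0, 0, 1] first. A minimal sketch, using the values of `trans` printed in the question; the name `trans_3x3` is only illustrative.

```python
import numpy as np

# 2x3 affine matrix, values copied from the `trans` printed in the question
trans = np.array([[1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
                  [4.84461775e-03, 1.30769231e+00, 9.61912797e+02]])

# Append the row [0, 0, 1] to obtain an equivalent 3x3 matrix
trans_3x3 = np.vstack([trans, [0., 0., 1.]])
print(trans_3x3.shape)
```

Under that assumption, the question's images could then be passed as warpPerspectivePadded(image_a, image_b, trans_3x3), with image_a as the warped source and image_b as the destination.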
Example of running the function
Finally, we can call this function with some real images and homographies and see how it pans out. I'll borrow the example from LearnOpenCV:
src = cv2.imread('book2.jpg')
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32)
dst = cv2.imread('book1.jpg')
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32)
transf = cv2.getPerspectiveTransform(pts_src, pts_dst)
dst_pad, warped = warpPerspectivePadded(src, dst, transf)
alpha = 0.5
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
cv2.imshow("Blended Warped Image", blended)
cv2.waitKey(0)
And we end up with this padded warped image:
[image: padded and warped result]
as opposed to the typical cut-off warp you would normally get.

Related

How to find the direction of triangles in an image using OpenCV

I am trying to find the direction of triangles in an image. Below is the image:
These triangles are pointing upward/downward/leftward/rightward. This is not the actual image. I have already used canny edge detection to find edges then contours and then the dilated image is shown below.
My logic to find the direction:
The logic I am thinking of using is that, among the three corner coordinates, if I can identify the base coordinates of the triangle (the two corners sharing the same abscissa or ordinate), I can make a base vector. The angle between the unit vectors and the base vector can then be used to identify the direction. But this method can only determine whether it is up/down or left/right; it cannot differentiate between up and down or between right and left. I tried to find the corners using cv2.goodFeaturesToTrack, but as far as I know it gives only the 3 strongest points in the entire image. So I am wondering if there is another way to find the direction of the triangles.
Here is my code in python to differentiate between the triangle/square and circle:
#blue_masking
mask_blue=np.copy(img1)
row,columns=mask_blue.shape
for i in range(0,row):
    for j in range(0,columns):
        if (mask_blue[i][j]==25):
            mask_blue[i][j]=255
        else:
            mask_blue[i][j]=0
blue_edges = cv2.Canny(mask_blue,10,10)
kernel_blue = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(2,2))
dilated_blue = cv2.dilate(blue_edges, kernel_blue)
blue_contours,hierarchy = cv2.findContours(dilated_blue,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
for cnt in blue_contours:
    area = cv2.contourArea(cnt)
    perimeter = cv2.arcLength(cnt,True)
    M = cv2.moments(cnt)
    cx = int(M['m10']/M['m00'])
    cy = int(M['m01']/M['m00'])
    # classify by the ratio perimeter^2 / area
    if(12<(perimeter*perimeter)/area<14.8):
        shape="circle"
    elif(14.8<(perimeter*perimeter)/area<18):
        shape="square"
    elif(18<(perimeter*perimeter)/area and area>200):
        shape="triangle"
    print(shape)
    print(area)
    print((perimeter*perimeter)/area,"\n")
cv2.imshow('mask_blue',dilated_blue)
cv2.waitKey(0)
cv2.destroyAllWindows()
Source image can be found here: img1
Please help: how can I find the direction of the triangles?
Thank you.
Assuming that you only have four cases: [up, down, left, right], this code should work well for you.
The idea is simple:
Get the bounding rectangle for your contour. Use: box = cv2.boundingRect(contour_pnts)
Crop the image using the bounding rectangle.
Reduce the image vertically and horizontally using the Sum option. Now you have the sum of pixels along each axis. The axis with the largest sum determines whether the triangle base is vertical or horizontal.
To identify whether the triangle is pointing left/right or up/down, you need to check whether the bounding rectangle center is before or after the max col/row.
The code (assumes you start from the cropped image):
ver_reduce = cv2.reduce(img, 0, cv2.REDUCE_SUM, None, cv2.CV_32F)
hor_reduce = cv2.reduce(img, 1, cv2.REDUCE_SUM, None, cv2.CV_32F)
#For smoothing the reduced vector, could be removed
ver_reduce = cv2.GaussianBlur(ver_reduce, (3, 1), 0)
hor_reduce = cv2.GaussianBlur(hor_reduce, (1, 3), 0)
_,ver_max, _, ver_col = cv2.minMaxLoc(ver_reduce)
_,hor_max, _, hor_row = cv2.minMaxLoc(hor_reduce)
ver_col = ver_col[0]
hor_row = hor_row[1]
contour_pnts = cv2.findNonZero(img) #in my code I do not have the original contour points
rect_center, size, angle = cv2.minAreaRect(contour_pnts )
print(rect_center)
if ver_max > hor_max:
    if rect_center[0] > ver_col:
        print ('right')
    else:
        print ('left')
else:
    if rect_center[1] > hor_row:
        print ('down')
    else:
        print ('up')
Photos:
Well, Mark has mentioned a solution that may not be as efficient but is perhaps more accurate. I think this one should be equally efficient but perhaps less accurate. Since you already have code that finds triangles, try adding the following after you have found a triangle contour:
hull = cv2.convexHull(cnt) # convex hull of contour
hull = cv2.approxPolyDP(hull,0.1*cv2.arcLength(hull,True),True)
# You can double check if the contour is a triangle here
# by something like len(hull) == 3
You should get 3 hull points for a triangle; these should be the 3 vertices of your triangle. Given that your triangles always 'face' in only 4 directions, for a triangle facing left or right one hull point (the apex) will have a Y coordinate close to the Y coordinate of the centroid, and whether it points left or right depends on whether that point's X is less than or greater than the centroid's X. Similarly, compare the hull and centroid X and Y for a triangle pointing up or down.
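One possible way to turn that comparison into code is sketched below with NumPy only. The function name `triangle_direction` is hypothetical. It assumes the three vertices come from the hull step above (shape (3, 1, 2) or (3, 2)), that the triangle's base is roughly axis-aligned, and that image coordinates are used, with y growing downward. The centroid is taken as the mean of the vertices, which for a triangle coincides with the area centroid.

```python
import numpy as np

def triangle_direction(vertices):
    """Classify an axis-aligned triangle as 'up', 'down', 'left' or 'right'.

    vertices: three (x, y) points in image coordinates (y grows downward).
    """
    pts = np.asarray(vertices, dtype=np.float64).reshape(3, 2)
    centroid = pts.mean(axis=0)
    offsets = pts - centroid
    # The apex sits (nearly) on a horizontal or vertical line through the
    # centroid, so one of its offset components is close to zero.
    apex = offsets[np.argmin(np.abs(offsets).min(axis=1))]
    if abs(apex[0]) < abs(apex[1]):
        # apex is above or below the centroid
        return 'up' if apex[1] < 0 else 'down'
    return 'left' if apex[0] < 0 else 'right'

print(triangle_direction([(0, 10), (10, 10), (5, 0)]))
```

With noisy vertices the "close to zero" test can pick the wrong point, so a tolerance check on the hull may be worth adding before trusting the result.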

Rotate image such that it matches a second image

I would like to rotate an image based on a second image. Both images are satellite images; however, they are not rotated in the same direction (in one image the top points north, and in the other the rotation is not known). But I have at least three pixel pairs in each of the images (x1,y1,x2,y2). So my idea is to figure out their relative position and get the rotation angle from that.
Currently, I estimate the angle like this:
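import math
import numpy as np
def unit_vector(vector):
    # helper not shown in the original post; assumed to normalize to unit length
    return vector / np.linalg.norm(vector)
def angle_between(v1, v2):
    """ Returns the angle in radians between vectors 'v1' and 'v2'::
    >>> angle_between((1, 0, 0), (0, 1, 0))
    1.5707963267948966
    >>> angle_between((1, 0, 0), (1, 0, 0))
    0.0
    >>> angle_between((1, 0, 0), (-1, 0, 0))
    3.141592653589793
    """
    v1_u = unit_vector(v1)
    v2_u = unit_vector(v2)
    angle_rad = np.arccos(np.clip(np.dot(v1_u, v2_u), -1.0, 1.0))
    # note: converted to degrees here, unlike the radian values in the docstring
    return (angle_rad*180)/math.pi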
with the inputs like this:
v1 = [points[0][0] - points[1][0], points[0][1] - points[1][1]] #hist
v2 = [points[0][2] - points[1][2], points[0][3] - points[1][3]] #ref
However, this only uses two pixel pairs instead of the three. Therefore, the rotation is sometimes incorrect. Could anybody show me how to use all three pixels?
My first attempt was to check on which side of the straight line the third pixel lies in the image and negate the angle based on that. But this does not work for all images.
EDIT:
I cannot add the original images, as they are copyrighted, however, as the image content is not really important I added whitened images. The first is the input image with the three points drawn in, the second is the rotated image (where additionally the (wrong, due to rotation) cutout area is marked with a rectangle) and third the historical image.
The points are the following:
567.01,144,1544.4,4581.8
1182.6,1568.1,2934.1,3724.3
938.97,1398.1,2795.8,4002.5
with:
x_historical, y_historical, x_presentday, y_presentday
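One way all three pairs could be used, offered only as a sketch and not as an answer from the thread: fit a single rotation to all matched points in the least-squares sense. In 2D, after centering both point sets, the best angle is atan2 of the summed cross products over the summed dot products. Unlike arccos, this yields a signed angle. The function name `rotation_angle_deg` is hypothetical, and the demo uses synthetic points rather than the coordinates above. A positive result means counterclockwise in a frame with y pointing up, which appears clockwise in image coordinates.

```python
import numpy as np

def rotation_angle_deg(src_pts, dst_pts):
    """Least-squares rotation angle (degrees) taking src_pts onto dst_pts.

    Both inputs are (N, 2) arrays of matching (x, y) points, N >= 2.
    Translation is removed by centering; uniform scale does not affect the angle.
    """
    p = np.asarray(src_pts, dtype=np.float64)
    q = np.asarray(dst_pts, dtype=np.float64)
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)
    # Sums of 2D cross and dot products over all point pairs
    cross = np.sum(p[:, 0] * q[:, 1] - p[:, 1] * q[:, 0])
    dot = np.sum(p[:, 0] * q[:, 0] + p[:, 1] * q[:, 1])
    return np.degrees(np.arctan2(cross, dot))

# Synthetic check: rotate three points by 30 degrees and translate them
theta = np.radians(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
src = np.array([[0., 0.], [1., 0.], [0., 2.]])
dst = src.dot(R.T) + np.array([10., -3.])
angle = rotation_angle_deg(src, dst)
```

With the point list from the question, the first two columns would be passed as one point set and the last two as the other.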

Need help finding 2 separate contours instead of a combined contour in MICR code

I am running OCR on bank cheques using the pyimagesearch tutorial to detect the MICR code. The code used in the tutorial detects group contours and character contours from a reference image containing symbols.
In the tutorial, when finding the contours for the symbol below,
the code uses a built-in Python iterator to iterate over the contours (here 3 separate contours), which are combined to give a character for recognition purposes.
But in the cheque dataset that I use, I have the symbol at low resolution.
The actual bottom of the cheque is:
which causes the iterator to consider contour 2 and contour 3 as a single contour. Because of this, the iterator moves on to the character following the above symbol (here '0') and prepares an incorrect template to match against the reference symbols. You can see the code below for a better understanding.
I know that noise in the image is a factor here, but is it possible to reduce the noise and also find the exact contours to detect the symbol?
I tried using noise reduction techniques like cv2.fastNlMeansDenoising and cv2.GaussianBlur before the cv2.findContours step, but contours 2 and 3 are still detected as a single contour instead of 2 separate contours.
I also tried altering the cv2.findContours parameters.
Below is the working code where the characters are iterated for better understanding of python builtin iterator:
def extract_digits_and_symbols(image, charCnts, minW=5, minH=10):
    # grab the internal Python iterator for the list of character
    # contours, then initialize the character ROI and location
    # lists, respectively
    charIter = charCnts.__iter__()
    rois = []
    locs = []
    # keep looping over the character contours until we reach the end
    # of the list
    while True:
        try:
            # grab the next character contour from the list, compute
            # its bounding box, and initialize the ROI
            c = next(charIter)
            (cX, cY, cW, cH) = cv2.boundingRect(c)
            roi = None
            # check to see if the width and height are sufficiently
            # large, indicating that we have found a digit
            if cW >= minW and cH >= minH:
                # extract the ROI
                roi = image[cY:cY + cH, cX:cX + cW]
                rois.append(roi)
                cv2.imshow('roi',roi)
                cv2.waitKey(0)
                locs.append((cX, cY, cX + cW, cY + cH))
            # otherwise, we are examining one of the special symbols
            else:
                # MICR symbols include three separate parts, so we
                # need to grab the next two parts from our iterator,
                # followed by initializing the bounding box
                # coordinates for the symbol
                parts = [c, next(charIter), next(charIter)]
                (sXA, sYA, sXB, sYB) = (np.inf, np.inf, -np.inf,
                    -np.inf)
                # loop over the parts
                for p in parts:
                    # compute the bounding box for the part, then
                    # update our bookkeeping variables
                    # c = next(charIter)
                    # (cX, cY, cW, cH) = cv2.boundingRect(c)
                    # roi = image[cY:cY+cH, cX:cX+cW]
                    # cv2.imshow('symbol', roi)
                    # cv2.waitKey(0)
                    # roi = None
                    (pX, pY, pW, pH) = cv2.boundingRect(p)
                    sXA = min(sXA, pX)
                    sYA = min(sYA, pY)
                    sXB = max(sXB, pX + pW)
                    sYB = max(sYB, pY + pH)
                # extract the ROI
                roi = image[sYA:sYB, sXA:sXB]
                cv2.imshow('symbol', roi)
                cv2.waitKey(0)
                rois.append(roi)
                locs.append((sXA, sYA, sXB, sYB))
        # we have reached the end of the iterator; gracefully break
        # from the loop
        except StopIteration:
            break
    # return a tuple of the ROIs and locations
    return (rois, locs)
edit: contour 2 & 3 instead of contours 1 & 2
Try to find the right threshold value instead of using cv2.THRESH_OTSU. It seems it should be possible to find a suitable threshold from the provided example. If you can't find a threshold value that works for all images, you can try morphological closing on the threshold result with a structuring element of 1-pixel width.
Edit (steps):
For the threshold, you need to find an appropriate value by hand; in your image a threshold value of 100 seems to work:
i = cv.imread('image.png')
g = cv.cvtColor(i, cv.COLOR_BGR2GRAY)
_, tt = cv.threshold(g, 100, 255, cv.THRESH_BINARY_INV)
As for the closing variant:
_, t = cv.threshold(g, 0,255,cv.THRESH_BINARY_INV | cv.THRESH_OTSU)
kernel = np.ones((12,1), np.uint8)
c = cv.morphologyEx(t, cv.MORPH_OPEN, kernel)
Note that I used import cv2 as cv. I also used opening instead of closing, since in the example they inverted colors during thresholding.

Optimizing the access and changing of value of SLIC superpixels

I'm trying to do semantic segmentation using a SLIC variant and want to create a mask for the original image where each segment is colored (according to its class) based on available point-based annotations. If there is no point-based annotation in a segment, then leave it as 0.
I currently have x, y points and their associated labels for an image and a (slow) method that finds and colors the desired segments. I'm familiar with vectorization or the 'pythonic' way of doing things, but I can't seem to speed up this last for-loop and would love some advice or references on optimization. Thanks.
# Point-based annotations
annotation = pd.read_csv("a_dataframe.csv") # [X, Y, Label]
color_label = {'class 1' : 25, 'class 2' : 50, 'class 3' : 75}
# Uses CPU to create single segmented image with current params
slic = SlicAvx2(num_components = n_segments, compactness = n_compactness)
segmented_image = slic.iterate(cv2.cvtColor(each_image, cv2.COLOR_RGB2LAB))
# Finds the segments of interest and records their ID
X = np.array(each_annotation.iloc[:, 0], dtype = 'uint8')
Y = np.array(each_annotation.iloc[:, 1], dtype = 'uint8')
L = np.array(each_annotation.iloc[:, 2], dtype = 'str') # Labels
DS = segmented_image[X, Y] # Desired Segments
# Empty mask, marks the segments of interest with the classes of the point in them
mask = np.zeros(each_image.shape[:2], dtype = "uint8")
# Would ideally like to find a quicker way of doing this
for (index, segVal) in enumerate(DS):
    mask[segmented_image == segVal] = color_label.get(L[index])
I have essentially what I would like to replace that loop with here:
[mask[segmented_image == s] for i, s in enumerate(DS)]
but I'm not able to assign X, Y locations with the appropriate Label in mask. I thought it would be something similar to this:
[mask[segmented_image == s] for i, s in enumerate(DS)] = color_label.get(L[i])
but it appears that I'm trying to assign a color value to the lists I'm generating...
Are you looking for ind2rgb?
It converts an indexed map (one index per SLIC segment, possibly the same index for multiple regions) to an RGB image, based on a map from index to color.
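ind2rgb is a MATLAB function, so for the NumPy code in the question the same idea can be sketched with a lookup table indexed by segment ID. This is an illustration under assumptions: segment IDs are taken to be non-negative integers, and the small arrays below are toy stand-ins for the question's segmented_image, DS and L.

```python
import numpy as np

# Toy stand-ins for the question's variables (hypothetical values)
segmented_image = np.array([[0, 0, 1],
                            [2, 2, 1],
                            [2, 3, 3]])
DS = np.array([0, 2])                 # segment IDs hit by annotations
L = np.array(['class 1', 'class 3'])  # label of each annotation
color_label = {'class 1': 25, 'class 2': 50, 'class 3': 75}

# Lookup table: one entry per segment ID, 0 where there is no annotation
lut = np.zeros(segmented_image.max() + 1, dtype=np.uint8)
lut[DS] = [color_label[label] for label in L]

# Index the table with the whole segment map at once (no per-segment loop)
mask = lut[segmented_image]
print(mask)
```

The per-segment comparison over the full image is replaced by one pass that fills the table and one fancy-indexing pass over the image.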

De-Skewing image

I am unable to figure out how this deskew function works:
def deskew(img):
    m = cv2.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11']/m['mu02']
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv2.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img
I know that the moment is a quantitative measure of the shape.
In image processing, the moments give information about the total
area or Intensity, the centroid of the shape and the orientation of the
shape.
Area or total Mass:-
The zeroth moment M(0,0) gives the total Mass or Area.
In image processing, the M(0,0) is the sum of all the pixels and if it is a binary image then sum of pixels gives the area.
Center of mass or Centroid:- When the first moment is divided by
the total mass then it gives the centroid.
Centroid is that point where the shape is perfectly balanced on the
tip of the pin.
M(0,1)/M(0,0) ,M(1,0)/M(0,0)
I think the image from the tutorial you got the code from gives the intuitive idea pretty well:
To deskew the image, they measure the skew as the mixed central moment mu11 (how x and y vary together) relative to mu02 (the spread along the y axis), which is the ratio skew = m['mu11']/m['mu02']. They then use a shear matrix built from this skew to undo the slant. To shear about the middle of the image rather than the (0,0) point, they also add a translation, which is where M[0, 2] = -0.5*SZ*skew comes from: with it, x is shifted by skew*(y - 0.5*SZ), so the row at y = 0.5*SZ stays in place.
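To make the mu11/mu02 ratio concrete, here is a small sketch that computes both central moments directly with NumPy, instead of cv2.moments, for a toy image. The helper name `central_moments_skew` and the 8x8 diagonal test image are made up for illustration.

```python
import numpy as np

def central_moments_skew(img):
    """Return mu11 / mu02 computed directly from pixel intensities."""
    ys, xs = np.indices(img.shape)
    m00 = img.sum()
    x_bar = (xs * img).sum() / m00   # centroid x
    y_bar = (ys * img).sum() / m00   # centroid y
    mu11 = ((xs - x_bar) * (ys - y_bar) * img).sum()
    mu02 = ((ys - y_bar) ** 2 * img).sum()
    return mu11 / mu02

# A stroke that moves one pixel to the right for every pixel down (x = y)
img = np.eye(8)
skew = central_moments_skew(img)
print(skew)
```

For this diagonal stroke the ratio is 1, meaning one pixel of horizontal drift per row, which is the quantity the shear matrix in deskew compensates for.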
