How can I create mask images in Python 3 using features.rasterize?

Currently, I use the following piece of code to create mask images (classes = ['tree', 'car', 'bicycle']; polygons is a list of geometry objects, where each geometry object has a coordinates field that defines the polygon on the image bounding the class object):
def create_mask(self, mask_size, classes, polygons):
    # type: (Tuple[int, int], List[str], List[geometry]) -> Image
    # Create a new palette image; the default fill color of Image.new() is black
    # https://pillow.readthedocs.io/en/3.3.x/handbook/concepts.html#modes
    img = Image.new('P', mask_size)
    img.putpalette(self.palette)  # palette = [0, 0, 0, 255, 0, 0, ...]
    draw = ImageDraw.Draw(img)
    for i, class_ in enumerate(classes):
        color_index = self.class_to_color_index[class_]
        draw.polygon(xy=polygons[i].exterior.coords, fill=color_index)
    del draw
    return img
Is there any way to rewrite this piece of code using features.rasterize?
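One possible rewrite, as a minimal untested sketch: rasterio's features.rasterize burns (geometry, value) pairs into a numpy array, which can then be wrapped in a palette image. This assumes polygons are shapely geometries and that palette and class_to_color_index are the same attributes as in your code; with the default identity transform, geometry coordinates are interpreted directly as pixel coordinates.
import numpy as np
from PIL import Image
from rasterio import features
def create_mask(self, mask_size, classes, polygons):
    # type: (Tuple[int, int], List[str], List[geometry]) -> Image
    width, height = mask_size
    # pair each geometry with the palette index it should be burned with
    shapes = [(polygons[i], self.class_to_color_index[class_])
              for i, class_ in enumerate(classes)]
    mask = features.rasterize(shapes, out_shape=(height, width), fill=0, dtype=np.uint8)
    img = Image.fromarray(mask, mode='P')
    img.putpalette(self.palette)
    return img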

Related

Please suggest how I can extract text data from hand-filled character-per-box forms using Python

hand-filled character per box form
I want to automate a process in which I would get hand-filled character-per-box forms in image format and I need to extract the text from them. A box surrounds each letter, and I have to extract all the text from the image of the form.
You can do this by selecting contours by size, finding the rotated rectangle, and applying the inverse transform.
import cv2
import numpy as np
img = cv2.imread('4YAry.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# convert to binary image
thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)[1]
contours, hierarchy = cv2.findContours(thresh, 1, 2)  # 1 = RETR_LIST, 2 = CHAIN_APPROX_SIMPLE
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    if abs(w - 345) < 10:  # width of the box row is 345 px
        # straighten the box row via the inverse transform of its rotated rectangle
        rect = cv2.minAreaRect(cnt)
        box = cv2.boxPoints(rect)
        srcTri = np.array([box[1], box[0], box[2]]).astype(np.float32)
        dstTri = np.array([[0, 0], [0, rect[1][1]], [rect[1][0], 0]]).astype(np.float32)
        warp_mat = cv2.getAffineTransform(srcTri, dstTri)
        warp_dst = cv2.warpAffine(img, warp_mat, (np.int0(rect[1][0]), np.int0(rect[1][1])))
        # draw a grid of N character cells over the straightened row
        N = 14
        s = 0.99 * warp_dst.shape[1] / N  # tune rectangle positions
        for i in range(N):
            warp_dst = cv2.rectangle(warp_dst, (2 + int(i * s), 2),
                                     (2 + int((i + 1) * s), warp_dst.shape[0] - 3),
                                     (255, 255, 255), 2)
        cv2.imwrite('chars.png', warp_dst)
Using Hough (for instance), detect the top and bottom edges and the vertical separations. Validate the separations by checking that they run from top to bottom. The horizontal lines will be more reliable and accurate; you can use their direction for deskewing if necessary.
After doing that, you will have missing separations and false ones. Using some heuristics, try to find the correct pitch and detect the false positives and false negatives. Now you can extract the content of the individual boxes, or erase the edges.
This process cannot be perfect, some characters will be damaged.
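As a rough illustrative sketch of that line-detection step (the file name and thresholds are placeholders, not values from this answer), cv2.HoughLinesP can be used like this:
import cv2
import numpy as np
img = cv2.imread('form.jpg')  # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)
# detect line segments: long horizontal ones give the top/bottom edges,
# near-vertical ones are candidate box separations
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=30, maxLineGap=5)
horizontals, verticals = [], []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        if abs(y2 - y1) <= 2:
            horizontals.append((x1, y1, x2, y2))
        elif abs(x2 - x1) <= 2:
            verticals.append((x1, y1, x2, y2))
# keep only the vertical separations that span from the top edge to the
# bottom edge, then use the surviving ones to cut out the individual boxes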

Increase width/height of image (not resize)

Following https://www.pyimagesearch.com/2018/07/19/opencv-tutorial-a-guide-to-learn-opencv/, I'm able to extract the contours and write them out as files.
For example, I have a photo with some scribbled text: "in there".
I've been able to extract the letters as separate files, but I want these letter files to have the same width and height. For example, the widths of "i" and "r" will differ. In that case I want to append (any black/white pixels) to the right of the "i" image so that its width becomes the same as that of "r".
How can I do this in Python? I just want to increase the size of the photo (not resize it).
My code looks something like this:
# find contours (i.e., outlines) of the foreground objects in the
# thresholded image
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
output = image.copy()
ROI_number = 0
for c in cnts:
    # crop each contour's bounding box and save it as a separate file
    x, y, w, h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    file = 'ROI_{}.png'.format(ROI_number)
    cv2.imwrite(file, ROI)
    ROI_number += 1
Here are a couple of other ways to do that in Python/OpenCV, using cv2.copyMakeBorder() to extend the border to the right by 50 pixels. The first way simply extends the border by replication. The second extends it with the mean (average) blue background color, using a mask to select only the blue pixels.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
# get mask of background pixels (for result2b only)
lowcolor = (232,221,163)
highcolor = (252,241,183)
mask = cv2.inRange(img, lowcolor, highcolor)
# get average color of background using mask on img (for result2b only)
mean = cv2.mean(img, mask)[0:3]
color = (mean[0],mean[1],mean[2])
# extend image to the right by 50 pixels
result = img.copy()
result2a = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_REPLICATE)
result2b = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_CONSTANT, value=color)
# view result
cv2.imshow("img", img)
cv2.imshow("mask", mask)
cv2.imshow("result2a", result2a)
cv2.imshow("result2b", result2b)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended2a.jpg", result2a)
cv2.imwrite("i_extended2b.jpg", result2b)
Replicated Result:
Average Background Color Result:
In Python/OpenCV/Numpy you create a new image of the size and background color you want. Then you use numpy slicing to insert the old image into the new one. For example:
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
ht, wd, cc= img.shape
# create new image of desired size (extended by 50 pixels in width) and desired color
ww = wd+50
hh = ht
color = (242,231,173)
result = np.full((hh,ww,cc), color, dtype=np.uint8)
# copy img image into image at offsets yy=0,xx=0
yy=0
xx=0
result[yy:yy+ht, xx:xx+wd] = img
# view result
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended.jpg", result)

Remove unwanted text in logo detection - Image Processing, Computer Vision

I am trying to detect logos in invoices. I am able to get some results, but they are not good enough to process further: while detecting logos, unwanted text is also being detected.
The following is from an actual invoice: original image
and these are the results I am getting: image after operations
I am using the following code, which I have written:
gray = cv2.imread("Image", 0)
ret, thresh1 = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)
kernel_logo = np.ones((10, 10), np.uint8)
closing_logo = cv2.morphologyEx(thresh1, cv2.MORPH_CLOSE, kernel_logo, iterations=1)
n = 3
noise_removed_logo = cv2.medianBlur(closing_logo, n)
eroded_logo = cv2.erode(noise_removed_logo, kernel_logo, iterations=8)
dilated_logo = cv2.dilate(eroded_logo, kernel_logo, iterations=3)
Could you please suggest what changes I should make to remove the noise from my document image? I am new to computer vision.
A few more samples: original document
The result I am getting: result after operations on the document
Hello Mohd Anas Khan.
Your approach to defining the logo is too simple, so it won't work well. If you want a production-level approach, use some machine learning or deep learning. If you just want a toy, then a simple contour finder with fixed rules should work.
For example, in the following approach I defined "logo" as "the contour with the biggest area". You'll need more rules later, so good luck.
import numpy as np
import cv2
im = cv2.imread('contours_1.jpg')
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, cv2.THRESH_BINARY_INV)
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
threshed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rect_kernel)
cv2.imwrite("contours_1_thres.jpg", threshed)
im2, contours, hierarchy = cv2.findContours(threshed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
ws = []
hs = []
areas = []
for contour in contours:
    area = cv2.contourArea(contour)
    x, y, w, h = cv2.boundingRect(contour)
    print("w: {}, h: {}, area: {}".format(w, h, area))
    ws.append(w)
    hs.append(h)
    areas.append(area)
max_idx = np.argmax(areas)
cv2.drawContours(im, [contours[max_idx]], -1, (0, 255, 0), 3)
# cv2.drawContours(im, contours, -1, (0, 255, 0), 3)
cv2.imwrite("contours_1_test.jpg", im)
The output images are as follows (the detected logo is outlined with a green box):
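A side note, not from the original answer: the three-value unpacking of cv2.findContours above matches OpenCV 3.x; in OpenCV 4.x the function returns only (contours, hierarchy), so a version-agnostic variant would be:
# works with both the OpenCV 3.x and 4.x return signatures
res = cv2.findContours(threshed, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
contours = res[0] if len(res) == 2 else res[1]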

Python - OpenCV - Binarize To Isolate Object Which is Same Color as Background

I need to isolate the cardboard target in the image below and binarize it, so that the target is white and the background black. Normally, this is not a problem, but the background is almost the exact same color as the target.
Attempts:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# LOAD IMAGE
img_filepath = 'real_6.png'
img = cv2.imread( img_filepath )
rgb_img = img[:,:,::-1]
plt.imshow( rgb_img )
plt.title('ORIGINAL')
plt.show()
img_gray = cv2.cvtColor( img, cv2.COLOR_BGR2GRAY )
# SMOOTH
blur_kernel = np.ones((5,5),np.float32)/30
blur_img = cv2.filter2D( rgb_img, -1, blur_kernel )
# THRESHOLD
lower_color_rng = np.array( [100,50,100] )
upper_color_rng = np.array( [255,255,255] )
target_keyholes_img = cv2.inRange( blur_img, lower_color_rng, upper_color_rng )
plt.imshow( target_keyholes_img, cmap='gray' )
plt.title( 'THRESHOLD' )
plt.show()
Attempted Image Extraction
How can I use OpenCV in Python 3 to binarize this image?
Original Image
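As a generic, illustrative starting point (not an answer from this thread), Otsu thresholding on a heavily blurred grayscale image is a common first attempt for this kind of low-contrast binarization; the blur size here is an arbitrary guess:
import cv2
img = cv2.imread('real_6.png')  # same input file as above
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# a heavy blur suppresses texture so the global threshold follows illumination
blur = cv2.GaussianBlur(gray, (11, 11), 0)
# Otsu picks the threshold automatically from the histogram
_, binary = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite('real_6_otsu.png', binary)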

vtk oriented bounding box in 3d space

I have a set of data points in the 3D space and would like to fit a bounding box to them. I know that vtkOBBTree::ComputeOBB can do this for me. But I can't seem to figure out how to visualize the oriented bounding box.
Any help is appreciated!
For a bounding box you can use vtkOutlineFilter. You just have to set the 3D data that you want to fit as its input. Then you create the mapper and the actor and add them to the scene, as you would do in a typical VTK scenario. Here is a working example in Python:
from vtk import *
quadric = vtkQuadric()
quadric.SetCoefficients(.5, 1, .2, 0, .1, 0, 0, .2, 0, 0)
sample = vtkSampleFunction()
sample.SetSampleDimensions(50,50,50)
sample.SetImplicitFunction(quadric)
contour = vtkContourFilter()
contour.SetInputConnection(sample.GetOutputPort())
contour.GenerateValues(5,0,1)
contourMapper = vtkPolyDataMapper()
contourMapper.SetInputConnection(contour.GetOutputPort())
contourMapper.SetScalarRange(0,1.2)
contourActor = vtkActor()
contourActor.SetMapper(contourMapper)
outline = vtkOutlineFilter()
outline.SetInputConnection(sample.GetOutputPort())
outlineMapper = vtkPolyDataMapper()
outlineMapper.SetInputConnection(outline.GetOutputPort())
outlineActor = vtkActor()
outlineActor.SetMapper(outlineMapper)
outlineActor.GetProperty().SetColor(1,1,1)
ren = vtkRenderer()
ren.SetBackground(0.188,0.373,0.647)
ren.AddActor(contourActor)
ren.AddActor(outlineActor)
renWin = vtkRenderWindow()
renWin.AddRenderer(ren)
renWin.SetWindowName("IsoSurface")
renWin.SetSize(500,500)
iren = vtkRenderWindowInteractor()
iren.SetRenderWindow(renWin)
renWin.Render()
iren.Initialize()
iren.Start()
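Note that vtkOutlineFilter outlines the axis-aligned bounding box of its input. If you specifically need the oriented box computed by vtkOBBTree::ComputeOBB, here is a minimal sketch of retrieving its corner and axes (a sphere source stands in for your own data; any vtkDataSet should work):
from vtk import *
# sample data standing in for your own 3D point set
sphere = vtkSphereSource()
sphere.Update()
data = sphere.GetOutput()
# ComputeOBB fills the mutable lists with the corner point and the three box axes
corner = [0.0, 0.0, 0.0]
max_axis = [0.0, 0.0, 0.0]
mid_axis = [0.0, 0.0, 0.0]
min_axis = [0.0, 0.0, 0.0]
size = [0.0, 0.0, 0.0]
obb = vtkOBBTree()
obb.ComputeOBB(data, corner, max_axis, mid_axis, min_axis, size)
print(corner, max_axis, mid_axis, min_axis)
# the eight box corners are corner plus any combination of the three axis vectors;
# from them you can build a vtkPolyData (points plus line cells) and render it
# with the same mapper/actor pattern shown above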
