Is it possible to translate this OpenCV into Pillow? - python-3.x

I was wondering whether this opencv-python function can be translated into Pillow, because I have to do the subsequent processing in Pillow.
A workaround I thought of is to save the result with OpenCV and load it again with Pillow, but I am looking for a cleaner solution: I use the output of remove_background() as input for each frame of a GIF, so I would be writing and re-reading N * GIF_frames_count images for no reason.
The method I want to convert from opencv-python to Pillow:
def remove_background(path):
    img = cv2.imread(path)
    # Convert to gray
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Threshold input image as mask
    mask = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY)[1]
    # Negate mask
    mask = 255 - mask
    # Apply morphology to remove isolated extraneous noise
    # Use border constant of black since foreground touches the edges
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Anti-alias the mask -- blur then stretch
    # Blur alpha channel
    mask = cv2.GaussianBlur(mask, (0, 0), sigmaX=2, sigmaY=2, borderType=cv2.BORDER_DEFAULT)
    # Linear stretch so that 127.5 goes to 0, but 255 stays 255
    mask = (2 * (mask.astype(np.float32)) - 255.0).clip(0, 255).astype(np.uint8)
    # Put mask into alpha channel
    result = img.copy()
    result = cv2.cvtColor(result, cv2.COLOR_BGR2BGRA)
    result[:, :, 3] = mask
    return result
Code taken from: how to remove background of images in python

Rather than re-writing all the code using PIL equivalents, you could adopt the "if it ain't broke, don't fix it" maxim, and simply convert the Numpy array that the existing code produces into a PIL Image that you can use for your subsequent purposes.
This is described in this answer, which I'll paraphrase as:
# Make "PIL Image" from Numpy array
pi = Image.fromarray(na)
Note that the linked answer refers to scikit-image (which uses RGB ordering like PIL) rather than OpenCV, so there is the added wrinkle that you will also need to reorder the channels from BGRA to RGBA, so the last couple of lines will look like:
...
...
result = cv2.cvtColor(result, cv2.COLOR_BGR2RGBA)
result[:, :, 3] = mask
pi = Image.fromarray(result)
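To make the channel reordering itself easier to see, here is a minimal sketch that uses only NumPy. The helper name bgra_to_rgba and the one-pixel test array are made up for illustration; the index selection is meant to give the same channel order as a BGRA-to-RGBA colour conversion, and the returned array is the kind of thing you would hand to Image.fromarray().

```python
import numpy as np

def bgra_to_rgba(bgra):
    """Return a copy of an H x W x 4 BGRA array with channels in RGBA order."""
    # Index selection picks channels 2, 1, 0 (R, G, B) and keeps alpha (3) last.
    return np.ascontiguousarray(bgra[:, :, [2, 1, 0, 3]])

# Hypothetical 1x1 "image": blue=10, green=20, red=30, alpha=255
pixel = np.array([[[10, 20, 30, 255]]], dtype=np.uint8)
rgba = bgra_to_rgba(pixel)
print(rgba[0, 0].tolist())  # [30, 20, 10, 255]
```

Either way, nothing is written to disk, which is the point when this runs once per GIF frame.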

Related

Paste an image to another image at two given co-ordinates with altered opacity using PIL or OpenCV in Python

I have two images, each with one given point. The images need to be aligned on those points so that the result combines both images, with image 2 pasted onto image 1 at 40% opacity. I have taken this question into consideration, but our case does not exactly match, as the image coordinates are supplied by the user and the images can have a wide range of sizes.
Image 1:
Image 2:
Final result (desired output):
For this I have tried the img.paste() function of PIL, and replacing values in the NumPy arrays of the images in cv2; both give results that are far from desired.
I made two input images with ImageMagick like this:
magick -size 300x400 xc:"rgb(1,204,255)" -fill red -draw "point 280,250" 1.png
magick -size 250x80 xc:"rgb(150,203,0)" -fill red -draw "point 12,25" 2.png
Then ran the following code:
#!/usr/bin/env python3
"""
Paste one image on top of another such that given points in each are coincident.
"""
from PIL import Image
# Open images and ensure RGB
im1 = Image.open('1.png').convert('RGB')
im2 = Image.open('2.png').convert('RGB')
# x,y coordinates of point in each image
p1x, p1y = 280, 250
p2x, p2y = 12, 25
# Work out how many pixels of space we need left, right, above, below common point in new image
pL = max(p1x, p2x)
pR = max(im1.width-p1x, im2.width-p2x)
pT = max(p1y, p2y)
pB = max(im1.height-p1y, im2.height-p2y)
# Create background in solid white
bg = Image.new('RGB', (pL+pR, pT+pB),'white')
bg.save('DEBUG-bg.png')
# Paste im1 onto background
bg.paste(im1, (pL-p1x, pT-p1y))
bg.save('DEBUG-bg+im1.png')
# Make 40% opacity mask for im2
alpha = Image.new('L', (im2.width,im2.height), int(40*255/100))
alpha.save('DEBUG-alpha.png')
# Paste im2 over background with alpha
bg.paste(im2, (pL-p2x, pT-p2y), alpha)
bg.save('result.png')
The result is this:
The lines that save images with names starting "DEBUG-xxx.png" are just for easy debugging and can be removed. I can easily view them all to see what is going on with the code and I can easily delete them all by removing "DEBUG*png".
Without any more details, I will try to answer the question as best as I can and will name all the extra assumptions that I made (and how to handle them if you can't make them).
Since there were no provided images, I created a blue and green image with a black dot as merging coordinate, using the following code:
import numpy as np
from PIL import Image, ImageDraw
def create_image_with_point(name, color, x, y, width=3):
    image = np.full((400, 400, 3), color, dtype=np.uint8)
    image[y - width:y + width, x - width:x + width] = (0, 0, 0)
    image = Image.fromarray(image, mode='RGB')
    ImageDraw.Draw(image).text((x - 15, y - 20), 'Point', (0, 0, 0))
    image.save(name)
    return image
blue = create_image_with_point('blue.png', color=(50, 50, 255), x=300, y=100)
green = create_image_with_point('green.png', color=(50, 255, 50), x=50, y=50)
This results in the following images:
Now I will make the assumption that the images do not contain an alpha layer yet (as I created them without). Therefore I will load the image and add an alpha layer to them:
import numpy as np
from PIL import Image
blue = Image.open('blue.png')
blue.putalpha(255)
green = Image.open('green.png')
green.putalpha(255)
My following assumption is that you know the merge coordinates beforehand:
# Assuming x, y coordinates.
point_blue = (300, 100)
point_green = (50, 50)
Then you can create an empty image, that can hold both of the images easily:
new_image = np.zeros((1000, 1000, 4), dtype=np.uint8)
This is a far-fetched assumption if you do not know the image sizes beforehand; in that case you will have to calculate the combined size of the two images.
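If you do need to calculate that size, a rough sketch in plain Python could look like the following. It uses the same max-based arithmetic as the pL/pR/pT/pB lines in the other answer; the function name canvas_and_offsets is hypothetical, and the sizes and points are the ones from my blue and green test images.

```python
def canvas_and_offsets(size1, point1, size2, point2):
    """Return (canvas_size, offset1, offset2) so that point1 and point2 coincide.

    size* are (width, height); point* are (x, y) inside the respective image.
    """
    (w1, h1), (x1, y1) = size1, point1
    (w2, h2), (x2, y2) = size2, point2
    left = max(x1, x2)              # pixels needed left of the common point
    right = max(w1 - x1, w2 - x2)   # pixels needed right of it
    top = max(y1, y2)               # pixels needed above it
    bottom = max(h1 - y1, h2 - y2)  # pixels needed below it
    canvas = (left + right, top + bottom)
    return canvas, (left - x1, top - y1), (left - x2, top - y2)

canvas, off_blue, off_green = canvas_and_offsets(
    (400, 400), (300, 100),   # blue image size and merge point
    (400, 400), (50, 50))     # green image size and merge point
print(canvas, off_blue, off_green)  # (650, 450) (0, 0) (250, 50)
```

The offsets are the top-left positions at which each image would be placed on a canvas of that size, so both merge points end up on the same pixel.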
Then you can place each image's dot at the center of the newly created image (in my case (500, 500)). For this you use the merging points as offsets. You can then perform alpha blending (np.uint8(img_1*alpha + img_2*(1-alpha))) to merge the images with different opacities.
Which is in code:
def place_image(image: Image.Image, point_xy: tuple[int, int], dest: np.ndarray, alpha: float = 1.) -> np.ndarray:
    # Place the merging dot on (500, 500).
    offset_x, offset_y = 500 - point_xy[0], 500 - point_xy[1]
    # Calculate the location of the image and perform alpha blending.
    destination = dest[offset_y:offset_y + image.height, offset_x:offset_x + image.width]
    destination = np.uint8(destination * (1 - alpha) + np.array(image) * alpha)
    # Copy the 'merged' image to the destination location.
    dest[offset_y:offset_y + image.height, offset_x:offset_x + image.width] = destination
    return dest
# Add the background image blue with alpha 1
new_image = place_image(blue, point_blue, dest=new_image, alpha=1)
# Add the second image with 40% opacity
new_image = place_image(green, point_green, dest=new_image, alpha=0.4)
# Store the resulting image.
image = Image.fromarray(new_image)
image.save('result.png')
The final result is a bigger image containing the combined images. Again, you can calculate the correct bounding box so that you don't have these huge areas of 'nothing' sticking out. The final result will look like this:

How do I detect vertical text with OpenCV for extraction

I am new to OpenCV and trying to see if I can find a way to detect vertical text for the image attached.
In this case, on row 3, I would like to get the bounding box around Original Cost and the amount below it ($200,000.00).
Similarly, I would like to get the bounding box around Amount Existing Liens and the associated amount below it. I would then send this data to an OCR engine to read the text. Traditional OCR engines go line by line when extracting and lose the context.
Here is what I have tried so far -
import cv2
import numpy as np
img = cv2.imread('Test3.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,100,100,apertureSize = 3)
cv2.imshow('edges',edges)
cv2.waitKey(0)
minLineLength = 20
maxLineGap = 10
lines = cv2.HoughLinesP(edges,1,np.pi/180,15,minLineLength=minLineLength,maxLineGap=maxLineGap)
for x in range(0, len(lines)):
    for x1,y1,x2,y2 in lines[x]:
        cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)
cv2.imshow('hough',img)
cv2.waitKey(0)
Here is my solution, based on Kanan Vyas and Adrian Rosebrock.
It's probably not as "canonical" as you'd wish.
But it seems to work (more or less...) with the image you provided.
Just a word of CAUTION: The code looks within the directory from which it is running, for a folder named "Cropped" where cropped images will be stored. So, don't run it in a directory which already contains a folder named "Cropped" because it deletes everything in this folder at each run. Understood? If you're unsure run it in a separate folder.
The code:
# Import required packages
import cv2
import numpy as np
import pathlib
###################################################################################################################################
# https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/
###################################################################################################################################
def sort_contours(cnts, method="left-to-right"):
    # initialize the reverse flag and sort index
    reverse = False
    i = 0
    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True
    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1
    # construct the list of bounding boxes and sort them from top to
    # bottom
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
                                        key=lambda b:b[1][i], reverse=reverse))
    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)
###################################################################################################################################
# https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26 (with a few modifications)
###################################################################################################################################
def box_extraction(img_for_box_extraction_path, cropped_dir_path):
    img = cv2.imread(img_for_box_extraction_path, 0)  # Read the image
    (thresh, img_bin) = cv2.threshold(img, 128, 255,
                                      cv2.THRESH_BINARY | cv2.THRESH_OTSU)  # Thresholding the image
    img_bin = 255-img_bin  # Invert the image
    #cv2.imwrite("Image_bin.jpg",img_bin)
    # Defining a kernel length
    kernel_length = np.array(img).shape[1]//200
    # A vertical kernel of (1 X kernel_length), which will detect all the vertical lines from the image.
    verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
    # A horizontal kernel of (kernel_length X 1), which will help to detect all the horizontal lines from the image.
    hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
    # A kernel of (3 X 3) ones.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # Morphological operation to detect vertical lines from an image
    img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
    verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
    #cv2.imwrite("verticle_lines.jpg",verticle_lines_img)
    # Morphological operation to detect horizontal lines from an image
    img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
    horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
    #cv2.imwrite("horizontal_lines.jpg",horizontal_lines_img)
    # Weighting parameters, this will decide the quantity of an image to be added to make a new image.
    alpha = 0.5
    beta = 1.0 - alpha
    # This function helps to add two images with specific weight parameters to get a third image as the summation of the two.
    img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
    img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
    (thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # For Debugging
    # Enable this line to see vertical and horizontal lines in the image which is used to find boxes
    #cv2.imwrite("img_final_bin.jpg",img_final_bin)
    # Find contours for image, which will detect all the boxes
    contours, hierarchy = cv2.findContours(
        img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # Sort all the contours by top to bottom.
    (contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")
    idx = 0
    for c in contours:
        # Returns the location and width,height for every contour
        x, y, w, h = cv2.boundingRect(c)
        # Only save the box in the cropped folder if its width is > 50 and its height is > 20.
        if (w > 50 and h > 20):# and w > 3*h:
            idx += 1
            new_img = img[y:y+h, x:x+w]
            cv2.imwrite(cropped_dir_path+str(x)+'_'+str(y) + '.png', new_img)
###########################################################################################################################################################
def prepare_cropped_folder():
    p=pathlib.Path('./Cropped')
    if p.exists(): # Cropped folder non empty. Let's clean up
        files = [x for x in p.glob('*.*') if x.is_file()]
        for f in files:
            f.unlink()
    else:
        p.mkdir()
###########################################################################################################################################################
# MAIN
###########################################################################################################################################################
prepare_cropped_folder()
# Read image from which text needs to be extracted
img = cv2.imread("dkesg.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
thresh1=255-thresh1
bin_y=np.zeros(thresh1.shape[0])
for x in range(0,len(bin_y)):
    # Use NumPy's own sum: the built-in sum() can wrap around on uint8 values with newer NumPy versions
    bin_y[x]=thresh1[x,:].sum()
bin_y=bin_y/max(bin_y)
ry=np.where(bin_y>0.995)[0]
for i in range(0,len(ry)):
    # Plain ints for the point coordinates
    cv2.line(img, (0, int(ry[i])), (thresh1.shape[1], int(ry[i])), (0, 0, 0), 1)
# We need to draw a box around the picture with a white border in order for box_extraction to work
cv2.line(img,(0,0),(0,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(img.shape[1]-1,0),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(img.shape[1]-1,0),(255,255,255),2)
cv2.line(img,(0,img.shape[0]-1),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(0,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(img.shape[1]-3,0),(img.shape[1]-3,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(0,0),(img.shape[1]-1,0),(0,0,0),1)
cv2.line(img,(0,img.shape[0]-2),(img.shape[1]-1,img.shape[0]-2),(0,0,0),1)
cv2.imwrite('out.png',img)
box_extraction("out.png", "./Cropped/")
Now... It puts the cropped regions in the Cropped folder. They are named as x_y.png with (x,y) the position on the original image.
Here are two examples of the outputs
and
Now, in a terminal, I used pytesseract on these two images.
The results are the following:
1)
Original Cost
$200,000.00
2)
Amount Existing Liens
$494,215.00
As you can see, pytesseract got the amount wrong in the second case... So, be careful.
Best regards,
Stéphane
I assume the bounding box is fixed (a rectangle that fits "Original Cost" and the amount below it). You can use OCR-based text detection to locate "Original Cost" and "Amount Existing Liens", then crop the image based on the detected locations for further OCR on the amounts. You can refer to this link for text detection.
Try to divide the image into different cells using the lines in the image.
For example, first divide the input into rows by detecting the horizontal lines. This can be done by using cv.HoughLinesP and checking for each line if the difference between y-coordinate of the begin and end point is smaller than a certain threshold abs(y2 - y1) < 10. If you have a horizontal line, it's a separator for a new row. You can use the y-coordinates of this line to split the input horizontally.
Next, for the row you're interested in, divide the region into columns using the same technique, but now check that the difference between the x-coordinates of the begin and end point is smaller than a certain threshold, since you're now looking for the vertical lines.
You can now crop the image to different cells using the y-coordinates of the horizontal lines and the x-coordinates of the vertical lines. Pass these cropped regions one by one to the OCR engine and you'll have for each cell the corresponding text.
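The row/column splitting described above can be sketched in plain Python as follows. This is only an illustration under a few assumptions: the segments are given as (x1, y1, x2, y2) tuples (for example the output of cv2.HoughLinesP reshaped to N x 4), the function names are made up, and the sample segments are invented rather than taken from the image in the question.

```python
def merge_close(values, tol):
    """Collapse sorted positions closer than tol into one (keeps the first of each group)."""
    merged = []
    for v in values:
        if not merged or v - merged[-1] >= tol:
            merged.append(v)
    return merged

def split_positions(segments, tol=10):
    """Return (row_separators, column_separators) from line segments."""
    rows, cols = [], []
    for x1, y1, x2, y2 in segments:
        if abs(y2 - y1) < tol:      # nearly horizontal: separates rows
            rows.append((y1 + y2) // 2)
        elif abs(x2 - x1) < tol:    # nearly vertical: separates columns
            cols.append((x1 + x2) // 2)
    return merge_close(sorted(rows), tol), merge_close(sorted(cols), tol)

def cells(rows, cols):
    """Yield (x0, y0, x1, y1) crop boxes between consecutive separators."""
    for y0, y1 in zip(rows, rows[1:]):
        for x0, x1 in zip(cols, cols[1:]):
            yield (x0, y0, x1, y1)

# Invented segments: two horizontal lines and three vertical ones
segments = [(0, 0, 300, 2), (0, 100, 300, 101),
            (0, 0, 1, 100), (150, 0, 151, 100), (300, 0, 299, 100)]
rows, cols = split_positions(segments)
print(rows, cols)               # [1, 100] [0, 150, 299]
print(list(cells(rows, cols)))  # [(0, 1, 150, 100), (150, 1, 299, 100)]
```

Each box can then be used to slice the image (img[y0:y1, x0:x1]) before passing the crop to the OCR engine. Note that a very short segment would be counted as horizontal here, so in practice a minimum length filter is probably needed as well.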

Is there any way we can find boundaries of multiple images and crop them out individually? [closed]

Is there any way we can find the boundaries of multiple images and crop them out individually? I am able to crop an individual image when it can be cropped symmetrically in a rectangular box, but it becomes challenging when the image to be cropped is asymmetric. In the attached image there are two images, i.e. "detail B" and "detail C". I just want to crop them out into two individual images. Can anyone advise how to get these images using Python?
The general approach is quite simple:
Inverse binary threshold a grayscale version of your image, e.g. using Otsu's method. Since you have all-white background, this should be fine.
To "merge" all neighbouring parts, i.e. the "detail" itself, the lines, and captions, dilate the resulting mask from the thresholding.
Find all external contours, filter the largest ones, and then one after another: Draw the filled contour on a separate mask, and set up a linear combination of your original image, where the mask is white, and an all-white image, where the mask is black; crop the correct part by finding the bounding rectangle of the contour.
Here's some Python code using OpenCV and NumPy:
import cv2
import numpy as np
from skimage import io # Only needed for web grabbing images
# Read image from web
image = cv2.cvtColor(io.imread('https://i.stack.imgur.com/rq12v.jpg'), cv2.COLOR_RGB2BGR)
# Inverse binary threshold grayscale version of image using Otsu's
thres = cv2.threshold(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Dilate to merge all neighbouring parts
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
thres = cv2.dilate(thres, kernel)
# Find external contours with respect to OpenCV version
cnts = cv2.findContours(thres, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# Iterate all contours...
area_thr = 10000
k = 0
for c in cnts:
    # Filter large contours
    if cv2.contourArea(c) > area_thr:
        k = k + 1
        # Get bounding rectangle of contour
        rect = cv2.boundingRect(c)
        x1 = rect[0]
        y1 = rect[1]
        x2 = x1 + rect[2]
        y2 = y1 + rect[3]
        # Generate filled contour mask
        mask = np.zeros((thres.shape[0], thres.shape[1], 3), np.uint8)
        mask = cv2.drawContours(mask, [c], -1, (1, 1, 1), cv2.FILLED)
        # Generate and save cropped image
        crop = 255 * np.ones((thres.shape[0], thres.shape[1], 3), np.uint8)
        crop = (1 - mask) * crop + mask * image
        crop = crop[y1:y2, x1:x2]
        cv2.imwrite('crop' + str(k) + '.png', crop)
The initial mask after thresholding and dilating looks like this:
We see six parts, whereas the two "details" are significantly larger.
The two cropped "details" are:
Hope that helps!
------------------
System information
------------------
Python: 3.8.1
NumPy: 1.18.1
OpenCV: 4.1.2
------------------

How to find the document edges in various coloured backgrounds using opencv python? [Document Scanning in various backgrounds]

I currently have a document that needs to be smart scanned.
For that, I need to find the proper contours of the document against any background, so that I can do a perspective warp and run detection on that image.
The main issue I face is that the edge detection also picks up edges from whatever background the document is on.
So far I have tried the HoughLinesP function, and finding contours on the blurred grayscale image after passing it through Canny edge detection.
MORPH = 9
CANNY = 84
HOUGH = 25
IM_HEIGHT, IM_WIDTH, _ = rescaled_image.shape
# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(rescaled_image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7,7), 0)
#dilate helps to remove potential holes between edge segments
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
dilated = cv2.dilate(gray, kernel)
# find edges and mark them in the output map using the Canny algorithm
edged = cv2.Canny(dilated, 0, CANNY)
test_corners = self.get_corners(edged)
approx_contours = []
(_, cnts, hierarchy) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
# loop over the contours
for c in cnts:
    # approximate the contour
    approx = cv2.approxPolyDP(c, 80, True)
    if self.is_valid_contour(approx, IM_WIDTH, IM_HEIGHT):
        approx_contours.append(approx)
        break
How can I find a proper bounding box around the document with OpenCV?
Any help will be much appreciated.
(The document is taken from the camera in any angle and any coloured background.)
Following code might help you to detect/segment the page in the image...
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread('test_p.jpg')
print(image.shape)
ori = image.copy()
image = cv2.resize(image, (image.shape[1]//10,image.shape[0]//10))
The image is resized to make the operations faster, so that this can work in real time.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (11,11), 0)
edged = cv2.Canny(gray, 75, 200)
print("STEP 1: Edge Detection")
plt.imshow(edged)
plt.show()
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# findContours returns (contours, hierarchy) in OpenCV 4.x but (image, contours, hierarchy) in 3.x
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:5]
Here we consider only the 5 largest contours from the list sorted by area.
The kernel size of the Gaussian blur is somewhat sensitive, so choose it according to the image size.
After the above operations the image may look like this:
for c in cnts:
    ### Approximating the contour
    #Calculates a contour perimeter or a curve length
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.01 * peri, True)
    # if our approximated contour has four points, then we
    # can assume that we have found our screen
    screenCnt = approx
    if len(approx) == 4:
        screenCnt = approx
        break
# show the contour (outline)
print("STEP 2: Finding Boundary")
cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
image_e = cv2.resize(image,(image.shape[1],image.shape[0]))
cv2.imwrite('image_edge.jpg',image_e)
plt.imshow(image_e)
plt.show()
Final Image may look like...
Rest of the things may be handled after getting the final image...
Code Reference :- Git Repository
I guess this answer would be helpful...
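One of the "rest of the things" is getting the four corners into a known order and back to the scale of the original image, since the contour was found on the downscaled copy. A minimal NumPy-only sketch, assuming a four-point contour like screenCnt; the function names and sample points are made up, and the factor of 10 only approximately undoes the integer-division resize above.

```python
import numpy as np

def order_corners(pts):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    # Accepts shape (4, 2) or the (4, 1, 2) shape that approxPolyDP returns.
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y
    d = pts[:, 1] - pts[:, 0]    # y - x
    return np.array([pts[np.argmin(s)],    # top-left: smallest x + y
                     pts[np.argmin(d)],    # top-right: smallest y - x
                     pts[np.argmax(s)],    # bottom-right: largest x + y
                     pts[np.argmax(d)]],   # bottom-left: largest y - x
                    dtype=np.float32)

def to_original_scale(pts, factor=10):
    """Scale ordered corners found on the downscaled image back up (assumed factor)."""
    return order_corners(pts) * factor

# Invented contour in the (4, 1, 2) layout, corners in arbitrary order
sample = np.array([[[90, 10]], [[12, 8]], [[95, 120]], [[10, 115]]])
ordered = order_corners(sample)
scaled = to_original_scale(sample)
```

The ordered, rescaled corners are the kind of input that cv2.getPerspectiveTransform and cv2.warpPerspective expect if you want to flatten the page afterwards. The sum/difference trick assumes the page is not rotated by close to 45 degrees.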
There is a similar problem which is called orthographic projection.
Orthographic approaches
Rather than using Gaussian blur plus morphological operations to get the edges of the document, try doing an orthographic projection first and then find the contours with your method.
For finding a proper bounding box, try some preset values or a reference letter; an orthographic projection will then allow you to compute the height, and hence the dimensions, of the bounding box.

OpenCV: how to add artificial smudge / motion blur effects to a whole image?

I would like to add artificial smudge / motion blur effects in a specific direction to images with OpenCV to simulate blurring caused by shaking/moving cameras while recording images.
What would be an appropriate way to do so in OpenCV (with Python)?
Example image:
To achieve this effect, you convolve the image with a line segment-like kernel (or PSF, a Point Spread Function), like this:
import cv2
import numpy as np
img = cv2.imread("Lenna.png")
psf = np.zeros((50, 50, 3))
psf = cv2.ellipse(psf,
                  (25, 25),      # center
                  (22, 0),       # axes -- 22 for blur length, 0 for thin PSF
                  15,            # angle of motion in degrees
                  0, 360,        # full ellipse, not an arc
                  (1, 1, 1),     # white color
                  thickness=-1)  # filled
psf /= psf[:,:,0].sum()  # normalize by sum of one channel
                         # since channels are processed independently
imfilt = cv2.filter2D(img, -1, psf)
To be more realistic you need to perform the blurring in the linear domain (i.e. invert and then re-apply the sRGB gamma, see here; it's a whole other can of worms), but this simple code gets you the following:
Source:
Result:
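For the linear-domain remark, here is a small NumPy sketch of the two sRGB transfer functions, assuming 8-bit input. The function names are made up; the blur itself would go between the two calls (convert to linear, filter, convert back), which is not shown here.

```python
import numpy as np

def srgb_to_linear(img_u8):
    """Map 8-bit sRGB values to linear light in [0, 1]."""
    c = img_u8.astype(np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(lin):
    """Map linear light in [0, 1] back to 8-bit sRGB values."""
    lin = np.clip(lin, 0.0, 1.0)
    s = np.where(lin <= 0.0031308, lin * 12.92, 1.055 * lin ** (1 / 2.4) - 0.055)
    return np.round(s * 255.0).astype(np.uint8)

# Round trip over every possible 8-bit value
values = np.arange(256, dtype=np.uint8)
restored = linear_to_srgb(srgb_to_linear(values))
```

Blurring in linear light tends to keep bright streaks brighter than blurring the gamma-encoded values directly, which is closer to what a moving camera actually records.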
