Image segmentation of objects in any illumination(low or high) - python-3.x

The problem I have at hand is to draw boundaries around a white ball. But the ball is present in different illuminations. Using canny edge detections and Hough transform for circles, I am able to detect the ball in bright light/partial bright light but not in low illumination.
So can anyone help with this problem.
The code that I have tried is below.
img=cv2.imread('14_04_2018_10_38_51_.8242_P_B_142_17197493.png.png')
cimg=img.copy()
img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.medianBlur(img,5)
edges=cv2.Canny(edges,200,200)
circles = cv2.HoughCircles(edges,cv2.HOUGH_GRADIENT,1,20,
param1=25,param2=10,minRadius=0,maxRadius=0)
if circles is not None:
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
# draw the outer circle
cv2.circle(cimg,(i[0],i[1]),i[2],(255,255,255),2)
# draw the center of the circle
cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
cv2.imwrite('segmented_out.png',cimg)
else:
print("no circles")
cv2.imwrite('edges_out.png',edges)
In the image below we need to segment if the ball is in the shadow region as well.
The output should be something like below images..

Well I am not very experienced in OpenCV or Python but I am learning as well. Probably not very pythonic piece of code but you could try this:
import cv2
import math
circ=0
n = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220]
img = cv2.imread("ball1.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for i in n:
ret, threshold = cv2.threshold(gray,i,255,cv2.THRESH_BINARY)
im, contours, hierarchy = cv2.findContours(threshold,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
for j in range(0, len(contours)):
size = cv2.contourArea(contours[j])
if 500 < size < 5000:
if circ > 0:
(x,y),radius = cv2.minEnclosingCircle(contours[j])
radius = int(radius)
area = cv2.contourArea(contours[j])
circif = 4*area/(math.pi*(radius*2)**2)
if circif > circ:
circ = float(circif)
radiusx = radius
center = (int(x),int(y))
elif circ == 0:
(x,y),radius = cv2.minEnclosingCircle(contours[j])
radius = int(radius)
area = cv2.contourArea(contours[j])
circ = 4*area/(math.pi*(radius*2)**2)
else:
pass
cv2.circle(img,center,radiusx,(0,255,0),2)
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.detroyAllWindows()
What it does is acctually you convert your picture to grayscale and apply different threshold settings to it. Then you eliminate noises with adding size to your specific contour. When you find it, you check its circularity (NOTE: it is not a scientific formula) and compare it to the next circularity. Perfect circle should return the result 1, so the highest number that will get in a contour (of all the contours) will be your ball.
Result:
NOTE: I haven't tried increasing the limit of size so maybe higher limit could return better result if you have a high resolution picture

Working with grayscale image will make you subject to different light conditions.
To be free from this I suggest to work in HSV color space, then use the Hue component instead of the grayscale image.
Hue is independent from the light condition, since it gives you information about the color, regardless of its Saturation or Value (a value bound to the brightness of the image).
This might bring you some clarity about color spaces and which is best to use for image segmentation.

In your case here. We have a white ball.White is not a color by itself.The main factor here is, what kind light actually falls on the white ballAs the kind of light that falls on it has a direct influence on the kind of extraction you might plan to do using a color space like HSV as mentioned above by #magicleon
HSV is your best bet for segmentation here.Using
whiteObject = cv2.inRange(hsvImage,lowerHSVLimit,upperHSVLimit)
lowerHSVLimit and upperHSVLimit HSV color range
Keeping in mind that the conditions
1) The image have similar conditions while they were clicked
2) You cover all the ranges of HSV before extraction
Hope you get an idea
Consider this example
Selecting a particular hue range from 45 to 60
Code
image = cv2.imread('allcolors.png')
hsvImg = cv2.cvtColor(image,cv2.COLOR_BGR2HSV)
lowerHSVLimit = np.array([45,0,0])
upperHSVLimit = np.array([60,255,255])
colour = cv2.inRange(hsvImg,lowerHSVLimit,upperHSVLimit)
plt.subplot(111), plt.imshow(colour,cmap="gray")
plt.title('hue range from 45 to 60'), plt.xticks([]), plt.yticks([])
plt.show()
Here the hue selected from 45 to 60

Related

Image intensity distribution changes during opencv warp affine

I am using python 3.8.5 and opencv 4.5.1 on windows 7
I am using the following code to rotate images.
def pad_rotate(image, ang, pad, pad_value=0):
(h, w) = image.shape[:2]
#create larger image and paste original image at the center.
# this is done to avoid any cropping during rotation
nH, nW = h + 2*pad, w + 2*pad #new height and width
cY, cX = nW//2, nH//2 #center of the new image
#create new image with pad_values
newImg = np.zeros((h+2*pad, w+2*pad), dtype=image.dtype)
newImg[:,:] = pad_value
#paste new image at the center
newImg[pad:pad+h, pad:pad+w] = image
#rotate CCW (for positive angles)
M = cv2.getRotationMatrix2D(center=(cX, cY), angle=ang, scale=1.0)
rotImg = cv2.warpAffine(newImg, M, (nW, nH), cv2.INTER_CUBIC,
borderMode=cv2.BORDER_CONSTANT, borderValue=pad_value)
return rotImg
My issue is that after the rotation, image intensity distribution is different than original.
Following part of the question is edited to clarify the issue
img = np.random.rand(500,500)
Rimg = pad_rotate(img, 15, 300, np.nan)
Here is what these images look like:
Their intensities have clearly shifted:
np.percentile(img, [20, 50, 80])
# prints array([0.20061218, 0.50015415, 0.79989986])
np.nanpercentile(Rimg, [20, 50, 80])
# prints array([0.32420028, 0.50031483, 0.67656537])
Can someone please tell me how to avoid this normalization?
The averaging effect of the interpolation changes the distribution...
Note:
There is a mistake in your code sample (not related to the percentiles).
The 4'th argument of warpAffine is dst.
replace cv2.warpAffine(newImg, M, (nW, nH), cv2.INTER_CUBIC with:
cv2.warpAffine(newImg, M, (nW, nH), flags=cv2.INTER_CUBIC
I tried to simplify the code sample that reproduces the problem.
The code sample uses linear interpolation, 1 degree rotation, and no NaN values.
import numpy as np
import cv2
img = np.random.rand(1000, 1000)
M = cv2.getRotationMatrix2D((img.shape[1]//2, img.shape[0]//2), 1, 1) # Rotate by 1 degree
Rimg = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]), flags=cv2.INTER_LINEAR) # Use Linear interpolation
Rimg = Rimg[20:-20, 20:-20] # Crop the part without the margins.
print(np.percentile(img, [20, 50, 80])) #[0.20005696 0.49990526 0.79954818]
print(np.percentile(Rimg, [20, 50, 80])) #[0.32244747 0.4998595 0.67698961]
cv2.imshow('img', img)
cv2.imshow('Rimg', Rimg)
cv2.waitKey()
cv2.destroyAllWindows()
When we disable the interpolation,
Rimg = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]), flags=cv2.INTER_NEAREST)
The percentiles are: [0.19943713 0.50004768 0.7995525 ].
Simpler example for showing that averaging elements changes the distribution:
A = np.random.rand(10000000)
B = (A[0:-1:2] + A[1::2])/2 # Averaging every two elements.
print(np.percentile(A, [20, 50, 80])) # [0.19995436 0.49999472 0.80007232]
print(np.percentile(B, [20, 50, 80])) # [0.31617922 0.50000145 0.68377251]
Why does interpolation skews the distribution towered the median?
I am not a mathematician.
I am sure you can get a better explanation...
Here is an intuitive example:
Assume there is list of values with uniform distribution in range [0, 1].
Assume there is a zero value in the list:
[0.2, 0.7, 0, 0.5... ]
After averaging every two sequential elements, the probability for getting a zero element in the output list is very small (only two sequential zeros result a zero).
The example shows that averaging pushes the extreme values towered the center.

How to set HoughCircles parameters automatically to detect different sizes of circles in opencv python?

I want to set HoughCircles parameters automatically to detect all size of circles in an image. And also should detect group of same size circles.
I am trying group of same size circles in one image. And group of same size circles in different image, the sizes of circles in both image are different.
So how to set HoughCircles parameters automatically that can detect group of circles in any image.
please help me.
Thank u
If you're looking to collectively just "bin" same-size circles, the below should serve as a good starting point that can be tweaked for your application.
import cv2
import numpy as np
img = cv2.imread('C:\\Test\\circles.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=2.0, minDist=50, minRadius=20, maxRadius=250)
radius_map = {}
for n in range(20, 250, 1):
radius_map[n] = []
if circles is not None:
circles = np.round(circles[0, :]).astype("int")
for (x, y, r) in circles:
radius_map[r].append((x, y, r))
for key in radius_map:
if len(radius_map[key]) > 0:
output = img.copy()
for x, y, r in radius_map[key]:
cv2.circle(output, (x, y), r, (0, 255, 0), 4)
cv2.imshow(f"Radius {key}", output)
cv2.waitKey(0)
If you require some thresholded band of say, circles with radius 50 and 51 are considered the same size, you can iterate over the radius_map dict object and group radius bins together.
Input Image:
Output Images:

How do I detect vertical text with OpenCV for extraction

I am new to OpenCV and trying to see if I can find a way to detect vertical text for the image attached.
In this case on row 3 , I would like to get the bounding box around Original Cost and the amount below ($200,000.00).
Similarly I would like to get the bounding box around Amount Existing Liens and the associated amount below. I then would use this data to send to an OCR engine to read text. Traditional OCR engines go line by line and extract and loses the context.
Here is what I have tried so far -
import cv2
import numpy as np
img = cv2.imread('Test3.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,100,100,apertureSize = 3)
cv2.imshow('edges',edges)
cv2.waitKey(0)
minLineLength = 20
maxLineGap = 10
lines = cv2.HoughLinesP(edges,1,np.pi/180,15,minLineLength=minLineLength,maxLineGap=maxLineGap)
for x in range(0, len(lines)):
for x1,y1,x2,y2 in lines[x]:
cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)
cv2.imshow('hough',img)
cv2.waitKey(0)
Here is my solution based on Kanan Vyas and Adrian Rosenbrock
It's probably not as "canonical" as you'd wish.
But it seems to work (more or less...) with the image you provided.
Just a word of CAUTION: The code looks within the directory from which it is running, for a folder named "Cropped" where cropped images will be stored. So, don't run it in a directory which already contains a folder named "Cropped" because it deletes everything in this folder at each run. Understood? If you're unsure run it in a separate folder.
The code:
# Import required packages
import cv2
import numpy as np
import pathlib
###################################################################################################################################
# https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/
###################################################################################################################################
def sort_contours(cnts, method="left-to-right"):
# initialize the reverse flag and sort index
reverse = False
i = 0
# handle if we need to sort in reverse
if method == "right-to-left" or method == "bottom-to-top":
reverse = True
# handle if we are sorting against the y-coordinate rather than
# the x-coordinate of the bounding box
if method == "top-to-bottom" or method == "bottom-to-top":
i = 1
# construct the list of bounding boxes and sort them from top to
# bottom
boundingBoxes = [cv2.boundingRect(c) for c in cnts]
(cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
key=lambda b:b[1][i], reverse=reverse))
# return the list of sorted contours and bounding boxes
return (cnts, boundingBoxes)
###################################################################################################################################
# https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26 (with a few modifications)
###################################################################################################################################
def box_extraction(img_for_box_extraction_path, cropped_dir_path):
img = cv2.imread(img_for_box_extraction_path, 0) # Read the image
(thresh, img_bin) = cv2.threshold(img, 128, 255,
cv2.THRESH_BINARY | cv2.THRESH_OTSU) # Thresholding the image
img_bin = 255-img_bin # Invert the imagecv2.imwrite("Image_bin.jpg",img_bin)
# Defining a kernel length
kernel_length = np.array(img).shape[1]//200
# A verticle kernel of (1 X kernel_length), which will detect all the verticle lines from the image.
verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
# A horizontal kernel of (kernel_length X 1), which will help to detect all the horizontal line from the image.
hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
# A kernel of (3 X 3) ones.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))# Morphological operation to detect verticle lines from an image
img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
#cv2.imwrite("verticle_lines.jpg",verticle_lines_img)# Morphological operation to detect horizontal lines from an image
img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
#cv2.imwrite("horizontal_lines.jpg",horizontal_lines_img)# Weighting parameters, this will decide the quantity of an image to be added to make a new image.
alpha = 0.5
beta = 1.0 - alpha
# This function helps to add two image with specific weight parameter to get a third image as summation of two image.
img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
(thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)# For Debugging
# Enable this line to see verticle and horizontal lines in the image which is used to find boxes
#cv2.imwrite("img_final_bin.jpg",img_final_bin)
# Find contours for image, which will detect all the boxes
contours, hierarchy = cv2.findContours(
img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Sort all the contours by top to bottom.
(contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")
idx = 0
for c in contours:
# Returns the location and width,height for every contour
x, y, w, h = cv2.boundingRect(c)# If the box height is greater then 20, widht is >80, then only save it as a box in "cropped/" folder.
if (w > 50 and h > 20):# and w > 3*h:
idx += 1
new_img = img[y:y+h, x:x+w]
cv2.imwrite(cropped_dir_path+str(x)+'_'+str(y) + '.png', new_img)
###########################################################################################################################################################
def prepare_cropped_folder():
p=pathlib.Path('./Cropped')
if p.exists(): # Cropped folder non empty. Let's clean up
files = [x for x in p.glob('*.*') if x.is_file()]
for f in files:
f.unlink()
else:
p.mkdir()
###########################################################################################################################################################
# MAIN
###########################################################################################################################################################
prepare_cropped_folder()
# Read image from which text needs to be extracted
img = cv2.imread("dkesg.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
thresh1=255-thresh1
bin_y=np.zeros(thresh1.shape[0])
for x in range(0,len(bin_y)):
bin_y[x]=sum(thresh1[x,:])
bin_y=bin_y/max(bin_y)
ry=np.where(bin_y>0.995)[0]
for i in range(0,len(ry)):
cv2.line(img, (0, ry[i]), (thresh1.shape[1], ry[i]), (0, 0, 0), 1)
# We need to draw abox around the picture with a white border in order for box_detection to work
cv2.line(img,(0,0),(0,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(img.shape[1]-1,0),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(img.shape[1]-1,0),(255,255,255),2)
cv2.line(img,(0,img.shape[0]-1),(img.shape[1]-1,img.shape[0]-1),(255,255,255),2)
cv2.line(img,(0,0),(0,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(img.shape[1]-3,0),(img.shape[1]-3,img.shape[0]-1),(0,0,0),1)
cv2.line(img,(0,0),(img.shape[1]-1,0),(0,0,0),1)
cv2.line(img,(0,img.shape[0]-2),(img.shape[1]-1,img.shape[0]-2),(0,0,0),1)
cv2.imwrite('out.png',img)
box_extraction("out.png", "./Cropped/")
Now... It puts the cropped regions in the Cropped folder. They are named as x_y.png with (x,y) the position on the original image.
Here are two examples of the outputs
and
Now, in a terminal. I used pytesseract on these two images.
The results are the following:
1)
Original Cost
$200,000.00
2)
Amount Existing Liens
$494,215.00
As you can see, pytesseract got the amount wrong in the second case... So, be careful.
Best regards,
Stéphane
I assume the bounding box is fix (rectangle that able to fit in "Original Amount and the amount below). You can use text detection to detect the "Original Amount" and "Amount Existing Liens" using OCR and crop out the image based on the detected location for further OCR on the amount. You can refer this link for text detection
Try to divide the image into different cells using the lines in the image.
For example, first divide the input into rows by detecting the horizontal lines. This can be done by using cv.HoughLinesP and checking for each line if the difference between y-coordinate of the begin and end point is smaller than a certain threshold abs(y2 - y1) < 10. If you have a horizontal line, it's a separator for a new row. You can use the y-coordinates of this line to split the input horizontally.
Next, for the row you're interested in, divide the region into columns using the same technique, but now make sure the difference between the x-coordinates of the begin and end point are smaller than a certain threshold, since you're now looking for the vertical lines.
You can now crop the image to different cells using the y-coordinates of the horizontal lines and the x-coordinates of the vertical lines. Pass these cropped regions one by one to the OCR engine and you'll have for each cell the corresponding text.

How do I find corners of a paper when there are printed corners/lines on paper itself?

I'm using openCV in Python to find the corners of a sheet of paper to unwarp it.
img = cv2.imread(images[i])
corners = cv2.goodFeaturesToTrack(cv2.cvtColor(img,cv2.COLOR_BGR2GRAY),4,.01,1000,useHarrisDetector=True,k=.04)
corners = np.float32(corners)
print(corners)
ratio = 1.6
cardH = math.sqrt((corners[2][0][0] - corners[1][0][0]) * (corners[2][0][0] - corners[1][0][0]) + (corners[2][0][1] - corners[1][0][1]) * (
corners[2][0][1] - corners[1][0][1]))
cardW = ratio * cardH;
pts2 = np.float32(
[[corners[0][0][0], corners[0][0][1]], [corners[0][0][0] + cardW, corners[0][0][1]], [corners[0][0][0] + cardW, corners[0][0][1] + cardH],
[corners[0][0][0], corners[0][0][1] + cardH]])
M = cv2.getPerspectiveTransform(corners, pts2)
offsetSize = 500
transformed = np.zeros((int(cardW + offsetSize), int(cardH + offsetSize)), dtype=np.uint8);
dst = cv2.warpPerspective(img, M, transformed.shape)
Before:
https://imgur.com/a/H7HjFro
After:
https://imgur.com/a/OA6Iscq
As you can see with these images, they're detecting edges inside the paper itself, rather than the corner of the paper. Should I consider using a different algorithm entirely? I'm quite lost.
I've tried increasing the minimum euclidean distance to 1000, but that really didn't do anything.
Please note, this no one's real information, this is a fake dataset found on Kaggle.
The kaggle dataset can be found https://www.kaggle.com/mcvishnu1/fake-w2-us-tax-form-dataset
Here is one way to do that in Python/OpenCV.
Note that the found corners are listed counter-clockwise from the top-most corner.
Read the input
Convert to gray
Gaussian blur
Otsu threshold
Morphology open/close to clean up the threshold
Get largest contour
Approximate a polygon from the contour
Get the corners
Draw the polygon on the input
Compute side lengths
Compute output corresponding corners
Get perspective transformation matrix from corresponding corner points
Warp the input image according to the matrix
Save the results
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("efile.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur image
blur = cv2.GaussianBlur(gray, (3,3), 0)
# do otsu threshold on gray image
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# apply morphology
kernel = np.ones((7,7), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)
# get largest contour
contours = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
area_thresh = 0
for c in contours:
area = cv2.contourArea(c)
if area > area_thresh:
area_thresh = area
big_contour = c
# draw white filled largest contour on black just as a check to see it got the correct region
page = np.zeros_like(img)
cv2.drawContours(page, [big_contour], 0, (255,255,255), -1)
# get perimeter and approximate a polygon
peri = cv2.arcLength(big_contour, True)
corners = cv2.approxPolyDP(big_contour, 0.04 * peri, True)
# draw polygon on input image from detected corners
polygon = img.copy()
cv2.polylines(polygon, [corners], True, (0,0,255), 1, cv2.LINE_AA)
# Alternate: cv2.drawContours(page,[corners],0,(0,0,255),1)
# print the number of found corners and the corner coordinates
# They seem to be listed counter-clockwise from the top most corner
print(len(corners))
print(corners)
# for simplicity get average of top/bottom side widths and average of left/right side heights
# note: probably better to get average of horizontal lengths and of vertical lengths
width = 0.5*( (corners[0][0][0] - corners[1][0][0]) + (corners[3][0][0] - corners[2][0][0]) )
height = 0.5*( (corners[2][0][1] - corners[1][0][1]) + (corners[3][0][1] - corners[0][0][1]) )
width = np.int0(width)
height = np.int0(height)
# reformat input corners to x,y list
icorners = []
for corner in corners:
pt = [ corner[0][0],corner[0][1] ]
icorners.append(pt)
icorners = np.float32(icorners)
# get corresponding output corners from width and height
ocorners = [ [width,0], [0,0], [0,height], [width,height] ]
ocorners = np.float32(ocorners)
# get perspective tranformation matrix
M = cv2.getPerspectiveTransform(icorners, ocorners)
# do perspective
warped = cv2.warpPerspective(img, M, (width, height))
# write results
cv2.imwrite("efile_thresh.jpg", thresh)
cv2.imwrite("efile_morph.jpg", morph)
cv2.imwrite("efile_polygon.jpg", polygon)
cv2.imwrite("efile_warped.jpg", warped)
# display it
cv2.imshow("efile_thresh", thresh)
cv2.imshow("efile_morph", morph)
cv2.imshow("efile_page", page)
cv2.imshow("efile_polygon", polygon)
cv2.imshow("efile_warped", warped)
cv2.waitKey(0)
Thresholded image:
Morphology cleaned image:
Polygon drawn on input:
Extracted Corners (counterclockwise from top right corner)
4
[[[693 67]]
[[ 23 85]]
[[ 62 924]]
[[698 918]]]
Warped Result:

How to find the document edges in various coloured backgrounds using opencv python? [Document Scanning in various backgrounds]

I am currently have a document that needs to be smart scanned.
For that, I need to find proper contours of the document in any background so that I can do a warped perspective projection and detection with that image.
The main issue faced while doing this is that the document edge detects any kind of background.
I have tried to use the function HoughLineP and tried to find contours on the grayscale blurred image passed through canny edge detection until now.
MORPH = 9
CANNY = 84
HOUGH = 25
IM_HEIGHT, IM_WIDTH, _ = rescaled_image.shape
# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(rescaled_image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7,7), 0)
#dilate helps to remove potential holes between edge segments
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
dilated = cv2.dilate(gray, kernel)
# find edges and mark them in the output map using the Canny algorithm
edged = cv2.Canny(dilated, 0, CANNY)
test_corners = self.get_corners(edged)
approx_contours = []
(_, cnts, hierarchy) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
# loop over the contours
for c in cnts:
# approximate the contour
approx = cv2.approxPolyDP(c, 80, True)
if self.is_valid_contour(approx, IM_WIDTH, IM_HEIGHT):
approx_contours.append(approx)
break
How to find a proper bounding box around the document via OpenCV code.
Any help will be much appreciated.
(The document is taken from the camera in any angle and any coloured background.)
Following code might help you to detect/segment the page in the image...
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread('test_p.jpg')
image = cv2.imread('test_p.jpg')
print(image.shape)
ori = image.copy()
image = cv2.resize(image, (image.shape[1]//10,image.shape[0]//10))
Resized the image to make the operations more faster so that we can work on realtime..
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (11,11), 0)
edged = cv2.Canny(gray, 75, 200)
print("STEP 1: Edge Detection")
plt.imshow(edged)
plt.show()
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts[1], key = cv2.contourArea, reverse = True)[:5]
Here we will consider only first 5 contours from the sorted list based on area
Here the size of the gaussian blur is bit sensitive, so chose it accordingly based on the image size.
After the above operations image may look like..
for c in cnts:
### Approximating the contour
#Calculates a contour perimeter or a curve length
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.01 * peri, True)
# if our approximated contour has four points, then we
# can assume that we have found our screen
screenCnt = approx
if len(approx) == 4:
screenCnt = approx
break
# show the contour (outline)
print("STEP 2: Finding Boundary")
cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
image_e = cv2.resize(image,(image.shape[1],image.shape[0]))
cv2.imwrite('image_edge.jpg',image_e)
plt.imshow(image_e)
plt.show()
Final Image may look like...
Rest of the things may be handled after getting the final image...
Code Reference :- Git Repository
I guess this answer would be helpful...
There is a similar problem which is called orthographic projection.
Orthographic approaches
Rather than doing, Gaussian blur+morphological operation to get the edge of the document, try to do orthographic projection first and then find contours via your method.
For fining proper bounding box, try some preset values or a reference letter after which an orthographic projection will allow you to compute the height and hence the dimensions of the bounding box.

Resources