I am trying to detect logo in invoices. Though I am able to get some results but not sufficient enough to process. While detecting logos, Unwanted text is also getting detected.
The following is from actual invoice:-original Image
and the following results I am getting Image after operations
I am using the`following code which I have written:-
gray=cv2.imread("Image",0)
ret,thresh1 = cv2.threshold(gray,180,255,cv2.THRESH_BINARY)
kernel_logo = np.ones((10,10),np.uint8)
closing_logo = cv2.morphologyEx(thresh1,cv2.MORPH_CLOSE,kernel_logo,
iterations = 1)
n=3
noise_removed_logo = cv2.medianBlur(closing_logo, n)
eroded_logo = cv2.erode(noise_removed_logo,kernel_logo, iterations = 8)
dilated_logo=cv2.dilate(eroded_logo,kernel_logo, iterations=3)
Could you please help me what changes should I make to remove noise from my documented image. I am new to Computer Vision
Few more sample:- Original document
The result I am getting:- Result after operations on document
Hello Mohd Anas Khan .
Your approch to define logo is too simple so it couldn't work. If you want a product-level approach, use some machine learning or deep learning. If you want just some toys, then a simple countours finder with fixed rules should work.
For example, in the following approach i defined "logo" as "the contour which has biggest area". You'll need more rules later, so good luck.
import numpy as np
import cv2
im = cv2.imread('contours_1.jpg')
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(imgray,127,255, cv2.THRESH_BINARY_INV)
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
threshed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rect_kernel)
cv2.imwrite("contours_1_thres.jpg", threshed)
im2, contours, hierarchy = cv2.findContours(threshed,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
ws = []
hs = []
areas = []
for contour in contours:
area = cv2.contourArea(contour)
x, y, w, h = cv2.boundingRect(contour)
print("w: {}, h: {}, area: {}".format(w, h, area))
ws.append(w)
hs.append(h)
areas.append(area)
max_idx = np.argmax(areas)
cv2.drawContours(im, [contours[max_idx]], -1, (0, 255, 0), 3)
# cv2.drawContours(im, contours, -1, (0, 255, 0), 3)
cv2.imwrite("contours_1_test.jpg", im)
The output images are as follow : (The detected logo is covered in green box )
Related
I am trying to make a project which has three imshow windows each of different size, is there a way to make those three windows to pane or stack and display them in another window? Currently they are displayed like this
How can i make a window which will contain all these windows and only the main window will have a close button and not all of them.
Use
To stack them vertically:
img = np.concatenate((img1, img2), axis=0)
To stack them horizontally:
img = np.concatenate((img1, img2), axis=1)
Then show them using cv2.imshow
You can read in each image and store all but the first one, img0, into a list, imgs. Iterate through each image in the imgs list, comparing the width of img0 and the image of the iteration. With that we can define a pad, and update the img0 with the image:
Lets say we have these three images:
one.png:
two.png:
three.png:
The code:
import cv2
import numpy as np
img1 = cv2.imread("one.png")
img2 = cv2.imread("two.png")
img3 = cv2.imread("three.png")
imgs = [img1, img2, img3]
result = imgs[0]
for img in imgs[1:]:
w = img.shape[1] - result.shape[1]
pad = [(0, 0), (0, abs(w)), (0, 0)]
if w > 0:
result = np.r_[np.pad(result, pad), img]
else:
result = np.r_[result, np.pad(img, pad)]
cv2.imshow("Image", result)
cv2.waitKey(0)
Output:
hand-filled character per box form
I want to automate a process in which I would get hand-filled character per box type forms in image format and I need to extract text from these forms. The boxes surrounds each letter, I have to extract all the text from the image form.
You can use selecting contours by size, find rotated rectangle and inverse transform make.
import cv2
import numpy as np
img = cv2.imread('4YAry.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# convert to binary image
thresh=cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV )[1]
contours,hierarchy = cv2.findContours(thresh, 1, 2)
for cnt in contours:
x , y , w , h = cv2 . boundingRect ( cnt )
if abs(w-345)<10: # width box is 345 px
rect = cv2.minAreaRect(cnt)
box = cv2.boxPoints(rect)
srcTri=np.array( [box[1], box[0], box[2]] ).astype(np.float32)
dstTri = np.array( [[0, 0], [0, rect[1][1]], [rect[1][0],0]] ).astype(np.float32)
warp_mat = cv2.getAffineTransform(srcTri, dstTri)
warp_dst = cv2.warpAffine(img, warp_mat, (np.int0(rect[1][0]), np.int0(rect[1][1])))
N=14
s=0.99*warp_dst.shape[1]/N # tune rectangle positions
for i in range(N):
warp_dst = cv2.rectangle ( warp_dst , ( 2+int(i*s) ,2 ), ( 2+int((i+1)*s) , warp_dst.shape[0]-3 ), ( 255 , 255 , 255 ), 2 )
cv2.imwrite('chars.png', warp_dst)
Using for instance Hough, detect the top and bottom edges and the vertical separations. Validate the separations by checking that they run from top to bottom. The horizontal lines will be more reliable and accurate, you can use their direction for deskewing if necessary.
After doing that, you will have missing separations and false ones. Using some heuristics, try to find the correct pitch and detect the false positives and false negatives. Now you can extract the content of the individual boxes, or erase the edges.
This process cannot be perfect, some characters will be damaged.
I've download the opencv from https://opencv.org/opencv-demonstrator-gui/ to make some live test on some images.
I found that this filter work perfectly for my needs:
,
I need to code it in my python script, tried to follow this tutorial :https://docs.opencv.org/3.4/d2/d2c/tutorial_sobel_derivatives.html
but I'm unable to find and match setting I need (pre-filtering Deriche, or Schar operator type).
I guess also I should use this syntax:
cv.Sobel(gray, ddepth, 1, 0, ksize=3, scale=scale, delta=delta, borderType=cv.BORDER_DEFAULT)
Thx.
UPDATE
Using this lines I'm close to right result:
scale = 1
delta = 0
ddepth = cv2.CV_16S
src = cv2.imread(image, cv2.IMREAD_COLOR)
src = cv2.GaussianBlur(src, (3, 3), 0)
gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
grad_x = cv2.Sobel(gray, ddepth, 1, 0, ksize=3, scale=scale, delta=delta, borderType=cv2.BORDER_DEFAULT)
# Gradient-Y
# grad_y = cv.Scharr(gray,ddepth,0,1)
grad_y = cv2.Sobel(gray, ddepth, 0, 1, ksize=3, scale=scale, delta=delta, borderType=cv2.BORDER_DEFAULT)
abs_grad_x = cv2.convertScaleAbs(grad_x)
abs_grad_y = cv2.convertScaleAbs(grad_y)
grad = cv2.addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0)
You are only doing the X derivative Sobel filter in Python/OpenCV. It is likely you really want the gradient magnitude, not the X directional derivative. To compute the magnitude, you need both the X and Y derivatives and then compute the magnitude. You also like will need to compute as float so as not to get one sided derivatives. You can later convert the magnitude to 8-bit if you want.
gradx = cv2.Sobel(gray,cv2.CV_64F,1,0,ksize=3)
grady = cv2.Sobel(gray,cv2.CV_64F,0,1,ksize=3)
gradmag = cv2.magnitude(gradx,grady)
The Scharr is similar and can be found at https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#gaa13106761eedf14798f37aa2d60404c9
]From https://www.pyimagesearch.com/2018/07/19/opencv-tutorial-a-guide-to-learn-opencv/
I'm able to extract the contours and write as files.
For example I've a photo with some scribbled text : "in there".
I've been able to extract the letters as separate files but what I want is that these letter files should have same width and height. For example in case of "i" and "r" width will differ. In that case I want to append(any b/w pixels) to the right of "i" photo so it's width becomes same as that of "r"
How to do it in Python? Just increase the size of photo(not resize)
My code looks something like this:
# find contours (i.e., outlines) of the foreground objects in the
# thresholded image
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
output = image.copy()
ROI_number = 0
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
ROI = image[y:y+h, x:x+w]
file = 'ROI_{}.png'.format(ROI_number)
cv2.imwrite(file.format(ROI_number), ROI)
[][1
Here are a couple of other ways to do that using Python/OpenCV using cv2.copyMakeBorder() to extend the border to the right by 50 pixels. The first way simply extends the border by replication. The second extends it with the mean (average) blue background color using a mask to get only the blue pixels.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
# get mask of background pixels (for result2b only)
lowcolor = (232,221,163)
highcolor = (252,241,183)
mask = cv2.inRange(img, lowcolor, highcolor)
# get average color of background using mask on img (for result2b only)
mean = cv2.mean(img, mask)[0:3]
color = (mean[0],mean[1],mean[2])
# extend image to the right by 50 pixels
result = img.copy()
result2a = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_REPLICATE)
result2b = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_CONSTANT, value=color)
# view result
cv2.imshow("img", img)
cv2.imshow("mask", mask)
cv2.imshow("result2a", result2a)
cv2.imshow("result2b", result2b)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended2a.jpg", result2a)
cv2.imwrite("i_extended2b.jpg", result2b)
Replicated Result:
Average Background Color Result:
In Python/OpenCV/Numpy you create a new image of the size and background color you want. Then you use numpy slicing to insert the old image into the new one. For example:
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
ht, wd, cc= img.shape
# create new image of desired size (extended by 50 pixels in width) and desired color
ww = wd+50
hh = ht
color = (242,231,173)
result = np.full((hh,ww,cc), color, dtype=np.uint8)
# copy img image into image at offsets yy=0,xx=0
yy=0
xx=0
result[yy:yy+ht, xx:xx+wd] = img
# view result
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended.jpg", result)
I have a set of data points in the 3D space and would like to fit a bounding box to them. I know that vtkOBBTree::ComputeOBB can do this for me. But I can't seem to figure out how to visualize the oriented bounding box.
Any help is appreciated!
For a bounding box you can use vtkOutlineFilter. You just have to set as input the 3D data that you want to fit. Then you create the mapper and the actor, and add it to the scene, as you would do in a typical VTK scenario. Here is a working example, in Python:
from vtk import *
quadric = vtkQuadric()
quadric.SetCoefficients(.5, 1, .2, 0, .1, 0, 0, .2, 0, 0)
sample = vtkSampleFunction()
sample.SetSampleDimensions(50,50,50)
sample.SetImplicitFunction(quadric)
contour = vtkContourFilter()
contour.SetInputConnection(sample.GetOutputPort())
contour.GenerateValues(5,0,1)
contourMapper = vtkPolyDataMapper()
contourMapper.SetInputConnection(contour.GetOutputPort())
contourMapper.SetScalarRange(0,1.2)
contourActor = vtkActor()
contourActor.SetMapper(contourMapper)
outline = vtkOutlineFilter()
outline.SetInputConnection(sample.GetOutputPort())
outlineMapper = vtkPolyDataMapper()
outlineMapper.SetInputConnection(outline.GetOutputPort())
outlineActor = vtkActor()
outlineActor.SetMapper(outlineMapper)
outlineActor.GetProperty().SetColor(1,1,1)
ren = vtkRenderer()
ren.SetBackground(0.188,0.373,0.647)
ren.AddActor(contourActor)
ren.AddActor(outlineActor)
renWin = vtkRenderWindow()
renWin.AddRenderer(ren)
renWin.SetWindowName("IsoSurface")
renWin.SetSize(500,500)
iren = vtkRenderWindowInteractor()
iren.SetRenderWindow(renWin)
renWin.Render()
iren.Initialize()
iren.Start()