Extracting characters from words in an image - python-3.x

I have an extracted image like this, and I want to crop and extract the individual letters from it.
I have tried the code below, but it only works for names written like this; for this image, the expected result is a single letter at a time.
import cv2
import numpy as np

img = cv2.imread('data1/NAME.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
imgMorph = cv2.erode(thresh1, kernel, iterations=1)
contours, hierarchy = cv2.findContours(imgMorph, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

i = 1
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    if w > 10 and w < 100 and h > 10 and h < 100:
        # save individual images
        cv2.imwrite("data1/NAME_{}.png".format(i), thresh1[y:y+h, x:x+w])
        i = i + 1

cv2.imshow('BindingBox', imgMorph)
cv2.waitKey(0)
cv2.destroyAllWindows()
This code gives the results below
and
and so on.
The expected result is like this.

You cannot separate touching or overlapping letters with morphological operations when the common line is as thick as the rest of the letter.
You cannot segment the letters but you can recognize them using advanced OCR techniques like machine learning.
Read this http://www.how-ocr-works.com/OCR/word-character-segmentation.html

It's not as simple as thresholding and detecting blobs. You'll need to train an OCR engine like Tesseract to detect handwritten characters.
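For illustration, a minimal sketch of running Tesseract on the whole word via pytesseract (assuming a Tesseract install; the data1/NAME.png path comes from the question's code). Note the stock English model is trained on printed text, so handwriting will likely need a custom-trained model:
import cv2
import pytesseract

img = cv2.imread('data1/NAME.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# --psm 8 tells Tesseract to treat the image as a single word
text = pytesseract.image_to_string(gray, config='--psm 8')
print(text)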

Related

How do I detect vertical text with OpenCV for extraction

I am new to OpenCV and trying to see if I can find a way to detect vertical text in the attached image.
In this case, on row 3, I would like to get the bounding box around Original Cost and the amount below it ($200,000.00).
Similarly, I would like to get the bounding box around Amount Existing Liens and the associated amount below it. I would then send this data to an OCR engine to read the text. Traditional OCR engines go line by line, extract the text, and lose the context.
Here is what I have tried so far -
import cv2
import numpy as np

img = cv2.imread('Test3.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 100, apertureSize=3)
cv2.imshow('edges', edges)
cv2.waitKey(0)

minLineLength = 20
maxLineGap = 10
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 15, minLineLength=minLineLength, maxLineGap=maxLineGap)
for x in range(0, len(lines)):
    for x1, y1, x2, y2 in lines[x]:
        cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imshow('hough', img)
cv2.waitKey(0)
Here is my solution, based on Kanan Vyas and Adrian Rosebrock.
It's probably not as "canonical" as you'd wish.
But it seems to work (more or less...) with the image you provided.
Just a word of CAUTION: the code looks, within the directory from which it is run, for a folder named "Cropped", where the cropped images will be stored. Don't run it in a directory that already contains a folder named "Cropped", because the code deletes everything in that folder on each run. If you're unsure, run it in a separate folder.
The code:
# Import required packages
import cv2
import numpy as np
import pathlib

###################################################################################################################################
# https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/
###################################################################################################################################
def sort_contours(cnts, method="left-to-right"):
    # initialize the reverse flag and sort index
    reverse = False
    i = 0
    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True
    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1
    # construct the list of bounding boxes and sort them from top to
    # bottom
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
        key=lambda b: b[1][i], reverse=reverse))
    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)

###################################################################################################################################
# https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26 (with a few modifications)
###################################################################################################################################
def box_extraction(img_for_box_extraction_path, cropped_dir_path):
    # Read the image in grayscale
    img = cv2.imread(img_for_box_extraction_path, 0)
    # Threshold the image
    (thresh, img_bin) = cv2.threshold(img, 128, 255,
                                      cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # Invert the image
    img_bin = 255 - img_bin
    #cv2.imwrite("Image_bin.jpg", img_bin)
    # Define a kernel length
    kernel_length = np.array(img).shape[1] // 200
    # A vertical kernel of (1 x kernel_length), which will detect all the vertical lines in the image
    verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
    # A horizontal kernel of (kernel_length x 1), which will detect all the horizontal lines in the image
    hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
    # A kernel of (3 x 3) ones
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # Morphological operation to detect vertical lines in the image
    img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
    verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
    #cv2.imwrite("verticle_lines.jpg", verticle_lines_img)
    # Morphological operation to detect horizontal lines in the image
    img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
    horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
    #cv2.imwrite("horizontal_lines.jpg", horizontal_lines_img)
    # Weighting parameters: these decide how much of each image is added to make the new image
    alpha = 0.5
    beta = 1.0 - alpha
    # Add the two images with the weight parameters to get a third image as a summation of the two
    img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
    img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
    (thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # For debugging: enable this line to see the vertical and horizontal lines used to find the boxes
    #cv2.imwrite("img_final_bin.jpg", img_final_bin)
    # Find contours in the image, which will detect all the boxes
    contours, hierarchy = cv2.findContours(
        img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # Sort all the contours from top to bottom
    (contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")
    idx = 0
    for c in contours:
        # Returns the location and width/height of every contour
        x, y, w, h = cv2.boundingRect(c)
        # Only save the region as a box in the "Cropped/" folder if its width is > 50 and its height is > 20
        if (w > 50 and h > 20):  # and w > 3*h:
            idx += 1
            new_img = img[y:y+h, x:x+w]
            cv2.imwrite(cropped_dir_path + str(x) + '_' + str(y) + '.png', new_img)
###########################################################################################################################################################
def prepare_cropped_folder():
    p = pathlib.Path('./Cropped')
    if p.exists():
        # Cropped folder exists and may be non-empty: clean it up
        files = [x for x in p.glob('*.*') if x.is_file()]
        for f in files:
            f.unlink()
    else:
        p.mkdir()
###########################################################################################################################################################
# MAIN
###########################################################################################################################################################
prepare_cropped_folder()
# Read image from which text needs to be extracted
img = cv2.imread("dkesg.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
thresh1 = 255 - thresh1

bin_y = np.zeros(thresh1.shape[0])
for x in range(0, len(bin_y)):
    bin_y[x] = sum(thresh1[x, :])
bin_y = bin_y / max(bin_y)
ry = np.where(bin_y > 0.995)[0]
for i in range(0, len(ry)):
    cv2.line(img, (0, ry[i]), (thresh1.shape[1], ry[i]), (0, 0, 0), 1)

# We need to draw a box around the picture with a white border in order for box_extraction to work
cv2.line(img, (0, 0), (0, img.shape[0]-1), (255, 255, 255), 2)
cv2.line(img, (img.shape[1]-1, 0), (img.shape[1]-1, img.shape[0]-1), (255, 255, 255), 2)
cv2.line(img, (0, 0), (img.shape[1]-1, 0), (255, 255, 255), 2)
cv2.line(img, (0, img.shape[0]-1), (img.shape[1]-1, img.shape[0]-1), (255, 255, 255), 2)
cv2.line(img, (0, 0), (0, img.shape[0]-1), (0, 0, 0), 1)
cv2.line(img, (img.shape[1]-3, 0), (img.shape[1]-3, img.shape[0]-1), (0, 0, 0), 1)
cv2.line(img, (0, 0), (img.shape[1]-1, 0), (0, 0, 0), 1)
cv2.line(img, (0, img.shape[0]-2), (img.shape[1]-1, img.shape[0]-2), (0, 0, 0), 1)
cv2.imwrite('out.png', img)
box_extraction("out.png", "./Cropped/")
Now... it puts the cropped regions in the Cropped folder. They are named x_y.png, with (x, y) the position in the original image.
Here are two examples of the outputs
and
Then, in a terminal, I used pytesseract on these two images.
The results are the following:
1)
Original Cost
$200,000.00
2)
Amount Existing Liens
$494,215.00
As you can see, pytesseract got the amount wrong in the second case... So, be careful.
Best regards,
Stéphane
I assume the bounding box is fixed (a rectangle able to fit "Original Cost" and the amount below it). You can use text detection to detect "Original Cost" and "Amount Existing Liens" using OCR, then crop out the image based on the detected locations for further OCR on the amounts. You can refer to this link for text detection.
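For illustration only (this is not the linked detector, just a sketch): pytesseract's image_to_data returns word-level boxes, so one could locate a label and crop a band below it. The Test3.png name comes from the question; the padding and band height are guesses:
import cv2
import pytesseract

img = cv2.imread('Test3.png')
data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)

for j, word in enumerate(data['text']):
    if word.strip() == 'Original':
        x, y = data['left'][j], data['top'][j]
        w, h = data['width'][j], data['height'][j]
        # crop the label plus a band below it to catch the amount underneath
        crop = img[y:y + 4 * h, max(0, x - 10):x + w + 200]
        cv2.imwrite('original_cost_region.png', crop)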
Try to divide the image into different cells using the lines in the image.
For example, first divide the input into rows by detecting the horizontal lines. This can be done by using cv.HoughLinesP and checking, for each line, whether the difference between the y-coordinates of its begin and end points is smaller than a certain threshold: abs(y2 - y1) < 10. If you have a horizontal line, it's a separator for a new row. You can use the y-coordinates of this line to split the input horizontally.
Next, for the row you're interested in, divide the region into columns using the same technique, but now make sure the difference between the x-coordinates of the begin and end points is smaller than a certain threshold, since you're now looking for vertical lines.
You can now crop the image to different cells using the y-coordinates of the horizontal lines and the x-coordinates of the vertical lines. Pass these cropped regions one by one to the OCR engine and you'll have for each cell the corresponding text.
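A rough sketch of the row-splitting step (reusing the Canny/Hough parameters from the question's code, which is an assumption; the column split works the same way with abs(x2 - x1)):
import cv2
import numpy as np

img = cv2.imread('Test3.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 100, apertureSize=3)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 15, minLineLength=20, maxLineGap=10)

# keep the y-coordinates of (nearly) horizontal lines: these are the row separators
ys = sorted(set(int(y1) for line in lines for x1, y1, x2, y2 in line if abs(y2 - y1) < 10))

# crop the image into rows between consecutive separators
for k in range(len(ys) - 1):
    row = img[ys[k]:ys[k + 1], :]
    if row.shape[0] > 5:  # skip slivers from near-duplicate detections of the same line
        cv2.imwrite('row_{}.png'.format(k), row)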

Difficulty reading text with pytesseract

I need to read the highest temperature on thermographic images, as shown below:
IR_1544_INFRA.jpg
IR_1546_INFRA.jpg
IR_1560_INFRA.jpg
IR_1564_INFRA.jpg
I used the following code; this was the best result.
I also tried several other approaches, such as blurring, grayscale conversion, binarization and others, but they all failed.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\User\AppData\Local\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold
entrada = cv2.imread('IR_1546_INFRA.jpg')
image = entrada[40:65, 277:319]
#image = cv2.imread('IR_1546_INFRA.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = 255 - cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Blur and perform text extraction
thresh = cv2.GaussianBlur(thresh, (3,3), 0)
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.waitKey()
In the first image, I found this.
In the second image, I found this.
The image layout is always the same, that is, the temperature is always in the same place, so I cropped the image to isolate only the number. I would like to get 97.7 here and 85.2 here.
My code needs to process these images so it always detects this temperature and generates a list ordered from highest to lowest.
What do you suggest to improve the accuracy of pytesseract on these images?
Note 1: When I analyze the entire image (without cropping), it returns data that is not even present.
Note 2: In some images, even with the binarized number, pytesseract (image_to_string) does not return any data.
Thank you all, and sorry for the typos; writing in English is still a challenge for me.
Because your images all share the same layout, you can crop the area you want and then do the processing there. The processing is also simple: convert to gray, threshold, invert, resize, and then do the OCR. You can see it in my code below. It works on all your attached images.
import cv2
import pytesseract
import os

image_path = "temperature"
for nama_file in sorted(os.listdir(image_path)):
    print(nama_file)
    img = cv2.imread(os.path.join(image_path, nama_file))
    crop = img[43:62, 278:319]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.bitwise_not(thresh)
    double = cv2.resize(thresh, None, fx=2, fy=2)
    custom_config = r'-l eng --oem 3 --psm 7 -c tessedit_char_whitelist="1234567890." '
    text = pytesseract.image_to_string(double, config=custom_config)
    print("detected: " + text)
    cv2.imshow("img", img)
    cv2.imshow("double", double)
    cv2.waitKey(0)
cv2.destroyAllWindows()
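The question also asks for a list ordered from highest to lowest temperature. A possible extension of the loop above, sketched here under the assumption that the OCR output parses as a float (a guard is included for when it doesn't):
import os
import cv2
import pytesseract

image_path = "temperature"
readings = []
for nama_file in sorted(os.listdir(image_path)):
    img = cv2.imread(os.path.join(image_path, nama_file))
    crop = img[43:62, 278:319]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    thresh = cv2.bitwise_not(cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)[1])
    double = cv2.resize(thresh, None, fx=2, fy=2)
    text = pytesseract.image_to_string(
        double, config=r'-l eng --oem 3 --psm 7 -c tessedit_char_whitelist="1234567890." ')
    try:
        readings.append((float(text.strip()), nama_file))
    except ValueError:
        print("could not parse: " + repr(text))  # guard against empty/garbled OCR output

# highest temperature first, as the question asks
readings.sort(reverse=True)
for temp, name in readings:
    print(name, temp)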

pytesseract fail to extract correct email id from image

I'm using pytesseract to extract email ids from images, using the code below.
import pytesseract as ps
text = ps.image_to_string('./email_id.png')
print(text)
But the extracted text is not correct. Example for this image:
it extracts the email id as: adarsh_1493#yahoo.com
Similarly when I'm using below image:
The result comes out as: airiorceschooibegumpet#yahoo.com.
I tried a few configs as suggested by different posts, but nothing worked. (Tesseract-OCR V5.0.0alpha20190708)
Any help highly appreciated!
You will need to preprocess the image.
I recommend starting with the following: convert to grayscale and threshold first:
import cv2
import numpy as np

# read the image (path from the question) and get a grayscale version
image = cv2.imread('./email_id.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# threshold the gray image
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
Then try variations of noise removal, dilation and erosion.
# noise removal
blur = cv2.medianBlur(thresh, 5)

# dilation
kernel = np.ones((5, 5), np.uint8)
dilated = cv2.dilate(thresh, kernel, iterations=1)

# erosion
kernel = np.ones((5, 5), np.uint8)
eroded = cv2.erode(thresh, kernel, iterations=1)
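Putting the pieces together, here is a minimal sketch of one possible pipeline ending in pytesseract. Which preprocessing combination works best depends on the actual images, so treat the chosen steps and the 2x upscale as assumptions to experiment with:
import cv2
import numpy as np
import pytesseract as ps

image = cv2.imread('./email_id.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
clean = cv2.medianBlur(thresh, 5)

# upscaling often helps Tesseract on small text; the factor 2 is a guess
big = cv2.resize(clean, None, fx=2, fy=2)

text = ps.image_to_string(big)
print(text)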

detecting similar objects and cropping them from the image

I have to extract this:
from the given image:
I tried contour detection, but that gives all the contours, whereas I specifically need that one object in the image.
My idea is to:
Find the objects in the image
Draw bounding boxes around them
Crop them and save them individually.
I am working with OpenCV and using Python 3, which I am fairly new to.
As seen, there are three objects similar to the given template but of different sizes. There are also other boxes which are not in the area of interest. After cropping, I want to save them as three separate images. Is there a solution to this situation?
I tried multi-scale template matching with the cropped template.
Here is an attempt:
# import the necessary packages
import numpy as np
import argparse
import imutils
import glob
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-t", "--template", required=True, help="Path to template image")
args = vars(ap.parse_args())

# load the template image, convert it to grayscale, and detect edges
template = cv2.imread(args["template"])
template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
template = cv2.Canny(template, 50, 200)
(tH, tW) = template.shape[:2]

image = cv2.imread('input.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# loop over the scales of the image
for scale in np.linspace(0.2, 1.0, 20)[::-1]:
    # resize the image according to the scale, and keep track
    # of the ratio of the resizing
    resized = imutils.resize(gray, width=int(gray.shape[1] * scale))
    r = gray.shape[1] / float(resized.shape[1])
    # if the resized image is smaller than the template, then break
    # from the loop
    if resized.shape[0] < tH or resized.shape[1] < tW:
        break
    # detect edges in the resized, grayscale image and apply template
    # matching to find the template in the image
    edged = cv2.Canny(resized, 50, 200)
    res = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
    loc = np.where(res >= 0.95)
    for pt in zip(*loc[::-1]):
        cv2.rectangle(image, (int(pt[0]*r), int(pt[1]*r)),
                      (int((pt[0] + tW)*r), int((pt[1] + tH)*r)), (0, 255, 0), 2)
cv2.imwrite('re.png', image)
The result that I am getting:
The expected result is bounding boxes around all the post-it boxes.
I'm currently on mobile so I can't really write code, but this link does exactly what you're looking for!
If anything isn't clear, I can adapt the code to your example later this evening, when I have access to a laptop.
In your case I would crop out the content of the shape (the post-it) and template match just on the edges. That'll make sure it's not thrown off by the text inside. A sketch of that idea follows.
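A hedged sketch of that idea: take edge maps of both images, blank the interior of the template's edge map so only its outline drives the match. The file names template.png and input.png are placeholders; the border width, the normalized matching method, and the 0.8 threshold are assumptions to tune:
import cv2
import numpy as np

template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)
image = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)

# edge maps for both; then blank the template's interior so the text inside is ignored
edged_t = cv2.Canny(template, 50, 200)
tH, tW = edged_t.shape[:2]
border = 6  # pixels of the template border to keep; an arbitrary assumption
edged_t[border:tH - border, border:tW - border] = 0

edged_i = cv2.Canny(image, 50, 200)

# a normalized method makes a fixed threshold meaningful
res = cv2.matchTemplate(edged_i, edged_t, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(res >= 0.8)

out = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
for x, y in zip(xs, ys):
    cv2.rectangle(out, (int(x), int(y)), (int(x) + tW, int(y) + tH), (0, 255, 0), 2)
cv2.imwrite('matches.png', out)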
Good luck!

How to Segment handwritten and printed digit without losing information in opencv?

I've written an algorithm that detects printed and handwritten digits and segments them, but while removing the outer rectangle using clear_border from the skimage package, part of the handwritten digit is lost. Any suggestions to prevent this loss of information?
Sample:
How to get all 5 characters separately?
Segmenting characters from the image -
Approach -
Threshold the image (Convert it to BW)
Perform Dilation
Check the contours are large enough
Find rectangular Contours
Take ROI and save the characters
Python Code -
# import the necessary packages
import numpy as np
import cv2
import imutils

# load the image, convert it to grayscale, and blur it to remove noise
image = cv2.imread("sample1.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)

# threshold the image
ret, thresh1 = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

# dilate the white portions
dilate = cv2.dilate(thresh1, None, iterations=2)

# find contours in the image
cnts = cv2.findContours(dilate.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

orig = image.copy()
i = 0
for cnt in cnts:
    # Check the area of the contour; if it is very small, ignore it
    if cv2.contourArea(cnt) < 100:
        continue
    # Filtered contours are detected
    x, y, w, h = cv2.boundingRect(cnt)
    # Taking the ROI of the contour
    roi = image[y:y+h, x:x+w]
    # Mark them on the image if you want
    cv2.rectangle(orig, (x, y), (x+w, y+h), (0, 255, 0), 2)
    # Save your contours or characters
    cv2.imwrite("roi" + str(i) + ".png", roi)
    i = i + 1

cv2.imshow("Image", orig)
cv2.waitKey(0)
First of all, I thresholded the image to convert it to black and white, so the characters appear as white portions on a black background. Then I dilated the image to make the characters (white portions) thicker, which makes it easier to find the appropriate contours. The findContours method is then used to find the contours. Next, we check that each contour is large enough; if it is not, it is ignored (because that contour is noise). Then the boundingRect method is used to find the rectangle for each contour. Finally, the detected contours are saved and drawn.
Input Image -
Threshold -
Dilated -
Contours -
Saved characters -
Problem of eroded/cropped handwritten digits:
You may solve this problem in the recognition step, or even in the image-improvement step (before recognition).
If only a very small part of the digit is cropped (as in your example image), it's enough to pad the image by 1 or 2 pixels on each side to make the segmentation easy; a morphological filter (dilate) can improve the digit even further after padding. (These operations are available in OpenCV; see the sketch below.)
If a large enough part of the digit is cropped, you need to add degraded/cropped digit patterns to the training dataset used by the digit recognition algorithm (i.e. digit 3 with all possible cropping cases, etc.).
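A minimal sketch of the padding-plus-dilation idea (the file name digit.png, the 2-pixel pad, and the kernel size are assumptions):
import cv2
import numpy as np

digit = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)  # hypothetical cropped-digit image

# pad by 2 pixels on every side so strokes touching the border survive later processing
padded = cv2.copyMakeBorder(digit, 2, 2, 2, 2, cv2.BORDER_CONSTANT, value=0)

# a light dilation can reconnect strokes thinned by the cropping
kernel = np.ones((3, 3), np.uint8)
repaired = cv2.dilate(padded, kernel, iterations=1)
cv2.imwrite('digit_repaired.png', repaired)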
Problem of character separation:
OpenCV offers a blob detection algorithm that works well for this issue (choose correct values for the concavity and convexity parameters).
OpenCV also offers a contour detector (together with the Canny() edge function) that helps to detect the contours of each character; you can then fit a bounding box around each character (OpenCV also offers cv2.approxPolyDP(contour, ..., ...) for this). A sketch of the blob-detection route follows.
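A minimal sketch of blob detection with OpenCV's SimpleBlobDetector (the input file name and all parameter values are assumptions to tune per image; by default the detector looks for dark blobs on a light background, which suits dark characters):
import cv2

img = cv2.imread('digits.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input

params = cv2.SimpleBlobDetector_Params()
params.filterByConvexity = True
params.minConvexity = 0.3  # tune: lower values admit more concave character shapes
params.filterByArea = True
params.minArea = 50        # tune: reject specks of noise

detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(img)

out = cv2.drawKeypoints(img, keypoints, None, (0, 255, 0),
                        cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('blobs.png', out)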
