I'm using pytesseract to extract email IDs from images with the code below.
import pytesseract as ps
text = ps.image_to_string('./email_id.png')
print(text)
But the extracted text is not correct. For example, for this image:
it extracts the email ID as: adarsh_1493#yahoo.com
Similarly, when I use the image below:
the result comes out as: airiorceschooibegumpet#yahoo.com
I tried a few configs suggested in different posts, but nothing worked. (Tesseract-OCR V5.0.0alpha20190708)
Any help is highly appreciated!
You will need to preprocess the image.
I recommend starting with grayscale conversion and thresholding:
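import cv2
import numpy as np
# load the image (path taken from the question)
image = cv2.imread('./email_id.png')
# get grayscale image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# threshold gray image (Otsu picks the threshold value automatically)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]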
Then try variations of noise removal, dilation and erosion.
# noise removal (each call returns a new image, so keep the result)
denoised = cv2.medianBlur(thresh, 5)
# dilation
kernel = np.ones((5, 5), np.uint8)
dilated = cv2.dilate(thresh, kernel, iterations=1)
# erosion (same 5x5 kernel)
eroded = cv2.erode(thresh, kernel, iterations=1)
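To make the thresholding step less of a black box, here is a minimal NumPy-only sketch of what an Otsu-style threshold computes: the gray level that maximises the between-class variance of the histogram. It is an illustration rather than OpenCV's implementation, and otsu_threshold and the synthetic image are names made up for this example.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level that maximises between-class variance (hypothetical helper)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    weight_bg = np.cumsum(hist)              # number of pixels with value <= t
    weight_fg = hist.sum() - weight_bg       # number of pixels with value > t
    cum_intensity = np.cumsum(hist * levels)
    # Guard against division by zero where one of the two classes is empty
    mean_bg = cum_intensity / np.maximum(weight_bg, 1)
    mean_fg = (cum_intensity[-1] - cum_intensity) / np.maximum(weight_fg, 1)
    between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    return int(np.argmax(between_var))

# Synthetic stand-in for a line of text: dark "ink" on a light background
gray = np.full((20, 60), 200, dtype=np.uint8)
gray[8:12, 5:55] = 50

t = otsu_threshold(gray)
# Pixels above the threshold become white, the rest black
binary = np.where(gray > t, 255, 0).astype(np.uint8)
```

On a clean two-tone image like this one any level between the two tones separates them; real scans are noisier, which is why the blur, dilation and erosion variations are worth trying.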
I was wondering if I can translate this opencv-python method into Pillow, as I have to process its output further in Pillow.
A workaround I thought of would be to save the result with OpenCV and load it afterwards with Pillow, but I am looking for a cleaner solution, because I use the output of remove_background() as input for each frame of a GIF. With the workaround, I would read and write images N * GIF_frames_count times for no reason.
The method I want to convert from opencv-python to Pillow:
import cv2
import numpy as np
def remove_background(path):
    img = cv2.imread(path)
    # Convert to gray
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Threshold input image as mask
    mask = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY)[1]
    # Negate mask
    mask = 255 - mask
    # Apply morphology to remove isolated extraneous noise
    # Use border constant of black since foreground touches the edges
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Anti-alias the mask -- blur then stretch
    # Blur alpha channel
    mask = cv2.GaussianBlur(mask, (0, 0), sigmaX=2, sigmaY=2, borderType=cv2.BORDER_DEFAULT)
    # Linear stretch so that 127.5 goes to 0, but 255 stays 255
    mask = (2 * (mask.astype(np.float32)) - 255.0).clip(0, 255).astype(np.uint8)
    # Put mask into alpha channel
    result = img.copy()
    result = cv2.cvtColor(result, cv2.COLOR_BGR2BGRA)
    result[:, :, 3] = mask
    return result
Code taken from: how to remove background of images in python
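The "blur then stretch" step in that function is plain arithmetic: 2 * value - 255, clipped to the range 0..255. A small NumPy sketch with made-up sample values (not taken from a real image) shows how blurred mask levels map to alpha:

```python
import numpy as np

# Sample blurred-mask values (illustrative only)
blurred = np.array([0, 64, 127, 128, 191, 255], dtype=np.uint8)

# Same expression as in remove_background(): values at or below 127.5 become
# fully transparent, 255 stays fully opaque, everything in between is stretched
alpha = (2 * blurred.astype(np.float32) - 255.0).clip(0, 255).astype(np.uint8)

print(alpha.tolist())  # [0, 0, 0, 1, 127, 255]
```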
Rather than re-writing all the code using PIL equivalents, you could adopt the "if it ain't broke, don't fix it" maxim, and simply convert the Numpy array that the existing code produces into a PIL Image that you can use for your subsequent purposes.
This is described in this answer, which I'll paraphrase as:
# Make "PIL Image" from Numpy array
pi = Image.fromarray(na)
Note that the linked answer refers to scikit-image (which uses RGB ordering like PIL) rather than OpenCV, so there is the added wrinkle that you will also need to reorder the channels from BGRA to RGBA, so the last couple of lines will look like:
...
...
result = cv2.cvtColor(result, cv2.COLOR_BGR2RGBA)
result[:, :, 3] = mask
pi = Image.fromarray(result)
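If it helps to see what the channel reordering amounts to, the sketch below does the same BGRA to RGBA swap with NumPy indexing on a single made-up pixel. The resulting array is the kind of input that would then be handed to Image.fromarray; the pixel values are arbitrary.

```python
import numpy as np

# One BGRA pixel as OpenCV would store it: blue=10, green=20, red=30, alpha=255
bgra = np.array([[[10, 20, 30, 255]]], dtype=np.uint8)

# Swap the blue and red channels, keep green and alpha where they are
rgba = bgra[:, :, [2, 1, 0, 3]]

print(rgba[0, 0].tolist())  # [30, 20, 10, 255]
```

Indexing with a list returns a new array, so the original OpenCV image is left untouched.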
I am trying to add batching to an OpenCV Python script I am writing, and I cannot for the life of me see what I am doing wrong. I am a beginner at this, so it's probably something stupid. In the end I want the script to read every image file in its current working directory, crop each one based on OpenCV's face detection, and write the cropped images with the same names to a folder inside the CWD. Right now, though, all it does is write the last image in the folder to the output folder. Any ideas from those who know what they are doing?
import cv2
import sys
import os.path
import glob
#Cascade path
cascPath = 'haarcascade_frontalface_default.xml'
# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)
# Read Images
images = glob.glob('*.jpg')
for i in images:
    image = cv2.imread(i,1)
#Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Find face(s) using cascade
faces = faceCascade.detectMultiScale(
    gray,
    scaleFactor=1.1, #size of groups
    minNeighbors=5, #How many groups around are detected as face for it to be valid
    minSize=(300, 300) #Min size in pixels for face
)
# Outputs number of faces found in image
print('Found {0} faces!'.format(len(faces)))
# Places a rectangle on face (For debugging, wont be in crop version)
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (255, 255, 255), 4)
# Resizes image to fit monitor and displayes it
imOut = cv2.resize(image, (750, 1142))
#cv2.imshow("Faces found", imS)
#cv2.waitKey(0)
#Saves image to output folder and creates folder if it doesnt exist
if not os.path.exists('output'):
    os.makedirs('output')
os.chdir('output')
cv2.imwrite(i, imOut)
I have made several corrections to the code:
You need to give the full path of haarcascade_frontalface_default.xml.
For instance, on a Unix system:
cascPath = 'opencv/data/haarcascades/haarcascade_frontalface_default.xml'
You shouldn't create directories inside the loop. Create the directory once, before the loop.
if not os.path.exists('output'):
    os.makedirs('output')
You don't need to change the directory to save the images. Just put the folder path in front of the image name.
img_name = "output/out_{}.png".format(c) # c is the counter
Indentation is important; otherwise the loop body will not contain the statements you expect.
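The question also asks for output files that keep the name of the input image. A minimal sketch of building such a path without os.chdir, using only the standard library (output_path_for is a hypothetical helper and the sample file names are made up):

```python
import os

def output_path_for(input_path, out_dir="output"):
    """Build a path inside out_dir that keeps the input file's name."""
    return os.path.join(out_dir, os.path.basename(input_path))

# Works the same whether or not the input lives in a subfolder
for name in ["images/alice.jpg", "bob.jpg"]:
    print(output_path_for(name))
```

The returned string can be passed to cv2.imwrite as the file name, so the working directory never has to change during the loop.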
Example code:
import cv2
import os.path
import glob
# Cascade path
cascPath = '/opencv/data/haarcascades/haarcascade_frontalface_default.xml'
# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)
if not os.path.exists('output'):
    os.makedirs('output')
# Read Images
images = glob.glob('images/*.jpg')
for c, i in enumerate(images):
    image = cv2.imread(i, 1)
    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Find face(s) using cascade
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1, # size of groups
        minNeighbors=5, # How many groups around are detected as face for it to be valid
        minSize=(300, 300) # Min size in pixels for face
    )
    # Outputs number of faces found in image
    print('Found {0} faces!'.format(len(faces)))
    # Places a rectangle on face (For debugging, wont be in crop version)
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 4)
    if len(faces) > 0:
        # Resizes image to fit monitor and displayes it
        imOut = cv2.resize(image, (750, 1142))
        # cv2.imshow("Faces found", imS)
        # cv2.waitKey(0)
        # Saves image to output folder and creates folder if it doesnt exist
        # os.chdir('output')
        img_name = "output/out_{}.png".format(c)
        cv2.imwrite(img_name, imOut)
Example output:
images = glob.glob('*.jpg')
for i in images:
    image = cv2.imread(i,1)
#Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
...
What you are doing is opening every image one by one, and only once the loop has reached the last image do you apply the operations, so they run on that last image alone.
This is easily fixed by putting all the operations you want to apply to each image inside the first for loop. Watch the indentation; that is basically what is going wrong here.
images = glob.glob('*.jpg')
for i in images:
    image = cv2.imread(i,1)
    #Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    #do all your operations here
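For the cropping the question ultimately wants, each detection is an (x, y, w, h) box and the crop is a NumPy slice of the image. A small sketch on a synthetic array (the box values are made up; in the real script they would come from detectMultiScale):

```python
import numpy as np

# Stand-in for an image loaded with cv2.imread: 100 rows, 200 columns, 3 channels
image = np.zeros((100, 200, 3), dtype=np.uint8)

# Hypothetical detection in the (x, y, w, h) form
faces = [(50, 20, 40, 30)]

crops = []
for (x, y, w, h) in faces:
    # Rows are indexed by y, columns by x
    crops.append(image[y:y + h, x:x + w])

print(crops[0].shape)  # (30, 40, 3)
```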
I need to read the highest temperature on thermographic images, as shown below:
IR_1544_INFRA.jpg
IR_1546_INFRA.jpg
IR_1560_INFRA.jpg
IR_1564_INFRA.jpg
I used the following code, which gave the best result so far.
I also tried several other approaches, such as blurring, grayscale conversion, binarization, and others, but they all failed.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\User\AppData\Local\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold
entrada = cv2.imread('IR_1546_INFRA.jpg')
image = entrada[40:65, 277:319]
#image = cv2.imread('IR_1546_INFRA.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = 255 - cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Blur and perform text extraction
thresh = cv2.GaussianBlur(thresh, (3,3), 0)
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.waitKey()
In the first image, I found this.
In the second image, I found this.
The image layout is always the same, that is, the temperature is always in the same place, so I cropped the image to isolate only the number. The values I would like to get are 97.7 here and 85.2 here.
My code needs to detect this temperature reliably in these images and generate a list sorted from highest to lowest.
What do you recommend to improve the accuracy of pytesseract on these images?
Note 1: When I analyze the entire image (without cropping), it returns data that is not even present.
Note 2: For some images, even with the number binarized, pytesseract (image_to_string) does not return any data.
Thank you all, and sorry for the typos; writing in English is still a challenge for me.
Because all your images share the same layout, you can crop the area you want and process only that. The processing is simple: convert to grayscale, threshold, invert, resize, and then run the OCR. You can see it in my code below. It works on all your attached images.
import cv2
import pytesseract
import os
image_path = "temperature"
for nama_file in sorted(os.listdir(image_path)):
    print(nama_file)
    img = cv2.imread(os.path.join(image_path, nama_file))
    crop = img[43:62, 278:319]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.bitwise_not(thresh)
    double = cv2.resize(thresh, None, fx=2, fy=2)
    custom_config = r'-l eng --oem 3 --psm 7 -c tessedit_char_whitelist="1234567890." '
    text = pytesseract.image_to_string(double, config=custom_config)
    print("detected: " + text)
    cv2.imshow("img", img)
    cv2.imshow("double", double)
    cv2.waitKey(0)
cv2.destroyAllWindows()
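The question also asks for a list ordered from highest to lowest. Assuming the OCR text for each file has been collected, a minimal standard-library sketch of that last step could look like this. parse_temperature and the ocr_results mapping are hypothetical: the file names and the two values come from the question, but which value belongs to which file is made up here.

```python
import re

def parse_temperature(ocr_text):
    """Return the first decimal number in the OCR text, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", ocr_text)
    return float(match.group()) if match else None

# Hypothetical OCR results keyed by file name (raw strings may carry whitespace)
ocr_results = {
    "IR_1544_INFRA.jpg": "85.2\n",
    "IR_1546_INFRA.jpg": "97.7\n\x0c",
    "IR_1560_INFRA.jpg": "",  # OCR returned nothing for this one
}

readings = []
for name, text in ocr_results.items():
    value = parse_temperature(text)
    if value is not None:
        readings.append((name, value))

# Highest temperature first
readings.sort(key=lambda item: item[1], reverse=True)
print(readings)
```

Files for which nothing could be parsed are skipped rather than recorded as zero, so a failed read does not end up at the bottom of the ranking looking like a real measurement.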
I have an extracted image like this, and I want to crop and extract the individual letters from it.
I have tried the code below, but it works only for names written like this; for that image I get the expected result, a single letter at a time.
import cv2
import numpy as np
img = cv2.imread('data1/NAME.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh1 = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
imgMorph = cv2.erode(thresh1, kernel, iterations = 1)
contours, hierarchy = cv2.findContours(imgMorph,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
i=1
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)
    if w>10 and w<100 and h>10 and h<100:
        #save individual images
        cv2.imwrite("data1/NAME_{}.png".format((i)),thresh1[y:y+h,x:x+w])
        i=i+1
cv2.imshow('BindingBox',imgMorph)
cv2.waitKey(0)
cv2.destroyAllWindows()
This code gives results like this and this, and so on.
The expected result is like this.
You cannot separate touching or overlapping letters with morphological operations when the common line is as thick as the rest of the letter.
You cannot segment the letters but you can recognize them using advanced OCR techniques like machine learning.
Read this http://www.how-ocr-works.com/OCR/word-character-segmentation.html
It's not as simple as thresholding and detecting blobs. You'll need to train an OCR engine like Tesseract to detect handwritten characters.
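A toy illustration of why touching letters defeat blob-style segmentation: the sketch below splits a binary image wherever a column contains no ink, which is a simple column-projection technique swapped in for illustration and not the contour code from the question. column_runs and both tiny images are made up. As soon as a stroke joins two glyphs there is no empty column left to split on.

```python
import numpy as np

def column_runs(binary):
    """Return (start, end) column ranges that contain any foreground pixel."""
    has_ink = (binary > 0).any(axis=0)
    runs = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new run of inked columns begins
        elif not ink and start is not None:
            runs.append((start, x))        # an empty column ends the run
            start = None
    if start is not None:
        runs.append((start, len(has_ink)))
    return runs

# Two separate glyphs with a gap between them
separated = np.zeros((5, 9), np.uint8)
separated[1:4, 1:3] = 255
separated[1:4, 5:8] = 255

# The same glyphs joined by a connecting stroke
joined = separated.copy()
joined[2, 3:5] = 255

print(column_runs(separated))  # [(1, 3), (5, 8)]
print(column_runs(joined))     # [(1, 8)]
```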
I am developing an application to read the numbers from an image using OpenCV in Python 3. I first convert the image to grayscale, then apply dilation and erosion to remove some noise, then apply a threshold to get an image with only black and white, then write the image to local disk to do some ..., and finally apply pytesseract to recognise the number.
I need to extract the numbers from the image. I am new to OpenCV. Does anybody know another method to get the result?
I have shared the link to the image I was trying to extract from below. Thanks in advance.
https://drive.google.com/file/d/141y-3okLPGP_STje14ukSqSHcgtwMdRO/view?usp=sharing
import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string
# Path of working folder on Disk
src_path = "/Users/sougata.a.roy/Desktop/Images/"
def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.jpg", img)
    # Apply threshold to get image with only black and white
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 31, 2)
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + 'thres.jpg', img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.jpg"), lang='eng')
    return result
print('--- Start recognize text from image ---')
print(get_string(src_path + 'abcdefg195.jpg'))
print("------ Done -------")