I need to read the highest temperature on thermographic images, as shown below:
IR_1544_INFRA.jpg
IR_1546_INFRA.jpg
IR_1560_INFRA.jpg
IR_1564_INFRA.jpg
I used the following code; it gave the best result so far.
I also tried several other approaches, such as blurring, grayscale conversion, and binarization, but they all failed.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\User\AppData\Local\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold
entrada = cv2.imread('IR_1546_INFRA.jpg')
image = entrada[40:65, 277:319]
#image = cv2.imread('IR_1546_INFRA.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = 255 - cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Blur and perform text extraction
thresh = cv2.GaussianBlur(thresh, (3,3), 0)
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.waitKey()
In the first image, I found this.
In the second image, I found this.
The image layout is always the same, that is, the temperature is always in the same place, so I cropped the image to isolate only the number. I would like to get 97.7 here and 85.2 here.
My code needs to detect this temperature in every image and generate a list sorted from highest to lowest.
What do you recommend to improve the accuracy of pytesseract for these images?
Note 1: When I analyze the entire image (without cropping), it returns data that is not even present.
Note 2: For some images, even after binarization, pytesseract (image_to_string) does not return any data.
Thank you all, and sorry for the typos; writing in English is still a challenge for me.
Because your images all share the same layout, you can crop the area you want and process only that. The processing is simple: convert to grayscale, threshold, invert, resize, and then run the OCR. You can see it in my code below. It works on all your attached images.
import cv2
import pytesseract
import os
image_path = "temperature"
for nama_file in sorted(os.listdir(image_path)):
    print(nama_file)
    img = cv2.imread(os.path.join(image_path, nama_file))
    crop = img[43:62, 278:319]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.bitwise_not(thresh)
    double = cv2.resize(thresh, None, fx=2, fy=2)
    custom_config = r'-l eng --oem 3 --psm 7 -c tessedit_char_whitelist="1234567890." '
    text = pytesseract.image_to_string(double, config=custom_config)
    print("detected: " + text)
    cv2.imshow("img", img)
    cv2.imshow("double", double)
    cv2.waitKey(0)
cv2.destroyAllWindows()
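To get from the detected strings to the ranked list the question asks for, the OCR output can be parsed into floats and sorted. This is only a minimal sketch and not part of the answer above: `parse_temperature` and `rank_by_temperature` are hypothetical helper names, the file names in the example are placeholders, and the regular expression assumes readings of the form `97.7`.

```python
import re

def parse_temperature(ocr_text):
    """Return the first decimal number found in the OCR output, or None."""
    match = re.search(r"\d+(?:\.\d+)?", ocr_text)
    return float(match.group()) if match else None

def rank_by_temperature(readings):
    """Sort (file name, temperature) pairs from highest to lowest.

    `readings` maps a file name to the raw OCR string; files whose
    text contains no number are skipped.
    """
    parsed = {name: parse_temperature(text) for name, text in readings.items()}
    valid = [(name, temp) for name, temp in parsed.items() if temp is not None]
    return sorted(valid, key=lambda pair: pair[1], reverse=True)

# Placeholder file names and OCR strings, for illustration only.
sample = {
    "second.jpg": "85.2\n\x0c",
    "first.jpg": "97.7\n\x0c",
    "third.jpg": "\x0c",  # OCR returned nothing usable
}
ranking = rank_by_temperature(sample)
print(ranking)  # [('first.jpg', 97.7), ('second.jpg', 85.2)]
```

Skipping files with no number keeps the sort from failing, but in practice you would probably want to log those files so they can be checked by hand.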
I have various types of images like these:
As you can see, they are all fairly similar, but I cannot manage to extract the number from them reliably.
So far my code consists of the following:
import re
import cv2
import numpy as np
from pytesseract import image_to_string
# img: the input image, already loaded as a NumPy array
lower = np.array([250,200,90], dtype="uint8")
upper = np.array([255,204,99], dtype="uint8")
mask = cv2.inRange(img, lower, upper)
res = cv2.bitwise_and(img, img, mask=mask)
data = image_to_string(res, lang="eng", config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')
numbers = int(''.join(re.findall(r'\d+', data)))
I tried tweaking the psm parameter (6, 8 and 13). Each value works for some of these examples, but none works for all of them, and I have no idea how to get around this.
Another solution proposed is:
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(h, w) = gry.shape[:2]
gry = cv2.resize(gry, (w*2, h*2))
erd = cv2.erode(gry, None, iterations=1)
thr = cv2.threshold(erd, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
bnt = cv2.bitwise_not(thr)
However, on the first picture, bnt gives:
And then pytesseract sees 460..
Any idea please?
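When each psm value works on some images but none on all, one thing that might be worth trying is to run several modes and keep the answer most of them agree on. The sketch below is an assumption-laden illustration, not a fix: `read_digits` and `vote_over_psm` are hypothetical helpers, the `ocr` parameter exists only so the voting logic can be exercised without Tesseract installed, and a majority can of course still be wrong.

```python
from collections import Counter
import re

def read_digits(image, psm, ocr=None):
    """Run OCR with one page segmentation mode and keep only the digits."""
    if ocr is None:
        # Imported lazily so the voting logic can be tried without Tesseract.
        from pytesseract import image_to_string as ocr
    config = f"--psm {psm} --oem 3 -c tessedit_char_whitelist=0123456789"
    return "".join(re.findall(r"\d+", ocr(image, config=config)))

def vote_over_psm(image, psm_modes=(6, 8, 13), ocr=None):
    """Return the digit string most modes agree on, or None if all are empty."""
    results = [read_digits(image, psm, ocr) for psm in psm_modes]
    counts = Counter(r for r in results if r)
    return counts.most_common(1)[0][0] if counts else None

# Stand-in OCR function: pretends that only psm 13 misreads the number.
def fake_ocr(image, config=""):
    return "460\n" if "--psm 13" in config else "1460\n"

print(vote_over_psm(None, ocr=fake_ocr))  # 1460
```

With real images you would call `vote_over_psm(res)` and leave `ocr` unset, so that pytesseract is used.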
My approach:
Upsample
Erosion
Simple-thresholding
Bitwise-not
Upsampling is required for accurate recognition. Resizing the image to twice its size makes it readable.
Erosion is a morphological operation that erodes away the boundaries of bright regions. It removes the thin strokes on the digits, which makes them easier to detect.
Thresholding (binary and inverse binary) helps to reveal the features.
Bitwise-not is a bitwise operation that inverts every pixel value; it is useful for extracting part of the image.
You can learn more methods by reading Improving the quality of the output.
Erosion
Threshold
Bitwise-not
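To make the erosion and bitwise-not steps more concrete, here is a small NumPy-only sketch. It is an illustration under assumptions rather than the code used in this answer: `erode3x3` is a hypothetical helper that mimics what `cv2.erode(img, None)` does with its default 3x3 rectangular element, namely replacing each pixel with the minimum of its neighbourhood.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode3x3(gray):
    """Grayscale erosion with a 3x3 square: each pixel becomes its local minimum."""
    # Pad with 255 so that the border never lowers the minimum.
    padded = np.pad(gray, 1, mode="constant", constant_values=255)
    windows = sliding_window_view(padded, (3, 3))
    return windows.min(axis=(2, 3))

# A bright, 3-pixel-wide vertical stroke on a dark background.
stroke = np.zeros((5, 7), dtype=np.uint8)
stroke[:, 2:5] = 255

eroded = erode3x3(stroke)      # bright stroke shrinks to its centre column
inverted = np.invert(eroded)   # bitwise-not: same as 255 - eroded for uint8

print(eroded[2])    # [  0   0   0 255   0   0   0]
print(inverted[2])  # [255 255 255   0 255 255 255]
```

The same operation thickens dark strokes on a bright background, which is why the effect of erosion depends on the polarity of the image.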
Update
The first image is easy to read, since it does not require any pre-processing. Please read How to Improve Quality of Tesseract.
Result:
1460
720
3250
3146
2681
1470
Code:
import cv2
import pytesseract
img_lst = ["oqWjd.png", "YZDt1.png", "MUShJ.png", "kbK4m.png", "POIK2.png", "4W3R4.png"]
for i, img_nm in enumerate(img_lst):
    img = cv2.imread(img_nm)
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    (h, w) = gry.shape[:2]
    if i == 0:
        thr = gry
    else:
        gry = cv2.resize(gry, (w * 2, h * 2))
        erd = cv2.erode(gry, None, iterations=1)
        if i == len(img_lst)-1:
            thr = cv2.threshold(erd, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
        else:
            thr = cv2.threshold(erd, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    bnt = cv2.bitwise_not(thr)
    txt = pytesseract.image_to_string(bnt, config="--psm 6 digits")
    print("".join([t for t in txt if t.isalnum()]))
    cv2.imshow("bnt", bnt)
    cv2.waitKey(0)
If you want to display the comma in the result, change the print("".join([t for t in txt if t.isalnum()])) line to print(txt).
Note that for the last image in the list the threshold method changes from binary to inverse-binary. Binary thresholding does not work accurately on all images, so you need to switch methods for some of them.
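Instead of hard-coding which image needs the inverse threshold, the polarity could be chosen from the binarized image itself. The following is a sketch under one explicit assumption: the digits cover less than half of the image, so the more frequent value is treated as the background. `dark_text_on_light` is a hypothetical helper name.

```python
import numpy as np

def dark_text_on_light(binary):
    """Return a 0/255 image in which the less frequent value is black.

    Assumes the digits cover less than half of the image, so the more
    frequent value is taken to be the background.
    """
    white_fraction = np.count_nonzero(binary == 255) / binary.size
    # Mostly white already: keep as is. Mostly black: invert.
    return binary if white_fraction >= 0.5 else np.invert(binary)

# White "text" pixel on a black background gets inverted.
light_on_dark = np.zeros((4, 4), dtype=np.uint8)
light_on_dark[1, 1] = 255
fixed = dark_text_on_light(light_on_dark)
print(fixed[1, 1], fixed[0, 0])  # 0 255
```

Applied to the output of `cv2.threshold`, this would replace both the `i == len(img_lst)-1` branch and the unconditional bitwise-not, at the cost of relying on the stated assumption.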
I have the following image
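import cv2
import numpy as np
import matplotlib.pyplot as plt
# image: the input picture, already loaded as an RGB array
lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")
mask = cv2.inRange(image, lower, upper)
img = cv2.bitwise_and(image, image, mask=mask)
plt.figure()
plt.imshow(img)
plt.axis('off')
plt.show()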
Now if I try to convert it to grayscale like this:
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
I get that:
And I would like to extract the number on it.
The suggestion:
gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray
emp[emp==0] = 255
emp[emp<100] = 0
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0
plt.imshow(gauss)
gives the image:
Then using pytesseract on any of the images:
data = pytesseract.image_to_string(img, config='outputbase digits')
gives:
'\x0c'
Another suggested solution is:
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
thr = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV)[1]
txt = pytesseract.image_to_string(thr)
plt.imshow(thr)
And this gives
'\x0c'
Not very satisfying... Does anyone have a better solution, please?
Thanks!
I have a two-step solution:
Apply thresholding
Set psm mode to 7.
When you apply thresholding to the image:
Thresholding is one of the simplest methods of bringing out the features of an image.
Now, when we read from the output image:
txt = image_to_string(thr, config="--psm 7")
print(txt)
Result will be:
| 1,625 |
Now, why do we set the page segmentation mode (psm) to 7?
Because treating the image as a single text line gives an accurate result here.
But we have to clean up the result, since the current result is | 1,625 |
We should remove the | characters:
print("".join([t for t in txt if t != '|']))
Result:
1,625
Code:
import cv2
from pytesseract import image_to_string
img = cv2.imread("LZ3vi.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.threshold(gry, 0, 255,
                    cv2.THRESH_BINARY_INV)[1]
txt = image_to_string(thr, config="--psm 7")
print("".join([t for t in txt if t != '|']).strip())
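If the value is needed as a number rather than as text, the cleanup can go one step further. This is a small sketch, assuming that the comma is a thousands separator and that the first run of digits is the value of interest; `parse_amount` is a hypothetical helper, not part of pytesseract.

```python
import re

def parse_amount(ocr_text):
    """Extract an integer such as 1625 from OCR output like '| 1,625 |'."""
    # First run of digits, possibly containing thousands separators.
    match = re.search(r"\d[\d,]*", ocr_text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))

print(parse_amount("| 1,625 |\n\x0c"))  # 1625
print(parse_amount("\x0c"))             # None
```

Returning None for an empty result makes it easy to spot images where the OCR found nothing, such as the '\x0c' output shown in the question.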
Update
How do you get this clean black and white image from my original image?
In three steps:
Reading the image using OpenCV's imread function
img = cv2.imread("LZ3vi.png")
Note that OpenCV reads the image in BGR order (not RGB).
Convert the image to grayscale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Result will be:
Apply threshold
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV)[1]
Result will be:
If you are wondering about thresholding, read about simple thresholding.
All my filters, grayscale... get weird colored images
The reason is that when you display a grayscale image using pyplot, you need to set the color map (cmap) to gray:
plt.imshow(img, cmap='gray')
You can read about the other color maps here.
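The two display pitfalls mentioned above (BGR channel order and the missing channel axis of a grayscale image) can be seen on tiny arrays without opening any file. This is just an illustrative sketch with made-up pixel values.

```python
import numpy as np

# A single pure-blue pixel as OpenCV stores it: channel order is B, G, R.
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis gives the R, G, B order that pyplot expects.
rgb = bgr[..., ::-1]
print(rgb[0, 0])  # [  0   0 255]

# A grayscale image has no channel axis at all, so pyplot applies its
# default color map unless cmap='gray' is passed.
gray = np.zeros((2, 2), dtype=np.uint8)
print(gray.ndim)  # 2
```

`cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same channel swap as the slicing shown here.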
Two issues prevented pytesseract from detecting your number:
The white rectangle around the number (inverting and filling is the solution).
The noise in the shape of the digits (Gaussian smoothing dealt with that).
The solution that AlexAlex proposed works perfectly if it is followed by a Gaussian filter:
output: 1,625
import numpy as np
import pytesseract
import cv2
BGR = cv2.imread('11.png')
RGB = cv2.cvtColor(BGR, cv2.COLOR_BGR2RGB)
lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")
mask = cv2.inRange(RGB, lower, upper)
img = cv2.bitwise_and(RGB, RGB, mask=mask)
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray
emp[emp==0] = 255
emp[emp<100] = 0
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0
text = pytesseract.image_to_string(gauss, config='outputbase digits')
print(text)
I am trying to extract the cheque amount (underlined text in Input Image) from cheque images. I am trying to do this in the following 2 steps:
Locate the rectangle box of the amount in the image.
Perform OCR using OCR libraries like Tesseract OCR and get the text.
I tried to locate the rectangular box, but my code detects many other things in the image.
How can we approach this problem? If anyone has a different approach to extracting the amount, please guide me.
My Code
import numpy as np
import cv2
img = cv2.imread("Ex2.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
contours,_ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
for contour in contours:
    (x,y,w,h) = cv2.boundingRect(contour)
    cv2.rectangle(img, (x,y), (x+w,y+h), (0,255,0), 2)
cv2.imshow('detected.jpg',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Input Image
Currently, I am getting this.
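Since every contour produces a rectangle, one possible way to narrow the candidates down is to keep only boxes whose size and aspect ratio look like an amount field. The sketch below is a rough illustration: `filter_boxes` is a hypothetical helper, and all thresholds are guesses that would need tuning on real cheque images.

```python
def filter_boxes(boxes, image_shape, min_aspect=2.0,
                 min_area_frac=0.005, max_area_frac=0.2):
    """Keep bounding boxes that could plausibly be an amount field.

    boxes: iterable of (x, y, w, h) tuples, e.g. from cv2.boundingRect.
    image_shape: (height, width) of the image.
    All thresholds are illustrative guesses, not tuned values.
    """
    img_area = image_shape[0] * image_shape[1]
    kept = []
    for (x, y, w, h) in boxes:
        if h == 0:
            continue
        area_frac = (w * h) / img_area
        # Wide, medium-sized boxes only: drops specks and the page border.
        if w / h >= min_aspect and min_area_frac <= area_frac <= max_area_frac:
            kept.append((x, y, w, h))
    return kept

boxes = [
    (0, 0, 1000, 500),    # contour around the whole image
    (700, 200, 250, 60),  # wide, medium-sized box
    (10, 10, 5, 8),       # tiny speck
]
print(filter_boxes(boxes, (500, 1000)))  # [(700, 200, 250, 60)]
```

In the loop above, the tuples returned by `cv2.boundingRect` would be collected into a list first and only the filtered ones drawn or passed to the OCR step.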
Here is what I have tried so far. It works fine on most images with black text on a white background.
from PIL import Image
import pytesseract
import nltk
import cv2
imageName = "p9.png"
img = cv2.imread(imageName,cv2.IMREAD_COLOR) #Open the image from which characters have to be recognized
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #convert to gray to reduce details
gray = cv2.bilateralFilter(gray, 11, 17, 17) #Blur to reduce noise
original = pytesseract.image_to_string(gray, config='')
print (original)
But for the image below, it does not give the right text.
Output:
REMIUM OKING OIL
KETTLE-RENDERED 9s MADE FROM POR siren!
fatworks €)
NET WT. 4 02 (3966)
How can I resolve this issue?
What I meant is
imageName = "p9.png"
img = cv2.imread(imageName,cv2.IMREAD_COLOR) #Open the image from which characters have to be recognized
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #convert to gray to reduce details
gray = cv2.bilateralFilter(gray, 11, 17, 17) #Blur to reduce noise
img = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,71,2)
#img = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,11,2)
original = pytesseract.image_to_string(img, config='')
Play around with the parameters of this function to find what works best for you:
cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst]).
I also keep here a link to original tutorial: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html
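To get a feeling for what blockSize and C do before tuning them, it can help to look at a simplified re-implementation. The sketch below is an approximation of the `ADAPTIVE_THRESH_MEAN_C` / `THRESH_BINARY` case written in plain NumPy; `adaptive_mean_threshold` is a hypothetical name, the border handling and rounding are assumptions, and the result is not guaranteed to match OpenCV bit for bit.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_mean_threshold(gray, max_value=255, block_size=3, C=2):
    """Simplified mean adaptive threshold (THRESH_BINARY behaviour).

    Each pixel is compared with the mean of its block_size x block_size
    neighbourhood minus C. block_size must be odd.
    """
    if block_size % 2 != 1 or block_size < 3:
        raise ValueError("block_size must be an odd number >= 3")
    pad = block_size // 2
    # Replicate the edge pixels so every pixel has a full neighbourhood.
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    windows = sliding_window_view(padded, (block_size, block_size))
    local_mean = windows.mean(axis=(2, 3))
    return np.where(gray > local_mean - C, max_value, 0).astype(np.uint8)

# One dark "text" pixel on a background that gets brighter left to right.
img = np.array([[200, 203, 206, 209, 212],
                [200, 203,  60, 209, 212],
                [200, 203, 206, 209, 212]], dtype=np.uint8)
result = adaptive_mean_threshold(img)
print(result)
```

In this example only the dark centre pixel ends up black, even though the background brightness is uneven. A larger blockSize averages over a wider area, and a larger C makes it easier for a pixel to be classified as background. `ADAPTIVE_THRESH_GAUSSIAN_C` uses a Gaussian-weighted mean instead of the plain mean.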
I am developing an application to read numbers from an image using OpenCV in Python 3. I first convert the image to grayscale, then apply dilation and erosion to remove some noise, then apply a threshold to get a black-and-white image, then write the image to local disk to do some ..., and finally apply Tesseract to recognise the number.
I need to extract the numbers from the image. I am new to OpenCV. Does anybody know another method to get the result?
I have shared the image link below; this is the image I was trying to extract the numbers from. Thanks in advance.
https://drive.google.com/file/d/141y-3okLPGP_STje14ukSqSHcgtwMdRO/view?usp=sharing
import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string
# Path of working folder on Disk
src_path = "/Users/sougata.a.roy/Desktop/Images/"
def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    # (note: with a 1x1 kernel both calls leave the image unchanged)
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.jpg", img)
    # Apply threshold to get image with only black and white
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 31, 2)
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + 'thres.jpg', img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.jpg"), lang='eng')
    return result
print('--- Start recognize text from image ---')
print(get_string(src_path + 'abcdefg195.jpg'))
print("------ Done -------")
365