In this Python script, I first increase the image resolution for better OCR accuracy.
Second, I apply several filters to the image to increase contrast and text clarity.
Third, I extract text from the filtered image using Tesseract, but the issue is that Tesseract does not extract the data cleanly.
I also tried setting page segmentation modes and OCR engine modes in pytesseract, but did not get the expected output.
This is my code:
import os, argparse
import csv
import cv2
import numpy as np
import pytesseract
from pytesseract import Output
from PIL import Image
image = cv2.imread('dota.jpg')

# upscale the image to help OCR
height = 9000
width = 16000
dimensions = (width, height)
image = cv2.resize(image, dimensions, interpolation=cv2.INTER_LINEAR)
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

def adaptiveThreshold(image):
    return cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
gray = get_grayscale(image)

# sharpen, then binarize with the adaptive threshold
sharpenKernel = np.array([[2, -1, 2], [-1, 9, -1], [2, -1, 2]], np.float32) / 9
sharpen = cv2.filter2D(src=gray, kernel=sharpenKernel, ddepth=-1)
adthresh = adaptiveThreshold(sharpen)
config = '--psm 6'
final = pytesseract.image_to_data(adthresh, output_type=Output.DICT, config=config, lang='eng')
print(final['text'])
This code is also available on GitHub: https://github.com/Bhavin-Prydan/Dota-Esportz/blob/main/dota.py
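For reference, here is a minimal diagnostic sketch (not from the original script) for sweeping PSM/OEM combinations on the same preprocessed image to see which configuration reads best; adthresh is the binarized image from above:

# Hedged sketch: try several PSM/OEM combinations and compare the raw output
for oem in (1, 3):
    for psm in (3, 6, 11):
        config = f'--oem {oem} --psm {psm}'
        text = pytesseract.image_to_string(adthresh, config=config, lang='eng')
        print(config, '->', repr(text[:80]))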
I am new to image processing. I am trying to detect characters in an image using the code below, but the problem is that it does not detect characters that are very close to the image boundary.
import cv2
import numpy as np
import pytesseract
try:
    import Image, ImageOps, ImageEnhance
except ImportError:
    from PIL import Image, ImageOps, ImageEnhance
image = cv2.imread(r"bmdc2.jpg")
image = cv2.blur(image, (3, 3))
ret, image = cv2.threshold(image, 90, 255, cv2.THRESH_BINARY)
image = cv2.dilate(image, np.ones((3, 1), np.uint8))
image = cv2.erode(image, np.ones((2, 2), np.uint8))
#image = cv2.dilate(image, np.ones((2, 2), np.uint8))
# enlarge the image to get rid of the boundary-letter problem
image = cv2.resize(image, (200, 80))
cv2.imshow("1", image)
cv2.waitKey(0)
# convert to a PIL image in memory
img = Image.fromarray(image)
text = pytesseract.image_to_string(img)
print(text)
cv2.imshow("1", np.array(image))
cv2.waitKey(0)
The image I am trying to read is detected as 8677, missing the beginning 6. Another sample is detected as 460973, and another as 25469.
Any better solution is welcome. Image samples can be found at https://www.bmdc.org.bd/search-doctor
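One common mitigation for characters clipped at the boundary (my suggestion, not from the original post) is to pad the binarized image with a white border before OCR:

# Hedged sketch: add a white margin so edge characters are not clipped
padded = cv2.copyMakeBorder(image, 20, 20, 20, 20, cv2.BORDER_CONSTANT, value=(255, 255, 255))
text = pytesseract.image_to_string(Image.fromarray(padded))
print(text)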
My problem is that I have an image (shown here) from which I would like to extract all the letters. I have already tried pytesseract, but it returns HANUTG and therefore misses the I. Here is my code:
import cv2
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
im = cv2.imread("crop1.jpg")
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
print(pytesseract.image_to_string(im, config='--psm 13 --oem 1 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
It is not very long, but it is sufficient to extract the letters except, apparently, the I. I tested with other letters and some others are not recognized either. How can I fix this?
You need to preprocess the image before feeding it to Tesseract when you can't get the desired output.
For example:
You could apply adaptive thresholding and add a white border to the current image. Results:
Adaptive-thresh result
White-border
Now when you feed it to Tesseract:
H | AN U T G
Code:
import cv2
import pytesseract
bgr_img = cv2.imread("53wU6.jpg")  # Load the image
gry_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
thresh = cv2.adaptiveThreshold(gry_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 21)
border_img = cv2.copyMakeBorder(thresh, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=255)  # Add a white border around the thresholded image
txt = pytesseract.image_to_string(border_img, config='--psm 6')
print(txt)
Suggested reading: Improving the quality of the output
I am trying to count seeds in an image using cv2 thresholding. The test image is below:
When I run the below code to create a mask:
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('S__14278933.jpg')
#img = cv2.fastNlMeansDenoisingColored(img,None,10,10,7,21)
mask = cv2.threshold(img[:, :, 0], 255, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
plt.imshow(mask)
I get the following mask:
But ideally it should give a small yellow dot at the centre. I have tried this with other images and it works just fine.
Can someone help?
The lighting in your image does not seem uniform. Try adaptive thresholding:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "c6pBO.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Convert the image to grayscale:
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Get a binary image via adaptive thresholding:
windowSize = 31
windowConstant = 40
binaryImage = cv2.adaptiveThreshold(grayImage, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                    cv2.THRESH_BINARY_INV, windowSize, windowConstant)
cv2.imshow("binaryImage", binaryImage)
cv2.waitKey(0)
You might want to apply an Area Filter after this, though, as dark portions on the image will yield noise.
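For instance, a minimal sketch of such an area filter using connected components (the minimum area of 50 pixels is an assumption to tune per image):

# Hedged sketch: keep only blobs with at least minArea pixels
minArea = 50
n, labels, stats, _ = cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
filtered = np.zeros_like(binaryImage)
for i in range(1, n):  # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] >= minArea:
        filtered[labels == i] = 255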
Try this:
img = cv2.imread('foto.jpg', 0)  # read as grayscale
mask = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)[1]
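Since the goal is counting seeds, a minimal follow-up sketch (assuming the mask above is clean) that counts the blobs in the mask:

# Hedged sketch: count connected components; label 0 is the background
n, _ = cv2.connectedComponents(mask)
print("seed count:", n - 1)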
I have this image.
Running pytesseract with Python 3.8 produced the following problems:
The word "phone" is read as O (not zero; O as in Oscar).
The word "Fax" is read as 2%.
The phone number is read as (56031770.
The image in question does not contain the boxes. The boxes are taken from the cv2 output after drawing boxes around the detected text regions / words.
The fax number is read without a problem: (866)357-7704 (including the parentheses and the hyphen).
The image size is 23 megapixels (converted from a PDF file).
The image has been preprocessed with thresholding in OpenCV to get a binary image.
The image does not contain bold fonts, so I did not use any erosion.
What can I do to properly read the Phone Number?
Thank you.
PS: I am using image_to_data (not image_to_string) as I also need to know the locations of the strings on the page.
Edit: here is the relevant part of code:
from PIL import Image
import pytesseract
from pytesseract import Output
import argparse
import cv2
import os
import numpy as np
import math
from pdf2image import convert_from_path
from scipy.signal import convolve2d
import string
filename = "image.png"
image = cv2.imread(filename)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# estimate the noise standard deviation (Immerkaer's fast noise estimate)
H, W = gray.shape
M = [[1, -2, 1],
     [-2, 4, -2],
     [1, -2, 1]]
sigma = np.sum(np.sum(np.absolute(convolve2d(gray, M))))
sigma = sigma * math.sqrt(0.5 * math.pi) / (6 * (W - 2) * (H - 2))
# if the image is too noisy, go with the blurring method
if sigma > 10:
    # noisy
    gray = cv2.medianBlur(gray, 3)
    print("noise reduced")
# otherwise go with the thresholding method
else:
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    print("threshold applied")
d = pytesseract.image_to_data(gray, output_type=Output.DICT)
for t in d['text']:
    print(t)
This will thus be PSM 3 (the default).
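Since the string locations matter here, a minimal sketch (using the d dict from above) of reading the word boxes that image_to_data returns:

# Hedged sketch: print each detected word with its bounding box
for i in range(len(d['text'])):
    if int(d['conf'][i]) > 0 and d['text'][i].strip():
        x, y, w, h = d['left'][i], d['top'][i], d['width'][i], d['height'][i]
        print(d['text'][i], (x, y, w, h))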
Versions:
Tesseract: 4.1.1 (retrieved with tesseract --version)
pytesseract: 0.3.2 (retrieved with pip3 show pytesseract)
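Since the image comes from a PDF (pdf2image's convert_from_path is already imported above), here is a minimal sketch of that conversion step; the filename and DPI are assumptions:

# Hedged sketch: render the first PDF page to a PNG before OCR
pages = convert_from_path("document.pdf", dpi=300)  # returns a list of PIL images
pages[0].save("image.png")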
I have the images as below; I need to extract just the white strip portion from all of them.
I have tried using PIL to extract the rectangular portion by manually specifying the pixel values. Is there an automated way to get this done, where just feeding in the image returns the rectangular portion?
Below is my snippet:
from PIL import Image
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = Image.open('C:/Users/ShAgarwal/Documents/image_dataset/pic9.jpg')
half_the_width = img.size[0] / 2
half_the_height = img.size[1] / 2
img4 = img.crop(
(
half_the_width-1632,
half_the_height - 440,
half_the_width+1632,
half_the_height + 80
)
)
sample image
import cv2
import numpy as np
from matplotlib import pyplot as plt
image='IMG_3134.JPG'
# read image
imgc = cv2.imread(image)
img = cv2.resize(imgc, None, fx=0.25, fy=0.25) # resize since image is huge
#cropping the strip dimensions
#crop_img = img[1010:1650,140:1099723]
blurred = cv2.blur(img, (3,3))
canny = cv2.Canny(blurred, 50, 200)
Marking the coordinates through automatic detection using the Canny algorithm:
## find the non-zero min-max coords of canny
pts = np.argwhere(canny>0)
y1,x1 = pts.min(axis=0)
y2,x2 = pts.max(axis=0)
## crop the region
cropped = img[y1:y2, x1:x2]
cv2.imwrite("cropped.png", cropped)
#Select the bounded area around white boundary
tagged = cv2.rectangle(img.copy(), (x1,y1), (x2,y2), (0,255,0), 3, cv2.LINE_AA)
r = cv2.selectROI(tagged)
imCrop = img[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])]  # crop the selected ROI
#Bounded Area
cv2.imwrite("taggd2.png", imCrop)
cv2.waitKey()
Results from above code