pytesseract unable to recognize complex math formula from image - python-3.x

I am using the pytesseract module in Python. pytesseract recognizes text from images, but it doesn't work on images that contain complex math formulas such as square roots, derivatives, integrals, or equations.
code 2.py
# Import modules
from PIL import Image
import pytesseract
import cv2
# Include tesseract executable in your path
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Create an image object of PIL library
image = Image.open('23.jpg')
# img = cv2.imread('123.jpg')
# pass image into pytesseract module
# pytesseract is trained in many languages
image_to_text = pytesseract.image_to_string(image, lang='eng+equ')
image_to_text1 = pytesseract.image_to_string(image)
# Print the text
print(image_to_text)
# print(image_to_text1)
# workon digits
Output:
242/33
2x
2x+3X
2X+3x=4
2x?-3x +1=0
(x-1)(x+1) =x2-1
(x+2)/((x+3)(x-4))
7-4=3
V(x/2) =3
2xx—343=6x—3 (x#3)
Jeeta =e* +e
dy 2
S=2?-3
dz ¥
dy = (a? — 3)dx
Input image

To work with math, you should install the appropriate language data for Tesseract. In your case it is 'equ', from https://github.com/tesseract-ocr/tessdata/raw/3.04.00/equ.traineddata . The full list of available languages is described at https://tesseract-ocr.github.io/tessdoc/Data-Files
I'm not familiar with installing Tesseract languages on Windows, but there is documentation at https://github.com/tesseract-ocr/tesseract/wiki :
If you want to use another language, download the appropriate training
data, unpack it using 7-zip, and copy the .traineddata file into the
'tessdata' directory, probably C:\Program Files\Tesseract-OCR\tessdata
Also, first try processing your image with the command-line tool only (without Python), because the CLI exposes the full list of options you can tune.
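As a rough sketch of the install check described above, the snippet below verifies that a .traineddata file exists for every language in a lang string such as 'eng+equ'. The helper name `missing_traineddata` is made up for illustration, and the default directory is simply the one the quoted documentation calls probable; the code only inspects files and does not call Tesseract.

```python
from pathlib import Path

# Assumed default location; the quoted documentation only says "probably".
DEFAULT_TESSDATA = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def missing_traineddata(lang_spec, tessdata_dir=DEFAULT_TESSDATA):
    """Return the languages in lang_spec that have no .traineddata file.

    lang_spec uses the same '+'-separated form passed to pytesseract,
    for example 'eng+equ'. Hypothetical helper, not part of pytesseract.
    """
    tessdata_dir = Path(tessdata_dir)
    languages = [name for name in lang_spec.split("+") if name]
    return [
        name for name in languages
        if not (tessdata_dir / f"{name}.traineddata").is_file()
    ]


if __name__ == "__main__":
    missing = missing_traineddata("eng+equ")
    if missing:
        print("Missing language data:", ", ".join(missing))
    else:
        print("All requested language data files are present.")
```

If 'equ' shows up as missing, copying equ.traineddata into the tessdata directory, as the quoted documentation describes, should be the first thing to try.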

I used this code and it worked!
import re
import cv2
import pytesseract as tess
path = (r"C:\Users\10\AppData\Local\Tesseract-OCR\tesseract.exe")
tess.pytesseract.tesseract_cmd = path
png = "Downloads/m.png"
text = tess.image_to_string(png)
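# str.replace returns a new string, so keep the result
text = text.replace(" ", "")
# A digit followed by an operator; "-" is escaped so it is not read as a range
pattern = re.compile(r"([0-9][=+\-/*])")
# Keep the lines (not single characters) that look like equations
equations = [x for x in text.splitlines() if pattern.search(x)]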
print(re.findall(r'(.*)', str(text))[0])

Related

OSError: Unable to open file (file signature not found)

I am currently doing an assignment on deep learning by downloading the assignment files from github.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset
%matplotlib inline
You are given a dataset ("data.h5") containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3), where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px and width = num_px).
# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
I ran the setup.sh file too but the error doesn't seem to go away.
lr_utils.py file:
import numpy as np
import h5py
def load_dataset():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels
    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
Kindly help!
I solved the issue by downloading uncorrupted .h5 files and putting them in the folder datasets/ in the same directory.
The files you downloaded are corrupted. You can visit https://github.com/abdur75648/Deep-Learning-Specialization-Coursera to download the uncorrupted files.
You can download uncorrupted files from here:
https://www.kaggle.com/datasets/muhammeddalkran/catvnoncat
and use them to replace the corrupted files in the same directory.
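One way to confirm that the downloaded file is the problem, sketched here under some assumptions: a valid HDF5 file normally starts with the 8-byte signature `\x89HDF\r\n\x1a\n`, so reading the first bytes shows whether the download is real HDF5 data, a saved HTML page, or a Git LFS pointer file. The helper name `describe_h5_file` is made up for illustration, and it only looks at offset 0 (the HDF5 format also allows the signature at later offsets when a user block is present).

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"        # first 8 bytes of a normal HDF5 file
GIT_LFS_PREFIX = b"version https://git-lfs"  # start of a Git LFS pointer file


def describe_h5_file(path):
    """Classify a file by its first bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if head.startswith(GIT_LFS_PREFIX):
        return "git-lfs pointer"
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html page"
    return "unknown"
```

For example, `describe_h5_file('datasets/train_catvnoncat.h5')` returning anything other than "hdf5" would be consistent with the "file signature not found" error. If h5py is installed, `h5py.is_hdf5(path)` performs a similar check.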

Error Opening PGM file with PIL and SKIMAGE

I have the following image file:
Image
I used PIL and skimage to open it, but I get the following errors.
First with PIL (tried with and without the truncate option):
Code:
from PIL import Image, ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
img = Image.open("image_output.pgm")
Error:
OSError: cannot identify image file 'image_output.pgm'
And with Skimage:
Code:
from skimage import io
img = io.imread("image_output.pgm")
Error:
OSError: cannot identify image file <_io.BufferedReader name='image_output.pgm'>
I can open the file with GUI applications like system photo viewer and Matlab.
How can I diagnose what is wrong with the image? I compared the byte data with other PGM files that I can open in Python, but could not identify the difference.
Thanks.
Your file is P2 type PGM, which means it is in ASCII - you can view it in a normal text editor. It seems neither PIL, nor skimage want to read that, but are happy to read the corresponding P5 type which is identical except it is written in binary, rather than ASCII.
There are a few options...
1) You could use OpenCV to read it:
import cv2
im = cv2.imread('a.pgm')
2) You could convert it to P5 with ImageMagick and then read the output.pgm file with skimage or PIL:
magick input.pgm output.pgm
3) If adding OpenCV, or ImageMagick as a dependency is a real pain for you, it is possible to read a PGM image yourself:
#!/usr/bin/env python3
import re
import numpy as np
# Open image file, slurp the lot
with open('input.pgm') as f:
    s = f.read()
# Find anything that looks like numbers
# Technically, there could be comments that should be ignored
l=re.findall(r'[0-9P]+',s)
# List "l" will contain: P2, width, height, 255, pixel1, pixel2, pixel3...
# Technically, if l[3]>255, you should change the type of the Numpy array to uint16, but that is not the case
w, h = int(l[1]), int(l[2])
# Make Numpy image from data
ni = np.array(l[4:],dtype=np.uint8).reshape((h,w))
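The hand-written reader above notes that comments should technically be ignored. A minimal sketch of a variant that does skip `#` comments is shown below; it uses only the standard library and returns plain lists. The function name `read_ascii_pgm` is hypothetical, and the sketch assumes a well-formed P2 file with a single image.

```python
def read_ascii_pgm(text):
    """Parse the text of a P2 (ASCII) PGM file.

    Returns (width, height, maxval, rows), where rows is a list of
    `height` lists holding `width` integers each.
    """
    tokens = []
    for line in text.splitlines():
        # Everything after '#' on a line is a comment in the PGM format
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII (P2) PGM file")
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:]]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the header")
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return width, height, maxval, rows
```

The result can be turned into an array with `np.array(rows, dtype=np.uint8)` when maxval is at most 255, or `np.uint16` otherwise.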

Python 3.x - Slightly Precise Optical Character Recognition. What should I use?

import time
# cv2.cvtColor takes a numpy ndarray as an argument
import numpy as nm
import pytesseract
# importing OpenCV
import cv2
from PIL import ImageGrab, Image
bboxes = [(1469, 1014, 1495, 1029)]
def imToString():
    # Path of tesseract executable (raw string so the backslashes are kept literally)
    pytesseract.pytesseract.tesseract_cmd = r'D:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
    while True:
        for box in bboxes:
            # ImageGrab-To capture the screen image in a loop.
            # Bbox used to capture a specific area.
            cap = ImageGrab.grab(bbox=box)
            # Converted the image to monochrome for it to be easily
            # read by the OCR and obtained the output String.
            # ImageGrab returns RGB data, so convert from RGB rather than BGR.
            tesstr = pytesseract.image_to_string(
                cv2.cvtColor(nm.array(cap), cv2.COLOR_RGB2GRAY), lang='eng', config='digits')
            cap.show()
            #input()
            time.sleep(5)
            print(tesstr)
# Calling the function
imToString()
It captures an image like this:
It isn't always two digits; it can be one or three digits too.
Pytesseract returns values like: asi and oli
So, which image-to-text (OCR) algorithm should I use for this problem, and how do I use it? I need a fairly precise value: in this example it's 53, so the output should be around 50.

Assigning a filepath to a variable in Python 3

I am trying to convert a few camera-clicked images of handwritten Gujarati characters into the format of the MNIST dataset, as I intend to pass the Gujarati handwritten character images to an MNIST deep learning model. As part of that, I'm trying to assign a file path to a variable named "datadir". But when I execute the code below on Ubuntu 16.04, the terminal throws this error:
File "gujaratinn.py", line 7
datadir = /home/cryptoaniket256/Desktop/opencv-3.4.1/project/Resize
^
SyntaxError: invalid syntax
Note that the name of the file is gujaratinn.py and all the camera-clicked images are stored in the Resize folder.
import numpy as np
import matplotlib.pyplot as py
import os
import cv2
from pathlib import Path
datadir = Path("/home/cryptoaniket256/Desktop/opencv-
3.4.1/project/Resize")
fileToOpen = datadir/"practice.txt"
f = open(fileToOpen)
print(f.read())
Are you assigning datadir a path that is split across two lines in your code?
Try putting lines 7 and 8 on the same line, or change the quotes like this:
import numpy as np
import matplotlib.pyplot as py
import os
import cv2
from pathlib import Path
datadir = Path("""/home/cryptoaniket256/Desktop/opencv-3.4.1/project/Resize""")
fileToOpen = datadir/"practice.txt"
f = open(fileToOpen)
print(f.read())
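As a small illustration of the points above (a sketch, not the asker's actual script): the path has to be a quoted string, the `/` operator on a Path joins components, and a triple-quoted literal that spans two lines keeps the line break inside the string. `PurePosixPath` is used here only so the example behaves the same on any operating system; nothing is read from disk.

```python
from pathlib import PurePosixPath

# A path must be written as a quoted string; a bare /home/... is not valid Python.
datadir = PurePosixPath("/home/cryptoaniket256/Desktop/opencv-3.4.1/project/Resize")
file_to_open = datadir / "practice.txt"   # "/" joins path components
print(file_to_open)

# Triple quotes let a literal span lines, but the line break then becomes
# part of the string, so the path should still be kept on one line.
split_literal = """/home/cryptoaniket256/Desktop/opencv-
3.4.1/project/Resize"""
print("\n" in split_literal)
```

The second print shows True, which is why a path wrapped across two lines does not point at the intended folder even when the quotes are changed.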

Why is my image_path undefined when using export_graphviz? - Python 3

I'm trying to run this machine learning tree algorithm code in IPython:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
iris = load_iris()
X = iris.data[:, 2:] # petal length and width
y = iris.target
tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
from sklearn.tree import export_graphviz
export_graphviz(tree_clf, out_file=image_path("iris_tree.dot"),
                feature_names=iris.feature_names[2:],
                class_names=iris.target_names,
                rounded=True,
                filled=True
                )
But I get this error when run in IPython:
I'm unfamiliar with export_graphviz, does anyone have any idea how to correct this?
I guess you are following the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow" by Aurélien Géron. I encountered the same problem while trying out the "Decision Trees" chapter. You can always refer to his GitHub notebooks. For your code, see the "decision tree" notebook.
Below I paste the code from the notebook. Please do have a look at the notebook as well.
# To support both python 2 and python 3
from __future__ import division, print_function, unicode_literals
# Common imports
import numpy as np
import os
# to make this notebook's output stable across runs
np.random.seed(42)
# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "decision_trees"
def image_path(fig_id):
    return os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID, fig_id)
def save_fig(fig_id, tight_layout=True):
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(image_path(fig_id) + ".png", format='png', dpi=300)
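If you would rather keep `image_path` than remove it, a minimal self-contained variant is sketched below. It mirrors the notebook helper above but also creates the target directory, on the assumption that images/decision_trees may not exist yet (writing the .dot file into a missing directory would fail). The `root` parameter is added here for illustration and is not in the notebook.

```python
import os

PROJECT_ROOT_DIR = "."
CHAPTER_ID = "decision_trees"


def image_path(fig_id, root=PROJECT_ROOT_DIR):
    """Return <root>/images/<CHAPTER_ID>/<fig_id>, creating the directory if needed."""
    directory = os.path.join(root, "images", CHAPTER_ID)
    os.makedirs(directory, exist_ok=True)  # no error if it already exists
    return os.path.join(directory, fig_id)
```

With this defined before the `export_graphviz` call, `out_file=image_path("iris_tree.dot")` resolves to ./images/decision_trees/iris_tree.dot.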
To avoid all of this, simply remove image_path so that the argument becomes out_file="iris_tree.dot". After running that command, a file named iris_tree.dot will be saved in your folder. Open that file in a text editor (Microsoft Word also works) and copy all of its content. Now open your browser, search for "webgraphviz", and click the first link. Delete whatever is in the text box, paste the content copied from iris_tree.dot, and click "generate graph". Scroll down and your graph is ready.
I know you might have got what you were looking for. But in case you don't, all you need to do is just replace:
out_file=image_path("iris_tree.dot")
with:
out_file="iris_tree.dot"
This will create the .dot file in the current working directory, which is usually the directory your script is run from.
You can also give the absolute path to where you want to save the .dot file as:
out_file="/home/cipher/iris_tree.dot"
You must replace
out_file=image_path("iris_tree.dot"),
with a full path, as in the line below:
out_file="C:/Users/VIDA/Desktop/python/iris_tree.dot",
If you are using sklearn version 0.20, you can render the graph directly instead of using webgraphviz:
import graphviz
with open("iris_tree.dot") as f:
    dot_graph = f.read()
display(graphviz.Source(dot_graph))
With sklearn 0.22 you have to change again. See sklearn users guide.
I have sklearn version 0.20.1, and I got the example to work with the call below.
export_graphviz(
tree_clf,
out_file = "iris_tree.dot",
feature_names = iris.feature_names[2:])
