How to crop images using Pillow and pytesseract?

I was trying to use pytesseract to find the box positions of each letter in an image. I tried it on one image, cropped the boxes with Pillow, and it worked. But when I tried an image with a smaller character size (example), the program still recognizes the characters, yet cropping the image with the box coordinates gives me images like this. I also tried doubling the size of the original image, but it changed nothing.
from PIL import Image
import pytesseract

img = Image.open('imgtest.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:  # just trying with the first character
    corde.append(int(i))
im = img.crop(tuple(corde))
im.save('cimg.png')

If we look at the source code of image_to_boxes, we see that the returned coordinates come in the following order:
left bottom right top
From the documentation on Image.crop, the expected order of coordinates is:
left upper right lower
Moreover, Tesseract's box coordinates have their origin at the bottom-left of the image, while Pillow's origin is at the top-left. Therefore, we also need to convert the top/upper and bottom/lower coordinates with respect to the image height.
That'd be the reworked code:
from PIL import Image
import pytesseract

img = Image.open('MJwQi9f.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:  # just trying with the first character
    corde.append(int(i))

# Flip bottom/top to Pillow's top-left origin using the image height
corde = (corde[0], img.size[1] - corde[3], corde[2], img.size[1] - corde[1])
im = img.crop(corde)
im.save('cimg.png')
You see, left and right stay in place, but top/upper and bottom/lower have switched places and were also altered with respect to the image height.
And that's the updated output:
The result isn't optimal, but I assume that's due to the font.
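For completeness, here is a small sketch (file name assumed, untested against your exact images) that applies the same coordinate flip to every box that image_to_boxes returns, saving one crop per character:
from PIL import Image
import pytesseract

img = Image.open('imgtest.png')
for n, line in enumerate(pytesseract.image_to_boxes(img).splitlines()):
    left, bottom, right, top = map(int, line.split()[1:5])
    # Flip Tesseract's bottom-left origin to Pillow's top-left origin
    box = (left, img.size[1] - top, right, img.size[1] - bottom)
    img.crop(box).save(f'char_{n}.png')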
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
Pillow: 8.1.0
pytesseract: 4.00.00alpha
----------------------------------------

Related

Compare images to find whether they are visibly the same

I need to find whether two images are the same, even if they have different resolution and size. There is no need for a pixel-to-pixel comparison; I need to find that all the images, texts, colors etc. are the same in the other image as well.
I tried different Python packages to compare them, but all require the same resolution. One of my images is screenshotted on a Mac and the other on Ubuntu. Even though both render the same HTML, the contrast and resolution differences between the machines cause the images to differ when compared.
So far I have tried perceptual diff, image hashing and more:
• PDIFFER – Python wrapper for the perceptualdiff tool (https://pypi.org/project/pdiffer/)
Problem – pip install pdiffer failed to install on the Macs, for the latest as well as older versions.
• NEEDLE – Installed needle (https://needle.readthedocs.io/en/latest/). This one has an option to specify the comparison engine to be perceptualdiff/ImageMagick instead of the default PIL.
Problem found – It has an option to save a baseline image first and then run assertions against it. It works when a baseline is saved and then compared, but I didn't find anything that would compare screenshots against existing images.
• OPENCV – Histogram-based comparison. This converts images to grayscale, builds histograms, and compares the histograms. With the correlation method it returns a value between -1 and 1 (-1 means not similar at all, 1 means highly similar) (https://www.pyimagesearch.com/2014/07/14/3-ways-compare-histograms-using-opencv-python/); a sketch of this approach follows below.
Findings – I tested two images by converting them into histograms and comparing them, which returned a value of 0.8 (somewhat similar).
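For context, here is a minimal sketch of that histogram approach (file names are placeholders; HISTCMP_CORREL yields the -1 to 1 range described above):
import cv2

def histogram_similarity(path_one, path_two):
    # Load both images directly in grayscale
    img1 = cv2.imread(path_one, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path_two, cv2.IMREAD_GRAYSCALE)
    # Build 256-bin intensity histograms
    hist1 = cv2.calcHist([img1], [0], None, [256], [0, 256])
    hist2 = cv2.calcHist([img2], [0], None, [256], [0, 256])
    # Normalize so differing image sizes do not skew the score
    cv2.normalize(hist1, hist1)
    cv2.normalize(hist2, hist2)
    # Correlation: 1 = identical histograms, -1 = inverted
    return cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL)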
Below is the code I tried using imagehash:
from PIL import Image
import imagehash

image_one = 'result.png'
img = Image.open(image_one)
image_one_hash = imagehash.whash(img)
print(image_one_hash)

image_two = 'not-found-02.png'
img2 = Image.open(image_two)
image_two_hash = imagehash.whash(img2)
print(image_two_hash)

similarity = image_one_hash - image_two_hash
print(similarity)
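For what it's worth, the subtraction on the last line is imagehash's Hamming distance between the two hashes: 0 means the hashes match exactly, and larger values mean more differing bits, i.e. less similar images.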

PIL's Image.rotate() crops image

I am trying out the Python Imaging Library, but Image.rotate() seems to crop the image:
from PIL import Image
fn='screen.png'
im = Image.open(fn)
im = im.rotate(-90)
im = im.rotate(90)
im.show()
im.save('cropped.png')
The original image screen.png is:
The cropped image cropped.png is:
The cropping is caused by the 1st rotation and retained in the 2nd rotation.
I am still new to Python, despite trying to become more familiar with it over the years. Is this me doing something wrong with the PIL or is it a bug?
My Python version is:
Python 3.8.8 (default, Mar 4 2021, 21:24:42)
[GCC 10.2.0] on cygwin
Thanks to pippo1980, the solution is to specify expand=1 when invoking rotate():
from PIL import Image

fn = 'screen.png'
im = Image.open(fn)
im = im.rotate(-90, expand=1)
im = im.rotate(90, expand=1)
im.show()
im.save('cropped.png')
It adjusts the "outer image" width and height so that an arbitrarily rotated image fits without cropping. Despite the name, it also shrinks the width/height where appropriate.
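A quick way to see both effects, using a synthetic image (sizes chosen arbitrarily for illustration):
from PIL import Image

im = Image.new('RGB', (400, 200))
print(im.rotate(90).size)            # (400, 200): canvas unchanged, sides cropped
print(im.rotate(90, expand=1).size)  # (200, 400): canvas resized to fit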

Both rasterio open and skimage.io.read return a NaN array for the TIFF I am trying to open

I'm trying to open a SAR image from Sentinel-1. I can view the TIFF file in QGIS, so I know the data is there, but when I open and view/show it in Python, every module I could use to open the data produces a NaN area, basically suggesting that there is no data in the image. Visualizing the image produces a completely black image, although the shape is correct.
Here is the code where I read in the image:
import skimage.io
import rasterio as rio

img = skimage.io.imread('NewData.tif', as_gray=True, plugin='tifffile')
with rio.open(r'NewData.tif') as src:
    img2 = src.read()
    imgMeta = src.profile
print(img)
skimage.io.imshow(img)
Any help would be appreciated, thank you.
The problem is not in the way rasterio or skimage imports the image, but in the way it is displayed. I am assuming you are working with calibrated SAR images that are NOT converted to the decibel (dB) scale. Here is the problem: the dynamic range of your data.
The issue is that, by default, the color ramp is not stretched according to the distribution of values in the raster histogram. In QGIS, SNAP and many other EO-related software packages, the color distribution is matched to the histogram to produce proper visualizations.
Solution: either make that happen in your code, or simply convert your backscatter values to decibel (a very common procedure when working with SAR data, which produces an almost normal distribution of the data). The conversion can be done in EO software, or directly on your imported image with:
srcdB = 10 * np.log10(img2)  # applied to the array you read, not the dataset handle
Once done, you can properly display your image:
import rasterio as rio
from rasterio.plot import show
import numpy as np

with rio.open(r'/.../S1B_IW_GRDH_1SDV_20190319T161451_20190319T161520_015425_01CE3C_A401_Cal.tif') as src:
    img2 = src.read()
    imgMeta = src.profile

srcdB = 10 * np.log10(img2)  # convert backscatter to decibel
show(srcdB, cmap='gray')     # display using rasterio
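One caveat worth adding: np.log10 returns -inf for zero-valued (no-data) pixels, which can again throw off the display scaling. A minimal guard, assuming zeros mark no-data:
srcdB = 10 * np.log10(np.clip(img2, 1e-10, None))  # floor zeros before the log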

Align the Images properly

Hi, I am trying to extract only the handwritten data from an image. To do that I took an empty form and a filled one, and I am using ImageChops.difference to get the handwriting out of it.
The problem right now is the alignment of the images: the two are not perfectly aligned, so the results are not correct.
from PIL import Image, ImageChops

def compare_images(path_one, path_two, diff_save_location):
    """
    Compares two images and saves a diff image if there
    is a difference.

    :param path_one: The path to the first image
    :param path_two: The path to the second image
    """
    image_one = Image.open(path_one).convert('LA')
    image_two = Image.open(path_two).convert('LA')
    diff = ImageChops.difference(image_one, image_two)
    if diff.getbbox():
        diff.convert('RGB').save(diff_save_location)

if __name__ == '__main__':
    compare_images('images/blank.jpg',
                   'images/filled.jpg',
                   'images/diff.jpg')
This is the result I got:
The result I am looking for:
Can anyone help me with this?
Thanks.
This site may be helpful: https://www.learnopencv.com/image-alignment-feature-based-using-opencv-c-python/. The main idea is to first detect keypoints in both images using SIFT, SURF or another algorithm; then match the keypoints from the empty image against the keypoints from the handwritten image to get a homography matrix; then use this matrix to align the two images.
After image alignment, post-processing may be needed due to illumination or noise.
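For reference, a rough sketch of that pipeline using ORB (a patent-free alternative to SIFT/SURF); the file paths and parameters here are hypothetical:
import cv2
import numpy as np

def align_images(filled_path, blank_path, max_features=500, keep_fraction=0.2):
    filled = cv2.imread(filled_path)
    blank = cv2.imread(blank_path)
    gray_filled = cv2.cvtColor(filled, cv2.COLOR_BGR2GRAY)
    gray_blank = cv2.cvtColor(blank, cv2.COLOR_BGR2GRAY)

    # Detect keypoints and descriptors in both images
    orb = cv2.ORB_create(max_features)
    kp1, des1 = orb.detectAndCompute(gray_filled, None)
    kp2, des2 = orb.detectAndCompute(gray_blank, None)

    # Match descriptors and keep only the strongest matches
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    matches = matches[:int(len(matches) * keep_fraction)]

    # Estimate a homography from the matched point pairs
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    h, _ = cv2.findHomography(pts1, pts2, cv2.RANSAC)

    # Warp the filled form onto the blank form's coordinate frame
    height, width = blank.shape[:2]
    return cv2.warpPerspective(filled, h, (width, height))
After alignment, the warped image can be passed to ImageChops.difference as before.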

Does PIL work with skimage?

Why is it that when I do this:
from skimage import feature, io
from PIL import Image
edges = feature.canny(blimage)
io.imshow(edges)
io.show()
I get exactly what I want, namely the full edges-only image. But when I do this:
edges = feature.canny(blimage)
edges = Image.fromarray(edges)
edges.show()
I get a whole mess of random dots, lines and other things that resemble a Jackson Pollock painting more than the image. What's wrong, and how do I fix it so that I can get what I want through both methods?
For the full code, visit my GitHub here:
https://github.com/Speedyflames/Image-Functions/blob/master/Image_Processing.py
Let's see what kind of image is produced by skimage.feature.canny:
edges = feature.canny(blimage)
>>> print(edges.dtype)
bool
It's a boolean image: True for white, False for black. PIL correctly identifies the datatype and tries to map it to its own 1-bit mode, which is "1" (see the PIL docs about modes).
This looks broken though, it doesn't seem to properly get byte width or something like that.
There was an issue about it, they seem to have fixed the PIL to NumPy conversion but apparently the converse is still broken.
Anyway, long story short, your best bet to successfully convert a binary image from NumPy to PIL is to convert it to grayscale first (assuming import numpy as np):
edges_pil = Image.fromarray((edges * 255).astype(np.uint8))
>>> print(edges_pil.mode)
L
If you actually need a 1-bit image, you can convert it afterwards with:
edges_pil = edges_pil.convert('1')
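Putting it all together, a minimal end-to-end sketch, with skimage's bundled camera sample standing in for blimage:
import numpy as np
from PIL import Image
from skimage import data, feature

edges = feature.canny(data.camera())                         # boolean array
edges_pil = Image.fromarray((edges * 255).astype(np.uint8))  # mode "L"
edges_pil.show()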
