I'm trying to OCR certain images, but am having problems with the accuracy. I would like to see if I can improve accuracy by converting images to B&W, high contrast. Any ideas how I can do that with ImageMagick?
Fred's two colour threshold ImageMagick script might help.
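If ImageMagick scripting isn't a hard requirement, the same two-step idea (convert to grayscale, then apply a hard threshold) can be sketched with PIL. The function name and the cutoff of 128 are my assumptions, not Fred's script; you would tune the cutoff per scan:

```python
from PIL import Image

def to_black_and_white(img, cutoff=128):
    # Grayscale first, then force every pixel to pure black or white.
    # cutoff=128 is an arbitrary starting point; tune it per document.
    gray = img.convert("L")
    return gray.point(lambda p: 255 if p >= cutoff else 0)
```

Usage (hypothetical filenames): to_black_and_white(Image.open("scan.png")).save("scan_bw.png")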
Tried converting the image to grayscale, but that didn't help much:
ImageGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Good question. Try:
resizing the images to a smaller size (reducing the dataset resolution)
using a GPU
I tried applying Tesseract OCR to an image, but before applying OCR I want to improve the quality of the image so that OCR accuracy increases.
How can I detect image brightness and increase or decrease it as required?
How can I detect image sharpness?
There is no easy way to do that, but if your images are similar to each other, you can define a "correct" brightness and adjust unprocessed images to match it.
But what is "correct" brightness? You can use histogram matching for this: establish a reference histogram, then calibrate other images to it.
(Figure: histogram matching, from Richard Szeliski, Computer Vision: Algorithms and Applications.)
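As a sketch of that calibration idea, here is a minimal grayscale histogram-matching routine in NumPy. The function name and the use of a single reference image are my assumptions, not from the answer above:

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source pixel values so their distribution follows reference's."""
    s_values, bin_idx, s_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    r_values, r_counts = np.unique(reference.ravel(), return_counts=True)
    # Cumulative distribution functions of both images.
    s_cdf = np.cumsum(s_counts) / source.size
    r_cdf = np.cumsum(r_counts) / reference.size
    # For each source value, take the reference value at the same CDF level.
    matched = np.interp(s_cdf, r_cdf, r_values)
    return matched[bin_idx].reshape(source.shape)
```

A dark image matched against a bright reference will come out with roughly the reference's brightness distribution.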
I think you can use the autofocus method: check the contrast histogram of the image and define what counts as sharp for your purposes.
Basic recommendation: using grayscale images generally works better for OCR.
You can use the BRISQUE image quality assessment for scoring image quality; it is available as a library. Check it out — a smaller score means better quality.
I crawled images from the Google Image Search window, but the images are too small, so I want to increase their size.
I increased the size using PIL, but the picture looks broken (the image quality is too low).
How can I increase the image size with good quality?
I used PIL this way:
from PIL import Image
im = Image.open('filename')
im_new = im.resize((500, 500))
im_new.save('filename2')
No, I think you may have a wrong understanding of the real problem.
The images you got are just thumbnails, so they contain little information. Efforts to improve the image quality
algorithmically will struggle to make a difference; at best, some machine-learning upscaling tricks can make the photos look a little nicer.
In my opinion, what you need to do is fetch the original images behind the Google search results rather than use the thumbnails. You can do this by analyzing the image search results further. Good luck :)
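If you do stay with the thumbnails, a slightly better PIL call passes an explicit high-quality resampling filter and keeps the aspect ratio. This is only a sketch (the function name and 4x factor are my choices); it reduces blockiness but, as noted above, cannot recreate detail the thumbnail never had:

```python
from PIL import Image

def upscale(img, scale=4):
    # Enlarge with Lanczos resampling, preserving the aspect ratio,
    # instead of forcing a fixed (500, 500) size with the default filter.
    w, h = img.size
    return img.resize((w * scale, h * scale), Image.LANCZOS)
```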
So I am trying to extract text from an image. As the quality and size of the image are not good, it gives inaccurate results. I tried a few enhancements and other things with PIL, but that only worsened the image quality.
Can someone suggest some image enhancement to get better results? A few examples of images:
In the provided example image the text is visually of quite good quality, so the question is how OCR comes to give inaccurate results.
To illustrate the conclusions given further in this answer, let's run the given image
through Tesseract. Below is the result of Tesseract OCR:
"fhpgearedmomrs©gmachom"
Now let's resize the image four times and apply thresholding to it. I did the resizing and thresholding manually in Gimp, but with an appropriate resizing method and threshold value it can certainly be automated with PIL, so that the enhancement produces an image similar to the enhanced one I got:
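That automation could look like the following PIL sketch. Otsu's method here is my substitution for the manual threshold picking in Gimp (the answer doesn't prescribe a particular method), and the 4x factor mirrors the resize described above:

```python
from PIL import Image

def otsu_threshold(gray):
    """Pick a binarization cutoff from the histogram (Otsu's method)."""
    hist = gray.histogram()          # 256 bins for an "L"-mode image
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best, cutoff = 0.0, 128
    w_b = sum_b = 0
    for i in range(256):
        w_b += hist[i]               # background weight
        if w_b == 0:
            continue
        w_f = total - w_b            # foreground weight
        if w_f == 0:
            break
        sum_b += i * hist[i]
        m_b = sum_b / w_b
        m_f = (total_sum - sum_b) / w_f
        between = w_b * w_f * (m_b - m_f) ** 2
        if between > best:           # maximize between-class variance
            best, cutoff = between, i
    return cutoff

def preprocess_for_ocr(img, scale=4):
    # Enlarge 4x, grayscale, then binarize with an automatic threshold.
    w, h = img.size
    gray = img.resize((w * scale, h * scale), Image.LANCZOS).convert("L")
    cutoff = otsu_threshold(gray)
    return gray.point(lambda p: 255 if p >= cutoff else 0)
```

The result can then be handed to Tesseract instead of the raw image.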
The improved image run through Tesseract OCR gives the following text:
"fhpgearedmotors©gmail.com"
This demonstrates that enlarging an image can help achieve 100% accuracy on the provided text-image example.
It may seem odd that enlarging an image helps achieve better OCR accuracy, BUT ... OCR was developed to convert scans of printed media to text and by design expects roughly 300 dpi images. This explains why some OCR programs don't resize the text themselves to improve their results and do badly on small fonts: they expect a higher dpi resolution of the image, which can be achieved by enlarging.
Here is an excerpt from the Tesseract FAQ on github.com proving the statement above:
[There is a minimum text size for reasonable accuracy. You have to consider resolution as well as point size. Accuracy drops off below 10pt x 300dpi, rapidly below 8pt x 300dpi. A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.) At 10pt x 300dpi x-heights are typically about 20 pixels, although this can vary dramatically from font to font. Below an x-height of 10 pixels, you have very little chance of accurate results, and below about 8 pixels, most of the text will be "noise removed".]
I am using ImageMagick to programmatically reduce the size of a PNG image by reducing the colors in the image. I get the image's unique colors and divide that by 2. Then I assign this value to the -colors option as follows:
variable = unique_colors / 2
convert image.png -colors $variable -depth 8 output.png
I thought this would substantially reduce the size of the image, but instead it increases the image's size on disk. Can anyone shed any light on this?
Thanks.
EDIT: Turns out the problem was dithering. Dithering helps reduced-color images look more like the originals, but it adds to the image size. To disable dithering in ImageMagick, add +dither to your command.
Example
convert CandyBar.png +dither -colors 300 -depth 8 smallerCandyBar.png
ImageMagick probably uses a dithering algorithm to make the image appear as though it has the original number of colors. This increases the "randomness" of the image data (single pixels are recolored in places to blend into other colors), and that data no longer packs as well. Research further into how the convert command does the dithering. You can also see this effect by adding the second image as a layer in GIMP (or an equivalent program) and tuning transparency.
You should use pngquant for this.
You don't need to guess the number of colors; it has an actual --quality setting:
pngquant --verbose --quality=70 image.png
The above automatically chooses the number of colors needed to match the given quality, on the same scale as JPEG quality (100 = perfect, 70 = OK, 20 = awful).
pngquant has a substantially better quantization algorithm, and the better the quantization, the better the quality/filesize ratio.
And pngquant doesn't dither areas that look good without dithering, which avoids adding unnecessary noise/randomness to the file.
The "new" PNG's compression is simply not as good as that of the original file.