I tried applying tesseract ocr on image but before applying OCR I want to improve the quality of the image so that OCR efficiency increase,
How to detect image brightness and increase or decrease the brightness of the image as per requirement.
How to detect image sharpness
It's not easy way to do that, if your images are similar to each other you can define "correct" brightness and adjust it on unprocessed images.
But what is "correct" brightness? You can use histogram to match this. See figure bellow. If you establish your correct histogram you can calibrate other images to it.
Richard Szeliski, Computer Vision Algorithms and Applications
I think you can use the autofocus method. You must check contrast histogram of image and define what is sharp to you.
source
*Basic recommendation, using the gray images is better on OCR
You can use the BRISQUE image quality assessment for scoring the image quality, it's available as a library. check it out, a smaller score means good quality.
Related
So I am trying to extract text from image. And as the quality and size of image is not good, it is giving inaccurate results. I tried few enhancements and other things with PIL but that is only worsening the quality of image.
Can someone suggest some enhancement in image to get better results. Few Examples of images:
In the provided example of image the text is visually of quite good quality, so the question is how it comes that OCR gives inaccurate results?
To illustrate the conclusions given in further text of this answer let's run the the given image
through Tesseract. Below the result of Tesseract OCR:
"fhpgearedmomrs©gmachom"
Now let's resize the image four times and apply thresholding to it. I have done the resizing and thresholding manually in Gimp, but with appropriate resizing method and threshold value for PIL it can be for sure automated, so that after the enhancement you get an image similar to the enhanced image I have got:
The improved image run through Tesseract OCR gives following text:
"fhpgearedmotors©gmail.com"
This demonstrates that enlarging an image can help to achieve 100% accuracy on the provided text-image example.
It may appear weird that enlarging an image helps to achieve better OCR accuracy, BUT ... OCR was developed to convert scans of printed media to texts and expect 300 dpi images of the text by design. This explains why some OCR programs didn't resize the text by themselves to improve their results and do bad on small fonts expecting higher dpi resolution of the image which can be achieved by enlarging.
Here an excerpt from Tesseract FAQ on github.com prooving the statement above:
[There is a minimum text size for reasonable accuracy. You have to consider resolution as well as point size. Accuracy drops off below 10pt x 300dpi, rapidly below 8pt x 300dpi. A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.) At 10pt x 300dpi x-heights are typically about 20 pixels, although this can vary dramatically from font to font. Below an x-height of 10 pixels, you have very little chance of accurate results, and below about 8 pixels, most of the text will be "noise removed".]
I am microbiology student new to computer vision, so any help will be extremely appreciated.
This question involves microscope images that I am trying to analyze. The goal I am trying to accomplish is to count bacteria in an image but I need to pre-process the image first to enhance any bacteria that are not fluorescing very brightly. I have thought about using several different techniques like enhancing the contrast or sharpening the image but it isn't exactly what I need.
I want to reduce the noise(black spaces) to 0's on the RBG scale and enhance the green spaces. I originally was writing a for loop in OpenCV with threshold limits to change each pixel but I know that there is a better way.
Here is an example that I did in photo shop of the original image vs what I want.
Original Image and enhanced Image.
I need to learn to do this in a python environment so that I can automate this process. As I said I am new but I am familiar with python's OpenCV, mahotas, numpy etc. so I am not exactly attached to a particular package. I am also very new to these techniques so I am open to even if you just point me in the right direction.
Thanks!
You can have a look at histogram equalization. This would emphasize the green and reduce the black range. There is an OpenCV tutorial here. Afterwards you can experiment with different thresholding mechanisms that best yields the bacteria.
Use TensorFlow:
create your own dataset with images of bacteria and their positions stored in accompanying text files (the bigger the dataset the better).
Create a positive and negative set of images
update default TensorFlow example with your images
make sure you have a bunch of convolution layers.
train and test.
TensorFlow is perfect for such tasks and you don't need to worry about different intensity levels.
I initially tried histogram equalization but did not get the desired results. So I used adaptive threshold using the mean filter:
th = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 2)
Then I applied the median filter:
median = cv2.medianBlur(th, 5)
Finally I applied morphological closing with the ellipse kernel:
k1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
dilate = cv2.morphologyEx(median, cv2.MORPH_CLOSE, k1, 3)
THIS PAGE will help you modify this result however you want.
i retrieve contours from images by using canny algorithm. it's enough to have a descriptor image and put in SVM and find similarities? Or i need necessarily other features like elongation, perimeter, area ?
I talk about this, because inspired by this example: http://scikit-learn.org/dev/auto_examples/plot_digits_classification.html i give my image in greyscale first, in canny algorithm style second and in both cases my confusion matrix was plenty of 0 like precision, recall, f1-score, support measure
My advice is:
unless you have a low number of images in your database and/or the recognition is going to be really specific (not a random thing for example) I would highly recommend you to apply one or more features extractors such SIFT, Fourier Descriptors, Haralick's Features, Hough Transform to extract more details which could be summarised in a short vector.
Then you could apply SVM after all this in order to get more accuracy.
I'm trying to OCR certain images, but am having problems with the accuracy. I would like to see if I can improve accuracy by converting images to B&W, high contrast. Any ideas how I can do that with ImageMagick?
Fred's two colour threshold ImageMagick script might help.
I'm looking for ways to determine the quality of a photography (jpg). The first thing that came into my mind was to compare the file-size to the amount of pixel stored within. Are there any other ways, for example to check the amount of noise in a jpg? Does anyone have a good reading link on this topic or any experience? By the way, the project I'm working on is written in C# (.net 3.5) and I use the Aurigma Graphics Mill for image processing.
Thanks in advance!
I'm not entirely clear what you mean by "quality", if you mean the quality setting in the JPG compression algorithm then you may be able to extract it from the EXIF tags of the image (relies on the capture device putting them in and no-one else overwriting them) for your library see here:
http://www.aurigma.com/Support/DocViewer/30/JPEGFileFormat.htm.aspx
If you mean any other sort of "quality" then you need to come up with a better definition of quality. For example, over-exposure may be a problem in which case hunting for saturated pixels would help determine that specific sort of quality. Or more generally you could look at statistics (mean, standard deviation) of the image histogram in the 3 colour channels. The image may be out of focus, in which case you could look for a cutoff in the spatial frequencies of the image Fourier transform. If you're worried about speckle noise then you could try applying a median filter to the image and comparing back to the original image (more speckle noise would give a larger change) - I'm guessing a bit here.
If by "quality" you mean aesthetic properties of composition etc then - good luck!
The 'quality' of an image is not measurable, because it doesn't correspond to any particular value.
If u take it as number of pixels in the image of specific size its not accurate. You might talk about a photograph taken in bad light conditions as being of 'bad quality', even though it has exactly the same number of pixels as another image taken in good light conditions. This term is often used to talk about the overall effect of an image, rather than its technical specifications.
I wanted to do something similar, but wanted the "Soylent Green" option and used people to rank images by performing comparisons. See the question responses here.
I think you're asking about how to determine the quality of the compression process itself. This can be done by converting the JPEG to a BMP and comparing that BMP to the original bitmap from with the JPEG was created. You can iterate through the bitmaps pixel-by-pixel and calculate a pixel-to-pixel "distance" by summing the differences between the R, G and B values of each pair of pixels (i.e. the pixel in the original and the pixel in the JPEG) and dividing by the total number of pixels. This will give you a measure of the average difference between the original and the JPEG.
Reading the number of pixels in the image can tell you the "megapixel" size(#pixels/1000000), which can be a crude form of programatic quality check, but that wont tell you if the photo is properly focused, assuming it is supposed to be focused (think fast-motion objects, like trains), nor weather or not there is something in the pic worth looking at, that will require a human, or pigeon if you prefer.