How to remove non-text or back-print noise from an image? - python-3.x

I'm trying to detect text in an image using pytesseract and OpenCV. After applying an adaptive Gaussian threshold I got the following output:
Adaptive
I also tried Otsu's method and got the following result:
OTSU
My guess is that I should use the adaptive image and remove the noise, but I don't know how to remove it. Or should I use the Otsu one?

Related

OpenCV - Image Text enhancement - OCR pre-processing

My goal is to pre-process an image (extracted from a video) for OCR detection.
The text is always black, like this example:
I tried to use frame averaging and an HSV mask:
import cv2
import numpy as np

# Running average over frames (avg2 must be a float image,
# e.g. initialised with np.float32(frame))
cv2.accumulateWeighted(frame, avg2, 0.005)
# res2 = cv2.convertScaleAbs(avg2)

# Convert BGR to HSV (use the colour frame, not a grayscale image)
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Define the range of black in HSV
lower_val = np.array([0, 0, 0])
upper_val = np.array([179, 255, 127])

# Threshold the HSV image to keep only black colours
mask = cv2.inRange(hsv, lower_val, upper_val)

# Invert the mask to get black symbols on a white background
mask_inv = cv2.bitwise_not(mask)
cv2.imshow("Mask", mask)
But the results are not good enough.
I'm looking for a possible workaround.
Thanks
For these types of images, where text instances cannot be separated easily, Tesseract won't produce good results. Tesseract is a good option for extracting text from documents/papers/PDFs, etc., where the text instances are clear.
For your problem, I would suggest using separate text detection and text recognition models. For text detection, you can use a state-of-the-art model like the EAST text detector, which is able to locate text in difficult images. It generates bounding boxes around text in the image, and those box regions can then be given to a text recognition model, which performs the actual recognition task.
For text detection: EAST or any other recent model
For text recognition: CRNN-based models
Please try to implement the above models; I am sure they will perform far better than what you are getting from Tesseract :)
BR!
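To illustrate the hand-off between the two stages, here is a minimal sketch of the cropping step, assuming axis-aligned boxes in (x, y, width, height) form as a hypothetical detector output (EAST actually returns rotated boxes with confidence scores, so real post-processing is more involved):

```python
import numpy as np

# Hypothetical output of a text detector such as EAST:
# axis-aligned boxes as (x, y, width, height)
boxes = [(10, 5, 40, 12), (10, 25, 60, 12)]

# Dummy grayscale frame standing in for the input image
frame = np.zeros((50, 100), dtype=np.uint8)

def crop_boxes(image, boxes, pad=2):
    """Crop each detected box (with a small margin) for the recognizer."""
    h, w = image.shape[:2]
    crops = []
    for (x, y, bw, bh) in boxes:
        x0, y0 = max(x - pad, 0), max(y - pad, 0)
        x1, y1 = min(x + bw + pad, w), min(y + bh + pad, h)
        crops.append(image[y0:y1, x0:x1])
    return crops

crops = crop_boxes(frame, boxes)
# Each crop would then be passed to a CRNN-style recognition model.
```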

removing shadow from colour image (3-channel (jpg) or 4-channel (png))

I am trying to isolate shadows from this image and remove them:
The reason I am doing that is that the shadow is problematic for my edge detection algorithm.
What should I do to remove the shadow? I haven't done this before, so I do not even know where to start from.
From the similar questions on SO I wasn't able to find anything to help me with my task.
I have the image in both png and jpg formats, so I am not even sure which one to start with.
That's a very interesting question. One option you can try is to divide the RGB values in the image by the grayscale intensity of the image. There is apparently another method explained here: https://onlinelibrary.wiley.com/doi/full/10.1002/col.21889.

How to change a part of the color of the background, which is black, to white?

I have been working with PyTesseract OCR, converting PDFs to JPEG in order to OCR the images. A part of the image has a black background and white text, which Tesseract is unable to identify, whereas all other parts of my image are read perfectly well. Is there a way to change the part of the image that has a black background? I tried a few SO resources, but they don't seem to help.
I am using Python 3, OpenCV 4, and PyTesseract.
OpenCV has a bitwise_not function which correctly inverts the image.
You can put a mask on the rest of the image (the part that is already correct) and use something like this:
imageWithMask = cv2.bitwise_not(imageWithMask)
Alternatively, you can perform the operation on a copy of the image and only copy over the parts / pixels / regions you need.
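For the region-based variant, plain NumPy slicing is often enough. A minimal sketch, assuming the coordinates of the black-background strip are known (the image and coordinates here are made up for illustration):

```python
import numpy as np

# Dummy grayscale image: white page (255) with a black-background strip
img = np.full((40, 40), 255, dtype=np.uint8)
img[10:20, :] = 0          # the black-background region
img[12:18, 5:35] = 255     # "white text" inside it

# Invert only the problematic region; the rest of the image is untouched
x0, y0, x1, y1 = 0, 10, 40, 20   # region coordinates (assumed known)
img[y0:y1, x0:x1] = 255 - img[y0:y1, x0:x1]
# The strip is now white with black text, like the rest of the page
```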

How do I get the color of the text?

I've been using the Microsoft OCR API and I'm getting the text from the images, but I would like to know whether the text is in a specific color or has a specific background color.
For example, I have the following image and I would like to know if there is text in red:
image
I thought that this line:
string requestParameters = "language=unk&detectOrientation=true";
would help me establish the parameters I'd like to receive from the image, e.g. the color of a line of words. So I added a visual feature like this:
string requestParameters = "visualFeatures=Color,language=unk&detectOrientation=true";
But this did not solve the problem.
Also: Can I mix the uriBase link from the image analysis and the one from the OCR?
There is currently no way to retrieve the color information and OCR results in a single call.
You could try using the bounding boxes returned from OCR to crop the original image, and then send the crop to the analyze endpoint with visualFeatures=Color to get the color information for the detected text.
According to the documentation, the possible request parameters of this API are:
language, detectOrientation
and the returned metadata has these entities:
orientation, language, regions, lines, words, boundingBox, text
It is possible to combine the OCR algorithm with one of the other computer vision algorithms to detect the dominant colors in the text regions that the OCR identified.
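As a rough local alternative to a second API call, you could test a cropped word region for red directly. A minimal heuristic sketch; the crop contents and the margin threshold are assumptions, and a real image would need the OCR bounding box applied first:

```python
import numpy as np

# Hypothetical crop of one OCR word region, in BGR channel order
crop = np.zeros((10, 30, 3), dtype=np.uint8)
crop[..., 2] = 200  # strong red channel

def looks_red(region_bgr, margin=50):
    """Rough heuristic: mean red clearly dominates mean blue and green."""
    b, g, r = region_bgr.reshape(-1, 3).mean(axis=0)
    return r > g + margin and r > b + margin
```

A per-pixel dominant-color clustering (e.g. k-means on the crop) would be more robust than channel means, at the cost of more code.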

How to apply a change of contrast and brightness to an image

I want to modify the contrast and brightness of an image. I found an example for this in OpenCV and Python, but it is not supported by my version of OpenCV. I want to get this result
using OpenCV 3 and Python 3. Can anyone help me?
