I have an image that contains some text on a colored background. I want to extract the text with Tesseract, but first I want to replace the colored background with white and make the text itself black, to improve recognition accuracy.
I was trying to use Canny edge detection:
import cv2
import numpy as np
image=cv2.imread('tt.png')
cv2.imshow('input image',image)
cv2.waitKey(0)
gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
edged=cv2.Canny(gray,30,200)
edged = cv2.bitwise_not(edged)
cv2.imshow('canny edges',edged)
cv2.waitKey(0)
That worked fine for replacing the colored background with white, but it left the text white with black outlines (see the images below).
Is there any way to make the whole text black?
Or is there another approach I can use to achieve that?
before Canny detection
after Canny detection
Edit
The image may have mixed background colors, like this:
input image
You can do this simply with THRESH_BINARY_INV. Here is the code:
cv::namedWindow("Original_Image", cv::WINDOW_FREERATIO);
cv::namedWindow("Result", cv::WINDOW_FREERATIO);
cv::Mat originalImg = cv::imread("BCQqn.png");
cv::Mat gray;
cv::cvtColor(originalImg, gray, cv::COLOR_BGR2GRAY);
cv::threshold(gray, gray, 130, 255, cv::THRESH_BINARY_INV);
cv::imshow("Original_Image", originalImg);
cv::imshow("Result", gray);
cv::waitKey();
And this is the result:
You can play with the threshold value (130 in the above example).
Note: The code is in C++. If you are using Python, you can follow the same steps.
Good Luck!!
To set this up, I used the svgwrite library to create a sample SVG image (20 squares of side 100 at random locations on a 400x400 canvas):
import svgwrite
import random
random.seed(42)
dwg = svgwrite.Drawing('x.svg', size=(400,400))
dwg.add(dwg.rect(insert=(0,0), size=('100%', '100%'), fill='white')) # White background
for i in range(20):
    coordinates = (random.randint(0,399), random.randint(0,399))
    color = (random.randint(0,255), random.randint(0,255), random.randint(0,255))
    dwg.add(dwg.rect(coordinates, (100, 100),
                     stroke='black',
                     fill=svgwrite.rgb(*color),
                     stroke_width=1)
    )
dwg.save()
I then wrote a sample pygame program to generate a PNG image of the same sample. (A seed has been used to generate the same sequence of squares.)
import pygame
import random
random.seed(42)
display = pygame.display.set_mode((400,400))
display.fill((255,255,255)) # White background
for i in range(20):
    coordinates = (random.randint(0,399), random.randint(0,399))
    color = (random.randint(0,255), random.randint(0,255), random.randint(0,255))
    pygame.draw.rect(display, color, coordinates+(100,100), 0)
    pygame.draw.rect(display, (0,0,0), coordinates+(100,100), 1) # For black border
pygame.image.save(display, "x.png")
These are the images that I got (SVGs can't be uploaded to SO, so I have provided a screenshot; nevertheless, the programs above can be run to produce the same output).
My question is, why is the PNG (on the left) richer and sharper than the corresponding SVG image? The SVG looks blurred and bland, comparatively.
EDIT: One can notice the fine white line between the first two squares at the top-left corner. It's not very clear in the SVG.
I think two things may have an impact:
You are using an image viewer, which may distort the SVG vector image. I think most vector image viewers read the actual screen size, rasterize the vector image into a bitmap sized according to that screen, and then display the bitmap. If they render the bitmap with softened edges, or if they get your screen size wrong, the image may look blurred.
You use pygame to make the PNG image, but a different module to make the SVG image. That module may work differently and export the image at a different quality than pygame would.
For me personally, the SVG image appears blurred in GIMP, for example, but not in another SVG viewer.
So I think the problem comes from your image viewer.
My goal is to pre-process an image (extracted from a video) for OCR.
The text is always black, like in this example:
I tried frame averaging and an HSV mask:
cv2.accumulateWeighted(frame,avg2,0.005)
#res2 = cv2.convertScaleAbs(avg2)
# Convert BGR to HSV
hsv = cv2.cvtColor(imgray, cv2.COLOR_BGR2HSV)
# define range of black color in HSV
lower_val = np.array([0,0,0])
upper_val = np.array([179,255,127])
# Threshold the HSV image to get only black colors
mask = cv2.inRange(hsv, lower_val, upper_val)
# invert mask to get black symbols on white background
mask_inv = cv2.bitwise_not(mask)
cv2.imshow("Mask", mask)
But the results are not good enough.
I am looking for a possible workaround.
Thanks
For this type of image, where text instances cannot be separated easily, Tesseract won't give good results. Tesseract is a good option if you want to extract text from documents, papers, PDFs, etc., where the text instances are clear.
For your problem, I would suggest using separate text detection and text recognition models. For text detection, you can use a state-of-the-art model like the EAST text detector, which is able to locate text in difficult images. It generates bounding boxes around the text in the image, and these boxes can then be given to a text recognition model, which performs the actual recognition.
For text detection: EAST or any other recent model
For text recognition: CRNN-based models
Please try to implement the above models; I am sure they will perform much better than what you are getting from Tesseract :)
BR!
I have multiple pictures, each of which has an object with its background removed. The pictures are 500x400 pixels in size.
I am looking for a way to programmatically (preferably using python) calculate the total number of pixels of the image inside the picture (inside the space without the background).
I used the PIL package in Python to get the dimensions of the image object, as follows:
print(image.size)
This command successfully produced the dimensions of the entire picture (500x400 pixels) but not the dimensions of the object of interest inside the picture.
Does anyone know how to calculate the dimensions of an object inside a picture using python? An example of a picture is embedded below.
You could flood-fill the background pixels with some colour not present in the image, e.g. magenta, then count the magenta pixels and subtract that number from the total number of pixels in the image (width x height).
Here is an example:
#!/usr/bin/env python3
from PIL import Image, ImageDraw
import numpy as np
# Open the image and ensure RGB
im = Image.open('man.png').convert('RGB')
# Make all background pixels magenta
ImageDraw.floodfill(im,xy=(0,0),value=(255,0,255),thresh=50)
# Save for checking
im.save('floodfilled.png')
# Make into Numpy array
n = np.array(im)
# Mask of magenta background pixels
bgMask =(n[:, :, 0:3] == [255,0,255]).all(2)
count = np.count_nonzero(bgMask)
# Report results
print(f"Background pixels: {count} of {im.width*im.height} total")
Sample Output
Background pixels: 148259 of 199600 total
Not sure how important the enclosed areas between arms and body are to you... if you just replace all greys without using the flood-filling technique, you risk making, say, the shirt magenta and counting that as background.
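To get the object's pixel count directly, the subtraction described at the start can be wrapped up as in the sketch below. The helper name count_object_pixels and the synthetic test image are assumptions for illustration; it also assumes the background is a single connected region touching the seed corner and that magenta does not occur in the image.

```python
from PIL import Image, ImageDraw
import numpy as np


def count_object_pixels(im, seed=(0, 0), thresh=50, fill=(255, 0, 255)):
    """Total pixels minus flood-filled background pixels (hypothetical helper)."""
    work = im.convert('RGB')  # convert() returns a new image, so `im` is untouched
    ImageDraw.floodfill(work, xy=seed, value=fill, thresh=thresh)
    n = np.array(work)
    background = np.count_nonzero((n == fill).all(axis=2))
    return work.width * work.height - background


# Synthetic stand-in: light grey background with a 30x20 dark rectangle as the object.
arr = np.full((80, 100, 3), 200, dtype=np.uint8)
arr[20:40, 30:60] = (30, 60, 90)
im = Image.fromarray(arr)

object_pixels = count_object_pixels(im)
print(f"Object pixels: {object_pixels} of {im.width * im.height} total")
```

Enclosed background regions that the flood fill cannot reach from the seed are counted as part of the object here, which matches the caveat above.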
I am processing an image with OpenCV in Python and I want to count every object (worm) in it. The worms are light beige whereas the background is black (see picture), so they are easy to distinguish. The problem is that sometimes worms are too close to each other (sometimes they even overlap), and cv.findContours() draws one big contour instead of two smaller ones (see picture below).
Because I am using cv.findContours(), I first have to convert the picture to grayscale, then blur it (optional), and finally threshold it in order to get white worms on a black background.
I am using the following code:
import cv2 as cv
img = cv.imread('worms.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
blur=cv.GaussianBlur(gray,(5,5),1)
ret,osu = cv.threshold(blur,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
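# [-2:] keeps (contours, hierarchy) on both OpenCV 3 (three return values) and OpenCV 4 (two)
contours, hierarchy = cv.findContours(osu, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)[-2:]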
cv.drawContours(img,contours,-1, (0,0,255),2)
I tried to erode the thresholded picture but it doesn't work well since the "bond" between two worms is quite big.
Thanks for the help
I have been working with PyTesseract OCR, converting a PDF to JPEG in order to OCR the image. Part of the image has a black background and white text, which Tesseract is unable to read, whereas all other parts of my image are read perfectly well. Is there a way to change only the part of the image that has a black background? I tried a few SO resources, but they don't seem to help.
I am using Python 3, OpenCV version 4 and PyTesseract.
OpenCV has a bitwise_not function which correctly inverts the image.
You can mask off the rest of the image (the part that is already correct) and use something like this:
imageWithMask = cv2.bitwise_not(imageWithMask)
Alternatively, you can perform the operation on a copy of the image and copy over only the parts / pixels / regions you need.