I am trying to fetch label 0026 from the attached image:
Initial input image
I tried below code initially to fetch the text:
import cv2
import numpy as np
import pytesseract
BGR = cv2.imread('C:/Users/Choudharyp/CV/11952/01_A_parcel_layer_single_parcel.png')
RGB = cv2.cvtColor(BGR, cv2.COLOR_BGR2RGB)
lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")
mask = cv2.inRange(RGB, lower, upper)
img = cv2.bitwise_and(RGB, RGB, mask=mask)
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray
emp[emp==0] = 255
emp[emp<100] = 0
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0
text = pytesseract.image_to_string(gauss, config='outputbase digits')
print("Text=>",text)
This did not work, possibly because the green lines need to be removed from the image first.
So I wrote the code below to remove the green lines and keep only the black pixels in the image (this works fine):
# Imports
import cv2
import numpy as np
import pytesseract
# Read image
imagePath = "C:/Users/Choudharyp/CV/11952/"  # insert your own location
inputImage = cv2.imread(imagePath + "01_A_parcel_layer_single_parcel.png")
# Conversion to CMYK (just the K channel):
# Convert to float and divide by 255:
imgFloat = inputImage.astype(np.float64) / 255.
# Calculate channel K:
kChannel = 1 - np.max(imgFloat, axis=2)
# Convert back to uint 8:
kChannel = (255 * kChannel).astype(np.uint8)
# Threshold image:
binaryThresh = 190
_, binaryImage = cv2.threshold(kChannel, binaryThresh, 255, cv2.THRESH_BINARY)
cv2.imshow('Black_LettersOnly', binaryImage)
cv2.waitKey(0)
The output looks like this:
Image with only black label
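To see why the K channel isolates the black label, here is a minimal sketch on a three-pixel synthetic image (the pixel values are assumptions chosen for illustration, not taken from the actual parcel image). It repeats the K-channel computation and the threshold of 190 with NumPy only:

```python
import numpy as np

# Tiny synthetic BGR image: black label pixel, green line pixel, white background pixel
bgr = np.array([[[0, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)

# K channel as in the code above: 1 - max(B, G, R) on values scaled to [0, 1]
img_float = bgr.astype(np.float64) / 255.0
k_channel = (255 * (1 - np.max(img_float, axis=2))).astype(np.uint8)

# Same threshold as above: only pixels with K > 190 survive
binary = np.where(k_channel > 190, 255, 0).astype(np.uint8)

print(k_channel.tolist())  # [[255, 0, 0]]
print(binary.tolist())     # [[255, 0, 0]]
```

Only the black pixel ends up white in the binary image; the saturated green line and the white background both have a maximum channel of 255, so their K value is 0.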
However, the label 0026 is too small in the image.
Then I ran the same OCR code as above on this image, but it still doesn't work. Can someone suggest what else I could try to extract the labels from the image?
In binaryImage you get white text on black background. However, Tesseract is optimized to detect dark text on bright background and apparently it works perfectly fine in this case, when you simply invert your image:
...
text = pytesseract.image_to_string(255-binaryImage, config='outputbase digits')
print("Text=>", text.strip())
>>> Text=> 0026
However, if you want it even simpler, I'd prefer HSV color space and threshold the value channel with a one-liner to maintain just the non-colored/black pixels:
...
inputImageHSV = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
binaryImage = (inputImageHSV[..., 2] > 120).astype(np.uint8) * 255
text = pytesseract.image_to_string(binaryImage, config='outputbase digits')
print("Text=>", text.strip())
>>> Text=> 0026
Compositing a PNG into an MP4 video creates a black border around the edges.
This is using moviepy 1.0.0.
The code below reproduces the MP4 with the attached red-text PNG.
import numpy as np
import moviepy.editor as mped
def composite_txtpng_on_colour():
    bg_color = mped.ColorClip(size=[400, 300], color=np.array([0, 255, 0]).astype(np.uint8),
                              duration=2).set_position((0, 0))
    text_png_postition = [5, 5]
    text_png = mped.ImageClip("./txtpng.png", duration=3).set_position((text_png_postition))
    canvas_size = bg_color.size
    stacked_clips = mped.CompositeVideoClip([bg_color, text_png], size=canvas_size).set_duration(2)
    stacked_clips.write_videofile('text_with_black_border_video.mp4', fps=24)

composite_txtpng_on_colour()
The result is an MP4 that can be played in VLC player. A screenshot of the black edge can be seen below:
Any suggestions to remove the black borders would be much appreciated.
Update: It looks like moviepy does a blit instead of alpha compositing.
def blit(im1, im2, pos=None, mask=None, ismask=False):
    """ Blit an image over another. Blits ``im1`` on ``im2`` as position ``pos=(x,y)``, using the
    ``mask`` if provided. If ``im1`` and ``im2`` are mask pictures
    (2D float arrays) then ``ismask`` must be ``True``.
    """
    if pos is None:
        pos = [0, 0]

    # xp1,yp1,xp2,yp2 = blit area on im2
    # x1,y1,x2,y2 = area of im1 to blit on im2
    xp, yp = pos
    x1 = max(0, -xp)
    y1 = max(0, -yp)
    h1, w1 = im1.shape[:2]
    h2, w2 = im2.shape[:2]
    xp2 = min(w2, xp + w1)
    yp2 = min(h2, yp + h1)
    x2 = min(w1, w2 - xp)
    y2 = min(h1, h2 - yp)
    xp1 = max(0, xp)
    yp1 = max(0, yp)

    if (xp1 >= xp2) or (yp1 >= yp2):
        return im2

    blitted = im1[y1:y2, x1:x2]

    new_im2 = +im2

    if mask is None:
        new_im2[yp1:yp2, xp1:xp2] = blitted
    else:
        mask = mask[y1:y2, x1:x2]
        if len(im1.shape) == 3:
            mask = np.dstack(3 * [mask])
        blit_region = new_im2[yp1:yp2, xp1:xp2]
        new_im2[yp1:yp2, xp1:xp2] = (1.0 * mask * blitted + (1.0 - mask) * blit_region)

    return new_im2.astype('uint8') if (not ismask) else new_im2
and so, Rotem is right.
new_im2[yp1:yp2, xp1:xp2] = (1.0 * mask * blitted + (1.0 - mask) * blit_region)
is
(alpha * img_rgb + (1.0 - alpha) * bg)
and this is how moviepy composites. And this is why we see black at the edges.
The main issue is the YUV420 color subsampling, but it is also a result of compression artifacts and an imperfect red-text image.
The image imperfection is only in the alpha channel.
There are semi-transparent pixels around the text, with alpha (transparency) values that are neither 255 nor 0.
The following code sample corrects it and shows the difference (using OpenCV):
orig_img = cv2.imread('txtpng.png', cv2.IMREAD_UNCHANGED)
img = orig_img.copy()
img[(img != 255) & (img != 0)] = 255 # Keep only two values: 0 and 255
cv2.imwrite('txtpng2.png', img) # Write img to txtpng2.png
cv2.imshow('alpha diff', cv2.absdiff(orig_img[:,:,3], img[:,:,3])*100) # Show the difference in alpha channels
cv2.waitKey()
cv2.destroyAllWindows()
The above code keeps only two values: 0 and 255 in img.
Result ('alpha diff'):
As you can see there are differences.
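Note that the one-liner above rewrites every value that is neither 0 nor 255 in all four channels. As a hedged variant, the sketch below binarizes only the alpha channel and leaves the color channels untouched. The 1x4 BGRA strip and the cutoff of 128 are assumptions made for illustration; NumPy alone is enough to show the idea:

```python
import numpy as np

# Hypothetical 1x4 BGRA strip: opaque, two semi-transparent edge pixels, transparent
bgra = np.array([[[0, 0, 255, 255],
                  [0, 0, 255, 200],
                  [0, 0, 255, 60],
                  [0, 0, 0, 0]]], dtype=np.uint8)

ALPHA_CUTOFF = 128  # assumed cutoff; tune it for the actual image

fixed = bgra.copy()
# Binarize only channel 3 (alpha); the B, G, R channels stay as they were
fixed[..., 3] = np.where(bgra[..., 3] >= ALPHA_CUTOFF, 255, 0).astype(np.uint8)

print(fixed[..., 3].tolist())  # [[255, 255, 0, 0]]
```

With a real file, `bgra` would come from `cv2.imread('txtpng.png', cv2.IMREAD_UNCHANGED)` as in the sample above.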
Setting codec parameters:
For reference I created an uncompressed AVI video file:
# Save uncompressed AVI as reference
stacked_clips.write_videofile('text_with_black_border_video.avi', fps=24, codec='rawvideo', ffmpeg_params=['-pix_fmt', 'bgr24'])
I also tried to select H.264 codec with yuv444p pixel format, but for some reason it's not working.
I have selected H.265 codec instead.
Using ffmpeg_params, I also set '-crf', '10' for almost lossless video compression:
stacked_clips.write_videofile('text_with_black_border_video.mp4', fps=24, codec='libx265', ffmpeg_params=['-pix_fmt', 'yuv444p', '-crf', '10'])
Here is the complete code sample:
import numpy as np
import moviepy.editor as mped
import cv2
orig_img = cv2.imread('txtpng.png', cv2.IMREAD_UNCHANGED)
img = orig_img.copy()
img[(img != 255) & (img != 0)] = 255 # Keep only two values: 0 and 255
cv2.imwrite('txtpng2.png', img)
cv2.imshow('img', img)
cv2.imshow('alpha diff', cv2.absdiff(orig_img[:,:,3], img[:,:,3])*100) # Show the difference in alpha channels
cv2.waitKey()
cv2.destroyAllWindows()
def composite_txtpng_on_colour():
    bg_color = mped.ColorClip(size=[400, 300], color=np.array([0, 255, 0]).astype(np.uint8),
                              duration=2).set_position((0, 0))
    text_png_postition = [5, 5]
    text_png = mped.ImageClip('txtpng2.png', duration=3).set_position((text_png_postition))
    canvas_size = bg_color.size
    stacked_clips = mped.CompositeVideoClip([bg_color, text_png], size=canvas_size).set_duration(2)
    stacked_clips.write_videofile('text_with_black_border_video.mp4', fps=24, codec='libx265', ffmpeg_params=['-pix_fmt', 'yuv444p', '-crf', '10'])
    # Save uncompressed AVI as reference
    stacked_clips.write_videofile('text_with_black_border_video.avi', fps=24, codec='rawvideo', ffmpeg_params=['-pix_fmt', 'bgr24'])

composite_txtpng_on_colour()
Result (text_with_black_border_video.mp4):
Reference (uncompressed text_with_black_border_video.avi):
Magnified part:
Note:
I am using moviepy version 1.0.3
I figured out why H.264 codec is not working with pixel format yuv444p.
We can add '-report' to the list of ffmpeg_params:
stacked_clips.write_videofile('text_with_black_border_video.mp4', fps=24, codec='libx264', ffmpeg_params=['-pix_fmt', 'yuv444p', '-crf', '10', '-report'])
The logged report begins with:
... \\lib\\site-packages\\imageio_ffmpeg\\binaries\\ffmpeg-win64-v4.2.2.exe" -y -loglevel error -f rawvideo -vcodec rawvideo -s 400x300 -pix_fmt rgb24 -r 24.00 -an -i - -vcodec libx264 -preset medium -pix_fmt yuv444p -crf 10 -report -pix_fmt yuv420p text_with_black_border_video.mp4
The log shows that moviepy added -pix_fmt yuv420p argument.
pix_fmt argument is added twice: -pix_fmt yuv444p -pix_fmt yuv420p.
The yuv420p argument "wins".
Update:
Way to keep the edges slightly blurred:
The edge color is not black, but dark green.
The edge color is a result of alpha compositing in RGB color space.
I suppose the compositing is performed using the simple formula:
dst_img = alpha*img_rgb + (1-alpha)*bg
Where alpha = img[:,:,3]/255 (the alpha channel scaled to the range [0, 1]).
When applying the above formula we are getting the following image:
Text edge color is dark green.
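As a worked instance of the compositing formula, the sketch below blends a single edge pixel in RGB. The specific values (a pure-red text pixel at 50% opacity over the green background) are assumptions for illustration only:

```python
import numpy as np

# Assumed values for illustration: a pure-red text pixel at 50% opacity
# over the green background used in the question (RGB order)
text_rgb = np.array([255, 0, 0], dtype=np.float64)
bg_rgb = np.array([0, 255, 0], dtype=np.float64)
alpha = 0.5

# dst = alpha * img_rgb + (1 - alpha) * bg
dst = (alpha * text_rgb + (1.0 - alpha) * bg_rgb).astype(np.uint8)
print(dst.tolist())  # [127, 127, 0]
```

Both the red and the green channel end up at about half intensity, so the blended edge pixel tends to look darker than either the text or the background.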
Suggested solution:
Apply alpha compositing in LAB color space.
Advantage:
Unlike the RGB and CMYK color models, CIELAB is designed to approximate human vision.
Linear operations in LAB color space are perceived as (approximately) linear by human vision.
Here is a code sample for alpha compositing in LAB color space:
img = cv2.imread('txtpng.png', cv2.IMREAD_UNCHANGED)
img2 = np.zeros((300, 400, 4), np.uint8)
img2[(300-img.shape[0])//2:(300+img.shape[0])//2, (400-img.shape[1])//2:(400+img.shape[1])//2, :] = img
alpha = img2[:, :, 3].astype(np.float64)/255 # Convert alpha to range [0, 1]
alpha = np.dstack((alpha, alpha, alpha)) # Duplicate alpha to 3 channels
img2 = img2[:, :, 0:3] # Only BGR without alpha
bg = np.full_like(img2, (0, 255, 0)) # Green background
bg = cv2.cvtColor(bg, cv2.COLOR_BGR2LAB)
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2LAB)
composed_img = (img2.astype(np.float64)*alpha + bg.astype(np.float64)*(1-alpha)).astype(np.uint8)
composed_img = cv2.cvtColor(composed_img, cv2.COLOR_LAB2BGR)
cv2.imwrite('composed_img.png', composed_img) # Store the result
Result:
Text edge color looks better.
Note:
I couldn't find any papers about alpha compositing in LAB color space (but I didn't look so hard).
Don't use on_color to make a background color for your CompositeVideoClip, or else, edges will be black. Use this helper function to generate a color clip instead:
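# Assumed imports: the aliases below match the names used in the function
import numpy
import moviepy.editor as mpy
from PIL import Image

def get_color_video_clip(color, size, duration):
    image = Image.new("RGB", size, color)
    image_array = numpy.array(image)
    return mpy.ImageClip(image_array, duration=duration)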
Also, as a sanity check, play the video back in different media players. On Mac, IINA made edges gray for me, while QuickTime Player had no such issues.
I have this Haar cascade code that detects faces and draws a bounding box around each one. I want to display only the bounding-box area of the original image, in its original place, and black out everything else, just like we do for color detection in OpenCV. Is there any way to do so?
cascPath = "haarcascade_frontalface_default.xml"
image = cv2.imread(imagePath)
faceCascade = cv2.CascadeClassifier(cascPath)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=5,
    minSize=(30, 30),
    flags=cv2.CASCADE_SCALE_IMAGE
)
print("Found {0} faces!".format(len(faces)))
for ((x, y, w, h), i) in zip(faces, range(len(faces))):
    a = cv2.rectangle(image, (x, y), (x+w, y+h), 2)
    roi_color = image[y:y+h, x:x+w]
I would crop the ROI and then paste it onto an all-black image. In Pillow this is very easy to do.
As I don't have face images, it is hard to reproduce your code. I will use a random image, but it should look something like this:
I used blue just to highlight the background; it is only a matter of changing it to whatever color you want.
from PIL import Image
img = Image.open('watch.jpeg', 'r')
img_w, img_h = img.size
left = img_w/8
top = img_h/8
right = 3 * img_w/8
bottom = 3 * img_h/8
cropped_img = img.crop((left, top, right, bottom))
cropped_img.save("cropped.png")
background = Image.new('RGB', (1440, 900), (0, 0, 255))
bg_w, bg_h = background.size
offset = ((bg_w - img_w) // 2, (bg_h - img_h) // 2)
background.paste(cropped_img, offset)
background.save('out.png')
input image:
output image:
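If the boxes should stay at their original positions, as the question asks, one possible alternative to cropping and pasting is to start from a black canvas of the same size and copy each box back in place. This is a minimal NumPy sketch; the synthetic `image` and the `faces` list are hypothetical stand-ins for the loaded image and the `detectMultiScale` result:

```python
import numpy as np

# Stand-in for the loaded BGR image (in the question this comes from cv2.imread)
image = np.full((6, 8, 3), 200, dtype=np.uint8)

# Hypothetical detections in (x, y, w, h) form, as returned by detectMultiScale
faces = [(1, 1, 3, 2), (5, 3, 2, 2)]

# Start from an all-black canvas of the same size and copy each box back in place
output = np.zeros_like(image)
for (x, y, w, h) in faces:
    output[y:y+h, x:x+w] = image[y:y+h, x:x+w]

# Count the pixels that are no longer black
non_black = int((output > 0).any(axis=2).sum())
print(non_black)  # 10
```

Everything outside the two boxes stays black, and the pixels inside keep their original values and coordinates.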
I'm trying to remove the numbers that lie inside the circular part of the image. The numbers are black, and the background varies between red, yellow, blue and green.
I am using OpenCV to remove those numbers. I built a mask that extracts the numbers from the image and then tried to remove them with cv2.inpaint.
For my further analysis I need a clean image, but my current approach gives a distorted image and the numbers are not completely removed.
I tried changing the threshold value, but lowering it misses numbers in dark shaded areas such as green and red.
import cv2
img = cv2.imread('scan_1.jpg')
mask = cv2.threshold(img,50,255,cv2.THRESH_BINARY_INV)[1][:,:,0]
cv2.imshow('mask', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()
dst = cv2.inpaint(img, mask, 5, cv2.INPAINT_TELEA)
cv2.imshow('dst',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('ost_1.jpg',dst)
Input images: a) scan_1.jpg
b) scan_2.jpg
Output images: a) ost_1.jpg
b) ost_2.jpg
Expected Image: Circles can be ignored, but something similar to it is required.
Here is my attempt. A better or easier solution might be possible if you do not care about preserving text outside of your circle.
import cv2
import numpy as np
# connectivity method used for finding connected components, 4 vs 8
CONNECTIVITY = 4
# HSV threshold for finding black pixels
H_THRESHOLD = 179
S_THRESHOLD = 255
V_THRESHOLD = 150
# read image
img = cv2.imread("a1.jpg")
img_height = img.shape[0]
img_width = img.shape[1]
# save a copy for creating resulting image
result = img.copy()
# convert image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# found the circle in the image
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 1.7, minDist= 100, param1 = 48, param2 = 100, minRadius=70, maxRadius=100)
# draw found circle, for visual only
circle_output = img.copy()
# check if we found exactly 1 circle
# (HoughCircles returns None or an array of shape (1, N, 3))
num_circles = 0 if circles is None else circles.shape[1]
print("Number of found circles:{}".format(num_circles))
if (num_circles != 1):
    print("invalid number of circles found ({}), should be 1".format(num_circles))
    exit(0)
# save center position and radius of found circle
circle_x = 0
circle_y = 0
circle_radius = 0
if circles is not None:
    # convert the (x, y) coordinates and radius of the circles to integers
    circles = np.round(circles[0, :]).astype("int")
    for (x, y, radius) in circles:
        circle_x, circle_y, circle_radius = (x, y, radius)
        cv2.circle(circle_output, (circle_x, circle_y), circle_radius, (255, 0, 0), 4)
        print("circle center:({},{}), radius:{}".format(x,y,radius))
# keep a median filtered version of image, will be used later
median_filtered = cv2.medianBlur(img, 21)
# Convert BGR to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# define range of black color in HSV
lower_val = np.array([0,0,0])
upper_val = np.array([H_THRESHOLD,S_THRESHOLD,V_THRESHOLD])
# Threshold the HSV image to get only black colors
mask = cv2.inRange(hsv, lower_val, upper_val)
# find connected components
components = cv2.connectedComponentsWithStats(mask, CONNECTIVITY, cv2.CV_32S)
# apply median filtering to found components
#centers = components[3]
num_components = components[0]
print("Number of found connected components:{}".format(num_components))
labels = components[1]
stats = components[2]
for i in range(1, num_components):
    left = stats[i, cv2.CC_STAT_LEFT] - 10
    top = stats[i, cv2.CC_STAT_TOP] - 10
    width = stats[i, cv2.CC_STAT_WIDTH] + 10
    height = stats[i, cv2.CC_STAT_HEIGHT] + 10
    # iterate each pixel and replace them if
    # they are inside circle
    for row in range(top, top+height+1):
        for col in range(left, left+width+1):
            dx = col - circle_x
            dy = row - circle_y
            if (dx*dx + dy*dy <= circle_radius * circle_radius):
                result[row, col] = median_filtered[row, col]
# smooth the image, may be necessary?
#result = cv2.blur(result, (3,3))
# display image(s)
cv2.imshow("img", img)
cv2.imshow("gray", gray)
cv2.imshow("found circle:", circle_output)
cv2.imshow("mask", mask)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result for a1:
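The per-pixel loop above can be slow on large images. As a hedged alternative, the inside-circle test can be expressed as a boolean mask with NumPy. The sketch below uses small synthetic arrays and assumed circle parameters in place of the real image, the median-filtered copy, and the HoughCircles result. Unlike the loop above, it replaces only the masked dark pixels rather than each component's padded bounding box, so on a real image the mask may need to be dilated first:

```python
import numpy as np

# Synthetic stand-ins: a grey image, a flat "median filtered" copy, and a dark-pixel mask
img = np.full((9, 9, 3), 180, dtype=np.uint8)
img[4, 4] = 0          # dark pixel inside the circle
img[0, 0] = 0          # dark pixel outside the circle
median_filtered = np.full_like(img, 180)
mask = img.max(axis=2) < 150   # plays the role of the cv2.inRange mask

# Assumed circle parameters (would come from cv2.HoughCircles in the answer)
circle_x, circle_y, circle_radius = 4, 4, 3

# Boolean disc: True where (col - cx)^2 + (row - cy)^2 <= r^2
rows, cols = np.ogrid[:img.shape[0], :img.shape[1]]
inside = (cols - circle_x) ** 2 + (rows - circle_y) ** 2 <= circle_radius ** 2

# Replace only dark pixels that lie inside the circle
result = img.copy()
replace = mask & inside
result[replace] = median_filtered[replace]

print(result[4, 4].tolist(), result[0, 0].tolist())
```

The dark pixel at the circle center is replaced by the filtered value, while the dark pixel in the corner, which lies outside the circle, is left as it was.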
I have an OpenCV image like this:
opencvImage = cv2.cvtColor(numpy_image, cv2.COLOR_RGBA2BGRA)
Then, with the following piece of code, I want to remove the transparency and set a white background.
source_img = cv2.cvtColor(opencvImage[:, :, :3], cv2.COLOR_BGRA2GRAY)
source_mask = opencvImage[:,:,3] * (1 / 255.0)
background_mask = 1.0 - source_mask
bg_part = (background_color * (1 / 255.0)) * (background_mask)
source_part = (source_img * (1 / 255.0)) * (source_mask)
result_image = np.uint8(cv2.addWeighted(bg_part, 255.0, source_part, 255.0, 0.0))
Actually, I am able to set the background to white; however, the actual image colors change as well.
I believe COLOR_BGRA2GRAY causes this problem. That's why I tried to use IMREAD_UNCHANGED instead, but I got this error: unsupported color conversion code in function 'cvtColor’
Btw, I am open to any solution. I am just sharing my code, which might only need a small fix.
Here's a basic script that will replace all fully transparent pixels with white and then remove the alpha channel.
import cv2
#load image with alpha channel. use IMREAD_UNCHANGED to ensure loading of alpha channel
image = cv2.imread('your image', cv2.IMREAD_UNCHANGED)
#make mask of where the transparent bits are
trans_mask = image[:,:,3] == 0
#replace areas of transparency with white and not transparent
image[trans_mask] = [255, 255, 255, 255]
#new image without alpha channel...
new_img = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
I do not know exactly what that error is, but I have just tested a possible solution for you. Even though it is in C++, I guess you can convert it easily to Python.
/* Setting data info */
std::string test_image_path = "Galicia.png";
/* General variables */
cv::namedWindow("Input image", cv::WINDOW_NORMAL);
cv::namedWindow("Input image R", cv::WINDOW_NORMAL);
cv::namedWindow("Input image G", cv::WINDOW_NORMAL);
cv::namedWindow("Input image B", cv::WINDOW_NORMAL);
cv::namedWindow("Input image A", cv::WINDOW_NORMAL);
cv::namedWindow("Output image", cv::WINDOW_NORMAL);
/* Process */
cv::Mat test_image = cv::imread(test_image_path, cv::IMREAD_UNCHANGED);
std::cout << "Image type: " << test_image.type() << std::endl;
// Split channels of the png files
std::vector<cv::Mat> pngChannels(4);
cv::split(test_image, pngChannels);
cv::imshow("Input image", test_image);
cv::imshow("Input image R", pngChannels[0]);
cv::imshow("Input image G", pngChannels[1]);
cv::imshow("Input image B", pngChannels[2]);
cv::imshow("Input image A", pngChannels[3]);
// Set to 255(white) the RGB channels where the Alpha channel(mask) is 0(transparency)
pngChannels[0].setTo(cv::Scalar(255), pngChannels[3]==0);
pngChannels[1].setTo(cv::Scalar(255), pngChannels[3]==0);
pngChannels[2].setTo(cv::Scalar(255), pngChannels[3]==0);
// Merge again the channels
cv::Mat test_image_output;
cv::merge(pngChannels, test_image_output);
// Show the merged channels.
cv::imshow("Output image", test_image_output);
// For saving with changes, conversion is needed.
cv::cvtColor(test_image_output, test_image_output, cv::COLOR_RGBA2RGB);
cv::imwrite("Galicia_mod.png", test_image_output);
I complement the code with this screenshot that may help you to understand better my solution:
Best Wishes,
Arritmic
All previous answers binarize the mask, but the mask can be non-binary. In that case you can use alpha blending with a white background:
import cv2
import numpy as np

def alpha_blend_with_mask(foreground, background, mask): # modified func from link
    # Convert uint8 to float
    foreground = foreground.astype(float)
    background = background.astype(float)
    # Normalize the mask to keep intensity between 0 and 1
    mask = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
    mask = mask.astype(float) / 255
    # Multiply the foreground with the mask matte
    foreground = cv2.multiply(mask, foreground)
    # Multiply the background with ( 1 - mask )
    background = cv2.multiply(1.0 - mask, background)
    # Add the masked foreground and background.
    return cv2.add(foreground, background).astype(np.uint8)

# img is a 4-channel BGRA image; the white background matches its color channels in shape
img_with_white_background = alpha_blend_with_mask(img[..., :3], np.ones_like(img[..., :3]) * 255, img[..., 3])
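The same blend can be written with NumPy broadcasting alone. The following is a minimal sketch on a hypothetical 1x3 BGRA strip (the pixel values are made up for illustration); it computes alpha * foreground + (1 - alpha) * white and rounds to the nearest integer:

```python
import numpy as np

# Hypothetical 1x3 BGRA strip: opaque blue, half-transparent blue, fully transparent
bgra = np.array([[[255, 0, 0, 255],
                  [255, 0, 0, 128],
                  [0, 0, 0, 0]]], dtype=np.uint8)

# Alpha in [0, 1], kept with a trailing axis so it broadcasts over the colour channels
alpha = bgra[..., 3:4].astype(np.float64) / 255.0
colour = bgra[..., :3].astype(np.float64)

# out = alpha * foreground + (1 - alpha) * white
blended = np.rint(alpha * colour + (1.0 - alpha) * 255.0).astype(np.uint8)
print(blended.tolist())
```

The opaque pixel keeps its color, the fully transparent pixel becomes white, and the half-transparent pixel lands in between, which a binary mask cannot reproduce.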
this worked for me..
# import cv2
# #load image with alpha channel. use IMREAD_UNCHANGED to ensure loading of alpha channel
image = cv2.imread('/content/test1.jpg')
#make mask of where the transparent bits are
transp_mask = image[:,:,:3] == 0
transp_mask = image[:,:,:3] == 1 # swap
#replace areas of transparency with white and not transparent
image[transp_mask] = [100]
#new image without alpha channel...
new_img = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
cv2.imwrite('testingnew.jpg', new_img)  # converted to binary img
print(new_img.shape)
plt.imshow(new_img)
transparent image
output
output2
real img
The answer by @user1269942 leaves black edges and creates an unnecessary contour around the image.
I needed to fill the background of this image.
This was the image I needed to convert:
This is the image after following the steps in the accepted answer:
However, if we mask based on a threshold value, we can reduce that unnecessary contour, depending on the threshold we choose. I chose 75 in my case.
So instead of trans_mask = image[:,:,3] == 0
we do
img2gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
ret, trans_mask = cv2.threshold(img2gray, 75, 255, cv2.THRESH_BINARY)
trans_mask = trans_mask == 0
def fillColor(imageFile, color):
    image = cv2.imread(imageFile, cv2.IMREAD_UNCHANGED)
    #trans_mask = image[:,:,3] == 0
    img2gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
    ret, trans_mask = cv2.threshold(img2gray, 75, 255, cv2.THRESH_BINARY)
    trans_mask = trans_mask == 0
    image[trans_mask] = color
    new_img = cv2.cvtColor(image, cv2.COLOR_BGRA2BGR)
    return new_img
The Output Image