Make Area OUTSIDE of Bounding Box/ROI Transparent Using CV2 - python-3.x

I'd like to know if there's a better way of doing this: given an image and bounding box coordinates, how can I make the pixels OUTSIDE of the bounding box transparent?
Here's a simple pseudo-reproducible example; you just have to supply an image from your own machine. Note this code is not working, as you'll see; I'm looking for the best way to do this:
import cv2
# Read in the image and clone it
image = cv2.imread("YOUR_IMAGE.jpg", cv2.IMREAD_UNCHANGED)
clone = image.copy()
# Define a bounding box
bound_box = {"xmin": 54, "ymin": 40, "xmax": 434, "ymax": 306}
# Create 4 rectangles (left, top, right, and bottom) around the bounding box coords minus 1 pixel.
# This would be the sample line of code to use 4 times, with x, y, w, h derived from bound_box
cv2.rectangle(clone, (x, y), (x + w, y + h), (0, 200, 0), -1)
# Transparency factor.
alpha = 0.0
# Add the transparency to the 4 rectangles?
clone = cv2.addWeighted(clone, alpha, image, 1 - alpha, 0)
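For reference, a minimal sketch of one way to get true transparency (rather than the blend that addWeighted produces) is to add an alpha channel and zero it outside the box; this assumes a 3-channel BGR input and the bound_box coordinates above:
import cv2
import numpy as np
image = cv2.imread("YOUR_IMAGE.jpg")
# Convert BGR to BGRA so the image carries an alpha channel
result = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
# Mask that is opaque (255) inside the bounding box and transparent (0) outside
mask = np.zeros(image.shape[:2], dtype=np.uint8)
mask[40:306, 54:434] = 255  # rows are ymin:ymax, columns are xmin:xmax
result[:, :, 3] = mask
# Save as PNG, since JPEG cannot store transparency
cv2.imwrite("result.png", result)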

Related

Paste an image to another image at two given co-ordinates with altered opacity using PIL or OpenCV in Python

I have two images, each with a given point, that need to be aligned so that the resulting image is a combination of both, with image 2 pasted onto image 1 at 40% opacity. I have taken this question into consideration, but our case does not exactly match, as the image coordinates are supplied by the user and the images can have a wide range of sizes.
Image 1:
Image 2:
Final result (desired output):
For this I have tried the img.paste() function of PIL and replacing values in the numpy arrays of the images in cv2, both giving results that are far from desired.
I made two input images with ImageMagick like this:
magick -size 300x400 xc:"rgb(1,204,255)" -fill red -draw "point 280,250" 1.png
magick -size 250x80 xc:"rgb(150,203,0)" -fill red -draw "point 12,25" 2.png
Then ran the following code:
#!/usr/bin/env python3
"""
Paste one image on top of another such that given points in each are coincident.
"""
from PIL import Image
# Open images and ensure RGB
im1 = Image.open('1.png').convert('RGB')
im2 = Image.open('2.png').convert('RGB')
# x,y coordinates of point in each image
p1x, p1y = 280, 250
p2x, p2y = 12, 25
# Work out how many pixels of space we need left, right, above, below common point in new image
pL = max(p1x, p2x)
pR = max(im1.width-p1x, im2.width-p2x)
pT = max(p1y, p2y)
pB = max(im1.height-p1y, im2.height-p2y)
# Create background in solid white
bg = Image.new('RGB', (pL+pR, pT+pB), 'white')
bg.save('DEBUG-bg.png')
# Paste im1 onto background
bg.paste(im1, (pL-p1x, pT-p1y))
bg.save('DEBUG-bg+im1.png')
# Make 40% opacity mask for im2
alpha = Image.new('L', (im2.width,im2.height), int(40*255/100))
alpha.save('DEBUG-alpha.png')
# Paste im2 over background with alpha
bg.paste(im2, (pL-p2x, pT-p2y), alpha)
bg.save('result.png')
The result is this:
The lines that save images with names starting "DEBUG-xxx.png" are just for easy debugging and can be removed. I can easily view them all to see what is going on with the code, and I can easily delete them all with rm DEBUG*png.
Without any more details, I will try to answer the question as best as I can and will name all the extra assumptions that I made (and how to handle them if you can't make them).
Since there were no provided images, I created a blue and a green image with a black dot as the merging coordinate, using the following code:
import numpy as np
from PIL import Image, ImageDraw
def create_image_with_point(name, color, x, y, width=3):
    image = np.full((400, 400, 3), color, dtype=np.uint8)
    image[y - width:y + width, x - width:x + width] = (0, 0, 0)
    image = Image.fromarray(image, mode='RGB')
    ImageDraw.Draw(image).text((x - 15, y - 20), 'Point', (0, 0, 0))
    image.save(name)
    return image
blue = create_image_with_point('blue.png', color=(50, 50, 255), x=300, y=100)
green = create_image_with_point('green.png', color=(50, 255, 50), x=50, y=50)
This results in the following images:
Now I will make the assumption that the images do not contain an alpha layer yet (as I created them without one). Therefore I will load the images and add an alpha layer to them:
import numpy as np
from PIL import Image
blue = Image.open('blue.png')
blue.putalpha(255)
green = Image.open('green.png')
green.putalpha(255)
My next assumption is that you know the merge coordinates beforehand:
# Assuming x, y coordinates.
point_blue = (300, 100)
point_green = (50, 50)
Then you can create an empty image that can easily hold both of the images:
new_image = np.zeros((1000, 1000, 4), dtype=np.uint8)
This is a big assumption if you do not know the image size beforehand; in that case you will have to calculate the combined size of the two images, as sketched below.
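If the canvas size is not known beforehand, it can be derived from the two images and their merge points, in the same spirit as the pL/pR/pT/pB computation in the previous answer. A minimal sketch, assuming the blue/green images and points defined above:
# Space needed on each side of the common point, taken over both images
left = max(point_blue[0], point_green[0])
right = max(blue.width - point_blue[0], green.width - point_green[0])
top = max(point_blue[1], point_green[1])
bottom = max(blue.height - point_blue[1], green.height - point_green[1])
# Canvas just large enough to hold both images around the common point
new_image = np.zeros((top + bottom, left + right, 4), dtype=np.uint8)
# The offsets in place_image below would then be (left - x, top - y) instead of being based on (500, 500)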
Then you can place each image's dot at the center of the newly created image (in my case (500, 500)). For this you use the merging points as offsets. You can then perform alpha blending (in general: np.uint8(img_1*alpha + img_2*(1-alpha))) to merge the images with different opacities.
Which is in code:
def place_image(image: Image, point_xy: tuple[int, int], dest: np.ndarray, alpha: float = 1.) -> np.ndarray:
    # Place the merging dot on (500, 500).
    offset_x, offset_y = 500 - point_xy[0], 500 - point_xy[1]
    # Calculate the location of the image and perform alpha blending.
    destination = dest[offset_y:offset_y + image.height, offset_x:offset_x + image.width]
    destination = np.uint8(destination * (1 - alpha) + np.array(image) * alpha)
    # Copy the 'merged' image to the destination location.
    dest[offset_y:offset_y + image.height, offset_x:offset_x + image.width] = destination
    return dest
# Add the background image blue with alpha 1
new_image = place_image(blue, point_blue, dest=new_image, alpha=1)
# Add the second image with 40% opacity
new_image = place_image(green, point_green, dest=new_image, alpha=0.4)
# Store the resulting image.
image = Image.fromarray(new_image)
image.save('result.png')
The final result will be a bigger image containing both combined images, and will look like this:
Again, you can calculate the correct bounding box, so you don't have these huge areas of 'nothing' sticking out, as sketched below.
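As an optional post-processing step (my own addition, not part of the answer above), the empty border can be cropped away by taking the bounding box of the non-transparent pixels:
# Rows and columns where the alpha channel is non-zero, i.e. where an image was placed
ys, xs = np.nonzero(new_image[:, :, 3])
cropped = new_image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
Image.fromarray(cropped).save('result_cropped.png')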

Bounding boxes for individual contours except one color

I have an image as below
I want to add bounding boxes for each of the regions as shown in the pic below using OpenCV & Python
I know how to find contours if the region is one colour. However, here I want to find contours for all non-black regions. I am just not able to figure it out. Can anyone help?
Regarding some regions being non-continuous (the 2 vertical lines on the left), you can ignore that; I will dilate and make them continuous.
If I understand what you want, here is one way in Python/OpenCV.
Read the input
Convert to gray
Threshold to black and white
Find external contours and their bounding boxes
Draw the bounding box rectangles on a copy of the input
Save the results
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('white_green.png')
# convert to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY)[1]
# get contour bounding boxes and draw on copy of input
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]
result = img.copy()
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(result, (x, y), (x + w - 1, y + h - 1), (0, 0, 255), 1)
# view result
cv2.imshow("threshold", thresh)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("white_green_thresh.jpg", thresh)
cv2.imwrite("white_green_bboxes.jpg", result)
Thresholded image:
Bounding Boxes:

How to find the document edges in various coloured backgrounds using opencv python? [Document Scanning in various backgrounds]

I currently have a document that needs to be smart-scanned.
For that, I need to find proper contours of the document against any background so that I can do a warped perspective projection and detection with that image.
The main issue is that the document edge detection picks up any kind of background.
So far I have tried using the function HoughLinesP and finding contours on the grayscale blurred image passed through Canny edge detection.
MORPH = 9
CANNY = 84
HOUGH = 25
IM_HEIGHT, IM_WIDTH, _ = rescaled_image.shape
# convert the image to grayscale and blur it slightly
gray = cv2.cvtColor(rescaled_image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7,7), 0)
#dilate helps to remove potential holes between edge segments
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
dilated = cv2.dilate(gray, kernel)
# find edges and mark them in the output map using the Canny algorithm
edged = cv2.Canny(dilated, 0, CANNY)
test_corners = self.get_corners(edged)
approx_contours = []
(_, cnts, hierarchy) = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
# loop over the contours
for c in cnts:
    # approximate the contour
    approx = cv2.approxPolyDP(c, 80, True)
    if self.is_valid_contour(approx, IM_WIDTH, IM_HEIGHT):
        approx_contours.append(approx)
        break
How can I find a proper bounding box around the document via OpenCV?
Any help will be much appreciated.
(The document is captured by the camera at any angle and against any coloured background.)
The following code might help you to detect/segment the page in the image...
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread('test_p.jpg')
print(image.shape)
ori = image.copy()
image = cv2.resize(image, (image.shape[1]//10,image.shape[0]//10))
Resized the image to make the operations faster so that we can work in real time...
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (11,11), 0)
edged = cv2.Canny(gray, 75, 200)
print("STEP 1: Edge Detection")
plt.imshow(edged)
plt.show()
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# findContours returns (image, contours, hierarchy) in OpenCV 3.x and (contours, hierarchy) in 4.x
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
Here we will consider only the first 5 contours from the list, sorted by area.
The size of the Gaussian blur is a bit sensitive here, so choose it according to the image size.
After the above operations, the image may look like...
for c in cnts:
    ### Approximating the contour
    # Calculates a contour perimeter or a curve length
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.01 * peri, True)
    # keep the current approximation as a fallback in case no 4-point contour is found
    screenCnt = approx
    # if our approximated contour has four points, then we
    # can assume that we have found our screen
    if len(approx) == 4:
        screenCnt = approx
        break
# show the contour (outline)
print("STEP 2: Finding Boundary")
cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
image_e = cv2.resize(image,(image.shape[1],image.shape[0]))
cv2.imwrite('image_edge.jpg',image_e)
plt.imshow(image_e)
plt.show()
The final image may look like...
The remaining steps may be handled after getting the final image...
Code reference: Git Repository
I guess this answer would be helpful...
There is a similar problem, which is called orthographic projection.
Orthographic approaches
Rather than doing Gaussian blur + morphological operations to get the edge of the document, try to do an orthographic projection first and then find contours via your method.
For finding a proper bounding box, try some preset values or a reference letter, after which an orthographic projection will allow you to compute the height and hence the dimensions of the bounding box.
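Since the question's end goal is a warped perspective projection, here is a minimal sketch of that step, assuming screenCnt from the code above holds the four document corners; the corner-ordering helper and the output size are my own assumptions:
import cv2
import numpy as np

def order_corners(pts):
    # Order 4 points as top-left, top-right, bottom-right, bottom-left
    pts = pts.reshape(4, 2).astype(np.float32)
    s = pts.sum(axis=1)               # x + y: smallest at top-left, largest at bottom-right
    d = np.diff(pts, axis=1).ravel()  # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[s.argmin()], pts[d.argmin()],
                     pts[s.argmax()], pts[d.argmax()]], dtype=np.float32)

corners = order_corners(screenCnt)
w, h = 500, 700  # assumed output size; in practice derive it from the corner distances
dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=np.float32)
M = cv2.getPerspectiveTransform(corners, dst)
warped = cv2.warpPerspective(image, M, (w, h))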

Text Detection: Getting Bounding boxes

I have a black and white image of a text document. I want to get a list of the bounding boxes for each character. I have attempted an algorithm myself, but it takes excessively long and is only somewhat successful. Are there any python libraries I can use to find the bounding boxes? I've been looking into opencv, but the documentation is hard to follow, and in this tutorial I can't even decipher whether the bounding boxes were found, because I can't easily find what the functions actually do.
You can use boundingRect(). Make sure your image background is black and the text in the image is white. Using this code you can draw rectangles around the text in your image; to get a list of every rectangle, collect the (x, y, w, h) tuples as you iterate, as sketched after the code.
import cv2
img = cv2.imread('input.png', 0)
cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU, img)
# findContours returns (image, contours, hierarchy) in OpenCV 3.x and (contours, hierarchy) in 4.x
ret = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contours = ret[0] if len(ret) == 2 else ret[1]
for c in contours:
    # get the bounding rect
    x, y, w, h = cv2.boundingRect(c)
    # draw a white rectangle to visualize the bounding rect
    cv2.rectangle(img, (x, y), (x + w, y + h), 255, 1)
cv2.drawContours(img, contours, -1, (255, 255, 0), 1)
cv2.imwrite("output.png", img)
I would suggest that you look into the boundingRect() function in the openCV library:
https://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/bounding_rects_circles/bounding_rects_circles.html
The documentation is based on C++, but the function can be used from Python as well.

Not getting the box on detected car

I'm trying to run Python code to detect a car in an image and draw a box around it. But somehow I'm not getting the box, even though there is no error. The code I'm using is below:
The function to draw the box
def draw_boxes(img, bboxes, color=(0, 0, 255), thick=6):
    # Make a copy of the image
    imcopy = np.copy(img)
    # Iterate through the bounding boxes
    for bbox in bboxes:
        # Draw a rectangle given bbox coordinates
        cv2.rectangle(imcopy, bbox[0], bbox[1], color, thick)
    # Return the image copy with boxes drawn
    return imcopy
The function to search the windows for detections
def search_windows(img, windows, model):
    #1) Create an empty list to receive positive detection windows
    test_images = []
    #2) Iterate over all windows in the list
    for window in windows:
        #3) Extract the test window from the original image
        test_img = cv2.resize(img[window[0][1]:window[1][1], window[0][0]:window[1][0]], (64, 64))
        # Normalize image
        test_img = test_img / 255
        test_images.append(test_img)
    # Predict and round the result
    test_images = np.array(test_images)
    prediction = np.around(model.predict(test_images))
    on_windows = [windows[i] for i in np.where(prediction == 1)[0]]
    return on_windows
Read an image and use the functions to draw the box
img = mpimg.imread(test_images[0])
detected_windows = search_windows(img, windows, model)
window_img = draw_boxes(img, detected_windows, color=(0, 255, 0), thick=3)
plt.imshow(window_img)
Thanks in advance.
I found the answer. I needed to set a better threshold for my predictions in the code. If I use 0.4 or 0.5 as the cutoff, then I get the boxes, and can then work my way towards better predictions.
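A minimal sketch of that change, replacing the rounding at the end of search_windows with an explicit cutoff (0.5 is the value mentioned above):
# Keep windows whose predicted probability exceeds the chosen threshold
prediction = model.predict(test_images)
on_windows = [windows[i] for i in np.where(prediction.ravel() > 0.5)[0]]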
