From https://www.pyimagesearch.com/2018/07/19/opencv-tutorial-a-guide-to-learn-opencv/
I'm able to extract the contours and write as files.
For example, I have a photo with some scribbled text: "in there".
I've been able to extract the letters as separate files, but I want these letter files to have the same width and height. For example, the widths of "i" and "r" differ. In that case I want to append pixels (any b/w pixels) to the right of the "i" photo so that its width becomes the same as that of "r".
How do I do this in Python? I just want to increase the size of the photo (not resize it).
My code looks something like this:
# find contours (i.e., outlines) of the foreground objects in the
# thresholded image
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
output = image.copy()
ROI_number = 0
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    file = 'ROI_{}.png'.format(ROI_number)
    cv2.imwrite(file, ROI)
    # move on to the next file name so crops do not overwrite each other
    ROI_number += 1
Here are a couple of other ways to do that in Python/OpenCV, using cv2.copyMakeBorder() to extend the image to the right by 50 pixels. The first way simply extends the border by replication. The second extends it with the mean (average) blue background color, using a mask to select only the blue pixels.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
# get mask of background pixels (for result2b only)
lowcolor = (232,221,163)
highcolor = (252,241,183)
mask = cv2.inRange(img, lowcolor, highcolor)
# get average color of background using mask on img (for result2b only)
mean = cv2.mean(img, mask)[0:3]
color = (mean[0],mean[1],mean[2])
# extend image to the right by 50 pixels
result = img.copy()
result2a = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_REPLICATE)
result2b = cv2.copyMakeBorder(result, 0,0,0,50, cv2.BORDER_CONSTANT, value=color)
# view result
cv2.imshow("img", img)
cv2.imshow("mask", mask)
cv2.imshow("result2a", result2a)
cv2.imshow("result2b", result2b)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended2a.jpg", result2a)
cv2.imwrite("i_extended2b.jpg", result2b)
Replicated Result:
Average Background Color Result:
In Python/OpenCV/Numpy you create a new image of the size and background color you want. Then you use numpy slicing to insert the old image into the new one. For example:
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('i.png')
ht, wd, cc= img.shape
# create new image of desired size (extended by 50 pixels in width) and desired color
ww = wd+50
hh = ht
color = (242,231,173)
result = np.full((hh,ww,cc), color, dtype=np.uint8)
# copy img image into image at offsets yy=0,xx=0
yy=0
xx=0
result[yy:yy+ht, xx:xx+wd] = img
# view result
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save result
cv2.imwrite("i_extended.jpg", result)
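The same slicing idea can be applied to a whole list of letter crops so that they all end up with one width and height. This is only a minimal sketch: the helper name pad_to_common_size, the white default color, and the zero-filled stand-in arrays are assumptions made for illustration, not part of the answers above.

```python
import numpy as np

def pad_to_common_size(images, color=(255, 255, 255)):
    """Pad each image on the right and bottom so all share one size."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    padded = []
    for img in images:
        h, w, channels = img.shape
        # new canvas of the common size, filled with the background color
        canvas = np.full((max_h, max_w, channels), color, dtype=img.dtype)
        # original pixels are copied unchanged into the top-left corner
        canvas[:h, :w] = img
        padded.append(canvas)
    return padded

# stand-ins for the "i" and "r" crops (hypothetical sizes)
letter_i = np.zeros((40, 10, 3), dtype=np.uint8)
letter_r = np.zeros((30, 25, 3), dtype=np.uint8)
padded_i, padded_r = pad_to_common_size([letter_i, letter_r])
print(padded_i.shape, padded_r.shape)  # (40, 25, 3) (40, 25, 3)
```

Nothing is resampled here, so the letters keep their original pixels; only background is added.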
I am trying to make a project which has three imshow windows, each of a different size. Is there a way to pane or stack those three windows and display them in another window? Currently they are displayed like this
How can I make a window which contains all these windows, so that only the main window has a close button and not all of them?
Use np.concatenate.
To stack them vertically:
img = np.concatenate((img1, img2), axis=0)
To stack them horizontally:
img = np.concatenate((img1, img2), axis=1)
Then show them using cv2.imshow
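One caveat worth illustrating: np.concatenate only works when the images agree along the axis that is not being joined, so vertical stacking needs equal widths and horizontal stacking needs equal heights. The sketch below uses zero-filled arrays with made-up sizes as stand-ins for real images.

```python
import numpy as np

img1 = np.zeros((100, 200, 3), dtype=np.uint8)
img2 = np.zeros((50, 200, 3), dtype=np.uint8)   # same width, different height
img3 = np.zeros((50, 120, 3), dtype=np.uint8)   # different width

# vertical stacking works because both images are 200 pixels wide
stacked = np.concatenate((img1, img2), axis=0)
print(stacked.shape)  # (150, 200, 3)

# mismatched widths raise ValueError, so such images must be padded first
try:
    np.concatenate((img1, img3), axis=0)
    widths_ok = True
except ValueError:
    widths_ok = False
print(widths_ok)  # False
```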
You can read in each image and store them all in a list, imgs. Start with the first image as the running result, then iterate through the remaining images, comparing the width of the result with the width of the current image. The difference defines a pad for whichever of the two is narrower, after which the current image is appended below the result:
Let's say we have these three images:
one.png:
two.png:
three.png:
The code:
import cv2
import numpy as np
img1 = cv2.imread("one.png")
img2 = cv2.imread("two.png")
img3 = cv2.imread("three.png")
imgs = [img1, img2, img3]
result = imgs[0]
for img in imgs[1:]:
    w = img.shape[1] - result.shape[1]
    pad = [(0, 0), (0, abs(w)), (0, 0)]
    if w > 0:
        result = np.r_[np.pad(result, pad), img]
    else:
        result = np.r_[result, np.pad(img, pad)]
cv2.imshow("Image", result)
cv2.waitKey(0)
Output:
TL;DR: When concatenating 10 MB worth of images into one large image, the resulting image takes 1 GB of memory before I save/optimize it to disk. How can I make this in-memory size smaller?
I am working on a project where I take a list of lists of Python PIL image objects (image tiles) and glue them together to:
1. Generate a list of images that have been concatenated together into columns
2. Take #1 and make a full-blown image out of all the tiles
This post has been great at providing a function that accomplishes 1 & 2 by:
- Figuring out the final image size
- Creating a blank canvas for images to be added to
- Adding all the images, in sequence, to the canvas we just generated
However, the issue I am encountering with the code:
The size of the original objects in the list of lists is ~50 MB.
When I do the first pass over the list of lists of image objects to generate the list of column images, the memory increases by 1 GB. And when I make the final image, the memory increases by another 1 GB.
Since the resulting image is 105,985 x 2560 pixels, the 1 GB is somewhat expected: (105984 * 2560) * 3 / 1024 / 1024 is roughly 800 MB.
My hunch is that the canvases being created are non-optimized and hence take up a fair bit of space (pixels * 3 bytes), whereas the image tile objects I am trying to paste onto the canvas are optimized for size.
Hence my question: using PIL/Python 3, is there a better way to concatenate images together while keeping their original sizes/optimizations? After I process/re-optimize the image via
.save(DiskLocation, optimize=True, quality=94)
The resulting image is ~30 MB (which is, roughly the size of the original list of lists containing PIL objects)
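The arithmetic in the question can be checked with a few lines of Python. This is a rough sketch only: decoded_size_bytes is a hypothetical helper, and the 4-bytes-per-pixel figure is an assumption included to show how sensitive the total is to the in-memory layout, not a measurement.

```python
def decoded_size_bytes(width, height, bytes_per_pixel=3):
    """Size of an uncompressed pixel buffer, ignoring any object overhead."""
    return width * height * bytes_per_pixel

MIB = 1024 * 1024

# 3 bytes per pixel, as assumed in the question
rgb_mib = decoded_size_bytes(105984, 2560) / MIB
print(rgb_mib)  # 776.25

# hypothetical layout that pads every pixel to 4 bytes
padded_mib = decoded_size_bytes(105984, 2560, bytes_per_pixel=4) / MIB
print(padded_mib)  # 1035.0
```

Either way the figure depends only on the pixel count, not on how well the tiles compress on disk, which is why a ~30 MB saved file can still need hundreds of megabytes once decoded.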
For reference, from the post linked above, this is the function that I use to concatenate images together:
from PIL import Image
#written by teekarna
# https://stackoverflow.com/questions/30227466/combine-several-images-horizontally-with-python
def append_images(images, direction='horizontal',
                  bg_color=(255,255,255), aligment='center'):
    """
    Appends images in horizontal/vertical direction.
    Args:
        images: List of PIL images
        direction: direction of concatenation, 'horizontal' or 'vertical'
        bg_color: Background color (default: white)
        aligment: alignment mode if images need padding;
            'left', 'right', 'top', 'bottom', or 'center'
    Returns:
        Concatenated image as a new PIL image object.
    """
    widths, heights = zip(*(i.size for i in images))
    if direction=='horizontal':
        new_width = sum(widths)
        new_height = max(heights)
    else:
        new_width = max(widths)
        new_height = sum(heights)
    new_im = Image.new('RGB', (new_width, new_height), color=bg_color)
    offset = 0
    for im in images:
        if direction=='horizontal':
            y = 0
            if aligment == 'center':
                y = int((new_height - im.size[1])/2)
            elif aligment == 'bottom':
                y = new_height - im.size[1]
            new_im.paste(im, (offset, y))
            offset += im.size[0]
        else:
            x = 0
            if aligment == 'center':
                x = int((new_width - im.size[0])/2)
            elif aligment == 'right':
                x = new_width - im.size[0]
            new_im.paste(im, (x, offset))
            offset += im.size[1]
    return new_im
While I do not have an explanation for what was causing my runaway memory issue, I was able to tack on some code that seems to have fixed it.
For each tile that I am trying to glue together, I ran a 'resizing' script (below). Somehow, this fixed the issue I was having ¯\_(ツ)_/¯
Resize images script:
from PIL import Image
def resize_image(image, ImageQualityReduction = .78125):
    #resize image, by percentage
    width, height = image.size
    #print(width,height)
    new_width = int(round(width * ImageQualityReduction))
    new_height = int(round(height * ImageQualityReduction))
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS name,
    # which was removed in Pillow 10
    resized_image = image.resize((new_width, new_height), Image.LANCZOS)
    return resized_image#, new_width, new_height
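One possible reading of why this helps, offered as a guess rather than a confirmed diagnosis: scaling both sides by 0.78125 shrinks the pixel count, and with it any uncompressed buffer, to about 61% of the original. The sketch below repeats the rounding from resize_image; the helper scaled_size and the choice of (2560, 105984) as the input size are assumptions for illustration, since the script is actually applied per tile.

```python
def scaled_size(size, factor=0.78125):
    """Apply the same rounding as resize_image to a (width, height) pair."""
    width, height = size
    return int(round(width * factor)), int(round(height * factor))

original = (2560, 105984)
resized = scaled_size(original)
print(resized)  # (2000, 82800)

# fraction of the original pixel count that remains after scaling
pixel_ratio = (resized[0] * resized[1]) / (original[0] * original[1])
print(pixel_ratio)  # 0.6103515625
```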
hand-filled character per box form
I want to automate a process in which I receive hand-filled, one-character-per-box forms as images and need to extract the text from them. A box surrounds each letter, and I have to extract all the text from the form image.
You can select contours by size, find the rotated rectangle, and apply the inverse transform.
import cv2
import numpy as np
img = cv2.imread('4YAry.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# convert to binary image
thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)[1]
# cv2.RETR_LIST and cv2.CHAIN_APPROX_SIMPLE are the named forms of the original 1 and 2
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    if abs(w - 345) < 10:  # width box is 345 px
        rect = cv2.minAreaRect(cnt)
        box = cv2.boxPoints(rect)
        srcTri = np.array([box[1], box[0], box[2]]).astype(np.float32)
        dstTri = np.array([[0, 0], [0, rect[1][1]], [rect[1][0], 0]]).astype(np.float32)
        warp_mat = cv2.getAffineTransform(srcTri, dstTri)
        # int() replaces np.int0, which was removed in NumPy 2.0
        warp_dst = cv2.warpAffine(img, warp_mat, (int(rect[1][0]), int(rect[1][1])))
N = 14
s = 0.99 * warp_dst.shape[1] / N  # tune rectangle positions
for i in range(N):
    # paint over the box edges with white rectangles
    warp_dst = cv2.rectangle(warp_dst, (2 + int(i * s), 2), (2 + int((i + 1) * s), warp_dst.shape[0] - 3), (255, 255, 255), 2)
cv2.imwrite('chars.png', warp_dst)
Using for instance Hough, detect the top and bottom edges and the vertical separations. Validate the separations by checking that they run from top to bottom. The horizontal lines will be more reliable and accurate, you can use their direction for deskewing if necessary.
After doing that, you will have missing separations and false ones. Using some heuristics, try to find the correct pitch and detect the false positives and false negatives. Now you can extract the content of the individual boxes, or erase the edges.
This process cannot be perfect, some characters will be damaged.
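The "find the correct pitch" step could be approached in many ways; the sketch below is one simple heuristic, not the method the answer has in mind. It assumes the x-positions of candidate separators are already available (for example from a Hough step), that most gaps between neighbours are genuine, and that the first and last detections are real box edges. The function name regular_grid and the sample positions are hypothetical.

```python
from statistics import median

def regular_grid(positions):
    """Estimate the box pitch from noisy separator positions and rebuild the grid."""
    positions = sorted(positions)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    # the median gap tolerates a minority of false or missing separators
    pitch = median(gaps)
    first, last = positions[0], positions[-1]
    count = round((last - first) / pitch)  # number of boxes spanned
    pitch = (last - first) / count         # refine so the grid ends on `last`
    return [first + i * pitch for i in range(count + 1)]

# 72 is a false positive and the separator near 110 was missed
detected = [10, 35, 60, 72, 85, 135, 160]
print(regular_grid(detected))
# [10.0, 35.0, 60.0, 85.0, 110.0, 135.0, 160.0]
```

Detections that lie far from every grid line can then be treated as false positives, and grid lines with no nearby detection as missed separators.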
Currently, I use the following piece of code to create mask images (classes = ['tree', 'car', 'bicycle'], polygons is the list of the geometry objects where each geometry object has coordinates field that defines the polygon on the image that is a bounding box for the class object):
def create_mask(self, mask_size, classes, polygons):
    # type (Tuple[int, int], List[str], List[geometry]) -> Image
    # Create a new palette image, the default color of Image.new() is black
    # https://pillow.readthedocs.io/en/3.3.x/handbook/concepts.html#modes
    img = Image.new('P', mask_size)
    img.putpalette(self.palette) # palette = [0, 0, 0, 255, 0, 0, ...]
    draw = ImageDraw.Draw(img)
    for i, class_ in enumerate(classes):
        color_index = self.class_to_color_index[class_]
        draw.polygon(xy=polygons[i].exterior.coords, fill=color_index)
    del draw
    return img
Is there any way to rewrite this piece of code with using features.rasterize?
Can anyone please tell me the name of a website, or any other place, where I can get the upper and lower HSV ranges of basic colours like
yellow, green, red, blue, black, white, orange
I am making a bot which first follows a black line; in the middle of that line there is a marker of another colour, from which three lines of different colours branch off. The bot needs to decide which line to follow.
For that I need the proper HSV ranges of these colours.
Inspired by the answer at the answers.opencv link.
According to the docs here,
the HSV ranges in OpenCV are H from 0-179 and S and V from 0-255,
so for your lower and upper ranges you can, for any given [h, s, v], use
[h-10, s-40, v-40] for lower
and
[h+10, s+10, v+40] for upper
for the yellow, green, red, blue, black, white and orange RGB values, once converted to HSV.
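Offsets like these can push a bound outside the valid OpenCV ranges, so it may help to clamp them. The helper below is a minimal sketch: the name hsv_bounds and the symmetric default offsets are assumptions, and it does not handle the hue wrap-around that matters for red, where a second range near the other end of the hue scale would be needed.

```python
def hsv_bounds(h, s, v, dh=10, ds=40, dv=40):
    """Return clamped (lower, upper) HSV bounds around one OpenCV HSV value."""
    # OpenCV uses H in 0-179 and S, V in 0-255 for 8-bit images
    lower = (max(h - dh, 0), max(s - ds, 0), max(v - dv, 0))
    upper = (min(h + dh, 179), min(s + ds, 255), min(v + dv, 255))
    return lower, upper

lower, upper = hsv_bounds(25, 200, 230)
print(lower, upper)  # (15, 160, 190) (35, 240, 255)
```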
Copied code from the example :
import cv2
import numpy as np
image_hsv = None # global ;(
pixel = (20,60,80) # some stupid default
# mouse callback function
def pick_color(event,x,y,flags,param):
    if event == cv2.EVENT_LBUTTONDOWN:
        # cast to int so the +/- offsets below cannot wrap around as uint8
        pixel = image_hsv[y,x].astype(int)
        #you might want to adjust the ranges(+-10, etc):
        upper = np.array([pixel[0] + 10, pixel[1] + 10, pixel[2] + 40])
        lower = np.array([pixel[0] - 10, pixel[1] - 10, pixel[2] - 40])
        print(pixel, lower, upper)
        image_mask = cv2.inRange(image_hsv,lower,upper)
        cv2.imshow("mask",image_mask)
def main():
    import sys
    global image_hsv, pixel # so we can use it in mouse callback
    image_src = cv2.imread(sys.argv[1]) # pick.py my.png
    if image_src is None:
        print ("the image read is None............")
        return
    cv2.imshow("bgr",image_src)
    ## NEW ##
    cv2.namedWindow('hsv')
    cv2.setMouseCallback('hsv', pick_color)
    # now click into the hsv img , and look at values:
    image_hsv = cv2.cvtColor(image_src,cv2.COLOR_BGR2HSV)
    cv2.imshow("hsv",image_hsv)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
if __name__=='__main__':
    main()
The code above is for when you want to pick the HSV range directly from the image or video you are capturing, by clicking on the desired color.
If you want to predefine your ranges, you can write a simple code snippet that uses the built-in Python library colorsys to convert RGB to HSV with the colorsys.rgb_to_hsv function
example in docs
Note that this function accepts RGB values in the range 0 to 1 only and also returns HSV values in the 0 to 1 range, so to use the values with OpenCV you need to rescale them
code snippet
import colorsys
'''
convert given rgb to hsv opencv format
'''
def rgb_hsv_converter(rgb):
    (r,g,b) = rgb_normalizer(rgb)
    hsv = colorsys.rgb_to_hsv(r,g,b)
    (h,s,v) = hsv_normalizer(hsv)
    # clamp the bands to OpenCV's valid ranges (H: 0-179, S and V: 0-255)
    upper_band = [min(h+10, 179), min(s+40, 255), min(v+40, 255)]
    lower_band = [max(h-10, 0), max(s-40, 0), max(v-40, 0)]
    return {
        'upper_band': upper_band,
        'lower_band': lower_band
    }
def rgb_normalizer(rgb):
    (r,g,b) = rgb
    return (r/255, g/255, b/255)
def hsv_normalizer(hsv):
    (h,s,v) = hsv
    # OpenCV stores 8-bit hue as degrees / 2, so scale by 180 rather than 360
    return (round(h*180) % 180, round(s*255), round(v*255))
rgb_hsv_converter((255, 165, 0))
will return
{'upper_band': [29, 255, 255], 'lower_band': [9, 215, 215]}
which is your orange hsv bands.