Why is NMSBoxes not eliminating multiple bounding boxes? - python-3.x

First of all, here is my code:
image = cv2.imread(filePath)
height, width, channels = image.shape
# Using blob function of OpenCV to preprocess image
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                             swapRB=True, crop=False)
# Detecting objects
net.setInput(blob)
outs = net.forward(output_layers)
# Showing information on the screen
class_ids = []
confidences = []
boxes = []
for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.7:
            # Object detected
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            # Rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)
indexes = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold=0.4, nms_threshold=0.8, top_k=1)
font = cv2.FONT_HERSHEY_PLAIN
colors = np.random.uniform(0, 255, size=(len(classes), 3))
labels = ['bicycle', 'car', 'motorbike', 'bus', 'truck']
for i in range(len(boxes)):
    if i in indexes:
        label = str(classes[class_ids[i]])
        if label in labels:
            x, y, w, h = boxes[i]
            color = colors[class_ids[i]]
            cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
            cv2.putText(image, label, (x, y + 30), font, 2, color, 3)
cv2.imshow(fileName, image)
My question is: isn't cv2.dnn.NMSBoxes supposed to eliminate multiple bounding boxes? Then why do I still get output like the sample below:
What I expected is something like below:
Did I do something wrong with my code? Is there any better alternative? Thank you very much for your help.

The process of NMS goes like this:
Input - a list of proposal boxes B, the corresponding confidence scores S and an overlap threshold N
Output - a list of filtered proposals D
Algorithm/steps:
1. Select the proposal with the highest confidence score, remove it from B and add it to the final proposal list D (initially D is empty).
2. Compare this proposal with all the remaining proposals: calculate the IOU (Intersection over Union) of this proposal with every other proposal. If the IOU is greater than the threshold N, remove that proposal from B.
3. Again take the proposal with the highest confidence from the remaining proposals in B, remove it from B and add it to D.
4. Once again calculate the IOU of this proposal with all the proposals in B and eliminate the boxes that have a higher IOU than the threshold.
5. Repeat this process until there are no more proposals left in B.
The threshold that is being referred to here is nothing but the nms_threshold.
In the cv2.dnn.NMSBoxes function, nms_threshold is the IOU threshold used in non-maximum suppression.
So with a large value like 0.8, a box will be removed only if it has an IOU of more than 0.8 with another box. Since two detections of the same object usually don't overlap that much, most boxes won't be removed. Reducing this value will make it easier to remove redundant detections.
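To make the steps above concrete, here is a minimal NumPy sketch of the same greedy NMS loop (this is not the OpenCV implementation, just the algorithm described above; boxes are assumed to be [x, y, w, h] as in the question):

import numpy as np

def nms(boxes, scores, iou_threshold):
    """Greedy NMS over boxes given as [x, y, w, h]; returns the kept indices."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IOU of box i with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 0] + boxes[i, 2], boxes[order[1:], 0] + boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 1] + boxes[i, 3], boxes[order[1:], 1] + boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        union = boxes[i, 2] * boxes[i, 3] + boxes[order[1:], 2] * boxes[order[1:], 3] - inter
        iou = inter / union
        # keep only boxes whose overlap with box i is below the threshold
        order = order[1:][iou <= iou_threshold]
    return keep

Applied to the question, that means lowering nms_threshold in the cv2.dnn.NMSBoxes call (for example to 0.4) so that strongly overlapping boxes actually get suppressed.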
Hope this makes sense
You can read more about Non-Maximum Suppression here

Related

Crop satellite image based on a historical image with OpenCV in Python

I have the following problem: I have a pair of images, one historical and one present-day satellite image, and as the historical image covers a smaller area I want to crop the satellite image. Here is the code I wrote for this:
import numpy as np
import cv2
import os
import imutils
import math

entries = os.listdir('../')
refImage = 0
histImages = []

def loadImage(index):
    referenceImage = cv2.imread("../" + 'ref_' + str(index) + '.png')
    top = int(0.5 * referenceImage.shape[0])   # shape[0] = rows
    bottom = top
    left = int(0.5 * referenceImage.shape[1])  # shape[1] = cols
    right = left
    referenceImage = cv2.copyMakeBorder(referenceImage, top, bottom, left, right, cv2.BORDER_CONSTANT, None, (0, 0, 0))
    counter = 0
    for entry in entries:
        if entry.startswith("image_" + str(index)):
            refImage = referenceImage.copy()
            histImage = cv2.imread("../" + entry)
            #histImages.append(img)
            points = np.loadtxt("H2OPM/" + "CP_" + entry[6:9] + ".txt", delimiter=",")
            vector_image1 = [points[0][0] - points[1][0], points[0][1] - points[1][1]]  # hist
            vector_image2 = [points[0][2] - points[1][2], points[0][3] - points[1][3]]  # ref
            angle = angle_between(vector_image1, vector_image2)
            hhist, whist, chist = histImage.shape
            rotatedImage = imutils.rotate(refImage, angle)
            x = int(points[0][2] - points[0][0])
            y = int(points[1][2] - points[1][0])
            crop_img = rotatedImage[x+left:x+left+hhist, y+top:y+top+whist]
            print("NewImageWidth:", (y+top+whist)-(y+top), (x+left+hhist)-(x+left))
            print(entry)
            print(x, y)
            counter += 1
            #histImage = cv2.line(histImage, (points[0][0], ), end_point, color, thickness)
            cv2.imwrite("../matchedImages/" + 'image_' + str(index) + "_" + str(counter) + '.png', histImage)
            #rotatedImage = cv2.line(rotatedImage, (), (), (0, 255, 0), 9)
            cv2.imwrite("../matchedImages/" + 'ref_' + str(index) + "_" + str(counter) + '.png', crop_img)
First, I load the original satellite image and pad it so I don't lose information due to the rotation. Second, I load one of the matched historical images as well as the matched keypoints of the two images (i.e. a list of x_hist, y_hist, x_present_day, y_present_day). Third, I compute the rotation angle between the two images (which works), fourth, I crop the image, and fifth, I save the images.
Problem: As stated the rotation works fine, but my program ends up cropping the wrong part of the image.
I think that, due to the rotation, the boundaries (i.e. left, right, top, bottom) are no longer correct and I think this is where my problem lies, but I am not sure how to fix this problem.
Information that might help:
The images are both scaled the same way (so one pixel = approx. 1m)
I have at least 6 keypoints for each image
I haven't looked at your code in detail, but could it be due to you mixing up the x's and y's? Check the OpenCV documentation to make sure the values you pass are in the correct order.
In my limited experience with OpenCV, it can be confusing because sometimes it asks for, for example, BGR instead of RGB values (in my program, not yours).
Also, you have a bunch of lists; make sure list[x][y] is not mixed up with list[y][x].
So I found the error in my computation. The bounding boxes of the cutout area were wrongly converted into the present-day image.
So this:
x = int(points[0][2] - points[0][0])
y = int(points[1][2] - points[1][0])
was replaced with this:
v = [pointBefore[0],pointBefore[1],1]
# Perform the actual rotation and return the image
calculated = np.dot(m,v)
newPoint = (int(calculated[0]- points[0][0]),int(calculated[1]- points[0][1]))
where m (= M) comes from the transformation:
def rotate_bound(image, angle):
    # grab the dimensions of the image and then determine the
    # center
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)
    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])
    # compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))
    # adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY
    # perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH)), M
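For reference, this is a minimal sketch of how the returned matrix M maps a point from the original image into the rotated image (the file name, angle and point coordinates are illustrative, not from the original data):

import numpy as np
import cv2

image = cv2.imread('example.png')          # illustrative path
rotated, M = rotate_bound(image, 30.0)     # M is the 2x3 affine matrix

point = (120, 80)                          # (x, y) in the original image
v = np.array([point[0], point[1], 1.0])    # homogeneous coordinates
new_point = M.dot(v)                       # (x', y') in the rotated image
print(int(new_point[0]), int(new_point[1]))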
Thanks.

Best-Fit without point interpolation

I have two sets of data: one is the nominal form, the other is the actual form. The problem arises when I wish to calculate the form error alone. It's a big problem when the two sets of data aren't "on top of each other", because that gives errors that also include positional error.
Both curves are read from a series of data points. The nominal shape (black) is made up of many arcs of different radii that are tangent to each other. It's the leading edge of an airfoil profile.
I have tried various "best-fit" methods I've found both here and wherever Google took me, but the problem is that they all smooth my "actual" data, so it gets modified and doesn't keep its actual form.
Is there any function in SciPy or any other Python library that can "simply" fit my two curves together without altering the actual shape?
I wish for the green curve with red dots to lie as much as possible on top of the black.
Might it be possible to calculate the center of gravity of both curves and then move the actual curve in x and y by the difference between the two centers? It might not be the ultimate solution, but it would get you closer.
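A minimal sketch of that idea, assuming both curves are given as (N, 2) arrays of (x, y) points (nominal and actual are placeholder names):

import numpy as np

def align_by_centroid(nominal, actual):
    """Translate the actual curve so its centroid matches the nominal one."""
    shift = nominal.mean(axis=0) - actual.mean(axis=0)
    return actual + shift   # pure translation, the shape itself is untouched

# usage: aligned = align_by_centroid(nominal, actual)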
Here is a solution assuming that the nominal form can be described as a conic, i.e. as a solution of the equation ax^2 + by^2 + cxy + dx + ey = 1. Then, a least-squares fit can be applied to find the coefficients (a, b, c, d, e).
import numpy as np
import matplotlib.pylab as plt
# Generate example data
t = np.linspace(-2, 2.5, 25)
e, theta = 0.5, 0.3 # ratio minor axis/major & orientation angle major axis
c, s = np.cos(theta), np.sin(theta)
x = c*np.cos(t) - s*e*np.sin(t)
y = s*np.cos(t) + c*e*np.sin(t)
# add noise:
xy = 4*np.vstack((x, y))
xy += .08 *np.random.randn(*xy.shape) + np.random.randn(2, 1)
# Least square fit by a generic conic equation
# a*x^2 + b*y^2 + c*x*y + d*x + e*y = 1
x, y = xy
x = x - x.mean()
y = y - y.mean()
M = np.vstack([x**2, y**2, x*y, x, y]).T
b = np.ones_like(x)
# solve M*w = b
w, res, rank, s = np.linalg.lstsq(M, b, rcond=None)
a, b, c, d, e = w
# Get x, y coordinates for the fitted ellipse:
# using polar coordinates
# x = r*cos(theta), y = r*sin(theta)
# for a given theta, the radius is obtained with the 2nd order eq.:
# (a*ct^2 + b*st^2 + c*ct*st)*r^2 + (d*ct + e*st)*r - 1 = 0
# with ct = cos(theta) and st = sin(theta)
theta = np.linspace(-np.pi, np.pi, 97)
ct, st = np.cos(theta), np.sin(theta)
A = a*ct**2 + b*st**2 + c*ct*st
B = d*ct + e*st
D = B**2 + 4*A
radius = (-B + np.sqrt(D))/2/A
# Graph
plt.plot(radius*ct, radius*st, '-k', label='fitted ellipse');
plt.plot(x, y, 'or', label='measured points');
plt.axis('equal'); plt.legend();
plt.xlabel('x'); plt.ylabel('y');

OpenCV: Segment each digit from the given image. Digits are written in each cell of a row matrix. Each cell is bounded by margins

I have been trying to recognise handwritten characters (digits/alphabet) from a form document. Form documents have 1-D rows of cells, where the applicant has to fill in their information within those bounded cells. However, I'm unable to segment the digits (currently my input consists only of digits) from the bounding boxes.
I went through the following steps:
Reading the image (as a grayscale image) via the "imread" method of OpenCV. Initial image size: 19 x 209 (in pixels).
pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
Resizing the image to 200% of its original size via the "resize" method of OpenCV. I used INTER_AREA interpolation. Resized image size: 38 x 418 (in pixels).
h,w = newImg.shape
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
Applied Canny edge detection.
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
Cropped the contours and saved them as images in 'BB' directory.
im2, contours, hierarchy = cv2.findContours(edgedImg.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/' + str(num).zfill(3) + '.jpg', new_img)
Entire code in summary:
pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
h,w = newImg.shape
print(newImg.shape)
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
print(resizedImg.shape)
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
im2, contours, hierarchy = cv2.findContours(edgedImg.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/' + str(num).zfill(3) + '.jpg', new_img)
Images produced are posted here:
https://imgur.com/a/GStIcdj
I had to double the image size because Canny edge detection was producing double edges for an object (however, it still does). I have also played with other OpenCV functionality like thresholding, Gaussian blur, dilate and erode, but all in vain.
# we need one more parameter for the date cell width, as this could differ between banks
def crop_image_data_from_date_field(image, new_start_h, new_end_h, new_start_w, new_end_w, cell_width):
    # for the date, each cell has the same height and width (here width: 25 px), so the coordinates are advanced by the cell width
    cropped_image_list = []
    starting_width = new_start_w
    for i in range(1, 9):  # the date has only 8 fields: DD/MM/YYYY
        cropped_img = image[new_start_h:new_end_h, new_start_w + 1:new_start_w + 22]
        new_start_w = starting_width + (i * cell_width)
        cropped_img = cv2.resize(cropped_img, (28, 28))
        image_name = 'cropped_date/cropped_' + str(i) + '.png'
        cv2.imwrite(image_name, cropped_img)
        cropped_image_list.append(image_name)
    # print('cropped_image_list : ', cropped_image_list, len(cropped_image_list))
    # rec_value = handwritten_digit_recog.recog_digits(cropped_image_list)
    recvd_value = custom_predict.predict_digit(cropped_image_list)
    # print('recvd val : ', recvd_value)
    return recvd_value
You need to specify each cell's width and its x, y, w, h.
I think this will help you.
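For example, a call might look like this (the coordinates and cell width below are made up for illustration; they depend on where the date field sits in your scanned form, and the function needs your own digit recogniser, referenced above as custom_predict):

import cv2

form = cv2.imread('crop/cropped000.jpg', 0)

# hypothetical pixel coordinates of the date row and per-cell width
date_value = crop_image_data_from_date_field(
    form,
    new_start_h=10, new_end_h=38,     # vertical extent of the date row
    new_start_w=5, new_end_w=205,     # horizontal extent of the date row
    cell_width=25,                    # width of one cell in pixels
)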

How to detect objects shape from images using python3?

I want to write code in Python 3 that detects object shapes in images.
I want to choose a pixel from an object in the given image and find the neighbouring pixels.
If they have the same RGB value, that means they are part of the object.
When a neighbouring pixel's RGB value differs from the original pixel by more than an adjustable amount, the algorithm should stop searching for neighbours. I think this will work unless the background and the object have the same color.
I have found a way to put the pixels with the same color in a rectangle, but this will not help me. I want to save just the shape of the object and put it in a different image.
For example,
If I want to start my algorithm from the middle of an object, let's say a black table with a white background, the algorithm will find pixels with the same color in any direction. When the neighbouring pixel's RGB values change by more than 30 units in one direction, the algorithm will stop going in that direction and start going in another direction, until I have the shape of the table.
I found code in another post that helps me determine regions of pixels with a shared value using PIL.
Thanks!
from collections import defaultdict
from PIL import Image, ImageDraw

def connected_components(edges):
    """
    Given a graph represented by edges (i.e. pairs of nodes), generate its
    connected components as sets of nodes.
    Time complexity is linear with respect to the number of edges.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    def component(node, neighbors=neighbors, seen=seen, see=seen.add):
        unseen = set([node])
        next_unseen = unseen.pop
        while unseen:
            node = next_unseen()
            see(node)
            unseen |= neighbors[node] - seen
            yield node
    return (set(component(node)) for node in neighbors if node not in seen)

def matching_pixels(image, test):
    """
    Generate all pixel coordinates where pixel satisfies test.
    """
    width, height = image.size
    pixels = image.load()
    for x in range(width):       # range, not xrange, for Python 3
        for y in range(height):
            if test(pixels[x, y]):
                yield x, y

def make_edges(coordinates):
    """
    Generate all pairs of neighboring pixel coordinates.
    """
    coordinates = set(coordinates)
    for x, y in coordinates:
        if (x - 1, y - 1) in coordinates:
            yield (x, y), (x - 1, y - 1)
        if (x, y - 1) in coordinates:
            yield (x, y), (x, y - 1)
        if (x + 1, y - 1) in coordinates:
            yield (x, y), (x + 1, y - 1)
        if (x - 1, y) in coordinates:
            yield (x, y), (x - 1, y)
        yield (x, y), (x, y)

def boundingbox(coordinates):
    """
    Return the bounding box of all coordinates.
    """
    xs, ys = zip(*coordinates)
    return min(xs), min(ys), max(xs), max(ys)

def disjoint_areas(image, test):
    """
    Return the bounding boxes of all non-consecutive areas
    whose pixels satisfy test.
    """
    for each in connected_components(make_edges(matching_pixels(image, test))):
        yield boundingbox(each)

def is_black_enough(pixel):
    r, g, b = pixel
    return r < 10 and g < 10 and b < 10

if __name__ == '__main__':
    image = Image.open('some_image.jpg')
    draw = ImageDraw.Draw(image)
    for rect in disjoint_areas(image, is_black_enough):
        draw.rectangle(rect, outline=(255, 0, 0))
    image.show()
Try using OpenCV with Python.
With OpenCV you can do advanced image analysis, and there are many tutorials on how to use it, for example:
http://www.pyimagesearch.com/2014/04/21/building-pokedex-python-finding-game-boy-screen-step-4-6/
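As a minimal sketch of that suggestion, cv2.floodFill grows a region from a seed pixel with an adjustable color tolerance, which is essentially the neighbour search the question describes (the file name, seed point and tolerance of 30 are illustrative assumptions):

import cv2
import numpy as np

image = cv2.imread('some_image.jpg')
h, w = image.shape[:2]
mask = np.zeros((h + 2, w + 2), np.uint8)   # floodFill needs a mask 2 px larger than the image

seed = (w // 2, h // 2)                     # start from the middle of the object
tolerance = (30, 30, 30)                    # allowed difference per channel
cv2.floodFill(image, mask, seed, newVal=(0, 0, 255),
              loDiff=tolerance, upDiff=tolerance,
              flags=cv2.FLOODFILL_MASK_ONLY)

# mask[1:-1, 1:-1] is now 1 wherever the connected region (the object's shape) is
shape_only = (mask[1:-1, 1:-1] * 255).astype(np.uint8)
cv2.imwrite('shape.png', shape_only)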

Selecting colors that are furthest apart

I'm working on a project that requires me to select "unique" colors for each item. At times there could be upwards of 400 items. Is there some way out there of selecting the 400 colors that differ the most? Is it as simple as just changing the RGB values by a fixed increment?
You could come up with an equal distribution of 400 colours by incrementing red, green and blue in turn by 34.
That is:
You know you have three colour channels: red, green and blue
You need 400 distinct combinations of R, G and B
So on each channel the number of increments you need is the cube root of 400, i.e. about 7.36
To span the range 0..255 with 7.36 increments, each increment must be about 255/7.36, i.e. about 34
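A quick sketch of that idea (stepping each channel in increments of roughly 255/7.36 ≈ 34 and keeping the first 400 combinations):

# build an evenly spaced RGB grid and keep the first 400 colors
step = 34
levels = list(range(0, 256, step))    # 0, 34, 68, ..., 238 (8 levels per channel)
colors = [(r, g, b) for r in levels for g in levels for b in levels][:400]
print(len(colors), colors[:3])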
Probably HSL or HSV would be a better representation than RGB for this task.
You may find that changing the hue gives better perceived variability to the eye, so adjust your increments so that for every X units changed in S and L you change Y (with Y < X) units of hue, and tune X and Y so you cover the spectrum with your desired number of samples.
Here is my final code. Hopefully it helps someone down the road.
from PIL import Image, ImageDraw
import math, colorsys, os.path

# number of color circles needed
qty = 400

# the lowest value (V in HSV) can go
vmin = 30

# calculate how much to increment value by
vrange = 100 - vmin
if (qty >= 72):
    vdiff = math.floor(vrange / (qty / 72))
else:
    vdiff = 0

# set options
sizes = [16, 24, 32]
border_color = '000000'
border_size = 3

# initialize variables
hval = 0
sval = 50
vval = vmin
count = 0

while count < qty:
    im = Image.new('RGBA', (100, 100), (0, 0, 0, 0))
    draw = ImageDraw.Draw(im)
    draw.ellipse((5, 5, 95, 95), fill='#'+border_color)
    r, g, b = colorsys.hsv_to_rgb(hval/360.0, sval/100.0, vval/100.0)
    r = int(r*255)
    g = int(g*255)
    b = int(b*255)
    draw.ellipse((5+border_size, 5+border_size, 95-border_size, 95-border_size), fill=(r, g, b))
    del draw
    hexval = '%02x%02x%02x' % (r, g, b)
    for size in sizes:
        result = im.resize((size, size), Image.ANTIALIAS)
        result.save(str(qty)+'/'+hexval+'_'+str(size)+'.png', 'PNG')
    if hval + 10 < 360:
        hval += 10
    else:
        if sval == 50:
            hval = 0
            sval = 100
        else:
            hval = 0
            sval = 50
            vval += vdiff
    count += 1
Hey, I came across this problem a few times in my projects where I wanted to display, say, clusters of points. I found that the best way to go was to use the colormaps from matplotlib (https://matplotlib.org/stable/tutorials/colors/colormaps.html) and
colors = plt.get_cmap("hsv")(np.linspace(0, 1, n_colors))
(note that a colormap is called, not indexed). This will output RGBA colors, so you can get the RGB values with just
rgb = colors[:, :3]
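Put together as a runnable sketch (n_colors is whatever count you need, e.g. 400):

import numpy as np
import matplotlib.pyplot as plt

n_colors = 400
colors = plt.get_cmap("hsv")(np.linspace(0, 1, n_colors))  # (n_colors, 4) RGBA array
rgb = colors[:, :3]                                        # drop the alpha channel
print(rgb.shape)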
