Crop satellite image based on a historical image with OpenCV in Python

I have the following problem: I have a pair of images, one historical and one present-day satellite image, and since the historical image covers a smaller area I want to crop the satellite image to match. Here is the code I wrote for this:
import numpy as np
import cv2
import os
import imutils
import math
entries = os.listdir('../')
refImage = 0
histImages = []
def loadImage(index):
    referenceImage = cv2.imread("../" + 'ref_' + str(index) + '.png')

    top = int(0.5 * referenceImage.shape[0])   # shape[0] = rows
    bottom = top
    left = int(0.5 * referenceImage.shape[1])  # shape[1] = cols
    right = left

    referenceImage = cv2.copyMakeBorder(referenceImage, top, bottom, left, right, cv2.BORDER_CONSTANT, None, (0,0,0))

    counter = 0
    for entry in entries:
        if entry.startswith("image_" + str(index)):
            refImage = referenceImage.copy()
            histImage = cv2.imread("../" + entry)
            #histImages.append(img)

            points = np.loadtxt("H2OPM/" + "CP_" + entry[6:9] + ".txt", delimiter=",")

            vector_image1 = [points[0][0] - points[1][0], points[0][1] - points[1][1]]  #hist
            vector_image2 = [points[0][2] - points[1][2], points[0][3] - points[1][3]]  #ref

            angle = angle_between(vector_image1, vector_image2)

            hhist, whist, chist = histImage.shape

            rotatedImage = imutils.rotate(refImage, angle)

            x = int(points[0][2] - points[0][0])
            y = int(points[1][2] - points[1][0])

            crop_img = rotatedImage[x+left:x+left+hhist, y+top:y+top+whist]

            print("NewImageWidth:", (y+top+whist)-(y+top), (x+left+hhist)-(x+left))
            print(entry)
            print(x, y)

            counter += 1

            #histImage = cv2.line(histImage, (points[0][0], ), end_point, color, thickness)
            cv2.imwrite("../matchedImages/" + 'image_' + str(index) + "_" + str(counter) + '.png', histImage)

            #rotatedImage = cv2.line(rotatedImage, (), (), (0, 255, 0), 9)
            cv2.imwrite("../matchedImages/" + 'ref_' + str(index) + "_" + str(counter) + '.png', crop_img)
First, I load the original satellite image and pad it so that no information is lost during rotation; second, I load one of the matched historical images along with the matched keypoints of the two images (i.e. a list of x_hist, y_hist, x_present_day, y_present_day); third, I compute the rotation angle between the two images (which works); fourth, I crop the image; and fifth, I save the images.
Problem: As stated the rotation works fine, but my program ends up cropping the wrong part of the image.
I think that, due to the rotation, the boundaries (i.e. left, right, top, bottom) are no longer correct, and this is where my problem lies, but I am not sure how to fix it.
Information that might help:
The images are both scaled the same way (so one pixel = approx. 1m)
I have at least 6 keypoints for each image
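For reference, the code above calls angle_between, which isn't shown. A typical implementation (this is an assumption about what was used; the sign convention may need flipping depending on how imutils.rotate interprets the angle) is:
import numpy as np

def angle_between(v1, v2):
    # signed angle, in degrees, from v1 to v2
    a1 = np.arctan2(v1[1], v1[0])
    a2 = np.arctan2(v2[1], v2[0])
    return np.degrees(a2 - a1)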

I haven't looked at your code yet, but could it be that you are mixing up the x's and y's? Check the OpenCV documentation to make sure the variables you import are in the correct order.
In my limited time and experience with OpenCV, it can be quite confusing because it sometimes asks for, for example, BGR instead of RGB values (in my program, not yours).
Also, you seem to have a bunch of lists; make sure list[x][y] is not mixed up as list[y][x].

So I found the error in my computation. The bounding boxes of the cutout area were wrongly converted into the present-day image.
So this:
x = int(points[0][2] - points[0][0])
y = int(points[1][2] - points[1][0])
was replaced with this:
v = [pointBefore[0], pointBefore[1], 1]
# map the point through the rotation matrix
calculated = np.dot(m, v)
newPoint = (int(calculated[0] - points[0][0]), int(calculated[1] - points[0][1]))
where m(=M) is from the transformation:
def rotate_bound(image, angle):
    # grab the dimensions of the image and then determine the
    # center
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)

    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])

    # compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))

    # adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY

    # perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH)), M
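For completeness, a minimal sketch of how these pieces fit together with the question's variables (refImage, angle, points, hhist, whist); treat it as an illustration rather than the exact pipeline:
# rotate the padded reference image and keep the transform M
rotatedImage, M = rotate_bound(refImage, angle)

# map the matched reference keypoint through M, then subtract the corresponding
# historical keypoint to get the top-left corner of the crop window
pointBefore = (points[0][2], points[0][3])   # x, y in the reference image
v = [pointBefore[0], pointBefore[1], 1]
calculated = np.dot(M, v)
newPoint = (int(calculated[0] - points[0][0]), int(calculated[1] - points[0][1]))

# cut out a window the size of the historical image (rows are y, columns are x)
crop_img = rotatedImage[newPoint[1]:newPoint[1] + hhist,
                        newPoint[0]:newPoint[0] + whist]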
Thanks.

Related

How to translate points on image after cropping it and resizing it?

I am creating a program which allows a user to annotate images with points.
The program allows the user to zoom in on an image so they can annotate more precisely.
The program zooms in on an image by doing the following:
Find the center of image
Find minimum and maximum coordinates of new cropped image relative to center
Crop image
Resize the image to original size
For this I have written the following Python code:
import cv2
def zoom_image(original_image, cut_off_percentage, list_of_points):
    height, width = original_image.shape[:2]
    center_x, center_y = int(width/2), int(height/2)
    half_new_width = center_x - int(center_x * cut_off_percentage)
    half_new_height = center_y - int(center_y * cut_off_percentage)
    min_x, max_x = center_x - half_new_width, center_x + half_new_width
    min_y, max_y = center_y - half_new_height, center_y + half_new_height
    # I want to include max coordinates in new image, hence +1
    cropped = original_image[min_y:max_y+1, min_x:max_x+1]
    new_height, new_width = cropped.shape[:2]
    resized = cv2.resize(cropped, (width, height))
    translate_points(list_of_points, height, width, new_height, new_width, min_x, min_y)
I want to resize the image back to the original width and height so the user always works on the same "surface" regardless of how zoomed in the image is.
The problem I encounter is how to correctly scale the points (annotations) when doing this. My algorithm was the following:
Translate points on original image by subtracting min_x from x coordinate and min_y from y coordinate
Calculate constants for scaling x and y coordinates of points
Multiply coordinates by constants
For this I use the following Python code:
import cv2
def translate_points(list_of_points, height, width, new_height, new_width, min_x, min_y):
    # Calculate constants for scaling points
    scale_x, scale_y = width / new_width, height / new_height
    # Translate and scale points
    for point in list_of_points:
        point.x = (point.x - min_x) * scale_x
        point.y = (point.y - min_y) * scale_y
This code doesn't work. If I zoom in once, the pixel offset is hard to detect, but it is there. If I keep zooming in, the "drift" of the points becomes much easier to see. Here are images to provide examples. On the original image (1440x850) I placed a point in the middle of the blue crosshair. The more I zoom in, the easier it is to see that the algorithm doesn't work with bigger cut-offs.
Original image. The blue crosshair is the middle point of the image. The red angle marks indicate what the borders will be after the image is zoomed once.
Image after zooming in once.
Image after zooming in 5 times. Clearly, the green point is no longer in the middle of the image.
The cut_off_percentage I used is 15% (meaning that I keep 85% of width and height of original image, calculated from the center).
I have also tried the following library: Augmentit python library
The library has functions for cropping and resizing images together with points. It also causes the points to drift, which is expected, since the code I implemented and the library's functions use the same algorithm.
Additionally, I have checked whether this is a rounding problem. It is not. The library rounds the points after multiplying the coordinates by the scales. Regardless of how they are rounded, the points are still off by 4-5 px, and the error increases the more I zoom in.
EDIT: A more detailed explanation is given here since I didn't understand the given answer.
The following is an image of right human hand.
Image of a hand in my program
The original dimensions of this image are 1440 pixels in width and 850 pixels in height. As you can see in this image, I have annotated the right wrist at location (756.0, 685.0). To check whether my program works correctly, I opened this exact image in GIMP and placed a white point at location (756.0, 685.0). The result is the following:
Image of a hand in GIMP
The coordinates in my program are correct. Now, if I calculate the parameters given in the first answer according to its code, I get the following:
vec = [756, 685]
hh = 425
hw = 720
cov = [720, 425]
These parameters make sense to me. Now I want to zoom the image to a scale of 1.15. I crop the image by choosing the center point and calculating low and high values which indicate what rectangle of the image to keep and what to cut. In the following image you can see what is kept after cutting (everything inside the red rectangle).
What is kept when cutting
Lows and highs when cutting are:
xb = [95,1349]
yb = [56,794]
Size of cropped image: 1254 x 738
This cropped image is then resized back to the original size. However, when I do that, my annotation gets completely wrong coordinates when using the parameters described above.
After zoom
This is the code I used to crop, resize and rescale points, based on the first answer:
width, height = image.shape[:2]
center_x, center_y = int(width / 2), int(height / 2)
scale = 1.15
scaled_width = int(center_x / scale)
scaled_height = int(center_y / scale)
xlow = center_x - scaled_width
xhigh = center_x + scaled_width
ylow = center_y - scaled_height
yhigh = center_y + scaled_height
xb = [xlow, xhigh]
yb = [ylow, yhigh]
cropped = image[yb[0]:yb[1], xb[0]:xb[1]]
resized = cv2.resize(cropped, (width, height), cv2.INTER_CUBIC)

# Rescaling points
cov = (width / 2, height / 2)
width, height = resized.shape[:2]
hw = width / 2
hh = height / 2
for point in points:
    x, y = point.scx, point.scy
    x -= xlow
    y -= ylow
    x -= cov[0] - (hw / scale)
    y -= cov[1] - (hh / scale)
    x *= scale
    y *= scale
    x = int(x)
    y = int(y)
    point.set_coordinates(x, y)
So this really is an integer rounding issue. It's magnified at high zoom levels because being off by 1 pixel at 20x zoom throws you off much further. I tried out two versions of my crop-and-zoom GUI: one with int rounding, another without.
You can see that the one with int rounding keeps approaching the correct position as the zoom grows, but as soon as the zoom takes another step, it rebounds back to being wrong. The non-rounded version sticks right up against the mid-lines (denoting the proper position) the whole time.
Note that the resized rectangle (the one drawn on the non-zoomed image) blurs past the midlines. This is because of the resize interpolation from OpenCV. The yellow rectangle that I'm using to check that my points are correctly scaling is redrawn on the zoomed frame so it stays crisp.
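To make the drift concrete with made-up numbers (not taken from the question's images): truncating the half-width before subtracting shifts the crop origin by up to one source pixel, and that error gets multiplied back up by the zoom factor.
hw, cov_x, scale = 600, 600, 7.0
x = 580.0                                           # an annotated x coordinate

x_int   = (x - (cov_x - int(hw / scale))) * scale   # (580 - 515) * 7 = 455
x_exact = (x - (cov_x - hw / scale)) * scale        # (580 - 514.285...) * 7 = 460.0
print(x_int, x_exact)                               # already off by 5 px at 7x zoom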
With Int Rounding
Without Int Rounding
I have the center-of-view locked to the bottom right corner of the rectangle for this demo.
import cv2
import numpy as np
# clamp value
def clamp(val, low, high):
    if val < low:
        return low;
    if val > high:
        return high;
    return val;

# bound the center-of-view
def boundCenter(cov, scale, hh, hw):
    # scale half res
    scaled_hw = int(hw / scale);
    scaled_hh = int(hh / scale);

    # bound
    xlow = scaled_hw;
    xhigh = (2*hw) - scaled_hw;
    ylow = scaled_hh;
    yhigh = (2*hh) - scaled_hh;
    cov[0] = clamp(cov[0], xlow, xhigh);
    cov[1] = clamp(cov[1], ylow, yhigh);

# do a zoomed view
def zoomView(orig, cov, scale, hh, hw):
    # calculate crop
    scaled_hh = int(hh / scale);
    scaled_hw = int(hw / scale);
    xlow = cov[0] - scaled_hw;
    xhigh = cov[0] + scaled_hw;
    ylow = cov[1] - scaled_hh;
    yhigh = cov[1] + scaled_hh;
    xb = [xlow, xhigh];
    yb = [ylow, yhigh];

    # crop and resize
    copy = np.copy(orig);
    crop = copy[yb[0]:yb[1], xb[0]:xb[1]];
    display = cv2.resize(crop, (width, height), cv2.INTER_CUBIC);
    return display;

# draw vector shape
def drawVec(img, vec, pos, cov, hh, hw, scale):
    con = [];
    for point in vec:
        # unpack point
        x,y = point;
        x += pos[0];
        y += pos[1];

        # here's the int version
        # Note: this is the same as xlow and ylow from the above function
        # x -= cov[0] - int(hw / scale);
        # y -= cov[1] - int(hh / scale);

        # rescale point
        x -= cov[0] - (hw / scale);
        y -= cov[1] - (hh / scale);
        x *= scale;
        y *= scale;
        x = int(x);
        y = int(y);

        # add
        con.append([x,y]);
    con = np.array(con);
    cv2.drawContours(img, [con], -1, (0,200,200), -1);
# font stuff
font = cv2.FONT_HERSHEY_SIMPLEX;
fontScale = 1;
fontColor = (255, 100, 0);
thickness = 2;
# draw blank
res = (800,1200,3);
blank = np.zeros(res, np.uint8);
print(blank.shape);
# draw a rectangle on the original
cv2.rectangle(blank, (100,100), (400,200), (200,150,0), -1);
# vectored shape
# comparison shape
bshape = [[100,100], [400,100], [400,200], [100,200]];
bpos = [0,0]; # offset
# random shape
vshape = [[148, 89], [245, 179], [299, 67], [326, 171], [385, 222], [291, 235], [291, 340], [229, 267], [89, 358], [151, 251], [57, 167], [167, 164]];
vpos = [100,100]; # offset
# get original image res
height, width = blank.shape[:2];
hh = int(height / 2);
hw = int(width / 2);
# center of view
cov = [600, 400];
camera_spd = 5;
# scale
scale = 1;
scale_step = 0.2;
# loop
done = False;
while not done:
    # crop and show image
    display = zoomView(blank, cov, scale, hh, hw);
    # drawVec(display, vshape, vpos, cov, hh, hw, scale);
    drawVec(display, bshape, bpos, cov, hh, hw, scale);

    # draw a dot in the middle
    cv2.circle(display, (hw, hh), 4, (0,0,255), -1);

    # draw center lines
    cv2.line(display, (hw,0), (hw,height), (0,0,255), 1);
    cv2.line(display, (0,hh), (width,hh), (0,0,255), 1);

    # draw zoom text
    cv2.putText(display, "Zoom: " + str(scale), (15,40), font,
                fontScale, fontColor, thickness, cv2.LINE_AA);

    # show
    cv2.imshow("Display", display);
    key = cv2.waitKey(1);

    # check keys
    done = key == ord('q');

    # Note: if you're actually gonna make a GUI
    # use the keyboard module or something else for this
    # wasd to move center-of-view
    if key == ord('d'):
        cov[0] += camera_spd;
    if key == ord('a'):
        cov[0] -= camera_spd;
    if key == ord('w'):
        cov[1] -= camera_spd;
    if key == ord('s'):
        cov[1] += camera_spd;

    # z,x to decrease/increase zoom (lower bound is 1.0)
    if key == ord('x'):
        scale += scale_step;
    if key == ord('z'):
        scale -= scale_step;
    scale = round(scale, 2);

    # bound cov
    boundCenter(cov, scale, hh, hw);
Edit: Explanation of the drawVec parameters
img: The OpenCV image to be drawn on
vec: A list of [x,y] points
pos: The offset to draw those points at
cov: Center-Of-View, where the middle of our zoomed display is pointed at
hh: Half-Height, the height of "img" divided by 2
hw: Half-Width, the width of "img" divided by 2
I have looked through my code and realized where I was making a mistake which caused points to be offset.
In my program, I have a canvas of a specific size. The size of the canvas is a constant and is always larger than the images being drawn on it. When the program draws an image on the canvas, it first resizes that image so it fits on the canvas. The resized image is somewhat smaller than the canvas. The image is usually drawn starting from the top left corner of the canvas. Since I wanted to always draw the image in the center of the canvas, I shifted the location from the top left corner of the canvas to another point. This is what I didn't account for when doing the image zooming.
def zoom(image, ratio, points, canvas_off_x, canvas_off_y):
    width, height = image.shape[:2]
    new_width, new_height = int(ratio * width), int(ratio * height)
    center_x, center_y = int(new_width / 2), int(new_height / 2)
    radius_x, radius_y = int(width / 2), int(height / 2)
    min_x, max_x = center_x - radius_x, center_x + radius_x
    min_y, max_y = center_y - radius_y, center_y + radius_y
    img_resized = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_LINEAR)
    img_cropped = img_resized[min_y:max_y+1, min_x:max_x+1]
    for point in points:
        x, y = point.get_original_coordinates()
        x -= canvas_off_x
        y -= canvas_off_y
        x = int((x * ratio) - min_x + canvas_off_x)
        y = int((y * ratio) - min_y + canvas_off_y)
        point.set_scaled_coordinates(x, y)
In the code above, canvas_off_x and canvas_off_y are the offset from the top left corner of the canvas.
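A minimal way to exercise the function above; the Point class and the file name are stand-ins I made up, since the real program's classes aren't shown:
import cv2

class Point:
    # minimal stand-in for the annotation objects used by zoom()
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.scx, self.scy = x, y

    def get_original_coordinates(self):
        return self.x, self.y

    def set_scaled_coordinates(self, x, y):
        self.scx, self.scy = x, y

image = cv2.imread("hand.png")                  # any test image
annotations = [Point(756, 685)]
zoom(image, 1.15, annotations, canvas_off_x=0, canvas_off_y=0)
print(annotations[0].scx, annotations[0].scy)   # scaled coordinates after zooming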

Python OpenCV template matching gives poor results

I am trying to build a tool that would recognize poker cards from an online site.
I thought the task would be trivial.
Harvest all possible cards, paste them in one image, get an "unknown" card, run template matching on all_cards_image, harvest the points where it is matched, compare it to a dictionary with (x,y) = "Ah" and voila.
But template matching gives such hit-and-miss results: some single cards are not recognized at all, some are recognized as the 4 of hearts, 5 of hearts and 6 of hearts simultaneously, the 8 of clubs is recognized as the 8 of spades, and so on. When no card is present, it matches the whole template image.
I could not find a threshold that would lead to satisfactory results.
I am attaching the whole code as well as the images.
It is clear to me that I could go the route of recognizing only the rank of the card, maybe on a black-and-white image, masking the suits for color (blue, green, red, black) and matching them afterwards. I was just hoping there is a simple template matching solution without jumping through hoops.
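One hedged variation on the lookup described above: instead of thresholding every location with np.where, take the single best match per cropped card with cv2.minMaxLoc and map its position back to a card name. The card_positions dictionary and file names here are assumptions, not the actual sheet layout:
import cv2

all_cards = cv2.imread("all_cards.jpg", cv2.IMREAD_GRAYSCALE)
unknown = cv2.imread("unknown_card.png", cv2.IMREAD_GRAYSCALE)   # one cropped card

res = cv2.matchTemplate(all_cards, unknown, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(res)

# maps the known top-left corner of each card in all_cards.jpg to its name;
# the coordinates below are placeholders
card_positions = {(0, 0): "Ah", (95, 0): "Kh"}

def closest_card(loc, positions):
    # pick the catalogued corner nearest to the best-match location
    return min(positions, key=lambda p: (p[0] - loc[0])**2 + (p[1] - loc[1])**2)

if max_val > 0.5:   # tune on real screenshots
    print(card_positions[closest_card(max_loc, card_positions)])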
First image with all the cards (all_cards.jpg) in code:
Second image (random screenshot) with all other details changed to white so as not to make the image too big, named "Screenshot_to_test_card_recognition_reduced.png" in code:
import cv2
import numpy as np
import os
# defining constants for positions, card width and height in a screenshot
CARD_HEIGHT = 59
CARD_WIDTH = 90
HORIZONTAL_SPACE = 5
HORIZONTAL_OFFSET_2 = 1463
VERTICAL_OFFSET_2 = 1047
FIRST_CARD_X = 955
FIRST_CARD_Y = 370
FLOPS_STARTING_POINTS = [(FIRST_CARD_X, FIRST_CARD_Y), (FIRST_CARD_X + HORIZONTAL_OFFSET_2, FIRST_CARD_Y), \
(FIRST_CARD_X, FIRST_CARD_Y + VERTICAL_OFFSET_2), \
(FIRST_CARD_X + HORIZONTAL_OFFSET_2, FIRST_CARD_Y + VERTICAL_OFFSET_2)]
# show image function
def show_image(img, title = "Unnamed"):
    cv2.imshow(title, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
# INSERT FOLDER WITH FILES
saved_folder = "D:\\WinPython 32 3.4\\WinPython-32bit-3.4.4.6Qt5\\notebooks\\Named cards"
template_image = os.path.join(saved_folder, 'all_cards.jpg')
screenshot = os.path.join(saved_folder, 'Screenshot_to_test_card_recognition_reduced.png')
img_rgb = cv2.imread(screenshot)
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
CORRECTION_FOR_SEARCH_WIDTH = 0 #60
CORRECTION_FOR_SEARCH_HEIGTH = 0 #20
# LOOP THROUGH ALL FIVE CARDS FOR EACH FLOP ON 4 TABLES
for point in FLOPS_STARTING_POINTS:
    flop_starting_x, flop_starting_y = point
    for card in range(0, 5):
        start_x = flop_starting_x + ((HORIZONTAL_SPACE) * card) + ((CARD_WIDTH) * card)
        end_x = start_x + CARD_WIDTH - CORRECTION_FOR_SEARCH_WIDTH
        start_y = flop_starting_y
        end_y = start_y + CARD_HEIGHT - CORRECTION_FOR_SEARCH_HEIGTH
        template = img_gray[start_y:end_y, start_x:end_x]
        w, h = template.shape[::-1]
        show_image(template)
        all_cards_image = cv2.imread(template_image)
        grey_all_cards_image = cv2.cvtColor(all_cards_image, cv2.COLOR_BGR2GRAY)
        res = cv2.matchTemplate(grey_all_cards_image, template, cv2.TM_CCOEFF_NORMED)
        threshold = 0.85
        loc = np.where(res >= threshold)
        for pt in zip(*loc[::-1]):
            cv2.rectangle(all_cards_image, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
        print(loc)
        show_image(all_cards_image)

OpenCV: Segment each digit from the given image. Digits are written in each cell of a row matrix. Each cell is bounded by margins

I have been trying to recognise handwritten characters (digits/alphabet) from a form document. Form documents have 1-D rows of cells, where the applicant has to fill in their information within those bounded cells. However, I'm unable to segment the digits (currently my input consists only of digits) from the bounding boxes.
I went through the following steps:
Reading the image (as a grayscale image) via the "imread" method of OpenCV. Initial image size: 19 x 209 (in pixels).
pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
Resizing the image to 200% of its original size via the "resize" method of OpenCV. I used INTER_AREA interpolation. Resized image size: 38 x 418 (in pixels).
h,w = newImg.shape
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
Applied Canny edge detection.
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
Cropped the contours and saved them as images in the 'BB' directory.
im2, contours, hierarchy = cv2.findContours(edgedImg.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/' + str(num).zfill(3) + '.jpg', new_img)
Entire code in summary:
pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
h,w = newImg.shape
print(newImg.shape)
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
print(resizedImg.shape)
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
im2, contours, hierarchy = cv2.findContours(edgedImg.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/' + str(num).zfill(3) + '.jpg', new_img)
Images produced are posted here:
https://imgur.com/a/GStIcdj
I had to double the image size because Canny edge detection was producing double edges for an object (however, it still does). I have also played with other OpenCV functionality like thresholding, Gaussian blur, dilation and erosion, but all in vain.
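For what it's worth, here is a sketch of one alternative to Canny that I would try: binarize with an inverted Otsu threshold, take only external contours, and sort the bounding boxes left to right so each crop corresponds to one cell. The file name and the size filter are assumptions:
import cv2

img = cv2.imread("crop/cropped000.jpg", 0)
img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

# inverted Otsu threshold: ink becomes white, background black
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# external contours only, so each digit gives one outline instead of a double edge
# (OpenCV 4.x returns two values here; 3.x returns three)
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

boxes = [cv2.boundingRect(c) for c in contours]
boxes = [b for b in boxes if b[2] > 3 and b[3] > 10]   # drop specks; tune for your scans
boxes.sort(key=lambda b: b[0])                         # left to right

for i, (x, y, w, h) in enumerate(boxes, 1):
    cv2.imwrite('BB/' + str(i).zfill(3) + '.jpg', img[y:y+h, x:x+w])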
# we need one more parameter for the date cell width, as this could be different for different banks
def crop_image_data_from_date_field(image, new_start_h, new_end_h, new_start_w, new_end_w, cell_width):
    # for the date, each cell has the same height and width (here width: 25 px), so coords are changed based on width
    cropped_image_list = []
    starting_width = new_start_w
    for i in range(1, 9):  # as the date has only 8 fields: DD/MM/YYYY
        cropped_img = image[new_start_h:new_end_h, new_start_w + 1:new_start_w + 22]
        new_start_w = starting_width + (i * cell_width)
        cropped_img = cv2.resize(cropped_img, (28, 28))
        image_name = 'cropped_date/cropped_' + str(i) + '.png'
        cv2.imwrite(image_name, cropped_img)
        cropped_image_list.append(image_name)
    # print('cropped_image_list : ', cropped_image_list, len(cropped_image_list))
    # rec_value = handwritten_digit_recog.recog_digits(cropped_image_list)
    recvd_value = custom_predict.predict_digit(cropped_image_list)
    # print('recvd val : ', recvd_value)
    return recvd_value
You need to specify each cell's width and its x, y, w, h.
I think this will help you.

Template Matching: efficient way to create mask for minMaxLoc?

Template matching in OpenCV is great. And you can pass a mask to cv2.minMaxLoc so that you only search (sort of) in part of the image for the template you want. You can also use a mask at the matchTemplate operation, but this only masks the template.
I want to find a template and I want to be assured that this template is within some other region of my image.
Calculating an accurate mask for minMaxLoc feels heavy. If you calculate a mask the easy way, it ignores the size of the template.
Examples are in order. My input images are shown below. They're a bit contrived. I want to find the candy bar, but only if it's completely inside the white circle of the clock face.
clock1
clock2
template
In clock1, the candy bar is inside the circular clock face and it's a "PASS". But in clock2, the candy bar is only partially inside the face and I want it to be a "FAIL". Here's a code sample for doing it the easy way. I use cv2.HoughCircles to find the clock face.
import numpy as np
import cv2
img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
template = cv2.imread('template.png')
t_h, t_w = template.shape[0:2] # template height and width
# find circle in gray image using Hough transform
circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1,
minDist = 150, param1 = 50, param2 = 70,
minRadius = 131, maxRadius = 200)
i = circles[0,0]
x0 = i[0]
y0 = i[1]
r = i[2]
# display circle on color image
cv2.circle(img,(x0, y0), r,(0,255,0),2)
# do the template match
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
# finally, here is the part that gets tricky. we want to find highest
# rated match inside circle and we'd like to use minMaxLoc
# make mask by drawing circle on zero array
mask = np.zeros(result.shape, dtype = np.uint8) # minMaxLoc will throw
# error w/o np.uint8
cv2.circle(mask, (x0, y0), r, color = 1, thickness = -1)
# call minMaxLoc
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result, mask = mask)
# draw found rectangle on img
if max_val > 0.4: # use 0.4 as threshold for finding candy bar
    cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)
cv2.imwrite('output.jpg', img)
output using clock1
output using clock2: finds candy bar even though part of it is outside circle
So to properly make a mask, I use a bunch of NumPy operations. I make four separate masks (one for each corner of the template bounding box) and then AND them together. I'm not aware of any convenience functions in OpenCV that would do the mask for me. I'm a little nervous that all of the array operations will be expensive. Is there a better way to do this?
h, w = result.shape[0:2]
# make arrays that hold x,y coords
grid = np.indices((h, w))
x = grid[1]
y = grid[0]
top_left_mask = np.hypot(x - x0, y - y0) - r < 0
top_right_mask = np.hypot(x + t_w - x0, y - y0) - r < 0
bot_left_mask = np.hypot(x - x0, y + t_h - y0) - r < 0
bot_right_mask = np.hypot(x + t_w - x0, y + t_h - y0) - r < 0
mask = np.logical_and.reduce((top_left_mask, top_right_mask,
bot_left_mask, bot_right_mask))
mask = mask.astype(np.uint8)
cv2.imwrite('mask.png', mask*255)
Here's what the "fancy" mask looks like:
Seems about right. It cannot be circular because of the template shape. If I run clock2.jpg with this mask I get:
It works. No candy bars are identified. But I wish I could do it in fewer lines of code...
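One shorter possibility, sketched here but not benchmarked against the images above: since the clock face is convex, eroding the circular mask with a (t_w + 1) x (t_h + 1) rectangle anchored at the top left keeps a location only if the whole template window fits inside the circle, which is essentially the same condition as the four corner checks.
mask = np.zeros(result.shape, dtype=np.uint8)
cv2.circle(mask, (x0, y0), r, color=1, thickness=-1)

# anchor (0, 0): mask[y, x] stays 1 only if every pixel of the window
# starting at (x, y) with the template's footprint lies inside the circle
kernel = np.ones((t_h + 1, t_w + 1), np.uint8)
mask = cv2.erode(mask, kernel, anchor=(0, 0),
                 borderType=cv2.BORDER_CONSTANT, borderValue=0)

min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result, mask=mask)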
EDIT:
I've done some profiling. I ran 100 cycles of the "easy" way and the "accurate" way and calculated frames per second (fps):
easy way: 12.7 fps
accurate way: 7.8 fps
so there is some price to pay for making the mask with NumPy. These tests were done on a relatively powerful workstation. It could get uglier on more modest hardware...
Method 1: 'mask' image before cv2.matchTemplate
Just for kicks, I tried to make my own mask of the image that I pass to cv2.matchTemplate to see what kind of performance I can achieve. To be clear, this isn't a proper mask -- I set all of the pixels to ignore to one color (black or white). This is to get around the fact that only TM_SQDIFF and TM_CCORR_NORMED support a proper mask.
@Alexander Reynolds makes a very good point in the comments that some care must be taken if the template image (the thing we're trying to find) has lots of black or lots of white. For many problems, we will know a priori what the template looks like and we can specify a white background or black background.
I use cv2.multiply, which seems to be faster than numpy.multiply. cv2.multiply has the added advantage that it automatically clips the results to the range 0 to 255.
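A quick sanity check of that saturation behavior on uint8 data (separate from the benchmark below):
import numpy as np
import cv2

a = np.uint8([[200]])
b = np.uint8([[2]])
print(cv2.multiply(a, b))   # [[255]]: saturated at 255
print(a * b)                # [[144]]: NumPy wraps around (400 % 256)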
import numpy as np
import cv2
import time
img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
template = cv2.imread('target.jpg')
t_h, t_w = template.shape[0:2] # template height and width
mask_background = 'WHITE'
start_time = time.time()
for i in range(100): # do 100 cycles for timing
    # find circle in gray image using Hough transform
    circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1,
                               minDist = 150, param1 = 50, param2 = 70,
                               minRadius = 131, maxRadius = 200)
    i = circles[0,0]
    x0 = i[0]
    y0 = i[1]
    r = i[2]

    # display circle on color image
    cv2.circle(img,(x0, y0), r,(0,255,0),2)

    if mask_background == 'BLACK': # black = 0, white = 255 on grayscale
        mask = np.zeros(img.shape, dtype = np.uint8)
    elif mask_background == 'WHITE':
        mask = 255*np.ones(img.shape, dtype = np.uint8)
    cv2.circle(mask, (x0, y0), r, color = (1,1,1), thickness = -1)

    img2 = cv2.multiply(img, mask) # element wise multiplication
                                   # values > 255 are truncated at 255

    # do the template match
    result = cv2.matchTemplate(img2, template, cv2.TM_CCOEFF_NORMED)

    # call minMaxLoc
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

    # draw found rectangle on img
    if max_val > 0.4:
        cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)
fps = 100/(time.time()-start_time)
print('fps ', fps)
cv2.imwrite('output.jpg', img)
Profiling results:
BLACK background 12.3 fps
WHITE background 12.1 fps
Using this method has very little performance hit relative to the 12.7 fps in the original question. However, it has the drawback that it will still find templates that stick over the edge a little bit. Depending on the exact nature of the problem, this may be acceptable in many applications.
Method 2: use cv2.boxFilter to create mask for minMaxLoc
In this technique, we start with a circular mask (as in the OP), but then modify it with cv2.boxFilter, changing the anchor from the default center of the kernel to the top left corner (0, 0).
import numpy as np
import cv2
import time
img = cv2.imread('clock1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
template = cv2.imread('target.jpg')
t_h, t_w = template.shape[0:2] # template height and width
print('t_h, t_w ', t_h, ' ', t_w)
start_time = time.time()
for i in range(100):
    # find circle in gray image using Hough transform
    circles = cv2.HoughCircles(gray, method = cv2.HOUGH_GRADIENT, dp = 1,
                               minDist = 150, param1 = 50, param2 = 70,
                               minRadius = 131, maxRadius = 200)
    i = circles[0,0]
    x0 = i[0]
    y0 = i[1]
    r = i[2]

    # display circle on color image
    cv2.circle(img,(x0, y0), r,(0,255,0),2)

    # do the template match
    result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

    # finally, here is the part that gets tricky. we want to find highest
    # rated match inside circle and we'd like to use minMaxLoc
    # start to make mask by drawing circle on zero array
    mask = np.zeros(result.shape, dtype = np.float32)
    cv2.circle(mask, (x0, y0), r, color = 1, thickness = -1)

    mask = cv2.boxFilter(mask,
                         ddepth = -1,
                         ksize = (t_w, t_h),
                         anchor = (0,0),
                         normalize = True,
                         borderType = cv2.BORDER_ISOLATED)

    # mask now contains values from zero to 1. we want to make anything
    # less than 1 equal to zero
    _, mask = cv2.threshold(mask, thresh = 0.9999,
                            maxval = 1.0, type = cv2.THRESH_BINARY)
    mask = mask.astype(np.uint8)

    # call minMaxLoc
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result, mask = mask)

    # draw found rectangle on img
    if max_val > 0.4:
        cv2.rectangle(img, max_loc, (max_loc[0]+t_w, max_loc[1]+t_h), (0,255,0), 4)
fps = 100/(time.time()-start_time)
print('fps ', fps)
cv2.imwrite('output.jpg', img)
This code gives a mask identical to the OP's, but at 11.89 fps. This technique gives us more accuracy with a slightly bigger performance hit than Method 1.

Generating a KML heatmap from given data set of [lat, lon, density]

I am looking to build a static KML (Google Earth markup) file which displays a heatmap-style rendering of a few given data sets in the form of [lat, lon, density] tuples.
A very straightforward data set I have is for population density.
My requirements are:
Must be able to feed in data for a given lat, lon
Must be able to specify the density of the data at that lat, lon
Must export to KML
The requirements are language agnostic for this project as I will be generating these files offline in order to build the KML used elsewhere.
I have looked at a few projects, most notably heatmap.py, which is a port of gheat in Python with KML export. I have hit a brick wall in the sense that the projects I have found to date all rely on building the heatmap from the density of [lat, lon] points fed into the algorithm.
If I am missing an obvious way to adapt my data set so that I can feed in just the [lat, lon] tuples while still accounting for the density values I have, I would love to know!
Hey Will, heatmap.py is me. Your request is a common-enough one and is on my list of things to address. I'm not quite sure yet how to do so in a general fashion; in heatmap.py parlance, it would be straightforward to have a per-point dotsize instead of a global dotsize as it is now, but I'm not sure that will address the true need. I'm aiming for a summer 2010 release, but you could probably make this mod yourself.
You may try searching for Kernel Density Estimator tools; that's what the statisticians call heatmaps. R has some good built-in tools you can use that might satisfy your need more quickly.
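If you want to stay in Python for that route, here is a minimal weighted-KDE sketch with SciPy; it assumes scipy >= 1.2 (which added the weights argument), and the sample data is made up:
import numpy as np
from scipy.stats import gaussian_kde

# rows of (lat, lon, density)
data = np.array([(51.5, -0.1, 9000.0), (48.9, 2.35, 21000.0), (40.7, -74.0, 11000.0)])

kde = gaussian_kde(data[:, :2].T, weights=data[:, 2])

# evaluate on a lat/lon grid; each cell value can then be colored and written to KML
lats = np.linspace(data[:, 0].min(), data[:, 0].max(), 200)
lons = np.linspace(data[:, 1].min(), data[:, 1].max(), 200)
grid = np.vstack([m.ravel() for m in np.meshgrid(lats, lons)])
heat = kde(grid).reshape(200, 200)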
good luck!
I updated the heatmap.py script so you can specify a density for each point. I uploaded my changes to my blog. Not sure if it'll do exactly what you want though!
Cheers,
Alex
Update [13 Nov 2020]
I archived my blog a while back, so the link no longer works, so for reference here are the changes:
the diff file:
--- __init__.py 2010-09-14 08:40:35.829079482 +0100
+++ __init__.py.mynew 2010-09-06 14:50:10.394447647 +0100
@@ -1,5 +1,5 @@
#heatmap.py v1.0 20091004
-from PIL import Image,ImageChops
+from PIL import Image,ImageChops,ImageDraw
import os
import random
import math
@@ -43,10 +43,13 @@
Most of the magic starts in heatmap(), see below for description of that function.
"""
def __init__(self):
+ self.minIntensity = 0
+ self.maxIntensity = 0
self.minXY = ()
self.maxXY = ()
+
- def heatmap(self, points, fout, dotsize=150, opacity=128, size=(1024,1024), scheme="classic"):
+ def heatmap(self, points, fout, dotsize=150, opacity=128, size=(4048,1024), scheme="classic", area=(-180,180,-90,90)):
"""
points -> an iterable list of tuples, where the contents are the
x,y coordinates to plot. e.g., [(1, 1), (2, 2), (3, 3)]
@@ -59,33 +62,41 @@
size -> tuple with the width, height in pixels of the output PNG
scheme -> Name of color scheme to use to color the output image.
Use schemes() to get list. (images are in source distro)
+ area -> specify the coordinates covered by the resulting image
+ (could create an image to cover area larger than the max/
+ min values given in the points list)
"""
-
+ print("Starting heatmap")
self.dotsize = dotsize
self.opacity = opacity
self.size = size
self.imageFile = fout
-
+
if scheme not in self.schemes():
tmp = "Unknown color scheme: %s. Available schemes: %s" % (scheme, self.schemes())
raise Exception(tmp)
- self.minXY, self.maxXY = self._ranges(points)
- dot = self._buildDot(self.dotsize)
+ self.minXY = (area[0],area[2])
+ self.maxXY = (area[1],area[3])
+ self.minIntensity, self.maxIntensity = self._intensityRange(points)
+
img = Image.new('RGBA', self.size, 'white')
- for x,y in points:
+ for x,y,z in points:
+ dot = self._buildDot(self.dotsize,z)
tmp = Image.new('RGBA', self.size, 'white')
tmp.paste( dot, self._translate([x,y]) )
img = ImageChops.multiply(img, tmp)
-
+ print("All dots built")
colors = colorschemes.schemes[scheme]
img.save("bw.png", "PNG")
+ print("Saved temp b/w image")
+ print("Colourising")
self._colorize(img, colors)
img.save(fout, "PNG")
-
+ print("Completed colourising and saved final image %s" % fout)
def saveKML(self, kmlFile):
"""
Saves a KML template to use with google earth. Assumes x/y coordinates
@@ -110,17 +121,19 @@
"""
return colorschemes.schemes.keys()
- def _buildDot(self, size):
+ def _buildDot(self, size,intensity):
""" builds a temporary image that is plotted for
each point in the dataset"""
+
+ intsty = self._calcIntensity(intensity)
+ print("building dot... %d: %f" % (intensity,intsty))
+
img = Image.new("RGB", (size,size), 'white')
- md = 0.5*math.sqrt( (size/2.0)**2 + (size/2.0)**2 )
- for x in range(size):
- for y in range(size):
- d = math.sqrt( (x - size/2.0)**2 + (y - size/2.0)**2 )
- rgbVal = int(200*d/md + 50)
- rgb = (rgbVal, rgbVal, rgbVal)
- img.putpixel((x,y), rgb)
+ draw = ImageDraw.Draw(img)
+ shade = 256/(size/2)
+ for x in range (int(size/2)):
+ colour = int(256-(x*shade*intsty))
+ draw.ellipse((x,x,size-x,size-x),(colour,colour,colour))
return img
def _colorize(self, img, colors):
@@ -139,7 +152,7 @@
rgba.append(alpha)
img.putpixel((x,y), tuple(rgba))
-
+
def _ranges(self, points):
""" walks the list of points and finds the
max/min x & y values in the set """
@@ -153,6 +166,23 @@
return ((minX, minY), (maxX, maxY))
+ def _calcIntensity(self,z):
+ return (z/self.maxIntensity)
+
+ def _intensityRange(self, points):
+ """ walks the list of points and finds the
+ max/min points of intensity
+ """
+ minZ = points[0][2]
+ maxZ = minZ
+
+ for x,y,z in points:
+ minZ = min(z, minZ)
+ maxZ = max(z, maxZ)
+
+ print("(minZ, maxZ):(%d, %d)" % (minZ,maxZ))
+ return (minZ, maxZ)
+
def _translate(self, point):
""" translates x,y coordinates from data set into
pixel offsets."""
and a demo script:
import heatmap
import random
import MySQLdb
import math
print "starting script..."
db = MySQLdb.connect(host="localhost", # your host, usually localhost
user="username", # your username
passwd="password", # your password
db="database") # name of the data base
cur = db.cursor()
minLng = -180
maxLng = 180
minLat = -90
maxLat = 90
# create and execute the query
query = "SELECT lat, lng, intensity FROM mytable \
WHERE %f<=tllat AND tllat<=%f \
AND %f<=tllng AND tllng<=%f" % (minLat,maxLat,minLng,maxLng)
cur.execute(query)
pts = []
# print all the first cell of all the rows
for row in cur.fetchall():
    print (row[1],row[0],row[2])
    # work out the mercator projection for latitude x = asinh(tan(x1))
    proj = math.degrees(math.asinh(math.tan(math.radians(row[0]))))
    print (row[1],proj,row[2])
    print "-"*15
    if (minLat < proj and proj < maxLat):
        pts.append((row[1],proj,row[2]))

print "Processing %d points..." % len(pts)
hm = heatmap.Heatmap()
hm.heatmap(pts, "bandwidth2.png",30,155,(1024,512),'fire',(minLng,maxLng,minLat,maxLat))
I think one way to do this is to create a (larger) list of tuples with each point repeated according to the density at that point. A point with a high density is represented by lots of points on top of each other while a point with a low density has few points. So instead of: [(120.7, 82.5, 2), (130.6, 81.5, 1)] you would use [(120.7, 82.5), (120.7, 82.5), (130.6, 81.5)] (a fairly dull dataset).
One possible issue is that your densities may well be floats, not integers, so you should normalize and round the data. One way to do the conversion is something like this:
def dens2points(dens_tups):
    min_dens = dens_tups[0][2]
    for tup in dens_tups:
        if (min_dens > tup[2]):
            min_dens = tup[2]
    print min_dens
    result = []
    for tup in dens_tups:
        for i in range(int(tup[2]/min_dens)):
            result.append((tup[0],tup[1]))
    return result

if __name__ == "__main__":
    input = [(10, 10, 20.0),(5, 5, 10.0),(10,10,0.9)]
    output = dens2points(input)
    print input
    print output
(which isn't very pythonic, but seems to work for the simple test case). This subroutine should convert your data into a form that is accepted by heatmap.py. With a little effort I think the subroutine can be reduced to two lines.
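For what it's worth, a compact version of the same idea (same behavior, including the integer truncation of the ratio):
def dens2points(dens_tups):
    min_dens = min(t[2] for t in dens_tups)
    return [(x, y) for x, y, d in dens_tups for _ in range(int(d / min_dens))]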
