I am using turfjs and leaflet to plot a grid and label each square in this fashion:
[A0,A1,...,A23]
[B0,B1,...,B23]
[C0,C1,...,C23]
End goal:
To know the coordinates of the corner points of each cell. That is, I want to know the coordinates of the 4 corners of A0 (and of the other cells). This will then be fed into a JSON file with something like this:
[
{"A0": [
["x","y"],
["x","y"],
["x","y"],
["x","y"]
]},
{"A1": [
["x","y"],
["x","y"],
["x","y"],
["x","y"]
]}
]
Then, my app will ask the device for its GPS position and learn which "square" I'm in.
I have managed to plot the squares (fiddle), but could not label them or even wire up a click handler that logs to the console so I could find out the corner coordinates. I have logged the layers to the console, but I'm not sure whether the GeoJSON layer is plotted from left to right. I have concluded that each layer spits out 5 coordinates, which I suspect is the information I require, but a 5th coordinate does not make sense for a square grid cell, unless the 3rd coordinate is the center...
I was able to figure out the mystery of the GeoJSON layer in Leaflet.
The coordinates are returned like this:
[ 0, 3 , 6 ]
[ 1, 4 , 7 ]
[ 2, 5 , 8 ]
//will label this way:
A0 = 0 (coordinate set at index 0)
A1 = 1 (coordinate set at index 1)
A2 = 2
B0 = 3
B1 = 4
B2 = 5
...
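Just to illustrate the mapping (a rough sketch in Python rather than the JavaScript used in the app, assuming the column-by-column ordering described above; with 24 cells per letter as in the original grid, rows would be 24):

def cell_label(index, rows=3):
    # Map a GeoJSON layer index to a grid label, assuming the coordinate
    # sets come out column by column: 0 -> A0, 1 -> A1, ..., rows -> B0, etc.
    return chr(ord('A') + index // rows) + str(index % rows)

print([cell_label(i) for i in range(9)])
# ['A0', 'A1', 'A2', 'B0', 'B1', 'B2', 'C0', 'C1', 'C2']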
I still don't know why there is a 5th coordinate in each layer plotted by Leaflet (most likely the GeoJSON polygon ring simply repeats the first coordinate as the last one to close the ring), but this is good enough for me. I can now label them as I want.
Thank you for the help.
How can we convert map objects (derived from ndarray objects) to a DataFrame or array object in Python?
I have normally distributed data of size 10*10 called a. There is another array containing 0s and 1s, also of size 10*10, called b. I want to add a to b where b is not zero, else return b.
I am doing it through map. I am able to create the map object c, but I can't see its contents. Can someone please help?
import numpy as np

a = np.random.normal(loc=0.0, scale=0.001, size=(10, 10))
b = np.random.randint(2, size=a.shape)
c = map(lambda x, y: y + x if y != 0 else x, a, b)
For example, if
a=[[.24,.03,.87],
[.45,.67,.34],
[.54,.32,.12]]
b=[[0,1,0],
[1,0,0],
[1,0,1]]
then c should be as shown below.
c=[[0,1.03,.87],
[1.45,0,0],
[1.54,0,1.12]
]
np.multiply(a,b) + b
should do it
Here is the output
array([[0. , 1.03, 0. ],
[1.45, 0. , 0. ],
[1.54, 0. , 1.12]])
Since a and b are NumPy arrays, there is a NumPy function for exactly this use case: np.where (documentation).
If a and b are as follows,
a=np.array([[.24,.03,.87],
[.45,.67,.34],
[.54,.32,.12]])
b=np.array([[0,1,0],
[1,0,0],
[1,0,1]])
Then the output of the following line,
np.where(b!=0, a+b, b)
will be,
[[0. 1.03 0. ]
[1.45 0. 0. ]
[1.54 0. 1.12]]
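As for inspecting the map object from the question itself: map() is lazy in Python 3, so it has to be materialized (for example with list()), and mapping over the 2-D arrays passes whole rows to the lambda, so the scalar test y != 0 breaks. A minimal sketch of going from the vectorized result to a DataFrame instead (the pandas import is my addition, not part of the original code):

import numpy as np
import pandas as pd

a = np.array([[.24, .03, .87],
              [.45, .67, .34],
              [.54, .32, .12]])
b = np.array([[0, 1, 0],
              [1, 0, 0],
              [1, 0, 1]])

c = np.where(b != 0, a + b, b)   # a + b where b is nonzero, else b
df = pd.DataFrame(c)             # plain ndarray -> DataFrame with default labels
print(df)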
I'm trying to stitch 2 images together. I use template matching to find 3 sets of points, which I pass to cv2.getAffineTransform() to get a warp matrix, which I then pass to cv2.warpAffine() to align my images.
However, when I join my images, most of the affine-warped image isn't shown. I've tried using different techniques to select points, changed the order of arguments, etc., but I can only ever get a thin sliver of the warped image to show.
Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.
This is the final result that I get. Here are the original images (1, 2) and the code that I use:
EDIT: Here are the contents of the variable trans:
array([[ 1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
[ 4.84461775e-03, 1.30769231e+00, 9.61912797e+02]])
And here are the points passed to cv2.getAffineTransform: unified_pair1
array([[ 671., 1024.],
[ 15., 979.],
[ 15., 962.]], dtype=float32)
unified_pair2
array([[ 669., 45.],
[ 18., 13.],
[ 18., 0.]], dtype=float32)
import cv2
import numpy as np
def showimage(image, name="No name given"):
    cv2.imshow(name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return
image_a = cv2.imread('image_a.png')
image_b = cv2.imread('image_b.png')
def get_roi(image):
    roi = cv2.selectROI(image)  # spacebar to confirm selection
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    crop = image[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])]
    return crop
temp_1 = get_roi(image_a)
temp_2 = get_roi(image_a)
temp_3 = get_roi(image_a)
def find_template(template, search_image_a, search_image_b):
    ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED)
    template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max())
    ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED)
    template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max())
    return template_loc_a, template_loc_b
coord_a1, coord_b1 = find_template(temp_1, image_a, image_b)
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b)
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b)
def unnest_list(coords_list):
    coords_list = [a[0] for a in coords_list]
    return coords_list
coord_a1 = unnest_list(coord_a1)
coord_b1 = unnest_list(coord_b1)
coord_a2 = unnest_list(coord_a2)
coord_b2 = unnest_list(coord_b2)
coord_a3 = unnest_list(coord_a3)
coord_b3 = unnest_list(coord_b3)
def unify_coords(coords1, coords2, coords3):
    unified = []
    unified.extend([coords1, coords2, coords3])
    return unified
# Create 2 lists, each containing 3 coordinate pairs
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3)
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3)
# Convert elements of lists to numpy arrays with data type float32
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32)
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32)
# Get result of the affine transformation
trans = cv2.getAffineTransform(unified_pair1, unified_pair2)
# Apply the affine transformation to original image
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0]))
result[0:image_b.shape[0], image_b.shape[1]:] = image_b
showimage(result)
cv2.imwrite('result.png', result)
Sources: Approach based on advice received here, this tutorial and this example from the docs.
July 12 Edit:
This post inspired GitHub repos providing functions to accomplish this task; one for a padded warpAffine() and another for a padded warpPerspective(). Check out the Python version or the C++ version.
Transformations shift the location of pixels
What any transformation does is takes your point coordinates (x, y) and maps them to new locations (x', y'):
[s*x']   [h1 h2 h3]   [x]
[s*y'] = [h4 h5 h6] * [y]
[ s  ]   [h7 h8  1]   [1]
where s is some scaling factor. You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies---(3, 3) transformation matrices---you don't need to scale for affine transformations (you don't even need to use homogeneous coordinates...but it's better to keep this discussion general).
Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So during this process, these new locations get recorded at some point. We'll need those locations to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.
Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully, (-10,0) if logic makes any sense. And in fact, it does:
transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]])
homg_pt = np.array([0,0,1])
new_homg_pt = transf.dot(homg_pt)
new_homg_pt /= new_homg_pt[2]
# new_homg_pt = [-10. 0. 1.]
Perfect! So we can figure out where all points map with a little linear algebra. We will need to get all the (x,y) points and put them into a huge array so that every single point is in its own column. Let's pretend our image is only 4x4.
h, w = src.shape[:2] # 4, 4
indY, indX = np.indices((h,w)) # similar to meshgrid/mgrid
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))
These lin_homg_pts now contain every homogeneous point:
[[ 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
Then we can do matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.
trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]
And now we have the transformed points:
[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
As we can see, everything is working as expected: we have shifted the x-values only, by -10.
Pixels can be shifted outside of your image bounds
Notice that these pixel locations are negative---they're outside of the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we'll get some pixel values way outside our original bounds. We don't care about every pixel value though; we just need to know how far outside the original image bounds the farthest pixels land, so that we can pad the original image by that much before displaying the warped image on it.
theta = 45*np.pi/180
transf = np.array([
[ np.cos(theta),np.sin(theta),0],
[-np.sin(theta),np.cos(theta),0],
[0.,0.,1.]])
print(transf)
trans_lin_homg_pts = transf.dot(lin_homg_pts)
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,
So we see that we can get pixel locations well outside our original image, both in the negative and positive directions. The minimum x value doesn't change because when a homography applies a rotation, it does it from the top-left corner. Now one thing to note here is that I've applied the transformation to all pixels in the image. But this is really unnecessary: you can simply warp the four corner points and see where they land.
Padding the destination image
Note that when you call cv2.warpAffine() you have to input the destination size. These transformed pixel values reference that size. So if a pixel gets mapped to (-10,0), it won't show up in the destination image. That means we'll have to make another homography with translations which shift all pixel locations to be positive, and then we can pad the image matrix to compensate for our shift. We'll also have to pad the original image on the bottom and the right if the homography moves points to positions beyond the image bounds.
In the recent example, the min x value is the same, so we need no horizontal shift. However, the min y value has dropped by about two pixels, so we need to shift the image two pixels down. First, let's create the padded destination image.
pad_sz = list(src.shape) # in case three channel
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# pad_sz = [6, 4, 3]
As we can see, the height increased from the original by two pixels to account for that shift.
Add translation to the transformation to shift all pixel locations to positive
Now, we need to create a new homography matrix to translate the warped image by the same amount that we shifted by. And to apply both transformations---the original and this new shift---we have to compose the two homographies (for an affine transformation, you can simply add the translation, but not for an homography). Additionally we need to divide by the last entry to make sure the scales are still proper (again, only for homographies):
anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0:
    anchorX = np.round(-minX).astype(int)
    transl_transf[0,2] += anchorX
if minY < 0:
    anchorY = np.round(-minY).astype(int)
    transl_transf[1,2] += anchorY
new_transf = transl_transf.dot(transf)
new_transf /= new_transf[2,2]
I also created the anchor points here, for where we will place the destination image into the padded matrix; they are shifted by the same amount that the homography will shift the image. So let's place the destination image inside the padded matrix:
dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
Warp with the new transformation into the padded image
All we have left to do is apply the new transformation to the source image (with the padded destination size), and then we can overlay the two images.
warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))
alpha = 0.3
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
Putting it all together
Let's create a function for this, since we were creating quite a few variables we don't need at the end. For inputs we need the source image, the destination image, and the original homography. And for outputs we simply want the padded destination image and the warped image. Note that in the examples we used a 3x3 homography, so we'd better make sure we send in 3x3 transforms instead of 2x3 affine or Euclidean warps. You can just add the row [0, 0, 1] at the bottom of any affine warp and you'll be fine.
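For instance (a small sketch; affine_2x3 is just a stand-in name for whatever cv2.getAffineTransform() returned):

transf_3x3 = np.vstack([affine_2x3, [0., 0., 1.]])  # promote a 2x3 affine warp to a 3x3 matrix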
def warpPerspectivePadded(src, dst, transf):
    src_h, src_w = src.shape[:2]
    lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]])

    trans_lin_homg_pts = transf.dot(lin_homg_pts)
    trans_lin_homg_pts /= trans_lin_homg_pts[2,:]

    minX = np.min(trans_lin_homg_pts[0,:])
    minY = np.min(trans_lin_homg_pts[1,:])
    maxX = np.max(trans_lin_homg_pts[0,:])
    maxY = np.max(trans_lin_homg_pts[1,:])

    # calculate the needed padding and create a blank image to place dst within
    dst_sz = list(dst.shape)
    pad_sz = dst_sz.copy()  # to get the same number of channels
    pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int)
    pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int)
    dst_pad = np.zeros(pad_sz, dtype=np.uint8)

    # add translation to the transformation matrix to shift to positive values
    anchorX, anchorY = 0, 0
    transl_transf = np.eye(3,3)
    if minX < 0:
        anchorX = np.round(-minX).astype(int)
        transl_transf[0,2] += anchorX
    if minY < 0:
        anchorY = np.round(-minY).astype(int)
        transl_transf[1,2] += anchorY
    new_transf = transl_transf.dot(transf)
    new_transf /= new_transf[2,2]

    dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst

    warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))

    return dst_pad, warped
Example of running the function
Finally, we can call this function with some real images and homographies and see how it pans out. I'll borrow the example from LearnOpenCV:
src = cv2.imread('book2.jpg')
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32)
dst = cv2.imread('book1.jpg')
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32)
transf = cv2.getPerspectiveTransform(pts_src, pts_dst)
dst_pad, warped = warpPerspectivePadded(src, dst, transf)
alpha = 0.5
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
cv2.imshow("Blended Warped Image", blended)
cv2.waitKey(0)
And we end up with this padded warped image:
[Padded and warped result image]
as opposed to the typical cut off warp you would normally get.
When I blur an object in Inkscape by, let's say, 10%, it gets a filter with a feGaussianBlur with a stdDeviation of 10% * size / 2.
However, the filter has a size of 124% (it is actually that big; Inkscape doesn't just add a bit to be on the safe side).
Where does this number come from? My guess would be 100% + 2.4 * (2*stdDeviation/size), but then where does this 2.4 come from?
From the SVG 1.1 spec:
This filter primitive performs a Gaussian blur on the input image.
The Gaussian blur kernel is an approximation of the normalized convolution:
G(x,y) = H(x)I(y)
where
H(x) = exp(-x^2 / (2*s^2)) / sqrt(2*pi*s^2)
and
I(y) = exp(-y^2 / (2*t^2)) / sqrt(2*pi*t^2)
with 's' being the standard deviation in the x direction and 't' being the standard deviation in the y direction, as specified by ‘stdDeviation’.
The value of ‘stdDeviation’ can be either one or two numbers. If two numbers are provided, the first number represents a standard deviation value along the x-axis of the current coordinate system and the second value represents a standard deviation in Y. If one number is provided, then that value is used for both X and Y.
Even if only one value is provided for ‘stdDeviation’, this can be implemented as a separable convolution.
For larger values of 's' (s >= 2.0), an approximation can be used: Three successive box-blurs build a piece-wise quadratic convolution kernel, which approximates the Gaussian kernel to within roughly 3%.
let d = floor(s * 3*sqrt(2*pi)/4 + 0.5)
... if d is odd, use three box-blurs of size 'd', centered on the output pixel.
... if d is even, two box-blurs of size 'd' (the first one centered on the pixel boundary between the output pixel and the one to the left, the second one centered on the pixel boundary between the output pixel and the one to the right) and one box blur of size 'd+1' centered on the output pixel.
Note: the approximation formula also applies correspondingly to 't'.
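For what it's worth, the box-blur size d from the quoted approximation is easy to evaluate; a small Python sketch (the function name is mine), which also shows that the constant 3*sqrt(2*pi)/4 is roughly 1.88:

import math

def box_blur_size(s):
    # 'd' from the SVG 1.1 approximation for a given stdDeviation s
    return math.floor(s * 3 * math.sqrt(2 * math.pi) / 4 + 0.5)

print(3 * math.sqrt(2 * math.pi) / 4)  # ~1.88
print(box_blur_size(10))               # box-blur size for stdDeviation = 10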
I'm trying to reconstruct 3D points from 2D image correspondences. My camera is calibrated. The test images are of a checkered cube and the correspondences are hand-picked. Radial distortion is removed. However, after triangulation the reconstruction seems to be wrong. The X and Y values seem to be correct, but the Z values are all about the same and do not vary along the cube. The 3D points look as if they were flattened along the Z-axis.
What is going wrong in the Z values? Do the points need to be normalized or changed from image coordinates at any point, say before the fundamental matrix is computed? (If this is too vague I can explain my general process or elaborate on parts)
Update
Given:
x1 = P1 * X and x2 = P2 * X
x1 and x2 being the first and second image points and X being the 3D point.
However, I have found that x1 is not close to the actual hand picked value but x2 is in fact close.
How I compute projection matrices:
P1 = [eye(3), zeros(3,1)];
P2 = K * [R, t];
Update II
Calibration results after optimization (with uncertainties)
% Focal Length: fc = [ 699.13458 701.11196 ] ± [ 1.05092 1.08272 ]
% Principal point: cc = [ 393.51797 304.05914 ] ± [ 1.61832 1.27604 ]
% Skew: alpha_c = [ 0.00180 ] ± [ 0.00042 ] => angle of pixel axes = 89.89661 ± 0.02379 degrees
% Distortion: kc = [ 0.05867 -0.28214 0.00131 0.00244 0.35651 ] ± [ 0.01228 0.09805 0.00060 0.00083 0.22340 ]
% Pixel error: err = [ 0.19975 0.23023 ]
%
% Note: The numerical errors are approximately three times the standard
% deviations (for reference).
K =
699.1346 1.2584 393.5180
0 701.1120 304.0591
0 0 1.0000
E =
0.3692 -0.8351 -4.0017
0.3881 -1.6743 -6.5774
4.5508 6.3663 0.2764
R =
-0.9852 0.0712 -0.1561
-0.0967 -0.9820 0.1624
0.1417 -0.1751 -0.9743
t =
0.7942
-0.5761
0.1935
P1 =
1 0 0 0
0 1 0 0
0 0 1 0
P2 =
-633.1409 -20.3941 -492.3047 630.6410
-24.6964 -741.7198 -182.3506 -345.0670
0.1417 -0.1751 -0.9743 0.1935
C1 =
0
0
0
1
C2 =
0.6993
-0.5883
0.4060
1.0000
% new points using cpselect
%x1
input_points =
422.7500 260.2500
384.2500 238.7500
339.7500 211.7500
298.7500 186.7500
452.7500 236.2500
412.2500 214.2500
368.7500 191.2500
329.7500 165.2500
482.7500 210.2500
443.2500 189.2500
402.2500 166.2500
362.7500 143.2500
510.7500 186.7500
466.7500 165.7500
425.7500 144.2500
392.2500 125.7500
403.2500 369.7500
367.7500 345.2500
330.2500 319.7500
296.2500 297.7500
406.7500 341.2500
365.7500 316.2500
331.2500 293.2500
295.2500 270.2500
414.2500 306.7500
370.2500 281.2500
333.2500 257.7500
296.7500 232.7500
434.7500 341.2500
441.7500 312.7500
446.2500 282.2500
462.7500 311.2500
466.7500 286.2500
475.2500 252.2500
481.7500 292.7500
490.2500 262.7500
498.2500 232.7500
%x2
base_points =
393.2500 311.7500
358.7500 282.7500
319.7500 249.2500
284.2500 216.2500
431.7500 285.2500
395.7500 256.2500
356.7500 223.7500
320.2500 194.2500
474.7500 254.7500
437.7500 226.2500
398.7500 197.2500
362.7500 168.7500
511.2500 227.7500
471.2500 196.7500
432.7500 169.7500
400.2500 145.7500
388.2500 404.2500
357.2500 373.2500
326.7500 343.2500
297.2500 318.7500
387.7500 381.7500
356.2500 351.7500
323.2500 321.7500
291.7500 292.7500
390.7500 352.7500
357.2500 323.2500
320.2500 291.2500
287.2500 258.7500
427.7500 376.7500
429.7500 351.7500
431.7500 324.2500
462.7500 345.7500
463.7500 325.2500
470.7500 295.2500
491.7500 325.2500
497.7500 298.2500
504.7500 270.2500
Update III
See the answer below for corrections. The values computed above were produced with the wrong variables/values.
** Note: all references are to Multiple View Geometry in Computer Vision by Hartley and Zisserman.
OK, so there were a couple of bugs:
When computing the essential matrix (pp. 257-259), the author mentions that the correct R,t pair from the set of four R,t candidates (Result 9.19) is the one for which the 3D points lie in front of both cameras (Fig. 9.12, a), but doesn't mention how to compute this. By chance I was re-reading chapter 6 and discovered that section 6.2.3 (p. 162) discusses the depth of points, and Result 6.1 is the equation that needs to be applied to get the correct R and t.
In my implementation of the optimal triangulation method (Algorithm 12.1, p. 318), in step 2 I had T2^-1' * F * T1^-1 where I needed (T2^-1)' * F * T1^-1. The former transposes the -1; I wanted, as in the latter, to transpose the inverted T2 matrix (foiled again by MATLAB!).
Finally, I wasn't computing P1 correctly; it should have been P1 = K * [eye(3),zeros(3,1)];. I forgot to multiply by the calibration matrix K. A rough sketch of fixes 1 and 3 follows below.
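As a NumPy/OpenCV illustration of points 1 and 3 (not the original MATLAB code; the function name, the candidates argument, and the use of cv2.triangulatePoints are my own choices for this sketch):

import numpy as np
import cv2

def pick_R_t(K, candidates, x1, x2):
    # Choose the (R, t) pair among the candidates that places the most
    # triangulated points in front of both cameras (positive depth).
    # x1, x2: 2xN float arrays of corresponding pixel coordinates.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # note the K in front
    best = None
    for R, t in candidates:
        P2 = K @ np.hstack([R, t])
        Xh = cv2.triangulatePoints(P1, P2, x1, x2)   # 4xN homogeneous points
        X = Xh[:3] / Xh[3]
        depth1 = X[2]                  # depth in camera 1 coordinates
        depth2 = (R @ X + t)[2]        # depth in camera 2 coordinates
        n_front = np.sum((depth1 > 0) & (depth2 > 0))
        if best is None or n_front > best[0]:
            best = (n_front, R, t)
    return best[1], best[2]

Here candidates would be the four (R, t) combinations obtained from the essential matrix decomposition, e.g. [(R1, t), (R1, -t), (R2, t), (R2, -t)] with t of shape (3, 1).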
Hope this helps future passersby!
It may be that your points are in a degenerate configuration. Try to add a couple of points from the scene that don't belong to the cube and see how it goes.
More information required:
What is t? The baseline might be too small for parallax.
What is the disparity between x1 and x2?
Are you confident about the accuracy of the calibration (I'm assuming you used the Stereo part of the Bouguet Toolbox)?
When you say the correspondences are hand-picked, do you mean you selected the corresponding points on the images yourself, or did you use an interest point detector on the two images and then set the correspondences?
I'm sure we can resolve this problem :)