Godot Version: v4.0-alpha15
I have a terrain being generated using the SurfaceTool and the MeshInstance3D. I am also moving a purple decal acrossed the surface based on the 3D mouse position. Below is a screenshot of what this looks like.
I want to take the 3D mouse position and raise/lower the terrain on an action press. I found the MeshDataTool but am not quite sure if this allows for that and also not completely sure how to convert the 3D mouse position to the corresponding vertices.
At this point I am sort of completely stuck as there's not a whole lot of documentation that I could find that helps.
I appreciate the help in advance!
I was actually able to figure this out using the MeshDataTool as I mentioned in the original post.
I use the Vector3::distance_to method to get any vertex within 3 of the target position, then I use MeshDataTool::set_vertex in combination with Vector3 methods to add and subtract each of those vertex positions.
var target_position = Vector3(0, 0, 0)
var mdt = MeshDataTool.new()
var mesh = terrain_mesh_instance.mesh
mdt.create_from_surface(mesh, 0)
var points = get_vertex_id(mdt, target_position)
for i in points:
if Input.is_action_pressed("shift"):
mdt.set_vertex(i, mdt.get_vertex(i) + Vector3(0, 1 * delta,0))
elif Input.is_key_pressed(KEY_ALT):
var coords = mdt.get_vertex(i)
coords.y = 0
mdt.set_vertex(i, coords)
else:
mdt.set_vertex(i, mdt.get_vertex(i) + Vector3(0, -1 * delta,0))
# This fixes the normals so the shadows work properly
for face in mdt.get_face_count():
var vertex = mdt.get_face_vertex(face, 0)
var normal = mdt.get_face_normal(face)
mdt.set_vertex_normal(vertex, normal)
mesh.clear_surfaces()
mdt.commit_to_surface(mesh)
I was making animation with manim and I tried to make line that one end is fixed, and one end moves along the arc. The way I wanted to solve this problem was to define each point with theta, and let the point move along the arc as it moves. However, even if the setta was changed, the line did not move, so the rotate was used, but the length of the line did not change, so the line deviated from the arc and failed. I wonder how to make the line move smoothly with the size of the angle when theta value is changed to an arbitrary value with theta as a parameter.
from manim import *
import numpy as np
class RotateVector(Scene):
def construct(self):
theta = 1/9*np.pi
semicircle = Arc(angle=PI, radius=4, stroke_color = GREY, arc_center=np.array([0, -1.5, 0]))
bottom_line = Line(start=np.array([-4.0,-1.5,0]), end=np.array([4.0,-1.5,0]), stroke_color = GREY)
dot_A = np.array([-4.0,-1.5,0])
dot_A_text = Text('A', font_size=35).next_to(dot_A, LEFT+DOWN, buff=0.05)
dot_P = np.array([4*np.cos(theta*2),4*np.sin(theta*2)-1.5,0])
dot_P_text = Text('P', font_size=35).next_to(dot_P, RIGHT+UP, buff=0.05)
line_AP = Line(start=np.array([-4.0,-1.5,0]), end=np.array([4*np.cos(2*theta),4*np.sin(2*theta)-1.5,0]), stroke_color = GREY)
self.play(Create(semicircle), Create(bottom_line), Create(dot_A_text))
self.play(Create(line_AP), Create(dot_P_text))
self.wait(3)
Your mobjects do not change because the dependency on Theta is not retained after initializing the mobject -- but it doesn't take too much to make it work!
The solution to your Problem is using a ValueTracker for the value of theta (to easily allow animating changing it) and updater functions for the mobjects you would like to change. In other words:
theta_vt = ValueTracker(theta)
dot_P_text.add_updater(lambda mobj: mobj.next_to([4*np.cos(theta_vt.get_value()*2),4*np.sin(theta_vt.get_value()*2)-1.5,0]))
line_AP.add_updater(lambda mobj: mobj.put_start_and_end_on([-4, 1.5, 0], [4*np.cos(theta_vt.get_value()*2),4*np.sin(theta_vt.get_value()*2)-1.5,0]))
self.play(theta_vt.animate.set_value(2*PI))
You could even avoid the value tracker and use an auxiliary (empty) mobject for positioning:
P = Mobject().move_to([4*np.cos(theta*2),4*np.sin(theta*2)-1.5,0])
and then change the updaters to use the position of P instead of theta, and make the point move along the desired arc. (That point actually is the one that becomes slightly more complicated, while the updaters simplify a bit.)
I would like to rotate an image based on a second image. Both images are satellite images, however, they are not rotated in the same direction(in one image top is in the north direction and in the other the rotation is not known). But, I have at least three pixel pairs in each of the images (x1,y1,x2,y2). So my idea is to figure out their relative position and get the rotation angle from that.
Currently, I estimate the angle like this:
def angle_between(v1, v2):
""" Returns the angle in radians between vectors 'v1' and 'v2'::
>>> angle_between((1, 0, 0), (0, 1, 0))
1.5707963267948966
>>> angle_between((1, 0, 0), (1, 0, 0))
0.0
>>> angle_between((1, 0, 0), (-1, 0, 0))
3.141592653589793
"""
v1_u = unit_vector(v1)
v2_u = unit_vector(v2)
angle_rad = np.arccos(np.clip(np.dot(v1_u, v2_u), -1.0, 1.0))
return (angle_rad*180)/math.pi
with the inputs like this:
v1 = [points[0][0] - points[1][0], points[0][1] - points[1][1]] #hist
v2 = [points[0][2] - points[1][2], points[0][3] - points[1][3]] #ref
However, this only uses two pixel pairs instead of the three. Therefore, the rotation is some times incorrect. Could anybody show me how to use all three pixels?
My first attempt was to check on which side of the straight the third pixel lies in the image and based on that negate the angle. But, this does not work for all images.
EDIT:
I cannot add the original images, as they are copyrighted, however, as the image content is not really important I added whitened images. The first is the input image with the three points drawn in, the second is the rotated image (where additionally the (wrong, due to rotation) cutout area is marked with a rectangle) and third the historical image.
The points are the following:
567.01,144,1544.4,4581.8
1182.6,1568.1,2934.1,3724.3
938.97,1398.1,2795.8,4002.5
with:
x_historical, y_historical, x_presentday, y_presentday
I'm trying to stitch 2 images together by using template matching find 3 sets of points which I pass to cv2.getAffineTransform() get a warp matrix which I pass to cv2.warpAffine() into to align my images.
However when I join my images the majority of my affine'd image isn't shown. I've tried using different techniques to select points, changed the order or arguments etc. but I can only ever get a thin slither of the affine'd image to be shown.
Could somebody tell me whether my approach is a valid one and suggest where I might be making an error? Any guesses as to what could be causing the problem would be greatly appreciated. Thanks in advance.
This is the final result that I get. Here are the original images (1, 2) and the code that I use:
EDIT: Here's the results of the variable trans
array([[ 1.00768049e+00, -3.76690353e-17, -3.13824885e+00],
[ 4.84461775e-03, 1.30769231e+00, 9.61912797e+02]])
And here are the here the points passed to cv2.getAffineTransform: unified_pair1
array([[ 671., 1024.],
[ 15., 979.],
[ 15., 962.]], dtype=float32)
unified_pair2
array([[ 669., 45.],
[ 18., 13.],
[ 18., 0.]], dtype=float32)
import cv2
import numpy as np
def showimage(image, name="No name given"):
cv2.imshow(name, image)
cv2.waitKey(0)
cv2.destroyAllWindows()
return
image_a = cv2.imread('image_a.png')
image_b = cv2.imread('image_b.png')
def get_roi(image):
roi = cv2.selectROI(image) # spacebar to confirm selection
cv2.waitKey(0)
cv2.destroyAllWindows()
crop = image_a[int(roi[1]):int(roi[1]+roi[3]), int(roi[0]):int(roi[0]+roi[2])]
return crop
temp_1 = get_roi(image_a)
temp_2 = get_roi(image_a)
temp_3 = get_roi(image_a)
def find_template(template, search_image_a, search_image_b):
ccnorm_im_a = cv2.matchTemplate(search_image_a, template, cv2.TM_CCORR_NORMED)
template_loc_a = np.where(ccnorm_im_a == ccnorm_im_a.max())
ccnorm_im_b = cv2.matchTemplate(search_image_b, template, cv2.TM_CCORR_NORMED)
template_loc_b = np.where(ccnorm_im_b == ccnorm_im_b.max())
return template_loc_a, template_loc_b
coord_a1, coord_b1 = find_template(temp_1, image_a, image_b)
coord_a2, coord_b2 = find_template(temp_2, image_a, image_b)
coord_a3, coord_b3 = find_template(temp_3, image_a, image_b)
def unnest_list(coords_list):
coords_list = [a[0] for a in coords_list]
return coords_list
coord_a1 = unnest_list(coord_a1)
coord_b1 = unnest_list(coord_b1)
coord_a2 = unnest_list(coord_a2)
coord_b2 = unnest_list(coord_b2)
coord_a3 = unnest_list(coord_a3)
coord_b3 = unnest_list(coord_b3)
def unify_coords(coords1,coords2,coords3):
unified = []
unified.extend([coords1, coords2, coords3])
return unified
# Create a 2 lists containing 3 pairs of coordinates
unified_pair1 = unify_coords(coord_a1, coord_a2, coord_a3)
unified_pair2 = unify_coords(coord_b1, coord_b2, coord_b3)
# Convert elements of lists to numpy arrays with data type float32
unified_pair1 = np.asarray(unified_pair1, dtype=np.float32)
unified_pair2 = np.asarray(unified_pair2, dtype=np.float32)
# Get result of the affine transformation
trans = cv2.getAffineTransform(unified_pair1, unified_pair2)
# Apply the affine transformation to original image
result = cv2.warpAffine(image_a, trans, (image_a.shape[1] + image_b.shape[1], image_a.shape[0]))
result[0:image_b.shape[0], image_b.shape[1]:] = image_b
showimage(result)
cv2.imwrite('result.png', result)
Sources: Approach based on advice received here, this tutorial and this example from the docs.
July 12 Edit:
This post inspired GitHub repos providing functions to accomplish this task; one for a padded warpAffine() and another for a padded warpPerspective(). Check out the Python version or the C++ version.
Transformations shift the location of pixels
What any transformation does is takes your point coordinates (x, y) and maps them to new locations (x', y'):
s*x' h1 h2 h3 x
s*y' = h4 h5 h6 * y
s h7 h8 1 1
where s is some scaling factor. You must divide the new coordinates by the scale factor to get back the proper pixel locations (x', y'). Technically, this is only true of homographies---(3, 3) transformation matrices---you don't need to scale for affine transformations (you don't even need to use homogeneous coordinates...but it's better to keep this discussion general).
Then the actual pixel values are moved to those new locations, and the color values are interpolated to fit the new pixel grid. So during this process, these new locations get recorded at some point. We'll need those locations to see where the pixels actually move to, relative to the other image. Let's start with an easy example and see where points are mapped.
Suppose your transformation matrix simply shifts pixels to the left by ten pixels. Translation is handled by the last column; the first row is the translation in x and second row is the translation in y. So we would have an identity matrix, but with -10 in the first row, third column. Where would the pixel (0,0) be mapped? Hopefully, (-10,0) if logic makes any sense. And in fact, it does:
transf = np.array([[1.,0.,-10.],[0.,1.,0.],[0.,0.,1.]])
homg_pt = np.array([0,0,1])
new_homg_pt = transf.dot(homg_pt))
new_homg_pt /= new_homg_pt[2]
# new_homg_pt = [-10. 0. 1.]
Perfect! So we can figure out where all points map with a little linear algebra. We will need to get all the (x,y) points, and put them into a huge array so that every single point is in it's own column. Lets pretend our image is only 4x4.
h, w = src.shape[:2] # 4, 4
indY, indX = np.indices((h,w)) # similar to meshgrid/mgrid
lin_homg_pts = np.stack((indX.ravel(), indY.ravel(), np.ones(indY.size)))
These lin_homg_pts have every homogenous point now:
[[ 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3. 0. 1. 2. 3.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
Then we can do matrix multiplication to get the mapped value of every point. For simplicity, let's stick with the previous homography.
trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]
And now we have the transformed points:
[[-10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7. -10. -9. -8. -7.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
As we can see, everything is working as expected: we have shifted the x-values only, by -10.
Pixels can be shifted outside of your image bounds
Notice that these pixel locations are negative---they're outside of the image bounds. If we do something a little more complex and rotate the image by 45 degrees, we'll get some pixel values way outside our original bounds. We don't care about every pixel value though, we just need to know how far the farthest pixels are that are outside the original image pixel locations, so that we can pad the original image that far out, before displaying the warped image on it.
theta = 45*np.pi/180
transf = np.array([
[ np.cos(theta),np.sin(theta),0],
[-np.sin(theta),np.cos(theta),0],
[0.,0.,1.]])
print(transf)
trans_lin_homg_pts = transf.dot(lin_homg_pts)
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# minX: 0.0, minY: -2.12132034356, maxX: 4.24264068712, maxY: 2.12132034356,
So we see that we can get pixel locations well outside our original image, both in the negative and positive directions. The minimum x value doesn't change because when an homography applies a rotation, it does it from the top-left corner. Now one thing to note here is that I've applied the transformation to all pixels in the image. But this is really unnecessary, you can simply warp the four corner points and see where they land.
Padding the destination image
Note that when you call cv2.warpAffine() you have to input the destination size. These transformed pixel values reference that size. So if a pixel gets mapped to (-10,0), it won't show up in the destination image. That means that we'll have to make another homography with translations which shift all pixel locations be positive, and then we can pad the image matrix to compensate for our shift. We'll also have to pad the original image on the bottom and the right if the homography moves points to positions bigger than the image, too.
In the recent example, the min x value is the same, so we need no horizontal shift. However, the min y value has dropped by about two pixels, so we need to shift the image two pixels down. First, let's create the padded destination image.
pad_sz = list(src.shape) # in case three channel
pad_sz[0] = np.round(np.maximum(pad_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(pad_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# pad_sz = [6, 4, 3]
As we can see, the height increased from the original by two pixels to account for that shift.
Add translation to the transformation to shift all pixel locations to positive
Now, we need to create a new homography matrix to translate the warped image by the same amount that we shifted by. And to apply both transformations---the original and this new shift---we have to compose the two homographies (for an affine transformation, you can simply add the translation, but not for an homography). Additionally we need to divide by the last entry to make sure the scales are still proper (again, only for homographies):
anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0:
anchorX = np.round(-minX).astype(int)
transl_transf[0,2] -= anchorX
if minY < 0:
anchorY = np.round(-minY).astype(int)
transl_transf[1,2] -= anchorY
new_transf = transl_transf.dot(transf)
new_transf /= new_transf[2,2]
I also created here the anchor points for where we will place the destination image into the padded matrix; it's shifted by the same amount the homography will shift the image. So let's place the destination image inside the padded matrix:
dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
Warp with the new transformation into the padded image
All we have left to do is apply the new transformation to the source image (with the padded destination size), and then we can overlay the two images.
warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))
alpha = 0.3
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
Putting it all together
Let's create a function for this since we were creating quite a few variables we don't need at the end here. For inputs we need the source image, the destination image, and the original homography. And for outputs we simply want the padded destination image, and the warped image. Note that in the examples we used a 3x3 homography so we better make sure we send in 3x3 transforms instead of 2x3 affine or Euclidean warps. You can just add the row [0,0,1] to any affine warp at the bottom and you'll be fine.
def warpPerspectivePadded(img, dst, transf):
src_h, src_w = src.shape[:2]
lin_homg_pts = np.array([[0, src_w, src_w, 0], [0, 0, src_h, src_h], [1, 1, 1, 1]])
trans_lin_homg_pts = transf.dot(lin_homg_pts)
trans_lin_homg_pts /= trans_lin_homg_pts[2,:]
minX = np.min(trans_lin_homg_pts[0,:])
minY = np.min(trans_lin_homg_pts[1,:])
maxX = np.max(trans_lin_homg_pts[0,:])
maxY = np.max(trans_lin_homg_pts[1,:])
# calculate the needed padding and create a blank image to place dst within
dst_sz = list(dst.shape)
pad_sz = dst_sz.copy() # to get the same number of channels
pad_sz[0] = np.round(np.maximum(dst_sz[0], maxY) - np.minimum(0, minY)).astype(int)
pad_sz[1] = np.round(np.maximum(dst_sz[1], maxX) - np.minimum(0, minX)).astype(int)
dst_pad = np.zeros(pad_sz, dtype=np.uint8)
# add translation to the transformation matrix to shift to positive values
anchorX, anchorY = 0, 0
transl_transf = np.eye(3,3)
if minX < 0:
anchorX = np.round(-minX).astype(int)
transl_transf[0,2] += anchorX
if minY < 0:
anchorY = np.round(-minY).astype(int)
transl_transf[1,2] += anchorY
new_transf = transl_transf.dot(transf)
new_transf /= new_transf[2,2]
dst_pad[anchorY:anchorY+dst_sz[0], anchorX:anchorX+dst_sz[1]] = dst
warped = cv2.warpPerspective(src, new_transf, (pad_sz[1],pad_sz[0]))
return dst_pad, warped
Example of running the function
Finally, we can call this function with some real images and homographies and see how it pans out. I'll borrow the example from LearnOpenCV:
src = cv2.imread('book2.jpg')
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]], dtype=np.float32)
dst = cv2.imread('book1.jpg')
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]], dtype=np.float32)
transf = cv2.getPerspectiveTransform(pts_src, pts_dst)
dst_pad, warped = warpPerspectivePadded(src, dst, transf)
alpha = 0.5
beta = 1 - alpha
blended = cv2.addWeighted(warped, alpha, dst_pad, beta, 1.0)
cv2.imshow("Blended Warped Image", blended)
cv2.waitKey(0)
And we end up with this padded warped image:
![[Padded and warped1]1
as opposed to the typical cut off warp you would normally get.
I am trying to make a program that automatically corrects the perspective of a rectangle. I have managed to get the silhouette of the rectangle, and have the code to correct the perspective, but I can't find the corners. The biggest problem is that, because it has been deformed, I can't use the following "code":
c1 = min(x), min(y)
c2 = max(x), min(y)
c3 = min(x), max(y)
c4 = max(x), max(y)
This wouldn't work with this situation (X represents a corner):
X0000000000X
.00000000000
..X000000000
.....0000000
........0000
...........X
Does anyone know how to do this?
Farthest point from the center will give you one corner.
Farthest point from the first corner will give you another corner, which may be either adjacent or opposite to the first.
Farthest point from the line between those two corners (a bit more math intensitive) will give you a third corner. I'd use distance from center as a tie breaker.
For finding the 4th corner, it'll be the point outside the triangle formed by the first 3 corners you found, farthest from the nearest line between those corners.
This is a very time consuming way to do it, and I've never tried it, but it ought to work.
You could try to use a scanline algorithm - For every line of the polygon (so y = min(y)..max(y)), get l = min(x) and r = max(x). Calculate the left/right slope (deltax) and compare it with the slope the line before. If it changed (use some tolerance here), you are at a corner of the rectangle (or close to it). That won't work for all cases, as the slope can't be that exact because of low resolution, but for large rectangles and slopes not too similar, this should work.
At least, it works well for your example:
X0000000000X l = 0, r = 11
.00000000000 l = 1, r = 11, deltaxl = 1, deltaxr = 0
..X000000000 l = 2, r = 11, deltaxl = 1, deltaxr = 0
.....0000000 l = 5, r = 11, deltaxl = 3, deltaxr = 0
........0000 l = 8, r = 11, deltaxl = 3, deltaxr = 0
...........X l = 11, r = 11, deltaxl = 3, deltaxr = 0
You start with the top of the rectangle where you get two different values for l and r, so you already have two of the corners. On the left side, for the first three lines you'll get deltax = 1, but after it, you'll get deltax = 3, so there is a corner at (3, 3). On the right side, nothing changes, deltax = 0, so you only get the point at the end.
Note that you're "collecting" corners here, so if you don't have 4 corners at the end, the slopes were too similar (or you have a picture of a triangle) and you can switch to a different (more exact) algorithm or just give an error. The same if you have more than 4 corners or some other strange things like holes in the rectangle. It seems some kind of image detection is involved, so these cases can occur, right?
There are cases in which a simple deltax = (x - lastx) won't work good, see this example for the left side of a rectangle:
xxxxxx
xxxxx deltax = 1 dy/dx = 1/1 = 1
xxxxx deltax = 0 dy/dx = 2/1 = 2
xxxx deltax = 1 dy/dx = 3/2 = 1.5
xxxx deltax = 0 dy/dx = 4/2 = 2
xxx deltax = 1 dy/dx = 5/3 = 1.66
Sometimes deltax is 0, sometimes is 1. It's better to use the slope of the line from the actual point to the top left/right point (deltay / deltax). Using it, you'll still have to stick with a tolerance, but your values will get more exact with each new line.
You could use a hough transform to find the 4 most prominent lines in the masked image. These lines will be the sides of the quadrangle.
The lines will intersect in up to 6 points, which are the 4 corners and the 2 perspective vanishing points.
These are easy to distinguish: pick any point inside the quadrangle, and check if the line from this point to each of the 6 intersection points intersects any of the lines. If not, then that intersection point is a corner.
This has the advantage that it works well even for noisy or partially obstructed images, or if your segmentation is not exact.
en.wikipedia.org/wiki/Hough_transform
Example CImg Code
I would be very interested in your results. I have been thinking about writing something like this myself, to correct photos of paper sheets taken at an angle. I am currently struggling to think of a way to correct the perspective if the 4 points are known
p.s.
Also check out
Zhengyou Zhang , Li-Wei He, "Whiteboard scanning and image enhancement"
http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf
for a more advanced solution for quadrangle detection
I have asked a related question, which tries to solve the perspective transform:
proportions of a perspective-deformed rectangle
This looks like a convex hull problem.
http://en.wikipedia.org/wiki/Convex_hull
Your problem is simpler but the same solution should work.