Transforming points from one coordinate to another - geometry

I have data points in a 2D coordinate space that I want to linearly transform to another coordinate space. The image below should make things a little clearer.
The data points I have are in the gray coordinate space (left-top corner A is the x=0,y=0 point). I want to transform all points to the pink coordinate system, for which B is its x=0,y=0 point.
How would I go about doing that?

This is not a linear transformation.
Define this "coordinate system" as a convex quad, as follows:
The vertex coordinates are in parameter space u, v. Interpolating along one direction and then the other gives a general point:
This is bi-linear in parameters u, v. It only becomes linear if A + D - B - C = 0, i.e. the quad is a parallelogram.
Transforming between such coordinate systems:
Assume (required) that these ABCD vertices are embedded in a "global" Cartesian space
Convert from the parameter space of the first system to the global space using interpolation as above
Convert back to the parameter space by inverting the above equation, solving a pair of simultaneous equations:
Solutions for u, v:
1 for a parallelogram (G = 0)
2 for a general convex quad, since the coordinate lines (gray) cross a singularity in each direction
0 for a concave quad (complex solutions)
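As a rough sketch of the two conversions described above (parameter space to global space, and back), the following Python assumes a vertex layout that is consistent with the parallelogram condition A + D - B - C = 0: A at (u,v) = (0,0), B at (1,0), C at (0,1) and D at (1,1). The function names and the `eps` tolerance are illustrative, not taken from the answer. The inverse eliminates v to get a quadratic in u, which is why there can be 1, 2 or 0 real solutions.

```python
import math

def cross(a, b):
    """2D cross product (z component)."""
    return a[0] * b[1] - a[1] * b[0]

def quad_to_global(A, B, C, D, u, v):
    """Bilinear interpolation: parameters (u, v) -> global Cartesian point."""
    return tuple(
        A[i] + (B[i] - A[i]) * u + (C[i] - A[i]) * v
        + (A[i] - B[i] - C[i] + D[i]) * u * v
        for i in range(2)
    )

def global_to_quad(A, B, C, D, P, eps=1e-12):
    """Invert the bilinear map; returns a list of real (u, v) solutions."""
    E = (B[0] - A[0], B[1] - A[1])
    F = (C[0] - A[0], C[1] - A[1])
    K = (A[0] - B[0] - C[0] + D[0], A[1] - B[1] - C[1] + D[1])  # uv coefficient
    H = (P[0] - A[0], P[1] - A[1])
    # Eliminating v from H = E*u + (F + K*u)*v gives a*u^2 + b*u + c = 0.
    a = cross(E, K)
    b = cross(E, F) - cross(H, K)
    c = -cross(H, F)
    if abs(a) < eps:                 # equation is linear in u (e.g. parallelogram)
        if abs(b) < eps:
            return []
        us = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:               # complex roots: no real solution
            return []
        root = math.sqrt(disc)
        us = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    solutions = []
    for u in us:
        w = (F[0] + K[0] * u, F[1] + K[1] * u)
        ww = w[0] * w[0] + w[1] * w[1]
        if ww < eps:
            continue
        r = (H[0] - E[0] * u, H[1] - E[1] * u)
        solutions.append((u, (r[0] * w[0] + r[1] * w[1]) / ww))
    return solutions
```

To move a point between two such systems, call `quad_to_global` with the first quad's vertices and then `global_to_quad` with the second quad's vertices, keeping the solution whose (u, v) falls in the range you expect (typically 0 to 1).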

Related

Find all the planar surfaces in an rgbd image using depth and normal data

Many questions deal with generating normal from depth or depth from normal, but I want to ask about a simple way to generate all the planar surfaces given the depth and normal of an image.
I already have depth and normal of each pixel in the image. For each pixel (ui, vi), assume that we can get its 3D coordinates (xi, yi, zi) with zi as the depth and normal vector (nix, niy, niz). Thus, a unique tangent plane is defined by: nix(x - xi) + niy(y - yi) + niz(z - zi) = 0. Then, for each pixel we can define a unique planar surface by the above equation.
What is a common practice in finding the function f such that f(u, v) = (x, y, z) (from pixel to 3D coordinates)? Is pinhole model (plus the depth data) an effective and accurate one?
How does one generate all the planar surfaces effectively? One way is to iterate through all the pixels in the image and find all the planes, but this seems like an ineffective method.
If it's a pinhole model
make sure your 3D data is not distorted by projection.
group your points by normal
This is easy or hard depending on the accuracy of the points and normals. Simply sort the points by normal, which takes O(n log n), where n is the number of points.
test/group by planes in single normal group
The idea is to pick 3 points from a group, compute a plane from them, and test which points of the group belong to it. If the count is too low, the picked points were wrong (not belonging to the same plane) and you need to pick different ones. Also, if the picked points are too close to each other or lie on the same line, you cannot get a correct plane from them.
The math function for plane is:
x*nx + y*ny + z*nz + d = 0
where (nx,ny,nz) is your normal of the group (unit vector) and (x,y,z) is your point position. So you just compute d from a known point (one of the picked ones (x0,y0,z0) ) ...
d = -x0*nx -y0*ny -z0*nz
and then just test which points satisfy this condition:
threshold=1e-20; // accuracy margin; 1e-20 is effectively an exact match for floating-point data, so tune it to your depth noise
fabs(x*nx + y*ny + z*nz + d) <= threshold
Now remove the matched points from the group (move them into a found-plane object) and apply this step again to the remaining points until their count is low or no valid plane is found...
then test another group until no groups are left...
I think RANSAC can speed things up and avoid brute force in this case, but I have never used it myself, so search for it...
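The plane test within a single normal group can be sketched roughly as follows. This is a minimal Python illustration, assuming the points in the group already share (approximately) one unit normal. The function name, the `min_count` parameter and the default threshold are hypothetical choices, not part of the answer. Because the group normal is already known, one seed point is enough to compute d.

```python
def split_group_into_planes(points, normal, threshold=1e-6, min_count=3):
    """Split points sharing one unit normal into planes x*nx + y*ny + z*nz + d = 0.

    Returns a list of (d, matched_points). Seeds that gather fewer than
    min_count points are discarded as outliers.
    """
    nx, ny, nz = normal
    remaining = list(points)
    planes = []
    while remaining:
        # Use the first remaining point as the known point (x0, y0, z0).
        x0, y0, z0 = remaining[0]
        d = -(x0 * nx + y0 * ny + z0 * nz)
        matched, rest = [], []
        for p in remaining:
            residual = abs(p[0] * nx + p[1] * ny + p[2] * nz + d)
            (matched if residual <= threshold else rest).append(p)
        if len(matched) >= min_count:
            planes.append((d, matched))
        remaining = rest  # the seed always matches itself, so this shrinks
    return planes
```

With noisy depth data the threshold would need to be on the order of the sensor noise, and a more robust seed choice (for example RANSAC-style random seeds) may work better than always taking the first point.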
A possible approach for the planes is to consider the set of normal vectors and perform clustering on them (for instance by k-means). Then every cluster can correspond to several parallel surfaces. By evaluating the distance from the origin (a scalar function), you can form sub-clusters which will separate those surfaces. Finally, points at constant distance can belong to different coplanar patches, which you can separate by connected component labelling.
It is likely that clustering on the normal vectors and distance simultaneously (hence in a 4D space) will yield better results and be simpler. Be sure to normalize the vectors. Another option is to represent the vectors by just two parameters (such as spherical angles), but this will lead to a quite non-uniform mapping, and create phase wrapping issues.

Uniform spatial bins on surface of a sphere

Is there a spatial lookup grid or binning system that works on the surface of a (3D) sphere? I have the requirements that
The bins must be uniform (so you can look up in constant time if there exists a point r distance away from any spot on the sphere, given constant r.)†
The number of bins must be at most linear with the surface area of the sphere. (Alternatively, increasing the surface resolution of the grid shouldn’t make it grow faster than the area it maps.)
I’ve already considered
Spherical coordinates: not good because the cells created are extremely nonuniform making it useless for proximity testing.
Cube meshes: Less distortion than spherical coordinates, but still very difficult to determine which cells to search for a given query.
3D voxel binning: Wastes the entire interior volume of the sphere with empty bins that will never be used (as well as the empty bins at the 8 corners of the bounding cube). Space requirements grow with O(n sqrt(n)) with increasing sphere surface area.
kd-Trees: perform poorly in 3D and are technically logarithmic complexity, not constant per query.
My best idea for a solution involves using the 3D voxel binning method, but somehow excluding the voxels that the sphere will never intersect. However I have no idea how to determine which voxels to exclude, nor how to calculate an index into such a structure given a query location on the sphere.
† For what it’s worth the points have a minimum spacing so a good grid really would guarantee constant lookup.
My suggestion would be a variant of the spherical coordinates, such that the polar angle φ is not sampled uniformly but instead the cosine of this angle is sampled uniformly (equivalently, the sine of the latitude). This way, the element of area sinφ dφ dΘ = -d(cosφ) dΘ is kept constant, leading to tiles of the same area (though variable aspect ratio).
At the poles, merge all tiles in a single disk-like polygon.
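A minimal sketch of the equal-area lookup, assuming the query point is a unit vector (x, y, z) with the polar axis along z, so that cos φ = z. The function name and the `n_rows` / `n_cols` parameters are illustrative. Rows are uniform in z and columns are uniform in longitude, so every cell has area 4π / (n_rows * n_cols). The pole merging mentioned above is not implemented here.

```python
import math

def equal_area_bin(x, y, z, n_rows, n_cols):
    """Map a unit vector to a (row, col) cell of equal area on the sphere."""
    # Rows: uniform in z = cos(polar angle), which ranges over [-1, 1].
    row = min(int((z + 1.0) * 0.5 * n_rows), n_rows - 1)
    # Columns: uniform in longitude, shifted from [-pi, pi] to [0, 2*pi].
    theta = math.atan2(y, x) + math.pi
    col = min(int(theta / (2.0 * math.pi) * n_cols), n_cols - 1)
    return row, col
```

Cells near the poles are long and thin, so a proximity query there has to scan many columns. That is the variable aspect ratio the answer warns about.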
Another possibility is to project a regular icosahedron onto the sphere and to subdivide the resulting spherical triangles into smaller ones. This takes a little spherical trigonometry.
I had a similar problem and used "sparse" 3D voxel binning. Basically, my spatial index is a hash map from (x, y, z) coordinates to bins.
Because I also had a minimum distance constraint on my points, I chose the bin size such that a bin can contain at most one point. This is accomplished if the edge of the (cubic) bins is at most d / sqrt(3), where d is the minimum separation of two points on the sphere. The advantage is that you can represent a full bin as a single point, and an empty bin can just be absent from the hash map.
My only query was for points within a radius d (the same d), which then requires scanning the surrounding 125 bins (a 5×5×5 cube). You could technically leave off the 8 corners to get this down to 117, but I didn't bother.
An alternative for the bin size is to optimize it for queries rather than storage size and simplicity, and choose it such that you always have to scan at most 27 bins (a 3×3×3 cube). That would require a bin edge length of d. I think (but haven't thought hard about it) that a bin could contain up to 4 points in that case. You could represent these with a fixed-size array to save one pointer indirection.
In either case, the memory usage of your spatial index will be O(n) for n points, so it doesn't get any better than that.
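The "sparse" voxel index described above might look roughly like this in Python. It is a sketch under the stated assumptions (points at least d apart, cubic bins with edge d / sqrt(3), at most one point per bin). The class and method names are made up for illustration.

```python
import math

class SparseVoxelIndex:
    """Hash map from integer voxel coordinates to the single point stored there."""

    def __init__(self, min_separation):
        self.d = min_separation
        # Bin diagonal equals d, so two points at least d apart cannot share a bin.
        self.edge = min_separation / math.sqrt(3.0)
        self.bins = {}

    def _key(self, p):
        return tuple(math.floor(c / self.edge) for c in p)

    def insert(self, p):
        self.bins[self._key(p)] = p

    def neighbors_within_d(self, q):
        """Return stored points within distance d of q by scanning 5x5x5 bins."""
        kx, ky, kz = self._key(q)
        found = []
        for dx in range(-2, 3):
            for dy in range(-2, 3):
                for dz in range(-2, 3):
                    p = self.bins.get((kx + dx, ky + dy, kz + dz))
                    if p is not None and math.dist(p, q) <= self.d:
                        found.append(p)
        return found
```

Empty bins are simply absent from the dictionary, so memory is proportional to the number of stored points rather than to the volume of the bounding cube.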

texture mapping (u,v) values

Here is an excerpt from Peter Shirley's Fundamentals of Computer Graphics:
11.1.2 Texture Arrays
We will assume the two dimensions to be mapped are called u and v.
We also assume we have an nx × ny image that we use as the texture.
Somehow we need every (u,v) to have an associated color found from the
image. A fairly standard way to make texturing work for (u,v) is to
first remove the integer portion of (u,v) so that it lies in the unit
square. This has the effect of "tiling" the entire uv plane with
copies of the now-square texture. We then use one of the three
interpolation strategies to compute the image color for the
coordinates.
My question is: What are the integer portion of (u,v)? I thought u,v are 0 <= u,v <= 1.0. If there is an integer portion, shouldn't we be dividing u,v by the texture image width and height to get the normalized u,v values?
UV values can be less than 0 or greater than 1. The reason for dropping the integer portion is that UV values use the fractional part when indexing textures, where (0,0), (0,1), (1,0) and (1,1) correspond to the texture's corners. Allowing UV values to go beyond 0 and 1 is what enables the "tiling" effect to work.
For example, if you have a rectangle whose corners are indexed with the UV points (0,0), (0,2), (2,0), (2,2), and assuming the texture is set to tile the rectangle, then four copies of the texture will be drawn on that rectangle.
The meaning of a UV value's integer part depends on the wrapping mode. In OpenGL, for example, there are at least three wrapping modes:
GL_REPEAT - The integer part is ignored and has no meaning. This is what allows textures to tile when UV values go beyond 0 and 1.
GL_MIRRORED_REPEAT - The fractional part is mirrored if the integer part is odd.
GL_CLAMP_TO_EDGE - Values greater than 1 are clamped to 1, and values less than 0 are clamped to 0.
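To make the three modes concrete, here is a small Python sketch of how a single texture coordinate could be mapped back into [0, 1] under each mode. The function and the mode strings are illustrative and are not OpenGL API calls. The clamp case is simplified: real implementations clamp to the centers of the edge texels rather than exactly to 0 and 1.

```python
import math

def wrap(t, mode="repeat"):
    """Map one texture coordinate into [0, 1] according to a wrapping mode."""
    if mode == "repeat":
        # Drop the integer part; floor keeps this correct for negative t.
        return t - math.floor(t)
    if mode == "mirrored_repeat":
        i = math.floor(t)
        frac = t - i
        # Flip the fractional part on every odd tile.
        return 1.0 - frac if i % 2 else frac
    if mode == "clamp_to_edge":
        return min(1.0, max(0.0, t))
    raise ValueError("unknown wrapping mode: " + mode)
```

For example, `wrap(2.25)` gives 0.25, while `wrap(1.25, "mirrored_repeat")` gives 0.75 because tile 1 is odd and therefore mirrored.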
Peter O's answer is excellent. I want to add a high-level point: the coordinate systems used in graphics are a convention that people stick to as a de facto standard. There's no law of nature here and it is arbitrary (but a decent standard, thank goodness). I think one reason texture mapping is often confusing is that the arbitrariness of this standard isn't obvious. The convention is that the image has a de facto coordinate system on the unit square [0,1]^2. Give me a (u,v) on the unit square and I will tell you a point in the image (for example, (0.2,0.3) is 20% to the right and 30% up from the bottom-left corner of the image). But what if you give me a (u,v) that is outside [0,1]^2, like (22.7, -13.4)? Some rule is used to map that onto [0,1]^2, and the GL modes described are just various useful hacks to deal with that case.

Calculate equidistant point with minimal distance from 3 points in N-dimensional space

I'm trying to code Ritter's bounding sphere algorithm in arbitrary dimensions, and I'm stuck on the part of creating a sphere which would have 3 given points on its surface, or in other words, a sphere which would be defined by 3 points in N-dimensional space.
That sphere's center would be the minimal-distance equidistant point from the (defining) 3 points.
I know how to solve it in 2-D (circumcenter of a triangle defined by 3 points), and I've seen some vector calculations for 3D, but I don't know what the best method would be for N-D, and if it's even possible.
(I'd also appreciate any other advice about the smallest bounding sphere calculations in ND, in case I'm going in the wrong direction.)
So if I get it right:
The wanted point p is the intersection of 3 hyper-spheres of the same radius r, where the centers of the hyper-spheres are your points p0,p1,p2 and the radius r is the minimum over all possible solutions. In n-D an arbitrary point is defined as (x1,x2,x3,...,xn)
so solve following equations:
|p-p0|=r
|p-p1|=r
|p-p2|=r
where p,r are unknowns and p0,p1,p2 are knowns. This gives 3 scalar equations in n+1 unknowns (the n coordinates of p plus r), so for n > 2 the system is underdetermined: there is a whole family of equidistant points. Among all solutions with nonzero r, select the one with minimal r. That point lies in the plane through p0,p1,p2.
[notes]
To ease up the processing you can have your equations in this form:
(p.x1-p0.x1)^2+(p.x2-p0.x2)^2+...+(p.xn-p0.xn)^2=r^2
and use sqrt(r^2) only after solution is found (ignoring negative radius).
there is also another simpler approach possible:
You can compute the plane in which the points p0,p1,p2 lie, so just find the u,v coordinates of these points inside this plane. Then solve your problem in 2D on the (u,v) coordinates and after that convert the found solution from (u,v) back to your n-D space.
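// the cross product only exists in 3-D, so in n-D build the plane basis with Gram-Schmidt instead
u=(p1-p0); u/=|u|;
v=(p2-p0)-dot(p2-p0,u)*u; v/=|v|; // remove the u component of (p2-p0), then normalize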
If my memory serves me well, the conversion n-D -> (u,v) is done like this:
P0=(0,0);
P1=(|p1-p0|,0);
P2=(dot(p2-p0,u),dot(p2-p0,v));
where P0,P1,P2 are 2D points in (u,v) coordinate system of the plane corresponding to points p0,p1,p2 in n-D space.
conversion back is done like this:
p=p0+(P.u*u)+(P.v*v); // P0 is the origin of the (u,v) system, so add p0 back
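The plane-based approach can be sketched in Python as follows. This is an illustration under the assumption that the three points are not collinear. It builds the in-plane basis by Gram-Schmidt (which works in any dimension), solves the 2D circumcenter in closed form, and maps the result back to n-D. The function name is hypothetical.

```python
import math

def circumcenter_nd(p0, p1, p2):
    """Center and radius of the smallest sphere through three n-D points."""
    def sub(a, b):
        return [x - y for x, y in zip(a, b)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    e1, e2 = sub(p1, p0), sub(p2, p0)
    a = math.sqrt(dot(e1, e1))
    if a == 0.0:
        raise ValueError("p0 and p1 coincide")
    u = [x / a for x in e1]                      # first basis vector
    b = dot(e2, u)
    w = [x - b * ux for x, ux in zip(e2, u)]     # e2 with its u component removed
    c = math.sqrt(dot(w, w))
    if c == 0.0:
        raise ValueError("points are collinear")
    v = [x / c for x in w]                       # second basis vector
    # In the plane: P0 = (0, 0), P1 = (a, 0), P2 = (b, c).
    cu = a / 2.0
    cv = (b * b + c * c - a * b) / (2.0 * c)
    center = [p + cu * ux + cv * vx for p, ux, vx in zip(p0, u, v)]
    return center, math.hypot(cu, cv)
```

For example, `circumcenter_nd((0, 0, 0), (2, 0, 0), (0, 2, 0))` returns the center `[1.0, 1.0, 0.0]` with radius sqrt(2).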
My Bounding Sphere algorithm only calculates a near-optimal sphere, in 3 dimensions.
Fischer has an exact, minimal bounding hyper-sphere (N dimensions.) See his paper: http://people.inf.ethz.ch/gaertner/texts/own_work/seb.pdf.
His (C++/Java) code: https://github.com/hbf/miniball.
Jack Ritter
jack#houseofwords.com

Algorithm for Polygon Image Fill

I want an efficient algorithm to fill a polygon with an image; specifically, I want to fit an image into a trapezoid. Currently I am doing it in two steps:
1) First Perform StretchBlt on Image,
2) Perform Column by Column vertical StretchBlt,
Is there any better method to implement this? Is there any Generic and Fast algorithm which can fill any polygon?
Thanks,
Sunny
I can't help you with the distortion part, but filling polygons is pretty simple, especially if they are convex.
For each Y scan line have a table indexed by Y, containing a minX and maxX.
For each edge, run a DDA line-drawing algorithm, and use it to fill in the table entries.
For each Y line, now you have a minX and maxX, so you can just fill that segment of the scan line.
The hard part is a mental trick - do not think of coordinates as specifying pixels. Think of coordinates as lying between the pixels. In other words, if you have a rectangle going from point 0,0 to point 2,2, it should light up 4 pixels, not 9. Most problems with polygon-filling revolve around this issue.
ADDED: OK, it sounds like what you're really asking is how to stretch the image to a non-rectangular (but trapezoidal) shape. I would do it in terms of parameters s and t, going from 0 to 1. In other words, a location in the original rectangle is (x + w0*s, y + h0*t). Then define a function such that s and t also map to positions in the trapezoid, such as ((x+t*a) + w0*s*(1-t) + w1*s*t, y + h1*t). This defines a coordinate mapping between the two shapes. Then just scan x and y, converting to s and t, and mapping points from one to the other. You probably want to have a little smoothing filter rather than a direct copy.
ADDED to try to give a better explanation:
I'm supposing both your rectangle and trapezoid have top and bottom edges parallel with the X axis. The lower-left corner of the rectangle is <x0,y0>, and the lower-left corner of the trapezoid is <x1,y1>. I assume the rectangle's width and height are <w,h>.
For the trapezoid, I assume it has height h1, and that its lower width is w0, while its upper width is w1. I assume its left edge "slants" by a distance a, so that the position of its upper-left corner is <x1+a, y1+h1>. Now suppose you iterate <x,y> over the rectangle. At each point, compute s = (x-x0)/w, and t = (y-y0)/h, which are both in the range 0 to 1. (I'll let you figure out how to do that without using floating point.) Then convert that to a coordinate in the trapezoid, as xt = ((x1 + t*a) + s*(w0*(1-t) + w1*t)), and yt = y1 + h1*t. Then <xt,yt> is the point in the trapezoid corresponding to <x,y> in the rectangle. Now I'll let you figure out how to do the copying :-) Good luck.
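The mapping just described can be written down directly. This Python sketch uses floating point for clarity, and the tuple layouts for `rect` and `trap` are an assumption made for the example rather than anything from the answer.

```python
def rect_to_trapezoid(x, y, rect, trap):
    """Map a point in the rectangle to the corresponding point in the trapezoid.

    rect = (x0, y0, w, h): lower-left corner, width, height.
    trap = (x1, y1, w0, w1, h1, a): lower-left corner, lower width,
           upper width, height, and horizontal slant of the left edge.
    """
    x0, y0, w, h = rect
    x1, y1, w0, w1, h1, a = trap
    s = (x - x0) / w                 # 0 at the left edge, 1 at the right edge
    t = (y - y0) / h                 # 0 at the bottom edge, 1 at the top edge
    xt = (x1 + t * a) + s * (w0 * (1 - t) + w1 * t)
    yt = y1 + h1 * t
    return xt, yt
```

When copying pixels it is usually safer to iterate over the destination trapezoid and invert this mapping to find the source pixel, since iterating over the source can leave gaps where the trapezoid is wider than the rectangle.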
P.S. And please don't forget - coordinates fall between pixels, not on them.
Would it be feasible to sidestep the problem and use OpenGL to do this for you? OpenGL can render to memory contexts, and if you can take advantage of any hardware acceleration by doing this, it will completely dwarf any code tweaks you can make on the CPU (although on some older cards memory context rendering may not be able to take advantage of the hardware).
If you want to do this completely in software MESA may be an option.
