skimage project an image's 3D plane to fronto-parallel view - geometry

I'm working on implementing Ankush Gupta's synthetic data generation method (http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf). In his work, he used a convolutional neural network to extract a point cloud from a 2-dimensional scenery image, segmented the point cloud to isolate different planes, used RANSAC to fit a 3D plane to each point cloud segment, and then warped the pixels of the segment, given the 3D plane, to a fronto-parallel view.
I'm stuck on this last part: warping my extracted 3D plane to a fronto-parallel view. I have X, Y, and Z vectors as well as a normal vector. I'm thinking what I need to do is apply some kind of perspective transform or rotation that brings all the pixels on the plane to a constant Z of 0 while X and Y stay the same. I could be wrong about this; it's been a long time since I've had any formal training in geometry or linear algebra.
It looks like skimage's ProjectiveTransform requires me to know the destination coordinates of the segment in 2D space, and AffineTransform requires me to know the rotation. All I have at this point is my X, Y, Z and normal vector, plus the suspicion that I may get my destination plane by just setting the Z coordinates to zero. I'm not sure if my assumption is correct, but I need to be able to warp all the pixels in the segment of interest to fronto-parallel, fit a bounding box, place text inside it, and then warp the final segment back to the original perspective in 3D space.
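Here is roughly what I had in mind, just as a sketch (assuming numpy; X, Y, Z are my coordinate vectors and normal is the plane's unit normal from RANSAC):

import numpy as np

points = np.column_stack([X, Y, Z])          # the segment's 3D points, one row per pixel

# Rodrigues' formula: rotation that takes the plane normal onto the Z axis
n = normal / np.linalg.norm(normal)
z = np.array([0.0, 0.0, 1.0])
v = np.cross(n, z)
c = np.dot(n, z)
if np.isclose(c, -1.0):                      # normal points exactly away from Z
    R = np.diag([1.0, -1.0, -1.0])           # 180-degree flip about the X axis
else:
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    R = np.eye(3) + vx + vx @ vx / (1.0 + c)

flat = points @ R.T                          # rotated points; Z should now be (nearly) constant
xy = flat[:, :2]                             # my guess at the fronto-parallel coordinates

From there I was hoping to estimate a skimage.transform.ProjectiveTransform between the segment's original pixel coordinates and these flattened x, y coordinates, warp with skimage.transform.warp, render the text, and apply the inverse transform to go back, but I'm not sure this is the right way to think about it.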
Any help with how to think about this or implement it would be massively useful.

Related

How to approximate low-res 3D density map to smooth models?

3D density maps can of course be plotted as heatmaps, but in my case the data itself is homogeneous (near 0) except for a small part (a 2D cross section, for example).
That cross section should give a letter 'E' shape as the 2D "model". The original data is not saved as a point cloud, however.
A naive approach would be to keep the pixels above a certain value and then smooth the border (see the sketch at the end of this question). However, this does not take into account that the border pixels have small values.
Another would be to use some point-cloud-based algorithms that come with modeling software, but then the point cloud's probability function would still be discontinuous at pixel borders, and it would not take into account that only one side has signal.
Is there any tested solution to this? (The example is 2D; the actual case is many 2D slices that compose a low-res 3D density map.) I was thinking of making border pixels have an area proportional to the signal value, with the border defined from the gradient. Any suggestions?
I was thinking of model visualization results similar to this example (which seems to be based on an established point-cloud algorithm).
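For reference, the naive approach I mentioned would look roughly like this (just a sketch; the file name, cutoff and sigma are placeholders I made up):

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import measure

# stacked 2D slices as a (nz, ny, nx) volume; the file name is hypothetical
density = np.load("density_map.npy")

mask = density > 0.1 * density.max()                      # keep voxels above an arbitrary cutoff
smooth = gaussian_filter(mask.astype(float), sigma=1.5)   # smooth the hard border
verts, faces, _, _ = measure.marching_cubes(smooth, level=0.5)   # mesh of the smoothed border

As noted above, this treats every voxel above the cutoff the same, which is exactly the border issue I described.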

How can I create an image morpher inside a graphics shader?

Image morphing is mostly a graphic design SFX technique to adapt one picture into another using some points decided by the artist, who has to match key zones (the eyes, for instance) on one portrait with the corresponding zones on another; then some kind of algorithm adapts the entire picture to change from one to the other.
I would like to do something a bit similar with a shader, which could load any 2 graphics, automatically choose zones of the most similar colors in the same regions of the pictures, and morph the two pictures in real time. Perhaps a shader-based version would logically be a lot faster at the task? Except I don't even understand how it works at all.
If you know, please don't worry about a complete reply about the process; even vague background concepts and keywords would be great for how to attempt a 2D texture morph in a graphics shader.
There are more morphing methods out there; the one you are describing is based on geometry.
morph by interpolation
You have 2 data sets with similar properties (for example, 2 images that are both 2D) and interpolate between them by some parameter. In the case of 2D images you can use plain linear interpolation if both images are the same resolution, or add bilinear resampling (effectively trilinear interpolation) if not.
So you just pick corresponding pixels from each image and interpolate the actual color for some parameter t in <0,1>. For the same resolution, something like this:
for (int y = 0; y < img1.height; y++)
 for (int x = 0; x < img1.width; x++)
  // linear blend of the two source pixels for parameter t in <0,1>
  img.pixel[x][y] = (1.0 - t) * img1.pixel[x][y] + t * img2.pixel[x][y];
where img1, img2 are the input images and img is the output. Beware that t is a float, so you need to cast to avoid integer rounding problems, or use the scale t in <0,256> and correct the result with a bit shift right by 8 bits (or a /256). For different sizes you need to bilinearly interpolate the corresponding (x,y) position in both source images first.
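Just to illustrate that fixed-point scaling trick, a small numpy sketch (assuming img1 and img2 are uint8 arrays of the same shape; the names are placeholders):

import numpy as np

ti = int(t * 256)                                                # t rescaled from <0,1> to <0,256>
acc = (256 - ti) * img1.astype(np.uint16) + ti * img2.astype(np.uint16)
blend = (acc >> 8).astype(np.uint8)                              # shift right by 8 bits instead of /256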
All this can be done very easily in a fragment shader. Just bind img1, img2 to texture units 0 and 1, pick the texel from each, interpolate, and output the final color. The bilinear coordinate interpolation is done automatically by GLSL because texture coordinates are normalized to <0,1> no matter the resolution. In the vertex shader you just pass through the texture and vertex coordinates, and on the main program side you just draw a single quad covering the final image output...
morph by geometry
You have 2 polygons (or matching point sets) and interpolate their positions between the two. For example, something like this: Morph a cube to coil. This is suited for vector graphics. You just need to have a point correspondence, and then the interpolation is similar to #1.
for (int i = 0; i < points; i++)
{
 // linear interpolation of each matched pair of points for parameter t in <0,1>
 p[i].x = (1.0 - t) * p1[i].x + t * p2[i].x;
 p[i].y = (1.0 - t) * p1[i].y + t * p2[i].y;
}
where p1[i], p2[i] are the i-th points from each input geometry set and p[i] is the corresponding point in the final result...
To enhance the visual appearance, the linear interpolation can be replaced with a specific trajectory (like BEZIER curves) so the morph looks cooler. For example, see
Path generation for non-intersecting disc movement on a plane
To accomplish this you need to use a geometry shader (or maybe even a tessellation shader). You would need to pass both polygons as a single primitive; then the geometry shader should interpolate the actual polygon and emit the resulting vertices.
morph by particle swarms
In this case you find corresponding pixels in the source images by matching colors. Then handle each pixel as a particle and create its path from its position in img1 to its position in img2 with parameter t. It is the same as #2, but instead of polygon areas you have just points. Each particle has a color and a position, and you interpolate both... approximately, because there is a very slim chance you will get exact color matches with matching counts (the histograms would have to be the same), which is improbable.
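A very rough sketch of the particle idea, pairing pixels by brightness rank as a crude stand-in for proper color matching (img1, img2 and t are placeholders, assumed to be equal-sized numpy RGB arrays):

import numpy as np

h, w, _ = img1.shape
ys, xs = np.mgrid[0:h, 0:w]
pos = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

# pair the k-th brightest pixel of img1 with the k-th brightest of img2
order1 = np.argsort(img1.reshape(-1, 3).sum(axis=1))
order2 = np.argsort(img2.reshape(-1, 3).sum(axis=1))

p = (1.0 - t) * pos[order1] + t * pos[order2]                                   # particle positions
c = (1.0 - t) * img1.reshape(-1, 3)[order1] + t * img2.reshape(-1, 3)[order2]   # particle colors

out = np.zeros((h, w, 3))
out[p[:, 1].astype(int), p[:, 0].astype(int)] = c   # naive splat; holes and overwrites ignored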
hybrid morphing
It is any combination of #1,#2,#3
I am sure there are more methods for morphing; these are just the ones I know of. Also, morphing can be done not only in the spatial domain...

What's the purpose of a unit normal vector when creating a 3D shape?

I understand that to create a shape (let's say a 3D sphere, for example) I first have to find the vertex locations of the shape and, second, use the parametric equation to create the x, y, z points of the triangle meshes. I am currently looking at sample code for creating shapes, and it appears that after using the parametric equation to find the vertices of the triangle meshes, unit normals to the sphere at those vertices are computed.
I understand why the regular vertices in the first step are used to create the 3D shape, and that a normal vector is perpendicular to the surface, but I don't understand why unit normal vectors at the vertices are needed when creating the shape. What is the purpose of finding these normals?
I am not sure I totally understand your question, but one very important use for normals in computer graphics is calculating reflections. For instance, if you're writing a simple raytracer, Lambertian reflectance is quite easy to compute if you know the normal vector where your camera ray intersects a surface. Normals are similarly required for (off the top of my head) the majority of calculations involved in more complex rendering techniques.
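As a minimal sketch of what that looks like (assuming numpy and unit-length vectors; the numbers are made up):

import numpy as np

normal = np.array([0.0, 0.0, 1.0])        # unit surface normal at the hit point
to_light = np.array([0.0, 0.6, 0.8])      # unit direction from the hit point to the light
albedo = np.array([1.0, 0.2, 0.2])        # surface color

# Lambert's cosine law: diffuse intensity is the clamped dot product
diffuse = max(0.0, float(np.dot(normal, to_light)))
color = diffuse * albedo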

How is 3D plane normal vector related to its rotation

What I am trying to do: http://www.youtube.com/watch?v=CaTI2d0tQME at 3:15.
In my 3D API there are quad.rotation[x,y,z], quad[x,y,z] (which is its center), and width/height. I understand that the vertices are calculated from all of these, and the normal can be calculated from the vertices, but I have a feeling I should be able to get it just from the rotation?
Yes you can!
Your quad must be axis-oriented (along the X, Y or Z axis, which is its normal vector in its local space). Compose this vector with the quad's rotation matrix and you will have your new, nice and shiny normal vector in world space!
A little warning: if the quad's transformation matrix is generated by a 3D engine, it could contain scaling factors that will mess the normal vector up. In this case, the classical solution is to compute the transposed inverse of the matrix, or to generate your own transformation matrix from the quad's rotation values.
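A small numpy sketch of both cases (rotation_from_euler and model_matrix are placeholders for whatever your API provides):

import numpy as np

n_local = np.array([0.0, 0.0, 1.0])        # assuming the quad faces +Z in its local space

# clean case: a pure 3x3 rotation matrix built from quad.rotation (helper is hypothetical)
R = rotation_from_euler(quad.rotation)
n_world = R @ n_local

# general case: the upper-left 3x3 of the full transform may contain scaling,
# so transform the normal with the transposed inverse and renormalize
M = model_matrix[:3, :3]
n_world = np.linalg.inv(M).T @ n_local
n_world /= np.linalg.norm(n_world)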

computing the matrix that turns one set of coordinates into another

I am playing with some models for the game glest.
These models are made up of one or more meshes; each mesh is made up of many frames which describe the position of each vertex for each frame of animation. In the model shown below, the position of each vertex in each wheel in each frame is in an array.
These models have been exported from 3D tools like Blender. Someone somewhere has the originals.
But I am wondering, for simple animation such as a wheel turning, how can you compute the transforms - the steps of rotate, scale and translate, or the matrix that when applied to the previous frame will result in the new frame?
(Obviously not all frames will have such transforms, because they may distort the models and such.)
Also, how can you detect mirroring and other opportunities to reduce the amount of vertex data by applying a matrix and rendering the same vertices again?
Running speed - if it's measured in just minutes - won't be a problem.
First off, some assumptions:
You're dealing with 3D affine transformations (linear transformation plus translation).
You have the vertices for each frame in your animation
You can associate at least 4 vertices in a frame (not all lying in one plane, so the matrix below is invertible) with their 4 counterparts in the next frame
Then you can take 4 vertices as 4D column vectors (appending a 1 as each vector's 4th element) in the original space and concatenate them to create a 4x4 matrix, called X. Do the same for their corresponding vectors in the transformed space and call that matrix Y, which will also be 4x4. A little linear algebra provides a method to find the 4x4 matrix A that, when applied to X, gives you Y. Thus:
AX = Y
A = YX⁻¹
Using this to get rotations and scaling is not trivial. However, the rightmost column of A will contain the translation for the object between the successive frames.
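A minimal numpy sketch of that computation, with made-up coordinates (a 90-degree rotation about Z plus a translation of (2, 0, 0)):

import numpy as np

# four matched vertices (rows) in the previous and current frame
src = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
dst = np.array([[2., 0., 0.], [2., 1., 0.], [1., 0., 0.], [2., 0., 1.]])

# stack the points as homogeneous column vectors to build X and Y
X = np.vstack([src.T, np.ones(4)])
Y = np.vstack([dst.T, np.ones(4)])

A = Y @ np.linalg.inv(X)      # the 4x4 affine transform satisfying AX = Y
translation = A[:3, 3]        # rightmost column: (2, 0, 0)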
