Camera Frame and Object Frame - graphics

I'm reading about interactive graphics. I have just started the section on viewing, and I did not fully understand this sentence:
Initially, we start with the model-view matrix set to an identity matrix, so the camera frame and the object frame are identical.
I know what a model-view matrix is, and I know that in this case the camera looks down the negative z axis. But I do not understand exactly what the difference between the object frame and the camera frame is.

You have two matrices, View and Model. View represents where you are looking from and in which direction (the camera). Model represents the position and orientation of the object you are currently rendering.
However, to speed up rendering we use just one cumulative matrix:
ModelView = Inverse(View) * Model
so for example when you write something like this in OpenGL:
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
Then both the View and Model matrices are identical and equal to the unit matrix. After this point you add your incremental rotations and translations either to View (inverse order and direction) or to Model (normal order and direction).
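To make the formula above concrete, here is a minimal sketch in plain Python (no OpenGL). It assumes column vectors and a View matrix that holds only rotation and translation, so the inverse can be taken cheaply. The helper names (`identity`, `matmul`, `translation`, `rigid_inverse`, `transform`) are made up for this illustration and are not OpenGL calls.

```python
# 4x4 matrices as row-major lists of rows, column-vector convention.

def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

def rigid_inverse(m):
    """Invert a matrix made only of rotation + translation (no scale or shear)."""
    inv = identity()
    for r in range(3):
        for c in range(3):
            inv[r][c] = m[c][r]          # transpose the 3x3 rotation block
    for r in range(3):
        inv[r][3] = -sum(inv[r][k] * m[k][3] for k in range(3))  # -R^T * t
    return inv

def transform(m, p):
    """Apply m to the 3D point p (w = 1) and return the 3D result."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[r][k] * v[k] for k in range(4)) for r in range(3))

# Right after "glLoadIdentity": both frames coincide, ModelView is the identity.
view, model = identity(), identity()
modelview = matmul(rigid_inverse(view), model)

# Move the camera 5 units along +z and the object 2 units along +x.
view = translation(0.0, 0.0, 5.0)
model = translation(2.0, 0.0, 0.0)
modelview_moved = matmul(rigid_inverse(view), model)

# The object's origin expressed in the camera frame.
origin_in_camera = transform(modelview_moved, (0.0, 0.0, 0.0))
```

With these numbers the object's origin lands at z = -5 in the camera frame, in front of a camera that looks down the negative z axis.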
For more info see:
Understanding 4x4 homogenous transform matrices
Especially the last 3 links in there...

Related

VTK - create 3D model

I'm trying to create a 3D mask model from the 3D coordinate points that are stored in the txt file. I use the Marching cubes algorithm. It looks like it's not able to link individual points, and therefore holes are created in the model.
Steps: (by https://lorensen.github.io/VTKExamples/site/Cxx/Modelling/MarchingCubes/)
First, load 3D points from file as vtkPolyData.
Then, use vtkVoxelModeller
Put voxelModeller output to MC algorithm and finally visualize
Any ideas?
Thanks
The example takes a spherical mesh (a.k.a. a set of triangles forming a sealed 3D shape), converts it to a voxel representation (a 3D image where the voxels outside the mesh are black and those inside are not) then converts it back to a mesh using Marching Cubes algorithm. In practice the input and output of the example are very similar meshes.
In your case, you load the points and try to create a voxel representation of them. The problem is that your set of points is not sufficient to define a volume, they are not a sealed mesh, just a list of points.
In order to replicate the example you should do the following:
1) Build a 3D mesh from your points (you gave no information about what the points are/represent, so I can't help you much with this task). In other words, you need to specify how these points are connected to each other to form a 3D shape (vtkPolyData). VTK can't guess how your points are connected; you have to tell it.
2) Once you have a mesh, if you need a voxel representation (vtkImageData) of it, you can use vtkVoxelModeller or vtkImplicitModeller. At this point you can use VTK filters that need a vtkImageData as input.
3) Finally, to convert the voxels back to a mesh (vtkPolyData), you can use vtkMarchingCubes (or better, vtkFlyingEdges3D, which is a very similar algorithm but much faster).
Edit:
It is not clear what the shape you want should be, but you can try to use vtkImageOpenClose3D so the steps are:
First, load 3D points from file as vtkPolyData.
Then, use vtkVoxelModeller
Put voxelModeller output to vtkImageOpenClose3D algorithm, then vtkImageOpenClose3D algorithm output to MC (change to vtkFlyingEdges3D) algorithm and finally visualize
Example for vtkImageOpenClose3D:
https://www.vtk.org/Wiki/VTK/Examples/Cxx/Images/ImageOpenClose3D

3d graphics from scratch

What is the minimum I need in order to build 3D graphics from scratch? For example, I have only SFML for working with 2D graphics, and I need to implement a Camera object that can move and rotate in space.
Where should I start, and how do I implement vector3d -> vector2d conversion functions and other necessary things?
All I have for now is:
angles Phi, Xi, epsilon 1-3 and some object that I can draw on the screen with the following formula
x/y = center.x/y + scale.x/y * dot(point[i], epsilon1/epsilon2)
But this way I'm just transforming the "world" axes, not the object points.
First you need to implement transform matrix and vector math:
Mathematically compute a simple graphics pipeline
Understanding 4x4 homogenous transform matrices
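As a starting point for the vector3d -> vector2d conversion asked about, here is a minimal sketch of a pinhole projection. It assumes a camera that looks down its own negative z axis and can only yaw around the y axis. The function names and the `focal` parameter (a focal length in pixels) are invented for this example.

```python
import math

def rotate_y(p, angle):
    """Rotate point p around the y axis by angle (radians)."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def world_to_camera(p, cam_pos, cam_yaw):
    """Express a world-space point in the camera frame (camera looks down -z)."""
    rel = tuple(a - b for a, b in zip(p, cam_pos))  # undo the camera translation
    return rotate_y(rel, -cam_yaw)                   # undo the camera rotation

def project(p_cam, focal, width, height):
    """Perspective-project a camera-space point to pixel coordinates.

    Returns None for points at or behind the camera plane.
    """
    x, y, z = p_cam
    if z >= 0.0:
        return None
    sx = width / 2 + focal * x / -z
    sy = height / 2 - focal * y / -z   # screen y grows downward
    return (sx, sy)

# A point 5 units in front of an unrotated camera at the origin.
pixel = project(world_to_camera((1.0, 1.0, -5.0), (0.0, 0.0, 0.0), 0.0),
                100.0, 640, 480)
```

A full camera would replace the single yaw angle with a 4x4 matrix as described in the links above; the divide by depth stays the same.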
The rest depends on kind of rendering you want to achieve:
boundary polygonal mesh rendering
This kind of rendering is native to today's gfx cards. You need to implement buffers for:
depth (for filled polygons without z-sorting)
screen (to avoid flickering and also serves as Canvas)
shadow,stencil,aux (for advanced rendering techniques)
They usually have the same resolution as the target rendering area. On top of this you need to implement rendering of the supported primitives, at least point, line and triangle. See:
Algorithm to fill triangle
on top of all this you can add textures,shaders and whatever else you want to ...
(back)ray tracing
This kind of rendering is very different, and mainstream gfx HW has traditionally not been built for it. It involves implementing ray/primitive intersection computations, Snell's law and an analytical representation of meshes. This way you can also do multi-spectral rendering and more physically accurate effects/processes. See:
How can I render an 'atmosphere' over a rendering of the Earth in Three.js? hybrid approach #1+#2
Algorithm for 2D Raytracer
How to implement 2D raycasting light effect in GLSL
Multi-Band Image raster to RGB
The difference between 2D and 3D ray tracer is almost none the only difference is how to compute perpendicular vector ...
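As an illustration of the ray/primitive intersection step mentioned above, here is a minimal ray-sphere test. It is a sketch under the assumption that the ray direction is already normalized; the function name is made up. Because it only uses dot products, the same code accepts 2D tuples (ray against circle) as well as 3D ones.

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Return the smallest t >= 0 with origin + t*direction on the sphere, else None.

    direction is assumed to be normalized.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * v for d, v in zip(direction, oc))   # dot(direction, oc)
    c = sum(v * v for v in oc) - radius * radius    # dot(oc, oc) - r^2
    disc = b * b - c
    if disc < 0.0:
        return None                                 # the ray misses the sphere
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):                # near hit first, then far hit
        if t >= 0.0:
            return t
    return None                                     # sphere is behind the ray

# Ray from the origin down -z against a unit sphere centered 5 units away.
hit_3d = ray_sphere((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 0.0, -5.0), 1.0)
# The same function in 2D: ray along +x against a circle of radius 2.
hit_2d = ray_sphere((0.0, 0.0), (1.0, 0.0), (5.0, 0.0), 2.0)
```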
There are also different rendering methods like Volume rendering, hybrid methods and others but their implementation is usually task oriented and generic description would most likely just mislead ... Here some 3D ray tracers of mine:
back raytrace through 3D mesh
back raytrace through 3D volume

skimage project an image's 3D plane to fronto-parallel view

I'm working on implementing Ankush Gupta's synthetic data generation method (http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf). In his work, he used a convolutional neural network to extract a point cloud from a 2-dimensional scenery image, segmented the point clouds to isolate different planes, used RANSAC to fit a 3D plane to the point cloud segments, and then warped the pixels for the segment, given the 3D plane, to a fronto-parallel view.
I'm stuck on this last part: warping my extracted 3D plane to a fronto-parallel view. I have X, Y, and Z vectors as well as a normal vector. I'm thinking what I need to do is perform some type of perspective transform or rotation that would bring all the pixels on the plane to a complete 0 Z-axis while the X and Y would remain the same. I could be wrong about this; it's been a long time since I've had any formal training in geometry or linear algebra.
It looks like skimage's Perspective Transform requires me to know the dimensions of the final segment coordinates in 2d space. It looks like AffineTransform requires me to know the rotation. All I have at this point is my X,Y,Z and normal vector and the suspicion that I may know my destination plane by just setting the Z axis to all zeros. I'm not sure if my assumption is correct but I need to be able to warp all the pixels in the segment of interest to fronto-parallel, fit a bounding box, place text inside of it, then warp the final segment back to the original perspective in 3d space.
Any help with how to think about this or implement it would be massively useful.
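One way to think about the rotation described above: a plane is fronto-parallel when its normal points along the z axis, so a rotation that maps the normal onto (0, 0, 1) does the job. The sketch below builds that rotation with Rodrigues' formula in plain Python. It is not the method from the paper and not an skimage API; the function names are invented. Note that after the rotation the plane's points share a constant z (the plane's distance from the origin), not z = 0, and their x and y values change.

```python
import math

def rotation_to_z(normal):
    """3x3 rotation matrix that maps the direction of `normal` onto (0, 0, 1)."""
    length = math.sqrt(sum(c * c for c in normal))
    nx, ny, nz = (c / length for c in normal)
    # Rotation axis = normal x z = (ny, -nx, 0); cos(angle) = nz.
    ax, ay = ny, -nx
    s = math.sqrt(ax * ax + ay * ay)       # sin of the rotation angle
    if s < 1e-12:
        # Normal already (anti)parallel to z: identity, or a half-turn about x.
        if nz > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    kx, ky = ax / s, ay / s                # unit rotation axis (its z part is 0)
    c = nz
    # Rodrigues: R = c*I + s*skew(k) + (1 - c) * k k^T
    return [
        [c + kx * kx * (1 - c), kx * ky * (1 - c),      ky * s],
        [kx * ky * (1 - c),     c + ky * ky * (1 - c), -kx * s],
        [-ky * s,               kx * s,                 c],
    ]

def apply(rot, p):
    return tuple(sum(rot[r][k] * p[k] for k in range(3)) for r in range(3))

# Three points on the plane x + y + z = 3, whose normal is (1, 1, 1).
rot = rotation_to_z((1.0, 1.0, 1.0))
rotated = [apply(rot, p) for p in [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (1.0, 1.0, 1.0)]]
```

The rotated x and y coordinates are the fronto-parallel 2D coordinates. A bounding box can be fitted there, and the transpose of the matrix rotates results back.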

Creating a Cube-based 3-Dimensional Game

I am trying to create a 3-dimensional game that is based entirely on cubes of the exact same size. I wanted to learn how to make my own 3-dimensional game using only 2-dimensional game libraries. Currently, I have an array storing the locations of the centers of all the cubes in the game. Then, when drawing a single cube, I figure out which 3 sides of the cube I need to draw (since you don't need to draw all 6 sides of the cube). Then, knowing the 3-dimensional points of all the corners of the cube, I project those points onto a 2-dimensional space using the camera position, camera angle, and the point I am projecting.
Now my real question is: Now that I can draw a single cube, how do I draw multiple cubes, considering that cubes need to be drawn in a certain order (i.e. the cubes that are further away need to be drawn first so that the cubes that are closer to us appear on top of the cubes far away from us)? How do I determine which cubes to draw first given the list of cube centers and their sizes, and the camera position/angle?
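The back-to-front ordering described in the question is often called the painter's algorithm. A minimal sketch, assuming the cubes are given by their centers and that sorting by center distance is good enough (it is an approximation and can mis-order cubes in some configurations); the function name is made up:

```python
def back_to_front(cube_centers, camera_pos):
    """Return the cube centers sorted so the farthest from the camera comes first."""
    def dist_sq(center):
        # Squared distance is enough for ordering, so the sqrt is skipped.
        return sum((c - p) ** 2 for c, p in zip(center, camera_pos))
    return sorted(cube_centers, key=dist_sq, reverse=True)

draw_order = back_to_front([(0, 0, -2), (0, 0, -10), (0, 0, -5)], (0, 0, 0))
```

Drawing the cubes in `draw_order` lets nearer cubes paint over farther ones.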
It has been quite a few years since I learned 3D graphics ...
All of the sites I used are gone now (as usual on the web, but there are tons of new ones). I am an OpenGL user, so I recommend using that. Many of my students liked the NeHe tutorials, so look there. Also look at 4x4 transform matrices (homogeneous coordinates).
The Z-buffer is automatic in OpenGL: just create a context with a Z-buffer (all tutorials do) and call glEnable(GL_DEPTH_TEST);
Face culling (skipping the far side of an object) is done with glEnable(GL_CULL_FACE); and by drawing all faces with the same polygon winding (CW or CCW)
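For readers working with a 2D library only, what the depth test does per pixel can be sketched in software. This is a simplified illustration, not how OpenGL is implemented: it fills a triangle using barycentric weights and writes a pixel only if it is closer than what the depth buffer already holds. All names are invented, and depth is interpolated linearly in screen space.

```python
def edge(a, b, p):
    """Twice the signed area of the 2D triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def make_buffers(width, height):
    color = [[" "] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    return color, depth

def fill_triangle(color_buf, depth_buf, v0, v1, v2, color):
    """Rasterize one triangle; each vertex is (x, y, depth), smaller depth = closer."""
    height, width = len(color_buf), len(color_buf[0])
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle, nothing to draw
    xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
    for y in range(max(0, int(min(ys))), min(height - 1, int(max(ys))) + 1):
        for x in range(max(0, int(min(xs))), min(width - 1, int(max(xs))) + 1):
            p = (x, y)
            w0 = edge(v1, v2, p) / area   # barycentric weight of v0
            w1 = edge(v2, v0, p) / area   # barycentric weight of v1
            w2 = edge(v0, v1, p) / area   # barycentric weight of v2
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue                  # pixel lies outside the triangle
            depth = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
            if depth < depth_buf[y][x]:   # the depth test
                depth_buf[y][x] = depth
                color_buf[y][x] = color

# Draw the near triangle first, then a larger far one: order does not matter.
color, depth = make_buffers(8, 8)
fill_triangle(color, depth, (0, 0, 1.0), (3, 0, 1.0), (0, 3, 1.0), "B")
fill_triangle(color, depth, (0, 0, 5.0), (7, 0, 5.0), (0, 7, 5.0), "A")
```

With a depth buffer, the drawing order of the cubes no longer has to be sorted.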
Here few related questions:
Understanding 4x4 homogenous transform matrices
3D (software) render pipeline computation
some transform matrices 3x3 and 4x4 insights
how to project points and vectors ... fixed pipeline OpenGL will do it for you
code for inverse 4x4 matrix
this is mine clean OpenGL app example in borland/embarcadero BDS2006 Turbo C++
some projection math
And some good sites to look at (strongly recommend):
transforms
Learning Modern 3D Graphics Programming
GL_PROJECTION matrix abuse
Sorry for the list of links, but I think they are relevant, and copying their content here would be too much. Now back to your question of how to render more cubes. I see a few options (in OpenGL):
every cube has its own transform matrix
The matrix represents the cube's position and orientation in space. Just before rendering each cube, change the GL_MODELVIEW matrix and draw the cube (with the same code for each). If you have too many objects/cubes, this will consume a big chunk of memory (+16 floats per cube).
every cube is aligned to each other
in this case you need to know just the 3D position (+3 floats per cube) so just do something like this:
glMatrixMode(GL_MODELVIEW);   // select the model-view matrix stack
glPushMatrix();               // store the original matrix
glTranslatef(x[i],y[i],z[i]); // move to the position of the i-th cube
// render the cube here (glBegin(GL_QUADS); ... or use glDraw... )
glMatrixMode(GL_MODELVIEW);   // make sure the model-view stack is still selected
glPopMatrix();                // restore the original matrix
use of shaders
in modern shader pipeline you can use geometry shader to emit cube on receiving point but this is too much for OpenGL beginner. But in this case you would draw just points and the shader will convert them to cubes on GPU which is much faster ...
use VBO or VAO
VAO is vertex array object (list of VBO's)
VBO is vertex buffer object
A VBO is basically an array of parameters copied to the GPU as one chunk, instead of by individual calls like glVertex, glColor, glNormal..., which is much, much faster. This allows you to create a model of your space (all cubes) and draw it at once with enough speed, unless you hit the GPU/CPU/MEM speed limits.
A VAO similarly groups several VBOs together, so you need to bind just a single VAO per object instead of one VBO for each parameter, further reducing the number of API calls needed for rendering.
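The "one chunk" idea behind a VBO can be sketched without any OpenGL: bake the translated corners of every cube into a single flat float array on the CPU. This is only an illustration of the data layout. The names are invented, only corner positions are stored (no indices, normals or colors), and the actual upload call (for example glBufferData) is not shown.

```python
from array import array

# The 8 corners of a unit cube centered on the origin.
CUBE_CORNERS = [(x, y, z)
                for x in (-0.5, 0.5)
                for y in (-0.5, 0.5)
                for z in (-0.5, 0.5)]

def build_vertex_buffer(cube_centers):
    """Pack the translated corners of every cube into one flat float array."""
    buf = array("f")
    for cx, cy, cz in cube_centers:
        for x, y, z in CUBE_CORNERS:
            buf.extend((cx + x, cy + y, cz + z))   # x, y, z of one corner
    return buf

vertex_data = build_vertex_buffer([(0.0, 0.0, -5.0), (2.0, 0.0, -5.0)])
raw_bytes = vertex_data.tobytes()   # the single chunk that would go to the GPU
```

Since all cubes are aligned, only the 3 position floats per cube are needed as input, as in the second option above.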

computing the matrix that turns one set of coordinates into another

I am playing with some models for the game glest.
These models are made up of one or more meshes; each mesh is made up of many frames which describe the position of each vertex for each frame of animation. In the model shown below, the position of each vertex in each wheel in each frame is in an array.
These models have been exported from 3D tools like Blender. Someone somewhere has the originals.
But I am wondering, for simple animation such as a wheel turning, how can you compute the transforms - the steps of rotate, scale and translate, or the matrix that when applied to the previous frame will result in the new frame?
(Obviously not all frames will have such transforms, because they may distort the models and such.)
Also, how can you detect mirroring and other opportunities to reduce the amount of vertex data by applying a matrix and rendering the same vertices again?
Running speed - if its measured in just minutes - won't be a problem.
First off, some assumptions:
You're dealing with 3D affine transformations (linear transformation plus translation).
You have the vertices for each frame in your animation
You can associate at least 4 vertices in a frame with 4 vertices in the next frame
Then you can take 4 vertices as 4D column vectors (appending a 1 as each vector's 4th element) in the original space and concatenate them to create a 4x4 matrix, called X. The 4 vertices must not be coplanar, otherwise X is not invertible. Do the same for their corresponding vectors in the transformed space and call the result Y, which will also be a 4x4 matrix. A little linear algebra provides you with a method to find the 4x4 matrix A that when applied to X gives you Y. Thus:
AX = Y
A = Y X^{-1}
Using this to get rotations and scaling is not trivial. However, the rightmost column of A will contain the translation for the object between the successive frames.
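The computation A = Y X^{-1} can be sketched in plain Python as follows. This is a minimal illustration with invented helper names. It uses Gauss-Jordan elimination for the inverse and assumes exactly 4 non-coplanar vertex pairs; with more pairs or noisy data a least-squares fit would be the usual replacement.

```python
def invert(m):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (are the 4 vertices coplanar?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def columns(points):
    """Stack 3D points as homogeneous column vectors of a 4xN matrix."""
    return [[p[i] for p in points] for i in range(3)] + [[1.0] * len(points)]

def fit_affine(src_points, dst_points):
    """Solve A X = Y for the 4x4 matrix A, given 4 non-coplanar point pairs."""
    return matmul(columns(dst_points), invert(columns(src_points)))

# Four vertices of one frame and the same vertices moved by (2, 3, 4).
src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
dst = [(x + 2.0, y + 3.0, z + 4.0) for x, y, z in src]
A = fit_affine(src, dst)
translation = (A[0][3], A[1][3], A[2][3])   # rightmost column of A
```

For this pure translation the upper-left 3x3 block of A is the identity and the rightmost column holds (2, 3, 4).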
