How to get screen (widget) coordinates from world coordinates - python-3.x

Let's say I have an entity's global (world) coordinate v (QVector3D). Then I make a coordinate transformation:
pos = camera.projectionMatrix() * camera.viewMatrix() * v
where projectionMatrix() and viewMatrix() are QMatrix4x4 instances. What do I actually get, and how is this related to widget coordinates?

The following values are for OpenGL. They may differ in other graphics APIs.
You get Clip Space coordinates. Imagine a cube spanning -w to w on all axes1 (for w = 1, a cube with side length 2). You transform your world so that everything your camera sees lies inside this cube; the graphics card can then discard everything outside it (since you don't see it, it doesn't need rendering, which is done for performance reasons).
Going further, you (or rather your graphics API) perform the perspective divide. Then you are in Normalized Device Space; essentially this is where you go from 3D to 2D, so that you know where on your rendering canvas each pixel has to be colored, with whatever lighting calculations you use. This canvas spans -1 to 1 on each axis.
Afterwards these normalized device coordinates are stretched to whatever width and height your widget has, so that you know where in your widget the colored pixels go (in OpenGL this is defined by your Viewport).
What you see as Widget Coordinates are probably the coordinates of where your widget is on screen (usually the upper left corner is specified). Therefore, if your widget coordinate is (10, 10) and you have a rendered pixel in your Viewport transformation at (10, 10), then on screen your rendered pixel would be at (10+10, 10+10).
1After having had a discussion with derhass (see comments), a lot of books for graphics programming speak of [-1, -1, -1] x [1, 1, 1] as the clipping volume. The OpenGL 4.6 Core Spec however states that it is actually [-w, -w, -w] x [w, w, w] (and according to derhass, it is the same for other APIs. I have not checked this).
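For completeness, here is a minimal Python sketch of the whole chain described above, assuming a PyQt-style camera exposing projectionMatrix()/viewMatrix() (as in the question) and a widget providing width()/height(); the function name and the widget argument are illustrative, not part of any Qt API.
from PyQt5.QtGui import QVector4D

def world_to_widget(camera, widget, v):
    # v is a QVector3D in world space; keep w explicit so the divide stays visible.
    clip = camera.projectionMatrix() * camera.viewMatrix() * QVector4D(v, 1.0)
    if clip.w() == 0.0:
        return None  # point lies on the camera plane and cannot be projected
    # Perspective divide: clip space -> normalized device coordinates in [-1, 1].
    ndc_x = clip.x() / clip.w()
    ndc_y = clip.y() / clip.w()
    # Viewport transform: NDC -> pixel coordinates inside the widget.
    # NDC y points up, widget y points down, hence the flip.
    px = (ndc_x + 1.0) * 0.5 * widget.width()
    py = (1.0 - (ndc_y + 1.0) * 0.5) * widget.height()
    return px, py
To get coordinates on the screen itself you would then add the widget's own position on screen, as described above.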

Related

Render pipeline output merge order

I am trying to draw two triangles in a single draw call. The two triangles are parallel, and the forward direction of the camera is along their normal, so from the camera's view the two triangles overlap perfectly.
Alpha blending is enabled with the blend ops srcAlpha and invSrcAlpha. The color of the triangle in back is (0, 1, 0, 0.5), the color of the triangle in front is (1, 0, 0, 0.5), and the RT is cleared to black. The pixel shader simply outputs the triangle color.
Here is an image to show the scene; the vertices of the triangles are indexed as in the image.
What could the final color in the RT be; will it always be (0.5, 0.25, 0)? In the graphics pipeline, is it guaranteed that the pixels of the green triangle are output before those of the red triangle?
You do not get any guarantee on the pixel evaluation order; here the red and green pixels can be evaluated in any order, not necessarily in the order the triangles are ordered in the vertex/index buffer.
There is a feature named Rasterizer Ordered Views (documentation here). But first, it depends on an optional feature, and second, it can only be turned on when you are using an unordered access view, which is not the case here when you simply use the output merger to write the samples.
It looks like DirectX pipeline guarantees the order.
"DirectX rendering follows a strict set of rule that ensure triangles are always rendered in the order they are submitted: if two triangles are overlapping on the screen, the hardware guarantees that Triangle 1 will have its color result blended to the screen before Triangle 2 is processed and blended."
Here is the link: https://software.intel.com/en-us/gamedev/articles/rasterizer-order-views-101-a-primer; see the section "DirectX Pipeline and the limitations of UAVs".
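To see where the (0.5, 0.25, 0) value comes from, here is the srcAlpha/invSrcAlpha blend evaluated in submission order (green first, then red) as a small Python check; only the color channels are tracked and destination alpha is ignored.
def blend(src, dst):
    # out.rgb = src.rgb * src.a + dst.rgb * (1 - src.a)
    r, g, b, a = src
    return (r * a + dst[0] * (1 - a),
            g * a + dst[1] * (1 - a),
            b * a + dst[2] * (1 - a))

rt = (0.0, 0.0, 0.0)                    # render target cleared to black
rt = blend((0.0, 1.0, 0.0, 0.5), rt)    # back (green) triangle blended first
rt = blend((1.0, 0.0, 0.0, 0.5), rt)    # front (red) triangle blended second
print(rt)                               # (0.5, 0.25, 0.0)
If the two triangles were blended in the opposite order the result would be (0.25, 0.5, 0) instead, which is why the ordering question matters.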

DirectWrite'ing glyphs such that the em square has a specific size

I'm working on an application that renders music notation. The musical symbols are specified in regular font files, which use the convention that the height of the em square corresponds to the height of a regular five-line staff of music. For example, the glyph for a note head is approximately 0.25 em high, the distance between two lines of the staff.
When it comes to rendering, I use a coordinate system in which 4 units corresponds to the height of a five-line staff of music. Therefore, I need to render glyphs such that the em square ends up rendered 4 units high. However DirectWrite only allows specifying text size in device independent pixels (DIPs) and I'm confused about how to juggle between the coordinate systems. There are two parts to this:
From a given font size in DIPs I can compute a height in physical pixels, but what is mapped to that height? The em square or some other design-space metric?
What if I'm using some arbitrary transformation matrix? How do I specify DIPs in order to get meaningful values in the coordinate system I am using?
And for good measure:
If I get this to work, is it going to mess up font hinting because my DIP values don't have a clear relationship to physical pixels?
After some more experimentation and research, I have come to the following conclusions.
The font size specifies the size of the EM square as drawn. Drawing at 12 DIPs means that the EM square is scaled to use 12 DIPs of vertical space.
The top Y coordinate of the layoutRect parameter of the ID2D1RenderTarget::DrawText function is mapped to the top of the font's ascent (for the first line of text).
The identity matrix gives a coordinate system in which (0, 0) is the top-left and (width, height), as retrieved from ID2D1RenderTarget::GetSize, is the bottom-right, in DIPs. This means that for any transformation matrix, the font size should be expressed in the same units as the render target's coordinate system: a vertical line of 42 units will be as tall as the em square of text drawn with a font size of 42 units.
I was unable to find information about the effect of arbitrary transformations on font hinting, however.
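As a small illustration of the size arithmetic implied by these conclusions (the function name here is just for illustration, not any DirectWrite API): a DIP is 1/96 of an inch, and with the identity transform the render target's units are DIPs, so the em square is drawn exactly fontSize units high.
def dips_to_pixels(dips, dpi):
    # 1 DIP is defined as 1/96 of an inch.
    return dips * dpi / 96.0

# With the identity transform (or any transform whose units match the font
# size units), a staff height of 4 units means a font size of 4.
staff_height_units = 4.0
font_size = staff_height_units
print(dips_to_pixels(font_size, dpi=144))   # 6.0 physical pixels at 144 DPI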

Terrain tile scale in case of tilted camera

I am working on a 3D terrain visualization tool right now. The surface is logically covered with square tiles. This tiling can be visualized as follows:
Suppose I want to draw a picture on these tiles. The level of detail of the picture has to be selected according to the current camera scale, which is calculated for each tile individually.
With a vertical camera (no tilt, i.e. the camera looks perpendicularly at the ground) all tiles have the same scale, which is the camera focal length divided by the camera height above the ground.
The following picture depicts the situation:
where the red triangle is the camera (with no tilt), BG is the camera height above the ground and EG is the focal length; then scale = AC/DF = BG/EG.
But if the camera is tilted (i.e. the pitch angle isn't 0) then the scale changes from tile to tile (even from point to point).
So I wonder if there is any method to produce a reasonable scale for each tile in that case?
There may be (there almost surely is) a more straightforward solution, but what you could do is regular world to screen coordinate conversion.
You just take the coordinates of the tile's corner points and calculate which pixels on the screen they will project to (you of course get floating-point values). From this, I believe you can calculate the "scale" you are mentioning.
This is applicable to any point or set of points in the world space.
Here is a tutorial on how to do this "by hand".
If you are rendering the tiles with OpenGL or DirectX, you can do this much easier.
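As a rough sketch of that idea (not tied to any particular API): project two corners of the tile with a pinhole model and take the ratio of their ground distance to their on-screen distance. The view matrix, focal length in pixels, and viewport size are assumed inputs here; for a vertical camera this reduces to the BG/EG ratio from the question.
import numpy as np

def project(point_world, view, focal, width, height):
    # view is a 4x4 world-to-camera matrix; the camera looks down -Z.
    p = view @ np.append(point_world, 1.0)
    x = focal * p[0] / -p[2] + width / 2.0
    y = focal * p[1] / -p[2] + height / 2.0
    return np.array([x, y])

def tile_scale(corner_a, corner_b, view, focal, width, height):
    # Scale = ground distance between two tile corners divided by the distance
    # between their projections, i.e. world units per pixel for this tile.
    pa = project(corner_a, view, focal, width, height)
    pb = project(corner_b, view, focal, width, height)
    return np.linalg.norm(corner_a - corner_b) / np.linalg.norm(pa - pb)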

How to map points in a 3D-plane into screen plane

I have been given an assignment to project an object in 3D space onto a 2D plane using simple graphics in C. A cube is placed at a fixed position in 3D space, and there is a camera placed at a position with coordinates x, y, z, looking at the origin, i.e. (0, 0, 0). Now we have to project the cube's vertices onto the camera plane.
I am proceeding with the following steps
Step 1: I find the equation of the plane aX+bY+cZ+d=0 which is perpendicular to the line drawn from the camera position to the origin.
Step 2: I find the projection of each vertex of the cube to the plane which is obtained in the above step.
Now I want to map those vertex positions, which I got in step 2 by projecting onto the plane aX+bY+cZ+d=0, onto my screen plane.
thanks,
I don't think that simply setting the z coordinate to zero will give me the actual mapping, so any help figuring this out would be appreciated.
You can do that in two simple steps:
Translate the cube's coordinates to the camera's system (using rotation), such that the camera's own coordinates in that system are x = y = z = 0 and the cube's translated z's are > 0.
Project the translated cube's coordinates onto a 2D plane by dividing their x's and y's by their respective z's (you may need to apply a constant scaling factor here for the coordinates to be reasonable for the screen, e.g. not too small and within +/- half the screen's height in pixels). This creates the perspective effect. You can now draw pixels using these divided x's and y's on the screen, assuming x = y = 0 is the center of it.
This is pretty much how it is done in 3d games. If you use cube vertex coordinates, then you get projections of its sides onto the screen. You may then solid-fill the resultant 2d shapes or texture-map them. But for that you'll have to first figure out which sides are not obscured by others (unless, of course, you use a technique called z-buffering). You don't need that for a simple wire-frame demo, though, just draw straight lines between the projected vertices.
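A minimal Python sketch of those two steps, assuming the camera sits at cam_pos and looks at the origin as in the assignment (the scaling constant is arbitrary):
import numpy as np

def look_at_basis(cam_pos):
    # The camera looks at the origin; build an orthonormal basis (right, up, forward).
    forward = -cam_pos / np.linalg.norm(cam_pos)
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    return right, up, forward

def project_vertex(v, cam_pos, scale=200.0):
    right, up, forward = look_at_basis(cam_pos)
    d = v - cam_pos                                        # step 1: move into the camera's system
    x_cam, y_cam, z_cam = d @ right, d @ up, d @ forward
    return scale * x_cam / z_cam, scale * y_cam / z_cam    # step 2: perspective divide

# The eight cube vertices projected to 2D, with (0, 0) at the center of the screen.
cube = [np.array([sx, sy, sz]) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0) for sz in (-1.0, 1.0)]
screen_pts = [project_vertex(v, cam_pos=np.array([3.0, 4.0, 5.0])) for v in cube]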

Algorithm for Polygon Image Fill

I want an efficient algorithm to fill a polygon with an image; specifically, I want to fit an image into a trapezoid. Currently I am doing it in two steps:
1) First perform a StretchBlt on the image,
2) Then perform a column-by-column vertical StretchBlt.
Is there a better method to implement this? Is there a generic and fast algorithm which can fill any polygon?
Thanks,
Sunny
I can't help you with the distortion part, but filling polygons is pretty simple, especially if they are convex.
For each Y scan line have a table indexed by Y, containing a minX and maxX.
For each edge, run a DDA line-drawing algorithm, and use it to fill in the table entries.
For each Y line, now you have a minX and maxX, so you can just fill that segment of the scan line.
The hard part is a mental trick - do not think of coordinates as specifying pixels. Think of coordinates as lying between the pixels. In other words, if you have a rectangle going from point 0,0 to point 2,2, it should light up 4 pixels, not 9. Most problems with polygon-filling revolve around this issue.
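Here is a minimal Python sketch of that table-based approach for a convex polygon, assuming vertices with non-negative coordinates; set_pixel stands in for whatever drawing routine you use. Note how treating coordinates as lying between pixels makes the 2x2 rectangle light up exactly 4 pixels.
def fill_convex_polygon(poly, set_pixel):
    # poly: list of (x, y) vertices of a convex polygon, in order.
    ys = [y for _, y in poly]
    y_min, y_max = int(min(ys)), int(max(ys))
    # Per-scanline table of [min_x, max_x), filled in by walking each edge.
    table = {y: [float('inf'), float('-inf')] for y in range(y_min, y_max)}

    n = len(poly)
    for i in range(n):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % n]
        if y0 == y1:
            continue                        # horizontal edges contribute nothing
        if y0 > y1:
            x0, y0, x1, y1 = x1, y1, x0, y0
        dx = (x1 - x0) / (y1 - y0)          # DDA step in x per unit y
        for y in range(int(y0), int(y1)):
            x = x0 + (y + 0.5 - y0) * dx    # sample at the scanline center
            lo, hi = table[y]
            table[y] = [min(lo, x), max(hi, x)]

    for y, (lo, hi) in table.items():
        for x in range(int(round(lo)), int(round(hi))):
            set_pixel(x, y)

# fill_convex_polygon([(0, 0), (2, 0), (2, 2), (0, 2)], print)  # lights 4 pixels, not 9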
ADDED: OK, it sounds like what you're really asking is how to stretch the image into a non-rectangular (here, trapezoidal) shape. I would do it in terms of parameters s and t, going from 0 to 1. In other words, a location in the original rectangle is (x + w0*s, y + h0*t). Then define a function such that s and t also map to positions in the trapezoid, such as ((x + t*a) + w0*s*(1-t) + w1*s*t, y + h1*t). This defines a coordinate mapping between the two shapes. Then just scan x and y, converting to s and t, and mapping points from one to the other. You probably want to use a little smoothing filter rather than a direct copy.
ADDED to try to give a better explanation:
I'm supposing both your rectangle and trapezoid have top and bottom edges parallel with the X axis. The lower-left corner of the rectangle is <x0,y0>, and the lower-left corner of the trapezoid is <x1,y1>. I assume the rectangle's width and height are <w,h>.
For the trapezoid, I assume it has height h1, that its lower width is w0, and that its upper width is w1. I assume its left edge "slants" by a distance a, so that the position of its upper-left corner is <x1+a, y1+h1>. Now suppose you iterate <x,y> over the rectangle. At each point, compute s = (x-x0)/w and t = (y-y0)/h, which are both in the range 0 to 1. (I'll let you figure out how to do that without using floating point.) Then convert that to a coordinate in the trapezoid, as xt = ((x1 + t*a) + s*(w0*(1-t) + w1*t)) and yt = y1 + h1*t. Then <xt,yt> is the point in the trapezoid corresponding to <x,y> in the rectangle. Now I'll let you figure out how to do the copying :-) Good luck.
P.S. And please don't forget - coordinates fall between pixels, not on them.
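A small Python sketch of that mapping, using the same names as the explanation above (x0, y0, w, h for the rectangle; x1, y1, w0, w1, h1, a for the trapezoid):
def rect_to_trapezoid(x, y, x0, y0, w, h, x1, y1, w0, w1, h1, a):
    # s, t in [0, 1] locate <x, y> inside the source rectangle.
    s = (x - x0) / w
    t = (y - y0) / h
    # Interpolate the trapezoid's left edge and width at height t, then place s along it.
    xt = (x1 + t * a) + s * (w0 * (1 - t) + w1 * t)
    yt = y1 + h1 * t
    return xt, yt
Scanning <x, y> over the rectangle and copying each source pixel to (round(xt), round(yt)) gives the distorted image; as noted above, a small smoothing filter will look better than a direct copy.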
Would it be feasible to sidestep the problem and use OpenGL to do this for you? OpenGL can render to memory contexts, and if you can take advantage of any hardware acceleration by doing this, that will completely dwarf any code tweaks you can make on the CPU (although on some older cards memory context rendering may not be able to take advantage of the hardware).
If you want to do this completely in software MESA may be an option.
