How is the GPU "instructed" to render an image? - graphics

If this question is off, please let me know as I don't want to clutter the platform with off-topic questions!
Anyways, I'm having a hard time finding information about what's actually going on when an image is rendered because of some code I've written.
Say I wanted to add the numbers 5 and 3. The CPU would write 5 to one register and 3 to another one. The ALU would take care of the calculation and output 8. That's fine, the CPU uses MOVE and ADD to produce a result.
What I can't find any information on, however, is what happens when I want to draw a rectangle. There are importable frameworks for most programming languages which let you do this. In SpriteKit (Swift & Objective-C), for example, you would write something like
let node = SKSpriteNode(color: .white, size: CGSize(width: 200, height: 300))
and add node to an SKScene (just a scene containing child nodes) and a white rectangle would "magically" get rendered. What I would like to know is what goes on under the hood. Why does this exact framework let you draw a rectangle? What is the assembly code (say, for Intel Core M) which makes the GPU calculate what this rectangle will look like? And how does SpriteKit build on the basics of Swift/Objective-C to actually do this (and could I do this myself)?
Maybe a weird question, but I feel like I have to know (yes, sometimes I'm too curious). Thank you.
P.S. I would love a really detailed answer, not "the CPU 'tells' the GPU to draw a rectangle" - CPUs can't talk!

There are many ways to render a convex polygon. The most common in the past was the scanline algorithm, where you rasterize all the edges of the outline into left/right buffers and then fill using horizontal lines, interpolating the other attributes along the way (such as z, r, g, b, tx, ty, nx, ny, nz, ...). This was well suited to single-threaded, CPU-based software rendering.
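As a rough illustration of the left/right buffer idea, here is a minimal Python sketch. It only computes coverage (no attribute interpolation), assumes integer pixel coordinates, and the function name `fill_convex_polygon` is made up for this example.

```python
import math

def fill_convex_polygon(vertices, width, height):
    """Fill a convex polygon using per-scanline left/right buffers.

    vertices: list of (x, y) integer pixel coordinates, in order.
    Returns a height x width grid holding 1 where the polygon covers a pixel.
    """
    inf = float("inf")
    left = [inf] * height     # leftmost x reached on each scanline
    right = [-inf] * height   # rightmost x reached on each scanline

    # Pass 1: walk every edge and record where it crosses each scanline.
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        if y0 == y1:
            # Horizontal edge: both endpoints bound the same scanline.
            crossings = [(y0, x0), (y0, x1)]
        else:
            if y0 > y1:
                x0, y0, x1, y1 = x1, y1, x0, y0
            crossings = [(y, x0 + (x1 - x0) * (y - y0) / (y1 - y0))
                         for y in range(y0, y1 + 1)]
        for y, x in crossings:
            if 0 <= y < height:
                left[y] = min(left[y], x)
                right[y] = max(right[y], x)

    # Pass 2: fill each scanline with one horizontal run.
    image = [[0] * width for _ in range(height)]
    for y in range(height):
        if left[y] <= right[y]:
            x_start = max(0, math.ceil(left[y]))
            x_end = min(width - 1, math.floor(right[y]))
            for x in range(x_start, x_end + 1):
                image[y][x] = 1
    return image
```

A real software renderer would store the interpolated attributes (z, color, texture coordinates) in the left/right buffers as well and interpolate them again along each horizontal run.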
With parallelization (as on a GPU), a different approach became more popular. It renders only triangles (so you need to triangulate your polygons), and works like this:
1. Compute the AABB (axis-aligned bounding box)
   This is simply the min and max of the x, y coordinates of the triangle's vertices.
2. Loop through the AABB
   This is done in parallel, by the GPU interpolators. Each interpolated (looped) "pixel" is called a fragment (as it usually contains more than just a color).
3. For each fragment
   Compute the barycentric coordinates and use the result to decide whether the fragment is inside (s>=0, t>=0, s+t<=1) or outside the triangle. If it is inside, invoke the fragment shader.
All of this happens just before the fragment shader stage, and usually all (or most) of it is implemented in hardware, so there is no code to look at.
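Although the hardware does this without any visible code, the three steps can be imitated on the CPU. The following Python sketch is only an illustration under simple assumptions (integer vertex coordinates, sampling at integer pixel positions, a serial loop instead of parallel hardware); `rasterize_triangle` is a name invented here.

```python
def rasterize_triangle(a, b, c, width, height):
    """Return (x, y, s, t) for every pixel covered by triangle a, b, c.

    s and t are the barycentric weights of b and c, so the pixel position
    equals a + s*(b - a) + t*(c - a).
    """
    (ax, ay), (bx, by), (cx, cy) = a, b, c

    # Step 1: axis-aligned bounding box, clamped to the image.
    min_x = max(0, min(ax, bx, cx))
    max_x = min(width - 1, max(ax, bx, cx))
    min_y = max(0, min(ay, by, cy))
    max_y = min(height - 1, max(ay, by, cy))

    # Twice the signed triangle area; zero means a degenerate triangle.
    det = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
    if det == 0:
        return []

    fragments = []
    # Step 2: loop through the bounding box (a GPU does this in parallel).
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            # Step 3: barycentric coordinates via Cramer's rule.
            px, py = x - ax, y - ay
            s = (px * (cy - ay) - (cx - ax) * py) / det
            t = ((bx - ax) * py - px * (by - ay)) / det
            if s >= 0 and t >= 0 and s + t <= 1:
                # A fragment shader would be invoked here.
                fragments.append((x, y, s, t))
    return fragments
```

The same s and t values are what allow per-vertex attributes (color, depth, texture coordinates) to be interpolated for each fragment.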
Nowadays GPU rendering is done by passing geometry to the graphics driver. What drivers do under the hood is guesswork for us, but most likely they also just pass the geometry and configuration settings to the right places on the GPU (memory, registers, ...).

Related

How does Skia or Direct2D render lines or polygons with GPU?

This is a question to understand the principles of GPU accelerated rendering of 2d vector graphics.
With Skia or Direct2D, you can draw e.g. rounded rectangles, Bezier curves, polygons, and also have some effects like blur.
Skia / Direct2D offer CPU and GPU based rendering.
For the CPU rendering, I can imagine more or less how e.g. a rounded rectangle is rendered. I have already seen a lot of different line rendering algorithms.
But for GPU, I don't have much of a clue.
Are rounded rectangles composed of triangles?
Are rounded rectangles drawn entirely by wild pixel shaders?
Are there some basic examples which could show me the basic principles of how such things work?
(Probably, the solution could also be found in the source code of Skia, but I fear that it would be so complex / generic that a noob like me would not understand anything.)
In the case of Direct2D, there is no source code, but since it uses D3D10/11 under the hood, it is easy enough to see what it does behind the scenes with RenderDoc.
Basically, D2D tends to minimize draw calls by trying to fit any geometry type into a single buffer, whereas Skia has dedicated shader sets depending on the shape type.
So, for example, if you draw a Bezier path, Skia will try to use a tessellation shader if possible (which requires a new draw call if the previous element you rendered was a rectangle, since the pipeline state changes).
D2D, on the other hand, tends to tessellate on the CPU and push the result to a vertex buffer. It starts a new draw call only if you change brush type (if you change from one solid color brush to another it can keep the same shaders, so it doesn't switch), when the buffer is full, or if you switch from shapes to text (since it then needs to send texture atlases).
Note that when tessellating a Bezier path, D2D does a very good job of making the resulting geometry non-self-intersecting (so alpha blending works properly even on complex self-intersecting paths).
In the case of a rounded rectangle, it does the same: it just tessellates it into triangles.
This lets it minimize draw calls to a good extent, and also allows anti-aliasing on a non-MSAA surface (this is done at mesh level, with some small triangles with alpha). The downside is that it doesn't use many hardware features, and the amount of geometry emitted can be quite high, even for seemingly simple shapes.
Since D2D prefers triangle strips to triangle lists, it can do some really funny things when drawing a simple list of triangles.
For text, D2D uses instancing and draws one instanced quad per character. It is also good at batching those, so if you call draw-text functions several times in a row, it will try to merge them into a single call as well.
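To give a feel for what "tessellating a rounded rectangle into triangles" on the CPU can mean, here is a minimal Python sketch. It is not Direct2D's actual algorithm (which is not public and also emits anti-aliasing geometry); it simply samples the four corner arcs and builds a triangle fan around the center. The function names and the `segments` parameter are assumptions made for this example.

```python
import math

def tessellate_rounded_rect(x, y, w, h, radius, segments=8):
    """Return a list of triangles (center, p_i, p_next) covering the shape."""
    r = min(radius, w / 2, h / 2)
    # Arc centers and start angles, walking counter-clockwise (y pointing up).
    corners = [
        (x + w - r, y + h - r, 0.0),
        (x + r,     y + h - r, math.pi / 2),
        (x + r,     y + r,     math.pi),
        (x + w - r, y + r,     3 * math.pi / 2),
    ]
    outline = []
    for cx, cy, start in corners:
        for i in range(segments + 1):
            angle = start + (math.pi / 2) * i / segments
            outline.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))

    # The shape is convex, so a fan around its center covers it exactly once.
    center = (x + w / 2, y + h / 2)
    count = len(outline)
    return [(center, outline[i], outline[(i + 1) % count]) for i in range(count)]

def triangle_area(a, b, c):
    """Unsigned area of a triangle given three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2
```

The resulting triangles could be appended to a shared vertex buffer together with other shapes, which is the batching behavior described above. More segments give smoother corners at the cost of more geometry.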

3d Graphing Application Questions

For one of my classes, I made a 3D graphing application (using Visual Basic). It takes in a string (z=f(x,y)) as input, parses it into RPN notation, then evaluates and graphs the equation. While it did work, it took about 20 seconds to graph. I would have liked to add slide bars to rotate the graph vertically and horizontally, but it was definitely too slow to allow that.
Does anyone know what programming languages would be best for this type of thing? Ideally, I will be able to smoothly rotate the function once it is graphed.
Also, I’m trying to find a better way to rotate the function. Right now, I evaluate it at a bunch of points, and then plot the points to the screen. Every time it is rotated, it must be re-evaluated and plot all the new points. This takes just as long as the original graph process, as it basically treats it as a completely new function.
Lastly, I need a better way to display the graph. Currently (using VB with visual studio) I plot 200,000 points to a chart, but this does not look great by any means. Eventually, I would like to be able to change color based on height, and other graphics manipulation to make it look better.
To be clear, I am not asking for someone to do any of this for me, but rather the means to go about coding this in an efficient way. I will greatly appreciate any advice anyone can give to help with any of these three concerns.
So I will explain how I would go about it using C++ and OpenGL. This doesn't mean those are the tools that you must use, it's just those are standard graphics tools.
Your function's surface is essentially a 2D manifold, which has the nice property of having an intuitive mapping to a 2D space. What is commonly referred to as UV mapping.
What you should do is pick the ranges for the rectangular domain you want to display (minimum x, maximum x, minimum y, maximum y) and make 2 nested for loops of the form:
// Pseudocode
for (x = x_minimum; x < x_maximum; x++)
    for (y = y_minimum; y < y_maximum; y++)
        3D point = (x, y, f(x,y))
Store all of these points in a container (a std::vector works fine in C++) and this will be your "mesh".
This is done once, prior to rendering. You then render those points using, for example, GL_POINTS, and rotate your graph mesh using rotations on the GPU.
This will only show scattered points, not a surface.
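The mesh-building loop described above could look roughly like this in Python. The sketch assumes a fixed number of samples per axis (`nx`, `ny`) rather than a unit step, and `build_mesh` is a name chosen for illustration only.

```python
def build_mesh(f, x_min, x_max, y_min, y_max, nx, ny):
    """Sample z = f(x, y) on an nx-by-ny grid (nx, ny >= 2).

    Points are stored row by row, so the point at grid position (i, j)
    ends up at index j * nx + i.
    """
    points = []
    for j in range(ny):
        y = y_min + (y_max - y_min) * j / (ny - 1)
        for i in range(nx):
            x = x_min + (x_max - x_min) * i / (nx - 1)
            points.append((x, y, f(x, y)))
    return points

# Usage: a paraboloid sampled on a 3 x 3 grid.
mesh = build_mesh(lambda x, y: x * x + y * y, -1, 1, -1, 1, 3, 3)
```

The function is evaluated exactly once per grid point here; nothing in this step has to be repeated when the view changes.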
If you also wish to show the surface of your function, and not just the points, you can triangulate that set of points fairly easily.
Group each 4 contiguous vertices (i.e. the vertices at indices <x,y>, <x+1,y>, <x,y+1>, <x+1,y+1>) and create the 2 triangles:
(<x,y>, <x+1,y>, <x,y+1>), (<x+1,y>, <x+1,y+1>, <x,y+1>)
This will triangulate the surface of your mesh.
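As a small sketch of that grouping, the following Python function produces the index triples for a grid stored row by row (index = j * nx + i, which is an assumption about the layout, not something the answer specifies). The name `grid_triangles` is made up for this example.

```python
def grid_triangles(nx, ny):
    """Return index triples: two triangles for every cell of an nx-by-ny grid."""
    def index(i, j):
        return j * nx + i

    triangles = []
    for j in range(ny - 1):
        for i in range(nx - 1):
            # (<x,y>, <x+1,y>, <x,y+1>)
            triangles.append((index(i, j), index(i + 1, j), index(i, j + 1)))
            # (<x+1,y>, <x+1,y+1>, <x,y+1>)
            triangles.append((index(i + 1, j), index(i + 1, j + 1), index(i, j + 1)))
    return triangles
```

With OpenGL, such a list would typically be uploaded once as an index buffer and drawn as GL_TRIANGLES.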
Essentially you only need to build your mesh once, and this way rendering should be 60 fps for something with 20 000 vertices, regardless of whether you only render points or triangles too.
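To make the "build once, rotate many times" point concrete, here is a CPU-side Python sketch of the rotation. On a GPU the same arithmetic would normally be a matrix multiplication in the vertex shader; the function name and the yaw/pitch parameterization are assumptions for this example.

```python
import math

def rotate_points(points, yaw, pitch):
    """Rotate (x, y, z) points by yaw about the z axis, then pitch about the x axis.

    Angles are in radians. The function f is never re-evaluated:
    only the stored points are transformed.
    """
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)
    cos_pitch, sin_pitch = math.cos(pitch), math.sin(pitch)
    rotated = []
    for x, y, z in points:
        # Rotation about the z axis.
        x1 = cos_yaw * x - sin_yaw * y
        y1 = sin_yaw * x + cos_yaw * y
        # Rotation about the x axis.
        y2 = cos_pitch * y1 - sin_pitch * z
        z2 = sin_pitch * y1 + cos_pitch * z
        rotated.append((x1, y2, z2))
    return rotated
```

Each slider movement would then only change `yaw` or `pitch`, which costs a few multiplications per vertex instead of a full re-evaluation of the equation.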
Programming language is mostly not relevant, so VB itself is probably not the issue. You can have the same issues in Python, C#, C++, etc. Of course you must master the programming language you choose.
One key aspect is using the right algorithms and data structures. Proper use of memory allocation and a memory layout that makes good use of the CPU (and GPU) caches are also key. Then you must take advantage of the platform and hardware capabilities (GPU and multithreading). For the last point you definitely need to use a graphics library such as OpenGL or Vulkan.

What is the fastest engine for drawing large numbers of semitransparent triangles?

I enjoy computer graphics.
I was wondering what the fastest engine was with the following functionality:
Draws triangles with 4 color channels rgba and allows for the drawing of point and directional lights.
Texturing would be a cool additional feature, but again I am looking for the fastest engine, not the most functional. Camera animation and object animation will be imperative.
Finally there are really 2 answers for this question, 1 for general development and one for web, but if you can only speak to one or the other your contributions will be appreciated!
There are quite a lot of engines that do the job. One of the best known is Unity, which also offers tons of other features with good performance.
But I think you are not really looking for an engine but for an API. Examples are OpenGL or DirectX (already mentioned). OpenGL even has a web variant (WebGL).
There is one more problem: the triangles should be semitransparent. What is missing in the other answer is the question of whether the triangles are already ordered. OpenGL, for example, is good at rendering objects where it does not matter which triangle is nearest to the viewer: the depth test finds it on the fly and only the visible triangle is shown. But with semitransparent triangles it is possible to see several triangles overlapping each other, so it is not only necessary to know which triangle is in front, but also which triangle comes directly after it, and so on. OpenGL offers blending for this, but the semitransparent triangles have to be ordered manually (back to front) before rendering. This is called the painter's algorithm. Since sorting is expensive, especially with a large number of objects, this can take quite a long time.
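The ordering requirement can be illustrated for a single pixel with a short Python sketch. It assumes that a larger depth value means farther from the viewer and uses the standard "over" blend (source alpha, one minus source alpha); the function names are invented for this example.

```python
def blend_over(src_rgba, dst_rgb):
    """Blend a semitransparent source color over an opaque destination color."""
    r, g, b, a = src_rgba
    return tuple(a * s + (1 - a) * d for s, d in zip((r, g, b), dst_rgb))

def composite_back_to_front(fragments, background):
    """Painter's algorithm for one pixel.

    fragments: list of (depth, (r, g, b, a)); larger depth = farther away.
    """
    color = background
    # Sort so that the farthest fragment is blended first.
    for _, rgba in sorted(fragments, key=lambda frag: frag[0], reverse=True):
        color = blend_over(rgba, color)
    return color

# Half-transparent red in front of half-transparent blue, on black.
pixel = composite_back_to_front(
    [(1.0, (1, 0, 0, 0.5)), (2.0, (0, 0, 1, 0.5))], (0, 0, 0))
```

Blending the same two fragments in the opposite order would give a bluish result instead of a reddish one, which is why the sort cannot be skipped.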
For this there is another solution called "depth peeling". The idea is to render all triangles multiple times with OpenGL. The first pass gives you the triangles nearest to the viewer. Then you render all triangles again, but without the ones already found in front, which gives you the second-nearest triangles. After that all triangles are rendered again without the first two "peels", which gives the third-nearest triangles, and so on. This is expensive because everything has to be rendered multiple times, but with a very large number of triangles it is faster than sorting (and more precise in the case of overlapping triangles). In most cases four peels are enough for good results. For further reading I suggest the following paper by Everitt: http://gamedevs.org/uploads/interactive-order-independent-transparency.pdf
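The peeling idea can be mimicked for a single pixel in Python. This is only a conceptual sketch: a real implementation renders the whole scene once per peel and compares against the previous pass's depth texture, whereas here each "pass" is a scan over a list of fragments. Smaller depth is assumed to mean nearer, and the function names are hypothetical.

```python
def peel_layers(fragments, max_peels=4):
    """Return up to max_peels fragments, nearest first, without sorting.

    fragments: list of (depth, (r, g, b, a)).
    Each pass keeps the nearest fragment lying strictly behind the previous peel.
    """
    layers = []
    last_depth = float("-inf")
    for _ in range(max_peels):
        nearest = None
        for frag in fragments:
            depth = frag[0]
            if depth > last_depth and (nearest is None or depth < nearest[0]):
                nearest = frag
        if nearest is None:
            break  # nothing left behind the last peel
        layers.append(nearest)
        last_depth = nearest[0]
    return layers

def composite_front_to_back(layers, background):
    """Accumulate peeled layers from the nearest to the farthest."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for _, (r, g, b, a) in layers:
        for k, channel in enumerate((r, g, b)):
            color[k] += transmittance * a * channel
        transmittance *= 1 - a
    return tuple(c + transmittance * bg for c, bg in zip(color, background))
```

Limiting `max_peels` trades accuracy for speed: fragments behind the last peel are simply ignored, which matches the observation that a handful of peels is often enough.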
Your best bet is probably OpenGL. In the case of the web, you could use WebGL and in the case of native desktop or mobile development you could directly use OpenGL.

fast 2D texture line sample

Imagine you have a chessboard textured triangle shown in front of you.
Then imagine you move the camera so that you can see the triangle from one side, when it nearly looks as a line.
You will probably see the line as grey, because this is the average color of the texels along a straight line from the camera to the far end of the triangle. The GPU does this all the time.
Now, how is this implemented? Should I sample every texel in a straight line and average the result to get the same output? Or is there another more efficient way to do this? Maybe using mipmaps?
It does not matter if you look at the object from the side, front, or back; the implementation remains exactly the same.
The exact implementation depends on the required results. A typical graphics API such as Direct3D has many different texture sample techniques, which all have different properties. Have a look at the documentation for some common sampling techniques and an explanation.
If you start looking at objects from an oblique angle, the texture on the triangle might look distorted with most sampling techniques, and anisotropic filtering is often used in these scenarios.
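Regarding the mipmap idea raised in the question: the principle is that the averaging is done ahead of time, so a strongly minified texture can be sampled with a single lookup. The Python sketch below builds such a chain with a plain 2x2 box filter for a square, power-of-two, single-channel texture. It is a simplified illustration, not a description of what any particular GPU or API does internally, and all names are invented here.

```python
def checkerboard(size):
    """size x size texture of alternating 0.0 (black) and 1.0 (white) texels."""
    return [[float((x + y) % 2) for x in range(size)] for y in range(size)]

def downsample(texture):
    """Halve a square texture by averaging each 2x2 block of texels."""
    half = len(texture) // 2
    return [[(texture[2 * y][2 * x] + texture[2 * y][2 * x + 1] +
              texture[2 * y + 1][2 * x] + texture[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)]

def build_mip_chain(texture):
    """Return [level 0, level 1, ...] down to a single texel."""
    chain = [texture]
    while len(chain[-1]) > 1:
        chain.append(downsample(chain[-1]))
    return chain

# Every level above 0 of a fine checkerboard is uniformly grey (0.5).
chain = build_mip_chain(checkerboard(8))
```

When the triangle is seen almost edge-on, many texels map to one pixel; reading from a coarse level returns the grey average directly instead of averaging every texel along the line at render time.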

How to know which triangle contribute to the color of a pixel?

I'm total new in graphics and DX, encountered a problem and no one around me know graphics too. Sorry if the question seems too naive.
I use DirectX 11 to render a mesh, and I want to get a buffer for each pixel. This buffer should store a linked-list (or some other structure) of all triangles that contribute color to this pixel.
Which shader, or which part of DX, should I work with? Or simply, where can I get the triangle information in the pixel shader?
You can write the triangle ID in the pixel shader but using the hardware z-buffer you can only capture one triangle per pixel.
With multisampled textures you can capture more triangles. This should be enough in practical situations.
If your triangles are extremely small and many of them are visible within one pixel then you should consider the A-Buffer with your own hidden surface removal algorithm.
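As a conceptual illustration of an A-buffer, here is a CPU-side Python sketch that records, for every pixel, all triangles covering its center. It assumes one constant depth per triangle and tests coverage with edge functions; on Direct3D 11 such per-pixel lists are typically built from the pixel shader using unordered access views, which this sketch does not attempt to reproduce. The function names are made up for this example.

```python
def edge(a, b, p):
    """Signed area term: positive if p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def build_a_buffer(triangles, width, height):
    """Map (x, y) -> list of (depth, triangle_id), nearest first.

    triangles: list of (a, b, c, depth) with a, b, c as (x, y) points.
    """
    a_buffer = {}
    for triangle_id, (a, b, c, depth) in enumerate(triangles):
        area = edge(a, b, c)
        if area == 0:
            continue  # degenerate triangle covers nothing
        for y in range(height):
            for x in range(width):
                p = (x + 0.5, y + 0.5)  # sample at the pixel center
                w0, w1, w2 = edge(b, c, p), edge(c, a, p), edge(a, b, p)
                if area > 0:
                    inside = w0 >= 0 and w1 >= 0 and w2 >= 0
                else:
                    inside = w0 <= 0 and w1 <= 0 and w2 <= 0
                if inside:
                    a_buffer.setdefault((x, y), []).append((depth, triangle_id))
    for entries in a_buffer.values():
        entries.sort()  # resolve visibility yourself: nearest first
    return a_buffer
```

Unlike the hardware z-buffer, which keeps only the nearest triangle per pixel, every contributing triangle ID stays available in the list.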
If you need it only for debug purposes you can use any of graphics debuggers:
Visual Studio Graphics Debugger (integrated since Visual Studio 2012)
For AMD GPUs: GPUPerfStudio
For NVidia GPUs: Nsight
Good old PIX from DX SDK.
If you need it at runtime (BTW, why? =) )
Use system-generated values such as SV_VertexID and SV_PrimitiveID to determine the exact primitive, or even vertex, that contributed to the pixel color. It is tricky, but possible.
Another way is to use some kind of custom triangle ID in vertex declaration. But be aware of culling.
You can output the final data from the pixel shader into a buffer, then read it back on the CPU.
All such problems are pretty advanced topics in DirectX. I'm not sure a "total new in graphics and DX" coder can solve them.
