fixed function vs shader based - graphics

I'm a beginner to computer graphics and am trying to get a better understanding. My professor has discussed fixed function pipeline and shader based programming. How do these two compare to each other? What's the difference?

The fixed-function pipeline is as the name suggests — the functionality is fixed. So someone wrote a list of different ways you'd be permitted to transform and rasterise geometry, and that's everything available. In broad terms, you can do linear transformations and then rasterise by texturing, interpolate a colour across a face, or by combinations and permutations of those things. But more than that, the fixed pipeline enshrines certain deficiencies.
For example, it was obvious at the time of design that there wasn't going to be enough power to compute lighting per pixel. So lighting is computed at vertices and linearly interpolated across the face.
There were some intermediate extensions related to specific effects — dot3 plus cubemaps for per-pixel lighting from a single source, for example — but the programmable pipeline lets you do whatever you want at each stage, giving you complete flexibility.
In the first place that allowed better lighting, then better general special effects (ripples on reflective water, imperfect glass, etc), and more recently has been used for things like deferred rendering that flip the pipeline on its end.
All support for the fixed-functionality pipeline is implemented by programming the programmable pipeline on hardware of the last decade or so. The programmable pipeline is an advance on its predecessor, afforded by hardware improvements.

Graphics Processing Units started off very simply with fixed functions, that allowed for quick 3D maths (much faster than CPU maths), and texture lookup, and some simple lighting and shading options (flat, phong, etc).
These were very basic but allowed the CPU to offload the very repetitive tasks of 3D rendering to the GPU. Once the Graphics was taken away from the CPU, and given to the GPU, Games made a massive leap forward.
It wasn't long before the fixed functions needed to be changed to assembly programs and soon there was demand for doing more than simple shading, basic reflections, and single texture maps offered by the fixed function GPUs.
So the 2nd breed of GPU was created, this had two distinct pipelines, one that processed vertex programs and moved verts around in 3D space, and the shader programs that worked with pixels allowing multiple textures to be merged, and more lights and shades to be created.
Now in the latest form of GPU all the pipes in the card are generic, and can run any type of GPU assembler code. This increased in the number of uses for the pipe - they still do vertex mapping, and pixel color calculation, but they also do geometry shaders (tessellation), and even Compute shaders (where the parallel processor is used to do a non-graphics job).
So fixed function is limited but easy, and now in the past for all but the most limited devices. Programmable function shaders using OpenGL (GLSL) or DirectX (HLSL) are the de-facto standard for modern GPUs.

Essential the fixed function pipeline is a hardwired implementation of a, well, fixed program, through which each piece of data a GPU processes traverses, without the ability to change the details of any step. The only thing you can parameterize are the occasional branch to switch between hardcoded paths in the program (like enabling or disabling lighting, or using a separate specular) or some constants used (light colors and positions, texture environment base color modulation). And each and every step follows a specific formula.
In a programmable pipeline however the GPU is clean slate. It's completely up to the programmer how the various stages of the rendering process (vertex transformation, tesselation, fragment processing) are carried out. And you can use whatever formula you see fit for the task.
Fixed function pipeline GPUs have exactly one illumination mode: A Lambertian illumination model, implemented using Gourad or Phong shading. There were a few tricks to slightly alter the illumination model, for example to be anisotropic, but you had to somehow outsmart (or outdumb to be hones) the GPU for this. With a programmable pipeline you simply do what you wanted to do in the first place.

Related

What are use cases for rasterizer discard?

I am working with Vulkan, but would like to know the answer to this question from a general graphics API point of view. Not rasterizing essentially means running only the vertex shader. That means no perspective division or clipping happens, because those sorts of things are done for the purpose of rasterizing (figuring out where/on which pixels/texels a triangle falls). What are the use cases for this? Essentially wouldn't this just mean some sort of program/calculation based only on the vertex input attribute stream, and if that's the case couldn't/shouldn't you just use a compute shader and skip the graphics pipeline?
Is this feature (rasterizer discard) for the ability to be able to the run the graphics pipeline with additional stages such as tessellation and geometry stages, and write to a buffer instead of a render target/framebuffer?

Number of polygons in a 3D object and the rendering workload?

Is there any relation (preferably an equation) between the number of polygons in a 3D object and the rendering workload? I want to see how much the rendering workload would be increased if for instance the number of polygons doubles.
There is no clear connection between the arbitrary number of polygons and the mythical "workload".
See the following samples:
You render a cube with 6 faces composed of 12 triangles. You get, say, 1000fps (without vsync). When you tesselate the cube into 120 triangles, most likely the fps counter remains 1000.
You render a single fullscreen-sized quad with a heavy fragment shader with a lot of calculation. You get 0.5fps (or more, but I hope you get the point).
Another extreme. You are rendering a thousand of similar cubes, each with different texture. The rendering state change will take most of the time, not the actual rendering.
So, polygons may have different screen area and they may be rendered not within a single primitive. If you're talking about one big vertex array with a large number of polygons, then for some certain scenarios the performance change must be something like linear. "Something" because the videocard and the drivers are clipping the invisible polys and perfrom the early-out tests for each pixel being rendered.
Could you define 'workload'? – Erno yesterday
Well, I mean working
calculations. I want to see how much overhead (for GPU, CPU,
memory,...) would be increased. Actually I want to conclude the energy
usage of the device – user1196937 2 hours ago
If that is the actual question, a comparison of energy usage:
You will have to pick specific configurations and test those. Energy usage is very different from GPU to GPU and machine to machine.
Some GPU manufactures give very detailed information on the performance of their processors but when you want to compare those you will need an actual machine.

Best practice for creating 2d graphics assets

As a brief background, I have been slowly chugging away at the core framework of a game I've been wanting to make for some time now. It has gotten to the point where I want to start really fleshing it out with some graphics assets other than colored boxes. And this brings me to the heart of my question:
What is the best method for creating graphics assets that appear the same quality independent of the device they are drawn on?
My game is styled after Pokemon, so I want to capture the 16-bit feel while still remaining crisp regardless of the device resolution. Does this mean I just create a ton of duplicate sprite sheets? i.e. a 16x16 32x32 48x48 64x64 version of each asset? Or should I be making vector art and rendering it out specifically for each device? Or is there some other alternative I haven't considered?
Thanks!
If by 16-bit feel you mean a classic old-school "pixelated" style (but with crisp edges). Then you can just draw them in the minimal dimension and upscale by whatever factor you need using a Pixel Art Scaling Algorithm, the simplest being nearest neighbour. There are of course many algos that produce much nicer results than NN like the 2xSaI and hqx family of algorithms, and RotSprite if you need rotation.
If you want clean antialiased edges you might want to check out this Microsoft Research paper: Depixelizing Pixel Art
You can then use these algos as a loading pre-pass for your game.
Alternatively, you could shift them "earlier" into your art pipeline to help speed up generation of multiple (resolution/transform) variants, which you could further touch up. This choice largely depends on your level of labor resources and perfectionism. Note also that this loses the "purity" of the solution since it violates DRY because updates will require changes in all variants of a sprite.
I would suggest to first try out some of these upscaling filters and see if you are happy with the results. If you are, you can get away with a loading prepass, which is by far the most desirable outcome because it reduces work and maintenance by a large factor.

Why are GPU threads in CUDA and OpenCL allocated in a grid?

I'm just learning OpenCL, and I'm at the point when trying to launch a kernel. Why is it that the GPU threads are managed in a grid?
I'm going to read more about this in detail, but it would be nice with a simple explanation. Is it always like this when working with GPGPUs?
This is a common approach, which is used in CUDA, OpenCL and I think ATI stream.
The idea behind the grid is to provide a simple, but flexible, mapping between the data being processed and the threads doing the data processing. In the simple version of the GPGPU execution model, one GPU thread is "allocated" for each output element in a 1D, 2D or 3D grid of data. To process this output element, the thread will read one (or more) elements from the corresponding location or adjacent locations in the input data grid(s). By organizing the threads in a grid, it's easier for the threads to figure out which input data elements to read and where to store the output data elements.
This contrasts with the common multi-core, CPU threading model where one thread is allocated per CPU core and each thread processes many input and output elements (e.g. 1/4 of the data in a quad-core system).
The simple answer is that GPUs are designed to process images and textures that are 2D grids of pixels. When you render a triangle in DirectX or OpenGL, the hardware rasterizes it into a grid of pixels.
I will invoke the classic analogy of putting a square peg in a round hole. Well, in this case the GPU is a very square hole and not as well rounded as GP (general purpose) would suggest.
The above explanations put forward the ideas of 2D textures, etc. The architecture of the GPU is such that all processing is done in streams with the pipeline being identical in each stream, so the data being processed need to be segmented like that.
One reason why this is a nice API is that typically you are working with an algorithm that has several nested loops. If you have one, two or three loops then a grid of one, two or three dimensions maps nicely to the problem, giving you a thread for the value of each index.
So values that you need in your kernel (index values) are naturally expressed in the API.

Are there any rendering alternatives to rasterisation or ray tracing?

Rasterisation (triangles) and ray tracing are the only methods I've ever come across to render a 3D scene. Are there any others? Also, I'd love to know of any other really "out there" ways of doing 3D, such as not using polygons.
Aagh! These answers are very uninformed!
Of course, it doesn't help that the question is imprecise.
OK, "rendering" is a really wide topic. One issue within rendering is camera visibility or "hidden surface algorithms" -- figuring out what objects are seen in each pixel. There are various categorizations of visibility algorithms. That's probably what the poster was asking about (given that they thought of it as a dichotomy between "rasterization" and "ray tracing").
A classic (though now somewhat dated) categorization reference is Sutherland et al "A Characterization of Ten Hidden-Surface Algorithms", ACM Computer Surveys 1974. It's very outdated, but it's still excellent for providing a framework for thinking about how to categorize such algorithms.
One class of hidden surface algorithms involves "ray casting", which is computing the intersection of the line from the camera through each pixel with objects (which can have various representations, including triangles, algebraic surfaces, NURBS, etc.).
Other classes of hidden surface algorithms include "z-buffer", "scanline techniques", "list priority algorithms", and so on. They were pretty darned creative with algorithms back in the days when there weren't many compute cycles and not enough memory to store a z-buffer.
These days, both compute and memory are cheap, and so three techniques have pretty much won out: (1) dicing everything into triangles and using a z-buffer; (2) ray casting; (3) Reyes-like algorithms that uses an extended z-buffer to handle transparency and the like. Modern graphics cards do #1; high-end software rendering usually does #2 or #3 or a combination. Though various ray tracing hardware has been proposed, and sometimes built, but never caught on, and also modern GPUs are now programmable enough to actually ray trace, though at a severe speed disadvantage to their hard-coded rasterization techniques. Other more exotic algorithms have mostly fallen by the wayside over the years. (Although various sorting/splatting algorithms can be used for volume rendering or other special purposes.)
"Rasterizing" really just means "figuring out which pixels an object lies on." Convention dictates that it excludes ray tracing, but this is shaky. I suppose you could justify that rasterization answers "which pixels does this shape overlap" whereas ray tracing answers "which object is behind this pixel", if you see the difference.
Now then, hidden surface removal is not the only problem to be solved in the field of "rendering." Knowing what object is visible in each pixel is only a start; you also need to know what color it is, which means having some method of computing how light propagates around the scene. There are a whole bunch of techniques, usually broken down into dealing with shadows, reflections, and "global illumination" (that which bounces between objects, as opposed to coming directly from lights).
"Ray tracing" means applying the ray casting technique to also determine visibility for shadows, reflections, global illumination, etc. It's possible to use ray tracing for everything, or to use various rasterization methods for camera visibility and ray tracing for shadows, reflections, and GI. "Photon mapping" and "path tracing" are techniques for calculating certain kinds of light propagation (using ray tracing, so it's just wrong to say they are somehow fundamentally a different rendering technique). There are also global illumination techniques that don't use ray tracing, such as "radiosity" methods (which is a finite element approach to solving global light propagation, but in most parts of the field have fallen out of favor lately). But using radiosity or photon mapping for light propagation STILL requires you to make a final picture somehow, generally with one of the standard techniques (ray casting, z buffer/rasterization, etc.).
People who mention specific shape representations (NURBS, volumes, triangles) are also a little confused. This is an orthogonal problem to ray trace vs rasterization. For example, you can ray trace nurbs directly, or you can dice the nurbs into triangles and trace them. You can directly rasterize triangles into a z-buffer, but you can also directly rasterize high-order parametric surfaces in scanline order (c.f. Lane/Carpenter/etc CACM 1980).
There's a technique called photon mapping that is actually quite similar to ray tracing, but provides various advantages in complex scenes. In fact, it's the only method (at least of which I know) that provides truly realistic (i.e. all the laws of optics are obeyed) rendering if done properly. It's a technique that's used sparingly as far as I know, since it's performance is hugely worse than even ray tracing (given that it effectively does the opposite and simulates the paths taken by photons from the light sources to the camera) - yet this is it's only disadvantage. It's certainly an interesting algorithm, though you're not going to see it in widescale use until well after ray tracing (if ever).
The Rendering article on Wikipedia covers various techniques.
Intro paragraph:
Many rendering algorithms have been
researched, and software used for
rendering may employ a number of
different techniques to obtain a final
image.
Tracing every ray of light in a scene
is impractical and would take an
enormous amount of time. Even tracing
a portion large enough to produce an
image takes an inordinate amount of
time if the sampling is not
intelligently restricted.
Therefore, four loose families of
more-efficient light transport
modelling techniques have emerged:
rasterisation, including scanline
rendering, geometrically projects
objects in the scene to an image
plane, without advanced optical
effects; ray casting considers the
scene as observed from a specific
point-of-view, calculating the
observed image based only on geometry
and very basic optical laws of
reflection intensity, and perhaps
using Monte Carlo techniques to reduce
artifacts; radiosity uses finite
element mathematics to simulate
diffuse spreading of light from
surfaces; and ray tracing is similar
to ray casting, but employs more
advanced optical simulation, and
usually uses Monte Carlo techniques to
obtain more realistic results at a
speed that is often orders of
magnitude slower.
Most advanced software combines two or
more of the techniques to obtain
good-enough results at reasonable
cost.
Another distinction is between image
order algorithms, which iterate over
pixels of the image plane, and object
order algorithms, which iterate over
objects in the scene. Generally object
order is more efficient, as there are
usually fewer objects in a scene than
pixels.
From those descriptions, only radiosity seems different in concept to me.

Resources