Typical rendering strategy for many and varied complex objects in directx? - graphics

I am learning directx. It provides a huge amount of freedom in how to do things, but presumably different stategies perform differently and it provides little guidance as to what well performing usage patterns might be.
When using directx is it typical to have to swap in a bunch of new data multiple times on each render?
The most obvious, and probably really inefficient, way to use it would be like this.
Stragety 1
On every single render
Load everything for model 0 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything for model 1 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
I am guessing you can make this more efficient partly if the biggest things to load are given dedicated slots, e.g. if the texture for model 0 is really complicated, don't reload it on each step, just load it into slot 1 and leave it there. Of course since I'm not sure how many registers there are certain to be of each type in DX11 this is complicated (can anyone point to docuemntation on that?)
Stragety 2
Choose some texture slots for loading and others for perpetual storage of your most complex textures.
Once only
Load most complicated models, shaders and textures into slots dedicated for perpetual storage
On every single render
Load everything not already present for model 0 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything not already present for model 1 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
Strategy 3
I have no idea, but the above are probably all wrong because I am really new at this.
What are the standard strategies for efficient rendering on directx (specifically DX11) to make it as efficient as possible?

DirectX manages the resources for you and tries to keep them in video memory as long as it can to optimize performance, but can only do so up to the limit of video memory in the card. There is also overhead in every state change even if the resource is still in video memory.
A general strategy for optimizing this is to minimize the number of state changes during the rendering pass. Commonly this means drawing all polygons that use the same texture in a batch, and all objects using the same vertex buffers in a batch. So generally you would try to draw as many primitives as you can before changing the state to draw more primitives
This often will make the rendering code a little more complicated and harder to maintain, so you will want to do some profiling to determine how much optimization you are willing to do.
Generally you will get better performance increases through more general algorithm changes beyond the scope of this question. Some examples would be reducing polygon counts for distant objects and occlusion queries. A popular true phrase is "the fastest polygons are the ones you don't draw". Here are a couple of quick links:
http://msdn.microsoft.com/en-us/library/bb147263%28v=vs.85%29.aspx
http://www.gamasutra.com/view/feature/3243/optimizing_direct3d_applications_.php
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter06.html

Other answers are better answers to the question per se, but by far the most relevant thing I found since asking was this discussion on gamedev.net in which some big title games are profiled for state changes and draw calls.
What comes out of it is that big name games don't appear to actually worry too much about this, i.e. it can take significant time to write code that addresses this sort of issue and the time it takes to spend writing code fussing with it probably isn't worth the time lost getting your application finished.

Related

What's the use of lightmap?

For starter, I know what a lightmap is, what to be gained by using it, and how to implement it. What I don't get is, if you have dynamically moving objects, which won't be able to generate a lightmap, you still need a light source to project their shadows. So, what would be gained by lightmap if we still need a light for the ones that won't get any lightmapping (ie. dynamic objects)?
Thanks in advance.
If you don't use realtime shadows (it's an option, often on mobile), than you can have more or less 2 approach for dynamic objects:
Use lightmap data baked into probes to approximate per-vertex lighting (no need to have realtime light). It's an approximation but can work on some contexts.
Use real-time lights only on dynamic objects, so you'll improve the look of those without sacrifice any performance on static ones, which can use only baked lights
If you need dynamic shadows casted by dynamic objects on static baked ones, than you still can benefits from lightmaps for several reasons:
Even if additional lighting passes are necessary to project shadows on static lightmapped objects, probably not all objects will be affected by shadow, but only the ones relatively close to the dynamic shadow caster object. So you still can save a lot of GPU time.
Lightmaps (especially on forward render path), allow to produce complex light scenarios, that would be otherwise impossible to achieve in realtime. Dynamic objects, doesn't need to be affected by all baked lights, but eventually only from the more important ones. This way you can have a limited amount of drawcalls for a really good looking static environments, and a limited number of "important" lights affecting the dynamic objects
if you have dynamically moving objects, which won't be able to
generate a lightmap, you still need a light source to project their
shadows.
That's true. But:
you save computing shading of static lightmapped objects because light won't affect them
as already said before, projected shadow will be casted on a limited set of objects

Level of Detail in 3D graphics - What are the pros and cons?

I understand the concept of LOD but I am trying to find out the negative side of it and I see no reference to that from Googling around. The only pro I keep coming across is that it improves performance by omitting details when an object is far and displaying better graphics when the object is near.
Seriously that is the only pro and zero con? Please advice. Tnks.
There are several kinds of LOD based on camera distance. Geometric, animation, texture, and shading variations are the most common (there are also LOD changes that can occur based on image size and, for gaming, hardware capabilities and/or frame rate considerations).
At far distances, models can change tessellation or be replaced by simpler models. Animated details (say, fingers) may simplify or disappear. Textures may move to simpler textures, bump maps vanish, spec/diffuse maps combines, etc. And shaders may also swap-put to reduce the number of texture inputs or calculation (though this is less common and may be less profitable, since when objects are far away they already fill fewer pixels -- but it's important for screen-filling entities like, say, a mountain).
The upsides are that your game/app will have to render less data, and in some cases, the LOD down-rezzed model may actually look better when far away than the more-complex model (usually because the more detailed model will exhibit aliasing when far away, but the simpler one can be tuned for that distance). This frees-up resources for the nearer models that you probably care about, and lets you render overall larger scenes -- you might only be able to render three spaceships at a time at full-res, but hundreds if you use LODs.
The downsides are pretty obvious: you need to support asset swapping, which can mean both the real-time selection of different assets and switching them but also the management (at times of having both models in your memory pipeline (one to discard, one to load)); and those models don't come from the air, someone needs to create them. Finally, and this is really tricky for PC apps, less so for more stable platforms like console gaming: HOW DO YOU MEASURE the rendering benefit? What's the best point to flip from version A of a model to B, and B to C, etc? Often LODs are made based on some pretty hand-wavy specifications from an engineer or even a producer or an art director, based on hunches. Good measurement is important.
LOD has a variety of frameworks. What you are describing fits a distance-based framework.
One possible con is that you will have inaccuracies when you choose an arbitrary point within the object for every distance calculation. This will cause popping effects at times since the viewpoint can change depending on orientation.

OpenGL geometry performance

I have an application which renders many filled polygons with OpenGL, in 2D. Filling is done by tesselation but performance is not optimal. 1900 polygons made up of 122000 vertex (that is, about 64 vertex per polygon) are displayed in about 3 seconds.
Apparently, the CPU is not the bottleneck, as if I replace calls to gluTessVertex by calls to glColor - just to test where is the bottleneck, performance is doubled.
I have the same problem with loading many small textures.
Now, which are the options to improve the performance? Seems that most time is spend in the geometry subsystem. Rendering is fast enough.
I already have a worker thread which does the load (so tesselation, texture binding) in one context, and another thread which does the draw in another context. The two contexts share objects via wglShareLists and it works like a charm.
Can I have a third thread in a third context which would handle also tesselation for half of the polygons? Anyone tried that? Is it safe? Any example of sharing objects between three contexts?
Forgot to say, I have an ATI Radeon HD 4550 graphics card, suppose it can handle more than 39kB/s of data.
Increase Performance
Sounds like you're using the old fixed-function pipeline.
If you're unsure of what that is, well, the following functions are a part of the fixed-function pipeline.
glBegin()
glEnd()
glVertex*()
glTexCoord*()
glNormal*()
glColor*()
etc.
Those functions are old and render geometry immediately. That means that each time you call the above functions, that geometry gets send to the GPU. By doing that a lot of times, you can easy make the FPS go way under 60 just by rendering simple things.
Now you need to use buffers and to be more precise VAOs with/or VBOs (and IBOs).
VBO or Vertex Buffer Object, is a buffer which can store vertices which you then can render. This is much much faster and better to use than glBegin() and glEnd(). When you create a VBO you supply it with vertices and they only require to be send to the GPU once, that's basically why they are fast, because they already are in the GPU and only require a single draw call instead of multiple.
The reason I said "with/or" is because in the newer versions you need to create a VAO which then would use a VBO, where before you could simply render the VBOs.
Tessellation
There are multiple ways to do tessellation and things which look like/would give the effect of tessellation.
For instance you could also simply render different models according to the required LOD (Level of Detail), thereby when you're up close to an object you then render the model with all it's details which probably would have a high vertices count. Then the further you're away from the model you simply render another version of that model but which have less vertices, which also equals less detail. Though if you can't really do that on something like terrain and definitely shouldn't do it on something like dynamic terrain and/or procedurally generated terrain.
You can also do actual geometry tessellation and you would do that through a Shader. Since tessellation is a really huge topic I will provide you with 2 urls which both explain and have code on them.
Both of these articles uses modern 4.0/4.0+ OpenGL.
http://prideout.net/blog/?p=48
http://antongerdelan.net/opengl/tessellation.html
Texturing
Generating and binding textures are still the same.
Instead of using gluBuild2DMipmaps() you can use glGenerateMipmap(GL_TEXTURE_2D); it was added in OpenGL version 3.0'ish if I remember correctly.
Again you can (and should) change all you glBegin() - glEnd() (and everything in between) calls out with VAOs and VBOs. You can store everything you want inside a buffer vertices, texture coordinates, normals, colors, etc. You can store the things in separate buffers or you can store them inside a single buffer, usually called an Interleaved Buffer or Interleaved VBO.
You wouldn't be needing glEnable(GL_TEXTURE_2D) and glDisable(GL_TEXTURE_2D) anymore, because you do that within a Shader, you bind textures and use them in a Shader, and since you create the Shader Program you can make it act however you want it to.

Best approach for game animation?

I have a course exercise in OpenGL to write a game with simple animation of a few objects
While discussing with my partner our design options we've realized we have two major choices for how the animation is going to work, Either
Set a timer for a constant interval, say 30 msec, when the timer hits, calculate where objects should be and draw the frame. or -
Don't use a timer, just a normal loop that runs all the time and in each iteration check how much time passed, calculate where the objects should be according to the interval and draw the frame.
What should generally be the preferred approach? Does anyone have concrete experience with either approach?
Render and compute as fast as you can to get the maximum frame rate (as capped by the vertical sync)
Don't use a timer, they're not reliable < 50-100 ms on Windows. Check how much time has passed. (Usually, you need both delta t and an absolute value, depending on if your animation is physics or keyframe based.)
Also, if you want to be stable, use an upper/lower bound on your time-step, to go into slow-motion if a frame takes a few secs to render (disc access by another process?) or skip an update if you get two of them within say 10 ms.
Update
(Since this is a rather popular answer)
I usually prefer having a fixed time-step, as it makes everything more stable. Most physics engines are pretty robust against varying time, but other things, like particle systems or various simpler animations or even game logic, are easier to tune when having everything run in a fixed time step.
Update2
(Since I got 10 upvotes ;)
For further stability over long periods of running (>4 hours), you probably want to make sure you're not using floats/doubles to compute large time differences, since you lose precision doing so and your game's animations/physics will suffer. Use fixed point (or 64-bit microsecond-based) integers instead.
For the hairy details, I recommend reading A matter of precision by Tom Forsyth.
Read this page about game loops.
In short, set a timer:
Update the state of the game at a fixed frequency (something like every 25 ms = 1s/40fps). That includes the properties of the game objects, the input, the physics, the AI, etc. I call that the Model and the Controller. The need for a fixed update rate comes from the problems that may appear on too slow or too fast hardware (read the article). Some physics engine also prefer to update at a fixed frequency.
Update the frame (the graphics) of the game as fast as possible. That would be the View. That way you'll provide a smooth game. You can also enable vsync so the display will wait for the graphic card (usually it's 60 fps).
So each iteration of the loop, you check if you should update the model/controller. If it's late, update until they are up to date. Then, update the frame once and continue your loop.
The tricky part is that because of the different update rates, in fast hardware, the view will update several times before the model and controller. Therefore you should interpolate the position of your game objects depending on "where they would be if the game state would have been updated". It's really not that difficult.
You may have to maintain 2 different data structures : one for the model and one for the view. For instance you could have a scene graph for your model and a BSP tree for your view.
The second would be my preferred approach, because timers are often not as accurate as you're probably thinking and have all the latency and overhead of the event handling system. Accounting for the time interval will give your animations a much more consistent look and be robust if/when your frame rate dips.
Having said that, if your animation is based on a physics simulation (eg rigid body or ragdoll animation), then having a fixed update interval for your physics can greatly simplify the implementation.
Option 2 is by far preferred. It will scale nicely across differently performing hardware.
The book "Game Programming Gems 1" had a chapter that covers exactly what you need:
Frame Rate Independent Linear Interpolation
Use the second method. Did a game for my senior project and from experience, there is no guarantee that your logic will be done processing when the timer wants to fire.
I would be tempted to use the loop, since it will render as fast as possible (i.e. immediately after your physics computations are done). This will probably be more robust if you run into any slow-down in computation, which would cause timer firings to start queueing up. However, in case of such a slow-down you may have to put a cap on the time step computed between updates, since your physics engine may go unstable with too large a jump in time.
I'd suggest setting the system up to work on a "delta" that's passed in from outside.
When I did this, inside the animation format I based everything on real time values. The delta I passed in was 1 / 30 seconds, but it could be anything. With this system you can get either your first or second option, depending on whether you pass in a fixed delta or if you pass in the amount of time that has passed since the last frame.
As for which is better, it depends on your game and your requirements. Ideally all of your systems would be based around the same delta so that your physics match your animations. If your game drops frames at all and if all of your systems work with a variable delta, I'd suggest the variable delta is the better of the two solutions for end user experience.

How do you measure if an interface change improved or reduced usability?

For an ecommerce website how do you measure if a change to your site actually improved usability? What kind of measurements should you gather and how would you set up a framework for making this testing part of development?
Multivariate testing and reporting is a great way to actually measure these kind of things.
It allows you to test what combination of page elements has the greatest conversion rate, providing continual improvement on your site design and usability.
Google Web Optimiser has support for this.
Similar methods that you used to identify the usability problems to begin with-- usability testing. Typically you identify your use-cases and then have a lab study evaluating how users go about accomplishing certain goals. Lab testing is typically good with 8-10 people.
The more information methodology we have adopted to understand our users is to have anonymous data collection (you may need user permission, make your privacy policys clear, etc.) This is simply evaluating what buttons/navigation menus users click on, how users delete something (i.e. changing quantity - are more users entering 0 and updating quantity or hitting X)? This is a bit more complex to setup; you have to develop an infrastructure to hold this data (which is actually just counters, i.e. "Times clicked x: 138838383, Times entered 0: 390393") and allow data points to be created as needed to plug into the design.
To push the measurement of an improvement of a UI change up the stream from end-user (where the data gathering could take a while) to design or implementation, some simple heuristics can be used:
Is the number of actions it takes to perform a scenario less? (If yes, then it has improved). Measurement: # of steps reduced / added.
Does the change reduce the number of kinds of input devices to use (even if # of steps is the same)? By this, I mean if you take something that relied on both the mouse and keyboard and changed it to rely only on the mouse or only on the keyboard, then you have improved useability. Measurement: Change in # of devices used.
Does the change make different parts of the website consistent? E.g. If one part of the e-Commerce site loses changes made while you are not logged on and another part does not, this is inconsistent. Changing it so that they have the same behavior improves usability (preferably to the more fault tolerant please!). Measurement: Make a graph (flow chart really) mapping the ways a particular action could be done. Improvement is a reduction in the # of edges on the graph.
And so on... find some general UI tips, figure out some metrics like the above, and you can approximate usability improvement.
Once you have these design approximations of user improvement, and then gather longer term data, you can see if there is any predictive ability for the design-level usability improvements to the end-user reaction (like: Over the last 10 projects, we've seen an average of 1% quicker scenarios for each action removed, with a range of 0.25% and standard dev of 0.32%).
The first way can be fully subjective or partly quantified: user complaints and positive feedbacks. The problem with this is that you may have some strong biases when it comes to filter those feedbacks, so you better make as quantitative as possible. Having some ticketing system to file every report from the users and gathering statistics about each version of the interface might be useful. Just get your statistics right.
The second way is to measure the difference in a questionnaire taken about the interface by end-users. Answers to each question should be a set of discrete values and then again you can gather statistics for each version of the interface.
The latter way may be much harder to setup (designing a questionnaire and possibly the controlled environment for it as well as the guidelines to interpret the results is a craft by itself) but the former makes it unpleasantly easy to mess up with the measurements. For example, you have to consider the fact that the number of tickets you get for each version is dependent on the time it is used, and that all time ranges are not equal (e.g. a whole class of critical issues may never be discovered before the third or fourth week of usage, or users might tend not to file tickets the first days of use, even if they find issues, etc.).
Torial stole my answer. Although if there is a measure of how long it takes to do a certain task. If the time is reduced and the task is still completed, then that's a good thing.
Also, if there is a way to record the number of cancels, then that would work too.

Resources