Z-fighting in Direct3D 9, only with a dynamic vertex buffer

I lock and fill a vertex buffer every frame in Direct3D 9 with data from my blendshape code. My shading uses two steps, so I render once with one shader, then draw an additive blend with my other shader.
For reasons beyond me, the data in my vertex buffer is (apparently) slightly different between those two drawing calls, because I have flickering z-fighting where the second pass sometimes renders 'behind' the first.
This is all done in one thread, and the buffer is unlocked a long time before the render calls. Additionally, no changes to any shader instruction take place, so the data should be exactly the same in both calls. If the blendshape happens not to change, no z-fighting takes place.
For now I 'push' the depth a little in my shader, but this is a very inelegant solution.
Why might this data be changed? Why may DirectX make changes to the data in my buffer after I unlock it? Can I force it not to change it?

1st: Are you sure the data is really being changed by D3D, or is this just an assumption? I'm sure D3D doesn't change your data.
2nd: As you said, you have two different shaders drawing your geometry. They may have different transformation operations, or the compiler may have optimized the transformations differently, which is why your transformed vertices can differ slightly (but enough for z-fighting). I suggest using two passes in one shader/technique.
Or, if you still want to use two shaders, you should use shared code for the transformation and other identical operations.

I can assure you that the D3D runtime will not change any data you pass in via a vertex buffer; I did the same thing as you when rendering two-layer terrain, with no z-fighting. But there are indeed some render states that change the depth while rasterizing the triangles into pixels: D3DRS_DEPTHBIAS and D3DRS_SLOPESCALEDEPTHBIAS in D3D9, or the equivalent fields in the D3D10_RASTERIZER_DESC structure. If these render states have been changed, you should check them.
You also need to be sure that all of the transform matrices and other constants used in the position calculations in the shader are exactly equal; otherwise there will be z-fighting.
I suggest you use a graphics debugging tool to check this. You can use PIX, or PerfHUD or Nsight if you are using an NVIDIA card.
I'm sorry for my poor English; it must be hard to understand. But I hope this helps, thanks.

Related

SSAO, Optimizations and pipelines using OpenSceneGraph

I have some questions about the SSAO technique implementation:
Does it really need a second (or more) pipeline with all the geometry? I mean, I found some tutorials and such about it, but mostly they just give you directions without going into further detail.
Is there any optimization possible? I'm using OSG, and I've got the impression that sending the textures to the CPU and then back to the GPU isn't the best possible solution.
Is it possible to have the shaders write the sample depths to a texture in a buffer and send it to the second pipeline, using only the screen quad, the colors, the depth of the scene, and the depths for the tests? I'm using OSG and couldn't find how to do this properly in the documentation.
In general, SSAO is best suited to being implemented as part of a deferred shading approach. A strictly forward shading approach is possible, but would still require two rendering passes, and SSAO can easily be added to the second rendering pass of a deferred shading engine. In SSAO, you need the complete depth buffer of your scene to be able to calculate occlusion, so the short answer to section 1 of your question is yes, SSAO requires two rendering passes.
Note that in deferred shading, although there are two rendering passes, the complex geometry (i.e. your models) is only rendered during the first pass, and the second pass is generally made up of simple polygon shapes rendered for each type of light. This is almost what you're suggesting in section 3 of your question.
With regards to section 2 of your question, when set up correctly, you shouldn't need to move your intermediate textures back to the CPU and then back to the GPU between the two rendering passes; you merely make your first rendering pass's textures available as a resource to your second rendering pass.

Why do we still use fixed-function blending operations in D3D11 etc.?

I was looking around trying to understand why we are still using fixed-function blending modes in newer 3D APIs (like D3D11). In D3D10, fixed-function alpha clipping was removed in favor of doing it in the shaders. Why? Because it's a much more powerful approach in almost any situation.
So why, then, can we not compute our own blending operations (i.e., sample the texture of the render target we are currently rendering into)? Is there some hardware design issue in the video card pipelines that makes this difficult to accomplish?
The reason this would be useful is that you could make things like refraction shaders run much faster, since you wouldn't have to swap back and forth between two render targets for each refractive object overlay, such as a refractive windowing system for an OS or game UI.
Where might be the best place to suggest an idea like this, since this is not a discussion forum? I would love to see this in D3D12. Or is it already possible in D3D11?
So why, then, can we not compute our own blending operations
Who says you can't? With shader_image_load_store (and the D3D11 equivalent), you can do pretty much anything you want with images, provided that you follow the rules. That last part is generally what trips people up. Doing a full read/modify/write in a shader, such that later fragment shader invocations don't read the wrong value, is almost impossible in the most general case. You have to restrict it by saying that each rendered object will not overlap with itself, and you have to insert a memory barrier between rendered objects (which can overlap with other rendered objects). Or you use the linked-list approach.
But the point is this: with these mechanisms, not only have people implemented blending in shaders, but they've implemented order-independent transparency (via linked lists). Nothing is stopping you from doing what you want right now.
Well, nothing except performance of course. The fixed-function blender will always be faster because it can run in parallel with the fragment shader operations. The blending units are separate hardware from the fragment shaders, so you can be doing blending operations while simultaneously doing fragment shader ops (obviously from later fragments, not the ones being blended).
The read/modify/write mechanism in the blend hardware is designed specifically for blending, while the image_load_store is a more generic mechanism. And while generic may beat specific in the long-term of hardware evolution, for the immediate and near-future, you can expect fixed-function blending to beat image_load_store blending performance-wise every time.
You should use it only when you must. And even then, decide whether you really, really need it.
Is there some hardware design issue in the video card pipelines that make this difficult to accomplish?
Yes, this is actually the case. If one could do blending in the fragment shader, this would introduce possible feedback loops, and this really complicates things. Blending is done in a separate hardwired stage for performance and parallelization reasons.

Best-looking texture mapping for characters (NPCs, enemies)

I'm making a "dungeon master"-like game where the corridors and objects will be models. I have everything completed, but the graphics part of the game is missing. I have also made test levels without textures.
I would like to know which texture mapping would be the best for a realistic look.
I was thinking about parallax mapping for walls and doors, normal mapping for objects like treasure and boxes.
What mapping should I choose for enemies, npcs?
I have never worked with HLSL before, so I want to be sure I'm heading straight for my goal, because I expect more hard work there.
The mapping to use depends on your tastes. But first of all, implement diffuse color mapping and per-pixel lighting. When that is working, add normal mapping. If you're still not satisfied, add parallax mapping.
Even better results than the combination of normal and parallax mapping can be achieved using DirectX 11 tessellation and displacement mapping. But this is much more GPU-intensive and may not work on older hardware.

Fast pixel drawing library

My application produces an "animation" in a per-pixel manner, so I need to draw it efficiently. I've tried different strategies/libraries with unsatisfactory results, especially at higher resolutions.
Here's what I've tried:
SDL: ok, but slow;
OpenGL: inefficient pixel operations;
xlib: better, but still too slow;
svgalib, directfb (and other framebuffer implementations): they seem perfect but definitely too tricky to set up for the end user.
(NOTE: I'm maybe wrong about these assertions, if it's so please correct me)
What I need is the following:
fast pixel drawing with performance comparable to OpenGL rendering;
it should work on Linux (cross-platform as a bonus feature);
it should support double buffering and vertical synchronization;
it should be portable across different hardware;
it should be open source.
Can you please give me some enlightenment/ideas/suggestions?
Are your pixels sparse or dense (e.g. a bitmap)? If you are creating dense bitmaps out of pixels, then another option is to convert the bitmap into an OpenGL texture and use OpenGL APIs to render at some framerate.
The basic problem is that graphics hardware will be very different on different hardware platforms. Either you pick an abstraction layer, which slows things down, or code more closely to the type of graphics hardware present, which isn't portable.
I'm not totally sure what you're doing wrong, but it could be that you are writing pixels one at a time to the display surface.
Don't do that.
Instead, create a rendering surface in main memory in the same format as the display surface, render into it, and then copy the whole rendered image to the display in a single operation. Modern GPUs are very slow per transaction, but can move lots of data very quickly in a single operation.
It looks like you are confusing windowing libraries (SDL and xlib) with a rendering library (OpenGL).
Just pick a windowing library (SDL, GLUT, or xlib if you like a challenge), activate double-buffered mode, and make sure that you get direct rendering.
What kind of graphics card do you have? Most likely it can process the pixels on the GPU. Look up how to write pixel shaders in OpenGL; pixel shaders run per pixel.

How do I project lines dynamically on to 3D terrain?

I'm working on a game in XNA for Xbox 360. The game has 3D terrain with a collection of static objects that are connected by a graph of links. I want to draw the links connecting the objects as lines projected on to the terrain. I also want to be able to change the colors etc. of links as players move their selection around, though I don't need the links to move. However, I'm running into issues making this work correctly and efficiently.
Some ideas I've had are:
1) Render quads to a separate render target and use the texture as an overlay on top of the terrain. I currently have this working, generating the texture only for the area currently visible to the camera to minimize aliasing. However, I'm still getting aliasing issues: the lines look jaggy, and the game chugs frequently when moving the camera. EDIT: it chugs all the time; I just don't have a frame-rate counter on Xbox, so I only notice it when things move.
2) Bake the lines into a texture ahead of time. This could increase performance, but makes the aliasing issue worse. Also, it doesn't let me dynamically change the properties of the lines without much munging.
3) Make geometry that matches the shape of the terrain by tessellating the line-quads over the terrain. This option seems like it could help, but I'm unsure if I should spend time trying it out if there's an easier way.
Is there some magical way to do this that I haven't thought of? Is one of these paths the best when done correctly?
Your 1) is a fairly good solution. You can reduce the jagginess by filtering -- first, make sure to use bilinear sampling when using the overlay. Then, try blurring the overlay after drawing it but before using it; if you choose a proper filter, it will remove the aliasing.
If it's taking too much time to render the overlay, try reducing its resolution. Without the antialiasing filter, that would just make it jaggier, but with a good filter, it might even look better.
I don't know why the game would chug only when moving the camera. Remember, you should have a separate camera for the overlay: orthographic, pointing straight down at the terrain.
Does XNA have a shadowing library? If so, you could just pretend the lines are shadows.
