How to access the depth buffer in a pixel shader in DirectX 9.0c - direct3d

Is it possible to access the depth buffer from a pixel shader 2.0 in DX 9.0c? I've googled a bit and the only solution I've found describes a GPU hack that works only on GeForce 6 & 7.
What I am trying to achieve is a depth-of-field shader effect. I can't simply grab the Z coordinates of the vertices, because I am also doing a render-to-texture trick used for other post-processing.
Edit:
I've tried this:
D3DXCreateTexture(lpD3Dev9, width, height, 1, D3DUSAGE_DEPTHSTENCIL, D3DFMT_D24S8, D3DPOOL_DEFAULT, &lpD3DepthBuffer);
lpD3DepthBuffer->GetSurfaceLevel(0,&lpNewDepthBuffer);
lpD3Dev9->GetDepthStencilSurface(&lpPrevDepthBuffer);
lpD3Dev9->SetDepthStencilSurface(lpNewDepthBuffer);
lpD3Dev9->BeginScene();
// Rendering...
lpD3Dev9->EndScene();
lpD3Dev9->SetDepthStencilSurface(lpPrevDepthBuffer);
// this function fails:
D3DXSaveTextureToFile("C:\\backBuffer.png", D3DXIFF_PNG, lpD3DepthBuffer, NULL);
lpD3Dev9->BeginScene();
UINT passes;
D3DXHANDLE tech;
lpD3DXfxScreen->FindNextValidTechnique(0, &tech);
lpD3DXfxScreen->SetTechnique(tech);
lpD3DXfxScreen->Begin(&passes,0);
lpD3DXfxScreen->SetTexture("texDepth", lpD3DepthBuffer);
// render shader...
D3DXSaveTextureToFile fails, and the shader gets a texture that is pure white, RGB(1, 1, 1).

To my knowledge there is no standardised way of doing it the way you are trying to do it.
You are better off just creating a D3DFMT_R32F (or similar) texture with D3DUSAGE_RENDERTARGET and then writing depths to it as if it were a normal render target (i.e. writing the depth into the r component from the pixel shader). You can then use that texture for whatever purpose you like simply by binding it to a sampler and reading the r value straight out of it in whatever pixel shader you are working with.
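As a rough, untested sketch of that approach (reusing lpD3Dev9, width, height and the lpD3DXfxScreen effect from the question; a technique whose pixel shader writes depth to the red channel is assumed to exist in your effect), the setup might look something like this:
LPDIRECT3DTEXTURE9 lpDepthTexture = NULL;
LPDIRECT3DSURFACE9 lpDepthSurface = NULL;
LPDIRECT3DSURFACE9 lpPrevRenderTarget = NULL;
// Create a single-channel float render target to receive depth written by
// the pixel shader (use CheckDeviceFormat first if you are unsure whether
// D3DFMT_R32F is supported as a render target on the device).
D3DXCreateTexture(lpD3Dev9, width, height, 1, D3DUSAGE_RENDERTARGET, D3DFMT_R32F, D3DPOOL_DEFAULT, &lpDepthTexture);
lpDepthTexture->GetSurfaceLevel(0, &lpDepthSurface);
// Render the scene into it with a shader that outputs z/w (or view-space z) in the red channel.
lpD3Dev9->GetRenderTarget(0, &lpPrevRenderTarget);
lpD3Dev9->SetRenderTarget(0, lpDepthSurface);
lpD3Dev9->BeginScene();
// ... draw the scene with the depth-writing technique ...
lpD3Dev9->EndScene();
lpD3Dev9->SetRenderTarget(0, lpPrevRenderTarget);
// Bind the result to the post-processing effect like any other texture
// (remember to Release the surfaces when you are done with them).
lpD3DXfxScreen->SetTexture("texDepth", lpDepthTexture);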
That said, try going into the DirectX control panel and switching to the debug runtime. You'll see a tonne of useful error messages spewed out into the debug output stream by the DirectX runtime. There really is no reason not to do this, as it would save you a lot of bother on problems like this.

Related

How do I rotate an object so that it's always facing the mouse position?

I'm using ggez to make a game with some friends, and I'm trying to have our character rotate to face the pointer at all times. I know so far that I need to get an angle value (f32) in radians, and I think I can use atan2 to get this (?), but I just don't get the behavior that I want.
This is the code I have: (btw, move_data is a struct holding our player character's values, such as position, velocity, angle and rotation speed).
let m = mouse::position(ctx);
move_data.angle = ((m.y - move_data.position.y).atan2(move_data.position.x - m.x)) * (consts::PI / 2.0) as f32;
I think that I'm close, as I'm already able with this to rotate the character, but only in a sort of 'incomplete' way. The player character (pc) can mostly only face to the upper left corner, when I move the mouse there. Otherwise, if the pointer is to the right and/or below the pc, it rotates in a very slow and minor way, and stops facing the pointer. I don't know if this description makes sense.
I think the problem is that I'm not entirely sure what atan2 is doing in the first place (I only remember some basic trigonometry), and I am also not sure if I'm using it correctly, so I don't exactly know what my code is doing. (Here is the documentation I used for atan2: https://doc.rust-lang.org/std/primitive.f64.html#method.atan2)
I've only gotten this far after much trial and error and Googling as much as I can (mostly Unity tutorial results showed up when looking for algorithms to base my code on). I've also asked in the unofficial Rust community Discord server, but nothing so far has worked.
I also had this code earlier, but couldn't get it to work either.
let m = mouse::position(ctx); // Type Point2
let mouse_pos = Vector2::new(m.x, m.y); // Transformed to Vector2 to be read by Matrix
move_data.angle = Matrix::angle(&mouse_pos, &move_data.position);
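For reference, atan2 already returns an angle in radians covering the full circle, so no extra scaling by PI/2 should be needed; as long as the same subtraction order is used for both components, dy.atan2(dx) gives the angle from the player towards the mouse. A minimal, self-contained sketch of that pattern (plain tuples instead of ggez types; the facing_angle helper is purely illustrative):
use std::f32::consts::PI;

// Angle in radians from the player position towards the mouse position,
// in the full range (-PI, PI]. With screen coordinates the y axis usually
// grows downwards, so "mouse below the player" comes out as +PI/2 here.
fn facing_angle(player: (f32, f32), mouse: (f32, f32)) -> f32 {
    let dx = mouse.0 - player.0;
    let dy = mouse.1 - player.1;
    dy.atan2(dx)
}

fn main() {
    // Mouse directly to the right of the player -> 0 radians.
    assert!(facing_angle((0.0, 0.0), (10.0, 0.0)).abs() < 1e-6);
    // Mouse directly below the player (screen coordinates) -> PI / 2.
    assert!((facing_angle((0.0, 0.0), (0.0, 10.0)) - PI / 2.0).abs() < 1e-6);
}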

Efficient 2D rendering with Glium

I'm using Glium to do rendering for an emulator I'm writing. I've pieced together something that works (based on this example) but I suspect that it's pretty inefficient. Here's the relevant function:
fn update_screen(display: &Display, screen: &Rc<RefCell<NesScreen>>) {
    let target = display.draw();

    // Write screen buffer
    let borrowed_scr = screen.borrow();
    let mut buf = vec![0_u8; 256 * 240 * 3];
    buf.clone_from_slice(&borrowed_scr.screen_buffer[..]);
    let screen = RawImage2d::from_raw_rgb_reversed(buf, SCREEN_DIMENSIONS);

    glium::Texture2d::new(display, screen)
        .unwrap()
        .as_surface()
        .fill(&target, MagnifySamplerFilter::Nearest);

    target.finish().unwrap();
}
At a high level, this is what I'm doing:
Borrow NesScreen which contains the screen buffer, which is an array.
Clone the screen buffer into a vector
Create a texture from the vector data and render it
My suspicion is that cloning the entire screen buffer via clone_from_slice is really inefficient. The RawImage2d::from_raw_rgb_reversed function takes ownership of the vector passed into it, so I'm not sure how to do this in a way that avoids the clone.
So, two questions:
Is this actually inefficient? I don't have enough experience rendering stuff to know intuitively.
If so, is there a more efficient way to do this? I've scoured Glium quite a bit but there isn't much specific to 2D rendering.
This won't be a very good answer, but maybe a few things here could help you.
First of all: is this really inefficient? That's really hard to say, especially the OpenGL part, as OpenGL performance depends a lot on when synchronization is required/requested.
As for the cloning of the screen buffer: you are merely copying 180kb, which is not too much. I quickly benchmarked it on my machine and cloning a 180kb vector takes around 5µs, which is really not a lot.
Note that you can create a RawImage2d without using a method, because all of its fields are public (see the sketch below). This means that you can avoid the simple 5µs clone if you create a correctly ordered vector yourself. However, reversing the vector the way glium does it is a lot slower than just cloning the vector; on my machine it takes 170µs for a vector of the same length. This is probably still tolerable if you just want to achieve 60 fps = 17 ms per frame, but it's still not very nice.
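For illustration, building the RawImage2d by hand could look roughly like this (a sketch, assuming the screen buffer is already stored in the row order the texture expects; the field names are those of glium's texture::RawImage2d):
use std::borrow::Cow;
use glium::texture::{ClientFormat, RawImage2d};

// Borrow the screen buffer instead of cloning it. This only helps if
// `screen_buffer` already has the row order OpenGL expects; otherwise
// the rows still have to be reversed somewhere.
let image = RawImage2d {
    data: Cow::Borrowed(&borrowed_scr.screen_buffer[..]),
    width: 256,
    height: 240,
    format: ClientFormat::U8U8U8,
};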
You could think about using the correct row ordering in your original array to avoid this problem. OR you could, instead of directly copying the texture to the framebuffer, just draw a fullscreen quad (one vertex for each screen corner) with the texture on it. Sure, then you need a mesh, a shader and all that stuff, but you could just "reverse" the image by tweaking the texture coordinates.
Lastly, I unfortunately don't know a lot about the time the GPU takes to execute the OpenGL commands. I'd guess that it's not optimal because OpenGL doesn't have a lot of room to schedule your commands, but has to execute them right away (forced synchronization). But maybe that's not avoidable in your case.

UAV counter indices used across multiple shaders?

I've been trying to implement a Compute Shader based particle system.
I have a compute shader which builds a structured buffer of particles, using a UAV with the D3D11_BUFFER_UAV_FLAG_COUNTER flag.
When I add to this buffer, I check if this particle has any complex behaviours, which I want to filter out and perform in a separate compute shader. As an example, if the particle wants to perform collision detection, I add its index to another structured buffer, also with the D3D11_BUFFER_UAV_FLAG_COUNTER flag.
I then run a second compute shader, which processes all the indices, and applies collision detection to those particles.
However, in the second compute shader, I'd estimate that about 5% of the indices are wrong - they belong to other particles, which don't support collision detection.
Here's the compute shader code that performs the list building:
// append to destination buffer
uint dstIndex = g_dstParticles.IncrementCounter();
g_dstParticles[ dstIndex ] = particle;
// add to behaviour lists
if ( params.flags & EMITTER_FLAG_COLLISION )
{
    uint behaviourIndex = g_behaviourCollisionIndices.IncrementCounter();
    g_behaviourCollisionIndices[ behaviourIndex ] = dstIndex;
}
If I split out the "add to behaviour lists" bit into a separate compute shader, and run it after the particle lists are built, everything works perfectly. However I think I shouldn't need to do this - it's a waste of bandwidth going through all the particles again.
I suspect that IncrementCounter is actually not guaranteed to return a unique index into the UAV, and that there is some clever optimisation going on that means the index is only valid inside the compute shader it is used in. And thus my attempt to pass it to the second compute shader is not valid.
Can anyone give any concrete answers to what's going on here? And if there's a way for me to keep the filtering inside the same compute shader as my core update?
Thanks!
IncrementCounter is an atomic operation and so will (driver/hardware bugs notwithstanding) return a unique value to each thread that calls it.
Have you thought about using Append/Consume buffers for this, as it's what they were designed for? The first pass simply appends the complex collision particles to an AppendStructuredBuffer and the second pass consumes from the same buffer but using a ConsumeStructuredBuffer view instead. The second run of compute will need to use DispatchIndirect so you only run as many thread groups as necessary for the number in the list (something the CPU won't know).
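For the CPU side of that second pass, the usual pattern looks roughly like the sketch below (the buffer and shader names here are illustrative, not from your code); the hidden counter of the append buffer is copied into an indirect-arguments buffer so the CPU never needs to read the count back:
// 'argsBuffer' is a small buffer created with the
// D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS flag, holding the three
// thread-group counts for DispatchIndirect (y and z initialised to 1).
// 'indicesUav' is the UAV over the append buffer filled by the first pass.

// Copy the number of appended indices into the first UINT of argsBuffer.
// If the thread group size is not 1, a tiny compute shader is usually run
// afterwards to round this element count up to a group count.
context->CopyStructureCount(argsBuffer, 0, indicesUav);

context->CSSetShader(collisionCS, nullptr, 0);
context->CSSetShaderResources(0, 1, &indicesSrv);
context->CSSetUnorderedAccessViews(0, 1, &particlesUav, nullptr);
context->DispatchIndirect(argsBuffer, 0);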
The usual recommendations apply, though: have you tried the D3D11 Debug Layer and running it on the reference device to be sure it isn't a driver issue?

Transparency in Progressive Photon Mapping in cuda

I am working on a project which is based on OptiX. I need to use progressive photon mapping, hence I am trying to use the Progressive Photon Mapping sample, but transparent materials are not implemented in it.
I've googled a lot and also tried to understand other samples that contain transparent materials (e.g. Glass, Tutorial, Whitted). In the end, I arrived at the following approach:
Find the hit point (intersection point) (h below)
Generate another ray from that point
Use the color of the newly generated ray
Below you can find the code for that part, but I do not understand why I get a black color (0.f, 0.f, 0.f) for the newly generated ray (step 3 above).
optix::Ray ray( h, t, rtpass_ray_type, scene_epsilon );
HitPRD refr_prd;
refr_prd.ray_depth = hit_prd.ray_depth+1;
refr_prd.importance = importance;
rtTrace( top_object, ray, refr_prd );
result += (1.0f - reflection) * refraction_color * refr_prd.attenuation;
Any ideas will be appreciated.
Please note that refr_prd.attenuation should contain some color after the call to rtTrace(). I've included reflection and refraction_color to help you better understand the procedure; you can simply ignore them.
There are a number of methods to diagnose your problem.
Isolate the contribution of the refracted ray, by removing any contribution of the reflection ray.
Make sure you have a miss program. HitPRD::attenuation needs to be written to by all of your closest hit programs and your miss programs. If you suspect the miss program is being called, set your miss color to something obviously bad ([1,0,1] is my favorite); see the sketch after this list.
Use rtPrintf in combination with rtContextSetPrintLaunchIndex or setPrintLaunchIndex to print out the individual values of the product to see which term is zero for a given pixel. If you don't restrict the output to a given launch index you will get too much output. You probably want to print out the depth as well.
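As an illustration of the miss-program point, a minimal miss program for the rtpass ray type might look like the sketch below (assuming the HitPRD payload struct from your code and the usual OptiX SDK program style; it still has to be attached to the context for that ray type via setMissProgram):
#include <optix_world.h>

rtDeclareVariable(HitPRD, hit_prd, rtPayload, );

RT_PROGRAM void rtpass_miss()
{
    // Write something obviously wrong so rays that miss everything
    // show up immediately in the output image.
    hit_prd.attenuation = make_float3( 1.0f, 0.0f, 1.0f );
}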

Re-positioning a Rigid Body in Bullet Physics

I am writing a character animation rendering engine that uses Bullet Physics as a physics simulation engine.
A sequence will start out with no model on the screen, then an animation will be assigned to that model, the model will be moved to frame 0 of the animation, and the engine will begin rendering the model with the animation.
What is the correct way to re-position the rigid bodies on the character model when it is initialized at frame 0?
Currently I am using this code, which is called immediately after the animation is assigned to the model and the bones are moved to the frame 0 position:
_world->removeRigidBody(_body);
bool k = (_type == Kinematics);
_body->setCollisionFlags(_body->getCollisionFlags() & ~btCollisionObject::CF_NO_CONTACT_RESPONSE);
btTransform tr = BulletPhysics::ConvertD3DXMatrix(&(_bone->getCombinedTrans()));
tr *= _trans;
_body->setCenterOfMassTransform(tr);
_body->clearForces();
_body->setLinearVelocity(btVector3(0,0,0));
_body->setAngularVelocity(btVector3(0,0,0));
_world->addRigidBody(_body, _groupID, _groupMask);
The issue is that sometimes this works, and other times it doesn't. For example, take the skirt of a model. Sometimes it will show up in the natural position; other times it is slightly misaligned and falls into place; and other times it shows up completely clipped through the body, as if collision were turned off and some force had pushed it in that direction. This does make sense most of the time, because in the test animation I am using, the model's initial position is in the center of the screen, but the animation starts off the left side of the screen. Does anyone know how to solve this?
I know the bones on the skirt are not the problem, because I turned off physics and forced it to manually update the bone positions each frame, and everything was in the correct positions throughout the entire animation.
EDIT: I also have constraints, might that be what's causing this?
Here is my reposition method that does exactly this.
void LimbBt::reposition(btVector3 position, btQuaternion orientation) {
    btTransform initialTransform;
    initialTransform.setOrigin(position);
    initialTransform.setRotation(orientation);
    mBody->setWorldTransform(initialTransform);
    mMotionState->setWorldTransform(initialTransform);
}
The motion state mMotionState is the motion state you created for the btRigidBody in the beginning. Just add your clearForces() call and the velocity resets to stop the body from moving on from the new position as if it had gone through a portal. That should do it. It works nicely for me here.
Edit: The constraints will adapt if you reposition all rigid bodies correctly. For that purpose, it is easy to calculate the relative positions and reposition the whole constrained rigid-body construct accordingly. If you do it incorrectly, you will get severe twitching, as the constraints will try to adjust your construct numerically, causing high forces if the constraint gaps are large.
Edit2: Another issue is that if you need deterministic behavior (every time you reset your bodies, they should fall exactly the same), then you will have to kill your old dynamicsWorld, recreate it and add all the bodies again. The world stores some information about the bodies that just cannot be cleared for now. This might change in the future, as Bullet 4 is going to support deterministic resets. But for now, if you do experiments with deterministic resets, you need to drop the world and recreate it.
source: discussion with Erwin Coumans, the developer of Bullet Physics.
I can't tell you what causes the unusual outcome when moving rigid bodies but I can definitely sympathize!
There are three things you'll need to do in order to solve this:
Convert your rigid bodies to kinematic ones
Adjust the world transform of the body's motion state and NOT the rigid body
Convert the kinematic bodies back to rigid bodies
(A rough sketch of that kinematic round-trip follows after the snippet below.)
Here is a short, tested code snippet that effectively teleports a rigid body by updating its motion state to its new position and orientation, and nullifies all velocities and forces acting upon it:
void teleport(btVector3 position, btQuaternion& orientation) const {
    btTransform transform;
    transform.setIdentity();
    transform.setOrigin(position);
    transform.setRotation(orientation);

    m_rigidBodyVehicle->setWorldTransform(transform);
    m_rigidBodyVehicle->getMotionState()->setWorldTransform(transform);

    m_rigidBodyVehicle->setLinearVelocity(btVector3(0.0f, 0.0f, 0.0f));
    m_rigidBodyVehicle->setAngularVelocity(btVector3(0.0f, 0.0f, 0.0f));
    m_rigidBodyVehicle->clearForces();
}
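As for the kinematic round-trip listed in the three steps above, a rough, untested sketch of it could look like this (the function, body and transform names here are illustrative):
#include <btBulletDynamicsCommon.h>

void teleportViaKinematic(btRigidBody* body, const btTransform& transform)
{
    // 1. Temporarily flag the body as kinematic so the solver does not
    //    treat the jump as physical motion.
    body->setCollisionFlags(body->getCollisionFlags() | btCollisionObject::CF_KINEMATIC_OBJECT);

    // 2. Move the motion state (and the body) and zero out any leftover motion.
    body->getMotionState()->setWorldTransform(transform);
    body->setWorldTransform(transform);
    body->setLinearVelocity(btVector3(0, 0, 0));
    body->setAngularVelocity(btVector3(0, 0, 0));
    body->clearForces();

    // 3. Make it dynamic again and wake it up.
    body->setCollisionFlags(body->getCollisionFlags() & ~btCollisionObject::CF_KINEMATIC_OBJECT);
    body->activate(true);
}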
