Retrieving depth value of a sample position using depth map - graphics

I have been trying to implement SSAO following the LearnOpenGL implementation, which uses a positions g-buffer to obtain each sample position's depth value. Since I already have a depth buffer ready to use, I am wondering how I could retrieve the depth value from it instead of from the positions g-buffer. Below I show the LearnOpenGL approach using the positions texture and my attempt at using the depth buffer. I think I might be missing a required step for utilizing the depth buffer, but I am unsure.
[LearnOpenGL SSAO](https://learnopengl.com/Advanced-Lighting/SSAO)
Using Positions g-buffer
layout(binding = 7) uniform sampler2D positionsTexture;
layout(binding = 6) uniform sampler2D depthMap;
// ...
vec4 offset = vec4(samplePos, 1.0);
offset = camera.proj * offset; //transform sample to clip space
offset.xyz /= offset.w; // perspective divide
offset.xyz = offset.xyz * 0.5 + 0.5; // transform to range 0-1
float sampleDepth = texture(positionsTexture, offset.xy).z;
I want to use the depth buffer instead. This approach did not seem to work for me:
float sampleDepth = texture(depthMap, offset.xy).x;
Update: 8/01
I have implemented a linearization function for the depth result, but I am still unable to obtain the right result. Am I missing something more?
float linearize_depth(float d, float zNear, float zFar)
{
    return zNear * zFar / (zFar + d * (zNear - zFar));
}
float sampleDepth = linearize_depth(texture(depthMap, offset.xy).r, zNear, zFar);
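For reference, here is a minimal sketch of one way to bridge the gap, assuming a standard OpenGL projection (view-space z is negative in front of the camera); camera.invProj is a hypothetical inverse-projection uniform that is not part of the original code:

// Hypothetical helper: reconstruct the view-space position from the depth buffer,
// so the comparison works exactly like the positions-texture version.
vec3 viewPosFromDepth(vec2 uv)
{
    float d = texture(depthMap, uv).r;                   // hardware depth in [0,1]
    vec4 ndc = vec4(uv * 2.0 - 1.0, d * 2.0 - 1.0, 1.0); // back to NDC
    vec4 view = camera.invProj * ndc;                    // unproject to view space
    return view.xyz / view.w;                            // perspective divide
}

float sampleDepth = viewPosFromDepth(offset.xy).z;       // negative, like positionsTexture.z

Equivalently, if you keep linearize_depth(), watch the sign convention: it returns a positive eye-space distance, while positionsTexture.z holds a negative view-space z, so the comparison would need -linearize_depth(texture(depthMap, offset.xy).r, zNear, zFar).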

Related

Is it possible to test if an arbitrary pixel is modifiable by the shader?

I am writing a spatial shader in Godot to pixelate an object.
Previously, I tried to write outside of an object, but that is only possible in CanvasItem shaders, so now I am going back to 3D shaders due to rendering annoyances (I am unable to selectively hide items without using the culling mask, which, being limited to 20 layers, is not an extensible solution).
My naive approach:
Define a pixel "cell" resolution (i.e. 3x3 real pixels)
For each fragment:
If the entire "cell" of real pixels is within the model's draw bounds, color the current pixel as per the lower-left pixel (the pixel whose coordinates are a multiple of the cell resolution).
If any pixel of the current "cell" is out of the draw bounds, set alpha to 1 to erase the entire cell.
Pseudo-code, for people asking for code of the likely non-existent functionality that I am seeking:
int cell_size = 3;
fragment {
    int erase_pixel = 0;
    // check every pixel within the cell to see if all of them are part of the object being drawn
    for (int y = 0; y < cell_size; y++) {
        for (int x = 0; x < cell_size; x++) {
            // cell origin (lower-left corner) plus the offset of the pixel being tested
            if (uv_in_model(vec2(FRAGCOORD.x - (FRAGCOORD.x % cell_size) + x,
                                 FRAGCOORD.y - (FRAGCOORD.y % cell_size) + y)) == false) {
                erase_pixel = 1;
            }
        }
    }
    albedo.a = erase_pixel;
}
tl;dr: is it possible to know whether the fragment function will be called for any given point?
On your object's material there should be a property called Next Pass. Add a new Spatial Material in this section, open up flags and check transparent and unshaded, and then right-click it to bring up the option to convert it to a Shader Material.
Now, open up the new Shader Material's Shader. The last process should have created a Shader formatted with a fragment() function containing the line vec4 albedo_tex = texture(texture_albedo, base_uv);
In this line, you can replace "texture_albedo" with "SCREEN_TEXTURE" and "base_uv" with "SCREEN_UV". This should make the new shader look like nothing has changed, because the next pass material is just sampling the screen from the last pass.
Above that, make a variable called something along the lines of "pixelated" and set it to the following expression:
vec2 pixelated = floor(SCREEN_UV * scale) / scale;
where scale is a float or vec2 containing the pixel size. Finally, replace SCREEN_UV in the albedo_tex definition with pixelated.
After this, you can have a float depth which samples DEPTH_TEXTURE with pixelated like this:
float depth = texture(DEPTH_TEXTURE, pixelated).r;
This depth value will be very large for pixels that are just trying to render the background onto your object. So, add a conditional statement:
if (depth > 100000.0f) { ALPHA = 0.0f; }
As long as the flags on this new next pass shader were set correctly (transparent and unshaded) you should have a quick-and-dirty pixelator. I say this because it has some minor artifacts around the edges, but you can make scale a uniform variable and set it from the editor and scripts, so I think it works nicely.
"Testing if a pixel is modifiable" in your case means testing if the object should be rendering it at all with that depth conditional.
Here's the full shader with my modifications from the comments
// NOTE: Shader automatically converted from Godot Engine 3.4.stable's SpatialMaterial.
shader_type spatial;
render_mode blend_mix,depth_draw_opaque,cull_back,unshaded;

// the size of pixelated blocks on the screen relative to pixels
uniform int scale;

void vertex() {
}

// vec2 representation of one, used for calculation
const vec2 one = vec2(1.0f, 1.0f);

void fragment() {
    // scale SCREEN_UV up to the size of the viewport over the pixelation scale
    // assure scale is a multiple of 2 to avoid artefacts
    vec2 pixel_scale = VIEWPORT_SIZE / float(scale * 2);
    vec2 pixelated = SCREEN_UV * pixel_scale;
    // truncate the decimal place from the pixelated uvs and then shift them over by half a pixel
    pixelated = pixelated - mod(pixelated, one) + one / 2.0f;
    // scale the pixelated uvs back down to the screen
    pixelated /= pixel_scale;
    vec4 albedo_tex = texture(SCREEN_TEXTURE, pixelated);
    ALBEDO = albedo_tex.rgb;
    ALPHA = 1.0f;
    float depth = texture(DEPTH_TEXTURE, pixelated).r;
    if (depth > 10000.0f)
    {
        ALPHA = 0.0f;
    }
}

Crytek SSAO implementation producing odd result

I have been trying to better match my SSAO to Crytek's original implementation. Their Finding Next Gen - CryEngine 2 document states:
To reduce the sample count we vary the sample position for nearby
pixels. The initial sample positions are distributed around the origin
in a sphere and the variation is achieved by reflecting the sample
positions on a random 3D plane through the origin
I have implemented this as follows (Full Shader):
vec3 plane = texture(texNoise, uvCoords * noiseScale).xyz - vec3(1.0);
for (int i = 0; i < ssao.sample_amount; ++i) {
    vec3 samplePos = reflect(TBN * ssao.samples[i].xyz, plane);
    samplePos = viewSpacePositions + samplePos * radius;
    // ...
}
My result appears far from the original Crytek SSAO implementation, with an overall odd look and occlusion seemingly appearing only on the right side.
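For comparison, here is a minimal sketch of the reflection step as one plausible reading of the Crytek description, assuming the noise texture stores values in [0,1]. Note that subtracting vec3(1.0) as above maps [0,1] to [-1,0], which restricts the plane normals to a single octant; the TBN multiply is omitted here because the quoted technique distributes samples in a full sphere:

// Remap the noise from [0,1] to [-1,1] and normalize it, so the random
// reflection plane normal can point in any direction.
vec3 plane = normalize(texture(texNoise, uvCoords * noiseScale).xyz * 2.0 - 1.0);
for (int i = 0; i < ssao.sample_amount; ++i) {
    // reflect the spherical kernel sample about the random plane through the origin
    vec3 samplePos = reflect(ssao.samples[i].xyz, plane);
    samplePos = viewSpacePositions + samplePos * radius;
    // ... project, sample depth, accumulate occlusion
}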

Can I do random writes from a kernel without worrying about synchronization issues?

Consider a simple depth-of-field filter (my actual use case is similar). It loops over the image and scatters every pixel over a circular neighborhood of its own. The radius of the neighborhood depends on the depth of the pixel: the closer it is to the focal plane, the smaller the radius.
Note that I said "scatters" and not "gathers". In simpler image processing applications, you normally use the "gather" technique to perform a uniform Gaussian blur. In other words, you loop over the neighborhood of each pixel and "gather" the nearby values into a weighted average. This works fine in that case, but if you make the blur kernel vary between pixels while still using "gathering", you'll get a somewhat unrealistic effect. Such "space-variant filtering" scenarios are where "scattering" differs from "gathering".
To be clear, the scatter algorithm is something like this:
init resultImage to black
loop over sourceImage
    var c = fetch current pixel from sourceImage
    var toAdd = c * weight // weight < 1
    loop over circular neighbourhood of current source pixel
        add toAdd to current neighbor from resultImage
My question is: if I do a direct translation of this pseudocode to OpenCL, will there be synchronization issues due to different work-items simultaneously writing to the same output pixel?
Does the answer vary depending on whether I'm using Buffers or Images?
The course I'm reading suggests that there will be synchronization issues. But OTOH I read the source of Mandelbulber 1.21-2, which does a straightforward OpenCL DOF just like my above pseudocode, and it seems to work fine.
(the relevant code is in mandelbulber-opencl-1.21-2.orig/usr/share/cl/cl_DOF.cl and it's as follows)
//*********************************************************
// MANDELBULBER
// kernel for DOF effect
//
// author: Krzysztof Marczak
// contact: buddhi1980@gmail.com
// licence: GNU GPL v3.0
//*********************************************************
typedef struct
{
    int width;
    int height;
    float focus;
    float radius;
} sParamsDOF;

typedef struct
{
    float z;
    int i;
} sSortZ;
//------------------ MAIN RENDER FUNCTION --------------------
kernel void DOF(__global ushort4 *in_image, __global ushort4 *out_image, __global sSortZ *zBuffer, sParamsDOF p)
{
    const unsigned int i = get_global_id(0);
    uint index = p.height * p.width - i - 1;
    int ii = zBuffer[index].i;
    int2 scr = (int2){ii % p.width, ii / p.width};
    float z = zBuffer[index].z;
    float blur = fabs(z - p.focus) / z * p.radius;
    blur = min(blur, 500.0f);
    float4 center = convert_float4(in_image[scr.x + scr.y * p.width]);
    float factor = blur * blur * sqrt(blur) * M_PI_F / 3.0f;
    int blurInt = (int)blur;
    int2 scr2;
    int2 start = (int2){scr.x - blurInt, scr.y - blurInt};
    start = max(start, 0);
    int2 end = (int2){scr.x + blurInt, scr.y + blurInt};
    end = min(end, (int2){p.width - 1, p.height - 1});
    for (scr2.y = start.y; scr2.y <= end.y; scr2.y++)
    {
        for (scr2.x = start.x; scr2.x <= end.x; scr2.x++)
        {
            float2 d = convert_float2(scr - scr2);
            float r = length(d);
            float op = (blur - r) / factor;
            op = clamp(op, 0.0f, 1.0f);
            float opN = 1.0f - op;
            uint address = scr2.x + scr2.y * p.width;
            float4 old = convert_float4(out_image[address]);
            out_image[address] = convert_ushort4(opN * old + op * center);
        }
    }
}
No, you can't, at least not without worrying about synchronization. If two work items scatter to the same location without synchronization, you have a race condition and won't get correct results. The same holds for both buffers and images. With buffers you could use atomics, but they can slow down your code, especially when there is contention (and even when there isn't). AFAIK, read/write images don't have atomic operations.
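To make the race and the atomic fix concrete, here is a minimal sketch of the same scatter pattern using integer atomics. It is written as a GLSL compute shader to match the other shader code in this collection, but OpenCL's atomic_add on a __global uint buffer follows the same pattern. imgSize, FIXED, and the constant radius are assumptions of the sketch, and floating-point sums are emulated in fixed point, since plain integer atomics are what both APIs guarantee:

#version 430
layout(local_size_x = 16, local_size_y = 16) in;

// Fixed-point accumulation buffer: 4 uints per pixel (RGB sums + total weight).
layout(std430, binding = 0) buffer Accum { uint accum[]; };

layout(binding = 0) uniform sampler2D srcImage;
uniform ivec2 imgSize;      // image dimensions (assumed uniform, not from the original code)
const float FIXED = 4096.0; // fixed-point scale factor

void main() {
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(p, imgSize))) return;

    vec3 c = texelFetch(srcImage, p, 0).rgb;
    int radius = 2; // a real DOF would derive this from depth / circle of confusion

    // Scatter this pixel into its neighborhood; atomicAdd makes the concurrent
    // read-modify-write safe when several invocations hit the same output pixel.
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            ivec2 q = clamp(p + ivec2(dx, dy), ivec2(0), imgSize - 1);
            uint base = uint(q.y * imgSize.x + q.x) * 4u;
            atomicAdd(accum[base + 0u], uint(c.r * FIXED));
            atomicAdd(accum[base + 1u], uint(c.g * FIXED));
            atomicAdd(accum[base + 2u], uint(c.b * FIXED));
            atomicAdd(accum[base + 3u], uint(FIXED)); // weight 1.0
        }
    }
}

A separate resolve pass then divides the RGB sums by the accumulated weight and converts back to the output format.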

Changing opacity of individual items in openGL ES 2.0 Quad Batch

Overview
In my app (which is a game), I batch items together to reduce the number of draw calls. So I'll create, for example, a Java object called platforms, which holds all the platforms in the game. All the enemies are batched together, as are all collectible items, etc.
This works really well. At present I am able to size and position the individual items in a batch independently of each other. However, I've come to the point where I really need to change the opacity of individual items too. Currently, I can change only the opacity of the entire batch.
Batching
I am uploading the vertices for all items within the batch that are to be displayed (I can turn individual items off if I don't want them to be drawn), and then once they are all done, I simply draw them in one call.
The following is an idea of what I'm doing. I realise this may not compile; it is just to give an idea for the purpose of the question.
public void draw() {
    int x = 0; // write offset into the vertices array
    // Upload vertices: two triangles per sprite, 5 floats per vertex (x, y, z, u, v)
    for (int count = 0; count < numOfSpritesInBatch; count += 1) {
        // Triangle 1
        vertices[x]      = xLeft;
        vertices[x + 1]  = yTop;
        vertices[x + 2]  = 0;
        vertices[x + 3]  = textureLeft;
        vertices[x + 4]  = 0;
        vertices[x + 5]  = xRight;
        vertices[x + 6]  = yTop;
        vertices[x + 7]  = 0;
        vertices[x + 8]  = textureRight;
        vertices[x + 9]  = 0;
        vertices[x + 10] = xLeft;
        vertices[x + 11] = yBottom;
        vertices[x + 12] = 0;
        vertices[x + 13] = textureLeft;
        vertices[x + 14] = 1;
        // Triangle 2
        vertices[x + 15] = xRight;
        vertices[x + 16] = yTop;
        vertices[x + 17] = 0;
        vertices[x + 18] = textureRight;
        vertices[x + 19] = 0;
        vertices[x + 20] = xLeft;
        vertices[x + 21] = yBottom;
        vertices[x + 22] = 0;
        vertices[x + 23] = textureLeft;
        vertices[x + 24] = 1;
        vertices[x + 25] = xRight;
        vertices[x + 26] = yBottom;
        vertices[x + 27] = 0;
        vertices[x + 28] = textureRight;
        vertices[x + 29] = 1;
        x += 30;
    }
    vertexBuf.rewind();
    vertexBuf.put(vertices).position(0);
    GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, texID);
    GLES20.glUseProgram(iProgId);
    Matrix.multiplyMM(mvpMatrix2, 0, mvpMatrix, 0, mRotationMatrix, 0);
    mMVPMatrixHandle = GLES20.glGetUniformLocation(iProgId, "uMVPMatrix");
    GLES20.glUniformMatrix4fv(mMVPMatrixHandle, 1, false, mvpMatrix2, 0);
    vertexBuf.position(0);
    GLES20.glVertexAttribPointer(iPosition, 3, GLES20.GL_FLOAT, false, 5 * 4, vertexBuf);
    GLES20.glEnableVertexAttribArray(iPosition);
    vertexBuf.position(3);
    GLES20.glVertexAttribPointer(iTexCoords, 2, GLES20.GL_FLOAT, false, 5 * 4, vertexBuf);
    GLES20.glEnableVertexAttribArray(iTexCoords);
    // Enable alpha blending and set the blending function
    GLES20.glEnable(GLES20.GL_BLEND);
    GLES20.glBlendFunc(GLES20.GL_ONE, GLES20.GL_ONE_MINUS_SRC_ALPHA);
    // Draw it
    GLES20.glDrawArrays(GLES20.GL_TRIANGLES, 0, 6 * numOfSpritesInBatch);
    // Disable alpha blending
    GLES20.glDisable(GLES20.GL_BLEND);
}
Shaders
String strVShader =
"uniform mat4 uMVPMatrix;" +
"attribute vec4 a_position;\n"+
"attribute vec2 a_texCoords;" +
"varying vec2 v_texCoords;" +
"void main()\n" +
"{\n" +
"gl_Position = uMVPMatrix * a_position;\n"+
"v_texCoords = a_texCoords;" +
"}";
String strFShader =
"precision mediump float;" +
"uniform float opValue;"+
"varying vec2 v_texCoords;" +
"uniform sampler2D u_baseMap;" +
"void main()" +
"{" +
"gl_FragColor = texture2D(u_baseMap, v_texCoords);" +
"gl_FragColor *= opValue;"+
"}";
Currently, I have a method in my Sprite class that allows me to change the opacity. For example, something like this:
spriteBatch.setOpacity(0.5f); //Half opacity
This works, but changes the whole batch - not what I'm after.
Application
I need this because I want to draw small indicators showing the score obtained when the player destroys an enemy (the type of thing that happens in many games). I want these little 'score indicators' to fade out once they appear. All the indicators would of course be created as a batch so they can all be drawn with one draw call.
The only other alternatives are:
Create 10 textures at varying levels of opacity and just switch between them to create the fading effect. Not really an option, as it is way too wasteful.
Create each of these objects separately and draw each with its own draw call. This would work, but with a maximum of 10 of these objects on-screen, I could potentially be using 10 draw calls just for these items, while the game as a whole currently uses only about 20 draw calls to draw hundreds of items.
I need to look at future uses of this too, in particle systems etc., so I would really like to figure out how to adjust each item's opacity separately. If I need to do this in the shader, I would be grateful if you could show how this works. Alternatively, is it possible to do this outside of the shader?
Surely this can be done in some way or another? All suggestions welcome....
The most direct way of achieving this is to use a vertex attribute for the opacity value, instead of a uniform. This will allow you to set the opacity per vertex, without increasing the number of draw calls.
To implement this, you can follow the pattern you already use for the texture coordinates. They are passed into the vertex shader as an attribute, and then handed off to the fragment shader as a varying variable.
So in the vertex shader, you add:
...
attribute float a_opValue;
varying float v_opValue;
...
v_opValue = a_opValue;
...
In the fragment shader, you remove the uniform declaration for opValue, and replace it with:
varying float v_opValue;
...
gl_FragColor *= v_opValue;
...
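Assembled, the two shader sources from the question would look like this (a sketch that just merges the snippets above, keeping the original names):

// Vertex shader
uniform mat4 uMVPMatrix;
attribute vec4 a_position;
attribute vec2 a_texCoords;
attribute float a_opValue;
varying vec2 v_texCoords;
varying float v_opValue;
void main()
{
    gl_Position = uMVPMatrix * a_position;
    v_texCoords = a_texCoords;
    v_opValue = a_opValue;
}

// Fragment shader
precision mediump float;
varying vec2 v_texCoords;
varying float v_opValue;
uniform sampler2D u_baseMap;
void main()
{
    gl_FragColor = texture2D(u_baseMap, v_texCoords);
    gl_FragColor *= v_opValue;
}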
In the Java code, you extend the vertex data with an additional value for the opacity, to use 6 values per vertex (3 position, 2 texture coordinates, 1 opacity), and update the state setup accordingly:
vertexBuf.position(0);
GLES20.glVertexAttribPointer(iPosition, 3, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iPosition);
vertexBuf.position(3);
GLES20.glVertexAttribPointer(iTexCoords, 2, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iTexCoords);
vertexBuf.position(5);
GLES20.glVertexAttribPointer(iOpValue, 1, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iOpValue);

equivalent to gl_FragCoord in glsl vertex shader

I'm trying to get the screen position of a vertex, in pixels, inside a vertex shader. I saw some other posts here, but I can't find an answer that works for me.
This is what I've got in my vertex shader:
#version 400
layout (location = 0) in vec3 inPosition;
uniform mat4 MVP; // modelViewProjection
uniform vec2 window;
void main()
{
    // vertex in screen space
    vec2 fake_frag_coord = (MVP * vec4(inPosition, 1.0)).xy;
    float X = (fake_frag_coord.x * window.x / 2.0) + window.x;
    float Y = (fake_frag_coord.y * window.y / 2.0) + window.y;
}
It's not working very well, and I know it's a strange thing to do inside a vertex shader, but I want to multiply my vertex offset by a 2D texture, so I need to find which pixel the vertex lands on in order to multiply it by the corresponding pixel of the texture.
thanks!
Luiz
I have corrected your vertex shader with proper terms, and shown you the exact sequence of transformations that actually happens when GL computes gl_FragCoord (window-space).
#version 400
layout (location = 0) in vec4 inPosition; // Always use vec4, it makes life easier!
uniform mat4 MVP; // modelViewProjection
uniform vec2 window;
void main()
{
    // Vertex in clip-space
    vec4 fake_frag_coord = (MVP * inPosition); // Range: [-w,w]^4
    // Vertex in NDC-space
    fake_frag_coord.xyz /= fake_frag_coord.w; // Rescale: [-1,1]^3
    fake_frag_coord.w = 1.0 / fake_frag_coord.w; // Invert W
    // Vertex in window-space
    fake_frag_coord.xyz = fake_frag_coord.xyz * 0.5 + 0.5; // Rescale: [0,1]^3
    fake_frag_coord.xy *= window; // Scale and Bias for Viewport
    // Assume depth range: [0,1] --> No need to adjust fake_frag_coord.z
    [...]
}
Texture coordinates and window-space coordinates are very different things, however. Generally you need normalized coordinates for traditional texture fetches, that means you want the coordinates in the range [0,1].
Luckily window-space and texture-space share the same origin convention (0,0) = bottom-left, so you can cut out the line below to get the appropriate texture coordinates:
fake_frag_coord.xy *= window; // Scale and Bias for Viewport
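With that line removed, fake_frag_coord.xy stays in [0,1] and can feed a normalized texture fetch directly (someTexture is a placeholder name, not from the original code):

vec4 texel = texture(someTexture, fake_frag_coord.xy); // normalized [0,1] lookup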
I think Andon M. Coleman's answer is fine. However, I'd like to point out a more general issue with the approach discussed in the question: there might be no meaningful screen-space position for a vertex at all.
The vertex might lie outside the viewing frustum. This will not be a problem if the vertices you draw are guaranteed to lie in the frustum, or if you are drawing only points.
But it will fail if you have primitives intersecting the near plane. You might think that in such a case you just get coordinates outside [-1,1] in NDC space, and that if you use them to assign some output value for the vertex, the clipping stage will sort it out. That assumption is wrong. You can get values which are perfectly inside [-1,1] in NDC space even for vertices which lie outside the frustum, and vertices which actually lie behind the camera can appear as if they were in front of it. No subsequent clipping stage is able to fix this.
The only way to get this right would be to actually carry out the clipping operation before doing the divide by w, and that is something you don't want to do in a vertex shader.
If you want to get this working on the JS side of things, this is how I adapted Andon M. Coleman's reply:
var winW = window.innerWidth;
var winH = window.innerHeight;
camera.updateProjectionMatrix();
// Not sure about the order of these! I was using an orthographic camera so it didn't
// matter, but double check the order if it doesn't work!
// (.clone() keeps the camera's own projection matrix from being overwritten.)
var MVP = camera.projectionMatrix.clone().multiply(camera.matrixWorldInverse);
// position to vertex clip-space
var fake_frag_coord = position.applyMatrix4(MVP); // Range: [-w,w]^4
// vertex to NDC-space
fake_frag_coord.x = fake_frag_coord.x / fake_frag_coord.w; // Rescale: [-1,1]^3
fake_frag_coord.y = fake_frag_coord.y / fake_frag_coord.w;
fake_frag_coord.z = fake_frag_coord.z / fake_frag_coord.w;
fake_frag_coord.w = 1.0 / fake_frag_coord.w; // Invert W
// Vertex in window-space
fake_frag_coord.x = fake_frag_coord.x * 0.5 + 0.5; // Rescale: [0,1]^3
fake_frag_coord.y = fake_frag_coord.y * 0.5 + 0.5;
fake_frag_coord.z = fake_frag_coord.z * 0.5 + 0.5;
// Scale and bias for the viewport: multiply by the window size to get pixel coordinates
// (skip this step if you want normalized [0,1] coordinates instead)
fake_frag_coord.x = fake_frag_coord.x * winW;
fake_frag_coord.y = fake_frag_coord.y * winH;
