Changing opacity of individual items in an OpenGL ES 2.0 quad batch - sprites

Overview
In my app (which is a game), I make use of batching to reduce the number of draw calls. So, for example, I'll create a Java object called platforms which holds all the platforms in the game. All the enemies are batched together, as are all the collectible items, and so on.
This works really well. At present I am able to size and position the individual items in a batch independently of each other; however, I've come to the point where I also need to change the opacity of individual items. Currently, I can only change the opacity of the entire batch.
Batching
I upload the vertices for all items within the batch that are to be displayed (I can turn individual items off if I don't want them drawn), and once they are all uploaded, I simply draw them in one call.
The following gives an idea of what I'm doing - I realise it may not compile as written; it is just to illustrate the approach for the purpose of the question.
public void draw(){

    // Upload the vertices for every sprite in the batch.
    // Each quad is 6 vertices (2 triangles), each vertex 5 floats: x, y, z, u, v.
    int x = 0;
    for (int count = 0; count < numOfSpritesInBatch; count += 1){

        // Triangle 1: top-left, top-right, bottom-left
        vertices[x]      = xLeft;
        vertices[x + 1]  = yTop;
        vertices[x + 2]  = 0;
        vertices[x + 3]  = textureLeft;
        vertices[x + 4]  = 0;

        vertices[x + 5]  = xRight;
        vertices[x + 6]  = yTop;
        vertices[x + 7]  = 0;
        vertices[x + 8]  = textureRight;
        vertices[x + 9]  = 0;

        vertices[x + 10] = xLeft;
        vertices[x + 11] = yBottom;
        vertices[x + 12] = 0;
        vertices[x + 13] = textureLeft;
        vertices[x + 14] = 1;

        // Triangle 2: top-right, bottom-left, bottom-right
        vertices[x + 15] = xRight;
        vertices[x + 16] = yTop;
        vertices[x + 17] = 0;
        vertices[x + 18] = textureRight;
        vertices[x + 19] = 0;

        vertices[x + 20] = xLeft;
        vertices[x + 21] = yBottom;
        vertices[x + 22] = 0;
        vertices[x + 23] = textureLeft;
        vertices[x + 24] = 1;

        vertices[x + 25] = xRight;
        vertices[x + 26] = yBottom;
        vertices[x + 27] = 0;
        vertices[x + 28] = textureRight;
        vertices[x + 29] = 1;

        x += 30;
    }

    vertexBuf.rewind();
    vertexBuf.put(vertices).position(0);

    GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, texID);
    GLES20.glUseProgram(iProgId);

    Matrix.multiplyMM(mvpMatrix2, 0, mvpMatrix, 0, mRotationMatrix, 0);
    mMVPMatrixHandle = GLES20.glGetUniformLocation(iProgId, "uMVPMatrix");
    GLES20.glUniformMatrix4fv(mMVPMatrixHandle, 1, false, mvpMatrix2, 0);

    // Position attribute: 3 floats per vertex, stride of 5 floats (20 bytes)
    vertexBuf.position(0);
    GLES20.glVertexAttribPointer(iPosition, 3, GLES20.GL_FLOAT, false, 5 * 4, vertexBuf);
    GLES20.glEnableVertexAttribArray(iPosition);

    // Texture-coordinate attribute: 2 floats per vertex, same stride
    vertexBuf.position(3);
    GLES20.glVertexAttribPointer(iTexCoords, 2, GLES20.GL_FLOAT, false, 5 * 4, vertexBuf);
    GLES20.glEnableVertexAttribArray(iTexCoords);

    // Enable alpha blending and set the blending function
    GLES20.glEnable(GLES20.GL_BLEND);
    GLES20.glBlendFunc(GLES20.GL_ONE, GLES20.GL_ONE_MINUS_SRC_ALPHA);

    // Draw the whole batch in one call
    GLES20.glDrawArrays(GLES20.GL_TRIANGLES, 0, 6 * numOfSpritesInBatch);

    // Disable alpha blending
    GLES20.glDisable(GLES20.GL_BLEND);
}
Shaders
String strVShader =
        "uniform mat4 uMVPMatrix;" +
        "attribute vec4 a_position;" +
        "attribute vec2 a_texCoords;" +
        "varying vec2 v_texCoords;" +
        "void main()" +
        "{" +
        "    gl_Position = uMVPMatrix * a_position;" +
        "    v_texCoords = a_texCoords;" +
        "}";

String strFShader =
        "precision mediump float;" +
        "uniform float opValue;" +
        "varying vec2 v_texCoords;" +
        "uniform sampler2D u_baseMap;" +
        "void main()" +
        "{" +
        "    gl_FragColor = texture2D(u_baseMap, v_texCoords);" +
        "    gl_FragColor *= opValue;" +
        "}";
Currently, I have a method in my Sprite class that allows me to change the opacity. For example, something like this:
spriteBatch.setOpacity(0.5f); //Half opacity
This works, but changes the whole batch - not what I'm after.
Application
I need this because I want to draw small indicators when the player destroys an enemy, showing the score obtained from that action (the kind of thing that happens in many games). I want these little 'score indicators' to fade out once they appear. All the indicators would of course be created as a batch so they can all be drawn with one draw call.
The only other alternatives are:
Create 10 textures at varying levels of opacity and just switch between them to create the fading effect. Not really an option as way too wasteful.
Create each of these objects separately and draw each with its own draw call. This would work, but with a max of 10 of these objects on-screen, I could potentially be using 10 draw calls just for these items - while the game as a whole currently uses only about 20 draw calls to draw hundreds of items.
I also need to look at future uses of this in particle systems etc., so I would really like to figure out how to adjust each item's opacity separately. If this needs to be done in the shader, I would be grateful if you could show how that works. Alternatively, is it possible to do this outside of the shader?
Surely this can be done in some way or another? All suggestions welcome....

The most direct way of achieving this is to use a vertex attribute for the opacity value, instead of a uniform. This will allow you to set the opacity per vertex, without increasing the number of draw calls.
To implement this, you can follow the pattern you already use for the texture coordinates. They are passed into the vertex shader as an attribute, and then handed off to the fragment shader as a varying variable.
So in the vertex shader, you add:
...
attribute float a_opValue;
varying float v_opValue;
...
v_opValue = a_opValue;
...
In the fragment shader, you remove the uniform declaration for opValue, and replace it with:
varying float v_opValue;
...
gl_FragColor *= v_opValue;
...
In the Java code, you extend the vertex data with an additional value for the opacity, to use 6 values per vertex (3 position, 2 texture coordinates, 1 opacity), and update the state setup accordingly:
vertexBuf.position(0);
GLES20.glVertexAttribPointer(iPosition, 3, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iPosition);
vertexBuf.position(3);
GLES20.glVertexAttribPointer(iTexCoords, 2, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iTexCoords);
vertexBuf.position(5);
GLES20.glVertexAttribPointer(iOpValue, 1, GLES20.GL_FLOAT, false, 6 * 4, vertexBuf);
GLES20.glEnableVertexAttribArray(iOpValue);
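To tie this together on the Java side, here is a minimal sketch of how the per-sprite opacity could be written into the interleaved vertex data. The iOpValue handle, the sprites array, and its opacity field are assumptions made purely for illustration, and only the first vertex of the quad is shown:
// Fetch the attribute location once after the program is linked,
// matching the a_opValue attribute added to the vertex shader above.
iOpValue = GLES20.glGetAttribLocation(iProgId, "a_opValue");

// Inside the per-sprite upload loop, write 6 floats per vertex instead of 5.
// 'sprites[count].opacity' is a hypothetical per-sprite value in [0, 1].
float opacity = sprites[count].opacity;

vertices[x]     = xLeft;        // position
vertices[x + 1] = yTop;
vertices[x + 2] = 0;
vertices[x + 3] = textureLeft;  // texture coordinates
vertices[x + 4] = 0;
vertices[x + 5] = opacity;      // new sixth component

// ...repeat for the other five vertices of the quad, then advance by
// 36 floats per sprite (6 vertices * 6 floats) instead of 30.
x += 36;
With that in place, fading a score indicator is simply a matter of lowering its stored opacity a little each frame before the batch's vertices are re-uploaded; the whole batch still goes out in a single draw call.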

Related

Is it possible to test if an arbitrary pixel is modifiable by the shader?

I am writing a spatial shader in Godot to pixelate an object.
Previously, I tried to write outside of an object; however, that is only possible in CanvasItem shaders, and now I am going back to 3D shaders due to rendering annoyances (I am unable to selectively hide items without using the culling mask, which, being limited to 20 layers, is not an extensible solution).
My naive approach:
Define a pixel "cell" resolution (e.g. 3x3 real pixels)
For each fragment:
If the entire "cell" of real pixels is within the model's draw bounds, color the current pixel as per the lower-left pixel (the pixel whose coordinates are a multiple of the cell resolution).
If any pixel of the current "cell" is out of the draw bounds, set alpha to 1 to erase the entire cell.
Pseudo-code for people asking for code of the likely non-existent functionality that I am seeking:
int cell_size = 3;

fragment {
    // check within a cell to see if all pixels are part of the object being drawn to
    int erase_pixel = 0;
    for (int y = 0; y < cell_size; y++) {
        for (int x = 0; x < cell_size; x++) {
            if (uv_in_model(vec2(FRAGCOORD.x - (FRAGCOORD.x % cell_size) + x, FRAGCOORD.y - (FRAGCOORD.y % cell_size) + y)) == false) {
                erase_pixel = 1;
            }
        }
    }
    albedo.a = erase_pixel;
}
tl;dr, is it possible to know if any given point will be called by the fragment function?
On your object's material there should be a property called Next Pass. Add a new Spatial Material in this section, open up flags and check transparent and unshaded, and then right-click it to bring up the option to convert it to a Shader Material.
Now, open up the new Shader Material's Shader. The last process should have created a Shader formatted with a fragment() function containing the line vec4 albedo_tex = texture(texture_albedo, base_uv);
In this line, you can replace "texture_albedo" with "SCREEN_TEXTURE" and "base_uv" with "SCREEN_UV". This should make the new shader look like nothing has changed, because the next pass material is just sampling the screen from the last pass.
Above that, make a variable called something along the lines of "pixelated" and set it to the following expression:
vec2 pixelated = floor(SCREEN_UV * scale) / scale;
where scale is a float or vec2 containing the pixel size. Finally, replace SCREEN_UV in the albedo_tex definition with pixelated.
After this, you can have a float depth which samples DEPTH_TEXTURE with pixelated like this:
float depth = texture(DEPTH_TEXTURE, pixelated).r;
This depth value will be very large for pixels that are just trying to render the background onto your object. So, add a conditional statement:
if (depth > 100000.0f) { ALPHA = 0.0f; }
As long as the flags on this new next pass shader were set correctly (transparent and unshaded) you should have a quick-and-dirty pixelator. I say this because it has some minor artifacts around the edges, but you can make scale a uniform variable and set it from the editor and scripts, so I think it works nicely.
"Testing if a pixel is modifiable" in your case means testing if the object should be rendering it at all with that depth conditional.
Here's the full shader with my modifications from the comments:
// NOTE: Shader automatically converted from Godot Engine 3.4.stable's SpatialMaterial.
shader_type spatial;
render_mode blend_mix, depth_draw_opaque, cull_back, unshaded;

// the size of pixelated blocks on the screen relative to pixels
uniform int scale;

void vertex() {
}

// vec2 representation of one, used for calculation
const vec2 one = vec2(1.0f, 1.0f);

void fragment() {
    // scale SCREEN_UV up to the size of the viewport over the pixelation scale
    // assure scale is a multiple of 2 to avoid artefacts
    vec2 pixel_scale = VIEWPORT_SIZE / float(scale * 2);
    vec2 pixelated = SCREEN_UV * pixel_scale;
    // truncate the decimal place from the pixelated uvs and then shift them over by half a pixel
    pixelated = pixelated - mod(pixelated, one) + one / 2.0f;
    // scale the pixelated uvs back down to the screen
    pixelated /= pixel_scale;
    vec4 albedo_tex = texture(SCREEN_TEXTURE, pixelated);
    ALBEDO = albedo_tex.rgb;
    ALPHA = 1.0f;
    float depth = texture(DEPTH_TEXTURE, pixelated).r;
    if (depth > 10000.0f)
    {
        ALPHA = 0.0f;
    }
}

Retrieving depth value of a sample position using depth map

I have been trying to implement SSAO following the LearnOpenGL implementation. In that implementation, the positions g-buffer is used to obtain the sample position's depth value. I am wondering how I could use the depth buffer instead, since I already have it available, rather than relying on the positions g-buffer. Below I show the LearnOpenGL implementation using the positions texture and my attempt at using the depth buffer. I think I might be missing a step required for using the depth buffer, but I am unsure.
LearnOpenGL SSAO
Using Positions g-buffer
layout(binding = 7) uniform sampler2D positionsTexture;
layout(binding = 6) uniform sampler2D depthMap;
// ...
vec4 offset = vec4(samplePos, 1.0);
offset = camera.proj * offset; //transform sample to clip space
offset.xyz /= offset.w; // perspective divide
offset.xyz = offset.xyz * 0.5 + 0.5; // transform to range 0-1
float sampleDepth = texture(positionsTexture, offset.xy).z;
I want to use the depth buffer instead. This approach did not seem to work for me:
float sampleDepth = texture(depthMap, offset.xy).x;
Update: 8/01
I have implemented a linearization function for the depth result. I am still unable to obtain the right result. Am I missing something more?
float linearize_depth(float d, float zNear, float zFar)
{
    return zNear * zFar / (zFar + d * (zNear - zFar));
}
float sampleDepth = linearize_depth(texture(depthMap, offset.xy).z, zNear, zFar);

Can I do random writes from a kernel without worrying about synchronization issues?

Consider a simple depth-of-field filter (my actual use case is similar). It loops over the image and scatters every pixel over its circular neighborhood. The radius of the neighborhood depends on the depth of the pixel - the closer it is to the focal plane, the smaller the radius.
Note that I said "scatters" and not "gathers". In simpler image-processing applications, you normally use the "gather" technique to perform a uniform Gaussian blur. In other words, you loop over the neighborhood of each pixel and "gather" the nearby values into a weighted average. This works fine in that case, but if you make the blur kernel vary between pixels while still using "gathering", you'll get a somewhat unrealistic effect. Such "space-variant filtering" scenarios are where "scattering" differs from "gathering".
To be clear: the scatter algo is something like this:
init resultImage to black
loop over sourceImage
    var c = fetch current pixel from sourceImage
    var toAdd = c * weight // weight < 1
    loop over circular neighbourhood of current source pixel
        add toAdd to current neighbor from resultImage
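To make the distinction concrete, here is a rough single-threaded Java sketch of the scatter pattern from the pseudocode above (using a square neighbourhood for brevity and a hypothetical per-pixel radius array); the '+=' on the output image is exactly the operation that becomes a race once different work items process different source pixels in parallel:
// Scatter: each input pixel adds its contribution into a neighbourhood of output
// pixels, so one output location can have many writers. 'out' is assumed to be
// initialised to zero ("black") before the call.
static void scatterBlur(float[][] in, float[][] out, int[][] radius) {
    int h = in.length, w = in[0].length;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int r = radius[y][x];                  // per-pixel blur radius (depth-dependent)
            float weight = 1.0f / ((2 * r + 1) * (2 * r + 1));
            float toAdd = in[y][x] * weight;
            for (int ny = Math.max(0, y - r); ny <= Math.min(h - 1, y + r); ny++) {
                for (int nx = Math.max(0, x - r); nx <= Math.min(w - 1, x + r); nx++) {
                    out[ny][nx] += toAdd;          // a concurrent '+=' here is the hazard
                }
            }
        }
    }
}
In the gather pattern, by contrast, each output pixel is written exactly once by the loop iteration that owns it, which is why gather parallelises trivially.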
My question is: if I do a direct translation of this pseudocode to OpenCL, will there be synchronization issues due to different work-items simultaneously writing to the same output pixel?
Does the answer vary depending on whether I'm using Buffers or Images?
The course I'm reading suggests that there will be synchronization issues. But on the other hand, I read the source of Mandelbulber 1.21-2, which does a straightforward OpenCL DOF just like my pseudocode above, and it seems to work fine.
(the relevant code is in mandelbulber-opencl-1.21-2.orig/usr/share/cl/cl_DOF.cl and it's as follows)
//*********************************************************
// MANDELBULBER
// kernel for DOF effect
//
//
// author: Krzysztof Marczak
// contact: buddhi1980#gmail.com
// licence: GNU GPL v3.0
//
//*********************************************************
typedef struct
{
    int width;
    int height;
    float focus;
    float radius;
} sParamsDOF;

typedef struct
{
    float z;
    int i;
} sSortZ;

//------------------ MAIN RENDER FUNCTION --------------------
kernel void DOF(__global ushort4 *in_image, __global ushort4 *out_image, __global sSortZ *zBuffer, sParamsDOF p)
{
    const unsigned int i = get_global_id(0);
    uint index = p.height * p.width - i - 1;
    int ii = zBuffer[index].i;
    int2 scr = (int2){ii % p.width, ii / p.width};
    float z = zBuffer[index].z;
    float blur = fabs(z - p.focus) / z * p.radius;
    blur = min(blur, 500.0f);
    float4 center = convert_float4(in_image[scr.x + scr.y * p.width]);
    float factor = blur * blur * sqrt(blur) * M_PI_F / 3.0f;
    int blurInt = (int)blur;
    int2 scr2;
    int2 start = (int2){scr.x - blurInt, scr.y - blurInt};
    start = max(start, 0);
    int2 end = (int2){scr.x + blurInt, scr.y + blurInt};
    end = min(end, (int2){p.width - 1, p.height - 1});
    for (scr2.y = start.y; scr2.y <= end.y; scr2.y++)
    {
        for (scr2.x = start.x; scr2.x <= end.x; scr2.x++)
        {
            float2 d = scr - scr2;
            float r = length(d);
            float op = (blur - r) / factor;
            op = clamp(op, 0.0f, 1.0f);
            float opN = 1.0f - op;
            uint address = scr2.x + scr2.y * p.width;
            float4 old = convert_float4(out_image[address]);
            out_image[address] = convert_ushort4(opN * old + op * center);
        }
    }
}
No, you can't do this without worrying about synchronization. If two work items scatter to the same location without synchronization, you have a race condition and won't get the correct results. The same applies to both buffers and images. With buffers you could use atomics, but they can slow down your code, especially when there is contention (and even when there isn't). As far as I know, read/write images don't have atomic operations.

what parameters of CIVignette mean

I checked CIVignette in the Core Image Filter Reference at
http://developer.apple.com/library/mac/#documentation/graphicsimaging/reference/CoreImageFilterReference/Reference/reference.html#//apple_ref/doc/filter/ci/CIColorControls
and played around with the parameters:
inputRadius
inputIntensity
and still have not exactly understood what each parameter affects. Could someone please explain?
Take a look at Wikipedia to understand what vignetting in photography means.
It is the fall-off of light starting from the center of an image towards the corners.
Apple does not explain much about the params.
Obviously the radius somehow specifies where the vignetting starts.
The intensity parameter I expect to control how fast the light falls off once the vignetting starts.
The radius may not be given in points; a value of 1.0 relates to your picture size.
Intensity is definitely something like 1 to 10 or a larger number. 1 has some effect; 10 is rather dark already.
The radius seems to be in pixels (or points). I use a portion of the image size (say 1/10th of the width) and the effect is pretty good! However, if the intensity is strong (say 10), the radius can be small (like 1) and you can still see the difference.
Turns out there is an attributes property on CIFilter that explains its properties and ranges.
let filter = CIFilter(name: "CIVignette")!
print("\(filter.attributes)")
Generates the following output:
[
"CIAttributeFilterDisplayName": Vignette,
"CIAttributeFilterCategories": <__NSArrayI 0x6000037020c0>(
CICategoryColorEffect,
CICategoryVideo,
CICategoryInterlaced,
CICategoryStillImage,
CICategoryBuiltIn
),
"inputRadius": {
CIAttributeClass = NSNumber;
CIAttributeDefault = 1;
CIAttributeDescription = "The distance from the center of the effect.";
CIAttributeDisplayName = Radius;
CIAttributeMax = 2;
CIAttributeMin = 0;
CIAttributeSliderMax = 2;
CIAttributeSliderMin = 0;
CIAttributeType = CIAttributeTypeScalar;
},
"CIAttributeFilterName": CIVignette,
"inputImage": {
CIAttributeClass = CIImage;
CIAttributeDescription = "The image to use as an input image. For filters that also use a background image, this is the foreground image.";
CIAttributeDisplayName = Image;
CIAttributeType = CIAttributeTypeImage;
},
"inputIntensity": {
CIAttributeClass = NSNumber;
CIAttributeDefault = 0;
CIAttributeDescription = "The intensity of the effect.";
CIAttributeDisplayName = Intensity;
CIAttributeIdentity = 0;
CIAttributeMax = 1;
CIAttributeMin = "-1";
CIAttributeSliderMax = 1;
CIAttributeSliderMin = "-1";
CIAttributeType = CIAttributeTypeScalar;
},
"CIAttributeFilterAvailable_Mac": 10.9,
"CIAttributeFilterAvailable_iOS": 5,
"CIAttributeReferenceDocumentation": http://developer.apple.com/library/ios/documentation/GraphicsImaging/Reference/CoreImageFilterReference/index.html#//apple_ref/doc/filter/ci/CIVignette
]
inputRadius is a float between 0 and 2 that affects the 'size' of the shadow.
inputIntensity is a float between -1 and 1 that affects the 'darkness' of the filter.

How to compute the visible area based on a heightmap?

I have a heightmap. I want to efficiently compute which tiles in it are visible from an eye at any given location and height.
This paper suggests that heightmaps outperform turning the terrain into some kind of mesh, but they sample the grid using Bresenham's lines.
If I were to adopt that, I'd have to do a line-of-sight Bresenham's line for each and every tile on the map. It occurs to me that it ought to be possible to reuse most of the calculations and compute the visibility in a single pass if you fill outwards away from the eye - a scanline-fill kind of approach, perhaps?
But the logic escapes me. What would the logic be?
Here is a heightmap with the visibility from a particular vantage point (green cube) ("viewshed" as in "watershed"?) painted over it:
Here is the O(n) sweep that I came up with; it seems the same as that given in the paper in the answer below (Franklin and Ray's method), only in this case I am walking from the eye outwards instead of walking the perimeter doing Bresenham's lines towards the centre. To my mind, my approach would have much better caching behaviour - i.e. be faster - and use less memory, since it doesn't have to track the vector for each tile, only remember a scanline's worth:
typedef std::vector<float> visbuf_t;

inline void map::_visibility_scan(const visbuf_t& in, visbuf_t& out, const vec_t& eye, int start_x, int stop_x, int y, int prev_y) {
    const int xdir = (start_x < stop_x)? 1: -1;
    for(int x=start_x; x!=stop_x; x+=xdir) {
        const int x_diff = abs(eye.x-x), y_diff = abs(eye.z-y);
        const bool horiz = (x_diff >= y_diff);
        const int x_step = horiz? 1: x_diff/y_diff;
        const int in_x = x-x_step*xdir; // where in the in buffer would we get the inner value?
        const float outer_d = vec2_t(x,y).distance(vec2_t(eye.x,eye.z));
        const float inner_d = vec2_t(in_x,horiz? y: prev_y).distance(vec2_t(eye.x,eye.z));
        const float inner = (horiz? out: in).at(in_x)*(outer_d/inner_d); // get the inner value, scaling by distance
        const float outer = height_at(x,y)-eye.y; // height we are at right now in the map, eye-relative
        if(inner <= outer) {
            out.at(x) = outer;
            vis.at(y*width+x) = VISIBLE;
        } else {
            out.at(x) = inner;
            vis.at(y*width+x) = NOT_VISIBLE;
        }
    }
}

void map::visibility_add(const vec_t& eye) {
    const float BASE = -10000; // represents a downward vector that would always be visible
    visbuf_t scan_0, scan_out, scan_in;
    scan_0.resize(width);
    vis[eye.z*width+eye.x-1] = vis[eye.z*width+eye.x] = vis[eye.z*width+eye.x+1] = VISIBLE;
    scan_0.at(eye.x) = BASE;
    scan_0.at(eye.x-1) = BASE;
    scan_0.at(eye.x+1) = BASE;
    _visibility_scan(scan_0,scan_0,eye,eye.x+2,width,eye.z,eye.z);
    _visibility_scan(scan_0,scan_0,eye,eye.x-2,-1,eye.z,eye.z);
    scan_out = scan_0;
    for(int y=eye.z+1; y<height; y++) {
        scan_in = scan_out;
        _visibility_scan(scan_in,scan_out,eye,eye.x,-1,y,y-1);
        _visibility_scan(scan_in,scan_out,eye,eye.x,width,y,y-1);
    }
    scan_out = scan_0;
    for(int y=eye.z-1; y>=0; y--) {
        scan_in = scan_out;
        _visibility_scan(scan_in,scan_out,eye,eye.x,-1,y,y+1);
        _visibility_scan(scan_in,scan_out,eye,eye.x,width,y,y+1);
    }
}
Is it a valid approach?
It is using centre-points rather than looking at the slope between the 'inner' pixel and its neighbour on the side that the line of sight passes through.
Could the trig used to scale the vectors and such be replaced by factor multiplication?
It could use an array of bytes, since the heights are themselves bytes.
It's not a radial sweep; it does a whole scanline at a time, but moving away from the point. It uses only a couple of scanlines' worth of additional memory, which is neat.
If it works, you could imagine distributing it nicely using a radial sweep of blocks; you have to compute the centre-most tile first, but then you can distribute all immediately adjacent tiles from that (they just need to be given the edge-most intermediate values), and then in turn get more and more parallelism.
So how do I most efficiently calculate this viewshed?
What you want is called a sweep algorithm. Basically you cast rays (Bresenham's) to each of the perimeter cells, but keep track of the horizon as you go and mark any cells you pass on the way as being visible or invisible (and update the ray's horizon if visible). This gets you down from the O(n^3) of the naive approach (testing each cell of an nxn DEM individually) to O(n^2).
A more detailed description of the algorithm is given in section 5.1 of this paper (which you might also find interesting for other reasons if you aspire to work with really enormous heightmaps).
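To make the horizon-tracking idea concrete, here is a minimal sketch in Java (with a hypothetical height array, visibility array, and eye parameters; not the paper's exact algorithm) of casting a single ray and marking cells visible whenever their slope from the eye meets or exceeds the steepest slope seen so far:
// One horizon-tracking ray from the eye to a perimeter cell.
// A full viewshed runs this for every perimeter cell of the grid.
void castRay(float[][] height, boolean[][] visible,
             int eyeX, int eyeY, float eyeHeight,
             int targetX, int targetY) {
    int steps = Math.max(Math.abs(targetX - eyeX), Math.abs(targetY - eyeY));
    double maxSlope = Double.NEGATIVE_INFINITY; // steepest slope (the horizon) seen so far along the ray
    for (int i = 1; i <= steps; i++) {
        // walk the line in equal increments (a DDA stand-in for Bresenham's)
        int x = eyeX + Math.round((float) (i * (targetX - eyeX)) / steps);
        int y = eyeY + Math.round((float) (i * (targetY - eyeY)) / steps);
        double dist = Math.hypot(x - eyeX, y - eyeY);
        double slope = (height[x][y] - eyeHeight) / dist; // slope from the eye up to this cell
        if (slope >= maxSlope) {
            visible[x][y] = true;  // nothing closer rises high enough to hide this cell
            maxSlope = slope;      // this cell becomes the new horizon
        } else {
            visible[x][y] = false; // hidden below the current horizon
        }
    }
}
Running this for every perimeter cell gives the O(n^2) behaviour described above, since each ray marks all the cells it passes over in a single walk rather than testing each cell with its own line of sight.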
