I am developing an offscreen, Vulkan-based render server that draws a 2D scene per request.
Target platform: Ubuntu 18.04 in a Docker container
Physical device: llvmpipe (LLVM 11.0.1, 256 bits)
The scene consists of the same type of meshes and textures of different sizes. Each mesh is bound to its own texture. The maximum number of scene elements is 200.
I have just 1 material (vertex + fragment shaders) so I use just 1 pipeline.
High-level description of my workflow:
1) Setup framebuffer and readback image
2) Load all meshes (VBOs and IBOs)
3) Load all textures (images, views, samplers)
4) Create a descriptor set for the resources the material exposes (mesh transform and texture sampler)
5) Put per-mesh parameters (transform matrices) into a storage buffer
6) Update the fixed-size array of texture samplers
7) Draw each mesh
8) Send readback image to response.
That works great on a dedicated GPU, but llvmpipe supports neither VK_EXT_descriptor_indexing nor the shaderSampledImageArrayDynamicIndexing feature. This means I can't index the texture sampler array in shaders with a value from push constants.
#version 450
layout(set = 0, binding = 2) uniform sampler2D textures[200];
layout(push_constant) uniform Constants
{
uint id;
} meta;
void main()
{
// ...
vec4 t = texture(textures[meta.id], uv); // fails on llvmpipe
// ...
}
To use only one sampler I need:
clear(framebuffer)
for mesh in meshes
{
bind(mesh.vbo)
bind(mesh.ibo)
bind(descriptorset)
update(sampler) // write current mesh texture
submit()
}
read(readback)
...
I don't understand how to set up the render pass to perform these steps. The submit() in the middle of this loop confuses me.
Could you help me?
I tried another approach that is based on StorageTexelBuffers.
1. Get the maximum texel buffer size from the device limits (maxTexelBufferElements)
2. Split the scene data into chunks limited by maxTexelBufferElements.
3. Setup framebuffer and clear it
4. Draw a chunk[i]
5. Read back result
In this case samplers are not required.
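The chunking step could be sketched on the CPU as follows. This is only an illustration in Python: the function name `split_into_chunks`, the greedy packing strategy, and the sample sizes are assumptions, not part of the original server code. It groups textures so that each chunk's texel count stays within the limit and records where each texture starts inside its chunk.

```python
from typing import List, Tuple


def split_into_chunks(texture_sizes: List[Tuple[int, int]],
                      max_texels: int) -> List[List[Tuple[int, int]]]:
    """Greedily group textures so each chunk fits into one texel buffer.

    Every chunk entry is (texture_index, texture_start), where texture_start
    is the offset of the texture's first texel inside that chunk's buffer.
    """
    chunks = []
    current = []
    used = 0
    for index, (width, height) in enumerate(texture_sizes):
        texels = width * height
        if texels > max_texels:
            raise ValueError(f"texture {index} alone exceeds the limit")
        if used + texels > max_texels:
            # The current chunk is full: close it and start a new one.
            chunks.append(current)
            current, used = [], 0
        current.append((index, used))
        used += texels
    if current:
        chunks.append(current)
    return chunks


# Hypothetical texture sizes and limit, chosen only for the example.
sizes = [(4, 4), (8, 2), (2, 2), (4, 8)]
chunks = split_into_chunks(sizes, max_texels=36)
```

The `texture_start` values computed here are what would be passed per mesh through the push constants.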
I put N images into one 1D array and pass it to the fragment shader. In the shader I calculate the index of the specific texel and fetch it with imageLoad(...)
layout(location = 0) in vec2 uv;
layout(location = 0) out vec4 outColor;
layout(set = 0, binding = 2, rgba32f) uniform imageBuffer texels;
layout(push_constant) uniform Constants
{
uint id;
uint textureStart;
uint textureWidth;
uint textureHeight;
} meta;
void main()
{
// calculate specific texel real coordinates
uint s = uint(uv.x * float(meta.textureWidth));
uint t = uint(uv.y * float(meta.textureHeight));
// calculate texel index in global array
int index = int(meta.textureStart + s + t * meta.textureWidth);
outColor = imageLoad(texels, index);
}
Start of the texture is passed in PushConstants.
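One detail worth checking in the index calculation: when uv.x or uv.y is exactly 1.0, the truncated coordinate equals the width or height and the index lands in the next row or the next texture. The Python sketch below mirrors the shader arithmetic and adds a clamp. The function name `texel_index` and the clamp itself are assumptions added for illustration, not something the original shader does.

```python
def texel_index(u, v, texture_start, width, height):
    """CPU mirror of the shader's index math for uv in [0, 1].

    The min/max clamp keeps u == 1.0 or v == 1.0 on the last column/row
    instead of stepping past the end of the texture.
    """
    s = max(0, min(int(u * width), width - 1))
    t = max(0, min(int(v * height), height - 1))
    return texture_start + s + t * width


# A hypothetical 4x2 texture stored at offset 100 occupies indices 100..107.
first = texel_index(0.0, 0.0, 100, 4, 2)
middle = texel_index(0.5, 0.5, 100, 4, 2)
last = texel_index(1.0, 1.0, 100, 4, 2)
```

In GLSL the same clamp could be written with min() on the unsigned coordinates before the index is computed.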
I am developing a data viewing application in which time series data from multiple sensors is plotted. The data is always gigabytes in size, so I opted to use OpenGL.
Now I want the data to scroll vertically so it is easier to analyze. I have tried two methods so far, but neither is fully satisfactory. The first was to change the sensor indices periodically as the scroll wheel moves (using np.roll). This meets my requirement, but the data appears to jump. I want a smooth scrolling transition.
The second method was to change the position in the vertex shader. I pass a float in the range -1 to 1 along the Y axis, and the shader wraps the positional data that moves past one end around to the other. The problem with this method is that whenever a sensor's data sits at both ends (-1 and 1), the line is joined across the whole screen.
I can't figure out what else I can do in the shader to make this work.
Below are the shaders I am using and an image of the result I am getting.
I pass the data in a VBO and call gl.glDrawArrays(gl.GL_LINE_STRIP, i * self.count, self.count) in a loop to draw the lines.
VS = '''#version 450
uniform mat4 projection;
uniform mat4 modelview;
layout(location = 0) in vec2 position;
layout(location = 1) in vec3 color;
uniform float level;
void main() {
    // Shift the data vertically by `level` and wrap it around the [-1, 1] range
    if (position.y > -level) {
        gl_Position = vec4(position.x, position.y - (1.0 - level), 0.0, 1.0);
    } else {
        gl_Position = vec4(position.x, position.y + (1.0 + level), 0.0, 1.0);
    }
}'''
FS = '''#version 450
// Output variable of the fragment shader: a 4D vector containing the
// RGBA components of the pixel color.
out vec4 outColor;
uniform vec3 triangleColor;
void main()
{
    outColor = vec4(triangleColor, 1.0);
}'''
So I was hoping to find a method in which I first plot the data using the shader and then rotate the indices of the resulting pixel array.
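To make the wrap-around artifact concrete, here is a small CPU-side Python sketch. `wrap_y` mirrors the branch in the vertex shader above. `split_wrapped_strip` is a hypothetical helper, not part of the original program, that starts a new run whenever two consecutive wrapped samples jump by more than a threshold. Drawing each run as its own line strip would be one possible way to avoid the line across the screen. This is only a sketch under those assumptions.

```python
def wrap_y(y, level):
    """Same arithmetic as the vertex shader: shift y and wrap it."""
    if y > -level:
        return y - (1.0 - level)
    return y + (1.0 + level)


def split_wrapped_strip(ys, level, max_jump=1.0):
    """Split one strip into runs of (sample_index, wrapped_y).

    A new run starts wherever consecutive wrapped values differ by more
    than max_jump, which is where the strip crosses the wrap boundary.
    """
    runs = []
    current = []
    previous = None
    for i, y in enumerate(ys):
        wrapped = wrap_y(y, level)
        if previous is not None and abs(wrapped - previous) > max_jump:
            runs.append(current)
            current = []
        current.append((i, wrapped))
        previous = wrapped
    if current:
        runs.append(current)
    return runs


# Four samples that straddle the wrap boundary at y == -level (level = 0).
runs = split_wrapped_strip([0.1, 0.05, -0.05, -0.1], level=0.0)
run_indices = [[i for i, _ in run] for run in runs]
```

Here the first two samples end up near the bottom of the screen and the last two near the top, so they fall into separate runs instead of being connected.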
I'm looking for a shader, in Cg or HLSL, that can count the number of red pixels, or pixels of any other color I choose.
You could do this with atomic counters in a fragment shader. Just test the output color to see if it's within a certain tolerance of red, and if so, increment the counter. After the draw call you should be able to read the counter's value on the CPU and do whatever you like with it.
edit: added a very simple example fragment shader:
// Atomic counters require 4.2 or higher according to
// https://www.opengl.org/wiki/Atomic_Counter
#version 440
#extension GL_EXT_gpu_shader4 : enable
// Since this is a full-screen quad rendering,
// the only input we care about is texture coordinate.
in vec2 texCoord;
// Screen resolution
uniform vec2 screenRes;
// Texture info in case we use it for some reason
uniform sampler2D tex;
// Atomic counters! INCREDIBLE POWER
layout(binding = 0, offset = 0) uniform atomic_uint ac1;
// Output variable!
out vec4 colorOut;
bool isRed(vec4 c)
{
return c.r > c.g && c.r > c.b;
}
void main()
{
vec4 result = texture(tex, texCoord);
if (isRed(result))
{
uint cval = atomicCounterIncrement(ac1);
}
colorOut = result;
}
You would also need to set up the atomic counter in your code:
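GLuint acBuffer = 0;
glGenBuffers(1, &acBuffer);
glBindBuffer(GL_ATOMIC_COUNTER_BUFFER, acBuffer);
glBufferData(GL_ATOMIC_COUNTER_BUFFER, sizeof(GLuint), NULL, GL_DYNAMIC_DRAW);
// Attach the buffer to binding point 0 to match layout(binding = 0) in the shader
glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, 0, acBuffer);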
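When validating the value read back from the counter, a CPU reference count over the same pixels can be handy. The Python sketch below applies the same test as isRed() in the shader. The function names and the sample pixel list are made up for the example.

```python
def is_red(pixel):
    """Same test as the shader: red dominates both green and blue."""
    r, g, b = pixel[:3]
    return r > g and r > b


def count_red(pixels):
    """Count the pixels that pass is_red, as the atomic counter would."""
    return sum(1 for pixel in pixels if is_red(pixel))


# Hypothetical RGBA pixels: pure red, green-ish, reddish, and neutral gray.
image = [
    (1.0, 0.0, 0.0, 1.0),
    (0.2, 0.9, 0.1, 1.0),
    (0.6, 0.5, 0.4, 1.0),
    (0.5, 0.5, 0.5, 1.0),
]
red_pixels = count_red(image)
```

If the GPU count and the CPU count disagree, the counter buffer may not have been reset to zero before the draw call.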
After my main camera renders, I'd like to use (or copy) its depth buffer to a (disabled) camera's depth buffer.
My goal is to draw particles onto a smaller render target (using a separate camera) while using the depth buffer after opaque objects are drawn.
I can't do this in a single camera, since the goal is to use a smaller render target for the particles for performance reasons.
Replacement shaders in Unity aren't an option either: I want my particles to use their existing shaders. I just want the depth buffer of the particle camera to be overwritten with a subsampled version of the main camera's depth buffer before the particles are drawn.
I didn't get any reply to my earlier question; hence, the repost.
Here's the script attached to my main camera. It renders all the non-particle layers and I use OnRenderImage to invoke the particle camera.
public class MagicRenderer : MonoBehaviour {
public Shader particleShader; // shader that uses the main camera's depth buffer to depth test particle Z
public Material blendMat; // material that uses a simple blend shader
public int downSampleFactor = 1;
private RenderTexture particleRT;
private static GameObject pCam;
void Awake () {
// make the main cameras depth buffer available to the shaders via _CameraDepthTexture
camera.depthTextureMode = DepthTextureMode.Depth;
}
// Update is called once per frame
void Update () {
}
void OnRenderImage(RenderTexture src, RenderTexture dest) {
// create tmp RT
particleRT = RenderTexture.GetTemporary (Screen.width / downSampleFactor, Screen.height / downSampleFactor, 0);
particleRT.antiAliasing = 1;
// create particle cam
Camera pCam = GetPCam ();
pCam.CopyFrom (camera);
pCam.clearFlags = CameraClearFlags.SolidColor;
pCam.backgroundColor = new Color (0.0f, 0.0f, 0.0f, 0.0f);
pCam.cullingMask = 1 << LayerMask.NameToLayer ("Particles");
pCam.useOcclusionCulling = false;
pCam.targetTexture = particleRT;
pCam.depth = 0;
// Draw to particleRT's colorBuffer using mainCam's depth buffer
// ?? - how do i transfer this camera's depth buffer to pCam?
pCam.Render ();
// pCam.RenderWithShader (particleShader, "Transparent"); // I don't want to replace the shaders my particles use, so shader replacement isn't an option.
// blend mainCam's colorBuffer with particleRT's colorBuffer
// Graphics.Blit(pCam.targetTexture, src, blendMat);
// copy resulting buffer to destination
Graphics.Blit (pCam.targetTexture, dest);
// clean up
RenderTexture.ReleaseTemporary(particleRT);
}
static public Camera GetPCam() {
if (!pCam) {
GameObject oldpcam = GameObject.Find("pCam");
Debug.Log (oldpcam);
if (oldpcam) Destroy(oldpcam);
pCam = new GameObject("pCam");
pCam.AddComponent<Camera>();
pCam.camera.enabled = false;
pCam.hideFlags = HideFlags.DontSave;
}
return pCam.camera;
}
}
I've a few additional questions:
1) Why does camera.depthTextureMode = DepthTextureMode.Depth; end up drawing all the objects in the scene just to write to the Z-buffer? Using Intel GPA, I see two passes before OnRenderImage gets called:
(i) Z-PrePass, that only writes to the depth buffer
(ii) Color pass, that writes to both the color and depth buffer.
2) I re-rendered the opaque objects to pCam's RT using a replacement shader that writes (0,0,0,0) to the colorBuffer with ZWrite On (to overcome the depth buffer transfer problem). After that, I reset the layers and clear mask as follows:
pCam.cullingMask = 1 << LayerMask.NameToLayer ("Particles");
pCam.clearFlags = CameraClearFlags.Nothing;
and rendered them using pCam.Render().
I thought this would render the particles using their existing shaders with the ZTest.
Unfortunately, what I notice is that the depth-stencil buffer is cleared before the particles are drawn (in spite of me not clearing anything).
Why does this happen?
It's been 5 years, but I developed an almost complete solution for rendering particles into a smaller, separate render target. I'm writing this up for future visitors. A lot of background knowledge is still required.
Copying the depth
First, you have to get the scene depth in the resolution of your smaller render texture.
This can be done by creating a new render texture with the color format "depth".
To write the scene depth to the low resolution depth, create a shader that just outputs the depth:
struct fragOut{
float depth : DEPTH;
};
sampler2D _LastCameraDepthTexture;
fragOut frag (v2f i){
fragOut tOut;
tOut.depth = tex2D(_LastCameraDepthTexture, i.uv).x;
return tOut;
}
_LastCameraDepthTexture is automatically filled by Unity, but there is a downside.
It only comes for free if the main camera renders with deferred rendering.
For forward shading, Unity seems to render the scene again just for the depth texture.
Check the frame debugger.
Then, add a post processing effect to the main camera that executes the shader:
protected virtual void OnRenderImage(RenderTexture pFrom, RenderTexture pTo) {
Graphics.Blit(pFrom, mSmallerSceneDepthTexture, mRenderToDepthMaterial);
Graphics.Blit(pFrom, pTo);
}
You can probably do this without the second blit, but it was easier for me for testing.
Using the copied depth for rendering
To use the new depth texture for your second camera, call
mSecondCamera.SetTargetBuffers(mParticleRenderTexture.colorBuffer, mSmallerSceneDepthTexture.depthBuffer);
Keep targetTexture empty.
You then must ensure the second camera does not clear the depth, only the color.
For this, disable clear on the second camera completely and clear manually like this
Graphics.SetRenderTarget(mParticleRenderTexture);
GL.Clear(false, true, Color.clear);
I also recommend rendering the second camera by hand. Disable it and call
mSecondCamera.Render();
after clearing.
Merging
Now you have to merge the main view and the separate layer.
Depending on your rendering, you will probably end up with a render texture with so-called premultiplied alpha.
To mix this with the rest, use a post processing step on the main camera with
fixed4 tBasis = tex2D(_MainTex, i.uv);
fixed4 tToInsert = tex2D(TransparentFX, i.uv);
//beware premultiplied alpha in insert
tBasis.rgb = tBasis.rgb * (1.0f- tToInsert.a) + tToInsert.rgb;
return tBasis;
Additive materials work out of the box, but alpha blended do not.
You have to create a shader with custom blending to create working alpha blended materials. The blending is
Blend SrcAlpha OneMinusSrcAlpha, One OneMinusSrcAlpha
This changes how the alpha channel is modified for every performed blending.
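The arithmetic behind this can be checked on the CPU. The Python sketch below is illustrative only and its function names are assumptions. It applies the blend mode above to one particle drawn into a cleared layer, then merges the layer over the main image with the post-processing formula. For a single particle the result matches an ordinary alpha blend directly onto the main image.

```python
def blend_into_layer(dst, src):
    """Blend SrcAlpha OneMinusSrcAlpha, One OneMinusSrcAlpha for one pixel.

    dst and src are (r, g, b, a). RGB uses SrcAlpha / OneMinusSrcAlpha,
    alpha uses One / OneMinusSrcAlpha, so the layer ends up premultiplied.
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    inv = 1.0 - sa
    return (sr * sa + dr * inv, sg * sa + dg * inv, sb * sa + db * inv,
            sa + da * inv)


def merge(base_rgb, layer):
    """Merge a premultiplied layer over the main image (the post effect)."""
    lr, lg, lb, la = layer
    inv = 1.0 - la
    return tuple(b * inv + l for b, l in zip(base_rgb, (lr, lg, lb)))


# Half-transparent red particle drawn into a layer cleared to (0, 0, 0, 0).
layer = blend_into_layer((0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.5))
# Merge the layer over a white main image.
merged = merge((1.0, 1.0, 1.0), layer)
```

With the default alpha blend factors for the alpha channel, the layer alpha would be 0.25 instead of 0.5, and the main image would show through too strongly after merging.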
Results
[Images: additive blended in front of alpha blended (FX layer RGB, FX layer alpha); alpha blended in front of additive blended (FX layer RGB, FX layer alpha)]
I have not yet tested whether the performance actually improves.
If anyone has a simpler solution, let me know please.
I managed to reuse camera Z-buffer "manually" in the shader used for rendering. See http://forum.unity3d.com/threads/reuse-depth-buffer-of-main-camera.280460/ for more.
Just alter the particle shader you use already for particle rendering.
I'm learning OpenGL 3.3, using some tutorials (http://opengl-tutorial.org). In the tutorial I'm using, there is a vertex shader which does the following:
Tutorial Shader source
#version 330 core
// Input vertex data, different for all executions of this shader.
layout(location = 0) in vec3 vertexPosition_modelspace;
// Values that stay constant for the whole mesh.
uniform mat4 MVP;
void main(){
// Output position of the vertex, in clip space : MVP * position
gl_Position = MVP * vec4(vertexPosition_modelspace,1);
}
Yet, when I try to emulate the same behavior in my application, I get the following:
error: implicit cast from "vec4" to "vec3".
After seeing this, I wasn't sure if it was because I was using 4.2 version shaders as opposed to 3.3, so changed everything to match what the author had been using, still receiving the same error afterward.
So, I changed my shader to do this:
My (latest) Source
#version 330 core
layout(location = 0) in vec3 vertexPosition_modelspace;
uniform mat4 MVP;
void main()
{
vec4 a = vec4(vertexPosition_modelspace, 1);
gl_Position.xyz = MVP * a;
}
Which, of course, still produces the same error.
Does anyone know why this is the case, as well as what a solution might be to this? I'm not sure if it could be my calling code (which I've posted, just in case).
Calling Code
static const GLfloat T_VERTEX_BUF_DATA[] =
{
// x, y z
-1.0f, -1.0f, 0.0f,
1.0f, -1.0f, 0.0f,
0.0f, 1.0f, 0.0f
};
static const GLushort T_ELEMENT_BUF_DATA[] =
{ 0, 1, 2 };
void TriangleDemo::Run(void)
{
glClear(GL_COLOR_BUFFER_BIT);
GLuint matrixID = glGetUniformLocation(mProgramID, "MVP");
glUseProgram(mProgramID);
glUniformMatrix4fv(matrixID, 1, GL_FALSE, &mMVP[0][0]); // This sends our transformation to the MVP uniform matrix, in the currently bound vertex shader
const GLuint vertexShaderID = 0;
glEnableVertexAttribArray(vertexShaderID);
glBindBuffer(GL_ARRAY_BUFFER, mVertexBuffer);
glVertexAttribPointer(
vertexShaderID, // Specify the ID of the shader to point to (in this case, the shader is built in to GL, which will just produce a white triangle)
3, // Specify the number of indices per vertex in the vertex buffer
GL_FLOAT, // Type of value the vertex buffer is holding as data
GL_FALSE, // Normalized?
0, // Amount of stride
(void*)0 ); // Offset within the array buffer
glDrawArrays(GL_TRIANGLES, 0, 3); //0 => start index of the buffer, 3 => number of vertices
glDisableVertexAttribArray(vertexShaderID);
}
void TriangleDemo::Initialize(void)
{
glGenVertexArrays(1, &mVertexArrayID);
glBindVertexArray(mVertexArrayID);
glGenBuffers(1, &mVertexBuffer);
glBindBuffer(GL_ARRAY_BUFFER, mVertexBuffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(T_VERTEX_BUF_DATA), T_VERTEX_BUF_DATA, GL_STATIC_DRAW );
mProgramID = LoadShaders("v_Triangle", "f_Triangle");
glm::mat4 projection = glm::perspective(45.0f, 4.0f / 3.0f, 0.1f, 100.0f); // field of view, aspect ratio (4:3), 0.1 units near, to 100 units far
glm::mat4 view = glm::lookAt(
glm::vec3(4, 3, 3), // Camera is at (4, 3, 3) in world space
glm::vec3(0, 0, 0), // and looks at the origin
glm::vec3(0, 1, 0) // this is the up vector - the head of the camera is facing upwards. We'd use (0, -1, 0) to look upside down
);
glm::mat4 model = glm::mat4(1.0f); // set model matrix to identity matrix, meaning the model will be at the origin
mMVP = projection * view * model;
}
Notes
I'm in Visual Studio 2012
I'm using Shader Maker for the GLSL editing
I can't say what's wrong with the tutorial code.
In "My latest source" though, there's
gl_Position.xyz = MVP * a;
which looks weird because you're assigning a vec4 to a vec3.
EDIT
I can't reproduce your problem.
I have used a trivial fragment shader for testing...
#version 330 core
void main()
{
}
Testing "Tutorial Shader source":
3.3.11762 Core Profile Context
Log: Vertex shader was successfully compiled to run on hardware.
Log: Fragment shader was successfully compiled to run on hardware.
Log: Vertex shader(s) linked, fragment shader(s) linked.
Testing "My latest source":
3.3.11762 Core Profile Context
Log: Vertex shader was successfully compiled to run on hardware.
WARNING: 0:11: warning(#402) Implicit truncation of vector from size 4 to size 3.
Log: Fragment shader was successfully compiled to run on hardware.
Log: Vertex shader(s) linked, fragment shader(s) linked.
And the warning goes away after replacing gl_Position.xyz with gl_Position.
What's your setup? Do you have a correct version of OpenGL context? Is glGetError() silent?
Finally, are your GPU drivers up-to-date?
I've had problems with some GPUs (ATi ones, I believe) not liking integer literals where they expect a float. Try changing
gl_Position = MVP * vec4(vertexPosition_modelspace,1);
To
gl_Position = MVP * vec4(vertexPosition_modelspace, 1.0);
I just came across this error message on an ATI Radeon HD 7900 with latest drivers installed while compiling some sample code associated with the book "3D Engine Design for Virtual Globes" (http://www.virtualglobebook.com).
Here is the original fragment shader line:
fragmentColor = mix(vec3(0.0, intensity, 0.0), vec3(intensity, 0.0, 0.0), (distanceToContour < dF));
The solution is to cast the offending Boolean expression into float, as in:
fragmentColor = mix(vec3(0.0, intensity, 0.0), vec3(intensity, 0.0, 0.0), float(distanceToContour < dF));
The manual for mix (http://www.opengl.org/sdk/docs/manglsl) states
For the variants of mix where a is genBType, elements for which a[i] is false, the result for that
element is taken from x, and where a[i] is true, it will be taken from y.
So, since a Boolean blend value should be accepted by the compiler without comment, I think this should go down as an AMD/ATI driver issue.
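For reference, the two behaviours of mix can be sketched in Python. The function names are made up for the example. Note that in the quoted text the Boolean variant takes a genBType, which is component-wise, so with vec3 operands it would be a bvec3 rather than a scalar bool. That may be why the scalar comparison is rejected until it is cast to float.

```python
def mix_float(x, y, a):
    """Float variant: linear blend x * (1 - a) + y * a per component."""
    return tuple(xi * (1.0 - a) + yi * a for xi, yi in zip(x, y))


def mix_bool(x, y, a):
    """Boolean variant: per component, take y where a[i] is true, else x."""
    return tuple(yi if ai else xi for xi, yi, ai in zip(x, y, a))


green = (0.0, 0.8, 0.0)
red = (0.8, 0.0, 0.0)

# Casting the Boolean to float selects one end of the blend exactly.
near_contour = mix_float(green, red, float(True))
far_from_contour = mix_float(green, red, float(False))
# The Boolean variant needs one flag per component.
selected = mix_bool(green, red, (True, True, True))
```

Both routes give the same colors here, which is why the float cast works as a drop-in replacement.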
My iOS 4 app uses OpenGL ES 2.0 and renders elements with a single texture. I would like to draw elements using multiple different textures and am having problems getting things to work.
I added a variable to my vertex shader to indicate which texture to apply:
...
attribute float TextureIn;
varying float TextureOut;
void main(void)
{
...
TextureOut = TextureIn;
}
I use that value in the fragment shader to select the texture:
...
varying lowp float TextureOut;
uniform sampler2D Texture0;
uniform sampler2D Texture1;
void main(void)
{
if (TextureOut == 1.0)
{
gl_FragColor = texture2D(Texture1, TexCoordOut);
}
else // 0
{
gl_FragColor = texture2D(Texture0, TexCoordOut);
}
}
Compile shaders:
...
_texture = glGetAttribLocation(programHandle, "TextureIn");
glEnableVertexAttribArray(_texture);
_textureUniform0 = glGetUniformLocation(programHandle, "Texture0");
_textureUniform1 = glGetUniformLocation(programHandle, "Texture1");
Init/Setup:
...
GLuint _texture;
GLuint _textureUniform0;
GLuint _textureUniform1;
...
glActiveTexture(GL_TEXTURE0);
glEnable(GL_TEXTURE_2D); // ?
glBindTexture(GL_TEXTURE_2D, _textureUniform0);
glUniform1i(_textureUniform0, 0);
glActiveTexture(GL_TEXTURE1);
glEnable(GL_TEXTURE_2D); // ?
glBindTexture(GL_TEXTURE_2D, _textureUniform1);
glUniform1i(_textureUniform1, 1);
glActiveTexture(GL_TEXTURE0);
Render:
...
glVertexAttribPointer(_texture, 1, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*) (sizeof(float) * 13));
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, _textureUniform0);
glUniform1i(_textureUniform0, 0);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, _textureUniform1);
glUniform1i(_textureUniform1, 1);
glActiveTexture(GL_TEXTURE0);
glDrawElements(GL_TRIANGLES, indicesCountA, GL_UNSIGNED_SHORT, (GLvoid*) (sizeof(GLushort) * 0));
glDrawElements(GL_TRIANGLES, indicesCountB, GL_UNSIGNED_SHORT, (GLvoid*) (sizeof(GLushort) * indicesCountA));
glDrawElements(GL_TRIANGLES, indicesCountC, GL_UNSIGNED_SHORT, (GLvoid*) (sizeof(GLushort) * (indicesCountA + indicesCountB)));
My hope was to dynamically apply the texture associated with a vertex but it seems to only recognize GL_TEXTURE0.
The only way I have been able to change textures is to associate each texture with GL_TEXTURE0 and then draw:
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, _textureUniformX);
glUniform1i(_textureUniformX, 0);
glDrawElements(GL_TRIANGLES, indicesCountA, GL_UNSIGNED_SHORT, (GLvoid*) (sizeof(GLushort) * 0));
...
In order to render all the textures, I would need a separate glDrawElements() call for each texture, and I have read that glDrawElements() calls are a big hit to performance and the number of calls should be minimized. That's why I was trying to dynamically specify which texture to use for each vertex.
It's entirely possible that my understanding is wrong or I am missing something important. I'm still new to OpenGL and the more I learn the more I feel I have more to learn.
It must be possible to use textures other than just GL_TEXTURE0 but I have yet to figure out how.
Any guidance or direction would be greatly appreciated.
Could it be that you're just experiencing floating point rounding issues? There shouldn't be any (unless a single primitive shares vertices with different textures), but just to be sure, replace TextureOut == 1.0 with TextureOut > 0.5 or something like that.
As general advice, you are correct that the number of draw calls should be reduced as much as possible, but your approach is quite odd. You are buying draw call reduction with fragment shader branching. Your approach also doesn't scale well with the overall number of textures, since you always need all textures in separate texture units.
The usual approach to reduce texture switches is to put all the textures into a single large texture, a so-called texture atlas, and use the texture coordinates to select the appropriate subregion in this texture. This also has some pitfalls (which are an entirely different question), but nothing comes for free.
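The coordinate remapping for an atlas could look like the Python sketch below. The function name, the region layout (x, y, width, height in texels) and the atlas size are assumptions for illustration. It does not handle the pitfalls mentioned above, such as neighbouring sub-images bleeding into each other under filtering or mipmapping.

```python
def atlas_uv(u, v, region, atlas_size):
    """Map a sub-image's local (u, v) in [0, 1] to atlas coordinates.

    region is (x, y, width, height) of the sub-image in texels,
    atlas_size is (width, height) of the whole atlas in texels.
    """
    x, y, w, h = region
    atlas_w, atlas_h = atlas_size
    return ((x + u * w) / atlas_w, (y + v * h) / atlas_h)


# Hypothetical 256x256 atlas with a 64x64 sub-image at texel (128, 0).
region = (128, 0, 64, 64)
corner = atlas_uv(0.0, 0.0, region, (256, 256))
center = atlas_uv(0.5, 0.5, region, (256, 256))
opposite = atlas_uv(1.0, 1.0, region, (256, 256))
```

The remapped coordinates would typically be baked into the vertex data, so all elements can be drawn with one texture bound and a single draw call.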
EDIT: Oh wait, I see what you're actually doing wrong
glBindTexture(GL_TEXTURE_2D, _textureUniform0);
You're binding a texture to the current texture unit, but instead of the texture object you give this function a uniform location, which is complete rubbish (but might even work in some weird circumstances, since both uniform locations and texture objects are themselves just integers). Of course you have to bind the actual texture.