I am creating an app for android using openGL ES. I am trying to draw, in 2D, lots of moving sprites which bounce around the screen.
Let's consider I have a ball at coordinates 100,100. The ball graphic is 10px wide, therefore I can create the vertices boundingBox = {100,110,0, 110,110,0, 100,100,0, 110,100,0} and perform the following on each loop of onDrawFrame() with the ball texture loaded.
//for each ball object
FloatBuffer ballVertexBuffer = byteBuffer.asFloatBuffer();
ballVertexBuffer.put(ball.boundingBox);
ballVertexBuffer.position(0);
gl.glVertexPointer(3, GL10.GL_FLOAT, 0, ballVertexBuffer);
gl.glDrawArrays(GL10.GL_TRIANGLE_STRIP, 0,4);
I would then update the boundingBox array to move the balls around the screen.
Alternatively, I could not alter the bounding box at all and instead translatef() the ball before drawing the verticies
gl.glVertexPointer(3, GL10.GL_FLOAT, 0, ballVertexBuffer);
gl.glPushMatrix();
gl.glTranslatef(ball.posX, ball.posY, 0);
gl.glDrawArrays(GL10.GL_TRIANGLE_STRIP, 0,4);
gl.glPopMatrix();
What would be the best thing to do in the case in terms of efficient and best practices.
OpenGL ES (as of 2.0) does not support instancing, unluckily. If it did, I would recommend drawing a 2-triangle sprite instanced N times, reading the x/y offsets of the center point, and possibly a scale value if you need differently sized sprites, from a vertex texture (which ES supports just fine). This would limit the amount of data you must push per frame to a minimum.
Assuming you can't do the simulation directly on the GPU (thus avoiding uploading the vertex data each frame) ... this basically leaves you only with only one efficient option:
Generate 2 VBOs, map one and fill it, while the other is used as the source of the draw call. You can also do this seemingly with a single buffer if you glBufferData(... 0) in between, which tells OpenGL to generate a new buffer and throw the old one away as soon as it's done reading from it.
Streaming vertices in every frame may not be super fast, but this does not matter as long as the latency can be well-hidden (e.g. by drawing from one buffer while filling another). Few draw calls, few state changes, and ideally no stalls should still make this fast.
Drawing calls are much more expensive than altering the data. Also glTranslate is not nearly as efficient as just adding a few numbers, after all it has to go through a full 4×4 matrix multiplication, which is 64 scalar multiplies and 16 scalar additions.
Of course the best method is using some form of instancing.
Related
This is a question to understand the principles of GPU accelerated rendering of 2d vector graphics.
With Skia or Direct2D, you can draw e.g. rounded rectangles, Bezier curves, polygons, and also have some effects like blur.
Skia / Direct2D offer CPU and GPU based rendering.
For the CPU rendering, I can imagine more or less how e.g. a rounded rectangle is rendered. I have already seen a lot of different line rendering algorithms.
But for GPU, I don't have much of a clue.
Are rounded rectangles composed of triangles?
Are rounded rectangles drawn entirely by wild pixel shaders?
Are there some basic examples which could show me the basic prinicples of how such things work?
(Probably, the solution could also be found in the source code of Skia, but I fear that it would be so complex / generic that a noob like me would not understand anything.)
In case of direct2d, there is no source code, but since it uses d3d10/11 under the hood, it's easy enough to see what it does behind the scenes with Renderdoc.
Basically d2d tends to have a policy to minimize draw calls by trying to fit any geometry type into a single buffer, versus skia which has some dedicated shader sets depending on the shape type.
So for example, if you draw a bezier path, Skia will try to use tesselation shader if possible (which will need a new draw call if the previous element you were rendering was a rectangle), since you change pipeline state.
D2D, on the other side, tends to tesselate on the cpu, and push to some vertexbuffer, and switches draw call only if you change brush type (if you change from one solid color brush to another it can keep the same shaders, so it doesn't switch), or when the buffer is full, or if you switch from shape to text (since it then needs to send texture atlases).
Please note that when tessellating bezier path D2D does a very great work at making the resulting geometry non self intersecting (so alpha blending works properly even on some complex self intersecting path).
In case on rounded rectangle, it does the same, just tessellates into triangles.
This allows it to minimize draw calls to a good extent, as well as allowing anti alias on a non msaa surface (this is done at mesh level, with some small triangles with alpha). The downside of it is that it doesn't use much hardware feature, and geometry emitted can be quite high, even for seemingly simple shapes).
Since d2d prefers to use triangle strips instead or triangle list, it can do some really funny things when drawing a simple list of triangles.
For text, d2d use instancing and draws one instanced quad per character, it is also good at batching those, so if you call some draw text functions several times in a row, it will try to merge this into a single call as well.
Back story: I'm creating a Three.js based 3D graphing library. Similar to sigma.js, but 3D. It's called graphosaurus and the source can be found here. I'm using Three.js and using a single particle representing a single node in the graph.
This was the first task I had to deal with: given an arbitrary set of points (that each contain X,Y,Z coordinates), determine the optimal camera position (X,Y,Z) that can view all the points in the graph.
My initial solution (which we'll call Solution 1) involved calculating the bounding sphere of all the points and then scale the sphere to be a sphere of radius 5 around the point 0,0,0. Since the points will be guaranteed to always fall in that area, I can set a static position for the camera (assuming the FOV is static) and the data will always be visible. This works well, but it either requires changing the point coordinates the user specified, or duplicating all the points, neither of which are great.
My new solution (which we'll call Solution 2) involves not touching the coordinates of the inputted data, but instead just positioning the camera to match the data. I encountered a problem with this solution. For some reason, when dealing with really large data, the particles seem to flicker when positioned in front/behind of other particles.
Here are examples of both solutions. Make sure to move the graph around to see the effects:
Solution 1
Solution 2
You can see the diff for the code here
Let me know if you have any insight on how to get rid of the flickering. Thanks!
It turns out that my near value for the camera was too low and the far value was too high, resulting in "z-fighting". By narrowing these values on my dataset, the problem went away. Since my dataset is user dependent, I need to determine an algorithm to generate these values dynamically.
I noticed that in the sol#2 the flickering only occurs when the camera is moving. One possible reason can be that, when the camera position is changing rapidly, different transforms get applied to different particles. So if a camera moves from X to X + DELTAX during a time step, one set of particles get the camera transform for X while the others get the transform for X + DELTAX.
If you separate your rendering from the user interaction, that should fix the issue, assuming this is the issue. That means that you should apply the same transform to all the particles and the edges connecting them, by locking (not updating ) the transform matrix until the rendering loop is done.
I'm looking for an efficient way to display lots of spheres using directx 11. The spheres are defined by (x,y,z,r) where (x,y,z) are coordinates in space and r is the radius. I want to display only the spheres that can be seen, meaning that spheres that are not in the field of view and spheres that are too small to be seen wouldn't be drawn. However, if a group of spheres smaller than one pixel is at least as big as one pixel, then I want to display the most predominant color. Spheres have only one color and different levels of transparency. Any help would be appreciated and incomplete answers are acceptable.
You need several things. First an indexed unit sphere geometry, second a buffer to store the sphere instance properties ( position, radius and color ) and third a small buffer for the API parameters yet to come. The three combines in a single 'ID3D11DeviceContext::DrawIndexedInstancedIndirect'
The remaining question is "how to feed the instance buffer ?". cpu is easy, just apply frustum culling, sort back to front because of the transparency and apply a merge based on the screen projection, update the buffer and use 'ID3D11DeviceContext::DrawIndexedInstanced'.
gpu version will do the same thing with compute shaders but will be harder to implement. The advantage, zero cpu/gpu synchronization and should support far more instance.
Background
Using gluTess to build a triangle list in Direct3D9 from a GDI+ DrawString(..) path:
A pixel shader (v3.0) is then used to fill in the shape. When painting with opaque values, everything looks fine:
The problem
At certain font sizes, if the color has an alpha component (ie Argb #55FFFFFF) we begin to see these nasty tessellation artifacts where triangles may overlap ever so slightly:
At larger font sizes the problem is sometimes not present:
Using Intel's excellent GPA Frame Analyzer Pixel History tool, we can see in areas where the artifacts occur, the pixel has been "touched" 3 times from the single Erg.
I'm trying to figure out how I can stop my pixel shader from touching the same pixel more than once.
Other solutions relating to overdraw prevention seem to be all about zbuffer strategies, however this problem is more to do with painting of a single 2D triangle list within a single pixel shader pass.
I'm at a bit of a loss trying to come up with a solution on this one. I was hoping that HLSL might have some sort of "touch each pixel only once" flag, but I've been unable to find anything like that. The closest I've found was to set the BLENDOP to MAX instead of ADD. But the output is not correct when blending over other colors in the scene.
I also have SRCBLEND = ONE, DSTBLEND = INVSRCALPHA. The only combination of flags which produce correct output (albeit with overdraw artifacts.)
I have played with SEPARATEALPHABLENDENABLE in the GPA frame analyzer, which sounded like almost exactly what I need here -- set blending to MAX but only on the "alpha" channel, however from what I can determine, that setting (and corresponding BLENDOPALPHA) affects nothing at all.
One final thing I thought of was to bake text as opaque onto a texture, and then repaint that texture into the scene with the appropriate alpha value applied, however this doesn't actually work in this project because I also support gradient brushes, where stop values may contain alpha, meaning either the artifacts would still be seen, or the final output just plain wrong if we stripped the alpha away from the stop values prior to baking to a texture. Also the whole endeavor would be hideously expensive.
Any hints or pointers would be appreciated. Thanks for reading.
The problem you're seeing shouldn't happen.
If two of your triangles are overlapping it's because you've placed the vertices in such a way that when the adjacent triangles are drawn, they overlap. What's probably happening is that these two adjacent triangles share two vertices, but each triangle has its own copy of each vertex that's been calculated to be in a very, very slightly different position.
The solution to the problem isn't to try and make the pixel shader touch the pixel only once it's to use an index buffer (if you aren't already) and have the shared vertices between each triangle actually share the same vertex and not use one that's ever-so-slightly not in the same place as the one used by the adjacent triangle.
If you aren't in control of the tessellation algorithm being used you may have to run a pass over the vertex buffer after its been generated to detect and merge vertices that are within some very small tolerance of one another. Even without an index buffer, a naive solution would be this:
For each vertex in the vertex buffer, compare its position to every other vertex in the rest of the vertex buffer.
If two vertices are within some small tolerance of another, replace the second vertex's position with the position of the one you are comparing it against.
This should have the effect of pairing up the positions of two vertices if they are close enough that you deem them to be the same.
You now shouldn't have any problem with overlapping triangles. In everyday rendering two triangles share edges with each other all the time and you won't ever get the effect where they appear to every-so-slightly overlap. The hardware guarantees that a sample point is either on one side of the line or the other, but never both at the same time, no matter how close the point is to the line (even if it's mathematically on the line, it still fails on one side or the other).
I've been struggling with this for a while.
Presently, I have a grid of 100 by 100 tiles, which belong to a Map.
The Map implements IDrawable. I call Draw() and it draws itself at 0,0 which is fine.
However, I want to expand this to draw essentially a viewport. The player will be drawn on the screen in the middle, and thus I want to display say, 10 tiles in each direction (rather than the entire map).
I'm having trouble thinking up the architecture for this one. I'm in the mindset that things should draw themselves, ie I say player1.Draw() and it draws itself. This would have worked before, where it drew the player at x,y on the screen, but with a viewport it will no longer know where to draw itself.
So should the viewport be told to draw, and examine every object in the game and draw those which are visible? Should the map tiles be objects that are subjected to this? Or should the viewport intelligently draw the map by coupling both together?
I'd love to know how typical scrolling tile games accomplish this.
If it matters, I'm using XNA
Edit to add: Can you do graphics manipulation such as trying the HTML rendering approach, where you tell things to draw, and they return a graphic of themselves, and then the parent places the graphic in the correct location? I'm thinking, if I had 2 viewports side by side for splitscreen, how would I stop them drawing outside the edges?
Possible design:
There's a 2D "world" that contains object instances.
"Object instance" is a sprite reference + its coordinates in the world.
When you draw scene, you request list of visible objects that exist in given 2D area, THEN you draw them.
With such design world can be very huge.
I'm in the mindset that things should draw themselves, ie I say player1.Draw() and it draws itself.
visible things should draw themselves. Objects outside of viewport are not visible.
, how would I stop them drawing outside the edges?
Not sure about XNA, but OpenGL has "scissors test"/"glViewport" and Direct3D 9 has "SetViewport" method that allows you to use part of the screen/window for rendering. There are also clipplanes and stencil buffer (using stencil for 2D clipping is overkill, though) You could also render to texture then render the texture. There are many ways to deal with this.
So should the viewport be told to draw, and examine every object in the game and draw those which are visible?
For a large world, you shouldn't examine every object, because it will be slow. You should be able to find visible object without testing every one of them. For that you'll need some kind of space partitioning - quad trees (because we are in 2D), k-d trees, etc. This way you should be able to handle few thousands (or even hundreds of thousands) of objects, as long as you don't see them all at once.
Should the map tiles be objects that are subjected to this?
If you keep drawing invisible things, FPS will drop.
and they return a graphic of themselves
For 2D game this may be very slow. Remember KISS principle.
Some basic ideas, not specifically for XNA:
objects draw themselves to a "virtual screen" in world coordinates, they don't draw themselves to the screen directly
drawable objects get a "graphics context" object which offers you a drawing API. The "graphics context" knows about the current viewport bounds and realizes the coordinate transformation from world coordinates to screen coordinates (for every drawing operations). The graphics context also does the direct drawing to the screen (or to a background screen buffer, if you need double buffering).
when you have many objects outside the visible bounds of your viewport, then as a performance optimization, your drawing loop can make a before-hand bounds-check for your objects and test if they are completely outside the visible area. If so, there is no need to let them draw themselves.