Vulkan: attachment synchronisation with implicit layout transitions

I have read almost everything that Google gave me on this topic and haven't been able to reach a satisfactory conclusion. It's essentially a follow-up question to this one:
Moving image layouts with barrier or renderpasses
Assume I have a color attachment which is written to in one render pass and sampled from in a second one. Let there be only one subpass in both render passes. One way to handle the layout transition and dependencies is to add a barrier between the two render passes, which changes the layout from VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL to VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL.
But Vulkan also offers implicit layout transitions (VkAttachmentDescription, initialLayout and finalLayout). I guess there is a performance advantage to using them, so let's simply try to get rid of our barrier. We set the initialLayout and finalLayout fields in the VkAttachmentDescription structure and remove the barrier. The problem is, we have lost the synchronisation provided by the barrier, so we need to get it back by other means. And this is the point where the confusion starts, leading to my questions:
1) What's the recommended way to synchronize the attachment between the two render passes? Obviously I could simply re-add the barrier and not change the layout, but wouldn't that defeat the purpose of the whole exercise, which was to get better performance by using implicit layout transitions and getting rid of the barrier? Or should I add a subpass dependency from the single subpass of render pass 1 to VK_SUBPASS_EXTERNAL? Are there any caveats to using VK_SUBPASS_EXTERNAL performance-wise?
2) What about synchronizing the attachment backwards? It is the application's responsibility to transition the attachment to the correct initial layout, which can be done with a barrier, obviously. Can this barrier be replaced to get a performance advantage? The only way I can think of would be to do the dependency part with a subpass dependency from VK_SUBPASS_EXTERNAL to the single subpass of render pass 1 and to use a 'fast' barrier (one that doesn't sync) which only does the layout change. Does this make sense? What would that barrier look like? Or is the 'full' barrier unavoidable in this case?
The short version of my questions is simply: how do other people do attachment synchronisation in conjunction with implicit layout transitions?

Generally speaking, when Vulkan or similar low-level APIs offer you multiple tools that can achieve what you want, you should give preference to the most specific tool that can solve your problem (without having to radically re-architect your code or fundamentally impact your design).
In your case, you have two options: barriers or render pass mechanisms (subpass dependencies and layout transitions). Barriers work with anything; they don't care where the image came from, what it was used for, or where it is going. Render pass mechanisms only work for stuff that happens in a render pass and primarily deal with images attached to render passes (implicit layout transitions only work on attachments).
Render pass mechanisms are more specific, so you should prefer to use those tools if they meet your needs.
This is also why, if you have two "separate" rendering operations that could be in the same render pass (if you're reading from an attachment in a way that can live within the limitations of input attachments), you should prefer to put them in the same render pass.
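For the concrete case in the question (attachment written in render pass 1, sampled in render pass 2), the render-pass version of the barrier could be expressed as an external subpass dependency on the first render pass. A minimal sketch in C follows; the stage and access masks are my assumptions for that write-then-sample hand-off, not something taken from the question:

// Hedged sketch: dependency from the single subpass of render pass 1
// to everything that follows the render pass (here, the sampling pass).
VkSubpassDependency dependency = {
    .srcSubpass    = 0,                   // the one subpass of render pass 1
    .dstSubpass    = VK_SUBPASS_EXTERNAL, // commands after the render pass
    .srcStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask  = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
};

VkRenderPassCreateInfo renderPassInfo = {
    .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
    // attachments and subpasses omitted for brevity
    .dependencyCount = 1,
    .pDependencies   = &dependency,
};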

The short version of my questions is simply: how do other people do attachment synchronisation in conjunction with implicit layout transitions?
Render pass dependencies are what you are looking for. In the case of two render passes you need to use the mentioned VK_SUBPASS_EXTERNAL value.
It is the application's responsibility to transition the attachment to the correct initial layout, which can be done with a barrier, obviously. Can this barrier be replaced to get a performance advantage?
It doesn't matter how you perform the layout transition. It is your responsibility to transition the image's layout to the one specified as the initial layout. But I think the best way would be to once again use the implicit layout transitions provided by render passes. If you are using them already, it should be possible to set them up so that the first render pass transitions the image to a layout which is the same as the initial layout of the second render pass, and the final layout of the second render pass is the same as the initial layout of the first render pass.
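As a rough illustration of that hand-off, the attachment description of the first render pass might look like the following sketch; the format and the load/store ops are placeholder assumptions, not details from the question:

// Hedged sketch: render pass 1's finalLayout already performs the
// transition to the layout the sampling pass expects, so no separate
// barrier is needed for the layout change itself.
VkAttachmentDescription colorAttachment = {
    .format         = VK_FORMAT_R8G8B8A8_UNORM,
    .samples        = VK_SAMPLE_COUNT_1_BIT,
    .loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR,
    .storeOp        = VK_ATTACHMENT_STORE_OP_STORE,
    .stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
    .stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED,
    .finalLayout    = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
};

The execution ordering itself would still come from a VK_SUBPASS_EXTERNAL dependency like the one sketched in the previous answer.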

Related

How to make a GSAP marquee item change line immediately instead of waiting for all items to finish animating?

I'm Oliver, a noob at web animation. These past two days I've been trying to do a GSAP marquee side project. I built 500 DOM boxes, as in this sandbox URL:
https://codesandbox.io/s/gsap-marquee-test-6zx2d?file=/src/App.js&fbclid=IwAR1tbmloHRXHUBHKG5FjBGDAx0TFd9sTkBJfSwpye8CQteO-TO8FNi1w4mw
and I have a few questions:
1. I used setTimeout to separate each box into its own timeline animation, so that a single box's animation can move to another line immediately after finishing the last one, instead of waiting for the other 499 boxes in the same line to finish, as would happen if I used the stagger property.
This method produces 500 timeline instances, which doesn't seem like a good idea. Are there any methods that could produce the same animation with one or just a few timelines?
2. If I did this animation in canvas, would the browser render more efficiently?
You should avoid using setTimeout with GSAP as it's best to use GSAP to control the timing of things.
In this situation, you can probably make use of GSAP's staggers. You should also learn about the position parameter of GSAP's timelines. If you use one (or both, depending on the exact effect that you need) of these you should be able to avoid creating so many timelines.
Additionally, your animation is not responsive. You probably want to make use of functional properties (where your properties of a tween are functions, not just hard numbers) with timeline invalidation to make it responsive.
I also highly recommend going through the most common GSAP mistakes article as you're making some of them.
As for using canvas for rendering your boxes, it probably depends on what your boxes are like. In most cases it'd probably be faster to use canvas, yes. But the slow part of animating these boxes is not anything related to the animation functionality itself, per se. It's related to render speed. In general it's faster to render a bunch of objects to canvas than it is to render a bunch of DOM elements.

Confused about render passes in the Vulkan API

Recently I started learning the Vulkan API, and there are some topics that confuse me. My question is: what is a render pass, and why is it used concurrently with command buffer recording? And finally, what are subpasses, subpass dependencies and attachments, which are commonly associated with render passes?
It's the only way to get something drawn (draw commands can only appear inside a render pass), so don't overthink it. As a beginner you only need to create one render pass with one (mandatory) subpass, and that's it. You can understand the depths of it later.
Also, you should give a chance to all those videos and tutorials, which are written at length and with more care than whatever someone will write here in the short SO format.
Give the spec a chance (it's not so bad, though it avoids redundant semantic and conceptual information). Try reading some intro by AMD, vulkan-tutorial.com, Vulkan in 30 minutes (this one helped me get started anyway, though there was not much more available at the time), and API without Secrets, and watch e.g. the Vulkan GDC session Part1, Part2.
Then you will have heard from some of the people behind it and seen some of the commands. You should get back to us with more specific aspects you do not understand about it.
OK, I am just gonna add some conceptual description of it here to formally answer the question.
A render pass is sort of a description or map or scheme of a graphics job (which revolves around a particular organization/use of Image resources). But it does not describe the actual commands nor the actual resources (that is done in command buffer recording, for a render pass instance, between vkCmdBeginRenderPass() and vkCmdEndRenderPass()).
Maybe a "black box" or "C++ like declaration" for which you provide implementation later is a good analogy.
A render pass has a set of attachments. Let's think of them as descriptions of the needed frame image outputs and temporaries (but not the specific frame images themselves).
A render pass has a set of subpasses. A subpass describes how an attachment will be treated during its execution (e.g. as a color buffer in a color image layout).
A render pass has a set of subpass dependencies. Dependencies describe the execution order between subpasses (they form a dependency DAG). A dependency also describes the equivalent of a pipeline barrier between two subpasses, or between a subpass and the outside of the whole render pass (a VK_SUBPASS_EXTERNAL dependency). Subpasses are executed in any order and can overlap (at the leisure of the driver), except for what you describe in the dependencies (or otherwise synchronize). A small sketch of these pieces follows below.
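To make the attachment/subpass part concrete, a minimal sketch of a single color subpass might look like this (attachment index 0 and the graphics bind point are just the usual assumptions):

// Hedged sketch: one subpass that treats attachment 0 as a color
// buffer in the color-optimal layout, as described above.
VkAttachmentReference colorRef = {
    .attachment = 0, // index into the render pass' attachment array
    .layout     = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
};

VkSubpassDescription subpass = {
    .pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .colorAttachmentCount = 1,
    .pColorAttachments    = &colorRef,
};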
In a command buffer, you create a render pass instance using vkCmdBeginRenderPass() (you provide the actual Images for the attachments via a VkFramebuffer, and the actual commands that write to them).
The things that are part of the render pass description are executed automagically (the image layout transitions, barriers, and MSAA resolutions).
For the rest, you record the commands for the subpasses of the render pass instance into the current command buffer. You do so sequentially for subpass 0, 1, 2, 3, 4, ...; that is not what the actual execution order will be, though. You have described that with the subpass dependencies (and it is otherwise at the leisure of the driver).
Then the command buffer with such render pass instance(s) is submitted to a queue and actually executed.
It is perhaps these indirections that make it harder to grasp. Commands are recorded before they are even executed. And render pass is created before it is even recorded. :)
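To tie the analogy to actual calls, a bare-bones recording of a render pass instance could look like the following sketch; cmdBuf, renderPass and framebuffer are assumed to exist, and the render area and clear value are placeholders:

// Hedged sketch: recording a render pass instance with one subpass.
VkClearValue clearColor = { .color = { .float32 = { 0.0f, 0.0f, 0.0f, 1.0f } } };

VkRenderPassBeginInfo beginInfo = {
    .sType           = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
    .renderPass      = renderPass,   // the "declaration"
    .framebuffer     = framebuffer,  // the actual Images for the attachments
    .renderArea      = { .offset = { 0, 0 }, .extent = { 800, 600 } },
    .clearValueCount = 1,
    .pClearValues    = &clearColor,
};

vkCmdBeginRenderPass(cmdBuf, &beginInfo, VK_SUBPASS_CONTENTS_INLINE);
// ... draw commands for subpass 0 ...
// vkCmdNextSubpass(cmdBuf, VK_SUBPASS_CONTENTS_INLINE); // only with more subpasses
vkCmdEndRenderPass(cmdBuf);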

JavaFX 8: updating control models?

UI controls such as ListView, TreeView, etc. come with a model that is observable.
When one makes a change to that model, I suppose JavaFX knows how to refresh the display automatically.
However my question here is as follows:
Is it the intended way that someone who wants to update, and not replace, this model does so in a background thread with Platform.runLater?
In other words, one has some serious computation to do and needs to update an ObservableList as a result. Is it the intended way to do the heavy work in a background thread and, at the end of it, run the update in Platform.runLater?
I'm asking this because this is what I have been doing so far without problem. But from my reading here and there, in particular in
http://docs.oracle.com/javase/8/javafx/api/javafx/concurrent/Task.html
It seems that some other mechanism should be used: one should rather return a full list instead of updating the observable list.
But this works only if things come from the GUI. If the update is triggered from the back end, there is no way to do so.
The solution that I have used so far was always to hold a reference to the observable list and to update it by means of Platform.runLater.
Is there any other way?
The link you give has an example (the PartialResultsTask) that does as you describe: it updates an existing ObservableList as it progresses via a call to Platform.runLater(). So this is clearly a supported way of doing things.
For updating from the back end (i.e. from a class unaware that the data are being used in a UI), you'd really have to post some code for anyone to be able to help. But you might have a look at the techniques used in this article. While he doesn't actually update lists from the backend in the examples there, the same strategy could be used to do so.

OpenGL ES 2.0: Efficient Rendering of Static and Dynamic Vertex Data

I am writing an iOS/Android game and looking for the most performant way to render my vertex data with OpenGL ES 2.0. I have two different kinds of data: dynamic data that changes its attributes every frame, for example the player or animated background objects, and static data such as the static background or the terrain. I have googled a lot since yesterday, but I could not find a clear and unique answer to the question of what is the best way to render such data.
There are basically three options for rendering such data (if I have not missed one; if so, feel free to correct me):
Vertex Arrays Only:
Just fill your vertex arrays every frame on the CPU (including the dynamic data).
Vertex Buffer Objects Only:
Allocate a VBO on the GPU with GL_DYNAMIC_DRAW in which both the dynamic and the static data are stored. The dynamic data is then updated every frame via glBufferSubData.
Use both:
Static data is stored and rendered with a VBO, and the dynamic data is rendered with a vertex array. With this option, we need two rendering passes: one for rendering the VBO and one for rendering the vertex array.
Since the first option does not exploit the immutability of the static data and since the third option requires two rendering passes, my guess is that I should go with the second option. However, I am absolutely not sure about this and I hope you can clarify my confusion.
Allocate two Vertex Buffer Objects. One with hint GL_DYNAMIC_DRAW that will be updated frequently. Allocate a second VBO for immutable data and use the hint GL_STATIC_DRAW. According to the API documentation, GL_STATIC_DRAW should be used for data that "will be modified once and used many times"; just what you need.
Speaking of two rendering passes here is probably a misuse of the term: what you do is render your scene with two separate drawing commands. Since drawing commands run asynchronously, you should not experience any performance hit by doing so.
A second rendering pass, on the other hand, is when you render the entire scene twice (see for example here) with different settings, or when you do some image processing effects on outputs of previous rendering passes.
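As a sketch of the recommended two-buffer setup (the buffer names, sizes and data pointers here are assumptions for illustration):

// Hedged sketch: one GL_STATIC_DRAW VBO uploaded once, one
// GL_DYNAMIC_DRAW VBO re-uploaded per frame via glBufferSubData.
GLuint vbos[2];
glGenBuffers(2, vbos);

glBindBuffer(GL_ARRAY_BUFFER, vbos[0]); // static geometry, uploaded once
glBufferData(GL_ARRAY_BUFFER, staticSize, staticData, GL_STATIC_DRAW);

glBindBuffer(GL_ARRAY_BUFFER, vbos[1]); // dynamic geometry, allocated once
glBufferData(GL_ARRAY_BUFFER, dynamicSize, NULL, GL_DYNAMIC_DRAW);

// Each frame, update only the dynamic buffer, then issue the two draws:
glBindBuffer(GL_ARRAY_BUFFER, vbos[1]);
glBufferSubData(GL_ARRAY_BUFFER, 0, dynamicSize, dynamicData);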

Does using glBindAttribLocation improve performance?

My understanding is that glBindAttribLocation allows you to custom set a handle to an attribute (before linking a shader program), which you can later use when rendering with glVertexAttribPointer.
But you don't have to use it, and may instead just rely on OpenGL assigning whatever handle it so chooses in its infinite wisdom. However, you would then need to query OpenGL to find out this handle by using glGetAttribLocation at some point before rendering with glVertexAttribPointer.
Now you could use glGetAttribLocation each time you render, which would seem wasteful since you can just use glGetAttribLocation once after building your program, then store the handle.
So essentially, you can store this handle by either using glBindAttribLocation or by using glGetAttribLocation so is there any difference performance-wise and what are the pros and cons of one over the other?
I cannot speak much about the direct performance difference, but it should be irrelevant anyway: whether you use glBindAttribLocation or glGetAttribLocation, you're doing it at initialization time (and even then, calling glGetAttribLocation shouldn't hurt that much).
But the main difference and advantage of an explicit glBindAttribLocation over letting GL decide is that it allows you to establish your own attribute semantics and keep them consistent for each and every shader.
Say you have a whole bunch of objects and a whole bunch of shaders. But each shader has some notion of a position attribute (and normal, color, ...), likewise each object has attribute data for positions, normals, ... Now with glBindAttribLocation you can bind your position attribute to location 0 in each and every different shader. So when drawing your objects with different shaders, they can use a single vertex format (i.e. how you call glVertexAttribPointer for the individual attributes, and the individual enable calls).
On the other hand glGetAttribLocation doesn't give you any guarantees about what attributes get which indices (maybe one shader has some additional attribute and the compiler thinks it's a good way to reorder them, who knows). So in this case you have a different vertex format (glVertexAttribPointer call) for each object and each shader.
This is even more important when using Vertex Array Objects (which encapsulate all the attribute state, especially the glVertexAttribPointer and glEnableVertexAttribArray calls). In this case you usually don't need (and don't want) to call glVertexAttribPointer each time you draw an object with another shader.
So the bottom line is: always use glBindAttribLocation. At best (in a large application) it saves you many object and shader management issues and many unnecessary glVertexAttribPointer calls each frame (and that can likely be a performance gain), and at the least (in a very small application) it is good practice and lets you stay open and flexible for extensions. As a side note, in desktop GL 3+ (or with the ARB_explicit_attrib_location extension) you can even assign attribute locations directly in the shader without the need for any API call.
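As a small illustration of that convention (the attribute names and location numbers are an assumed convention, not mandated by GL, and program, stride and positionOffset are assumed to exist):

// Hedged sketch: pin the same locations in every program before
// linking so all shaders can share one vertex format.
enum { ATTR_POSITION = 0, ATTR_NORMAL = 1, ATTR_COLOR = 2 };

glBindAttribLocation(program, ATTR_POSITION, "a_position");
glBindAttribLocation(program, ATTR_NORMAL,   "a_normal");
glBindAttribLocation(program, ATTR_COLOR,    "a_color");
glLinkProgram(program); // bindings take effect at link time

// Every mesh can now use the same attribute setup:
glEnableVertexAttribArray(ATTR_POSITION);
glVertexAttribPointer(ATTR_POSITION, 3, GL_FLOAT, GL_FALSE, stride, positionOffset);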
