Dynamically loading/removing buffers with Vulkan - multithreading

I switched to Vulkan from OpenGL to use multi-threading improvements.
In OpenGL, I was able to dynamically load objects into the scene (buffers, textures, etc.) while rendering by using a waiting system. I loaded all the app-side data in a thread; then, once it was ready, I sent everything to video memory just before rendering a frame in the main thread. That worked fine.
With Vulkan, I know I can call some functions from multiple threads without provoking the well-known segfault from OpenGL. But this doesn't work with vkQueueSubmit(). I already know because I tried the naive way. To me, it seems logical that you can't use a queue from multiple threads at once.
I came up with some ideas, but I don't know which ones are good or bad.
First, I could go the OpenGL way: prepare everything I can on the CPU/app side, then, just before rendering a frame, submit the buffers (with a transfer queue) to video memory. But I feel there is no real improvement over the OpenGL way...
Second, I could use the synchronization mechanisms to send buffers from one thread and render from another. But I keep reading that there are a lot of ways to slow everything down by causing unnecessary locks or by using semaphores and fences incorrectly.
So my question is basically: which path should I pick to solve this problem? How can I load a buffer dynamically from another thread while the main thread is rendering, without hurting performance too much? How can Vulkan help?

If you want to stream resources for immediate use (i.e. the main render cannot proceed without them), then you're pretty much going to either block the main thread waiting, or have it spin doing something visually interesting (e.g. an animated loading screen) waiting for the resources to load.
If you want to stream resources while the app is doing real rendering then the main trick here is to load resources asynchronously in the background and only switch to using those resources in the main thread once they are already loaded. If the main thread ever ends up actually blocked on a semaphore then you've probably already started dropping frames, so your "engine" design needs to ensure that never happens. A lot of games use simple low-detail proxy objects as stand-in versions while the high-detail version is loading in the background.
None of this is particularly related to the graphics API - both GL and Vulkan need the same macro-scale behavior. Vulkan API features don't particularly help because the major bottlenecks which cause problems here are storage/network/CPU which have nothing to do with the graphics part of the problem.
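To make the "load in the background, switch only once loaded" idea concrete, here is a minimal API-independent sketch in plain C++. The names `Mesh`, `StreamedMesh`, `loadAsync`, and `current` are invented for illustration and are not part of Vulkan or OpenGL; a real engine would hold buffer handles instead of a string. The render thread only ever reads an atomic flag, so it never blocks.

```cpp
#include <atomic>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a GPU-backed mesh (illustration only).
struct Mesh {
    std::string name;
};

// Holds a low-detail proxy that is always available, plus a high-detail
// version that a background thread publishes once it is fully loaded.
class StreamedMesh {
public:
    explicit StreamedMesh(Mesh proxy) : proxy_(std::move(proxy)) {}

    ~StreamedMesh() {
        if (loader_.joinable()) loader_.join();
    }

    // Starts the (slow) load on a background thread. Call at most once.
    template <typename LoadFn>
    void loadAsync(LoadFn loadFn) {
        loader_ = std::thread([this, loadFn]() mutable {
            auto loaded = std::make_unique<Mesh>(loadFn());
            full_ = std::move(loaded);                      // written before the flag
            ready_.store(true, std::memory_order_release);  // publish to the render thread
        });
    }

    // Called by the render thread every frame; never blocks.
    const Mesh& current() const {
        if (ready_.load(std::memory_order_acquire)) return *full_;
        return proxy_;
    }

    bool isReady() const { return ready_.load(std::memory_order_acquire); }

private:
    Mesh proxy_;
    std::unique_ptr<Mesh> full_;
    std::atomic<bool> ready_{false};
    std::thread loader_;
};
```

Each frame, the renderer would draw whatever `current()` returns: the proxy at first, then the full version as soon as the loader has published it.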

I decided to trust threads!
At first it seems to work, although I get a lot of:
[MESSAGE:Validation Error: [ UNASSIGNED-Threading-MultipleThreads ] Object 0: handle = 0x56414228bad8, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x141cb623 | THREADING ERROR : vkQueueSubmit(): object of type VkQueue is simultaneously used in thread 0x7f6b977fe640 and thread 0x7f6bc2bcb740]
But it works!
So, the basic idea is to have a thread that loads objects while the engine is drawing. This thread takes care of creating the UBO for the object's location; then, when the geometry is loaded from RAM, it creates the VBO and IBO (I left materials with image/UBO on hold for now), then creates the graphics pipeline (with layout, descriptor layout, and shaders compiled on the fly with GLSLang; the next idea is to reuse pipelines for similar needs), and finally sets a flag to say the object is ready to use. On the other side, my main thread renders and picks up new objects when they show up as ready.
I think it works because I have a forgiving video card (GTX 1070) with a multiple-queue setup: I had one queue for graphics and another one set up for transfer.
I'm pretty sure this will crash or behave poorly on a GPU with a single queue, and this is probably why the validation layers give me these messages.

Related

How can I share an OpenGL context/texture between 2 processes (linux)

I am trying to build 2 applications running in separate processes. One application will display live video from a camera (server) and the other will overlay UI (client) on top of that video. The solution requires low latency and therefore I would like to render both without going through the OS compositor.
The solution I am trying to implement involves creating a shared OpenGL context or texture so that the UI can render its part to some off screen buffer/texture.
After every live image frame is rendered the server can take the information from the off-screen buffer/texture and render it on top.
This way there is no latency added due to synchronization of the processes. The server will take the latest image from the UI if one is ready. If it is not ready, it shouldn't wait for it, and should use a previous image instead.
How can I pass a texture or context between processes?
The CreateContext function can take a pointer of another context and make it shared but the address as far as I understand will not be valid outside the process space.
These days the "cleanest" way to share GPU resources between processes is to create those resources using Vulkan, export them into file descriptors (POSIX) or HANDLEs (Win32) and import those into OpenGL contexts created at either side. The file descriptors you can pass by the usual methods (sendmsg with SCM_RIGHTS, or pidfd_getfd, or open("/proc/${PID}/fd/${FD}")).
Exporting from Vulkan:
https://www.khronos.org/registry/vulkan/specs/1.2-khr-extensions/html/chap46.html#VK_KHR_external_memory_fd (ff.)
Importing into OpenGL:
https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects.txt
https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects_fd.txt
https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_external_objects_win32.txt
Doing this with just "pure" OpenGL requires a lot of hacks. One way would be to force indirect contexts (at the cost of modern capabilities, due to the lack of GLX support for those) and share the X11 IDs for them. Another method is to use ptrace to access mapped buffers in the other process. Either is quite daunting and a lot more work to implement properly (BT;DT) than setting up a Vulkan instance, creating all the textures there, and then importing them into OpenGL.

OpenTK MultiThreading: How to "unbind" a GraphicsContext

I am working on a multi threaded OpenGL application with OpenTK 3 and WinForms.
I have 2 shared GraphicsContexts:
a "main" rendering context, used for scene drawing and synchronous load operations.
a "secondary" resource loader context, used to load resources during draw.
This secondary context is used to load video frames coming from a Windows Media Foundation session (with a custom media sink). However, I have no control over which thread this media sink is running on, so I need a way, after each loading operation, to "unbind" that secondary GraphicsContext, so that it can be bound in the next thread where it will be needed.
Do I have to P/Invoke wglMakeCurrent(NULL, NULL) or is there a proper OpenTK way of doing this?
Short answer
Use OpenTK feature:
mycontext.MakeCurrent(null);
Long answer
Today's wglMakeCurrent doc has eliminated this old comment:
If hglrc is NULL, the function makes the calling thread's current
rendering context no longer current, and releases the device context
that is used by the rendering context. In this case, hdc is ignored.
I would trust that the comment is still valid, since so much code relies on it.
Pay attention to "releases the device context". Perhaps OpenTK does some action related to the device context. Perhaps the hdc is private (by using the window style flag CS_OWNDC). So, let OpenTK handle this "NULL" case.
Better approach
Be aware that even when you use several shared contexts, it is the GPU (normally a single card) that does the loading, and not many cards allow loading while doing other jobs. Thus, it isn't guaranteed that you'll get better performance. But shared contexts do exist for this purpose.
Why should you use the same context in different threads?
I'd use one thread for loading video frames (without any GL calls) and another for uploading them to the GPU. This last thread is permanent and has its own GL context, so it doesn't need to make it current every time it works. It sleeps or waits until the other thread has finished loading data, and once that task is completed it uploads the data to the GPU.

Creating Three.js meshes in a WebWorker

I'm trying to offload as many Threejs computations as possible to a Web Worker. It seems to be relatively doable when just wanting the worker to create geometries. However, I still need to create a significant amount of meshes, which implies a hefty cycle on the main thread.
Is it possible to offload mesh creation to a web worker and just have the main thread add it to the scene (when ready)?
The idea would be to have the worker create an array of meshes, based on some data, and have it send it over to the main thread.
Many thanks
I am currently looking to tackle this problem in one of my projects. If you haven't started on yours yet, I would suggest having a look at https://github.com/kripken/webgl-worker first. There are two examples (one simple, one a bit more complex) that could help you get started.
I will update this answer later with more details about how to integrate webgl-worker with three.js, which might require more setup than a simple webgl/worker implementation.
Unfortunately, THREEJS 3D objects (classes) are too "heavy" to be used in workers (objects can't pass through the "worker thread"-"main thread" boundary, even after I patched the threejs lib to be used inside a worker).
But I successfully use workers to load pretty large objects asynchronously.
I use Catiline.js for convenience.
The idea is to use the THREEJS native object format (and buffer geometry) and simply parse it into a JS object inside the worker. After that, you can use THREE.ObjectLoader in the main thread to get a real scene object. The benefit of this approach is that it moves parsing (which can take quite a long time for a large object) to the background and minimizes freezing.
I use 6 workers, choose a worker randomly, pass the data URL to it, and additionally benefit from XMLHttpRequest caching.
Threejs objects can't be passed through a postMessage.
Instead we want to set up a connection back to the main page via web-sockets. This should let us freely pass whatever is needed.
This thread might be helpful to you... I recently had to do some SSR with Three.js and the concepts are similar, except you are parsing Buffer Geometries with ObjectLoader in the worker.
https://discourse.threejs.org/t/error-with-ssr-three-js-objects/8643

Is this a decent structure for a multithreaded videocoacher program?

Hi, I'm currently working on a videocoacher program for recording and replaying video, as well as showing delayed real-time video and tracking placement via color.
The software runs on Linux, on a 4-core ODROID. Initially I started to make it multithreaded, with threads implemented as part of each new class, each of these threads taking care of its own GUI elements.
I later found out that I need to show all GUI elements/video in the main/GUI thread. Earlier I used OpenCV and Boost, but it seems that using Qt might be a better idea since some of the code already depends on the Qt library. I am currently a novice at programming, and not very familiar with OpenCV, Qt, or threading.
My question is:
Is this relatively sound as a structure for the program, or is there something inherently wrong with how I am planning to do it now?
Main/GUI Thread
    will show all visual & video content
    will start a thread for ButtonControl object
ButtonControl
    will handle all button input, controlling what happens in the program
    depending on what buttons are pressed will start and end threads like:
        StoreToFile object (starts storing video to a file, while sending a video stream to GUI thread to show what it is storing in real-time)
        ReadFromFile object (reads the file currently stored and sends data to display it in GUI thread)
        DelayedVideoStream object (stores video to buffer, and shows a continuous delayed view of what happened 5 seconds in the past)
        ColorTracking object (tracks where a color placement is in the image)
Kind regards, and thank you for taking the time to look at my question.
TLDR - is a structure where threads are implemented as classes and the image data is sent back to the gui/main thread a decent way to do a multithreaded program ?
Performance-wise, the best approach is not to deal with threads directly at all, but use QtConcurrent::run. It is safe to paint QImages that are simply passed via signals to a GUI object to display. I wrote a complete example demonstrating that approach. It leads to some very concise and easy-to-understand code thanks to related code being adjacent.
If you do want to use explicit threads, it will be much easier not to derive from QThread, but to simply move various worker objects into their threads, and have them communicate via signals and slots. I have a complete example for that approach as well.

No OpenGL context found in the current thread

I am using LibGDX to make a game. I want to simultaneously load/unload assets on the fly as needed. However, waiting for assets to load in the main thread causes lag. In order to remedy this, I've created a background thread that monitors which assets need to be loaded (textures, sounds, etc.) and loads/unloads them appropriately.
Unfortunately, I get the following error when calling AssetManager.update() from that thread.
com.badlogic.gdx.utils.GdxRuntimeException: java.lang.RuntimeException: No OpenGL context found in the current thread.
I've tried running the background thread's work in the main thread at the beginning and just dealing with the first few screens, and everything works fine. I can also change the algorithm to just load everything into memory from the start in the same thread, and that works as well. However, neither works in the background thread.
When I run this on Android with OpenGL ES 2.0 (which is flexible in odd ways) instead of on Windows, everything runs fine, and I can even get the pixel dimensions of the images - but the textures render black.
My searches have told me that this is an issue of the OpenGL context being bound to a single thread, but not much else. This explains why everything works when I shove it in the main thread, and not when I put it in a different one. How do I fix this context problem?
First things first, you should not access the OpenGL context outside of the rendering thread.
I assume you have looked at these already, but just to make sure, read up on the AssetManager wiki article, which talks a bit about how to use the AssetManager for asynchronous management of assets. In addition to the wiki article, check out the AssetManagerTest to better understand how to use it. The asset manager test is probably your best bet for seeing how to dynamically load assets.
If you are loading a ton of stuff, you may want to look into creating a loading bar to load anything large upfront. It might work to check assets and such from another thread (and set a flag to call update), but at the end of the day you will need to call update() on the rendering thread.
Keeping in mind that you have to call update() on the rendering thread, I don't see why you would want another thread to check conditions and set a flag. There is probably more overhead in using another thread and synchronizing the update() call than in just doing it all on the rendering thread. Also, the update() method only pauses for a couple of milliseconds at a time as it incrementally loads files. Typically, you would simply call load() for your asset, then check isLoaded() on your asset. If it isn't loaded, you would then call update() once per frame until isLoaded() returns true. Once it returns true, you can call get() and get whatever asset you were loading. This can all be done on the main rendering thread without the app lagging while it's loading.
If you really want your other thread to call update(), you need to create a Runnable object and call postRunnable() such as how they have it described in the wiki article on multi-threading with libGDX. However, this defeats the whole point of using other threads because anything you use with postRunnable runs synchronously on the rendering thread.

Resources