Massive live video streaming using Linux

I am considering different ways to stream a massive number of live videos to the screen in Linux/X11, using multiple independent processes.
I started the project with OpenGL/GLX and OpenGL textures, but that was a dead end. The reason: "context switching". It turns out that drivers (especially nvidia's) perform poorly when several independent (multi-)processes manipulate textures at a fast pace using multiple contexts. This results in crashes, freezes, etc.
(see the following thread: https://lists.freedesktop.org/archives/nouveau/2017-February/027286.html)
I finally turned to Xvideo and it seems to work very nicely. My initial tests show that Xvideo handles video dumping roughly 10 times more efficiently than OpenGL and does not crash. One can demonstrate this by running ~10 VLC clients with 720p@25fps streams and trying both the Xvideo and OpenGL outputs (remember to put them all in fullscreen).
However, I suspect that Xvideo uses OpenGL under the hood, so let's see if I am getting this right ..
Both Xvideo and GLX are extension modules of X11, but:
(A) Dumping video through Xvideo:
XVideo considers the whole screen as a device port and manipulates it directly (it has these god-like powers, being an extension to X11)
.. so it only needs a single context from the graphics driver. Let's call it context 1.
Process 1 requests Xvideo services for a certain window .. Xvideo manages it into a certain portion of the screen, using context 1.
Process 2 requests Xvideo services for a certain window .. Xvideo manages it into a certain portion of the screen, using context 1.
(B) Dumping video "manually" through GLX and openGL texture dumping:
Process 1 requests a context from glx, gets context 1 and starts dumping textures with it.
Process 2 requests a context from glx, gets context 2 and starts dumping textures with it.
Am I getting this right?
Is there any way to achieve situation (A) using OpenGL directly?
.. one might have to drop GLX completely, which starts to be a bit hard-core.
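To make scenario (A) concrete, here is a minimal sketch of what each client process does with XVideo (event handling, format negotiation and error checks are omitted, and the FourCC and adaptor choice are assumptions to verify; link with -lXv -lX11):

#include <X11/Xlib.h>
#include <X11/extensions/Xvlib.h>
#include <stdlib.h>

int main(void) {
    Display *dpy = XOpenDisplay(NULL);
    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 1280, 720, 0, 0, 0);
    XMapWindow(dpy, win);
    GC gc = XCreateGC(dpy, win, 0, NULL);

    // Pick the first adaptor; real code should check (type & XvImageMask)
    // and the supported image formats instead of assuming anything.
    unsigned int nadaptors = 0;
    XvAdaptorInfo *adaptors = NULL;
    XvQueryAdaptors(dpy, DefaultRootWindow(dpy), &nadaptors, &adaptors);
    XvPortID port = adaptors[0].base_id;
    XvGrabPort(dpy, port, CurrentTime);

    // 0x32315659 is the 'YV12' FourCC; availability depends on the driver.
    char *frame = (char *)malloc(1280 * 720 * 3 / 2);
    XvImage *img = XvCreateImage(dpy, port, 0x32315659, frame, 1280, 720);

    // Per decoded frame: fill 'frame' with YUV data, then hand it to the server,
    // which scales and colour-converts it into the window.
    XvPutImage(dpy, port, win, gc, img,
               0, 0, 1280, 720,    // source rectangle
               0, 0, 1280, 720);   // destination rectangle
    XFlush(dpy);

    XvUngrabPort(dpy, port, CurrentTime);
    XvFreeAdaptorInfo(adaptors);
    XCloseDisplay(dpy);
    return 0;
}

The point, as described above, is that each client only hands frames to the X server; the server does the final compositing with whatever single driver context it holds.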

It's been a while but I finally got it sorted out, using OpenGL textures and multithreading. This seems to be the optimal way:
https://elsampsa.github.io/valkka-core/html/process_chart.html
(disclaimer: I did that)
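For what it's worth, the gist of that design, one thread owning the single OpenGL context while decoder threads only queue frames for it, can be sketched roughly like this (the queue bound, frame type and loop counts are made up, and the actual GL calls are only indicated in comments):

#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

struct Frame { std::vector<unsigned char> pixels; };   // decoded video frame (placeholder)

class FrameQueue {            // bounded queue shared by decoders and the GL thread
public:
    void push(Frame f) {
        std::unique_lock<std::mutex> lk(m_);
        if (q_.size() >= 16) q_.pop_front();            // drop the oldest frame instead of blocking
        q_.push_back(std::move(f));
        cv_.notify_one();
    }
    Frame pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Frame f = std::move(q_.front());
        q_.pop_front();
        return f;
    }
private:
    std::deque<Frame> q_;
    std::mutex m_;
    std::condition_variable cv_;
};

int main() {
    FrameQueue queue;

    // Decoder thread(s): no GL context, they only produce frames.
    std::thread decoder([&] {
        for (int i = 0; i < 100; ++i)
            queue.push(Frame{std::vector<unsigned char>(1280 * 720 * 3 / 2)});
    });

    // GL thread: the only thread that owns the context and touches textures.
    std::thread gl_thread([&] {
        // create the window and context here (glXMakeCurrent once, in this thread)
        for (int i = 0; i < 100; ++i) {
            Frame f = queue.pop();
            // glTexSubImage2D(...) to upload f.pixels, then draw and swap
            (void)f;
        }
    });

    decoder.join();
    gl_thread.join();
    return 0;
}

Keeping all GL calls on one thread sidesteps the multi-context problem described in the question while still letting many streams feed frames concurrently.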

Related

How to intercept Skia draw commands from Chromium Browser

This is an experimental project of mine related to remote browser isolation. I'm trying to intercept Skia draw commands in a running Chromium instance and later replay them in a different browser instance at client-side via CanvasKit, a WebAssembly build of Skia.
However, I'm having a hard time figuring out where and how to intercept those draw commands within Chromium source code. Any advice on how to approach my problem is much appreciated!
In Chromium, all draw operations are recorded into a DisplayItemList, which you can find in the definition of the GraphicsContext class in the Blink module. These recorded operations are then replayed later, when CC decides it is the right time.
On Blink's side, most of the related code is scattered across blink/renderer/platform/graphics/graphics_context.cc and its neighbouring files. Looking at Chromium as a whole, though, all graphics work is driven by CC (the Chrome Compositor), which maintains a state machine and runs a draw-frame loop triggered by the system's vsync signal on Android. At the start of this loop, Blink's recorded drawing operations are pushed in; at the end of it, the composited frame is translated into a series of GPU operations that call the system's GPU APIs. The CC-related code can be found in components/viz/; the Display class is a good starting point for reading.
My notes are based on version 68, and Chromium's code changes frequently, so I cannot confirm that the files and locations are still correct.

Get screenshot of EGL DRM/KMS application

How to get screenshot of graphical application programmatically? Application draw its window using EGL API via DRM/KMS.
I use Ubuntu Server 16.04.3, and the graphical application is written with Qt 5.9.2 using the EGLFS QPA backend. It is started from the first virtual terminal (if that matters), and then it switches the display to a full HD graphical mode.
When I use utilities (e.g. fb2png) which operate on /dev/fb?, only the text-mode contents of the first virtual terminal (Ctrl+Alt+F1) are saved as the screenshot.
There is hardly an EGL API to get the contents of a buffer from another process's context (that would be insecure), but maybe there is some mechanism (and library) to get access to the final output of the GPU?
One way would be to take the screenshot from within your application, reading the contents of the back buffer with glReadPixels(). Or use QQuickWindow::grabWindow(), which internally uses glReadPixels() in the correct way. This does not seem to be an option for you, though, as you need to take the screenshot while the Qt app is frozen.
The other way would be to use the DRM API to map the framebuffer and then memcpy the mapped pixels. This is implemented in Chromium OS with Python and can be translated to C easily, see https://chromium-review.googlesource.com/c/chromiumos/platform/factory/+/367611. The DRM API can also be used by another process than the Qt UI process that does the rendering.
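As a rough C-style sketch of that approach (my reading of the linked Chromium OS code, not a drop-in tool): it assumes the driver exposes the scanned-out buffer through the dumb-buffer mmap ioctl, usually needs root or DRM master, omits error handling, and builds against libdrm (pkg-config libdrm):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void) {
    int fd = open("/dev/dri/card0", O_RDWR);

    // Find an active CRTC and the framebuffer currently scanned out on it.
    drmModeRes *res = drmModeGetResources(fd);
    for (int i = 0; i < res->count_crtcs; ++i) {
        drmModeCrtc *crtc = drmModeGetCrtc(fd, res->crtcs[i]);
        if (!crtc || !crtc->buffer_id) { drmModeFreeCrtc(crtc); continue; }

        drmModeFB *fb = drmModeGetFB(fd, crtc->buffer_id);

        // Ask the driver for an mmap offset for the buffer handle (dumb-buffer path).
        struct drm_mode_map_dumb mreq;
        memset(&mreq, 0, sizeof(mreq));
        mreq.handle = fb->handle;
        ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &mreq);

        size_t size = (size_t)fb->pitch * fb->height;
        void *pixels = mmap(0, size, PROT_READ, MAP_SHARED, fd, mreq.offset);

        printf("%ux%u, pitch %u, bpp %u\n", fb->width, fb->height, fb->pitch, fb->bpp);
        // memcpy() rows of 'pixels' into your own buffer or image file here.

        munmap(pixels, size);
        drmModeFreeFB(fb);
        drmModeFreeCrtc(crtc);
        break;
    }
    drmModeFreeResources(res);
    close(fd);
    return 0;
}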
This is a very interesting question, and I have fought this problem from several angles.
The problem is quite complex and platform dependent. You seem to be running on EGL, which means embedded, and there you have few options unless your platform offers them.
The options you have are:
glGetTexImage
glGetTexImage can copy several kinds of buffers from OpenGL textures to CPU memory. Unfortunately it is not supported in GLES 2/3, but your embedded provider might support it via an extension. This is nice because you can either render to an FBO or get the pixels from the specific texture you need. It also requires minimal code changes.
glReadPixels
glReadPixels is the most common way to download all or part of the pixels the GPU has already rendered. Albeit slow, it works on both GLES and desktop. On desktop with a decent GPU it is bearable up to interactive framerates, but beware: on embedded it might be really slow, as it stalls your render thread to get the data (horrible framedrops ensured). It can be made to work with minimal code modifications.
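A minimal sketch of the plain glReadPixels path, assuming a GL or GLES 2 context is current on the calling thread (on GLES, include GLES2/gl2.h instead):

#include <GL/gl.h>      // on GLES 2, use <GLES2/gl2.h> instead
#include <vector>

// Read back the currently bound framebuffer; call right after rendering,
// on the thread that owns the context. Rows come back bottom-up.
std::vector<unsigned char> grab_pixels(int width, int height) {
    std::vector<unsigned char> pixels((size_t)width * height * 4);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);   // avoid row-padding surprises
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());
    return pixels;
}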
Pixel Buffer Objects (PBOs)
Once you start doing real research, PBOs appear here and there because they can be made to work asynchronously. They are generally not supported on embedded, but they can work really well on desktop, even on mediocre GPUs. They are also a bit tricky to set up and require specific render modifications.
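For reference, a small sketch of the asynchronous PBO readback pattern on desktop GL (GLEW is assumed for the buffer-object entry points; the double-buffered PBO rotation you would normally add is left out for brevity):

#include <GL/glew.h>
#include <cstring>

GLuint pbo;

void init_pbo(int width, int height) {
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4, NULL, GL_STREAM_READ);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

void start_readback(int width, int height) {
    // With a PBO bound to GL_PIXEL_PACK_BUFFER, glReadPixels returns immediately;
    // the copy into the buffer happens asynchronously on the GPU's timeline.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

void fetch_result(void *dst, int width, int height) {
    // Call this a frame or two later so the transfer has had time to complete.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    void *src = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
    if (src) {
        memcpy(dst, src, (size_t)width * height * 4);
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}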
Framebuffer
On embedded you sometimes already render to the framebuffer, so you can go there and fetch the pixels. This also works on desktop. You can even mmap() the buffer to a file and get partial contents easily. But beware: on many embedded systems EGL does not render to the framebuffer but to a different 'overlay', so you might be snapshotting only the background behind it. Note also that some multimedia applications run their UI on EGL and the media player on the framebuffer; if you only need to capture the video player, this might work for you. In other cases EGL targets a texture which is then copied to the framebuffer, and that will also work just fine.
As far as I know, render-to-texture plus streaming to a framebuffer is how they made the sweet Qt UI you see on the Ableton Push 2.
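For the plain framebuffer case, a hedged sketch of reading /dev/fb0 directly (assuming the scanout really is on /dev/fb0 and not on an overlay, as cautioned above):

#include <fcntl.h>
#include <linux/fb.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/dev/fb0", O_RDONLY);

    struct fb_var_screeninfo var;   // resolution, bits per pixel
    struct fb_fix_screeninfo fix;   // line length, buffer size
    ioctl(fd, FBIOGET_VSCREENINFO, &var);
    ioctl(fd, FBIOGET_FSCREENINFO, &fix);

    unsigned char *fb = (unsigned char *)mmap(0, fix.smem_len, PROT_READ, MAP_SHARED, fd, 0);

    printf("%ux%u @ %u bpp, stride %u bytes\n",
           var.xres, var.yres, var.bits_per_pixel, fix.line_length);
    // Each visible row starts at fb + y * fix.line_length; copy rows out from here.

    munmap(fb, fix.smem_len);
    close(fd);
    return 0;
}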
More exotic: DispmanX/OpenWF
On some embedded systems (notably the Raspberry Pi and most Broadcom VideoCore chips) you have DispmanX, which is really interesting:
This is fun:
The lowest level of accessing the GPU seems to be by an API called Dispmanx[...]
It continues...
Just to give you total lack of encouragement from using Dispmanx there are hardly any examples and no serious documentation.
Basically DispmanX sits very near to bare metal, so it is even deeper down than the framebuffer or EGL. Really interesting stuff, because you can use vc_dispmanx_snapshot() and really get a snapshot of everything, really fast. And by fast I mean I got 30 FPS RGBA32 screen capture with no noticeable stutter on screen and about 4-6% of extra CPU overhead on a Raspberry Pi. Night and day, because glReadPixels was producing very noticeable framedrops even for a 1x1 pixel capture.
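Since examples are scarce, here is a rough sketch of how vc_dispmanx_snapshot() is typically driven on the Pi, based on the Broadcom headers; treat the image type, pitch handling and missing error checks as assumptions to verify:

#include <bcm_host.h>
#include <stdint.h>
#include <stdlib.h>

int main(void) {
    bcm_host_init();

    DISPMANX_DISPLAY_HANDLE_T display = vc_dispmanx_display_open(0);   // 0 = main display

    DISPMANX_MODEINFO_T info;
    vc_dispmanx_display_get_info(display, &info);

    // Destination resource the GPU will blit the whole composited screen into.
    uint32_t image_handle;
    DISPMANX_RESOURCE_HANDLE_T resource =
        vc_dispmanx_resource_create(VC_IMAGE_RGBA32, info.width, info.height, &image_handle);

    // One call captures everything currently composited on the display.
    vc_dispmanx_snapshot(display, resource, DISPMANX_NO_ROTATE);

    // Read the pixels back into CPU memory (some firmware wants the pitch 32-byte aligned).
    int pitch = info.width * 4;
    void *pixels = malloc((size_t)pitch * info.height);
    VC_RECT_T rect;
    vc_dispmanx_rect_set(&rect, 0, 0, info.width, info.height);
    vc_dispmanx_resource_read_data(resource, &rect, pixels, pitch);

    // ... write 'pixels' to a file or hand it to an encoder ...

    free(pixels);
    vc_dispmanx_resource_delete(resource);
    vc_dispmanx_display_close(display);
    return 0;
}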
That's pretty much what I've found.

OpenGL: render time limit on linux

I'm implementing a computation algorithm via OpenGL and Qt. All computations are executed in a fragment shader.
Sometimes, when I try to execute some heavy computations (ones that take more than 5 seconds on the GPU), OpenGL breaks off the computation before it ends. I suppose this is something like TDR on Windows.
I think I should split the input data into several parts, but I need to know how long a computation is allowed to run.
How can I obtain the render time limit on Linux (it would be cool if there is a cross-platform solution)?
I'm afraid this is not possible. After a lot of scouring through the documentation of both X and Wayland, I could not find anything mentioning GPU watchdog timer settings, so I believe this is driver-specific and likely inaccessible to the user (that or I am terrible at searching).
It is however possible to disable this watchdog under X on NVIDIA hardware by adding a line to your xorg.conf, which is then passed on to the graphics driver.
Option "Interactive" "boolean"
This option controls the behavior of the driver's watchdog, which attempts to detect and terminate GPU programs that get stuck, in order to ensure that the GPU remains available for other processes. GPU compute applications, however, often have long-running GPU programs, and killing them would be undesirable. If you are using GPU compute applications and they are getting prematurely terminated, try turning this option off.
Note that even the NVIDIA docs don't mention a numeric quantity for the timeout.
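For reference, the option goes into the Device section of xorg.conf; a sketch (the Identifier is just an illustrative name, and whether disabling the watchdog is wise depends on your workload):

Section "Device"
    Identifier "NvidiaCard"          # illustrative name
    Driver     "nvidia"
    Option     "Interactive" "False"
EndSection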

OpenCV FPS Optimisation

How can I increase OpenCV video FPS in Linux on an Intel Atom? The video seems to lag when processing with the OpenCV libraries.
Furthermore, I'm trying to execute a program/file with OpenCV:
system("/home/file/image.jpg");
However, it shows Access Denied.
There are several things you can do to improve performance: using OpenGL, GPUs, and even just disabling certain functions within OpenCV. When you capture video you can also change the default FPS, which is sometimes set low. If you are getting Access Denied on that file I would check its permissions, but without seeing the full error it is hard to say more.
The first line below disables colour conversion and the second sets the desired FPS. I think these defines were renamed in OpenCV 3, though.
cap.set(CV_CAP_PROP_CONVERT_RGB , false);
cap.set(CV_CAP_PROP_FPS , 60);
From your question, it seems the problem is that your frame buffer collects a lot of frames which you are not able to clear out before reaching the real-time frame, i.e. a frame captured now is processed several seconds later. Am I understanding correctly?
In that case, I'd suggest a couple of things:
Use a separate thread to grab the frames from VideoCapture and push them into a queue of limited size; a sketch follows after these suggestions. Of course this will lead to dropped frames, but if you are interested in real-time processing then this cost is often justified.
If you are using OOP, then I suggest using a separate thread for each object, as this significantly speeds up the processing. You can see a several-fold increase depending on the application and the functions used.
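A hedged sketch of the first suggestion, with a grabber thread and a bounded queue (the camera index, queue size and drop policy are arbitrary choices):

#include <opencv2/opencv.hpp>
#include <atomic>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

int main() {
    cv::VideoCapture cap(0);                 // camera/stream index is arbitrary here
    std::deque<cv::Mat> queue;               // bounded frame queue
    std::mutex m;
    std::condition_variable cv_frames;
    std::atomic<bool> running(true);

    // Grabber thread: pulls frames as fast as the source delivers them.
    std::thread grabber([&] {
        cv::Mat frame;
        while (running && cap.read(frame)) {
            std::lock_guard<std::mutex> lk(m);
            if (queue.size() >= 4) queue.pop_front();   // drop the oldest, stay real-time
            queue.push_back(frame.clone());
            cv_frames.notify_one();
        }
        running = false;
        cv_frames.notify_all();
    });

    // Processing loop: always works on the most recent available frame.
    while (running) {
        cv::Mat frame;
        {
            std::unique_lock<std::mutex> lk(m);
            cv_frames.wait(lk, [&] { return !queue.empty() || !running; });
            if (queue.empty()) break;
            frame = queue.front();
            queue.pop_front();
        }
        // ... heavy per-frame processing goes here ...
        cv::imshow("frame", frame);
        if (cv::waitKey(1) == 27) running = false;   // Esc to quit
    }

    running = false;
    cv_frames.notify_all();
    grabber.join();
    return 0;
}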

Tips to reduce opengl 3 & 4 frame-rate stuttering under Linux

In recent years I've developed several small games and applications for OpenGL 2 and ES.
I'm now trying to build a scene graph based on OpenGL 3+ for casual 3D graphics on desktop systems. (Nothing as complex as the Unreal or Crytek engines in mind.)
I started my development on OS X 10.7 and was impressed by Apple's recent OpenGL 3.2 release, which achieves results equivalent to Windows systems.
The results on Linux, however, are a disappointment. Even the most basic animation stutters and destroys the impression of reality. The results did not differ between the windowing toolkits freeglut and GLFW. (The extensions are loaded with GLEW 1.7.)
I would like to mention that I'm talking about the new OpenGL core profile, not the old OpenGL 2 render path, which works fine under Linux but uses the CPU instead of the GPU for complex operations.
After watching professional demos like the Unigine Heaven demo, I think there is a general problem with using modern real-time 3D graphics on Linux.
Any suggestions to overcome this problem are very welcome.
UPDATE:
I'm using:
AMD Phenom II X6, Radeon HD 57XX with the latest proprietary drivers (11.8) and Unity (64-bit).
My render loop is taken from the toolkit documentation:
do {
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
...
} while (!glfwGetKey(GLFW_KEY_ESC) && glfwGetWindowParam(GLFW_OPENED));
I'm using VBOs, and all transformation stages are done with shaders. Animation timing is done with glfwGetTime(). The problem occurs in both windowed and full-screen mode. I don't know whether a compositing manager interferes with full-screen applications, but it is also impossible to demand that the user disable it.
Update 2:
Typo: I'm using a HD57XX card.
GLX_Extension: http://snipt.org/xnKg
PCI Dump: 01:00.0 VGA compatible controller: ATI Technologies Inc Juniper [Radeon HD 5700 Series]
X Info: http://pastie.org/2507935
Update 3:
Disabling the compositing manager reduced the stuttering, but did not completely remove it.
(I replaced the standard window manager with "ubuntu classic without extensions")
Once a second the animation freezes and ugly distortions appear:
(Image removed - Not allowed to post Images.)
This happens even though vertical synchronisation is enabled in the driver and checked in the application.
Since you're running Linux, we need a bit of detailed information:
Which hardware do you use?
Only NVidia, AMD/ATI and Intel offer 3D acceleration so far.
Which drivers?
For NVidia and AMD/ATI there are proprietary drivers (nvidia-glx, fglrx) and open source drivers (nouveau, radeon). For Intel there are only the open source drivers.
Of all the open source 3D drivers, the Intel drivers offer the best quality.
The open source AMD/ATI drivers, "radeon", have reached an acceptable state, but are still not on par performance-wise.
For NVidia GPUs, the only drivers that make sense to use productively are the proprietary ones. The open source "nouveau" drivers simply don't cut it yet.
Do you run a compositing window manager?
Compositing creates a whole bunch of synchronization and timing issues. Also, (some of) the OpenGL code you can find in the compositing WMs in places brings tears to the eyes of a seasoned OpenGL coder, especially one with experience writing realtime 3D (game) engines.
KDE4 and GNOME3 use compositing by default, if available. The same holds for the Ubuntu Unity desktop shell. Even some non-compositing WMs' default scripts start xcompmgr for transparency and shadow effects.
And last but not least: How did you implement your rendering loop?
A mistake often found is that a timer is used to issue redisplay events at "regular" intervals. This is not how it's done properly: timer events can be delayed arbitrarily, and the standard timers are not very accurate by themselves either.
The proper way is to call the display function in a tight loop, measure the time between rendering iterations, and use that timing to advance the animation accordingly. A truly elegant method is to use one of the VSync extensions that reports the display refresh frequency and a refresh counter; that way, instead of using a timer, you are told exactly how much time advanced between frames, in display refresh cycle periods.
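A minimal sketch of such a loop, kept in the GLFW 2 style the question already uses (advance_animation() and render_scene() are placeholders for your own code):

#include <GL/glfw.h>   // GLFW 2.x header, matching the loop in the question

static double angle = 0.0;

static void advance_animation(double dt) { angle += 90.0 * dt; }   // e.g. 90 degrees per second
static void render_scene(void) { /* issue the actual draw calls using 'angle' here */ }

int main(void) {
    glfwInit();
    glfwOpenWindow(800, 600, 8, 8, 8, 8, 24, 0, GLFW_WINDOW);
    glfwSwapInterval(1);                  // vsync on: the swap paces the loop

    double previous = glfwGetTime();
    do {
        double now = glfwGetTime();
        double dt = now - previous;       // wall-clock time the last iteration took
        previous = now;

        advance_animation(dt);            // advance by measured time, not by a fixed step
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        render_scene();
        glfwSwapBuffers();
    } while (!glfwGetKey(GLFW_KEY_ESC) && glfwGetWindowParam(GLFW_OPENED));

    glfwTerminate();
    return 0;
}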
