Capture OpenGL window in X11 with fast framerate - possible?

I have an OpenGL application with a window size of 800x600 running on my Linux machine (X11). The content of this application (the rendered image) should be exported over the network to another PC.
First of all, I want to know if it is possible to take snapshots of the application's window at about 30 Hz, save them as JPEG and export them to the other machine via HTTP or whatever (like IP cameras do). Is it possible to read the graphics card's memory (Radeon HD 5800) fast enough to get a framerate of about 30 pictures per second?

If you're willing to tolerate some latency, Pixel Buffer Objects (PBOs) should get you decent read-back throughput.
libjpeg-turbo looks like a good solution for high-speed JPEG encoding.
If you don't have the source to the app you're trying to monitor, then an LD_PRELOAD shim that intercepts the app's buffer swap call (glXSwapBuffers on X11), combined with the above, should work.
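As a rough sanity check on whether 30 Hz is realistic, the read-back bandwidth can be worked out by hand. The sketch below is only arithmetic, not capture code. The 4 bytes per pixel (RGBA) and the two-PBO ping-pong scheme are assumptions, not something stated in the question.

```python
WIDTH, HEIGHT = 800, 600   # window size from the question
FPS = 30                   # target capture rate from the question
BYTES_PER_PIXEL = 4        # assumption: 8-bit RGBA/BGRA read-back
NUM_PBOS = 2               # assumption: two PBOs used in ping-pong fashion

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = frame_bytes * FPS
# With N PBOs in rotation, the frame being mapped was rendered N-1 frames ago.
added_latency_ms = (NUM_PBOS - 1) * 1000 / FPS

print(f"one frame:     {frame_bytes / 1e6:.2f} MB")
print(f"at {FPS} Hz:      {bytes_per_second / 1e6:.1f} MB/s")
print(f"added latency: {added_latency_ms:.1f} ms")
```

Under these assumptions the raw data rate is modest, so the stall caused by a synchronous read is likely the bigger concern than throughput, which is where the PBOs help.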

You may want to take a look at VirtualGL, which does exactly what you are aiming for.

Related

Changing Resolution of Linux Framebuffer

I am writing a high-performance video application in C++ on Linux (but the language shouldn't matter for this question).
I currently have my application displaying images on the framebuffer. When my computer boots, the resolution of the connected display seems to be set permanently. I would like to be able to change the output resolution dynamically. I have tried fbset but it did not work. I am not using X11 because I assumed that there would be a performance decrease.
Is writing directly to the framebuffer the best way to do my rendering for performance?
If I use X11, I see that there are commands to change the resolution. Should this be something I investigate?
Is there another way to change the resolution?

Get screenshot of EGL DRM/KMS application

How can I get a screenshot of a graphical application programmatically? The application draws its window using the EGL API via DRM/KMS.
I use Ubuntu Server 16.04.3 and a graphical application written using Qt 5.9.2 with the EGLFS QPA backend. It is started from the first virtual terminal (if that matters), then it switches the display to full HD graphical output.
When I use utilities (e.g. fb2png) which operate on /dev/fb?, only the text-mode contents of the first virtual terminal (Ctrl+Alt+F1) are saved as a screenshot.
It is unlikely that there is an EGL API to get the contents of a buffer from the context of another process (that would be insecure), but maybe there is some mechanism (and library) to access the final output of the GPU?

One way would be to get a screenshot from within your application, reading the contents of the back buffer with glReadPixels(). Or use QQuickWindow::grabWindow(), which internally uses glReadPixels() in the correct way. This does not seem to be an option for you, as you need to take a screenshot when the Qt app is frozen.
The other way would be to use the DRM API to map the framebuffer and then memcpy the mapped pixels. This is implemented in Chromium OS in Python and can easily be translated to C, see https://chromium-review.googlesource.com/c/chromiumos/platform/factory/+/367611. The DRM API can also be used from a process other than the Qt UI process that does the rendering.

This is a very interesting question, and I have fought this problem from several angles.
The problem is quite complex and platform dependent. You seem to be running on EGL, which means embedded, and there you have few options unless your platform offers them.
The options you have are:
glGetTexImage
glGetTexImage can copy the contents of OpenGL textures to CPU memory. Unfortunately it is not supported in GLES 2/3, but your embedded provider might support it via an extension. This is nice because you can either render to an FBO or get the pixels from the specific texture you need. It also needs minimal code intervention.
glReadPixels
glReadPixels is the most common way to download all or part of the pixels the GPU has already rendered. Although slow, it works on both GLES and desktop. On desktop with a decent GPU it is bearable up to interactive framerates, but beware: on embedded it might be really slow, as it stalls your render thread until the data arrives (horrible frame drops guaranteed). It can be made to work with minimal code modifications.
Pixel Buffer Objects (PBOs)
Once you start doing real research, PBOs appear here and there because they can be made to work asynchronously. They are generally not supported on embedded (they are not part of GLES 2 and only arrived with GLES 3.0), but they can work really well on desktop, even on mediocre GPUs. They are also a bit tricky to set up and require specific render modifications.
Framebuffer
On embedded, sometimes you already render to the framebuffer, so go there and fetch the pixels. This also works on desktop. You can even mmap() the framebuffer device file and get partial contents easily. But beware: in many embedded systems EGL does not draw to the framebuffer but to a separate 'overlay', so you might be snapshotting whatever is behind it. Also note that some multimedia applications run their UI on EGL and their media player on the framebuffer, so if you only need to capture the video player this might work for you. In other cases EGL targets a texture which is then copied to the framebuffer, and it will also work just fine.
As far as I know, rendering to a texture and streaming it to a framebuffer is how they made the sweet Qt UI you see on the Ableton Push 2.
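To illustrate the mmap() idea, here is a minimal sketch of reading a single pixel and a single scanline out of a mapped buffer. It uses a temporary file as a stand-in for the framebuffer device so that it runs anywhere. The geometry (64x32, 32-bit pixels, no row padding) is made up for the example; on a real device the resolution and line length would have to be queried from the driver rather than hard-coded.

```python
import mmap
import os
import tempfile

# Hypothetical geometry for the example only.
WIDTH, HEIGHT, BPP = 64, 32, 4      # assumed 32-bit pixels
STRIDE = WIDTH * BPP                # assumed: rows are not padded

def read_pixel(buf, x, y):
    """Return the raw bytes of pixel (x, y) from a mapped buffer."""
    offset = y * STRIDE + x * BPP
    return bytes(buf[offset:offset + BPP])

# Temporary file standing in for the framebuffer device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(STRIDE * HEIGHT))   # all-black "screen"
    path = tmp.name

with open(path, "r+b") as f:
    fb = mmap.mmap(f.fileno(), STRIDE * HEIGHT)
    # Paint one pixel so there is something to read back.
    start = 5 * STRIDE + 3 * BPP
    fb[start:start + BPP] = b"\x10\x20\x30\xff"
    pixel = read_pixel(fb, 3, 5)
    row = bytes(fb[5 * STRIDE:6 * STRIDE])   # partial read: one scanline
    fb.close()
os.remove(path)

print(pixel, len(row))
```

Because the mapping is indexed like a byte array, grabbing only a region of the screen costs no more than copying those bytes.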
More exotic Dispmanx/OpenWF
On some embedded systems (notably the Raspberry Pi and most Broadcom VideoCore chips) you have DispmanX, which is really interesting.
This is fun:
"The lowest level of accessing the GPU seems to be by an API called Dispmanx[...]"
It continues...
"Just to give you total lack of encouragement from using Dispmanx there are hardly any examples and no serious documentation."
Basically DispmanX is very close to bare metal, so it sits even deeper than the framebuffer or EGL. Really interesting stuff, because you can use vc_dispmanx_snapshot() and get a snapshot of everything really fast. And by fast I mean I got 30 FPS RGBA32 screen capture with no noticeable stutter on screen and about 4~6% of extra CPU overhead on a Raspberry Pi. Night and day, because glReadPixels was producing very noticeable frame drops even for a 1x1 pixel capture.
That's pretty much what I've found.

Which microcontroller for fast high quality audio switching and playback

I'm building a device which will play high quality sound samples and will switch between samples in <5ms when a signal is applied.
I'm after a microcontroller which can allow this - I need 4 I/O pins for triggering the transitions between sounds, as well as the output pin(s) for the audio. The duration of the audio files will be 50ms or so, but ideally there would be enough storage to allow the files to be 1 second or longer. It will loop the current file until told to change. I don't want audible pops or suchlike when switching files or running other commands - but there shouldn't be a need for anything complex to run beside it, it's purely audio playing and switching.
I've looked at various microcontrollers in the Arduino family but they don't seem optimal for this purpose (I tried, for example, the Mozzi library for Arduino, but the quality is not fantastic). Ideally I could do it all on the chip (whatever it is, it doesn't need to be Arduino) without needing external storage or RAM modules. But if that's necessary I'll do it. The solution has to fit in a 2cm wide cylinder (but there are no length constraints), so ideally no SD card modules or the like. Language wise - I'm fairly new to them all - but can learn whatever would be best.
Audio - 44.1kHz CD quality WAV, although I could obviously switch to a different format if necessary. If it is totally impossible to play such high quality sound, then the sound quality could be lower.
Thank you for your help

For a simple application like this you would be best to just use a small ARM Cortex M device hooked up to an external SPI FLASH chip. Most microcontrollers scale processing power and RAM with FLASH storage so keeping it all on one chip will result in a grotesquely over-powered solution. Serial FLASH memory is very cheap, easy to use, and you can change the size in the future if you need to add more samples.
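To put numbers on the storage side, here is a quick calculation of how much flash the samples would need. It assumes uncompressed 16-bit PCM at 44.1 kHz, and the 1 MiB flash size is purely a hypothetical part chosen for illustration.

```python
SAMPLE_RATE_HZ = 44_100        # CD sample rate, from the question
BYTES_PER_SAMPLE = 2           # assumption: 16-bit PCM
FLASH_BYTES = 1024 * 1024      # hypothetical 1 MiB SPI flash chip

def pcm_bytes(duration_ms, channels=1):
    """Size in bytes of an uncompressed PCM clip of the given length."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * channels

short_clip = pcm_bytes(50)                 # the 50 ms loops from the question
one_second_mono = pcm_bytes(1000)
one_second_stereo = pcm_bytes(1000, channels=2)
seconds_that_fit = FLASH_BYTES // one_second_mono

print(short_clip, one_second_mono, one_second_stereo, seconds_that_fit)
```

So the 50 ms loops are only a few kilobytes each and could plausibly sit in on-chip flash, while second-long clips are what push you towards an external chip.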
For the audio side, if you really want CD quality you'll have to look at getting an external audio DAC, as I don't know of any microcontrollers that integrate a CD quality codec. External DACs aren't expensive or complex to use; they just add to the physical size and BOM cost. Many Cortex chips have built-in 12-bit DACs though, so if the audio has a reasonably small dynamic range you might find this is suitable for your needs.
In terms of minimising pops and clicks, the Cortex devices will have enough power for some basic filtering to deal with this. I would recommend against Arduino though, as you will quickly come up against processing power limitations and I doubt you will want to dive into assembler optimisations.

OpenCV FPS Optimisation

How can I increase OpenCV video FPS in Linux on an Intel Atom? The video seems to lag when processing with the OpenCV libraries.
Furthermore, I'm trying to execute a program/file with OpenCV
system(/home/file/image.jpg);
however, it shows Access Denied.

There are several things you can do to improve performance: using OpenGL, using the GPU, or even just disabling certain functions within OpenCV. When you capture video you can also change the default FPS, which is sometimes set low. If you are getting access denied on that file I would check the permissions, but without seeing the full error it is hard to figure out.
The first line below disables the colour conversion and the second sets the desired FPS. In OpenCV 3 and later these constants are named cv::CAP_PROP_CONVERT_RGB and cv::CAP_PROP_FPS.
cap.set(CV_CAP_PROP_CONVERT_RGB, false);  // skip the automatic conversion to RGB
cap.set(CV_CAP_PROP_FPS, 60);             // request 60 FPS from the capture device

From your question, it seems your frame buffer is accumulating a lot of frames which you cannot clear out before reaching the real-time frame, i.e. a frame captured now is processed several seconds later. Am I correct in understanding?
In this case, I'd suggest a couple of things:
Use a separate thread to grab the frames from VideoCapture and then push these frames into a queue of limited size. Of course this will lead to missing frames, but if you are interested in real-time processing then this cost is often justified.
If you are using OOP, then I would suggest using a separate thread for each object, as this can significantly speed up the processing. You can see a several-fold increase depending on the application and functions used.
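The grabber thread plus bounded queue idea can be sketched as follows. This is only an illustration of the pattern: fake_camera is a made-up stand-in for reading frames from VideoCapture, the class name LatestFrames is invented here, and the sleep durations merely simulate a fast camera and a slow processing step.

```python
import threading
import time
from collections import deque

class LatestFrames:
    """Bounded buffer: when it is full, the oldest frame is dropped."""
    def __init__(self, maxlen=2):
        self._frames = deque(maxlen=maxlen)
        self._cond = threading.Condition()

    def put(self, frame):
        with self._cond:
            self._frames.append(frame)   # deque(maxlen=...) evicts the oldest
            self._cond.notify()

    def get(self, timeout=0.5):
        """Return the oldest buffered frame, or None if nothing arrives."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._frames, timeout):
                return None
            return self._frames.popleft()

def fake_camera(count, period_s):
    """Hypothetical frame source; real code would read from VideoCapture."""
    for i in range(count):
        time.sleep(period_s)
        yield i

def grabber(source, buf):
    for frame in source:
        buf.put(frame)

buf = LatestFrames(maxlen=2)
worker = threading.Thread(target=grabber, args=(fake_camera(50, 0.002), buf))
worker.start()

processed = []
while True:
    frame = buf.get()
    if frame is None:          # camera stopped delivering frames
        break
    time.sleep(0.01)           # simulated slow processing step
    processed.append(frame)
worker.join()

print(f"processed {len(processed)} of 50 frames, last = {processed[-1]}")
```

The processing loop never works on a frame that is more than a couple of captures old, at the cost of skipping whatever arrived while it was busy.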

Fast Audio Input/Output

Here's what I want to do:
I want to allow the user to give my program some sound data (through a mic input), then hold it for 250ms, then output it back out through the speakers.
I have done this already using Java Sound API. The problem is that it's sorta slow. It takes a minimum of about 1-2 seconds from the time the sound is made to the time the sound is heard again from the speakers, and I haven't even tried to implement delay logic yet. Theoretically there should be no delay, but there is. I understand that you have to wait for the sound card to fill up its buffer or whatever, and the sample size and sampling rate have something to do with this.
My question is this: Should I continue down the Java path trying to do this? I want to get the delay down to around 100ms if possible. Does anyone have experience using the ASIO driver with Java? Supposedly it's faster.
Also, I'm a .NET guy. Does this make sense to do with .NET instead? What about C++? I'm looking for the right technology to use here, and maybe a good example of how to read/write to audio input/output streams using your suggested technology platform. Thanks for your help!

I've used JavaSound in the past and found it wonderfully flaky (and it keeps changing between VM releases). If you like C#, use it, just use the DirectX APIs. Here's an example of doing kind of what you want to do using DirectSound and C#. You could use the Effects plugins to perform your 250 ms echo.
http://blogs.microsoft.co.il/blogs/tamir/archive/2008/12/25/capturing-and-streaming-sound-by-using-directsound-with-c.aspx

You may want to look into JACK, an audio API designed for low-latency sound processing. Additionally, Google turns up this nifty presentation [PDF] about using JACK with Java.
"Theoretically there should be no delay, but there is."
Well, it's impossible to have zero delay. The best you can hope for is an unnoticeable delay (in terms of human perception). It might help if you describe your basic algorithm for reading & writing the sound data, so people can identify possible problems.
A potential issue with using a garbage-collected language like Java is that the GC will periodically run, interrupting your processing for some arbitrary amount of time. However, I'd be surprised if it's >100ms in normal usage. If GC is a problem, most JVMs provide alternate collection algorithms you can try.

If you choose to go down the C/C++ path, I highly recommend using PortAudio ( http://portaudio.com/ ). It works with almost everything on multiple platforms and it gives you low-level control of the sound drivers without actually having to deal with the various sound driver technologies that are around.
I've used PortAudio on multiple projects, and it is a real joy to use. And the license is permissive.

If low latency is your goal, you can't beat C.
libsoundio is a low-level C library for real-time audio input and output. It even comes with an example program that does exactly what you want - piping the microphone input to the speakers output.

It's possible with JavaSound to get end-to-end latency in the ballpark of 100-150ms.
The primary cause of latency is the buffer sizes of the capture and playback lines. The bufferSize is set when opening the lines:
capture: TargetDataLine#open(AudioFormat format, int bufferSize)
playback: SourceDataLine#open(AudioFormat format, int bufferSize)
If the buffer is too big it will cause excess latency, but if it's too small it will cause stuttery playback. So you need to find a balance between your application's needs and your computing power.
The default buffer size can be checked with DataLine#getBufferSize after calling #open(AudioFormat format). The default size varies with the AudioFormat and seems to be geared towards high-latency, stutter-free playback applications (e.g. internet streaming). If you're developing a low-latency application, the default buffer size is much too large and should be changed.
In my testing with a 16-bit PCM AudioFormat, a buffer size of 1024 bytes has been pretty close to ideal for low latency.
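To see what a 1024-byte buffer means in time, the size can be converted to milliseconds. The answer only states 16-bit PCM, so the 44.1 kHz sample rate and the channel counts below are assumptions picked for illustration.

```python
def buffer_latency_ms(buffer_bytes, sample_rate_hz, channels, bits_per_sample=16):
    """Time it takes to fill (or drain) one line buffer, in milliseconds."""
    frame_bytes = channels * bits_per_sample // 8
    frames = buffer_bytes // frame_bytes
    return 1000 * frames / sample_rate_hz

# Assumed formats: 44.1 kHz, mono and stereo.
mono_ms = buffer_latency_ms(1024, 44_100, channels=1)
stereo_ms = buffer_latency_ms(1024, 44_100, channels=2)

print(f"mono:   {mono_ms:.2f} ms")
print(f"stereo: {stereo_ms:.2f} ms")
```

Both the capture line and the playback line have their own buffer, so each contributes its share to the end-to-end figure.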
The second and often overlooked cause of audio latency is any other activity being done in the capture or playback threads. For example, logging messages to the console can introduce tens of milliseconds of latency. Turn it off.
