Bypassing operating system and drawing to frame buffer of GPU

Is it possible to directly modify the frame buffer of a GPU without going through the operating system or OpenGL/DirectX calls, using only GPU driver calls?
Basically, I want to overlay some animations on my screen and don't want the operating system to overwrite any of those pixels.

Yes, it's possible by using the Dumb-Buffer API of the DRM (Direct Rendering Manager) driver.
You should definitely see David Herrmann's tutorials and download his examples. He only forgot to initialize the gamma ramps.

Related

What are the syscalls for drawing graphics on the screen in Linux?

I was searching for a syscall that would draw a pixel at a given coordinate on the screen, or something similar, but I couldn't find any such syscall on this site.
I came to know that the OS interacts with monitors through graphics drivers, but these drivers may differ between machines. So is there a common native API provided by Linux for handling these?
It would be much like the syscalls for opening, closing, reading and writing files: even though the underlying file systems may be different, these syscalls give user programs an abstract API that simplifies things. I was searching for something similar for drawing onto the screen.

Typically a user is running a display server and window system which organizes the screen into windows which applications draw to individually using the API provided by that system. The details will depend on the architecture of this system.
The traditional window system on Linux is the X window system and the more modern Wayland display server/protocol is also in common use. For example X has commands to instruct the X server to draw primitives to the screen.
If no such system is in use, you can directly draw to a display either via a framebuffer device or using the DRM API. Both are not accessed by special syscalls, but instead by using normal file syscalls like open, read, write, etc., but also ioctl, on special device files in /dev, e.g. /dev/dri/card0 for DRM to the first graphics card or /dev/fb0 for the first framebuffer device. DRM is also used for applications to render directly to the screen or a buffer when under a display server or window system as above.
In any case DRM is usually not used directly to draw e.g. pixels to the screen. It still is specific to the graphics card. Typically a library like Mesa3D is used to translate the specific details into a common API like OpenGL or Vulkan for applications to use.
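As a rough illustration of the framebuffer-device route described above, the sketch below computes where a pixel lives in a linear framebuffer and writes it. It is hardware-free: a bytearray stands in for an mmap of /dev/fb0, and the 32-bit little-endian XRGB8888 layout, the helper names pixel_offset and put_pixel, and the 4x2 buffer size are all assumptions made for the example. On a real device the row stride and pixel format would have to be queried from the driver (the FBIOGET_FSCREENINFO and FBIOGET_VSCREENINFO ioctls) rather than hard-coded.

```python
import struct

BYTES_PER_PIXEL = 4  # assumption: 32 bits per pixel, XRGB8888


def pixel_offset(x, y, line_length):
    """Byte offset of pixel (x, y) in a linear framebuffer.

    line_length is the row stride in bytes, which may exceed
    width * BYTES_PER_PIXEL because of padding.
    """
    return y * line_length + x * BYTES_PER_PIXEL


def put_pixel(buf, x, y, rgb, line_length):
    """Write one pixel as little-endian XRGB8888 (bytes B, G, R, X)."""
    r, g, b = rgb
    off = pixel_offset(x, y, line_length)
    buf[off:off + BYTES_PER_PIXEL] = struct.pack("<I", (r << 16) | (g << 8) | b)


# Stand-in for an mmap of /dev/fb0: 4x2 pixels, 16 bytes per row.
fb = bytearray(4 * 2 * BYTES_PER_PIXEL)
put_pixel(fb, 1, 1, (255, 0, 0), line_length=16)
```

The same two functions would work unchanged on a writable mmap object, since it supports the same slice assignment as a bytearray.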

Get screenshot of EGL DRM/KMS application

How can I take a screenshot of a graphical application programmatically? The application draws its window using the EGL API via DRM/KMS.
I use Ubuntu Server 16.04.3, and the graphical application is written with Qt 5.9.2 using the EGLFS QPA backend. It is started from the first virtual terminal (if that matters), then it switches the display output to a full HD graphical mode.
When I use utilities that operate on /dev/fb? (e.g. fb2png), only the text-mode contents of the first virtual terminal (Ctrl+Alt+F1) are saved as the screenshot.
It is unlikely that there is an EGL API for reading the contents of a buffer from another process's context (that would be insecure), but maybe there is some mechanism (and library) for accessing the final output of the GPU?

One way would be to get a screenshot from within your application, reading the contents of the back buffer with glReadPixels(). Or use QQuickWindow::grabWindow(), which internally uses glReadPixels() in the correct way. This seems to be not an option for you, as you need to take a screenshot when the Qt app is frozen.
The other way would be to use the DRM API to map the framebuffer and then memcpy the mapped pixels. This is implemented in Chromium OS with Python and can be translated to C easily, see https://chromium-review.googlesource.com/c/chromiumos/platform/factory/+/367611. The DRM API can also be used by another process than the Qt UI process that does the rendering.

This is a very interesting question, and I have fought this problem from several angles.
The problem is quite complex and platform-dependent. You seem to be running on EGL, which means embedded, and there you have few options unless your platform offers them.
The options you have are:
glGetTexImage
glGetTexImage can copy several kinds of buffers from OpenGL textures to CPU memory (glTexSubImage2D goes the other way, uploading from CPU memory into a texture). Unfortunately it is not supported in GLES 2/3, but your embedded provider might support it via an extension. This is nice because you can either render to an FBO or get the pixels from the specific texture you need. It also needs minimal code intervention.
glReadPixels
glReadPixels is the most common way to download all or part of the pixels the GPU has already rendered. Albeit slow, it works on GLES and desktop. On desktop with a decent GPU it is bearable up to interactive framerates, but beware: on embedded it can be really slow because it stalls your render thread to get the data (horrible frame drops guaranteed). On the plus side, it can be made to work with minimal code modifications.
Pixel Buffer Objects (PBOs)
Once you start doing real research, PBOs appear here and there because they can be made to work asynchronously. They are generally not supported on embedded but can work really well on desktop, even on mediocre GPUs. They are also a bit tricky to set up and require specific render modifications.
Framebuffer
On embedded, sometimes you already render to the framebuffer, so go there and fetch the pixels. This also works on desktop. You can even mmap() the framebuffer device file and get partial contents easily. But beware: in many embedded systems EGL does not work on the framebuffer but on a different 'overlay', so you might be snapshotting only the background behind it. Note also that some multimedia applications run their UI on EGL and their media players on the framebuffer, so if you only need to capture the video players this might work for you. In other cases EGL targets a texture which is copied to the framebuffer, and then it will also work just fine.
As far as I know, rendering to a texture and streaming it to a framebuffer is how they made the sweet Qt UI you see on the Ableton Push 2.
More exotic Dispmanx/OpenWF
On some embedded systems (notably the Raspberry Pi and most Broadcom VideoCore chips) you have DispmanX, which is really interesting:
This is fun:
The lowest level of accessing the GPU seems to be by an API called Dispmanx[...]
It continues...
Just to give you total lack of encouragement from using Dispmanx there are hardly any examples and no serious documentation.
Basically DispmanX is very close to bare metal, so it sits even deeper down than the framebuffer or EGL. It is really interesting stuff because you can use vc_dispmanx_snapshot() and get a snapshot of everything really fast. And by fast I mean I got 30 FPS RGBA32 screen capture with no noticeable stutter on screen and about 4~6% of extra CPU overhead on a Raspberry Pi. That is night and day, because glReadPixels was producing very noticeable frame drops even for a 1x1 pixel capture.
That's pretty much what I've found.
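The framebuffer option above (mmap the device and pull out part of the contents) might look roughly like the following sketch. To keep it runnable without hardware, a temporary file filled with counting bytes stands in for /dev/fb0; the helper name read_region, the 4x3 size, the 16-byte stride and the 4 bytes per pixel are assumptions for illustration only. As noted above, on many systems the EGL output never reaches this buffer, so a capture made this way may show only the console or background.

```python
import mmap
import tempfile


def read_region(buf, x, y, width, height, line_length, bytes_per_pixel=4):
    """Return `height` rows of raw pixel bytes for the requested rectangle."""
    rows = []
    for row in range(y, y + height):
        start = row * line_length + x * bytes_per_pixel
        rows.append(bytes(buf[start:start + width * bytes_per_pixel]))
    return rows


# Stand-in for /dev/fb0: a 4x3 pixel dump whose bytes count up from 0.
with tempfile.TemporaryFile() as f:
    f.write(bytes(range(4 * 3 * 4)))
    f.flush()
    # Map the whole file read-only and copy out a 2x2 pixel rectangle.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as fb:
        region = read_region(fb, x=1, y=1, width=2, height=2, line_length=16)
```

Only the requested rows are copied out of the mapping, which is what makes partial captures cheap compared with reading the whole device.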

Writing end to end linux device driver

I would like to learn how to write a typical Linux device driver. Can anyone guide me on how to learn all the aspects of one? The examples I see on the internet are far too simple: they just send a "hello world" message from user space to a kernel driver module and echo back "hello". I want to touch, in a simple way, almost all the areas one would face when writing a real-world driver. Would I need real hardware to do this? Can't system memory simulate the hardware peripheral, so that I can treat it as hardware and control it via a kernel driver covering a good set of operations? Are there any examples or guidance for this?

Take a look at the following example of network driver. It uses QEMU for development and testing.
http://www.codeproject.com/Articles/1087177/Linux-Ethernet-Driver-using-Qemu

Sample drivers usually don't control real hardware. The QEMU answer mentioned here is a good exception I guess.
It depends what type of driver you want to focus on. Most classes of drivers distributed with the kernel have some simpler drivers you can learn from. Nbd for example is great for block subsystem and loop devices:
https://github.com/torvalds/linux/blob/c05c2ec96bb8b7310da1055c7b9d786a3ec6dc0c/drivers/block/nbd.c
Look at the smallest file sizes in a drivers/xyz directory and go up until the code is too complex.

Tips to reduce opengl 3 & 4 frame-rate stuttering under Linux

In recent years I've developed several small games and applications for OpenGL 2 and ES.
I'm now trying to build a scene graph based on OpenGL 3+ for casual "3D" graphics on desktop systems. (Nothing as complex as the Unreal or Crytek engines in mind.)
I started my development on OS X 10.7 and was impressed by Apple's recent OpenGL 3.2 release, which achieves results equivalent to those on Windows systems.
The results on Linux, by contrast, are a disappointment. Even the most basic animation stutters and destroys the impression of reality. The results did not differ between the windowing toolkits freeglut and GLFW. (The extensions are loaded with GLEW 1.7.)
I would like to mention that I'm talking about the new OpenGL core profile, not the old OpenGL 2 render path, which works fine under Linux but uses the CPU instead of the GPU for complex operations.
After watching professional demos like the "Unigine Heaven demo", I think there is a general problem with using modern real-time 3D graphics on Linux.
Any suggestions to overcome this problem are very welcome.
UPDATE:
I'm using:
AMD Phenom II X6, Radeon HD57XX with latest proprietary drivers (11.8) and Unity(64Bit).
You could take my renderloop from the toolkit documentation:
do {
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    ...
} while (!glfwGetKey(GLFW_KEY_ESC) && glfwGetWindowParam(GLFW_OPENED));
I'm using VBOs, and all transformation stages are done with shaders. Animation timing is done with glfwGetTime(). The problem occurs in both windowed and full-screen mode. I don't know whether a composition manager interferes with full-screen applications, but I also cannot demand that users disable it.
Update 2:
Typo: I'm using a HD57XX card.
GLX_Extension: http://snipt.org/xnKg
PCI Dump: 01:00.0 VGA compatible controller: ATI Technologies Inc Juniper [Radeon HD 5700 Series]
X Info: http://pastie.org/2507935
Update 3:
Disabling the composition manager reduced, but did not completely remove, the stuttering.
(I replaced the standard window manager with "ubuntu classic without extensions")
Once a second the animation freezes and ugly distortions appear:
(Image removed - Not allowed to post Images.)
This happens even though vertical synchronisation is enabled in the driver and checked in the application.

Since you're running Linux we require a bit of detailed information:
Which hardware do you use?
Only NVidia, AMD/ATI and Intel offer 3D acceleration so far.
Which drivers?
For NVidia and AMD/ATI there are proprietary (nvidia-glx, fglrx) and open source drivers (nouveau, radeon). For Intel there are only the open source drivers.
Of all open source 3D drivers, the Intel drivers offer the best quality.
The open source AMD/ATI drivers, "radeon" have reached an acceptable state, but still are not on par, performance wise.
For NVidia GPUs, the only drivers that make sense to use productively are the proprietary ones. The open source "nouveau" drivers simply don't cut it yet.
Do you run a compositing window manager?
Compositing creates a whole bunch of synchronization and timing issues. Also, some of the OpenGL code found in the compositing WMs brings tears to the eyes of a seasoned OpenGL coder, especially one with experience writing realtime 3D (game) engines.
KDE4 and GNOME3 use compositing by default, if available. The same holds for the Ubuntu Unity desktop shell. Also, for some non-compositing WMs the default scripts start xcompmgr for transparency and shadow effects.
And last but not least: How did you implement your rendering loop?
A common mistake is to use a timer to issue redisplay events at "regular" intervals. This is not how it's done properly. Timer events can be delayed arbitrarily, and the standard timers are not very accurate themselves either.
The proper way is to call the display function in a tight loop, measure the time that elapses between rendering iterations, and use this timing to advance the animation accordingly. A truly elegant method is to use one of the VSync extensions that give you the display refresh frequency and the refresh counter. That way, instead of using a timer, you are told exactly how much time has advanced between frames, in units of display refresh periods.
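The measured-time approach described above can be sketched without any OpenGL at all. In the snippet below the names advance and run, the rotation speed of 90 degrees per second, and the render callback are hypothetical; in the GLFW application from the question, glfwGetTime() would play the role of the clock and the callback would draw and swap buffers. The point is only that the animation state is advanced by the measured elapsed time, never by an assumed fixed frame interval.

```python
import time


def advance(angle, dt, degrees_per_second=90.0):
    """Advance a rotation by the time that actually elapsed."""
    return (angle + degrees_per_second * dt) % 360.0


def run(frames, render, clock=time.monotonic):
    """Tight render loop: measure the gap between iterations, then animate."""
    angle = 0.0
    last = clock()
    for _ in range(frames):
        now = clock()
        angle = advance(angle, now - last)  # dt varies from frame to frame
        last = now
        render(angle)                       # draw the frame for this instant
    return angle


# Usage: collect the angles that three frames would have been drawn with.
drawn = []
run(3, drawn.append)
```

Because the clock is injectable, the loop can be exercised with a fake clock to confirm that a late frame simply advances the animation further rather than slowing it down.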

Is kernel or userspace responsible for rotating framebuffer to match screen

I'm working on an embedded device whose screen is rotated 90 degrees clockwise: the screen controller reports an 800x600 screen, while the device's screen is 600x800 portrait.
Whose responsibility do you think it is to compensate for this? Should the kernel rotate the framebuffer to provide the screen geometry that upper-level software expects, or should applications (X server, bootsplash) adapt and draw to the rotated screen?
Every part of the stack is free software, so there are no non-technical obstacles to modifying it; the question is more about logical soundness.

It makes most sense for the screen driver to do it - the kernel after all is supposed to provide an abstraction of the device for the userspace applications to work with. If the screen is a 600x800 portrait oriented device, then that's what applications should see from the kernel.
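Wherever the rotation ends up being implemented, the core of it is a coordinate mapping between the 600x800 portrait screen that applications see and the 800x600 surface the controller reports. The sketch below shows that mapping under stated assumptions: the function name to_physical is hypothetical, and whether the compensation has to be clockwise or counter-clockwise depends on how the panel is mounted, so both directions are provided.

```python
LOGICAL_W, LOGICAL_H = 600, 800    # what applications should see (portrait)
PHYSICAL_W, PHYSICAL_H = 800, 600  # what the screen controller reports


def to_physical(x, y, clockwise=True):
    """Map a logical portrait pixel to the physical landscape buffer."""
    if not (0 <= x < LOGICAL_W and 0 <= y < LOGICAL_H):
        raise ValueError("coordinate outside the logical screen")
    if clockwise:
        # Rotating the image 90 degrees clockwise: top-left goes to top-right.
        return (LOGICAL_H - 1 - y, x)
    # Counter-clockwise: top-left goes to bottom-left.
    return (y, LOGICAL_W - 1 - x)


# The four logical corners and where they land in the physical buffer.
corners = [(0, 0), (599, 0), (0, 799), (599, 799)]
mapped = [to_physical(x, y) for x, y in corners]
```

Every mapped coordinate stays inside the 800x600 physical bounds, which is a quick sanity check that the two geometries are consistent.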

Yes, I agree. The display driver should update the display accordingly and keep control of it.

I'm not sure exactly how standard your embedded device is, but if it runs a regular Linux kernel, you might check the kernel configurator (make xconfig, when compiling a new kernel). One of the options for kernel 2.6.37.6, in the device/video card section, enables rotation of the kernel message display, so that it is rotated 90 degrees left or right while booting up.
I think it also makes your consoles rotate correctly after login.
This was not available in kernels even 6-8 months ago, or at least not in the kernel that Slackware64 13.37 came with at about that time.
Note that the BIOS messages are still rotated on a PC motherboard, but that is hard-coded in the BIOS and may not apply to the embedded system you are working with.
If this kernel feature is not useful to you for whatever reason, the way it was done in the Linux kernel might be a good example of where and how to go about it. Once you get the exact name of the option from "make xconfig", it should be pretty easy to search wherever the kernel traffic is logged for that name and dig up some info about it.
Hmmm. I just recompiled my kernel today, and I may have been wrong about how new this option is. It looks like it was available in some kernel versions before the included-with-Slackware64 versions that I referenced. Sorry!
