I cannot figure out how graphics are linked with code, the way it is done in games and in operating systems like Windows, OS X, etc. How is it done? Is the code executed via the graphics, or are the graphics executed through the code?
The system sends a frame sync interrupt or message. The code is written so that it does a chunk of work in slightly under a frame, and it then has a new frame to present to the viewer when the request comes along.
So if a spaceship is moving, the program calculates its velocity in pixels per 1/60 second. Then it adds that amount to its position, and redraws it on the new frame.
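The per-frame update described above can be sketched as follows. This is only an illustration: the 60 Hz frame rate and the function names are assumptions, and a real program would redraw and wait for the frame sync inside the loop.

```python
FRAME_RATE = 60  # frames per second; assumed refresh rate


def velocity_per_frame(pixels_per_second, frame_rate=FRAME_RATE):
    """Convert a speed in pixels/second into pixels per frame."""
    return pixels_per_second / frame_rate


def simulate(position, speed_px_per_s, frames, frame_rate=FRAME_RATE):
    """Advance a 2D position by one velocity step per frame."""
    x, y = position
    vx = velocity_per_frame(speed_px_per_s[0], frame_rate)
    vy = velocity_per_frame(speed_px_per_s[1], frame_rate)
    for _ in range(frames):
        # One chunk of work per frame: move the ship by its per-frame velocity.
        x += vx
        y += vy
        # A real program would redraw the ship here, then wait for the
        # next frame sync before continuing.
    return (x, y)
```

A ship moving right at 120 pixels per second advances 2 pixels per frame, so after 60 frames it has moved 120 pixels.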
This is an experimental project of mine related to remote browser isolation. I'm trying to intercept Skia draw commands in a running Chromium instance and later replay them in a different browser instance at client-side via CanvasKit, a WebAssembly build of Skia.
However, I'm having a hard time figuring out where and how to intercept those draw commands within Chromium source code. Any advice on how to approach my problem is much appreciated!
In Chromium, all draw operations are recorded in a DisplayItemList, which you can find through the definition of class GraphicsContext in the blink module. These recorded operations are replayed later, when CC decides it is the right time.
On Blink's side, the code for all of the above is mostly spread across blink/renderer/platform/graphics/graphics_context.cc and its related files. Looking at Chromium as a whole, though, all graphics work is triggered by CC (Chrome Compositor), which maintains a state machine and runs a draw-frame loop triggered by the system's vsync signal on Android. At the start of this loop, the draw operations recorded on Blink's side are pushed. At the end of the loop, the composited frame's image is translated into a series of GPU ops, which are executed by calling the system's GPU device APIs. The CC-related code files can be found in components/viz/. Class Display is a good starting point for reading the code.
My knowledge comes from version 68, and code in Chromium changes frequently, so I cannot confirm that the files and locations are still correct.
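To make the record-then-replay idea concrete, here is a minimal sketch. It is not Chromium code: DisplayList, LoggingCanvas and the draw method names are all hypothetical, and the real DisplayItemList is far more involved. It only shows why a recorded list of operations is a natural place to intercept draw commands and replay them elsewhere.

```python
class DisplayList:
    """Hypothetical stand-in for a recorded list of draw operations."""

    def __init__(self):
        self.ops = []

    def record(self, name, *args):
        # Recording only stores the operation; nothing is drawn yet.
        self.ops.append((name, args))

    def replay(self, target):
        # Replaying calls the same-named method on any target object,
        # which could be a real canvas or a serializer.
        for name, args in self.ops:
            getattr(target, name)(*args)


class LoggingCanvas:
    """Hypothetical replay target that just logs what it is asked to draw."""

    def __init__(self):
        self.log = []

    def draw_rect(self, x, y, w, h):
        self.log.append(f"rect {x},{y} {w}x{h}")

    def draw_text(self, text, x, y):
        self.log.append(f"text {text!r} at {x},{y}")
```

Because the recorded operations are plain data, they could in principle be serialized and replayed in another process, which is the shape of what the question is asking for.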
How can I get a screenshot of a graphical application programmatically? The application draws its window using the EGL API via DRM/KMS.
I use Ubuntu Server 16.04.3 and a graphical application written with Qt 5.9.2 and the EGLFS QPA backend. It is started from the first virtual terminal (if that matters), then it switches the display to full HD graphical output.
When I use utilities (e.g. fb2png) that operate on /dev/fb?, only the text-mode contents of the first virtual terminal (Ctrl+Alt+F1) are saved as the screenshot.
It is unlikely that there is an EGL API to get the contents of a buffer from the context of another process (it would be insecure), but maybe there is some mechanism (and library) to get access to the final output of the GPU?
One way would be to get a screenshot from within your application, reading the contents of the back buffer with glReadPixels(). Or use QQuickWindow::grabWindow(), which internally uses glReadPixels() in the correct way. This seems to be not an option for you, as you need to take a screenshot when the Qt app is frozen.
The other way would be to use the DRM API to map the framebuffer and then memcpy the mapped pixels. This is implemented in Chromium OS with Python and can be translated to C easily, see https://chromium-review.googlesource.com/c/chromiumos/platform/factory/+/367611. The DRM API can also be used by another process than the Qt UI process that does the rendering.
This is a very interesting question, and I have fought this problem from several angles.
The problem is quite complex and dependent on platform. You seem to be running on EGL, which means embedded, and there you have few options unless your platform offers them.
The options you have are:
glGetTexImage
glGetTexImage can copy several kinds of buffers from OpenGL textures to CPU memory. Unfortunately it is not supported in GLES 2/3, but your embedded provider might support it via an extension. This is nice because you can either render to an FBO or get the pixels from the specific texture you need. It also needs minimal code intervention.
glReadPixels
glReadPixels is the most common way to download all or part of the GPU pixels that have already been rendered. Albeit slow, it works on GLES and desktop. On desktop with a decent GPU it is bearable up to interactive framerates, but beware: on embedded it might be really slow, as it stops your render thread to get the data (horrible framedrops ensured). It can be made to work with minimal code modifications.
Pixel Buffer Objects (PBOs)
Once you start doing real research, PBOs appear here and there because they can be made to work asynchronously. They are also generally not supported on embedded, but can work really well on desktop even on mediocre GPUs. They are also a bit tricky to set up and require specific render modifications.
Framebuffer
On embedded, sometimes you already render to the framebuffer, so go there and fetch the pixels. This also works on desktop. You can even mmap() the buffer to a file and get partial contents easily. But beware: on many embedded systems EGL does not work on the framebuffer but on a different 'overlay', so you might be snapshotting the background behind it. Also note that some multimedia applications run their UIs on EGL and their media players on the framebuffer, so if you only need to capture the video players this might work for you. In other cases EGL targets a texture which is copied to the framebuffer, and it will also work just fine.
As far as I know, rendering to a texture and streaming it to a framebuffer is how they made the sweet Qt UI you see on the Ableton Push 2.
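Getting "partial contents" out of a mapped framebuffer mostly comes down to stride arithmetic. The sketch below works on any bytes-like object (an mmap of the framebuffer device would qualify). The function name is made up for illustration, and the stride and bytes-per-pixel values are assumptions that on a real device must be queried from the driver.

```python
def crop_region(buf, stride, bytes_per_pixel, x, y, width, height):
    """Copy a rectangle out of a raw framebuffer-like byte buffer.

    stride is the number of bytes per scanline, which may be larger than
    width * bytes_per_pixel because of padding.
    """
    rows = []
    for row in range(y, y + height):
        start = row * stride + x * bytes_per_pixel
        end = start + width * bytes_per_pixel
        rows.append(bytes(buf[start:end]))
    return b"".join(rows)
```

The result is tightly packed (no padding), so it can be handed to an image encoder together with the cropped width and height.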
More exotic Dispmanx/OpenWF
On some embedded systems (notably the Raspberry Pi and most Broadcom VideoCores) you have DispmanX, which is really interesting:
This is fun:
The lowest level of accessing the GPU seems to be by an API called Dispmanx[...]
It continues...
Just to give you total lack of encouragement from using Dispmanx there are hardly any examples and no serious documentation.
Basically, DispmanX is very close to bare metal, so it sits even deeper down than the framebuffer or EGL. It is really interesting stuff, because you can use vc_dispmanx_snapshot() and get a snapshot of everything really fast. And by fast I mean I got 30 FPS RGBA32 screen capture with no noticeable stutter on screen and about 4~6% of extra CPU overhead on a Raspberry Pi. That is night and day, because glReadPixels was producing very noticeable framedrops even for a 1x1 pixel capture.
That's pretty much what I've found.
I have recently started reading Linux Kernel Development by Robert Love and I am Love-ing it!
Please read the below excerpt from the book to better understand my questions:
A number identifies interrupts and the kernel uses this number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number of the incoming interrupt and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data...
Now, I have dual boot on my machine, and sometimes (in fact, often) when I type something on Windows, I find myself doing it in what I call Night crawler mode. This is when I am typing and I don't see anything on the screen, and after a while the entire text appears in one flash; probably the buffer just spits everything out.
I don't see this happening on Linux. Is it because of the interrupt context present in Linux and the absence of it in Windows?
BTW, I am still not sure if there is an interrupt context in Windows; Google didn't give me any relevant results for that.
All OSes have an interrupt context, it's a feature/constraint of the CPU architecture -- basically, this is "just the way things work" with computer hardware. Different OSes (and drivers within that OS) make different choices about what work and how much work to do in the interrupt before returning, though. That may be related to your Windows experience, or it may not. There is a lot of code involved in getting a key press translated into screen output, and interrupt handling is only a tiny part.
A number identifies interrupts and the kernel uses this number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number of the incoming interrupt and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data
This is a pretty poor description. Things might be different now with USB keyboards, but this seems to discuss what would happen with an old PS/2 connection, where an "8042"-compatible chipset on your motherboard signals on an IRQ line to the CPU, which then executes whatever code is at the address stored in location 9 in the interrupt table (traditionally an array of pointers starting at address 0 in physical memory, though from memory you could change the address, and last time I played with this stuff PCs still had <1MB RAM and used different memory layout modes).
That dispatch process has nothing to do with the kernel... it's the way the hardware works. (The keyboard controller could be asked not to generate interrupts, allowing OS/driver software to "poll" it regularly to see if there happened to be new event data available, but it'd be pretty crazy to use that really).
Still, the code address from the interrupt table will point into the kernel or keyboard driver, and the kernel/driver code will read the keyboard event data from the keyboard controller's I/O port. For these hardware interrupt handlers, a primary goal is to get the data from the device and store it into a buffer as quickly as possible - both to ensure a return from the interrupt to whatever processing was happening, and because the keyboard controller can only handle one event at a time - it needs to be read off into the buffer before the next event.
It's then up to the OS/driver to either provide some kind of input availability signal to application software, or wait for the application software to attempt to read more keyboard input, but it can do it in a "whenever you're ready" fashion. Whichever way, once an application has time to read and start responding to the input, things can happen that mean it takes an unexpectedly long amount of time: it could be that the extra keystroke triggers some complex repagination algorithm that takes a long time to run, or that the keystroke results in the program executing code that has been swapped out to disk (check Wikipedia for "virtual memory"), in which case it could be only after the hard disk has read part of the program into memory that the program can continue to run. There are thousands of such edge cases involving window movement, graphics clipping algorithms, etc. that could account for the keyboard-handling code taking a long time to complete, and if other keystrokes have happened meanwhile they'll be read by the keyboard driver into that buffer, then only "perceived" by the application after the slow/blocking processing completes. It may well be that the processing consequent to all the keystrokes then in the buffer completes much more quickly: for example, if part of the program was swapped in from disk, that part may be ready to process the remaining keystrokes.
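The "typed text appears in one flash" effect falls out of this buffering. The toy simulation below is not how any real driver is written: KeyboardDriver, the event tuples and run() are invented for illustration. It only shows keystrokes piling up in the driver's buffer while the application is busy, then being consumed together.

```python
from collections import deque


class KeyboardDriver:
    """Toy model: the interrupt handler only stores data and returns."""

    def __init__(self):
        self.buffer = deque()

    def on_interrupt(self, key):
        # Do as little as possible in interrupt context: just buffer the key.
        self.buffer.append(key)

    def read_all(self):
        # Called by the application whenever it is ready for input.
        keys = list(self.buffer)
        self.buffer.clear()
        return keys


def run(events):
    """Process ("key", ch) and ("app_ready",) events; return screen updates."""
    driver = KeyboardDriver()
    screen_updates = []
    for event in events:
        if event[0] == "key":
            driver.on_interrupt(event[1])
        else:
            pending = driver.read_all()
            if pending:
                # Everything typed while the app was busy shows up at once.
                screen_updates.append("".join(pending))
    return screen_updates
```

If the application only becomes ready after three keystrokes, the screen is updated once with all three characters rather than three times with one each.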
Why would Linux do better at this than Windows? Mainly because the Operating System, drivers and applications tend to be "leaner and meaner"... less bloated software (like C++ vs C# .NET), less wasted memory, so less swapping and delays.
I'm working on embedded device with screen rotated 90 degrees clockwise: screen controller reports 800x600 screen, while device's screen is 600x800 portrait.
What do you think, whose responsibility is it to compensate for this: should the kernel rotate the framebuffer to provide a 600x800 screen as expected by upper-level software, or should applications (X server, bootsplash) adapt and draw to the rotated screen?
Every part of stack is free software, so there are no non-technical problems for modification, the question is more about logical soundness.
It makes most sense for the screen driver to do it - the kernel after all is supposed to provide an abstraction of the device for the userspace applications to work with. If the screen is a 600x800 portrait oriented device, then that's what applications should see from the kernel.
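Whichever layer ends up doing it, the compensation itself is a coordinate remap between the logical portrait surface and the physical landscape framebuffer. A minimal sketch follows; the function names are made up, and which of the two directions a given device needs depends on how the panel is mounted, so treat the choice as an assumption to verify on hardware.

```python
def rotate_cw(x, y, width, height):
    """Map (x, y) in a width x height image to its position after rotating
    the image 90 degrees clockwise (result is height wide, width tall)."""
    return (height - 1 - y, x)


def rotate_ccw(x, y, width, height):
    """Map (x, y) in a width x height image to its position after rotating
    the image 90 degrees counterclockwise."""
    return (y, width - 1 - x)
```

For a 600x800 logical surface, rotate_cw sends the top-left pixel (0, 0) to (799, 0), the top-right corner of the 800x600 physical framebuffer, and rotate_ccw with the physical dimensions maps it back.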
Yes, I agree. The display driver should update the display accordingly and keep control.
I'm not sure exactly how standard your embedded device is. If it is running a regular Linux kernel, you might check the kernel configurator (make xconfig, when compiling a new kernel). One of the options for kernel 2.6.37.6, in the device/video card section, is to enable rotation of the kernel message display so that it scrolls rotated 90 degrees left or right while booting up.
I think it also makes your consoles be rotated correctly after login too.
This was not available in kernels even 6-8 months ago, at least not in the kernel that Slackware64 13.37 came with about that time.
Note that the BIOS messages are still rotated on a PC motherboard, but that is hard-coded in the BIOS, which may not apply to the embedded system you are working with.
If this kernel feature is not useful to you for whatever reason, how it was done in the Linux kernel might be a good example of where and how to go about it. Once you get the exact name of the option from "make xconfig", it should be pretty easy to search wherever they log the kernel traffic for that name and dig up some info about it.
Hmmm. I just recompiled my kernel today, and I may have been wrong about how new this option is. Looks like it was available with some kernel versions before the included-with-Slackware64 versions that I referenced. Sorry!
I'm looking into making some software that makes the keyboard function like a piano (e.g., the user presses the 'W' key and the speakers play a D note). I'll probably be using OpenAL. I understand the basics of digital audio, but playing real-time audio in response to key presses poses some problems I'm having trouble solving.
Here is the problem: Let's say I have 10 audio buffers, and each buffer holds one second of audio data. If I have to fill buffers before they are played through the speakers, then I would be filling buffers one or two seconds before they are played. That means that whenever the user tries to play a note, there will be a one or two second delay between pressing the key and the note being played.
How do you get around this problem? Do you just make the buffers as small as possible, and fill them as late as possible? Is there some trick that I am missing?
Most software synthesizers don't use multiple buffers at all.
They just use one single, small ringbuffer that is constantly played.
A high priority thread will as often as possible check the current play-position and fill the free part (e.g. the part that has been played since the last time your thread was running) of the ringbuffer with sound data.
This will give you a constant latency that is only bound by the size of your ring-buffer and the output latency of your soundcard (usually not that much).
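The fill logic for such a ring buffer can be sketched as below. This is a simplified model, not OpenAL or any sound card API: the RingBuffer class is hypothetical, the play position is passed in by the caller, and equal play and write positions are treated as "nothing to fill".

```python
class RingBuffer:
    """Toy ring buffer that is refilled behind the play position."""

    def __init__(self, size):
        self.samples = [0.0] * size
        self.write_pos = 0  # next index the filler thread will write

    def refill(self, play_pos, render):
        """Fill the part played since the last call; return samples written.

        render is a callable returning the next sample value.
        """
        size = len(self.samples)
        # Distance from the write position forward to the play position,
        # wrapping around the end of the buffer.
        free = (play_pos - self.write_pos) % size
        for _ in range(free):
            self.samples[self.write_pos] = render()
            self.write_pos = (self.write_pos + 1) % size
        return free
```

In a real synthesizer the high-priority thread would call refill in a loop with the device's current play position, and render would produce samples from the current note state.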
You can lower your latency even further:
In case of a new note to be played (e.g. the user has just pressed a key) you check the current play position within the ring-buffer, add some samples for safety, and then re-render the sound data with the new sound-settings applied.
This becomes tricky if you have time-based effects running (delay lines, reverb and so on), but it's doable. Just keep track of the last 10 states of your time based effects every millisecond or so. That'll make it possible to get back 10 milliseconds in time.
With the WinAPI, you can only get so far in terms of latency. Usually you can't get below 40-50ms which is quite nasty. The solution is to implement ASIO support in your app, and make the user run something like Asio4All in the background. This brings the latency down to 5ms but at a cost: other apps can't play sound at the same time.
I know this because I'm a FL Studio user.
The solution is small buffers, filled frequently by a real-time thread. How small you make the buffers (or how full you let the buffer become with a ring-buffer) is constrained by scheduling latency of your operating system. You'll probably find 10ms to be acceptable.
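The relationship between buffer size and latency is simple arithmetic, sketched here. The 44100 Hz sample rate used in the example is an assumption, and the helper names are made up.

```python
def latency_ms(buffer_frames, sample_rate):
    """Time in milliseconds needed to play buffer_frames sample frames."""
    return 1000.0 * buffer_frames / sample_rate


def frames_for_latency(target_ms, sample_rate):
    """Buffer size in sample frames for a target latency in milliseconds."""
    return round(sample_rate * target_ms / 1000.0)
```

At 44100 Hz, a one-second buffer as in the question holds 44100 frames and adds 1000 ms of delay, while a 10 ms target corresponds to a buffer of 441 frames.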
There are some nasty gotchas in here for the uninitiated - particularly with regards to software architecture and thread-safety.
You could try having a look at Juce - which is a cross-platform framework for writing audio software, and in particular - audio plugins such as SoftSynths and effects. It includes software for both sample plug-ins and hosts. It is in the host that issues with threading are mostly dealt with.