Reading framebuffer at lowest level possible under X - graphics

I want to read the framebuffer of the videocard at the lowest level possible for a security application I'm writing.
I want to be as sure as possible that what I'm reading is exactly what will be finally
put on the bits of the hardware lighting the pixels of the screen,
and that no software layer is in the middle (or at least I want to have the lowest number possible of layers in the middle).
I've seen it's pretty easy to use X to grab the screen in a precise moment, but that call
is still passing through the X server.
I would like to have something really more low level,
even if this means messing up with some ioctl with the video card.
I've seen the existence of DRI and DRI2, but they are very very badly documented, especially
the latter.
I can't really understand how they work.
Do you have any idea, reference or starting point for a good research?
Anything would be appreciated!

I'm not sure how much reading the framebuffer will help you (even disregarding the issue pointed out by timday in his comment, deciding whether what you read there is what you want it to be may not be very easy), but if you are doing this on Linux you could map the kernel framebuffer devices, possibly using DirectFB to help you. Alternatively, if you are on a non-Linux PC platform you could use VESA (take a look at the VESA code in X.Org and the X.Org VESA driver (the actual code is split between the two). Be aware that you will probably also have some fun with things like multi-monitor setups.

Related

Possible to access low level touchpad input at user-level (esp. in Windows) to provide better gestures/palm rejection?

I have a laptop whose touchpad is very sensitive to spurious light, grazing touches of anything other than the finger being used, causing unwanted gesture input--even with the sensitivity set to low in the control panel. I can (and will) probably learn over time to hold my wrists in a manner to minimize the problem--but as someone interested in algorithms in things like signal processing, vision, etc., I thought it might be a fun project to try and write a more intelligent filtering algorithm for touch input.
I'm not scared by the math/algorithmic aspect--but what I have zero knowledge of is how the software stack for input devices works, on what level in the stack such code would need to run, and how privileged/close to the kernel I would need to get to have access to that (and whether such a level is even sufficiently documented and accessible to make this possible). Most of the stack presumably handles touch data at "mouse level" abstraction, i.e. as a pointer x/y pair, whereas filtering to eliminate spurious touches would presumably need to act on a sort of "pixel map" of the pad with the areas registering touch "bright", before some sort of "blob detection" on this computes the pointer coordinates.
Where is this transformation ("pad image" to "pointer") performed--in the driver for the touchpad, in the OS kernel, in some userspace code, etc.? Is it even performed at all, or does the capacitive sensing circuitry directly detect only the centroid of the points of contact to begin with? (I can't find a good description of even how multi-touch with capacitive sensing works, fundamentally on a physics level) Is this the sort of thing that's only possible to modify in something like Linux where every line of code in the whole system is modifiable, or is there a good way to "hook" this process even in OSes that are otherwise proprietary?

Need advice on hardware stack for Wireless Audio solution

Good day!
Problem definition:
Current implementations of Bluetooth does not allow to simply support good quality of Audio(Earphones mode) and 2-way audio transition (Headset mode).
Also, even if one would manage to set this configuration up, which have huge limitations on the hardware/software used, there is no way to handle sound input from 2 different audio devices simultaneously.
So, technically - one cannot just play the Game, communicate on the Discord, and optionally listen to some music, unless he is bound to some USB-bundled earphones. Which are usually really crappy, or really expensive. Or both.
Solution sketch:
So, I came up with an idea that one can actually build such device, using Raspberry Pi, Arduino, or even barebone-component-based stacks.
Theoretical layout of connections per-se would look somehow like that:
Idea is to create 2 "simple" devices
One, not-so-portable, that would handle several analog inputs, and one analog output
One, portable, that would handle single analog Input and Output, and could be used with any analog earphones.
"Requirements" to such system would be quite simple:
This bundle have to handle Data Transition on some distance, preferably up to 10 meters, or more.
The "Inlet" device should be portable enough to keep it in the pocket, or in an arm band, or something
Sound Quality should be at the very least on the level of Bluetooth headphones profile, or if possible - even better
If possible - it would be nice to keep the price of the Solution under 500 Euros, but I'm so tired of current state of things that I might consider raising the budget...
Don't mind the yellow buttons on the Outlet device. Those are optional, and will depend on the implementation stack :)
Question:
Can anyone advice me which component-base would be a better solution to making such a tool, and why?
And maybe someone actually knows of similar systems already existing?
Personally I would prefer anything but the barebone-components-based solution, just because I'm really rusty with that area, and it requires quite the amount of tools, to handle it properly.
While using pre-built modules can save me from buying most of the hardware tools, minifying my "hardware customization" part of this solution, leaving only software part to handle (which is my main area of expertise).
But then again, if there are some experts here, that would consider other stacks non-viable - I would really appreciate to see their reasonings.
P.S. Just to be clear: If this project will prove viable - I will implement it, and share the implementation details with the communities. I am not the first one who needs such system, and unfortunately it seems that Hardware/Software vendors are not really interested in designing similar solutions...
I happen to find a "temporary" solution.
I've came across a wireless headset, that allows to simultaneously support Wireless USB Bundle connection, and Bluetooth connection to different devices, and provide nice way of controlling sound input/output with both connections.
This was almost a pure luck, as this "feature" was not described anywhere in the specs...
Actual headset name is:
JBL Quantum 800
This does not closes the question per-se, as I still plan to implement this "Summer Project" at some point, but I believe this information might be useful to those searching for similar solutions.

Gesture Recognition for my Grandma (Kinect) Linux

I'm looking into making a project with the Kinect to allow my Grandma to control her TV without being daunted by using the remote. So, I've been looking into basic gesture recognition. The aim will be to say turn the volume of the TV up by sending the right IR code to the TV when the program detects that the right hand is being "waved."
The problem is, no matter where I look, I can't seem to find a Linux based tutorial which shows how to do something as a result of a gesture. One other thing to note is that I don't need to have any GUI apart from the debug window as this will slow my program down a fair bit.
Does anybody know of something somewhere which will allow me to in a loop, constantly check for some hand gesture and when it does, I can control something, without the need of any GUI at all, and on Linux? :/
I'm happy to go for any language but my experience revolves around Python and C.
Any help will be accepted with great appreciation.
Thanks in advance
Matt
In principle, this concept is great, but the amount of features a remote offers is going to be hard to replicate using a number of gestures that an older person can memorize. They will probably be even less incentivized to do this (learning new things sucks) if they already have a solution (remote), even though they really love you. I'm just warning you.
I recommend you use OpenNI and NITE. Note that the current version of OpenNI (2) does not have Kinect support. You need to use OpenNI 1.5.4 and look for the SensorKinect093 driver. There should be some gesture code that works for that (googling OpenNI Gesture yields a ton of results). If you're using something that expects OpenNI 2, be warned that you may have to write some glue code.
The basic control set would be Volume +/-, Channel +/-, Power on/off. But that will be frustrating if she wants to go from Channel 03 to 50.
I don't know how low-level you want to go, but a really, REALLY simple gesture recognize could look at horizontal and vertical swipes of the right hand exceeding a velocity threshold (averaged). Be warned: detected skeletons can get really wonky when people are sitting (that's actually a bit of what my PhD is on).

Framebuffer Documentation

Is there any documentation on how to write software that uses the framebuffer device in Linux? I've seen a couple simple examples that basically say: "open it, mmap it, write pixels to mapped area." But no comprehensive documentation on how to use the different IOCTLS for it anything. I've seen references to "panning" and other capabilities but "googling it" gives way too many hits of useless information.
Edit:
Is the only documentation from a programming standpoint, not a "User's howto configure your system to use the fb," documentation the code?
You could have a look at fbi's source code, an image viewer which uses the linux framebuffer. You can get it here : http://linux.bytesex.org/fbida/
-- It appears there might not be too many options possible to programming with the fb from user space on a desktop beyond what you mentioned. This might be one reason why some of the docs are so old. Look at this howto for device driver writers and which is referenced from some official linux docs: www.linux-fbdev.org [slash] HOWTO [slash] index.html . It does not reference too many interfaces.. although looking at the linux source tree does offer larger code examples.
-- opentom.org [slash] Hardware_Framebuffer is not for a desktop environment. It reinforces the main methodology, but it does seem to avoid explaining all the ingredients necessary to doing the "fast" double buffer switching it mentions. Another one for a different device and which leaves some key buffering details out is wiki.gp2x.org [slash] wiki [slash] Writing_to_the_framebuffer_device , although it does at least suggest you might be able use fb1 and fb0 to engage double buffering (on this device.. though for desktop, fb1 may not be possible or it may access different hardware), that using volatile keyword might be appropriate, and that we should pay attention to the vsync.
-- asm.sourceforge.net [slash] articles [slash] fb.html assembly language routines that also appear (?) to just do the basics of querying, opening, setting a few basics, mmap, drawing pixel values to storage, and copying over to the fb memory (making sure to use a short stosb loop, I suppose, rather than some longer approach).
-- Beware of 16 bpp comments when googling Linux frame buffer: I used fbgrab and fb2png during an X session to no avail. These each rendered an image that suggested a snapshot of my desktop screen as if the picture of the desktop had been taken using a very bad camera, underwater, and then overexposed in a dark room. The image was completely broken in color, size, and missing much detail (dotted all over with pixel colors that didn't belong). It seems that /proc /sys on the computer I used (new kernel with at most minor modifications.. from a PCLOS derivative) claim that fb0 uses 16 bpp, and most things I googled stated something along those lines, but experiments lead me to a very different conclusion. Besides the results of these two failures from standard frame buffer grab utilities (for the versions held by this distro) that may have assumed 16 bits, I had a different successful test result treating frame buffer pixel data as 32 bits. I created a file from data pulled in via cat /dev/fb0. The file's size ended up being 1920000. I then wrote a small C program to try and manipulate that data (under the assumption it was pixel data in some encoding or other). I nailed it eventually, and the pixel format matched exactly what I had gotten from X when queried (TrueColor RGB 8 bits, no alpha but padded to 32 bits). Notice another clue: my screen resolution of 800x600 times 4 bytes gives 1920000 exactly. The 16 bit approaches I tried initially all produced a similar broken image to fbgrap, so it's not like if I may not have been looking at the right data. [Let me know if you want the code I used to test the data. Basically I just read in the entire fb0 dump and then spit it back out to file, after adding a header "P6\n800 600\n255\n" that creates the suitable ppm file, and while looping over all the pixels manipulating their order or expanding them,.. with the end successful result for me being to drop every 4th byte and switch the first and third in every 4 byte unit. In short, I turned the apparent BGRA fb0 dump into a ppm RGB file. ppm can be viewed with many pic viewers on Linux.]
-- You may want to reconsider the reasons for wanting to program using fb0 (this might also account for why few examples exist). You may not achieve any worthwhile performance gains over X (this was my, if limited, experience) while giving up benefits of using X. This reason might also account for why few code examples exist.
-- Note that DirectFB is not fb. DirectFB has of late gotten more love than the older fb, as it is more focused on the sexier 3d hw accel. If you want to render to a desktop screen as fast as possible without leveraging 3d hardware accel (or even 2d hw accel), then fb might be fine but won't give you anything much that X doesn't give you. X apparently uses fb, and the overhead is likely negligible compared to other costs your program will likely have (don't call X in any tight loop, but instead at the end once you have set up all the pixels for the frame). On the other hand, it can be neat to play around with fb as covered in this comment: Paint Pixels to Screen via Linux FrameBuffer
Check for MPlayer sources.
Under the /libvo directory there are a lot of Video Output plugins used by Mplayer to display multimedia. There you can find the fbdev (vo_fbdev* sources) plugin which uses the Linux frame buffer.
There are a lot of ioctl calls, with the following codes:
FBIOGET_VSCREENINFO
FBIOPUT_VSCREENINFO
FBIOGET_FSCREENINFO
FBIOGETCMAP
FBIOPUTCMAP
FBIOPAN_DISPLAY
It's not like a good documentation, but this is surely a good application implementation.
Look at source code of any of: fbxat,fbida, fbterm, fbtv, directFB library, libxineliboutput-fbe, ppmtofb, xserver-fbdev all are debian packages apps. Just apt-get source from debian libraries. there are many others...
hint: search for framebuffer in package description using your favorite package manager.
ok, even if reading the code is sometimes called "Guru documentation" it can be a bit too much to actually do it.
The source to any splash screen (i.e. during booting) should give you a good start.

Learning about low-level graphics programming

I'm interesting in learning about the different layers of abstraction available for making graphical applications.
I see a lot of terms thrown around: At the highest level of abstraction, I hear about things like C#, .NET, pyglet and pygame. Further down, I hear about DirectX and OpenGL. Then there's DirectDraw, SDL, the Win32 API, and still other multi-platform libraries like WxWidgets.
How can I get a good sense of where one of these layers ends and where the next one begins? What is the "lowest possible level" way of creating a window in Windows, in C? What about C++? (A code sample would be divine.) What about in X11? Are the Windows implementations of OpenGL and DirectX built on top of the Win32 API? Where can I begin to learn about these things?
There's another question on SO where Programming Windows is suggested. What about for Linux? Is there an equivalent such book?
I'm aware that this is very low-level, and that there are many friendlier tools available, but I would like to at least learn the basics of what's going on beneath the surface. As much as I'd like to begin slinging windows and vectors right off the bat, starting with something like pygame is too high-level for me; I really need to make the full conceptual circuit of how you draw stuff on a computer.
I will certainly appreciate suggestions for books and resources, but I think it would be stupendously cool if the answers to this question filled up with lots of different ways to get to "Hello world" with different approaches to graphics programming. C? C++? Using OpenGL? Using DirectX? On Windows XP? On Ubuntu? Maybe I ask for too much.
The lowest level would be the graphics card's video RAM. When the computer first starts, the graphics card is typically set to the 80x25 character legacy mode.
You can write text with a BIOS provided interrupt at this point. You can also change the foreground and background color from a palette of 16 distinctive colors. You can use access ports/registers to change the display mode. At this point you could say, load a different font into the display memory and still use the 80x25 mode (OS installations usually do this) or you can go ahead and enable VGA/SVGA. It's quite complicated, that's what drivers are for.
Once the card's in the 'higher' mode you'd change what's on screen by accessing the memory mapped to the video card. It's stored horizontally pixel by pixel with some 'dirty regions' of pixels that aren't mapped to screen at the end of each line which you have to compensate for. But yeah, you could copy the pixels of an image in memory directly to the screen.
For things like DirectX, OpenGL. rather than write directly to the screen, commands are sent to the graphics card and it updates its screen automatically. Commands like "Hey you, draw this image I've loaded into the VRAM here, here and here" or "Draw these triangles with this transformation matrix..." take a fraction of the time compared to pixel by pixel . The CPU will thank you.
DirectX/OpenGL is a programmer friendly library for sending those commands to the card with all the supporting functions to help you get it done smoothly. A more direct approach would only be unproductive.
SDL is an abstraction layer so without bothering to read up on it I'd guess it would have different ways of working on each system. On one it might use semi-direct screen writing, another Direct3D, etc. Whatever's fastest as long as the code stays cross-platform..able.
The GDI/GDI+ and XWindow system. They're designed specifically to draw windows. Originally they drew using the pixel-by-pixel method (which was good enough because they'd only have to redraw when a button was pressed or a window moved, etc.) but now they use Direct3D/OpenGL for accelerated drawing (and special effects). Optimizations depend on the versions and implementations of these libraries.
So if you want the most power and speed, DirectX/openGL is the way to go. SDL is certainly useful for getting the most from a cross-platform environment and integrates with OpenGL anyway. The windowing system comes last but don't underestimate it. Especially with the stuff Microsoft's coming up with lately.
Michael Abrash's Graphics Programming 'Black Book' is a great place to start. Plus you can download it for free!
If you really want to start at the bottom then drawing a line is the most basic operation. Computer graphics is simply about filling in pixels on a grid (screen), so you need to work out which pixels to fill in to get a line that goes from (x0,y0) to (x1,y1).
Check out Bresenham's algorithm to get a feel for what is involved.
To be a good graphics and image processing programmer doesn't require this low level knowledge, but i do hate to be clueless about the insides of what i'm using. I see two ways to chase this - high-level down, or bottom-level up.
Top-down is a matter of following how the action traces from a high-level graphics operation such as to draw a circle, to the hardware. Get to know OpenGL well. Then the source to Mesa (free!) provides a peek at how OpenGL can be implemented in software. The source to Xorg would be next, first to see how the action goes from API calls through the client side to the X server. Finally you dive into a device driver that interfaces with hardware.
Bottom up: build your own graphics hardware. Think of ways it could connect to a computer - how to handle massive numbers of pixels through a few byte-size registers, how DMA would work. Write a device driver, and try designing a graphics library that might be useful for app programmers.
The bottom-up way is how i learned, years ago when it was a possibility with the slow 8-bit microprocessors. The direct experience with circuitry and hardware-software interfacing gave me a good appreciation of the difficult design decisions - e.g. to paint rectangles using clever hardware, in the device driver, or higher level. None of this is of practical everyday value, but provided a foundation of knowledge to understand newer technology.
see Open GPU Documentation section:
http://developer.amd.com/documentation/guides/Pages/default.aspx
HTH
On MSWindows it is easy: you use what the API provides, whether it is the standard windows programming API or the DirectX-family API's: that's what you use, and they are well documented.
In an X windows environment you use whatever X11-libraries that are provided. If you want to understand the principles behind windowing on X, I suggest that you do this, nevermind that many others tell you not to, it will really help you to understand graphics and windowing under X. You can read the documentation on X-programming (google for it). (After this exercise you would appreciate the higher level libraries!)
Apart from the above, at the absolutely lowest level (excluding chip-level) that you can go is to call the interrupts that switch to the various graphics modes available - there are several - and then write to the screen buffers, but for this you would have to use assembler, anything else would be too slow. Going this way will not be portable at all.
Another post mentions Abrash's Black Book - an excellent resource.
Edit: As for books on programming Linux: it is a community thing, there are many howto's around; also find a forum, join it, and as long as you act civilized you will get all the help you can ever need.
Right off the bat, I'd say "you're asking too much." From what little experience I've had, I would recommend reading some tutorials or getting a book on either directX or OpenGL to start out. To go any lower than that would be pretty complex. Most of the books I've seen in OGL or DX have pretty good introductions that explain what the functions/classes do.
Once you get the hang of one of these, you could always dig in to the libraries to see what exactly they're doing to go lower.
Or, if you really, absolutely MUST learn the LOWEST level... read the book in the above post.
libX11 is the lowest level library for X11. I believe the opengl/directx talk to the driver/hardware directly (or emulate unsupported ops), so they would be the lowest level library.
If you want to start with very low level programming, look for x86 assembly code for VGA and fire up a copy of dosbox or similar.
Vulkan api is an api which gives you very low level access to most if not all features of the gpu, computational and graphical, it works on amd and Nvidia gpus (not all)
you can also use CUDA, but it only works on Nvidia gpus and has access to computational features only, no video output.

Resources