drop/rewrite/generate keyboard events under Linux

I would like to hook into, intercept, and generate keyboard (make/break) events under Linux before they get delivered to any application. More precisely, I want to detect patterns in the key event stream and be able to discard/insert events into the stream depending on the detected patterns.
I've seen some related questions on SO, but:
either they only deal with how to get at the key events (key loggers etc.), and not with how to manipulate their propagation (they only listen, but don't intercept/generate),
or they use passive/active grabs in X (read more on that below).
A Small DSL
I explain the problem below, but to make it a bit more compact and understandable, first a small DSL definition.
A_: for make (press) key A
A^: for break (release) key A
A^->[C_,C^,U_,U^]: on A^ send a make/break combo for C and then U further down the processing chain (and finally to the application). If there is no -> then there's nothing sent (but internal state might be modified to detect subsequent events).
$X: execute an arbitrary action. This can be sending some configurable key event sequence (maybe something like C-x C-s for emacs), or executing a function. If I can only send key events, that would be enough, as I can then further process these in a window manager depending on which application is active.
Problem Description
Ok, so with this notation, here are the patterns I want to detect and what events I want to pass on down the processing chain.
A_, A^->[A_,A^]: see the explanation above; note that the send happens on A^.
A_, B_, A^->[A_,A^], B^->[B_,B^]: basically the same as 1. but overlapping events don't change the processing flow.
A_, B_, B^->[$X], A^: if there was a complete make/break of a key (B) while another key was held (A), X is executed (see above), and the break of A is discarded.
(It's in principle a simple state machine implemented over key events, which can generate (multiple) key events as output; a minimal sketch follows below.)
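To make that concrete, here is a minimal, illustrative C++ sketch of the state machine (not from the question's implementation; emitKey and runActionX are hypothetical hooks into whatever injection mechanism ends up being used). Makes are deferred; each break either replays the deferred make/break pair (patterns 1 and 2) or fires $X and "poisons" the surrounding holds (pattern 3):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Hypothetical hooks: inject an event downstream / run the $X action.
    void emitKey(char key, bool press) { std::printf("%c%c\n", key, press ? '_' : '^'); }
    void runActionX()                  { std::printf("$X\n"); }

    std::vector<char> held;      // keys currently held, in press order
    std::vector<char> poisoned;  // held keys whose break must be discarded

    void onKeyEvent(char key, bool press)
    {
        if (press) { held.push_back(key); return; }   // defer every make

        std::size_t idx = std::find(held.begin(), held.end(), key) - held.begin();
        if (idx == held.size()) return;               // break without a tracked make
        bool insideHold = idx > 0;                    // an earlier key is still held
        held.erase(held.begin() + idx);

        auto p = std::find(poisoned.begin(), poisoned.end(), key);
        if (p != poisoned.end()) { poisoned.erase(p); return; }  // pattern 3: drop A^

        if (insideHold) {                             // complete tap inside a hold
            runActionX();                             // $X
            poisoned.insert(poisoned.end(),           // outer keys' breaks get dropped
                            held.begin(), held.begin() + idx);
            return;
        }
        emitKey(key, true);                           // patterns 1/2: replay the make...
        emitKey(key, false);                          // ...then the break
    }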
Additional Notes
The solution has to work at typing speed.
Consumers of the modified key event stream run under X on Linux (consoles, browsers, editors, etc.).
Only keyboard events influence the processing (no mouse etc.).
Matching can happen on keysyms (a bit easier), or keycodes (a bit harder). With the latter, I will just have to read in the mapping to translate from code to keysym.
If possible, I'd prefer a solution that works with both USB keyboards as well as inside a virtual machine (could be a problem if working at the driver layer, other layers should be ok).
I'm pretty open about the implementation language.
Possible Solutions and Questions
So the basic question is how to implement this.
I have implemented a solution in a window manager using passive grabs (XGrabKey) and XSendEvent. Unfortunately, passive grabs don't work in this case, as they don't correctly capture B^ in the second pattern above. The reason is that the activated grab ends on A^ and does not continue to B^. A new grab is activated to capture B if it is still held, but only after ~1 sec; otherwise a plain B^ is sent to the application. This can be verified with xev.
I could convert my implementation to use an active grab (XGrabKeyboard), but I'm not sure about the effect on other applications if the window manager holds an active grab on the keyboard all the time. The X documentation describes active grabs as intrusive and designed for short-term use. If someone has experience with this and there are no major drawbacks to long-term active grabs, then I'd consider this a solution.
I'm willing to look at other layers of key event processing besides window managers (which operate as X clients). Keyboard drivers or mappings are a possibility as long as I can solve the above problem with them. This also implies that the solution doesn't have to be a separate application. I'm perfectly fine with having a driver or kernel module do this for me. Be aware though that I have never done any kernel or driver programming, so I would appreciate some good resources.
Thanks for any pointers!

Use XInput2 to make the device (keyboard) floating, then monitor KeyPress and KeyRelease events on the device, using XTest to regenerate the KeyPress and KeyRelease events.
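A rough C++ sketch of that approach (my own illustration, not tested code from the answer): the hard-coded device id 9 is an assumption and must be looked up with "xinput list"; build with -lX11 -lXi -lXtst.

    #include <X11/Xlib.h>
    #include <X11/extensions/XInput2.h>
    #include <X11/extensions/XTest.h>

    int main()
    {
        Display *dpy = XOpenDisplay(0);
        if (!dpy) return 1;

        int xiOpcode, evBase, errBase;
        XQueryExtension(dpy, "XInputExtension", &xiOpcode, &evBase, &errBase);

        // Detach (float) the slave keyboard: its events stop reaching the
        // virtual core keyboard, and thus applications.
        XIDetachSlaveInfo detach;
        detach.type = XIDetachSlave;
        detach.deviceid = 9;                    // assumed id, see "xinput list"
        XIChangeHierarchy(dpy, (XIAnyHierarchyChangeInfo *)&detach, 1);

        unsigned char maskBits[XIMaskLen(XI_LASTEVENT)] = { 0 };
        XISetMask(maskBits, XI_KeyPress);
        XISetMask(maskBits, XI_KeyRelease);
        XIEventMask mask;
        mask.deviceid = detach.deviceid;
        mask.mask_len = sizeof(maskBits);
        mask.mask = maskBits;
        XISelectEvents(dpy, DefaultRootWindow(dpy), &mask, 1);
        XFlush(dpy);

        for (;;) {
            XEvent ev;
            XNextEvent(dpy, &ev);
            if (ev.xcookie.type == GenericEvent &&
                ev.xcookie.extension == xiOpcode &&
                XGetEventData(dpy, &ev.xcookie)) {
                XIDeviceEvent *de = (XIDeviceEvent *)ev.xcookie.data;
                // Pattern matching/filtering goes here; this sketch re-emits
                // every event unchanged. XTest injects via the virtual core
                // keyboard, so the re-emitted events are not captured again.
                XTestFakeKeyEvent(dpy, de->detail, de->evtype == XI_KeyPress, 0);
                XFlush(dpy);
                XFreeEventData(dpy, &ev.xcookie);
            }
        }
    }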


LabVIEW playing more than one sound at a time

I'm using an event structure and want to build something like a Launchpad.
Each number on the numeric keypad has a sound assigned to it.
The problem is that when I press, for example, the number one, the program waits until that sound finishes playing before I can press, for example, the number four.
Is it possible to play the sounds of three keys at the same time using an event structure?
I put the files online here and added screenshots below. Block diagram:
Front panel:
Working Solution
I think I got this working much more easily than I expected using the Play Sound File VI under the Graphics and Sound -> Sound -> Output palette. That link is to the 2011 documentation (I couldn't find a more recent one), but it does not look like it has changed. The working result is shown below, with two different events handled by the event structure:
Key Down? event:
Stop Button event:
You may be fine without using the Sound Output Clear VI to the right of the main event loop, but having it there won't hurt.
It turns out that the Play Sound File VI does not block, so you can play multiple overlapping sound files. If you run into blocking on your machine (one sound file plays, then the next, and so on), let me know because I have another solution that might work.
A word on events
An important thing to understand is that events are handled in a queue. When you press keys, those key presses go in order onto the event queue. Each time your event-handling loop executes, it takes the oldest event out of that queue and processes it. The event structure in LabVIEW will only handle one event per iteration of your event-handling loop. On the next iteration, if events are still in the queue that your structure is set up to process, it will take the next-oldest one for that iteration and repeat.
Now, let's say that you want to do some super complicated processing that takes 10 seconds every time you press a key, and you put that processing inside of your main event loop. Your key presses still go onto the event queue as fast as you press them, but LabVIEW has to wait the full 10 seconds before it can dequeue the next keypress and process it, so you end up with an application that seems to hang while it chugs through the queue much more slowly than you are adding items to it.
One way to get around this is to take that complicated processing and put it outside of the queue in another process. If you have the resources, you can actually call a separate instance of a processing sub-VI in its own thread for every one of those key presses. This allows the event handling loop to spawn processes as fast as you can press keys, and then your processes take whatever time they need to simultaneously (resources permitting) perform whatever actions you wanted.
Essentially that is what the Play Sound File VI is doing. It sees that you want to play a file and spawns a process to play that sound over the speakers, allowing the event-handling loop to continue immediately rather than waiting for the sound to finish playing. As you press more keys, more processes get spawned that terminate when they are finished. You can do this manually too, which is the other solution that I have for you if Play Sound File does not behave the same way for you as it did for me.
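For comparison, here is a tiny C++ sketch (not LabVIEW, just to illustrate the pattern) of an event handler that spawns a detached worker per event so the loop itself never blocks:

    #include <chrono>
    #include <iostream>
    #include <thread>

    // Each dequeued "key press" spawns a detached worker, so the event
    // loop returns immediately instead of waiting for playback to finish.
    void playSound(int key)
    {
        std::cout << "playing sound for key " << key << "\n";
        std::this_thread::sleep_for(std::chrono::seconds(2)); // stands in for playback
    }

    int main()
    {
        for (int key = 1; key <= 4; key++)          // stands in for the event queue
            std::thread(playSound, key).detach();
        std::this_thread::sleep_for(std::chrono::seconds(3)); // let the sounds overlap
        return 0;
    }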
Update:
Thanks to @Engineero for pointing out that the Play Sound File VI actually isn't blocking. The updated code shows how to play overlapping sounds. I'll leave it to the user to add the Stop Sound on Key Up code. No timeout is needed because nothing is taking place in the event structure.
Also, note that for me the Play Sound File VI needed to be in a while loop to keep playing. I'm not sure why this is needed, but the NI example sets it up this way (\examples\Graphics and Sound\Sound\Sound Player.vi).
Finally, you may crash the VI if your sound card gets overwhelmed, as mentioned here. If that happens, I would go with a better sound library to try to squeeze more performance out of your sound card.
Original:
First, I assume you are referring to this Launchpad?
I was able to press up to 4 keys at once with the following - the important thing is to set the event timeout to 1 ms. If you need more than that, it will require a more sophisticated design.
I was not able to easily implement a sound because all the basic LabVIEW beeps are what's considered "blocking I/O", meaning that if you call two Beeps simultaneously, Windows will play one after the other, not both at the same time. You will need to implement your instrument notes using non-blocking I/O, probably in a language other than LabVIEW, such as this C++ library.

Draw on top of suspended full-screen Direct3D app

Currently, I am able to hook into a Direct3D application and draw custom stuff onto its surface. However, I would like to suspend this application and then draw something else.
Is this even remotely possible? Something like creating my own Direct3D window on top of that application?
I'm targeting only Windows 7, but the application I want to draw on uses only DirectX 9.
The problem is that I have very little experience with DirectX in general.
Sort of.
You're working with two different elements here, one quite large but not particularly complex: hooking D3D. The other ("suspending" the app) is simple within that, but you don't quite want what you think you want.
To hook D3D, by the simplest method, you need to intercept the call to Direct3DCreate9 and return your own IDirect3D9, which later creates and returns your own IDirect3DDevice9. This will give you full control over the app's render process.
In order to "suspend" it, you need to wait for the desired trigger, then in your IDirect3DDevice9::Present, call your own event loop. This will, for all intents and purposes, suspend execution of the original app's code, but not the process itself (allowing your code and event loop to process). There will be some limitations of this, and you may not be able to consume window/Windows events (simply), but it will give you full control and effectively pause the original app.
Note, however, that you must intercept and reroute execution in every thread you want to "suspend": the technique only affects a single thread, and you don't want physics or AI to keep crunching on while rendering and UI are paused.
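As a hedged sketch of that "suspend" idea (names are illustrative; g_suspended would be flipped by your trigger logic):

    #include <windows.h>

    volatile bool g_suspended = true;  // hypothetical flag set by the trigger logic

    // Called from the Present hook: pump messages ourselves so the original
    // app's code never regains control while "suspended".
    void RunSuspendedLoop()
    {
        MSG msg;
        while (g_suspended && GetMessage(&msg, 0, 0, 0) > 0) {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
            // overlay drawing and calls to the real Present() go here
        }
    }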
You need to perform your overlay drawing, whatever that may be, during your loop or your IDirect3DDevice9::Present hook, then call the real device's Present method as needed. If you want to run multiple frames of your overlay, call the real Present repeatedly before returning from your Present; tweak as necessary. Rendering here is done pretty much normally (check out general D3D tutorials for that), but there is one major catch: the device's state is unknown and may be incompatible, but must be "untouched" on return. This is handled simply by caching an IDirect3DStateBlock9 created from the device immediately after creating it. In your Present hook, create another state block with the state on entrance, restore the clean state block, run your code, then restore the entrance state block. You can work with any states, off a fresh slate, without damaging the device's state (I use this in practice; it works great).
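A sketch of that state-block dance, assuming a wrapper class holding the real device and a clean state block captured right after device creation (the names are mine, not from any particular hook framework):

    #include <d3d9.h>

    struct DeviceWrapper {
        IDirect3DDevice9     *m_real;   // the app's actual device
        IDirect3DStateBlock9 *m_clean;  // captured immediately after creation

        void DrawOverlay() { /* your own rendering goes here */ }

        HRESULT Present(const RECT *src, const RECT *dst,
                        HWND wnd, const RGNDATA *dirty)
        {
            IDirect3DStateBlock9 *entrance = 0;
            m_real->CreateStateBlock(D3DSBT_ALL, &entrance); // snapshot app state
            m_clean->Apply();                                // start from a clean slate
            DrawOverlay();
            entrance->Apply();                               // hand back the app's state
            entrance->Release();
            return m_real->Present(src, dst, wnd, dirty);
        }
    };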
If you want some rather extensive examples of how this works, I'd suggest checking out the Voodoo Shader project, which has full D3D8 and 9 hooks, including everything needed for overlays [/shameless own-project promotion]. Feel free to reuse any of the concepts, or comment with further questions; this certainly isn't all the details that may be useful to you.
This is a very complex thing to accomplish, as it is very much a hack to do so. The only people you see doing such things are Steam, TeamSpeak, Xfire, Fraps, and a few hard-core devs.
There are kits out on the internet that show how to inject a DLL into the memory space of the target application to achieve such a feat, as well as methods such as proxy DLLs.
Proxy DLL:
http://www.codeguru.com/cpp/g-m/directx/directx8/article.php/c11453
Injection:
http://www.progamercity.net/d3d/372-c-directx9-0-hooking-via-detours.html
Good luck, this will take you a while.

Intercept and send keystrokes with Python on Linux

I'm looking for a way to intercept all keyboard signals before they reach the active application. I then want to interpret and map the keystrokes before sending them on to the currently active application.
A Python library would be great, but C/C++ would also suffice.
I'm assuming you are using a system with X(org). If not, some of this can also be done at the evdev level, but that's another story.
Two parts in your question:
intercepting all key events -> XGrabKeyboard()
sending key events to the active application: I'd use libfakekey. It's a bit hacky (it dynamically remaps part of the current keymap to send the KeySym you want), but it worked for me (small tip: don't forget to generate both the key press and the key release events :p).
Of course, in the application grabbing the keyboard, you will have to listen for the KeyEvents from X and send keys from there.
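A minimal C++ sketch of that grab-and-resend loop, using XTest instead of libfakekey to keep it self-contained (build with -lX11 -lXtst); note the caveat in the comments:

    #include <X11/Xlib.h>
    #include <X11/extensions/XTest.h>

    int main()
    {
        Display *dpy = XOpenDisplay(0);
        if (!dpy) return 1;

        // Actively grab all key events before any application sees them.
        XGrabKeyboard(dpy, DefaultRootWindow(dpy), True,
                      GrabModeAsync, GrabModeAsync, CurrentTime);

        for (;;) {
            XEvent ev;
            XNextEvent(dpy, &ev);
            if (ev.type == KeyPress || ev.type == KeyRelease) {
                // Interpret/map here; this sketch forwards the event as-is.
                // Caveat: events faked with XTest are delivered back to the
                // grab, so a real implementation must tag them or briefly
                // ungrab around the send to avoid a loop.
                XTestFakeKeyEvent(dpy, ev.xkey.keycode, ev.type == KeyPress, 0);
                XFlush(dpy);
            }
        }
    }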

Does anybody have some advice on programming realtime audio synthesis?

I'm currently working on a personal project: creating a library for realtime audio synthesis in Flash. In short: tools to connect wave generators, filters, mixers, etc. with each other and supply the sound card with raw (realtime) data. Something like Max/MSP or Reaktor.
I already have some working stuff, but I'm wondering if the basic setup that I wrote is right. I don't want to run into problems later on that force me to change the core of my app (although that can always happen).
Basically, what I do now is start at the end of the chain, at the place where the (raw) sound data goes 'out' (to the sound card). To do that, I need to write chunks of bytes (ByteArrays) to an object, and to get that chunk I ask whatever module is connected to my 'Sound Out' module to give me its chunk. That module makes the same request to the module connected to its input, and that keeps happening until the start of the chain is reached.
Is this the right approach? I can imagine running into problems if there's a feedback loop, or if there's another module with no output: if I were to connect a spectrum analyzer somewhere, that would be a dead end in the chain (a module with no outputs, just an input). In my current setup, such a module wouldn't work because I only start calculating from the sound-output module.
Has anyone experience with programming something like this? I'd be very interested in some thoughts about the right approach. (For clarity: I'm not looking for specific Flash implementations, and that's why I didn't tag this question under flash or actionscript.)
I did a similar thing a while back, and I used the same approach as you do - start at the virtual line out, and trace the signal back to the top. I did this per sample though, not per buffer; if I were to write the same application today, I might choose per-buffer instead, because I suspect it would perform better.
The spectrometer was designed as an insert module, that is, it would only work if both its input and its output were connected, and it would pass its input to the output unchanged.
To handle feedback, I had a special helper module that introduced a 1-sample delay and would only fetch its input once per cycle.
Also, I think doing all your internal processing with floats, and thus arrays of floats as the buffers, would be a lot easier than byte arrays, and it would save you the extra effort of converting between integers and floats all the time.
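For illustration, the pull model being discussed can be as small as this C++ sketch (the names are illustrative, not from the original library):

    #include <cmath>
    #include <vector>

    // Each module fills a float buffer by first pulling from its input,
    // then processing in place.
    struct Module {
        Module *input = nullptr;                 // upstream module, if any
        virtual void pull(std::vector<float> &buf) = 0;
        virtual ~Module() {}
    };

    struct Sine : Module {                       // chain start: generates samples
        float phase = 0.0f;
        void pull(std::vector<float> &buf) override {
            for (float &s : buf) { s = std::sin(phase); phase += 0.05f; }
        }
    };

    struct Gain : Module {                       // processor: pulls, then scales
        float gain = 0.5f;
        void pull(std::vector<float> &buf) override {
            if (input) input->pull(buf);         // ask upstream for its chunk first
            for (float &s : buf) s *= gain;
        }
    };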
In later versions you may have different packet rates in different parts of your network. One example would be if you extend it to transfer data to or from disk. Another example would be that low-data-rate control variables, such as one controlling echo delay, may later become part of your network. You probably don't want to process control variables with the same frequency that you process audio packets, but they are still 'real time' and part of the function network. They may, for example, need smoothing to avoid sudden transitions.
As long as you are calling all your functions at the same rate, and all the functions take essentially constant time, your pull-the-data approach will work fine. There will be little to choose between pulling data and pushing. Pulling is somewhat more natural for playing audio, pushing is somewhat more natural for recording, but either works and ends up making the same calls to the underlying audio processing functions.
For the spectrometer you've got the issue of multiple sinks for data, but it is not a problem. Introduce a dummy link to it from the real sink. The dummy link can cause a request for data that is not honoured. As long as the dummy link knows it is a dummy and does not care about the lack of data, everything will be OK. This is a standard technique for reducing multiple sinks or sources to a single one.
With this kind of network you do not want to do the same calculation twice in one complete update. For example if you mix a high-passed and low-passed version of a signal you do not want to evaluate the original signal twice. You must do something like record a timer tick value with each buffer, and stop propagation of pulls when you see the current tick value is already present. This same mechanism will also protect you against feedback loops in evaluation.
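A sketch of that tick-guard idea (illustrative C++, matching the pull model above):

    #include <vector>

    // A module computes its buffer at most once per network update (tick)
    // and hands out the cached result on any later pull in the same tick.
    struct CachedModule {
        long lastTick = -1;
        std::vector<float> cache;

        void pull(std::vector<float> &out, long tick) {
            if (tick != lastTick) {
                lastTick = tick;   // set before computing: a feedback pull during
                compute(cache);    // compute() then sees the previous buffer
            }
            out = cache;           // repeated pulls in one tick reuse the cache
        }
        void compute(std::vector<float> &buf) { /* fill buf from inputs */ }
    };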
So, those two issues of concern to you are easily addressed within your current framework.
Rate matching where there are different packet rates in different parts of the network is where the problems with the current approach will start. If you are writing audio to disk then for efficiency you'll want to write large chunks infrequently. You don't want to be blocking your servicing of the more frequent small audio input and output processing packets during those writes. A single rate pulling or pushing strategy on its own won't be enough.
Just accept that at some point you may need a more sophisticated way of updating than a single rate network. When that happens you'll need threads for the different rates that are running, or you'll write your own simple scheduler, possibly as simple as calling less frequently evaluated functions one time in n, to make the rates match. You don't need to plan ahead for this. Your audio functions are almost certainly already delegating responsibility for ensuring their input buffers are ready to other functions, and it will only be those other functions that need to change, not the audio functions themselves.
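The "one time in n" scheduler can be as simple as this sketch (the names and the divider are assumptions):

    const int kControlDivider = 8;    // assumed audio-to-control rate ratio

    void pullAudioGraph(long tick);   // hypothetical: produce one audio buffer
    void updateControls(long step);   // hypothetical: low-rate control variables

    void onBufferNeeded(long tick)    // called once per audio buffer
    {
        pullAudioGraph(tick);
        if (tick % kControlDivider == 0)            // "one time in n"
            updateControls(tick / kControlDivider);
    }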
The one thing I would advise at this stage is to be careful to centralise audio buffer allocation, noticing that buffers are like fenceposts. They don't belong to an audio function; they lie between the audio functions. Centralising the buffer allocation will make it easy to retrospectively modify the update strategy for different rates in different parts of the network.

Excluding some keys from XGrabKeyboard

Consider an application where it's desirable to grab the keyboard when focused in order to capture all window manager commands (Alt+F4 and whatnot) for processing. Now, this has the downside that the user has no way of switching to another application or virtual desktop via the keyboard while the keyboard is grabbed. I'd like to have a user-defined whitelist of key combinations (say, the key combinations for switching virtual desktops) that are excluded from the grab.
I can think of two possible approaches. When a whitelisted key event arrives, either
Somehow tell X to continue processing it as usual. This sounds like a more natural way of doing it but I can't find a way to do this, or
Ungrab the keyboard and re-send the event by hand to the window manager for processing; however, I don't know where to send it (the root window?) or whether that would even work.
Can anyone fill in the blanks on those? Any other suggestions?
If there's no way to exclude keys from a grab, I guess I'll have to settle for having an "escape key" that ungrabs the keyboard when pressed. The user will have to press both that and then the window manager command, though, which isn't as nice.
I don't think there's a way to do it. None of the mechanisms work quite how you would need them to.
Approach 1 is sort of what the window manager does if it decides not to intercept a click or key, for example. However, the WM is using "passive" grabs on particular keys (XGrabKey = passive, XGrabKeyboard = active) and then XAllowEvents(). XAllowEvents() does not work with XGrabKeyboard(). Also, when you call XAllowEvents with one of the Replay modes, the replayed event bypasses all passive grabs on the window that had the original grab and on all its parent windows. The WM's grabs will be on the root window, which will always be a parent, so there is no way to replay to the root window, best I can tell. Doing XGrabKey on every possible key would be sort of psycho anyhow.
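For reference, the passive-grab-plus-XAllowEvents mechanism described above looks roughly like this (my own C++ sketch, build with -lX11; as noted, it does not combine with XGrabKeyboard):

    #include <X11/Xlib.h>
    #include <X11/keysym.h>

    int main()
    {
        Display *dpy = XOpenDisplay(0);
        if (!dpy) return 1;
        Window root = DefaultRootWindow(dpy);

        // Passive grab on Alt+F4: keyboard processing freezes on a match
        // until we call XAllowEvents.
        KeyCode f4 = XKeysymToKeycode(dpy, XK_F4);
        XGrabKey(dpy, f4, Mod1Mask, root, True, GrabModeAsync, GrabModeSync);

        for (;;) {
            XEvent ev;
            XNextEvent(dpy, &ev);
            if (ev.type == KeyPress) {
                // ReplayKeyboard re-injects the frozen event as if this grab
                // did not exist (bypassing passive grabs on this window and
                // its parents, as noted above); AsyncKeyboard would swallow it.
                XAllowEvents(dpy, ReplayKeyboard, ev.xkey.time);
                XFlush(dpy);
            }
        }
    }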
Approach 2 would have bad race condition problems, because other key and mouse events could be processed before you could re-send, so you'd reorder keys, send events to destroyed windows, and cause other confusion. Also, there is no good way to send a key event: XSendEvent() is ignored by many clients (it sets a send_event flag in the event, allowing this). The XTest extension can be used, but it may be disabled on production X servers and still has race condition issues.
What you probably would need is a protocol extension that let you do an AllowEvents(mode=ReplayKeyboard) after a GrabKeyboard and without bypassing passive grabs on parent windows.
One caveat is that I don't know all the wild stuff that can be done with XKB and XInput2, so maybe there's something in those extensions.
Anyway, as far as I know you have to settle for the "escape key." It might be nice eventually for the X server and/or the window manager specs to have "VMware/VNC-type-thing awareness," but that won't help you in the short term. An EWMH spec extension could be as simple as a new _NET_WM_WINDOW_TYPE for vnc/vmware/stuff-like-that, and the window manager could reduce its keybindings, or add an extra modifier to them, when such a window was focused, for example.
