Excluding some keys from XGrabKeyboard - keyboard

Consider an application where it's desirable to grab the keyboard when focused in order to capture all window manager commands (Alt+F4 and whatnot) for processing. Now, this has the downside that the user has no way of switching to another application or virtual desktop via the keyboard when the keyboard is grabbed. I'd like to have a user-defined whitelist of key combinations (say, the key combinations for switching virtual desktops) that are excluded from the grab.
I can think of two possible approaches. When a whitelisted key event arrives, either
Somehow tell X to continue processing it as usual. This sounds like a more natural way of doing it but I can't find a way to do this, or
Ungrab the keyboard and re-send the event by hand to the window manager for processing. However, I don't know where to send it (the root window?) or whether that would even work.
Can anyone fill in the blanks on those? Any other suggestions?
If there's no way to exclude keys from a grab, I guess I'll have to settle for having an "escape key" that ungrabs the keyboard when pressed. The user will have to press both that and then the window manager command, though, which isn't as nice.

I don't think there's a way to do it. None of the mechanisms work quite how you would need them to.
Approach 1 is sort of what the window manager does if it decides not to intercept a click or key, for example. However, the WM uses "passive" grabs on particular keys (XGrabKey is passive, XGrabKeyboard is active) and then XAllowEvents(). XAllowEvents() does not work with XGrabKeyboard(). Also, when you call XAllowEvents() with one of the Replay modes, the replayed event bypasses all passive grabs on the window that had the original grab and on all its parent windows. The WM's grabs will be on the root window, which will always be a parent, so there is no way to replay to the root window, as best I can tell. Doing XGrabKey on every possible key would be sort of psycho anyhow.
Approach 2 would have bad race condition problems, because other key and mouse events could be processed before you could resend, so you'd reorder keys, send events to destroyed windows, and cause other confusion. Also, there is no good way to send a key event. XSendEvent() is ignored by many clients (it sets a send_event flag in the event, which allows this). The XTest extension can be used, but it may be disabled on production X servers and still has race condition issues.
What you probably would need is a protocol extension that let you do an AllowEvents(mode=ReplayKeyboard) after a GrabKeyboard and without bypassing passive grabs on parent windows.
One caveat is that I don't know all the wild stuff that can be done with XKB and XInput2, so maybe there's something in those extensions.
Anyway, as far as I know you have to settle for the "escape key." It might be nice eventually for the X server and/or the window manager specs to have "VMware/VNC-type-thing awareness," but that won't help you in the short term. An EWMH spec extension could be as simple as a new _NET_WM_WINDOW_TYPE for vnc/vmware/stuff-like-that; the window manager could then reduce its keybindings or add an extra modifier to them when that window is focused, for example.
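To make the "escape key" fallback a bit more concrete, here is a minimal sketch of just the matching logic, in plain Python with no X bindings. The mask values are the standard modifier bits from X.h and 0xff1b is XK_Escape. The function name, the choice of Ctrl+Alt+Escape, and the assumption that Mod1 is Alt and Mod2 is NumLock are mine (check xmodmap on your system). A real client would call XUngrabKeyboard() when this returns True.

```python
# Standard X11 modifier bits (X.h).
SHIFT_MASK = 1 << 0
LOCK_MASK = 1 << 1      # CapsLock
CONTROL_MASK = 1 << 2
MOD1_MASK = 1 << 3      # assumed to be Alt
MOD2_MASK = 1 << 4      # assumed to be NumLock

MODIFIER_BITS = 0xFF                 # the event state also carries button bits above this
IGNORED_LOCKS = LOCK_MASK | MOD2_MASK

XK_ESCAPE = 0xFF1B

# Hypothetical user-configured escape combination: Ctrl+Alt+Escape.
ESCAPE_COMBO = (XK_ESCAPE, CONTROL_MASK | MOD1_MASK)


def is_escape_combo(keysym, state, combo=ESCAPE_COMBO):
    """Return True if a key press (keysym plus event state) matches the combo.

    Lock modifiers and mouse button bits are masked out so that CapsLock,
    NumLock, or a held mouse button do not stop the combo from matching.
    """
    want_keysym, want_mods = combo
    relevant = state & MODIFIER_BITS & ~IGNORED_LOCKS
    return keysym == want_keysym and relevant == want_mods
```

Masking out the lock bits matters in practice: without it the escape key silently stops working whenever NumLock is on.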

Related

Draw on top of suspended full-screen Direct3D app

Currently, I am able to hook into a Direct3D application and draw custom stuff onto its surface. However, I would like to suspend this application and then draw something else.
Is this even remotely possible? For example, by creating my own Direct3D window on top of that application?
I'm targeting only Windows 7, but the application I want to draw on is using only DirectX 9.
The problem is that I have very little experience with DirectX in general.
Sort of.
You're working with two different elements here. One is quite large but not particularly complex: hooking D3D. The other ("suspending" the app) is simple within that, but you don't quite want what you think you want.
To hook D3D, by the simplest method, you need to intercept the call to Direct3DCreate9 and return your own IDirect3D9, which later creates and returns your own IDirect3DDevice9. This will give you full control over the app's render process.
In order to "suspend" it, you need to wait for the desired trigger, then in your IDirect3DDevice9::Present, call your own event loop. This will, for all intents and purposes, suspend execution of the original app's code, but not the process itself (allowing your code and event loop to process). There will be some limitations of this, and you may not be able to consume window/Windows events (simply), but it will give you full control and effectively pause the original app.
Note, however, that you must intercept and reroute execution in every thread you want to "suspend." The technique only affects a single thread, and you don't want physics or AI crunching on while render and UI are paused.
You need to perform your overlay drawing, whatever that may be, during your loop or your IDirect3DDevice9::Present hook, then call the real device's Present method as needed. If you want to run multiple frames of your overlay, then call the real Present repeatedly before returning from your Present. Tweak as necessary. Rendering here is done pretty much normally (check out general D3D tutorials for that), but there is one major catch: the device's state is unknown and may be incompatible, but must be "untouched" on return. This is handled simply by caching an IDirect3DStateBlock9 created from the device immediately after creating it. In your Present hook, create another state block with the state on entrance, restore the clean state block, run your code, then restore the entrance state block. You can work with any states, off a fresh slate, without damaging the device's state (I use this in practice, it works great).
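The order of operations in the Present hook can be sketched with a toy model. This is plain Python, not Direct3D: FakeDevice, PresentHook, and the state names are hypothetical stand-ins, and a dict copy plays the role of a state block. In real D3D9 code the corresponding calls would be IDirect3DDevice9::CreateStateBlock and IDirect3DStateBlock9::Capture/Apply.

```python
class FakeDevice:
    """Toy stand-in for a render device: a dict of states plus a call log."""

    def __init__(self):
        self.state = {}
        self.log = []

    def capture(self):
        # Models recording the current device state into a state block.
        return dict(self.state)

    def apply(self, block):
        # Models applying a previously captured state block.
        self.state = dict(block)

    def draw(self, what):
        self.log.append((what, dict(self.state)))

    def present(self):
        self.log.append(("present", dict(self.state)))


class PresentHook:
    def __init__(self, device):
        self.device = device
        # Captured once, right after device creation: the "clean" block.
        self.clean = device.capture()

    def present(self, draw_overlay):
        entrance = self.device.capture()   # state the app left behind
        self.device.apply(self.clean)      # start the overlay from a clean slate
        try:
            draw_overlay(self.device)      # overlay may change any state it likes
        finally:
            self.device.apply(entrance)    # hand the app's state back untouched
        self.device.present()              # forward to the real Present


def overlay(device):
    device.state["ZENABLE"] = False
    device.draw("overlay")


device = FakeDevice()
hook = PresentHook(device)
device.state["ALPHABLENDENABLE"] = True    # state set by the hooked app
hook.present(overlay)
```

The try/finally is the important part of the design: the entrance state is restored even if the overlay code fails, so the hooked app never sees a device it did not configure.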
If you want some rather extensive examples of how this works, I'd suggest checking out the Voodoo Shader project, which has full D3D8 and 9 hooks, including everything needed for overlays [/shameless own-project promotion]. Feel free to reuse any of the concepts, or comment with further questions; this certainly isn't all the details that may be useful to you.
This is a very complex thing to accomplish, as it is very much a hack to do so. The only people you see doing such things are Steam, TeamSpeak, Xfire, Fraps, and a few hard-core devs.
There are kits out on the internet that show you how to inject a DLL into the memory space of the target application to achieve such a feat, and methods such as proxy DLLs.
Proxy DLL:
http://www.codeguru.com/cpp/g-m/directx/directx8/article.php/c11453
Injection:
http://www.progamercity.net/d3d/372-c-directx9-0-hooking-via-detours.html
Good luck, this will take you a while.

Intercept and send keystrokes with Python on Linux

I'm looking for a way to intercept all keyboard signals before they reach the active application. I then want to interpret and map the keystrokes before sending them on to the currently active application.
A Python library would be great, but C/C++ would also suffice.
I'm assuming you are using a system with X(org). If not, some of this can also be done at the evdev level, but that's another story.
Two parts in your question:
intercepting all key events -> XGrabKeyboard()
sending key events to the active application: I'd use libfakekey. It's a bit hacky (it dynamically remaps part of the current keymap to send the KeySym you want to send) but it worked for me (small tip: don't forget to generate both the key press and key release events :p).
Of course in your application grabbing the keyboard, you will have to listen to the KeyEvents from X and send keys from there.

drop/rewrite/generate keyboard events under Linux

I would like to hook into, intercept, and generate keyboard (make/break) events under Linux before they get delivered to any application. More precisely, I want to detect patterns in the key event stream and be able to discard/insert events into the stream depending on the detected patterns.
I've seen some related questions on SO, but:
either they only deal with how to get at the key events (key loggers etc.), and not how to manipulate the propagation of them (they only listen, but don't intercept/generate).
or they use passive/active grabs in X (read more on that below).
A Small DSL
I explain the problem below, but to make it a bit more compact and understandable, first a small DSL definition.
A_: for make (press) key A
A^: for break (release) key A
A^->[C_,C^,U_,U^]: on A^ send a make/break combo for C and then U further down the processing chain (and finally to the application). If there is no -> then there's nothing sent (but internal state might be modified to detect subsequent events).
$X: execute an arbitrary action. This can be sending some configurable key event sequence (maybe something like C-x C-s for emacs), or execute a function. If I can only send key events, that would be enough, as I can then further process these in a window manager depending on which application is active.
Problem Description
Ok, so with this notation, here are the patterns I want to detect and what events I want to pass on down the processing chain.
1. A_, A^->[A_,A^]: for an explanation see above; note that the send happens on A^.
2. A_, B_, A^->[A_,A^], B^->[B_,B^]: basically the same as 1., but overlapping events don't change the processing flow.
3. A_, B_, B^->[$X], A^: if there was a complete make/break of a key (B) while another key was held (A), X is executed (see above), and the break of A is discarded.
(In principle, it's a simple state machine implemented over key events, which can generate (multiple) key events as output.)
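The three patterns can be modelled as a small state machine. Below is a minimal sketch in plain Python, independent of where the events come from (X, evdev, or anything else). The class name, the (kind, key) tuple format, and the "action" output standing in for $X are my own assumptions for illustration, not part of any library. How to treat auto-repeat and more than two overlapping keys is also a guess at the intended behaviour.

```python
class KeyPatternMachine:
    """Turns raw make/break events into the rewritten event stream."""

    def __init__(self):
        self._held = {}   # key -> [press order, consumed-as-holder flag]
        self._seq = 0

    def feed(self, kind, key):
        """Feed one event ("press" or "release") and return the output events."""
        if kind == "press":
            if key not in self._held:          # ignore auto-repeat presses
                self._seq += 1
                self._held[key] = [self._seq, False]
            return []                          # nothing is sent on a make

        entry = self._held.pop(key, None)
        if entry is None:                      # release without a known press
            return []
        order, consumed = entry
        if consumed:                           # pattern 3: break of A is discarded
            return []

        # Keys that were already down when this key was pressed and still are.
        holders = [k for k, (o, _) in self._held.items() if o < order]
        if holders:                            # pattern 3: B tapped while A held
            for k in holders:
                self._held[k][1] = True
            return [("action", "+".join(holders + [key]))]

        # Patterns 1 and 2: plain tap, sent on the break.
        return [("press", key), ("release", key)]


def run(events):
    """Convenience helper: feed a whole sequence and collect the outputs."""
    machine = KeyPatternMachine()
    return [machine.feed(kind, key) for kind, key in events]
```

Feeding A_, B_, B^, A^ through run() yields a single ("action", "A+B") on B^ and nothing on A^, while A_, B_, A^, B^ yields an ordinary tap of A followed by a tap of B.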
Additional Notes
The solution has to work at typing speed.
Consumers of the modified key event stream run under X on Linux (consoles, browsers, editors, etc.).
Only keyboard events influence the processing (no mouse etc.)
Matching can happen on keysyms (a bit easier), or keycodes (a bit harder). With the latter, I will just have to read in the mapping to translate from code to keysym.
If possible, I'd prefer a solution that works with both USB keyboards as well as inside a virtual machine (could be a problem if working at the driver layer, other layers should be ok).
I'm pretty open about the implementation language.
Possible Solutions and Questions
So the basic question is how to implement this.
I have implemented a solution in a window manager using passive grabs (XGrabKey) and XSendEvent. Unfortunately, passive grabs don't work in this case, as they don't correctly capture B^ in the second pattern above. The reason is that the converted grab ends on A^ and is not continued to B^. A new grab is converted to capture B if still held, but only after ~1 sec. Otherwise a plain B^ is sent to the application. This can be verified with xev.
I could convert my implementation to use an active grab (XGrabKeyboard), but I'm not sure about the effect on other applications if the window manager has an active grab on the keyboard all the time. X documentation refers to active grabs as being intrusive and designed for short-term use. If someone has experience with this and there are no major drawbacks with long-term active grabs, then I'd consider this a solution.
I'm willing to look at other layers of key event processing besides window managers (which operate as X clients). Keyboard drivers or mappings are a possibility as long as I can solve the above problem with them. This also implies that the solution doesn't have to be a separate application. I'm perfectly fine to have a driver or kernel module do this for me. Be aware though that I have never done any kernel or driver programming, so I would appreciate some good resources.
Thanks for any pointers!
Use XInput2 to make the device (keyboard) floating, then monitor KeyPress and KeyRelease events on the device, using XTest to regenerate the KeyPress and KeyRelease events.

How to monitor screen updates?

I am trying to write a program that monitors when the screen has been redrawn.
Meaning if any part of any window is redrawn, then the program is notified.
As far as I understand I should use a journal record hook like at
http://www.vbaccelerator.com/home/vb/code/libraries/Hooks/Journal_Record_Hooks/article.asp
However, I do not understand which MSG type would get me the WM_PAINT events (WH_CALLWNDPROC and WH_CALLWNDPROCRET do not seem to do the job). I'm not even sure that WM_PAINT is what I'm looking for...
Basically, if I knew when the DC associated with GetDesktopWindow() has changed then my problem would be solved.
Question is: How do you monitor screen updates?
I don't believe this is possible without hooking the display driver. I can imagine there would be some serious performance implications if it were possible in general...
You would be better off taking a screenshot every second or so. Every version of Windows has the little network icon in the tray that keeps changing when you transfer data over a network, meaning the screen will be changing pretty much constantly.
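The polling idea can be sketched like this. It is a minimal example in plain Python that only shows the compare-against-previous-frame loop. The capture callable is a hypothetical placeholder for whatever actually grabs the screen and returns its pixels as bytes; the function name and parameters are my own.

```python
import hashlib
import time


def watch_for_changes(capture, polls, interval=1.0, sleep=time.sleep):
    """Yield the index of every poll whose frame differs from the previous one.

    capture:  callable returning the current screen contents as bytes
              (hypothetical; supply your own screenshot function).
    polls:    how many times to capture before stopping.
    interval: seconds to wait between captures.
    """
    last_digest = None
    for i in range(polls):
        # Keep only a digest so the previous frame need not stay in memory.
        digest = hashlib.sha256(capture()).digest()
        if last_digest is not None and digest != last_digest:
            yield i
        last_digest = digest
        if i + 1 < polls:
            sleep(interval)
```

As the answer points out, expect this to report a change on nearly every poll on a real desktop, so comparing only a region of interest may be more useful than hashing the whole screen.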

Busy cursors - why?

Can anyone give me a scenario where they think busy cursors are justified? I feel like they're always a bad idea from a user's perspective. Clarification: by busy cursors, I mean when the user can no longer interact with the application, they can only move their hourglass mouse pointer around and whistle a tune.
In summary, I think that the user should be blocked from doing stuff in your app only when the wait interval is very short (2 seconds or less) and the cognitive overhead of doing multi-threading is likely to result in a less stable app. For more detail, see below.
For an operation lasting less than 0.1 second, you don't usually need to go asynchronous or even show an hourglass.
For an operation lasting between 0.1 and 2 seconds, you usually don't need to go asynchronous. Just switch the cursor to the hourglass, then do the work inline. The visual cue is enough to keep the end-user happy.
If the end-user initiates an operation that is going to take just a couple of seconds, he's in a "focused" mode of thinking in which he's subconsciously waiting for the results of his action, and he hasn’t switched his conscious brain out of that particular focus. So blocking the UI - with a visual indicator that this has happened - is perfectly acceptable for such a short period of time.
For an operation lasting more than 2 seconds, you should usually go asynchronous. But even then, you should provide some sort of progress indicator. People find it difficult to concentrate in the absence of stimulation, and 2 seconds is long enough that the end-user is naturally going to move from conscious ‘focused’ activity to conscious ‘waiting’ activity.
The progress indicator gives them something to occupy them while they are in that waiting mode, and also gives them the means of determining when they are going to switch back into their ‘focused’ context. The visual cues give the brain something around which to structure those context switches, without demanding too much conscious thought.
Where it gets messy is where you have an operation that usually completes in X time, but occasionally takes Y, where Y is much greater than X. This can happen for remote actions such as reaching across a network. That's when you might need a combination of the above actions. For example, consider displaying an egg-timer for the first 2 seconds and only then bringing in your progress indicator. This avoids wrenching the end-user from the 'focused' context directly to the 'waiting' context without an intermediate step.
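The thresholds above boil down to a tiny decision function. This is only a sketch: the function name and return strings are made up for illustration, and treating exactly 2 seconds as "progress indicator" is my own choice, since the text leaves that boundary open.

```python
def feedback_for(elapsed_seconds):
    """Pick the kind of feedback to show after an operation has run this long."""
    if elapsed_seconds < 0.1:
        return "none"                 # too quick to need any indicator
    if elapsed_seconds < 2.0:
        return "busy_cursor"          # user is still in "focused" mode
    return "progress_indicator"       # user has moved into "waiting" mode
```

Calling it repeatedly with the time elapsed so far gives the staged behaviour described for unpredictable operations: a busy cursor for the first 2 seconds, then a switch to the progress indicator.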
It's not specifically the busy cursor that is important, but it IS important, absolutely, always, to give feedback to the user that something is happening in response to their input. It is important to realize that it doesn't matter which form it takes (busy cursor, progress bar, throbber, flashing button, swirling baton, dancing clown, ANYTHING): if you don't have it, and the computer just sits there doing nothing, the computer looks broken to the user.
Immediate feedback for every user action is incredibly important.
I think you may well be right: in a decent asynchronous app, you never need to show a busy cursor. The user can always do something even if the big last operation is completing.
That said, sometimes Java apps like Netbeans or Eclipse, or even Visual Studio, hang with no busy cursor and no hope. But in that case, a busy cursor probably wouldn't help much either...but I think you're right: busy cursors are from a non-multithreading era for apps. In Flex apps, for instance, EVERYTHING is automatically event-driven callbacks, so setting a busy cursor would just be meaningless (though possible, of course).
You show a busy cursor when the user cannot do anything until the operation is completed - including exiting the application.
I find it interesting that you don't see busy cursors in Web Browsers - perhaps that's why people like them so much.
No, wait, I have a better answer. You show a busy cursor when the computer is thinking.
When one hits the Refresh button on a web browser, a busy cursor must appear immediately to let the user know that a page is being loaded.
I think it was Don't Make Me Think that said that the tolerable loading time for humans is zero seconds.
Google says:
Responsive
It's possible to write code that wins every performance test in the world, but that still sends users in a fiery rage when they try to use it. These are the applications that aren't responsive enough, the ones that feel sluggish, hang or freeze for significant periods, or take too long to process input.
There are two purposes for it:
Indicate for the user that something is happening.
Indicate for the user that nothing can be done right now.
A busy cursor is a better signal about the operation than nothing. For longer-lasting operations something better should be used. For example, a browser is still operational when a page is being retrieved, and there is even a button to stop the operation. As the user interface is fully functional, there is no need to use a busy cursor. However, a busy cursor can be used even in this kind of situation in the transition phases, like when starting the operation or when stopping it.
I try to use them on any action that may take from 0.5 to 3 seconds; for longer actions I think progress indicators with enough information should be used.
I noticed with Fedora 8 at least that when an app sets the "busy" cursor, the "busy interactive" one is actually displayed. I guess this is because the system is still responsive to mouse input (like dragging the window etc.). As an aside, selecting the "busy interactive" cursor explicitly on Linux is tricky:
http://www.pixelbeat.org/programming/x_cursors/
The only thing I believe the busy cursor does is it informs the user that ...
I'm not outright ignoring you, I'm just doing something else that may take a while
While it is absolutely necessary to alert the user that your application is doing something, a busy cursor is only useful for the first few seconds of processing. For a delay of more than about 15-20 seconds, something else must be presented such as a progress bar, status message, message box, whatever. People assume your software has locked up after a minute or so and will try to terminate it. Sometimes, overall visual cues are just as important as a busy cursor.
For example, applications with tabs that do not respond with appropriate highlighting until the operation in the tab completes can be fixed up by updating the tab temporarily until all operations are complete. Sometimes, just a little optimization or refactoring will clean up horrible user interface responsiveness such as this.
I would use them only for things that complete quickly, say in under half a second. If anything takes longer than that, then a progress dialog should pop up, or a progress bar should appear in the status bar or somewhere else in the interface.
The user should always be able to cancel the action if it is taking too long to complete.
In response to the comment, the busy cursor would only be visible for the half second or so, as once the progress dialog is up it should change to being one of those "half busy" cursors, or just the normal arrow cursor.
You should avoid having a busy cursor up except in extreme circumstances, and if you think you need one, then think again and redesign.
For example, to indicate that you've clicked on a button, even though it's not done processing the event. If there were not some indication, the user might try to click the button again, causing all manner of badness.

Resources