Inner-workings of Keyboard Event Handling under Linux - linux

When I press a keyboard's key on some GTK application under Linux, what happens exactly? How does a key get received (from which device), interpreted, passed to the program, and then handled?

It's actually rather a complex process...
The keyboard has a 2D matrix of key connections and its own microprocessor or gate array containing a microprocessor. It is constantly scanning the matrix to find out if any keys are pressed. (To save pins, the keys are not individually tested.) The keyboard micro speaks a protocol with the keyboard controller in your CPU and transmits a message indicating a keypress.
The keyboard controller notes the code and interrupts the CPU.
The keyboard driver receives an interrupt, reads the keycode out of a controller register, and places the keycode in a buffer that links the interrupt side of the kernel to the per-process threads. It marks the thread waiting for keyboard input as "runnable"
This thread wakes up. It turns out, this is the X server. The X server reads the keycode from the kernel.
The server will will check to see which window has keyboard focus. The window will be connected to one of various clients.
The server sends an event to the client that displayed that specific window. (Note that to the server, every textbox and such is a "window", not just entire applications.)
The event loop in the client is waiting for the next server event message. This connection could be via TCP or it could be a local Unix feature. It uses read(2) or a socket op to actually get the next event message.
The low-level xlib routine passes the keypress up to higher level widgets, eventually getting to a GTK function of some kind.
The GTK API element hands the character to your program.
I glossed over language mappings, console multiplexing, and a few other things...
Update: So, /dev/input/* and in fact all of the /dev/* things are things called block or character special files. The important thing is that they have no stored data in the filesystem, just a major and minor device number that serve to look up a driver in the kernel in a table. It's almost that simple. If you ls -l /dev/input you will see a major and minor device number instead of a file size. The major number identifies the device driver and the minor number is a sort of instance-number that is passed (within the kernel) as a parameter to the driver.

Related

How to detect xterm resize in-band

I'm looking for a way to get resize events on an xterm as an alternative to the winch signal. I need to get a signal for the xterm resize that is remote compatible, that is, could be used over a serial line/telnet/ssh/whatever. The winch signal is only for local machine tasks.
I know that vi/curses can do this because I have tried ssh and use vi to edit a file, and it responds to resizing of the window.
So after a bit of research, it does seem there is no in-band method for this. I don't think it would be difficult to add. You have two sides to a connection, ssh, serial or whatever, let's say A and B where A is on the host and B is the remote. To get behavior equivalent to a local windowed task, I would say at least two things are essential:
Mouse movement/button messaging.
Window resize messaging.
The second, can be a two step process. If the B task gets an alert that the window size has changed, it can use in band queries to determine what the new size is, using the (perhaps infamous) method of setting the cursor to the end of an impossibly large screen and then reading the actual location of the cursor that results, then restoring the location, all done with standard in-band controls.
Why care about in-band vs. out of band? Well, nobody cares now :-), but serial lines used to be quite popular, and in-band, that is, actual escape sequences sent down the line are going to work, but telnet or ssh "out of band" communication is not.
It would not be difficult to add, a local program that shims the ssh or telnet, or even xterm itself, can get the winch message and change that to an escape alert to the B side. Here the mouse messaging is instructive. The messages consist of two actions:
The B side must enable such messages, so that the A side won't send unknown escapes to it.
The A side must have an escape to signify winch.
Thus a new escape from B to A, and one from A to B.
I'm actually fairly satisfied with winch. What was wanted is to have the same capabilities as vi/curses (based on my observation). This kind of support is already far better than Windows, which (as far as I know) implements none of this remote side support.

Designing a Linux char device driver so multiple processes can read

I notice that for serial devices, e.g. /dev/ttyUSB0, multiple processes can open the device but only one process gets the bytes (whichever reads them first).
However, for the Linux input API, e.g. /dev/input/event0, multiple processes can open the device, and all of the processes are able to read the input events.
My current goal:
I'd like to write a driver for several multi-position switches (e.g. like a slider switch with 3 or 4 possible positions), where apps can get a notification of any switch position changes. Ideally I'd like to use the Linux input API, however it seems that the Linux input API has no support for the concept of multi-position switches. So I'm looking at making a custom driver with similar capabilities to the Linux input API.
Two questions:
From a driver design point-of-view, why is there that difference in behaviour between Linux input API and Linux serial devices? I reckon it could be useful for multiple processes to all be able to open one serial port and all listen to incoming bytes.
What is a good way to write a Linux character device driver so that it's like the Linux input API, so multiple processes can open the device and read all the data?
The distinction is partly historical and partly due to the different expectation models.
The event subsystem is designed for unidirectional notification of simple events from multiple writers into the system with very little (or no) configuration options.
The tty subsystem is intended for bidirectional end-to-end communication of potentially large amounts of data and provides a reasonably flexible (albeit fairly baroque) configuration mechanism.
Historically, the tty subsystem was the main mechanism of communicating with the system: you plug your "teletype" into a serial port and bits went in and out. Different teletypes from different vendors used different protocols and thus the termios interface was born. To make the system perform well in a multi-user context, buffering was added in the kernel (and made configurable). The expectation model of the tty subsystem is that of a point-to-point link between moderately intelligent endpoints who will agree on what the data passing between them will look like.
While there are circumstances where "single writer, multiple readers" would make sense in the tty subsystem (GPS receiver connected to a serial port, continually reporting its position, for instance), that's not the main purpose of the system. But you can easily accomplish this "multiple readers" in userspace.
The event system on the other hand, is basically an interrupt mechanism intended for things like mice and keyboards. Unlike teletypes, input devices are unidirectional and provide little or no control over the data they produce. There is also little point in buffering the data. Nobody is going to be interested in where the mouse moved ten minutes ago.
I hope that answers your first question.
For your second question: "it depends". What do you want to accomplish? And what is the "longevity" of the data? You also have to ask yourself whether it makes sense to put the complexity in the kernel or if it wouldn't be better to put it in userspace.
Getting data out to multiple readers isn't particularly difficult. You could create a receive buffer per reader and fill each of them as the data comes in. Things get a little more interesting if the data comes in faster than the readers can consume it, but even that is mostly a solved problem. Look at the network stack for inspiration!
If your device is simple and just produces events, maybe you just want to be an input driver?
Your second question is a lot more difficult to answer without knowing more about what you want to accomplish.
Update after you added your specific goal:
When I do position switches, I usually just create a character device and implement poll and read. If you want to be fancy and have a lot of switches, you could do mmap but I wouldn't bother.
Userspace just opens your /dev/foo and reads the current state and starts polling. When your switches change state, you just wake up the readers and they they'll read again. All your readers will wake up, they'll all read the new state and everyone will be happy.
Be careful to only wake up readers when your switches are 'settled'. Many position switches are very noisy and they'll bounce around a fair bit.
In other words: I would ignore the input system altogether for this. As you surmise, position switches are not really "inputs".
How a character device handles these kinds of semantics is completely up to the driver to define and implement.
It would certainly be possible, for example, to implement a driver for a serial device that will deliver all read data to every process that has the character driver open. And it would also be possible to implement an input device driver that delivers events to only one process, whichever one is queued up to receive the latest event. It's all a matter of coding the appropriate implementation.
The difference is that it all comes down to a simple question: "what makes sense". For a serial device, it's been decided that it makes more sense to handle any read data by a single process. For an input device, it's been decided that it makes more sense to deliver all input events to every process that has the input device open. It would be reasonable to expect that, for example, one process might only care about a particular input event, say pointer button #3 pressed, while another process wants to process all pointer motion events. So, in this situation, it might make more sense to distribute all input events to all concerned parties.
I am ignoring some side issues, for simplicity, like in the situation of serial data being delivered to all reading processes what should happen when one of them stops reading from the device. That's also something that would be factored in, when deciding how to implement the semantics of a particular device.
What is a good way to write a Linux character device driver so that it's like the Linux input API, so multiple processes can open the device and read all the data?
See the .open member of struct file_operations for the char device. Whenever userspace opens the device, then the .open function is called. It can add the open file to a list of open files for the device (and then .release removes it).
The char device data struct should most likely use a kernel struct list_head to keep a list of open files:
struct my_dev_data {
...
struct cdev cdev;
struct list_head file_open_list;
...
}
Data for each file:
struct file_data {
struct my_dev_data *dev_data;
struct list_head file_open_list;
...
}
In the .open function, add the open file to dev_data->file_open_list (use a mutex to protect these operations as needed):
static int my_dev_input_open(struct inode * inode, struct file * filp)
{
struct my_dev_data *dev_data;
dev_data = container_of(inode->i_cdev, struct my_dev_data, cdev);
...
/* Allocate memory for file data and channel data */
file_data = devm_kzalloc(&dev_data->pdev->dev,
sizeof(struct file_data), GFP_KERNEL);
...
/* Add open file data to list */
INIT_LIST_HEAD(&file_data->file_open_list);
list_add(&file_data->file_open_list, &dev_data->file_open_list);
...
file_data->dev_data = dev_data;
filp->private_data = file_data;
}
The .release function should remove the open file from dev_data->file_open_list, and release the memory of file_data.
Now that the .open and .release functions maintain the list of open files, it is possible for all open files to read data. Two possible strategies:
A separate read buffer for each open file. When data is received, it is copied into the buffers of all open files.
A single read buffer for the device, but a separate read pointer for each open file. Data can be freed from the buffer once it has been read through all open files.
Serial to input/event
You could try to look into serial mouse driver source code.
This seem to be what you're searching for: from a TTYSx build a input/event device.
Simplier: creating a server, instead of a driver.
Historically, the 1st character device I remember is /dev/lp0.
To be able to write on it from many source, without overlap or other conflict,
a LPR server was wrotten.
To share a device, you have to
open this device in exclusive (rw) mode.
Create a socket (un*x or TCP) to listen on
redirect request from socket's clients to the device and maybe
store device status (from reading device's answers)
send device status to socket's clients when required.

How much is input delay from keyboard

Assuming that OS running on host PC is real-time (has no extra delays when handling the input from keyboard) - how much is the value of delay between the moment when the single key is pressed on a keyboard and the moment when the keyboard controller generates an interrupt.
I understand that it vary depending on keyboard connection (ps/2, usb, laptop keyboard) and wire length (not taking into account wireless models). Maybe there are some common calculation already done?

Interrupt-driven driver using TTY?

I am a newbie to developing drivers for Linux ... . I am developing a SMS-driver (AT commands over serial port to modem) using TTY for accessing the serial port. The driver is written in C.
In the design messages from modem to driver can be triggered by two events:
1) Status as respond to AT commands issued by driver (i.e. expected messages)
2) Indication of new SMS (i.e. unexpected messages)
I am planning on two threads - one for writing to TTY and one for reading from TTY. Is it possible to configure TTY so that my read-thread wakes-up on incoming chars (i.e. read-thread is event-triggered and not based on polling)?
Best Regards,
Witek
I don't think you really want two threads. Typical program flow (write AT command, check response etc ...) will be easier to write and debug in a monothreaded program.
Waiting for chars can be made with select() call. The tty layer is mostly configured through
the tcsetattr, tcgetattr and friends system call. With this call you can configure wether you want to be interrupted on new line or on each char for example. See man termios for the manpage. The two big options are wether you want the special characters like EOF, EOL Ctrl-C etc... to be treated has data (raw mode) or be interpreted by the tty layer (canonical mode).
See the part on select in the serial programming guide, or the select manpage for more info

Serial port determinism

This seems like a simple question, but it is difficult to search for. I need to interface with a device over the serial port. In the event my program (or another) does not finish writing a command to the device, how do I ensure the next run of the program can successfully send a command?
Example:
The foo program runs and begins writing "A_VERY_LONG_COMMAND"
The user terminates the program, but the program has only written, "A_VERY"
The user runs the program again, and the command is resent. Except, the device sees "A_VERYA_VERY_LONG_COMMAND," which isn't what we want.
Is there any way to make this more deterministic? Serial port programming feels very out-of-control due to issues like this.
The required method depends on the device.
Serial ports have additional control signal lines as well as the serial data line; perhaps one of them will reset the device's input. I've never done serial port programming but I think ioctl() handles this.
There may be a single byte which will reset, e.g. some control character.
There might be a timing-based signal, e.g. Hayes command set modems use “pause +++ pause”.
It might just reset after not receiving a complete command after a fixed time.
It might be useful to know whether the device was originally intended to support interactive use (serial terminal), control by a program, or both.
I would guess that if you call write("A_VERY_LONG_COMMAND"), and then the user hits Ctrl+C while the bytes are going out on the line, the driver layer should finish sending the full buffer. And if the user interrupts in the middle of the call, the driver layer will probably just ignore the whole thing.
Just in case, when you open a new COM port, it's always wise to clear the port.
Do you have control over the device end? It might make sense to implement a timeout to make the device ignore unfinished or otherwise corrupt packets.
The embedded device should be implemented such that you can either send an abort/clear/break character that will dump the contents of its command buffer and give you a clean slate on your client app startup.
Or else it should provide a software reset character which will reset the command buffer and all state.
Or else it so be designed so that you can send a command termination (perhaps a newline, etc, depending on command protocol) and possibly have an error generated on the parsing of a garbled partial command that was in its buffer, query/clear the error, and then be good to go.
It wouldn't be a bad idea upon connection of your client program to send some health/status/error query repeatedly until you get a sound response, and only then commence sending configuration or operation commands. Unless you can via a query determine that the device was left in a suitable state, you probably want to assume nothing and configure it from scratch, after a configuration reset if available.

Resources