I'm trying to do something odd for which I've found no reference in the archives. On a Freescale i.MX6 processor, there's an input line that generates an interrupt after being pressed (the 500 ms delay does not work); the intent of that interrupt is to notify the system of a request for an orderly shutdown. On the system in question, the button is also attached to the Enter key GPIO. The generated interrupt appears to be a falling edge/rising edge pair (or vice versa, it matters not) separated by about 75 ms. The interrupt does not repeat unless the key is released and pressed again.
The bit to clear the interrupt in the ISR is in a register allocated and held by the Real Time Clock driver (a side effect of the Freescale architecture) so I have to embed my interrupt handler inside the RTC driver, which of course has its own interrupt code.
I thought myself clever when I implemented the suggestion from question 18296686 regarding shutting down (embedded) Linux from kernel space, but that fails to distinguish between Enter and power-off. I need to detect the power-off interrupt, wait ~750-1000 ms, and then check whether the button (the <Enter> key attached to a GPIO) is still depressed, thus signalling a power-off.
I was thinking of a poll(2) interface to the driver, but since the driver is really the RTC driver the interface confuses me, and I'm looking for help implementing this.
Related
Consider the situation where you issue a read from the disk (an I/O operation). What is the exact mechanism the OS uses to learn that the operation has completed?
What is the exact mechanism the OS uses to learn that the operation has completed?
The exact mechanism depends on the specific hardware (and OS and scenario); but typically, when a device finishes doing something, it triggers an IRQ that causes the CPU to interrupt whatever it was doing and switch to the device driver's interrupt handler.
Sometimes/often the device driver ends up maintaining a queue or buffer of pending commands, so that when its interrupt handler is executed (telling it that a previous command has completed) it takes the next pending command and tells the device to start it. Sometimes/often this also includes some kind of I/O priority scheme, where the driver can ask the device to do more important work sooner (while less important work remains pending).
A device driver is typically also tied to the scheduler in some way - a normal thread in user space might (directly or indirectly, e.g. via a file system) request that data be transferred, and the scheduler will be told not to give that thread CPU time because it's blocked/waiting for something; later, when the transfer is completed, the device driver's interrupt handler tells the scheduler that the requesting thread can continue, causing it to be unblocked/able to be given CPU time by the scheduler again.
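The pending-command queue described above can be sketched in plain C, with the hardware stubbed out. Every name here (`submit`, `completion_irq`, `device_start`, etc.) is invented for illustration; a real driver would also need locking and device-specific register access.

```c
#include <assert.h>
#include <stddef.h>

#define QMAX 8

struct cmd { int id; int done; };

static struct cmd *pending_q[QMAX];
static unsigned q_head, q_tail;
static struct cmd *active;                 /* command the device is doing */

static void device_start(struct cmd *c) { active = c; }  /* hardware stub */

/* Driver entry point: start the command if the device is idle, else queue it. */
static void submit(struct cmd *c)
{
    if (active == NULL)
        device_start(c);
    else
        pending_q[q_tail++ % QMAX] = c;
}

/* "Interrupt handler": the device finished; mark the command done and kick
 * off the next pending command, if any. */
static void completion_irq(void)
{
    active->done = 1;
    active = NULL;
    if (q_head != q_tail)
        device_start(pending_q[q_head++ % QMAX]);
}
```

An I/O priority scheme would replace the FIFO here with a priority queue, so `completion_irq()` starts the most important pending command rather than the oldest.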
I'm learning how an OS works and know that peripheral devices can send interrupts that the OS then handles. But I don't have a clear picture of how it actually handles them.
What happens when I move the mouse around? Does it send interrupts every millisecond? How can the OS handle the execution of a process and mouse positioning at the same time, especially if there is only one CPU? How can the OS perform context switches effectively in this case?
Or, for example, say there are 3 launched processes. Process 1 is active; processes 2 and 3 are ready to go but in the pending state. The user inputs something with the keyboard in process 1. As I understand it, the OS scheduler can run process 2 or process 3 while awaiting input. I assume the trick is in the timing: the processor is so fast that it can run processes 2 and 3 between the user's key presses.
Also, I would appreciate any literature references where I could get familiar with how I/O works, especially in terms of timing and scheduling.
Let's assume it's some kind of USB device. For USB you have 2 layers of device drivers - the USB controller driver and the USB peripheral (keyboard, mouse, joystick, touchpad, ...) driver. The USB peripheral driver asks the USB controller driver to poll the device regularly (e.g. maybe every 8 milliseconds); the USB controller driver sets that up, and the USB controller hardware does the polling (not software/driver). If the controller receives something from the USB peripheral, it raises an IRQ for the USB controller's driver.
When the USB controller sends an IRQ it causes the CPU to interrupt whatever it was doing and execute the USB controller driver's IRQ handler. The USB controller driver's IRQ handler examines the state of the USB controller and figures out why it sent an IRQ; and notices that the USB controller received data from a USB peripheral; so it determines which USB peripheral driver is responsible and forwards the received data to that USB peripheral's device driver.
Note: Because it's bad to spend too much time handling an IRQ (it can cause the handling of other, more important IRQs to be postponed), there is often some kind of separation between the IRQ handler and the higher-level logic. This is almost always some variation of a queue, where the IRQ handler puts a notification on a queue and then returns, and the notification on the queue causes something else to run later. This might happen in the middle of the USB controller driver (e.g. the USB controller driver's IRQ handler does a little bit of work, then creates a notification that causes the rest of the USB controller driver to do the rest of the work). There are multiple ways to implement this "queue of notifications" (deferred procedure calls, message passing, some other form of communication, etc.) and different operating systems use different approaches.
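A minimal model of that "queue of notifications" idea, with all names invented for illustration (real kernels use softirqs, tasklets, work queues, DPCs, and so on, with proper synchronization):

```c
#include <assert.h>

#define RING 16

static int ring[RING];
static unsigned head, tail;

/* Top half: do as little as possible - record the event and return. */
static void irq_handler(int event)
{
    ring[head++ % RING] = event;
}

/* Deferred work: drain the queue outside of interrupt context.
 * Returns the number of notifications processed. */
static int process_pending(void)
{
    int n = 0;
    while (tail != head) {
        int event = ring[tail++ % RING];
        (void)event;          /* real code would act on the event here */
        n++;
    }
    return n;
}
```

The point of the split is that `irq_handler()` stays short and non-blocking, while `process_pending()` can take as long as it needs without delaying other IRQs.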
The USB peripheral's device driver (e.g. keyboard driver, mouse driver, ...) receives the data sent by the USB controller's driver (that came from the USB controller that got it from polling the USB peripheral); and examines that data. Depending on what the data contains the USB peripheral's device driver will probably construct some kind of event describing what happened in a "standard for that OS" way. This can be complicated (e.g. involve tracking past state of the device and lookup tables for keyboard layout, etc). In any case the resulting event will be forwarded to something else (often a user-space process) using some form of "queue of notifications". This might be the same kind of "queue of notifications" that was used before; but might be something very different (designed to suit user-space instead of being designed for kernel/device drivers only).
Note: In general every OS that supports multi-tasking provides one or more ways that normal processes can use to communicate with each other; called "inter-process communication". There are multiple possibilities - pipes, sockets, message passing, etc. All of them interact with scheduling. E.g. a process might need to wait until it receives data and call a function (e.g. to read from a pipe, or read from a socket, or wait for a message, or ..) that (if there's no data in the queue to receive) will cause the scheduler to be told to put the task into a "blocked" state (where the task won't be given any CPU time); and when data arrives the scheduler is told to bump the task out of the "blocked" state (so it can/will be given CPU time again). Often (for good operating systems), whenever a task is bumped out of the "blocked" state the scheduler will decide if the task should preempt the currently running task immediately, or not; based on some kind of task/thread priorities. In other words; if a lower priority task is currently running and a higher priority task is waiting to receive data, then when the higher priority task receives the data it was waiting for the scheduler may immediately do a task switch (from lower priority task to higher priority task) so that the higher priority task can examine the data it received extremely quickly (without waiting for ages while the CPU is doing less important work).
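The blocking behaviour described above can be seen from user space with a POSIX pipe: a read(2) on an empty pipe puts the task in the "blocked" state until a writer supplies data. In this small sketch the data is written before the read so the program terminates rather than sleeping; `demo_blocking_read` is just an illustrative name.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Returns the number of bytes read from the pipe, or -1 on error. */
static int demo_blocking_read(void)
{
    int fds[2];
    char buf[8];
    int n;

    if (pipe(fds) != 0)
        return -1;

    /* Normally another thread/process would write; writing first here just
     * guarantees the read below returns instead of sleeping forever. */
    if (write(fds[1], "hi", 2) != 2)
        return -1;

    /* If the pipe were empty, this read(2) would block: the kernel would
     * mark the task "blocked" and wake it when data arrived. */
    n = (int)read(fds[0], buf, sizeof buf);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```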
In any case; the event (from the USB peripheral's device driver) is received by something (likely a process in user-space, likely causing that process to be unblocked and given CPU time immediately by the scheduler). This is the top of a "hierarchy/tree of stuff" in user-space; where each thing in the tree might look at the data it receives and may forward it to something else in the tree (using the same inter-process communication to forward the data to something else). For example; that "hierarchy/tree of stuff" might have a "session manager" at the top of the tree, then "GUI" under that, then several application windows under that. Sometimes an event will be consumed and not forwarded to something else (e.g. if you press "alt+tab" then the GUI might handle that itself, and the GUI won't forward it to the application window that currently has keyboard focus).
Eventually most events will end up at a normal application. Normal applications often have a language runtime that abstracts the operating system's details to make the application more portable (so that the programmer doesn't have to care which OS their application is running on). For example, for Java, the Java virtual machine might convert the operating system's event (that arrived in an "OS specific" format via an "OS specific" communication mechanism) into a generic "KeyEvent" (and notify any "KeyListener").
The entire path (from drivers to a function/method inside an application) could involve many thousands of lines of code written by hundreds of people spread across many separate layers; where the programmers responsible for one piece (e.g. GUI) don't have to worry much about what the programmers working on other pieces (e.g. drivers) do. For this reason; you probably won't find a single source of information that covers everything (at all layers). Instead, you'll find information for device driver developers only, or information for C++ application developers only, or ...
This is also why nobody will be able to provide more than a generic overview (without any OS specific or "layer specific" details) - they'd have to write 12 entire books to provide an extremely detailed answer.
I'm trying to better understand the interaction between the "return IRQ_HANDLED" statement used in a GPIO pin-based interrupt handler (top-half) and the GPIO pin hardware. In particular, consider the hypothetical situation wherein a device has pulled a GPIO pin low to indicate that it needs attention. This causes the associated (top half) interrupt handler to be invoked. Now assume that the top-half handler queues up some work and then returns with "return IRQ_HANDLED" but that for whatever reason the interrupt has not been cleared on the device that generated it (i.e. the device is holding the GPIO pin in the low state). Does invocation of "return IRQ_HANDLED" cause the interrupt to be regenerated? I ask this in the context of the following article:
http://www.makelinux.net/books/lkd2/ch06lev1sec4
"Reentrancy and Interrupt Handlers
Interrupt handlers in Linux need not be reentrant. When a given interrupt handler is executing, the corresponding interrupt line is masked out on all processors, preventing another interrupt on the same line from being received. Normally all other interrupts are enabled, so other interrupts are serviced, but the current line is always disabled. Consequently, the same interrupt handler is never invoked concurrently to service a nested interrupt. This greatly simplifies writing your interrupt handler."
The above comment indicates that upon invocation of an interrupt handler, the interrupt line for that interrupt is masked. I'm trying to figure out whether the invocation of "return IRQ_HANDLED" is what unmasks the interrupt line. And, with respect to the hypothetical case described above, what would happen if I "return IRQ_HANDLED" yet the device has not really had its interrupt cleared and hence is still holding the GPIO pin in a low (triggered) state? More specifically, will this cause the interrupt to be generated again, such that the processor never has a chance to do the work queued when the interrupt first occurred? That is, would this lead to an interrupt storm wherein the processor is interrupted endlessly, never allowing any useful processing to occur? I should add that I ask this question in the context of a single-CPU ARM9 Linux system (Phytec LPC3180) running kernel 2.6.10.
Thanks in advance,
Jim
PS: I'm not clear as to the difference between enabling/disabling an interrupt (in particular, an interrupt associated with a particular GPIO pin) and masking/unmasking the same GPIO interrupt.
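The hypothetical in the question can be made concrete with a small user-space simulation of level-triggered semantics. The kernel types are stubbed out and `run_level_irq` is an invented stand-in for the interrupt core; the point is only the logic: returning IRQ_HANDLED does not deassert the line, so if the device is never acknowledged, unmasking the still-asserted level-triggered line re-fires the handler immediately.

```c
#include <assert.h>

typedef int irqreturn_t;
#define IRQ_HANDLED 1

static int device_asserting;       /* 1 = device holds the GPIO low */
static int handler_runs;

static irqreturn_t top_half(int ack_device)
{
    handler_runs++;
    if (ack_device)
        device_asserting = 0;      /* acknowledging the device drops the line */
    return IRQ_HANDLED;            /* this alone does NOT clear the line */
}

/* Model of the core: while a level-triggered line is asserted and unmasked,
 * keep re-invoking the handler (capped here so the demo terminates).
 * Returns how many times the handler ran. */
static int run_level_irq(int ack_device, int cap)
{
    handler_runs = 0;
    device_asserting = 1;
    while (device_asserting && handler_runs < cap)
        top_half(ack_device);
    return handler_runs;
}
```

With `ack_device` set, the handler runs once; without it, the loop hits the cap - the interrupt-storm scenario the question asks about.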
Our group is working with an embedded processor (Phytec LPC3180, ARM9). We have designed a board that includes four MAX3107 UART chips on one of the LPC3180's I2C busses. In case it matters, we are running kernel 2.6.10, the latest version available for this processor. (Support of this product has not been very good; we've had to develop or fix a number of the drivers provided by Phytec, and Phytec seems to have no interest in upgrading the Linux code, especially the kernel version, for this product. This is too bad, in that the LPC3180 is a nice device, especially in the context of low-power embedded products that DO NOT require ethernet and in fact don't want it, owing to the power consumption of ethernet controller chips.) The handler that is installed now (developed by someone else) is based on a top-half handler and bottom-half work queue approach.
When one of the four devices (MAX3107 UART chips) on the I2C bus receives a character it generates an interrupt. The interrupt lines of all four MAX3107 chips are shared (open-drain pull-down) and the line is connected to a GPIO pin of the 3180 which is configured for level-triggered interrupts. When one of the 3107s generates an interrupt, a handler is run which does roughly the following:
spin_lock_irqsave(&lock, flags);
disable_irq_nosync(irqno);
irq_enabled = 0;
irq_received = 1;
spin_unlock_irqrestore(&lock, flags);
set_queued_work(); // Queue up work for all four devices on every interrupt,
                   // because at this point we don't know which of the four
                   // 3107s generated the interrupt
return IRQ_HANDLED;
Note, and this is what I find somewhat troubling, that the interrupt is not re-enabled before leaving the above code. Rather, the driver is written such that the interrupt is re-enabled by a bottom-half work queue task (using the enable_irq(LPC_IRQ_LINE) function call). Since work queue tasks do not run in interrupt context, I believe they may sleep, something that I believe to be a bad idea for an interrupt handler.
The rationale for the above approach follows:
1. If one of the four MAX3107 UART chips receives a character and generates an interrupt (for example), the interrupt handler needs to figure out which of the four I2C devices actually caused it. However, one apparently cannot read the I2C devices from within the context of the top-half interrupt handler, since I2C reads can sleep, something considered inappropriate for an interrupt handler's top half.
2. The approach taken to address the above problem (i.e. which device caused the interrupt) is to leave the interrupt disabled and exit the top-half handler after which non-interrupt context code can query each of the four devices on the I2C bus to figure out which received the character (and hence generated the interrupt).
3. Once the bottom-half handler figures out which device generated the interrupt, the bottom-half code disables the interrupt on that chip so that it doesn't re-trigger the interrupt line to the LPC3180. After doing so it reads the serial data and exits.
The primary problem here seems to be that there is no way to query the four MAX3107 UART chips from within the interrupt handler's top half. If the top half simply re-enabled interrupts before returning, the same chip would generate the interrupt again, leading, I think, to a loop: the top half disables the interrupt and schedules the bottom-half work queue, but before the bottom-half code can get to the chip causing the interrupt, another interrupt occurs, and so forth.
Any advice for dealing with this driver will be much appreciated. I really don't like the idea of the interrupt being disabled in the top half of the driver yet not re-enabled prior to exiting the top-half code. This does not seem safe.
Thanks,
Jim
PS: In my reading I've discovered threaded interrupts as a means to deal with the above-described requirements (at least that's my interpretation of web site articles such as http://lwn.net/Articles/302043/). I'm not sure if the 2.6.10 kernel as provided by Phytec includes threaded interrupt functions. I intend to look into this over the next few days.
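The threaded-interrupt idea from that article can be modeled in user space with pthreads: a cheap "hard" handler does nothing but wake a dedicated handler thread, and that thread, running in process context, is allowed to sleep (the slow I2C queries would go where `handled++` is below). This sketch only illustrates the shape of the pattern; it is not the kernel's request_threaded_irq() API, and all names are invented.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending;   /* interrupts seen but not yet serviced */
static int handled;   /* interrupts fully serviced */
static int quit;      /* tells the handler thread to exit */

/* Hard (top-half) handler: record the event and wake the handler thread. */
static void hard_irq(void)
{
    pthread_mutex_lock(&lock);
    pending++;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

/* Threaded handler: runs in process context, so it may block/sleep. */
static void *irq_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!pending && !quit)
            pthread_cond_wait(&cond, &lock);
        if (pending) {
            pending--;
            handled++;   /* slow, sleepable device I/O would happen here */
        } else {
            break;       /* quit requested and nothing left to service */
        }
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}
```

The kernel's threaded-IRQ support (added well after 2.6.10) packages exactly this split, including keeping the level-triggered line masked until the thread finishes.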
If your code is written properly, it shouldn't matter if a device issues interrupts before handling of prior interrupts is complete. You are correct that you don't want to do blocking operations in the top half, but blocking operations are acceptable in a bottom half; in fact, that is part of the reason bottom halves exist!
In this case I would suggest an approach where the top half just schedules the bottom half, and then the bottom half loops over all 4 devices and handles any pending requests. It could be that multiple devices need processing, or none.
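A sketch of that approach, with the four MAX3107s stubbed out as simple flags (in the real driver, `service_device` would be the sleepable I2C status read and data transfer, and `rx_pending` would be the chip's interrupt status register):

```c
#include <assert.h>

#define NDEV 4

static int rx_pending[NDEV];        /* stand-in for each MAX3107's status */

/* Stand-in for the slow I2C read: returns 1 if the device had data. */
static int service_device(int i)
{
    int had = rx_pending[i];
    rx_pending[i] = 0;              /* reading the data clears the source */
    return had;
}

/* Bottom half: scan every device on the shared line; zero, one, or several
 * may need processing, and it doesn't matter which one raised the IRQ. */
static int bottom_half(void)
{
    int i, serviced = 0;
    for (i = 0; i < NDEV; i++)
        if (rx_pending[i])
            serviced += service_device(i);
    return serviced;
}
```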
Update:
It is true that you may overload the system with a load test, and the software may need to be optimized to handle heavy loads. Additionally, I don't have a 3180 and four 3107s (or similar) of my own to test this on, so I am speaking theoretically, but I am not clear on why you need to disable interrupts at all.
Generally speaking, when a hardware device asserts an interrupt, it will not assert another one until the current one is cleared. So, with your four devices sharing one interrupt line:
1. Your top half fires and adds something to the work queue (i.e. triggers the bottom half).
2. Your bottom half scans all devices on that interrupt line (i.e. all four 3107s).
3. If one of them caused the interrupt, you read all the data necessary to fully process it (possibly putting it in a queue for higher-level processing).
4. You "clear" the interrupt on that device.
Once you clear the interrupt, the device is allowed to trigger another interrupt, but not before.
More details about this particular device:
It seems that this device (the MAX3107) has a buffer of 128 words, and by default you are interrupted after every single word. But it seems you should be able to take better advantage of the buffer by setting the FIFO level registers. Then you will get interrupted only after that number of words has been received (or if you fill your TX FIFO beyond its threshold, in which case you should slow down the transmit rate, i.e. buffer more in software).
It seems the idea is basically to pull data off the devices periodically (maybe every 100 ms or 10 ms, or whatever works for you) and have the interrupt act only as a warning that you have crossed a threshold, which might schedule the periodic function for immediate execution or increase the rate at which it is called.
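A toy model of the threshold idea: one interrupt per trigger-level crossing rather than one per word. The trigger level here is just a variable; the real chip programs it through its FIFO level registers (see the MAX3107 datasheet for the actual register names, which are not reproduced here).

```c
#include <assert.h>

#define FIFO_DEPTH 128

struct uart_model {
    int fill;        /* words currently in the RX FIFO */
    int trigger;     /* programmed RX trigger level */
    int irqs;        /* interrupts raised so far */
};

/* A word arrives from the wire into the RX FIFO. */
static void rx_word(struct uart_model *u)
{
    if (u->fill < FIFO_DEPTH)
        u->fill++;
    if (u->fill == u->trigger)
        u->irqs++;              /* one interrupt per threshold crossing */
}

/* The driver empties the FIFO; returns how many words it read. */
static int drain(struct uart_model *u)
{
    int n = u->fill;
    u->fill = 0;
    return n;
}
```

With `trigger` at 16, thirty-two received words cost two interrupts instead of thirty-two, which is the load reduction the paragraph above is after.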
Interrupts are enabled & disabled because we use level-based interrupts, not edge-based. The ramifications of that are explicitly explained in the driver code header, which you have, Jim.
Level-based interrupts were required to avoid losing an edge interrupt from a character that arrives on one UART immediately after one arriving on another: servicing the first effectively eliminates the second, so that second character would be lost. In fact, this is exactly what happened in the initial, edge-interrupt version of this driver once >1 UART was exercised.
Has there been an observed failure with the current scheme?
Regards,
The Driver Author (someone else)
I am considering an upcoming situation in an embedded Linux project (no hardware yet) where two external chips will need to share a single physical IRQ line. This line is capable in hardware of edge-triggered but not level-triggered interrupts.
Looking at the shared irq support in Linux, I understand that the way this would work with two separate drivers is that each would have their interrupt handler called, check their hardware and handle if appropriate.
However I imagine the following race condition and would like to know if I'm missing something or what might be done to work around this. Let's say there are two external interrupt sources, devices A and B:
1. device B interrupt occurs, IRQ goes active
2. IRQ edge causes the Linux core interrupt handler to run
3. ISR for device A runs, finds no interrupt pending
4. device A interrupt occurs, IRQ stays active (wire-OR)
5. ISR for device B runs, finds its interrupt pending, handles and clears it
6. core interrupt handler exits
7. IRQ stays active, no more edges are generated, the IRQ is locked up
It seems that to fix this, the core interrupt handler would have to check the IRQ level after running all the handlers and, if it is still active, run them all again. Will Linux do this? I don't think the interrupt core knows how to check the level of an IRQ line.
Is this race something that can actually happen, and if so how do I deal with this?
Basically, with the hardware you've described, doing a wired-OR for the interrupts will NEVER work correctly on its own.
If you want to do wired-OR, you really need to be using level-sensitive IRQ inputs. If that's not feasible, then perhaps you can add some kind of interrupt controller. That device would take N level-sensitive inputs and have one output and some kind of 'clear'. When the interrupt controller gets a clear, it would lower its output, then re-assert the output if any of its inputs were still asserted.
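That controller behaviour can be sketched as a small simulation (structure and function names are all invented): a 'clear' drops the output and immediately re-evaluates, so a still-asserted input produces a fresh edge for the CPU instead of a silently stuck-high line.

```c
#include <assert.h>

#define NIN 2

struct intc {
    int input[NIN];  /* level-sensitive inputs from the devices */
    int output;      /* what the CPU's edge-triggered pin sees */
    int edges;       /* rising edges delivered to the CPU so far */
};

/* Recompute the output; count a rising edge whenever it goes 0 -> 1. */
static void intc_update(struct intc *c)
{
    int any = 0, i;
    for (i = 0; i < NIN; i++)
        any |= c->input[i];
    if (any && !c->output)
        c->edges++;
    c->output = any;
}

static void intc_set_input(struct intc *c, int i, int v)
{
    c->input[i] = v;
    intc_update(c);
}

/* 'Clear': lower the output, then re-assert it if any input is still high,
 * generating a new edge instead of leaving the line locked up. */
static void intc_clear(struct intc *c)
{
    c->output = 0;
    intc_update(c);
}
```

Replaying the race from the question against this model: device B asserts (one edge), device A asserts while the line is already high (no edge, the wire-OR problem), device B is serviced and deasserts, and the clear then produces a fresh edge for device A rather than a lost interrupt.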
On the software side, one thing you could look at is running the IRQ line to another processor input. This would allow you to at least check the line's state, but the Linux core ISR handling isn't going to know anything about it, so you'll have to patch in something to make it check the line and cycle through the ISRs again. Also, this means that under heavy interrupt load you're NEVER going to get out of the ISR. Given that you're doing a wire-OR on the IRQs, I'm assuming these devices won't be interrupting too often.
One other thing is to look really hard at the processor. There may be some kind of trick you can pull with the interrupt setup in order to get it to recognize the interrupt again.
I wouldn't try anything too tricky myself, I'd either separate the sources onto separate IRQ inputs, change to a level-sensitive input, or add an interrupt controller chip.