ARM9 Kernel 2.6.10 GPIO pin interrupts return IRQ_HANDLED - linux

I'm trying to better understand the interaction between the "return IRQ_HANDLED" statement used in a GPIO pin-based interrupt handler (top-half) and the GPIO pin hardware. In particular, consider the hypothetical situation wherein a device has pulled a GPIO pin low to indicate that it needs attention. This causes the associated (top half) interrupt handler to be invoked. Now assume that the top-half handler queues up some work and then returns with "return IRQ_HANDLED" but that for whatever reason the interrupt has not been cleared on the device that generated it (i.e. the device is holding the GPIO pin in the low state). Does invocation of "return IRQ_HANDLED" cause the interrupt to be regenerated? I ask this in the context of the following article:
http://www.makelinux.net/books/lkd2/ch06lev1sec4
"Reentrancy and Interrupt Handlers
Interrupt handlers in Linux need not be reentrant. When a given interrupt handler is executing, the corresponding interrupt line is masked out on all processors, preventing another interrupt on the same line from being received. Normally all other interrupts are enabled, so other interrupts are serviced, but the current line is always disabled. Consequently, the same interrupt handler is never invoked concurrently to service a nested interrupt. This greatly simplifies writing your interrupt handler."
The above comment indicates that upon invocation of an interrupt handler, the interrupt line for that interrupt is masked. I'm trying to figure out if the invocation of "return IRQ_HANDLED" is what unmasks the interrupt line. And, with respect to the hypothetical case described above, what would happen if I "return IRQ_HANDLED" yet the device has not really had its interrupt cleared and hence is still holding the GPIO pin in a low (triggered) state? More specifically, will this cause the interrupt to be generated again such that the processor never has a chance to do the work queued when the interrupt first occurred? I.e., would this lead to an interrupt storm wherein the processor could be interrupted endlessly, thus not allowing any useful processing to occur? I should add that I ask this question in the context of a single-CPU Linux ARM9 system (Phytec LPC3180) running kernel 2.6.10.
Thanks in advance,
Jim
PS: I'm not clear as to the difference between enabling/disabling an interrupt (in particular, an interrupt associated with a particular GPIO pin) and masking/unmasking the same GPIO interrupt.
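One way to reason about the question above is a small user-space model of a level-triggered line. This is only a sketch, not kernel code: the struct, the enum and the function names are all hypothetical, and it assumes (as I understand the usual level-IRQ flow) that the core code masks the line, runs the handler, and unmasks afterwards unless the handler disabled the line, with the IRQ_HANDLED return value playing no part in masking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one level-triggered GPIO interrupt line. */
struct irq_line {
    bool device_asserting; /* device is holding the pin low (active) */
    bool masked;           /* line is masked at the interrupt controller */
    int  disable_depth;    /* > 0 after a disable_irq-style call */
    int  handler_runs;     /* how many times the top half has run */
};

enum handler_kind { CLEARS_DEVICE, LEAVES_ASSERTED, DISABLES_LINE };

/* Top half. Its return value is not modelled: it does not affect masking. */
static void top_half(struct irq_line *l, enum handler_kind kind)
{
    l->handler_runs++;
    if (kind == CLEARS_DEVICE)
        l->device_asserting = false;  /* interrupt source cleared on the device */
    else if (kind == DISABLES_LINE)
        l->disable_depth++;           /* stands in for disable_irq_nosync() */
    /* LEAVES_ASSERTED: queue work and return, pin is still held low */
}

/* Assumed flow: mask, run the handler, unmask unless the line was disabled. */
static void dispatch_once(struct irq_line *l, enum handler_kind kind)
{
    l->masked = true;
    top_half(l, kind);
    if (l->disable_depth == 0)
        l->masked = false;
}

/* Deliver interrupts while the line is active and unmasked, up to a cap. */
int run_line(enum handler_kind kind, int max_deliveries)
{
    struct irq_line l = { true, false, 0, 0 };

    assert(max_deliveries > 0);
    while (l.device_asserting && !l.masked && l.handler_runs < max_deliveries)
        dispatch_once(&l, kind);
    return l.handler_runs;
}
```

In this model a handler that clears the device runs once, a handler that leaves the pin asserted runs until the cap is hit (the storm described above), and a handler that disables the line runs once and leaves the line quiet until something re-enables it.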

Related

clear pending interrupts in linux kernel

Say I have some code as follows:
local_irq_disable();
... // some interrupts come during this time
local_irq_enable();
After I call local_irq_enable(), all the interrupts that were blocked (the pending interrupts) are still there and cause the CPU to respond.
Is there anything that will clear the pending interrupts?
My code runs on an ARM aarch64 machine.
A typical chain is that the CPU interrupt pin is multiplexed via an interrupt controller (e.g. a GIC) to a set of devices.
Disabling interrupts merely shunts the pin on the CPU; the interrupt controller still maintains the pending state. You could use a feature of the interrupt controller to mask all interrupts, which would then let you enable the CPU interrupts without receiving any. I'm not really sure what the point of that would be, when you could just leave the CPU ignoring interrupts.
To truly clear the pending interrupts, you need to invoke the device-specific code (i.e. the interrupt handler) for each device with a pending interrupt. You could look through the status bits of the GIC, identify each pending interrupt, then look through the kernel's interrupt structures to determine the relevant device and invoke its handler. It is a lot easier to just turn interrupts back on.
If you disable interrupts, there will probably be a pending interrupt that's been sent to your CPU from the PIC that it's waiting for you to acknowledge. So before you re-enable interrupts, you'd have to tell the PIC to de-assert this interrupt if present.
While the PIC was waiting for acknowledgment, it may have been buffering other interrupts (or sending them to other CPUs). So you'd need to tell the PIC to clear these if present, or wait a sufficient amount of time for other CPUs to handle all these interrupts. This is assuming of course that the interrupts are being distributed evenly and no interrupt is biased towards your CPU.
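The point made in the answers above, that the controller keeps the pending state while the CPU ignores interrupts, can be illustrated with a toy user-space model. Everything here is hypothetical (the struct, the function names, the 32-source limit); the "discard" step stands in for whatever controller-specific clear operation a real GIC or PIC offers and is not a real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a controller that latches pending sources for one CPU. */
struct intc_model {
    uint32_t pending;      /* one latched bit per interrupt source */
    bool     cpu_enabled;  /* false between local_irq_disable()/enable() */
    int      delivered;    /* interrupts the CPU actually took */
};

/* Hand latched interrupts to the CPU, but only while it accepts them. */
static void deliver_pending(struct intc_model *c)
{
    while (c->cpu_enabled && c->pending != 0) {
        c->pending &= c->pending - 1;   /* take the lowest pending source */
        c->delivered++;
    }
}

/* A device raises its source; the controller latches it regardless. */
static void raise_source(struct intc_model *c, unsigned src)
{
    assert(src < 32);
    c->pending |= UINT32_C(1) << src;
    deliver_pending(c);
}

/*
 * Raise two sources while the CPU has interrupts disabled, optionally
 * discard the latched state at the controller, then re-enable.
 * Returns how many interrupts the CPU ends up servicing.
 */
int delivered_after_reenable(bool discard_at_controller)
{
    struct intc_model c = { 0, false, 0 };

    raise_source(&c, 3);
    raise_source(&c, 7);
    if (discard_at_controller)
        c.pending = 0;      /* stands in for a controller-specific clear */
    c.cpu_enabled = true;   /* stands in for local_irq_enable() */
    deliver_pending(&c);
    return c.delivered;
}
```

Without the discard step both latched interrupts arrive as soon as the CPU re-enables. Note that the model only covers latched state: a device that keeps a level-triggered line asserted would simply set the pending bit again, which is why the answer above says the device's own handler is what truly clears it.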

Linux kernel regmap irq handler has an issue

I have been working on irq handlers using the regmap irq chip implementation.
I have seen that there is a high degree of inconsistency in the execution of the irq handlers, especially if the irq is generated continuously during suspend. The irq chokes and never clears the interrupt source, i.e. the handler never runs at times. Even if the handler runs halfway and the system sleeps, it does not continue on resume.
It's creating serious issues. How do I handle this?
Regmap uses threaded irqs throughout. In addition, I was using i2c calls in the nested handlers, which are again threaded irqs. Because of this, I would always be running in process context and not in irq context. An i2c transfer has a schedule in it, and that brings in a completely different execution flow. In addition, there were problems with wake-enabling the irq.

Need help handling multiple shared I2C MAX3107 chips on shared ARM9 GPIO interrupt (linux)

Our group is working with an embedded processor (Phytec LPC3180, ARM9). We have designed a board that includes four MAX3107 UART chips on one of the LPC3180's I2C busses. In case it matters, we are running kernel 2.6.10, the latest version available for this processor. Support of this product has not been very good; we've had to develop or fix a number of the drivers provided by Phytec, and Phytec seems to have no interest in upgrading the Linux code (especially the kernel version) for this product. This is too bad in that the LPC3180 is a nice device, especially in the context of low-power embedded products that DO NOT require Ethernet and in fact don't want Ethernet (owing to the associated power consumption of Ethernet controller chips). The handler that is installed now (developed by someone else) is based on a top-half handler and bottom-half work queue approach.
When one of the four devices (MAX3107 UART chips) on the I2C bus receives a character it generates an interrupt. The interrupt lines of all four MAX3107 chips are shared (open drain pull-down) and the line is connected to a GPIO pin of the 3180 which is configured for level interrupt. When one of the 3107s generates an interrupt, a handler is run which does the following processing (roughly):
spin_lock_irqsave();
disable_irq_nosync(irqno);
irq_enabled = 0;
irq_received = 1;
spin_unlock_irqrestore();
set_queued_work();  // Queue up work for all four devices for every interrupt
                    // because at this point we don't know which of the four
                    // 3107s generated the interrupt
return IRQ_HANDLED;
Note, and this is what I find somewhat troubling, that the interrupt is not re-enabled before leaving the above code. Rather, the driver is written such that the interrupt is re-enabled by a bottom-half work queue task (using the "enable_irq(LPC_IRQ_LINE)" function call). Since the work queue tasks do not run in interrupt context I believe they may sleep, something that I believe to be a bad idea for an interrupt handler.
The rationale for the above approach follows:
1. If one of the four MAX3107 uart chips receives a character and generates an interrupt (for example), the interrupt handler needs to figure out which of the four I2C devices actually caused the interrupt. However, and apparently, one cannot read the I2C devices from within the context of the upper half interrupt handler since the I2C reads can sleep, something considered inappropriate for an interrupt handler upper-half.
2. The approach taken to address the above problem (i.e. which device caused the interrupt) is to leave the interrupt disabled and exit the top-half handler after which non-interrupt context code can query each of the four devices on the I2C bus to figure out which received the character (and hence generated the interrupt).
3. Once the bottom-half handler figures out which device generated the interrupt, the bottom-half code disables the interrupt on that chip so that it doesn't re-trigger the interrupt line to the LPC3180. After doing so it reads the serial data and exits.
The primary problem here seems to be that there is no way to query the four MAX3107 UART chips from within the interrupt handler top-half. If the top-half simply re-enabled the interrupt before returning, the same chip would generate the interrupt again. This would lead, I think, to a situation where the top-half disables the interrupt, schedules the bottom-half work queue and re-enables the interrupt, only to find itself back in the same place: before the bottom-half code gets to the chip causing the interrupt, another interrupt has occurred, and so forth.
Any advice for dealing with this driver will be much appreciated. I really don't like the idea of allowing the interrupt to be disabled in the top-half of the driver yet not be re-enabled prior to exiting the top-half driver code. This does not seem safe.
Thanks,
Jim
PS: In my reading I've discovered threaded interrupts as a means to deal with the above-described requirements (at least that's my interpretation of web site articles such as http://lwn.net/Articles/302043/). I'm not sure if the 2.6.10 kernel as provided by Phytec includes threaded interrupt functions. I intend to look into this over the next few days.
If your code is written properly it shouldn't matter if a device issues interrupts before handling of prior interrupts is complete, and you are correct that you don't want to do blocking operations in the top half, but blocking operations are acceptable in a bottom half, in fact that is part of the reason they exist!
In this case I would suggest an approach where the top half just schedules the bottom half, and then the bottom half loops over all 4 devices and handles any pending requests. It could be that multiple devices need processing, or none.
Update:
It is true that you may overload the system with a load test, and the software may need to be optimized to handle heavy loads. Additionally I don't have a 3180, and four 3107s (or similar) of my own to test this out on, so I am speaking theoretically, but I am not clear why you need to disable interrupts at all.
Generally speaking, when a hardware device asserts an interrupt it will not assert another one until the current one is cleared. So you have 4 devices sharing one int line:
1. Your top half fires and adds something to the work queue (i.e. triggers the bottom half).
2. Your bottom half scans all devices on that int line (i.e. all four 3107s).
3. If one of them caused the interrupt, you then read all the data necessary to fully process it (possibly putting it in a queue for higher-level processing?).
4. You "clear" the interrupt on the current device.
When you clear the interrupt then the device is allowed to trigger another interrupt, but not before.
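The sequence above can be sketched as a user-space model. This is a rough illustration only, not the Phytec driver: the structs and function names are hypothetical, the per-device counters stand in for I2C reads of the MAX3107 status and FIFO registers, and the line-disable flag stands in for the disable_irq_nosync()/enable_irq() pair described in the question.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_UARTS 4

/* Hypothetical stand-in for one MAX3107: it asserts IRQ while rx_pending > 0. */
struct uart_model {
    int rx_pending;  /* characters waiting in the device */
    int rx_total;    /* characters the driver has read so far */
};

struct shared_irq {
    struct uart_model uart[NUM_UARTS];
    bool line_disabled;  /* shared GPIO interrupt currently disabled */
    bool work_queued;    /* bottom half has been scheduled */
};

/* Wired-OR: the shared line is active if any device still has data. */
bool line_asserted(const struct shared_irq *s)
{
    for (int i = 0; i < NUM_UARTS; i++)
        if (s->uart[i].rx_pending > 0)
            return true;
    return false;
}

/* Top half: no device access, just silence the line and defer the work. */
void top_half(struct shared_irq *s)
{
    s->line_disabled = true;   /* stands in for disable_irq_nosync() */
    s->work_queued = true;
}

/* Bottom half: may sleep in a real driver, so slow bus reads are fine here. */
int bottom_half(struct shared_irq *s)
{
    int serviced = 0;

    assert(s->work_queued);
    s->work_queued = false;
    for (int i = 0; i < NUM_UARTS; i++) {
        struct uart_model *u = &s->uart[i];

        if (u->rx_pending == 0)
            continue;                  /* this device did not interrupt */
        u->rx_total += u->rx_pending;  /* read everything it has */
        u->rx_pending = 0;             /* device releases the line */
        serviced++;
    }
    s->line_disabled = false;  /* stands in for enable_irq() */
    return serviced;
}
```

Because every device is checked on every pass, it does not matter which chip raised the line or whether several did so at once; the line is only re-enabled after all of them have been drained.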
More details about this particular device:
It seems that this device (MAX3107) has a buffer of 128 words, and by default you are getting interrupted after every single word. But it seems that you should be able to take better advantage of the buffer by setting the FIFO level registers. Then you will get interrupted only after that number of words has been received (or if you fill your tx FIFO up beyond the threshold, in which case you should slow down the transmit speed, i.e. buffer more in software).
It seems the idea is to basically pull data off the devices periodically (maybe every 100ms or 10ms or whatever seems to work for you) and then only have the interrupt act as a warning that you have crossed a threshold, which might schedule the periodic function for immediate execution, or increase the rate at which it is called.
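To put rough numbers on the benefit of a FIFO threshold, here is a back-of-the-envelope helper. It is only an estimate under stated assumptions: a continuous receive stream, 10 bits on the wire per character (8N1 framing), and a trigger level of 64 words chosen purely for illustration; the function name is hypothetical and nothing here touches real MAX3107 registers.

```c
#include <assert.h>

/*
 * Approximate receive interrupts per second for a saturated serial link,
 * given the baud rate, bits on the wire per character, and the number of
 * words that must accumulate in the FIFO before an interrupt is raised.
 */
unsigned irq_rate_per_sec(unsigned baud, unsigned bits_per_char,
                          unsigned trigger_level)
{
    assert(bits_per_char > 0 && trigger_level > 0);

    unsigned chars_per_sec = baud / bits_per_char;
    return chars_per_sec / trigger_level;
}
```

At 115200 baud with 10 bits per character, an interrupt per word works out to 11520 interrupts per second per UART, while a trigger level of 64 brings that down to 180, at the cost of up to 64 characters of latency unless the device is also polled periodically as suggested above.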
Interrupts are enabled & disabled because we use level-based interrupts, not edge-based. The ramifications of that are explicitly explained in the driver code header, which you have, Jim.
Level-based interrupts were required to avoid losing an edge interrupt from a character that arrives on one UART immediately after one arriving on another: servicing the first effectively eliminates the second, so that second character would be lost. In fact, this is exactly what happened in the initial, edge-interrupt version of this driver once >1 UART was exercised.
Has there been an observed failure with the current scheme?
Regards,
The Driver Author (someone else)

How shared IRQ races are avoided in Linux

I am considering an upcoming situation in an embedded Linux project (no hardware yet) where two external chips will need to share a single physical IRQ line. This line is capable in hardware of edge triggering but not level triggered interrupts.
Looking at the shared irq support in Linux, I understand that the way this would work with two separate drivers is that each would have their interrupt handler called, check their hardware and handle if appropriate.
However I imagine the following race condition and would like to know if I'm missing something or what might be done to work around this. Let's say there are two external interrupt sources, devices A and B:
1. Device B interrupt occurs, IRQ goes active
2. IRQ edge causes Linux core interrupt handler to run
3. ISR for device A runs, finds no interrupt pending
4. Device A interrupt occurs, IRQ stays active (wire-OR)
5. ISR for device B runs, finds interrupt pending, handles and clears it
6. Core interrupt handler exits
7. IRQ stays active, no more edges are generated, IRQ is locked up
It seems that for this to be fixed, the core interrupt handler would have to check the IRQ level after running all handlers, and if still active, run them all again. Will Linux do this? I don't think the interrupt core knows how to check the level of an IRQ line.
Is this race something that can actually happen, and if so how do I deal with this?
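The sequence in the question can be replayed in a small user-space model to see where the edge goes missing. This is a sketch with hypothetical names; it assumes an edge detector that only fires on an inactive-to-active transition of the wired-OR line, and the optional re-check loop is the workaround proposed in the question (re-run the handlers while the line is still active), not something the Linux IRQ core is claimed to do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of two devices wire-ORed onto one edge-triggered input. */
struct shared_edge_line {
    bool a_pending;   /* device A is asserting the line */
    bool b_pending;   /* device B is asserting the line */
    bool last_level;  /* level seen at the previous sample */
    int  edges_seen;  /* inactive -> active transitions detected */
};

static bool level(const struct shared_edge_line *l)
{
    return l->a_pending || l->b_pending;
}

/* Call after every change: only a rising transition counts as an edge. */
static void sample(struct shared_edge_line *l)
{
    bool now = level(l);

    if (now && !l->last_level)
        l->edges_seen++;
    l->last_level = now;
}

/*
 * Replay the race. Returns true if the line is left stuck active after the
 * core handler exits. *edges_seen reports how many edges were detected.
 */
bool replay_race(bool recheck_level, int *edges_seen)
{
    struct shared_edge_line l = { false, false, false, 0 };

    assert(edges_seen != 0);
    l.b_pending = true;  sample(&l);  /* B interrupts: edge, core handler runs */
    /* ISR for A runs here and finds nothing pending */
    l.a_pending = true;  sample(&l);  /* A interrupts: line already active */
    l.b_pending = false; sample(&l);  /* ISR for B handles and clears B */

    /* Workaround: keep running all ISRs while the line is still active. */
    while (recheck_level && level(&l)) {
        l.a_pending = false;
        l.b_pending = false;
        sample(&l);
    }
    *edges_seen = l.edges_seen;
    return level(&l);
}
```

Without the re-check the model ends with the line active and only one edge ever seen, which is the lockup described in the question; with the re-check device A is serviced in the same pass and the line returns to inactive.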
Basically, with the hardware you've described, doing a wired-or for the interrupts will NEVER work correctly on its own.
If you want to do wired-or, you really need to be using level-sensitive IRQ inputs. If that's not feasible, then perhaps you can add in some kind of interrupt controller. That device would take N level-sensitive inputs, and have one output, and some kind of 'clear'. When the interrupt controller gets a clear it would lower its output, then re-assert the output if any of its inputs were still asserted.
On the software side, you could look at running the IRQ line to another processor input. This would allow you to at least check the state, but the Linux core ISR handling isn't going to know anything about this, and so you'll have to patch in something to get it to check it and cycle through the ISRs again. Also, this means that in heavy interrupt loading situations you're NEVER going to get out of this ISR. Given that you're doing a wire-or on the IRQs, I'm kind of assuming these devices won't be interrupting too often.
One other thing is to look really hard at the processor. There may be some kind of trick you can pull with the interrupt setup in order to get it to recognize the interrupt again.
I wouldn't try anything too tricky myself, I'd either separate the sources onto separate IRQ inputs, change to a level-sensitive input, or add an interrupt controller chip.

Can an interrupt handler be preempted?

I know that Linux does nested interrupts where one interrupt can "preempt" another interrupt, but what about other tasks?
I am just trying to understand how Linux handles interrupts. Can they be preempted by some other user task/kernel task?
Reading "Why kernel code/thread executing in interrupt context cannot sleep?", which links to Robert Love's article, I read this:
"some interrupt handlers (known in Linux as fast interrupt handlers) run with all interrupts on the local processor disabled. This is done to ensure that the interrupt handler runs without interruption, as quickly as possible. More so, all interrupt handlers run with their current interrupt line disabled on all processors. This ensures that two interrupt handlers for the same interrupt line do not run concurrently. It also prevents device driver writers from having to handle recursive interrupts, which complicate programming."
So AFAIK all IRQs are disabled while within the interrupt handler, therefore it cannot be interrupted!?
Simple answer: An interrupt can only be interrupted by interrupts of higher priority.
Therefore an interrupt can be interrupted by the kernel or a user task if the interrupt's priority is lower than the kernel scheduler interrupt priority or user task interrupt priority.
Note that by "user task" I mean user-defined interrupt.
