On the mainboard we have an interrupt controller (IRC) which acts as a multiplexer between the devices which can raise an interrupt and the CPU:
            |-----------|            |--------|
-(0)--------|           |            |        |
-(...)------|    IRC    |------------|  CPU   |
-(15)-------|           |            |        |
            |-----------|            |--------|
Every device is associated with an IRQ (the number on the left). After executing each instruction, the CPU senses the interrupt-request line. If a signal is detected, a state save is performed and the CPU loads an interrupt handler routine, which is found through the Interrupt Vector located at a fixed address in memory. As far as I can see, the IRQ number and the vector number in the Interrupt Vector are not the same: for example, my network card is registered to IRQ 8, and on an Intel Pentium processor vector 8 points to a routine that signals an error condition. So there must be a mapping somewhere which points to the correct handler.
Questions:
1) If I write a device driver and register IRQ X for it, how does the system know which device should be handled? For example, I can call request_irq() with IRQ number 10, but how does the system know that the handler belongs to the mouse, the keyboard, or whatever device I am writing the driver for?
2) What does the Interrupt Vector look like then? If I use IRQ 10 for my device, this would overwrite a standard error handler in the table (the first usable entry is 32 according to Silberschatz, Operating System Concepts).
3) Who initially sets up the IRQs? The BIOS? The OS?
4) Who is responsible for the matching of the IRQ and the offset in the Interrupt Vector?
5) It is possible to share IRQs. How is that possible? There are hardware lanes on the mainboard which connect devices to the interrupt controller. How can two lanes be configured to the same interrupt? There must be a table which says, for example, that lanes 2 and 3 handle IRQ 15. Where does this table reside and what is it called?
Answers with respect to the Linux kernel. They should apply to most other OSes as well.
1) If I write a device driver and register IRQ X for it, how does the system know which device should be handled? For example, I can call request_irq() with IRQ number 10, but how does the system know that the handler belongs to the mouse, the keyboard, or whatever device I am writing the driver for?
There is no single answer to this. On a custom embedded system, for example, the hardware designer will tell the driver writer "I am going to route device x to irq y". Buses such as PCI, which a network card typically uses, are more flexible: hardware/firmware-level arbitration assigns an IRQ number to a new device when it is detected, and the number is written to one of the device's PCI configuration registers. The driver first reads this register and then registers its interrupt handler for that particular IRQ. Other buses have similar mechanisms.
What you can do is look up calls to request_irq in kernel code and how the driver obtained the irq value. It will be different for each kind of driver.
The answer to this question is thus: the system doesn't know. The hardware designer or the hardware protocol provides this information to the driver writer, and the driver writer then registers the handler for that particular IRQ, telling the system what to do when it sees that IRQ.
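As a rough illustration of the PCI case described above, the sketch below pulls the interrupt routing fields out of a copy of a device's configuration header. The offsets 0x3C (Interrupt Line) and 0x3D (Interrupt Pin) come from the PCI specification; the struct and function names are made up for this example. In a real Linux driver the PCI core has already done this work and exposes the result as pdev->irq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets in the standard PCI configuration header. */
#define PCI_INTERRUPT_LINE 0x3c /* IRQ number assigned by firmware/OS */
#define PCI_INTERRUPT_PIN  0x3d /* 0 = none, 1..4 = INTA#..INTD# */

/* Hypothetical container for the two routing fields. */
struct irq_routing {
    uint8_t line; /* the IRQ number a legacy driver would request */
    uint8_t pin;  /* which interrupt pin the device uses */
};

/*
 * Extract the routing fields from a raw copy of the config header.
 * Returns 0 on success, -1 if the buffer is too short or the device
 * does not use an interrupt pin at all.
 */
int pci_get_irq_routing(const uint8_t *cfg, size_t len, struct irq_routing *out)
{
    assert(cfg != NULL && out != NULL);
    if (len <= PCI_INTERRUPT_PIN)
        return -1;
    if (cfg[PCI_INTERRUPT_PIN] == 0)
        return -1;
    out->line = cfg[PCI_INTERRUPT_LINE];
    out->pin = cfg[PCI_INTERRUPT_PIN];
    return 0;
}
```

The value returned in line plays the role of "irq y" in the embedded example: it is what the driver would hand to request_irq().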
2) What does the Interrupt Vector look like then? If I use IRQ 10 for my device, this would overwrite a standard error handler in the table (the first usable entry is 32 according to Silberschatz, Operating System Concepts).
Good question. There are two parts to it.
a) When you call request_irq(irq, handler), the system does not program entry irq in the IVT or IDT, but entry N + irq, where N is the number of error handlers or general-purpose exceptions reserved on that CPU. Details vary from system to system.
b) What happens if you erroneously request an IRQ which is already used by another driver? request_irq() returns an error and your handler is not installed, unless both drivers asked for the line to be shared.
Note: IDT stands for Interrupt Descriptor Table.
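To make part a) concrete, here is a minimal sketch of the offset calculation, assuming N = 32 (the value quoted from Silberschatz in the question) and 16 IRQ lines. The macro and function names are invented for illustration, and real kernels may allocate vectors quite differently.

```c
#include <assert.h>

#define NR_EXCEPTION_VECTORS 32 /* assumed N: entries 0..31 belong to CPU exceptions */
#define NR_IRQ_LINES         16 /* assumed: IRQ 0..15 */

/* Map an IRQ line to its slot in the interrupt table, or -1 if invalid. */
int irq_to_vector(int irq)
{
    int vec;

    if (irq < 0 || irq >= NR_IRQ_LINES)
        return -1;
    vec = NR_EXCEPTION_VECTORS + irq;
    assert(vec >= NR_EXCEPTION_VECTORS); /* never lands on an exception slot */
    return vec;
}

/* Inverse mapping, used when an interrupt arrives on a given vector. */
int vector_to_irq(int vector)
{
    int irq = vector - NR_EXCEPTION_VECTORS;

    return (irq >= 0 && irq < NR_IRQ_LINES) ? irq : -1;
}
```

Under these assumptions IRQ 10 ends up in entry 42, so the exception handlers in entries 0 to 31 are never overwritten.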
3) Who initially sets up the IRQs? The BIOS? The OS?
The BIOS first, and then the OS. Some operating systems, MS-DOS for example, do not reprogram the IVT set up by the BIOS. More sophisticated modern OSes such as Windows or Linux do not want to rely on particular BIOS functions, so they reprogram the IDT. But the BIOS has to set it up initially; only then does the OS come into the picture.
4) Who is responsible for the matching of the IRQ and the offset in the Interrupt Vector?
I am not sure what you mean. The flow is like this: first your device is assigned an IRQ number, and then you register a handler for it with that IRQ number. If you use the wrong IRQ number and then enable interrupts on your device, the interrupt never reaches your handler, because the handler is registered for the wrong IRQ number. Depending on the system, the kernel then disables the unhandled line, or the machine hangs or crashes.
5) It is possible to share IRQs. How is that possible? There are hardware lanes on the mainboard which connect devices to the interrupt controller. How can two lanes be configured to the same interrupt? There must be a table which says, for example, that lanes 2 and 3 handle IRQ 15. Where does this table reside and what is it called?
This is a very good question. The kernel does not solve this with an extra table. Instead, for each shared IRQ the handlers are kept in a linked list of function pointers. The kernel walks the list and invokes every handler in turn; each handler reads its own device's status and reports whether the interrupt was its own.
The code looks like this (simplified; get_chain() and report() are placeholders):

driver1:

irqreturn_t d1_int_handler(int irq, void *dev_id)
{
        if (device_interrupted(dev_id)) {   /* this reads the hardware */
                do_interrupt_handling(dev_id);
                return IRQ_HANDLED;
        }
        return IRQ_NONE;                    /* not our device */
}

driver2:

Similar to driver 1.

kernel:

void do_irq(int n)
{
        irqreturn_t ret = IRQ_NONE;
        struct irqaction *action;

        /* every handler registered on line n gets a chance to run */
        for (action = get_chain(n); action; action = action->next)
                ret |= action->handler(n, action->dev_id);

        if (ret == IRQ_NONE)
                report("None of the drivers accepted the interrupt");
}
What are IRQ domains? I read the kernel documentation (https://www.kernel.org/doc/Documentation/IRQ-domain.txt), which says:
The number of interrupt controllers registered as unique irqchips
show a rising tendency: for example subdrivers of different kinds
such as GPIO controllers avoid reimplementing identical callback
mechanisms as the IRQ core system by modeling their interrupt
handlers as irqchips, i.e. in effect cascading interrupt controllers.
How can a GPIO controller be called an interrupt controller?
What are Linux IRQ domains, and why are they needed?
This is documented well in the first paragraph of Documentation/IRQ-domain.txt, so I will assume that you already know it. If not, please ask what is unclear regarding that documentation. The text below explains how to use the IRQ domain API and how it works.
How can a GPIO controller be called an interrupt controller?
Let me answer this question using the max732x.c driver as a reference. It's a GPIO driver that also acts as an interrupt controller, so it should be a good example of how the IRQ domain API works.
Physical level
To follow the explanation, let's first look at the MAX732x mechanics. Application circuit from the datasheet (simplified for our example):
When the voltage level changes on any of the P0-P7 pins, the MAX7325 generates an interrupt on its INT pin. The driver (running on the SoC) can read the status of the P0-P7 pins via I2C (SCL/SDA pins) and generate a separate interrupt for each of the P0-P7 pins. This is why this driver acts as an interrupt controller.
Consider the following configuration:
"Some device" changes the level on the P4 pin, prompting the MAX7325 to generate an interrupt. The interrupt from the MAX7325 is connected to the GPIO4 IP core (inside the SoC), which uses line #29 of that GPIO4 module to notify the CPU about the interrupt. So we can say that the MAX7325 is cascaded to the GPIO4 controller. GPIO4 also acts as an interrupt controller, and it is cascaded to the GIC interrupt controller.
Device tree
Let's declare above configuration in device tree. We can use bindings from Documentation/devicetree/bindings/gpio/gpio-max732x.txt as reference:
expander: max7325@6d {
	compatible = "maxim,max7325";
	reg = <0x6d>;
	gpio-controller;
	#gpio-cells = <2>;
	interrupt-controller;
	#interrupt-cells = <2>;
	interrupt-parent = <&gpio4>;
	interrupts = <29 IRQ_TYPE_EDGE_FALLING>;
};
The meaning of properties is as follows:
interrupt-controller property defines that device generates interrupts; it will be needed further to use this node as interrupt-parent in "Some device" node.
#interrupt-cells defines format of interrupts property; in our case it's 2: 1 cell for line number and 1 cell for interrupt type
interrupt-parent and interrupts properties are describing interrupt line connection
Let's say we have driver for MAX7325 and driver for "Some device". Both are running in CPU, of course. In "Some device" driver we want to request interrupt for event when "Some device" changes level on P4 pin of MAX7325. Let's first declare this in device tree:
some_device: some_device@1c {
	reg = <0x1c>;
	interrupt-parent = <&expander>;
	interrupts = <4 IRQ_TYPE_EDGE_RISING>;
};
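For illustration, this is roughly how a two-cell interrupts property such as <4 IRQ_TYPE_EDGE_RISING> can be decoded from the flattened device tree, where each cell is a 32-bit big-endian value. The sketch is plain userspace C with invented struct and function names; in the kernel this job is done by the IRQ domain's translate callback (for example irq_domain_xlate_twocell()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trigger types as defined in include/dt-bindings/interrupt-controller/irq.h */
#define IRQ_TYPE_EDGE_RISING  1
#define IRQ_TYPE_EDGE_FALLING 2
/* Mask of the trigger bits, as in include/linux/irq.h */
#define IRQ_TYPE_SENSE_MASK   0xf

/* Hypothetical decoded form of one interrupt specifier. */
struct irq_spec {
    uint32_t hwirq; /* line number local to the interrupt parent */
    uint32_t type;  /* trigger type */
};

/* Read one big-endian 32-bit device tree cell. */
static uint32_t dt_cell(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode a specifier for a parent with #interrupt-cells = <2>.
 * Returns 0 on success, -1 if the property has the wrong length.
 */
int decode_twocell(const uint8_t *prop, size_t len, struct irq_spec *out)
{
    assert(prop != NULL && out != NULL);
    if (len != 2 * sizeof(uint32_t))
        return -1;
    out->hwirq = dt_cell(prop);                          /* first cell: line */
    out->type = dt_cell(prop + 4) & IRQ_TYPE_SENSE_MASK; /* second cell: type */
    return 0;
}
```

For the "Some device" node, the decoded hwirq is 4 (pin P4 of the expander) and the type is IRQ_TYPE_EDGE_RISING.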
Interrupt propagation
Now we can do something like this (in "Some device" driver):
devm_request_threaded_irq(core->dev, core->gpio_irq, NULL,
some_device_isr, IRQF_TRIGGER_RISING | IRQF_ONESHOT,
dev_name(core->dev), core);
And some_device_isr() will be called each time the level on the P4 pin of the MAX7325 goes from low to high (rising edge). How does it work? From left to right, if you look at the picture above:
"Some device" changes level on P4 of MAX7325
MAX7325 changes level on its INT pin
GPIO4 module is configured to catch such a change, so it generates an interrupt to the GIC
GIC notifies CPU
All those actions are happening on hardware level. Let's see what's happening on software level. It actually goes backwards (from right to the left on the picture):
CPU now is in interrupt context in GIC interrupt handler. From gic_handle_irq() it calls handle_domain_irq(), which in turn calls generic_handle_irq(). See Documentation/gpio/driver.txt for details. Now we are in SoC's GPIO controller IRQ handler.
SoC's GPIO driver also calls generic_handle_irq() to run handler, which is set for each particular pin. See for example how it's done in omap_gpio_irq_handler(). Now we are in MAX7325 IRQ handler.
The MAX7325 IRQ handler calls handle_nested_irq(), so that all IRQ handlers of devices connected to the MAX7325 (the "Some device" IRQ handler, in our case) are called in the max732x_irq_handler() thread
finally, IRQ handler of "Some device" driver is called
IRQ domain API
GIC driver, GPIO driver and MAX7325 driver -- they all are using IRQ domain API to represent those drivers as interrupt controllers. Let's take a look how it's done in MAX732x driver. It was added in this commit. It's easy to figure out how it works just by reading IRQ domain documentation and looking to this commit. The most interesting part of that commit is this line (in max732x_irq_handler()):
handle_nested_irq(irq_find_mapping(chip->gpio_chip.irqdomain, level));
irq_find_mapping() will find linux IRQ number by hardware IRQ number (using IRQ domain mapping function). Then handle_nested_irq() function will be called, which will run IRQ handler of "Some device" driver.
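The idea behind irq_find_mapping() can be modelled in a few lines of userspace C. The sketch below assumes a simple linear domain, that is, an array indexed by hardware IRQ number that stores the Linux IRQ number, and walks the changed-pin bitmask the way a GPIO expander handler would. All names (toy_domain, toy_find_mapping and so on) are invented for this example and are not kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_PINS 8 /* P0..P7 on the expander */

/* Toy linear domain: index = hardware IRQ (pin), value = Linux IRQ, 0 = unmapped. */
struct toy_domain {
    unsigned int virq[TOY_NR_PINS];
};

/* Record that hardware line hwirq is known to Linux as virq. */
void toy_create_mapping(struct toy_domain *d, unsigned int hwirq, unsigned int virq)
{
    assert(hwirq < TOY_NR_PINS);
    d->virq[hwirq] = virq;
}

/* Stand-in for irq_find_mapping(): hardware number in, Linux number out. */
unsigned int toy_find_mapping(const struct toy_domain *d, unsigned int hwirq)
{
    return hwirq < TOY_NR_PINS ? d->virq[hwirq] : 0;
}

/*
 * Walk the bitmask of pins that changed and collect the Linux IRQ of
 * every mapped pin, lowest pin first. These are the numbers a real
 * handler would pass to handle_nested_irq(). 'out' must have room for
 * TOY_NR_PINS entries. Returns how many entries were written.
 */
int toy_pending_virqs(const struct toy_domain *d, uint8_t changed, unsigned int *out)
{
    int n = 0;

    for (unsigned int pin = 0; pin < TOY_NR_PINS; pin++) {
        unsigned int virq;

        if (!(changed & (1u << pin)))
            continue;
        virq = toy_find_mapping(d, pin);
        if (virq)
            out[n++] = virq;
    }
    return n;
}
```

The point of the indirection is that the "Some device" driver only ever sees the Linux IRQ number, while the expander driver only ever sees pin numbers.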
GPIOLIB_IRQCHIP
Since many GPIO drivers are using IRQ domain in the same way, it was decided to extract that code to GPIOLIB framework, more specifically to GPIOLIB_IRQCHIP. From Documentation/gpio/driver.txt:
To help out in handling the set-up and management of GPIO irqchips and the
associated irqdomain and resource allocation callbacks, the gpiolib has
some helpers that can be enabled by selecting the GPIOLIB_IRQCHIP Kconfig
symbol:
gpiochip_irqchip_add(): adds an irqchip to a gpiochip. It will pass
the struct gpio_chip* for the chip to all IRQ callbacks, so the callbacks
need to embed the gpio_chip in its state container and obtain a pointer
to the container using container_of().
(See Documentation/driver-model/design-patterns.txt)
gpiochip_set_chained_irqchip(): sets up a chained irq handler for a
gpio_chip from a parent IRQ and passes the struct gpio_chip* as handler
data. (Notice handler data, since the irqchip data is likely used by the
parent irqchip!) This is for the chained type of chip. This is also used
to set up a nested irqchip if NULL is passed as handler.
This commit converts IRQ domain API to GPIOLIB_IRQCHIP API in MAX732x driver.
Here's a comment I found in include/linux/irqdomain.h:
Interrupt controller "domain" data structure. This could be defined as
a irq domain controller. That is, it handles the mapping between
hardware and virtual interrupt numbers for a given interrupt domain.
The actual structure I think it refers to is irq_domain.
Currently I have a requirement to support MSI with 2 vectors on my PCI device. Each vector needs a different handler routine. The HW document says the following:
vector 0 is for temperature sensor
vector 1 is for power sensor
Below is the driver code I am following.
1. First enable two vectors using pci_enable_msi_block(pdev, 2)
2. Next assign interrupt handlers using request_irq() (two different IRQs, two different interrupt handlers).
int vecs = 2;
struct pci_dev *pdev = dev->pci_dev;
result = pci_enable_msi_block(pdev, vecs);
Here result is zero, which says the call succeeded in enabling two vectors.
The questions I have are:
The HW document says vector 0. I hope this is not vector 0 of the OS, right? In any case I can't get vector 0 in the OS.
The difficult problem I am facing is this: when I call request_irq() for the first IRQ, how do I tell the OS that I need to map this request to vector 0 of the HW? Likewise for the second IRQ, how do I map it to vector 1 of the HW?
pci_enable_msi_block:
If 2 MSI messages are requested using this function and if the function call returns 0, then 2 MSI messages are allocated for the device and pdev->irq is updated to the lowest of the interrupts assigned to the device.
So pdev->irq and pdev->irq+1 are the new interrupts assigned to the device. You can now register two interrupt handlers:
request_irq(pdev->irq, handler1, ...)
request_irq(pdev->irq+1, handler2, ...)
With MSI and MSI-X, the interrupt number (irq) is a CPU "vector". Message signaled interrupts allow the device to write a small amount of data to a special memory-mapped I/O address; the chipset then delivers the corresponding interrupt to a processor.
There may be two different MSI data values that can be written to an MSI address, which is the case when your hardware supports 2 MSIs (one for the temperature sensor and one for the power sensor). So after you issue pci_enable_msi_block(pdev, 2), the chipset asserts the interrupt to the processor whenever either of the two MSI data values is written to that special memory-mapped I/O address (the MSI address).
After the call to pci_enable_msi_block(pdev, 2), you can request two IRQs through request_irq(pdev->irq, handler, flags....) and request_irq(pdev->irq + 1, handler, flags....). Whenever MSI data is written to the MSI address, pdev->irq or pdev->irq + 1 is asserted depending on which sensor sent the MSI, and the corresponding handler is invoked.
These two MSI data values can be configured in the hardware's MSI data register.
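One way to picture the two MSI data values: with multi-message MSI, the PCI specification lets a device that was granted 2^k messages vary the low k bits of the programmed Message Data to select a message. The sketch below shows that arithmetic next to the matching pdev->irq + n lookup on the driver side. It is a userspace model with invented function names, and it assumes that message n is delivered as Linux interrupt base_irq + n, as the answers above describe.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Message Data the device writes for message 'n' when 'nvec' messages
 * (a power of two) were enabled and 'base_data' was programmed into
 * its MSI data register.
 */
uint16_t msi_data_for_message(uint16_t base_data, unsigned int nvec, unsigned int n)
{
    assert(nvec != 0 && (nvec & (nvec - 1)) == 0); /* 1, 2, 4, ... */
    assert(n < nvec);
    return (uint16_t)((base_data & ~(nvec - 1)) | n);
}

/* Linux IRQ number a driver would pass to request_irq() for message 'n', or -1. */
int msi_irq_for_message(int base_irq, unsigned int nvec, unsigned int n)
{
    return n < nvec ? base_irq + (int)n : -1;
}
```

With two messages, "vector 0" and "vector 1" in the hardware document are simply n = 0 and n = 1 here; they are indices relative to the device, not CPU vector numbers.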
I have several registered interrupts assigned to GPIOs, and an application in user space.
How can I notify the application that an interrupt occurred, and which interrupt it was?
Possibly fasync is applicable for this goal, but I can't find examples of how to send information from an interrupt handler to a user space application.
It would be good if you could present some useful examples.
Thanks in advance.
You don't need fancy kernel to userspace communication. A userspace application has access to GPIOs using Sysfs. Read about it in Documentation/gpio.txt.
First, export a GPIO pin like this (the actual number depends on your setup):
# echo 23 > /sys/class/gpio/export
This will export GPIO pin #23, and thus create /sys/class/gpio/gpio23.
Set its direction:
# echo in > /sys/class/gpio/gpio23/direction
If the hardware GPIO controller supports interrupts generation, the driver should also support it and you will see /sys/class/gpio/gpio23/edge. Write either rising, falling or both to this file to indicate the signal edge(s) that will create a "userspace interrupt". Now, to get interrupted, use the poll(2) system call on /sys/class/gpio/gpio23/value. Then, when the poll call unblocks, read the new value (/sys/class/gpio/gpio23/value), which will be '0' or '1' (ASCII).
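The steps above could look roughly like this in C. This is a minimal sketch rather than a complete program: it assumes the pin has already been exported and its edge file configured as shown, and the helper names are invented. The POLLPRI | POLLERR event mask and the lseek() before re-reading follow the kernel's GPIO sysfs documentation.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Open the value file of an exported pin, e.g. "/sys/class/gpio/gpio23/value". */
int gpio_open_value(const char *path)
{
    return open(path, O_RDONLY);
}

/*
 * Re-read the value file from the start; sysfs returns "0\n" or "1\n".
 * Returns 0 or 1, or -1 on error.
 */
int gpio_read_value(int fd)
{
    char buf[4];

    assert(fd >= 0);
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    if (read(fd, buf, sizeof buf) < 1)
        return -1;
    if (buf[0] != '0' && buf[0] != '1')
        return -1;
    return buf[0] - '0';
}

/*
 * Block until the configured edge occurs, then return the new level.
 * Returns -1 on error and -2 on timeout (timeout_ms < 0 waits forever).
 */
int gpio_wait_for_edge(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
    int ret;

    /* A dummy read clears any edge that was latched before we started. */
    if (gpio_read_value(fd) < 0)
        return -1;

    ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    if (ret == 0)
        return -2;
    return gpio_read_value(fd);
}
```

An application with several GPIO interrupts can put one pollfd per value file into the same poll() call; the index of the entry whose revents is set tells it which interrupt occurred.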
dinesh provided a C implementation of eepp's proposed solution, which requires that the application block in poll().
Here is a C++ implementation which abstracts this functionality, and provides callback/interrupt functionality instead. Note the GPIO constructor which takes a callback function as an argument. This provides the capability desired by the OP.
https://github.com/tweej/HighLatencyGPIO
For the last few days I have been studying chapter 10 of the book LDD3. I have some doubts; please clarify them. Some points are my own analysis; please tell me if they are wrong.
For ARM there is one interrupt vector table address for the IRQ interrupt: 0x00000018.
A chip manufacturer can then have separate interrupt lines for their hardware, such as USART, SPI, I2C, and external interrupts, and multiplex them onto the single IRQ line of the ARM core,
and provide registers (of their choice) to determine which one fired the interrupt.
Likewise, there may be, for example, a single interrupt line available for the GPIO pin level change interrupt.
As per the links below, a single interrupt line can be shared by many handlers of different device drivers.
fiq & irq handler -- arm
Usually the interrupt controller is a hardware unit that multiplexes many interrupt lines together, generating single line to the CPU. When an interrupt occurs, the controller asserts the IRQ line. The CPU stops executing and jumps through the IRQ vector (location varies) to the interrupt handler. The interrupt handlers reads a register on interrupt controller to determine the interrupt line and, invokes the correct interrupt handler and then clears the interrupt - allowing another to occur.
http://www.makelinux.net/ldd3/chp-10-sect-2
How to register an interrupt handler is described in this link.
https://unix.stackexchange.com/questions/47306/how-does-the-linux-kernel-handle-shared-irqs
Linux calls all the interrupt handlers registered for the same shared line.
My question is: as a device driver programmer I only call request_irq().
Who provides the code at the generic IRQ interrupt address 0x00000018, which reads the vendor-specific register to determine which interrupt line raised the IRQ,
and then tells Linux to call all the shared interrupt handlers registered for that IRQ line?
Is the GCC compiler startup code for the chipset doing this work for us?
The actual interrupt handling is set up by linux/arch/arm/kernel/entry-armv.S. There is then a long chain of code involved in decoding and running interrupt handlers.
The actual request_irq is generic code, which sets up a "descriptor", irq_desc, defined in linux/include/linux/irqdesc.h.
The actual handling of "which interrupt is which" is configured in the specific setup for the board. I'm giving an example of an omap2/omap3 board here (randomly chosen because I have worked with those boards, but not in Linux):
linux/arch/arm/mach-omap2/irq.c
I hope this helps.
In the entry-armv.S file, you can find the code that obtains the IRQ number, as follows:
/*
 * Interrupt handling.  Preserves r7, r8, r9
 */
	.macro	irq_handler
	get_irqnr_preamble r5, lr
1:	get_irqnr_and_base r0, r6, r5, lr
	movne	r1, sp
	@
	@ routine called with r0 = irq number, r1 = struct pt_regs *
	@
	adrne	lr, BSYM(1b)
	bne	asm_do_IRQ
The macro get_irqnr_and_base is supposed to be machine specific, and is hence contained in the file arch/arm/mach_/include/entry-macro.S.
You can see that this macro is implemented differently for different ARM-based machines. That is how the IRQ line is identified on different hardware.
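Conceptually, the machine-specific part boils down to "read the controller's pending register, pick a line, call whatever is registered for it". The following userspace model shows that loop. The 32-line pending bitmask, the handler table and all toy_ names are assumptions made for the example; real controllers differ in register layout and priority rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_IRQS 32

typedef void (*toy_handler_t)(unsigned int irq, void *data);

struct toy_irq_desc {
    toy_handler_t handler; /* NULL if nothing is registered */
    void *data;            /* cookie handed back to the handler */
};

static struct toy_irq_desc toy_desc[TOY_NR_IRQS];

/* Rough analogue of request_irq(): 0 on success, -1 if busy or invalid. */
int toy_request_irq(unsigned int irq, toy_handler_t handler, void *data)
{
    if (irq >= TOY_NR_IRQS || handler == NULL || toy_desc[irq].handler)
        return -1;
    toy_desc[irq].handler = handler;
    toy_desc[irq].data = data;
    return 0;
}

/* Example handler: counts invocations in the int that 'data' points to. */
void toy_count_handler(unsigned int irq, void *data)
{
    (void)irq;
    (*(int *)data)++;
}

/* Plays the role of get_irqnr_and_base: lowest pending line, or -1 if none. */
int toy_get_irqnr(uint32_t pending)
{
    for (int irq = 0; irq < TOY_NR_IRQS; irq++)
        if (pending & (UINT32_C(1) << irq))
            return irq;
    return -1;
}

/*
 * Plays the role of the irq_handler loop: keep decoding until no line
 * is pending. Returns a bitmask of lines that had no handler.
 */
uint32_t toy_handle_pending(uint32_t pending)
{
    uint32_t unhandled = 0;
    int irq;

    while ((irq = toy_get_irqnr(pending)) >= 0) {
        assert(irq < TOY_NR_IRQS);
        if (toy_desc[irq].handler)
            toy_desc[irq].handler((unsigned int)irq, toy_desc[irq].data);
        else
            unhandled |= UINT32_C(1) << irq;
        pending &= ~(UINT32_C(1) << irq); /* acknowledge the line */
    }
    return unhandled;
}
```

A driver author only supplies the equivalent of toy_count_handler and calls the equivalent of toy_request_irq(); the decoding loop belongs to the architecture and board code, not to the compiler's startup code.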
I am writing a device driver to handle interrupts for a PCIe card, which currently works for any interrupt vector raised on the IRQ line.
But it has a few types that can be raised, flagged by the Vector register. So now I need to read the vector information and be a bit cleverer...
So, do I:
1/ Have separate dev nodes /dev/int1, /dev/int2, etc. for each interrupt type, and just document that int1 is for vector type A etc.?
1.1/ As each file/char device will have its own minor number, when it is opened I'll know which is which, I think.
1.2/ LDD3 seems to demo this method.
2/ Have one node /dev/int (as I do now) and have multiple processes hanging off the same read method? Sounds better?!
2.1/ Then only wake the correct process up...?
2.2/ Do I use separate wait_queue_head_t wait queues? Or different flag/test conditions?
In the read method:
wait_event_interruptible(wait_queue, flag);
In the handler (not real code!):
int vector = read_vector();
if (vector == A) {
        flag = 1;
        wake_up_interruptible(&wait_queue);
        return IRQ_HANDLED;
} else {
        return IRQ_NONE;        /* or IRQ_RETVAL? */
}
EDIT: notes from people's comments:
1) my user-space code mmap's all of the PCIe firmware registers
2) User-space code has a few threads, each perform a blocking read on the device driver device nodes, which then returns data from the firmware when an interrupt occurs. I need the correct thread woken up depending on the interrupt type.
I am not sure I understand correctly what you mean by the Vector register (a pointer to some documentation would help me be more precise about your case).
Anyway, any PCI device gets a unique interrupt number (given by the BIOS or some firmware on other architectures than x86). You just need to register this interrupt in your driver.
priv->name = DRV_NAME;
err = request_irq(pdev->irq, your_irqhandler, IRQF_SHARED, priv->name,
pdev);
if (err) {
dev_err(&pdev->dev, "cannot request IRQ\n");
goto err_out_unmap;
}
One other thing that I do not really understand is why you would export your interrupts as a dev node: interrupts are certainly something that needs to remain in your driver/kernel code. But I guess here you want to export a device that is then accessed from userspace. I just don't find /dev/int to be a good name.
For your question about multiple dev nodes: if your different interrupt sources then provide access to different hardware resources (even if on the same PCI board) I would go for option 1), with a wait_queue for each device. Otherwise, I would go for option 2)
Since your interrupts are coming from the same physical device, whether you choose option 1) or option 2), the interrupt line will have to be shared and you will have to read the vector in your interrupt handler to determine which hardware resource raised the interrupt.
For option 1), it would be something like this:
static irqreturn_t pex_irqhandler(int irq, void *dev)
{
        /* read_vector() stands for reading the card's Vector register */
        int vector = read_vector();

        if (vector != A && vector != B)
                return IRQ_NONE;        /* shared line: not our interrupt */

        if (vector == A)
                set_flagA(flag);
        else
                set_flagB(flag);

        wake_up_interruptible(&wait_queue);
        return IRQ_HANDLED;
}
For option 2, it would be similar, but you would have only one if clause (for the respective vector value) in every different interrupt handler that you would request for every node.
If you have different channels you can read() from, then you should definitely use different minor numbers. Imagine you have a card with four serial ports: you would definitely want four /dev/ttySx.
But does your device fit with this model?
First, I assume you're not trying to get your code into the mainline kernel. If you are, expect a vigorous discussion about the best way to do this. If you're writing a simple interrupt handling driver for a card which is mostly driven by mmap from user-space, there are a lot of ways to solve this problem.
If you use multiple device nodes (option 1), you can also implement poll so that a single application can open multiple device nodes and wait for a selection of interrupts. The minor number will be sufficient to tell them apart. If you have a wake queue for each vector, you can wake only the relevant listeners. You'll need to latch the vector after a successful poll to be sure that the read succeeds.
If you use a single device node (option 2), you'll need to add some extra magic so that the threads can register their interest in particular interrupt vectors. You could do this with an ioctl, or have the threads write the interrupt vectors to the device. Each thread should open the device node to get its own file descriptor. You can then associate the list of requested vectors with each open file descriptor. As a bonus, you can let the application read the interrupt vector from the device, so it knows which one happened.
You'll need to think about how the interrupt gets cleared. The interrupt handler will need to remove the interrupt, then store the result so it can be passed to user-space. You might find a kfifo useful for this rather than a wait queue. If you have a fifo for each open file descriptor, you can distribute the interrupt notifications to each listening application.