I am attempting to write a network driver for custom hardware, targeting a Xilinx Zynq-7000 FPGA device.
My issue is the software handling of the MSI interrupt on the CPU side. When the interrupt is fired by the PCIe device, the driver executes the interrupt handler once and returns, but then PCIe IO stops working and lspci shows the MSI as reset. Any further interrupts are not caught by the kernel, and the PCIe device is essentially dead. I checked the hardware and no resets are issued to the FPGA, so I suspect something is going on in the kernel.
Thank you in advance.
After posting this question I discovered the problem that had been plaguing me for a little over a day. I mapped my DMA buffer as follows:
net_priv->rx_phy_addr = dma_map_single(&pdev->dev, net_priv->rx_virt_addr,
dev->mtu, PCI_DMA_FROMDEVICE);
I unmapped the same buffer later with
dma_unmap_single(&pdev->dev, net_priv->rx_phy_addr, BUFFER_SIZE,
PCI_DMA_FROMDEVICE);
Due to a typo, BUFFER_SIZE was 1 MB, while dev->mtu is about 1.5 kB. What seems to happen is that unmapping 1 MB of space tore down other memory mappings in addition to the 1.5 kB I had actually mapped. As soon as dma_unmap_single() completed, the PCIe IO region was dead, as was the interrupt region. I hope my mistake can help someone else out.
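For reference, a minimal sketch of the matched map/unmap, under the assumption that a field such as rx_len is added to net_priv just to remember the mapped size (DMA_FROM_DEVICE is the direction constant that pairs with the dma_* API; PCI_DMA_FROMDEVICE is the legacy pci_* equivalent):

/* Map exactly dev->mtu bytes and remember that length so the unmap
 * uses the same size. rx_len is a hypothetical field for illustration. */
net_priv->rx_len = dev->mtu;
net_priv->rx_phy_addr = dma_map_single(&pdev->dev, net_priv->rx_virt_addr,
                                       net_priv->rx_len, DMA_FROM_DEVICE);
if (dma_mapping_error(&pdev->dev, net_priv->rx_phy_addr))
        return -ENOMEM;

/* ... receive path runs here ... */

dma_unmap_single(&pdev->dev, net_priv->rx_phy_addr,
                 net_priv->rx_len, DMA_FROM_DEVICE);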
Related
I've built a Linux driver for an SPI device.
The SPI device sends an IRQ to the processor when new data is ready to be read.
The IRQ fires about every 3 ms, and the driver then reads 2 bytes over SPI.
The problem I have is that sometimes there are more than 6 ms between the moment the IRQ fires and the moment the SPI transfer starts, which means I lose 2 bytes from the SPI device.
In addition, there's an unpredictable delay between the 2 bytes; sometimes it's close to 0, sometimes it's up to 300 µs.
So my question is: how can I reduce the latency between the IRQ and the SPI reads?
And how can I avoid the latency between the 2 bytes?
I've tried compiling the kernel with the preemption option, but it does not change things much.
As for the hardware, I'm using a mini2440 board running at 400 MHz, using a hardware SPI port (not an I/O-simulated, bit-banged SPI).
Thanks for your help.
BR,
Vincent.
According to the brochure for the Samsung S3C2440A CPU, the SPI interface hardware supports both interrupt- and DMA-based operation. A look at the actual datasheet reveals that the hardware also supports a polling mode.
If you want to achieve high data rates reliably, the DMA-based approach is what you need. Once a DMA operation is configured, the hardware will move the data to RAM on its own, without the need for low-latency interrupt handling.
That said, I do not know the state of the Linux SPI drivers for your CPU. It could be a matter of missing support for DMA, of specific system settings or even of how you are using the driver from your own code. The details w.r.t. SPI are often highly dependent on the particular implementation...
I had a similar problem: I basically got an IRQ and needed to drain a queue via SPI in less than 10 ms or the chip would start to drop data. With high system load (ssh login was actually enough) sometimes the delay between the IRQ handler enqueueing the next SPI transfer with spi_async and the SPI transfer actually happening exceeded 11 ms.
The solution I found was the rt flag in struct spi_device (see here). Enabling it sets the thread that controls the SPI bus to real-time priority, which made the timing of all SPI transfers very reliable. As a side effect, that change also removes the delay before the complete callback.
Just as a heads up, I think this was not available in earlier kernel versions.
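For what it's worth, a minimal sketch of using that flag, assuming a kernel recent enough that struct spi_device has the rt member (my_spi_probe is just an illustrative name):

static int my_spi_probe(struct spi_device *spi)
{
        /* ask the SPI core to run its message-pump thread at RT priority */
        spi->rt = true;
        return spi_setup(spi);
}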
The thing is, the Linux SPI stack uses queues for transmitting messages.
This means there is no guarantee about the delay between the moment you ask to send an SPI message and the moment it is actually sent.
Finally, to meet my 3 ms requirement between SPI messages, I had to stop using the Linux SPI stack and write directly to the CPU's SPI registers inside my own IRQ handler.
That's quite dirty, but it's the only way I found to make it work with small delays.
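To give an idea of what that looks like, here is a rough sketch (not the original code): spi_regs is assumed to be an ioremap() of the S3C2440 SPI controller, and the SPTDAT/SPSTA/SPRDAT offsets and the ready bit follow the datasheet naming but should be treated as assumptions.

static irqreturn_t my_spi_irq(int irq, void *dev_id)
{
        u8 rx[2];
        int i;

        for (i = 0; i < 2; i++) {
                writeb(0xff, spi_regs + SPTDAT);          /* dummy byte to clock in RX */
                while (!(readb(spi_regs + SPSTA) & 0x01)) /* wait for transfer-ready */
                        cpu_relax();
                rx[i] = readb(spi_regs + SPRDAT);
        }
        /* hand rx[] to the rest of the driver here */
        return IRQ_HANDLED;
}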
I am running Linux on an MX28 (ARMv5), and am using a GPIO line to talk to a device. Unfortunately, the device has some special timing requirements: a low on the GPIO line cannot last longer than 7 µs, while highs have no special timing requirements. The code is implemented as a kernel device driver and toggles the GPIO with direct register writes rather than going through the kernel GPIO API. For testing, I am just generating 3 pulses. The process is as follows, all in one function so it should fit in the instruction cache (a rough sketch of the function is shown after the list):
set gpio high
Save Flags & Disable Interrupts
gpio low
pause
gpio high
repeat 2x more
Restore Flags / Re-enable Interrupts
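A rough sketch of that function, assuming hypothetical register offsets for the MX28 pinctrl set/clear data registers and a hypothetical pin mask (the real values come from the reference manual):

#include <linux/io.h>
#include <linux/irqflags.h>
#include <linux/delay.h>
#include <linux/bitops.h>

static void __iomem *pinctrl_base;     /* ioremap() of the PINCTRL block */
#define DOUT_SET 0x0714                /* hypothetical set-register offset */
#define DOUT_CLR 0x0718                /* hypothetical clear-register offset */
#define PIN_MASK BIT(5)                /* hypothetical pin bit */

static void send_pulses(void)
{
        unsigned long flags;
        int i;

        writel(PIN_MASK, pinctrl_base + DOUT_SET);          /* gpio high */
        local_irq_save(flags);                              /* save flags, disable IRQs */
        for (i = 0; i < 3; i++) {
                writel(PIN_MASK, pinctrl_base + DOUT_CLR);  /* gpio low */
                ndelay(800);                                /* pause, well under 7 us */
                writel(PIN_MASK, pinctrl_base + DOUT_SET);  /* gpio high */
                ndelay(800);
        }
        local_irq_restore(flags);                           /* restore flags */
}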
Here's the output of a logic analyzer tied to the GPIO.
Most of the time it works just great, and the pulses last just under 1us. However, about 10% of the lows last for many, many microseconds. Even though interrupts are disabled, something is causing the flow of the code to be interrupted.
I am at a loss. RT Linux would likely not help here, because the problem is not latency; it appears to be something happening during the low, even though nothing should interrupt it with the IRQs disabled. Any suggestions would be greatly, greatly appreciated.
The ARM cache on an IMX25 (ARM926) is 16K code, 16K data L1, with a 32-byte line length, or eight instructions per line. With the DDR-SDRAM controller running at 133 MHz and a 16-bit bus, the transfer rate is about 300 MB/s. A cache line fill should only take about 100 ns, not 9 µs; this is about 100 times too long.
However, you have four other issues with Linux.
TLB misses and a page table walk.
Data aborts.
DMA masters stealing.
FIQ interrupts.
It is unlikely that the LCD master is stealing enough bandwidth, unless you have a huge display. Is your display larger than 1/4VGA? If not, this is only 10% of the memory bandwidth and this will pipeline with the processor. Do you have either Ethernet or USB active? These peripherals are higher data rate and could cause this type of contention with SDRAM.
All of these issues may be avoided by writing your toggler PC-relative and copying it to the IRAM. See iram_alloc.c; this file should be portable to older versions of Linux. The XBAR switch allows fetches from SDRAM and IRAM simultaneously. The IRAM can still be a target of other DMA masters. If you are really pressed, move the code to the ETB buffers, which no other master in the system can access.
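As a sketch of the IRAM approach, assuming the iram_alloc() helper from the Freescale tree's iram_alloc.c, the generic ARM fncpy() helper, and a self-contained toggle routine (gpio_toggler, with TOGGLE_FN_SIZE as an assumed upper bound on its size):

#include <asm/fncpy.h>
#include <mach/iram.h>          /* declares iram_alloc() in the Freescale tree */

#define TOGGLE_FN_SIZE 256      /* assumed upper bound on the routine's size */

extern void gpio_toggler(void); /* the self-contained, PC-relative toggle routine */
static void (*iram_toggler)(void);

static int move_toggler_to_iram(void)
{
        unsigned long iram_phys;
        void *iram_virt = iram_alloc(TOGGLE_FN_SIZE, &iram_phys);

        if (!iram_virt)
                return -ENOMEM;

        /* copy the routine into on-chip RAM and get back a callable pointer */
        iram_toggler = fncpy(iram_virt, &gpio_toggler, TOGGLE_FN_SIZE);
        return 0;
}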
The TLB miss can actually be quite costly, as it may need to run several single-beat SDRAM cycles; still, this should be under 1 µs. You have not posted code, so it is possible that a variable or other data access is causing a data abort, which is not maskable.
If you have any drivers using the FIQ, they may still be running even though you have masked the normal IRQ interrupts. For instance, the ALSA driver for this system normally uses the FIQ.
Both the ETB and the IRAM are 32-bit data paths and low wait state. Either one will probably give better response than the DDR-SDRAM.
We have achieved sub-microsecond response by using a FIQ and IRAM to toggle GPIOs on an IMX258 for another bit-banged protocol.
One possible workaround to the problem Chris mentioned (in addition to problems with paging of kernel module code) is to use a PWM peripheral where the duration of the pulse is pre-programmed and the timing is implemented in hardware.
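As a sketch using the (older) kernel PWM API, with PWM_ID and the duty/period values purely illustrative, this keeps the low phase at 1 µs and generates it entirely in hardware:

#include <linux/pwm.h>
#include <linux/err.h>

#define PWM_ID 0                        /* illustrative PWM channel */

static struct pwm_device *pulse_pwm;

static int start_pulses(void)
{
        pulse_pwm = pwm_request(PWM_ID, "gpio-pulse");
        if (IS_ERR(pulse_pwm))
                return PTR_ERR(pulse_pwm);

        /* duty = high time: 9 us high in a 10 us period leaves a 1 us low,
         * timed by the PWM block rather than by software */
        pwm_config(pulse_pwm, 9000, 10000);
        return pwm_enable(pulse_pwm);
}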
Fancy processors with caches are not suitable for hard realtime work. Execution time varies if cache misses are non-deterministic (and designs where cache misses are completely deterministic aren't complicated enough to justify a fancy processor).
You can try to avoid memory controller latency during critical sections by aligning the critical section so that it doesn't straddle cache lines, or by prefetching the code you will need. But this is going to be very non-portable and will create a nightmare for future maintenance. And it still doesn't protect the access to memory-mapped GPIO from bus contention.
I'm using the Luminary LM3S8962 microcontroller and its included Library Guide, but this should be relevant to any ARM Cortex-M3 that has a Nested Vectored Interrupt Controller (NVIC).
You can only register one interrupt service routine function with an entire GPIO Port. A GPIO port typically has 8 pins on it, each of which can be configured with an interrupt. For each pin, you can test whether or not an interrupt "happened" on it (is pending), right? and for each pin you can clear a pending interrupt, right?
If a pin on the GPIO port triggers the ISR then the processor is in the ISR. Then what happens if another pin on the same port triggers an interrupt while we're in the ISR? We assume the code detects what pins have pending interrupts.
- Is this ISR interrupted and a new one begins, with the same code, but an updated PinInterruptStatus register? (I hope not)
- Is this ISR executed until completion, immediately executing the interrupt for the other pin right afterward? (I know ARM Cortex M3 implements tail-chaining of interrupts)
- Or must there be a while loop that loops until all the pins have been cleared, clearing a pin after it has been processed?
Maybe this will help:
http://www.ti.com/lit/gpn/lm3s8962
As stated in the comment: generally ISRs should take steps to prevent reentrancy. In something like a PIC, this could be as simple as disabling the interrupt at the "top" of the ISR, and enabling the interrupt at the "bottom". The M3's NVIC is a bit more complicated. This white paper (http://www.arm.com/files/pdf/IntroToCortex-M3.pdf) states the following on p.7:
The NVIC supports nesting (stacking) of interrupts, allowing an
interrupt to be serviced earlier by exerting higher priority. It also
supports dynamic reprioritisation of interrupts. Priority levels can
be changed by software during run time. Interrupts that are being
serviced are blocked from further activation until the interrupt
service routine is completed, so their priority can be changed without
risk of accidental re-entry.
The above discussion directly addresses the possibility of same interrupt reentrancy, and it also introduces the concept of prioritization to handle interrupts of higher priority interrupting your ISR.
This reference is pretty good: http://infocenter.arm.com/help/topic/com.arm.doc.dui0552a/DUI0552A_cortex_m3_dgug.pdf. On p. 4-9, you'll find instructions to enable/disable interrupts. On page 4-6, you'll find a description of the Interrupt Clear-pending Registers. Using these, you can determine what interrupts are pending. If you really want to get fancy with interrupt enable/disable control, check out the BASEPRI and BASEPRI_MAX registers.
Having said that, I'm not sure I agree with your statement that your question is relevant to any Cortex-M3. Keil (my flavor of Cortex-M3) mentions that the EXTI (external interrupt controller) handles GPIO pin interrupts. Interestingly, the ARM documentation briefly discusses "EXTI", but does not refer to it as a "controller" like the Keil STM32 documentation does. A quick google on "STM32 EXTI" yields lots of hits; a similar search on "Luminary EXTI" does not yield much. Given that, I'm guessing that this particular controller is one of the peripheral devices that ARM leaves to 3rd parties.
This document bolsters that view: http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/REFERENCE_MANUAL/CD00171190.pdf. There are several AFIO_EXTI registers mentioned here. These permit the mapping of GPIO lines to interrupts. Unfortunately, I can't find anything similar in the Luminary documentation.
So... what does this mean? It looks like you only have port-level granularity for your interrupt. Thus, your ISR will have to determine which pin transitioned (assuming you are looking for edges). Good luck!
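A sketch of such a port-level handler using the StellarisWare driverlib calls (the port base, pin choices and the handle_pinN() helpers are placeholders):

#include "inc/hw_types.h"
#include "inc/hw_memmap.h"
#include "driverlib/gpio.h"

extern void handle_pin0(void);  /* hypothetical per-pin handlers */
extern void handle_pin1(void);

void GPIOPortFIntHandler(void)
{
        /* read only the pins whose (masked) interrupts are pending */
        unsigned long status = GPIOPinIntStatus(GPIO_PORTF_BASE, true);

        /* clear first, so a new edge during handling re-pends the interrupt */
        GPIOPinIntClear(GPIO_PORTF_BASE, status);

        if (status & GPIO_PIN_0)
                handle_pin0();
        if (status & GPIO_PIN_1)
                handle_pin1();
}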
In the Cortex-M3, if two interrupts have the same priority (which is the case for all GPIO pins on a port), the one already being serviced will not be interrupted. The interrupt that arrives later is left in the pending state.
When a GPIO interrupt occurs, you can check the GPIO interrupt status for rising/falling edges (IO0IntEnR/IO0IntEnF, depending on which edge is enabled) for the corresponding bit to find the pin that caused the interrupt.
In Linux, what are the options for handling device interrupts in user space code rather than in kernel space?
Experience tells me it is possible to write good and stable user-space drivers for almost any PCI adapter. It just requires some sophistication and a small proxying layer in the kernel. UIO is a step in that direction, but if you want to correctly handle interrupts in user space then UIO might not be enough, for example if the device doesn't support the PCI spec's interrupt-disable bit, which UIO relies on.
Note that process wakeup latencies are a few microseconds, so if your implementation requires very low latency, user space might be a drag on it.
If I were to implement a user-space driver, I would reduce the kernel ISR to just a "disable & ack & wake up userspace" operation, handle the interrupt inside the woken-up process, and then re-enable the interrupt (of course, by writing to mapped PCI memory from the userspace process).
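A minimal sketch of that kernel-side top half, assuming irq_wait and irq_count are state the proxy driver keeps (shown as statics here) and that the device-specific ack goes where the comment indicates:

#include <linux/interrupt.h>
#include <linux/wait.h>
#include <linux/atomic.h>

static DECLARE_WAIT_QUEUE_HEAD(irq_wait);
static atomic_t irq_count = ATOMIC_INIT(0);

static irqreturn_t proxy_isr(int irq, void *dev_id)
{
        disable_irq_nosync(irq);            /* keep a level interrupt from re-firing */
        /* device-specific ack would go here */
        atomic_inc(&irq_count);
        wake_up_interruptible(&irq_wait);   /* wake the user-space handler */
        return IRQ_HANDLED;
}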
There is the Userspace I/O system (UIO), but the handling should still be done in kernel space. OTOH, if you just need to notice the interrupt, you don't need the kernel part.
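The user-space side of UIO is just blocking reads on the device node; a sketch (the device path is assumed, and the write() to re-enable the interrupt only applies to drivers such as uio_pci_generic that implement irqcontrol):

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
        int fd = open("/dev/uio0", O_RDWR);     /* assumed device node */
        uint32_t count;
        uint32_t enable = 1;

        if (fd < 0)
                return 1;

        for (;;) {
                /* read() blocks until the next interrupt and returns the count */
                if (read(fd, &count, sizeof(count)) != sizeof(count))
                        break;
                printf("interrupt #%u\n", count);
                /* handle the device here, then re-enable the interrupt */
                write(fd, &enable, sizeof(enable));
        }
        return 0;
}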
You may like to take a look at Chapter 10: Interrupt Handling from the book Linux Device Drivers, Third Edition.
You have to trigger userland code indirectly.
The kernel ISR indicates the interrupt by writing to a file / setting a register / signalling. The user-space application polls this and goes on with the appropriate code.
Edge cases: more or fewer interrupts than expected (timeouts / too many interrupts per time interval).
Linux file abstraction is used to connect kernel and user space. This is performed by character devices and ioctl() calls. Some may prefer sysfs entries for this purpose.
This can look odd, because event-triggered device notifications (interrupts) are hooked up to 'time-triggered' polling, but it is actually asynchronous blocking (read/select). Anyway, some questions arise regarding performance.
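For illustration, the kernel side of such a character-device read could block like this (a sketch: irq_wait and irq_count are assumed to be bumped by the driver's ISR, and the per-reader bookkeeping is simplified to a static):

#include <linux/fs.h>
#include <linux/wait.h>
#include <linux/atomic.h>
#include <linux/uaccess.h>

static DECLARE_WAIT_QUEUE_HEAD(irq_wait);
static atomic_t irq_count;              /* bumped by the driver's ISR */

static ssize_t irq_dev_read(struct file *filp, char __user *buf,
                            size_t len, loff_t *off)
{
        static int last_seen;           /* simplified per-reader state */
        int now;

        if (len < sizeof(now))
                return -EINVAL;

        /* block until the ISR has signalled a new interrupt */
        if (wait_event_interruptible(irq_wait,
                                     atomic_read(&irq_count) != last_seen))
                return -ERESTARTSYS;

        now = atomic_read(&irq_count);
        last_seen = now;

        if (copy_to_user(buf, &now, sizeof(now)))
                return -EFAULT;
        return sizeof(now);
}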
So interrupts cannot be directly handled outside the kernel.
E.g. shared memory can be in user space, and with some I/O permission settings addresses can be mapped, so UIO works, but not for direct interrupt handling.
I have found only one 'minority report' on the topic, VFIO (http://lxr.free-electrons.com/source/Documentation/vfio.txt):
https://stackoverflow.com/a/21197797/5349798
Similar questions:
Running user thread in context of an interrupt in linux
Is it possible in linux to register a interrupt handler from any user-space program?
Linux Kernel: invoke call back function in user space from kernel space
Linux Interrupt vs. Polling
Linux user space PCI driver
How do I inform a user space application that the driver has received an interrupt in linux?
I have a Fibre Optic link, with a proprietary Device Driver.
The link goes into a PCIe card, running on RHEL 5.2 (kernel 2.6.18-128~).
I have mmap'ed the interface on the card for setup and FIFO access etc, and these read/writes take a few µs to complete, so all good there.
But of course I cannot use this for interrupts, so I have to use the kernel module provided, with its user-space library interface.
WaitForInterrupt(); // API lib interface to kernel module
// Interrupt occurs and am returned to my code in user space
time = CurrentTime() - LatchedTime(); // time to get to here
It takes around 70 µs to return from WaitForInterrupt(). (The time the interrupt is raised is latched in the firmware; I read this value, which as I say above takes ~2 µs, and compare it against the current time in the firmware.)
What are expected access times between an interrupt occurring and the User Space API interrupt call wait method returning?
How long do network or other high-speed interfaces take?
500 ms is many orders of magnitude larger than what a simple switch between userspace and kernel takes, but as someone mentioned in the comments, Linux is not a real-time OS, so there's no guarantee 500 ms "hiccups" won't show up now and then.
It's quite impossible to tell what the culprit is; the device driver could simply be trying to bundle up data to be more efficient.
That said, we've had endless trouble with some custom cards and their interactions with both APIC and ACPI, requiring a delicate balance of BIOS settings, which card goes into which PCI slot, and whether a particular video card screws up everything; likely the cause was a dubious driver interacting with a more or less buggy BIOS/video card.
If you're able to reliably exceed 500us on a system that's not heavily loaded, I think you're looking at a bad driver implementation (or its userspace wrapper/counterpart).
In my experience the latency to wake a user thread on interrupt should be less than 10us, though (as others have said) Linux provides no latency guarantees.
If you have a recent kernel, you can use the perf sched tool to measure the latency, and see where the time is being used. (500us does sound a tad on the high side, depending on your processor, how many tasks are running, ...)