Why are interrupts needed for MMIO on PCIe? - linux

This blog post talks about the difficulties in bringing PCI passthrough support to ARM devices: https://www.linaro.org/blog/kvm-pciemsi-passthrough-armarm64/
It cites GICv2/GICv3, which are ARM's interrupt controllers. You can write to them via MMIO and make them deliver interrupts to CPUs.
However, why are interrupts needed? Shouldn't the PCIe driver talk to the PCIe device through MMIO, that is, by reading from and writing to memory?

Interrupts are necessary because otherwise the operating system has no way of knowing that an event happened. Operating systems do not poll memory constantly; they still need to know that an event happened, and when. That's where interrupts come in.
Imagine you have a hard-disk PCIe controller. How does the operating system know when the disk is done writing its data to RAM?
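To make this concrete, here is a minimal sketch of the interrupt side of a hypothetical PCI driver. All names here (my_isr, my_dev, MY_STATUS_REG, the write-1-to-clear acknowledge) are illustrative assumptions, not a real device's layout; only the kernel calls (ioread32, iowrite32, the irqreturn_t handler shape) are the real API. The point is that the driver only touches the MMIO registers when the device raises the line, instead of busy-polling them:

```c
#include <linux/interrupt.h>
#include <linux/bits.h>
#include <linux/io.h>

#define MY_STATUS_REG   0x00    /* assumption: offset from the device spec */
#define MY_IRQ_PENDING  BIT(0)  /* assumption: "event happened" status bit */

struct my_dev {
        void __iomem *bar0;     /* MMIO mapping of the device's BAR0 */
};

/* Called by the kernel when the device raises its interrupt; without
 * this, the OS would have to poll MY_STATUS_REG in a loop. */
static irqreturn_t my_isr(int irq, void *dev_id)
{
        struct my_dev *dev = dev_id;
        u32 status = ioread32(dev->bar0 + MY_STATUS_REG);

        if (!(status & MY_IRQ_PENDING))
                return IRQ_NONE;        /* not our device (shared line) */

        /* Acknowledge the event, then defer the heavy work elsewhere. */
        iowrite32(MY_IRQ_PENDING, dev->bar0 + MY_STATUS_REG);
        return IRQ_HANDLED;
}
```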

Related

What is the relationship between SPI and DMA in linux?

In my opinion, SPI and DMA are both controllers.
SPI is a communication tool and DMA can transfer data without the CPU.
The system APIs such as spi_sync() or spi_async() are driven by the CPU.
So what does "SPI with DMA" mean? Does it mean DMA can drive the SPI API without the CPU, or does the CPU control SPI while the data is transferred directly by DMA?
SPI is not a tool, it is a communication protocol. Typical microcontrollers have that protocol implemented in hardware, which can be accessed by reads/writes to dedicated registers in the address space of the given controller.
DMA on microcontrollers is typically designed to move the contents of registers to memory and vice versa. DMA can often be configured for a specific number of reads/writes, or to increment or decrement the source and target memory addresses, and so on.
If you have a microcontroller which has SPI with DMA support, it typically means that you can place some data in memory which will be transferred to the SPI unit, sending multiple data bytes without intervention of the CPU core itself, or that an amount of data bytes can be read from SPI into memory automatically without occupying the CPU core.
How such DMA SPI transfers are configured is written in the data sheets of the controllers. There is a very wide range of controller types, so no specific information can be given here without knowing the exact part.
The Linux APIs for dealing with SPI abstract the access to DMA and SPI by using the microcontroller-specific implementations in the drivers; see the sketch below.
It is quite unclear whether you want to use the API to access your SPI device, or whether you want to implement a device driver to make the Linux API work on your specific controller.
It is not possible to give a general introduction to writing a kernel driver here, or to go through your data sheet register by register. If you need further information you have to make your question much more specific!
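As an illustration of that abstraction, here is a minimal consumer-side sketch (my_spi_read_block is a made-up helper name; the spi_* calls are the real kernel API). The caller issues the same spi_sync() whether the controller driver moves the bytes by PIO or by DMA; that choice is made inside the controller driver based on transfer size, buffer alignment, and hardware support:

```c
#include <linux/spi/spi.h>
#include <linux/slab.h>

/* Read `len` bytes from a SPI device into a kmalloc'ed (DMA-capable)
 * buffer. Whether DMA is actually used is up to the controller driver. */
static int my_spi_read_block(struct spi_device *spi, void *buf, size_t len)
{
        struct spi_transfer xfer = {
                .rx_buf = buf,
                .len    = len,
        };
        struct spi_message msg;

        spi_message_init(&msg);
        spi_message_add_tail(&xfer, &msg);
        return spi_sync(spi, &msg);     /* blocks until the transfer is done */
}
```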

PCIe interrupt handling in the Linux kernel

I am working on a PCIe Linux driver. I would like to register an ISR for the device. The IRQ number assigned to the device by the Linux system is 16, which is shared with another device (a USB host controller) as well (checked with lspci -v). It is a pin-based interrupt.
By searching online I found that almost all PCI driver examples provide only IRQF_SHARED as the flag in request_irq(), and do not provide any other flags to specify behaviour like high/low level triggering.
My question is: how does the Linux kernel determine the behaviour of a shared interrupt (for a PCIe device), whether it is low level or high level?
PCIe uses MSI, so there is no high/low level to be concerned with. Traditional PCI cards use level-triggered interrupts with active-low signaling, so this isn't something a driver writer has access to modify or tweak.
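For completeness, here is a hedged sketch of how a modern driver typically requests its interrupt (my_setup_irq and the "my_pci_dev" name string are assumptions; pci_alloc_irq_vectors(), pci_irq_vector(), and request_irq() are the real API, though the exact PCI_IRQ_* flag names have shifted slightly across kernel versions). The driver asks for MSI first and only falls back to the shared legacy pin if it must:

```c
#include <linux/pci.h>
#include <linux/interrupt.h>

static int my_setup_irq(struct pci_dev *pdev, irq_handler_t handler, void *ctx)
{
        int nvec, irq;

        /* Prefer MSI; fall back to the legacy (shared, level-triggered) pin. */
        nvec = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI | PCI_IRQ_LEGACY);
        if (nvec < 0)
                return nvec;

        irq = pci_irq_vector(pdev, 0);
        /* IRQF_SHARED only matters on the legacy pin; MSI vectors are never
         * shared between devices, and trigger type/polarity is handled by
         * the PCI core and the interrupt controller, not by the driver. */
        return request_irq(irq, handler, IRQF_SHARED, "my_pci_dev", ctx);
}
```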

NAPI interrupt disabling and handling shared interrupt line

I'm trying to understand the NAPI implementation in the Linux kernel. These are my basic doubts.
1) NAPI disables further interrupts and handles the skbs using polling.
Who disables them?
Should the interrupt handler disable them?
If yes, isn't the time gap between disabling the interrupt and handling the softirq net_rx_action (where the polling is actually done) way too long?
2) Do all NAPI-enabled drivers, by default, disable interrupts on receiving a single frame and handle the remaining frames using polling in the bottom half?
Or is there logic where the switch to poll mode happens only if frames > 32 (continuing to receive all frames in the IRQ handler otherwise)?
3) Now coming to shared IRQs:
What happens to other devices' interrupts? Another device's bottom half might not run, since those devices are not in the poll_list.
I wrote a comprehensive guide to understanding, tuning, and optimizing the Linux network stack which explains everything about network drivers, NAPI, and more, so check it out.
As far as your questions:
Device IRQs are supposed to be disabled by the driver's IRQ handler after NAPI is enabled. Yes, there is a time gap, but it should be quite small. That is part of the tradeoff decision you must make: do you care more about throughput or latency? Depending on which, you can optimize your network stack appropriately. In any case, most NICs allow the user to increase (or decrease) the size of the ring buffer that tracks incoming network data. So, a pause is fine because packets will just be queued for processing later.
It depends on the driver, but in general most drivers will enable NAPI poll mode from the IRQ handler as soon as it fires, usually with a call to napi_schedule. You can find a walkthrough of how NAPI is enabled for the Intel igb driver here. Note that IRQ handlers are not necessarily fired for every single packet: on most cards you can adjust the rate at which IRQ handlers fire by using a feature called interrupt coalescing. Some NICs may not support this option.
The IRQ handlers for other devices will be executed when their IRQ is fired, because IRQ handlers have very high priority on the CPU. The NAPI poll loop (which runs in a softirq) will run on whichever CPU the device IRQ was handled on. Thus, if you have multiple NICs and multiple CPUs, you can tune the IRQ affinity of each NIC's IRQs to prevent starving a particular NIC.
As for the example you asked about in the comments:
Say NIC 1 and NIC 2 share an IRQ line. Let's assume NIC 1 is under low load and NIC 2 under high load, and NIC 1 receives an interrupt. The driver of NIC 1 would disable the interrupt until its softirq is handled; call that time gap t1. So for time t1, NIC 2's interrupts are also disabled, right?
This depends on the driver, but in the normal case NIC 1 only disables interrupts while the IRQ handler is being executed. The call to napi_schedule tells the softirq code that it should start running if it hasn't started yet. The softirq code runs asynchronously, so no, NIC 1 does not wait for the softirq to be handled.
Now, as far as shared IRQs go: again, it depends on the device and the driver. The driver should be written in such a way that it can handle shared IRQs. If the driver disables an IRQ line that is being shared, all devices sharing that IRQ will stop receiving interrupts. This would be bad. One way that some devices solve this is by allowing the driver to read/write a specific register that causes that specific device to stop generating interrupts. This is the preferred solution, as it does not block other devices sharing the same IRQ line.
When IRQs are disabled for NAPI, what is meant is that the driver asks the NIC hardware to stop sending IRQs. Thus, other IRQs on the same line (for other devices) will still continue to be processed. Here's an example of how the Intel igb driver turns off IRQs for that device by writing to registers.
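To tie those pieces together, here is a minimal sketch of the canonical NAPI pattern (all my_nic_* names are made up; napi_schedule(), napi_complete_done(), and container_of() are the real kernel API). The ISR masks the device's own interrupts via a register write, exactly as described above, and the poll function unmasks them once the backlog is cleared:

```c
#include <linux/netdevice.h>
#include <linux/interrupt.h>

struct my_nic {
        struct napi_struct napi;
        /* ... device registers, RX ring, etc. ... */
};

/* Device-specific hooks (assumed): mask/unmask IRQs in *device* registers
 * and pull up to `budget` frames off the RX ring. */
static void my_nic_mask_irqs(struct my_nic *nic);
static void my_nic_unmask_irqs(struct my_nic *nic);
static int my_nic_rx_clean(struct my_nic *nic, int budget);

static irqreturn_t my_nic_isr(int irq, void *data)
{
        struct my_nic *nic = data;

        my_nic_mask_irqs(nic);      /* register write, NOT disable_irq(), so
                                     * other devices on a shared line keep
                                     * getting their interrupts */
        napi_schedule(&nic->napi);  /* ask the softirq to run my_nic_poll() */
        return IRQ_HANDLED;
}

static int my_nic_poll(struct napi_struct *napi, int budget)
{
        struct my_nic *nic = container_of(napi, struct my_nic, napi);
        int done = my_nic_rx_clean(nic, budget);

        /* Backlog cleared: leave poll mode, re-enable device interrupts. */
        if (done < budget && napi_complete_done(napi, done))
                my_nic_unmask_irqs(nic);
        return done;
}
```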

How to make a PCI device initiate a DMA operation?

I need to find a way to trigger DMA operations easily on demand to facilitate hardware debugging. Is it possible to initiate a DMA read on an existing PCI device (e.g. a sound card or network card) on my Linux system by writing directly to its registers? Or do I have to write a custom driver and load it with insmod?
There is no standard way to start a DMA operation. Generally, you need to prepare a DMA buffer on the host and set up the DMA registers on your device: load the DMA address(es), size(s), etc.
However, there is no single standard for DMA registers on PCI devices.
You need to find the specification document for your PCI device. In that spec, look for the DMA chapter (DMA is also called PCI "master access", as opposed to "target access").
There you will find:
- whether scatter-gather or contiguous DMA is supported;
- how to set up the DMA registers; one of them is usually called the DMA CSR, the "DMA command/status register";
- whether the device supports a complicated DMA layout (one or many ring buffers, a chain of DMA descriptors, etc.).
But the good thing is that many PCI devices support zero-length DMA, which does not do any memory access but just triggers a "DMA complete" interrupt. That can be a very convenient starting point for you.
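If the registers are documented and no driver is bound to the device, one way to poke them without writing a kernel module is to mmap the BAR through sysfs. A hedged user-space sketch follows (the bus address 0000:01:00.0, DMA_CSR_OFFSET, and the "start" bit are all placeholders you must take from your device's spec; the sysfs resource0 mmap itself is a standard Linux facility and needs root):

```c
/* Map BAR0 of an unbound PCI device via sysfs and poke a (hypothetical)
 * DMA control register. Offsets must come from the device's data sheet. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define DMA_CSR_OFFSET 0x40   /* placeholder: take this from the spec */

int main(void)
{
        /* placeholder bus address; find yours with lspci -D */
        int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        volatile uint32_t *bar0 = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
        if (bar0 == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        bar0[DMA_CSR_OFFSET / 4] = 0x1;  /* e.g. a "start DMA" bit per the spec */

        munmap((void *)bar0, 4096);
        close(fd);
        return 0;
}
```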

Best way to transfer video data to a device over PCI in linux

I need to transfer video data to and from an FPGA device over PCI in a linux environment. I'm using a third party PCI master core on the FPGA. So far, I've implemented a simple DMA controller on the FPGA to transfer data from the FPGA to the CPU, using consecutive PCI write bursts.
Next, I need to transfer video data from the CPU to the FPGA. What is the best way to go about this?
Should I implement a module on the FPGA which performs a series of burst reads over PCI, or is there a way to get the CPU to efficiently write data into the FPGA's memory using PCI write bursts?
My bandwidth requirements are around 30 MB/s in both directions.
Thanks.
You could do posted writes from the CPU, like video card drivers do, but you'll need some driver magic such as setting up an MTRR (which means you might have an architectural dependency). If you want to be safe, DMA reads from the FPGA are the better way to go. 30 MB/s isn't much.
It sounds to me like the FPGA should master both reads and writes; otherwise you would hog the host CPU. That's a classic task for DMA (and you cannot guarantee a host-side DMA engine exists on every host).
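For the CPU-to-FPGA direction, here is a hedged kernel-side sketch of handing the FPGA a buffer to master-read from (my_start_fpga_read and the FPGA_RD_* register offsets are assumptions about a hypothetical FPGA design; dma_alloc_coherent(), iowrite32(), and lower_32_bits() are the real kernel API):

```c
#include <linux/dma-mapping.h>
#include <linux/pci.h>
#include <linux/io.h>

#define FPGA_RD_ADDR_LO 0x10   /* assumption: DMA bus-address register */
#define FPGA_RD_LEN     0x18   /* assumption: transfer-length register */
#define FPGA_RD_START   0x1c   /* assumption: "go" register */

/* Allocate a coherent buffer (to be filled with video data by the CPU),
 * then point the FPGA's read engine at it so it can burst-read on its
 * own, leaving the CPU free. Returns the CPU-side mapping. */
static void *my_start_fpga_read(struct pci_dev *pdev, void __iomem *bar0,
                                size_t size, dma_addr_t *bus_addr)
{
        void *cpu_addr = dma_alloc_coherent(&pdev->dev, size, bus_addr,
                                            GFP_KERNEL);
        if (!cpu_addr)
                return NULL;

        iowrite32(lower_32_bits(*bus_addr), bar0 + FPGA_RD_ADDR_LO);
        iowrite32(size, bar0 + FPGA_RD_LEN);
        iowrite32(1, bar0 + FPGA_RD_START);  /* FPGA starts bursting reads */
        return cpu_addr;
}
```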
