What's the difference between streaming mappings and coherent mapping in DMA - linux

According to the Linux Device Drivers book, one has to make sure that the DMA address range used by the operating system and the range supported by the hardware agree. The book says:
The first question that must be answered before attempting DMA is whether the given device is
capable of such an operation on the current host
The kernel.org documentation says that the first step in setting up streaming mappings is a call to
int dma_set_mask(struct device *dev, u64 mask);
The first step for coherent (consistent) allocations is a call
to dma_set_coherent_mask()
The e1000e driver and the Realtek drivers do both, because the probe function of the PCI driver calls
dma_set_mask_and_coherent
which covers both streaming and coherent mappings by telling the kernel the bit mask the hardware supports (for example, 64 bits).
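To make the "bit mask" concrete, here is a minimal userspace sketch of the arithmetic behind such a mask. dma_bit_mask() mirrors the kernel's DMA_BIT_MASK(n) macro; dma_range_fits() is a hypothetical helper of my own, not a kernel function, and only illustrates what it means for an address range to be reachable under a given mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/* Hypothetical helper (not a kernel API): true if the whole range
 * [addr, addr + len) can be addressed by a device limited to mask. */
static bool dma_range_fits(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);
    uint64_t last = addr + len - 1;
    /* The first clause rejects ranges that wrap around 2^64. */
    return last >= addr && last <= mask;
}
```

With a 32-bit mask, a buffer at 0xca70a000 is reachable but one at 0x100000000 is not. That is the situation the mask calls let the kernel detect before any mapping is handed to the device.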
This is how the Realtek device driver sets up the coherent mapping:
dma_alloc_coherent(&pdev->dev, R8169_RX_RING_BYTES,
&tp->RxPhyAddr, GFP_KERNEL);
inside the open function of the net_device.
And for the streaming DMA mapping this is used:
alloc_pages_node // allocate a kernel page for DMA
dma_map_page(d, data, 0, R8169_RX_BUF_SIZE, DMA_FROM_DEVICE); // create the streaming mapping?
also in the open function.
My question is: why are there two kinds of DMA mapping, and why do real drivers use both streaming and coherent mappings?
For example, the Realtek driver uses single-page streaming mappings plus a coherent mapping. The Rx descriptor array is reached through a pointer into the coherent mapping, and the streaming page mappings are tied to an array of type page * that it calls Rx_databuff[256U].
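The layout described in the last paragraph can be modelled in plain userspace C. This is only a sketch under my own assumptions: the descriptor fields, RX_BUF_SIZE and the function name are hypothetical and not taken from the r8169 source. The desc array stands in for the single long-lived coherent allocation, and buf_bus stands in for the bus addresses that one streaming mapping per page would return.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RX_DESC 256U
#define RX_BUF_SIZE 4096U            /* assumption: one page per buffer */

/* Hypothetical descriptor layout; real hardware defines its own. */
struct rx_desc {
    uint64_t buf_addr;               /* bus address of the packet buffer */
    uint32_t len;                    /* size of that buffer in bytes */
    uint32_t owned_by_device;        /* 1 while the device may write to it */
};

struct rx_ring {
    struct rx_desc desc[NUM_RX_DESC]; /* models the coherent descriptor array */
    uint64_t buf_bus[NUM_RX_DESC];    /* models one streaming mapping per page */
};

/* Point every descriptor at its own buffer and hand it to the device. */
static void rx_ring_fill(struct rx_ring *r, uint64_t first_bus_addr)
{
    for (uint32_t i = 0; i < NUM_RX_DESC; i++) {
        r->buf_bus[i] = first_bus_addr + (uint64_t)i * RX_BUF_SIZE;
        r->desc[i].buf_addr = r->buf_bus[i];
        r->desc[i].len = RX_BUF_SIZE;
        r->desc[i].owned_by_device = 1;
    }
}
```

In this model the ring is a small structure that both sides read and write for the whole lifetime of the interface, while each data buffer is handed over in one direction for one packet at a time. Those are the two usage patterns that the two mapping types are named for.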

Related

Linux Kernel ALSA Driver DMA to FPGA issue

I'm working on a Zynq 7000 SoC. I wrote an ALSA driver. Inside the FPGA, I implemented all the circuitry (FIFO, clock generator, I2S DAC output). From the Linux kernel's point of view, there is one address for feeding samples into the FPGA FIFO, one address for reading the hardware pointer, and an interrupt for the "snd_pcm_period_elapsed" function call.
In the "pcm_copy" callback, I copy samples from userspace to a previously allocated kernel buffer, and then I use a loop to push the data from the kernel buffer into the FPGA.
I would like to use an AXI DMA so that I can copy directly from the PCM buffer to the FPGA FIFO. The DMA works perfectly (I used something very simple and not the Linux DMA API, which I find very difficult to understand...)
To test my DMA, I allocate a coherent buffer with "dma_alloc_coherent", copy the samples into the coherent buffer, and then launch the transfer.
My issue is: I don't understand how to get the physical address of the PCM buffer so that I can run the DMA transfer directly, without the copy into the coherent buffer. I have read a lot of documents and looked at a lot of device drivers, but most of them use PCI DMA, which is another story...
I found somewhere that I have to implement the mmap callback in my ALSA driver, but I didn't understand it well. I also looked at the "dma_area" field in the substream data structure, which contains an address, but if I use it I get a DMA error...
I really want to understand how the PCM ring buffer is allocated and so on. Can anybody help me?
Best regards,
Nicolas

Why is firmware part of a driver? Is it important, and can I exclude it from a driver?

First, why is firmware important? For example, in real drivers I see that DMA or MMIO reads and writes are done, but driver code in Linux normally also fills in a firmware struct after requesting it from the kernel with the request_firmware function.
Why should I add firmware to a PCI driver when I can read from and write to the device from the driver using direct memory access? DMA is totally different and has nothing to do with firmware. DMA maps a device object to a kernel virtual page, whereas the firmware struct object simply has read and write operations, and I don't know why they are needed. For example, in a typical driver this function is one of the firmware struct's write operations:
typedef void (*my_driver_write_phy)(struct mydriver_private *o, int reg, int val);
This function is registered as a callback and is a member of my driver's firmware struct, so I guess the kernel calls it. But my question is: when does the kernel call this function? Is it used for additional features (please explain whether it can be excluded), or is it called every time data is transferred between the device and the system, including when DMA memory is accessed?
So basically the real question is: is firmware also required for direct memory access, and can it be excluded?

Difference between usb_alloc_coherent and kzalloc/kmalloc

What is the fundamental difference between using usb_alloc_coherent and kzalloc/kmalloc in the context of a USB driver? Both do the same thing: allocate a memory area for a URB buffer. So what is the difference between them? Is there any benefit to using usb_alloc_coherent instead of kzalloc/kmalloc?
Drivers are device (endpoint) centric, but memory allocation must take the capabilities of the USB controller into account. This is because it is the controller that performs the DMA from memory onto the USB bus. So usb_alloc_coherent basically wraps the generic dma_alloc_coherent but calls it for the controller, not the endpoint. Using the DMA API instead of plain kmalloc ensures that no bounce buffers will be required.
This saves device driver writers from code ugliness (breaking abstractions) and from handling some corner cases. usb_alloc_coherent also uses a memory pool to speed things up a bit.
The documentation says:
usb_alloc_coherent - allocate dma-consistent buffer for URB_NO_xxx_DMA_MAP
These buffers are used with URB_NO_xxx_DMA_MAP set in urb->transfer_flags
to avoid behaviors like using "DMA bounce buffers", or thrashing IOMMU
hardware during URB completion/resubmit.

DMA and I/O memory region under Linux

I'm writing this because I have some doubts about the behaviour of DMA.
I'm reading about the PCI layout and how device drivers interact with the card, and I read about DMA.
As I understand it, PCI cards don't have a separate DMA controller. Instead they request to become master of the bus, and then they are able to take a DMA address and do transfers between memory and the device (through the bus).
This DMA address refers to a portion of RAM. It is actually a physical address, and before doing anything with it you need to convert it into something your driver can use, such as a kernel virtual address.
I've checked that with this code:
/* Virtual kernel address */
kernel_buff = pci_alloc_consistent(dev, PAGE_SIZE, &dma_addr);
pr_info("Kernel buffer - %12p , Dma_addr - %12p\n", kernel_buff, (void *)dma_addr );
pr_info( "Kernelbuffer - dma_addr - %12p\n", kernel_buff - dma_addr);
strcpy(kernel_buff, "Test dma\n");
/* Test memory */
ptest = (void *)dma_addr;
ptest = phys_to_virt((unsigned long)ptest);
pr_info("Ptest virtual memory(%p) containts - %s\n", ptest, (char *)ptest);
And the output was:
[425971.835669] Kernel buffer - ffff8800ca70a000 , Dma_addr - ca70a000
[425971.835671] Kernelbuffer - dma_addr - ffff880000000000
[425971.835673] Ptest virtual memory(ffff8800ca70a000) containts - Test dma
This is how I concluded that the DMA buffer is a portion of RAM.
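The arithmetic in that log can be reproduced in userspace. This is a sketch, not kernel code: DIRECT_MAP_BASE is simply the difference printed in the log above (kernel_buff - dma_addr), and the two model_ functions are hypothetical stand-ins for phys_to_virt() and its inverse. It assumes, as on the machine in the log, that the DMA address equals the physical address. With an IOMMU in the path that assumption would not hold.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: direct-map base read off the log (kernel_buff - dma_addr). */
#define DIRECT_MAP_BASE 0xffff880000000000ULL

/* Hypothetical stand-in for phys_to_virt() on direct-mapped memory. */
static uint64_t model_phys_to_virt(uint64_t phys)
{
    return phys + DIRECT_MAP_BASE;
}

/* Inverse of the above; only valid for addresses inside the direct map. */
static uint64_t model_virt_to_phys(uint64_t virt)
{
    assert(virt >= DIRECT_MAP_BASE);
    return virt - DIRECT_MAP_BASE;
}
```

Feeding it the DMA address from the log, 0xca70a000, gives back the kernel buffer address 0xffff8800ca70a000. That is why ptest and kernel_buff point at the same bytes.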
My doubt is about how this transfer is made.
I mean, every time I write to this buffer, will the data that the buffer contains be transferred to the device? Or is only the address of the memory location transferred, and the device then reads from this location?
That is my question about DMA.
And about I/O memory maps:
When we request an I/O memory region of the device with, for example:
pci_resource_start
Are we requesting the region of memory where the device's registers are located?
So in this way do we have this memory location in RAM? And can we write/read it like normal memory locations?
And the final point is that we use DMA because I/O memory mapping only allows a few bytes per cycle, since this process involves the CPU, right?
So with DMA we can transfer large amounts of data between memory locations (RAM and the device's bus) without the CPU.
The steps involved in transferring the data to the device can be summarized as follows:
Assume that you have the data in a buffer.
The driver creates a DMA mapping for this buffer (with dma_map_single() for an existing buffer, or by allocating the buffer up front with pci_alloc_consistent() or the newer dma_alloc_coherent()), which returns the corresponding DMA bus address.
This DMA bus address has to be passed to the device. This is done by writing it into the correct DMA registers of the device through writel() (assuming that the device registers are memory mapped).
The device also needs to be told the amount of data being transferred and the like (by writing to the appropriate registers of the device using writel()).
Now issue the command to the device to start the DMA transaction by writing to one of its control registers (again, possibly using writel()).
Once the data transaction is completed, the device issues an interrupt.
In the interrupt handler, the driver may free the buffer that was used for the transaction and may also perform the DMA unmapping.
And there you have it. The data is transferred to the device!
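The sequence above can be walked through with a small userspace simulation. Everything here is hypothetical: the register layout, the CTRL_START and STAT_DONE bits, and fake_writel() are illustrative stand-ins, not a real device or the kernel's writel(). The ram array plays the role of system memory, and fake_dev_run() plays the hardware, which real devices would of course run asynchronously.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RAM_SIZE   256
#define CTRL_START 0x1u
#define STAT_DONE  0x1u

/* Hypothetical device: four registers plus an internal FIFO. */
struct fake_dev {
    uint32_t reg_addr;    /* bus address of the buffer to fetch */
    uint32_t reg_len;     /* number of bytes to fetch */
    uint32_t reg_ctrl;    /* CTRL_START kicks off the transfer */
    uint32_t reg_status;  /* STAT_DONE set on completion */
    uint8_t  fifo[RAM_SIZE];
    int      irq_raised;
};

static uint8_t ram[RAM_SIZE];  /* stands in for system memory */

/* Stand-in for writel(): store a value into a device register. */
static void fake_writel(uint32_t val, uint32_t *reg)
{
    *reg = val;
}

/* Device side: once started, it reads the buffer from RAM by itself. */
static void fake_dev_run(struct fake_dev *d)
{
    if (!(d->reg_ctrl & CTRL_START))
        return;
    assert((uint64_t)d->reg_addr + d->reg_len <= RAM_SIZE);
    memcpy(d->fifo, &ram[d->reg_addr], d->reg_len);
    d->reg_ctrl &= ~CTRL_START;
    d->reg_status |= STAT_DONE;
    d->irq_raised = 1;         /* completion interrupt */
}

/* Driver side: follows the steps in the answer. Returns 0 on success. */
static int driver_send(struct fake_dev *d, uint32_t bus_addr,
                       const void *data, uint32_t len)
{
    if ((uint64_t)bus_addr + len > RAM_SIZE)
        return -1;
    memcpy(&ram[bus_addr], data, len);      /* data sits in a buffer */
    fake_writel(bus_addr, &d->reg_addr);    /* tell the device where */
    fake_writel(len, &d->reg_len);          /* ...and how much */
    fake_writel(CTRL_START, &d->reg_ctrl);  /* start the transaction */
    fake_dev_run(d);                        /* simulated hardware step */
    return d->irq_raised ? 0 : -1;
}
```

Note that the driver only ever writes the address and length to the device. The payload itself is never pushed through the registers: the device fetches it from memory.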
Now, coming to the question about the I/O memory maps:
First of all, when we call pci_resource_start(), we do not "request" the I/O ports. It is just a way of gathering information about the ports. The request is done using pci_request_regions(). To answer your questions specifically:
Are we requesting the region of memory where the device's registers are located?
With this, we are asking the kernel to give us access to the region of memory (memory-mapped ports) where the device's registers are located.
So in this way do we have this memory location in RAM?
No, we do not have this memory location in RAM. It is only memory mapped, which means that the device shares the same address, data and control lines with the RAM, and hence the same instructions that are used to access RAM can also be used to access the device registers.
You've answered your last question yourself. DMA allows large amounts of data to be transferred efficiently. But there are cases where you need to use the memory mapping to transfer data. The best example is already given in the explanation of the DMA transaction process, where you need to pass the address and control information to the device. This can only be done through memory-mapped I/O.
Hope this helps.

Mapping DMA interrupts in the linux kernel

I'm writing a kernel module for a PowerPC SoC which contains a DMA controller. I want to map the DMA interrupts in the Linux kernel. My DMA structure has two interrupts:
struct dma
{
u32 dma1;
u32 dma2;
}*dma;
I have memory-mapped the DMA structure in the kernel. I have used the function irq_of_parse_and_map() to get the virq numbers for the corresponding interrupts.
dma->dma1=irq_of_parse_and_map(ofdev->node,0);
dma->dma2=irq_of_parse_and_map(ofdev->node,1);
But I can't get the virq numbers for the above interrupts. What APIs might be available to access the virq numbers?
A PowerPC-based system uses a Device Tree Blob (DTB), compiled from a Device Tree Source (DTS) file, which is a database that represents the hardware components (processor configuration, buses, peripherals, etc.) on a given board. During boot, the Linux kernel expects certain information about the hardware it runs on. This hardware information is passed from the DTB to the kernel by the bootloader (e.g. u-boot) as per the Open Firmware standard. Once the kernel has the hardware information, it does all the software setup as part of the kernel initialization routine.
From here on, if any kernel software component (e.g. a device driver) needs hardware details, it should get them from the kernel by using a set of Open Firmware standard binary interfaces. Some of them are listed below:
of_register_platform_driver() - Register driver for device
of_unregister_platform_driver() - Unregister driver for device
of_address_to_resource() - Obtain physical address of peripheral
of_get_property() - Find property with a given name for a given node
of_find_node_by_phandle() - Find a node given a phandle
irq_of_parse_and_map() - Parse and map an interrupt into linux virq space
of_irq_to_resource() - Obtain virtual IRQ of peripheral
...
...
Now, coming to the problem raised here: irq_of_parse_and_map() is used to parse and map an interrupt into the Linux virq space. Usually this is done by the interrupt controller driver of the system. Once the interrupt mapping is done, you can get the virq of the interrupt source through the of_irq_to_resource() call. This step is required before registering an interrupt handler for the interrupt source. So try using of_irq_to_resource() instead of irq_of_parse_and_map().
Ref:
Device Tree Blob: http://www.informit.com/articles/article.aspx?p=1647051&seqNum=5
Open Firmware: http://www.openfirmware.org/
OF IRQ Interface: linux-2.6/drivers/of/irq.c
