I'm looking for a way to send some data from my software app, written in C, to the AXI-Stream interface of a Zynq. Something like:
open(/dev/axistream);
send_data(data);
I'm running Linux on the Arm part and now I want to connect it to the programmable logic part.
On a Zynq device, communication between the Cortex-A9 processor and the FPGA uses the AXI protocol. Three types of ports can be used to communicate between the FPGA and the CPU (see the Zynq TRM):
General Purpose AXI ports: 2x master (from CPU to FPGA) and 2x slave (from FPGA to CPU). These ports are connected to the central interconnect of the processing system and can be used to transfer data to/from DDR memory or on-chip memory (OCM).
High Performance AXI ports: 4x slave (from FPGA to CPU), providing high-bandwidth access to DDR or OCM.
ACP (Accelerator Coherency Port): slave port (from FPGA to CPU), a high-throughput port connected directly to the snoop control unit (SCU). The SCU maintains cache coherency, which removes the need for cache flushes/invalidates.
From your question, I understand that in your case the CPU is the master of the communication, so you will need to use the General Purpose AXI master ports. You cannot connect an AXI4-Stream interface directly to the AXI interconnect; you need to convert between AXI4-Stream and memory-mapped AXI. Depending on your performance needs, an AXI DMA IP core might be a good solution.
If you want to communicate from the software side using "open(/dev/)", you will need a Linux device driver. If you are using the DMA core, your communication will typically look like this:
You configure the DMA core to fetch data from a certain memory address.
You start the DMA core.
The DMA core fetches the data and feeds it to the AXI4-Stream interface of your IP block.
Your IP block performs some operation on the data and sends it back to memory (using DMA) or does something else with it (sends it to an external interface, ...).
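The first two steps can be sketched in C as follows. This is a minimal sketch, not a definitive implementation: it assumes the Xilinx AXI DMA core in direct register mode (no scatter-gather) and only touches the memory-to-stream (MM2S) channel. The function names are made up for illustration, `base` is assumed to point at the core's already mapped register block, and `src_phys` is assumed to be the physical address of a buffer the core can reach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the AXI DMA MM2S registers in direct register mode. */
#define MM2S_DMACR   0x00u  /* control register */
#define MM2S_DMASR   0x04u  /* status register */
#define MM2S_SA      0x18u  /* source address, lower 32 bits */
#define MM2S_LENGTH  0x28u  /* length in bytes; writing it starts the transfer */

#define DMACR_RS     (1u << 0)  /* run/stop bit */
#define DMASR_IDLE   (1u << 1)  /* set once the channel has finished */

static void dma_reg_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
    base[off / 4] = val;
}

static uint32_t dma_reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4];
}

/* Configure the source address and start the transfer (hypothetical helper). */
static void mm2s_start(volatile uint32_t *base, uint32_t src_phys, uint32_t len)
{
    assert(base != NULL && len > 0);
    /* Set the run bit first, then program the address. */
    dma_reg_write(base, MM2S_DMACR, dma_reg_read(base, MM2S_DMACR) | DMACR_RS);
    dma_reg_write(base, MM2S_SA, src_phys);
    /* The write to LENGTH is what actually kicks off the transfer. */
    dma_reg_write(base, MM2S_LENGTH, len);
}

/* Returns 1 once the channel reports idle, 0 while it is still busy. */
static int mm2s_done(volatile uint32_t *base)
{
    return (dma_reg_read(base, MM2S_DMASR) & DMASR_IDLE) != 0;
}
```

A caller would invoke `mm2s_start()` once and then poll `mm2s_done()` (or wait for the core's interrupt in a real driver). Cache maintenance for the source buffer is deliberately left out of this sketch.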
The register set of your DMA core will be memory mapped and accessible through your own Linux device driver. For debugging purposes, I would suggest using mmap to access the registers and quickly validate the operation of your hardware. Once you go for the Linux kernel device driver, I would suggest reading this book: Linux Device Drivers, 3rd edition.
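For the mmap-based debugging approach, a minimal sketch could look like the code below. The helper names are hypothetical. In practice `path` would be "/dev/mem" and `phys` the base address assigned to the core in your design (for example 0x40400000, which is only an assumption here); the path is a parameter so the same code can be tried against an ordinary file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map `len` bytes at byte offset `phys` of `path` (normally "/dev/mem").
 * Returns a pointer to the first 32-bit register, or NULL on failure.
 * `phys` has to be page-aligned because mmap() requires it. */
static volatile uint32_t *map_regs(const char *path, off_t phys, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys);
    close(fd);  /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Release a mapping obtained from map_regs(). */
static void unmap_regs(volatile uint32_t *regs, size_t len)
{
    munmap((void *)regs, len);
}
```

Opening "/dev/mem" normally requires root, and this bypasses the kernel entirely, so it is suited to hardware bring-up rather than to a finished product.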
The best choice for efficient data transfer is DMA-enabled PS-PL communication. After implementing a DMA controller inside the PL, such as AXI CDMA, you can connect it to an AXI4-Stream IP and then to your desired IP core.
If you're not going to set up a general framework, you can access the DMA-enabled part of DDR memory using the mmap() system call.
Here is a template to transfer data from user space to the IP core in which a loop-back is implemented.
https://github.com/h-nasiri/Zynq-Linux-DMA
The AXI CDMA uses a processing system HP slave port to get read/write access to DDR system memory. There is also a Linux application that uses mmap() to initialize the DMA core and then do the data transfer.
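As an illustration of what such an application does once the registers are mapped, here is a minimal sketch of a single memory-to-memory copy with the AXI CDMA in simple (non scatter-gather) mode. It is not taken from the linked repository: the function name and the bounded polling loop are assumptions made for this example, and `base` is assumed to point at the mapped CDMA register block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the AXI CDMA registers used in simple mode. */
#define CDMA_CR    0x00u  /* control */
#define CDMA_SR    0x04u  /* status */
#define CDMA_SA    0x18u  /* source address, lower 32 bits */
#define CDMA_DA    0x20u  /* destination address, lower 32 bits */
#define CDMA_BTT   0x28u  /* bytes to transfer; writing it starts the copy */

#define CDMA_SR_IDLE (1u << 1)

/* Copy `nbytes` from physical address `src` to `dst`.
 * Returns 0 when the core reports idle again, -1 if it was busy at the
 * start or did not finish within `max_polls` status reads. */
static int cdma_copy(volatile uint32_t *base, uint32_t src, uint32_t dst,
                     uint32_t nbytes, unsigned max_polls)
{
    assert(base != NULL && nbytes > 0);

    if (!(base[CDMA_SR / 4] & CDMA_SR_IDLE))
        return -1;                  /* a previous transfer is still running */

    base[CDMA_SA / 4] = src;
    base[CDMA_DA / 4] = dst;
    base[CDMA_BTT / 4] = nbytes;    /* this write triggers the transfer */

    for (unsigned i = 0; i < max_polls; i++) {
        if (base[CDMA_SR / 4] & CDMA_SR_IDLE)
            return 0;
    }
    return -1;                      /* timed out */
}
```

Error bits in the status register are not checked here; a real application would inspect them before trusting the destination buffer.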
You can easily add an AXI4-Stream interconnect to the AXI CDMA and connect
If I understand correctly, you want to DMA data from the PS to the PL using the DMA engine. In that case, you would need to write a Linux driver that either uses the AXI DMA engine driver, or configure the DMA engine from user space.
Is that what you are looking for?
Related
I am studying operating systems, and came across device controllers.
I gathered that a device controller is hardware whereas a device driver is software.
I also know that an HDD and an SSD both have a small PCB built into them, and I assume those PCBs are the device controllers.
Now what I want to know is if there is another device controller on the PC/motherboard side of the bus/cable connecting the HDD/SSD to the OS?
Is the configuration: OS >> Device Driver >> Bus >> Device Controller >> HDD/SSD
Or is it: OS >> Device Driver >> Device Controller >> Bus >> Device Controller >> HDD/SSD
Or is it some other configuration?
Sites I visited for answers:
Tutorialspoint
JavaPoint
Idc online
Quora
Most hard disks on desktops are SATA or NVMe. eMMC is popular for smartphones, but some might use something else. These are hardware interface standards that describe how to interact electrically with those disks. They tell you what voltage to apply, at what frequency and for how long (a signal), to a certain pin (a bus line) to make the device behave or react in a certain way.
Most computers are divided into a few external chips. On desktops, these are mostly SATA, NVMe, DRAM, USB, audio output, the network card and the graphics card. Even though there are only a few chips, the CPU would be very expensive if it had to support all those hardware interface standards on the same silicon. Instead, the CPU implements PCI/PCI-e as a general interface to interact with all those chips using memory-mapped registers. Each of these devices has a PCI-e controller sitting between the device and the CPU. In the same order as above, you have AHCI, the NVMe controller, DRAM (whose controller is not PCI and sits in the CPU), xHCI (almost everywhere) and Intel HDA (as an example). Network cards are PCI-e devices, and there isn't really a controller outside the card. Graphics cards are also self-standing PCI-e devices.
So, the OS detects the registers of those devices, which are mapped into the address space. When the OS writes to those locations, it writes the registers of the devices. PCI-e devices can also read/write DRAM directly, but this is managed by the CPU in its general implementation of the PCI-e standard, most likely by doing some bus arbitration. The CPU really doesn't care which device it is writing to. It knows that there is a PCI register there and the OS instructs it to write something, so it does. It just happens that this device is an implementation of a standard, and that the OS developers read the standard, so they write the proper values to those registers and the proper data structures to DRAM to make sure that the device knows what to do.
Drivers implement the software interface standard of those controllers. The drivers are the ones instructing the CPU which values to write, and writing the proper data structures to DRAM to give commands to the controllers. The user thread simply places the syscall number in a conventional register determined by the OS developers and executes an instruction that jumps into the kernel at a specific address, which the kernel chose by writing a register at boot. Once there, the kernel looks at the register for the number and determines which driver to call based on the operation.
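On Linux, the user-space half of that mechanism can be seen through the C library's generic syscall() wrapper, which takes the syscall number explicitly, loads it into the register the kernel expects and executes the trap instruction. The sketch below is only an illustration; the two wrapper names are made up, and ordinary programs would simply call getpid() and write().

```c
#define _DEFAULT_SOURCE  /* makes glibc declare syscall() */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our process ID by syscall number. */
static pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Write `len` bytes to `fd` by syscall number; the kernel routes the request
 * to whatever driver or file system backs the descriptor. */
static long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}
```

Calling `raw_write(1, "hello\n", 6)` goes through the same kernel entry point as `write(1, "hello\n", 6)`; only the way the syscall number is supplied differs.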
On Linux and elsewhere, this is done with files. You call syscalls on files, and the OS has a driver attached to the file. These are called virtual files. Many transfer mechanisms resemble the pattern of reading and writing files, so Linux uses that to build a general driver model in which the kernel doesn't even need to understand the driver. The driver just says: create a file for me there that's not really on the hard disk, and if someone opens it and calls an operation on it, call this function in my driver. From there, the driver can do whatever it wants because it is in kernel mode. It just creates the proper data structures in DRAM and writes the registers of the device it drives to make it do something.
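The "call this function in my driver" idea boils down to a table of function pointers. The following is a toy user-space model of that dispatch, not kernel code: every name in it is invented for illustration, and the real kernel structure has many more callbacks with different signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the table of callbacks a driver registers for its file. */
struct toy_file_ops {
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

/* Pretend device state that only the "driver" knows how to handle. */
static char toy_dev_data[16];
static size_t toy_dev_len;

static long toy_dev_write(const char *buf, size_t len)
{
    if (len > sizeof toy_dev_data)
        len = sizeof toy_dev_data;      /* accept only what fits */
    memcpy(toy_dev_data, buf, len);
    toy_dev_len = len;
    return (long)len;
}

static long toy_dev_read(char *buf, size_t len)
{
    if (len > toy_dev_len)
        len = toy_dev_len;              /* return only what was stored */
    memcpy(buf, toy_dev_data, len);
    return (long)len;
}

static const struct toy_file_ops toy_dev_ops = {
    .read = toy_dev_read,
    .write = toy_dev_write,
};

/* "Kernel" side: forwards the operation without knowing what the driver does. */
static long toy_vfs_write(const struct toy_file_ops *ops, const char *buf, size_t len)
{
    assert(ops != NULL);
    return ops->write ? ops->write(buf, len) : -1;
}

static long toy_vfs_read(const struct toy_file_ops *ops, char *buf, size_t len)
{
    assert(ops != NULL);
    return ops->read ? ops->read(buf, len) : -1;
}
```

The generic layer (`toy_vfs_*`) never inspects the device state; it only looks up the callback, which is the property the paragraph above describes.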
In my opinion, SPI and DMA are both controllers.
SPI is a communication tool, and DMA can transfer data without the CPU.
System APIs such as spi_sync() or spi_async() are controlled by the CPU.
So what is the meaning of SPI with DMA? Does it mean DMA can control the SPI API without the CPU? Or does the SPI control use the CPU while the data is transferred by DMA directly?
SPI is not a tool, it is a communication protocol. Typical microcontrollers have that protocol implemented in hardware, which can be accessed by reading/writing dedicated registers in the address space of the given controller.
DMA on microcontrollers is typically designed to move the content of registers to memory and vice versa. DMA can sometimes be configured to perform a specific number of reads/writes, to increment or decrement the source and target memory addresses, and so on.
If you have a microcontroller that has SPI with DMA support, it typically means that some data in memory can be transferred to the SPI unit to send multiple data bytes without intervention of the CPU core itself, or that a number of data bytes can be read from SPI into memory automatically without occupying the CPU core.
How such DMA SPI transfers are configured is described in the data sheets of the controllers. There is a very wide range of types, so no specific information can be given here without knowing the microcontroller type.
The Linux APIs for dealing with SPI abstract the access to DMA and SPI by using the microcontroller-specific implementations in the drivers.
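To illustrate that abstraction from the user-space side, here is a minimal sketch using the spidev interface. This is an assumption about how the bus is accessed (the question mentions the in-kernel spi_sync()/spi_async() calls instead), and the function name is made up. The caller only describes the buffers and length; whether the controller driver moves the bytes by DMA or by programmed I/O is not visible at this level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

/* Perform one full-duplex SPI transfer on an open spidev descriptor
 * (for example one obtained by opening "/dev/spidev0.0").
 * Returns a negative value on error. */
static int spi_transfer(int fd, const uint8_t *tx, uint8_t *rx,
                        size_t len, uint32_t speed_hz)
{
    struct spi_ioc_transfer tr;

    assert(tx != NULL || rx != NULL);

    memset(&tr, 0, sizeof tr);
    tr.tx_buf = (uintptr_t)tx;      /* bytes to shift out */
    tr.rx_buf = (uintptr_t)rx;      /* where received bytes are stored */
    tr.len = (uint32_t)len;
    tr.speed_hz = speed_hz;
    tr.bits_per_word = 8;

    /* Hand the whole transfer to the kernel in a single request. */
    return ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
}
```

The device node name and clock rate depend on the board; both are placeholders here.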
It is quite unclear whether you want to use the API to access your SPI or you want to implement a device driver to make the Linux API work on your specific controller.
It is not possible to give a general introduction to writing a kernel driver here, or to clarify your data sheets register by register. If you need further information, you have to make your question much more specific!
I am trying to use DMA to program an FPGA connected to an OMAP-L138's SPI bus, but without success.
Currently, I am using the stock davinci-spi driver (drivers/spi/spi-davinci.c) that comes with Linux 3.19. FPGA configuration is successful (without DMA enabled), but it is very slow. I am using a device tree to configure the SPI interface.
I would like to use DMA to improve performance, however from looking at the spi-davinci.c source code and its device tree bindings, the driver does not appear to support DMA when configured with device tree. Is my understanding correct? If so, are there any plans to support DMA transfers using davinci's SPI driver when also using device tree?
Here are a few guidelines to achieve your goal:
First, check if the SPI has its own DMA engine. If it doesn't, perhaps there's a generic DMA controller on board. You can check this by looking at the SPI datasheet and at the board interconnect schematics.
If neither is the case, then you can't use DMA with the SPI.
If the SPI has its own DMA, you'll need to write a driver for that.
If there's a DMA controller on board, it's probably used by other components; search for the dmaengine driver for that particular device. Then you'll need to create a DMA client for that particular DMA engine.
Please read:
DMA Provider
DMA Client
Good luck
In the book LDD3, if a driver wants to control the pins of the CPU, it should call the request_region() function to declare its usage of the ports.
However, when I set out to implement a simple driver module on my Raspberry Pi, I found an example in which the ports are requested with the gpio_request() function.
Why and when do we need to use gpio_request() instead of request_region()? And what are the different purposes of these two functions?
BTW: I searched LDD3 page by page but couldn't find any clues about GPIO. Why is there no introduction to GPIO? Is it because of the 2.6 kernel version?
In the book LDD3, if a driver wants to control the pins of the CPU, it should call the request_region() function to declare its usage of the ports.
First, the word "port" is ambiguous and requires context. Port can refer to a physical connector (e.g. USB port), or a logical connection (e.g. TCP port).
Your understanding of request_region() is flawed. That routine is for management of I/O address space. Your question is tagged with raspberry-pi, which uses an ARM processor and has no I/O address space to manage. ARM processors use memory-mapped device registers. You would use request_mem_region() in a device driver for the memory addresses of that peripheral's register block.
Each GPIO is controlled by a bit position in one or more control registers. Those registers would be handled by an overall GPIO subsystem. (There's also a lower-layer (closer to the HW) pin-control driver for multiplexed pins, i.e. pins that can be assigned to a peripheral device or used as GPIO.)
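To make "a bit position in one or more control registers" concrete, here is a sketch based on the BCM2835 GPIO register layout (the SoC of the original Raspberry Pi, which is an assumption about the board in question). The helper names are invented, and `gpio` is assumed to point at the already mapped GPIO register block; inside a driver that mapping belongs to the GPIO subsystem, not to individual client drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BCM2835 GPIO registers as 32-bit word indices (byte offset / 4). */
#define BCM_GPFSEL0  0   /* 0x00: function select, 3 bits per pin, 10 pins per register */
#define BCM_GPSET0   7   /* 0x1C: write 1 to drive a pin high */
#define BCM_GPCLR0   10  /* 0x28: write 1 to drive a pin low */

#define BCM_NUM_PINS 54

/* Configure `pin` as an output by rewriting its 3-bit function field. */
static void bcm_gpio_set_output(volatile uint32_t *gpio, unsigned pin)
{
    assert(gpio != NULL && pin < BCM_NUM_PINS);

    unsigned reg = BCM_GPFSEL0 + pin / 10;
    unsigned shift = (pin % 10) * 3;

    /* Clear the field, then set it to 001 (output). */
    gpio[reg] = (gpio[reg] & ~(7u << shift)) | (1u << shift);
}

/* Drive `pin` high or low through the set/clear register pair. */
static void bcm_gpio_write(volatile uint32_t *gpio, unsigned pin, int high)
{
    assert(gpio != NULL && pin < BCM_NUM_PINS);

    unsigned reg = (high ? BCM_GPSET0 : BCM_GPCLR0) + pin / 32;

    /* Only the written 1 bit has an effect, so other pins are untouched. */
    gpio[reg] = 1u << (pin % 32);
}
```

Pin 17, for example, lives in bits 21 to 23 of the second function select register and in bit 17 of the first set and clear registers.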
The driver for the GPIO (or pin-control) subsystem should perform a request_mem_region() for the memory addresses of the SoC's GPIO control registers. A gpio_request() would be management of an individual pin that is subordinate to management of the registers.
Note that use of request_mem_region() and gpio_request() are not mutually exclusive in a device driver. For instance the driver for a USB controller would request_mem_region() the memory addresses for its control registers. It may also have to gpio_request() for pin(s) that control the power to the USB connector(s) (assuming that's how the power is controlled with logic external to the controller).
Why is there no introduction to GPIO? Is it because of the 2.6 kernel version?
Conventions for using GPIO in Linux appeared in Documentation/gpio.txt in 2007 with version 2.6.22. Generic (i.e. standardized rather than platform-specific) GPIO support appeared in the Linux kernel several years later with version 2.6.3x(?). Prior to that (and even after), each platform (e.g. SoC manufacturer) had its own set of routines for accessing (and maybe managing) GPIOs.
LDD3 claims to be current as of the 2.6.10 kernel. Also that book may be x86-centric (as Linux has x86 origins), and x86 processors typically do not have GPIOs.
I need to transfer video data to and from an FPGA device over PCI in a Linux environment. I'm using a third-party PCI master core on the FPGA. So far, I've implemented a simple DMA controller on the FPGA to transfer data from the FPGA to the CPU, using consecutive PCI write bursts.
Next, I need to transfer video data from the CPU to the FPGA. What is the best way to go about this?
Should I implement a module on the FPGA which performs a whole bunch of burst reads over PCI? Or is there a way to get the CPU to efficiently write data into the FPGA's memory using PCI write bursts?
My bandwidth requirements are around 30 MB/s in both directions.
Thanks.
You could do posted writes from the CPU, as video card drivers do, but you'll need some driver magic such as setting MTRRs (which means you might have some architectural dependency). If you want to be safe, DMA reads from the FPGA are a better way to go. 30 MB/s isn't much.
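For comparison, the CPU-driven variant amounts to a plain store loop into the mapped BAR. The sketch below is only an illustration with a made-up function name; `bar` is assumed to point at an already mapped memory BAR of the FPGA (on Linux such a mapping can be obtained, for instance, by mmap()ing the device's sysfs resource file).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `nwords` 32-bit words from `src` into the device window at `bar`.
 * Every store becomes a posted PCI memory write. Unless the mapping is
 * write-combining, each one goes out as its own single-word transaction
 * instead of being merged into a burst, which is where the MTRR setup
 * mentioned above comes in. */
static void bar_write_words(volatile uint32_t *bar, const uint32_t *src, size_t nwords)
{
    assert(bar != NULL && src != NULL);

    for (size_t i = 0; i < nwords; i++)
        bar[i] = src[i];
}
```

The CPU is busy for the whole copy in this scheme, whereas a bus-mastering read by the FPGA leaves it free.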
It sounds to me like the FPGA should master both reads and writes; otherwise you would hog the host CPU. That's a classic task for a DMA engine (and you cannot guarantee that one exists on every host).