I am using the Mellanox ConnectX-6 NIC and the configuration is as shown below:
(lspci -vv)
a1:00.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
Subsystem: Mellanox Technologies Device 0028
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 182
NUMA node: 1
Region 0: Memory at a0000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at 9d000000 [disabled] [size=1M]
Capabilities: [60] Express (v2) Endpoint, MSI 00 ...
I am measuring RDMA throughput between two systems by varying the chunk size for a total transfer of 10 GB of data. (Both machines have the same Mellanox NIC.)
The results show that just past a chunk size of 32 MB (i.e. 33, 34, 35 MB, …), throughput drops drastically, by around 50+ Gbps. (Normal speeds for this NIC are 175-185 Gbps, and I get those speeds up to 32 MB; at a 33 MB chunk size I get somewhere between 85 and 120 Gbps.)
So I would like to know whether the 32 MB prefetchable memory region listed in the configuration above has any impact on RDMA throughput.
The following lspci output contains the line Expansion ROM at 42000000 [virtual] [disabled] [size=2M]. What is the meaning of [virtual], and what are its implications? How can I enable the expansion ROM?
# lspci -s d:0.0 -v
0d:00.0 3D controller: Moore Threads Technology Co.,Ltd MTT S2000
Subsystem: Moore Threads Technology Co.,Ltd MTT S2000
Flags: bus master, fast devsel, latency 0, IRQ 35
Memory at 40000000 (32-bit, non-prefetchable) [size=32M]
Memory at 2000000000 (64-bit, prefetchable) [size=16G]
Expansion ROM at 42000000 [virtual] [disabled] [size=2M]
Capabilities: [80] Power Management version 3
Capabilities: [90] MSI: Enable+ Count=1/8 Maskable+ 64bit+
Capabilities: [c0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [160] Power Budgeting <?>
Capabilities: [180] Resizable BAR <?>
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] Dynamic Power Allocation <?>
Capabilities: [300] Secondary PCI Express
Capabilities: [4c0] Virtual Channel
Capabilities: [900] L1 PM Substates
Capabilities: [910] Data Link Feature <?>
Capabilities: [920] Lane Margining at the Receiver <?>
Capabilities: [9c0] Physical Layer 16.0 GT/s <?>
Kernel driver in use: mtgpu
I tried to retrieve the ROM address with setpci, but the result does not look very meaningful (I was expecting something like 42000000):
# setpci -s d:0.0 ROM_ADDRESS
00000002
For some non-[virtual] expansion ROMs, I can enable them using setpci -s <slot> ROM_ADDRESS=1:1, but that failed for this one.
My goal is to read the expansion ROM of the device (using either dd or memtool), after somehow enabling it.
UPDATE: [This seems to have been a hardware error; the same code works fine with a new card.]
I recently bought a very cheap parallel PCI card (link) to try to learn a bit about device drivers in Linux (via LDD3) on my Ubuntu machine.
I've connected LEDs to pins 2-9 and have been able to set/clear the pins using I/O ports. However, I have not been able to raise and handle an interrupt. Any help or pointers would be appreciated.
(Please note that I have pin 9 wired directly to pin 10.)
lspci
07:04.0 Parallel controller: Device 1c00:2170 (rev 0f) (prog-if 01 [BiDir])
Subsystem: Device 1c00:2170
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at ccf0 [size=8]
Region 1: I/O ports at ccf8 [size=8]
After system boot, the I/O registers read:
DATA: 0xff, STATUS: 0x07, CONTROL: 0xc0
I've tried:
outb_p(0x10, BASE+2); // set control bit 4: enable IRQ via nACK
outb_p(0x00, BASE); outb_p(0xFF, BASE); // toggle D7 (pin 9, wired to pin 10 / nACK) to trigger an interrupt
// => DATA: 0xff, STATUS: 0x7b, CONTROL: 0xd0
but the interrupt count for IRQ 11 (as reported by lspci) in the intr line of /proc/stat remains zero.
I have also tried wrapping the above sequence between probe_irq_on()/probe_irq_off() (with an additional outb_p(0x00, BASE+2); udelay(5); in between), which also fails to detect and report any interrupt.
This kernel probing was done after a call to pci_enable_device(dev) in the module code.
Please let me know if any other info is required. Thanks in advance.
The Linux kernel fails to assign memory to the device when the BAR size is set to 1 GB. Device enumeration works fine as long as the BAR size is set to 512 MB, but when it is set to 1 GB the device is still enumerated, yet the memory mappings are not assigned.
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Region 0: Memory at (64-bit, non-prefetchable) [disabled]
Region 2: Memory at (64-bit, non-prefetchable) [disabled]
Region 4: Memory at (64-bit, non-prefetchable) [disabled]
What could be the reason for this? What can be done to debug this?
Enabled kernel debug at boot-up and this is what is logged for that device:
[ 7.087688] pci 0000:8b:00.0: BAR 4: can't assign mem (size 0x40000000)
[ 7.109427] pci 0000:8b:00.0: BAR 0: can't assign mem (size 0x100000)
[ 7.130599] pci 0000:8b:00.0: BAR 2: can't assign mem (size 0x2000)
You can try setpci -s <your PCIe device's bus address> COMMAND=0x02, for example setpci -s 01:00.0 COMMAND=0x02. This will enable memory-mapped transfers for your PCIe device.
you can refer to this link:
https://forums.xilinx.com/t5/PCI-Express/lspci-reports-BAR-0-disabled/td-p/747139
On boot, my machine (running Linux kernel 3.2.38) reports wrong subsystem IDs (sub-device and sub-vendor IDs) for a PCI device. If I then physically unplug and re-plug the PCI device while the system is still up (i.e., hot-plug it), it gets the correct IDs.
Note that the wrong sub-device and sub-vendor IDs it gets are the same as the device's device and vendor IDs (see the first two lines of the lspci output below).
Following is the output of lspci -vvnn before and after hot-plugging the device:
Before hot-plugging:
0b:0f.0 Bridge [0680]: Device [1a88:4d45] (rev 05)
Subsystem: Device [1a88:4d45]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (250ns min, 63750ns max)
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at 2100 [size=256]
Region 1: I/O ports at 2000 [size=256]
Region 2: Memory at 92920000 (32-bit, non-prefetchable) [size=64]
After hot-plugging:
0b:0f.0 Bridge [0680]: Device [1a88:4d45] (rev 05)
Subsystem: Device [007d:5a14]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at 2100 [disabled] [size=256]
Region 1: I/O ports at 2000 [disabled] [size=256]
Region 2: [virtual] Memory at 92920000 (32-bit, non-prefetchable) [size=64]
My question: Is there a way to get the IDs fixed without hot-plugging the device, e.g. by forcing the kernel to re-read the PCI device IDs via a PCI bus rescan/re-enumeration/re-configuration?
Any help would be highly appreciated. Thanks.
PS. Note that the problem isn't really related to the kernel/software, as it exists even if I boot into the UEFI internal shell.
PPS. The PCI device in this case is a MEN F206N, and "my machine" is a MEN F22P.
You can force a rescan of the PCI bus with:
# echo 1 > /sys/bus/pci/rescan
A closer look at your lspci output before and after hot-plugging the device shows more delta than just the sub-device/vendor IDs. I'd be surprised if the device functions as expected after hot-plugging.
Besides, forcing full PCI re-enumeration is not possible, primarily because there may be other devices that have already been enumerated correctly and are functioning. How would you expect re-enumeration to deal with those? (And there are other reasons too.)
Prafulla
I wrote a simple PCIe driver and I want to test whether it works; for example, whether it is possible to write to and read from the memory exposed by the device.
How can I do that?
And what else should be tested?
You need to find the sysfs entry for your device, for example
/sys/devices/pci0000:00/0000:00:07.0/0000:28:00.0
(It can be easier to get there via the symlinks in other subdirectories of /sys, e.g. /sys/class/...)
In this directory there should be (pseudo-)files named resource... which correspond to the various address ranges (Base Address Registers) of your device. I think these can be mmap()ed (but I've never done that).
There's a lot of other stuff you can do with the entries in /sys. See the kernel documentation for more details.
To test the memory you can follow this approach:
1) Run lspci -v.
The output of this command will look something like this:
0002:03:00.1 Ethernet controller: QUALCOMM Corporation Device ABCD (rev 11)
Subsystem: QUALCOMM Corporation Device 8470
Flags: fast devsel, IRQ 110
Memory at 11d00f1008000 (64-bit, prefetchable) [disabled] [size=32K]
Memory at 11d00f0800000 (64-bit, prefetchable) [disabled] [size=8M]
Capabilities: [48] Power Management version 3
Capabilities: [50] Vital Product Data
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [a0] MSI-X: Enable- Count=1 Masked-
Capabilities: [ac] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [13c] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [150] Power Budgeting <?>
Capabilities: [180] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
Capabilities: [250] #12
2) We can see in the output above that the memory regions are disabled. To enable them, execute the following:
setpci -s 0002:03:00.1 COMMAND=0x02
This command enables the memory at address 11d00f1008000.
Now try to read this memory using a processor read; it should be accessible.