What does `[virtual]` mean in `lspci` output for Expansion ROM? - linux

The following lspci output contains the line Expansion ROM at 42000000 [virtual] [disabled] [size=2M]. What are the meaning and implications of [virtual]? How can I enable the expansion ROM?
# lspci -s d:0.0 -v
0d:00.0 3D controller: Moore Threads Technology Co.,Ltd MTT S2000
Subsystem: Moore Threads Technology Co.,Ltd MTT S2000
Flags: bus master, fast devsel, latency 0, IRQ 35
Memory at 40000000 (32-bit, non-prefetchable) [size=32M]
Memory at 2000000000 (64-bit, prefetchable) [size=16G]
Expansion ROM at 42000000 [virtual] [disabled] [size=2M]
Capabilities: [80] Power Management version 3
Capabilities: [90] MSI: Enable+ Count=1/8 Maskable+ 64bit+
Capabilities: [c0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [160] Power Budgeting <?>
Capabilities: [180] Resizable BAR <?>
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] Dynamic Power Allocation <?>
Capabilities: [300] Secondary PCI Express
Capabilities: [4c0] Virtual Channel
Capabilities: [900] L1 PM Substates
Capabilities: [910] Data Link Feature <?>
Capabilities: [920] Lane Margining at the Receiver <?>
Capabilities: [9c0] Physical Layer 16.0 GT/s <?>
Kernel driver in use: mtgpu
I tried to retrieve the ROM address with setpci, but the result does not seem very meaningful (I was expecting something like 42000000):
# setpci -s d:0.0 ROM_ADDRESS
00000002
I can enable some non-[virtual] Expansion ROMs by using setpci -s <slot> ROM_ADDRESS=1:1, but that failed for this one.
My goal is to read the device's expansion ROM (using either dd or memtool), after enabling the expansion ROM somehow.
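For reference, a common way to dump an Expansion ROM on Linux is the sysfs rom attribute rather than enabling the ROM BAR by hand. A minimal sketch, assuming the device sits at 0000:0d:00.0 as in the output above (whether this returns data for a ROM marked [virtual] depends on the device and driver):
cd /sys/bus/pci/devices/0000:0d:00.0
echo 1 > rom                              # ask the kernel to allow reading the ROM
dd if=rom of=/tmp/expansion.rom bs=64k    # copy the ROM image out
echo 0 > rom                              # disable access again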

Related

Is RDMA/DMA performance impacted by 'Region 0' size

I am using the Mellanox ConnectX-6 NIC and the configuration is as shown below:
(lspci -vv)
a1:00.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
Subsystem: Mellanox Technologies Device 0028
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 182
NUMA node: 1
Region 0: Memory at a0000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at 9d000000 [disabled] [size=1M]
Capabilities: [60] Express (v2) Endpoint, MSI 00 ...
I am measuring RDMA throughput between two systems by varying the chunk size for a total transfer of 10 GB of data. (Both machines have the same Mellanox NIC.)
The results show that just past a chunk size of 32 MB (i.e. 33, 34, 35 MB, …), the throughput drops drastically, by around 50+ Gbps. (Normal speed for this NIC is 175-185 Gbps, so up to 32 MB I get those speeds, but at a 33 MB chunk size I get somewhere between 85 and 120 Gbps.)
So I would like to know whether the 32 MB of prefetchable memory (Region 0) listed in the above configuration has any impact on RDMA throughput.
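For reference, the Region 0 size being asked about can be inspected directly; a minimal sketch, assuming the NIC is at a1:00.0 as in the output above:
lspci -s a1:00.0 -vv | grep -i 'Region 0'    # BAR size and flags as seen by the kernel
setpci -s a1:00.0 BASE_ADDRESS_0             # raw (lower 32-bit) BAR register from config space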

Linux kernel fails to assign memory to the PCIe device when the BAR size is set to 1GB

The Linux kernel fails to assign memory to the device when the BAR size is set to 1 GB. Device enumeration works fine as long as the BAR memory size is set to 512 MB; but when it is set to 1 GB, the device is enumerated and yet the memory mappings are not assigned.
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Region 0: Memory at (64-bit, non-prefetchable) [disabled]
Region 2: Memory at (64-bit, non-prefetchable) [disabled]
Region 4: Memory at (64-bit, non-prefetchable) [disabled]
What could be the reason for this? What can be done to debug this?
I enabled kernel debugging at boot, and this is what is logged for that device:
[ 7.087688] pci 0000:8b:00.0: BAR 4: can't assign mem (size 0x40000000)
[ 7.109427] pci 0000:8b:00.0: BAR 0: can't assign mem (size 0x100000)
[ 7.130599] pci 0000:8b:00.0: BAR 2: can't assign mem (size 0x2000)
You can try setpci -s <your PCIe device bus number> COMMAND=0x02, for example setpci -s 01:00.0 COMMAND=0x02; this will enable memory-mapped transfers for your PCIe device.
You can refer to this link:
https://forums.xilinx.com/t5/PCI-Express/lspci-reports-BAR-0-disabled/td-p/747139
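As a concrete sketch of that suggestion (8b:00.0 is the address from the dmesg log above; the value:mask form touches only the Memory Space Enable bit instead of overwriting the whole Command register):
setpci -s 8b:00.0 COMMAND             # read the current Command register
setpci -s 8b:00.0 COMMAND=0x02:0x02   # set bit 1 (Memory Space Enable), leave the other bits alone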

What's the difference between free's result and dmidecode's result in Linux?

I use two tools to collect my memory info, dmidecode and free, and they show different results: dmidecode says my memory is 4096 MB, while free -m shows 3829. Why are they different?
Handle 0x0083, DMI type 17, 27 bytes
Memory Device
Array Handle: 0x0082
Error Information Handle: No Error
Total Width: 32 bits
Data Width: 32 bits
Size: 4096 MB
Form Factor: DIMM
Set: None
Locator: RAM slot #0
Bank Locator: RAM slot #0
Type: DRAM
Type Detail: EDO
Speed: Unknown
Manufacturer: Not Specified
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
free -m output:
total used free shared buffers cached
Mem: 3829 3566 262 0 495 1779
-/+ buffers/cache: 1291 2537
Swap: 8191 0 8191
dmidecode uses BIOS facilities (SMBIOS in particular) to get the amount of memory physically present in the system. At boot, the BIOS determines this size from the SPD chips on the DIMM modules.
But during boot some memory is reserved by the BIOS itself (e.g. as video RAM for an embedded video card), so the amount of memory presented to the OS is a bit smaller; that is what you see in the free output.
Usually you can check this in the dmesg output:
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[ 0.000000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000bd92a000 (usable)
[ 0.000000] BIOS-e820: 00000000bd92a000 - 00000000bd94c000 (ACPI NVS)
...
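A quick way to compare the two views side by side (MemTotal is what the kernel is left with after the firmware reservations shown in the e820 map):
dmidecode -t memory | grep 'Size:'    # physical modules as reported by SMBIOS
grep MemTotal /proc/meminfo           # memory actually handed to the kernel
free -m                               # the same total, in MB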

Linux: How to test a PCIe driver?

I wrote a simple PCIe driver and I want to test whether it works, for example whether it is possible to write to and read from the memory used by the device.
How can I do that?
And what else should be tested?
You need to find the sysfs entry for your device, for example
/sys/devices/pci0000:00/0000:00:07.0/0000:28:00.0
(It can be easier to get there via the symlinks in other subdirectories of /sys, e.g. /sys/class/...)
In this directory there should be (pseudo-)files named resource... which correspond to the various address ranges (Base Address Registers) of your device. I think these can be mmap()ed (but I've never done that).
There's a lot of other stuff you can do with the entries in /sys. See the kernel documentation for more details.
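For example, to locate the device directory and list the BAR-backed entries (using the 0000:28:00.0 address from the path above; the plain resource file has one start/end/flags line per region):
ls -l /sys/bus/pci/devices/0000:28:00.0/resource*    # resource0, resource1, ... correspond to the BARs
cat /sys/bus/pci/devices/0000:28:00.0/resource       # start address, end address and flags of each region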
To test the memory you can follow this approach:
1) Do lspci -v
Output of this command will be something like this
0002:03:00.1 Ethernet controller: QUALCOMM Corporation Device ABCD (rev 11)
Subsystem: QUALCOMM Corporation Device 8470
Flags: fast devsel, IRQ 110
Memory at 11d00f1008000 (64-bit, prefetchable) [disabled] [size=32K]
Memory at 11d00f0800000 (64-bit, prefetchable) [disabled] [size=8M]
Capabilities: [48] Power Management version 3
Capabilities: [50] Vital Product Data
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [a0] MSI-X: Enable- Count=1 Masked-
Capabilities: [ac] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [13c] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [150] Power Budgeting <?>
Capabilities: [180] Vendor Specific Information: ID=0000 Rev=0 Len=028 <?>
Capabilities: [250] #12
2) We can see in the above output that the memory is disabled. To enable it we can execute the following:
setpci -s 0002:03:00.1 COMMAND=0x02
This command will enable the memory at the address 11d00f1008000.
Now, try to read this memory using your processor's read mechanism; it should be accessible.
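As a hedged sketch of that last step, a single word can be read at the physical BAR address with busybox devmem (devmem2 or memtool work similarly; the address is taken from the lspci output above):
busybox devmem 0x11d00f1008000 32    # read one 32-bit word at the start of the now-enabled BAR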

Discovering linux disk configurations from the command line

How can I discover whether a remote machine is configured with hardware or software RAID, or neither? All I know is that I have 256 GB at present. I need to order more space, but before I can, I need to know how the drives are configured.
df lists the drive as:
/dev/sdb1 287826944 273086548 119644 100% /mnt/db
and hdparm:
/dev/sdb:
HDIO_GET_MULTCOUNT failed: Invalid argument
readonly = 0 (off)
readahead = 256 (on)
geometry = 36404/255/63, sectors = 299439751168, start = 0
What else should I run and what should I look for?
Software RAID would not appear as /dev/sdb but as /dev/md0. Nor is it LVM.
So it's either real hardware RAID, or a raw disk.
lspci might show you any RAID controllers plugged in.
dmesg | grep sdb might tell you some more about the disk.
sdparm /dev/sdb might tell you something? Particularly if it really is a SCSI disk.
To check for software RAID:
cat /proc/mdstat
On my box, this shows:
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
96256 blocks [2/2] [UU]
md1 : active raid1 sda2[0] sdb2[1]
488287552 blocks [2/2] [UU]
unused devices: <none>
You get the names of all software RAID arrays, the RAID level for each, the partitions that are part of each RAID array, and the status of the arrays.
dmesg might help.
On a system where we do have software raid we see things like:
SCSI device sda: 143374744 512-byte hdwr sectors (73408 MB)
sda: Write Protect is off
sda: Mode Sense: ab 00 10 08
SCSI device sda: write cache: enabled, read cache: enabled, supports DPO and FUA
SCSI device sda: 143374744 512-byte hdwr sectors (73408 MB)
sda: Write Protect is off
sda: Mode Sense: ab 00 10 08
SCSI device sda: write cache: enabled, read cache: enabled, supports DPO and FUA
sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143374744 512-byte hdwr sectors (73408 MB)
sdb: Write Protect is off
sdb: Mode Sense: ab 00 10 08
SCSI device sdb: write cache: enabled, read cache: enabled, supports DPO and FUA
SCSI device sdb: 143374744 512-byte hdwr sectors (73408 MB)
sdb: Write Protect is off
sdb: Mode Sense: ab 00 10 08
SCSI device sdb: write cache: enabled, read cache: enabled, supports DPO and FUA
sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
A bit later we see:
md: md0 stopped.
md: bind
md: bind
md: raid0 personality registered for level 0
md0: setting max_sectors to 512, segment boundary to 131071
raid0: looking at sda2
raid0: comparing sda2(63296000) with sda2(63296000)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sdb2
raid0: comparing sdb2(63296000) with sda2(63296000)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 126592000 blocks.
raid0 : conf->hash_spacing is 126592000 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 4 bytes for hash.
and a df shows:
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 7.8G 3.3G 4.2G 45% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
/dev/md0 117G 77G 35G 69% /scratch
So part of sda and all of sdb have been bound as one raid volume.
What you have could be one disk, or it could be hardware raid. dmesg should give you some clues.
It is always possible that it is a hardware RAID controller that just looks like a single SATA (or SCSI) drive. For example, on our systems with Fibre Channel RAID arrays, Linux only sees a single device, and you control the RAID portion and disk assignment by connecting to the Fibre Channel array directly.
You can try mount -v or you can look in /sys/ or /dev/ for hints. dmesg might reveal information about the drivers used, and lspci could list any add-in hw raid cards, but in general there is no generic method you can rely on to find out the exact hardware & driver setup.
You might try using mdadm with more explanation here. If the 'mount' command does not show /dev/md*, chances are you are not using (or seeing) the software raid.
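If an md device does exist, mdadm can confirm its layout; a minimal example, assuming the array is /dev/md0 and /dev/sdb1 is a suspected member:
mdadm --detail /dev/md0      # RAID level, member disks and array state
mdadm --examine /dev/sdb1    # whether this partition carries an md superblock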
This is really a system administration question rather than a programming one, so I'll tag it as such.
