I'm working with version 2.6.35.9 of the Linux kernel and am trying to disable Command Completion Coalescing (CCC).
The output of lspci is as shown below:
00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HH (ICH8DH) LPC Interface Controller (rev 02)
00:1f.2 RAID bus controller: Intel Corporation 82801 SATA RAID Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation G72 [GeForce 7300 LE] (rev a1)
04:03.0 Mass storage controller: Promise Technology, Inc. PDC20268 (Ultra100 TX2) (rev 02)
I have Native Command Queuing enabled on my drives.
I was looking at the Serial ATA AHCI 1.3 Specification and found the following on page 115:
The CCC feature is only in use when CCC_CTL.EN is set to ‘1’. If CCC_CTL.EN is set to ‘0’, no CCC
interrupts shall be generated.
Next, I looked at the relevant code (namely, the AHCI files) for this kernel version but wasn't able to make any progress. I found the enum constant HOST_CAP_CCC = (1 << 7) in drivers/ata/ahci.h, but I'm not sure how it should be modified to disable command coalescing.
Can someone please assist me in identifying how CCC can be disabled? Thank you!
In response to gby's comment:
I conducted an experiment where I issued requests of size 64KB from my driver code. 64KB corresponds to 128 sectors (each sector = 512 bytes).
When I look at the response timestamp differences, here is what I find:
Timestamp at    Timestamp at    Difference (microseconds)
Sector 255   -  Sector 127   =  510
Sector 383   -  Sector 255   =  3068
Sector 511   -  Sector 383   =  22
Sector 639   -  Sector 511   =  22
Sector 767   -  Sector 639   =  12
Sector 895   -  Sector 767   =  19
Sector 1023  -  Sector 895   =  13
Sector 1151  -  Sector 1023  =  402
As you can see, the response timestamp differences suggest that the write completion interrupts are being batched and a single interrupt is raised for the whole batch, which might explain the really low differences of tens of microseconds.
Also, when conducting this experiment, the on-disk write cache was disabled using hdparm.
Clearly, there is some interrupt batching involved here which I need to disable so that an interrupt is raised for each and every write request.
UPDATE:
Here is another experiment that I tried.
Create a bio structure in my driver and call the __make_request() function of the lower-level driver. Only one 2560-byte write request is sent from my driver.
Once this write is serviced, an interrupt is generated, which is intercepted by do_IRQ(). Finally, the function blk_complete_request() is called. Keep in mind that we are still in the top half of the interrupt handler (i.e., interrupt context, not process context). Now, we compose another struct bio in blk_complete_request() and call the __make_request() function of the lower-level driver. We record a timestamp at this point (say T_0). When the request completion callback is invoked, we record another timestamp (call it T_1). The difference T_1 - T_0 is always above 1 millisecond. This experiment was repeated numerous times, and each time the destination sector affected the difference T_1 - T_0. It was observed that if the destination sectors are separated by approximately 350 sectors, the time difference is about 1.2 milliseconds for requests of size 2560 bytes.
Every time, the next write request is sent only when the previous request has been serviced. So, all these requests are chained and the disk has to service only one request at a time.
My understanding is that since the destination sectors of consecutive requests have been separated by a fairly large amount, by the time the next request is issued, the requested sector would be almost below the disk head and thus the write should happen immediately and T_1 - T_0 should be small (at least < 1 millisec).
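The reasoning above depends on how far the platter turns between two requests. Here is a rough sketch of that arithmetic, assuming a 7200 RPM spindle and a purely hypothetical sectors-per-track figure (real drives use zoned recording, so the true value varies across the disk and is not known here):

```python
def revolution_ms(rpm=7200):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def rotation_time_ms(sectors, sectors_per_track, rpm=7200):
    """Time for `sectors` consecutive sectors to pass under the head.

    sectors_per_track is an assumed value, not a measured one.
    """
    return sectors / sectors_per_track * revolution_ms(rpm)


if __name__ == "__main__":
    # 1750 sectors per track is a made-up figure for illustration only.
    print(f"one revolution: {revolution_ms():.2f} ms")
    print(f"350 sectors:    {rotation_time_ms(350, 1750):.2f} ms")
```

With these made-up numbers a 350-sector gap alone takes more than a millisecond to rotate past the head, so rotational delay may be worth ruling out before attributing the latency to a controller timer.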
The Serial ATA AHCI 1.3 Specification (page 114) states that:
When a software specified number of commands have completed or a software specified
timeout has expired, an interrupt is generated by hardware to allow software to process completed commands.
My guess is that this timer may be the reason why the latency of each request is above 1 millisecond. That's why I need to disable CCC.
I did mail the author - Jeff Garzik - but I haven't heard from him yet. Is he a registered user on stackoverflow? If yes, I could PM him...
The HDD we are using is: WD Caviar Black (Model number - WD1001FALS).
Anyone? :-(
AFAIK, bit 7 of the HBA capabilities register (CCC supported) is read-only, so you can check it first to see whether CCC is supported at all. Then, per the spec, you can disable CCC by clearing CCC_CTL.EN, because that bit is read/write.
Did you try clearing it and then repeating your experiment?
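To make the bit manipulation concrete, here is a minimal sketch of the register arithmetic only. It assumes the field layout I remember from the AHCI 1.3 spec for CCC_CTL (EN in bit 0, INT in bits 7:3, CC in bits 15:8, TV in bits 31:16, timeout in milliseconds); please check it against the spec. The function names are hypothetical, and actually reading or writing the register would have to happen inside the driver through the HBA's memory-mapped registers, which is not shown.

```python
CAP_CCCS = 1 << 7      # CAP bit 7: CCC supported (read-only)
CCC_CTL_EN = 1 << 0    # CCC_CTL bit 0: enable (read/write)


def ccc_supported(cap):
    """True if the capabilities register advertises CCC."""
    return bool(cap & CAP_CCCS)


def decode_ccc_ctl(value):
    """Split a 32-bit CCC_CTL value into its fields (assumed layout)."""
    return {
        "enabled": bool(value & CCC_CTL_EN),
        "interrupt": (value >> 3) & 0x1F,
        "command_completions": (value >> 8) & 0xFF,
        "timeout_ms": (value >> 16) & 0xFFFF,
    }


def disable_ccc(value):
    """Return the CCC_CTL value with EN cleared and other bits untouched."""
    return value & ~CCC_CTL_EN & 0xFFFFFFFF


if __name__ == "__main__":
    sample = 0x00010401  # made-up value: TV=1 ms, CC=4, EN=1
    print(decode_ccc_ctl(sample))
    print(hex(disable_ccc(sample)))
```

If the decoded "enabled" field is already false on your controller, CCC is not in use and the latency has to come from somewhere else.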
I want to load and unload Linux drivers from the device's terminal. I have two options, but I do not want to use the first one:
1. Build the driver as a module
CONFIG_DRIVER=m
and then use rmmod and modprobe to unload and load the device driver.
2. Build the device driver into the kernel itself
CONFIG_DRIVER=y
I want to follow the 2nd option, but I do not know how to unload and load the device driver in that case. Can the open source community please help me out here?
It's as easy as that: you find the device and the driver that you want to unbind. For example, on my Intel MinnowBoard (v1) I have a PCH UDC controller (a PCI device):
% lspci -nk
...
02:02.4 0c03: 8086:8808 (rev 02)
Subsystem: 1cc8:0001
Kernel driver in use: pch_udc
Now I know the necessary bits:
bus on which the device is located: PCI
device name: 0000:02:02.4 (note that lspci prints a shortened PCI address, i.e. just bus:device.function (BDF) without the domain, while the driver expects domain:BDF)
driver name: pch_udc
Taken together, we can unbind the device:
% echo 0000:02:02.4 > /sys/bus/pci/drivers/pch_udc/unbind
[ 3042.531872] configfs-gadget 0000:02:02.4: unregistering UDC driver [g1]
[ 3042.540979] udc 0000:02:02.4: releasing '0000:02:02.4'
You may bind it again: simply use the bind node in the same folder.
The feature appeared more than 15 years ago and here is the article on LWN that explains it.
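As a small illustration of the steps above, here is a sketch that expands a short lspci address to domain:BDF form and builds the path of the bind/unbind node. The helper names are hypothetical, the domain is assumed to be 0000 when lspci omits it, and writing to the node needs root.

```python
import re

SHORT_BDF = re.compile(r"[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")
FULL_BDF = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def full_pci_address(addr, domain=0):
    """Turn '02:02.4' into '0000:02:02.4'; full addresses pass through."""
    if FULL_BDF.fullmatch(addr):
        return addr.lower()
    if SHORT_BDF.fullmatch(addr):
        return f"{domain:04x}:{addr.lower()}"
    raise ValueError(f"not a PCI address: {addr!r}")


def driver_node(driver, action, bus="pci", sysfs_root="/sys"):
    """Path of the 'bind' or 'unbind' node of a driver."""
    if action not in ("bind", "unbind"):
        raise ValueError("action must be 'bind' or 'unbind'")
    return f"{sysfs_root}/bus/{bus}/drivers/{driver}/{action}"


def rebind(addr, driver, action, sysfs_root="/sys"):
    """Write the device address to the driver's bind/unbind node."""
    with open(driver_node(driver, action, sysfs_root=sysfs_root), "w") as node:
        node.write(full_pci_address(addr))
```

For the example above, `rebind("02:02.4", "pch_udc", "unbind")` would do the same as the echo command, and passing "bind" attaches the device again.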
The question is pretty much in the title:
How to figure out what directory corresponds to my PCI device in linux?
For example:
My VGA controller
01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 520] (rev a1) (prog-if 00 [VGA controller])
corresponds to /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0. I figured that out by experimentation. What I want to know is whether there is some sort of table that says which directory a given device corresponds to in Linux.
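One way to look this up without experimenting: sysfs keeps a flat directory, /sys/bus/pci/devices, whose entries are symlinks named by the full PCI address and pointing into the /sys/devices tree. A minimal sketch that resolves such a link (the function name is hypothetical, and the domain is assumed to be 0000 when it is missing from the lspci address):

```python
import os


def pci_device_dir(addr, sysfs_root="/sys"):
    """Resolve a PCI address such as '01:00.0' to its sysfs directory."""
    if addr.count(":") == 1:
        # lspci usually omits the domain; assume domain 0000.
        addr = "0000:" + addr
    link = os.path.join(sysfs_root, "bus", "pci", "devices", addr)
    return os.path.realpath(link)


if __name__ == "__main__":
    print(pci_device_dir("01:00.0"))
```

Listing /sys/bus/pci/devices with `ls -l` shows the same mapping for every device at once.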
So, the problem can be described as follows:
We have 11 identical PCI devices connected through two CompactPCI buses, 6 on one and 5 on the other.
We are trying to access the resources of the devices through the sysfs filesystem, example:
/sys/class/pci_bus/0000:04/device/0000:04:0d.0/resource1. First 4 devices allow read/write access to their resources without problems, but:
The 5th and 6th devices of both buses don't work: all files exist, but all read operations return a bunch of FFs regardless of the written values, so I can't really say whether the writes were successful or not. When one of the first 4 devices is physically removed, the 5th device starts working as usual; the same goes for the 6th device on the bus with 6 devices. It looks like sysfs access only works with 4 devices per bus, not more. It should be noted that, according to the specification, CompactPCI allows 7 PCI devices on the bus at once.
It can't really be a hardware problem, because the Windows driver (developed long ago by someone we no longer have access to) handles it just fine.
lspci:
03:0b.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0c.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0d.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0e.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0f.0 Multimedia controller: Device 6472:8001 (rev 01)
04:09.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0a.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0b.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0c.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0d.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0f.0 Multimedia controller: Device 6472:8001 (rev 01)
lspci -vv (identical for all 11 devices aside from the bus numbers):
04:0f.0 Multimedia controller: Device 6472:8001 (rev 01)
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at d800 [size=128]
Region 1: Memory at febfe800 (32-bit, non-prefetchable) [size=128]
I don't know if I really need to show you the code, because it is as simple as it can possibly be: the file is opened, then mmapped, and then the resulting pointer is used to write to and read from that file.
fd = open ( (device_ + "resource" + std::to_string (i)).c_str(), O_RDWR);
ptr = (u_int32_t*) mmap (NULL, 0x7f, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
All paths are resolved correctly; that was the first thing I checked.
dmesg has no errors regarding PCI.
After quite a long time, I've decided to answer this question. I didn't solve the problem by myself, so I wrote an email to the maintainer of the PCI-related code in the Linux kernel. After tens of attempts to find out what went wrong, we just stopped: I had to switch to another project and my spare time was over. The only thing that has been discovered is that in such a configuration you CANNOT use mmap (and this is the primary way of accessing BARs through the sysfs filesystem). So, instead, I developed a simple PCI driver which does exactly the same thing, but using read/write operations, and it worked.
Basically:
kernel    ->  userspace   -  result
ioremap   ->  read/write  -  works
ioremap   ->  mmap        -  doesn't work
sysfs     ->  mmap        -  doesn't work
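One thing that may be worth checking in a setup like this, offered as an assumption and not as something this thread confirms: the BAR shown by lspci is 128 bytes at febfe800, which is not page-aligned. If the kernel maps the whole page that contains the BAR, the registers would sit at an offset inside the mapping and not at its start. The sketch below parses one line of the device's sysfs `resource` file (start, end and flags in hex) and computes that offset; the function names and the flags value in the sample line are hypothetical.

```python
import mmap


def parse_resource_line(line):
    """Parse one 'start end flags' line of a sysfs resource file."""
    start, end, flags = (int(field, 16) for field in line.split())
    return start, end, flags


def bar_window(line, page_size=mmap.PAGESIZE):
    """Size of the BAR and its offset inside the page that contains it."""
    start, end, _flags = parse_resource_line(line)
    return {"size": end - start + 1, "page_offset": start % page_size}


if __name__ == "__main__":
    # Start and size taken from the lspci output; flags are made up.
    sample = "0x00000000febfe800 0x00000000febfe87f 0x0000000000040200"
    print(bar_window(sample, page_size=4096))
```

A non-zero page_offset would mean the pointer returned by mmap has to be advanced by that many bytes before touching the registers. Whether that explains the behaviour described here is untested.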
How can I register a user-space callback function with the USB driver for mass storage devices in Linux?
I get the following messages on the console when a USB stick is attached.
usb 1-1: new high speed USB device using ehci_hcd and address 2
usb 1-1: Product: DataTraveler G2
usb 1-1: Manufacturer: Kingston
usb 1-1: SerialNumber: 0019E06B07F7A961877C02A9
usb 1-1: configuration #1 chosen from 1 choice
scsi0 : SCSI emulation for USB Mass Storage devices
scsi 0:0:0:0: Direct-Access Kingston DataTraveler G2 1.00 PQ: 0 ANSI: 2
SCSI device sda: 7818240 512-byte hdwr sectors (4003 MB)
sda: Write Protect is off
sda: assuming drive cache: write through
SCSI device sda: 7818240 512-byte hdwr sectors (4003 MB)
sda: Write Protect is off
sda: assuming drive cache: write through sda:sda1
sd 0:0:0:0: Attached scsi removable disk sda
sd 0:0:0:0: Attached scsi generic sg0 type 0
You could create a udev rule which executes a command when the device is inserted. Basically, you create a file containing a set of matching rules and the path to a program or script to run. It'll look something like this:
KERNEL=="sd?1", ATTRS{serial}=="0019E06B07F7A961877C02A9", RUN+="/path/to/script arg1 arg2 ... argN"
This will run /path/to/script with the arguments arg1 to argN when a device node named sd?1 is created, where ? is any character, with the serial number given in your data. You can get a lot of info from the udevinfo program (udevadm info on newer systems) to incorporate in the rule if you need finer control over when it should fire. For instance, if you want it to fire for all Kingston drives, you'd need to find the vendor ID and maybe some more information unique to these drives.
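For illustration, here is a minimal sketch of what the script named in RUN could look like. It relies on udev passing event properties such as ACTION and DEVNAME to the program through environment variables; the function name and the log file path are hypothetical.

```python
import os
import sys

LOG_PATH = "/tmp/usb-storage-events.log"  # hypothetical location


def describe_event(env, args):
    """Build a one-line description of the udev event."""
    action = env.get("ACTION", "unknown")
    devname = env.get("DEVNAME", "unknown")
    return f"{action} {devname} args={' '.join(args)}"


if __name__ == "__main__":
    # udev expects RUN programs to finish quickly, so only log and exit.
    with open(LOG_PATH, "a") as log:
        log.write(describe_event(os.environ, sys.argv[1:]) + "\n")
```

Anything long-running should be handed off to a separate process so that the script itself returns immediately.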
I have a temperature sensor which is connected through a USB-I2C adapter (http://www.robot-electronics.co.uk/htm/usb_i2c_tech.htm).
I attached this device to my Linux computer (suse10).
I typed dmesg and saw
usb 3-3: new full speed USB device using ohci_hcd and address 10
usb 3-3: new device found, idVendor=0403, idProduct=6001
usb 3-3: new device strings: Mfr=1, Product=2, SerialNumber=3
usb 3-3: Product: FT232R USB UART
usb 3-3: Manufacturer: FTDI
usb 3-3: SerialNumber: A7007K93
usb 3-3: configuration #1 chosen from 1 choice
ftdi_sio 3-3:1.0: FTDI USB Serial Device converter detected
drivers/usb/serial/ftdi_sio.c: Detected FT232BM
usb 3-3: FTDI USB Serial Device converter now attached to ttyUSB0
But I have no idea how to read the current temperature.
Update 1: Actually, up to 127 sensors can be attached to the I2C bus, but I have no idea how to list the addresses of the available sensors.
Can anybody give me some hints? Thanks in advance
Your adapter allows you to send I2C commands over a virtual serial port. A serial port has been created for you. You need to open it and send commands to it. The commands are specific to the device you are connected to. See the example in the link you provided to get an idea.
It is hard to give you correct instructions without a datasheet. Most probably your device will use one byte address and the read procedure is as follows:
[I2C_AD1]  [Device I2C address + Read bit]  [Device Address register]  [Number of bytes to read]
0x55       0xXX                             0x00                       0x01
You need to send 4 bytes to the serial port. The first one instructs the USB-to-I2C converter to send a read command. The second one is the address of the device attached to the I2C bus. I2C devices use 7-bit addresses (0-127). On the wire these are usually shifted left by one bit, with bit 0 carrying the read/write flag. Therefore, to scan these addresses, iterate from 0 to 127, shift left by one bit, and set bit 0 to 1:
([0x00 - 0x7F] << 1) | 1
Since we don't have a datasheet I can't tell anything about the last two bytes. You could try to use dummy values. If a device is attached to the scanned I2C address, it should reply with a NACK to an attempt to read a non-existing register. Read commands sent to an I2C address that doesn't correspond to an actual device should be ignored.
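The scan described above can be sketched as follows. This only builds the 4-byte read commands; the register and byte-count values are the same dummy values used above, the function names are hypothetical, and actually sending the bytes requires opening the serial port (/dev/ttyUSB0 in the dmesg output) with the settings from the adapter's documentation, which is not shown here.

```python
I2C_AD1 = 0x55  # adapter command: read from a one-byte-addressed device


def read_command(address7, register=0x00, count=1):
    """Build the 4-byte read command for a 7-bit I2C address."""
    if not 0 <= address7 <= 0x7F:
        raise ValueError("I2C addresses are 7 bits (0-127)")
    # Shift the address left by one bit and set bit 0 (read).
    return bytes([I2C_AD1, (address7 << 1) | 1, register, count])


def scan_commands(register=0x00, count=1):
    """One read command for each possible 7-bit address."""
    return [read_command(addr, register, count) for addr in range(0x80)]


if __name__ == "__main__":
    for command in scan_commands()[:4]:
        print(command.hex(" "))
```

Each command would be written to the port in turn, and the adapter's reply (or the lack of one) read back before trying the next address.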