Tracking down the source of interrupts - Linux

I've got a Xeon-D based platform running CentOS 7.3 (Linux kernel 3.10.0-862), and I've noticed a lot of unhandled interrupts arriving at a particular IRQ (IRQ#18). The IRQ line appears to be shared by the USB host controller and the SMBus driver; however, the interrupts keep coming even after disabling all USB ports in the BIOS and unloading the i2c-i801 module (one at a time, so that the kernel keeps listening to the IRQ line). I'm therefore at a loss to track down where these interrupts are coming from, if not from USB or I2C.
Eventually things get so bad that the kernel simply disables the IRQ (irq 18: nobody cared ...), causing the SMBus to stop working.
The only way to keep it going is to add the noirqdebug parameter to my kernel boot command line.
I've observed the same issue when running Ubuntu 18.04, with a newer kernel.
I've got the following questions:
How do IRQ numbers get assigned at kernel and device levels?
Is it possible that something not on the USB or SMBus could be sending interrupts to IRQ#18?
How does one go about tracking down the source of these interrupts?
and finally,
How to fix this? :-)
I'd appreciate any suggestions or pointers and I'll be happy to provide any information I might've missed. Thanks in advance.
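One practical angle on the third question is to attach an extra shared handler that only counts interrupts and never claims them, then unload the suspect drivers one at a time and see whether the count keeps climbing. Below is a minimal sketch, assuming IRQ 18 and a reasonably recent request_irq() API; the module and symbol names are illustrative, not from the original post:

/* irq_watch.c - sketch: observe a shared IRQ line without claiming it.
 * Assumptions: the line was registered IRQF_SHARED by its real drivers,
 * and watch_irq (default 18) is the line under suspicion.
 */
#include <linux/module.h>
#include <linux/interrupt.h>
#include <linux/atomic.h>

static int watch_irq = 18;          /* the suspect IRQ; override at insmod */
module_param(watch_irq, int, 0444);

static atomic_t hits = ATOMIC_INIT(0);

static irqreturn_t watch_handler(int irq, void *dev_id)
{
	/* Count the interrupt but never claim it, so the real drivers
	 * sharing the line still get their chance to handle it. */
	atomic_inc(&hits);
	return IRQ_NONE;
}

static int __init watch_init(void)
{
	/* IRQF_SHARED handlers need a unique, non-NULL dev_id cookie. */
	return request_irq(watch_irq, watch_handler, IRQF_SHARED,
			   "irq-watch", &hits);
}

static void __exit watch_exit(void)
{
	free_irq(watch_irq, &hits);
	pr_info("irq-watch: saw %d interrupts on IRQ %d\n",
		atomic_read(&hits), watch_irq);
}

module_init(watch_init);
module_exit(watch_exit);
MODULE_LICENSE("GPL");

Comparing the module's count against /proc/interrupts and /proc/irq/18/spurious while the suspect drivers are unloaded shows whether the storm continues with neither of them on the line.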

Related

kgdboe kgdb kernel debugging at boot

I'm attempting to get kernel debugging to work during boot. I've followed all the steps to install it (how to use kgdb over ethernet (kgdboe)?) and can connect fine when I insmod the module after booting, but if I add this
BOOT_IMAGE=/vmlinuz-4.0.0-rc7+ root=UUID=<my_root> ro drm.debug=0x04 kgdbwait kgdboe=#<src_ip>/eth1,#<target_ip>/ vt.handoff=7
to the kernel boot line, I don't see the module loaded, and it doesn't kgdbwait.
When I look at my kern.log, I see the following:
kgdboe: eth0 does not have a in_ifaddr struct associated. Cannot get default IP address.
I have both eth0 and eth1 by the way, but only eth1 is connected.
Any suggestions? Is it just that the PCIe network card's driver isn't loaded until later in boot, and that's what is causing my issues?
Also, why would I need to specify the source or target ip addresses? Is there any way to have kgdboe accept all ip addresses, even when trying to load it at boot?
Thanks
Yes, for early kernel debug, kgdboe does not really work. There are several issues, some easy to solve, some not solvable. You can build the required modules into the kernel rather than demand-loading them, which solves the easy one. But the core problem is that the early kgdb wait pauses all worker threads, and nearly all Ethernet PCIe card drivers require worker threads, or else require IRQs. Even where polled Ethernet driver support exists (it is very limited), IRQs can be preempted (or locks illegally held) and prevent the polled driver from functioning. As a result, early kernel debug (e.g. kgdbwait on the GRUB2 boot line) does not work reliably with kgdboe, and with some Ethernet drivers it does not work at all.
There has been occasional talk of hacking up various Ethernet driver sources to provide kgdboe support over a special-purpose Ethernet driver, but none that I know of has been distributed. You are still best off using a serial port, and for full functionality a serial console, which can be multiplexed onto a single serial port if need be with kgdboc plus agent-proxy. If true remote access is required, remote into the debugging system that initiates the serial connection.
You can also use a USB port, but that requires a specific USB<->serial debug dongle that is no longer sold (the Ajays Blue dongle). These were discontinued about 6 months ago, and there is no replacement yet. (It was a Windows debugging device adapted to Linux; Windows has since moved on to native USB 3.0 debugging features, and Linux has yet to catch up.) So, unless you have the needed USB converter, another source for one, or an alternative adapter, you are out of luck on USB 2.0.
Serial is still your best option, sadly, even in 2016.
See: http://kgdb.wiki.kernel.org
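For comparison, a minimal serial setup needs only something like the following on the kernel boot line (the port and baud rate here are examples, and this assumes kgdboc is built into the kernel):
console=ttyS0,115200 kgdboc=ttyS0,115200 kgdbwait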

How to simulate an interrupt storm or a livelock on Linux?

Background:
I am developing a tool which boots a custom build of Linux into a QT-based desktop on x86 machines. My custom Linux runs from USB, and when it boots on a machine with a certain brand of sound card connected, my tool runs into a livelock situation with a lot of interrupts. I suspect it's some problem with the APIC driver, but the system is rendered useless and I have to power off the machine.
My Question:
I would like to simulate the same situation using a kernel driver or module. I am not sure whether I can cause an interrupt to fire from a module. I have experience with I2C and SPI devices that cause interrupts on ARM-based Linux boards, but I don't know how to do it from a module.
Could anybody please suggest how to cause an interrupt from a driver?
Just create a module with an interrupt forkbomb in it. Google it. It'll only take a second for your VM to halt.
http://www.tldp.org/LDP/tlk/dd/interrupts.html
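One way to approximate an interrupt storm from a module is a self-rearming high-resolution timer, whose callback runs in hard interrupt context. A minimal sketch, assuming a kernel with the hrtimer API; the names and the 1 µs period are arbitrary choices of mine:

/* irqstorm.c - sketch: flood the CPU with timer interrupts via a
 * self-rearming hrtimer with a very short period.
 */
#include <linux/module.h>
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct hrtimer storm_timer;
static ktime_t period;

static enum hrtimer_restart storm_fn(struct hrtimer *t)
{
	/* Runs in hard interrupt context; immediately re-arm. */
	hrtimer_forward_now(t, period);
	return HRTIMER_RESTART;
}

static int __init storm_init(void)
{
	period = ktime_set(0, 1000);	/* 1 us: effectively a storm */
	hrtimer_init(&storm_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	storm_timer.function = storm_fn;
	hrtimer_start(&storm_timer, period, HRTIMER_MODE_REL);
	return 0;
}

static void __exit storm_exit(void)
{
	hrtimer_cancel(&storm_timer);
}

module_init(storm_init);
module_exit(storm_exit);
MODULE_LICENSE("GPL");

Insert it only in a disposable VM: with a period this short the CPU spends nearly all its time in the timer handler and may never schedule anything else again, which is essentially the livelock being asked about.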

How can the Linux kernel be forced to enumerate the PCIe bus?

Linux kernel 2.6
I've got an FPGA, loaded over GPIO, connected to a development board running Linux.
The FPGA will transmit and receive data over the PCI Express bus. However, the bus is enumerated at boot, and as such no link is discovered (because the FPGA is not loaded at boot).
How can I force re-enumeration of the PCIe bus in Linux?
Is there a simple command, or will I have to make kernel changes?
I need the capability to hotplug PCIe devices.
As root, try the following command:
echo "1" > /sys/bus/pci/rescan
See this link for more information: http://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci
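If you need to trigger the same rescan from kernel code, something along these lines should work on kernels recent enough to export these helpers (well after the 2.6 kernel in the question); the module name and the choice of domain 0, bus 0 are example assumptions:

/* pci_rescan.c - sketch: force a PCI rescan from a module, roughly
 * what writing to the sysfs rescan attribute does.
 */
#include <linux/module.h>
#include <linux/pci.h>

static int __init rescan_init(void)
{
	struct pci_bus *root = pci_find_bus(0, 0);	/* domain 0, bus 0 */

	if (!root)
		return -ENODEV;

	pci_lock_rescan_remove();
	pci_rescan_bus(root);	/* scan for new devices and add them */
	pci_unlock_rescan_remove();
	return 0;
}

static void __exit rescan_exit(void) { }

module_init(rescan_init);
module_exit(rescan_exit);
MODULE_LICENSE("GPL");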
I wonder what platform you are on: a workaround (a.k.a. hack) that works on x86 systems is to have the BIOS statically configure a PCI device at whatever bus, device, and function the FPGA normally lands on; the OS will then enumerate the device and reserve the PCI space for it (even though the device isn't really there). Then, in your device driver, you will have to do some extra things, like setting up the BARs and interrupt lines manually after the FPGA has been programmed. Of course, this requires modifying the BIOS. If you are working with a BIOS vendor, you can contract them to make this change for you; if you are not, it will be much harder... Also keep in mind that I was working on VxWorks on x86, and we had AMI make a custom BIOS for our boards...
If you don't have a BIOS, then consider programming the FPGA in the bootloader: there you already have the ability to read from disk, and adding GPIO capabilities probably isn't too difficult (assuming you are using JTAG and GPIOs?). In fact, depending on which bootloader you use, it might already support GPIO.
The issue with modifying the kernel to do this is that you have to find the sweet spot where you can read the bitfile before PCI enumeration... If, for example, the disk device drivers are initialized after PCI, then you must make some radical changes to the kernel just to read the bitfile prior to PCI enumeration, which might cause other annoying problems...
One other option, which you may have already discovered and which is really only acceptable during development: power up the system, program the FPGA board, then do a reset without a power cycle (for example: sudo reboot now). The FPGA should keep its configuration, and Linux should enumerate it...
After you turn on your computer, the BIOS enumerates the PCI bus and attempts to fulfill all IO-space and memory-mapped IO (MMIO) requests. It sets up these BARs initially; when the operating system loads, it can change them as it sees fit while its PCI bus driver enumerates the bus again. The superuser can even run the setpci command to change the BARs after the BIOS has configured them and the OS has loaded (which may cause drivers to fail, and several other bad things, if done improperly).
I have had to do this in cases where the card in question was not assigned any resources by the BIOS since the region requested required a 64-bit address and the BIOS only operated with 32-bit address assignments. I was able to go in after-the-fact and change these addresses (originally assigned by the BIOS) to whatever addresses I saw fit, insert the kernel module, and my driver would map and use these newly-assigned addresses for the card without knowing the difference.
The problem with hotplugging PCI Express cards is that the power to the slot itself cannot be switched on and off without specific hotplug controllers on the motherboard/backplane. Without those controllers to cut the slot's power, shorts between the tiny pins can occur when the card is physically inserted or removed while power is still present. Hotplug events, however, can be initiated by either end (the host or the endpoint device). That does not seem to be your situation; however, if your FPGA already had a link established with the root complex, a possible solution would be to generate hotplug interrupts to cause a bus rescan in the OS.
There is a major problem, though: if your card has not actually established a link to the root complex, it won't be able to generate any hotplug events, and that seems to be your case. After booting, the FPGA should toggle the PRESENT line on the PCIe slot to tell the OS there is a card ready to be enumerated. Once it is detected, the OS should attempt to establish a link to the card and assign memory regions to it. After the OS enumerates the card, you'll be able to load drivers against it and see it in lspci. You stated you're using kernel 2.6, which does support hotplugging and dynamic resource allocation, so this method should work as long as your FPGA supports toggling the PRESENT PCIe line, too.

How can we check whether the NAPI feature is enabled in Linux?

I want to check whether NAPI is enabled in our Linux or not. We are using the bnx2 driver.
Our Linux OS is RHEL5 and the kernel is 2.6.18-164.el5PAE.
If anyone knows, please help.
Thanks in advance.
Just generate continuous network traffic to the interface using a traffic generator. Make sure *not* to generate traffic back through the interface. Then do:
watch 'grep bnx2 /proc/interrupts'
You will see the statistics for the interrupt lines that the bnx2 driver has registered.
Does the interrupt count keep rising in proportion to the packet rate? Then NAPI is off. Is the interrupt count steady, or rising much more slowly than the packet rate? Then NAPI is on.
Note that an interrupt line shared with some other device/driver may complicate the test.
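The same comparison can be automated. A rough user-space sketch of mine (the "bnx2" default and the one-second window are arbitrary) that prints the per-second growth of every /proc/interrupts line mentioning the driver:

/* irqrate.c - crude check: how fast do a driver's interrupt counters
 * grow? Build with: cc -o irqrate irqrate.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sum every per-CPU counter on /proc/interrupts lines that mention
 * the given name; returns -1 if the file can't be read. */
static long long count_irqs(const char *name)
{
	char line[4096];
	long long total = 0;
	FILE *f = fopen("/proc/interrupts", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		char *p;

		if (!strstr(line, name))
			continue;
		p = strchr(line, ':');	/* skip the "NN:" IRQ label */
		if (!p)
			continue;
		p++;
		for (;;) {
			char *end;
			long long v = strtoll(p, &end, 10);

			if (end == p)	/* reached the chip/driver names */
				break;
			total += v;
			p = end;
		}
	}
	fclose(f);
	return total;
}

int main(int argc, char **argv)
{
	const char *name = argc > 1 ? argv[1] : "bnx2";
	long long prev = count_irqs(name);

	if (prev < 0) {
		perror("/proc/interrupts");
		return 1;
	}
	for (;;) {
		long long cur;

		sleep(1);
		cur = count_irqs(name);
		printf("%s: %lld interrupts/s\n", name, cur - prev);
		fflush(stdout);
		prev = cur;
	}
}

If the printed rate tracks the offered packet rate one-to-one, interrupts are firing per packet and NAPI polling is not kicking in.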

Is there a way to ask the Linux Kernel to re-run its PCI initialization code?

I'm looking for either a kernel mode call that I can make from a driver, a userland utility, or a system call that will ask the Kernel to look at the PCI bus and either completely re-run its initialization, or initialize a specific device. Specifically, I need the Kernel to recognize a device that was added to the bus after boot and then configure its address space, interrupt, and other configuration parameters, and finally enable the device so that I can load the driver for it (unless this all happens as part of the driver load).
I'm stuck on the 2.4.x series kernel for this, and am currently working with 2.4.20, but will be moving to 2.4.37 if it matters. The distro is a stripped-down Red Hat 7.3 running in a RAM disk, but I can add whatever tools are needed to get this working (as long as they play nicely with the 2.4 series).
If some background would help clarify what I'm trying to do: From a cold boot, once in Linux I use GPIO to program an FPGA. Part of the FPGA, once programmed, implements a simple PCI device. Currently, after programming the FPGA, I reboot the system and Linux recognizes the device after coming up and loads the driver for it.
Instead of needing that reboot, I'd like to simply ask the kernel to do whatever it does during boot to find PCI devices (I have the kernel configured to find PCI devices on its own, instead of asking the BIOS for that information, so the BIOS won't need to know about this device, I hope).
I believe that Linux is capable of seeing the device after it is programmed but before a reboot, because scanpci will show the device after I program it, as will lspci -H 1. I just need a way to get it into /proc/pci, configured and enabled.
The command below will rescan the complete root bus:
echo "1" > /sys/class/pci_bus/0000\:00/rescan
You could speed up the reboot with kexec if you can't figure out how to redo the PCI scan. You could also ask this on the LKML, if you haven't already.
Unloading/reloading the module doesn't help, does it?
http://www.linuxjournal.com/article/5633 suggests you should be able to do it with 2.4 kernels using pcihpfs.
If that isn't working, maybe the driver doesn't support hotplug?
It would probably crash the system if you reconfigured the addresses of other PCI devices while they are in use.
A better way would be to configure just the new card. If your kernel has support for CardBus devices, it already knows how to configure a newly inserted PCI device (which is what CardBus is). You just need to figure out how to get the kernel to do it...
It should be possible for a kernel module to do this. Even if you can't reuse the built-in hotplug code, you should be able to set the PCI resources using calls to pci_bus_write_config_dword() and friends. There is probably some IRQ routing to set up as well.
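As a rough illustration of what such a module would do, here is a sketch using the config-space accessors the answer mentions. The devfn, BAR address, and IRQ line are made-up example values, and the helpers shown are from later 2.6-era kernels, so treat this as a shape rather than something that builds unmodified on 2.4:

/* fpga_cfg.c - sketch: hand-assign resources to a device the
 * firmware never configured.
 */
#include <linux/module.h>
#include <linux/pci.h>

#define FPGA_DEVFN	PCI_DEVFN(4, 0)		/* example: device 4, function 0 */
#define FPGA_BAR0	0xe0000000		/* example free MMIO address */
#define FPGA_IRQ	11			/* example interrupt line */

static int __init fpga_cfg_init(void)
{
	struct pci_bus *bus = pci_find_bus(0, 0);	/* domain 0, bus 0 */

	if (!bus)
		return -ENODEV;

	/* Point BAR0 at a free MMIO window... */
	pci_bus_write_config_dword(bus, FPGA_DEVFN,
				   PCI_BASE_ADDRESS_0, FPGA_BAR0);
	/* ...record the interrupt routing... */
	pci_bus_write_config_byte(bus, FPGA_DEVFN,
				  PCI_INTERRUPT_LINE, FPGA_IRQ);
	/* ...and enable MMIO decoding and bus mastering. */
	pci_bus_write_config_word(bus, FPGA_DEVFN, PCI_COMMAND,
				  PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
	return 0;
}

static void __exit fpga_cfg_exit(void) { }

module_init(fpga_cfg_init);
module_exit(fpga_cfg_exit);
MODULE_LICENSE("GPL");

The addresses you pick must not collide with windows the BIOS already handed out, which is exactly why the BIOS-reservation hack described above is attractive.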