Tasklet counts in /proc/softirqs increase very rapidly on USB operation in Linux - linux

I have a legacy device with the following configuration:
Chipset Architecture : Intel NM10 express
CPU : Atom D2250 Dual Core
Volatile Memory : 1GB
CPU core : 4
USB Host controller driver : ehci-pci
When I perform any USB operation, I observe the tasklet count increasing linearly. If the USB operation continues for a long time (approximately half an hour), the tasklet count crosses a million, which seems very strange to me.
I read about the interrupt handling mechanism used by ehci-pci, which is not the latest (i.e. it is PCI pin-based), but the tasklet counts still seem very high.
I use /proc/softirqs to read the tasklet count.
Any leads on this?
root#panther1:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
23: 0 0 0 1411443 IO-APIC 23-fasteoi ehci_hcd:usb1, uhci_hcd:usb2
root#panther1:~# cat /proc/softirqs
CPU0 CPU1 CPU2 CPU3
HI: 7 3 13 1125529
TIMER: 2352846 2325384 2533628 2675821
NET_TX: 0 0 2 1703
NET_RX: 1161 1193 2730 51184
BLOCK: 0 0 0 0
IRQ_POLL: 0 0 0 0
TASKLET: 256 164 90 1162220
SCHED: 1078965 1015261 1155661 1207484
HRTIMER: 0 0 0 0
RCU: 1370647 1367098 1485356 1503762
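
To see how fast the counters grow, rather than only their absolute values, two snapshots of /proc/softirqs can be compared. The following is a minimal sketch, not part of the original post; the function names parse_softirqs and per_cpu_delta are made up for illustration, and the text is assumed to have the layout shown above (a header line of CPU names, then one "NAME: counts" row per softirq type).

```python
def parse_softirqs(text):
    """Parse /proc/softirqs-style text into {softirq name: [count per CPU]}."""
    rows = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the header line with CPU names has no colon
        name, _, counts = line.partition(":")
        rows[name.strip()] = [int(c) for c in counts.split()]
    return rows


def per_cpu_delta(before, after, name="TASKLET"):
    """Per-CPU increase of one softirq type between two parsed snapshots."""
    return [new - old for old, new in zip(before[name], after[name])]
```

On a live system the two snapshots would come from reading /proc/softirqs twice with a known delay in between; dividing each delta by that delay gives a rate per second that can be compared with the growth of the IRQ 23 line in /proc/interrupts.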

Related

Add irq handler for keyboard in VirtualBox

I'm trying to write a simple kernel module that provides a random number to the user by registering a shared interrupt handler for keyboard or mouse interrupts and reading the timestamp, all inside VirtualBox. The problem is that I can't find out how VirtualBox handles the keyboard or mouse interrupts, or which IRQ number I should use.
cat /proc/interrupts output:
CPU0
0: 30 IO-APIC 2-edge timer
1: 38 IO-APIC 1-edge i8042
8: 0 IO-APIC 8-edge rtc0
9: 0 IO-APIC 9-fasteoi acpi
12: 158 IO-APIC 12-edge i8042
14: 0 IO-APIC 14-edge ata_piix
15: 769 IO-APIC 15-edge ata_piix
16: 329 IO-APIC 16-fasteoi enp0s8
18: 713 IO-APIC 18-fasteoi vmwgfx
19: 5387 IO-APIC 19-fasteoi enp0s3
20: 327 IO-APIC 20-fasteoi vboxguest
21: 28865 IO-APIC 21-fasteoi ahci[0000:00:0d.0], snd_intel8x0
22: 26 IO-APIC 22-fasteoi ohci_hcd:usb1
NMI: 0 Non-maskable interrupts
LOC: 82937 Local timer interrupts
SPU: 0 Spurious interrupts
PMI: 0 Performance monitoring interrupts
IWI: 0 IRQ work interrupts
RTR: 0 APIC ICR read retries
RES: 0 Rescheduling interrupts
CAL: 0 Function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
THR: 0 Threshold APIC interrupts
DFR: 0 Deferred Error APIC interrupts
MCE: 0 Machine check exceptions
MCP: 3 Machine check polls
ERR: 0
MIS: 0
PIN: 0 Posted-interrupt notification event
NPI: 0 Nested posted-interrupt event
PIW: 0 Posted-interrupt wakeup event
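
In the listing above, the i8042 driver (the PS/2 keyboard and mouse controller) appears on IRQ 1 and IRQ 12; on PC hardware IRQ 1 is conventionally the keyboard and IRQ 12 the PS/2 mouse. A small sketch for looking up numbered IRQ lines by driver name from /proc/interrupts-style text is shown below. The helper name irqs_for_device is hypothetical, and the sample is a shortened copy of the output in the question.

```python
SAMPLE = """\
CPU0
0: 30 IO-APIC 2-edge timer
1: 38 IO-APIC 1-edge i8042
12: 158 IO-APIC 12-edge i8042
21: 28865 IO-APIC 21-fasteoi ahci[0000:00:0d.0], snd_intel8x0
NMI: 0 Non-maskable interrupts
"""


def irqs_for_device(interrupts_text, device):
    """Return the numeric IRQ labels whose line mentions `device`."""
    matches = []
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip the CPU header and named rows such as NMI or LOC
        if device in rest:
            matches.append(int(label))
    return matches


print(irqs_for_device(SAMPLE, "i8042"))
```

Whether a second handler can actually be attached to those lines depends on how the existing driver registered its own handler, which this sketch does not check.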

dpdk-testpmd receives packets but does not send anything

Development setup:
AMD 3700X on a B450 motherboard
2 x Intel I210 1Gb NICs (one port each, connected to one another)
Ubuntu 20.04
linux kernel 5.6.19-050619-generic
DPDK version stable-20.11.3
$ sudo usertools/dpdk-devbind.py --status
Network devices using DPDK-compatible driver
============================================
0000:06:00.0 'I210 Gigabit Network Connection 1533' drv=uio_pci_generic unused=vfio-pci
Network devices using kernel driver
===================================
0000:04:00.0 'I210 Gigabit Network Connection 1533' if=enigb1 drv=igb unused=vfio-pci,uio_pci_generic *Active*
0000:05:00.0 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller 8168' if=enp5s0 drv=r8169 unused=vfio-pci,uio_pci_generic *Active*
One NIC (0000:04:00.0) provides traffic (88 packets captured from a live device) in Linux mode (nothing to do with DPDK):
$ sudo tcpreplay -q -p 1000 -l 5 -i enigb1 random_capture_realtek_dec_14.pcapng 2>&1 | grep -v "PF_PACKET\|Warning"
Actual: 440 packets (96555 bytes) sent in 0.440001 seconds
Rated: 219442.6 Bps, 1.75 Mbps, 999.99 pps
Statistics for network device: enigb1
Successful packets: 440
Failed packets: 5
Truncated packets: 0
Retried packets (ENOBUFS): 0
Retried packets (EAGAIN): 0
The other NIC (0000:06:00.0) has dpdk-testpmd running on it:
sudo ./build/app/dpdk-testpmd -c 0xf000 -n 2 --huge-dir=/mnt/huge-2M -a 06:00.0 -- --portmask=0x3
EAL: Detected 16 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: 3 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: Probe PCI driver: net_e1000_igb (8086:1533) device: 0000:06:00.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
testpmd: create a new mbuf pool <mb_pool_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
Configuring Port 0 (socket 0)
Port 0: 68:05:CA:E3:05:A2
Checking link statuses...
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
Logical Core 13 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=1
port 0: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=512 - RX free threshold=32
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=512 - TX free threshold=0
TX threshold registers: pthresh=8 hthresh=1 wthresh=16
TX offloads=0x0 - TX RS bit threshold=0
Press enter to exit
Port 0: link state change event
Telling cores to stop...
Waiting for lcores to finish...
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 153 RX-dropped: 287 RX-total: 440
TX-packets: 0 TX-dropped: 0 TX-total: 0
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 153 RX-dropped: 287 RX-total: 440
TX-packets: 0 TX-dropped: 0 TX-total: 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Done.
No matter how many packets I pump through the pipeline, at most 153 packets are processed; the rest are dropped and nothing is sent. My guess is that those 153 packets clog the TX queue on 0000:06:00.0 and that's it.
I tried dpdk-pktgen and it sent nothing as well.
Note: the NIC used to send the packets (tcpreplay) is the same model as the one used for dpdk-testpmd.
What am I doing wrong, or what am I not doing (and I should)?
Update 1:
testpmd> set promisc all on
testpmd> start tx_first
io packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
(...)
testpmd> show port xstats 0
(...)
rx_good_packets: 153
tx_good_packets: 0
rx_good_bytes: 31003
tx_good_bytes: 0
rx_missed_errors: 287
(...)
rx_total_packets: 440
tx_total_packets: 285
rx_total_bytes: 97035
tx_total_bytes: 17100
tx_size_64_packets: 0
tx_size_65_to_127_packets: 0
tx_size_128_to_255_packets: 0
tx_size_256_to_511_packets: 0
tx_size_512_to_1023_packets: 0
tx_size_1023_to_max_packets: 0
testpmd> show fwd stats all
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 153 RX-dropped: 287 RX-total: 440
TX-packets: 0 TX-dropped: 0 TX-total: 0
----------------------------------------------------------------------------

Is it possible to get current CPU utilisation from a specific core via /sys in Linux?

I would like to write a shell script that reads the current CPU utilisation on a per-core basis. Is it possible to read this from the /sys directory in Linux (CentOS 8)? I have found /sys/bus/cpu/drivers/processor/cpu0, which does give me a fair bit of information (like the current frequency), but I've yet to figure out how to read CPU utilisation.
In other words: Is there a file that gives me current utilisation of a specific CPU core in Linux, specifically CentOS 8?
I believe you should be able to extract this information from /proc/stat: look at the lines that start with cpu$N, where $N is 0, 1, 2, and so on. I strongly suggest reading the articles referenced in the other answer. For example:
cpu0 101840 1 92875 80508446 4038 0 4562 0 0 0
cpu1 81264 0 68829 80842548 4424 0 2902 0 0 0
A repeated call will show larger values:
cpu 183357 1 162020 161382289 8463 0 7470 0 0 0
cpu0 102003 1 93061 80523961 4038 0 4565 0 0 0
cpu1 81354 0 68958 80858328 4424 0 2905 0 0 0
Notice the cpu0 idle count (the 5th column, counting the cpuN label) moving from 80508446 to 80523961.
Format of each line in /proc/stat:
cpuN user-time nice-time system-time idle-time io-wait irq softirq steal guest guest_nice
So a basic solution:
while true; do
  for each cpu; do
    read current counters, at least user-time, system-time and idle-time
    usage = current(user-time + system-time) - prev(user-time + system-time)
    idle = current(idle-time) - prev(idle-time)
    utilization = usage / (usage + idle)
    # print or whatever
    set prev = current
  done
  sleep for the sampling interval
done
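
The loop above can be sketched in Python as follows. This is only an illustration of the same arithmetic (user-time plus system-time against idle-time), not a complete monitoring tool; the function names parse_cpu_lines and utilisation are made up here, and the sample values are the cpu0 lines quoted above.

```python
def parse_cpu_lines(stat_text):
    """Map 'cpuN' to its list of integer counters from /proc/stat-style text."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-core lines (cpu0, cpu1, ...); skip the aggregate 'cpu' line
        # and every non-CPU line.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(v) for v in fields[1:]]
    return counters


def utilisation(prev, cur):
    """Busy fraction of one core between two samples of its counters."""
    user, system, idle = 0, 2, 3  # positions after the cpuN label
    busy = (cur[user] + cur[system]) - (prev[user] + prev[system])
    idle_delta = cur[idle] - prev[idle]
    total = busy + idle_delta
    return busy / total if total else 0.0


first = parse_cpu_lines("cpu0 101840 1 92875 80508446 4038 0 4562 0 0 0")
second = parse_cpu_lines("cpu0 102003 1 93061 80523961 4038 0 4565 0 0 0")
for core in first:
    print(core, f"{utilisation(first[core], second[core]):.1%}")
```

On a real system the two samples would be the contents of /proc/stat read a fixed interval apart. Counting only user and system time follows the answer above; nice, irq and softirq time could be added to the busy side if they matter for the workload.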

write error: No space left on device in embedded linux

Hi all,
I have an embedded board running Linux, with yaffs2 as the rootfs.
I run a program on it, but after some time it fails with the error "No space left on device", even though I checked the flash and there is still a lot of free space.
The program only writes a few config files, which are rarely updated, and some logs to flash. The log size is limited to 2M.
I don't know why this happens or how to solve it.
Please help! (English is not my first language, sorry. I hope you understand what I mean.)
Some debug info:
# ./write_test
version 1.0
close file :: No space left on device
return errno 28
# cat /proc/yaffs
YAFFS built:Nov 23 2015 16:57:34
Device 0 "rootfs"
start_block........... 0
end_block............. 511
total_bytes_per_chunk. 2048
use_nand_ecc.......... 1
no_tags_ecc........... 1
is_yaffs2............. 1
inband_tags........... 0
empty_lost_n_found.... 0
disable_lazy_load..... 0
refresh_period........ 500
n_caches.............. 10
n_reserved_blocks..... 5
always_check_erased... 0
data_bytes_per_chunk.. 2048
chunk_grp_bits........ 0
chunk_grp_size........ 1
n_erased_blocks....... 366
blocks_in_checkpt..... 0
n_tnodes.............. 749
n_obj................. 477
n_free_chunks......... 23579
n_page_writes......... 6092
n_page_reads.......... 11524
n_erasures............ 96
n_gc_copies........... 5490
all_gcs............... 1136
passive_gc_count...... 1136
oldest_dirty_gc_count. 95
n_gc_blocks........... 96
bg_gcs................ 96
n_retired_writes...... 0
n_retired_blocks...... 0
n_ecc_fixed........... 0
n_ecc_unfixed......... 0
n_tags_ecc_fixed...... 0
n_tags_ecc_unfixed.... 0
cache_hits............ 0
n_deleted_files....... 0
n_unlinked_files...... 289
refresh_count......... 1
n_bg_deletions........ 0
Device 2 "data"
start_block........... 0
end_block............. 927
total_bytes_per_chunk. 2048
use_nand_ecc.......... 1
no_tags_ecc........... 1
is_yaffs2............. 1
inband_tags........... 0
empty_lost_n_found.... 0
disable_lazy_load..... 0
refresh_period........ 500
n_caches.............. 10
n_reserved_blocks..... 5
always_check_erased... 0
data_bytes_per_chunk.. 2048
chunk_grp_bits........ 0
chunk_grp_size........ 1
n_erased_blocks....... 10
blocks_in_checkpt..... 0
n_tnodes.............. 4211
n_obj................. 24
n_free_chunks......... 658
n_page_writes......... 430
n_page_reads.......... 467
n_erasures............ 7
n_gc_copies........... 421
all_gcs............... 20
passive_gc_count...... 13
oldest_dirty_gc_count. 3
n_gc_blocks........... 6
bg_gcs................ 4
n_retired_writes...... 0
n_retired_blocks...... 0
n_ecc_fixed........... 0
n_ecc_unfixed......... 0
n_tags_ecc_fixed...... 0
n_tags_ecc_unfixed.... 0
cache_hits............ 0
n_deleted_files....... 0
n_unlinked_files...... 2
refresh_count......... 1
n_bg_deletions........ 0
#
The log and config files are stored in "data".
thanks!!
In general this could be a disk space problem (here, flash). First of all, check your flash space with df -h (or whatever commands you have; df is present in BusyBox). If the flash space is OK, especially on the partition your program writes to, the problem could be "inode" (directory) space: you can see inode usage with df -i. (A good link on this: https://wiki.gentoo.org/wiki/Knowledge_Base:No_space_left_on_device_while_there_is_plenty_of_space_available)
If none of these is the cause, I think you need to take a deeper look at your code, especially where it deals with disk I/O.
It is also worth checking memory and heap usage, and making sure your functions free everything they allocate.
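
The two df checks can also be done from inside the program at the moment the error occurs. Below is a minimal sketch using os.statvfs, which reports roughly what df and df -i show; the helper names free_space_report and write_checked are hypothetical, and how meaningful the inode figure is on a yaffs2 mount is an assumption that would need checking on the target.

```python
import errno
import os


def free_space_report(path):
    """Free bytes and free inodes of the filesystem that holds `path`."""
    st = os.statvfs(path)
    return {
        "free_bytes": st.f_bavail * st.f_frsize,  # space available to non-root
        "free_inodes": st.f_favail,               # inodes available to non-root
    }


def write_checked(path, data):
    """Write bytes and force them to storage.

    Returns None on success, or a free-space report if the write, fsync or
    close fails with ENOSPC (errno 28, as in the debug output above).
    """
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise
        return free_space_report(os.path.dirname(os.path.abspath(path)))
    return None
```

Logging the report at the moment of failure shows whether the filesystem really considered itself full at that point, which a later manual check may miss.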

How do I get the interrupt vector number on Linux?

When I run "cat /proc/interrupts", I can get the following:
CPU0 CPU1
0: 253 1878 IO-APIC-edge timer
1: 3 0 IO-APIC-edge i8042
7: 1 0 IO-APIC-edge parport0
8: 0 1 IO-APIC-edge rtc0
9: 0 0 IO-APIC-fasteoi acpi
12: 1 3 IO-APIC-edge i8042
16: 681584 60 IO-APIC-fasteoi uhci_hcd:usb3, nvidia
17: 0 0 IO-APIC-fasteoi uhci_hcd:usb4, uhci_hcd:usb7
18: 0 0 IO-APIC-fasteoi uhci_hcd:usb8
22: 2 1 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb5
23: 17 17 IO-APIC-fasteoi ehci_hcd:usb2, uhci_hcd:usb6
44: 146232 472747 PCI-MSI-edge ahci
45: 118 115 PCI-MSI-edge snd_hda_intel
46: 10038650 842 PCI-MSI-edge eth1
NMI: 44479 43798 Non-maskable interrupts
LOC: 19025635 29426776 Local timer interrupts
SPU: 0 0 Spurious interrupts
PMI: 44479 43798 Performance monitoring interrupts
IWI: 0 0 IRQ work interrupts
RES: 3442001789 3442627214 Rescheduling interrupts
CAL: 1406 1438 Function call interrupts
TLB: 781318 792403 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
MCE: 0 0 Machine check exceptions
MCP: 2063 2063 Machine check polls
ERR: 0
MIS: 0
How can I get the interrupt number of "NMI", "LOC", "SPU", "PMI", etc.?
On x86, NMIs are always on interrupt vector 2. The number is hard-coded, just as it is for common exceptions (division by 0, page fault, etc.). You can find this in the CPU documentation from Intel/AMD.
If the APIC is enabled (as is the case in the dump presented in the question), Spurious Interrupt's interrupt vector number can be obtained from APIC's SVR register. Again, see the same CPU documentation on that.
If the APIC isn't enabled and instead the PIC is being used, then Spurious Interrupts are delivered as IRQ7 (see the 8259A PIC chip spec for that). The BIOS programs the PIC in such a way that IRQ7 is interrupt vector 0Fh, but Windows and Linux change this mapping to avoid sharing the same interrupt vectors for IRQs and CPU exceptions. It seems like this mapping can't be queried from the PIC, but it's established via sending the Initialization Control Word 2 (ICW2) to the PIC. Here's the relevant piece of Linux code in init_8259A():
/* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 on x86-64,
to 0x20-0x27 on i386 */
outb_pic(IRQ0_VECTOR, PIC_MASTER_IMR);
That should answer the Spurious Interrupt vector part.
As for LOC and PMI, I think these are local APIC interrupts, and you can find their interrupt vectors from the APIC just as with the Spurious Interrupt above.
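
The arithmetic behind the two cases above can be written out as a short sketch. It assumes the standard layouts: in ICW2 the upper five bits give the vector base and the lower three bits are the IRQ line, and in the APIC SVR register the low eight bits hold the spurious vector while bit 8 is the software-enable flag. The function names and the SVR value used in the example are illustrative, not values read from real hardware.

```python
def pic_vector(irq, icw2_base):
    """Interrupt vector for a master-8259A IRQ line (0-7) given the ICW2 base."""
    if not 0 <= irq <= 7:
        raise ValueError("master PIC lines are IRQ0 to IRQ7")
    return (icw2_base & 0xF8) | irq


def spurious_vector_from_svr(svr):
    """Split an APIC SVR value into (spurious vector, APIC software-enabled)."""
    return svr & 0xFF, bool(svr & 0x100)


# IRQ7 under the three mappings mentioned above.
for name, base in (("BIOS", 0x08), ("Linux i386", 0x20), ("Linux x86-64", 0x30)):
    print(f"{name}: IRQ7 -> vector {pic_vector(7, base):#04x}")

# Illustrative SVR value only.
print(spurious_vector_from_svr(0x1FF))
```

The first loop reproduces the 0Fh, 0x27 and 0x37 vectors for IRQ7 described in the answer; reading the real SVR requires kernel-level access to the local APIC, which this sketch does not attempt.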
