In ss -s, what is the kernel counter actually counting?

While troubleshooting a problem on an OEL 7 server (3.10.0-1062.9.1.el7.x86_64), I ran the command
sudo ss -s
which gave me the following output:
Total: 601 (kernel 1071)
TCP: 8 (estab 2, closed 0, orphaned 0, synrecv 0, timewait 0/0), ports 0
Transport Total     IP        IPv6
*         1071      -         -
RAW       2         0         2
UDP       6         4         2
TCP       8         5         3
INET      16        9         7
FRAG      0         0         0
Doing an ss -a | wc -l came back with 225 entries.
This leads me to the question: what is kernel 1071 actually counting?
Looking through the various man pages did not provide an answer.
Using strace, I can see that ss reads:
/proc/net/sockstat
/proc/net/sockstat6
/proc/net/snmp
/proc/slabinfo
Looking through those files and the docs, it looks like the value is coming from /proc/slabinfo.
Grepping through /proc/slabinfo for 1071 came back with one entry:
sock_inode_cache 1071 1071 640 51 8 : tunables 0 0 0 : slabdata 21 21 0
Looking through the files and docs on sock_inode_cache has not helped so far. I am hoping someone here knows what the kernel counter is actually counting, or can point me in the right direction.

what is kernel 1071 actually counting?
sock_inode_cache is one of the Linux kernel's slab caches; the number shown is its active-object count, i.e. how many socket inodes currently exist in the kernel.
struct socket_alloc is the object type backing the sock_inode_cache slab cache. It contains both a struct socket and a struct inode, which is how sockets are connected to the VFS.
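As a quick cross-check (a sketch; reading /proc/slabinfo normally requires root), you can compare the two counters directly:
# Compare ss's "kernel" figure with the active-object count of the
# sock_inode_cache slab; the two numbers should line up.
sudo ss -s | awk '/^Total:/ { sub(/\)$/, "", $4); print "ss kernel:", $4 }'
sudo awk '$1 == "sock_inode_cache" { print "slab active:", $2 }' /proc/slabinfo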

Related

USB2CAN qdisc buffer full and no requeues

Hi, I'm trying to connect my Linux VM to a physical CAN bus.
The USB passthrough and the setup of the CAN interface work fine, but I have trouble sending messages from the VM.
First of all, here is my VM version and hardware:
user#usb-can:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
USB2CAN adapter and Documentation
http://www.inno-maker.com/product/usb-can/
https://github.com/INNO-MAKER/usb2can/blob/master/Document/USB2CAN%20UserManual%20v.1.8.pdf
If I send 15 CAN messages from my VM to my CAN interface with cansend can0 123#DEADBEEF, only the first 2-3 messages are registered and shown when I do a candump can0:
user#usb-can:~$ candump can0
can0 123 [4] DE AD BE EF
can0 123 [4] DE AD BE EF
can0 123 [4] DE AD BE EF
However, the remaining 12 are never sent, and when I send additional frames I get:
user#usb-can:~$ cansend can0 123#DEADBEEF
write: No buffer space available
So I found out that I could inspect the queue, and it showed this:
user#usb-can:~$ tc -s qdisc show dev can0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 144 bytes 9 pkt (dropped 3, overlimits 0 requeues 1)
backlog 176b 11p requeues 1
This locks up the whole device and I can't send anything because packets get dropped.
However, this was with nothing attached to the adapter, so I assume that is normal? Maybe somebody with knowledge of USB-to-CAN devices, or with their own device, can verify this.
Since there is no termination resistor, it would make sense that it's not working properly.
BUT when I connect a 120 Ohm termination resistor and use the jumper to enable the 120 Ohm resistor inside the adapter, I should have the 2 required termination resistors and thus be able to send the CAN frames. Yet I get the same error as before:
user#usb-can:~$ tc -s qdisc show dev can0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 80 bytes 5 pkt (dropped 0, overlimits 0 requeues 1)
backlog 176b 11p requeues 1
So in my mind the CAN Network looks like this:
                USB2CAN adapter
         ______________/\______________
        /                              \
                   _________
                  | usb2can |
                   ---------
                    |     |
CAN HIGH ______*____|     |_______________________
               |          |             |
          _________       |         _________
         | 120 Ohm |      |        | 120 Ohm |
          ---------       |         ---------
               |          |             |
CAN LOW  ______|__________*_____________|________
Do I need to add another device to the network to make it work, or should it already work like this?
I have already tried different termination resistors, in case one was broken, and I have also tried attaching an additional device, but no success yet.
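In case it helps while debugging: once the queue is stuck, you can clear the backlog by bouncing the interface. A minimal sketch, assuming the iproute2 tools and a 500 kbit/s bus (substitute your actual bitrate):
# Flush the stuck TX queue; this recovers from
# "write: No buffer space available" until the queue fills again.
sudo ip link set can0 down
sudo ip link set can0 up type can bitrate 500000
# A longer queue only postpones the error if frames are never
# acknowledged on the bus:
sudo ip link set can0 txqueuelen 1000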

How do I get 4 MB huge pages on Linux

According to:
$ ls -l /sys/kernel/mm/hugepages
drwxr-xr-x 2 root root 0 Dec 6 10:38 hugepages-1048576kB
drwxr-xr-x 2 root root 0 Dec 6 10:38 hugepages-2048kB
there is a choice of 2 MB and 1 GB huge pages on my system, which is running a 5.4.17 kernel.
However according to:
$ cpuid | grep -i tlb |sort| uniq
0x03: data TLB: 4K pages, 4-way, 64 entries
0x63: data TLB: 2M/4M pages, 4-way, 32 entries
0x76: instruction TLB: 2M/4M pages, fully, 8 entries
0xb5: instruction TLB: 4K, 8-way, 64 entries
0xc3: L2 TLB: 4K/2M pages, 6-way, 1536 entries
cache and TLB information (2):
data TLB: 1G pages, 4-way, 4 entries
L1 TLB/cache information: 2M/4M pages & L1 TLB (0x80000005/eax):
L1 TLB/cache information: 4K pages & L1 TLB (0x80000005/ebx):
L2 TLB/cache information: 2M/4M pages & L2 TLB (0x80000006/eax):
L2 TLB/cache information: 4K pages & L2 TLB (0x80000006/ebx):
the TLBs on my Skylake also support 4 MB pages. The same information can be found at
https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)
So the question is: can I really have 4 MB pages, and if so what do I need to do to set up my system to have that option?
The best answer is probably to install and/or use libhugetlbfs.
If it's already installed, you can check the status of huge pages in the OS with a command like:
$ hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152        0        1   257388        *
   4194304        0        0   128694
   8388608        0        0    64347
  16777216        0        0    32173
  33554432        0        0    16086
  67108864        0        0     8043
 134217728        0        0     4021
 268435456        0        0     2010
 536870912        0        0     1005
1073741824        0        0      502
2147483648        0        0      251
The same hugeadm command can also be run with sudo and various options to configure the available huge page pools. See the hugeadm man page for details.
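For instance, a hedged sketch of reserving pages in the 4 MB pool (the size:count syntax is from the hugeadm man page; this only succeeds if the kernel actually exposes a 4 MB pool, as in the listing above):
# Guarantee a minimum of 16 pages (64 MB) in the 4 MB pool,
# then verify the new pool state.
sudo hugeadm --pool-pages-min 4M:16
hugeadm --pool-list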

How to identify what is stalling the system in Linux?

I have an embedded system; when I do user I/O operations, the system just stalls and performs the action only after a long time. The system is quite complex and has many processes running. My question is: how can I identify what is making the system stall? It does literally nothing for 5 minutes, and after 5 minutes I see the outcome. I really don't know what is stalling the system; any input on how to debug this issue would help. I have run top on the system, but it doesn't point to any issue. As you can see here, jup_render is only taking 30% of the CPU, which is not enough to stall the system, so I am not sure whether top is useful here or not.
~ # top
top - 12:01:05 up 21 min, 1 user, load average: 1.49, 1.26, 0.87
Tasks: 116 total, 2 running, 114 sleeping, 0 stopped, 0 zombie
Cpu(s): 44.4%us, 13.9%sy, 0.0%ni, 40.3%id, 0.0%wa, 0.0%hi, 1.4%si, 0.0%st
Mem: 822572k total, 389640k used, 432932k free, 1980k buffers
Swap: 0k total, 0k used, 0k free, 227324k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
850 root 20 0 309m 32m 16m S 30 4.0 3:10.88 jup_render
870 root 20 0 221m 13m 10m S 27 1.7 2:28.78 jup_render
688 root 20 0 1156m 4092 3688 S 11 0.5 1:25.49 rxserver
9 root 20 0 0 0 0 S 2 0.0 0:06.81 ksoftirqd/1
16 root 20 0 0 0 0 S 1 0.0 0:06.87 ksoftirqd/3
9294 root 20 0 1904 616 508 R 1 0.1 0:00.10 top
812 root 20 0 865m 85m 46m S 1 10.7 1:21.17 lippo_main
13 root 20 0 0 0 0 S 1 0.0 0:06.59 ksoftirqd/2
800 root 20 0 223m 8316 6268 S 1 1.0 0:08.30 rat-cadaemon
3 root 20 0 0 0 0 S 1 0.0 0:05.94 ksoftirqd/0
1456 root 20 0 80060 10m 8208 S 1 1.2 0:04.82 jup_render
1330 root 20 0 202m 10m 8456 S 0 1.3 0:06.08 jup_render
8905 root 20 0 1868 556 424 S 0 0.1 0:02.91 dropbear
1561 root 20 0 80084 10m 8204 S 0 1.2 0:04.92 jup_render
753 root 20 0 61500 7376 6184 S 0 0.9 0:04.06 ale_app
1329 root 20 0 79908 9m 8208 S 0 1.2 0:04.77 jup_render
631 dbus 20 0 3248 1636 676 S 0 0.2 0:13.10 dbus-daemon
1654 root 20 0 80068 10m 8204 S 0 1.2 0:04.82 jup_render
760 root 20 0 116m 15m 12m S 0 1.9 0:10.19 jup_server
8 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/1:0
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
7 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
170 root 0 -20 0 0 0 S 0 0.0 0:00.00 kblockd
6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
167 root 20 0 0 0 0 S 0 0.0 0:00.00 sync_supers
281 root 0 -20 0 0 0 S 0 0.0 0:00.00 nfsiod
For an embedded system with many processes running, there can be a multitude of reasons; you may need to investigate from every angle.
Check the code for race conditions and deadlocks. The kernel might be busy-looping under a certain condition. Your application could be waiting on a select call, could have exhausted a CPU resource (ruled out here based on the top output you shared), or could be blocked on a read.
If you are performing blocking I/O operations, the process goes onto a wait queue and only moves back to the execution path (ready queue) after the request completes. That is, it is moved out of the scheduler's run queue and parked in a special state; it is put back on the run queue only when it wakes from its sleep, i.e. when the resource it was waiting for becomes available.
The immediate step is to try 'strace'. It intercepts and records the system calls made by a process and the signals the process receives, so it can show the order of events and the return/resumption path of every call. This can take you very close to the problem area.
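For example, to attach to one of the running processes (PID 850 is the jup_render instance from your top output; adjust as needed):
# Attach to the process, follow forks, timestamp every call, show the
# time spent inside each syscall, and log everything to a file; long
# gaps or long syscall times show where the process is blocked.
strace -f -tt -T -p 850 -o /tmp/strace.log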
There are many other handy tools that can be tried depending on your development environment/setup. Key tools are below:
'iotop' - provides a table of current I/O usage by processes or threads on the system, based on the I/O accounting information output by the kernel.
'LTTng' - makes tracing of race conditions and interrupt cascades possible. It is the successor to LTT and combines kprobes, tracepoint, and perf functionality.
'Ftrace' - the Linux kernel's internal tracer, with which you can analyze/debug latency and performance issues.
If your system is based on a TI processor, CCS (the trace analyzer) provides non-intrusive debug and analysis of system activity. Depending on your setup, you may need the relevant vendor tool.
A few more ideas:
The magic SysRq key is another option on Linux. If a driver is stuck, the SysRq p command can take you to the exact routine causing the problem; a sketch follows this list.
Profiling can tell you where exactly the kernel is spending its time. There are a couple of tools, such as readprofile and OProfile. OProfile is enabled by building the kernel with CONFIG_PROFILING and CONFIG_OPROFILE; another option is to rebuild the kernel with profiling enabled, boot with profile=2 on the command line, and read the profile counters with the readprofile utility.
mpstat can give 'the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request' via its %iowait column.
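A minimal SysRq sketch, assuming your kernel was built with CONFIG_MAGIC_SYSRQ (run as root):
# Enable all SysRq functions, then dump CPU registers ('p') and the
# list of blocked, uninterruptible tasks ('w') into the kernel log.
echo 1 > /proc/sys/kernel/sysrq
echo p > /proc/sysrq-trigger
echo w > /proc/sysrq-trigger
dmesg | tail -50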
You said you ran top. Did you find out which program gets the most CPU time, and what percentage it uses?
When you run top you should also see the overall CPU load percentages (and other relevant info) on the same screen, which you neither quoted nor commented on.
I advise you to include whatever you find interesting, relevant, or suspicious in top's output. If you have already done so, point it out in your question more distinctly, because right now it is not obvious what the maximum CPU load is.

CPU User time and System time on AIX

How can I get CPU user time and system time for each CPU on AIX?
I know I can get these values from cat /proc/stat on a Linux machine, and from pstat_getprocessor() on an HP-UX machine. Is there a way to get the same metric on an AIX machine?
$ cat /proc/stat
...
cpu 23697394 7969 2744135 4505191649 2958605 190 17883 0 0
cpu0 12511394 4575 1520243 2251753159 1480624 137 10580 0 0
cpu1 11186000 3394 1223891 2253438490 1477980 53 7302 0 0
...
mpstat provides these metrics; either parse its output or figure out how/where it finds them.
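A hedged parsing sketch, assuming AIX's default mpstat layout (one row per logical CPU; since column positions can differ between AIX releases, the us and sy columns are located by header name rather than by position):
# Take one 1-second sample and print per-CPU user/system percentages.
mpstat 1 1 | awk '
  $1 == "cpu" { for (i = 1; i <= NF; i++) { if ($i == "us") u = i; if ($i == "sy") s = i }; next }
  u && $1 ~ /^[0-9]+$/ { printf "cpu%s user=%s%% sys=%s%%\n", $1, $u, $s }'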

How do I know if my server has NUMA?

Coming from Java garbage collection tuning, I came across JVM settings for NUMA. Out of curiosity, I wanted to check whether my CentOS server has NUMA capabilities. Is there a *ix command or utility that could grab this info?
I'm no expert here, but here's something:
Box 1, no NUMA:
~$ dmesg | grep -i numa
[ 0.000000] No NUMA configuration found
Box 2, some NUMA:
~$ dmesg | grep -i numa
[ 0.000000] NUMA: Initialized distance table, cnt=8
[ 0.000000] NUMA: Node 4 [0,80000000) + [100000000,280000000) -> [0,280000000)
I think this previous question is similar: How to confirm NUMA?
In particular, you can review the NUMA man page here:
http://man7.org/linux/man-pages/man7/numa.7.html
And from there you'll see:
$ find /proc -name numa_maps
/proc/1/task/1/numa_maps
/proc/1/numa_maps
/proc/2/task/2/numa_maps
/proc/2/numa_maps
/proc/3/task/3/numa_maps
[etc if you have numa]
And you can get more detail like so:
$ grep NUMA=y /boot/config-`uname -r`
CONFIG_NUMA=y
CONFIG_K8_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_ACPI_NUMA=y
$ numactl --hardware
available: 2 nodes (0-1)
node 0 size: 18156 MB
node 0 free: 9053 MB
node 1 size: 18180 MB
node 1 free: 6853 MB
node distances:
node 0 1
0: 10 20
1: 20 10
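Once you have confirmed the topology, numactl can also bind a process to a node, which ties back to the JVM angle in the question (app.jar is a hypothetical placeholder; -XX:+UseNUMA is a real HotSpot flag):
# Run the JVM with its threads and allocations confined to node 0.
numactl --cpunodebind=0 --membind=0 java -XX:+UseNUMA -jar app.jar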
For Red Hat 4, 5, 6, and 7 systems, one can try the following to determine whether the NUMA configuration is disabled:
numactl --show does not show multiple nodes
# numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0
nodebind: 0
membind: 0
or numactl --hardware does not list multiple nodes
# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
node 0 size: 524163 MB
node 0 free: 505253 MB
node distances:
node 0
0: 10
You can also get this info from the lscpu command:
lscpu | grep -i numa
NUMA node(s): 2
NUMA node0 CPU(s): 0-19,40-59
NUMA node1 CPU(s): 20-39,60-79
You can also just query the information from /sys (this is what tools like numactl do underneath). As others have pointed out, dmesg can be unreliable here, since its buffer is limited and early boot messages may already have been overwritten.
To find out how many NUMA nodes are currently available, do:
cat /sys/devices/system/node/online
0-3
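Each node directory under /sys also exposes its CPUs and memory, for example:
# List the CPUs and the total/free memory belonging to NUMA node 0.
cat /sys/devices/system/node/node0/cpulist
grep -E 'MemTotal|MemFree' /sys/devices/system/node/node0/meminfo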
