What does the sbwait process state imply in FreeBSD top?

On a FreeBSD system, in the top output below, the mysql daemon is in state "sbwait". What does this imply?
last pid: 12833; load averages: 0.18, 0.26, 0.25 up 3+17:40:21 04:58:46
26 processes: 1 running, 25 sleeping
CPU: 16.5% user, 0.0% nice, 12.8% system, 6.8% interrupt, 63.9% idle
Mem: 184M Active, 137M Inact, 88M Wired, 6308K Cache, 53M Buf, 7192K Free
Swap: 4096M Total, 420K Used, 4095M Free
PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
1772 mysql 17 30 0 224M 165M sbwait 511:31 14.79% mysqld
12833 root 1 20 0 9944K 1488K RUN 0:00 0.10% top
1472 root 1 20 0 9612K 828K select 5:07 0.00% powerd
1465 root 1 20 0 11296K 1644K select 2:01 0.00% ntpd
1804 root 1 20 0 11324K 2140K select 0:37 0.00% sendmail
1403 root 1 20 0 12200K 2320K select 0:27 0.00% nmbd
1814 root 1 20 0 9644K 1004K nanslp 0:08 0.00% cron
1407 root 1 20 0 20756K 3756K select 0:06 0.00% smbd
1273 root 1 20 0 9612K 1036K select 0:04 0.00% syslogd
11937 root 1 20 0 15788K 3124K select 0:03 0.00% sshd
1808 smmsp 1 20 0 11324K 1864K pause 0:01 0.00% sendmail
1438 root 1 20 0 20840K 3696K select 0:00 0.00% smbd
1111 _dhcp 1 20 0 9540K 1136K select 0:00 0.00% dhclient
11941 root 1 20 0 10940K 2024K pause 0:00 0.00% csh
1517 mysql 1 52 0 9924K 1072K wait 0:00 0.00% sh
1073 root 1 47 0 9540K 1012K select 0:00 0.00% dhclient
1797 root 1 20 0 13064K 1892K select 0:00 0.00% sshd

It means that one of the threads in the process is waiting for data to arrive on (or drain from) a socket buffer. The default mode of top is not very informative for a threaded process like mysqld; you have 17 mysqld threads, but in this mode top can only show you the state of one of them. Use the '-H' flag to top (or the 'H' keyboard command while in top) to see the individual threads separately, each with its own distinct state.
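For example, a per-thread snapshot can be taken straight from the command line (the -U filter is optional and just narrows the display to processes owned by the mysql user):

# one line per thread instead of one line per process;
# pressing 'H' inside a running top toggles the same view
top -H -U mysql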

Use the source:
find /usr/src -type f -exec grep -H sbwait {} \+
That will give you some files to look at.
Look at /usr/src/sys/kern/uipc_sockbuf.c:
/*
 * Wait for data to arrive at/drain from a socket buffer.
 */
int
sbwait(struct sockbuf *sb)
{
        SOCKBUF_LOCK_ASSERT(sb);

        sb->sb_flags |= SB_WAIT;
        return (msleep(&sb->sb_cc, &sb->sb_mtx,
            (sb->sb_flags & SB_NOINTR) ? PSOCK : PSOCK | PCATCH, "sbwait",
            sb->sb_timeo));
}
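The "sbwait" string passed to msleep() is the wait-channel message, and that is exactly what top prints in the STATE column while a thread sleeps there. To see which of mysqld's threads are currently parked on that wait channel without running top interactively, procstat can list per-thread states and wait channels, e.g.:

# one line per thread of PID 1772, including its STATE and WCHAN
procstat -t 1772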

Related

How to get the number of swapped-out LWPs in Linux

I use vmstat | head -3 on Solaris to check the process situation and get this:
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s4 s5 -- -- in sy cs us sy id
0 0 0 100548720 6357568 219 3300 0 0 0 0 0 8 0 0 0 2193 6771 2256 0 1 98
The w column means the number of swapped-out lightweight processes (LWPs) that are waiting for processing resources to finish. We use this data to monitor system conditions.
But running the command on Red Hat gives this result:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu -----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 784968 348724 28 44781148 0 1 16 49 7 9 1 0 98 1 0
The w column (column 3) is missing, so I tried to find a replacement in other commands.
Using the top command:
top - 20:28:12 up 5 days, 22:24, 7 users, load average: 0.02, 0.03, 0.06
Tasks: 725 total, 1 running, 724 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.3 sy, 0.0 ni, 99.0 id, 0.3 wa, 0.1 hi, 0.1 si, 0.0 st
MiB Mem : 48087.3 total, 301.6 free, 4029.0 used, 43756.7 buff/cache
MiB Swap: 16384.0 total, 15617.9 free, 766.1 used. 3496.9 avail Mem
"1 running" is not the data I need.
Using the ps -efL | head -3 command:
UID PID PPID LWP C NLWP STIME TTY TIME CMD
root 1 0 1 0 1 Jun 22 ? 00:00:22 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
root 2 0 2 0 1 Jun 22 ? 00:00:00 [kthreadd]
The NLWP column here is also not the data I need.
Is there any way to get the same data as the w column of vmstat on Solaris?
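One rough approximation, for what it is worth: Linux pages memory rather than swapping out whole LWPs, so there is no direct equivalent of the Solaris w counter. The closest per-process signal is probably the VmSwap field in /proc/<pid>/status (present on kernels 2.6.34 and later), which lets you count processes that currently have pages in swap:

# count processes that currently have something swapped out
awk '/^VmSwap:/ && $2 > 0 {n++} END {print n+0}' /proc/[0-9]*/status 2>/dev/null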

Node Application Spawning Multiple Child Processes & Consuming All CPU Cores

Problem
I am running a Node.js application that does frequent insert operations in MongoDB, and the problem is that it spawns multiple child processes even though I am trying to run it in fork mode.
Initially I suspected PM2 was causing the problem, but I tried the following ways to start the application and can still see multiple child processes being spawned:
PM2 in fork mode
PM2 in cluster mode with 1 instance
VSCode Debugger
Terminal command (node app.js)
Stats
I managed to collect various stats to show that (I have edited out the machine-specific information):
The Node app is spawning multiple processes
These processes are consuming all of the CPU cores
Process Status
command - ps -aefL | grep "29475"
machine+ 8750 28307 8750 0 1 12:10 pts/24 00:00:00 grep --color=auto 29475
machine+ 29475 18579 29475 0 10 10:58 ? 00:00:05 node /app/api.js
machine+ 29475 18579 29476 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29477 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29478 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29479 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29481 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29501 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29502 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29503 0 10 10:58 ? 00:00:00 node /app/api.js
machine+ 29475 18579 29504 0 10 10:58 ? 00:00:00 node /app/api.js
Virtual Memory Statistics
Command - vmstat 2 100
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
11 0 0 603060 818492 3718660 0 0 16 40 73 59 7 28 65 1 0
6 0 0 601664 818492 3718696 0 0 0 130 3383 6112 76 24 0 0 0
6 0 0 611208 818492 3711916 0 0 0 22 3472 6378 77 23 0 0 0
7 0 0 610932 818492 3711920 0 0 0 16 3932 6441 77 24 0 0 0
5 0 0 617304 818492 3711980 0 0 0 100 3480 6151 72 25 2 1 0
6 0 0 612584 818496 3718788 0 0 0 102 3634 7433 67 25 6 1 0
8 0 0 612128 818496 3718788 0 0 0 18 3318 5584 77 23 0 0 0
8 0 0 617552 818496 3713300 0 0 0 0 3468 6787 78 22 0 0 0
7 0 0 618096 818496 3712132 0 0 0 38 3743 6913 76 23 0 0 0
12 0 0 618040 818496 3712528 0 0 0 146 3852 7434 76 24 0 0 0
System Resource Usage Status
Command - top -p 5733 -p 29475 (mongod & node app processes)
top - 12:30:06 up 1 day, 3:21, 1 user, load average: 8.91, 4.81, 3.31
Tasks: 2 total, 0 running, 2 sleeping, 0 stopped, 0 zombie
%Cpu(s): 72.1 us, 27.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 16336020 total, 686496 free, 12203100 used, 3446424 buff/cache
KiB Swap: 16686076 total, 16686076 free, 0 used. 2746100 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5733 root 20 0 1296480 366432 29724 S 204.0 2.2 63:07.65 mongod
29475 machine+ 20 0 1351412 118900 30960 S 8.8 0.7 0:10.82 node /app/api.js
System Monitor (screenshot not included)
Findings
If I stop the running Node service:
CPU usage goes down
The child processes disappear
Why is the Node application spawning so many processes?
Why does the count of child processes always come out to be 10?
Let me know if further information is required.
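Note that ps -aefL prints one line per light-weight process (thread), so the lines sharing PID 29475 above are most likely threads of a single node process (the Node runtime itself keeps a handful of worker threads) rather than separate child processes. A quick way to compare the process count with the thread count (the bracketed grep pattern just stops grep from matching itself):

# one line per process, with its thread count in the NLWP column
ps -e -o pid,ppid,nlwp,cmd | grep [n]ode

# one line per thread (LWP) of the same processes
ps -efL | grep [n]ode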

Top Command: How come the CPU% of a process is higher than the overall CPU usage percentage?

How come the CPU% of a single process can be higher than the overall CPU usage percentage?
top - 19:42:24 up 68 days, 19:49, 6 users, load average: 439.72, 540.53, 631.13
Tasks: 354 total, 3 running, 350 sleeping, 0 stopped, 1 zombie
Cpu(s): 21.5%us, 46.8%sy, 0.0%ni, 17.4%id, 0.0%wa, 0.1%hi, 14.2%si, 0.0%st
Mem: 65973304k total, 50278472k used, 15694832k free, 28749456k buffers
Swap: 19455996k total, 93436k used, 19362560k free, 14769728k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4425 ladm 20 0 63.6g 211m 1020 S **425.7** 0.3 433898:26 zzz
28749 isdm 20 0 167g 679m 7928 S 223.7 1.1 2526:40 xxx
28682 iadm 20 0 167g 1.1g 7928 S 212.8 1.8 2509:08 ccc
28834 iladm 20 0 11.8g 377m 7968 S 136.3 0.6 850:25.78 vvv
7776 root 20 0 237m 139m 11m S 3.3 0.2 658:24.58 bbbb
45 root 20 0 0 0 0 R 1.1 0.0 1313:36 nnnn/10
1313 isom 20 0 103m 712 504 S 1.1 0.0 0:00.20 mmmm.sh
4240 ladm 20 0 338m 18m 576 S 1.1 0.0 558:21.33 memcached
32341 root 20 0 15172 1440 916 R 1.1 0.0 0:00.04 top
The machine in question is using 100% of the cores available.
The Cpu(s) summary line at the top is averaged over all cores, while the per-process %CPU column is measured against a single core, so on a multi-core machine a single process can exceed 100%. That's why one process can show 425.7%: it is keeping more than four cores busy.
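A quick way to confirm this on the machine itself:

# number of logical cores the load is spread over
nproc
grep -c ^processor /proc/cpuinfo

# per-core breakdown: press '1' inside top, or use mpstat from the sysstat package
mpstat -P ALL 1 1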

Opening an existing process

I am using Eclipse on Linux through a remote connection (xrdp). My internet connection dropped, so I got disconnected from the server while Eclipse was running.
Now I have logged in again, and when I run the "top" command I can see that Eclipse is still running under my user name.
Is there some way I can bring that process back into my view (I do not want to kill it because I am in the middle of checking in a large swath of code)? It doesn't show up on the bottom panel after I logged in again.
Here is the "top" output:
/home/mclouti% top
top - 08:32:31 up 43 days, 13:06, 29 users, load average: 0.56, 0.79, 0.82
Tasks: 447 total, 1 running, 446 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.0%us, 0.7%sy, 0.0%ni, 92.1%id, 1.1%wa, 0.1%hi, 0.1%si, 0.0%st
Mem: 3107364k total, 2975852k used, 131512k free, 35756k buffers
Swap: 2031608k total, 59860k used, 1971748k free, 817816k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13415 mclouti 15 0 964m 333m 31m S 21.2 11.0 83:12.96 eclipse
16040 mclouti 15 0 2608 1348 888 R 0.7 0.0 0:00.12 top
31395 mclouti 15 0 29072 20m 8524 S 0.7 0.7 611:08.08 Xvnc
2583 root 20 0 898m 2652 1056 S 0.3 0.1 139:26.82 automount
28990 postgres 15 0 13564 868 304 S 0.3 0.0 26:33.36 postgres
28995 postgres 16 0 13808 1248 300 S 0.3 0.0 6:54.95 postgres
31440 mclouti 15 0 3072 1592 1036 S 0.3 0.1 6:01.54 gam_server
1 root 15 0 2072 524 496 S 0.0 0.0 0:03.00 init
2 root RT -5 0 0 0 S 0.0 0.0 0:04.53 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.04 ksoftirqd/0
4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
5 root RT -5 0 0 0 S 0.0 0.0 0:01.72 migration/1
6 root 34 19 0 0 0 S 0.0 0.0 0:00.07 ksoftirqd/1
7 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
8 root RT -5 0 0 0 S 0.0 0.0 0:04.33 migration/2
9 root 34 19 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/2
It is a long shot, but you could try this little program from this thread:

#include <stdio.h>
#include <stdlib.h>
#include <X11/Xlib.h>

/* Raise and focus the X11 window whose id (in hex) is given on the command line. */
int main(int argc, char **argv)
{
    if (argc != 2) {
        printf("Usage:\n\ttotop <window id>\n");
        return 1;
    }

    Display *dsp = XOpenDisplay(NULL);
    if (dsp == NULL) {
        fprintf(stderr, "Cannot open display\n");
        return 1;
    }

    long id = strtol(argv[1], NULL, 16);  /* window id, e.g. from xwininfo */
    XRaiseWindow(dsp, id);
    XSetInputFocus(dsp, id, RevertToNone, CurrentTime);
    XCloseDisplay(dsp);
    return 0;
}
You can compile it with:
$ c++ totop.cpp -L/usr/X11R6/lib -lX11 -o totop
This assumes you saved it as "totop.cpp".
It has a problem I do not know how to fix: if the window is on another virtual desktop, this program doesn't work.
That raises another question: how do you send a window to the current desktop?
You can get the window id using xwininfo.
A little script that uses this program to launch or raise Eclipse:
#!/bin/bash
if ps -A | grep eclipse; then   # if Eclipse is already launched
    id=$(xwininfo -name "Eclipse" | grep id: | awk "{ print \$4 }")
    totop $id
else                            # launch Eclipse
    eclipse
fi
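As an alternative, if wmctrl happens to be installed it can both switch to the desktop the window lives on and raise it, which sidesteps the virtual-desktop limitation mentioned above:

# activate (switch to and raise) the first window whose title matches "Eclipse"
wmctrl -a "Eclipse"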

Can I measure memory taken by mod_perl?

Problem: my mod_perl leaks and I cannot control it.
I run a mod_perl script under Ubuntu (production code).
Usually there are 8-10 script instances running concurrently.
According to the Unix "top" utility, each instance takes 55M of memory.
55M is a lot, but I was told here that most of this memory is shared.
The memory is leaking.
There is 512M of RAM on the server.
There is a significant decrease in free memory within 24 hours after a reboot.
Test: free memory on the system while 10 scripts are running:
- after reboot: 270M
- 24 hours after reboot: 50M
After 24 hours the memory taken by each script is roughly the same, 55M (according to the "top" utility).
I don't understand where the memory leaks out, and I don't know how I can find the leaks.
I share memory: I preload all the modules required by the script in startup.pl.
One more test.
A very simple mod_perl script ("Hello world!") takes 52M (according to "top").
According to "Practical mod_perl" I can use GTop utility to measure the real memory taken by mod_perl.
I have made a very simple script that measures the memory with GTop.
It shows there are 54M real memory taken by a very simple perl script!
54 Megabytes by "Hello world"?!!!
proc-mem-size: 59,707392
proc-mem-share: 52,59264
diff: 54,448128
There must be something wrong with the way I measure mod_perl memory.
Help, please!
This problem has been driving me mad for several days.
These are snapshots of the "top" output right after reboot and about 24 hours later.
The processes are sorted by memory.
---- RIGHT AFTER REBOOT ----
top - 10:25:24 up 55 min, 2 users, load average: 0.10, 0.07, 0.07
Tasks: 59 total, 3 running, 56 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 97.3%id, 0.7%wa, 0.0%hi, 0.0%si, 2.0%st
Mem: 524456k total, 269300k used, 255156k free, 12024k buffers
Swap: 0k total, 0k used, 0k free, 71276k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2307 www-data 15 0 58500 27m 5144 S 0.0 5.3 0:02.02 apache2
2301 www-data 15 0 58492 27m 4992 S 0.0 5.3 0:02.09 apache2
2302 www-data 15 0 57936 26m 4960 R 0.0 5.2 0:01.74 apache2
2895 www-data 15 0 57812 26m 5048 S 0.0 5.2 0:00.98 apache2
2903 www-data 15 0 56944 26m 4792 S 0.0 5.1 0:01.12 apache2
2886 www-data 15 0 56860 26m 4784 S 0.0 5.1 0:01.20 apache2
2896 www-data 15 0 56520 26m 4804 S 0.0 5.1 0:00.85 apache2
2911 www-data 15 0 56404 25m 4768 S 0.0 5.1 0:00.87 apache2
2901 www-data 15 0 56520 25m 4744 S 0.0 5.1 0:00.84 apache2
2893 www-data 15 0 56608 25m 4740 S 0.0 5.1 0:00.73 apache2
2277 root 15 0 51504 22m 6332 S 0.0 4.5 0:01.02 apache2
2056 mysql 18 0 98628 21m 5164 S 0.0 4.2 0:00.64 mysqld
3162 root 15 0 6356 3660 1276 S 0.0 0.7 0:00.00 vi
2622 root 15 0 8584 2980 2392 R 0.0 0.6 0:00.07 sshd
3083 root 15 0 8448 2968 2392 S 0.0 0.6 0:00.06 sshd
3164 par 15 0 5964 2828 1868 S 0.0 0.5 0:00.05 proftpd
1 root 18 0 3060 1900 576 S 0.0 0.4 0:00.00 init
2690 root 17 0 4272 1844 1416 S 0.0 0.4 0:00.00 bash
3151 root 15 0 4272 1844 1416 S 0.0 0.4 0:00.00 bash
2177 root 15 0 8772 1640 520 S 0.0 0.3 0:00.00 sendmail-mta
2220 proftpd 15 0 5276 1448 628 S 0.0 0.3 0:00.00 proftpd
2701 root 15 0 2420 1120 876 R 0.0 0.2 0:00.09 top
1966 root 18 0 5396 1084 692 S 0.0 0.2 0:00.00 sshd
---- ROUGHLY IN 24 HOURS AFTER REBOOT
top - 17:45:38 up 23:39, 1 user, load average: 0.02, 0.09, 0.11
Tasks: 55 total, 2 running, 53 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 524456k total, 457660k used, 66796k free, 127780k buffers
Swap: 0k total, 0k used, 0k free, 114620k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16248 www-data 15 0 63712 35m 6668 S 0.0 6.8 0:23.79 apache2
19417 www-data 15 0 60396 31m 6472 S 0.0 6.2 0:10.95 apache2
19419 www-data 15 0 60276 31m 6376 S 0.0 6.1 0:11.71 apache2
19321 www-data 15 0 60480 29m 4888 S 0.0 5.8 0:11.51 apache2
21241 www-data 15 0 58632 29m 6260 S 0.0 5.8 0:05.18 apache2
22063 www-data 15 0 57400 28m 6396 S 0.0 5.6 0:02.05 apache2
21240 www-data 15 0 58520 27m 4856 S 0.0 5.5 0:04.60 apache2
21236 www-data 15 0 58244 27m 4868 S 0.0 5.4 0:05.24 apache2
22499 www-data 15 0 56736 26m 4776 S 0.0 5.1 0:00.70 apache2
2055 mysql 15 0 100m 25m 5656 S 0.0 5.0 0:20.95 mysqld
2277 root 18 0 51500 22m 6332 S 0.0 4.5 0:01.07 apache2
22686 www-data 15 0 53004 21m 4092 S 0.0 4.3 0:00.21 apache2
22689 root 15 0 8584 2980 2392 R 0.0 0.6 0:00.06 sshd
2176 root 15 0 8768 1928 736 S 0.0 0.4 0:00.00 sendmail-mta
1 root 18 0 3064 1900 576 S 0.0 0.4 0:00.02 init
22757 root 15 0 4268 1844 1416 S 0.0 0.4 0:00.00 bash
2220 proftpd 18 0 5276 1448 628 S 0.0 0.3 0:00.00 proftpd
22768 root 15 0 2424 1100 876 R 0.0 0.2 0:00.00 top
1965 root 15 0 5400 1088 692 S 0.0 0.2 0:00.00 sshd
2258 root 18 0 3416 1036 820 S 0.0 0.2 0:00.01 cron
1928 klog 25 0 2248 1008 420 S 0.0 0.2 0:00.04 klogd
1946 messageb 19 0 2648 804 596 S 0.0 0.2 0:01.63 dbus-daemon
1908 syslog 18 0 2016 716 556 S 0.0 0.1 0:00.17 syslogd
It doesn't actually look like the number of apache/mod_perl processes, or the memory they use, has changed much between the two reports you posted. What has changed is the header: "buffers" grows from roughly 12M right after reboot to roughly 128M after 24 hours, and "cached" grows from roughly 71M to roughly 115M. I am going to go out on a limb and guess that this is where your memory is going - Linux is using it for caching file I/O. You can think of the file I/O cache as essentially free memory, since Linux will make that memory available if processes need it.
You can also check that this is what's going on by performing
sync; echo 3 > /proc/sys/vm/drop_caches
as root to cause the memory in use by the caches to be released, and confirming that this causes the amount of free memory reported to revert to initial values.
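A gentler check that doesn't throw the caches away is to let free subtract buffers and cache for you; older procps versions print this as the "-/+ buffers/cache" line, newer ones as the "available" column:

# memory actually available to applications, with buffers/cache treated as reclaimable
free -m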
