Here is an output from top (sorted by %Mem):
Mem: 5796624k total, 4679932k used, 1116692k free, 317652k buffers
Swap: 0k total, 0k used, 0k free, 1734160k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13169 storm 20 0 3279m 344m 16m S 0.7 6.1 201:38.40 java
5463 storm 20 0 2694m 172m 14m S 0.0 3.0 72:38.49 java
5353 storm 20 0 2561m 155m 14m S 0.0 2.7 30:20.43 java
13102 app 20 0 3813m 80m 17m S 0.3 1.4 132:37.16 java
13147 storm 20 0 3876m 65m 16m S 0.0 1.2 23:21.73 java
3081 named 20 0 230m 16m 2652 S 0.0 0.3 1:22.81 named
29773 root 20 0 318m 10m 3576 S 0.0 0.2 5:59.41 logstash-forwar
5345 root 20 0 193m 10m 1552 S 0.0 0.2 12:24.21 supervisord
1048 root 20 0 249m 5200 1068 S 0.0 0.1 0:22.55 rsyslogd
21774 root 20 0 99968 3980 3032 S 0.0 0.1 0:00.00 sshd
3456 postfix 20 0 81108 3432 2556 S 0.0 0.1 0:02.83 qmgr
3453 root 20 0 80860 3416 2520 S 0.0 0.1 0:19.40 master
In GBs:
Mem: 5.8g total, 4.7g used, 1.1g free, 0.3g buffers
So free mem is 1.1 / 5.8 ~ 19%
Whereas if we add up the top %MEM values, the used memory comes to about 6.1 + 3.0 + 2.7 + 1.4 + 1.2 + 0.3 + ... ~ 16%, which means free should be about 84%.
Why don't the numbers match (19% vs 84%)?
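For reference, the per-process %MEM column can be summed with a one-liner like this (my own addition, not part of the original question):

ps -eo pmem --no-headers | awk '{ sum += $1 } END { printf "total %%MEM across all processes: %.1f%%\n", sum }'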
From the memory usage related lines in top:
Mem: 5796624k total, 4679932k used, 1116692k free, 317652k buffers
Swap: 0k total, 0k used, 0k free, 1734160k cached
Total memory equals the sum of used and free memory. Used, in turn, is the sum of memory "really used by applications" plus cached and buffers. So in your case it goes like this:
Mem = 5796624k = 4679932k (used) + 1116692k (free)
"Really used by applications" = used - (cached + buffers)
                              = 4679932k - (1734160k + 317652k)
                              = 2628120k
So total memory is 5.8g, of which 2.6g is really used by applications. Since 1.1g is free, that means 5.8g - (1.1g + 2.6g) = 2.1g of memory is used for buffers and cache, which improves performance. The moment an application needs part of that cached memory, it is given back to it immediately. That's why your computation of free memory as a percentage of total memory doesn't match what you expected: the per-process %MEM column only accounts for memory really used by applications, while the kernel's "used" figure also includes buffers and cache.
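The same arithmetic can be reproduced directly from /proc/meminfo (a sketch; the field names assume an older kernel without MemAvailable, matching the top output above):

awk '/^MemTotal:|^MemFree:|^Buffers:|^Cached:/ { m[$1] = $2 }
     END { print "really used by applications:", m["MemTotal:"] - m["MemFree:"] - m["Buffers:"] - m["Cached:"], "kB" }' /proc/meminfo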
I am checking the impact of Linux's sched_rt_runtime_us.
My understanding of Linux RT scheduling is that sched_rt_period_us defines the scheduling period for RT processes, and sched_rt_runtime_us defines how much time RT processes may run within that period.
On my Linux 4.18.20, kernel.sched_rt_period_us = 1000000 and kernel.sched_rt_runtime_us = 950000, so within each second RT processes may use 95% of the time and 5% is left for SCHED_OTHER processes.
By changing kernel.sched_rt_runtime_us, the CPU usage of the RT process shown in top should therefore be proportional to sched_rt_runtime_us / sched_rt_period_us.
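For reference, the two parameters can be inspected and changed with sysctl; the commands below are assumed, not quoted from the original post:

sysctl kernel.sched_rt_period_us kernel.sched_rt_runtime_us
sudo sysctl -w kernel.sched_rt_runtime_us=50000    # expecting ~5% of each 1 s period for the RT class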
But my testing does NOT give the expected results. The %CPU values I got are as follows:
kernel.sched_rt_runtime_us = 50000
2564 root rt 0 4516 748 684 R 19.9 0.0 0:37.82 testsched_top
kernel.sched_rt_runtime_us = 100000
2564 root rt 0 4516 748 684 R 40.5 0.0 0:23.16 testsched_top
kernel.sched_rt_runtime_us = 150000
2564 root rt 0 4516 748 684 R 60.1 0.0 0:53.29 testsched_top
kernel.sched_rt_runtime_us = 200000
2564 root rt 0 4516 748 684 R 80.1 0.0 1:24.96 testsched_top
The testsched_top program is a SCHED_FIFO process with priority 99, and it is running on an isolated CPU.
Cgroups are configured in grub.cfg with cgroup_disable=cpuset,cpu,cpuacct to disable the CPU-related controllers.
I don't know why this happens. Is there anything missing or wrong in my testing or in my understanding of Linux SCHED_FIFO scheduling?
N.B.: I am running this in an Ubuntu VM configured with 8 vCPUs, of which CPUs 4-7 are isolated to run RT processes. The host is an Intel x86_64 machine with 6 cores (12 threads), and there are NO other VMs running on the host. The testsched_top program was copied from https://viviendolared.blogspot.com/2017/03/death-by-real-time-scheduling.html?m=0; it sets priority 99 for SCHED_FIFO and loops indefinitely on one isolated CPU. I checked that isolated CPU's usage and got the results above.
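For anyone reproducing this without the blog's program, a roughly equivalent pinned SCHED_FIFO busy loop can be started with chrt and taskset (my own stand-in for testsched_top, not the original code):

sudo chrt -f 99 taskset -c 4 sh -c 'while :; do :; done'   # SCHED_FIFO prio 99, pinned to isolated CPU 4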
I think I got the answer; thanks to Rachid for the question.
In short, kernel.sched_rt_period_us covers the whole group of CPUs: it is effectively the sum of the per-CPU RT periods in that group.
For example, in my 8-vCPU VM configuration, CPUs 4-7 are isolated for running specific processes, so kernel.sched_rt_period_us is divided evenly among these 4 isolated CPUs: kernel.sched_rt_period_us / 4 = 250000 is the 100% CPU quota for each CPU in the isolated group. Setting kernel.sched_rt_runtime_us to 250000 therefore lets the SCHED_FIFO process take all of one CPU. Accordingly, a runtime of 25000 means 10% CPU usage on that CPU, 50000 means 20%, and so on.
I validated this with only CPU6 and CPU7 isolated: in that case a runtime of 500000 makes the CPU 100% used by the SCHED_FIFO process, and 250000 gives 50% CPU usage.
Since these two kernel parameters are global, the same logic applies elsewhere: if the SCHED_FIFO process is put on CPUs 0-5, then 1000000 / 6 ≈ 166000 is the 100% quota for each CPU and 83000 gives 50% CPU usage. I validated this as well.
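In other words, to hit a target per-CPU share you can work backwards from the group size; a small sketch of the arithmetic above (the variable names are mine):

PERIOD=1000000                                 # kernel.sched_rt_period_us
NCPUS=4                                        # CPUs in the group the SCHED_FIFO task runs on
TARGET=50                                      # desired %CPU on one of those CPUs
RUNTIME=$(( PERIOD / NCPUS * TARGET / 100 ))   # 125000 for the values above
sudo sysctl -w kernel.sched_rt_runtime_us=$RUNTIME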
Here is a snapshot of top:
%Cpu4 : 49.7 us, 0.0 sy, 0.0 ni, 50.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16422956 total, 14630144 free, 964880 used, 827932 buff/cache
KiB Swap: 1557568 total, 1557568 free, 0 used. 15245156 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3748 root rt 0 4516 764 700 R 49.5 0.0 30:21.03 testsched_top
This is an 8-core machine. The %Cpu(s) id value is 99.4, but one java process already shows 82.7% CPU usage. The top output is below:
top - 09:04:09 up 17:22, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 142 total, 1 running, 74 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.3 st
KiB Mem : 62876640 total, 9865752 free, 51971500 used, 1039388 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 10121552 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4859 root 20 0 50.1g 49.4g 144356 S 82.7 82.4 20:28.62 java
3847 root 20 0 6452 792 716 S 0.3 0.0 0:09.50 rngd
1 root 20 0 43724 5680 4196 S 0.0 0.0 0:02.30 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
The answer:
We have a CPU with 8 cores. From top we have:
process CPU usage = 82.7%
idle = 99.4% id
Let's calculate. 8 cores at full usage give 800%, so:
CPU usage % = (82.7 / 800) * 100% ≈ 10.3% (calculated)
CPU idle % = 100 - 10.3 = 89.7% (calculated)
89.7% is somewhat different from the reported 99.4%, but it gives you a flavour of what is going on.
I suppose your main confusion is about the 82.7%. It does not mean 82.7% usage of the whole CPU; it would only mean that if your CPU had one core. For multicore CPUs, 100% in the process list means that roughly one core is 100% busy, not the whole CPU.
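To see this directly, you can switch top to a per-core display or use mpstat from the sysstat package (my suggestion, not part of the original answer):

top               # then press '1' to show one %Cpu line per core
mpstat -P ALL 1   # per-core utilization, updated every second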
It depends on your usage: check the services or apps you are running (Eclipse, Android Studio, JBoss Server, etc.).
Normally the CPU distributes processes across its cores for multi-tasking, so a specific process can take up a large part of one of the cores while the other cores, and the CPU as a whole, are not under heavy load.
How come the %CPU of a process is higher than the overall CPU usage percentage?
top - 19:42:24 up 68 days, 19:49, 6 users, load average: 439.72, 540.53, 631.13
Tasks: 354 total, 3 running, 350 sleeping, 0 stopped, 1 zombie
Cpu(s): 21.5%us, 46.8%sy, 0.0%ni, 17.4%id, 0.0%wa, 0.1%hi, 14.2%si, 0.0%st
Mem: 65973304k total, 50278472k used, 15694832k free, 28749456k buffers
Swap: 19455996k total, 93436k used, 19362560k free, 14769728k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4425 ladm 20 0 63.6g 211m 1020 S 425.7 0.3 433898:26 zzz
28749 isdm 20 0 167g 679m 7928 S 223.7 1.1 2526:40 xxx
28682 iadm 20 0 167g 1.1g 7928 S 212.8 1.8 2509:08 ccc
28834 iladm 20 0 11.8g 377m 7968 S 136.3 0.6 850:25.78 vvv
7776 root 20 0 237m 139m 11m S 3.3 0.2 658:24.58 bbbb
45 root 20 0 0 0 0 R 1.1 0.0 1313:36 nnnn/10
1313 isom 20 0 103m 712 504 S 1.1 0.0 0:00.20 mmmm.sh
4240 ladm 20 0 338m 18m 576 S 1.1 0.0 558:21.33 memcached
32341 root 20 0 15172 1440 916 R 1.1 0.0 0:00.04 top
The machine in question is using 100% of the cores available.
In the situation presented, the PC or server has more than one core, so a process can use more than one. That's why one process can show 425.7%: it is using the equivalent of more than four cores to do its job.
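If you want to see how that 425.7% is spread over threads, top's per-thread view can be pointed at the PID from the listing above (the command is my addition):

top -H -p 4425    # one line per thread of the zzz process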
On Red Hat Linux 6.2 I'm running free -m and it shows nearly all 8 GB used:
total used free shared buffers cached
Mem: 7989 7734 254 0 28 7128
-/+ buffers/cache: 578 7411
Swap: 4150 0 4150
But at the same time in top -M I cannot see any processes using all this memory:
top - 16:03:34 up 4:10, 2 users, load average: 0.08, 0.04, 0.01
Tasks: 169 total, 1 running, 163 sleeping, 5 stopped, 0 zombie
Cpu(s): 0.7%us, 0.3%sy, 0.0%ni, 98.6%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7989.539M total, 7721.570M used, 267.969M free, 28.633M buffers
Swap: 4150.992M total, 0.000k used, 4150.992M free, 7115.312M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1863 sroot 20 0 398m 24m 9.8m S 0.3 0.3 3:12.87 App1
1 sroot 20 0 2864 1392 1180 S 0.0 0.0 0:00.91 init
2 sroot 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 sroot RT 0 0 0 0 S 0.0 0.0 0:00.07 migration/0
4 sroot 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 sroot RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 sroot RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
7 sroot RT 0 0 0 0 S 0.0 0.0 0:00.08 migration/1
8 sroot RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
I also tried this ps mem script but it only shows about 400 MB of memory being used.
Don't look at the "Mem" line, look at the one below it.
The Linux kernel consumes as much memory as it can to provide the I/O cache (and other non-critical buffers, but the cache is going to be most of this usage). This memory is relinquished to processes when they request it. The "-/+ buffers/cache" line is showing you the adjusted values after the I/O cache is accounted for, that is, the amount of memory used by processes and the amount available to processes (in this case, 578MB used and 7411MB free).
The difference of used memory between the "Mem" and "-/+ buffers/cache" line shows you how much is in use by the kernel for the purposes of caching: 7734MB - 578MB = 7156MB in the I/O cache. If processes need this memory, the kernel will simply shrink the size of the I/O cache.
Also, looking at the free output again:
total used free shared buffers cached
Mem: 7989 7734 254 0 28 7128
-/+ buffers/cache: 578 7411
If we add cached (7128) + buffers (28) + free (254), we get approximately the free value from the second line (7411):
7128 + 28 + 254 = 7410
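The same arithmetic can be done straight from the Mem line (a sketch that assumes the older free output format shown above):

free -m | awk '/^Mem:/ { print "used by processes:", $3 - $6 - $7, "MB"; print "available to processes:", $4 + $6 + $7, "MB" }'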
If the cached value is small, try this command:
ps aux --sort -rss
Three hours ago the server memory usage blew up to 105% from around 60%. I am using a dedicated MediaTemple server with 512 MB of RAM. Should I be worried? Why would something like this happen?
Any help would be greatly appreciated.
Tasks: 38 total, 2 running, 36 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 946344k total, 550344k used, 396000k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 15 0 10364 740 620 S 0.0 0.1 0:38.54 init
3212 root 18 0 96620 4068 3200 R 0.0 0.4 0:00.21 sshd
3214 root 15 0 12080 1728 1316 S 0.0 0.2 0:00.05 bash
3267 apache 15 0 412m 43m 4396 S 0.0 4.7 0:03.88 httpd
3290 apache 15 0 412m 43m 4340 S 0.0 4.7 0:02.98 httpd
3348 root 15 0 114m 52m 2112 S 0.0 5.6 0:48.94 spamd
3349 popuser 15 0 114m 50m 972 S 0.0 5.5 0:00.06 spamd
3455 sw-cp-se 18 0 60116 3216 1408 S 0.0 0.3 0:00.12 sw-cp-serverd
3525 admin 18 0 81572 4604 2912 S 0.0 0.5 0:01.74 in.proftpd
3585 apache 18 0 379m 15m 3356 S 0.0 1.7 0:00.01 httpd
3589 root 15 0 12624 1224 956 R 0.0 0.1 0:00.00 top
7397 root 15 0 21660 944 712 S 0.0 0.1 0:00.58 xinetd
9500 named 16 0 301m 5284 1968 S 0.0 0.6 0:00.43 named
9575 root 15 -4 12632 680 356 S 0.0 0.1 0:00.00 udevd
9788 root 25 0 13184 608 472 S 0.0 0.1 0:00.00 couriertcpd
9790 root 25 0 3672 380 312 S 0.0 0.0 0:00.00 courierlogger
9798 root 25 0 13184 608 472 S 0.0 0.1 0:00.00 couriertcpd
First, analyze the process that is taking that much CPU with the same top command. If the process is a multi-threaded program, use the following top command:
top -H -p "pid of that process"
It will help you find whichever thread is taking a lot of CPU, for further diagnosis.
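For example, with the spamd process from the listing above (the PID is taken from that output):

top -H -p 3348    # one line per thread of that spamd process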