Total CPU usage - multicore system - linux

I am using Xen, and with xentop I get the total CPU usage as a percentage:
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
VM1 -----r 25724 299.4 3025244 12.0 20975616 83.4 12 1 14970253 27308358 1 3 146585 92257 10835706 9976308 0
As you can see above, the CPU usage is 299%, but how can I get the total CPU usage of a VM?
top doesn't show me the total usage.

We usually see 100% CPU per core, so I guess there are at least 3 cores/CPUs.
Try this to count the cores:
grep processor /proc/cpuinfo | wc -l
299% is the total CPU usage.
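If you want to express that figure as a share of the VM's own capacity non-interactively, here is a minimal sketch (assuming xentop's batch-mode flags -b and -i and the column order shown above, where field 4 is CPU(%) and field 9 is VCPUS; VM1 is the guest name from the question):
# Hypothetical sketch: one batch iteration of xentop, normalized by the VM's vCPU count
xentop -b -i 1 | awk '$1 == "VM1" { printf "%.1f%% of the VM capacity\n", $4 / $9 }'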
sar and mpstat are often used to display the CPU usage of a system. Check that the sysstat package is installed and display the total CPU usage with:
$ mpstat 1 1
Linux 2.6.32-5-amd64 (debian) 05/01/2016 _x86_64_ (8 CPU)
07:48:51 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
07:48:52 PM all 0.12 0.00 0.50 0.00 0.00 0.00 0.00 0.00 99.38
Average: all 0.12 0.00 0.50 0.00 0.00 0.00 0.00 0.00 99.38
If you agree that CPU utilisation is (100 - %IDLE):
$ mpstat 1 1 | awk '/^Average/ {print 100-$NF,"%"}'
0.52 %

Related

Get CPU usage for individual cores in mpstat

I've been asked to grab the CPU usage for individual cores using mpstat. I can get all the information I need for an individual CPU like so:
mpstat -P 0
Which gives the following output:
Linux 3.10.0-957.21.3.el7.x86_64 (cpu_devel) 03/16/2021 _x86_64_ (48 CPU)
09:59:32 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
09:59:32 AM 0 0.05 0.00 0.05 0.00 0.00 0.01 0.00 0.00 0.00 99.89
What I need to do is grab the number under idle (99.89) and subtract that from 100 to get the total CPU usage. I was trying to grab the 12th field with a delimiter of spaces like this:
mpstat -P 0 | cut -d' ' -f12
But that showed me that there are actually multiple spaces between each field. So I'm looking for help to find a cleaner solution!
You can do this with awk: pipe the mpstat output into it as standard input, check whether the current line is the fourth one, and if so print that line's last column, subtracting it from 100 when it is greater than 0 and printing it unchanged otherwise.
mpstat -P 0 | awk 'FNR==4{print ($NF>0?100-$NF:$NF)}'
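If you later need the same figure for every core in one pass, a variant of the same awk idea (a sketch, assuming the Average: lines of mpstat -P ALL end in %idle as in the output above) is:
# Print per-core usage (100 - %idle) from the Average lines of a single one-second sample
mpstat -P ALL 1 1 | awk '/^Average/ && $2 ~ /^[0-9]+$/ { printf "CPU %s: %.2f%%\n", $2, 100 - $NF }'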

Why is cpu-cycles much lower than the current CPU frequency?

My CPU's maximum frequency is 2.8 GHz and the cpufreq governor is set to performance, but perf reports cpu-cycles at only 0.105 GHz. Why?
The cpu-cycles event is 0x3c; is that CPU_CLK_UNHALTED.THREAD_P or CPU_CLK_THREAD_UNHALTED.REF_XCLK?
Could I read the PMC register directly from perf?
Right now mpstat shows the usage of CPU 8 reaching about 90%:
CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
8 0.00 0.00 0.98 0.00 0.00 0.00 0.00 89.22 0.00 9.80
8 0.00 0.00 0.99 0.00 0.00 0.00 0.00 88.12 0.00 10.89
The CPU is Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz.
processor : 8
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
stepping : 4
microcode : 0x428
cpu MHz : 2800.000
cache size : 25600 KB
I want to get some idea about the cpu-8 by perf.
perf stat -C 8
Performance counter stats for 'CPU(s) 8':
8828.237941 task-clock (msec) # 1.000 CPUs utilized
11,550 context-switches # 0.001 M/sec
0 cpu-migrations # 0.000 K/sec
0 page-faults # 0.000 K/sec
926,167,840 cycles # 0.105 GHz
4,012,135,689 stalled-cycles-frontend # 433.20% frontend cycles idle
473,099,833 instructions # 0.51 insn per cycle
# 8.48 stalled cycles per insn
98,346,040 branches # 11.140 M/sec
1,254,592 branch-misses # 1.28% of all branches
8.828177754 seconds time elapsed
The cpu-cycles rate is only 0.105 GHz, which is really strange.
I am trying to understand what cpu-cycles means.
cat /sys/bus/event_source/devices/cpu/events/cpu-cycles
event=0x3c
I looked it up in the "Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3", section 19.6, page 40.
I also checked the CPU frequency settings; the CPU should be running at its maximum frequency.
cat scaling_governor
performance
cat scaling_governor
performance
==============================================
I tried this command:
taskset -c 8 stress --cpu 1
perf stat -C 8 sleep 10
Performance counter stats for 'CPU(s) 8':
10000.633899 task-clock (msec) # 1.000 CPUs utilized
1,823 context-switches # 0.182 K/sec
0 cpu-migrations # 0.000 K/sec
8 page-faults # 0.001 K/sec
29,792,267,638 cycles # 2.979 GHz
5,866,181,553 stalled-cycles-frontend # 19.69% frontend cycles idle
54,171,961,339 instructions # 1.82 insn per cycle
# 0.11 stalled cycles per insn
16,356,002,578 branches # 1635.497 M/sec
33,041,249 branch-misses # 0.20% of all branches
10.000592203 seconds time elapsed
Some detailed information about my environment:
I run an application, call it 'A', in a virtual machine 'V' on a host 'H'.
The virtual machine is created by qemu-kvm.
The application receives packets from the network and processes them.
The cpu-cycles counter can be frozen because the CPU enters the C1 or C2 idle state, which makes the reported rate much lower than the nominal frequency.
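One way to check whether that explains the low rate (a sketch, assuming the standard cpuidle sysfs interface is available) is to look at how much time CPU 8 has accumulated in each idle state while perf is counting:
# Time (in microseconds) spent in each idle state since boot, for CPU 8
for d in /sys/devices/system/cpu/cpu8/cpuidle/state*; do
    printf '%s: %s us\n' "$(cat "$d/name")" "$(cat "$d/time")"
done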

High IO load during the daily raid check which lasts for hours on Debian Jessie?

I'm experiencing a load of about 6 during the daily raid check:
# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda3[0] sdb3[1]
2111700992 blocks super 1.2 [2/2] [UU]
[=================>...] check = 87.1% (1840754048/2111700992) finish=43.6min speed=103504K/sec
bitmap: 2/16 pages [8KB], 65536KB chunk
md1 : active raid1 sda2[0] sdb2[1]
523712 blocks super 1.2 [2/2] [UU]
resync=DELAYED
The suspect seems to be jbd2:
Total DISK READ : 0.00 B/s | Total DISK WRITE : 433.45 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 902.05 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
19794 be/3 root 0.00 B 616.00 K 0.00 % 99.46 % [jbd2/loop0-8]
259 be/3 root 0.00 B 96.00 K 0.00 % 87.46 % [jbd2/md2-8]
19790 be/0 root 0.00 B 18.93 M 0.00 % 10.13 % [loop0]
The Linux box is Debian GNU/Linux 8.7 (jessie) with a 4.4.44-1-pve kernel.
Almost as soon as the raid check finishes, the load drops back to less than one. How can I figure out what's causing this problem?
I'm not sure how long the daily RAID check should take, but it currently runs for several hours, which seems excessive.
The IO levels drop significantly once the raid check has finished:
Total DISK READ : 0.00 B/s | Total DISK WRITE : 8.29 M/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 8.63 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
259 be/3 root 0.00 B 188.00 K 0.00 % 28.80 % [jbd2/md2-8]
19794 be/3 root 0.00 B 720.00 K 0.00 % 28.65 % [jbd2/loop0-8]
This problem doesn't seem to make any sense to me. Any help further debugging this would be very useful.
The md RAID check has to iterate over the RAID stripes on disk and verify their integrity. This is both an I/O- and CPU-intensive operation, so the system load increases significantly while it runs.
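If the check interferes too much with regular I/O, one commonly used knob (a sketch using the md speed-limit sysctls; values are in KiB/s per device, and lowering the maximum makes the check take even longer) is:
# Cap the md check/resync rate; the defaults are typically 200000 (max) and 1000 (min)
echo 50000 > /proc/sys/dev/raid/speed_limit_max
echo 1000  > /proc/sys/dev/raid/speed_limit_min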

How to get second-level output from sar when used with -f option?

The sar man page says that one can specify the resolution in seconds for its output.
However, I am not able to get second-level resolution with the following command:
sar -i 1 -f /var/log/sa/sa18
11:00:01 AM CPU %user %nice %system %iowait %steal %idle
11:10:01 AM all 0.04 0.00 0.04 0.00 0.01 99.91
11:20:01 AM all 0.04 0.00 0.04 0.00 0.00 99.92
11:30:01 AM all 0.04 0.00 0.04 0.00 0.00 99.92
The following command does not give second-level resolution either:
sar -f /var/log/sa/sa18 1
I am able to get second-level results only if I do not specify the -f option:
sar 1 10
08:34:31 PM CPU %user %nice %system %iowait %steal %idle
08:34:32 PM all 0.12 0.00 0.00 0.00 0.00 99.88
08:34:33 PM all 0.00 0.00 0.12 0.00 0.00 99.88
08:34:34 PM all 0.00 0.00 0.12 0.00 0.00 99.88
But I want to see system performance varying by second for some past day.
How do I get sar to print second-level output with the -f option?
Linux version: Linux 2.6.32-642.el6.x86_64
sar version : sysstat version 9.0.4
I think the existing sar report file 'sa18' was collected at a 10-minute interval, so we cannot get per-second output from it.
Please check the /etc/cron.d/sysstat file:
[root@testserver ~]# cat /etc/cron.d/sysstat
#run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
#generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A
If you want to reduce the sar collection interval, you can modify this sysstat cron file, for example as shown below.
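A hypothetical entry that collects one-second samples, 60 per minute, instead of one sample every 10 minutes (note that the sa files grow much faster at this rate):
# /etc/cron.d/sysstat (example): 60 one-second samples every minute
* * * * * root /usr/lib64/sa/sa1 1 60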
The /var/log/sa directory already has all of the information. The sar command serves here as a parser and reads the data from the sa file.
So you can use sar -f /var/log/sa/<sa file> to see the default CPU report, and use other flags, like -r, for other reports.
# sar -f /var/log/sa/sa02
12:00:01 CPU %user %nice %system %iowait %steal %idle
12:10:01 all 14.70 0.00 5.57 0.69 0.01 79.03
12:20:01 all 23.53 0.00 6.08 0.55 0.01 69.83
# sar -r -f /var/log/sa/sa02
12:00:01 kbmemfree kbavail kbmemused kbactive kbinact kbdirty
12:10:01 2109732 5113616 30142444 25408240 2600
12:20:01 1950480 5008332 30301696 25580696 2260
12:30:01 2278632 5324260 29973544 25214788 4112
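Per-second resolution cannot be recovered from a file that was sampled every 10 minutes, but if you only need a narrower time window from an existing file, sar's -s and -e (start/end time) options restrict the output, for example:
# Show only the 08:00-09:00 window from an existing daily file
sar -f /var/log/sa/sa18 -s 08:00:00 -e 09:00:00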

GNU parallel load balancing

I am trying to find a way to execute CPU-intensive parallel jobs over a cluster. My objective is to schedule one job per core, so that every job hopefully gets 100% CPU utilization once scheduled. This is what I have come up with so far:
FILE build_sshlogin.sh
#!/bin/bash
serverprefix="compute-0-"
lastserver=15
function worker {
    server="$serverprefix$1";
    # Sample /proc/stat twice on the remote host, compute its overall CPU
    # utilization over 2 seconds, and estimate how many cores are still free.
    free=$(ssh $server /bin/bash << 'EOF'
cores=$(grep "cpu MHz" /proc/cpuinfo | wc -l)
stat=$(head -n 1 /proc/stat)
work1=$(echo $stat | awk '{print $2+$3+$4;}')
total1=$(echo $stat | awk '{print $2+$3+$4+$5+$6+$7+$8;}')
sleep 2;
stat=$(head -n 1 /proc/stat)
work2=$(echo $stat | awk '{print $2+$3+$4;}')
total2=$(echo $stat | awk '{print $2+$3+$4+$5+$6+$7+$8;}')
util=$(echo " ( $work2 - $work1 ) / ($total2 - $total1) " | bc -l );
echo " $cores * (1 - $util) " | bc -l | xargs printf "%1.0f"
EOF
)
    # Only emit an sshlogin entry ("N/host") if the node has at least one free core
    if [ "$free" -gt 0 ]
    then
        echo $free/$server
    fi
}
export serverprefix
export -f worker
seq 0 $lastserver | parallel -k worker {}
This script is used by GNU parallel as follows:
parallel --sshloginfile <(./build_sshlogin.sh) --workdir $PWD command args {1} ::: $(seq $runs)
The problem with this technique is that if someone starts another CPU-intensive job on a server in the cluster without checking the CPU usage, the script will end up scheduling jobs onto cores that are already in use. In addition, if the CPU usage has changed by the time the first jobs finish, the newly freed cores will not be considered by GNU parallel when it schedules the remaining jobs.
So my question is the following: Is there a way to make GNU parallel re-calculate the free cores/server before it schedules each job? Any other suggestions for solving the problem are welcome.
NOTE: In my cluster all cores have the same frequency. If someone can generalize to account for different frequencies, that's also welcome.
Look at --load, which is meant for exactly this situation.
Unfortunately it does not look at CPU utilization but at the load average. But if your cluster nodes do not have heavy disk I/O, CPU utilization will be very close to the load average.
Since the load average changes slowly, you probably also need to use the new --delay option to give the load average time to rise.
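A sketch of how that might look with your invocation (the 90% threshold and 0.25 s delay are example values; nodes.txt is a hypothetical static list of cluster nodes, one sshlogin per line):
# Let GNU parallel itself skip nodes whose load average is too high
parallel --sshloginfile nodes.txt --load 90% --delay 0.25 --workdir $PWD command args {1} ::: $(seq $runs)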
Try mpstat
mpstat
Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011
10:25:32 PM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:25:32 PM all 5.68 0.00 0.49 2.03 0.01 0.02 0.00 91.77 146.55
And here is a snapshot on a per-core basis:
$ mpstat -P ALL
Linux 2.6.32-100.28.5.el6.x86_64 (dev-db) 07/09/2011 _x86_64_ (4 CPU)
10:28:04 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10:28:04 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.99
10:28:04 PM 0 0.01 0.00 0.01 0.01 0.00 0.00 0.00 0.00 99.98
10:28:04 PM 1 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 99.98
10:28:04 PM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
10:28:04 PM 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
There are lots of options; these two give a simple view of the actual %idle per CPU. Check the man page. A quick way to turn one mpstat sample into a count of free cores is sketched below.
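If you do want to keep basing the decision on CPU utilization rather than load average, a sketch of counting the free cores on a node (the 80% idle threshold is arbitrary) could be:
# Count cores that were more than 80% idle over a 2-second sample
mpstat -P ALL 2 1 | awk '/^Average/ && $2 ~ /^[0-9]+$/ { if ($NF > 80) free++ } END { print free + 0 }'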
