Script for monitoring disk I/O rates on Linux

I need a script for monitoring ALL disk I/O rates on Linux using bash, awk, and sed. The catch is that it must return one row per time interval (this row should contain: tps, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn, summed over all disks).
The natural choice here is of course iostat:
iostat -d -k $interval $loops
To limit it to all disks I use:
iostat -d -k -p `parted -l | grep Disk | cut -f1 -d: | cut -f2 -d' '`
Now the nice trick to summarize columns:
iostat -d -k -p `parted -l | grep Disk | cut -f1 -d: | cut -f2 -d' '` > /tmp/jPtafDiskIO.txt
echo `date +"%H:%M:%S"`,`awk 'FNR>2' /tmp/jPtafDiskIO.txt | awk 'BEGIN {FS=OFS=" "}NR == 1 { n1 = $2; n2 = $3; n3 = $4; n4 = $5; n5 = $6; next } { n1 += $2; n2 += $3; n3 += $4; n4 += $5; n5 += $6 } END { print n1","n2","n3","n4","n5 }'` >> diskIO.log
I am almost there; however, running this in a loop starts iostat from scratch each time, so I don't get statistics from interval to interval but always the since-boot averages (each invocation gives me pretty much the same output).
I know it sounds complicated, but maybe somebody has an idea? Maybe totally different approach?
Thanks.
EDIT:
Sample input (/tmp/jPtafDiskIO.txt):
> Linux 2.6.18-194.el5 (hostname) 08/25/2012
>
> Device:            tps    kB_read/s    kB_wrtn/s      kB_read      kB_wrtn
> sda               0.00         0.00         0.00        35655           59
> sda2              0.00         0.00         0.00           67          272
> sda1              0.00         0.00         0.00          521          274
> sdb              52.53         0.56       569.40     20894989  21065384388
> sdc               1.90        64.64        10.93   2391333384    404432217
> sdd               0.00         0.00         0.04        17880      1343028
Output diskIO.log:
16:53:12,54.43,65.2,580.37,2412282496,21471160238
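A totally different approach, staying with iostat, is to run it continuously instead of restarting it in a loop: with an interval argument, every report after the first covers just that interval, not the since-boot averages. The sketch below reuses the diskIO.log name from the question; the awk parsing assumes sysstat's iostat output format shown above (header line starting with `Device`, blank line between reports).

```shell
#!/bin/bash
# Sketch: one summarized CSV row per interval, without re-invoking iostat.
interval=5
iostat -d -k "$interval" | awk -v OFS=',' '
  /^Device/ { report++; next }          # each report starts with a header
  report > 1 && NF >= 6 {               # skip report 1: since-boot averages
    tps += $2; rds += $3; wrs += $4; rd += $5; wr += $6
  }
  /^$/ && report > 1 && tps != "" {     # blank line ends a report: emit row
    cmd = "date +%H:%M:%S"; cmd | getline t; close(cmd)
    print t, tps, rds, wrs, rd, wr
    fflush()                            # flush so the log grows live
    tps = rds = wrs = rd = wr = ""      # reset sums for the next report
  }
' >> diskIO.log
```

The `-p`/parted device filtering from the question can be added to the iostat invocation unchanged; the awk part only depends on the column layout.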

Why not use iotop (http://guichaz.free.fr/iotop/)?

dstat might be what you're looking for. It has a lot of things it can report on, with some common ones displayed by default.

Related

Linux bash scripting: Sum one column using awk for overall cpu utilization and display all fields

The problem:
Script: I execute the ps command with pid,user... and I am trying to use awk to sum the overall CPU utilization across the different processes.
Command:
> $ ps -eo pid,user,state,comm,%cpu,command --sort=-%cpu | egrep -v '(0.0)|(%CPU)' | head -n10 | awk '
> { process[$4]+=$5; }
> END{
> for (i in process)
> {
> printf($1" "$2" "$3" ""%-20s %s\n",i, process[i]" "$6) ;
> }
>
> }' | sort -nrk 5 | head
Awk: Sum 5th column according to the process name (4th column)
Output:
10935 zbynda S dd                   93.3 /usr/libexec/gnome-terminal-server
10935 zbynda S gnome-shell          1.9 /usr/libexec/gnome-terminal-server
10935 zbynda S sublime_text         0.6 /usr/libexec/gnome-terminal-server
10935 zbynda S sssd_kcm             0.2 /usr/libexec/gnome-terminal-server
As you can see, the fourth and fifth columns are correct, but the remaining columns just repeat the first entry from the ps output. There should be 4 different processes, as in the fourth column, but, for example, the last column shows the same command for every row.
How do I get the other entries from the ps output (not only the first one)?
Try this:
ps -eo pid,user,state,comm,%cpu,command --sort=-%cpu | egrep -v '(0.0)|(%CPU)' | head -n10 | awk '
{ process[$4]+=$5; a1[$4]=$1;a2[$4]=$2;a3[$4]=$3;a6[$4]=$6}
END{
for (i in process)
{
printf(a1[i]" "a2[i]" "a3[i]" ""%-20s %s\n",i, process[i]" "a6[i]) ;
}
}' | sort -nrk 5 | head
An END rule is executed only once, after all the input has been read.
Your printf uses $6, which still holds the value from the last input line; you want the value saved per process name i instead.
Of course $1, $2, and $3 have the same problem, so you need to preserve those incoming values as well. Fixing this is left as an exercise for the student.
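A minimal demonstration of that point: in END, the field variables still hold whatever was in the last record read, so referencing them there does not iterate over anything. (Strictly, the value of fields in END is what gawk/mawk do; POSIX is looser.)

```shell
# Two records go in; END sees only the last one's fields, while the
# accumulator saw both.
printf 'a 1\nb 2\n' | awk '{ sum += $2 } END { print $1, sum }'
# prints "b 3": $1 comes from the last line "b 2", sum is 1+2
```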

Adding time into a .plot file without adding a new line using awk

I am writing a shell script that runs mpstat and iostat to get CPU and disk usage, extracts information from them, and puts it into a .plot file to later graph with bargraph.pl. The trouble comes when I use awk to get the time from mpstat like this:
mpstat | awk 'FNR == 4 {print $1;}' >> CPU_usage.plot
It prints a newline at the end. I tried using printf, which works for my other lines, but I don't know how to format it here. Is there any way to do this with awk, or any other method I can use to accomplish this? Thanks in advance.
When I run mpstat, this is what bash returns:
Linux 3.4.0+ (DESKTOP-JM295S0) 04/30/2017 _x86_64_ (4 CPU)
03:56:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:56:43 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
This is what I'm trying to accomplish: take the time, usr, sys, and idle values and put them into a file called CPU_usage.plot. This is what I want in the file:
03:56:43 0.00 0.00 100.00
What I got instead is:
03:56:43
0.00 0.00 100.00
This is my code:
mpstat | awk 'FNR == 4 {print $1;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f", $4;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f", $6;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f\n", $13;}' >> CPU_usage.plot
Use the following awk approach:
mpstat | awk 'NR==4{print $1,$4,$6,$13}' OFS="\t" >> CPU_usage.plot
Now the CPU_usage.plot file should contain:
03:56:43 0.00 0.00 100.00
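If you prefer to keep the four separate commands from the question, the stray newline comes from awk's print; switching the first command to printf with no "\n" suppresses it. Note that each mpstat run samples at a different moment, so the single-awk version above is still preferable.

```shell
# printf without a trailing \n writes the time field with no newline,
# so the next command's output lands on the same line.
mpstat | awk 'FNR == 4 {printf "%s", $1}' >> CPU_usage.plot
```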

overall CPU usage and Memory(RAM) usage in percentage in linux/ubuntu

I want to find out the overall CPU usage and RAM usage as percentages, but I haven't had any success. I'd like something like:
$ command for cpu usage
4.85%
$ command for memory usage
15.15%
OR
$ command for cpu and memory usage
cpu: 4.85%
mem: 15.15%
How can I achieve this?
You can use top and/or vmstat from the procps package.
Use vmstat -s to get the amount of RAM on your machine (optional), and
then use the output of top to calculate the memory usage percentages.
%Cpu(s): 3.8 us, 2.8 sy, 0.4 ni, 92.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 24679620 total, 1705524 free, 7735748 used, 15238348 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 16161296 avail Mem
You can also do this for relatively short output:
watch '/usr/bin/top -b | head -4 | tail -2'
A shell pipe that calculates the current RAM usage periodically is
watch -n 5 "/usr/bin/top -b | head -4 | tail -2 | perl -anlE 'say sprintf(\"used: %s total: %s => RAM Usage: %.1f%%\", \$F[7], \$F[3], 100*\$F[7]/\$F[3]) if /KiB Mem/'"
(CPU + Swap usages were filtered out here.)
This command prints every 5 seconds:
Every 5.0s: /usr/bin/top -b | head -4 | tail -2 | perl -anlE 'say sprintf("u... wb3: Wed Nov 21 13:51:49 2018
used: 8349560 total: 24667856 => RAM Usage: 33.8%
Please use one of the following:
$ free -t | awk 'NR == 2 {print "Current Memory Utilization is : " $3/$2*100}'
Current Memory Utilization is : 14.6715
OR
$ free -t | awk 'FNR == 2 {print "Current Memory Utilization is : " $3/$2*100}'
Current Memory Utilization is : 14.6703
CPU usage => top -bn2 | grep '%Cpu' | tail -1 | grep -P '(....|...) id,' | awk '{print 100-$8 "%"}'
Memory usage => free -m | grep 'Mem:' | awk '{ print $3/$2*100 "%"}'
For the cpu usage% you can use:
top -b -n 1| grep Cpu | awk -F "," '{print $4}' | awk -F "id" '{print $1}' | awk -F "%" '{print $1}'
One-liner to get the RAM % in use:
free -t | awk 'FNR == 2 {printf("%.0f%%"), $3/$2*100}'
Example output: 24%
For more precision, increase the integer N inside printf("%.Nf%%") in the previous command. For example, to get 2 decimal places of precision:
free -t | awk 'FNR == 2 {printf("%.2f%%"), $3/$2*100}'
Example output: 24.57%
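Combining the pieces above into the two-line format the question asks for, a sketch that parses top's batch output for CPU and free for memory (the field positions assume the procps output shown in the answers above):

```shell
#!/bin/bash
# CPU: take the second %Cpu sample from top's batch mode ($8 is the idle
# column in the output shown above) and subtract it from 100.
cpu=$(top -bn2 | grep '%Cpu' | tail -1 | awk '{print 100 - $8}')
# Memory: used ($3) over total ($2) from free, as a percentage.
mem=$(free -m | awk '/Mem:/ {printf "%.2f", $3/$2*100}')
echo "cpu: ${cpu}%"
echo "mem: ${mem}%"
```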

Biggest and smallest of all lines

I have output like this:
3.69
0.25
0.80
1.78
3.04
1.99
0.71
0.50
0.94
I want to find the biggest number and the smallest number in the above output
I need output like
smallest is 0.25 and biggest as 3.69
Just sort your input first and print the first and last value. One method:
$ sort file | awk 'NR==1{min=$1}END{print "Smallest",min,"Biggest",$0}'
Smallest 0.25 Biggest 3.69
Hope this helps.
OUTPUT="3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94"
SORTED=`echo $OUTPUT | tr ' ' '\n' | sort -n`
SMALLEST=`echo "$SORTED" | head -n 1`
BIGGEST=`echo "$SORTED" | tail -n 1`
echo "Smallest is $SMALLEST"
echo "Biggest is $BIGGEST"
Adding an awk one-liner per the OP's request.
I'm not good at awk, but this works anyway. :)
echo "3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94" | awk '{
for (i=1; i<=NF; i++) {
if (length(s) == 0) s = $i;
if (length(b) == 0) b = $i;
if ($i < s) s = $i;
if (b < $i) b = $i;
}
print "Smallest is", s;
print "Biggest is", b;
}'
You want an awk solution?
echo "3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94" | \
awk -v RS=' ' '/.+/ { biggest = ((biggest == "") || ($1 > biggest)) ? $1 : biggest;
smallest = ((smallest == "") || ($1 < smallest)) ? $1:smallest}
END { print biggest, smallest}'
This produces the following output:
3.69 0.25
You can also use this method:
sort file | echo -e `sed -nr '1{s/(.*)/smallest is :\1/gp};${s/(.*)/biggest no is :\1/gp'}`
TXR solution:
$ txr -e '(let ((nums [mapcar tofloat (gun (get-line))]))
(if nums
(pprinl `smallest is #(find-min nums) and biggest is #(find-max nums)`)
(pprinl "empty input")))'
0.1
-1.0
3.5
2.4
smallest is -1.0 and biggest is 3.5
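For the original one-number-per-line input, a single awk pass avoids sorting and external tools entirely (numbers.txt is a placeholder file name), and prints exactly the sentence the question asks for:

```shell
# Seed min and max with the first record, then track the running extremes.
awk 'NR == 1 { min = max = $1 }
     $1 < min { min = $1 }
     $1 > max { max = $1 }
     END { print "smallest is " min " and biggest as " max }' numbers.txt
```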

Which one is more efficient for float operations, awk or bc?

I am writing a system performance script in bash. I want to compute the CPU usage in terms of percent. I have two implementations, one using awk and another one using bc. I would like to know which of the two versions is more efficient. Is it better to use awk or bc for float computations? Thanks!
Version #1 (Using bc)
CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
CPU=$(echo "scale=2;(100-$CPU)" | bc -l)
echo $CPU
Version #2 (Using awk)
CPU=$(mpstat 1 1 | grep "Average" | awk '{idle = $11} {print 100 - idle}')
echo $CPU
Since the processing time of both is going to be tiny, the version that spawns the fewest processes and subshells is going to be "more efficient".
That's your second example.
But you can make it even simpler by eliminating the grep:
CPU=$(mpstat 1 1 | awk '/Average/{print 100 - $11}')
In version 1, why do you need the 2nd line? Why can't you do it all in the 1st line? I ask because version 1 is grep+awk+bc while version 2 is grep+awk, so I don't think the comparison is valid.
To use only bc, without awk, try this:
CPU=$(mpstat 1 1 | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc )
Thanks all for educating me on awk/bc!
I did the benchmark (in a hopefully more proper way):
tl;dr: awk wins
Semi-long story:
Over 3 runs of 1000 iterations each, awk averages 2.081333 s on my system while bc averages 3.460333 s.
full story:
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/ {print 100 - $11}' >/dev/null ; done
real 0m1.922s
user 0m0.320s
sys 0m1.308s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done
real 0m2.124s
user 0m0.370s
sys 0m1.368s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done
real 0m2.198s
user 0m0.412s
sys 0m1.383s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.799s
user 0m0.691s
sys 0m3.059s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.545s
user 0m0.604s
sys 0m2.801s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.037s
user 0m0.602s
sys 0m2.626s
[me#thebox tmp]$
Without further tracing, I believe this is related to the overhead of forking more processes when using bc.
I did the following benchmark:
#!/bin/bash
count=0
tic="$(date +%s)"
while [ $count -lt 50 ]
do
mpstat 1 1 | awk '/Average/{print 100 - $11}'
count=$(($count+1))
done
toc="$(date +%s)"
sec="$(expr $toc - $tic)"
count=0
tic="$(date +%s)"
while [ $count -lt 50 ]
do
CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
echo "scale=2;(100-$CPU)" | bc -l
count=$(($count+1))
done
toc="$(date +%s)"
sec1="$(expr $toc - $tic)"
echo "Execution Time awk: "$sec
echo "Execution Time bc: "$sec1
Both execution times were the same: 50 seconds. Apparently it makes no difference here, because each mpstat 1 1 call itself spends one second sampling; at 50 iterations the sampling time dominates, and the awk/bc difference disappears into it.
