Which one is more efficient for float operations, awk or bc?

I am writing a system performance script in bash. I want to compute the CPU usage as a percentage. I have two implementations, one using awk and another using bc. I would like to know which of the two versions is more efficient. Is it better to use awk or bc for float computations? Thanks!
Version #1 (Using bc)
CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
CPU=$(echo "scale=2;(100-$CPU)" | bc -l)
echo $CPU
Version #2 (Using awk)
CPU=$(mpstat 1 1 | grep "Average" | awk '{idle = $11} {print 100 - idle}')
echo $CPU

Since the processing time of both is going to be tiny, the version that spawns the fewest processes and subshells is going to be "more efficient".
That's your second example.
But you can make it even simpler by eliminating the grep:
CPU=$(mpstat 1 1 | awk '/Average/{print 100 - $11}')

In version 1, why do you need the second line? Why can't you do it all on the first line? I ask because the first version is grep+awk+bc while the second is grep+awk, so I don't think the comparison is valid.
For using only bc, without awk, try this:
CPU=$(mpstat 1 1 | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc )

Thanks, all, for educating me on awk/bc!
I did the benchmark (hopefully in a more proper way):
tl;dr: awk wins
Semi-long story:
Over 3 rounds of 1000 runs each, awk averages 2.081333s on my system while bc averages 3.460333s.
Full story:
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/ {print 100 - $11}' >/dev/null ; done
real 0m1.922s
user 0m0.320s
sys 0m1.308s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done
real 0m2.124s
user 0m0.370s
sys 0m1.368s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done
real 0m2.198s
user 0m0.412s
sys 0m1.383s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.799s
user 0m0.691s
sys 0m3.059s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.545s
user 0m0.604s
sys 0m2.801s
[me#thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average: all 5.05 0.00 6.57 0.51 0.00 0.00 0.00 0.00 87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done
real 0m3.037s
user 0m0.602s
sys 0m2.626s
[me#thebox tmp]$
Without further tracing, I believe this is related to the overhead of forking more processes when using bc.

I did the following benchmark:
#!/bin/bash
count=0
tic="$(date +%s)"
while [ $count -lt 50 ]
do
mpstat 1 1 | awk '/Average/{print 100 - $11}'
count=$(($count+1))
done
toc="$(date +%s)"
sec="$(expr $toc - $tic)"
count=0
tic="$(date +%s)"
while [ $count -lt 50 ]
do
CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
echo "scale=2;(100-$CPU)" | bc -l
count=$(($count+1))
done
toc="$(date +%s)"
sec1="$(expr $toc - $tic)"
echo "Execution Time awk: "$sec
echo "Execution Time bc: "$sec1
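Both execution times were the same: 50 seconds. Apparently it makes no measurable difference here. Note that each mpstat 1 1 call itself samples for one second, so the 50 iterations account for essentially all of that time, and date +%s only resolves whole seconds.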

Related

Adding time into a .plot file without adding a new line using awk

I am writing a shell script that runs the commands mpstat and iostat to get CPU and disk usage, extracts information from their output, and puts it into a .plot file to graph later using bargraph.pl. What I am having trouble with is using awk to get the time from mpstat, like this:
mpstat | awk 'FNR == 4 {print $1;}' >> CPU_usage.plot
It prints a newline at the end of the output. I tried using printf, since that works for my other lines of code to get the specific information needed without adding a newline, but I don't know how to format it. Is there any way to do this with awk, or any other method I can use to accomplish this? Thanks in advance.
When I use the command mpstat, this is what bash returns:
Linux 3.4.0+ (DESKTOP-JM295S0) 04/30/2017 _x86_64_ (4 CPU)
03:56:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:56:43 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
What I'm trying to accomplish is to take the time, usr, sys, and idle values and put them into a file called CPU_usage.plot. This is what I want in the file:
03:56:43 0.00 0.00 100.00
What I got instead is:
03:56:43
0.00 0.00 100.00
This is my code:
mpstat | awk 'FNR == 4 {print $1;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f", $4;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f", $6;}' >> CPU_usage.plot
mpstat | awk 'FNR == 4 {printf " %f\n", $13;}' >> CPU_usage.plot

Use the following awk approach:
mpstat | awk 'NR==4{print $1,$4,$6,$13}' OFS="\t" >> CPU_usage.plot
Now, CPU_usage.plot file should contain:
03:56:43 0.00 0.00 100.00
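Since the question asked specifically about printf formatting, here is a minimal sketch of doing the whole line with a single awk printf. It runs against a canned copy of the mpstat output from the question, and it assumes that line 2 of the real output is blank (which is why the data row is line 4). The function name plot_line is made up for this example.

```shell
#!/bin/sh
# Canned mpstat output copied from the question.
# Assumption: the second line is blank, as in real mpstat output.
sample='Linux 3.4.0+ (DESKTOP-JM295S0) 04/30/2017 _x86_64_ (4 CPU)

03:56:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:56:43 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00'

# plot_line (hypothetical name): print time, %usr, %sys and %idle on one line.
# One printf with one trailing \n, so no stray newlines appear in between.
plot_line() {
    awk 'NR == 4 { printf "%s %.2f %.2f %.2f\n", $1, $4, $6, $13 }'
}

printf '%s\n' "$sample" | plot_line
```

With live data this would presumably become mpstat | plot_line >> CPU_usage.plot, which also runs mpstat once per line instead of four times.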

overall CPU usage and Memory(RAM) usage in percentage in linux/ubuntu

I want to find the overall CPU usage and RAM usage as percentages, but I haven't had any success.
$ command for cpu usage
4.85%
$ command for memory usage
15.15%
OR
$ command for cpu and memory usage
cpu: 4.85%
mem: 15.15%
How can I achieve this?

You can use top and/or vmstat from the procps package.
Use vmstat -s to get the amount of RAM on your machine (optional), and
then use the output of top to calculate the memory usage percentages.
%Cpu(s): 3.8 us, 2.8 sy, 0.4 ni, 92.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 24679620 total, 1705524 free, 7735748 used, 15238348 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 16161296 avail Mem
You can also do this for relatively short output:
watch '/usr/bin/top -b | head -4 | tail -2'
A shell pipe that calculates the current RAM usage periodically is
watch -n 5 "/usr/bin/top -b | head -4 | tail -2 | perl -anlE 'say sprintf(\"used: %s total: %s => RAM Usage: %.1f%%\", \$F[7], \$F[3], 100*\$F[7]/\$F[3]) if /KiB Mem/'"
(CPU + Swap usages were filtered out here.)
This command prints every 5 seconds:
Every 5.0s: /usr/bin/top -b | head -4 | tail -2 | perl -anlE 'say sprintf("u... wb3: Wed Nov 21 13:51:49 2018
used: 8349560 total: 24667856 => RAM Usage: 33.8%

Please use one of the following:
$ free -t | awk 'NR == 2 {print "Current Memory Utilization is : " $3/$2*100}'
Current Memory Utilization is : 14.6715
OR
$ free -t | awk 'FNR == 2 {print "Current Memory Utilization is : " $3/$2*100}'
Current Memory Utilization is : 14.6703

CPU usage => top -bn2 | grep '%Cpu' | tail -1 | grep -P '(....|...) id,' | awk '{print 100-$8 "%"}'
Memory usage => free -m | grep 'Mem:' | awk '{ print $3/$2*100 "%"}'

For the cpu usage% you can use:
top -b -n 1| grep Cpu | awk -F "," '{print $4}' | awk -F "id" '{print $1}' | awk -F "%" '{print $1}'

One liner solution to get RAM % in-use:
free -t | awk 'FNR == 2 {printf("%.0f%%\n", $3/$2*100)}'
Example output: 24%
For more precision, change the integer N in the printf format %.<N>f in the previous command. For example, to get 2 decimal places of precision:
free -t | awk 'FNR == 2 {printf("%.2f%%\n", $3/$2*100)}'
Example output: 24.57%

Bash format uptime to show days, hours, minutes

I'm using uptime in bash in order to get the current runtime of the machine. I need to grab the time and display a format like 2 days, 12 hours, 23 minutes.

My uptime produces output that looks like:
$ uptime
12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52
To convert that to your format:
$ uptime | awk -F'( |,|:)+' '{print $6,$7",",$8,"hours,",$9,"minutes."}'
25 days, 21 hours, 34 minutes.
How it works
-F'( |,|:)+'
awk divides its input up into fields. This tells awk to use any combination of one or more of space, comma, or colon as the field separator.
print $6,$7",",$8,"hours,",$9,"minutes."
This tells awk to print the sixth field and seventh fields (separated by a space) followed by a comma, the 8th field, the string hours, the ninth field, and, lastly, the string minutes..
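To see why the days end up in field 6, it can help to dump every field with its index. The sketch below does that for the sample line from this answer. It assumes the line starts with a space, as real uptime output does, which makes field 1 empty. The function name show_fields is made up for this example.

```shell
#!/bin/sh
# Sample line from the answer above.
# Assumption: real uptime output begins with a space, so $1 is empty.
line=' 12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52'

# show_fields (hypothetical name): print each field with its index,
# using the same field separator as the answer.
show_fields() {
    awk -F'( |,|:)+' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

printf '%s\n' "$line" | show_fields
```

The listing shows the clock time split across fields 2 to 4, "up" in field 5, and the values the answer prints in fields 6 to 9.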
Handling computers with short uptimes using sed
Starting from a reboot, my uptime produces output like:
03:14:20 up 1 min, 2 users, load average: 2.28, 1.29, 0.50
04:12:29 up 59 min, 5 users, load average: 0.06, 0.08, 0.48
05:14:09 up 2:01, 5 users, load average: 0.13, 0.10, 0.45
03:13:19 up 1 day, 0 min, 8 users, load average: 0.01, 0.04, 0.05
04:13:19 up 1 day, 1:00, 8 users, load average: 0.02, 0.05, 0.21
12:49:10 up 25 days, 21:30, 28 users, load average: 0.50, 0.66, 0.52
The following sed command handles these formats:
uptime | sed -E 's/^[^,]*up *//; s/, *[[:digit:]]* users.*//; s/min/minutes/; s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/'
With the above times, this produces:
1 minutes
59 minutes
2 hours, 1 minutes
1 day, 0 minutes
1 day, 1 hours, 0 minutes
25 days, 21 hours, 30 minutes
How it works
-E turns on extended regular expression syntax. (On older GNU seds, use -r in place of -E)
s/^[^,]*up *//
This substitute command removes all text up to up.
s/, *[[:digit:]]* users.*//
This substitute command removes the user count and all text which follows it.
s/min/minutes/
This replaces min with minutes.
s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/
If the line contains a time in the hh:mm format, this separates the hours from the minutes and replaces it with hh hours, mm minutes.
Handling computers with short uptimes using awk
uptime | awk -F'( |,|:)+' '{d=h=m=0; if ($7=="min") m=$6; else {if ($7~/^day/) {d=$6;h=$8;m=$9} else {h=$6;m=$7}}} {print d+0,"days,",h+0,"hours,",m+0,"minutes."}'
On the same test cases as above, this produces:
0 days, 0 hours, 1 minutes.
0 days, 0 hours, 59 minutes.
0 days, 2 hours, 1 minutes.
1 days, 0 hours, 0 minutes.
1 days, 1 hours, 0 minutes.
25 days, 21 hours, 30 minutes.
For those who prefer awk code spread out over multiple lines:
uptime | awk -F'( |,|:)+' '{
d=h=m=0;
if ($7=="min")
m=$6;
else {
if ($7~/^day/) { d=$6; h=$8; m=$9}
else {h=$6;m=$7}
}
}
{
print d+0,"days,",h+0,"hours,",m+0,"minutes."
}'

Just for completeness, what about:
$ uptime -p
up 2 weeks, 3 days, 14 hours, 27 minutes

Solution: to get the Linux uptime in seconds, go to bash and type cat /proc/uptime. Parse the first number and convert it as required.
From RedHat documentation:
This file contains information detailing how long the system has been on since its last restart. The output of /proc/uptime is quite minimal:
350735.47 234388.90
The first number is the total number of seconds the system has been up.
The second number is how much of that time the machine has spent idle, in seconds.
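The "convert it" step could look roughly like the sketch below. The function name format_uptime is made up for this example; it takes the first number from /proc/uptime, drops the fractional part, and uses plain integer arithmetic. It makes no attempt at singular/plural wording.

```shell
#!/bin/sh
# format_uptime SECONDS (hypothetical helper, not a standard tool):
# turn a seconds value such as the first field of /proc/uptime
# into "D days, H hours, M minutes".
format_uptime() {
    total=${1%%.*}                      # strip the fractional part
    days=$((total / 86400))
    hours=$((total % 86400 / 3600))
    minutes=$((total % 3600 / 60))
    echo "$days days, $hours hours, $minutes minutes"
}

# Sample value taken from the documentation excerpt above.
format_uptime 350735.47
```

On a Linux box this would presumably be fed with something like: read -r up idle < /proc/uptime; format_uptime "$up"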

I made a universal shell script, for systems which support uptime -p like newer linux and for those that don't, like Mac OS X.
#!/bin/sh
uptime -p >/dev/null 2>&1
if [ "$?" -eq 0 ]; then
# Supports most Linux distro
# when the machine has been up for less than 1 minute then
# 'uptime -p' returns ONLY 'up', so we need to set a default value
UP_SET_OR_EMPTY=$(uptime -p | awk -F 'up ' '{print $2}')
UP=${UP_SET_OR_EMPTY:-'less than a minute'}
else
# Supports Mac OS X, Debian 7, etc
UP=$(uptime | sed -E 's/^[^,]*up *//; s/mins/minutes/; s/hrs?/hours/;
s/([[:digit:]]+):0?([[:digit:]]+)/\1 hours, \2 minutes/;
s/^1 hours/1 hour/; s/ 1 hours/ 1 hour/;
s/min,/minutes,/; s/ 0 minutes,/ less than a minute,/; s/ 1 minutes/ 1 minute/;
s/  / /; s/, *[[:digit:]]* users?.*//')
fi
echo "up $UP"
Gist
Referenced John1024 answer with my own customizations.

For this:
0 days, 0 hours, 1 minutes.
0 days, 0 hours, 59 minutes.
0 days, 2 hours, 1 minutes.
1 days, 0 hours, 0 minutes.
1 days, 1 hours, 0 minutes.
25 days, 21 hours, 30 minutes
A simpler way is:
uptime -p | cut -d " " -f2-

For the sake of variety, here's an example with sed:
My raw output:
$ uptime
15:44:56 up 3 days, 22:58, 7 users, load average: 0.48, 0.40, 0.31
Converted output:
$ uptime | sed 's/.*up \([0-9]\+ days\), \([0-9]\+\):\([0-9]\+\).*/\1, \2 hours, \3 minutes./'
3 days, 22 hours, 58 minutes.

This answer is pretty specific for the uptime shipped in OS X, but takes into account any case of output.
#!/bin/bash
INFO=`uptime`
echo $INFO | awk -F'[ ,:\t\n]+' '{
msg = "↑ "
if ($5 == "day" || $5 == "days") { # up for a day or more
msg = msg $4 " " $5 ", "
n = $6
o = $7
} else {
n = $4
o = $5
}
if (int(o) == 0) { # words evaluate to zero
msg = msg int(n)" "o
} else { # hh:mm format
msg = msg int(n)" hr"
if (n > 1) { msg = msg "s" }
msg = msg ", " int(o) " min"
if (o > 1) { msg = msg "s" }
}
print "[", msg, "]"
}'
Some example possible outputs:
22:49 up 24 secs, 2 users, load averages: 8.37 2.09 0.76
[ ↑ 24 secs ]
22:50 up 1 min, 2 users, load averages: 5.59 2.39 0.95
[ ↑ 1 min ]
23:39 up 51 mins, 3 users, load averages: 2.18 1.94 1.74
[ ↑ 51 mins ]
23:54 up 1:06, 3 users, load averages: 3.67 2.57 2.07
[ ↑ 1 hr, 6 mins ]
16:20 up 120 days, 10:46, 3 users, load averages: 1.21 2.88 0.80
[ ↑ 120 days, 10 hrs, 46 mins ]

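# Print the uptime in minutes, parsed from `uptime -p` (bash: uses local and ((...))).
uptime_minutes() {
    set -- $(uptime -p)   # e.g. up 2 weeks, 3 days, 14 hours, 27 minutes
    local minutes=0
    shift                 # drop the leading "up"
    while [ -n "$1" ]; do
        # $1 is a count, $2 is its unit
        case $2 in
            week*)
                ((minutes+=$1*10080))
                ;;
            day*)
                ((minutes+=$1*1440))
                ;;
            hour*)
                ((minutes+=$1*60))
                ;;
            minute*)
                ((minutes+=$1))
                ;;
        esac
        shift
        shift
    done
    echo $minutes
}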

Biggest and smallest of all lines

I have a output like this
3.69
0.25
0.80
1.78
3.04
1.99
0.71
0.50
0.94
I want to find the biggest number and the smallest number in the above output
I need output like
smallest is 0.25 and biggest as 3.69

Just sort your input numerically first and print the first and last value. One method:
$ sort -n file | awk 'NR==1{min=$1}END{print "Smallest",min,"Biggest",$0}'
Smallest 0.25 Biggest 3.69

Hope this helps.
OUTPUT="3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94"
SORTED=`echo $OUTPUT | tr ' ' '\n' | sort -n`
SMALLEST=`echo "$SORTED" | head -n 1`
BIGGEST=`echo "$SORTED" | tail -n 1`
echo "Smallest is $SMALLEST"
echo "Biggest is $BIGGEST"
Added op's awk oneliner request.
I'm not good at awk, but this works anyway. :)
echo "3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94" | awk '{
for (i=1; i<=NF; i++) {
if (length(s) == 0) s = $i;
if (length(b) == 0) b = $i;
if ($i < s) s = $i;
if (b < $i) b = $i;
}
print "Smallest is", s;
print "Biggest is", b;
}'

You want an awk solution?
echo "3.69 0.25 0.80 1.78 3.04 1.99 0.71 0.50 0.94" | \
awk -v RS=' ' '/.+/ { biggest = ((biggest == "") || ($1 > biggest)) ? $1 : biggest;
smallest = ((smallest == "") || ($1 < smallest)) ? $1:smallest}
END { print biggest, smallest}'
Produce the following output:
3.69 0.25

You can use this method also
sort file | echo -e `sed -nr '1{s/(.*)/smallest is :\1/gp};${s/(.*)/biggest no is :\1/gp'}`

TXR solution:
$ txr -e '(let ((nums [mapcar tofloat (gun (get-line))]))
(if nums
(pprinl `smallest is #(find-min nums) and biggest is #(find-max nums)`)
(pprinl "empty input")))'
0.1
-1.0
3.5
2.4
smallest is -1.0 and biggest is 3.5

Script for monitoring disk i/o rates on Linux

I need a script for monitoring ALL disk i/o rates on Linux using bash, awk, sed. The problem is that it must return one row per time interval (so this one row should contain: tps, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn, but summed over all disks).
The natural choice here is of course:
iostat -d -k -p $interval $loops
To limit it to all disks I use:
iostat -d -k -p `parted -l | grep Disk | cut -f1 -d: | cut -f2 -d' '`
Now the nice trick to summarize the columns:
iostat -d -k -p `parted -l | grep Disk | cut -f1 -d: | cut -f2 -d' '` > /tmp/jPtafDiskIO.txt
echo `date +"%H:%M:%S"`,`awk 'FNR>2' /tmp/jPtafDiskIO.txt | awk 'BEGIN {FS=OFS=" "}NR == 1 { n1 = $2; n2 = $3; n3 = $4; n4 = $5; n5 = $6; next } { n1 += $2; n2 += $3; n3 += $4; n4 += $5; n5 += $6 } END { print n1","n2","n3","n4","n5 }'` >> diskIO.log
I am almost there; however, running this in a loop starts the command from scratch each time, so I don't get statistics from interval to interval, only the same averages (each invocation gives me pretty much the same output).
I know it sounds complicated, but maybe somebody has an idea? Maybe totally different approach?
Thx.
EDIT:
Sample input (/tmp/jPtafDiskIO.txt):
Linux 2.6.18-194.el5 (hostname) 08/25/2012

Device:  tps    kB_read/s  kB_wrtn/s  kB_read     kB_wrtn
sda      0.00   0.00       0.00       35655       59
sda2     0.00   0.00       0.00       67          272
sda1     0.00   0.00       0.00       521         274
sdb      52.53  0.56       569.40     20894989    21065384388
sdc      1.90   64.64      10.93      2391333384  404432217
sdd      0.00   0.00       0.04       17880       1343028
Output diskIO.log:
16:53:12,54.43,65.2,580.37,2412282496,21471160238

Why not use iotop http://guichaz.free.fr/iotop/ ?

dstat might be what you're looking for. It has a lot of things it can report on, with some common ones displayed by default.
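If staying with bash and awk is a requirement, one possible direction is to run the command once with an interval and count, then sum each report block separately instead of restarting it in a loop. When iostat is given an interval, its first report covers the time since boot and each later report covers only the preceding interval. The sketch below is untested against a real system and is run here on canned text shaped like the sample input in the question. The name sum_reports and the sample numbers in the second block are made up.

```shell
#!/bin/sh
# sum_reports (hypothetical helper): read `iostat -d -k` style text on stdin
# and print one comma-separated line of column totals per report block.
# Assumption: every report starts with a "Device" header line and reports
# are separated by blank lines.
sum_reports() {
    awk '
        function emit() {
            if (rows) printf "%.2f,%.2f,%.2f,%.0f,%.0f\n", tps, rps, wps, rd, wr
            tps = rps = wps = rd = wr = rows = 0
        }
        /^Device/ { emit(); in_block = 1; next }   # header starts a new report
        NF == 0   { emit(); in_block = 0; next }   # blank line ends a report
        in_block  { tps += $2; rps += $3; wps += $4; rd += $5; wr += $6; rows++ }
        END       { emit() }
    '
}

# Canned input: first block uses rows from the question, second is invented.
sum_reports <<'EOF'
Linux 2.6.18-194.el5 (hostname) 08/25/2012

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 0.00 0.00 0.00 35655 59
sdb 52.53 0.56 569.40 20894989 21065384388

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 1.00 2.00 3.00 4 5
sdb 1.50 0.50 1.25 6 7
EOF
```

Live use would presumably look like iostat -d -k $interval $loops | sum_reports, discarding the first output line if the since-boot averages are not wanted. The timestamp column from the original script is left out here.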
