Best way to divide in bash using pipes? - linux

I'm just looking for an easy way to divide a number (or provide other math functions). Let's say I have the following command:
find . -name '*.mp4' | wc -l
How can I take the result of wc -l and divide it by 3?
The examples I've seen don't deal with re-directed out/in.

Using bc:
$ bc -l <<< "scale=2;$(find . -name '*.mp4' | wc -l)/3"
2.33
In contrast, the bash shell only performs integer arithmetic.
Awk is also very powerful:
$ find . -name '*.mp4' | wc -l | awk '{print $1/3}'
2.33333
You don't even need wc if using awk:
$ find . -name '*.mp4' | awk 'END {print NR/3}'
2.33333

Edit 2018-02-22: Adding shell connector
There is more than 1 way:
Depending on precision required and number of calcul to be done! See shell connector further!
Using bc (binary calculator)
find . -type f -name '*.mp4' -printf \\n | wc -l | xargs printf "%d/3\n" | bc -l
6243.33333333333333333333
or
echo $(find . -name '*.mp4' -printf \\n | wc -l)/3|bc -l
6243.33333333333333333333
or using bash, result in integer only:
echo $(($(find . -name '*.mp4' -printf \\n| wc -l)/3))
6243
Using bash interger builtin math processor
res=000$((($(find . -type f -name '*.mp4' -printf "1+")0)*1000/3))
printf -v res "%.2f" ${res:0:${#res}-3}.${res:${#res}-3}
echo $res
6243.33
Pure bash
With recent 64bits bash, you could even use #glennjackman's ideas of using globstar, but computing pseudo floating could be done by:
shopt -s globstar
files=(**/*.mp4)
shopt -u globstar
res=$[${#files[*]}000/3]
printf -v res "%.2f" ${res:0:${#res}-3}.${res:${#res}-3}
echo $res
6243.33
There is no fork and $res contain a two digit rounded floating value.
Nota: Care about symlinks when using globstar and **!
Introducing shell connector
If you plan to do a lot of calculs, require high precision and use bash, you could use long running bc sub process:
mkfifo /tmp/mybcfifo
exec 5> >(exec bc -l >/tmp/mybcfifo)
exec 6</tmp/mybcfifo
rm /tmp/mybcfifo
then now:
echo >&5 '12/34'
read -u 6 result
echo $result
.35294117647058823529
This subprocess stay open and useable:
ps --sid $(ps ho sid $$) fw
PID TTY STAT TIME COMMAND
18027 pts/9 Ss 0:00 bash
18258 pts/9 S 0:00 \_ bc -l
18789 pts/9 R+ 0:00 \_ ps --sid 18027 fw
Computing $PI:
echo >&5 '4*a(1)'
read -u 6 PI
echo $PI
3.14159265358979323844
To terminate sub process:
exec 6<&-
exec 5>&-
Little demo, about The best way to divide in bash using pipes!
Computing range {1..157} / 42 ( I will let you google for answer to the ultimate question of life, the universe, and everything ;)
... and print 13 result by lines in order to reduce output:
printf -v form "%s" "%5.3f "{,}{,}{,,};form+="%5.3f\n";
By regular way
testBc(){
for ((i=1; i<157; i++)) ;do
echo $(bc -l <<<"$i/42");
done
}
By using long running bc sub process:
testLongBc(){
mkfifo /tmp/mybcfifo;
exec 5> >(exec bc -l >/tmp/mybcfifo);
exec 6< /tmp/mybcfifo;
rm /tmp/mybcfifo;
for ((i=1; i<157; i++)) ;do
echo "$i/42" 1>&5;
read -u 6 result;
echo $result;
done;
exec 6>&-;
exec 5>&-
}
Let's see without:
time printf "$form" $(testBc)
0.024 0.048 0.071 0.095 0.119 0.143 0.167 0.190 0.214 0.238 0.262 0.286 0.310
0.333 0.357 0.381 0.405 0.429 0.452 0.476 0.500 0.524 0.548 0.571 0.595 0.619
0.643 0.667 0.690 0.714 0.738 0.762 0.786 0.810 0.833 0.857 0.881 0.905 0.929
0.952 0.976 1.000 1.024 1.048 1.071 1.095 1.119 1.143 1.167 1.190 1.214 1.238
1.262 1.286 1.310 1.333 1.357 1.381 1.405 1.429 1.452 1.476 1.500 1.524 1.548
1.571 1.595 1.619 1.643 1.667 1.690 1.714 1.738 1.762 1.786 1.810 1.833 1.857
1.881 1.905 1.929 1.952 1.976 2.000 2.024 2.048 2.071 2.095 2.119 2.143 2.167
2.190 2.214 2.238 2.262 2.286 2.310 2.333 2.357 2.381 2.405 2.429 2.452 2.476
2.500 2.524 2.548 2.571 2.595 2.619 2.643 2.667 2.690 2.714 2.738 2.762 2.786
2.810 2.833 2.857 2.881 2.905 2.929 2.952 2.976 3.000 3.024 3.048 3.071 3.095
3.119 3.143 3.167 3.190 3.214 3.238 3.262 3.286 3.310 3.333 3.357 3.381 3.405
3.429 3.452 3.476 3.500 3.524 3.548 3.571 3.595 3.619 3.643 3.667 3.690 3.714
real 0m10.113s
user 0m0.900s
sys 0m1.290s
Wow! Ten seconds on my raspberry-pi!!
Then with:
time printf "$form" $(testLongBc)
0.024 0.048 0.071 0.095 0.119 0.143 0.167 0.190 0.214 0.238 0.262 0.286 0.310
0.333 0.357 0.381 0.405 0.429 0.452 0.476 0.500 0.524 0.548 0.571 0.595 0.619
0.643 0.667 0.690 0.714 0.738 0.762 0.786 0.810 0.833 0.857 0.881 0.905 0.929
0.952 0.976 1.000 1.024 1.048 1.071 1.095 1.119 1.143 1.167 1.190 1.214 1.238
1.262 1.286 1.310 1.333 1.357 1.381 1.405 1.429 1.452 1.476 1.500 1.524 1.548
1.571 1.595 1.619 1.643 1.667 1.690 1.714 1.738 1.762 1.786 1.810 1.833 1.857
1.881 1.905 1.929 1.952 1.976 2.000 2.024 2.048 2.071 2.095 2.119 2.143 2.167
2.190 2.214 2.238 2.262 2.286 2.310 2.333 2.357 2.381 2.405 2.429 2.452 2.476
2.500 2.524 2.548 2.571 2.595 2.619 2.643 2.667 2.690 2.714 2.738 2.762 2.786
2.810 2.833 2.857 2.881 2.905 2.929 2.952 2.976 3.000 3.024 3.048 3.071 3.095
3.119 3.143 3.167 3.190 3.214 3.238 3.262 3.286 3.310 3.333 3.357 3.381 3.405
3.429 3.452 3.476 3.500 3.524 3.548 3.571 3.595 3.619 3.643 3.667 3.690 3.714
real 0m0.670s
user 0m0.190s
sys 0m0.070s
Less than one second!!
Hopefully, results are same, but execution time is very different!
My shell connector
I've published a connector function: Connector-bash on GitHub.com
and shell_connector.sh on my own site.
source shell_connector.sh
newConnector /usr/bin/bc -l 0 0
myBc 1764/42 result
echo $result
42.00000000000000000000

find . -name '*.mp4' | wc -l | xargs -I{} expr {} / 2
Best used if you have multiple outputs you'd like to pipe through xargs. Use{} as a placeholder for the expression term.

Depending on your bash version, you don't even need find for this simple task:
shopt -s nullglob globstar
files=( **/*.mp4 )
dc -e "3 k ${#files[#]} 3 / p"
This method will correctly handle the bizarre edgecase of filenames containing newlines.

Related

How to make this awk script simple and use in gnuscript in loop form

I am attempting to plot a multicolum file using gnuplot script.
I am doing it like
plot "100.dat" u ($1-CONS):($2*$3) w l lt 4 ,
"200.dat" u ($1-CONS):($2*$3) w l lt 2 ,
"300.dat" u ($1-CONS):($2*$3) w l lt 1
where CONS is my variable defined at the top of file.
My set xrange is [-0.2:0.2] while data in the scale is beyond this scale.
What I want to capture is (in loop form for multiple files):
maximum value of above three plots in negative and positive both sides and corresponding value of column 1 in my xrange for both the maximum.
in a shell script I can do it easily but I am facing problem in defining in my gnuscript
my shell script is below
for i in 100.0000 200.0000 200.0000
do
grep $i data.dat > $i.dat
awk '{print ($1-CONS), ($2*$3)}' $i.dat | awk '{ if($1 <= 0.2 && $1 >= 0.0) { print }}' > $i.p2.dat ; awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' $i.p2.dat | awk '{print $2}' > $i.p2Max.dat ; PMAX=$(cat $i.p2Max.dat) ; grep "$PMAX" $i.p2.dat | tail -n 1 >> MAX.dat
awk '{print ($1-CONS), ($2*$3)}' $i.dat | awk '{ if($1 <= 0.0 && $1 >= -0.2) { print }}' > $i.mi.dat ; awk 'BEGIN {min=1000000; max=0;}; { if($2<min && $2 != "") min = $2; if($2>max && $2 != "") max = $2; } END {print min, max}' $i.mi.dat | awk '{print $2}' > $i.mi_Max.dat ; N_MAX=$(cat $i.mi_Max.dat) ; grep "$N_MAX" $i.mi.dat | tail -n 1 >> MAX.dat
done
I am looking for a simple script that can be used in the gnuplot script in loop form so that if I have multiple data file and I need to grep the maximum of a colum two (on both the sides of the zero) then it store the maximum value of column two wrt corresponding value of column 1 separately for negative and positive scale.
I would love to see if this can be done using a loop so that I do not need to write all the lines repetitively.
Your description is a bit confusing to me. My understanding is the following: loop through several files and extract the maxima in the xranges [-0.2:0] and [0:0.2],
respectively.
Test data:
100.dat
-0.17 0.447 0.287
-0.13 0.353 0.936
-0.09 0.476 0.309
-0.05 0.504 0.220
-0.01 0.340 0.564
0.03 0.096 0.947
0.07 0.564 0.885
0.11 0.312 0.957
0.15 0.058 0.347
0.19 0.016 0.923
0.23 0.835 0.461
200.dat
-0.17 0.608 0.875
-0.13 0.266 0.805
-0.09 0.948 0.696
-0.05 0.513 0.800
-0.01 0.736 0.392
0.03 0.318 0.312
0.07 0.708 0.534
0.11 0.246 0.975
0.15 0.198 0.914
0.19 0.174 0.318
0.23 0.727 0.341
300.dat
-0.17 0.527 0.658
-0.13 0.166 0.340
-0.09 0.695 0.031
-0.05 0.623 0.542
-0.01 0.996 0.674
0.03 0.816 0.365
0.07 0.286 0.433
0.11 0.069 0.381
0.15 0.719 0.621
0.19 0.516 0.701
0.23 0.248 0.659
Code:
### loop of files and extracting values
reset session
FILES = "100.dat 200.dat 300.dat"
Count = words(FILES)
CONS = 0.03
# get maxima
array NegMaxX[Count]
array NegMaxY[Count]
array PosMaxX[Count]
array PosMaxY[Count]
do for [i=1:Count] {
stats [-0.2:0] word(FILES,i) u ($1-CONS):($2*$3) nooutput
NegMaxX[i] = STATS_pos_max_y
NegMaxY[i] = STATS_max_y
stats [0:0.2] word(FILES,i) u ($1-CONS):($2*$3) nooutput
PosMaxX[i] = STATS_pos_max_y
PosMaxY[i] = STATS_max_y
}
set xrange[-0.2:0.2]
# set labels
do for [i=1:Count] {
set label i*2-1 at NegMaxX[i], NegMaxY[i] sprintf("%.3f/%.3f",NegMaxX[i],NegMaxY[i])
set label i*2 at PosMaxX[i], PosMaxY[i] sprintf("%.3f/%.3f",PosMaxX[i],PosMaxY[i])
}
plot for [i=1:Count] word(FILES,i) u ($1-CONS):($2*$3) w l lt i ti word(FILES,i), \
### end of code
Result:

Bash alias cpu usage

I've tried this command but I have a percentage error calculator:
alias cpu="mpstat | awk '\$12 ~ /[0-9.]+/ { print 100 - $12\"%\" }'"
Thank you for help
Change it to this:
alias cpu="mpstat | awk '\$12 ~ /[0-9.]+/ { print 100 - \$12\"%\" }'"
\ was missing after 100 -.
-> mpstat
Linux 3.2.0-69-virtual (myhost) 01/06/2017 _x86_64_ (8 CPU)
10:18:16 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10:18:16 PM all 12.06 7.00 7.96 0.02 0.00 0.24 0.22 0.00 72.49
-> cpu
27.51%

time command output on an already running process

I have a process that spawns some other processes,
I want to use the time command on a specific process and get the same output as the time command.
Is that possible and how?
I want to use the time command on a specific process and get the same output as the time command.
Probably it is enough just to use pidstat to get user and sys time:
$ pidstat -p 30122 1 4
Linux 2.6.32-431.el6.x86_64 (hostname) 05/15/2014 _x86_64_ (8 CPU)
04:42:28 PM PID %usr %system %guest %CPU CPU Command
04:42:29 PM 30122 706.00 16.00 0.00 722.00 3 has_serverd
04:42:30 PM 30122 714.00 12.00 0.00 726.00 3 has_serverd
04:42:31 PM 30122 714.00 14.00 0.00 728.00 3 has_serverd
04:42:32 PM 30122 708.00 16.00 0.00 724.00 3 has_serverd
Average: 30122 710.50 14.50 0.00 725.00 - has_serverd
If not then according to strace time uses wait4 system call (http://linux.die.net/man/2/wait4) to get information about a process from the kernel. The same info returns getrusage but you cannot call it for an arbitrary process according to its documentation (http://linux.die.net/man/2/getrusage).
So, I do not know any command that will give the same output. However it is feasible to create a bash script that gets PID of the specific process and outputs something like time outpus then
This script does these steps:
1) Get the number of clock ticks per second
getconf CLK_TCK
I assume it is 100 and 1 tick is equal to 10 milliseconds.
2) Then in loop do the same sequence of commands while exists the directory /proc/YOUR-PID:
while [ -e "/proc/YOUR-PID" ];
do
read USER_TIME SYS_TIME REAL_TIME <<< $(cat /proc/PID/stat | awk '{print $14, $15, $22;}')
sleep 0.1
end loop
Some explanation - according to man proc :
user time: ($14) - utime - Amount of time that this process has been scheduled in user mode, measured in clock ticks
sys time: ($15) - stime - Amount of time that this process has been scheduled in kernel mode, measured in clock ticks
starttime ($22) - The time in jiffies the process started after system boot.
3) When the process is finished get finish time
read FINISH_TIME <<< $(cat '/proc/self/stat' | awk '{print $22;}')
And then output:
the real time = ($FINISH_TIME-$REAL_TIME) * 10 - in milliseconds
user time: ($USER_TIME/(getconf CLK_TCK)) * 10 - in milliseconds
sys time: ($SYS_TIME/(getconf CLK_TCK)) * 10 - in milliseconds
I think it should give roughly the same result as time. One possible problem I see is if the process exists for a very short period of time.
This is my implementation of time:
#!/bin/bash
# Uses herestrings
print_res_jeffies()
{
let "TIME_M=$2/60000"
let "TIME_S=($2-$TIME_M*60000)/1000"
let "TIME_MS=$2-$TIME_M*60000-$TIME_S*1000"
printf "%s\t%dm%d.%03dms\n" $1 $TIME_M $TIME_S $TIME_MS
}
print_res_ticks()
{
let "TIME_M=$2/6000"
let "TIME_S=($2-$TIME_M*6000)/100"
let "TIME_MS=($2-$TIME_M*6000-$TIME_S*100)*10"
printf "%s\t%dm%d.%03dms\n" $1 $TIME_M $TIME_S $TIME_MS
}
if [ $(getconf CLK_TCK) != 100 ]; then
exit 1;
fi
if [ $# != 1 ]; then
exit 1;
fi
PROC_DIR="/proc/"$1
if [ ! -e $PROC_DIR ]; then
exit 1
fi
USER_TIME=0
SYS_TIME=0
START_TIME=0
while [ -e $PROC_DIR ]; do
read TMP_USER_TIME TMP_SYS_TIME TMP_START_TIME <<< $(cat $PROC_DIR/stat | awk '{print $14, $15, $22;}')
if [ -e $PROC_DIR ]; then
USER_TIME=$TMP_USER_TIME
SYS_TIME=$TMP_SYS_TIME
START_TIME=$TMP_START_TIME
sleep 0.1
else
break
fi
done
read FINISH_TIME <<< $(cat '/proc/self/stat' | awk '{print $22;}')
let "REAL_TIME=($FINISH_TIME - $START_TIME)*10"
print_res_jeffies 'real' $REAL_TIME
print_res_ticks 'user' $USER_TIME
print_res_ticks 'sys' $SYS_TIME
And this is an example that compares my implementation of time and real time:
>time ./sys_intensive > /dev/null
Alarm clock
real 0m10.004s
user 0m9.883s
sys 0m0.034s
In another terminal window I run my_time.sh and give it PID:
>./my_time.sh `pidof sys_intensive`
real 0m10.010ms
user 0m9.780ms
sys 0m0.030ms

Linux differences between consecutive lines

I need to loop trough n lines of a file and for any i between 1 and n - 1 to get the difference line(n - 1) - line(n).
And here is the source file:
root#syncro:/var/www# cat cron.log | grep "/dev/vda"
/dev/vda 20418M 14799M 4595M 77% /
/dev/vda 20418M 14822M 4572M 77% /
/dev/vda 20418M 14846M 4548M 77% /
/dev/vda 20418M 14867M 4527M 77% /
/dev/vda 20418M 14888M 4506M 77% /
/dev/vda 20418M 14910M 4484M 77% /
/dev/vda 20418M 14935M 4459M 78% /
/dev/vda 20418M 14953M 4441M 78% /
/dev/vda 20418M 14974M 4420M 78% /
/dev/vda 20418M 15017M 4377M 78% /
/dev/vda 20418M 15038M 4356M 78% /
root#syncro:/var/www# cat cron.log | grep "/dev/vda" | cut -b 36-42 | tr -d " M"
4595
4572
4548
4527
4506
4484
4459
4441
4420
4377
4356
those /dev/vda... lines are logged hourly with df -BM in cron.log file and the difference between lines will reveal the hourly disk consumption.
So, the expected output will be:
23 (4595 - 4572)
24 (4572 - 4548)
...
43 (4420 - 4377)
21 (4377 - 4356)
I don't need the text between ( and ), I put it here for explanation only.
I'm not sure if I got you correctly, but the following awk script should work:
awk '{if(NR>1){print _n-$4};_n=$4}' your.file
Output:
23
24
21
21
22
25
18
21
43
21
You don't need the other programs in the pipe. Just:
awk '/\/dev\/vda/ {if(c++>0){print _n-$4};_n=$4}' src/checkout-plugin/a.txt
will be enough. The regex on start of the awk scripts tells awk to apply the following block only to lines which match the pattern. A side effect is that NR can't be used anymore to detect the "second line" in which the calculation starts. I introduced a custome counter c for that purpose.
Also note that awk will remove the M on it's own, because the column has been used in a numeric calculation.

find all users who has over N process and echo them in shell

I'm writing script is ksh. Need to find all users who has over N process and echo them in shell.
N reads from ksh.
I know what I should use ps -elf but how parse it, find users with >N process and create array with them. Little troubles with array in ksh. Please help. Maybe simple solutions can help me instead of array creating.
s162103#helios:/home/s162103$ ps -elf
0 S s153308 4804 1 0 40 20 ? 17666 ? 11:03:08 ? 0:00 /usr/lib/gnome-settings-daemon --oa
0 S root 6546 1327 0 40 20 ? 3584 ? 11:14:06 ? 0:00 /usr/dt/bin/dtlogin -daemon -udpPor
0 S webservd 15646 485 0 40 20 ? 2823 ? п╪п╟я─я ? 0:23 /opt/csw/sbin/nginx
0 S s153246 6746 6741 0 40 20 ? 18103 ? 11:14:21 ? 0:00 iiim-panel --disable-crash-dialog
0 S s153246 23512 1 0 40 20 ? 17903 ? 09:34:08 ? 0:00 /usr/bin/metacity --sm-client-id=de
0 S root 933 861 0 40 20 ? 5234 ? 10:26:59 ? 0:00 dtgreet -display :14
...
when i type
ps -elf | awk '{a[$3]++;}END{for(i in a)if (a[i]>N)print i, a[i];}' N=1
s162103#helios:/home/s162103$ ps -elf | awk '{a[$3]++;}END{for(i in a)if (a[i]>N)print i, a[i];}' N=1
root 118
/usr/sadm/lib/smc/bin/smcboot 3
/usr/lib/autofs/automountd 2
/opt/SUNWut/lib/utsessiond 2
nasty 31
dima 22
/opt/oracle/product/Oracle_WT1/ohs/ 7
/usr/lib/ssh/sshd 5
/usr/bin/bash 11
that is not user /usr/sadm/lib/smc/bin/smcboot
there is last field in ps -elf ,not user
Something like this(assuming 3rd field of your ps command gives the user id):
ps -elf |
awk '{a[$3]++;}
END {
for(i in a)
if (a[i]>N)
print i, a[i];
}' N=3
The minimal ps command you want to use here is ps -eo user=. This will just print the username for each process and nothing more. The rest can be done with awk:
ps -eo user= |
awk -v max=3 '{ n[$1]++ }
END {
for (user in n)
if (n[user]>max)
print n[user], user
}'
I recommend to put the count in the first column for readability.
read number
ps -elfo user= | sort | uniq -c | while read count user
do
if (( $count > $number ))
then
echo $user
fi
done
That is best solution and it works!

Resources