Plot graph using shell for total file count and date created - linux

I want to create a histogram with the total file count in intervals of 50 on the Y-axis and the creation time in weeks on the X-axis (i.e. whether new files were created between week 1 and week 2, and so on).
Something like
200, 150, 100, 50 files created during a certain week
7, 14, 21, 28 days on Y-axis. I am kind of lost on how to implement this. Any help is appreciated.
Update: I am trying along these lines
find <dirname> -type f -ctime -1 -ctime -7 | wc -l
find <dirname> -type f -ctime +7 -ctime -14 | wc -l
Find the max number and use this as my X-axis upper limit. Then divide this number into equal intervals to plot my X-axis

This is a start using GNU awk for time functions (untested since you didn't provide concise, testable sample input that we could test against):
find "$1" -type f -printf '%T@ %p\0' |
awk -v RS='\0' '
BEGIN {
nowSecs = systime()
}
{
fileName = gensub(/\S+\s+/,"",1)
fileModSecs = int($1)
fileAgeSecs = nowSecs - fileModSecs
fileAgeDays = int(fileAgeSecs / (24 * 60 * 60))
fileAgeWeeks = int(fileAgeDays / 7)
weekNr = fileAgeWeeks + 1
fileCnts[weekNr]++
numWeeks = (weekNr > numWeeks ? weekNr : numWeeks)
maxFileCnt = (fileCnts[weekNr] > maxFileCnt ? fileCnts[weekNr] : maxFileCnt)
print nowSecs, fileModSecs, fileAgeSecs, fileAgeDays, fileAgeWeeks, weekNr, fileName | "cat>&2"
}
END {
for (fileCnt=maxFileCnt; fileCnt>0; fileCnt--) {
for (weekNr=1; weekNr<=numWeeks; weekNr++) {
if (weekNr in fileCnts) {
char[weekNr] = "*"
}
printf "%s%s", char[weekNr], (weekNr<numWeeks ? OFS : ORS)
}
}
for (weekNr=1; weekNr<=numWeeks; weekNr++) {
printf "%s%s", weekNr, (weekNr<numWeeks ? OFS : ORS)
}
}
'
You need to figure out the details of the loops in the END section for printing the histogram but the above at least shows you how to get the count of files by week without calling find multiple times and hard-coding the number of days week by week.
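One possible way to finish the printing loops is sketched below. It is not the answer author's code: the function name week_histogram and its STEP argument are made up for illustration, and it assumes the per-week counts have already been collected (for example from the fileCnts array above). A column gets a "*" on every row whose level its count reaches, so counts below one step do not show up at all.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): vertical histogram of per-week counts.
# Usage: week_histogram STEP COUNT_WEEK1 COUNT_WEEK2 ...
week_histogram() {
    local step=$1; shift
    printf '%s\n' "$@" | awk -v step="$step" '
        { cnt[NR] = $1; if ($1 > max) max = $1 }
        END {
            # Round the top of the count axis up to a multiple of step.
            top = int((max + step - 1) / step) * step
            for (level = top; level >= step; level -= step) {
                printf "%5d |", level
                for (w = 1; w <= NR; w++)
                    printf " %s", (cnt[w] >= level ? "*" : " ")
                print ""
            }
            # Axis line, then the week numbers underneath.
            printf "%5s +", ""
            for (w = 1; w <= NR; w++) printf "--"
            print ""
            printf "%5s  ", ""
            for (w = 1; w <= NR; w++) printf " %d", w
            print ""
        }'
}

# Example: four weeks with 120, 60, 200 and 10 files, rows every 50 files.
week_histogram 50 120 60 200 10
```

The column alignment only holds for single-digit week numbers; wider labels would need a wider column format.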

Apologies for this being ksh instead of bash (my bash level is near echo "Hello World") :)...
Would this do what you need?
#!/bin/ksh
######################################
#
# statDirReport.sh
#
version="1.0"
# Andre Gelinas, 2018
#
######################################
#############
# Variables
#############
typeset -F2 SCALE
# Max value of X
X_SCALE=30
#############
# Main
#############
if [[ -n $1 ]]; then
DIRNAME=$1
else
print -n "Enter full path to stat : "; read DIRNAME
fi
if [[ ! -d $DIRNAME || ! -r $DIRNAME || ! -x $DIRNAME ]]; then
print "ERROR - Directory unusable - Exiting"
exit
fi
## Getting the data
CTIME1=1
CTIME2=0
for ((i=1;i<=4;i++)); do
CTIME2=$(($i*7))
FILE_COUNT[$i]=$(find $DIRNAME -type f -ctime +$CTIME1 -ctime -$CTIME2 | wc -l)
#To find the max amount later on
F_COUNT[${FILE_COUNT[$i]}]=${FILE_COUNT[$i]}
#
CTIME1=$CTIME2
done
#Doing some math
## Highest number of file
MAX_COUNT=${F_COUNT[-1]}
## Find the value of each tick
SCALE=$(($MAX_COUNT/$X_SCALE))
## Find the real length of the histogram for each week
## having the highest amount using full x scale (integer mathematics)
for ((i=1;i<=4;i++)); do
DATA_2_SCALE[$i]=$(((${FILE_COUNT[$i]}*$X_SCALE)/$MAX_COUNT))
done
# Getting the report
typeset -L2 Col1
typeset -L1 Col2
typeset -L$(($X_SCALE+5)) Col3
typeset -L5 Col4
Col1="Wk"
Col2=" "
Col3="Data"
Col4="Real"
clear
print "statDirReport v$version\tScale is #=$SCALE\n"
print "$Col1$Col2$Col3$Col4\n"
for ((i=1;i<=4;i++)); do
Col1=$i
Col2="|"
graph=""
Col4=${FILE_COUNT[$i]}
for ((j=1;j<=${DATA_2_SCALE[$i]};j++)); do
graph+="#"
done
Col3=$graph
print "$Col1$Col2$Col3$Col4"
done
Edit: to add dates as titles for the histogram bars, replace the last part, right after the "DATA_2_SCALE" loop, with:
#Setting the title of each histogram
## Finding how many sec since the beginning of time
TODAY_SEC=$(date +"%s")
## Finding real date for find range
SEC_PER_DAY=86400
lastDate=$(date -u -d @"$TODAY_SEC" +"%m/%d")
for ((i=1;i<=4;i++)); do
firstDate=$(date -u -d @"$(($TODAY_SEC-(7*$i*$SEC_PER_DAY)))" +"%m/%d")
WEEK[$i]=$firstDate" to "$lastDate" "
lastDate=$firstDate
done
# Getting the report
typeset -L15 Col1
typeset -L1 Col2
typeset -L$(($X_SCALE+5)) Col3
typeset -L5 Col4
Col1="Wk"
Col2=" "
Col3="Data"
Col4="Real"
clear
print "statDirReport v$version\tScale is #=$SCALE\n"
print "$Col1$Col2$Col3$Col4\n"
for ((i=1;i<=4;i++)); do
Col1=${WEEK[$i]}
Col2="|"
graph=""
Col4=${FILE_COUNT[$i]}
for ((j=1;j<=${DATA_2_SCALE[$i]};j++)); do
graph+="#"
done
Col3=$graph
print "$Col1$Col2$Col3$Col4"
done
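For bash users, the scaling step of the script above could be sketched as follows. The function name scaled_bar and its argument order are assumptions made for this illustration. The sketch also guards against a maximum of 0, because the integer division by MAX_COUNT in the script would fail if no files were found in any week.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print COUNT as a bar of '#' scaled so that MAX fills WIDTH columns.
# Usage: scaled_bar COUNT MAX [WIDTH]
scaled_bar() {
    local count=$1 max=$2 width=${3:-30} len bar
    if (( max <= 0 )); then
        echo ""                 # nothing to scale against: print an empty bar
        return 0
    fi
    len=$(( count * width / max ))   # integer arithmetic, as in the ksh script
    printf -v bar '%*s' "$len" ''    # a string of len spaces
    printf '%s\n' "${bar// /#}"      # turn every space into '#'
}

scaled_bar 50 100 10     # half of the maximum: 5 of 10 columns
scaled_bar 100 100 10    # the maximum itself: all 10 columns
```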

Using feedgnuplot on a home directory:
dirname=~
e=0
for f in `seq 7 7 28` ; do
find "${dirname}" -type f -ctime +$e -ctime -$f | wc -l
e=$f
done 2> /dev/null |
feedgnuplot --terminal 'dumb 50,15' --with boxes --unset grid --exit
Output:
5500 +-+-----+-------+------+-------+-----+-+
5000 +-+ ********* + + + +-+
4500 +-+ * * ******** +-+
4000 +-+ * * * * +-+
3500 +-+ * * * * +-+
3000 +-+ * * * * +-+
2500 +-+ * ********* * +-+
2000 +-+ * * * * +-+
1500 +-+ * * * * +-+
1000 +-+ * + * + * + ********* +-+
500 +-+-********************************-+-+
0 1 2 3 4 5

Related

Time difference in shell (hour)

I'm trying to calculate time difference stored inside of two variables inside of a shell script, I'm observing the following pattern:
hhmm -> 0950
so:
time1=1333
time2=0950
Now I need to calculate the difference in time between time1 and time2, as for now I have tried:
deltaTime=$(($time1-$time2))
but I'm facing the following error message
1333-0950: value too great for base (error token is "0950")
I'm expecting as a result: $deltaTime=0343
Unfortunately, I am strictly bound to use this time pattern. I have already researched for a solution online, some of them propose to use date -d... but I couldn't get it to work :(
Your approach has two issues.
First issue: bash interprets numbers with leading zeroes as octal. You can force base 10 by adding the 10# prefix.
Second issue: it is incorrect to treat strings in hhmm format as numbers and subtract them, e.g. 1333-950=383, but the difference between 09:50 and 13:33 is 3 hours and 43 minutes. You should convert the string values to a common unit, e.g. minutes, subtract them, and convert the result back to hhmm format.
time1=1333
time2=0950
str2min()
{
printf "%u" $((10#${1%??} * 60 + 10#${1#??}))
}
min2str()
{
printf "%02u%02u" $(($1 / 60)) $(($1 % 60))
}
time1m=$(str2min $time1)
time2m=$(str2min $time2)
timediff=$(($time1m - $time2m))
deltaTime=$(min2str $timediff)
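The code above expects time1 to be the later time. A variant that also handles the opposite case might look like the sketch below; the function name hhmm_diff is hypothetical, and wrapping a negative difference around midnight is an assumption about the desired behaviour, not something the question states.

```bash
#!/usr/bin/env bash
# Hypothetical variant: difference between two hhmm strings, wrapping over midnight.
hhmm_diff() {
    local t1=$1 t2=$2 m1 m2 diff
    # 10# forces base 10, so 09 and 08 are not rejected as invalid octal.
    m1=$(( 10#${t1%??} * 60 + 10#${t1#??} ))
    m2=$(( 10#${t2%??} * 60 + 10#${t2#??} ))
    diff=$(( m1 - m2 ))
    if (( diff < 0 )); then
        diff=$(( diff + 1440 ))   # assumption: t1 lies on the following day
    fi
    printf '%02d%02d\n' $(( diff / 60 )) $(( diff % 60 ))
}

hhmm_diff 1333 0950   # 813 - 590 = 223 minutes
hhmm_diff 0733 0950   # negative, so 1440 minutes are added
```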
You could use this implementation maybe?
#!/usr/bin/env bash
diff_hhmm() {
local -r from=$1
local -i from_hh=10#${from:0:2} # skip 0 chars, read 2 chars (`${from:0:2}`) using base 10 (`10#`)
local -ri from_mm=10#${from:2:2} # skip 2 chars, read 2 chars (`${from:2:2}`) using base 10 (`10#`)
local -r upto=$2
local -ri upto_hh=10#${upto:0:2}
local -ri upto_mm=10#${upto:2:2}
local -i diff_hh
local -i diff_mm
# Compute difference in minutes
(( diff_mm = from_mm - upto_mm ))
# If it's negative, we've "breached" into the previous hour, so adjust
# the `diff_mm` value to be modulo 60 and compensate the `from_hh` var
# to reflect that we've already subtracted some of the minutes there.
if (( diff_mm < 0 )); then
(( diff_mm += 60 ))
(( from_hh -= 1 ))
fi
# Compute difference in hours
(( diff_hh = from_hh - upto_hh ))
# Ensure the result is modulo 24, the number of hours in a day.
if (( diff_hh < 0 )); then
(( diff_hh += 24 ))
fi
# Print the values with 0-padding if necessary.
printf '%02d%02d\n' "$diff_hh" "$diff_mm"
}
$ diff_hhmm 1333 0950
0343
$ diff_hhmm 0733 0950
2143
$ diff_hhmm 0733 0930
2203
Or an even shorter implementation using a big arithmetic compound command ((( ... )) ) and inlining some variables:
diff_hhmm_terse() {
local -i diff_hh diff_mm
((
diff_mm = 10#${1:2:2} - 10#${2:2:2},
diff_hh = 10#${1:0:2} - 10#${2:0:2},
diff_hh -= diff_mm < 0 ? 1 : 0,
diff_mm += diff_mm < 0 ? 60 : 0,
diff_hh += diff_hh < 0 ? 24 : 0
))
printf '%02d%02d\n' "$diff_hh" "$diff_mm"
}
Do you have the possibility to drop the leading zero?
As you can see from my prompt:
Prompt> echo $((1333-0950))
-bash: 1333-0950: value too great for base (error token is "0950")
Prompt> echo $((1333-950))
383
Other proposal:
date '+%s'
Let me give you some examples:
date '+%s'
1662357975
... (after some time)
date '+%s'
1662458180
=>
echo $((1662458180-1662357975))
100205 (amount of seconds)
=>
echo $(((1662458180-1662357975)/3600))
27 (amount of hours)
This bash one-liner may be used if time difference is not negative (that is, time1 >= time2):
printf '%04d\n' $(( 10#$time1 - 10#$time2 - (10#${time1: -2} < 10#${time2: -2} ? 40 : 0) ))
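The constant 40 in that one-liner comes from the borrow: when the minutes of time1 are smaller than those of time2, plain decimal subtraction borrows 100 from the hours column, while an hour is only worth 60 minutes, so 40 too many are left in the result. The wrapper below (the name hhmm_diff_oneliner is made up here) only exists to make the one-liner easy to try with different inputs.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner; valid only when time1 >= time2.
hhmm_diff_oneliner() {
    local time1=$1 time2=$2
    printf '%04d\n' $(( 10#$time1 - 10#$time2 - (10#${time1: -2} < 10#${time2: -2} ? 40 : 0) ))
}

hhmm_diff_oneliner 1333 0950   # 1333 - 950 = 383; 33 < 50, so 383 - 40 = 343
hhmm_diff_oneliner 1350 0933   # 1350 - 933 = 417; 50 >= 33, so no correction
```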

Calculating midpoint, initial bearing, and distance in bash

SOLVED:
I'm trying to write a script that calculates the distance, bearing, and mid-point given a pair of lat lon coords.
I found the formulas easily enough, but I'm getting the wrong answers. It might just be a math mistake, but I've looked it over multiple times and I'm missing something.
I'm following this website's formulas: http://www.movable-type.co.uk/scripts/latlong-nomodule.html
Here's what I get, output is dist, bearing, midpoint lat, midpoint long
script.bash 1 -80 -3 -79.2
453.58,158.22,68.1258,95.390
this is what I should get:
script.bash 1 -80 -3 -79.2
453.6 168.7 -1 -79.6
So, distance looks good. But the others are all off. Any thoughts?
Here is my code:
#!/bin/bash
lat1=$1
lat2=$3
lon1=$2
lon2=$4
#some basic info
R=6371
lat1r=`echo "$lat1*3.14159/180" | bc -l`
lat2r=`echo "$lat2*3.14159/180" | bc -l`
lon1r=`echo "$lon1*3.14159/180" | bc -l`
lon2r=`echo "$lon2*3.14159/180" | bc -l`
dLat=`echo "$lat2r - $lat1r" | bc -l`
dLon=`echo "$lon2r - $lon1r" | bc -l`
#Distance calculations
a=`echo "-s ($dLat/2) * -s ($dLat/2) + -c ($lat1r) * -c ($lat2r) * -s ($dLon/2) * -s ($dLon/2)" | bc -l`
c1=`echo "sqrt($a) " | bc -l`
c2=`echo "sqrt(1 - $a)" | bc -l`
cat=`echo "$c1,$c2"| awk -F',' '{ print atan2($1,$2) }'`
c=`echo "2*$cat" | bc -l`
d=`echo "$R*$c" | bc -l`
#Bearing calculation
y=`echo "-s ($dLon) * -c ($lat2r)" | bc -l`
x=`echo "-c ($lat1r) * -s ($lat2r) - -s ($lat1r) * -c ($lat2r) * -c ($dLon)" | bc -l`
brng=`echo "$y,$x"| awk -F',' '{ print atan2($1,$2) }'`
brn=`echo "$brng * 180 / 3.14159" | bc -l`
echo "$brng * 180 / 3.14159"
#Mid point calculation
Bx=`echo "-c ($lat2r) * -c ($dLon)" | bc -l`
By=`echo "-c ($lat2r) * -s ($dLon)" | bc -l`
atc1=`echo " -s ($lat1r) + -s ($lat2r)" | bc -l`
atc2=`echo " sqrt( ( -c ($lat1r) + $Bx )^2 + $By^2 ) " | bc -l`
latmidr=`echo "$atc1,$atc2"| awk -F',' '{ print atan2($1,$2) }'`
latmid=`echo "$latmidr * 180 / 3.14159" | bc -l`
atc3=$By
atc4=`echo " -c ($lat1r) + $Bx" | bc -l`
lonmidr=`echo "$atc3,$atc4"| awk -F',' '{ print atan2($1,$2) }'`
lonmid=`echo "$lonmidr * 180 / 3.14159" | bc -l`
echo $d,$brn,$latmidr,$lonmid
That's a completely inappropriate job for a shell script, you should have done it as a single awk (or similar, e.g. perl, ruby, python) script. Btw naming a variable the same as a command (cat) obfuscates your code and makes it more error prone.
Here's what your starting point should be (check the math and conversions: I was guessing at what you were trying to do when piping strings with -s and -c to bc, so I almost certainly misinterpreted some of it):
$ cat tst.sh
#!/usr/bin/env bash
awk -v lat1="$1" -v lat2="$3" -v lon1="$2" -v lon2="$4" '
BEGIN {
#some basic info
pi = 3.14159
R = 6371
lat1r = lat1 * pi / 180
lat2r = lat2 * pi / 180
lon1r = lon1 * pi / 180
lon2r = lon2 * pi / 180
dLat = lat2r - lat1r
dLon = lon2r - lon1r
#Distance calculations
a = sin(dLat/2) * sin(dLat/2) + cos(lat1r) * cos(lat2r) * sin(dLon/2) * sin(dLon/2)
c1 = sqrt(a)
c2 = sqrt(1 - a)
cat = atan2(c1,c2)
c = 2 * cat
d = R * c
#Bearing calculation
x = cos(lat1r) * sin(lat2r) - sin(lat1r) * cos(lat2r) * cos(dLon)
y = sin(dLon) * cos(lat2r)
brng = atan2(y,x)
brn = brng * 180 / pi
print brng * 180 / pi
#Mid point calculation
Bx = cos(lat2r) * cos(dLon)
By = cos(lat2r) * sin(dLon)
atc1 = sin(lat1r) + sin(lat2r)
atc2 = sqrt( (cos(lat1r) + Bx )^2 + By^2 )
latmidr = atan2(atc1,atc2)
latmid = latmidr * 180 / pi
atc3 = By
atc4 = cos(lat1r) + Bx
lonmidr = lon1r + atan2(atc3,atc4)
lonmid = lonmidr * 180 / pi
print d, brn, latmid, lonmid
}
'
$ ./tst.sh 1 -80 -3 -79.2
168.696
453.581 168.696 -1.00002 -79.6002
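A slightly tightened variant is sketched below. It is an illustration, not the answer author's code: the function name geo_calc, the argument order (lat1 lon1 lat2 lon2, as on the question's command line) and the one-decimal output format are choices made here. It derives pi from atan2(0, -1) instead of hard-coding 3.14159 and normalises the bearing into the range 0 to 360 degrees.

```bash
#!/usr/bin/env bash
# Hypothetical function: distance (km), initial bearing, midpoint latitude and longitude.
# Usage: geo_calc LAT1 LON1 LAT2 LON2   (all in degrees)
geo_calc() {
    awk -v lat1="$1" -v lon1="$2" -v lat2="$3" -v lon2="$4" '
    BEGIN {
        pi  = atan2(0, -1)      # pi at full floating point precision
        rad = pi / 180          # degrees-to-radians factor
        R   = 6371              # mean Earth radius in km, as in the question
        p1 = lat1 * rad; p2 = lat2 * rad
        dLat = p2 - p1; dLon = (lon2 - lon1) * rad

        # Haversine distance
        a = sin(dLat/2)^2 + cos(p1) * cos(p2) * sin(dLon/2)^2
        d = R * 2 * atan2(sqrt(a), sqrt(1 - a))

        # Initial bearing, normalised to 0..360 degrees
        y = sin(dLon) * cos(p2)
        x = cos(p1) * sin(p2) - sin(p1) * cos(p2) * cos(dLon)
        brn = (atan2(y, x) / rad + 360) % 360

        # Midpoint
        Bx = cos(p2) * cos(dLon); By = cos(p2) * sin(dLon)
        latmid = atan2(sin(p1) + sin(p2), sqrt((cos(p1) + Bx)^2 + By^2)) / rad
        lonmid = lon1 + atan2(By, cos(p1) + Bx) / rad

        printf "%.1f %.1f %.1f %.1f\n", d, brn, latmid, lonmid
    }'
}

geo_calc 1 -80 -3 -79.2
```

The midpoint longitude is not normalised, so for point pairs that straddle the 180 degree meridian it may fall outside the range -180 to 180.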

Calculate the 5 minutes ceiling

Operating System: Red Hat Enterprise Linux Server 7.2 (Maipo)
I want to round the time to the nearest 5 minutes, only up, not down, for example:
08:09:15 should be 08:10:00
08:11:26 should be 08:15:00
08:17:58 should be 08:20:00
I have been trying with:
(date -d @$(( (($(date +%s) + 150) / 300) * 300)) "+%H:%M:%S")
This will round the time but also down (08:11:18 will result in 08:10:00 and not 08:15:00)
Any idea how i can achieve this?
You may use this utility function for your rounding up:
roundDt() {
local n=300
local str="$1"
date -d @$(( ($(date -d "$str" '+%s') + $n)/$n * $n)) '+%H:%M:%S'
}
Then invoke this function as:
roundDt '08:09:15'
08:10:00
roundDt '08:11:26'
08:15:00
roundDt '08:17:58'
08:20:00
To trace how this function is computing use -x (trace mode) after exporting:
export -f roundDt
bash -cx "roundDt '08:11:26'"
+ roundDt 08:11:26
+ typeset n=300
+ typeset str=08:11:26
++ date -d 08:11:26 +%s
+ date -d @1535631300 +%H:%M:%S
08:15:00
GNU date can already do this calculation itself. It is explained in the manual in the chapter "Relative items in date strings". So you need just one date call for the conversion.
d=$(date +%T) # get the current time
IFS=: read h m s <<< "$d" # parse it in hours, minutes and seconds
inc=$(( 300 - (10#$m * 60 + 10#$s) % 300 )) # calculate the seconds to increment (10# keeps 08 and 09 from being read as octal)
date -d "$d $inc sec" +%T # output the new time with the offset
Btw: +%T is the same as +%H:%M:%S.
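Both answers move a time that already sits exactly on a 5-minute mark to the next mark (08:10:00 becomes 08:15:00). If such a time should stay where it is, a strict ceiling can be used instead. The sketch below is a hypothetical helper (the name ceil5 is made up) that works purely with shell arithmetic on an HH:MM:SS string, so it does not depend on GNU date; treating the result as wrapping at midnight is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: round HH:MM:SS up to the next multiple of 5 minutes,
# leaving times that are already on a 5-minute mark unchanged.
ceil5() {
    local h m s total n=300
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that 08 and 09 are not treated as invalid octal.
    total=$(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
    total=$(( (total + n - 1) / n * n ))   # integer ceiling to a multiple of n
    printf '%02d:%02d:%02d\n' $(( total / 3600 % 24 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

ceil5 08:09:15   # rounds up to the next mark
ceil5 08:10:00   # already on a mark, stays as it is
```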

Split string in ksh

I have a string as follows:
foo=0j0h0min0s
What would be the best way to convert it to seconds without using date?
I tried something like this, which looked promising, but no luck:
#> IFS=: read -r j h min s <<<"$foo"
#> time_s=$((((j * 24 + h) * 60 + min) * 60 + s))
ksh: syntax error: `<' unexpected
Any idea is welcome, I just can't use date -d to make conversion as it is not present on the system I am working on.
<<<"$foo" is mainly a bash-ism. It is supported in some newer versions of ksh (search for 'ksh here string').
Your read is trying to split at :, which is not present in your input.
If you first replace the letters with blanks, you can split at whitespace (as usual)
and change the here-string to a here-doc:
#!/bin/ksh
foo=1j2h3min4s
# no quotes around the expansion: inside a here-doc they would be read as literal characters
read -r j h min s << END
${foo//[a-z]/ }
END
# or echo "${foo//[a-z]/ }" | read -r j h min s
time_s=$((((j * 24 + h) * 60 + min) * 60 + s))
echo ">$foo< = >${foo//[a-z]/ }< = $j|$h|$min|$s => >$time_s<"
>1j2h3min4s< = >1 2 3 4 < = 1|2|3|4 => >93784<
# or using array, easy to assign, more typing where used
typeset -a t=( ${foo//[a-z]/ } )
time_s=$(( (( t[0] * 24 + t[1]) * 60 + t[2]) * 60 + t[3] ))
echo ">$foo< = >${foo//[a-z]/ }< = ${t[0]}|${t[1]}|${t[2]}|${t[3]} => >$time_s<"
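If neither here-strings nor the ${var//pattern/} substitution are available, the fields can also be peeled off with the plain prefix and suffix expansions. The sketch below is only an illustration: the function name to_seconds is made up, and it assumes the input always has the exact shape NjNhNminNs with no leading zeros (a value such as 08 would be read as invalid octal in arithmetic).

```bash
#!/bin/sh
# Hypothetical helper: convert a string like 1j2h3min4s to seconds.
# The body runs in a subshell ( ... ) so the work variables do not leak out.
to_seconds() (
    rest=$1
    j=${rest%%j*};     rest=${rest#*j}      # days: everything before the first "j"
    h=${rest%%h*};     rest=${rest#*h}      # hours: everything before the first "h"
    min=${rest%%min*}; rest=${rest#*min}    # minutes: everything before "min"
    s=${rest%s}                             # seconds: drop the trailing "s"
    echo $(( ((j * 24 + h) * 60 + min) * 60 + s ))
)

to_seconds 1j2h3min4s
```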

bc truncate floating point number

How do I truncate a floating point number using bc?
e.g. if I do
echo '4.2-1.3' | bc
which outputs 2.9, how do I get it to truncate/use floor to get 2?
Use / operator.
echo '(4.2-1.3) / 1' | bc
Dividing by 1 works ok if scale is 0 (eg, if you start bc with bc and don't change scale) but fails if scale is positive (eg, if you start bc with bc -l or increase scale). (See transcript below.) For a general solution, use a trunc function like the following:
define trunc(x) { auto s; s=scale; scale=0; x=x/1; scale=s; return x }
Transcript that illustrates how divide by 1 by itself fails in the bc -l case, but how trunc function works ok at truncating toward zero:
> bc -l
bc 1.06.95
[etc...]
for (x=-4; x<4; x+=l(2)) { print x,"\t",x/1,"\n"}
-4 -4.00000000000000000000
-3.30685281944005469059 -3.30685281944005469059
-2.61370563888010938118 -2.61370563888010938118
-1.92055845832016407177 -1.92055845832016407177
-1.22741127776021876236 -1.22741127776021876236
-.53426409720027345295 -.53426409720027345295
.15888308335967185646 .15888308335967185646
.85203026391961716587 .85203026391961716587
1.54517744447956247528 1.54517744447956247528
2.23832462503950778469 2.23832462503950778469
2.93147180559945309410 2.93147180559945309410
3.62461898615939840351 3.62461898615939840351
define trunc(x) { auto s; s=scale; scale=0; x=x/1; scale=s; return x }
for (x=-4; x<4; x+=l(2)) { print x,"\t",trunc(x),"\n"}
-4 -4
-3.30685281944005469059 -3
-2.61370563888010938118 -2
-1.92055845832016407177 -1
-1.22741127776021876236 -1
-.53426409720027345295 0
.15888308335967185646 0
.85203026391961716587 0
1.54517744447956247528 1
2.23832462503950778469 2
2.93147180559945309410 2
3.62461898615939840351 3
Try the following solution. It will truncate anything after the decimal point without a problem:
echo 'x = 4.2 - 1.3; scale = 0; x / 1' | bc -l
echo 'x = l(101) / l(10); scale = 0; x / 1' | bc -l
You can make the code a tad shorter by performing calculations directly on the numbers:
echo 'scale = 0; (4.2 - 1.3) / 1' | bc -l
echo 'scale = 0; (l(101) / l(10)) / 1' | bc -l
In general, you can use this function to get only the integer part of a number:
define int(x) {
auto s;
s = scale;
scale = 0;
x /= 1; /* This will have the effect of truncating x to its integer value */
scale = s;
return (x);
}
Save that code into a file (let's call it int.bc) and run the following command:
echo 'int(4.2 - 1.3);' | bc -l int.bc
The variable governing the amount of decimals on division is scale.
So, if scale is 0 (the default), dividing by 1 would truncate to 0 decimals:
$ echo '(4.2-1.3) / 1 ' | bc
2
In other operations, the number of decimals is calculated from the scale (number of decimals) of each operand. In addition, subtraction and (with the default scale of 0) multiplication, for example, the resulting scale is the larger of the two:
$ echo ' 4.2 - 1.33333333 ' | bc
2.86666667
$ echo ' 4.2 - 1.333333333333333333 ' | bc
2.866666666666666667
$ echo ' 4.2000 * 1.33 ' | bc
5.5860
In division, by contrast, the number of decimals is strictly equal to the value of the variable scale:
$ echo 'scale=0;4/3;scale=3;4/3;scale=10;4/3' | bc
1
1.333
1.3333333333
As the value of scale has to be restored, it is better to define a function (GNU syntax):
$ echo ' define int(x){ os=scale;scale=0;x=x/1;scale=os;return(x) }
int( 4.2-1.3 )' | bc
2
Or in older POSIX language:
$ echo ' define i(x){
o=scale;scale=0;x=x/1;scale=o;return(x)
}
i( 4.2-1.3 )' | bc
2
You say:
truncate/use floor
And those are not the same thing in all cases. The other answers so far only show you how to truncate (i.e. "truncate towards zero" i.e. "discard the part after the decimal").
For negative numbers, the behavior is different.
To wit:
truncate(-2.5) = -2
floor(-2.5) = -3
So, here is a floor function for bc:
# Note: trunc(x) is defined as noted elsewhere in the other answers
define floor(x) {
auto t
t=trunc(x)
if (t>x) {
return t-1
} else {
return t
}
}
Aside:
You can put this, and other helper functions, in a file. For instance, I have this alias in my shell:
alias bc='bc -l ~/.bcinit'
And so whenever I run bc, I get all my utility functions from ~/.bcinit available by default.
Also, there is a good list of bc functions here: http://phodd.net/gnu-bc/code/funcs.bc
You may do something like this:
$ printf "%.2f\n" $(echo "(4530 / 4116 - 1) * 100" | bc -l)
10.06
Here I am trying to find the % change. Not purely bc though.

Resources