Parsing cal output in POSIX compliant shell script by read command - linux

I am trying to write a POSIX-compliant script that prints every month of the year given in $3 in which the weekday given in $1 (for example Mo, Tu, ...) falls on the day of the month given in $2 (1, 2, 3, ...).
Example:
Input: ./task1.sh Tu 5 2006
Output:
   September 2006
Mo Tu We Th Fr Sa Su
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30
   December 2006
Mo Tu We Th Fr Sa Su
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
I have written this script:
#!/bin/sh
year=$3
dayInMonth=$2
dayInWeek=$1
index=1
while expr $index '!=' 13 >/dev/null; do
cal -m $index $year| tail -n +2| while read Mo Tu We Th Fr Sa Su ; do
eval theDay='$'$dayInWeek
if [ "$theDay" = "$dayInMonth" ]; then
cal -m $index $year;
fi
done
index=$(expr $index + 1)
done
But there is a problem with reading the third line of the cal output, which holds the first week of the month. On that line the day numbers usually do not start in the Mo column, so read assigns them to the wrong variables. How can I parse that line so that the numbers in $Mo, $Tu, $We, ... are always correct?

Update: You've added the requirement for a POSIX-conforming solution. date -d, as used in my answer, is not POSIX, so I'll keep the answer for those who are using GNU/Linux.
By the way, the following command uses only POSIX tools and prints the column in which Jan 5, 2006 appears in the cal output. That column is the day-of-week offset as long as the day does not sit in a short first week:
cal 01 2006 | awk -v d=5 'NR>2{for(i=1;i<=NF;i++){if($i==d){print i;exit}}}'
You need to wrap a small shell script around that.
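For the short first week, one possible workaround is to cut each cell out by character position instead of splitting on whitespace. This is only a sketch: it assumes cal pads every weekday column to three characters (two for the number plus a separator) and that the output is piped, so no highlighting codes are mixed in. The name day_in_column is made up for this example.

```shell
#!/bin/sh
# day_in_column LINE COLUMN
# Hypothetical helper: prints the day number found in the 1-based weekday
# COLUMN of one week line of cal output, or an empty line if that cell is blank.
# Assumption: every weekday column is exactly three characters wide.
day_in_column() {
    start=$(( ($2 - 1) * 3 + 1 ))
    printf '%s\n' "$1" | cut -c "${start}-$(( start + 1 ))" | tr -d ' '
}
```

With this, something like cal -m 9 2006 | tail -n +3 | while IFS= read -r line; do day_in_column "$line" 2; done would print one line per week: the Tuesday of that week, or an empty line when the week has none. IFS= matters here because it keeps the leading blanks of the first week.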
I would use the date command, like this:
#!/bin/bash
dayofweek="${1}"
day="${2}"
year="${3}"
for m in {01..12} ; do
date=$(LANG=C date -d "${year}-${m}-${day}" +'%a %B')
read d m <<< "${date}"
[ "${d}" = "${dayofweek}" ] && echo "${m}"
done
Results:
$ bash script.sh Thu 05 2006
January
October

It's easier to check dates with the command date.
for month in {1..12}; do
if [[ $(date -d $(printf "%s-%2.2d-%2.2d" "$year" "$month" "$day") "+%a") == "Tue" ]]; then
cal -m $month $year;
fi
done
The script loops over the 12 months and builds a date from the year, the month and the day. The date command prints the day of the week as a three-letter abbreviation with +%a.
If you want the day of the week as a number, use +%u and == 2 in the if statement.
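Since the question asks for POSIX and date -d is a GNU extension, the weekday can also be computed with plain shell arithmetic. The sketch below swaps in Tomohiko Sakamoto's day-of-week formula instead of calling date. dow and months_with are hypothetical names, the result is numbered like +%w (0 = Sunday, so Tuesday is 2), and the arguments must be given without leading zeros.

```shell
#!/bin/sh
# dow YEAR MONTH DAY: print the weekday as 0 (Sunday) .. 6 (Saturday).
# Uses Sakamoto's formula; arguments must not have leading zeros.
dow() {
    y=$1 m=$2 d=$3
    # Per-month offsets from Sakamoto's table; shift so that $1 is the
    # offset belonging to month m.
    set -- 0 3 2 5 0 3 5 1 4 6 2 4
    shift $(( m - 1 ))
    # January and February count as part of the previous year.
    if [ "$m" -lt 3 ]; then y=$(( y - 1 )); fi
    echo $(( (y + y/4 - y/100 + y/400 + $1 + d) % 7 ))
}

# months_with YEAR DAY WEEKDAY: print the numbers of all months in YEAR
# whose DAY falls on WEEKDAY (0 = Sunday).
months_with() {
    mo=1
    while [ "$mo" -le 12 ]; do
        if [ "$(dow "$1" "$mo" "$2")" -eq "$3" ]; then echo "$mo"; fi
        mo=$(( mo + 1 ))
    done
}
```

For example, months_with 2006 5 2 prints 9 and 12, which matches the September and December calendars in the question.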

Related

Last date of each month in a calendar year [duplicate]

As we know, the months of a year have the following maximum number of days:
Jan - 31 days
Feb - 28 days / 29 days (leap year)
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days
How do I get bash to return the last day of each month for the current year without using if/else, switch or a while loop?
my take:
for m in {1..12}; do
date -d "$m/1 + 1 month - 1 day" "+%b - %d days";
done
To explain: for the first iteration when m=1 the -d argument is "1/1 + 1 month - 1 day" and "1/1" is interpreted as Jan 1st. So Jan 1 + 1 month - 1 day is Jan 31. Next iteration "2/1" is Feb 1st, add a month subtract a day to get Feb 28 or 29. And so on.
cat <<EOF
Jan - 31 days
Feb - `date -d "yesterday 3/1" +"%d"` days
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days
EOF
cal $(date +"%m %Y") |
awk 'NF {DAYS = $NF}; END {print DAYS}'
This uses the standard cal utility to display the specified month, then runs a simple Awk script to pull out just the last day's number.
Assuming you allow "for", then the following in bash
for m in {1..12}; do
echo $(date -d $m/1/1 +%b) - $(date -d "$(($m%12+1))/1 - 1 days" +%d) days
done
produces this
Jan - 31 days
Feb - 29 days
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days
Note: I removed the need for cal
For those that enjoy trivia:
Number months from 1 to 12 and look at the binary representation in four
bits {b3,b2,b1,b0}. A month has 31 days if and only if b3 differs from b0.
All other months have 30 days except for February.
So with the exception of February this works:
for m in {1..12}; do
echo $(date -d $m/1/1 +%b) - $((30+($m>>3^$m&1))) days
done
Result:
Jan - 31 days
Feb - 30 days (wrong)
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days
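The bit trick can be combined with the usual leap-year rule so that February comes out right too. A minimal sketch in POSIX sh, where days_in_month is a name chosen for this example and the arguments are a month (1-12, no leading zero) and a four-digit year:

```shell
#!/bin/sh
# days_in_month MONTH YEAR
# February is handled with the Gregorian leap-year rule; every other month
# uses the bit trick: 31 days if and only if bit 3 differs from bit 0.
days_in_month() {
    if [ "$1" -eq 2 ]; then
        if [ $(( $2 % 4 )) -eq 0 ] &&
           { [ $(( $2 % 100 )) -ne 0 ] || [ $(( $2 % 400 )) -eq 0 ]; }; then
            echo 29
        else
            echo 28
        fi
    else
        echo $(( 30 + ( ($1 >> 3) ^ ($1 & 1) ) ))
    fi
}
```

For instance, days_in_month 2 2024 gives 29, days_in_month 2 1900 gives 28 and days_in_month 8 2022 gives 31.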
Try using this code
date -d "-$(date +%d) days month" +%Y-%m-%d
Returns the number of days in the month, compensating for February in leap years, without looping or using an if statement.
This code tests whether the requested year is a leap year; if so, it uses an offset string whose second character is 1 instead of 0. The month argument selects the respective character, which is added to 28.
function daysin()
{
s="303232332323" # offsets from 28 for a normal year
((!($2%4)&&($2%100||!($2%400)))) && s=313232332323 # leap year: February gets 29
echo $(( ${s:$(($1-1)):1} + 28 )) # pick the digit for month $1 and add 28
}
daysin $1 $2 #daysin [1-12] [YYYY]
On a Mac which features BSD date you can just do:
for i in {2..12}; do date -v1d -v"$i"m -v-1d "+%d"; done
Quick Explanation
-v stands for adjust. We are adjusting the date to:
-v1d stands for first day of the month
-v"$i"m sets the month, e.g. -v2m for Feb
-v-1d minus one day (so we're getting the last day of the previous month)
"+%d" print the day of the month
for i in {2..12}; do date -v1d -v"$i"m -v-1d "+%d"; done
31
28
31
30
31
30
31
31
30
31
30
You can add year of course. See examples in the manpage (link above).
Contents of script.sh:
#!/bin/bash
begin="-$(date +'%-m') + 2"
end="10+$begin"
for ((i=$begin; i<=$end; i++)); do
echo $(date -d "$i month -$(date +%d) days" | awk '{ printf "%s - %s days", $2, $3 }')
done
Results:
Jan - 31 days
Feb - 29 days
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
for m in $(seq 1 12); do cal $(date +"$m %Y") | grep -v "^$" |tail -1|grep -o "..$"; done
iterate from 1 to 12 (for...)
print calendar table for each month (cal...)
remove empty lines from output (grep -v...)
print last number in the table (tail...)
There is no reason to avoid cal: it is specified by POSIX, so it should be available.
A variation for the accepted answer to show the use of "yesterday"
$ for m in {1..12}; do date -d "yesterday $m/1 + 1 month" "+%b - %d days"; done
Jan - 31 days
Feb - 28 days
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days
How does it work?
It shows the day before the date "month/1" after adding 1 month.
I have needed this a few times. What is easy in PHP is not easy in bash, so I used the line below until it threw an "invalid arithmetic operator" error, and it also produced spellcheck warnings ("mt" stands for month, "yr" for year):
last=$(echo $(cal ${mt} ${yr}) | awk '{print $NF}')
So this is what works fine now:
### get last day of month
#
# implement from PHP
# src: https://www.php.net/manual/en/function.cal-days-in-month.php
#
if [ $mt -eq 2 ];then
if [[ $(bc <<< "${yr} % 4") -gt 0 ]];then
last=28
else
if [[ $(bc <<< "${yr} % 100") -gt 0 ]];then
last=29
else
[[ $(bc <<< "${yr} % 400") -gt 0 ]] && last=28 || last=29
fi
fi
else
[[ $(bc <<< "(${mt}-1) % 7 % 2") -gt 0 ]] && last=30 || last=31
fi
Building on patm's answer using BSD date for macOS (patm's answer left out December):
for i in {1..12}; do date -v1m -v1d -v+"$i"m -v-1d "+%b - %d days"; done
Explanation:
-v, when using BSD date, means adjust date to:
-v1m means go to first month (January of current year).
-v1d means go to first day (so now we are in January 1).
-v+"$i"m means go to next month.
-v-1d means subtract one day. This gets the last day of the previous month.
"+%b - %d days" is whatever format you want the output to be in.
This will output all the months of the current year and the number of days in each month. The output below is for the as-of-now current year 2022:
Jan - 31 days
Feb - 28 days
Mar - 31 days
Apr - 30 days
May - 31 days
Jun - 30 days
Jul - 31 days
Aug - 31 days
Sep - 30 days
Oct - 31 days
Nov - 30 days
Dec - 31 days

Get date with same day in month

I want to get all dates with the same day of week.
inputDate="2021/08/25"
That means I should get all the same day of week as inputDate.
outputDates="2021/08/04,2021/08/11,2021/08/18,2021/08/25"
I only got this so far..
inputDate="2021/08/25"
dd=$(date -d "$inputDate" +"%Y/%m/%d")
My plan is to step in units of 7 days, looping 5 times forward and 5 times backward, collect the results, and drop every date whose month differs from that of inputDate.
Is there a better way to do this?
Using only the shell, the easiest way to get all weekdays of a month is the cal command:
cal -n1 8 2021
outputs:
August 2021
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31
Then you can filter using sed, awk or other tools to reach your goal.
Example:
year=2021
month=8
day=25
weekday_number="$(date -d$year-$month-$day +%w)"
cn=$(($weekday_number + 1))
cal -n1 $month $year |
sed -r 's/(..)\s/\1\t/g;s/ +//g' |
awk -v cn=$cn -F'\t' 'NR<3 || $cn == "" {next} {print $cn}' |
while read wday; do
echo $year/$month/$wday
done
outputs:
2021/8/4
2021/8/11
2021/8/18
2021/8/25
Without using cal or ncal...
#!/bin/bash
inputDate="2021/08/25"
dow=$(date -d "$inputDate" +"%a")
month=$(date -d "$inputDate" +"%m")
outputDates=""
for x in $(seq 0 9)
do
validDate=$(date -d "$x $dow 5 week ago" +"%Y/%m/%d" | grep "/$month/")
if [ ! -z $validDate ]
then
if [ ! -z $outputDates ]
then
outputDates="$outputDates,$validDate"
else
outputDates="$validDate"
fi
fi
done
echo "$outputDates"
This script outputs:
2021/08/04,2021/08/11,2021/08/18,2021/08/25
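Dates in one month that share a weekday are exactly 7 days apart, so the list can also be built with arithmetic alone, without calling date or cal. A rough sketch under that observation; same_weekday_dates is a hypothetical helper that takes year, month and day as separate arguments:

```shell
#!/bin/sh
# same_weekday_dates YEAR MONTH DAY
# Prints, comma separated, every date of that month that falls on the same
# weekday as the given day.
same_weekday_dates() {
    # Strip one leading zero so that 08 and 09 are not read as octal.
    y=$1 m=${2#0} d=${3#0}
    case $m in
        4|6|9|11) last=30 ;;
        2)  if [ $(( y % 4 )) -eq 0 ] &&
               { [ $(( y % 100 )) -ne 0 ] || [ $(( y % 400 )) -eq 0 ]; }; then
                last=29
            else
                last=28
            fi ;;
        *)  last=31 ;;
    esac
    out=
    cur=$(( (d - 1) % 7 + 1 ))   # earliest day of the month on that weekday
    while [ "$cur" -le "$last" ]; do
        out=${out:+$out,}$(printf '%04d/%02d/%02d' "$y" "$m" "$cur")
        cur=$(( cur + 7 ))
    done
    printf '%s\n' "$out"
}
```

Called as same_weekday_dates 2021 08 25, it prints 2021/08/04,2021/08/11,2021/08/18,2021/08/25.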

Get the next time occurrence with linux date

The Linux date utility understands many strings, for instance:
$ date -d '8:30'
Fri Jan 2 08:30:00 CET 2015
I'm looking for a way to get the next 8:30, thus:
in case it is Fri Jan 2 before 8:30, the result above should be returned;
otherwise it should print Sat Jan 3 08:30:00 CET 2015.
As one can see next 8:30 doesn't result in the correct answer:
$ date -d 'next 8:30'
date: invalid date ‘next 8:30’
Is there a single expression to calculate this?
Handling it in the shell oneself is of course an option, but it makes things more complicated because of daylight saving time rules etc.
If the clock changes for daylight saving time, next 8:30 should resolve to 8:30 according to the settings of the next day.
Testcase:
Given it is Fri Jan 2 12:01:01 CET 2015, the result should be:
$ date -d 'next 8:30'
Sat Jan 3 08:30:00 CET 2015
$ date -d 'next 15:30'
Fri Jan 2 15:30:00 CET 2015
Just use something like:
if [[ $(date -d '8:30 today' +%s) -lt $(date +%s) ]] ; then
next830="$(date -d '8:30 tomorrow')"
else
next830="$(date -d '8:30 today')"
fi
The %s format string gives you seconds since the epoch so the if statement is basically:
if 8:30-today is before now:
use 8:30-tomorrow
else
use 8:30-today
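The same comparison can be wrapped in a function that also accepts the reference time, which makes it easier to try out. This is a sketch that assumes GNU date (for -d and @epoch); next_time and its optional second argument are invented for this example.

```shell
#!/bin/sh
# next_time HH:MM [NOW_EPOCH]
# Prints the next occurrence of HH:MM as "YYYY-MM-DD HH:MM".
# NOW_EPOCH defaults to the current time. Requires GNU date.
next_time() {
    now=${2:-$(date +%s)}
    # Calendar date of the reference time in the local time zone.
    today=$(date -d "@$now" +%Y-%m-%d)
    target=$(date -d "$today $1" +%s)
    # If HH:MM today is already in the past, take HH:MM of the next day.
    if [ "$target" -lt "$now" ]; then
        target=$(date -d "$today $1 tomorrow" +%s)
    fi
    date -d "@$target" '+%Y-%m-%d %H:%M'
}
```

With the test case from the question (reference time Jan 2 2015, 12:01:01), next_time 8:30 should give Jan 3 and next_time 15:30 should give Jan 2.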
As far as I could find, date cannot do this in a single expression.
What you can do is compare the current hour and minute with 830 and print accordingly:
[ $(date '+%H%M') -le 830 ] && date -d '8:30' || date -d '8:30 + 1 day'
In case you want to work with this easily, create a function to do these calculations.
Test
$ [ $(date '+%H%M') -le 830 ] && date -d '8:30' || date -d '8:30 + 1 day'
Sat Jan 3 08:30:00 CET 2015

Remove duplicates and take the latest data based on timestamp in a csv file using linux

I have a huge csv file (100,000 records), which has data like below:
Col1 Col2 Date & Time
a xyz Oct 31 2014 09:01
b xyz Dec 12 2013 08:15
a xyz Oct 30 2014 07:01
c xyz Dec 26 2013 08:39
a xyz Nov 12 2014 08:25
c xyz Dec 12 2013 08:10
b xyz Dec 12 2013 09:21
I need to remove the duplicates and keep only that data which is latest (based on the third column - Date & time). So the output should be like
Col1 Col2 Date & Time
a xyz Nov 12 2014 08:25
b xyz Dec 12 2013 09:21
c xyz Dec 26 2013 08:39
I tried to sort the file first and then remove the duplicates, but that's failing for this huge csv file. Can someone help?
P.S. In col1, data can be from a-z multiple times. It's just a sample here.
Let's give a try with this:
while read -r a b c
do
printf "%s %s %s %d\n" "$a" "$b" "$c" $(date -d"$c" +"%s")
done < file | \
awk '{it=$NF; NF--
if (max[$1]<it) {max[$1]=it; res[$1]=$0}}
END {for (i in max) print res[i]}'
For each value of the first column, this stores the largest timestamp seen so far in the array max[] and the matching line in res[]. The timestamp is a temporary last field holding the seconds since 1 Jan 1970 (added by the preceding while read loop), which is removed again with NF--. After processing the whole input, the END{} block prints the result.
It returns:
a xyz Nov 12 2014 08:25
b xyz Dec 12 2013 09:21
c xyz Dec 26 2013 08:39
If it happens to be comma separated, use:
$ while IFS="," read -r a b c; do printf "%s,%s,%s,%d\n" "$a" "$b" "$c" $(date -d"$c" +"%s"); done < a | awk 'BEGIN{FS=OFS=","} {it=$NF; NF--
if (max[$1]<it) {max[$1]=it; res[$1]=$0}}
END {for (i in max) print res[i]}'
a,xyz,Nov 12 2014 08:25
b,xyz,Dec 12 2013 09:21
c,xyz,Dec 26 2013 08:39
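With 100,000 records, starting one date process per line can be slow. As an alternative sketch, the timestamp can be turned into a sortable key inside awk itself. It assumes whitespace-separated fields exactly as in the sample (key, value, month abbreviation, day, year, HH:MM) with one header line; latest_per_key is a made-up name.

```shell
#!/bin/sh
# latest_per_key [FILE...]
# Keeps, for every value of the first column, the row with the newest
# timestamp. The header line is passed through unchanged.
latest_per_key() {
    awk '
    BEGIN {
        # Map month abbreviations to numbers 1..12.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = i
    }
    NR == 1 { print; next }
    {
        # Build a fixed-width YYYYMMDDHHMM string; such strings sort in
        # the same order as the timestamps they stand for.
        split($6, hm, ":")
        stamp = sprintf("%04d%02d%02d%02d%02d", $5, month[$3], $4, hm[1], hm[2])
        if (!($1 in best) || stamp > best[$1]) { best[$1] = stamp; row[$1] = $0 }
    }
    END { for (k in row) print row[k] }
    ' "$@"
}
```

The order of the rows printed in END is not defined, so pipe the result through sort if a fixed order is needed.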
There are three steps to your process:
Extract the key fields. (I'd use perl and split.)
Parse the date into a numeric format. You could either use an ISO-style format, e.g. 2014-12-26 08:39, or turn it into a Unix 'epoch' time. (If it's CSV, you could probably munge it through Excel if you really wanted.)
Run through your inputs, discarding any 'old' values.
With that in mind, and assuming that by 'CSV' you mean the data really is comma-separated values:
#!/usr/bin/perl
use strict;
use warnings;
use Time::Piece;
my %most_recent;
my $header = <DATA>;
while ( my $line = <DATA> ) {
chomp $line;
my ( $col1, $col2, $date_and_time ) = split( /,\s*/, $line, 3 );
$date_and_time =~ s/\s+$//g;
my $time = Time::Piece -> new -> strptime( $date_and_time, "%b %d %Y %H:%M" );
if ( not defined $most_recent{$col1}{$col2}
or $most_recent{$col1}{$col2} < $time )
{
$most_recent{$col1}{$col2} = $time;
}
}
print "Most recent:\n";
foreach my $col1 ( keys %most_recent ) {
foreach my $col2 ( keys %{ $most_recent{$col1} } ) {
print "$col1, $col2, $most_recent{$col1}{$col2}, \n";
}
}
__DATA__
Col1, Col2, Date & Time
a, xyz, Oct 31 2014 09:01
b, xyz, Dec 12 2013 08:15
a, xyz, Oct 30 2014 07:01
c, xyz, Dec 26 2013 08:39
a, xyz, Nov 12 2014 08:25
c, xyz, Dec 12 2013 08:10
b, xyz, Dec 12 2013 09:21
For each unique pairing of Col1 and Col2, this picks out the most recent timestamp for that pair.
Note - at various steps (split and timestamp parse) whitespace is discarded.

Shell script can not pass file data to shell input

cal April 2012 | cat > t | cat < t | more
Why does it show nothing? Why isn't it showing
April 2012
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
| (anonymous pipe) connects stdout (1) of the first process with stdin (0) of the second. After the output has been redirected to a file, nothing is written to stdout anymore, so there is nothing to pipe. Also, cat | cat < file does not really make sense: it connects two inputs to stdin (at least with bash, the redirection is applied later and "wins": echo uiae | cat <somefile will output the content of somefile).
If you want to display the output of a command and, at the same time, write it to a file, use the tee utility. It writes to a file but still writes to stdout:
cal April 2012 | tee t | more
cat t # content of the above `cal` command
Because that first cat > t sends all its output to a file called t, leaving no more for the pipeline.
If your intent is to send it to a file and through more to the terminal, just use:
cal April 2012 | tee t | more
This | cat < t construct is very strange and I'm not even sure if it would work. It's trying to connect two totally different things to the standard input of cat, and it is certainly unnecessary.
This works for me if there's no existing file named t in the current directory. I'm using bash on Ubuntu Oneiric.
$ cal April 2012 | cat > t | cat < t | more
April 2012
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
$ cal April 2012 | cat > t | cat < t | more
$ rm t
$ cal April 2012 | cat > t | cat < t | more
April 2012
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
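The on-and-off behaviour above looks like a race: all stages of a pipeline are started together, so cat < t may read t before cat > t has written anything to it. If the file really has to be written and read back in one command line, a small sketch is to do the two steps one after the other; save_then_show is a hypothetical function name.

```shell
#!/bin/sh
# save_then_show FILE
# Writes all of stdin to FILE and, only after that has finished,
# copies FILE to stdout.
save_then_show() {
    cat > "$1" && cat < "$1"
}
```

It would be used as cal April 2012 | save_then_show t | more. For the plain "save and display" case, tee as shown in the other answers is still the simpler choice.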

Resources