Extract date, hours, minutes and seconds from a date string format (2021-09-04T20:02:33,315Z) in a shell script - Linux

I have a log file in which each row contains a timestamp in the format 2021-09-04T20:02:33,315Z, and I want to extract only the entries from the last 30 seconds.
I found that awk can be used to extract the dates in a range:
sudo awk -vDate=$(date -d '30 seconds ago' +%Y-%m-%dT%H:%M:%S,000Z) '{ if ($4 > Date) print Date FS $4}'
But I am stuck on the if condition: how do I extract the date, hours, minutes and seconds so that the comparison works?

You can use a range pattern in awk, with a comma (,) between two tests.
Like this:
awk '$1$2$3 > "Sep3008:47:46", $1$2$3 < "Sep3008:54:04" {print $0}' /var/log/messages
You can compute the timestamps beforehand in your log's format, as in your example:
awk -v TS1=$(date ...) -v TS2=$(date ...) '$4 > TS1, $4 < TS2 { print ...}'
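As a minimal sketch of the "last 30 seconds" filter asked about above: it assumes the timestamp is the first field of each line (the question uses $4), that the trailing Z means UTC, and it uses a hypothetical sample log with a fixed cutoff so the result is reproducible. Because the timestamp format is fixed-width, a plain string comparison is enough and no field extraction is needed.

```shell
#!/bin/bash
# Hypothetical sample log; the timestamp is assumed to be the first field.
log=$(mktemp)
cat > "$log" <<'EOF'
2021-09-04T20:02:01,100Z old entry
2021-09-04T20:02:20,315Z recent entry
2021-09-04T20:02:33,315Z newest entry
EOF

# Fixed cutoff so the result is reproducible. In real use it would come from
# GNU date, e.g.: cutoff=$(date -u -d '30 seconds ago' +%Y-%m-%dT%H:%M:%S,000Z)
cutoff="2021-09-04T20:02:03,000Z"

# Timestamps in this fixed-width format sort lexicographically,
# so a string comparison selects everything at or after the cutoff.
recent=$(awk -v cutoff="$cutoff" '$1 >= cutoff' "$log")
printf '%s\n' "$recent"
rm -f "$log"
```

With the sample data this prints the two entries stamped 20:02:20 and 20:02:33.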

Related

Find the next nearest value (bash)

Let's say I have some holiday data (holiday_master.csv) in columns, something like
...
20200320 Vernal Equinox Day
20200429 Showa Day
20200503 Constitution Day
20200505 Green Day
20200720 Children's Day
20200811 Sea Day
...
Given this set of data, I want to find the next closest holiday from the given date.
For example if the input is 20200420, 20200429 Showa Day is expected.
If the input is 20200620, 20200720 Children's Day is expected.
I have a feeling that awk has the necessary functionality to do this, but any solution that works in a bash script is welcome.
Would you please try the bash script:
#!/bin/bash
input="20200428" # or assign to whatever
< "holiday_master.csv" sort -nk1,1 | # sort the csv file by date and pass to the while loop
while read -r date desc; do
    if (( date >= input )); then # if the date is greater than or equal to the input
        echo "$date" "$desc" # then print the line
        break # and exit the loop
    fi
done
Assuming no two days will ever have the same date...
DATE=<some desired input date>
awk "{print (\$1 - $DATE"' "\t" $0)}' calendar.txt | sed '/^-/d' | sort -n | head -n 1 | awk '{$1=""; print $0}'
Explanation
awk "{print (\$1 - $DATE"' "\t" $0)}' calendar.txt: Prepend a column to each line of calendar.txt holding the difference between that line's date and the desired input date
sed '/^-/d': Remove all lines beginning with -. Dates with negative differences have already passed.
sort -n: Sort the remaining entries numerically from least to greatest (based upon the difference column). A plain sort would compare the differences as text and could put 300 before 9.
head -n 1: Select only the first row (The lowest difference)
awk '{$1=""; print $0}': Print all but the first column
Prettier script version
#!/bin/bash
# Usage: script <Date> <Calendar file>
DATE=${1:--1}
CAL=${2:-calendar.txt}
# Arg check and execute
if [ ! -f "$CAL" ]
then
    echo "File not found: $CAL"
    echo "Usage: script <Date> <Calendar file>"
# Check that DATE is an integer first, so the numeric test below never sees text
elif [ "$(echo "$DATE" | grep -Ewo -- '-?[0-9]+' | wc -l)" -eq 0 ]
then
    echo "Invalid date: $DATE"
    echo "Usage: script <Date> <Calendar file>"
elif [ "$DATE" -le 0 ]
then
    echo "Invalid date: $DATE"
    echo "Usage: script <Date> <Calendar file>"
else
    # sort -n: the difference column must be ordered numerically, not as text
    awk '{print ($1 - '"$DATE"' "\t" $0)}' "$CAL" | sed '/^-/d' | sort -n | head -n 1 | awk '{$1=""; print $0}'
fi
Because you use the YYYYMMDD format, the dates can be compared as plain numbers (the year is the most significant part, then the month, then the day). So you can use AWK in the following way. Let:
20200320 Vernal Equinox Day
20200429 Showa Day
20200503 Constitution Day
20200505 Green Day
20200720 Children's Day
20200811 Sea Day
be file named holidays.txt then:
awk 'BEGIN{inputdate=20200420}{if($1>inputdate){print $2;exit}}' holidays.txt
output:
Showa
Explanation: in BEGIN I set inputdate to 20200420; then, when a line with a greater number in the 1st column is found, I print the content of the 2nd column and exit (otherwise later dates would be printed too). Note that AWK automatically treats the first field as a number when asked to do a comparison (> in this case), so you do not have to care about conversion yourself. You could even write inputdate="20200420" and it would still work here, because AWK would then compare the values as strings, and digit strings of equal length order the same way as the numbers.
This solution assumes that all dates in file are already sorted.
Using awk and assuming the source data is comma separated:
awk -F, -v dayte="20200420" '
BEGIN {
    "date -d "dayte" +%s" | getline dat1
}
{
    "date -d "$1" +%s" | getline dat2;
    dat3=dat2-dat1;
    if (dat3 > 0 )
    {
        hols[dat3]=$2
    }
}
END {
    asorti(hols,hols1,"@ind_num_asc");
    print hols[hols1[1]]
}
' holiday_master.csv
One liner:
awk -F, -v dayte="20200420" 'BEGIN { "date -d "dayte" +%s" | getline dat1 } { "date -d "$1" +%s" | getline dat2;dat3=dat2-dat1;if (dat3 > 0 ) { hols[dat3]=$2 } } END { asorti(hols,hols1,"@ind_num_asc");print hols[hols1[1]] }' holiday_master.csv
Set the field separator to , and set a variable dayte to the date we wish to check. In the BEGIN block, we pass the dayte variable through to the date command via an awk pipe/getline and read the epoch result into the variable dat1. We do the same with the first column of the master file ($1) and read this into dat2. We take the difference between the epoch dates and store the result in dat3. Only if the result is positive (in the future) do we then use dat3 as an index in a "hols" array, with the holiday description as the value. In the END block, we sort the indexes of hols into a new hols1 array, basing the sort on ascending, numeric indexes (this uses GNU awk's asorti). We then take the first index of the new hols1 array to obtain the holiday that is closest to the dayte variable.
Assuming the holiday list file is sorted by date as you have given, the below would work
$ awk -v dt="20200420" ' (dt-$1)<0 { print;exit } ' holiday.txt
20200429 Showa Day
$ awk -v dt="20200620" ' (dt-$1)<0 { print;exit } ' holiday.txt
20200720 Children's Day
$
If the holiday file is not sorted, you can use the following:
$ shuf holiday.txt | awk -v dt="20200420" ' dt-$1<0 { a[(dt-$1)*-1]=$0 } END { asort(a); print a[1] } '
20200429 Showa Day
$ shuf holiday.txt | awk -v dt="20200620" ' dt-$1<0 { a[(dt-$1)*-1]=$0 } END { asort(a); print a[1] } '
20200720 Children's Day
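The unsorted case can also be handled without sorting or GNU-specific functions, by keeping track of the smallest date that is still later than the input. The following is only a sketch under assumptions: the function name next_holiday and the temporary sample file are hypothetical, and the file is assumed to be whitespace-separated with the date in the first column.

```shell
#!/bin/bash
# Hypothetical unsorted copy of the holiday list (whitespace-separated, date first).
cal=$(mktemp)
cat > "$cal" <<'EOF'
20200811 Sea Day
20200320 Vernal Equinox Day
20200720 Children's Day
20200429 Showa Day
20200505 Green Day
20200503 Constitution Day
EOF

# next_holiday <date> <file>: print the line with the smallest date greater than <date>.
next_holiday() {
    awk -v dt="$1" '
        $1 > dt && (best == "" || $1 < best) { best = $1; line = $0 }
        END { if (line != "") print line }
    ' "$2"
}

first=$(next_holiday 20200420 "$cal")
second=$(next_holiday 20200620 "$cal")
echo "$first"
echo "$second"
rm -f "$cal"
```

A single pass is enough here because only the current best candidate has to be remembered, not the whole list.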

Using awk to add one month to a date [duplicate]

I have a file 1.txt like below:
"15227962157615645"$"2018-12-04 06:55:43"
"15227525816721347"$"2018-12-03 18:48:11"
I can get the date using:
awk -F\" '{print $4}' 1.txt
Additionally, I need to add one month to the date. For the above input my desired output would be:
2019-01-04 06:55:43
2019-01-03 18:48:11
I tried to use
awk -F\" '{print date -d "$4 +1 month"+%Y-%m-%d}' 1.txt
but it does not work.
Awk has limited support for date calculation, so here is a bash only solution relying on the date command:
while IFS='$' read -r n t; do
    printf '%s$"%s"\n' "$n" "$(date -d "${t//\"/} +1 month" '+%F %T')"
done < 1.txt
The input field separator (IFS) is set to $ so that read splits each line there and the timestamp ends up in the t variable.
The double quotes around the date field are removed using the bash parameter expansion ${t//\"/}.
This makes it possible to append the +1 month keyword and pass the result to date.
Then printf prints the line back in the original format of the input file.

How to get logs of last hour in linux using awk command

I have a log file named source.log with timestamps in this format:
Fri, 09 Dec 2016 05:03:29 GMT 127.0.0.1
and I am using a script to get the entries from the last hour.
Script:
awk -vDate=`date -d'now-1 hour' +[%d/%b/%Y:%H:%M:%S` '$4 > Date {print Date, $0}' source.log > target.log
But this script outputs the same content as the source file.
Something is wrong in the time format matching, so it does not return only the last hour's records.
I know I'm late to help the OP, but maybe this answer can help anyone else in this situation.
First, it's necessary to compare the whole date and not only the time part, because a time-only comparison breaks for entries near midnight.
Note that awk can only compare strings and numbers. Some awk implementations have the mktime() function, which converts a specifically formatted string into a UNIX timestamp for datetime comparisons, but it doesn't accept arbitrary datetime formats, so we can't use it directly here.
The best way would be to change (if possible) the datetime format of the log entries to 'YYYYMMDDhhmmss' or ISO format. That way, comparing two datetimes is as simple as comparing strings or numbers.
But let's assume that we can't change the date format of the log entries, so we'll need to do the conversion ourselves inside awk:
awk -vDate="`date -d'now-1 hour' +'%Y%m%d%H%M%S'`" '
BEGIN{
    for(i=0; i<12; i++)
        MON[substr("JanFebMarAprMayJunJulAugSepOctNovDec", i*3+1, 3)] = sprintf("%02d", i+1);
}
toDate() > Date
function toDate(){
    time = $5; gsub(/:/, "", time);
    return $4 MON[$3] $2 time;
}' source.log
Explanation
-vDate=... sets the Date awk variable to the initial datetime (one hour ago).
The BEGIN section creates an array indexed by the month abbreviation (it's specific to English).
The toDate() function converts the line's fields into a string with the same format as the Date variable (YYYYMMDDhhmmss).
Finally, when the condition toDate() > Date is true, awk prints the current line (log entry).
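To see the conversion at work, here is a small reproducible sketch of the same technique. It runs against a hypothetical four-line sample in the question's format and uses a fixed cutoff instead of the current time; the names to_key, mon and cutoff are illustrative, and the -u flag in the comment assumes the log really is in GMT.

```shell
#!/bin/bash
# Hypothetical sample in the format "Fri, 09 Dec 2016 05:03:29 GMT 127.0.0.1".
log=$(mktemp)
cat > "$log" <<'EOF'
Thu, 08 Dec 2016 23:59:59 GMT 127.0.0.1
Fri, 09 Dec 2016 04:03:28 GMT 127.0.0.1
Fri, 09 Dec 2016 04:30:00 GMT 127.0.0.1
Fri, 09 Dec 2016 05:03:29 GMT 127.0.0.1
EOF

# Fixed cutoff for reproducibility. In real use (GNU date, log assumed to be GMT):
#   cutoff=$(date -u -d 'now-1 hour' +%Y%m%d%H%M%S)
cutoff="20161209040329"

last_hour=$(awk -v cutoff="$cutoff" '
    function to_key(   t) {          # t is a local variable
        t = $5; gsub(/:/, "", t)
        return $4 mon[$3] $2 t       # YYYYMMDDhhmmss
    }
    BEGIN {
        # Map English month abbreviations to two-digit month numbers.
        for (i = 0; i < 12; i++)
            mon[substr("JanFebMarAprMayJunJulAugSepOctNovDec", i * 3 + 1, 3)] = sprintf("%02d", i + 1)
    }
    to_key() > cutoff
' "$log")
printf '%s\n' "$last_hour"
rm -f "$log"
```

Only the 04:30:00 and 05:03:29 entries are later than the cutoff, so only those two lines are printed. The entry from the previous day at 23:59:59 is rejected even though its time of day is later, which is why the whole date has to be part of the key.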

Filter Linux logs based on Unix Timestamp

I have a log on a linux server. The entries are in the format:
[timestamp (seconds since jan 1 1970)] log data entry
I need a bash script that will take the name of the log file and output only yesterdays entries (from 12:00 to 23:59:59 of previous day) and output those lines to a new file.
I've seen various scripts that filter logs based on dates but all of them so far deal with date stamps in more human readable formats, or are not dynamic. They rely on hard coded dates. I want a script that is going to run in a cron job daily so it has to be aware of what the current date is each time it runs.
Thanks.
Update: This is what I have so far. It just never seems to do the evaluation of the date. It prints 00 for the date so everything gets through.
head -5 logfile.log | awk '{
if($1 >= (date -d "today 00:00:00" +"%s"))
print $1 (date -d "today 00:00:00" +"%s");
}'
I'm confused though, even if the date evaluates properly, $1 is going to have numbers inside square brackets, and my date will be just numbers. Will it do the comparison properly if the strings are formatted differently like that? I haven't figured out how to shove the date number returned by date into a string with brackets yet.
Well, maybe use the dates as Dale said, but with a little trick to strip the "[" and "]" before comparing the dates. Something like this:
YESTERDAY=$(date -d "yesterday 00:00:00" +"%s")
TODAY=$(date -d "today 00:00:00" +"%s")
# Combine the processing in awk
awk -v MIN=${YESTERDAY} -v MAX=${TODAY} -F["]""["] '{ if ( $2 >= MIN && $2 <= MAX) print $0}' logfile.log
Combining tips and tricks from Glenn, Dale, and Davison:
awk -v today=$(date -d "today 00:00:00" +"%s") -v yesterday=$(date -d "yesterday 00:00:00" +"%s") -F'[\\[\\] ]' '{ if($2 >= yesterday && $2 < today) print }' logfile.log
Uses the shell's $() command substitution to feed variables to awk's -v argument parser
-F'[\\[\\] ]' sets the field separator to be either [, ], or a space
input data:
[1300000000 log1 data1 entry1]
[1444370000 log2 data2 entry2]
[1444374000 log3 data3 entry3]
[1444460399 log4 data4 entry4]
[1500000000 log5 data5 entry5]
output:
[1444370000 log2 data2 entry2]
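The sample run above can be reproduced independently of the current date by fixing the two bounds. In this sketch the values of yesterday and today are assumptions picked for illustration (exactly one day apart), and the field separator is written as the bracket expression [][ ], which matches the same three characters without backslashes.

```shell
#!/bin/bash
# Same sample data as above, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
[1300000000 log1 data1 entry1]
[1444370000 log2 data2 entry2]
[1444374000 log3 data3 entry3]
[1444460399 log4 data4 entry4]
[1500000000 log5 data5 entry5]
EOF

# Fixed bounds chosen for illustration (hypothetical, 86400 seconds apart).
# In real use:
#   yesterday=$(date -d "yesterday 00:00:00" +%s)
#   today=$(date -d "today 00:00:00" +%s)
yesterday=1444287600
today=1444374000

# "[][ ]" matches "]", "[" or a space, so $1 is empty and $2 is the timestamp.
selected=$(awk -v lo="$yesterday" -v hi="$today" -F'[][ ]' '$2 >= lo && $2 < hi' "$log")
printf '%s\n' "$selected"
rm -f "$log"
```

The upper bound is exclusive, so an entry stamped exactly at today's midnight (log3 with these bounds) belongs to today rather than yesterday.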
You might try something like this:
YESTERDAY=$(date -d "yesterday 00:00:00" +"%s")
TODAY=$(date -d "today 00:00:00" +"%s")
cat your_log.log | \
awk -v MIN=${YESTERDAY} -v MAX=${TODAY} \
'{if($1 >= MIN && $1 < MAX) print}'
:)
Dale

How to search a log file for two different dates in Linux

I'm using an RPM-based distro and I want to dynamically search a log file for today's date and yesterday's date to output a report. The string has to be dynamic ( no egrep "\b2012-10-[20-30]\b" ) meaning that I can take the same one-liner or script and search a file for today's date and yesterday's date and print some output. Basically searching log files for specific entries.
Here's what I got, but I want to replace the egrep with something dynamic:
grep "No Such User Here" /var/log/maillog | egrep "\b2012-10-2[3-4]\b" | cut -d "<" -f 3 | egrep -o '\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b' | cut -d "@" -f 2 | sort -d |uniq -ci | awk -F" " '{ print "Domain: " $2 " has been sent " $1 " messages that got a No Such User Here error." }'
Any help is appreciated. I'm looking for something that very likely uses the date command
date "+%Y-%m-%d"
but I need to take the %d and search for both the current day, and yesterday. Can this be done?
Any insight is much appreciated.
If you have GNU date:
$ x=$(date "+%Y-%m-%d")
$ y=$(date "+%Y-%m-%d" -d "-1 day")
$ egrep "($x|$y)" file
x contains the current date and y contains yesterday's date.
With GNU awk's time functions:
gawk 'BEGIN{
    today = strftime("%Y-%m-%d")
    yesterday = strftime("%Y-%m-%d",systime()-24*60*60)
}
$0 ~ "(" today "|" yesterday ")"
' file
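To show how the two computed dates can replace the hard-coded egrep in the asker's pipeline, here is a minimal sketch. The log lines are made up for illustration and the dates are fixed so the result is reproducible; in real use x and y would come from the date commands above.

```shell
#!/bin/bash
# Hypothetical mail log excerpt; real maillog lines will look different.
log=$(mktemp)
cat > "$log" <<'EOF'
2012-10-22 10:00:01 host smtp: No Such User Here
2012-10-23 11:15:42 host smtp: No Such User Here
2012-10-24 09:30:07 host smtp: delivered
2012-10-24 09:31:55 host smtp: No Such User Here
EOF

# Fixed dates for reproducibility. In real use (GNU date):
#   x=$(date "+%Y-%m-%d"); y=$(date "+%Y-%m-%d" -d "-1 day")
x="2012-10-24"
y="2012-10-23"

# -F treats the dates as fixed strings; each -e adds one alternative to match.
matches=$(grep -F "No Such User Here" "$log" | grep -F -e "$x" -e "$y")
printf '%s\n' "$matches"
rm -f "$log"
```

The rest of the original pipeline (cut, sort, uniq, awk) can follow the second grep unchanged.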
