How to search a log file for two different dates in Linux - linux

I'm using an RPM-based distro and I want to search a log file for today's date and yesterday's date and output a report. The date string has to be dynamic (no hard-coded egrep "\b2012-10-[20-30]\b"), meaning that I can run the same one-liner or script on any day and have it search the file for the current and previous day's entries and print some output. Basically, I'm searching log files for specific entries.
Here's what I got, but I want to replace the egrep with something dynamic:
grep "No Such User Here" /var/log/maillog | egrep "\b2012-10-2[3-4]\b" | cut -d "<" -f 3 | egrep -o '\b[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b' | cut -d "@" -f 2 | sort -d |uniq -ci | awk -F" " '{ print "Domain: " $2 " has been sent " $1 " messages that got a No Such User Here error." }'
Any help is appreciated. I'm looking for something that very likely uses the date command:
date "+%Y-%m-%d"
but I need to take the %d and search for both the current day, and yesterday. Can this be done?
Any insight is much appreciated.

If you have GNU date:
$ x=$(date "+%Y-%m-%d")
$ y=$(date "+%Y-%m-%d" -d "-1 day")
$ egrep "($x|$y)" file
x contains the current date and y contains yesterday's date.
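The two-variable idea can be wrapped into small helpers so that the pattern is rebuilt each time the script runs. This is a minimal sketch, assuming GNU date; the function names date_pattern and filter_recent are made up for illustration and are not part of the original answer.

```shell
#!/bin/bash
# date_pattern: print an extended regex matching today's or yesterday's date.
# Assumes GNU date (the -d option is not available in every date implementation).
date_pattern() {
    local today yesterday
    today=$(date "+%Y-%m-%d")
    yesterday=$(date -d "yesterday" "+%Y-%m-%d")
    printf '(%s|%s)' "$today" "$yesterday"
}

# filter_recent FILE: print only the lines of FILE that carry one of the two dates.
filter_recent() {
    grep -E "$(date_pattern)" "$1"
}
```

In the pipeline from the question, the hard-coded egrep stage could then be swapped for grep -E "$(date_pattern)".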

With GNU awk's time functions:
gawk 'BEGIN{
today = strftime("%Y-%m-%d")
yesterday = strftime("%Y-%m-%d",systime()-24*60*60)
}
$0 ~ "(" today "|" yesterday ")"
' file

Related

Find the next nearest value (bash)

Let's say I have some holiday data (holiday_master.csv) in columns, something like
...
20200320 Vernal Equinox Day
20200429 Showa Day
20200503 Constitution Day
20200505 Green Day
20200720 Children's Day
20200811 Sea Day
...
Given this set of data, I want to find the next closest holiday from the given date.
For example if the input is 20200420, 20200429 Showa Day is expected.
If the input is 20200620, 20200720 Children's Day is expected.
I have a feeling that awk has the necessary functionality to do this, but any solution that works in a bash script is welcome.
Would you please try the bash script:
#!/bin/bash
input="20200428"                        # or assign to whatever
< "holiday_master.csv" sort -nk1,1 |    # sort the csv file by date and pass to the while loop
while read -r date desc; do
    if (( date >= input )); then        # if the date is greater than or equal to the input
        echo "$date" "$desc"            # then print the line
        break                           # and exit the loop
    fi
done
Assuming no two days will ever have the same date...
DATE=<some desired input date>
awk "{print (\$1 - $DATE"' "\t" $0)}' calendar.txt | sed '/^-/d' | sort -n | head -n 1 | awk '{$1=""; print $0}'
Explanation
awk "{print (\$1 - $DATE"' "\t" $0)}' calendar.txt: Prepend a column to each line of calendar.txt holding the difference between that line's date and the desired input date
sed '/^-/d': Remove all lines beginning with -. Dates with negative differences have already passed.
sort -n: Sort the remaining entries numerically from least to greatest (based upon the difference column). A plain sort would compare the differences as text and put 100 before 9.
head -n 1: Select only the first row (the lowest difference)
awk '{$1=""; print $0}': Print all but the first column
Prettier script version
#!/bin/bash
# Usage: script <Date> <Calendar file>
DATE=${1:--1}
CAL=${2:-calendar.txt}
# Arg check and execute
if [ ! -f "$CAL" ]
then
    echo "File not found: $CAL"
    echo "Usage: script <Date> <Calendar file>"
elif ! [[ $DATE =~ ^[0-9]+$ ]] || [ "$DATE" -le 0 ]
then
    # The digits-only test runs first, so -le never sees a non-numeric operand
    echo "Invalid date: $DATE"
    echo "Usage: script <Date> <Calendar file>"
else
    # sort -n orders the difference column numerically rather than as text
    awk '{print ($1 - '"$DATE"' "\t" $0)}' "$CAL" | sed '/^-/d' | sort -n | head -n 1 | awk '{$1=""; print $0}'
fi
Because the dates are in YYYYMMDD format, they can be compared as plain numbers (the year occupies the most significant digits, then the month, then the day). So you can use AWK in the following way. Let
20200320 Vernal Equinox Day
20200429 Showa Day
20200503 Constitution Day
20200505 Green Day
20200720 Children's Day
20200811 Sea Day
be a file named holidays.txt, then:
awk 'BEGIN{inputdate=20200420}{if($1>inputdate){print $2;exit}}' holidays.txt
output:
Showa
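Explanation: in BEGIN I set inputdate to 20200420; when a line with a greater number in the 1st column is found, I print the content of the 2nd column and exit (otherwise later dates would be printed too). Use print $0 instead of print $2 to print the whole line, e.g. 20200429 Showa Day. Note that AWK automatically parses the number when asked to do a comparison (> in this case), so you do not have to care about conversion yourself. You could even do inputdate="20200420" and it would work too: the comparison is then done on strings, but because all dates have the same number of digits the ordering is the same.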
This solution assumes that all dates in file are already sorted.
Using awk and assuming the source data is comma separated:
awk -F, -v dayte="20200420" '
BEGIN {
    "date -d "dayte" +%s" | getline dat1
}
{
    "date -d "$1" +%s" | getline dat2;
    dat3=dat2-dat1;
    if (dat3 > 0 )
    {
        hols[dat3]=$2
    }
}
END {
    asorti(hols,hols1,"@ind_num_asc");
    print hols[hols1[1]]
}
' holiday_master.csv
One liner:
awk -F, -v dayte="20200420" 'BEGIN { "date -d "dayte" +%s" | getline dat1 } { "date -d "$1" +%s" | getline dat2;dat3=dat2-dat1;if (dat3 > 0 ) { hols[dat3]=$2 } } END { asorti(hols,hols1,"@ind_num_asc");print hols[hols1[1]] }' holiday_master.csv
Set the field separator to , and set a variable dayte to the date we wish to check. In the BEGIN block, we pass the dayte variable to the date command via an awk pipe/getline and read the epoch result into the variable dat1. For each line, we do the same with the first column of the master file ($1) and read the result into dat2. We store the difference between the two epoch values in dat3. Only if the result is positive (in the future) do we use dat3 as an index into the hols array, with the holiday description as the value. In the END block, asorti (a gawk function) sorts the indexes of hols into a new hols1 array in ascending numeric order. The first element of hols1 is then the index of the holiday that is closest to the dayte variable.
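The epoch arithmetic behind this approach can also be tried directly in the shell. The sketch below is only an illustration: days_between is a hypothetical helper name, GNU date is assumed, and -u is used so that daylight saving changes cannot shift the result.

```shell
#!/bin/bash
# days_between FROM TO: whole days from FROM to TO, both given as YYYYMMDD.
# Hypothetical helper; relies on GNU date's -d option.
days_between() {
    local from to
    from=$(date -u -d "$1" +%s) || return 1   # seconds since the epoch for FROM
    to=$(date -u -d "$2" +%s) || return 1     # seconds since the epoch for TO
    echo $(( (to - from) / 86400 ))           # 86400 seconds per UTC day
}

days_between 20200420 20200429   # prints 9 (days until Showa Day)
```

A positive result means the second date lies in the future relative to the first, which is the same test the awk script applies to dat3.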
Assuming the holiday list file is sorted by date as in your example, the following would work:
$ awk -v dt="20200420" ' (dt-$1)<0 { print;exit } ' holiday.txt
20200429 Showa Day
$ awk -v dt="20200620" ' (dt-$1)<0 { print;exit } ' holiday.txt
20200720 Children's Day
$
If the holiday file is not sorted, you can use the following (shuf is only there to scramble the input for the demonstration):
$ shuf holiday.txt | awk -v dt="20200420" ' dt-$1<0 { a[(dt-$1)*-1]=$0 } END { asort(a); print a[1] } '
20200429 Showa Day
$ shuf holiday.txt | awk -v dt="20200620" ' dt-$1<0 { a[(dt-$1)*-1]=$0 } END { asort(a); print a[1] } '
20200720 Children's Day
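For an unsorted file, sorting is not strictly required: a single pass that remembers the smallest date greater than the input is enough. A minimal sketch, where next_holiday is a hypothetical function name and the file is assumed to be whitespace-separated with the YYYYMMDD date in the first column:

```shell
#!/bin/bash
# next_holiday DATE FILE: print the line with the smallest date strictly after DATE.
# Works on unsorted input; only the best candidate seen so far is kept.
next_holiday() {
    awk -v dt="$1" '
        # A line qualifies if it is after dt and earlier than the current best
        $1+0 > dt+0 && (best == "" || $1+0 < best+0) { best = $1; line = $0 }
        END { if (line != "") print line }
    ' "$2"
}
```

Called as next_holiday 20200420 holiday.txt, this prints one line, or nothing if no later date exists in the file.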

Finding the number of specific files via bash

Complete the following Unix command (fill in the dots) so that standard output gives an overview, per file type, of the number of files in the /dev directory.
In this overview, all file types must be listed in descending order of the number of files found. File types with an equal number of files must be listed in alphabetical order.
$ find /dev -ls | …
7 c
6 l
3 d
Tips:
The find command already given also finds hidden files in the directory.
With the help of the cut command, you can select a certain part of a line. The two most important options are -f and -d. The first one selects columns (fields); by default, the tab character is used as the delimiter. With the option -d you can specify a custom delimiter.
tr, sort and uniq might be useful.
What I have so far:
find /dev -ls | tr \\t " " | tr -s " " | cut -f3 -d ' ' | cut -c-1 | sort | uniq -c | sort -r
But this doesn't seem to work...
Thanks in advance.
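I like to use awk for these cases instead of tr. The final sort orders by count descending, then by type alphabetically, as the exercise requires:
find /dev -ls | gawk '{ c=substr($3,1,1) ; x[c]++ } END { for(y in x) print x[y] " " y }' | sort -k1,1nr -k2,2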
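The exercise hints at cut, tr, sort and uniq, so here is a sketch that stays with those tools. It assumes the GNU find -ls layout, in which the permission string is the third column and the line may start with padding spaces; count_types is a hypothetical function name.

```shell
#!/bin/bash
# count_types DIR: count the entries under DIR per file type (first character
# of the permission string), most frequent first, ties in alphabetical order.
count_types() {
    find "$1" -ls |
        sed 's/^ *//' |      # drop the padding in front of the inode number
        tr -s ' ' |          # squeeze runs of spaces into single delimiters
        cut -d' ' -f3 |      # third column: permission string, e.g. crw-rw----
        cut -c1 |            # first character: the file type
        sort | uniq -c |     # count identical types
        LC_ALL=C sort -k1,1nr -k2,2   # count descending, then type ascending
}
```

Without the sed step, the leading padding produces empty fields for cut, so the field number of the permission string depends on how wide the inode number happens to be.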

Unix File listing with date as prefix in names

How can I list files whose names contain a date between a prefix and a suffix? For example, I have files named like "http_access_2017-04-13.log"; how can I list the files from the last five days?
Create your timestamps with date -d:
ago ()
{
    date +%Y-%m-%d -d "$1 days ago"
}
for n in $(seq 5); do
    echo http_access_$(ago $n).log
done
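A loop like this prints a name for every one of the five days, whether or not a log exists for it. A small variation, sketched here under the assumption of GNU date and with the hypothetical function name recent_logs, prints only the files that are actually present:

```shell
#!/bin/bash
# recent_logs DIR: print the http_access_YYYY-MM-DD.log files in DIR that
# belong to the previous five days, skipping days without a log file.
recent_logs() {
    local dir=$1 n f
    for n in 1 2 3 4 5; do
        f="$dir/http_access_$(date -d "$n days ago" +%Y-%m-%d).log"
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0   # a missing file is not an error
}
```

For example, recent_logs /var/log/httpd would list whatever matching files exist there; the directory name is only an illustration.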
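Try this command, which lists the five most recent files (the names sort chronologically because of the YYYY-MM-DD date):
ls -l http_access*.log | tail -5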

Filter Linux logs based on Unix Timestamp

I have a log on a linux server. The entries are in the format:
[timestamp (seconds since jan 1 1970)] log data entry
I need a bash script that will take the name of the log file, select only yesterday's entries (from 00:00:00 to 23:59:59 of the previous day) and write those lines to a new file.
I've seen various scripts that filter logs based on dates but all of them so far deal with date stamps in more human readable formats, or are not dynamic. They rely on hard coded dates. I want a script that is going to run in a cron job daily so it has to be aware of what the current date is each time it runs.
Thanks.
Update: This is what I have so far. It just never seems to do the evaluation of the date. It prints 00 for the date so everything gets through.
head -5 logfile.log | awk '{
if($1 >= (date -d "today 00:00:00" +"%s"))
print $1 (date -d "today 00:00:00" +"%s");
}'
I'm confused though, even if the date evaluates properly, $1 is going to have numbers inside square brackets, and my date will be just numbers. Will it do the comparison properly if the strings are formatted differently like that? I haven't figured out how to shove the date number returned by date into a string with brackets yet.
Well, maybe use the dates as Dale said, but with a little trick to strip the "[" and "]" before comparing the dates. Something like this:
YESTERDAY=$(date -d "yesterday 00:00:00" +"%s")
TODAY=$(date -d "today 00:00:00" +"%s")
# Combine the processing in awk
awk -v MIN="${YESTERDAY}" -v MAX="${TODAY}" -F'[][]' '{ if ( $2 >= MIN && $2 < MAX) print $0}' logfile.log
Combining tips and tricks from Glenn, Dale, and Davison:
awk -v today=$(date -d "today 00:00:00" +"%s") -v yesterday=$(date -d "yesterday 00:00:00" +"%s") -F'[\\[\\] ]' '{ if($2 >= yesterday && $2 < today) print }' logfile.log
Uses the shell's $() command substitution to feed variables to awk's -v argument parser
-F'[\\[\\] ]' sets the field separator to be either [, ], or a space
input data:
[1300000000 log1 data1 entry1]
[1444370000 log2 data2 entry2]
[1444374000 log3 data3 entry3]
[1444460399 log4 data4 entry4]
[1500000000 log5 data5 entry5]
output:
[1444370000 log2 data2 entry2]
You might try something like this:
YESTERDAY=$(date -d "yesterday 00:00:00" +"%s")
TODAY=$(date -d "today 00:00:00" +"%s")
awk -v MIN=${YESTERDAY} -v MAX=${TODAY} \
'{ts = $1; gsub(/\[|\]/, "", ts); if (ts+0 >= MIN && ts+0 < MAX) print}' your_log.log
:)
Dale
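Putting the pieces together for the cron use case, the sketch below takes the window boundaries and the log file name as arguments so that the filter can be tried with fixed values. The function name filter_window and the output file name yesterday.log are assumptions made for illustration; GNU date is assumed for the boundary calculation.

```shell
#!/bin/bash
# filter_window MIN MAX FILE: print the lines of FILE whose bracketed epoch
# timestamp satisfies MIN <= timestamp < MAX.
filter_window() {
    # Splitting on "[", "]" and space puts the timestamp in field 2 for both
    # "[1444370000] text" and "[1444370000 text]" style lines.
    awk -v min="$1" -v max="$2" -F'[][ ]' '$2+0 >= min+0 && $2+0 < max+0' "$3"
}

# Daily use, e.g. from cron (file names are placeholders):
# filter_window "$(date -d 'yesterday 00:00:00' +%s)" \
#               "$(date -d 'today 00:00:00' +%s)" logfile.log > yesterday.log
```

Using a half-open window (>= MIN and < MAX) keeps an entry logged exactly at midnight from being reported on two consecutive days.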

Grep for 2 words after pattern found

The scenario is that I have a file containing the string "the date and time is 2012-12-07 17:11:50".
I searched and found this command:
grep 'the date and time is' 2012-12-07.txt | cut -d' ' -f5
It displays just the 5th word, but I need the combination of the 5th and 6th, so I tried
grep 'the date and time is' 2012-12-07.txt | cut -d' ' -f5 -f6
but that gives an error.
How can I grep the 5th and 6th words with one command?
I just need output like 2012-12-07 17:11:50
You should be able to use
$ grep 'the date and time is' 2012-12-07.txt | cut -d' ' -f6-7
Check the man page for the syntax of the -f option's argument.
I guess it's not the 5th and 6th words but the 6th and 7th:
grep 'the date and time is' 2012-12-07.txt |awk '{print $6,$7}'
That sounds like a job for awk, which is possibly marginally faster than constructing a pipeline consisting of multiple processes:
pax> echo 'hello
the date and time is 2012-12-07 17:11:50
goodbye' | awk '/the date and time is/ {print $6" "$7}'
2012-12-07 17:11:50
It combines the search and the modification in one single command.
Keep in mind that this solution, like yours, won't help if there is stuff on the line before your search string but awk can be made to do that as well, depending on the complexity of your needs, such as:
pax> echo 'hello
Today (Friday), the date and time is 2012-12-07 17:11:50
goodbye' | awk '/the date and time is/ {
sub (".*is","",$0);
print $1" "$2
}'
2012-12-07 17:11:50
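If the position of the words on the line is not reliable, another option is to match the timestamp by its shape instead of by its field number. This is a sketch, not part of the answers above: extract_stamp is a hypothetical name, and grep -o is a GNU/BSD extension rather than POSIX.

```shell
#!/bin/bash
# extract_stamp: read lines on stdin, keep those containing the phrase, and
# print only the "YYYY-MM-DD HH:MM:SS" part of each.
extract_stamp() {
    grep 'the date and time is' |
        grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
}

echo 'Today (Friday), the date and time is 2012-12-07 17:11:50' | extract_stamp
# prints: 2012-12-07 17:11:50
```

The same function reads a file with extract_stamp < 2012-12-07.txt.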

Resources