How to grep log files during a specific time period [duplicate]

This question already has answers here:
Extract data from log file in specified range of time [duplicate]
(5 answers)
Closed 6 years ago.
Okay, so I have log files and I would like to search within specific time ranges. These ranges will be different throughout the day. Below is a piece of a log file; it is the only piece I can show you, sorry, work stuff. I am using the cat command, if that matters.
Working EXAMPLE : cat /dir/dir/dir/2014-07-30.txt | grep *someword* | cut -d',' -f1,4,3,7
2014-07-30 19:17:34.542 ;; (p=0,siso=0)
The above gets me the info I need along with the timestamp, but it shows all time ranges, and that is what I would like to correct. Let's say I only want hours 18 to 20 in the first column of the time.
Actual --> 2014-07-30 19:17:34.542 ;; (p=0,siso=0)
Only range I am looking for --> [18-20]:00:00.000 ;; (p=0,siso=0)
I am not worried about the 00s as they can be any digit.
Thanks for looking. I have not used much in the way of scripting as you can tell from my example, but any help is greatly appreciated.
I have included a log file, the colons and commas are where they should be.
2014-07-30 14:33:19.259 ;; (p=0,ser=0,siso=0) IN ### Word:Numbers=00000,word=None something goes here and here (something here andhere:here also here:2222),codeword=8,codeword=0,Noideanumbers=00000000,something=something, ;;

Using awk:
logsearch() {
grep "$3" "$4" | awk -v start="$1" -v end="$2" '{split($2, a, /:/)} (a[1] >= start) && (a[1] <= end)'
}
# logsearch <START> <END> <PATTERN> <FILE>
logsearch 18 20 'someword' /dir/dir/dir/2014-07-30.txt
Or with only awk (possibly different pattern quoting requirements):
logsearch2 ()
{
awk -v start="$1" -v end="$2" -v pat="$3" '($0 ~ pat) {split($2, a, /:/)} ($0 ~ pat) && (a[1] >= start) && (a[1] <= end)' "$4"
}
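If the hour alone is too coarse, the same awk idea extends to a full HH:MM:SS window. The following is a minimal sketch, not part of the original answer: filter_time_range is a hypothetical helper name, and it assumes the time is the second whitespace-separated field in zero-padded HH:MM:SS form (as in the sample log line), so that plain string comparison orders the values correctly.

```shell
# filter_time_range START END FILE
# (hypothetical helper; START and END are inclusive HH:MM:SS bounds)
filter_time_range() {
    awk -v start="$1" -v end="$2" '
        {
            # keep only HH:MM:SS, dropping the milliseconds
            t = substr($2, 1, 8)
        }
        # zero-padded times compare correctly as strings
        t >= start && t <= end
    ' "$3"
}
```

It could be combined with the word search in a pipeline, for example: filter_time_range 18:00:00 20:59:59 /dir/dir/dir/2014-07-30.txt | grep someword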

Not having seen the original input data I'm guessing from your cut what's going on.
Will this give you something similar to your desired outcome?
awk -F, '/someword/ && $4 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$4,$3,$7}' /dir/dir/dir/2014-07-30.txt
That said: a bit of sample data typically goes a long way!
Edit1:
Given the input line you added to both your comment and the original post the following awk statement does what you're asking:
awk '/something/ && $2 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$2,$3,$4}' /path/to/your/input_file

This is a very interesting question. The pure BASH solution offers quite a bit of flexibility in how you process the entries once you have identified those that fall within the date/time range of interest. The simplest way in BASH is to get your start time and stop time in seconds since epoch, test each log entry to determine whether it falls within that range, and then do something with the log entry. The basic logic involved is relatively short. The width of the date_time field within the log can be set by passing the width as argument 4. Set the default dwidth as needed (currently 15, to match the syslog and journalctl format). The only required argument is the logfile name. If no start/stop time is specified, it will find all entries:
## set filename, set start time and stop time (in seconds since epoch)
# and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s` || stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}
## read each line from the log file and act on only those with
# date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read line || test -n "$line"; do
test "${line:0:1}" != - || continue # exclude journalctl first line
logtm=`date --date "${line:0:$dwidth}" +%s` # get logtime from entry in seconds since epoch
if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
echo "logtm: ${line:0:$dwidth} => $logtm"
fi
done < "${lfname}"
working example:
#!/bin/bash
## log date format len
# journalctl 15
# syslog 15
# your log example 23
function usage {
test -n "$1" && printf "\n Error: %s\n" "$1"
printf "\n usage : %s logfile ['start datetime' 'stop datetime' tmfield_width]\n\n" "${0//*\//}"
printf " example: ./date-time-diff.sh syslog \"Jul 31 00:15:02\" \"Jul 31 00:18:30\"\n\n"
exit 1
}
## test for required input & respond to help
test -n "$1" || usage "insufficient input."
test "$1" = "-h" || test "$1" = "--help" && usage
## set filename, set start time and stop time (in seconds since epoch)
# and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s` || stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}
## read each line from the log file and act on only those with
# date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read line || test -n "$line"; do
test "${line:0:1}" != - || continue # exclude journalctl first line
logtm=`date --date "${line:0:$dwidth}" +%s` # get logtime from entry in seconds since epoch
if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
echo "logtm: ${line:0:$dwidth} => $logtm"
fi
done < "${lfname}"
exit 0
usage:
$ ./date-time-diff.sh -h
usage : date-time-diff.sh logfile ['start datetime' 'stop datetime' tmfield_width]
example: ./date-time-diff.sh syslog "Jul 31 00:15:02" "Jul 31 00:18:30"
Remember to quote your starttm and stoptm strings. Testing with 20 entries in logfile between Jul 31 00:12:58 and Jul 31 00:21:10.
test output:
$ ./date-time-diff.sh jc.log "Jul 31 00:15:02" "Jul 31 00:18:30"
logtm: Jul 31 00:15:02 => 1406783702
logtm: Jul 31 00:15:10 => 1406783710
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:18:30 => 1406783910
Depending on what you need, another one of the solutions may fit your needs, but if you need to be able to process or manipulate the matching log entries, it is hard to beat a BASH script.

You can pipe the results to grep again.
cat /dir/dir/dir/2014-07-30.txt | grep someword | cut -d',' -f1,4,3,7 \
| grep '^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} \(1[89]\|20\)'

I don't have enough reputation to comment, but as minopret suggested do one grep at a time.
Here is one of the solutions to get the 18-20 range:
grep ' 18:\| 19:\| 20:' filename.txt

I have found the answer in the form I was looking for:
cat /dir/dir/dir/2014-07-30.txt | grep *someword* | cut -d',' -f1,4,3,7 | egrep '[^ ]+ (2[0-2]):[0-9]'
The command above cuts out the fields I need and greps for the word I want; the final egrep then restricts the output to the hours I am interested in.

Related

Unix environment — extract date strings and compare

Could anyone help me with a UNIX script that extracts the time from the last line of a file, compares it to the current time, and echoes YES if the time from the file is within one hour of the current time?
File.txt
18:48:43 iLIKEtoMOVEitMoveIT
18:58:43 iLIKEtoMOVEitMoveIT
19:22:43 iLIKEtoMOVEitMoveIT
clear line
So far I figured out how to get the last line which has the time:
tail -n 2 File.txt | head -c 8
Output = 19:22:43
And how to store the current date as only time in a variable:
TheCurrentDate="$(date +"%T")"
How do I compare those two HH:MM:SS values, work out whether one hour has passed between them, and then echo "YES"? It should all go in script.sh.

DATE=$(tail -n 2 File.txt | cut -c 1-8 | head -n 1)
FROM_FILE=$(date -d "$DATE" +%s)
NOW=$(date +%s)
DIFFERENCE=$((NOW - FROM_FILE))
if [ $DIFFERENCE -le 3600 ]; then
echo YES
fi
The idea is to convert the timestamp to seconds since epoch (+%s) using date. Then you just compare numbers.
EDIT
Your File.txt does seem like a log file, where the logging program doesn't bother to prepend the date. As @Jonathan Leffler pointed out, NOW could be 00:15 and FROM_FILE could be 23:45.
In this case, date would interpret FROM_FILE as being the end of today, rather than the end of yesterday. This can be fixed ad-hoc:
DATE=$(tail -n 2 File.txt | cut -c 1-8 | head -n 1)
FROM_FILE=$(date -d "$DATE" +%s)
NOW=$(date +%s)
if [ $FROM_FILE -gt $NOW ]; then
# it's not really in the future, it's from yesterday
FROM_FILE=$((FROM_FILE - 24 * 3600))
fi
DIFFERENCE=$((NOW - FROM_FILE))
if [ $DIFFERENCE -le 3600 ]; then
echo YES
fi
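The midnight wrap-around can also be handled without calling date at all, by converting both HH:MM:SS values to seconds since midnight. This is only a sketch under stated assumptions: hms_to_secs and elapsed_secs are hypothetical helper names, and the entry in the file is assumed to be less than 24 hours old.

```shell
# hms_to_secs HH:MM:SS -> seconds since midnight (hypothetical helper)
hms_to_secs() {
    # awk reads "08" as decimal 8, which avoids the shell's octal parsing
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# elapsed_secs FROM NOW -> seconds from FROM to NOW (hypothetical helper)
# Assumes FROM is less than 24 hours before NOW.
elapsed_secs() {
    from=$(hms_to_secs "$1")
    now=$(hms_to_secs "$2")
    diff=$((now - from))
    # a negative difference means FROM was logged before midnight
    if [ "$diff" -lt 0 ]; then
        diff=$((diff + 86400))
    fi
    echo "$diff"
}
```

With these, the final test becomes something like: [ "$(elapsed_secs "$DATE" "$(date +%T)")" -le 3600 ] && echo YES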

bash: syntax error: operand expected (error token is "-")

I am trying to watch a log for certain messages within the last hour. The logs are formatted in this manner:
[1/18/19 9:59:13:791 CST] <Extra text here...>
I was having trouble with just doing a date comparison with awk, so my thought was to convert to epoch and compare. I am taking field 1 and cutting off the milliseconds from field 2, and removing [] (though I guess I could just do [ for my purposes).
while read -r line || [[ -n "$line" ]]
do
log_date_str="$(awk '{gsub("\\[|\\]", "");print $1" "substr($2,1,length($2)-4)}' <<< "$line")"
log_date="$(date -d "$log_date_str" +%s)"
[[ $(($(date +%s)-$log_date)) -le 3600 ]] && echo "$line"
done < /path/to/file
When I try to run this against the log file though, I get this error:
date: invalid date `************ S'
-bash: 1547832909-: syntax error: operand expected (error token is "-")
Taking a single date, e.g. "1/18/19 9:59:13" works with the date conversion to epoch, but I'm not sure where to go with that error.

As the comments pointed out, I was getting data that was not a date since it was parsing the entire log. Grepping the input file for my desired text solved my problems, and I've changed to Charles' suggestion.
while read -r line || [[ -n "$line" ]]
do
log_date_str="$(awk '{gsub("\\[|\\]", "");print $1" "substr($2,1,length($2)-4)}' <<< "$line")"
log_date="$(date -d "$log_date_str" +%s)"
(( ( $(date +%s) - log_date ) <= 3600 )) && echo "$line"
done < <(grep ERROR_STRING /path/to/file.log)
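Another way to avoid feeding non-date text to date is to check the shape of each line before converting it. The sketch below is an assumption-laden illustration, not the approach accepted above: extract_log_date is a hypothetical helper, and the pattern assumes lines start exactly like the sample, "[M/D/YY H:MM:SS:mmm TZ]".

```shell
# extract_log_date LINE  (hypothetical helper)
# Prints "M/D/YY H:MM:SS" when LINE starts with a bracketed timestamp
# like the sample; prints nothing for any other line.
extract_log_date() {
    printf '%s\n' "$1" |
        sed -n -E 's/^\[([0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+:[0-9]+):[0-9]+ .*/\1/p'
}
```

Inside the loop, an empty result can then be skipped (for example with [ -n "$log_date_str" ] || continue) before date -d is ever called, so the arithmetic never sees an empty operand.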

How to grep a string variable with spaces in a bash script [duplicate]

This question already has answers here:
Grep error due to expanding variables with spaces
(3 answers)
Closed 5 years ago.
Solution: My date variables were in the wrong format (the day number and the month were flipped). I changed this, then used the if statement proposed by @PesaThe below instead of my test.
Original Post:
I am writing a bash script to run as part of my servers' daily maintenance tasking. This particular job is to search for entries in input_file matching yesterday's and today's time stamps. Here are my date variables.
today=$(date "+%a, %b %d %Y")
yesterday=$(date --date=yesterday "+%a, %b %d %Y")
Here are the declarations, which are exactly as they should be:
declare -- adminLogLoc="/opt/sc/admin/logs/"
declare -- adminLog="/opt/sc/admin/logs/201801.log"
declare -- today="Tue, Jan 02 2018"
declare -- yesterday="Mon, Jan 01 2018"
declare -- report="/maintenance/daily/2018-01-02_2.2.txt"
Here are some actual log entries like those I need output. These were found with grep $today $adminLog | grep error
Tue, 02 Jan 2018 14:38:50 +0000||error|WARNING|13|Query #2464 used to generate source data is inactive.
Tue, 02 Jan 2018 14:38:50 +0000||error|WARNING|13|Query #2468 used to generate source data is inactive.
Tue, 02 Jan 2018 14:38:50 +0000||error|WARNING|13|Query #2470 used to generate source data is inactive.
Tue, 02 Jan 2018 14:38:50 +0000||error|WARNING|13|Query #2474 used to generate source data is inactive.
Here is the if statement I am trying to run:
# Check for errors yesterday
if [ $(grep $yesterday $adminLog|grep "error") != "" ]; then
echo "No errors were found for $yesterday." >> $report
else
$(grep $yesterday $adminLog|grep "error") >> $report
fi
# Check for errors today (at the time the report is made, there
# probably won't be many, if any at all)
if [ $(grep $today $adminLog|grep "error") != "" ]; then
echo "No errors were found for $today." >> $report
else
$(grep $today $adminLog|grep "error") >> $report
fi
I have tried this several ways, such as putting double quotes around the variables in the test, so on. When I run the grep search in the command line after setting the variables, it works perfectly, but when I run it in the test brackets, grep uses each term (i.e. Tue, Jan... so on) as individual arguments. I have also tried
grep $yesterday $adminLog 2> /dev/null | grep -q error
if [ $? = "0" ] ; then
with no luck.
How can I get this test to work so I can input the specified entry into my log file? Thank you.

Could you please try the following script and let me know if it helps. This snippet simply prints yesterday's and today's log entries; if you want to write them to an output file instead, it can be adjusted accordingly.
#!/bin/bash
# day number before month, to match log lines such as "Tue, 02 Jan 2018"
today=$(date "+%a, %d %b %Y")
yesterday=$(date --date=yesterday "+%a, %d %b %Y")
todays=$(grep "$today" Input_file)
yesterdays=$(grep "$yesterday" Input_file)
if [[ -n $todays ]]
then
echo "$todays"
else
echo "no logs found for todays date."
fi
if [[ -n $yesterdays ]]
then
echo "$yesterdays"
else
echo "NO logs found for yesterday's date."
fi
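The quoting point can be folded into a small function that also applies the "error" filter from the question. This is a minimal sketch: report_errors is a hypothetical name, and the day string passed in is assumed to be in the same format as the log lines.

```shell
# report_errors DAY LOGFILE  (hypothetical helper)
# Prints the "error" entries for DAY, or a notice when there are none.
report_errors() {
    day=$1
    log=$2
    # Quoting "$day" keeps a value such as "Tue, 02 Jan 2018" as a single
    # pattern; -F treats it as a fixed string rather than a regex.
    matches=$(grep -F -- "$day" "$log" | grep -F 'error')
    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
    else
        echo "No errors were found for $day."
    fi
}
```

Its output could be appended to the report in the same way as in the question, for example: report_errors "$yesterday" "$adminLog" >> "$report"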

compare time in bash script with ± x min

I'm a newbie in bash and need some advice.
I have a .txt file containing a timestamp; the file is rewritten at a regular interval, and each time it is stamped with the current date and time.
"20221218-0841"
Now I have built a bash script to check the content and tell me whether it matches the current time.
#!/bin/bash
time_status=`cat /root/test.txt | tail -c 14 | cut -d')' -f1`
date_now=`date +%Y%m%d-%H%M`
if [ "$date_now" == "$time_status" ]
then
echo "OK - $time_status "
date +%Y%m%d-%H%M
exit 0
fi
if [ "$date_now" != "$time_status" ]
then
echo "WARNING - $time_status "
date +%Y%m%d-%H%M
exit 1
fi
Everything works so far and the script does what it has to do, but I need it to answer OK and exit with 0 when the time is within ± 3 minutes, not only when it is exactly the same.
Can someone provide some leads on this?

You can manipulate the date, this way,
# Read only the '%H%M' part of each variable, using read and splitting
# on the '-' delimiter
IFS='-' read _ hourMinuteFromFile <<<"$time_status"
IFS='-' read _ currentHourMinute <<<"$date_now"
# Get the difference for the minutes field only, i.e. the last two
# characters of each variable; 10# forces base 10 so that values such
# as 08 or 09 are not rejected as invalid octal numbers
dateDiff=$(( 10#${hourMinuteFromFile: -2} - 10#${currentHourMinute: -2} ))
# Test for a difference between -3 and 3; a chained comparison such as
# (( -3 <= dateDiff <= 3 )) is always true in bash, so use two tests
if (( dateDiff >= -3 && dateDiff <= 3 )); then
echo "OK - $time_status "
fi
Dry run,
time_status="20170318-1438"
date_now="20170318-1436"
dateDiff=$(( ${hourMinuteFromFile: -2} - ${currentHourMinute: -2} ))
echo "$dateDiff"
2
Another good coding practice is to avoid back-ticks (``) for command substitution in favour of the $(..) syntax, and also to do away with the useless use of cat,
time_status=$(tail -c 14 file | cut -d')' -f1)
date_now=$(date +%Y%m%d-%H%M)

You can transform the dates into seconds since 1970-01-01 00:00:00 UTC with date +%s and then perform the usual integer arithmetic on the result.
d1='2017-03-18 10:39:34'
d2='2017-03-18 10:42:25'
s1=$(date +%s -d "$d1")
s2=$(date +%s -d "$d2")
ds=$((s1 - s2))
if [ "$ds" -ge -180 -a "$ds" -le 180 ]
then
echo same
else
echo different
fi
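Applied to the compact "20221218-0841" stamps from the question, the same epoch idea might look like the sketch below. The helper names stamp_to_epoch and within_minutes are hypothetical, and the code assumes a date command that supports -d (as used throughout this thread); both stamps are interpreted as UTC because only their difference matters.

```shell
# stamp_to_epoch YYYYmmdd-HHMM  (hypothetical helper; relies on date -d)
stamp_to_epoch() {
    # rewrite 20221218-0841 as "2022-12-18 08:41", a form date -d accepts
    iso=$(printf '%s\n' "$1" |
        sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})$/\1-\2-\3 \4:\5/')
    # -u pins the conversion to UTC so both stamps are treated alike
    date -u -d "$iso" +%s
}

# within_minutes STAMP1 STAMP2 MINUTES  (hypothetical helper)
# Succeeds when the two stamps differ by at most MINUTES.
within_minutes() {
    diff=$(( $(stamp_to_epoch "$1") - $(stamp_to_epoch "$2") ))
    # take the absolute value of the difference
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    [ "$diff" -le $(( $3 * 60 )) ]
}
```

Because the comparison is done on full timestamps, it keeps working across hour and day boundaries, for example: within_minutes "$time_status" "$date_now" 3 && echo "OK - $time_status"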

Arrange Log Entries into Dated Files

I'm trying to split a large log file, containing log entries for months at a time, and I'm trying to split it up into logfiles by date. There are thousands of line as follows:
Sep 4 11:45 kernel: Entry
Sep 5 08:44 syslog: Entry
I'm trying to split it up so that the files, logfile.20090904 and logfile.20090905 contain the entries.
I've created a program to read each line, and send it to the appropriate file, but it runs pretty slow (especially since I have to turn a month name to a number). I've thought about doing a grep for every day, which would require finding the first date in the file, but that seems slow as well.
Is there a more optimal solution? Maybe I'm missing a command line program that would work better.
Here is my current solution:
#! /bin/bash
cat $FILE | while read line; do
dts="${line:0:6}"
dt="`date -d "$dts" +'%Y%m%d'`"
# Note that I could do some caching here of the date, assuming
# that dates are together.
echo $line >> $FILE.$dt 2> /dev/null
done

@OP, try not to use bash's while read loop to iterate over a big file. It's tried and proven to be slow, and furthermore you are calling the external date command for every line of the file you read. Here's a more efficient way, using only gawk:
gawk 'BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",mth,"|")
}
{
for(i=1;i<=m;i++){ if ( mth[i]==$1){ month = i } }
tt="2009 "month" "$2" 00 00 00"
date= strftime("%Y%m%d",mktime(tt))
print $0 > (FILENAME "." date)
}
' logfile
output
$ more logfile
Sep 4 11:45 kernel: Entry
Sep 5 08:44 syslog: Entry
$ ./shell.sh
$ ls -1 logfile.*
logfile.20090904
logfile.20090905
$ more logfile.20090904
Sep 4 11:45 kernel: Entry
$ more logfile.20090905
Sep 5 08:44 syslog: Entry

The quickest thing given what you've already done would be to simply name the files "Sep 4" and so on, then rename them all at the end - that way all you have to do is read a certain number of characters, no extra processing.
If for some reason you don't want to do that, but you know the dates are in order, you could cache the previous date in both forms, and do a string comparison to find out whether you need to run date again or just use the old cached date.
Finally, if speed really keeps being an issue, you could try perl or python instead of bash. You're not doing anything too crazy here, though (besides starting a subshell and date process every line, which we already figured out how to avoid), so I don't know how much it'll help.
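The caching idea can be sketched in a single awk pass that looks the month up in a table instead of running date for every line. This is an illustration under assumptions, not a tested replacement for the script in the question: split_by_day is a hypothetical name, the year is passed in by the caller because the log lines do not carry one, and entries for the same day are assumed to be mostly adjacent.

```shell
# split_by_day LOGFILE YEAR  (hypothetical helper)
# Appends each line to LOGFILE.YYYYmmdd based on its "Mon D" prefix.
split_by_day() {
    awk -v year="$2" '
        BEGIN {
            # lookup table: month name -> month number
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
            for (i = 1; i <= n; i++) month[names[i]] = i
        }
        ($1 in month) {
            out = sprintf("%s.%04d%02d%02d", FILENAME, year, month[$1], $2)
            # close the previous file when the day changes, so that the
            # number of open files stays small
            if (out != prev) {
                if (prev != "") close(prev)
                prev = out
            }
            # append, because a file may be reopened if days interleave
            print >> out
        }
    ' "$1"
}
```

Since it appends, running it twice on the same log duplicates entries; removing old logfile.* outputs first avoids that.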

A skeleton of script:
BIG_FILE=big.txt
# remove $BIG_FILE when the script exits
trap "rm -f $BIG_FILE" EXIT
cat $FILES > $BIG_FILE || { echo "cat failed"; exit 1; }
# sort file by date in place
sort -M $BIG_FILE -o $BIG_FILE || { echo "sort failed"; exit 1; }
while IFS= read -r line; do
# extract date part from line ...
DATE_STR=${line:0:12}
# a new date - create a new file (string comparison, not arithmetic)
if [[ "$DATE_STR" != "$PREV_DATE_STR" ]]; then
# close file descriptor of "dated" file
exec 5>&-
PREV_DATE_STR=$DATE_STR
# open file of a "dated" file for write
FILE_NAME= ... set to file name ...
exec 5>"$FILE_NAME" || { echo "exec failed"; exit 1; }
fi
printf '%s\n' "$line" >&5 || { echo "print failed"; exit 1; }
done < $BIG_FILE

This script executes the inner loop 365 or 366 times, once for each day of the year, instead of iterating over each line of the log file:
#!/bin/bash
month=0
months=(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
for eom in 31 29 31 30 31 30 31 31 30 31 30 31
do
(( month++ ))
echo "Month $month"
if (( month == 2 )) # see what day February ends on
then
eom=$(date -d "3/1 - 1 day" +%-d)
fi
for (( day=1; day<=eom; day++ ))
do
grep "^${months[$month - 1]} $day " dates.log > temp.out
if [[ -s temp.out ]]
then
mv temp.out file.$(date -d $month/$day +"%Y%m%d")
else
rm temp.out
fi
# instead of creating a temp file and renaming or removing it,
# you could go ahead and let grep create empty files and let find
# delete them at the end, so instead of the grep and if/then/else
# immediately above, do this:
# grep --color=never "^${months[$month - 1]} $day " dates.log > file.$(date -d $month/$day +"%Y%m%d")
done
done
# if you let grep create empty files, then do this:
# find -type f -name "file.2009*" -empty -delete
