Select lines by condition and count with a one-line command - linux

I need help analyzing nginx logs. A sample of the log:
10.10.10.10 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=100&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
10.10.10.10 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=500&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
11.11.11.11 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=10&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
12.12.12.12 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=500&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
13.13.13.13 - - [21/Mar/2016:00:00:00 +0000] "GET /example HTTP/1.1" 200 769 "-" "" "1.1.1.1"
Is it possible to select, with a count, all unique IP addresses whose requests contain the per_page parameter with a value equal to or greater than 100?
The output can be in any format:
10.10.10.10 - 2 # ip 10.10.10.10 was found twice
12.12.12.12 - 1
Is it possible to get this with a single command?

$ awk '/per_page=[0-9]{3}/{cnt[$1]++} END{for (ip in cnt) print ip, cnt[ip]}' file
12.12.12.12 1
10.10.10.10 2
This is absolutely basic awk - read the book Effective Awk Programming, 4th Edition, by Arnold Robbins if you're going to be doing any other text file processing in UNIX.
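A note on the pattern: [0-9]{3} matches any run of three digits, which here covers every per_page value of 100 or more (some older awks need interval expressions enabled, e.g. gawk --re-interval). If you'd rather extract the value and compare it numerically, a POSIX-awk sketch of the same idea:
awk 'match($0, /per_page=[0-9]+/) {
    val = substr($0, RSTART + 9, RLENGTH - 9) + 0   # the digits after "per_page="
    if (val >= 100) cnt[$1]++
  }
  END { for (ip in cnt) print ip, cnt[ip] }' file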

Related

I want to write a bash script that will find the number of occurrences of a string in the last ten minutes of logs via sed or grep [duplicate]

This question already has answers here:
Filter log file entries based on date range
(5 answers)
Closed 6 years ago.
I want to extract information from a log file using a shell script (bash) based on a time range. A line in the log file looks like this:
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET / HTTP/1.1" 200 123 "" "Mozilla/5.0 (compatible; Konqueror/2.2.2-2; Linux)"
I want to extract data for specific intervals. For example, I need to look only at the events which happened during the last X minutes or X days, counting back from the last recorded entry. I'm new to shell scripting, but I have tried to use the grep command.
You can use sed for this. For example:
$ sed -n '/Feb 23 13:55/,/Feb 23 14:00/p' /var/log/mail.log
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: connect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: lost connection after CONNECT from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie postfix/smtpd[20964]: disconnect from localhost[127.0.0.1]
Feb 23 13:55:01 messagerie pop3d: Connection, ip=[::ffff:127.0.0.1]
...
How it works
The -n switch tells sed not to print each line it reads (printing every line is the default behaviour).
The p after the regular expressions tells it to print the lines selected by the preceding expression.
The expression /pattern1/,/pattern2/ selects everything between the first pattern and the second. Here it prints every line from the first occurrence of the string Feb 23 13:55 through the first subsequent occurrence of the string Feb 23 14:00.
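One caveat: if the opening pattern never matches (say, nothing was logged at exactly 13:55), the range never starts and nothing is printed. Matching a looser pattern helps, for example:
$ sed -n '/Feb 23 13:5[0-9]/,/Feb 23 14:0[0-9]/p' /var/log/mail.log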
Use grep and regular expressions. For example, if you want a four-minute interval of logs:
grep "31/Mar/2002:19:3[1-5]" logfile
will return all log lines between 19:31 and 19:35 on 31/Mar/2002.
Supposing you need the last 5 days counting back from today, 27/Sep/2011, you may use the following:
grep "2[3-7]/Sep/2011" logfile
Well, I spent some time on your date format, but I finally worked it out.
Let's take an example file (named logFile); I made it a bit short.
Say you want to get the last 5 minutes' worth of log lines from this file:
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:20:41 +0200] "GET
### lines below are what you want (5 mins till the last record)
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:27:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
172.16.0.3 - - [31/Mar/2002:19:30:41 +0200] "GET
Here is the solution:
# Customize this window; the important part is converting it to seconds,
# e.g. 5 days = $((5*24*3600))
x=$((5*60))   # here we take 5 minutes as the example
# this line gets the timestamp (seconds since the epoch) of the last line of your log file
last=$(tail -n1 logFile | awk -F'[][]' '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; print d }')
# this awk prints the lines you need:
awk -F'[][]' -v last="$last" -v x="$x" '{ gsub(/\//," ",$2); sub(/:/," ",$2); "date +%s -d \""$2"\""|getline d; if (last-d<=x) print $0 }' logFile
output:
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:27:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
172.16.0.3 - - 31 Mar 2002 19:30:41 +0200 "GET
EDIT
You may notice that the [ and ] have disappeared from the output (they were consumed as field separators). If you want them back, change print $0 in the last awk command to print $1 "[" $2 "]" $3.
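Note that both commands spawn one date process per input line, which gets slow on large files. A faster sketch (GNU awk only, since it relies on mktime(); it also buffers the whole file in memory):
gawk -v window=$((5*60)) '
BEGIN { months = "JanFebMarAprMayJunJulAugSepOctNovDec" }
{
    # $4 looks like [31/Mar/2002:19:20:41 -- split out the date parts
    split($4, t, "[][/:]")
    mon = (index(months, t[3]) + 2) / 3
    # the numeric timezone field is ignored; fine when every line shares one zone
    ts[NR]   = mktime(t[4] " " mon " " t[2] " " t[5] " " t[6] " " t[7])
    line[NR] = $0
}
END {
    for (i = 1; i <= NR; i++)
        if (ts[NR] - ts[i] <= window) print line[i]
}' logFile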
I used this command to find the last 5 minutes of logs for a particular event, "DHCPACK"; try the one below:
$ grep "DHCPACK" /var/log/messages | grep "$(date +%h\ %d) [$(date --date='5 min ago' +%H)-$(date +%H)]:*:*"
Beware that [HH-HH] is a character class, not a numeric range, so this is fragile for most hour pairs.
You can use this to get the current time and each log line's time:
#!/bin/bash
log="log_file_name"
while read -r line
do
    current_hours=$(date | awk 'BEGIN{FS="[ :]+"} {print $4}')
    current_minutes=$(date | awk 'BEGIN{FS="[ :]+"} {print $5}')
    current_seconds=$(date | awk 'BEGIN{FS="[ :]+"} {print $6}')
    log_file_hours=$(echo "$line" | awk 'BEGIN{FS="[ [/:]+"} {print $7}')
    log_file_minutes=$(echo "$line" | awk 'BEGIN{FS="[ [/:]+"} {print $8}')
    log_file_seconds=$(echo "$line" | awk 'BEGIN{FS="[ [/:]+"} {print $9}')
done < "$log"
Then compare the log_file_* and current_* variables, for example as sketched below.
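Placed inside that while loop, the comparison could look like this (a sketch: both timestamps are converted to seconds into the day, the 10# prefix keeps values like 08 from being read as octal, and all log lines are assumed to be from the current day):
now=$((10#$current_hours * 3600 + 10#$current_minutes * 60 + 10#$current_seconds))
logtime=$((10#$log_file_hours * 3600 + 10#$log_file_minutes * 60 + 10#$log_file_seconds))
if (( now - logtime <= 300 )); then   # within the last 5 minutes
    echo "$line"
fi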

How to write a Bash script / command that shows the range of 500-511 HTTP server errors in a file

I am trying to write a bash script that will list and count the HTTP 500-511 errors inside the file "ccc2022-02-19.txt".
Every file contains several 5xx errors, ranging from HTTP 500, 501, 502, and 503 up to 511.
The directory containing these files receives 4 different types of files daily, but I am only interested in the ones whose names start with "ccc", e.g. "ccc2022-02-19.txt", "ccc2022-02-20.txt", and so on.
Below is an example of the file content "ccc2022-02-19.txt"
10.32.10.181 ignore 19 Feb 2022 00:26:04 GMT 10.32.10.44 GET / HTTP/1.1 500 73 N 0 h
10.32.26.124 ignore 19 Feb 2022 00:26:06 GMT 10.32.10.44 GET / HTTP/1.1 501 73 N 0 h
10.32.42.249 ignore 19 Feb 2022 00:26:27 GMT 10.32.10.44 GET / HTTP/1.1 500 73 N 1 h
10.32.10.181 ignore 19 Feb 2022 00:26:34 GMT 10.32.10.44 GET / HTTP/1.1 302 73 N 0 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 503 73 N 1 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 502 73 N 1 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 502 73 N 1 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 504 73 N 1 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 511 73 N 1 h
10.32.26.124 ignore 19 Feb 2022 00:26:36 GMT 10.32.10.44 GET / HTTP/1.1 508 73
I have tried using this command
awk '{for(i=1;i<=NF;i++){if($i>=500 && $i<=511){print $i}}}' ccc2022-02-19.txt
which listed numbers in the 500-511 range, but I'm afraid it is not matching only the HTTP status field; it also picks up other numbers found elsewhere in the file, like 50023 and 503893.
To be specific, I just want to see only the HTTP errors. Please note that the file content above is just an example.
Here is a simple awk script:
awk '$12 ~ /5[[:digit:]]{2}/ && $12 < 512 {print $12}' input.txt
Explanation
$12 ~ /5[[:digit:]]{2}/ : field #12 matches 5[0-9][0-9]
$12 < 512 : field #12 is less than 512
$12 ~ /5[[:digit:]]{2}/ && $12 < 512 : (field #12 matches 5[0-9][0-9]) AND (field #12 is less than 512)
{print $12} : print field #12 only if the two conditions above are met
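Since the question also asks for counts, a small extension of the same idea (still assuming the status code is always field #12) tallies each code:
awk '$12 ~ /^5(0[0-9]|1[01])$/ { seen[$12]++ }
     END { for (code in seen) print code, seen[code] }' ccc2022-02-19.txt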
I think this script might help:
#!/bin/bash
ccc=500
while [ $ccc -le 511 ]
do
    echo $ccc
    ccc=$((ccc + 1))
    sleep 0.5
done
You can try this out:
#!/bin/bash
CURRENTDATE=$(date +"%Y-%m-%d")
echo "Today's date is ${CURRENTDATE}"
echo "Looking for today's file ccc${CURRENTDATE}.txt"
echo "#####"
echo "Start listing 500 response codes for file ccc${CURRENTDATE}.txt"
#awk '{print $3 " " $4 " " $5 " " $6 " " $11}' ccc${CURRENTDATE}.txt | grep 500
echo "Not listing the lines themselves, to stay under the MS Teams message-size limit"
echo "Completed listing 500 response codes for file ccc${CURRENTDATE}.txt"
echo "#####"
Assuming all lines look like the sample (i.e., the HTTP status code is always in the 12th whitespace-delimited field):
$ awk '$12>= 500 && $12<=511 {print $12}' ccc2022-02-19.txt
500
501
500
503
502
502
504
511
508
If this doesn't work for all possible input lines then the question should be updated with a more representative set of sample data.
This should achieve what you want. Please, everyone, read the description before concluding that a question is bad; this one is actually clear.
awk '{print $3 " " $4 " " $5 " " $6 " " $11 " " $12}' ccc2022-02-21.txt | grep 500 | wc -l
This was tested against the sample file content provided above, and it worked. A good question, in my opinion.

How to filter apache access logs on the basis of ips, domain and url?

I have to group identical IPs, domains, and URL patterns from my Apache access logs, and print each group along with its count, domain, and URL pattern.
Currently I am using an awk command, but it shows only the count and the IPs, not the domains and URL patterns.
My input is
Feb 2 03:15:01 lb2 haproxy2[30529]: "www.abc.com" 207.46.13.4 02/Feb/2020:03:15:01.668 GET /detail.php?id=166108259 HTTP/1.1 GET 404 123481 "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "" ci-https-in~ ciapache ci-web1 0 0 1 71 303 762 263 1 1 -- "" "" "" ""
Feb 2 03:15:02 lb2 haproxy2[30530]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:01.987 GET /listing.php?id=1009 HTTP/1.1 GET 200 182 "Mozilla/5.0 (Linux; Android 5.1.1; LG-K420 Build/LMY47V) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36" "https://wap.abc.com/s.php?q=land+buyers" ci-https-in~ ciapache ci-web2 0 0 0 18 18 17813 219 0 0 -- "" "" "" ""
Feb 2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "" ci-https-in~ varnish ci-van 0 0 0 1 3 470 1001 0 0 -- "" "" "" ""
Feb 2 03:15:02 lb2 haproxy2[30531]: "wap.abc.com" 106.76.245.226 02/Feb/2020:03:15:02.067 GET /listing.php?id=6397 HTTP/1.1 GET 200 128116 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "" ci-https-in~ varnish ci-van 0 0 0 1 3 470 1001 0 0 -- "" "" "" ""
Expected output
count ip domain url
2 106.76.245.226 wap.abc.com /listing.php?id=6397
1 106.76.245.226 wap.abc.com /listing.php?id=1009
1 207.46.13.4 www.abc.com /detail.php?id=166108259
Currently I am using this command, but it is not giving the expected output:
cat /var/log/httpd/access_log | grep www.abc.com* | awk '{print $7}' | sort -n | uniq -c | sort -rn | head -n 50
grep 'abc.com' /var/log/httpd/access_log | awk '{print $7,$6,$10}' | sort | uniq -c | sort -rn | head -n 50
Use other columns in the awk as needed; in the sample lines, $7 is the IP, $6 the domain, and $10 the URL.
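Alternatively, a single awk pass can build the counts directly and print them as count ip domain url (a sketch, assuming those field positions hold throughout the log):
awk '{ gsub(/"/, "", $6); cnt[$7 " " $6 " " $10]++ }
     END { for (k in cnt) print cnt[k], k }' /var/log/httpd/access_log | sort -rn | head -n 50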

Is IIS and TortoiseSVN working copy compatible?

I have a question about how IIS handles SVN folders.
I am working with ASP.NET Web Forms and MapGuide. My problem: when I point the site path in IIS at my TortoiseSVN working copy, MapGuide stops working. But when I copy all the files from my working copy into a plain Windows folder and point the path there, everything works fine.
So what does TortoiseSVN do?
Edit: here are some logs and errors:
2017-11-21 09:36:37 ::1 GET /mapguide/mapviewernet/ajaxviewer.aspx SESSION=78e11ef8-ce9f-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&WEBLAYOUT=Library://MyProject/Layouts/MyProject.WebLayout 81 - ::1 - - 500 19 5 0
2017-11-21 09:36:37 ::1 GET /xxx/xxx/MapContainerRechtsForm.aspx - 81 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0 http://localhost:81/xxx/xxx/MapContainerForm.aspx 200 0 0 562
2017-11-21 09:36:37 ::1 GET /xxx/javascript/jquery.min.js - 81 - ::1%0 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0 http://localhost:81/xxx/xxx/MapContainerRechtsForm.aspx 200 0 0 0
2017-11-21 09:36:37 ::1 GET /mapguide/mapviewernet/ajaxviewer.aspx SESSION=78e11ef8-ce9f-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&WEBLAYOUT=Library://MyProject/Layouts/MyProject.WebLayout 81 - ::1 - - 500 19 5 0
2017-11-21 09:36:37 ::1 GET /xxx/xxx/MapContainerRechtsForm.aspx - 81 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0 http://localhost:81/xxx/xxx/xxx.aspx 200 0 0 31
2017-11-21 09:36:37 ::1 GET /xxx/javascript/jquery.min.js - 81 - ::1%0 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0 http://localhost:81/xxx/xxx/xxx.aspx 200 0 0 0
2017-11-21 09:36:37 ::1 GET /xxx/xxx/MapContainerForm.aspx - 81 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64;+rv:57.0)+Gecko/20100101+Firefox/57.0 - 200 0 0 515
2017-11-21 09:37:24 ::1 GET /mapguide/mapagent/mapagent.fcgi OPERATION=GETPROVIDERCAPABILITIES&VERSION=2.0.0&SESSION=8d781ed4-ce9a-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&FORMAT=text%2Fxml&CLIENTAGENT=MapGuide%20Maestro%20v6.0.0.8909&PROVIDER=OSGeo.SDF 81 - ::1 - - 500 19 5 0
2017-11-21 09:38:24 ::1 GET /mapguide/mapagent/mapagent.fcgi OPERATION=GETPROVIDERCAPABILITIES&VERSION=2.0.0&SESSION=8d781ed4-ce9a-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&FORMAT=text%2Fxml&CLIENTAGENT=MapGuide%20Maestro%20v6.0.0.8909&PROVIDER=OSGeo.SDF 81 - ::1 - - 500 19 5 0
2017-11-21 09:39:24 ::1 GET /mapguide/mapagent/mapagent.fcgi OPERATION=GETPROVIDERCAPABILITIES&VERSION=2.0.0&SESSION=8d781ed4-ce9a-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&FORMAT=text%2Fxml&CLIENTAGENT=MapGuide%20Maestro%20v6.0.0.8909&PROVIDER=OSGeo.SDF 81 - ::1 - - 500 19 5 0
2017-11-21 09:40:24 ::1 GET /mapguide/mapagent/mapagent.fcgi OPERATION=GETPROVIDERCAPABILITIES&VERSION=2.0.0&SESSION=8d781ed4-ce9a-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&FORMAT=text%2Fxml&CLIENTAGENT=MapGuide%20Maestro%20v6.0.0.8909&PROVIDER=OSGeo.SDF 81 - ::1 - - 500 19 5 0
(Screenshots omitted: how it looks vs. how it should look.)
Dim Response As Net.WebResponse = Nothing
Dim WebReq As Net.HttpWebRequest = Net.HttpWebRequest.Create(URL)
Response = WebReq.GetResponse() ' <-- throws the exception
StatusCode = InternalServerError {500}
ResponseUri = {http://localhost:81/mapguide/mapviewernet/ajaxviewer.aspx?SESSION=48f61ece-cea8-11e7-8000-208df200a4f8_en_MTI3LjAuMC4x0AFC0AFB0AFA&WEBLAYOUT=Library://myProject/Layouts/myWebLayout.WebLayout}
OK, I found the solution to my problem:
I had to add the "Authenticated Users" group to my project folder, because the web.config in that folder could not be accessed otherwise.
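For reference, one way to grant that permission from an elevated command prompt (a sketch; the working-copy path is a placeholder):
icacls "C:\path\to\workingcopy" /grant "Authenticated Users":(OI)(CI)RX /T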
