Filter log lines within a 10 minute interval - linux

I have the below lines in my dummy.text file. I would like to filter this data using a bash script or awk.
Jul 28 15:05:47 * aaa has joined
Jul 28 15:07:47 * bbb has joined
Jul 28 15:08:41 * ccc has joined
Jul 28 15:13:32 * ddd has joined
Jul 28 15:14:40 * eee has joined
For example, aaa joined the session at 15:05:47 and ccc joined at 15:08:41. I want to get the lines for users who joined at or after 15:00:00 and before 15:10:00. The expected result would be:
Jul 28 15:05:47 * aaa has joined
Jul 28 15:07:47 * bbb has joined
Jul 28 15:08:41 * ccc has joined
Side note: after getting the expected output, I'm looking to write a cron job in which this data will be forwarded by mail.

One way:
awk -F'[ :]' '$3 == 15 && $4 >= 0 && $4 < 10' file.txt

If you specify the 10-minute interval as 15:00:00 up to but not including 15:10:00, then:
awk -v start=15:00:00 -v end=15:10:00 '$3 >= start && $3 < end'
If you decide you want to omit the final :00 for the seconds from the times, then:
awk -v start=15:00 -v end=15:10 '$3 >= start ":00" && $3 < end ":00"'
Both of these will report on entries in the time interval on any day. If you want to restrict the date, then you can apply further conditions (on $1 and $2).
If you calculate the start and end values in shell variables, then:
start=$(…) # Calculate start time hh:mm
end=$(…) # Calculate end time hh:mm
awk -v start="$start" -v end="$end" '$3 >= start ":00" && $3 < end ":00"'

awk '$3 >= "15:05:47" && $3 <= "15:08:47"' dummy.text
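For the side note about cron and mail: a minimal sketch, assuming GNU date and a configured mail command; the script path, log path and recipient (/usr/local/bin/filter-joins.sh, /var/log/dummy.text, admin@example.com) are placeholders, not from the question.

#!/bin/bash
# filter-joins.sh - mail the join lines from the last 10 minutes
start=$(date -d '10 minutes ago' +%H:%M:%S)
end=$(date +%H:%M:%S)
awk -v start="$start" -v end="$end" '$3 >= start && $3 < end' /var/log/dummy.text |
    mail -s "joins in the last 10 minutes" admin@example.com

And a matching crontab entry, running every 10 minutes:
*/10 * * * * /usr/local/bin/filter-joins.sh

Note that the plain string comparison breaks for the one window that crosses midnight.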

Related

Awk to find lines within date range in a file with custom date format

I'm trying to find all lines within a date range in a file. However, the dates are formatted in a non-standard way. Is there a way for awk to read these? The log file is formatted like so:
Jan 5 11:34:00 log messages here
Jan 13 16:21:00 log messages here
Feb 1 01:14:00 log messages here
Feb 10 16:32:00 more messages
Mar 7 16:32:00 more messages
Apr 21 16:32:00 more messages
For example if I want to pull all lines between January 1st and Feb 10th:
I've tried:
awk 'BEGIN{IGNORECASE=1} ($0>=from&&$0<=to)' from="Jan 1 00:00:00" to="Feb 10 23:59:59"
It's a system that only has access to awk so I am kind of limited. Any help would be greatly appreciated.
EDIT:
Thanks a lot for the answers so far! They've worked great and have helped my understanding of AWK. However, I forgot to mention that I need to be able to include the time as well.
For example finding lines in the range including and between:
Jan 1 12:34:00
and
Feb 20 14:23:01
EDIT2: Based on the answer provided by @Cyrus, I decided to use this to parse the times as well:
awk -v start="0101 10:23:22" -v stop="0210 14:21:02" \
'BEGIN{m["Jan"]="01"; m["Feb"]="02"; m["Mar"]="03"; m["Apr"]="04"}
{original = $0; $1 = m[$1]; $2 = sprintf("%.2d", $2)}
$1$2" "$3 >= start && $1$2" "$3 <= stop {print original}' file
$ cat tst.awk
{
    mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $1) + 2) / 3
    date = sprintf("%02d%02d", mthNr, $2)
}
(date >= from) && (date <= to)
$ awk -v from='0101' -v to='0210' -f tst.awk file
Jan 5 11:34:00 log messages here
Jan 13 16:21:00 log messages here
Feb 1 01:14:00 log messages here
Feb 10 16:32:00 more messages
Massage to suit...
With awk. 0101 is January 1st and 0210 February 10th.
awk -v start="0101" -v stop="0210" \
'BEGIN{m["Jan"]="01"; m["Feb"]="02"; m["Mar"]="03"; m["Apr"]="04"}
{original = $0; $1 = m[$1]; $2 = sprintf("%.2d", $2)}
$1$2 >= start && $1$2 <= stop {print original}' file
Output:
Jan 5 11:34:00 log messages here
Jan 13 16:21:00 log messages here
Feb 1 01:14:00 log messages here
Feb 10 16:32:00 more messages
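If the time of day should also take part in the range check (as asked in the EDIT), the same index() trick extends naturally, because a zero-padded MMDD followed by HH:MM:SS still compares correctly as a string. A sketch along the lines of tst.awk above, not from the original answers; tst2.awk is a hypothetical name:
$ cat tst2.awk
{
    mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $1) + 2) / 3
    stamp = sprintf("%02d%02d %s", mthNr, $2, $3)
}
(stamp >= from) && (stamp <= to)
$ awk -v from='0101 12:34:00' -v to='0220 14:23:01' -f tst2.awk file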

AWK adding if statement to add zero to number range 0 to 9 (NEED TO USE AWK)

Hi, I need to format the date command output using awk and add a zero before days 1 through 9.
today=`date | awk {'print $1 " " $2 " " $3'}`
So the output of the above is
Wed Mar 2
I need to add a 0 in front of the 2 (and any day of the month from 1 through 9):
Wed Mar 02
How can I do this using the awk command?
for i in 0{1..9}; do echo $i; done
So I need to add a zero to $3 when it's between 1 and 9.
I tried doing it this way, but something is not working and I get an error:
a3=`date|awk '{
    if ($3 <= 9)
        print $1" "$2" " "0"$3;
    else
        print $1" "$2" " $3;
}'`
echo $a3
Can you please assist?
Regards
If I were you I'd just specify a format directly:
$ date '+%a %b %d'
Wed Mar 02
date takes a format string preceded by a + as its final argument.
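Plugging that into the assignment from the question (a sketch; the variable name today is taken from the question):
today=$(date '+%a %b %d')   # e.g. "Wed Mar 02"; %d zero-pads the day of month
echo "$today"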
If you must do it in awk, you can use printf for formatted printing:
$ echo 1 2 10 20 | awk -v RS=" " '{printf "%s\t-> %02d\n",$1,$1}'
1 -> 01
2 -> 02
10 -> 10
20 -> 20

extract header if pattern in a column matches

I am trying to extract and print the header of a column when a pattern in that particular column matches.
Here is an example:
[user ~]$ cal |sed 's/July 2014//'
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
Expected output:
If the input date is 31, then print the day of the week for the 31st.
Just to be clear, I cannot use the date -d flag as it's not supported by my OS. Probably awk is needed here to crack the question.
[user ~]$ date -d 20140731 +%A
Thursday
I hope I am able to convey my question and concern clearly.
Using awk:
cal | awk -v date=31 'NR == 2 { split($0, header) } NR > 2 { for (i = 1; i <= NF; ++i) if ($i == date) { print header[NR == 3 ? i + 7 - NF : i]; exit } }'
Output:
Th
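For readability, here is the same logic spelled out in an awk file (a sketch of the one-liner above; dayname.awk is a hypothetical name). The ternary handles the first week, which may start mid-week and is right-aligned under the header:
$ cat dayname.awk
NR == 2 { split($0, header) }        # save the Su..Sa header row
NR > 2 {
    for (i = 1; i <= NF; i++) {
        if ($i == date) {
            # in the first week (NR == 3) field i sits under header column i + 7 - NF
            col = (NR == 3) ? i + 7 - NF : i
            print header[col]
            exit
        }
    }
}
$ cal | awk -v date=31 -f dayname.awk
Th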
Here is a gnu awk solution:
cal | awk -v date=31 -v FIELDWIDTHS="3 3 3 3 3 3 3 3" 'NR==2 {split($0,a)} {for (i=1;i<=NF;i++) if ($i==date) print a[i]}'
Th
You set the date that you want looked up as a variable, so it can be changed to what you like.
Or it could be written like this:
cal | awk 'NR==2 {split($0,a)} {for (i=1;i<=NF;i++) if ($i==date) print a[i]}' FIELDWIDTHS="3 3 3 3 3 3 3 3" date=31
PS: FIELDWIDTHS was introduced in gnu awk 2.13
Parsing the output of cal isn't really that advisable...
Can your OS's date handle -j?
date -j 073100002014 "+%a"
Thu
How is your OS at perl?
perl -MDateTime -E '$dt=DateTime->new(year=>2014,month=>7,day=>31);say $dt->day_name'
Thursday
Or, if it doesn't do perl -E, you could do
perl -MDateTime -e '$dt=DateTime->new(year=>2014,month=>7,day=>31);print $dt->day_name'
Thursday
How is your OS at php?
php -r '$jd=cal_to_jd(CAL_GREGORIAN,7,31,2014);echo(jddayofweek($jd,2));'
Thu

analyzing time tracking data in linux

I have a log file containing a time series of events. Now, I want to analyze the data to count the number of events in different intervals. Each entry shows that an event occurred at that timestamp. For example, here is part of the log file:
09:00:00
09:00:35
09:01:20
09:02:51
09:03:04
09:05:12
09:06:08
09:06:46
09:07:42
09:08:55
I need to count the events in 5-minute intervals. The result should look like:
09:00 5 // which means 5 events from 09:00:00 until 09:04:59
09:05 5 // which means 5 events from 09:05:00 until 09:09:59
and so on.
Do you know any trick in bash, shell, awk, ...?
Any help is appreciated.
awk to the rescue.
awk -v FS="" '{min=$5<5?0:5; a[$1$2$4min]++} END{for (i in a) print i, a[i]}' file
Explanation
It takes the 1st, 2nd and 4th characters of every line (the hour and the tens digit of the minute) as a key and counts how many times each key appears. To group minutes into the 0-4 and 5-9 ranges, it creates the variable min from the 5th character (the units digit of the minute): 0 in the first case and 5 in the second, and appends it to the key.
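An equivalent, arguably more explicit formulation using substr() instead of single-character fields (a sketch, not from the original answer):
awk '{min = (substr($0,5,1) < 5) ? 0 : 5; a[substr($0,1,2) substr($0,4,1) min]++}
     END {for (i in a) print i, a[i]}' file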
Sample
With your input,
$ awk -v FS="" '{min=$5<5?0:5; a[$1$2$4min]++} END{for (i in a) print i, a[i]}' a
0900 5
0905 5
With another sample input,
$ cat a
09:00:00
09:00:35
09:01:20
09:02:51
09:03:04
09:05:12
09:06:08
09:06:46
09:07:42
09:08:55
09:18:55
09:19:55
10:09:55
10:19:55
$ awk -v FS="" '{min=$5<5?0:5; a[$1$2$4min]++} END{for (i in a) print i, a[i]}' a
0900 5
0905 5
0915 2
1005 1
1015 1
another way with awk
awk -F : '{t=sprintf ("%02d",int($2/5)*5);a[$1 FS t]++}END{for (i in a) print i,a[i]}' file |sort -t: -k1n -k2n
09:00 5
09:05 5
Explanation:
use : as the field separator
int($2/5)*5 groups the minutes into 5-minute buckets (00, 05, 10, 15, ...); see the quick check after this list
a[$1 FS t]++ counts the entries per bucket
the final sort command outputs the buckets sorted by time.
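A quick sanity check of that bucketing expression (not part of the original answer):
$ echo '0 3 5 9 17 59' | awk -v RS=" " '{printf "%2d -> %02d\n", $1, int($1/5)*5}'
 0 -> 00
 3 -> 00
 5 -> 05
 9 -> 05
17 -> 15
59 -> 55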
Perl with output piped through uniq just for fun:
$ cat file
09:00:00
09:00:35
09:01:20
09:02:51
09:03:04
09:05:12
09:06:08
09:06:46
09:07:42
09:08:55
09:18:55
09:19:55
10:09:55
10:19:55
11:21:00
Command:
perl -F: -lane 'print $F[0].sprintf(":%02d",int($F[1]/5)*5);' file | uniq -c
Output:
5 09:00
5 09:05
2 09:15
1 10:05
1 10:15
1 11:20
1 11:00
Or just perl:
perl -F: -lane '$t=$F[0].sprintf(":%02d",int($F[1]/5)*5); $c{$t}++; END { print join(" ", $_, $c{$_}) for sort keys %c }' file
Output:
09:00 5
09:05 5
09:15 2
10:05 1
10:15 1
11:00 1
11:20 1
I realize this is an old question, but when I stumbled onto it I couldn't resist poking at it from another direction...
sed -e 's/:/ /' -e 's/[0-4]:.*$/0/' -e 's/[5-9]:.*$/5/' | uniq -c
In this form it assumes the data is from standard input, or add the filename as the final argument before the pipe.
It's not unlike Michal's initial approach, but if you happen to need a quick and dirty analysis of a huge log, sed is a lightweight and capable tool.
The assumption is that the data truly is in a regular format - any hiccups will appear in the result.
As a breakdown - given the input
09:00:35
09:01:20
09:02:51
09:03:04
09:05:12
09:06:08
and applying each edit clause individually, the intermediate results are as follows:
1) Eliminate the first colon.
-e 's/:/ /'
09 00:35
09 01:20
09 02:51
09 03:04
09 05:12
09 06:08
2) Transform minutes 0 through 4 to 0.
-e 's/[0-4]:.*$/0/'
09 00
09 00
09 00
09 00
09 05:12
09 06:08
3) Transform minutes 5-9 to 5:
-e 's/[5-9]:.*$/5/'
09 00
09 00
09 00
09 00
09 05
09 05
Steps 2 and 3 also delete all trailing content from the lines, content which would otherwise make the lines non-unique (and hence 'uniq -c' would fail to produce the desired counts).
Perhaps the biggest strength of using sed as the front end is that you can select on lines of interest, for example, if root logged in remotely:
sed -e '/sshd.*: Accepted .* for root from/!d' -e 's/:/ /' ... /var/log/secure

Show a list of users that logged in exactly 5 days ago from today in linux?

The last command displays the history of login attempts. How can I filter the output so that it displays the users who logged in 5 days before the current date?
Here is what I've been able to do so far:
last | grep Dec | grep -v reboot | awk '{print $5}'
This parses the dates from the output of the last command.
#!/bin/bash
count=$(date "+%d")
count=$((count-5))
last | grep -v reboot | grep Dec | awk -v count="$count" '$5 >= count {print $0}'
worked for me :) Thanks for the help @Olivier Dulac
I couldn't do it in one line, but here's a little bash script which might get the job done:
#! /bin/bash
# Find the date string we want
x=$(date --date="5 days ago" +"%a %b %e");
# And now chain a heap of commands together to...
# 1. Get the list of user
# 2. Ignore reboot
# 3. Filter the date lines we want
# 4. Print the user name using awk
# 5+6. Sort them and extract the unique values
last | grep -v "reboot" | grep "$x" | awk '{print $1}' | sort | uniq
in your awk (I don't have "last" here so I can't know the format)
just add a condition to only print the whole line when you see what you want:
ex: if the month is the 3rd field, and day is the 4th field,
last | grep -v reboot | awk ' ( ($3 == "Dec") && ($4 == "07") ) { print $0 ; }'
(once again, without an actual excerpt of "last", I can't tell if the above works, but I hope you get the general idea)
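A sketch combining that idea with GNU date so the target month and day are computed automatically; the field numbers $5 and $6 match the sample last output shown further below, so adjust them to your system (this is not from the original answers):
m=$(date --date="5 days ago" +%b)   # e.g. Dec
d=$(date --date="5 days ago" +%e)   # day of month, space-padded, e.g. " 6"
last | grep -v reboot | awk -v m="$m" -v d="$d" '($5 == m) && ($6+0 == d+0) {print $1}' | sort -u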
I think chooban's solution is the closest, but it lists only the matching lines. I found a better solution, and most probably it handles the 2013-12-31 - 2014-01-01 issue properly (I found no trace of the output format if a user is logged in for more than one year..., or the login time is in the previous year). It is a grep-less one (long)liner:
last | awk -v l="$(last -t $(date -d '-5 day' +%Y%m%d%H%M%S)|head -n 1)" 'BEGIN {l=substr(l,1,55)} /^reboot / {next} substr($0,1,55) == l {exit} 1'
I assumed that there is no such user as 'reboot'. It uses the fact that last -t YYYYMMDDHHMMSS prints the lines before the specific date, but unfortunately it changes the format if the logout is inside the specified period (shows "gone - no logout"), so it has to be cut off.
This is not the nicest solution as it calls last twice, but it seems to work.
Output:
root pts/1 mytst.xyzzy.tv Wed Dec 11 12:45 still logged in
root pts/0 mytst.xyzzy.tv Wed Dec 11 11:25 still logged in
root pts/0 mytst.xyzzy.tv Tue Dec 10 16:02 - 17:14 (01:12)
root pts/0 mytst.xyzzy.tv Tue Dec 10 10:59 - 15:04 (04:05)
root pts/0 mytst.xyzzy.tv Mon Dec 9 13:23 - 17:10 (03:46)
root pts/1 mytst.xyzzy.tv Fri Dec 6 16:01 - 16:07 (00:06)
root pts/0 mytst.xyzzy.tv Fri Dec 6 15:52 - 16:08 (00:15)
I hope this could help!
