awk get the data based on if else condition

I need your expertise again. I am trying to use a conditional in awk to select columns.
Looking at $5, the data contains a year on some lines and a time on others.
When the year is there it is fine to print it, but where $5 holds a time such as 05:17:27 I need to print the last field instead. The $5 values are:
2021
2021
05:17:27
20:33:17
05:17:20
2020
2020
2021
2020
2021
Below is my sample data.
data_file.
yogutdb01 Mon 28 Jun 2021 11:19:56 PM MST
yogutdb02 Thu 30 Sep 2021 02:02:53 AM MST
yogutdb03 Thu Jul 13 05:17:27 2017
yogutdb04 Fri Jun 23 20:33:17 2017
yogutdb05 Thu Jul 13 05:17:20 2017
yogutdb06 Wed 24 Jun 2020 03:49:16 PM MST
yogutdb07 Wed 24 Jun 2020 04:05:10 PM MST
yogutdb08 Sat 22 May 2021 04:19:14 AM MST
yogutdb09 Thu 09 Apr 2020 12:16:32 PM CEST
yogutdb10 Tue 11 May 2021 03:03:02 PM MST
My attempt: I am using the command below, but I get a syntax error at the else.
$ awk '{ ($5=="[^0-9]+$")print $1,$2,$3,$4,$5; else print $1,$2,$3,$4,$NF}' my_data.text
The desired output is:
yogutdb01 2021
yogutdb02 2021
yogutdb03 2017
yogutdb04 2017
yogutdb05 2017
yogutdb06 2020
yogutdb07 2020
yogutdb08 2021
yogutdb09 2020
yogutdb10 2021
OR
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021

You cannot use the == operator to test a regex match, because == compares the field against the pattern as a literal string. Use the
match() function or the ~ operator instead.
The ^ anchor belongs in front of [0-9], not inside the brackets, where it negates the character class.
Please try:
awk '{if (match($5,/^[0-9]+$/)) print $1, $2, $3, $4, $5; else print $1, $2, $3, $4, $NF}' my_data.text
Output:
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
Here is an alternative using ~ operator:
awk '$5 ~ /^[0-9]+$/ {print $1, $2, $3, $4, $5; next} {print $1, $2, $3, $4, $NF}' my_data.text
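To see why the original == test never matched, here is a small sketch (the function name regex_vs_equal is made up for illustration). It runs one sample value through both tests: == compares $1 against the literal text of the pattern, while ~ applies the pattern as a regular expression.

```shell
# Hypothetical helper: report how == and ~ treat the same pattern.
regex_vs_equal() {
  printf '%s\n' "$1" | awk '{
    # == is a plain comparison: true only if $1 is literally "^[0-9]+$".
    eq = ($1 == "^[0-9]+$") ? "yes" : "no"
    # ~ is a regex match: true if $1 consists of one or more digits.
    re = ($1 ~ /^[0-9]+$/) ? "yes" : "no"
    print "equal=" eq, "match=" re
  }'
}

regex_vs_equal 2021       # prints: equal=no match=yes
regex_vs_equal 05:17:27   # prints: equal=no match=no
```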

To get your desired output, you can use a regular-expression match.
The command below uses the !~ operator to test whether $5 contains a colon:
$ awk '{ if ($5 !~ /:/) { print $1,$2,$3,$4,$5; next } { print $1,$2,$3,$4, $NF } }' exampl_data1
Result:
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
Just to mention, as @tshiono also asked in the comments: to get the day and month in a consistent order, you can use the command below.
$ awk '{ if ($5 !~ /:/) { print $1, $2, $3, $4, $5; next } { print $1, $2, $4, $3, $NF } }' exampl_data1

You could print the first 4 fields, then check whether the 5th field consists only of digits. If it does not, print the last field instead.
awk '{print $1, $2, $3, $4, ($5 ~ /^[0-9]+$/ ? $5 : $NF)}' my_data.text
Output
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
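The question also lists a shorter desired output with only the host name and the year. The same ternary test could produce it; this is a minimal sketch, and host_year is a made-up name:

```shell
# Hypothetical helper: print the host name and the year only.
# Reads the files given as arguments, or standard input if there are none.
host_year() {
  awk '{ print $1, ($5 ~ /^[0-9]+$/ ? $5 : $NF) }' "$@"
}

# Usage: host_year my_data.text
```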

UPDATE : new version that also fixes month-date cross-placements in columns 3 and 4 :
echo "${aaaaa}" \
\
| mawk 'NF=_+!($_=$(!+$NF?_:NF))*($3=$(2+2^(\
__= $4 ~ /^[0-3][0-9]$/)) \
substr("",$4=$(4-__)))' \_=5
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu 13 Jul 2017 *** fixed these 3 rows
yogutdb04 Fri 23 Jun 2017 ***
yogutdb05 Thu 13 Jul 2017 ***
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
The first command below assumes that $NF holds no numeric data other than a 4-digit year.
The second option (shown in mawk and gawk variants) performs a more thorough check for the year. Both assign the proper year value to $5, then assign to NF to trim off all the excess fields to the right of it.
< datafile.txt \
\
| mawk 'NF=_^($_=$(!+$NF?_:NF))^!_' \_=5
or
| mawk 'NF= +_+($_=$(/[ ][012][0-9][0-9][0-9]$/? NF :_))*!_' \_=5
| gawk 'NF= _+!($_=$(/[ ][0-2][0-9]{3}$/ ? NF :_))' \_=5
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
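For readers who find the one-liners above hard to follow, the same idea (put the year into $5, then assign NF to cut off the rest) might be spelled out as below. This is a sketch, not the answer's exact logic: trim_to_year is a made-up name, and it relies on awk rebuilding the record when NF is assigned, which gawk and mawk do but other implementations may not.

```shell
# Hypothetical helper: keep host, weekday, day, month and year.
trim_to_year() {
  awk '{
    if ($5 !~ /^[0-9]+$/)       # $5 is a time, so the year is the last field
      $5 = $NF                  # move the year into field 5
    if ($4 ~ /^[0-3][0-9]$/) {  # "Jul 13" order: put the day before the month
      tmp = $3; $3 = $4; $4 = tmp
    }
    NF = 5                      # drop every field to the right of the year
    print
  }' "$@"
}

# Usage: trim_to_year datafile.txt
```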

Related

Extracting the lines from a file which shows date more than 30 days

Experts,
I am new to the scripting world.
I am trying to keep only the lines whose date is more than 30 days old.
The file contains the following lines:
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Marc 17 04:12:51 2022
Server6 last patched on Mon Oct 17 23:08:24 2022
Thank you for your help.
I expect to keep only the lines that are older than 30 days.
I have tried this:
awk -v dat="Sun Oct 04 00:00:00 2022" -F':' '$5<dat' list.txt
but it returns every line unchanged:
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Marc 17 04:12:51 2022
Server6 last patched on Mon Oct 17 23:08:24 2022
The expected result is:
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Marc 17 04:12:51 2022
An actual entry in the file looks like this:
server1 - Red Hat Enterprise Linux Server release 7.9 (Maipo) - last patched on Tue Sep 20 10:45:56 2022

The difficulty is having to parse the timestamps into an actual time value: you can't just compare them as strings and expect chronological order.
Here's a bit of perl:
perl -MTime::Piece -lane '
BEGIN {$start = (localtime) - 86400 * 30}
$t = Time::Piece->strptime("@F[4..8]", "%a %b %d %T %Y");
print if $t < $start;
' file
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Mar 17 04:12:51 2022
Note, I had to edit Marc to Mar to satisfy strptime's %b month abbreviation.
If it's in this format: 'server1 - Red Hat Enterprise Linux Server release 7.9 (Maipo) - last patched on Tue Sep 20 10:45:56 2022', do I need to change it to @F[14..18]?
Yes. If the number of words before the date is variable, I'd use regex matching. I'm going to assume "last patched on" is always present:
perl -MTime::Piece -lne '
BEGIN {$start = (localtime) - 86400 * 30}
if (/last patched on (.+)/) {
$t = Time::Piece->strptime($1, "%a %b %d %T %Y");
print if $t < $start;
}
' file
I can't reproduce your error:
$ cat file
server1 - Red Hat Enterprise Linux Server release 7.9 (Maipo) - last patched on Tue Sep 20 10:45:56 2022
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Mar 17 04:12:51 2022
Server6 last patched on Mon Oct 17 23:08:24 2022
$ perl -MTime::Piece -lne '
BEGIN {$start = (localtime) - 86400 * 30}
if (/last patched on (.+)/) {
$t = Time::Piece->strptime($1, "%a %b %d %T %Y");
print if $t < $start;
}
' file
server1 - Red Hat Enterprise Linux Server release 7.9 (Maipo) - last patched on Tue Sep 20 10:45:56 2022
Server1 last patched on Mon Oct 11 09:50:47 2021
Server2 last patched on Fri Jun 3 07:53:36 2022
Server3 last patched on Fri Jun 3 11:58:26 2022
Server4 last patched on Fri Jun 17 12:58:59 2022
Server5 last patched on Fri Mar 17 04:12:51 2022
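Since the question asked about awk, here is a rough awk-only sketch of the same idea: turn each timestamp into a sortable YYYYMMDD number and compare it with a cutoff. It is not equivalent to the perl above: it works at day granularity (the time of day is ignored), it takes the date from the last five fields of the line, and it only looks at the first three letters of the month, so "Marc" is accepted. The name older_than and the YYYYMMDD cutoff argument are made up for illustration.

```shell
# Hypothetical helper: older_than CUTOFF FILE
# Prints lines whose "last patched on" date is before CUTOFF (YYYYMMDD).
older_than() {
  awk -v cutoff="$1" '
    BEGIN {
      # Map three-letter month abbreviations to month numbers.
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= n; i++) month[names[i]] = i
    }
    /last patched on / {
      # The last five fields are: weekday month day time year.
      mon = month[substr($(NF-3), 1, 3)]
      if (!mon) next                      # skip lines with an unknown month
      key = sprintf("%04d%02d%02d", $NF, mon, $(NF-2))
      if (key + 0 < cutoff + 0) print     # compare as numbers
    }
  ' "$2"
}

# Usage, assuming GNU date for the relative date:
#   older_than "$(date -d "30 days ago" +%Y%m%d)" list.txt
```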

Cannot test transition from DST

I am having issues trying to test transition from DST. This is my timezone:
cat /etc/timezone
Europe/Rome
while this is the output of zdump -v -c 2019,2023 /usr/share/zoneinfo/Europe/Rome:
/usr/share/zoneinfo/Europe/Rome -9223372036854775808 = NULL
/usr/share/zoneinfo/Europe/Rome -9223372036854689408 = NULL
/usr/share/zoneinfo/Europe/Rome Sun Mar 31 00:59:59 2019 UT = Sun Mar 31 01:59:59 2019 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 31 01:00:00 2019 UT = Sun Mar 31 03:00:00 2019 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 27 00:59:59 2019 UT = Sun Oct 27 02:59:59 2019 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 27 01:00:00 2019 UT = Sun Oct 27 02:00:00 2019 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 29 00:59:59 2020 UT = Sun Mar 29 01:59:59 2020 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 29 01:00:00 2020 UT = Sun Mar 29 03:00:00 2020 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 25 00:59:59 2020 UT = Sun Oct 25 02:59:59 2020 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 25 01:00:00 2020 UT = Sun Oct 25 02:00:00 2020 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 28 00:59:59 2021 UT = Sun Mar 28 01:59:59 2021 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 28 01:00:00 2021 UT = Sun Mar 28 03:00:00 2021 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 31 00:59:59 2021 UT = Sun Oct 31 02:59:59 2021 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 31 01:00:00 2021 UT = Sun Oct 31 02:00:00 2021 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 27 00:59:59 2022 UT = Sun Mar 27 01:59:59 2022 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome Sun Mar 27 01:00:00 2022 UT = Sun Mar 27 03:00:00 2022 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 30 00:59:59 2022 UT = Sun Oct 30 02:59:59 2022 CEST isdst=1 gmtoff=7200
/usr/share/zoneinfo/Europe/Rome Sun Oct 30 01:00:00 2022 UT = Sun Oct 30 02:00:00 2022 CET isdst=0 gmtoff=3600
/usr/share/zoneinfo/Europe/Rome 9223372036854689407 = NULL
/usr/share/zoneinfo/Europe/Rome 9223372036854775807 = NULL
European Summer Time begins (clocks go forward) at 01:00 UTC on the last Sunday in March (28 March 2021), and ends (clocks go back) at 01:00 UTC on the last Sunday in October (31 October 2021).
I issued the following to test the transition to DST:
sudo date -s '2021-03-28 01:59:59'
Sun Mar 28 01:59:59 CET 2021
Sun Mar 28 03:00:00 CEST 2021
The time indeed moves forward by 1 hour at 2:00 CET (01:00 UTC) and the timezone switches from CET to CEST.
However, I get this trying to test the transition from DST:
sudo date -s '2021-10-31 01:59:59'
Sun Oct 31 01:59:59 CEST 2021
Sun Oct 31 02:00:00 CEST 2021
Setting the time directly to 02:00 actually triggers a timezone change:
sudo date -s '2021-10-31 02:00:00'
Sun Oct 31 02:00:00 CET 2021
Why is it changing timezone when setting the time to 2:00 while it is not when setting it to 1:59:59? Shouldn't it transition from DST at 3:00:00 nonetheless?
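The zdump output above suggests what is going on: the switch back happens when the clock reaches 03:00:00 CEST (01:00 UTC), at which point it reads 02:00:00 CET. So 01:59:59 CEST is still a full hour before the transition, and every local time from 02:00:00 to 02:59:59 occurs twice that night, which makes a string like '2021-10-31 02:00:00' ambiguous; judging by the output above, date -s resolved it to the later CET reading. A sketch for watching the transition without touching the system clock, assuming GNU date (for -d @SECONDS) and an installed tz database; rome_time is a made-up name:

```shell
# Hypothetical helper: show an epoch timestamp as Europe/Rome wall-clock time.
rome_time() {
  TZ=Europe/Rome date -d "@$1" '+%H:%M:%S %Z'
}

# 2021-10-31 01:00:00 UTC is 1635642000 seconds since the epoch.
rome_time 1635641999   # one second before the transition: 02:59:59 CEST
rome_time 1635642000   # the transition instant:           02:00:00 CET
```

To test the transition with date -s, setting the clock in UTC (for example with date -u -s '2021-10-31 00:59:59') avoids the ambiguous local times.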

Print only first and last matching patterns

I am new to scripting and am learning as I go, I appreciate all and any help you can provide. I have a file with the following data:
0252 Fri 03 Jul 2015 84082679
0252 Fri 10 Jul 2015 81473945
0252 Fri 17 Jul 2015 87405062
0252 Fri 24 Jul 2015 89400396
0253 Fri 03 Jul 2015 29038894
0253 Fri 10 Jul 2015 29392107
0253 Fri 17 Jul 2015 31271055
0253 Fri 24 Jul 2015 31367348
071 Fri 03 Jul 2015 18594024
071 Fri 10 Jul 2015 18568430
071 Fri 17 Jul 2015 18648903
071 Fri 24 Jul 2015 18887643
072 Fri 03 Jul 2015 20141235
072 Fri 10 Jul 2015 19563727
072 Fri 17 Jul 2015 19573266
My desired output would look like:
0252 Fri 03 Jul 2015 84082679
0252 Fri 24 Jul 2015 89400396
0253 Fri 03 Jul 2015 29038894
0253 Fri 24 Jul 2015 31367348
071 Fri 03 Jul 2015 18594024
071 Fri 24 Jul 2015 18887643
072 Fri 03 Jul 2015 20141235
072 Fri 17 Jul 2015 19573266
The first column in the input data defines the "groups". From each group I want to print exactly two lines: the first line and the last line.
I would like to use awk to get my desired result, as I am trying to sort this information for the final output. Any help is greatly appreciated, thank you.

Perl to the rescue!
perl -lane '
if ($F[0] eq $id) {
$keep = $_
} else {
$id = $F[0];
print $keep if defined $keep;
print
}
}{ print $keep
' < input.txt > output.txt
-n reads the input line by line
-a splits each line into the @F array
-l adds newline to print
$id is used to keep the value from the first column
$keep remembers the last line. When the $id changes, $keep and the current line are printed.
After the Eskimo greeting operator }{, the last line is printed once the whole file has been processed.

$ awk -v h=99 'h>$3{if (last) print last;print;} {h=$3;last=$0;} END{print last}' file
0252 Fri 03 Jul 2015 84082679
0252 Fri 24 Jul 2015 89400396
0253 Fri 03 Jul 2015 29038894
0253 Fri 24 Jul 2015 31367348
071 Fri 03 Jul 2015 18594024
071 Fri 24 Jul 2015 18887643
072 Fri 03 Jul 2015 20141235
072 Fri 17 Jul 2015 19573266
How it works
The script uses two variables: h and last. h is the value of the third field on the previous line and last is the text of the last line. Any decrease in h triggers printing.
-v h=99
Set initial value of h to a large number.
h>$3{if (last) print last;print;}
If h is larger than field 3, then print both the previous line (if there is one) and the current line.
h=$3;last=$0;
Update h and last.
END{print last}
Print the last line.

This might work for you (GNU sed):
sed -r '1p;N;/^(\S+\s+).*\n\1/D;2s/.*\n//' file
Always print the first line. Append the next line to the current line and compare the first field of the first with the first field of the second. If they are the same, delete the first and repeat. Otherwise, print both lines, but only the second if on line 2.

$ cat tst.awk
$1 != p1 { print p0 $0 }
{ p1 = $1; p0 = $0 ORS }
END { printf "%s", p0 }
$ awk -f tst.awk file
0252 Fri 03 Jul 2015 84082679
0252 Fri 24 Jul 2015 89400396
0253 Fri 03 Jul 2015 29038894
0253 Fri 24 Jul 2015 31367348
071 Fri 03 Jul 2015 18594024
071 Fri 24 Jul 2015 18887643
072 Fri 03 Jul 2015 20141235
072 Fri 17 Jul 2015 19573266
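One caveat with tst.awk above: a group that consists of a single line is printed twice, once as its first and once as its last line. A possible variant tracks line numbers to avoid that. This is a sketch under the same assumption that the input is already grouped by the first column, and first_last is a made-up name:

```shell
# Hypothetical helper: print the first and last line of each group,
# where a group is a run of lines sharing the same first field.
first_last() {
  awk '
    # A new group starts here. Appending "" forces a string comparison,
    # so that 071 and 71 count as different groups.
    NR == 1 || ($1 "") != key {
      if (NR > 1 && lastnr > firstnr) print last   # close the previous group
      print                                        # first line of the new group
      key = $1 ""
      firstnr = NR
    }
    { last = $0; lastnr = NR }                     # remember the latest line
    END { if (lastnr > firstnr) print last }       # close the final group
  ' "$@"
}

# Usage: first_last file
```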

Grails get day of current week and last three weeks

I have a domain, work, with id and day.
Day holds values from March to the present.
I need to get the list of days for the current week and the previous two weeks.
Ex: today is Monday (04/22) then what I need is:
Week1: 06-12 April
Week2: 13-19 April
Current week: 20-26 April.
Please help, thanks.

Posted here for posterity:
def current = new Date().clearTime()
int currentDay = Calendar.instance.with {
time = current
get( Calendar.DAY_OF_WEEK )
}
def listOfDays = (current - 13 - currentDay)..(current + 7 - currentDay)
listOfDays.each {
println it
}
Prints:
Sun Apr 06 00:00:00 BST 2014
Mon Apr 07 00:00:00 BST 2014
Tue Apr 08 00:00:00 BST 2014
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
Wed Apr 23 00:00:00 BST 2014
Thu Apr 24 00:00:00 BST 2014
Fri Apr 25 00:00:00 BST 2014
Sat Apr 26 00:00:00 BST 2014

Groovy, get list of current week and last 2 weeks

I have a domain, work, with id and day; day holds a list of days from January to now.
I get the current time with this code:
def current = new Date()
I'd like to get the list of days for the last 2 weeks, including this week. I used the following code, but it doesn't work.
def getWeek = current.Time - 13 (13 is 2 week + today)
Please help me solve it.

Not 100% sure I understand, but you should be able to use a Range:
def current = new Date().clearTime()
def listOfDays = (current - 13)..current
listOfDays.each { println it }
That prints:
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
If you mean you want the entire 2 weeks before the current week AND the current week, you could do:
def current = new Date().clearTime()
int currentDay = Calendar.instance.with {
time = current
get( Calendar.DAY_OF_WEEK )
}
def listOfDays = (current - 13 - currentDay)..(current + 7 - currentDay)
listOfDays.each {
println it
}
Which prints:
Sun Apr 06 00:00:00 BST 2014
Mon Apr 07 00:00:00 BST 2014
Tue Apr 08 00:00:00 BST 2014
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
Wed Apr 23 00:00:00 BST 2014
Thu Apr 24 00:00:00 BST 2014
Fri Apr 25 00:00:00 BST 2014
Sat Apr 26 00:00:00 BST 2014
