Remove first n "words" from string variable in Bash

I want to remove the first 4 words from my string variable "DATES".
Does someone have a simple solution for this?
Here is my example:
DATES="31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01"
WC=$(echo $DATES | wc -w)
DATE_COUNT=$(( $WC / 4 - 1 ))
for ((i = 0; i <= DATE_COUNT; i++))
do
YEAR=$(echo $DATES | awk '{print $3}')
MONTH=$(echo $DATES | awk '{print $2}')
MONTH=$( date --date="$(printf "01 %s" $MONTH)" +"%m")
DAY=$(echo $DATES | awk '{print $1}')
TIME=$(echo $DATES | awk '{print $4}' | sed 's/://g')
DATE_ARRAY[$i]="$YEAR$MONTH$DAY$TIME"
#Remove first 4 words from string
done

Use cut.
DATES="31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01"
echo $DATES | cut -d' ' -f 5-
Output:
30 May 2021 10:23:01 29 May 2021 10:24:01
You can even use it for a cleaner solution than awk, like this:
YEAR=$(echo $DATES | cut -d' ' -f 3)
A general version that removes the first n words:
remove_n_first_words(){
echo $2 | cut -d' ' -f $(($1+1))-
}
remove_n_first_words 4 "$DATES"

Using bash regex operator =~:
$ [[ $DATES =~ ^(([^ ]+ +){4})(.*) ]] && echo ${BASH_REMATCH[3]}
30 May 2021 10:23:01 29 May 2021 10:24:01
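The same regex idea can be wrapped in a function that takes the word count as a parameter. This is only a sketch: the name remove_first_words and the variable re are made up for illustration, and it assumes words are separated by plain spaces. The pattern is stored in a variable first because a quoted pattern on the right-hand side of =~ would be matched literally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "$2" with its first $1 space-separated words removed.
remove_first_words() {
    local n=$1 text=$2
    # Optional leading spaces, then n groups of "word + spaces", then the rest.
    local re="^ *([^ ]+ +){$n}(.*)$"
    if [[ $text =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[2]}"
    else
        # No text follows the first n words, so nothing is left to print.
        printf '\n'
    fi
}

DATES="31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01"
rest=$(remove_first_words 4 "$DATES")
echo "$rest"
```

Here BASH_REMATCH[1] holds only the last repetition of the inner group, so the remainder ends up in BASH_REMATCH[2].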

Maybe use read?
DATES="31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01"
read -ra dates <<< "$DATES"; echo "${dates[@]:4}"
Or just store the data in an array directly.
DATES=(31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01)
echo "${DATES[@]:4}"
To get the total number of words/elements, like with wc -w:
echo "${#DATES[*]}"
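Building on the array idea, here is a minimal sketch of the whole conversion loop from the question. The names words and MONTH_NUM are my own, not from the question. It assumes bash 4 or newer (for associative arrays), English month abbreviations, and days that are already two digits.

```shell
#!/usr/bin/env bash
DATES="31 May 2021 10:22:01 30 May 2021 10:23:01 29 May 2021 10:24:01"
read -ra words <<< "$DATES"

# Hypothetical lookup table: month abbreviation -> two-digit month number.
declare -A MONTH_NUM=([Jan]=01 [Feb]=02 [Mar]=03 [Apr]=04 [May]=05 [Jun]=06
                      [Jul]=07 [Aug]=08 [Sep]=09 [Oct]=10 [Nov]=11 [Dec]=12)

DATE_ARRAY=()
while (( ${#words[@]} >= 4 )); do
    day=${words[0]} month=${words[1]} year=${words[2]} time=${words[3]}
    # ${time//:/} strips the colons, e.g. 10:22:01 -> 102201.
    DATE_ARRAY+=("$year${MONTH_NUM[$month]}$day${time//:/}")
    # Drop the four words just consumed.
    words=("${words[@]:4}")
done

printf '%s\n' "${DATE_ARRAY[@]}"
```

This avoids spawning awk, sed and date on every iteration, and the loop ends by itself once fewer than four words remain.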

Related

converting 4 digit year to 2 digit in shell script

I have a file like this:
$ cat file.txt
1981080512 14 15
2019050612 17 18
2020040912 19 95
Here the 1st column represents dates as YYYYMMDDHH.
I would like to write the dates as YYMMDDHH, so the desired output is:
81080512 14 15
19050612 17 18
20040912 19 95
My script:
while read -r x;do
yy=$(echo $x | awk '{print substr($0,3,2)}')
mm=$(echo $x | awk '{print substr($0,5,2)}')
dd=$(echo $x | awk '{print substr($0,7,2)}')
hh=$(echo $x | awk '{print substr($0,9,2)}')
awk '{printf "%10s%4s%4s\n",'$yy$mm$dd$hh',$2,$3}'
done < file.txt
It is printing
81080512 14 15
81080512 17 18
Any help please. Thank you.
Please don't kill me for this simple answer, but what about this:
cut -c 3- file.txt
You simply cut the first two digits by showing character 3 till the end of every line (the -c switch indicates that you need to cut characters (not bytes, ...)).
You can do it with a single call to awk's substr function. Let the content of file.txt be
1981080512 14 15
2019050612 17 18
2020040912 19 95
then
awk '{$1=substr($1,3);print}' file.txt
output
81080512 14 15
19050612 17 18
20040912 19 95
Explanation: substr($1,3) returns the 1st column from its 3rd character onward. I assign that back to the column and then print the modified line.
(tested in gawk 4.2.1)
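As for why the original loop misbehaves: the last awk inside it has no file argument, so it reads from the loop's standard input and swallows the remaining lines of file.txt. If you want to keep a while read loop, a minimal sketch in plain bash could look like this. The variable names input, stamp and rest are made up, and the here-string stands in for file.txt.

```shell
#!/usr/bin/env bash
# Sample data standing in for file.txt.
input='1981080512 14 15
2019050612 17 18
2020040912 19 95'

output=$(while read -r stamp rest; do
    # ${stamp:2} is the first column without its first two characters.
    printf '%s %s\n' "${stamp:2}" "$rest"
done <<< "$input")

echo "$output"
```

With a real file you would replace the here-string with `< file.txt`.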

How can I cut off strings from a command's output in the Bash shell?

The command I run is as follows:
rpm -qi setup | grep Install
The output of the command:
Install Date: Do 30 Jul 2020 15:55:28 CEST
I would like to trim this output further so that only the date remains:
30 Jul 2020
The rest of the output should not be displayed.
What is the simplest way to get this result in bash?
Use grep -Po like so (-P = use Perl regex engine, and -o = print just the match, not the entire line):
echo 'Install Date: Do 30 Jul 2020 15:55:28 CEST' | grep -Po '\d{1,2}\s+\w{3}\s+\d{4}'
You can also use cut like so (-d' ' = split on blanks, -f4-6 = print fields 4 through 6):
echo 'Install Date: Do 30 Jul 2020 15:55:28 CEST' | cut -d' ' -f4-6
Output:
30 Jul 2020
You can do it using just rpm --queryformat and bash's printf:
$ printf '%(%d %b %Y)T\n' $(rpm -q --queryformat '%{INSTALLTIME}\n' setup)
29 Apr 2020
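If you would rather stay with the text that grep already gives you, the read builtin can pick out the fields without any external tool. This is a sketch that assumes the line has exactly the shape shown in the question, with three words before the day. The variable line is a stand-in for the output of rpm -qi setup | grep Install.

```shell
#!/usr/bin/env bash
# Stand-in for: line=$(rpm -qi setup | grep Install)
line='Install Date: Do 30 Jul 2020 15:55:28 CEST'

# Skip "Install", "Date:" and the weekday, keep the next three words,
# and let the last _ absorb whatever follows.
read -r _ _ _ day month year _ <<< "$line"
install_date="$day $month $year"
echo "$install_date"
```

Unlike cut -d' ', read splits on runs of whitespace, so extra spaces between fields do not shift the result.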

Capture a set of numbers in sed

I have the following string
Text1 Text2 v2010.0_1.3 Tue Jun 6 14:38:31 PDT 2017
I am trying to capture only v2010.0_1.3 using
echo "Text1 Text2 v2010.0_1.3 Tue Jun 6 14:38:31 PDT 2017" |
sed -nE 's/.*(v.*\s).*/\1/p'
and I get the following result: v2010.0_1.3 Tue Jun 6 14:38:31 PDT. It looks like sed is not stopping at the first occurrence of the space, but at the last one. How can I capture only up to the first occurrence?
Using sed
sed's regular expressions are "greedy" (more precisely, they are leftmost-longest matches). You need to work around that. For example:
$ s="Text1 Text2 v2010.0_1.3 Tue Jun 6 14:38:31 PDT 2017"
$ echo "$s" | sed -nE 's/.*(v[^[:blank:]]*).*/\1/p'
v2010.0_1.3
Notes:
The expression (v[^[:blank:]]*) will capture as a group any string of non-blanks that begins with v.
\s is non-portable (GNU only). [[:blank:]] will work reliably to match spaces and tabs in a Unicode-safe way.
Using awk
$ echo "$s" | awk '/^v/' RS=' '
v2010.0_1.3
RS=' ' tells awk to treat a space as a record separator. /^v/ will print any record that begins with v.
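Using bash only
If the string is already in a shell variable, the same "v followed by non-blanks" idea also works with bash's =~ operator. This is a sketch: the variable names re and version are made up, and it assumes the token you want is the first word that starts with v.

```shell
#!/usr/bin/env bash
s="Text1 Text2 v2010.0_1.3 Tue Jun 6 14:38:31 PDT 2017"

# Group 1 anchors the match at the start of a word (line start or a blank);
# group 2 is the word itself: "v" followed by any run of non-blanks.
re='(^|[[:blank:]])(v[^[:blank:]]*)'
if [[ $s =~ $re ]]; then
    version=${BASH_REMATCH[2]}
    echo "$version"
fi
```

The match cannot run past the first blank because the bracket expression excludes blanks, so greediness is no longer a problem.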

AWK adding if statement to add zero to number range 0 to 9 ( NEED TO USE AWK)

Hi, I need to format the date command's output using awk and add a zero before days 1 to 9.
today=`date | awk {'print $1 " " $2 " " $3'}`
So in the above the output is
Wed Mar 2
I need to add a 0 before the 2, and likewise for days 1 through 9 of the month:
Wed Mar 02
How can I do the equivalent of this command using awk?
for i in 0{1..9}; do echo $i; done
So I need to prepend a zero to $3 when it is between 1 and 9.
I tried doing it this way, but something is not working and I get an error:
a3=`date|awk '{
if ($3 <=9)
print $1" "$2" " "0"$3;
else
print $1" "$2" " $3;
}'`
echo $a3
Can you please assist?
Regards
If I were you I'd just specify a format directly:
$ date '+%a %b %d'
Wed Mar 02
date takes a format string preceded by a + as its final argument.
If you must do it in awk, you can use printf for formatted printing:
$ echo 1 2 10 20 | awk -v RS=" " '{printf "%s\t-> %02d\n",$1,$1}'
1 -> 01
2 -> 02
10 -> 10
20 -> 20
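Applied to the date output from the question, the same %02d conversion could look like the sketch below. The sample line is a stand-in for what date prints (the exact layout depends on your locale), and the variable names are made up.

```shell
#!/usr/bin/env bash
# Stand-in for the output of: date
sample='Wed Mar 2 10:15:00 UTC 2022'

# %02d pads the day to two digits; the first two fields are printed as-is.
today=$(echo "$sample" | awk '{printf "%s %s %02d\n", $1, $2, $3}')
echo "$today"
```

In a real script you would pipe date itself into the awk command instead of echoing the sample line.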

Halt sed after first replacement

I'm writing a script and I need to search for months in a line, eg 01, 02..., 12 and replace it with its abbreviation. However, though I need to search for all months, I only need to replace the first instance of a month number that I find. For instance, if we have a line that looks like this:
05 06 07
I need sed to perform the following:
May 06 07
The current command I'm using produces:
May Jun Jul
Which is not desirable. Here's what I'm using:
date=$(echo $line |cut -d , -f 1 | sed 's/-/ /g;s/:00//;s/:/ /g;s/01/Jan/;s/02/Feb/;s/03/Mar/;s/04/April/;s/05/May/;s/06/Jun/;s/07/Jul/;s/08/Aug/;s/09/Sep/;s/10/Oct/;s/11/Nov/;s/12/Dec/')
Thanks for the help in advance.
With GNU date:
echo "05 06 07" | while read -r mon rest
do
mon=$(date -d "2014-$mon-01" +%b)
echo $mon $rest
done
Try something like
> echo $line
foo foo05 06 07 foo 02
> [[ $line =~ [0-9]{2}' '[0-9]{2}' '[0-9]{2} ]] && date -d $(echo "${BASH_REMATCH[0]}" | tr ' ' '/') '+%b %d %y'
May 06 07
This is awkward, but it seems to work:
... sed 's/-/ /g;s/:00//;s/:/ /g
s/01/Jan/
t
s/02/Feb/
t
s/03/Mar/
t
s/04/April/
t
s/05/May/
t
s/06/Jun/
t
s/07/Jul/
t
s/08/Aug/
t
s/09/Sep/
t
s/10/Oct/
t
s/11/Nov/
t
s/12/Dec/'
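One caveat with the t approach: t branches if any substitution has succeeded since the line was read, and that includes the cleanup substitutions at the top. If one of those matched, the first t would jump to the end before a month is replaced. A sketch of a workaround is to reset the flag with a branch to a label right after the cleanup. The function name month_once and the label cleaned are made up, and the sample input is a guess at the asker's data.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: clean up the line, then replace only one month number.
month_once() {
    sed 's/-/ /g;s/:00//;s/:/ /g
t cleaned
:cleaned
s/01/Jan/
t
s/02/Feb/
t
s/03/Mar/
t
s/04/Apr/
t
s/05/May/
t
s/06/Jun/
t
s/07/Jul/
t
s/08/Aug/
t
s/09/Sep/
t
s/10/Oct/
t
s/11/Nov/
t
s/12/Dec/'
}

result=$(echo '05-06 07:00' | month_once)
echo "$result"
```

Taking the "t cleaned" branch clears the flag, so every later t only reacts to a month substitution. Note that this still replaces the lowest-numbered month present in the line, not necessarily the leftmost one, because the substitutions are tried in calendar order.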

Resources