Bash - Extracting just the date and time from a string variable with other surrounding text being excluded

I'm new to sed and have been trying to use it with no luck yet in this case.
I'm reading through a log file and I store the prior line into a variable so that I can extract out the date if needed.
variable string example:
jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:
I'm only wanting the date and timestamp in the square brackets. I want to ignore the characters before and after. The date and time format are always the same length and format. I can match on it with a regex, just not sure how to extract it from a variable into a new variable with only the data inside the square brackets.
I tried something like this:
priordate= echo "$prior" | awk -F'[][]' '{print $2}'
But that didn't work.

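The awk command itself is fine; the problem is the assignment. With a space after the =, priordate is set to an empty string only for the duration of the echo command, and the output is never captured. Running the pipeline on its own gives the expected result: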
echo "jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:" | awk -F'[][]' '{print $2}'
05/06/20 16:42:52.964
To store that output in the variable, use command substitution with no space after the =: priordate=$(echo "$prior" | awk -F'[][]' '{print $2}')

You can use Bash's native regular expression matching. This is a quick and dirty regular expression that just relies on capturing whatever is between [ and ]. You can certainly make it more specific if necessary.
#!/bin/bash
s="jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:"
pattern="\[(.*)\]"
if [[ "${s}" =~ $pattern ]]
then
date_time="${BASH_REMATCH[1]}"
fi
echo "${date_time}"
Output:
05/06/20 16:42:52.964
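Since the question says the timestamp always has the same layout, the pattern can be tightened so that unrelated bracketed text does not match. This is only a sketch: it assumes the layout is exactly NN/NN/NN NN:NN:NN.NNN as in the example, and the function name extract_timestamp is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print the bracketed timestamp found in a log line.
# Assumes the layout NN/NN/NN NN:NN:NN.NNN shown in the question.
extract_timestamp() {
    local line=$1
    local d='[0-9]{2}'
    local pattern="\[($d/$d/$d $d:$d:$d\.[0-9]{3})\]"
    if [[ $line =~ $pattern ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1   # no timestamp in the expected layout
    fi
}

prior="jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:"
priordate=$(extract_timestamp "$prior")
echo "$priordate"
```

The non-zero return status lets the caller distinguish "no timestamp on this line" from an empty result.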

Related

Bash: How to extract numbers preceded by _ and followed by

I have the following format for filenames: filename_1234.svg
How can I retrieve the number that is preceded by an underscore and followed by a dot? There can be between one and four digits before the .svg
I have tried:
width=${fileName//[^0-9]/}
but if the fileName contains a number as well, it will return all numbers in the filename, e.g.
file6name_1234.svg
I found solutions for two underscores (and splitting it into an array), but I am looking for a way to check for the underscore as well as the dot.
You can use simple parameter expansion with substring removal to simply trim from the right up to, and including, the '.', then trim from the left up to, and including, the '_', leaving the number you desire, e.g.
$ width=filename_1234.svg; val="${width%.*}"; val="${val##*_}"; echo $val
1234
note: # trims from left to first-occurrence while ## trims to last-occurrence. % and %% work the same way from the right.
Explained:
width=filename_1234.svg - width holds your filename
val="${width%.*}" - val holds filename_1234
val="${val##*_}" - finally val holds 1234
Of course, there is no need to use a temporary value like val if your intent is that width should hold the width. I just used a temp to protect against changing the original contents of width. If you want the resulting number in width, just replace val with width everywhere above and operate directly on width.
note 2: using shell capabilities like parameter expansion prevents creating a separate subshell and spawning a separate process that occurs when using a utility like sed, grep or awk (or anything that isn't part of the shell for that matter).
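The two trimming steps above can be wrapped in a small function so the original variable is left untouched. A minimal sketch; the name number_from_name is hypothetical and not part of the answer.

```bash
# Hypothetical helper: print the part between the last '_' and the last '.'.
number_from_name() {
    local name=$1
    name=${name%.*}     # drop the extension, e.g. file6name_1234
    name=${name##*_}    # drop everything up to the last '_', e.g. 1234
    printf '%s\n' "$name"
}

number_from_name "filename_1234.svg"
number_from_name "file6name_1234.svg"
number_from_name "filename_6_1234.svg"
```

Because ## removes the longest match, digits or underscores earlier in the name do not affect the result.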
Try the following code :
filename="filename_6_1234.svg"
if [[ "$filename" =~ ^(.*)_([^.]*)\..*$ ]];
then
echo "${BASH_REMATCH[0]}" #will display 'filename_6_1234.svg'
echo "${BASH_REMATCH[1]}" #will display 'filename_6'
echo "${BASH_REMATCH[2]}" #will display '1234'
fi
Explanation :
=~ : bash operator for regex comparison
^(.*)_([^.]*)\..*$ : we look for any characters, followed by an underscore, followed by any characters other than a dot, followed by a dot and an extension. We create 2 capture groups, one for what comes before the last underscore and one for what comes after it
BASH_REMATCH : array containing the captured groups
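If the number should be validated as well as extracted, the regex can encode the constraints from the question directly. A sketch under the assumption that the name always ends in .svg and the number has one to four digits; width_from_name is a made-up name.

```bash
#!/bin/bash
# Hypothetical helper: print the 1-4 digits between an underscore and ".svg"
# at the end of the name; return non-zero if the name does not fit.
width_from_name() {
    local pattern='_([0-9]{1,4})\.svg$'
    [[ $1 =~ $pattern ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

width_from_name "file6name_1234.svg"
width_from_name "filename_6_12.svg"
```

Keeping the pattern in a variable avoids quoting surprises inside [[ ]].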
Some more ways:
[akshay@localhost tmp]$ filename=file1b2aname_1234.svg
[akshay@localhost tmp]$ after=${filename##*_}
[akshay@localhost tmp]$ echo ${after//[^0-9]}
1234
Using awk:
[akshay@localhost tmp]$ awk -F'[_.]' '{print $2}' <<< "$filename"
1234
I would use
sed 's!_! !g' | awk '{print "_" $NF}'
to get from filename_1234.svg to _1234.svg then
sed 's!\.svg$!!'
to get rid of the extension.
If you set IFS, you can use Bash's built-in read.
This splits the filename by underscores and dots and stores the result in the array a.
IFS='_.' read -a a <<<'file1b2aname_1234.svg'
And this takes the second last element from the array.
echo ${a[-2]}
There's a solution using cut:
name="file6name_1234.svg"
num=$(echo "$name" | cut -d '_' -f 2 | cut -d '.' -f 1)
echo "$num"
-d is for specifying a delimiter.
-f refers to the desired field.
I don't know anything about performance but it's simple to understand and simple to maintain.

how to print last part of a string in shell script

I have this line:
102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35
Now I want to just print the value -5.35 from this line and nothing else.
I also want this command to be able to extract the -7.04 from this line and nothing else.
102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=280.500000,lat=11.000000,val=-7.04
I have read the other Stack Overflow questions and they did not seem to quite get at what I was looking for. I noticed that they used awk or sed. What should I do to extract just the part of the above lines after val=?
There's no need for awk, sed, or any other external tool: bash has its own built-in regular expression support, via the =~ operator to [[ ]], and the BASH_REMATCH array (populated with matched contents).
val_re='[, ]val=([^ ]+)'
line='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
[[ $line =~ $val_re ]] && echo "${BASH_REMATCH[1]}"
That said, if you really want to remove everything up to and including the string val= (and thus to have your code break if other values were added to the format in the future), you could also do so like this:
val=${line##*val=} # assign everything from $line after the last instance of "val=" to val
The syntax here is parameter expansion. See also BashFAQ #100: How do I do string manipulations in bash?
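Applied to both sample lines from the question, the regex approach might look like the sketch below. The function name get_val is hypothetical; the pattern is the one from the answer.

```bash
#!/bin/bash
# Hypothetical helper: print the value of the val= field from one record.
get_val() {
    local val_re='[, ]val=([^ ]+)'
    [[ $1 =~ $val_re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

lines=(
    '102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
    '102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=280.500000,lat=11.000000,val=-7.04'
)
for line in "${lines[@]}"; do
    get_val "$line"
done
```

A line without a val= field makes get_val return a non-zero status and print nothing.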
You can use awk with field separator as = and print last field:
awk -F'=' '{print $NF}' <<< "$str"
-5.35
This uses a lookbehind to print whatever follows val= on the line:
str='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
echo "$str" | grep -Po '(?<=val=).*'
This answer works with GNU grep only, since it relies on the -P option.

Split a string and pick the uppercase substring

Consider the following example variables in bash:
PET="cat/DOG/hamster"
FOOD="soup/soup/PIZZA"
SUBJECT="MATH/physics/biology"
How can I split any of those strings by a slash, extract the part that's all uppercase and store it in a variable? For example, how would I take DOG out of the $PET variable and store it in an $OPTION variable?
I need a portable solution that works under bash and zsh specifically.
You could use tr to remove all characters that are not uppercase:
OPTION=$(tr -dc '[:upper:]' <<< "$PET")
Note that here-strings (<<< "$VARIABLE") are not POSIX, although bash and zsh both support them. In other shells you'll have to echo the variable into tr:
OPTION=$(echo "$PET" | tr -dc '[:upper:]')
It sounds like only one portion of the string is in uppercase, so you can ignore the splitting part of the question. This should work in both zsh and bash (although it is not portable in the sense of POSIX compatibility):
$ echo "${PET//[^A-Z]}"
DOG
You can try something like this -
OPTION=$(gawk -F'/' '{for (i=1;i<=NF;i++) if ($i ~ /\<[A-Z]+\>/) print $i}' <<< $PET)
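If you would like a pure bash solution, you can use the following piece of code:
#!/bin/bash
PET="cat/DOG/hamster"
IFSBK=$IFS   # save the current field separator
IFS='/'      # split on slashes
for word in $PET; do
# anchor the pattern so only an all-uppercase part matches
if [[ $word =~ ^[A-Z]+$ ]]; then
OPTION="$word"
fi
done
IFS=$IFSBK   # restore the field separator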
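The split-and-test idea can also be written without changing IFS globally. The sketch below is bash-only (it relies on read -a and [[ =~ ]], so it does not cover the zsh requirement), and the function name upper_part is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print the first '/'-separated part that consists
# only of uppercase letters; return non-zero if there is none.
upper_part() {
    local -a parts
    local part
    # IFS is changed only for this one read command
    IFS='/' read -r -a parts <<< "$1"
    for part in "${parts[@]}"; do
        if [[ $part =~ ^[[:upper:]]+$ ]]; then
            printf '%s\n' "$part"
            return 0
        fi
    done
    return 1
}

PET="cat/DOG/hamster"
OPTION=$(upper_part "$PET")
echo "$OPTION"
```

Using [[:upper:]] instead of A-Z avoids locale-dependent range behavior.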

Rename a variable in a for loop

Let's say I have a nested for loop:
for i in $test
do
name=something
for j in $test2
do
name2=something
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
if [ name == name2 ]
then
qsub scrip.sh $i $j $jj
fi
done
done
Now the problem occurs when I try to turn the variable $j into the variable $jj. I only get empty values back when submitting the script within the if statement. Is there another way to rename variables so that I can pass them through to the if part of the code?
PS. I tried 3 for loops but this makes the script awfully slow.
Your problem is piping the assignment into sed. Try something like
jj=$(echo "$j" | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g)
This uses command substitution to assign jj.
This is not correct:
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
In order to assign the output of a command to a variable you need to use command substitution like this:
jj=$(sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g <<< "$j")
You may not even have to use sed because bash has in-built string replacement. For example, the following will replace foo with bar in the j variable and assign it to jj:
jj=${j//foo/bar}
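For the paths in the question, the same expansion might look like the sketch below. The value of j is a made-up sample, and the variables old and new are introduced only so the slashes need no escaping.

```bash
#!/bin/bash
j='data/tRap/tRapTrain/run1.txt'   # made-up sample value for illustration
old='tRap/tRapTrain'
new='BEEML/BEEMLTrain'

# Quoting "$old" makes the pattern match literally; // replaces every occurrence.
jj=${j//"$old"/$new}
echo "$jj"
```

No external process is started, which also helps with the speed concern mentioned in the question.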
There is also a problem with your if-statement. It should be:
if [ "$name" == "$name2" ]
A tiny little thing:
sed treats the first character after the s command as the delimiter.
Knowing this, you can translate your expression:
sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
into:
sed s%'tRap/tRapTrain'%'BEEML/BEEMLTrain'%g
So you don't have to worry about escaping your slashes when substituting paths. I normally use '%', but feel free to use any other character. I think the optimal approach would be a non-printable character:
SEP=$'\001' ; sed s${SEP}'tRap/tRapTrain'${SEP}'BEEML/BEEMLTrain'${SEP}g

Applying bash string operators on a constant string

I'm trying to use bash string operators on a constant string. For instance, you can do the following on variable $foo:
$ foo=a:b:c; echo ${foo##*:}
c
Now, if the "a:b:c" string is constant, I would like to have a more concise solution like:
echo ${"a:b:c"##*:}
However, this is not valid bash syntax. Is there any way to perform this?
[The reason I need to do this (rather than hardcoding the result of the substitution, i.e. "c" here) is because I have a command template where a "%h" placeholder is replaced by something before running the command; the result of the substitution is seen as a constant by bash.]
That's not possible using parameter expansion.
You could use other commands for this, like sed, awk or expr, but I don't see the need.
You could just do:
tmp=%h
echo ${tmp##*:}
Or if speed is not an issue, and you don't want to clutter the current environment with unneeded variables:
(tmp=%h; echo ${tmp##*:})
Anyway, you'd be better off using the command template to do the string manipulation or using something simple like cut:
# get third field delimited by :
$ cut -d: -f3<<<'a:b:c'
c
Or more sophisticated like awk or sed:
#get last field separated by ':'
$ awk -F: '{print $NF}'<<<'a:b:c'
c
$ sed 's/.*:\([^:]*\)/\1/'<<<'a:b:c'
c
Depends on what you need.
You could use expr to get a similar result (the match keyword is not specified by POSIX; the portable form is expr "a:b:c" : '.*:\(.*\)'):
$ expr match "a:b:c" '.*:\(.*\)'
c
You may be able to use Bash regex matching:
pattern='.*:([^:]+)$'
[[ "a:b:c" =~ $pattern ]]
echo "${BASH_REMATCH[1]}"
But why can't you do your template substitution into a variable assignment, then use the variable in the parameter expansion?
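A small function is another way to avoid a named temporary, because positional parameters accept the same expansions as ordinary variables. A minimal sketch; last_field is a hypothetical name, and "a:b:c" stands in for whatever the %h placeholder expands to.

```bash
# Hypothetical helper: print everything after the last ':' in its argument.
last_field() {
    printf '%s\n' "${1##*:}"
}

last_field "a:b:c"
```

In the command template this would be written as last_field "%h", so the substituted constant is passed as an argument rather than assigned to a variable.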
