How to print the last part of a string in a shell script

I have this line:
102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35
Now I want to just print the value -5.35 from this line and nothing else.
I also want this command to be able to extract the -7.04 from this line and nothing else.
102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=280.500000,lat=11.000000,val=-7.04
I have read the other Stack Overflow questions, and they did not seem to quite get at what I was looking for. I noticed that they used awk or sed. What should I do to extract just the part of the above lines after val=?

There's no need for awk, sed, or any other external tool: bash has its own built-in regular expression support, via the =~ operator to [[ ]], and the BASH_REMATCH array (populated with matched contents).
val_re='[, ]val=([^ ]+)'
line='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
[[ $line =~ $val_re ]] && echo "${BASH_REMATCH[1]}"
That said, if you really want to remove everything up to and including the string val= (and thus to have your code break if other values were added to the format in the future), you could also do so like this:
val=${line##*val=} # assign everything from $line after the last instance of "val=" to val
The syntax here is parameter expansion. See also BashFAQ #100: How do I do string manipulations in bash?
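As a rough sketch of how the parameter expansion approach could be made a little less brittle, the function below first checks that val= is present and then trims any trailing ",key=value" fields. The function name extract_val and the third sample string (with an extra field) are hypothetical, added only for illustration.

```shell
# Hypothetical helper: print the value that follows the last "val=".
extract_val() {
  case $1 in
    *val=*) ;;            # the line contains "val=", carry on
    *) return 1 ;;        # no "val=" at all: report failure
  esac
  rest=${1##*val=}        # everything after the last "val="
  printf '%s\n' "${rest%%,*}"   # drop any trailing ",key=value" fields
}

line1='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
line2='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=280.500000,lat=11.000000,val=-7.04'
line3='lon=1.0,val=-7.04,extra=3'   # assumed future format with an added field

extract_val "$line1"
extract_val "$line2"
extract_val "$line3"
```

This uses only POSIX parameter expansion and case, so it should behave the same in bash, zsh, and plain sh.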

You can use awk with field separator as = and print last field:
awk -F'=' '{print $NF}' <<< "$str"
-5.35

This prints everything that follows the first val= on the line:
str='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'
echo "$str" | grep -Po '(?<=val=).*'
This answer works with GNU grep only, because -P (Perl-compatible regular expressions) is a GNU extension.
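Where GNU grep is not available, a sed substitution may serve as a portable alternative. This is a different technique from the grep answer above, shown here only as a sketch: it deletes everything up to and including the last val= and prints the line only if that substitution happened.

```shell
str='102:20620453:d=2017021012:UGRD:10 m above ground:15 hour fcst::lon=79.500000,lat=9.000000,val=-5.35'

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded, so lines without "val=" produce nothing.
printf '%s\n' "$str" | sed -n 's/.*val=//p'
```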

Related

How do I use 'sed' to read a line from txt?

I am trying to use a variable (the day of the year out of 365 using +%j) to search a txt file, using that variable to find the line number.
I can't figure out how to substitute '1p' below to something like.. "$dat + p" or something like that. Nothing I am thinking of or finding online is working, I am thinking I just am not coming up with good enough search terms to figure this thing out.
Here is what I have so far:
#!/usr/bin/bash
dat= date +%j
arg=$(cat /home/adam/dailyverses.txt | sed -n '1p')
echo $arg
First you need to catch the output of the date command this way:
dat="$(date +%j)"
Or this way when you need to comply with an old shell:
dat=`date "+%j"`
I am trying to use a variable (the day of the year out of 365 using +%j) to search a txt file, using that variable to find the line number.
So I think sed is not the right tool. Trying to make a substitution is not the right way.
Searching a pattern and getting the line number which match this pattern: grep would be the perfect fit! For example:
grep -nE "^.*${dat}.*$" /home/adam/dailyverses.txt|cut -d: -f1
You can also do it using bash:
line_number=1
while read -r; do
[[ $REPLY =~ ^.*${dat}.*$ ]] && echo "${line_number}"
(( line_number++ ))
done < "/home/adam/dailyverses.txt"
Or awk:
awk -v dat="$dat" '$0 ~ dat {print NR}' "/home/adam/dailyverses.txt"
Of course you can change the regexp according to your needs.
I can't figure out how to substitute '1p' below to something like.. "$dat + p" or something like that.
That's not really the same thing as just "find the line number", which is what you asked before. But if that's what you really want:
dat="$(date +%j)"
sed -n "${dat}p" /home/adam/dailyverses.txt
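A small sketch of the same idea on a throwaway file, so it can be tried without dailyverses.txt. The helper name print_line and the three-line sample file are hypothetical. Because date +%j pads the day with zeros (for example 002), the helper strips leading zeros first; that step is a precaution and an assumption on my part, not something the answer above requires.

```shell
# Hypothetical stand-in for dailyverses.txt: a three-line temporary file.
verses=$(mktemp)
trap 'rm -f "$verses"' EXIT
printf '%s\n' 'verse one' 'verse two' 'verse three' > "$verses"

# Print line N of a file; N may carry leading zeros, as date +%j output does.
print_line() {
  n=${1#"${1%%[!0]*}"}    # strip leading zeros: 002 -> 2, 100 -> 100
  sed -n "${n}p" "$2"     # double quotes let the shell expand ${n}
}

print_line 002 "$verses"
```

The double quotes around the sed script are what matter here: inside single quotes, ${n} would be passed to sed literally.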

Is it possible to retrieve one string between 2 special characters from text file using bash?

Let's say I have the following text file
test.txt
ABC_01:Testing-ABCDEFG
If I want to retrieve the string after colon, I will be using
awk -F ":" '/ABC_01/{print $NF}' test.txt
which will return Testing-ABCDEFG
But what should I do if I only want to retrieve the string after the colon and before the hyphen?
You are so close. That is where split() comes in, e.g.
awk -F: '/ABC_01/{ split($NF,arr,"-"); print arr[1] }'
Which will output
Testing
The GNU Awk User's Guide - String Manipulation Functions provides the details on split(). Give it a try and let me know if you have any further questions.
Using Bash's built-in extended regex engine:
#!/usr/bin/env bash
while read -r; do
[[ $REPLY =~ :(.*)- ]] || :
echo "${BASH_REMATCH[1]}"
done
Using standard POSIX shell IFS field separators:
#!/usr/bin/env sh
while IFS=':-' read -r _ m _; do
echo "$m"
done
Using (GNU) grep and look-around:
$ grep -oP '(?<=:)[^-]*(?=-)' file
Testing
Explained:
grep GNU grep supports PCRE and look-around
-o Print only the matched (non-empty) parts of a matching line
-P Interpret PATTERNS as Perl-compatible regular expressions
(?<=:) positive look-behind, i.e. preceded by a colon
[^-]* anything but a hyphen
(?=-) positive look-ahead, i.e. followed by a hyphen
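For a single string already held in a variable, plain parameter expansion is another option that needs no external tool. This is a sketch of a different technique from the answers above; the variable names after_colon and between are made up for the example.

```shell
line='ABC_01:Testing-ABCDEFG'

after_colon=${line#*:}        # remove the shortest prefix ending in ":"
between=${after_colon%%-*}    # remove the longest suffix starting with "-"

printf '%s\n' "$between"
```

With the sample line this prints Testing. Note that %% cuts at the first hyphen after the colon, which matches the behaviour of the [^-]* pattern in the grep answer.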

Bash - Extracting just the date and time from a string variable with other surrounding text being excluded

I'm new to sed and have been trying to use it with no luck yet in this case.
I'm reading through a log file and I store the prior line into a variable so that I can extract out the date if needed.
variable string example:
jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:
I'm only wanting the date and timestamp in the square brackets. I want to ignore the characters before and after. The date and time format are always the same length and format. I can match on it with a regex, just not sure how to extract it from a variable into a new variable with only the data inside the square brackets.
I tried something like this:
priordate= echo "$prior" | awk -F'[][]' '{print $2}'
But that didn't work.
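The awk command itself is fine:
echo "jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:" | awk -F'[][]' '{print $2}'
05/06/20 16:42:52.964
The problem is the assignment. With a space after the =, the shell sets priordate to an empty string only for the duration of the echo command. Remove the space and capture the pipeline with command substitution:
priordate=$(echo "$prior" | awk -F'[][]' '{print $2}')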
You can use Bash's native regular expression matching. This is a quick and dirty regular expression that just relies on capturing whatever is between [ and ]. You can certainly make it more specific if necessary.
#!/bin/bash
s="jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:"
pattern="\[(.*)\]"
if [[ "${s}" =~ $pattern ]]
then
date_time="${BASH_REMATCH[1]}"
fi
echo "${date_time}"
Output:
05/06/20 16:42:52.964
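One possible way to make the pattern more specific, as the answer suggests, is to spell out the fixed-width date and time fields. This is only a sketch: the exact layout (two-digit fields, three-digit fraction) is assumed from the single sample string in the question.

```shell
#!/usr/bin/env bash
s="jcl/jclnt.log-[05/06/20 16:42:52.964]:jclnt ST:"

# NN/NN/NN HH:MM:SS.mmm inside literal square brackets (layout assumed
# from the sample line); the parentheses capture the timestamp itself.
pattern='\[([0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3})\]'

date_time=
if [[ $s =~ $pattern ]]; then
  date_time=${BASH_REMATCH[1]}
fi
echo "$date_time"
```

Unlike the greedy \[(.*)\] version, this does not match when the bracketed text is something other than a timestamp, and it is not thrown off by a second ] later in the line.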

Split a string and pick the uppercase substring

Consider the following example variables in bash:
PET="cat/DOG/hamster"
FOOD="soup/soup/PIZZA"
SUBJECT="MATH/physics/biology"
How can I split any of those strings by a slash, extract the part that's all uppercase and store it in a variable? For example, how would I take DOG out of the $PET variable and store it in an $OPTION variable?
I need a portable solution that works under bash and zsh specifically.
You could use tr to remove all characters that are not uppercase:
OPTION=$(tr -dc '[:upper:]' <<< "$PET")
Note that here-strings (<<< "$VARIABLE") are not part of POSIX sh, although bash and zsh both support them. In other shells you'll have to echo the variable into tr:
OPTION=$(echo "$PET" | tr -dc '[:upper:]')
It sounds like only one portion of the string is in uppercase, so you can ignore the splitting portion of the question. This should work in both zsh and bash (although it is not portable in the sense of POSIX compatibility):
$ echo "${PET//[^A-Z]}"
DOG
You can try something like this -
OPTION=$(gawk -F'/' '{for (i=1;i<=NF;i++) if ($i ~ /\<[A-Z]+\>/) print $i}' <<< $PET)
If you prefer a pure bash solution, you can use the following code:
#!/bin/bash
PET="cat/DOG/hamster"
IFSBK=$IFS
IFS='/'
for word in $PET; do
if [[ $word =~ ^[A-Z]+$ ]]; then
OPTION="$word"
fi
done
IFS=$IFSBK
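Since the question asks for something that runs under both bash and zsh, here is a sketch that avoids word splitting on IFS (which zsh does not perform on unquoted variables by default) and relies only on POSIX parameter expansion and case. The function name pick_upper is hypothetical.

```shell
# Hypothetical helper: print the first slash-separated segment that
# consists solely of uppercase letters.
pick_upper() {
  rest=$1
  while [ -n "$rest" ]; do
    part=${rest%%/*}                  # first remaining segment
    case $part in
      *[![:upper:]]*|'') ;;           # empty or has a non-uppercase char
      *) printf '%s\n' "$part"; return 0 ;;
    esac
    case $rest in
      */*) rest=${rest#*/} ;;         # drop the segment just examined
      *) rest= ;;                     # that was the last segment
    esac
  done
  return 1                            # no all-uppercase segment found
}

PET="cat/DOG/hamster"
OPTION=$(pick_upper "$PET")
echo "$OPTION"
```

Unlike the tr approach, this keeps segments separate, so a string such as "cat/Dog/HAMSTER" yields HAMSTER rather than DHAMSTER.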

Rename a variable in a for loop

Let's say I have a nested for loop:
for i in $test
do
name=something
for j in $test2
do
name2=something
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
if [ name == name2 ]
then
qsub scrip.sh $i $j $jj
fi
done
done
Now the problem occurs when I try to rename the variable $j into variable $jj. I only get empty values back when submitting the script within the if statement. Is there another way to rename variables so that I can pass them through to the if part of the code?
PS. I tried 3 for loops but this makes the script awfully slow.
Your problem is piping the assignment into sed. Try something like
jj=$(echo "$j" | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g)
This uses command substitution to assign jj.
This is not correct:
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
In order to assign the output of a command to a variable you need to use command substitution like this:
jj=$(sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g <<< "$j")
You may not even have to use sed because bash has in-built string replacement. For example, the following will replace foo with bar in the j variable and assign it to jj:
jj=${j//foo/bar}
There is also a problem with your if-statement. It should be:
if [ "$name" == "$name2" ]
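Applied to the paths from the question, the built-in replacement might look like the sketch below. The sample value of j is hypothetical, and holding the search and replacement strings in variables (old and new, names chosen for this example) sidesteps the need to escape the slashes.

```shell
#!/usr/bin/env bash
j='/data/tRap/tRapTrain/run1.txt'   # hypothetical sample value for $j

old='tRap/tRapTrain'
new='BEEML/BEEMLTrain'

# Quoting "$old" makes bash treat it as a literal string, not a glob
# pattern; // replaces every occurrence, like the g flag in sed.
jj=${j//"$old"/$new}

echo "$jj"
```

This avoids starting a sed process on every iteration of the inner loop, which may help with the slowness mentioned in the question.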
A tiny little thing:
sed treats the first character after the s command as the delimiter.
Knowing this, you can translate your expression:
sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
into:
sed s%'tRap/tRapTrain'%'BEEML/BEEMLTrain'%g
So you don't have to worry about escaping your slashes when substituting paths. I normally use '%', but feel free to use any other character. I think the optimal approach would be to use a non-printable character:
SEP=$'\001' ; sed s${SEP}'tRap/tRapTrain'${SEP}'BEEML/BEEMLTrain'${SEP}g
