How to extract words between two characters in linux? - linux

I have the following stored in a file named tmp.txt
user/config/jars/content-config-factory-3.2.0.0.jar
I need to store this word to a variable -
$variable=content-config-factory
I have written the following
while read line
do
var=$(echo $line | awk 'BEGIN{FS="\/"; OFS=" "} {print $NF}' )
var=$(echo $var | awk 'BEGIN{FS="-"; OFS=" "} {print $(1)}' )
echo $var
done < tmp.txt
This returns the result "content" instead of "content-config-factory".
Can anyone please tell me how to efficiently extract a word between two characters from a string?

An awk solution would be like
awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}'
Test
$ echo "user/config/jars/content-config-factory-3.2.0.0.jar" | awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}'
content-config-factory

You can also try it this way and get your expected result:
variable=$(sed 's:.*/\(.*\)-.*:\1:' FileName)
echo "$variable"
Output:
content-config-factory

You could use grep,
grep -oP '(?<=/)[^/]*(?=-\d+\.)' file
Example:
$ var=$(echo 'user/config/jars/content-config-factory-3.2.0.0.jar' | grep -oP '(?<=/)[^/]*(?=-\d+\.)')
$ echo "$var"
content-config-factory
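Since the question already loops over the file with read, a pure-shell alternative may be worth a look. This is only a sketch using POSIX parameter expansion rather than awk, sed, or grep; the temporary file stands in for tmp.txt and the name base is made up for illustration.

```shell
# Sample line copied from the question; the temp file stands in for tmp.txt.
tmp=$(mktemp)
printf '%s\n' 'user/config/jars/content-config-factory-3.2.0.0.jar' > "$tmp"

while IFS= read -r line; do
    base=${line##*/}      # drop everything up to and including the last "/"
    variable=${base%-*}   # drop the last "-" and everything after it
    printf '%s\n' "$variable"
done < "$tmp"
rm -f "$tmp"
```

This assumes the version suffix itself contains no "-"; a name like foo-1.0-beta.jar would need a different rule.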

Related

Using awk to split string with \n followed by ' and ,

If given the string '1234␤',56789, how can I use awk to split by the sequence ␤',? Here ␤ represents a literal newline character.
Right now I have,
echo $LINE | awk -F'\\\\n',' '{ print $1}'
The split doesn't happen with this. Any advice?
Try to print all fields using the value of -F
echo "1234\n',56789," | awk -F "[',]+" -v ORS="" '{$1=$1}1'
line="1234\n',56789,"; echo "$line" | awk -F "[',]+" -v ORS="" '{$1=$1; print $0}'
Output
1234\n 56789
To print a specific field
echo "1234\n',56789," | awk -F "[',]+" -v ORS="" '{$1=$1; print $1}'
line="1234\n',56789,"; echo "$line" | awk -F "[',]+" -v ORS="" '{$1=$1; print $1}'
Output
1234\n
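If the string really contains a newline character (rather than the two characters \ and n), awk's record splitting gets in the way, because each line arrives as a separate record. One possible workaround, sketched here under the assumption that the value can be exported as an environment variable, is to call split() on it in a BEGIN block. The octal escape \047 stands for the single quote.

```shell
# Build the sample string with a real newline: 1234<newline>',56789
LINE=$(printf "1234\n',56789")
export LINE

# Split on the three-character sequence newline, single quote, comma.
first=$(awk 'BEGIN { split(ENVIRON["LINE"], parts, "\n\047,"); print parts[1] }')
second=$(awk 'BEGIN { split(ENVIRON["LINE"], parts, "\n\047,"); print parts[2] }')
printf '%s | %s\n' "$first" "$second"
```

Reading the value through ENVIRON avoids the escape processing that -v applies to its argument.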

grep for contents after pattern for word character and comma

echo "this is a test:foo,bar,baz']" | grep -o -E "test:.*" | awk -F: '{ print $2 }'
foo,bar,baz']
I get '] printed at the end. How can I print only the word characters and commas, and nothing else? In this case I need to extract only foo,bar,baz
You can use a single awk for this:
echo "this is a test:foo,bar,baz']" | awk -F 'test:' '{sub(/[^,[:alnum:]].*/, "", $2); print $2}'
foo,bar,baz
Or, you can use a single sed:
echo "this is a test:foo,bar,baz']" | sed 's/.*test://; s/[^,[:alnum:]].*//'
foo,bar,baz
echo "this is a test:foo,bar,baz']"| awk -F: '{sub(/baz../,"baz"); print $2}'
outputs
foo,bar,baz
Using GNU grep with a Perl-compatible regex (-P):
$ echo "this is a test:foo,bar,baz']" | grep -oP "(?<=test:)(\w,*)+"
foo,bar,baz

can not use unix $variable in Fixed search of awk command

I cannot use a Unix $variable in the fixed search of an awk command.
Please see below my commands.
a="NEW_TABLES NEW_INSERT"
b="NEW"
echo $a | awk -v myvar=$b -F'$0~myvar' '{print $2}'
This returns no output.
But if I enter the value of $b manually, it works, as below:
echo $a | awk -v -F'NEW' '{print $2}'
outputs:
TABLES NEW_INSERT
This should make it:
$ a="NEW_TABLES NEW_INSERT"
$ echo $a | awk -F"NEW_" '{print $2}'
TABLES
$ b="NEW_"
$ echo $a | awk -F"$b" '{print $2}'
TABLES
Your quoting is off. -F takes the separator itself, not an expression like $0~myvar. You can also use your variable to split the line with the split function:
a="NEW_TABLES NEW_INSERT"
b="NEW"
echo $a | awk -v myvar="$b" '{split($0,ary,myvar);print ary[2]}'
Outputs:
_TABLES
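Another way to hand the shell variable to awk as the separator is to assign FS with -v. A minimal sketch, reusing the values from the answers above:

```shell
a="NEW_TABLES NEW_INSERT"
b="NEW_"

# -v FS=... sets the field separator from the shell variable before input is read.
result=$(printf '%s\n' "$a" | awk -v FS="$b" '{ print $2 }')
printf '[%s]\n' "$result"
```

The brackets make it visible that the second field still carries the trailing space that preceded the next NEW_. Note that a separator longer than one character is treated as a regular expression, so a value containing characters such as . or * would need escaping.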

bash, extract string from text file with space delimiter

I have text files with a line like this in them:
MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\
sig-250-0 is something that can change from file to file (but I always know what it is for each file). There are lines before and after this one, but the string "MC exp. sig-250-0 events" is unique in the file.
For a particular file, is there a good way to extract the second number 3.57 in the above example using bash?
use awk for this:
awk '/MC exp. sig-250-0/ {print $10}' your.txt
Note that this prints $3.57, with the leading $. If you don't want that, pipe the output to tr:
awk '/MC exp. sig-250-0/ {print $10}' your.txt | tr -d '$'
In comments you wrote that you need to call it in a script like this:
while read p ; do
echo $p,awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$'
done < grid.txt
Note that you need command substitution, $(...), around the awk pipeline. Like this:
echo "$p",$(awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$')
The $p inside the single-quoted awk program is not expanded by the shell, though. To pass a shell variable to the awk pattern, assign it with -v and match it dynamically:
awk -v p="MC exp. sig-$p" '$0 ~ p {print $10}' a.txt | tr -d '$'
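Putting the pieces together, the loop from the comments might look like the sketch below. The file names and sample contents are stand-ins created just for this example, and it uses index() for a literal substring match instead of a regex, so the "." in "exp." is not treated as a wildcard.

```shell
dir=$(mktemp -d)
# Hypothetical stand-ins for the question's table file and grid.txt.
printf '%s\n' 'MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\' > "$dir/table.tex"
printf '%s\n' '250-0' > "$dir/grid.txt"

while IFS= read -r p; do
    # index() returns a non-zero position when pat occurs in the line.
    value=$(awk -v pat="MC exp. sig-$p events" 'index($0, pat) { print $10 }' "$dir/table.tex" | tr -d '$')
    printf '%s,%s\n' "$p" "$value"
done < "$dir/grid.txt"
rm -rf "$dir"
```

Each line of the grid file yields one "key,value" pair, with the value taken from the tenth whitespace-separated field of the matching line.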
More sample lines would have been nice, but I guess a simple awk call is what you are after.
awk '{print $N}' "$file"
If you don't tell awk which field separator to use, it splits on runs of whitespace (spaces and tabs). Now you just have to count the fields to find the number of the one you want. In your case that is 10.
awk '{print $10}' file.txt
$3.57
Don't want the $?
Pipe your awk result to cut:
awk '{print $10}' foo | cut -d '$' -f2
-d uses the $ as the field separator and -f selects the second field.
If you know you always have the same number of fields, then
#!/bin/bash
file=$1
key=$2
while read -ra f; do
if [[ "${f[0]} ${f[1]} ${f[2]} ${f[3]}" == "MC exp. $key events" ]]; then
echo ${f[9]}
fi
done < "$file"

extracting a text with awk

I want to grep a file and extract the third part of this line
#define SIM_VERSION_COMPAT 1302
with awk. So I wrote:
grep "#define SIM_VERSION_COMPAT" global.h | awk '{ print $$3 }'
The result should be 1302 but I get nothing (blank).
No need to use grep and pipe you can use awk like this:
awk '/#define SIM_VERSION_COMPAT/{print $3}' global.h
$ echo "#define SIM_VERSION_COMPAT 1302" | awk '{ print $3 }'
1302
Just using grep:
$ grep -Po '(?<=#define SIM_VERSION_COMPAT )[0-9]+' global.h
1302
This uses positive lookbehind to match lines containing #define SIM_VERSION_COMPAT but only prints the digit string following.
You can also use the cut command:
grep "#define SIM_VERSION_COMPAT" temp.txt | cut -d" " -f 3
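As for why the original command printed a blank line: in awk, $$3 is parsed as $($3), that is, the field whose number is the value of field 3. Here that means field 1302, which does not exist and so is empty. A small sketch comparing the two forms on the line from the question:

```shell
line='#define SIM_VERSION_COMPAT 1302'

# $$3 means $($3): field number 1302, which is empty for this line.
wrong=$(printf '%s\n' "$line" | awk '{ print $$3 }')
# $3 is simply the third whitespace-separated field.
right=$(printf '%s\n' "$line" | awk '{ print $3 }')
printf 'wrong=[%s] right=[%s]\n' "$wrong" "$right"
```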

Resources