extracting text with awk - linux

I want to grep a file and extract the third part of this line
#define SIM_VERSION_COMPAT 1302
with awk. So I wrote:
grep "#define SIM_VERSION_COMPAT" global.h | awk '{ print $$3 }'
The result should be 1302 but I get nothing (blank).

There is no need for grep and a pipe; awk can do the matching itself:
awk '/#define SIM_VERSION_COMPAT/{print $3}' global.h

[spatel#tc01 ~]$ echo "#define SIM_VERSION_COMPAT 1302" | awk '{ print $3 }'
1302
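The blank output in the question comes from the doubled dollar sign. In awk, $$3 means "the field whose number is the value of $3", so here it asks for field 1302, which does not exist and is therefore empty. (A doubled $$ is how a literal $ is written inside a Makefile recipe, so the command may have been copied from one; that is a guess, not something the question states.) A minimal sketch showing the difference:

```shell
line='#define SIM_VERSION_COMPAT 1302'

# $$3 is "the field numbered by the value of $3", i.e. $1302, which is empty here.
wrong=$(printf '%s\n' "$line" | awk '{ print $$3 }')

# $3 is simply the third whitespace-separated field.
right=$(printf '%s\n' "$line" | awk '{ print $3 }')

printf 'wrong=[%s] right=[%s]\n' "$wrong" "$right"
```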

Using only grep (the -P option requires GNU grep):
$ grep -Po '(?<=#define SIM_VERSION_COMPAT )[0-9]+' global.h
1302
This uses positive lookbehind to match lines containing #define SIM_VERSION_COMPAT but only prints the digit string following.
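Where grep -P is not available, a sed substitution can give the same result. This is only a sketch: it assumes the value is a run of digits directly after a single space, and the sample file (including its extra OTHER_SETTING line) is made up for the demonstration.

```shell
# Build a throwaway header file for the demonstration (contents are an assumption).
tmp=$(mktemp)
printf '%s\n' '#define OTHER_SETTING 7' '#define SIM_VERSION_COMPAT 1302' > "$tmp"

# -n suppresses the default output; the p flag prints only lines where the substitution matched.
version=$(sed -n 's/^#define SIM_VERSION_COMPAT \([0-9][0-9]*\).*/\1/p' "$tmp")
echo "$version"

rm -f "$tmp"
```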

You can also use the cut command:
grep "#define SIM_VERSION_COMPAT" temp.txt | cut -d" " -f 3
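One caveat with cut: it treats every single space as a delimiter, so a line with more than one space between words shifts the field numbers. The sketch below uses a made-up variant of the line with three spaces between words to show the effect, along with two ways around it.

```shell
line='#define   SIM_VERSION_COMPAT   1302'   # three spaces between the words

# cut sees empty fields between consecutive spaces, so field 3 is empty here.
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f3)

# Squeezing repeated spaces first restores the expected field numbering.
with_squeeze=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f3)

# awk splits on runs of whitespace by default, so it needs no extra step.
with_awk=$(printf '%s\n' "$line" | awk '{ print $3 }')

printf 'cut=[%s] squeeze=[%s] awk=[%s]\n' "$with_cut" "$with_squeeze" "$with_awk"
```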

Related

grep for contents after pattern for word character and comma

echo "this is a test:foo,bar,baz']" | grep -o -E "test:.*" | awk -F: '{ print $2 }'
foo,bar,baz']
I get '] printed at the end. How can I print only the word characters and commas, and nothing else? In this case I need to extract only foo,bar,baz
You can use a single awk for this:
echo "this is a test:foo,bar,baz']" | awk -F 'test:' '{sub(/[^,[:alnum:]].*/, "", $2); print $2}'
foo,bar,baz
Or, you can use a single sed:
echo "this is a test:foo,bar,baz']" | sed 's/.*test://; s/[^,[:alnum:]].*//'
foo,bar,baz
echo "this is a test:foo,bar,baz']"| awk -F: '{sub(/baz../,"baz"); print $2}'
outputs
foo,bar,baz
Using GNU grep with a Perl-compatible regex:
$ echo "this is a test:foo,bar,baz']" | grep -oP "(?<=test:)(\w,*)+"
foo,bar,baz
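If the string is already in a shell variable, the same two steps (drop everything through test:, then cut at the first character that is neither a comma nor alphanumeric) can be sketched with parameter expansion alone, without starting awk, sed or grep. The variable names here are illustrative.

```shell
s="this is a test:foo,bar,baz']"

rest=${s#*test:}                # remove everything up to and including the first "test:"
clean=${rest%%[!,[:alnum:]]*}   # remove from the first character that is not a comma or alphanumeric

echo "$clean"
```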

How to extract words between two characters in linux?

I have the following stored in a file named tmp.txt
user/config/jars/content-config-factory-3.2.0.0.jar
I need to store this word to a variable -
$variable=content-config-factory
I have written the following
while read line
do
var=$(echo $line | awk 'BEGIN{FS="\/"; OFS=" "} {print $NF}' )
var=$(echo $var | awk 'BEGIN{FS="-"; OFS=" "} {print $(1)}' )
echo $var
done < tmp.txt
This returns the result "content" instead of "content-config-factory".
Can anyone please tell me how to extract a word between two characters from a string efficiently.
An awk solution would be like
awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}' tmp.txt
Test
$ echo "user/config/jars/content-config-factory-3.2.0.0.jar" | awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}'
content-config-factory
You can also try this to get the expected result:
variable=$(sed 's:.*/\(.*\)-.*:\1:' FileName)
echo $variable
Output:
content-config-factory
You could use grep,
grep -oP '(?<=/)[^/]*(?=-\d+\.)' file
Example:
$ var=$(echo 'user/config/jars/content-config-factory-3.2.0.0.jar' | grep -oP '(?<=/)[^/]*(?=-\d+\.)')
$ echo "$var"
content-config-factory
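Since the question asks for an efficient way, it may be worth noting that the shell can do this without any external command. Below is a minimal sketch using parameter expansion; it assumes the version always follows the last hyphen in the file name. Note also that a shell assignment is written variable=value, without a leading $.

```shell
path='user/config/jars/content-config-factory-3.2.0.0.jar'

file=${path##*/}      # strip the longest prefix ending in "/": content-config-factory-3.2.0.0.jar
variable=${file%-*}   # strip the shortest suffix starting with "-": content-config-factory

echo "$variable"
```

Inside a while read -r line loop over tmp.txt, the same two expansions can be applied to "$line" instead of a fixed string.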

Grep entire line after word

What would be the grep command to get everything in the line after a match?
For example on a file path:
/home/usr/we/This/is/the/file/path
and I want the output to be
/we/This/is/the/file/path
Matching the /we as the regex.
grep -o does what you want.
grep -o '/we.*'
The OP wants to use we as the trigger. Using awk:
awk -F/ '{for (i=1;i<=NF;i++) {if ($i~/we/) f=1;if (f) printf "/%s",$i}print ""}' file
/we/This/is/the/file/path
Using gnu awk
awk '{print gensub(/.*(\/we)/,"\\1","g")}' file
/we/This/is/the/file/path
YourInput | sed 's|/home/usr\(/we.*\)|\1|'
This assumes the line always (and only) starts with /home/usr. Otherwise:
YourInput | sed -n 's|^.*\(/we.*\)|\1|p'
This prints only the line(s) containing /we, with the text before the last /we removed.
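As a quick check, the grep -o approach and a sed substitution that keeps the captured group can be run side by side on the sample path. This is a sketch; the variable names are illustrative.

```shell
path='/home/usr/we/This/is/the/file/path'

# grep -o prints only the part of the line that the pattern matched.
from_grep=$(printf '%s\n' "$path" | grep -o '/we.*')

# sed keeps the captured group via \1; -n plus the p flag prints only lines that matched.
from_sed=$(printf '%s\n' "$path" | sed -n 's|^.*\(/we.*\)|\1|p')

printf '%s\n%s\n' "$from_grep" "$from_sed"
```

The two differ when /we occurs more than once in a line: grep -o starts at the first occurrence, while the greedy ^.* in the sed pattern leaves only the text from the last one.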

Using awk to modify output

I have a command that is giving me the output:
/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611
I need the output to be:
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
The closest I got was:
$ echo /home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611 | awk '{ printf "%s", $1 }; END { printf "\n" }'
/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08
I'm not familiar with awk, but I believe this is the command I want to use. Does anyone have any ideas?
Or just a sed oneliner:
echo /home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611 \
| sed -E 's/.*:(.*\.xml).*/\1/'
$ echo "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611" |
cut -d: -f2 |
cut -d. -f1-2
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
Note that this relies on the dot . being present as in counted-file.xml.
$ awk -F'[:.]' -v OFS="." '{print $2,$3}' <<< "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611"
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
not sure if this is ok for you:
sed 's/^.*:\(.*\)\.[^.]*$/\1/'
with your example:
kent$ echo "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611"|sed 's/^.*:\(.*\)\.[^.]*$/\1/'
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
this grep line works too:
grep -Po ':\K.*(?=\..*?$)'
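Since the question asked for awk specifically, here is one more sketch in that direction. It assumes the only colon on the line is the one after the path, and that the part to drop is the last dot-separated component.

```shell
line='/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611'

# -F: makes everything after the colon field 2; sub() then drops the last ".suffix" from it.
result=$(printf '%s\n' "$line" | awk -F: '{ sub(/\.[^.]*$/, "", $2); print $2 }')

echo "$result"
```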

bash, extract string from text file with space delimiter

I have text files with a line like this in them:
MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\
sig-250-0 is something that can change from file to file (but I always know what it is for each file). There are lines before and after this one, but the string "MC exp. sig-250-0 events" is unique in the file.
For a particular file, is there a good way to extract the second number 3.57 in the above example using bash?
use awk for this:
awk '/MC exp. sig-250-0/ {print $10}' your.txt
Note that this prints $3.57, with the leading $. If you don't want that, pipe the output to tr:
awk '/MC exp. sig-250-0/ {print $10}' your.txt | tr -d '$'
In comments you wrote that you need to call it in a script like this:
while read p ; do
echo $p,awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$'
done < grid.txt
Note that you need command substitution, $(...), around the awk pipeline. Like this:
echo "$p",$(awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$')
However, the shell does not expand $p inside the single-quoted awk program. To pass a shell variable into the awk pattern, assign it with -v and match it as a dynamic regex:
awk -v p="MC exp. sig-$p" '$0 ~ p {print $10}' a.txt | tr -d '$'
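Putting the pieces together, here is a minimal, self-contained sketch of the lookup for one key. The second line of the sample file is invented for the demonstration, and the sketch uses index() rather than a regex match so that the dots in "exp." are compared literally; that is a design choice of this example, not something the answer above prescribes.

```shell
# Sample input: the first line is from the question, the second is made up.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\
MC exp. sig-300-0 events & $1.10 \pm 0.20$ & $4.02 \pm 0.30$ \\
EOF

p='250-0'

# -v hands the shell value to awk; index() is a literal substring search,
# so it returns non-zero (true) only on the line containing the key.
value=$(awk -v key="MC exp. sig-$p events" 'index($0, key) {print $10}' "$tmp" | tr -d '$')

echo "$p,$value"

rm -f "$tmp"
```

In the loop from the question, the assignment to p would be replaced by while read -r p; do ... done < grid.txt.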
More sample lines would have been nice, but I guess you want a simple awk solution (replace N with the field number):
awk '{print $N}' "$file"
If you don't tell awk which field separator to use, it splits on runs of whitespace (spaces and tabs). So you just have to count the fields to find the one you want. In your case that is field 10.
awk '{print $10}' file.txt
$3.57
Don't want the $?
Pipe your awk result to cut:
awk '{print $10}' foo | cut -d '$' -f2
-d sets $ as the field separator and -f selects the second field.
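awk's default field splitting is more forgiving than a literal single space: any run of spaces and tabs counts as one separator. The sketch below contrasts it with a separator that really is one single space, written as the bracket expression [ ]; the sample line is made up.

```shell
line=$(printf 'a   b\tc')   # three spaces, then a tab

# Default FS: runs of spaces and tabs are one separator, so there are 3 fields.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# FS set to [ ]: every single space separates fields, and the tab does not.
single_nf=$(printf '%s\n' "$line" | awk -F'[ ]' '{ print NF }')

echo "default=$default_nf single-space=$single_nf"
```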
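If you know you always have the same number of fields, then you can do it in pure bash:
#!/bin/bash
# Usage: script FILE KEY   (for example, KEY=sig-250-0)
file=$1
key=$2
# -r keeps backslashes literal; -a splits each line into the array f
while read -ra f; do
    if [[ "${f[0]} ${f[1]} ${f[2]} ${f[3]}" == "MC exp. $key events" ]]; then
        echo "${f[9]}"
    fi
done < "$file"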
