extracting a text with awk


I want to grep a file and extract the third part of this line
#define SIM_VERSION_COMPAT 1302
with awk. So I wrote:
grep "#define SIM_VERSION_COMPAT" global.h | awk '{ print $$3 }'
The result should be 1302 but I get nothing (blank).

There is no need for grep and a pipe; awk can do the matching itself:
awk '/#define SIM_VERSION_COMPAT/{print $3}' global.h

[spatel#tc01 ~]$ echo "#define SIM_VERSION_COMPAT 1302" | awk '{ print $3 }'
1302
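
For what it's worth, the blank output in the question comes from the doubled dollar sign. awk parses $$3 as $($3), so it looks up field number 1302, which does not exist and is therefore empty. (A doubled $$ is how a literal $ is written inside a Makefile recipe, so the command may have been copied from one; that is only a guess.) A minimal sketch, with the sample line held in shell variables whose names are just for illustration:

```shell
line='#define SIM_VERSION_COMPAT 1302'

# $$3 means $($3): the field whose number is the value of $3 (here 1302).
# That field does not exist, so awk prints an empty line.
wrong=$(printf '%s\n' "$line" | awk '{ print $$3 }')

# $3 is the third whitespace-separated field.
right=$(printf '%s\n' "$line" | awk '{ print $3 }')

printf 'with $$3: [%s]\nwith $3:  [%s]\n' "$wrong" "$right"
```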

Just using grep:
$ grep -Po '(?<=#define SIM_VERSION_COMPAT )[0-9]+' global.h
1302
This uses a positive lookbehind (-P enables Perl-compatible regular expressions in GNU grep) to match lines containing #define SIM_VERSION_COMPAT, while -o prints only the digit string that follows.

You can also use the cut command:
grep "#define SIM_VERSION_COMPAT" temp.txt | cut -d" " -f 3
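
One caveat with cut: it treats every single space as a delimiter, whereas awk by default splits on runs of spaces and tabs. If the #define happens to be aligned with extra spaces, the two give different results. A small sketch with a made-up, double-spaced variant of the line:

```shell
# Hypothetical variant of the line with two spaces between the tokens.
line='#define  SIM_VERSION_COMPAT  1302'

# cut counts the empty string between two adjacent spaces as a field,
# so field 3 is no longer the number.
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f3)

# awk collapses runs of blanks, so $3 is still the number.
with_awk=$(printf '%s\n' "$line" | awk '{ print $3 }')

echo "cut: $with_cut"
echo "awk: $with_awk"
```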

Related

grep for contents after pattern for word character and comma

echo "this is a test:foo,bar,baz']" | grep -o -E "test:.*" | awk -F: '{ print $2 }'
foo,bar,baz']
I get '] printed at the end. How can I print only the word characters and commas, nothing else? In this case I need to extract only foo,bar,baz.

You can use a single awk for this:
echo "this is a test:foo,bar,baz']" | awk -F 'test:' '{sub(/[^,[:alnum:]].*/, "", $2); print $2}'
foo,bar,baz
Or, you can use a single sed:
echo "this is a test:foo,bar,baz']" | sed 's/.*test://; s/[^,[:alnum:]].*//'
foo,bar,baz

echo "this is a test:foo,bar,baz']"| awk -F: '{sub(/baz../,"baz"); print $2}'
outputs
foo,bar,baz

Using GNU grep with a Perl regex:
$ echo "this is a test:foo,bar,baz']" | grep -oP "(?<=test:)(\w,*)+"
foo,bar,baz
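
If GNU grep is not available, awk's match() with RSTART and RLENGTH gives a similar result in one process. A sketch, assuming the wanted part consists only of letters, digits and commas directly after test: (the variable names s and out are just for illustration):

```shell
s="this is a test:foo,bar,baz']"

# match() sets RSTART and RLENGTH to the position and length of the match.
# The prefix "test:" is 5 characters long, so skip it when taking the substring.
out=$(printf '%s\n' "$s" | awk 'match($0, /test:[A-Za-z0-9,]+/) {
    print substr($0, RSTART + 5, RLENGTH - 5)
}')

echo "$out"
```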

How to extract words between two characters in linux?

I have the following stored in a file named tmp.txt
user/config/jars/content-config-factory-3.2.0.0.jar
I need to store this word to a variable -
$variable=content-config-factory
I have written the following
while read line
do
var=$(echo $line | awk 'BEGIN{FS="\/"; OFS=" "} {print $NF}' )
var=$(echo $var | awk 'BEGIN{FS="-"; OFS=" "} {print $(1)}' )
echo $var
done < tmp.txt
This returns the result "content" instead of "content-config-factory".
Can anyone please tell me how to extract a word between two characters from a string efficiently?

An awk solution would be:
awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}'
Test
$ echo "user/config/jars/content-config-factory-3.2.0.0.jar" | awk -F/ '{sub("-[^-]+$", "", $NF); print $NF}'
content-config-factory

You can also try this to get the expected result:
variable=$(sed 's:.*/\(.*\)-.*:\1:' FileName)
echo "$variable"
Output:
content-config-factory

You could use grep,
grep -oP '(?<=/)[^/]*(?=-\d+\.)' file
Example:
$ var=$(echo 'user/config/jars/content-config-factory-3.2.0.0.jar' | grep -oP '(?<=/)[^/]*(?=-\d+\.)')
$ echo "$var"
content-config-factory
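
Since the question asks for something efficient: inside the while read loop, the shell's own parameter expansion can do both steps without starting awk at all. A minimal sketch, assuming the version is always whatever follows the last hyphen (the variable names path and name are illustrative):

```shell
path='user/config/jars/content-config-factory-3.2.0.0.jar'

name=${path##*/}   # strip the longest prefix ending in "/": content-config-factory-3.2.0.0.jar
name=${name%-*}    # strip the shortest suffix starting with "-": content-config-factory

echo "$name"
```

In the original loop, path would be the line just read.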

Grep entire line after word

What would be the grep command to get everything in the line after a match?
For example, for a file path:
/home/usr/we/This/is/the/file/path
I want the output to be
/we/This/is/the/file/path
matching /we as the regex.

grep -o does what you want.
grep -o '/we.*'

The OP would like to use we as a trigger. Using awk:
awk -F/ '{for (i=1;i<=NF;i++) {if ($i~/we/) f=1;if (f) printf "/%s",$i}print ""}' file
/we/This/is/the/file/path
Using GNU awk (gensub is a gawk extension):
awk '{print gensub(/.*(\/we)/,"\\1","g")}' file
/we/This/is/the/file/path

If the path always (and only) starts with /home/usr:
YourInput | sed 's|/home/usr\(/we.*\)|\1|'
Otherwise:
YourInput | sed -n 's|^.*\(/we.*\)|\1|p'
This prints only the line(s) containing /we, with the text before /we removed.
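
To check the sed -n form on more than one line: the replacement has to be the back-reference \1 (an empty replacement would print blank lines), and the p flag together with -n drops lines that contain no /we. A quick sketch with a second, made-up path that does not match:

```shell
# The second path is invented to show that non-matching lines are dropped.
out=$(printf '%s\n' '/home/usr/we/This/is/the/file/path' '/tmp/no/match/here' |
    sed -n 's|^.*\(/we.*\)|\1|p')

echo "$out"
```

Because ^.* is greedy, a line containing /we more than once is cut at the last occurrence.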

Using awk to modify output

I have a command that is giving me the output:
/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611
I need the output to be:
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
The closest I got was:
$ echo /home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611 | awk '{ printf "%s", $1 }; END { printf "\n" }'
/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08
I'm not familiar with awk, but I believe this is the command I want to use. Does anyone have any ideas?

Or just a sed one-liner:
echo /home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611 \
| sed -E 's/.*:(.*\.xml).*/\1/'

$ echo "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611" |
cut -d: -f2 |
cut -d. -f1-2
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
Note that this relies on the dot . being present as in counted-file.xml.

$ awk -F'[:.]' -v OFS="." '{print $2,$3}' <<< "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611"
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml

Not sure if this works for you:
sed 's/^.*:\(.*\)\.[^.]*$/\1/'
With your example:
kent$ echo "/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611"|sed 's/^.*:\(.*\)\.[^.]*$/\1/'
ea66574ff0daad6d0406f67e4571ee08 counted-file.xml
This grep line works too:
grep -Po ':\K.*(?=\..*?$)'
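
Since the question asked about awk specifically, the same two cuts can be made with sub(). A sketch, assuming the prefix to drop ends at the first colon and the timestamp is whatever follows the last dot (the variable names input and out are illustrative):

```shell
input='/home/konnor/md5sums:ea66574ff0daad6d0406f67e4571ee08 counted-file.xml.20131003-083611'

out=$(printf '%s\n' "$input" | awk '{
    sub(/^[^:]*:/, "")    # drop everything up to and including the first colon
    sub(/\.[^.]*$/, "")   # drop the last dot and whatever follows it
    print
}')

echo "$out"
```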

bash, extract string from text file with space delimiter

I have text files with a line like this in them:
MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\
sig-250-0 is something that can change from file to file (but I always know what it is for each file). There are lines before and after this one, but the string "MC exp. sig-250-0 events" is unique in the file.
For a particular file, is there a good way to extract the second number (3.57 in the above example) using bash?

Use awk for this:
awk '/MC exp. sig-250-0/ {print $10}' your.txt
Note that this prints $3.57, with the leading $. If you don't want that, pipe the output to tr:
awk '/MC exp. sig-250-0/ {print $10}' your.txt | tr -d '$'
In comments you wrote that you need to call it in a script like this:
while read p ; do
echo $p,awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$'
done < grid.txt
Note that you need command substitution, $(...), around the awk pipeline:
echo "$p",$(awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$')
However, $p is not expanded inside the single-quoted awk program. To pass a shell variable to awk, use -v and match against it with ~ (a bare /p/ would match the literal letter p):
awk -v p="MC exp. sig-$p" '$0 ~ p {print $10}' a.txt | tr -d '$'
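
Putting one loop iteration together: the sketch below uses index() instead of a regex match, so that the dots in the key are taken literally. The temporary file and its single line are made up from the question's example, and the variable names are illustrative; in the real script the keys would come from grid.txt.

```shell
# Sample data taken from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\' > "$tmp"

p='250-0'   # one key, as it would be read from grid.txt

# index($0, key) is non-zero when key occurs literally in the line.
value=$(awk -v key="MC exp. sig-$p events" 'index($0, key) { print $10 }' "$tmp" | tr -d '$')

echo "$p,$value"
rm -f "$tmp"
```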

More lines of the input would have been nice, but I guess a simple awk command is what you want:
awk '{print $N}' "$file"
Here N stands for the number of the field you want. If you don't tell awk which field separator to use, it splits on runs of spaces and tabs. Now you just have to count the fields to find the one you want; in your case it is field 10:
awk '{print $10}' file.txt
$3.57
Don't want the $?
Pipe the awk output to cut:
awk '{print $10}' foo | cut -d '$' -f2
-d sets $ as the field separator and -f2 selects the second field.
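
The $ can also be stripped inside awk, which saves the second process. A sketch using the question's line; the variable names line, v and value are illustrative:

```shell
line='MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\'

# Copy field 10 and delete every literal "$" from the copy before printing.
value=$(printf '%s\n' "$line" | awk '{ v = $10; gsub(/[$]/, "", v); print v }')

echo "$value"
```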

If you know you always have the same number of fields, then
#!/bin/bash
file=$1
key=$2
while read -ra f; do
if [[ "${f[0]} ${f[1]} ${f[2]} ${f[3]}" == "MC exp. $key events" ]]; then
echo ${f[9]}
fi
done < "$file"
