using awk on a string - linux

Can I use awk to extract the first column, or any other column, from a string?
I am reading a file into a variable, and I want to run awk on that variable.
How can I do this? Any suggestions?

Print first column*:
<some output producing command> | awk '{print $1}'
Print second column:
<some output producing command> | awk '{print $2}'
etc.
Where <some output producing command> is like cat filename.txt or echo $VAR, etc.
e.g. ls -l | awk '{print $9}' extracts the ninth column, which is like an ... awkward way of ls -1
*Columns are defined by the separating whitespace.
EDIT: If your text is already in a variable, something like:
VAR2=$(echo "$VAR" | awk '{print $9}')
would work, provided you change 9 to the desired column.
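If the file contents are already in a variable, quoting the variable keeps its line breaks intact, so awk still sees one record per line. A minimal sketch, assuming a hypothetical variable VAR that holds two whitespace-separated lines:

```shell
# VAR is a hypothetical variable standing in for file contents read earlier.
VAR='alpha beta gamma
delta epsilon zeta'

# Quoting "$VAR" preserves the line breaks, so awk sees two records.
first_cols=$(printf '%s\n' "$VAR" | awk '{print $1}')

# Pick column 2 of line 2 only.
cell=$(printf '%s\n' "$VAR" | awk 'NR==2 {print $2}')

printf '%s\n' "$first_cols"   # alpha, then delta
printf '%s\n' "$cell"         # epsilon
```

printf is used here instead of echo only because its behavior is more predictable when the text starts with a dash or contains backslashes.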

Related

bash: awk print with in print

I need to grep for a pattern and then print part of the matching line. Currently I am using the command below, which works fine, but I would like to eliminate the multiple pipes and get the same output from a single awk command. Is there a way to do it using awk?
root#Server1 # cat file
Jenny:Mon,Tue,Wed:Morning
David:Thu,Fri,Sat:Evening
root#Server1 # awk '/Jenny/ {print $0}' file | awk -F ":" '{ print $2 }' | awk -F "," '{ print $1 }'
Mon
I want to get this output using a single awk command. Any help?
You can try something like:
awk -F: '/Jenny/ {split($2,a,","); print a[1]}' file
Try this
awk -F'[:,]+' '/Jenny/{print $2}' file.txt
It passes multiple separator characters to -F by listing them inside the [ ].
The + means one or more, since the -F value is treated as a regex.
For this particular job, I find grep to be slightly more robust.
Unless your company has a policy not to hire people named Eve.
(Try it out if you don't understand.)
grep -oP '^[^:]*Jenny[^:]*:\K[^,:]+' file
Or to do a whole-word match:
grep -oP '^[^:]*\bJenny\b[^:]*:\K[^,:]+' file
Or when you are confident that "Jenny" is the full name:
grep -oP '^Jenny:\K[^,:]+' file
Output:
Mon
Explanation:
The stuff up until \K speaks for itself: it selects the line(s) with the desired name.
[^,:]+ captures the day of week (in this case Mon).
\K cuts off everything preceding Mon.
-o cuts off anything following Mon.
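The substring pitfall hinted at above (a name like "Eve" also matches the "Evening" in the third field) can also be avoided in awk by comparing the whole first field instead of searching the line. A small sketch using the sample data from the question, written to a temporary file for the demonstration:

```shell
# Sample data copied from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Jenny:Mon,Tue,Wed:Morning' 'David:Thu,Fri,Sat:Evening' > "$tmp"

# $1 == "Jenny" compares the whole first field, so a name such as
# "Eve" cannot accidentally match the "Evening" in field 3.
day=$(awk -F: '$1 == "Jenny" { split($2, days, ","); print days[1] }' "$tmp")

# A substring search for "Eve" on the whole line matches David's row.
loose=$(awk -F'[:,]+' '/Eve/ { print $2 }' "$tmp")

echo "$day"     # Mon
echo "$loose"   # Thu
rm -f "$tmp"
```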

How Can I Perform Awk Commands Only On Certain Fields

I have CSV columns that I'm working with:
info,example-string,super-example-string,otherinfo
I would like to get:
example-string super example string
Right now, I'm running the following command:
awk -F ',' '{print $3}' | sed "s/-//g"
But, then I have to paste the lines together to combine $2 and $3.
Is there any way to do something like this?
awk -F ',' '{print $2" "$3}' | sed "s/-//g"
Except, where the sed command is only performed on $3 and $2 stays in place? I'm just concerned later on if the lines don't match up, the data could be misaligned.
Please note: I need to keep the pipe for the SED command. I just used a simple example but I end up running a lot of commands after that as well.
Try:
$ awk -F, '{gsub(/-/," ",$3); print $2,$3}' file
example-string super example string
How it works
-F,
This tells awk to use a comma as the field separator.
gsub(/-/," ",$3)
This replaces all - in field 3 with spaces.
print $2,$3
This prints fields 2 and 3.
Examples using pipelines
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}'
example-string super example string
In a pipeline with sed:
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}' | sed 's/string/String/g'
example-String super example String
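Because gsub is given $3 as its target, field 2 keeps its hyphens, and each row is handled in a single pass, so the two columns cannot drift out of alignment. A short sketch with two rows; the second row and the | output separator are assumptions added only to make the field boundary visible:

```shell
# Two sample rows; the second is hypothetical, added to show row alignment.
out=$(printf '%s\n' \
  'info,example-string,super-example-string,otherinfo' \
  'info2,other-key,some-long-value,more' |
  awk -F, -v OFS='|' '{ gsub(/-/, " ", $3); print $2, $3 }')

printf '%s\n' "$out"
# example-string|super example string
# other-key|some long value
```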
The best solution would be a single sed or a single awk, but since you asked for a combined awk and sed solution, here is one. It assumes your actual data has the same shape as the sample Input_file.
awk -F, '{print $2,$3}' Input_file | sed 's/\([^ ]*\) \([^-]*\)-\([^-]*\)-\([^-]*\)/\1 \2 \3 \4/'
Output will be as follows.
example-string super example string

grep command not working as my expectation

I have a text file like mentioned below, and along with that I will pass an input for which I want a corresponding output.
Input file: test.txt
abc:abc_1
abcd:abcd_1
1_abcd:1_abcd_bkp
xyz:xyz_2
so if I use abc with the above test.txt file, I want abc_1; and if I pass abcd, I need abcd_1 as output.
I tried cat test.txt | grep abc | cut -d":" -f2,2, but I am getting the output
abc_1
abcd_1
1_abcd_bkp
when I want only abc_1.
With GNU grep:
grep -Po "^abc:\K.*" file
Output:
abc_1
\K keeps the text matched so far out of the overall regex match.
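The -P option and \K depend on GNU grep built with PCRE support. Where that is not available, a similar result can be had with sed by deleting the anchored prefix and printing only the lines where the deletion happened. A sketch using the sample data from the question:

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'abc:abc_1' 'abcd:abcd_1' '1_abcd:1_abcd_bkp' 'xyz:xyz_2' > "$tmp"

# -n suppresses the default output; the p flag prints only lines where
# the substitution succeeded, i.e. lines that start with "abc:".
value=$(sed -n 's/^abc://p' "$tmp")

echo "$value"   # abc_1
rm -f "$tmp"
```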
You want to use a regular expression with the -e switch.
In particular, regular expressions allow you to use caret (^) to express the start of a line.
Since you only care about abc when it's at the start of a line and it's followed by :, you want:
cat test.txt | grep -e "^abc:" | cut -d":" -f2,2
Output:
abc_1
awk to the rescue!
awk -F: -v key="abc" '$1==key{print $2}'
Using : as the delimiter, this looks up the key in field 1 and returns field 2.
Or, by moving the key into the script:
awk -F: '$1=="abc"{print $2}'
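Since the key is passed in from outside, the -v form fits naturally inside a small shell function. A sketch, assuming a hypothetical helper named lookup and the sample data from the question:

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'abc:abc_1' 'abcd:abcd_1' '1_abcd:1_abcd_bkp' 'xyz:xyz_2' > "$tmp"

# lookup is a hypothetical helper: print field 2 of every row whose
# field 1 is exactly equal to the function's first argument.
lookup() {
  awk -F: -v key="$1" '$1 == key { print $2 }' "$tmp"
}

r1=$(lookup abc)      # abc_1
r2=$(lookup abcd)     # abcd_1
r3=$(lookup 1_abcd)   # 1_abcd_bkp
echo "$r1 $r2 $r3"
rm -f "$tmp"
```

The comparison is a string equality test rather than a regex match, which is why abc does not also pick up the abcd rows.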
You can try excluding the longer matches with -v:
cat test.txt | grep abc | grep -vi 'abc[a-z]'
I'm not sure that would work exactly; try something along those lines.
If you do not specify which field to print, the whole matching line (or lines) is printed.
awk -F: '/abc_/{print $2}' file
abc_1
awk -F: 'NR==1,/abc/{print $2}' file
abc_1

cut or awk command to print first field of first row

I am trying to print the first field of the first row of some output. Here is the case: I just need to print SUSE from this output.
# cat /etc/*release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
I tried cat /etc/*release | awk {'print $1}', but that prints the first string of every row:
SUSE
VERSION
PATCHLEVEL
Specify NR if you want to capture output from selected rows:
awk 'NR==1{print $1}' /etc/*release
An alternative (ugly) way of achieving the same would be:
awk '{print $1; exit}'
An efficient way of getting the first string from a specific line, say line 42, in the output would be:
awk 'NR==42{print $1; exit}'
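The exit is what makes this efficient: awk stops reading as soon as the target line has been printed, instead of scanning the rest of the input. A small sketch, where the generated "lineN payload" rows are made-up sample data:

```shell
# Generate 100000 numbered rows, then pull field 1 of row 42 and stop.
first=$(awk 'BEGIN { for (i = 1; i <= 100000; i++) print "line" i, "payload" }' |
        awk 'NR == 42 { print $1; exit }')

echo "$first"   # line42
```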
Specify the line number using the NR built-in variable.
awk 'NR==1{print $1}' /etc/*release
try this:
head -1 /etc/*release | awk '{print $1}'
df -h | head -4 | tail -1 | awk '{ print $2 }'
Change the numbers to tweak it to your liking.
Or use a while loop, but that's probably a bad way to do it.
You could use the head instead of cat:
head -n1 /etc/*release | awk '{print $1}'
sed -n 1p /etc/*release |cut -d " " -f1
if tab delimited:
sed -n 1p /etc/*release |cut -f1
Try
sed 'NUMq;d' /etc/*release | awk {'print $1}'
where NUM is line number
ex. sed '1q;d' /etc/*release | awk {'print $1}'
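In the sed form above, d deletes every line before the target, and q prints the target line and quits. A sketch that picks line 2 of the sample output from the question; the sample variable is an assumption standing in for /etc/*release:

```shell
# sample stands in for the contents of /etc/*release shown in the question.
sample='SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2'

# On line 2, q prints the line and quits; line 1 falls through to d.
line2=$(printf '%s\n' "$sample" | sed '2q;d')
field=$(printf '%s\n' "$sample" | sed '2q;d' | awk '{print $1}')

echo "$line2"   # VERSION = 11
echo "$field"   # VERSION
```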
awk, sed, pipe, that's heavy
set `cat /etc/*release`; echo $1
The most code-golfy way I could think of to print only the first line in awk:
awk '_{exit}--_' # or skip the quotes and make it just
# awk _{exit}--_
# if you're feeling adventurous
On the first pass through the exit rule, "_" is undefined, so the pattern is false and the rule is skipped for row 1. The decrement in the second pattern then sets the same counter to -1, which is "true" in awk's eyes (anything other than the empty string or numeric zero counts as true), and that pattern triggers the default action of printing row 1. Incrementing or decrementing makes no difference here; only the direction and sign change.
Finally, at the start of row 2, the counter satisfies the first pattern, whose action block tells awk to exit immediately, so the result is essentially the same as
awk '{ print; exit }'
in a slightly less verbose manner. For a single-line print, it is not even worth setting FS to skip the field splitting.
Using the same concept to print just the first field of the first row:
awk '_{exit} NF=++_'
awk '_++{exit} NF=_'
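As a sanity check, the two golfed one-liners can be compared against the readable NR==1 form on the sample output from the question. This is only a sketch; the sample variable stands in for /etc/*release:

```shell
# sample stands in for the contents of /etc/*release shown in the question.
sample='SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2'

readable=$(printf '%s\n' "$sample" | awk 'NR == 1 { print $1; exit }')
# Assigning to NF truncates the record to its first field before printing.
golf_a=$(printf '%s\n' "$sample" | awk '_{exit} NF=++_')
golf_b=$(printf '%s\n' "$sample" | awk '_++{exit} NF=_')

echo "$readable" "$golf_a" "$golf_b"   # SUSE SUSE SUSE
```

The readable form is usually the better choice in scripts that someone else will have to maintain.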
awk 'NR==1&&NF=1' file
grep -om1 '^[^ ]\+' file
# multiple files
awk 'FNR==1&&NF=1' file1 file2
You can kill the process that is running the container.
This command lists the processes related to the Docker container:
ps -aux | grep $(docker ps -a | grep container-name | awk '{print $1}')
Now you have the process ids to kill with kill or kill -9.

How to run grep inside awk?

Suppose I have a file input.txt with a few columns and a few rows, where the first column is the key, and a directory dir with files that contain some of these keys. I want to find all lines in the files in dir that contain these keys. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because it thinks the keys are paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either. Eventually I had to admit that even this doesn't work:
cat input.txt | awk '{system("echo $1")}'
After I tried to use \ to escape the white space and the $ sign, I came here to ask for your advice, any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough, because it takes two commands and I want only one. It also shows why xargs doesn't work: the parameter is not the last argument.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
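The two-file idiom can be tried on throwaway data. Note that $0 ~ key treats each key as a regular expression, so a key such as a.c would also match abc; the sketch below swaps in index() for a literal substring test instead. The file names and contents are made up for the demonstration:

```shell
# Build a scratch directory with a key file and two files to search.
work=$(mktemp -d)
mkdir "$work/dir"
printf '%s\n' 'foo 1' 'a.c 2' > "$work/input.txt"
printf '%s\n' 'has foo here' 'abc only' > "$work/dir/one.txt"
printf '%s\n' 'file a.c listed' 'nothing' > "$work/dir/two.txt"

# NR==FNR is true only while the first file is being read, so that rule
# loads the keys; every later file is then scanned line by line.
# index() does a literal substring test, so "a.c" does not match "abc".
hits=$(cd "$work" && awk '
  NR == FNR { keys[$1]; next }
  { for (key in keys) if (index($0, key)) { print FILENAME ": " $0; next } }
' input.txt dir/*)

printf '%s\n' "$hits"
# dir/one.txt: has foo here
# dir/two.txt: file a.c listed
rm -rf "$work"
```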
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
First thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.
Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. In FreeBSD or OSX, you would use a -J option instead.
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
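With -f, each key is an unanchored pattern, so a key like foo also matches food. If the keys are meant as literal whole words, adding -F and -w may be closer to the intent. A sketch on made-up data, using the temporary-file form:

```shell
# Scratch data: two keys, and a file where "food" contains the key "foo".
work=$(mktemp -d)
mkdir "$work/dir"
printf '%s\n' 'foo x' 'bar y' > "$work/input.txt"
printf '%s\n' 'foo alone' 'food truck' 'a bar' > "$work/dir/data.txt"

awk '{print $1}' "$work/input.txt" > "$work/keys.txt"

# Plain -f: "foo" also matches inside "food", so all three lines count.
loose=$(grep -c -f "$work/keys.txt" "$work/dir/data.txt")

# -F treats keys as fixed strings; -w requires whole-word matches.
strict=$(grep -c -F -w -f "$work/keys.txt" "$work/dir/data.txt")

echo "$loose $strict"   # 3 2
rm -rf "$work"
```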
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz#szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz#szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
If you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes.
For example, this works:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
Note the space after grep and the space before the file name.
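When the command string is built this way, the key is re-read by the shell that system() starts, so a key containing spaces or shell metacharacters could be split or expanded. One possible hardening, shown only as a sketch, wraps the key in single quotes. It assumes keys never contain a single quote themselves, and the file names are made up:

```shell
# Scratch data: a key file and a target file to search.
work=$(mktemp -d)
printf '%s\n' 'foo 1' 'bar 2' > "$work/keys.txt"
printf '%s\n' 'a foo line' 'nothing' 'a bar line' > "$work/target.txt"

# \047 is the octal escape for a single quote; wrapping $1 in quotes keeps
# the shell started by system() from splitting or expanding the key.
# Assumption: keys never contain a single quote themselves.
out=$(awk -v target="$work/target.txt" '{
  cmd = "grep -F -- \047" $1 "\047 \047" target "\047"
  system(cmd)
}' "$work/keys.txt")

printf '%s\n' "$out"
# a foo line
# a bar line
rm -rf "$work"
```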
