How To Substitute Piped Output of Awk Command With Variable - linux

I'm trying to take a column and pipe it through an echo command. If possible, I would like to keep it in one line or do this as efficiently as possible. While researching, I found that I have to use single quotes to expand the variable and to escape the double quotes.
Here's what I was trying:
awk -F ',' '{print $2}' file1.txt | while read line; do echo "<href=\"'${i}'\">'${i}'</a>"; done
But I keep getting the right number of lines without each line's value substituted in. If you know how to capture each line in field 2, that would be very helpful.
File1.txt:
Hello,http://example1.com
Hello,http://example2.com
Hello,http://example3.com
Desired output:
<href="http://example1.com">http://example1.com</a>
<href="http://example2.com">http://example2.com</a>
<href="http://example3.com">http://example3.com</a>

$ awk -F, '{printf "<href=\"%s\">%s</a>\n", $2, $2}' file
<href="http://example1.com">http://example1.com</a>
<href="http://example2.com">http://example2.com</a>
<href="http://example3.com">http://example3.com</a>
Or, slightly briefer but less robust:
$ sed 's/.*,\(.*\)/<href="\1">\1<\/a>/' file
<href="http://example1.com">http://example1.com</a>
<href="http://example2.com">http://example2.com</a>
<href="http://example3.com">http://example3.com</a>

Related

bash: awk print within print

I need to grep for a pattern and then print part of the matching line. Currently I am using the command below, which works fine, but I'd like to eliminate the multiple pipes and achieve the same output with a single awk command. Is there a way to do it using awk?
root#Server1 # cat file
Jenny:Mon,Tue,Wed:Morning
David:Thu,Fri,Sat:Evening
root#Server1 # awk '/Jenny/ {print $0}' file | awk -F ":" '{ print $2 }' | awk -F "," '{ print $1 }'
Mon
I want to get this output using a single awk command. Any help?
You can try something like:
awk -F: '/Jenny/ {split($2,a,","); print a[1]}' file
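Run against the sample file above, that prints the expected value:
$ awk -F: '/Jenny/ {split($2,a,","); print a[1]}' file
Mon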
Try this
awk -F'[:,]+' '/Jenny/{print $2}' file.txt
It uses multiple separator characters inside the [ ].
The + means one or more, since the -F value is treated as a regex.
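To see how that separator splits the line, a quick sketch numbering the fields on the Jenny line:
$ awk -F'[:,]+' '/Jenny/ {for (i=1; i<=NF; i++) print i, $i}' file.txt
1 Jenny
2 Mon
3 Tue
4 Wed
5 Morning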
For this particular job, I find grep to be slightly more robust.
Unless your company has a policy not to hire people named Eve.
(Try it out if you don't see why — there's a demonstration after the explanation below.)
grep -oP '^[^:]*Jenny[^:]*:\K[^,:]+' file
Or to do a whole-word match:
grep -oP '^[^:]*\bJenny\b[^:]*:\K[^,:]+' file
Or when you are confident that "Jenny" is the full name:
grep -oP '^Jenny:\K[^,:]+' file
Output:
Mon
Explanation:
The stuff up until \K speaks for itself: it selects the line(s) with the desired name.
[^,:]+ captures the day of week (in this case Mon).
\K discards everything matched before it, so the reported match starts at Mon.
-o prints only the match, so nothing after Mon is printed.
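To see the robustness issue behind the Eve remark, a quick demonstration with a hypothetical employee named Eve: the unanchored awk match also hits the word Evening on David's line, while the anchored grep above does not:
$ printf 'Eve:Mon,Tue,Wed:Morning\nDavid:Thu,Fri,Sat:Evening\n' | awk -F'[:,]+' '/Eve/ {print $2}'
Mon
Thu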

Extract Column(s) from text file having Multi Character Delimiter i.e. "%$%"

I have tried different solutions for this problem given on the forum, but they don't work for the delimiter %$%. I need to extract one specific column from a file containing 200+ columns.
I tried the following:
awk -F"%$%" '{print $1}' sample.txt > outfile.txt
awk 'gsub("%$%",":")' sample.txt > outfile.txt
The symbol $ is a special character in a regex, so you need to escape it with a \. The backslash is itself special in the string literal, so it needs to be escaped again.
So, finally we have:
$ cat sample
ghkjlj;lk%$%23e;k32poek%$%eqdje2oijd%$%xrgtdy5h
$ awk -F'%\\$%' '{print $1}' sample
ghkjlj;lk
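From there, extracting one specific column out of the 200+ is just a matter of its field number; a sketch, assuming (hypothetically) that the column you need is the 57th:
awk -F'%\\$%' '{print $57}' sample.txt > outfile.txt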
Whether you use -F (FS) or gsub(), awk expects a regex, so you need to either use a character class or escape the characters with special meaning, like the $ in your example.
kent$ awk -F'%[$]%' '{print $1}' <<<"foo%$%bar%$%blah"
foo
If you just want to change the separator, you can do it with gsub or with OFS (the $1=$1 assignment forces awk to rebuild the record using the new OFS):
kent$ awk -F'%[$]%' -v OFS=":" '$1=$1' <<<"foo%$%bar%$%blah"
foo:bar:blah
kent$ awk 'gsub(/%[$]%/,":")+1' <<<"foo%$%bar%$%blah"
foo:bar:blah
Here gsub() returns the number of substitutions made; adding 1 keeps the pattern true even when nothing is replaced, so every line is printed.

How Can I Perform Awk Commands Only On Certain Fields

I have CSV columns that I'm working with:
info,example-string,super-example-string,otherinfo
I would like to get:
example-string super example string
Right now, I'm running the following command:
awk -F ',' '{print $3}' | sed "s/-//g"
But then I have to paste the lines back together to combine $2 and $3.
Is there any way to do something like this?
awk -F ',' '{print $2" "$3}' | sed "s/-//g"
Except where the sed command is only performed on $3 and $2 stays in place? I'm just concerned that if the lines don't match up later on, the data could become misaligned.
Please note: I need to keep the pipe to sed. I used a simple example here, but I run a lot of other commands after it as well.
Try:
$ awk -F, '{gsub(/-/," ",$3); print $2,$3}' file
example-string super example string
How it works
-F,
This tells awk to use a comma as the field separator.
gsub(/-/," ",$3)
This replaces all - in field 3 with spaces.
print $2,$3
This prints fields 2 and 3.
Examples using pipelines
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}'
example-string super example string
In a pipeline with sed:
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}' | sed 's/string/String/g'
example-String super example String
Though the best solution would be a single sed or a single awk, you asked for an awk-plus-sed solution, so here is one. It also assumes your actual data has the same shape as the sample Input_file.
awk -F, '{print $2,$3}' Input_file | sed 's/\([^ ]*\)\([^-]*\)-\([^-]*\)-\([^-]*\)/\1 \2 \3 \4/'
Output will be as follows.
example-string super example string
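If the field to clean up varies, you can pass its number in as an awk variable; a sketch where -v f=3 (a hypothetical choice) selects the field, using awk's dynamic field reference $f:
$ awk -F, -v f=3 '{gsub(/-/," ",$f); print $2,$f}' file
example-string super example string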

Unix (ksh) script to read file, parse and output certain columns only

I have an input file that looks like this:
"LEVEL1","cn=APP_GROUP_ABC,ou=dept,dc=net","uid=A123456,ou=person,dc=net"
"LEVEL1","cn=APP_GROUP_DEF,ou=dept,dc=net","uid=A123456,ou=person,dc=net"
"LEVEL1","cn=APP_GROUP_ABC,ou=dept,dc=net","uid=A567890,ou=person,dc=net"
I want to read each line, parse and then output like this:
A123456,ABC
A123456,DEF
A567890,ABC
In other words, retrieve the user id from "uid=" and then the identifier from "cn=APP_GROUP_". Repeat for each input record, writing to a new output file.
Note that the column positions aren't fixed, so I can't rely on them; I'm guessing I have to search for the "uid=" string and somehow use its position?
Any help much appreciated.
You can do this easily with sed:
sed 's/.*cn=APP_GROUP_\([^,]*\).*uid=\([^,]*\).*/\2,\1/'
The regex captures the two desired strings and outputs them in reverse order with a comma between them. You might need to tighten the context of the captures, depending on the precise nature of your data: the greedy .* means uid= will match the last uid= on the line if there is more than one.
You can use awk to split into columns: first split by ',', then split by '=', and grab the result, as in awk -F, '{ print $5}' | awk -F= '{print $2}'.
Here it is against the example you provided:
cat file | awk -F, '{ print $5}' | awk -F= '{print $2}'
A123456
A123456
A567890
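Note that this pipeline only extracts the uid; the question also asks for the APP_GROUP_ identifier. A single-pass awk sketch that produces both, assuming each line contains exactly one cn=APP_GROUP_ and one uid= token:
awk '{
  match($0, /cn=APP_GROUP_[^,]*/); grp = substr($0, RSTART+13, RLENGTH-13)
  match($0, /uid=[^,]*/);          uid = substr($0, RSTART+4,  RLENGTH-4)
  print uid "," grp
}' file
Output:
A123456,ABC
A123456,DEF
A567890,ABC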

How to reverse order of fields using AWK?

I have a file with the following layout:
123,01-08-2006
124,01-09-2007
125,01-10-2009
126,01-12-2010
How can I convert it into the following by using AWK?
123,2006-08-01
124,2007-09-01
125,2009-10-01
126,2010-12-01
Didn't read the question properly the first time: you need a field separator that can be either a dash or a comma. Once you have that, you can use the dash as the output field separator (as it's the most common) and fake the comma using concatenation:
awk -F',|-' 'OFS="-" {print $1 "," $4,$3,$2}' file
(The OFS="-" assignment sits in the pattern position; it evaluates to a true value, so the action runs for every line.)
Pure awk
awk -F"," '{ n=split($2,b,"-");$2=b[3]"-"b[2]"-"b[1];$i=$1","$2 } 1' file
sed
sed -r 's/(^.[^,]*,)([0-9]{2})-([0-9]{2})-([0-9]{4})/\1\4-\3-\2/' file
sed 's/\(^.[^,]*,\)\([0-9][0-9]\)-\([0-9][0-9]\)-\([0-9]\+\)/\1\4-\3-\2/' file
Bash
#!/bin/bash
# split each line at the comma: a gets the id, b gets the date
while IFS="," read -r a b
do
    # re-split the date at the dashes into the positional parameters $1 $2 $3
    IFS="-"
    set -- $b
    echo "$a,$3-$2-$1"
done <"file"
Unfortunately, some older awk implementations only allow a single field-separator character, so you may have to pre-process the data. You can do this with tr, but if you really want an awk-only solution, use:
pax> echo '123,01-08-2006
124,01-09-2007
125,01-10-2009
126,01-12-2010' | awk -F, '{print $1"-"$2}' | awk -F- '{print $1","$4"-"$3"-"$2}'
This outputs:
123,2006-08-01
124,2007-09-01
125,2009-10-01
126,2010-12-01
as desired.
The first awk changes the , characters to - so that you have four fields separated with the same character (this is the bit I'd usually use tr ',' '-' for).
The second awk prints them out in the order you specified, correcting the field separators at the same time.
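For reference, the tr pre-processing variant mentioned above looks like this:
tr ',' '-' < file | awk -F- '{print $1","$4"-"$3"-"$2}'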
If you're using an awk implementation that allows multiple FS characters, you can use something like:
gawk -F ',|-' '{print $1","$4"-"$3"-"$2}'
If it doesn't need to be awk, you could use Perl too:
$ perl -nle 'print "$1,$4-$3-$2" while (/(\d{3}),(\d{2})-(\d{2})-(\d{4})\s*/g)' < file.txt
