Escaping backslash in AWK

I'm trying to understand why the command below doesn't work (output is empty):
echo 'aaa\tbbb' | awk -F '\\t' '{print $2}'
I would expect the output to be 'bbb'.
Interestingly this works (output is 'bbb'):
echo 'aaa\tbbb' | awk -F 't' '{print $2}'
And this works as well (output is 'tbbb'):
echo 'aaa\tbbb' | awk -F '\\' '{print $2}'
It looks as if \\t is read as a backslash followed by a tab instead of an escaped backslash followed by t.
Is there a proper way to write this command?

You need to tell echo to interpret backslash escapes. Try:
$ echo -e 'aaa\tbbb' | awk -F '\t' '{print $2}'
bbb
From man echo:
-e enable interpretation of backslash escapes
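Since the behaviour of echo with backslashes differs between shells, printf is a more portable way to produce a real tab. A minimal sketch, assuming a POSIX shell and any common awk; the variable name second is only for illustration:

```shell
# printf interprets \t in its format string, so awk receives a real tab character.
# awk in turn reads the -F argument '\t' as a tab.
second=$(printf 'aaa\tbbb\n' | awk -F '\t' '{print $2}')
echo "$second"   # prints: bbb
```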

Related

How to use awk under fish to split a string by triple dollar signs ($$$) as field separator?

awk -F '$' works well with a string separated by single dollar signs (a$b$c, for example), but it fails when the separator is a run of several dollar signs.
The expected output is 1$23. I have tried the following combinations, all in vain:
$ printf '1$23$$$456' | awk -F '$$$' '{print $1}'
1$23$$$456
$ printf '1$23$$$456' | awk -F '\$\$\$' '{print $1}'
1$23$$$456
$ printf '1$23$$$456' | awk -F '\\$\\$\\$' '{print $1}'
1$23$$$456
$ printf '1$23$$$456' | awk -F '$' '{print $1}'
1
I wonder if there is a way to split a string by a sequence of dollar signs using awk?
update
$ awk --version
awk version 20070501
$ echo $SHELL
/usr/local/bin/fish

The problem is due to fish quoting rules. The important difference is that fish, unlike Bash, allows escaping a single quote within a single quoted string, like so:
$ echo '\''
'
and, consequently, a literal backslash has to be escaped as well:
$ echo '\\'
\
So, to get what in Bash corresponds to \\, we have to use \\\\:
$ printf '1$23$$$456' | awk -F '\\\\$\\\\$\\\\$' '{print $1}'
1$23

awk and the various shells each apply their own backslash-escaping rules, and the rules differ from shell to shell, so sometimes you have to pile up backslashes to get one through. The easiest fix is to write a single symbol as a bracket expression: [$]. Inside brackets, $ is an ordinary character, so no backslashes are needed. This works for field separators because FS is treated as a regular expression whenever it is longer than one character.
$ awk -F '[$][$][$]' '{...}' file
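Applied to the string from the question, the bracket-expression form can be sketched as follows. This assumes a POSIX-style shell (sh or Bash) rather than fish; the variable name first is only for illustration:

```shell
# Each [$] matches one literal dollar sign; three in a row match the $$$ separator.
first=$(printf '1$23$$$456\n' | awk -F '[$][$][$]' '{print $1}')
echo "$first"   # prints: 1$23
```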

Use more backslashes:
#> printf '1$23$$$456' | awk -F '\\$\\$\\$' '{print $1}'
1$23

Maybe not use awk if that's throwing you curves?
$: echo '1$23$$$456' | sed 's/$$$.*//'
1$23
Why farm it out to a subshell at all for something that's just string processing?
$: x='1$23$$$456'
$: echo "${x%%\$\$\$*}"
1$23

Take output from AWK command and display line by line based on white space

I am running the following command in a bash script:
echo `netstat -plten | grep -i autossh | awk '{print $4}'` >> /root/logs/autossh.txt
The output displays in a single line:
127.0.0.1:25001 127.0.0.1:15501 127.0.0.1:10001 127.0.0.1:20501 127.0.0.1:15001 127.0.0.1:5501 127.0.0.1:20001
I would like each IP to display on its own line. What do I need to change in the awk command to make the output display line by line?

Just remove the echo and subshell:
netstat -plten | grep -i autossh | awk '{print $4}' >> /root/logs/autossh.txt
awk is already printing them one per line, but the unquoted command substitution undergoes word splitting, so every line of awk output becomes a separate argument to echo. echo then prints its arguments with a single space between them, and that is how you lose your line endings.
Of course, awk can do pattern matching too, so no real need for grep:
netstat -plten | awk '/autossh/ {print $4}' >> /root/logs/autossh.txt
With gawk, at least, you can have it ignore case too:
netstat -plten | awk 'BEGIN {IGNORECASE=1} /autossh/ {print $4}' >> /root/logs/autossh.txt
Or, as Ed Morton pointed out, with any awk you could do:
netstat -plten | awk 'tolower($0) ~ /autossh/ {print $4}' >> /root/logs/autossh.txt

You can just quote the result of command substitution to prevent the shell from performing word splitting.
You can modify it as follows to achieve what you want.
echo "`netstat -plten | grep -i autossh | awk '{print $4}'`" >> /root/logs/autossh.txt
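The difference between the unquoted and quoted forms can be seen without netstat. A small sketch, where the function lines is a hypothetical stand-in for the netstat | grep | awk pipeline and the variable names are only for illustration:

```shell
# Stand-in for the real pipeline: two lines of output.
lines() { printf '127.0.0.1:25001\n127.0.0.1:15501\n'; }

unquoted=$(echo $(lines))     # word splitting: each line becomes one argument to echo
quoted=$(echo "$(lines)")     # quoting passes a single argument with the newline intact

echo "$unquoted"   # one line, addresses separated by a space
echo "$quoted"     # two lines, one address each
```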

Bash tries to execute commands in heredoc

I am trying to write a simple bash script that will print a multiline output to another file. I am doing it through heredoc format:
#!/bin/sh
echo "Hello!"
cat <<EOF > ~/Desktop/what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
I was expecting to see a file in my desktop with these contents:
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
But instead, I am seeing these as the contents of my what.txt file:
a=
b=
Somehow, even though it is part of a heredoc, bash is trying to execute it line by line. How do I prevent this, and print the contents to the file as it is?

Quote the EOF delimiter so that the shell takes the here-document body literally:
cat <<'EOF' > what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
Also start using $() for command substitution instead of old and problematic ``.
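To see both behaviours side by side, here is a minimal sketch using $() as suggested. The function names show_expanded and show_literal and the argument archive.tar.gz are made up for illustration:

```shell
# Unquoted delimiter: the shell expands $1 and runs the command substitution.
show_expanded() {
cat <<EOF
a=$(echo "$1" | awk -F. '{print $NF}')
EOF
}

# Quoted delimiter: the body is copied through untouched.
show_literal() {
cat <<'EOF'
a=$(echo "$1" | awk -F. '{print $NF}')
EOF
}

show_expanded archive.tar.gz   # prints: a=gz
show_literal archive.tar.gz    # prints the line exactly as written
```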

Bash script: Read text after characters

I'd like to read the text after characters in a file.
For example:
MPlayer-2013-08-30-i486|MPlayer|2013-08-30-i486||Multimedia;video|4508K||MPlayer-2013-08-30-i486.pet|+ffmpeg|mplayer video player|slackware|14.0||
I'd like to read the version of the program (in the third field):
2013-08-30-i486
How can I do this in my bash script?

This is pretty easily done with cut:
echo 'MPlayer-2013-08-30-i486|MPlayer|2013-08-30-i486||Multimedia;video|4508K||MPlayer-2013-08-30-i486.pet|+ffmpeg|mplayer video player|slackware|14.0||' | cut -d '|' -f 3
2013-08-30-i486
which will split on | and choose the 3rd field.

Using BASH regex:
s='MPlayer-2013-08-30-i486|MPlayer|2013-08-30-i486||Multimedia;video|4508K||MPlayer-2013-08-30-i486.pet|+ffmpeg|mplayer video player|slackware|14.0||'
[[ "$s" =~ MPlayer-([^|]+) ]] && echo "${BASH_REMATCH[1]}"
2013-08-30-i486
Using awk:
awk -F 'MPlayer-|\\|' '{print $2}' <<< "$s"
2013-08-30-i486
To grab 3rd field using awk:
awk -F '\\|' '{print $3}' <<< "$s"
2013-08-30-i486

This is simple to do in AWK:
$ awk -F'|' '{print $3}' file
2013-08-30-i486
It seems that the same data is repeated in several places, so I assume that any of them is OK to use. In the above line, the input is split into fields on the | character and the third field is printed. The same thing happens for every line of input.

Through grep,
$ grep -oP 'MPlayer-\K[^|.]*(?=\|)' file
2013-08-30-i486
Through sed,
$ echo 'MPlayer-2013-08-30-i486|MPlayer|2013-08-30-i486||Multimedia;video|4508K||MPlayer-2013-08-30-i486.pet|+ffmpeg|mplayer video player|slackware|14.0||' | sed -r 's/^[^|]+\|[^|]+\|([^|]+).*$/\1/'
2013-08-30-i486

Using read (all shells):
IFS='|' read __ __ VERSION __ < file
echo "$VERSION"
Another using read -a and Bash arrays:
IFS='|' read -a FIELDS < file
echo "${FIELDS[2]}"
Output:
2013-08-30-i486

The read built-in will be most efficient for a single line:
IFS="|" read __ __ version __ <<< "$line"
although if you are processing a file full of such lines with
while IFS="|" read -r __ __ version __; do
    printf '%s\n' "$version"    # do something with $version
done < file
it might be more efficient to use cut:
while IFS= read -r version; do
    printf '%s\n' "$version"    # do something with $version
done < <(cut -d'|' -f3 file)
or awk:
awk -F'|' '{ print $3 }' file    # replace print with whatever you need to do with $3
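As a worked example of the read approach on a shortened copy of the sample line, here is a sketch that uses a here-document instead of <<< so it also runs in plain sh. The variable names skip1, skip2 and rest are placeholders chosen for illustration:

```shell
line='MPlayer-2013-08-30-i486|MPlayer|2013-08-30-i486||Multimedia;video'

# IFS is changed only for the duration of read; -r keeps backslashes literal.
IFS='|' read -r skip1 skip2 version rest <<EOF
$line
EOF

echo "$version"   # prints: 2013-08-30-i486
```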

can not use unix $variable in awk command [duplicate]

I have the following variable set in my Unix environment. If I try to use it in an awk command it does not work, but the same command works when I don't use the $b variable:
b="NEW"
When I try the following command, it does not work:
echo "$a" | tr [a-z] [A-Z] |awk -v RS=, '/TABLE/&&/CREATE/&&/`echo ${b}`/{print $NF}'
But if I replace $b with its value, NEW, as below, it works:
echo "$a" | tr [a-z] [A-Z] |awk -v RS=, '/TABLE/&&/CREATE/&&/NEW/{print $NF}'

You cannot use a bash variable inside awk like that. Instead, use:
echo "$a" | tr [a-z] [A-Z] | awk -v RS=, -v myvar="$b" '/TABLE/&&/CREATE/&& $0~myvar {print $NF}'
See an example:
$ var="hello"
$ awk -v text="$var" 'BEGIN{print text}'
hello
Also, for me it works with tr 'a-z' 'A-Z' instead of tr [a-z] [A-Z]; unquoted brackets are glob patterns that the shell may expand if a matching file name exists. And based on Mark Setchell's suggestion, you can skip tr entirely by setting IGNORECASE = 1 (a gawk extension):
echo "$a" | awk -v RS=, -v myvar="$b" 'BEGIN{IGNORECASE=1} /TABLE/&&/CREATE/&& $0~myvar {print $NF}'

Regarding your question:
if i replace the $b value to NEW as below its working
It works because the value of your variable is NEW and what you end up doing is using that in the regex, which is exactly how it is supposed to be done.
About your second question:
can not use unix $variable in awk command
You cannot use shell variables in awk like that. You need to create an awk variable by using -v option and assigning your bash variable.
awk -v awkvar="$bashvar" '/ /{ ... }'
That turns your existing command into:
echo "$a" | tr [a-z] [A-Z] | awk -v RS=, -v var="$b" '/TABLE/&&/CREATE/&&/var/{print $NF}'
This still won't work, because variables are not interpolated inside /../: the text is taken literally, so /var/ matches the three letters "var". Use a dynamic regex match instead:
echo "$a" | tr [a-z] [A-Z] |awk -v RS=, -v var="$b" '/TABLE/&&/CREATE/&&$0~var{print $NF}'
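A self-contained run of the same idea, with made-up sample data in $a (the table names are hypothetical) and the tr ranges quoted:

```shell
b="NEW"
a="create table old_t,create table new_t,drop table new_x"   # hypothetical input

# -v copies the shell value into the awk variable var; $0 ~ var uses it as a regex.
result=$(printf '%s\n' "$a" | tr 'a-z' 'A-Z' |
  awk -v RS=, -v var="$b" '/TABLE/ && /CREATE/ && $0 ~ var {print $NF}')

echo "$result"   # prints: NEW_T
```

Only the second record contains TABLE, CREATE and NEW at the same time, so only its last field is printed.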
