awk script header: #!/bin/bash or #!/bin/awk -f?

In an awk file, e.g. example.awk, should the header be #!/bin/bash or #!/bin/awk -f?
The reason for my question is that if I try this command in the console I receive the correct file.txt with "line of text":
awk 'BEGIN {print "line of text"}' >> file.txt
but if I try to execute the following file with ./example.awk:
#! /bin/awk -f
awk 'BEGIN {print "line of text"}' >> file.txt
it returns an error:
$ ./awk-usage.awk
awk: ./awk-usage.awk:3: awk 'BEGIN {print "line of text"}' >> file.txt
awk: ./awk-usage.awk:3: ^ invalid char ''' in expression
If I change the header to #!/bin/bash or #!/bin/sh it works.
What is my error, and what is the reason for it?

Since you explicitly run the awk command, you should use #!/bin/bash. You can use #!/bin/awk -f if you remove the awk command and include only the awk program (e.g. BEGIN {print "line of text"}), but then you need to append to the file using awk syntax (print ... >> "file.txt").
With -f, awk reads its program from the named file, so that file must contain only awk code; a shell command line inside it is a syntax error.
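As a minimal sketch of the second option, the snippet below writes a pure awk program to example.awk and runs it. The scratch directory and the workdir variable are only for illustration, and the interpreter path on the first line is the one from the question, which may differ on your system.

```shell
# Scratch directory so the demo does not touch existing files.
workdir=$(mktemp -d)

# example.awk contains only awk code. awk treats the "#!" line as a comment,
# and the append happens with awk's own >> redirection.
cat > "$workdir/example.awk" <<'EOF'
#!/bin/awk -f
BEGIN { print "line of text" >> "file.txt" }
EOF

# Invoking awk -f directly avoids depending on where awk is installed.
(cd "$workdir" && awk -f example.awk)
cat "$workdir/file.txt"
```

After chmod +x, the same file could also be started as ./example.awk, provided the path in its first line really points at awk.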

Your script is a shell script that happens to contain an awk command.
#! /bin/sh tells the system to execute the file with /bin/sh, which is correct because the file is a shell script. If you replace that with #! /bin/awk -f, the file is executed with awk, basically the same as executing
/bin/awk -f ./example.awk
so awk tries to parse the line awk 'BEGIN {print "line of text"}' >> file.txt as awk code and fails at the first single quote.

Related

Bash tries to execute commands in heredoc

I am trying to write a simple bash script that will print a multiline output to another file. I am doing it through heredoc format:
#!/bin/sh
echo "Hello!"
cat <<EOF > ~/Desktop/what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
I was expecting to see a file in my desktop with these contents:
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
But instead, I am seeing these as the contents of my what.txt file:
a=
b=
Somehow, even though it is part of a heredoc, bash is trying to execute it line by line. How do I prevent this, and print the contents to the file as it is?
Quote EOF so that bash takes inputs literally:
cat <<'EOF' > what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
Also, start using $() for command substitution instead of the old and problematic backticks (`...`).
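To see the difference side by side, here is a small sketch that writes the same line through an unquoted and a quoted heredoc. The tmp directory, the file names, and the sample argument archive.tar.gz are made up for the demo.

```shell
tmp=$(mktemp -d)
set -- archive.tar.gz   # give $1 a value for the demo

# Unquoted delimiter: the shell expands $1 and runs the command substitution.
cat <<EOF > "$tmp/expanded.txt"
a=$(echo "$1" | awk -F. '{print $NF}')
EOF

# Quoted delimiter: the body is written out exactly as typed.
cat <<'EOF' > "$tmp/literal.txt"
a=$(echo "$1" | awk -F. '{print $NF}')
EOF

cat "$tmp/expanded.txt" "$tmp/literal.txt"
```

The first file ends up with the computed value (a=gz here), the second with the untouched command text.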

How to split a line on Thorn character 'þ' in linux?

How do I split a line on the thorn character 'þ' in Linux?
I have tried the following
awk -F 'þ' '{print $2}'
awk -F '\xC3\xBE\x02' '{print $2}'
awk -F 'þ' '{print $2}'
nothing worked.
EDIT:
The file is located in an HDFS (Hadoop File System) path. The following command works on the command line but not in a shell script: when the shell script is executed, it gives empty output, i.e. the thorn character is not recognized.
Command line:
~/etltestsar/DoubleClick$ hadoop fs -cat /raw/doubleclick/data/dt=2015-03-30/NetworkMatchtablesActivity_7657_03-30-2015_advertiser.log.gz|gunzip|tail -n +2|awk -F 'þ' '
Warning: $HADOOP_HOME is deprecated.
3848762
3963771
4112862
4140939
4199580
4199584
.....
Same command in shell script produces no output
hadoop#node28-19-88:~/etltestsar/DoubleClick$ sh testthorn.sh
Warning: $HADOOP_HOME is deprecated.
Get a different awk? GNU awk 4.1.1 in bash 4.1.17(9) on cygwin:
$ cat file
fooþbar
$ awk -F 'þ' '{print $2}' file
bar

Print name of the file in front of every line of file

I have a lot of text files and I want to write a bash script in Linux that prints the file name at the start of each line of the file. For example, I have the file lenovo.txt and I want every line in it to start with lenovo.txt.
I tried a for loop for this, but it didn't work.
for i in *.txt
do
awk '{print '$i' $0}' /var/SambaShare/$i > /var/SambaShare/new_$i
done
Thanks!
It doesn't work because you need to pass $i to awk with the -v option. But you can also use the FILENAME built-in variable in awk:
ls *txt
file.txt file2.txt
cat *txt
A
B
C
A2
B2
C2
for i in *txt; do
awk '{print FILENAME,$0}' $i;
done
file.txt A
file.txt B
file.txt C
file2.txt A2
file2.txt B2
file2.txt C2
And to redirect into a new file:
for i in *txt; do
awk '{print FILENAME,$0}' $i > ${i%.txt}_new.txt;
done
As for your corrected version:
for i in *.txt
do
awk -v i=$i '{print i,$0}' $i > new_$i
done
Hope this helps.
Using grep you can make use of the --with-filename (alias -H) option and use an empty pattern that always matches:
for i in *.txt
do
grep -H "" $i > new_$i
done
Awk and Bash don't share the same variables as they are different languages with separate interpreters. You should pass Bash variables to Awk with the -v option.
You should also quote your file name variables to ensure they don't get expanded as separate arguments if they contain whitespace.
for i in *.txt
do
awk -v i="$i" '{print i,$0}' "$i" > "new_$i"
done
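Putting the -v and quoting advice together, a sketch along these lines should cope with file names that contain spaces. The indir and outdir directories and the sample file are invented for the demo; writing the results to a separate directory is my own choice, so that a second run does not pick up the new_ files again.

```shell
indir=$(mktemp -d)    # stands in for the directory holding the .txt files
outdir=$(mktemp -d)   # results go elsewhere so *.txt never matches them
printf 'A\nB\n' > "$indir/my file.txt"

for path in "$indir"/*.txt; do
    name=${path##*/}                       # strip the directory part
    # Pass the shell variable into awk with -v and quote every expansion.
    awk -v name="$name" '{ print name, $0 }' "$path" > "$outdir/new_$name"
done

cat "$outdir/new_my file.txt"
```

One caveat: awk processes backslash escapes in -v values, so a file name containing a backslash would need extra care.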

Pass parameter to an awk script file

If I want to pass a parameter to an awk script file, how can I do that ?
#!/usr/bin/awk -f
{print $1}
Here I want to print the first argument passed to the script from the shell, like:
bash-prompt> echo "test" | ./myawkscript.awk hello
bash-prompt> hello
In awk, $1 refers to the first field of a record, not the first command-line argument as it does in bash. You need to use ARGV for this; see the awk documentation for the official word.
Script:
#!/bin/awk -f
BEGIN{
print "AWK Script"
print ARGV[1]
}
Demo:
$ ./script.awk "Passed in using ARGV"
AWK Script
Passed in using ARGV
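The script above only has a BEGIN block, so awk never tries to read input. If the script also has a main rule, awk would treat the argument as a file name, so a common workaround is to delete it from ARGV after reading it. A sketch, with greet.awk and the greeting variable as made-up names:

```shell
tmp=$(mktemp -d)

cat > "$tmp/greet.awk" <<'EOF'
BEGIN {
    greeting = ARGV[1]   # first command-line argument
    delete ARGV[1]       # remove it so awk does not try to open it as a file
}
{ print greeting, $1 }   # with no file arguments left, awk reads stdin
EOF

result=$(echo "test" | awk -f "$tmp/greet.awk" hello)
echo "$result"
```

Here the piped text and the argument both reach the script: the argument through ARGV, the text through stdin.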
You can use -v as a command-line option to provide a variable to the script:
Say we have a file script.awk like this:
BEGIN {print "I got the var:", my_var}
Then we run it like this:
$ awk -v my_var="hello this is me" -f script.awk
I got the var: hello this is me
Your shebang declares that the script is an awk script, not a shell script, so you cannot use shell-style positional parameters inside it.
Also, what you did (echo blah | awk ...) does not pass a parameter; it pipes the output of the echo command into another command.
You could try one of the ways below:
echo "hello"|./foo.awk file -
or
var="hello"
awk -v a="$var" -f foo.awk file
With this, you have the variable a in your foo.awk and can use it.
If you want something that accepts $1 and $2 like a shell script does, you can write a small shell script to wrap your awk code.
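Such a wrapper could look roughly like this. It is written as a shell function so it can be pasted into a terminal; prefix_first_field and the prefix variable are hypothetical names, and the same body would work as a standalone script.

```shell
# Hypothetical wrapper: takes a shell-style argument ($1) and hands it to
# awk as the variable "prefix", while awk reads its records from stdin.
prefix_first_field() {
    awk -v prefix="$1" '{ print prefix ":" $1 }'
}

echo "foo bar" | prefix_first_field hello
```

Inside the function, $1 is the shell's first argument; inside the awk program, $1 is the first field of each input line.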
EDIT
No, I didn't misunderstand you.
Let's take an example.
Say your x.awk has:
{print $1}
If you do:
echo "foo" | x.awk file
it is the same as:
echo "foo"| awk '{print $1}' file
Here the only input for awk is file, so your echo foo has no effect. If you do:
echo "foo"|awk '{print $1}' file -
or
echo "foo"|awk '{print $1}' - file
awk takes two inputs (arguments for awk): one is stdin and one is the file. In your awk script you could do:
echo "foo"|awk 'NR==FNR{print $1;next}{print $1}' - file
This first prints foo from your echo, then column 1 of file. Of course, this example does no real work; it just prints them all.
You can of course have more than two inputs. Instead of checking NR and FNR, you could use:
ARGC The number of elements in the ARGV array.
ARGV An array of command line arguments, excluding options and the program argument, numbered from zero to ARGC-1
For example:
echo "foo"|./x.awk file1 - file2
Here stdin, which carries your "foo", is the 2nd argument; inside x.awk, ARGV[2] is "-".
echo "foo" |x.awk file1 file2 file2 -
Now it is the ARGV[4] case.
In other words, your echo "foo" | ... becomes stdin for awk, and it can be the 1st or the nth "argument"/input for awk, depending on where you put the - (stdin). You have to handle it in your awk script.
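To make the ordering visible, this sketch prints ARGV in the BEGIN block and then the first field of every input line. The tmp directory and the contents of file1 and file2 are invented for the demo.

```shell
tmp=$(mktemp -d)
printf 'one\n' > "$tmp/file1"
printf 'two\n' > "$tmp/file2"

# "-" marks the position at which awk reads stdin among the file arguments.
out=$(cd "$tmp" && echo "foo" | awk '
    BEGIN { for (i = 1; i < ARGC; i++) printf "ARGV[%d]=%s\n", i, ARGV[i] }
    { print $1 }
' file1 - file2)

printf '%s\n' "$out"
```

The ARGV listing shows the literal "-" in second position, and the data lines come out in the same order as the arguments: file1 first, then stdin, then file2.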

How to get extension of a file in shell script

I am trying to get the file extension of a file in a shell script, but without any luck.
The commands I am using are
file_ext=${filename##*.}
and
file_ext = $filename |awk -F . '{if (NF>1) {print $NF}}'
But both commands failed to put a value in the variable file_ext. But when I try
echo $filename |awk -F . '{if (NF>1) {print $NF}}'
it gives me the desired result. I am new to shell scripting. Please explain what is happening, and how I should do it.
Thanks.
To get the file extension, just use the shell:
$ filename="myfile.ext"
$ echo ${filename##*.}
ext
$ file_ext=${filename##*.} #put to variable
$ echo ${file_ext}
ext
Spaces hurt.
Anyway you should do:
file_ext=$(echo $filename | awk -F . '{if (NF>1) {print $NF}}')
[Edit] Better suggestion by Martin:
file_ext=$(printf '%s' "$filename" | awk -F . '{if (NF>1) {print $NF}}')
That will store in $file_ext the output of the command.
You have to be careful when assigning variables.
variable1="string" # assign a string value
variable2=`command` # assign output from command (legacy backtick form)
variable3=$(command) # assign output from command
Notice that you cannot put spaces around the =, because then the line gets interpreted as a normal command with arguments.
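One difference between the two approaches is worth knowing: for a name without a dot, ${filename##*.} returns the whole name, while the awk version with NF>1 prints nothing. A small helper can give the shell version the same behaviour; get_ext is a hypothetical name, and this is only a sketch.

```shell
# Hypothetical helper: print the text after the last dot, or an empty
# line when the name contains no dot at all.
get_ext() {
    case $1 in
        *.*) printf '%s\n' "${1##*.}" ;;
        *)   printf '\n' ;;
    esac
}

file_ext=$(get_ext "archive.tar.gz")   # no spaces around the =
echo "$file_ext"
```

Note that a hidden file such as .bashrc still counts as having an extension under this rule; whether that is wanted depends on your use case.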

Resources