I am currently writing an awk script which looks like this:
#!/usr/bin/awk -f
BEGIN {
print "Starting extracting data"
}
{
print $0
}
END {
print "End of file"
}
My script works fine like that on my computer, but it is not portable... I'd like to do something like
#!/usr/bin/env awk -f
...
but the Debian shell does not accept several parameters in a single shebang. I get "awk -f": no such file or directory. Is there any workaround I could use, or is it completely impossible?
There is no easy way to do this cleanly; as mentioned in glenn jackman's comment, shebangs are implemented by execve(), which only accepts an interpreter and a single optional argument.
You'll need a workaround.
The comments to the question imply you don't want a wrapper script, but what if it's contained inside the final awk script?
#!/bin/sh
awk -f - "$@" << 'EOF'
#!/usr/bin/awk -f
BEGIN {
print "Starting extracting data"
}
{
print $0
}
END {
print "End of file"
}
This uses a portable POSIX shell shebang and then immediately invokes awk with the awk code fed from a heredoc on standard input (-f -), passing the further arguments and options ("$@") to awk as files. The heredoc is quoted (<< 'EOF') so things like $0 aren't interpreted by the POSIX shell. Since a line consisting solely of EOF is not present, the heredoc ends with the file.
(The second shebang is not read by anything. It's purely cosmetic for people who read the code. If you name this file with the .awk suffix, editors like vim will default their syntax highlighting to awk despite the contents of the first shebang.)
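The effect of quoting the heredoc delimiter can be seen in isolation. This is a minimal sketch, separate from the wrapper above: with a quoted delimiter the shell passes $0 through literally, and with an unquoted one it expands it.

```shell
# With << 'EOF' (quoted), the shell leaves $0 alone, so awk would
# receive it as awk code. With << EOF (unquoted), the shell expands
# $0 to its own name before the text reaches the consumer.
literal=$(cat << 'EOF'
$0
EOF
)
expanded=$(cat << EOF
$0
EOF
)
echo "literal=$literal"
echo "expanded=$expanded"
```

The second heredoc's output depends on how the script is invoked, which is exactly why the wrapper quotes its delimiter.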
That code won't work for piping content into the file because the script is itself piped into awk. If you want to support piping, it needs to be a little uglier, using bash's input process substitution (<(…)):
#!/bin/bash
exec awk -f <(awk 'NR > 2' "$0") "$@"
#!/usr/bin/awk -f
BEGIN {
print "Starting extracting data"
}
{
print $0
}
END {
print "End of file"
}
This tells bash to execute awk on a named pipe created from that second awk command, which reads the full script (bash interprets $0 as the name of the file it is running) and prints lines 3 and higher. Again, the second shebang is purely cosmetic and is therefore just a comment.
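The line-skipping half of that trick can be demonstrated on its own. Here awk simply drops the first two lines of its input, which is how the wrapper strips its own two-line shell header before handing the rest to the outer awk:

```shell
# 'NR > 2' is a pattern with no action, so the default action
# { print } runs only for lines whose number exceeds 2.
body=$(printf '#!/bin/bash\nexec awk ...\nBEGIN { x }\n' | awk 'NR > 2')
echo "$body"
```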
There's really no reason to invoke awk in the shebang. Just invoke it via sh:
#!/bin/sh
exec awk '
BEGIN {
print "Starting extracting data"
}
{
print $0
}
END {
print "End of file"
}
' "$@"
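A quick sketch of that approach in action, using hypothetical /tmp paths: save the wrapper as a script, make it executable, and run it on a sample file.

```shell
# Write the sh-wrapper variant to a (hypothetical) script file.
cat > /tmp/extract.sh << 'SCRIPT'
#!/bin/sh
exec awk '
BEGIN { print "Starting extracting data" }
{ print $0 }
END { print "End of file" }
' "$@"
SCRIPT
chmod +x /tmp/extract.sh

# Run it on a one-line sample input.
printf 'line1\n' > /tmp/sample.txt
out=$(/tmp/extract.sh /tmp/sample.txt)
echo "$out"
```

Because the script execs awk with the positional parameters ("$@") appended, it also works with several file arguments or with input piped on stdin.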
Related
I am currently working with a vendor-provided software that is trying to handle sending attachment files to another script that will text-extract from the listed file. The script fails when we receive files from an outside source that contain spaces, as the vendor-supplied software does not surround the filename in quotes - meaning when the text-extraction script is run, it receives a filename that will split apart on the space and cause an error on the extractor script. The vendor-provided software is not editable by us.
This whole process is designed to be an automated transfer, so having this wrench that could be randomly thrown into the gears is an issue.
What we're trying to do is handle the spaced name in our text extractor script, since that is the piece we have some control over. After a quick Google, it seems like changing the IFS value for the script would be the quick solution, but unfortunately that change would only take effect after the word splitting has already mutilated the incoming data.
The script I'm using takes in a -e value, a -i value, and a -o value. These values are sent from the vendor supplied script, which I have no editing control over.
#!/bin/bash
usage() { echo "Usage: $0 -i input -o output -e encoding" 1>&2; exit 1; }
while getopts ":o:i:e:" o; do
case "${o}" in
i)
inputfile=${OPTARG}
;;
o)
outputfile=${OPTARG}
;;
e)
encoding=${OPTARG}
;;
*)
usage
;;
esac
done
shift $((OPTIND-1))
...
...
<Uses the inputfile, outputfile, and encoding variables>
I admit, there may be pieces to this I don't fully understand, and it could be a simple fix, but my end goal is to be able to extract -o, -i, and -e values that each contain one value, regardless of the spaces within each section. I can handle quoting in the script once I can extract the filename value.
The script fragment that you have posted does not have any issues with spaces in the arguments.
The following, for example, does not need quoting (since it's an assignment):
inputfile=${OPTARG}
All other uses of $inputfile in the script should be double quoted.
What matters is how this script is called.
This would fail and would assign only hello to the variable inputfile:
$ ./script.sh -i hello world.txt
The string world.txt would prompt the getopts function to stop processing the command line and the script would continue with the shift (world.txt would be left in $1 afterwards).
The following would correctly assign the string hello world.txt to inputfile:
$ ./script.sh -i "hello world.txt"
as would
$ ./script.sh -i hello\ world.txt
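A minimal sketch of that behavior (the function and file name here are made up): as long as the spaced value arrives as a single argument, getopts stores it whole in OPTARG with no extra handling.

```shell
# getopts receives "hello world.txt" as one OPTARG because the
# caller quoted it; the case arm stores it intact.
parse() {
  while getopts ":o:i:e:" o; do
    case "${o}" in
      i) inputfile=${OPTARG} ;;
    esac
  done
}
parse -i "hello world.txt"
echo "$inputfile"
```

Calling `parse -i hello world.txt` instead would stop option processing at `world.txt` and leave only `hello` in the variable, exactly as described above.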
The following script uses awk to split the arguments while including spaces in the file names. The arguments can be in any order. It does not handle multiple consecutive spaces in an argument; it collapses them to one.
#!/bin/bash
IFS=' '
str=$(printf "%s" "$*")
istr=$(echo "${str}" | awk 'BEGIN {FS="-i"} {print $2}' | awk 'BEGIN {FS="-o"} {print $1}' | awk 'BEGIN {FS="-e"} {print $1}')
estr=$(echo "${str}" | awk 'BEGIN {FS="-e"} {print $2}' | awk 'BEGIN {FS="-o"} {print $1}' | awk 'BEGIN {FS="-i"} {print $1}')
ostr=$(echo "${str}" | awk 'BEGIN {FS="-o"} {print $2}' | awk 'BEGIN {FS="-e"} {print $1}' | awk 'BEGIN {FS="-i"} {print $1}')
inputfile="${istr}"
outputfile="${ostr}"
encoding="${estr}"
# call the jar
There was an issue when calling the jar where Java threw a MalformedUrlException on a filename with a space.
So after reading through the commentary, we decided that although it may not be the right answer for every scenario, the right answer for this specific scenario was to extract the pieces manually.
Because we are building this for a pre-built script passing to it, and we aren't updating that script any time soon, we can accept with certainty that this script will always receive a -i, -o, and -e flag, and there will be spaces between them, which causes all the pieces passed in to be stored as separate positional parameters in $*.
And we can assume that the text after a flag is the response to the flag, until another flag is referenced. This leaves us 3 scenarios:
The variable contains one of the flags
The variable contains the first piece of a parameter immediately after the flag
The variable contains part 2+ of a parameter, and the space in the name was interpreted as a split, and needs to be reinserted.
One of the other issues I kept running into was trying to get string literals to equate to variables in my IF statements. To resolve that issue, I pre-stored all relevant data in array variables, so I could test $variable == $otherVariable.
Although I don't expect it to change, we also handled what to do if the three flags appear in a different order than we anticipate (our assumption was that they list as i,o,e... but we can't see exactly what is passed). The parameters are dumped into an array in the order they were read in, and a parallel array tracks whether the items in slots 0,1,2 relate to i,o,e.
The final result still has one flaw: if there is more than one consecutive space in the filename, the whitespace is trimmed before processing, and I can only account for one space. But seeing as we processed over 4000 files before encountering one with a space, I find it unlikely, given the naming conventions, that we would encounter something with more than one space.
At that point, we would have to be stepping in for a rare intervention anyways.
Final code change is as follows:
#!/bin/bash
IFS='|'
position=-1
ioeArray=("" "" "")
previous=""
flagArr=("-i" "-o" "-e" " ")
ioePattern=(0 1 2)
#echo "for loop:"
for i in $*; do
#printf "%s\n" "$i"
if [ "$i" == "${flagArr[0]}" ] || [ "$i" == "${flagArr[1]}" ] || [ "$i" == "${flagArr[2]}" ]; then
((position += 1));
previous=$i;
case "$i" in
"${flagArr[0]}")
ioePattern[$position]=0
;;
"${flagArr[1]}")
ioePattern[$position]=1
;;
"${flagArr[2]}")
ioePattern[$position]=2
;;
esac
continue;
fi
if [[ $previous == "-"* ]]; then
ioeArray[$position]=${ioeArray[$position]}$i;
else
ioeArray[$position]=${ioeArray[$position]}" "$i;
fi
previous=$i;
done
echo "extracting (${ioeArray[${ioePattern[0]}]}) to (${ioeArray[${ioePattern[1]}]}) with (${ioeArray[${ioePattern[2]}]}) encoding."
inputfile="${ioeArray[${ioePattern[0]}]}";
outputfile="${ioeArray[${ioePattern[1]}]}";
encoding="${ioeArray[${ioePattern[2]}]}";
I have to delete a line in a file from inside a shell script.
I am trying this:
linenumber=0
## Check if server IP exists
if grep -wq $serverip $FILE; then
echo "IP exists"
linenumber=$(awk -v serverip="$serverip" '$0 ~ serverip {print NR}' $FILE)
echo "$linenumber"
sed -e '${$linenumber}d' $FILE
fi
Basically I extract the line number and then want to delete it.
sed -e '1d' $FILE --> Works on CLI but inside script does not work
Why? How to get it working ?
This is simply a case of using the incorrect quotes around your sed command, so the variable isn't being used. Ignoring the fact that you're unnecessarily using 3 tools when 1 would suffice, the fix is this:
sed -e "${linenumber}d" "$FILE"
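A quick sketch of why the double quotes matter: the shell expands ${linenumber} before sed ever sees the script, so sed receives a plain numeric address.

```shell
# Delete line 2 of a three-line stream. Inside double quotes,
# ${linenumber} expands to 2, so sed runs the script "2d".
linenumber=2
out=$(printf 'a\nb\nc\n' | sed -e "${linenumber}d")
echo "$out"
```

With single quotes, sed would be handed the literal text `${linenumber}d` and fail.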
Perhaps your requirement is more complex than it appears but I would suggest changing your entire script to this:
awk -v serverip="$serverip" '!($0 ~ serverip)' "$FILE"
This prints every line that doesn't contain the shell variable $serverip. It is assumed that you have escaped any regex meta-characters present in the variable.
Alternatively (and more succinctly):
sed "/$serverip/d" "$FILE"
If you actually want the messages to be printed out (I assumed that they were for debugging), then that's easy enough to achieve:
awk -v serverip="$serverip" '$0 ~ serverip { print "IP exists"; print NR; next } 1' "$FILE"
If you're not familiar with the 1 at the end, it's just a common shorthand which causes awk to print each line (1 is always true and the default action is { print }).
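A stripped-down sketch of that pattern-plus-1 structure, on made-up input: matching lines are announced and skipped via next, while everything else falls through to the bare 1, whose default action is { print }.

```shell
# /b/ lines trigger the explicit action and are then skipped (next);
# all other lines hit the always-true pattern 1 and are printed.
out=$(printf 'a\nb\nc\n' | awk '/b/ { print "found: " NR; next } 1')
echo "$out"
```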
In following shell script I want to perform two different tasks depending on file type,
but it is giving an error: "[==c]: command not found"
echo "enter file name"
read num
var_check= echo $str |awk -F . '{if (NF>1) {print $NF}}'
if ["$var_check"=="c"];then
echo "Some task for c"
elif ["$var_check"=="cpp"];then
echo "Some task for cpp"
else
echo "Wrong file extension"
fi
You wrote:
if ["$var_check"=="c"];then
The [ command is a command; its name must be surrounded by spaces (put simplistically).
if [ "$var_check" == "c" ]; then
The last argument, ], must also be preceded by a space. The operands within must also be space separated; they need to be separate arguments. The rules for the [[ ... ]] operator are a bit different, but using spaces helps people read the code even there. What you wrote is a bit like expecting:
ls"-l"/dev/tty
to work; it won't.
You also need to double check whether your test or [ operator supports ==; the normal form is =.
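Putting those points together, a minimal sketch of the corrected test under POSIX sh, with spaces around every token and the portable = operator:

```shell
# [ is a command: its name, each operand, the operator, and the
# closing ] must all be separate, space-delimited arguments.
var_check=c
if [ "$var_check" = "c" ]; then
  result="Some task for c"
else
  result="Wrong file extension"
fi
echo "$result"
```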
The line:
var_check= echo $str |awk -F . '{if (NF>1) {print $NF}}'
This runs the echo command with var_check set as an environment variable, which is unlikely to be what you wanted. You almost certainly intended to write:
var_check=$(echo $str |awk -F . '{if (NF>1) {print $NF}}')
This runs the echo and awk commands and captures the output in var_check. Use the $(...) notation in preference to the older but more complex to use `...` notation. In simple cases, they look the same; when you nest them, the $(...) notation is far, far simpler to understand and use.
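A small sketch of the nesting point (the path here is arbitrary): with $(...), one substitution can sit inside another with no escaping, whereas the backquote form would need backslashes at the inner level.

```shell
# Inner $(dirname ...) runs first, producing /usr/local;
# the outer $(basename ...) then reduces that to its last component.
inner=$(basename "$(dirname /usr/local/bin)")
echo "$inner"
```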
Also, looking on the larger scale (3 lines instead of just 1 line):
echo "enter file name"
read num
var_check=$(echo $str |awk -F . '{if (NF>1) {print $NF}}')
You read the file name into variable num; you then echo $str instead of $num. If you've already got $str set somewhere earlier in the script (in unshown code), what you've got may be fine. Taken as a standalone fragment, it isn't right.
You could also simplify the awk a little:
var_check=$(echo $str |awk -F . 'NF > 1 {print $NF}')
This would work the same as what you wrote, but uses fewer parentheses and braces.
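A quick sketch of that one-liner on two made-up file names, showing why the NF > 1 guard is there: a name with no dot yields a single field, and awk prints nothing rather than echoing the whole name back.

```shell
# With -F . each dot-separated piece is a field; $NF is the last one.
# NF > 1 ensures there was at least one dot.
ext=$(echo "main.cpp" | awk -F . 'NF > 1 {print $NF}')
noext=$(echo "Makefile" | awk -F . 'NF > 1 {print $NF}')
echo "$ext"
```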
New guy here with a problem that will hopefully have an easy solution, but I just can't seem to manage.
So, I have a large list of files that I need to process using the same command line program, and I'm trying to write a small shell script to automate this. I wrote something that will read the input file name from a text file, and repeat the command for each of those files. So far so good.
My problem, though, is with naming the output. Each file is named in the general format "lane_number_bla_bla_bla", and they are processed in pairs. So, there will be a "lane_1_bla_bla_bla_001" and "lane_1_bla_bla_bla_002" that need to combine into a single output file. For this, I'm trying to use awk to read the sample number from the .txt list of input files and parse it into the output file number.
Here's the code I came up with (note that the echo statement before the command is there just for testing; it's removed when it comes time to run the actual program; also, this is not the actual command, which is rather more complicated, but the principle still applies):
echo "Which input1 should I use?"
read text
input1=$text
echo "Which input2 should I use?"
read text
input2=$text
echo "How many lines?"
read text
n=$text
for i in $(seq 1 $n)
do
awkinput1=$(awk NR==$i $input1)
awkinput2=$(awk NR==$i $input2)
num=$(awk 'NR==$i{print $2 }' FS="_" $input1)
lane=$(awk 'NR==$i{print $1 }' FS="_" $input1)
echo "command $awkinput1.in > $awkinput1.out && command $awkinput2.in > $awkinput2.out && command cat $awkinput1.out $awkinput2.in > $num-$lane-CAT.out &"
if (( $i % 10 == 0 )); then wait; fi # Limit to 10 concurrent subshells.
done
When I run this, both $awkinput fields get replaced properly in the command line by the appropriate filename, but not the $num and $lane fields, which print nothing.
So, what am I doing wrong? I'm sure it's pretty simple, but I tried quite a lot of different ways to format the relevant awk command, and nothing seems to work. I'm doing this on a remote linux server using SSH protocol, if it makes a difference.
Thanks a lot!
The shell does not expand $i inside single quotes ('), so the quoted string has to be terminated before $i.
FS must also be set before the lines are parsed (for example in a BEGIN block).
The following code will work:
num=$(awk 'BEGIN{FS="_"} NR=='$i'{print $2 }' $input1)
lane=$(awk 'BEGIN{FS="_"} NR=='$i'{print $1 }' $input1)
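An alternative that avoids splicing quotes altogether is to pass the shell variable in with awk's -v option. This is a sketch on made-up two-line input, equivalent in behavior to the corrected lines above:

```shell
# -v i="$i" makes the shell variable available inside awk as i,
# so the whole awk program can stay in single quotes.
i=2
input=$(printf 'lane_1_a\nlane_2_b\n')
num=$(printf '%s\n' "$input" | awk -v i="$i" 'BEGIN{FS="_"} NR==i {print $2}')
echo "$num"
```

This is also safer than quote-splicing when the value could contain characters that awk source code would misinterpret.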
The code below will be more efficient:
while read in1 ; do
read in2 <&3
num=$(awk 'BEGIN{FS="_"} {print $2 }' <<<"$in1")
lane=$(awk 'BEGIN{FS="_"} {print $1 }' <<<"$in1")
...
done <$input1 3<$input2
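A self-contained sketch of that dual-stream loop on hypothetical /tmp files: the loop's stdin is redirected from the first input, while file descriptor 3 carries the second, so `read in2 <&3` advances them in lockstep.

```shell
# Two parallel inputs, read line-by-line in the same iteration.
printf 'a1\na2\n' > /tmp/in1.txt
printf 'b1\nb2\n' > /tmp/in2.txt
pairs=""
while read in1; do
  read in2 <&3          # pull the matching line from fd 3
  pairs="$pairs$in1-$in2 "
done < /tmp/in1.txt 3< /tmp/in2.txt
echo "$pairs"
```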
Beyond Compare provides "Select for compare" and "Compare to Selected" by using two nautilus scripts (stored in /home/user/.gnome2/nautilus-scripts).
Script 1: Select for compare
#!/bin/sh
quoted=$(echo "$NAUTILUS_SCRIPT_SELECTED_FILE_PATHS" | awk 'BEGIN { FS = "\n" } { printf "\"%s\" ", $1 }' | sed -e s#\"\"##)
echo "$quoted" > $HOME/.beyondcompare/nautilus
Script 2: Compare to Selected
#!/bin/sh
arg2=$(cat $HOME/.beyondcompare/nautilus)
arg1=$(echo "$NAUTILUS_SCRIPT_SELECTED_FILE_PATHS" | awk 'BEGIN { FS = "\n" } { printf "\"%s\" ", $1 }' | sed -e s#\"\"##)
bcompare $arg1 $arg2
I am trying to do similar scripts for Meld, but it is not working.
I am not familiar with shell scripts. Can anyone help me understand this:
quoted=$(echo "$NAUTILUS_SCRIPT_SELECTED_FILE_PATHS" | awk 'BEGIN { FS = "\n" } { printf "\"%s\" ", $1 }' | sed -e s#\"\"##)
so that I can adapt to meld.
If you are not rolling your own solution for the sake of learning, I would suggest installing the diff-ext extension to nautilus. It is cross platform and if you are running Debian/Ubuntu installing it should be as simple as sudo apt-get install diff-ext.
Check out some screenshots here - http://diff-ext.sourceforge.net/screenshots.shtml
The quoted=$(...) assigns whatever output there is to the variable named quoted, which can be used later in the script as $quoted OR ${quoted} OR "${quoted}" OR "$quoted"
The '|' char is called a 'pipe' in unix/linux and it connects the output of the preceding command to feed into the following command.
So you just take the script apart 1 piece at a time and see what it does,
quoted=$(
# I would execute below by itself first
echo "$NAUTILUS_SCRIPT_SELECTED_FILE_PATHS"
# then add on this piped program to see how data gets transformed
| awk 'BEGIN { FS = "\n" } { printf "\"%s\" ", $1 }'
# then add this
| sed -e s#\"\"##
# the capturing of the output to the var 'quoted' is the final step of code
)
# you **cannot** copy paste this whole block of code and expect it to work ;-)
I don't know what is supposed to be in $NAUTILUS_SCRIPT_SELECTED_FILE_PATHS, so it is hard to show you here. AND, that variable is not defined in any of the code you specify here, so you may only get a blank line when you echo its value. Be prepared to do some research on how that value gets set AND what the correct values are.
Also, I notice that your code is 'prefixed' as #!/bin/sh. If that were a truly ancient Bourne shell, command substitution like quoted=$(....) would not work and would generate an error message (any modern POSIX /bin/sh, such as dash, does support it). Presumably your system is really using bash for /bin/sh. You can eliminate any possible confusion in the future (when changing to a system where /bin/sh is not bash) by changing the 'shebang' to #! /bin/bash.
I hope this helps.
I just discovered diff-ext thanks to this post, excellent!
The first try I did failed: by default diff-ext does not handle backup files (*~ and *.bak). To enable this, run:
$ diff-ext-setup
and in the Mime types pane, check application/x-trash.
Now you can compare a file and its backup.