Shell Script error: "head: invalid trailing option -- 1" - linux

I have this code in my shell (bash) script for splitting a file into smaller parts:
for (( i=$start; i<=$lineCount; i=i+$interval))
do
temp=`expr $i + $interval`;
if [ $temp -le $lineCount ]
then
command="head -$temp $fileName | tail -$interval > $tmpFileName";
echo "Creating Temp File: $command";
else
lastLines=`expr $lineCount - $i`;
command="head -$temp $fileName | tail -$lastLines > tmpFileName";
echo "Creating Temp File: $command";
fi
`$command`;
done
It prints the following output (the echo goes to stdout, the error to stderr):
Creating Temp File: head -10 tmp.txt | tail -10 > tmp.txt_TMP
head: invalid trailing option -- 1
Try `head --help' for more information.
But the command printed: head -10 tmp.txt | tail -10 > tmp.txt_TMP runs correctly on the command line.
What am I doing wrong?

When you put a pipe | in a variable, the shell treats it as an ordinary character when the variable is expanded, not as a pipe. Ditto for redirection operators like >, <, ...
An ugly way would be to use eval.
A better approach would be to split your command into different parts so as to get rid of pipes and redirection operators in it.
For example:
command="head -$temp $fileName | tail -$lastLines > tmpFileName";
would be written as:
cmd1="head -$temp $fileName";
cmd2="tail -$lastLines";
and executed by saying:
$cmd1 | $cmd2 > tmpFileName
(The variables are deliberately left unquoted here so that word splitting breaks them back into a command plus its arguments; quoting them as "$cmd1" would make the shell look for a single program literally named "head -$temp $fileName".)
Moreover, you don't need backticks to execute a command that is stored in a variable. Simply say:
$command
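A minimal demonstration of both the failure and the split-command fix (the file names demo.txt and out.txt are illustrative):

```shell
#!/bin/bash
printf '%s\n' one two three four > demo.txt

# Broken: the stored '|' and '>' reach head as literal arguments
cmd="head -2 demo.txt | tail -1 > out.txt"
$cmd >/dev/null 2>&1 || true   # fails: head treats '|', 'tail', '>' as file names

# Working: keep the shell operators out of the variables
cmd1="head -2 demo.txt"
cmd2="tail -1"
$cmd1 | $cmd2 > out.txt
cat out.txt   # prints: two
```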

Problem is here:
command="head -$temp $fileName | tail -$interval > $tmpFileName"
and later:
`$command`
Instead of storing the whole piped command in a string, you can execute the command directly:
head -$temp "$fileName" | tail -$interval > "$tmpFileName"

There are several issues here. First of all, let's look at a refactored version with correct quoting and several other improvements:
for (( i=start; i<=lineCount; i+=interval )); do
    (( temp = i + interval ))
    if (( temp <= lineCount )); then
        echo "Creating Temp File using 'tail -n $interval'"
        head -n "$temp" "$fileName" | tail -n "$interval" > "$tmpFileName"
    else
        (( lastLines = lineCount - i ))
        echo "Creating Temp File using 'tail -n $lastLines'"
        head -n "$temp" "$fileName" | tail -n "$lastLines" > "$tmpFileName"
    fi
done
I have changed all the arithmetic to proper ((...)) arithmetic syntax. This is what you want, because it is far more readable.
Then, it seems like you want to put a command into a variable and then run it. To cut a long story short: you simply should not do that; see BashFAQ #050 ("I'm trying to put a command in a variable, but the complex cases always fail!") for why.
Also, this is not C++; you don't have to end every line with ;.
You have also missed the $ character in tmpFileName in the else branch of your code (> tmpFileName instead of > "$tmpFileName").
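Incidentally, if the end goal is simply to cut a file into fixed-size pieces, coreutils split already does the whole job in one command (the chunk_ prefix and the 25-line sample below are illustrative):

```shell
#!/bin/bash
interval=10
fileName=tmp.txt
seq 25 > "$fileName"                     # sample input: 25 numbered lines
split -l "$interval" "$fileName" chunk_  # creates chunk_aa, chunk_ab, chunk_ac
wc -l chunk_*                            # 10, 10 and 5 lines respectively
```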

Related

Print second last line from variable in bash

VAR="1\n2\n3"
I'm trying to print out the second last line. One liner in bash!
I've gotten so far: printf -- "$VAR" | head -2
It however prints out too much.
I can do this with a file no problem: tail -2 ~/file | head -1
You've almost done this task yourself. Try
VAR="1\n2\n3"; printf -- "$VAR"|tail -2|head -1
Here is one pure bash way of doing this:
readarray -t arr < <(printf -- "$VAR") && echo "${arr[-2]}"
2
You may also use this awk as a single command:
VAR="1\n2\n3"
awk -F '\\\\n' '{print $(NF-1)}' <<< "$VAR"
2
maybe more efficient using a temporary variable and using expansions
var=$'1\n2\n3' ; tmpvar=${var%$'\n'*} ; echo "${tmpvar##*$'\n'}"
Use echo -e for backslash interpretation, translating \n into real newlines, and then print the line of interest by number using NR.
$ echo -e "${VAR}" | awk 'NR==2'
2
With more lines, tail and head can be combined to print any particular line number.
$ echo -e "$VAR" | tail -2 | head -1
2
or do a fancy sed, where you keep the previous line in the hold space (x exchanges the pattern and hold spaces) and keep deleting until the last line, which then prints the held previous line,
$ echo -e "$VAR" | sed 'x;$!d'
2
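If the input may be long, a stream-friendly variant is to keep a two-line window in awk and print it at end of input; this needs neither the line count nor a temporary file (prev1/prev2 are just illustrative variable names):

```shell
VAR="1\n2\n3"
printf -- "$VAR\n" | awk '{prev2 = prev1; prev1 = $0} END {print prev2}'
# 2
```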

How to efficiently loop through the lines of a file in Bash?

I have a file example.txt with about 3000 lines with a string in each line. A small file example would be:
>cat example.txt
saudifh
sometestPOIFJEJ
sometextASLKJND
saudifh
sometextASLKJND
IHFEW
foo
bar
I want to check all repeated lines in this file and output them. The desired output would be:
>checkRepetitions.sh
found two equal lines: index1=1 , index2=4 , value=saudifh
found two equal lines: index1=3 , index2=5 , value=sometextASLKJND
I made a script checkRepetitions.sh:
#!/bin/bash
size=$(cat example.txt | wc -l)
for i in $(seq 1 $size); do
    i_next=$((i+1))
    line1=$(cat example.txt | head -n$i | tail -n1)
    for j in $(seq $i_next $size); do
        line2=$(cat example.txt | head -n$j | tail -n1)
        if [ "$line1" = "$line2" ]; then
            echo "found two equal lines: index1=$i , index2=$j , value=$line1"
        fi
    done
done
However this script is very slow, it takes more than 10 minutes to run. In python it takes less than 5 seconds... I tried to store the file in memory by doing lines=$(cat example.txt) and doing line1=$(cat $lines | cut -d',' -f$i) but this is still very slow...
When you do not want to use awk (a good tool for the job, parsing the input only once),
you can run through the lines several times. Sorting is expensive, but this solution avoids the loops you tried.
grep -Fnxf <(uniq -d <(sort example.txt)) example.txt
With uniq -d <(sort example.txt) you find all lines that occur more than once. Then grep searches for these (option -f) complete (-x) lines without treating them as regular expressions (-F) and prefixes each match with the line number where it occurs (-n).
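On the example file this prints each duplicated line prefixed with its line number; a short sketch of the commands and their output:

```shell
#!/bin/bash
printf '%s\n' saudifh sometestPOIFJEJ sometextASLKJND saudifh \
    sometextASLKJND IHFEW foo bar > example.txt
grep -Fnxf <(uniq -d <(sort example.txt)) example.txt
# 1:saudifh
# 3:sometextASLKJND
# 4:saudifh
# 5:sometextASLKJND
```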
See why-is-using-a-shell-loop-to-process-text-considered-bad-practice for some of the reasons why your script is so slow.
$ cat tst.awk
{ val2hits[$0] = val2hits[$0] FS NR }
END {
    for (val in val2hits) {
        numHits = split(val2hits[val], hits)
        if ( numHits > 1 ) {
            printf "found %d equal lines:", numHits
            for ( hitNr=1; hitNr<=numHits; hitNr++ ) {
                printf " index%d=%d ,", hitNr, hits[hitNr]
            }
            print " value=" val
        }
    }
}
$ awk -f tst.awk file
found 2 equal lines: index1=1 , index2=4 , value=saudifh
found 2 equal lines: index1=3 , index2=5 , value=sometextASLKJND
To give you an idea of the performance difference using a bash script that's written to be as efficient as possible and an equivalent awk script:
bash:
$ cat tst.sh
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
    (( ++lineNum ))
    if [[ ${lines[$line]} ]]; then
        printf 'Content previously seen on line %s also seen on line %s: %s\n' \
            "${lines[$line]}" "$lineNum" "$line"
    fi
    lines[$line]=$lineNum
done < "$1"
$ time ./tst.sh file100k > ou.sh
real 0m15.631s
user 0m13.806s
sys 0m1.029s
awk:
$ cat tst.awk
lines[$0] {
    printf "Content previously seen on line %s also seen on line %s: %s\n", \
        lines[$0], NR, $0
}
{ lines[$0]=NR }
$ time awk -f tst.awk file100k > ou.awk
real 0m0.234s
user 0m0.218s
sys 0m0.016s
There is no difference in the output of the two scripts:
$ diff ou.sh ou.awk
$
The above is using 3rd-run timing to avoid caching issues and being tested against a file generated by the following awk script:
awk 'BEGIN{for (i=1; i<=10000; i++) for (j=1; j<=10; j++) print j}' > file100k
When the input file had zero duplicate lines (generated by seq 100000 > nodups100k) the bash script executed in about the same amount of time as it did above while the awk script executed much faster than it did above:
$ time ./tst.sh nodups100k > ou.sh
real 0m15.179s
user 0m13.322s
sys 0m1.278s
$ time awk -f tst.awk nodups100k > ou.awk
real 0m0.078s
user 0m0.046s
sys 0m0.015s
To demonstrate a relatively efficient (within the limits of the language and runtime) native-bash approach, which you can see running in an online interpreter at https://ideone.com/iFpJr7:
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
    lineNum=$(( lineNum + 1 ))
    if [[ ${lines[$line]} ]]; then
        printf 'found two equal lines: index1=%s, index2=%s, value=%s\n' \
            "${lines[$line]}" "$lineNum" "$line"
    fi
    lines[$line]=$lineNum
done <example.txt
Note the use of while read to iterate line-by-line, as described in BashFAQ #1: How can I read a file line-by-line (or field-by-field)?; this permits us to open the file only once and read through it without needing any command substitutions (which fork off subshells) or external commands (which need to be individually started up by the operating system every time they're invoked, and are likewise expensive).
The other part of the improvement here is that we're reading the whole file only once -- implementing an O(n) algorithm -- as opposed to running O(n^2) comparisons as the original code did.

How to use the pipe in a file with head-tail operation?

size=$(wc -l < "$1")
if [ "$size" -gt 0 ]
then
    tr "[:lower:]" "[:upper:]" < $1 > output
    for (( i=1; i <= "$size"; ++i ))
    do
        echo "Line " $i $(head -"$i" > output | tail -1 > output)
    done
fi
Hi, guys!
I have a problem with this little code. Everything works fine except the head-tail thing. What I want to do is just display line number "i" of a file.
The result I receive is just the last line ($size).
I think maybe something is wrong with the input of tail. The head -"$i" doesn't stop at the specified line. :(
Any thoughts?
Ohhhh... I just realised: as input for my tail I give the same input as for head.
The solution is to give tail the result from head. How do I do that? :-/
You don't need to redirect the output from head to a file; otherwise, the pipe does not get any input at all. Also, use >> to append results, or you will just keep overwriting the file on each iteration of the loop. But make sure to delete the output file before each new call to the script, or results from previous runs will keep accumulating.
echo "Line " $i $(head -"$i" $infile | tail -1 >> output)
Use read to fetch a line of input from the file.
# Since `1` is always true, essentially count up forever
for ((i=1; 1; ++i)); do
    # break when a read fails to read a line
    IFS= read -r line || break
    echo "Line $i: $(tr '[:lower:]' '[:upper:]' <<< "$line")"
done < "$1" > output
A more standard approach is to iterate over the file and maintain i explicitly.
i=1
while IFS= read -r line; do
    echo "Line $i: $(tr '[:lower:]' '[:upper:]' <<< "$line")"
    ((i++))
done < "$1" > output
I think you're re-implementing cat -n with the prefix "Line ". If so, awk to the rescue!
awk '{print "Line " NR, toupper($0)}'
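Spelled out as a complete, hypothetical replacement for the whole loop (the script's tr step uppercases, hence toupper; the file names here are illustrative):

```shell
#!/bin/bash
printf '%s\n' alpha beta > input.txt
awk '{print "Line " NR ": " toupper($0)}' input.txt > output
cat output
# Line 1: ALPHA
# Line 2: BETA
```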
I made it. :D
The trick is to put the output of head into another file that then serves as input for tail, like this:
echo "Line " $i $(head -"$i" < output >outputF | tail -1 < outputF)
Your questions made me think differently. Thank you!

File that autoruns itself

The same way it's possible to write a file that self-extracts, I'm looking for a way to autorun a program embedded in a script (or whatever it takes). I want the program to be part of the script, because I just want one file. It's actually a challenge: I have an xz-compressed program, and I want to be able to run it without the user having to invoke xz at all (just a ./theprogram).
Any idea?
Autorun after doing what? Login? Call it in ~/.bashrc. During boot? Write an appropriate /etc/init.d/yourprog and link it to the desired runlevel. Selfextract? Make it a shell archive (shar file). See the shar utility, http://linux.die.net/man/1/shar
Sorry but I was just thinking... Something like this would not work?
(I am assuming it is a script...)
#!/bin/bash
cat << 'EOF' > yourfile
yourscript
EOF
chmod +x yourfile
./yourfile
Still, it's pretty hard to understand exactly what you are trying to do... it seems to me that the "autorun" is pretty similar to "call the program from within the script".
I had written a script for this. This should help:
#!/bin/bash
set -e
payload=$(grep --binary-files=text -n '^PAYLOAD:' "$0" | cut -d: -f1)
filename=$(head -n "$payload" "$0" | tail -n 1 | cut -d: -f2-)
tail -n +$(( payload + 1 )) "$0" > "/tmp/$filename"
set +e
#Do whatever with the payload
exit 0
#Command to add payload:
#read x; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
#Note: Strictly NO characters after "PAYLOAD:", not even a newline, since the payload filename gets appended directly to that line...
PAYLOAD:
Sample usage:
Suppose myNestedScript.sh contains the following:
#!/bin/bash
echo hello world
Then run
x=myNestedScript.sh; ls $x && ( cp 'binary_script.sh' ${x}_binary_script.sh; echo $x >> ${x}_binary_script.sh; cat $x >> ${x}_binary_script.sh )
It will generate the file below, which you can execute directly. Upon execution, it extracts myNestedScript.sh to /tmp and runs it.
#!/bin/bash
set -e
payload=$(grep --binary-files=text -n '^PAYLOAD:' "$0" | cut -d: -f1)
filename=$(head -n "$payload" "$0" | tail -n 1 | cut -d: -f2-)
tail -n +$(( payload + 1 )) "$0" > "/tmp/$filename"
set +e
chmod 755 "/tmp/$filename"
"/tmp/$filename"
exit 0
PAYLOAD:myNestedScript.sh
#!/bin/bash
echo hello world
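Since the original goal was an xz-compressed program in a single self-running file, the same marker technique can extract and run a compressed payload. This is a sketch under stated assumptions: GNU grep (-a) and xz are available, and the file names, the __PAYLOAD__ marker, and the /tmp handling are all illustrative:

```shell
#!/bin/bash
# sample payload: a tiny script, compressed with xz
printf '#!/bin/bash\necho hello from payload\n' > prog.sh
xz -c prog.sh > prog.xz

# build selfrun.sh: extractor header, marker line, then the raw compressed bytes
cat > selfrun.sh <<'HDR'
#!/bin/bash
# find the marker, decompress everything after it, run it, clean up
n=$(grep -an '^__PAYLOAD__$' "$0" | head -n 1 | cut -d: -f1)
tmp=$(mktemp)
tail -n +$((n + 1)) "$0" | xz -d > "$tmp"
chmod +x "$tmp"
"$tmp" "$@"; status=$?
rm -f "$tmp"
exit "$status"
__PAYLOAD__
HDR
cat prog.xz >> selfrun.sh
chmod +x selfrun.sh

./selfrun.sh
```

The exit "$status" before the marker guarantees bash never tries to execute the binary payload, and the ^...$ anchors keep the grep from matching its own command line.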

Parsing a CSV string in Shell Script and writing it to a File

I am not a Linux scripting expert and I have exhausted my knowledge on this matter. Here is my situation.
I have a list of states passed as a command-line argument to a shell script (e.g. "AL,AK,AS,AZ,AR,CA..."). The shell script needs to extract each state code and write it to a file (states.txt), one state per line. See below:
AL
AK
AS
AZ
AR
CA
..
..
How can this be achieved using a Linux shell script?
Thanks in advance.
Use tr:
echo "AL,AK,AS,AZ,AR,CA" | tr ',' '\n' > states.txt
echo "AL,AK,AS,AZ,AR,CA" | awk -F, '{for (i = 1; i <= NF; i++) print $i}';
Naive solution:
echo "AL,AK,AS,AZ,AR,CA" | sed 's/,/\n/g'
(GNU sed interprets \n in the replacement as a newline; BSD sed would need a literal escaped newline instead.)
I think awk is the simplest solution, but you could try using cut in a loop.
Sample script (outputs to stdout, but you can just redirect it):
#!/bin/bash
# Check for input
if (( ${#1} == 0 )); then
    echo "No input data supplied"
    exit
fi
# Initialise first input
i=$1
# While $i still contains commas
while echo "$i" | grep -q ,; do
    # Get first item of $i
    j=$(echo "$i" | cut -d ',' -f 1)
    # Shift off the first item of $i
    i=$(echo "$i" | cut --complement -d ',' -f 1)
    echo "$j"
done
# Display the last item
echo "$i"
Then you can just run it as ./script.sh "AL,AK,AS,AZ,AR,CA" > states.txt (assuming you save it as script.sh in the local directory and give it execute permission)
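A pure-bash alternative with no external commands at all: split the argument on commas into an array with read -a, then print one element per line (a sketch; the variable names are illustrative):

```shell
#!/bin/bash
input="AL,AK,AS,AZ,AR,CA"
IFS=',' read -ra states <<< "$input"
printf '%s\n' "${states[@]}" > states.txt
head -n 2 states.txt
# AL
# AK
```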
