Difference between `exec n<&0 < file` and `exec n<file`, and some general questions regarding the exec command - linux

As a newbie to shell scripting, the exec command has always confused me, and exploring it together with the while loop triggered the following 4 questions:
What is the difference between syntax 1 and syntax 2 below?
syntax 1:
while read LINE
do
: # manipulate file here
done < file
syntax 2:
exec n<&0 < file
while read LINE
do
: # manipulate file here
done
exec 0<&n n<&-
Kindly elaborate lucidly on the operation of exec n<&0 < file.
Is the exec n<&0 < file command equivalent to exec n<file? (If not, what's the difference between the two?)
I had read somewhere that in the Bourne shell and older versions of ksh, a problem with the while loop is that it is executed in a subshell. This means that any changes to the script environment, such as exporting variables and changing the current working directory, might not be present after the while loop completes.
As an example, consider the following script:
#!/bin/sh
if [ -f “$1” ] ; then
i=0
while read LINE
do
i=`expr $i + 1`
done < “$1”
echo $i
fi
This script tries to count the number of lines in the file specified to it as an argument.
Executing this script on the following file
$ cat dirs.txt
/tmp
/usr/local
/opt/bin
/var
can produce the following incorrect result:
0
Although you are incrementing the value of $i using the command
i=`expr $i + 1`
the value of $i is not preserved when the while loop completes.
In this case, you need to change a variable’s value inside the while loop and then use that value outside the loop.
One way to solve this problem is to redirect STDIN prior to entering the loop and then restore STDIN after the loop completes.
The basic syntax is
exec n<&0 < file
while read LINE
do
: # manipulate file here
done
exec 0<&n n<&-
My question here is:
In the Bourne shell and older versions of ksh, although the while loop is executed in a subshell, how does the exec command here help retain a variable's value even after the while loop completes? In other words, how does exec accomplish the task of changing a variable's value inside the while loop and then using that value outside the loop?

The difference should be nothing in modern shells (both should be POSIX compatible), with some caveats:
There are likely thousands of unique shell binaries in use, some of which are missing common features or simply buggy.
Only the first version will behave as expected in an interactive shell, since the shell will close as soon as standard input gets EOF, which will happen once it finishes reading file.
The while loop reads from FD 0 in both cases, making the exec pointless if the shell supports < redirection to while loops. To read from FD 9 you have to use done <&9 (POSIX) or read -u 9 (in Bash).
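For instance, a minimal sketch of the FD 9 variant (assuming bash for read -u, and a file named file):
exec 9< file              # open file on FD 9
while read -u 9 LINE; do  # read from FD 9; stdin (FD 0) is left untouched
    printf '%s\n' "$LINE"
done
exec 9<&-                 # close FD 9 afterwards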
exec (in this case, see help exec/man exec) applies the redirections following it to the current shell, and they are applied left-to-right. For example, exec 9<&0 < file points FD 9 to where FD 0 (standard input) is currently pointing, effectively making FD 9 a "copy" of FD 0. After that file is sent to standard input (both FD 0 and 9).
Run a shell within a shell to see the difference between the two (commented to explain):
$ echo foo > file
$ exec "$SHELL"
$ exec 9<&0 < file
$ foo # The contents of the file are executed in the shell
bash: foo: command not found
$ exit # Because of the end of file; equivalent to pressing Ctrl-d
$ exec "$SHELL"
$ exec 9< file # Nothing happens, simply sends file to FD 9
This is a common misconception about *nix shells: variables declared in subshells (such as the one created when while is part of a pipeline) are not available to the parent shell. This is by design, not a bug. Many other answers here and elsewhere on Stack Exchange cover this.

So many questions... but all of them seem variants of the same one, so I'll go on...
exec without a command is used to do redirection in the current process. That is, it changes the files attached to different file descriptors (FD).
Question #1
I think it should be this way. On my system the {} are mandatory:
exec {n}<&0 < file
This line dups FD 0 (standard input) and stores the new FD number in the variable n. Then it attaches file to the standard input.
while read LINE ; do ... done
This line reads lines into variable LINE from the standard input, that will be file.
exec 0<&n {n}<&-
And this line dups the FD from n back into 0 (the original standard input), which automatically closes file, and then closes n (the duped copy of the original stdin).
The other syntax:
while read LINE; do ... done < file
does the same, but in a less convoluted way.
Question #2
exec {n}<&0 < file
These are redirections, and they are executed left to right. The first one, {n}<&0, does a dup(0) (see man dup) and stores the resulting new FD in the variable n. Then <file does an open("file"...) and assigns the result to FD 0.
Question #3
No. exec {n}<file opens the file and assigns the new FD to variable n, leaving the standard input (FD 0) untouched.
Question #4
I don't know about older versions of ksh, but the usual problem arises when using a pipe.
grep whatever | while read LINE; do ... done
Then the while command is run in a subshell. The same is true if it is to the left of the pipe.
while read LINE ; do ... done | grep whatever
But for simple redirects there is no subshell:
while read LINE ; do ... done < aaa > bbb
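To see the difference, a quick sketch (behavior as in bash, where every part of a pipeline runs in a subshell):
printf '/tmp\n/usr/local\n/opt/bin\n/var\n' > dirs.txt
i=0
cat dirs.txt | while read LINE; do i=$((i+1)); done
echo $i    # prints 0: the loop ran in a subshell
i=0
while read LINE; do i=$((i+1)); done < dirs.txt
echo $i    # prints 4: a plain redirection creates no subshell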
Extra unnumbered question
About your example script: it works for me once I changed the typographic quotes to normal double quotes ;-):
#!/bin/sh
if [ -f "$1" ] ; then
i=0
while read LINE
do
i=`expr $i + 1`
done < "$1"
echo $i
fi
For example, if the file is test:
$ ./test test
9
And about your last question: the subshell is not created by while but by the pipe |, or perhaps, in older versions of ksh, by the redirection <. What the exec trick does is avoid that redirection, so no subshell is created.

Let me answer your questions out-of-order.
Q #2
The command exec n<&0 < file is not valid syntax as written; the n presumably stands for "some arbitrary number". That said, for example
exec 3<&0 < file
executes two redirections, in sequence: it duplicates/copies the standard input file descriptor, which is 0, as file descriptor 3. Next, it "redirects" file descriptor 0 to read from file file.
Later, the command
exec 0<&3 3<&-
first copies the standard input file descriptor back from the saved file descriptor 3, redirecting standard input to its previous source. Then it closes file descriptor 3, which has served its purpose of backing up the initial stdin.
Q #1
Effectively, the two examples do the same: they temporarily redirect stdin within the scope of the while loop.
Q #3
Nope: exec 3<filename opens the file filename using file descriptor 3, while exec 3<&0 <filename does what I described in #2.
Q #4
I guess those older shells mentioned effectively executed
while ...; do ... ; done < filename
as
cat filename | while ...
thereby executing the while loop in a subshell.
Doing the redirections beforehand with those laborious exec commands avoids the redirection of the while block, and thereby the implicit sub-shell.
However, I never heard of that weird behavior, and I guess you won't have to deal with it unless you're working with truly ancient shells.

Related

"read" command not executing in "while read line" loop [duplicate]

First post here! I really need help on this one; I looked the issue up on Google but couldn't find a useful answer. So here's the problem.
I'm having fun coding something like a framework in bash. Everyone can create their own module and add it to the framework. BUT. To know what arguments the script requires, I created an "args.conf" file that must be in every module, and it looks something like this:
LHOST;true;The IP the remote payload will connect to.
LPORT;true;The port the remote payload will connect to.
The first column is the argument name, the second defines whether it's required or not, and the third is the description. Anyway, long story short, the framework is supposed to read the args.conf file line by line and ask the user for a value for every argument. Here's the piece of code:
info "Reading module $name argument list..."
while read line; do
echo $line > line.tmp
arg=`cut -d ";" -f 1 line.tmp`
requ=`cut -d ";" -f 2 line.tmp`
if [ $requ = "true" ]; then
echo "[This argument is required]"
else
echo "[This argument isn't required, leave a blank space if you don't wan't to use it]"
fi
read -p " $arg=" answer
echo $answer >> arglist.tmp
done < modules/$name/args.conf
tr '\n' ' ' < arglist.tmp > argline.tmp
argline=`cat argline.tmp`
info "Launching module $name..."
cd modules/$name
$interpreter $file $argline
cd ../..
rm arglist.tmp
rm argline.tmp
rm line.tmp
succes "Module $name execution completed."
As you can see, it's supposed to ask the user a value for every argument... But:
1) The read command seems not to be executing. It just gets skipped, and the argument gets no value.
2) Despite the fact that the args.conf file contains 3 lines, the loop seems to execute just a single time. All I see on the screen is "[This argument is required]" just once, and the module just launches (and crashes because it doesn't have the required arguments...).
Really don't know what to do here... I hope someone has an answer ^^'.
Thanks in advance!
(and sorry for any mistakes, I'm French)
Alpha.
As @that other guy pointed out in a comment, the problem is that all of the read commands in the loop are reading from the args.conf file, not from the user. The way I'd handle this is by redirecting the conf file over a different file descriptor than stdin (fd #0); I like to use fd #3 for this:
while read -u3 line; do
...
done 3< modules/$name/args.conf
(Note: if your shell's read command doesn't understand the -u option, use read line <&3 instead.)
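Here is a minimal self-contained sketch of the pattern (assuming bash; conf.txt is a stand-in for your args.conf):
printf 'LHOST;true\nLPORT;false\n' > conf.txt
while read -u3 line; do                    # config lines arrive over FD 3
    arg=$(echo "$line" | cut -d ";" -f 1)
    read -p " $arg=" answer                # stdin (FD 0) still points at the user
    echo "  you entered: $answer"
done 3< conf.txt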
There are a number of other things in this script I'd recommend against:
Variable references without double-quotes around them, e.g. echo $line instead of echo "$line", and < modules/$name/args.conf instead of < "modules/$name/args.conf". Unquoted variable references get split into words (if they contain whitespace) and any wildcards that happen to match filenames will get replaced by a list of matching files. This can cause really weird and intermittent bugs. Unfortunately, your use of $argline depends on word splitting to separate multiple arguments; if you're using bash (not a generic POSIX shell) you can use arrays instead; I'll get to that.
You're using relative file paths everywhere, and cding in the script. This tends to be fragile and confusing, since file paths are different at different places in the script, and any relative paths passed in by the user will become invalid the first time the script cds somewhere else. Worse, you aren't checking for errors when you cd, so if any cd fails for any reason, the entire rest of the script will run in the wrong place and fail bizarrely. You'd be far better off figuring out where your module system's root directory is (as an absolute path), then referencing everything from it (e.g. < "$module_root/modules/$name/args.conf").
Actually, you're not checking for errors anywhere. It's generally a good idea, when writing any sort of program, to try to think of what can go wrong and how your program should respond (and also to expect that things you didn't think of will also go wrong). Some people like to use set -e to make their scripts exit if any simple command fails, but this doesn't always do what you'd expect. I prefer to explicitly test the exit status of the commands in my script, with something like:
command1 || {
echo 'command1 failed!' >&2
exit 1
}
if command2; then
echo 'command2 succeeded!' >&2
else
echo 'command2 failed!' >&2
exit 1
fi
You're creating temp files in the current directory, which risks random conflicts (with other runs of the script at the same time, any files that happen to have names you're using, etc). It's better to create a temp directory at the beginning, then store everything in it (again, by absolute path):
module_tmp="$(mktemp -d -t module-system.XXXXXX)" || {
echo "Error creating temp directory" >&2
exit 1
}
...
echo "$answer" >> "$module_tmp/arglist.tmp"
(BTW, note that I'm using $() instead of backticks. They're easier to read, and don't have some subtle syntactic oddities that backticks have. I recommend switching.)
Speaking of which, you're overusing temp files; a lot of what you're doing with them can be done just fine with shell variables and built-in shell features. For example, rather than reading lines from the config file, storing them in a temp file, and using cut to split them into fields, you can simply echo to cut:
arg="$(echo "$line" | cut -d ";" -f 1)"
...or better yet, use read's built-in ability to split fields based on whatever IFS is set to:
while IFS=";" read -u3 arg requ description; do
(Note that since the assignment to IFS is a prefix to the read command, it only affects that one command; changing IFS globally can have weird effects, and should be avoided whenever possible.)
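A quick sketch of that scoping rule:
line="LHOST;true;The IP the remote payload will connect to."
IFS=";" read -r arg requ description <<< "$line"   # IFS=";" applies to this read only
echo "$arg"    # LHOST
echo "$requ"   # true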
Similarly, storing the argument list in a file, converting newlines to spaces into another file, then reading that file... you can skip any or all of these steps. If you're using bash, store the arg list in an array:
arglist=()
while ...
arglist+=("$answer") # or ("$arg=$answer")? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" "${arglist[#]}"
(That messy syntax, with the double-quotes, curly braces, square brackets, and at-sign, is the generally correct way to expand an array in bash).
If you can't count on bash extensions like arrays, you can at least do it the old messy way with a plain variable:
arglist=""
while ...
arglist="$arglist $answer" # or "$arglist $arg=$answer"? Not sure of your syntax.
done ...
"$module_root/modules/$name/$interpreter" "$file" $arglist
... but this runs the risk of arguments being word-split and/or expanded to lists of files.

How to understand and avoid non-interactive mode errors when running ispell from script?

Background
Ispell is a basic command-line spelling program in Linux, which I want to call for a previously collected list of file names. These file names are recursively collected from a LaTeX root file, for example. This is useful when you want to spell-check all recursively included LaTeX files, and no other files. However, calling ispell from the command line turns out to be non-trivial, as ispell gives errors of the form "Can't deal with non-interactive use yet." in some cases.
(As a side note, ideally I would like to call ispell programmatically from Java using the ProcessBuilder class, and without requiring bash. The same error seems to pester this approach, however.)
Question
Why is it that ispell gives the error "Can't deal with non-interactive use yet." in certain cases when called in bash from a loop involving the read command, but not in other cases, as shown in the code example below?
The below minimal code example creates two small files
(testFileOne.txt, testFileTwo.txt) and a file containing the paths of the two created files (testFilesListTemp.txt).
Next, ispell is called for testFilesListTemp.txt in three different ways:
1. With the help of "cat"
2. By first collecting the names as a string, then looping over the substrings in the collected string, and calling ispell for each of them.
3. By looping over the contents of testFilesListTemp.txt directly, and
calling ispell for the extracted paths.
For some reason the third method does not work, and yields the error
"Can't deal with non-interactive use yet.". Why exactly does this error
occur, how can it be prevented, and/or is there perhaps another variation
of the third approach that would work without errors?
#!/bin/bash
#ispell ./testFiles/ispellTestFile1.txt
# Creating two small files and a file with file paths for testing
printf "file 1 contents" > testFileOne.txt
printf "file 2 contents. With a spelling eeeeror." > testFileTwo.txt
printf "./testFileOne.txt\n./testFileTwo.txt\n" > testFilesListTemp.txt
COLLECTED_LATEX_FILE_NAMES_FILE=testFilesListTemp.txt
# Approach 1: produce list of file names with cat and
# pass as argumentto ispell
# WORKS
ispell $(cat $COLLECTED_LATEX_FILE_NAMES_FILE)
# Second approach, first collecting file names as long string,
# then looping over substrings and calling ispell for each one of them
FILES=""
while read p; do
echo "read file $p"
FILES="$FILES $p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
printf "files list: $FILES\n"
for latexName in $FILES; do
echo "filename: $latexName"
ispell $latexName
done
# Third approach, not working
# ispell complains in this case about not working in non-interactive
# mode
#: "Can't deal with non-interactive use yet."
while read p; do
ispell "$p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
The third example does not work because you redirect standard input. ispell needs a terminal and user interaction. When you write code like this:
while read p; do
ispell "$p"
done < $COLLECTED_LATEX_FILE_NAMES_FILE
everything that is read from standard input by any program within the loop will be taken from the $COLLECTED_LATEX_FILE_NAMES_FILE file. ispell detects that and refuses to operate. However, you can use descriptor redirection to make read p read from the file, and ispell "$p" read from the "real" terminal. Just do:
exec 3<&0
while read p; do
ispell "$p" 0<&3
done < $COLLECTED_LATEX_FILE_NAMES_FILE
exec 3<&0 "copies" (saves) your standard input (0, the "terminal") to descriptor 3. And later on you redirect standard input (0) to ispell from that descriptor, by typing 0<&3 (you can omit 0 if you like).

Reading the path of files as string in shell script

My Aim -->
The file listing produced by a command has to be read line by line and used as part of another command.
Description -->
A command in linux returns
archive/Crow.java
archive/Kaka.java
mypmdhook.sh
which is stored in the changed_files variable. I use the following while loop to read the files line by line and use each as part of a pmd command:
while read each_file
do
echo "Inside Loop -- $each_file"
done<$changed_files
I am new to writing shell scripts, but my assumption was that the lines would be separated in the loop and printed in each iteration. Instead, I get the following error --
mypmdhook.sh: 7: mypmdhook.sh: cannot open archive/Crow.java
archive/Kaka.java
mypmdhook.sh: No such file
Can you tell me how I can just get the value as a string (and later use it inside a command), rather than as a file to be opened? By the way, the file does exist, which made me feel even more confused. I'd be happy with any kind of answer that helps me understand and resolve this issue.
Since you have data stored in a variable, use a "here string" instead of file redirection:
changed_files="archive/Crow.java
archive/Kaka.java
mypmdhook.sh"
while read each_file
do
echo "Inside Loop -- $each_file"
done <<< "$changed_files"
Inside Loop -- archive/Crow.java
Inside Loop -- archive/Kaka.java
Inside Loop -- mypmdhook.sh
It is extremely important to quote "$changed_files" in order to preserve the newlines, so the while-read loop works as you expect. A rule of thumb: always quote variables, unless you know exactly why you want to leave the quotes off.
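A tiny demonstration of why the quotes matter (hypothetical filenames):
files="a.txt
b.txt"
echo $files      # unquoted: the newline is lost -> a.txt b.txt
echo "$files"    # quoted: the newline survives, so read sees two lines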
What happens here is that the value of your variable $changed_files is substituted into your command, and you get something like
while read each_file
do
echo "Inside Loop -- $each_file"
done < archive/Crow.java
archive/Kaka.java
mypmdhook.sh
then the shell tries to open the file for redirecting the input and obviously fails.
The point is that redirections (e.g. <, >, >>) in most cases accept filenames, but what you really need is to give the contents of the variable to the stdin. The most obvious way to do that is
echo "$changed_files" | while read each_file; do echo "Inside Loop -- $each_file"; done
You can also use the for loop instead of while read:
for each_file in $changed_files; do echo "inside Loop -- $each_file"; done
I prefer using while read ... if there is a chance that some filename may contain spaces, but in most cases for ... in will work for you.
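A sketch of that difference, with hypothetical filenames (one contains a space):
changed_files="archive/My Class.java
archive/Kaka.java"
for f in $changed_files; do echo "for: $f"; done
# for: archive/My
# for: Class.java
# for: archive/Kaka.java
while read -r f; do echo "while: $f"; done <<< "$changed_files"
# while: archive/My Class.java
# while: archive/Kaka.java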
Rather than storing the command's output in a variable, use a while loop like this:
mycommand | while read -r each_file; do echo "Inside Loop -- $each_file"; done
If you're using BASH you can use process substitution:
while read -r each_file; do echo "Inside Loop -- $each_file"; done < <(mycommand)
BTW, your attempt done<$changed_files assumes that changed_files contains a file name.

bash script read pipe or argument

I want my script to read a string either from stdin, if it's piped, or from an argument. So first I want to check if some text is piped, and if not, it should use an argument as input. My code looks something like this:
value=$(cat) # read from stdin
if [ "$pipe" != "" ]; then #check if pipe is not empty
#Do something with pipe string
else
#Do something with argument string
fi
The problem is that when nothing is piped, the script halts and waits for Ctrl-D, and I don't want that. Any suggestions on how to solve this?
Thanks in advance.
/Tomas
What about checking the argument first?
if (($#)) ; then
process "$1"
else
cat | process
fi
Or, just take advantage of the behaviour of cat, which reads standard input when given no file operands:
cat "$@" | process
If you only need to know if it's a pipe or a redirection, it should be sufficient to determine if stdin is a terminal or not:
if [ -t 0 ]; then
# stdin is a tty: process command line
else
# stdin is not a tty: process standard input
fi
[ (aka test) with -t is equivalent to the libc isatty() function.
The above will work with both something | myscript and myscript < infile. This is the simplest solution, assuming your script is for interactive use.
The [ command is a builtin in bash and some other shells, and since [/test with -t is in POSIX, it's portable too (not relying on Linux, bash, or GNU utility features).
There's one edge case: test -t also returns false if the file descriptor is invalid, though it would take some slight adversity to arrange that. test -e can detect this, assuming you have a filename such as /dev/stdin to use.
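For example (the second command runs test with stdin deliberately closed, making FD 0 invalid):
$ [ -t 0 ] && echo tty || echo not-a-tty    # at an interactive prompt
tty
$ sh -c '[ -t 0 ] && echo tty || echo not-a-tty' <&-
not-a-tty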
The POSIX tty command can also be used, and handles the adversity above. It will print the tty device name and return 0 if stdin is a terminal, and will print "not a tty" and return 1 in any other case:
if tty >/dev/null ; then
# stdin is a tty: process command line
else
# stdin is not a tty: process standard input
fi
(with GNU tty, you can use tty -s for silent operation)
A less portable way, though certainly acceptable on a typical Linux, is to use GNU stat with its %F format specifier, this returns the text "character special file", "fifo" and "regular file" in the cases of terminal, pipe and redirection respectively. stat requires a filename, so you must provide a specially-named file of the form /dev/stdin, /dev/fd/0, or /proc/self/fd/0, and use -L to chase symlinks:
stat -L -c "%F" /dev/stdin
This is probably the best way to handle non-interactive use (since you can't make assumptions about terminals then), or to detect an actual pipe (FIFO) distinct from redirection.
There is a slight gotcha with %F in that you cannot use it to tell the difference between a terminal and certain other device files, for example /dev/zero or /dev/null which are also "character special files" and might reasonably appear. An unpretty solution is to use %t to report the underlying device type (major, in hex), assuming you know what the underlying tty device number ranges are... and that depends on whether you're using BSD style ptys or Unix98 ptys, or whether you're on the actual console, among other things. In the simple case %t will be 0 though for a pipe or a redirection of a normal (non-special) file.
More general solutions to this kind of problem are to use bash's read with a timeout (read -t 0 ...) or non-blocking I/O with GNU dd (dd iflag=nonblock).
The latter will allow you to detect lack of input on stdin, dd will return an exit code of 1 if there is nothing ready to read. However, these are more suitable for non-blocking polling loops, rather than a once-off check: there is a race condition when you start two or more processes in a pipeline as one may be ready to read before another has written.
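As a sketch, in bash 4 or later, read -t 0 reports whether input is already available without consuming any of it:
if read -t 0; then
    echo "data is waiting on stdin"
else
    echo "no input ready (or the writer has not written yet -- the race mentioned above)"
fi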
It's easier to check for command line arguments first and fallback to stdin if no arguments. Shell Parameter Expansion is a nice shorthand instead of the if-else:
value=${*:-`cat`}
# do something with $value
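Usage would look like this (myscript is a hypothetical script containing the lines above):
$ ./myscript "argument text"      # an argument is present, so cat never runs
$ echo "piped text" | ./myscript  # no arguments: falls back to cat and reads stdin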

exec n<&m versus exec n>&m -- based on Sobell's Linux book

In Mark Sobell's A Practical Guide to Linux Commands, Editors, and Shell Programming, Second Edition he writes (p. 432):
The <& token duplicates an input file
descriptor; >& duplicates an output
file descriptor.
This seems to be inconsistent with another statement on the same page:
Use the following format to open or
redirect file descriptor n as a
duplicate of file descriptor m:
exec n<&m
and with an example also on the same page:
# File descriptor 3 duplicates standard input
# File descriptor 4 duplicates standard output
exec 3<&0 4<&1
If >& duplicates an output file descriptor then should we not say
exec 4>&1
to duplicate standard output?
The example is right in practice. The book's original explanation is an accurate description of what the POSIX standard says, but the POSIX-like shells I have handy (bash and dash, the only ones I believe are commonly seen on Linux) are not that picky.
The POSIX standard says the same thing as the book about input and output descriptors, and goes on to say this: for n<&word, "if the digits in word do not represent a file descriptor already open for input, a redirection error shall result". So if you want to be careful about POSIX compatibility, you should avoid this usage.
The bash documentation also says the same thing about <& and >&, but without the promise of an error. Which is good, because it doesn't actually give an error. Instead, empirically n<&m and n>&m appear to be interchangeable. The only difference between <& and >& is that if you leave off the fd number on the left, <& defaults to 0 (stdin) and >& to 1 (stdout).
For example, let's start a shell with fd 1 pointing at a file bar, then try out exactly the exec 4<&1 example, try to write to the resulting fd 4, and see if it works:
$ sh -c 'exec 4<&1; echo foo >&4' >bar; cat bar
foo
It does, and this holds using either dash or bash (or bash --posix) for the shell.
Under the hood, this makes sense because <& and >& are almost certainly just calling dup2(), which doesn't care whether the fds are opened for reading or writing or appending or what.
[EDIT: Added reference to POSIX after discussion in comments.]
If stdout is a tty, then it can safely be cloned for reading or writing. If stdout is a file, then it may not work. I think the example should be 4>&1. I agree with Greg that you can both read and write the clone descriptor, but requesting a redirection with <& is supposed to be done with source descriptors that are readable, and expecting stdout to be readable doesn't make sense. (Although I admit I don't have a reference for this claim.)
An example may make it clearer. With this script:
#!/bin/bash
exec 3<&0
exec 4<&1
read -p "Reading from fd 3: " <&3
echo From fd 3: $REPLY >&2
REPLY=
read -p "Reading from fd 4: " <&4
echo From fd 4: $REPLY >&2
echo To fd 3 >&3
echo To fd 4 >&4
I get the following output (the stuff after the : on "Reading from" lines is typed at the terminal):
$ ./5878384b.sh
Reading from fd 3: foo
From fd 3: foo
Reading from fd 4: bar
From fd 4: bar
To fd 3
To fd 4
$ ./5878384b.sh < /dev/null
From fd 3:
Reading from fd 4: foo
From fd 4: foo
./5878384b.sh: line 12: echo: write error: Bad file descriptor
To fd 4
$ ./5878384b.sh > /dev/null
Reading from fd 3: foo
From fd 3: foo
./5878384b.sh: line 9: read: read error: 0: Bad file descriptor
From fd 4:
To fd 3
Mind the difference between file descriptors and IO streams such as stderr and stdout.
The redirecting operators are just redirecting IO streams via different file descriptors (IO stream handling mechanisms); they do not do any copying or duplicating of IO streams (that's what tee(1) is for).
See: File Descriptor 101
Another test to show that n<&m and n>&m are interchangeable would be "to use either style of 'n<&-' or 'n>&-' for closing a file descriptor, even if it doesn't match the read/write mode that the file descriptor was opened with" (http://www.gnu.org/s/hello/manual/autoconf/File-Descriptors.html).
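For example, in bash both closing forms work regardless of the mode the descriptor was opened with (any readable file will do in place of /etc/hostname):
$ exec 5< /etc/hostname   # FD 5 opened for reading
$ exec 5>&-               # closed with the "output" form: no error
$ exec 6> /tmp/out        # FD 6 opened for writing
$ exec 6<&-               # closed with the "input" form: also no error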
