bash: Creating many descriptors in a loop - linux

I am trying to create multiple descriptors to files named 1, 2, 3, etc. in bash.
For example, exec 9>abc/1 works just fine, but when I try to create descriptors in a for loop, like this: exec $[$i+8]>abc/$i, it doesn't work. I tried many different ways, but it seems that exec just does not accept variables. Is there any way to do what I want?
EDIT: If not, maybe there is a way to use flock without descriptors?

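Yes, exec doesn't accept variables for file descriptor numbers: the number in front of a redirection operator must be written literally, because the shell recognizes it while parsing, before any expansion takes place. As pointed out in comments, you can use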
eval "exec $((i + 8))>"'"abc/$i"'
which, if $i is 1, is equivalent to
exec 9>"abc/$i"
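The quoting is deliberate: the descriptor number is expanded before eval sees the string, while "abc/$i" stays inside single quotes so that eval expands it within double quotes. This keeps the eval-ed and then exec-ed command safe even if the file name is changed to something other than abc/1 (for example, a name containing spaces).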
But there is a warning:
Redirections using file descriptors greater than 9 should be used with care, as they may conflict with file descriptors the shell uses internally.
So if your task doesn't require consecutive file descriptor numbers, you can use automatically allocated descriptors:
Each redirection that may be preceded by a file descriptor number may instead be preceded by a word of the form {varname}. In this case, for each redirection operator except >&- and <&-, the shell will allocate a file descriptor greater than 10 and assign it to varname.
So,
exec {fd}>"abc/$i"
echo "$fd"
will open file descriptor 10 (or greater) for writing to abc/1 and print that file descriptor number (e.g. 10).
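Applied to the loop from the question, a minimal sketch might look like the following. It assumes bash 4.1 or later (for the {varname} form); the temporary directory, the fds array, and the count of three files are illustrative choices, not part of the question.

```bash
#!/usr/bin/env bash
# Sketch: open one automatically allocated descriptor per file, in a loop.
# Assumes bash >= 4.1; "dir" and "fds" are illustrative names.
dir=$(mktemp -d)
fds=()

for i in 1 2 3; do
    exec {fd}>"$dir/$i"     # bash picks a free descriptor and stores it in fd
    fds[i]=$fd              # remember which descriptor belongs to file $i
done

# Write through the stored descriptors.
for i in 1 2 3; do
    echo "written via descriptor for file $i" >&"${fds[i]}"
done

# Close each descriptor again; {fd}>&- closes the descriptor named by $fd.
for i in 1 2 3; do
    fd=${fds[i]}
    exec {fd}>&-
done

cat "$dir/2"
```

Since flock accepts a descriptor number as its argument, the stored values (for example "${fds[1]}") could be passed to it while the descriptors are still open.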

Related

Directory depth in recursive script

Hi, I'd like some help with my Linux bash homework.
I have to write a script that takes a directory and returns the depth of its deepest subdirectory (+1 for each directory).
I must do it recursively.
I must use 'list_dirs.sh', which takes the variable dir and echoes its subdirectories.
This is what I have so far:
dir=$1
sub=`source list_dirs.sh`
((depth++))
for i in $sub
do
if [ -n "$sub" ] ; then
./depthScript $dir/$i
fi
done
if ((depth > max)) ; then
max=$depth
echo $max
fi
After testing with a directory that is supposed to return 3, I got this instead:
1
1
1
1
It seems like my depth counter forgets previous values, and I get output for
each directory. I need some help!
You can use bash functions to create recursive function calls.
Your function would ideally echo 0 in the base case where it is called on a directory with no subdirectories, and echo 1+$(getDepth $subdir) in the case where some subdirectory $subdir exists. See this question on recursive functions in bash for a framework.
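As a rough sketch of that shape (0 in the base case, 1 plus the deepest child otherwise), something like the following could work. Note the assumptions: it finds subdirectories with a glob rather than the list_dirs.sh helper the assignment requires, it skips hidden directories, and the demo tree under a temporary directory is made up for illustration.

```bash
#!/usr/bin/env bash
# getDepth DIR: print how many directory levels lie below DIR.
# Assumption: subdirectories are found with a glob, not with list_dirs.sh.
getDepth() {
    local dir=$1 sub depth max=0
    for sub in "$dir"/*/; do
        [ -d "$sub" ] || continue       # an unmatched glob stays literal; skip it
        depth=$(( 1 + $(getDepth "${sub%/}") ))
        if (( depth > max )); then
            max=$depth
        fi
    done
    echo "$max"                          # base case: no subdirectories, prints 0
}

# Demo on a throwaway tree whose deepest chain is a/b/c.
root=$(mktemp -d)
mkdir -p "$root/a/b/c" "$root/a/d" "$root/e"
result=$(getDepth "$root")
echo "$result"
rm -rf "$root"
```

Because each recursive call runs in a command substitution, its result comes back through standard output rather than through a shared variable.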
When you run a script normally (i.e. it's in your PATH and you just enter its name, or you enter an explicit path to it like ./depthScript), it runs as a subprocess of the current shell. This is important because each process has its own variables. Variables also come in two kinds: shell variables (which are only available in that one process) and environment variables (the values of which get exported to subprocesses but not back up from them). And depending on where you want a variable's value to be available, there are three different ways to define them:
# By default, a variable is a shell variable that's only defined in this process:
shellvar=something
# `export` puts a variable into the environment, so it'll be exported to subprocesses.
# You can export a variable either while setting it, or as a separate operation:
export envvar=something
export anotherenvvar
anotherenvvar=something
# You can also prefix a command with a variable assignment. This makes an
# environment variable in the command process's environment, but not the current
# shell process's environment:
prefixvar=something ./depthScript "$dir/$i"
Given the above assignments:
shellvar is defined in the current shell process, but not in any other process (including the subprocess created to run depthScript).
envvar and anotherenvvar will be inherited by the subprocess (and its subprocesses, and all subprocesses for later commands), but any changes made to them in those subprocesses have no effect at all in the current process.
prefixvar is available only in the subprocess created to run depthScript (and its subprocesses), but not in the current shell process or any other of its subprocesses.
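The three cases can be checked with a small experiment. This is only a sketch: it uses bash -c as a stand-in for the child script, and the variable values are arbitrary.

```bash
#!/usr/bin/env bash
# Sketch: what a child process sees of each kind of variable.
# "bash -c ..." stands in for running ./depthScript as a subprocess.
shellvar=one            # shell variable: this process only
export envvar=two       # environment variable: inherited by children

# prefixvar exists only in the environment of this one child.
child_view=$(prefixvar=three bash -c \
    'echo "shellvar=[${shellvar-}] envvar=[$envvar] prefixvar=[$prefixvar]"')
echo "$child_view"

# A change made in a child never travels back up to the parent.
bash -c 'envvar=changed'
echo "parent still has envvar=$envvar"
echo "parent sees prefixvar as: ${prefixvar-unset}"
```

The child prints an empty shellvar, while the parent keeps its original envvar and never sees prefixvar.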
Short summary: it's a mess because of the process structure, and as a result it's best to just avoid even trying to pass values around between scripts (or different invocations of the same script) in variables. Use environment variables for settings and such that you want to be generally available (but don't need to be changed much). Use shell variables for things local to a particular invocation of a script.
So, how should you pass the depth values around? Well, the standardish way is for each script (or command) to print its output to "standard output", and then whatever's using the script can capture its output to either a file (command >outfile) or a variable (var=$(command)). I'd recommend the latter in this case:
depth=$(./depthScript "$dir/$i")
if ((depth > max)) ; then
max=$depth
fi
Some other recommendations:
Think your control and data flow through. The current script loops through all subdirectories, then at the end runs a single check for the deepest subdir. But you need to check each subdirectory individually to see if it's deeper than the current max, and at the end report the deepest of them.
Double-quote your variable references (as I did with "$dir/$i" above). Unquoted variable references are subject to word splitting and wildcard expansion, which is the source of much grief. It looks like you'll need to leave $sub unquoted because you need it to be split into words, but this will make the script unable to cope with directory names with spaces. See BashFAQ #20: "How can I find and safely handle file names containing newlines, spaces or both?"
The if [ -n "$sub" ] ; then test is irrelevant. If $sub is empty, the loop will never run.
In a shell script, relative paths (like ./depthScript) are resolved against the working directory the script inherited from its parent process, not against the location of the script. If someone runs your script from another directory, ./depthScript will not work. Build the path from "$BASH_SOURCE" instead. See BashFAQ #28: "How do I determine the location of my script? I want to read some config files from the same place."
When trying to troubleshoot a script, it can help to put set -x before the troublesome section. This makes the shell print each command as it runs, so you can see what's going on.
Run your scripts through shellcheck.net -- it'll point out a lot of common mistakes.
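Two of these recommendations are easy to see in a short sketch. The count_args helper and the directory name "my dir" are made up for illustration, and the script_dir line is one common idiom for locating the script, not the only one.

```bash
#!/usr/bin/env bash
# Sketch: effect of quoting, and locating the script's own directory.
count_args() { echo "$#"; }         # hypothetical helper: prints its argument count

dir="my dir"
unquoted=$(count_args $dir/sub)     # word splitting yields "my" and "dir/sub"
quoted=$(count_args "$dir/sub")     # stays one argument: "my dir/sub"
echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"

# Resolve the directory containing this script, independent of the caller's cwd.
script_dir=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)
echo "helper scripts would be looked up in: $script_dir"
```

With script_dir in hand, the recursive call could be written as "$script_dir/depthScript" "$dir/$i" so that it works from any working directory.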

bash "echo" including ">" in the middle creating file - please explain

When I write:
echo 2*3>5 is a valid inequality
In my bash terminal, a new file named 5 is created in my directory which contains:
2*3 is a valid inequality
I want to know what exactly is going on here and why am I getting this output?
I believe it's obvious that I'm new to Linux!
Thanks
In bash, redirections can occur anywhere in the line (but you shouldn't do it; see the bash-hackers tutorial). Bash takes the >5 as a redirection, creates the output file 5, and then treats the remaining words as the command and its arguments. So what actually runs is echo 2*3 is a valid inequality with standard output sent to the file 5, which is why that text ends up in the file.
What you probably want is
echo "2*3>5 is a valid inequality"
or
echo '2*3>5 is a valid inequality'
(with single-quotes), either of which will give you the message you specify as a printout on the command line. The difference is that, within "", variables (such as $foo) will be filled in, but not within ''.
Edit: The bash man page says that the
redirection operators may precede or appear anywhere within a simple command or may follow a command. Redirections are processed in the order they appear, from left to right.
bash does the output redirection first i.e. >5 is done first and a file named 5 is created (or truncated if it already exists). The resultant file descriptor remains open for the runtime of the echo command.
Then the remaining portion, 2*3 is a valid inequality, runs as the argument to echo and standard output is saved in the (already-open) file 5 eventually.
To get the whole string as the output, use single or double quotes:
echo '2*3>5 is a valid inequality'
This is an example of output redirection. You're instructing the echo command to write to a file instead of standard output. That file's name happens to be "5".
You can avoid that behavior by quoting:
echo "2*3>5 is a valid inequality"
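The behavior described in these answers can be reproduced safely in a scratch directory. In this sketch the temporary directory and the file name "first" are illustrative; the file named 5 comes from the question.

```bash
#!/usr/bin/env bash
# Sketch: a redirection is honored wherever it appears in a simple command.
workdir=$(mktemp -d)
(
    cd "$workdir" || exit 1
    echo 2*3>5 is a valid inequality                  # ">5" redirects to file "5"
    >first echo the redirection can also come first   # same effect, written first
)

unquoted_result=$(cat "$workdir/5")
first_result=$(cat "$workdir/first")
quoted_result=$(echo '2*3>5 is a valid inequality')   # quoted: no redirection

echo "file 5 contains: $unquoted_result"
echo "quoted echo prints: $quoted_result"
```

Only the quoted form keeps the > as part of the text; in the unquoted form it never reaches echo at all.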

Unix shell descriptor redirection

How can I translate this:
echo "test" | tee -a test1 test2
into a pure UNIX descriptor redirection solution (preferably a one-liner, and with no pipes)?
Is it possible?
If you want a byte written to one file descriptor (pipe, socket etc.) to show up as readable data on more than one file descriptor which are not dup()s of each other (but e.g. they correspond to two different regular files), then it's not possible on a generic Unix system. Even if the two file descriptors are dup()s of each other, reading the byte from one of them consumes it, so it can no longer be read from the other.
If you want to do it in Bash without using a |, then it's not possible.
If you want to do it in Zsh without using a |, then just follow chepner's comment: do setopt multios, and then echo test >>test1 >>test2. In the background Zsh will create a helper process to do the copying equivalent to what tee -a does.
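If duplicating a single stream is not actually required and it is acceptable to produce the text once per file, plain append redirections in a loop give the same end result for this particular example. This is a workaround sketch, not an equivalent of tee: the text is written by two separate commands, and the temporary directory is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: append the same text to two files using only ">>" redirections.
# Not a true tee replacement: the output is generated once per file.
workdir=$(mktemp -d)
msg="test"

for f in "$workdir/test1" "$workdir/test2"; do
    printf '%s\n' "$msg" >>"$f"
done

cat "$workdir/test1" "$workdir/test2"
```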

how to create a zero-copy, no-capacity, blocking pipe in bash?

I know the concept sounds a little abusive (?), but still - how can I create a pipe in bash which:
has no capacity
and therefore requires no memory copy, and
requires the write to be blocking
I am guessing a lot here. But possibly you are thinking about coprocesses and do not know what that term means.
bash supports coprocesses:
http://www.gnu.org/software/bash/manual/html_node/Coprocesses.html
The format for a coprocess is:
coproc [NAME] command [redirections]
This creates a coprocess named NAME.
If NAME is not supplied, the default name is COPROC.
NAME must not be supplied if command is a simple command (see Simple Commands);
otherwise, it is interpreted as the first word of the simple command.
When the coproc is executed, the shell creates an array variable (see Arrays) named NAME in the context of the executing shell. The standard output of command is connected via a pipe to a file descriptor in the executing shell, and that file descriptor is assigned to NAME[0].
The standard input of command is connected via a pipe to a file descriptor in the executing shell, and that file descriptor is assigned to NAME[1].
This pipe is established before any redirections specified by the command (see Redirections).
The file descriptors can be utilized as arguments to shell commands and redirections using standard word expansions.
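A minimal sketch of the mechanism described in that excerpt follows. It assumes bash 4.1 or later; the coprocess name UPPER and its uppercase-echo loop are invented for illustration, and this shows an ordinary pipe-backed coprocess, not a zero-capacity pipe.

```bash
#!/usr/bin/env bash
# Sketch: talk to a coprocess through the descriptors in its NAME array.
# UPPER is a hypothetical coprocess that echoes each input line in uppercase.
coproc UPPER { while IFS= read -r line; do echo "${line^^}"; done; }

pid=$UPPER_PID          # save these before the coprocess can exit
in_fd=${UPPER[1]}       # descriptor connected to the coprocess's stdin

echo "hello" >&"${UPPER[1]}"            # write to the coprocess
IFS= read -r reply <&"${UPPER[0]}"      # read its answer from its stdout
echo "$reply"

exec {in_fd}>&-         # closing its stdin ends the read loop
wait "$pid"
```

The pid and the input descriptor are copied into ordinary variables first because bash removes the UPPER variables once the coprocess has terminated.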

What does dup2 actually do in this case?

I need some clarification here:
I have some code like this:
child_map[0] = fileno(fd[0]);
..
pid = fork();
if(pid == 0)
/* child process*/
dup2(child_map[0], STDIN_FILENO);
Now, will STDIN_FILENO and child_map[0] point to the same file descriptor? Will future input be taken from the file pointed to by child_map[0] and STDIN_FILENO?
I thought STDIN_FILENO means the standard output(terminal).
After the dup2(), child_map[0] and STDIN_FILENO will continue to be separate file descriptors, but they will refer to the same open file description. That means that if, for example, child_map[0] == 5 and STDIN_FILENO == 0, then both file descriptor 5 and 0 will remain open after the dup2().
Referring to the same open file description means that the file descriptors are interchangeable - they share attributes like the current file offset. If you perform an lseek() on one file descriptor, the current file offset is changed for both.
To close the open file description, all file descriptors that point to it must be closed.
It is common to execute close(child_map[0]) after the dup2(), which leaves only one file descriptor open to the file.
The dup2() call causes all functions that read from stdin to get their data from the specified file descriptor, instead of the parent's stdin (often a terminal, but could be a file or pipe depending on shell redirection).
In fact, this is how a shell would launch processes with redirected input.
e.g.
cat somefile | uniq
uniq's standard input is bound to a pipe, not the terminal.
STDIN_FILENO is stdin, not stdout. (There's a STDOUT_FILENO too.) Traditionally the former is 0 and the latter 1.
This code is using dup2() to redirect the child's stdin from another file descriptor that the parent had opened. (It is in fact the same basic mechanism used for redirection in shells.) What usually happens afterward is that some other program that reads from its stdin is execed, so the code has set up its stdin for that.
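The same mechanism can be observed from the shell side, where the <& operator duplicates one descriptor onto another. The sketch below assumes bash 4.1 or later; the temporary file and variable names are illustrative. It also shows the shared file offset: the second read continues where the first one stopped.

```bash
#!/usr/bin/env bash
# Sketch: shell-level counterpart of dup2(child_map[0], STDIN_FILENO).
tmp=$(mktemp)
printf 'first\nsecond\n' >"$tmp"

exec {fd}<"$tmp"            # open the file on a new descriptor

# Each "<&$fd" makes standard input a duplicate of fd for that one command.
# Both descriptors share one open file description, hence one file offset.
IFS= read -r a <&"$fd"
IFS= read -r b <&"$fd"

exec {fd}<&-                # comparable to close(child_map[0]) after dup2()
echo "a=$a b=$b"
rm -f "$tmp"
```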

Resources