Can I avoid using a FIFO file to join the end of a Bash pipeline to be stored in a variable in the current shell? - linux

I have the following functions:
execIn ()
{
local STORE_INvar="${1}" ; shift
printf -v "${STORE_INvar}" '%s' "$( eval "$@" ; printf %s x ; )"
printf -v "${STORE_INvar}" '%s' "${!STORE_INvar%x}"
}
and
getFifo ()
{
local FIFOfile
FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
while [ -e "${FIFOfile}" ]
do
FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
done
mkfifo "${FIFOfile}"
echo "${FIFOfile}"
}
I want to store the output of the end of a pipeline in a variable whose name is given to a function at the end of the pipeline. However, the only way I have found to do this that works in early versions of Bash is to use mkfifo to make a temporary FIFO file. I was hoping to use file descriptors to avoid having to create temporary files. So this works, but is not ideal:
Set Up: (before I can do this I need to have assigned a FIFO file to a var that can be used by the rest of the process)
$ FIFOfile="$( getFifo )"
The Pipeline I want to persist:
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 # for e.g.
The action: (I can now add) >${FIFOfile} &
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 >${FIFOfile} &
N.B. The need to background it with & - Problem 1: I get [1] <PID_NO> output to the screen.
The actual persist:
$ execIn SOME_VAR cat - <${FIFOfile}
Problem 2: I get more noise to the screen
[1]+ Done printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 > ${FIFOfile}
Problem 3: I lose the blanks at the start of the stream rather than at the end, as I have experienced before.
So, am I doing this the right way? I am sure that there must be a way to avoid the need for a FIFO file that requires cleanup afterwards by using file descriptors, but I cannot seem to do this, as I cannot assign either side of the problem to a file descriptor that is not attached to a file or a FIFO.
I can try to resolve the problems with what I have, although to make this work properly I guess I need to pre-establish a pool of FIFO files that can be pulled in to use, or else I have the prerequisite of establishing this file before the command. So, for many reasons this is far from ideal. If anyone can advise me of a better way, you would make my day/week/month/life :)
Thanks in advance...

Process substitution was available in bash from the ancient days. You absolutely do not have a version so ancient as to be unable to use it. Thus, there's no need to use a FIFO at all:
readToVar() { IFS= read -r -d '' "$1"; }
readToVar targetVar < <(printf '\n\n123\n456\n524\n789\n\n\n')
You'll observe that:
printf '%q\n' "$targetVar"
...correctly preserves the leading newlines as well as the trailing ones.
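With the example input above, the %q output should look something like this (illustrative):
$'\n\n123\n456\n524\n789\n\n\n'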
By contrast, in a use case where you can't afford to lose stdin:
readToVar() { IFS= read -r -d '' "$1" <"$2"; }
readToVar targetVar <(printf '\n\n123\n456\n524\n789\n\n\n')
If you really want to pipe to this command, are willing to require a very modern bash, and don't mind being incompatible with job control:
set +m # disable job control
shopt -s lastpipe # in a pipeline, parent shell becomes right-hand side
readToVar() { IFS= read -r -d '' "$1"; }
printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 | readToVar targetVar
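With that in place, targetVar is set in the parent shell. A quick check (illustrative; grep 2 keeps only the lines 123 and 524):
printf '%q\n' "$targetVar"   # $'123\n524\n' - the matched lines, trailing newline intact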

The issues you claim to run into with using a FIFO do not actually exist. Put this in a script, and run it:
#!/bin/bash
trap 'rm -rf "$tempdir"' 0 # cleanup on exit
tempdir=$(mktemp -d -t fifodir.XXXXXX)
mkfifo "$tempdir/fifo"
printf '\n\n123\n456\n524\n789\n\n\n' >"$tempdir/fifo" &
IFS= read -r -d '' content <"$tempdir/fifo"
printf '%q\n' "$content" # print content to console
You'll notice that, when run in a script, there is no "noise" printed to the screen, because all that status is explicitly tied to job control, which is disabled by default in scripts.
You'll also notice that both leading and trailing newlines are correctly represented.

One idea (tell me I am crazy) might be to use the !! notation to grab the line just executed. If there is a command that can terminate a pipeline and stop it actually executing, while the shell still considers it a successful execution (I am thinking of something like the true command), I could then use !! to grab that line and call my existing function to execute it with process substitution or something. I could then wrap this into an alias, something like: alias streamTo=' | true ; LAST_EXEC="!!" ; myNewCommandVariation <<<' which I think could be used something like: $ cmd1 | cmd2 | myNewCommandVariation THE_VAR_NAME_TO_SET and the <<< from the alias would pass the var name to the command as an arg or stdin; either way, the command would not be at the end of a pipeline. How mad is this idea?

Not a full answer but rather a first point: is there some good reason not to use mktemp for creating a new file with a random name? As far as I can see, your getFifo() function doesn't do much more than that.
mktemp -u
will give you a free new name without creating anything; then you can use mkfifo with that name.
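A sketch of that combination as a drop-in for getFifo (note the small race window between mktemp -u picking the name and mkfifo creating it):
getFifo ()
{
local FIFOfile
FIFOfile="$(mktemp -u /tmp/diamondLang-FIFO-$$-XXXXXX)" || return 1
mkfifo "${FIFOfile}" || return 1   # fails if something grabbed the name first
echo "${FIFOfile}"
}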

Related

How to make read prompt show when redirecting stderr to a process in bash

I currently have an issue where redirecting (2>) or piping (|&) stderr somewhere causes read prompts not to print on the terminal when input is required. Instead the text is only printed after the program returns. The problem might look similar to this one, but I am not sure how it works or whether we are having the same issue.
What I am trying to do is to filter the output of pacman so that it is not cluttered with warnings that I knew would be present. This is the command I execute:
pacman -Syu --ignore="$(IFS=',';echo "${AUTOPKG[*]}")" 2> >(grep -v 'ignoring package upgrade' >&2)
Where ${AUTOPKG[*]} is expanded to a comma-separated list of packages I want to ignore at that moment. grep -v filters out the unnecessary warnings and redirects the output back to the terminal.
In addition, I have used the read utility to narrow the problem down to stderr only, as redirecting stdout alone still lets the prompt through. I have no more clues from here on, though; any help is appreciated.
So I have experimented with the commands and found that the line is written to the file descriptor; it just is not printed because it has not yet hit a newline. So I realized I can use read -rN1 to read the input one character at a time and pass it to grep, with a timeout that prints the buffer out if too much time passes. Here is the drop-in replacement function for grep:
timeoutGrep()
{
local string pid_sleep=0
while IFS= read -rN1 ch ;do
if (( $pid_sleep )) && [ -e /proc/$pid_sleep ] ;\
then kill $pid_sleep ;\
else string='' ;fi
string+="$ch"
if [ "$ch" = $'\n' ] ;then
grep "$#" <(printf '%s' "$string")
pid_sleep=0
else
sleep 0.5 && grep "$@" <(printf '%s' "$string") | tr -d '\n' &
pid_sleep=$!
fi
done
}
The rundown of what this function does: read a character and append it to the buffer, then act depending on the character read. If it is '\n', send the line to grep immediately, since we know the line is finished, and tell the next iteration that no background process was started. If it is not '\n', start a delayed background grep and tell the next iteration its pid. When the next character is read, the previous iteration's state is checked: if the background process already ran, or if the pid is 0 (no background process was created), the buffer is cleared; otherwise the background process still hasn't grepped the string, so it is killed before it does, and the loop carries on.
It seems to be working for now, though it has some delays and may mess up the order of the messages. I also tried using the timeout option of read, but it seems to stop the running process somehow, which I suspect is due to the file descriptor being closed.
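For reference, a usage sketch with the pacman command from the question, with timeoutGrep dropped in where grep was (assuming the function above has already been defined in the shell):
pacman -Syu --ignore="$(IFS=',';echo "${AUTOPKG[*]}")" 2> >(timeoutGrep -v 'ignoring package upgrade' >&2)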

As with the command: "echo '#!/bin/bash' |tee file", but with "echo '#!/bin/bash' | myscript file"

What "... | tee file" does is take stdin (standard input) and divert it to two places: stdout (standard output) and to a path/file named "file". In effect it does this, as far as I can judge:
#!/bin/bash
var=(cat) # same as var=(cat /dev/stdin)
echo -e "$var"
for file in "$@"
do
echo -e "$var" > "${file}"
done
exit 0
So I use the above code to create tee1 to see if I could emulate what tee does. But my real intent is to write a modified version that appends to existing file(s) rather than redo them from scratch. I call this one tee2:
#!/bin/bash
var=(cat) # same as var=(cat /dev/stdin)
echo -e "$var"
for file in "$@"
do
echo -e "$var" >> "${file}"
done
exit 0
It makes sense to me, but not to bash. Now an alternative approach is to do something like this:
echo -e "$var"
for file in "$@"
do
echo -e "$var"| tee tmpfile
cat tmpfile >> "${file}"
done
rm tmpfile
exit 0
It also makes sense to me to do this:
#!/bin/bash
cp -rfp /dev/stdin tmpfile
cat tmpfile
for file in "$@"
do
cat tmpfile >> "${file}"
done
exit 0
Or this:
#!/bin/bash
cat /dev/stdin
for file in "$@"
do
cat /dev/stdin >> "${file}"
done
exit 0
Some online searches suggest that printf be used in place of echo -e for more consistency across platforms. Others suggest that cat be used in place of read, though since stdin is a device, it should be usable in place of cat, as in:
> tmpfile
IFS=\n
while read line
do
echo $line >> tmpfile
echo $line
done < /dev/stdin
unset IFS
Then the for loop follows. But I can't get that to work. How can I do it with bash?
But my real intent is to write a modified version that appends to existing file(s) rather than redo them from scratch.
The tee utility is specified to support an -a option, meaning "Append the output to the files." [spec]
(And I'm not aware of any implementations of tee that deviate from the spec in this regard.)
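A minimal demonstration (log.txt is just an example name):
printf 'first run\n'  | tee -a log.txt
printf 'second run\n' | tee -a log.txt
cat log.txt    # both lines are present; -a appended rather than truncating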
Edited to add: If your question is really "what's wrong with all the different things I tried", then, that's probably too broad for a single Stack Overflow question. But here's a short list:
var=(cat) means "Set the array variable var to contain a single element, namely, the string cat."
Note that this does not, in any way, involve the program cat.
You probably meant var=$(cat), which means "Run the command cat, capturing its standard output. Discard any null bytes, and discard any trailing sequence of newlines. Save the result in the regular variable var."
Note that even this version is not useful for faithfully implementing tee, since tee does not discard null bytes and trailing newlines. Also, tee forwards input as it becomes available, whereas var=$(cat) has to wait until input has completed. (This is a problem if standard input is coming from the terminal — in which case the user would expect to see their input echoed back — or from a program that might be trying to communicate with the user — in which case you'd get a deadlock.)
echo -e "$var" makes a point of processing escape sequences like \t. (That's what the -e means.) This is not what you want. In addition, it appends an extra newline, which isn't what you want if you've managed to set $var correctly. (If you haven't managed to set $var correctly, then this might help compensate for that, but it won't really fix the problem.)
To faithfully print the contents of var, you should write printf %s "$var".
I don't understand why you switched to the | tee tmpfile approach. It doesn't improve anything so far as I can tell, and it introduces the bug that now if you're copying to n files, then you will also write n copies to standard output. (You fixed that bug in later versions, though.)
The versions where you write directly to a file, instead of saving to a variable first, are a massive improvement in terms of faithfully copying the contents of standard input. But they still have the problem of waiting until input is complete.
The version where you cat /dev/stdin multiple times (once for each destination) won't work, because there's no "rewinding" of standard input. Once something is consumed, it's gone. (This makes sense when you consider that standard input is frequently passed around from program to program — your cat-s, for example, are inheriting it from your Bash script, and your Bash script may be inheriting it from the terminal. If some sort of automatic rewinding were to happen, how would it decide how far back to go?) (Note: if standard input is coming from a regular file, then it's possible to explicitly seek backward along it, and thereby "unconsume" already-consumed input. But that doesn't happen automatically, and anyway that's not possible when standard input is coming from a terminal, from a pipe, etc.)
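Pulling the relevant fixes from the points above together, a sketch of the appending variant (it still buffers until EOF, and still drops NUL bytes and trailing newlines, as noted):
#!/bin/bash
# tee2 sketch: append stdin to every file named on the command line
var=$(cat)
printf %s "$var"
for file in "$@"
do
printf %s "$var" >> "${file}"
done
exit 0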

How to execute Linux shell variables within double quotes?

I have the following hacking challenge, where we don't know if there is a valid solution.
We have the following server script:
read s # read user input into var s
echo "$s"
# tests if it starts with 'a-f'
echo "$s" > "/home/user/${s}.txt"
We only control the input "$s". Is there a possibility of sending OS commands like uname, or do you think "no way"?
I don't see any avenue for executing arbitrary commands. The script quotes $s every time it is referenced, so that limits what you can do.
The only serious attack vector I see is that the echo statement writes to a file name based on $s. Since you control $s, you can cause the script to write to some unexpected locations.
$s could contain a string like bob/important.txt. This script would then overwrite /home/user/bob/important.txt if executed with sufficient permissions. Sorry, Bob!
Or, worse, $s could be bob/../../../etc/passwd. The script would try to write to /home/user/bob/../../../etc/passwd. If the script is running as root... uh oh!
It's important to note that the script can only write to these places if it has the right permissions.
You could embed unusual characters in $s that would cause irregular file names to be created. Careless scripts could be taken advantage of. For example, if $s were foo -rf . bar, then the file /home/user/foo -rf . bar.txt would be created.
If someone ran for file in /home/user/*; do rm $file; done they'd have a surprise on their hands. They would end up running rm /home/user/foo -rf . bar.txt, which is a disaster. If you take out /home/user/foo and bar.txt you're left with rm -rf ., which deletes everything in the current directory. Oops!
(They should have quoted "$file"!)
And there are two other minor things which, while I don't know how to take advantage of them maliciously, do cause the script to behave slightly differently than intended.
read allows backslashes to escape characters like space and newline. You can enter \space to embed spaces and \enter to have read parse multiple lines of input.
echo accepts a couple of flags. If $s is -n or -e then it won't actually echo $s; rather, it will interpret $s as a command-line flag.
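A quick illustration of the echo point (harmless to try):
s=-n
echo "$s"            # prints nothing: -n is taken as the "suppress newline" flag
printf '%s\n' "$s"   # prints -n literally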
Use read -r s or any \ will be lost/misinterpreted by your command.
read -r -p "Your input: " s
if [ -n "${s}" ]
then
# "filter" a file name from the command
Output=$(echo "${s##*/}" | sed 's|^ *\([[:alnum:]_]\{1,\}\)[[:blank:]].*|/home/user/\1.txt|')
(
# put any limitation on the user here
ulimit -t 5 1>/dev/null 2>&1
# run the user's command under those limits
${s}
) > "${Output}"
else
echo "Bad command" > /home/user/Error.txt
fi
Sure:
read s
$s > /home/user/"$s".txt
If I enter uname, this runs it, and /home/user/uname.txt ends up containing Linux. But beware: this is a security nightmare. What if someone enters rm -rf $HOME? You'd also have issues with commands containing a slash.

How to read from user within while-loop read line?

I had a bash file which prompted the user for some parameters and used defaults if nothing was given. The script then went on to perform some other commands with the parameters.
This worked great - no problems until most recent addition.
In an attempt to read the NAMES parameter from a txt file, I've added a while-loop to take in the names in the file, but I would still like the remaining parameters prompted for.
But once I added the while loop, the output shows the printed prompt in get_ans() and never pauses for a read, thus all the defaults are selected.
I would like to read the first parameter from a file, then all subsequent parameters by prompting the user.
What did I break by adding the while-loop?
cat list.txt |
while read line
do
get_ans "Name" "$line"
read NAME < $tmp_file
get_ans "Name" "$line"
read NAME < $tmp_file
done
function get_ans
{
if [ -f $tmp_file ]; then
rm $tmp_file
fi
PROMPT=$1
DEFAULT=$2
echo -n "$PROMPT [$DEFAULT]: "
read ans
if [ -z "$ans" ]; then
ans="$DEFAULT"
fi
echo "$ans" > $tmp_file
}
(NOTE: Code is not copy&paste so please excuse typos. Actual code has function defined before the main())
You pipe data into your while loop's stdin, so the read in get_ans is also taking data from that stdin stream.
You can pipe data into while on a different file descriptor to avoid the issue and stop bothering with temp files:
while read -u 9 line; do
NAME=$(get_ans Name "$line")
done 9< list.txt
get_ans() {
local PROMPT=$1 DEFAULT=$2 ans
read -p "$PROMPT [$DEFAULT]: " ans
echo "${ans:-$DEFAULT}"
}
To read directly from the terminal, not from stdin (assuming you're on a *NIX machine, not a Windows machine):
while read foo</some/file; do
read bar</dev/tty
echo "got <$bar>"
done
When you pipe one command into another on the command line, like:
$ foo | bar
The shell is going to set it up so that bar's standard input comes from foo's standard output. Anything that foo sends to stdout will go directly to bar's stdin.
In your case, this means that the only thing that your script can read from is the standard output of the cat command, which will contain the contents of your file.
Instead of using a pipe on the command line, make the filename be the first parameter of your script. Then open and read from the file inside your code and read from the user as normal.
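A sketch of that arrangement (list file as the first argument, opened on its own file descriptor so the prompting read still uses the terminal via stdin):
#!/bin/bash
# usage: ./script list.txt
exec 3< "$1"                       # the list file lives on fd 3
while read -r -u 3 line
do
read -r -p "Name [$line]: " ans    # reads from stdin, i.e. the user
NAME=${ans:-$line}
printf 'Using NAME=%s\n' "$NAME"
done
exec 3<&-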

Accessing each line using a $ sign in linux

Whenever I execute a Linux command that outputs multiple lines, I want to perform some operation on each line of the output. Generally I do
command something | while read a
do
some operation on $a;
done
This works fine. But my question is: is there some way I can access each line via a predefined symbol (don't know what to call it), something like $? or $! or $_?
Is it possible to do
cat to_be_removed.txt | rm -f $LINE
Is there a predefined $LINE in bash, or is the previous one the shortest way, i.e.
cat to_be_removed.txt | while read line; do rm -f $line; done;
xargs is what you're looking for:
cat to_be_removed.txt | xargs rm -f
Watch out for spaces in your filenames if you use that one, though. Check out the xargs man page for more information.
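If the names can contain spaces and you have GNU xargs, splitting the input on newlines only is one way around that:
xargs -d '\n' rm -f < to_be_removed.txt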
You might be looking for the xargs command.
It takes control arguments, plus a command and optionally some arguments for the command. It then reads its standard input, normally splitting at white space, and then arranges to repeatedly execute the command with the given arguments and as many 'file names' read from the standard input as will fit on the command line.
rm -f $(<to_be_removed.txt)
This works because rm can take multiple files as input. It also makes it much more efficient, because you only call rm once and you don't need to create a pipe to cat or xargs.
On a separate note, rather than using pipes in a while loop, you can avoid a subshell by using process substitution:
while read line; do
some operation on "$line";
done < <(command something)
The additional benefit you get by avoiding a subshell is that variables you change inside the loop maintain their altered values outside the loop as well. This is not the case when using the pipe form and it is a common gotcha.
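A quick illustration of that gotcha, with printf standing in for any multi-line command:
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count+1)); done
echo "$count"    # 0 - the loop ran in a subshell, so the change was lost

count=0
while read -r line; do count=$((count+1)); done < <(printf 'a\nb\nc\n')
echo "$count"    # 3 - the loop ran in the current shell, so the change survives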
