Accessing each line using a $ sign in Linux

Whenever I execute a Linux command that outputs multiple lines, I want to perform some operation on each line of the output. Generally I do
command something | while read a
do
some operation on $a;
done
This works fine. But my question is: is there some way I can access each line through a predefined symbol (I don't know what to call it), something like $? or $! or $_?
Is it possible to do
cat to_be_removed.txt | rm -f $LINE
Is there a predefined $LINE in bash, or is the previous one the shortest way? i.e.
cat to_be_removed.txt | while read line; do rm -f $line; done;

xargs is what you're looking for:
cat to_be_removed.txt | xargs rm -f
Watch out for spaces in your filenames if you use that one, though. Check out the xargs man page for more information.
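If the file names can contain spaces, one hedged workaround is to split on newlines or NUL bytes instead of whitespace (the -d option below assumes GNU xargs; the tr form also works with BSD xargs):
xargs -d '\n' rm -f < to_be_removed.txt            # split on newlines only (GNU xargs)
tr '\n' '\0' < to_be_removed.txt | xargs -0 rm -f  # NUL-delimited input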

You might be looking for the xargs command.
It takes control arguments, plus a command and optionally some arguments for the command. It then reads its standard input, normally splitting at white space, and then arranges to repeatedly execute the command with the given arguments and as many 'file names' read from the standard input as will fit on the command line.
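For example, a quick way to see the batching behaviour is to cap the number of arguments per invocation (the -n 2 here is only to make the grouping visible):
printf '%s\n' a b c d e | xargs -n 2 echo
# echo runs three times, printing: a b, then c d, then e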

rm -f $(<to_be_removed.txt)
This works because rm can take multiple files as arguments. It is also much more efficient, because you call rm only once and you don't need to create a pipe to cat or xargs.
On a separate note, rather than using pipes in a while loop, you can avoid a subshell by using process substitution:
while read line; do
some operation on $line;
done < <(command something)
The additional benefit you get by avoiding a subshell is that variables you change inside the loop maintain their altered values outside the loop as well. This is not the case when using the pipe form and it is a common gotcha.
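A minimal sketch of that gotcha, assuming seq is available:
count=0
seq 3 | while read -r line; do count=$((count + 1)); done
echo "$count"    # prints 0: the loop ran in a subshell
count=0
while read -r line; do count=$((count + 1)); done < <(seq 3)
echo "$count"    # prints 3: the loop ran in the current shell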

Related

How can I use xargs to run a function in a command substitution for each match?

While writing Bash functions for string replacements I have encountered a strange behaviour when using xargs. This is actually driving me mad currently as I cannot get it to work.
Fortunately I have been able to nail it down to the following simple example:
Define a simple function which doubles every character of the given parameter:
function subs { echo $1 | sed -E "s/(.)/\1\1/g"; }
Call the function:
echo $(subs "ABC")
As expected the output is:
AABBCC
Now call the function using xargs:
echo "ABC" | xargs -I % echo $(subs "%")
Surprisingly the result now is:
ABCABC
It seems as if the sed command inside the function treats the whole string now as a single character.
Why does this happen and how can it be prevented?
You might ask why I use xargs at all. Of course, this is a simplified example and the actual use case is much more complex.
In the original use case, I have a program which produces lots of output. I pipe the output through several greps to get the lines of interest. Afterwards, I pipe the lines to sed to extract the data I need from them. Because some transformations I need to do on the data are too complex for regular expressions alone, I'd like to use a function for these. So my original idea was to simply pipe into the function, but I couldn't get that to work and ended up with the xargs solution. My original idea was something like this:
command | grep ... | grep ... | grep ... | sed ... | subs
BTW: I do not do this from the command line but from within a script. The function is defined in the very same script in which it is used.
I'm using Bash 3.2 (Mac OS X default), so fancy Bash 4.x stuff won't help me, sorry.
I'll be happy about anything that might shed some light on this topic.
If you really need to do this (and you probably don't, but we can't help without a more representative sample), a better-practice approach might look like:
subs() { sed -E "s/(.)/\1\1/g" <<<"$1"; }
export -f subs
echo "ABC" | xargs bash -c 'for arg; do subs "$arg"; done' _
The use of echo "$(subs "$arg")" instead of just subs "$arg" adds nothing but bugs (consider what happens if one of your arguments is -n -- and that's assuming a relatively tame echo; they're allowed to consume backslashes even without a -e argument and to do all manner of other surprising things). You could do it above, but it slows your program down and makes it more prone to surprising behaviors; there's no point.
Running export -f subs exports your function to the environment, so it can be run by other instances of bash invoked as child processes (all programs invoked by xargs run outside your shell, so they can't see shell-local variables or functions).
Without -I -- which is to say, in its default mode of operation -- xargs appends arguments to the end of the command it's given. This permits a much more efficient usage mode, where instead of invoking one command per line of input, it passes as many arguments as possible to the shortest possible number of subprocesses.
This also avoids major security bugs that can happen when using xargs -I in conjunction with bash -c '...' or sh -c '...'. (If you ever use -I% sh -c '...%...', then your filenames become part of your code, and are able to be used in injection attacks on your system).
That's because the construct $(subs "%") gets expanded by the shell when it parses the pipeline, before xargs runs, so xargs is actually invoked as echo %%. xargs -I then replaces each % with ABC, producing ABCABC.
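If the underlying goal is the command | ... | subs pipeline from the question, a minimal sketch, assuming the transformation can be written as a plain stdin filter, is:
subs() { sed -E 's/(.)/\1\1/g'; }    # reads stdin instead of a positional parameter
printf '%s\n' ABC | subs             # prints AABBCC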

Loop for reading more than 1 query in a bash file

I need a loop in a Bash script (analysis-run.sh) for running many queries. As I have many queries I can't run them manually so I need a way to automate them. So far, I created a file inputs.txt with all my queries and at the end of the bash script file I added the following:
while read f ; do
./analysis-run.sh $f
done < inputs.txt
With that loop, analysis-run is only running the first query of inputs.txt over and over again. I am really new to this, so any help would be appreciated.
The content of inputs.txt is:
bones
muscles
blood
saliva
and so on..
The content of analysis-run.sh is:
# Execute this script as ./analysis-run.sh [query] [group]
query=$1
group=$2
if [ $group = "clean" ]; then
cluster=A
else
cluster=B
fi
adamo-obtain_bundance.py -query $query -ref combined_$cluster.$group.align -splits 1 -group $group
adamo-obtain_structure.py -i $query.combined_$query$group.csv -o $query.$group -cutoff 0.5 -group $group
With that loop, analysis-run is only running the first query of inputs.txt over and over again.
The problem (probably) is that you need to quote $f:
while read -r f ; do
./analysis-run.sh "$f"
done < inputs.txt
Without the quotes, the line read from inputs.txt will be subject to word splitting and glob expansion.
Read http://tldp.org/LDP/abs/html/quotingvar.html
And run your scripts through ShellCheck.
Using loops in Bash can sometimes work, but it is loaded with perils.
Using xargs is usually the cleanest, most robust approach...
<inputs.txt xargs --max-args=1 do_something
The command to execute could be provided as a Bash function...
function do_something
{
echo value=${1}
}
Although the call to xargs is somewhat more involved when taking that approach. See: Calling functions with xargs within a bash script
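For reference, a minimal sketch of that more involved form, reusing the do_something function above (export -f is what makes the function visible to the child bash that xargs starts):
export -f do_something
<inputs.txt xargs --max-args=1 bash -c 'do_something "$1"' _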
About xargs
xargs takes a list of arguments (usually file names), which are provided as an input file or stream, and it places those arguments on the command-line for another specified command or function. If the command can handle multiple input arguments, you can drop the --max-args=1 option.
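Applied to the script from the question, that might look something like the following sketch (passing clean as the group argument is only an assumption for illustration):
<inputs.txt xargs -I{} ./analysis-run.sh {} clean    # one invocation per line of inputs.txt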

Can I avoid using a FIFO file to join the end of a Bash pipeline to be stored in a variable in the current shell?

I have the following functions:
execIn ()
{
    local STORE_INvar="${1}" ; shift
    # Run the remaining arguments as a command; the trailing sentinel 'x'
    # keeps the command substitution from stripping trailing newlines.
    printf -v "${STORE_INvar}" '%s' "$( eval "$@" ; printf %s x ; )"
    # Strip the sentinel again.
    printf -v "${STORE_INvar}" '%s' "${!STORE_INvar%x}"
}
and
getFifo ()
{
    local FIFOfile
    FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
    while [ -e "${FIFOfile}" ]
    do
        FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
    done
    mkfifo "${FIFOfile}"
    echo "${FIFOfile}"
}
I want to store the output of the end of a pipeline into a variable, as given to a function at the end of the pipeline. However, the only way I have found to do this that will work in early versions of Bash is to use mkfifo to make a temp FIFO file. I was hoping to use file descriptors to avoid having to create temporary files. So, this works, but is not ideal:
Set Up: (before I can do this I need to have assigned a FIFO file to a var that can be used by the rest of the process)
$ FIFOfile="$( getFifo )"
The Pipeline I want to persist:
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 # for e.g.
The action: (I can now add) >${FIFOfile} &
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 >${FIFOfile} &
N.B. the need to background it with &. Problem 1: I get [1] <PID_NO> output to the screen.
The actual persist:
$ execIn SOME_VAR cat - <${FIFOfile}
Problem 2: I get more noise to the screen
[1]+ Done printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 > ${FIFOfile}
Problem 3: I lose the blanks at the start of the stream rather than at the end, as I have experienced before.
So, am I doing this the right way? I am sure there must be a way to avoid the need for a FIFO file (which needs cleanup afterwards) by using file descriptors, but I cannot seem to do this, as I cannot assign either side of the problem to a file descriptor that is not attached to a file or a FIFO file.
I can try and resolve the problems with what I have, although to make this work properly I guess I need to pre-establish a pool of FIFO files that can be pulled in to use or else I have a pre-req of establishing this file before the command. So, for many reasons this is far from ideal. If anyone can advise me of a better way you would make my day/week/month/life :)
Process substitution was available in bash from the ancient days. You absolutely do not have a version so ancient as to be unable to use it. Thus, there's no need to use a FIFO at all:
readToVar() { IFS= read -r -d '' "$1"; }
readToVar targetVar < <(printf '\n\n123\n456\n524\n789\n\n\n')
You'll observe that:
printf '%q\n' "$targetVar"
...correctly preserves the leading newlines as well as the trailing ones.
By contrast, in a use case where you can't afford to lose stdin:
readToVar() { IFS= read -r -d '' "$1" <"$2"; }
readToVar targetVar <(printf '\n\n123\n456\n524\n789\n\n\n')
If you really want to pipe to this command, are willing to require a very modern bash, and don't mind being incompatible with job control:
set +m # disable job control
shopt -s lastpipe # in a pipeline, parent shell becomes right-hand side
readToVar() { IFS= read -r -d '' "$1"; }
printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 | readToVar targetVar
The issues you claim to run into with using a FIFO do not actually exist. Put this in a script, and run it:
#!/bin/bash
trap 'rm -rf "$tempdir"' 0 # cleanup on exit
tempdir=$(mktemp -d -t fifodir.XXXXXX)
mkfifo "$tempdir/fifo"
printf '\n\n123\n456\n524\n789\n\n\n' >"$tempdir/fifo" &
IFS= read -r -d '' content <"$tempdir/fifo"
printf '%q\n' "$content" # print content to console
You'll notice that, when run in a script, there is no "noise" printed to the screen, because all that status is explicitly tied to job control, which is disabled by default in scripts.
You'll also notice that both leading and trailing newlines are correctly represented.
One idea, tell me I am crazy, might be to use the !! notation to grab the line just executed. If there were a command that could terminate a pipeline and stop it actually executing, while still letting the shell consider it a successful execution (I am thinking of something like the true command), I could then use !! to grab that line and call my existing function to execute it with process substitution or something. I could then wrap this into an alias, something like: alias streamTo=' | true ; LAST_EXEC="!!" ; myNewCommandVariation <<<' which I think could be used something like: $ cmd1 | cmd2 | myNewCommandVariation THE_VAR_NAME_TO_SET, where the <<< from the alias would pass the var name to the command as an arg or stdin; either way, the command would not be at the end of a pipeline. How mad is this idea?
Not a full answer but rather a first point: is there some good reason for not using mktemp to create a new file with a random name? As far as I can see, your getFifo() function doesn't do much more than that.
mktemp -u
will give you a fresh name without creating anything; then you can use mkfifo with this name.
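A minimal sketch of that suggestion, keeping the question's naming (mkfifo will simply fail if the name is taken between the two steps, so the race is at least detectable):
getFifo ()
{
    local FIFOfile
    FIFOfile="$( mktemp -u "/tmp/diamondLang-FIFO-$$-XXXXXX" )"   # prints a name, creates nothing
    mkfifo "${FIFOfile}"
    echo "${FIFOfile}"
}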

How to execute Linux shell variables within double quotes?

I have the following hacking challenge, where we don't know if there is a valid solution.
We have the following server script:
read s # read user input into var s
echo "$s"
# tests if it starts with 'a-f'
echo "$s" > "/home/user/${s}.txt"
We only control the input "$s". Is there a possibility to send OS commands like uname, or do you think "no way"?
I don't see any avenue for executing arbitrary commands. The script quotes $s every time it is referenced, so that limits what you can do.
The only serious attack vector I see is that the echo statement writes to a file name based on $s. Since you control $s, you can cause the script to write to some unexpected locations.
$s could contain a string like bob/important.txt. This script would then overwrite /home/user/bob/important.txt if executed with sufficient permissions. Sorry, Bob!
Or, worse, $s could be bob/../../../etc/passwd. The script would try to write to /home/user/bob/../../../etc/passwd. If the script is running as root... uh oh!
It's important to note that the script can only write to these places if it has the right permissions.
You could embed unusual characters in $s that would cause irregular file names to be created. Careless scripts could be taken advantage of. For example, if $s were foo -rf . bar, then the file /home/user/foo -rf . bar.txt would be created.
If someone later ran for file in /home/user/*; do rm $file; done they'd have a surprise on their hands. They would end up running rm /home/user/foo -rf . bar.txt, which is a disaster: take away /home/user/foo and bar.txt and you're left with rm -rf ., which deletes everything in the current directory. Oops!
(They should have quoted "$file"!)
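For contrast, a hedged sketch of the careful version (the -- guard against option-looking names is an extra precaution, not something the quoting alone provides):
for file in /home/user/*; do
    rm -- "$file"    # quotes keep each name as one argument; -- stops option parsing
done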
And there are two other minor things which, while I don't know how to take advantage of them maliciously, do cause the script to behave slightly differently than intended.
read allows backslashes to escape characters like space and newline. You can enter \space to embed spaces and \enter to have read parse multiple lines of input.
echo accepts a couple of flags. If $s is -n or -e then it won't actually echo $s; rather, it will interpret $s as a command-line flag.
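A small illustration of that echo quirk (printf is the usual way around it):
s='-n'
echo "$s"             # echo swallows -n as its own flag; nothing useful is printed
printf '%s\n' "$s"    # prints -n literally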
Use read -r s or any \ will be lost/misinterpreted by your command.
read -r s?"Your input: "
if [ -n "${s}" ]
then
# "filter" file name from command
echo "${s##*/}" | sed 's|^ *\([[:alnum:]_]\{1,\}\)[[:blank:]].*|/home/user/\1.txt|' | read Output
(
# put any limitation on user here
ulimit -t 5 1>/dev/null 2>&1
`${read}`
) > ${Output}
else
echo "Bad command" > /home/user/Error.txt
fi
Sure:
read s
$s > /home/user/"$s".txt
If I enter uname, this writes Linux into /home/user/uname.txt. But beware: this is a security nightmare. What if someone enters rm -rf $HOME? You'd also have issues with commands containing a slash.

understanding linux arguments and piping

So I'm trying to use the sh (Bourne Shell) to write some scripts. I keep running into this confusion. For the following:
1. rm `echo test`
2. echo test | rm
I know backticks are used to run the command first, okay.
But for piping in #2, why doesn't rm take in test as an argument? Is there something about piping I don't understand? I thought it was simply sending output of one command as the input to another.
And... related to my piping confusion maybe.
dir=/blah/blar/blar
files=`ls ${dir} -rt`
count=`wc -l $files` # doesn't work, in fact it's running it along with each file that exists
count2=`$files | wc -l` # doesn't work
How come I can't store the ls into "files" and use that?
You would need to use xargs there, as rm takes the files to delete as arguments; it doesn't read from stdin (which is what pipes feed).
echo test | xargs rm
The first one works because backticks are for command substitution, much like $() but not as easy to nest. :)
Alternatively, you could use find.
find . -name test -exec rm -f '{}' \;
In the first case the results of echo test (the string test) are being provided as a command-line argument to rm. In the second, the string test is being piped to the stdin file descriptor of the rm process. These are two very different things. Since rm doesn't read from stdin, it never sees test.
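A quick way to see the difference between the two channels (the file names involved are purely illustrative):
echo test | cat     # cat reads stdin, so it prints test
echo test | rm      # rm ignores stdin and complains about a missing operand
rm `echo test`      # test becomes a command-line argument; rm tries to delete a file named test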
