optimize xargs argument enumeration - linux

Can this usage of xargs argument enumeration be optimized further?
The aim is to inject a single argument in the middle of the actual command.
I do:
echo {1..3} | xargs -I{} sh -c 'for i in {};do echo line $i here;done'
or
echo {1..3} | for i in $(xargs -n1);do echo line $i here; done
I get:
line 1 here
line 2 here
line 3 here
which is what I need but I wondered if loop and temporary variable could be avoided?

You need to separate the input to xargs by newlines:
echo {1..3}$'\n' | xargs -I% echo line % here
For array expansions, you can use printf:
ar=({1..3})
printf '%s\n' "${ar[@]}" | xargs -I% echo line % here
(and if it's just for output, you can use it without xargs:
printf 'line %s here\n' "${ar[@]}"
)
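A minimal sketch of the printf-only approach wrapped in a function. The name render_lines is made up for illustration and is not part of the answer above. It relies on printf reusing its format string until all arguments are consumed.

```shell
# render_lines is a hypothetical helper name used only for this sketch.
render_lines() {
    # With no arguments printf would still print the format once, so guard.
    [ "$#" -gt 0 ] || return 0
    # printf repeats the format for every remaining argument.
    printf 'line %s here\n' "$@"
}

render_lines 1 2 3
```

In bash the same function can be called with an array, e.g. render_lines "${ar[@]}".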

Try without xargs. For most situations xargs is overkill.
Depending on what you really want you can choose a solution like
# Normally you want to avoid for and use while, but here you want the words split.
for i in $(echo 1 2 3);do
echo line $i here;
done
# When you want 1 line turned into three, `tr` can help
echo {1..3} | tr " " "\n" | sed 's/.*/line & here/'
# printf will repeat itself when there are parameters left
printf "line %s here\n" $(echo {1..3})
# Using the printf feature you can avoid the echo
printf "line %s here\n" {1..3}

Maybe this?
echo {1..3} | tr " " "\n" | xargs -n1 sh -c ' echo "line $0 here"'
The tr replaces the spaces with newlines, so xargs sees three lines. I would not be surprised if there were a better (more efficient) solution, but this one is quite simple.
Please note I have modified my previous answer to remove the use of {}, which was suggested in the comments to eliminate a potential code injection vulnerability.
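The injection point mentioned above can be sketched as follows, assuming the goal is only to print each token. The helper name each_token is an assumption for this example. The token is handed to sh as a positional parameter instead of being pasted into the script text, so the shell never parses it as code.

```shell
# each_token is a hypothetical helper name used only for this sketch.
each_token() {
    # The trailing "sh" fills $0, so every input token arrives as $1.
    xargs -n 1 sh -c 'printf "line %s here\n" "$1"' sh
}

echo 1 2 3 | each_token
```

By contrast, with -I{} inside the sh -c string the input is substituted into the script before sh parses it, which is what makes the injection possible.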

There is a little-known feature of GNU sed. You can add the e flag to the s command, and sed then executes whatever is in the pattern space and replaces the pattern space with the output of that command.
If you are really only interested in the output of the echo commands, you might try this GNU sed example, which eliminates the temporary variable, the loop (and the xargs as well):
echo {1..3} | sed -r 's/([^ ]+)/echo "line \1 here"\n/ge'
it fetches one token (i.e. whatever is separated by the spaces)
replaces it with echo "line \1 here"\n command, with \1 replaced by the token
then executes echo
puts the output of the echo command back into pattern space
that means it outputs the result of the three echos
But an even better way to get the desired output is to skip the execution and do the transformation directly in sed, like this:
echo {1..3} | sed -r 's/([^ ]+) ?/line \1 here\n/g'

Related

how to not escape space and backslashes in echo and while in bash?

I'm passing two positional arguments to a script; both are paths, and the script then walks the directories under them. The problem is that some paths look like m i sc . . . . .. . . (dots and spaces), and some directory names even contain a backslash.
So I tried to read the arguments in two ways, directly and via the at sign.
SOURCE_ARG=$1
DESTINATION_ARG=$2
and
ARG_COUNT=0
for POSITIONAL_ARGUMENTS in "${@}"
do
((ARG_COUNT++))
ARGUMENT_ARRAY[$ARG_COUNT]=$POSITIONAL_ARGUMENTS
done
In the loop, I iterate over the output of the commands that are fed into it.
while IFS= read -r dir
do
echo "${ARGUMENT_ARRAY[1]}"
echo "${dir}"
while IFS= read -r item
do
# do some stuff
done < <(ls -A "$dir"/)
done < <(du -hP "$SOURCE_ARG" | awk '{$1=""; print $0}' | grep -v "^.$" | sed "s/^ //g")
When I use echo "${ARGUMENT_ARRAY[1]}" I get the path exactly as I passed it, but when I print the loop variable with echo "${dir}" the spaces come out escaped, so the other commands that use that path cannot do their job.
What I'm asking is: how can I get $dir inside the loop in the same form as echo "${ARGUMENT_ARRAY[1]}" above (the input with all its spaces and backslashes intact)?
Thanks to @Barmar in the comments.
The file names come out escaped because du prints them with escapes, so $dir already holds the escaped form and the original special characters are no longer available to the inner loop.
So the problem was caused by using du in my script:
while IFS= read -r dir
do
: # do sth
done < <(du -hP "$SOURCE_ARG" | awk '{$1=""; print $0}' | grep -v "^.$" | sed "s/^ //g")
We can change the du to find and the problem is solved:
while IFS= read -r dir
do
: # do sth
done < <(find "$SOURCE_ARG" -type d)
PS 1:
Another problem came up with echo when I printed the lines to check them while debugging.
So be sure to use printf "%s\n" "$dir" instead of echo, as some versions of echo process escape sequences.
echo "${dir}"
printf "%s\n" "$dir"
PS 2:
Also, if a file name has more than one space in a row, the way I used awk collapsed them into a single space:
awk '{$1=""; print $0}' | grep -v "^.$" | sed "s/^ //g"
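A minimal sketch of the find-based replacement, assuming only the directory names are needed. The helper name print_dirs is hypothetical. It uses find -exec instead of a pipeline, so each name reaches the inner shell as one unmodified argument, with no escaping and no collapsing of repeated spaces.

```shell
# print_dirs is a hypothetical helper name used only for this sketch.
# find hands each directory name to the inner sh as a separate argument,
# so spaces, runs of spaces and backslashes arrive unchanged.
print_dirs() {
    find "$1" -type d -exec sh -c '
        for dir do
            printf "%s\n" "$dir"
        done
    ' sh {} +
}

# Demo on a throwaway directory with an awkward name (two spaces in a row).
demo=$(mktemp -d)
mkdir -p "$demo/m  i sc . ."
print_dirs "$demo"
rm -rf "$demo"
```

The per-directory work (the "do some stuff" part of the question) would go inside the inner for loop in place of printf.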

Text formating - sed, awk, shell

I need some assistance trying to build up a variable using a list of exclusions in a file.
So I have an exclude file I am using for rsync that looks like this:
*.log
*.out
*.csv
logs
shared
tracing
jdk*
8.6_Code
rpsupport
dbarchive
inarchive
comms
PR116PICL
**/lost+found*/
dlxwhsr*
regression
tmp
working
investigation
Investigation
dcsserver_weblogic_
dcswebrdtEAR_weblogic_
I need to build up a string to be used as a variable to feed into egrep -v, so that I can use the same exclusion list for rsync as I do when egrep -v from a find -ls.
So I have created this so far to remove all "*" and "/" - and then when it sees certain special characters it escapes them:
cat exclude-list.supt | while read line
do
echo $line | sed 's/\*//g' | sed 's/\///g' | 's/\([.-+_]\)/\\\1/g'
What I need the output to look like is this, and then export that as a variable:
SEXCLUDE_supt="\.log|\.out|\.csv|logs|shared|PR116PICL|tracing|lost\+found|jdk|8\.6\_Code|rpsupport|dbarchive|inarchive|comms|dlxwhsr|regression|tmp|working|investigation|Investigation|dcsserver\_weblogic\_|dcswebrdtEAR\_weblogic\_"
Can anyone help?
A few issues with the following:
cat exclude-list.supt | while read line
do
echo $line | sed 's/\*//g' | sed 's/\///g' | 's/\([.-+_]\)/\\\1/g'
sed reads its input line by line, so cat | while read line; do echo $line | sed is completely redundant. sed can also perform multiple substitutions, either as a semicolon-separated list or with the -e option, so piping to sed three times is two too many. A problem with '[.-+_]' is that the - sits between . and +, so it is interpreted as the range .-+; to use a literal - inside a character class, put it at the beginning or end, like [._+-].
A much better way:
$ sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file
\.log
\.out
\.csv
logs
shared
tracing
jdk
8\.6\_Code
rpsupport
dbarchive
inarchive
comms
PR116PICL
lost\+found
dlxwhsr
regression
tmp
working
investigation
Investigation
dcsserver\_weblogic\_
dcswebrdtEAR\_weblogic\_
Now we can pipe through tr '\n' '|' to replace the newlines with pipes for the alternation ready for egrep:
$ sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file | tr "\n" "|"
\.log|\.out|\.csv|logs|shared|tracing|jdk|8\.6\_Code|rpsupport|dbarchive|...
$ EXCLUDE=$(sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' file | tr "\n" "|")
$ echo $EXCLUDE
\.log|\.out|\.csv|logs|shared|tracing|jdk|8\.6\_Code|rpsupport|dbarchive|...
Note: If your file ends with a newline character you will want to remove the final trailing |, try sed 's/\(.*\)|/\1/'.
This might work for you (GNU sed):
SEXCLUDE_supt=$(sed '1h;1!H;$!d;g;s/[*\/]//g;s/\([._+-]\)/\\\1/g;s/\n/|/g' file)
This should work but I guess there are better solutions. First store everything in a variable:
SEXCLUDE_supt=$( sed -e 's/\*//g' -e 's/\///g' -e 's/\([._+-]\)/\\\1/g' exclude-list.supt)
and then process it again to substitute white space:
SEXCLUDE_supt=$(echo $SEXCLUDE_supt |sed 's/\s/|/g')
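A sketch that combines the steps above and avoids the trailing | by joining the lines with paste instead of tr. The function name build_exclude is an assumption for this example, and the demo uses only a few entries from the question's list.

```shell
# build_exclude is a hypothetical helper name; $1 is the rsync exclude file.
build_exclude() {
    # Drop * and /, escape . _ + -, then join the lines with | (no trailing |).
    sed -e 's,[*/],,g' -e 's/[._+-]/\\&/g' "$1" | paste -s -d '|' -
}

# Demo with a few entries from the question, written to a temporary file.
demo=$(mktemp)
printf '%s\n' '*.log' 'logs' '**/lost+found*/' 'jdk*' > "$demo"
SEXCLUDE_supt=$(build_exclude "$demo")
printf '%s\n' "$SEXCLUDE_supt"
rm -f "$demo"
```

The resulting variable can then be passed to egrep -v "$SEXCLUDE_supt" as the question intends.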

Convert first letter of given file to lower case

I want to convert the 1st letter of each line to lower case up to the end of the file. How can I do this using shell scripting?
I tried this:
plat=`echo $plat |cut -c1 |tr [:upper:] [:lower:]``echo $plat |cut -c2-`
but this converts only the first character to lower case.
My file looks like this:
Apple
Orange
Grape
Expected result:
apple
orange
grape
You can do that with sed:
sed -e 's/./\L&/' Shell.txt
(Probably safer to do
sed -e 's/^./\L&\E/' Shell.txt
if you ever want to extend this.)
Try:
plat=`echo $plat |cut -c1 |tr '[:upper:]' '[:lower:]'``echo $plat |cut -c2-`
Pure Bash 4.0+ , parameter substitution:
>"$outfile" # empty output file
while read -r ; do
echo "${REPLY,}" >> "$outfile" # first character to lowercase
done < "$infile"
mv "$outfile" "$infile"
Here is a single sed command that uses only POSIX sed features:
sed -e 'h;s,^\(.\).*$,\1,;y,ABCDEFGHIJKLMNOPQRSTUVWXYZ,abcdefghijklmnopqrstuvwxyz,;G;s,\
.,,'
These are two lines, the first line ending in a backslash to quote the newline character.
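For comparison, a minimal sketch of the same transformation in awk, which some may find easier to read than the sed one-liner. This is an alternative technique, not one of the answers above, and the name lower_first is hypothetical.

```shell
# lower_first is a hypothetical helper name used only for this sketch.
# It reads the named files, or standard input when no file is given.
lower_first() {
    # Lowercase the first character and append the rest of the line unchanged.
    awk '{ print tolower(substr($0, 1, 1)) substr($0, 2) }' "$@"
}

printf '%s\n' Apple Orange Grape | lower_first
```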

xargs with multiple arguments

I have a source input, input.txt
a.txt
b.txt
c.txt
I want to feed these input into a program as the following:
my-program --file=a.txt --file=b.txt --file=c.txt
So I try to use xargs, but with no luck.
cat input.txt | xargs -i echo "my-program --file="{}
It gives
my-program --file=a.txt
my-program --file=b.txt
my-program --file=c.txt
But I want
my-program --file=a.txt --file=b.txt --file=c.txt
Any idea?
Don't listen to all of them. :) Just look at this example:
echo argument1 argument2 argument3 | xargs -l bash -c 'echo this is first:$0 second:$1 third:$2'
Output will be:
this is first:argument1 second:argument2 third:argument3
None of the solutions given so far deals correctly with file names containing space. Some even fail if the file names contain ' or ". If your input files are generated by users, you should be prepared for surprising file names.
GNU Parallel deals nicely with these file names and gives you (at least) 3 different solutions. If your program takes 3 and only 3 arguments then this will work:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo a2.txt; echo b2.txt; echo c2.txt;) |
parallel -N 3 my-program --file={1} --file={2} --file={3}
Or:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo a2.txt; echo b2.txt; echo c2.txt;) |
parallel -X -N 3 my-program --file={}
If, however, your program takes as many arguments as will fit on the command line:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo d1.txt; echo e1.txt; echo f1.txt;) |
parallel -X my-program --file={}
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ
How about:
echo $'a.txt\nb.txt\nc.txt' | xargs -n 3 sh -c '
echo my-program --file="$1" --file="$2" --file="$3"
' argv0
It's simpler if you use two xargs invocations: 1st to transform each line into --file=..., 2nd to actually do the xargs thing ->
$ cat input.txt | xargs -I# echo --file=# | xargs echo my-program
my-program --file=a.txt --file=b.txt --file=c.txt
You can use sed to prefix --file= to each line and then call xargs:
sed -e 's/^/--file=/' input.txt | xargs my-program
Here is a solution using sed for three arguments, but is limited in that it applies the same transform to each argument:
cat input.txt | sed 's/^/--file=/g' | xargs -n3 my-program
Here's a method that will work for two args, but allows more flexibility:
cat input.txt | xargs -n 2 | xargs -I{} sh -c 'V="{}"; my-program -file=${V% *} -file=${V#* }'
I stumbled on a similar problem and found a solution which I think is nicer and cleaner than those presented so far.
The syntax for xargs that I have ended with would be (for your example):
xargs -I X echo --file=X
with a full command line being:
my-program $(cat input.txt | xargs -I X echo --file=X)
which will work as if
my-program --file=a.txt --file=b.txt --file=c.txt
was done (providing input.txt contains data from your example).
Actually, in my case I needed to first find the files and also needed them sorted so my command line looks like this:
my-program $(find base/path -name "some*pattern" -print0 | sort -z | xargs -0 -I X echo --files=X)
Few details that might not be clear (they were not for me):
some*pattern must be quoted since otherwise shell would expand it before passing to find.
-print0, then -z and finally -0 use null separation to ensure proper handling of files with spaces or other weird names.
Note however that I didn't test it deeply yet. Though it seems to be working.
xargs doesn't work that way. Try:
myprogram $(sed -e 's/^/--file=/' input.txt)
It's because echo prints a newline. Try something like
echo my-program `xargs --arg-file input.txt -i echo -n " --file "{}`
I was looking for a solution to this exact problem and came to the conclusion that I should code a script in the middle.
To pass the whole line of standard output as one argument in the next example, use the -d '\n' delimiter option (GNU xargs).
Example:
user@mybox:~$ echo "file1.txt file2.txt" | xargs -d '\n' -n1 ScriptInTheMiddle.sh
Inside ScriptInTheMiddle.sh:
#!/bin/bash
var1=`echo "$1" | cut -d ' ' -f1`
var2=`echo "$1" | cut -d ' ' -f2`
myprogram "--file1=$var1" "--file2=$var2"
For this solution to work you need a space between the arguments file1.txt and file2.txt (or whatever delimiter you choose). One more thing: inside the script, check -f1 and -f2, as they mean "take the first word" and "take the second word" according to the delimiter's position (the delimiter could be ' ', ';', '.', whatever you wish, between single quotes).
Add as many parameters as you wish.
Problem solved using xargs, cut , and some bash scripting.
Cheers!
if you wanna pass by I have some useful tips http://hongouru.blogspot.com
Actually, it's relatively easy:
... | sed 's/^/--prefix=/g' | xargs echo | xargs -I PARAMS your_cmd PARAMS
The sed 's/^/--prefix=/g' is optional, in case you need to prefix each param with some --prefix=.
The xargs echo turns the list of param lines (one param in each line) into a list of params in a single line and the xargs -I PARAMS your_cmd PARAMS allows you to run a command, placing the params where ever you want.
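So cat input.txt | sed 's/^/--file=/g' | xargs echo | xargs -I PARAMS my-program PARAMS does what you need (assuming all lines within input.txt are simple and qualify as a single param value each). Note that xargs -I substitutes the whole input line as one argument, so my-program receives a single string here rather than three separate arguments.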
There is another nice way of doing this, if you do not know the number of files upfront:
my-program $(find . -name '*.txt' -printf "--file=%p ")
Nobody has mentioned echoing out from a loop yet, so I'll put that in for completeness' sake (it would be my second approach, the sed one being the first):
for line in $(< input.txt) ; do echo --file=$line ; done | xargs echo my-program
Old, but this is a better answer:
cat input.txt | gsed "s/\(.*\)/\-\-file=\1/g" | tr '\n' ' ' | xargs my_program
# I like clean one-liners
gsed is just GNU sed, used to make sure the syntax matches that version (brew install gsed); use plain sed if you're on GNU/Linux already.
test it:
cat input.txt | gsed "s/\(.*\)/\-\-file=\1/g" | tr '\n' ' ' | xargs echo my_program
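Several answers above note that file names with spaces are a problem. A minimal sketch that sidesteps word splitting by building the argument list in the shell itself, without xargs, is shown below. The helper name with_file_args is an assumption, and echo stands in for my-program in the demo.

```shell
# with_file_args is a hypothetical helper name used only for this sketch.
# Usage: with_file_args LIST COMMAND [ARGS...]
with_file_args() {
    list=$1
    shift
    # Append one --file=NAME argument per non-empty line of LIST.
    while IFS= read -r name || [ -n "$name" ]; do
        if [ -n "$name" ]; then
            set -- "$@" "--file=$name"
        fi
    done < "$list"
    # Run COMMAND with its original arguments plus the --file= arguments.
    "$@"
}

# Demo: echo stands in for the real program.
demo=$(mktemp)
printf '%s\n' a.txt b.txt c.txt > "$demo"
with_file_args "$demo" echo my-program
rm -f "$demo"
```

Each line becomes exactly one argument, so a name such as "b c.txt" is passed as --file=b c.txt in a single word.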

Linux using grep to print the file name and first n characters

How do I use grep to perform a search which, when a match is found, will print the file name as well as the first n characters in that file? Note that n is a parameter that can be specified and it is irrelevant whether the first n characters actually contains the matching string.
grep -l pattern *.txt |
while read line; do
echo -n "$line: ";
head -c $n "$line";
echo;
done
Change -c to -n if you want to see the first n lines instead of bytes.
You need to pipe the output of grep to sed to accomplish what you want. Here is an example:
grep mypattern *.txt | sed 's/^\([^:]*:.......\).*/\1/'
The number of dots is the number of characters you want to print. Many versions of sed provide an option, such as -r (GNU/Linux) or -E (FreeBSD), that enables extended regular expressions. This makes it possible to specify the number of characters numerically.
N=7
grep mypattern *.txt /dev/null | sed -r "s/^([^:]*:.{$N}).*/\1/"
Note that this solution is a lot more efficient than the others proposed, which invoke multiple processes.
There are few tools that print 'n characters' rather than 'n lines'. Are you sure you really want characters and not lines? The whole thing can perhaps be best done in Perl. As specified (using grep), we can do:
pattern="$1"
shift
n="$1"
shift
grep -l "$pattern" "$@" |
while IFS= read -r file
do
echo "$file:" $(dd if="$file" bs=1 count="$n" 2>/dev/null)
done
The quotes around $file preserve multiple spaces in file names correctly. We can debate the command line usage, currently (assuming the command name is 'ngrep'):
ngrep pattern n [file ...]
I note that @litb used 'head -c $n'; that's neater than the dd command I used. There might be some systems without head (but they'd be pretty archaic). I note that the POSIX version of head only supports -n and the number of lines; the -c option is probably a GNU extension.
Two thoughts here:
1) If efficiency was not a concern (like that would ever happen), you could check $status [csh] after running grep on each file. E.g.: (For N characters = 25.)
foreach FILE ( file1 file2 ... fileN )
grep targetToMatch ${FILE} > /dev/null
if ( $status == 0 ) then
echo -n "${FILE}: "
head -c25 ${FILE}
endif
end
2) GNU [FSF] head has a --verbose [-v] switch. GNU grep and xargs also offer --null, to accommodate filenames with spaces. And there's '--', to handle filenames like "-c". So you could do:
grep --null -l targetToMatch -- file1 file2 ... fileN |
xargs --null head -v -c25 --
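A minimal sketch that avoids parsing grep's output altogether by testing each file in turn, assuming a head that supports -c (as the answers above note, that option is not in POSIX). The function name show_matches is hypothetical.

```shell
# show_matches is a hypothetical helper name used only for this sketch.
# Usage: show_matches PATTERN N FILE...
show_matches() {
    pattern=$1
    n=$2
    shift 2
    for file do
        # grep -q only reports whether the file contains a match.
        if grep -q -e "$pattern" -- "$file"; then
            printf '%s: ' "$file"
            head -c "$n" -- "$file"
            echo
        fi
    done
}

# Demo on two throwaway files; only the first one matches.
demo=$(mktemp -d)
printf 'hello world\n' > "$demo/a b.txt"
printf 'nothing here\n' > "$demo/c.txt"
show_matches world 5 "$demo/a b.txt" "$demo/c.txt"
rm -rf "$demo"
```

This runs one grep per file, so it trades some efficiency for correct handling of file names with spaces.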

Resources