Why don't bc and xargs work together in one line? - linux

I need help using xargs(1) and bc(1) in the same line. I can do it in multiple lines, but I really want to find a one-line solution.
Here is the problem: the following line prints the size of file.txt
ls -l file.txt | cut -d" " -f5
And, the following line will print 1450 (which is obviously 1500 - 50)
echo '1500-50' | bc
Trying to add those two together, I do this:
ls -l file.txt | cut -d" " -f5 | xargs -0 -I {} echo '{}-50' | bc
The problem is, it's not working! :)
I know that xargs is probably not the right command to use, but it's the only command I can find that lets me decide where to put the argument I get from the pipe.
This is not the first time I've run into this kind of problem, so any help would be much appreciated.
Thanks

If you do
ls -l file.txt | cut -d" " -f5 | xargs -0 -I {} echo '{}-50'
you will see this output:
23
-50
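This means that bc never sees a complete expression on a single line.
Just use -n 1 instead of -0 (with -I, xargs already handles one input line at a time, so simply dropping -0 also works):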
ls -l file.txt | cut -d" " -f5 | xargs -n 1 -I {} echo '{}-50'
and you get
23-50
which bc will process happily:
ls -l file.txt | cut -d" " -f5 | xargs -n 1 -I {} echo '{}-50' | bc
-27
So your basic problem is that -0 expects NUL-terminated (\0) strings rather than lines. The trailing newline from the earlier commands in the pipe therefore ends up inside the substituted text and splits the expression that bc reads.
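The difference is easy to reproduce without a real file. A minimal sketch, assuming GNU xargs; size_line is a hypothetical stand-in for the ls | cut part of the pipeline and always prints 23 followed by a newline:

```shell
# Hypothetical stand-in for `ls -l file.txt | cut -d" " -f5`: prints "23" plus a newline.
size_line() { printf '23\n'; }

# With -0, xargs splits its input on NUL bytes only, so the newline stays inside {}.
with_nul=$(size_line | xargs -0 -I {} echo '{}-50')

# Without -0, -I takes one input line per command and the newline is dropped.
per_line=$(size_line | xargs -I {} echo '{}-50')

printf 'with -0:\n%s\n' "$with_nul"
printf 'without -0:\n%s\n' "$per_line"
```

Only the second form yields a single-line expression that can be piped to bc.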

This might work for you:
ls -l file.txt | cut -d" " -f5 | sed 's/.*/&-50/' | bc
In fact, you could remove the cut:
ls -l file.txt | sed -r 's/^(\S+\s+){4}(\S+).*/\2-50/' | bc
Or use awk:
ls -l file.txt | awk '{print $5-50}'

Parsing the output of the ls command is not the best idea (really).
You can use many other solutions, like:
find . -name file.txt -printf "%s\n"
or
stat -c %s file.txt
or
wc -c <file.txt
and you can use bash arithmetic to avoid unnecessary, slow process forks, like:
find . -type f -print0 | while IFS= read -r -d '' name
do
size=$(wc -c <"$name")
s50=$(( size - 50 ))
echo "the file=$name= size:$size minus 50 is: $s50"
done

Here is another solution, which uses only one external command, stat:
file_size=$(stat -c "%s" file.txt) # Get the file size
let file_size=file_size-50 # Subtract 50
If you really want to combine them into one line:
let file_size=$(stat -c "%s" file.txt)-50
The stat command gets you the file size in bytes. The syntax above is for Linux (I tested against Ubuntu). On the Mac the syntax is a little different:
let file_size=$(stat -f "%z" mini.csv)-50
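Since the stat flags differ between Linux and macOS, the wc -c approach mentioned earlier can be wrapped in a small function instead. A minimal sketch; size_minus, its arguments, and the demo file are hypothetical names introduced for illustration:

```shell
# size_minus FILE [N]: print FILE's size in bytes minus N (N defaults to 50).
# Hypothetical helper; wc -c is used because it takes the same form on Linux and macOS.
size_minus() {
    _size=$(wc -c < "$1") || return 1
    echo $(( _size - ${2:-50} ))
}

# Demo on a throwaway 100-byte file.
demo_file=$(mktemp)
printf '%0100d' 0 > "$demo_file"
size_minus "$demo_file"        # size minus the default 50
size_minus "$demo_file" 10     # size minus 10
rm -f "$demo_file"
```

The subtraction happens in shell arithmetic, so the only external process per call is wc.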

Related

Using STDIN from pipe in sed command to replace value in a file

I've got a command to perform a series of commands that produce a variable output string such as 123456. I want to pipe that to a sed command replacing a known string in a csv file that looks like this:
Fred,Wilma,Betty,Barney
However, the command below does not work and I haven't found any other references to using pipe values as the variable for a replace.
How does this code change if the values in the csv are in a random order and I always want to change the second value?
Example code:
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 > test.txt |
sed -i -e '/Wilma/ r test.txt' -e 's/Wilma//' input.csv
Contents of input.csv should become: Fred,0d522cd316,Betty,Barney
Okay, in
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 > test.txt | sed -i -e '/Wilma/ r test.txt' -e 's/Wilma//' input.csv
you have a bug. The "> test.txt" after cut redirects cut's output into the file, so the pipe that follows carries nothing to sed. You want either the redirect to a file or the pipe there, not both.
The way to take piped stdin and use it as a parameter in a command is through xargs.
find / -iname awk 2>/dev/null | sha256sum | cut -c1-10 | xargs --replace=INSERTED -- sed -i -e 's/Wilma/INSERTED/' input.csv
(...though that find|shasum is suspect too, in that the order of files is random(ish) and it matters for a reliable sum. You probably mean to "|sort" after find.)
(Some would sed -i -e "s/Wilma/$(find|sort|shasum|cut)/" f, but I ain't among them. Animals.)
For replacing a fixed string like "Wilma", try:
sed -i 's/Wilma/'"$(find / -iname awk 2>/dev/null |
sha256sum | cut -c1-10)"'/' input.csv
To replace the 2nd field no matter what's in it, try:
sed -i 's/[^,]*/'"$(find / -iname awk 2>/dev/null |
sha256sum | cut -c1-10)"'/2' input.csv
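If the replacement value might contain characters that are special to sed (such as / or &), awk can set the second field directly. A minimal sketch, assuming simple CSV without quoted commas; replace_second_field is a hypothetical helper, and it writes to stdout rather than editing input.csv in place:

```shell
# replace_second_field VALUE: read CSV lines on stdin and print them with
# field 2 replaced by VALUE. Hypothetical helper; assumes no quoted commas.
replace_second_field() {
    awk -F, -v OFS=, -v value="$1" '{ $2 = value; print }'
}

echo 'Fred,Wilma,Betty,Barney' | replace_second_field 0d522cd316
```

To update the file, redirect the output to a temporary file and move it over the original.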

different shell behaviour: bash omits newline, zsh keeps it

I have a script which searches source files that contain a "TODO"
note inside the comments. Furthermore I use a concatenation of grep, git blame, uniq and sort to get
the list ordered by the person who wrote the TODO comment.
The following works fine in bash and zsh:
#!/bin/bash
for FILE in $(grep -r -i "todo" apps/business | awk '{print $1}' | sed 's/://' | sed 's/\#//')
do
git blame $FILE | grep -i "todo"
done | sort -k2 | uniq
Now I want to count all the entries. Instead of calling the (time expensive)
grep/git blame again, I want to save everything into $MATCHES to count it
without evaluating it again.
MATCHES=$(for FILE in $(grep -r -i "todo" apps/business | awk '{print $1}' | sed 's/://' | sed 's/\#//')
do
git blame $FILE | grep -i "todo"
done | sort -k2 | uniq)
echo $MATCHES
That's where I experience different behaviour in bash/zsh:
zsh: Returns the same as the first script (as expected)
bash: Ignores the newlines of git blame, puts everything on one line. wc -l counts 1 line.
What am I missing here? Why is bash behaving differently here?
And how do I get bash to not-ignore the newline?
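zsh doesn't perform word-splitting on the unquoted parameter expansion $MATCHES by default. bash does, so echo receives the individual words and prints them on one line, separated by spaces. Use echo "$MATCHES" | wc -l, and bash should work as well.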
Note this is the wrong way to iterate over the output of a command; use a while loop and the read command instead.
grep -ri "todo" apps/business | awk '{print $1}' | sed -e 's/://' -e 's/\#//' |
while IFS= read -r FILE; do
git blame "$FILE" | grep -i todo
done | sort -k2 | uniq
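The effect of quoting can be checked without git or grep. A minimal sketch, where the three sample lines are made-up stand-ins for the git blame output:

```shell
# Three lines of sample text standing in for the git blame output.
matches=$(printf 'alice TODO one\nbob TODO two\ncarol TODO three\n')

# Quoted: the newlines survive and wc -l sees three lines.
quoted_count=$(echo "$matches" | wc -l)

# Unquoted: bash splits the value into words and echo joins them with spaces.
unquoted_count=$(echo $matches | wc -l)

echo "quoted: $((quoted_count)) unquoted: $((unquoted_count))"
```

Run under bash, the two counts differ; under zsh with default options they should match, because the unquoted expansion is not split.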

Output of wc -l without file-extension

I've got the following line:
wc -l ./*.txt | sort -rn
I want to cut off the file extension. With this code I get the output:
number filename.txt
for all my .txt-files in the .-directory. But I want the output without the file-extension, like this:
number filename
I tried piping to cut with different parameters, but all I managed was to cut off the whole filename with this command:
wc -l ./*.txt | sort -rn | cut -f 1 -d '.'
Assuming you don't have newlines in your filename you can use sed to strip out ending .txt:
wc -l ./*.txt | sort -rn | sed 's/\.txt$//'
Unfortunately, cut doesn't have a syntax for extracting columns according to an index from the end. One (somewhat clunky) trick is to use rev to reverse the line, apply cut to it and then rev it back:
wc -l ./*.txt | sort -rn | rev | cut -d'.' -f2- | rev
Using sed in more generic way to cut off whatever extension the files have:
$ wc -l *.txt | sort -rn | sed 's/\.[^\.]*$//'
14 total
8 woc
3 456_base
3 123_base
0 empty_base
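One caveat with the generic pattern: when the names carry a leading ./ (as in wc -l ./*.txt), a file without any extension would lose everything from the dot in ./ onward. A minimal sketch of a variant that also excludes / from the extension; strip_ext and the sample lines are hypothetical:

```shell
# strip_ext: remove the last extension from the end of each input line.
# Excluding "/" keeps the leading "./" of an extension-less name intact.
strip_ext() { sed 's/\.[^./]*$//'; }

printf '%s\n' '8 ./woc.txt' '3 ./archive.tar.gz' '2 ./README' '13 total' | strip_ext
```

Only the final extension is removed, so a name such as archive.tar.gz keeps its .tar part.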
A better approach is to use the actual file type reported by file(1) rather than the extension (what is the extension of tar.gz or other multi-part extensions?):
#!/bin/bash
for file; do
case $(file -b "$file") in
*ASCII*) echo "this is ascii" ;;
*PDF*) echo "this is pdf" ;;
*) echo "other cases" ;;
esac
done
This is a POC, not tested, feel free to adapt/improve/modify

Replacing unknown amount of blank spaces for X amount

Hey so I'm writing a linux script and I came to an interesting finding.
I've got a command that sorts the files inside a directory by size and prints the largest one. The command is as follows:
find . -type f -ls | sort -r -n -k7 | head -n 1
This will print something along the lines of
895918591 8 -r-w-x 1 user01 xdf 1931 28 march 23:21 ./myscript.sh
So I want to get the largest file size alone and print it. To separate it I used cut -d' ' -f2, but this leaves only empty output, because the number of spaces between columns is inconsistent.
So I tried doing something like this
find . -type f -ls | sort -r -n -k7 | head -n 1 | tr -d [:blank:] | cut -d' ' -f2
The issue is that this removes all the blank spaces, so now I can't separate the fields by a common separator. So I'm asking: is there a way to replace each run of blank spaces with a single blank space?
If not, at least any other way to get to that number of bytes?
Sed and Awk are great tools for this kind of thing. Sed is a regex-based language that modifies the contents of each line the Sed program receives, and Awk is also a line-oriented tool that automatically splits its input into fields.
To turn sequences of blanks into one blank (substitute all matches of /\s+/ with a single space) in Sed, use -E so that + means "one or more" (\s is a GNU extension):
$ find ... | sed -E 's/\s+/ /g'
To just print the first "word" (sequence of nonspaces) of each line in Awk:
$ find ... | awk '{print $1}'
http://tldp.org/LDP/abs/html/sedawk.html can get you started with these languages.
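Instead of cut you can use awk, which splits on runs of whitespace. In find -ls output the size is the 7th field (the same one sort -k7 uses):
find . -type f -ls | sort -r -n -k7 | head -n 1 | awk '{print $7}'
However, you can avoid head as well by letting awk exit after the first line:
find . -type f -ls | sort -r -n -k7 | awk '{print $7; exit}'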
The tool to squeeze multiple spaces into just one is tr -s:
tr translates
-s squeezes
Sample:
$ cat a
hello   this is   a sample    text with multiple    spaces
$ tr -s " " < a
hello this is a sample text with multiple spaces
If you then want to convert every space into X, just pipe to sed 's/ /X/g'.
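Once the blanks are squeezed, cut can address the columns reliably. A minimal sketch using a made-up line in the shape of the find -ls output from the question; squeeze_blanks is a hypothetical helper, and the line is assumed to have no leading blanks:

```shell
# squeeze_blanks: turn every run of spaces or tabs into a single space.
squeeze_blanks() { tr -s '[:blank:]' ' '; }

# Sample line with uneven spacing (values taken from the question).
line='895918591    8 -r-w-x   1 user01   xdf      1931 28 march 23:21 ./myscript.sh'

# After squeezing, the size is the 7th space-separated field.
size=$(printf '%s\n' "$line" | squeeze_blanks | cut -d' ' -f7)
echo "$size"
```

If a line does start with blanks, the first field after squeezing is empty and every column index shifts by one.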
I think you're overthinking the issue at hand:
find -type f -printf "%s\n"|sort -n|tail -n1
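The same -printf idea can also report which file is the largest. A minimal sketch, assuming GNU find (for -printf) and file names without tabs or newlines; the directory and file names are hypothetical:

```shell
# Throwaway directory with two files of known sizes (hypothetical names).
dir=$(mktemp -d)
printf 'aaaa' > "$dir/small.txt"
printf 'aaaaaaaaaa' > "$dir/big file.txt"

# %s is the size in bytes, %p the path; a tab separates them so cut can split safely.
largest=$(find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n 1)
largest_size=$(printf '%s\n' "$largest" | cut -f1)
largest_name=$(printf '%s\n' "$largest" | cut -f2-)

echo "$largest_name is $largest_size bytes"
rm -rf "$dir"
```

Because find prints the size itself, no column of ls-style output has to be parsed.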
Instead of using cut, you can try find's -printf action, which gives you control over what is displayed:
find . -type f -printf '%s\n' | sort -r -n | head -n 1
You're doing it wrong.
Parsing ls in any form (like find's -ls option) is a bad approach.
Do not use ls output for anything. ls is a tool for interactively looking at directory metadata. Any attempts at parsing ls output with code are broken.
I strongly suggest you read further about this subject; see "Parsing ls".
Instead, use the following function:
# Usage: largest [dir]
largest() {
local f size largest
while IFS= read -rd '' f; do
size=$(wc -c < "$f")
if (( size > largest[0] )); then
largest=("$size" "$f")
fi
done < <(find "${1-.}" -type f -print0)
printf '%s is the largest file in %s\n' "${largest[1]}" "${1-.}"
}

xargs with multiple arguments

I have a source input, input.txt
a.txt
b.txt
c.txt
I want to feed these input into a program as the following:
my-program --file=a.txt --file=b.txt --file=c.txt
So I try to use xargs, but with no luck.
cat input.txt | xargs -i echo "my-program --file="{}
It gives
my-program --file=a.txt
my-program --file=b.txt
my-program --file=c.txt
But I want
my-program --file=a.txt --file=b.txt --file=c.txt
Any idea?
Don't listen to all of them. :) Just look at this example:
echo argument1 argument2 argument3 | xargs -l bash -c 'echo this is first:$0 second:$1 third:$2'
Output will be:
this is first:argument1 second:argument2 third:argument3
None of the solutions given so far deals correctly with file names containing spaces. Some even fail if the file names contain ' or ". If your input files are generated by users, you should be prepared for surprising file names.
GNU Parallel deals nicely with these file names and gives you (at least) 3 different solutions. If your program takes 3 and only 3 arguments then this will work:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo a2.txt; echo b2.txt; echo c2.txt;) |
parallel -N 3 my-program --file={1} --file={2} --file={3}
Or:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo a2.txt; echo b2.txt; echo c2.txt;) |
parallel -X -N 3 my-program --file={}
If, however, your program takes as many arguments as will fit on the command line:
(echo a1.txt; echo b1.txt; echo c1.txt;
echo d1.txt; echo e1.txt; echo f1.txt;) |
parallel -X my-program --file={}
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ
How about:
echo $'a.txt\nb.txt\nc.txt' | xargs -n 3 sh -c '
echo my-program --file="$1" --file="$2" --file="$3"
' argv0
It's simpler if you use two xargs invocations: 1st to transform each line into --file=..., 2nd to actually do the xargs thing ->
$ cat input.txt | xargs -I# echo --file=# | xargs echo my-program
my-program --file=a.txt --file=b.txt --file=c.txt
You can use sed to prefix --file= to each line and then call xargs:
sed -e 's/^/--file=/' input.txt | xargs my-program
Here is a solution using sed for three arguments, but is limited in that it applies the same transform to each argument:
cat input.txt | sed 's/^/--file=/g' | xargs -n3 my-program
Here's a method that will work for two args, but allows more flexibility:
cat input.txt | xargs -n 2 | xargs -I{} sh -c 'V="{}"; my-program -file=${V% *} -file=${V#* }'
I stumbled on a similar problem and found a solution which I think is nicer and cleaner than those presented so far.
The syntax for xargs that I have ended with would be (for your example):
xargs -I X echo --file=X
with a full command line being:
my-program $(cat input.txt | xargs -I X echo --file=X)
which will work as if
my-program --file=a.txt --file=b.txt --file=c.txt
was done (providing input.txt contains data from your example).
Actually, in my case I needed to first find the files and also needed them sorted so my command line looks like this:
my-program $(find base/path -name "some*pattern" -print0 | sort -z | xargs -0 -I X echo --files=X)
A few details that might not be clear (they were not for me):
some*pattern must be quoted, since otherwise the shell would expand it before passing it to find.
-print0, then -z and finally -0 use null-separation to ensure proper handling of files with spaces or other weird names.
Note, however, that I haven't tested it thoroughly yet, though it seems to be working.
xargs doesn't work that way. Try:
myprogram $(sed -e 's/^/--file=/' input.txt)
It's because echo prints a newline. Try something like
echo my-program `xargs --arg-file input.txt -i echo -n " --file "{}`
I was looking for a solution to this exact problem and came to the conclusion that the answer was a script in the middle.
To hand each input line to the script as a single argument, use xargs with the -d '\n' delimiter (a GNU xargs option).
Example:
user@mybox:~$ echo "file1.txt file2.txt" | xargs -d '\n' -n1 ScriptInTheMiddle.sh
Inside ScriptInTheMiddle.sh:
#!/bin/bash
var1=`echo "$1" | cut -d ' ' -f1`
var2=`echo "$1" | cut -d ' ' -f2`
myprogram "--file1=$var1" "--file2=$var2"
For this solution to work, you need a space between the arguments file1.txt and file2.txt (or whatever delimiter you choose). One more thing: inside the script, check -f1 and -f2, as they mean "take the first word" and "take the second word" according to where the delimiter is found (the delimiter could be ' ', ';', '.', or whatever you wish between single quotes).
Add as many parameters as you wish.
Problem solved using xargs, cut, and some bash scripting.
Cheers!
if you wanna pass by I have some useful tips http://hongouru.blogspot.com
Actually, it's relatively easy:
... | sed 's/^/--prefix=/g' | xargs echo | xargs -I PARAMS your_cmd PARAMS
The sed 's/^/--prefix=/g' is optional, in case you need to prefix each param with some --prefix=.
The xargs echo turns the list of param lines (one param in each line) into a list of params in a single line and the xargs -I PARAMS your_cmd PARAMS allows you to run a command, placing the params where ever you want.
So cat input.txt | sed 's/^/--file=/g' | xargs echo | xargs -I PARAMS my-program PARAMS does what you need (assuming all lines within input.txt are simple and qualify as a single param value each).
There is another nice way of doing this, if you do not know the number of files upfront:
my-program $(find . -name '*.txt' -printf "--file=%p ")
Nobody has mentioned echoing out from a loop yet, so I'll put that in for completeness sake (it would be my second approach, the sed one being the first):
for line in $(< input.txt) ; do echo --file=$line ; done | xargs echo my-program
Old but this is a better answer:
cat input.txt | gsed "s/\(.*\)/\-\-file=\1/g" | tr '\n' ' ' | xargs my_program
# i like clean one liners
gsed is just GNU sed, used here to make sure the syntax matches that version (brew install gsed); use plain sed if you're already on GNU/Linux.
test it:
cat input.txt | gsed "s/\(.*\)/\-\-file=\1/g" | tr '\n' ' ' | xargs echo my_program
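The concern raised earlier about file names containing spaces can also be handled in plain shell, by appending one --file=NAME argument per line to the positional parameters. A minimal sketch; run_with_file_args is a hypothetical helper, and printf stands in for my-program so the resulting arguments are visible:

```shell
# run_with_file_args LIST CMD [ARGS...]: run CMD once, adding one --file=NAME
# argument per line of LIST. Hypothetical helper; names may contain spaces
# or quotes, but not newlines.
run_with_file_args() {
    _list=$1
    shift
    while IFS= read -r _name; do
        set -- "$@" "--file=$_name"
    done < "$_list"
    "$@"
}

# Demo: printf stands in for my-program and shows each argument in brackets.
list=$(mktemp)
printf '%s\n' 'a.txt' 'b c.txt' > "$list"
run_with_file_args "$list" printf '[%s]\n'
rm -f "$list"
```

Because each line becomes exactly one argument, a name such as "b c.txt" is not split the way it would be with an unquoted $(...) substitution.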

Resources