Linux shell stream redirection to run a list of commands directly

I have this svn project... to get a list of unadded files (in my case, hundreds):
svn status |grep "^?"
outputs
? somefile.txt
? somefile1.txt
? somefile2.txt
I was recently introduced to sed... so now I have a list of commands I want to run
svn status | grep "^?"|sed "s/^?/svn add/"
outputs
svn add somefile.txt
svn add somefile1.txt
svn add somefile2.txt
I realize I could just redirect it to a file:
svn status | grep "^?"|sed "s/^?/svn add/" >out.sh && sh out.sh && rm out.sh
But I'd like to avoid writing to a temporary file. Is there a way I can pipe it to some command like this:
svn status | grep "^?"|sed "s/^?/svn add/" |some_command_that_runs_each_line

What about bash/sh?
svn status | grep "^?"|sed "s/^?/svn add/" | bash

What you are looking for is the xargs command:
svn status | grep "^?" | sed "s/^..//" | xargs svn add
You can also use command substitution:
svn add `svn status | grep "^?" | cut -c 3-`
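Note that both forms above split on whitespace, so filenames containing spaces will break. If you have GNU xargs, its -d option lets you split on newlines only; a sketch (the sed pattern assumes svn status pads the filename with spaces after the ?):
svn status | grep "^?" | sed 's/^? *//' | xargs -d '\n' svn add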

How about:
svn status | grep "^?" | sed "s/^?/svn add/" | while read -r line
do
    $line
done

Related

How to get latest file from sftp server to local using mget in linux?

Hi, I am following the logic below to get the latest file from an sftp server, but it is copying all the files. What do I need to correct in my logic?
datadir="********"
cd ${datadir}
rm -f ${datadir}/my_data*.csv
rm -f ${logfile}
lftp<<END_SCRIPT
open sftp://${sftphost}
user ${sftpuser} ${sftppassword}
cd ${sftpfolder}
lcd $datadir
mget my_data*csv | sed 's/-\([1-9]\)\./-0\1\./g' | sort -r | sed 's/-0\([1-9]\)\./-\1\./g' | head -1
END_SCRIPT
In this code, mget my_data*csv executes first and transfers every matching file; the pipeline only filters the text mget prints, not which files get downloaded:
mget my_data*csv | sed 's/-\([1-9]\)\./-0\1\./g' | sort -r | sed 's/-0\([1-9]\)\./-\1\./g' | head -1
You just need to get the filename you want first and then do the mget filename.
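A sketch of that two-step approach, reusing the question's variables and its zero-padding sort trick (cls and -e are standard lftp features; adjust the glob and sort to your actual filenames):
latest=$(lftp -u "${sftpuser},${sftppassword}" "sftp://${sftphost}" -e "cd ${sftpfolder}; cls -1 my_data*.csv; bye" | sed 's/-\([1-9]\)\./-0\1\./g' | sort -r | sed 's/-0\([1-9]\)\./-\1\./g' | head -1)
lftp -u "${sftpuser},${sftppassword}" "sftp://${sftphost}" -e "cd ${sftpfolder}; lcd ${datadir}; get ${latest}; bye"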

Git: speed up this command for searching Git blame for todos

I'm using this command:
git ls-tree -r --name-only HEAD -- . | grep -E '\.(ts|tsx|js|jsx|css|scss|html)$' | xargs -n1 git blame -c -e | sed -E 's/^.+\.com>\s+//' | LC_ALL=C grep -F 'todo: ' | sort
This gets all the todos in my codebase, sorted by date. It is mostly taken from Use git to list TODOs in code sorted by date introduced; I'm not very good with the command line.
However, the grep 'todo: ' part takes a long time: about a minute for ~400 files, none of them particularly large. Is it possible to speed this up somehow?
Edit: I realized it's the git blame that's slow, not grep, so I did a search before running git blame:
git ls-tree -r --name-only HEAD -- . | grep -E '\.(ts|tsx|js|jsx|css|scss|html)$' | LC_ALL=C xargs grep -F -l 'todo: ' | xargs -n1 git blame -c -e | sed -E 's/^.+\.com>\s+//' | LC_ALL=C grep -F 'todo: ' | sort
Now it's 6s.

xargs (or something else) without space before parameter

I would like to execute something like this (git squash):
git rebase -i HEAD~3
extracting the 3 from git log:
git log | blabla | xargs git rebase -i HEAD~
This does not work because xargs inserts a space after HEAD~.
The problem is that I want to alias this command, so I cannot just use
git rebase -i HEAD~`git log | blabla`
because the command substitution would be evaluated when I define the alias, not when I run it.
I don't have to use xargs, I just need an alias (preferably not a function).
You can use the -I option of xargs:
git log | blabla | xargs -I% git rebase -i HEAD~%
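Because a "!" git alias runs through the shell, the pipeline is evaluated at run time, so this drops straight into ~/.gitconfig (a sketch; the alias name squash-last is made up and blabla stays the question's placeholder):
[alias]
    squash-last = "!git log | blabla | xargs -I% git rebase -i HEAD~%"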
Try this (GNU xargs treats -i as an older, deprecated spelling of -I{}):
git log | blabla | xargs -i bash -c 'git rebase -i HEAD~{}'

How to reference the output of the previous command twice in Linux command?

For instance, if I'd like to reference the output of the previous command once, I can use the command below:
ls *.txt | xargs -I % ls -l %
But how to reference the output twice? Like how can I implement something like:
ls *.txt | xargs -I % 'some command' % > %
PS: I know how to do it in a shell script, but I just want a simpler way to do it.
You can pass this argument to bash -c:
ls *.txt | xargs -I % bash -c 'ls -l "$1" > "out.$1"' - %
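Here the trailing - fills $0 of the inline script, so the filename substituted for % arrives as $1. Spelled out with stand-ins for 'some command' and the output name:
ls *.txt | xargs -I % bash -c 'wc -l "$1" > "$1.count"' - %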
You can look up 'tpipe' on SO; it will also lead you to 'pee' (which is not a good search term elsewhere on the internet). Basically, they are variants of the tee command which write to multiple processes instead of writing to files the way tee does.
However, with Bash, you can use Process Substitution:
ls *.txt | tee >(cmd1) >(cmd2)
This will write the input to tee to each of the commands cmd1 and cmd2.
You can arrange to lose standard output in at least two different ways:
ls *.txt | tee >(cmd1) >(cmd2) >/dev/null
ls *.txt | tee >(cmd1) | cmd2
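For example, with stand-in commands and output files (count.txt and sorted.txt are made up):
ls *.txt | tee >(wc -l > count.txt) >(sort -r > sorted.txt) > /dev/null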

How many open files for each process running for a specific user in Linux

Running Apache and JBoss on Linux, my server sometimes halts unexpectedly with a Too Many Open Files error.
I know we can raise the nproc and nofile limits in /etc/security/limits.conf to fix the open-files problem, but I would like better visibility, for example using watch to monitor the counts in real time.
With this command line I can see how many open files per PID:
lsof -u apache | awk '{print $2}' | sort | uniq -c | sort -n
Output (column 1 is the number of open files, column 2 is the PID; the 1 PID row is just the lsof header being counted):
1 PID
1335 13880
1389 13897
1392 13882
If I could just add the watch command it would be enough, but the code below isn't working:
watch lsof -u apache | awk '{print $2}' | sort | uniq -c | sort -n
You should put the command inside quotes, like this:
watch 'lsof -u apache | awk '\''{print $2}'\'' | sort | uniq -c | sort -n'
or you can put the command into a shell script like test.sh and then use watch.
chmod +x test.sh
watch ./test.sh
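For the script route, test.sh would just wrap the same pipeline, something like:
#!/bin/sh
# count open files per PID for the apache user
lsof -u apache | awk '{print $2}' | sort | uniq -c | sort -n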
This command will tell you how many files Apache has opened:
ps -A x |grep apache | awk '{print $1}' | xargs -I '{}' ls /proc/{}/fd | wc -l
You may have to run it as root in order to access the process fd directory. This sounds like you've got a web application which isn't closing its file descriptors. I would focus my efforts on that area.
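If you want the same per-PID breakdown from /proc (and something that wraps cleanly in watch), a sketch using pgrep:
for pid in $(pgrep -u apache); do
    printf '%s %s\n' "$pid" "$(ls /proc/$pid/fd 2>/dev/null | wc -l)"
done | sort -k2 -n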
