pass output as an argument for cp in bash [duplicate] - linux

I'm taking a Unix/Linux class and we have yet to learn variables or functions. We have only learned some basics: flags, pipelines, and redirecting or appending output to a file. For the lab assignment, the instructor wants us to find the largest files and copy them to a directory.
I can get the 5 largest files, but I don't know how to pass them to cp in one command:
ls -SF | grep -v / | head -5 | cp ? Directory

It would be:
cp `ls -SF | grep -v / | head -5` Directory
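assuming that the pipeline is correct. (Note that -F also appends * to executables and @ to symbolic links, which would break those names; ls -Sp appends only the / to directories.) The shell runs the commands inside the backticks and substitutes their output into the command line before cp runs.
You can also test this yourself: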
cp `echo a b c` Directory
will copy all a, b, and c into Directory.
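To see how the substituted output is turned into arguments, and where that breaks down, here is a minimal sketch. The helper name `count_args` is made up for illustration; it only reports how many arguments it received.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# Unquoted substitution: the output is split on whitespace into separate words.
count_args $(echo a b c)          # prints 3

# Quoted substitution: the whole output stays a single argument.
count_args "$(echo a b c)"        # prints 1

# A file name containing a space is split as well, which is why this
# approach only suits names without whitespace.
count_args $(echo "my file.txt")  # prints 2
```

The last case is the reason the cp command above is fine for a class exercise with simple file names, but not for arbitrary names.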

I would do:
cp $(ls -SF | grep -v / | head -5) Directory
xargs would probably be the best answer though.
ls -SF | grep -v / | head -5 | xargs -I{} cp "{}" Directory

Use backticks `like this` or the dollar-sign form $(like this) to perform command substitution. This pastes the standard output of the inner command into the surrounding command line, where the shell splits it into words, and then runs the result. Find out more in the bash manpage under "Command Substitution."
Also, if you want to process one line at a time, you can read individual lines from a pipe using the "while read" syntax:
ls | while IFS= read -r varname; do echo "$varname"; done

If your cp has a "-t" flag (check the man page), that simplifies matters a bit:
ls -SF | grep -v / | head -5 | xargs cp -t DIRECTORY
The find command gives you finer control over what you select than the ls | grep combination you have. I'd code your question like this (the echo makes it a dry run that only prints the cp command; remove it to actually copy):
find . -maxdepth 1 -type f -printf "%p\t%s\n" |
sort -t $'\t' -k2 -nr |
head -n 5 |
cut -f 1 |
xargs echo cp -t DIRECTORY


Counting files in a huge directory [duplicate]

How do I count the number of files in a directory so huge that ls returns too many characters for the command line to handle?
$ ls 150_sims/combined/ | wc -l
bash: /bin/ls: Argument list too long
Try this:
$ find 150_sims/combined/ -maxdepth 1 -type f | wc -l
If you're sure there are no directories inside your directory, you can reduce the command to the one below. Note that find also lists the starting directory itself, so subtract 1 from the result:
$ find 150_sims/combined/ | wc -l
If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Please note that if you have an alias for ls, this may trigger a call to stat (Example: ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or \ls -A | wc -l to avoid an alias.
ls -A 150_sims/combined | wc -l
If you are interested in counting both the files and directories you can try something like this:
\ls -afq 150_sims/combined | wc -l
This includes . and .., so you need to subtract 2 from the count:
echo $(\ls -afq 150_sims/combined | wc -l) - 2 | bc
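All of the line-counting commands above miscount if a file name contains a newline. A minimal sketch of one way around that, assuming GNU find (for -mindepth, -maxdepth and -printf), is to print one character per entry and count characters instead of lines. The function name `count_entries` is hypothetical.

```shell
# count_entries is a hypothetical helper: count the entries (files and
# directories, hidden ones included) directly inside a directory,
# whatever characters their names contain.
# Assumes GNU find for -mindepth, -maxdepth and -printf.
count_entries() {
    # Print one dot per entry, then count the dots instead of lines.
    # -mindepth 1 leaves out the starting directory itself.
    find "$1" -mindepth 1 -maxdepth 1 -printf '.' | wc -c
}
```

For the directory in the question this would be called as count_entries 150_sims/combined. Adding -type f before -printf would restrict the count to regular files.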

Using pipes with find command in linux

I would like to find files in my home directory that start with '~', sort them numerically, print the first five and delete them using find command and pipes in Linux. I have a bash script:
#!/bin/bash
find ~/ -name "~*" | sort -n | head -5 | tee | xargs rm
This works fine for deleting the files, but I was expecting the tee command to print the deleted files to standard output. All this command does is delete the files; there is no output in the terminal. What should I add or change?
Thank you.
You could just use the verbose flag on rm, and it will tell you what it's deleting:
find ~/ -name "~*" | sort -n | head -5 | xargs rm -v
Use man rm to see the docs:
-v, --verbose
explain what is being done
You can use rm -v to print the name of each file as it is deleted:
find ~ -name '~*' -print0 | sort -zn | head -z -n 5 | xargs -0 rm -v
Also note the use of -print0 and the corresponding options in sort, head, and xargs to handle file names containing whitespace and glob characters.
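As for why the original pipeline printed nothing: tee with no file arguments only copies its input to its standard output, which in that pipeline is the pipe to xargs, so nothing reaches the terminal. One possible workaround, sketched below, is to give tee the error stream as its output file. The function name `list_and_count` is hypothetical, and wc -l stands in for xargs rm so that the example deletes nothing.

```shell
# list_and_count is a hypothetical stand-in for the delete pipeline:
# tee copies every line to standard error (visible on the terminal)
# while the same lines continue down the pipe to the next command.
list_and_count() {
    printf '%s\n' '~a' '~b' '~c' | tee /dev/stderr | wc -l
}
```

Applied to the script in the question, the same idea would read find ~/ -name "~*" | sort -n | head -5 | tee /dev/stderr | xargs rm. This assumes a system that provides /dev/stderr, as Linux does.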

Grep inside files returned from ls and head

I have a directory with a large number of files. I am attempting to search for text located in at least one of the files. The text is likely located in one of the more recent files. What is the command to do this? I thought it would look something like ls -t | head -5 | grep abaaba.
For example, if I have 5 files returned from ls -t | head -5:
- file1, file2, file3, file4, file5, I need to know which of those files contains abaaba.
It's not entirely clear what you are trying to do, but I assume efficiency is your main goal. I would use something like:
ls -t | while read -r f; do grep -lF abaaba "$f" && break;done
This prints only the first file containing the string and then stops the search. If you want to see the actual matching lines, use -H instead of -l. If you have a regex rather than a fixed string, drop -F, although this makes grep run slower.
ls -t | while read -r f; do grep -H abaaba "$f" && break;done
Of course if you want to continue the search I'd suggest dropping "&& break".
ls -t | while read -r f; do grep -HF abaaba "$f";done
If you have some idea of the time frame, it's a good idea to try find.
find . -maxdepth 1 -type f -mtime -2 -exec grep -HF abaaba {} \;
You can raise the number after -mtime to cover more than the last 2 days.
If you're just doing this interactively, and you know you don't have spaces in your filenames, then you can do:
grep abaaba $(ls -t | head -5) # DO NOT USE THIS IN A SCRIPT
If writing this in an alias or for repeated future use, do it the "proper" way that takes more typing, but that doesn't break on spaces and other things in filenames.
If you have spaces but not newlines, you can also set IFS before the command (as a separate statement, so that it applies when the substitution is split into words):
(IFS=$'\n'; grep abaaba $(ls -t | head -5))
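The "proper" way mentioned above is not spelled out, so here is one possible sketch of it. It assumes the GNU versions of find, sort, head, cut, xargs and grep (for -printf and the NUL-separator options), and the function name `grep_newest` and its argument order are made up for illustration.

```shell
# grep_newest is a hypothetical helper: list which of the N most recently
# modified regular files in DIR contain the fixed string PATTERN.
# Usage: grep_newest PATTERN [DIR] [N]
# Assumes GNU find, sort, head, cut, xargs and grep.
grep_newest() {
    local pattern=$1 dir=${2:-.} count=${3:-5}
    # Emit "mtime path" records terminated by NUL, so that spaces and
    # newlines inside file names cannot be mistaken for separators.
    find "$dir" -maxdepth 1 -type f -printf '%T@ %p\0' |
        sort -z -k1,1nr |           # newest first
        head -z -n "$count" |       # keep the N newest records
        cut -z -d ' ' -f 2- |       # drop the timestamp, keep the path
        xargs -0 --no-run-if-empty grep -lF -e "$pattern" --
}
```

For the example in the question this would be called as grep_newest abaaba . 5. Replacing -l with -H would print the matching lines instead of only the file names.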

Bash script to delete files in a directory if there are more than 5

This is a backup script that copies files from one directory to another. I use a for loop to check if there are more than five files. If there are, the loop should delete the oldest entries first.
I tried ls -tr | head -n -5 | xargs rm from the command line and it works successfully to delete older files if there are more than 5 in the directory.
However, when I put it into my for loop, I get an error rm: missing operand
Here is the full script. I don't think I am using the for loop correctly in the script, but I'm really not sure how to use the commands ls -tr | head -n -5 | xargs rm in a loop that iterates over the files in the directory.
timestamp=$(date +"%m-%d-%Y")
dest=${HOME}/mybackups
src=${HOME}/safe
fname='bu_'
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
# for loop for deletion of file
for f in ${HOME}/mybackups/*
do
ls -tr | head -n -5 | xargs rm
done
if [ -e $ffname ];
then
echo "The backup for ${timestamp} has failed." | tee ${HOME}/mybackups/Error_${timestamp}
else
tar -vczf ${dest}/${fname}${timestamp}.tar.gz ${src}
fi
Edit: I took out the for loop, so it's now just:
[...]
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
ls -tr | head -n -5 | xargs rm
if [ -e $ffname ];
[...]
The script WILL work if it is in the mybackups directory, however, I continue to get the same error if it is not in that directory. The script gets the file names but tries to remove them from the current directory, I think... I tried several modifications but nothing has worked so far.
I get an error rm: missing operand
The cause of that error is that there are no files left to be deleted. To avoid that error, use the --no-run-if-empty option:
ls -tr | head -n -5 | xargs --no-run-if-empty rm
In the comments, mklement0 notes that this issue is peculiar to GNU xargs. BSD xargs does not run the command at all when its input is empty. Consequently, it does not need and does not support the --no-run-if-empty option.
More
Quoting from a section of code in the question:
for f in ${HOME}/mybackups/*
do
ls -tr | head -n -5 | xargs rm
done
Note that (1) f is never used for anything and (2) this runs the ls -tr | head -n -5 | xargs rm several times in a row when it needs to be run only once.
Obligatory Warning
Your approach parses the output of ls. This makes for a simple and easily understood command. It can work if all your files are sensibly named. It will not work in general. For more on this, see: Why you shouldn't parse the output of ls(1).
Safer Alternative
The following will work with all manner of file names, whether they contain spaces, tabs, newlines, or whatever:
find . -maxdepth 1 -type f -printf '%T@ %i\n' | sort -n | head -n -5 | while read tstamp inode
do
find . -inum "$inode" -delete
done
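Another way to get the same safety is to pass the file names themselves through the pipeline as NUL-terminated records. The sketch below assumes the GNU versions of find, sort, head, cut and xargs; the function name `prune_backups` is hypothetical. Because the directory is passed as an argument, it does not depend on the directory the script is run from, and --no-run-if-empty avoids the "missing operand" error when there is nothing to delete.

```shell
# prune_backups is a hypothetical helper: delete all but the KEEP newest
# regular files in DIR.
# Usage: prune_backups DIR [KEEP]
# Assumes GNU find, sort, head, cut and xargs.
prune_backups() {
    local dir=$1 keep=${2:-5}
    # Emit "mtime path" records terminated by NUL, oldest first.
    find "$dir" -maxdepth 1 -type f -printf '%T@ %p\0' |
        sort -z -k1,1n |
        head -z -n "-$keep" |       # everything except the KEEP newest
        cut -z -d ' ' -f 2- |       # drop the timestamp, keep the path
        xargs -0 --no-run-if-empty rm --
}
```

In the script from the question, this would be called as prune_backups "${HOME}/mybackups" 5 in place of the for loop.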
SMH. I ended up with the simplest solution in the world: cd-ing into the directory before running ls -tr | head -n -5 | xargs rm. Thanks for everyone's help!
timestamp=$(date +"%m-%d-%Y")
dest=${HOME}/mybackups
src=${HOME}/safe
fname='bu_'
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
cd "${HOME}/mybackups" || exit 1   # stop here rather than delete files from the wrong directory
ls -tr | head -n -5 | xargs rm
if [ -e $ffname ];
then
echo "The backup for ${timestamp} has failed." | tee ${HOME}/mybackups/Error_${timestamp}
else
tar -vczf ${dest}/${fname}${timestamp}.tar.gz ${src}
fi
The line ls -tr | head -n -5 | xargs rm, which I found elsewhere, works as follows:
ls -tr lists all the files, oldest first (-t sorts newest first, -r reverses the order).
head -n -5 prints all but the last 5 lines, dropping the 5 newest files from the list.
xargs rm passes the remaining file names to rm as arguments.

How to copy the top 10 most recent files from one directory to another?

All my HTML files reside here:
/home/thinkcode/myfiles/html/
I want to copy the newest 10 files to /home/thinkcode/Test
I have this so far. Please correct me. I am looking for a one-liner!
ls -lt *.htm | head -10 | awk '{print "cp "$1" "..\Test\$1}' | sh
ls -lt *.htm | head -10 | awk '{print "cp " $9 " ../Test/"$9}' | sh
The shell expands back-ticked commands before cp runs, so cp simply receives the resulting file names as arguments. You could use a command like this one to copy the 10 latest files to another folder, e.g. /test:
cp `ls -t *.htm | head -10` /test
Here is a version which doesn't use ls. It should be less vulnerable to strange characters in file names:
find . -maxdepth 1 -type f -name '*.html' -print0 |
xargs -0 stat --printf "%Y\t%n\n" |
sort -n |
tail -n 10 |
cut -f 2 |
xargs cp -t ../Test/
I used find for a couple of reasons:
1) If there are too many files in a directory, bash will balk at the wildcard expansion*.
2) Using the -print0 argument to find gets around the problem of the shell splitting a filename that contains whitespace into multiple tokens.
* Actually, the limit is not strictly a function of the number of file names. The kernel caps the combined size of the arguments and environment variables passed to a newly executed program, so what matters is the total length of the file names plus the environment. The wildcard expansion itself succeeds; it is launching the external command with the expanded list that fails.
EDIT: Incorporated some of @glennjackman's improvements. Kept the initial use of find to avoid the use of the wildcard expansion which might fail in a large directory.
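The limit referred to in the footnote can be inspected directly. A minimal sketch, assuming a system whose getconf knows the ARG_MAX variable (as on typical Linux installations); the function names `arg_limit` and `env_size` are hypothetical.

```shell
# arg_limit is a hypothetical helper: print the system's limit, in bytes,
# on the combined size of the arguments and environment of a new program.
arg_limit() {
    getconf ARG_MAX
}

# env_size is a hypothetical helper: print roughly how many bytes the
# current environment occupies, which counts against the same limit.
env_size() {
    env | wc -c
}
```

Comparing the two numbers gives a rough idea of how much room is left for an expanded wildcard such as *.html before an external command like cp or ls refuses to start.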
ls -lt *.html | head -10 | awk '{print $NF}' | xargs -i cp {} DestDir
In the above example DestDir is the destination directory for the copy.
Add -t after xargs to see the commands as they execute. I.e., xargs -i -t cp {} DestDir.
For more information check out the xargs command.
EDIT: As pointed out by @DennisWilliamson (and confirmed by the current man page), the -i option is deprecated; use -I instead.
Also, both solutions presented depend on the file names in question not containing any blanks or tabs.
