How to delete the contents of all the files in a directory - linux

Under /tmp/REPORTS I have a hundred files. What I want to do is erase the contents of each file in /tmp/REPORTS (not delete them). So I tried the following, but I get this error:
cp /dev/null /tmp/REPORTS/*
cp: Target /tmp/REPORTS/….. must be a directory
Usage: cp [-f] [-i] [-p] [-#] f1 f2
cp [-f] [-i] [-p] [-#] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-#] d1 ... dn-1 dn
How can I clear the contents of all the files in the directory?

Assuming that you mean you want to truncate the contents of each file, you can do this:
for file in /tmp/REPORTS/*; do > "$file"; done
This will clear the contents of each file in the directory.
As gniourf_gniourf has suggested in the comments above, there is a GNU tool truncate that can do the job for you as well:
truncate --size 0 /tmp/REPORTS/*
This will possibly be quicker than looping through the files manually.
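If the directory could ever hold more files than fit on one command line, a hedged variant is to let find hand the names to truncate in batches:
find /tmp/REPORTS -maxdepth 1 -type f -exec truncate -s 0 {} +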

The question appears to be asking for what is termed a "secure erase".
While you cannot cp /dev/null, you could get the effect you are asking about using dd. Here is a script to illustrate:
#!/bin/sh
for name in "$@"
do
    test -h "$name" && continue     # skip symbolic links
    test -f "$name" || continue     # only process regular files
    blocks=$(ls -s "$name" | awk '{print $1}')
    dd if=/dev/zero of="$name" count="$blocks"
done
The script
skips symbolic links and checks that each parameter is a regular file,
then asks for the number of blocks the file occupies,
and finally uses dd to copy zeros from the special device /dev/zero over those blocks.
This relies on dd and ls having the same notion of blocksize (which appears to be the case). It also assumes that the filesystem does not reallocate blocks for a given file, i.e., that they can be reliably overwritten in-place. If you want a better guarantee, there are other solutions, e.g., this discussion of Secure Erase in UNIX
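For example, assuming the script above is saved as zero_files.sh (a made-up name) and is run over the directory from the question:
sh zero_files.sh /tmp/REPORTS/*
Each regular file passed in is overwritten in place with zeros; symbolic links are skipped.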

You can use this to remove them:
find /tmp/REPORTS/ -type f -name "*.*" -exec rm -f {} \;
Or copy or move them somewhere you want.
To copy:
find /tmp/REPORTS/ -type f -name "*.*" -exec cp {} /path/to/destination/ \;
To move:
find /tmp/REPORTS/ -type f -name "*.*" -exec mv {} /path/to/destination/ \;

Try this instead:
$ find /tmp/REPORTS/ -type f -exec sh -c ': > "$1"' sh {} \;
The redirection has to happen inside the shell started by -exec; a bare cat /dev/null > {} would be handled by your interactive shell before find ever runs.
Or, if you have a large number of files, let the inner shell loop over batches by using + instead of \; for a more efficient command line:
$ find /tmp/REPORTS/ -type f -exec sh -c 'for f; do : > "$f"; done' sh {} +

Related

Need guidance with a bash script to check log files in a certain directory for a certain string

I would like to preface this by saying I am a complete noob with scripting. I have a situation where I need to manually look for a phone number that could live in one of hundreds of files.
So the logs live in the following directory:
/actlogs/sbclogger_archive
The log file names are in directories numbered 01-31 inside of that directory, and all the files are zipped.
Inside of those numbered directories are tons of files but the only ones I want to search are "sipd.logthenthedate.gz" and "sipmsg.logthenthedate.gz".
So I need to look in all the files in the following directory.
"/actlogs/sbclogger_archive"
Which has 31 directories labeled "01-31"
Then in each of 01-31 there are hundreds of files; the only ones I want to look at are "sipd.logthenthedate.gz" and "sipmsg.logthenthedate.gz".
The script I am using is below, please let me know what I could do to make this work.
#!/bin/bash
read -p "Enter a phone number: " text
read -p "Enter directory of log file's, Hint it should be /actlogs/sbclogger_archive: " directory
#arr=( $(find $directory -type f -exec grep -l "$text" {} \; | sort -r) )
#find $directory -type f -exec grep -qe "$text" {} \; -exec bash -c '
file=$(find $directory -type f -name 'sipd.log*' -exec grep -qe "$text" {} \; -exec bash -c 'select f; do echo $f; break; done' find-sh {} +;)
if [ -z "$file" ]; then
echo "No matches found."
else
echo "select tool:"
tools=("nano" "less" "vim" "quit")
select tool in "${tools[@]}"
do
case $tool in
"quit")
break
;;
*)
$tool $file
break
;;
esac
done
fi
This would give you the list of files matching:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
-exec sh -c 'gunzip -c {}| grep -m1 -q 888333' \; -print
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Note: -m1 tells grep to stop after first match, since you need only the file name in this case, it's enough.
If you have zgrep, you can shorten it to:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
-exec zgrep -l '888333' {} \;
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Also, some of the tools you are suggesting do not support gzip files (nano and some variants of less for example). In which case you might need to decompress the file and compress it again when done.
And, you might want to consider a loop if you want to "quit". Feeding the file list to the tool doesn't make sense.
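A minimal sketch of such a loop, assuming $text has already been read in; zless is used here as a viewer that understands .gz directly (swap in whatever tool you prefer), and the match list is read on fd 3 so the interactive tool keeps the terminal on stdin:
find . \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
    -exec zgrep -q "$text" {} \; -print > /tmp/matches.txt
while IFS= read -r -u3 f; do
    zless "$f"    # press q to move on to the next match, Ctrl-C to stop entirely
done 3< /tmp/matches.txt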
Note: AFAIK zgrep doesn't do recursive:
DESCRIPTION
Zgrep invokes grep on compressed or gzipped files. These grep options will cause zgrep to terminate with an error code: (-[drRzZ]|--di*|--exc*|--inc*|--rec*|--nu*). All other options specified are passed directly to grep. If no file is specified, then the standard input is decompressed if necessary and fed to grep. Otherwise the given files are uncompressed if necessary and fed to grep.
so zgrep -rl "$text" "$directory" or zgrep -rl --include 'sipd.log*.gz' "$text" {01..31} won't work unless you have a special zgrep.
As you must unzip before using your tool, I would divide the problem into two parts.
First, I would expand the paths you need (looking under <directory> for the phone <text>), and then iterate to apply the tool (because some tools like vim or nano cannot be piped).
Try something like this:
#!/bin/bash
#...
# text/directory input stuff
#...
tmpdir=$(mktemp -d)
trap 'rm -rf ${tmpdir}' EXIT
while IFS= read -r file; do
unzipped=${tmpdir}/$(basename "${file}" .gz)
gunzip -c "${file}" > "${unzipped}"
${tool} "${unzipped}"
done < <(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null)
Above is the inverted form (loop fed by process substitution) proposed by Charles Duffy, following this Bash FAQ.
If you prefer to iterate an array, you could build in this way:
# shellcheck disable=SC2207
files=( $(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null) )
for file in "${files[#]}"; do
# etc.
since in our particular case the files to match have no spaces in their names, the ShellCheck warning is not so important (hence it is hidden above).

Optimizing bash script (for loops, permissions, ...)

I made this script (very quickly ... :)) ages ago and use it very often, but now I'm interested how bash experts would optimize it.
What it does is it goes through files and directories in the current directory and sets the correct permissions:
#!/bin/bash
echo "Repairing chowns."
for item in "$#"; do
sudo chown -R ziga:ziga "$item"
done
echo "Setting chmods of directories to 755."
for item in $@; do
sudo find "$item" -type d -exec chmod 755 {} \;
done
echo "Setting chmods of files to 644."
for item in $@; do
sudo find "$item" -type f -exec chmod 644 {} \;
done
echo "Setting chmods of scripts to 744."
for item in $@; do
sudo find "$item" -type f -name "*.sh" -exec chmod 744 {} \;
sudo find "$item" -type f -name "*.pl" -exec chmod 744 {} \;
sudo find "$item" -type f -name "*.py" -exec chmod 744 {} \;
done
What I'd like to do is
Reduce the number of for-loops
Reduce the number of find statements (I know the last three can be combined into one, but I wonder if it can be reduced even further)
Make script accept parameters other than the current directory and possibly accept multiple parameters (right now it only works if I cd into a desired directory and then call bash /home/ziga/scripts/repairPermissions.sh .). NOTE: the parameters might have spaces in the path.
a) you are looping through $# in all cases, you only need a single loop.
a1) But find can do this for you, you don't need a bash loop.
a2) And chown can take multiple directories as arguments.
b) Per chw21, remove the sudo for files you own.
c) One exec per found file/directory is expensive, use xargs.
d) Per chw21, combine the last three finds into one.
#!/bin/bash
echo "Repairing permissions."
sudo chown -R ziga:ziga "$@"
find "$@" -type d -print0 | xargs -0 --no-run-if-empty chmod 755
find "$@" -type f -print0 | xargs -0 --no-run-if-empty chmod 644
find "$@" -type f \
\( -name '*.sh' -o -name '*.pl' -o -name '*.py' \) \
-print0 | xargs -0 --no-run-if-empty chmod 744
This is 11 execs (sudo, chown, 3 * find/xargs/chmod) of other processes (if the argument list is very long, xargs will exec chmod multiple times).
This does however read the directory tree four times. The OS's filesystem caching should help though.
Edit: Explanation for why xargs is used in answer to chepner's comment:
Imagine that there is a folder with 100 files in it.
a find . -type f -exec chmod 644 {} \; will execute chmod 100 times.
Using find . -type f -print0 | xargs -0 chmod 644 executes find once, xargs once and chmod once (or more, if the argument list is very long).
This is three processes started compared to 101 processes started. The resources and time (and energy) needed to execute three processes is far less.
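For comparison, find's own + terminator batches arguments in much the same way and keeps the process count similarly low without involving xargs; a hedged equivalent of the middle command above would be:
find . -type f -exec chmod 644 {} +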
Edit 2:
Added --no-run-if-empty as an option to xargs. Note that this may not be portable to all systems.
I assume that you are ziga. This means that after the first chown command in there, you own every file and directory, and I don't see why you'd need any sudo after that.
You can combine the three last finds quite easily:
find "$item" -type f \( -name "*.sh" -o -name "*.py" -o -name "*.pl" \) -exec chmod 744 {} \;
Apart from that, I wonder what kind of broken permissions you tend to find. For example, chmod does know the +X modifier which only sets the x if at least one of user, group, or other already has a x.
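For illustration, a hedged sketch (not from the answer above) of how X could collapse the directory/file passes into a single recursive chmod; /path/to/tree stands in for the target directory, and note that files which were already executable end up at 755 rather than 744:
chown -R ziga:ziga /path/to/tree
chmod -R u=rwX,go=rX /path/to/tree    # directories become 755, non-executable files 644, executable files 755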
You can simplify this:
for item in "$#"; do
To this:
for item; do
That's right, the default values for a for loop are taken from "$@".
This won't work well if some of the directories contain spaces:
for item in $@; do
Again, replace with for item; do. Same for all the other loops.
As the other answer pointed out, if you are running this script as ziga, then you can drop all the sudo except in the first loop.
You may use the idiom for item instead of for item in "$@".
We may reduce the use of find to just once (with a double loop).
I am assuming that ziga is your "$USER" name.
Bash 4.4 is required for the readarray with the -d option.
#!/bin/bash
user=$USER
for i
do
# Repairing chowns.
sudo chown -R "$user:$user" "$i"
readarray -d '' -t line < <(sudo find "$i" -print0)
for f in "${line[@]}"; do
# Setting chmods of directories to 755.
[[ -d $f ]] && sudo chmod 755 "$f"
# Setting chmods of files to 644.
[[ -f $f ]] && sudo chmod 644 "$f"
# Setting chmods of scripts to 744.
[[ $f == *.@(sh|pl|py) ]] && sudo chmod 744 "$f"
done
done
If you have an older bash (2.04+), change the readarray line to this:
while IFS='' read -r -d '' f; do line+=("$f"); done < <(sudo find "$i" -print0)
I am keeping the sudo assuming the "items" in "$@" might be repeated inside searched directories. If there is no repetition, sudo may be omitted.
There are two inevitable external commands (sudo chown) per loop.
Plus only one find per loop.
The other two external commands (sudo chmod) are reduced to only those needed to effect the change you need.
But the shell is always very slow at doing this kind of job.
So, the gain in speed depends on the type of files where the script is used.
Do some testing with time script to find out.
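For example, to time one run of the script from the question against a scratch copy of the tree (the test path is just a placeholder):
time bash /home/ziga/scripts/repairPermissions.sh /path/to/test/tree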
Subject to a few constraints, listed below, I would expect something similar to the following to be faster than anything mentioned thus far. I also use only Bourne shell constructs:
#!/bin/sh
set -e
per_file() {
chown ziga:ziga "$1"
test -d "$1" && chmod 755 "$1" || { test -f "$1" && chmod 644 "$1"; }
file "$1" | awk -F: '$2 !~ /script/ {exit 1}' && chmod 744 "$1"
}
if [ "$process_my_file" ]
then
per_file("$1")
exit
fi
export process_my_file=$0
find "$#" -exec $0 {} +
The script calls find(1) on the command-line arguments, and invokes itself for each file found. It detects re-invocation by testing for the existence of an environment variable named process_my_file. Although there is a cost to invoking the shell each time, it should be dwarfed by not traversing the filesystem.
Notes:
set -e to exit on first error
+ in find to batch many file names per invocation
file(1) to detect a script
chown(1) not invoked recursively
symbolic links not followed
Calling file(1) for every file to determine if it's a script is definitely more expensive than just testing the name, but it's also more general. Note that awk is testing only the description text, not the filename. It would misfire if the filename contained a colon.
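For reference, this is the kind of file(1) description the awk test keys on; the file names here are made up and the exact wording varies with the libmagic version:
$ file repair.sh
repair.sh: Bourne-Again shell script, ASCII text executable
$ file notes.txt
notes.txt: ASCII text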
You could lift chown back up out of the per_file() function, and use its -R option. That might be faster, as it's one execution. It would be an interesting experiment.
In terms of your requirements, there are no loops, and only one call to find(1). You also mentioned,
right now it only works if I cd into a desired directory
but that's not true. It also works if you mention the target directory on the command line,
$ fix_tree_perms my/favorite/directory
You could add, say, a -d option to change there first, but I don't see how that would be easier to use.

Linux : combining the "ls" and "cp" command

The command
ls -l | egrep '^d'
Lists all the Directories in the CWD..
And this command
cp a.txt /folder
copies a file a.txt to the folder named "folder"
Now what should I do to combine the 2 commands so that the file a.txt gets copied to all the folders in the CWD?
The cp command does not take several destinations, but you could always try:
for DEST in `command here` ; do cp a.txt "$DEST" ; done
The command inside the backticks could be a command that produces a list of directories on standard output, but I doubt that ls -l | egrep '^d' is such a command. Anyway, the title of your question being about combining the ls and cp commands, this is my answer. To actually achieve what you want to do, you would be better off using find.
Something like find . -maxdepth 1 -type d ! -name "." -exec cp a.txt {} \; may do what you actually want. The find command is a special case in that it has an -exec option to combine itself with other commands easily. You could also have used (but this other version fails when there are lots of directories):
for DEST in `find . -maxdepth 1 -type d ! -name "."` ; do cp a.txt "$DEST" ; done
Don't use ls in scripts. Use a wildcard instead.
You'll have to loop over the target directories, since cp copies to one destination at a time.
for d in */; do
if ! [ -h "${d%/}" ]; then
cp a.txt "$d"
fi
done
The pattern */ matches all directories in the current directory (unless their name starts with a .), as well as symbolic links to directories. The test over ${d%/} ($d without the final /) excludes symbolic links.

Argument list too long error for rm, cp, mv commands

I have several hundred PDFs under a directory in UNIX. The names of the PDFs are really long (approx. 60 chars).
When I try to delete all PDFs together using the following command:
rm -f *.pdf
I get the following error:
/bin/rm: cannot execute [Argument list too long]
What is the solution to this error?
Does this error occur for mv and cp commands as well? If yes, how to solve for these commands?
The reason this occurs is because bash actually expands the asterisk to every matching file, producing a very long command line.
Try this:
find . -name "*.pdf" -print0 | xargs -0 rm
Warning: this is a recursive search and will find (and delete) files in subdirectories as well. Tack on -f to the rm command only if you are sure you don't want confirmation.
You can do the following to make the command non-recursive:
find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
Another option is to use find's -delete flag:
find . -name "*.pdf" -delete
tl;dr
It's a kernel limitation on the size of the command line argument. Use a for loop instead.
Origin of problem
This is a system issue, related to execve and ARG_MAX constant. There is plenty of documentation about that (see man execve, debian's wiki, ARG_MAX details).
Basically, the expansion produces a command (with its parameters) that exceeds the ARG_MAX limit.
On kernel 2.6.23, the limit was set at 128 kB. This constant has been increased and you can get its value by executing:
getconf ARG_MAX
# 2097152 # on 3.5.0-40-generic
Solution: Using for Loop
Use a for loop as it's recommended on BashFAQ/095 and there is no limit except for RAM/memory space:
Dry run to ascertain it will delete what you expect:
for f in *.pdf; do echo rm "$f"; done
And execute it:
for f in *.pdf; do rm "$f"; done
Also this is a portable approach, as globbing has strong and consistent behavior among shells (it's part of the POSIX spec).
Note: As noted by several comments, this is indeed slower but more maintainable, as it can adapt to more complex scenarios, e.g. where one wants to do more than just one action.
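For instance, a quick sketch of such a scenario, logging each name before removing it (the log path is made up for the example):
for f in *.pdf; do
    echo "removing $f" >> /tmp/removed_pdfs.log
    rm -- "$f"
done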
Solution: Using find
If you insist, you can use find but really don't use xargs as it "is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input":
find . -maxdepth 1 -name '*.pdf' -delete
Using -maxdepth 1 ... -delete instead of -exec rm {} + allows find to simply execute the required system calls itself without using an external process, hence faster (thanks to @chepner's comment).
References
I'm getting "Argument list too long". How can I process a large list in chunks? # wooledge
execve(2) - Linux man page (search for ARG_MAX) ;
Error: Argument list too long # Debian's wiki ;
Why do I get “/bin/sh: Argument list too long” when passing quoted arguments? # SuperUser
find has a -delete action:
find . -maxdepth 1 -name '*.pdf' -delete
Another answer is to force xargs to process the commands in batches. For instance to delete the files 100 at a time, cd into the directory and run this:
echo *.pdf | xargs -n 100 rm
If you’re trying to delete a very large number of files at one time (I deleted a directory with 485,000+ today), you will probably run into this error:
/bin/rm: Argument list too long.
The problem is that when you type something like rm -rf *, the * is replaced with a list of every matching file, like “rm -rf file1 file2 file3 file4” and so on. There is a relatively small buffer of memory allocated to storing this list of arguments and if it is filled up, the shell will not execute the program.
To get around this problem, a lot of people will use the find command to find every file and pass them one-by-one to the “rm” command like this:
find . -type f -exec rm -v {} \;
My problem is that I needed to delete 500,000 files and it was taking way too long.
I stumbled upon a much faster way of deleting files – the “find” command has a “-delete” flag built right in! Here’s what I ended up using:
find . -type f -delete
Using this method, I was deleting files at a rate of about 2000 files/second – much faster!
You can also show the filenames as you’re deleting them:
find . -type f -print -delete
…or even show how many files will be deleted, then time how long it takes to delete them:
root@devel# ls -1 | wc -l && time find . -type f -delete
100000
real 0m3.660s
user 0m0.036s
sys 0m0.552s
Or you can try:
find . -name '*.pdf' -exec rm -f {} \;
you can try this:
for f in *.pdf
do
rm "$f"
done
EDIT:
ThiefMaster's comment suggests I should not disclose such dangerous practices to young shell jedis, so I'll add a more "safe" version (for the sake of preserving things when someone has a "-rf . ..pdf" file)
echo "# Whooooo" > /tmp/dummy.sh
for f in *.pdf
do
echo "rm -i \"$f\""
done >> /tmp/dummy.sh
After running the above, just open the /tmp/dummy.sh file in your favorite editor and check every single line for dangerous filenames, commenting them out if found.
Then copy the dummy.sh script in your working dir and run it.
All this for security reasons.
For someone who doesn't have time:
Run the following command in a terminal.
ulimit -S -s unlimited
Then perform cp/mv/rm operation.
I'm surprised there are no ulimit answers here. Every time I have this problem I end up here or here. I understand this solution has limitations but ulimit -s 65536 seems to often do the trick for me.
You could use a bash array:
files=(*.pdf)
for((I=0;I<${#files[@]};I+=1000)); do
rm -f "${files[@]:I:1000}"
done
This way it will erase in batches of 1000 files per step.
You can use this command:
find -name "*.pdf" -delete
The rm command has a limit on how many files you can remove at once.
One possibility is to remove them by running the rm command multiple times, based on your file patterns, like:
rm -f A*.pdf
rm -f B*.pdf
rm -f C*.pdf
...
rm -f *.pdf
You can also remove them through the find command:
find . -name "*.pdf" -exec rm {} \;
If they are filenames with spaces or special characters, use:
find -name "*.pdf" -delete
For files in current directory only:
find -maxdepth 1 -name '*.pdf' -delete
This command searches all files in the current directory (-maxdepth 1) with the pdf extension (-name '*.pdf'), and then deletes them.
I was facing the same problem while copying from a source directory to a destination.
The source directory had ~300,000 (3 lakh) files.
I used cp with the -r option and it worked for me:
cp -r abc/ def/
It will copy all files from abc to def without giving the "Argument list too long" error.
Also try this if you want to delete files/folders older than (+) or newer than (-) 30/90 days; you can use the example commands below.
Ex: to delete files older than 90 days (i.e. 91, 92, ..., 100 days old):
find <path> -type f -mtime +90 -exec rm -rf {} \;
Ex: to delete only files from the latest 30 days, use (-):
find <path> -type f -mtime -30 -exec rm -rf {} \;
If you want to gzip files older than 2 days:
find <path> -type f -mtime +2 -exec gzip {} \;
If you want to list only files/folders from the past month:
Ex:
find <path> -type f -mtime -30 -exec ls -lrt {} \;
To list only files/folders older than 30 days:
Ex:
find <path> -type f -mtime +30 -exec ls -lrt {} \;
find /opt/app/logs -type f -mtime +30 -exec ls -lrt {} \;
And another one:
cd /path/to/pdf
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 rm
printf is a shell builtin, and as far as I know it has always been one. Since printf is not an external command (but a builtin), it's not subject to the "argument list too long ..." fatal error.
So we can safely use it with shell globbing patterns such as *.[Pp][Dd][Ff], then pipe its output to the remove (rm) command through xargs, which makes sure it fits enough file names on each command line so as not to fail the rm command, which is an external command.
The \0 in printf serves as a null separator for the file names, which are then processed by the xargs command, using it (-0) as a separator, so rm does not fail when there are white spaces or other special characters in the file names.
Argument list too long
As the question title mentions cp, mv and rm, but the answers here mostly cover rm:
Un*x commands
Read carefully command's man page!
For cp and mv, there is a -t switch, for target:
find . -type f -name '*.pdf' -exec cp -ait "/path to target" {} +
and
find . -type f -name '*.pdf' -exec mv -t "/path to target" {} +
Script way
There is an overall workaround used in bash scripts:
#!/bin/bash
folder=( "/path to folder" "/path to anther folder" )
if [ "$1" != "--run" ] ;then
exec find "${folder[#]}" -type f -name '*.pdf' -exec $0 --run {} +
exit 0;
fi
shift
for file ;do
printf "Doing something with '%s'.\n" "$file"
done
What about a shorter and more reliable one?
for i in **/*.pdf; do rm "$i"; done
I had the same problem with a folder full of temporary images that was growing day by day and this command helped me to clear the folder
find . -name "*.png" -mtime +50 -exec rm {} \;
The difference with the other commands is the mtime parameter that will take only the files older than X days (in the example 50 days)
Using that multiple times, decreasing on every execution the day range, I was able to remove all the unnecessary files
You can create a temp folder, move all the files and sub-folders you want to keep into the temp folder then delete the old folder and rename the temp folder to the old folder try this example until you are confident to do it live:
mkdir testit
cd testit
mkdir big_folder tmp_folder
touch big_folder/file1.pdf
touch big_folder/file2.pdf
mv big_folder/file1.pdf tmp_folder/
rm -r big_folder
mv tmp_folder big_folder
the rm -r big_folder will remove all files in the big_folder no matter how many. You just have to be super careful you first have all the files/folders you want to keep, in this case it was file1.pdf
To delete all *.pdf in a directory /path/to/dir_with_pdf_files/
mkdir empty_dir # Create temp empty dir
rsync -avh --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/
To delete specific files via rsync using wildcard is probably the fastest solution in case you've millions of files. And it will take care of error you're getting.
(Optional step) DRY RUN: check what will be deleted without actually deleting anything.
rsync -avhn --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/
See "rsync tips and tricks" for more rsync hacks.
I found that for extremely large lists of files (>1e6), these answers were too slow. Here is a solution using parallel processing in python. I know, I know, this isn't linux... but nothing else here worked.
(This saved me hours)
# delete files
import os
import glob
import multiprocessing as mp

directory = r'your/directory'
os.chdir(directory)
files_names = [i for i in glob.glob('*.{}'.format('pdf'))]

# report errors from pool
def callback_error(result):
    print('error', result)

# delete file using system command
def delete_files(file_name):
    os.system('rm -rf ' + file_name)

if __name__ == '__main__':
    pool = mp.Pool(12)  # or use pool = mp.Pool(mp.cpu_count())
    for file_name in files_names:
        print(file_name)
        pool.apply_async(delete_files, [file_name], error_callback=callback_error)
    pool.close()
    pool.join()  # wait for all deletions to finish before exiting
If you want to remove both files and directories, you can use something like:
echo /path/* | xargs rm -rf
I only know a way around this.
The idea is to export that list of pdf files you have into a file. Then split that file into several parts. Then remove pdf files listed in each part.
ls | grep .pdf > list.txt
wc -l list.txt
wc -l counts how many lines list.txt contains. When you have an idea of how long it is, you can decide to split it in half, in quarters, or so, using the split -l command.
For example, split it in 600 lines each.
split -l 600 list.txt
This will create a few files named xaa, xab, xac and so on, depending on how you split it.
Now to "import" each list in those files into the rm command, use this:
rm $(<xaa)
rm $(<xab)
rm $(<xac)
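A hedged shortcut over all the pieces at once, assuming the default xaa, xab, ... names and no whitespace in the listed file names:
for part in x??; do rm $(<"$part"); done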
I ran into this problem a few times. Many of the solutions will run the rm command for each individual file that needs to be deleted. This is very inefficient:
find . -name "*.pdf" -print0 | xargs -0 rm -rf
I ended up writing a python script to delete the files based on the first 4 characters in the file-name:
import os

filedir = '/tmp/'  # The directory you wish to run rm on
filelist = os.listdir(filedir)  # gets a listing of all files in the specified dir
newlist = []  # Makes a blank list named newlist
for i in filelist:
    if str(i[:4]) not in newlist:  # This makes sure that the elements are unique for newlist
        newlist.append(i[:4])  # This takes only the first 4 characters of the folder/filename and appends it to newlist
for i in newlist:
    if 'tmp' in i:  # If statement to look for tmp in the filename/dirname
        print('Running command rm -rf ' + str(filedir) + str(i) + '* : File Count: ' + str(len(os.listdir(filedir))))  # Prints the command to be run and a total file count
        os.system('rm -rf ' + str(filedir) + str(i) + '*')  # Actual shell command
print ('DONE')
This worked very well for me. I was able to clear out over 2 million temp files in a folder in about 15 minutes. I commented the tar out of the little bit of code so anyone with minimal to no python knowledge can manipulate this code.
I have faced a similar problem when there were millions of useless log files created by an application which filled up all inodes. I resorted to "locate", got all the files "located"d into a text file and then removed them one by one. Took a while but did the job!
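A rough sketch of that approach, assuming the locate database is current and the junk lives under a hypothetical /var/log/myapp:
sudo updatedb                                        # refresh the locate database first
locate '/var/log/myapp/useless-*.log' > /tmp/to_delete.txt
while IFS= read -r f; do rm -f "$f"; done < /tmp/to_delete.txt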
I solved it with a for loop.
I am on macOS with zsh.
I moved thousands of jpg files, with mv, in a one-line command.
Be sure there are no spaces or special characters in the names of the files you are trying to move:
for i in $(find ~/old -type f -name "*.jpg"); do mv $i ~/new; done
A bit safer version than using xargs, also not recursive:
ls -p | grep -v '/$' | grep '\.pdf$' | while read file; do rm "$file"; done
Filtering out directories here is a bit unnecessary, as rm won't delete them anyway, and it can be removed for simplicity, but why run something that will definitely return an error?
Using GNU parallel (sudo apt install parallel) is super easy
It runs the commands in parallel, where '{}' is the argument passed.
E.g.
ls /tmp/myfiles* | parallel 'rm {}'
To remove the first 100 files:
rm -rf `ls | head -100`

Find file then cd to that directory in Linux

In a shell script how would I find a file by a particular name and then navigate to that directory to do further operations on it?
From here I am going to copy the file across to another directory (but I can do that already just adding it in for context.)
You can use something like:
cd -- "$(dirname "$(find / -type f -name ls | head -1)")"
This will locate the first ls regular file then change to that directory.
In terms of what each bit does:
The find will start at / and search down, listing out all regular files (-type f) called ls (-name ls). There are other things you can add to find to further restrict the files you get.
The | head -1 will filter out all but the first line.
$() is a way to take the output of a command and put it on the command line for another command.
dirname can take a full file specification and give you the path bit.
cd just changes to that directory, the -- is used to prevent treating a directory name beginning with a hyphen from being treated as an option to cd.
If you execute each bit in sequence, you can see what happens:
pax[/home/pax]> find / -type f -name ls
/usr/bin/ls
pax[/home/pax]> find / -type f -name ls | head -1
/usr/bin/ls
pax[/home/pax]> dirname "$(find / -type f -name ls | head -1)"
/usr/bin
pax[/home/pax]> cd -- "$(dirname "$(find / -type f -name ls | head -1)")"
pax[/usr/bin]> _
The following should be more safe:
cd -- "$(find / -name ls -type f -printf '%h' -quit)"
Advantages:
The double dash prevents the interpretation of a directory name starting with a hyphen as an option (find doesn't produce such file names, but it's not harmful and might be required for similar constructs)
-name check before -type check because the latter sometimes requires a stat
No dirname required because the %h specifier already prints the directory name
-quit to stop the search after the first file found, thus no head required which would cause the script to fail on directory names containing newlines
No one is suggesting locate (which is much quicker for huge trees)?
zsh:
cd $(locate zoo.txt|head -1)(:h)
cd ${$(locate zoo.txt)[1]:h}
cd ${$(locate -r "/zoo.txt$")[1]:h}
or could be slow
cd **/zoo.txt(:h)
bash:
cd $(dirname $(locate -l1 -r "/zoo.txt$"))
Based on this answer to a similar question, other useful choice could be having 2 commands, 1st to find the file and 2nd to navigate to its directory:
find ./ -name "champions.txt"
cd "$(dirname "$(!!)")"
Where !! is history expansion meaning 'the previous command'.
Expanding on answers already given, if you'd like to navigate iteratively to every file that find locates and perform operations in each directory:
for i in $(find /path/to/search/root -name filename -type f)
do (
cd $(dirname $(realpath $i));
your_commands;
)
done
If you are just finding the file and then moving it elsewhere, just use find with -exec:
find /path -type f -iname "mytext.txt" -exec mv "{}" /destination \;
function fReturnFilepathOfContainingDirectory {
    #fReturnFilepathOfContainingDirectory_2012.0709.18:19
    #$1=File
    local vlFl
    local vlGwkdvlFl
    local vlItrtn
    local vlPrdct
    vlFl=$1
    vlGwkdvlFl=`echo $vlFl | gawk -F/ '{ $NF="" ; print $0 }'`
    for vlItrtn in `echo $vlGwkdvlFl` ;do
        vlPrdct=`echo $vlPrdct'/'$vlItrtn`
    done
    echo $vlPrdct
}
Simply this way, isn't this elegant?
cdf yourfile.py
Of course you need to set it up first, but you need to do this only once:
Add following line into your .bashrc or .zshrc, whatever you use as your shell initialization script.
source ~/bin/cdf.sh
And add this code into the ~/bin/cdf.sh file, which you need to create from scratch.
#!/bin/bash
function cdf() {
THEFILE=$1
echo "cd into directory of ${THEFILE}"
# For Mac, replace find with mdfind to get it a lot faster. And it does not need args ". -name" part.
THEDIR=$(find . -name ${THEFILE} |head -1 |grep -Eo "/[ /._A-Za-z0-9\-]+/")
cd ${THEDIR}
}
If it's a program in your PATH, you can do:
cd "$(dirname "$(which ls)")"
or in Bash:
cd "$(dirname "$(type -P ls)")"
which uses one less external executable.
This uses no externals:
dest=$(type -P ls); cd "${dest%/*}"
If your file is only in one location you could try the following:
cd "$(find ~/ -name [filename] -exec dirname {} \;)" && ...
You can use -exec to invoke dirname with the path that find returns (which goes where the {} placeholder is). That will change directories. You can also add double ampersands ( && ) to execute the next command after the shell has changed directory.
For example:
cd "$(find ~/ -name need_to_find_this.rb -exec dirname {} \;)" && ruby need_to_find_this.rb
It will look for that ruby file, change to the directory, then run it from within that folder. This example assumes the filename is unique and that for some reason the ruby script has to run from within its directory. If the filename is not unique you'll get many locations passed to cd, it will return an error then it won't change directories.
Try this. I created it for my own use.
cd ~
touch mycd
sudo chmod +x mycd
nano mycd
cd "$( ./mycd search_directory target_directory )"
if [ "$1" == '--help' ]
then
    echo -e "usage: cd \$( ./mycd \$1 \$2 )"
    echo -e "usage: cd \$( ./mycd search_directory target_directory )"
else
    find "$1"/ -name "$2" -type d -exec echo {} \; -quit
fi
cd -- "$(sudo find / -type d -iname "dir name goes here" 2>/dev/null)"
keep all quotes (all this does is just send you to the directory you want, after that you can just put commands after that)

Resources