Display file size before the paths with the find command - Linux

I'm trying to find a way to combine the results that are displayed with the find command, e.g.:
$find /home -name "requirements_*"
/home/cobayo/Obligatorio Test 2/requirements_richard
/home/cobayo/Obligatorio Test 2/requirements_paul
/home/cobayo/Obligatorio Test 2/requirements_george
/home/cobayo/Obligatorio Test 1/Subdirectorio/requirements_joe
So here I get all the paths of the files I am looking for; what I would like to do is get the file size listed before each path.
For instance:
34K "/home/cobayo/Obligatorio Test 2/requirements_richard"
I have some commands, and even a little script, that do show me file sizes.
For example, this one:
find /home -name "requirements_*" -mtime -1 -print0 | du --files0-from=- -hc |
tail -n1
And it shows the TOTAL size of all files matching the name given as a parameter.
But then, I also have this little script:
FILENAME=$1
FILESIZE=$(stat -c%s "$FILENAME")
echo "Size of $FILENAME = $FILESIZE bytes."
Which is perfect, as I get the size in bytes of the specific file. I tried to mix them up using a loop:
for i in `find /home -name "requirements_*"`
do
echo `stat -c%s $i`
done
But I don't get the results I'm looking for; in fact, it tells me the files don't exist :(

You could use -exec du -hs {} \;:
$ find -type f -exec du -hs {} \;
4.0K ./proj-2/README.md
4.0K ./proj-2/file.c
4.0K ./proj-2/file.h
or alternatively, if your find supports it:
find -type f -exec du -hs {} +
which calls du just once.
Or use -printf with the %k format directive for the disk usage in 1K blocks:
$ find -type f -printf '%kK\t%p\n'
4K ./proj-2/README.md
4K ./proj-2/file.c
4K ./proj-2/file.h
Manual references:
-exec, single file
-exec, multiple files
-printf
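If you need the exact size in bytes rather than 1K blocks, GNU find also has a %s directive; a minimal sketch:
find -type f -printf '%s\t%p\n'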

If you do not put the variable $i between double quotes, the spaces will be interpreted as separators and your command will try to assess the following files independently:
/home/cobayo/Obligatorio
Test
1/Subdirectorio/requirements_joe
/home/cobayo/Obligatorio
Test
2/requirements_richard
/home/cobayo/Obligatorio
Test
2/requirements_paul
/home/cobayo/Obligatorio
Test
2/requirements_george
which gives you the error that the files don't exist.
You can try:
find . -name "requirements_*" -print0 | xargs -0 -I {} bash -c "echo -n 'Size of {} = ' && stat -c%s '{}' | tr -d '\n' && echo ' bytes.'"
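A safer variant of the same idea (just a sketch) passes the file name to bash as a positional parameter instead of splicing {} into the command string, so quote characters in file names cannot break it:
find . -name "requirements_*" -print0 | xargs -0 -I {} bash -c 'echo "Size of $1 = $(stat -c%s "$1") bytes."' _ {}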

Related

Circumvent Argument list too long in script (for loop)

I've seen a few answers regarding this, but as a newbie, I don't really understand how to implement that in my script.
It should be pretty easy (for those who can do stuff like this).
I'm using a simple
for f in "/drive1/"images*.{jpg,png}; do
but this is simply overloading and giving me
Argument list too long
What is the easiest way to solve this?
Argument list too long workaround
The argument list length is limited by your system configuration:
getconf ARG_MAX
2097152
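GNU xargs can report the actual limits it will apply (assuming GNU findutils is installed):
xargs --show-limits < /dev/null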
But after a discussion of the differences between bash specifics and system (OS) limitations (see the comments from @that other guy), the question's premise turned out to be wrong:
Following the discussion in the comments, the OP tried something like:
ls "/simple path"/image*.{jpg,png} | wc -l
bash: /bin/ls: Argument list too long
This happens because of an OS limitation (the argument-size limit applies when executing an external command like /bin/ls), not because of bash!
But tested with code like the OP's loop, this works fine, because echo is a shell builtin and no new process is executed:
for file in ./"simple path"/image*.{jpg,png} ;do echo -n a;done | wc -c
70980
Like:
printf "%c" ./"simple path"/image*.{jpg,png} | wc -c
Reduce line length by reducing fixed part:
First step: you could reduce argument length by:
cd "/drive1/"
ls images*.{jpg,png} | wc -l
But when the number of files grows, you'll hit the limit again...
More general workaround:
find "/drive1/" -type f \( -name '*.jpg' -o -name '*.png' \) -exec myscript {} +
If you want this to NOT be recursive, you may add -maxdepth as 1st option:
find "/drive1/" -maxdepth 1 -type f \( -name '*.jpg' -o -name '*.png' \) \
-exec myscript {} +
There, myscript will be run with the filenames as arguments. The command line for myscript is built up until it reaches a system-defined limit:
myscript /drive1/file1.jpg '/drive1/File Name2.png' /drive1/...
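For testing, a minimal stub for myscript (a hypothetical script name, as used in the answer) could simply print its arguments:
#!/bin/bash
# myscript: print each file name passed on the command line
for file; do
    printf 'Process %s\n' "$file"
done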
From man find:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of invoca‐
tions of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of `{}'
is allowed within the command, and it must appear at the end,
immediately before the `+'.
In-script sample
You could create your script like this:
#!/bin/bash
target=( "/drive1" "/Drive 2/Pictures" )
[ "$1" = "--run" ] && exec find "${target[#]}" -type f \( -name '*.jpg' -o \
-name '*.png' \) -exec $0 {} +
for file ;do
echo Process "$file"
done
Then you have to run this with --run as the argument. This:
works with any number of files! (recursively; see the -maxdepth option)
permits many targets
permits spaces and special characters in file and directory names
And you can run the same script directly on files, without --run:
./myscript hello world 'hello world'
Process hello
Process world
Process hello world
Using pure bash
Using arrays, you could do things like:
allfiles=( "/drive 1"/images*.{jpg,png} )
[ -f "$allfiles" ] || { echo No file found.; exit ;}
echo Number of files: ${#allfiles[@]}
for file in "${allfiles[@]}";do
echo Process "$file"
done
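A note on the no-match test above: without nullglob, an unmatched glob stays as the literal pattern, so [ -f "$allfiles" ] (which checks the first array element) detects the empty case. With shopt -s nullglob you could test the array length instead; a sketch:
shopt -s nullglob
allfiles=( "/drive 1"/images*.{jpg,png} )
(( ${#allfiles[@]} > 0 )) || { echo "No file found."; exit; }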
There's also a while read loop:
find "/drive1/" -maxdepth 1 -mindepth 1 -type f \( -name '*.jpg' -o -name '*.png' \) |
while IFS= read -r file; do
or with zero terminated files:
find "/drive1/" -maxdepth 1 -mindepth 1 -type f \( -name '*.jpg' -o -name '*.png' \) -print0 |
while IFS= read -r -d '' file; do

BASH: Filter list of files by return value of another command

I have a series of directories with (mostly) video files in them, say:
test1
1.mpg
2.avi
3.mpeg
junk.sh
test2
123.avi
432.avi
432.srt
test3
asdf.mpg
qwerty.mpeg
I create a variable (video_dir) with the directory names (based on other parameters) and use that with find to generate the basic list. I then filter on another variable (video_types) for file types (because there are sometimes non-video files in the dirs), piping it through egrep. Then I shuffle the list around and save it out to a file. That file is later used by mplayer to slideshow through the list.
I currently use the following command to accomplish that. I'm sure it's a horrible way to do it, but it works for me and it's quite fast even on big directories.
video_dir="/test1 /test2"
video_types=".mpg$|.avi$|.mpeg$"
find ${video_dir} -type f |
egrep -i "${video_types}" |
shuf > "$TEMP_OUT"
I would now like to add the ability to filter out files based on the resolution height of the video. I can get that from:
mediainfo --Output='Video;%Height%' filename
Which just returns a number. I have tried using the -exec functionality of find to run that command on each file.
find ${video_dir} -type f -exec mediainfo --Output='Video;%Height%' {} \;
but that just returns the list of heights, not the filenames, and I can't figure out how to reject files based on a comparison like <480.
I could do a for loop, but that seems like a bad (slow) idea.
Using info from @mark-setchell I modified it to:
video_dir="test1"
find ${video_dir} -type f \
-exec bash -c 'h=$(mediainfo --Output="Video;%Height%" "$1"); [[ $h -gt 480 ]]' _ {} \; -print
Which works.
You can replace your egrep with the following so you are still inside the find command (-iname is case insensitive and -o represents a logical OR):
find test1 test2 -type f \
\( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \) \
NEXT_BIT
The NEXT_BIT can then -exec bash and exit with status 0 or 1 depending on whether you want the current file included or excluded. So it will look like this:
-exec bash -c 'H=$(mediainfo -output ... "$1"); [ $H -lt 480 ] && exit 1; exit 0' _ {} \;
So, taking note of @tripleee's advice in the comments about superfluous exit statements, I get this:
find test1 test2 -type f \
\( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \) \
-exec bash -c 'h=$(mediainfo ...options... "$1"); [ $h -lt 480 ]' _ {} \; -print
This Q&A was focused on one particular case, so the accepted answer is not as general as it could be.
find
If the list of files comes from find, one can use its filtering facilities, e.g. -exec:
find ${video_dir} -type f \
-exec COMMAND \; \
-print
Here
COMMAND is not enclosed in quotes -- find reads everything after -exec and up to a \;
find will expand {} to the current file name (including path -- you might find -execdir helpful, which will cd to the file's directory and replace {} with the leaf file name)
The exit code of COMMAND is treated as follows:
0 -> true
non-0 -> false
Note that you can build more complex expressions (e.g. -not -exec ...), which will be evaluated "from left to right, according to the rules of precedence ... -and is assumed where the operator is omitted." (per man find)
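For instance, a small sketch of a compound expression: print only the regular files that do not contain the string TODO (grep -q exits 0 on a match, which -not then inverts):
find . -type f -not -exec grep -q TODO {} \; -print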
xargs
If the list of files comes from elsewhere (and is available on stdin), you can use xargs as follows (from "If xargs is map, what is filter?"):
ls | xargs -I{} bash -c "COMMAND '{}' && echo '{}'"
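Applied to this question's height test, a sketch (reusing the mediainfo invocation from the question, and passing the name as a positional parameter to avoid quoting problems):
ls | xargs -I{} bash -c 'h=$(mediainfo --Output="Video;%Height%" "$1"); [ "${h:-0}" -ge 480 ] && echo "$1"' _ {}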
Here is my solution.
#!/bin/bash
shopt -s extglob nullglob
video_dir=(/test1 /test2)
while IFS= read -rd '' file; do
if [[ $file = *.@(mpg|avi|mpeg|mp4) ]]; then
h=$(mediainfo --Output="Video;%Height%" "$file")
(( h >= 480 )) && echo "$file"
fi
done < <(find "${video_dir[@]}" -type f -print0)
With this solution you can process everything inside the while read loop.

Create Unix shell script to move non empty files from Source directory to Target directory and add timestamp to them

I am trying to create a shell script to move non-empty files from a source directory to a target directory and add a timestamp to their names.
I am using
find . -type f -size +0 -print0 | xargs -I {} -r0 mv {} $Tgt_dir/{}_`date +%m%d%Y`
but it's not working. Could you please help?
Thanks
You can use -printf in find to print the mv command with the full path of the source and just the basename in the destination, and pipe that to the shell:
date=$(date +%m%d%Y)
find . -type f -size +0 -printf "mv '%p' '$Tgt_dir/%f_$date'\n" | bash
%p is the full pathname, %f is the basename.
To move only files with at least one line, make the generated command check the line count (note the escaped \$ so that wc runs in the piped bash, not while the command is being built):
date=$(date +%m%d%Y)
find "$Src_dir" -type f -size +0 -printf "if [ \$(wc -l < '%p') -gt 0 ]; then mv '%p' '$Tgt_dir/%f_$date'; fi\n" | bash

Delete files 100 at a time and count total files

I have written a bash script to delete 100 files at a time from a directory, because I was getting an "Argument list too long" error, but now I want to count the total number of files deleted from the directory.
Here is the script
echo /example-dir/* | xargs -n 100 rm -rf
What I want is to write the total number of deleted files for each directory into a file, along with the path, for example: Deleted <count> files from <path>
How can I achieve this with my current setup?
You can do this by enabling verbose output from rm and then counting the output lines using wc -l.
If you have whitespaces or special characters in the file names, using echo to pass the list of files to xargs will not work.
Better use find with -print0 to use a NULL character as a delimiter for the individual files:
find /example-dir -type f -print0 | xargs --null -n 100 rm -vrf | wc -l
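To produce the exact line the question asks for, capture the count in a variable and append it to a file (the log file name here is just an example):
dir=/example-dir
count=$(find "$dir" -type f -print0 | xargs --null -n 100 rm -vf | wc -l)
echo "Deleted $count files from $dir" >> deleted.log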
You can avoid xargs and do this in a simple while loop and use a counter:
destdir='/example-dir/'
count=0
while IFS= read -r -d '' file; do
rm -rf "$file"
((count++))
done < <(find "$destdir" -type f -print0)
echo "Deleted $count files from $destdir"
Note the use of -print0 to take care of file names containing whitespace, newlines, glob characters, etc.
By the way, if you really have lots of files and you do this often, it might be useful to look at some other options:
Use find's built-in -delete
time find . -name \*.txt -print -delete | wc -l
30000
real 0m1.244s
user 0m0.055s
sys 0m1.037s
Use find's ability to build up maximal length argument list
time find . -name \*.txt -exec rm -v {} + | wc -l
30000
real 0m0.979s
user 0m0.043s
sys 0m0.920s
Use GNU Parallel's ability to build long argument lists
time find . -name \*.txt -print0 | parallel -0 -X rm -v | wc -l
30000
real 0m1.076s
user 0m1.090s
sys 0m1.223s
Use a single Perl process to read filenames and delete whilst counting
time find . -name \*.txt -print0 | perl -0ne 'unlink;$i++;END{print $i}'
30000
real 0m1.049s
user 0m0.057s
sys 0m1.006s
For testing, you can create 30,000 files really fast with GNU Parallel, whose -X option likewise builds up long argument lists. For example, I can create 30,000 files in 8 seconds on my Mac with:
seq -w 0 29999 | parallel -X touch file{}.txt

Print the directory where the 'find' linux command finds a match

I have a bunch of directories; some of them contain a '.todo' file.
/storage/BCC9F9D00663A8043F8D73369E920632/.todo
/storage/BAE9BBF30CCEF5210534E875FC80D37E/.todo
/storage/CBB46FF977EE166815A042F3DEEFB865/.todo
/storage/8ABCBF3194F5D7E97E83C4FD042AB8E7/.todo
/storage/9DB9411F403BD282B097CBF06A9687F5/.todo
/storage/99A9BA69543CD48BA4BD59594169BBAC/.todo
/storage/0B6FB65D4E46CBD8A9B1E704CFACC42E/.todo
I'd like the find command to print only the directory, like this:
/storage/BCC9F9D00663A8043F8D73369E920632
/storage/BAE9BBF30CCEF5210534E875FC80D37E
/storage/CBB46FF977EE166815A042F3DEEFB865
...
Here's what I have so far, but it lists the '.todo' file as well:
#!/bin/bash
STORAGEFOLDER='/storage'
find $STORAGEFOLDER -name .todo -exec ls -l {} \;
It should be dead simple, but I'm giving up :(
To print the directory name only, use -printf '%h\n'. It is also recommended to quote your variable with double quotes.
find "$STORAGEFOLDER" -name .todo -printf '%h\n'
If you want to process the output:
find "$STORAGEFOLDER" -name .todo -printf '%h\n' | xargs ls -l
Or use a loop with process substitution to make use of a variable:
while read -r DIR; do
ls -l "$DIR"
done < <(exec find "$STORAGEFOLDER" -name .todo -printf '%h\n')
The loop actually processes one directory at a time, whereas with xargs the directories are passed to ls -l in one shot.
To make sure that each directory is processed only once, add uniq (plain uniq suffices here because find emits a directory's matches consecutively, so duplicates are adjacent):
find "$STORAGEFOLDER" -name .todo -printf '%h\n' | uniq | xargs ls -l
Or
while read -r DIR; do
ls -l "$DIR"
done < <(exec find "$STORAGEFOLDER" -name .todo -printf '%h\n' | uniq)
If you don't have bash, and you don't mind that changes to variables inside the loop won't persist outside it, you can just use a pipe:
find "$STORAGEFOLDER" -name .todo -printf '%h\n' | uniq | while read -r DIR; do
ls -l "$DIR"
done
The quick and easy answer for stripping off a file name and showing only the directory it’s in is dirname:
#!/bin/bash
STORAGEFOLDER='/storage'
find "$STORAGEFOLDER" -name .todo -exec dirname {} \;
