Calculate total disk space used by files older than 180 days using find - Linux

I am trying to find the total disk space used by files older than 180 days in a particular directory. This is what I'm using:
find . -mtime +180 -exec du -sh {} \;
but the above quite evidently gives me the disk space used by every file that is found. I want only the total, added-up disk space used by those files. Can this be done using find and -exec?
Please note that I don't want to use a script for this; it would be great if there were a one-liner. Any help is highly appreciated.

Why not this?
find /path/to/search/in -type f -mtime +180 -print0 | du -hc --files0-from - | tail -n 1

@PeterT is right. Almost all of these answers invoke a command (du) once per file, which is resource-intensive, slow, and unnecessary. The simplest and fastest way is this:
find . -type f -mtime +356 -printf '%s\n' | awk '{total=total+$1}END{print total/1024}'
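One caveat: %s is the apparent size, not the space actually allocated on disk. If you want the latter, a variant sketch using GNU find's %b directive (disk usage in 512-byte blocks):
find . -type f -mtime +356 -printf '%b\n' | awk '{total=total+$1}END{print total*512/1024}'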

du doesn't summarize to a grand total if you pass it a list of files one at a time.
Instead, pipe the output to cut and let awk sum it up. So you can say:
find . -mtime +180 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
Note that the -h option (human-readable output) has been replaced by -k, which is equivalent to a block size of 1K. The result is presented in MB (hence the total/1024 above).
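If you would rather have a human-readable total, one option (a sketch, assuming GNU coreutils' numfmt is available) is to sum raw bytes and format at the end:
find . -type f -mtime +180 -printf '%s\n' | awk '{total=total+$1}END{print total}' | numfmt --to=iec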

Be careful not to take into account the disk usage by the directories. For example, I have a lot of files in my ~/tmp directory:
$ du -sh ~/tmp
3,7G /home/rpet/tmp
Running the first part of the example posted by devnull to find the files modified in the last 24 hours, we can see that awk will sum the whole disk usage of the ~/tmp directory:
$ find ~/tmp -mtime 0 -exec du -ks {} \; | cut -f1
3849848
84
80
But there is only one file modified in that period of time, with very little disk usage:
$ find ~/tmp -mtime 0
/home/rpet/tmp
/home/rpet/tmp/kk
/home/rpet/tmp/kk/test.png
$ du -sh ~/tmp/kk
84K /home/rpet/tmp/kk
So we need to take into account only the files and exclude the directories:
$ find ~/tmp -type f -mtime 0 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
0.078125
You can also specify date ranges using the -newermt parameter. For example:
$ find . -type f -newermt "2014-01-01" ! -newermt "2014-06-01"
See http://www.commandlinefu.com/commands/view/8721/find-files-in-a-date-range
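Combining the two ideas, a sketch (assuming GNU find and awk, as above) that totals only the files modified within a date range:
find . -type f -newermt "2014-01-01" ! -newermt "2014-06-01" -printf '%s\n' | awk '{total=total+$1}END{printf "%.1f MB\n", total/1024/1024}'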

You can print file size with find using the -printf option, but you still need awk to sum.
For example, total size of all files older than 365 days:
find . -type f -mtime +365 -printf '%s\n' \
| awk '{a+=$1;} END {printf "%.1f GB\n", a/2**30;}'

Related

How to find the count of and total sizes of multiple files in directory?

I have a directory; inside it are multiple directories containing many types of files.
I want to find *.jpg files and then get the count and the total size of all of them.
I know I have to use find, wc -l and du -ch, but I don't know how to combine them in a single script or a single command.
find . -type f -name "*.jpg" -exec ... (not sure how to connect all three)
Supposing your starting folder is ., this will give you all files and the total size:
find . -type f -name '*.jpg' -exec du -ch {} +
The + at the end executes du -ch on all files at once, rather than once per file, which is what allows you to get the grand total.
If you want to know only the total, add | tail -n 1 at the end.
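Putting it together:
find . -type f -name '*.jpg' -exec du -ch {} + | tail -n 1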
Fair warning: this in fact executes
du -ch file1 file2 file3 ...
When the argument list would exceed the system limit, find splits it into several du invocations, each printing its own total line, so tail -n 1 would then only report the last batch.
To check how large that limit is:
$ getconf ARG_MAX
2097152
That's what is configured on my system.
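If the list does get split, a more robust sketch is to skip du's per-batch totals and sum the per-file sizes yourself (assuming GNU tools):
find . -type f -name '*.jpg' -exec du -k {} + | awk '{total=total+$1}END{printf "%.1f MB\n", total/1024}'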
This doesn't give you the number of files though. You'll need to catch the output of find and use it twice.
The last line is the total, so we'll use all but the last line to get the number of files, and the last one for the total:
OUT=$(find . -type f -name '*.jpg' -exec du -ch {} +)
N=$(echo "$OUT" | head -n -1 | wc -l)
SIZE=$(echo "$OUT" | tail -n 1)
echo "Number of files: $N"
echo "$SIZE"
Which for me gives:
Number of files: 143
584K total
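As an alternative, both numbers can be had in one pass; a sketch, assuming GNU find (note this sums apparent sizes rather than du's allocated blocks):
find . -type f -name '*.jpg' -printf '%s\n' | awk '{n++; total=total+$1}END{printf "Number of files: %d\nTotal: %.0fK\n", n, total/1024}'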

How to get combined disk space of all files in a directory with help of du in Linux [duplicate]

I've got a bunch of files scattered across folders in a layout, e.g.:
dir1/somefile.gif
dir1/another.mp4
dir2/video/filename.mp4
dir2/some.file
dir2/blahblah.mp4
And I need to find the total disk space used for the MP4 files only. This means it's gotta be recursive somehow.
I've looked at du and fiddling with piping things to grep but can't seem to figure out how to calculate just the MP4 files no matter where they are.
A human readable total disk space output is a must too, preferably in GB, if possible?
Any ideas? Thanks
For individual file size:
find . -name "*.mp4" -print0 | du -sh --files0-from=-
For total disk space in GB:
find . -name "*.mp4" -print0 | du -sb --files0-from=- | awk '{ total += $1} END { print total/1024/1024/1024 }'
You can simply do:
find . -name "*.mp4" -exec du -b {} \; | awk 'BEGIN{total=0}{total=total+$1}END{print total}'
The -exec option of the find command executes a simple command with {} replaced by each file found.
du -b displays the size of the file in bytes.
The awk command initializes a variable at 0, adds the size of each file to it, and displays the total at the end.
This will sum the sizes of all mp4 files, in bytes:
find ./ -name "*.mp4" -printf "%s\n" | paste -sd+ | bc

How to find total size of all files under the ownership of a user?

I'm trying to find out the total size of all files owned by a given user.
I've tried this:
find $myfolder -user $myuser -type f -exec du -ch {} +
But this gives me an error:
missing argument to exec
and I don't know how to fix it. Can somebody help me with this?
You just need to terminate the -exec. If you want a total for each directory,
-type d is probably what you need:
find $myfolder -user $myuser -type d -exec du -ch {} \;
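If you are after one grand total for the user's files instead, a sketch (assuming GNU find, and permission to read the files in question):
find "$myfolder" -user "$myuser" -type f -printf '%s\n' | awk '{total=total+$1}END{print total " bytes"}'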
Use:
find $myfolder -user gisi -type f -print0 | xargs -0 du -sh
where user gisi is my cat ;)
Note the option -s for summarize
Further note that I'm using find ... -print0, which separates filenames with NUL bytes (one of the only characters that cannot appear in a filename), together with xargs -0, which uses the NUL byte as its delimiter. This makes sure that even exotic filenames won't be a problem.
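A quick demonstration of why that matters, using a hypothetical scratch directory:
mkdir -p /tmp/demo && touch '/tmp/demo/name with spaces'
find /tmp/demo -type f | xargs du -sh        # du is handed three broken arguments
find /tmp/demo -type f -print0 | xargs -0 du -sh   # the name arrives intact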
Some versions of the find command do not accept "+" as the terminator of -exec;
use "\;" instead of "+" in that case.

List files over a specific size in current directory and all subdirectories

How can I display all files greater than 10k bytes in my current directory and its subdirectories?
I tried ls -size +10k but that didn't work.
find . -size +10k -exec ls -lh {} \+
The first part of this is identical to @sputnick's answer, and successfully finds all files in the directory over 10k (don't confuse k with K). My addition, the second part, executes ls -lh, which lists (-l) the files with human-readable sizes (-h); drop the h if you prefer. Of course the {} is the file itself, and the \+ is simply an alternative to \;,
which in practice with \; would repeat, i.e.:
ls -l found.file; ls -l found.file.2; ls -l found.file.3
whereas with \+ it runs as one statement, i.e.:
ls -l found.file found.file.2 found.file.3
more on \; vs + with find
Additionally, you may want the listing ordered by size, which is relatively easy to accomplish: add the -s option to ls (so ls -ls) and pipe it to sort -n to sort numerically,
which would become:
find . -size +10k -exec ls -ls {} \+ | sort -n
or, for reverse order, add -r:
find . -size +10k -exec ls -ls {} \+ | sort -nr
Finally, your title says find the biggest file in a directory. You can do that by piping the above to tail:
find . -size +10k -exec ls -ls {} \+ | sort -n | tail -1
would find you the largest file in the directory and its sub directories.
Note that you could also have ls sort the files by size using -S, removing the need for sort; but to find the largest file you would then need head:
find . -size +10k -exec ls -lS {} \+ | head -1
The benefit of doing it with -S instead of sort is, one, you don't have to type sort -n, and two, you can also use -h, the human-readable size option. That is one of my favorites to use, but it is not available in older versions of ls; for example, we have an old CentOS 4 server at work that doesn't have -h.
Try doing this:
find . -size +10k -ls
And if you want to use the binary ls:
find . -size +10k -exec ls -l {} \;
I realize the assignment is likely long over. For anyone else:
You are overcomplicating.
find . -size +10k
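The k suffix is a widespread extension; if your find lacks it, the byte-count spelling is a portable equivalent:
find . -size +10240c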
I'll add to @matchew's answer (not enough karma points to comment):
find . -maxdepth 1 -type f -size +10k -exec ls -lh {} \; > myLogFile.txt
-type f: match regular files only
-maxdepth 1: make sure it only finds files in the current directory
You may use ls like that:
ls -lR | egrep -v '^d' | awk '$5>10240{print}'
Explanation:
ls -lR # list recursively
egrep -v '^d' # only print lines which do not start with a 'd' (i.e., not directories)
only print lines where the fifth column (size) is greater than 10240 bytes:
awk '$5>10240{print}'

How do I find all the files that were created today in Unix/Linux?

How do I find all the files that were created only today, and not in the last 24-hour period, in Unix/Linux?
On my Fedora 10 system, with findutils-4.4.0-1.fc10.i386:
find <path> -daystart -ctime 0 -print
The -daystart flag tells it to calculate from the start of today instead of from 24 hours ago.
Note however that this will actually list files created or modified in the last day. find has no options that look at the true creation date of the file.
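A quick side-by-side of the two behaviors:
find . -type f -daystart -mtime 0   # modified since the start of today
find . -type f -mtime 0             # modified within the last 24 hours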
find . -mtime -1 -type f -print
To find all files that are modified today only (since start of day only, i.e. 12 am), in current directory and its sub-directories:
touch -t `date +%m%d0000` /tmp/$$
find . -type f -newer /tmp/$$
rm /tmp/$$
I use this with some frequency:
$ ls -altrh --time-style=+%D | grep $(date +%D)
After going through many posts, I found the one that really works:
find $file_path -type f -name "*.txt" -mtime -1 -printf "%f\n"
This prints only the file name, like
abc.txt, not /path/to/folder/abc.txt
You can also play around with or customize -mtime -1 as needed.
This worked for me. Lists the files created on May 30 in the current directory.
ls -lt | grep 'May 30'
Use ls or find to get all the files that were created today.
Using ls: ls -ltr | grep "$(date '+%b %e')"
Using find: cd $YOUR_DIRECTORY; find . -ls 2>/dev/null | grep "$(date '+%b %e')"
find ./ -maxdepth 1 -type f -execdir basename '{}' ';' | grep `date +'%Y%m%d'`
You can use find and ls to accomplish this:
find . -type f -exec ls -l {} \; | egrep "Aug  26";
It will find all files in this directory, display useful information (-l) and filter the lines by the date you want. It may be a little bit slow, but it is still useful in some cases.
Just keep in mind that there are 2 spaces between Aug and 26. Otherwise your find command will not work.
If you did something like accidentally rsync to the wrong directory, the above suggestions work to find new files, but for me the easiest was connecting with an SFTP client like Transmit, then ordering by date and deleting.
To get files modified between 24 and 48 hours ago, execute the command below:
find . -type f -mtime 1 -exec ls -l {} \;
To get files modified within the last 24 hours, execute the command below:
find . -type f -mtime -1 -exec ls -l {} \;
To get files modified more than n days ago, use +n; for example, +2 below matches files more than 2 days old:
find . -type f -mtime +2 -exec ls -l {} \;
