I'm trying to find the largest files created or modified within the last 10 days in Linux

I'm trying to find the largest files created or modified within the last 10 days in Linux.
I want to find all files that exceed 100 MB in size and were created in the last 10 days.
Something along the lines of the below:
find / -type f -size +100000k -exec ls -lShr {} \; | awk '{ print $9 ": " $5 }' | sort -k 5 -r -h
-mtime -10 -ls

You can use -printf of GNU find:
find / -size +100M -mtime -10 -printf '%s\t%p\n' | sort -n
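A minimal sketch of the same idea wrapped in a function, assuming GNU find (for -printf); the name largest_recent and its three parameters are made up for illustration. It adds -type f so only regular files are considered, sorts largest first, and converts the byte counts to MiB with awk:

```shell
# largest_recent DIR MIN_SIZE DAYS  (hypothetical helper, not a standard tool)
# Lists regular files under DIR larger than MIN_SIZE (find -size syntax,
# e.g. 100M) that were modified within the last DAYS days, largest first.
largest_recent() {
    # %s = size in bytes, %p = path; find's error messages
    # (e.g. "Permission denied") are discarded.
    find "$1" -type f -size +"$2" -mtime -"$3" -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        awk '{ size = $1; sub(/^[0-9]+\t/, ""); printf "%.1f MiB\t%s\n", size / 1048576, $0 }'
}

# Example: largest_recent / 100M 10
```

Note that -mtime tests the modification time; find on Linux has no portable test for creation time, so "created" is approximated by "modified" here.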

Related

How to find the count of and total sizes of multiple files in directory?

I have a directory, inside it multiple directories which contains many type of files.
I want to find *.jpg files then to get the count and total size of all individual one.
I know I have to use find, wc -l, and du -ch, but I don't know how to combine them in a single script or a single command.
find . -type f -name "*.jpg" -exec - not sure how to connect all three
Supposing your starting folder is ., this will give you all files and the total size:
find . -type f -name '*.jpg' -exec du -ch {} +
The + at the end executes du -ch on all files at once rather than once per file, allowing you to get the grand total.
If you want to know only the total, add | tail -n 1 at the end.
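Fair warning: this in fact executes
du -ch file1 file2 file3 ...
With very many files the argument list can exceed the system limit; find then splits the files across several du invocations, and each one prints its own total line.
To check the limit (the maximum combined length of the arguments, in bytes):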
$ getconf ARG_MAX
2097152
That's what is configured on my system.
This doesn't give you the number of files though. You'll need to catch the output of find and use it twice.
The last line is the total, so we'll use all but the last line to get the number of files, and the last one for the total:
OUT=$(find . -type f -name '*.jpg' -exec du -ch {} +)
N=$(echo "$OUT" | head -n -1 | wc -l)
SIZE=$(echo "$OUT" | tail -n 1)
echo "Number of files: $N"
echo $SIZE
Which for me gives:
Number of files: 143
584K total
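If the file list could be too long for a single du invocation, one way around it is to have find print each size and let awk count and sum. This is only a sketch, assuming GNU find (for -printf); the function name count_and_total is hypothetical. It reports apparent file sizes in bytes (%s), not disk usage in blocks as du does, so the totals can differ somewhat from du's:

```shell
# count_and_total DIR PATTERN  (hypothetical helper)
# Prints the number of regular files under DIR whose name matches PATTERN
# and the sum of their sizes in bytes.
count_and_total() {
    find "$1" -type f -name "$2" -printf '%s\n' |
        awk '{ n++; total += $1 }
             END { printf "Number of files: %d\nTotal bytes: %.0f\n", n, total }'
}

# Example: count_and_total . '*.jpg'
```

Because the sizes travel through a pipe rather than a command line, the number of files does not matter here.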

UNIX: Use a single find command to search files larger than 4 MiB, then pipe the output to a sort command

I currently have a question I am trying to answer below. Below is what I have come up with, but doesn't appear to be working:
find /usr/bin -type f -size +4194304c | sort -n
Am I on the right track with the above?
Question:
Use a single find command to search for all files larger than 4 MiB in
/usr/bin, printing the listing in a long format. Pipe this output to a sort command
which will sort the list from largest to smallest
I'd experiment with the -printf command-line switch, something like this:
find YOUR_CONDITION_HERE -printf '%s %p\n' | sort -n
Here %s stands for the size in bytes and %p for the file name.
You can trim the sizes off afterwards, e.g. using cut (-f 2- keeps everything after the first space, so file names containing spaces survive):
find -type f -size +4194304c -printf '%s %p\n' | sort -n | cut -f 2- -d ' '
But given the fact you need the long list format, I guess you'll be adding more fields to printf's argument.
Related topic: https://superuser.com/questions/294161/unix-linux-find-and-sort-by-date-modified
You are on the right track, but the find command will only output the name of each file, not its size. This is why sort will sort them alphabetically.
To sort by size, you can output the file list and then pass it to ls with xargs like this:
find /usr/bin -type f -size +4194304c | xargs ls -S
If you want ls to output the file list on a single column, you can replace the -S with -S1. The command would become:
find /usr/bin -type f -size +4194304c | xargs ls -S1
To make your command resistant to all filenames, I would suggest using -print0 (it will separate paths with the null character which is the only one that cannot appear in a filename in Linux). The command would become:
find /usr/bin -type f -size +4194304c -print0 | xargs -0 ls -S1
You could also try
find /usr/bin -type f -size +4194304c -ls | sort -n -k7
and if you want the results reversed then try
find /usr/bin -type f -size +4194304c -ls | sort -r -n -k7
Or another option
find /usr/bin -type f -size +4194304c -exec ls -lSd {} +
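One possible way to combine these ideas in a single find piped to sort, sketched here under the assumption of GNU find; the function name big_files_long is invented for illustration. The size is printed first so sort can order numerically, followed by ls-like fields built from -printf directives: %M (symbolic permissions), %u and %g (owner and group), and %TY-%Tm-%Td (modification date):

```shell
# big_files_long DIR MIN_BYTES  (hypothetical helper)
# Long-format listing of regular files under DIR larger than MIN_BYTES,
# sorted from largest to smallest.
big_files_long() {
    find "$1" -type f -size +"$2"c \
        -printf '%s\t%M %u %g %TY-%Tm-%Td %p\n' |
        sort -rn
}

# Example for the exercise: big_files_long /usr/bin 4194304
```

Unlike the xargs variants, the ordering does not depend on all files fitting into one ls invocation.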

Search entire server for a certain file type larger than 1GB in size? [duplicate]

I have the following linux command I use to determine if a directory is larger than 1GB in size:
du -sh * | sort -hr | awk '$1 ~ /[GT]/'
How would I modify this to instead search for any file that has a certain file type, such as .log filetype?
Better to use find:
find . -type f -name '*.log' -size +1G
sudo find /www-data -name "*.log" -type f -size +1000000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }' | sort

Bash Script to find large files recently modified in the past 24 hours

How can I search through a massive amount of data (28TB) to find the largest 10 files in the past 24 hours?
From the current answers below I've tried:
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
This command takes over 24 hours which defeats the purpose of searching for most recently modified in the past 24 hours. Are there any solutions known to be faster than the one above that can drastically cut search time? Solutions to monitor the system also will not work as there is simply too much to monitor and doing such could cause performance issues.
Something like this?
$ find . -type f -mtime -1 -printf "%p %s\n" | sort -k2nr | head -5
This lists the top 5 files by size among those modified in the past 24 hours.
You can use the standard yet very powerful find command like this (start_directory is the directory to scan):
find start_directory -type f -mtime -1 -size +3G
-mtime -1 option: files modified less than 24 hours ago
-size +3G option: files larger than 3 GiB
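A sketch of a "top N in the last 24 hours" pipeline, assuming GNU find, sort, and head (the -z options need a reasonably recent coreutils); the function name top_recent is hypothetical. It does not make the directory walk itself any faster, which is likely what dominates on 28 TB. It only puts the size first and uses NUL-terminated records so that file names containing spaces or newlines cannot confuse the sort:

```shell
# top_recent DIR N  (hypothetical helper)
# Prints the N largest regular files under DIR modified in the last 24 hours,
# one "size<TAB>path" record per line, largest first.
top_recent() {
    find "$1" -type f -mtime -1 -printf '%s\t%p\0' |
        sort -z -rn |
        head -z -n "$2" |
        tr '\0' '\n'
}

# Example: top_recent . 10
```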

calculate total used disk space by files older than 180 days using find

I am trying to find the total disk space used by files older than 180 days in a particular directory. This is what I'm using:
find . -mtime +180 -exec du -sh {} \;
but the above is quite evidently giving me the disk space used by every file that is found. I want only the total disk space used by all of the files. Can this be done using find and -exec?
Please note that I don't want to write a script for this; it would be great if there were a one-liner. Any help is highly appreciated.
Why not this?
find /path/to/search/in -type f -mtime +180 -print0 | du -hc --files0-from - | tail -n 1
@PeterT is right. Almost all of these answers invoke a command (du) for each file, which is slow, resource-intensive, and unnecessary. The simplest and fastest way is to have find print each size in bytes and let awk sum them (total/1024 gives the result in KiB):
find . -type f -mtime +180 -printf '%s\n' | awk '{total=total+$1}END{print total/1024}'
du wouldn't summarize if you pass a list of files to it.
Instead, pipe the output to cut and let awk sum it up. So you can say:
find . -mtime +180 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
Note that the -h option (human-readable output) has been replaced by -k, which is equivalent to a block size of 1K. The result is presented in MB (see total/1024 above).
Be careful not to take into account the disk usage by the directories. For example, I have a lot of files in my ~/tmp directory:
$ du -sh ~/tmp
3,7G /home/rpet/tmp
Running the first part of example posted by devnull to find the files modified in the last 24 hours, we can see that awk will sum the whole disk usage of the ~/tmp directory:
$ find ~/tmp -mtime 0 -exec du -ks {} \; | cut -f1
3849848
84
80
But there is only one file modified in that period of time, with very little disk usage:
$ find ~/tmp -mtime 0
/home/rpet/tmp
/home/rpet/tmp/kk
/home/rpet/tmp/kk/test.png
$ du -sh ~/tmp/kk
84K /home/rpet/tmp/kk
So we need to take into account only the files and exclude the directories:
$ find ~/tmp -type f -mtime 0 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
0.078125
You can also specify date ranges using the -newermt parameter. For example:
$ find . -type f -newermt "2014-01-01" ! -newermt "2014-06-01"
See http://www.commandlinefu.com/commands/view/8721/find-files-in-a-date-range
You can print file size with find using the -printf option, but you still need awk to sum.
For example, total size of all files older than 365 days:
find . -type f -mtime +365 -printf '%s\n' \
| awk '{a+=$1;} END {printf "%.1f GB\n", a/2**30;}'
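The same summing approach also works with the -newermt date range mentioned earlier. A minimal sketch, assuming GNU find; the function name total_between and its parameters are hypothetical:

```shell
# total_between DIR START END  (hypothetical helper)
# Prints the total size in bytes of regular files under DIR whose
# modification time is newer than START but not newer than END.
total_between() {
    find "$1" -type f -newermt "$2" ! -newermt "$3" -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Example: total_between . 2014-01-01 2014-06-01
```

As with the other -printf examples, %s is the apparent size of each file, so the result can differ from what du reports for sparse files or files smaller than a block.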
