List files over a specific size in current directory and all subdirectories - linux

How can I display all files greater than 10k bytes in my current directory and its subdirectories?
I tried ls -size +10k, but that didn't work.

find . -size +10k -exec ls -lh {} \+
The first part of this is identical to @sputnick's answer and successfully finds all files in the directory over 10k (don't confuse k with K). My addition, the second part, then executes ls -lh, which lists (-l) the files with human-readable sizes (-h); omit the h if you prefer. Of course the {} is the file itself, and the \+ is simply an alternative to \;.
In practice, \; would run the command once per file:
ls -l found.file; ls -l found.file.2; ls -l found.file.3
whereas \+ runs it as one statement:
ls -l found.file found.file.2 found.file.3
More on \; vs + with find.
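A quick way to see the difference yourself (a small sketch; echo stands in for ls so each invocation is visible as one line of output):
find . -size +10k -exec echo {} \;   # one invocation, and one output line, per file
find . -size +10k -exec echo {} +    # all files passed to a single invocation (per batch)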
Additionally, you may want the listing ordered by size, which is relatively easy to accomplish: add the -s option to ls (so ls -ls) and then pipe it to sort -n to sort numerically,
which would become:
find . -size +10k -exec ls -ls {} \+ | sort -n
or, in reverse order, add -r:
find . -size +10k -exec ls -ls {} \+ | sort -nr
Finally, your title says to find the biggest file in the directory. You can do that by then piping the code to tail:
find . -size +10k -exec ls -ls {} \+ | sort -n | tail -1
which would find you the largest file in the directory and its subdirectories.
Note that you could also sort files by size using -S and negate the need for sort; but then, to find the largest file, you would need to use head:
find . -size +10k -exec ls -lS {} \+ | head -1
The benefit of doing it with -S and not sort is twofold: one, you don't have to type sort -n, and two, you can also use -h, the human-readable size option. That is one of my favorites to use, but it is not available in older versions of ls; for example, we have an old CentOS 4 server at work whose ls doesn't have -h.
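For instance, a sketch combining both flags (assuming a GNU ls that supports -h):
# -S sorts by size, largest first; -h prints human-readable sizes;
# head -1 then grabs the largest file
find . -size +10k -exec ls -lhS {} \+ | head -1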

Try doing this:
find . -size +10k -ls
And if you want to use the ls binary:
find . -size +10k -exec ls -l {} \;

I realize the assignment is likely long over. For anyone else:
You are overcomplicating it.
find . -size +10k

I'll add to @matchew's answer (not enough karma points to comment):
find . -maxdepth 1 -type f -size +10k -exec ls -lh {} \; > myLogFile.txt
-type f: only match regular files
-maxdepth 1: make sure it only finds files in the current directory
(Note that -maxdepth is placed first: GNU find warns if a global option like -maxdepth appears after tests such as -type or -size.)
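If you later want that log ordered by size, GNU sort can parse the human-readable sizes that ls -lh writes (this assumes GNU coreutils; -h is not in POSIX sort):
# -k5,5 sorts on the fifth column (the size); -h understands suffixes like K and M
sort -k5,5 -h myLogFile.txt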

You may use ls like this:
ls -lR | egrep -v '^d' | awk '$5>10240{print}'
Explanation:
ls -lR # list recursively
egrep -v '^d' # only print lines which do not start with a 'd' (i.e., files, not directories)
awk '$5>10240{print}' # only print lines where the fifth column (the size) is greater than 10240 bytes
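As a minor variant, the egrep and awk stages can be collapsed into a single awk call (the same caveats about parsing ls output apply):
# skip directory lines (those starting with 'd'); print entries over 10240 bytes
ls -lR | awk '!/^d/ && $5 > 10240'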

Related

UNIX: Use a single find command to search files larger than 4 MiB, then pipe the output to a sort command

I currently have a question I am trying to answer, shown below. This is what I have come up with, but it doesn't appear to be working:
find /usr/bin -type f -size +4194304c | sort -n
Am I on the right track with the above?
Question:
Use a single find command to search for all files larger than 4 MiB in
/usr/bin, printing the listing in a long format. Pipe this output to a sort command
which will sort the list from largest to smallest
I'd fiddle with the -printf command-line switch, something like this:
find YOUR_CONDITION_HERE -printf '%s %p\n' | sort -n
Here %s stands for the size in bytes and %p for the file name.
You can trim the sizes off later, e.g. using cut:
find -type f -size +4194304c -printf '%s %p\n' | sort -n | cut -f 2 -d ' '
But given that you need the long list format, I guess you'll be adding more fields to printf's argument.
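For example, a sketch of a longer listing built from GNU find's -printf directives (%M permissions, %u user, %g group, %TY-%Tm-%Td modification date); the size comes first so that sort -rn orders largest to smallest:
find /usr/bin -type f -size +4194304c \
    -printf '%s %M %u %g %TY-%Tm-%Td %p\n' | sort -rn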
Related topic: https://superuser.com/questions/294161/unix-linux-find-and-sort-by-date-modified
You are on the right track, but the find command will only output the name of the file, not its size. This is why sort sorts them alphabetically.
To sort by size, you can output the file list and then pass it to ls with xargs like this:
find /usr/bin -type f -size +4194304c | xargs ls -S
If you want ls to output the file list on a single column, you can replace the -S with -S1. The command would become:
find /usr/bin -type f -size +4194304c | xargs ls -S1
To make your command robust against all filenames, I would suggest using -print0 (it separates paths with the null character, which is the only character that cannot appear in a filename on Linux). The command would become:
find /usr/bin -type f -size +4194304c -print0 | xargs -0 ls -S1
You could also try
find /usr/bin -type f -size +4194304c -ls | sort -n -k7
and if you want the results reversed then try
find /usr/bin -type f -size +4194304c -ls | sort -r -n -k7
Or another option
find /usr/bin -type f -size +4194304c -exec ls -lSd {} +

Adjusting xargs to accept ls -lh

I want to find files larger than X MB, so I run
find data/ -size +2M
but I need MB next to each file, so I tried this:
find data/ -size +2M | xargs -I '{}' ls -lh '{}'
The above seems to list all files regardless of size. Is the xargs part incorrect? It also does an ls on data/ rather than on the matching files.
How should the above be written ?
It worked OK when I specified -type f, but I think that is not the solution.
find data/ -size +2M -type f | xargs -I '{}' ls -lh '{}'
This might help you:
sudo find / -size +2M -exec ls -s1h {} \;
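For what it's worth, the reason -type f helped: without it, find can also match directories (a directory's own size can exceed 2M), and running plain ls -lh on a directory lists everything inside it. If you'd rather keep directories in the output, a sketch using ls's -d flag, which makes ls list a directory as itself rather than its contents:
find data/ -size +2M | xargs -I '{}' ls -lhd '{}'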

How to list all the find -perm results?

I want to give a long list (with ls -l) of all the files in the home directory that are writable by the user. How can I combine find and ls -l?
find ~/ -maxdepth 1 -exec ls -l '{}' \;
If you are strictly interested only in files, i.e., no folders, then you can tune the last command in the following way:
find ~/ -maxdepth 1 -type f -exec ls -l '{}' \;
Check your man page for "find". It has a -ls action that you can tag on to the end:
-ls True; list current file in ls -dils format on standard output.
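Putting the pieces together for the question's title, a sketch (GNU find; -perm -u+w keeps only entries whose owner-write bit is set):
# long-list user-writable regular files directly under the home directory
find ~/ -maxdepth 1 -type f -perm -u+w -ls
Note that GNU find also has a -writable test, which checks real access rather than the permission bits.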

calculate total used disk space by files older than 180 days using find

I am trying to find the total disk space used by files older than 180 days in a particular directory. This is what I'm using:
find . -mtime +180 -exec du -sh {} \;
but the above is quite evidently giving me the disk space used by every individual file that is found. I want only the total, added-up disk space used by the files. Can this be done using find and the exec command?
Please note I simply don't want to use a script for this; it would be great if there were a one-liner. Any help is highly appreciated.
Why not this?
find /path/to/search/in -type f -mtime +180 -print0 | du -hc --files0-from - | tail -n 1
@PeterT is right. Almost all of these answers invoke a command (du) for each file, which is resource-intensive, slow, and unnecessary. The simplest and fastest way is this:
find . -type f -mtime +180 -printf '%s\n' | awk '{total=total+$1}END{print total/1024}'
du wouldn't summarize if you pass a list of files to it.
Instead, pipe the output to cut and let awk sum it up. So you can say:
find . -mtime +180 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
Note that the option -h, which displays the result in human-readable format, has been replaced by -k, which is equivalent to a block size of 1K. The result is presented in MB (see total/1024 above).
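If you still want a human-readable total, one option (assuming a GNU coreutils recent enough to ship numfmt) is to sum the raw 1K blocks and convert only at the end:
# numfmt scales the KiB total into a human-readable IEC figure (K/M/G)
find . -mtime +180 -exec du -ks {} \; | awk '{ t += $1 } END { print t }' \
    | numfmt --from-unit=1024 --to=iec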
Be careful not to take into account the disk usage by the directories. For example, I have a lot of files in my ~/tmp directory:
$ du -sh ~/tmp
3,7G /home/rpet/tmp
Running the first part of the example posted by @devnull to find the files modified in the last 24 hours, we can see that awk will sum the whole disk usage of the ~/tmp directory:
$ find ~/tmp -mtime 0 -exec du -ks {} \; | cut -f1
3849848
84
80
But there is only one file modified in that period of time, with very little disk usage:
$ find ~/tmp -mtime 0
/home/rpet/tmp
/home/rpet/tmp/kk
/home/rpet/tmp/kk/test.png
$ du -sh ~/tmp/kk
84K /home/rpet/tmp/kk
So we need to take into account only the files and exclude the directories:
$ find ~/tmp -type f -mtime 0 -exec du -ks {} \; | cut -f1 | awk '{total=total+$1}END{print total/1024}'
0.078125
You can also specify date ranges using the -newermt parameter. For example:
$ find . -type f -newermt "2014-01-01" ! -newermt "2014-06-01"
See http://www.commandlinefu.com/commands/view/8721/find-files-in-a-date-range
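Combining that with the -printf/awk summation used above, a sketch that totals the size of files modified within the range:
find . -type f -newermt "2014-01-01" ! -newermt "2014-06-01" \
    -printf '%s\n' | awk '{ t += $1 } END { printf "%.1f MB\n", t/2**20 }'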
You can print file size with find using the -printf option, but you still need awk to sum.
For example, total size of all files older than 365 days:
find . -type f -mtime +365 -printf '%s\n' \
| awk '{a+=$1;} END {printf "%.1f GB\n", a/2**30;}'

pipe a command after splitting the returned value

I'm using a find command which results in multiple lines of output; I then want to pipe each of those lines into an ls command with the -l option specified.
find . -maxdepth 2 -type f |<some splitting method> | ls -l
I want to do this in one "command" and avoid writing to a file.
I believe this is what you are looking for:
find . -maxdepth 2 -type f -exec ls -l {} \;
Explanation:
find . -maxdepth 2 -type f: find regular files, descending at most 2 directory levels
-exec ls -l {} \;: for each such result found, run ls -l on it; {} marks where find substitutes each path
The typical approach is to use -exec:
find . -maxdepth 2 -type f -exec ls -l {} \;
Sounds like you are looking for xargs. For example, on a typical Linux system:
find . -maxdepth 2 -type f -print0 | xargs -0 -n1 ls -l
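If you drop the -n1, xargs batches many paths into a single ls invocation instead of running one per file, which mirrors find's {} + behaviour and saves a fork per file (ls then applies its usual sorting within each batch):
find . -maxdepth 2 -type f -print0 | xargs -0 ls -l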
