UNIX: Use a single find command to search files larger than 4 MiB, then pipe the output to a sort command - linux

I am trying to answer the question below. Here is what I have come up with, but it doesn't appear to be working:
find /usr/bin -type f -size +4194304c | sort -n
Am I on the right track with the above?
Question:
Use a single find command to search for all files larger than 4 MiB in
/usr/bin, printing the listing in a long format. Pipe this output to a sort command
which will sort the list from largest to smallest

I'd try find's -printf command-line switch, something like this:
find YOUR_CONDITION_HERE -printf '%s %p\n' | sort -n
Here %s stands for the size in bytes and %p for the file name. Use sort -rn instead if you want the largest files first.
You can trim the sizes off afterwards, e.g. using cut:
find -type f -size +4194304c -printf '%s %p\n' | sort -n | cut -d ' ' -f 2-
(-f 2- rather than -f 2 keeps file names that contain spaces intact.)
But given that you need the long listing format, I guess you'll be adding more fields to -printf's argument.
Related topic: https://superuser.com/questions/294161/unix-linux-find-and-sort-by-date-modified
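For what it's worth, here is a minimal sketch of the -printf approach run against a throwaway directory. It assumes GNU find (for -printf); the demo directory, the file names and the 1000-byte threshold are made up for illustration. A tab separates size and path so that cut does not split names containing spaces.

```shell
# Hypothetical demo data: three files of 10, 2000 and 3000 bytes.
demo=$(mktemp -d)
head -c 10 /dev/zero > "$demo/tiny file"
head -c 2000 /dev/zero > "$demo/large file"
head -c 3000 /dev/zero > "$demo/largest file"

# Print "size<TAB>path", sort numerically in reverse (largest first),
# then drop the size column; -f 2- keeps everything after the first tab.
names=$(find "$demo" -type f -size +1000c -printf '%s\t%p\n' | sort -rn | cut -f 2-)
printf '%s\n' "$names"

rm -rf "$demo"
```

File names containing newlines would still confuse this pipeline, since sort and cut work line by line.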

You are on the right track, but the find command will only output the name of each file, not its size. This is why sort will sort them alphabetically.
To sort by size, you can output the file list and then pass it to ls with xargs like this:
find /usr/bin -type f -size +4194304c | xargs ls -S
If you want ls to output the file list on a single column, you can replace the -S with -S1. The command would become:
find /usr/bin -type f -size +4194304c | xargs ls -S1
To make your command robust against unusual filenames, I would suggest using -print0 (it separates paths with the null character, which is the only character that cannot appear in a path on Linux). The command would become:
find /usr/bin -type f -size +4194304c -print0 | xargs -0 ls -S1

You could also try
find /usr/bin -type f -size +4194304c -ls | sort -n -k7
and if you want the results reversed then try
find /usr/bin -type f -size +4194304c -ls | sort -r -n -k7
Or another option
find /usr/bin -type f -size +4194304c -exec ls -lSd {} +
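Putting the pieces together, one possible reading of the assignment (long format, largest to smallest) can be sketched like this. The demo directory and file names are hypothetical stand-ins for /usr/bin, and sorting on field 5 assumes the usual ls -l layout in which owner and group names contain no spaces.

```shell
# Hypothetical demo directory; the real assignment targets /usr/bin.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/small" bs=1024 count=1 2>/dev/null      # 1 KiB
dd if=/dev/zero of="$demo/big" bs=1048576 count=5 2>/dev/null     # 5 MiB
dd if=/dev/zero of="$demo/bigger" bs=1048576 count=6 2>/dev/null  # 6 MiB

# -exec ... + hands the matches to ls in batches; sort -k5,5nr then orders
# the long listing by its size column, largest first.
listing=$(find "$demo" -type f -size +4194304c -exec ls -l {} + | sort -k5,5nr)
printf '%s\n' "$listing"

rm -rf "$demo"
```

Sorting with sort rather than relying on ls -S keeps the order correct even if find has to split a very long file list across several ls calls.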

Related

Find/list only file types recursive in a directory

How can I search for files and output only a list of their MIME types or extensions?
example:
dir
-file1.pdf
-file2.pdf
-dir2
--file3.png
--file4.pdf
wished output:
pdf
png
Edit
I also found a solution, but it makes no distinction between upper and lowercase, nor between .peng and .png:
find . -type f -printf '%f\n' | sed 's/^.*\.//' | sort -u
Use find's -exec option together with any solution that extracts the extension. Then pipe through sort -u to remove duplicates.
find dir -type f -exec bash -c 'printf %s\\n "${@##*.}"' x {} + | sort -u
Files without extensions will be listed too. To filter them out add the option -name '*\.*' before -exec.
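If you want extensions that differ only in case folded together, tr can lowercase them before sort -u removes the duplicates. A small sketch on a made-up directory tree (all file names here are hypothetical, and names containing newlines are not handled):

```shell
# Hypothetical tree mirroring the example in the question.
demo=$(mktemp -d)
mkdir "$demo/dir2"
: > "$demo/file1.pdf"
: > "$demo/file2.PDF"
: > "$demo/README"            # no extension, filtered out by -name '*.*'
: > "$demo/dir2/file3.png"
: > "$demo/dir2/file4.pdf"

# Keep only names containing a dot, strip everything up to the last dot,
# lowercase the result and remove duplicates.
exts=$(find "$demo" -type f -name '*.*' | sed 's/^.*\.//' | tr '[:upper:]' '[:lower:]' | sort -u)
printf '%s\n' "$exts"

rm -rf "$demo"
```

Note that a dotfile such as .bashrc would be reported as the extension bashrc with this approach.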

How to pipe a list of files returned from find to cat and sort them

I'm trying to find all the files from a folder and then print them but sorted.
I have this so far
find . -type f -exec cat {} \;
and it prints all the files, but I need to sort them too. When I do
find . -type f -exec sort cat {};
I get the next error
sort:cannot read:cat:No such file or directory
and if I switch sort and cat like this
find . -type f -exec cat sort {} \;
I get the same error, and then it prints the file (I have only one file to print).
It's not clear to me if you want to display the contents of the files unchanged sorting the files by name, or if you want to sort the contents of each file. If the latter:
find . -type f -exec sort {} \;
If the former, use bsd find's -s option:
find -s . -type f -exec cat {} \;
If you don't have bsd find, use:
find . -type f -print0 | sort -z | xargs -0 cat
Composing commands using pipes is often the simplest solution.
find . -type f -print0 | sort -z | xargs -0 cat
Explanation: you can sort filenames after the fact using ... | sort, then pass the output (the list of files) to cat using xargs, i.e. ... | xargs cat.
As @arkaduisz points out, when using pipes you should carefully handle filenames containing whitespace (hence -print0, sort -z and xargs -0).
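To see the null-separated pipeline cope with a name containing a space, here is a small sketch. The directory and file names are invented for the demo, sort -z is assumed to be available (GNU sort has it), and LC_ALL=C is only there to make the ordering predictable.

```shell
# Hypothetical demo files; one name contains a space on purpose.
demo=$(mktemp -d)
printf 'second\n' > "$demo/b.txt"
printf 'first\n' > "$demo/a file.txt"

# Paths travel NUL-separated through the whole pipeline, so the space in
# "a file.txt" is never treated as a separator.
out=$(find "$demo" -type f -print0 | LC_ALL=C sort -z | xargs -0 cat)
printf '%s\n' "$out"

rm -rf "$demo"
```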

linux bash count size of files in folder

I saw a few posts in the forum but I can't manage to make them work for me.
I have a script that runs in a folder, and I want it to count the size of only the files in that folder, without the folders inside it.
so if i have
file1.txt
folder1
file2.txt
it will return the size in bytes of file1+file2 without folder1
find . -maxdepth 1 -type f
gives me a list of all the files I want to count, but how can I get the total size of all these files?
The tool for this is xargs:
find "$dir" -maxdepth 1 -type f -print0 | xargs -0 wc -c
Note that find -print0 and xargs -0 are GNU extensions, but if you know they are available, they are well worth using in your script - you don't know what characters might be present in the filenames in the target directory.
You will need to post-process the output of wc; alternatively, use cat to give it a single input stream, like this:
find "$dir" -maxdepth 1 -type f -print0 | xargs -0 cat | wc -c
That gives you a single number you can use in following commands.
(I've assumed you meant "size" in bytes; obviously substitute wc -m if you meant characters or wc -l if you meant lines).
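A quick way to convince yourself that the subfolder is left out is to try it on files of known size. The names below are made up to match the example in the question:

```shell
# Hypothetical layout: two files (5 + 3 bytes) and one subfolder with a 100-byte file.
demo=$(mktemp -d)
mkdir "$demo/folder1"
printf 'hello' > "$demo/file1.txt"
printf 'abc' > "$demo/file2.txt"
head -c 100 /dev/zero > "$demo/folder1/inner.txt"

# -maxdepth 1 keeps find out of folder1, so only the two top-level files are counted.
total=$(find "$demo" -maxdepth 1 -type f -print0 | xargs -0 cat | wc -c)
total=$((total))   # strips the padding some wc implementations add
echo "$total"

rm -rf "$demo"
```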

Bash - listing programs in all subdirectories with directory name before file

I don't need to do this in one line, but I've only got 1 line so far.
find . -perm -111 -type f | sort -r
What I'm trying to do is write a bash script that will display the list of all files in the current directory that are executable (z to a). I want the script to do the same for all subdirectories. What I'm having difficulty doing is displaying the name of the subdirectory before the list of executable files in that directory / subdirectory.
So, to clarify, desirable output might look like this:
program1
program2
SubDir1
program3
SubDirSubDir2
program4
SubDir2
program5
What I have right now (the above code) does this. It's not removing the leading path and it isn't listing the name of the new directory when the directory changes.
./exfile
./test/exfile1
./test1/program2
./test1/program
./first
Hopefully that was clear.
This will work.
I changed the permission to -100 because some programs may be executable only by their owner.
for d in $(find . -type d); do   # note: breaks on directory names containing whitespace
echo "in $d:"
find "$d" -maxdepth 1 -perm -100 -type f | sed 's#.*/##'
done
This will do the trick for you.
find . -type d | sort | xargs -I{} bash -c "find {} -maxdepth 1 -type f -executable | sort -r"
The first find command lists all directories and subdirectories, and sort puts them in ascending order.
The sorted directories/sub-directories are then passed to xargs which calls bash to find the files within the directory/sub-directory and sort them in descending order.
If you prefer to also print the directory, you may run it without -type f.
You can use find on all directories and combine it with -print (to print the directory name) and -exec (to execute a find for files in that directory):
find . -type d -print -exec bash -c 'find {} -type f -depth 1 -perm +0111 | sort -r' \;
Let's break this down. First, you have the directory search:
find . -type d -print
Then the command to execute for each directory:
find {} -type f -depth 1 -perm +0111 | sort -r
The -exec switch will expand the path wherever it sees {}. Because this uses a pipe operator that is shell syntax, the whole thing is wrapped in bash -c.
You can expand on this further. If you want to strip the directory name off the files and space out your results more nicely, something like this might suffice:
find {} -type f -depth 1 -perm +0111 -print0 | xargs -n1 -0 basename | sort -r && echo
Hmm, the sorting requirement makes this tricky - the "for d in $(find...)" command is clever, but hard to control the sorting. How about this? Everything is z->a, including the directories, but the awk statement is a bit of a monster ;_)
find "$(pwd)" -perm -111 -type f |
sort -r |
xargs -n1 -I{} sh -c "dirname {};basename {}" |
awk '/^\// {dir=$0 ; if (dir != lastdir) {print;lastdir=dir}} !/^\// {print}'
Produces
/home/imcgowan/t/t3
jjj
iii
hhh
/home/imcgowan/t/t2
ggg
fff
eee
/home/imcgowan/t/t1
ddd
ccc
bbb
/home/imcgowan/t
aaa
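Another way to get the "directory name, then its executables z to a" layout is a plain while read loop. This is only a sketch under a few assumptions: the directory tree and file names are invented for the demo, directory names contain no newlines, and "executable" is taken to mean the owner's execute bit (-perm -100).

```shell
# Hypothetical tree: two executables at the top level, one in SubDir1.
demo=$(mktemp -d)
mkdir "$demo/SubDir1"
for f in program1 program2 SubDir1/program3; do
    : > "$demo/$f"
    chmod u+x "$demo/$f"
done
: > "$demo/notes.txt"   # not executable, so it should not be listed

out=$(
    cd "$demo" &&
    find . -type d | LC_ALL=C sort | while IFS= read -r d; do
        printf '%s\n' "$d"                 # directory heading
        # executables directly inside $d, basename only, z to a
        find "$d" -maxdepth 1 -type f -perm -100 | sed 's#.*/##' | LC_ALL=C sort -r
    done
)
printf '%s\n' "$out"

rm -rf "$demo"
```

The directories themselves come out a to z here; change the first sort to sort -r if they should be reversed as well.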

List files over a specific size in current directory and all subdirectories

How can I display all files greater than 10k bytes in my current directory and its subdirectories?
Tried ls -size +10k but that didn't work.
find . -size +10k -exec ls -lh {} \+
The first part of this is identical to @sputnick's answer and successfully finds all files under the current directory that are over 10k (don't confuse k with K). My addition, the second part, executes ls -lh, which lists the files in long format (-l) with human-readable sizes (-h). Omit the h if you prefer. The {} stands for the found file(s), and \+ is an alternative to \;
In practice, \; runs ls once per file:
ls -l found.file; ls -l found.file.2; ls -l found.file.3
whereas \+ passes all the files to a single ls call:
ls -l found.file found.file.2 found.file.3
more on \; vs + with find
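If you want to see the difference for yourself, you can have find run a tiny shell that just reports how many arguments it received. The three file names are made up for the demo:

```shell
# Hypothetical demo directory with three empty files.
demo=$(mktemp -d)
: > "$demo/a"
: > "$demo/b"
: > "$demo/c"

# With \; the inner shell runs once per file and sees one argument each time.
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \;)

# With + the files are batched into one call, which sees all three arguments.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'per file:\n%s\nbatched:\n%s\n' "$per_file" "$batched"
rm -rf "$demo"
```

With a very large number of matches, + may still split the list over several invocations, but it will use as few as the argument length limit allows.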
Additionally, you may want the listing ordered by size, which is relatively easy to accomplish. I would add the -s option to ls (so ls -ls), which prints each file's allocated size in blocks as the first column, and then pipe it to sort -n to sort numerically,
which would become:
find . -size +10k -exec ls -ls {} \+ | sort -n
or in reverse order add an -r :
find . -size +10k -exec ls -ls {} \+ | sort -nr
Finally, your title says to find the biggest file in the directory. You can do that by piping the output to tail:
find . -size +10k -exec ls -ls {} \+ | sort -n | tail -1
This finds the largest file in the directory and its subdirectories.
Note that you could also have ls sort the files by size with -S, which removes the need for sort. To find the largest file you would then use head:
find . -size +10k -exec ls -lS {} \+ | head -1
The benefit of -S over sort is, first, that you don't have to type sort -n and, second, that you can also use -h, the human-readable size option. It is one of my favorites, but it is not available in older versions of ls; for example, an old CentOS 4 server we have at work doesn't have -h.
Try doing this:
find . -size +10k -ls
And if you want to use the binary ls :
find . -size +10k -exec ls -l {} \;
I realize the assignment is likely long over. For anyone else:
You are overcomplicating.
find . -size +10k
I'll add to @matchew's answer (not enough karma points to comment):
find . -maxdepth 1 -type f -size +10k -exec ls -lh {} \; > myLogFile.txt
-type f: restrict matches to regular files
-maxdepth 1: only look at files in the current directory, not in subdirectories
You may use ls like that:
ls -lR | egrep -v '^d' | awk '$5>10240{print}'
Explanation:
ls -lR # list recursively
egrep -v '^d' # only print lines which do not start with a 'd' (i.e. skip directories)
awk '$5>10240{print}' # only print lines where the fifth column (size) is greater than 10240 bytes
