finding largest file for each directory - linux

I am stuck trying to list the largest file in each subdirectory of a specific directory.
I succeeded in listing the largest file in a single directory with the following command (on Debian):
find . -type f -printf "%p\n" | ls -rS |tail -1
I expected that putting this command into a shell script (searchHelper.sh) and running the following would return the expected filename for each subdirectory:
find -type d -execdir ./searchHelper.sh {} +
Unfortunately it does not return the largest file for each subdirectory, but something else.
Can I get a hint on how to obtain the filename (with absolute path) of the largest file in each subdirectory?
Many thanks in advance

Give this safe and tested version a try:
find "$(pwd)" -depth -type f -printf "d%h\0%s %p\0" | awk -v RS="\0" '
/^d/ {
    directoryname = substr($0, 2);
}
/^[0-9]/ {
    if (!biggestfilesizeindir[directoryname] || biggestfilesizeindir[directoryname] < $1) {
        biggestfilesizeindir[directoryname] = $1;
        biggestfilesizefilenameindir[directoryname] = substr($0, index($0, " ") + 1);
    }
}
END {
    for (directoryname in biggestfilesizefilenameindir) {
        print biggestfilesizefilenameindir[directoryname];
    }
}'
This is safe even if the names contain special chars: ' " \n etc.
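For comparison, a simpler per-directory sketch (my illustration, not from the answer above). It is not NUL-safe, so it assumes filenames contain no newlines, and it relies on GNU find's -printf:

```shell
#!/bin/sh
# For every subdirectory, print the largest file directly inside it.
# Assumes no newlines in filenames; sort -n orders by the leading size field.
find . -type d | while IFS= read -r dir; do
    largest=$(find "$dir" -maxdepth 1 -type f -printf '%s %p\n' | sort -n | tail -n 1)
    [ -n "$largest" ] && printf '%s\n' "${largest#* }"   # strip the size prefix
done
```

Directories that contain no regular files are silently skipped; when two files tie for the largest size, whichever sorts last wins.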

Related

Linux - is there a way to get the file size of a directory BUT only including the files that have a last modified / creation date of x?

As per the title, I am trying to find the size of a directory (using du) while only counting the files in the directory that have been created (or modified) after a specific date.
Is it something that can be done using the command line?
Thanks :)
From @Bodo's comment. Using GNU find:
find directory/ -type f -newermt 2021-11-25 -printf "%s\t %f\n" | \
awk '{s += $1 } END { print s }' | \
numfmt --to=iec-i
find looks in directory/ (change this)
Looks for files (-type f)
that have a newer modified time than 2021-11-25 (-newermt) (change this)
and outputs each file's size (%s) on its own line
awk adds up all the sizes: {s += $1 }
and prints the result: END { print s }
numfmt formats the byte value as human-readable with --to=iec-i
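numfmt on its own shows what that last step does; it converts a raw byte count into an IEC value:

```shell
# Convert a raw byte count to a human-readable IEC value (Ki, Mi, Gi, ...),
# exactly what the last stage of the pipeline above does.
numfmt --to=iec-i 1048576          # prints 1.0Mi
echo 1048576 | numfmt --to=iec-i   # same, reading from stdin
```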

How to count files in specific subdirectories of a parent directory?

I use the find . -type f | wc -l command to count all the files in a regular directory, but can the command be made more specific when a directory contains many subdirectories? For example, I only want to count the files in the image subdirectories, to know how many images (all .jpeg) I have in total in mydirectory.
The command find /Users/mydirectory -type f -exec file --mime-type {} \; | awk '{if ($NF == "image/jpeg") print $0 }' works, but it just displays them. How do I count them?
Finally, the command find /Users/mydirectory -type f -exec file --no-pad --mime-type {} + | awk '$NF == "image/jpeg" {$NF=""; sub(": $", ""); print}' | wc -l seems to do the trick.
mydirectory/
folder1/
image/
label/
folder2/
image/
label/
...
I have the impression that you are not aware of the -maxdepth parameter of the find command, which limits the depth of the search: with find ... -maxdepth 1 you say that you only want to search within the directory itself. I believe this will solve your question.
A simple example: I created a subdirectory "tralala", added two files and a subsubdirectory containing one file, and launched the following command:
find tralala -maxdepth 1 -type f | wc -l
The answer was 2, which is correct, as you can see from the entries marked "to be counted":
Prompt$ find tralala -ls
... drwxrwxrwx ... tralala/ => is a directory, don't count.
... drwxrwxrwx ... tralala/dir => is a directory, don't count.
... -rwxrwxrwx ... tralala/dir/test.txt => is inside subdirectory, don't count.
... -rwxrwxrwx ... tralala/test.txt => is file inside directory, to be counted.
... -rwxrwxrwx ... tralala/test2.txt => is file inside directory, to be counted.
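Coming back to the original layout, a -path filter restricts the count to files under the image/ subdirectories (a sketch; mydirectory is the example name from the question):

```shell
# Count every regular file that lives under an "image" subdirectory
# anywhere below mydirectory (folder1/image, folder2/image, ...).
find mydirectory -type f -path '*/image/*' | wc -l
```

Note that wc -l counts output lines, so this miscounts filenames that contain newlines; for the tree shown above it is fine.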

How can I output the result of find and grep as filename => found

How can I combine the results of the find and grep commands in the format filename: => string?
For example, find . -maxdepth 2 -type f -name .env -exec grep 'CURRENT_ENV' {} \; displays the matching line when the string CURRENT_ENV is found, e.g. CURRENT_ENV=staging. I want to modify the output to look like this: ./site1.com: CURRENT_ENV=staging.
I can't figure out how to achieve that. Is it possible?
-H, --with-filename
Print the file name for each match. This is the default when
there is more than one file to search.
http://man7.org/linux/man-pages/man1/grep.1.html
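Applied to the command from the question, -H forces the filename prefix even though -exec ... \; runs grep on one file at a time (which is why grep omits the name by default):

```shell
# -H makes grep print "filename:match" even for a single input file,
# which is the case here because -exec ... \; invokes grep once per file.
find . -maxdepth 2 -type f -name .env -exec grep -H 'CURRENT_ENV' {} \;
```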

How to find file with range parameter?

For the following files:
res1, res2, res3, 1res4, res100
The expected result is res1, res2 and res3. How can I use grep to get this result?
Thanks in advance.
grep is not needed.
ls res[1-5]
If you want a number range instead, try:
ls res{1..100}
To do exactly what was requested:
find . -maxdepth 1 -type f | grep '^\./res[1-5]$'
will ignore res100, and only look for files in the current directory.
To get sorted output (as ls would produce), add a sort step:
find . -maxdepth 1 -type f | grep '^\./res[1-5]$' |sort

Getting files names with bash script

I am trying to get all the file names from a directory "blabla",
and only from that directory, without its subdirectories.
I need all those names without the first X names and the last Y names,
and without the path (only the file names themselves).
I tried
#!/bin/bash
find blavla | sort
but it gave me all the files, including those in subfolders,
and it gave me the full names (with the path).
I also have no idea how to skip the first X and last Y names.
I tried searching online and reading man find but didn't find anything.
Use the following command:
find . -maxdepth 1 -type f -exec basename {} ';' | \
sort | \
awk 'BEGIN { X = 2; Y = 2 } { lines[NR] = $0 } END { for (i=1 + X; i<=NR - Y; i++) print lines[i] }'
Set X and Y to how many file names you want to skip at the beginning and at the end of the list respectively.
Try this (substitute Y and X with actual values):
cd blavla && find . -maxdepth 1 -type f | head -n -Y | tail -n +(X+1)
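With concrete values X=2 and Y=2 substituted (and GNU find's -printf '%f\n' to drop the leading ./ path, as the question asked for bare names), the pipeline looks like this; tail -n +N starts printing at line N, hence X+1 = 3:

```shell
# List file names (no path) in the current directory, sorted,
# skipping the first 2 (X=2) and the last 2 (Y=2) entries.
find . -maxdepth 1 -type f -printf '%f\n' | sort | head -n -2 | tail -n +3
```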