How to test if a Linux directory contains only one subdirectory and no other files? - linux

In a /bin/sh script, I'd like to check whether a directory contains only one subdir and no other files (aside from "." and "..", of course). I could probably parse the output of ls, but I also understand that's generally a bad idea. Suggestions?
Reason for the question: When I zip a folder on, say, a Windows machine and unzip it under Linux, sometimes I get a directory whose contents are those of the original folder; sometimes I get a directory containing exactly one subdir, whose contents are those of the original folder. (I assume that something varies in the way I use zip under Windows, or that the various Windows machines I use are configured slightly differently, or ...who knows?) Anyhow, I'd like, on the Linux side, to handle both kinds of results in more or less the same way, hence this question.
For those thinking "What if your Windows-side folder really did contain just one subdir?", it happens that that's OK in this case, although I grant that it's a corner-case for the problem specification.

find would be a good tool for this. It has some neat arguments:
-maxdepth 1 so it does not search recursively
-type d searching only for directories
-printf 1 to overcome the problem with weird filenames (print 1 instead of the file name)
The full command is then:
find DIRECTORY -maxdepth 1 -type d -printf 1
This prints one character for each directory found, including the starting directory itself, so you are looking for output that is exactly two characters long (find conveniently ignores . and .. when searching).
Then, you want to check if there are no other (non-directory) files:
find DIRECTORY -maxdepth 1 ! -type d -printf 1
The full check will then be:
if [ "$(find DIRECTORY -maxdepth 1 -type d -printf 1 | wc -m)" -eq 2 \
-a "$(find DIRECTORY -maxdepth 1 ! -type d -printf 1 | wc -m)" -eq 0 ]; then
# It has only one subdirectory and no other content
fi
Or, you can make it one command using -printf's %y which prints file type (d for directory):
if [ "$(find DIRECTORY -maxdepth 1 -printf %y)" = "dd" ]; then
# It has only one subdirectory and no other content
fi

Your biggest issue in /bin/sh will be invisible (dot) files, since a * doesn't catch them [by default].
This will do what you want, I think:
#!/bin/bash
count_()
{
    echo $(( $# - 2 )) # -2 to remove . and ..
}

count()
{
    count_ * .*
}

ITEM_COUNT=$(count)
Of course you can adapt it to take a path as an argument if you wish.
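For instance, a variant that takes the directory as an argument might look like this (a rough sketch; count_dir and the path are made-up names):
count_dir()
{
    # run in a subshell so the caller's working directory is untouched
    ( cd "$1" && count_ * .* )
}
ITEM_COUNT=$(count_dir /path/to/unzipped)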
Example output:
bash-3.2$ count
3
bash-3.2$ ll
total 0
drwxr-xr-x 5 christopher wheel 170 Mar 18 2014 .
drwxrwxrwx 6 root wheel 204 Jul 5 12:28 ..
drwxr-xr-x 14 christopher wheel 476 Mar 18 2014 .git
drwxr-xr-x 5 christopher wheel 170 Mar 18 2014 bin
drwxr-xr-x 4 christopher wheel 136 Mar 18 2014 pylib
Another example:
sh-3.2$ count_()
> {
> echo $(( $# - 2 )) # -2 to remove . and ..
> }
sh-3.2$ count()
> {
> count_ * .*
> }
sh-3.2$ ITEM_COUNT=$(count)
sh-3.2$ echo ${ITEM_COUNT}
3
Sidenote:
You're right that different zip implementations handle things differently, but on Linux, many tools treat zip /path/to/folder and zip /path/to/folder/ differently (which is absurdly irritating). If you're working in a controlled environment, you might want to instead normalize how things get zipped. However, if this is a user-facing thing, then that sucks.
If you're not using bash as the invoking shell:
countFiles.sh:
#!/bin/bash
count_()
{
    echo $(( $# - 2 )) # -2 to remove . and ..
}
count_ * .*
scriptThatWantsTheCountOfFiles:
#!/bin/tcsh
set count = `./countFiles.sh`

To avoid having to parse output yourself, you can use find.
if [[ `find . -type d | wc -l` -eq 2 && \
      `find . -type f | wc -l` -eq 0 ]]; then
    echo yes
else
    echo no
fi
If you want to avoid recursing into the directory, you can specify -maxdepth 1.
See man find for some of the other options.

The number of links to the directory inode can help somewhat, but it does not solve the problem completely. A directory has 2 + n links pointing to it, where n is the number of subdirectories created inside it. So if you look for directories with exactly 3 links, you'll get the directories with exactly one subdirectory inside. But the only way to confirm the directory holds no other files is to read it.
Perhaps you can use a mixed, two-phase solution: first find all the directories with 3 links pointing to them (very efficient), then read only those, checking that they contain no plain files.
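A rough sketch of that two-phase idea, assuming GNU find and an ext-style filesystem where a directory's link count is 2 plus its number of subdirectories (/some/root is a placeholder):
# Phase 1: candidate directories whose link count is 3 (exactly one subdirectory)
find /some/root -type d -links 3 | while IFS= read -r d; do
    # Phase 2: keep only candidates with no non-directory entries inside
    if [ -z "$(find "$d" -maxdepth 1 ! -type d -print -quit)" ]; then
        printf '%s\n' "$d"
    fi
done
Like most line-oriented pipelines, this would trip over directory names containing newlines.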

Here's a portable shell function, tested in Dash:
check() {
    d=
    for i in "$1"/* "$1"/.*
    do
        test ddd != "$d" && test -d "$i" || return 1
        d=d$d
    done
    test ddd = "$d"
}
We use the variable d to count how many directories we've seen. We're expecting three (., .., and the single subdirectory). If we've already seen three, we exit early; likewise if we've seen anything that isn't a directory.
If we make it to the end of the loop, we check that we've seen three directories; the return value of test becomes the return value of the function.
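A hypothetical call site (the path is a placeholder):
if check /path/to/unzipped; then
    echo "exactly one subdirectory and nothing else"
fi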

Related

Using bash, how can I remove the extensions of all files in a specific directory?

I want to keep the files but remove their extensions. The files do not all have the same extension. My end goal is to remove all their extensions and change them to one single extension of my choice. I have the second part down.
My code so far:
#!/bin/bash
echo -n "Enter the directory: "
read path
#Remove all extensions
find $path -type f -exec mv '{}' '{}'.extension \; #add desired extension
You don't need an external command like find for this; you can do it in bash alone. The script below removes the extension from all the files in the folder $path.
for file in "$path"/*; do
    [ -f "$file" ] || continue
    mv "$file" "${file%.*}"
done
The reason for using [ -f "$file" ] is purely a safety check. The glob expression "$path"/* might match no files at all, in which case it is left as the literal pattern and the mv command would fail. The [ -f "$file" ] || continue condition safeguards against this: when $file is not a regular file, [ -f "$file" ] returns a failure code, the || branch runs, and continue skips straight to the next iteration of the loop.
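Alternatively, in bash you could enable nullglob so that a non-matching glob expands to nothing at all (a sketch of the same loop with that option):
shopt -s nullglob                # a non-matching glob now expands to nothing
for file in "$path"/*; do
    [ -f "$file" ] || continue   # still skip directories
    mv "$file" "${file%.*}"
done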
If you want to add a new extension just do
mv "$file" "${file%.*}.extension"
This could also be a way (it copes with spaces in names, though not newlines):
find . -type f | while IFS= read -r i; do mv "$i" "${i%.*}.ext"; done
You might want to try the below. It uses find and awk with system() to remove the extension:
find . -name "*" -type f|awk 'BEGIN{FS="/"}{print $2}'|awk 'BEGIN{FS="."}{system("mv "$0" ./"$1"")}'
example:
[root@puppet:0 check]# ls -lrt
total 0
-rw-r--r--. 1 root root 0 Oct 5 13:49 abc.ext
-rw-r--r--. 1 root root 0 Oct 5 13:49 something.gz
[root@puppet:0 check]# find . -name "*" -type f|awk 'BEGIN{FS="/"}{print $2}'|awk 'BEGIN{FS="."}{system("mv "$0" ./"$1"")}'
[root@puppet:0 check]# ls -lrt
total 0
-rw-r--r--. 1 root root 0 Oct 5 13:49 abc
-rw-r--r--. 1 root root 0 Oct 5 13:49 something
also if you have a specific extension that you want to add to all the files, you may modify the command as below:
find . -name "*" -type f|awk 'BEGIN{FS="/"}{print $2}'|awk 'BEGIN{FS=".";ext=".png"}{system("mv "$0" ./"$1ext"")}'

How to find/list the directories where a particular sub-directory is not present

I am writing a shell script that checks whether a bin directory is present under each of the user directories under the /home directory. The bin directory can be present directly under the user directory or under a child directory of the user directory.
I mean, let's say I have a user amit under /home. Then the bin directory can be present directly as /home/amit/bin, or it can be present as /home/amit/jash/bin.
Now my requirement is to get a list of the user directories where the bin directory is not present, either directly under the user directory or under a child directory of it. I tried the command:
find /home -type d ! -exec test -e '{}/bin' \; -print
but it is not working. However, when I replace the bin directory with some file, the command works fine. It looks like this command only works for files. Is there any similar command for directories? Any help on this will be greatly appreciated.
You're on the right track. The catch is that your test of "does the following directory NOT exist in this target" can't be expressed within find's conditions in such a way as to return only the top-level directory. So you need to nest, one way or another.
One strategy would be to use a for loop in bash:
$ mkdir foo bar baz one two
$ mkdir bar/bin baz/bin
$ for d in /home/*/; do find "$d" -type d -name bin | grep -q . || echo "$d"; done
foo/
one/
two/
This uses pathname expansion (globbing) to generate the list of directories to test, and then checks for the existence of "bin". If that check fails (i.e. find outputs nothing), the directory is printed. Note the trailing slash on /home/*/, which ensures that you will only be searching within directories, rather than files that might accidentally exist in /home/.
Another possibility might be to use nested finds, if you don't want to depend on bash:
$ find /home/ -type d -depth 1 -not -exec sh -c "find {}/ -type d -name bin -print | grep -q . " \; -print
/home/foo
/home/one
/home/two
This roughly duplicates the effect of the bash for loop above, but by nesting find within find -exec. It uses grep -q . to convert the output of find into an exit status that can be used as a condition for the outer find.
Note that since you're looking for a bin directory, we want to use test -d rather than test -e (which would also check for a bin file, which probably does not matter to you.)
Another option is to use bash process redirection. On multiple lines for easier reading:
cd /home/
comm -3 \
<(printf '%s\n' */ | sed 's|/.*||' | sort) \
<(find */ -type d -name bin | cut -d/ -f1 | uniq)
This unfortunately requires you to change to the /home directory before running, because of the way it strips off subdirectories. You can of course collapse this into a big long one-liner if you feel so inclined.
This comm solution also has the risk of failing on directories with special characters in their names, like newlines.
One last option is bash-only but more than a one-liner. It involves subtracting the directories containing "bin" from the full list. It uses an associative array and globstar, so it depends on bash version 4.
#!/usr/bin/env bash
shopt -s globstar
# Go to our root
cd /home
# Declare an associative array
declare -A dirs=()
# Populate the array with our "full" list of home directories
for d in */; do dirs[${d%/}]=""; done
# Remove directories that contain a "bin" somewhere inside 'em
for d in **/bin; do unset dirs[${d%%/*}]; done
# Print the result in reproducible form
declare -p dirs
# Or print the result just as a list of words.
printf '%s\n' "${!dirs[@]}"
Note that we're storing directories in the array index, which (1) makes it easy for us to find and delete items, and (2) ensures unique entries, even if one user has multiple "bin" directories under their home.
cd /home
find . -maxdepth 1 -type d ! -name . | sort > a
find . -type d -name bin | cut -d/ -f1,2 | sort > b
comm -23 a b
Here, I'm making two sorted lists. The first contains all the home directories, and the second contains the top parent of any bin subdirectory. Finally I output any items from the first list not present in the second.

how to perform action of list of directories listed using for loop?

I am doing a check to see if a particular directory is present inside each of the directories listed below. This table is the only information available about these directories.
user@root> cat u
Directory owner value
-------- ---- -----
0-0-1-0 Aleks 10
0-0-2-0 Ram 23
0-0-3-0 mark 43
0-0-4-0 Sam 22
0-0-5-0 wood 21
0-0-6-0 peter 34
0-0-7-0 ron 45
0-0-8-0 Alic 44
0-0-9-0 amber 56
0-0-10-0 janny 34
user@root> cat u |grep -Ev "owner|--"|awk '{print $1 }'
0-0-1-0
0-0-2-0
0-0-3-0
0-0-4-0
0-0-5-0
0-0-6-0
0-0-7-0
0-0-8-0
0-0-9-0
0-0-10-0
Query:
I want to get into all the directories from 0-0-1-0 to 0-0-10-0 and perform some action. How can I do that?
For example I want to validate if XYZ directory is present inside all the directories or not.
user@root>test -d 0-0-1-0/XYZ; if [ "$?" != "0" ]; then echo "directory is missing"; fi
I think if I can store the value of each row incrementally in some variable, then the issue will be resolved.
You can process your list of files like this:
#!/bin/bash
for dir in $(awk 'NR>2 {print $1}' "$1")
do
    if [[ -d "$dir" ]]
    then
        # use a subshell so we return to the starting directory for the next entry
        (
            cd "$dir"
            pwd
            # Do random stuff
        )
    fi
done
Run the script like this:
./script.sh my_list_of_files
If the directory exists it will cd to that directory and run pwd.
One warning though, this script will get a bit confused if any of your directories have a space in them.
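If the goal is specifically the XYZ check from the question, a while-read loop also sidesteps the word-splitting problem (a sketch; u is the table file and XYZ the directory name from the question):
#!/bin/sh
# Skip the two header lines of the table, then test each listed directory
awk 'NR>2 {print $1}' u | while IFS= read -r dir; do
    if [ ! -d "$dir/XYZ" ]; then
        echo "XYZ is missing in $dir"
    fi
done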
If you know the DIR_NAME_TO_BE_SEARCHED, then you can use following command:
find YOUR_STARTING_DIRECTORY -type d -name DIR_NAME_TO_BE_SEARCHED -print
example:
find . -type d -name test -print
explanation:
This will find all directories (-type d), starting from your current directory, whose name is test (-name test), and output them (-print).
And if you don't know the exact DIR_NAME_TO_BE_SEARCHED, you can use a pattern as well:
find YOUR_STARTING_DIRECTORY -type d -name "DIR_NAME_TO_BE_SEARCHED_PATTERN" -print
example:
find . -type d -name "*test*" -print

Recursively counting files in a Linux directory

How can I recursively count files in a Linux directory?
I found this:
find DIR_NAME -type f ¦ wc -l
But when I run this it returns the following error.
find: paths must precede expression: ¦
This should work:
find DIR_NAME -type f | wc -l
Explanation:
-type f to include only files.
| (and not ¦) redirects find command's standard output to wc command's standard input.
wc (short for word count) counts newlines, words and bytes on its input (docs).
-l to count just newlines.
Notes:
Replace DIR_NAME with . to execute the command in the current folder.
You can also remove the -type f to include directories (and symlinks) in the count.
It's possible this command will overcount if filenames can contain newline characters.
Explanation of why your example does not work:
In the command you showed, you do not use the "Pipe" (|) to kind-of connect two commands, but the broken bar (¦) which the shell does not recognize as a command or something similar. That's why you get that error message.
For the current directory:
find -type f | wc -l
If you want a breakdown of how many files are in each dir under your current dir:
for i in */ .*/ ; do
echo -n $i": " ;
(find "$i" -type f | wc -l) ;
done
That can go all on one line, of course. The parenthesis clarify whose output wc -l is supposed to be watching (find $i -type f in this case).
On my computer, rsync is a little bit faster than find | wc -l in the accepted answer:
$ rsync --stats --dry-run -ax /path/to/dir /tmp
Number of files: 173076
Number of files transferred: 150481
Total file size: 8414946241 bytes
Total transferred file size: 8414932602 bytes
The second line has the number of files, 150,481 in the above example. As a bonus you get the total size as well (in bytes).
Remarks:
the first line is a count of files, directories, symlinks, etc all together, that's why it is bigger than the second line.
the --dry-run (or -n for short) option is important to not actually transfer the files!
I used the -x option to "don't cross filesystem boundaries", which means if you execute it for / and you have external hard disks attached, it will only count the files on the root partition.
You can use
$ tree
after installing the tree package with
$ sudo apt-get install tree
(on a Debian / Mint / Ubuntu Linux machine).
The command shows not only the count of the files, but also the count of the directories, separately. The option -L can be used to specify the maximum display level (which, by default, is the maximum depth of the directory tree).
Hidden files can be included too by supplying the -a option.
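For example, to include hidden files and report only two levels deep (DIR_NAME is a placeholder), something like:
$ tree -a -L 2 DIR_NAME | tail -1
where the last line carries the directory and file counts.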
Since filenames in UNIX may contain newlines (yes, newlines), wc -l might count too many files. I would print a dot for every file and then count the dots:
find DIR_NAME -type f -printf "." | wc -c
Note: The -printf option does only work with find from GNU findutils. You may need to install it, on a Mac for example.
Combining several of the answers here together, the most useful solution seems to be:
find . -maxdepth 1 -type d -print0 |
xargs -0 -I {} sh -c 'echo -e $(find "{}" -printf "\n" | wc -l) "{}"' |
sort -n
It can handle odd things like file names that include spaces, parentheses and even newlines. It also sorts the output by the number of files.
You can increase the number after -maxdepth to get sub directories counted too. Keep in mind that this can potentially take a long time, particularly if you have a highly nested directory structure in combination with a high -maxdepth number.
If you want to know how many files and sub-directories exist from the present working directory, you can use this one-liner:
find . -maxdepth 1 -type d -print0 | xargs -0 -I {} sh -c 'echo -e $(find {} | wc -l) {}' | sort -n
This will work with the GNU flavour; just omit the -e from the echo command for BSD (e.g. OSX).
You can use the command ncdu. It will recursively count how many files a Linux directory contains, and it shows a progress bar, which is convenient if you have many files.
To install it on Ubuntu:
sudo apt-get install -y ncdu
Benchmark: I used https://archive.org/details/cv_corpus_v1.tar (380390 files, 11 GB) as the folder where one has to count the number of files.
find . -type f | wc -l: around 1m20s to complete
ncdu: around 1m20s to complete
If what you need is to count a specific file type recursively, you can do:
find YOUR_PATH -name '*.html' -type f | wc -l
-l is just to display the number of lines in the output.
If you need to exclude certain folders, use -not -path
find . -not -path './node_modules/*' -name '*.js' -type f | wc -l
tree $DIR_PATH | tail -1
Sample Output:
5309 directories, 2122 files
If you want to avoid error cases, don't allow wc -l to see files with newlines (which it will count as 2+ files)
e.g. Consider a case where we have a single file with a single EOL character in it
> mkdir emptydir && cd emptydir
> touch $'file with EOL(\n) character in it'
> find -type f
./file with EOL(?) character in it
> find -type f | wc -l
2
Since at least GNU wc does not appear to have an option to read/count a null-terminated list (except from a file), the easiest solution is simply not to pass it filenames, but static output each time a file is found, e.g. in the same directory as above:
> find -type f -exec printf '\n' \; | wc -l
1
Or if your find supports it
> find -type f -printf '\n' | wc -l
1
To determine how many files there are in the current directory, put in ls -1 | wc -l. This uses wc to do a count of the number of lines (-l) in the output of ls -1. It doesn't count dotfiles. Please note that ls -l (that's an "L" rather than a "1" as in the previous examples) which I used in previous versions of this HOWTO will actually give you a file count one greater than the actual count. Thanks to Kam Nejad for this point.
If you want to count only files and NOT include symbolic links (just an example of what else you could do), you could use ls -l | grep -v ^l | wc -l (that's an "L" not a "1" this time, we want a "long" listing here). grep checks for any line beginning with "l" (indicating a link), and discards that line (-v).
Relative speed: "ls -1 /usr/bin/ | wc -l" takes about 1.03 seconds on an unloaded 486SX25 (/usr/bin/ on this machine has 355 files). "ls -l /usr/bin/ | grep -v ^l | wc -l" takes about 1.19 seconds.
Source: http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x700.html
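The two commands from the quoted HOWTO, collected for reference (note that the long-listing form also counts the leading "total" line, so its result is one higher than the real count):
$ ls -1 /usr/bin/ | wc -l                  # count entries (dotfiles excluded)
$ ls -l /usr/bin/ | grep -v ^l | wc -l     # exclude symbolic links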
With bash:
Create an array of entries with ( ) and get the count with ${#array[@]}.
FILES=(./*); echo ${#FILES[@]}
Ok that doesn't recursively count files but I wanted to show the simple option first. A common use case might be for creating rollover backups of a file. This will create logfile.1, logfile.2, logfile.3 etc.
CNT=(./logfile*); mv logfile logfile.${#CNT[@]}
Recursive count with bash 4+ globstar enabled (as mentioned by @tripleee)
FILES=(**/*); echo ${#FILES[@]}
To get the count of files recursively we can still use find in the same way.
FILES=(`find . -type f`); echo ${#FILES[@]}
For directories with spaces in the name ... (based on various answers above) -- recursively print directory name with number of files within:
find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done
Example (formatted for readability):
pwd
/mnt/Vancouver/Programming/scripts/claws/corpus
ls -l
total 8
drwxr-xr-x 2 victoria victoria 4096 Mar 28 15:02 'Catabolism - Autophagy; Phagosomes; Mitophagy'
drwxr-xr-x 3 victoria victoria 4096 Mar 29 16:04 'Catabolism - Lysosomes'
ls 'Catabolism - Autophagy; Phagosomes; Mitophagy'/ | wc -l
138
## 2 dir (one with 28 files; other with 1 file):
ls 'Catabolism - Lysosomes'/ | wc -l
29
The directory structure is better visualized using tree:
tree -L 3 -F .
.
├── Catabolism - Autophagy; Phagosomes; Mitophagy/
│   ├── 1
│   ├── 10
│   ├── [ ... SNIP! (138 files, total) ... ]
│   ├── 98
│   └── 99
└── Catabolism - Lysosomes/
├── 1
├── 10
├── [ ... SNIP! (28 files, total) ... ]
├── 8
├── 9
└── aaa/
└── bbb
3 directories, 167 files
man find | grep mindep
-mindepth levels
Do not apply any tests or actions at levels less than levels
(a non-negative integer). -mindepth 1 means process all files
except the starting-points.
ls -p | grep -v / (used below) is from answer 2 at https://unix.stackexchange.com/questions/48492/list-only-regular-files-but-not-directories-in-current-directory
find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done
./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
./Catabolism - Lysosomes: 28
./Catabolism - Lysosomes/aaa: 1
Application: I want to find the max number of files among several hundred directories (all depth = 1) [output below again formatted for readability]:
date; pwd
Fri Mar 29 20:08:08 PDT 2019
/home/victoria/Mail/2_RESEARCH - NEWS
time find . -mindepth 1 -type d -print0 | while IFS= read -r -d '' i ; do echo -n $i": " ; ls -p "$i" | grep -v / | wc -l ; done > ../../aaa
0:00.03
[victoria@victoria 2_RESEARCH - NEWS]$ head -n5 ../../aaa
./RNA - Exosomes: 26
./Cellular Signaling - Receptors: 213
./Catabolism - Autophagy; Phagosomes; Mitophagy: 138
./Stress - Physiological, Cellular - General: 261
./Ancient DNA; Ancient Protein: 34
[victoria@victoria 2_RESEARCH - NEWS]$ sed -r 's/(^.*): ([0-9]{1,8}$)/\2: \1/g' ../../aaa | sort -V | (head; echo ''; tail)
0: ./Genomics - Gene Drive
1: ./Causality; Causal Relationships
1: ./Cloning
1: ./GenMAPP 2
1: ./Pathway Interaction Database
1: ./Wasps
2: ./Cellular Signaling - Ras-MAPK Pathway
2: ./Cell Death - Ferroptosis
2: ./Diet - Apples
2: ./Environment - Waste Management
988: ./Genomics - PPM (Personalized & Precision Medicine)
1113: ./Microbes - Pathogens, Parasites
1418: ./Health - Female
1420: ./Immunity, Inflammation - General
1522: ./Science, Research - Miscellaneous
1797: ./Genomics
1910: ./Neuroscience, Neurobiology
2740: ./Genomics - Functional
3943: ./Cancer
4375: ./Health - Disease
sort -V is a natural sort. ... So, my max number of files in any of those (Claws Mail) directories is 4375 files. If I left-pad (https://stackoverflow.com/a/55409116/1904943) those filenames -- they are all named numerically, starting with 1, in each directory -- and pad to 5 total digits, I should be ok.
Addendum
Find the total number of files and subdirectories in a directory.
$ date; pwd
Tue 14 May 2019 04:08:31 PM PDT
/home/victoria/Mail/2_RESEARCH - NEWS
$ ls | head; echo; ls | tail
Acoustics
Ageing
Ageing - Calorie (Dietary) Restriction
Ageing - Senescence
Agriculture, Aquaculture, Fisheries
Ancient DNA; Ancient Protein
Anthropology, Archaeology
Ants
Archaeology
ARO-Relevant Literature, News
Transcriptome - CAGE
Transcriptome - FISSEQ
Transcriptome - RNA-seq
Translational Science, Medicine
Transposons
USACEHR-Relevant Literature
Vaccines
Vision, Eyes, Sight
Wasps
Women in Science, Medicine
$ find . -type f | wc -l
70214 ## files
$ find . -type d | wc -l
417 ## subdirectories
There are many correct answers here. Here's another!
find . -type f | sort | uniq -w 10 -c
where . is the folder to look in and 10 is the number of characters by which to group the directory.
I have written ffcnt to speed up recursive file counting under specific circumstances: rotational disks and filesystems that support extent mapping.
It can be an order of magnitude faster than ls or find based approaches, but YMMV.
Suppose you want per-directory file totals; try:
for d in `find YOUR_SUBDIR_HERE -type d`; do
    printf "$d - files > "
    find $d -type f | wc -l
done
For the current dir, try this:
for d in `find . -type d`; do printf "$d - files > "; find $d -type f | wc -l; done;
If you have directory names containing spaces, you need to change IFS, like this:
OIFS=$IFS; IFS=$'\n'
for d in `find . -type d`; do printf "$d - files > "; find $d -type f | wc -l; done
IFS=$OIFS
We can use the tree command; it displays all the files and folders recursively, and it shows the count of folders and files in the last line of its output.
$ tree path/to/folder/
path/to/folder/
├── a-first.html
├── b-second.html
├── subfolder
│ ├── readme.html
│ ├── code.cpp
│ └── code.h
└── z-last-file.html
1 directories, 6 files
For only the last line of output from the tree command, we can use the tail command on its output:
$ tree path/to/folder/ | tail -1
1 directories, 6 files
To install tree we can use the command below:
$ sudo apt-get install tree
This alternate approach, filtering by filename pattern, counts all available GRUB kernel modules:
ls -l /boot/grub/*.mod | wc -l
Based on the responses given above and the comments, I've come up with the following file count listing. In particular, it's a combination of the solution provided by @Greg Bell, with comments from @Arch Stanton & @Schneems.
Count all files in the current directory & subdirectories
function countit { find . -maxdepth 1000000 -type d -print0 | while IFS= read -r -d '' i ; do file_count=$(find "$i" -type f | wc -l) ; echo "$file_count: $i" ; done }; countit | sort -n -r >file-count.txt
Count all files of given name in the current directory & subdirectories
function countit { find . -maxdepth 1000000 -type d -print0 | while IFS= read -r -d '' i ; do file_count=$(find "$i" -type f | grep <enter_filename_here> | wc -l) ; echo "$file_count: $i" ; done }; countit | sort -n -r >file-with-name-count.txt
find -type f | wc -l
OR (If directory is current directory)
find . -type f | wc -l
This will work completely fine: simple and short. If you just want to count the number of files present in a folder (non-recursively):
ls | wc -l
ls -l | grep -e -x -e -dr | wc -l
ls -l: long list
grep: filter files and dirs
wc -l: count the filtered lines

Script to distribute a large number of files in to smaller groups

I have folders containing large numbers of files (e.g. 1000+) of various sizes which I want to move in to smaller groups of, say, 100 files per folder.
I wrote an AppleScript which counted the files, created a numbered subfolder, and then moved 100 files into the new folder (the number of files could be specified), looping until there were fewer than the specified number of files left, which it moved into the last folder it created.
The problem was that it ran horrendously slowly. I'm looking for either an AppleScript or a shell script I can run on my MacBook and/or Linux box which will efficiently move the files into smaller groups.
How the files are grouped is not particularly significant, I just want fewer files in each folder.
This should get you started:
DIR=$1
BATCH_SIZE=$2
SUBFOLDER_NAME=$3
COUNTER=1
while [ `find $DIR -maxdepth 1 -type f| wc -l` -gt $BATCH_SIZE ] ; do
NEW_DIR=$DIR/${SUBFOLDER_NAME}${COUNTER}
mkdir $NEW_DIR
find $DIR -maxdepth 1 -type f | head -n $BATCH_SIZE | xargs -I {} mv {} $NEW_DIR
let COUNTER++
if [ `find $DIR -maxdepth 1 -type f| wc -l` -le $BATCH_SIZE ] ; then
mkdir $NEW_DIR
find $DIR -maxdepth 1 -type f | head -n $BATCH_SIZE | xargs -I {} mv {} $NEW_DIR
fi
done
The nested if statement gets the last remaining files. You can add additional checks as needed after you modify it for your use.
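A hypothetical invocation, assuming the script above is saved as split_into_batches.sh (the name and paths are placeholders):
$ ./split_into_batches.sh /path/to/big/folder 100 batch
This would create batch1, batch2, ... inside /path/to/big/folder, each holding at most 100 files.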
This is a tremendous kludge, but it shouldn't be too terribly slow:
rm /tmp/counter*
touch /tmp/counter1
find /source/dir -type f -print0 |
  xargs -0 -n 100 \
    sh -c 'n=$(echo /tmp/counter*); \
           n=${n#/tmp/counter}; \
           counter="/tmp/counter$n"; \
           mv "$counter" "/tmp/counter$((n+1))"; \
           mkdir "/dest/dir/$n"; \
           mv "$@" "/dest/dir/$n"' _
It's completely indiscriminate as to which files go where.
The most common way to solve the problem of directories with too many files in them is to subdivide by the first couple of characters of the name. For example:
Before:
aardvark
apple
architect
...
zebra
zork
After:
a/aardvark
a/apple
a/architect
b/...
...
z/zebra
z/zork
If that isn't subdividing well enough, then go one step further:
a/aa/aardvark
a/ap/apple
a/ar/architect
...
z/ze/zebra
z/zo/zork
This should work quite quickly, because the move command that your script executes can use simple glob expansion to select all the files to move, a la mv aa* a/aa, as opposed to having to run a move command individually on each file (which would be my first guess as to why the original script was slow).
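A rough bash sketch of that bucketing idea, assuming simple ASCII filenames and a destination outside the source directory so the bucket directories never collide with the names being moved (all paths are placeholders):
#!/bin/bash
src=/path/to/big/dir
dst=/path/to/bucketed
cd "$src" || exit 1
for p in {a..z} {0..9}; do
    # skip prefixes that match no names in the source directory
    compgen -G "${p}*" > /dev/null || continue
    mkdir -p "$dst/$p"
    # one mv per prefix: glob expansion selects the whole batch at once
    mv -- "${p}"* "$dst/$p"/
done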

Resources