Counting Amount of Files in Directory Including Hidden Files with BASH - linux

I want to count the amount of files in the directory I am currently in (including hidden files). So far I have this:
ls -1a | wc -l
but I believe this returns 2 more than what I want because it also counts "." (current directory) and ".." (directory above this one) as files. How would I go about returning the correct amount of files?

I believe to count all files / directories / hidden file you can also use BASH array like this:
shopt -s nullglob dotglob
cd /whatever/path
arr=( * )
count="${#arr[#]}"
This also works with filenames that contain space or newlines.

Edit:
ls piped to wc is not the right tool for that job. This is because filenames in UNIX can contain newlines as well. This would lead to counting them multiple times.
Following #gniourf_gniourf's comment (thanks!) the following command will handle newlines in file names correctly and should be used:
find -mindepth 1 -maxdepth 1 -printf x | wc -c
The find command lists files in the current directory - including hidden files, excluding the . and .. because of -mindepth 1. It works non-recursively because of -maxdepth 1.
The -printf x action simply prints an x for each file in the directory which leads to an output like this:
xxxxxxxx
Piped to wc -c (-c means counting characters) you get your final result.
Former Answer:
Use the following command:
ls -1A | wc -l
-a will include all files or directories starting with a dot, but -A will exclude the current folder . and the parent folder ..
I suggest to follow man ls

You almost got it right:
ls -1A | wc -l
If you filenames contain new-lines or other funny characters do:
find -type f -ls | wc -l

Related

How to list files in a directory, sorted by size, but without listing folder sizes?

I'm writing a bash script that should output the first 10 heaviest files in a directory and all subfolders (the directory is passed to the script as an argument).
And for this I use the following command:
sudo du -haS ../../src/ | sort -hr
, but its output contains folder sizes, and I only need files. Help!
Why using du at all? You could do a
ls -S1AF
This will list all entries in the current directoriy, sorted descending by size. It will also include the names of the subdirectories, but they will be at the end (because the size of a directory entry is always zero), and you can recognize them because they have a slash at the end.
To exclude those directories and pick the first 10 lines, you can do a
ls -S1AF | head -n 10 | grep -v '/$'
UPDATE:
If your directory contains not only subdirectories, but also files of length zero, some of those empty files might not be shown in the output, as pointed out in the comment by F.Hauri. If this is an issue for your application, I suggest to exchange the order and do a
ls -S1AF | grep -v '/$' | head -n 10
instead.
Would you please try the following:
dir="../../src/"
sudo find "$dir" -type f -printf "%s\t%p\n" | sort -nr | head -n 10 | cut -f2-
find "$dir" -type f searches $dir for files recursively.
The -printf "%s\t%p\n" option tells find to print the filesize
and the filename delimited by a tab character.
The final cut -f2- in the pipeline prints the 2nd and the following
columns, dropping the filesize column only.
It will work with the filenames which contain special characters such as a whitespace except for
a newline character.

How to display directories that begins with small or capital letter in Linux?

I want to display all directory names in directory /opt/BMC/patrol/ that begin with Patrol* or patrol*. Command
ls /opt/BMC/patrol/ | grep -i '^Patrol*'
produces
Patrol3
Patrol3.16
PatrolAgent_3181.sh
patrol_cfg.sh
It is correct but there are directories and files, instead of just directories.
Command ls -d /opt/BMC/patrol/*/ | grep -i '^Patrol*' produces nothing...
Command ls -d /opt/BMC/patrol/*/ | grep -i 'Patrol*' produces
/opt/BMC/patrol/BMCINSTALL/
/opt/BMC/patrol/bmc_products/
/opt/BMC/patrol/cert_gg/
/opt/BMC/patrol/common/
/opt/BMC/patrol/Install/
/opt/BMC/patrol/itools/
/opt/BMC/patrol/Patrol3/
/opt/BMC/patrol/Patrol3.16/
/opt/BMC/patrol/perform/
/opt/BMC/patrol/rtserver/
/opt/BMC/patrol/temp2/
/opt/BMC/patrol/test/
/opt/BMC/patrol/testftp/
/opt/BMC/patrol/Uninstall/
Does it search recursively? What is a command to find only directory names that begins with capital or small letters?
Try find:
find /opt/BMC/patrol -type d -iname 'patrol*'
-type d matches directories, and -iname is a case-insensitive match. The 'patrol*' has to be quoted '' because otherwise the shell will expand the * before find gets a chance.
find does search recursively by default (see Edit 2, below).
Edit ls is not optimized for this use case. ls -d prevents descending into directories, which is why you don't get any matches. As far as grep goes, ^ matches at the beginning of the line, not at the leading / before a directory's name. So grep -i '\/patrol' would be a way to find names beginning with Patrol or patrol, but you would still have to filter down to directories. find is designed to handle all these things.
Edit 2 For non-recursive, use -maxdepth:
find /opt/BMC/patrol -maxdepth 1 -type d -iname 'patrol*'
I made a test directory with the following contents, based on your question:
opt/BMC/patrol/
opt/BMC/patrol/BMCINSTALL/
opt/BMC/patrol/bmc_products/
opt/BMC/patrol/cert_gg/
opt/BMC/patrol/common/
opt/BMC/patrol/common/patrol.d/
opt/BMC/patrol/Install/
opt/BMC/patrol/itools/
opt/BMC/patrol/Patrol3/
opt/BMC/patrol/Patrol3.16/
opt/BMC/patrol/PatrolAgent_3181.sh
opt/BMC/patrol/patrol_cfg.sh
opt/BMC/patrol/perform/
opt/BMC/patrol/rtserver/
opt/BMC/patrol/temp2/
opt/BMC/patrol/test/
opt/BMC/patrol/testftp/
opt/BMC/patrol/Uninstall/
When I run the first command (without -maxdepth), I get:
opt/BMC/patrol
opt/BMC/patrol/common/patrol.d
opt/BMC/patrol/Patrol3
opt/BMC/patrol/Patrol3.16
When I run the second command (with -maxdepth), I get:
opt/BMC/patrol
opt/BMC/patrol/Patrol3
opt/BMC/patrol/Patrol3.16
and common/patrol.d is not present in the results.

How to find/list the directories where a particular sub-directory is not present

I am writing a shell script where it is checking if the bin directory is present under all the users directory under /home directory. The bin directory can be present directly under user directory or under the child directory of the user directory.
I mean let say I have a user as amit under /home. So the bin directory can be present directly as /amit/bin or can be present as /amit/jash/bin
Now my requirement is that I should have a list of users directories where the bin directory is not present either directly under user directory or under the child directory of the user directory. I tried the command as :
find /home -type d ! -exec test -e '{}/bin' \; -print
but it is not working. However when I am replacing the bin directory with some file, the command is working fine. Looks like this command is particularly for files. Is there any similar command for directories?? Any help on this will be greatly appreciated.
You're on the right track. The catch is that your test of "does the following directory NOT exist in this target" can't be expressed within find's conditions in such a way as to return only the top-level directory. So you need to nest, one way or another.
One strategy would be to use a for loop in bash:
$ mkdir foo bar baz one two
$ mkdir bar/bin baz/bin
$ for d in /home/*/; do find "$d" -type d -name bin | grep -q . || echo "$d"; done
foo/
one/
two/
This uses pathname expansion (globbing) to generate the list of directories to test, and then checks for the existence of "bin". If that check fails (i.e. find outputs nothing), the directory is printed. Note the trailing slash on /home/*/, which ensures that you will only be searching within directories, rather than files that might accidentally exist in /home/.
Another possibility might be to use nested finds, if you don't want to depend on bash:
$ find /home/ -type d -depth 1 -not -exec sh -c "find {}/ -type d -name bin -print | grep -q . " \; -print
/home/foo
/home/one
/home/two
This roughly duplicates the effect of the bash for loop above, but by nesting find within find -exec. It uses grep -q . to convert the output of find into an exit status that can be used as a condition for the outer find.
Note that since you're looking for a bin directory, we want to use test -d rather than test -e (which would also check for a bin file, which probably does not matter to you.)
Another option is to use bash process redirection. On multiple lines for easier reading:
cd /home/
comm -3 \
<(printf '%s\n' */ | sed 's|/.*||' | sort) \
<(find */ -type d -name bin | cut -d/ -f1 | uniq)
This unfortunately requires you to change to the /home directory before running, because of the way it strips off subdirectories. You can of course collapse this into a big long one-liner if you feel so inclined.
This comm solution also has the risk of failing on directories with special characters in their names, like newlines.
One last option is bash-only but more than a one-liner. It involves subtracting the directories containing "bin" from the full list. It uses an associative array and globstar, so it depends on bash version 4.
#!/usr/bin/env bash
shopt -s globstar
# Go to our root
cd /home
# Declare an associative array
declare -A dirs=()
# Populate the array with our "full" list of home directories
for d in */; do dirs[${d%/}]=""; done
# Remove directories that contain a "bin" somewhere inside 'em
for d in **/bin; do unset dirs[${d%%/*}]; done
# Print the result in reproducible form
declare -p dirs
# Or print the result just as a list of words.
printf '%s\n' "${!dirs[#]}"
Note that we're storing directories in the array index, which (1) makes it easy for us to find and delete items, and (2) insures unique entries, even if one user has multiple "bin" directories under their home.
cd /home
find . -maxdepth 1 -type d ! -name . | sort > a
find . -type d -name bin | cut -d/ -f1,2 | sort > b
comm -23 a b
Here, I'm making two sorted lists. The first contains all the home directories, and the second contains the top parent of any bin subdirectory. Finally I output any items from the first list not present in the second.

Unix - Only list directories which contain a subdirectory

How can I print in the Unix shell the number of directories in a tree which contain other directories?
I haven't found a solution yet with commands like find or ls.
You can use find command: find . -type d -not -empty
That will print every subdirectory that is not empty. You can control how deep you want the search with -maxdepth.
To print the number, you can use wc -l.
find . -type d -not -empty | wc -l
If you generate a list of all the directories under a particular directory, and then remove the last component from the name, you have a list of the directories containing subdirectories, but there are likely to be repeats in that list. So, you need to post-process the list, yielding (as a first approximation):
find ${base:-.} -type d |
sed 's%/[^/]*$%%' |
sort -u
Find all the directories under the directory or directories listed in variable $base, defaulting to the current directory, and print their names. The code assumes you don't have directories with a newline in the name. If you do, there are fixes, but the best fix is to rename the directory. The sed command removes the last slash and everything after it. The sort eliminates duplicate entries. What's left is the list of directories containing subdirectories.
Well, more or less. There's the degenerate case to consider: the top-level directories in the list will be listed regardless of whether they have sub-directories or not. Fixing that is a bit harder. You need to eliminate any lines of output that exactly match the directories specified to find before removing trailing material. So, you need something like:
{
printf '\\#^%s$#d\n' ${base:-.}
echo 's%/[^/]*$%%'
} > sed.script
find ${base:-.} -type d |
sed -f sed.script |
sort -u
rm -f sed.script
The \\#^%s$#d assumes you don't use # in directory names. If you do use it, then you need to find a character you don't use in names (maybe Control-A) and use that in place of the #. If you could face absolutely any character, then you'll need to do more work escaping some obscure character, such as Control-A, when it appears in a directory name.
There's a problem still: using a fixed name like sed.script for a temporary file name is bad (for multiple reasons — such as two people trying to run the script at the same time in the same directory, though it can also be a security risk), so use mktemp to create a temporary file name:
tmp=$(mktemp ${TMPDIR:-/tmp}/dircnt.XXXXXX)
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
{
printf '\\#^%s$#d\n' ${base:-.}
echo 's%/[^/]*$%%'
} > $tmp
find ${base:-.} -type d |
sed -f $tmp |
sort -u
rm -f $tmp
trap 0
This deals with the most common signals (HUP, INT, QUIT, PIPE, TERM) and removes the temporary file even if one of those arrives.
Clearly, if you want to simply count the number of directories, you can pipe the output from the commands above through wc -l to get the count.
ls -1d */*/. | cut -d / -f1 | uniq

Find the number of files in a directory

Is there any method in Linux to calculate the number of files in a directory (that is, immediate children) in O(1) (independently of the number of files) without having to list the directory first? If not O(1), is there a reasonably efficient way?
I'm searching for an alternative to ls | wc -l.
readdir is not as expensive as you may think. The knack is avoid stat'ing each file, and (optionally) sorting the output of ls.
/bin/ls -1U | wc -l
avoids aliases in your shell, doesn't sort the output, and lists 1 file-per-line (not strictly necessary when piping the output into wc).
The original question can be rephrased as "does the data structure of a directory store a count of the number of entries?", to which the answer is no. There isn't a more efficient way of counting files than readdir(2)/getdents(2).
One can get the number of subdirectories of a given directory without traversing the whole list by stat'ing (stat(1) or stat(2)) the given directory and observing the number of links to that directory. A given directory with N child directories will have a link count of N+2, one link for the ".." entry of each subdirectory, plus two for the "." and ".." entries of the given directory.
However one cannot get the number of all files (whether regular files or subdirectories) without traversing the whole list -- that is correct.
The "/bin/ls -1U" command will not get all entries however. It will get only those directory entries that do not start with the dot (.) character. For example, it would not count the ".profile" file found in many login $HOME directories.
One can use either the "/bin/ls -f" command or the "/bin/ls -Ua" command to avoid the sort and get all entries.
Perhaps unfortunately for your purposes, either the "/bin/ls -f" command or the "/bin/ls -Ua" command will also count the "." and ".." entries that are in each directory. You will have to subtract 2 from the count to avoid counting these two entries, such as in the following:
expr `/bin/ls -f | wc -l` - 2 # Those are back ticks, not single quotes.
The --format=single-column (-1) option is not necessary on the "/bin/ls -Ua" command when piping the "ls" output, as in to "wc" in this case. The "ls" command will automatically write its output in a single column if the output is not a terminal.
The -U option for ls is not in POSIX, and in OS X's ls it has a different meaning from GNU ls, which is that it makes -t and -l use creation times instead of modification times. -f is in POSIX as an XSI extension. The manual of GNU ls describes -f as do not sort, enable -aU, disable -ls --color and -U as do not sort; list entries in directory order.
POSIX describes -f like this:
Force each argument to be interpreted as a directory and list the name found in each slot. This option shall turn off -l, -t, -s, and -r, and shall turn on -a; the order is the order in which entries appear in the directory.
Commands like ls|wc -l give the wrong result when filenames contain newlines.
In zsh you can do something like this:
a=(*(DN));echo ${#a}
D (glob_dots) includes files whose name starts with a period and N (null_glob) causes the command to not result in an error in an empty directory.
Or the same in bash:
shopt -s dotglob nullglob;a=(*);echo ${#a[#]}
If IFS contains ASCII digits, add double quotes around ${#a[#]}. Add shopt -u failglob to ensure that failglob is unset.
A portable option is to use find:
find . ! -name . -prune|grep -c /
grep -c / can be replaced with wc -l if filenames do not contain newlines. ! -name . -prune is a portable alternative to -mindepth 1 -maxdepth 1.
Or here's another alternative that does not usually include files whose name starts with a period:
set -- *;[ -e "$1" ]&&echo "$#"
The command above does however include files whose name starts with a period when an option like dotglob in bash or glob_dots in zsh is set. When * matches no file, the command results in an error in zsh with the default settings.
I used this command..works like a charm..only to change the maxdepth..that is sub directories
find * -maxdepth 0 -type d -exec sh -c "echo -n {} ' ' ; ls -lR {} | wc -l" \;
I think you can have more control on this using find:
find <path> -maxdepth 1 -type f -printf "." | wc -c
find -maxdepth 1 will not go deeper into the hierarchy of files.
-type f allows filtering to just files. Similarly, you can use -type d for directories.
-printf "." prints a dot for every match.
wc -c counts the characters, so it counts the dots created by the print... which means counting how many files exist in the given path.
For the number of all file in a current directory try this:
ls -lR * | wc -l
As far as I know, there is no better alternative. This information might be off-topic to this question and you may already know this that under Linux (in general under Unix) directories are just special file which contains the list of other files (I understand that the exact details will be dependent on specific file system but this is the general idea). And there is no call to find the total number of entries without traversing the whole list. Please make me correct if I'm wrong.
use ls -1 | wc -l

Resources