How to recursively list files with size and last modified time? - linux

Given a directory, I'm looking for a bash one-liner to get a recursive list of all files with their size and modified time, tab-separated for easy parsing. Something like:
cows/betsy 145700 2011-03-02 08:27
horses/silver 109895 2011-06-04 17:43

You can use stat(1) to get the information you want, if you don't want the full ls -l output, and you can use find(1) to get a recursive directory listing. Combining them into one line, you could do this:
# Find all regular files under the current directory and print out their
# filenames, sizes, and last modified times
find . -type f -exec stat -f '%N %z %Sm' '{}' +
If you want to make the output more parseable, you can use %m instead of %Sm to get the last modified time as a time_t instead of as a human-readable date.
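Note that the -f format flag above is the BSD stat syntax (e.g. on macOS); GNU coreutils stat on Linux takes -c instead, with different format specifiers. A roughly equivalent sketch for GNU stat would be:
# Filename, size in bytes, and human-readable modification time (GNU stat)
find . -type f -exec stat -c '%n %s %y' '{}' +
Here %Y (seconds since the epoch) can replace %y if you want a machine-parseable timestamp.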

find is perfect for recursively searching through directories. The -ls action tells it to output its results in ls -l format:
find /dir/ -ls
On Linux machines you can print customized output using the -printf action:
find /dir/ -printf '%p\t%s\t%t\n'
See man find for full details on the format specifiers available with -printf. (This is not POSIX-compatible and may not be available on other UNIX flavors.)

find * -type f -printf '%p\t%s\t%TY-%Tm-%Td %Tk:%TM\n'
If you prefer fixed-width fields rather than tabs, you can do things like changing %s to %10s.
I used find * ... to avoid the leading "./" on each file name. If you don't mind that, use . rather than * (using . also picks up files whose names start with a dot). You can also pipe the output through sed 's/^\.\///'.
Note that the output order will be arbitrary. Pipe through sort if you want an ordered listing.
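For instance, a minimal sketch producing a tab-separated, path-sorted listing (assuming file names contain no tabs or newlines), where %P drops the leading "./" that %p would include:
# Path, size, and date, sorted lexically by path
find . -type f -printf '%P\t%s\t%TY-%Tm-%Td %Tk:%TM\n' | sort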

You could try this for a recursive listing from a folder called "/from_dir":
find /from_dir/* -print0 | xargs -0 stat -c "%n|%A|%a|%U|%G" > permissions_list.txt

This lists the files and directories, passes them through to the stat command, and puts all the info into a file called permissions_list.txt.
The format "%n|%A|%a|%U|%G" will give you the following result in the file:
from_dir|drwxr-sr-x|2755|root|root
from_dir/filename|-rw-r--r--|644|root|root
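If you want the size and modification time from the original question rather than permissions, a similar sketch with GNU stat could be (the output file name sizes_list.txt is just illustrative):
# Tab-separated name, size in bytes, and last modification time
find /from_dir -type f -print0 | xargs -0 stat -c $'%n\t%s\t%y' > sizes_list.txt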

Cheers!


Related

How can I count the number of files with a specific octal permission code without them showing in the shell?

I tried using the tree command but I didn't know how. (I wanted to use tree because I don't want the files to show up, just the number.)
Let's say c is the permission code.
For example, I want to know how many files there are with the permission 751.
Use find with the -perm flag, which only matches files with the specified permission bits.
For example, if you have the octal in $c, then run
find . -perm $c
The usual find options apply—if you only want to find files at the current level without recursing into directories, run
find . -maxdepth 1 -perm $c
To find the number of matching files, make find print a dot for every file and use wc to count the dots. (wc -l will not work with more exotic filenames containing newlines, as BenjaminW. has pointed out in the comments; the idea of counting characters with wc -c comes from another answer.)
find . -maxdepth 1 -perm $c -printf '.' | wc -c
This will show the number of files without showing the files themselves.
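As a small usage sketch (the count_perm function name is just illustrative), you could wrap this to count by any octal code:
# Count entries at the current level whose permission bits are exactly the given octal mode
count_perm() {
  find . -maxdepth 1 -perm "$1" -printf '.' | wc -c
}
count_perm 751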
If you're using zsh as your shell, you can do it natively without any external programs:
setopt EXTENDED_GLOB # Just in case it's not already set
c=0751
files=( **/*(#qf$c) )
echo "${#files[@]} files found"
will count all files in the current working directory and subdirectories with those permissions (and gives you all the names in an array in case you want to do something with them later). Read more about zsh glob qualifiers in the documentation.

How to grep/find for a list of file names?

So for example, I have a text document containing a list of file names I may have in a directory. I want to grep or use find to find out if those file names exist in a specific directory and the subdirectories within it. Currently I can do it manually via find . | grep filename, but that's one at a time, and when I have over 100 file names to check, that can be really pesky and time-consuming.
What's the best way to go about this?
xargs is what you want here. Here's an example:
Assume you have a file named filenames.txt that contains a list of files
a.file
b.file
c.file
d.file
e.file
and only e.file doesn't exist.
The command in the terminal is:
cat filenames.txt | xargs -I {} find . -type f -name {}
The output of this command is:
a.file
b.file
c.file
d.file
Maybe this is helpful.
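If what you actually need is the list of names from filenames.txt that are missing, a rough sketch (assuming one name per line, no embedded newlines, and GNU find for -quit) could be:
# Print every name for which find produces no match under the current directory
while IFS= read -r name; do
  if [ -z "$(find . -type f -name "$name" -print -quit)" ]; then
    echo "missing: $name"
  fi
done < filenames.txt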
If the files haven't moved since the last time updatedb ran (often less than 24 hours ago), your fastest search is with locate.
Read the file list into an array and search with locate. In case the file names are common (or occur as part of other file names), grep the results for the base directory where they should be found:
< file.lst mapfile -t filearr
locate "${filearr[@]}" | grep /path/where/to/find
If the file names may contain whitespace or other characters that the shell interprets, the usual quoting and escaping precautions have to be taken.
A friend had helped me figure it out via find . | grep -i -Ff filenames.txt
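Note that -F makes grep treat each line of filenames.txt as a fixed substring of the whole path, so short names can match partial paths. If that matters, a sketch that compares basenames exactly (using GNU find's %f) would be:
# Print basenames found in the tree that appear in filenames.txt, whole-line (-x) and case-insensitive (-i)
find . -type f -printf '%f\n' | grep -ixFf filenames.txt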

Using grep to recursively search through subdirectories for specific keyword inside specific filename

I'm trying to look for the text Elapsed time inside a specific log file named vsim.log. I'm not familiar with grep, but after some googling I found that grep -r will let me do recursive searches, and grep -r "Elapsed time" will recursively search for that phrase within all files in my directory. According to this link, I can then do grep -r "Elapsed time" ./vsim* to recursively search through the directories for files starting with vsim and look inside those files for Elapsed time. However, when I tried this I get grep: No match., which I know is not true since I know the files exist there with those keywords. What am I messing up?
Continuing from my comment, you can use find to locate the file vsim.log if you do not know its exact location and then use the -execdir option to find to grep the file for the term Elapsed time, e.g.
find path -type f -name "vsim.log" -execdir grep -H 'Elapsed time' '{}' +
That will return the filename along with the matched text which you can simply parse to isolate the filename if desired. You can process all files that match if you anticipate more than one by feeding the results of the find command into a while read -r loop, e.g.
while read -r match; do
# process "$match" as desired
echo "Term 'Elapsed time' found in file ${match%:*}"
done < <(find path -type f -name "vsim.log" -execdir grep -H 'Elapsed time' '{}' +)
Where:
find is the swiss-army knife for finding files on your system
path can be any relative or absolute path to search (e.g. $HOME or /home/dorojevets to search all files in your home directory)
the option -type f tells find to only locate files (see man find for link handling)
the option -name "foo" tells find to only locate files named foo (wildcards allowed)
the -exec and -execdir options allow you to execute the command that follows on each file (represented by '{}')
the grep -H 'Elapsed time' '{}' being the command to execute on each filename
the + being what tells find it has reached the end of the command (alternatively, \; can be used, which runs the command once per file rather than batching)
finally, the ${match%:*} parameter expansion on the variable $match is used to parse the filename from filename:Elapsed time returned by grep -H (the %:* simply being used to trim everything to the first : from the right of $match)
Give that a try and compare the execution time to a recursive grep of the file tree. What you may be missing in this discussion is that you use find when you know some part of the filename (or file mod time, or set of permissions, etc.) of the file that contains the information you need. It can search millions of files in a file tree vastly quicker than you can recursively grep every single file. If you have no clue which file may contain the needed info, then use grep and just wait...
Try:
grep -r "Elapsed time" * --include vsim.log
Or see this answer: Use grep --exclude/--include syntax to not grep through certain files.
The following works if you are using a Mac:
To search for UUID in *.c files recursively under the folder "/home", use the following:
grep -r "UUID" --include "*.c" /home/
To recursively search for UUID in all main.c files in multiple projects under the current folder, use the following:
grep -r "UUID" --include "main.c" .

Linux terminal: Recursive search for string only in files w given file extension; display file name and absolute path

I'm new to Linux terminal; using Ubuntu Peppermint 5.
I want to recursively search all directories for a given text string (e.g. 'mystring'), in all files which have a given file extension (e.g. '*.doc') in the file name, and then display a list of the file names and absolute file paths of all matches. I don't need to see any lines of content.
This must be a common problem. I'm hoping to find a solution which does the search quickly and efficiently, and is also simple to remember and type into the terminal.
I've tried using 'cat', 'grep', 'find', and 'locate' with various options, and piped together in different combinations, but I haven't found a way to do the above.
Something similar was discussed on:
How to show grep result with complete path or file name
and:
Recursively search for files of a given name, and find instances of a particular phrase AND display the path to that file
but I can't figure a way to adapt these to do the above, and would be grateful for any suggestions.
According to the grep manual, you can do this using the --include option (combined with the -l option if you want only the name — I usually use -n to show line numbers):
--include=glob
Search only files whose name matches glob, using wildcard matching as described under --exclude.
-l
--files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning of each file stops on the first match. (-l is specified by POSIX.)
A suitable glob would be "*.doc" (ensure that it is quoted so that the shell passes it to grep unexpanded).
GNU grep also has a recursive option -r (not in POSIX grep). Together with the globbing, you can search a directory-tree of ".doc" files like this:
grep -r -l --include="*.doc" "mystring" .
If you wanted to make this portable, then find is the place to start. But using grep's extension makes searches much faster, and is available on any Linux platform.
find . -name '*.doc' -exec grep -l 'mystring' {} \; -print
How it works:
find searches recursively from the given path .
for all files whose name matches '*.doc'
-exec grep executes grep on each file found
-l suppresses the normal grep output, printing only file names
and searches inside the files for 'mystring'
The expression for grep ends with the {} \;
and -print prints out the names of the files in which grep found mystring.
EDIT:
To get results only from the current directory, without recursing into subdirectories, you can add -maxdepth 1 to find.
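Combined with the command above, that gives:
find . -maxdepth 1 -name '*.doc' -exec grep -l 'mystring' {} \; -print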

Find the number of files in a directory

Is there any method in Linux to calculate the number of files in a directory (that is, immediate children) in O(1) (independently of the number of files) without having to list the directory first? If not O(1), is there a reasonably efficient way?
I'm searching for an alternative to ls | wc -l.
readdir is not as expensive as you may think. The knack is to avoid stat'ing each file and (optionally) to skip sorting the output of ls.
/bin/ls -1U | wc -l
avoids aliases in your shell, doesn't sort the output, and lists 1 file-per-line (not strictly necessary when piping the output into wc).
The original question can be rephrased as "does the data structure of a directory store a count of the number of entries?", to which the answer is no. There isn't a more efficient way of counting files than readdir(2)/getdents(2).
One can get the number of subdirectories of a given directory without traversing the whole list by stat'ing (stat(1) or stat(2)) the given directory and observing the number of links to that directory. A given directory with N child directories will have a link count of N+2, one link for the ".." entry of each subdirectory, plus two for the "." and ".." entries of the given directory.
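A minimal sketch of that link-count trick with GNU stat, where %h prints the hard-link count (/some/dir is a placeholder):
# Number of immediate subdirectories, without reading the directory contents
echo $(( $(stat -c %h /some/dir) - 2 ))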
However one cannot get the number of all files (whether regular files or subdirectories) without traversing the whole list -- that is correct.
The "/bin/ls -1U" command will not get all entries however. It will get only those directory entries that do not start with the dot (.) character. For example, it would not count the ".profile" file found in many login $HOME directories.
One can use either the "/bin/ls -f" command or the "/bin/ls -Ua" command to avoid the sort and get all entries.
Perhaps unfortunately for your purposes, either the "/bin/ls -f" command or the "/bin/ls -Ua" command will also count the "." and ".." entries that are in each directory. You will have to subtract 2 from the count to avoid counting these two entries, such as in the following:
expr `/bin/ls -f | wc -l` - 2 # Those are back ticks, not single quotes.
The --format=single-column (-1) option is not necessary on the "/bin/ls -Ua" command when piping the "ls" output, as in to "wc" in this case. The "ls" command will automatically write its output in a single column if the output is not a terminal.
The -U option for ls is not in POSIX, and in OS X's ls it has a different meaning from GNU ls: it makes -t and -l use creation times instead of modification times. -f is in POSIX as an XSI extension. The manual of GNU ls describes -f as "do not sort, enable -aU, disable -ls --color" and -U as "do not sort; list entries in directory order".
POSIX describes -f like this:
Force each argument to be interpreted as a directory and list the name found in each slot. This option shall turn off -l, -t, -s, and -r, and shall turn on -a; the order is the order in which entries appear in the directory.
Commands like ls|wc -l give the wrong result when filenames contain newlines.
In zsh you can do something like this:
a=(*(DN));echo ${#a}
D (glob_dots) includes files whose name starts with a period and N (null_glob) causes the command to not result in an error in an empty directory.
Or the same in bash:
shopt -s dotglob nullglob;a=(*);echo ${#a[@]}
If IFS contains ASCII digits, add double quotes around ${#a[@]}. Add shopt -u failglob to ensure that failglob is unset.
A portable option is to use find:
find . ! -name . -prune|grep -c /
grep -c / can be replaced with wc -l if filenames do not contain newlines. ! -name . -prune is a portable alternative to -mindepth 1 -maxdepth 1.
Or here's another alternative that does not usually include files whose name starts with a period:
set -- *;[ -e "$1" ]&&echo "$#"
The command above does however include files whose name starts with a period when an option like dotglob in bash or glob_dots in zsh is set. When * matches no file, the command results in an error in zsh with the default settings.
I used this command; it works like a charm. Just change the maxdepth to control how many levels of subdirectories are included:
find * -maxdepth 0 -type d -exec sh -c "echo -n {} ' ' ; ls -lR {} | wc -l" \;
I think you can have more control over this using find:
find <path> -maxdepth 1 -type f -printf "." | wc -c
find -maxdepth 1 will not go deeper into the hierarchy of files.
-type f allows filtering to just files. Similarly, you can use -type d for directories.
-printf "." prints a dot for every match.
wc -c counts the characters, so it counts the dots created by the print... which means counting how many files exist in the given path.
For the number of all files in the current directory, try this:
ls -lR * | wc -l
As far as I know, there is no better alternative. This information might be off-topic for this question, and you may already know that under Linux (and Unix in general) directories are just special files which contain the list of other files (the exact details depend on the specific file system, but this is the general idea). And there is no call to find the total number of entries without traversing the whole list. Please correct me if I'm wrong.
use ls -1 | wc -l
