Searching a directory for files - linux

Hi all, I am trying to write a script that recursively searches the directories under a parent directory and counts how many text files (.txt) are in the subdirectories. I also need to output each file's path relative to the parent directory.
Let's say I have a folder named Files.
Within this folder there may be:
Files/childFolder1/child1.txt
Files/childFolder1/child2.txt
Files/childFolder1/child3.txt
Files/childFolder2/child4.txt
Files/childFolder2/child5.txt
Files/child6.txt
So the output would be
The files are:
/childFolder1/child1.txt
/childFolder1/child2.txt
/childFolder1/child3.txt
/childFolder2/child4.txt
/childFolder2/child5.txt
/child6.txt
There are 6 files in the 'Files' folder
So far I have a script which is this:
#! /bin/csh
find $argv[1] -iname '*.txt'
set wc=`find $argv[1] -iname '*.txt' | wc -l`
echo "Number of files under" $argv[1] "is" $wc
I have no idea how to make the output so it only shows the file path relative to the directory. Currently my output is something like this:
/home/Alice/Documents/Files/childFolder1/child1.txt
/home/Alice/Documents/Files/childFolder1/child2.txt
/home/Alice/Documents/Files/childFolder1/child3.txt
/home/Alice/Documents/Files/childFolder2/child4.txt
/home/Alice/Documents/Files/childFolder2/child5.txt
/home/Alice/Documents/Files/child6.txt
Number of files under /home/Alice/Documents/Files is 6
Which is not desired. I am also worried about how I am setting the $wc variable: find runs a second time, so if the directory listing is large this is going to incur a lot of overhead.

cd $argv[1] first and then use find . -iname '*.txt' to make the results relative to that directory (each path is then printed with a leading ./).
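A minimal sketch of that suggestion, written in POSIX sh rather than csh. The demo tree under a temporary directory and the variable names parent and listing are made up for illustration. The cd happens inside a subshell so the caller's working directory is untouched, and sed strips the leading dot so the paths start with / as in the desired output:

```shell
#!/bin/sh
# Hypothetical demo tree standing in for the asker's 'Files' folder.
parent=$(mktemp -d)
mkdir -p "$parent/childFolder1" "$parent/childFolder2"
touch "$parent/childFolder1/child1.txt" \
      "$parent/childFolder2/child4.txt" \
      "$parent/child6.txt"
touch "$parent/notes.md"   # not a .txt file, so it must not be listed

# cd inside $( ... ) only affects the subshell.
# sed turns "./child6.txt" into "/child6.txt".
listing=$(cd "$parent" && find . -iname '*.txt' | sed 's|^\.||' | LC_ALL=C sort)

echo "The files are:"
printf '%s\n' "$listing"
```

This assumes no file name contains a newline, since the listing is handled line by line.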

Try
find $argv[1] -iname '*.txt' -printf '%P\n'
This should give the desired output :) (%P prints each path with the starting directory removed; -printf is specific to GNU find.)
Hayden

Also, if you don't want to execute find twice, you can either:
store its output in a variable, though I don't know if variables are limited in size in csh;
store its output in a temporary file, but purists don't like temporary files;
count the number of lines yourself in a while loop which iterates over the results of find.
Note that if you used bash instead, which supports process substitution, you could duplicate find's output and pipe it to multiple commands with tee:
find [...] | tee >(cmd1) >(cmd2) >/dev/null
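Another way to avoid running find twice, sketched here in POSIX sh with awk (not csh, and not something the answers above proposed): awk prints every line as it arrives and reports the line count at the end, so nothing has to be stored. The demo tree and the names parent and output are hypothetical; -printf requires GNU find, and file names are assumed not to contain newlines:

```shell
#!/bin/sh
# Hypothetical demo tree.
parent=$(mktemp -d)
mkdir -p "$parent/childFolder1"
touch "$parent/childFolder1/child1.txt" \
      "$parent/childFolder1/child2.txt" \
      "$parent/child6.txt"

# One find run: '/%P\n' prints each path relative to the starting
# directory with a leading slash; awk echoes each line and counts them.
output=$(find "$parent" -iname '*.txt' -printf '/%P\n' |
    awk '{ print } END { printf "There are %d files\n", NR }')

printf '%s\n' "$output"
```

The output is captured in a variable here only so that it can be inspected afterwards; in a real script the pipeline could write straight to the terminal.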

Related

How can I count the number of files with a specific octal code without them showing in shell

I tried using the tree command but couldn't work out how. (I wanted to use tree because I don't want the files to show up, just the number.)
Let's say c is the permission code.
For example, I want to know how many files have the permission 751.
Use find with the -perm flag, which only matches files with the specified permission bits.
For example, if you have the octal in $c, then run
find . -perm $c
The usual find options apply—if you only want to find files at the current level without recursing into directories, run
find . -maxdepth 1 -perm $c
To find the number of matching files, make find print a dot for every file and use wc to count the dots. (wc -l will not work with more exotic filenames that contain newlines, as @BenjaminW. has pointed out in the comments. The idea of using wc -c comes from this answer.)
find . -maxdepth 1 -perm $c -printf '.' | wc -c
This will show the number of files without showing the files themselves.
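A small self-contained check of the dot-counting approach, assuming GNU find for -printf. The temporary directory, the three file names and the added -type f test (which leaves directories out of the count) are illustrative choices, not part of the answer above:

```shell
#!/bin/sh
# Hypothetical demo files: two with mode 751, one with mode 644.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"
chmod 751 "$dir/a" "$dir/b"
chmod 644 "$dir/c"

c=751
# One dot per match; wc -c counts bytes, so odd file names cannot skew it.
count=$(find "$dir" -maxdepth 1 -type f -perm "$c" -printf '.' | wc -c)
echo "$count files have mode $c"
```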
If you're using zsh as your shell, you can do it natively without any external programs:
setopt EXTENDED_GLOB # Just in case it's not already set
c=0751
files=( **/*(#qf$c) )
echo "${#files[@]} files found"
will count all files in the current working directory and subdirectories with those permissions (And gives you all the names in an array in case you want to do something with them later). Read more about zsh glob qualifiers in the documentation.

Recursively find a directory and rename it in Shell Script

I'm putting together a simple shell script to run on a Linux machine where I would:
1) Look for specific sub-directories within a main directory. These sub-dirs have a very specific naming convention (see below) and they are always at a max depth of 2 below the main directory.
2) Rename those sub-dirs to PART of their original names.
For example,
The sub directories are named:
andrew-11111
andrew-11112
andrew-11113
andrew-11114
The path to get to these sub dirs would look something like this:
myphotos/sailing/photos/andrew-1111
myphotos/sailing/photos/andrew-1112
myphotos/biking/photos/andrew-1113
myphotos/hiking/photos/andrew-1114
I'd like to take out the 'andrew-' from each of these sub dirs:
myphotos/sailing/photos/1111
myphotos/sailing/photos/1112
myphotos/biking/photos/1113
myphotos/hiking/photos/1114
I've gotten as far as "finding" the sub dirs and listing them. I also understand how to copy and rename on the command line. But putting it together at my level of shell scripting knowledge has been taking much more time than I can afford. Just a disclaimer: I am more than willing to learn, and have written a handful of shell scripts, but I am still new to this. Any help or examples are much appreciated!
Use wildcards to match the files in the nested directories
You can use bash parameter expansion operators to manipulate the filenames.
for file in myphotos/*/photos/*; do
    name=${file##*/}      # remove everything up to last /
    dir=${file%/*}        # remove everything from last /
    newname=${name##*-}   # remove everything up to last -
    mv "$file" "$dir/$newname"
done
If you have the perl-based rename command, you can do:
rename 's#[^/]*-##' myphotos/*/photos/*
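A slightly more defensive variant of the loop, as a sketch only: it touches just directories whose names start with andrew- and refuses to overwrite a target that already exists. The temporary tree and the variable names root, name and target are made up for the demonstration:

```shell
#!/bin/sh
# Hypothetical demo tree under a temporary root.
root=$(mktemp -d)
mkdir -p "$root/myphotos/sailing/photos/andrew-1111" \
         "$root/myphotos/biking/photos/andrew-1113" \
         "$root/myphotos/biking/photos/1113"   # pre-existing name clash

for dir in "$root"/myphotos/*/photos/andrew-*; do
    [ -d "$dir" ] || continue             # skip files and an unmatched pattern
    name=${dir##*/}                       # last path component, e.g. andrew-1111
    target=${dir%/*}/${name#andrew-}      # same parent, prefix removed
    if [ -e "$target" ]; then
        echo "skipping $dir: $target already exists" >&2
        continue
    fi
    mv -- "$dir" "$target"
done
```

Here sailing/photos/andrew-1111 becomes sailing/photos/1111, while biking/photos/andrew-1113 is left alone because 1113 is already taken.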
You can do it with this one-liner:
find . -depth -type d -name 'andrew-*' -exec sh -c 'mv "$1" "$(dirname "$1")/$(basename "$1" | cut -d- -f2)"' sh {} \;
Explanation:
-depth processes a directory's contents before the directory itself, so find never tries to descend into a directory that has just been renamed
-type d finds only directories
-name 'andrew-*' matches the directory names; the * has to be quoted (or escaped) so that the shell passes it to find unexpanded
-exec sh -c '...' sh {} executes the command in a child shell, so the command substitutions ($(...)) are evaluated once per directory; {} holds whatever find finds and is passed in as $1, which keeps paths with spaces intact
dirname gives you a path with its last component removed, i.e. the parent directory
basename gives you the last component of a given path
cut -d- -f2 cuts off "andrew-": it sets the delimiter to - and selects field number 2

Unix: traverse a directory

I need to traverse a directory, starting in one directory and going deeper into different subdirectories. I also need access to each individual file so that I can modify it. Is there already a command to do this or will I have to write a script? Could someone provide some code to help me with this task? Thanks.
The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: replace every foo with bar in all files in /tmp and its subdirectories (the g flag makes sed replace every occurrence on a line, not just the first).
find /tmp -type f -exec sed -i -e 's/foo/bar/g' '{}' ';'
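The -print0 and xargs -0 combination mentioned above can be sketched like this. The temporary directory and its files are hypothetical, and sed -i is used in its GNU form. The file names are separated by NUL bytes, so names containing spaces reach sed intact:

```shell
#!/bin/sh
# Hypothetical demo files, one of them with spaces in its path.
dir=$(mktemp -d)
mkdir -p "$dir/sub dir"
printf 'foo foo\n' > "$dir/sub dir/a file.txt"
printf 'foo\n' > "$dir/b.txt"

# find emits NUL-terminated names; xargs -0 splits on NUL only.
find "$dir" -type f -print0 | xargs -0 sed -i -e 's/foo/bar/g'

cat "$dir/sub dir/a file.txt"
```

Compared with -exec ... ';', xargs batches many files into a single sed invocation.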
find . | while IFS= read -r i ; do
    if [ -d "$i" ] ; then echo "directory: $i" ; fi   # do something with a directory
    if [ -f "$i" ] ; then echo "file: $i" ; fi        # do something with a file etc.
done
find lists the whole tree (recursively) under the current directory, and the loop goes through that list one path at a time.
This can easily be achieved by combining find, xargs and sed (or another file modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' -print0 | xargs -0 sed -i -e '/^#/d'
This selects all files with the file extension .properties.
The xargs command feeds the file paths generated by find into the sed command (-print0 and -0 keep paths that contain spaces intact).
The sed command deletes all lines starting with # in those files. Keep -i and -e as separate options: GNU sed reads -ie as -i with the backup suffix e.
Combining commands in this way is very flexible.
For example, find has many tests, so you can filter by user name, file size, file path (e.g. under a /test/ subfolder) or file modification time.
Another dimension of flexibility is how and what to change in each file. For example, sed lets you apply substitutions specified via regular expressions. Similarly, you can use gzip to compress the files, and so on.
You would usually use the find command. On Linux you have the GNU version, which has many extra (and useful) options. Both POSIX find and GNU find will allow you to execute a command (e.g. a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files with those names provided as arguments. If you want to do it one file at a time, replace the + with ';' (or \;).
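The difference between + and ';' can be seen with a stand-in for your_script: an inline sh -c command that prints one line per invocation. The temporary directory and the names batched and single are made up for this sketch:

```shell
#!/bin/sh
# Hypothetical demo directory with three files.
dir=$(mktemp -d)
touch "$dir/one" "$dir/two" "$dir/three"

# With +, find passes all three names to a single invocation.
batched=$(find "$dir" -type f -exec sh -c 'echo "$# arguments"' sh {} + | wc -l)

# With ;, find starts one invocation per file.
single=$(find "$dir" -type f -exec sh -c 'echo "$# arguments"' sh {} \; | wc -l)

echo "with +: $batched invocation(s); with ;: $single invocation(s)"
```

For very long file lists, + may still split the work into several invocations, each receiving as many names as fit on one command line.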
I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, that the files you want to modify are all text files. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.
Bash 4+
shopt -s globstar
for file in **
do
    if [ -f "$file" ]; then
        # do some processing to your file here
        # where the find command can't do conveniently
        :   # placeholder: bash rejects an if body that contains only comments
    fi
done

How to recursively list files with size and last modified time?

Given a directory, I'm looking for a bash one-liner to get a recursive list of all files with their size and modified time, tab separated for easy parsing. Something like:
cows/betsy 145700 2011-03-02 08:27
horses/silver 109895 2011-06-04 17:43
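You can use stat(1) to get the information you want, if you don't want the full ls -l output, and you can use find(1) to get a recursive directory listing. Combining them into one line, you could do this (the -f format option shown here is the BSD/macOS stat syntax; GNU stat uses -c with different format specifiers):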
# Find all regular files under the current directory and print out their
# filenames, sizes, and last modified times
find . -type f -exec stat -f '%N %z %Sm' '{}' +
If you want to make the output more parseable, you can use %m instead of %Sm to get the last modified time as a time_t instead of as a human-readable date.
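On a system with GNU coreutils the same idea might look like the sketch below. It assumes GNU stat, whose --printf option interprets backslash escapes such as \t; the temporary directory, the file betsy and the variable line are made up for the demonstration:

```shell
#!/bin/sh
# Hypothetical demo file with a known size of 4 bytes.
dir=$(mktemp -d)
printf 'moo\n' > "$dir/betsy"

# GNU stat: %n = file name, %s = size in bytes, %y = last modification time.
line=$(find "$dir" -type f -exec stat --printf '%n\t%s\t%y\n' {} +)
printf '%s\n' "$line"
```

GNU stat prints %y with seconds, fractional seconds and a time zone offset, so the timestamp field is longer than in the question's sample.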
find is perfect for recursively searching through directories. The -ls action tells it to output its results in ls -l format:
find /dir/ -ls
On Linux machines you can print customized output using the -printf action:
find /dir/ -printf '%p\t%s\t%t\n'
See man find for full details on the format specifiers available with -printf. (This is not POSIX-compatible and may not be available on other UNIX flavors.)
find * -type f -printf '%p\t%s\t%TY-%Tm-%Td %TH:%TM\n'
If you prefer fixed-width fields rather than tabs, you can do things like changing %s to %10s.
I used find * ... to avoid the leading "./" on each file name. If you don't mind that, use . rather than * (which also shows files whose names start with .). You can also pipe the output through sed 's/^\.\///'.
Note that the output order will be arbitrary. Pipe through sort if you want an ordered listing.
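Putting these pieces together, a sketch that assumes GNU find: %P prints each path relative to the starting point, so there is no leading ./ to strip and dot files are still included, and sort gives a stable order. The temporary tree and the variable listing are hypothetical:

```shell
#!/bin/sh
# Hypothetical demo tree with known file sizes (4 and 7 bytes).
dir=$(mktemp -d)
mkdir -p "$dir/cows" "$dir/horses"
printf 'moo\n' > "$dir/cows/betsy"
printf 'neigh!\n' > "$dir/horses/silver"

# Tab-separated: relative path, size in bytes, modification time.
listing=$(cd "$dir" &&
    find . -type f -printf '%P\t%s\t%TY-%Tm-%Td %TH:%TM\n' | LC_ALL=C sort)
printf '%s\n' "$listing"
```

LC_ALL=C makes sort compare raw bytes, so the order does not depend on the user's locale.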
You could try this for a recursive listing of a folder called "/from_dir":
find /from_dir/* -print0 | xargs -0 stat -c "%n|%A|%a|%U|%G" > permissions_list.txt

This lists files and directories, passes them through to the stat command and puts all the info into a file called permissions_list.txt.
"%n|%A|%a|%U|%G" will give you the following result in the file:
from_dir|drwxr-sr-x|2755|root|root
from_dir/filename|-rw-r--r--|644|root|root

Cheers!


Copying files in multiple subdirectories in the Linux command line

Let's say I have the following subdirectories
./a/, ./b/, ./c/, ...
That is, in my current working directory are these subdirectories a/, b/ and c/, and in each of these subdirectories are files. In directory a/ is the file a.in, in directory b/ is the file b.in and so forth.
I now want to copy each .in file to a .out file, that is, a.in to a.out and b.in to b.out, and I want them to reside in the directories they were copied from. So a.out will be found in directory a/.
I've tried various different approaches, such as
find ./ -name '*.in'|cp * *.out
which doesn't work because it thinks *.out is a directory. Also tried
ls -d */ | cd; cp *.in *.out
but that would list the subdirectories and go into each one of them, yet it won't let cp do its work (so it still doesn't work).
The
find ./ -name '*.in'
command works fine. Is there a way to pipe arguments to an assignment operator? E.g.
find ./ -name '*.in'| assign filename=|cp filename filename.out
where assign filename= gives filename the value of each .in file. In fact, it would be even better if the assignment could get rid of the .in file extension, then instead of getting a.in.out we would get the preferred a.out
Thank you for your time.
Let the shell help you out:
find . -name '*.in' | while IFS= read -r old; do
    new=${old%.in}.out   # strips the .in and adds .out
    cp "$old" "$new"
done
I just took the find command you said works and let bash read its output one filename at a time. So the bash while loop gets the filenames one at a time, does a little substitution, and a straight copy. Nice and easy (but not tested!).
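If file names may contain newlines or other unusual characters, one option is to let find hand the names to a small inline sh script instead of piping them through read. This is a sketch under that assumption; the temporary tree is made up for illustration:

```shell
#!/bin/sh
# Hypothetical demo tree, including a directory name with a space.
dir=$(mktemp -d)
mkdir -p "$dir/a" "$dir/b c"
echo alpha > "$dir/a/a.in"
echo beta > "$dir/b c/b.in"

# find passes the names as arguments, so no parsing of its output is needed.
# Inside the inline script, "for old do" loops over those arguments.
find "$dir" -type f -name '*.in' -exec sh -c '
    for old do
        cp -- "$old" "${old%.in}.out"
    done' sh {} +

ls "$dir/b c"
```

Each .out file lands next to its .in file because only the suffix of the full path is changed.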
Try a for loop:
for f in */*.in; do
    cp "$f" "${f%.in}.out"
done
The glob should catch all the files one directory down that have a .in extension. In the cp command, it strips off the .in suffix and then appends a .out (see Variable Mangling in Bash with String Operators)
Alternatively, if you want to recurse into every subdirectory (not just 1 level deep) replace the glob with a find:
for f in $(find . -name '*.in'); do   # note: splits file names that contain whitespace
    cp "$f" "${f%.in}.out"
done
This should do the trick!
for f in $(find . -type f -name "*.in"); do cp "$f" "$(echo "$f" | sed 's/\.in$/.out/')"; done
