Unix: traverse a directory - linux

I need to traverse a directory, starting in one directory and going deeper into its different subdirectories. I also need access to each individual file so that I can modify it. Is there already a command to do this, or will I have to write a script? Could someone provide some code to help me with this task? Thanks.

The find command is just the tool for that. Its -exec flag or -print0 in combination with xargs -0 allows fine-grained control over what to do with each file.
Example: Replace every foo with bar in all files in /tmp and its subdirectories.
find /tmp -type f -exec sed -i -e 's/foo/bar/g' '{}' ';'
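The -print0 / xargs -0 variant mentioned above is not shown, so here is a minimal sketch of it. The temporary directory and file names are made up for illustration, and sed -i without a suffix assumes GNU sed.

```shell
# Hypothetical demo tree; the names are invented for this example.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'foo and foo\n' > "$demo/sub dir/a file.txt"

# -print0 and -0 separate file names with NUL bytes,
# so names containing spaces or newlines are passed intact.
find "$demo" -type f -print0 | xargs -0 sed -i -e 's/foo/bar/g'

result=$(cat "$demo/sub dir/a file.txt")
echo "$result"
```

Unlike -exec ... ';', xargs hands many file names to each sed invocation, which usually means fewer processes.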

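for i in `find .` ; do
if [ -d "$i" ] ; then do something with a directory ; fi
if [ -f "$i" ] ; then do something with a file etc. ; fi
done
This returns the whole tree under the current directory (recursively) as a list that the loop goes through. Note that the output of the backquoted find is split on whitespace, so this loop breaks on file names that contain spaces or newlines.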
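If file names may contain spaces, one common alternative is to read NUL-delimited names from find in a while loop. This is only a sketch: it assumes bash (for read -d '' and process substitution), and the demo directory and counters are hypothetical.

```shell
# Hypothetical demo tree with spaces in the names.
demo=$(mktemp -d)
mkdir -p "$demo/dir one"
touch "$demo/dir one/file a.txt"

dirs=0
files=0
# Process substitution keeps the loop in the current shell,
# so the counters are still set after the loop ends.
while IFS= read -r -d '' path; do
  if [ -d "$path" ]; then
    dirs=$((dirs + 1))      # do something with a directory
  elif [ -f "$path" ]; then
    files=$((files + 1))    # do something with a file
  fi
done < <(find "$demo" -print0)

echo "directories: $dirs, files: $files"
```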

This can be easily achieved by combining find, xargs, and sed (or another file modification command).
For example:
$ find /path/to/base/dir -type f -name '*.properties' | xargs sed -i -e '/^#/d'
The find command selects all files with the extension .properties.
The xargs command feeds the file paths generated by find into the sed command.
The sed command deletes all lines that start with # in those files. (Write -i -e rather than -ie: GNU sed reads -ie as "edit in place and keep a backup with the suffix e".)
Combining commands this way is very flexible.
For example, find has many parameters, so you can filter by user name, file size, file path (e.g. under a /test/ subfolder), or file modification time.
Another dimension of flexibility is how and what to change in your file. For example, sed lets you change a file by applying substitutions (specified via regular expressions). Similarly, you can use gzip to compress the file. And so on.
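To illustrate the filtering options mentioned above, here is a small sketch that combines a path filter, a name filter, a modification-time filter, and a size filter. The directory layout and file names are assumptions made up for the example.

```shell
# Hypothetical demo tree; only the file under test/ should match.
demo=$(mktemp -d)
mkdir -p "$demo/test"
printf '# comment\nkey=value\n' > "$demo/test/app.properties"
printf '# comment\nother=1\n' > "$demo/skip.properties"

# -path      keeps only files below the test/ subfolder
# -mtime -1  keeps files modified within the last 24 hours
# -size -10k keeps small files (under about 10 KiB)
matches=$(find "$demo" -type f -path "$demo/test/*" \
  -name '*.properties' -mtime -1 -size -10k)

echo "$matches"
```

A -user NAME test can be added in the same way to filter by owner.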

You would usually use the find command. On Linux, you have the GNU version, which has many extra (and useful) options beyond the POSIX ones. Both the POSIX and the GNU versions allow you to execute a command (e.g. a shell script) on the files as they are found.
The exact details of how to make changes to the file depend on the change you want to make to the file. That is probably best scripted, with find running the script:
POSIX or GNU:
find . -type f -exec your_script '{}' +
This will run your script once for a group of files with those names provided as arguments. If you want to do it one file at a time, replace the + with ';' (or \;).
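The difference between the two terminators can be observed by counting invocations. In this sketch an inline sh -c script stands in for your_script, and the demo directory is hypothetical.

```shell
# Hypothetical demo tree with three files.
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# Each invocation prints its argument count on one line,
# so the number of output lines equals the number of invocations.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l)
single=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

echo "invocations with +: $batched"
echo "invocations with ;: $single"
```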

I am assuming SearchMe is the example directory name you need to traverse completely.
I am also assuming, since it was not specified, that the files you want to modify are all text files. Is this correct?
In such scenario I would suggest using the command:
find SearchMe -type f -exec vi {} \;
If you are not familiar with vi editor, just use another one (nano, emacs, kate, kwrite, gedit, etc.) and it should work as well.

Bash 4+
shopt -s globstar
for file in **
do
if [ -f "$file" ]; then
# do some processing to your file here
# where the find command can't do conveniently
: # no-op placeholder; a branch containing only comments is a syntax error
fi
done

Related

Simple Bash Script that recursively searches in subdirs for a certain string

I recently started learning Linux because a CTF contest is coming up in the next few months. I am trying to write a bash script that starts from a directory and checks whether each entry is a directory or some other kind of file. If it is a file (image, etc.), it should apply strings $f | grep -i 'abcdef'; if it is a directory, it should cd into that directory and start over. I have C++ experience and I understand the logic, but I can't make it work: I can't successfully implement the loop that goes through all the subdirectories. All help would be appreciated!
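You do not need a loop for this. The find command can do what you are looking for.
For instance:
find /home -type f -exec sh -c 'strings "$1" | grep abcd' sh {} \;
Explanation:
/home is your base directory; it can be anything.
-type f means a regular file.
The file name is passed to sh as "$1" rather than pasted into the script text, so names with spaces or shell metacharacters are handled safely.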
-exec from the man page:
"Execute command; true if 0 status is returned. All
following arguments to find are taken to be arguments to
the command until an argument consisting of `;' is
encountered. The string `{}' is replaced by the current
file name being processed everywhere it occurs in the
arguments to the command, not just in arguments where it
is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `\') or
quoted to protect them from expansion by the shell. See
the EXAMPLES section for examples of the use of the -exec
option. The specified command is run once for each
matched file. The command is executed in the starting
directory. There are unavoidable security problems
surrounding use of the -exec action; you should use the
-execdir option instead."
If you just want to find the string in a file, and you do not have to first find a directory, then a file, and then search, you can simply search for the text with grep.
Go to the parent directory and execute:
grep -iR "abcd"
Or from any place,
grep -iR "abcd" /var/log/mylogs/
Suggesting a grep command on find filter results:
grep "abcd" $(find . -type f)
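The $(find ...) form above is split on whitespace, so it can misbehave on file names with spaces. A sketch of one way around that is to let find pass the names to grep itself. The demo tree and the search string are assumptions for illustration.

```shell
# Hypothetical demo tree; one file contains the string, one does not.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'xx ABCDEF xx\n' > "$demo/sub dir/hit file.bin"
printf 'nothing here\n' > "$demo/miss.txt"

# -exec ... {} + hands the names to grep as separate arguments.
# -i ignores case, -l prints only the names of matching files.
hits=$(find "$demo" -type f -exec grep -il 'abcdef' {} +)

echo "$hits"
```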

Find files recursively and rename based on their full path

I'm looking to search for files of a specific name, modify the name to the full path and then copy the results to another folder.
Is it possible to update each find result with the full path as the file name; i.e.
./folder/subfolder/my-file.csv
becomes
folder_subfolder_my-file.csv
I am listing the files using the following and would like to script it.
find . -name my-file.csv -exec ls {} \;
Since you're using bash, you can take advantage of globstar and use a for loop:
shopt -s globstar # set globstar option
for csv in **/my-file.csv; do
echo "$csv" "${csv//\//_}"
done
shopt -u globstar # unset the option if you don't want it any more
With globstar enabled, ** does a recursive search (similar to the basic functionality of find).
"${csv//\//_}" is an example of ${var//match/replace}, which does a global replacement of all instances of match (here an escaped /) with replace.
If you're happy with the output, then change the echo to mv.
Just to demonstrate how to do this with find;
find . -type f -exec bash -c '
for file; do
f=${file#./}
cp "$file" "./${f//\//_}"
done' _ {} +
The Bash pattern expansion ${f//x/y} replaces x with y throughout. Because find prefixes each found file with the path where it was found (here, ./) we trim that off in order to avoid doing mv "./file" "._file". And because the slash is used in the parameter expansion itself, we need to backslash the slash we want the shell to interpret literally. Finally, because this parameter expansion syntax is a Bash-only extension, we use bash rather than sh.
Obviously, if you want to rename rather than copy, replace cp with mv.
If your find does not support -exec ... + this needs to be refactored somewhat (probably to use xargs); but it should be supported on any reasonably modern platform.
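The two parameter expansions used above can be tried on their own before touching any files. This is just a sketch using the example path from the question; the variable names are arbitrary, and the ${var//x/y} form assumes bash.

```shell
path='./folder/subfolder/my-file.csv'

# Strip the leading ./ that find adds to each result.
trimmed=${path#./}
# Replace every / with _ (the slash to match is escaped as \/).
flat=${trimmed//\//_}

echo "$flat"
```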
With perl's rename command ...
$ prename
Usage: rename [-v] [-n] [-f] perlexpr [filenames]
... you can rename multiple files by applying a regular expression. rename also accepts file names via stdin:
find ... | rename -n 's#/#_#g'
Check the results and if they are fine, remove -n.

No such file or directory when piping. Each command works separately, but not when piping

I have 2 folders: folder_a & folder_b. In each of these folders there are a bunch of files. I am trying to use sed to move all of these files out of these folders and into my current working directory.
My folder structure looks like this:
mytest:
  a:
    1.txt
    2.txt
    3.txt
  b:
    4.txt
    5.txt
The command I am trying to use is:
find . -type d ! -iname '*.*' # find all folders other than root
| sed -r 's/.*/&\/*/' # add '/*' to each of the arguments
| sed -r 'p;s/.*/./' # output: a/* . b/* .
| xargs -n 2 mv # should be creating two commands: 'mv a/* .' and 'mv b/* .'
Unfortunately I get an error:
mv: cannot stat './aaa/*': No such file or directory
I also get the same error when I try this other strategy (using ls instead of mv):
for dir in */; do
ls $dir;
done;
Even if I use sed to replace the spaces in each directory name with '\ ', or surround the directory names with quotes I get the same error.
I'm not sure if these 2 examples are related in my misunderstanding of bash but they both seem to demonstrate my ignorance of how bash translates the output from one command into the input of another command.
Can anyone shed some light on this?
Update: Completely rewritten.
As @EtanReisner and @melpomene have noted, mv */* . or, more specifically, mv a/* b/* . is the most straightforward solution, but you state that this is in part a learning exercise, so the remainder of the answer shows an efficient find-based solution and explains the problem with the original command.
An efficient find-based solution
Generally, if feasible, it's best and most efficient to let find itself do the work, without involving additional tools; find's -exec action is like a built-in xargs, with {} representing the path at hand (with terminator \;) / all paths (with +):
find . -type f -exec echo mv -t . {} +
To be safe, this will just print the mv commands that would be executed; remove the echo to actually execute them.
This will execute a single[1] mv command to which all matching files are passed, and -t . moves them all to the current dir.
[1] If the resulting command line is too long (which is unlikely), it is split up into multiple commands, just as with xargs.
Operating on files (-type f) bypasses the need for globbing, as find will then enumerate all files for you (it also bypasses the need to exclude . explicitly).
Note that this solution works on entire subtrees, not just (immediate) subdirectories.
It's tempting to consider turning on Bash 4's globstar option and using mv */** ., but that won't work, because it will attempt to move directories as well, not just the files in them.
A caveat re -exec with +: it only works if {} - the placeholder for all paths - is the token immediately before the +.
Since you're on Linux, we can satisfy this condition by specifying the target folder for mv with option -t before the {}; on BSD-based systems such as OSX, you could not do that, because mv doesn't support -t there, so you'd have to use terminator \;, which means that mv is called once for every path, which is obviously much slower.
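Where mv lacks -t, one possible workaround (not part of the original answer) is to wrap mv in a small inline shell script, so that the batched paths can be placed before the target and + can still be used. The demo tree is hypothetical, and -mindepth 2 is an added assumption that skips files already sitting in the current directory.

```shell
# Hypothetical demo tree mirroring the layout in the question.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/1.txt" "$demo/a/2.txt" "$demo/b/4.txt"
cd "$demo" || exit 1

# "$@" expands to all paths find collected; the target . comes last,
# so no -t option is needed and {} is still the token before +.
find . -mindepth 2 -type f -exec sh -c 'mv "$@" .' sh {} +

ls
```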
Why your command didn't work:
As @EtanReisner points out in a comment, xargs invokes the command specified without (implicitly) involving a shell, so globbing won't work; you can verify this with the following command:
echo '*' | xargs echo # -> '*' - NO globbing
If we leave the globbing issue aside, additional work would have been necessary to make your xargs command work correctly with folder names with embedded spaces (or other shell metacharacters):
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -n 2 echo mv # NOTE: still won't work due to lack of globbing
Note how the (combined) sed command now produces a single output line '<input-path>'/* ., with the input path enclosed in embedded single-quotes, which is required for xargs to recognize <input-path> as a single argument, even if it contains embedded spaces.
(If your filenames contain single-quotes, you'd have to do more work; also note that since now all arguments for a given dir. are on a single line, you could use xargs -L 1 ....)
Also note how -mindepth 1 (only process paths at the subdirectory level or below) is used to skip processing of . itself.
The only way to make globbing happen is to get the shell involved:
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -I {} sh -c 'echo mv {}' # works, but is inefficient
Note the use of xargs' -I option to treat each input line as its own argument ({} is a self-chosen placeholder for the input).
sh -c invokes the (default) shell to execute the resulting command, at which point globbing does happen.
However, overall, this is quite inefficient:
A pipeline with 3 segments is used.
A shell instance is invoked for every input path, which in turn calls the mv utility.
Compare this to the efficient find-only solution above, which (typically) creates only 2 processes in total.
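The globbing point can be checked in isolation with a side-by-side sketch. The demo directory and file names are made up; note that pasting {} into the sh -c text is only acceptable here because the input is a fixed, trusted string.

```shell
# Hypothetical demo directory with two files to glob.
demo=$(mktemp -d)
touch "$demo/1.txt" "$demo/2.txt"
cd "$demo" || exit 1

# xargs runs echo directly: no shell sees the *, so it stays literal.
literal=$(echo '*' | xargs echo)
# Routing the same text through sh -c lets a shell expand the pattern.
expanded=$(echo '*' | xargs -I {} sh -c 'echo {}')

echo "xargs alone: $literal"
echo "via sh -c:   $expanded"
```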

Shell Script to Recursively Loop Through Directory and print location of important files

So I am trying to write a command line shell script or a shell script that will be able to recursively loop through a directory, all its files, and sub-directories for certain files and then print the location of these files to a text file.
I know that this is possible using BASH commands such as find, locate, exec, and >.
This is what I have so far. find <top-directory> -name '*.class' -exec locate {} > location.txt \;
This does not work though. Can any BASH, Shell scripting experts help me out please?
Thank-you for reading this.
The default behavior of find (if you don't specify any other action) is to print the filename. So you can simply do:
find <top-directory> -name '*.class' > location.txt
Or if you want to be explicit about it:
find <top-directory> -name '*.class' -print > location.txt
You can save the redirection by using find's -fprint option:
find <top-directory> -name '*.class' -fprint location.txt
From the man page:
-fprint file
[...] print the full file name into file file. If file does not exist when find is run, it is created; if it does exist, it is truncated.
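If the locations should be absolute paths, one simple approach is to give find an absolute start directory, since find prints each match prefixed with the start path it was given. A sketch with a made-up tree (the package and class names are hypothetical):

```shell
# Hypothetical demo tree with two .class files and one other file.
demo=$(mktemp -d)
mkdir -p "$demo/pkg/inner"
touch "$demo/pkg/A.class" "$demo/pkg/inner/B.class" "$demo/pkg/notes.txt"

# The start directory is absolute, so every printed path is absolute too.
find "$demo" -type f -name '*.class' > "$demo/location.txt"

count=$(wc -l < "$demo/location.txt")
cat "$demo/location.txt"
```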
A less preferred way to do it is to use ls:
ls -d $PWD/**/* | grep class
Let's break it down:
ls -d # lists the directory (returns `.`)
ls -d $PWD # lists the directory - but this time $PWD will provide full path
ls -d $PWD/** # list the directory with full-path and every file under this directory (not recursively) - an effect which is due to `/**` part
ls -d $PWD/**/* # same like previous one, only that now do it recursively to the folders below (achieved by adding the `/*` at the end)
A better way of doing it:
After reading this on a recommendation from Charles Duffy, it appears to be a bad idea to use either ls or find here (the article also says: "find is just as bad as ls in this context"). The reason is that you can't control the output of ls: for example, you can't configure ls to terminate filenames with NUL. This is problematic because Unix allows all kinds of weird characters in a file name (newline, pipe, etc.), which will "break" ls in ways you can't anticipate.
Better use a shell script for the task, and it's pretty simple task too:
Create a file my_script.sh, edit the file to contain:
#!/bin/bash
shopt -s globstar # needed so that ** recurses (Bash 4+)
for i in **/*; do
echo "$PWD/$i"
done
Give it execute permissions (by running: chmod +x my_script.sh).
Run it from the same directory with:
./my_script.sh
and you're good to go!

Bash script cd issues

Hi all I have some problems with my script. I've read that changing the current directory from within a script is a bit of an issue. Basically I am looking for a single php file with a project folder and any sub-folders in it. And I want to change the directory to where that folder is and perform a command for it. So far no luck.
function findPHP(){
declare -a FILES
FILES=$(find ./ -name \*.php)
for file in "${FILES[@]}"
do
DIR=`dirname file`
( cd $DIR && doSomethingInThisDir &(...))
done
Any help would be greatly appreciated.
You are trying to iterate over FILES as an array, but it only has one element. In order to make the result of your subshell into an array, you can:
FILES=($(find ./ -name \*.php))
Note that it splits file names on spaces, so even though you properly quote below, it won't help. Alternatively, you could just let it split below (i.e. using your existing FILES) and use instead:
for file in $FILES
If you are using bash 4, you may want to have a look at recursive globbing... this would make it a bit easier:
for file in **/*.php
Note that you have to have the globstar shell option set, which you could enable with shopt -s globstar. This way is simpler and won't break on whitespace.
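Putting the globstar loop together with a subshell cd might look like the sketch below. It assumes bash 4+; the demo tree is hypothetical, and the echo stands in for whatever command should run in each directory.

```shell
# Hypothetical demo tree; one name contains a space on purpose.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
touch "$demo/index.php" "$demo/sub dir/page.php"
cd "$demo" || exit 1

shopt -s globstar
visited=""
for file in **/*.php; do
  dir=$(dirname "$file")
  # The parentheses start a subshell, so the cd does not
  # carry over into the next iteration of the loop.
  ( cd "$dir" && echo "working in $PWD" )
  visited="$visited[$dir]"
done

echo "$visited"
```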
Also, you probably want $file here:
DIR=`dirname $file`
Or just use parameter expansion:
DIR=${file%/*}
There is no reason to use an array, or to store the file list in any way. If your find supports -execdir (e.g. GNU find 4.2.27), then use it. Otherwise, cd in a subshell as you have done:
#!/bin/bash
doSomethingInThisDir() ( cd "$(dirname "$1")"; ... )
export -f doSomethingInThisDir
find . -type f -exec bash -c 'doSomethingInThisDir "$1"' _ {} \;
I have defined the function using () instead of {}, but that is not necessary in this case. Normally, using () causes the function to run in a subshell, but that happens here anyway because find runs a separate process for each file.
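For completeness, here is a sketch of the -execdir route mentioned above. It assumes a find that supports -execdir (GNU and BSD find do) and a PATH without relative entries, which GNU find requires for this action. The demo tree is made up, and the inline script only reports the directory it runs in.

```shell
# Hypothetical demo project with PHP files at two levels.
demo=$(mktemp -d)
mkdir -p "$demo/project/sub"
touch "$demo/project/index.php" "$demo/project/sub/page.php"

# -execdir runs the command from the directory that contains each match,
# so no explicit cd is needed; here the command prints that directory's name.
names=$(find "$demo" -type f -name '*.php' \
  -execdir sh -c 'basename "$(pwd -P)"' \; | sort)

echo "$names"
```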

Resources