Loop through a directory with any level of depth - Linux

I want to execute a command on all files at every level of a directory tree. The directory may contain any number of files and subdirectories, and those subdirectories may themselves contain any number of files and subdirectories. I want to do this with a shell script. As I am new to this field, can anyone suggest a way?

You can use the find command with xargs after a pipe (|).
Example: suppose I want to remove all files that have the .txt extension in the Documents directory (the pattern is quoted so the shell doesn't expand it before find sees it):
find Documents -iname '*.txt' | xargs rm -f
Does that help?
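If your find and xargs support the -print0 and -0 options (the GNU and BSD versions do; this is not strictly POSIX), a variant that also copes with spaces in filenames is:
find Documents -iname '*.txt' -print0 | xargs -0 rm -f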

You can loop over a directory's immediate children using the wildcard character (*) like so:
for dir in ~/dev/myproject/*; do (cd "$dir" && git status); done
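The loop above only visits the first level of subdirectories. A sketch that extends the same idea to any depth using find (same example path; -mindepth 1 skips the top directory itself and is supported by GNU and BSD find):
find ~/dev/myproject -mindepth 1 -type d -exec sh -c 'cd "$1" && git status' sh {} \;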
If you want to apply commands to the individual files, you should use the find command and execute commands with it, like so:
find yourdirectory -type f -exec echo "File found: '{}'" \;
What this does:
finds all items under the directory yourdirectory
keeps only those of type f, i.e. regular files
runs the -exec command on each file

Use find:
find -type f -exec COMMAND {} \;
-type f applies the command only to regular files, not to directories. The command is recursive by default.
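For example, with wc -l standing in for COMMAND (and an explicit . start path, which some find implementations require), this counts the lines of every file under the current directory:
find . -type f -exec wc -l {} \;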

Related

Find the name of subdirectories and process files in each

Let's say /tmp has subdirectories test1, test2, test3, and so on,
and each has multiple files inside.
I have to run a while loop or for loop to find the names of the directories (in this case test1, test2, ...)
and run a command that processes all the files inside each directory.
So, for example,
I have to get the directory names under /tmp which will be test1, test2, ...
For each subdirectory, I have to process the files inside of it.
How can I do this?
Clarification:
This is the command that I want to run:
find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;
where 140725_D0 is an example of one subdirectory to process - there are multiples, with different names.
So, by using a for or while loop, I want to find all subdirectories and run a command on the files in each.
The for or while loop should iteratively replace the hard-coded name 140725_D0 in the find command above.
You should be able to do this with a single find command with an embedded shell command:
find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;
Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.
The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
If not all subdirectories are guaranteed to have *.json files, change the shell command to for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done
Update: Two more considerations; tip of the hat to kenorb's answer:
By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:
find /PROD -maxdepth 1 -type d ...
As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead and by including a cd command with the directory path at hand ({}) in the shell command:
find /PROD -type d -exec sh -c 'cd "{}" && for f in *.json; do /tmp/test.py "$f"; done' \;
[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).
Try the following usage of find:
find . -type d -exec sh -c 'cd "{}" && echo Do some stuff for {}, files are: $(ls *.*)' ';'
Use -maxdepth if you'd like to limit your directory levels.
You can do this using bash's subshell feature, like so:
for i in /tmp/test*; do
    # don't do anything if there's no /tmp/test* directory
    [ "$i" != "/tmp/test*" ] || continue
    for j in "$i"/*.json; do
        # don't do anything if there's nothing to run
        [ "$j" != "$i/*.json" ] || continue
        # run the processing script on each file, relative to its directory
        (cd "$i" && /tmp/test.py "${j##*/}")
    done
done
When you wrap a command in ( and ), it runs in a subshell: effectively another instance of the shell that inherits the current environment, so changes like cd inside it do not affect the outer loop.
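A minimal demonstration of why the subshell matters here: the cd inside ( ) does not change the outer shell's working directory (the /tmp path is just an example):
pwd                # e.g. /home/user
(cd /tmp && pwd)   # prints /tmp
pwd                # still /home/user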
You can also simply ask the shell to expand the directories/files you need, e.g. using the xargs command:
echo /PROD/*/*.json | xargs -n 1 /tmp/test.py
or even using your original find command:
find /PROD/* -name "*.json" -exec /tmp/test.py {} \;
Both commands will process all JSON files contained in any subdirectory of /PROD.
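Note that both variants split filenames on whitespace. If your filenames may contain spaces and your find and xargs support -print0/-0 (GNU and BSD do; not strictly POSIX), a more robust sketch is:
find /PROD -name '*.json' -print0 | xargs -0 -n 1 /tmp/test.py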
Another solution is to slightly change the Python code inside your script so that it accepts and processes multiple files.
For example, if your script contains something like:
def process(fname):
    print 'Processing file', fname

if __name__ == '__main__':
    import sys
    process(sys.argv[1])
you could replace the last line with:
for fname in sys.argv[1:]:
    process(fname)
After this simple modification, you can call your script this way:
/tmp/test.py /PROD/*/*.json
and have it process all the desired JSON files.

Linux recursive copy files to its parent folder

I want to recursively copy files with a specific extension up to their parent folder. For example:
./folderA/folder1/*.txt to ./folderA/*.txt
./folderB/folder2/*.txt to ./folderB/*.txt
etc.
I checked cp and find commands but couldn't get it working.
I suspect that while you say copy, you actually mean to move the files up to their respective parent directories. It can be done easily using find:
$ find . -name '*.txt' -type f -execdir mv -n '{}' ../ \;
The above command recurses into the current directory . and then applies the following cascade of conditionals to each item found:
-name '*.txt' selects only files that have the .txt extension
-type f selects only regular files (e.g., not directories that – for whatever reason – happen to have a name ending in .txt)
-execdir mv -n '{}' ../ \; executes the command mv -n '{}' ../ in the containing directory, where the {} is a placeholder for the matched file's name and the single quotes are needed to stop the shell from interpreting the curly braces. The ; terminates the command and again has to be escaped to keep the shell from interpreting it.
I have passed the -n flag to the mv program to avoid accidentally overwriting an existing file.
The above command will transform the following file system tree
dir1/
    dir11/
        file3.txt
        file4.txt
    dir12/
        file2.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
            file8.txt
        file7.txt
    file5.txt
dir3/
    file9.dat
file1.txt
into this one:
dir1/
    dir11/
    dir12/
    file3.txt
    file4.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
        file8.txt
    file7.txt
dir3/
    file9.dat
file2.txt
file5.txt
To get rid of the empty directories, run
$ find . -type d -empty -delete
Again, this command will traverse the current directory . and then apply the following:
-type d this time selects only directories
-empty selects only those that are empty
-delete deletes them.
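To preview what would be removed before committing, you can replace -delete with -print as a dry run:
$ find . -type d -empty -print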
Fine print: -execdir is not specified by POSIX, though major implementations (at least the GNU and BSD one) support it. If you need strict POSIX compliance, you'll have to make do with the less safe -exec which would need additional thought to be applied correctly in this case.
Finally, please try your commands in a test directory with dummy files, not your actual data. Especially with the -delete option of find, you can lose all your data quicker than you might imagine. Read the man page and, if that is not enough, the reference manual of find. Never blindly copy shell commands posted on the internet by random strangers if you don't understand them.
Try this command:
$ cp ./folderA/folder1/*.txt ./folderA
Run something like this from the root(ish) directory:
#!/bin/bash

new_dir() {
    LOC_DIR=$(pwd)
    for i in "${LOC_DIR}"/*; do
        # copy regular files up to the parent directory
        [[ -f "${i}" ]] && cp "${i}" ../
        # descend into subdirectories, recurse, then come back up
        [[ -d "${i}" ]] && cd "${i}" && new_dir && cd ..
    done
    return 0
}
new_dir
This will search each directory. When a file is encountered, it copies the file up a directory. When a directory is found, it will move down into the directory and start the process over again. I think it'll work for you.
Good luck.

Search for text files in a directory and append a (static) line to each of them

I have a directory with many subdirectories, and files with suffixes in those subdirectories (e.g. FileA-suffixA, FileB-SuffixB, FileC-SuffixC, FileD-SuffixA, etc.).
How can I recursively search for files with a certain suffix, and append a user-defined line of text to those files? I feel like this is a job for grep and sed, but I'm not sure how I would go about doing it. I'm fairly new to scripting, so please bear with me.
You can do it like this:
find /where/to/search -type f -iname '*.SUFFIX' -exec sh -c 'echo "USER DEFINED STRING" >> "$1"' sh {} \;
find searches in the supplied path
-type f finds only files
-iname '*.SUFFIX' finds the .SUFFIXed names, case ignored
the sh -c wrapper is needed because the >> redirection must be performed by a shell once per matched file; without it, your interactive shell would apply the redirection once, to a file literally named {}
find ./ -name "*suffix" -exec bash -c 'echo "line_to_add" >> "$1"' -- {} \;
Basically you use find to get the list of files, then bash -c to append your line to each of them.
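If there are many files, a variant that appends in batches instead of spawning one shell per file (using -exec ... {} +, which POSIX find supports) could look like this:
find ./ -name "*suffix" -exec sh -c 'for f; do echo "line_to_add" >> "$f"; done' sh {} +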

Changing all file's extensions in a folder using CLI in Linux

How do you change the extensions of all files in a folder using one command on the CLI in Linux?
Use rename:
rename 's/.old$/.new/' *.old
If you have the Perl rename installed (there are different rename implementations), you can do something like this:
$ ls -1
test1.foo
test2.foo
test3.foo
$ rename 's/\.foo$/.bar/' *.foo
$ ls -1
test1.bar
test2.bar
test3.bar
You could use a for-loop on the command line:
for foo in *.old; do mv "$foo" "$(basename "$foo" .old).new"; done
This will take all files with the extension .old and rename them to .new.
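An equivalent sketch that avoids the basename subshell by using shell parameter expansion (${foo%.old} strips the .old suffix):
for foo in *.old; do mv "$foo" "${foo%.old}.new"; done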
This should work on the current directory AND its subdirectories. It will rename all .oldExtension files anywhere under the directory structure to the new extension (note that looping over find output breaks on filenames containing whitespace; the find -exec answer below handles those):
for f in $(find . -iname '*.oldExtension' -type f -print); do mv "$f" "${f%.oldExtension}.newExtension"; done
This will work recursively, and with files containing spaces.
Be sure to replace .old and .new with the proper extensions before running.
find . -iname '*.old' -type f -exec bash -c 'mv "$0" "${0%.old}.new"' {} \;
Source: recursively add file extension to all files. (Not my answer.)
find . -type f -exec mv '{}' '{}'.jpg \;
Explanation: this recursively finds all files (-type f) starting from the current directory (.) and applies the move command (mv) to each of them. Because -exec passes each filename as a single argument, filenames with spaces (and even newlines...) are properly handled; the quotes around {} guard against interpretation by some shells.

How to gzip all files in all sub-directories in bash

I want to iterate over the subdirectories of my current location and gzip each file separately. To zip the files in a directory, I use
for file in *; do gzip "$file"; done
but this only works on the current directory and not on its subdirectories. How can I rewrite the above statement so that it also zips the files in all subdirectories?
I'd prefer gzip -r ./ which does the same thing but is shorter.
No need for loops or anything more than find and gzip:
find . -type f ! -name '*.gz' -exec gzip "{}" \;
This finds all regular files in and below the current directory whose names don't end with the .gz extension (that is, all files that are not already compressed). It invokes gzip on each file individually.
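Since gzip accepts multiple file arguments, you can also let find batch them with -exec ... {} + (POSIX), which spawns far fewer gzip processes:
find . -type f ! -name '*.gz' -exec gzip {} +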
Edit, based on comment from user unknown:
The curly braces ({}) are replaced with the filename, which is passed directly, as a single word, to the command following -exec as you can see here:
$ touch foo
$ touch "bar baz"
$ touch xyzzy
$ find . -exec echo {} \;
./foo
./bar baz
./xyzzy
find . -type f | while IFS= read -r file; do gzip "$file"; done
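If you are using bash and filenames might contain newlines, a more robust sketch reads null-delimited names (-print0 is a GNU/BSD find extension; read -d '' is bash-specific):
find . -type f -print0 | while IFS= read -r -d '' file; do gzip "$file"; done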
I can't comment on the top post (yet...), but I read in the find man pages that -execdir is safer than -exec because the command is run in the subdirectory where the match is found, rather than from the parent directory where find is run.
If anyone would like to use a pattern to locate specific files in subdirectories to zip (note that -name takes a shell glob pattern, not a regex), I'd recommend using
find ./ -type f -name 'addRegexHere' -execdir gzip -k "{}" \;
If you don't need a pattern, stick with the recursive gzip call above.
