Let's say you have a first.sh file in a directory: "/home/userbob/scripts/foo/". Basically I would like to know how to loop through specific directories, each time going back up to a higher level directory and repeating.
The .sh file has something like this pseudocode:
#!/bin/bash
curdi={$PATH} #where the first.sh file sits on the server
FOLDERS="$curdi/waffles/inner/
$curdi/pancakes/inner/
$curdi/bagels/inner/"
for f in $FOLDERS
do
cd $f
cp innerofinner/* .
cd $curdi
done
The idea is to somehow copy all the contents of /home/userbob/scripts/foo/waffles/inner/innerofinner to /home/userbob/scripts/foo/waffles/inner/
(and basically repeating just with the path having pancakes, bagels, etc.)
Can't do it for all directories (*) under /home/userbob/scripts/foo/ because there are some that I don't want to copy.
This should do it:
for name in waffles pancakes bagels
do
cp "$curdi/$name/inner/innerofinner/"* "$curdi/$name/inner"
done
Walking file trees? Sounds like a job for find!
#!/usr/bin/env bash
# only environment variables should be all-caps
dirs=({bagels,pancakes}/inner)
find "${dirs[@]}" -maxdepth 1 -mindepth 1 -type d -name innerofinner -execdir bash -c 'cp "$1"/* .' -- {} \;
I did a partial path and assumed a working directory of /home/userbob/scripts/foo. An absolute path would work, too, and would look like
dirs=(/home/userbob/scripts/foo/{bagels,pancakes}/inner)
This finds all directories exactly one level below the listed directories that are named "innerofinner" and, in their parent directories, executes bash and a simple cp script.
If you're wondering how this works, read below.
The dirs=() syntax creates an empty array named dirs. dirs=(a b) creates an array with a at index 0 and b at index 1. Any whitespace-delimited list of words will work here. In bash, {a,b,c} expands to a b c, and A{a,b,c}B expands to AaB AbB AcB. So specifying {bagels,pancakes}/inner is just a way to say both bagels/inner and pancakes/inner without having to type as much.
A variable in bash can be expanded with $foo or with ${foo}; these are the same. An array can be expanded to all of its elements with ${foo[@]} (if you know perl or php this will make some sense), and quoting the expansion (always a good idea in shell!) keeps each element as a single word, so spaces inside an element are not processed again by the shell. Thus, "${dirs[@]}" becomes bagels/inner pancakes/inner.
Knowing this, we see that the find command has become find bagels/inner pancakes/inner -maxdepth 1 -mindepth 1 -type d -name innerofinner, and if you execute this it will print exactly two lines: the path to each innerofinner directory. All we want now is to do something for each one, which -execdir does nicely.
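To see the brace expansion and the quoted array expansion described above in isolation, here is a minimal sketch. It only prints what the array holds and does not touch the file system.

```shell
#!/usr/bin/env bash
# Brace expansion runs before the array is built, so this stores two elements.
dirs=({bagels,pancakes}/inner)

printf '%s\n' "${#dirs[@]}"      # number of elements
printf '<%s>\n' "${dirs[@]}"     # one word per element, brackets show the boundaries
```

This prints 2, then <bagels/inner> and <pancakes/inner> on separate lines.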
Use a recursive function or invoke the script recursively.
I am not sure if I understand your problem statement correctly. Your pseudocode seems good. But I see a problem with the following line.
curdi={$PWD}
It does not give you the directory where the script resides but gives the directory you are in. If your script directory is in the path and you are running the script from your home directory then $curdi would point to your home directory and not the directory where your script resides. This will lead to undesired results.
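If the goal really is the directory the script sits in, one common bash idiom (not part of the question's code) derives it from BASH_SOURCE. This is a sketch under two assumptions: the script runs under bash, and it is not reached through a symlink (a symlink resolves to the link's own directory).

```shell
#!/usr/bin/env bash
# Directory containing this script, independent of the caller's working directory.
# BASH_SOURCE is bash-specific; the variable name script_dir is only illustrative.
script_dir=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)

printf 'script lives in: %s\n' "$script_dir"
printf 'called from:     %s\n' "$PWD"
```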
Incidentally, if you really wanted to do it in the way that your pseudo-script attempts it, you'd do it like this
#!/usr/bin/env bash
for f in "$PWD"/{waffles,pancakes,bagels}/inner ; do
cd "$f"
cp innerofinner/* .
# if you know for sure that it's one level up
cd ..
done
Presuming that $PWD is a good enough indicator of "current" directory for you. Me, I'd pass it in to the script.
#!/usr/bin/env bash
base="${1-$PWD}"
for f in "$base"/{waffles,pancakes,bagels}/inner ; do
cd "$f"
cp innerofinner/* .
cd ..
done
and call it like
breakfast.sh /home/userbob/scripts/foo/
find . \( -ipath '*/waffles/*/innerofinner/*' -o \
-ipath '*/pancakes/*/innerofinner/*' -o \
-ipath '*/bagels/*/innerofinner/*' \) \
-type f \
-exec sh -c 'cp "$1" "$(dirname "$(dirname "$1")")"' sh {} \;
Should do. Finds every file in the desired innerofinner subdirs (-ipath matches against the whole path, which -iname does not), then copies each one into the parent of its innerofinner directory.
HTH
I have a script called summarize.sh which produces a summary of the files/dirs inside of a directory. I would like to have it run recursively down the whole tree from the top. What's a good way to do this?
I have tried to loop it with a for loop with
for dir in */; do
cd $dir
./summarize.sh
cd ..
done
however it returns ./summarize.sh: no file or directory
Is it because I am not moving the script as I run it? I am not very familiar with unix directories.
You can recursively list files using find . -type f and make your script take the file of interest as its first argument, so you can do find . -type f -exec myScript.sh {} \;
If you want directories only, use find . -type d instead, or if you want both use just find . without restriction.
Additional option by name, e.g. find . -name '*.py'
Finally, if you do not want to recurse down the directory structure, i.e. only summarize the top level, you can use the -maxdepth 1 option, so something like find . -maxdepth 1 -type d -exec myScript.sh {} \;.
The issue is that you are changing to a different directory with the cd command while your summarize.sh script is not located in these directories. One possible solution is to use an absolute path instead of a relative one. For example, change:
./summarize.sh
to something like:
/path/to/file/summarize.sh
Alternatively, under the given example code, you can also use a relative path pointing to the previous directory like this:
../summarize.sh
Try this code if you are running Bash 4.0 or later:
#! /bin/bash -p
shopt -s nullglob # Globs expand to nothing when they match nothing
shopt -s globstar # Enable ** to expand over the directory hierarchy
summarizer_path=$PWD/summarize.sh
for dir in **/ ; do
cd -- "$dir"
"$summarizer_path"
cd - >/dev/null
done
shopt -s nullglob avoids an error in case there are no directories under the current one.
The summarizer_path variable is set to an absolute path for the summarize.sh program. That is necessary to allow it to be run in directories other than the current one. (./summarize.sh only works when the current directory is the one that contains the script.)
Use cd -- ... to avoid problems if any directory name begins with '-'.
cd - >/dev/null to cd to the previous directory, and throw away its path when it is output by cd -.
Shellcheck issues several warnings about the code above, all to do with the use of cd. I'd fix them for "real" code.
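One way to address the cd warnings, sketched here rather than taken from the answer's author: run each cd inside a subshell and check that it succeeded, so there is nothing to undo afterwards. run_in_each_dir is a hypothetical helper name, and the sketch still assumes Bash 4.0 or later for globstar.

```shell
#!/usr/bin/env bash
shopt -s nullglob globstar

# run_in_each_dir CMD: run CMD (a command name or an absolute path) inside
# every directory below the current one. Hypothetical helper for illustration.
run_in_each_dir() {
    local cmd=$1 dir
    for dir in **/; do
        # The subshell confines the cd; a failed cd or command is reported
        # and the loop carries on with the next directory.
        ( cd -- "$dir" && "$cmd" ) || printf 'skipped or failed: %s\n' "$dir" >&2
    done
}
```

From the top of the tree it could be called as run_in_each_dir "$PWD/summarize.sh".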
I'm putting together a simple shell script to run on a Linux machine where I would:
1) Look for specific sub-directories within a main directory. These sub-dirs have a very specific naming convention (see below) and they are always at a max depth of 2 below the main directory.
2) Rename those sub-dirs to PART of their original name.
For example,
The sub directories are named:
andrew-11111
andrew-11112
andrew-11113
andrew-11114
The path to get to these sub dirs would look something like this:
myphotos/sailing/photos/andrew-1111
myphotos/sailing/photos/andrew-1112
myphotos/biking/photos/andrew-1113
myphotos/hiking/photos/andrew-1114
I'd like to take out the 'andrew-' from each of these sub dirs:
myphotos/sailing/photos/1111
myphotos/sailing/photos/1112
myphotos/biking/photos/1113
myphotos/hiking/photos/1114
I've gotten as far as "finding" the sub dirs and listing them. I also understand how to copy and rename on the command line. But putting it together at my level of shell scripting knowledge has been taking much more time than I can afford. Just a disclaimer: I am more than willing to learn, and have written a handful of shell scripts, but I'm still new to this. Any help or examples are much appreciated!
Use wildcards to match the files in the nested directories
You can use bash parameter expansion operators to manipulate the filenames.
for file in myphotos/*/photos/*; do
name=${file##*/} # remove everything up to last /
dir=${file%/*} # remove everything from last /
newname=${name##*-} # remove everything up to last -
mv "$file" "$dir/$newname"
done
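To check what each expansion produces before renaming anything, the same operators can be applied to a single sample path. This sketch only prints the result and never calls mv.

```shell
#!/usr/bin/env bash
file=myphotos/sailing/photos/andrew-1111

name=${file##*/}      # andrew-1111              (longest prefix ending in / removed)
dir=${file%/*}        # myphotos/sailing/photos  (shortest suffix starting at / removed)
newname=${name##*-}   # 1111                     (longest prefix ending in - removed)

printf '%s -> %s\n' "$file" "$dir/$newname"
```

It prints myphotos/sailing/photos/andrew-1111 -> myphotos/sailing/photos/1111.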
If you have the perl-based rename command, you can do:
rename 's#[^/]*-##' myphotos/*/photos/*
You can do it with this one-liner:
find . -depth -type d -name andrew\* -exec sh -c 'mv "$1" "$(dirname "$1")/$(basename "$1" | cut -d"-" -f2)"' sh {} \;
Explanation:
. -depth start in the current directory and handle a directory's contents before the directory itself, so find never descends into a path that was just renamed
-type d find only directories
-name andrew\* self-explanatory; you have to escape the * though
-exec sh -c '...' sh {} run the command in a separate shell, so the command substitution ($(...)) happens once per directory; the path that find found is passed in as "$1"
mv "$1" ... rename that directory
dirname strips the last component of a path, leaving the parent directory
basename gives you the last component of a given path
cut -d"-" -f2 use cut to cut off "andrew-". For this, set the delimiter to - and select field number 2
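The three helper commands can be tried on one sample path without renaming anything. A minimal sketch, with the path chosen only for illustration:

```shell
#!/bin/sh
path=./myphotos/biking/photos/andrew-1113

parent=$(dirname "$path")                   # ./myphotos/biking/photos
number=$(basename "$path" | cut -d- -f2)    # 1113

echo "$parent/$number"
```

It prints ./myphotos/biking/photos/1113, which is the target that mv would receive.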
I am writing a shell script that checks whether a bin directory is present under every user's directory in /home. The bin directory can be directly under the user's directory or under a child directory of it.
For example, say I have a user amit under /home. The bin directory can be present directly as /amit/bin or as /amit/jash/bin.
What I need is a list of the users' directories where no bin directory is present, either directly under the user's directory or under a child directory of it. I tried this command:
find /home -type d ! -exec test -e '{}/bin' \; -print
but it is not working. However, when I replace the bin directory with a file, the command works fine. It looks like this command only works for files. Is there a similar command for directories? Any help on this will be greatly appreciated.
You're on the right track. The catch is that your test of "does the following directory NOT exist in this target" can't be expressed within find's conditions in such a way as to return only the top-level directory. So you need to nest, one way or another.
One strategy would be to use a for loop in bash:
$ mkdir foo bar baz one two
$ mkdir bar/bin baz/bin
$ for d in /home/*/; do find "$d" -type d -name bin | grep -q . || echo "$d"; done
/home/foo/
/home/one/
/home/two/
This uses pathname expansion (globbing) to generate the list of directories to test, and then checks for the existence of "bin". If that check fails (i.e. find outputs nothing), the directory is printed. Note the trailing slash on /home/*/, which ensures that you will only be searching within directories, rather than files that might accidentally exist in /home/.
Another possibility might be to use nested finds, if you don't want to depend on bash:
$ find /home/ -mindepth 1 -maxdepth 1 -type d -not -exec sh -c 'find "$1" -type d -name bin -print | grep -q .' sh {} \; -print
/home/foo
/home/one
/home/two
This roughly duplicates the effect of the bash for loop above, but by nesting find within find -exec. It uses grep -q . to convert the output of find into an exit status that can be used as a condition for the outer find.
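The grep -q . trick can be wrapped in a small function to make the condition explicit. This is a sketch on a throwaway tree rather than /home, and has_bin is a hypothetical name used only here.

```shell
#!/bin/sh
# has_bin DIR: succeed if DIR contains a directory named bin at any depth.
# grep -q . exits 0 only when find printed at least one line.
has_bin() {
    find "$1" -type d -name bin | grep -q .
}

demo=$(mktemp -d)                 # throwaway tree standing in for /home
mkdir -p "$demo/bar/bin" "$demo/baz/x/bin" "$demo/foo"

for d in "$demo"/*/; do
    has_bin "$d" || echo "no bin under: $d"
done
```

Only the foo directory is reported, because bar and baz each contain a bin somewhere below them.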
Note that since you're looking for a bin directory, the test in your original command should be test -d rather than test -e (which also succeeds for a regular file named bin, although that probably does not matter to you).
Another option is to use bash process substitution. On multiple lines for easier reading:
cd /home/
comm -23 \
<(printf '%s\n' */ | sed 's|/.*||' | sort) \
<(find */ -type d -name bin | cut -d/ -f1 | sort -u)
This unfortunately requires you to change to the /home directory before running, because of the way it strips off subdirectories. You can of course collapse this into a big long one-liner if you feel so inclined.
This comm solution also has the risk of failing on directories with special characters in their names, like newlines.
One last option is bash-only but more than a one-liner. It involves subtracting the directories containing "bin" from the full list. It uses an associative array and globstar, so it depends on bash version 4.
#!/usr/bin/env bash
shopt -s globstar
# Go to our root
cd /home
# Declare an associative array
declare -A dirs=()
# Populate the array with our "full" list of home directories
for d in */; do dirs[${d%/}]=""; done
# Remove directories that contain a "bin" somewhere inside 'em
for d in **/bin/; do unset "dirs[${d%%/*}]"; done
# Print the result in reproducible form
declare -p dirs
# Or print the result just as a list of words.
printf '%s\n' "${!dirs[@]}"
Note that we're storing directories in the array index, which (1) makes it easy for us to find and delete items, and (2) ensures unique entries, even if one user has multiple "bin" directories under their home.
cd /home
find . -maxdepth 1 -type d ! -name . | sort > a
find . -type d -name bin | cut -d/ -f1,2 | sort > b
comm -23 a b
Here, I'm making two sorted lists. The first contains all the home directories, and the second contains the top parent of any bin subdirectory. Finally I output any items from the first list not present in the second.
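The list subtraction itself can be tried without touching /home. In this sketch the two lists are made-up sample data, and bash process substitution stands in for the temporary files a and b.

```shell
#!/usr/bin/env bash
# comm needs both inputs sorted; -23 keeps only lines unique to the first input.
all_homes=$(printf '%s\n' two foo bar one baz)   # sample data, deliberately unsorted
with_bin=$(printf '%s\n' baz bar bar)            # duplicates are fine after sort -u

missing=$(comm -23 <(sort <<<"$all_homes") <(sort -u <<<"$with_bin"))
printf '%s\n' "$missing"
```

It prints foo, one and two, the entries that appear only in the first list.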
Let's say /tmp has subdirectories /test1, /test2, /test3 and so on,
and each has multiple files inside.
I have to run a while loop or for loop to find the name of the directories (in this case /test1, /test2, ...)
and run a command that processes all the files inside of each directory.
So, for example,
I have to get the directory names under /tmp which will be test1, test2, ...
For each subdirectory, I have to process the files inside of it.
How can I do this?
Clarification:
This is the command that I want to run:
find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;
where 140725_D0 is an example of one subdirectory to process - there are multiples, with different names.
So, by using a for or while loop, I want to find all subdirectories and run a command on the files in each.
The for or while loop should iteratively replace the hard-coded name 140725_D0 in the find command above.
You should be able to do this with a single find command with an embedded shell command:
find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;
Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.
The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
If not all subdirectories are guaranteed to have *.json files, change the shell command to for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done
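The reason for the [ -f "$f" ] guard is that a glob with no matches is left as the literal pattern. A minimal sketch with an empty throwaway directory shows the guard skipping that literal:

```shell
#!/usr/bin/env bash
empty=$(mktemp -d)                # throwaway directory with no .json files
count=0

for f in "$empty"/*.json; do
    # With no match, f holds the unexpanded pattern ".../*.json",
    # which is not an existing file, so the guard skips it.
    [ -f "$f" ] || continue
    count=$((count + 1))
done

echo "processed $count file(s)"
rmdir "$empty"
```

It prints processed 0 file(s); without the guard the loop body would run once on the literal pattern.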
Update: Two more considerations; tip of the hat to kenorb's answer:
By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:
find /PROD -maxdepth 1 -type d ...
As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead, passing the directory path at hand ({}) to the shell as an argument, and starting the shell command with a cd to it:
find /PROD -type d -exec sh -c 'cd "$1" && for f in *.json; do /tmp/test.py "$f"; done' sh {} \;
[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).
Try the following usage of find:
find . -type d -exec sh -c 'cd "$1" && echo Do some stuff for "$1", files are: $(ls *.*)' sh {} ';'
Use -maxdepth if you'd like to limit your directory levels.
You can do this using bash's subshell feature like so
for i in /tmp/test*; do
# skip the literal pattern if nothing in /tmp matches test*
[ "$i" != "/tmp/test*" ] || continue
for j in "$i"/*.json; do
# skip the literal pattern if the directory has no .json files
[ "$j" != "$i/*.json" ] || continue
(cd "$i" && ./file_to_run)
done
done
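When you wrap a command in ( and ) it starts a subshell to run the command. A subshell is a forked copy of the current shell, so the cd inside it does not change the working directory of the surrounding loop, and it is slightly cheaper than starting another instance of bash.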
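A quick way to convince yourself that a cd inside ( and ) stays inside is the following sketch; the variable name start is only illustrative.

```shell
#!/bin/sh
start=$PWD

( cd / && echo "inside the subshell:  $PWD" )
echo "after the subshell:   $PWD"
```

The first line reports /, the second reports the directory the script was started from, so no cd back is needed.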
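You can also simply ask the shell to expand the directories/files you need, e.g. using the xargs command (printf '%s\0' with xargs -0, supported by GNU and BSD xargs, keeps file names with spaces intact):
printf '%s\0' /PROD/*/*.json | xargs -0 -n 1 /tmp/test.py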
or even using your original find command:
find /PROD/* -name "*.json" -exec /tmp/test.py {} \;
Both commands will process all JSON files contained in any subdirectory of /PROD (the find version also descends into nested subdirectories).
Another solution is to change slightly the Python code inside your script in order to accept and process multiple files.
For example, if your script contains something like:
def process(fname):
    print('Processing file', fname)

if __name__ == '__main__':
    import sys
    process(sys.argv[1])
you could replace the last line with:
    for fname in sys.argv[1:]:
        process(fname)
After this simple modification, you can call your script this way:
/tmp/test.py /PROD/*/*.json
and have it process all the desired JSON files.
I was wondering if there is a simple and concise way of writing a shell script that would go through a series of directories, (i.e., one for each student in a class), determine if within that directory there are any files that were modified within the last day, and only in that case the script would create a subdirectory and copy the files there. So if the directory had no files modified in the last 24h, it would remain untouched. My initial thought was this:
#!/bin/sh
cd /path/people/ #this directory has multiple subdirectories
for i in `ls`
do
if find ./$i -mtime -1 -type f; then
mkdir ./$i/updated_files
#code to copy the files to the newly created directory
fi
done
However, that seems to create /updated_files for all subdirectories, not just the ones that have recently modified files.
Heavier use of find will probably make your job much easier. Something like
find /path/people -mtime -1 -type f -printf "mkdir --parents %h/updated_files\n" | sort | uniq | sh
The problem is that you are assuming the find command will fail if it finds nothing. The exit code is zero (success) even if it finds nothing that matches.
Something like
UPDATEDFILES=`find ./$i -mtime -1 -type f`
[ -z "$UPDATEDFILES" ] && continue
mkdir ...
cp ...
...
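Putting these pieces together, a minimal sketch of the whole loop could look like the following. The function name copy_recent is hypothetical, only files directly inside each student directory are considered, and -maxdepth is a GNU/BSD find extension rather than POSIX.

```shell
#!/usr/bin/env bash
# copy_recent BASE: for each subdirectory of BASE, copy the files modified in
# the last 24 hours into <subdir>/updated_files, creating it only when needed.
copy_recent() {
    local base=$1 dir recent
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue        # the glob matched nothing
        recent=$(find "$dir" -maxdepth 1 -type f -mtime -1)
        [ -n "$recent" ] || continue     # find exits 0 even with no matches
        mkdir -p "${dir}updated_files"
        find "$dir" -maxdepth 1 -type f -mtime -1 \
            -exec cp -p {} "${dir}updated_files/" \;
    done
}
```

It would be called as copy_recent /path/people; directories without recent files are left untouched because the emptiness test, not find's exit status, decides whether mkdir runs.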