Run a qsub command in all subdirectories - linux

I am using Centos on a HPC to run my code. I typically have a folder that contains a run_calc File, which is what I want to run as:
qsub run_calc
Now I want to write a script "submit_all.sh" that submits all run_calc files in all the subfolders in their current directory and not from the from a parent folder where I runt the submit_all.sh script.
I found similar questions posted here Solution and here Solution2
that seems to be a partial answer to this question. I am not confident just submitting scripts until I found a solution which is why I ask:
In the second link I found this solution:
for i in {1..1000}; do
cd "$i"
qsub submit.sh
cd ..
done
were "i" was a list of folders with the names 1-100. Is it somehow possible to use find to create a list of all the subdirectories and path it to the for loop? How would i deal with subsubdirectories? Would I be able to change the cd .. statement such that I always go back to the parent folder directly in that case?
I fond this solution here: Solution
#!/bin/sh
for i in `find /var/www -type d -maxdepth 1 -mindepth 1`; do
cd $i
# do something here
done
But I do not understand what is going on? Is it possible to change the above script to the only dive into folders containing a run_calc File and also include subsubdirectries?
Thank you in advance

Assuming that you are using bash as your shell:
$ cat ./test.sh
#!/bin/bash
IFS=$'\n'
while read -r fname ;
do
pushd $(dirname "${fname}") > /dev/null
qsub run_calc
popd > /dev/null
done < <(find . -type f -name 'run_calc')
find . -type f -name 'run_calc' finds all paths to file run_calc inside the current directory and its subdirectories. This is input for the while loop.
pushd, popd are bash specific, and adds in or pops out of directory stack.

for d in `find . -type d`
do ( cd "$d"
if test ! -f run_calc; then continue; fi
qsub run_calc
) done
( commnds ) execute commands in a separate process and effect of cd does not "leak".

Related

bash script to iterate directories and create tar files

I was searching for ways to create a bash file that would iterate all the folders in a directory, and create a tar.gz file for each of those directories.
(This is used specifically for ubuntu/drupal website - but could be useful in other scenarios.)
After lots of searching, combining scripts and testing, I found the following works very well when run from within the main directory.
This might be slightly different depending on version of bash, what version of ubuntu and from where you schedule or run the bash file. (Run by typing sh createDirectoryTarFiles.sh at command line from within the parent folder.)
The echo line is not necessary - just for viewing purposes.
for D in *; do
if [ -d "${D}" ]; then
tx="${D%????}"
echo "Directory is ${D} - and name of file would be $tx"
tar -zcvf "$tx.tar.gz" "${D}"
fi
done
You can use find to search for directories between mindepth and maxdepth and then create the tars
find . -maxdepth 1 -mindepth 1 -type d -exec tar czf $(basename {}).tar.gz {} \;

Creating a file in a directory other than root using bash

I am currently working on an auto grading script for a class project. It has to be able to search any number of given directories lets say
for example
usr/autograder/jdoe/
jdoe contains two files house.c and readme.txt.
I need to create a file in jdoe called jdoe.pdf
Currently i'm using this line of code below to get the path to where i need to create the file. Where $1 is user input of the path containing the projects the auto grader will grade.
find $1 -name "*.txt" -exec sh -c "dirname {}"
When I try adding /somename.pdf to the end of this statement I get readme.txt/somename.pdf
along with another -exec to get the name for the file.
\; -exec sh -c "dirname {} xargs -n 1 basename" \;
I'm having problems combining these two into one working statement.
I'm new to unix programming and would appreciate any advice or help even if it means re-writing the code using different unix tools.
The main question here is how do I create files in a path other than the directory I call my script from. Thanks in advance.
How about this?
find "$1" -name "*.txt" -exec bash -c 'd=$(dirname "$1"); touch $d"/"$(basename "$d").pdf' - {} \;
You can create files in another path using change directory command (cd).
If you start your script in usr/autograder/script and want to change to usr/autograder/jdoe you can change directory with shell command cd ../jdoe (relative) or cd usr/autograder/jdoe (absolute).
Now you are in the directory of usr/autograder/jdoe and you are able to create files in this directory, for example gedit readme.txt will open gedit and creates the file in usr/autograder/jdoe.
The simplest way is to loop over the files returned by find and then do whatever you need to do.
For example:
find "$1" -type f -name "*.txt" -print0 | while IFS= read -r -d $'\0' filename; do
dir=$(dirname "$filename")
# create pdf file
touch "$dir/${dir##*/}.pdf"
done
(Note the use of find -print0 to correctly handle filenames containing whitespace and newline characters.)
Is this what you are looking for?
function process_file {
dir=$(dirname "$1")
name=$(basename "$1")
echo name is $name and dir is $dir;
cd "$dir"
touch "${dir##*/}.pdf" # or anything else
}
# export the function, so that it is known in the child processes
export -f process_file
find . -name '*.txt' -exec bash -c "process_file '{}'" \;

Execute multiple commands on target files from find command

Let's say I have a bunch of *.tar.gz files located in a hierarchy of folders. What would be a good way to find those files, and then execute multiple commands on it.
I know if I just need to execute one command on the target file, I can use something like this:
$ find . -name "*.tar.gz" -exec tar xvzf {} \;
But what if I need to execute multiple commands on the target file? Must I write a bash script here, or is there any simpler way?
Samples of commands that need to be executed a A.tar.gz file:
$ tar xvzf A.tar.gz # assume it untars to folder logs
$ mv logs logs_A
$ rm A.tar.gz
Here's what works for me (thanks to Etan Reisner suggestions)
#!/bin/bash # the target folder (to search for tar.gz files) is parsed from command line
find $1 -name "*.tar.gz" -print0 | while IFS= read -r -d '' file; do # this does the magic of getting each tar.gz file and assign to shell variable `file`
echo $file # then we can do everything with the `file` variable
tar xvzf $file
# mv untar_folder $file.suffix # untar_folder is the name of folder after untar
rm $file
done
As suggested, the array way is unsafe if file name contained space(s), and also doesn't seem to work properly in this case.
Writing a shell script is probably easiest. Take a look at sh for loops. You could use the output of a find command in an array, and then loop over that array to perform a set of commands on each element.
For example,
arr=( $(find . -name "*.tar.gz" -print0) )
for i in "${arr[#]}"; do
# $i now holds each of the filenames output by find
tar xvzf $i
mv $i $i.suffix
rm $i
# etc., etc.
done

Find the name of subdirectories and process files in each

Let's say /tmp has subdirectories /test1, /test2, /test3 and so on,
and each has multiple files inside.
I have to run a while loop or for loop to find the name of the directories (in this case /test1, /test2, ...)
and run a command that processes all the files inside of each directory.
So, for example,
I have to get the directory names under /tmp which will be test1, test2, ...
For each subdirectory, I have to process the files inside of it.
How can I do this?
Clarification:
This is the command that I want to run:
find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;
where 140725_D0 is an example of one subdirectory to process - there are multiples, with different names.
So, by using a for or while loop, I want to find all subdirectories and run a command on the files in each.
The for or while loop should iteratively replace the hard-coded name 140725_D0 in the find command above.
You should be able to do with a single find command with an embedded shell command:
find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;
Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.
The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
If not all subdirectories are guaranteed to have *.json files, change the shell command to for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done
Update: Two more considerations; tip of the hat to kenorb's answer:
By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:
find /PROD -maxdepth 1 -type d ...
As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead and by including a cd command with the directory path at hand ({}) in the shell command:
find /PROD -type d -exec sh -c 'cd "{}" && for f in *.json; do /tmp/test.py "$f"; done' \;
[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).
Try the following usage of find:
find . -type d -exec sh -c 'cd "{}" && echo Do some stuff for {}, files are: $(ls *.*)' ';'
Use -maxdepth if you'd like to limit your directory levels.
You can do this using bash's subshell feature like so
for i in /tmp/test*; do
# don't do anything if there's no /test directory in /tmp
[ "$i" != "/tmp/test*" ] || continue
for j in $i/*.json; do
# don't do anything if there's nothing to run
[ "$j" != "$i/*.json" ] || continue
(cd $i && ./file_to_run)
done
done
When you wrap a command in ( and ) it starts a subshell to run the command. A subshell is exactly like starting another instance of bash except it's slightly more optimal.
You can also simply ask the shell to expand the directories/files you need, e.g. using command xargs:
echo /PROD/*/*.json | xargs -n 1 /tmp/test.py
or even using your original find command:
find /PROD/* -name "*.json" -exec /tmp/test.py {} \;
Both command will process all JSON files contained into any subdirectory of /PROD.
Another solution is to change slightly the Python code inside your script in order to accept and process multiple files.
For example, if your script contains something like:
def process(fname):
print 'Processing file', fname
if __name__ == '__main__':
import sys
process(sys.argv[1])
you could replace the last line with:
for fname in sys.argv[1:]:
process(fname)
After this simple modification, you can call your script this way:
/tmp/test.py /PROD/*/*.json
and have it process all the desired JSON files.

Find file then cd to that directory in Linux

In a shell script how would I find a file by a particular name and then navigate to that directory to do further operations on it?
From here I am going to copy the file across to another directory (but I can do that already just adding it in for context.)
You can use something like:
cd -- "$(dirname "$(find / -type f -name ls | head -1)")"
This will locate the first ls regular file then change to that directory.
In terms of what each bit does:
The find will start at / and search down, listing out all regular files (-type f) called ls (-name ls). There are other things you can add to find to further restrict the files you get.
The | head -1 will filter out all but the first line.
$() is a way to take the output of a command and put it on the command line for another command.
dirname can take a full file specification and give you the path bit.
cd just changes to that directory, the -- is used to prevent treating a directory name beginning with a hyphen from being treated as an option to cd.
If you execute each bit in sequence, you can see what happens:
pax[/home/pax]> find / -type f -name ls
/usr/bin/ls
pax[/home/pax]> find / -type f -name ls | head -1
/usr/bin/ls
pax[/home/pax]> dirname "$(find / -type f -name ls | head -1)"
/usr/bin
pax[/home/pax]> cd -- "$(dirname "$(find / -type f -name ls | head -1)")"
pax[/usr/bin]> _
The following should be more safe:
cd -- "$(find / -name ls -type f -printf '%h' -quit)"
Advantages:
The double dash prevents the interpretation of a directory name starting with a hyphen as an option (find doesn't produce such file names, but it's not harmful and might be required for similar constructs)
-name check before -type check because the latter sometimes requires a stat
No dirname required because the %h specifier already prints the directory name
-quit to stop the search after the first file found, thus no head required which would cause the script to fail on directory names containing newlines
no one suggesting locate (which is much quicker for huge trees) ?
zsh:
cd $(locate zoo.txt|head -1)(:h)
cd ${$(locate zoo.txt)[1]:h}
cd ${$(locate -r "/zoo.txt$")[1]:h}
or could be slow
cd **/zoo.txt(:h)
bash:
cd $(dirname $(locate -l1 -r "/zoo.txt$"))
Based on this answer to a similar question, other useful choice could be having 2 commands, 1st to find the file and 2nd to navigate to its directory:
find ./ -name "champions.txt"
cd "$(dirname "$(!!)")"
Where !! is history expansion meaning 'the previous command'.
Expanding on answers already given, if you'd like to navigate iteratively to every file that find locates and perform operations in each directory:
for i in $(find /path/to/search/root -name filename -type f)
do (
cd $(dirname $(realpath $i));
your_commands;
)
done
if you are just finding the file and then moving it elsewhere, just use find and -exec
find /path -type f -iname "mytext.txt" -exec mv "{}" /destination +;
function fReturnFilepathOfContainingDirectory {
#fReturnFilepathOfContainingDirectory_2012.0709.18:19
#$1=File
local vlFl
local vlGwkdvlFl
local vlItrtn
local vlPrdct
vlFl=$1
vlGwkdvlFl=`echo $vlFl | gawk -F/ '{ $NF="" ; print $0 }'`
for vlItrtn in `echo $vlGwkdvlFl` ;do
vlPrdct=`echo $vlPrdct'/'$vlItrtn`
done
echo $vlPrdct
}
Simply this way, isn't this elegant?
cdf yourfile.py
Of course you need to set it up first, but you need to do this only once:
Add following line into your .bashrc or .zshrc, whatever you use as your shell initialization script.
source ~/bin/cdf.sh
And add this code into ~/bin/cdf.sh file that you need to create from scratch.
#!/bin/bash
function cdf() {
THEFILE=$1
echo "cd into directory of ${THEFILE}"
# For Mac, replace find with mdfind to get it a lot faster. And it does not need args ". -name" part.
THEDIR=$(find . -name ${THEFILE} |head -1 |grep -Eo "/[ /._A-Za-z0-9\-]+/")
cd ${THEDIR}
}
If it's a program in your PATH, you can do:
cd "$(dirname "$(which ls)")"
or in Bash:
cd "$(dirname "$(type -P ls)")"
which uses one less external executable.
This uses no externals:
dest=$(type -P ls); cd "${dest%/*}"
If your file is only in one location you could try the following:
cd "$(find ~/ -name [filename] -exec dirname {} \;)" && ...
You can use -exec to invoke dirname with the path that find returns (which goes where the {} placeholder is). That will change directories. You can also add double ampersands ( && ) to execute the next command after the shell has changed directory.
For example:
cd "$(find ~/ -name need_to_find_this.rb -exec dirname {} \;)" && ruby need_to_find_this.rb
It will look for that ruby file, change to the directory, then run it from within that folder. This example assumes the filename is unique and that for some reason the ruby script has to run from within its directory. If the filename is not unique you'll get many locations passed to cd, it will return an error then it won't change directories.
try this. i created this for my own use.
cd ~
touch mycd
sudo chmod +x mycd
nano mycd
cd $( ./mycd search_directory target_directory )"
if [ $1 == '--help' ]
then
echo -e "usage: cd \$( ./mycd \$1 \$2 )"
echo -e "usage: cd \$( ./mycd search_directory target_directory )"
else
find "$1"/ -name "$2" -type d -exec echo {} \; -quit
fi
cd -- "$(sudo find / -type d -iname "dir name goes here" 2>/dev/null)"
keep all quotes (all this does is just send you to the directory you want, after that you can just put commands after that)

Resources