Finding subdirectories of depth 1 that do _not_ include a file - linux

I am working on an open-source project. In most, but not all, of the subdirectories of depth 1, a file called "test.c" can be found. How can I find the directories that do not contain "test.c"?
For example, I have subdirectories dir1, dir2, dir3. dir2 and dir3 have "test.c"; I have to check each one manually with "ls" to determine that "dir1" does not. Presumably there is a simpler way (such as a bash command) to do this? I am on Ubuntu 16, so a bash command would be preferred.

You may use this find command from base directory of all the sub-directories:
find . -mindepth 1 -maxdepth 1 -type d -exec bash -c 'for d; do [[ -f "$d"/test.c ]] || echo "$d"; done' - {} +
This command lists the immediate subdirectories of the current directory (-mindepth 1 -maxdepth 1 restricts matching to depth 1) and checks each one for the presence of test.c inside the bash command; if the file is not present, the directory name is printed. The bare for d iterates over the positional parameters, i.e. the batch of directory names that find appends in place of {} +.
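Since only the immediate subdirectories matter here, a plain glob loop avoids find entirely - a minimal sketch using only bash built-ins:
for d in */; do
    # print the directory name if it does not contain test.c
    [[ -f "${d}test.c" ]] || printf '%s\n' "${d%/}"
done
The trailing slash in the */ pattern restricts matching to directories; ${d%/} strips it again for display.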

Related

bash script to iterate directories and create tar files

I was searching for ways to create a bash file that would iterate all the folders in a directory, and create a tar.gz file for each of those directories.
(This is used specifically for ubuntu/drupal website - but could be useful in other scenarios.)
After lots of searching, combining scripts and testing, I found the following works very well when run from within the main directory.
This might differ slightly depending on your version of bash and Ubuntu, and on where you schedule or run the script from. (Run it by typing sh createDirectoryTarFiles.sh at the command line from within the parent folder.)
The echo line is not necessary - just for viewing purposes.
for D in *; do
    if [ -d "${D}" ]; then
        # strip the last four characters from the directory name
        # (specific to this site's directory-naming scheme)
        tx="${D%????}"
        echo "Directory is ${D} - and name of file would be $tx"
        tar -zcvf "$tx.tar.gz" "${D}"
    fi
done
You can use find to search for directories between -mindepth and -maxdepth and then create the tars:
find . -mindepth 1 -maxdepth 1 -type d -exec sh -c 'tar czf "$(basename "$1").tar.gz" "$1"' _ {} \;
(The basename call has to run inside the sh -c script: a command substitution such as $(basename {}) on the find command line would be expanded by the outer shell before find ever runs.)

Find the name of subdirectories and process files in each

Let's say /tmp has subdirectories test1, test2, test3 and so on,
and each has multiple files inside.
I have to run a while loop or for loop to find the names of the directories (in this case test1, test2, ...)
and run a command that processes all the files inside each directory.
So, for example,
I have to get the directory names under /tmp which will be test1, test2, ...
For each subdirectory, I have to process the files inside of it.
How can I do this?
Clarification:
This is the command that I want to run:
find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;
where 140725_D0 is an example of one subdirectory to process - there are multiples, with different names.
So, by using a for or while loop, I want to find all subdirectories and run a command on the files in each.
The for or while loop should iteratively replace the hard-coded name 140725_D0 in the find command above.
You should be able to do this with a single find command with an embedded shell command:
find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;
Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.
The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
If not all subdirectories are guaranteed to have *.json files, change the shell command to:
for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done
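Alternatively, if you are invoking bash anyway, its nullglob option makes the pattern expand to nothing when there are no matches, so the loop body simply never runs - a sketch of that variant:
find /PROD -type d -execdir bash -c 'shopt -s nullglob; for f in *.json; do /tmp/test.py "$f"; done' \;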
Update: Two more considerations; tip of the hat to kenorb's answer:
By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:
find /PROD -maxdepth 1 -type d ...
As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead and by including a cd command with the directory path at hand ({}) in the shell command:
find /PROD -type d -exec sh -c 'cd "{}" && for f in *.json; do /tmp/test.py "$f"; done' \;
[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).
Try the following usage of find:
find . -type d -exec sh -c 'cd "{}" && echo Do some stuff for {}, files are: $(ls *.*)' ';'
Use -maxdepth if you'd like to limit your directory levels.
You can do this using bash's subshell feature, like so:
for i in /tmp/test*; do
    # don't do anything if there's no /tmp/test* directory
    [ "$i" != "/tmp/test*" ] || continue
    for j in "$i"/*.json; do
        # don't do anything if there's nothing to run
        [ "$j" != "$i/*.json" ] || continue
        (cd "$i" && ./file_to_run)
    done
done
When you wrap a command in ( and ), it runs in a subshell - a child copy of the current shell. Changes made inside it, such as the cd, do not affect the parent shell, so the loop's working directory is undisturbed after each iteration.
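A quick way to see the effect:
pwd           # e.g. /home/user
(cd /tmp)     # the cd happens only inside the subshell
pwd           # still /home/user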
You can also simply ask the shell to expand the directories/files you need, e.g. using the xargs command:
echo /PROD/*/*.json | xargs -n 1 /tmp/test.py
or even using your original find command:
find /PROD/* -name "*.json" -exec /tmp/test.py {} \;
Both commands will process all JSON files contained in any subdirectory of /PROD.
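If any of the paths may contain whitespace, a null-delimited pipeline is safer (GNU find and xargs support this via -print0 and -0):
find /PROD -name '*.json' -print0 | xargs -0 -n 1 /tmp/test.py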
Another solution is to slightly change the Python code inside your script so that it accepts and processes multiple files.
For example, if your script contains something like:
def process(fname):
    print 'Processing file', fname

if __name__ == '__main__':
    import sys
    process(sys.argv[1])
you could replace the last line with:
for fname in sys.argv[1:]:
    process(fname)
After this simple modification, you can call your script this way:
/tmp/test.py /PROD/*/*.json
and have it process all the desired JSON files.
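With the multi-file version in place, you can also let find batch the file names with + instead of \;, invoking the script far fewer times:
find /PROD -name '*.json' -exec /tmp/test.py {} +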

Linux recursive copy files to its parent folder

I want to recursively copy files with a specific extension up to their parent folder. For example:
./folderA/folder1/*.txt to ./folderA/*.txt
./folderB/folder2/*.txt to ./folderB/*.txt
etc.
I checked cp and find commands but couldn't get it working.
I suspect that while you say copy, you actually mean to move the files up to their respective parent directories. It can be done easily using find:
$ find . -name '*.txt' -type f -execdir mv -n '{}' ../ \;
The above command recurses into the current directory . and then applies the following cascade of conditionals to each item found:
-name '*.txt' will filter out only files that have the .txt extension
-type f will filter out only regular files (e.g., not directories that – for whatever reason – happen to have a name ending in .txt)
-execdir mv -n '{}' ../ \; executes the command mv -n '{}' ../ in the containing directory where the {} is a placeholder for the matched file's name and the single quotes are needed to stop the shell from interpreting the curly braces. The ; terminates the command and again has to be escaped from the shell interpreting it.
I have passed the -n flag to the mv program to avoid accidentally overwriting an existing file.
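If you want to preview the moves before committing to them, prefix mv with echo to get a dry run:
$ find . -name '*.txt' -type f -execdir echo mv -n '{}' ../ \;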
The mv command will transform the following file system tree
dir1/
    dir11/
        file3.txt
        file4.txt
    dir12/
    file2.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
            file8.txt
        file7.txt
    file5.txt
dir3/
    file9.dat
file1.txt
into this one:
dir1/
    dir11/
    dir12/
    file3.txt
    file4.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
        file8.txt
    file7.txt
dir3/
    file9.dat
file2.txt
file5.txt
(file1.txt, which was already at the top level, has been moved into the parent of ., i.e. out of the tree shown.)
To get rid of the empty directories, run
$ find . -type d -empty -delete
Again, this command will traverse the current directory . and then apply the following:
-type d this time filters out only directories
-empty filters out only those that are empty
-delete deletes them.
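Note that -empty and -delete are GNU/BSD extensions. On a strictly POSIX find you can approximate the cleanup by attempting rmdir deepest-first and ignoring the failures on non-empty directories - a rough sketch:
$ find . -depth -type d -exec rmdir {} \; 2>/dev/null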
Fine print: -execdir is not specified by POSIX, though major implementations (at least the GNU and BSD one) support it. If you need strict POSIX compliance, you'll have to make do with the less safe -exec which would need additional thought to be applied correctly in this case.
Finally, please try your commands in a test directory with dummy files, not on your actual data. Especially with the -delete option of find, you can lose all your data quicker than you might imagine. Read the man page and, if that is not enough, the reference manual of find. Never blindly copy shell commands posted on the internet by random strangers if you don't understand them.
Try this command:
$ cp ./folderA/folder1/*.txt ./folderA
Run something like this from the root(ish) directory:
#!/bin/bash

new_dir() {
    local i
    for i in "$(pwd)"/*; do
        # copy regular files up into the parent directory
        [[ -f "${i}" ]] && cp "${i}" ../
        # descend into subdirectories, recurse, then come back up
        if [[ -d "${i}" ]]; then
            cd "${i}" && new_dir
            cd ..
        fi
    done
    return 0
}

new_dir
This will search each directory. When a file is encountered, it copies the file up a directory. When a directory is found, it will move down into the directory and start the process over again. I think it'll work for you.
Good luck.

Removing Colons From Multiple FIles on Linux

I am trying to take some directories and transfer them from Linux to Windows. The problem is that the names of the files on Linux contain colons, which Windows cannot accept. I need to copy these directories (I cannot alter them directly, since they are needed as they are on the server) over to names that Windows can use. For example, the name of a directory on the server might be:
IAPLTR2b-ERVK-LTR_chr9:113137544-113137860_-
while I need it to be:
IAPLTR2b-ERVK-LTR_chr9-113137544-113137860_-
I have about sixty of these directories and I have collected the names of the files with their absolute paths in a file I call directories.txt. I need to walk through this file changing the colons to hyphens. Thus far, my attempt is this:
#!/bin/bash
$DIRECTORIES=`cat directories.txt`
for $i in $DIRECTORIES;
do
cp -r "$DIRECTORIES" "`echo $DIRECTORIES | sed 's/:/-/'`"
done
However I get the error:
./my_shellscript.sh: line 10: =/bigpartition1/JKim_Test/test_bs_1/129c-test-biq/IAPLTR1_Mm-ERVK-LTR_chr10:104272652-104273004_+.fasta: No such file or directory ./my_shellscript.sh: line 14: `$i': not a valid identifier
Can anyone here help me identify what I am doing wrong and maybe what I need to do?
Thanks in advance.
This monstrosity will rename the directories in situ:
find tmp -depth -type d -exec sh -c '[ -d "{}" ] && echo mv "{}" "$(echo "{}" | tr : -)"' \;
I use -depth so it descends down into the deepest subdirectories first.
The [ -d "{}" ] is necessary because as soon as the subdirectory is renamed, its parent directory (as found by find) may no longer exist (having been renamed).
Change "echo mv" to "mv" if you're satisfied it will do what you want.
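Since the question actually asks for copies (the originals must stay untouched on the server), a small bash loop over directories.txt may be closer to what is needed - a sketch assuming one absolute path per line and colons only in the final path component:
#!/bin/bash
# copy each directory listed in directories.txt to a colon-free name
while IFS= read -r src; do
    dest="${src%/*}/$(basename "$src" | tr : -)"
    cp -r "$src" "$dest"
done < directories.txt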

Get grandparent directory in bash script - rename files for a directory in their paths

I have the following script, which I normally use when I get a bunch of files that need to be renamed to the directory name which contains them.
The problem now is that I need to rename each file after the directory two levels up. How can I get the grandparent directory to make this work?
With the following I get errors like this example:
"mv: cannot move ./48711/zoom/zoom.jpg to ./48711/zoom/./48711/zoom.jpg: No such file or directory". This is running on CentOS 5.6.
I want the final file to be named: 48711.jpg
#!/bin/bash
function dirnametofilename() {
    for f in $*; do
        bn=$(basename "$f")
        ext="${bn##*.}"
        filepath=$(dirname "$f")
        dirname=$(basename "$filepath")
        mv "$f" "$filepath/$dirname.$ext"
    done
}
export -f dirnametofilename

find . -name "*.jpg" -exec bash -c 'dirnametofilename "{}"' \;

find .
Another method could be to use
(cd ../../; pwd)
Executed from a top-level path such as /, /usr/, or /usr/share/, this simply yields /; one level deeper, you start seeing distinct results: /usr/share/man/ would return /usr, /my/super/deep/path/is/awesome/ would return /my/super/deep/path, and so on.
You could store this in a variable as well:
GRANDDADDY="$(cd ../../; pwd)"
and then use it for the rest of your script.
Assuming filepath doesn't end in /, which it shouldn't if you use dirname, you can do:
parent="${filepath%/*}"
grandparent="${filepath%/*/*}"
(Note there must be no spaces around = in a bash assignment.)
So do something like this
[[ "${filepath%/*/*}" == "" ]] && echo "Path isn't long enough" || echo "${filepath%/*/*}"
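To see these expansions in isolation, with a hypothetical path:
filepath=/data/photos/48711/zoom
echo "${filepath%/*}"     # /data/photos/48711 (parent)
echo "${filepath%/*/*}"   # /data/photos (grandparent)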
Also, this likely won't work if you're using relative paths (as with find .), in which case you will want to use
filepath=$(dirname "$f")
filepath=$(readlink -f "$filepath")
instead of
filepath=$(dirname "$f")
Also you're never stripping the extension, so there is no reason to get it from the file and then append it again.
Note:
* This answer solves the OP's specific problem, in whose context "grandparent directory" means: the parent directory of the directory containing a file (it is the grandparent path from the file's perspective).
* By contrast, given the question's generic title, other answers here focus (only) on getting a directory's grandparent directory; the succinct answer to the generic question is: grandParentDir=$(cd ../..; printf %s "$PWD") to get the full path, and grandParentDirName=$(cd ../..; basename -- "$PWD") to get the dir. name only.
Try the following:
find . -name '*.jpg' \
-execdir bash -c \
'old="$1"; new="$(cd ..; basename -- "$PWD").${old##*.}"; echo mv "$old" "$new"' - {} \;
Note: echo was prepended to mv to be safe - remove it to perform the actual renaming.
-execdir ... \; executes the specified command in the directory that contains each matching file and expands {} to the filename at hand.
bash -c is used to execute a small ad-hoc script:
$(cd ..; basename -- "$PWD") determines the parent directory name of the directory containing the file, which is the grandparent path from the file's perspective.
${old##*.} is a Bash parameter expansion that returns the input filename's suffix (extension).
Note how {} - the filename at hand - is passed as the 2nd argument to the command in order to bind to $1, because bash -c uses the 1st one to set $0 (which is set to the dummy value - here).
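You can see this binding in isolation:
bash -c 'echo "$0 / $1"' - hello    # prints: - / hello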
Note that each file is merely renamed, i.e., it stays in its original directory.
Caveat:
Each directory with a matching file should only contain 1 matching file, otherwise multiple files will be renamed to the same target name in sequence - effectively, only the last file renamed will survive.
Can't you use realpath ../../ or readlink -f ../../ ? See readlink(1), realpath(3), canonicalize_file_name(3), and realpath(1). You may want to install the realpath package on Debian or Ubuntu; CentOS probably has an equivalent package. (readlink should always be available; it is in GNU coreutils.)
