How to find/list the directories where a particular sub-directory is not present

I am writing a shell script that checks whether a bin directory is present under every user's directory in /home. The bin directory can sit directly under the user's directory or under one of its child directories.
For example, say I have a user amit under /home. The bin directory could then be /home/amit/bin or /home/amit/jash/bin.
My requirement is a list of the user directories where bin is not present, either directly under the user's directory or under one of its child directories. I tried this command:
find /home -type d ! -exec test -e '{}/bin' \; -print
but it is not working. However, when I replace the bin directory with a file, the command works fine. It looks like this command only works for files. Is there a similar command for directories? Any help will be greatly appreciated.

You're on the right track. The catch is that your test of "does the following directory NOT exist in this target" can't be expressed within find's conditions in such a way as to return only the top-level directory. So you need to nest, one way or another.
One strategy would be to use a for loop in bash:
$ cd /home
$ mkdir foo bar baz one two
$ mkdir bar/bin baz/bin
$ for d in /home/*/; do find "$d" -type d -name bin | grep -q . || echo "$d"; done
/home/foo/
/home/one/
/home/two/
This uses pathname expansion (globbing) to generate the list of directories to test, and then checks for the existence of "bin". If that check fails (i.e. find outputs nothing), the directory is printed. Note the trailing slash on /home/*/, which ensures that you will only be searching within directories, rather than files that might accidentally exist in /home/.
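The loop above can be wrapped in a small function so that the base directory is a parameter. This is only a sketch in plain POSIX sh; the name list_without_bin is made up for illustration, and it prints each directory without the trailing slash.

```shell
# list_without_bin BASE (hypothetical helper): print each direct
# subdirectory of BASE that has no directory named "bin" below it.
list_without_bin() {
    base=${1:-/home}
    for d in "$base"/*/; do
        [ -d "$d" ] || continue      # the glob matched nothing
        # grep -q . succeeds only if find printed at least one path
        if ! find "$d" -type d -name bin | grep -q .; then
            printf '%s\n' "${d%/}"
        fi
    done
}
```

Calling list_without_bin /home should give the same set of directories as the one-liner.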
Another possibility might be to use nested finds, if you don't want to depend on bash:
$ find /home/ -mindepth 1 -maxdepth 1 -type d -not -exec sh -c 'find "$1" -type d -name bin | grep -q .' sh {} \; -print
/home/foo
/home/one
/home/two
This roughly duplicates the effect of the bash for loop above, but by nesting find within find -exec. It uses grep -q . to convert the output of find into an exit status that can be used as a condition for the outer find.
Note that since you're looking for a bin directory, we want to use test -d rather than test -e (which would also check for a bin file, which probably does not matter to you.)
Another option is to use bash process substitution. On multiple lines for easier reading:
cd /home/
comm -23 \
<(printf '%s\n' */ | sed 's|/.*||' | sort) \
<(find */ -type d -name bin | cut -d/ -f1 | sort -u)
This unfortunately requires you to change to the /home directory before running, because of the way it strips off subdirectories. You can of course collapse this into a big long one-liner if you feel so inclined.
This comm solution also has the risk of failing on directories with special characters in their names, like newlines.
One last option is bash-only but more than a one-liner. It involves subtracting the directories containing "bin" from the full list. It uses an associative array and globstar, so it depends on bash version 4.
#!/usr/bin/env bash
shopt -s globstar
# Go to our root
cd /home
# Declare an associative array
declare -A dirs=()
# Populate the array with our "full" list of home directories
for d in */; do dirs[${d%/}]=""; done
# Remove directories that contain a "bin" somewhere inside 'em
for d in **/bin/; do unset "dirs[${d%%/*}]"; done
# Print the result in reproducible form
declare -p dirs
# Or print the result just as a list of words.
printf '%s\n' "${!dirs[@]}"
Note that we're storing directories in the array index, which (1) makes it easy for us to find and delete items, and (2) ensures unique entries, even if one user has multiple "bin" directories under their home.

cd /home
find . -maxdepth 1 -type d ! -name . | sort > a
find . -type d -name bin | cut -d/ -f1,2 | sort > b
comm -23 a b
Here, I'm making two sorted lists. The first contains all the home directories, and the second contains the top parent of any bin subdirectory. Finally I output any items from the first list not present in the second.
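A variant of the same two-list idea, sketched as a function that keeps its scratch files out of /home by using mktemp. The function name homes_without_bin is hypothetical, and -mindepth/-maxdepth are GNU/BSD find extensions rather than POSIX. The -mindepth 2 is an addition so that a home directory that is itself called bin is not mistaken for a nested bin directory.

```shell
# homes_without_bin BASE (hypothetical helper): print ./name for every
# direct subdirectory of BASE with no nested directory called "bin".
# The body runs in a subshell, so the caller's directory is unchanged.
homes_without_bin() (
    cd "${1:-/home}" || exit 1
    all=$(mktemp) && with_bin=$(mktemp) || exit 1
    # List 1: every direct subdirectory, as ./name
    find . -maxdepth 1 -type d ! -name . | sort > "$all"
    # List 2: the top-level parent (./name) of every nested bin directory
    find . -mindepth 2 -type d -name bin | cut -d/ -f1,2 | sort -u > "$with_bin"
    # Lines that appear only in list 1
    comm -23 "$all" "$with_bin"
    rm -f "$all" "$with_bin"
)
```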

Related

How can I make a bash script where I can move certain files to certain folders which are named based on a string in the files?

This is the script that I'm using to move files with the string "john" in them (124334_john_rtx.mp4 , 3464r64_john_gty.mp4 etc) to a certain folder
find /home/peter/Videos -maxdepth 1 -type f -iname '*john*' -print0 | \
xargs -0 --no-run-if-empty echo mv --target-directory=/home/peter/Videos/john/
Since I have a large amount of videos with various names written in the files, I want to make a bash script which moves videos with a string between the underscores to a folder named based on the string between the underscores. So for example if a file is named 4345655_ben_rts.mp4 the script would identify the string "ben" between the underscores, create a folder named as the string between the underscores which in this case is "ben" and move the file to that folder. Any advice is greatly appreciated !

My way to do it :
cd /home/peter/Videos || exit # Change directory to your start directory
for name in $(ls *.mp4 | cut -d'_' -f2 | sort -u) # loops over the names found after the first underscore (assumes no whitespace in file names)
do
    mkdir -p "/home/peter/Videos/${name}" # create the target directory if it doesn't exist
    mv -- *_"${name}"_*.mp4 "/home/peter/Videos/${name}" # move the files
done

This bash loop should do what you need:
find dir -maxdepth 1 -type f -iname '*mp4' -print0 | while IFS= read -r -d '' file
do
if [[ $file =~ _([^_]+)_ ]]; then
TARGET_DIR="/PARENTPATH/${BASH_REMATCH[1]}"
mkdir -p "$TARGET_DIR"
mv "$file" "$TARGET_DIR"
fi
done
It'll only move the files if it finds a directory token.
I used _([^_]+)_ to make sure there is no _ in the dir name, but you didn't specify what you want if there are more than two _ in the file name. _(.+)_ will work if foo_bar_baz_buz.mp4 is meant to go into directory bar_baz.
And this answer to a different question explains the find | while logic: https://stackoverflow.com/a/64826172/3216427 .
EDIT: As per a question in the comments, I added mkdir -p to create the target directory. The -p means recursively create any part of the path that doesn't already exist, and will not error out if the full directory already exists.
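For shells without [[ =~ ]], the token can also be pulled out with plain parameter expansion. This is a rough sketch, not an exact equivalent of the regex: it only looks at the text between the first two underscores of the file name and gives up if that text is empty. The function name token_between_underscores is made up for this example.

```shell
# token_between_underscores PATH (hypothetical helper): print the text
# between the first and second underscore of the file name, or fail.
token_between_underscores() {
    name=${1##*/}            # drop any leading directories
    case $name in
        *_*_*) ;;            # needs at least two underscores
        *) return 1 ;;
    esac
    rest=${name#*_}          # text after the first underscore
    token=${rest%%_*}        # text before the next underscore
    [ -n "$token" ] || return 1
    printf '%s\n' "$token"
}
```

For example, token_between_underscores 4345655_ben_rts.mp4 prints ben, which could then be used as the directory name for mkdir -p and mv.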

recursively call a program to run on each subdirectory

I have a program which does something like
#!bin/bash
cd $1
tree $1
Where I run:
myprogram.sh subdir1
where subdir1 is a subdirectory of dir. However, I have subdir1, subdir2, subdir3... subdirN within dir.
How can I tell my program to run on every subdirectory of dir? Obviously my program does not just run tree; that is only to show that I pass a subdirectory on the command line, and my program uses the subdirectory name for a number of processes.
Thanks

Use find. For example find $1 -type d will return a list of all directories under $1, recursing as needed.
You can use it before your script with xargs or exec:
find DIR -type d -print0 | xargs -0 -n1 thescript.sh
or
find DIR -type d -exec thescript.sh {} \;
Both of the above are safe for strangely named directories.
If you want to use find inside your script and no directory names contain newlines, try:
#!/bin/bash
find "$1" -type d | while IFS= read -r d; do
    pushd "$d" > /dev/null || continue # like cd, but remembers where you came from
    tree . # <-- your code here; the current directory is now "$d"
    popd > /dev/null # go back to the starting point
done
If you only want direct subdirectories of the starting point, try adding -mindepth 1 -maxdepth 1 right after the starting directory in the find examples above (BSD find also accepts -depth 1, but GNU find does not).
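When only the direct subdirectories matter, a glob loop is another way to do it without find. The sketch below is plain POSIX sh; for_each_subdir is a hypothetical name, and the glob skips hidden directories (names starting with a dot).

```shell
# for_each_subdir DIR COMMAND [ARGS...] (hypothetical helper):
# run COMMAND once per direct subdirectory of DIR, passing the
# subdirectory path as the last argument.
for_each_subdir() {
    dir=$1
    shift
    for sub in "$dir"/*/; do
        [ -d "$sub" ] || continue    # the glob matched nothing
        "$@" "${sub%/}"
    done
}
```

With the script from the question this would be called as for_each_subdir dir ./myprogram.sh.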

Unix - Only list directories which contain a subdirectory

How can I print in the Unix shell the number of directories in a tree which contain other directories?
I haven't found a solution yet with commands like find or ls.

You can use find command: find . -type d -not -empty
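That will print every directory that is not empty. Note that a directory containing only files also counts as non-empty, so this is broader than "directories that contain other directories". You can control how deep the search goes with -maxdepth.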
To print the number, you can use wc -l.
find . -type d -not -empty | wc -l

If you generate a list of all the directories under a particular directory, and then remove the last component from the name, you have a list of the directories containing subdirectories, but there are likely to be repeats in that list. So, you need to post-process the list, yielding (as a first approximation):
find ${base:-.} -type d |
sed 's%/[^/]*$%%' |
sort -u
Find all the directories under the directory or directories listed in variable $base, defaulting to the current directory, and print their names. The code assumes you don't have directories with a newline in the name. If you do, there are fixes, but the best fix is to rename the directory. The sed command removes the last slash and everything after it. The sort eliminates duplicate entries. What's left is the list of directories containing subdirectories.
Well, more or less. There's the degenerate case to consider: the top-level directories in the list will be listed regardless of whether they have sub-directories or not. Fixing that is a bit harder. You need to eliminate any lines of output that exactly match the directories specified to find before removing trailing material. So, you need something like:
{
printf '\\#^%s$#d\n' ${base:-.}
echo 's%/[^/]*$%%'
} > sed.script
find ${base:-.} -type d |
sed -f sed.script |
sort -u
rm -f sed.script
The \\#^%s$#d assumes you don't use # in directory names. If you do use it, then you need to find a character you don't use in names (maybe Control-A) and use that in place of the #. If you could face absolutely any character, then you'll need to do more work escaping some obscure character, such as Control-A, when it appears in a directory name.
There's a problem still: using a fixed name like sed.script for a temporary file name is bad (for multiple reasons — such as two people trying to run the script at the same time in the same directory, though it can also be a security risk), so use mktemp to create a temporary file name:
tmp=$(mktemp ${TMPDIR:-/tmp}/dircnt.XXXXXX)
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
{
printf '\\#^%s$#d\n' ${base:-.}
echo 's%/[^/]*$%%'
} > $tmp
find ${base:-.} -type d |
sed -f $tmp |
sort -u
rm -f $tmp
trap 0
This deals with the most common signals (HUP, INT, QUIT, PIPE, TERM) and removes the temporary file even if one of those arrives.
Clearly, if you want to simply count the number of directories, you can pipe the output from the commands above through wc -l to get the count.
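If your find supports -mindepth (GNU and BSD find do; it is not in POSIX), the degenerate case can probably be sidestepped without a sed script: when the starting directory itself is never printed, it can only show up in the result as the parent of a real subdirectory. A sketch under that assumption, for a single base directory and names without newlines; dirs_with_subdirs is a hypothetical name.

```shell
# dirs_with_subdirs BASE (hypothetical helper): print every directory
# at or below BASE that contains at least one subdirectory.
dirs_with_subdirs() {
    find "${1:-.}" -mindepth 1 -type d |   # all directories strictly below BASE
        sed 's%/[^/]*$%%' |                # strip the last component: the parent
        sort -u                            # each parent once
}
```

Piping the result through wc -l gives the count, as before.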

ls -1d */*/. | cut -d / -f1 | uniq

Linux recursive copy files to its parent folder

I want to copy recursively files to its parent folder for a specific file extension. For example:
./folderA/folder1/*.txt to ./folderA/*.txt
./folderB/folder2/*.txt to ./folderB/*.txt
etc.
I checked cp and find commands but couldn't get it working.

I suspect that while you say copy, you actually mean to move the files up to their respective parent directories. It can be done easily using find:
$ find . -name '*.txt' -type f -execdir mv -n '{}' ../ \;
The above command recurses into the current directory . and then applies the following cascade of conditionals to each item found:
-name '*.txt' will filter out only files that have the .txt extension
-type f will filter out only regular files (eg, not directories that – for whatever reason – happen to have a name ending in .txt)
-execdir mv -n '{}' ../ \; executes the command mv -n '{}' ../ in the containing directory where the {} is a placeholder for the matched file's name and the single quotes are needed to stop the shell from interpreting the curly braces. The ; terminates the command and again has to be escaped from the shell interpreting it.
I have passed the -n flag to the mv program to avoid accidentally overwriting an existing file.
The above command will transform the following file system tree
dir1/
    dir11/
        file3.txt
        file4.txt
    dir12/
    file2.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
            file8.txt
        file7.txt
    file5.txt
dir3/
    file9.dat
file1.txt
into this one:
dir1/
    dir11/
    dir12/
    file3.txt
    file4.txt
dir2/
    dir21/
        file6.dat
    dir22/
        dir221/
        file8.txt
    file7.txt
dir3/
    file9.dat
file2.txt
file5.txt
To get rid of the empty directories, run
$ find . -type d -empty -delete
Again, this command will traverse the current directory . and then apply the following:
-type d this time filters out only directories
-empty filters out only those that are empty
-delete deletes them.
Fine print: -execdir is not specified by POSIX, though major implementations (at least the GNU and BSD one) support it. If you need strict POSIX compliance, you'll have to make do with the less safe -exec which would need additional thought to be applied correctly in this case.
Finally, please try your commands in a test directory with dummy files, not your actual data. Especially with the -delete option of find, you can lose all your data quicker than you might imagine. Read the man page and, if that is not enough, the reference manual of find. Never blindly copy shell commands from random strangers posted on the internet if you don't understand them.
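One way to follow that advice is to build a throwaway tree with mktemp -d and run the commands there first. The helper below is only a sketch: the name make_sandbox is made up, and the tree is a small subset of the example above.

```shell
# make_sandbox (hypothetical helper): create a scratch directory with a
# few dummy files and print its path.
make_sandbox() {
    sandbox=$(mktemp -d) || return 1
    mkdir -p "$sandbox/dir1/dir11" "$sandbox/dir1/dir12" "$sandbox/dir2/dir21"
    touch "$sandbox/dir1/dir11/file3.txt" \
          "$sandbox/dir1/dir11/file4.txt" \
          "$sandbox/dir2/dir21/file6.dat"
    printf '%s\n' "$sandbox"
}

# Possible use (not run here):
#   cd "$(make_sandbox)" && find . -name '*.txt' -type f -execdir mv -n '{}' ../ \;
#   find . | sort    # inspect the result before touching real data
```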

$ cp ./folderA/folder1/*.txt ./folderA
Try this command.

Run something like this from the root(ish) directory:
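#!/bin/bash
# Copy every regular file one level up, recursing into each subdirectory.
new_dir() {
    local i
    for i in ./*; do
        [[ -f "$i" ]] && cp "$i" ../
        # Recurse in a subshell so the caller's working directory is unchanged
        [[ -d "$i" ]] && ( cd "$i" && new_dir )
    done
    return 0
}
new_dir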
This will search each directory. When a file is encountered, it copies the file up a directory. When a directory is found, it will move down into the directory and start the process over again. I think it'll work for you.
Good luck.

unix bash - save environment variable and loop

Let's say you have a first.sh file in a directory: "/home/userbob/scripts/foo/". Basically I would like to know how to loop through specific directories, each time going back up to a higher level directory and repeating.
The .sh file has something like this pseudocode:
#!/bin/bash
curdi={$PATH} #where the first.sh file sits on the server
FOLDERS="$curdi/waffles/inner/
$curdi/pancakes/inner/
$curdi/bagels/inner/"
for f in $FOLDERS
do
cd $f
cp innerofinner/* .
cd $curdi
done
The idea is to somehow copy all the contents of /home/userbob/scripts/foo/waffles/inner/innerofinner to /home/userbob/scripts/foo/waffles/inner/
(and basically repeating just with the path having pancakes, bagels.etc.)
Can't do it for all directories (*) under /home/userbob/scripts/foo/ because there are some that I don't want to copy.

This should do it:
for name in waffles pancakes bagels
do
cp "$curdi/$name/inner/innerofinner/"* "$curdi/$name/inner"
done

Walking file trees? Sounds like a job for find!
#!/usr/bin/env bash
# only environment variables should be all-caps
dirs=({bagels,pancakes}/inner)
find "${dirs[@]}" -mindepth 1 -maxdepth 1 -type d -name innerofinner -execdir bash -c 'cp "$1"/* .' -- {} \;
I did a partial path and assumed a working directory of /home/userbob/scripts/foo. An absolute path would work, too, and would look like
dirs=(/home/userbob/scripts/foo/{bagels,pancakes}/inner)
This finds all directories exactly one level below the listed directories that are named "innerofinner" and, in their parent directories, executes bash with a simple cp script.
If you're wondering how this works, read below.
The dirs=() syntax creates an empty array named dirs. dirs=(a b) creates an array with a at index 0 and b at index 1. Any whitespace-delimited string will work here. In bash, {a,b,c} expands to a b c but A{a,b,c}B expands to AaB AbB AcB. So specifying {bagels,pancakes}/inner is just a way to say both bagels/inner and pancakes/inner without having to type as much.
A variable in bash can be expanded with $foo or with ${foo}; these are the same. An array can be expanded to all of its elements with ${foo[@]} (if you know perl or php this will make some sense), and quoting the expansion (always a good idea in shell!) prevents spaces inside the elements from being processed again by the shell. Thus, "${dirs[@]}" becomes bagels/inner pancakes/inner.
Knowing this we see that the find command has become find bagels/inner pancakes/inner -maxdepth 1 -mindepth 1 -type d -name innerofinner and if you execute this it will return exactly two lines: both full paths to each innerofinner directory. All we want now is to do something for each one, which -execdir does nicely.

Use a recursive function or invoke the script recursively.

I am not sure if I understand your problem statement correctly. Your pseudocode seems good. But I see a problem with the following line.
curdi={$PWD}
It does not give you the directory where the script resides but gives the directory you are in. If your script directory is in the path and you are running the script from your home directory then $curdi would point to your home directory and not the directory where your script resides. This will lead to undesired results.
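If the directory of the script itself is what you need, a common idiom is to derive it from "$0". The sketch below assumes the script is executed rather than sourced and is not reached through a symlink; script_dir is a hypothetical helper name.

```shell
# script_dir PATH (hypothetical helper): print the absolute directory
# that contains PATH. Runs in a subshell, so the caller does not move.
script_dir() {
    ( cd -- "$(dirname -- "$1")" > /dev/null && pwd )
}

# Typical use near the top of a script:
#   curdi=$(script_dir "$0")
```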

Incidentally, if you really wanted to do it in the way that your pseudo-script attempts it, you'd do it like this
#!/usr/bin/env bash
for f in "$PWD"/{waffles,pancakes,bagels}/inner ; do
cd "$f"
cp innerofinner/* .
# if you know for sure that it's one level up
cd ..
done
Presuming that $PWD is a good enough indicator of "current" directory for you. Me, I'd pass it in to the script.
#!/usr/bin/env bash
base="${1-$PWD}"
for f in "$base"/{waffles,pancakes,bagels}/inner ; do
cd "$f"
cp innerofinner/* .
cd ..
done
and call it like
breakfast.sh /home/userbob/scripts/foo/

find . \( -ipath '*waffles*innerofinner*' -o \
    -ipath '*pancakes*innerofinner*' -o \
    -ipath '*bagels*innerofinner*' \) \
    -type f \
    -exec sh -c 'cp "$1" "$(dirname "$(dirname "$1")")"' sh {} \;
Should do. Finds every file in the desired subdirs, then copies it based on its name.
HTH
