Bash script to check for directory name with condition - linux

I want my script to delete the files only in directories whose numeric name is greater than 1000 and to skip the others. My directories look like this:
/home/tester/100
/home/tester/1000
/home/tester/1020 # delete all files inside
/home/tester/2000 # delete all files inside
My bash script:
cd /home/tester
for dir in */ ; do
echo -n $dir": ";
find "$dir" -type f | wc -l;
if [ $dir -gt 1000 ]; then
cd $dir;
rm *;
cd ..;
fi
done
I got error on the if line and have no idea how to fix it ... Is it possible to do with bash script ?
Thank you for your help

for dir in */ ; do will set dir to things like "1000/" -- and the "/" makes it not a valid number. You can trim off the trailing "/" with ${dir%/}. I'd also recommend double-quoting it to prevent possible weird parsing:
if [ "${dir%/}" -gt 1000 ]; then
Note that if the directory name isn't a number (even after the "/" is removed), you'll get an error from the comparison, and the then clause won't run (which is probably what you want). If you want to handle other (non-numeric) directory names more gracefully, you should add some appropriate is-this-a-number test first.
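One possible shape for that is-this-a-number test, as a minimal sketch. The helper names is_uint and should_clean are hypothetical (they are not from the question), and the check only accepts plain unsigned decimal integers:

```bash
#!/usr/bin/env bash

# is_uint (hypothetical helper): succeed only if $1 is a non-empty
# string made up entirely of the digits 0-9.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# should_clean (hypothetical helper): succeed if the directory name,
# with any trailing "/" removed, is numeric and greater than 1000.
should_clean() {
    local name=${1%/}
    is_uint "$name" && [ "$name" -gt 1000 ]
}
```

Inside the loop this could be used as if should_clean "$dir"; then ... fi, which skips non-numeric names silently instead of printing a comparison error.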
Also, using cd in scripts tends to be problematic, because if a cd fails for any reason, the rest of the script will continue running, but in the wrong place. This can cause all sorts of chaos. Consider what'd happen if one of the cd $dir commands fails: it'd run rm * in the /home/tester directory, deleting all the non-subdirectory files there, then it'd cd .., leaving it in /home. The next iteration would try to cd down to something like 2000, which doesn't exist under /home, so that cd would fail too, and then it'd delete all files in /home. This repeats indefinitely, potentially all the way up to running rm * in /, the root directory. Not good at all.
I recommend either putting error checks on cd commands, or just avoiding them entirely in favor of using explicit paths to files.
#!/bin/bash
cd /home/tester || {
echo "Couldn't cd to /home/tester, quitting here..." >&2
exit 1
}
for dir in */ ; do
echo -n "$dir: "
find "$dir" -type f | wc -l
if [ "${dir%/}" -gt 1000 ]; then
rm "$dir"/* # Explicit path -- the / is redundant, but won't hurt
fi
done
I've also added an explicit shebang line, double-quoted all the variable references (good general scripting hygiene), and removed the semicolons from the ends of lines (not needed in shell syntax).
Another recommendation: run your scripts through shellcheck.net -- it'll point out a lot of common mistakes like unquoted variable references and unchecked cds.

The value of $dir is not numeric. Add set -x at the top of your script to debug.
Use "$(basename "$dir")" to get the numeric value.

When I did not have my first cup of coffee, I would do
for dir in */ ; do
echo -n $dir": ";
find "$dir" -type f | wc -l;
done
mv /home/tester/1000 /home/tester/some_unique_name
rm /home/tester/[1-9][0-9][0-9][0-9]/*
mv /home/tester/some_unique_name /home/tester/1000
This will not work when you have directories > 9999.
Perhaps rm /home/tester/[1-9][0-9][0-9][0-9]*/* will work, when you don't have directories like 1000backup or 2000my_unique_name.
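A better solution is the following (the echo only prints the rm commands, so you can review them before running them):
find /home/tester -regextype sed -regex '/home/tester/[0-9]\{4,\}' ! -name 1000 |
xargs -I{} echo rm '{}/*'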
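The pattern-based attempts above have trouble with names above 9999 or with suffixes such as 1000backup. A numeric comparison avoids both. The following is only a sketch: clean_above is a hypothetical function name, it assumes a find that supports -maxdepth (GNU and BSD find do), and it defaults to a dry run that prints what would be removed.

```bash
#!/usr/bin/env bash

# clean_above BASE LIMIT [DRY_RUN]   (hypothetical helper)
# For every subdirectory of BASE whose name is a plain number greater
# than LIMIT, list (DRY_RUN=1, the default) or delete (DRY_RUN=0) the
# regular files directly inside it. Explicit paths are used; no cd.
clean_above() {
    local base=${1%/} limit=$2 dry_run=${3:-1} dir name
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue          # glob matched nothing
        dir=${dir%/}
        name=${dir##*/}
        case $name in
            ''|*[!0-9]*) continue ;;       # skip non-numeric names
        esac
        [ "$name" -gt "$limit" ] || continue
        if [ "$dry_run" = 1 ]; then
            find "$dir" -maxdepth 1 -type f -print
        else
            find "$dir" -maxdepth 1 -type f -exec rm -f -- {} +
        fi
    done
}
```

Calling clean_above /home/tester 1000 would only print the candidate files; clean_above /home/tester 1000 0 would actually delete them.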

Related

How to remove all but a few selected files in a directory?

I want to remove all files in a directory except some through a shell script. The name of files will be passed as command line argument and number of arguments may vary.
Suppose the directory has these 5 files:
1.txt, 2.txt, 3.txt, 4.txt, 5.txt
I want to remove two files from it through a shell script using file name. Also, the number of files may vary.
There are several ways this could be done, but the one that's most robust and highest performance with large directories is probably to construct a find command.
#!/usr/bin/env bash
# first argument is the directory name to search in
dir=$1; shift
# subsequent arguments are filenames to absolve from deletion
find_args=( )
for name; do
find_args+=( -name "$name" -prune -o )
done
if [[ $dry_run ]]; then
exec find "$dir" -mindepth 1 -maxdepth 1 "${find_args[@]}" -print
else
exec find "$dir" -mindepth 1 -maxdepth 1 "${find_args[@]}" -exec rm -f -- '{}' +
fi
Thereafter, to list files which would be deleted (if the above is in a script named delete-except):
dry_run=1 delete-except /path/to/dir 1.txt 2.txt
or, to actually delete those files:
delete-except /path/to/dir 1.txt 2.txt
A simple, straightforward way could be using the GLOBIGNORE variable.
GLOBIGNORE is a colon-separated list of patterns defining the set of filenames to be ignored by pathname expansion. If a filename matched by a pathname expansion pattern also matches one of the patterns in GLOBIGNORE, it is removed from the list of matches.
Thus, the solution is to iterate through the command line args, appending file names to the list. Then call rm *. Don't forget to unset GLOBIGNORE var at the end.
#!/bin/bash
for arg in "$@"
do
if [ "$arg" = "$1" ]
then
GLOBIGNORE=$arg
else
GLOBIGNORE=${GLOBIGNORE}:$arg
fi
done
rm *
unset GLOBIGNORE
*In case you had set GLOBIGNORE before, you can just store the val in a tmp var then reset it at the end.
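Instead of saving and restoring GLOBIGNORE by hand, the change can be confined to a subshell. This is a sketch under assumptions: remove_except is a hypothetical name, file names are assumed not to contain ":" or glob characters (GLOBIGNORE treats its entries as patterns), and only regular files are removed.

```bash
#!/usr/bin/env bash

# remove_except DIR KEEP...   (hypothetical helper)
# Delete every regular file in DIR except the names listed in KEEP.
# The whole body runs in a subshell, so the cd, IFS and GLOBIGNORE
# changes disappear when it finishes and nothing needs to be restored.
remove_except() {
    local dir=$1
    shift
    (
        cd "$dir" || exit 1
        IFS=:
        GLOBIGNORE="$*"        # joins the KEEP names with ":"
        # Note: a non-empty GLOBIGNORE also makes * match dotfiles,
        # so hidden files are removed too unless they are listed.
        for f in *; do
            if [ -f "$f" ]; then
                rm -f -- "$f"
            fi
        done
    )
}
```

For the example from the question, remove_except /path/to/dir 1.txt 2.txt would leave only 1.txt and 2.txt behind.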
We can accomplish this in pure Bash, without the need for any external tools:
#!/usr/bin/env bash
# build an associative array that contains all the filenames to be preserved
declare -A skip_list
for f in "$@"; do
skip_list[$f]=1
done
# walk through all files and build an array of files to be deleted
declare -a rm_list
for f in *; do # loop through all files
[[ -f "$f" ]] || continue # not a regular file
[[ "${skip_list[$f]}" ]] && continue # skip this file
rm_list+=("$f") # now it qualifies for rm
done
# remove the files
printf '%s\0' "${rm_list[@]}" | xargs -0 rm -- # Thanks to Charles' suggestion
This solution will also work for files that have whitespaces or glob characters in them.
Thanks all for your answers, I have figured out my solution. Below is the solution worked for me:
find /home/mydir -type f | grep -vw "goo" | xargs rm

No overwrite -n being ignored by cp command

I'm trying to cp some files on an OSX desktop that fit a pattern of five digits. It works, but I can't understand why the -n option is being ignored. I don't want to overwrite a file if it is already at the destination.
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 -exec cp -nr {} ./dest \;
Everything is copied, even though one directory is already in dest. How can I force no overwrite? This solution on Super User indicates that I could simply change the permissions on everything in dest to read only. But I feel like there must be a reason why my implementation of cp behaves inconsistently with what is on the man page, and there should therefore be a better way to solve the problem.
Also, the permissions for the file being overwritten are rwxr-xr-x (or 0755 if octal is your thing).
cp won't do what you want. You'll need to iterate over the output from find. Assuming you don't have spaces or other special characters in any of the paths you find (see below if you do):
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 | \
while read dir
do
target=./dest/$(basename $dir)
[ -d $target ] || cp -r $dir ./dest/
done
This works because while will keep executing what's between do and done as long as read returns success. The output from find is piped into read so every time read dir executes, it reads one line of output from find and assigns it to the dir variable.
When there are no more lines to be read from find, read returns failure and the loop terminates.
Inside the loop body, basename prints the last part of the path passed to it. In this case, the 5 digits.
The [ ... ] is shell lingo for running a conditional test. (Yes, [ is a command!) You could say test -d ... instead of [ -d ... ]. See e.g. https://unix.stackexchange.com/questions/99185/what-do-square-brackets-mean-without-the-if-on-the-left for more info on this)
-d ... returns success if the argument exists as a directory. Failure if not.
|| means or - so foo || bar executes bar only if foo fails.
So the loop body basically says:
let target be "dest/" + the basename of $dir
either such a directory already exists, or copy $dir into dest/
I hope that clarifies it a bit. It's a lot of shell lingo in very little code. Basically all of this information is in the bash manpage, though arguably in a much less accessible format.
If there's any chance that any of the paths found by find contain spaces or other special characters, then you'll need to add quoting and a few other bells & whistles:
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 -print0 | \
while IFS= read -r -d '' dir
do
target="./dest/$(basename "$dir")"
[ -d "$target" ] || cp -r "$dir" ./dest/
done
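The loop body can also be pulled out into a small function, which makes the "exists, or else copy" logic easier to reuse. This is just a sketch: copy_if_absent is a hypothetical name, and it uses -e rather than -d so that any existing entry with the same name blocks the copy.

```bash
#!/usr/bin/env bash

# copy_if_absent SRC DEST   (hypothetical helper)
# Copy SRC (file or directory) into the directory DEST, but only if
# DEST does not already contain an entry with the same base name.
copy_if_absent() {
    local src=$1 dest=$2 target
    target="$dest/$(basename "$src")"
    # [ -e ... ] succeeds if the target exists, so cp runs only on failure.
    [ -e "$target" ] || cp -R "$src" "$dest"/
}
```

With this in place, the quoted loop above could call copy_if_absent "$dir" ./dest for each path that find reports.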

using IF to see a directory exists if not do something

I am trying to move the directories from $DIR1 to $DIR2 if $DIR2 does not have the same directory name
if [[ ! $(ls -d /$DIR2/* | grep test) ]] is what I currently have.
then
mv $DIR1/test* /$DIR2
fi
First, it prints
ls: cannot access //data/lims/PROCESSING/*: No such file or directory
when $DIR2 is empty, although it still works.
Second, when I run the shell script twice, it won't move directories with a similar name. For example, $DIR1 contains test-1, test-2 and test-3. On the first run, all three directories move to $DIR2. After that I do mkdir test-4 in $DIR1 and run the script again. It does not move test-4, because my test thinks test-4 is already there, since I am grepping for anything containing test.
How can I work around this and move test-4?
Firstly, you can check whether or not a directory exists using bash's built in 'True if directory exists' expression:
test="/some/path/maybe"
if [ -d "$test" ]; then
echo "$test is a directory"
fi
However, you want to test if something is not a directory. You've shown in your code that you already know how to negate the expression:
test="/some/path/maybe"
if [ ! -d "$test" ]; then
echo "$test is NOT a directory"
fi
You also seem to be using ls to get a list of files. Perhaps you want to loop over them and do something if the files are not a directory?
dir="/some/path/maybe"
for test in $(ls "$dir");
do
if [ ! -d "$dir/$test" ]; then
echo "$test is NOT a directory."
fi
done
A good place to look for bash stuff like this is Machtelt Garrels' guide. His page on the various expressions you can use in if statements helped me a lot.
Moving directories from a source to a destination if they don't already exist in the destination:
For the sake of readability I'm going to refer to your DIR1 and DIR2 as src and dest. First, let's declare them:
src="/place/dir1/"
dest="/place/dir2/"
Note the trailing slashes. We'll append the names of folders to these paths so the trailing slashes make that simpler. You also seem to be limiting the directories you want to move by whether or not they have the word test in their name:
filter="test"
So, let's first loop through the directories in source that pass the filter; if they don't exist in dest let's move them there:
for dir in $(ls "$src" | grep "$filter"); do
if [ ! -d "$dest$dir" ]; then
mv "$src$dir" "$dest"
fi
done
I hope that solves your issue. But be warned, @gniourf_gniourf posted a link in the comments that should be heeded!
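A glob can do the same job without parsing the output of ls, which also keeps names with spaces intact. The following is a sketch rather than a drop-in replacement: move_new_dirs is a hypothetical name, and the filter is treated as a literal substring of the directory name.

```bash
#!/usr/bin/env bash

# move_new_dirs SRC DEST FILTER   (hypothetical helper)
# Move each directory in SRC whose name contains FILTER into DEST,
# unless DEST already has an entry with that name. Each directory is
# checked individually, so a new test-4 is moved even when test-1 to
# test-3 are already in DEST.
move_new_dirs() {
    local src=${1%/} dest=${2%/} filter=$3 path name
    for path in "$src"/*"$filter"*/; do
        [ -d "$path" ] || continue      # glob matched nothing
        path=${path%/}
        name=${path##*/}
        if [ ! -e "$dest/$name" ]; then
            mv -- "$path" "$dest"/
        fi
    done
}
```

For the question's layout this could be called as move_new_dirs "$DIR1" "$DIR2" test.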
If you need to mv some directories to another according to some pattern, then you can use find:
find . -type d -name "test*" -exec mv -t /tmp/target {} +
Details:
-type d - will search only for directories
-name "" - set search pattern
-exec - do something with find results
-t, --target-directory=DIRECTORY move all SOURCE arguments into DIRECTORY
There are many examples of exec or xargs usage.
And if you do not want to overwrite files, then add the -n option to the mv command:
find . -type d -name "test*" -exec mv -n -t /tmp/target {} +
-n, --no-clobber do not overwrite an existing file

A bash script to run a program for directories that do not have a certain file

I need a Bash script to execute a program for all directories that do not have a specific file, and to create the output file in the same directory. This program needs an input file, which exists in every directory with the name *.DNA.fasta. Suppose I have the following directories, which may also contain subdirectories:
dir1/a.protein.fasta
dir2/b.protein.fasta
dir3/anyfile
dir4/x.orf.fasta
I have started by finding the directories that don't have that specific file, whose name is *.protein.fasta.
in this case I want the dir3 and dir4 to be listed (since they do not contain *.protein.fasta)
I have tried this code:
find . -maxdepth 1 -type d \! -exec test -e '{}/*protein.fasta' \; -print
but it seems I missed something, because it does not work.
I also do not know how to proceed with the rest of the task.
This is a tricky one.
I can't think of a good solution. But here's a solution, nevertheless. Note that this is guaranteed not to work if your directory or file names contain newlines, and it's not guaranteed to work if they contain other special characters. (I've only tested with the samples in your question.)
Also, I haven't included a -maxdepth because you said you need to search subdirectories too.
#!/bin/bash
# Create an associative array
declare -A excludes
# Build an associative array of directories containing the file
while read line; do
excludes[$(dirname "$line")]=1
echo "excluded: $(dirname "$line")" >&2
done <<EOT
$(find . -name "*protein.fasta" -print)
EOT
# Walk through all directories, print only those not in array
find . -type d \
| while read line ; do
if [[ ! ${excludes[$line]} ]]; then
echo "$line"
fi
done
For me, this returns:
.
./dir3
./dir4
All of which are directories that do not contain a file matching *.protein.fasta. Of course, you can replace the last echo "$line" with whatever you need to do with these directories.
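A variant that avoids the associative array and copes with newlines in names is to ask find about each directory in turn. This is only a sketch: dirs_without is a hypothetical name, it assumes a find with -print0 and -maxdepth (GNU and BSD find have both), and it runs one extra find per directory, so it will be slower on large trees.

```bash
#!/usr/bin/env bash

# dirs_without ROOT PATTERN   (hypothetical helper)
# Print every directory under ROOT (including ROOT itself) that has no
# regular file matching PATTERN directly inside it.
dirs_without() {
    local root=$1 pattern=$2 dir
    find "$root" -type d -print0 |
    while IFS= read -r -d '' dir; do
        # Empty output from the inner find means "no matching file here".
        if [ -z "$(find "$dir" -maxdepth 1 -type f -name "$pattern" -print)" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Run from the parent of the sample directories, dirs_without . '*protein.fasta' should report the same three directories as above, though the order depends on find.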
Alternately:
If what you're really looking for is just the list of top-level directories that do not contain the matching file in any subdirectory, the following bash one-liner may be sufficient:
for i in *; do test -d "$i" && ( find "$i" -name '*protein.fasta' | grep -q . || echo "$i" ); done
#!/bin/bash
for dir in *; do
test -d "$dir" && ( find "$dir" -name '*protein.fasta' | grep -q . || Programfoo "$dir/$dir.DNA.fasta");
done

moving files to different directories

I'm trying to move media files from a specified directory to another directory, creating that directory if it does not exist, and to move the remaining files with other extensions to a second directory. My first problem is that my script is not making the new directory and is not moving the files. Also, what code can I use to move the files with other extensions to one directory?
This is what I have so far. Correct me where I'm wrong and help modify my script:
#!/bin/bash
From=/home/katy/doc
To=/home/katy/mo #directory where the media files will go
WA=/home/katy/do # directory where the other files will go
if [ ! -d "$To" ]; then
mkdir -p "$To"
fi
cd $From
find path -type f -name"*.mp4" -exec mv {} $To \;
I'd solve it somewhat like this:
#!/bin/bash
From=/home/katy/doc
To=/home/katy/mo # directory where the media files will go
WA=/home/katy/do # directory where the other files will go
cd "$From"
find . -type f \
| while read file; do
dir="$(dirname "$file")"
base="$(basename "$file")"
if [[ "$file" =~ \.mp4$ ]]; then
target="$To"
else
target="$WA"
fi
mkdir -p "$target/$dir"
mv -i "$file" "$target/$dir/$base"
done
Notes:
mkdir -p will not complain if the directory already exists, so there's no need to check for that.
Put double quotes around all filenames in case they contain spaces.
By piping the output of find into a while loop, you also avoid getting bitten by spaces, because read will read until a newline.
You can modify the regex according to taste, e.g. \.(mp3|mp4|wma|ogg)$.
In case you didn't know, $(...) will run the given command and stick its output back in the place of the $(...) (called command substitution). It is almost the same as `...` but slightly better (details).
In order to test it, put echo in front of mv. (Note that quotes will disappear in the output.)
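The loop above still trims leading and trailing whitespace and interprets backslashes, because it uses a plain read. A hardened variant could look like the sketch below. The function name sort_files is hypothetical, the extension list is taken from the regex example in the notes, the target directories are assumed to lie outside the source tree, and mv -n (skip existing files, supported by GNU and BSD mv) replaces the interactive mv -i.

```bash
#!/usr/bin/env bash

# sort_files FROM MEDIA OTHER   (hypothetical helper)
# Move media files found under FROM into MEDIA and everything else into
# OTHER, recreating the relative subdirectory structure in each target.
sort_files() {
    local from=${1%/} media=$2 other=$3 file rel target
    find "$from" -type f -print0 |
    while IFS= read -r -d '' file; do
        rel=${file#"$from"/}               # path relative to FROM
        case $rel in
            *.mp3|*.mp4|*.wma|*.ogg) target=$media ;;
            *)                       target=$other ;;
        esac
        mkdir -p "$target/$(dirname "$rel")" &&
            mv -n -- "$file" "$target/$rel"
    done
}
```

With the variables from the script above, this would be called as sort_files "$From" "$To" "$WA".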
cd $From
find . -type f -name "*.mp4" -exec mv {} $To \;
^^^
or
find $From -type f -name "*.mp4" -exec mv {} $To \;
^^^^^
cd $From
mv *.mp4 $To;
mv * $WA;

Resources