Check if files exist in multiple directories using wildcards - linux

I have ~10,000 directories. Most of them have a similarly named text file.
I would like to take these .txt files and copy them to a folder in the main directory, ALL_RESULTS. How can I accomplish this? What I have is below:
for d in *_directories/; do
#go into directory
cd "$d"
#check if file exists using wildcard, then copy it into ALL_RESULTS and print the name of
#directory out
if ls *SCZ_PGC3_GWAS.sumstats.gz*.txt 1> /dev/null 2>&1; then
cp *SCZ_PGC3_GWAS.sumstats.gz*.txt ../ALL_RESULTS && echo "$d"
#if file does not exist, print the name of the directory we're
#in
else
echo "$d"
echo "files do not exist"
cd ..
fi
done
I keep getting errors saying the directories themselves don't exist. What am I doing wrong?

All relative paths are interpreted relative to the directory you are currently in (the "current working directory"). Imagine you cd into the first directory. Then the loop runs again and tries to cd into the second directory, but that directory is not there: it sits one level up. In your script, cd .. only runs in the else branch, so after a successful copy you stay inside the directory you entered and every later cd fails. That is why the directories appear not to exist.
So you need to cd .. at the end of every loop iteration, outside the if, to go back to the directory you started from.
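One way to avoid forgetting the cd .. is to do the cd inside a subshell, so the change of directory is undone automatically. This is only a sketch: collect_results is a hypothetical helper name, while the file pattern and ALL_RESULTS come from the question.

```shell
#!/bin/bash
# collect_results is a hypothetical name used for illustration only.
# $1: destination directory (ALL_RESULTS in the question)
collect_results() {
    local dest d f found
    dest=$(cd "$1" && pwd) || return 1    # absolute path stays valid after cd
    for d in *_directories/; do
        [ -d "$d" ] || continue
        # The parentheses start a subshell: the cd ends with it, so the
        # loop itself never leaves the starting directory.
        (
            cd "$d" || exit 1
            found=0
            for f in *SCZ_PGC3_GWAS.sumstats.gz*.txt; do
                [ -f "$f" ] || continue   # skips the unexpanded pattern
                cp -- "$f" "$dest" && found=1
            done
            if [ "$found" -eq 1 ]; then
                echo "$d"
            else
                echo "$d: files do not exist"
            fi
        )
    done
}
```

Run from the main directory, it would be called as collect_results ALL_RESULTS.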
I have ~10,000 directories. ... I would like to take these .txt files and move them to a folder in the main directory, ALL_RESULTS
If you don't need to output anything, just use find for that with a proper regex. Doing ls and cd in a loop will be very slow. Something along these lines:
find . -maxdepth 2 -type f -regex '\./.*_directories/.*SCZ_PGC3_GWAS.sumstats.gz.*\.txt' -exec cp {} ALL_RESULTS \;
You can also add -v to cp to see what it copies.
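A variant of the same idea, sketched with -path and -name tests instead of a regex, and with {} + so that cp is started once per batch of files rather than once per file. It assumes GNU cp (for the -t option); copy_sumstats is a hypothetical name.

```shell
#!/bin/bash
# copy_sumstats is a hypothetical helper; it assumes GNU cp (-t).
# $1: directory holding the *_directories folders, $2: destination
copy_sumstats() {
    find "$1" -mindepth 2 -maxdepth 2 -type f \
        -path '*_directories/*' \
        -name '*SCZ_PGC3_GWAS.sumstats.gz*.txt' \
        -exec cp -v -t "$2" {} +
}
```

From the main directory this would be run as copy_sumstats . ALL_RESULTS.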

You are missing
shopt -s nullglob
and you should not parse the output of ls:
#!/bin/bash
shopt -s nullglob
for d in *_directories/; do
# check if file exists using wildcard, then copy it into ALL_RESULTS and print
# the name of directory
files=( "$d"*SCZ_PGC3_GWAS.sumstats.gz*.txt )
if (( ${#files[@]} )); then
cp "${files[@]}" ALL_RESULTS && echo "$d"
#if file does not exist, print the name of the directory we're
#in
else
echo "$d"
echo "files do not exist"
fi
done
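To see what nullglob changes, here is a small sketch (count_matches is a hypothetical helper). With nullglob set, a pattern that matches nothing expands to nothing, so the array is empty instead of holding the literal pattern.

```shell
#!/bin/bash
shopt -s nullglob

# count_matches is a hypothetical helper: it prints how many files in
# directory $1 match the pattern from the question.
count_matches() {
    local files=( "$1"/*SCZ_PGC3_GWAS.sumstats.gz*.txt )
    echo "${#files[@]}"
}
```

Without nullglob, the same function would report 1 for a directory with no matching files, because the unexpanded pattern itself becomes the only array element.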

Related

How to create a directory and copy files into it with bash

I have some files in Pictures\ with the extension *.png and directories like 12-21-20, 12-20-20. These directories were created with dir=mkdir $(date +'%m'-'%d'-'%Y')
At the end of the day I want to run a script which will create a folder $dir and copy all png files I've made for today into that folder. How can I do that? Any information you could give me would be greatly appreciated.
date +'%m-%d-%Y' is the date command outputting, e.g., 12-22-2020. $(..) is called command substitution; it captures the output of the date command so that it can be assigned to the variable dir.
To create a directory with the contents of $dir (e.g. 12-22-2020) you would use the mkdir command, providing the -p option to suppress the error if that directory already exists (and also create parent directories as necessary -- not relevant here). You want to ensure it succeeds before you attempt to copy files to the new directory, so you would use:
mkdir -p "$dir" || exit 1
Which simply exits if the command fails.
At this point, you can simply use cp (or preferably mv to move the files) from whatever source directory they currently reside in. That you can do with:
mv /path/to/source/dir/*.png "$dir"
Or the copy command,
cp -a /path/to/source/dir/*.png "$dir"
Both cp -a and mv will preserve the original file attributes (time, date, permissions, etc...).
From a script standpoint, you will either want to change to the directory above the new "$dir" or use the full path, e.g.
mv /path/to/source/dir/*.png "/path/to/$dir"
Short Example
If you want to provide the directory containing the .png files to move to "$dir" created with today's date, you could write a short script like the following. You provide the directory containing the .png files you would like to copy or move as the first argument to the script on the command-line, e.g. usage would be bash pngscript.sh /path/to/source/dir.
#!/bin/bash
[ -z "$1" ] && { ## validate one argument given for source directory
printf "usage: ./%s /path/to/images\n" "${0##*/}" >&2
exit 1
}
[ "$(ls -1 "$1"/*.png 2>/dev/null | wc -l)" -gt 0 ] || { ## validate png files in source dir
printf "error: no .png files in '%s'\n" "$1" >&2
exit 1
}
dir=$(date +'%m-%d-%Y') ## get current date
mkdir -p "$dir" || exit 1 ## create directory, exit on failure
mv "$1"/*.png "$dir" ## move .png files from source to "$dir"
(note: it will create the "$dir" directory below the current working directory and then move files from the path provided as the first argument (positional parameter) to the newly created directory. Change mv to cp -a if you want to leave a copy of the files in the original directory)
You can make the script executable with chmod +x pngscript.sh and then you can simply run it from the current directory as:
./pngscript.sh /path/to/source/dir
Let me know if you have further questions.
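As an alternative sketch, the check for .png files can be done with the glob itself instead of ls and wc. The function name move_todays_pngs is hypothetical; the date format and the mkdir/mv steps are the ones described above.

```shell
#!/bin/bash
# move_todays_pngs is a hypothetical helper name.
# $1: source directory containing the .png files
move_todays_pngs() {
    local src dir
    src="$1"
    set -- "$src"/*.png            # reuse the positional parameters as a list
    [ -e "$1" ] || {               # an unmatched glob stays literal and fails -e
        printf "error: no .png files in '%s'\n" "$src" >&2
        return 1
    }
    dir=$(date +'%m-%d-%Y')        # e.g. 12-22-2020
    mkdir -p "$dir" || return 1    # create the directory, stop on failure
    mv -- "$@" "$dir"              # move every matched file
}
```

As in the script above, the dated directory is created below the current working directory.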

Shell, For loop into directories and sub directory not working as intended

When trying to write a simple script that tars the files and moves them to another directory, I'm having problems implementing the recursive loop.
#!/bin/bash
for directorio in $1; do #my thought process: start for loop goes into dir
for file in $directorio; do #enter for loop for files
fwe="${file%.*}" #file_without_extension
tar -czPf "$fwe.tar.bz2" "$(pwd)" && mv "$fwe.tar.bz2" /home/cunha/teste
done
done
The script seems to be doing nothing
when it is called like: ./script.sh /home/blablabla
How could I get this fixed?
A simpler option is the following. It first lists all the directories under $1 in short_list.txt. The while loop then reads each directory name and archives that directory into /home/cunha/teste:
find "$1" -type d > short_list.txt
cdr=$(pwd)
while IFS= read -r line
do
cd "$line" || continue
base=$(basename "$PWD")
cd ..
tar -czf "/home/cunha/teste/$base.tar.gz" "$base"
cd "$cdr" || exit 1 # return to the start so the next relative path resolves
done < short_list.txt
rm -f short_list.txt
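If only the immediate subdirectories need to be archived, the temporary list and the cd calls can be dropped by letting tar change directory itself with -C. This is a sketch under that assumption; archive_subdirs is a hypothetical name and, unlike the answer above, it does not descend into nested directories.

```shell
#!/bin/bash
# archive_subdirs is a hypothetical helper name.
# $1: directory whose subdirectories are archived, $2: destination directory
archive_subdirs() {
    local src dest d base
    src="$1"
    dest="$2"
    for d in "$src"/*/; do
        [ -d "$d" ] || continue          # no subdirectories: nothing to do
        base=$(basename "$d")
        # -C makes tar enter $src first, so the archive holds "base/..."
        # rather than the full path.
        tar -czf "$dest/$base.tar.gz" -C "$src" "$base"
    done
}
```

For the question's layout this would be called as archive_subdirs /home/blablabla /home/cunha/teste.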

Command for moving subfolders with files, with keeping the original structure

I have a parent/ folder with a couple of subfolders in it. Structure:
/parent/
/subfolder_1/
- file_1.txt
- file_2.txt
/subfolder_2/
- file_3.txt
- file_4.txt
Now, I need to recursively move the contents of parent/ folder to the empty parent_tmp/ directory. Thing is, I need to keep the original folder structure in parent/.
Expected outcome after moving:
/parent/
/subfolder_1/
(empty)
/subfolder_2/
(empty)
/parent_tmp/
/subfolder_1/
- file_1.txt
- file_2.txt
/subfolder_2/
- file_3.txt
- file_4.txt
Normally, I would simply do
mv parent/* parent_tmp
but this will, of course, move the subfolders permanently.
Is there a way to adjust the mv command to keep the original structure of the source directory?
Note:
I realize that I can e.g. copy parent/ to parent_tmp, and then remove the files in parent/ subfolders. This is plan B to me.
You can use find from inside parent, assuming parent_tmp sits next to parent:
(cd parent && find . -type f -exec bash -c 'mkdir -p "../parent_tmp/${1%/*}" &&
mv "$1" "../parent_tmp/${1%/*}"' - {} \;)
You could copy the files
cp -r parent/* parent_tmp/
or create hard links (should be a lot faster for big files)
cp -l -r parent/* parent_tmp/
and then delete the original files
find parent -type f -delete
while keeping the directory structure.
Zip the content of the parent folder and Unzip it in the target folder.
Quick and Dirty:
I don't think you'll find a tool or option in the mv command to do what you want, but you should be able to achieve the desired goal by using find:
cd parent && while IFS= read -r file ; do dirname="$(dirname "$file")" ; mkdir -p "../parent_tmp/$dirname" ; mv "$file" "../parent_tmp/$file" ; done < <( find . -type f ) && cd -
Function
If you use this a lot, you can add the above to your ~/.bashrc like so (append to the end of the file):
alias mvkp=moveandkeep
moveandkeep() {
cd "$1" || return
while IFS= read -r file ;
do dirname="$(dirname "$file")" ;
mkdir -p "$2/$dirname" ;
mv "$file" "$2/$file" ;
done < <(find . -type f)
cd -
}
Now you could simply do the following: (Full path to directories required)
mvkp /home/user/parent /home/user/parent_tmp
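The same move can also be sketched without a read loop, by letting find hand the file names to a small inline sh script. This avoids trouble with unusual file names and accepts relative paths; move_keep_tree is a hypothetical name.

```shell
#!/bin/bash
# move_keep_tree is a hypothetical helper name.
# $1: source tree (directories are kept), $2: destination tree
move_keep_tree() {
    local src dest
    src="$1"
    # Create the destination and resolve it to an absolute path,
    # so it is still valid after the cd below.
    dest=$(mkdir -p "$2" && cd "$2" && pwd) || return 1
    (
        cd "$src" || exit 1
        # find passes batches of file names to the inline script;
        # the first argument after "sh" is the destination.
        find . -type f -exec sh -c '
            dest=$1; shift
            for file do
                mkdir -p "$dest/${file%/*}" && mv -- "$file" "$dest/$file"
            done' sh "$dest" {} +
    )
}
```

Usage would be move_keep_tree parent parent_tmp; the subfolders of parent remain in place, empty.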

using IF to see a directory exists if not do something

I am trying to move the directories from $DIR1 to $DIR2 if $DIR2 does not have the same directory name
if [[ ! $(ls -d /$DIR2/* | grep test) ]] is what I currently have.
then
mv $DIR1/test* /$DIR2
fi
First, it gives
ls: cannot access //data/lims/PROCESSING/*: No such file or directory
when $DIR2 is empty.
However, it still works.
Second,
when I run the shell script twice,
it doesn't let me move directories with a similar name.
For example,
in $DIR1 I have test-1 test-2 test-3.
When it runs for the first time, all three directories move to $DIR2.
After that I do mkdir test-4 in $DIR1 and run the script again.
It does not move test-4, because my check thinks that test-4 is already there, since I am grepping for anything containing test.
How can I get around this and move test-4?
Firstly, you can check whether or not a directory exists using bash's built in 'True if directory exists' expression:
test="/some/path/maybe"
if [ -d "$test" ]; then
echo "$test is a directory"
fi
However, you want to test if something is not a directory. You've shown in your code that you already know how to negate the expression:
test="/some/path/maybe"
if [ ! -d "$test" ]; then
echo "$test is NOT a directory"
fi
You also seem to be using ls to get a list of files. Perhaps you want to loop over them and do something if the files are not a directory?
dir="/some/path/maybe"
for test in "$dir"/*;
do
if [ ! -d "$test" ]; then
echo "$test is NOT a directory."
fi
done
A good place to look for bash stuff like this is Machtelt Garrels' guide. His page on the various expressions you can use in if statements helped me a lot.
Moving directories from a source to a destination if they don't already exist in the destination:
For the sake of readability I'm going to refer to your DIR1 and DIR2 as src and dest. First, let's declare them:
src="/place/dir1/"
dest="/place/dir2/"
Note the trailing slashes. We'll append the names of folders to these paths so the trailing slashes make that simpler. You also seem to be limiting the directories you want to move by whether or not they have the word test in their name:
filter="test"
So, let's first loop through the directories in source that pass the filter; if they don't exist in dest let's move them there:
for path in "$src"*"$filter"*/; do
[ -d "$path" ] || continue
dir=$(basename "$path")
if [ ! -d "$dest$dir" ]; then
mv "$src$dir" "$dest"
fi
done
I hope that solves your issue. But be warned, @gniourf_gniourf posted a link in the comments that should be heeded!
If you need to mv some directories to another location according to some pattern, then you can use find:
find . -type d -name "test*" -exec mv -t /tmp/target {} +
Details:
-type d - will search only for directories
-name "test*" - sets the search pattern
-exec - do something with find results
-t, --target-directory=DIRECTORY move all SOURCE arguments into DIRECTORY
There are many examples of exec or xargs usage.
And if you do not want to overwrite files, then add the -n option to the mv command:
find . -type d -name "test*" -exec mv -n -t /tmp/target {} +
-n, --no-clobber do not overwrite an existing file
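The test-4 scenario from the question can be walked through with a per-directory check, so that each directory is compared by its own name rather than by a grep for test. This is a sketch only; move_missing_dirs is a hypothetical name and the test* pattern is taken from the question.

```shell
#!/bin/bash
# move_missing_dirs is a hypothetical helper name.
# $1: source directory ($DIR1), $2: destination directory ($DIR2)
move_missing_dirs() {
    local src dest path name
    src="$1"
    dest="$2"
    for path in "$src"/test*/; do
        [ -d "$path" ] || continue       # pattern matched nothing
        name=$(basename "$path")
        if [ -e "$dest/$name" ]; then
            echo "skipping $name: already in $dest"
        else
            mv -- "${path%/}" "$dest/"   # strip the trailing slash, then move
        fi
    done
}
```

On a second run, test-1 to test-3 are skipped because they already exist in the destination, while a newly created test-4 is moved.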

Shell Script for renaming and relocating the files

I am working on something and need to solve the following. I am giving an analogous version of my problem.
Say we have a music directory, in which there are 200 directories corresponding to different movies. In each movie directory there are some music files.
Now, say a file music.mp3 is in folder movie.mp3 . I want to make a shell script that renames the file to movie_music.mp3 and puts it in a folder that I specify. Basically, all the files in the subdirectories are to be renamed and put in a new directory.
Any workaround for this?
This script receives two arguments: the source folder and the destination folder. It will move every file under any directory under the source directory to the new directory with the new filename:
#!/bin/sh
echo "Moving from $1 to $2"
for dir in "$1"/*; do
if [ -d "$dir" ]; then
for file in "$dir"/*; do
if [ -f "$file" ]; then
echo "${file} -> $2/`basename "$dir"`_`basename "${file}"`"
mv "${file}" "$2"/`basename "$dir"`_`basename "${file}"`
fi
done
fi
done
Here is a sample:
bash move.sh dir dir2
Moving from dir to dir2
dir/d1/f1 -> dir2/d1_f1
dir/d1/f2 -> dir2/d1_f2
dir/d2/f1 -> dir2/d2_f1
dir/d2/f2 -> dir2/d2_f2
Bash:
newdir=path/to/new_directory;
find . -type d |while read d; do
find "$d" -maxdepth 1 -type f |while read f; do
movie="$(basename "$d" |sed 's/\(\..*\)\?//')"
mv "$f" "$newdir/${movie}_$(basename "$f")";
done;
done
Assuming the following directory tree:
./movie1:
movie1.mp3
./movie2:
movie2.mp3
The following one-liner will create 'mv' commands you can use:
find ./ | grep "movie.*/" | awk '{print "mv "$1" "$1}' | sed 's/\(.*\)\//\1_/'
EDIT:
If your directory structure contains only the relevant directories, you can use the following grep instead:
grep "\/.*\/.*"
Note that it matches anything with at least one directory and one file. If you have multiple inner directories, it won't be good enough.
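For comparison, the rename-and-move step can be sketched with plain globs and parameter expansion, without find, grep or sed. The function name flatten_movies is hypothetical, and it assumes the music files sit exactly one level below the source directory, as in the question.

```shell
#!/bin/bash
# flatten_movies is a hypothetical helper name.
# $1: music directory containing one folder per movie, $2: destination
flatten_movies() {
    local src dest dir file movie
    src="$1"
    dest="$2"
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        movie=$(basename "$dir")
        for file in "$dir"*; do
            [ -f "$file" ] || continue
            # ${movie}_ needs the braces: $movie_ would be read as a
            # variable named movie_. ${file##*/} is the bare file name.
            mv -- "$file" "$dest/${movie}_${file##*/}"
        done
    done
}
```

Called as flatten_movies music path/to/new_directory, it turns music/movie/music.mp3 into path/to/new_directory/movie_music.mp3.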
