Using bash command to copy files from a subfolder to another - linux

I have the following structure:
.
├── dag_1
│   ├── dag
│   │   ├── current
│   │   └── deprecated
│   └── sparkjobs
│       ├── current
│       │   └── spark_3.py
│       └── deprecated
│           ├── spark_1.py
│           └── spark_2.py
└── dag_2
    ├── dag
    │   ├── current
    │   └── deprecated
    └── sparkjobs
        ├── current
        │   └── spark_3.py
        └── deprecated
            ├── spark_1.py
            └── spark_2.py
I want to create a new folder containing only the current Spark jobs. My expected output folder is:
.
├── dag_1
│   └── spark_3.py
└── dag_2
    └── spark_3.py
I've tried to use
find /mnt/c/Users/User/Test/ -type f -wholename "sparkjob/current" | xargs -i cp {} /mnt/c/Users/User/Test/output/
However, my command writes no files and returns no error. How can I solve this?

Use this. The install command takes the input file and copies it into another directory structure, transparently creating the whole tree of directories if necessary, as mkdir -p would:
(you need to add the wildcard * to the -wholename pattern for find to match files, and the folder is named sparkjobs, not sparkjob)
find . -type f -wholename "*/sparkjobs/current/*" -exec bash -c '
dir=${1#./} dir=${dir%%/*} file=${1##*/}
install -D "$1" "./$dir/$file"
' bash {} \;
Example of what is done:
install -D ./dag_2/sparkjobs/current/spark_3.py ./dag_2/spark_3.py
install -D ./dag_1/sparkjobs/current/spark_3.py ./dag_1/spark_3.py
The source path is only an example; a longer path is not a problem.
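To see what the three parameter expansions in that inline script produce, here is a minimal sketch on a single sample path. The path is a hypothetical stand-in for what find passes as $1; nothing is copied.

```shell
#!/usr/bin/env bash
# Hypothetical sample path, shaped like what find would pass as $1.
path=./dag_1/sparkjobs/current/spark_3.py

dir=${path#./}      # strip the leading "./"         -> dag_1/sparkjobs/current/spark_3.py
dir=${dir%%/*}      # strip the longest "/*" suffix   -> dag_1
file=${path##*/}    # strip the longest "*/" prefix   -> spark_3.py

printf '%s\n' "$dir" "$file" "./$dir/$file"
```

This prints dag_1, spark_3.py and ./dag_1/spark_3.py, one per line; the last one is the destination handed to install -D.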

First you should check what find returns by removing everything after |. You'll see find doesn't find any files. The reasons:
as the name implies, -wholename matches the whole name, so you need */sparkjob/current/*
according to your tree output, the folder is not named sparkjob but sparkjobs.
I'd start with something like this:
find /mnt/c/Users/User/Test/ -type f -wholename "*/sparkjobs/current/*" -print0 | while IFS= read -r -d '' file; do
echo mv "$file" "$(realpath "$(dirname "$file")"/../..)"
done
I added an echo so you can check all paths and commands are correct.
You may want to trade simplicity for performance. See https://mywiki.wooledge.org/BashFAQ/001 if performance is important (many files or frequent runs).
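The effect of the wildcards can be checked in a throwaway directory. This is a sketch only: the tree is created under mktemp -d as a stand-in for the real Test folder, and the counts are just the number of paths find prints.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)   # scratch directory standing in for the real tree
mkdir -p "$work/dag_1/sparkjobs/current" "$work/dag_1/sparkjobs/deprecated"
touch "$work/dag_1/sparkjobs/current/spark_3.py" \
      "$work/dag_1/sparkjobs/deprecated/spark_1.py"

# No wildcards: the pattern would have to equal the entire path, so nothing matches.
no_wild=$(find "$work" -type f -wholename "sparkjobs/current" | wc -l)
# Wildcards on both sides: matches every file below a sparkjobs/current folder.
with_wild=$(find "$work" -type f -wholename "*/sparkjobs/current/*" | wc -l)

echo "without wildcards: $no_wild"
echo "with wildcards:    $with_wild"
rm -rf "$work"
```

The first count is 0 and the second is 1, which is why the original command copied nothing without reporting an error.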

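You'll want to do the following (the ${f/.../} substitution is a bash feature, so the inline script is run with bash rather than sh):
mkdir ../new_folder
find . -type f \
-path '*/sparkjobs/current/*' \
-exec bash -c 'f=$1
new=${f/sparkjobs\/current\//}
dest="../new_folder/$(dirname "$new")"
mkdir -p "$dest"
cp -v "$f" "$dest"' bash '{}' \;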
‘./dag_1/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_1/spark_3.py’
‘./dag_2/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_2/spark_3.py’

This looks pretty straightforward, with $old_loc being the source tree and $new_loc the output folder:
for d in "$old_loc"/dag_*
do mkdir -p "$new_loc/${d##*/}"
cp "$d"/sparkjobs/current/spark_*.py "$new_loc/${d##*/}"
done
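A self-contained way to try that loop is to point old_loc and new_loc at scratch directories. In this sketch both are created with mktemp -d and filled with empty placeholder files; in real use they would be the existing tree and the wanted output folder.

```shell
#!/usr/bin/env bash
set -eu
old_loc=$(mktemp -d)   # stand-in for the source tree
new_loc=$(mktemp -d)   # stand-in for the output folder
for n in 1 2; do
    mkdir -p "$old_loc/dag_$n/sparkjobs/current" "$old_loc/dag_$n/sparkjobs/deprecated"
    touch "$old_loc/dag_$n/sparkjobs/current/spark_3.py" \
          "$old_loc/dag_$n/sparkjobs/deprecated/spark_1.py"
done

for d in "$old_loc"/dag_*; do
    mkdir -p "$new_loc/${d##*/}"                              # e.g. $new_loc/dag_1
    cp "$d"/sparkjobs/current/spark_*.py "$new_loc/${d##*/}/"
done

# List what ended up in the output folder, then clean up.
result=$(cd "$new_loc" && find . -type f | sort)
echo "$result"
rm -rf "$old_loc" "$new_loc"
```

Only ./dag_1/spark_3.py and ./dag_2/spark_3.py are listed; the deprecated jobs are left behind.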

Related

Move files to parent directory of current location

I have a lot of folders that have a folder inside them, with files inside. I want to move the 2nd level files into the 1st level and do so without knowing their names.
Simple example:
Before running a script:
/temp/1stlevel/test.txt
/temp/1stlevel/2ndlevel/test.rtf
After running a script:
/temp/1stlevel/test.txt
/temp/1stlevel/test.rtf
I'm getting very close but I'm missing something and I'm sure it's simple/stupid. Here's what I'm running:
find . -mindepth 3 -type f -exec sh -c 'mv -i "$1" "${1%/*}"' sh {} \;
Here's what that's getting me:
mv: './1stlevel/2ndlevel/test.rtf' and './1stlevel/2ndlevel/test.rtf' are the same file
Any suggestions?
UPDATE: George, this is great stuff, thank you! I'm learning a lot and taking notes. Using the mv command instead of the more complicated one is brilliant. Far from the first time I've been accused of doing something the hardest way possible!
However, while it works great with 1 set of folders, if I have more, it doesn't work as intended. Here's what I mean:
Before:
new
└── temp
    ├── Folder1
    │   ├── SubFolder1
    │   │   └── SubTest1.txt
    │   └── Test1.txt
    ├── Folder2
    │   ├── SubFolder2
    │   │   └── SubTest2.txt
    │   └── Test2.txt
    └── Folder3
        ├── SubFolder3
        │   └── SubTest3.txt
        └── Test3.txt
After:
new
└── temp
    └── Folder3
        ├── Folder1
        │   ├── SubFolder1
        │   └── Test1.txt
        ├── Folder2
        │   ├── SubFolder2
        │   └── Test2.txt
        ├── SubFolder3
        ├── SubTest1.txt
        ├── SubTest2.txt
        ├── SubTest3.txt
        └── Test3.txt
Desired:
new
└── temp
    ├── Folder1
    │   ├── SubFolder1
    │   ├── SubTest1.txt
    │   └── Test1.txt
    ├── Folder2
    │   ├── SubFolder2
    │   ├── SubTest2.txt
    │   └── Test2.txt
    └── Folder3
        ├── SubFolder3
        ├── SubTest3.txt
        └── Test3.txt
If one wanted to get fancy*:
new
└── temp
    ├── Folder1
    │   ├── SubTest1.txt
    │   └── Test1.txt
    ├── Folder2
    │   ├── SubTest2.txt
    │   └── Test2.txt
    └── Folder3
        ├── SubTest3.txt
        └── Test3.txt
I don't need to get fancy, though, 'cause later in my script I just remove empty folders.
BTW, that took me forever in Notepad++ to draw. What did you use?
Your find . -mindepth 3 -type f -exec sh -c 'mv -i "$1" "${1%/*}"' sh {} \;
attempt is very close to being right. 
A useful technique when debugging complex commands
is to insert echo statements to see what is happening. 
So, if we say
$ find . -mindepth 3 -type f -exec sh -c 'echo mv -i "$1" "${1%/*}"' sh {} \;
we get
mv -i ./Folder1/SubFolder1/SubTest1.txt ./Folder1/SubFolder1
mv -i ./Folder2/SubFolder2/SubTest2.txt ./Folder2/SubFolder2
mv -i ./Folder3/SubFolder3/SubTest3.txt ./Folder3/SubFolder3
which makes perfect sense — it’s finding all the files at depth 3 (and beyond),
stripping the last level off the pathname, and moving the file there. 
But mv (path_to_file) (path_to_directory) means
move the file into the directory.
So the command mv -i ./Folder1/SubFolder1/SubTest1.txt ./Folder1/SubFolder1
means move Folder1/SubFolder1/SubTest1.txt into Folder1/SubFolder1 —
but that’s where it already is. 
Therefore, you got error messages saying
that you were moving a file to where it already was.
As is clear from your illustration,
you want to move SubTest1.txt into Folder1. 
One quick fix is
$ find . -mindepth 3 -type f -exec sh -c 'mv -i "$1" "${1%/*}/.."' sh {} \;
which uses .. to go up from SubFolder1 to Folder1:
mv -i ./Folder1/SubFolder1/SubTest1.txt ./Folder1/SubFolder1/..
mv -i ./Folder2/SubFolder2/SubTest2.txt ./Folder2/SubFolder2/..
mv -i ./Folder3/SubFolder3/SubTest3.txt ./Folder3/SubFolder3/..
I believe that that’s bad style, although I can’t figure out quite why. 
I would prefer
$ find . -mindepth 3 -type f -exec sh -c 'mv -i "$1" "${1%/*/*}"' sh {} \;
which uses %/*/* to remove two components from the pathname of the file
to get what you really want,
mv -i ./Folder1/SubFolder1/SubTest1.txt ./Folder1
mv -i ./Folder2/SubFolder2/SubTest2.txt ./Folder2
mv -i ./Folder3/SubFolder3/SubTest3.txt ./Folder3
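The difference between the two suffix patterns is easy to check on one path without moving anything. A minimal sketch, using one of the paths from the listing above:

```shell
#!/usr/bin/env bash
f=./Folder1/SubFolder1/SubTest1.txt

one_up=${f%/*}      # drops "/SubTest1.txt"             -> ./Folder1/SubFolder1
two_up=${f%/*/*}    # drops "/SubFolder1/SubTest1.txt"  -> ./Folder1

echo "$one_up"
echo "$two_up"
```

The first line printed is the directory the file already lives in; the second is the directory it should be moved to.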
You can then use
$ find . -mindepth 2 -type d -delete
to delete the empty SubFolderN directories. 
If, through some malfunction, any of them is not empty,
find will leave it alone and issue a warning message.
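Both steps can be rehearsed in a scratch copy of the three-folder layout from the question. This is a sketch under that assumption: the tree is built under mktemp -d with empty files, and -delete is used as a find extension (it is available in GNU and BSD find).

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)   # scratch directory standing in for temp
cd "$work"
for n in 1 2 3; do
    mkdir -p "Folder$n/SubFolder$n"
    touch "Folder$n/Test$n.txt" "Folder$n/SubFolder$n/SubTest$n.txt"
done

# Step 1: move each depth-3 file up into its grandparent directory.
find . -mindepth 3 -type f -exec sh -c 'mv -i "$1" "${1%/*/*}"' sh {} \;
# Step 2: remove the SubFolderN directories, which are now empty.
find . -mindepth 2 -type d -delete

result=$(find . -mindepth 1 | sort)
echo "$result"
cd / && rm -rf "$work"
```

Afterwards each FolderN holds TestN.txt and SubTestN.txt side by side, and no SubFolderN directory remains.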
Let me use this example to illustrate:
Tree structure:
new
└── temp
    └── 1stlevel
        ├── 2ndlevel
        │   └── text.rtf
        └── test.txt
Move with:
find . -mindepth 4 -type f -exec mv {} ./*/* \;
Result after move:
new
└── temp
    └── 1stlevel
        ├── 2ndlevel
        ├── test.txt
        └── text.rtf
Where you run it from matters. I am running from one folder above the temp folder; if you want to run it from the temp folder, the command would be:
find 1stlevel/ -mindepth 2 -type f -exec mv {} ./* \;
Or:
find ./ -mindepth 3 -type f -exec mv {} ./* \;
Please look closely at the part find ./ -mindepth 3, and remember that -mindepth 1 means process all files except the starting points. So if you start from temp and are after a file in temp/1st/2nd/, you will reach it with -mindepth 3 starting at temp. Please see: man find.
For the destination I used ./*/*, which reads as "from the current directory (one up from temp; mine was new) down to temp, then 1stlevel", so:
./: => new folder
./*: => new/temp folder
./*/*: => new/temp/1stlevel
All of that is for the find command. Another trick is to use only the mv command, from the new folder:
mv ./*/*/*/* ./*/*
This is run from the new folder in my example (in other words from one folder up the temp folder). Make adjustments to run it at different levels.
To run from the temp folder:
mv ./*/*/* ./*
If you're bothered about time, since you mentioned you had a lot of files, the mv option beats the find option. See the time results for just three files:
find:
real 0m0.004s
user 0m0.000s
sys 0m0.000s
mv:
real 0m0.001s
user 0m0.000s
sys 0m0.000s
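One caveat with the glob-only mv form: the shell expands both globs before mv runs, and mv treats only the last argument as the destination. A sketch that prints the expanded command for a three-folder layout (built in a scratch directory, with hypothetical FolderN names taken from the question's update) shows what mv would actually receive:

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)   # scratch directory standing in for temp
cd "$work"
for n in 1 2 3; do
    mkdir -p "Folder$n/SubFolder$n"
    touch "Folder$n/SubFolder$n/SubTest$n.txt"
done

# Print the expanded command instead of running it.
expanded=$(echo mv ./*/*/* ./*)
echo "$expanded"
cd / && rm -rf "$work"
```

The output is mv ./Folder1/SubFolder1/SubTest1.txt ./Folder2/SubFolder2/SubTest2.txt ./Folder3/SubFolder3/SubTest3.txt ./Folder1 ./Folder2 ./Folder3, so ./Folder3 becomes the destination and Folder1 and Folder2 are moved into it. With a single first-level folder the destination glob expands to one directory and the command does what is intended.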
Update:
Since OP wants a script to access multiple folders, I came up with this:
#!/usr/bin/env bash
for i in ./*/*/*;
do
    if [[ -d "$i" ]];
    then
        # Move the files to the new location
        mv "$i"/* "${i%/*}/"
        # Remove the now-empty directory
        rm -rf "$i"
    fi
done
How to run: from the new folder, run ./move.sh. Remember to make the script executable with chmod +x move.sh.
Target directory structure:
new
├── move.sh
└── temp
    ├── folder1
    │   ├── subfolder1
    │   │   └── subtext1.txt
    │   └── test1.txt
    ├── folder2
    │   ├── subfolder2
    │   │   └── subtext2.txt
    │   └── test1.txt
    └── folder3
        ├── subfolder3
        │   └── subtext3.txt
        └── test1.txt
Get fancy result:
new
├── move.sh
└── temp
    ├── folder1
    │   ├── subtext1.txt
    │   └── text1.txt
    ├── folder2
    │   ├── subtext2.txt
    │   └── text2.txt
    └── folder3
        ├── subtext3.txt
        └── text3.txt
mv YOUR-FILE-NAME ../
It should work this way if you have write permissions.
Have your script navigate to each directory where you need the files moved "up," then you can have find find each file in the directory, then move them up one directory:
$ find . -type f -exec mv {} ../. \;

How to change all hidden folders/files to visible in a multiple sub directories

I have hundreds of sub directories in a directory that all have hidden files in them that I need to remove the period at the beginning of them to make them visible. I found a command to go into each directory and change them to make them visible but I need to know how to make this command work from one directory up.
rename 's/\.//;' .*
I have tried about an hour to modify this to work one level up but don't understand the Perl string enough to do it. If someone could help out I am sure it's simple and I just can't land on the right answer.
This requires a find that supports the + (can use \; instead, which will call rename multiple times), but even POSIX find specifies it:
find -mindepth 1 -depth -exec rename -n 's{/\.([^\/]*$)}{/$1}' {} +
The -depth option prevents directories from being renamed before all the files in them are renamed
-mindepth 1 prevents find from trying to rename the current directory (.).
-n is to just print what would be renamed instead of actually renaming (has to be removed to do the renaming).
The regular expression removes a period that immediately follows the final forward slash, that is, the leading dot of the last path component.
rename doesn't overwrite existing files, unless the -f ("force") option is used.
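For readers who don't read Perl regular expressions, the same path transformation can be sketched in plain bash. This is an illustration only, not what rename runs internally: unhide_path is a hypothetical helper name, and it assumes each path contains at least one slash, as paths printed by find do.

```shell
#!/usr/bin/env bash
# Print the path with a leading dot removed from its last component only.
unhide_path() {
    local path=$1 dir base
    dir=${path%/*}      # everything before the last slash
    base=${path##*/}    # last path component
    if [[ $base == .* ]]; then
        printf '%s/%s\n' "$dir" "${base#.}"
    else
        printf '%s\n' "$path"   # not hidden: leave unchanged
    fi
}

unhide_path ./.dir1/.dir2/.file1   # prints ./.dir1/.dir2/file1
unhide_path ./dir5/.file5          # prints ./dir5/file5
unhide_path ./.dir1/file3          # prints ./.dir1/file3
```

Hidden parent directories are left alone in each call; they get renamed when find reaches them later, which is what -depth arranges.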
For a test directory structure like this:
.
├── .dir1
│   ├── .dir2
│   │   ├── .dir3
│   │   │   └── .file2
│   │   └── .file1
│   ├── file3
│   └── .file6
├── dir5
│   └── .file5
├── .file4
├── test1.bar
└── test1.foo
the output is
rename(./dir5/.file5, ./dir5/file5)
rename(./.file4, ./file4)
rename(./.dir1/.file6, ./.dir1/file6)
rename(./.dir1/.dir2/.file1, ./.dir1/.dir2/file1)
rename(./.dir1/.dir2/.dir3/.file2, ./.dir1/.dir2/.dir3/file2)
rename(./.dir1/.dir2/.dir3, ./.dir1/.dir2/dir3)
rename(./.dir1/.dir2, ./.dir1/dir2)
rename(./.dir1, ./dir1)
and the result after removing -n is
.
├── dir1
│   ├── dir2
│   │   ├── dir3
│   │   │   └── file2
│   │   └── file1
│   ├── file3
│   └── file6
├── dir5
│   └── file5
├── file4
├── test1.bar
└── test1.foo
safely_unhide:
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw( fileparse );
for (@ARGV) {
    my $o = $_;
    my ($fn, $dir_qfn) = fileparse($_);
    $fn =~ s/^\.//
        or next;
    my $n = "$dir_qfn/$fn";
    if (stat($n)) {
        warn("Skipping \"$o\": \"$n\" already exists\n");
        next;
    }
    elsif (!$!{ENOENT}) {
        warn("Skipping \"$o\": Can't stat \"$n\": $!\n");
        next;
    }
    # Rename the old (hidden) name to the new (visible) one.
    rename($o, $n)
        or warn("Skipping \"$o\": Can't rename to \"$n\": $!\n");
}
Usage:
find -type f -exec safely_unhide {} + # Supports all file names. Requires GNU find
find -type f | xargs safely_unhide # Doesn't support newlines in file names.
find -type f -print0 | xargs -0 safely_unhide # Supports all file names.
Drop -type f and add -depth if you want to rename hidden dirs too.

bash script to rename following a pattern in subdirectories and make a copy

I am trying to do an iterative renaming of certain files in all directories.
homefolder/folder1/ouput/XXXXX_ab.png
homefolder/folder1/ouput/XXXXX_abcdefg.png
homefolder/folder2/ouput/XXXXX_ab.png
homefolder/folder2/ouput/XXXXX_abcdefg.png
homefolder/folder3/ouput/XXXXX_ab.png
homefolder/folder3/ouput/XXXXX_abcdefg.png
...
homefolder/folder500/ouput/XXXXX_ab.png
homefolder/folder500/ouput/XXXXX_abcdefg.png
I want to take the folder name (e.g. folder1, folder2, ... folder500), use it as a prefix for the two png files, and remove the five Xs at the beginning of each file name.
The patterns of those png files are:
XXXXX_ab.png
XXXXX_abcdefg.png
So only the first five characters differ in each subdirectory, and they will be replaced by the name of the enclosing folder (folder1, folder2, ...).
The results will be:
homefolder/folder1/ouput/folder1_ab.png
homefolder/folder1/ouput/folder1_abcdefg.png
homefolder/folder2/ouput/folder2_ab.png
homefolder/folder2/ouput/folder2_abcdefg.png
homefolder/folder3/ouput/folder3_ab.png
homefolder/folder3/ouput/folder3_abcdefg.png
...
homefolder/folder500/ouput/folder500_ab.png
homefolder/folder500/ouput/folder500_abcdefg.png
At the end of the renaming, I want to create a copy of these two newly renamed files inside another folder in the homefolder, for example all_png_folder.
find . -iname "*_ab.png" -exec rename _ab.png folder1_ab.png '{}' \;
find . -name "*_ab.png" -exec cp {} ./all_png_folder \;
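Here is a start; the copying at the end should be a trivial addition.
#!/usr/bin/env bash
# Group the two -name tests so that -type f applies to both of them,
# and read NUL-delimited names so unusual file names are handled safely.
find . -type f \( -name "*_ab.png" -o -name "*_abcdefg.png" \) -print0 |
while IFS= read -r -d '' file; do
    # Top-level folder name, e.g. folder1 from ./folder1/ouput/xxxxx_ab.png
    foldername=$(cut -d '/' -f 2 <<< "$file")
    # The name of the png file minus the leading xxxxx_
    pngfile=$(basename "$file" | cut -d '_' -f 2-)
    destinationdir=$(dirname "$file")
    mv "$file" "$destinationdir/${foldername}_${pngfile}"
done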
Demo
$ tree
.
├── folder1
│   └── ouput
│       ├── foo_bar.png
│       ├── xxxxx_abcdefg.png
│       └── xxxxx_ab.png
├── folder2
│   └── ouput
│       ├── xxxxx_abcdefg.png
│       └── xxxxx_ab.png
└── rename.sh
4 directories, 6 files
$ ./rename.sh
$ tree
.
├── folder1
│   └── ouput
│       ├── folder1_abcdefg.png
│       ├── folder1_ab.png
│       └── foo_bar.png
├── folder2
│   └── ouput
│       ├── folder2_abcdefg.png
│       └── folder2_ab.png
└── rename.sh
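For the copy step the question also asks for, one possible sketch does the rename and the copy in a single loop, using parameter expansion instead of cut. The scratch tree, the YYYYY prefix and the empty placeholder files are assumptions for the demo; all_png_folder is the name from the question.

```shell
#!/usr/bin/env bash
set -eu
root=$(mktemp -d)   # scratch directory standing in for homefolder
cd "$root"
mkdir -p folder1/ouput folder2/ouput all_png_folder
touch folder1/ouput/XXXXX_ab.png folder1/ouput/XXXXX_abcdefg.png \
      folder2/ouput/YYYYY_ab.png folder2/ouput/YYYYY_abcdefg.png

for file in ./*/ouput/*_ab.png ./*/ouput/*_abcdefg.png; do
    [ -e "$file" ] || continue        # skip a pattern that matched nothing
    rel=${file#./}
    foldername=${rel%%/*}             # folder1, folder2, ...
    base=${file##*/}                  # e.g. XXXXX_ab.png
    new="${file%/*}/${foldername}_${base#*_}"
    mv -- "$file" "$new"              # rename in place
    cp -- "$new" all_png_folder/      # then copy the renamed file
done

result=$(ls all_png_folder)
echo "$result"
cd / && rm -rf "$root"
```

all_png_folder ends up with folder1_ab.png, folder1_abcdefg.png, folder2_ab.png and folder2_abcdefg.png. Because the folder name is part of each new file name, the copies from different folders do not overwrite each other.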

Linux/shell - Remove all (sub)subfolders from a directory except one

I've inherited a structure like the below, a result of years of spaghetti code...
gallery
├── 1
│   ├── deleteme1
│   ├── deleteme2
│   ├── deleteme3
│   └── full
│       ├── file1
│       ├── file2
│       └── file3
├── 2
│   ├── deleteme1
│   ├── deleteme2
│   ├── deleteme3
│   └── full
│       ├── file1
│       ├── file2
│       └── file3
└── 3
    ├── deleteme1
    ├── deleteme2
    ├── deleteme3
    └── full
        ├── file1
        ├── file2
        └── file3
In reality, this folder contains thousands of subfolders. I only need to keep ./gallery/{number}/full/* (i.e. the full folder and all files within it, from each numbered directory within gallery); everything else is no longer required and needs to be deleted.
Is it possible to construct a one-liner to handle this? I've experimented with find/maxdepth/prune but could not find an arrangement that met my needs.
(Update: To clarify, all folders contain files - none are empty)
Using PaddyD's answer, you can first delete the unwanted files and then delete the directories that are left empty:
find . -type f -not -path "./gallery/*/full/*" -exec rm {} + && find . -type d -empty -delete
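Since this deletes every file outside gallery/*/full below the current directory, it is worth rehearsing on a scratch copy first. A sketch with a small hypothetical gallery built under mktemp -d:

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)   # scratch directory; run the real command from the parent of gallery
cd "$work"
for n in 1 2; do
    mkdir -p "gallery/$n/full" "gallery/$n/deleteme1"
    touch "gallery/$n/full/file1" "gallery/$n/deleteme1/junk"
done

# Step 1: delete every file that is not under gallery/<n>/full/.
find . -type f -not -path "./gallery/*/full/*" -exec rm {} +
# Step 2: delete the directories that step 1 left empty.
find . -type d -empty -delete

remaining=$(find . -mindepth 1 | sort)
echo "$remaining"
cd / && rm -rf "$work"
```

Only the numbered directories, their full folders and the files inside them are left. Replacing -exec rm {} + with -print in step 1 gives a dry run that lists what would be removed.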
This can easily be done with bash extglobs, which allow matching all files that don't match a pattern:
shopt -s extglob
rm -ri ./gallery/*/!(full)
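What the !(full) pattern expands to can be previewed before anything is removed. A sketch on a scratch tree, with placeholder files and without the -i prompt so it runs unattended; note that shopt -s extglob has to be executed on an earlier line than the first use of the pattern, because bash needs the option on when it parses that line.

```shell
#!/usr/bin/env bash
set -eu
shopt -s extglob
work=$(mktemp -d)   # scratch directory standing in for the parent of gallery
cd "$work"
for n in 1 2; do
    mkdir -p "gallery/$n/full" "gallery/$n/deleteme1" "gallery/$n/deleteme2"
    touch "gallery/$n/full/file1" "gallery/$n/deleteme1/junk"
done

# Dry run: list everything in each numbered folder except "full".
matches=$(printf '%s\n' ./gallery/*/!(full))
echo "$matches"

# Real run, non-interactive here; keep -i when working on real data.
rm -r ./gallery/*/!(full)

remaining=$(find . -type f | sort)
echo "$remaining"
cd / && rm -rf "$work"
```

The dry run lists the deleteme directories of both numbered folders, and after the rm only ./gallery/1/full/file1 and ./gallery/2/full/file1 remain.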
How about:
find . -type d -empty -delete

Recursively remove directories inside folder on same level Linux

My structure is as follows:
├── Proj 1
│   ├── .git
│   ├── LICENSE
│   ├── README.md
│   └── example.cpp
├── Proj 2
│   ├── .git
│   ├── root_folder
│   └── README.md
├── Proj 3
│   ├── .git
│   ├── root_folder
│   └── README.md
...
Why is it that when I run rm -ri \.git it says:
rm: cannot remove `.git': No such file or directory
you could try
rm -ri */.git
(not sure that's what you want)
rm -r only recurses into the operands you name; it does not search below the current directory for entries called .git, so it is the wrong tool for finding and deleting them. With */.git, the -ri flags will probably prompt for each file beneath every .git folder, right?
Happily if you are using bash, a one-liner with find will do what you need:
find . -name .git -type d -exec bash -c 'read -p "$0: Delete? " -n 1 -r && echo "" && case $REPLY in y) rm -rf "$0" ;; esac' {} \; -prune
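A non-interactive variant of the same idea, as a sketch: list the .git directories first, then remove them in one go. The project folders are hypothetical and created in a scratch directory; -prune stops find from descending into a .git directory it is about to delete.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)   # scratch directory standing in for the folder holding the projects
cd "$work"
mkdir -p "Proj 1/.git/objects" "Proj 2/.git" "Proj 2/root_folder"
touch "Proj 1/README.md" "Proj 1/.git/config" "Proj 2/README.md"

# Dry run: show which directories would be removed.
found=$(find . -name .git -type d -prune -print | sort)
echo "$found"

# Real run: remove them, passing all matches to a single rm.
find . -name .git -type d -prune -exec rm -rf {} +

left=$(find . -name .git | wc -l)
echo "remaining .git directories: $left"
cd / && rm -rf "$work"
```

The dry run prints ./Proj 1/.git and ./Proj 2/.git, and the count afterwards is 0 while the README files and root_folder are untouched.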

Resources