Bash Script to Copy Folders by first character


I'm new to Ubuntu/Linux and have just gotten my first bash script to execute.
I'm trying to copy and organize my music collection from driveA to driveB.
driveA holds all my artist folders (e.g. Adele, Brian, Bob Marley, Cassie); the path to them is /media/myMusic.
On driveB I have created folders A, B, C, and the path to those is /media/orderedMusic.
Every artist folder on driveA whose first character is A, B or C should be copied to the matching folder on driveB, i.e. Adele would be copied to /media/orderedMusic/A,
Brian and Bob Marley would be copied to /media/orderedMusic/B, and so on.
Here is what I have so far; help would be highly appreciated. Thanks.
#!/bin/bash
folder1=/media/myMusic
folder2=/media/orderedMusic
for dir in $folder1
do
if []
then
cp
fi
done

This should do the trick:
#!/usr/bin/env bash
folder1=/media/myMusic
folder2=/media/orderedMusic
cd "$folder1" && {
for artist in *; do
dest=$folder2/${artist:0:1}
mkdir -p "$dest"
cp -rp "$artist" "$dest"
done
}
Note that if you are on a case-sensitive filesystem and have artist names that aren't capitalized, you will get separate folders in the destination tree for the two cases, e.g. an "A" folder and an "a" folder.
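If mixed-case names are a concern, one option is to upper-case the initial before building the destination path. This is only a minimal sketch, assuming bash 4 or newer (needed for the ${var^^} expansion); copy_by_initial is a hypothetical helper name, not something from the answer above.

```bash
#!/usr/bin/env bash
# Sketch: copy each artist folder into a folder named after its
# upper-cased first character. copy_by_initial is a hypothetical helper.
copy_by_initial() {
    local src=$1 dst=$2 path artist initial
    for path in "$src"/*/; do
        [ -d "$path" ] || continue      # the glob matched nothing
        path=${path%/}                  # drop the trailing slash
        artist=${path##*/}              # keep only the folder name
        initial=${artist:0:1}
        initial=${initial^^}            # bash 4+: "a" becomes "A"
        mkdir -p "$dst/$initial"
        cp -rp "$path" "$dst/$initial/"
    done
}

# Example call with the paths from the question:
# copy_by_initial /media/myMusic /media/orderedMusic
```

With this, "adele" and "Adele" would both land under the A folder.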

You could use substring extraction: ${string:start_index:length}:
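#!/bin/bash
folder1=/media/myMusic
folder2=/media/orderedMusic
# the glob must stay outside the quotes, otherwise it is not expanded
for src in "$folder1"/*/
do
    src=${src%/}            # drop the trailing slash
    name=${src##*/}         # folder name without the leading path
    initial=${name:0:1}
    dest="$folder2/$initial"
    # create the destination directory if it does not exist
    mkdir -p "$dest"
    cp -r "$src" "$dest"
done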
You could also use a string index, since you only need the first character of the string.
For more details, see http://tldp.org/LDP/abs/html/string-manipulation.html
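As a small illustration of the substring syntax, here are two ways to take the first character of a name. The function names are hypothetical; the second form uses only POSIX parameter expansion, in case the script has to run under a shell other than bash.

```bash
#!/usr/bin/env bash
# bash substring expansion: ${string:start_index:length}
first_char() {
    printf '%s\n' "${1:0:1}"
}

# POSIX alternative: remove everything after the first character
first_char_posix() {
    printf '%s\n' "${1%"${1#?}"}"
}

first_char "Bob Marley"         # prints: B
first_char_posix "Bob Marley"   # prints: B
```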

Related

Moving files to subfolders based on prefix in bash

I currently have a long list of files, which look somewhat like this:
Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton
Gmc_W_GCtl_E_Erz_Aue_Dl_254_toe_taixwon
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_201_head_xaubadan
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_262_bone_bainan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_261_blood_blodan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_281_heart_xerton
The names all follow the same pattern, and I mainly want to group the files by the part containing "Aue", "Homersdorf", "Peuschen", and so forth (there are many others down the list). The position of these keywords is always the same (e.g. they are all followed by Dl, and they all come after the fifth underscore).
All the files are in the same folder, and I am trying to move these files into subfolders based on these keywords in bash, but I'm not quite certain how. Any help on this would be appreciated, thanks!

I am guessing you want something like this:
$ find . -type f | awk -F_ '{system("mkdir -p "$5"/"$6";mv "$0" "$5"/"$6)}'
This will move, say, Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton into ./Erz/Aue/Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton (relative to the current directory).

Using the bash shell with a for loop.
#!/usr/bin/env bash
shopt -s nullglob
for file in Gmc*; do
[[ -d $file ]] && continue
IFS=_ read -ra dir <<< "$file"
echo mkdir -pv "${dir[4]}/${dir[5]}" || exit
echo mv -v "$file" "${dir[4]}/${dir[5]}" || exit
done
Place the script inside the directory in question, make it executable, and execute it.
As written it only prints the commands; remove the two echo words so that it actually creates the directories and moves the files.
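To see what the read -ra step produces, the splitting can be tried on a single name. A minimal sketch; target_dir is a hypothetical helper that prints the subfolder the loop above would use.

```bash
#!/usr/bin/env bash
# target_dir (hypothetical) splits a file name on "_" and joins
# the 5th and 6th parts into a relative directory path.
target_dir() {
    local parts
    IFS=_ read -ra parts <<< "$1"
    # array indices are zero-based, so parts[4] is the 5th field
    printf '%s/%s\n' "${parts[4]}" "${parts[5]}"
}

target_dir Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton   # prints: Erz/Aue
```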

bash script to create folders and move files

I have many files created from a simulation.
Like this: res_00001.root through res_09999.root.
I would like to move them, in sequence and in batches of 1000 files, into newly created folders chosen from the filename being moved, e.g. folder1 would contain res_00001.root through res_00999.root, folder2 res_01000.root through res_01999.root, ...
I attempted to create a script but it's not working:
#!/bin/bash
N_files=$1
for (( file=0; file<$N_files; ++file )) do #state what file I am looking at
s=file%1000
INPUT=`printf data/output_%04lu.root $file`
OUTPUT=`printf data/folder%02lu/res_%04lu.root $s`
# move the files
mv INPUT OUTPUT
done
I've been banging my head against this for some time; I appreciate any help you can provide.

Updated Answer
You can run this little script if you can't find the rename program - make backup first!
#!/bin/bash
shopt -s nullglob nocaseglob
for f in *.root; do
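# keep only the digits of the file name
n=$(tr -dc '0-9' <<< "$f")
# 10# forces base 10, so leading zeros are not read as octal
((d=(10#$n/1000)+1))
[ ! -d "folder$d" ] && mkdir "folder$d"
# dry run: remove the echo once the printed commands look right
echo mv "$f" "folder$d/$f"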
done
Original Answer
Make a backup and see if this helps you on a copy of a small subset of your files:
rename --dry-run 's/[^0-9]//g; my $d=int($_/1000)+1; $_="folder$d/res_$_.root"' *root
Sample Output
'res_00001.root' would be renamed to 'folder1/res_00001.root'
'res_00002.root' would be renamed to 'folder1/res_00002.root'
'res_00003.root' would be renamed to 'folder1/res_00003.root'
'res_00004.root' would be renamed to 'folder1/res_00004.root'
'res_00005.root' would be renamed to 'folder1/res_00005.root'
...
...
'res_00997.root' would be renamed to 'folder1/res_00997.root'
'res_00998.root' would be renamed to 'folder1/res_00998.root'
'res_00999.root' would be renamed to 'folder1/res_00999.root'
'res_01000.root' would be renamed to 'folder2/res_01000.root'
'res_01001.root' would be renamed to 'folder2/res_01001.root'
'res_01002.root' would be renamed to 'folder2/res_01002.root'
'res_01003.root' would be renamed to 'folder2/res_01003.root'
...
...
If it looks good, remove the --dry-run so it actually does stuff rather than just saying what stuff it would do!
s/[^0-9]//g gets rid of anything non-numeric in the filename
my $d=int($_/1000)+1 calculates the directory name
$_="folder$d/res_$_.root" builds the output filename
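The folder calculation itself can be checked in isolation in plain bash. A minimal sketch; folder_for is a hypothetical helper, and the 10# prefix is there because bash would otherwise treat a number with leading zeros as octal.

```bash
#!/usr/bin/env bash
# folder_for (hypothetical) prints the folder a file belongs to,
# using 1000 files per folder as in the answer above.
folder_for() {
    local digits=${1//[^0-9]/}      # res_01000.root -> 01000
    echo "folder$(( 10#$digits / 1000 + 1 ))"
}

folder_for res_00999.root   # prints: folder1
folder_for res_01000.root   # prints: folder2
```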

Copy numbered files to corresponding numbered directory using Linux bash commands or script

This should be a relatively straightforward problem, but I haven't found any answers on Stack Overflow. In a given directory, I have ~1000 files that are numbered (e.g. chem-0320.inp). I would like to cp each numbered file to a correspondingly numbered directory; all copied files will be given the same name. I would like to do this for a specified range of files (numbers 300-500, for example).
For example, I would like to copy chem-0320.inp to a directory named 320 and rename it mech.dat.
Another example: copy chem-0430.inp to a directory named 430 and rename it mech.dat.
Thanks in advance for your help!

The following script would do the work for you
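for file in *.inp
do
    # strip the prefix and the leading zero: chem-0320.inp -> 320
    dir=$(echo "$file" | sed -r 's/[^0-9]+0([0-9]+).*/\1/')
    # the target directory has to exist before copying
    mkdir -p "$dir"
    # the question asks for a copy, so use cp rather than mv
    cp "$file" "$dir/mech.dat"
done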

"cd" first to right dir. Subdirs will be created there.
#!/bin/bash
lo_limit=300
hi_limit=500
for file in ./*.inp
do
dir="${file//[^0-9]/}"
dir_cut="${dir:1:3}" # leading zero cut off
if [ $dir_cut -ge $lo_limit ] && [ $dir_cut -le $hi_limit ]; then
echo "$file $dir_cut"
mkdir -p "$dir_cut"
cp "$file" "$dir_cut"/mech.dat
fi
done
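The script above relies on the number always being four digits with exactly one leading zero. A variant that does not depend on the width could convert the digits to a number first. This is a sketch under that assumption; copy_range is a hypothetical helper that works in the current directory.

```bash
#!/usr/bin/env bash
# copy_range LO HI (hypothetical): copy every ./*.inp file whose number
# lies in [LO, HI] to <number>/mech.dat, creating the directory if needed.
copy_range() {
    local lo=$1 hi=$2 file digits n
    for file in ./*.inp; do
        [ -e "$file" ] || continue          # the glob matched nothing
        digits=${file//[^0-9]/}             # ./chem-0320.inp -> 0320
        [ -n "$digits" ] || continue        # name without any digits
        n=$(( 10#$digits ))                 # base 10, leading zeros dropped
        if (( n >= lo && n <= hi )); then
            mkdir -p "$n"
            cp "$file" "$n/mech.dat"
        fi
    done
}

# Example call for the range from the question:
# copy_range 300 500
```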

Find and delete files that contain same string in filename in linux terminal

I want to delete all files in a folder whose filename contains a non-unique numerical string, using the Linux terminal. E.g.:
werrt-110009.jpg => delete
asfff-110009.JPG => delete
asffa-123489.jpg => maintain
asffa-111122.JPG => maintain
Any suggestions?

I only now understand your question, I think. You want to remove all files that contain a numeric value that is not unique (in a particular folder). If a filename contains a value that is also found in another filename, you want to remove both files, right?
This is how I would do that (it may not be the fastest way):
# put all files in your folder in a list
# for array=(*) to work make sure you have enabled nullglob: shopt -s nullglob
array=(*)
delete=()
for elem in "${array[@]}"; do
# for each elem in your list extract the number
num_regex='([0-9]+)\.'
[[ "$elem" =~ $num_regex ]]
num="${BASH_REMATCH[1]}"
# use the extracted number to check if it is unique
dup_regex="[^0-9]($num)\..+?(\1)"
# if it is not unique, put the file in the files-to-delete list
if [[ "${array[@]}" =~ $dup_regex ]]; then
delete+=("$elem")
fi
done
# delete all found duplicates
for elem in "${delete[@]}"; do
rm "$elem"
done
In your example, array would be:
array=(werrt-110009.jpg asfff-110009.JPG asffa-123489.jpg asffa-111122.JPG)
And the result in delete would be:
delete=(werrt-110009.jpg asfff-110009.JPG)
Is this what you meant?
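An alternative that avoids matching against the whole file list is to count how often each number occurs and then report the files whose number occurs more than once. A minimal sketch, assuming bash 4 or newer (associative arrays) and that the number sits directly before the file extension; find_duplicates is a hypothetical helper that only prints the candidates, it does not delete anything.

```bash
#!/usr/bin/env bash
# find_duplicates DIR (hypothetical): print every file in DIR whose
# trailing number (the digits right before the extension) is shared
# with at least one other file.
find_duplicates() {
    local dir=$1 file num
    local -A count=()
    local re='([0-9]+)\.[^.]*$'

    # first pass: count how often each number occurs
    for file in "$dir"/*; do
        [[ ${file##*/} =~ $re ]] || continue
        num=${BASH_REMATCH[1]}
        count[$num]=$(( ${count[$num]:-0} + 1 ))
    done

    # second pass: print the files whose number is not unique
    for file in "$dir"/*; do
        [[ ${file##*/} =~ $re ]] || continue
        num=${BASH_REMATCH[1]}
        if [ "${count[$num]}" -gt 1 ]; then
            printf '%s\n' "$file"
        fi
    done
}

# Example call: review the list first, then delete in a second step.
# find_duplicates /path/to/folder
```

Printing first and deleting afterwards keeps the destructive step separate, which seems prudent for a script that removes files.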

You can use the Linux find command with the -regex and -delete options to do it in one command.
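A possible shape for such a command, as a sketch only: it assumes the number to remove is already known, and delete_matching is a hypothetical wrapper. Note that -regex is matched against the whole path, so the pattern below restricts the number to the file name part.

```bash
#!/usr/bin/env bash
# delete_matching DIR NUMBER (hypothetical): delete the regular files
# directly inside DIR whose file name contains NUMBER.
delete_matching() {
    local dir=$1 number=$2
    find "$dir" -maxdepth 1 -type f -regex ".*/[^/]*${number}[^/]*" -delete
}

# Example call; replacing -delete with -print first shows what would go.
# delete_matching /path/to/folder 110009
```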

Use "rm" command to delete all matching string files in directory
cd <path-to-directory>/ && rm *110009*
This deletes every file whose name contains the given string, regardless of where the string appears in the name.
I mention rm as another option for deleting files by matching string.
Below is a complete script for your requirement:
#!/bin/bash -eu
# [[ ... =~ ... ]] and BASH_REMATCH are bash features, so run this under bash, not sh
# provide the destination folder path
DEST_FOLDER_PATH="$1"
TEMP_BUILD_DIR="/tmp/$(date +%Y%m%d-%H%M%S)_cleanup_duplicate_files"
#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
clean_up()
{
    if [ -d "$TEMP_BUILD_DIR" ]; then
        rm -rf "$TEMP_BUILD_DIR"
    fi
}
trap clean_up EXIT
mkdir -p "$TEMP_BUILD_DIR"
TEMP_FILES_LIST_FILE="$TEMP_BUILD_DIR/folder_file_names.txt"
# the list is written once; rewriting it while the loop reads from it is unreliable
ls "$DEST_FOLDER_PATH" > "$TEMP_FILES_LIST_FILE"
# keep the pattern in a variable: a quoted pattern after =~ is matched literally
num_regex='([0-9]+)\.'
while read -r filename
do
    # check files with number pattern
    if [[ "$filename" =~ $num_regex ]]; then
        # fetch the number to find files with similar number
        matching_string="${BASH_REMATCH[1]}"
        # count the files whose name contains matching_string
        # (files already removed in an earlier iteration give a count of 0)
        if [ $(ls -1d "$DEST_FOLDER_PATH"/*"$matching_string"* 2>/dev/null | wc -l) -gt 1 ]; then
            rm "$DEST_FOLDER_PATH"/*"$matching_string"*
        fi
    fi
done < "$TEMP_FILES_LIST_FILE"
exit 0
How to execute this script:
1. Save the script to a file such as path-to-script/delete_duplicate_files.sh (you can name it whatever you want).
2. Make the script executable:
chmod +x {path-to-script}/delete_duplicate_files.sh
3. Run the script, passing the directory in which the duplicate files (files with a matching number pattern) should be deleted:
{path-to-script}/delete_duplicate_files.sh "{path-to-directory}"

BASH : merge two directories and delete duplicated data

I want to compare the content of two folders and delete duplicated data. I already wrote a bash script, but I don't think it is the right way to do it: I use loops to iterate over the directory contents and run a lot of diff commands, which makes it too time-consuming.
Here is the context.
I have two directories:
1-
dir1/
    Student1/
        homework1
        homework2
    Student2/
        homework1
        homework2
2-
dir2/
    Student1/
        homework1
        homework2
    Student3/
        homework1
        homework2
Suppose that the Student1/homework1 folder contains the same data in dir1 and dir2, unlike homework2, which contains different data.
The output directory should contain:
Student1
    homework1        //same name, same content ==> keep one homework
    homework2
    homework2_dir2   //same name, different content ==> _dir2
Student2
    homework1
    homework2
Student3
    homework1
    homework2
What do you think is the optimal way, in terms of time and reliability (filename problems, etc.), to do this kind of operation?
Thank you ;)
PS: dir*, Student* and homework* are directories.
PS2: PLEASE, I am not looking for this kind of answer:
loop over students
    loop over the student's homeworks
        test whether the homework exists
        diff the homework content
        if different, copy
    end
end
If I have a lot of students and a lot of homeworks with only one difference (only one homework that differs), the script takes a lot of time with the above solution.

Assuming that dir1 and dir2 are relative paths with no directories (i.e. no slashes in dir1 or dir2):
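dir1=dir1
dir2=dir2
# absolute path of the tree to compare against
CMPDIR=$(cd "$dir2" && pwd)
cd "$dir1" || exit
BASEDIR=$(pwd)
for studentdir in *
do
    cd "$BASEDIR/$studentdir" || continue
    for homeworkdir in *
    do
        cd "$BASEDIR/$studentdir/$homeworkdir" || continue
        for workfile in *
        do
            other="$CMPDIR/$studentdir/$homeworkdir/$workfile"
            # cmp -s prints nothing and returns non-zero when the files differ
            if [ -f "$other" ] && ! cmp -s "$workfile" "$other"
            then
                # alternate homework directory next to the original one
                altdir="../${homeworkdir}_${dir2}"
                mkdir -p "$altdir"
                ln "$other" "$altdir"
            fi
        done
    done
done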
I haven't tried this - there may be some typos.
In dir1, recurse into each student folder, and in each student folder into each homework directory.
In each homework directory, use cmp on each file to check whether it is byte identical with the matching file in the dir2 subtree.
If different, create an alternate homework directory in the student directory, and link (ln) the different file in to the alternate directory.
cmp is faster than diff; ln is faster than cp.
That's all, folks.

I'm not sure it's faster than your solution, as you didn't post it.
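#!/bin/bash
mkdir output
cp -r dir1/* output
cd dir2 || exit
for student in Student* ; do
    (
    cd "$student" || exit
    out_path=../../output/$student
    [[ -d $out_path ]] || mkdir "$out_path"
    for file in * ; do
        # homework entries are directories, so test with -e and compare recursively
        if [[ -e $out_path/$file ]] ; then
            # same name in both trees: keep a second copy only if the content differs
            diff -rq "$file" "$out_path/$file" > /dev/null \
                || cp -r "$file" "$out_path/${file}_dir2"
        else
            cp -r "$file" "$out_path/"
        fi
    done
    )
done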
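Since the question worries about running many separate diff commands, it may help that a single recursive diff can report every difference between the two trees in one pass. A minimal sketch; list_differences is a hypothetical helper, and the -q option (report only whether files differ) is assumed to be available, as it is in GNU and BSD diff.

```bash
#!/usr/bin/env bash
# list_differences A B (hypothetical): print one line per difference
# between the two trees. Entries that exist on one side only are
# reported as "Only in ...", files with different content as
# "Files ... and ... differ". Identical files produce no output.
# The exit status is diff's own: 0 if the trees match, 1 if they differ.
list_differences() {
    diff -rq "$1" "$2"
}

# Example call with the directories from the question:
# list_differences dir1 dir2
```

The output could then drive the copy step, so that only the homeworks that actually differ are touched.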

As far as I understand, you need to merge all files in two different directories into a new directory and you don't want duplicate files or folders.
Let's say you want to merge them into 'merged' directory.
You can do this:
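rsync -hrv /dir1/ /merged/
rsync -hrv /dir2/ /merged/
All files in the /dir1 folder will be copied into the /merged folder, then the same happens for the /dir2 folder. The trailing slash on the source matters: without it, rsync would create /merged/dir1 and /merged/dir2 instead of merging the contents. Note that a file present in both sources under the same name but with different content is overwritten by the /dir2 version; no separate _dir2 copy is made.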
