Copy numbered files to corresponding numbered directory using Linux bash commands or script - linux

This should be a relatively straightforward problem, but I haven't found any answers on Stack Overflow. In a given directory, I have ~1000 files that are numbered (e.g. chem-0320.inp). I would like to cp each numbered file to a correspondingly numbered directory; all copied files will be given the same name. I would like to do this for a specified range of file numbers (#'s 300-500, for example).
For example, I would like to copy chem-0320.inp to a directory named 320 and rename it mech.dat.
Another example: copy chem-0430.inp to a directory named 430 and rename it mech.dat.
Thanks in advance for your help!

The following script would do the work for you:
for file in *.inp
do
    # extract the number after the leading zero, e.g. chem-0320.inp -> 320
    dir=$(echo "$file" | sed -r 's/[^0-9]+0([0-9]+).*/\1/')
    mkdir -p "$dir"
    cp "$file" "$dir/mech.dat"
done

"cd" first to right dir. Subdirs will be created there.
#!/bin/bash
lo_limit=300
hi_limit=500
for file in ./*.inp
do
dir="${file//[^0-9]/}"
dir_cut="${dir:1:3}" # leading zero cut off
if [ $dir_cut -ge $lo_limit ] && [ $dir_cut -le $hi_limit ]; then
echo "$file $dir_cut"
mkdir -p "$dir_cut"
cp "$file" "$dir_cut"/mech.dat
fi
done
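The substring approach above assumes exactly four digits with a single leading zero. A minimal sketch of a variant that lets bash arithmetic strip any number of leading zeros follows. copy_range is a hypothetical helper name; the chem-*.inp pattern and the mech.dat target come from the question.

```shell
#!/usr/bin/env bash
# Sketch: copy chem-NNNN.inp into directory N as mech.dat for lo <= N <= hi.
# copy_range is a hypothetical helper, not part of the answers above.
copy_range() {
    local lo=$1 hi=$2 file digits num
    for file in chem-*.inp; do
        [[ -e $file ]] || continue      # the glob matched nothing
        digits=${file//[^0-9]/}         # "chem-0320.inp" -> "0320"
        [[ -n $digits ]] || continue    # no digits in the name
        num=$((10#$digits))             # force base 10, dropping leading zeros
        if (( num >= lo && num <= hi )); then
            mkdir -p "$num"
            cp -- "$file" "$num/mech.dat"
        fi
    done
}
```

Run it from the directory that holds the .inp files, for example as copy_range 300 500.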

Related

Moving files to subfolders based on prefix in bash

I currently have a long list of files, which look somewhat like this:
Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton
Gmc_W_GCtl_E_Erz_Aue_Dl_254_toe_taixwon
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_201_head_xaubadan
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_262_bone_bainan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_261_blood_blodan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_281_heart_xerton
The names all follow the same pattern, and I'm mainly seeking to group the files based on the part with "Aue", "Homersdorf", "Peuschen", and so forth (there are many others down the list). The position of these keywords is always the same (e.g. they are all followed by Dl; they all come after the fifth underscore, etc.).
All the files are in the same folder, and I am trying to move these files into subfolders based on these keywords in bash, but I'm not quite certain how. Any help on this would be appreciated, thanks!
I am guessing you want something like this:
$ find . -type f | awk -F_ '{system("mkdir -p "$5"/"$6";mv "$0" "$5"/"$6)}'
This will move, say, Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton into ./Erz/Aue/Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton (relative to the current directory).
Using the bash shell with a for loop.
#!/usr/bin/env bash
shopt -s nullglob
for file in Gmc*; do
[[ -d $file ]] && continue
IFS=_ read -ra dir <<< "$file"
echo mkdir -pv "${dir[4]}/${dir[5]}" || exit
echo mv -v "$file" "${dir[4]}/${dir[5]}" || exit
done
Place the script inside the directory in question, make it executable, and execute it.
Remove the echo commands so that it actually creates the directories and moves the files.
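To check how the underscore splitting maps a name to its subfolder before moving anything, the field extraction can be isolated in a small function. This is only a sketch; target_dir is a hypothetical name, and it assumes, as in the question, that the grouping keywords are the fifth and sixth underscore-separated fields.

```shell
#!/usr/bin/env bash
# target_dir (hypothetical helper): print "<field5>/<field6>" for a file name.
target_dir() {
    local -a parts
    IFS=_ read -ra parts <<< "$1"            # split the name on underscores
    (( ${#parts[@]} >= 6 )) || return 1      # too few fields to classify
    printf '%s/%s\n' "${parts[4]}" "${parts[5]}"   # arrays are zero-based
}
```

For the first sample name in the question, target_dir prints Erz/Aue; names with fewer than six fields make it return a non-zero status instead of producing a partial path.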

Create folders automatically and move files

I have a lot of daily files, sorted by hour, that come from a data logger (waveforms). I downloaded them onto a USB stick, and now I need to save them in folders named after the first 8 characters of the waveform name.
Those files have the following pattern:
Year-Month-Day-hourMinute-##.Code_Station_location_Channel
for example, inside the USB I have:
2020-10-01-0000-03.AM_REDDE_00_EHE; 2020-10-01-0100-03.AM_REDDE_00_EHE; 2020-10-02-0300-03.AM_REDDE_00_EHE; 2020-10-20-0000-03.AM_REDDE_00_EHE; 2020-10-20-0100-03.AM_REDDE_00_EHE; 2020-11-15-2000-03.AM_REDDE_00_EHE; 2020-11-15-2100-03.AM_REDDE_00_EHE; 2020-11-19-0400-03.AM_REDDE_00_EHE; 2020-11-19-0900-03.AM_REDDE_00_EHE;
I slightly modified code from @user3360767 (shell script to create folder daily with time-stamp and push time-stamp generated logs) to speed up the procedure of creating a folder and moving the files into it:
for filename in 2020-10-01*EHE; do
foldername=$(echo "$filename" | awk '{print (201001)}');
mkdir -p "$foldername"
mv "$filename" "$foldername"
echo "$filename $foldername" ;
done
2020-10-01*EHE
Here I list the files for all hours of 2020-10-01, starting with 2020-10-01-0000-03.AM_REDDE_00_EHE.
foldername=$(echo "$filename" | awk '{print (201001)}');
Here I set the folder name that belongs to 2020-10-01; the following lines create the folder and then move all the files into it.
mkdir -p "$foldername"
mv "$filename" "$foldername"
echo "$filename $foldername" ;
As you may notice, I need to modify the line for filename in 2020-10-01*EHE each time the date in the file names changes.
Is there a way to create folders named after the first 8 digits of the file name?
Tonino
Use date
And since the foldername doesn't change, you don't need to keep creating one inside the loop.
files="$(date +%Y-%m-%d)*EHE"
foldername=$(date +%Y%m%d)
mkdir -p "$foldername"
for filename in $files; do
mv "$filename" "$foldername"
echo "$filename $foldername"
done
Edit:
If you want to specify the folder each time, you can pass it as an argument and use sed to get the filename pattern
foldername=$1
files=$(echo $1 | sed 's/\(....\)\(..\)\(..\)/\1-\2-\3/')
filepattern="$files*EHE"
mkdir -p "$foldername"
for filename in $filepattern; do
mv "$filename" "$foldername"
echo "$filename $foldername"
done
You call it with
./<yourscriptname>.sh 20201001
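The sed conversion from a folder name to a file-name prefix can also be done with substring expansion, which avoids starting an extra process. A minimal sketch, assuming the argument is always eight digits in YYYYMMDD order; date_pattern is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# date_pattern (hypothetical helper): 20201001 -> 2020-10-01
date_pattern() {
    local d=$1
    # require exactly eight digits before slicing
    [[ $d =~ ^[0-9]{8}$ ]] || return 1
    printf '%s-%s-%s\n' "${d:0:4}" "${d:4:2}" "${d:6:2}"
}
```

Inside the script above, files=$(date_pattern "$1") would take the place of the echo and sed pipeline.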
I think you want to move all files whose names end in EHE into subdirectories. The subdirectories will be created as necessary and will be named according to the date at the start of each filename, without the dashes/hyphens.
Please test the following on a copy of your files in a temporary directory somewhere.
#!/bin/bash
for filename in *EHE ; do
# Derive folder by deleting all dashes from filename, then taking first 8 characters
folder=${filename//-/}
folder=${folder:0:8}
echo "Would move $filename to $folder"
# Uncomment next 2 lines to actually move file
# mkdir -p "$folder"
# mv "$filename" "$folder"
done
Sample Output
Would move 2020-10-01-0000-03.AM_REDDE_00_EHE to 20201001
Would move 2020-10-01-0100-03.AM_REDDE_00_EHE to 20201001
Note that the 2 lines:
folder=${filename//-/}
folder=${folder:0:8}
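use bash parameter expansion (called "parameter substitution" in some guides), which is described in the "Shell Parameter Expansion" section of the Bash manual if you want to learn about it. They obviate the need to create whole new processes to run awk, sed or cut to extract the fields.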
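The two expansions can be wrapped in a function that also rejects names that do not start with a date, so that a stray file is not moved into an oddly named folder. This is a sketch under that assumption; date_folder is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# date_folder (hypothetical helper): derive the YYYYMMDD folder from a file name.
date_folder() {
    local name=$1 folder
    folder=${name//-/}       # delete every dash: 2020-10-01-0000-03... -> 202010010000...
    folder=${folder:0:8}     # keep the first 8 characters: 20201001
    [[ $folder =~ ^[0-9]{8}$ ]] || return 1   # not a date prefix, refuse
    printf '%s\n' "$folder"
}
```

In the loop above, folder=$(date_folder "$filename") || continue would skip any file whose name does not begin with a date.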

Delete files in one directory that do not exist in another directory or its child directories

I am still a newbie in shell scripting and trying to come up with a simple code. Could anyone give me some direction here. Here is what I need.
Files in path 1: /tmp
100abcd
200efgh
300ijkl
Files in path2: /home/storage
backupfile_100abcd_str1
backupfile_100abcd_str2
backupfile_200efgh_str1
backupfile_200efgh_str2
backupfile_200efgh_str3
Now I need to delete file 300ijkl in /tmp as the corresponding backup file is not present in /home/storage. The /tmp file contains more than 300 files. I need to delete the files in /tmp for which the corresponding backup files are not present and the file names in /tmp will match file names in /home/storage or directories under /home/storage.
Appreciate your time and response.
You can also approach the deletion using grep. You can loop through the files in /tmp, checking with ls piped to grep, and delete each file for which there is no match:
#!/bin/bash
[ -z "$1" -o -z "$2" ] && { ## validate input
printf "error: insufficient input. Usage: %s tmpfiles storage\n" ${0//*\//}
exit 1
}
for i in "$1"/*; do
fn=${i##*/} ## strip path, leaving filename only
## if file in backup matches filename, skip rest of loop
ls -R "$2" | grep -qF -- "$fn" && continue
printf "removing %s\n" "$i"
# rm "$i" ## remove file
done
Note: the actual removal is commented out above; test and ensure there are no unintended consequences before performing the actual delete. Call it passing the path to tmp (without trailing /) as the first argument and /home/storage as the second argument:
$ bash scriptname /path/to/tmp /home/storage
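A variant of the same idea, sketched here, collects the backup names once with find, so that subdirectories of the storage path are searched, and only prints the candidates for deletion. list_orphans is a hypothetical name, and the sketch assumes, as in the question, that a backup "corresponds" to a file when its name contains the file's name.

```shell
#!/usr/bin/env bash
# list_orphans (hypothetical helper): print files in $1 whose name occurs in
# no file name anywhere under $2. Nothing is deleted.
list_orphans() {
    local tmp_dir=$1 storage_dir=$2 path name backups
    # Names of all backup files, one per line, including subdirectories.
    backups=$(find "$storage_dir" -type f -exec basename {} \;)
    for path in "$tmp_dir"/*; do
        [[ -f $path ]] || continue
        name=${path##*/}
        # -F treats the name as a fixed string rather than a regex
        grep -qF -- "$name" <<< "$backups" || printf '%s\n' "$path"
    done
}
```

Review the printed list first; it can then be fed to rm once it looks right.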
You can solve this by making a list of the files in /home/storage, then testing each filename in /tmp to see if it is in that list.
Given the linux+shell tags, one might use bash: make the list of files from /home/storage an associative array, with the filename as the subscript of the array.
Here is a sample script to illustrate ($1 and $2 are the parameters to pass to the script, i.e., /home/storage and /tmp):
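#!/bin/bash
# $1 = directory holding the backups, $2 = directory to clean up
declare -A InTarget
# index every file under $1 by its basename
while IFS= read -r path
do
    name=${path##*/}
    InTarget[$name]=$path
done < <(find "$1" -type f)
# remove files under $2 whose exact name is not in the index
while IFS= read -r path
do
    name=${path##*/}
    [[ -z ${InTarget[$name]} ]] && rm -f "$path"
done < <(find "$2" -type f)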
It uses two interesting shell features:
name=${path##*/} is a POSIX shell feature which allows the script to perform the basename function without an extra process (per filename). That makes the script faster.
done < <(find $2 -type f) is a bash feature which lets the script read the list of filenames from find without making the assignments to the array run in a subprocess. Here the reason for using the feature is that if the array is updated in a subprocess, it would have no effect on the array value in the script which is passed to the second loop.
For related discussion:
Extract File Basename Without Path and Extension in Bash
Bash Script: While-Loop Subshell Dilemma
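The subshell effect described above can be seen with a counter instead of an array. The sketch below assumes bash with default options (in particular, lastpipe not enabled); both function names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Piping into while runs the loop in a subshell, so n is updated in a copy.
count_with_pipe() {
    local n=0
    printf 'a\nb\nc\n' | while read -r _; do n=$((n + 1)); done
    echo "$n"
}

# With process substitution the loop runs in the current shell.
count_with_procsub() {
    local n=0
    while read -r _; do n=$((n + 1)); done < <(printf 'a\nb\nc\n')
    echo "$n"
}
```

Under those assumptions count_with_pipe prints 0, because the increments are lost when the subshell exits, while count_with_procsub prints 3.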
I spent some really nice time on this today because I needed to delete files which have same name but different extensions, so if anyone is looking for a quick implementation, here you go:
#!/bin/bash
# We need a reference list of the files we want to keep rather than delete.
# Assume the first folder holds .jpeg files and the second holds .pdf files,
# so map the reference names to the desired file extension first.
FILES_TO_KEEP=$(ls -1 "$2" | sed 's/\.pdf$/.jpeg/')
# iterate through the files in the first argument path
for file in "$1"/*; do
# In my case, I did not want to do anything with directories, so let's continue cycle when hitting one.
if [[ -d $file ]]; then
continue
fi
# strip the path from the iterated file with basename so we can compare it to the files we want to keep
NAME_WITHOUT_PATH=$(basename "$file")
# I use a Mac, whose command-line tools are limited
# when it comes to operating on strings;
# this should be a safe check to see whether FILES_TO_KEEP contains NAME_WITHOUT_PATH
if [[ $FILES_TO_KEEP == *"$NAME_WITHOUT_PATH"* ]];then
echo "Not deleting: $NAME_WITHOUT_PATH"
else
# If it does not contain file from the other directory, remove it.
echo "deleting: $NAME_WITHOUT_PATH"
rm -f "$file"
fi
done
Usage: bash deleteDifferentFiles.sh path/from/where path/source/of/truth

Bash Script to Copy Folders by first character

I'm a newbie to ubuntu/Linux, and just got my first bash script to execute.
I'm trying to copy and organize my music collection from driveA to driveB.
driveA has all my artist folders (e.g. Adele, Brian, Bob Marley, Cassie); the path to these is /media/myMusic.
In driveB I have created folders A, B, C; the path to those is /media/orderedMusic.
All artist folders in driveA whose first character is A, B, or C will be copied to the respective folders in driveB, i.e. Adele would be copied to /media/orderedMusic/A,
Brian and Bob Marley would be copied to /media/orderedMusic/B, and so on.
Here is what I have so far; help would be highly appreciated. Thanks
#!/bin/bash
folder1=/media/myMusic
folder2=/media/orderedMusic
for dir in $folder1
do
if []
then
cp
fi
done
This should do the trick:
#!/usr/bin/env bash
folder1=/media/myMusic
folder2=/media/orderedMusic
cd "$folder1" && {
for artist in *; do
dest=$folder2/${artist:0:1}
mkdir -p "$dest"
cp -rp "$artist" "$dest"
done
}
Note that if you are on a case-sensitive filesystem and have artist names that aren't capitalized, you will get separate folders in the destination tree for the two cases, e.g. an "A" folder and an "a" folder.
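If separate "A" and "a" folders are not wanted, the initial can be upper-cased before it is used. A small sketch, assuming bash 4 or later for the ${var^^} expansion; initial_folder is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# initial_folder (hypothetical helper): upper-cased first character of a name.
initial_folder() {
    local name=$1 initial
    initial=${name:0:1}              # first character of the artist name
    printf '%s\n' "${initial^^}"     # ^^ upper-cases it (bash 4+)
}
```

In the loop above, dest=$folder2/$(initial_folder "$artist") would then send both adele and Adele to the A folder.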
You could use substring extraction: ${string:start_index:length}:
#!/bin/bash
folder1=/media/myMusic
folder2=/media/orderedMusic
for src in "$folder1"/*
do
    # skip anything that is not a directory
    [ -d "$src" ] || continue
    # strip the path, then take the first character of the artist name
    name=${src##*/}
    initial=${name:0:1}
    dest="$folder2/$initial"
    # create the destination directory if it does not exist
    mkdir -p "$dest"
    cp -r "$src" "$dest"
done
You could also use a string index, as you need only the first character of the string.
For more details, see http://tldp.org/LDP/abs/html/string-manipulation.html

How to find,copy and rename files in linux?

I am trying to find all files in a directory and its sub-directories and then copy them to a different directory. However, some of them have the same name, so I need to copy the files over and, if two files have the same name, rename one of them.
So far I have managed to copy all found files with a unique name over using:
#!/bin/bash
if [ ! -e $2 ] ; then
mkdir $2
echo "Directory created"
fi
if [ ! -e $1 ] ; then
echo "image source does not exists"
fi
find $1 -name IMG_****.JPG -exec cp {} $2 \;
However, I now need some sort of if statement to figure out if a file has the same name as another file that has been copied.
Since you are on Linux, you are probably using cp from coreutils. If that is the case, let it do the backup for you by using cp --backup=t, which keeps numbered backups of destination files that would otherwise be overwritten.
Try this approach: put the list of files in a variable and copy each file, checking whether the copy operation succeeds. If not, try a different name.
In code:
FILES=`find $1 -name IMG_****.JPG | xargs -r`
for FILE in $FILES; do
cp -n $FILE destination
# Check return error of latest command (i.e. cp)
# through the $? variable and, in case
# choose a different name for the destination
done
Inside the for statement, you can also put some incremental integer to try different names incrementally (e.g., name_1, name_2 and so on, until the cp command succeeds).
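The incremental naming can be sketched as a function that tests for an existing destination itself instead of relying on the exit status of cp -n. copy_unique is a hypothetical name, and placing the counter before the extension (name_1.JPG) is a choice made for this sketch. The existence check and the copy are separate steps, so it is not safe if several copies run at the same time.

```shell
#!/usr/bin/env bash
# copy_unique (hypothetical helper): copy $1 into directory $2, adding _1, _2, ...
# before the extension when the name is already taken. Prints the path used.
copy_unique() {
    local src=$1 dest_dir=$2 base stem ext target n=0
    base=${src##*/}
    if [[ $base == *.* ]]; then
        stem=${base%.*}          # name without the last extension
        ext=.${base##*.}         # last extension, with the dot
    else
        stem=$base
        ext=
    fi
    target=$dest_dir/$base
    while [[ -e $target ]]; do   # try name_1, name_2, ... until one is free
        n=$((n + 1))
        target=$dest_dir/${stem}_$n$ext
    done
    cp -- "$src" "$target"
    printf '%s\n' "$target"
}
```

It could be called once per file found, for example from a while read loop fed by find.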
You can do:
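shopt -s globstar   # bash 4+: lets ** match subdirectories recursively
for file in "$1"/**/IMG_*.JPG ; do
    target=$2/$(basename "$file")
    SUFF=0
    # find the first numeric suffix not already taken in the destination
    while [[ -f "$target$SUFF" ]] ; do
        (( SUFF++ ))
    done
    cp "$file" "$target$SUFF"
done
in your script in place of the find command to append integer suffixes to identically-named files. Note that every copy gets a suffix, starting at 0, and that the suffix goes after the extension.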
You can use rsync with the following switches for more control:
rsync --backup --backup-dir=DIR --suffix=SUFFIX -az <source dir> <destination dir>
Here (from man page)
-b, --backup
With this option, preexisting destination files are renamed as each file is transferred or deleted. You can control where the backup file goes and what (if any) suffix gets appended using the --backup-dir and --suffix options.
--backup-dir=DIR
In combination with the --backup option, this tells rsync to store all backups in the specified directory on the receiving side. This can be used for incremental backups. You can additionally specify a backup suffix using the --suffix option (otherwise the files backed up in the specified directory will keep their original filenames).
--suffix=SUFFIX
This option allows you to override the default backup suffix used with the --backup (-b) option. The default suffix is a ~ if no --backup-dir was specified, otherwise it is an empty string.
You can use rsync to sync two folders either on the local file system or on a remote file system. You can even sync over an ssh connection.
rsync is amazingly powerful. See the man page for all the options.
