I am currently writing a script that mounts a Samba share, rsyncs the data to a local machine and archives it into a directory structure (say /home/archive/). Currently, when new PDFs are added, the archiving is done manually, which seems like an inefficient use of time.
The files have the following structure
ABC140003.pdf
ABC140124.pdf
.
.
ABC144201.pdf
.
ABC146012.pdf
/home/archive/ has several directories: 2010/, 2011/, 2012/, 2013/ etc.
Basically, I need to break up the number to find the correct subdirectory to copy the file into. First I extract the number:
study_number=`echo $file | sed 's/[^0-9]//g'`
Then the year
year=20`echo $study_number | cut -c 1-2`
All the above PDF files belong in the 2014 subdirectory. Within 2014, or any other year directory, there are the following subdirectories: 2014/Blue/, 2014/Red/ and 2014/Green/. This corresponds to the 3rd digit of the number: Blue (0), Red (4) and Green (6).
I use a case statement here to find what I have called the study type:
type_int=`echo $study_number | cut -c 3`
case "$type_int" in
0)
type_string="Blue"
;;
4) type_string="Red"
;;
6) type_string="Green"
;;
*) echo "$date: $file has unknown study type. Do not know where to place it" >> $logfile
continue
;;
esac
I now know the following files go in the following directories
ABC140003.pdf -> /home/archive/2014/Blue/
ABC140124.pdf -> /home/archive/2014/Blue/
.
.
ABC144201.pdf -> /home/archive/2014/Red/
.
ABC146012.pdf -> /home/archive/2014/Green/
I'd be happy if this were the end of the directory structure. However, another layer of subdirectories has been introduced so that no directory holds more than 100 PDF files (not my call).
For example /home/archive/2014/Blue/ has the following directories:
140001-0100/ 140101-0200/ 140201-0300/ 140301-0400/ 140401-0500/ 140501-0600/
etc
I now need to come up with some logic such that the following files go to the following directories:
ABC140003.pdf -> /home/archive/2014/Blue/140001-0100
ABC140124.pdf -> /home/archive/2014/Blue/140100-0124
.
.
ABC144201.pdf -> /home/archive/2014/Red/144200-4300
.
ABC146012.pdf -> /home/archive/2014/Green/146000-6100
I am stumped on how to logically determine that study ABC146012 should go in 146000-6100 in an elegant manner without resorting to multiple if statements for each of Red/ Blue/ and Green/
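The bucket bounds themselves are just integer arithmetic on the digits that follow the year, so a rough sketch like the following (variable names are illustrative, reusing the study_number extracted above) derives them without any per-colour branching:
last4=${study_number:2}                      # type digit + serial, e.g. 6012 for ABC146012
low=$(( (10#$last4 - 1) / 100 * 100 + 1 ))   # 6001; 10# forces base 10 despite leading zeros
high=$(( low + 99 ))                         # 6100
printf -v bucket '%s%04d-%04d' "${study_number:0:2}" "$low" "$high"
echo "$bucket"                               # 146001-6100
The answers below arrive at the same bucket names by slicing the number as a string.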
Here is a simplified version that needs some work, but you get the idea (for a nice final solution, see @glenn jackman's solution):
Declare an associative array for the colors
$ declare -A colors
$ colors[0]=Blue
$ colors[4]=Red
$ colors[6]=Green
Then extract the needed information
$ study_number=$(sed 's/[^0-9]//g' <<< ABC140124.pdf);
$ year=${study_number:0:2};
$ type=${study_number:2:1};
$ color=${colors[$type]};
$ from="${study_number:0:$((${#study_number}-2))}01"
$ to="$((${study_number:0:$((${#study_number}-2))}+1))00"
and that gives:
$ echo /home/archive/$year/$color/$from-$to
/home/archive/14/Blue/140101-140200
(I assumed you wanted your intervals to be consistently numbered 'x01-(x+1)00')
You can create a function to simplify the process
build_dir() {
study_number=$(sed 's/[^0-9]//g' <<< $1);
year=${study_number:0:2};
type=${study_number:2:1};
color=${colors[$type]};
from="${study_number:0:$((${#study_number}-2))}01"
to="$((${study_number:0:$((${#study_number}-2))}+1))00"
echo "/home/archive/$year/$color/$from-$to"
}
It needs a bit more defensive programming, but it can be used like this:
$ build_dir ABC146012.pdf
/home/archive/14/Green/146001-146100
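For example, a slightly more defensive variant could validate the extracted digits and the colour lookup before building a path (just a sketch, reusing the colors array declared above):
build_dir() {
    local study_number year type color from to
    study_number=$(sed 's/[^0-9]//g' <<< "$1")
    # need at least the 2-digit year plus the type digit to build a path
    if (( ${#study_number} < 3 )); then
        echo "Cannot parse a study number from '$1'" >&2
        return 1
    fi
    year=${study_number:0:2}
    type=${study_number:2:1}
    color=${colors[$type]}
    # refuse to build a bogus path for an unknown study type
    if [[ -z $color ]]; then
        echo "Unknown study type '$type' in '$1'" >&2
        return 1
    fi
    from="${study_number:0:$((${#study_number}-2))}01"
    to="$((${study_number:0:$((${#study_number}-2))}+1))00"
    echo "/home/archive/$year/$color/$from-$to"
}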
colors=([0]=Blue [4]=Red [6]=Green)
get_destination() {
if [[ $1 =~ ([0-9][0-9])([0-9])([0-9]) ]]; then
printf "/home/archive/20%s/%s/%s%s%d01-%s%d00" \
${BASH_REMATCH[1]} \
${colors[${BASH_REMATCH[2]}]} \
${BASH_REMATCH[1]} \
${BASH_REMATCH[2]} \
${BASH_REMATCH[3]} \
${BASH_REMATCH[2]} \
$(( 1 + ${BASH_REMATCH[3]} ))
fi
}
for file in ABC140003.pdf ABC140124.pdf ABC144201.pdf ABC146012.pdf; do
echo "$file -> $(get_destination $file)"
done
ABC140003.pdf -> /home/archive/2014/Blue/140001-0100
ABC140124.pdf -> /home/archive/2014/Blue/140101-0200
ABC144201.pdf -> /home/archive/2014/Red/144201-4300
ABC146012.pdf -> /home/archive/2014/Green/146001-6100
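To actually file the PDFs from the rsync'd share, the computed destination can be combined with mkdir -p and cp. A sketch (the incoming directory is a placeholder and $logfile is the log file from the question):
for file in /path/to/incoming/*.pdf; do
    dest=$(get_destination "$(basename "$file")")
    # skip files where no study number could be extracted
    if [[ -z $dest ]]; then
        echo "$(date): could not work out a destination for $file" >> "$logfile"
        continue
    fi
    mkdir -p "$dest"    # create the year/colour/bucket directory if it is missing
    cp "$file" "$dest/"
done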
Related
I'm trying to upload certificates (just created) to some storage.
I can read all the certificates in my folder and want to assign the content of each file to a variable in a loop.
#!/bin/bash
dir="${0%/*}"
#for f in $(cat $dir"/"*.crt)
# do
# data='{"certificate_data":'"$f"'}'
#done
url="localhost:50183/api/v0.1/Certificates"
data='{"certificate_data":'$(cat $dir"/"*.crt)'}'
echo "$data"
So I get all the certificates at once, but I need $data to hold the content of each file, one per loop iteration, in a form like:
{"certificate_data":"<certificate_data_from_file>"}
{"certificate_data":"<certificate_data_from_file>"}
......
and so on
I know that I should use another loop, but I don't know how.
Be grateful for any tips!
This should do the work:
#!/bin/bash
for f in ./dir/*.crt
do
data='{"certificate_data":"'"$(< "${f}")"'"}'
echo "${data}"
done
Test:
$ ls ./dir/*
./dir/cert1.crt ./dir/cert2.crt
$ cat ./dir/*
I am certificate1.
I am certificate2.
$ ./cert.sh
{"certificate_data":"I am certificate1."}
{"certificate_data":"I am certificate2."}
I have a concatenation problem between a string and an array
I want to copy all the files contained in the directories stored in the array; my command is in a loop (to recursively copy my files):
yes | cp -rf "./$WORK_DIR/${array[$i]}/"* $DEST_DIR
My array :
array=("My folder" "...")
I have several folder names in my array (they have spaces in their names) that I would like to append to my $WORK_DIR so that cp can copy the files.
But I always get the following error:
cp: impossible to evaluate './WORKDIR/my': No such files or folders
cp: impossible to evaluate 'folder/*': No such files or folders
This worked for me
#!/bin/bash
arr=("My folder" "This is a test")
i=0
while [[ ${i} -lt ${#arr[@]} ]]; do
echo ${arr[${i}]}
cp -rfv ./source/"${arr[${i}]}"/* ./dest/.
(( i++ ))
done
exit 0
I ran the script. It gave me the following output:
My folder
'./source/My folder/blah-folder' -> './dest/./blah-folder'
'./source/My folder/foo-folder' -> './dest/./foo-folder'
This is a test
'./source/This is a test/blah-this' -> './dest/./blah-this'
'./source/This is a test/foo-this' -> './dest/./foo-this'
Not sure of the exact difference, but hopefully this will help.
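A slightly more idiomatic variant iterates over the array elements directly instead of by index, which keeps the quoting in one place (a sketch, assuming the same source/dest layout):
#!/bin/bash
arr=("My folder" "This is a test")
for dir in "${arr[@]}"; do
    # quoting "${dir}" keeps names with spaces as a single argument to cp
    cp -rfv ./source/"${dir}"/* ./dest/
done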
I've some files in a folder A which are named like that:
001_file.xyz
002_file.xyz
003_file.xyz
in a separate folder B I've files like this:
001_FILE_somerandomtext.zyx
002_FILE_somerandomtext.zyx
003_FILE_somerandomtext.zyx
Now I want to rename, if possible with just one command line in bash, all the files in folder B to the file names from folder A. The file extensions must stay different.
There are exactly the same number of files in folders A and B, and they are in the same order thanks to the numbering.
I'm a total noob, but I hope there's an easy answer to this problem.
Thanks in advance!
ZVLKX
An implementation might look a bit like this:
renameFromDir() {
useNamesFromDir=$1
forFilesFromDir=$2
for f in "$forFilesFromDir"/*; do
# Put original extension in $f_ext
f_ext=${f##*.}
# Put number in $f_num
f_num=${f##*/}; f_num=${f_num%%_*}
# look for a file with the same number in the directory we take names from
set -- "$useNamesFromDir"/"${f_num}"_*.*
[[ $1 && -e $1 ]] || {
echo "Could not find file number $f_num in $dirB" >&2
continue
}
(( $# > 1 )) && {
# there's more than one file with the same number; write an error
echo "Found more than one file with number $f_num in $dirB" >&2
printf ' - %q\n' "$#" >&2
continue
}
# extract the parts of our destination filename we want to keep
destName=${1##*/} # remove everything up to the last /
destName=${destName%.*} # and past the last .
# write the command we would run to stdout
printf '%q ' mv "$f" "$forFilesFromDir/$destName.$f_ext"; printf '\n'
## or uncomment this to actually run the command
# mv "$f" "$forFilesFromDir/$destName.$f_ext"
done
}
Now, how would we test this?
mkdir -p A B
touch A/00{1,2,3}_file.xyz B/00{1,2,3}_FILE_somerandomtext.zyx
renameFromDir A B
Given that, the output is:
mv B/001_FILE_somerandomtext.zyx B/001_file.zyx
mv B/002_FILE_somerandomtext.zyx B/002_file.zyx
mv B/003_FILE_somerandomtext.zyx B/003_file.zyx
Sorry if this isn't helpful, but I had fun writing it.
This renames items in folder B to the names in folder A, preserving the extension of B.
A_DIR="./A"
A_FILE_EXT=".xyz"
B_DIR="./B"
B_FILE_EXT=".zyx"
FILES_IN_A=`find $A_DIR -type f -name "*$A_FILE_EXT"`
FILES_IN_B=`find $B_DIR -type f -name "*$B_FILE_EXT"`
for A_FILE in $FILES_IN_A
do
A_BASE_FILE=`basename $A_FILE`
A_FILE_NUMBER=(${A_BASE_FILE//_/ })
A_FILE_WITHOUT_EXTENSION=(${A_BASE_FILE//./ })
for B_FILE in $FILES_IN_B
do
B_BASE_FILE=`basename $B_FILE`
B_FILE_NUMBER=(${B_BASE_FILE//_/ })
if [ ${A_FILE_NUMBER[0]} == ${B_FILE_NUMBER[0]} ]; then
mv $B_FILE $B_DIR/$A_FILE_WITHOUT_EXTENSION$B_FILE_EXT
break
fi
done
done
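A quick way to try it, assuming the script above is saved as, say, rename_from_a.sh (a hypothetical name) and run from the directory containing A and B:
mkdir -p A B
touch A/00{1,2,3}_file.xyz B/00{1,2,3}_FILE_somerandomtext.zyx
bash rename_from_a.sh
ls B
# 001_file.zyx  002_file.zyx  003_file.zyx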
Here is my command
for i in `find . -name '*Source*.dat'`; do cp "$i" $INBOUND/$RANDOM.dat; done;
Here are the files (just a sample):
./(12)SA1 (Admitting Diagnosis) --_TA1-1 + TA1-2/Source.dat
./(12)SA1 (Admitting Diagnosis) --_TA1-1 + TA1-2/Source_2000C.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source_2000C.dat
./(13)SE1 (External Cause of Injury) --_ TE1-1+TE1-2/Source_POATest.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source_2000C.dat
./(14)SP1(Primary)--_ TP1-1 + TP1-2/Source_ProcDateTest.dat
./(15)SP1(Primary)--_ TP1-1 + TP1-2 - SP2 -- TP2-1 + TP2-2/Source.dat
./(16)SP1(Primary)--_ TP1-1 + TP1-2 +TP1-3- SP2 -- TP2-1 + TP2-2/Source.dat
./(17)SP1(Primary)--_ TP1-1 + TP1-2 +TP1-3/Source.dat
./(18)SP1(Primary)--_ TP1-1 + TP1-2 - SP2 -- TP2-1 + TP2-2 - Copy/Source.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source_2000C.dat
./(19)SD1 (Primary)+SD2 (Other Diagnosis)--_ TD12/Source_POATest.dat
./(2)SD3--_TD4 SD4--_TD4/Source.dat
./(2)SD3--_TD4 SD4--_TD4/Source2.dat
Those spaces are getting tokenized by bash and this doesn't work.
In addition, I want to append some randomness to the end of these files so they don't collide in the destination directory but that's another story.
find . -name '*Source*.dat' -exec bash -c 'cp "$1" "$2/$RANDOM.dat"' -- {} "$INBOUND" \;
Using -exec to execute commands is whitespace safe. Running cp through bash -c is what gives you a different $RANDOM for each copy (plain sh may not support $RANDOM).
If all the files are at the same directory level, as in your example, you don't need find. For example,
for i in */*Source*.dat; do
cp "$i" $INBOUND/$RANDOM.dat
done
will tokenize correctly and will find the correct files provided they are all in directories which are children of the current directory.
As @chepner points out in a comment, if you have bash v4 you can use ** (after enabling it with shopt -s globstar):
for i in **/*Source*.dat; do
cp "$i" $INBOUND/$RANDOM.dat
done
which should find exactly the same files as find would, without the tokenizing issue.
How about:
find . -name '*file*' -print0 | xargs -0 -I {} cp {} $INBOUND/{}-$RANDOM.dat
xargs is a handy way of constructing an argument list and passing it to a command.
find -print0 and xargs -0 go together, and are basically an agreement between the two commands about how to terminate arguments. In this case, it means the space won't be interpreted as the end of an argument.
-I {} sets up the {} as an argument placeholder for xargs.
As for randomising the file name to avoid a collision, there are obviously lots of things you could do to generate a random string to attach. The most important part, though, is that you verify that your new file name also does not exist. You might use a loop something like this to attempt that:
random=$(date | md5)    # use md5sum on most Linux systems
filename=$INBOUND/$random.dat
while [ -e "$filename" ]; do
    random=$(date | md5)
    filename=$INBOUND/$random.dat
done
I'm not necessarily advocating for or against generating a random filename with a hash of the current time: the main point is that you want to check for existence of that file first, just in case.
There are several ways of treating files with spaces. You can use find in a pipe with while and read:
find . -name '*Source*.dat' | while read file ; do cp "$file" "$INBOUND/$RANDOM.dat"; done
try something like
while IFS= read -r i; do
    echo "file is $i"
    cp "$i" $INBOUND/$RANDOM.dat
done < <(find . -name '*Source*.dat')
I have a bunch of files with no pattern in their names at all in a directory; all I know is that they are all JPG files. How do I rename them so that they have some sort of sequence in their names?
I know that in Windows all you do is select all the files and rename them all to the same name, and Windows automatically adds sequence numbers to compensate for the duplicate names.
I want to be able to do that in Fedora Linux, but it seems you can only do it from the terminal. Please help, I am lost.
What is the command for doing this?
The best way to do this is to run a loop in the terminal, going from picture to picture and renaming each one with a number that increases by one on every iteration.
You can do this with:
n=1
for i in *.jpg; do
p=$(printf "%04d.jpg" ${n})
mv "${i}" "${p}"
let n=n+1
done
Just enter it into the terminal line by line.
If you want to put a custom name in front of the numbers, you can put it before the percent sign in the third line.
If you want to change the number of digits in the names' number, just replace the '4' in the third line (don't change the '0', though).
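For example, with a hypothetical prefix 'holiday_' and three-digit numbers, that line would become:
p=$(printf "holiday_%03d.jpg" ${n})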
I will assume that:
There are no spaces or other weird control characters in the file names
All of the files in a given directory are jpeg files
That in mind, to rename all of the files to 1.jpg, 2.jpg, and so on:
N=1
for a in ./* ; do
mv $a ${N}.jpg
N=$(( $N + 1 ))
done
If there are spaces in the file names:
find . -type f | awk 'BEGIN{N=1}
{print "mv \"" $0 "\" " N ".jpg"
N++}' | sh
Should be able to rename them.
The point being, Linux/UNIX does have a lot of tools which can automate a task like this, but they have a bit of a learning curve to them
Create a script containing:
#!/bin/sh
filePrefix="$1"
sequence=1
for file in $(ls -tr *.jpg) ; do
renamedFile="$filePrefix$sequence.jpg"
echo $renamedFile
currentFile="$(echo $file)"
echo "renaming \"$currentFile\" to $renamedFile"
mv "$currentFile" "$renamedFile"
sequence=$(($sequence+1))
done
exit 0
If you named the script, say, RenameSequentially then you could issue the command:
./RenameSequentially Images-
This would rename all *.jpg files in the directory to Images-1.jpg, Images-2.jpg, etc., in order of oldest to newest... tested in the OS X command shell.
I wrote a perl script a long time ago to do pretty much what you want:
#
# reseq.pl renames files to a new named sequence of filesnames
#
# Usage: reseq.pl newname [-n seq] [-p pad] fileglob
#
use strict;
my $newname = $ARGV[0];
my $seqstr = "01";
my $seq = 1;
my $pad = 2;
shift @ARGV;
if ($ARGV[0] eq "-n") {
$seqstr = $ARGV[1];
$seq = int $seqstr;
shift @ARGV;
shift @ARGV;
}
if ($ARGV[0] eq "-p") {
$pad = $ARGV[1];
shift @ARGV;
shift @ARGV;
}
my $filename;
my $suffix;
for (@ARGV) {
$filename = sprintf("${newname}_%0${pad}d", $seq);
if (($suffix) = m/.*\.(.*)/) {
$filename = "$filename.$suffix";
}
print "$_ -> $filename\n";
rename ($_, $filename);
$seq++;
}
You specify a common prefix for the files, a beginning sequence number and a padding factor.
For example:
# reseq.pl abc -n 1 -p 2 *.jpg
Will rename all matching files to abc_01.jpg, abc_02.jpg, abc_03.jpg...