Move files to different directories based on file name tokens - linux

I am looking to write a script to move files from a directory:
/home/mydir/
To another directory based on tokens in the file name. I have a bunch of files named as such:
red_office_mike_2015_montreal_546968.ext
or
$color_$location_$name_$year_$city_$numbers.extension (files will be various movie files: mov, mp4, mkv, etc.)
I would like the script to move the files to the following location:
/dir/work/$color/$name
Then verify the file has successfully copied, and delete the original file once it has.
I would also love it if the script would create the to directory if it does not already exist.
So in summary, I need a script to move files based on underscore separated tokens, create the to directory if it doesn't already exist, verify the successful copy (maybe with a size check), then delete the original file.
I am working on linux, and would prefer a bash script. The variables I have given are generic, and I will incorporate some other things to the script, I'm just looking for help on building the skeleton.
Thanks in advance for any help!

It's not a bash script, but Perl is well suited to this kind of task and is installed on most Linux systems:
while(<>) {
chomp;
$file = $_;
($colour, $location, $name, $year, $city, $numbers) = split(/_/,$file);
$dest0 = "/dir/work/$colour";
$dest1 = "$dest0/$name";
mkdir ($dest0) unless (-d $dest0);
mkdir ($dest1) unless (-d $dest1);
rename ($file, "$dest1/$file");
}
The script splits each input file name on the underscore character, creates the destination directories, and then renames the file into its new location. rename takes care of the moving for you: it just changes the directory entries without copying any data. Note that rename generally cannot move a file across filesystems, so this relies on the source and destination being on the same one.
UPDATE
The above version takes its input from a file containing a list of filenames to process. For an alternative version which processes all files in the current directory, replace the while line with
while(glob("*")) {

I was able to fumble around online and come up with a for loop to do this task. I used cut and it made things simple. Here is what worked for me:
#!/bin/sh
cd "${1:-.}"
for f in *.*; do
color=`echo "$f" | cut -d'_' -f1`
name=`echo "$f" | cut -d'_' -f3`
todir="/dir/work/$color/$name"
mkdir -p "$todir"
mv "$f" "$todir"
done
This worked perfectly and I hope it can help others who might need to create directories based on portions of filenames.
The first line under the shebang made it so that it will either look at the current working directory or a directory you pass it as an argument.
Thanks to those who chimed in on the original post. I'm new to scripting, so it takes me a while to figure this stuff out. I love this site though, it is super helpful!
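For readers who want the copy-then-verify behaviour the question asked for, here is a minimal sketch. It assumes every file name carries at least three underscore-separated tokens; the function name sort_by_tokens and its two arguments are made up for illustration, and cmp stands in for the size check mentioned in the question.

```shell
# Hypothetical helper: sort_by_tokens SRC_DIR DEST_ROOT
# Files named color_location_name_... end up in DEST_ROOT/color/name.
sort_by_tokens() {
    local src_dir=$1 dest_root=$2 f base color rest name todir
    for f in "$src_dir"/*_*_*.*; do
        [ -f "$f" ] || continue          # the glob matched nothing, or not a regular file
        base=${f##*/}                    # file name without the directory part
        color=${base%%_*}                # token 1
        rest=${base#*_}                  # drop token 1
        rest=${rest#*_}                  # drop token 2 (location)
        name=${rest%%_*}                 # token 3
        todir=$dest_root/$color/$name
        mkdir -p "$todir" || return 1
        # Copy, compare byte for byte, and delete the original only if both succeed.
        cp -p "$f" "$todir/$base" &&
            cmp -s "$f" "$todir/$base" &&
            rm "$f"
    done
}
```

It would be called as sort_by_tokens /home/mydir /dir/work. A plain mv already copies and then deletes when it crosses filesystems, so the explicit cmp is only worth having if you want the check spelled out.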

Related

Recursive Text Substitution and File Extension Rename

I am using an application that creates a text file on a Linux server. I then have the ability to execute a shell script (BASH 3.2.57) in which I need to convert the text file from Unix line endings to DOS and also change the extension of the file from .txt to .log.
I currently have a sed based command to do this. This command is rewritten by the application at run time to point to the specific folder and file name, in this example where you see ABC (all capital 3 letters in all my examples are a variable that can be any 3 letters).
pushd /rootfolder/parentfolder/ABC/
sed 's/$/\r/' prABC.txt > prABC.log
popd
The problem with this is that if a user runs the application for 2 different groups, say ABC and DEF at nearly the same time, the script will get overwritten with the DEF variables before ABC had a chance to fire off and do its thing with the file. Additionally the .txt is left in the folder regardless and I would like that to be removed.
A friend of mine came up with the following code, which seems to work if it's determined to be our best solution, but I would think and hope there is a cleaner, more dynamic way to do this. This method also means that when my user decides to add a GHI directory and file, I have to update the code. I can program my application to do that for me, but I don't want this script to have to be rewritten every time the application wants to use it.
pushd /rootfolder/parentfolder/ABC
if [[ -f prABC.txt ]]
then
sed 's/$/\r/' prABC.txt > prABC.log
rm prABC.txt
fi
popd
pushd /rootfolder/parentfolder/DEF
if [[ -f prABC.txt ]]
then
sed 's/$/\r/' prABC.txt > prABC.log
rm prABC.txt
fi
popd
I would like to call this script at any time from my application and have it find any file named pr*.txt below the /rootfolder/parentfolder/ directory (if that has to include the parentfolder in its search, that won't be a problem), convert the line endings from LF to CRLF, and change the extension of the file from .txt to .log.
I've done a ton of searching and have found near solutions for this, but not exactly what I need, and I want to be sure it's as safe as possible (there are issues with using "find with for"). I don't know what utilities are installed on this build, so I would like to keep it as basic/supportable as possible. Thanks in advance :)
You should almost never need pushd and popd in scripts. In fact, you rarely need cd, either.
#!/bin/bash
for d in /rootfolder/parentfolder/ABC /rootfolder/parentfolder/DEF
do
if [[ -f "$d/prABC.txt" ]]
then
sed 's/$/\r/' "$d/prABC.txt" > "$d/prABC.log" &&
rm "$d/prABC.txt"
fi
done
Recall that a && b is shorthand for
if a; then
b
fi
In other words, if sed fails (because the source file can't be read, or the destination can't be written) we don't rm the source file. There should be an error message already so we don't add another one.
Not only is this more succinct, it is also easier to change if you decide that the old file should be renamed instead of removed, or you want to filter out all lines which contain "beef" in the sed script. Generally you should avoid repeated code; see also the DRY principle on Wikipedia.
Something is seriously wrong somewhere if you require DOS line endings in your files on Unix.
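To cover the "any pr*.txt below the parent folder" part of the question without listing ABC, DEF and so on by hand, one option is to let find hand the files to a small inline script. This is only a sketch: convert_pr_files is a made-up name, and it assumes find supports -exec ... {} + (which is in POSIX). Passing the file names as arguments avoids the word-splitting problems of looping over find output with for.

```shell
# Hypothetical helper: convert_pr_files ROOT_DIR
# Converts every pr*.txt below ROOT_DIR to a CRLF .log file, then removes the .txt.
convert_pr_files() {
    find "$1" -type f -name 'pr*.txt' -exec sh -c '
        cr=$(printf "\r")                      # a literal carriage return
        for f do
            out=${f%.txt}.log
            # Remove the source only if sed wrote the new file successfully.
            sed "s/\$/$cr/" "$f" > "$out" && rm "$f"
        done
    ' sh {} +
}
```

It would be called as convert_pr_files /rootfolder/parentfolder. Because nothing in it names a particular group, the application never has to rewrite the script.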

How can I move multiple files to a directory while changing their names and extensions using bash?

There are multiple files in /opt/dir/ABC/ named allfile_123-abc allfile_123-def allfile_123-ghi allfile_123-xxx.
I need the files to be named new_name-abc.pgp new_name-def.pgp new_name-ghi.pgp new_name-xxx.pgp and then moved to /usr/tst/output
for file in /opt/dir/ABC/allfile_123* ;
do mv $file /usr/tst/output/"$file.pgp";
rename allfile_123 new_name /usr/tst/output/*.pgp ; done
I know the above doesn't work because $file = /opt/dir/ABC/allfile_123*. Is it possible to make this work, or is it a different command instead of 'for loop'?
This is for the Autosys application in which the jil contains a command to pass to the command line of a linux server running bash.
I could only find versions of each part of my question but not altogether and I was hoping to keep it on the command line of this jil. Unless a script is absolutely necessary.
No need for the loop, you can do this with just rename and mv:
rename -v 's/$/.pgp/' /opt/dir/ABC/allfile_123*
rename -v s/allfile_123/new_name/ /opt/dir/ABC/allfile_123*
mv /opt/dir/ABC/new_name* /usr/tst/output/
But I'm not sure the rename you are using is the same as mine.
However,
since the replacement you want to perform is fairly simple,
it's easy to do in pure Bash:
for file in /opt/dir/ABC/allfile_123*; do
newname=new_name${file##*allfile_123}.pgp
mv "$file" /usr/tst/output/"$newname"
done
If you want to write it on a single line:
for file in /opt/dir/ABC/allfile_123*; do newname=new_name${file##*allfile_123}.pgp; mv "$file" /usr/tst/output/"$newname"; done
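To see what the ${file##*allfile_123} expansion in the loop does, it can help to run it on one sample path taken from the question. The variable name suffix is only there for illustration:

```shell
file=/opt/dir/ABC/allfile_123-abc
suffix=${file##*allfile_123}     # strip everything up to and including allfile_123
newname=new_name$suffix.pgp
echo "$suffix"                   # -abc
echo "$newname"                  # new_name-abc.pgp
```

The ## form removes the longest prefix matching the pattern, so the directory part disappears along with the old base name.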

RH Linux Bash Script help. Need to move files with specific words in the file

I have a RedHat linux box and I had written a script in the past to move files from one location to another with a specific text in the body of the file.
I typically only write scripts once a year so every year I forget more and more... That being said,
Last year I wrote this script and used it and it worked.
For some reason, I can not get it to work today and I know it's a simple issue and I shouldn't even be asking for help but for some reason I'm just not looking at it correctly today.
Here is the script.
ls -1 /var/text.old | while read file
do
grep -q "to.move" $file && mv $file /var/text.old/TBD
done
I'm listing all the files inside the /var/text.old directory.
I'm reading each file
then I'm grep'ing for "to.move" and holding the results
then I'm moving the resulting found files to the folder /var/text.old/TBD
I am an admin and I have rights to the above files and folders.
I can see the data in each file
I can mv them manually
I have used pwd to grab the correct spelling of the directory.
If anyone can just help me to see what the heck I'm missing here that would really make my day.
Thanks in advance.
UPDATE:
The files I need to move do not have Whitespaces.
The Error I'm getting is as follows:
grep: 9829563.msg: No such file or directory
NOTE: the file "982953.msg" is one of the files I need to move.
Also note: I'm getting this error for every file in the directory that I'm listing.
You didn't post any error, but I'm gonna take a guess and say that you have a filename with a space or special shell character.
Let's say you have 3 files, and ls -1 gives us:
hello
world
hey there
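Now, the shell splits words on the value of the special $IFS variable, which is set to <space><tab><newline> by default.
If the list is expanded unquoted (for example with for file in $(ls -1 /var/text.old)), then instead of looping over 3 values like you expect (hello, world, and hey there), you loop over 4 values (hello, world, hey, and there). With while read file each line is read whole, but read still strips leading and trailing whitespace and interprets backslashes unless you write IFS= read -r file.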
To fix this, we can do 2 things:
Set IFS to only a newline:
IFS="
"
ls -1 /var/text.old | while read file
...
In general, I like setting IFS to a newline at the start of the script, since I consider this to be slightly "safer", but opinions on this probably vary.
But much better is to not parse the output of ls, and use for:
for file in /var/text.old/*; do
This won't fork any external processes (piping ls to while starts 2), and behaves in "less surprising" ways in general. See here for some examples.
The second problem is that you're not quoting $file. You should always quote pathnames with double quotes ("$file") for the same reasons. If $file contains a space or a special shell character such as *, the meaning of your command changes:
file=hey\ *
mv $file /var/text.old/TBD
Becomes:
mv hey * /var/text.old/TBD
Which is obviously very different from what you intended! What you intended was:
mv "hey *" /var/text.old/TBD
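The update to the question reports "No such file or directory" for names without whitespace. That is consistent with the script being run from some other directory, because ls -1 /var/text.old prints bare names with no directory in front of them. Looping over the glob sidesteps this as well, since every value carries its full path. A sketch, with move_matching as a made-up function name:

```shell
# Hypothetical helper: move_matching SRC_DIR PATTERN DEST_DIR
# Moves every regular file in SRC_DIR whose contents match PATTERN into DEST_DIR.
move_matching() {
    local src_dir=$1 pattern=$2 dest_dir=$3 file
    for file in "$src_dir"/*; do
        [ -f "$file" ] || continue        # skip subdirectories such as TBD
        grep -q -- "$pattern" "$file" && mv -- "$file" "$dest_dir"/
    done
}
```

It would be called as move_matching /var/text.old 'to.move' /var/text.old/TBD. Keep in mind that grep treats the dot in to.move as "any character" unless you add -F.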

Change working directory while looping over folders

Currently I am trying to run MRI software (TBSS) on imaging files (scan.nii.gz) on the Linux command line.
The scans are all stored in separate folders for different participants and the file names are identical, so:
/home/scans/participant1/scan.nii.gz
/home/scans/participant2/scan.nii.gz
/home/scans/participant3/scan.nii.gz
This software writes the result of the analysis to the current working directory. Since the scans all have the same file name, the results get overwritten all the time.
I would like to loop through all the participant folders, make each one my working directory in turn, and then execute the tbss command, which is simply tbss_1_preproc scan.nii.gz. That way, the result will be stored in the current working directory, which is the participant directory.
Is there any sensible way of doing this in Linux ?
Thanks so much !
Try it in BASH. The code below is untested, but it should give you a clue
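#!/bin/bash
# Run from the scans folder; each scan is processed inside its own directory.
find . -name scan.nii.gz | while IFS= read -r line
do
    # The subshell keeps the cd from leaking into the next iteration,
    # so the relative paths printed by find stay valid.
    (
        cd "$(dirname "$line")" &&
        tbss_1_preproc "$(basename "$line")"
    )
done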
Put it in a file and make it executable. Copy it to your scans folder and execute it.
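If the participant folders all sit directly under one parent, as in the question, a plain glob is an alternative to find. The sketch below is untested against TBSS itself; run_in_each_dir is a made-up name, and the command to run is passed in as an argument so the loop can be tried with something harmless first.

```shell
# Hypothetical helper: run_in_each_dir ROOT COMMAND [ARGS...]
# Runs COMMAND [ARGS...] scan.nii.gz inside every subdirectory of ROOT that holds a scan.
run_in_each_dir() {
    local root=$1 dir
    shift
    for dir in "$root"/*/; do
        [ -f "${dir}scan.nii.gz" ] || continue
        # The parentheses start a subshell, so the cd is undone after each directory.
        ( cd "$dir" && "$@" scan.nii.gz )
    done
}
```

For the layout in the question it would be called as run_in_each_dir /home/scans tbss_1_preproc.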

Linux bash script to copy files

I need a script that copies a list of files on a cron schedule. The files are selected by a name/datetime pattern, and a stamp like ddmmyyyy must be appended to the name of each destination file.
Copying files or directories is not the problem; the problem is changing the name of each file according to its data. Maybe an open source solution already exists?
Thanks.
You haven't provided enough information for me to give you real working code; but you can do something like this:
file=dated_log.log
ddmmyyyy=$(read -r < "$file" ; echo "${REPLY:0:8}")
cp "$file" "$file.$ddmmyyyy"
The above will copy dated_log.log to dated_log.log.30102011, assuming that the first line of dated_log.log starts with 30102011.
The Bash Reference Manual will hopefully help you adjust the above to suit your needs.
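If the stamp should come from the file's modification time and not from its contents, something along these lines may be closer to what was asked. It is a sketch under two assumptions: that GNU date is available (its -r FILE option prints the file's last modification time), and that the hypothetical function name copy_with_date and its arguments suit your setup.

```shell
# Hypothetical helper: copy_with_date SRC DEST_DIR
# Copies SRC to DEST_DIR/<name>.<ddmmyyyy>, using the file's modification date.
copy_with_date() {
    local src=$1 dest_dir=$2 stamp
    stamp=$(date -r "$src" +%d%m%Y) || return 1   # GNU date: -r FILE reads the mtime
    cp -p "$src" "$dest_dir/${src##*/}.$stamp"
}
```

A cron job could then call this once per file selected by your name pattern.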
