Bash command-line to rename wildcard - linux

In my /opt/myapp dir I have a remote, automated process that will be dropping files of the form <anything>-<version>.zip, where <anything> could literally be any alphanumeric filename, and where <version> will be a version number. So, examples of what this automated process will be delivering are:
fizz-0.1.0.zip
buzz-1.12.35.zip
foo-1.0.0.zip
bar-3.0.9.RC.zip
etc. Through controls outside the scope of this question, I am guaranteed that only one of these ZIP files will exist under /opt/myapp at any given time. I need to write a Bash shell command that will rename these files and move them to /opt/staging. For the rename, the ZIP files need to have their version dropped. And so /opt/myapp/<anything>-<version>.zip is renamed and moved to /opt/staging/<anything>.zip. Using the examples above:
/opt/myapp/fizz-0.1.0.zip => /opt/staging/fizz.zip
/opt/myapp/buzz-1.12.35.zip => /opt/staging/buzz.zip
/opt/myapp/foo-1.0.0.zip => /opt/staging/foo.zip
/opt/myapp/bar-3.0.9.RC.zip => /opt/staging/bar.zip
The directory move is obvious and easy, but the rename is making me pull my hair out. I need to somehow save off the <anything> and then re-access it later on in the command. The command must be generic and can take no arguments.
My best attempt (which doesn't even come close to working) so far is:
file=*.zip; file=?; mv file /opt/staging
Any ideas on how to do this?

for file in *.zip; do
[[ -e $file ]] || continue # handle zero-match case without nullglob
mv -- "$file" /opt/staging/"${file%-*}.zip"
done
${file%-*} removes everything after the last - in the filename. Thus, we change fizz-0.1.0.zip to fizz, and then add a leading /opt/staging/ and a trailing .zip.
To make this more generic (working with multiple extensions), see the following function (callable as a command; function body could also be put into a script with a #!/bin/bash shebang, if one removed the local declarations):
stage() {
local file ext
for file; do
[[ -e $file ]] || continue
[[ $file = *-*.* ]] || {
printf 'ERROR: Filename %q does not contain a dash and a dot\n' "$file" >&2
continue
}
ext=${file##*.}
mv -- "$file" /opt/staging/"${file%-*}.$ext"
done
}
...with that function defined, you can run:
stage *.zip *.txt
...or any other pattern you so choose.

f=foo-1.3.4.txt
echo ${f%%-*}.${f##*.}

Related

Moving files to subfolders based on prefix in bash

I currently have a long list of files, which look somewhat like this:
Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton
Gmc_W_GCtl_E_Erz_Aue_Dl_254_toe_taixwon
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_201_head_xaubadan
Gmc_W_GCtl_E_Erz_Homersdorf_Dl_262_bone_bainan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_261_blood_blodan
Gmc_W_GCtl_E_Thur_Peuschen_Dl_281_heart_xerton
The naming pattern all follow the same order, where I'm mainly seeking to group the files based on the part with "Aue", "Homersdorf", "Peuschen", and so forth (there are many others down the list), with the position of these keywords being always the same (e.g. they are all followed by Dl; they are all after the fifth underscore...etc.).
All the files are in the same folder, and I am trying to move these files into subfolders based on these keywords in bash, but I'm not quite certain how. Any help on this would be appreciated, thanks!
I am guessing you want something like this:
$ find . -type f | awk -F_ '{system("mkdir -p "$5"/"$6";mv "$0" "$5"/"$6)}'
This will move say Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton into /Erz/Aue/Gmc_W_GCtl_E_Erz_Aue_Dl_281_heart_xerton.
Using the bash shell with a for loop.
#!/usr/bin/env bash
shopt -s nullglob
for file in Gmc*; do
[[ -d $file ]] && continue
IFS=_ read -ra dir <<< "$file"
echo mkdir -pv "${dir[4]}/${dir[5]}" || exit
echo mv -v "$file" "${dir[4]}/${dir[5]}" || exit
done
Place the script inside the directory in question make it executable and execute it.
Remove the echo's so it create the directories and move the files.

Delete files in one directory that do not exist in another directory or its child directories

I am still a newbie in shell scripting and trying to come up with a simple code. Could anyone give me some direction here. Here is what I need.
Files in path 1: /tmp
100abcd
200efgh
300ijkl
Files in path2: /home/storage
backupfile_100abcd_str1
backupfile_100abcd_str2
backupfile_200efgh_str1
backupfile_200efgh_str2
backupfile_200efgh_str3
Now I need to delete file 300ijkl in /tmp as the corresponding backup file is not present in /home/storage. The /tmp file contains more than 300 files. I need to delete the files in /tmp for which the corresponding backup files are not present and the file names in /tmp will match file names in /home/storage or directories under /home/storage.
Appreciate your time and response.
You can also approach the deletion using grep as well. You can loop though the files in /tmp checking with ls piped to grep, and deleting if there is not a match:
#!/bin/bash
[ -z "$1" -o -z "$2" ] && { ## validate input
printf "error: insufficient input. Usage: %s tmpfiles storage\n" ${0//*\//}
exit 1
}
for i in "$1"/*; do
fn=${i##*/} ## strip path, leaving filename only
## if file in backup matches filename, skip rest of loop
ls "${2}"* | grep -q "$fn" &>/dev/null && continue
printf "removing %s\n" "$i"
# rm "$i" ## remove file
done
Note: the actual removal is commented out above, test and insure there are no unintended consequences before preforming the actual delete. Call it passing the path to tmp (without trailing /) as the first argument and with /home/storage as the second argument:
$ bash scriptname /path/to/tmp /home/storage
You can solve this by
making a list of the files in /home/storage
testing each filename in /tmp to see if it is in the list from /home/storage
Given the linux+shell tags, one might use bash:
make the list of files from /home/storage an associative array
make the subscript of the array the filename
Here is a sample script to illustrate ($1 and $2 are the parameters to pass to the script, i.e., /home/storage and /tmp):
#!/bin/bash
declare -A InTarget
while read path
do
name=${path##*/}
InTarget[$name]=$path
done < <(find $1 -type f)
while read path
do
name=${path##*/}
[[ -z ${InTarget[$name]} ]] && rm -f $path
done < <(find $2 -type f)
It uses two interesting shell features:
name=${path##*/} is a POSIX shell feature which allows the script to perform the basename function without an extra process (per filename). That makes the script faster.
done < <(find $2 -type f) is a bash feature which lets the script read the list of filenames from find without making the assignments to the array run in a subprocess. Here the reason for using the feature is that if the array is updated in a subprocess, it would have no effect on the array value in the script which is passed to the second loop.
For related discussion:
Extract File Basename Without Path and Extension in Bash
Bash Script: While-Loop Subshell Dilemma
I spent some really nice time on this today because I needed to delete files which have same name but different extensions, so if anyone is looking for a quick implementation, here you go:
#!/bin/bash
# We need some reference to files which we want to keep and not delete,
 # let's assume you want to keep files in first folder with jpeg, so you
# need to map it into the desired file extension first.
FILES_TO_KEEP=`ls -1 ${2} | sed 's/\.pdf$/.jpeg/g'`
#iterate through files in first argument path
for file in ${1}/*; do
# In my case, I did not want to do anything with directories, so let's continue cycle when hitting one.
if [[ -d $file ]]; then
continue
fi
# let's omit path from the iterated file with baseline so we can compare it to the files we want to keep
NAME_WITHOUT_PATH=`basename $file`
 # I use mac which is equal to having poor quality clts
# when it comes to operating with strings,
# this should be safe check to see if FILES_TO_KEEP contain NAME_WITHOUT_PATH
if [[ $FILES_TO_KEEP == *"$NAME_WITHOUT_PATH"* ]];then
echo "Not deleting: $NAME_WITHOUT_PATH"
else
# If it does not contain file from the other directory, remove it.
echo "deleting: $NAME_WITHOUT_PATH"
rm -rf $file
fi
done
Usage: sh deleteDifferentFiles.sh path/from/where path/source/of/truth

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1"
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that while quoting your variables is almost always a good practice, the for FILENAME in *.bak is still dangerous if any of your filenames might contain spaces. Read David W.'s answer for a more-robust solution, and this document for alternative solutions.
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use the environment variable filtering. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of %.
If you use a double filter (i.e. ## or %%, you are trying to filter on the biggest match. If you have a single filter (i.e. # or %, you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} #Matches is "this/` and will print out "is/my/file/name.txt"
echo ${file##*/} #Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} #Matches "/name.txt" and will print out "/this/is/my/file"
echo ${file%%/*} #Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match!. If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, it the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=$(file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while read -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates out the names of the files with a NUL symbol -- which is one of the few characters not allowed in a file name. The -d $\0means that your input separators are NUL symbols. See how nicely thefind -print0andread -d $'\0'` together?
You should almost never use the for file in $(*.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
for FILENAME in `find . -name "*.bak"`; do mv --force "$FILENAME" "${FILENAME%.bak}"; done

Bash command to move only some files?

Let's say I have the following files in my current directory:
1.jpg
1original.jpg
2.jpg
2original.jpg
3.jpg
4.jpg
Is there a terminal/bash/linux command that can do something like
if the file [an integer]original.jpg exists,
then move [an integer].jpg and [an integer]original.jpg to another directory.
Executing such a command will cause 1.jpg, 1original.jpg, 2.jpg and 2original.jpg to be in their own directory.
NOTE
This doesn't have to be one command. I can be a combination of simple commands. Maybe something like copy original files to a new directory. Then do some regular expression filter on files in the newdir to get a list of file names from old directory that still need to be copied over etc..
Turning on extended glob support will allow you to write a regular-expression-like pattern. This can handle files with multi-digit integers, such as '87.jpg' and '87original.jpg'. Bash parameter expansion can then be used to strip "original" from the name of a found file to allow you to move the two related files together.
shopt -s extglob
for f in +([[:digit:]])original.jpg; do
mv $f ${f/original/} otherDirectory
done
In an extended pattern, +( x ) matches one or more of the things inside the parentheses, analogous to the regular expression x+. Here, x is any digit. Therefore, we match all files in the current directory whose name consists of 1 or more digits followed by "original.jpg".
${f/original/} is an example of bash's pattern substitution. It removes the first occurrence of the string "original" from the value of f. So if f is the string "1original.jpg", then ${f/original/} is the string "1.jpg".
well, not directly, but it's an oneliner (edit: not anymore):
for i in [0-9].jpg; do
orig=${i%.*}original.jpg
[ -f $orig ] && mv $i $orig another_dir/
done
edit: probably I should point out my solution:
for i in [0-9].jpg: execute the loop body for each jpg file with one number as filename. store whole filename in $i
orig={i%.*}original.jpg: save in $orig the possible filename for the "original file"
[ -f $orig ]: check via test(1) (the [ ... ] stuff) if the original file for $i exists. if yes, move both files to another_dir. this is done via &&: the part after it will be only executed if the test was successful.
This should work for any strictly numeric prefix, i.e. 234.jpg
for f in *original.jpg; do
pre=${f%original.jpg}
if [[ -e "$pre.jpg" && "$pre" -eq "$pre" ]] 2>/dev/null; then
mv "$f" "$pre.jpg" targetDir
fi
done
"$pre" -eq "$pre" gives an error if not integer
EDIT:
this fails if there exist original.jpg and .jpg both.
$pre is then nullstring and "$pre" -eq "$pre" is true.
The following would work and is easy to understand (replace out with the output directory, and {1..9} with the actual range of your numbers.
for x in {1..9}
do
if [ -e ${x}original.jpg ]
then
mv $x.jpg out
mv ${x}original.jpg out
fi
done
You can obviously also enter it as a single line.
You can use Regex statements to find "matches" in the files names that you are looking through. Then perform your actions on the "matches" you find.
integer=0; while [ $integer -le 9 ] ; do if [ -e ${integer}original.jpg ] ; then mv -vi ${integer}.jpg ${integer}original.jpg lol/ ; fi ; integer=$[ $integer + 1 ] ; done
Note that here, "lol" is the destination directory. You can change it to anything you like. Also, you can change the 9 in while [ $integer -le 9 ] to check integers larger than 9. Right now it starts at 0* and stops after checking 9*.
Edit: If you want to, you can replace the semicolons in my code with carriage returns and it may be easier to read. Also, you can paste the whole block into the terminal this way, even if that might not immediately be obvious.

Shell script best way to remove files not in a pair

I have a set of files that come in pairs:
/var/log/messages-20111001
/var/log/messages-20111001.hash
I've had several of these rotate away and now I'm left with a ton of /var/log/messages-201110xx.hash files with no associated log. I'd like to clean up the mess, but I'm uncertain how to remove a file that isn't part of a "pair". I can use bash or zsh (or any LSB tool, really). I need to remove all the .hash files that don't have an associated log.
Example
/var/log/messages-20111001.hash
/var/log/messages-20111002.hash
/var/log/messages-20111003.hash
/var/log/messages-20111004.hash
/var/log/messages-20111005
/var/log/messages-20111005.hash
/var/log/messages-20111006
/var/log/messages-20111006.hash
Should be reduced to:
/var/log/messages-20111005
/var/log/messages-20111005.hash
/var/log/messages-20111006
/var/log/messages-20111006.hash
for file in *.hash; do test -f "${file%.hash}" || rm -- "$file"; done
Something like this?
for f in /var/log/messages-????????.hash ; do
[[ -e "${f%.hash}" ]] || rm "$f"
done

Resources