I have a script that determines the type of a file and acts accordingly. I get the type with the file command and then grep for a specific string: for example, if the file is zipped, unzip it; if it is gzipped, gunzip it. I want to add a lot of different file types.
I am trying to replace the if statements with case and can't figure it out.
My script looks like this:
##$arg is the file itself
TYPE="$(file $arg)"
if [[ $(echo $TYPE|grep "bzip2") ]] ; then
bunzip2 $arg
elif [[ $(echo $TYPE|grep "Zip") ]] ; then
unzip $arg
fi
Thanks to everyone who helps :)
The general syntax is
case expr in
pattern) action;;
other) otheraction;;
*) default_action;; # optional
esac
So for your snippet,
case $(file "$arg") in
*bzip2*) bunzip2 "$arg";;
*Zip*) unzip "$arg";;
esac
If you want to capture the file output into a variable first, do that, of course; but avoid upper case for your private variables.
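By default, though, bunzip2 replaces its input file with the decompressed one, and unzip extracts the archive members into the current directory. Perhaps you want to avoid that and write the contents to standard output instead?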
case $(file "$arg") in
*bzip2*) bzip2 -dc <"$arg";;
*Zip*) unzip -p "$arg";;
esac |
grep "stuff"
Notice also how the shell conveniently lets you pipe out of (and into) conditionals.
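To add many more types, as the question asks, the case can live in a small function with one branch per format. This is only a minimal sketch: pick_decompressor is a hypothetical helper name, and the description strings are assumed to look like what file usually prints (the exact wording varies between versions of file).

```shell
# pick_decompressor is a hypothetical helper: it maps the description
# printed by file(1) to a command that streams the contents to stdout.
pick_decompressor() {
    case $1 in
        *bzip2*)           echo "bzip2 -dc" ;;
        *gzip*)            echo "gzip -dc" ;;
        *"XZ compressed"*) echo "xz -dc" ;;
        *Zip*)             echo "unzip -p" ;;
        *)                 echo "cat" ;;   # unknown type: pass the file through unchanged
    esac
}

# Usage idea: description=$(file -b "$arg"); tool=$(pick_decompressor "$description")
pick_decompressor "bzip2 compressed data, block size = 900k"
pick_decompressor "Zip archive data, at least v2.0 to extract"
```

The patterns are tried top to bottom and the first match wins, so more specific patterns should come before more general ones.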
I would like to build a little helper function that can deal with fastq.gz and fastq.bz2 files.
I want to merge zcat and bzcat into one transparent function which can be used on both sorts of files:
zbzcat example.fastq.gz
zbzcat example.fastq.bz2
zbzcat() {
file=`echo $1 | `
## Not working
ext=${file##*/};
if [ ext == "fastq.gz" ]; then
exec gzip -cd "$@"
else
exec bzip -cd "$@"
fi
}
The extension extraction is not working correctly. Are you aware of other solutions?
There are quite a few problems:
file=`echo $1 | ` gives a syntax error because there is no command after |. But you don't need the command substitution anyway. Just use file=$1.
ext=${file##*/} is not extracting the extension, but the filename. To extract the extension use ext=${file##*.}.
In your check you didn't use the variable $ext but the literal string ext.
Usually, only the string after the last dot in a filename is considered to be the extension. If you have file.fastq.gz, then the extension is gz. So use the check $ext = gz. That the uncompressed files are fastq files is irrelevant to the function anyway.
exec replaces the shell process with the given command. So after executing your function, the shell would exit. Just execute the command.
By the way: You don't have to extract the extension at all when using pattern matching:
zbzcat() {
file="$1"
case "$file" in
*.gz) gzip -cd "$@";;
*.bz2) bzip2 -cd "$@";;
*) echo "Unknown file format" >&2;;
esac
}
Alternatively, use 7z x which supports a lot of formats. Most distributions name the package p7zip.
ext=${1##*.}
Why are you throwing in an echo and trying to strip a /?
Also, the string ext (3 characters) will never be equal to the string fastq.gz (8 characters). If you want to check that the extension equals gz, just do a
if [[ $ext == gz ]]
Having said this, relying on the extension to get an idea of the content of a file is a bit brave. A more reliable way would be to use the file command to determine the most likely file type. Probably the safest approach would be to just try a bzip2 extraction first and, if it fails, do the gzip extraction.
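The "try one, then the other" idea could look roughly like this. It is a minimal sketch that assumes a hypothetical function name zbzcat_try, takes a single file, and relies on bzip2 rejecting a non-bzip2 file before it writes any output:

```shell
# zbzcat_try is a hypothetical name; it prints the decompressed contents
# of one file, trying bzip2 first and falling back to gzip.
zbzcat_try() {
    # bzip2 exits non-zero on input that is not bzip2 data; its error
    # message is discarded so that only the fallback's errors are shown.
    bzip2 -dc -- "$1" 2>/dev/null || gzip -dc -- "$1"
}
```

Called as zbzcat_try example.fastq.gz or zbzcat_try example.fastq.bz2. One caveat: a bzip2 file that is damaged halfway through would print partial output before the gzip attempt fails as well.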
I think it would be better if you would use mimetype.
File extensions are not always correct.
decomp() {
case $(file -b --mime-type "$1") in
"application/gzip"|"application/x-gzip") # older versions of file report x-gzip
gzip -cd "$@"
;;
"application/x-bzip2")
bzcat "$@"
;;
"application/x-xz")
xzcat "$@"
;;
*)
echo "Unknown file format" >&2
;;
esac
}
I have written a bash script that extracts only the sub-directory (not the full path) together with the file name. The sub-directory should be included only when the file is located in one.
For example:
If the input is src/email/${sub_dir}/Bank_Casefeed.email, the output should be ${sub_dir}/Bank_Casefeed.email.
If the input is src/layouts/Bank_Casefeed.layout, the output should be Bank_Casefeed.layout. I can easily get this using the basename command.
The src/basefolder prefix is always present. In some cases, sub-directories follow the base folder (for example, after src/email).
The script below works, but only when the module is email. It should also work when a sub-directory is present in other modules. Should I count the directories, and take the sub-directories whenever there are more than two (src/basefolder)? Is there a better way to handle both scenarios?
#!/bin/bash
filename=`basename src/email/${sub_dir}/Bank_Casefeed.email`
echo "filename is $filename"
fulldir=`dirname src/email/${sub_dir}/Bank_Casefeed.email`
dir=`basename $fulldir`
echo "subdirectory name: $dir"
echo "concatenate $filename $dir"
Entity=$dir/$filename
echo $Entity
Using shell parameter expansion:
sub_dir='test'
files=( "src/email/${sub_dir}/Bank_Casefeed.email" "src/email/Bank_Casefeed.email" )
for f in "${files[@]}"; do
if [[ $f == *"/$sub_dir/"* ]]; then
echo "${f/*\/$sub_dir\//$sub_dir\/}"
else
basename "$f"
fi
done
test/Bank_Casefeed.email
Bank_Casefeed.email
I know there might be an easier way to do this. But I believe you can just manipulate the input string. For example:
#!/bin/bash
sub_dir='test'
DIRNAME1="src/email/${sub_dir}/Bank_Casefeed.email"
DIRNAME2="src/email/Bank_Casefeed.email"
echo $DIRNAME1 | cut -f3- -d'/'
echo $DIRNAME2 | cut -f3- -d'/'
This will remove the first two directories.
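The same "drop the first two components" idea also works with parameter expansion alone, without knowing the sub-directory name or the module in advance. A minimal sketch, assuming every path starts with src/<module>/ as the question states; strip_base is a hypothetical helper name:

```shell
# strip_base is a hypothetical helper: it removes the leading "src/<module>/"
# and keeps whatever follows, with or without a sub-directory.
strip_base() {
    # A single # removes the shortest prefix matching src/*/,
    # i.e. "src/" plus exactly one more path component.
    printf '%s\n' "${1#src/*/}"
}

strip_base "src/email/test/Bank_Casefeed.email"   # prints test/Bank_Casefeed.email
strip_base "src/layouts/Bank_Casefeed.layout"     # prints Bank_Casefeed.layout
```

A path that does not start with src/ is printed unchanged, because the pattern simply does not match.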
In my /opt/myapp dir I have a remote, automated process that will be dropping files of the form <anything>-<version>.zip, where <anything> could literally be any alphanumeric filename, and where <version> will be a version number. So, examples of what this automated process will be delivering are:
fizz-0.1.0.zip
buzz-1.12.35.zip
foo-1.0.0.zip
bar-3.0.9.RC.zip
etc. Through controls outside the scope of this question, I am guaranteed that only one of these ZIP files will exist under /opt/myapp at any given time. I need to write a Bash shell command that will rename these files and move them to /opt/staging. For the rename, the ZIP files need to have their version dropped. And so /opt/myapp/<anything>-<version>.zip is renamed and moved to /opt/staging/<anything>.zip. Using the examples above:
/opt/myapp/fizz-0.1.0.zip => /opt/staging/fizz.zip
/opt/myapp/buzz-1.12.35.zip => /opt/staging/buzz.zip
/opt/myapp/foo-1.0.0.zip => /opt/staging/foo.zip
/opt/myapp/bar-3.0.9.RC.zip => /opt/staging/bar.zip
The directory move is obvious and easy, but the rename is making me pull my hair out. I need to somehow save off the <anything> and then re-access it later on in the command. The command must be generic and can take no arguments.
My best attempt (which doesn't even come close to working) so far is:
file=*.zip; file=?; mv file /opt/staging
Any ideas on how to do this?
for file in *.zip; do
[[ -e $file ]] || continue # handle zero-match case without nullglob
mv -- "$file" /opt/staging/"${file%-*}.zip"
done
${file%-*} removes the last - in the filename and everything after it. Thus, we change fizz-0.1.0.zip to fizz, and then add a leading /opt/staging/ and a trailing .zip.
To make this more generic (working with multiple extensions), see the following function (callable as a command; function body could also be put into a script with a #!/bin/bash shebang, if one removed the local declarations):
stage() {
local file ext
for file; do
[[ -e $file ]] || continue
[[ $file = *-*.* ]] || {
printf 'ERROR: Filename %q does not contain a dash and a dot\n' "$file" >&2
continue
}
ext=${file##*.}
mv -- "$file" /opt/staging/"${file%-*}.$ext"
done
}
...with that function defined, you can run:
stage *.zip *.txt
...or any other pattern you so choose.
f=foo-1.3.4.txt
echo ${f%%-*}.${f##*.}
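Note that ${f%-*} strips from the last dash, while ${f%%-*} strips from the first. The question promises an alphanumeric <anything>, so both give the same result there. A small sketch, using a made-up file name that does contain a dash, shows where they diverge:

```shell
# Compare the two expansions on a name whose <anything> part contains a dash.
# The file name is made up for illustration.
f=my-app-1.3.4.txt
ext=${f##*.}                    # longest prefix up to the last dot removed: txt
printf '%s\n' "${f%-*}.$ext"    # shortest suffix match removed: prints my-app.txt
printf '%s\n' "${f%%-*}.$ext"   # longest suffix match removed: prints my.txt
```

Which one is right depends on whether dashes can occur in the name or only in front of the version.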
I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1" || exit 1
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that for FILENAME in *.bak is safe for filenames containing spaces as long as the variable is quoted, as it is above; the results of the glob are not word-split. Read David W.'s answer for a more robust solution that also descends into subdirectories.
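If no .bak file exists, the unexpanded pattern *.bak is handed to mv as a literal name. A minimal sketch that guards against this and leaves the caller's working directory alone, assuming a hypothetical function name strip_bak:

```shell
# strip_bak is a hypothetical name; it renames every *.bak file in the
# given directory (default: the current directory) and touches nothing else.
strip_bak() {
    (
        # The subshell keeps the cd from affecting the caller.
        cd -- "${1:-.}" || exit 1
        for f in *.bak; do
            [ -e "$f" ] || continue   # pattern matched nothing: skip the literal "*.bak"
            mv -- "$f" "${f%.bak}"
        done
    )
}
```

It would be called as strip_bak /path/to/folder, or without an argument for the current directory.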
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use parameter expansion to filter a variable's value. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of % on a US keyboard.
If you use a double filter (i.e. ## or %%), you are filtering on the biggest match. If you use a single filter (i.e. # or %), you are filtering on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo "${file#*/}" # Matches "this/" and prints "is/my/file/name.txt"
echo "${file##*/}" # Matches "this/is/my/file/" and prints "name.txt"
echo "${file%/*}" # Matches "/name.txt" and prints "this/is/my/file"
echo "${file%%/*}" # Matches "/is/my/file/name.txt" and prints "this"
Notice that this is a glob match and not a regular expression match! If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match the period and all characters after it. Since it is a single %, it will match the smallest glob on the right side of the string. If the filter can't match anything, the result is the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=${file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while IFS= read -r -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates the names of the files with a NUL character, which is one of the few characters not allowed in a file name. The -d $'\0' means that your input separator is the NUL character. See how nicely find -print0 and read -d $'\0' work together?
You should almost never use the for file in $(ls *.bak) method. This will fail if the file names contain any white space.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
for FILENAME in `find . -name "*.bak"`; do mv --force "$FILENAME" "${FILENAME%.bak}"; done
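A loop over the output of find in backticks splits names on whitespace. One way around that, sketched here under the assumption of a hypothetical function name strip_bak_recursive, is to let find hand the names to a small inline sh script as arguments:

```shell
# strip_bak_recursive is a hypothetical name. find passes the file names
# to an inline sh script as arguments, so spaces and newlines survive.
strip_bak_recursive() {
    find "${1:-.}" -type f -name '*.bak' -exec sh -c '
        for f; do
            mv -- "$f" "${f%.bak}"
        done
    ' sh {} +
}
```

The extra sh after the script becomes $0 of the inline shell, and the {} + form passes many names per invocation, so only a few processes are started even for a large tree.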
I have a directory called "images" filled with about one million images. Yep.
I want to write a shell command to rename all of those images into the following format:
original: filename.jpg
new: /f/i/l/filename.jpg
Any suggestions?
Thanks,
Dan
for i in *.*; do mkdir -p "${i:0:1}/${i:1:1}/${i:2:1}/"; mv "$i" "${i:0:1}/${i:1:1}/${i:2:1}/"; done;
The ${i:0:1}/${i:1:1}/${i:2:1} part could probably be a variable, or shorter or different, but the command above gets the job done. You'll probably face performance issues but if you really want to use it, narrow the *.* to fewer options (a*.*, b*.* or what fits you)
edit: added a $ before i for mv, as noted by Dan
You can generate the new file name using, e.g., sed:
$ echo "test.jpg" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/'
t/e/s/test.jpg
So, you can do something like this (assuming all the directories are already created):
for f in *; do
mv -i "$f" "$(echo "$f" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/')"
done
or, if you can't use the bash $( syntax:
for f in *; do
mv -i "$f" "`echo "$f" | sed -e 's/^\(\(.\)\(.\)\(.\).*\)$/\2\/\3\/\4\/\1/'`"
done
However, considering the number of files, you may just want to use perl as that's a lot of sed and mv processes to spawn:
#!/usr/bin/perl -w
use strict;
# warning: untested
opendir DIR, "." or die "opendir: $!";
my @files = readdir(DIR); # can't change dir while reading: read in advance
closedir DIR;
foreach my $f (@files) {
(my $new_name = $f) =~ s!^((.)(.)(.).*)$!$2/$3/$4/$1!;
-e $new_name and die "$new_name already exists";
rename($f, $new_name);
}
That perl is surely limited to same-filesystem only, though you can use File::Copy::move to get around that.
You can do it as a bash script:
#!/bin/bash
base=base
mkdir -p $base/shorts
for n in *
do
if [ ${#n} -lt 3 ]
then
mv $n $base/shorts
else
dir=$base/${n:0:1}/${n:1:1}/${n:2:1}
mkdir -p $dir
mv $n $dir
fi
done
Needless to say, you might need to worry about spaces and the files with short names.
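Both concerns can be handled by quoting every expansion and deciding on the target directory in one place. A minimal bash sketch, assuming a hypothetical helper name shard_dir and the same shorts fallback as in the script above:

```shell
# shard_dir is a hypothetical helper (bash): it prints the directory a file
# name belongs in, or "shorts" when the name has fewer than three characters.
shard_dir() {
    local n=$1
    if [ "${#n}" -lt 3 ]; then
        printf '%s\n' "shorts"
    else
        printf '%s\n' "${n:0:1}/${n:1:1}/${n:2:1}"
    fi
}

# Usage sketch: quote every expansion so names with spaces stay intact.
# for n in *; do
#     [ -f "$n" ] || continue          # skip directories, including the new ones
#     d=base/$(shard_dir "$n")
#     mkdir -p -- "$d" && mv -- "$n" "$d/"
# done
```

The command substitution starts a subshell per file, so for a million files this is still slow; it is meant to show the quoting and the short-name check, not to be fast.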
I suggest a short python script. Most shell tools will balk at that much input (though xargs may do the trick). Will update with example in a sec.
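#!/usr/bin/env python3
import os, shutil
src_dir = '/src/dir'
dest_dir = '/dest/dir'
for fn in os.listdir(src_dir):
    if len(fn) < 3:
        continue  # names shorter than three characters cannot be sharded
    target = os.path.join(dest_dir, fn[0], fn[1], fn[2])
    os.makedirs(target, exist_ok=True)  # the directory may already exist
    # copies rather than moves; the originals stay in src_dir
    shutil.copyfile(os.path.join(src_dir, fn), os.path.join(target, fn))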
Any of the proposed solutions which use a wildcard syntax in the shell will likely fail due to the sheer number of files you have. Of the current proposed solutions, the perl one is probably the best.
However, you can easily adapt any of the shell script methods to deal with any number of files thus:
ls -1 | \
while IFS= read -r filename
do
# insert the loop body of your preference here, operating on "filename"
done
I would still use perl, but if you're limited to only having simple unix tools around, then combining one of the above shell solutions with a loop like I've shown should get you there. It'll be slow, though.