Rename files with different suffixes with only one new suffix

I am trying to modify a pipeline for viral taxonomy analysis. In this pipeline, the variables R1 and R2 hold the two input paired-end fastq files, which can end in either .fastq.gz or .fq.gz. After trimming these fastq files with Trim Galore, the pipeline renames them by removing the suffix and adding another one. For the moment the pipeline handles only .fq.gz files:
mv `basename $R1 .fq.gz`_val_1.fq $trimmedR1
mv `basename $R2 .fq.gz`_val_2.fq $trimmedR2
and I am trying to correct it:
#Command to execute:
if [[ "$R1" == *.fq.gz ]]
then
    mv `basename $R1 .fq.gz`_val_1.fq $trimmedR1
    mv `basename $R2 .fq.gz`_val_2.fq $trimmedR2
else
    mv `basename $R1 .fastq.gz`_val_1.fq $trimmedR1
    mv `basename $R2 .fastq.gz`_val_2.fq $trimmedR2
fi
but my command above does not work and I don't know why. I also tried basename -a, but that didn't work either. Thanks in advance for helping me correct my code.
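One possible approach, sketched under the assumption that the trimmed files are named after the input's base name with either suffix removed (as the basename calls above imply). The helper name strip_fastq_suffix is hypothetical; $R1, $R2, $trimmedR1 and $trimmedR2 are the pipeline's own variables. Each read is handled on its own, so R1 and R2 do not need to share a suffix.

```shell
#!/bin/bash
# Hypothetical helper: print a file name without its directory part
# and without a trailing .fastq.gz or .fq.gz suffix.
strip_fastq_suffix() {
    local name=${1##*/}                  # drop the directory part
    case "$name" in
        *.fastq.gz) name=${name%.fastq.gz} ;;
        *.fq.gz)    name=${name%.fq.gz} ;;
    esac
    printf '%s\n' "$name"
}

# Intended use in the pipeline (commented out so the sketch has no side effects):
# mv "$(strip_fastq_suffix "$R1")_val_1.fq" "$trimmedR1"
# mv "$(strip_fastq_suffix "$R2")_val_2.fq" "$trimmedR2"
```

If the mv still fails, it is worth listing the working directory to check which names the trimming step actually wrote, since this sketch only addresses the suffix handling.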

Related

Uncompress tar.gz with appending a suffix to each files name

I want to extract a tar.gz archive while appending a suffix to each file name. For example, if abc.tar.gz contains the files 'first' and 'second' and I want to append the suffix '.append', the extracted files should be named 'first.append' and 'second.append'. Is there a command or some other way to do this?
Note: files named 'first' and 'second' already exist, and I want to extract the archive without affecting them.
One thing I can think of is to extract into a temp dir and then copy the files one by one. But I would like to do it in one shot, if possible, to save time.
Thanks in advance.
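Try this:
#!/bin/bash
# Archive to extract; defaults to abc.tar.gz, or pass it as the first argument.
COMPRESS_F=abc.tar.gz
if [ $# -ne 0 ]; then
    COMPRESS_F=$1
fi
# Move existing files out of the way so that extraction cannot overwrite them.
# (Member names containing whitespace are not handled by this loop.)
for i in $(tar -tf "$COMPRESS_F"); do
    if [ -f "$i" ]; then
        echo "mv ${i} ${i}.orig"
        mv "${i}" "${i}.orig"
    fi
done
# Extract, give each extracted file the suffix, then restore the saved original.
for i in $(tar -xvzf "$COMPRESS_F"); do
    echo "mv ${i} ${i}.append"
    mv "${i}" "${i}.append"
    if [ -f "${i}.orig" ]; then
        echo "mv ${i}.orig ${i}"
        mv "${i}.orig" "${i}"
    fi
done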
Got the solution: this can be done with the '--transform' option of GNU tar, which works both when extracting (-x) and when creating (-c) an archive:
tar -xvf archive.tar --transform 's,/abc$,/abc.append,'
This renames every member whose path ends in /abc to abc.append.
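To append the suffix to every member rather than to one named file, the sed-style expression can match the end of each name instead. A minimal sketch, assuming GNU tar and an archive that contains only plain files at its top level; the function name extract_with_suffix and the destination directory argument are hypothetical.

```shell
#!/bin/bash
# Extract a gzip-compressed tar archive into DESTDIR, appending SUFFIX
# to every member name. Requires GNU tar (--transform).
# Assumes SUFFIX contains no "/" or other characters special to sed replacements.
# Usage: extract_with_suffix ARCHIVE SUFFIX DESTDIR
extract_with_suffix() {
    local archive=$1 suffix=$2 dest=$3
    mkdir -p "$dest" || return
    # s/$/SUFFIX/ replaces the (empty) end of each member name with the suffix.
    tar -xzf "$archive" -C "$dest" --transform "s/\$/$suffix/"
}

# Example (commented out): extract into the current directory;
# existing 'first' and 'second' are left untouched.
# extract_with_suffix abc.tar.gz .append .
```

Because the files are written directly under their new names, the existing 'first' and 'second' are never overwritten and no temporary directory is needed.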

Do you know how I can create backup files automatically?

I back up files a few times a day on Ubuntu/Linux with the command tar -cpvzf ~/Backup/backup_file_name.tar.gz directory_to_backup/. I want a script that creates the file name automatically. It should check whether
~/Backup/backup_file_name_`date +"%Y-%m-%d"`_a.tar.gz
exists; if it does, replace "_a" with "_b", and so on through all the letters up to z, creating the first backup file that doesn't exist. If all the files up to z exist, add "_1" to the file name (after "_z") and check successive numbers until a file doesn't exist. The script must never change an existing file, only create new backup files. Do you know how to create such a script?
You can do something like
for l in {a..z} ; do
    [[ -f ~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}.tar.gz ]] && continue
    export backupname=~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}.tar.gz && break
done
# test if $backupname is properly set, what if `z` is used? I'm leaving this to you
# then backup as usual
tar -cpvzf "$backupname" directory_to_backup/
This iterates over the letters and skips every letter whose file already exists; the first free name is stored in the backupname variable.
OK, I found a solution. I created the file ~/scripts/backup.sh:
#!/bin/bash
working_dir=$(pwd)
backupname=""
# First try the suffixes _a ... _z.
for l in {a..z} ; do
    if [ ! -f ~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}.tar.gz ]; then
        backupname=~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}.tar.gz
        break
    fi
done
# All letters are taken: fall back to _z_1, _z_2, ...
if [ -z "$backupname" ]; then
    l="z"
    for (( i = 1 ; i <= 1000; i++ )); do
        if [ ! -f ~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}_${i}.tar.gz ]; then
            backupname=~/Backup/backup_file_name_$(date +"%Y-%m-%d")_${l}_${i}.tar.gz
            break
        fi
    done
fi
if [ -n "$backupname" ]; then
    cd ~/projects/ || exit 1
    ~/scripts/tar.sh "$backupname" directory_to_backup/
    cd "$working_dir"
else
    echo "Oops! can't create backup file name."
fi
exit
The file ~/scripts/tar.sh contains this script:
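#!/bin/bash
# Refuse to overwrite an existing backup file.
if [ -f "$1" ]; then
    echo "Oops! backup file was already here."
    exit 1
fi
# $1 is the archive name, the remaining arguments are the paths to back up.
tar -cpvzf "$@"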
Now I just have to type ~/scripts/backup.sh and the script backs up my files.
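The name search in the two scripts above can also be condensed into a single function. This is only a sketch: the function name next_backup_name and its two parameters are hypothetical, and the limit of 1000 numbered names is taken from the script above.

```shell
#!/bin/bash
# Print the first unused backup path in DIR for today's date:
# PREFIX_DATE_a.tar.gz ... PREFIX_DATE_z.tar.gz, then PREFIX_DATE_z_1.tar.gz, _z_2, ...
# Returns non-zero if every candidate name is already taken.
# Usage: next_backup_name DIR PREFIX
next_backup_name() {
    local dir=$1 prefix=$2 today letter i candidate
    today=$(date +%Y-%m-%d)      # computed once so all candidates share one date
    for letter in {a..z}; do
        candidate="$dir/${prefix}_${today}_${letter}.tar.gz"
        if [ ! -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    for (( i = 1; i <= 1000; i++ )); do
        candidate="$dir/${prefix}_${today}_z_${i}.tar.gz"
        if [ ! -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example (commented out):
# backupname=$(next_backup_name ~/Backup backup_file_name) &&
#     tar -cpvzf "$backupname" directory_to_backup/
```

Computing the date once avoids the small chance that the name changes between the existence test and the assignment when the script runs across midnight.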
Create a script that saves the file with the date in its name, like
~/Backup/backup_file_name_${date}.tar.gz
and run that script as a cron job if you want to take backups at a specific interval, or run it manually if you have no such requirement.
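A sketch of that idea, assuming the time of day is added to the date so that several backups on one day get distinct names. The function name timestamped_backup_name, the script path in the crontab comment and the six-hour interval are all hypothetical.

```shell
#!/bin/bash
# Print a backup path whose name carries the date and time, so backups made
# on the same day do not collide (unless two are made in the same second).
# Usage: timestamped_backup_name DIR PREFIX
timestamped_backup_name() {
    printf '%s/%s_%s.tar.gz\n' "$1" "$2" "$(date +%Y-%m-%d_%H%M%S)"
}

# Example (commented out):
# tar -cpvzf "$(timestamped_backup_name ~/Backup backup_file_name)" directory_to_backup/
#
# Example crontab entry, assuming the script is saved as ~/scripts/backup_cron.sh,
# to run it every six hours:
# 0 */6 * * * $HOME/scripts/backup_cron.sh
```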

How can I compare two strings in bash?

I am trying to remove all ".s" files in a folder that can be derived by ".c" source files.
This is my code
for cfile in *.c; do
    #replace the last letter with s
    cfile=${cfile/%c/s}
    for sfile in *.s; do
        #compare cfile with sfile; if exists delete sfile
        if [ $cfile==$sfile ]; then
            rm $sfile;
        fi
    done
done
But this code deletes all the ".s" files. I think it's not comparing the filenames properly.
Can someone please help.
The canonical way to compare strings in bash is:
if [ "$string1" == "$string2" ]; then
Quoting both variables means the test still works if one of the strings is empty.
You can use it like this:
[[ "$cfile" = "$sfile" ]] && rm "$sfile"
OR
[[ "$cfile" == "$sfile" ]] && rm "$sfile"
OR by using the old /bin/[ (test) program
[ "$cfile" = "$sfile" ] && rm "$sfile"
Saying
if [ $cfile==$sfile ]; then
would always be true since it amounts to saying
if [ something ]; then
Always ensure spaces around the operators.
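The difference is easy to see with two small test functions. This is only an illustration; the function names are made up for this sketch.

```shell
#!/bin/bash
# Without spaces, $1==$2 expands to ONE word such as "main.s==other.s".
# [ ] with a single non-empty argument is always true.
compare_without_spaces() {
    if [ $1==$2 ]; then echo true; else echo false; fi
}

# With spaces (and quotes), [ ] receives three arguments and really compares them.
compare_with_spaces() {
    if [ "$1" = "$2" ]; then echo true; else echo false; fi
}

# compare_without_spaces main.s other.s   prints: true
# compare_with_spaces    main.s other.s   prints: false
# compare_with_spaces    main.s main.s    prints: true
```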
The other problem is that you're saying:
cfile=${cfile/%c/s}
You probably wanted to say:
sfile=${cfile/%c/s}
And you need to get rid of the inner loop:
for sfile in *.s; do
done
Just keep the comparison code.
I think the simplest solution would be:
for cfile in *.c ; do rm -f "${cfile%.c}.s" ; done
It just lists all the .c files and tries to delete the corresponding .s file (if any).
for cFile in *.c; do
    sFile="${cFile%c}s"
    if [[ -f "$sFile" ]]; then
        echo "delete $sFile"
    fi
done
The actual deletion of the files I leave as an exercise. :-)
You can also use brute force: try to delete every candidate and redirect the error messages to /dev/null:
for cFile in *.c; do
    sFile="${cFile%c}s"
    rm "$sFile" &> /dev/null
done
but this will be slower of course.

Renaming xml file extension using bash script

I have a directory which has many folders, and each folder contains a list of XML files. I am writing a bash script that traverses the files and changes the extension of a file to "manual" if its size is greater than 65 MB. This is my first time writing a shell script. I was able to write the code for traversing the files, but I am having difficulty with the renaming part.
for file in $dir
do
    size=$(stat -c%s "$file")
    if test "$size" -gt "68157440"; then
        echo "Before Renaming...."
        echo $file
        echo "After renaming"
        mv *.manual `basename $file`.xml
        echo $file
    else
        echo $file >> outlog.log
    fi
done
An example of $file is:
/apps/jAS/dev/products-app/BConverter/data/supplier-data/TF/output/Fiber Optics and Fiber Management Solutions/Fiber Optic Cable Assemblies.xml
mv *.manual `basename $file`.xml
If you want to change the extension of $file from xml to manual, use this instead:
mv "$file" "${file%.xml}.manual"
What exactly is the difficulty you're having?
If it's white space in file names, try
mv *.manual `basename "$file"`.xml
Note that your script will not work if *.manual expands to more than one file name.
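No need for a script here; a combination of find and xargs should do the trick:
find . -size +65M | xargs -IQ mv Q Q.manual
The little-used -I option of xargs:
runs each input line as a separate command, and
lets you name a placeholder for the file name, so you can use it multiple times, which is ideal for mv.
Note that this appends .manual to the existing name (file.xml becomes file.xml.manual) rather than replacing the extension.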
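A variant that replaces the .xml extension instead of appending to it can be built on find -exec. This is a sketch assuming GNU find (for the M size unit); the function name rename_large_xml and its optional size argument are hypothetical.

```shell
#!/bin/bash
# Rename every *.xml file under DIR that is larger than SIZE to *.manual.
# SIZE uses find's -size syntax and defaults to +65M (more than 65 MiB).
# Usage: rename_large_xml DIR [SIZE]
rename_large_xml() {
    local dir=$1 size=${2:-+65M}
    # find hands the names directly to the inner shell as arguments, so
    # spaces in paths are safe; ${f%.xml} strips the old extension.
    find "$dir" -type f -name '*.xml' -size "$size" -exec sh -c '
        for f do
            mv -- "$f" "${f%.xml}.manual"
        done' sh {} +
}

# Example (commented out):
# rename_large_xml /path/to/supplier-data
```

Passing the names through -exec rather than a pipe avoids any parsing of the file names, which matters for paths such as the example above that contain spaces.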

Given the file name, how to get it from a directory in shell

I want to compare two files with the same name in different directories.
$1 and $2 are the two directories. I can check whether files with the same name exist, but then I don't know how to get the second file.
cd $1
for i in `ls`
do
    if [ -f $2/$i ]
    then
        echo "find it in another directory"
        GET THE OTHER FILE IN $2, THEN COMPARE THEM
        cmp -s $i THE OTHER FILE
        if [ $? = 0 ]
            echo "they are same"
        else
            echo "they are different"
        fi
    fi
done
The simplest problem would be spaces in the arguments, which is easy to fix: just quote $1 and $2.
if [ -f "$2/$i" ]
But I suspect the real problem is that you cd into $1, which means $2 is no longer valid (if it is a relative path).
Solution1) Use absolute paths (e.g. /staff/bathpp/stuff/dir2)
Solution2) If you are expecting relative paths, then grab the current dir before jumping.
origDir=`pwd`
...
path2="$origDir/$path2"
Personally I'd do some checks so it works for both.
for FIRST in "$1"/*
do
    SECOND="$2/$(basename "$FIRST")"
    if [ -f "$SECOND" ]; then
        diff --brief "$FIRST" "$SECOND"
    fi
done
N.B. diff --brief only outputs when they are different. If you want to see the actual difference, remove the --brief.
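The pieces above can be combined into one function that prints a verdict for each file found in both directories, as the question asked. A minimal sketch; the function name compare_dirs and the output wording are hypothetical. It never changes directory, so relative paths in both arguments stay valid.

```shell
#!/bin/bash
# For every regular file in DIR1 that also exists in DIR2, report whether
# the two copies have identical contents.
# Usage: compare_dirs DIR1 DIR2
compare_dirs() {
    local dir1=$1 dir2=$2 first second name
    for first in "$dir1"/*; do
        [ -f "$first" ] || continue      # skip directories and an unmatched glob
        name=${first##*/}                # file name without the directory part
        second="$dir2/$name"
        [ -f "$second" ] || continue     # no counterpart in DIR2
        if cmp -s "$first" "$second"; then
            echo "$name: same"
        else
            echo "$name: different"
        fi
    done
}

# Example (commented out):
# compare_dirs "$1" "$2"
```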
