How to make the bash script work with one command after another? - linux

I have a bash script like below. First it will take sorted.bam files as input and use "stringtie" tool give each sample gtf as output. Then path for each sample gtf will be given into mergelist.txt. and then use "stringtie merge" on them to get "stringtie_merged.gtf".
I totally have 40 sorted.bam files.
for sample in /path/*.sorted.bam
do
dir="/pathto/hisat2_output"
dir2="/pathto/folder"
base=`basename $sample '.sorted.bam'`
"stringtie -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/${base}/${base}_GRCh38.gtf -l ${dir2}/stringtie_output/${base}/${base} ${dir}/${base}.sorted.bam; ls ${dir2}/stringtie_output/*/*_GRCh38.gtf > mergelist.txt; stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/stringtie_merged.gtf mergelist.txt"
done
I separated the commands with ; After running the script on all sorted.bam files and after completing the job I see that mergelist.txt has paths only for 33 sample gtf's. Which means the path for other 7 sample gtfs is missing in merge list.txt.
Is Separating the commands with ; right one or is there any other way?
The script should use one command first and with the output the paths need to be given in the text file and then use the other command.

You haven't separated the commands with semi-colons; you've invoked a single command that has semi-colons embedded in it. Consider the simple script:
"ls; pwd"
This script does not call ls followed by pwd. Instead, the shell will search the PATH looking for a file named ls; pwd (that is, a file with a semi-colon and a space in its name), probably not find one and respond with an error message. You need to remove the double quotes.

What's wrong with multiple lines, as you already have more than one line:
dir="/pathto/hisat2_output"
dir2="/pathto/folder"
for sample in /path/*.sorted.bam ;do
base=$(basename ${sample} '.sorted.bam')
stringtie -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/${base}/${base}_GRCh38.gtf -l ${dir2}/stringtie_output/${base}/${base} ${dir}/${base}.sorted.bam
ls ${dir2}/stringtie_output/*/*_GRCh38.gtf > mergelist.txt
stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/stringtie_merged.gtf mergelist.txt
done
Anyway, I don't see the point in having the second stringtie command inside the loop, it should work fine just after.
If stringtie is able process STDIN you might get away without the mergelist.txt by using:
stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/stringtie_merged.gtf <<< $(echo ${dir2}/stringtie_output/*/*_GRCh38.gtf)

you should double quote your variables and use $( command ) instead backticks
base=$( basename $sample '.sorted.bam' ) :
you have a space in filenames??
prefer:
base=$( basename "$sample.sorted.bam" ) # with or without space
if you have spaces, you must double quote:
stringtie -p 8 \
-G gencode.v27.primary_assembly.annotation_nochr.gtf \
-o "$dir2/stringtie_output/$base/$base_GRCh38.gtf" \
-l "$dir2/stringtie_output/$base/$base" \
"$dir/$base.sorted.bam"
ls "$dir2"/stringtie_output/*/*_GRCh38.gtf > mergelist.txt
...

Related

Execute commands in specific location and depending on answer of previous command

I am currently working on a Text-to-speech project and I need to write bash script which will, when it is called, execute two commands. If the first command returns the proper answer (if returns an answer at all), the second command will be called and executed.
My question is, how can I write a script, that executes shell commands in a specific certain file system location?
For example, I need to be in the directory /opt/text/example and execute this command:
sudo ./bin/sample_read -I ../languages/ -I ../languages -v dave -T 2 \
-i /opt/text/example.txt -F 22 -O embedded-pro -o out_file.pcm
and then to wait for the answer, then (if it is good) execute the second command.
The second command is
aplay -f S16_LE -r 22050 -c 1 out_file.pcm
This should help:
pushd /path/to/directory
my_var=$(command1)
if [ "$my_var" == "expected_result" ]; then
command2
fi
popd
You basically run command1 and store its output in my_var. Then you compare the content of $my_var with whatever you're expecting.
Also pushd <path>/popd allow you to move to a directory and back.

Executing same command for several files in same repository in linux

I'd like to execute the following command for several files in same repository in linux:
../../../../../openSMILE-2.1.0/SMILExtract -C ../../../../../openSMILE-2.1.0/config/IS13_ComParE.conf -I inputfilename.wav -D outputfilename.csv
there are several files (named 1.wav, 2.wav, 3.wav) in the directory and if I execute
../../../../../openSMILE-2.1.0/SMILExtract -C ../../../../../openSMILE-2.1.0/config/IS13_ComParE.conf -nologfile 1 -noconsoleoutput 1 -I 1.wav -D 1.csv
it outputs 1.csv.
How can I create 1.csv, 2.csv, 3.csv, .. by executing just one single command in linux? (or do I have to make .sh file?)
It's probably cleaner to put the following to a script, but you can type it directly into the bash command line as well:
#! /bin/bash
for file in *.wav ; do
prefix=${file%.wav} # Remove from the right.
../../../../../openSMILE-2.1.0/SMILExtract \
-C ../../../../../openSMILE-2.1.0/config/IS13_ComParE.conf \
-I "$file" -D "$prefix".csv
done

substitute parts of a command in a bash script with arguments

I am very new to programing, especially in bash.
I often run the same long command filled with many arguments each day but with different file names and one other argument, the bulk of the string remains the same. This is the case for Clonezilla server.
I want to be able to run a script that only asks for the for the portions I want to change, then substitutes those values into the original string and pass it along to the shell.
For example:
I have a set of file names: a.img b.img c.img
the command I run: drbl-ocs -b -g auto -e1 auto -e2 -r -x -j2 -sc0 -p poweroff --clients-to-wait <number value here> -l en_US.UTF-8 startdisk multicast_restore <name of file> nvme0n1
My goal is to be able to type: sudo sh myscript.sh arga argb
where "arga" is a number that replaces <number value here> and "argb" replaces <name of file> in the above string.
Thanks for any help!
The command line arguments to a script can be found in the positional parameters, parameters whose names are positive integers.
#!/bin/sh
drbl-ocs -b -g auto -e1 auto -e2 -r -x -j2 -sc0 \
-p poweroff --clients-to-wait "$1" -l en_US.UTF-8 \
startdisk multicast_restore "$2" nvme0n1

wget and bash error: bash: line 0: fg: no job control

I am trying to run a series of commands in parallel through xargs. I created a null-separated list of commands in a file cmd_list.txt and then attempted to run them in parallel with 6 threads as follows:
cat cmd_list.txt | xargs -0 -P 6 -I % bash -c %
However, I get the following error:
bash: line 0: fg: no job control
I've narrowed down the problem to be related to the length of the individual commands in the command list. Here's an example artificially-long command to download an image:
mkdir a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8
wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg
Just running the wget command on its own, without the file list and without xargs, works fine. However, running this command at the bash command prompt (again, without the file list) fails with the no job control error:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -I % bash -c %
If I leave out the long folder name and therefore shorten the command, it works fine:
echo "wget --no-check-certificate --no-verbose -O /tmp/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -I % bash -c %
xargs has a -s (size) parameter that can change the max size of the command line length, but I tried increasing it to preposterous sizes (e.g., 16000) without any effect. I thought that the problem may have been related to the length of the string passed in to bash -c, but the following command also works without trouble:
bash -c "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg"
I understand that there are other options to run commands in parallel, such as the parallel command (https://stackoverflow.com/a/6497852/1410871), but I'm still very interested in fixing my setup or at least figuring out where it's going wrong.
I'm on Mac OS X 10.10.1 (Yosemite).
It looks like the solution is to avoid the -I parameter for xargs which, per the OS X xargs man page, has a 255-byte limit on the replacement string. Instead, the -J parameter is available, which does not have a 255-byte limit.
So my command would look like:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -J % bash -c %
However, in the above command, only the portion of the replacement string before the first whitespace is passed to bash, so bash tries to execute:
wget
which obviously results in an error. My solution is to ensure that xargs interprets the commands as null-delimited instead of whitespace-delimited using the -0 parameter, like so:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -0 -J % bash -c %
and finally, this works!
Thank you to #CharlesDuffy who provided most of this insight. And no thank you to my OS X version of xargs for its poor handling of replacement strings that exceed the 255-byte limit.
I suspect it's the percent symbol, and your top shell complaining.
cat cmd_list.txt | xargs -0 -P 6 -I % bash -c %
Percent is a metacharacter for job control. "fg %2", e.g. "kill %4".
Try escaping the percents with a backslash to signal to the top shell that it should not try to interpret the percent, and xargs should be handed a literal percent character.
cat cmd_list.txt | xargs -0 -P 6 -I \% bash -c \%

Triple nested quotations in shell script

I'm trying to write a shell script that calls another script that then executes a rsync command.
The second script should run in its own terminal, so I use a gnome-terminal -e "..." command. One of the parameters of this script is a string containing the parameters that should be given to rsync. I put those into single quotes.
Up until here, everything worked fine until one of the rsync parameters was a directory path that contained a space. I tried numerous combinations of ',",\",\' but the script either doesn't run at all or only the first part of the path is taken.
Here's a slightly modified version of the code I'm using
gnome-terminal -t 'Rsync scheduled backup' -e "nice -10 /Scripts/BackupScript/Backup.sh 0 0 '/Scripts/BackupScript/Stamp' '/Scripts/BackupScript/test' '--dry-run -g -o -p -t -R -u --inplace --delete -r -l '\''/media/MyAndroid/Internal storage'\''' "
Within Backup.sh this command is run
rsync $5 "$path"
where the destination $path is calculated from text in Stamp.
How can I achieve these three levels of nested quotations?
These are some question I looked at just now (I've tried other sources earlier as well)
https://unix.stackexchange.com/questions/23347/wrapping-a-command-that-includes-single-and-double-quotes-for-another-command
how to make nested double quotes survive the bash interpreter?
Using multiple layers of quotes in bash
Nested quotes bash
I was unsuccessful in applying the solutions to my problem.
Here is an example. caller.sh uses gnome-terminal to execute foo.sh, which in turn prints all the arguments and then calls rsync with the first argument.
caller.sh:
#!/bin/bash
gnome-terminal -t "TEST" -e "./foo.sh 'long path' arg2 arg3"
foo.sh:
#!/bin/bash
echo $# arguments
for i; do # same as: for i in "$#"; do
echo "$i"
done
rsync "$1" "some other path"
Edit: If $1 contains several parameters to rsync, some of which are long paths, the above won't work, since bash either passes "$1" as one parameter, or $1 as multiple parameters, splitting it without regard to contained quotes.
There is (at least) one workaround, you can trick bash as follows:
caller2.sh:
#!/bin/bash
gnome-terminal -t "TEST" -e "./foo.sh '--option1 --option2 \"long path\"' arg2 arg3"
foo2.sh:
#!/bin/bash
rsync_command="rsync $1"
eval "$rsync_command"
This will do the equivalent of typing rsync --option1 --option2 "long path" on the command line.
WARNING: This hack introduces a security vulnerability, $1 can be crafted to execute multiple commands if the user has any influence whatsoever over the string content (e.g. '--option1 --option2 \"long path\"; echo YOU HAVE BEEN OWNED' will run rsync and then execute the echo command).
Did you try escaping the space in the path with "\ " (no quotes)?
gnome-terminal -t 'Rsync scheduled backup' -e "nice -10 /Scripts/BackupScript/Backup.sh 0 0 '/Scripts/BackupScript/Stamp' '/Scripts/BackupScript/test' '--dry-run -g -o -p -t -R -u --inplace --delete -r -l ''/media/MyAndroid/Internal\ storage''' "

Resources