bash for loops - how to use the ${file} variable? - linux

I would like to run a command on a list of paired files with the following format
SAMPLE_1.1.fq.gz SAMPLE_1.2.fq.gz
SAMPLE_2.1.fq.gz SAMPLE_2.2.fq.gz
etc...
These files are in a directory called ../cleaned-trimmed
I have the list of samples in a txt file (samples_final.txt - one sample per line) in a directory called info.
SAMPLE_1
SAMPLE_2
SAMPLE_3
I would like to run the following command on all the samples:
gsnap <args> --output-file=./alignments.gsnap/SAMPLE_1.mapped.sam --failed-input=./alignments.gsnap/SAMPLE_1.unmapped.fa ../cleaned-trimmed/SAMPLE_1.1.fq.gz ../cleaned-trimmed/SAMPLE_1.2.fq.gz
Where the args are the database used, command flags, etc.
I modified the script from a previous answer in stackoverflow to build a loop as follows:
for file in $(<../info/samples_final.txt)
do
gsnap <args> --output-file=./alignments.gsnap/${file}.mapped.sam --failed-input=./alignments.gsnap/${file}.unmapped.fa ../cleaned-trimmed/${file}.1.fq.gz ../cleaned-trimmed/${file}.2.fq.gz
done
but it does not pass the variables on correctly.
How do I pass on the values from samples_final.txt to the command?
At the moment the script garbles the file names when I run the loop. So, for example, if I run a test on the file "for_test2.txt":
SAMPLE_1
SAMPLE_2
Using the echo command:
for file in $(<../info/for_test2.txt)
do
echo ../cleaned-trimmed/${file}.1.fq.gz
done
I get the following output:
.1.fq.gzed-trimmed/SAMPLE_1
.1.fq.gzed-trimmed/SAMPLE_2
.1.fq.gzed-trimmed/
So it seems to have replaced ../clean with the .1.fq.gz
I genuinely do not understand the logic of this.

Your file has DOS line endings. The carriage return character moves the cursor back to the beginning of the current line, which is why the .1.fq.gz part in your last code snippet is printed at the beginning of the line. You can first convert your file to Unix line endings:
dos2unix ../info/for_test2.txt
Then read the file line by line and execute your command. Remember to quote your variables:
while IFS= read -r file; do
# protect against empty lines in input file
if [ -z "$file" ]; then continue; fi
gsnap <args> --output-file=./alignments.gsnap/"$file".mapped.sam --failed-input=./alignments.gsnap/"$file".unmapped.fa ../cleaned-trimmed/"$file".1.fq.gz ../cleaned-trimmed/"$file".2.fq.gz
done <../info/for_test2.txt
or like a pro with xargs:
<../info/for_test2.txt xargs -I{} gsnap <args> --output-file=./alignments.gsnap/{}.mapped.sam --failed-input=./alignments.gsnap/{}.unmapped.fa ../cleaned-trimmed/{}.1.fq.gz ../cleaned-trimmed/{}.2.fq.gz
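If dos2unix is not available, the trailing carriage return can also be stripped inside the read loop. The following is a minimal sketch, not a drop-in replacement: the sample list is a hypothetical temporary file, and the loop only builds the file names instead of calling gsnap.

```shell
# Hypothetical demo input with DOS (CRLF) line endings and one empty line.
list=$(mktemp)
printf 'SAMPLE_1\r\nSAMPLE_2\r\n\r\n' > "$list"

cr=$(printf '\r')      # a literal carriage return character
result=""
while IFS= read -r file; do
    file=${file%"$cr"}             # remove one trailing CR, if present
    [ -z "$file" ] && continue     # skip lines that are empty after stripping
    result="${result}../cleaned-trimmed/${file}.1.fq.gz
"
done < "$list"

printf '%s' "$result"
rm -f "$list"
```

The parameter expansion ${file%"$cr"} only removes a carriage return at the end of the value, so lists that already have Unix line endings pass through unchanged.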

Related

How do you append a string built with interpolation of vars and STDIN to a file?

Can someone fix this for me.
It should copy a version log file to backup after moving to a repo directory
Then it automatically appends line given as input to the log file with some formatting.
That's it.
Assume existence of log file and test directory.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG |
VHENTRY="- **${LOGDATE}** | ${VHMSG}"
cat ${VHENTRY} >> versionlog.MD
shell output
virufac@box:~/Git/test$ ~/.logvh.sh
MSG > testing script
EOF
EOL]
EOL
e
E
CTRL + C to get out of stuck in reading lines of input
virufac@box:~/Git/test$ cat versionlog.MD
directly outputs the markdown
# Version Log
## version 0.0.1 established 01-22-2020
*Working Towards Working Mission 1 Demo in 0.1 *
- **01-22-2020** | discovered faker.Faker and deprecated old namelessgen
EOF
EOL]
EOL
e
E
I finally got it to save the damned input lines to the file instead of just echoing the command I wanted to enter on the screen and not executing it. But... why isn't it adding the lines built from the VHENTRY variable... and why doesn't it stop reading after one line sometimes and this time not. You could see I was trying to do something to tell it to stop reading the input.
After realizing that one thing in the script was there by accident, I tried to fix it and saw that the | at the end of the read command was seemingly the only reason the script saved anything to the file in the first place.
I would have done this in python3 if I had known this script wouldn't be the simplest thing I had ever done. Now I just want to know how it is done, after all the time spent on it, so that I remember never to assume a shell script will save time again.
Use printf to write a string to a file. cat treats its arguments as names of files to read, and the argument - means standard input. Because ${VHENTRY} is unquoted and starts with -, cat reads from standard input until EOF, so your script hangs while it waits for you to type all the input.
Don't put quotes around the path when it starts with ~, as the quotes make it a literal instead of expanding to the home directory.
Get rid of | at the end of the read line. read doesn't write anything to stdout, so there's nothing to pipe to the following command.
There isn't really any need for the VHENTRY variable, you can do that formatting in the printf argument.
#!/bin/bash
cd ~/Git/test
cp versionlog.MD .versionlog.MD.old
LOGDATE="$(date --utc +%m-%d-%Y)"
read -p "MSG > " VHMSG
printf -- '- **%s** | %s\n' "${LOGDATE}" "$VHMSG" >> versionlog.MD
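The reason the message is passed as a printf argument rather than placed in the format string can be seen in a small sketch. Here append_entry is a hypothetical helper and the log is a temporary demo file, not the real versionlog.MD.

```shell
# append_entry is a hypothetical helper for this demo.
append_entry() {
    # $1 = log file, $2 = date string, $3 = message
    printf -- '- **%s** | %s\n' "$2" "$3" >> "$1"
}

log=$(mktemp)                       # throwaway demo log file
printf '# Version Log\n' > "$log"

append_entry "$log" "01-22-2020" "testing script"
# A message containing % and a backslash is written literally,
# because it is data for %s and not part of the format string.
append_entry "$log" "01-23-2020" '100% done \n still literal'

cat "$log"
```

The -- after printf tells it that the format string, which begins with a dash, is not an option.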

Assign full text file path to a variable and use variable as file path in sh file

I am trying to create a shell script for logs that appends data to a text file. I have written this sample "test.sh" code for testing:
#!/bin/sh -e
touch /home/sample.txt
SPTH = '/home/sample'.txt
echo "MY LOG FILE" >> "$SPTH"
echo "DUMP started at $(date +'%d-%m-%Y %H:%M:%S')" >> /home/sample.txt
echo "DUMP finished at $(date +'%d-%m-%Y %H:%M:%S')" >> /home/sample.txt
In the code above, every line works correctly except one:
echo "MY LOG FILE" >> "$SPTH"
It is giving error:
test.sh: line 6: : No such file or directory
I want to replace the full path of the file "/home/sample.txt" with the variable "$SPTH".
I am executing my shell script using
sh test.sh
What am I doing wrong?
Variable assignments in the shell must not have spaces around the =. With spaces, the line is interpreted as a command named SPTH, with = and the following words passed to it as arguments, which is not what you want.
Change your code to
SPTH="/home/sample.txt"
That is the reason why SPTH was not assigned the path you intended. There is also no reason to use single quotes here and leave the extension outside them. Putting the whole path in double quotes is absolutely fine.
The syntax for the command line is that the first token is a command, tokens are separated by whitespace. So:
SPTH = '/home/sample'.txt
Has the command as SPTH, the second token is =, and so on. You might think this is daft, but most shells behave like this for historical reasons.
So you need to remove the whitespace:
SPTH='/home/sample'.txt
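The difference between the two forms can be checked directly. This is a minimal sketch that assumes no command named SPTH exists on the system and uses a temporary file from mktemp in place of /home/sample.txt.

```shell
# With spaces, the shell tries to run a command named SPTH.
# A command that cannot be found yields exit status 127.
if SPTH = '/home/sample'.txt 2>/dev/null; then
    status=0
else
    status=$?
fi
echo "status with spaces: $status"

# Without spaces, this is an assignment. mktemp supplies a throwaway demo path.
SPTH=$(mktemp)
echo "MY LOG FILE" >> "$SPTH"
cat "$SPTH"
```

In the original script, the failed command left SPTH empty, so the redirection >> "$SPTH" had no file name to open, which matches the "No such file or directory" error.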

I can't get my bash script to run

This is the script that will not run. I am hoping someone can help me figure out what the issue is. I am new to Unix.
#!/bin/bash
# cat copyit
# copies files
numofargs=$#
listoffiles=
listofcopy=
# Capture all of the arguments passed to the command, store all of the arguments, except
# for the last (the destination)
while [ "$#" -gt 1 ]
do
listoffiles="$listoffiles $1"
shift
done
destination="$1"
# If there are less than two arguments that are entered, or if there are more than two
# arguments, and the last argument is not a valid directory, then display an
# error message
if [ "$numofargs" -lt 2 -o "$numofargs" -gt 2 -a ! -d "$destination" ]
then
echo "Usage: copyit sourcefile destinationfile"
echo "       copyit sourcefile(s) directory"
exit 1
fi
# look at each sourcefile
for fromfile in $listoffiles
do
# see if destination file is a directory
if [ -d "$destination" ]
then
destfile="$destination/`basename $fromfile`"
else
destfile="$destination"
fi
# Add the file to the copy list if the file does not already exist, or if
# the user says that the file can be overwritten
if [ -f "$destfile" ]
then
echo "$destfile already exist; overwrite it? (yes/no)? \c"
read ans
if [ "$ans" = yes ]
then
listofcopy="$listofcopy $fromfile"
fi
else
listofcopy="$listofcopy $fromfile"
fi
done
# If there is something to copy - copy it
if [ -n "$listofcopy" ]
then
mv $listofcopy $destination
fi
This is what I got. It seems that the script didn't execute although I did invoke it. I am hoping that someone can help me.
[taniamack@localhost ~]$ chmod 555 tryto.txt
[taniamack@localhost ~]$ tryto.txt
bash: tryto.txt: command not found...
[taniamack@localhost ~]$ ./tryto.txt
./tryto.txt: line 7: $'\r': command not found
./tryto.txt: line 11: $'\r': command not found
./tryto.txt: line 16: $'\r': command not found
./tryto.txt: line 43: syntax error near unexpected token `$'do\r''
'/tryto.txt: line 43: `do
Looks like your file contains Windows new line formatting: "\r\n". On Unix, a new line is just "\n". You can use dos2unix (apt-get install dos2unix), to convert your files.
Also have a look at the chmod manual (man chmod).
Most of the time I just use chmod +x ./my_file to give execution rights.
I see a few issues. First of all, a mode of 555 means that no one can write to the file. You probably want chmod 755. Second, the shell only looks for commands in the directories listed in your $PATH variable, which is why tryto.txt is not found while ./tryto.txt is. Windows also has a %PATH%, and there the current directory . is always searched by default, but on Unix adding the current directory to $PATH is highly discouraged because of security concerns. The standard is to put your scripts under the $HOME/bin directory and make that directory the last entry in your $PATH.
First of all: Indent correctly. When you enter a loop or an if statement, indent the lines by four characters (that's the standard). It makes it much easier to read your program.
Another issue is your line endings. It looks like some of the lines have a Windows line ending on them while most others have a Unix/Linux/Mac line ending. Windows ends each line with two characters - Carriage Return and Linefeed while Unix/Linux/Mac end each line with just a Linefeed. The \r is used to represent the Carriage Return character. Use a program editor like vim or gedit. A good program editor will make sure that your line endings are consistent and correct.
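If dos2unix is not installed, tr can remove the carriage returns as well. The following is a minimal sketch using a hypothetical temporary script file; it counts the lines containing a carriage return before and after stripping them.

```shell
# Hypothetical two-line script saved with Windows (CRLF) line endings.
script=$(mktemp)
printf '#!/bin/bash\r\necho hello\r\n' > "$script"

cr=$(printf '\r')                              # a literal carriage return
before=$(grep -c "$cr" "$script" || true)      # lines that contain a CR

fixed=$(mktemp)
tr -d '\r' < "$script" > "$fixed"              # delete every carriage return
after=$(grep -c "$cr" "$fixed" || true)        # grep -c prints 0 when none match

echo "lines with CR before: $before, after: $after"
```

Writing to a second file and then replacing the original avoids truncating the script while tr is still reading it.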

How to show line number when executing bash script

I have a test script that contains a lot of commands and generates lots of output. I use set -x or set -v together with set -e, so the script stops when an error occurs. However, it is still rather difficult for me to find the line at which execution stopped in order to locate the problem.
Is there a method which can output the line number of the script before each line is executed?
Or output the line number before the command trace generated by set -x?
Or any method which can deal with my script line location problem would be a great help.
Thanks.
You mention that you're already using -x. The variable PS4 holds the prompt that is printed before each command line is echoed when the -x option is set. It defaults to + followed by a space.
You can change PS4 to emit the LINENO (The line number in the script or shell function currently executing).
For example, if your script reads:
$ cat script
foo=10
echo ${foo}
echo $((2 + 2))
Executing it thus would print line numbers:
$ PS4='Line ${LINENO}: ' bash -x script
Line 1: foo=10
Line 2: echo 10
10
Line 3: echo 4
4
http://wiki.bash-hackers.org/scripting/debuggingtips gives the ultimate PS4 that would output everything you will possibly need for tracing:
export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'
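Since the question combines tracing with set -e, another option that is not part of the answer above is an ERR trap that reports the line of the failing command. The following is a minimal sketch under the assumption that the script runs under bash; the demo script is a hypothetical temporary file.

```shell
# Write a small hypothetical bash script to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
set -e
trap 'echo "failed at line $LINENO" >&2' ERR
echo "step one"
false
echo "never reached"
EOF

# Run it with bash and capture stdout and stderr together.
# The script exits non-zero on purpose, hence the "|| true".
output=$(bash "$demo" 2>&1) || true
printf '%s\n' "$output"
```

The trap fires for the false command on line 5 of the demo script, and set -e then stops the script before the last echo runs.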
In Bash, $LINENO contains the number of the line that the script is currently executing.
If you need to know the line number where the function was called, try $BASH_LINENO. Note that this variable is an array.
For example:
#!/bin/bash
function log() {
echo "LINENO: ${LINENO}"
echo "BASH_LINENO: ${BASH_LINENO[*]}"
}
function foo() {
log "$@"
}
foo "$@"
See here for details of Bash variables.
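To see what the two variables report, here is a small self-contained variant of the example above. It is a sketch that assumes bash is installed; the script is written to a hypothetical temporary file so that its line numbers are known.

```shell
# Hypothetical demo script; the line numbers in the comments refer to this file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
log() {
    echo "LINENO=${LINENO} caller=${BASH_LINENO[0]}"
}
foo() {
    log "from foo"
}
foo
EOF

# LINENO is the line of the echo inside log (line 3);
# BASH_LINENO[0] is the line in foo where log was called (line 6).
output=$(bash "$demo")
printf '%s\n' "$output"
```

Further elements of BASH_LINENO walk up the call stack, so ${BASH_LINENO[1]} in this sketch would be the line where foo itself was called.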
PS4 with value $LINENO is what you need,
E.g. Following script (myScript.sh):
#!/bin/bash -xv
PS4='${LINENO}: '
echo "Hello"
echo "World"
Output would be:
./myScript.sh
+echo Hello
3 : Hello
+echo World
4 : World
Workaround for shells without LINENO
In a fairly sophisticated script I wouldn't like to see all line numbers; rather I would like to be in control of the output.
Define a function
echo_line_no () {
grep -n "$1" $0 | sed "s/echo_line_no//"
# grep the line(s) containing input $1 with line numbers
# replace the function name with nothing
} # echo_line_no
Use it with quotes like
echo_line_no "this is a simple comment with a line number"
Output is
16 "this is a simple comment with a line number"
if the number of this line in the source file is 16.
This basically answers the question How to show line number when executing bash script for users of ash or other shells without LINENO.
Anything more to add?
Sure. Why do you need this? How do you work with this? What can you do with this? Is this simple approach really sufficient or useful? Why do you want to tinker with this at all?
Want to know more? Read reflections on debugging
Simple (but powerful) solution: Place echo statements around the code that you think causes the problem, and move them line by line until a message no longer appears on screen, because the script has stopped on an error before reaching it.
Even more powerful solution: Install bashdb the bash debugger and debug the script line by line
If you're using $LINENO within a function, it will cache the first occurrence. Instead use ${BASH_LINENO[0]}

Can I write a script in the command line which iterates though all files in a dir?

I would usually write a script for the following command but this time I only want to use it once and therefore would like to write it in the command line.
The script processes all files in a dir.
for FILE in *.tif # grab all the tif files
do
NEWFILE=test/${FILE} # create the new file name
gdal_translate -a_srs EPSG:25832 $FILE $NEWFILE
done
sorry...I forgot to mention that I did try "
for FILE in *.tif do NEWFILE = test_${FILE} gdal_translate -outsize 50% 50% %FILE %NEWFILE done"
..but it freezes with a > on the next line...as though it is waiting for something else.
There is basically no difference between an interactive command and a script. If you want to put your commands on one line, separate them with semicolons instead of line breaks.
for f in *.tif; do gdal_translate -a_srs EPSG:25832 "$f" "test/$f"; done
The secondary prompt is displayed by the shell if your command was not yet complete, such as if you are in the middle of a quoted string or a compound command, or if the previous line ended in a backslash.
You need semicolons between your script lines. Try
for FILE in *.tif; do NEWFILE="test/${FILE}"; gdal_translate -a_srs EPSG:25832 "$FILE" "$NEWFILE"; done
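Quoting the loop variable keeps the one-liner safe for file names that contain spaces. A minimal sketch, using cp as a stand-in for gdal_translate and hypothetical file names in a temporary directory:

```shell
# Throwaway working directory with two hypothetical input files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir test
touch "a.tif" "b c.tif"    # the second name contains a space on purpose

# Same shape as the one-liner above; cp stands in for gdal_translate.
for f in *.tif; do cp -- "$f" "test/$f"; done

ls test
```

Without the quotes, "b c.tif" would be split into two words and the command would receive the wrong arguments.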

Resources