How to apply a Praat script to an audio file?

I'm trying to change the formants of an audio file with Praat in Colab. I found the script that does that, its code, and the code for calculating formants. I installed Praat:
!sudo apt-get update -y -qqq --fix-missing && apt-get install -y -qqq praat > /dev/null
!wget -qqq http://www.praatvocaltoolkit.com/downloads/plugin_VocalToolkit.zip
!unzip -qqq /content/plugin_VocalToolkit.zip > /dev/null
with open('/content/script.praat', 'w') as f:
    f.write(r"""writeInfoLine: preferencesDirectory$""")
!praat /content/script.praat
/root/.praat-dir
!mv /content/plugin_VocalToolkit/* /root/.praat-dir
!praat --version
Praat 6.0.37 (February 3 2018)
How can I apply this script to multiple wav files without UI, using linux command line or python?

The general answer
You don't. You run a script, and it's entirely up to the script how it works, what objects it works on, where those objects are fetched, how they are fetched, etc.
So you always have to look at how to apply a specific script, and that always entails figuring out how that script wants its input, and how to get to that point.
The specific answer
The page for the script you want says
This command [does something on] each selected Sound
so the first thing will be to open the files you want and select them.
Let's assume you'll be working with a small enough number of sounds to open them all in one go. If you are working on a lot of sound files, or files that are too large to hold in memory, you'll have to batch the job into smaller chunks.
One way to do this would be with a wrapper script that opened your files, selected them, and executed the other script you want:
# Get a list of all your files
files = Create Strings as file list: "list", "/some/path/*.wav"
total_files = Get number of strings
# Open each of them
for i to total_files
    selectObject: files
    filename$ = Get string: i
    sounds[i] = Read from file: "/some/path/" + filename$
endfor
# Clear the selection
nocheck selectObject(undefined)
# Add each sound to your selection
for i to total_files
    plusObject: sounds[i]
endfor
# Run your script. The general form is
#     runScript: path_to_script$, ...
# where the ... is the list of arguments your script expects.
# In your specific case, it would be something like
runScript: preferencesDirectory$ + "/plugin_VocalToolkit/changeformants.praat",
    ... 500, 1500, 2500, 0, 0, 5500, "yes", "yes"
# The arguments are, in order:
#   500, 1500, 2500, 0, 0   new F1, F2, F3, F4, and F5 means
#   5500                    max formant
#   "yes"                   process only voiced parts
#   "yes"                   retrieve intensity contour
# Do something with whatever the script gives you
My Praat is pretty rusty, but this should at least give you an idea of what to do (disclaimer: I haven't run any of the above, but the concepts should be fine).
With that "wrapper" script stored somewhere, you can then execute it from the command line:
$ praat /path/to/wrapper.praat
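Since the question also mentions Python, here is a minimal sketch of calling such a wrapper from Python, once per folder of sounds. It rests on assumptions that are not part of the answer above: that the wrapper has been adapted to take the folder as its only script argument (the version above hard-codes "/some/path/"), and that Praat 6's --run switch is used to run it without the GUI. The names praat_command and run_on_folders are made up for illustration.

```python
import subprocess

def praat_command(script, *args, praat="praat"):
    """Build the argument list for running a Praat script without the GUI."""
    # --run is the Praat 6 switch for executing a script from the command line.
    return [praat, "--run", script, *[str(a) for a in args]]

def run_on_folders(script, folders):
    """Run the wrapper once per folder, i.e. one batch of sounds per call."""
    for folder in folders:
        # Assumption: the wrapper accepts the folder as its single argument.
        subprocess.run(praat_command(script, folder), check=True)
```

Calling run_on_folders("/path/to/wrapper.praat", ["/some/path"]) would then start Praat once for that folder; splitting a large collection over several folders is one way to do the batching mentioned above.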

Related

How to remove some number of files using a wild card before adding stuff back to them?

I have a set of .txt files named my_file_1.txt, my_file_2.txt, ..., my_file_n.txt where n is finite integer. As my python code is running (in a directory with path ~/simulation/some_code), it is adding data into these files using the following for loop:
for realization in np.arange(1, n+1):
    # Identifying the file_path
    some_name = 'output/{}/Info/{}/{}/parameter_{:.3f}/my_file_{}.txt'.format(size, name, status, value, realization)
    # do some stuff
    with open(some_name, "a") as filename:
        print('{}'.format(some_list), file=filename)
However, to begin with, these files are not empty and need to be emptied. To do so, I am running the following line ahead of time (in the home directory ~/ which is two levels up with respect to the directory of the above code) to make sure files are empty before being modified:
os.system('> output/{}/Info/{}/{}/parameter_{:.3f}/my_file_*.txt'.format(size, name, status, value))
I expected the * symbol to act as a wildcard and empty all the matching text files, but the files keep accumulating data from previous runs instead of being emptied before the new data is added. Am I using the * wildcard incorrectly? Is this problem fixable without changing the path of my code?
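Your understanding of the wildcard is correct. The mistake is with the redirection: > redirects to a single file, so it cannot target everything a wildcard matches. You can use the program tee to write stdout to several files at once, like this (echo -n echoes nothing):
echo -n | tee *
The pipe | passes the stdout of echo -n to the stdin of tee, and tee truncates every file it is given before writing to it.
The wildcard expands to all the files in the directory, so the command above is equivalent to:
echo -n | tee my_file_1.txt my_file_2.txt ... my_file_n.txt
If the directory holds other files as well, use a narrower pattern such as my_file_*.txt instead of *, otherwise those files are emptied too.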
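As the calling code is Python anyway, the same emptying can be done without going through the shell. This is only a sketch: empty_files is a made-up helper name, and the pattern passed in would be the question's output/.../my_file_*.txt path.

```python
import glob

def empty_files(pattern):
    """Truncate every file matching the glob pattern and return the paths."""
    paths = sorted(glob.glob(pattern))
    for path in paths:
        # Opening a file in "w" mode truncates it to zero length.
        with open(path, "w"):
            pass
    return paths
```

Because glob.glob does the wildcard expansion inside Python, the result does not depend on which shell os.system happens to use.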

grep empty output file

I made a shell script the purpose of which is to find files that don't contain a particular string, then display the first line that isn't empty or otherwise useless. My script works well in the console, but for some reason when I try to direct the output to a .txt file, it comes out empty.
Here's my script:
#!/bin/bash
# takes user input.
echo "Input substance:"
read substance
echo "Listing media without $substance:"
cd media
# finds names of files that don't feature the substance given, then puts them inside an array.
searchresult=($(grep -L "$substance" *))
# iterates the array and prints the first line of each - contains both the number and the medium name.
# however, some files start with "Microorganisms" and the actual number and name feature after several empty lines
# the script checks for that occurrence - and prints the first line that doesn't match these criteria.
for i in "${searchresult[@]}"
do
    grep -m 1 -v "Microorganisms\|^$" "$i"
done >> output.txt
I've tried moving the >>output.txt to right after the grep line inside the loop, tried switching >> to > and 2>&1, tried using tee. No go.
I'm honestly feeling utterly stuck as to what the issue could be. I'm sure there's something I'm missing, but I'm nowhere near good enough with this to notice. I would very much appreciate any help.
EDIT: Added files to better illustrate what I'm working with. Sample inputs I tried: Glucose, Yeast extract, Agar. Link to files [140kB] - the folder was unzipped beforehand.
The script was given full permissions to execute. I don't think the output is being rewritten because even if I don't iterate and just run a single line of the loop, the file is empty.
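For comparison, the behaviour the script is meant to have can be sketched in Python. This is not a diagnosis of the empty output file, just an illustration of the intended logic under some assumptions: the substance is matched as a plain substring (grep would treat it as a regular expression), and first_useful_line and media_without are made-up names.

```python
from pathlib import Path

def first_useful_line(text):
    """Return the first line that is neither empty nor a 'Microorganisms' line."""
    for line in text.splitlines():
        if line and "Microorganisms" not in line:
            return line
    return None

def media_without(folder, substance):
    """Collect the first useful line of every file that lacks the substance."""
    results = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        if substance not in text:
            line = first_useful_line(text)
            if line is not None:
                results.append(line)
    return results
```

The returned list can then be written to an output file that lives outside the media folder, so the output file itself is never part of the search.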

abyss-pe: variables to assemble multiple genomes with one command

How do I rewrite the following to properly replace the variable with my genomeID? (I have it working with this method in the Spades and Masurca assemblers, so it's something about Abyss that doesn't like this approach, and I need a workaround.)
I am trying to run abyss on a cluster server but am running into trouble with how abyss-pe is reading my variable input:
my submit file loads a script for each genome listed in a .txt file
my script writes in the genome name throughout the script
the abyss assembly fumbles the variable replacement
Input.sub:
queue genomeID from genomelisttest.txt
Input.sh:
#!/bin/bash
genomeID=$1
cp /mnt/gluster/harrow2/trim_output/${genomeID}_trim.tar.gz ./
tar -xzf ${genomeID}_trim.tar.gz
rm ${genomeID}_trim.tar.gz
for k in `seq 86 10 126`; do
    mkdir k$k
    abyss-pe -C k$k name=${genomeID} k=$k lib='pe1 pe2' pe1='../${genomeID}_trim/${genomeID}_L1_1.fq.gz ../${genomeID}_trim/${genomeID}_L1_2.fq.gz' pe2='../${genomeID}_trim/${genomeID}_L2_1.fq.gz ../${genomeID}_trim/${genomeID}_L2_2.fq.gz'
done
Error that I get:
`../enome_trim/enome_L1_1.fq.gz': No such file or directory
Here "enome" should have been replaced with the five-digit genomeID. The replacement works properly in the earlier part of the script, up to the point where abyss comes in.
pe1='../'"$genomeID"'_trim/'"$genomeID"'_L1_1.fq.gz ...'
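I added a single quote before and after each variable. This closes the single-quoted string before the variable and reopens it afterwards, so the shell expands $genomeID; inside single quotes it is passed through literally.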
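If the submit script can be written in Python instead, the quoting question goes away, because each argument is built as an ordinary string and handed to the program without a shell in between. A minimal sketch, assuming the same file layout as in the question; abyss_command is a made-up helper name and only the abyss-pe parameters already shown in the question are used.

```python
def abyss_command(genome_id, k):
    """Build the abyss-pe argument list for one genome and one k value."""
    trim = f"../{genome_id}_trim/{genome_id}"
    return [
        "abyss-pe", "-C", f"k{k}",
        f"name={genome_id}", f"k={k}", "lib=pe1 pe2",
        # Each library is one argument holding both read files.
        f"pe1={trim}_L1_1.fq.gz {trim}_L1_2.fq.gz",
        f"pe2={trim}_L2_1.fq.gz {trim}_L2_2.fq.gz",
    ]
```

After creating the k directory, the list could be passed to subprocess.run(abyss_command(genome_id, k), check=True), which starts abyss-pe directly with the genome ID already substituted.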

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would have a specific file name as the query, output, etc., but I'm trying to use a variable so the loop can work through many files.
(Disclaimer: I am new to coding.) I believe that I am running into serious problems with trying to use my file names within my functions. As it is, my code will execute, but it creates a bunch of extra unintended files. This is what I intend for my script to do:
Line 1: Iterate through every file in my "sequences" directory. (All of which end with ".fa", if that is helpful.)
Line 3: Recognize the filename as a variable. (I know, I know, I think I've done this horribly wrong.)
Line 4: Run the blastp function using the file name as the argument for the "query" flag, always use "database.faa" as the argument for the "db" flag, and output the result in a new file that has the same name as the initial file, but with ".txt" at the end.
Line 5: Output parts of the output file from line 4 into a new file that has the same name as the initial file, but with "_top_hits.txt" at the end.
for sequence in ./sequences/{.,}*;
do
    echo "$sequence";
    blastp -query $sequence -db database.faa -out ${sequence}.txt -evalue 1e-10 -outfmt 7
    cat ${sequence}.txt | awk '/hits found/{getline;print}' | grep -v "#">${sequence}_top_hits.txt
done
When I ran this code, it gave me six new files derived from each file in the directory (and they were all in the same directory - I'd prefer to have them all in their own folders. How can I do that?). They were all empty. Their suffixes were, ".txt", ".txt.txt", ".txt_top_hits.txt", "_top_hits.txt", "_top_hits.txt.txt", and "_top_hits.txt_top_hits.txt".
If I can provide any further information to clarify anything, please let me know.
If you're only interested in *.fa files I would limit your input to only those matching files like this:
for sequence in sequences/*.fa;
do
I can suggest the following improvements:
for fasta_file in ./sequences/*.fa # ";" is not necessary if you already have a new line for your "do"
do
    # ${variable%something} is $variable with the trailing
    # string "something" removed
    # basename path/to/file is the name of the file
    # without the full path
    # $(some command) allows you to use the result of the command as a string
    # Combining the above, we can form a string based on our fasta file
    # This string can be useful to name stuff in a clean manner later
    sequence_name=$(basename "${fasta_file%.fa}")
    echo "${sequence_name}"
    # Create a directory for the results for this sequence
    # -p option avoids a failure in case the directory already exists
    mkdir -p "${sequence_name}"
    # Define the name of the file for the results
    # (including our previously created directory in its path)
    blast_results=${sequence_name}/${sequence_name}_blast.txt
    blastp -query "${fasta_file}" -db database.faa \
        -out "${blast_results}" \
        -evalue 1e-10 -outfmt 7
    # Define a file name for the top hits
    top_hits=${sequence_name}/${sequence_name}_top_hits.txt
    # alternatively, using "%"
    #top_hits=${blast_results%_blast.txt}_top_hits.txt
    # No need to cat: awk can take a file as argument
    awk '/hits found/{getline;print}' "${blast_results}" \
        | grep -v "#" > "${top_hits}"
done
I made more intermediate variables, with (hopefully) meaningful names.
I used \ to escape line ends and allow putting commands in several lines.
I hope this improves code readability.
I haven't tested. There may be typos.
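The naming scheme and the top-hits filter can also be sketched in Python, which makes it easy to check them on a small sample before running blastp on everything. This is an illustration only: output_paths and top_hits are made-up names, and the filter imitates the awk and grep pipeline (the line after each "hits found" line, unless it contains "#").

```python
from pathlib import Path

def output_paths(fasta_file):
    """Return (blast_results, top_hits) paths inside a per-sequence folder."""
    name = Path(fasta_file).stem  # "sequences/abc.fa" -> "abc"
    folder = Path(name)
    return folder / f"{name}_blast.txt", folder / f"{name}_top_hits.txt"

def top_hits(blast_text):
    """Keep the line following each 'hits found' line, skipping '#' lines."""
    hits = []
    lines = iter(blast_text.splitlines())
    for line in lines:
        if "hits found" in line:
            # Like awk's getline: take the next line, or keep this one at EOF.
            following = next(lines, line)
            if "#" not in following:
                hits.append(following)
    return hits
```

For a file ./sequences/abc.fa this gives abc/abc_blast.txt and abc/abc_top_hits.txt, so every sequence gets its own folder and no output lands back in the sequences directory.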
You should be using *.fa if you only want files with a .fa ending. Additionally, if you want to redirect your output to new folders you need to create those directories somewhere using
mkdir 'folder_name'
then you need to point your output option at those folders (for blastp the option is -out, as in your script), something like this
'command' -o /path/to/output/folder
To help you test this script out, you can run each line one by one to test them. You need to make sure each line works by itself before combining.
One last thing, be careful with your use of semicolons; it should look something like this:
for filename in *.fa; do 'command'; done

Linux - Recursively list all the zip files and keep only latest modified 5 files and delete the remaining

On the command line, how can we recursively find all the zip files in a directory and its subdirectories, keep only the 5 most recently modified files, and delete the rest?
The files paths would be something like below:
basedirectory/2015/12/18/abc.zip
basedirectory/2015/12/18/def.zip
basedirectory/2015/12/18/ghi.zip
basedirectory/2015/12/18/jkl.zip
basedirectory/2015/12/08/mno.zip
basedirectory/2015/12/08/pqr.zip
basedirectory/2015/12/08/stu.zip
basedirectory/2015/12/07/stu.zip
I have a way, but it involves several (easy) steps. There are probably more elegant ways of doing this, but this is the way I know. The steps come from a couple of sources, which I list at the end of my answer. You will use the already installed utilities cd, find, ls, rm and head. It will involve creating and executing two bash scripts.
1. Open a terminal and change into your base directory with cd ~/basedirectory
This sets up the following commands. It is important that you stay in this directory for the rest of the commands.
2. Type find `pwd` -name "*.zip" > find_zip
This creates a list of all the zip files with the full path relative to the directory you changed into. Instead of printing them to the screen, it writes them to a find_zip file in the directory you changed into.
3. Type cp find_zip remove_old_zip
This creates a second, duplicate file that you will later use to delete the old files.
4. Open the find_zip file in your favorite text editor. If you're not used to using any, you can use gedit. If you don't have it, install it with sudo apt-get update && sudo apt-get install gedit
5. Do a search and replace as follows (in gedit): search for \n , and replace it with " \\n"
This places the list of folders within quotes. The first backslash places a "\" at the end of each line, which means continue reading the next line and execute all the code together. The \n preserves the line endings. The last " puts a quote at the beginning of each line. You need the quotes to escape special characters like ' and ( that may be in your file name.
6. Create 2 new lines at the top of the file and type:
#!/bin/bash
ls -lt \
The first line turns your file into a bash script. The second line will list all the files you found with the find command and order them by date.
7. Create a new line at the bottom of your file and type: | head -5. Save and exit the file.
| is a "pipe" that will take the output of the ordered file list that ls creates and feed it into the head command. The head command will list just the 5 most recently modified files and display or print them on your screen.
As a result of steps 5-7, your file should go from looking like this:
basedirectory/2015/12/18/abc.zip
basedirectory/2015/12/18/def.zip
basedirectory/2015/12/18/ghi.zip
basedirectory/2015/12/18/jkl.zip
basedirectory/2015/12/08/mno.zip
basedirectory/2015/12/08/pqr.zip
basedirectory/2015/12/08/stu.zip
basedirectory/2015/12/07/stu.zip
to this:
#!/bin/bash
ls -lt \
basedirectory/2015/12/18/abc.zip \
basedirectory/2015/12/18/def.zip \
basedirectory/2015/12/18/ghi.zip \
basedirectory/2015/12/18/jkl.zip \
basedirectory/2015/12/08/mno.zip \
basedirectory/2015/12/08/pqr.zip \
basedirectory/2015/12/08/stu.zip \
basedirectory/2015/12/07/stu.zip \
| head -5
8. Type bash find_zip into the terminal. With your newfound list of the 5 most recent files, open up the remove_old_zip file created in step 3.
You will also be turning this file into a bash script, but it will remove all but the five newest files.
9. Delete the lines in the remove_old_zip file containing the 5 files you want to keep.
10. Do a search and replace as follows (in gedit): search for \n , and replace it with " \\n"
This is the same as step 5.
11. Create 2 new lines at the top of the file and type:
#!/bin/bash
rm \
This is similar to step 6 except that rm will delete the files still listed.
12. Remove the final \ on the final line of the remove_old_zip file. Save and exit.
13. Type bash remove_old_zip.
14. Type rm find_zip remove_old_zip.
This removes the two scripts, which are now useless since the files have been deleted.
sources:
How can I list (ls) the 5 last modified files in a directory?
http://www.geekinterview.com/talk/758-how-to-continue-to-next-line.html
List files recursively in Linux CLI with path relative to the current directory
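For comparison, the same result can be sketched in a few lines of Python rather than with hand-edited scripts. This is a different technique from the steps above, shown only as an illustration: prune_zips is a made-up name, "newest" is judged by modification time, and nothing is deleted unless dry_run is set to False.

```python
from pathlib import Path

def prune_zips(base, keep=5, dry_run=True):
    """Return the zip files beyond the `keep` newest; delete them unless dry_run."""
    # rglob searches the base directory and all of its subdirectories.
    zips = sorted(Path(base).rglob("*.zip"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
    old = zips[keep:]
    if not dry_run:
        for path in old:
            path.unlink()
    return old
```

Running prune_zips("basedirectory") first and inspecting the returned list is a cheap way to confirm which files would go before calling it again with dry_run=False.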

Resources