Indexing audio to get timestamps for each word using python - python-3.x

I have an audio file and I want to get the timestamps for each word, i.e. the time period during which each word was spoken.
(For example if an audio file says "I am a doctor" I want to know during which instant "I" was said, "am" was said and so on)
I want to do this using python.
I have tried the following code.
from SimpleAudioIndexer import SimpleAudioIndexer as sai
indexer = sai(mode="ibm", src_dir="D:/Codes/Python/audio recognition",
username_ibm="", password_ibm="")
indexer.index_audio(basename = "target.wav")
indexer.save_indexed_audio("{}/indexed_audio".format(indexer.src_dir))
indexer.load_indexed_audio("{}/indexed_audio.txt".format(indexer.src_dir))
print(indexer.get_timestamps())
However, I am running into the following error.
Traceback (most recent call last):
  File "D:\Codes\Python\audio recognition\rec.py", line 5, in <module>
    indexer.index_audio(basename = "target.wav")
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 1108, in index_audio
    self._index_audio_ibm(*args, **kwargs)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 928, in _index_audio_ibm
    replace_already_indexed=replace_already_indexed)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 730, in _prepare_audio
    self._filtering_step(basename)
  File "C:\Users\Awais\AppData\Roaming\Python\Python37\site-packages\SimpleAudioIndexer\__init__.py", line 638, in _filtering_step
    universal_newlines=True).communicate()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

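From your code and traceback it looks like you're running Windows, but you're using forward slashes "/" in your src_dir string. Windows uses the backslash "\" as its native path separator, for example:
C:\Windows
while Unix-based systems use the forward slash, like this:
/home/Awais
Python's own file functions generally accept forward slashes on Windows, but the traceback ends inside subprocess, so the path is being handed to an external program, which may not treat "/" the same way. Try changing the "/" to "\" (written as "\\", or inside a raw string) and see if that makes a difference. Note that [WinError 2] raised from subprocess can also mean that the external program the library tries to launch is not installed or not on your PATH, so check that as well.
If you intend to use this program on both Windows and Unix systems, you should use Python 3's "pathlib" so that your directory paths work on either OS. Here is a link where you can learn more about handling paths in Python 3: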
https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
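As a minimal sketch of the pathlib suggestion (not the library's own API; the file name indexed_audio.txt is taken from the question, everything else is illustrative), the pure path classes show how the same forward-slash string is rendered with each platform's separator:

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Forward slashes are accepted when the path is built; pathlib renders
# the separator that belongs to the chosen flavour.
win = PureWindowsPath("D:/Codes/Python/audio recognition")
print(win)  # D:\Codes\Python\audio recognition

posix = PurePosixPath("/home/Awais") / "indexed_audio.txt"
print(posix)  # /home/Awais/indexed_audio.txt

# In real code, plain Path picks the flavour of the OS it runs on,
# and "/" joins path components portably.
src_dir = Path("D:/Codes/Python/audio recognition")
indexed = src_dir / "indexed_audio.txt"
```

The resulting object can be converted with str() wherever a library expects a plain string.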

Related

FileNotFoundError When Attempting to Open a File in the Same Directory

The txt file is saved in the exact same folder as my code, but when I run it I get the traceback below. I right-clicked and saved the file directly to that folder, and I run the code in VS Studio. I am very new to coding, so sorry for the basic question.
file = open('regex_sum_1114202.txt', 'r')
Traceback:
Traceback (most recent call last):
File "c:\Users\EM2750\Desktop\py4e\ex_11\ex_11.py", line 2, in <module>
file = open('regex_sum_1114202.txt', 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'regex_sum_1114202.txt'
Try file = open('./regex_sum_1114202.txt', 'r') instead.
This explicitly tells Python to look for the file in the current directory by giving a relative path; think of the dot as shorthand for the current working directory. So if the current working directory is the directory that contains both the script and the file, that should work. If your editor runs the script from a different working directory, a relative path will not find the file.
Use forward slashes (/) instead of backslashes (\) in paths. Backslashes are the default directory separator on Windows, but inside ordinary Python string literals they can be interpreted as escape sequences. Alternatively, you can write two backslashes as the directory separator: \\.
You can also try specifying the full path before the filename: file = open('c:/Users/EM2750/Desktop/py4e/ex_11/regex_sum_1114202.txt', 'r'). The downside, of course, is that the path would no longer be correct if you moved the file.
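One way to get a full path without hard-coding it is to build it from the script's own location. This is a rough sketch, not something from the answers above; the helper name beside is made up for illustration:

```python
from pathlib import Path

def beside(script_file, filename):
    """Return the absolute path of `filename` in the folder of `script_file`."""
    return Path(script_file).resolve().parent / filename

# Inside a script you would pass __file__, for example:
#     data_path = beside(__file__, "regex_sum_1114202.txt")
#     with open(data_path, "r") as file:
#         text = file.read()
```

Because the path is derived from the script, it keeps working when the editor starts the program from a different working directory, as long as the data file stays next to the script.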

python script in cron not reading a CSV unless it creates the CSV itself

I have the following script. It works when I run it in command line, and it works when I run it in cron.
The variable 'apath' is the absolute path of the file.
cat=['a','a','a','a','a','b','b','b','b','b']
val=[1,2,3,4,5,6,7,8,9,10]
columns=['cat','val']
data=[cat,val]
dict={key:value for key,value in zip(columns,data)}
statedata_raw=pd.DataFrame(data=dict)
statedata_raw.to_csv(apath+'state_data.csv',index=False)
statedata_raw2=pd.read_csv(apath+'state_data.csv')
statedata_raw2.to_csv(apath+'state_data2.csv',index=False)
But when I try to run the first part manually, creating the first csv, and then run the second part through cron, the second read_csv statement fails. I checked the permissions on the state_data.csv file and they are fine. It's set to -rwxr-xr-x
To be specific: I first run this script manually from the command line. It executes and creates state_data.csv. Then I check the permissions of state_data.csv, and they are -rwxr-xr-x
cat=['a','a','a','a','a','b','b','b','b','b']
val=[1,2,3,4,5,6,7,8,9,10]
columns=['cat','val']
data=[cat,val]
dict={key:value for key,value in zip(columns,data)}
statedata_raw=pd.DataFrame(data=dict)
statedata_raw.to_csv(apath+'state_data.csv',index=False)
and then this script via cron, which fails, and gives the error message below
statedata_raw2=pd.read_csv(apath+'state_data.csv')
statedata_raw2.to_csv(apath+'state_data2.csv',index=False)
This is the error that I get from the system
Traceback (most recent call last):
File "/users/michaelmader/wdtest.py", line 39, in <module>
statedata_raw2=pd.read_csv(apath+'state_data.csv')
File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
self._make_engine(self.engine)
File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/opt/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1891, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 678, in pandas._libs.parsers.TextReader._setup_parser_source
OSError: Initializing from file failed
To summarize
Run complete script through Terminal: state_data2.csv is created: pass
Run complete script through cron: state_data2.csv is created: pass
Run first part through Terminal, second part through cron: fail
I am on MacOS and I already gave crontab full disk access in system preferences.
I figured out the problem. The issue was the permissions granted to cron in macOS. I thought I had solved it by giving /usr/bin/crontab full disk access, but I actually needed to give full disk access to /usr/sbin/cron.
The steps for doing this can be found here: https://blog.bejarano.io/fixing-cron-jobs-in-mojave/.
Once I made that change everything worked fine.
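When a job behaves differently under cron, it can help to log exactly what the process sees before pandas gets involved. The following is only a diagnostic sketch; the helper name describe_read is hypothetical, and it does not change any macOS permission by itself:

```python
import os

def describe_read(path):
    """Try to read one byte from `path` and report what happened."""
    info = {"cwd": os.getcwd(), "path": path, "ok": True, "error": None}
    try:
        with open(path, "rb") as fh:
            fh.read(1)
    except OSError as exc:
        # Covers missing files as well as refused access.
        info["ok"] = False
        info["error"] = f"{type(exc).__name__}: {exc}"
    return info

# In the cron script, before pd.read_csv (apath as in the question):
#     print(describe_read(apath + "state_data.csv"))
```

If the same call succeeds in Terminal but fails under cron, the difference lies in the environment or permissions of the cron process rather than in the CSV file.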

Converting .TIF to .PDF gives PIL: Error reading image

I've been trying to batch process some .TIF files and convert them to PDFs. I did have it working, but then after trying to change img2pdf so it would accept larger files I was never able to get the same program running again, even after re-installing.
Currently this is throwing out the following error:
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A608255EB8>
Here is the code I've been using. Anyone got any suggestions? Thanks in advance.
import img2pdf, sys, os, time
image_directory = r"PATH"
image_files = []
for root, dirs, files in os.walk(image_directory):
    for file in files:
        if file.endswith(".tif") or file.endswith(".TIF"):
            print("Discovered this TIF: ", os.path.join(root, file))
            image_files.append(os.path.join(root, file))
for image in image_files:
    output_file = image[:-4] + ".pdf"
    print ("Putting all TIFs into ", output_file)
    pdf_bytes = img2pdf.convert(image)
    file = open(output_file,"wb")
    file.write(pdf_bytes)
Here is the full traceback
Traceback (most recent call last):
File "<ipython-input-37-fe96d5eeb049>", line 1, in <module>
runfile('PATH', wdir='PATH')
File "PATH", line 704, in runfile
execfile(filename, namespace)
File "PATH", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "PATH", line 23, in <module>
pdf_bytes = img2pdf.convert(image_files)
File "PATH", line 1829, in convert
) in read_images(rawdata, kwargs["colorspace"], kwargs["first_frame_only"]):
File "PATH", line 1171, in read_images
"PIL: error reading image: %s" % e
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x000001A6082BE3B8>
If, as I understand it, you want to recursively find all TIFF images and convert each one to a correspondingly named PDF file, you can do that simply and in parallel with GNU Parallel and ImageMagick like this in Terminal:
find . -iname "*tif" -print0 | parallel -0 --dry-run convert {} {.}.pdf
Sample Output
convert ./OpenCVTIFF64/result.tif ./OpenCVTIFF64/result.pdf
convert ./OpenCVTIFF64/a.tif ./OpenCVTIFF64/a.pdf
convert ./OpenCVBasics/a.tif ./OpenCVBasics/a.pdf
convert ./CImgDump/image.tif ./CImgDump/image.pdf
That command says... "Starting in the current directory, recursively find all TIFF files, whether upper or lowercase or some mixture, and pass their names, null-terminated, to GNU Parallel. It should then read each name and run ImageMagick convert to turn that TIFF into a file with the same name but the extension replaced with PDF."
If it does what you want, remove the --dry-run and do it again for real.
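The file-discovery half of that pipeline can also be sketched in plain Python, for anyone who prefers to stay with the script from the question. This assumes only the standard library; the helper name pdf_targets is made up for illustration, and the conversion itself is left to whatever tool you use:

```python
from pathlib import Path

def pdf_targets(root):
    """Pair every .tif under `root` (any letter case) with a .pdf path."""
    pairs = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".tif":
            # Same folder and base name, extension replaced with .pdf.
            pairs.append((path, path.with_suffix(".pdf")))
    return pairs

# Dry run, similar in spirit to --dry-run above:
#     for tif, pdf in pdf_targets("."):
#         print(tif, "->", pdf)
```

Comparing the suffix in lower case avoids listing ".tif" and ".TIF" separately, and with_suffix avoids slicing the file name by hand.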
So this ended up working once I executed pip install 'Pillow>=6.0.0' --force-reinstall, even though the command itself didn't execute properly. I get a few warnings when I run, but it's now working. Short version is, it was an issue with Pillow.

FileNotFoundError even though file exist

When trying to open a file using with open(...), I get an error that the file doesn't exist.
I am trying to parse some txt files. When working locally it works with no issue, but the problem started when I tried to read from a network folder. The strange thing is that it does see the file, but says it's not found.
The path I am referring to is '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*' (this folder is full of txt files).
But I am still getting this error:
FileNotFoundError: [Errno 2] No such file or directory: 'Console_log_01-01-2019_08-17-56.txt'
As you can see, it does see the needed file.
To get to this file I am splitting the path the following way:
readFile = name.split("/")[9].split("\\")[1]
because when I look at the list of my files I see them in the following form:
['//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\Console_log_01-01-2019_08-17-56.txt',
After splitting I get:
Console_log_01-01-2019_08-17-56.txt
and it still says the file is not found.
def main():
    lines =0
    path = '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*'
    files = glob.glob(path)
    print ("files")
    print ('\n')
    print(files)
    for name in glob.glob(path):
        print (path)
        readFile = name.split("/")[9].split("\\")[1]
        print(readFile)
        with open(readFile,"r") as file:
            lines = file.readlines()
        print (lines)
main()
files
['//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\\Console_log_01-01-2019_08-17-56.txt', '//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs\\Console_log_01-01-2019_08-18-29.txt']
Traceback (most recent call last):
//10.8.4.49/Projects/QASA_BR_TCL_Env_7.2.250/Utils/BR_Env/Call Generator/results/Console_Logs/*
Console_log_01-01-2019_08-17-56.txt
File "C:/Users/markp/.PyCharmEdu2018.3/config/scratches/scratch_3.py", line 19, in <module>
main()
File "C:/Users/markp/.PyCharmEdu2018.3/config/scratches/scratch_3.py", line 16, in main
with open(readFile,"r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'Console_log_01-01-2019_08-17-56.txt'
Process finished with exit code 1
When you search for the files you use the full path, but when you open a file you reference it by name only, as if it were in the current working directory. Either change the current working directory (to the directory itself, without the trailing /*) with
os.chdir(path)
before opening the file, or build the full path in the open statement:
open(os.path.join(path,filename))
I recommend the first approach if you only have to open one file in your program, and the second if you open multiple files in multiple directories.
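Since glob.glob already returns each match with its directory attached, a third option is to open the returned name as-is. A minimal sketch of that idea (the function name read_matching is hypothetical):

```python
import glob
import os

def read_matching(pattern):
    """Return {full_path: lines} for every regular file matching `pattern`."""
    contents = {}
    for name in glob.glob(pattern):
        if not os.path.isfile(name):
            continue  # skip sub-directories that match the pattern
        # `name` already includes the directory, so open it unchanged.
        with open(name, "r") as fh:
            contents[name] = fh.readlines()
    return contents

# os.path.basename(name) gives the bare file name if it is needed for display.
```

This avoids both the manual split("/") indexing and any dependence on the current working directory.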
In future, please format your questions better. Stack Overflow has several formatting tools, and you can preview how your text will look before posting. Put your code in code blocks; that will help whoever is trying to answer.

FileNotFoundError: fluidsynth

I just want to convert my MIDI files to MP3 using midi2audio. I use this code:
----------
from midi2audio import FluidSynth
FluidSynth().midi_to_audio('test.mid','test.mp3')
---------- or
from midi2audio import FluidSynth
FluidSynth().play_midi('test.mid')
but I get the same result:
----------
/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/blake/Desktop/Bsmart_music/test/t_p/1.py
Traceback (most recent call last):
File "/Users/blake/Desktop/Bsmart_music/test/t_p/1.py", line 5, in <module>
FluidSynth().play_midi('test.mid')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/midi2audio.py", line 49, in play_midi
subprocess.call(['fluidsynth', '-i', self.sound_font, midi_file, '-r', str(self.sample_rate)])
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'fluidsynth': 'fluidsynth'
Process finished with exit code 1
----------
So I tried copying the "fluidsynth" folder into "midi2audio", but it still doesn't work. I have installed both midi2audio and fluidsynth. Does anyone know what is going wrong?
The "fluidsynth" executable must be in a folder listed in your PATH environment variable
(type "echo $PATH" in the Terminal app to see what is currently registered).
On Unix-like systems Python will usually find fluidsynth at /usr/bin/fluidsynth (as /usr/bin is listed in PATH). On OS X things are not much different, so any such documentation should apply. A common practice is to use the /usr/local/bin folder for custom programs.
Maybe you need to install FluidSynth itself first; as the traceback shows, midi2audio only calls the fluidsynth command-line program.
For Ubuntu:
sudo apt-get install fluidsynth
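To confirm from Python whether the executable is visible on PATH before calling midi2audio, a quick check with the standard library's shutil.which could look like this (a sketch; the helper name tool_path is made up):

```python
import shutil

def tool_path(name):
    """Return the full path of executable `name` found on PATH, or None."""
    return shutil.which(name)

location = tool_path("fluidsynth")
if location is None:
    print("fluidsynth is not on PATH; install it or extend PATH first")
else:
    print("fluidsynth found at", location)
```

If this prints a path in your terminal but not when run from your IDE, the IDE is probably starting Python with a different PATH.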

Resources