How do I find the length of a sound generated by pyttsx3 - python-3.x

I am making a program that needs the length of the speech generated by pyttsx3.
I did not find any way to do this with pyttsx3 itself, so I am saving the speech to a file
and then trying to use mutagen to read the audio info:
import pyttsx3
from mutagen.mp3 import MP3
# the engine
engine = pyttsx3.init()
# 'Hello World' is just an example
engine.save_to_file('Hello world', 'test.mp3')
engine.runAndWait()
# load the mp3 as an audio
audio = MP3('test.mp3')
# the line above gives an error
I get the following error: mutagen.mp3.HeaderNotFoundError: can't sync to mpeg frame
Why am I getting this error?
Also, is there any other way to get the length of speech generated by pyttsx3?

If there is no specific need to use mutagen, I recommend pydub instead. The code below prints the duration in seconds.
Code:
import pyttsx3
from pydub import AudioSegment
# the engine
engine = pyttsx3.init()
# 'Hello World' is just an example
engine.save_to_file('Hello world', 'test.mp3')
engine.runAndWait()
# load the mp3 as an audio
audio = AudioSegment.from_file("test.mp3")
print(audio.duration_seconds)
Output:
0.9205442176870748
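A likely cause of the HeaderNotFoundError, stated here as an assumption rather than something confirmed in this thread: pyttsx3's drivers generally write WAV data (AIFF on macOS) whatever extension the file name carries, so test.mp3 would contain no MPEG frames for mutagen to sync to. If that holds and the file really is a WAV, the standard-library wave module can compute the duration without a third-party package. The sketch below uses a hypothetical helper, wav_duration_seconds, and writes one second of silence to a temporary file so that it runs on its own:

```python
import os
import tempfile
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file: frame count divided by frame rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


# Demo input (hypothetical): one second of 16-bit mono silence at 22050 Hz.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)       # mono
    out.setsampwidth(2)       # 2 bytes per sample = 16-bit
    out.setframerate(22050)   # frames per second
    out.writeframes(b"\x00\x00" * 22050)

duration = wav_duration_seconds(demo_path)
print(duration)
```

To use it on pyttsx3 output, save with a .wav name (for example engine.save_to_file('Hello world', 'test.wav')) and pass that path to the helper. If wave.open raises wave.Error, the file is not a WAV and the assumption above does not hold on that platform.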

Related

How to convert an audio file in colab to text?

I am trying to convert an audio file in my Colab workspace into text using the speech_recognition module. It doesn't work because the audio argument must be audio data, not a file name. How do I load the file "audio.wav" into a variable that I can pass there, or pass the file directly?
import speech_recognition as sr
r = sr.Recognizer()
text = r.recognize_google(audio, language = 'en-IN')
print(text)
The speech_recognition library can read audio files directly. You can do:
inp = sr.AudioFile('path/to/audio/file')
with inp as file:
    audio = r.record(file)
After that, pass audio as the first argument to r.recognize_google().
Here is a good article to understand this library.
pip3 install SpeechRecognition pydub
Make sure you have an audio file in the current directory that contains English speech:
import speech_recognition as sr
filename = "16-122828-0002.wav"
The code below loads the audio file and converts the speech to text using Google Speech Recognition:
# initialize the recognizer
r = sr.Recognizer()
# open the file
with sr.AudioFile(filename) as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data)
    print(text)
This will take a few seconds to finish, as it uploads the file to Google and fetches the result.

How to play youtube videos in python-vlc?

import vlc
p = vlc.MediaPlayer("https://www.youtube.com/watch?v=7ailmFB38Rk")
p.play()
gives me this error:
[00007f97a80030c0] http stream error: local stream 1 error: Cancellation (0x8)
I was told this happens when the link is invalid or broken, but neither is the case here, because the regular VLC player plays the video perfectly.
If it is somehow not possible to play the video, playing only the audio would also help, since that is all I need.
Use pafy to get a stream URL for the video and pass that URL to VLC:
# importing vlc module
import vlc
# importing pafy module
import pafy
# url of the video
url = "https://www.youtube.com/watch?v=vG2PNdI8axo"
# creating pafy object of the video
video = pafy.new(url)
# getting best stream
best = video.getbest()
# creating vlc media player object
media = vlc.MediaPlayer(best.url)
# start playing video
media.play()

How do I transfer the output from pyttsx3 to a variable for DSP

I'm running a Raspberry Pi4 with Python 3.7 and pyttsx3.
I'm planning to use pyttsx3 to verbally respond to "commands" I issue. I also plan to visualise the output speech on a neopixel strip (think "Close Encounters" on a miniature scale.) The visualisation is not the problem though.
My problem is, how do I get the output from pyttsx3 into a variable so I can pass it to my DSP?
I know that I can pass the output to a file:
import pyttsx3
engine = pyttsx3.init()  # create the engine
# save the voice to a file
engine.save_to_file('Hello World', 'text.mp3')
engine.runAndWait()
I also know I can read the file back, but that adds latency.
I want the speech and the twinkly lights to coincide. I know I can play the saved file, but I'd like something closer to real time.
Does anyone have any suggestions please?
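One possible workaround, sketched under assumptions: nothing in this thread shows pyttsx3 handing back samples directly, so this reads a saved file once, before playback starts, and reduces it to one loudness value per chunk that an LED loop could step through while the audio plays. It assumes the file is mono 16-bit PCM WAV, which may not match every pyttsx3 driver. rms_envelope and the demo file are hypothetical names used only for illustration:

```python
import math
import os
import struct
import tempfile
import wave


def rms_envelope(path, chunk_frames=1024):
    """Return one RMS level per chunk of a mono 16-bit PCM WAV file."""
    levels = []
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        while True:
            frames = wf.readframes(chunk_frames)
            if not frames:
                break  # end of file
            # WAV samples are little-endian signed 16-bit integers
            samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
            mean_square = sum(s * s for s in samples) / len(samples)
            levels.append(math.sqrt(mean_square))
    return levels


# Demo input (hypothetical): 1024 samples at amplitude 1000, then 1024 of silence.
demo_path = os.path.join(tempfile.mkdtemp(), "speech.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(22050)
    out.writeframes(struct.pack("<2048h", *([1000] * 1024 + [0] * 1024)))

levels = rms_envelope(demo_path)
print(levels)
```

Each entry covers chunk_frames / frame rate seconds of audio, so a display loop that advances one entry per that interval should stay roughly in step with playback started at the same moment. This does not remove the cost of writing and reading the file; it only moves that cost ahead of the moment the speech starts.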

What is the best way to play an MP3 file with a start time and end time in Python 3

I am working on a project that requires sound snippets to be played from MP3 files in a playlist. The files are full songs.
I have tried the pygame mixer. I can pass it a start time, but I cannot pass the end time at which the music should stop, and I cannot fade the current snippet in and out.
I have looked at the vlc and ffmpeg libraries, but I do not see the functionality I am looking for.
I'm hoping someone may be aware of a library out there that may be able to do what I am trying to accomplish.
I finally figured out how to do exactly what I wanted to do!
In the spirit of helping others I am posting an answer to my own question.
My development environment:
Mac OS Mojave 10.14.6
Python 3.7.4
PyAudio 0.2.11
PyDub 0.23.1
Here it is in its most rudimentary form:
import pyaudio
from pydub import AudioSegment
# Load an mp3 source file into a PyDub AudioSegment
mp3 = AudioSegment.from_mp3("path_to_your_mp3_file")
# Specify starting and ending offsets from the beginning of the stream,
# then apply a fade-in and fade-out. All values are in milliseconds (seconds * 1000).
mp3 = mp3[43000:58000].fade_in(2000).fade_out(2000)
# In the above example the music will start 43 seconds into the track with a 2 second
# fade-in, and only play for 15 seconds with a 2 second fade-out. If you don't need
# these features, just comment out the line and the full mp3 will play.
# Assign the PyAudio player
player = pyaudio.PyAudio()
# Create the stream from the chosen mp3 file
stream = player.open(format=player.get_format_from_width(mp3.sample_width),
                     channels=mp3.channels,
                     rate=mp3.frame_rate,
                     output=True)
# Write the whole snippet to the output stream in one blocking call
stream.write(mp3.raw_data)
stream.stop_stream()
stream.close()
player.terminate()
It isn't shown in the example above, but the stream can also be processed to raise, lower, or mute the volume of the music.
Another option is to use a thread to pause the writing of the stream, which would emulate the pause button of a player.
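The pause idea could be sketched like this, as one possible approach rather than the code actually used in the project: instead of writing the whole snippet in one call, write it in small chunks and wait on a threading.Event before each chunk. write_in_chunks and its parameters are hypothetical names. The demo substitutes a list's append method for PyAudio's stream.write so that it runs without an audio device:

```python
import threading


def write_in_chunks(data, write, chunk_size=4096, resume=None):
    """Pass `data` to `write` piece by piece, blocking while `resume` is cleared."""
    for start in range(0, len(data), chunk_size):
        if resume is not None:
            resume.wait()  # returns at once when set, blocks while cleared (paused)
        write(data[start:start + chunk_size])


# Demo: a list collects the chunks in place of an audio stream.
resume = threading.Event()
resume.set()  # start in the "playing" state
written = []
write_in_chunks(b"\x00" * 10000, written.append, chunk_size=4096, resume=resume)
print([len(chunk) for chunk in written])  # [4096, 4096, 1808]
```

With PyAudio, data would be mp3.raw_data and write would be stream.write, with write_in_chunks running in a worker thread. The controlling thread then calls resume.clear() to pause and resume.set() to continue. The chunk size should be a multiple of the frame width (sample width times channel count) so that no frame is split across two writes.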

Play wav file python 3

I want to play a .wav file in Python 3.4. Additionally, I want Python itself to play the file, rather than opening it in VLC, a media player, etc.
As a follow-up question, is there any way for me to combine the .wav file and the .py file into a standalone exe?
Ignore the second part of the question if it is stupid; I don't really know anything about compiling Python.
Also, I know there have been other questions about .wav files, but I have not found one that works in Python 3.4 in the way I described.
With pyaudio, playback may run at the wrong speed. Consider pygame instead:
sudo apt-get install python-pygame
Windows:
choco install python-pygame?
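import pygame
def playSound(filename):
    pygame.mixer.music.load(filename)
    pygame.mixer.music.play()
    # keep the script alive until playback has finished
    while pygame.mixer.music.get_busy():
        pygame.time.wait(100)
pygame.init()
playSound('hellyeah.wav')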
I fixed the problem by using the pyaudio module, with the wave module to read the file.
Here is example code that plays a simple wave file:
import wave
import pyaudio
wf = wave.open('Sound1.wav', 'rb')
p = pyaudio.PyAudio()
chunk = 1024
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
data = wf.readframes(chunk)
# readframes() returns bytes; an empty bytes object marks the end of the file
while data:
    stream.write(data)
    data = wf.readframes(chunk)
# release the audio device and the file
stream.close()
p.terminate()
wf.close()
If you happen to be using Linux, a simple solution is to call aplay:
import os
wav_file = "./Hello.wav"
os.system(f'aplay {wav_file}')

Resources