Python SpeechRecognition Snowboy Integration Seems to be Broken - python-3.x

I am working at building a personal assistant in Python. It appears that the Python's SpeechRecognition library has built-in Snowboy recognition, but it appears to be broken. Here is my code. (Note that the problem is that the listen() function never returns).
import speech_recognition as sr
from SnowboyDependencies import snowboydecoder
def get_text():
with sr.Microphone(sample_rate = 48000) as source:
audio = r.listen(source, snowboy_configuration=("SnowboyDependencies", {hotword_path})) #PROBLEM HERE
try:
text = r.recognize_google(audio).lower()
except:
text = none
print("err")
return text
I did some digging in SpeechRecognition and I have found where the problem exists, but I am not sure how to fix it because I am not very familiar with the intricacies of the library. The issue is that sr.listen never returns. It appears the the Snowboy hotword detection is 100% working, because the program progresses past that point when I say my hotword. Here is the source code. I have added my own comments to try to describe the issue further. I added three comments and all of them are enclosed in a multi-line box of #s.
def listen(self, source, timeout=None, phrase_time_limit=None, snowboy_configuration=None):
"""
Records a single phrase from ``source`` (an ``AudioSource`` instance) into an ``AudioData`` instance, which it returns.
This is done by waiting until the audio has an energy above ``recognizer_instance.energy_threshold`` (the user has started speaking), and then recording until it encounters ``recognizer_instance.pause_threshold`` seconds of non-speaking or there is no more audio input. The ending silence is not included.
The ``timeout`` parameter is the maximum number of seconds that this will wait for a phrase to start before giving up and throwing an ``speech_recognition.WaitTimeoutError`` exception. If ``timeout`` is ``None``, there will be no wait timeout.
The ``phrase_time_limit`` parameter is the maximum number of seconds that this will allow a phrase to continue before stopping and returning the part of the phrase processed before the time limit was reached. The resulting audio will be the phrase cut off at the time limit. If ``phrase_timeout`` is ``None``, there will be no phrase time limit.
The ``snowboy_configuration`` parameter allows integration with `Snowboy <https://snowboy.kitt.ai/>`__, an offline, high-accuracy, power-efficient hotword recognition engine. When used, this function will pause until Snowboy detects a hotword, after which it will unpause. This parameter should either be ``None`` to turn off Snowboy support, or a tuple of the form ``(SNOWBOY_LOCATION, LIST_OF_HOT_WORD_FILES)``, where ``SNOWBOY_LOCATION`` is the path to the Snowboy root directory, and ``LIST_OF_HOT_WORD_FILES`` is a list of paths to Snowboy hotword configuration files (`*.pmdl` or `*.umdl` format).
This operation will always complete within ``timeout + phrase_timeout`` seconds if both are numbers, either by returning the audio data, or by raising a ``speech_recognition.WaitTimeoutError`` exception.
"""
assert isinstance(source, AudioSource), "Source must be an audio source"
assert source.stream is not None, "Audio source must be entered before listening, see documentation for ``AudioSource``; are you using ``source`` outside of a ``with`` statement?"
assert self.pause_threshold >= self.non_speaking_duration >= 0
if snowboy_configuration is not None:
assert os.path.isfile(os.path.join(snowboy_configuration[0], "snowboydetect.py")), "``snowboy_configuration[0]`` must be a Snowboy root directory containing ``snowboydetect.py``"
for hot_word_file in snowboy_configuration[1]:
assert os.path.isfile(hot_word_file), "``snowboy_configuration[1]`` must be a list of Snowboy hot word configuration files"
seconds_per_buffer = float(source.CHUNK) / source.SAMPLE_RATE
pause_buffer_count = int(math.ceil(self.pause_threshold / seconds_per_buffer)) # number of buffers of non-speaking audio during a phrase, before the phrase should be considered complete
phrase_buffer_count = int(math.ceil(self.phrase_threshold / seconds_per_buffer)) # minimum number of buffers of speaking audio before we consider the speaking audio a phrase
non_speaking_buffer_count = int(math.ceil(self.non_speaking_duration / seconds_per_buffer)) # maximum number of buffers of non-speaking audio to retain before and after a phrase
# read audio input for phrases until there is a phrase that is long enough
elapsed_time = 0 # number of seconds of audio read
buffer = b"" # an empty buffer means that the stream has ended and there is no data left to read
##################################################
######THE ISSIE IS THAT THIS LOOP NEVER EXITS#####
##################################################
while True:
frames = collections.deque()
if snowboy_configuration is None:
# store audio input until the phrase starts
while True:
# handle waiting too long for phrase by raising an exception
elapsed_time += seconds_per_buffer
if timeout and elapsed_time > timeout:
raise WaitTimeoutError("listening timed out while waiting for phrase to start")
buffer = source.stream.read(source.CHUNK)
if len(buffer) == 0: break # reached end of the stream
frames.append(buffer)
if len(frames) > non_speaking_buffer_count: # ensure we only keep the needed amount of non-speaking buffers
frames.popleft()
# detect whether speaking has started on audio input
energy = audioop.rms(buffer, source.SAMPLE_WIDTH) # energy of the audio signal
if energy > self.energy_threshold: break
# dynamically adjust the energy threshold using asymmetric weighted average
if self.dynamic_energy_threshold:
damping = self.dynamic_energy_adjustment_damping ** seconds_per_buffer # account for different chunk sizes and rates
target_energy = energy * self.dynamic_energy_ratio
self.energy_threshold = self.energy_threshold * damping + target_energy * (1 - damping)
else:
# read audio input until the hotword is said
#############################################################
########THIS IS WHERE THE HOTWORD DETECTION OCCURRS. HOTWORDS ARE DETECTED. I KNOW THIS BECAUSE THE PROGRAM PROGRESSES PAST THIS PART.
#############################################################
snowboy_location, snowboy_hot_word_files = snowboy_configuration
buffer, delta_time = self.snowboy_wait_for_hot_word(snowboy_location, snowboy_hot_word_files, source, timeout)
elapsed_time += delta_time
if len(buffer) == 0: break # reached end of the stream
frames.append(buffer)
# read audio input until the phrase ends
pause_count, phrase_count = 0, 0
phrase_start_time = elapsed_time
while True:
# handle phrase being too long by cutting off the audio
elapsed_time += seconds_per_buffer
if phrase_time_limit and elapsed_time - phrase_start_time > phrase_time_limit:
break
buffer = source.stream.read(source.CHUNK)
if len(buffer) == 0: break # reached end of the stream
frames.append(buffer)
phrase_count += 1
# check if speaking has stopped for longer than the pause threshold on the audio input
energy = audioop.rms(buffer, source.SAMPLE_WIDTH) # unit energy of the audio signal within the buffer
if energy > self.energy_threshold:
pause_count = 0
else:
pause_count += 1
if pause_count > pause_buffer_count: # end of the phrase
break
# check how long the detected phrase is, and retry listening if the phrase is too short
phrase_count -= pause_count # exclude the buffers for the pause before the phrase
####################################################################3
#######THE FOLLOWING CONDITION IS NEVER MET THEREFORE THE LOOP NEVER EXITS AND THE FUNCTION NEVER RETURNS################
############################################################################
if phrase_count >= phrase_buffer_count or len(buffer) == 0: break # phrase is long enough or we've reached the end of the stream, so stop listening
# obtain frame data
for i in range(pause_count - non_speaking_buffer_count): frames.pop() # remove extra non-speaking frames at the end
frame_data = b"".join(frames)
return AudioData(frame_data, source.SAMPLE_RATE, source.SAMPLE_WIDTH)
The issue is that the main while loop in listen() never exits. I am not sure why. Note that the SpeechRecognition module works flawlessly when I am not integrating snowboy. Also note that snowboy works flawlessly on its own.
I am also providing the speech_recognition.snowboy_wait_for_hot_word() method as the problem could be in here.
def snowboy_wait_for_hot_word(self, snowboy_location, snowboy_hot_word_files, source, timeout=None):
print("made it")
# load snowboy library (NOT THREAD SAFE)
sys.path.append(snowboy_location)
import snowboydetect
sys.path.pop()
detector = snowboydetect.SnowboyDetect(
resource_filename=os.path.join(snowboy_location, "resources", "common.res").encode(),
model_str=",".join(snowboy_hot_word_files).encode()
)
detector.SetAudioGain(1.0)
detector.SetSensitivity(",".join(["0.4"] * len(snowboy_hot_word_files)).encode())
snowboy_sample_rate = detector.SampleRate()
elapsed_time = 0
seconds_per_buffer = float(source.CHUNK) / source.SAMPLE_RATE
resampling_state = None
# buffers capable of holding 5 seconds of original and resampled audio
five_seconds_buffer_count = int(math.ceil(5 / seconds_per_buffer))
frames = collections.deque(maxlen=five_seconds_buffer_count)
resampled_frames = collections.deque(maxlen=five_seconds_buffer_count)
while True:
elapsed_time += seconds_per_buffer
if timeout and elapsed_time > timeout:
raise WaitTimeoutError("listening timed out while waiting for hotword to be said")
buffer = source.stream.read(source.CHUNK)
if len(buffer) == 0: break # reached end of the stream
frames.append(buffer)
# resample audio to the required sample rate
resampled_buffer, resampling_state = audioop.ratecv(buffer, source.SAMPLE_WIDTH, 1, source.SAMPLE_RATE, snowboy_sample_rate, resampling_state)
resampled_frames.append(resampled_buffer)
# run Snowboy on the resampled audio
snowboy_result = detector.RunDetection(b"".join(resampled_frames))
assert snowboy_result != -1, "Error initializing streams or reading audio data"
if snowboy_result > 0: break # wake word found
return b"".join(frames), elapsed_time
I am running python 3.7 on a Raspberry pi 3B+ running Raspbian Buster Lite (kernel 4.19.36). Please ask if I can provide any additional information.

Related

Timing issue in windows rather than linux

I have the following function from a colleague who was previously working for the company and the comments are self explanatory, the problem is I'm right now using windows, and there issues with the synchornization with the device.
Would someone address or know a solution in windows for sync with a device ?
def sync_time(self):
"""Sync time on SmartScan."""
# On a SmartScan time can be set only by the precision of seconds
# So we need to wait for the next full second until we can send
# the packet on it's way to the scanner.
# It's not perfect, but the error should be more or less constant.
message = Maint()
message.state = message.OP_NO_CHANGE
now = datetime.datetime.utcnow()
epoch = datetime.datetime(1970, 1, 1)
# int and datetime objects
seconds = int((now - epoch).total_seconds()) + 1 # + sync second
utctime = datetime.datetime.utcfromtimestamp(seconds)
# wait until next full second
# works only on Linux with good accuracy
# Windows needs another approach
time.sleep((utctime - datetime.datetime.utcnow()).total_seconds())
command = MaintRfc()
command.command = command.SET_CLOCK
command.data = (seconds, )
message.add_message(command)
self._handler.sendto(message)
LOG.debug("Time set to: %d = %s", seconds, utctime)

How to wait until a sound file ends in vlc in Python 3.6

I have a question in vlc in python
import vlc
sound = vlc.MediaPlayer('sound.mp3')
sound.play()
# i wanna wait until the sound ends then do some code without
time.sleep()
import time, vlc
def Sound(sound):
vlc_instance = vlc.Instance()
player = vlc_instance.media_player_new()
media = vlc_instance.media_new(sound)
player.set_media(media)
player.play()
time.sleep(1.5)
duration = player.get_length() / 1000
time.sleep(duration)
After edit
That exactly what I wanted, thanks everyone for helping me ..
You can use the get_state method (see here: https://www.olivieraubert.net/vlc/python-ctypes/doc/) to check the state of the vlc player.
Something like
vlc_instance = vlc.Instance()
media = vlc_instance.media_new('sound.mp3')
player = vlc_instance.media_player_new()
player.set_media(media)
player.play()
print player.get_state()# Print player's state
for wait util end of vlc play sound, except your:
player.play()
time.sleep(1.5)
duration = player.get_length() / 1000
time.sleep(duration)
other possible (maybe more precise, but CPU costing) method is:
# your code ...
Ended = 6
current_state = player.get_state()
while current_state != Ended:
current_state = player.get_state()
# do sth you want
print("vlc play ended")
refer :
vlc.State definition
vlc.Instance
vlc.MediaPlayer - get_state
I.m.o. Flipbarak had it almost right. My version:
import vlc, time
vlc_instance = vlc.Instance()
song = 'D:\\mozart.mp3'
player = vlc_instance.media_player_new()
media = vlc_instance.media_new(song)
media.get_mrl()
player.set_media(media)
player.play()
playing = set([1])
time.sleep(1.5) # startup time.
duration = player.get_length() / 1000
mm, ss = divmod(duration, 60)
print "Current song is : ", song, "Length:", "%02d:%02d" % (mm,ss)
time_left = True
# the while loop checks every x seconds if the song is finished.
while time_left == True:
song_time = player.get_state()
print 'song time to go: %s' % song_time
if song_time not in playing:
time_left = False
time.sleep(1) # if 1, then delay is 1 second.
print 'Finished playing your song'
A slight alternative / method that i just tested and had good results (without needing to worry about the State.xxxx types.
This also allowed me to lower the overall wait/delay as i'm using TTS and found i would average 0.2 seconds before is_playing returns true
p = vlc.MediaPlayer(audio_url)
p.play()
while not p.is_playing():
time.sleep(0.0025)
while p.is_playing():
time.sleep(0.0025)
The above simply waits for the media to start playing and then for it to stop playing.
Note: I'm testing this via a URL / not a local file but was having the same issue and believe this will work the same.
Also fully aware its a slightly older and answered, but hopefully its of use to some.

Python3 Two-Way Serial Communication: Reading In Data

I am trying to establish a two-way communication via Python3. There is a laser range finder plugged into one of my USB ports and I'd like to send/receive commands to that. I have a sheet of commands which can be sent and what they would return, so this part is already there.
What I need is a convenient way to do it in real-time. So far I have the following code:
import serial, time
SERIALPORT = "/dev/ttyUSB0"
BAUDRATE = 115200
ser = serial.Serial(SERIALPORT, BAUDRATE)
ser.bytesize = serial.EIGHTBITS #number of bits per bytes
ser.parity = serial.PARITY_NONE #set parity check: no parity
ser.stopbits = serial.STOPBITS_ONE #number of stop bits
ser.timeout = None #block read
ser.xonxoff = False #disable software flow control
ser.rtscts = False #disable hardware (RTS/CTS) flow control
ser.dsrdtr = False #disable hardware (DSR/DTR) flow control
ser.writeTimeout = 0 #timeout for write
print ("Starting Up Serial Monitor")
try:
ser.open()
except Exception as e:
print ("Exception: Opening serial port: " + str(e))
if ser.isOpen():
try:
ser.flushInput()
ser.flushOutput()
ser.write("1\r\n".encode('ascii'))
print("write data: 1")
time.sleep(0.5)
numberOfLine = 0
while True:
response = ser.readline().decode('ascii')
print("read data: " + response)
numberOfLine = numberOfLine + 1
if (numberOfLine >= 5):
break
ser.close()
except Exception as e:
print ("Error communicating...: " + str(e))
else:
print ("Cannot open serial port.")
So in the above code I am sending "1" which should trigger "getDistance()" function of the laser finder and return the distance in mm. I tried this on Putty and it works, returns distances up to 4 digits. However, when I launch the above Python script, my output is only the following:
Starting Up Serial Monitor
Exception: Opening serial port: Port is already open.
write data: 1
read data:
and it goes forever. There is no read data or whatsoever.
Where am I mistaken?
Apparently much more simpler version of the code solved the issue.
import serial
import time
ser = serial.Serial('/dev/ttyUSB0', 115200, timeout = 1) # ttyACM1 for Arduino board
readOut = 0 #chars waiting from laser range finder
print ("Starting up")
connected = False
commandToSend = 1 # get the distance in mm
while True:
print ("Writing: ", commandToSend)
ser.write(str(commandToSend).encode())
time.sleep(1)
while True:
try:
print ("Attempt to Read")
readOut = ser.readline().decode('ascii')
time.sleep(1)
print ("Reading: ", readOut)
break
except:
pass
print ("Restart")
ser.flush() #flush the buffer
Output, as desired:
Writing: 1
Attempt to Read
Reading: 20
Restart
Writing: 1
Attempt to Read
Reading: 22
Restart
Writing: 1
Attempt to Read
Reading: 24
Restart
Writing: 1
Attempt to Read
Reading: 22
Restart
Writing: 1
Attempt to Read
Reading: 26
Restart
Writing: 1
Attempt to Read
Reading: 35
Restart
Writing: 1
Attempt to Read
Reading: 36
It seems to me that your ser.timeout = None may be causing a problem here. The first cycle of your while loop seems to go through fine, but your program hangs when it tries ser.readline() for the second time.
There are a few ways to solve this. My preferred way would be to specify a non-None timeout, perhaps of one second. This would allow ser.readline() to return a value even when the device does not send an endline character.
Another way would be to change your ser.readline() to be something like ser.read(ser.in_waiting) or ser.read(ser.inWaiting()) in order to return all of the characters available in the buffer.
try this code
try:
ser = serial.Serial("/dev/ttyS0", 9600) #for COM3
ser_bytes = ser.readline()
time.sleep(1)
inp = ser_bytes.decode('utf-8')
print (inp)
except:
pass

Can't receive the data from device RFB2000 in python

I am using the module "ctypes" to load RFBClient.dll,I use windll and the convention is stdcall. I want to remote control the device RFB2000 with these commands below:
first step for connection
All the commands
for my programme, the connection is successful but the problem is that i can't recieve the data, when I want to get temperature value, I call the function but it always returns 0, the restype is c_double and the argtypes is none, I can't see there is any problem. English is not my native language; please excuse typing errors.
import ctypes
import time
libc = ctypes.WinDLL("X:\\RFBClient.dll")
#connect to RFB software
libc.OpenRFBConnection(ctypes.c_char_p('127.0.0.1'.encode('UTF-8')))
#check if connection successful
libc.Connected()
#Set parameters
#num_automeas = 1; %Number of auto-measurement runs.
completion_count = 2; #% Number of On-Off pairs within each auto-measurement run.
OnHalfCycleTimeCount = 40; # set 2s on
OffHalfCycleTimeCount = 40; # set 2s off
Data=[]
libc.SetCompletionCount(completion_count)
libc.SetMeasureUntilCount(completion_count)
libc.SetOnHalfCycleCount(OnHalfCycleTimeCount)
libc.SetOffHalfCycleCount(OffHalfCycleTimeCount)
libc.NewAutoMeasurement()
#zeroing
time.sleep(1)
print("zeroing.....")
libc.Zero()
while libc.Zeroing()== -1:
time.sleep(1)
#libc.CheckingSensor()
print("measurement start")
libc.StartMeas()
time.sleep(0.5)
while libc.Measuring() == -1:
time.sleep(1)
print(libc.Measuring())
getTemperature = libc.GetTemperature
getTemperature.restype = ctypes.c_double
getTemperature.argtypes = []
print(getTemperature())

Raspberry PI with GPIO Input buttons

I have a PI running with 4 GPIO input ports.
The target is, if one the 4 buttons will be pressed, a mp3 file should be played, i.e. button1 = file1.mp3, button2 = file2.mp3 and so on.
It seems to be no so complicate, but 'the devil is in the detail' :-)
This is my code for 2 buttons at moment :
#!/usr/bin/env python
#coding: utf8
import time
from time import sleep
import os
import RPi.GPIO as GPIO
GPIO.setmode(GPIO.BCM)
GPIO.setup(24, GPIO.IN, pull_up_down = GPIO.PUD_DOWN)
GPIO.setup(23, GPIO.IN, pull_up_down = GPIO.PUD_DOWN)
def my_callback_1(channel):
print("Button 23 Pressed")
os.system('omxplayer -o both /root/1.mp3')
sleep(10)
def my_callback_2(channel):
print("Button 24 Pressed")
os.system('omxplayer -o both /root/2.mp3')
sleep(10)
GPIO.add_event_detect(23, GPIO.RISING, callback=my_callback_1, bouncetime=200)
GPIO.add_event_detect(24, GPIO.RISING, callback=my_callback_2, bouncetime=200)
try:
while 1:
time.sleep(0.5)
except KeyboardInterrupt:
# exits when you press CTRL+C
print(" Bye Bye")
except:
print("Other error or exception occurred!")
finally:
GPIO.cleanup() # this ensures a clean exit
The sleep time is set for the longer of the mp3 file.
Its working, but not like I expected.
The problem is, when a buttons will be pushed while a file is already playing, the PI keeps the button push in a buffer and play the file after the current file in a loop.
Imagine, somebody will push 5 times the same button, 5 times the same mp3 file will be played in a loop.
So I searching for a solution like this:
While a file is playing, all Input buttons should be "disabled" for this time. When the mp3 file paying is finished, the buttons should be "re-enabled" and another button can be pushed.
How can I this ? Thanks for your help.
I don't see a simple way to do this without adding threads. Note that you are implicitly already using threads behind the scenes with add_event_detect(), which runs the callbacks in separate threads. If add_event_detect doesn't support suppressing the button presses (which I don't think it does), then you can thread it in one of two ways - using python threads or processes, or a simpler way by using bash.
To use background processes in bash, remove your add_event_detect calls, and then in your while loop, you'd do something like (untested):
started_23 = 0
while True:
if GPIO.input(23) and time.time() - started_23 > 10:
started_23 = time.time()
print("Button 23 Pressed")
os.system('omxplayer -o both /root/1.mp3 &')
time.sleep(0.200)
Note the ampersand added to the system() call - that will run the omxplayer in the background. And the started_23 variable keeps track of when the sound was started in order to prevent replaying it for another 10 seconds. You may way to increase that to include the length of the file time. You can similarly add in code for GPIO 24 in the same loop.
Thanks for help, Brian. You brought me in the right direction !
I got it. Its working now as I described above.
Here my code :
try:
vtc1 = 8 # Time Audiofile 1
vtc2 = 11 # Time Audiofile 2
vtc3 = 0 # Time Audiofile 3
vtc4 = 0 # Time Audiofile 4
vtc = 0 # Current AudioFileTime
started_t = 0 # Started Time
while True:
if GPIO.input(23) and time.time() - started_t > vtc:
vtc = vtc1
started_t = time.time()
print("Button 23 Pressed")
os.system('omxplayer -o both /root/1.mp3 &')
time.sleep(0.200)
if GPIO.input(24) and time.time() - started_t > vtc:
vtc = vtc2
started_t = time.time()
print("Button 24 Pressed")
os.system('omxplayer -o both /root/2.mp3 &')
time.sleep(0.200)
The problem was, that the seconded file was started before the first was finished. Because the code did not know the longer of currently played file. So I put the time of the audiofile in the "vtc" value when this file were executed.
If you push another button, it calculate the time playing with the current file time "vtc". That's it.
Thanks again.

Resources