We are using Female 1 as the voice in our Google Action. Our intent is fulfilled by a Cloud Function that responds with some SSML.
After triggering the intent several times, the speaker emits a low-pitched earcon and switches the voice to Male.
Has anyone experienced this or know what's happening?
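For context, the fulfillment itself does nothing exotic. Here is a minimal sketch of the kind of handler involved, assuming the actions-on-google Dialogflow client and Cloud Functions for Firebase (the intent and function names here are illustrative, not our actual ones); the voice is only ever selected in the Actions console, never in the response:

```typescript
import { dialogflow } from 'actions-on-google';
import * as functions from 'firebase-functions';

const app = dialogflow();

// Illustrative intent handler: the response is plain SSML; the Female 1 / Male
// voice choice lives entirely in the Actions console settings.
app.intent('example_intent', (conv) => {
  conv.ask(
    '<speak>Here is the answer.<break time="300ms"/>Anything else?</speak>'
  );
});

// Expose the Dialogflow app as an HTTPS Cloud Function.
export const fulfillment = functions.https.onRequest(app);
```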
I face a somewhat weird problem. Unfortunately, I have to diagnose and repair it from a distance without any real access, so I don't know if I can provide enough information to solve it.
Context:
I built an interface for my grandma (94) to chat with the family via Telegram, by connecting a Telegram bot to a microphone (for her to record voice messages) and a small printer (so our messages get printed out for her). [This is inspired by the Yayagram!]
The problem:
For the last year, everything worked just fine. However, a few days ago the bot stopped working: my grandma could not send any voice messages and didn't receive any of ours. After calling her and checking all the connections, we restarted the device. Now she can send voice messages, but the bot still doesn't process our text messages (nor commands!).
I have never encountered a bot that can send messages without processing incoming ones. As I said, I have no way to analyze the problem in detail, as my grandma lives 500 km away. If anybody has any idea where the problem could be, I would be very happy.
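For what it's worth, the bot logic itself is tiny. A rough sketch of the shape of the setup, assuming a node-telegram-bot-api style bot (the actual scripts on the device differ, and the handler bodies here are stubs):

```typescript
import TelegramBot from 'node-telegram-bot-api';

// Long-polling bot: the same process both receives updates and sends messages.
const bot = new TelegramBot(process.env.BOT_TOKEN!, { polling: true });

// Incoming text from the family -> forward to the printer (stubbed here).
bot.on('message', (msg) => {
  if (msg.text) {
    console.log(`Would print message from ${msg.chat.id}: ${msg.text}`);
  }
});

// Surface polling problems instead of failing silently; a handler like this
// is the first place I'd look when incoming updates stop arriving while
// outgoing sends (voice messages) still work.
bot.on('polling_error', (err) => {
  console.error('Polling error:', err.message);
});
```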
Edit:
I knew posting a question on Stack Overflow would do the trick! Out of nowhere, the bot seems to work again. I have no idea what caused the outage or why it resolved itself, as I did nothing. I am still grateful for any insight into what might have caused it!
I'm using the Google Cloud Speech API for streaming input.
I need the ability to handle the event that's fired as soon as audio is being detected. I think the appropriate one is readable.
However, I couldn't get it to work yet:
If I attach this listener to the Speech API's streamingRecognize() stream, it behaves the same as the data listener, just without any arguments.
If, on the other hand, I attach the listener to the stream returned by node-record-lpcm16's start(), the event works as expected, but pipe() seems to stop delivering anything, as the Speech API's data event is never fired.
Well, according to this webpage, it's impossible.
It seems that the first event that's fired is data.
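For anyone landing here, this is roughly the pipeline being discussed. A sketch only, assuming @google-cloud/speech and the older node-record-lpcm16 start() API; the key detail is that a one-shot data listener on the microphone stream can coexist with pipe(), whereas attaching a readable listener switches the stream to paused mode, which would explain why pipe() appeared to stop feeding the Speech API:

```typescript
const speech = require('@google-cloud/speech');
const record = require('node-record-lpcm16');

const client = new speech.SpeechClient();

// Duplex stream to the Speech API; its 'data' event is indeed the first thing
// the API itself reports, and only once results come back.
const recognizeStream = client
  .streamingRecognize({
    config: {
      encoding: 'LINEAR16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
    },
    interimResults: true,
  })
  .on('data', (data: any) => {
    console.log('Transcript:', data.results?.[0]?.alternatives?.[0]?.transcript);
  });

// Microphone stream from node-record-lpcm16 (older start() API).
const mic = record.start({ sampleRate: 16000, threshold: 0 });

mic.pipe(recognizeStream);

// To know that audio is arriving, listen locally on the mic stream with a
// one-shot 'data' listener, which coexists with pipe(). Attaching 'readable'
// instead stops the stream from flowing until read() is called.
mic.once('data', () => console.log('Microphone audio detected'));
```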
I followed this example and managed to collect the audio buffers from my microphone and send them to Dialogflow:
https://cloud.google.com/dialogflow-enterprise/docs/detect-intent-stream
But this processing is sequential: I first have to collect all the audio buffers before I can send them to Dialogflow.
I then get the correct result, and also the intermediate results,
but only after waiting for the person to stop talking before I could send the collected audio buffers to Dialogflow.
I would like to send (stream) the audio buffers to Dialogflow instantly, while somebody is still talking, and get the intermediate results right away.
Does anybody know if this is possible and can point me in the right direction?
My preferred language is Python.
Thanks a lot!
I got this Answer from the Dialogflow support team:
From the Dialogflow documentation: Recognition ceases when it detects that the audio's voice has stopped or paused. In this case, once a detected intent is received, the client should close the stream and start a new request with a new stream as needed. This means that the user has to stop or pause speaking in order for you to send the audio to Dialogflow.
In order for Dialogflow to detect a proper intent, it has to have the full user utterance.
If you are looking for real-time speech recognition, look into our Speech-to-Text product (https://cloud.google.com/speech-to-text/).
While trying to do something similar recently, I found that someone had already run into this problem and figured it out. Basically, you can feed an audio stream to Dialogflow via the streamingDetectIntent method and get intermediate results as valid language is recognized in the audio input. The tricky bit is that you need to set a threshold on your input stream so that the stream is ended once the user stops talking for a set duration. Closing the stream serves the same purpose as reaching the end of an audio file, and triggers the intent-matching attempt.
The solution linked above uses SoX to stream audio from an external device. The nice thing about this approach is that SoX already has options for setting audio level thresholds to start/stop the streaming process (look at the silence option), so you can fine-tune the settings to work for your needs. If you're not using NodeJS, you may need to write your own utility to handle initiating the audio stream, but hopefully this can point you in the right direction.
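To make that concrete, here is a sketch of the approach, assuming the Node.js dialogflow client and node-record-lpcm16 on top of SoX; the project and session IDs are placeholders, and the exact placement of the single-utterance option varies between client versions:

```typescript
const dialogflow = require('dialogflow');
const record = require('node-record-lpcm16');

const sessionClient = new dialogflow.SessionsClient();
const sessionPath = sessionClient.sessionPath('my-project-id', 'my-session-id');

// Bidirectional stream: interim transcripts and the final intent match both
// arrive as 'data' events while audio is still being written.
const detectStream = sessionClient
  .streamingDetectIntent()
  .on('data', (data: any) => {
    if (data.recognitionResult) {
      console.log('Interim transcript:', data.recognitionResult.transcript);
    }
    if (data.queryResult) {
      console.log('Matched intent:', data.queryResult.intent.displayName);
    }
  });

// First message: configuration only, no audio.
detectStream.write({
  session: sessionPath,
  queryInput: {
    audioConfig: {
      audioEncoding: 'AUDIO_ENCODING_LINEAR_16',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
    },
  },
  singleUtterance: true,
});

// Stream microphone chunks as they are produced. The SoX-backed recorder is
// configured to stop after a stretch of silence, which half-closes the
// Dialogflow stream and triggers the final intent match.
const mic = record.start({ sampleRate: 16000, threshold: 0.5, silence: '1.0' });
mic.on('data', (chunk: Buffer) => detectStream.write({ inputAudio: chunk }));
mic.on('end', () => detectStream.end());
```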
I have set up a webhook to my development environment. In addition to receiving the event notifications I would expect, I receive additional notifications (with unique event IDs) which do not make sense to me. When I try to look up those events on the Stripe dashboard, I cannot find any trace of them... Shouldn't all events be logged on the dashboard?
Has anyone experienced this? I've sent a mail to support but have had no answer yet.
EDIT:
I could find one of the "missing" events on the dashboard now. It was buried far down the event list: it had apparently been fired for the first time several hours earlier, but my application couldn't handle it at that time.
All events appear on the dashboard, listed at the time they were first fired. If the application receiving the notification can't return a 2xx response for some reason, Stripe will try sending the notification again later. By the time the application finally handles the event properly and logs it on its side, the event may have been fired hours earlier, so the timing can look confusing.
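For illustration, a minimal sketch of a handler that acknowledges events promptly so Stripe stops redelivering them, assuming Express and the stripe Node library (the endpoint path, environment variable names, and API version are placeholders):

```typescript
import express from 'express';
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, {
  apiVersion: '2022-11-15',
});
const app = express();

// Stripe signs the raw body, so this route must not use the JSON parser.
app.post(
  '/stripe/webhook',
  express.raw({ type: 'application/json' }),
  (req, res) => {
    let event: Stripe.Event;
    try {
      event = stripe.webhooks.constructEvent(
        req.body,
        req.headers['stripe-signature'] as string,
        process.env.STRIPE_WEBHOOK_SECRET!
      );
    } catch (err) {
      // Bad signature: reject; Stripe will retry this delivery as well.
      res.status(400).send('Invalid signature');
      return;
    }

    // Respond 2xx promptly; anything else makes Stripe redeliver the event
    // later, which is exactly the "hours later" behaviour described above.
    console.log(`Received ${event.type} (${event.id})`);
    res.sendStatus(200);
  }
);

app.listen(4242);
```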
I used my presence callback to display users who join or leave my chat room, but these updates seem very delayed after a user logs on or off. What's the cause, and how can I achieve a faster response?