How to identify the source from Google Assistant on Dialogflow? - dialogflow-es

I have two Sonos speakers. I have built an application on Dialogflow where I can say "play muse" to listen to my MP3s on them. The sentence is sent to a PHP web server (via a webhook), and the sound is sent to the Sonos with the API. But the only solution I have found to identify the target speaker is to name it in the phrase:
"play muse on the kitchen", or "play muse on the bedroom".
I don't know how to identify the source the request comes from, so that I can deliver the sound to the correct speaker without naming it.
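For reference, a minimal Node.js sketch of the current name-the-room workaround, assuming a Dialogflow ES intent with room and track parameters and a hypothetical playOnSonos helper (the actual server is PHP, but the fulfillment payload is the same):

const express = require('express');
const app = express();
app.use(express.json());

// Hypothetical helper: call the Sonos API for the speaker mapped to this room.
function playOnSonos(room, track) {
  console.log(`Playing ${track} on the ${room} speaker`);
}

app.post('/webhook', (req, res) => {
  // Dialogflow ES v2 fulfillment request: parameters live under queryResult.
  const params = req.body.queryResult.parameters;
  const room = params.room; // only present because the phrase names the room
  playOnSonos(room, params.track);
  res.json({ fulfillmentText: `Playing ${params.track} in the ${room}` });
});

app.listen(3000);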
Do you have any ideas?

Related

Add music to Dialogflow

I am trying to add music to my Dialogflow agent. I don't want to add it from the Dialogflow console; I wish to add it from the webhook. Can you please tell me how to add the music from the webhook? I am trying this code, but it's not working:
app.intent('Music', (conv) => {
  // Note: the SSML needs a closing </speak> tag, and the response must
  // actually be sent with conv.ask() to reach the user.
  const speech = '<speak>' +
    '<audio src="soundbank://soundlibrary/ui/gameshow/amzn_ui_sfx_gameshow_countdown_loop_32s_full_01"/>' +
    'Did not get the audio file' +
    '</speak>';
  conv.ask(speech);
});
I also want an interrupt keyword that will stop the music. Is there a predefined way to do this, or, if it has to be user-defined, how do I interrupt the music and proceed with the rest of my code?
Firstly, to be able to add music, it needs to be hosted on a publicly accessible HTTPS endpoint; see the docs. Make sure you can access your file even in a private browsing mode such as Chrome's Incognito.
Secondly, if you choose to use SSML to play your audio, the audio becomes part of the speech response. You won't be able to create any custom interruptions or controls over the music; the user can only stop it by stopping your action or saying "Okay Google" again to interrupt your response.
If you want to let users control the music you send them, have a look at the media responses in the Actions on Google library.
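As a sketch of that approach, assuming the actions-on-google Node.js client library and a placeholder MP3 URL:

const { dialogflow, MediaObject, Suggestions } = require('actions-on-google');
const app = dialogflow();

app.intent('Music', (conv) => {
  // A media response must be preceded by a simple response.
  conv.ask('Here is your track.');
  conv.ask(new MediaObject({
    name: 'My track',
    url: 'https://example.com/audio/my-track.mp3', // placeholder HTTPS URL
    description: 'Played from the webhook',
  }));
  // Suggestion chips are required alongside media on screen devices.
  conv.ask(new Suggestions('Stop'));
});

With a media response, the Assistant provides its own pause/stop controls, which covers the interruption requirement without custom keywords.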

Actions on Google - Can we use a custom voice for Google Home?

I want my Google Assistant app to read responses using a custom user voice. I'm using a webhook to send responses to user queries. Currently, I'm sending text responses. I have built a custom voice model using https://lyrebird.ai/. When my webhook is triggered, I would like to first convert the text response to audio using my custom model and send that audio to Google Home. Is this possible? Or is there a better way of achieving this?
Yes, you can synthesize your own audio, host it, and return it as SSML in an audio tag:
https://developers.google.com/actions/reference/ssml
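A minimal sketch of that flow; the synthesizeWithCustomVoice helper is hypothetical and stands in for running your model and uploading the result to a publicly accessible HTTPS URL:

const { dialogflow } = require('actions-on-google');
const app = dialogflow();

// Hypothetical: run the text through the custom voice model and upload the
// resulting audio, returning its public HTTPS URL.
async function synthesizeWithCustomVoice(text) {
  return 'https://example.com/audio/response.mp3'; // placeholder
}

app.intent('Default Welcome Intent', async (conv) => {
  const text = 'Welcome back!';
  const audioUrl = await synthesizeWithCustomVoice(text);
  // The text inside <audio> is spoken as a fallback if the file cannot play.
  conv.ask(`<speak><audio src="${audioUrl}">${text}</audio></speak>`);
});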

Get raw voice input

I have enabled Google Assistant in my api.ai agent and get the valid intent. However, I would also like to get the user's voice print and save the user's context with that voice print. Is there a way to intercept the request so that I can get the voice print? The idea is to save the conversation with the user's voice.
Thanks
There is no way to get the original audio recording of the user's request, just the transcript. This is currently true for all the voice assistants: Google Assistant, Alexa, and Cortana.

Alexa custom skill - handler to internally invoke other skills

I am trying to develop a custom skill that would perform the operations below:
Alexa, launch Michael Jackson app
Then I would give the user the option to select from the choices below:
Alexa, play music on Spotify (and I need to internally pass the value of the artist (MJ))
Alexa, play music on Pandora (and I need to internally pass the value of the artist (MJ))
Alexa, play music on podcast (and I need to internally pass the value of the artist (MJ))
The user can specify MJ on Spotify, iMusic, Pandora, etc.
Is this doable?
You cannot invoke Alexa again with 'Alexa, play music on Spotify' while one session is going on. There is a custom solution, but only if the other services (like Spotify) expose a REST API you can use. If they do, then after opening your skill ("Alexa, launch Michael Jackson app") you can give the user options like the ones below:
say 1 to play music on Spotify
say 2 to play music on Pandora
say 3 to play music on a podcast
Once the user responds with a number (1, 2, 3, etc.), you can take another input from the user for the artist name, then call the corresponding API according to the user's input.
Please note that all of this logic is only possible if the other party has exposed a REST API.
Yes, this can be done in several ways. One would require your app to respond to the launch request, plus three intents:
"Alexa, open Michael Jackson app" would launch your app. It should respond to the launch request with something like "Where would you like me to play Michael Jackson? You can say Spotify, Pandora, or podcast."
SpotifyIntent: "play music on Spotify" or even just "Spotify"
PandoraIntent: "play music on Pandora" or even just "Pandora"
PodcastIntent: "play music on podcast" or even just "podcast"
Your intent handlers would then need to make the REST calls to the selected service, as in the sketch below.
This could also be done using slots, but I think the above is about the simplest way to accomplish what you describe.
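A minimal sketch of that structure using the ASK SDK v2 for Node.js; the intent names and the callSpotifyApi helper are assumptions:

const Alexa = require('ask-sdk-core');

// Hypothetical: call Spotify's REST API to start playback for the artist.
async function callSpotifyApi(artist) { /* ... */ }

const LaunchRequestHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'LaunchRequest';
  },
  handle(handlerInput) {
    const speech = 'Where would you like me to play Michael Jackson? ' +
      'You can say Spotify, Pandora, or podcast.';
    return handlerInput.responseBuilder.speak(speech).reprompt(speech).getResponse();
  },
};

const SpotifyIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'SpotifyIntent';
  },
  async handle(handlerInput) {
    await callSpotifyApi('Michael Jackson');
    return handlerInput.responseBuilder
      .speak('Playing Michael Jackson on Spotify.')
      .getResponse();
  },
};

// PandoraIntentHandler and PodcastIntentHandler follow the same pattern.
exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler, SpotifyIntentHandler)
  .lambda();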

Play an audio file in a Twilio conference

I'm looking for an example of how to play an audio file during a Twilio conference session. I'd like to play the audio file into the conference at the push of a button by one of the participants.
Any ideas?
Here's an article on how to do a roll call during a conference; it will give you an idea of how to manage the workflow (you just want the user to hear the played message, without needing to parse the calls):
https://www.twilio.com/blog/2014/09/roll-call-roger-stringer-shows-you-how-to-take-a-headcount-during-a-twilio-conference-call.html
Basically, on the keypress you have the user exit the conference, play the message to them, then have them rejoin the conference. If you want it to play for all users at once, note that it would disrupt the conference audio.
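A minimal sketch with the Twilio Node.js helper library: on the keypress you redirect the caller to an endpoint like this one, which plays the file and then dials the caller back into the conference (the room name and MP3 URL are placeholders):

const express = require('express');
const { twiml } = require('twilio');
const app = express();

app.post('/play-then-rejoin', (req, res) => {
  const response = new twiml.VoiceResponse();
  response.play('https://example.com/audio/announcement.mp3'); // placeholder file
  // After the file finishes, put the caller back into the conference.
  const dial = response.dial();
  dial.conference('my-conference-room'); // placeholder room name
  res.type('text/xml').send(response.toString());
});

app.listen(3000);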
