I want my Google Assistant app to read responses using a custom user voice. I'm using a webhook to send responses to user queries. Currently, I'm sending text responses. I have built a custom voice model using Lyrebird (https://lyrebird.ai/). When my webhook is triggered, I would like to first convert the text response to audio using my custom model and send the output audio to Google Home. Is this possible? Or is there a better way of achieving this?
Yes, you can synthesize your own audio and return it as SSML in an audio tag:
https://developers.google.com/actions/reference/ssml
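For example, if you host the Lyrebird-generated audio file on a publicly accessible HTTPS endpoint, a minimal webhook sketch with the actions-on-google Node.js library could look like this (the intent name and audio URL below are placeholders, not from the question):

const { dialogflow } = require('actions-on-google');
const app = dialogflow();

app.intent('Default Welcome Intent', (conv) => {
  // Placeholder URL: point this at the audio file you synthesized
  // with your custom voice model and uploaded somewhere public.
  const ssml = '<speak>' +
    '<audio src="https://example.com/audio/welcome-custom-voice.mp3">' +
    'Sorry, I could not play the audio.' +  // fallback if the file fails to load
    '</audio></speak>';
  conv.ask(ssml);
});

Note that this plays pre-synthesized files, so your webhook would need to synthesize (or fetch) the audio for each response before replying, then return its URL in the SSML.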
I am trying to add music to my Dialogflow agent. I don't want to add it from the Dialogflow console; I wish to add it from the webhook. Can you please tell me how to add music from the webhook? I am trying this code, but it's not working:
app.intent('Music', (conv) => {
  var speech = '<speak><audio src="soundbank://soundlibrary/ui/gameshow/amzn_ui_sfx_gameshow_countdown_loop_32s_full_01"/>Did not get the audio file</speak>';
  conv.ask(speech);
});
Also, I want to use an interrupt keyword that will stop this music. Is there a predefined way to do this, or if it has to be user-defined, how do I interrupt the music and proceed with the rest of my code?
Firstly, to be able to add music, it needs to be hosted on a publicly accessible HTTPS endpoint; see the docs. So make sure you can access your file even when using a private browsing mode such as Incognito in Chrome.
Secondly, if you choose to use SSML to play your audio, the audio becomes part of the speech response. This means you won't be able to create any custom interruptions or controls for the music. The user can only stop the music by stopping your action or saying "Okay Google" again to interrupt your response.
If you want to allow your users to control the music you send them, have a look at the media responses in the Actions on Google library, as in the sketch below.
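A rough sketch of a media response with the actions-on-google library (the track name and URLs are placeholders):

const { dialogflow, MediaObject, Image, Suggestions } = require('actions-on-google');
const app = dialogflow();

app.intent('Music', (conv) => {
  // A media response must be accompanied by a simple response,
  // and by suggestion chips if it is not the final response.
  conv.ask('Here is your track.');
  conv.ask(new MediaObject({
    name: 'Countdown loop',                   // placeholder title
    url: 'https://example.com/music.mp3',     // must be publicly accessible HTTPS
    description: 'A 32-second countdown loop',
    icon: new Image({
      url: 'https://example.com/icon.png',
      alt: 'Music icon',
    }),
  }));
  conv.ask(new Suggestions('Stop'));
});

The built-in media player then gives the user pause/stop controls, and you can listen for the actions_intent_MEDIA_STATUS event in a separate intent to continue your flow once playback finishes.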
I am using Dialogflow with fulfillment for dynamic responses, and the integration has been done with Hangouts. Text responses work fine, but when I use rich media like cards (Hangouts API), it does not work. Can you please let me know what I am missing, or how to use cards for Hangouts with a dialogflow-fulfillment agent?
(Screenshot: Stackdriver log)
Thanks and Regards,
Ramchandra-Sah GANESH
I wrote a blog post about this, since I've figured out some small quirks.
It's best to test this first within the Dialogflow console, by choosing a custom payload for Hangouts.
Note:
- The first key can't be called cards; it has to be named hangouts.
- This hangouts key points to an object, not an array (of cards).
Have a look at the blog post for further details (for example, on using this in webhook code): How to build chatbots for Hangouts with Dialogflow by using custom payloads and cards.
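For illustration, a sketch of what the webhook side could look like with the dialogflow-fulfillment library (the card contents are placeholders, and the exact Payload wrapper options are covered in the blog post):

const { Payload } = require('dialogflow-fulfillment');

function hangoutsCardHandler(agent) {
  // The top-level key must be "hangouts" (not "cards"), and it maps
  // to a single card object rather than an array of cards.
  const json = {
    hangouts: {
      header: {
        title: 'Example card',
        subtitle: 'Sent from fulfillment',
      },
      sections: [{
        widgets: [{
          textParagraph: { text: 'Hello from Dialogflow!' },
        }],
      }],
    },
  };
  agent.add(new Payload('hangouts', json, { rawPayload: true, sendAsMessage: true }));
}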
I have a use case where I need to receive an image as input from the user in a Dialogflow chatbot. This image then needs to be sent in the webhook request to a REST endpoint as part of fulfillment.
I couldn't find anything that resembles my requirement, and I'm not sure whether it is supported in Dialogflow or not.
Thanks in advance!
Dialogflow cannot take an image as input and map it to an intent directly. However, what you can do is process the image yourself and send some text to Dialogflow, which will trigger a specific intent and then proceed to perform the desired action.
You need a bridge between your user agent and Dialogflow, which handles every text/image request. It does all the pre-processing and then calls Dialogflow.
This way, you send messages to the messenger user using its APIs and call Dialogflow using its API/SDK as well, as shown in the sketch below.
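A rough sketch of such a bridge (the image-processing step, project ID, and the returned label are placeholders) using the Dialogflow Node.js SDK:

const dialogflow = require('dialogflow');  // Dialogflow v2 Node.js SDK
const sessionClient = new dialogflow.SessionsClient();

// Placeholder: run the image through your own vision/OCR pipeline
// and return a short text description of it.
async function describeImage(imageBuffer) {
  return 'receipt';
}

async function handleUserMessage(projectId, sessionId, text, imageBuffer) {
  // If the user sent an image, turn it into text before calling Dialogflow.
  const query = imageBuffer ? await describeImage(imageBuffer) : text;

  const sessionPath = sessionClient.sessionPath(projectId, sessionId);
  const [response] = await sessionClient.detectIntent({
    session: sessionPath,
    queryInput: {
      text: { text: query, languageCode: 'en-US' },
    },
  });
  return response.queryResult.fulfillmentText;
}

The returned fulfillment text can then be sent back to the user through the messenger's own send API.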
Hope it helps.
I'm back again with a question about NLP. I made my own back-end, which on one side can connect to websites, the Google Assistant, and Facebook Messenger, and on the other end to Dialogflow. On the side, it logs interactions and does some other database work.
Now, I'm trying to connect this back-end to Alexa. I made a project which calls my endpoint. This project has one intent with a parameter that should capture the raw user input and send it to my back-end, which processes it, parses the result, and sends the response back. It feels like there is no real way to collect and send the raw user input so that I can process it myself (in Dialogflow) instead of using the Amazon way of mapping intents and such.
I know Dialogflow can export to Alexa, but this is not an option for me. I really hope one of you can point me in the right direction.
I just need a way to collect the raw user input, and respond in an Alexa accepted response format.
For Actions on Google for example, I'm using a Custom Project Action Package.
Thanks a lot in advance!
To accept or capture arbitrary raw user input, you can use the @sys.any entity in Dialogflow (for the Google Assistant) and the AMAZON.SearchQuery slot type in Amazon Alexa.
In Alexa, you have to add a carrier phrase to the sample utterances that use AMAZON.SearchQuery, and you can't combine any other slot with AMAZON.SearchQuery in the same utterance.
So there are some limitations; the snippet below illustrates the carrier-phrase requirement. I hope this answer helps you.
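For illustration, a minimal interaction-model snippet (the intent name, slot name, and invocation name are made up), where "tell me" and "I want to say" act as the required carrier phrases:

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "my backend",
      "intents": [
        {
          "name": "RawInputIntent",
          "slots": [
            { "name": "query", "type": "AMAZON.SearchQuery" }
          ],
          "samples": [
            "tell me {query}",
            "I want to say {query}"
          ]
        }
      ]
    }
  }
}

Your skill endpoint then reads the raw utterance from the query slot value and can forward it to your back-end for processing.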
I'm editing the Cloud Functions for Firebase code on the Fulfillment page in Dialogflow. I'm trying to respond to an intent with audio file playback. Specifically, I'm targeting the Telephony integration.
I understand that a text message like
<speak><audio src="https://actions.google.com/sounds/v1/alarms/bugle_tune.ogg"></audio></speak>
should play the audio.
But what is the interface to send it back so it would work?
Just using agent.add() doesn't seem to work (it reads the SSML string out loud).
You cannot use the client API library to do this at this time; you'll need to craft the JSON response yourself. Please see my answer here, which should be helpful: Dialogflow Telephony integration is interpreting SSML response from webhook as normal text
Basically, do the same but use TelephonyPlayAudio instead of TelephonySynthesizeSpeech.
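As a rough sketch (the Cloud Storage path is a placeholder), the raw webhook response would look something like this, with the audio file uploaded to a Google Cloud Storage bucket that your Dialogflow agent can read:

{
  "fulfillmentMessages": [
    {
      "platform": "TELEPHONY",
      "telephonyPlayAudio": {
        "audioUri": "gs://my-bucket/bugle_tune.wav"
      }
    }
  ]
}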