I'm new to Actions on Google and am currently doing R&D. I've created an audio skill on Alexa and now want the same for Google Assistant. But I have a few questions:
1- Can we return audio in a response? My audio files are about 1 hour long, so can we play them in our action? Alexa has an audio player. Is there anything like that in Assistant?
2- I didn't find any SDK, but devs are talking about one, so there must be one. Kindly share the link.
Thanks in anticipation.
Update:
I believe the SDK is actions-on-google. I haven't explored it yet, but it's the SDK that I found for creating actions with Node.js.
Link: actions-on-google
Actions support SSML which provides the playback of audio files: https://developers.google.com/actions/reference/ssml#support_for_ssml_elements
At the moment there is a 120-second maximum duration for all supported audio formats, but if your audio is longer you can break it into segments and play them in sequence.
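To illustrate the "break it up and play in sequence" idea, here is a minimal sketch that works out how many clips a long recording needs under the 120-second limit and wraps a list of clip URLs in SSML `<audio>` elements. The URLs and helper names are hypothetical, and whether a single response accepts many clips back to back is not something the answer confirms, so you may have to spread the clips over several turns.

```python
from __future__ import annotations

from xml.sax.saxutils import quoteattr

MAX_CLIP_SECONDS = 120  # limit quoted in the answer above


def clip_count(total_seconds: int, max_clip: int = MAX_CLIP_SECONDS) -> int:
    """Number of clips needed so that no clip exceeds max_clip seconds."""
    return -(-total_seconds // max_clip)  # ceiling division


def build_ssml(clip_urls: list[str]) -> str:
    """Wrap each clip URL in an <audio> element inside one <speak> root."""
    audio_tags = "".join(f"<audio src={quoteattr(url)}/>" for url in clip_urls)
    return f"<speak>{audio_tags}</speak>"


# Hypothetical hosting location for the pre-split clips.
urls = [f"https://example.com/episode/part{i:02d}.ogg" for i in range(1, 4)]
ssml = build_ssml(urls)
```

For a one-hour file, `clip_count(3600)` gives 30 clips, which is why splitting alone may not be practical for very long audio.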
If you have your own NLU, you can use the Actions SDK. If you don't have your own NLU, then you can use API.AI to create an action.
A node.js client library is available for either of these options: https://github.com/actions-on-google/actions-on-google-nodejs
For any other developer questions, you should look at the actions documentation: https://developers.google.com/actions/develop/conversation
I’m wondering how I can create a music player for my Google Assistant compatible devices (e.g. Google Home Mini, my tablet, phone...). I’ve been researching how to do this, but I’ve only found things like using Dialogflow, Node.js and/or Actions on Google with Firebase Cloud Functions. I’m new to all this; I was motivated by Spotify, Pandora and other services like them. I also tried looking up how they do it, but I found nothing. If any of you know how to do it, please help me.
In addition to all that, I am just a tad bit confused about the whole Dialogflow and Actions on Google integration, but that’s easier to fix than the overall question.
If this isn’t “solvable”, is there a way to do it with Dialogflow fulfillment?
In order to create something like Spotify or Pandora, you need to partner with Google to create a media action. These are different than the conversational actions that you can create using Actions on Google and Dialogflow.
If you want to create a conversational action with Actions on Google and Dialogflow that produces long-form audio as part of the conversation, look into the Media response, which you can include in your replies.
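As a rough sketch of what a reply containing a Media response might look like, the function below builds a Dialogflow webhook payload as a plain dictionary. The field names are recalled from the legacy Actions on Google webhook format and should be treated as assumptions to check against the current documentation; the function name, the track title and the URL are hypothetical.

```python
import json


def build_media_reply(speech, name, content_url, description=""):
    """Build a webhook reply with a spoken intro followed by an audio track."""
    return {
        "payload": {
            "google": {
                "expectUserResponse": True,
                "richResponse": {
                    "items": [
                        # A simple response comes first, then the media item.
                        {"simpleResponse": {"textToSpeech": speech}},
                        {
                            "mediaResponse": {
                                "mediaType": "AUDIO",
                                "mediaObjects": [
                                    {
                                        "name": name,
                                        "description": description,
                                        "contentUrl": content_url,
                                    }
                                ],
                            }
                        },
                    ],
                    # Assumption: suggestion chips accompany a media
                    # response when the conversation stays open.
                    "suggestions": [{"title": "Stop"}],
                },
            }
        }
    }


reply = build_media_reply(
    "Here is your track.", "Episode 1", "https://example.com/episode1.mp3"
)
body = json.dumps(reply)  # string to send back from the webhook
```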
We are looking to build a Google Action that records small audio snippets (like a voice TODO list) which can be played back later.
Is there any documentation for this?
In short: no. Google does not provide access to the audio stream from the Assistant. You can, however, get the speech-to-text (STT) result processed by Google through the Actions on Google API.
I am making a game where I want to command the AI using words I speak.
For example, I say "go" and the AI bot moves a certain distance.
The problem is that I have been looking for an asset, and no provider will guarantee that this is possible.
What are the difficulties in doing this?
I am a programmer, so if someone suggests a way to handle it, I can implement it.
Should I keep a mic listener on all the time, read the audio, and then pass it to some external SDK that converts my voice to text?
These are the asset providers I have contacted:
https://www.assetstore.unity3d.com/en/#!/content/73036
https://www.assetstore.unity3d.com/en/#!/content/45168
https://www.assetstore.unity3d.com/en/#!/content/47520
and a few more!
If someone just explains the steps I need to follow then I can try it for sure.
I am currently using this external api for pretty much the same thing: https://api.ai/
It comes with a unity SDK that works quite well:
https://github.com/api-ai/api-ai-unity-sample#apiai-unity-plugin
You have to connect an audio source to the SDK and tell it to start listening. It will then convert your voice audio to text, and can even detect predefined intents from the audio or text.
You can find all steps on how to integrate the unity plugin in the api.ai Unity SDK documentation on github.
EDIT: It's free too btw :)
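Once a recognizer such as the one above hands back text, the game still has to turn that text into a bot action. Here is a minimal sketch of that last step using plain keyword matching; the command table and function name are hypothetical, it is not part of the api.ai SDK, and in Unity the same logic would be written in C#.

```python
# Hypothetical mapping from spoken keywords to game commands.
COMMANDS = {
    "go": "MOVE_FORWARD",
    "stop": "HALT",
    "left": "TURN_LEFT",
    "right": "TURN_RIGHT",
}


def parse_command(transcript):
    """Return the command for the first known keyword, or None if none match."""
    for word in transcript.lower().split():
        token = word.strip(".,!?")  # drop trailing punctuation from the STT text
        if token in COMMANDS:
            return COMMANDS[token]
    return None


command = parse_command("Go!")
```

An intent-detection service replaces the keyword table with trained intents, but the dispatch step in the game stays much the same.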
If you want to recognize speech offline without sending data to a server, try this plugin:
https://github.com/dimixar/unity3DPocketSphinx-android-lib
It uses the open source speech recognition engine CMUSphinx.
From the docs it seems like SpeechResponse is the only documented type of response you can return:
https://developers.google.com/actions/reference/conversation#SpeechResponse
Is it possible to load an image or some other type of media in the Assistant conversation via API.AI or the Actions SDK? It seems this is supported by api.ai for Facebook and other messengers:
https://docs.api.ai/docs/rich-messages#image
Thanks!
As of today, the Google Actions SDK supports Conversation Actions, which focus on building a better voice UI and are integrated with Google Home.
The API.AI integration with Google Actions can be checked out here; it currently shows no support for images in the response.
When they provide integrations with Google Allo, then in the messaging interface, they might start supporting images, videos etc.
That feature seems to be present now. You can look it up in the docs at https://developers.google.com/actions/assistant/responses
Note: Images are supported only on devices with visual output, so Google Home cannot show them. Devices with a screen do support a card with an image.
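A minimal sketch of how a webhook might act on this: include an image card only when the requesting device reports a screen. The capability name `actions.capability.SCREEN_OUTPUT` and the `basicCard` field names are recalled from the Actions on Google response format and should be verified against the docs linked above; the function name, card title and image URL are hypothetical.

```python
SCREEN_OUTPUT = "actions.capability.SCREEN_OUTPUT"


def build_items(speech, image_url, alt_text, capabilities):
    """Return response items, adding an image card only for screen devices."""
    items = [{"simpleResponse": {"textToSpeech": speech}}]
    if SCREEN_OUTPUT in capabilities:
        items.append(
            {
                "basicCard": {
                    "title": "Picture",
                    "image": {"url": image_url, "accessibilityText": alt_text},
                }
            }
        )
    return items


# A voice-only speaker gets speech only; a phone also gets the card.
speaker_items = build_items(
    "Here it is.", "https://example.com/cat.png", "A cat", []
)
phone_items = build_items(
    "Here it is.", "https://example.com/cat.png", "A cat", [SCREEN_OUTPUT]
)
```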
Pro Tip: Yes you can
What you want to do is represent your image or video as a URL within API.AI and render that URL as an image or video within your app.
see working example
I'm using the Spotify API 11.1.60 for iOS and have successfully created an app that can download and play songs from Spotify, download and show the covers, and execute searches. But I can't find a way to crossfade songs. I can only play one song at a time.
Does anybody know how to crossfade songs?
I have not implemented crossfade myself, but I think the only solution is to buffer your current song so that you can mix its end with the start of the next one.
You have to update your music_delivery callback and the code you use to play the sound. The solution depends on the sound engine you're using (OpenAL, CoreAudio, ...).
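The mixing step itself is independent of the sound engine. As a minimal sketch, assuming you already have the last samples of the current song and the first samples of the next one buffered as equal-length lists of floats, a linear crossfade looks like this (the function name is hypothetical, and a real player would do this per channel inside the audio callback):

```python
def crossfade(tail, head):
    """Linearly fade `tail` out while fading `head` in, sample by sample."""
    if len(tail) != len(head):
        raise ValueError("buffers must have the same length")
    n = len(tail)
    mixed = []
    for i in range(n):
        # t runs from 0.0 (only the old song) to 1.0 (only the new song).
        t = i / (n - 1) if n > 1 else 1.0
        mixed.append(tail[i] * (1.0 - t) + head[i] * t)
    return mixed


faded = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

A linear ramp is the simplest choice; other fade curves are possible, and the buffer length sets the fade duration.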