How do I add AMAZON.HelpIntent in audio player - node.js

When I ask for help, Alexa invokes the built-in help instead of my custom help handler. If the audio player is not playing, e.g. on the launch page, my custom help is invoked, but not while the audio player is active. How can I override that?
Thank you.

Per the AudioPlayer documentation:
When sending a Play directive, you normally set the shouldEndSession flag in the response object to true to end the session.
So once the user has invoked the Play directive, they are no longer interacting with your skill. The user can affect the playback of content from your skill using the built-in playback control intents, but any other interaction with your skill requires the normal invocation phrase - e.g. "Alexa, ask [SkillName] for help".
What about setting shouldEndSession to false?
This has the effect of expecting more user input. While this would allow the user to ask for help (or otherwise interact with your skill) immediately after starting the audio playback, it would also pause the audio playback to listen for that input.
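For illustration, here is a minimal sketch of sending such a Play directive with the ask-sdk-core library; the intent name, stream URL, and token are placeholders, not anything prescribed by the SDK:

const PlayAudioIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
      && handlerInput.requestEnvelope.request.intent.name === 'PlayAudioIntent';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Starting playback.')
      .addAudioPlayerPlayDirective(
        'REPLACE_ALL',                    // playBehavior: replace the current queue
        'https://example.com/audio.mp3',  // placeholder stream URL (must be https)
        'example-token',                  // placeholder token identifying the stream
        0                                 // offset in milliseconds
      )
      .withShouldEndSession(true)         // the session ends; the skill stops listening
      .getResponse();
  }
};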

You can't.
In an AudioPlayer skill, once the skill starts playing audio there is no active session to manage; you can only respond with AudioPlayer directives (Play, Stop, ClearQueue) and handle the built-in playback control intents (pause, next, and so on), which are described in the AudioPlayer documentation.
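For completeness, a custom AMAZON.HelpIntent handler still works while a session is open (e.g. right after launch); a minimal sketch with ask-sdk-core:

const HelpIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
      && handlerInput.requestEnvelope.request.intent.name === 'AMAZON.HelpIntent';
  },
  handle(handlerInput) {
    // This is only reached while a session is open. During AudioPlayer
    // playback the user must re-open the skill first:
    // "Alexa, ask <SkillName> for help".
    return handlerInput.responseBuilder
      .speak('You can say play to start the stream, or stop to end it.')
      .reprompt('What would you like to do?')
      .getResponse();
  }
};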

Related

Add music to Dialogflow

I am trying to add music to my Dialogflow agent. I don't want to add it from the Dialogflow console; I want to add it from the webhook. Can you please tell me how to add music from the webhook? I am trying this code, but it's not working:
app.intent('Music', (conv) => {
  // Note: the closing tag must be </speak>, and the response must be sent
  // with conv.ask() (or conv.close()); otherwise nothing is returned at all.
  const speech = '<speak><audio src="soundbank://soundlibrary/ui/gameshow/amzn_ui_sfx_gameshow_countdown_loop_32s_full_01"/>Did not get the audio file</speak>';
  conv.ask(speech);
});
Also I want to use one interrupt keyword which will stop this music, is there any predefined way or if user defined, how to interrupt the music and proceed with my other code?
Firstly, to be able to add music, it needs to be hosted on a publicly accessible https endpoint; see the docs. So make sure you can access your file even when using a private browsing mode such as incognito in Chrome.
Secondly, if you choose to use SSML to play your audio, the audio becomes part of the speech response. By doing this, you won't be able to create any custom interruptions or controls over the music. The user can only stop the music by stopping your action or saying "Okay Google" again to interrupt your response.
If you want to allow your users to control the music you send to them, have a look at the media responses in the actions-on-google library.
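As a rough sketch of that approach (the track name and URL below are placeholders; the mp3 must be publicly reachable over https):

const { dialogflow, MediaObject, Suggestions } = require('actions-on-google');
const app = dialogflow();

app.intent('Music', (conv) => {
  // A media response must be preceded by a simple spoken response.
  conv.ask('Here is the music.');
  conv.ask(new MediaObject({
    name: 'Countdown music',                  // placeholder track name
    url: 'https://example.com/countdown.mp3', // placeholder public https URL
  }));
  // Suggestion chips are required on devices with screens so the
  // conversation can continue after the media response.
  conv.ask(new Suggestions('Stop'));
});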

Is it possible to wait until the ask method returns a response

Using actions-on-google, when handling an intent and then using conv.ask() to send a response to the agent, is it possible to wait until the request has been successfully sent and then continue doing something else? Is there a way to await the response of the ask method?
My idea is to tell the agent to say something, then manually time playing a sound (mp3) after the ask response has been successfully sent to the agent. Right now it takes a bit of time for the agent to receive the request, say the thing, and then play the sound. The request gets sent but not received instantly, so the sound that I am playing starts well before the agent has said anything.
Is that something possible?
Update
Right now I'm using SSML to make two different voices speak in one intent. The idea is that we have two "personalities" talking, and each personality has a different voice; currently I use some SSML attributes to do that. Let's call them P1 and P2. P1 starts by saying something, and as soon as it ends, the sound of a blender is played. Right after the sound plays, the second personality P2 starts talking, and then P1 "replies" to it, all in one intent response. That's the idea I'm trying to implement.
If you want to play audio immediately after saying something, it sounds more like you want to use a Media response as part of what you're sending back. Your mp3 file must be available at an HTTPS address, although that address can be anything you want as long as the device can resolve it. Since it will be on the same server the webhook is running on, and the webhook has to have a public HTTPS URL, presumably the audio will (or can) as well.
If your interest is in knowing that latency, you can probably time the difference between when you send the response and when the device requests the mp3 file.
There is no direct way to know when the Assistant has finished saying the text, but you can use tricks with the Media response to get some idea depending on your needs.
Update based on your use case.
If you're doing it all as one response, and it fits in that response, and your audio is just a few seconds long, then you can do it using SSML as a single response. That part seems fine.
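As a rough sketch of that single-response approach (the prosody attributes here merely stand in for whatever attributes you use to differentiate the two personalities, and the blender audio URL is a placeholder):

<speak>
  <prosody pitch="low">Hand me the blender.</prosody>
  <audio src="https://example.com/blender.mp3"/>
  <prosody pitch="high" rate="fast">Hey, I was using that!</prosody>
  <prosody pitch="low">Not anymore.</prosody>
</speak>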
If the audio is longer, or you want more of a back and forth between your personalities, then you can use the Media response to play the audio (even a very short, empty audio file). At the end of the audio playing, it sends an event to the Action, and you can then continue to the next step in your personalities' exchange.
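A sketch of that continuation, assuming a Dialogflow intent (here called 'media.status') that is triggered by the actions_intent_MEDIA_STATUS event fired when playback finishes, reusing the app object from the earlier sketch:

app.intent('media.status', (conv) => {
  // The MEDIA_STATUS argument reports why the media response ended.
  const mediaStatus = conv.arguments.get('MEDIA_STATUS');
  if (mediaStatus && mediaStatus.status === 'FINISHED') {
    // Playback finished: the next personality takes its turn.
    conv.ask('<speak><prosody pitch="high">Well, that was loud.</prosody></speak>');
  } else {
    conv.ask('Something stopped the audio.');
  }
});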

Alexa custom skill - handler to internally invoke other skills

I am trying to develop a custom skill which would perform the below operation:
Alexa, launch Michael Jackson app
Then I would provide options for the user to select from:
Alexa, play music on Spotify (and I need to internally pass the value of the artist (MJ))
Alexa, play music on Pandora (and I need to internally pass the value of the artist (MJ))
Alexa, play music on podcast (and I need to internally pass the value of the artist (MJ))
The user can specify MJ on Spotify, iMusic, Pandora, etc.
Is this doable?
You cannot invoke Alexa again with "Alexa, play music on Spotify" while a session is going on. There is a custom solution, but it works only if the other services (like Spotify) expose a REST API you can use. If they have a REST API, then after opening your skill ("Alexa, launch Michael Jackson app") you can give the user options like the below:
say 1 to play music on Spotify
say 2 to play music on Pandora
say 3 to play music on podcast
Once the user responds with a number (1, 2, 3, etc.), you can take another input from the user for the artist name, then call the corresponding API according to the user's input.
Please note that all of this logic is possible only if the other party exposes a REST API.
Yes, this can be done in several ways. One would require that your app respond to the launch request, and also to three intents:
"Alexa, open Michael Jackson app" would launch your app. It should respond to the launch request with something like "where would you like me to play Michael Jackson? You can say spotify, pandora, or podcast"
SpotifyIntent: "play music on Spotify" or even just "Spotify"
PandoraIntent: "play music on Pandora" or even just "Pandora"
PodcastIntent: "play music on podcast" or even just "podcast".
Your intent handlers would then need to make the REST calls to the selected service.
This could also be done using slots, but I think the above is about the simplest way to accomplish what you describe.
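A sketch of one such intent handler with ask-sdk-core, assuming a hypothetical REST endpoint for the target service (the URL, intent name, and helper below are illustrative only; the other services would follow the same pattern):

const https = require('https');

// Tiny helper: issue a GET request and resolve once the response ends.
function callRestApi(url) {
  return new Promise((resolve, reject) => {
    https.get(url, (res) => {
      res.resume();          // drain the body; we only care about completion
      res.on('end', resolve);
    }).on('error', reject);
  });
}

const SpotifyIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
      && handlerInput.requestEnvelope.request.intent.name === 'SpotifyIntent';
  },
  async handle(handlerInput) {
    const artist = 'Michael Jackson'; // could instead come from a slot or session attribute
    // Hypothetical third-party endpoint; substitute the service's real API.
    await callRestApi('https://api.example.com/play?artist=' + encodeURIComponent(artist));
    return handlerInput.responseBuilder
      .speak('Asking Spotify to play ' + artist + '.')
      .getResponse();
  }
};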

Twilio - Detect when user starts talking

I've integrated Twilio into a NodeJS application, and I want to know if there is any way to detect, using TwiML, when a user starts talking in the middle of a playback or while the synthesizer is talking. Like when you call an IVR and the IVR is talking to you, and you already know what to say, so instead of waiting for it to finish you choose the option right away.
Currently, Twilio does not support speech or voice recognition.
Source: https://www.twilio.com/help/faq/voice/does-twilio-support-speech-recognition
If "choose the option right away" means a key press, then you don't have to wait for the playback to finish.
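A sketch with the twilio Node helper library: nesting <Say> (or <Play>) inside <Gather> lets the caller press a key mid-prompt, which cuts the playback off and posts the digit to the action URL (a placeholder here):

const VoiceResponse = require('twilio').twiml.VoiceResponse;

const response = new VoiceResponse();
const gather = response.gather({
  numDigits: 1,             // stop after a single digit
  action: '/handle-choice', // placeholder endpoint that receives the digit
});
gather.say('For sales, press 1. For support, press 2.');

console.log(response.toString()); // the TwiML to return to Twilio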

Play audio file in Twilio conference

I'm looking for an example of how to play an audio file during a twilio conference session. I'd like to play the audio file into the conference at the push of a button by one of the participants.
Any ideas?
Here's an article on how to do a roll call during a conference that will give you an idea of how to manage the workflow (you obviously just want the user to hear the played message, and don't need to parse the calls):
https://www.twilio.com/blog/2014/09/roll-call-roger-stringer-shows-you-how-to-take-a-headcount-during-a-twilio-conference-call.html
Basically, you have the user exit the conference on the keypress, have the message played, then have the user rejoin the conference. If you want it to play for all users, it would disrupt the audio of the conference; you would need the record feature for that.
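A sketch of that workflow with the twilio Node helper library, assuming the keypress has already been captured; the account credentials, call SID, conference name, and both URLs are placeholders:

const twilio = require('twilio');
const client = twilio('ACCOUNT_SID', 'AUTH_TOKEN');

// 1. Redirect the participant's live call to TwiML that plays the message.
client.calls('CALL_SID').update({
  url: 'https://example.com/announce', // placeholder: returns the TwiML below
  method: 'POST',
});

// 2. The /announce endpoint plays the file, then re-joins the conference.
const VoiceResponse = twilio.twiml.VoiceResponse;
const response = new VoiceResponse();
response.play('https://example.com/announcement.mp3'); // placeholder audio URL
response.dial().conference('MyConference');            // placeholder conference name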
