I'm trying to develop a skill for Amazon Alexa whereby the application leads the user into a new state.
"User input" -> "Speak" -> "Ask Question" -> "User input" .... etc
This is the most obvious way of going about it; however, it means I have to rather bluntly mash "speak" and "ask question" together.
Is there another way to chain events for Amazon Alexa? Say, for example, emit some speech and then go to another handler? (I know that I can emit("handlerName") and switch to another handler, but I can't do that AND make Alexa speak before the switch happens.)
The best way to "chain events" and maintain state is to use the session object in the Alexa API's request and response structures. Store a variable in the session attributes indicating the current step in your flow.
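For example, here is a minimal sketch with the Node.js alexa-sdk (v1 style, matching the emit() calls in the question); the intent and step names are placeholders. The ':ask' response speaks and asks the question in one turn, and this.attributes is carried in the session between turns:

    const Alexa = require('alexa-sdk');

    const handlers = {
      'FirstStepIntent': function () {
        // Remember where we are in the flow via session attributes.
        this.attributes.step = 'ASKED_COLOR';
        // ':ask' speaks AND asks the question in a single response, keeping the session open.
        this.emit(':ask', 'Welcome. What is your favourite colour?', 'What is your favourite colour?');
      },
      'AnswerIntent': function () {
        if (this.attributes.step === 'ASKED_COLOR') {
          this.attributes.step = 'ASKED_SIZE';
          this.emit(':ask', 'Nice. And what size would you like?', 'What size would you like?');
        } else if (this.attributes.step === 'ASKED_SIZE') {
          this.emit(':tell', 'Great, you are all set.'); // ':tell' ends the session
        }
      },
    };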
We are implementing our custom chatbot using Dialogflow. When the user enters any text, our JavaScript code sends it to our Python server, and the server interacts with Google Dialogflow and gets the complete response. I just have a couple of questions, as below.
When the server gets the response from Dialogflow, it will process it and send some response to the UI. Do we still need to have fulfillment enabled, given that our server is getting the response? Basically, if the server is interacting with Dialogflow and getting the response, what is the use of the webhook?
Is there any way to enforce that Dialogflow intents require at least one entity? I went through "Can I make Dialogflow intents require atleast one of the trained entities?", which says to enable webhook fulfillment for that intent and, if no entities were provided, re-prompt the user for at least one of a list of entities. So in my case, if the webhook is not needed, do I need to do this in the server once it receives the response, or is there any way Dialogflow will automatically enforce the condition without the server taking the responsibility?
In your case, no, you don't need to use webhook fulfillment.
You may still wish to use it, however, if you want to separate business logic (which would be in the webhook) from UI/UX logic (which would be in your Python server and in the JavaScript client). But there is no requirement that you separate things this way.
Similarly, you can use your Python code to enforce that "at least one of" the parameters matched; you're simply moving that logic from the webhook into your existing server.
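As a rough sketch of that server-side check (shown in Node.js here; the same idea translates directly to your Python server), assuming the result comes from a detectIntent call and that "service" and "location" are the entities you want at least one of; reply() is a hypothetical helper that sends text back to your UI:

    // queryResult is response.queryResult from a detectIntent call.
    function enforceAtLeastOne(queryResult, reply) {
      const fields = (queryResult.parameters && queryResult.parameters.fields) || {};
      const hasService = fields.service && fields.service.stringValue;
      const hasLocation = fields.location && fields.location.stringValue;

      if (!hasService && !hasLocation) {
        // Neither expected entity was matched: re-prompt from your own server.
        reply('Which service or location did you mean?');
      } else {
        reply(queryResult.fulfillmentText); // or build your own response from the parameters
      }
    }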
Either way, this is a bit kludgy. One alternative if you have different entity types is to have multiple Intents, one for each possible type, and to mark the parameter as required. This way the Intent will only match if the parameter is provided. If you then need to report each of these Intents as the "same" Intent, you can add that logic to your python code.
I am trying to wrap my head around using Dialogflow for developing and integrating an SMS chatbot with our custom CRM. The creation of an intent is pretty powerful and straightforward. However, I am trying to understand best practices for something. If I have an intent used to return the price of a service at a certain location, I can model that very easily within Dialogflow. However, when an SMS message comes in, it will be from a new customer or a known existing customer for a certain location. For existing customers, we already know the location and therefore don't want them to have to specify the location value in the intent. Prior to sending the inbound SMS message to the client API to match the intent, how can I pre-set the "location" parameter value in the intent so it exists even if the inbound SMS message did not include it? For example, a known customer in Dallas would just have to say "how much is a xxx" instead of "how much is a xxx in Dallas".
Can you use the API to set a parameter value prior to calling the API to try to match the intent? If so, how do you do that without a session ID? The reason the "location" is needed is that when we get to fulfillment, the prices for the same service differ by location, so it will need to be known, but we don't want to make existing customers say the location.
Maybe another option is to have a Location intent with an event that we can trigger through the API. This would have an output context on it called location and fulfillment that sets the parameter value. But even then I struggle with understanding how to pass values like location, phone number, etc. into Dialogflow from the calling application so Dialogflow has those parameter values to use in fulfillment.
I have been reading documentation, watching videos, and starting to test the client API v2.
This is certainly possible. What you would want to do is use the Dialogflow API for this. Here you can find the languages for which Google has created client libraries: https://cloud.google.com/dialogflow/docs/reference/libraries/overview
As soon as you have any 'if' in your code, you should use fulfillment: https://dialogflow.com/docs/fulfillment
How I would handle this:
Client sends SMS
You check in your back-end whether this user is known. If known -> don't ask for the location, else ask for the location
Match the user query against the Dialogflow client library
Dialogflow will return the matched intent (if any)
You should define and implement any logic before calling the Dialogflow library.
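To illustrate the flow above, here is a hedged sketch in Node.js (the client-library call is the same idea in any language): before matching the SMS text, a known customer's location is injected as an input context, so the intent's "location" parameter can be pre-filled (for example by giving the parameter a default value such as #customer-data.location in the console). The project ID, session ID, and the context/parameter names are assumptions for illustration.

    const dialogflow = require('dialogflow');

    async function matchSms(projectId, sessionId, smsText, knownLocation) {
      const sessionClient = new dialogflow.SessionsClient();
      const sessionPath = sessionClient.sessionPath(projectId, sessionId);

      const request = {
        session: sessionPath,
        queryInput: {
          text: { text: smsText, languageCode: 'en-US' },
        },
      };

      if (knownLocation) {
        // Known customer: pass the location in as an input context so they
        // don't have to say it. `parameters` is a protobuf Struct; depending on
        // the library version you may prefer a helper like pb-util's struct.encode().
        request.queryParams = {
          contexts: [{
            name: `${sessionPath}/contexts/customer-data`,
            lifespanCount: 5,
            parameters: { fields: { location: { stringValue: knownLocation } } },
          }],
        };
      }

      const [response] = await sessionClient.detectIntent(request);
      return response.queryResult; // matched intent, parameters, fulfillment text, ...
    }

A reasonable session ID here is something stable per customer, such as a hash of their phone number, so that follow-up SMS messages land in the same Dialogflow session.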
I am making a mobile app for a college project that will feature some games. I was thinking about how I could make it better, and I thought of using my Amazon Echo that has been collecting dust since I bought it :D
I had an idea of saying something like "Alexa, show me only FPS games", and in my app I grab that input and filter the list to show only FPS games. But the question is, how do I grab Alexa's input? What's the simplest way, and is it even possible?
I had an idea that maybe I can grab Alexa's input in the form of JSON and then program against it accordingly, but is that possible?
I have never programmed Alexa skills, so I have no clue where to start with this; any directions would be pretty helpful! Also, keep in mind that I am a student who doesn't have much programming experience, but I am willing to do the research.
Thanks a lot, cheers!
Traditional Alexa skills use Lambda functions that respond to events from the Alexa Skills Kit. The flow of events looks like the following:
Echo device -> Alexa -> AWS Lambda -> Alexa -> Echo Device
Lambda functions are not Alexa-only components, though, meaning you can program them to do whatever you want. Want to record metrics to a database before Alexa responds? Don't want Alexa to respond at all? That's entirely up to you.
For the use case you described, you could write a Lambda function that filters a list of games by the spoken keyword, pushes that list to a client, and then ends the Alexa conversation: a "one-way" Alexa skill.
Echo device -> Alexa -> AWS Lambda
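A minimal sketch of such a "one-way" handler (a plain Lambda handler, no SDK), assuming a hypothetical FilterGamesIntent with a "genre" slot; pushToApp() stands in for however you deliver the filtered list to your app (your own backend, a queue, a push service):

    const GAMES = [
      { title: 'Space Shooter', genre: 'FPS' },
      { title: 'Farm Puzzle', genre: 'puzzle' },
    ];

    async function pushToApp(games) {
      // Placeholder: deliver `games` to the mobile app through your own backend.
    }

    exports.handler = async (event) => {
      const intent = (event.request && event.request.intent) || {};
      const slots = intent.slots || {};
      const genre = (slots.genre && slots.genre.value) || '';

      const matches = GAMES.filter((g) => g.genre.toLowerCase() === genre.toLowerCase());
      await pushToApp(matches);

      return {
        version: '1.0',
        response: {
          outputSpeech: { type: 'PlainText', text: `Showing ${matches.length} ${genre} games.` },
          shouldEndSession: true, // end the conversation right away: a "one-way" skill
        },
      };
    };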
That said, you don't really need to use Alexa for this. There are plenty of other speech-to-text services that can accomplish it (Amazon Transcribe, Watson Speech to Text, Google Speech Recognition). Additionally, you could probably plug those in without writing any server-side code, which is a plus.
We have a framework that implements chatbot / voice assistant logic for handling complex conversations in the health domain. Everything is implemented on our server side. This gives us full control of how responses are generated.
The channel (such as Alexa or Facebook Messenger cloud) calls our webhook:
When the user sends a message, the platform sends the following to our webhook: a hashed user ID and the message text (a chat message or transcribed voice).
Our webhook responds with an appropriately structured response, which includes text to be displayed or spoken, possibly choice buttons, some images, etc. It also includes a flag indicating whether the current session has finished or user input is expected.
Integrating a new channel involves converting the returned response into the form expected by that channel and setting some flags (has voice, has display, etc.).
This simple framework has worked so far for Facebook Messenger, Cortana, Alexa (a little bit of hacking was needed to abandon its intent and slot recognition), and our web chatbot.
We wanted to write a thin layer of support for a Google Assistant action.
Is there any way of passing all the input from the Assistant user intact into a webhook such as the one described above, and taking full control of how responses are generated and how the end of the conversation is determined?
I'd rather not delve into API.AI's cumbersome way of structuring a conversation, which seems fine for trivial scenarios such as ordering an Uber but very bad for longer conversations.
Since you already have a Natural Language Understanding layer for your system, you don't need API.AI/Dialogflow, and you can skip this layer completely. (The NLU is useful, even for large and extensive conversations, but doesn't make sense in your case where you've already defined the conversation through other means.)
You'll need to use the Actions SDK (sometimes known as actions.json after the configuration file it uses) to define triggering phrases, but after that you'll get all the text that the user says as part of your conversation through a webhook that delivers JSON to you. You'll reply with JSON that contains the text/audio response, images on cards, possibly suggestion chips, etc.
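As a hedged sketch, one such thin layer could look like this with the actions-on-google Node.js client library (Actions SDK mode, no Dialogflow); myFramework stands in for your existing server-side conversation engine and is an assumption:

    const { actionssdk } = require('actions-on-google');

    // Hypothetical stand-in for your existing framework.
    const myFramework = {
      async handle(conversationId, text) {
        return { text: `You said: ${text}`, sessionFinished: false };
      },
    };

    const app = actionssdk();

    app.intent('actions.intent.MAIN', (conv) => {
      conv.ask('Hi, how can I help?'); // opening prompt; replace with your framework's greeting
    });

    app.intent('actions.intent.TEXT', async (conv, input) => {
      // `input` is the raw text of the user's utterance, unparsed.
      const reply = await myFramework.handle(conv.id, input); // conv.id is the conversation ID
      if (reply.sessionFinished) {
        conv.close(reply.text); // end the conversation
      } else {
        conv.ask(reply.text); // keep the microphone open
      }
    });

    // Expose `app` as your webhook, e.g. as an Express route or a Cloud Function.

Triggering phrases still have to be declared in the Actions SDK action package, but everything after the invocation flows through your webhook as raw text.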
I am having some problems with my Alexa skill. I would like the dialogue to go like this:
User: 'Alexa, open party'
Alexa: 'Hello, what is your four digit secret pin?'
User: '1234'
Alexa: 'Confirmed, what can I help you with?'
But I am confused about how to structure this. I need to take the user's pin and verify it in my codebase. I know you can't get dialog delegation to work inside the LaunchRequest. The LaunchRequest cannot be customized, so I cannot add slots to it. I can't find any other suggestions or examples on the internet. Has anyone done this before, or are there any suggestions?
Amazon supports account linking as the method to connect users with their other accounts. This allows users to log into their other account using OAuth at the time the skill is installed. While it may be possible to determine a user based on the session object userid, it may be difficult to get such a skill published.
It turns out that you cannot delegate slot collection to Alexa within the LaunchRequest, because delegation is not a valid response type for a LaunchRequest.
My initial logic was:
1. User says 'Alexa, open party'.
2. The Alexa skill calls the LaunchRequest. (At this point I need to ask the user for their pin by delegating slot collection to Alexa.)
3. In the LaunchRequest, immediately respond with this.emit(':getPinIntent'), where getPinIntent is another intent in my Alexa skill. This is what I saw on the internet for how to call another intent without the user having to invoke it by voice.
4. getPinIntent gets called and immediately checks whether all the required slots are filled (i.e., whether the PIN slot has a value). If they are not and dialogState !== 'COMPLETED', then I delegate the slot collection to Alexa.
5. The above step (#4) is where things go wrong. Because delegation is not a valid response type for LaunchRequests, there is no dialogState field, which is required for delegation to Alexa. The request is still a LaunchRequest instead of an IntentRequest because the user did not invoke the intent by saying something to Alexa.
In conclusion, this is not a valid way of completing a dialog where, upon launch, the user is asked for a pin and can then reply by saying only that pin, visualized below:
User: "Alexa, open party"
Alexa: "What is your pin" (alexa never gets here, because of #4 and #5 above)
User: "one two three four"
Alexa: "Confirmed, what can I help you with?"
If I have made any mistakes or wrong assumptions, please let me know.
My current logic has now changed. If you do not use the Skill Builder Beta, you can have a slot by itself as an utterance for one of your intents. So I now have getPinIntent with a slot called {PIN} and an utterance of the form {PIN}. This lets the above type of conversation happen, because when the user says their pin back ("one two three four"), it triggers getPinIntent, where I can then continue OR delegate the dialog to Alexa, because dialog delegation is a valid response type for an IntentRequest.
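A minimal sketch of that flow with the Node.js alexa-sdk (v1, matching the this.emit() style used here); verifyPin() is a hypothetical lookup against your own codebase:

    const handlers = {
      'LaunchRequest': function () {
        // No delegation needed here: just ask the question and keep the session open.
        this.emit(':ask', 'Hello, what is your four digit secret pin?',
                          'Please say your four digit pin.');
      },
      'getPinIntent': function () {
        const pin = this.event.request.intent.slots.PIN.value;
        if (pin && verifyPin(pin)) {
          this.attributes.authenticated = true; // remember it for later intents
          this.emit(':ask', 'Confirmed, what can I help you with?', 'What can I help you with?');
        } else {
          this.emit(':ask', 'That pin did not match. What is your four digit pin?',
                            'Please say your four digit pin.');
        }
      },
    };

    function verifyPin(pin) {
      // Placeholder: check the pin against your own backend.
      return pin === '1234';
    }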
The only problem I have now is that, because I am not using the Skill Builder Beta, I cannot (or have not found a way to) add dialog models to my Intent Schema. I have tried copying the JSON text from the Skill Builder Beta into my Intent Schema after adding the correct dialog model, but this always results in build errors.
So now I can complete the user's pin authentication and respond with "How can I help?", but the IntentRequest that comes after that may require delegation to Alexa for slots, and this would cause a crash, because without the Skill Builder Beta I am unable to add the appropriate dialog models for Alexa to use during delegated slot collection.