Google Dialogflow is detecting wrong intent - dialogflow-es

We are implementing a custom chatbot using Google Dialogflow. I created some intents in Dialogflow, as shown below.
Intent : DVR
Training Phrases :
Unable to play DVR
Unable to play Recording
Not able to list Recordings
Intent : listings
Training Phrases :
Where are my TV listings
Intent : movies
Training Phrases :
Unable to play Movies
Unable to list movies
Intent : new movies
Training Phrases :
New movies are not getting listed
Very few new movies are getting listed
How can I buy new movies on mobile app?
New Movies not displayed
Seeing blank entries on New Movies
Seeing blank titles on New Movies
The order of the intents in Dialogflow is as above: DVR, listings, movies, new movies.
I created the entities below.
DVR -> DVR, Recordings, Record
Listings -> listings
Movies -> movies
New_Movies -> new movies
When I try the texts below, they all work fine.
unable to play recording -> detected intent : DVR, Entity : DVR
unable to play DVR -> detected intent : DVR, Entity : DVR
Unable to play movies -> detected intent : movies, Entity : Movies
First issue: when I try 'unable to play new movies', the detected intent is 'new movies' (with "intentDetectionConfidence": 0.81950855, entity: New_Movies). I am not sure why this matches, since I did not put any training phrase like 'unable to play' under 'new movies'.
Unable to play New Movies -> detected intent : new movies, Entity : New_Movies, Confidence : 0.81950855
Second issue: if I try 'unable to play DVR New Movies', neither the 'DVR' intent nor the 'DVR' entity is detected. Instead, the intent is detected as 'new movies' and the entity as 'New_Movies', as shown below.
Unable to play DVR New Movies -> detected intent : new movies, Entity : New_Movies, Confidence : 0.7595941
Unable to play DVR 'New Movies' -> detected intent : new movies, Entity : New_Movies, Confidence : 0.7595941
I am not sure why Dialogflow behaves this way. Can anyone please tell me how to resolve these issues (why the DVR intent is not detected, and why the DVR entity is not detected)?
One more issue I found just now: there is one more intent I created in Dialogflow, as below.
Intent : Remove Device
Training Phrases :
Having issues with device removal
Unable to remove the device from registration
Unable to remove the device
Issue with Registration removal
Now I tried the text "unable to play". It should detect the intent 'DVR', but it detected 'Remove Device', which has no training phrases related to play.

There are two types of matching algorithms used to assign the phrases to specific intents (1). I believe none of these algorithms can work as desired in your specific case:
-The grammar of the training phrases given for different intents is quite similar, which can confuse the model:
“Unable to play DVR, unable to play Movies, unable to remove the device”
To solve this issue, you should provide a wide range of different structures, such as “being unable to, can’t play, not allowed to, is not available”.
-As explained in (1), accurate ML matching needs a large number of training phrase examples. Otherwise the model will not be fine-tuned for your specific use cases and will confuse phrases that, although they may look simple to a human, are somewhat similar:
“list recordings, TV listings, list movies, new movies are getting listed”
Computers understand language differently than humans do. Models are normally trained for general objectives, so the more training examples you provide, the better they adapt to specific use cases.
Finally, there is another concern regarding the Dialogflow agent you described. Given that the intents are rather similar, several intents might get almost identical confidence scores for some phrases. When this happens, the highest score prevails (for example, 0.76 over 0.74). This can be partially solved by configuring the intent priority (2).
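As a rough illustration of the near-tie problem described above, the sketch below ranks per-intent scores and flags cases where the top two are closer than a chosen margin. The scores are made up, and `rank_intents` and its `margin` parameter are hypothetical names used only for this example; it does not call Dialogflow.

```python
def rank_intents(scores, margin=0.05):
    """Return (winner, runner_up, is_near_tie) for a dict of intent -> score.

    `margin` is an arbitrary illustrative value, not a Dialogflow setting.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (winner, top), (runner_up, second) = ranked[0], ranked[1]
    return winner, runner_up, (top - second) < margin


# Hypothetical scores for "unable to play DVR new movies".
scores = {"new movies": 0.76, "DVR": 0.74, "movies": 0.61}
winner, runner_up, near_tie = rank_intents(scores)
print(winner, runner_up, near_tie)
```

A near tie like this suggests the two intents need more distinct training phrases, or that intent priority should be used to break the tie deliberately.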
Regarding the three issues you described above:
-1st: Even if you did not put “unable to” in this intent, it is the only intent which has the entity “new movies”. Nevertheless, bear in mind that the entity “new movies” contains the entity “movies” and these two intents will get confused in many cases.
-2nd: That phrase contains three entities for different intents “new movies”, “movies” and “DVR”. Probably all these intents will get a score over the threshold but only the highest one prevails. So it is not that the DVR intent is not detected, it is just that the “new movies” intent probably gets a slightly higher score.
-3rd: “Unable to play” does not give the agent enough information to determine which intent to trigger. It might just as well have been detected as “movies”, given that you have the training phrase “unable to play movies”. No entity is present in “unable to play”, and more than one intent matches the grammar structure “unable to”.
All in all, your intents are too similar to each other and have too few training phrases. Bear in mind that some of these examples would be difficult to classify even for a human (“unable to play”).
Have a nice day!

Related

Dialogflow Detect Intent Matching Confidence And Entity Value Selection

I am looking to understand how detect intent confidence impacts entity value selection in Dialogflow. For example, using two user generated phrases:
Phrase 1: "For snack yesterday I had an apple and peanut butter". This phrase has an intent detection confidence of '1' and 'snack' and 'yesterday' are tagged correctly to their respective entities, and the foods, 'apple' and 'peanut butter' are correctly matched within their entity [food], with values of 'apple' and 'peanut butter' respectively.
Phrase 2: "Are the snack yesterday I had an apple and peanut butter". This phrase was mumbled by the user or garbled by Siri (we use an iOS voice app). Here the intent detection confidence is '0.852' and while 'snack' and 'yesterday' are tagged to their entities correctly, the foods are not treated as above. Specifically, while both are tagged to the correct entity [food] and 'apple' was correctly tagged to 'apple', the 'peanut' of 'peanut butter' was tagged as one food [value = 'peanut'] and the 'butter' of 'peanut butter' was tagged as another food [value = 'butter'].
For context, our agent has ~500 intents and ~200 entities, the largest of which has 29,998 values; the intent matched above has ~400 training phrases (clearly not including 'Are the...').
So it appears the intent detection confidence impacts the entity parameter value matching. Can anyone shed any light on this? From our viewpoint, it is not a useful 'feature'. Quite the opposite.
When searching for a matching intent, Dialogflow scores potential matches with an intent detection confidence, also known as the confidence score.
These values range from 0.0 (completely uncertain) to 1.0 (completely certain). Without taking the other factors described in this document into account, once intents are scored, there are three possible outcomes:
If the highest scoring intent has a confidence score greater than or equal to the ML Classification Threshold setting, it is returned as a match.
If no intent meets the threshold, a fallback intent is matched.
If no intents meet the threshold and no fallback intent is defined, no intent is matched.
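The three outcomes above can be sketched as a small function. This is only an illustration of the decision rule as described: `match_intent` is a hypothetical name, the scores are invented, and the `0.3` default threshold is an assumption standing in for whatever ML Classification Threshold the agent is configured with.

```python
def match_intent(scores, threshold=0.3, fallback_intent=None):
    """Apply the three outcomes to a dict of intent name -> confidence score."""
    if scores:
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            return best  # outcome 1: highest score meets the threshold
    # outcome 2: the fallback intent, or outcome 3: None if no fallback exists
    return fallback_intent


print(match_intent({"log_food": 0.852, "greeting": 0.1}))
print(match_intent({"log_food": 0.2}, fallback_intent="Default Fallback Intent"))
print(match_intent({"log_food": 0.2}))
```

Note that nothing in this rule looks at parameters or entities; it only compares intent scores against the threshold.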
Update: As per the GCP DialogFlow development team:
"Scores are referred to intent recognition confidence
Parameter extraction is not taken into account when scores are computed"
In other words, there is no relationship between intent classification confidence and entity extraction.
The described behavior could be a bug in Dialogflow or something specific to your GCP project. GCP Support would need to inspect your project to determine why this is happening; you can create a GCP Support case.
Over the last few weeks we have interacted at some length with GCP Support who've interacted with the DF engineering team.
The short answer is they indicate the entity value extraction is 'as designed', and specifically, when using a composite entity (a nested entity structure), values that match input terms (here, 'peanut', 'butter' and 'peanut butter') are extracted and matched at random. So a user's utterance of 'peanut butter' may be matched to 'peanut' and 'butter' or 'peanut butter' at random.
This behavior cannot be controlled by adding additional training phrases to the agent.
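Since the split cannot be controlled from the agent side, one possible workaround (not something suggested by GCP Support, just a sketch) is to re-join neighbouring values in your own code after extraction. The example assumes the extracted values arrive in utterance order and that you have the list of known entity values locally; `merge_adjacent` is a hypothetical helper.

```python
def merge_adjacent(values, known_values):
    """Greedily join neighbouring values when the joined phrase is a known value."""
    merged = []
    i = 0
    while i < len(values):
        if i + 1 < len(values):
            joined = f"{values[i]} {values[i + 1]}"
            if joined in known_values:
                merged.append(joined)
                i += 2  # both halves consumed
                continue
        merged.append(values[i])
        i += 1
    return merged


# Hypothetical subset of the [food] entity values.
known = {"apple", "peanut", "butter", "peanut butter"}
print(merge_adjacent(["apple", "peanut", "butter"], known))
```

The obvious limitation is that this also merges the values when the user really did mean two separate foods, so it only suits cases where the combined value is the more likely reading.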
From our point of view, the behavior is not as desired but we understand the implications for our design. Hopefully this 'answer' helps others navigate this and similar issues.

How can I do speaker identification (diarization) with microsoft speech to text without previous voice enrollment?

In my application, I need to record a conversation between people and there's no room in the physical workflow to take a 20 second sample of each person's voice for the purpose of training the recognizer, nor to ask each person to read a canned passphrase for training. But without doing that, as far as I can tell, there's no way to get speaker identification.
Is there any way to just record, say, 5 people speaking and have the recognizer automatically classify returned text as belonging to one of the 5 distinct people, without previous training?
(For what it's worth, IBM Watson can do this, although it doesn't do it very accurately, in my testing.)
If I understand your question correctly, Conversation Transcription should fit your scenario: if you don't create user profiles beforehand, it labels the speakers as Speaker[x], incrementing the number for each new speaker.
User voice samples are optional. Without this input, the transcription will show different speakers, but shown as "Speaker1", "Speaker2", etc. instead of recognizing as pre-enrolled specific speaker names.
You can get started with the real-time conversation transcription quickstart.
Microsoft Conversation Transcription, which is in Preview, currently targets microphone array devices, so the input should be recorded with a microphone array. If your recordings come from an ordinary microphone, it may not work, and you would need special configuration. You can also try batch diarization, which supports offline transcription and distinguishes 2 speakers for now; support for more than 2 speakers is expected very soon, probably this month.

Wrong intents are getting trigger on the inappropriate User input in Dialog Flow

I have a few intents that are triggered by inappropriate user input. Below are a few examples.
Intent 1). Training phrases I have given
When will I get a job abroad?
Is there any possibility that I will be settled in foreign
When will I settle in foreign
This intent is triggered by the user input "I had a fight with my friend, will it settle down".
Intent 2). Training phrases I have given
When my financial problems will over
Tell me about my financial condition
How will be my financial condition in the future
What will be my financial condition
This intent is triggered by the user input "When my family problems will over".
Please help me out to handle these scenarios.
According to this documentation, you should use at least 10-20 training phrases.
You don't have to define every possible example, because Dialogflow's built-in machine learning expands on your list with other, similar phrases. You should create at least 10-20 (depending on complexity of intent) training phrases, so your agent can recognize a variety of end-user expressions. For example, if you want your intent to recognize an end-user's expression about their favorite color, you could define the following training phrases:
"I like red"
"My favorite color is yellow"
"black"
"Blue is my favorite"
...
Given that, to improve the accuracy of your intents, I'd recommend creating more training phrases and focusing them on the main terms relevant to your problem.
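To make the gap concrete, here is a small sketch that counts the training phrases from the question against the lower bound of the quoted 10-20 guideline. The intent names are hypothetical labels (the question only calls them Intent 1 and Intent 2), and `under_trained` is an illustrative helper, not part of any Dialogflow client.

```python
MIN_PHRASES = 10  # lower bound of the 10-20 range quoted above


def under_trained(intents, minimum=MIN_PHRASES):
    """Return {intent: phrases_missing} for intents below the minimum."""
    return {
        name: minimum - len(phrases)
        for name, phrases in intents.items()
        if len(phrases) < minimum
    }


intents = {
    "job abroad": [
        "When will I get a job abroad?",
        "Is there any possibility that I will be settled in foreign",
        "When will I settle in foreign",
    ],
    "financial condition": [
        "When my financial problems will over",
        "Tell me about my financial condition",
        "How will be my financial condition in the future",
        "What will be my financial condition",
    ],
}
print(under_trained(intents))
```

Both intents fall well short of the guideline, which is consistent with the loose matching on words like "settle" and "problems".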

how to validate user expression in dialogflow

I have created a pizza bot in Dialogflow. The scenario is as follows:
Bot says: Hi What do you want.
User says : I want pizza.
If the user says "I want watermelon" or "I love pizza", then Dialogflow should respond with an error message and ask the same question again. After getting a valid response from the user, the bot should ask the second question, like
Bot says: What kind of pizza do you want.
User says: I want mushroom(any) pizza.
If the user gives some garbage data like "I want icecream" or "I want good pizza", then again the bot has to respond with an error and ask the same question. I have trained the bot with the intents, but the problem is validating the user input.
How can I make it possible in dialogflow?
A glimpse of training data & output
If you have already created different training phrases, then invalid phrases will typically trigger the Fallback Intent. If you're just using @sys.any as a parameter type, then it will fill it with anything, so you should define narrower Entity Types.
In the example Intent you provided, you have a number of training phrases, but Dialogflow uses these training phrases as guidance, not as absolute strings that must be matched. From what you've trained it, it appears that phrases such as "I want .+ pizza" should be matched, so the NLU model might read it that way.
To narrow exactly what you're looking for, you might wish to create an Entity Type to handle pizza flavors. This will help narrow how the NLU model will interpret what the user will say. It also makes it easier for you to understand what type of pizza they're asking for, since you can examine just the parameters, and not have to parse the entire string again.
How you handle this in the Fallback Intent depends on how the rest of your system works. The most straightforward is to use your Fulfillment webhook to determine what state of your questioning you're in and either repeat the question or provide additional guidance.
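A minimal sketch of that webhook idea, assuming the conversation state is tracked with output contexts. The context names and prompts are hypothetical, and the JSON field names (`queryResult.outputContexts`, `fulfillmentText`) follow the Dialogflow ES webhook format as I understand it, so check them against the webhook reference before relying on this.

```python
# Hypothetical context names, most specific state first.
REPROMPTS = [
    ("awaiting_pizza_type", "Sorry, I didn't recognise that pizza. What kind of pizza do you want?"),
    ("awaiting_order", "Sorry, I can only help with pizza. What do you want?"),
]
DEFAULT_REPROMPT = "Sorry, I didn't get that. What do you want?"


def active_context_names(request):
    """Extract the short context names from a webhook request body (a dict)."""
    contexts = request.get("queryResult", {}).get("outputContexts", [])
    # Full names look like ".../contexts/<name>"; keep only the last segment.
    return {c["name"].rsplit("/", 1)[-1] for c in contexts}


def handle_fallback(request):
    """Build a response that repeats the question for the current state."""
    active = active_context_names(request)
    for state, prompt in REPROMPTS:
        if state in active:
            return {"fulfillmentText": prompt}
    return {"fulfillmentText": DEFAULT_REPROMPT}


request = {
    "queryResult": {
        "outputContexts": [
            {"name": "projects/p/agent/sessions/s/contexts/awaiting_pizza_type"}
        ]
    }
}
print(handle_fallback(request)["fulfillmentText"])
```

The function only builds the response body; wiring it into an HTTP endpoint depends on whatever framework your fulfillment already uses.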
Remember, also, that the conversation could go something like this:
Bot says: Hi What do you want.
User says : I want a mushroom pizza.
They've skipped over one of your questions (which wasn't necessary in this case). This is normal for a conversational UI, so you need to be prepared for it.
The type of pizzas (eg mushroom, chicken etc) should be a custom entity.
Then, in your intent, you should define the training phrases as you have, but make sure that the entity is marked and that you also add a template for the user's response:
There are 3 main things you need to note here:
The entities are marked
A template is used. To create a template click on the quote symbol in the training phrases as the image below shows. Make sure that again your entity is used here
Make your pizza type a required parameter. That way it won't advance to the next question unless a valid answer is provided.
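To show why a custom entity plus a required parameter gives you validation, here is a rough emulation in plain code. It is not how Dialogflow matches entities internally; the entity values, prompts, and function names are all assumptions made for the example.

```python
import re

PIZZA_TYPES = {"mushroom", "chicken", "margarita"}  # hypothetical entity values


def extract_pizza_type(utterance):
    """Return the first known pizza type mentioned in the utterance, else None."""
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in PIZZA_TYPES:
            return word
    return None


def next_prompt(utterance):
    """Re-ask the question until the required parameter is filled."""
    pizza_type = extract_pizza_type(utterance)
    if pizza_type is None:
        return "Sorry, we don't have that. What kind of pizza do you want?"
    return f"Got it: {pizza_type} pizza."


for text in ["I want icecream", "I want good pizza", "I want mushroom pizza"]:
    print(text, "->", next_prompt(text))
```

Because only values from the entity fill the parameter, "icecream" and "good" leave it empty and the question is repeated, which is the behaviour the required-parameter setting is meant to provide.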
One final piece of advice is to put some more effort into designing the interaction and the responses. Greeting your users with "what do you want" isn't the best experience. Also, with your approach you're trying to force them down one specific path, but this is not how a conversational app should work. You can find more about this here.
A better experience would be to greet the users, explain what they can do with your app and let them know about their options. Example:
- Hi, welcome to the Pizza App! I'm here to help you find the perfect pizza for you [note: here you need to add any other actions your bot can perform, like track an order for instance]! Our most popular pizzas are mushroom, chicken and margarita. Do you know what you want already or do you need help?

Map to the wrong LUIS intent

I am facing an issue whereby words that do not match any intent are assumed to belong to the intent with the most labeled utterances.
Example: if
Intent A consists of utterances such as Animals
Intent B consists of utterances such as Fruits
Intent C consists of utterances such as Insects
Intent D consists of utterances such as People Name
Desired: if the random word(s) do not fit any of the LUIS intents, they should fall into the None intent. For example, if a word such as "emotions" or "clothes" is entered, it should match the "None" intent.
Actual: when the user types random word(s), they match the LUIS intent with the highest number of labeled utterances. If a word such as "emotions" is entered, it matches intent "A", because intent A has the highest number of labeled utterances.
Please advise on the issue.
Set a score threshold, below which your app won't show any response to the user (or could show a "sorry I didn't get you" message instead). This avoids responding to users with anything LUIS is unsure about, which usually takes care of a lot of "off topic" input too.
I would suggest setting your threshold between 0.3 and 0.7, depending on the seriousness of your subject matter. This is not a configuration option in LUIS; rather, in your code you just do:
if (result.score >= 0.5) {
    // show response based on intent
} else {
    // ask user to rephrase
}
On a separate note, it looks like your intents are very imbalanced. You want to try and have roughly the same number of utterances for each intent, between 10 and 20 ideally.
Without more details on how you've built your language model, the most likely underlying issue is that you don't have enough utterances in each intent, or that they don't show enough variation in the different ways an utterance could be phrased for that intent.
And by variation I mean different lengths of the utterance (word count), different word order, tenses, grammatical correctness, etc. (docs here)
And remember each intent should have at least 15 utterances.
Also, as stated in the best practices, did you make sure to include example utterances in your None intent as well? Best practices state that you should have 1 utterance in None for every 10 utterances in the other parts of your app.
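As a quick sanity check on the two guidelines above (balanced intents, and 1 None utterance per 10 others), the arithmetic can be sketched as follows. The counts are invented for illustration, the function names are hypothetical, and rounding the None count up is my own assumption.

```python
import math


def none_utterances_needed(intent_counts):
    """One None example per 10 utterances elsewhere in the app, rounded up."""
    total = sum(n for name, n in intent_counts.items() if name != "None")
    return math.ceil(total / 10)


def imbalance_ratio(intent_counts):
    """Largest intent size divided by the smallest, ignoring None."""
    sizes = [n for name, n in intent_counts.items() if name != "None"]
    return max(sizes) / min(sizes)


# Hypothetical utterance counts for intents A-D from the question.
counts = {"A": 40, "B": 12, "C": 10, "D": 8, "None": 0}
print(none_utterances_needed(counts), imbalance_ratio(counts))
```

With counts like these, intent A has five times as many examples as intent D and None has no examples at all, which matches the symptom of unknown words drifting to intent A.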
Ultimately: build your app so that your intents are distinct enough, with varying example utterances built into each intent, so that when you test other utterances LUIS will be more likely to match them to your distinct intents--and if you enter an utterance that doesn't follow any sort of pattern or context of your distinct intents, LUIS will know to assign the utterance to your fallback "None" intent.
If you want more specific help, please post the JSON of your language model.

Resources