Entity over-generalisation on Api.ai

We’ve been having a great deal of difficulty with chatbot entities over-generalising on Api.ai, i.e. returning values that have not been specified for that entity when using the “Define Synonyms” feature on custom entities, even when the “Allow automated expansion” flag is turned off.
Our key example is an entity we use for confirming a user choice called confirm_accept. We had an entry: “that’s it”, with synonyms: “thats it”, “that is it”, “that’s it thanks”, “thats it thanks”, “that is it thanks”. This entity value was being returned unexpectedly in expressions where just a stray “it” was appearing.
In general, we have seen a lot of inappropriate entity generalisation which seems to indicate there is some form of stop word removal and stemming/lemmatization going on during entity identification... and which can’t be turned off.
This returns poor entity classifications, making it difficult to create entities for which very precise values are important, e.g. where a single word or character can make a big difference in meaning. Our key use case involves a lot of address processing, so it is important we get back only values we have specified.
Types of over-generalisations we’ve seen include:
inappropriate identification of determiners (a, an, the, this, that, etc.) as part of entities: as in “it” returning “that’s it”
stemmed words: as in stray mentions of “driving”, returning “drive” (a valid street type entity)
inappropriate plural stems: a stray mention of “children” returning “child”, or a stray “will” returning “wills” (in our case “child” and “wills” are street-name entities, so we don’t want “children” or “will” to be matched)
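The pattern above is consistent with naive stemming applied to both the utterance and the entity values. As a purely illustrative sketch (these rules are our guesses, not API.AI's actual algorithm), matching on collided stems would produce exactly these false positives:

```javascript
// Hypothetical sketch of the over-generalisation mechanism we suspect: a crude
// suffix stemmer applied to both utterance words and entity values. These
// rules are invented for illustration, not API.AI's real implementation.
function crudeStem(word) {
  const w = word.toLowerCase();
  if (w.endsWith('ing')) return w.slice(0, -3) + 'e'; // "driving"  -> "drive"
  if (w.endsWith('ren')) return w.slice(0, -3);       // "children" -> "child"
  if (w.endsWith('s'))   return w.slice(0, -1);       // "wills"    -> "will"
  return w;
}

// An entity value "matches" a stray word whenever the stems collide:
function stemsCollide(utteranceWord, entityValue) {
  return crudeStem(utteranceWord) === crudeStem(entityValue);
}
```

Under rules like these, a stray “driving” collides with the street type “drive”, “children” with the street name “child”, and “will” with “wills” — the exact behaviour reported above.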
This is currently making it difficult to create a production quality chatbot using the Api.ai service.
Anyone had more luck at either getting a response from Api.ai or solving the over-generalisation problem?

Entities are meant to extract information from conversation:
API.AI's entities are meant to extract data from conversational input, not to parse different phrases and parts of speech. Your examples (that’s it, thats it, that is it, that’s it thanks, thats it thanks, that is it thanks) all seem to indicate that the user's intent is to confirm that the last message from the API.AI agent was correct. For cases like these, it would be best to use those phrases as training examples for a new intent (or an existing one) that captures the user confirming the last response.
API.AI captures entity tenses and plurals automatically: To address your other concern (“driving” returning the drive value, “children” returning child, or “will” returning wills): API.AI intentionally captures different tenses and plurals of entities to provide a better experience for users who may not know the exact entity values you've entered in your database. This allows users of your conversational app to have a natural conversation with it without requiring precise wording.

Related

Dialogflow parameter and entities

I wanted to know if it is possible for Dialogflow to store the value of a parameter without setting an entity for it. For example, the bot asks the user “What is your name?” and the user responds with “Jack”. Dialogflow should then store the value “Jack” in the parameter. Thank you.
Dialogflow Web UI
So when working with the Dialogflow Web UI your options are a bit more limited compared to fulfillment and you won't really get around using entities. In the Web UI the best way to extract values from the user input is by using entities and parameters. The only way of using "raw" input via the Web UI is by using the #sys.any entity, but you should be careful with this.
The #sys.any entity takes the complete input from the user and gives it to you, but it doesn't provide any information on what entity the input might be. For instance, if you ask the user "What is your name?" the user might respond with "John", but they could also respond by saying "Oh.. uhh. My name is John". If you use #sys.any you get the whole string, and you have to detect what is a name and extract it from the input yourself. Entities and parameters do this for you.
You can use the input from the user in your responses by referencing $parameterName.
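To make the trade-off concrete, here is a sketch of the manual extraction you take on when using #sys.any for the name question. The lead-in patterns ("my name is", "i'm", etc.) are illustrative assumptions, not a complete name detector:

```javascript
// Manual name extraction from a raw #sys.any string. The phrase patterns are
// assumptions for illustration; a real detector would need far more coverage.
function extractName(rawInput) {
  // Strip common lead-ins such as "my name is John" or "I'm John".
  const match = rawInput.match(/(?:my name is|i am|i'm|call me)\s+([a-z]+)/i);
  if (match) return match[1];
  // Otherwise treat a bare one-word answer as the name itself.
  const trimmed = rawInput.trim();
  return /^[a-z]+$/i.test(trimmed) ? trimmed : null;
}
```

With an annotated entity and parameter, Dialogflow would hand you "John" directly in both phrasings, with none of this code.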
Dialogflow Fulfillment
When working with code, the issues with raw input remain the same: you will get the whole user input, but have to do recognition or regexing to retrieve the values yourself.
One benefit of working with fulfillment is that you always have access to the raw input, you can call agent.query to retrieve the raw input, so you are not required to use a #sys.any entity in your parameter setup.
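A minimal sketch of that pattern (the ZIP-code regex and reply wording are assumptions); a plain stub object stands in for the real dialogflow-fulfillment WebhookClient so the handler can be exercised in isolation:

```javascript
// Fulfillment handler that reads the raw utterance from agent.query instead of
// relying on a #sys.any parameter. The ZIP-code regex is illustrative only.
function handleCustomerSearch(agent) {
  const raw = agent.query; // the full user utterance, untouched by entity matching
  const zipMatch = raw.match(/\b\d{5}\b/);
  if (zipMatch) {
    agent.add(`Searching for customers in ZIP ${zipMatch[0]}...`);
  } else {
    agent.add('Could you give me your ZIP code?');
  }
}

// Stub standing in for a WebhookClient instance, just to exercise the handler:
const fakeAgent = {
  query: 'sure, my zip is 90210 I think',
  replies: [],
  add(message) { this.replies.push(message); },
};
handleCustomerSearch(fakeAgent);
```

The point is the same as in the Web UI case: you get the string for free, but every bit of structure has to be recovered by hand.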
Conclusion
So as mentioned above, there are a couple of ways of retrieving the raw input of the user, but in both cases you lose the automatic detection provided by entities and parameters. While it might at first seem a hassle to work with entities and parameters, if you are going to use the user input for anything, like saving the name or making a decision, I really recommend sticking to the entity approach: it automatically detects the value in the string, so you don't have to worry about how the user answers your question, which is a big part of developing a bot.
There are very few cases where using raw input has made developing easier for me in the long run.

Entity Extraction/Validation in Bot Composer

I have a Composer text input task with the following settings. I'm building a bot to test the capabilities of LUIS integration. The issue I'm facing is not the entity recognition itself, but rather the entity validation on the input task for the entity. I only notice this when using a text input task; entity validation is working for other tasks (specifically datetime).
Current Entity User Input Task
To illustrate my issue, here are two current behaviors of the bot:
WAI: I say something like "I want to make an appointment and my contact number is (800)-234-5678". This triggers my intent model, skips the user input question where I ask for the user's phone number (seeing as they provided one already), and the conversation.phoneNumber variable is (800)-234-5678.
Not WAI: I say something like "I want to make an appointment and my contact number is abc". This triggers my intent model as desired but skips the user input question where I ask for the user's phone number (because it thinks they provided one already) and the conversation.phoneNumber variable is abc. Ideally, this is where I would think validation would happen and it would ask the question "What is a good phone number for our office..." seeing as abc isn't a phone number.
For reference, this is the documentation I'm following:
https://learn.microsoft.com/en-us/composer/how-to-define-advanced-intents-entities
I considered a regex validator as an option because phone number is such a trivial entity that can be easily defined, but something more complex/diverse (such as geographic location, currency, etc) would be better handled by whatever dictionaries Microsoft has in place for built-in entities. I can see the raw data when I type in an utterance that shows no entity is being picked up (so SOMETHING is working in the background). I'm just curious if I can make use of that functionality in a bot composer task.
As I said before, this is working fine when validating if a user entered a date/time on a separate task; I'm hoping there's a feature for this for other built-in, text-based entities in the bot composer.
I assume that whatever built-in entity recognition that Microsoft has for their built-in entities here https://learn.microsoft.com/en-us/azure/cognitive-services/luis/luis-reference-prebuilt-entities is vastly superior to whatever regex I can come up with, and some type of built-in validation would be handy for more complex bots.
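For the phone-number case specifically, the regex fallback mentioned above is easy to sketch; the pattern below (US-style numbers, including the "(800)-234-5678" shape) is an assumption and would need widening for international formats:

```javascript
// Naive US phone-number shape check, as a stand-in for LUIS's prebuilt
// phonenumber entity. The pattern is an assumption for illustration only.
function looksLikePhoneNumber(text) {
  return /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/.test(text.trim());
}
```

In Composer this check would live in the input task's validation expression rather than in fulfillment code, but the goal is the same: reject "abc" and re-prompt, while leaving harder entity types (location, currency) to the prebuilt recognizers.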

Entity value is entire utterance if utterance doesn't exist as an exact training phrase

I have a custom Entity which is configured as allowing synonyms but without "fuzzy matching" and "allow auto expansion". This entity exists in an intent with a lot of training phrases.
For this problem, let's say I have an entity named Fruit with the values apple and pear as well as a training phrase that is "I would like to buy a $fruit".
If I invoke this intent with a phrase that exists as a training phrase I get my entity value resolved nicely.
But if I'd invoke the intent with an utterance, that includes an entity value, but doesn't exist as a training phrase the entity value is resolved to the entire utterance, which of course isn't what I want.
For instance if the user says "I think I would like an apple", the parameters.fruit value is "I think I would like an apple".
Now if a user would say this exact utterance once more a few seconds or minutes later, then the entity $fruit is resolved nicely to "apple" as a parameter to my intent (conv.parameters.fruit = "apple").
How do I configure my intent/entity to always resolve the entity value correctly? This is really frustrating since it's a simple, stupid entity (enum) I'm trying to use. Thanks.
Although I am unable to reproduce your issue, this would be a bug on the Dialogflow side.
Can you try again and confirm you are still receiving this issue?
Also ensure you only have Define synonyms enabled for your custom entities, and that the highlighted portions of all of your training phrases only includes the name of the entity value.
Regardless, I have filed a bug internally to further investigate this issue.
Best,

LUIS: Identify "normalized" value based on synonyms automatically

I am currently developing a chatbot that recommends theater plays to the user.
It is working pretty well, but now I want to enable the user to get recommendations based on the type of theater plays (like funny, dramatic, sad).
Nevertheless, as I do not know exactly how the user will phrase the request, synonyms might also be used (funny: witty, humorous, ...).
What is a good solution to get these types from the user's request in a normalized way?
Typically I would use the List entity, but then I have to insert all synonyms for each possible value myself. Is there a way to define my "normalized" values so that synonyms are automatically matched by LUIS (and improved by further training of the model)?
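Absent such a feature, the normalization a list entity performs is simple to replicate by hand; the canonical values and synonym lists below are illustrative assumptions:

```javascript
// Hand-rolled synonym normalization mirroring a LUIS list entity: each
// canonical value owns a synonym list (both invented here for illustration).
const genreSynonyms = {
  funny:    ['funny', 'witty', 'humorous', 'comic'],
  dramatic: ['dramatic', 'intense', 'serious'],
  sad:      ['sad', 'tragic', 'melancholy'],
};

function normalizeGenre(word) {
  const lower = word.toLowerCase();
  for (const [canonical, synonyms] of Object.entries(genreSynonyms)) {
    if (synonyms.includes(lower)) return canonical;
  }
  return null; // not a known genre word
}
```

A list entity gives you exactly this mapping (the canonical value comes back as the resolved entity value), but note that list entities are exact-match: LUIS does not learn new synonyms for them from further training, so unseen synonyms have to be added by hand either way.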

How to create a search form with dialogflow

I am trying to make a search algorithm with Dialogflow that could take any combination of first name, address, phone number, zip code or city as input. The user does not need to provide all of them; we will refine our search with each additional answer until we only have one result. Basically, we are trying to identify which customer we are talking to.
How should this type of intent (or set of intents) be structured? We have tried one intent with multiple parameters, but we do not need all of them to be required. We have also written a JavaScript function for fulfillment but how can we communicate back to dialogflow as to whether we need more information?
Thank you very much for your help.
Slot filling is designed for this purpose.
Hope that helps.
Please post more code/details to help answers be more specific.
First, keep in mind that Intents reflect what the user is saying, and not typically what you're replying with or what other information you need. Slot filling sometimes bends this rule, but only if you have required slots.
Since you don't - you need a different approach.
This can be done with a single intent, although you may find that multiple intents make it easier in some ways. The approach is broadly the same:
When you ask the question, make sure you set an Outgoing Context with a relatively short lifespan (2-3 is good) to indicate you are collecting user info.
Create an Intent (or Intents) that have sample phrases that capture the information you need.
Some of these will have obvious entity types (phone number and zip code) while others will be more difficult (First name has a system entity type, but it doesn't include all possible first names).
You will need to create sample phrases that collect the parameters by themselves, along with phrases that make sense. You're the best judge of this, and you should probably write some sample conversations before you write the phrases.
In your fulfillment, you'll figure out if you have enough information.
If you do, you can reply and clear the Context that was set. (Clearing it is important so Dialogflow doesn't match the information collecting Intent again.)
If you do not, you can add the information you have as parameters to the Context so you can save it for later processing, make sure you reset the Context lifespan (so it doesn't expire), and prompt the user for additional information. Again, having a conversation mocked out ahead of time will help here.
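The accumulate-and-reprompt loop in those last two steps can be sketched with a plain object standing in for Dialogflow's context API (the field names and the demo customer list are assumptions):

```javascript
// Sketch of the context-based slot accumulation described above. A plain
// object stands in for the Dialogflow context; all names here are invented.
function collectCustomerInfo(context, newParams, searchFn) {
  // Merge what the user just provided into the parameters saved on the context.
  const collected = { ...(context ? context.parameters : {}), ...newParams };
  const matches = searchFn(collected);
  if (matches.length <= 1) {
    // Narrowed to (at most) one customer: reply and clear the context.
    return { done: true, matches, context: null };
  }
  // Still ambiguous: keep the context alive with a reset lifespan and re-prompt.
  return { done: false, matches, context: { lifespan: 3, parameters: collected } };
}

// Demo data and search: filter customers by every field collected so far.
const customers = [
  { firstName: 'Jack', zip: '90210' },
  { firstName: 'Jill', zip: '90210' },
  { firstName: 'Jack', zip: '10001' },
];
const search = (params) =>
  customers.filter((c) => Object.entries(params).every(([k, v]) => c[k] === v));
```

A first turn with only a ZIP code leaves two candidates, so the context survives with the ZIP saved on it; a second turn adding the first name narrows the search to one customer and clears the context, so the information-collecting intent stops matching.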
