I'm using Dialogflow as the NLP engine behind a chatbot, and am trying to get it to recognize company names. In the following examples, it understands the intent well, but doesn't pick up the company name.
Create a company called Google
Make a new account called Johnson & Johnson
New company Nike
Does anyone have any advice on how I can get Dialogflow to start to recognize these entities? I'm wondering if there are features I don't know about, or maybe some sort of plugin/library I can utilize for this?
I'm afraid there's no Dialogflow system entity that can do this for you as of October 2020. Your best bet is to add as many training phrases as possible and create a custom entity with @sys.any as the entity type. Annotate the company name in as many training phrases as possible and let Dialogflow do the rest. When it comes to identifying company names specifically, there are two types of company names:
Common company names like "Google" or "Facebook", which Dialogflow can recognize without much assistance, especially if your entity type is @sys.any.
Domain-specific company names like Overflow LLC or Stack and Overflow Associates. Here the annotated training phrases play an important role. If you have an idea of the types of companies that need to be understood, it helps to annotate phrases that contain the typical markers (e.g. LLC, Associates, Firm).
Also think about how you phrase your question so that the user answers in the form you need. For example, "Please type in/spell out the name of your company" increases the chances that whatever the user enters is just the company name.
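To make the advice about annotating many training phrases more concrete, here is a minimal sketch that generates phrase/company pairs to paste into an intent and annotate. The templates reuse the example utterances from the question; the name lists, suffix list and the build_training_phrases helper are hypothetical and only there for illustration. It does not call any Dialogflow API.

```python
from itertools import product

# Hypothetical templates and name parts, chosen only for illustration.
TEMPLATES = [
    "Create a company called {company}",
    "Make a new account called {company}",
    "New company {company}",
]
BASE_NAMES = ["Overflow", "Stack and Overflow", "Johnson & Johnson"]
SUFFIXES = ["", "LLC", "Associates", "Firm"]


def build_training_phrases(templates, base_names, suffixes):
    """Return (phrase, company) pairs for every template/name/suffix combination."""
    pairs = []
    for template, base, suffix in product(templates, base_names, suffixes):
        # An empty suffix leaves a trailing space, which strip() removes.
        company = f"{base} {suffix}".strip()
        pairs.append((template.format(company=company), company))
    return pairs


phrases = build_training_phrases(TEMPLATES, BASE_NAMES, SUFFIXES)
for phrase, company in phrases[:4]:
    print(f"{phrase!r} -> annotate {company!r}")
```

Each pair tells you which span of the phrase to mark with the @sys.any-based entity.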
I have a list of functionality that the system should have and I have created a use case diagram for it, but I am not sure whether it corresponds to that functionality and would like a second look at my solution. Hopefully it is readable, and I appreciate any feedback on my design attempt.
Description of said functions:
The system shall allow people to register as a student or faculty member. To sign up, users must provide their name, e-mail address, phone number, and a password. In addition, students add the name of their program and their student id; faculty members add the name of their department and their employee id.
A user shall be able to search for books, and the library system shall indicate the availability of books. If a book is available for loan, a logged-in user shall be able to reserve it for loan. When it is reserved, the librarian will move the book to a pick-up shelf. To loan a book, users shall log in at the library in person and check out the books.
The library has an automated booth where users can leave the books, and the system shall process the return. Upon return, the system shall send a digital receipt to the user's e-mail address.
When a book is not returned in time, the system shall send a reminder e-mail with a fine for each day it is late.
The system shall allow users to extend the loan period of a loaned book at most two times. The system shall allow users to have at most five books on loan simultaneously. If a book is not available in the current library, but is in another one, users can ask the system to transfer it, and vice versa. Books have a title, author, ISBN, edition, and shelf number denoting their location in the library. The library has varying stock for different books: it has a single copy of most books but up to ten physical copies of some popular books.
Librarians shall be able to add new books to the system and edit the information of existing books.
The design:
Update based on feedback:
I will not make a detailed review of your diagram, since this is very specific to your needs and will not help anybody else. However, I'd like to address some general issues that are frequent in this kind of diagram:
It appears in the requirements that users may be a student or faculty member (or both?), whereas your diagram suggests that a user is another independent category of actors.
Having several actors for a use case is ambiguous. It cannot always be avoided, but here, it's not clear if all the actors are involved at the same time for a search, or if they are involved one after the other, or if only one may be involved at a time.
Your diagram is a functional decomposition of the requirements. For example, Register and Verify registry are not independent, but the second belongs to the detailed decomposition of the first, without being an independent user goal (in fact, the verification doesn't make sense without the first). The same applies to all the verifications ("maximum...") as well. This is not forbidden but strongly discouraged as it leads to too detailed and complex diagrams.
Sometimes your diagram seems to be a sequence of actions: e.g. Return a book is followed by a confirmation email. Use-case diagrams shall not show any sequence. If you want to show the workflow, you need to use an activity diagram and not a use-case diagram.
extend corresponds to an optional use-case. Here, you seem to say that books are returned only for some loans.
In conclusion, simplify your diagram to show only user goals. Avoid extend and include dependencies as much as possible, to keep it simple and understandable. If you want to document details, document them in a narrative, not in the diagram.
To boil it down: this is no use case synthesis but functional decomposition. Use cases show added value for actors. Full stop. This is obviously the hardest thing to learn when finding use cases. They are like pearls you have to find. It's not about the how-to.
I recommend reading Bittner/Spence about use cases.
I am using Rasa 2.0 to build an FAQ chatbot, wherein I have a large dataset, and specifying entities while defining intents does not seem efficient to me.
I have the intents and examples defined in nlu.yml and would like to extract entities.
Here is an example of what I want to achieve,
User message -> I want a hospital in Delhi.
Entity -> Delhi, hospital
Is it possible to do so?
Entity detection is not a solved problem. There exist pre-trained models that integrate with Rasa, like Duckling and spaCy, and while these tools certainly contribute a lot of knowledge, they will make errors. If you're interested in learning more of the background on why these models can fail, you can enjoy this YouTube video that explains human name detection.
That's why a popular alternative is to use name-lists. There are lists of cities around the world as well as lists of baby names that you can download that might be used as a rule based alternative. You can configure this in Rasa via the RegexEntityExtractor but if you have namelists with 1000+ items then a FlashTextExtractor might be preferable.
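The name-list idea can be sketched in plain Python. This is only an illustration of the rule-based approach, not how RegexEntityExtractor or FlashTextExtractor are implemented; the CITIES list and the compile_name_list and extract helpers are hypothetical.

```python
import re

# Hypothetical name-list; in practice this would be loaded from a downloaded file.
CITIES = ["Delhi", "New Delhi", "Mumbai", "Bengaluru"]


def compile_name_list(names):
    """Build one case-insensitive regex that matches any whole name in the list."""
    # Longest names first, so "New Delhi" is preferred over "Delhi".
    ordered = sorted(set(names), key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def extract(text, pattern, entity):
    """Return one dict per match with the entity name, matched text and span."""
    return [
        {"entity": entity, "value": m.group(0), "start": m.start(), "end": m.end()}
        for m in pattern.finditer(text)
    ]


city_pattern = compile_name_list(CITIES)
message = "I want a hospital in new delhi."
print(extract(message, city_pattern, "city"))
```

A single large alternation like this gets slow as the list grows, which is the reason a keyword-matching approach such as FlashText tends to be preferred for lists with 1000+ items.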
If you've got labelled examples you can also train Rasa itself to recognise the entities. But in order to do this you will need to have labels available.
specifying entities while defining intents does not seem efficient to me
Labelling might not be super fun, but it is super effective. Without labelling your received utterances you won't know what intents your users are interested in.
You could use entity annotations in your nlu training data; for example, assuming you have defined building_type and city as entity names:
I want a [hospital](building_type) in [Delhi](city).
Alternatively, you could try out these options:
annotate a smaller sample (for example, those entities that are essential for your FAQ assistant)
use the RegexEntityExtractor to write some rules
if you have a list of entities, you can use lookup tables to generate the regular expressions
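If you already have lists of entity values, one way to reduce the manual labelling effort is to pre-annotate your examples and then review them by hand. The sketch below wraps known values in the [value](entity) annotation format; the LOOKUPS dictionary and the annotate helper are hypothetical and not part of Rasa.

```python
import re

# Hypothetical lookup lists keyed by entity name.
LOOKUPS = {
    "building_type": ["hospital", "school", "pharmacy"],
    "city": ["Delhi", "Mumbai"],
}


def annotate(text, lookups):
    """Wrap every lookup value found in text as [value](entity)."""
    # Map each lower-cased value to the entity it belongs to.
    entity_of = {}
    for entity, values in lookups.items():
        for value in values:
            entity_of[value.lower()] = entity
    # Longest values first so multi-word values win over their substrings.
    ordered = sorted(entity_of, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(v) for v in ordered) + r")\b",
        re.IGNORECASE,
    )
    # A single substitution pass, so already-annotated text is not matched again.
    return pattern.sub(
        lambda m: f"[{m.group(0)}]({entity_of[m.group(0).lower()]})", text
    )


print(annotate("I want a hospital in Delhi.", LOOKUPS))
```

The output still needs a manual check, since a plain list lookup cannot tell apart words that are only sometimes entities.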
I am building a new intent where the user phrases contain a company name, like:
my company is google
Google Technology is my company
company is Apple Systems Inc.
Which entity type do I have to set to get the complete company name exactly as it appears in the user phrase? For the phrases above, I want Dialogflow to give me:
google
Google Technology
Apple Systems Inc.
Thanks,
It depends on how much control you have (or want) over what those company names can be.
If you have a concrete list of companies that it should apply to, you can create a Custom Entity Type. This is the best approach if you can use it.
If not, you may need to use the @sys.any type, which will match anything.
I'm trying to create a model in LUIS that allows me to detect whether a brand (any brand) is mentioned in an utterance. I've tried different approaches but I'm struggling to get it working.
First I have an intent searchBrand with some examples utterances:
'Help me find info about Channel'
'I want to know more about Adidas'
...
What I want is that LUIS recognizes that a brand has been mentioned in the utterance (as an entity).
I believe I have these options:
Use a List Entity: impossible since I would have to fill the list with every possible brand that exists and, moreover, the user would have to write the brand exactly as it is, not allowing typos (e.g. ralf lauren)
Use a ML Entity: I believe this could be the right approach. I've tried the following without success:
Create a ML Entity "brands"
Add a Structure with 1 component "brand"
Add to the component a Descriptor with a list of different brands as an example
Once I label the entities in the utterances, the model correctly recognizes the brands that I added to the Descriptor, but it fails to recognize other brands or typos
Another option is a pattern entity. It fits somewhere between the two options you listed. You do need to train it with the patterns, and if the pattern is off at all it will not recognize the entity (and won't recognize the intent either unless you've separately trained it with utterances, which you should). However, it seems like the phrasings in your case would be consistent enough that you could define a few patterns for this, and as you train your bot from endpoint utterances you can add additional patterns as needed. Here is an example:
As I put this together I realized I'm ignoring [help me] and [find], essentially the pattern is "info about {brand}", which may or may not be appropriate depending on your other intents. If you say something different like "Tell me more about Adidas", the intent will be recognized (I trained it with your sample utterances), but the pattern, and therefore entity, will not.
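The all-or-nothing behaviour of a pattern can be illustrated with a small Python sketch. This is not how LUIS evaluates patterns internally; it only mimics the idea of a template with an entity slot, and the template_to_regex and match_brand helpers are hypothetical.

```python
import re


def template_to_regex(template):
    """Turn a template such as 'info about {brand}' into a compiled regex."""
    # Splitting on {name} yields literal text at even indexes, slot names at odd ones.
    parts = re.split(r"\{(\w+)\}", template)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pieces.append(re.escape(part))      # literal text must match as written
        else:
            pieces.append(f"(?P<{part}>.+?)")   # entity slot captures free text
    # Allow optional closing punctuation after the last slot.
    return re.compile("".join(pieces) + r"[.?!]*$", re.IGNORECASE)


def match_brand(utterance, pattern):
    """Return the captured brand, or None when the pattern does not apply."""
    m = pattern.search(utterance)
    return m.group("brand") if m else None


pattern = template_to_regex("info about {brand}")
print(match_brand("Help me find info about Channel", pattern))
print(match_brand("Tell me more about Adidas", pattern))
```

The second utterance does not contain the literal "info about", so no brand is extracted from it, which mirrors the limitation described above.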
Tutorial on using Patterns in LUIS
I got it working following this:
Create a ML Entity "brands"
Add to the entity a Descriptor with a list of different brands as an example. Remember to normalize the elements in the Descriptor
Add brands to the Descriptor
Label entities as "brands" inside utterances in intent "searchBrands"
Train & test the model
It is very important to normalize everything in LUIS. I had the brands inside the Descriptor capitalized, and LUIS couldn't recognize new ones. Once I normalized the brands, LUIS started suggesting new ones and recognizing more of them when testing the model.
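A minimal sketch of preparing the brand list before adding it to the Descriptor, assuming that "normalize" here means lower-casing and cleaning up whitespace (the answer does not spell this out). The helper names and the sample list are hypothetical.

```python
def normalize_brand(name):
    """Lower-case a brand name and collapse surrounding and repeated whitespace."""
    return " ".join(name.lower().split())


def normalize_brands(names):
    """Normalize every name and drop duplicates, keeping first-seen order."""
    seen = {}
    for name in names:
        seen.setdefault(normalize_brand(name), None)
    return list(seen)


raw = ["Adidas", "  Ralph  Lauren ", "ADIDAS", "Nike"]
print(normalize_brands(raw))
```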
I was looking through the documentation and testing Google's Natural Language API and noticed it gets a number of people, events, organizations, and locations incorrect - it appears to be using Wikipedia as a major data source so if it is not in Wikipedia it seems to have trouble identifying the type of various words. Also, if certain words appear in a name (proper noun) it seems to always identify an entity as a certain type which is not always correct.
For instance: "Congress" seems to always identify as an organization [government] even when it is part of an event name. The name "WordCamp" shows as a location, but it is an event.
Is there a way to train the Natural Language engine or provide a custom set of organizations, locations, events, etc. so that it provides more accurate type information for entities that are not extremely popular?
I am the Product manager for this product. Custom entity types are not currently supported. As per your comment about not getting some entity types right, this is true for any NLP system but our goal is to keep improving. We are working on ways for you to provide us feedback on instances that we get wrong to improve our accuracy and will share the details shortly. Note we have trained our models on multiple data sources and not just Wikipedia data. The API returns the most relevant Wikipedia article for an entity detected so if an entity has multiple interpretations, we will only return the most commonly used interpretation.