I'm developing a skill; I have an interaction model and 3 custom slot types. When I had toy values for the slot types everything seemed to work, but when I paste the real values for the custom slot types and hit save, I first get a spinning wheel saying "Updating interaction model", then another one saying "Please wait while model is being built...". After a couple of minutes I get a red error message that says: Error: Failed building the interaction model.
I pasted about 100 utterances and about 30, 300 and 30,000 values in my custom slot types. According to the documentation: A skill can have a total of 50,000 custom slot values, totaled across all custom slots used in the interaction model. https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-interaction-model-reference
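If it helps to rule out the 50,000-value limit, here is a minimal sketch (assuming the interaction model is exported as JSON in the usual interactionModel.languageModel.types layout; the file name is a placeholder) that totals the values across all custom slot types:

```python
import json

# Count the total number of custom slot values across all slot types,
# to check against the documented 50,000-value limit.
# Assumes the exported JSON layout: interactionModel.languageModel.types[*].values
with open("interaction_model.json") as f:   # placeholder file name
    model = json.load(f)

types = model["interactionModel"]["languageModel"]["types"]
total = 0
for slot_type in types:
    count = len(slot_type.get("values", []))
    print(f"{slot_type['name']}: {count} values")
    total += count

print(f"Total custom slot values: {total} (documented limit: 50,000)")
```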
The interaction model builder has been known to break. Sometimes you have to wait a day or two for it to reset. Complaining on the developer forum or submitting a "Contact Us" request can sometimes get action.
You don't say which of your 30-, 300-, and 30,000-value slot types work and which don't. A point to bear in mind, though, is that after a couple hundred values, the quality of recognition only improves slightly. The list of words you give for a custom slot is "advice"; it isn't a hard-and-fast list, and Alexa may return words that are not on it. The more words you have, the wider and more arbitrary the input it will return. So, although you can submit 50,000 words, it is seldom profitable to do so.
Related
I have a few intents which are getting triggered on inappropriate user input. Below are a few examples.
Intent 1) Training phrases I have given:
When will I get a job abroad?
Is there any possibility that I will be settled in foreign
When will I settle in foreign
This intent is getting called for the user input "I had a fight with my friend, will it settle down".
Intent 2) Training phrases I have given:
When my financial problems will over
Tell me about my financial condition
How will be my financial condition in the future
What will be my financial condition
This intent is getting called for the user input "When my family problems will over".
Please help me out to handle these scenarios.
According to this documentation, you should use at least 10-20 training phrases.
You don't have to define every possible example, because Dialogflow's built-in machine learning expands on your list with other, similar phrases. You should create at least 10-20 (depending on complexity of intent) training phrases, so your agent can recognize a variety of end-user expressions. For example, if you want your intent to recognize an end-user's expression about their favorite color, you could define the following training phrases:
"I like red"
"My favorite color is yellow"
"black"
"Blue is my favorite"
...
Given that, to make your intents more accurate I'd recommend creating more training phrases and focusing them on the main terms relevant to your problem.
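As a rough illustration of that last point (the templates, topics, and phrasing below are invented for the example, not taken from Dialogflow), you can draft extra training phrases around an intent's key terms before pasting them into the console:

```python
# Quick way to draft extra training phrases around an intent's key terms.
# The templates and topic fillers are illustrative only.
templates = [
    "When will my {topic} problems be over",
    "Tell me about my {topic} situation",
    "How will my {topic} condition be in the future",
    "Will my {topic} troubles improve soon",
]
topics = ["financial", "money", "income", "debt"]

phrases = [t.format(topic=topic) for t in templates for topic in topics]
for p in phrases:
    print(p)
```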
I have some questions about best practices for certain issues we are facing using LUIS and QnA Maker, in particular with the Dispatcher:
1) Is there any best practice in case we have more than 15k utterances in the Dispatcher? That looks like a limitation of LUIS apps, and the scalability of the model in the long run is questionable.
2) Bing Spell Check for LUIS changes names and surnames, for example; how can we avoid this? I guess Bing Spell Check is necessary when we are talking about chatbots, since typos are always around the corner, but using it on names is dangerous.
3) Cross-validation is not supported out of the box. You have to split your data into folds with custom code (not difficult; see the sketch after this list), use the command line to train and publish your model on k-1 of the k folds, then send the held-out fold's utterances to the API one by one. Batch upload is only supported through the UI https://cognitive.uservoice.com/forums/551524-language-understanding-luis/suggestions/20082157-add-api-to-batch-test-model and is limited to a test set of 1,000 utterances. If we use the one-by-one approach, we pay $1.50 per 1k transactions https://azure.microsoft.com/de-de/pricing/details/cognitive-services/language-understanding-intelligent-services/, which means that getting cross-validation metrics for 5 folds could cost about $20 for a single experiment with our current data, and more as we add more data.
4) The model is a black box, which doesn't give us the ability to use custom features if needed.
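For point 3, a minimal fold-splitting sketch (assuming the labeled utterances live in a simple JSON list; the file names and field layout are placeholders) could look like this:

```python
import json
from sklearn.model_selection import KFold

# Split labeled utterances into k folds for manual cross-validation.
# "utterances.json" is a placeholder; each item is assumed to look like
# {"text": "...", "intent": "..."}.
with open("utterances.json") as f:
    data = json.load(f)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for i, (train_idx, test_idx) in enumerate(kfold.split(data)):
    train = [data[j] for j in train_idx]
    test = [data[j] for j in test_idx]
    with open(f"train_fold_{i}.json", "w") as f:
        json.dump(train, f, indent=2)
    with open(f"test_fold_{i}.json", "w") as f:
        json.dump(test, f, indent=2)
    # train_fold_i.json is then trained/published via the CLI or portal,
    # and each utterance in test_fold_i.json is sent to the endpoint.
```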
I will try to address your concerns as best I can:
1) As per the LUIS documentation, the training set of a LUIS app is limited to 15,000 utterances, so you cannot exceed that limit. In the case of Dispatch apps, if the total number of utterances exceeds 15k, Dispatch will down-sample the utterances to keep the count under 15k. There is an optional CLI parameter (--doAutoActiveLearning) that enables auto active learning, which down-samples intelligently by removing less relevant utterances.
--doAutoActiveLearning: (optional) Default to false. LUIS limit on training-set size is 15000. When a LUIS app has much more utterances for training, Dispatch's auto active learning process can intelligently down sample the utterances.
2) Bing Spell Check corrects misspelled words in utterances before LUIS predicts the score and entities of the utterance. However, if you want to avoid using the Bing Spell Check API service, you will need to teach LUIS both the correct and incorrect spellings, which can be done in two ways:
Label example utterances that have all the different spellings so that LUIS can learn proper spelling as well as typos. This option requires more labeling effort than using a spell checker.
Create a phrase list with all variations of the word. With this solution, you do not need to label the word variations in the example utterances.
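As a small sketch of the phrase-list option (the generation rules and the example term are illustrative only; the resulting variations would still be added to a phrase list in the LUIS portal):

```python
# Generate simple spelling variants of a term to seed a LUIS phrase list
# (one letter dropped, adjacent letters swapped). The term is just an example.
def spelling_variants(word):
    variants = set()
    for i in range(len(word)):
        variants.add(word[:i] + word[i + 1:])                           # drop a letter
    for i in range(len(word) - 1):
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])   # swap neighbors
    variants.discard(word)
    return sorted(variants)

print(spelling_variants("balance"))
```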
3) As per the current documentation, a maximum of 1,000 utterances is allowed per test: the data set is a JSON-formatted file containing a maximum of 1,000 labeled, non-duplicate utterances. You can test up to 10 data sets in an app; if you need to test more, delete a data set and then add a new one. I would suggest reporting this as a feature request in the feedback forum.
Hope this helps.
I know that I can make a None intent to cover some of these; however, we cannot just create every nonsense question a person could ask.
Or even if someone types in a 50-word statement. The bigger problem is that when we send a query to LUIS, it assigns an intent that is not correct, without having identified any entities either.
What to do?
To handle these cases, it is better to add more labeled utterances to your other intents and occasionally add the stray utterances to the None intent. When the model becomes better at predicting your non-None intents, better prediction of the None intent follows (LUIS attempts to match an utterance to an intent rather than ruling intents out).
If intents are triggering without any entities being recognized (and thus you believe the wrong intent has been triggered), this should be handled at the application level, where you disambiguate the intents back to your users. If you've set the verbose flag to true, you can take the top three scoring intents and present them back as options to the user; then you can move back into the proper dialog.
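A minimal sketch of that disambiguation step (the intent names and threshold are made up; the response dict mimics the shape returned by a verbose LUIS query):

```python
# App-level disambiguation using a verbose LUIS result.
luis_result = {
    "query": "when will my problems be over",
    "topScoringIntent": {"intent": "FinancialOutlook", "score": 0.42},
    "intents": [
        {"intent": "FinancialOutlook", "score": 0.42},
        {"intent": "FamilyAdvice", "score": 0.38},
        {"intent": "None", "score": 0.11},
    ],
}

CONFIDENCE_THRESHOLD = 0.7   # tune for your model

top = luis_result["topScoringIntent"]
if top["score"] < CONFIDENCE_THRESHOLD:
    options = [i["intent"] for i in luis_result["intents"][:3] if i["intent"] != "None"]
    print("Did you mean one of these?", ", ".join(options))
else:
    print("Routing to dialog:", top["intent"])
```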
After you've moved into the intent/dialog the user meant to access, you can make a programmatic API call to add that utterance to the intent. Individually adding labeled utterances can be problematic (the programmatic API key has a limit of 100,000 transactions per month and a rate of 10 transactions per second), so you can instead aggregate the utterances and do batch labeling. One additional bit of info: there is a limit of 100 labeled utterances per batch upload.
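A hedged sketch of that batch approach, chunking aggregated utterances into groups of 100 (the endpoint path, payload fields, key, and region below are written from memory of the v2.0 authoring API and are assumptions to verify against the current reference):

```python
import requests

# Aggregate utterances and push them in chunks of 100 (the documented per-batch limit).
AUTHORING_KEY = "<your-authoring-key>"
REGION = "westus"            # placeholder
APP_ID = "<app-id>"
VERSION = "0.1"

url = (f"https://{REGION}.api.cognitive.microsoft.com/luis/api/v2.0/"
       f"apps/{APP_ID}/versions/{VERSION}/examples")
headers = {"Ocp-Apim-Subscription-Key": AUTHORING_KEY}

# Utterances collected from the disambiguation flow (illustrative data).
pending = [
    {"text": "when will my family problems be over",
     "intentName": "FamilyAdvice", "entityLabels": []},
    # ... more aggregated utterances
]

for start in range(0, len(pending), 100):
    chunk = pending[start:start + 100]
    resp = requests.post(url, headers=headers, json=chunk)
    resp.raise_for_status()
```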
Adding to Steven's answer: in the intent window you have the Suggested Utterances tab; this is also a hint for the algorithm, a kind of reinforcement-style learning approach.
I am a newbie to this and have started learning Spark. I have a general question about how recommendation systems work in production environments, or rather how they are deployed to production.
Below is a small example of a system for e-commerce website.
I understand that once the system is built, we can initially feed the data to the engine (run the jobs or the program/process for the engine) and it will produce results, which are stored back to the database against each user. The next time the user logs in, the website can fetch the data previously computed by the engine from the database and show it as recommended items.
The confusion I have is how these systems generate outputs on the fly based on the user's activity, e.g. if I view a video on YouTube and refresh the page, YouTube starts showing me similar videos.
So, do we have these recommendation engines running in the background all the time, continually updating the results based on the user's activity? How is it done so fast?
Short answer:
Inference is fast; training is the slow part.
Retraining is a tweak to the existing model, not from scratch.
Retraining is only periodic, rarely "on demand".
Long answer:
They generate this output based on a long baseline of user activity. This is not something the engine derives from your recent view: anyone viewing that video on the same day will get the same recommendations.
Some recommender systems will take your personal history into account, but this is usually just a matter of sorting the recommended list with respect to the predicted ratings for you personally. This is done with a generic model from your own ratings by genre and other characteristics of each video.
In general, updates are done once a day, and the changes are small. They don't retrain a model from scratch every time you make a request; the parameters and weights of your personal model are stored under your account. When you add a rating (not just viewing a video), your account gets flagged, and the model will be updated at the next convenient opportunity.
This update will begin with your current model, and will need to run for only a couple of epochs -- unless you've rated a significant number of videos with ratings that depart significantly from the previous predictions.
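As a toy illustration of that kind of warm-start update (not any production system's actual code): with a matrix-factorization model you keep the item factors fixed and nudge only the returning user's vector for a few epochs.

```python
import numpy as np

# Toy warm-start update: item factors stay fixed, only the user's vector
# is nudged for a few epochs using the user's new ratings.
rng = np.random.default_rng(0)
n_items, dim = 1000, 32
item_factors = rng.normal(scale=0.1, size=(n_items, dim))   # pretrained, fixed
user_vector = rng.normal(scale=0.1, size=dim)                # stored per account

new_ratings = {12: 5.0, 87: 1.0, 203: 4.0}   # item_id -> rating (example data)
lr, reg = 0.05, 0.01

for epoch in range(3):                        # "only a couple of epochs"
    for item_id, rating in new_ratings.items():
        pred = user_vector @ item_factors[item_id]
        err = rating - pred
        user_vector += lr * (err * item_factors[item_id] - reg * user_vector)

scores = item_factors @ user_vector           # re-rank all items for this user
print(np.argsort(scores)[::-1][:10])
```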
To add to Prune's answer, depending on the design of the system it may also be possible to take into account the user's most recent interactions.
There are two main ways of doing so:
Fold-in: your most recent interactions are used to recompute the parameters of your personal model while leaving the remainder of the model unchanged. This is usually quite fast and could be done in response to every request.
A model that takes interactions as input directly: some models can compute recommendations directly by taking a user's interactions as input, without storing user representations that can only be updated via retraining. For example, you can represent the user by simply averaging the representations of the items she most recently interacted with. Recomputing this representation can be done in response to every request.
In fact, this last approach seems to form part of the YouTube system. You can find details in the Covington et al. Deep Neural Networks For YouTube Recommendations paper.
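For intuition, here is a toy sketch of both approaches with fixed item vectors (all names and numbers are invented, not from any specific production system):

```python
import numpy as np

# Two toy ways to refresh a user's representation from recent interactions,
# holding the item vectors fixed.
rng = np.random.default_rng(1)
n_items, dim = 500, 16
item_vecs = rng.normal(size=(n_items, dim))          # fixed item representations

recent_items = [3, 42, 7]                            # most recent interactions
recent_ratings = np.array([5.0, 4.0, 2.0])

# (1) Fold-in: solve a small ridge regression for this user's vector only.
V = item_vecs[recent_items]
reg = 0.1
user_foldin = np.linalg.solve(V.T @ V + reg * np.eye(dim), V.T @ recent_ratings)

# (2) Averaging: represent the user as the mean of recently seen item vectors.
user_avg = item_vecs[recent_items].mean(axis=0)

# Either vector can be scored against all items per request.
print((item_vecs @ user_foldin).argsort()[::-1][:5])
print((item_vecs @ user_avg).argsort()[::-1][:5])
```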
I have an app that extracts information from incoming messages. The messages all contain the same information, but they have different forms depending on the source that sent them.
Example:
Message from source A :
A: You spent $50.00 at Macy's on 2/20/12
Message from source B :
Purchase, $50.00, Macy's, 2Feb2012, Balance $5000.00
Every message from a single source has the same form, though. So at the moment I'm doing it by writing a set of regular expressions to first identify which message I'm trying to decode (i.e. what source it came from, so I know what the form of the message is), and then extracting the necessary information from it (in the above example, the transaction amount, the store where the transaction happened, and the date).

If I discover a new source, or a source changes the format of its messages (which doesn't happen often, but could), I need to manually write the regular expressions for that message. I'm sure, however, that I could automate this using some kind of machine learning technique. I just don't know much about machine learning, and I don't know where to even start looking for a technique that would apply to my problem. I would like someone to just point me in the right direction on where to start reading.
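For reference, a minimal sketch of the regex-per-source approach described above, for the two example messages (patterns and field names are illustrative only):

```python
import re

# One pattern per known source; trying each pattern both identifies the source
# and extracts the fields.
SOURCE_PATTERNS = {
    "A": re.compile(
        r"You spent \$(?P<amount>[\d.]+) at (?P<store>.+?) on (?P<date>\S+)"),
    "B": re.compile(
        r"Purchase, \$(?P<amount>[\d.]+), (?P<store>[^,]+), (?P<date>[^,]+),"),
}

def extract(message):
    for source, pattern in SOURCE_PATTERNS.items():
        match = pattern.search(message)
        if match:
            return source, match.groupdict()
    return None, {}

print(extract("A: You spent $50.00 at Macy's on 2/20/12"))
print(extract("Purchase, $50.00, Macy's, 2Feb2012, Balance $5000.00"))
```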
In order to detect and label amounts, dates, person names and similar information you can use a technique called Named Entity Recognition. The Stanford Named Entity Recognizer comes with pretrained, ready to use models.
You can also use whatever labeled data you have generated so far to learn a custom model for your application. The standard techniques used for this purpose are Conditional Random Fields or the sequence perceptron. There are many toolkits implementing these models, including:
Wapiti - A simple and fast discriminative sequence labelling toolkit.
Sequor - sequence labeler based on Collins's (2002) perceptron.
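As a small sketch of the CRF route, here is what sequence labeling looks like with sklearn-crfsuite, another toolkit in the same family as those listed above (the features, tags, and training sentences are made up):

```python
import sklearn_crfsuite

# Minimal CRF sequence-labeling sketch for message field extraction.
def token_features(tokens, i):
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_digit": tok.isdigit(),
        "has_dollar": "$" in tok,
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

train_sentences = [
    (["You", "spent", "$50.00", "at", "Macy's", "on", "2/20/12"],
     ["O", "O", "AMOUNT", "O", "STORE", "O", "DATE"]),
    (["Purchase", "$50.00", "Macy's", "2Feb2012"],
     ["O", "AMOUNT", "STORE", "DATE"]),
]

X_train = [[token_features(toks, i) for i in range(len(toks))]
           for toks, _ in train_sentences]
y_train = [tags for _, tags in train_sentences]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=100)
crf.fit(X_train, y_train)

test = ["You", "spent", "$12.30", "at", "Target", "on", "3/01/12"]
print(crf.predict([[token_features(test, i) for i in range(len(test))]]))
```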