Limitations of an agent in Dialogflow - dialogflow-es

We recently added hundreds of pages of training phrases to our intents, hoping to get more accurate intent matching. Ironically, this didn't help much; instead, it caused the agent to start behaving incorrectly.
Adding a lot of training phrases was encouraged by the Dialogflow team, but it seems there is still a limit.
I am now trying to restore the agent in Dialogflow from a backup that unzips to around 70 MB, but I am getting an error saying:
Unzipped file size is too big! We allow maximum 50 MBs.
It seems we pushed the system to the limit again, but we didn't know it would actually break everything.
Is there a way to remove those limitations for our specific agent? Or is there a way for us to know all of these seemingly hidden limitations up front?
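For reference, the unzipped size that the restore step complains about can be checked locally before uploading. Below is a minimal sketch using Python's zipfile module; the 50 MB figure is the one quoted in the error above, and the backup file name is just a placeholder:

    import zipfile

    LIMIT_MB = 50  # limit quoted in the restore error message

    def unzipped_size_mb(path):
        """Sum the uncompressed sizes of all files in an agent export zip."""
        with zipfile.ZipFile(path) as zf:
            return sum(info.file_size for info in zf.infolist()) / (1024 * 1024)

    size = unzipped_size_mb("agent-backup.zip")  # placeholder file name
    print(f"Unzipped size: {size:.1f} MB (restore limit: {LIMIT_MB} MB)")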

Related

Google Postmaster Tools, No Data to Display Message

I added my domain to Postmaster Tools and have verified it.
Unfortunately, when I try to check the data (it has already been 3 days), it shows the "No data to display at this time. Please come back later." message.
SPF and DKIM are already set up correctly (my Check MX result confirms this).
Anybody know how to solve this issue?
As far as I’m aware, you need to have a minimum level of activity (of the order of several hundred messages) before it will show up. I’m not sure why, but it may be to limit the ability to identify individuals. My own very active accounts still get some of these “no data” days.
Also bear in mind that Google Postmaster Tools is a buggy mess that hardly works at the best of times; for example, a spam rating of “bad” will often coincide with a spam reporting rate of zero. It’s also about the only Google service that has no support channel whatsoever.

Data lost a while after adding it in Dialogflow - is it a storage or server problem?

We have been working with Dialogflow for a year now, and lately we have experienced some issues with stability, changed behavior, and new features causing problems. Among those problems: when we add a synonym to an entity or a training phrase, hit save, wait for it to train, and then refresh, all the newly added items are gone. It seems that Dialogflow is experiencing storage issues, and we are losing time re-adding those items and retraining, which is frustrating.
We have tried to troubleshoot this for more than a week and those issues are still there:
Example 1: Synonyms already exist, but Dialogflow treats them as if they don't exist when visiting the "Validation" option.
Example 2: Adding new synonyms, saving and training; after a while, they disappear.
Example 3: The Dialogflow server is unavailable most of the time.
Dialogflow Support Team, please help us check these issues.
Thank you.
If you are an Enterprise customer, I would suggest contacting them directly; if not, you should probably switch to Enterprise for better customer support. Since you have been using Dialogflow for a year now, you might have reached some limits. For example, a single agent cannot have more than 1 million entity reference values and synonyms. For more information on quotas and limits, check the following link: Doc
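As a rough way to see how close an agent is to that entity quota, something like the sketch below can tally values and synonyms from an exported agent zip. It assumes the standard Dialogflow ES export layout, where entity entries live in entities/<name>_entries_<lang>.json as a list of objects with "value" and "synonyms" fields; treat it as an estimate only.

    import json
    import zipfile

    def count_entity_entries(export_path):
        """Tally entity reference values and synonyms in a Dialogflow ES export.

        Assumes entries files named entities/<name>_entries_<lang>.json, each
        containing a list of {"value": ..., "synonyms": [...]} objects.
        Note: the synonyms list usually repeats the value itself, so the
        combined total overcounts slightly.
        """
        values, synonyms = 0, 0
        with zipfile.ZipFile(export_path) as zf:
            for name in zf.namelist():
                if name.startswith("entities/") and "_entries_" in name:
                    for entry in json.loads(zf.read(name)):
                        values += 1
                        synonyms += len(entry.get("synonyms", []))
        return values, synonyms

    values, synonyms = count_entity_entries("agent-backup.zip")  # placeholder file name
    print(f"{values} reference values, {synonyms} synonyms (quota: 1,000,000 combined)")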

QnA Maker - Different Results between REST API and Preview Page

I'm using Azure QnA Maker version 4 and posting via the REST API.
If I post against the live database using the parameter isTest=true, I get an answer score of around 80%, which is very reasonable as my question almost matches the knowledge base. I get exactly the same result using the web interface on qnamaker.ai.
Using the same POST against the published version (without isTest=true), I get a score of only around 13% (which is very odd, given that the question I'm entering almost exactly matches the knowledge base).
I've found some hints in the FAQs that slight differences are normal, but I don't think a 67-point difference is normal. Is there anything I can do so that the published version gets scores closer to the test version?
Pursang makes a good point in his answer.
A good way to work around this problem is adding "isTest": true to the QnA Maker POST request body. It has worked for me.
It's a QnA Maker bug that shows up when you have to add multiple knowledge bases...
{"question":"your question here", "top":3, "isTest": true }
Good Luck!
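For context, here is a minimal sketch of such a generateAnswer call with isTest set, in Python; the endpoint host, knowledge base ID, and endpoint key are placeholders you would replace with values from your own QnA Maker resource:

    import requests

    # Placeholders - take these from your own QnA Maker resource.
    ENDPOINT = "https://<your-resource>.azurewebsites.net"
    KB_ID = "<knowledge-base-id>"
    ENDPOINT_KEY = "<endpoint-key>"

    resp = requests.post(
        f"{ENDPOINT}/qnamaker/knowledgebases/{KB_ID}/generateAnswer",
        headers={"Authorization": f"EndpointKey {ENDPOINT_KEY}"},
        json={"question": "your question here", "top": 3, "isTest": True},
    )
    for answer in resp.json().get("answers", []):
        print(answer["score"], answer["answer"])

Keep in mind that isTest routes the query to the test index, so the scores will match the portal but not necessarily what production users see.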
The test version and the published version are two different knowledge bases. This allows you to make changes and test them without affecting the live knowledge base that your customers are using. If you're getting worse results with your published knowledge base than your test version, that seems to indicate that you've trained your test knowledge base after you've published. Publishing again may fix the issue.
If you publish again and your published version still doesn't seem to behave the same as the test version, consider this entry in the FAQ:
The updates that I made to my knowledge base are not reflected on publish. Why not?
Every edit operation, whether in a table update, test, or settings, needs to be saved before it can be published. Be sure to click the Save and train button after every edit operation.
I had the same exact problem. It was related to something going wrong when I created the QnA service in Azure. The language of your QnA knowledge base is automatically detected. You can see your language in your Azure Search resource => testkb => Fields => question/answer MSDN
Mine was set to Standard-Lucene instead of German-Microsoft. I did not find any way to change that, so I had to recreate the QnA service and move all knowledge bases there.
I'm using a QnA service created in February this year. There are discrepancies between the test version (QnA portal) and the published version (API). A correct answer will drop by 10% while a bad answer rises by 10%, which ultimately turns good matches in test into bad ones in the bot application. Try explaining that to your customer.
It appears that you can run into this trouble if you use multiple KBs (= knowledge bases) on a single search service. The test index is a single index that covers all your KBs for that search service, while production KBs, when published, are indexed separately per KB. The QnA Maker help bot on the QnA portal mentions this:
"The top answer can sometimes vary because of small score variations between the test and production indexes. The test chat in portal hits the test index, and the generateAnswer API hits the production index. This typically happens when you have multiple knowledge bases in the same QnA Maker service. Learn more about confidence score differences.
This happens because all test knowledge bases are combined into a single index, while prod knowledge bases are on separate indexes. We can help you by separating all test and prod into separate indexes for your service."
So do we need to contact Microsoft to also split up the test index per KB? And will that rectify any discrepancies between the test and published versions? I haven't tried this yet; has anyone else?
Or do we limit ourselves to a single KB per search service (= multiple search services = expensive)?
Or do we put everything into a single KB, use metadata to logically separate the answers, and hope that this single massive KB produces good enough results?
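For the single-KB-with-metadata option, the generateAnswer request body also accepts metadata filters via strictFilters; a rough sketch follows, where the "product"/"billing" metadata pair and the resource placeholders are made up for illustration:

    import requests

    # Placeholders - take these from your own QnA Maker resource.
    ENDPOINT = "https://<your-resource>.azurewebsites.net"
    KB_ID = "<knowledge-base-id>"
    ENDPOINT_KEY = "<endpoint-key>"

    body = {
        "question": "your question here",
        "top": 3,
        # Hypothetical metadata pair used to partition one large KB
        # into logical sections; tag your QnA pairs accordingly.
        "strictFilters": [{"name": "product", "value": "billing"}],
    }
    resp = requests.post(
        f"{ENDPOINT}/qnamaker/knowledgebases/{KB_ID}/generateAnswer",
        headers={"Authorization": f"EndpointKey {ENDPOINT_KEY}"},
        json=body,
    )
    print(resp.json())

Whether the scores from one big filtered KB are good enough is exactly the open question above, so this is only a starting point for experimenting.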

Reduce latency of Bot Connector

I have noticed that there is always a latency of about two to three seconds when sending messages through the Bot Connector of Microsoft's Bot Framework, independent of which channel type I'm using.
This means that if I call the POST .../messages API method of my bot directly (i.e., not going through the Bot Connector), I get an answer within several dozen milliseconds. However, if messages are routed through the Bot Connector (e.g., when I use Direct Line, Telegram, or any other supported channel), it always takes about two to three seconds until I get an answer.
For an end user this would not be a good experience, so I'm wondering whether I'm doing something wrong (e.g., in the Bot Connector settings) or whether this is a general problem that will be improved at a later point in time.
Thanks a lot in advance.
This is a known issue. The BotFramework is still in Preview, so it has yet to be optimized. Expect to see significant performance improvements in the near future.
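For anyone wanting to quantify the overhead described in the question, a crude timing sketch for the direct endpoint is below; the bot URL and payload are placeholders, any authentication your endpoint requires still has to be added, and the connector-side number would come from repeating the measurement through a channel client such as Direct Line:

    import time

    import requests

    BOT_MESSAGES_URL = "https://<your-bot>/api/messages"  # placeholder direct endpoint
    activity = {"type": "message", "from": {"id": "latency-test"}, "text": "ping"}

    samples = []
    for _ in range(10):
        start = time.perf_counter()
        # Add whatever auth headers your bot endpoint expects.
        requests.post(BOT_MESSAGES_URL, json=activity, timeout=10)
        samples.append(time.perf_counter() - start)

    median_ms = sorted(samples)[len(samples) // 2] * 1000
    print(f"direct endpoint median latency: {median_ms:.0f} ms")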

Using Datamining/Statistics for Log Monitoring

I have a large set of log files that I want to characterize, or possibly run some kind of decision tree or other analytics over, but I don't know exactly what. What kind of analysis have you done with log files, especially large numbers of them?
For example, so far I am counting how many requests are made to a particular page in a given log file:
Servlet = 60 requests
Servlet2 = 70 requests, etc.
From there, I guess I could filter down to only the most popular requests. I might also compute a rate, e.g. 60 requests over a 2-hour period, i.e. 60 / 120 minutes = 0.5 requests per minute.
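A minimal sketch of that per-page count and rate calculation, assuming a Common Log Format-style access log where the request path is the seventh whitespace-separated field (adjust the parsing to your actual log layout):

    import collections
    import sys

    # Count requests per path in an access log. Assumes lines like:
    # host ident user [timestamp] "METHOD /path HTTP/1.1" status bytes
    counts = collections.Counter()
    with open(sys.argv[1]) as log:
        for line in log:
            parts = line.split()
            if len(parts) > 6:
                counts[parts[6]] += 1  # the path inside the quoted request

    window_minutes = 120  # e.g. a 2-hour slice of the log
    for path, n in counts.most_common(10):
        print(f"{path}: {n} requests ({n / window_minutes:.2f} per minute)")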
Deciding what analysis to do depends on what decisions you're trying to make based on that analysis. For example, I currently monitor logs for exceptions reported by our application (all exceptions in the client application are logged with the server) to decide what should be high priority client bugs to investigate. I also use log searching software to monitor for any Exceptions reported by our server software which may need more immediate investigation. On top of the logs generated by everything anyway, I also use some monitoring software to track usage of our web server and database server which records usage stats etc. in a database. The final aim of this is to predict future usage levels and purchase more hardware as appropriate to keep up with demand.
Two (free) tools I've been using are:
Hyperic for monitoring: it's pretty easy to set up and can start logging a lot of data you may be interested in, e.g. requests per second on a web server.
Splunk for searching log files: it's very easy to get set up and work with, and it gives you excellent searching capabilities over your log files. If you're working with log files right now and haven't tried Splunk, I definitely recommend it. One word of warning: I noticed a couple of moments of 100% CPU while using it on our main production server, so I recently stopped running it on that machine.
I'm not sure what your aim is with this analysis; mine has been very much about looking for errors I should know about and planning for future capacity needs. If you're interested in the latter, I'd also recommend The Art of Capacity Planning.
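Along the same lines as the request counts above, the exception-triage idea from this answer can be approximated with a short script, assuming the application writes Java-style exception lines such as "java.lang.NullPointerException: ..."; the regex is a rough guess and would need adapting to your logging format:

    import collections
    import re
    import sys

    # Tally exception types across one or more log files so the most frequent
    # ones can be prioritised for investigation.
    EXCEPTION_RE = re.compile(r"\b([\w.]+(?:Exception|Error))\b")

    counts = collections.Counter()
    for path in sys.argv[1:]:
        with open(path, errors="replace") as log:
            for line in log:
                match = EXCEPTION_RE.search(line)
                if match:
                    counts[match.group(1)] += 1

    for exc, n in counts.most_common(20):
        print(f"{n:6d}  {exc}")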
