I have been through the documentation of both AWS Cognito and Azure Comprehend, trying to understand the accuracy, or the TPR and FPR, of the two services when it comes to identifying PII and PHI inside a document without performing custom training. Unfortunately, I wasn't able to find any numbers, and I do not have enough data to build my own confusion matrix. Does anyone have an idea, even an indicative one, of their performance?
Thanks!
The Matchbox recommender available in http://studio.azureml.net doesn't seem to have a counterpart in http://ml.azure.com (which appears to be the newer portal for Azure ML). Only the plain SVD recommender is available there, and it doesn't take user or item features, so this is a feature that has been taken away compared to Matchbox.
Is there an ETA for when Matchbox will be made available in the Azure Machine Learning service, either via the SDK or the designer?
Thanks.
We don't have a plan to bring back Matchbox yet. If you are looking for recommender algorithms, this repo could be a good reference for best practices: https://github.com/Microsoft/Recommenders. Please let me know if this unblocks you, or if you are looking for something specific in Matchbox.
We are building a platform that aims to deliver machine learning solutions for large enterprises with significant data security concerns. All model training is done on premises, with restrictions on the nature of the data used for training. Once the model is completed, I am looking to deploy it to the cloud with standard security/audit controls (IP whitelists, access tokens, logs).
I believe the features can be completely anonymized (normalized, PCA, etc.) to provide an additional layer of security. Is there any way the data sent to the cloud-based ML model can lead back to the original data?
While I have reviewed other questions around model deployment, this aspect of security isn't handled specifically.
https://dzone.com/articles/security-attacks-analysis-of-machine-learning-mode
(The concern is not availability or model distortion, but confidential data.)
Again, the idea is to retain training and data on premises, with only the deployment in the cloud for speed, flexibility and availability.
Is there any way the data sent to the cloud-based ML model can lead back to the original data?
Any function that has an inverse can lead back to the original data. The risk is not just from a random person viewing the data, but from an insider threat within the team. Here is an example:
How to reverse PCA and reconstruct original variables from several principal components?
Depending on the number of principal components, it may also be possible to brute-force guess the eigenvectors.
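To make the PCA point concrete, here is a minimal sketch (synthetic data, scikit-learn assumed) showing that anyone who holds the fitted components can map the "anonymized" features back to an approximation of the originals:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for "anonymized" features: original data reduced with PCA.
rng = np.random.default_rng(0)
X_original = rng.normal(size=(100, 5))

pca = PCA(n_components=4)               # keep most of the variance
X_anonymized = pca.fit_transform(X_original)

# Anyone holding the fitted components (e.g. an insider) can invert the mapping.
X_reconstructed = pca.inverse_transform(X_anonymized)

# Reconstruction error is small when few components were dropped.
print(np.abs(X_original - X_reconstructed).mean())
```

The fewer components you drop, the closer the reconstruction; normalization is just as easily undone if the scaling parameters leak.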
I have a TensorFlow object detection model deployed on Google Cloud Platform's ML Engine. I have come across posts suggesting TensorFlow Serving + Docker for better performance. I am new to TensorFlow and want to know the best way to serve predictions. Currently, the ML Engine online predictions have a latency of >50 seconds. My use case is a user uploading pictures through a mobile app and then getting a suitable response based on the prediction result, so I am expecting the prediction latency to come down to 2-3 seconds. What else can I do to make the predictions faster?
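For reference, a TensorFlow Serving container is usually queried over its REST API roughly like this; the model name "detector", port 8501 and the 224x224x3 input shape are placeholder assumptions, not details from the question:

```python
# Assumes the exported SavedModel is served with the official image, e.g.:
#   docker run -p 8501:8501 \
#     -v /path/to/saved_model:/models/detector \
#     -e MODEL_NAME=detector tensorflow/serving
import json
import requests

# One dummy 224x224 RGB image; a real client would send the uploaded picture.
image = [[[0, 0, 0] for _ in range(224)] for _ in range(224)]
payload = {"instances": [image]}

response = requests.post(
    "http://localhost:8501/v1/models/detector:predict",
    data=json.dumps(payload),
    timeout=30,
)
print(response.json())
```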
Google Cloud ML Engine has recently released GPU support for Online Prediction (Alpha). I believe that our offering may provide the performance improvements you're looking for. Feel free to sign up here: https://docs.google.com/forms/d/e/1FAIpQLSexO16ULcQP7tiCM3Fqq9i6RRIOtDl1WUgM4O9tERs-QXu4RQ/viewform?usp=sf_link
I want to build a chatbot for a customer service application. I tried SaaS services like Wit.ai, Motion.ai, Api.ai, LUIS.ai, etc. These cognitive services find the "intent" and "entities" when trained with typical interactions.
I need to build the chatbot as an on-premise solution, without using any of these SaaS services.
e.g. A typical conversation would be as follows:
Can you book me a ticket?
Is my ticket booked?
What is the status of my booking BK02?
I want to cancel the booking BK02.
Book the tickets
The Stanford NLP toolkit looks promising, but there are licensing constraints, so I started experimenting with OpenNLP. I assume there are two OpenNLP tasks involved:
Use 'Document Categorizer' to find out the intent
Use 'Named Entity Recognition' to find out entities
Once the context is identified, I will call my application APIs to build the response.
Is this the right approach?
How good is OpenNLP at parsing the text?
Can I use Facebook's fastText library for intent identification? (See the sketch after this post.)
Is there any other open source library that could be helpful in building the bot?
Will SyntaxNet be useful for my adventure?
I prefer to do this in Java, but I'm open to a Node or Python solution too.
PS - I am new to NLP.
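On the fastText question: its supervised mode can serve as a simple intent classifier. A minimal sketch; the intents.txt file and the labels below are hypothetical, written in fastText's __label__ format:

```python
import fasttext

# intents.txt (hypothetical) contains one labelled example per line, e.g.:
#   __label__book_ticket Can you book me a ticket?
#   __label__booking_status What is the status of my booking BK02?
#   __label__cancel_booking I want to cancel the booking BK02.
model = fasttext.train_supervised(input="intents.txt", epoch=25, lr=1.0)

labels, probabilities = model.predict("Is my ticket booked?")
print(labels[0], probabilities[0])
```

Note that fastText only gives you the intent; entities such as the booking reference BK02 would still need a separate NER step (e.g. OpenNLP's name finder).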
Have a look at this. It describes itself as open-source language understanding for bots and a drop-in replacement for popular NLP tools like wit.ai, api.ai or LUIS:
https://rasa.ai/
Have a look at my other answer for a plan of attack when using LUIS.ai:
Creating an API for LUIS.AI or using .JSON files in order to train the bot for non-technical users
In short, use LUIS.ai and set up some intents; start with one or two and train them based on your domain. I am using ASP.NET to call the Cognitive Services API as outlined above, then customize the response via some jQuery... you could search a list of your rules in a JavaScript array whenever an intent or action is raised by the response from LUIS.
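The answer above calls LUIS from ASP.NET; purely as an illustration, the same v2.0-style REST endpoint can be queried from any language. A Python sketch with placeholder region, app ID and key (these are not from the original answer):

```python
import requests

# Placeholders: substitute your own LUIS region, app ID and subscription key.
LUIS_ENDPOINT = "https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/<app-id>"
LUIS_KEY = "<subscription-key>"

def get_intent(utterance):
    """Send an utterance to LUIS and return the top intent and entities."""
    response = requests.get(
        LUIS_ENDPOINT,
        params={"subscription-key": LUIS_KEY, "q": utterance, "verbose": "true"},
        timeout=10,
    )
    result = response.json()
    return result["topScoringIntent"]["intent"], result["entities"]

print(get_intent("Can you book me a ticket?"))
```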
If your bot is English-based, then I would use OpenNLP's sentence parser to dump the customer input into a database (I do this today). I then use the OpenNLP tokenizer and push the keywords (less the stop words) and parts of speech into a database table for keyword analysis. I have a custom sentiment model built for OpenNLP that tags each sentence with a positive, negative or neutral sentiment... you can then use this to identify negative customer service feedback. To build your own sentiment model, have a look at SentiWordNet and download its domain-agnostic data file to build and train an OpenNLP model, or have a look at this Node version:
https://www.npmjs.com/package/sentiword
Hope that helps.
I'd definitely recommend Rasa; it's great for your use case, it works on-premise easily, it handles intents and entities for you, and on top of that it has a friendly community too.
Check out my repo for an example of how to build a chatbot with Rasa that interacts with a simple database: https://github.com/nmstoker/lockebot
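For a flavour of the Rasa NLU side, here is a minimal training/parsing sketch based on the older rasa_nlu package; the file names are placeholders and the imports differ in newer Rasa releases:

```python
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

training_data = load_data("data/nlu.md")          # intent/entity examples
trainer = Trainer(config.load("nlu_config.yml"))  # pipeline definition
interpreter = trainer.train(training_data)

# Returns a dict with the predicted intent and any extracted entities.
print(interpreter.parse("I want to cancel the booking BK02."))
```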
I tried Rasa, but one glitch I found was its inability to answer unmatched/untrained user texts.
Now I'm using ChatterBot, and I'm totally in love with it.
Use ChatterBot, and host it locally using flask-chatterbot-master.
Links:
ChatterBot Installation: https://chatterbot.readthedocs.io/en/stable/setup.html
Host Locally using - flask-chatterbot-master: https://github.com/chamkank/flask-chatterbot
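A minimal ChatterBot sketch (the bot name and training exchanges are made up; the trainer API shown is the ChatterBot 1.x style):

```python
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

bot = ChatBot("BookingBot")  # hypothetical bot name

# Train on a small list of alternating user/bot exchanges.
trainer = ListTrainer(bot)
trainer.train([
    "Can you book me a ticket?",
    "Sure, which date would you like to travel?",
    "Is my ticket booked?",
    "Let me check your booking status.",
])

print(bot.get_response("Is my ticket booked?"))
```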
Cheers,
Ratnakar
With the help of the Rasa and Botkit frameworks, we can build an on-premise chatbot and NLP engine for any channel. Please follow this link for end-to-end steps on building one; it's an awesome blog that helped me create one for my office:
https://creospiders.blogspot.com/2018/03/complete-on-premise-and-fully.html
First of all, any chatbot is going to be a program that runs alongside the NLP; it's the NLP that brings the knowledge to the chatbot, and NLP rests on machine learning techniques.
There are a few reasons why on-premise chatbots are less common:
We need to build the infrastructure
We need to train the model often
But using cloud-based NLP may not provide data privacy and security, and the flexibility to include your own business logic is very limited.
Altogether, going on premises or to the cloud depends on the needs and use case of your requirements.
However, please refer to these links for end-to-end knowledge on building a chatbot on premises in very few steps, easily and fully customisable:
Complete On-Premise and Fully Customisable Chat Bot - Part 1 - Overview
Complete On-Premise and Fully Customisable Chat Bot - Part 2 - Agent Building Using Botkit
Complete On-Premise and Fully Customisable Chat Bot - Part 3 - Communicating to the Agent that has been built
Complete On-Premise and Fully Customisable Chat Bot - Part 4 - Integrating the Natural Language Processor NLP
Disclaimer: I am the author of this package.
Abodit NLP (https://nlp.abodit.com) can do what you want, but it's .NET only at present.
In particular, you can easily connect it to databases and provide custom tokens that are queries against a database. It's all strongly typed, and adding new rules is as easy as adding a method in C#.
It's also particularly adept at turning date time expressions into queries. For example "next month on a Thursday after 4pm" becomes ((((DatePart(year,[DATEFIELD])=2019) AND (DatePart(month,[DATEFIELD])=7)) AND (DatePart(dw,[DATEFIELD])=4)) AND DatePart(hour,[DATEFIELD])>=16)