Accuracy of Cognito and Comprehend for PII detection

I have been through the documentation of both AWS Cognito and Azure Comprehend, trying to understand the accuracy, or both the TPR and FPR, of the two services when it comes to identifying PII and PHI inside a document without performing custom training. Unfortunately, I wasn't able to find any numbers, and I do not have enough data to build my own confusion matrix. Does anyone have an idea, even an indicative one, of their performance?
Thanks!
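(If it helps anyone trying to produce their own rough numbers: below is a minimal sketch of how one might measure recall and count false positives on a small hand-labeled sample, assuming Amazon Comprehend's detect_pii_entities API. The labeled_docs structure and the span-overlap matching rule are illustrative assumptions, not a standard benchmark, and a true FPR would additionally require a definition of negatives, e.g. per token, which span-level output does not give directly.)

```python
# Sketch: estimate recall and count false positives for Amazon Comprehend PII
# detection on a small hand-labeled sample. The labeled_docs structure and the
# character-offset overlap rule are illustrative assumptions.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

# Each item: the document text plus the character spans hand-labeled as PII.
labeled_docs = [
    {"text": "Contact John Doe at john@example.com.", "pii_spans": [(8, 16), (20, 36)]},
]

def overlaps(span, entities):
    """True if any predicted entity overlaps the labeled character span."""
    return any(e["BeginOffset"] < span[1] and span[0] < e["EndOffset"] for e in entities)

tp = fn = fp = 0
for doc in labeled_docs:
    resp = comprehend.detect_pii_entities(Text=doc["text"], LanguageCode="en")
    entities = resp["Entities"]
    for span in doc["pii_spans"]:
        if overlaps(span, entities):
            tp += 1
        else:
            fn += 1
    # Predicted entities that overlap no labeled span count as false positives.
    fp += sum(1 for e in entities
              if not any(e["BeginOffset"] < s[1] and s[0] < e["EndOffset"]
                         for s in doc["pii_spans"]))

tpr = tp / (tp + fn) if (tp + fn) else 0.0
print(f"TPR (recall): {tpr:.2f}, false positives: {fp}")
```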

Related

Different face matching rates between Azure face verify API and Azure Demo

I am using the Azure Face API to tell two different persons' faces apart.
It was easy to use thanks to the good documentation on the Microsoft Azure API website.
But the confidence rate differs between my API call and the demo on the website: https://azure.microsoft.com/en-us/services/cognitive-services/face/#demo
My code is simple.
First I get the face ids of the two uploaded images using the face detection API.
Then I send the two face ids to the face verify API and get back a confidence rate that indicates the similarity of the two faces.
I always get a lower confidence rate from my API call than from the demo on the Azure website, about 20% less.
For example, I get 0.65123 from the API call while the demo shows a higher number like 0.85121.
This is the Azure Face API specification for verifying two faces:
https://learn.microsoft.com/en-us/rest/api/cognitiveservices/face/face/verifyfacetoface
I have no clue why this happens. I don't resize or crop the images when uploading.
I use exactly the same images for this test.
Is it possible for MS Azure to manipulate the values for their own interests?
I wonder if anyone has the same issue? If yes, please share your experience with me.
Different 'detectionModel' values can be provided. To use and compare different detection models, please refer to How to specify a detection model.
'detection_02': detection model released in May 2019 with improved accuracy compared to detection_01. When you use the Face - Detect API, you can assign the model version with the detectionModel parameter. The available values are:
detection_01
detection_02
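For illustration, here is a minimal sketch of detecting two faces with an explicit detectionModel and then verifying them, assuming the REST endpoints from the linked reference; the endpoint, key, image paths and the choice of detection_02 are placeholders. If the demo and your code end up using different model versions, differing confidence values would not be surprising.

```python
# Sketch: detect faces with an explicit detectionModel, then verify the two faceIds.
# ENDPOINT/KEY and the image paths are placeholders; check the linked Face API
# reference for the exact parameters supported by your resource/region.
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"

def detect_face_id(image_path, detection_model="detection_02"):
    """Return the first faceId found in the image, using the given detection model."""
    params = {
        "returnFaceId": "true",
        "detectionModel": detection_model,
        # The recognitionModel used at detect time can also be pinned here;
        # verify confidence depends on the features extracted during detection.
    }
    headers = {"Ocp-Apim-Subscription-Key": KEY,
               "Content-Type": "application/octet-stream"}
    with open(image_path, "rb") as f:
        resp = requests.post(f"{ENDPOINT}/face/v1.0/detect",
                             headers=headers, params=params, data=f.read())
    resp.raise_for_status()
    return resp.json()[0]["faceId"]

face_id_1 = detect_face_id("person_a.jpg")
face_id_2 = detect_face_id("person_b.jpg")

verify_resp = requests.post(
    f"{ENDPOINT}/face/v1.0/verify",
    headers={"Ocp-Apim-Subscription-Key": KEY, "Content-Type": "application/json"},
    json={"faceId1": face_id_1, "faceId2": face_id_2},
)
verify_resp.raise_for_status()
print(verify_resp.json())  # e.g. {"isIdentical": false, "confidence": 0.65}
```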

Google Cloud API sentiment analysis

I was going through the Google Cloud API for sentiment analysis. What is not clear to me is on what basis the sentiment scores and magnitudes are assigned. Is there any kind of lexicon or any kind of training data? Is there an algorithm that can clarify how the sentiment score is assigned?
I work on the Cloud Natural Language team at Google. What's mentioned in the comment above is correct: we have built models internally (neural nets) and are basically exposing them via the API.
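For reference, a minimal sketch of calling the sentiment endpoint through the official Python client; the example text is arbitrary, and the score/magnitude fields are the documented output of the pretrained models mentioned above.

```python
# Sketch: document-level sentiment via the Cloud Natural Language Python client.
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

document = language_v1.Document(
    content="The service was quick and the staff were wonderful.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)

response = client.analyze_sentiment(request={"document": document})
sentiment = response.document_sentiment
# score in [-1.0, 1.0] indicates overall polarity; magnitude (>= 0) indicates
# how much emotional content is present, regardless of sign.
print(f"score={sentiment.score:.2f} magnitude={sentiment.magnitude:.2f}")
```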

Are there any specific enterprise security concerns with deploying a Machine Learning model on the Cloud

We are building a platform that aims to deliver machine learning solutions for large enterprises with significant data security concerns. All training is done on premise, with restrictions on the nature of data used for model training. Once the model is complete, I am looking to deploy it on the cloud with standard security/audit measures (IP whitelists, access tokens, logs).
I believe the features can be completely anonymized (normalized, PCA, etc.) to provide an additional layer of security. Is there any way the data sent to the cloud-based ML model can lead back to the original data?
While I have reviewed other questions around model deployment, this aspect of security isn't addressed specifically.
https://dzone.com/articles/security-attacks-analysis-of-machine-learning-mode
(The concern is not availability or model distortion, but confidentiality of the data.)
Again, the idea is to retain the learning and the data on premise, and only do the deployment on the cloud, for speed, flexibility and availability.
Is there any way the data sent to the cloud-based ML model can lead back to the original data?
Any function that has an inverse can lead back to the original data. The risk is not just from a random person viewing the data, but from an insider threat within the team. Here is an example:
How to reverse PCA and reconstruct original variables from several principal components?
Depending on the number of principal components, it may also be possible to brute-force guess the eigenvectors.
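A minimal sketch of that point, using synthetic data: if all principal components are kept (or the fitted transform leaks), PCA is just an invertible linear map, so the "anonymized" features can be mapped straight back.

```python
# Sketch: PCA alone is not anonymization. With all components kept, the transform
# is invertible and the original features are exactly recoverable. Data is synthetic.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # stand-in for confidential features

pca = PCA()                     # all components kept: the map is invertible
Z = pca.fit_transform(X)        # the "anonymized" features sent to the cloud

X_back = pca.inverse_transform(Z)
print(np.allclose(X, X_back))   # True: nothing was actually hidden
```

Dropping components only approximates the originals, and the directions carrying most of the variance remain largely recoverable.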

Deployment of a Tensorflow object detection model and serving predictions

I have a Tensorflow object detection model deployed on Google Cloud Platform's ML Engine. I have come across posts suggesting Tensorflow Serving + Docker for better performance. I am new to Tensorflow and want to know what the best way to serve predictions is. Currently, the ML Engine online predictions have a latency of >50 seconds. My use case is a user uploading pictures through a mobile app and then getting a suitable response based on the prediction result. So, I am expecting the prediction latency to come down to 2-3 seconds. What else can I do to make the predictions faster?
Google Cloud ML Engine has recently released GPUs support for Online Prediction (Alpha). I believe that our offering may provide the performance improvements you're looking for. Feel free to sign up here: https://docs.google.com/forms/d/e/1FAIpQLSexO16ULcQP7tiCM3Fqq9i6RRIOtDl1WUgM4O9tERs-QXu4RQ/viewform?usp=sf_link
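For comparison with the ML Engine numbers, here is a minimal sketch of querying a TensorFlow Serving container over its REST API; the model name "detector", the port, and the b64-encoded-image input format are assumptions that depend on how the SavedModel's serving signature was exported.

```python
# Sketch: client-side call to a TensorFlow Serving REST endpoint. Assumes the
# SavedModel was exported and the container started with something like:
#   docker run -p 8501:8501 \
#     --mount type=bind,source=/path/to/saved_model,target=/models/detector \
#     -e MODEL_NAME=detector -t tensorflow/serving
# The model name "detector" and the encoded-image input are placeholders; the
# expected input depends on the model's serving signature.
import base64
import requests

with open("photo.jpg", "rb") as f:
    encoded_image = base64.b64encode(f.read()).decode("utf-8")

payload = {"instances": [{"b64": encoded_image}]}

resp = requests.post(
    "http://localhost:8501/v1/models/detector:predict",
    json=payload,
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["predictions"][0].keys())  # e.g. detection_boxes, detection_scores
```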

Azure ML App - Complete Experience - Train automatically and Consume

I played around a bit with Azure ML Studio. So as I understand it, the process goes like this:
a) Create training experiment. Train it with data.
b) Create Scoring experiment. This will include the trained model from the training experiment. Expose this as a service to be consumed over REST.
Maybe a stupid question, but what is the recommended way to get the complete experience, like the one I get when I use an app like https://datamarket.azure.com/dataset/amla/mba (Frequently Bought Together API built with Azure Machine Learning)?
I mean the following:
a) Expose 2 or more services - one to train the model and the other to consume (test) the trained model.
b) User periodically sends training data to train the model
c) The trained model/models now get saved and are available for consumption
d) User is now able to send a dataframe to get the predicted results.
Is there an additional wrapper that needs to be built?
If there is a link documenting this please point me to the same.
The Azure ML retraining API is designed to handle the workflow you describe:
http://azure.microsoft.com/en-us/documentation/articles/machine-learning-retrain-models-programmatically/
Hope this helps,
Roope - Microsoft Azure ML Team
You need to take a look at Azure Data Factory.
I have written a Custom Activity to do the same, and used the logic to retrain the model in that custom activity.
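To make the workflow from the question concrete, below is a rough client-side sketch: one call to kick off retraining on newly uploaded data and one call to score a data frame. The URLs, keys and payload shapes are hypothetical placeholders; the actual retraining flow (a Batch Execution Service job that writes a new trained model, followed by an update that points the scoring endpoint at it) is described in the retraining article linked above.

```python
# Sketch of the client side of the workflow in the question: one call to kick off
# retraining, one call to score. URLs, keys and payload shapes are hypothetical
# placeholders, not the exact Azure ML request formats.
import requests

TRAIN_URL = "https://<region>.services.azureml.net/workspaces/<ws>/services/<training-service>/jobs"
SCORE_URL = "https://<region>.services.azureml.net/workspaces/<ws>/services/<scoring-service>/execute"
TRAIN_KEY = "<training-api-key>"
SCORE_KEY = "<scoring-api-key>"

def submit_training_job(training_blob_location):
    """Kick off retraining on new data the user uploaded (hypothetical payload)."""
    body = {"GlobalParameters": {},
            "Input": {"ConnectionString": "<storage-connection-string>",
                      "RelativeLocation": training_blob_location}}
    resp = requests.post(TRAIN_URL, json=body,
                         headers={"Authorization": f"Bearer {TRAIN_KEY}"})
    resp.raise_for_status()
    return resp.json()  # job id to poll until the new trained model is written

def score(rows, columns):
    """Send a data frame (columns + rows) to the scoring service and return predictions."""
    body = {"Inputs": {"input1": {"ColumnNames": columns, "Values": rows}}}
    resp = requests.post(SCORE_URL, params={"api-version": "2.0"}, json=body,
                         headers={"Authorization": f"Bearer {SCORE_KEY}"})
    resp.raise_for_status()
    return resp.json()
```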
