How to generate an alert if deployment becomes 'Unhealthy' in Azure Machine Learning?

I deployed an Azure Machine Learning model to AKS and would like to know how to set an alert if the deployment status changes to any value other than 'Healthy'. I looked at the monitoring metrics in the workspace, but they seem to relate mostly to the training process (Model and Run) and to quotas. Please let me know if you have any suggestions.
Thanks!

Azure Machine Learning does not provide a built-in way to continuously monitor the health of your web service and generate alerts.
You can set this up fairly easily using Application Insights (an AML workspace comes with a provisioned Application Insights instance).
You can monitor the web service's scoring endpoint using a URL ping test or a web test in Application Insights.
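To make the idea of a URL ping concrete, here is a minimal Python sketch of the same kind of probe run by hand. It is not the Application Insights feature itself. The function name `endpoint_is_up` and the `opener` parameter are hypothetical, and the sketch assumes the endpoint answers a plain GET; a real scoring endpoint may need a POST with a sample payload and an auth key.

```python
import urllib.error
import urllib.request


def endpoint_is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a GET to `url` answers with a 2xx status.

    `opener` is injectable (hypothetical parameter) so the probe can be
    exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection failures, timeouts, and HTTP 4xx/5xx raised by
        # urlopen are all treated as "not healthy".
        return False
```

A scheduler could call `endpoint_is_up(scoring_url)` every few minutes and raise an alert after several consecutive failures, which is roughly what an availability test does for you.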

Related

Azure Monitor issue with on prem

I have installed the Azure monitoring agent on my on-prem Windows server, but I am not getting RAM and CPU utilization on the Log Analytics dashboard. I have researched this but didn't find a solution.
Is it advisable to install the Azure monitoring agent on on-prem production servers? Thanks.

You can collect performance data with the Log Analytics agent. For this you need to configure Performance Counters as a data source, since that is how they are fed into Azure Monitor.
The steps to configure this in the Azure portal are:
Add a performance counter.
Enter the necessary details, such as the counter and its instance.
Set the sample interval; the default is 10 seconds.
Apply the changes when you are done.
The steps above are for Windows performance counters.
For more details, see the Microsoft documentation on the same topic.

Is it possible to Monitor Azure Integration Runtime?

I am running a few data pipelines in Azure Data Factory, and they use the Azure Integration Runtime for compute.
I am trying to monitor the CPU and memory that the pipelines consume on the Azure IR.
I have checked Azure Monitor, but I think the CPU and memory metrics there are for the Self-Hosted Integration Runtime.
With the Diagnostic Setting enabled, I also tried to find these details in the Logs, but they are not available.
Can anyone suggest other options?

If you are referring to the Azure AutoResolveIntegrationRuntime, then no, there is not, and this is why (from https://www.cathrinewilhelmsen.net/integration-runtimes-azure-data-factory/):
Microsoft has massive elastic pools across the various locations/regions they offer Azure, and at runtime ADF determines what pool/hardware it will use to perform the Pipeline activities. So there is really no way (and no need) to monitor the Azure Autoresolve IR. But if you are interested in monitoring Self-Hosted IR's then there are many ways to do it.
One simple and straightforward way to do it is to create Azure Dashboards from the Metrics section of Azure Monitor. The resulting charts give a good visual representation of usage and resources over time.
In my case I visualize the Integration Runtime itself (CPU/memory) as well as the Azure VM that hosts the Integration Runtime. On top of this, you can go into the Metrics dashboard and set up alerts that fire when certain conditions are met (e.g. average CPU usage is over 75% for the last 15 minutes). These alerts can send you a text message or an email, and can even do things as complicated as triggering a Logic App or webhook for automated scaling up/out, advanced notifications, etc.
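To illustrate what an alert condition such as "average CPU over 75% for the last 15 minutes" evaluates, here is a small Python sketch. It is only a model of the rule, not how Azure Monitor implements it; the function name `cpu_alert_fires` and the sample format are assumptions made for the example.

```python
from datetime import datetime, timedelta


def cpu_alert_fires(samples, now, window=timedelta(minutes=15), threshold=75.0):
    """Return True if the mean CPU % inside the look-back window exceeds threshold.

    `samples` is an iterable of (timestamp, cpu_percent) pairs
    (hypothetical format chosen for this sketch).
    """
    recent = [cpu for ts, cpu in samples if now - window <= ts <= now]
    if not recent:
        # No data in the window: do not fire.
        return False
    return sum(recent) / len(recent) > threshold


now = datetime(2024, 1, 1, 12, 0)
samples = [
    (datetime(2024, 1, 1, 11, 0), 10.0),   # outside the 15-minute window
    (datetime(2024, 1, 1, 11, 50), 80.0),
    (datetime(2024, 1, 1, 11, 55), 90.0),
]
fires = cpu_alert_fires(samples, now)  # mean of 80 and 90 is 85, above 75
```

In Azure Monitor you only configure the metric, aggregation, threshold, and period; the evaluation itself is done by the platform.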
Dashboards and alerts in Azure Monitor are, in my opinion, the best way to monitor, but another option is to call the Azure Data Factory REST API to get monitoring data for the Integration Runtimes:
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/integrationRuntimes/{integrationRuntimeName}/monitoringData?api-version=2018-06-01
But this method would require you to incrementally pull in data, store it, parse it, and then visualize it or act upon it when that is already very well built in for you. Sometimes it's fun to recreate wheels though.
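For anyone who does want to go the REST route, the call above could be sketched in Python roughly as follows. The helper names are hypothetical, and the sketch assumes you already hold a valid Azure AD bearer token for the management API; acquiring the token is out of scope here.

```python
import json
import urllib.request

API_VERSION = "2018-06-01"  # version taken from the URL quoted above


def monitoring_data_url(subscription_id, resource_group, factory, runtime):
    """Build the monitoringData URL for one integration runtime."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory}"
        f"/integrationRuntimes/{runtime}/monitoringData"
        f"?api-version={API_VERSION}"
    )


def build_request(url, bearer_token):
    """Prepare the POST request; the body is empty for this call."""
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


def fetch_monitoring_data(url, bearer_token):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(url, bearer_token), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

You would still have to poll this on a schedule and store the results yourself, which is the extra work described above.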

Yes, it is possible to monitor the Azure Integration Runtime.
"Pipeline Runs" under Monitoring has options to check CPU utilization for a specific pipeline or Integration Runtime, along with more specific filters. You can find out how it's done here.

Monitoring & Detecting Exceptions in Applications using Cloud Monitoring

I am new to GCP and come from an Azure background. Is there an equivalent of "Azure Application Insights" on the GCP side for Monitoring Applications?
Let me explain my use case more clearly with an example: if I have a .NET-based web application running on a Windows VM on GCP, can Google Cloud Monitoring help detect exceptions raised by the running application and send out alerts?
Any pointers/links to further explore this type of monitoring capability would be helpful.

Cloud Monitoring will provide you with many statistics, most probably including what you need. If there aren't any metrics that suit your needs, you can create your own based on the logs collected from the VM.
A number of logs are ingested by default, but if you want the full range and want to experiment with various ones, you may want to install a monitoring agent. Go through the documentation and have a look.
You can then use the metrics to create charts and get a live view of a number of things, such as CPU utilisation, disk I/O, and dropped/sent/received packets. Here's the Cloud Monitoring documentation.
And finally, you can create alerts based on the metrics (set thresholds, time periods, etc.). They can be simple e-mail alerts, for example, but they can also be sent via Pub/Sub and trigger functions or apps.
Since you're new to GCP you have a lot of reading ahead of you, but you will easily find documentation for most of GCP's services.
If you provide more details, I can update my answer and make it more precise.

Is there any way to get Azure status update only for some services and regions I am using?

Is there any way to get Azure status update only for some services and regions I am using? For example, I am using Cloud Services in West US. When this service in West US is down, I want to get an alert for it. I don't care about other services and other regions.

If you set up alert notifications for your application, you'll get notified when any of the underlying services you're using are not functioning properly. An alert will ensure that your service is available and working.
https://azure.microsoft.com/en-us/documentation/articles/insights-receive-alert-notifications/
If you get an alert about a service issue, that's when I would first take a look at the Azure status dashboard, and then take a look at your application logs to troubleshoot.
Another trick is to create simple URLs in your application that each run a quick service test. For example, let's say you're using blob storage in the West US datacenter. You could set up a page that does a test write/read to confirm that the service is working. This gives you a direct indication of whether there is a problem for your application. Since the cloud is highly distributed and service statuses don't update immediately, I find this method highly preferable.
You would then point your alert monitoring at URLs like these:
http://yourapp.com/
http://yourapp.com/blobtest
http://yourapp.com/redistest
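The check behind a URL like `/blobtest` could be sketched as follows in Python. This is only an illustration of the write/read round trip: `round_trip_ok` and `status_for` are hypothetical helpers, and the `write` and `read` callables stand in for whatever storage or cache client your application actually uses.

```python
import uuid


def round_trip_ok(write, read):
    """Write a small payload under a unique key and read it back.

    `write(key, data)` and `read(key)` are placeholders for the real
    client calls (blob storage, Redis, ...). Any exception counts as failure.
    """
    key = f"healthcheck-{uuid.uuid4()}"
    payload = b"ping"
    try:
        write(key, payload)
        return read(key) == payload
    except Exception:
        return False


def status_for(ok):
    """Map the check result to the HTTP status the test page should return."""
    return (200, "OK") if ok else (503, "Service Unavailable")


# Usage with an in-memory dict standing in for the real service.
store = {}
status = status_for(round_trip_ok(store.__setitem__, store.get))
```

Returning a non-2xx status on failure matters, because URL-based alert monitoring typically decides health from the response code.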

The Azure Status website has the information you need for all Azure regions.
https://azure.microsoft.com/en-us/status/

Reading Performance counters from Azure Cloud Service - Load Test

I am trying to create a load test for an Azure-hosted web service, but I am not able to connect to the Azure Cloud Service in order to collect the counters.
How can we connect to an Azure Cloud Service from a local machine or from any other machine?
I have tried to use the cloud service name, the VIP, but no luck.
Error: Cannot read counters from the machine 'xyz'.
Note: I am able to do RDC to the same cloud service.

If you would like to get performance counters from the machines you could configure Application Insights on your roles and collect them as part of your cloud load test.
http://blogs.msdn.com/b/visualstudioalm/archive/2014/04/07/get-application-performance-data-during-load-runs-with-visual-studio-online.aspx
This link would give you an idea on how to do it.
Let me know if you have more queries. Above link should also help you in getting help from the product team.
Thanks!
Ranga

Just came across this question.
This can be done by entering the IP addresses of the virtual machine instances of your web service (web role / worker role) in the load test's performance counter collection section. Your load test controller also has to be in the same subnet as your application. This lets you collect all perfmon data, including the application's custom performance counters.
