I am trying to add an alert if Azure ML pipeline fails. It looks that one of the ways is to create a monitor in the Azure Portal. The problem is that I cannot find a correct signal name (required when setting up condition), which would identify pipeline fail. What signal name should I use? Or is there another way to send an email if Azure pipeline fails?
What signal name should I use?
You can use PipelineChangeEvent category of AmlPipelineEvent table to view events when ML pipeline draft or endpoint or module are accessed (read, created, or deleted).
For example, according to documentation, use AmlComputeJobEvent to get failed jobs in the last five days:
AmlComputeJobEvent
| where TimeGenerated > ago(5d) and EventType == "JobFailed"
| project TimeGenerated , ClusterId , EventType , ExecutionState , ToolType
Updated answer:
According to Laurynas G:
AmlRunStatusChangedEvent
| where Status == "Failed" or Status == "Canceled"
You can refer to Monitor Azure Machine Learning, Log & view metrics and log files and Troubleshooting machine learning pipelines
Related
Quick context :
I want to delete my VM(s) in a specific resource group if its CPU usage is below 30 for 1 hour.
Detailed explanation : Please refer alert-action group flow diagram (https://i.stack.imgur.com/I4gBD.png%60 ) or Below Image Flow
Resource Group -> Linux VM(s) , connected with Log analytics workspace
Created Azure Function - Delete-VM , which written in Powershell will Delete VM
Created Action Group (delete-Action) to trigger a mail notification and above Azure Function(Delete-VM)
Created alert rule with signal as Log & condition as custom Log query and configured above action group to take action.
Custom Query :
Perf | where TimeGenerated > ago(60m) | where (ObjectName == "Processor") | summarize AggregatedValue = avg(CounterValue) by Computer | where AggregatedValue < 100 | project Computer, AggregatedValue
Issue :
When condition breached and alert get fired.
Only Mail action is Triggered
Azure Function (Delete DVSM) is not executing.
How to Trigger Azure function when Alert fired.
I have followed the blog which created by cloudsma.
Workaround follows
I have created the Alert Group with Email notification alert and it will trigger the Azure Function.
Adding Alert Rule:
Alert can be able to trigger in both Email and Azure Function
Alert Results
I am trying to bring in Azure Synapse logs into Loganalytics to create dashboards on usage level.
I have already setup in diagnostic settings to pass on the logs to my loganalytics workspace.
But while trying to execute queries from below documentation, I am getting error saying -
Query -
//Chart the most active resource classes
AzureDiagnostics | where Category contains "ExecRequests" | where
Status_s == "Completed" | summarize totalQueries = dcount(RequestId_s)
by ResourceClass_s | render barchart
Error:
'where' operator: Failed to resolve column or scalar expression named 'Status_s'...
Documentation link for queries : https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-monitor-workload-portal
Please let me know if there is something I am missing. I am directly logging to loganalytics workspace and running these queries inside a workbook...
Also i didnt find any proper documentation/blogs/links for connecting synapse to loganalytics, please let me know if anyone has that..
The documentation linked in your post appears to be out of date even though the last update date is recent.
See this link:
Azure services that use resource-specific mode store data in a table
specific to that service and do not use the AzureDiagnostics
table.
The link also lists a number of resource-specific tables for Synapse. "SynapseSqlPoolExecRequests" and "SynapseSqlPoolSqlRequests" are a few examples that might provide the info you're seeking.
So, I can see create_or_update logs of my VM on activity logs. There is no filter just to get the create logs as much as I am aware.
So is there any way where I can just see the create logs of a VM using API or commands?
You can follow below steps to achieve your requirement
You need to enable diagnostic settings to activity logs.
refer https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/activity-log#send-to-log-analytics-workspace for enabling the diagnostic settings.
Once the Log analytics workspace is established, you can query the logs as
AzureActivity
| where OperationName == 'Create or Update Virtual Machine' and ActivitySubstatusValue == 'Created'
| order by TimeGenerated desc
above output will show only the Create operations. You can further filter it based on your requirement.
I use an Azure VM for personal purposes and use it mostly like I would use a laptop for checking email etc. However, I have several times forgot to stop the VM when I am done using it and thus have had it run idle for days, if not weeks, resulting in unnecessarily high billing.
I want to set up an email (and if possible also SMS and push notification) alert.
I have looked at the alert function in the advisor, but it does not seem to have enough customization to handle such a specific alert (which would also reduce Microsoft's income!).
Do you know any relatively simple way to set up such an alert?
You can take use of Log Analytics workspaces and Custom log search.
The below are the steps to create an alert, which will send the alert if the azure vm is running exactly 1 hour.
First:
you need to create a Log Analytics workspaces and connect to azure vm as per this link.
Sencod:
1.In azure portal, nav to Azure Monitor -> Alerts -> New alert rule.
2.In the "Create rule" page, for Resource, select the Log Analytics workspaces you created ealier. Screenshot as below:
Then for Condition, please select Custom log search. Screenshot as below:
Then in the Configure signal logic page, in Search query, input the following query:
Heartbeat
| where Computer == "yangtestvm" //this is your azure vm name
| order by TimeGenerated desc
For Alert logic: set Based on as Number of results, set Operator as Equal to, set Threshold value as 60.
For Evaluated based on: set Period as 60, set Frequency as 5.
The screenshot as below:
Note:
for the above settings, I query the Heartbeat table. For azure vm which is running, it always sends data to log analytics to the Heartbeat table per minute. So if I want to check if the azure vm is running exactly 1 hour(means it sends 60 data to Heartbeat table), just use the above query, and set the Threshold value to 60.
Another thing is the Period, it also needs to be set as 1 hour(60 minutes) since I just check if the azure vm is running for 1 hour; for Frequecy, you can set it any value you like.
If you understand what I explains, you can change these values as per your need.
At last, set the other settings for this alert.
Please let me know if you still have more issues about this.
Another option is to use the Azure Activity log to determine if a VM has been running for more than a specified amount of time. The benefit to this approach is that you don't need to enable Diagnostic Logging (Log Analytics), it also supports appliances that can't have an agent installed (i.e. NVAs).
The logic behind this query is to determine if the VM is in a running state, and if so has it been running for more than a specified period of time (MaxUpTime).
This is achieved by getting the most recent event of type 'Start' or 'Deallocate', then checking if this event is of type 'Start' and was generated more than 'MaxUpTime' ago
let DaysOfLogsToCheck = ago(7days);
let MaxUptime = ago(2h); // If the VM has been up for this long we want to know about it
AzureActivity
| where TimeGenerated > DaysOfLogsToCheck
// ActivityStatus == "Succeeded" makes more sense, but in practice it can be out of order, so "Started" is better in the real world
| where OperationName in ("Deallocate Virtual Machine", "Start Virtual Machine") and ActivityStatus == "Started"
// We need to keep only the most recent entry of type 'Deallocate Virtual Machine' or 'Start Virtual Machine'
| top 1 by TimeGenerated desc
// Check if the most recent entry was "Start Virtual Machine" and is older than MaxUpTime
| where OperationName == "Start Virtual Machine" and TimeGenerated <= MaxUptime
| project TimeGenerated, Resource, OperationName, ActivityStatus, ResourceId
I'm able to create an alert based on my custom metric. However, i'd like to have several different alerts, for each cloud_RoleInstance I have. Is it possible somehow?
If the logs are stored in Azure Log Analytics or Azure Application Insights, then you can use Custom Log Search alert(in step 5 of this article). Note you need to create one alert as per one cloud_RoleInstance in the query.
Steps as blow:
Step 1:
In azure portal -> Nav to azure monitor -> Alerts -> New alert rule, then in the resource, select the Azure Log Analytics or Azure Application Insights.
Step 2:
Then in Condition, select Add, then select "Custom log search":
Step 3:
Then in new window, write you query to trigger the alert, remember use where clause to filter the cloud_RoleInstance.
And note that:
change "Based on" from "Number of results" to "Metric measurement",
and use this query:
customMetrics
| where name == 'MyMetricName'
| where cloud_RoleInstance == 'MyInstanceName'
| summarize AggregatedValue = sum(value) by bin(timestamp, 5m)