How to get hollistic view of Azure environment

How to get hollistic view of Azure environment - azure

There's an awful lot of disjointed documentation on monitoring network/resources in Azure. What I'm looking for is which pieces are needed to get information from VMs, NVA firewalls, azure load balancers, and other network resources and network connectivity into a single pain of glass in Azure. Only concerned about Azure, not on-prem for now.
I've come across azure monitor, log analytics work spaces, event hub, vm extensions, network watcher, insights, etc...but I'm not sure which are required and which are not. One doc leads to the next and I end up with 30 tabs open. I'll also need to be able to push logs to other security devices such as a SIEM.
Does anyone know of a deployment guide that wraps this all up in a more logical fashion? Does anyone have any feedback on which pieces from azure (not 3rd parties) are required at a minimum to accomplish a single pane of glass to view my Azure environment holistically?

General overview of observability in Azure
Likely, the thing you're looking for is Azure Monitor. It's an umbrella term for everything observability related inside Azure.
To store Metrics and Logs you need Log Analytics: it can query data with kusto query language, visualize results, define Alerts on queries.
Alerts is quite a complex beast, as it is spread across the entire cloud. Two types that I use the most:
log-analytics alert (which I mentioned above)
Alerts tab, which is available at every Azure component view. for example, open resource group, and scroll down to Monitoring section
Each component also has a subset of built-in metrics. Likely, you noticed that many azure components on the Overview view display some charts. For example, Azure Storage Account displays Total egress, Total ingress, and other line-charts. When you click on these charts you can customize them. These metrics and charts are free to use.
Microsoft also has all-in-one observability solution for Azure Functions and Web Apps: Application Insights
Dashboards allows to join multiple charts into a single view and share it with others.
If you care about security, Azure proposes Azure Security Center
Deployment/management strategy
I suggest to start with:
Create Log Analytics Workspace, which is the storage for metrics and logs. The azure docs article explains how to design it: how many instances to use, how to rate-limit ingestion (it might be expensive if goes out of control), how to access it and so on.
To get Azure components logs, look for Diagnostic Settings tab at a component page at Azure portal, but not all components has it (sic!). I suggest
sending the most critical data to Log Analytics workspace to store them in a queryable format for 30 days (it's in free tier). This is needed for investigating current issues with your infrastructure
if you might need logs later than 30 days - send them to Storage Account
you mentioned SIEM integration - route required events to Event Hub and then process the stream according to your requirements
So, if you need long-term storage - you need to create Azure Storage Account.
If you need real-time analysis - you need to build a pipeline based on Azure Event Hub.
If you have Azure Functions and Web Apps - add Application Insights. According to my experience, I would suggest starting with a separate instance per each Azure Function resource or Service.
Create Alerts for each component separately. If you do it through UI - open component page at the portal and look for Alerts tab there. If you're automating the process (please do so as soon as possible), do not expect easy trip: I used ARM templates and terraform - in both cases, there are dozens of barely documented features.
Join related components core-metrics into Dashboards and share it with the team. This guide is a good starting point. Note, when you share the dashboard, it's also persisted as an azure resource in the subscription.

Related

Is there a script to create azure custom alerts format and any log analytics query to get azure VM status

I have below two questions can someone help on them.
1.Is there a script or a way to create custom alert format for azure alerts?
2.Is there a way to pin all the azure VM status to dashboard?

Regarding #1, the feature to customize or configure alert email format is currently not supported. If interested, I suggest you to raise your feedback / feature request here in UserVoice / feedback forum. Responsible product / feature team would triage / start checking feasibility and would prioritize the feedback.
Regarding #2, If 'status' is meant as 'PowerState' (i.e., status of VM whether it is running, deallocated, etc.) or if it's meant as 'StatusCode' (i.e., ok, etc.) or if it's meant as 'ProvisioningState' (i.e., succeeded, etc.) then I don't think we have straight-forward way for it so that we can ingest that particular data directly to dashboard but said that, you may just leverage 'Heartbeat' Log Analytics Kusto table at first place and create a custom view as dashboard using view designer but as views in Azure Monitor are being phased out and replaced with workbooks so I suggest leveraging these workbooks now.
If not, you may leverage a new feature called as Azure Monitor for VMs which basically helps to analyze the performance and health of your Windows and Linux VMs, and monitor their processes and dependencies on other resources and external processes. Here again, you can create interactive reports Azure Monitor for VMs with VM insights workbooks.
Hope these inputs helps!

Allow customer to only see logging information

We run a software application on azure for one of our customers. The customer want to see the performance of the systems. This consist of two parts. One is the metric information of the servers and they also want to see some information I want to provide by custom logging.
My plan is to give the customer access to the portal and only allow him access to the metric information and the custom tables.
It seems to me that by assigning a role to the customer I should be able to block all the other possibilities.
Does someone can me tell which actions I have to allow/forbid to achieve this? Or were I can find the information for this?

Solution #1
Instead of giving Read access to the virtual machine which may breaks security policy, I'd recommend to go with Azure Log Analytics (ref: https://learn.microsoft.com/en-us/azure/log-analytics/log-analytics-overview
) workspace. That said, you will need to create a workspace which collects and stores server metrics (ref: https://learn.microsoft.com/en-us/azure/log-analytics/log-analytics-quick-collect-windows-computer) and other custom metrics.
Your customer will be given access to the workspace only which he can see all metrics in a dashboard. If there is a need for log filtering, you can use Log Analytics query language (ref: https://learn.microsoft.com/en-us/azure/log-analytics/log-analytics-log-search-transition)
Log Analytics is a paid service. You are given free up to 10 workspaces per subscription. The workspace is considered an Azure resource so the limit follows by subscription limit, which means you can create up to 800 workspaces per a resource group. A subscription can allow 800 * 800 (for reference if you would like to do capacity planning for your workspace-based solution). For Log Analytics pricing, read here (https://azure.microsoft.com/en-us/pricing/details/log-analytics/).
Log Analytics is a good choice as its value proportion is to offer your customer intuitive dashboard to monitor their virtual machine performance, and to offer Near Real Time monitoring. And this solution is a cloud native compatibility.
There is a management solution which offers a bundle of VM capacity and performance monitoring which you can try now https://learn.microsoft.com/en-us/azure/log-analytics/log-analytics-capacity
Solution #2
Log Analytics might not be your choice because it might add more Azure service and operational cost. If you need a cheaper cost, you would need to collect your virtual machine by Performance Counter which is a built-in feature in Windows OS. With Performance Counter you can export to Excel file, or visualize into Power BI or some custom chart.
Other Solutions
You can utilize Azure Monitor and API to get data, For example, this API https://learn.microsoft.com/en-us/rest/api/monitor/metricdefinitions/list. You would certainly need to visualize or format in some intuitive way to satisfy your customer. It can be a custom front-end web, or Power BI or even Excel with chart.
You can just query to Azure Blob Storage and use Stream Analytics combining with Power BI to visualize your data (https://thuansoldier.net/?p=7187).
There is not a single solution. This really depends on your existing resource capacity, financial stuff or so on.

Azure Application Insights for Service Fabric

I have multiple services running on Service Fabric. I would like to add Application Insight for logging. I'm just wondering whether I have to add an Application Insight resource for each microservice or only one is common for all. What is the best practice?

There is no such thing a the best practice for this. It really depends. Some considerations:
Pricing: depending on the level (basic or enterprise) you will get an amount of data for free / included in the base price. See the docs. So in some cases, depending on the amount of traffic you can reduce costs by having a dedicated AI resource per service. AI resources for services that send data below the threshold of the AI pricing plan are then (almost) free.
Querying: if you split up services per AI resource getting an overview of the whole system is difficult since at the moment you cannot create queries spanning multiple AI resources.
Responsibility: If you have multiple teams working on multiple services it might be an option to have an AI resource per team so they have a good insight in only the parts they are responsible for.
If you do decide to use a shared AI resource there are options like custom telemetry initializers to include custom data that further identify which ASF application or service is sending the data if it is not included by default.
See also Add Application Insight to a existing Azure Service Fabric cluster for more info about how to integrate AI.
Now, when it comes to bring data together you do have some additional options that may or may not need additional services or configuration. For example:
PowerBi: You can visualize data of AI resources using dashboards, see https://learn.microsoft.com/en-us/azure/application-insights/app-insights-export-power-bi
OMS: Operation Management Suite, See https://blogs.technet.microsoft.com/msoms/2016/09/26/application-insights-connector-in-oms/. As Jesse mentions you can link multiple AI Resources
Custom dashboards: Using the rest api you can create your own solution that displays data for one or more AI resources.

is azure diagnostics only available through code?

Is Azure diagnostics only implemented through code? Windows has the Event Viewer where various types of information can be accessed. ASP.Net websites have a Trace.axd file at the root that can viewed for trace information.
I was thinking that something similar might exist in Azure. However, based on the following url, Azure Diagnostics appears to require a custom code implementation:
https://azure.microsoft.com/en-us/documentation/articles/cloud-services-dotnet-diagnostics/#overview
Is there an easier, more built-in way to access Azure diagnostics like I described for other systems above? Or does a custom Worker role need to be created to capture and process this information?

Azure Worker Roles have extensive diagnostics that you can configure up.
You get to them via the Role configuration:
Then, through the various tabs, you can configure up specific types of diagnostics and have them periodically transferred to a Table Storage account for later analysis.
You can also enable a transfer of application specific logs, which is handy and something that I use to avoid having to remote into the service to view logs:
(here, I transfer all files under the AppRoot\logs folder to a blob container named wad-processor-logs, and do so every minute.)
If you go through the tabs, you will find that you have the ability to extensively monitor quite a bit of detail, including custom Performance Counters.
Finally, you can also connect to your cloud service via the Server Explorer, and dig into the same information:
Right-click on the instance, and select View Diagnostics Data.
(a recent deployment, so not much to see)
So, yes, you can get access to Event Logs, IIS Logs and custom application logs without writing custom code. Additionally, you can implement custom code to capture additional Performance Counters and other trace logging if you wish.

"Azure diagnostics" is a bit vague since there are a variety of services in Azure, each with potentially different diagnostic experiences. The article you linked to talks about Cloud Services, but are you restricted to using Cloud Services?
Another popular option is Azure App Service, which allows you many more options for capturing logs, including streaming them, etc. Here is an article which goes into more details: https://azure.microsoft.com/en-us/documentation/articles/web-sites-enable-diagnostic-log/

Alternate to run window service in Azure cloud

We currently have a window service which send some notification emails to users after doing some processing on database(SQL database). Runs once in day.
We want to move this on azure cloud. One alternate is to put it on Azure VM as is. but I am finding some other best possible solution for that.
I study about recurring and on demand Web jobs but I am not sure is this is best solution.
Also is there any possibility to update configuration of service code in App.config without re-deploy the code of service on cloud. I means we can manage configuration from Azure portal.
Thanks in advance.

Update 11/4/2016
Since this was written, there are 2 additional features available in Azure that are both excellent choices depending on what functionality you need:
Azure Functions (which was based on the WebJobs described below): Serverless code that can be trigger/invoked in various ways, and has scaling support.
Azure Service Fabric: Microservice platform, with support for actor model, stateful and stateless services.
You've got 3 basic options:
Windows service running on VM
WebJob
Cloud service
There's a lot of information out there on the tradeoffs between these choices, but here's a brief summary.
VM - Advantages: you can move your service basically as it is without having to change much or any of your code. They also have the easiest connectivity with other resources in Azure (blob storage, virtual networks, etc). The disadvantage is you're giving up all the of PaaS advantages and are still stuck managing your own VM infrastructure
WebJob - Advantages: Multiple invocation options (queues, blobs, manually, queue receive loops, continuous while-loop style, etc), scheduled (would cover your case). Easy to deploy (can go with website, as a console app, automatically through Kudu), has some built in logging in Azure portal - and yes, to answer your question, you can alter the configuration in the portal itself for connection strings and app settings.
Disadvantages - you'll need to update code, you don't have access to underlying resources (if you need that), and more of something to keep in mind than a disadvantage - it uses the same resources as the webapp it's deployed with.
Web Jobs are the newest of the options, but at the same time appear to have active development going on to increase the functionality and usefulness.
Cloud Service - like a managed VM, has some deployment options, access to underlying VM if needed. Would require some code changes from your existing service.
There's nothing you've mentioned in your use case that makes me think a Web Job shouldn't be first thing you try.
(Edit: Troy Hunt has a great and relatively recent blog post illustrating most of the points I've mentioned about Web Jobs above: http://www.troyhunt.com/2015/01/azure-webjobs-are-awesome-and-you.html)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string