Is Azure Monitor a good store for custom application performance monitoring?

We have legacy applications that currently write various run-time metrics (SQL call run times, API/HTTP request run times, etc.) to a local SQL database.
Format: (source, event, data, executionduration)
We are moving away from storing these in the local SQL database and are now publishing the same metrics to Azure Event Hubs.
We are looking for a good place to store those metrics for monitoring the health of the application. A simple solution would be to store them in some database and build a custom application to visualize the data in custom ways.
We are also considering using Azure Monitor for this purpose via the Data Collector API (https://learn.microsoft.com/en-us/azure/azure-monitor/platform/data-collector-api).
QUESTION: Are there any issues with Azure Monitor that would prevent us from achieving this type of health monitoring?
Details
each event is small (a few hundred characters)
expecting ~10 million events per day
retention of 1-2 days is enough
the ability to aggregate old events per source and per event is important (to keep historical run-time information)
Thank you

You can build some simple graphs, and with the Log Analytics query language you can do just about any form of data analytics you need.
Here's a pretty good article on Monitor Visualizations.
learn.microsoft.com/en-us/azure/azure-monitor/log-query/charts
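For a rough idea of what pushing one of these events through the Data Collector API mentioned in the question looks like, here is a minimal Python sketch following the SharedKey signing scheme from the linked documentation. The workspace ID, shared key, and the AppPerfMetrics log type are placeholders, not anything from the original post.

```python
# Minimal sketch of sending a custom event to the Azure Monitor HTTP Data
# Collector API. Workspace ID, shared key, and log type are placeholders;
# error handling and batching are omitted.
import base64
import hashlib
import hmac
import json
from datetime import datetime, timezone

import requests

WORKSPACE_ID = "<workspace-id>"                 # placeholder
SHARED_KEY = "<primary-or-secondary-key>"       # placeholder
LOG_TYPE = "AppPerfMetrics"                     # events land in the AppPerfMetrics_CL table

def build_signature(date: str, content_length: int) -> str:
    """Builds the SharedKey authorization header documented for the API."""
    string_to_hash = f"POST\n{content_length}\napplication/json\nx-ms-date:{date}\n/api/logs"
    decoded_key = base64.b64decode(SHARED_KEY)
    digest = hmac.new(decoded_key, string_to_hash.encode("utf-8"), hashlib.sha256).digest()
    return f"SharedKey {WORKSPACE_ID}:{base64.b64encode(digest).decode()}"

def post_events(events: list[dict]) -> None:
    body = json.dumps(events).encode("utf-8")
    rfc1123_date = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
    headers = {
        "Content-Type": "application/json",
        "Authorization": build_signature(rfc1123_date, len(body)),
        "Log-Type": LOG_TYPE,
        "x-ms-date": rfc1123_date,
    }
    url = f"https://{WORKSPACE_ID}.ods.opinsights.azure.com/api/logs?api-version=2016-04-01"
    requests.post(url, data=body, headers=headers, timeout=10).raise_for_status()

# One event in the (source, event, data, executionduration) shape from the question.
post_events([{"source": "orders-api", "event": "sql_call", "data": "GetOrders", "executionduration": 42}])
```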

Related

Azure-based approach for sending 100000 requests to external service

Every night I need to get data from an external HTTP service and save it to Azure Data Lake.
Specifically, I need to get all the orders for all the customers. The problem is that there is no way to get this data in a single call; a customer ID has to be provided for each separate call.
The URL format is something like /api/ordersByCutomer/{cutomerId}
I need to get data for 100,000 different customers, which results in 100,000 calls to the external service.
I tried to use Azure Data Factory with a ForEach activity in parallel mode, but each call takes about 4 seconds there (3 of which are spent in the queue). The overall throughput was not satisfying.
What is the best (I mean the fastest) Azure-based approach for this (other than Azure Data Factory)?
Thanks
You could write some asynchronous code to hit the API/HTTP service in parallel and run it as a custom activity in ADF, which uses an Azure Batch account to get the job done. See: Use custom activities in an Azure Data Factory pipeline.
Also, before doing any of this, it would be nice to contact the owner/stakeholder of the external HTTP service and find out whether there is rate limiting on that service, and whether the service can even handle such a load.
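To make the "asynchronous code in parallel" suggestion concrete, here is a rough Python sketch using aiohttp with a bounded concurrency limit. The base URL, concurrency cap, and result handling are assumptions; the path is copied verbatim from the question.

```python
# Rough sketch: fetch orders for many customers concurrently with a bounded
# number of in-flight requests. Base URL and concurrency cap are placeholders;
# the path is copied verbatim from the question.
import asyncio

import aiohttp

BASE_URL = "https://external-service.example.com"  # placeholder
CONCURRENCY = 100  # tune against whatever rate limit the service owner allows

async def fetch_orders(session, sem, customer_id):
    async with sem:
        url = f"{BASE_URL}/api/ordersByCutomer/{customer_id}"
        async with session.get(url) as resp:
            resp.raise_for_status()
            return customer_id, await resp.json()

async def main(customer_ids):
    sem = asyncio.Semaphore(CONCURRENCY)
    results = {}
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_orders(session, sem, cid) for cid in customer_ids]
        for coro in asyncio.as_completed(tasks):
            customer_id, orders = await coro
            results[customer_id] = orders  # stage for the Data Lake load
    return results

# asyncio.run(main([str(i) for i in range(100_000)]))
```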

Application insights performance counters spam

I have two identical resource groups for dev and qa.
They have the same services and same configurations.
At some point, I found that the Application Insights instance in QA contains a lot of performance counter records, produced by the App Service plan.
I've tried to compare their configuration, but they look similar.
What can cause such different behaviour?
According to the documentation:
At the set sample interval, Azure Monitor gathers data from all deployed performance counters on all agents. For the time period defined by your log analytics workspace, the raw data is accessible in all log query views and has not been aggregated.
And according to this documentation:
Performance counters show how the system resources are performing. Performance counter data ingestion depends on your environment size and usage. In most cases, performance counters should make up 80 to 99% of your data ingestion for Azure Monitor for Azure Virtual Desktop.
References:
https://learn.microsoft.com/en-us/dotnet/framework/debug-trace-profile/performance-counters

Should Azure Log Analytics and Application Insights be used per app or per environment?

We have an Azure-based system which is growing in complexity, and we need to monitor chains of events and ensure they arrive where we expect them to arrive.
We have an on-prem Java application which sends events to an IoT Hub. The IoT Hub routes to Service Bus queues. We have functions that update a Cosmos database, trigger other functions, or route to additional queues. Some functions are also callable through an API Management instance.
Our functions are already connected to Application Insights, and here the Application Insights instance is named the same as the Function App (IIRC this naming was suggested through the form that created the AI resource).
The application map in Application Insights makes me lean toward one AI instance per environment, to have a complete map of the system. Log Analytics also seems logical to use one per environment, to be able to potentially correlate data if needed.
What is the correct path for Log Analytics and Application Insights, respectively?
If it is not as clear-cut as stated in my title, what factors do I need to consider when I start to use these services?
The correct number of instances is the one that works best for you, whether that exactly follows recommended practices or not.
The recommendation is to use one workspace per environment and set cloud_RoleName in App Insights to distinguish the parts of the system (sketched after this answer). Log Analytics has similar considerations.
Functions defaults to spinning up an App Insights instance along with the app, because if you don't use App Insights you lose most of the logging ability. It's important to connect it to App Insights, but overriding the default behavior and connecting to a centralized workspace is common in larger systems.
There are certainly reasons you might want to split the workspaces, and you can union data across workspaces as needed to pull data together from both Log Analytics and App Insights instances.
Data access control or geographic locations. If you need to keep a portion of the data within certain geographic boundaries or limit access to certain people, then split that portion off.
Similar to the security concern is a billing one. If for whatever reason, billing for different portions of the application needs to be split, then you would also want to split the logging portion.
Different portions of the system rarely interact, or are maintained by different teams, and organizing the data into separate workspaces will provide more benefits than the hassle of cross-workspace queries costs.
You are going to surpass the limitations on a single resource. Very few applications actually hit these limits, but they are there.
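As one illustration of the cloud_RoleName advice for a Python component (other stacks have equivalent settings), here is a minimal sketch using the azure-monitor-opentelemetry distro, which derives cloud_RoleName from the OpenTelemetry service name. The service name and connection-string variable are assumptions for illustration.

```python
# Sketch: several services reporting to one shared Application Insights
# resource, each with its own cloud_RoleName so the application map stays
# readable. Service name and env var names are assumptions.
import os

from azure.monitor.opentelemetry import configure_azure_monitor

# cloud_RoleName in Application Insights maps to the OpenTelemetry
# service.name resource attribute.
os.environ.setdefault("OTEL_SERVICE_NAME", "orders-ingest-function")  # hypothetical role name

configure_azure_monitor(
    connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"],
)
```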

Azure WebJobs for Aggregation

I'm trying to figure out a solution for recurring data aggregation of several thousand remote XML and JSON data files, by using Azure queues and WebJobs to fetch the data.
Basically, an input endpoint URL of some sort would be called (with a data URL as a parameter) on an Azure website/app. It should trigger a WebJobs background job (or the job could run continuously and check the queue periodically for new work), fetch the data URL, and then call back an external endpoint URL on completion.
Now the main concern is the volume and its performance/scaling/pricing overhead. There will be around 10,000 URLs to be fetched every 10-60 minutes (most URLs will be fetched once every 60 minutes). With regard to this scenario of recurring high-volume background jobs, I have a couple of questions:
Is Azure WebJobs (or Workers?) the right option for background processing at this volume, and can it scale accordingly?
For this sort of volume, which Azure website tier would be most suitable (comparison at http://azure.microsoft.com/en-us/pricing/details/app-service/)? Or would only a Cloud Service or VM(s) work at this scale?
Any suggestions or tips are appreciated.
Yes, Azure WebJobs is an ideal solution to this. Azure WebJobs will scale with your Web App (formerly Websites). So, if you increase your web app instances, you will also increase your web job instances. There are ways to prevent this, but that's the default behavior. You could also set up autoscale to automatically scale your web app based on CPU or other performance rules you specify.
It is also possible to scale your web job independently of your web front end (WFE) by deploying the web job to a web app separate from the web app where your WFE is deployed. This has the benefit of not taking up machine resources (CPU, RAM) that your WFE is using while giving you flexibility to scale your web job instances to the appropriate level. Not saying this is what you should do. You will have to do some load testing to determine if this strategy is right (or necessary) for your situation.
You should consider at least the Basic tier for your web app. That would allow you to scale out to 3 instances if you needed to and also removes the CPU and Network I/O limits that the Free and Shared plans have.
As for the queue, I would definitely suggest using the WebJobs SDK and let the JobHost (from the SDK) invoke your web job function for you instead of polling the queue. This is a really slick solution and frees you from having to write the infrastructure code to retrieve messages from the queue, manage message visibility, delete the message, etc. For a working example of this and a quick start on building your web job like this, take a look at the sample code the Azure WebJobs SDK Queues template punches out for you.
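The WebJobs SDK the answer recommends is a .NET library, so as a language-neutral illustration of the message flow the question describes (an input endpoint drops a data URL on a queue, a background worker picks it up), here is a rough Python sketch using azure-storage-queue. It only shows the shape of the flow; the connection string and queue name are placeholders, and the hand-rolled consumer loop is exactly the plumbing the JobHost would replace.

```python
# Illustrative sketch of the queue flow described in the question: an input
# endpoint enqueues a data URL, and a background worker dequeues and fetches
# it. Connection string and queue name are placeholders.
import time

import requests
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string("<storage-connection-string>", "fetch-jobs")

def enqueue(data_url: str) -> None:
    """Called by the input endpoint: hand the data URL to the background job."""
    queue.send_message(data_url)

def worker_loop() -> None:
    """Hand-rolled consumer loop; the WebJobs SDK does this plumbing for you."""
    while True:
        for msg in queue.receive_messages(messages_per_page=32, visibility_timeout=300):
            payload = requests.get(msg.content, timeout=30).text
            # ... process/aggregate the payload, then call back the external endpoint ...
            queue.delete_message(msg)
        time.sleep(5)  # brief pause between polls
```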

Pulling data asynchronously from third-party web service on Windows Azure Platform

I want to pull a large amount of data, frequently, from different third-party API web services and store it in a staging area (this is what I want to decide right now), from where it will then be moved one by one as required into my application's database.
I wanted to know whether I can use the Azure platform to achieve the above, and how well suited the Azure platform is to this task.
What if the amount of data to be pulled is large and the pull frequency is high, i.e. maybe half-hourly or hourly for 2,000 different users?
I assume that if this is possible at all, then bandwidth, data storage, server capability, etc. will not be something for me to worry about, but for Microsoft. And obviously, I should be able to access the data back whenever I need it.
If I were to implement this on Windows Servers, I know I would use a Windows service to do it. But I don't know how it can be done on the Windows Azure platform, if it is possible at all.
As Rinat stated, you can use Lokad's solution. If you choose to do it yourself, you can run a timed task in your worker role - maybe spawn a thread that sleeps, waking every 30 minutes to perform its task. It can then reach out to the web services in question (or maybe one thread per web service?) and fetch data. You can store it temporarily in Azure Table Storage, which is a fraction of the cost of SQL Azure ($0.15 per GB), and then easily read it out of Table Storage on demand and transfer it to SQL Azure.
Assuming your hosted services, storage, and SQL Azure are in the same data center (by setting the affinity appropriately), you'd only pay for bandwidth when pulling data from the web service. There'd be no bandwidth charges to retrieve from Table Storage or insert into SQL Azure.
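The answer above is written for .NET Worker Roles and the classic storage client, but the timed fetch-and-stage loop translates directly. Here is a rough Python sketch using the current azure-data-tables SDK; the third-party URL, table name, connection string, and 30-minute interval are placeholders rather than anything from the original post.

```python
# Rough sketch of the "sleep, wake every 30 minutes, fetch, stage in Table
# Storage" loop described above, using the modern azure-data-tables SDK.
# URL, table name, and connection string are placeholders.
import time
import uuid
from datetime import datetime, timezone

import requests
from azure.data.tables import TableServiceClient

service = TableServiceClient.from_connection_string("<storage-connection-string>")
table = service.create_table_if_not_exists("StagedPayloads")

def fetch_and_stage(source_url: str) -> None:
    payload = requests.get(source_url, timeout=30).text
    table.create_entity({
        "PartitionKey": datetime.now(timezone.utc).strftime("%Y%m%d"),
        "RowKey": str(uuid.uuid4()),
        "SourceUrl": source_url,
        "Payload": payload,  # string properties are capped at 64 KB; split if larger
    })

while True:
    fetch_and_stage("https://thirdparty.example.com/api/data")  # placeholder URL
    time.sleep(30 * 60)  # wake every 30 minutes, as in the answer above
```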
In Windows Azure, that's usually a Worker Role used to host the cloud processing. To accomplish your tasks, you'll either need to implement this messaging/scheduling infrastructure yourself or use something like the Lokad.Cloud or Lokad.CQRS open-source projects for Azure.
We use Lokad.Cloud for distributed BI processing of hundreds of thousands of series, and Lokad.CQRS allows us to reliably retrieve and synchronize millions of products on schedule.
There are samples, docs and community in both projects to get you started.
