Is it possible to capture performance counter values for the current process/app? I am trying to capture metrics about my running web job, such as % Time in GC, Gen N heap size, thread contention rate/sec, etc. I see Application Insights can capture these kinds of metrics when enabled in my web site.
using System.Diagnostics; // for PerformanceCounter and Process

var category = ".NET CLR Memory";
var counter = "% Time in GC";
var instance = Process.GetCurrentProcess().ProcessName;
var pc = new PerformanceCounter(category, counter, instance, readOnly: true);
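For reference, this is how I would read it outside the sandbox; rate and percentage counters need two samples, so the first NextValue() call only establishes a baseline (sketch):

// The first call returns 0 for rate/percentage counters; sample again after a delay.
pc.NextValue();
System.Threading.Thread.Sleep(1000);
Console.WriteLine($"% Time in GC: {pc.NextValue():F2}");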
I understand why this would fail, as the Azure/Kudu runtime sandbox needs to ensure the instance is my process and not some other customer's process. Is there a way to get access to my process's counters so I can report on/collect them? I would like the collection to occur from within the process.
At this time, it's not possible to collect performance counters due to sandbox limitations, as you mentioned. We're looking into ways to make this possible in a secure way, but don't have any timelines to share.
Your best bet at this point may be to use Application Insights, which is currently the recommended way to collect diagnostics for your app (they have solutions for pretty much any app type, not just web apps): https://azure.microsoft.com/en-us/documentation/services/application-insights/
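If you just need to report numbers you can compute in-process, here is a minimal sketch with the Application Insights SDK (assuming the Microsoft.ApplicationInsights package and a configured instrumentation key; the metric name is only an example):

using Microsoft.ApplicationInsights;
using System.Diagnostics;

var telemetry = new TelemetryClient();

// Report any value you can compute in-process as a custom metric.
double privateMemoryMb = Process.GetCurrentProcess().PrivateMemorySize64 / (1024.0 * 1024.0);
telemetry.TrackMetric("WebJob.PrivateMemoryMB", privateMemoryMb);
telemetry.Flush(); // flush before shutdown so the data point is sent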
In some 1-5% of our requests, we are seeing slow communication between APIs (REST API requests). Both APIs are developed by us and hosted on Azure, each app service on its own app service plan in the same region, P1v2 tier.
What we are seeing on application insights is that POST or GET requests on origin API can take a few seconds to execute, while real execution time on destination API is only a few milliseconds.
Examples (first line: the POST request on the origin; second line: execution time on the destination API): see the screenshots "slow req 1" and "slow req 2".
Our best guess is that the time difference is lost in communication between the components. We don't have an explanation for it, since the payload is really small and in most cases the communication takes less than 5 milliseconds.
We have ruled out component cold start as an explanation, since this happens under constant load and no horizontal scaling was performed.
Do you have any idea what might cause this, or how to analyze it further in order to find out?
If you're running multiple sites on the App Service Plan, then enable the "Always On" setting for your web app: your web app > All Settings > Application Settings > Always On.
See here for details: https://azure.microsoft.com/en-us/documentation/articles/web-sites-configure/
When Always On is off, the site is shut down after 20 minutes of inactivity to free up resources for any additional websites that might be using the same App Service Plan.
The amount of information it needs to collect, process, and then present requires some time and involves internal calls as well; given the server load and usage, it can take around 6 to 7 seconds, sometimes even more.
To troubleshoot that latency, try these steps provided by Microsoft.
I'm currently trying SignalR and RabbitMQ in order to round-robin / load balance JSON web service queries, and I'm having trouble with the memory consumption of one of the applications when it processes large (~300 to 2,500 KB) messages.
I have an IIS server hosting a web application (named "Backend") that needs to query another web application (named "Pricing"), also hosted on an IIS server.
In order to keep a connection alive with my RabbitMQ server, I developed console applications that are connected to Backend and Pricing using SignalR.
So when Backend needs to query Pricing, it asks its console to publish the message to the queue; the console attached to Pricing takes the message and gives it to Pricing (with the Invoke<> method). When Pricing has finished its job, it asks its console to publish the reply message, and the console attached to Backend takes it and gives it to Backend.
To sum up:
[Backend] -> [Console] -> [RabbitMQ] <- [Console] <- [Pricing]
And I have two Pricing instances taking messages from the RabbitMQ queue via their consoles.
This setup replaces a traditional web service query between the two IIS applications and benefits from the advantages of RabbitMQ (load balancing and asynchronous calls in a micro/web services architecture).
I added
GlobalHost.Configuration.MaxIncomingWebSocketMessageSize = null;
in Startup.cs on both IIS applications in order to accept large messages.
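For context, here is roughly what my Startup.cs looks like (a simplified sketch; hub registration details omitted):

using Microsoft.AspNet.SignalR;
using Owin;

public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        // null removes the default 64 KB cap on incoming WebSocket messages.
        GlobalHost.Configuration.MaxIncomingWebSocketMessageSize = null;
        app.MapSignalR();
    }
}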
When I look at Pricing's memory consumption in Windows Task Manager, it quickly grows from 500 MB to 1,500 MB (in 5 minutes, dealing with never-ending queries from Backend to test the setup).
I tried something else: writing the query content to files in a shared folder and publishing only the file name in the RabbitMQ messages. With that change (plus, of course, a code modification to load the file), Pricing's memory consumption doesn't move and stays around 500 MB.
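A sketch of that workaround (channel and payloadJson stand in for my real RabbitMQ IModel and query content; the share path and queue name are placeholders):

// using System; using System.IO; using System.Text; using RabbitMQ.Client;
var fileName = Guid.NewGuid() + ".json";
File.WriteAllText(Path.Combine(@"\\share\payloads", fileName), payloadJson);
// Publish only the small file name instead of the full payload.
channel.BasicPublish("", "pricing-queue", null, Encoding.UTF8.GetBytes(fileName));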
So it seems to have something to do with the length of the messages my console passes to IIS.
I tried disconnecting the console from the IIS hubs because I thought it might free some memory, but no.
Has anyone experienced this issue of memory consumption with large messages in hubs? How can I check whether there's indeed a memory leak in my application?
What about using SignalR and RabbitMQ in a web/micro services environment? Any feedback?
Many thanks,
Jean-Francois
.NETFramework : 4.5
Microsoft.AspNet.SignalR : 2.4.1
So it seems that the version of SignalR I use (.NET Framework) allows tuning the number of messages per hub per connection kept in memory.
I set it to an arbitrary 50 in Startup.cs:
GlobalHost.Configuration.DefaultMessageBufferSize = 50;
Its default value is 1000, meaning (if I understood correctly) that IIS keeps a circular buffer of 1,000 messages in memory. Some of the messages weigh 2.5 MB, meaning the memory used could go up to 2,500 MB per connection.
As my IIS application only has one connection (its console) and doesn't need to keep track of messages (as it works as a web service), 1,000 messages seems way too much.
With the limit of 50 messages, the memory used by the application in Windows Task Manager stays put (around 500 MB).
Is there any flaw in the way I'm using it ?
Thanks !
I have an Azure function app triggered by an HttpRequest. The function app reads the request, tosses one copy of it into a storage table for safekeeping and sends another copy to a queue for further processing by another element of the system. I have a client running an ApacheBench test that reports approximately 148 requests per second processed. That rate of processing will not be enough for our expected load.
My understanding of function apps is that they should spawn as many instances as needed to handle the load sent to them. But this function app might not be scaling out quickly enough, as it's only handling those 148 requests per second. I need it to handle at least 200 requests per second.
I’m not 100% sure the problem is on my end, though. In analyzing the performance of my function app I found a LOT of 429 errors. What I found online, particularly https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits, suggests that these errors could be due to too many requests being sent from a single IP. Would several ApacheBench 10K and 20K request load tests within a given day cause the 429 error?
However, if that’s not it, if the problem is with my function app, how can I force my function app to spawn more instances more quickly? I assume this is the way to get more throughput per second. But I’m still very new at working with function apps so if there is a different way, I would more than welcome your input.
Maybe the Premium app service plan that’s in public preview would handle more throughput? I’ve thought about switching over to that and running a quick test but am unsure if I’d be able to switch back?
Maybe EventHub is something I need to investigate? Is that something that might increase my apparent throughput by catching more requests and holding on to them until the function app could accept and process them?
Thanks in advance for any assistance you can give.
You don't provide much context about your app, but here are a few steps that may help.
If you want more control, use an App Service plan with Always On to avoid cold starts; note that in this plan you are responsible for scaling, so you will need to configure auto-scaling, which is not enabled by default.
Your Azure Function must be fully async: you have external dependencies, so you don't want to block threads while calling them.
Look at the limits; you can tweak them via host.json.
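For example, with the Functions v2 host.json schema, the HTTP concurrency settings look roughly like this (values are placeholders, not recommendations):

{
  "version": "2.0",
  "extensions": {
    "http": {
      "maxOutstandingRequests": 200,
      "maxConcurrentRequests": 100,
      "dynamicThrottlesEnabled": true
    }
  }
}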
A 429 error means the function is too busy to process your request, so probably your table write is not async and is blocking a thread.
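A minimal sketch of what a fully async version of your function might look like (the queue/table names, POCO, and bindings are illustrative, assuming the v2 C# model with storage output bindings):

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using System;
using System.IO;
using System.Threading.Tasks;

public static class IngestFunction
{
    [FunctionName("Ingest")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        [Queue("incoming")] IAsyncCollector<string> queue,     // illustrative queue name
        [Table("Requests")] IAsyncCollector<RequestRow> table) // illustrative table name
    {
        // Read the body asynchronously; never block with .Result or .Wait().
        string body = await new StreamReader(req.Body).ReadToEndAsync();

        // Both writes are awaited, so no thread is blocked on storage I/O.
        await table.AddAsync(new RequestRow
        {
            PartitionKey = "req",
            RowKey = Guid.NewGuid().ToString(),
            Body = body
        });
        await queue.AddAsync(body);

        return new OkResult();
    }

    public class RequestRow
    {
        public string PartitionKey { get; set; }
        public string RowKey { get; set; }
        public string Body { get; set; }
    }
}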
Function apps work very well and scale as documented. The problem could be that the requests are coming from a single IP and Azure could be treating that as a DDoS attempt. You can do the following:
Azure DevOps Load Test
You can load test using one of the Azure services; I am fairly sure they have better handling of client IPs: Azure DevOps Load Test.
Provision a VM in Azure
What I normally do is provision a VM (Windows 10 Pro) in Azure and use JMeter to load test. I have used this method and it works fine. You can provision a couple of them and subdivide the load.
Use professional load testing services
If possible, you may use services like Loader.io. They use sophisticated algorithms to run the load test and provision a bunch of VMs to run the same test.
Use Application Insights
If you aren't already, you should be using Application Insights to get a better view from the server's perspective. Go to Live Metrics Stream and see how many instances are provisioned to handle the load test. You can easily look into any events and error logs that arise, and deep dive into each associated dependency to investigate the problem.
I'm running an Azure Function app on Consumption Plan and I want to monitor the amount of instances currently running. Using REST API endpoint of format
https://management.azure.com/subscriptions/{subscr}/resourceGroups/{rg}
/providers/Microsoft.Web/sites/{appname}/instances?api-version=2015-08-01
I'm able to retrieve the instances. However, the result doesn't match the information that I see in Application Insights / Live Metrics Stream.
For example, right now App Insights shows 4 servers online, while API call returns just one (the GUID of this 1 instance is also among App Insights guids).
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
UPDATE: It looks like the data from the REST API is wrong.
I was sending 10000 messages to the queue, logging each function call with respective instance ID which processed the request.
While messages keep coming in and the backlog grows, the instance count from the REST API seems to be correct (scaled from 1 to 12). After the sending stops, the reported instance count rapidly goes down (eventually back to 1, while the processors are still busy).
But based on the processing speed and the execution logs, I can tell that the actual instance count kept growing and ended up at 15 instances at the moment the last message was processed.
UPDATE 2: It looks like the SDK refuses to report more than 20 servers. The metric flattens out at 20, while App Insights kept growing steadily and is already showing 41.
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
Based on my understanding, we need to use the REST API endpoint to retrieve the instances. App Insights can be configured for multiple web apps, so the number of servers online in App Insights may span multiple web apps.
Updated:
Based on my test, the numbers in Application Insights may not be real time.
During my test, when the function app scaled out I could retrieve multiple instances with the REST API below, and I could also check the number of servers online in App Insights.
https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourcegroup}/providers/Microsoft.Web/sites/{functionname}/instances?api-version=2016-08-01
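For reference, a sketch of calling that endpoint from C# (it assumes you already have an ARM bearer token, e.g. from az account get-access-token, in the ARM_TOKEN environment variable; the {placeholders} must be filled in):

using System;
using System.Net.Http;
using System.Net.Http.Headers;

var url = "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourcegroup}" +
          "/providers/Microsoft.Web/sites/{functionname}/instances?api-version=2016-08-01";

using (var client = new HttpClient())
{
    client.DefaultRequestHeaders.Authorization =
        new AuthenticationHeaderValue("Bearer", Environment.GetEnvironmentVariable("ARM_TOKEN"));
    // The response is a JSON array of instance objects; its length is the instance count.
    string json = await client.GetStringAsync(url);
    Console.WriteLine(json);
}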
But after I finished the test, the REST API reported an instance count of 1, which to my understanding is the correct result.
At the same time, App Insights still showed the number of servers online as the maximum from my test.
And after a while, the number of servers online in Application Insights also became 1.
So if you want to get the number of instances for an Azure Function, my suggestion is to use the REST API.
Update 2:
David Ebbo mentioned that the REST API is not always reliable:
Unfortunately, the REST API is not always reliable. Specifically, when a Function App scales across multiple scale units, only the instances from the 'home' scale unit are reflected. You probably will not see this in a smallish test, but likely will if you start scaling out widely (say over 20 instances).
This question is for Continuous Web Jobs.
Main Questions
How can we "VIEW" or programmatically "LOG" the current memory & network status of a VM running a Continuous Web Job?
Background:
Our web job is scraping some API and we keep getting 500 errors. We believe the VM is firing too many API-request threads, and then, because of network limitations, too many responses come back at the same time, overloading the VM's network capacity.
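For reference, this is the kind of throttling we are considering, a sketch using SemaphoreSlim to cap concurrent requests (the limit of 10 and the URL list are placeholders):

using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

var throttle = new SemaphoreSlim(10); // at most 10 requests in flight
var client = new HttpClient();
var urls = Enumerable.Range(0, 100).Select(i => $"https://api.example.com/items/{i}");

var tasks = urls.Select(async url =>
{
    await throttle.WaitAsync(); // wait for a free slot before sending
    try { return await client.GetStringAsync(url); }
    finally { throttle.Release(); }
});

var results = await Task.WhenAll(tasks);
Console.WriteLine($"Fetched {results.Length} responses.");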
Side Questions:
How would you use MS Azure to Web scrape - and make sure you don't overload (in terms of memory + network) the VM it's running on?
(It seems that for background processing, these VMs are built for CPU calculation - not for Web/API scraping)
I'm still using the Monitoring (Classic) APIs currently. I've not found a "non-classic" version of the API, but I've also not spent much time looking. Since a web job runs as part of the Web App, you'll need to monitor the web app using the tools provided in the Microsoft.WindowsAzure.Management.Monitoring.Metrics namespace.
I found the API to be somewhat confusing, but spent some time working with the product group to get it right. I've provided some sample code on the MSPFE GitHub page at: https://github.com/mspfe/AzureMetricsAPISampleKit. Running the "tests" in this solution will show you how to use the lib.
You first need to identify the web app by getting a list of them:
var webSpaceList = _webSiteClient.WebSpaces.List();
Then collect the available metrics:
foreach (var website in websiteList) // websiteList is derived from webSpaceList (see the sample kit)
{
    // Discover which metric definitions exist for this site...
    MetricDefinitionListResponse wsMetricListResponse =
        _metricsClient.MetricDefinitions.List(website.WebsiteResourceId, null, null);
    website.MetricDefinitionsList = wsMetricListResponse.MetricDefinitionCollection;

    // ...collect their names...
    website.MetricNamesList = new List<string>();
    foreach (var metric in website.MetricDefinitionsList.Value)
    {
        website.MetricNamesList.Add(metric.Name);
    }

    // ...then fetch the values for the chosen time grain and window.
    MetricValueListResponse wsValueResponse = _metricsClient.MetricValues.List(
        website.WebsiteResourceId, website.MetricNamesList, "",
        _timeGrain, _startDateTime, _endDateTime);
    website.MetricValueList = wsValueResponse.MetricValueSetCollection;
}
From there you should have metric definitions and values. Sorry if this code is a little dated... but it should work.
Azure WebJobs run within your Azure App Service's web app (formerly called Websites). So, your capacity is governed by the size (and quantity) of Web App instances, whether free tier or one of the paid tiers. And you'd measure your utilization against the Web App instances.
Your side question, about how to use Azure to web scrape, is not answerable here: It's an opinion-based question with no right answer.