Azure Websites - monitoring data

We're trying to understand the intricacies of monitoring data that Windows Azure Management API returns for Azure Websites (not Webroles)
For example, the image below describes a data point retrieved for CPUTime. It appears to indicate that during the 10:00pm thru 10:39pm range, I've used up 3.171 seconds of CPU. Is this translatable to CPU utilization (in percentage form) that we're all accustomed to seeing in Perfmon?
Does this get reset every clock hour and what is TimeGrain?
Interestingly, the "Count" indicates "1" - which to me implies the number of measurements in the timeslot, but even after subsequent calls are issued to the API, the Count stays at 1 (however the Total value changes).
Ultimately the goal is to translate the captured metric to the standard CPU utilization % that everyone is accustomed to seeing in Perfmon.
I'm guessing that two relatively close measurements need to be taken, the delta between measurements computed (in milliseconds) and divided by the total span between the measurements (in milliseconds) - in order to produce a percentage value. Is this correct?
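The delta calculation described in the question can be sketched as follows. This is only the arithmetic, assuming the reported Total behaves like a cumulative CPU-time counter in milliseconds (an assumption, not something the API is confirmed to guarantee); the function name and the sample readings are hypothetical.

```python
from datetime import datetime


def cpu_utilization_percent(cpu_ms_start, cpu_ms_end, t_start, t_end, cores=1):
    """Estimate average CPU utilization (%) between two cumulative CPU-time readings."""
    wall_ms = (t_end - t_start).total_seconds() * 1000.0
    if wall_ms <= 0:
        raise ValueError("t_end must be later than t_start")
    delta_cpu_ms = cpu_ms_end - cpu_ms_start
    if delta_cpu_ms < 0:
        # A negative delta suggests the counter was reset between readings.
        raise ValueError("CPU counter went backwards between readings")
    # CPU time consumed divided by the CPU time that was available.
    return 100.0 * delta_cpu_ms / (wall_ms * cores)


# Hypothetical readings taken one minute apart (values in milliseconds).
first = datetime(2014, 1, 1, 22, 0, 0)
second = datetime(2014, 1, 1, 22, 1, 0)
usage = cpu_utilization_percent(3171.0, 4671.0, first, second)
print(f"{usage:.2f}%")
```

The result is an average over the interval between the two readings, so shorter intervals give a figure closer to what Perfmon shows.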

Azure Web Sites in 'Free' and 'Shared' mode run in a multi-tenant environment, so you can't translate CpuTime to a CPU utilization % in this case. In 'Reserved' mode it is technically possible, but this value is not currently exposed. Please also note that if you upgrade your web site to 'Reserved' mode, all other web sites will also be upgraded and will share the same reserved instances.

Related

Azure cloud service auto scaling through classic portal

We have an IaaS cloud service that we are trying to auto-scale. Oddly, we never saw a scale action happen. We configured auto-scale based on the CPU metric with a range of 20-60. Looking at the logs of one of the active servers, its CPU is at 40%, but no extra instance was booted up and added to the farm.
The Microsoft documentation says scaling is 'based on the average percentage of CPU resources that it uses.' What does this average mean: a daily average, an hourly average, or (ideally, as we had assumed) the average over the time (the scale-up wait time) since the farm's last scale action?
All instances are included when calculating the average percentage of CPU usage and the average is based on use over the previous hour.
https://azure.microsoft.com/en-us/documentation/articles/cloud-services-how-to-scale/#automatically-scale-an-application-running-web-roles-worker-roles-or-virtual-machines
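As a rough illustration of the quoted behaviour, the sketch below averages CPU samples from every instance over one window and compares the result with the configured range. It assumes that a scale-out happens only above the upper bound and a scale-in only below the lower bound; the function names, instance names and sample values are hypothetical.

```python
def average_cpu(samples_by_instance):
    """Mean of all CPU samples (%) across all instances in the window."""
    all_samples = [s for samples in samples_by_instance.values() for s in samples]
    if not all_samples:
        raise ValueError("no CPU samples in the window")
    return sum(all_samples) / len(all_samples)


def scale_decision(avg_cpu, low=20.0, high=60.0):
    """Assumed semantics: act only when the average leaves the target range."""
    if avg_cpu > high:
        return "scale out"
    if avg_cpu < low:
        return "scale in"
    return "no change"


# Hypothetical samples for the previous hour, in percent.
last_hour = {"vm-1": [40.0, 42.0, 38.0], "vm-2": [35.0, 45.0, 40.0]}
avg = average_cpu(last_hour)
decision = scale_decision(avg)
print(avg, decision)
```

Under these assumptions, an average of 40% sits inside a 20-60 range, so no instance would be added.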

How long does it take for Azure Websites to Autoscale?

I am using MVC3, ASP.NET4.5, C#, Razor, EF6.1, SQL Azure
I have been doing some load testing using JMeter, and I have found some surprising results.
I have a test of 30 concurrent users, ramping up over 10 secs. The test plan is fairly simple:
1) Login
2) Navigate to page
3) Do query
4) Navigate back
5) Logout.
I am using "small" "standard" instances.
I have found that when my Azure setup is configured to "autoscale", it behaves like my test with one "small" instance with no autoscale. When I setup two "small" instances with no autoscale, it goes twice as fast, or rather the average process time per request is 2x, over the test. So it appears that it is NOT autoscaling. I have tried setting the CPU trigger to a lower target ie 40-70. Still no joy.
On further investigation, when "Autoscale" was first introduced, it seems it evaluated the metrics over the previous hour, and now I see references to "10 minutes". I thought that once the CPU started hitting the target value, it immediately triggered a new instance, which must be the whole point of "autoscale". If I have a burst of concurrent usage, I need the extra instances now, which is one reason for using a PaaS. Since my test took less than 10 minutes, "Autoscale" never kicked in. So how long should Autoscale take to kick in?
Thanks.
Azure will check the CPU metric every 5 minutes, and if it exceeds the threshold that is set, will increase the instance count at that point.
Interestingly, Azure will decrease instance counts after 2 hours of remaining below the threshold.
Source: How to Scale Websites
Quoted relevant section:
Note: When Scale by Metric is enabled, Microsoft Azure checks the CPU
of your website once every five minutes and adds instances as needed
at that point in time. If CPU usage is low, Microsoft Azure will
remove instances once every two hours to ensure that your website
remains performant. Generally, putting the minimum instance count at 1
is appropriate. However, if you have sudden usage spikes on your
website, be sure that you have a sufficient minimum number of
instances to handle the load. For example, if you have a sudden spike
of traffic during the 5 minute interval before Microsoft Azure checks
your CPU usage, your site might not be responsive during that time. If
you expect sudden, large amounts of traffic, set the minimum instance
count higher to anticipate these bursts.
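To see why a short load test may never trigger a scale-out, here is a deliberately simplified model of the five-minute check described in the quoted note. It assumes the check looks only at the CPU value at the moment of the check; the real evaluation may average over a window. The function name, threshold and sample data are hypothetical.

```python
def first_scale_out_minute(cpu_by_minute, threshold=70.0, check_interval=5):
    """Return the minute at which a periodic check first sees CPU above the
    threshold, or None if no check ever does."""
    for minute, cpu in enumerate(cpu_by_minute):
        is_check = minute > 0 and minute % check_interval == 0
        if is_check and cpu > threshold:
            return minute
    return None


# A three-minute burst that ends before the first check at minute 5.
short_burst = [95, 95, 95, 20, 20, 20, 20, 20]
# Load that is still high when the first check happens.
sustained = [95] * 8

print(first_scale_out_minute(short_burst))
print(first_scale_out_minute(sustained))
```

In this model the burst goes unnoticed, which matches the note's advice to raise the minimum instance count when sudden spikes are expected.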
It is now possible in the new Azure portal (https://portal.azure.com) to configure scaling based upon different metrics:
CPU
Memory usage
Data in/out
Http queue length
Disk Queue length
You can also configure the scale up time and scale down time. The graph shows the current number of instances (solid line) versus your configured maximum (dashed line), along with your configured metrics. When a metric exceeds the line (the configured scale up value for that metric), the site scales up, and vice versa.

Azure web site questions

I currently have a web application deployed to "Web Sites" - This is configured in standard mode and it performs really well from what I have seen so far.
I have a few questions:
1) My instance size is currently small; however, I can scale out to 10 instances. Does this also mean that if I change my instance size to medium or large, I can still have 10 instances?
2) What is the maximum number of instances I can have for an Azure web site?
3) Is there any SLA for a single Azure instance?
4) Is it possible to change the instance size programmatically, or is it better to just change the instance count?
1) Yes
2) 10 for standard.
3) Yes, for Websites Basic and Standard, MS guarantee a 99.9% monthly availability.
4) It depends on a lot of factors. The real question is "Is it better for your app to scale up or scale out?"
Yes, the default limit is 10 instances regardless of the size.
The default limit is 10 instances, but you can contact Azure Support to have the limit increased. Default and "real" limits for Azure services are documented here.
According to the Websites pricing page Free and Shared sites have no SLA and Basic and Standard sites have 99.9% uptime SLA. Having a single instance means that during the 0.1% outage time (43.8 minutes per month) your site will be down. If you have multiple instances then most likely at least one will be up at any given time.
Typically instance auto-scaling is used to handle variation in demand while instance size would be used for application performance. If you only get 100 requests per day but each request is slow because it's maxing out CPU then adding more instances won't help you. Likewise if you're getting millions of requests that are being processed quickly but the volume is maxing out your resources then adding more instances is probably the better solution.
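The downtime figure quoted above for a 99.9% SLA can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is hypothetical, and the default month length (365.25 / 12 days) is an assumption chosen because it reproduces the 43.8-minute figure.

```python
def allowed_downtime_minutes(sla_percent, days_in_month=365.25 / 12):
    """Minutes per month a service may be down while still meeting the SLA."""
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (1 - sla_percent / 100.0)


avg_month = allowed_downtime_minutes(99.9)
thirty_day = allowed_downtime_minutes(99.9, days_in_month=30)
print(f"average month: {avg_month:.1f} min, 30-day month: {thirty_day:.1f} min")
```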

Azure websites scaling issue

I am using an Azure Websites solution with 20 websites, hosted on a 4-core, 8 GB RAM standard instance. I would like to know how to scale in Azure Websites and when to do it.
Also I am reading some values from the new azure portal.
Can someone guide me on the values that I see here ?
Thank you
Averages
The Avg % is telling you, on average, how much of that resource is being used. So, if you have 8 GB of RAM, and you are typically using 66% of it, then you are averaging 5.28 GB of RAM used. The same goes for the CPU average listed below.
For the totals, I have no idea.
You're not using much of the CPU available to you here, but you are definitely taking advantage of the RAM. I'm not sure what kind of web application you are running though, so it's difficult to determine what could be causing this.
Scaling
In terms of scaling, I always suggest starting with a small machine, then gradually scaling up.
Based on your usage, I'd drop to a machine that has fewer CPU cores, but more available RAM. From within your dashboard, you can see how to scale by clicking on your web app, then scrolling down. Click on the scale tab and it should appear as it does below:
You can now adjust what you want to scale by. The default setting is CPU Percentage, but that isn't particularly useful in this case. Instead, select Schedule and performance rules and a new panel will appear. On the right hand side, select Metric name and look for Memory Percentage.
In your particular case, this is helpful as we saw that your RAM is consistently being used.
Under Action, choose Increase count by and set the number of VMs to 1. With this rule, when your RAM reaches a certain usage %, Azure will auto-scale and add a new VM for you. After a cool down period of 5 minutes (the default, listed at the bottom), your site will revert to 1 machine.
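The rule described above can be sketched as a small simulation. This is not how Azure implements it: the threshold value and all names are hypothetical, the cool down is modelled as the minimum wait between two scale-out actions (an assumption), and scaling back in is left out.

```python
from dataclasses import dataclass


@dataclass
class ScaleRule:
    threshold_percent: float = 80.0  # hypothetical memory threshold
    increase_by: int = 1             # "Increase count by" value
    cooldown_minutes: int = 5        # assumed minimum wait between actions
    max_instances: int = 10


def simulate(rule, memory_by_minute, start_instances=1):
    """Return the instance count after each minute of memory readings."""
    instances = start_instances
    last_action = None
    history = []
    for minute, mem in enumerate(memory_by_minute):
        cooled = last_action is None or minute - last_action >= rule.cooldown_minutes
        if mem >= rule.threshold_percent and cooled and instances < rule.max_instances:
            instances = min(instances + rule.increase_by, rule.max_instances)
            last_action = minute
        history.append(instances)
    return history


history = simulate(ScaleRule(), [60, 85, 90, 90, 90, 90, 90, 50])
print(history)
```

In this model the second instance is added as soon as memory crosses the threshold, and a third only once the cool down has elapsed.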
Conclusion
With these settings, each time your website uses >= (Select your percentage) of RAM, Azure will increase the number of machines.
In your case, I suggest using fewer cores, but more RAM.
Make sure you save your settings, with the Save button above.
Scott Hanselman has a great blog post on how to make sense of all of this.

Number of instances needed for windows azure application

I'm fairly new to Windows Azure and want to host a survey application that will be filled out by approximately 30,000 users simultaneously.
The application consists of one .aspx page that is sent to the client once, asks 25 questions and gives a wrap-up of the given answers at the end. When the user has given an answer and hits the 'next question' button, the answer is sent to the server via an .ashx handler. The response is the next question and its answers. The wrap-up is sent to the client after a full postback.
The answer is saved in an Azure Table that is partitioned so that each partition can hold a max of 450 users.
I would like to ask if someone can give an estimated guess about how many web-role instances we need to start in order to have this application keep running. (If that is too hard to say, is it more likely to start 5, 50 or 500 instances?)
What is a better way to go: 20 small instances or 5 large instances?
Thanks for your help!
The most obvious answer: you would be best served by testing this yourself and see how your application holds up. You can easily get performance counters and other diagnostics out of Windows Azure; for instance, you can connect Microsoft SCOM (System Center Operations Manager) to monitor your environment during test. Site Hammer is a simple load testing tool for Windows Azure (on MSDN code gallery).
Apart from this very obvious answer, I will share some guesstimates: given the type of load, you are probably better off with more small instances as opposed to a lower number of large ones, especially since you already have your storage partitioned. If you are really going to have 30K visitors simultaneously and give them a ~15 second interval between reading the questions and posting their answers, you are looking at 2,000 requests per second. 10 nodes should be more than enough to handle that load. Remember that this is just a simple estimate, lacking any form of insight into your architecture, etc. For these types of loads, caching is a very good idea; it will dramatically increase the load each node can handle.
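The estimate above is simple enough to reproduce. In this sketch the function names are hypothetical, and the per-instance capacity of 200 requests per second is an assumption implied by "10 nodes for 2,000 requests per second", not a measured figure; only a load test can supply the real number.

```python
import math


def requests_per_second(concurrent_users, seconds_between_requests):
    """Steady-state request rate if every user posts once per interval."""
    return concurrent_users / seconds_between_requests


def instances_needed(total_rps, rps_per_instance):
    """Round up, because a fraction of an instance cannot be deployed."""
    return math.ceil(total_rps / rps_per_instance)


rps = requests_per_second(30_000, 15)
nodes = instances_needed(rps, rps_per_instance=200)  # 200 is a hypothetical capacity
print(rps, nodes)
```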
However, the best advice I can give you is to make sure that you are actively monitoring. It takes less than 30 minutes to spin up additional instances, so if you monitor your environment and/or make sure that you are notified whenever it starts to choke, you can easily upgrade your setup. Keep in mind that you do need to contact customer support to be able to go over 20 instances (this is a default limit, in place to protect you from over-spending).
Aside from the sage advice tijmenvdk gave you, let me add my opinion on instance size. In general, go with the smallest size that will support your app, and then scale out to handle increased traffic. This way, when you scale back down, your minimum compute cost is kept low. If you ran, say, a pair of extra-large instances as your baseline (since you always want minimum two instances to get the uptime SLA), your cost footprint starts at 0.12 x 8 x 2 = $1.92 per hour, even during low-traffic times. If you go with small instances, you'd be at 0.12 x 1 x 2 = $0.24 per hour.
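The baseline cost comparison above works out as follows. The rate of 0.12 per core-hour and the core counts are taken from the answer; the function name is hypothetical, and current pricing will differ.

```python
def hourly_cost(rate_per_core_hour, cores_per_instance, instances):
    """Hourly compute cost for a fixed number of identical instances."""
    return rate_per_core_hour * cores_per_instance * instances


extra_large = hourly_cost(0.12, 8, 2)  # two extra-large (8-core) instances
small = hourly_cost(0.12, 1, 2)        # two small (1-core) instances
print(f"extra-large: ${extra_large:.2f}/h, small: ${small:.2f}/h")
```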
Each VM size has associated CPU, memory, and local (non-durable) disk storage, so pick the smallest size unit that your app works efficiently in.
For load/performance-testing, you might also want to consider a hosted solution such as Loadstorm.
How simultaneous are the requests in reality?
Will they all type the address in at exactly the same time?
That said, profile your app locally, this will enable you to estimate CPU, Network and Memory usage on Azure. Then, rather than looking at how many instances you need, look at how you can reduce the requirement! Apply these tips, and profile locally again.
Most performance tips have a tradeoff between CPU, memory or bandwidth usage; the idea is to ensure that they scale equally. If your application runs out of memory, but you have loads of CPU and network, don't
For a single page survey, ensure your HTML, CSS & JS are minified, and ensure they're cacheable.
Combine them if possible, and to get really scalable, push static files (CSS, JS & images) to a CDN. This all reduces the number of requests the web server has to deal with, and therefore reduces the number of web roles you will need = less network.
How does the ashx return the response? i.e. is it sending html, xml or json?
Personally, I'd get it to return JSON, as this will require less network bandwidth, and most likely less server side processing = less memory and network.
Use asynchronous APIs to access Azure storage (this uses I/O completion ports to free up the IIS thread to handle more requests until Azure storage comes back = enabling CPU to scale).
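The effect of non-blocking storage calls can be illustrated with Python's asyncio as an analogy; this is not the .NET storage API and involves no I/O completion ports. FakeStorage is a hypothetical stand-in whose write only simulates latency, and it records how many calls were in flight at once on a single thread.

```python
import asyncio


class FakeStorage:
    """Hypothetical stand-in for a storage client with a non-blocking write."""

    def __init__(self, latency=0.01):
        self.latency = latency
        self.in_flight = 0
        self.peak_in_flight = 0

    async def write(self, answer):
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        # While this call waits, the event loop is free to start other requests.
        await asyncio.sleep(self.latency)
        self.in_flight -= 1
        return f"saved:{answer}"


async def handle_requests(storage, answers):
    # Start all writes concurrently and wait for every one to finish.
    return await asyncio.gather(*(storage.write(a) for a in answers))


storage = FakeStorage()
results = asyncio.run(handle_requests(storage, ["a", "b", "c", "d"]))
print(results, storage.peak_in_flight)
```

All four writes overlap on one thread, which is the property that lets a server keep accepting requests while storage calls are pending.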
tijmenvdk has already mentioned using queues to write. Does the list of questions change? If not, cache them, so that the app only has to read from table storage once on start-up and once for each client for the final wrap-up = saves network and CPU at the expense of memory.
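A minimal sketch of the caching idea, assuming the question list never changes while the application is running. read_questions_from_storage is a hypothetical placeholder for the real table-storage read, and the counter exists only to show how often it is called.

```python
from functools import lru_cache

storage_reads = 0


def read_questions_from_storage():
    """Hypothetical placeholder for the table-storage read of the 25 questions."""
    global storage_reads
    storage_reads += 1
    return tuple(f"Question {i}" for i in range(1, 26))


@lru_cache(maxsize=1)
def get_questions():
    # The first call reads from storage; later calls are served from memory.
    return read_questions_from_storage()


for _ in range(1000):
    questions = get_questions()

print(len(questions), storage_reads)
```

If the questions can change at runtime, the cache would need an expiry or an explicit get_questions.cache_clear() call.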
All of these tips are equally applicable to a normal web application, on a single server or web-farm environment.
The point I'm trying to make is that what you can't measure, you can't improve, and measurement, improvement and cost all go hand in hand. Dynamic scaling will reduce costs, but fundamentally, if your application hasn't been measured and its resource usage optimised, asking how many instances you need is pointless.
