I am using MVC3, ASP.NET4.5, C#, Razor, EF6.1, SQL Azure
I have been doing some load testing using JMeter, and I have found some surprising results.
I have a test of 30 concurrent users, ramping up over 10 secs. The test plan is fairly simple:
1) Login
2) Navigate to page
3) Do query
4) Navigate back
5) Logout.
I am using "small" "standard" instances.
I have found that when my Azure setup is configured to "autoscale", it behaves like my test with one "small" instance and no autoscale. When I set up two "small" instances with no autoscale, it goes twice as fast, or rather the average processing time per request is halved over the test. So it appears that it is NOT autoscaling. I have tried setting the CPU trigger to a lower target, i.e. 40-70. Still no joy.
On further investigation, when "Autoscale" was first introduced, it seems it evaluated the metrics over the previous hour, and now I see references to "10 minutes". I thought that once the CPU started hitting the target value, it immediately triggered a new instance, which must be the whole point of "autoscale". If I have a burst of concurrent usage, I need the extra instances now, which is one reason for using a PaaS. Since my test took less than 10 minutes, "Autoscale" never kicked in. So how long should Autoscale take to kick in?
Thanks.
Azure will check the CPU metric every 5 minutes, and if it exceeds the threshold that is set, will increase the instance count at that point.
Interestingly, Azure will decrease instance counts after 2 hours of remaining below the threshold.
Source: How to Scale Websites
Quoted relevant section:
Note: When Scale by Metric is enabled, Microsoft Azure checks the CPU
of your website once every five minutes and adds instances as needed
at that point in time. If CPU usage is low, Microsoft Azure will
remove instances once every two hours to ensure that your website
remains performant. Generally, putting the minimum instance count at 1
is appropriate. However, if you have sudden usage spikes on your
website, be sure that you have a sufficient minimum number of
instances to handle the load. For example, if you have a sudden spike
of traffic during the 5 minute interval before Microsoft Azure checks
your CPU usage, your site might not be responsive during that time. If
you expect sudden, large amounts of traffic, set the minimum instance
count higher to anticipate these bursts.
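To see why a short load test may never trigger a scale-out, here is a minimal sketch of the timing described in the quoted note. It assumes the five-minute check interval from the note and, purely for illustration, that checks fall on a fixed grid (t = 0, 5, 10, ... minutes). The function names are hypothetical and this is not Azure's actual implementation.

```python
import math

CHECK_INTERVAL_MIN = 5  # from the quoted note: CPU is checked every five minutes


def minutes_until_next_check(spike_start_min, interval_min=CHECK_INTERVAL_MIN):
    """Minutes from the start of a spike until the next periodic check.

    Assumes checks happen at t = 0, interval, 2 * interval, ... (an
    illustrative assumption; the real schedule is not documented here).
    """
    next_check = math.ceil(spike_start_min / interval_min) * interval_min
    # A spike that starts exactly on a check is seen immediately (0 minutes).
    return next_check - spike_start_min


def checks_during_test(start_min, duration_min, interval_min=CHECK_INTERVAL_MIN):
    """Number of periodic checks that fall inside a load-test window."""
    first = math.ceil(start_min / interval_min)
    last = math.floor((start_min + duration_min) / interval_min)
    return max(0, last - first + 1)


print(minutes_until_next_check(1))  # spike at minute 1 waits 4 minutes
print(checks_during_test(1, 3))     # a 3-minute test starting at minute 1 sees 0 checks
```

Even when a check does land inside the test window, the new instance still needs time to start, so a test of only a few minutes can finish before any extra capacity is serving traffic.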
It is now possible in the new Azure portal (https://portal.azure.com) to configure scaling based upon different metrics:
CPU
Memory usage
Data in/out
HTTP queue length
Disk queue length
You can also configure the scale-up and scale-down times. The graph shows the current number of instances (solid line) against your configured maximum (dashed line), along with your configured metrics. When a metric exceeds its line (the configured scale-up threshold for that metric), the site scales up, and vice versa.
I’m about to start work on an API that will literally go from 0 RPS to a couple hundred thousand HTTP RPS at the same time and run at that rate for ~2 mins. All processing of those 30 million requests must finish by the end of that 2 min period. This would happen 7 times a WEEK.
Going serverless with Azure Functions in Consumption Plan Hosting Mode sounds appealing. This document describes that a scale controller exists to coordinate app instances, but doesn't really discuss what I can expect from it for HTTP triggers. I can’t find any info that says the scale controller will be able to respond in the time frame I'd need.
The best info I could find was this, which says it took nearly 8 minutes to scale up in the author's tests.
Is this a bad use case for Azure Functions in consumption mode?
Obviously, spinning up a testing harness that is capable of issuing 30 million requests within 2 minutes is an undertaking of its own, and an expensive one. I'd like to learn from others who have already done so.
Based on my experience, this scenario is not well served by the Consumption Plan. It can scale out to many instances, but not very rapidly. 2 minutes is way too short a window to rely on.
I was mostly working with queues, not HTTP, but I saw delays of up to 40 minutes caused by the slow pace of scaling up.
If you can predict which 2 minutes are going to be heavy-loaded, your best bet could be to provision the capacity with a script (or another Function).
We have an IaaS cloud service and are trying to auto-scale it. Oddly, we never see the scaling happen. We configured auto-scale on the CPU metric with a range of 20-60. According to the logs of one of the active servers, its CPU is at 40%, but no extra instance seems to have been booted and added to the farm.
The Microsoft documentation says scaling is 'based on the average percentage of CPU resources that it uses.' What does this average mean: a daily average, an hourly average, or (ideally, as we had assumed) the average over the time (the scale-up wait time) since the farm last scaled?
All instances are included when calculating the average percentage of CPU usage and the average is based on use over the previous hour.
https://azure.microsoft.com/en-us/documentation/articles/cloud-services-how-to-scale/#automatically-scale-an-application-running-web-roles-worker-roles-or-virtual-machines
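As a rough illustration of the rule described in this answer, the sketch below averages CPU samples across all instances over the evaluation window and compares the result with the 20-60 range from the question. The function name, instance names, and sample values are hypothetical, and the code only mirrors the behaviour as described here, not Azure's internal logic.

```python
from statistics import mean


def scale_decision(samples_by_instance, low=20.0, high=60.0):
    """Return 'out', 'in', or 'none' from per-instance CPU samples (percent).

    samples_by_instance maps an instance name to its CPU % samples for the
    evaluation window (the answer says this is the previous hour).
    """
    # Average each instance over the window, then average across instances.
    # This equals the overall mean only if every instance has the same
    # number of samples, which is assumed here.
    overall = mean(mean(samples) for samples in samples_by_instance.values())
    if overall > high:
        return "out"
    if overall < low:
        return "in"
    return "none"


farm = {
    "web-0": [40, 42, 38, 40],  # hypothetical samples, averaging 40%
    "web-1": [35, 30, 33, 34],  # hypothetical samples, averaging 33%
}
print(scale_decision(farm))  # none: the farm average sits inside 20-60
```

Under this reading, one server at 40% with a range of 20-60 is inside the band, so neither a scale-out nor a scale-in would be expected.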
I currently have a web application deployed to "Web Sites" - This is configured in standard mode and it performs really well from what I have seen so far.
I have a few questions:
1) My instance size is currently small, but I can scale out to 10 instances. Does this also mean that if I change my instance size to medium or large, I can still have 10 instances?
2) What is the maximum number of instances I can have for an Azure web site?
3) Is there any SLA for a single Azure instance?
4) Is it possible to change the instance size programmatically, or is it better to just change the instance count?
1) Yes
2) 10 for standard.
3) Yes, for Websites Basic and Standard, MS guarantee a 99.9% monthly availability.
4) It depends on a lot of factors. The real question is "Is it better for your app to scale up or scale out?"
Yes, the default limit is 10 instances regardless of the size.
The default limit is 10 instances, but you can contact Azure Support to have the limit increased. Default and "real" limits for Azure services are documented here.
According to the Websites pricing page Free and Shared sites have no SLA and Basic and Standard sites have 99.9% uptime SLA. Having a single instance means that during the 0.1% outage time (43.8 minutes per month) your site will be down. If you have multiple instances then most likely at least one will be up at any given time.
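The 43.8-minute figure follows from applying the 0.1% allowance to an average month. A minimal sketch of that arithmetic, assuming an average month of 365.25 / 12 days (the function name is illustrative):

```python
HOURS_PER_YEAR = 365.25 * 24
MINUTES_PER_AVG_MONTH = HOURS_PER_YEAR / 12 * 60  # 43,830 minutes


def allowed_downtime_minutes(sla_percent, minutes_in_period=MINUTES_PER_AVG_MONTH):
    """Downtime budget implied by an availability SLA over a period."""
    return (100.0 - sla_percent) / 100.0 * minutes_in_period


print(round(allowed_downtime_minutes(99.9), 1))  # 43.8
```

For a specific 31-day month the period is 44,640 minutes, so the same SLA allows slightly more downtime (about 44.6 minutes).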
Typically instance auto-scaling is used to handle variation in demand while instance size would be used for application performance. If you only get 100 requests per day but each request is slow because it's maxing out CPU then adding more instances won't help you. Likewise if you're getting millions of requests that are being processed quickly but the volume is maxing out your resources then adding more instances is probably the better solution.
We're trying to understand the intricacies of the monitoring data that the Windows Azure Management API returns for Azure Websites (not web roles).
For example, the image below describes a data point retrieved for CPUTime. It appears to indicate that during the 10:00pm thru 10:39pm range, I've used up 3.171 seconds of CPU. Is this translatable to CPU utilization (in percentage form) that we're all accustomed to seeing in Perfmon?
Does this get reset every clock hour and what is TimeGrain?
Interestingly, the "Count" indicates "1" - which to me implies the number of measurements in the timeslot, but even after subsequent calls are issued to the API, the Count stays at 1 (however the Total value changes).
Ultimately the goal is to translate the captured metric to the standard CPU utilization % that everyone is accustomed to seeing in Perfmon.
I'm guessing that two relatively close measurements need to be taken, the delta between measurements computed (in milliseconds) and divided by the total span between the measurements (in milliseconds) - in order to produce a percentage value. Is this correct?
Azure Web Sites in 'Free' and 'Shared' mode run in a multi-tenant environment, so you can't translate CpuTime to CPU utilization % in that case. In 'Reserved' mode it is technically possible, but this value is not currently exposed. Please also note that if you upgrade your web site to 'Reserved' mode, all your other web sites will also be upgraded and will share the same reserved instances.
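For reference, the delta calculation proposed in the question can be sketched as below. It assumes a dedicated instance, a cumulative CPU-time counter in milliseconds, and a known core count; none of these are confirmed for the API discussed here, and as noted above the result is not meaningful in 'Free' or 'Shared' mode. The function name is hypothetical.

```python
def cpu_percent(cpu_ms_start, cpu_ms_end, wall_ms_start, wall_ms_end, cores=1):
    """Approximate CPU utilization (%) between two cumulative readings.

    cpu_ms_*  : cumulative CPU time consumed, in milliseconds
    wall_ms_* : wall-clock timestamps of the two readings, in milliseconds
    cores     : number of cores available (assumed, not reported by the API)
    """
    elapsed = wall_ms_end - wall_ms_start
    if elapsed <= 0:
        raise ValueError("the second reading must be later than the first")
    used = cpu_ms_end - cpu_ms_start
    return 100.0 * used / (elapsed * cores)


# Figures from the question: 3.171 s of CPU time between 10:00pm and 10:39pm.
window_ms = 39 * 60 * 1000
print(round(cpu_percent(0, 3171, 0, window_ms), 3))  # 0.136
```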
I'm an MSDN subscriber and I'm looking at Azure as a possible platform for a new website that will test the water for a new service. This website is expecting low to very low traffic at the time of launch. I've heard that Azure is very expensive for this kind of traffic level, but since they have this MSDN offer, I thought I'd finally take a look at Azure.
In the offer, I'm looking at getting "750 small compute hours per month". From the reading I've done, it seems that if I purchase nothing more than what's given (although the subscription itself is thousands of dollars, of course), an entire month would be covered. Since 24 (hours) x 31 (max days in a month) = 744, I'm still below my allotted 750 for the month.
Am I missing something else from this simple equation? Are there further aspects that could cause the site to be "turned off" temporarily that should be considered?
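The arithmetic in the question can be checked for any month with a short sketch. The 750-hour allowance comes from the offer quoted above; the function name and the choice of January 2014 as a 31-day example are illustrative assumptions.

```python
import calendar

INCLUDED_SMALL_HOURS = 750  # from the MSDN offer quoted in the question


def instance_hours(year, month, instances=1):
    """Compute hours used by always-on instances in a given month."""
    days = calendar.monthrange(year, month)[1]  # number of days in the month
    return days * 24 * instances


hours = instance_hours(2014, 1)  # a 31-day month
print(hours, hours <= INCLUDED_SMALL_HOURS)  # 744 True
```

A second always-on small instance would double the usage to 1,488 hours, well past the allowance, so the calculation only holds for a single small instance.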
Yes, you can indeed run a small instance during a whole month. Or you can have 2 extra-small instances instead (having 2 instances means you're covered by the SLA).
There are 2 other things you need to consider:
Depending on your subscription you can have maximum 45GB of storage (blob/table/queue). If you use Virtual Machines you need to know that the system disk (and additional data disks) are persisted as blobs, so make sure not to reach the limit here.
Other limits apply as well, but the most important one besides storage is the outbound data transfer allowance, which is also quite low (max 35GB out).
If you're expecting very low traffic, did you ever consider Windows Azure Web Sites? You get 10 of those for free during 12 months. The free ones run on shared instances, but they are perfect to host the first low-traffic version of your app.