Is there a page explaining how this actually works? We're developing an API-only webservice and it's getting hit reasonably hard every day, but fairly consistently around mid-afternoon it starts spitting back "403 - Site Disabled". I'm assuming it's because we've exceeded the "4 CPU Hours per Day" quota you get for the shared pricing tier, but there's nothing unambiguously showing that this is the problem, and I'm struggling to understand what is the expected behaviour once you've exceeded this quota - obviously it doesn't shut itself down for 20 hours until your next 24 hour period starts, but at what point do you get full service availability? And surely there's some sort of chart or report I can get to confirm this is what's happening? Not even what's in the billing usage data makes much sense (it says 21-24 "shared app service hours" for most days).
I have a .NET web app hosted on Azure and am on the S1 Production pricing tier (1x core, 1.75 GB memory, A-Series compute). What's weird is that I am going through extended periods of poor performance. Usually my average response time is around the 1.4 s mark. Not good by any stretch, but it's something I can work with. However, I'm experiencing extended periods where the response time shoots up to 5 s or more. These periods last for days, up to a week, before coming back to normal levels. My knowledge of Azure is pretty limited, but I can't seem to find anything that would explain this.
[Chart: average response time over the last 30 days]
You might first want to identify whether this is an issue with the web app itself or a trend in app usage (i.e. it receives the most hits during specific weeks of the month).
There are several areas you might want to look at for further diagnosis. A few are:
Look at the number of requests during the time the website is slow. This is a web part on the website overview page.
Check "Diagnose and solve problems". It is a self-service diagnostic and troubleshooting experience to help you resolve issues with your web app.
If you have a considerable user base and it's a production environment, 1 core + 1.75 GB RAM might sometimes not be sufficient to bear the load. If you determine that this is due to a usage trend from your users, then you can plan for scaling your application out or up to meet the demands of high usage.
We run a web service that gets 6k+ requests per minute during peak hours and about 3k requests per minute during off hours. Lots of data feeds compiled from 3rd party web services and custom generated images. Our service and code is mature, we've been running this for years. A lot of work by good developers has gone into our service's code base.
We're migrating to Azure, and we're seeing some serious problems. For one, we are seeing our Premium P1 SQL Azure database routinely become unavailable for 1-2 full entire minutes. I'm sorry, but this seems absurd. How are we supposed to run a web service with requests waiting 2 minutes for access to our database? This is occurring several times a day. It occurs less after switching from Standard level to Premium level, but we're nowhere near our DB's DTU capacity and we're getting throttled hard far too often.
Our SQL Azure DB is Premium P1 and our load according to the new Azure portal is usually under 20% with a couple spikes each hour reaching 50-75%. Of course, we can't even trust Azure's portal metrics. The old portal gives us no data for our SQL, and the new portal is very obviously wrong at times (our DB was not down for 1/2 an hour, like the graph suggests, but it was down for more than 2 full minutes):
Azure reports the size of our DB at a little over 12GB (in our own SQL Server installation, the DB is under 1GB - that's another of many questions, why is it reported as 12GB on Azure?). We've done plenty of tuning over the years and have good indices.
Our service runs on two D4 cloud service instances. Our DB libraries are all implementing retry logic, waiting 2, 4, 8, 16, 32, and then 48 seconds before failing completely. Controllers are all async, most of our various external service calls are async. DB access is still largely synchronous but our heaviest queries are async. We heavily utilize in-memory and Redis caching. The most frequent use of our DB is 1-3 records inserted for each request (those tables are queried only once every 10 minutes to check error levels).
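For readers who want to picture the retry schedule described above (2, 4, 8, 16, 32, then 48 seconds), here is a minimal Python sketch. It is not the poster's actual library code: the names `backoff_delays` and `call_with_retry`, and the choice of which exceptions count as transient, are assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 2.0, cap: float = 48.0, attempts: int = 6) -> Iterator[float]:
    """Yield doubling delays capped at `cap`: 2, 4, 8, 16, 32, 48 by default."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= 2


def call_with_retry(
    operation: Callable[[], T],
    delays: Iterable[float],
    sleep: Callable[[float], None] = time.sleep,
    # Assumption: these stand in for whatever transient errors the DB driver raises.
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Run `operation`, sleeping through `delays` between failed attempts.

    Once the delays are used up, the last error is re-raised to the caller.
    """
    pending = list(delays)
    while True:
        try:
            return operation()
        except retry_on:
            if not pending:
                raise
            sleep(pending.pop(0))
```

With the default schedule a request waits 2 + 4 + 8 + 16 + 32 + 48 = 110 seconds in the worst case before failing, which is in the same range as the multi-minute stalls described here.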
Aside from batching up those request logging inserts, there's really not much more give in our application's DB access code. We're nowhere near our DTU allocation on this database, and the server our DB is on still has about 2000 DTUs available to be allocated. If we have to live with 1+ minute periods of unavailability every day, we're going to abandon Azure.
Is this the best we get?
Querying stats in the database seems to show we are nowhere near our resource limits. Also, on premium tier we should be guaranteed our DTU level second-by-second. But, again, we go more than an entire solid minute without being able to get a database connection. What is going on?
I can also say that after we experience one of these longer delays, our stats seem to reset. The above image was a couple minutes before a 1 min+ delay and this is a couple minutes after:
We have been in contact with Azure's technical staff and they confirm this is a bug in their platform that is causing our database to go through failover multiple times a day. They stated they will be deploying fixes starting this week and continuing over the next month.
Frankly, we're having trouble understanding how anyone can reliably run a web service on Azure. Our pool of Websites randomly goes down for a few minutes a few times a month, taking our public sites down. If our cloud service returns too many 500 responses something in front of it is cutting off all traffic and returning 502's (totally undocumented behavior as far as we can tell). SQL Azure has very limited performance and obviously isn't ready for prime time.
I maintain an Azure cloud service. It is set to auto-scale based on load. To monitor the health of this service, I have another service which pings it every 2 minutes. The usual response time from this service is around 100 ms.
Once or twice a week I see that the service does not respond. It is not really a worry for me, because it happens quite infrequently. I am still trying to figure out what could be causing the service to not respond. I do not think the problem is with the pinging service: none of the other services it pings (not on Azure, but on other servers) show any issues.
What could be causing these occasional delays? Are any other Azure service owners seeing such delays?
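A probe like the one described above could look roughly like the following Python sketch. It only shows how one might record latency alongside success or failure, so that slow responses can be told apart from outright failures. The function names, the 1000 ms "slow" threshold, and the URL in the usage comment are all hypothetical.

```python
import time
import urllib.request
from typing import Callable, Tuple


def http_probe(url: str, timeout_s: float = 10.0) -> Callable[[], None]:
    """Build a probe that performs one GET and raises on any failure."""
    def probe() -> None:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()
    return probe


def timed_probe(probe: Callable[[], None]) -> Tuple[bool, float]:
    """Run one probe and return (succeeded, elapsed milliseconds)."""
    start = time.monotonic()
    try:
        probe()
        ok = True
    except Exception:
        ok = False
    return ok, (time.monotonic() - start) * 1000.0


def classify(ok: bool, elapsed_ms: float, slow_ms: float = 1000.0) -> str:
    """Label a result; the slow threshold is an arbitrary assumption."""
    if not ok:
        return "down"
    return "slow" if elapsed_ms > slow_ms else "healthy"


# Usage (hypothetical URL), run every 2 minutes by a scheduler of your choice:
#   ok, ms = timed_probe(http_probe("https://example.invalid/health"))
#   print(classify(ok, ms), round(ms))
```

Logging the label and latency with a timestamp makes it easier to line up the non-responses with scaling events or database stalls later.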
Having quite similar problems, but I use Application Insights, so I have some statistics. For example, response time increases together with SQL Azure access time and CPU usage. My average response time according to Application Insights is about 600 ms and average RPS is about 0.6. During these problems RPS is usually higher than average, up to 1.5, but average response time grows to as much as 1 min! (During the day my RPS can grow to 3 or even higher without any response time growth.) As I have a 1 min SQL connection timeout and I see dramatic growth of total SQL Azure access time during these periods, I can assume that the problem is caused by SQL Azure. This also happens once every day or two, for about 10-15 minutes at most, and my ping service also always reports that the service doesn't respond.
So my advice here: install Application Insights to analyze what happens during these response delays. It would be great if you shared your results here.
P.S. I also use autoscale based on load. Though it doesn't really help in these concrete situations.
I subscribed to the free 90-day Azure trial offered by MS. I was excited and talked everywhere (including on my blog, http://techibee.com/windows-2012/free-try-windows-server-2012-in-azure-for-90-days/1876) about the free service offered by MS and how to make use of it. Well, my excitement lasted only 7-8 days. Today I got a message from the Azure team that my subscription was disabled because my compute hours exceeded the monthly limit.
I am just wondering how these compute hours are calculated in my case. I configured 2 VMs (both medium) and am using them to explore stuff. I have never shut them down since creation. Does anyone have an idea how these two VMs counted towards the limit?
Another question: since the subscription is disabled for this month, I am considering purchasing a few more compute hours (pay-as-you-go). If I do that now, should I shut down the VMs when I am not actively using them? Will that stop the compute hours from increasing, or will they continue to charge me even for the hours the VMs are shut down? All I want is to be billed only when I am actively using a VM, and not when I am not connected to that host. It looks like this is not what happened in the trial program, and their calculations seem different. Can anyone here give me some clarity?
From http://www.windowsazure.com/en-us/pricing/details/#header-3
"Compute hours are charged whenever the Virtual Machine is deployed, irrespective of whether it is running or not."
That's where all your hours went. You need to delete your VMs to prevent them using compute time.
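As a rough check of how quickly two always-deployed medium VMs could use up an allowance, here is a small Python sketch. Two numbers in it are assumptions not stated in this thread: that the trial included 750 small compute hours per month (the figure a later post quotes for the MSDN offer), and that a medium instance is billed as two small compute hours per clock hour.

```python
# Assumed conversion of instance sizes to "small compute hours" per clock hour.
# Neither this table nor the 750-hour allowance is stated in the thread.
SMALL_HOURS_PER_CLOCK_HOUR = {"small": 1, "medium": 2}


def days_until_exhausted(allowance_hours: float, deployed: dict) -> float:
    """Days of continuous deployment before the allowance runs out.

    `deployed` maps an instance size to how many VMs of that size are deployed.
    """
    burn_per_day = 24 * sum(
        SMALL_HOURS_PER_CLOCK_HOUR[size] * count for size, count in deployed.items()
    )
    return allowance_hours / burn_per_day


if __name__ == "__main__":
    # Two medium VMs deployed around the clock against an assumed 750 hours.
    print(days_until_exhausted(750, {"medium": 2}))  # prints 7.8125
```

Under those assumptions the allowance lasts a little under 8 days, which is consistent with the 7-8 days reported in the question.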
With the free trial account you can configure only 1 medium VM. Your offer probably expired early because you configured two.
Be aware that if you create a VM and turn it off, you will still be charged the same as when it is running.
I'm an MSDN subscriber and I'm looking at Azure as a possible platform for a new website that will test the water for a new service. This website is expecting low to very low traffic at the time of launch. I've heard that Azure is very expensive for this kind of traffic level, but since they have this MSDN offer, I thought I'd finally take a look at Azure.
In the offer, I'm looking at getting "750 small compute hours per month". From the reading I've done, it seems that if I purchase nothing more than what's given (although the subscription itself is thousands of dollars, of course), an entire month would be covered. Since 24 (hours) x 31 (max days in a month) = 744, I'm still below my allotted 750 for the month.
Am I missing something else from this simple equation? Are there further aspects that could cause the site to be "turned off" temporarily that should be considered?
Yes, you can indeed run a small instance during a whole month. Or you can have 2 extra-small instances instead (having 2 instances means you're covered by the SLA).
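The arithmetic from the question can be written out as a short Python check. The helper names are made up for illustration, and the sketch only covers small instances, since the thread gives no conversion rate for other sizes.

```python
def instance_hours(instance_count: int, days: int, hours_per_day: int = 24) -> int:
    """Clock hours consumed by instances that stay deployed all day, every day."""
    return instance_count * days * hours_per_day


def hours_remaining(allowance_hours: int, instance_count: int, days: int) -> int:
    """Allowance left after the period; a negative value means it was exceeded."""
    return allowance_hours - instance_hours(instance_count, days)


if __name__ == "__main__":
    print(instance_hours(1, 31))        # prints 744
    print(hours_remaining(750, 1, 31))  # prints 6
    print(hours_remaining(750, 2, 31))  # prints -738
```

One small instance fits in the longest month with 6 hours to spare, while a second small instance running all month would exceed the 750 hours.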
There are 2 other things you need to consider:
Depending on your subscription you can have maximum 45GB of storage (blob/table/queue). If you use Virtual Machines you need to know that the system disk (and additional data disks) are persisted as blobs, so make sure not to reach the limit here.
There are also other limits active, but the most important one besides storage is the data transfer limit which is also very limited (max 35GB out).
If you're expecting very low traffic, did you ever consider Windows Azure Web Sites? You get 10 of those for free during 12 months. The free ones run on shared instances, but they are perfect to host the first low-traffic version of your app.