Azure storage failure - azure

I'm experiencing problems with Azure PUTs. They are either extremely slow or are failing intermittently when the requests originate from servers in the US Central region. Requests from US East appear to be working ok, although a little slower than usual. Unclear if other regions originating such requests are also facing issues.
I also see the Azure Dashboard mentioning outages of Web Apps in NA, but Storage is shown as all green. I'm assuming that the Dashboard hasn't quite caught up to the actual situation.

I was having a similar issue with PUTs, which surfaced through Application Insights. It took some digging to find that we were getting a 409 error.
I asked Microsoft and found that CreateIfNotExists correctly returns a 409 if the container already exists. The error was mine: I was calling it each time.
Below is a link to my discussion with MS.
https://social.msdn.microsoft.com/Forums/en-US/67652f9c-b941-4729-a3b3-21530ed9f2fb/blob-storage-is-failing-with-409-and-then-retrying-without-any-retry-code-and-succeeding?forum=windowsazuredata
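One way to avoid that repeated call is to run the create-if-not-exists check once per process and skip it afterwards. Below is a minimal sketch in Python, not the Azure SDK: FakeContainerClient and ContainerGate are hypothetical names made up for illustration, and the fake client only counts calls instead of talking to storage.

```python
import threading


class FakeContainerClient:
    """Hypothetical stand-in for a storage client; it only counts calls."""

    def __init__(self):
        self.create_calls = 0
        self.blobs = {}

    def create_if_not_exists(self):
        # A real service would answer 409 here when the container exists.
        self.create_calls += 1

    def put_blob(self, name, data):
        self.blobs[name] = data


class ContainerGate:
    """Runs the create-if-not-exists check once, then skips it."""

    def __init__(self, client):
        self._client = client
        self._ensured = False
        self._lock = threading.Lock()

    def put(self, name, data):
        # Double-checked flag so concurrent callers trigger only one check.
        if not self._ensured:
            with self._lock:
                if not self._ensured:
                    self._client.create_if_not_exists()
                    self._ensured = True
        self._client.put_blob(name, data)


client = FakeContainerClient()
gate = ContainerGate(client)
for i in range(3):
    gate.put(f"blob-{i}", b"payload")
print(client.create_calls)
```

With a real client the same idea applies: do the existence check at startup (or behind a flag like this) rather than before every PUT, so the 409 stops showing up in the telemetry.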

Related

Azure functions portal log / monitor isn't very accurate

I've been using Functions for a while, and it seems the longer a Function is around, the less accurate the Portal logs are. For maybe the first 3 months of using my functions, everything was fine monitoring- and logging-wise. Over time things started getting less accurate.
Now I see the real logs by going to Microsoft Azure Storage Explorer and checking AzureWebJobsStorage.
First, when I bring up the code/logs, the last log it shows isn't accurate. It will usually be from a few days ago, or the last error. When the function triggers, though, it does get the live feed. This isn't that big a deal; what is bad is the monitor being inactive and not being able to see the logs from it. I suppose I'll just use Azure Storage Explorer.
The Monitor invocation logs always seem to be a few days behind. They used to be accurate, but for the last month or so they have always been a few days behind.
Dan,
The local, file-based logs exist primarily to support the portal experience, so the behavior you're observing in the log window is expected: the logs are not written by the runtime as part of the normal invocation process, but only when you're actively developing/testing on the portal.
The issue you're experiencing with the monitor is due to a regression that has been patched and should be fully rolled out today (you can see more details here).
We've been listening to feedback on our logging capabilities, and there has been a lot of investment in that area, resulting in the recently announced built-in integration with Application Insights. That integration addresses some of the pain points you've brought up as well as other issues, so I'd strongly recommend trying it out. You can find more information about it here.

Random 503 errors from Azure

Not sure if I should post here or on Serverfault, but this morning we have been getting random 503 errors from Azure (web apps).
They occur from random places across the world, and I do get them myself from time to time.
In our "Support Observe" view I do see a lot of errors:
I do not see that number of errors in our event logs, though. I do, however, see some errors that could be related, such as:
6136
w3wp
Role environment . FAILED TO INITIALIZE. hr: -2147024891
and some from W3SVC-WP that are really cryptic, like:
*1
5
50000780*
I've found some other posts about this kind of error here, and they seem to point towards an issue with Azure sometimes and sometimes not.
I'm on the East US datacenter. Is anyone else having issues, or can anyone help me figure out what this is? Doesn't the fact that it is occurring randomly across the world point towards an Azure issue?
I could also add that I do not do any load balancing, so it could not be that one of the instances is down or something like that. I have restarted and redeployed the code and so on as well.

503 error on azure cloud service

We use Azure and have had problems with our Cloud Service for the last two days.
We get 503 errors on the site. It looks like one of the web roles reboots sometimes, but in the dashboard all of them appear to work fine.
Application Insights and the logs don't show any problems. CPU, memory, exception rate: all OK.
But I found one interesting thing. The average SQL query time grew to 5 seconds, but when I checked on the database itself, the queries ran normally. This suggests that the 5 seconds is not execution time but connection time.
That seems like too much for a round trip inside a data center.
Does anyone have any ideas how I can find a solution to this problem?
When your app throws a lot of unhandled exceptions in a short time and the worker process keeps crashing, IIS stops the application pool and you get a 503 error.
For more details, google for "IIS Rapid-Fail Protection".
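As a rough illustration of that mechanism: Rapid-Fail Protection counts worker process failures inside a time window and stops the application pool once a limit is reached, after which every request gets a 503. The sketch below is a simplified model in Python, not IIS code. The class name is made up, the sliding window is a simplification, and the thresholds (5 failures in 5 minutes, which I believe are the IIS defaults) are assumptions for illustration.

```python
from collections import deque


class RapidFailModel:
    """Simplified model: stop the pool after too many failures in a window."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.failures = deque()
        self.pool_stopped = False

    def record_failure(self, timestamp):
        # Drop failures that have fallen out of the time window.
        while self.failures and timestamp - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        self.failures.append(timestamp)
        if len(self.failures) >= self.max_failures:
            self.pool_stopped = True

    def handle_request(self):
        # A stopped pool answers every request with 503.
        return 503 if self.pool_stopped else 200


model = RapidFailModel()
for t in (0, 400, 800, 1200):  # spread-out failures never fill the window
    model.record_failure(t)
print(model.handle_request())

for t in (1210, 1220, 1230, 1240):  # a burst of failures trips the limit
    model.record_failure(t)
print(model.handle_request())
```

If this is what is happening, the System event log on the role instance should show the application pool being disabled, even when the application's own logs look clean.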

Random 503 errors in Azure Mobile Services

At certain times during the week while I'm testing my Mobile Services app I get a 503 error (Service Unavailable). It happens whether I try to call the app from localhost or live on my Azure Website. It hangs around for 10-15 minutes and then goes away on its own. It doesn't seem to be caused by anything in particular that I am doing (i.e. I have not updated any code). The 503 error occurs when I'm trying to call one of my custom APIs in my Mobile Services account. A few of the requests make it through (strangely enough) but the majority return a 503 error.
I've seen that someone had a very similar problem here (Why does Azure give me an intermittent Error 503. The service is unavailable?) without an acceptable resolution.
I am using the free version of Mobile Services, but I should be nowhere near pushing the limits of what the free version can handle; I am the sole user of the app right now.
It will soon be time to make the service live and I'm shuddering at the thought of support calls that will come in during one of these funky states the service gets into. Any help in debugging the problem would be greatly appreciated.
EDIT:
I've narrowed this down to a database problem. I have one main query (sproc) that I use to feed data to the UI. I noticed that when I get the 503 errors the query takes about 13 seconds (when run in SSMS). When things are running "normally", the query takes less than a second.
This doesn't solve my problem though, in fact it makes it more perplexing because I am using the Business Edition of Windows Azure SQL Database and there shouldn't be a 13 second fluctuation in execution time!
This problem seems to happen randomly. Is there some kind of caching in SQL Server that could explain this? Maybe my query really does take 13 seconds to execute and the caching superficially speeds it up.
Could you try transitioning your database/server to one of the "editions"? They have resource governance to promote predictable performance. Web/Business suffer from a noisy neighbor problem. It sounds like that may be your issue, considering it is intermittent.
Here's a link to a page describing the editions. https://msdn.microsoft.com/en-us/library/azure/dn741340.aspx

Getting occasional 503 errors on Azure website

I'm getting occasional 503 errors on our site. It usually happens after not visiting the site for a while. The whole page might return 503 or just some resources like css or js files.
It seems to go away after you've surfed the site for a bit and hit all of our servers.
Elmah doesn't show any errors.
I've gone into the logs on each of our servers (three medium web roles on Azure) and I can't find any problems.
Our deployment has been up since December without a code change, and we've been having this problem for about a week.
One thing to note is that when this happens the site doesn't shut down. I would think that would happen if IIS was crashing and restarting (even with three servers).
Does anyone know how to diagnose or fix this problem?
While this could be code related, I'll assume you've already explored this route as much as possible via logs (and since you haven't deployed new code). Having said that:
Do your issues align with the Compute service degradation events shown in the Azure Dashboard over the past several days? Look at Historical View and you'll see a few issues around Compute. Depending on your data center, maybe this is related?

Resources