I am intermittently hitting the ErrorCode:SubStatus error below with Windows Azure and the AppFabric Cache in my ASP.NET application. It brings my whole web application to a standstill until I reset it, which makes Azure no longer viable for us.
I am only storing very small strings in Session state and only have a very small number of users. I can’t imagine that I could be over any of the usage quotas (at http://msdn.microsoft.com/en-us/library/gg602420.aspx#C_BKMK_FAQ8)
I would like to find out which quota, if any, I am exceeding and why. How can I find out whether and why I am being throttled, or whether some other issue might be causing this?
Is there any way to find the Cache Size (I know this is in the Management Portal, but it always reports it at more than 95% below my 128 MB limit), Transactions Per Hour, Bandwidth MB Per Hour, and Concurrent Connections?
Stack trace:
Application_Error: ErrorCode:SubStatus:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.)Stack Trace: at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ResponseBody respBody)
at Microsoft.ApplicationServer.Caching.DataCache.ExecuteAPI(RequestBody reqMsg, IMonitoringListener listener)
at Microsoft.ApplicationServer.Caching.DataCache.InternalPut(String key, Object value, DataCacheItemVersion oldVersion, TimeSpan timeout, DataCacheTag[] tags, String region, IMonitoringListener listener)
at Microsoft.ApplicationServer.Caching.DataCache.<>c_DisplayClass25.b_24()
at Microsoft.ApplicationServer.Caching.MonitoringListenerFactory.EmptyListener.Microsoft.ApplicationServer.Caching.IMonitoringListener.Listen[TResult](Func`1 innerDelegate)
at Microsoft.ApplicationServer.Caching.DataCache.Put(String key, Object value, TimeSpan timeout)
at Microsoft.Web.DistributedCache.DataCacheWrapper.Put(String key, Object value, TimeSpan timeout)
at Microsoft.Web.DistributedCache.DataCacheForwarderBase.<>c__DisplayClass10.<Put>b__f()
at Microsoft.Web.DistributedCache.DataCacheForwarderBase.<>c__DisplayClass2e1.b__2d()
at Microsoft.Web.DistributedCache.DataCacheRetryWrapper.PerformCacheOperation(Action action)
at Microsoft.Web.DistributedCache.DataCacheForwarderBase.PerformCacheOperation[TResult](Func`1 func)
at Microsoft.Web.DistributedCache.DataCacheForwarderBase.Put(String key, Object value, TimeSpan timeout)
at Microsoft.Web.DistributedCache.BlobBasedSessionStoreProvider.SetAndReleaseItemExclusive(HttpContextBase context, String id, SessionStateStoreData item, Object lockId, Boolean newItem)
at Microsoft.Web.DistributedCache.DistributedCacheSessionStateStoreProvider.SetAndReleaseItemExclusive(HttpContext context, String id, SessionStateStoreData item, Object lockId, Boolean newItem)
at System.Web.SessionState.SessionStateModule.OnReleaseState(Object source, EventArgs eventArgs)
at System.Web.SessionState.SessionStateModule.OnEndRequest(Object source, EventArgs eventArgs)
at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) on page
Take a look at the Windows Azure Service Dashboard. Go to the bottom, and under Status History, select AppFabric Caching. Look for periods of service degradation or interruption on the days you saw this error (including today), for your given data center.
Hope that helps...
There is an 8 MB object size limit, which causes that error message when you try to put something larger than that into the cache.
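If you want to check whether a particular value gets anywhere near that limit, one rough approach is to measure its serialized size the same way the cache client would (a sketch only; it assumes the default NetDataContractSerializer that the AppFabric client normally uses, and the class/method names are just for illustration):

    using System.IO;
    using System.Runtime.Serialization;

    public static class CacheItemSizeCheck
    {
        // Rough estimate of how large an object is on the wire when the cache
        // client serializes it. Useful for spotting values that approach the
        // 8 MB item limit before calling DataCache.Put.
        public static long EstimateSerializedSize(object value)
        {
            var serializer = new NetDataContractSerializer();
            using (var stream = new MemoryStream())
            {
                serializer.Serialize(stream, value);
                return stream.Length;
            }
        }
    }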
Our web application is experiencing permission errors; from time to time we receive this exception:
Access to the path 'D:\home\site\wwwroot\App_Data\TEMP\PluginCache\..' is denied.
This is the call stack:
System.UnauthorizedAccessException: Access to the path 'D:\home\site\wwwroot\App_Data\TEMP\PluginCache\umbraco-plugins.RD0003FF29ADB0.list' is denied.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.File.InternalDelete(String path, Boolean checkHost)
at Umbraco.Core.PluginManager..ctor(IServiceProvider serviceProvider, IRuntimeCacheProvider runtimeCache, ProfilingLogger logger, Boolean detectChanges)
at Umbraco.Core.CoreBootManager.Initialize()
at Umbraco.Web.WebBootManager.Initialize()
at Umbraco.Core.UmbracoApplicationBase.StartApplication(Object sender, EventArgs e)
However, we are not able to reproduce the error. We have tried restarting, up-scaling, down-scaling, and changing the instance size, but the error never appears on demand.
We have no evidence that it is an OS update; the main reason we believe it is related to OS updates is that it happens on all running instances at the same time (within ±5 seconds). It comes out of nowhere periodically, usually on Tuesdays or Wednesdays, and we can never reproduce it.
Our only assumption is that it is caused by an OS update (or a similar system event - any idea which?) on the Azure web app. To get a better handle on the issue, we would like to know:
Can I get the last Windows Update time?
If not, can I at least get the system uptime? (A rough way to approximate this from inside the app is sketched below.)
And is it by any means possible to initiate a Windows update, or to postpone it until the moment I need it?
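For the uptime part, the closest thing we can get from inside the app itself seems to be something like this (only a sketch; Environment.TickCount is milliseconds since the system started but wraps after roughly 25 days, and the App Service sandbox typically blocks the WMI/registry queries that would give the exact boot or last-update time):

    using System;

    public static class InstanceInfo
    {
        // Approximate uptime of the underlying instance as seen by the app.
        // Environment.TickCount is the number of milliseconds since the system
        // started, but it is a signed 32-bit value that wraps after ~24.9 days,
        // so treat this as an approximation only.
        public static TimeSpan ApproximateUptime()
        {
            return TimeSpan.FromMilliseconds(Environment.TickCount & int.MaxValue);
        }
    }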
thanks
almir
We have an Umbraco website (version 7.5.11) hosted on Azure Web Apps.
We are experiencing the following exception intermittently (3 times within the past 3 weeks). Once the exception occurs it brings the website down until we republish the home node in Umbraco. At all other times the website is working as expected, including retrieving image files from the server.
Exception type: IOException
Exception message: An unexpected network error occurred.
at Umbraco.Core.Cache.HttpRuntimeCacheProvider.GetCacheItem(String cacheKey, Func`1 getCacheItem, Nullable`1 timeout, Boolean isSliding, CacheItemPriority priority, CacheItemRemovedCallback removedCallback, CacheDependency dependency)
at Umbraco.Core.Cache.HttpRuntimeCacheProvider.GetCacheItem(String cacheKey, Func`1 getCacheItem, Nullable`1 timeout, Boolean isSliding, CacheItemPriority priority, CacheItemRemovedCallback removedCallback, String[] dependentFiles)
at Umbraco.Core.Cache.DeepCloneRuntimeCacheProvider.GetCacheItem(String cacheKey, Func`1 getCacheItem, Nullable`1 timeout, Boolean isSliding, CacheItemPriority priority, CacheItemRemovedCallback removedCallback, String[] dependentFiles)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetCacheValues(Int32 id, Func`2 func)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetUmbracoMedia(Int32 id)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetById(UmbracoContext umbracoContext, Boolean preview, Int32 nodeId)
at Umbraco.Web.PublishedCache.ContextualPublishedCache`1.GetById(Boolean preview, Int32 contentId)
at Umbraco.Web.PublishedContentQuery.DocumentById(Int32 id, ContextualPublishedCache cache, Object ifNotFound)
at Umbraco.Web.PublishedContentQuery.Media(Int32 id)
at Umbraco.Web.UmbracoHelper.Media(String id)
The media file exists, and republishing the home node brought the site back online.
At the time of the exception, no code changes were deployed and no pages were updated / published within Umbraco.
Has anyone experienced something similar, or any ideas what the root cause is?
According to the source code of PublishedMediaCache.cs, the exception is often caused by the following issue:
The Examine index is corrupted.
Here is a thread on the Umbraco forum related to your issue:
Examine corruption issues
And here is the solution for this issue from Shannon Deminick:
If you are using Azure web apps and are NOT auto-scaling, you should use these settings:
useTempStorage="Sync"
use this feature to store local index files: http://issues.umbraco.org/issue/U4-7614
Remove the {machinename} token from your index path
RebuildOnAppStart="true" - since this should only happen one time
If you are using Azure web apps and are load balancing w/ auto-scaling your front-end workers then:
useTempStorage="Sync"
use this feature to store local index files: http://issues.umbraco.org/issue/U4-7614
You must have the {machinename} token in your index path
RebuildOnAppStart="true" - so that when new sites come online, their indexes are built
... yes in some cases this might not be ideal, please see: https://our.umbraco.org/forum/extending-umbraco-and-using-the-api/74731-examine-corruption-issues#comment-244293
When trying to retrieve a list of registrations for my notification hub I receive the following error:
[QuotaExceededException: The remote server returned an error: (403) Forbidden. The request was terminated because the namespace XXX is being throttled. Please wait 60 seconds and try again. TrackingId:c7e05299-24ba-4f9d-9017-885db746a032_G20,TimeStamp:11/19/2014 9:00:51 PM]
Microsoft.ServiceBus.Common.AsyncResult.End(IAsyncResult result) +624
Microsoft.ServiceBus.Messaging.ServiceBusResourceOperations.EndGetAll(IAsyncResult asyncResult, String& continuationToken) +12
Microsoft.ServiceBus.NamespaceManager.EndGetAllRegistrations(IAsyncResult result) +33
System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization) +52
This is only a test application so the number of devices is relatively low (~20) and so is the number of users of the code which is throwing the exception. The error message obviously indicates that we are going over our quota somehow, but I cannot tell in what aspect - none of the metrics in the Notification Hubs appear to be over 100 operations in the previous 24 hours. The number of available operations per day should be far more than we are using.
Nowhere in the Azure Portal seems to show the total number of operations, so I am at a loss as to how to find the cause of this issue.
Strangely, this similar question - Azure QuotaExceededException - indicates that they received an indication of Max and Allowed numbers of operations, but my error shows no such thing.
Is there any way (besides Azure paid support) to find why I am being throttled?
It turns out there is a bug in the NuGet package for Microsoft.ServiceBus 2.1.2.0 where calling NotificationHubClient.GetAllRegistrations(10) does not correctly retrieve only the top 10 registrations; instead it recursively retrieves ALL registrations in blocks of 10. In my case there turned out to be 250 registrations (most of them old), so the API call was being made 25 times in quick succession (~5 seconds), which explains the QuotaExceededException.
The fix was to upgrade to the latest NuGet package for Microsoft.ServiceBus - currently 2.5.2.0.
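If you do need to enumerate registrations, paging through them explicitly keeps the number of REST calls predictable. A rough sketch (it assumes a client version that exposes the continuation-token overload of GetAllRegistrationsAsync; the connection string and hub name are placeholders):

    using System;
    using System.Threading.Tasks;
    using Microsoft.ServiceBus.Notifications;

    public static class RegistrationLister
    {
        // Enumerate registrations page by page rather than asking for everything
        // at once, so each call fetches at most pageSize items and the request
        // count stays well under the throttling limits.
        public static async Task ListRegistrationsAsync(string connectionString, string hubPath, int pageSize = 100)
        {
            NotificationHubClient hub =
                NotificationHubClient.CreateClientFromConnectionString(connectionString, hubPath);

            var page = await hub.GetAllRegistrationsAsync(pageSize);
            while (true)
            {
                foreach (RegistrationDescription registration in page)
                {
                    Console.WriteLine(registration.RegistrationId);
                }

                if (string.IsNullOrEmpty(page.ContinuationToken))
                {
                    break;
                }

                page = await hub.GetAllRegistrationsAsync(page.ContinuationToken, pageSize);
            }
        }
    }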
Get-all-registrations is treated (and throttled!) as an analytic operation, which means it is not supposed to be used in your main runtime flow. In other words, if you have to call it often, then something is wrong...
Describe your application in more detail and I'll be glad to help figure out a good NH usage pattern.
Hi guys,
I developed a WebForms application using VS 2012 and published it on Azure.
After that I integrated ACS (I set the URL to my already published application).
I published the application again, but it doesn't work.
After I registered myself (for example using Yahoo or LiveID) I got this error:
Server Error in '/' Application.
The data protection operation was unsuccessful. This may have been caused by not having the user profile loaded for the current thread's user context, which may be the case when the thread is impersonating.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Security.Cryptography.CryptographicException: The data protection operation was unsuccessful. This may have been caused by not having the user profile loaded for the current thread's user context, which may be the case when the thread is impersonating.
Source Error:
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Stack Trace:
[CryptographicException: The data protection operation was unsuccessful. This may have been caused by not having the user profile loaded for the current thread's user context, which may be the case when the thread is impersonating.]
System.Security.Cryptography.ProtectedData.Protect(Byte[] userData, Byte[] optionalEntropy, DataProtectionScope scope) +379
System.IdentityModel.ProtectedDataCookieTransform.Encode(Byte[] value) +52
[InvalidOperationException: ID1074: A CryptographicException occurred when attempting to encrypt the cookie using the ProtectedData API (see inner exception for details). If you are using IIS 7.5, this could be due to the loadUserProfile setting on the Application Pool being set to false. ]
System.IdentityModel.ProtectedDataCookieTransform.Encode(Byte[] value) +167
System.IdentityModel.Tokens.SessionSecurityTokenHandler.ApplyTransforms(Byte[] cookie, Boolean outbound) +57
System.IdentityModel.Tokens.SessionSecurityTokenHandler.WriteToken(XmlWriter writer, SecurityToken token) +658
System.IdentityModel.Tokens.SessionSecurityTokenHandler.WriteToken(SessionSecurityToken sessionToken) +86
System.IdentityModel.Services.SessionAuthenticationModule.WriteSessionTokenToCookie(SessionSecurityToken sessionToken) +144
System.IdentityModel.Services.SessionAuthenticationModule.AuthenticateSessionSecurityToken(SessionSecurityToken sessionToken, Boolean writeCookie) +82
System.IdentityModel.Services.WSFederationAuthenticationModule.SetPrincipalAndWriteSessionToken(SessionSecurityToken sessionToken, Boolean isSession) +216
System.IdentityModel.Services.WSFederationAuthenticationModule.SignInWithResponseMessage(HttpRequestBase request) +860
System.IdentityModel.Services.WSFederationAuthenticationModule.OnAuthenticateRequest(Object sender, EventArgs args) +369
System.Web.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +136
System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) +69
What should I do?
I've set the URLs correctly. I don't have any references to "localhost" in web.config.
I don't know what else I must set for this to work.
By default WIF uses DPAPI to encrypt cookies. Switch to cert-based encryption. See this answer:
Is it possible to run WIF without LoadUserProfile = True
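For example, the session cookie transforms can be swapped for RSA-based ones when the federation configuration is created (a sketch of the commonly used pattern, not necessarily your exact setup; it assumes a serviceCertificate is configured in web.config so that ServiceCertificate is populated, and it goes in Global.asax):

    using System.Collections.Generic;
    using System.IdentityModel;
    using System.IdentityModel.Services;
    using System.IdentityModel.Tokens;
    using System.Web;

    public class Global : HttpApplication
    {
        protected void Application_Start()
        {
            // Replace the default DPAPI-protected session cookie handling with
            // RSA transforms keyed off the service certificate, so no user
            // profile needs to be loaded and all instances can read the cookie.
            FederatedAuthentication.FederationConfigurationCreated += (sender, e) =>
            {
                var certificate = e.FederationConfiguration.ServiceCertificate;

                var transforms = new List<CookieTransform>
                {
                    new DeflateCookieTransform(),
                    new RsaEncryptionCookieTransform(certificate),
                    new RsaSignatureCookieTransform(certificate)
                };

                var handler = new SessionSecurityTokenHandler(transforms.AsReadOnly());
                e.FederationConfiguration.IdentityConfiguration
                    .SecurityTokenHandlers.AddOrReplace(handler);
            };
        }
    }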
Vittorio Bertocci answers the question here
http://www.cloudidentity.com/blog/2013/01/28/running-wif-based-apps-in-windows-azure-web-sites-4/
DPAPI is not available in cloud web apps, and .NET 4.5 has a simple solution.
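In code, that simple route boils down to swapping in the machineKey-based session token handler (again only a sketch; it requires .NET 4.5 and an explicit <machineKey> element in web.config so every instance shares the same keys):

    using System.IdentityModel.Services;
    using System.IdentityModel.Services.Tokens;
    using System.Web;

    public class Global : HttpApplication
    {
        protected void Application_Start()
        {
            // Use the machineKey-based session token handler added in .NET 4.5
            // instead of the default DPAPI-based one, so the WIF session cookie
            // can be protected without a loaded user profile.
            FederatedAuthentication.FederationConfigurationCreated += (sender, e) =>
            {
                e.FederationConfiguration.IdentityConfiguration.SecurityTokenHandlers
                    .AddOrReplace(new MachineKeySessionSecurityTokenHandler());
            };
        }
    }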
I have an application that connects to the CRM 2011 service. When I attach to the service without a CallerID I can grab the data without error. The hitch comes when I add the caller ID to the connection. I receive this error message (might seem familiar):
The server was unable to process the request due to an internal error. For more information about the error, either turn on IncludeExceptionDetailInFaults (either from ServiceBehaviorAttribute or from the configuration behavior) on the server in order to send the exception information back to the client, or turn on tracing as per the Microsoft .NET Framework 3.0 SDK documentation and inspect the server trace logs.
Server stack trace:
at System.ServiceModel.Channels.ServiceChannel.ThrowIfFaultUnderstood(Message reply, MessageFault fault, String action, MessageVersion version, FaultConverter faultConverter)
at System.ServiceModel.Channels.ServiceChannel.HandleReply(ProxyOperationRuntime operation, ProxyRpc& rpc)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Microsoft.Xrm.Sdk.IOrganizationService.Retrieve(String entityName, Guid id, ColumnSet columnSet)
at Microsoft.Xrm.Sdk.Client.OrganizationServiceProxy.RetrieveCore(String entityName, Guid id, ColumnSet columnSet)
at Microsoft.Xrm.Sdk.Client.OrganizationServiceProxy.Retrieve(String entityName, Guid id, ColumnSet columnSet)
at Hanlon.Data.CRM.DataObjectBase.Retrieve(Guid identity)
at Hanlon.Data.CRM.Advisor.Fill(Guid advisor_identity)
at HypotheticalReportCrmSite.HypotheticalReport.AcceptanceWorkFlow()
Does anyone have any ideas on why this is happening or how I can find out what the error is more specifically? The application works on both Test and User Acceptance servers but blows up on Production.
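For context, the impersonated connection is set up roughly like this (a sketch with placeholder values; CallerId is the systemuserid of the user being impersonated, and as far as I know the connecting account needs the "Act on Behalf of Another User" privilege for this to work):

    using System;
    using System.ServiceModel.Description;
    using Microsoft.Xrm.Sdk.Client;

    public static class CrmConnectionFactory
    {
        // Builds the service proxy and sets CallerId so subsequent calls are
        // executed as the impersonated user. The URI and credentials are
        // placeholders supplied by the caller.
        public static OrganizationServiceProxy CreateProxy(
            Uri organizationUri, ClientCredentials credentials, Guid impersonatedUserId)
        {
            var proxy = new OrganizationServiceProxy(organizationUri, null, credentials, null);
            proxy.CallerId = impersonatedUserId;
            return proxy;
        }
    }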
Clearly there are a number of reasons why this may work differently on different servers. Some places to start looking would be the setup, installation, and permissions on the different servers, including CRM, SQL, and Active Directory (although I would assume AD is the same).
Is there any difference in the proxy used for the Prod server compared to that on the Test/UAT systems?
Are all the relevant versions of the .NET framework available on the Prod server?
Are the same assemblies installed in the GAC?
Is SQL more locked down for the Prod server?
Were the CRM installations on the Prod server completed according to the same specifications as the Test and UAT servers? Were they installed by the same party?
Again, these are all just ideas, but the differences between the servers would seem to be the place to look, and the installations of the key products would be a good place to start, followed perhaps by the .NET Framework, etc.
Apologies if the answer is quite generic, but as you have found, the specific error doesn't appear to reveal a smoking gun.
I'd be interested to know what you find.