Is there a timeout on RoleEnvironment.Stopping on Azure Worker Roles? - azure

We have some long-running tasks on our roles and need to be sure to stop them in a controlled way. Initially we tried to use the OnStop method, but MSDN says that:
Important
Code running in the OnStop method has a limited time to finish when it is called for reasons other than a user-initiated shutdown. After this time elapses, the process is terminated, so you must make sure that code in the OnStop method can run quickly or tolerates not running to completion. The OnStop method is called after the Stopping event is raised.
The timeout seems to be around 30 seconds, and the overall shutdown procedure should take no more than 5 minutes.
Does this limitation also apply to the Stopping event? I can't find a clear and direct answer anywhere.
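For context, the two hooks in question look roughly like this in our RoleEntryPoint (a minimal sketch; the trace calls are only there to compare when each one fires):

    public class WorkerRole : RoleEntryPoint
    {
        public override bool OnStart()
        {
            // Stopping is raised before OnStop during a graceful shutdown.
            RoleEnvironment.Stopping += (sender, e) =>
                Trace.TraceInformation("Stopping raised at {0:u}", DateTime.UtcNow);

            return base.OnStart();
        }

        public override void OnStop()
        {
            Trace.TraceInformation("OnStop called at {0:u}", DateTime.UtcNow);
            // Long-running cleanup would go here, subject to whatever time limit applies.
        }
    }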
Thanks

Related

AWS Step Functions activity task state timeouts limited functionality, conflicting documentation

I am running into stuck executions on my Activity Task States even though they have timeouts set. It turns out this timeout setting is pretty pointless if your external Activity Worker is not even running and you want to catch that scenario by timing out. This is because the documentation clearly states that the timeout setting only begins counting when an external Activity Worker calls the GetActivityTask API, which triggers an "ActivityStarted" execution event.
https://docs.aws.amazon.com/step-functions/latest/dg/troubleshooting-activities.html#troubleshooting-activities-stuck-state-machine
Moreover, the documentation linked above states that to work around your executions getting stuck in this way, you should use timeouts (which is exactly what I'm doing) 🤦
Am I missing something or is this an obvious gap in their functionality for Activity Tasks?

Azure diagnostics and Class onStop (on scale down) - How to prevent log loss?

I am initializing my Azure Diagnostics inside OnStart of my Web Role and have it scheduled to transfer logs every 5 minutes. But when auto-scale shuts down one of my roles, we are losing the logs since the last transfer. What can I do in OnStop to prevent this from happening? Is there a way to force the log transfer and prevent OnStop from finishing until it's done? Thanks!
Just have your log transfer in the OnStop method. Or alternatively, if that's done in a thread, use a flag and a loop in the OnStop method to sleep until the flag is set:
    private volatile bool saved;   // set to true by SaveLogAsync once the transfer completes

    SaveLogAsync();
    while (!saved)
    {
        Thread.Sleep(100);
    }
There is still a maximum amount of time the OnStop method will run for before it's forced to shut down. I think it's 5 minutes.
To transfer diagnostics data when the role is shutting down, you need to perform an operation called an On-Demand Transfer. This starts transferring the data stored in the buffer to the diagnostics storage account. You may find this link helpful in performing an On-Demand Transfer: http://msdn.microsoft.com/en-us/library/windowsazure/gg433075.aspx.
And David is correct: you get about 5 minutes to perform this operation. See this link for more details: http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.serviceruntime.roleentrypoint.onstop.aspx. To handle the OnStop event gracefully, you may find this blog post useful: http://blogs.msdn.com/b/windowsazure/archive/2013/01/14/the-right-way-to-handle-azure-onstop-events.aspx.
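As a rough sketch only, kicking off an On-Demand Transfer from OnStop with the legacy Azure Diagnostics API might look like the following. The connection-string setting name and the ten-minute window are assumptions; verify them against your own diagnostics configuration.

    // Assumes the legacy Azure Diagnostics management API
    // (Microsoft.WindowsAzure.Diagnostics.Management); names below are illustrative.
    var account = CloudStorageAccount.Parse(
        RoleEnvironment.GetConfigurationSettingValue(
            "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"));

    var manager = new RoleInstanceDiagnosticManager(
        account,
        RoleEnvironment.DeploymentId,
        RoleEnvironment.CurrentRoleInstance.Role.Name,
        RoleEnvironment.CurrentRoleInstance.Id);

    var options = new OnDemandTransferOptions
    {
        From = DateTime.UtcNow.AddMinutes(-10),  // cover the window since the last scheduled transfer
        To = DateTime.UtcNow
    };

    // Starts the transfer; OnStop still has to stay alive long enough
    // (well under the ~5 minute cap) for it to complete.
    Guid requestId = manager.BeginOnDemandTransfer(DataBufferName.Logs, options);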

Ramifications of using timeout on cluster shutdown?

I'm using the java datastax driver. I have a ServletContextListener that closes the datastax Cluster object on context destruction by calling Cluster.shutdown(). The problem is that shutdown() takes several minutes to return.
Cluster.shutdown() has an overload where you can specify a timeout value. I can't seem to find any documentation on the implications of using (or not using) the timeout, and when I specify a timeout of one millisecond, the cluster shuts down more or less instantly (as expected).
So, my question is, if I'm only shutting down the cluster when the servlet is shutting down anyway, is there a reason I should wait for the return? It seems that by specifying the timeout, it's essentially calling an asynchronous shutdown, which should be ok, but I don't want to introduce a memory leak or any instability.
I'm pretty new to Cassandra/datastax so if information about using the timeout is spelled out somewhere, pointing me in that direction would be great!
TIA,
wbj
If you do specify a short timeout, the method will initiate the shutdown but only wait on the completion of the shutdown for as long as asked. So yes, a short timeout won't interfere with the shutdown per se, which will continue asynchronously. If you don't care about knowing when the shutdown is complete (i.e. when exactly all resources have been properly closed), then there is no particular downside to using a timeout (and you can even use 0 for the timeout to make that intention clear).
I'll note that version 2.x of the driver changes the shutdown API slightly, making it asynchronous by default but returning a future on the shutdown completion, which hopefully makes it clearer what happens.

How Does Azure Interrupt a Worker Role For Deployment?

I'm moving some background processing from an Azure web role to a worker role. My worker needs to do a task every minute or so, possibly spawning off tasks:
    while (true)
    {
        // start some tasks
        Thread.Sleep(60000);
    }
Once I deploy, it will start running forever. So later, when I redeploy, how does Azure stop my process for redeployment?
Does it just kill it instantly? Is there a way to get a warning that it's shutting down? Do I just have to make sure everything is transactional?
When a role (either worker or web) is asked to gracefully shut down (because it is being scaled down or because you've asked for a redeployment), the OnStop method of the RoleEntryPoint class is called. This is the same class whose Run method likely either contains your loop or calls the code that contains that loop.
A couple of things to note here: OnStop has 5 minutes to actually stop; after that, the process is simply killed. If you have to call something else to shut down asynchronously, you'll need the thread in OnStop to be kept busy waiting until that other process has shut down. Once execution has left OnStop, the platform assumes the machine can be shut down.
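For example, a common pattern (a sketch only; the member names are placeholders) is to signal a cancellation token from OnStop and then block until Run reports that it has drained:

    public class WorkerRole : RoleEntryPoint
    {
        private readonly CancellationTokenSource cancellation = new CancellationTokenSource();
        private readonly ManualResetEvent runCompleted = new ManualResetEvent(false);

        public override void Run()
        {
            try
            {
                while (!cancellation.IsCancellationRequested)
                {
                    // start some tasks

                    // Wake up every minute, or immediately when OnStop cancels.
                    cancellation.Token.WaitHandle.WaitOne(TimeSpan.FromMinutes(1));
                }
            }
            finally
            {
                runCompleted.Set();
            }
        }

        public override void OnStop()
        {
            cancellation.Cancel();

            // Keep OnStop busy until Run has drained, staying well under the ~5 minute cap.
            runCompleted.WaitOne(TimeSpan.FromMinutes(4));
        }
    }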
If you need to gracefully stop processing but don't require a shutdown of the machine, you can put a setting in the service configuration file that indicates whether work should be done or not, for example a bool named "ProcessQueues". Then in your OnStart in RoleEntryPoint you hook the RoleEnvironment.Changing event. Your event handler then looks for a RoleEnvironmentConfigurationSettingChange to occur and checks the ProcessQueues bool. If it is true, it either starts up or continues processing; if it is false, it stops the processing gracefully. You can then make a config change to control when things are running or not, as sketched below. This is one option for handling this, and there are many more depending on how quickly you need to stop processing, etc.
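A minimal sketch of that hook, assuming the "ProcessQueues" setting described above. It uses the Changed rather than the Changing event, a slight variation, so that the new value has already been applied when the handler reads it, and it needs a System.Linq using for OfType/Any:

    public override bool OnStart()
    {
        // React to in-place configuration changes without recycling the instance.
        RoleEnvironment.Changed += (sender, e) =>
        {
            bool processQueuesChanged = e.Changes
                .OfType<RoleEnvironmentConfigurationSettingChange>()
                .Any(c => c.ConfigurationSettingName == "ProcessQueues");

            if (processQueuesChanged)
            {
                bool processQueues = bool.Parse(
                    RoleEnvironment.GetConfigurationSettingValue("ProcessQueues"));
                // Start, continue, or gracefully stop the work loop accordingly.
            }
        };

        return base.OnStart();
    }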

Timers in Windows Service

I want to use timers in my service, but I heard that timers can cause deadlock issues.
Suppose I set my timer to fire every 10 minutes and my service takes 5 minutes to finish its current execution, but in some cases it takes more time (it's unpredictable).
So what happens if my service can't finish the current execution within 10 minutes? Will a new timer event fire? And what happens to my current execution of the service?
Appreciate your help.
You can use timers in a Windows service, as is also stated on MSDN:
A service application is designed to be long running. As such, it usually polls or monitors something in the system. The monitoring is set up in the OnStart method. However, OnStart does not actually do the monitoring. The OnStart method must return to the operating system once the service's operation has begun. It must not loop forever or block. To set up a simple polling mechanism, you can use the System.Timers.Timer component. In the OnStart method, you would set parameters on the component, and then you would set the Enabled property to true. The timer would then raise events in your code periodically, at which time your service could do its monitoring.
Despite the above, you still need to design your logic to avoid deadlocks and to handle the case where the code in the Elapsed event handler takes longer than the interval itself.
The Elapsed event is raised on a ThreadPool thread. If processing of the Elapsed event lasts longer than Interval, the event might be raised again on another ThreadPool thread. Thus, the event handler should be reentrant.
http://msdn.microsoft.com/en-us/library/system.timers.timer%28v=vs.80%29.aspx
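One simple way to avoid overlapping executions (a sketch; DoWork is a placeholder for the actual service logic) is to disable AutoReset and re-arm the timer only when the handler finishes, so ticks never overlap:

    private System.Timers.Timer timer;

    protected override void OnStart(string[] args)
    {
        timer = new System.Timers.Timer(TimeSpan.FromMinutes(10).TotalMilliseconds);
        timer.AutoReset = false;   // don't re-fire automatically
        timer.Elapsed += (sender, e) =>
        {
            try
            {
                DoWork();          // placeholder for the service's actual work
            }
            finally
            {
                timer.Start();     // re-arm only after the work finishes
            }
        };
        timer.Start();
    }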
