In this scenario, with WF 4 WCF workflow services (.xamlx) hosted in IIS, how does one accommodate the fact that the app pool may be recycled at any time (config edit, memory pressure, etc.) while one or more WCF-initiated workflows are still executing? The concern is that a workflow may be executing its activities when the IIS host tear-down prevents that thread (or threads, if async activities are used) from completing, leaving the workflow in an unstable state. We could use a transaction scope or some other construct for this, but we're not sure of the overall behavior, and want to plan how best to accommodate it.
IIS has a feature called Overlapped Recycle, enabled by default, which grants the previous app pool some time to finish before it is completely destroyed. I believe this feature was first introduced in IIS 7.5. You should be able to find it under Application Pools / Advanced Settings / Recycling. I don't recall the exact amount of time, but provided you are not doing extensive computation, you should be fine.
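For reference, the window the old worker process gets is controlled by the app pool's shutdownTimeLimit setting (the default is 90 seconds, I believe), and overlapped recycling itself by disallowOverlappingRotation. A sketch using appcmd; the pool name is a placeholder:

```
REM Give the old worker process up to 3 minutes to finish in-flight work
%windir%\system32\inetsrv\appcmd.exe set apppool "MyWorkflowPool" /processModel.shutdownTimeLimit:"00:03:00"

REM Make sure overlapped recycling has not been disabled for the pool
%windir%\system32\inetsrv\appcmd.exe set apppool "MyWorkflowPool" /recycling.disallowOverlappingRotation:false
```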
Several components of our system have "long running" operations. They can take anywhere from seconds to minutes, and vary in their CPU usage. E.g., report generation will peg a CPU for a few seconds, but data collection is largely spent waiting on a database query.
I am faced with two choices:
(1) Web role + worker role + queue + table. A worker role spins on a queue, gets a message with parameters, does the work, and updates a table with progress and completion flags. The client polls, displaying progress until the work is marked done. One web role; scale up the number of worker roles as needed.
(2) Web role + async method. Make my long-running operations use .NET 4.5's async/await, and mark the controller actions async. Scale up the number of web roles as needed.
Option 1 is obviously much more complicated, but has the advantage of keeping the web roles free to do web stuff and allowing proper queuing if things start getting really busy. Option 2 is simpler and requires fewer roles and storage resources, but doesn't it have the potential to choke up the entire website if things start getting busy? I am strongly leaning toward option 2 just for simplicity. Is there any particular reason not to do this? If the website starts slowing down, will simply increasing the number of web role instances solve the performance problems?
As with most architectural decisions, the answer is only "it depends".
In this case, option (2) is easier to code. If you're not expecting to scale massively, then I would say that's fine.
The key advantage to option (1) is that you have two knobs for scaling: your web roles handle web requests and your worker roles handle the work, and you can scale up your workers independently of your webs.
But unless you're going to scale massively, I wouldn't worry about it. Option (2) can scale quite well, just not with perfect efficiency. And if you do start scaling massively, you can (inefficiently) crank up scaling with option (2) and (presumably) use your massively scaling income to develop option (1).
P.S. You should use async for both options.
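For option (2), the controller action would look roughly like this. This is a sketch assuming ASP.NET MVC on .NET 4.5; ReportService and its GenerateAsync method are hypothetical stand-ins for your long-running operation:

```csharp
using System.Threading.Tasks;
using System.Web.Mvc;

public class ReportsController : Controller
{
    // Hypothetical service wrapping the long-running operation.
    private readonly ReportService _reports = new ReportService();

    // Marked async: while GenerateAsync awaits I/O (e.g. the database
    // query), the request thread returns to the pool instead of
    // blocking, so the web role can keep serving other requests.
    public async Task<ActionResult> Generate(int reportId)
    {
        var report = await _reports.GenerateAsync(reportId);
        return View(report);
    }
}
```

Note that async only frees the thread during awaited I/O; CPU-bound work like report generation still occupies the web role while it runs.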
As Stephen posted, this is going to be an "It Depends" type of answer.
In this case I would probably go with either the first option, or an option in between where you have the web role, queue, and table, but create an executable that also runs on your web role as a startup executable.
With option 2, you're going to be open to connection timeouts, instance restarts, etc., and any additional demand for resources will have to be met by scaling your web roles or rewriting code.
With option 1 (as Stephen pointed out), your workload is separated and you can scale the roles independently. In addition, letting the queues and workers handle the work allows the workers to manage their own lifespan, and you can build in some resiliency for planned or unplanned restarts by not removing queue items until you have finished the job (that way they will resurface if you crash in the middle). You also have the ability to take fuller advantage of the role resources (scale on threads first, then additional instances), since your polling mechanism, rather than random people hitting the website, controls the workload.

On the web side, you can then choose either an async method that waits for completion and returns, or to return a token and let the client poll for completion of the work; either case should be relatively simple code.
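The "don't remove queue items until the job finishes" pattern above might look like this with the classic Azure storage SDK (Microsoft.WindowsAzure.Storage); the queue name, connection-string source, and DoWork are assumptions:

```csharp
using System;
using System.Threading;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

class Worker
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse(
            Environment.GetEnvironmentVariable("STORAGE_CONNECTION"));
        CloudQueue queue = account.CreateCloudQueueClient()
                                  .GetQueueReference("work-items");
        queue.CreateIfNotExists();

        while (true)
        {
            // The message becomes invisible for 5 minutes rather than
            // being deleted; if the worker crashes mid-job, the message
            // reappears on the queue and is retried.
            CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(5));
            if (msg == null) { Thread.Sleep(1000); continue; }

            DoWork(msg.AsString);

            // Delete only after the job completes successfully.
            queue.DeleteMessage(msg);
        }
    }

    static void DoWork(string payload) { /* run the long operation */ }
}
```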
Option 1.5 may be the best starting point, though. If you're trying to start small, then using a background executable on your web roles will be the cheapest solution. You're going to want at least 2 web roles to ensure coverage by the SLA, so this solution lets you start with just those two instances and no more. Build the executable that does the queue polling and report (or whatever) execution separately, then configure it as a startup task for the web role. This keeps the cost down while you're starting, and the code for those exes is separate and becomes the code for your worker role later if you need to expand. The biggest thing to watch for in this situation is that your executable handles all exceptions: if the process exits due to an unhandled exception, it won't get restarted (unlike worker roles, which Azure will just keep restarting every time they die).
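The "handle all exceptions" point is worth a sketch. The top-level loop of the startup executable could be guarded like this; PollQueueOnce is a hypothetical stand-in for the queue-polling work:

```csharp
using System;
using System.Threading;

class QueueRunner
{
    static void Main()
    {
        while (true)
        {
            try
            {
                PollQueueOnce();
            }
            catch (Exception ex)
            {
                // Log and continue: an unhandled exception here would
                // exit the process, and Azure will not restart a
                // startup-task executable the way it restarts a role.
                Console.Error.WriteLine(ex);
            }
            Thread.Sleep(5000);
        }
    }

    static void PollQueueOnce() { /* check queue, run one job */ }
}
```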
Application pools in IIS are recycled very frequently and I can't figure out why. I remember reading about a possible issue in IIS6 that meant you were forced to recycle but a quick search now turns up empty. On IIS6 or 7 you can turn off the idle time, duration and specific time recycle options so no problems there.
So why does every .net site recycle the application pool? If a site didn't have any memory leaks could you set up a site that never needed to recycle?
Also, failing this, what would be the best way to ensure background tasks are called? Are there any auto-restart modules for IIS, or should an external service be used to make those calls?
It sounds like it is possible to do if you really wanted/needed to?
Websites are intended to keep running (albeit in a stateless nature). There are a myriad of reasons why app pool recycling can be beneficial to the hosting platform, to ensure both the website and the server run at optimum. These include (but are not limited to) dynamically compiled assemblies remaining in the appdomain, use of session caching (with no guarantee of cleanup), other websites running amok and consuming resources over time, etc. An app pool can typically serve more than one website, so app pool recycling can be beneficial to ensure everything runs smoothly.
Besides the initial boot when the app fires up again, the effect should be minimal. Http.sys holds onto requests while a new worker process is started so no requests should be dropped.
From https://weblogs.asp.net/owscott/why-is-the-iis-default-app-pool-recycle-set-to-1740-minutes
You may ask whether a fixed recycle is even needed. A daily recycle is just a band-aid to freshen IIS in case there is a slight memory leak or anything else that slowly creeps into the worker process. In theory you don't need a daily recycle unless you have a known problem. I used to recommend that you turn it off completely if you don't need it. However, I'm leaning more today towards setting it to recycle once per day at an off-peak time as a proactive measure.

My reason is that, first, your site should be able to survive a recycle without too much impact, so recycling daily shouldn't be a concern. Secondly, I've found that even well behaving app pools can eventually have something sneak in over time that impacts the app pool. I've seen issues from traffic patterns that cause excessive caching or something odd in the application, and I've seen the very rare IIS bug (rare indeed!) that isn't a problem if recycled daily. Is it a band-aid? Possibly, but if a daily recycle keeps a non-critical issue from bubbling to the top then I believe that it's a good proactive measure to save a lot of troubleshooting effort on something that probably isn't important to troubleshoot. However, if you think you have a real issue that is being suppressed by recycling then, by all means, turn off the auto-recycling so that you can track down and resolve your issue. There's no black and white answer. Only you can make the best decision for your environment.
There's a lot more useful and interesting info there for someone relatively new to the IIS world (like me); I recommend you read it.
I have a web application that's consuming a WCF service. Both are slow to warm up after an IIS reset or app pool recycle. So, as a possible solution, I installed Application Warm-Up for IIS 7.5 and set it up for both the website and the WCF service.
My concern is that it doesn't seem to make any difference: the first time I hit the site, it still takes a long time to come up. I checked the event logs and there are no errors. So I'm wondering if anything special needs to be done for that module to work.
In IIS manager, when you go into the site, then into Application Warm-Up, the right-hand side has an "Actions" pane. I think you need the following two things:
Click Add Request and add at least one URL, e.g. /YourService.svc
Click Settings, and check "Start Application Pool 'your pool' when service started"
Do you have both of these? If you don't have the second setting checked, then I think the warmup won't happen until a user hits the site (which probably defeats the purpose of the warmup module in your case).
There is a new module from Microsoft, part of IIS 8.0, that supersedes the previous warm-up module. This Application Initialization Module is available for IIS 7.5 as a separate download.
The module will create a warm-up phase where you can specify a number of requests that must complete before the server starts accepting requests. Most importantly it will provide overlapping processes so that the user will not be served by the newly started process before it is ready.
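For reference, the module is configured in web.config roughly like this; the warm-up URLs below are placeholders. The app pool also needs startMode="AlwaysRunning" and the site preloadEnabled="true" in applicationHost.config for initialization to happen before the first real request:

```xml
<system.webServer>
  <!-- Requests replayed against the new worker process before it
       starts taking live traffic -->
  <applicationInitialization doAppInitAfterRestart="true">
    <add initializationPage="/YourService.svc" />
    <add initializationPage="/warmup" />
  </applicationInitialization>
</system.webServer>
```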
I have answered a similar question with more details at How to warm up an ASP.NET MVC application on IIS 7.5?.
After you have applied the possible software/code optimizations, allow me to suggest that all code still needs processing by the hardware CPU. Our server skyrocketed in performance when we moved to a multicore CPU, installed more gigabytes of RAM, and connected a UTP Cat 6 cable instead of the standard Cat 5e cable to the server. That doesn't fix your problem, but if you are as obsessed with speed as we are, you may be interested in the various dimensions that bottleneck speed.
We have a Windows Server 2003 web server, and on that server runs about 5-6 top level Sharepoint sites, with a different application pool for each one.
There is one W3WP process that keeps pegging the CPU at 100% for most of the day (it happened yesterday and today), and it's tied (found by running "Cscript iisapp.vbs" at the command line and matching the ProcessID) to a particular SharePoint site... which is nearly unusable.
What kind of corrective action can I take? These are the following ideas I had
1) Stopping and restarting the website in IIS. For some reason this didn't stop the offending W3WP process. Any ideas why not?
2) Stopping and restarting the associated Application Pool.
3) Recycling the associated Application Pool.
Do any of those sound like the right idea? If not, what are some good things to try? I can't do an iisreset, since I don't want to interrupt service to the other, much more heavily used, SharePoint sites.
If I truly NEED to do some diagnostic work please point me in the right direction. I'm not the Sharepoint admin guy (he's out of town so I'm filling in even though I'm just a developer) but I'll do my best.
If you need any information just let me know and I'll look it up (slowly though, as that one process is pegging the entire machine).
It's not an IISReset that you need. You have a piece of code that is running amok with your memory. Most likely it's not actually a CPU problem but a paging problem. I've encountered this a few times with data structures in memory that grow too large to page in/out effectively and eventually the attempt to page data just begins consuming everything. The steps I would recommend are:
1) Go get the IIS Debug Diagnostics tools. And learn how to use them.
2) If possible, move the session state from InProc to a state server or a SQL server (since this requires serialization of all classes that go into session, it may not be possible). This will help alleviate some process-related memory issues.
3) Go to your application pool and adjust the number of worker processes upward, and disable Rapid-Fail Protection (this allows the site to continue serving pages even if rapid catastrophic errors occur).
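Step 2 above is mostly a web.config change. A sketch, using the ASP.NET State Service's default address and port as example values:

```xml
<system.web>
  <!-- Session lives in the ASP.NET State Service instead of the worker
       process, so a recycle (or multiple worker processes) no longer
       loses session data. Requires session types to be serializable. -->
  <sessionState mode="StateServer"
                stateConnectionString="tcpip=localhost:42424"
                timeout="20" />
</system.web>
```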
The IIS Debug Diagnostics will record a LOT of data, but you can specify specific "catch" alerts that will detect hangs, excessive CPU usage, etc. It will capture gigs of data, so be ready for a long wait when attempting to view the logs.
Turns out someone had tried to install some features that went haywire, so he wrote an stsadm script to uninstall those features. The processor was still pegged after that. I restarted the IIS application pool for that process, which didn't fix it; I then restarted IIS for that site, and that resolved the processor issue.
Maybe someone can shed some light on this simple question:
I have a .NET web application that has been thoroughly vetted. It loads a cache per appdomain (process) whenever one starts, and cannot fully reply to requests until this cache load completes.
I have been examining the settings on my application pools and have started wondering why I was even recycling so often (once every 1,000,000 calls or 2 hours).
What would prevent me from setting auto-recycles to being once every 24 hours or even longer? Why not completely remove the option and just recycle if memory spins out of control for the appdomain?
If your application runs reliably for longer than the threshold set for app pool recycling, then by all means increase the threshold. There is no downside if your app is stable.
For us, we have recycling turned off altogether, and instead have a task that loads a test page every minute and runs an iisreset if it fails to load five times in a row.
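A sketch of that watchdog as a small console task; the probe URL and the thresholds are assumptions:

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Threading;

class Watchdog
{
    static void Main()
    {
        const string probeUrl = "http://localhost/health-check";
        int consecutiveFailures = 0;

        while (true)
        {
            try
            {
                // Any response at all counts as healthy; non-2xx
                // status codes throw a WebException here.
                using (var client = new WebClient())
                    client.DownloadString(probeUrl);
                consecutiveFailures = 0;
            }
            catch (WebException)
            {
                consecutiveFailures++;
                if (consecutiveFailures >= 5)
                {
                    Process.Start("iisreset").WaitForExit();
                    consecutiveFailures = 0;
                }
            }
            Thread.Sleep(TimeSpan.FromMinutes(1));
        }
    }
}
```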
You should probably be looking at recycling from the point of view of reliability. Based on historical data, you should have an idea how much memory, CPU and so on your app uses, and the historical patterns and when trouble starts to occur. Knowing that, you can configure recycling to counter those issues. For example, if you know your app has an increasing memory usage pattern* that leads to the app running out of memory after a period of several days, you could configure it to recycle before that would have happened.
* Obviously, you would also want to resolve this bug if possible, but recycling can be used to increase reliability for the customer
The reason they do it is that an application can be "not working" even though its CPU and memory look fine (think deadlock). App pool recycling is a final failsafe measure that can keep flawed code from taking the application down for good.
Also, any code that has failed to dispose of its IDisposable resources will at least have finalizers run on the recycle, which may release held resources.