Watchdog win service to watch another win service - multithreading

I want to make a Windows service that monitors another Windows service and makes sure it is working.
Sometimes the service I want to watch stays in memory (it appears in Task Manager, so it is considered a running service), but in fact it is doing nothing; it is dead, its timer no longer firing for a reason that is not the subject of this question.
What I need is a watchdog Windows service that somehow reads a value in memory that the watched service periodically writes.
I thought about using named pipes, but I don't want to add communication issues to my services. I want to know if there is a way to create such shared memory between two applications (possibly using a named, system-wide mutex?).
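For reference, one way to get exactly this kind of shared value in .NET is a named memory-mapped file. A minimal sketch, assuming both services run in the same session; the map name "Local\ServiceHeartbeat" and the 8-byte size are made up:

// Cross-process shared memory via a named memory-mapped file.
// The watched service bumps a counter; the watchdog reads it back.
using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

static class SharedHeartbeat
{
    // Watched service: bump a counter once per second.
    public static void Writer()
    {
        using (var mmf = MemoryMappedFile.CreateOrOpen(@"Local\ServiceHeartbeat", 8))
        using (var view = mmf.CreateViewAccessor(0, 8))
        {
            long counter = 0;
            while (true)
            {
                view.Write(0, ++counter);
                Thread.Sleep(1000);
            }
        }
    }

    // Watchdog: read the counter and compare it with the last value seen.
    public static long Read()
    {
        using (var mmf = MemoryMappedFile.OpenExisting(@"Local\ServiceHeartbeat"))
        using (var view = mmf.CreateViewAccessor(0, 8))
        {
            return view.ReadInt64(0);
        }
    }
}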

Since you have to detect a zombie service, I don't think a kernel object like a mutex will help; you need to detect activity. A semaphore isn't a good fit either.
My personal preference would be a named pipe sending small heartbeat messages (since that could be detected across a network as well), but if you want to avoid the complexity of pipe comms - which I guess is understandable - then you could update a DWORD in a predetermined registry key. If both services run under LocalSystem you could write a key/value into HKEY_LOCAL_MACHINE. Have the watched service run a timer that bumps the value, and have the watchdog check the key for changes every so often (watch out for counter wrap-around). You won't have a normal window/message pump so SetTimer is off-limits, but you can still use timeSetEvent or waitable timers.
HKLM won't be available if one of the services runs under a non-admin account, but that's a pretty rare situation for services. Of course all this assumes you have access to the code of both services. Watching a third-party service would severely limit your options.
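A rough sketch of that registry heartbeat, assuming both services run under LocalSystem; the key path and value name below are placeholders:

// Watched service calls Beat() from its timer; watchdog calls
// HasBeatenSince() every few minutes and restarts the service if it
// returns false repeatedly.
using System;
using Microsoft.Win32;

static class RegistryHeartbeat
{
    // Made-up key path; both services must be able to write/read HKLM.
    const string KeyPath = @"SOFTWARE\MyCompany\ServiceWatchdog";

    // Watched service: increment the counter (wrap-around is harmless).
    public static void Beat()
    {
        using (RegistryKey key = Registry.LocalMachine.CreateSubKey(KeyPath))
        {
            int current = (int)key.GetValue("Heartbeat", 0);
            key.SetValue("Heartbeat", unchecked(current + 1), RegistryValueKind.DWord);
        }
    }

    // Watchdog: has the counter moved since we last looked?
    // Comparing for inequality copes with counter wrap-around.
    public static bool HasBeatenSince(ref int lastSeen)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(KeyPath))
        {
            int current = key == null ? 0 : (int)key.GetValue("Heartbeat", 0);
            bool moved = current != lastSeen;
            lastSeen = current;
            return moved;
        }
    }
}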

Related

Long running (or forever) task on Windows Azure

I need to write some data to a database every 50 seconds or so. It's similar to a Windows service running in the background and silently doing its job. Starting and stopping is not an option in my case, as I need a small amount of previously inserted data to stay in memory. What's the best solution for this on Windows Azure or AWS?
Thank you.
With Windows Azure, you can choose either a Web or Worker role (both basically Windows Server 2008 R2 or 2008 SP2) and have some type of timed event, as @Lucifure suggested. You could also run a scheduler, like Quartz.net, or take advantage of Windows Azure queues or Service Bus queues to have messages show up at a certain time.
However: you cannot have a "forever" task in a given role instance, in that periodically your VM instances will be rebooted (e.g. for host OS maintenance every month). With role shutdowns you'll get notice, which you can handle in Stopping() or OnStop(). If you have multiple instances, you can use a scheduler or queue to ensure your events still trigger every 50 seconds or so and get handled across multiple instances (but only by one instance at any given time).
To preserve your in-memory information, one idea is to store that information in a cache. You have 2 choices:
Distributed (shared) cache service, which has been around for some time now. It runs independently of your role instances.
In-memory cache, just introduced in June 2012. Assuming you have more than one instance, the cache is spread across those instances. You can even run the cache inside the memory of your existing roles.
More information on caching is here.
There are a few StackOverflow answers regarding Quartz.net and Windows Azure, such as this one.
On Windows Azure, you can use a Worker Role, which can do this. It can be as simple as a while loop.
Try this article for an introduction.
http://www.c-sharpcorner.com/uploadfile/40e97e/windows-azu-creating-and-deploying-worker-role/
You could set up a System.Threading.Timer to fire every 50 seconds or so, and do your work whenever the event occurs.
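A minimal sketch of that approach in a worker role, using the standard RoleEntryPoint pattern; WriteToDatabase is a placeholder for the actual job:

// Worker role that runs a System.Threading.Timer every 50 seconds and
// stays alive until the fabric asks it to stop.
using System;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    private Timer _timer;
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);

    public override void Run()
    {
        // Fire immediately, then roughly every 50 seconds.
        _timer = new Timer(_ => WriteToDatabase(), null,
                           TimeSpan.Zero, TimeSpan.FromSeconds(50));
        _stop.WaitOne();   // keep Run() alive until OnStop signals us
    }

    public override void OnStop()
    {
        _timer.Dispose();
        _stop.Set();       // let Run() return so the role shuts down cleanly
        base.OnStop();
    }

    private void WriteToDatabase() { /* your 50-second job goes here */ }
}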

Azure Development - How to stop a Web Role instance

I need to test how my code will handle the failure of a web role instance in a development environment.
How do I terminate one of the instances? I can't see any option in the UI for this. It seems like a strange omission.
Update
The issue relates to a distributed cache layer (I know that Azure offers its own).
I want to be able to test how the system reacts to a missing or additional node etc
Perhaps my real question is: how up to date is RoleEnvironment.CurrentRoleInstance.Role.Instances?
The need to simulate ungraceful exits in the dev emulator usually arises because you are doing something stateful or long-running in your web role. That is generally discouraged, but sometimes it is unavoidable.
I suspect the best way to simulate a failure is to kill processes. If you open Task Manager (or better, Process Explorer), you will see "WatDebugger" hosting either "WaIISHost" or "WaWorkerHost". If you kill this process, I think it will simulate a failure.
Honestly, though, it is easier to test this in the cloud. You can RDP into one of the instances and kill the 'WaAppAgent' process. That will kill your RoleEntryPoint and the fabric controller agent, which will be a true ungraceful failure.
By failure, do you mean becoming unavailable? It should be seamless because the next request would simply be handled by one of the other instances. As long as there is one instance available Azure will route calls to that instance.
This is the nature of a high-available system, requests are handled by the available instances. This is why you have multiple instances in the first place, to handle requests in the case of failure in one or more instances.
This is why you need to always be watchful of how your application handles state. State needs to be maintained outside of the instance, either in queues or in a database. This ensures that any process can pickup a piece of work and execute against it.
There is another question dealing with Session State that should help: How does Microsoft Azure handle Session State?
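Regarding the update about RoleEnvironment.CurrentRoleInstance.Role.Instances: the service runtime raises RoleEnvironment.Changed after the fabric applies a change, so you can subscribe to topology changes rather than polling. A hedged sketch (the re-balancing step is a placeholder for whatever your cache layer needs to do):

// Watch for topology changes so a distributed cache layer can react when
// an instance disappears or is added.
using System.Linq;
using Microsoft.WindowsAzure.ServiceRuntime;

public static class TopologyWatcher
{
    public static void Start()
    {
        RoleEnvironment.Changed += (sender, e) =>
        {
            // Fires after the fabric applies a change, e.g. scaling the
            // instance count up or down.
            if (e.Changes.OfType<RoleEnvironmentTopologyChange>().Any())
            {
                int count = RoleEnvironment.CurrentRoleInstance.Role.Instances.Count;
                // Re-balance / re-hash your cache nodes here.
            }
        };
    }
}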
By 'terminate an instance' do you mean reducing the instance count and seeing which one gets killed? I like Ryan's view about ungraceful exits, but if it's a forced kill by the fabric it'll be a different ball game.

WF4 Affinity on Windows Azure and other NLB environments

I'm using Windows Azure and WF4, and my workflow service is hosted in a web role (with N instances). My task now is to find out how to handle affinity, so that I can send messages to the right workflow instance. To explain this scenario: my workflow (attached) starts with a "StartWorkflow" Receive activity, creates 3 "Person" instances and, in a parallel for-each, waits for the confirmation of these 3 people (the "ConfirmCreation" Receive activity).
I then started to research how affinity is handled in other NLB environments (mainly looking for information about how this works on Windows Server AppFabric), but I didn't find a precise answer. So how is it done in other NLB environments?
My next task is to find out how I could implement a system to handle this affinity on Windows Azure, and how much such a solution would cost (in price, time and amount of work), to see if it's viable or whether it's better to work with only one web-role instance while we wait for the WF4 host for Azure AppFabric. The only way I found was to persist the workflow instance. Are there other ways of doing this?
My third, but not last, task is to find out how WF4 handles multiple messages received at the same time. In my scenario, this means how it would behave if the 3 people confirmed at the same time and the confirmation messages were also received at the same time. Since the most logical answer to this problem seems to be a queue, I started looking for information about queues in WF4 and found people talking about MSMQ. But what is WF4's native message-handling mechanism? Is it really a queue, or is it another system? How is this concurrency handled?
You shouldn't need any affinity. In fact that's kinda the whole point of durable Workflows. Whilst your workflow is waiting for this confirmation it should be persisted and unloaded from any one server.
As far as persistence goes for Windows Azure you would either need to hack the standard SQL persistence scripts so that they work on SQL Azure or write your own InstanceStore implementation that sits on top of Azure Storage. We have done the latter for a workflow we're running in Azure, but I'm unable to share the code. On a scale of 1 to 10 for effort, I'd rank it around an 8.
As far as multiple messages go, the messages will be received and delivered to the workflow instance one message at a time. Now, it's possible that every one of those messages goes to the same server, or maybe each one goes to a different server. No matter how it happens, the workflow runtime will attempt to load the workflow from the instance store, see that it is currently locked, and block/retry until the workflow becomes available to process the next message. So you don't have to worry about concurrent access to the same workflow instance as long as you configure everything correctly and the InstanceStore implementation is doing its job.
Here's a few other suggestions:
Make sure you use the PersistBeforeSend option on your SendReply activities
Configure the following workflow service options
<workflowIdle timeToUnload="00:00:00" />
<sqlWorkflowInstanceStore ... instanceLockedExceptionAction="AggressiveRetry" />
Using the out-of-the-box SQL instance store with SQL Azure is a bit of a problem at the moment with the Azure 1.3 SDK, as each deployment, even if you made zero code changes, results in a new service deployment, meaning that already-persisted workflows can't continue. That is a bug that will be solved, but it's a PITA for now.
As Drew said, your workflow instance should just move from server to server as needed; there is no need to pin it to a specific machine. And even if you could, that would hurt scalability and reliability, so it's something to avoid.
Sending messages through MSMQ using the WCF NetMsmqBinding works just fine. Internally WF uses a completely different mechanism called bookmarks that allows a workflow to stop and resume. Each Receive activity, as well as others like Delay, will create a bookmark and wait for it to be resumed. You can only resume existing bookmarks. Even resuming a bookmark is not a direct action; it is put into an internal queue (not MSMQ) by the workflow scheduler and executed through a SynchronizationContext. You get no control over the scheduler, but you can replace the SynchronizationContext when using the WorkflowApplication and so get some control over how and where activities are executed.
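To make the bookmark mechanism concrete, here is a small sketch of a custom activity that creates a bookmark, plus the host-side resume; the activity name, property and bookmark name ("Person1Confirmed") are invented:

// A native activity that parks the workflow on a bookmark until the host
// resumes it by name.
using System.Activities;

public class WaitForConfirmation : NativeActivity
{
    public string BookmarkName { get; set; }

    // Must be true so the workflow can go idle (and be persisted) here.
    protected override bool CanInduceIdle { get { return true; } }

    protected override void Execute(NativeActivityContext context)
    {
        // Execution stops here until the host resumes this bookmark.
        context.CreateBookmark(BookmarkName, OnConfirmed);
    }

    private void OnConfirmed(NativeActivityContext context, Bookmark bookmark, object value)
    {
        // 'value' is whatever the host passed to ResumeBookmark.
    }
}

// Host side, e.g. when a confirmation message arrives:
//   WorkflowApplication app = ...;
//   BookmarkResumptionResult result = app.ResumeBookmark("Person1Confirmed", data);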

Which one to use: Windows services or threading?

We have a web application built using ASP.NET 3.5 with SQL Server as the database. It is quite big and is used by around 300 super users to manage around 5000 staff.
Now we are implementing SMS functionality in the application, which means the users will be able to send and receive SMS messages. Every two minutes the third party's SMS server is pinged to check whether there are any new messages. Also, outgoing SMS messages are held in a queue and sent at intervals of 15 to 30 minutes.
I want this checking and sending process to run in the background of the application all the time, even if the user closes the browser window.
I need some advice on how do I do this?
Will using a thread achieve this, or do I need to create a Windows service for it, or are there any other options?
More information:
I want to execute a task on a timer; what will happen if I close the browser window? The task won't be completed, will it?
For example, I am saving 10 records to the database over a time interval of 5 minutes, which means that every 5 minutes, when the timer tick event fires, a record is inserted into the database.
How do I run this task if I close the browser window?
I tried looking at Windows services, but how do I pass a generic collection of data to one for processing?
There really is no 'thread or service' choice: a service can be (and usually is!) multi-threaded, and a thread can start a service.
There are three basic choices:
Somehow start another thread running when a user logs in -- this is probably a very poor choice for what you want, as you cannot really keep it running once the user session is lost.
Write a fully fledged Windows service which starts on OS startup and continues running until the server is shut down. You can make it dependent on the SQL Server service, so it starts after the DB is available. This is the "best" solution but may be overkill for your purposes; a rough sketch is shown after this list. You also need to know the services API to write it properly, as you need to respond correctly to shutdown and status requests.
You can schedule your task periodically using either the Windows scheduler or, preferably, the scheduler built into SQL Server; I think this would be the most suitable option for your needs.
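As promised in option 2 above, a rough sketch of such a service; the class and method names (SmsService, PollSmsServer, FlushQueue) are made up, and the real work would go against your database and the third-party gateway (passing data between the web app and the service via the database rather than in-process collections):

// A Windows service that polls the SMS gateway on one timer and flushes
// the outgoing queue on another.
using System;
using System.ServiceProcess;
using System.Threading;

public class SmsService : ServiceBase
{
    private Timer _pollTimer;
    private Timer _sendTimer;

    protected override void OnStart(string[] args)
    {
        // Check the third-party server every 2 minutes...
        _pollTimer = new Timer(_ => PollSmsServer(), null,
                               TimeSpan.Zero, TimeSpan.FromMinutes(2));
        // ...and flush the outgoing queue every 15 minutes.
        _sendTimer = new Timer(_ => FlushQueue(), null,
                               TimeSpan.Zero, TimeSpan.FromMinutes(15));
    }

    protected override void OnStop()
    {
        _pollTimer.Dispose();
        _sendTimer.Dispose();
    }

    private void PollSmsServer() { /* fetch new messages, write them to the DB */ }
    private void FlushQueue()    { /* read pending rows from the DB and send them */ }

    public static void Main()
    {
        ServiceBase.Run(new SmsService());
    }
}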
Distinguish between what the browser is doing and what's happening server-side.
Your web app sits server-side waiting for requests from whatever browsers may be running, and services those requests; in doing so, I guess it may well put messages on a queue and look in a database for any new messages.
You want the daemon process, which talks to the third-party SMS server, to be triggered by time rather than by browser activity. Either of your suggestions would work:
A completely independent service could run and work against the queues and database.
Your web app, which I assume is already a service, could spawn a thread
In either case there are a few technical questions about avoiding race conditions between the browser-request processing and the daemon, but databases and queueing systems can deal with that.
So I would decide between stand-alone daemon and background thread like this:
Which is easier to implement? I'm a Java EE developer; I know that in my app server I have an API for specifying code to be run according to a timer, and the API deals with the threading issues, so for me that's very easy. I don't know what you have available. Timers are not quite as trivial as they may appear, so having a reliable API is beneficial. If this were a more complex requirement, where the daemon code were gnarly and might possibly interfere with the web-app code, then I might prefer to keep it conspicuously separate.
Which is easier to deploy and administer? Deploy separate Web App and daemon, or deploy one thing. In the Java EE world we could have a single Enterprise Application with all the code, so that's a single thing to deploy, start and control.
One other thing to consider: scaling and resilience. You might choose to have more than one copy of your web app running, either to provide fail-over capabilities or just because you need the extra power. In which case, how many daemons would you have? Would it be a problem to have two daemons running? You might need some extra code to mediate between two daemons, for example logging the time of the last run in the database so each daemon can say "Oh, my buddy already did the 10:30 job, I'll go back to sleep".
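A minimal sketch of that mediation idea, using an atomic UPDATE against an invented JobLog table so only one daemon can claim a given run:

// Each daemon tries to claim the next run by advancing LastRun atomically;
// only one UPDATE can win, so duplicate runs are suppressed. Schema is
// invented: CREATE TABLE JobLog (JobName nvarchar(50) PRIMARY KEY, LastRun datetime)
using System;
using System.Data.SqlClient;

static class JobClaim
{
    public static bool TryClaim(string connectionString, string jobName, TimeSpan interval)
    {
        const string sql =
            @"UPDATE JobLog SET LastRun = @now
              WHERE JobName = @job AND LastRun <= @cutoff";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            DateTime now = DateTime.UtcNow;
            cmd.Parameters.AddWithValue("@now", now);
            cmd.Parameters.AddWithValue("@job", jobName);
            cmd.Parameters.AddWithValue("@cutoff", now - interval);
            conn.Open();
            // Exactly one daemon sees a row affected; the other goes back to sleep.
            return cmd.ExecuteNonQuery() == 1;
        }
    }
}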

OS resources automatically cleaned up

From this answer: When is a C++ terminate handler the Right Thing(TM)?
It would be nice to have a list of resources that 'are' and 'are not' automatically cleaned up by the OS when an application quits. In your answer it would be nice if you could specify the OS/resource and preferably a link to some documentation (if appropriate).
The obvious one:
Memory: yes, automatically cleaned up.
Question: are there any exceptions?
There are some obscure resources that Windows does not clean up when an app crashes or exits without explicitly releasing them, mostly because the OS doesn't know if they're important to leave around or not.
Temporary files -- as others have mentioned.
Globally registered WNDCLASSes ("No window classes registered by a DLL are unregistered when the DLL is unloaded. A DLL must explicitly unregister its classes when it is unloaded." MSDN) If your global window class also has a class DC, then that DC will leak as well.
Global ATOMs (a relatively limited resource).
Window message IDs created with RegisterWindowMessage. These are designed to leak, since there's no UnregisterWindowMessage.
Semaphores and Events aren't technically leaked, but when the owning application goes away without signalling them, other processes can hang. This is not true for a Mutex: if the owning application goes away, other processes waiting on that Mutex are released (in .NET this surfaces as an AbandonedMutexException; see the sketch after this list).
There may be some residual weirdness on Windows XP and earlier if you don't unregister a hot key before exiting. Other applications may be unable to register the same hot key.
On Windows XP and earlier, it's not uncommon to have a zombie console window live on after a process crashes. (Specifically, a GUI application that also creates a console window.) It shows up on the task bar. All you can do is minimize, restore, or move the window.
Buggy drivers can be aggravated by apps that don't explicitly release resources when they exit. Non-paged pool leaks are fairly common.
Data copied to the clipboard. I guess that doesn't really count because it's owned by the OS at that point, not the application that put it there.
Globally installed hooks aren't unloaded when the installing process crashes before removing the hook.
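To illustrate the Mutex point above: in .NET the released waiter observes the abandonment as an AbandonedMutexException rather than hanging. A small sketch with a made-up mutex name:

// If the process holding the named mutex exits without releasing it, a
// blocked waiter is released with AbandonedMutexException instead of hanging.
using System;
using System.Threading;

class AbandonedMutexDemo
{
    static void Main()
    {
        using (var mutex = new Mutex(false, @"Global\DemoOwnedMutex"))
        {
            try
            {
                mutex.WaitOne();   // blocks until the owner releases... or dies
            }
            catch (AbandonedMutexException)
            {
                // The owner exited without releasing; we are not left hanging.
                // We now own the mutex, but any state it guarded may be suspect.
            }
            // Both paths above leave us owning the mutex.
            mutex.ReleaseMutex();
        }
    }
}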
Temporary files are a good example of something that will not be cleaned up: the handle is released, but the file isn't deleted.
In Windows, just about anything you can get a handle to is in fact managed by the OS - that's why you only get a handle. This includes, but is not limited to, the following (list copied from the MSDN docs for the CloseHandle() API):
Communications device
Console input
Console screen buffer
Event
File
File mapping
Job
Mailslot
Mutex
Named pipe
Process
Semaphore
Socket
Thread
Token
All of these should be recovered by the OS when an application closes, though possibly not immediately, depending on their use by other processes.
Other operating systems work in the same way. It's hard to imagine an OS worth its name (I exclude embedded systems, etc.) where this is not the case - resource management is the #1 raison d'être for an operating system.
Any exception is a bug - applications can and do crash and do contain leaks. An OS needs to be reliable and not exhaust resources even in the face of poorly written applications. This also applies to non-OS resources. Services that hand out resources to processes need to free those resources when the process exits. If they don't it is a bug that needs to be fixed.
If you're looking for program artifacts which can persist beyond process exit, on Windows you have at least:
Registry keys that are created without REG_OPTION_VOLATILE
Files created without FILE_FLAG_DELETE_ON_CLOSE (see the sketch after this list)
Event log entries
Paper that was used for print jobs
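For the FILE_FLAG_DELETE_ON_CLOSE item above, .NET exposes the same flag as FileOptions.DeleteOnClose; a quick sketch with a made-up file name:

// A temp file opened with DeleteOnClose disappears when the last handle is
// closed, even if the process crashes; without the flag it would persist.
using System.IO;

class TempFileDemo
{
    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "scratch.tmp"); // made-up name
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                       FileShare.None, 4096, FileOptions.DeleteOnClose))
        {
            // write scratch data...
        }   // file is gone here, and also if the process dies while it is open
    }
}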
