Is there a timeout function for receiveChan?

In Cloud Haskell, the expect function has a timeout cousin called expectTimeout. Is there also a timeout function for receiveChan (for the type-safe channel)?
Currently, my application waits indefinitely for a reply, so if a process dies, my application ends up deadlocked. It would be nice if I could set a timeout so that it can ignore processes that have died.

There is a receiveChanTimeout in version 0.4.2:
http://hackage.haskell.org/packages/archive/distributed-process/0.4.2/doc/html/Control-Distributed-Process.html

Related

redis-py not closing threads on exit

I am using redis-py 2.10.6 and redis 4.0.11.
My application uses redis for both the db and the pubsub. When I shut down I often get either a hang or a crash. The latter usually complains about a bad file descriptor or an I/O error on a file (I don't use any files), and it happens while handling a pubsub callback, so I'm guessing the underlying issue is the same: somehow I don't get disconnected properly, and the pool used by my redis.Redis object is still alive and kicking.
An example of the output of the former kind of error (during _read_from_socket):
redis.exceptions.ConnectionError: Error while reading from socket: (9, 'Bad file descriptor')
Other times the stacktrace clearly shows redis/connection.py -> redis/client.py -> threading.py, which proves that redis isn't killing the threads it uses.
When I start the application I run:
self.redis = redis.Redis(host=XXXX, port=XXXX)
self.pubsub = self.redis.pubsub()
subscriptions = {'chan1': self.cb1, 'chan2': self.cb2} # cb1 and cb2 are functions
self.pubsub.subscribe(**subscriptions)
self.pubsub_thread = self.pubsub.run_in_thread(sleep_time=1)
When I want to exit the application, the last instruction I execute in main is a call to a function in my redis-using class, whose implementation is:
self.pubsub.close()
self.pubsub_thread.stop()
self.redis.connection_pool.disconnect()
My understanding is that in theory I do not even need to do any of these 'closing' calls, and yet, with or without them, I still can't guarantee a clean shutdown.
My question is, how am I supposed to guarantee a clean shutdown?
I ran into this same issue and it's largely caused by improper handling of the shutdown by the redis library. During the cleanup, the thread continues to process new messages and doesn't account for situations where the socket is no longer available. After scouring the code a bit, I couldn't find a way to prevent additional processing without just waiting.
Since this is run during a shutdown phase and it's a remedy for a 3rd party library, I'm not overly concerned about the sleep, but ideally the library should be updated to prevent further action while shutting down.
self.pubsub_thread.stop()  # signal the worker thread to stop
time.sleep(0.5)            # give any in-flight callback a moment to finish
self.pubsub.reset()        # tear down the pubsub connection state
This might be worth an issue log or PR on the redis-py library.
The PubSubWorkerThread class checks self._running.is_set() inside its loop.
To do a "clean shutdown" you should call self.pubsub_thread._running.clear() (the attribute is a standard threading.Event) to clear the thread's event, and it will stop.
You can see how it works here:
https://redis.readthedocs.io/en/latest/_modules/redis/client.html?highlight=PubSubWorkerThread#
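Putting those pieces together, the following shutdown sequence is a sketch of the idea, assuming the self.redis / self.pubsub / self.pubsub_thread attributes from the question (exact behavior varies between redis-py versions). The essential part is the ordering: stop the worker thread before tearing down the connections it reads from.
def shutdown(self):
    # Ask the worker thread to exit its loop; stop() clears the internal
    # _running event that PubSubWorkerThread checks on each iteration.
    self.pubsub_thread.stop()
    # Give any in-flight callback a moment to return before the socket
    # underneath it goes away (a workaround, not a guarantee).
    self.pubsub_thread.join(timeout=1.0)
    # Close the pubsub connection only once the thread is no longer
    # reading from it.
    self.pubsub.close()
    # Finally, release every pooled connection.
    self.redis.connection_pool.disconnect()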

thread with a forever loop with one inherently async operation

I'm trying to understand the semantics of async/await in an infinitely looping worker thread started inside a Windows service. I'm a newbie at this, so give me some leeway here; I'm trying to understand the concept.
The worker thread will loop forever (until the service is stopped) and it processes an external queue resource (in this case a SQL Server Service Broker queue).
The worker thread uses config data which could be changed while the service is running, by receiving commands on the main service thread via some kind of IPC. Ideally the worker thread should process those config changes while waiting for the external queue messages to be received. Reading from Service Broker is inherently asynchronous: you literally issue a "WAITFOR RECEIVE" T-SQL statement with a receive timeout.
But I don't quite understand the flow of control I'd need to use to do that.
Let's say I used a ConcurrentQueue to pass config change messages from the main thread to the worker thread. Then, if I did something like...
void ProcessBrokerMessages() {
    foreach (BrokerMessage m in ReadBrokerQueue()) {
        ProcessMessage(m);
    }
}

// ... inside the worker thread:
while (!serviceStopped) {
    foreach (var configChange in configChangeConcurrentQueue) {
        processConfigChange(configChange);
    }
    ProcessBrokerMessages();
}
...then the foreach loop to process config changes and the broker processing function need to "take turns" to run. Specifically, the config-change-processing loop won't run while the potentially-long-running broker receive command is running.
My understanding is that simply turning ProcessBrokerMessages() into an async method doesn't help me in this case (or I don't understand what would happen). To me, with my lack of understanding, the most intuitive interpretation is that when I hit the async call, it would go off and do its thing, and execution would continue with the next iteration of the outer while loop; but that would mean the loop would execute ProcessBrokerMessages() over and over even though it's already running from the invocation in the previous iteration, which I don't want.
As far as I know this is not what would happen, though I only "know" that because I've read something along those lines. I don't really understand it.
Arguably the existing flow of control (i.e., without the async call) is OK: if config changes affect the ProcessBrokerMessages() function (which they can), then the config can't be changed while the function is running anyway. But that seems like a point specific to this particular example. I can imagine a case where config changes affect something else the thread does, unrelated to the ProcessBrokerMessages() call.
Can someone improve my understanding here? What's the right way to have
a block of code which loops over multiple statements
where one (or some) but not all of those statements are asynchronous
and the async operation should only ever be executing once at a time
but execution should keep looping through the rest of the statements while the single instance of the async operation runs
and the async method should be called again in the loop if the previous invocation has completed
It seems like I could use a BackgroundWorker to run the receive statement, which flips a flag when its job is done, but it also seems weird to me to create a thread specifically for processing the external resource and then, within that thread, create a BackgroundWorker to actually do that job.
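One way to get that flow is to hold on to the in-flight operation and only start a new one once the previous one has completed, checking for completion on each pass through the loop (in C# that would be a stored Task checked via its IsCompleted property). Below is a minimal sketch of the pattern using Python's asyncio; the function names and timings are made up for illustration and do not correspond to a real Service Broker API.
import asyncio
import random

async def receive_broker_messages():
    # Stand-in for the long-running "WAITFOR RECEIVE" call.
    await asyncio.sleep(2)
    return ["msg-%d" % random.randint(0, 99)]

async def worker(stop_event, config_changes):
    receive_task = None
    while not stop_event.is_set():
        # Config changes are processed on every pass, even while the
        # long-running receive is still in flight.
        while not config_changes.empty():
            print("applying config change:", config_changes.get_nowait())
        if receive_task is None:
            # No receive in flight: start one (exactly one at a time).
            receive_task = asyncio.create_task(receive_broker_messages())
        elif receive_task.done():
            # The previous receive finished: consume its results and let
            # the next iteration start a fresh one.
            for m in receive_task.result():
                print("processing broker message:", m)
            receive_task = None
        await asyncio.sleep(0.1)  # yield so this loop doesn't spin hot
The key point is that the loop never awaits the receive directly; it polls for completion, so the config-processing statements keep running while the single receive is pending.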
You could use a CancellationToken. Most async functions accept one as a parameter, and they cancel the call (the returned Task, actually) if the token is signaled. SqlCommand.ExecuteReaderAsync (which you're likely using to issue the WAITFOR RECEIVE) is no different. So:
Have a cancellation token passed to the 'execution' thread.
The settings monitor (the one responding to IPC) also has a reference to the token.
When a config change occurs, the monitor makes the config change and then signals the token.
The execution thread aborts any pending WAITFOR (or any pending processing in the message processing loop; you should use the cancellation token everywhere). Any transaction is aborted and rolled back.
Restart the execution thread with a new cancellation token. It will use the new config.
So in this particular case I decided to go with a simpler shared-state solution. This is of course a less sound solution in principle, but since there's not a lot of shared state involved, and since the overall application isn't very complicated, it seemed forgivable.
My implementation here is to use locking, but to have writes to the config from the service's main thread wrapped in a Task.Run(). The reader doesn't bother with a Task, since the reader is already in its own thread.
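For what it's worth, the heart of that shared-state approach fits in a few lines. This is a Python analog of the idea (the question is C#), with a made-up SharedConfig name; the Task.Run() wrapper has no direct counterpart here, as writes simply happen under the lock from whichever thread makes them.
import threading

class SharedConfig:
    # Lock-guarded config shared between the main (IPC) thread and the worker.
    def __init__(self, **initial):
        self._lock = threading.Lock()
        self._values = dict(initial)

    def update(self, **changes):
        # Called from the main thread when an IPC command arrives.
        with self._lock:
            self._values.update(changes)

    def snapshot(self):
        # Called by the worker at the top of each loop iteration, so it
        # always works against a consistent copy of the settings.
        with self._lock:
            return dict(self._values)
The worker takes a snapshot() at the top of each iteration, so a config change takes effect from the next pass of the loop onward.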

Long running infinity in node.js

I can't understand one thing. On the server I have some process which runs forever in async mode. For example, like this:
function loginf() {
    console.log(1 + 1);
    process.nextTick(loginf);
}
loginf();
It's recursion, and as I understand it, it must cause a stack overflow and/or eat up memory.
How do I run something forever in node.js without a memory leak? Is it possible?
If you want to do something repeatedly and you want to do it in a friendly way that leaves cycles for other events in your server to be processed, then the usual way to do that is with setInterval().
setInterval(function() {
    console.log("called");
}, 1000);
Repeatedly calling the same function like you are doing with process.nextTick() is not really recursion and does not lead to a stack overflow because the stack completely unwinds before the event queue calls the function the next time. It finishes the current path of execution and then calls the function you passed to nextTick().
Your choices for this type of operation are:
setInterval()
setTimeout()
setImmediate()
process.nextTick()
All four choices let the current thread of execution finish before calling the callback function, so there is no stack build-up.
setInterval() uses a system timer set for some time in the future and allows all other events currently in the queue, or in the queue before the timer fires, to be serviced before calling the setInterval() callback. Use setInterval() when a time pause between calls to the callback is advisable and you want the callback called repeatedly.
setTimeout() uses a system timer set for some time in the future and allows all other events currently in the queue or in the queue before the timer time occurs to be serviced before calling the setTimeout() callback. You can use setTimeout() repeatedly (setting another timeout from each callback), though this is generally what setInterval() is designed for. setTimeout() in node.js does not follow the minimum time interval that browsers do, so setTimeout(fn, 1) will be called pretty quickly, though not as quickly as setImmediate() or process.nextTick() due to implementation differences.
setImmediate() runs as soon as other events that are currently in the event queue have all been serviced. This is thus "fair" to other events in the system. Note that this is more efficient than setTimeout(fn, 0) because it doesn't need to use a system timer; it is coded right into the event sub-system. Use this when you want the stack to unwind and you want other events already in the queue to get processed first, but you want the callback to run as soon as possible otherwise.
process.nextTick() runs as soon as the current thread of execution finishes (and the stack unwinds), but BEFORE any other events currently in the event queue. This is not "fair" and if you run something over and over with process.nextTick(), you will starve the system of processing other types of events. It can be used once to run something as soon as possible after the stack unwinds, but should not be used repeatedly over and over again.
Some useful references:
setImmediate vs. nextTick
Does Node.js enforce a minimum delay for setTimeout?
NodeJS - setTimeout(fn,0) vs setImmediate(fn)
setImmediate vs process.nextTick vs setTimeout

Is the first thread that gets to run inside a Win32 process the "primary thread"? Need to understand the semantics

I create a process using CreateProcess() with the CREATE_SUSPENDED flag and then go ahead to create a little patch of code inside the remote process to load a DLL and call a function (exported by that DLL), using VirtualAllocEx() (with ..., MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE) and WriteProcessMemory(), then call FlushInstructionCache() on that patch of memory with the code.
After that I call CreateRemoteThread() to invoke that code, creating me a hRemoteThread. I have verified that the remote code works as intended. Note: this code simply returns, it does not call any APIs other than LoadLibrary() and GetProcAddress(), followed by calling the exported stub function that currently simply returns a value that will then get passed on as the exit status of the thread.
Now comes the peculiar observation: remember that the PROCESS_INFORMATION::hThread is still suspended. When I simply ignore hRemoteThread's exit code and also don't wait for it to exit, all goes "fine". The routine that calls CreateRemoteThread() returns and PROCESS_INFORMATION::hThread gets resumed and the (remote) program actually gets to run.
However, if I call WaitForSingleObject(hRemoteThread, INFINITE) or do the following (which has the same effect):
DWORD exitCode = STILL_ACTIVE;
while (STILL_ACTIVE == exitCode)
{
    Sleep(500);
    if (!GetExitCodeThread(hRemoteThread, &exitCode))
        break;
}
followed by CloseHandle(), this leads to hRemoteThread finishing before PROCESS_INFORMATION::hThread gets resumed, and the process simply "disappears". Merely allowing hRemoteThread to finish while PROCESS_INFORMATION::hThread has not yet run is enough to cause the process to die.
This looks suspiciously like a race condition, since under certain circumstances hRemoteThread may still be faster and the process would likely still "disappear", even if I leave the code as is.
Does that imply that the first thread that gets to run within a process becomes automatically the primary thread and that there are special rules for that primary thread?
I was always under the impression that a process finishes when its last thread dies, not when a particular thread dies.
Also note: there is no call to ExitProcess() involved here in any way, because hRemoteThread simply returns and PROCESS_INFORMATION::hThread is still suspended when I wait for hRemoteThread to return.
This happens on Windows XP SP3, 32bit.
Edit: I have just tried Sysinternals Process Monitor to see what's happening, and I could verify my observations from before. The injected code does not crash or anything; instead, I get to see that if I don't wait for the thread, it doesn't exit before I close the program where the code got injected. I'm wondering whether the call to CloseHandle(hRemoteThread) should be postponed or something ...
Edit+1: it's not CloseHandle(). If I leave that out just for a test, the behavior doesn't change when waiting for the thread to finish.
The first thread to run isn't special.
For example, create a console app which creates a suspended thread and terminates the original thread (by calling ExitThread). This process never terminates (on Windows 7 anyway).
Or make the new thread wait for five seconds then exit. As expected, the process will live for five seconds and exit when the secondary thread terminates.
I don't know what's happening with your example. The easiest way to avoid the race is to make the new thread resume the original thread.
Speculating now, I do wonder if what you're doing isn't likely to cause problems anyway. For example, what happens to all the DllMain calls for the implicitly loaded DLLs? Are they unexpectedly happening on the wrong thread, are they being skipped, or are they postponed until after your code has run and the main thread starts?
Odds are good that the thread with the main (or equivalent) function calls ExitProcess (either explicitly or in its runtime library). ExitProcess, well, exits the entire process, including killing all threads. Since the main thread doesn't know about your injected code, it doesn't wait for it to finish.
I don't know that there's a good way to make the main thread wait for yours to complete...

Should I use for loop async way when I use node.js?

I'm testing node.js with Express.
Theoretically, if I run some very heavy calculation in a for loop without any callbacks,
it should block, and other requests should be ignored.
But in my case, a regular for loop
for (var i = 0; i < 300000; i++) {
    console.log(i);
}
does not block any requests; it just causes high CPU load.
It accepts other requests as well.
So why should I use some other method to make this non-blocking, such as
process.nextTick()?
Or does node.js take care of basic loop constructs (for, while) by wrapping them with process.nextTick() by default?
Node runs in a single thread with an event loop, so as you said, when your for loop is executing, no other processing will happen. The underlying operating system TCP socket may very well accept incoming connections, but if node is busy doing your looping logic then the request itself won't be processed until afterward.
If you absolutely must run some long-running processing in Node, then you should use separate worker processes to do the calculation and leave the main event loop free for request handling.
Node doesn't wrap loops with process.nextTick().
It may be that your program is continuing to accept new connections because console.log is yielding control back to the main event loop, since it's an I/O operation.
