Forcing a message loop to yield - multithreading

I hope this question isn't too broad.
I'm working with a legacy Ada application. This application is built around a very old piece of middleware that handles, among other things, our IPC. For the sake of this question, I can boil the middleware's provisions down to
1: a message loop that processes messages (from other programs or this program)
2: a function to send messages to this program or others
3: a function to read from a database
The program operates mainly on a message loop - simply something like
loop
This_Msg := Message_Loop.Wait_For_Message; -- Blocking wait call
-- Do things based on This_Msg's ID
end loop
however there are also callbacks that can be triggered by external stimuli. These callbacks run in their own threads. Some of these callbacks call the database-reading function, which has always been fine, EXCEPT, as we recently discovered, in a relatively rare condition. When this condition occurs, it turns out it isn't safe to read from the database when the message loop is executing its blocking Wait_For_Message.
It seemed like a simple solution would be to use a protected object to synchronize the Wait_For_Message and database read: if we try to read the database while Wait_For_Message is blocking, the read will block until Wait_For_Message returns, at which point the Wait_For_Message call will be blocked until the database read is complete. The next problem is that I can't guarantee the message loop will receive a message in a timely fashion, meaning that the database read could be blocked for an arbitrary amount of time. It seems like the solution to this is also simple: send a do-nothing message to the loop before blocking, ensuring that the Wait_For_Message call will yield.
What I'm trying to wrap my head around is:
If I send the do-nothing message and THEN block before the database read, I don't think I can guarantee that Wait_For_Message won't have returned, yielded, processed the do-nothing message, and started blocking again before the pre-database read block. I think I conceptually need to start blocking and THEN push a message, but I'm not sure how to do this. I think I could handle it with a second layer of locks, but I can't think of the most efficient way to do so, and don't know if that's even the right solution. This is really my first foray into concurrency in Ada, so I'm hoping for a pointer in the right direction.

Perhaps you should use a task for this; the following would have the task waiting at the SELECT to either process a message or access the DB while another call on an entry during the processing would queue on that entry for the loop to reiterate the select, thus eliminating the problem altogether... unless, somehow, your DB-access calls the message entry; but that shouldn't happen.
Package Example is
Task Message_Processor is
Entry Message( Text : String );
Entry Read_DB( Data : DB_Rec );
End Message_Processor;
End Example;
Package Body Example is
Task Body Message_Processor is
Package Message_Holder is new Ada.Containers.Indefinite_Holders
(Element_Type => String);
Package DB_Rec_Holder is new Ada.Containers.Indefinite_Holders
(Element_Type => DB_Rec);
Current_Message : Message_Holder.Holder;
Current_DB_Rec : DB_Rec_Holder.Holder;
Begin
MESSAGE_LOOP:
loop
select
accept Message (Text : in String) do
Current_Message:= Message_Holder.To_Holder( Text );
end Message;
-- Process the message **outside** the rendevouz.
delay 1.0; -- simulate processing.
Ada.Text_IO.Put_Line( Current_Message.Element );
or
accept Read_DB (Data : in DB_Rec) do
Current_DB_Rec:= DB_Rec_Holder.To_Holder( Data );
end Message;
-- Process the DB-record here, **outside** the rendevouz.
or
Terminate;
end select;
end loop MESSAGE_LOOP;
End Message_Processor;
End Example;

Related

fs.readFile operation inside the callback of parent fs.readFile does not execute asynchronously

I have read that fs.readFile is an asynchronous operation and the reading takes place in a separate thread other than the main thread and hence the main thread execution is not blocked. So I wanted to try out something like below
// reads take almost 12-15ms
fs.readFile('./file.txt', (err, data) => {
console.log('FIRST READ', Date.now() - start)
const now = Date.now()
fs.readFile('./file.txt', (err, data) => {
// logs how much time it took from the beginning
console.log('NESTED READ CALLBACK', Date.now() - start)
})
// blocks untill 20ms more, untill the above read operation is done
// so that the above read operation is done and another callback is queued in the poll phase
while (Date.now() - now < 20) {}
console.log('AFTER BLOCKING', Date.now() - start)
})
I am making another fs.readFile call inside the callback of parent fs.readFile call. What I am expecting is that the log NESTED READ CALLBACK must arrive immediately after AFTER BLOCKING since, the reading must be completed asynchronously in a separate thread
Turns out the log NESTED READ CALLBACK comes 15ms after the AFTER BLOCKING call, indicating that while I was blocking in the while loop, the async read operation somehow never took place. By the way the while loop is there to model some task that takes 20ms to complete
What exactly is happening here? Or am I missing some information here? Btw I am using windows
During your while() loop, no events in Javascript will be processed and none of your Javascript code will run (because you are blocking the event loop from processing by just looping).
The disk operations can make some progress (as they do some work in a system thread), but their results will not be processed until your while loop is done. But, because the fs.readFile() actually consists of three or more operations, fs.open(), fs.read() and fs.close(), it probably won't get very far while the event loop is blocked because it needs events to be processed to advance through the different stages of its work.
Turns out the log NESTED READ CALLBACK comes 15ms after the AFTER BLOCKING call, indicating that while I was blocking in the while loop, the async read operation somehow never took place. By the way the while loop is there to model some task that takes 20ms to complete
What exactly is happening here?
fs.readFile() is not a single monolithic operation. Instead, it consists of fs.open(), fs.read() and fs.close() and the sequencing of those is run in user-land Javascript in the main thread. So, while you are blocking the main thread with your while() loop, the fs.readFile() can't make a lot of progress. Probably what happens is you initiate the second fs.readFile() operation and that kicks off the fs.open() operation. That gets sent off to an OS thread in the libuv thread pool. Then, you block the event loop with your while() loop. While that loop is blocking the event loop, the fs.open() completes and (internal to the libuv event loop) an event is placed into the event queue to call the completion callback for the fs.open() call. But, the event loop is blocked by your loop so that callback can't get called immediately. Thus, any further progress on completing the fs.readFile() operation is blocked until the event loop frees up and can process waiting events again.
When your while() loop finishes, then control returns to the event loop and the completion callback for the fs.open() call gets called while will then initiate the reading of the actual data from the file.
FYI, you can actually inspect the code yourself for fs.readFile() right here in the Github repository. If you follow its flow, you will see that, from its own Javascript, it first calls binding.open() which is a native code operation and then, when that completes and Javascript is able to process the completion event through the event queue, it will then run the fucntion readFileAfterOpen(...) which will call bind.fstat() (another native code operation) and then, when that completes and Javascript is able to process the completion event
through the event queue, it will then call `readFileAfterStat(...) which will allocate a buffer and initiate the read operation.
Here, the code gets momentarily harder to follow as flow skips over to a read_file_context object here where eventually it calls read() and again when that completes and Javascript is able to process the completion event via the event loop, it can advance the process some more, eventually reading all the bytes from the file into a buffer, closing the file and then calling the final callback.
The point of all this detail is to illustrate how fs.readFile() is written itself in Javascript and consists of multiple steps (some of which call code which will use some native code in a different thread), but can only advance from one step to the next when the event loop is able to process new events. So, if you are blocking the event loop with a while loop, then fs.readFile() will get stuck between steps and not be able to advance. It will only be able to advance and eventually finish when the event loop is able to process events.
An analogy
Here's a simple analogy. You ask your brother to do you a favor and go to three stores for you to pick up things for dinner. You give him the list and destination for the first store and then you ask him to call you on your cell phone after he's done at the first store and you'll give him the second destination and the list for that store. He heads off to the first store.
Meanwhile you call your girlfriend on your cell phone and start having a long conversation with her. Your brother finishes at the first store and calls you, but you're ignoring his call because you're still talking to your girlfriend. Your brother gets stuck on his mission of running errands because he needs to talk to you to find out what the next step is.
In this analogy, the cell phone is kind of like the event loop's ability to process the next event. If you are blocking his ability to call you on the phone, then he can't advance to his next step (the event loop is blocked). His visits to the three stores are like the individual steps involved in carrying out the fs.readfile() operation.

Is there a way to make StreamExt::next non blocking (fail fast) if the stream is empty (need to wait for the next element)?

Currently I am doing something like this
use tokio::time::timeout;
while let Ok(option_element) = timeout(Duration::from_nanos(1), stream.next()).await {
...
}
to drain the items already in the rx buffer of the stream. I don't want to wait for the next element that has not been received.
I think the timeout would slow down the while loop.
I am wondering that is there a better way to do this without the use of the timeout?
Possibly like this https://github.com/async-rs/async-std/issues/579 but for the streams in futures/tokio.
The direct answer to your question is to use the FutureExt::now_or_never method from the futures crate as in stream.next().now_or_never().
However it is important to avoid writing a busy loop that waits on several things by calling now_or_never on each thing in a loop. This is bad because it is blocking the thread, and you should prefer a different solution such as tokio::select! to wait for multiple things. For the special case of this where you are constantly checking whether the task should shut down, see this other question instead.
On the other hand, an example where using now_or_never is perfectly fine is when you want to empty a queue for the items available now so you can batch process them in some manner. This is fine because the now_or_never loop will stop spinning as soon as it has emptied the queue.
Beware that if the stream is empty, then now_or_never will succeed because next() immediately returns None in this case.

thread with a forever loop with one inherently asynch operation

I'm trying to understand the semantics of async/await in an infinitely looping worker thread started inside a windows service. I'm a newbie at this so give me some leeway here, I'm trying to understand the concept.
The worker thread will loop forever (until the service is stopped) and it processes an external queue resource (in this case a SQL Server Service Broker queue).
The worker thread uses config data which could be changed while the service is running by receiving commands on the main service thread via some kind of IPC. Ideally the worker thread should process those config changes while waiting for the external queue messages to be received. Reading from service broker is inherently asynchronous, you literally issue a "waitfor receive" TSQL statement with a receive timeout.
But I don't quite understand the flow of control I'd need to use to do that.
Let's say I used a concurrentQueue to pass config change messages from the main thread to the worker thread. Then, if I did something like...
void ProcessBrokerMessages() {
foreach (BrokerMessage m in ReadBrokerQueue()) {
ProcessMessage(m);
}
}
// ... inside the worker thread:
while (!serviceStopped) {
foreach (configChange in configChangeConcurrentQueue) {
processConfigChange(configChange);
}
ProcessBrokerMessages();
}
...then the foreach loop to process config changes and the broker processing function need to "take turns" to run. Specifically, the config-change-processing loop won't run while the potentially-long-running broker receive command is running.
My understanding is that simply turning the ProcessBrokerMessages() into an async method doesn't help me in this case (or I don't understand what will happen). To me, with my lack of understanding, the most intuitive interpretation seems to be that when I hit the async call it would go off and do its thing, and execution would continue with a restart of the outer while loop... but that would mean the loop would also execute the ProcessBrokerMessages() function over and over even though it's already running from the invocation in the previous loop, which I don't want.
As far as I know this is not what would happen, though I only "know" that because I've read something along those lines. I don't really understand it.
Arguably the existing flow of control (ie, without the async call) is OK... if config changes affect ProcessBrokerMessages() function (which they can) then the config can't be changed while the function is running anyway. But that seems like it's a point specific to this particular example. I can imagine a case where config changes are changing something else that the thread does, unrelated to the ProcessBrokerMessages() call.
Can someone improve my understanding here? What's the right way to have
a block of code which loops over multiple statements
where one (or some) but not all of those statements are asynchronous
and the async operation should only ever be executing once at a time
but execution should keep looping through the rest of the statements while the single instance of the async operation runs
and the async method should be called again in the loop if the previous invocation has completed
It seems like I could use a BackgroundWorker to run the receive statement, which flips a flag when its job is done, but it also seems weird to me to create a thread specifically for processing the external resource and then, within that thread, create a BackgroundWorker to actually do that job.
You could use a CancelationToken. Most async functions accept one as a parameter, and they cancel the call (the returned Task actually) if the token is signaled. SqlCommand.ExecuteReaderAsync (which you're likely using to issue the WAITFOR RECEIVE is no different. So:
Have a cancellation token passed to the 'execution' thread.
The settings monitor (the one responding to IPC) also has a reference to the token
When a config change occurs, the monitoring makes the config change and then signals the token
the execution thread aborts any pending WAITFOR (or any pending processing in the message processing loop actually, you should use the cancellation token everywhere). any transaction is aborted and rolled back
restart the execution thread, with new cancellation token. It will use the new config
So in this particular case I decided to go with a simpler shared state solution. This is of course a less sound solution in principle, but since there's not a lot of shared state involved, and since the overall application isn't very complicated, it seemed forgivable.
My implementation here is to use locking, but have writes to the config from the service main thread wrapped up in a Task.Run(). The reader doesn't bother with a Task since the reader is already in its own thread.

Is safe and good design AllocateHWND to respond more than one thread?

It's known that, in cases when one needs comunicate between UI thread and working thread, an hidden window must be created because of thread safety(handle reconstruction).
For exemplify:
Form1 has N dynamicaly created TProgressBar instances with the same name of a background running .
Is always garanteed that WM_REFRESH will only be called inside Task Thread.
Form1 has H : THandle property that allocates the following procedure:
procedure RefreshStat(var Message: TMessage); message WM_REFRESH;
Inside RefreshStat, in cases when there is only 1 background thread I could easily use L and W parameter to map Task Id and position.
I don't know if the title says what I want to know, but let's imagine if we have an application that has multiple background tasks running.
In my case I use TProgressBar to report progress the done.
Does AllocateHwnd garantee that all messages arrives with no race condition the hidden window?
What happens if two or more tasks post the message at the same time?
If this needs to be controled manually, I wonder if there is something else to do besides creating another message loop system in the custom message.
I hope the question is clear enough.
The message queue associated with a thread is a threadsafe queue. Both synchronous and asynchronous messages from multiple other thread are delivered safely no harmful date races. There is no need for any external synchronization when calling the Windows message API functions like SendMessage and PostMessage.
If two threads post or send messages to the same window at the same time, then there is no guarantee as to which message will be processed first. This is what is known as a benign race condition. If you want one message to be processed before the other then you must impose an ordering.

How does the stack work on NodeJs?

I've recently started reading up on NodeJS (and JS) and am a little confused about how call backs are working in NodeJs. Assume we do something like this:
setTimeout(function(){
console.log("World");
},1000);
console.log("Hello");
output: "Hello World"
So far on what I've read JS is single threaded so the event loop is going though one big stack, and I've also been told not to put big calls in the callback functions.
1)
Ok, so my first question is assuming its one stack does the callback function get run by the main event loop thread? if so then if we have a site with that serves up content via callbacks (fetches from db and pushes request), and we have 1000 concurrent, are those 1000 users basically being served synchronously, where the main thread goes in each callback function does the computation and then continues the main event loop? If this is the case, how exactly is this concurrent?
2) How are the callback functions added to the stack, so lets say my code was like the following:
setTimeout(function(){
console.log("Hello");
},1000);
setTimeout(function(){
console.log("World");
},2000);
then does the callback function get added to the stack before the timeout even has occured? if so is there a flag that's set that notifies the main thread the callback function is ready (or is there another mechanism). If this is infact what is happening doesn't it just bloat the stack, especially for large web applications with callback functions, and the larger the stack the longer everything will take to run since the thread has to step through the entire thing.
The event loop is not a stack. It could be better thought of as a queue (first in/first out). The most common usage is for performing I/O operations.
Imagine you want to read a file from disk. You start by saying hey, I want to read this file, and when you're done reading, call this callback. While the actual I/O operation is being performed in a separate process, the Node application is free to keep doing something else. When the file has finished being read, it adds an item to the event loop's queue indicating the completion. The node app might still be busy doing something else, or there may be other items waiting to be dequeued first, but eventually our completion notification will be dequeued by node's event loop. It will match this up with our callback, and then the callback will be called as expected. When the callback has returned, the event loop dequeues the next item and continues. When there is nothing left in node's event loop, the application exits.
That's a rough approximation of how the event loop works, not an exact technical description.
For the specific setTimeout case, think of it like a priority queue. Node won't consider dequeuing the item/job/callback until at least that amount of time has passed.
There are lots of great write-ups on the Node/JavaScript event loop which you probably want to read up on if you're confused.
callback functions are not added to caller stack. There is no recursion here. They are called from event loop. Try to replace console.log in your example and watch result - stack is not growing.

Resources