select(), POSIX message queues, and multithreading in Linux

I am facing a problem with a POSIX message queue:
I use mq_timedreceive() to read from the queue, with the timeout given as an absolute time (abs_timeout).
But this function is affected by the system time (CLOCK_REALTIME). When the system time changes, abs_timeout is no longer correct.
To fix this, I realized that the timeout should be measured against CLOCK_MONOTONIC instead.
But on Linux there seems to be no way to do that (I searched and found that QNX supports this mechanism).
Finally, I combined select() with a non-blocking mq_timedreceive() (NO_WAIT):
+ select() takes a relative timeout, so it is not affected by system time changes.
+ After select() returns, I read the queue with mq_timedreceive(), with the absolute time set to 0 of course.
But my problem is this:
Suppose many threads are waiting on the same message queue (using select()).
When a message is sent to the queue, all waiting threads are woken up and run, which is wrong.
A thread that was not the first to wait may wake up first and take the message.
What I expect is that only the first waiting thread is woken up and gets the message, while the others stay blocked.
Please help.

Looks like you have several questions in one:
1. Waiting on a message queue with a timeout that is not affected by clock adjustments. In Linux the following APIs support clock selection (CLOCK_REALTIME, CLOCK_MONOTONIC, etc.): timerfd_create and timer_create. One way to integrate these with mq_timedreceive is to let timer_create fire a signal that interrupts mq_timedreceive.
2. Integrating waiting on a POSIX message queue with select. The most straightforward portable way would be to use mq_notify to make it deliver a signal when a new message is available, so that the select call returns -1 with errno set to EINTR. (On Linux specifically, a message queue descriptor is a file descriptor and can be passed to select directly, as the question does, but that is not portable.)
3. Fair queuing, so that the first waiter gets the next message. With POSIX message queues it may be possible if the waiting threads are blocked in mq_receive. Otherwise the next available message is delivered to whichever thread calls mq_receive first.
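The select-based wait from the question can be sketched as follows. This is only an illustration in Python, not the asker's code: the standard library has no POSIX message queue binding, so a non-blocking pipe stands in for the queue descriptor (on Linux a message queue descriptor is a file descriptor that select can monitor, see mq_overview(7)). The helper name receive_with_deadline is hypothetical.

```python
import os
import select
import time


def receive_with_deadline(fd, timeout, size=64):
    """Wait up to `timeout` seconds for data on a non-blocking fd.

    Returns the bytes read, or None on timeout. The deadline is tracked
    with time.monotonic(), so changing the wall clock does not move it.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            return None  # select() timed out
        try:
            return os.read(fd, size)
        except BlockingIOError:
            # Another waiter consumed the data first; keep waiting for
            # whatever is left of the original timeout.
            continue


# Stand-in for a message queue descriptor: a non-blocking pipe.
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)

empty_result = receive_with_deadline(read_fd, 0.05)  # nothing queued yet
os.write(write_fd, b"hello")
message = receive_with_deadline(read_fd, 1.0)
```

The retry on BlockingIOError is what keeps a thread that lost the race from treating the wakeup as a message; it does not make the wakeup order fair.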
For message passing between threads of the same process, another approach is to have a pipe act as a queue of message pointers. That is, a producer thread creates a message and writes a pointer to it into the pipe (there is no need to serialize the entire message, because the recipient is in the same process and has access to the same address space). Any consumer thread can wait on the pipe using select and then read the pointers to messages. However, if multiple threads are waiting on the same pipe, they all get woken up, but only one of them will read the message pointer off the pipe. The read end should therefore be non-blocking, so that the threads that lose the race get EAGAIN and go back to select instead of blocking in read.
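A minimal sketch of the pipe-as-queue idea, written in Python for brevity. Python has no raw pointers, so an 8-byte integer token that indexes a shared dictionary stands in for the pointer; the names produce, consume, TOKEN, and STOP are hypothetical.

```python
import os
import select
import struct
import threading

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)  # losers of the race get EAGAIN instead of blocking

messages = {}                    # token -> message object, shared by all threads
received = []                    # (consumer name, message) pairs
received_lock = threading.Lock()
TOKEN = struct.Struct("Q")       # 8-byte token standing in for a pointer
STOP = 0                         # token that tells one consumer to exit


def produce(token, payload):
    messages[token] = payload              # publish the object first
    os.write(write_fd, TOKEN.pack(token))  # then enqueue its token


def consume(name):
    while True:
        select.select([read_fd], [], [])   # every waiting thread may wake up here
        try:
            data = os.read(read_fd, TOKEN.size)
        except BlockingIOError:
            continue                       # another thread took the token
        (token,) = TOKEN.unpack(data)
        if token == STOP:
            return
        with received_lock:
            received.append((name, messages.pop(token)))


consumers = [threading.Thread(target=consume, args=(f"c{i}",)) for i in range(3)]
for thread in consumers:
    thread.start()
for token in range(1, 6):
    produce(token, f"message {token}")
for _ in consumers:
    os.write(write_fd, TOKEN.pack(STOP))   # one stop token per consumer
for thread in consumers:
    thread.join()
```

Each message is delivered exactly once, but which consumer gets it is up to the scheduler, so this does not give first-waiter-first-served ordering.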

Related

pause() system call and receiving a SIGINT signal

I'm a beginner in Linux and process signal handling.
Let's say we have a process A that calls pause(). We know that this puts the calling process to sleep until it receives a signal.
But when we type Ctrl-C, the kernel sends SIGINT to process A, and when A receives the signal, it runs SIGINT's default action, which terminates the process. So my question is:
Does process A resume first, or does the handler get executed first?
For simplicity, let's assume process A has only a single thread, which is blocking in a pause() call, and exactly one signal gets sent to the process.
Does the process A resume first or handler get executed first?
The signal handler gets executed first, then the pause() call returns.
What if there are multiple signals?
Standard signals are not queued, so if you send say two INT signals to the process very quickly in succession, only one of them is delivered.
If there are multiple signals, the order is unspecified.
What about POSIX realtime signals? (SIGRTMIN+0 to SIGRTMAX-0)
They are just like standard named signals, except they are queued (to a limit), and if more than one of them is pending, they get delivered in increasing numerical order.
If there are both standard and realtime signals pending, it is unspecified which ones get delivered first; although in practice, in Linux and many other systems, the standard signals get delivered first, then the realtime ones.
What if there are multiple threads in the process?
The kernel will pick one thread among those that do not have the signal masked (via sigprocmask() or pthread_sigmask()), and use that thread to deliver the signal to the signal handler.
If more than one thread is blocked in a pause() call, one of them gets woken up. If more than one signal is pending, it is unspecified whether the one woken thread handles them all, or whether more than one thread is woken up.
In general, I warmly recommend reading the man 7 signal, man 7 signal-safety, man 2 sigaction, man 2 sigqueue, and man 2 sigwaitinfo man pages. (While the links go to the Linux man pages project, each of the pages includes a Conforming To section naming the related standards, and Linux-specific behaviour is clearly marked.)
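The handler-then-return ordering can be observed with a small experiment. This is a sketch in Python rather than C, so it only approximates the C-level behaviour: CPython runs the Python-level handler from the interpreter after the underlying pause() is interrupted, but still before signal.pause() returns to the caller. It uses SIGALRM from an interval timer instead of Ctrl-C so that it can run unattended, and it must run in the main thread.

```python
import signal

events = []


def on_alarm(signum, frame):
    events.append("handler ran")


signal.signal(signal.SIGALRM, on_alarm)

# Deliver SIGALRM after 0.1 s and keep repeating, so that a signal arriving
# before pause() is entered cannot leave the process sleeping forever.
signal.setitimer(signal.ITIMER_REAL, 0.1, 0.1)
signal.pause()                                # sleeps until a signal is handled
signal.setitimer(signal.ITIMER_REAL, 0)       # cancel the timer
events.append("pause returned")

signal.signal(signal.SIGALRM, signal.SIG_DFL)  # restore the default action
```

With a default (terminating) action instead of a handler, the second append would never run, because the process would be terminated while still inside pause().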

SetEvent ResetEvent WaitForMultipleObjectsEx - Race condition?

I do not fully understand PulseEvent or its race condition. To avoid it, I am trying to use SetEvent instead, and to call ResetEvent every time before WaitForMultipleObjectsEx.
This is my flow:
Thread One - Uses CreateEvent to create an auto-resetting event, then spawns Thread TWO and tells it about the event.
Thread One - Tells thread TWO to run.
Thread TWO calls ResetEvent on the event and then immediately starts WaitForMultipleObjectsEx on the event and on some other handles used for file watching. If WaitForMultipleObjectsEx returns and it is not due to the event, the loop restarts immediately. If WaitForMultipleObjectsEx returns because the event became signaled, the loop is not restarted.
So now imagine this case please:
Thread TWO - loop is running
Thread One - needs to add a path, so it (1) calls SetEvent, then (2) sends a message to thread TWO to add a path, and then (3) sends a message to thread TWO to restart the loop.
The add-path and restart-loop messages will not reach Thread TWO unless I stop the loop in TWO, which is what the SetEvent does. Thread TWO will see that it was stopped due to the event, so it won't restart the loop. It will then get the message to add a path, add the path, and then restart the loop.
Thread One - needs to stop the thread, so it (1) calls SetEvent and then (2) waits for a message from thread TWO; when it gets that message it will terminate the thread.
Will this avoid the race condition?
Thank you
Suppose the loop needs to be interrupted twice in succession. You're imagining a sequence of events something like this, on thread ONE and thread TWO:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread TWO reads the message "restart the wait loop".
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread ONE sends message related to the second interruption.
Thread TWO stops the loop, receives the message about the second interruption.
But since you don't have any control over the timing between the two threads, it might instead happen like this:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread TWO reads the message "restart the wait loop".
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE sends a message about the second interruption, but TWO isn't listening!
Even if the message passing mechanism is synchronous, so that ONE won't continue until TWO has read the message, it could happen this way:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread TWO reads the message "restart the wait loop", but is then swapped out.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE sends a message about the second interruption, but TWO isn't listening!
(Obviously, a similar thing can happen if you use PulseEvent.)
One quick solution would be to use a second event for TWO to signal ONE at the appropriate point, i.e., after resetting the main event but before waiting on it, but that seems somewhat inelegant and also doesn't generalize very well. If you can guarantee that there will never be two interruptions in close-enough succession, you might simply choose to ignore the race condition, but note that it is difficult to reason about this because there is no theoretical limit to how long it might take for thread TWO to resume running after being swapped out.
The various alternatives depend on how the messages are being passed between the threads and any other constraints. [If you can provide more information about your current implementation I'll update my answer accordingly.]
This is an overview of some of the more obvious options.
If the message-passing mechanism is synchronous (if thread ONE waits for thread TWO to receive the message before proceeding) then using a single auto-reset event should just work. Thread ONE won't set the event until after thread TWO has received the restart-loop message. If the event is already set when thread TWO starts waiting, that just means that there were two interruptions in immediate succession; TWO will never stall waiting for a message that isn't coming. [This potential stall is the only reason I can think of why you might not want to use an auto-reset event. If you have another concern, please edit your question to provide more details.]
If it is OK for sending a message to be non-blocking, and you aren't already locked in to a particular solution, any of these options would probably be sensible:
User mode APCs (the QueueUserAPC function) provide a message-passing mechanism that automatically interrupts alertable waits.
You could implement a simple queue (protected by a critical section) which uses an event to indicate whether there is a message pending or not. In this case you can safely use a manual-reset event provided that you only manipulate it when you hold the same critical section that protects the queue.
You could use an auto-reset event in combination with any sort of thread-safe queue, provided only that the queue allows you to test for emptiness without blocking. The idea here is that thread ONE would always insert the message into the queue before setting the event, and if thread TWO sees that the event is set but it turns out that the queue is empty, the event is ignored. If efficiency is a concern, you might even be able to find a suitable lock-free queue implementation. (I don't recommend attempting that yourself.)
(All of those mechanisms could also be made synchronous by using a second event object.)
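The second option (a queue protected by a critical section, plus an event that says whether a message is pending) can be sketched as follows. This is a Python analogue, not Win32 code: threading.Lock plays the role of the critical section and threading.Event, which stays set until cleared, plays the role of the manual-reset event. The Mailbox class and the message strings are hypothetical.

```python
import collections
import threading


class Mailbox:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = collections.deque()
        self.pending = threading.Event()  # set while at least one message is queued

    def post(self, message):
        # The event is only touched while holding the lock that guards the queue.
        with self._lock:
            self._items.append(message)
            self.pending.set()

    def take(self):
        """Return the next message, or None if the queue is empty."""
        with self._lock:
            if not self._items:
                return None
            message = self._items.popleft()
            if not self._items:
                self.pending.clear()
            return message


def worker(mailbox, log):
    while True:
        mailbox.pending.wait()   # stands in for the wait in thread TWO
        message = mailbox.take()
        if message is None:
            continue
        if message == "stop":
            return
        log.append(message)


mailbox, log = Mailbox(), []
thread_two = threading.Thread(target=worker, args=(mailbox, log))
thread_two.start()
for message in ("add path A", "restart loop", "stop"):
    mailbox.post(message)        # thread ONE: a single operation per message
thread_two.join()
```

Because the message is enqueued and the event set under the same lock, there is no window in which the event is signaled but the message is missing, or the message is queued but the event has been reset.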
I wouldn't recommend the following approaches, but if you happen to already be using one of these for messaging this is how you can make it work:
If you're using named pipes for messaging, you could use asynchronous I/O in thread TWO. Thread TWO would use an auto-reset event internally: you specify the event handle when you issue the I/O call, and Windows sets it when I/O arrives. From the point of view of thread ONE, there's only a single operation. From the point of view of thread TWO, if the event is set, a message is definitely available. (I believe this is somewhat similar to your original approach; you just have to issue the I/O call in advance rather than afterwards.)
If you're using a window queue for messaging, the MsgWaitForMultipleObjectsEx() function allows you to wait for a window message and other events simultaneously.
PS:
The other problem with PulseEvent, the one mentioned in the documentation, is that this can happen:
Thread TWO starts waiting.
Thread TWO is preempted by Windows and all user code on the thread stops running.
Thread ONE pulses the event.
Thread TWO is restarted by Windows, and the wait is resumed.
Thread ONE sends a message, but TWO isn't listening.
(Personally I'm a bit disappointed that the kernel doesn't deal with this situation; I would have thought that it would be possible for it to set a flag saying that the wait shouldn't be resumed. But I can only assume that there is a good reason why this is impractical.)
The Auto-Reset Events
Would you please try to change the flow so there is just SetEvent and WaitForMultipleObjectsEx with auto-reset events? You may create more events if you need. For example, each thread will have its own pair of events: one to get notifications and another to report about its state changes - you define the scheme that best suits your needs.
Since there will be auto-reset events, there would be neither ResetEvent nor PulseEvent.
If you are able to change the logic of the algorithm flow this way, the program will become clear, reliable, and straightforward.
I advise this because this is how our applications have worked since the days of Windows NT 3.51: we manage to do everything we need with just SetEvent and WaitForMultipleObjects (without the Ex suffix).
As for the PulseEvent, as you know, it is very unreliable, even though it exists from the very first version of Windows NT - 3.1 - maybe it was reliable then, but not now.
To create the auto-reset events, use the bManualReset argument of the CreateEvent API function (if this parameter is TRUE, the function creates a manual-reset event object, which requires the use of the ResetEvent function to set the event state to non-signaled -- this is not what you need). If this parameter is FALSE, the function creates an auto-reset event object. The system will automatically reset the event state to non-signaled after a single waiting thread has been released, i.e., after WaitForMultipleObjects or WaitForSingleObject or other wait functions that explicitly wait for this event to become signaled.
These auto-reset events are very reliable and easy to use.
Let me make a few additional notes on the PulseEvent. Even Microsoft has admitted that PulseEvent is unreliable and should not be used -- see https://msdn.microsoft.com/en-us/library/windows/desktop/ms684914(v=vs.85).aspx -- because only those threads will be notified that are in the "wait" state when PulseEvent is called. If they are in any other state, they will not be notified, and you may never know for sure what the thread state is, and, even if you are responsible for the program flow, the state can be changed by the operating system contrary to your program logic. A thread waiting on a synchronization object can be momentarily removed from the wait state by a kernel-mode Asynchronous Procedure Call (APC) and returned to the wait state after the APC is complete. If the call to PulseEvent occurs during the time when the thread has been removed from the wait state, the thread will not be released because PulseEvent releases only those threads that are waiting at the moment it is called.
You can find out more about the kernel-mode APC at the following links:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms681951(v=vs.85).aspx
http://www.drdobbs.com/inside-nts-asynchronous-procedure-call/184416590
http://www.osronline.com/article.cfm?id=75
The Manual-Reset Events
The Manual-Reset events are not that bad. :-) You can reliably use them when you need to notify multiple threads of a global state change that occurs only once, for example, application exit. Auto-reset events can only be used to notify one thread (if several threads are waiting simultaneously for an auto-reset event and you set the event, one arbitrary thread will exit its wait and reset the event, but the behavior of the remaining threads that also wait for the event is undefined). From the Microsoft documentation, we may assume that one and only one thread will exit while the others definitely will not, but this is not very explicitly articulated in the documentation. Anyway, we must take the following quote into consideration: "Do not assume a first-in, first-out (FIFO) order. External events such as kernel-mode APCs can change the wait order" Source - https://msdn.microsoft.com/en-us/library/windows/desktop/ms682655(v=vs.85).aspx
So, when you need to notify all the threads quickly, just set the manual-reset event to the signaled state, rather than signaling a separate auto-reset event for each thread. Once you have signaled the manual-reset event, do not call ResetEvent afterwards. The drawback of this solution is that each thread needs an additional event handle in the array passed to WaitForMultipleObjects. The array size is limited to MAXIMUM_WAIT_OBJECTS, which is 64, although we never came close to this limit in practice.
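The "set once, never reset" broadcast can be illustrated with a small Python analogue. It is not Win32 code: threading.Event behaves like a manual-reset event in that it releases every waiter and stays set. The names shutdown and worker are hypothetical.

```python
import threading

shutdown = threading.Event()   # analogue of a manual-reset event
exited = []
exited_lock = threading.Lock()


def worker(name):
    shutdown.wait()            # every waiting thread is released by one set()
    with exited_lock:
        exited.append(name)


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(4)]
for thread in threads:
    thread.start()

shutdown.set()                 # signal once and never clear it afterwards
for thread in threads:
    thread.join()
```

A thread that only starts waiting after set() has been called still returns immediately, which is the property PulseEvent lacks.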
You can get more ideas about auto-reset events and manual reset events from https://www.codeproject.com/Articles/39040/Auto-and-Manual-Reset-Events-Revisited

When a goroutine blocks on I/O how does the scheduler identify that it has stopped blocking?

From what I've read here, the Go scheduler will automatically determine if a goroutine is blocking on I/O, and will automatically switch to processing other goroutines on a thread that isn't blocked.
What I'm wondering is how the scheduler then figures out that that goroutine has stopped blocking on I/O.
Does it just do some kind of polling every so often to check if it's still blocking? Is there some kind of background thread running that checks the status of all goroutines?
For example, if you were to do an HTTP GET request inside a goroutine that took 5s to get a response, it would block while waiting for the response, and the scheduler would switch to processing another goroutine. Now given that, when the server returns a response, how does the scheduler understand that the response has arrived, and it's time to go back to the goroutine that made the GET so that it can process the result of the GET?
All I/O must be done through syscalls, and the way syscalls are implemented in Go, they are always called through code that is controlled by the runtime. This means that when you call a syscall, instead of just calling it directly (thus giving up control of the thread to the kernel), the runtime is notified of the syscall you want to make, and it does it on the goroutine's behalf. This allows it to, for example, do a non-blocking syscall instead of a blocking one (essentially telling the kernel, "please do this thing, but instead of blocking until it's done, return immediately, and let me know later once the result is ready"). This allows it to continue doing other work in the meantime.
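The "return immediately and let me know later" idea can be illustrated outside Go. The sketch below is a Python analogy and not how the Go runtime is implemented: a non-blocking socket is registered with the operating system's readiness mechanism through the selectors module, and a callback resumes the waiting work once the kernel reports that data has arrived. The names on_response and results are hypothetical, and a local socket pair stands in for the HTTP connection.

```python
import selectors
import socket

selector = selectors.DefaultSelector()   # epoll, kqueue, etc., depending on the OS
server_side, client_side = socket.socketpair()
client_side.setblocking(False)

results = []


def on_response(sock):
    # Runs only after the kernel reports the socket as readable.
    results.append(sock.recv(1024))
    selector.unregister(sock)


# "Park" the waiting task: remember what to run once the response arrives.
selector.register(client_side, selectors.EVENT_READ, on_response)

# No response yet: a zero-timeout check returns at once, so other work can run.
nothing_ready = selector.select(timeout=0)

server_side.sendall(b"HTTP/1.1 200 OK\r\n\r\n")   # the response arrives

while not results:
    for key, _mask in selector.select(timeout=1):
        key.data(key.fileobj)             # resume the task that was waiting

server_side.close()
client_side.close()
```

No periodic polling of individual tasks is involved: the program asks the kernel once which registered descriptors are ready and wakes only the corresponding waiters.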

How to know if a thread is alive and then kill it?

I've been searching and reading about killing threads (C POSIX threads), and everybody says it is not a good idea because a thread should do its work and then return. But my problem is the following:
I'm receiving messages on my local network (using the recvfrom function), but this function blocks my program. That is, if I don't receive any message, the call stays blocked (forever) until it receives something.
To avoid this, I thought of using threads: while my main thread is "counting", my second thread tries to receive messages. If after a given time (e.g. 1 second) my second thread is still waiting for a message (blocked in recvfrom), I need to "kill it" and then create another thread to start again (and try to receive messages from another IP). This means my thread will not always finish its work, and I can't wait forever.
So far I can do that (create a lot of threads and receive the messages from the IP I'm interested in), but I don't know how to kill the threads that never received anything.
Does anyone know how to kill the threads? Or are they killed automatically when my main program returns?
Thank you, and sorry for my poor English.
This looks related to one of my questions: How to avoid thread waiting in the following or similar scenarios (want to make a thread wait iff it's really really necessary)?
That one is .NET, though (the code sample is in C#).
Essentially, I spawned a new thread that performs some I/O operations, and the I/O is a blocking call.
For some reason it sometimes just waits forever, so I have a timeout after which I abort the thread with the 'abort' method.
Rearchitect so the thread can receive messages from any IP. That way, you can try to receive messages from another IP without having to disturb the thread.
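One way to read this suggestion is a single receiving socket that accepts datagrams from any sender and filters by source address, with a receive timeout so that nothing ever has to be killed. The sketch below is in Python for brevity (the C equivalents would be SO_RCVTIMEO or select before recvfrom); the function receive_from_any and the loopback addresses are hypothetical.

```python
import socket
import time


def receive_from_any(sock, wanted_hosts, timeout):
    """Return (host, data) for the first datagram from a wanted host,
    or None if `timeout` seconds pass without one."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        sock.settimeout(remaining)       # recvfrom gives up instead of blocking forever
        try:
            data, (host, _port) = sock.recvfrom(2048)
        except socket.timeout:
            return None
        if host in wanted_hosts:
            return host, data
        # Datagram from a host we are not interested in: keep waiting.


receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

timed_out = receive_from_any(receiver, {"127.0.0.1"}, 0.05)  # nobody has sent yet
sender.sendto(b"ping", receiver.getsockname())
reply = receive_from_any(receiver, {"127.0.0.1"}, 1.0)

sender.close()
receiver.close()
```

The receiving thread always returns within the timeout, so the main thread can join it normally and decide which hosts to listen for next.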

What's the most CPU-efficient way to "waste time" in a thread?

I have a number of threads (100's) that each execute for a few seconds at a time. When they are executing, they spend a significant amount of that time waiting for a response from another system (a serial device). I am mindful that having 100 threads executing at once could be a resource hog so I actually limit the number of threads that can start at any one time.
It occurs to me though that there must be good and bad ways of waiting for an external event inside a thread. Is this approach CPU-intensive?:
    send command ;
    repeat
    until response arrived ;
    process response ;
and does this approach make it more efficient?:
    send command ;
    repeat
      Sleep (20) ;
    until response arrived ;
    process response ;
* ADDITIONAL INFO *
The environment is x86 Windows XP. The thread code is a long and involved series of interactions with a serial device but in general, it consists of writing characters to a COM port (using the AsyncFree serial library) and waiting for characters to be returned by camping on the incoming characters buffer and processing them when they arrive. I imagine the serial library makes device reads and writes. The time in the thread can be as long as a minute , or as short as a couple of seconds, but most of that time is spent waiting for characters to leave the port, or waiting for the response characters (baud rate is slow), hence my question about the best way for the thread to behave while it is waiting. Currently I am calling Sleep in a loop waiting for CharactersInBuffer to become non-zero, processing each character when it arrives, and exiting the thread when I have the complete response. So the code looks more like (ignoring handling of timeouts, etc):
    send command ;
    Packet = '' ;
    repeat
      repeat
        Sleep (20) ;
      until response character arrived ;
      build Packet
    until complete packet arrived
    process response ;
If the thread is truly waiting in something like WaitForSingleObject, which uses no processor time while it waits and simply returns when the object is signaled or the timeout expires, there is no reason to add a delay to the thread with Sleep.
Your user isn't waiting on the thread to be responsive, it's not using processor time, and other threads won't be blocked, so there would be no reason to put the thread to sleep.
As David Heffernan indicated in his comment, if it's not using 100% of your CPU now, then there's no problem.
You might use sleep() if you were single threaded and you had to occasionally respond to the user in between waiting on the serial port to respond.
Also, having a thread sleep would not make it more efficient. It would simply yield processor cycles to other threads.
Take a look at sleep(0) as a CPU efficient way of "wasting time" in a thread.
The most efficient way to prevent a thread from using CPU time is to put it in a "wait mode."
I don't use delphi at all, but it seems that the fundamentals for that are there. See "Chapter 11. Synchronizers and Events" and more specifically "Event simulation using semaphores".
If you want to wait without using CPU, then use WaitForEvent:
The signal state of the event is examined. If it indicates that the event is signalled, then the internal semaphore is signalled, and the count of threads blocked on the semaphore is decremented. The count of blocked threads is then incremented, and a wait is performed on the internal semaphore.
If this is I/O related, then things work a little differently. If it's a socket, then it might already be blocking. If it's asynchronous I/O, then you can use a semaphore and WaitForEvent and so on.
In .NET there are Monitor.Wait, Monitor.Pulse, ManualResetEvent, CountdownEvent, etc., but I don't know what the equivalents are in Delphi.
I cannot speak for AsyncFree's capabilities, but in general COM port programming in Windows supports Overlapped I/O, so you can efficiently wait for a notification when data arrives by using the WaitCommEvent() function with one of the WaitFor...() family of functions, such as WaitForSingleObject(). The thread can be put into a sleep state until the notification is issued, at which time it "wakes up" to read from the port until there is nothing further to read, then it can go back to sleep until the next notification.
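The difference between polling with Sleep and a true blocking wait can be sketched as follows. This is a Python analogue rather than Delphi or Win32 code: a queue.Queue stands in for the serial receive buffer, and its blocking get() with a timeout plays the role of WaitForSingleObject. The names incoming, read_packet, and fake_device are hypothetical, as is the carriage-return packet terminator.

```python
import queue
import threading

incoming = queue.Queue()   # hypothetical stand-in for the serial receive buffer


def read_packet(terminator=b"\r", timeout=1.0):
    """Assemble one packet, blocking (not spinning) while waiting for each byte.

    Raises queue.Empty if no byte arrives within `timeout` seconds.
    """
    packet = b""
    while not packet.endswith(terminator):
        # The thread sleeps inside get() until a byte is queued or the timeout
        # expires; there is no Sleep(20) loop and no added wake-up latency.
        packet += incoming.get(timeout=timeout)
    return packet


def fake_device():
    # Simulates the serial device answering one character at a time.
    for byte in (b"O", b"K", b"\r"):
        incoming.put(byte)


device = threading.Thread(target=fake_device)
device.start()
packet = read_packet()
device.join()
```

Compared with the Sleep(20) loop, the waiting thread is woken as soon as a character arrives instead of up to 20 ms later, and it is not scheduled at all while nothing is happening.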
