thread safety in a signal-slot system (C++11)

I have some problems designing a Signal/Slot system in C++11.
My main design goals are: simple but still offering some features and thread safe.
My personal opinion on a Signal/Slot system is that emitting should be as fast as possible. Because of that I try to keep the slot list inside the signal tidy. Many other Signal/Slot systems leave disconnected slots empty. That means more slots to iterate and checking slot validity during signal emission.
Here is the concrete problem:
The Signal class has one function for emitting and one for disconnecting a slot:
template<typename... Args>
void Signal<void(Args...)>::operator()(Args&&... args)
{
std::lock_guard<std::mutex> mutex_lock(_mutex);
for (auto const& slot : _slots) {
if (slot.connection_data->enabled) {
slot.callback(std::forward<Args>(args)...);
}
}
}
template<typename... Args>
void Signal<void(Args...)>::destroy_connection(std::shared_ptr<Connection::Data> connection_data)
{
std::lock_guard<std::mutex> mutex_lock(_mutex);
connection_data->reset();
for (auto it = _slots.begin(); it != _slots.end(); ++it) {
if (it->connection_data == connection_data) {
*it = _slots.back(); _slots.pop_back();
break;
}
}
}
This works fine until one tries to make a connection that disconnects itself when the signal is emitted:
Connection con;
Signal<void()> sig;
con = sig.connect([&]() { con.disconnect(); });
sig();
I have two problems here:
The emit function must probably be redesigned because slots can potentially be removed when iterated.
There are two mutex locks inside the same thread.
Is it possible to make this work (maybe with recursive mutex?), or should I redesign the system to not interfere with slots list and just leave empty slots (as many other similar projects do) when disconnecting the signal?
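One way to address both problems without a recursive mutex is to emit from a snapshot of the slot list. The sketch below is a minimal illustration under stated assumptions, not the asker's actual Signal class: SimpleSignal, Slot, and the integer Id are hypothetical names, and the slot list is replaced copy-on-write so that an emission already in progress keeps iterating its own snapshot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical minimal signal (not the asker's class): the slot list is
// replaced copy-on-write, and emission iterates a snapshot taken under the lock.
class SimpleSignal {
public:
    using Callback = std::function<void()>;
    using Id = std::size_t;

    Id connect(Callback cb) {
        assert(cb && "callback must be callable");
        std::lock_guard<std::mutex> lock(mutex_);
        // Build a new list so that running emissions keep their old snapshot.
        auto next = std::make_shared<std::vector<Slot>>(*slots_);
        next->push_back(Slot{next_id_, std::move(cb)});
        slots_ = std::move(next);
        return next_id_++;
    }

    void disconnect(Id id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto next = std::make_shared<std::vector<Slot>>(*slots_);
        next->erase(std::remove_if(next->begin(), next->end(),
                                   [id](const Slot& s) { return s.id == id; }),
                    next->end());
        slots_ = std::move(next);
    }

    void operator()() const {
        std::shared_ptr<const std::vector<Slot>> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = slots_;  // cheap: copies one shared_ptr
        }
        // The mutex is released here, so a slot may call disconnect() itself.
        for (const Slot& slot : *snapshot) slot.callback();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_->size();
    }

private:
    struct Slot {
        Id id;
        Callback callback;
    };

    mutable std::mutex mutex_;
    std::shared_ptr<std::vector<Slot>> slots_ =
        std::make_shared<std::vector<Slot>>();
    Id next_id_ = 0;
};
```

The trade-off is that connect and disconnect copy the list, while emission stays a plain loop over a tidy vector. A slot disconnected by another slot during an emission may still be called once in that same emission, because it is still part of the snapshot.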

Related

Access the main OMNET++ simulation thread from a working/child thread

I wrote a simple multi-threaded application in OMNET++ that does not call any OMNET++ API in the working thread and is working as expected. I know that OMNET++ does not support multi-thread applications by design, but I was wondering if there is any mechanism that I can use to make a bridge between my worker thread and my code in the main simulation thread.
More specifically, I am saving some data in a vector in the working thread and I want to signal the code in the simulation thread to consume it (producer/consumer scenario). Is there any way to achieve this?
Do I need to design my own event scheduler?

METHOD 1
The simplest way to achieve your goal is to use a selfmessage in simulation thread and a small modification of worker thread. The worker thread should modify a common variable (visible by both threads). And the selfmessage should periodically check the state of this variable.
The sample code of this idea:
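// common variable (std::atomic avoids a data race between the two threads; needs #include <atomic>)
std::atomic<bool> vectorReady{false};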
// worker thread
if (someCondition) {
vectorReady = true;
}
// simulation thread
void someclass::handleMessage(cMessage * msg) {
if (msg->isSelfMessage()) {
if (vectorReady) {
vectorReady = false;
// reads vector data
}
scheduleAt(simTime() + somePeriod, msg);
}
}
Where you declare the common variable depends on how you create and start the worker thread.
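The flag-plus-polling idea of METHOD 1 can be sketched in plain C++ without any OMNeT++ calls. This is only an illustration under assumptions: Mailbox, publish, and poll are hypothetical names, the payload is a vector of int, and a mutex protects the vector itself because the flag alone does not make the vector safe to share.

```cpp
#include <atomic>
#include <mutex>
#include <vector>

// Hypothetical mailbox shared by the worker thread and the simulation thread.
class Mailbox {
public:
    // Called from the worker thread: append data, then raise the flag.
    void publish(const std::vector<int>& batch) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            data_.insert(data_.end(), batch.begin(), batch.end());
        }
        ready_.store(true, std::memory_order_release);
    }

    // Called from the simulation thread, e.g. when the self-message fires.
    // Returns an empty vector when nothing new has arrived.
    std::vector<int> poll() {
        std::vector<int> out;
        if (ready_.exchange(false, std::memory_order_acquire)) {
            std::lock_guard<std::mutex> lock(mutex_);
            out.swap(data_);  // take everything published so far
        }
        return out;
    }

private:
    std::atomic<bool> ready_{false};
    std::mutex mutex_;
    std::vector<int> data_;
};
```

In the simulation thread, the body of the vectorReady check would then become a call to poll(), processing whatever it returns.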
METHOD 2
The other way is to create your own scheduler and add a check just before every event. By default, OMNeT++ uses the cSequentialScheduler scheduler. Its takeNextEvent() method is called to obtain the next event. You can create a derived class and override this method, for example:
// cThreadScheduler.h
#include <omnetpp.h>
using namespace omnetpp;
class cThreadScheduler : public cSequentialScheduler {
public:
virtual cEvent *takeNextEvent() override;
};
// cThreadScheduler.cc
#include "cThreadScheduler.h"
Register_Class(cThreadScheduler);
cEvent* cThreadScheduler::takeNextEvent() {
if (vectorReady) {
vectorReady = false;
// reads vector data
}
return cSequentialScheduler::takeNextEvent();
}
In omnetpp.ini add a line:
scheduler-class = "cThreadScheduler"

CoGetInterfaceAndReleaseStream makes my thread hang

UINT __stdcall CExternal::WorkThread( void * pParam)
{
HRESULT hr;
CTaskBase* pTask;
CComPtr<IHTMLDocument3> spDoc3;
CExternal* pThis = reinterpret_cast<CExternal*>(pParam);
if (pThis == NULL)
return 0;
// Init the com
::CoInitializeEx(0,COINIT_APARTMENTTHREADED);
hr = ::CoGetInterfaceAndReleaseStream(
pThis->m_pStream_,
IID_IHTMLDocument3,
(void**)&spDoc3);
if(FAILED(hr))
return 0;
while (pThis->m_bShutdown_ == 0)
{
if(pThis->m_TaskList_.size())
{
pTask = pThis->m_TaskList_.front();
pThis->m_TaskList_.pop_front();
if(pTask)
{
pTask->doTask(spDoc3); //do my custom task
delete pTask;
}
}
else
{
Sleep(10);
}
}
OutputDebugString(L"start CoUninitialize\n");
::CoUninitialize(); //release com
OutputDebugString(L"end CoUninitialize\n");
return 0;
}
The above is the code that makes my thread hang; the only output is "start CoUninitialize".
m_hWorker_ = (HANDLE)_beginthreadex(NULL, 0, WorkThread, this, 0, 0);
This code starts my thread, but the thread can't exit safely, so it waits. What is the problem with this code?

The problem is not in this code, although it violates core COM requirements: you should release interface pointers when you no longer use them by calling IUnknown::Release(), and an apartment-threaded thread must pump a message loop. The message loop is especially important; you'll get deadlock when the owner thread of a single-threaded object (like a browser) is not pumping.
CoUninitialize() is forced to clean up the interface pointer wrapped by spDoc3 since you didn't do this yourself. It is clear from the code that the owner of the interface pointer actually runs on another thread, something to generally keep in mind since that pretty much defeats the point of starting your own worker thread. Creating your own STA thread doesn't fix this; it is still the wrong thread.
So the proxy needs to context switch to the apartment that owns the browser object, with the hard requirement that this apartment pumps a message loop so that the call can be dispatched on the right thread in order to safely call the Release() function. The odds are very high that this thread isn't pumping messages anymore when your program is shutting down. You should be able to see this in the debugger: locate the owner thread in the Debug + Windows + Threads window and see what it is doing.
Deadlock is the common outcome. The only good way to fix it is to shut down threads in the right order: this one has to shut down before the thread that owns the browser object. Shutting down a multi-threaded program cleanly can be quite difficult when threads have an interdependency like this, which was the inspiration behind the C++11 std::quick_exit() addition.

Does Arduino support threading?

I have a couple of tasks to do with Arduino, but one of them takes a very long time, so I was thinking of using threads to run them simultaneously.
I have an Arduino Mega
[Update]
Finally after four years I can install FreeRTOS in my arduino mega. Here is a link

In short: NO.
But you may give it a shot at:
http://www.kwartzlab.ca/2010/09/arduino-multi-threading-librar/
(Archived version: https://web.archive.org/web/20160505034337/http://www.kwartzlab.ca/2010/09/arduino-multi-threading-librar)
Github: https://github.com/jlamothe/mthread

Not yet, but I always use this Library with big projects:
https://github.com/ivanseidel/ArduinoThread
I place the callback within a timer interrupt, and voilà! You have pseudo-threads running on the Arduino...

Just to make this thread more complete: there are also protothreads which have very small memory footprint (couple bytes if I remember right) and preserve variables local to thread; very handy and time saving (far less finite state machines -> more readable code).
Examples and code:
arduino-class / ProtoThreads wiki
Just to let you know what results you may expect: serial communication @ 153K6 baud rate with threads for status diode blinking, time keeping, requested function evaluation, IO handling and logic, all on an ATmega328.

Not real threading but TimedActions are a good alternative for many uses
http://playground.arduino.cc/Code/TimedAction#Example
Of course, if one task blocks, the others will too, while threading can let one task freeze and the others will continue...

No, you can't, but you can use a timer interrupt.
Ref : https://www.teachmemicro.com/arduino-timer-interrupt-tutorial/

The previous answer is correct, however, the arduino generally runs pretty quick, so if you properly time your code, it can accomplish tasks more or less simultaneously.
The best practice is to make your own functions and avoid putting too much real code in the default void loop
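As an illustration of "properly timing your code", here is a minimal sketch of non-blocking periodic tasks. PeriodicTask and due are hypothetical names, not part of any Arduino library; the sketch assumes you pass in a millisecond counter such as the value returned by the Arduino millis() function.

```cpp
#include <cstdint>

// Hypothetical helper: tracks when a periodic task last ran and reports
// whether it is due again, without ever blocking the caller.
struct PeriodicTask {
    std::uint32_t period_ms;    // how often the task should run
    std::uint32_t last_run_ms;  // timestamp of the previous run

    // Returns true (and re-arms the task) once at least period_ms has elapsed.
    // Unsigned subtraction keeps the comparison correct when the millisecond
    // counter wraps around.
    bool due(std::uint32_t now_ms) {
        if (static_cast<std::uint32_t>(now_ms - last_run_ms) >= period_ms) {
            last_run_ms = now_ms;
            return true;
        }
        return false;
    }
};
```

Inside loop() you would check each task in turn, for example if (blink.due(millis())) followed by the LED toggle, so that no single task holds up the others with delay().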

You can use arduinos
It is designed for Arduino environment. Features:
Only static allocation (no malloc/new)
Support context switching when delaying execution
Implements semaphores
Lightweight, both cpu and memory
I use it when I need to receive new commands from bluetooth/network/serial while executing the old ones and the old ones have delay in them.
One thread is the server thread that does the following loop:
while (1) {
while ((n = Serial.read()) != -1) {
// do something with n, like filling a buffer
if (command_was_received) {
arduinos_create(command_func, arg);
}
}
arduinos_yield(); // context switch to other threads
}
The other is the command thread that executes the command:
int command_func(void* arg) {
// move some servos
arduinos_delay(1000); // wait for them to move
// move some more servos
return 0; // the function is declared to return int
}

Arduino does not support multithread programming.
However there have been some workarounds, for example the one in this project (you can install it also from the Arduino IDE).
It seems you have to define the schedule time yourself while in a real multithread environment it is the OS that decides when to execute tasks.
Alternatively you can use protothreads

The straight answer is No No No!. There are some alternatives but you can't expect a perfect multi threading functionality from an arduino mega. You can use arduino due or lenado for multithreading like below-
void loop1(){
}
void loop2(){
}
void loop3(){
}
Normally, I handle those types of cases in backend. You can run the main code in a server while using Arduino to just collect inputs and show outputs. In such cases I would prefer nodemcu which has built in wifi.

Thread NO!
Concurrent YES!
You can run different tasks concurrently with FreeRTOS library.
https://www.arduino.cc/reference/en/libraries/freertos/
void TaskBlink( void *pvParameters );
void TaskAnalogRead( void *pvParameters );
// Now set up two tasks to run independently.
xTaskCreate(
TaskBlink
, (const portCHAR *)"Blink" // A name just for humans
, 128 // Stack size
, NULL
, 2 // priority
, NULL );
xTaskCreate(
TaskAnalogRead
, (const portCHAR *) "AnalogRead"
, 128 // This stack size can be checked & adjusted by reading Highwater
, NULL
, 1 // priority
, NULL );
void TaskBlink(void *pvParameters) // This is a task.
{
(void) pvParameters;
// initialize digital pin 13 as an output.
pinMode(13, OUTPUT);
for (;;) // A Task shall never return or exit.
{
digitalWrite(13, HIGH); // turn the LED on (HIGH is the voltage level)
vTaskDelay( 1000 / portTICK_PERIOD_MS ); // wait for one second
digitalWrite(13, LOW); // turn the LED off by making the voltage LOW
vTaskDelay( 1000 / portTICK_PERIOD_MS ); // wait for one second
}
}
void TaskAnalogRead(void *pvParameters) // This is a task.
{
(void) pvParameters;
// initialize serial communication at 9600 bits per second:
Serial.begin(9600);
for (;;)
{
// read the input on analog pin 0:
int sensorValue = analogRead(A0);
// print out the value you read:
Serial.println(sensorValue);
vTaskDelay(1); // one tick delay (15ms) in between reads for stability
}
}
Just take care!
When different tasks try to access shared resources at the same time, such as variables, the I2C bus, or an SD card module, use semaphores and mutexes:
https://www.geeksforgeeks.org/mutex-vs-semaphore/

Arduino does not support threading. However, you can do the next best thing and structure your code around interleaved state machines.
While there are lots of ways to implement your tasks as state machines, I recommend this library (https://github.com/Elidio/StateMachine). This library abstracts most of the process.
You can create a state machine as a class like this:
#include "StateMachine.h"
class STATEMACHINE(Blink) {
private:
int port;
int waitTime;
CREATE_STATE(low);
CREATE_STATE(high);
void low() {
digitalWrite(port, LOW);
*this << &STATE(high)<< waitTime;
}
void high() {
digitalWrite(port, HIGH);
*this << &STATE(low)<< waitTime;
}
public:
Blink(int port = 0, int waitTime = 0) :
port(port),
waitTime(waitTime),
INIT_STATE(low),
INIT_STATE(high)
{
pinMode(port, OUTPUT);
*this << &STATE(low);
}
};
The macro STATEMACHINE() abstracts the class inheritances, the macro CREATE_STATE() abstracts the state wrapper creation, the macro INIT_STATE() abstracts method wrapping and the macro STATE() abstracts state wrapper reference within the state machine class.
State transition is abstracted by the << operator between the state machine class and the state. If you want a delayed state transition, all you have to do is use that operator with an integer, where the integer is the delay in milliseconds.
To use the state machine, you first have to instantiate it. Declaring a pointer to the class in global scope and instantiating it with new in the setup function might do the trick:
Blink *led1, *led2, *led3;
void setup() {
led1 = new Blink(12, 300);
led2 = new Blink(11, 500);
led3 = new Blink(10, 700);
}
Then you run the states on loop.
void loop() {
(*led2)();
(*led1)();
(*led3)();
}

Async Logger. Can I lose/delay log entries?

I'm implementing my own logging framework. Following is my BaseLogger which receives the log entries and push it to the actual Logger which implements the abstract Log method.
I originally used the C# TPL for asynchronous logging, but I now use threads instead. (A TPL task doesn't hold a real thread, so if all threads of the application end, tasks will stop as well, which will cause all 'waiting' log entries to be lost.)
public abstract class BaseLogger
{
// ... Omitted properties constructor .etc. ... //
public virtual void AddLogEntry(LogEntry entry)
{
if (!AsyncSupported)
{
// the underlying logger doesn't support Async.
// Simply call the log method and return.
Log(entry);
return;
}
// Logger supports Async.
LogAsync(entry);
}
private void LogAsync(LogEntry entry)
{
lock (LogQueueSyncRoot) // Make sure we have a lock before accessing the queue.
{
LogQueue.Enqueue(entry);
}
if (LogThread == null || LogThread.ThreadState == ThreadState.Stopped)
{ // either the thread is completed, or this is the first time we're logging to this logger.
LogThread = new Thread(new ThreadStart(() =>
{
while (true)
{
LogEntry logEntry;
lock (LogQueueSyncRoot)
{
if (LogQueue.Count > 0)
{
logEntry = LogQueue.Dequeue();
}
else
{
break;
// is it possible for a message to be added
// right after the break, once I leave the lock {} but
// before I exit the loop and the thread gets 'completed'?
}
}
Log(logEntry);
}
}));
LogThread.Start();
}
}
// Actual logger implementations will implement this method.
protected abstract void Log(LogEntry entry);
}
Note that AddLogEntry can be called from multiple threads at the same time.
My question is: is it possible for this implementation to lose log entries?
I'm worried that a log entry could be added to the queue right after my thread exits the loop via the break statement in the else clause and leaves the lock block, while the thread is still in the 'Running' state.
I do realize that, because I'm using a queue, even if I miss an entry, the next request to log will push the missed entry as well. But this is not acceptable, especially if this happens for the last log entry of the application.
Also, please let me know whether and how I can implement the same, but using the new C# 5.0 async and await keywords with a cleaner code. I don't mind requiring .NET 4.5.
Thanks in Advance.

While you could likely get this to work, in my experience, I'd recommend, if possible, use an existing logging framework :) For instance, there are various options for async logging/appenders with log4net, such as this async appender wrapper thingy.
Otherwise, IMHO since you're going to be blocking a threadpool thread during your logging operation anyway, I would instead just start a dedicated thread for your logging. You seem to be kind-of going for that approach already, just via Task so that you'd not hold a threadpool thread when nothing is logging. However, the simplification in implementation I think benefits just having the dedicated thread.
Once you have a dedicated logging thread, you then only need have an intermediate ConcurrentQueue. At that point, your log method just adds to the queue and your dedicated logging thread just does that while loop you already have. You can wrap with BlockingCollection if you need blocking/bounded behavior.
By having the dedicated thread as the only thing that writes, it eliminates any possibility of having multiple threads/tasks pulling off queue entries and trying to write log entries at the same time (painful race condition). Since the log method is now just adding to a collection, it doesn't need to be async and you don't need to deal with the TPL at all, making it simpler and easier to reason about (and hopefully in the category of 'obviously correct' or thereabouts :)
This 'dedicated logging thread' approach is what I believe the log4net appender I linked to does as well, FWIW, in case that helps serve as an example.
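The thread above is about C#, but the "dedicated logging thread" design can be sketched in a few lines of standard C++ for comparison. AsyncLogger, add, and the sink callback are hypothetical names, and a mutex-protected std::deque with a condition variable stands in for ConcurrentQueue/BlockingCollection. The point of the sketch is that the worker only exits after the queue has been drained, so an entry added just before shutdown is not lost.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical C++ analogue of the dedicated-logging-thread design.
class AsyncLogger {
public:
    explicit AsyncLogger(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)), worker_([this] { run(); }) {}

    // Asks the worker to stop, lets it drain what is still queued, then joins.
    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // May be called from any thread; only enqueues and wakes the worker.
    void add(std::string entry) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(entry));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reachable when stopping_ is set
            std::string entry = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            sink_(entry);  // the actual write happens outside the lock
            lock.lock();
        }
    }

    std::function<void(const std::string&)> sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::string> queue_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist when it starts
};
```

Because the emptiness check and the decision to exit happen under the same lock that add() takes, there is no window in which an entry can be queued after the worker has decided to stop looking.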

I see two race conditions off the top of my head:
You can spin up more than one Thread if multiple threads call AddLogEntry. This won't cause lost events but is inefficient.
Yes, an event can be queued while the Thread is exiting, and in that case it would be "lost".
Also, there's a serious performance issue here: unless you're logging constantly (thousands of times a second), you're going to be spinning up a new Thread for each log entry. That will get expensive quickly.
Like James, I agree that you should use an established logging library. Logging is not as trivial as it seems, and there are already many solutions.
That said, if you want a nice .NET 4.5-based approach, it's pretty easy:
public abstract class BaseLogger
{
private readonly ActionBlock<LogEntry> block;
protected BaseLogger(int maxDegreeOfParallelism = 1)
{
block = new ActionBlock<LogEntry>(
entry =>
{
Log(entry);
},
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = maxDegreeOfParallelism,
});
}
public virtual void AddLogEntry(LogEntry entry)
{
block.Post(entry);
}
protected abstract void Log(LogEntry entry);
}

Regarding losing waiting messages when the app crashes because of an unhandled exception, I've bound a handler to the AppDomain.CurrentDomain.DomainUnload event. It goes like this:
protected ManualResetEvent flushing = new ManualResetEvent(true);
protected AsyncLogger() // ctor of logger
{
AppDomain.CurrentDomain.DomainUnload += CurrentDomain_DomainUnload;
}
protected void CurrentDomain_DomainUnload(object sender, EventArgs e)
{
if (!IsEmpty)
{
flushing.WaitOne();
}
}
Maybe not too clean, but works.

Efficient consumer thread with multiple producers

I am trying to make a producer/consumer thread situation more efficient by skipping expensive event operations if necessary with something like:
//cas(variable, compare, set) is atomic compare and swap
//queue is already lock free
running = false
// add item to queue – producer thread(s)
if(cas(running, false, true))
{
// We effectively obtained a lock on signalling the event
add_to_queue()
signal_event()
}
else
{
// Most of the time if things are busy we should not be signalling the event
add_to_queue()
if(cas(running, false, true))
signal_event()
}
...
// Process queue, single consumer thread
reset_event()
while(1)
{
wait_for_auto_reset_event() // Preferably IOCP
for(int i = 0; i < SpinCount; ++i)
process_queue()
cas(running, true, false)
if(queue_not_empty())
if(cas(running, false, true))
signal_event()
}
Obviously trying to get these things correct is a little tricky(!) so is the above pseudo code correct? A solution that signals the event more than is exactly needed is ok but not one that does so for every item.

This falls into the sub-category of "stop messing about and go back to work" known as "premature optimisation". :-)
If the "expensive" event operations are taking up a significant portion of time, your design is wrong, and rather than use a producer/consumer you should use a critical section/mutex and just do the work from the calling thread.
I suggest you profile your application if you are really concerned.
Updated:
Correct answer:
Producer
ProducerAddToQueue(pQueue,pItem){
EnterCriticalSection(pQueue->pCritSec)
if(IsQueueEmpty(pQueue)){
SignalEvent(pQueue->hEvent)
}
AddToQueue(pQueue, pItem)
LeaveCriticalSection(pQueue->pCritSec)
}
Consumer
nCheckQuitInterval = 100; // Every 100 ms consumer checks if it should quit.
ConsumerRun(pQueue)
{
while(!ShouldQuit())
{
Item* pCurrentItem = NULL;
EnterCriticalSection(pQueue->pCritSec);
if(IsQueueEmpty(pQueue))
{
ResetEvent(pQueue->hEvent)
}
else
{
pCurrentItem = RemoveFromQueue(pQueue);
}
LeaveCriticalSection(pQueue->pCritSec);
if(pCurrentItem){
ProcessItem(pCurrentItem);
pCurrentItem = NULL;
}
else
{
// Wait for items to be added.
WaitForSingleObject(pQueue->hEvent, nCheckQuitInterval);
}
}
}
Notes:
The event is a manual-reset event.
The operations protected by the critical section are quick. The event is only set or reset when the queue transitions to/from empty state. It has to be set/reset within the critical section to avoid a race condition.
This means the critical section is only held for a short time, so contention will be rare.
Critical sections don't block unless they are contended. So context switches will be rare.
Assumptions:
This is a real problem not homework.
Producers and consumers spend most of their time doing other stuff, i.e. getting the items ready for the queue, processing them after removing them from the queue.
If they are spending most of the time doing the actual queue operations, you shouldn't be using a queue. I hope that is obvious.

Went thru a bunch of cases, can't see an issue. But it's kinda complicated. I thought maybe you would have an issue with queue_not_empty / add_to_queue racing. But looks like the post-dominating CAS in both paths covers this case.
CAS is expensive (not as expensive as signal). If you expect skipping the signal to be common, I would code the CAS as follows:
bool cas(variable, old_val, new_val) {
if (variable != old_val) return false
asm cmpxchg
}
Lock-free structures like this are the kind of thing that Jinx (the product I work on) is very good at testing. So you might want to use an eval license to test the lock-free queue and signal optimization logic.
Edit: maybe you can simplify this logic.
running = false
// add item to queue – producer thread(s)
add_to_queue()
if (cas(running, false, true)) {
signal_event()
}
// Process queue, single consumer thread
reset_event()
while(1)
{
wait_for_auto_reset_event() // Preferably IOCP
for(int i = 0; i < SpinCount; ++i)
process_queue()
cas(running, true, false) // this could just be a memory barriered store of false
if(queue_not_empty())
if(cas(running, false, true))
signal_event()
}
Now that the cas/signal are always next to each other they can be moved into a subroutine.

Why not just associate a bool with the event? Use cas to set it to true, and if the cas succeeds then signal the event because the event must have been clear. The waiter can then just clear the flag before it waits
bool flag=false;
// producer
add_to_queue();
if(cas(flag,false,true))
{
signal_event();
}
// consumer
while(true)
{
while(queue_not_empty())
{
process_queue();
}
cas(flag,true,false); // clear the flag
if(queue_is_empty())
wait_for_auto_reset_event();
}
This way, you only wait if there are no elements on the queue, and you only signal the event once for each batch of items.
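The flag-guarded signalling described above can be sketched in standard C++ as follows. This is a simplified illustration under assumptions: SignalledQueue is a hypothetical name, a mutex-protected std::queue stands in for the lock-free queue, and a counter stands in for the real signal_event() call so that the number of signals can be observed.

```cpp
#include <atomic>
#include <mutex>
#include <queue>

// Hypothetical wrapper showing the "signal only when the flag was clear" idea.
class SignalledQueue {
public:
    // Producer side: enqueue first, then signal only if the flag was clear.
    void push(int item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(item);
        }
        bool expected = false;
        if (flag_.compare_exchange_strong(expected, true)) {
            ++signals_;  // stand-in for signal_event()
        }
    }

    // Consumer side: process everything queued, clear the flag, then report
    // whether the queue is still empty, i.e. whether waiting is appropriate.
    template <typename Fn>
    bool drain(Fn process) {
        int item;
        while (try_pop(item)) process(item);
        flag_.store(false);
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.empty();
    }

    int signal_count() const { return signals_.load(); }

private:
    bool try_pop(int& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

    std::mutex mutex_;
    std::queue<int> items_;
    std::atomic<bool> flag_{false};
    std::atomic<int> signals_{0};
};
```

A consumer loop would call drain() and wait on the event only when it returns true. The emptiness check after clearing the flag matters: an item pushed between the last pop and the flag reset produces no signal, so the consumer has to notice it by re-checking the queue.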

I believe, you want to achieve something like in this question:
WinForms Multithreading: Execute a GUI update only if the previous one has finished. It is specific on C# and Winforms, but the structure may well apply for you.
