Lambda expressions, concurrency and static variables - multithreading

As far as I know, such use of static storage within a lambda is legal. Essentially it counts the number of entries into the closure:
#include <vector>
#include <iostream>
#include <algorithm>
#include <iterator>
typedef std::pair<int,int> mypair;
std::ostream &operator<< (std::ostream &os, mypair const &data) {
return os << "(" << data.first << ": " << data.second << ") ";
}
int main()
{
int n;
std::vector<mypair> v;
std::cin >> n;
v.resize(n); // resize, not reserve: for_each iterates over existing elements, and reserve leaves the vector empty
std::for_each(std::begin(v), std::end(v), [](mypair& x) {
static int i = 0;
std::cin >> x.second;
x.first = i++;
});
std::for_each(std::begin(v), std::end(v), [](mypair& x) {
std::cout << x;
});
return 0;
}
Let's assume I have a container 'workers' of threads.
std::vector<std::thread> workers;
for (int i = 0; i < 5; i++) {
workers.push_back(std::thread([]()
{
std::cout << "thread #" << "start\n";
doLengthyOperation();
std::cout << "thread #" << "finish\n";
}));
}
The code in doLengthyOperation() is a contained, self-sufficient operation, akin to creating a new process.
Provided I join them using for_each, and the stored variable in question must count the number of active tasks rather than just the number of entries, what possible implementations for such a counter are there? I want to avoid relying on global variables (so nobody else can mess with the counter) and to allow automatic support for separate "flavors" of threads.
std::for_each(workers.begin(), workers.end(), [](std::thread &t)
{
t.join();
});
The surrounding scope may die quickly after the finishing thread starts, and it may be re-entered; adding new threads to the container is possible. That would force the counter to be a global variable, which I want to avoid. On top of that, the whole operation is a template.

The best way to handle this is to capture an instance of std::atomic<int>, which provides a thread-safe counter. Depending on the lifetime of the lambdas and the surrounding scope, you may wish to capture by reference or by shared pointer.
To take your example:
std::vector<std::thread> workers;
auto counter = std::make_shared<std::atomic<int>>(0);
for (int i = 0; i < 5; i++) {
workers.push_back(std::thread([counter]()
{
std::cout << "thread #" << "start\n";
(*counter)++;
doLengthyOperation();
(*counter)--;
std::cout << "thread #" << "finish\n";
}));
}
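Since the counter is an ordinary local object shared through std::shared_ptr, nothing needs to be global, and separate "flavors" of threads can each get their own counter simply by creating more of them. A minimal sketch of that idea (the flavor names and the makeWorker helper are illustrative assumptions, not part of the original code):
#include <atomic>
#include <iostream>
#include <memory>
#include <thread>
#include <vector>

int main() {
    // One independent counter per thread "flavor"; none of them is global.
    auto ioCounter  = std::make_shared<std::atomic<int>>(0);
    auto cpuCounter = std::make_shared<std::atomic<int>>(0);

    // Helper that builds a worker bound to whichever counter it should bump.
    auto makeWorker = [](std::shared_ptr<std::atomic<int>> counter) {
        return [counter]() {
            ++*counter;   // task became active
            // ... doLengthyOperation() would run here ...
            --*counter;   // task finished
        };
    };

    std::vector<std::thread> workers;
    workers.emplace_back(makeWorker(ioCounter));
    workers.emplace_back(makeWorker(cpuCounter));

    for (auto &t : workers) t.join();

    std::cout << "io active: " << ioCounter->load()
              << ", cpu active: " << cpuCounter->load() << '\n';
    return 0;
}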

Related

Multithreaded Producer/Consumer in C++

I am looking at multithreading and have written a basic producer/consumer. I have two issues with the producer/consumer written below. 1) Even with the consumer sleep time set lower than the producer sleep time, the producer still seems to execute quicker. 2) In the consumer I have duplicated the code for the case where the producer has finished adding to the queue but there are still elements in the queue. Any advice for a better way of structuring the code?
#include <iostream>
#include <queue>
#include <mutex>
#include <thread>
#include <chrono>
class App {
private:
std::queue<int> m_data;
bool m_bFinished;
std::mutex m_Mutex;
int m_ConsumerSleep;
int m_ProducerSleep;
int m_QueueSize;
public:
App(int &MaxQueue) :m_bFinished(false), m_ConsumerSleep(1), m_ProducerSleep(5), m_QueueSize(MaxQueue){}
void Producer() {
for (int i = 0; i < m_QueueSize; ++i) {
std::lock_guard<std::mutex> guard(m_Mutex);
m_data.push(i);
std::cout << "Producer Thread, queue size: " << m_data.size() << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(m_ProducerSleep));
}
m_bFinished = true;
}
void Consumer() {
while (!m_bFinished) {
if (m_data.size() > 0) {
std::lock_guard<std::mutex> guard(m_Mutex);
std::cout << "Consumer Thread, queue element: " << m_data.front() << " size: " << m_data.size() << std::endl;
m_data.pop();
}
else {
std::cout << "No elements, skipping" << std::endl;
}
std::this_thread::sleep_for(std::chrono::seconds(m_ConsumerSleep));
}
while (m_data.size() > 0) {
std::lock_guard<std::mutex> guard(m_Mutex);
std::cout << "Emptying remaining elements " << m_data.front() << std::endl;
m_data.pop();
std::this_thread::sleep_for(std::chrono::seconds(m_ConsumerSleep));
}
}
};
int main()
{
int QueueElements = 10;
App app(QueueElements);
std::thread consumer_thread(&App::Consumer, &app);
std::thread producer_thread(&App::Producer, &app);
producer_thread.join();
consumer_thread.join();
std::cout << "loop exited" << std::endl;
return 0;
}
You should use a condition_variable. Don't use sleep to synchronize threads.
Main scheme:
Producer pushes value under lock and signals condition_variable.
Consumer waits under lock on condition variable and checks predicate to prevent spurious wakeups.
My version:
#include <iostream>
#include <queue>
#include <mutex>
#include <thread>
#include <condition_variable>
#include <atomic>
class App {
private:
std::queue<int> m_data;
std::atomic_bool m_bFinished;
std::mutex m_Mutex;
std::condition_variable m_cv;
int m_QueueSize;
public:
App(int MaxQueue)
: m_bFinished(false)
, m_QueueSize(MaxQueue)
{}
void Producer()
{
for (int i = 0; i < m_QueueSize; ++i)
{
{
std::unique_lock<std::mutex> lock(m_Mutex);
m_data.push(i);
std::cout << "Producer Thread, queue size: " << m_data.size() << std::endl; // read the size while still holding the lock
}
m_cv.notify_one();
}
m_bFinished = true;
m_cv.notify_one(); // wake the consumer so it can observe m_bFinished and exit
}
void Consumer()
{
do
{
std::unique_lock<std::mutex> lock(m_Mutex);
// The predicate guards against spurious wakeups and also lets the consumer
// wake up and exit once the producer has set m_bFinished.
m_cv.wait(lock, [&](){ return m_bFinished || !m_data.empty(); });
while(!m_data.empty()) // consume all elements from queue
{
std::cout << "Consumer Thread, queue element: " << m_data.front() << " size: " << m_data.size() << std::endl;
m_data.pop();
}
} while(!m_bFinished);
}
};
int main()
{
int QueueElements = 10;
App app(QueueElements);
std::thread consumer_thread(&App::Consumer, &app);
std::thread producer_thread(&App::Producer, &app);
producer_thread.join();
consumer_thread.join();
std::cout << "loop exited" << std::endl;
return 0;
}
Also note that it's better to use an atomic for the end flag when dealing with concurrent threads: without it, the value of m_bFinished may stay cached on the consumer's side, and if nothing forces it to be re-read, the change made by the producer thread can go unseen. Atomics provide the memory-ordering guarantees that make the update visible to other threads.
You can also take a look at the std::memory_order page.
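As a minimal standalone illustration of that visibility problem (a hypothetical example, not the code above): with a plain bool the reader loop has a data race and may never terminate, while std::atomic<bool> forces a real, ordered load on every iteration.
#include <atomic>
#include <chrono>
#include <thread>

// With a plain 'bool' here the program would have a data race and the reader
// loop might spin forever; std::atomic<bool> guarantees the flag is re-read.
std::atomic<bool> done{false};

int main() {
    std::thread reader([] {
        while (!done.load(std::memory_order_acquire)) {
            // spin until the writer publishes 'done'
        }
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    done.store(true, std::memory_order_release); // becomes visible to the reader
    reader.join();
    return 0;
}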
First, you should use a condition variable instead of a delay on the consumer. This way, the consumer thread only wakes up when the queue is not empty and the producer notifies it.
That said, the reason your producer runs more often is the delay in the producer thread: it sleeps while still holding the mutex, so the consumer cannot acquire it until the delay is over. You should release the mutex before calling sleep_for:
for (int i = 0; i < m_QueueSize; ++i) {
/* Introduce a scope to release the mutex before sleeping*/
{
std::lock_guard<std::mutex> guard(m_Mutex);
m_data.push(i);
std::cout << "Producer Thread, queue size: " << m_data.size() << std::endl;
} // Mutex is released here
std::this_thread::sleep_for(std::chrono::seconds(m_ProducerSleep));
}

Object address suddenly changed

class test
{
void thread1()
{
int i = 0;
while(true){
for(unsigned int k = 0;k < mLD.size(); k++ )
{
mLD[k] = i++;
}
}
}
void thread2()
{
std::cout << "thread2 address : " << &mLD << "\n";
C();
}
void B()
{
std::cout << "B address : " << &mLD << "\n";
for(unsigned int k = 0;k < mLD.size(); k++ )
{
if(mLD[k]<=25)
{
}
}
}
void C()
{
B();
std::cout << "C address : " << &mLD << "\n";
double distance = mLD[0]; // <---- segmetation fault
}
std::array<double, 360> mLD;
};
cout result --->
thread2 address : 0x7e807660
B address : 0x7e807660
C address : 0x1010160 (sometimes 0x7e807660 )
Why did mLD's address change?
Even if I change the std::array to std::array<std::atomic<double>, 360>, the result is the same.
Most probably, the object you refer to has already been destroyed at the point of the call to C, which points to a synchronization issue. You need to extend the lifetime of the object referred to by the thread(s) until the threads are done executing their routines. To accomplish this, you can have something like this:
#include <thread>
#include <array>
#include <iostream>
struct foo{
void callback1(){
for(auto & elem: storage){
elem += 5;
}
}
void callback2(){
for(const auto & elem: storage){
std::cout << elem << std::endl;
}
}
std::array<double, 300> storage;
};
int main(void){
foo f;
std::thread t1 {[&f](){f.callback1();}};
std::thread t2 {[&f](){f.callback2();}};
// wait until both threads are done executing their routines
t1.join();
t2.join();
return 0;
}
The instance of foo, f, lives in the scope of main(), so its lifetime runs from the line where it is defined to the end of main's scope. By joining both threads, we block main from proceeding until both threads are done executing their callback functions; hence the lifetime of f is extended until the callbacks are done.
The second issue is that the code needs synchronization primitives, because the storage variable is shared between two independent execution paths. The final code with proper synchronization can look like this:
#include <thread>
#include <array>
#include <iostream>
#include <mutex>
struct foo{
void callback1(){
// RAII style lock, which invokes .lock() upon construction, and .unlock() upon destruction
// automatically.
std::unique_lock<std::mutex> lock(mtx);
for(auto & elem: storage){
elem += 5;
}
}
void callback2(){
std::unique_lock<std::mutex> lock(mtx);
for(const auto & elem: storage){
std::cout << elem << std::endl;
}
}
std::array<double, 300> storage;
// non-reentrant mutex
mutable std::mutex mtx;
};
int main(void){
foo f;
std::thread t1 {[&f](){f.callback1();}};
std::thread t2 {[&f](){f.callback2();}};
// wait until both threads are done executing their routines
t1.join();
t2.join();
return 0;
}
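If the reading path runs far more often than the writing path, a std::shared_mutex (C++17) would let several readers hold the lock at once while a writer still gets exclusive access. A sketch of that variation, offered as an alternative I'm suggesting rather than part of the original answer:
#include <array>
#include <iostream>
#include <shared_mutex>
#include <thread>

struct foo {
    void writer() {
        std::unique_lock<std::shared_mutex> lock(mtx); // exclusive access for the writer
        for (auto &elem : storage) elem += 5;
    }
    void reader() {
        std::shared_lock<std::shared_mutex> lock(mtx); // shared with other readers
        for (const auto &elem : storage) std::cout << elem << '\n';
    }
    std::array<double, 300> storage{};
    mutable std::shared_mutex mtx;
};

int main() {
    foo f;
    std::thread t1{[&f](){ f.writer(); }};
    std::thread t2{[&f](){ f.reader(); }};
    t1.join();
    t2.join();
    return 0;
}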

Threads executing an operation a second

I'm trying to create concurrent code where I execute a function once per second; the function prints a character and then waits a second on that thread. The behaviour I expect is for the characters to be printed one after another, but that doesn't happen; instead, all of the characters of an inner-loop execution are printed at once. I'm not sure if this is related to an I/O operation or something else.
I've also tried creating an array of threads where each thread is created in the inner loop, but the behaviour repeats, even without calling join(). What might be wrong with the code?
The following code is what I've tried; I used a clock to check whether it was waiting the correct amount of time:
#include <iostream>
#include <thread>
#include <chrono>
#include <string>
void print_char();
int main() {
using Timer = std::chrono::high_resolution_clock;
using te = std::chrono::duration<double>;
using s = std::chrono::seconds;
te interval;
for (int i = 0; i < 100; i++) {
auto a = Timer::now();
for (int j = 0; j < i; j++) {
std::thread t(print_char);
t.join();
}
auto b = Timer::now();
interval = b-a;
std::cout << std::chrono::duration_cast<s>(interval).count();
std::cout << std::endl;
}
return 0;
}
void print_char() {
std::cout << "*";
std::this_thread::sleep_for(std::chrono::seconds(1));
}
The behaviour I expect is to print each character after another but this doesn't happen, instead, it prints all of the characters of the inner loop execution.
You need to flush the output stream in order to see the text:
std::cout << "*" << std::flush;
std::endl includes a call to std::flush, which is why you see whole lines appear only once the inner loop is complete. Your threads do add the '*' characters once per second; you just don't see them being added until the stream is flushed.
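If you would rather not remember to flush on every insertion, one option (my suggestion, not from the original answer) is the std::unitbuf manipulator, which makes the stream flush after each output operation:
std::cout << std::unitbuf; // from now on std::cout flushes after every insertion
std::cout << "*";          // appears immediately, no explicit std::flush needed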
Consider the code
std::thread t(print_char);
t.join();
The first line creates and starts a thread. The second line immediately waits for the thread to end. That makes your program serial, not parallel. In fact, it's no different from calling the function directly instead of creating the thread.
If you want to have the thread operate in parallel and independently from your main thread, you should have the loop in the thread function itself instead. Perhaps something like
std::atomic<bool> keep_running{true};
void print_char() {
while (keep_running) {
std::cout << "*";
std::this_thread::sleep_for(std::chrono::seconds(1));
}
}
Then in the main function you just create the thread, and do something else until you want the thread to end.
std::thread t(print_char);
// Do something else...
keep_running = false;
t.join();
In regard to your current code, it's really no different than
for (int i = 0; i < 100; i++) {
auto a = Timer::now();
for (int j = 0; j < i; j++) {
print_char();
}
auto b = Timer::now();
interval = b-a;
std::cout << std::chrono::duration_cast<s>(interval).count();
std::cout << std::endl;
}
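Putting the pieces together, a minimal self-contained version of the "one character per second" printer might look like this (the ten-second run time and the explicit flush are my assumptions):
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<bool> keep_running{true};

void print_char() {
    while (keep_running) {
        std::cout << "*" << std::flush; // flush so each '*' shows up immediately
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

int main() {
    std::thread t(print_char);                              // printer runs independently
    std::this_thread::sleep_for(std::chrono::seconds(10));  // main does other work
    keep_running = false;                                   // ask the printer to stop
    t.join();
    std::cout << "\ndone" << std::endl;
    return 0;
}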

Posix semaphore for synchronisation between two different processes [duplicate]

According to my understanding, a semaphore should be usable across related processes without it being placed in shared memory. If so, why does the following code deadlock?
#include <iostream>
#include <semaphore.h>
#include <sys/wait.h>
#include <unistd.h> // fork
using namespace std;
static int MAX = 100;
int main(int argc, char* argv[]) {
int retval;
sem_t mutex;
cout << sem_init(&mutex, 1, 0) << endl;
pid_t pid = fork();
if (0 == pid) {
// sem_wait(&mutex);
cout << endl;
for (int i = 0; i < MAX; i++) {
cout << i << ",";
}
cout << endl;
sem_post(&mutex);
} else if(pid > 0) {
sem_wait(&mutex);
cout << endl;
for (int i = 0; i < MAX; i++) {
cout << i << ",";
}
cout << endl;
// sem_post(&mutex);
wait(&retval);
} else {
cerr << "fork error" << endl;
return 1;
}
// sem_destroy(&mutex);
return 0;
}
When I run this on Gentoo/Ubuntu Linux, the parent hangs. Apparently, it did not receive the post from the child. Uncommenting sem_destroy doesn't do any good. Am I missing something?
Update 1:
This code works
mutex = (sem_t *) mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
if (mutex == MAP_FAILED) { // mmap reports failure with MAP_FAILED, not a null pointer
perror("out of memory\n");
exit(1);
}
Thanks,
Nilesh.
The wording in the manual page is kind of ambiguous.
If pshared is nonzero, then the semaphore is shared between processes,
and should be located in a region of shared memory.
Since a child created by fork(2) inherits its parent's memory
mappings, it can also access the semaphore.
Yes, but it still has to be in a shared region. Otherwise the memory simply gets copied with the usual CoW and that's that.
You can solve this in at least two ways:
Use sem_open("my_sem", ...)
Use shm_open and mmap to create a shared region
An excellent article on this topic, for future passers-by:
http://blog.superpat.com/2010/07/14/semaphores-on-linux-sem_init-vs-sem_open/
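For the first option, a minimal sketch using a named semaphore via sem_open (the semaphore name "/my_sem" is arbitrary and error handling is kept to a minimum):
#include <cstdio>       // perror
#include <fcntl.h>      // O_CREAT
#include <iostream>
#include <semaphore.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    // Named semaphore, initial value 0; the name is visible to the child after fork().
    sem_t *sem = sem_open("/my_sem", O_CREAT, 0644, 0);
    if (sem == SEM_FAILED) { perror("sem_open"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                    // child: do its work, then post
        std::cout << "child done" << std::endl;
        sem_post(sem);
        return 0;
    }
    sem_wait(sem);                     // parent: blocks until the child posts
    std::cout << "parent continues" << std::endl;
    wait(nullptr);
    sem_close(sem);
    sem_unlink("/my_sem");             // remove the name once we are done
    return 0;
}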

C++ Locking stream operators with mutex

I need to lock stdout in my logging application to prevent string interleaving when multi-threaded applications log to stdout. I can't figure out how to use a move constructor, std::move, or something else to move a unique_lock into another object.
I created objects for setting configs and for encapsulation (called shards) and figured out how to lock stdout from these objects with a static std::mutex.
Something like this works for me:
l->log(1, "Test message 1");
While that is fine and could be implemented with templates and a variable number of parameters, I would like a more stream-like interface. I am looking for something like this:
*l << "Module id: " << 42 << "value: " << 42 << std::endl;
I don't want to force users to precompute the string with concatenation and to_string(42); I just want to find a way to lock stdout.
My approach so far was to create operator<< and another object, locked_stream, as was suggested in other answers. Thing is, I can't figure out how to move the mutex (lock) into another object. My code:
locked_stream& shard::operator<<(int num)
{
static std::mutex _out_mutex;
std::unique_lock<std::mutex> lock(_out_mutex);
//std::lock_guard<std::mutex> lock (_out_mutex);
std::cout << std::to_string(num) << "(s)";
locked_stream s;
return s;
}
After writing the input to std::cout I would like to move the lock into the stream object.
In this case, I would be careful not to use static locks in functions, as you will get a different lock for each stream operator you create.
What you need is to lock some "output lock" when a stream is created, and unlock it when the stream is destroyed. You can piggyback on existing stream operations if you're just wrapping std::ostream. Here's a working implementation:
#include <mutex>
#include <iostream>
#include <utility> // std::forward
class locked_stream
{
static std::mutex s_out_mutex;
std::unique_lock<std::mutex> lock_;
std::ostream* stream_; // can't make this reference so we can move
public:
locked_stream(std::ostream& stream)
: lock_(s_out_mutex)
, stream_(&stream)
{ }
locked_stream(locked_stream&& other)
: lock_(std::move(other.lock_))
, stream_(other.stream_)
{
other.stream_ = nullptr;
}
friend locked_stream&& operator << (locked_stream&& s, std::ostream& (*arg)(std::ostream&))
{
(*s.stream_) << arg;
return std::move(s);
}
template <typename Arg>
friend locked_stream&& operator << (locked_stream&& s, Arg&& arg)
{
(*s.stream_) << std::forward<Arg>(arg);
return std::move(s);
}
};
std::mutex locked_stream::s_out_mutex{};
locked_stream locked_cout()
{
return locked_stream(std::cout);
}
int main (int argc, char * argv[])
{
locked_cout() << "hello world: " << 1 << 3.14 << std::endl;
return 0;
}
Here it is on ideone: https://ideone.com/HezJBD
Also, forgive me, but there will be a mix of spaces and tabs up there because of online editors being awkward.
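One detail worth noting about this design: the temporary locked_stream returned by locked_cout() lives until the end of the full expression, so the mutex is held for the entire chain of << calls and released when the statement finishes, which is exactly the scope needed to prevent interleaving.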
