I have very simple code in which multiple threads try to insert data into a std::map. As I understand it, this should lead to a program crash because it is a data race.
std::map<long long,long long> k1map;
void Ktask()
{
for(int i=0;i<1000;i++)
{
long long random_variable = (std::rand())%1000;
std::cout << "Thread ID -> " << std::this_thread::get_id() << " with looping index " << i << std::endl;
k1map.insert(std::make_pair(random_variable, random_variable));
}
}
int main()
{
std::srand((int)std::time(0)); // use current time as seed for random generator
for (int i = 0; i < 1000; ++i)
{
std::thread t(Ktask);
std::cout << "Thread created " << t.get_id() << std::endl;
t.detach();
}
return 0;
}
However, I ran it multiple times and there was no application crash, whereas if I run the same code with pthread and C++03, the application crashes. So I am wondering: is there some change in C++11 that makes map insert thread-safe?
No, std::map::insert is not thread-safe.
A data race is undefined behavior, and undefined behavior does not guarantee a crash. There are many reasons why your example may not crash. Your threads may be running in a serial fashion due to the system scheduler, or because they finish very quickly (1000 iterations isn't that many). Your map will fill up quickly (it can hold at most 1000 distinct keys), so later insertions won't actually modify the structure, which reduces the chance of a crash. Or perhaps the implementation you're using IS thread-safe.
For most standard library types, the only thread safety guarantee you get is that it is safe to use separate object instances in separate threads. That's it.
And std::map is not one of the exceptions to that rule. An implementation might offer you more of a guarantee, or you could just be getting lucky.
And when it comes to fixing threading bugs, there's only one kind of luck.
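If the map really must be shared between threads, the usual fix is to guard every access with one mutex. What follows is only a minimal sketch under that assumption: GuardedMap and run_guarded_inserts are hypothetical names that do not appear in the question, and the keys are deterministic rather than random so the result is easy to check.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: every access to the map goes through one mutex.
class GuardedMap {
public:
    void insert(long long key, long long value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_.insert(std::make_pair(key, value));
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<long long, long long> map_;
};

// Starts `threads` workers that each insert the keys 0..keys_per_thread-1,
// joins them all, and returns the number of distinct keys stored.
std::size_t run_guarded_inserts(int threads, int keys_per_thread) {
    GuardedMap shared;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&shared, keys_per_thread] {
            for (int i = 0; i < keys_per_thread; ++i) {
                shared.insert(i, i);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();  // joined, not detached, so the map outlives its users
    }
    return shared.size();
}
```

The workers are joined rather than detached, so the map cannot be destroyed while a thread is still inserting into it.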
I am using a MultiThreading class which creates the required number of threads in its own threadpool and deletes itself after use.
std::thread *m_pool; //number of threads according to available cores
std::mutex m_locker;
std::condition_variable m_condition;
std::atomic<bool> m_exit;
int m_processors;
m_pool = new std::thread[m_processors + 1];
void func()
{
//code
}
for (int i = 0; i < m_processors; i++)
{
m_pool[i] = std::thread(func);
}
void reset(void)
{
{
std::lock_guard<std::mutex> lock(m_locker);
m_exit = true;
}
m_condition.notify_all();
for(int i = 0; i <= m_processors; i++)
m_pool[i].join();
delete[] m_pool;
}
After running through all tasks, the for-loop is supposed to join all running threads before delete[] is executed.
But there seems to be one last thread still running while m_pool no longer exists.
This leads to the problem that I can't close my program anymore.
Is there any way to check if all threads are joined or wait for all threads to be joined before deleting the threadpool?
A simple typo, I think.
Your loop condition i <= m_processors is an off-by-one bug: it joins one more element than you started. Suppose m_processors is 2. You start threads at indices [0] and [1], yet the loop also tries to join the item at index [2]. With the m_processors + 1 allocation shown, m_pool[2] is a default-constructed std::thread that was never started, and calling join() on it throws std::system_error. Had the array held only m_processors elements, the access would read past the end of the array, which is undefined behavior, and you would likely either crash or block forever there.
You likely intended i < m_processors.
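On the question of checking whether a thread can still be joined: std::thread::joinable() reports whether a thread object refers to a thread that has not been joined or detached yet. A minimal sketch, assuming a pool in which some slots were never started; join_all and demo_join are hypothetical helper names, not part of the question's class.

```cpp
#include <thread>
#include <vector>

// Joins every thread that is still joinable and returns how many were joined.
// Default-constructed (never started) std::thread objects are skipped.
int join_all(std::vector<std::thread>& pool) {
    int joined = 0;
    for (std::thread& t : pool) {
        if (t.joinable()) {
            t.join();
            ++joined;
        }
    }
    return joined;
}

// Hypothetical demo: `slots` thread objects, of which only `started` run.
int demo_join(int slots, int started) {
    std::vector<std::thread> pool(slots);  // all default-constructed
    for (int i = 0; i < started; ++i) {
        pool[i] = std::thread([] {});      // trivial worker body
    }
    return join_all(pool);
}
```

Checking joinable() only avoids the exception from joining an empty slot; the loop bound should still be fixed.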
The real source of the problem is addressed by Wick's answer. I will extend it with some tips that also solve your problem while improving other aspects of your code.
If you use C++11 for std::thread, then you shouldn't create your thread handles using operator new[]. There are better ways of doing that with other C++ constructs, which will make everything simpler and exception safe (you don't leak memory if an unexpected exception is thrown).
Store your thread objects in a std::vector. It will manage the memory allocation and deallocation for you (no more new and delete). You can use other more flexible containers such as std::list if you insert/delete threads dynamically.
Fill the vector in place with std::generate_n or similar:
std::vector<std::thread> m_pool;
m_pool.reserve(m_processors);
// Fill the vector
std::generate_n( std::back_inserter(m_pool), m_processors,
[](){ return std::thread(func); } );
Join all the elements using a range-for loop and delete the handles using the container's functions.
for( std::thread& t: m_pool ) {
t.join();
}
m_pool.clear();
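Put together, the fragments above might look like the following self-contained sketch. The function name run_pool and the trivial worker body (one increment of a shared atomic counter) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <atomic>
#include <iterator>
#include <thread>
#include <vector>

// Runs `count` worker threads, each incrementing a shared counter once,
// and returns the counter after all workers have been joined.
int run_pool(int count) {
    std::atomic<int> done{0};
    std::vector<std::thread> pool;
    pool.reserve(count);

    // Fill the vector in place; each generated thread is moved into the pool.
    std::generate_n(std::back_inserter(pool), count,
                    [&done] { return std::thread([&done] { ++done; }); });

    // Join exactly the threads that were created, then drop the handles.
    for (std::thread& t : pool) {
        t.join();
    }
    pool.clear();
    return done.load();
}
```

Because the loop iterates over the container itself, there is no index bound to get wrong.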
I am working on a final project for a class. This project is to mimic multiple ATMs. My program already runs. Inside my main.cpp, I create the threads (for now just two, later on maybe more). They call a class Begin that uses rand() to decide whether customers are going to make a deposit or a withdrawal, then uses rand() for the amount, and does this 5 times.
#include "ATM.h"
void main()
{
Begin test1;
test1.manager();
thread first(&Begin::atm, test1);
thread second(&Begin::atm, test1);
first.join();
second.join();
delete resbox::cashbox;
system("pause");
}
I cannot figure out how to suspend my threads created in Main.cpp inside of my observe() function like so:
void watcher::observe()
{
float cash;
if (resbox::cashbox->gettotal() >= resbox::cashbox->getmax())
{
//suspend all other threads
cout << "Please empty cash box it is full! with $"<< resbox::cashbox->gettotal() << endl;
cout << "How much would like to withdraw?" << endl;
cin >> cash;
resbox::cashbox->cashwd(cash);
cout << "This is the amount in the reserve box now is $" << resbox::cashbox->gettotal() << endl;
//resume all other threads
}
if (resbox::cashbox->gettotal() <= 500)
{
//suspend all other threads
cout << "Please fill cashbox it is low, has $" << resbox::cashbox->gettotal() << endl;
cout << "How much would like to add?" << endl;
cin >> cash;
resbox::cashbox->cashdp(cash);
cout << "This is the amount in the reserve box now $" << resbox::cashbox->gettotal() << endl;
//resume all other threads
}
}
Whenever the condition of one of the if statements is met, I need to be able to suspend all threads other than the current thread that met the condition. Then, once the data has been handled and before leaving the if statement and the observe function, I need to resume all the other threads.
I read about the possibility of using SuspendThread and ResumeThread in "how to suspend thread", yet I am having a hard time passing the threads created in main.cpp to the observe function so that I can call those functions. I figured out how to create threads from cplusplus.com. I also noticed I could potentially use mutex locking, as referred to in "What is the best solution to pause and resume pthreads?"
I am using c++ under Microsoft Visual Studio 2015 Community.
This is my first time dealing with threads. For my use, which is better: passing the created threads to the observe function, or is there another way to pause/suspend and then resume them, and how would I do so? Thank you for any advice/help provided.
Currently, if I run my program and one of the conditions is met by a thread, the other thread also meets the same condition, and I have to enter the amount to deposit/withdraw twice before the threads continue, until each thread has dealt with 5 customers, for a total of 10 customers.
I finally figured out what I needed and what to use thanks to:
Class RWLock
I added this class to my project and created a global instance of it.
Then I added the reader and writer locks and unlocks where they worked best inside my code, like so:
void Begin::atm() // The main function that makes it easier for threads to call and run the program.
{
ATM atm;
int choice, amount;
LARGE_INTEGER cicles;
QueryPerformanceCounter(&cicles);
srand(cicles.QuadPart);
for (int i = 0; i < imax; i++) //mimics a total of 5 customers
{
rw.ReadLock(); //Have to place to read lock here.
choice = rand() % 2; //Randomizes the choice of depositing or withdrawing.
amount = rand() % 5000 + 1; //Randomizes the amount of cash that the customers use.
rw.ReadUnlock(); //Read unlock must happen here otherwise it blocks the writers.
rw.WriteLock(); //Must happen here!
if (choice == 0)
{
atm.cashdp(amount);
cout << "\tCustomer depositing $" << amount << endl;
}
else if (choice == 1)
{
atm.cashwd(amount);
cout << "\tCustomer withdrawing $" << amount << endl;
}
else
//error checker against the randomizer for the choice of depositing or withdrawing.
cout << "error rand creating wrong number" << endl;
rw.WriteUnlock(); //Must Happen here!
Sleep(5000); // Sleeps the program between customer usage to mimic actual use.
}
}
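For comparison, here is a sketch of the same idea in portable C++11, using a plain std::mutex instead of the RWLock class above. It is an illustration under assumptions, not the code from the project: CashBox and simulate are hypothetical names, and resetting the total to zero stands in for the operator prompt. The thread that fills the box handles it while still holding the lock, so the other threads simply block in deposit() instead of being suspended, and the condition is handled only once.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the cash box shared by all ATM threads.
class CashBox {
public:
    explicit CashBox(int max) : max_(max) {}

    // Deposits under the lock. If the box becomes full, the same thread
    // empties it before releasing the lock, so all other threads wait
    // in deposit() until the box is usable again.
    void deposit(int amount) {
        std::lock_guard<std::mutex> lock(mutex_);
        total_ += amount;
        if (total_ >= max_) {
            total_ = 0;          // stands in for the operator emptying the box
            ++times_emptied_;
        }
    }

    int times_emptied() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return times_emptied_;
    }

private:
    mutable std::mutex mutex_;
    int total_ = 0;
    int max_;
    int times_emptied_ = 0;
};

// Runs `threads` ATM threads that each make `deposits_each` deposits of
// `amount`, and returns how often the box had to be emptied.
int simulate(int threads, int deposits_each, int amount, int max) {
    CashBox box(max);
    std::vector<std::thread> atms;
    for (int t = 0; t < threads; ++t) {
        atms.emplace_back([&box, deposits_each, amount] {
            for (int i = 0; i < deposits_each; ++i) {
                box.deposit(amount);
            }
        });
    }
    for (std::thread& a : atms) {
        a.join();
    }
    return box.times_emptied();
}
```

With 4 threads making 25 deposits of 100 each and a limit of 1000, the box fills after every tenth deposit regardless of how the threads interleave.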
My ray tracer is currently multithreaded. I'm basically dividing the image into as many chunks as the system has hardware threads and rendering them in parallel. However, not all chunks take the same time to render, so most of the time, half of the run time sees only 50% CPU usage.
Code
std::shared_ptr<bitmap_image> image = std::make_shared<bitmap_image>(WIDTH, HEIGHT);
auto nThreads = std::thread::hardware_concurrency();
std::cout << "Resolution: " << WIDTH << "x" << HEIGHT << std::endl;
std::cout << "Supersampling: " << SUPERSAMPLING << std::endl;
std::cout << "Ray depth: " << DEPTH << std::endl;
std::cout << "Threads: " << nThreads << std::endl;
std::vector<RenderThread> renderThreads(nThreads);
std::vector<std::thread> tt;
auto size = WIDTH*HEIGHT;
auto chunk = size / nThreads;
auto rem = size % nThreads;
//launch threads
for (unsigned i = 0; i < nThreads - 1; i++)
{
tt.emplace_back(std::thread(&RenderThread::LaunchThread, &renderThreads[i], i * chunk, (i + 1) * chunk, image));
}
tt.emplace_back(std::thread(&RenderThread::LaunchThread, &renderThreads[nThreads-1], (nThreads - 1)*chunk, nThreads*chunk + rem, image));
for (auto& t : tt)
t.join();
I would like to divide the image into 16x16 chunks or something similar and render them in parallel, so that after each chunk is rendered, the thread moves on to the next one, and so on. This would greatly improve CPU usage and run time.
How do I set up my ray tracer to render these 16x16 chunks in a multithreaded manner?
I assume the question is "How to distribute the blocks to the various threads?"
In your current solution, you're figuring out the regions ahead of time and assigning them to the threads. The trick is to turn this idea on its head. Make the threads ask for what to do next whenever they finish a chunk of work.
Here's an outline of what the threads will do:
void WorkerThread(Manager *manager) {
while (auto task = manager->GetTask()) {
task->Execute();
}
}
So you create a Manager object that returns a chunk of work (in the form of a Task) each time a thread calls its GetTask method. Since that method will be called from multiple threads, you have to be sure it uses appropriate synchronization.
std::unique_ptr<Task> Manager::GetTask() {
std::lock_guard<std::mutex> guard(mutex);
std::unique_ptr<Task> t;
if (next_row < HEIGHT) {
t = std::make_unique<Task>(next_row);
++next_row;
}
return t;
}
In this example, the manager creates a new task to ray trace the next row. (You could use 16x16 blocks instead of rows if you like.) When all the tasks have been issued, it just returns an empty pointer, which essentially tells the calling thread that there's nothing left to do, and the calling thread will then exit.
If you made all the Tasks in advance and had the manager dole them out as they are requested, this would be a typical "work queue" solution. (General work queues also allow new Tasks to be added on the fly, but you don't need that feature for this particular problem.)
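A compact, self-contained variant of the manager idea is sketched below. To keep it short it hands out row indices instead of Task objects, and "rendering" a row just marks it as done; RowManager and render_rows are hypothetical names chosen for this illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hands out row indices one at a time; returns -1 once every row is taken.
class RowManager {
public:
    explicit RowManager(int height) : height_(height) {}

    int next_row() {
        std::lock_guard<std::mutex> guard(mutex_);
        return next_ < height_ ? next_++ : -1;
    }

private:
    std::mutex mutex_;
    int next_ = 0;
    int height_;
};

// Processes `height` rows on `threads` workers and returns how many rows
// were processed exactly once. Real code would trace the row's rays where
// this sketch only increments a per-row counter.
int render_rows(int height, int threads) {
    RowManager manager(height);
    std::vector<int> hits(height, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&manager, &hits] {
            // Keep asking for work until the manager has nothing left.
            for (int row = manager.next_row(); row != -1;
                 row = manager.next_row()) {
                ++hits[row];  // each row is handed to exactly one thread
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    int once = 0;
    for (int h : hits) {
        if (h == 1) ++once;
    }
    return once;
}
```

A thread that draws cheap rows simply comes back for more, which is what evens out the load.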
I do this a bit differently:
Obtain the number of CPUs and/or cores
You did not specify the OS, so you need to use your OS API for this; search for "system affinity mask". (std::thread::hardware_concurrency(), which your code already calls, is a portable alternative.)
Divide the screen between threads
I divide the screen by lines instead of 16x16 blocks, so I do not need a queue or anything similar. Simply create a thread for each CPU/core that processes only the rays of its own horizontal lines. This is simple: each thread has an ID number counting from zero, and with n CPUs/cores the lines belonging to each thread are:
y = ID + i*n
where i={0,1,2,3,...}; stop once y is greater than or equal to the screen height. This type of access has its advantages: for example, accessing the screen buffer via ScanLines will not conflict between threads, as each thread accesses only its own lines.
I also set the affinity mask for each thread so that it uses only its own CPU/core. This gave me a small boost because there is less context switching (but that was on older OS versions; it is hard to say what it does now).
Synchronize threads
Basically, you should wait until all threads are finished, and then render the result on screen. Your threads can either stop, in which case you create new ones for the next frame, or go into a Sleep loop until rendering is requested again.
I use the latter approach so I do not need to create and configure the threads over and over again, but beware that Sleep(1) can sleep a lot longer than just 1 ms.
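The line assignment y = ID + i*n described above can be sketched as a small helper. The name rows_for_worker is hypothetical; a real worker would render each row rather than collect its index.

```cpp
#include <vector>

// Rows handled by worker `id` out of `n` workers for an image that is
// `height` rows tall: y = id + i*n for i = 0, 1, 2, ... while y < height.
std::vector<int> rows_for_worker(int id, int n, int height) {
    std::vector<int> rows;
    for (int y = id; y < height; y += n) {
        rows.push_back(y);
    }
    return rows;
}
```

For example, with 4 workers and 10 rows, worker 1 gets rows 1, 5 and 9. Interleaving spreads expensive regions of the image across all workers, though unlike a work queue it cannot rebalance if one worker's rows happen to be slower.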
I need to parallelize "while" loop by the means of PPL. I have the following code in Visual C++ in MS VS 2013.
int WordCount::CountWordsInTextFiles(basic_string<char> p_FolderPath, vector<basic_string<char>>& p_TextFilesNames)
{
// Word counter in all files.
atomic<unsigned> wordsInFilesTotally = 0;
// Critical section.
critical_section cs;
// Set specified folder as current folder.
::SetCurrentDirectory(p_FolderPath.c_str());
// Concurrent iteration through p_TextFilesNames vector.
parallel_for(size_t(0), p_TextFilesNames.size(), [&](size_t i)
{
// Create a stream to read from file.
ifstream fileStream(p_TextFilesNames[i]);
// Check if the file is opened
if (fileStream.is_open())
{
// Word counter in a particular file.
unsigned wordsInFile = 0;
// Read from file.
string word;
// Loop only while a word was actually extracted, so the final failed read is not counted.
while (fileStream >> word)
{
// Count total number of words in all files.
wordsInFilesTotally++;
// Count total number of words in a particular file.
wordsInFile++;
}
// Verify the values.
cs.lock();
cout << endl << "In file " << p_TextFilesNames[i] << " there are " << wordsInFile << " words" << endl;
cs.unlock();
}
});
// cs is destroyed automatically when it goes out of scope; calling its destructor explicitly would destroy it twice.
// Return total number of words in all files in the folder.
return wordsInFilesTotally;
}
This code iterates in parallel through a std::vector in the outer loop. Parallelism is provided by the concurrency::parallel_for() algorithm. But this code also has a nested "while" loop that reads from a file, and I need to parallelize that nested loop. How can this nested "while" loop be parallelized by means of PPL? Please help.
As user High Performance Mark hints in his comment, parallel reads from the same ifstream instance will cause undefined and incorrect behavior. (For some more discussion, see question "Is std::ifstream thread-safe & lock-free?".) You're basically at the parallelization limit here with this particular algorithm.
As a side note, even reading multiple different file streams in parallel will not really speed things up if they are all being read from the same physical volume. The disk hardware can only actually support so many parallel requests (typically not more than one at a time, queuing up any requests that come in while it is busy). For some more background, you might want to check out Mark Friedman's Top Six FAQs on Windows 2000 Disk Performance; the performance counters are Windows-specific, but most of the information is of general use.
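To make the "one stream, one thread" point concrete, here is a minimal sketch in which each task owns its stream and only the per-stream totals are combined. It is not PPL code: std::thread stands in for parallel_for and std::istringstream stands in for the files so that the example is self-contained, and the names count_words and count_words_in_texts are hypothetical.

```cpp
#include <atomic>
#include <istream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Counts whitespace-separated words in one stream. Sequential by design:
// a single stream is only ever read by one thread.
unsigned count_words(std::istream& in) {
    unsigned count = 0;
    std::string word;
    while (in >> word) {
        ++count;
    }
    return count;
}

// One thread per text. Each thread owns its stream and adds its local
// total to the shared atomic counter exactly once.
unsigned count_words_in_texts(const std::vector<std::string>& texts) {
    std::atomic<unsigned> total{0};
    std::vector<std::thread> workers;
    for (const std::string& text : texts) {
        workers.emplace_back([&total, &text] {
            std::istringstream stream(text);
            total += count_words(stream);
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return total.load();
}
```

Keeping a local count and updating the atomic once per stream also avoids contention on the shared counter inside the inner loop.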
This is my first post; I hope I am not making any mistakes.
I have the following code. I am trying to allocate and access a two-dimensional array in one shot and, more importantly, in one byte array. I also need to be able to access each sub-array individually, as shown in the code. It works fine in debug mode. In the release build in VS 2012, however, it causes problems at runtime when the compiler optimizations are applied. If I disable the release compiler optimizations, it works. Do I need to do some kind of special cast to inform the compiler?
My priorities in this code are fast allocation and network communication of the complete array, while at the same time working with its sub-arrays.
I prefer not to use boost.
Thanks a lot :)
void PrintBytes(char* x,byte* data,int length)
{
using namespace std;
cout<<x<<endl;
for( int i = 0; i < length; i++ )
{
std::cout << "0x" << std::setbase(16) << std::setw(2) << std::setfill('0');
std::cout << static_cast<unsigned int>( data[ i ] ) << " ";
}
std::cout << std::dec;
cout<<endl;
}
byte* set = new byte[SET_SIZE*input_size];
for (int i=0;i<SET_SIZE;i++)
{
sprintf((char*)&set[i*input_size], "M%06d", i+1);
}
PrintByte((byte*)&set[i*input_size]);