Wait for thread queue to be empty - multithreading

I am new to C++ and multithreading applications. I want to process a long list of data (potentially several thousands of entries) by dividing its entries among a few threads. I have retrieved a ThreadPool class and a Queue class from the web (it is my first time tackling the subject). I construct the threads and populate the queue in the following way (definitions at the end of the post):
ThreadPool *pool = new ThreadPool(8);
std::vector<std::function<void(int)>> *caller =
    new std::vector<std::function<void(int)>>;
for (size_t i = 0; i < Nentries; ++i)
{
    caller->push_back(
        [=](int j){ func(entries[i], j); });
    pool->PushTask((*caller)[i]);
}
delete pool;
The problem is that only as many entries as there are threads get processed, as if the program did not wait for the queue to be empty. Indeed, if I put
while (pool->GetWorkQueueLength()) {}
just before deleting the pool, the whole list is processed correctly. However, I am afraid that busy-waiting in a while loop wastes resources. Moreover, I have not seen anyone do anything like this, so I think this is the wrong approach and that the classes I use contain some error. Can anyone find the error (if there is one) or suggest another solution?
Here are the classes I use. I suppose the problem is in the implementation of the destructor, but I am not sure.
SynchronizedQueue.hh
#ifndef SYNCQUEUE_H
#define SYNCQUEUE_H

#include <list>
#include <mutex>
#include <condition_variable>

template<typename T>
class SynchronizedQueue
{
public:
    SynchronizedQueue();
    void Put(T const & data);
    T Get();
    size_t Size();

private:
    SynchronizedQueue(SynchronizedQueue const &) = delete;
    SynchronizedQueue & operator=(SynchronizedQueue const &) = delete;

    std::list<T> queue;
    std::mutex mut;
    std::condition_variable condvar;
};

template<typename T>
SynchronizedQueue<T>::SynchronizedQueue()
{}

template<typename T>
void SynchronizedQueue<T>::Put(T const & data)
{
    std::unique_lock<std::mutex> lck(mut);
    queue.push_back(data);
    condvar.notify_one();
}

template<typename T>
T SynchronizedQueue<T>::Get()
{
    std::unique_lock<std::mutex> lck(mut);
    while (queue.empty())
    {
        condvar.wait(lck);
    }
    T result = queue.front();
    queue.pop_front();
    return result;
}

template<typename T>
size_t SynchronizedQueue<T>::Size()
{
    std::unique_lock<std::mutex> lck(mut);
    return queue.size();
}

#endif
ThreadPool.hh
#ifndef THREADPOOL_H
#define THREADPOOL_H

#include "SynchronizedQueue.hh"

#include <atomic>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

class ThreadPool
{
public:
    ThreadPool(int nThreads = 0);
    virtual ~ThreadPool();
    void PushTask(std::function<void(int)> func);
    size_t GetWorkQueueLength();

private:
    void WorkerThread(int i);

    std::atomic<bool> done;
    unsigned int threadCount;
    SynchronizedQueue<std::function<void(int)>> workQueue;
    std::vector<std::thread> threads;
};

#endif
ThreadPool.cc
#include "ThreadPool.hh"
#include "SynchronizedQueue.hh"

void doNothing(int i)
{}

ThreadPool::ThreadPool(int nThreads)
    : done(false)
{
    if (nThreads <= 0)
    {
        threadCount = std::thread::hardware_concurrency();
    }
    else
    {
        threadCount = nThreads;
    }
    for (unsigned int i = 0; i < threadCount; ++i)
    {
        threads.push_back(std::thread(&ThreadPool::WorkerThread, this, i));
    }
}

ThreadPool::~ThreadPool()
{
    done = true;
    for (unsigned int i = 0; i < threadCount; ++i)
    {
        PushTask(&doNothing);
    }
    for (auto& th : threads)
    {
        if (th.joinable())
        {
            th.join();
        }
    }
}

void ThreadPool::PushTask(std::function<void(int)> func)
{
    workQueue.Put(func);
}

void ThreadPool::WorkerThread(int i)
{
    while (!done)
    {
        workQueue.Get()(i);
    }
}

size_t ThreadPool::GetWorkQueueLength()
{
    return workQueue.Size();
}

You can push special "done" tasks instead of setting done via an atomic variable, so that each thread exits by itself when it dequeues a "done" task, and no earlier. In the destructor you only need to push one such task per thread and then join the threads. This is called a "poison pill".
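For illustration, a minimal sketch of the poison-pill approach on top of the classes above, using an empty std::function as the pill (the choice of marker is an assumption, not part of the original code):
// Worker loop: run tasks until the poison pill (an empty std::function) is seen.
void ThreadPool::WorkerThread(int i)
{
    while (true)
    {
        std::function<void(int)> task = workQueue.Get();
        if (!task)        // an empty function object acts as the poison pill
            return;       // exit only now, never earlier
        task(i);
    }
}

// Destructor: one pill per worker, then join; no 'done' flag needed.
ThreadPool::~ThreadPool()
{
    for (unsigned int i = 0; i < threadCount; ++i)
        PushTask(std::function<void(int)>());
    for (auto& th : threads)
        if (th.joinable())
            th.join();
}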
Alternatively, if you insist on your current design with the done variable, you can wait on the same condition variable you already have:
std::unique_lock<std::mutex> lck(mut);
while (!queue.empty())
{
    condvar.wait(lck);
}
But then Get() will also have to notify the condition variable after popping (and you will want notify_all rather than notify_one), which may be sub-optimal.
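As a sketch of that alternative (WaitUntilEmpty is a hypothetical addition to SynchronizedQueue, not part of the original code):
// Drain support inside SynchronizedQueue (WaitUntilEmpty is a new, assumed member).
template<typename T>
T SynchronizedQueue<T>::Get()
{
    std::unique_lock<std::mutex> lck(mut);
    while (queue.empty())
        condvar.wait(lck);
    T result = queue.front();
    queue.pop_front();
    if (queue.empty())
        condvar.notify_all();   // wake anyone waiting for the queue to drain
    return result;
}

template<typename T>
void SynchronizedQueue<T>::WaitUntilEmpty()
{
    std::unique_lock<std::mutex> lck(mut);
    while (!queue.empty())
        condvar.wait(lck);
}
The ThreadPool destructor could then call workQueue.WaitUntilEmpty() before setting done and pushing the doNothing tasks. Keep in mind that an empty queue only means the last task has been dequeued, not that it has finished executing, so the join step is still needed.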

I want to process a long list of data (potentially several thousands of entries) by dividing its entries among a few threads.
You can do that with parallel algorithms, like tbb::parallel_for:
#include <tbb/parallel_for.h>
#include <vector>

void func(int entry);

int main()
{
    std::vector<int> entries(1000000);
    tbb::parallel_for(size_t{0}, entries.size(), [&](size_t i) { func(entries[i]); });
}
If you need sequential thread ids, you can do:
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
#include <atomic>
#include <vector>

void func(int element, int thread_id);

template<class C>
inline auto make_range(C& c) -> decltype(tbb::blocked_range<decltype(c.begin())>(c.begin(), c.end()))
{
    return tbb::blocked_range<decltype(c.begin())>(c.begin(), c.end());
}

int main()
{
    std::vector<int> entries(1000000);
    std::atomic<int> thread_counter{0};
    tbb::parallel_for(make_range(entries), [&](auto sub_range) {
        static thread_local int const thread_id = thread_counter.fetch_add(1, std::memory_order_relaxed);
        for (auto& element : sub_range)
            func(element, thread_id);
    });
}
Alternatively, there is std::this_thread::get_id.
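For example (a standalone sketch, independent of TBB), std::this_thread::get_id() returns an opaque per-thread identifier that can be hashed when a numeric value is wanted (it is not sequential):
#include <cstdio>
#include <functional>
#include <thread>

void func(int element)
{
    // Hash the opaque thread id if a numeric value is needed.
    std::size_t id_hash = std::hash<std::thread::id>{}(std::this_thread::get_id());
    std::printf("element %d handled by thread %zu\n", element, id_hash);
}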

Related

How to prematurely kill std::async threads before they are finished *without* using a std::atomic_bool?

I have a function that takes a callback, and I use it to do work on 10 separate threads. However, it is often the case that not all of the work is needed. For example, if the desired result is obtained on the third thread, all work being done on the remaining live threads should be stopped.
This answer here suggests that it is not possible unless the callback functions take an additional std::atomic_bool argument that signals whether the function should terminate prematurely.
This solution does not work for me. The workers are spun up inside a base class, and the whole point of this base class is to abstract away details of multithreading. How can I do this? I am anticipating that I will have to ditch std::async for something more involved.
#include <iostream>
#include <future>
#include <vector>

class ABC {
public:
    std::vector<std::future<int> > m_results;

    ABC() {};
    ~ABC() {};

    virtual int callback(int a) = 0;
    void doStuffWithCallBack();
};

void ABC::doStuffWithCallBack()
{
    // start working
    for (int i = 0; i < 10; ++i)
        m_results.push_back(std::async(&ABC::callback, this, i));

    // analyze results and cancel all threads when you get the 1
    for (int j = 0; j < 10; ++j) {
        double foo = m_results[j].get();
        if (foo == 1) {
            break; // but threads continue running
        }
    }

    std::cout << m_results[9].get() << " <- this shouldn't have ever been computed\n";
}

class Derived : public ABC {
public:
    Derived() : ABC() {};
    ~Derived() {};

    int callback(int a) {
        std::cout << a << "!\n";
        if (a == 3)
            return 1;
        else
            return 0;
    };
};

int main(int argc, char **argv)
{
    Derived myObj;
    myObj.doStuffWithCallBack();
    return 0;
}
I'll just say that this should probably not be a part of a 'normal' program, since it could leak resources and/or leave your program in an unstable state, but in the interest of science...
If you have control of the thread loop, and you don't mind using platform-specific features, you can inject an exception into the thread. On POSIX you can use signals for this; on Windows you would have to use SetThreadContext(). Although the exception will generally unwind the stack and call destructors, your thread may be in a system call or some other non-exception-safe place when the exception occurs.
Disclaimer: I only have Linux at the moment, so I did not test the Windows code.
#if defined(_WIN32)
# define ITS_WINDOWS
#else
# define ITS_POSIX
#endif

#include <cstdio>
#include <string>
#include <thread>
#include <chrono>

#if defined(ITS_WINDOWS)
#include <windows.h>
#endif
#if defined(ITS_POSIX)
#include <signal.h>
#endif

void throw_exception() throw(std::string())
{
    throw std::string();
}

void init_exceptions()
{
    volatile int i = 0;
    if (i)
        throw_exception();
}

bool abort_thread(std::thread &t)
{
#if defined(ITS_WINDOWS)
    bool bSuccess = false;
    HANDLE h = t.native_handle();
    if (INVALID_HANDLE_VALUE == h)
        return false;
    if (INFINITE == SuspendThread(h))
        return false;
    CONTEXT ctx;
    ctx.ContextFlags = CONTEXT_CONTROL;
    if (GetThreadContext(h, &ctx))
    {
#if defined(_WIN64)
        ctx.Rip = (DWORD)(DWORD_PTR)throw_exception;
#else
        ctx.Eip = (DWORD)(DWORD_PTR)throw_exception;
#endif
        bSuccess = SetThreadContext(h, &ctx) ? true : false;
    }
    ResumeThread(h);
    return bSuccess;
#elif defined(ITS_POSIX)
    pthread_kill(t.native_handle(), SIGUSR2);
#endif
    return false;
}

#if defined(ITS_POSIX)
void worker_thread_sig(int sig)
{
    if (SIGUSR2 == sig)
        throw std::string();
}
#endif

void init_threads()
{
#if defined(ITS_POSIX)
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = worker_thread_sig;
    sigaction(SIGUSR2, &sa, 0);
#endif
}

class tracker
{
public:
    tracker() { printf("tracker()\n"); }
    ~tracker() { printf("~tracker()\n"); }
};

int main(int argc, char *argv[])
{
    init_threads();
    printf("main: starting thread...\n");

    std::thread t([]()
    {
        try
        {
            tracker a;
            init_exceptions();
            printf("thread: started...\n");
            std::this_thread::sleep_for(std::chrono::minutes(1000));
            printf("thread: stopping...\n");
        }
        catch (std::string s)
        {
            printf("thread: exception caught...\n");
        }
    });

    printf("main: sleeping...\n");
    std::this_thread::sleep_for(std::chrono::seconds(2));
    printf("main: aborting...\n");
    abort_thread(t);
    printf("main: joining...\n");
    t.join();
    printf("main: exiting...\n");
    return 0;
}
Output:
main: starting thread...
main: sleeping...
tracker()
thread: started...
main: aborting...
main: joining...
~tracker()
thread: exception caught...
main: exiting...

thread safe boost intrusive list is slow

I wrapped a boost intrusive list with a mutex to make it thread-safe, to be used as a producer/consumer queue.
But on Windows (MSVC 14) it's really slow: after profiling, 95% of the time is spent idle, mainly in the push() and wait_and_pop() methods.
I only have 1 producer and 2 producer/consumer threads.
Any suggestions to make this faster?
#ifndef INTRUSIVE_CONCURRENT_QUEUE_HPP
#define INTRUSIVE_CONCURRENT_QUEUE_HPP

#include <thread>
#include <mutex>
#include <condition_variable>
#include <boost/intrusive/list.hpp>

using namespace boost::intrusive;

template<typename Data>
class intrusive_concurrent_queue
{
protected:
    list<Data, constant_time_size<false> > the_queue;
    mutable std::mutex the_mutex;
    std::condition_variable the_condition_variable;

public:
    void push(Data * data)
    {
        std::unique_lock<std::mutex> lock(the_mutex);
        the_queue.push_back(*data);
        lock.unlock();
        the_condition_variable.notify_one();
    }

    bool empty() const
    {
        std::unique_lock<std::mutex> lock(the_mutex);
        return the_queue.empty();
    }

    size_t unsafe_size() const
    {
        return the_queue.size();
    }

    size_t size() const
    {
        std::unique_lock<std::mutex> lock(the_mutex);
        return the_queue.size();
    }

    Data* try_pop()
    {
        Data* popped_ptr;
        std::unique_lock<std::mutex> lock(the_mutex);
        if (the_queue.empty())
        {
            return nullptr;
        }
        popped_ptr = &the_queue.front();
        the_queue.pop_front();
        return popped_ptr;
    }

    Data* wait_and_pop(const bool & exernal_stop = false)
    {
        Data* popped_ptr;
        std::unique_lock<std::mutex> lock(the_mutex);
        the_condition_variable.wait(lock, [&]{ return !(the_queue.empty() | exernal_stop); });
        if (exernal_stop)
        {
            return nullptr;
        }
        popped_ptr = &the_queue.front();
        the_queue.pop_front();
        return popped_ptr;
    }

    intrusive_concurrent_queue<Data> & operator=(intrusive_concurrent_queue<Data>&& origin)
    {
        this->the_queue = std::move(the_queue);
        return *this;
    }
};

#endif // !INTRUSIVE_CONCURRENT_QUEUE_HPP
Try removing the per-call locks from the methods and instead lock the whole data structure once while you do a batch of operations on it.
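A sketch of that idea (pop_all and the Data hook shown here are illustrative assumptions, not part of the original class): drain the whole list under a single lock, then let the consumer iterate it with no locking at all:
#include <mutex>
#include <boost/intrusive/list.hpp>

using namespace boost::intrusive;

struct Data : public list_base_hook<>   // assumption: Data carries an intrusive hook
{
    int value = 0;
};

using data_list = list<Data, constant_time_size<false> >;

class batched_queue
{
    data_list  the_queue;
    std::mutex the_mutex;

public:
    void push(Data* data)
    {
        std::lock_guard<std::mutex> lock(the_mutex);
        the_queue.push_back(*data);
    }

    // Drain everything in one locked operation; the caller then
    // iterates `out` without touching the mutex again.
    void pop_all(data_list& out)
    {
        std::lock_guard<std::mutex> lock(the_mutex);
        out.splice(out.end(), the_queue);   // O(1) transfer of all nodes
    }
};
With one producer and two consumers this turns many lock/unlock cycles per element into one per batch; whether it helps in practice depends on how bursty the producer is.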

C++ Qt: Redirect cout from a thread to emit a signal

In a single thread, I have this beautiful class that redirects all cout output to a QTextEdit
#include <iostream>
#include <streambuf>
#include <string>

#include <QScrollBar>
#include "QTextEdit"
#include "QDateTime"

class ThreadLogStream : public std::basic_streambuf<char>, QObject
{
    Q_OBJECT
public:
    ThreadLogStream(std::ostream &stream) : m_stream(stream)
    {
        m_old_buf = stream.rdbuf();
        stream.rdbuf(this);
    }

    ~ThreadLogStream()
    {
        // output anything that is left
        if (!m_string.empty())
        {
            log_window->append(m_string.c_str());
        }
        m_stream.rdbuf(m_old_buf);
    }

protected:
    virtual int_type overflow(int_type v)
    {
        if (v == '\n')
        {
            log_window->append(m_string.c_str());
            m_string.erase(m_string.begin(), m_string.end());
        }
        else
            m_string += v;
        return v;
    }

    virtual std::streamsize xsputn(const char *p, std::streamsize n)
    {
        m_string.append(p, p + n);

        long pos = 0;
        while (pos != static_cast<long>(std::string::npos))
        {
            pos = m_string.find('\n');
            if (pos != static_cast<long>(std::string::npos))
            {
                std::string tmp(m_string.begin(), m_string.begin() + pos);
                log_window->append(tmp.c_str());
                m_string.erase(m_string.begin(), m_string.begin() + pos + 1);
            }
        }
        return n;
    }

private:
    std::ostream &m_stream;
    std::streambuf *m_old_buf;
    std::string m_string;
    QTextEdit* log_window;
};
However, this does not work if cout is used from ANY other thread (QThread). This is because a QTextEdit may only be accessed from the main (GUI) thread, so one has to use signals and slots to transfer the data from the sub-thread to the main thread.
I would like to modify this class so that it emits a signal rather than writing to the widget directly. This requires the class to be a QObject. I tried inheriting from QObject in addition to std::basic_streambuf<char> and adding the Q_OBJECT macro to the body, but it did not compile.
Could you please help me achieve this? What should I do to get this class to emit signals that I can connect to, and that are thread-safe?
For those who need the full "working" answer, here it is. I just copied it because #GraemeRock asked for it.
#ifndef ThreadLogStream_H
#define ThreadLogStream_H

#include <iostream>
#include <streambuf>
#include <string>

#include <QScrollBar>
#include "QTextEdit"
#include "QDateTime"

class ThreadLogStream : public QObject, public std::basic_streambuf<char>
{
    Q_OBJECT
public:
    ThreadLogStream(std::ostream &stream) : m_stream(stream)
    {
        m_old_buf = stream.rdbuf();
        stream.rdbuf(this);
    }

    ~ThreadLogStream()
    {
        // output anything that is left
        if (!m_string.empty())
        {
            emit sendLogString(QString::fromStdString(m_string));
        }
        m_stream.rdbuf(m_old_buf);
    }

protected:
    virtual int_type overflow(int_type v)
    {
        if (v == '\n')
        {
            emit sendLogString(QString::fromStdString(m_string));
            m_string.erase(m_string.begin(), m_string.end());
        }
        else
            m_string += v;
        return v;
    }

    virtual std::streamsize xsputn(const char *p, std::streamsize n)
    {
        m_string.append(p, p + n);

        long pos = 0;
        while (pos != static_cast<long>(std::string::npos))
        {
            pos = static_cast<long>(m_string.find('\n'));
            if (pos != static_cast<long>(std::string::npos))
            {
                std::string tmp(m_string.begin(), m_string.begin() + pos);
                emit sendLogString(QString::fromStdString(tmp));
                m_string.erase(m_string.begin(), m_string.begin() + pos + 1);
            }
        }
        return n;
    }

private:
    std::ostream &m_stream;
    std::streambuf *m_old_buf;
    std::string m_string;

signals:
    void sendLogString(const QString& str);
};

#endif // ThreadLogStream_H
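For completeness, a typical way to consume this signal (logWindow and the connection site are illustrative assumptions): a queued connection guarantees that append() runs in the GUI thread even when cout is used from a worker thread:
// Somewhere in the main (GUI) thread; logWindow is an existing QTextEdit.
ThreadLogStream *logStream = new ThreadLogStream(std::cout);
QObject::connect(logStream, &ThreadLogStream::sendLogString,
                 logWindow, &QTextEdit::append,
                 Qt::QueuedConnection);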
The derivation needs to happen QObject-first:
class LogStream : public QObject, std::basic_streambuf<char> {
    Q_OBJECT
    ...
};
...
If the goal is to modify your code minimally, there's a simpler way: you don't need to inherit from QObject or emit signals at all if you know exactly which slot the data is going to. All you need to do is invoke the slot in a thread-safe way:
QMetaObject::invokeMethod(log_window, "append", Qt::QueuedConnection,
                          Q_ARG(QString, tmp.c_str()));
To speed things up, you can cache the method so that it doesn't have to be looked up every time:
class LogStream ... {
    QPointer<QTextEdit> m_logWindow;
    QMetaMethod m_append;

    LogStream(...) :
        m_logWindow(...),
        m_append(m_logWindow->metaObject()->method(
            m_logWindow->metaObject()->indexOfSlot("append(QString)"))) {
        ...
    }
};
You can then invoke it more efficiently:
m_append.invoke(m_logWindow, Qt::QueuedConnection, Q_ARG(QString, tmp.c_str()));
Finally, whenever you're holding pointers to objects whose lifetime is not under your control, it's helpful to use QPointer: it resets itself to null when the pointed-to object is destructed, so it never dangles and at least prevents you from dereferencing a stale pointer.

Differences between POSIX threads on OSX and LINUX?

Can anyone shed light on why, when the code below is compiled and run on OSX, the 'bartender' thread skips through sem_wait() in what seems like a random manner, whereas when it is compiled and run on a Linux machine sem_wait() holds the thread until the corresponding sem_post() is made, as would be expected?
I am currently learning not only POSIX threads but concurrency as a whole, so absolutely any comments, tips and insights are warmly welcomed...
Thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <semaphore.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>

//using namespace std;

#define NSTUDENTS    30
#define MAX_SERVINGS 100

void* student(void* ptr);
void get_serving(int id);
void drink_and_think();
void* bartender(void* ptr);
void refill_barrel();

// This shared variable gives the number of servings currently in the barrel
int servings = 10;

// Define here your semaphores and any other shared data
sem_t *mutex_stu;
sem_t *mutex_bar;

int main() {
    static const char *semname1 = "Semaphore1";
    static const char *semname2 = "Semaphore2";
    pthread_t tid;

    mutex_stu = sem_open(semname1, O_CREAT, 0777, 0);
    if (mutex_stu == SEM_FAILED)
    {
        fprintf(stderr, "%s\n", "ERROR creating semaphore semname1");
        exit(EXIT_FAILURE);
    }

    mutex_bar = sem_open(semname2, O_CREAT, 0777, 1);
    if (mutex_bar == SEM_FAILED)
    {
        fprintf(stderr, "%s\n", "ERROR creating semaphore semname2");
        exit(EXIT_FAILURE);
    }

    pthread_create(&tid, NULL, bartender, &tid);

    for (int i = 0; i < NSTUDENTS; ++i) {
        pthread_create(&tid, NULL, student, &tid);
    }

    pthread_join(tid, NULL);
    sem_unlink(semname1);
    sem_unlink(semname2);

    printf("Exiting the program...\n");
}

//Called by a student process. Do not modify this.
void drink_and_think() {
    // Sleep time in milliseconds
    int st = rand() % 10;
    sleep(st);
}

// Called by a student process. Do not modify this.
void get_serving(int id) {
    if (servings > 0) {
        servings -= 1;
    } else {
        servings = 0;
    }
    printf("ID %d got a serving. %d left\n", id, servings);
}

// Called by the bartender process.
void refill_barrel()
{
    servings = 1 + rand() % 10;
    printf("Barrel refilled up to -> %d\n", servings);
}

//-- Implement a synchronized version of the student
void* student(void* ptr) {
    int id = *(int*)ptr;
    printf("Started student %d\n", id);

    while (1) {
        sem_wait(mutex_stu);
        if (servings > 0) {
            get_serving(id);
        } else {
            sem_post(mutex_bar);
            continue;
        }
        sem_post(mutex_stu);
        drink_and_think();
    }
    return NULL;
}

//-- Implement a synchronized version of the bartender
void* bartender(void* ptr) {
    int id = *(int*)ptr;
    printf("Started bartender %d\n", id);
    //sleep(5);

    while (1) {
        sem_wait(mutex_bar);
        if (servings <= 0) {
            refill_barrel();
        } else {
            printf("Bar skipped sem_wait()!\n");
        }
        sem_post(mutex_stu);
    }
    return NULL;
}
The first time you run the program, you're creating named semaphores with initial values, but since your threads never exit (they're infinite loops), you never get to the sem_unlink calls to delete those semaphores. If you kill the program (with ctrl-C or any other way), the semaphores will still exist in whatever state they are in. So if you run the program again, the sem_open calls will succeed (because you don't use O_EXCL), but they won't reset the semaphore value or state, so they might be in some odd state.
So you should make sure to call sem_unlink when the program STARTS, before calling sem_open. Better yet, don't use named semaphores at all -- use sem_init to initialize a couple of unnamed semaphores instead. (Be aware, though, that unnamed POSIX semaphores are not implemented on OSX, where sem_init fails, so on that platform you would have to stick with named semaphores or use a different primitive.)
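For illustration, a minimal sketch of that fix in main(), under the assumption that the rest of the program is unchanged (O_EXCL is added so that a stale semaphore can never be reused silently):
/* Remove any leftover semaphores from a previous run, then create fresh ones. */
sem_unlink(semname1);
sem_unlink(semname2);

mutex_stu = sem_open(semname1, O_CREAT | O_EXCL, 0777, 0);
if (mutex_stu == SEM_FAILED) {
    perror("sem_open semname1");
    exit(EXIT_FAILURE);
}

mutex_bar = sem_open(semname2, O_CREAT | O_EXCL, 0777, 1);
if (mutex_bar == SEM_FAILED) {
    perror("sem_open semname2");
    exit(EXIT_FAILURE);
}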

Multithread Directory and File Search

I am new to semaphores and the concept of mutual exclusion. I am supposed to recursively search for a text string in files across directories using multithreading, with the number of threads given by the user.
The issue with this code is that it goes through one directory and then waits. I cannot figure out what is wrong. I am also getting a segmentation fault and cannot figure out why that happens.
#include <iostream>
#include <sys/wait.h>
#include <sys/types.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
#include <dirent.h>
#include <sys/stat.h>
#include <fstream>
#include <limits.h>
#include <stdlib.h>
#include <semaphore.h>

using namespace std;

#include <stdio.h>

int iDirectories = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
sem_t semaphore1;
char searchStringThread[PATH_MAX];
int directories = 0;

class directoryQueue
{
private:
    struct Node
    {
        char directoryPath[PATH_MAX];
        Node *next;
    };

    Node *front;
    Node *rear;
    Node *nodeCount;

public:
    directoryQueue(void)
    {
        front = NULL;
        rear = NULL;
        nodeCount = 0;
    }

    void Enqueue(char array[PATH_MAX])
    {
        Node *newNode;
        newNode = new Node;
        strcpy(newNode->directoryPath, array);
        newNode->next = NULL;
        if (isEmpty())
        {
            front = newNode;
            rear = newNode;
        }
        else
        {
            rear->next = newNode;
            rear = newNode;
        }
        nodeCount++;
    }

    char * Dequeue(void)
    {
        Node *temp;
        if (isEmpty())
            cout << "Error ! Empty Queue " << endl;
        else
        {
            char *deque;
            deque = new char[PATH_MAX];
            strcpy(deque, front->directoryPath);
            temp = front->next;
            front = temp;
            nodeCount--;
            return deque;
        }
    }

    bool isEmpty(void)
    {
        if (nodeCount)
            return false;
        else
            return true;
    }

    void makeNull(void)
    {
        while (!isEmpty())
        {
            Dequeue();
        }
    }

    ~directoryQueue(void)
    {
        makeNull();
    }
};

directoryQueue saveDirectory;

void *threadHandler(void *)
{
    int thpath_length;
    char thPath[PATH_MAX];
    char saveITDirectory[PATH_MAX];
    char itDirectory[PATH_MAX];
    int threadCount;
    struct dirent *iWalker;
    DIR *iDirectory;

    pthread_mutex_lock(&mutex);
    threadCount = iDirectories++;
    pthread_mutex_unlock(&mutex);

    sem_wait(&semaphore1);

    pthread_mutex_lock(&mutex);
    strcpy(itDirectory, saveDirectory.Dequeue());
    pthread_mutex_unlock(&mutex);

    iDirectory = opendir(itDirectory);
    if (iDirectory == NULL)
    {
        cout << "Error" << endl;
        cout << itDirectory << " Cannot be Opened" << endl;
        exit(10000);
    }

    while ((iWalker = readdir(iDirectory)) != NULL)
    {
        if (iWalker->d_type == DT_REG)
        {
            strcpy(saveITDirectory, iWalker->d_name);
            cout << itDirectory << "/" << endl;
            if (strcmp(saveITDirectory, "..") == 0 ||
                strcmp(saveITDirectory, ".") == 0)
            {
                continue;
            }
            else
            {
                thpath_length = snprintf(thPath, PATH_MAX, "%s/%s", itDirectory, saveITDirectory);
                cout << thPath << endl;
                if (thpath_length >= PATH_MAX)
                {
                    cout << "Path is too long" << endl;
                    exit(1000);
                }

                ifstream openFile;
                openFile.open(thPath);
                char line[1500];
                int currentLine = 0;
                if (openFile.is_open()) {
                    while (openFile.good()) {
                        currentLine++;
                        openFile.getline(line, 1500);
                        if (strstr(line, searchStringThread) != NULL) {
                            cout << thPath << ": " << currentLine << ": " << line << endl;
                            cout << "This was performed by Thread no. " << threadCount << endl;
                            cout << "ID :" << pthread_self();
                        }
                    }
                }
                openFile.close();
            }
        }

        if (closedir(iDirectory))
        {
            cout << "Unable to close " << itDirectory << endl;
            exit(1000);
        }
    }
}

void walkThroughDirectory(char directory_name[PATH_MAX], char searchString[PATH_MAX])
{
    DIR * directory;
    struct dirent * walker;
    char d_name[PATH_MAX];
    int path_length;
    char path[PATH_MAX];

    directory = opendir(directory_name);
    if (directory == NULL)
    {
        cout << "Error" << endl;
        cout << directory_name << " Cannot be Opened" << endl;
        exit(10000);
    }

    while ((walker = readdir(directory)) != NULL)
    {
        strcpy(d_name, walker->d_name);
        cout << directory_name << "/" << endl;
        if (strcmp(d_name, "..") == 0 ||
            strcmp(d_name, ".") == 0)
        {
            continue;
        }
        else
        {
            path_length = snprintf(path, PATH_MAX, "%s/%s", directory_name, d_name);
            cout << path << endl;
            if (path_length >= PATH_MAX)
            {
                cout << "Path is too long" << endl;
                exit(1000);
            }
            if (walker->d_type == DT_DIR)
            {
                pthread_mutex_lock(&mutex);
                saveDirectory.Enqueue(path);
                pthread_mutex_lock(&mutex);
                sem_post(&semaphore1);
                directories++;
                walkThroughDirectory(path, searchString);
            }
            else if (walker->d_type == DT_REG)
            {
                ifstream openFile;
                openFile.open(path);
                char line[1500];
                int currentLine = 0;
                if (openFile.is_open()) {
                    while (openFile.good()) {
                        currentLine++;
                        openFile.getline(line, 1500);
                        if (strstr(line, searchString) != NULL)
                            cout << path << ": " << currentLine << ": " << line << endl;
                    }
                }
                openFile.close();
            }
        }
    }

    if (closedir(directory))
    {
        cout << "Unable to close " << directory_name << endl;
        exit(1000);
    }
}

int main(int argc, char *argv[])
{
    char * name;
    cout << "Total Directories " << directories << endl;
    name = get_current_dir_name();
    cout << "Current Directory is: " << name << endl;

    sem_init(&semaphore1, 0, 0);
    strcpy(searchStringThread, argv[1]);
    int number_of_threads = atoi(argv[3]);
    pthread_t threads[number_of_threads];

    walkThroughDirectory(argv[2], argv[1]);

    pthread_mutex_lock(&mutex);
    saveDirectory.Enqueue(argv[2]);
    pthread_mutex_unlock(&mutex);
    sem_post(&semaphore1);

    for (int i = 0; i < number_of_threads; i++)
    {
        pthread_create(&threads[i], NULL, threadHandler, NULL);
    }

    for (int j = 0; j < number_of_threads; j++)
    {
        pthread_join(threads[j], NULL);
    }

    while (saveDirectory.isEmpty())
    {
        cout << "Queue is Empty" << endl;
        cout << "Exiting" << endl;
        exit(10000);
    }

    free(name);
    cout << "Total Directories " << directories << endl;
    return 0;
}
There's a simple bug where you lock a mutex twice instead of unlocking it when you're done:
pthread_mutex_lock(&mutex);
saveDirectory.Enqueue(path);
pthread_mutex_lock(&mutex);
should be:
pthread_mutex_lock(&mutex);
saveDirectory.Enqueue(path);
pthread_mutex_unlock(&mutex);
Note: this isn't to say that there aren't other problems - just that this is probably your immediate problem.
The biggest problem is that it looks like you put directories on the saveDirectory queue (so another thread can pull them off to work on them), then go ahead and process that same directory recursively in the thread that just put it on the queue. I think you'll need to give some more thought to how the work will be divided among the threads.
A couple of more minor comments:
you might want to consider using std::string if that's permitted. It should make some of your string handling simpler (you leak the memory returned from directoryQueue::Dequeue(), for example)
if the primary reason for the existence of the directoryQueue class is to hold work items for multiple threads, then maybe it should manage its own mutex so callers don't need to deal with that complexity (see the sketch below)
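To illustrate those two comments, here is a rough sketch (the names are made up) of a queue that holds std::string paths and manages its own mutex and condition variable, so the worker threads only ever call enqueue/dequeue:
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

class directory_queue
{
    std::queue<std::string>  items;
    std::mutex               m;
    std::condition_variable  cv;
    bool                     closed = false;   // set when no more work will arrive

public:
    void enqueue(std::string path)
    {
        {
            std::lock_guard<std::mutex> lock(m);
            items.push(std::move(path));
        }
        cv.notify_one();
    }

    void close()                               // lets waiting workers exit
    {
        {
            std::lock_guard<std::mutex> lock(m);
            closed = true;
        }
        cv.notify_all();
    }

    // Returns false once the queue is closed and drained.
    bool dequeue(std::string& out)
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this]{ return !items.empty() || closed; });
        if (items.empty())
            return false;
        out = std::move(items.front());
        items.pop();
        return true;
    }
};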
