I'm going to implement a boost::asio server with a thread pool using a single io_service (as in the HTTP Server 3 example). The io_service will be bound to a unix domain socket and will pass requests arriving over connections on this socket to different threads. In order to reduce resource consumption I want to make the thread pool dynamic.
Here is the concept. First a single thread is created. When a request arrives and the server sees that there is no idle thread in the pool, it creates a new thread and passes the request to it. The server can create up to some maximum number of threads. Ideally it should also have the functionality of suspending threads which are idle for some time.
Has anybody done something similar, or does anybody have a relevant example?
As for me, I guess I would have to somehow override io_service.dispatch to achieve that.
There may be a few challenges with the initial approach:
boost::asio::io_service is not intended to be derived from or reimplemented. Note the lack of virtual functions.
If your thread library does not provide the ability to query a thread's state, then state information needs to be managed separately.
An alternative solution is to post a job into the io_service, then check how long it sat in the io_service. If the time delta between when it was ready-to-run and when it actually ran is above a certain threshold, then this indicates there are more jobs in the queue than threads servicing the queue. A major benefit of this is that the dynamic thread pool growth logic becomes decoupled from other logic.
Here is an example that accomplishes this by using the deadline_timer:
1. Set the deadline_timer to expire 3 seconds from now.
2. Asynchronously wait on the deadline_timer. The handler will be ready-to-run 3 seconds from when the deadline_timer was set.
3. In the asynchronous handler, check the current time relative to when the timer was set to expire. If it is greater than 2 seconds, then the io_service queue is backing up, so add a thread to the thread pool.
Example:
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
#include <iostream>
class thread_pool_checker
: private boost::noncopyable
{
public:
thread_pool_checker( boost::asio::io_service& io_service,
boost::thread_group& threads,
unsigned int max_threads,
long threshold_seconds,
long periodic_seconds )
: io_service_( io_service ),
timer_( io_service ),
threads_( threads ),
max_threads_( max_threads ),
threshold_seconds_( threshold_seconds ),
periodic_seconds_( periodic_seconds )
{
schedule_check();
}
private:
void schedule_check();
void on_check( const boost::system::error_code& error );
private:
boost::asio::io_service& io_service_;
boost::asio::deadline_timer timer_;
boost::thread_group& threads_;
unsigned int max_threads_;
long threshold_seconds_;
long periodic_seconds_;
};
void thread_pool_checker::schedule_check()
{
// Thread pool is already at max size.
if ( max_threads_ <= threads_.size() )
{
std::cout << "Thread pool has reached its max. Example will shutdown."
<< std::endl;
io_service_.stop();
return;
}
// Schedule check to see if pool needs to increase.
std::cout << "Will check if pool needs to increase in "
<< periodic_seconds_ << " seconds." << std::endl;
timer_.expires_from_now( boost::posix_time::seconds( periodic_seconds_ ) );
timer_.async_wait(
boost::bind( &thread_pool_checker::on_check, this,
boost::asio::placeholders::error ) );
}
void thread_pool_checker::on_check( const boost::system::error_code& error )
{
// On error, return early.
if ( error ) return;
// Check how long this job was waiting in the service queue. This
// returns the expiration time relative to now. Thus, if it expired
// 7 seconds ago, then the delta time is -7 seconds.
boost::posix_time::time_duration delta = timer_.expires_from_now();
long wait_in_seconds = -delta.seconds();
// If the time delta is greater than the threshold, then the job
// remained in the service queue for too long, so increase the
// thread pool.
std::cout << "Job job sat in queue for "
<< wait_in_seconds << " seconds." << std::endl;
if ( threshold_seconds_ < wait_in_seconds )
{
std::cout << "Increasing thread pool." << std::endl;
threads_.create_thread(
boost::bind( &boost::asio::io_service::run,
&io_service_ ) );
}
// Otherwise, schedule another pool check.
schedule_check();
}
// Busy work functions.
void busy_work( boost::asio::io_service&,
unsigned int );
void add_busy_work( boost::asio::io_service& io_service,
unsigned int count )
{
io_service.post(
boost::bind( busy_work,
boost::ref( io_service ),
count ) );
}
void busy_work( boost::asio::io_service& io_service,
unsigned int count )
{
boost::this_thread::sleep( boost::posix_time::seconds( 5 ) );
count += 1;
// When the count is 3, spawn additional busy work.
if ( 3 == count )
{
add_busy_work( io_service, 0 );
}
add_busy_work( io_service, count );
}
int main()
{
using boost::asio::ip::tcp;
// Create io service.
boost::asio::io_service io_service;
// Add some busy work to the service.
add_busy_work( io_service, 0 );
// Create thread group and thread_pool_checker.
boost::thread_group threads;
thread_pool_checker checker( io_service, threads,
3, // Max pool size.
2, // Create thread if job waits for 2 sec.
3 ); // Check if pool needs to grow every 3 sec.
// Start running the io service.
io_service.run();
threads.join_all();
return 0;
}
Output:
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 7 seconds.
Increasing thread pool.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 0 seconds.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 4 seconds.
Increasing thread pool.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 0 seconds.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 0 seconds.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 0 seconds.
Will check if pool needs to increase in 3 seconds.
Job sat in queue for 3 seconds.
Increasing thread pool.
Thread pool has reached its max. Example will shut down.
Related
I'm doing a project for my OS exam. The picture is this: I have a process divided into:
1 producer thread that pushes messages into a queue.
n consumer threads that pop messages from the head of the queue.
1 collector thread that tells the producer to generate another set of tasks.
My initial design in pseudocode is below (if you need real code I can make a .tar with a Makefile and all headers...).
producer
while(1) {
pushTasks(); /* push messages in the queue */
waitCollector(); /* wait on a CV the signal of the collector */
broadcastToWorkers(); /* tells to all workers to start a new elaboration */
}
all consumers do
while(1)
{
while(1)
{
popTask();
doTask();
if(message of end of stream) break;
}
signalToCollector(); /* increment a count and signal on a CV */
waitProducer(); /* wait signal from producer to restart the elaboration */
}
The collector has to synchronize time (imagine each unit of time occurs when all tasks are done):
while(1) {
doStuff();
waitWorkers(); /* each worker increments a count when it's done... */
signalToProducer(); /* tells the producer to generate new tasks*/
}
But I get a deadlock somewhere. How many mutexes and condition variables should I use? Under what conditions should each thread call pthread_cond_signal or pthread_cond_wait?
I am trying to find out the number of CPU cores using threads in C. Someone told me to execute 40 threads at once, make every thread sleep for one second, and see how many are executed simultaneously. I really like his approach; the problem is that after executing my code, the program sleeps for 1 second, and after that all threads are launched at once (no sleeping involved).
Can someone please help me out?
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define N 40

pthread_t th[N];

/* pthread_create expects a start routine of type void *(*)(void *) */
void *func(void *arg)
{
    int n = (int)(long)arg;
    sleep(1);
    printf("Exec nr:%d\n", n);
    return NULL;
}

int main(void) {
    int i;
    time_t rawtime;
    struct tm *timeinfo;
    time(&rawtime);
    timeinfo = localtime(&rawtime);
    printf("Current local time and date: %s", asctime(timeinfo));
    for (i = 0; i < N; i++)
    {
        pthread_create(&th[i], NULL, func, (void *)(long)i);
    }
    for (i = 0; i < N; i++)
    {
        pthread_join(th[i], NULL);
    }
    time_t rawtime2;
    struct tm *timeinfo2;
    time(&rawtime2);
    timeinfo2 = localtime(&rawtime2);
    printf("Current local time and date: %s", asctime(timeinfo2));
    return 0;
}
You won't be able to discover the number of CPUs this way, because the scheduler may choose to run all your threads on the same core and leave the other cores for (more important) stuff.
Therefore, you should rely on some functionality provided by your OS.
For example, on Linux the file /proc/cpuinfo provides information about the CPU. This file can be opened and parsed by any user-level program. Other operating systems provide different mechanisms.
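If C++ is an option, another portable route (a minimal sketch of my own, not part of the original answer) is std::thread::hardware_concurrency(), which asks the runtime instead of parsing /proc/cpuinfo; note that it may return 0 when the value is not computable:
#include <iostream>
#include <thread>

int main()
{
    unsigned int cores = std::thread::hardware_concurrency();
    if (cores == 0)
        std::cout << "Core count not available from the runtime." << std::endl;
    else
        std::cout << "Hardware concurrency: " << cores << std::endl;
    return 0;
}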
I have N threads performing various tasks, and these threads must be regularly synchronized with a thread barrier, as illustrated below with 3 threads and 8 tasks. The || indicates the temporal barrier; all threads have to wait until the completion of 8 tasks before starting again.
Thread#1 |----task1--|---task6---|---wait-----||-taskB--| ...
Thread#2 |--task2--|---task5--|-------taskE---||----taskA--| ...
Thread#3 |-task3-|---task4--|-taskG--|--wait--||-taskC-|---taskD ...
I couldn’t find a workable solution, though The Little Book of Semaphores http://greenteapress.com/semaphores/index.html was inspiring. I came up with a solution, shown below, which “seems” to be working using three std::atomic variables.
I am worried about my code breaking down in corner cases, hence the quoted verb. So can you share advice on verifying such code? Do you have simpler, foolproof code available?
std::atomic<int> barrier1(0);
std::atomic<int> barrier2(0);
std::atomic<int> barrier3(0);
void my_thread()
{
while(1) {
// pop task from queue
...
// and execute task
switch(task.id()) {
case TaskID::Barrier:
barrier2.store(0);
barrier1++;
while (barrier1.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier3.store(0);
barrier2++;
while (barrier2.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier1.store(0);
barrier3++;
while (barrier3.load() != NUM_THREAD) {
std::this_thread::yield();
}
break;
case TaskID::Task1:
...
}
}
}
Boost offers a barrier implementation as an extension to the C++11 standard thread library. If using Boost is an option, you should look no further than that.
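For illustration, here is a minimal usage sketch of boost::barrier (my addition, assuming Boost.Thread is available and linked; boost::barrier(n) blocks each caller of wait() until n threads have arrived):
#include <boost/thread/barrier.hpp>
#include <iostream>
#include <thread>
#include <vector>

int main()
{
    unsigned int const num_threads = 3;
    boost::barrier sync_point( num_threads ); // rendezvous point for all threads
    std::vector<std::thread> workers;
    for ( unsigned int i = 0; i < num_threads; ++i )
    {
        workers.emplace_back( [&sync_point, i]()
        {
            // ... per-thread work for this cycle ...
            sync_point.wait(); // blocks until all num_threads threads have arrived
            std::cout << "thread " << i << " passed the barrier" << std::endl;
        } );
    }
    for ( auto& t : workers ) t.join();
    return 0;
}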
If you have to rely on standard library facilities, you can roll your own implementation based on std::mutex and std::condition_variable without too much of a hassle.
#include <condition_variable>
#include <mutex>

class Barrier {
int wait_count;
int const target_wait_count;
std::mutex mtx;
std::condition_variable cond_var;
public:
Barrier(int threads_to_wait_for)
: wait_count(0), target_wait_count(threads_to_wait_for) {}
void wait() {
std::unique_lock<std::mutex> lk(mtx);
++wait_count;
if(wait_count != target_wait_count) {
// not all threads have arrived yet; go to sleep until they do
cond_var.wait(lk,
[this]() { return wait_count == target_wait_count; });
} else {
// we are the last thread to arrive; wake the others and go on
cond_var.notify_all();
}
// note that if you want to reuse the barrier, you will have to
// reset wait_count to 0 now before calling wait again
// if you do this, be aware that the reset must be synchronized with
// threads that are still stuck in the wait
}
};
This implementation has the advantage over your atomics-based solution that threads waiting in condition_variable::wait should get sent to sleep by your operating system's scheduler, so you don't block CPU cores by having waiting threads spin on the barrier.
A few words on resetting the barrier: The simplest solution is to just have a separate reset() method and have the user ensure that reset and wait are never invoked concurrently. But in many use cases, this is not easy to achieve for the user.
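A minimal sketch of what such a manual reset member could look like (my addition, not part of the original answer; it assumes the caller guarantees that no thread is inside wait() while it runs):
void reset() {
    std::lock_guard<std::mutex> lk(mtx);
    wait_count = 0;
}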
For a self-resetting barrier, you have to consider races on the wait count: if the wait count is reset before the last thread has returned from wait, some threads might get stuck in the barrier. A clever solution here is to not have the terminating condition depend on the wait count variable itself. Instead you introduce a second counter that is only increased by the thread doing the notify. The other threads then observe that counter for changes to determine whether to exit the wait:
void wait() {
std::unique_lock<std::mutex> lk(mtx);
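// note: m_inter_wait_count is an additional member of Barrier, e.g. unsigned int m_inter_wait_count = 0;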
unsigned int const current_wait_cycle = m_inter_wait_count;
++wait_count;
if(wait_count != target_wait_count) {
// wait condition must not depend on wait_count
cond_var.wait(lk,
[this, current_wait_cycle]() {
return m_inter_wait_count != current_wait_cycle;
});
} else {
// increasing the second counter allows waiting threads to exit
++m_inter_wait_count;
cond_var.notify_all();
}
}
This solution is correct under the (very reasonable) assumption that all threads leave the wait before m_inter_wait_count overflows.
With atomic variables, using three of them for a barrier is simply overkill that only serves to complicate the issue. You know the number of threads, so you can simply atomically increment a single counter every time a thread enters the barrier, and then spin until the counter becomes greater than or equal to N. Something like this:
void barrier(int N) {
static std::atomic<unsigned int> gCounter(0);
gCounter++;
while((int)(gCounter - N) < 0) std::this_thread::yield();
}
If you don't have more threads than CPU cores and you expect a short waiting time, you might want to remove the call to std::this_thread::yield(). This call is likely to be really expensive (more than a microsecond, I'd wager, but I haven't measured it). Depending on the size of your tasks, this may be significant.
If you want to do repeated barriers, just increment the N as you go:
unsigned int lastBarrier = 0;
while(1) {
switch(task.id()) {
case TaskID::Barrier:
barrier(lastBarrier += processCount);
break;
}
}
I would like to point out that in the solution given by @ComicSansMS, wait_count should be reset to 0 before executing cond_var.notify_all(). This is because when the barrier is called a second time, the if condition will always fail if wait_count is not reset to 0.
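Applied to the self-resetting variant above (whose wait predicate checks m_inter_wait_count rather than wait_count), the adjusted branch could look like this (a sketch, not verbatim from the answer):
} else {
    // we are the last thread to arrive; reset the count for the next cycle
    // *before* waking the others, then signal completion via the cycle counter
    wait_count = 0;
    ++m_inter_wait_count;
    cond_var.notify_all();
}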
I have a main thread which creates another thread to perform some job.
The main thread has a reference to that thread. How do I kill that thread forcefully some time later, even if the thread is still working? I can't find a proper function call that does that.
Any help would be appreciated.
The original problem I want to solve is this: I created a thread to perform a CPU-bound operation that may take 1 second to complete or maybe 10 hours; I can't predict how long it is going to take. If it is taking too much time, I want it to gracefully abandon the job when/if I decide. Can I somehow communicate this message to that thread?
Assuming you're talking about a GLib.Thread, you can't. Even if you could, you probably wouldn't want to, since you would likely end up leaking a significant amount of memory.
What you're supposed to do is request that the thread kill itself. Generally this is done by using a variable to indicate whether or not it has been requested that the operation stop at the earliest opportunity. GLib.Cancellable is designed for this purpose, and it integrates with the I/O operations in GIO.
Example:
private static int main (string[] args) {
GLib.Cancellable cancellable = new GLib.Cancellable ();
new GLib.Thread<int> (null, () => {
try {
for ( int i = 0 ; i < 16 ; i++ ) {
cancellable.set_error_if_cancelled ();
GLib.debug ("%d", i);
GLib.Thread.usleep ((ulong) GLib.TimeSpan.MILLISECOND * 100);
}
return 0;
} catch ( GLib.Error e ) {
GLib.warning (e.message);
return -1;
}
});
GLib.Thread.usleep ((ulong) GLib.TimeSpan.SECOND);
cancellable.cancel ();
/* Make sure the thread has some time to cancel. In an application
* with a UI you probably wouldn't need to do this artificially,
* since the entire application probably wouldn't exit immediately
* after cancelling the thread (otherwise why bother cancelling the
* thread? Just exit the program) */
GLib.Thread.usleep ((ulong) GLib.TimeSpan.MILLISECOND * 150);
return 0;
}
I have a problem understanding how the WinAPI condition variables work.
More specifically, what I want is a couple of threads waiting on some condition. Then I want to use the WakeAllConditionVariable() call to wake up all the threads so that they can do work. Beyond the fact that I just want the threads started, there isn't any other prerequisite for them to start working (like you would have in an n producer / n consumer scenario).
Here's the code so far:
#define MAX_THREADS 4
CONDITION_VARIABLE start_condition;
SRWLOCK cond_rwlock;
bool wake_all;
__int64 start_times[MAX_THREADS];
Main thread:
int main()
{
HANDLE h_threads[ MAX_THREADS ];
int tc;
for (tc = 0; tc < MAX_THREADS; tc++)
{
DWORD tid;
h_threads[tc] = CreateThread(NULL,0,(LPTHREAD_START_ROUTINE)thread_routine,(void*)tc,0,&tid);
if( h_threads[tc] == NULL )
{
cout << "Error while creating thread with index " << tc << endl;
continue;
}
}
InitializeSRWLock( &cond_rwlock );
InitializeConditionVariable( &start_condition );
AcquireSRWLockExclusive( &cond_rwlock );
// set the flag to true, then wake all threads
wake_all = true;
WakeAllConditionVariable( &start_condition );
ReleaseSRWLockExclusive( &cond_rwlock );
WaitForMultipleObjects( tc, h_threads, TRUE, INFINITE );
return 0;
}
And here is the code for the thread routine:
DWORD thread_routine( PVOID p_param )
{
int t_index = (int)(p_param);
AcquireSRWLockShared( &cond_rwlock );
// main thread sets wake_all to true and calls WakeAllConditionVariable()
// so this thread should start doing the work (?)
while ( !wake_all )
SleepConditionVariableSRW( &start_condition,&cond_rwlock, INFINITE,CONDITION_VARIABLE_LOCKMODE_SHARED );
QueryPerformanceCounter((LARGE_INTEGER*)&start_times[t_index]);
// do the actual thread related work here
return 0;
}
This code does not do what I would expect it to do. Sometimes just one thread finishes the job, sometimes two or three, but never all of them. The main function never gets past the WaitForMultipleObjects() call.
I'm not exactly sure what I've done wrong, but I assume it is some synchronization issue somewhere?
Any help would be appreciated. (Sorry if I re-posted an older topic in different dressing. :)
You initialize the cond_rwlock and start_condition variables too late. Move the code up, before you start the threads. A thread is likely to start running right away, especially on a multi-core machine.
And test the return values of API functions. You don't know why it doesn't work because you never check for failure.
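For illustration, a minimal sketch of main() with the initialization moved up (reusing the globals and thread_routine from the question; the return values of the other API calls should still be checked):
int main()
{
    HANDLE h_threads[ MAX_THREADS ];
    int tc;
    // Initialize the lock and the condition variable *before* creating any
    // thread, since a new thread may start running immediately.
    InitializeSRWLock( &cond_rwlock );
    InitializeConditionVariable( &start_condition );
    wake_all = false;
    for ( tc = 0; tc < MAX_THREADS; tc++ )
    {
        DWORD tid;
        h_threads[tc] = CreateThread( NULL, 0,
            (LPTHREAD_START_ROUTINE)thread_routine, (void*)tc, 0, &tid );
        if ( h_threads[tc] == NULL )
        {
            cout << "Error while creating thread with index " << tc << endl;
            continue;
        }
    }
    AcquireSRWLockExclusive( &cond_rwlock );
    // Set the flag to true, then wake all waiting threads.
    wake_all = true;
    WakeAllConditionVariable( &start_condition );
    ReleaseSRWLockExclusive( &cond_rwlock );
    WaitForMultipleObjects( tc, h_threads, TRUE, INFINITE );
    return 0;
}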