I'm trying to create a semaphore for multiple threads; I need to run only one thread at a time. I'm declaring the semaphore in my dialog file:
ghSemaphore = CreateSemaphore(NULL, 1, 1, NULL); // only one thread at once
Before I start the thread I call:
WaitForSingleObject(ghSemaphore, 0L);
Before the thread ends I call:
ReleaseSemaphore(ghSemaphore, 1, NULL);
It starts all threads instead of one thread at a time. Any idea, please?
Thanks a lot!
You say "before i start the thread..(you acquire the semaphore)" - so always in the same (main) thread?
I think, the semaphore restrict its acquisition to only one thread (which here would be the main thread), so the acquisition needed to be placed inside the (child) threads, to only allow one of them to run concurrently.
You have to create the single one semaphore in the parent thread and pass a reference to it to the child threads. Once one child thread is released from Wait..() the semaphore blocks concurrent threads until the first one releases the semaphore and some next child thread is allowed to run. However, all child threads do run concurrently until their call of the Wait..().
Btw: Why do you create multiple threads if you actually want only one thread to run at any time (until it terminates)?
Regarding the scope in which to create the semaphore: from the info you provided it looks OK to have one single semaphore at application level. However, I would recommend passing it to the child threads as a parameter at thread start (instead of referring to a global variable), so the child threads are independent of that choice of scope. If you ever need to handle multiple independent groups of such child threads, you can easily switch to creating one semaphore per group just before its threads are created (the other option you mentioned). If you create semaphores on the fly, be sure to close them again once all threads have terminated.
So, for now, it's best to create one application-wide ("global") semaphore.
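A minimal sketch (C, Win32) of that arrangement: the semaphore is created once in the parent before any worker exists and handed to each child thread via CreateThread's parameter, and the Wait/Release pair lives inside the thread function. Names such as WorkerProc and numThreads are only placeholders.

#include <windows.h>

DWORD WINAPI WorkerProc(LPVOID param)
{
    HANDLE hSem = (HANDLE)param;
    WaitForSingleObject(hSem, INFINITE);  // block here until this thread may run
    // ... do the actual work ...
    ReleaseSemaphore(hSem, 1, NULL);      // let the next waiting thread run
    return 0;
}

void StartWorkers(int numThreads)
{
    // create the single semaphore before starting any worker, then pass it down
    HANDLE hSem = CreateSemaphore(NULL, 1, 1, NULL);  // at most one running worker
    for (int i = 0; i < numThreads; ++i)
        CreateThread(NULL, 0, WorkerProc, hSem, 0, NULL);
}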
I have a question about using fork() in a multi-threaded process.
If a process has multiple threads (already created using pthread_create, followed by pthread_join) and I call fork(), will the child process get copies of the functions assigned to those threads, or does it create a space where we can reassign the functions?
Read carefully what POSIX says about fork() and threads. In particular:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
The child process will have a single thread running in the context of the calling thread. Other parts of the original process may be tied up by threads that no longer exist (so mutexes may be locked, for example).
The rationale section (further down the linked page) says:
There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.
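To illustrate that last point, here is a small sketch (my own illustration, not taken from the standard) of the one pattern that stays safe in a multi-threaded parent: fork() immediately followed by an exec function, with the child doing nothing but async-signal-safe calls in between.

#include <sys/types.h>
#include <unistd.h>

/* Spawn a new program from a possibly multi-threaded parent. The child
 * performs only async-signal-safe operations (execvp, _exit) before the
 * new program image replaces it. */
pid_t spawn_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);   /* replaces the single-threaded child */
        _exit(127);              /* only reached if exec failed */
    }
    return pid;                  /* parent: child pid, or -1 if fork failed */
}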
It is well known that the default way to create a new process under POSIX is to use fork() (under Linux this internally maps to clone(...)).
What I want to know is the following: it is also well known that when one calls fork(), "The child process is created with a single thread--the one that called fork()"
(cf. https://linux.die.net/man/2/fork). This can of course cause problems if, for example, some other thread currently holds a lock. To me, not also forking all the threads that exist in the process intuitively feels like a "leaky abstraction".
So I would like to know: What is the reason why only the thread calling fork() will exist in the child process instead of all threads of the process? Is there a good technical reason for this?
I know there is a related question, Multithreaded fork, but the answers given there don't answer mine.
Of these two possibilities:
1. Only the thread calling fork() continues running in the child process.
   Downside: if another thread was holding on to an internal resource such as a lock, it will not be released.
2. After fork(), all threads are duplicated into the child process.
   Downside: threads that were interacting with external resources continue running in parallel. If a thread was appending data to a file, now it happens twice.
Both are bad, but the first choice only deadlocks the new child process, while the second choice results in corruption outside of the process. This could be described as "bad".
POSIX did standardize pthread_atfork to try to allow automatic cleanup in the first case, but it cannot possibly work.
tl;dr: Don't mix threads and fork(). Use posix_spawn() if you have to.
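For illustration, a minimal sketch of what posix_spawn usage can look like in place of a fork()/exec() pair (running "ls -l" is just an arbitrary example command; error handling is trimmed):

#include <spawn.h>
#include <stdio.h>
#include <sys/wait.h>

extern char **environ;

int main(void)
{
    pid_t pid;
    char *argv[] = { "ls", "-l", NULL };

    /* no fork() in the threaded parent: the child is created and exec'd in one step */
    int rc = posix_spawnp(&pid, "ls", NULL, NULL, argv, environ);
    if (rc != 0) {
        fprintf(stderr, "posix_spawnp failed: %d\n", rc);
        return 1;
    }
    waitpid(pid, NULL, 0);   /* reap the child as usual */
    return 0;
}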
In Pthread programming we can assign a task to a worker thread by calling the pthread_create function and passing it a function and its argument, but I want to assign a job to a previously created thread instead of creating a new one with pthread_create. What can I do for that?
You can "communicate" a new task to existing threads. Let existing threads wait for a signal (using pthread_cond_wait()). When you have a new task, you can store the task in a common storage, and then simply signal the worker threads (using pthtread_cond_signal()). This approach works well, when you have a pool of worker threads that are waiting for incoming tasks. When you signal, only one thread will wake up (the pthread_cond_wait is tied to a mutex and so only one of them re-acquires the mutex) and the remaining threads will continue to wait.
I'm thinking of using a semaphore as a pause mechanism for a pool of worker threads like so:
// main thread
for N jobs:
    semaphore.release()
    create and start worker

// worker thread
while (not done)
    semaphore.acquire()
    do_work
    semaphore.release()
Now, if I want to pause all workers, I can acquire the entire count available in the semaphore. I'm wondering if that is better than:
if (paused)
    paused_mutex.lock
    wait for condition (paused_mutex)
do_work
Or is there a better solution?
I guess one downside of doing it with the semaphore is that the main thread will block until all workers release. In my case, the unit of work per iteration is very small so that probably won't be a problem.
Update: to clarify, my workers are database backups that act like file copies. The while (not done) loop quits when the file has been successfully copied. So to relate it to the traditional worker-waits-for-condition-to-get-work pattern: my workers wait for a needed file copy, and the while loop you see is doing the work requested. You could think of my do_work above as do_piece_of_work.
The problem with the semaphore approach is that the worker threads have to constantly check for work. They are eating up all the available CPU resources. It is better to use a mutex and a condition (signalling) variable (as in your second example) so that the threads are woken up only when they have something to do.
It is also better to hold the mutex for as short a time as possible. The traditional way to do this is to create a WORK QUEUE and to use the mutex to synchronize queue inserts and removals. The main thread inserts into the work queue and wakes up a worker. The worker acquires the mutex, removes an item from the queue, then releases the mutex. NOW the worker performs the action. This maximizes the concurrency between the worker threads and the main thread.
Here is an example:
// main thread
create condition variable
create mutex
for N jobs:
    create and start worker
while (there is work to hand out)
    // we have something to do
    create work item
    mutex.acquire();
    insert_work_into_queue(item);
    mutex.release();
    // tell the workers
    signal_condition_variable();

// worker thread
while (true)
    mutex.acquire();
    while (queue is empty)
        wait_for_condition(mutex);   // atomically releases the mutex while waiting
    work = remove_item_from_queue();
    mutex.release();
    if (work) do(work);
This is a simple example where, if the condition is broadcast, all the waiting worker threads are awakened even though only one worker will actually succeed in getting work off the queue. If you want even more efficiency, use an array of condition variables, one per worker thread, and then just signal the "next" one, using an algorithm for "next" that is as simple or as complex as you want.
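For reference, a runnable pthreads sketch of the work-queue pattern above might look like the following. It uses a bounded ring buffer instead of an unbounded queue and pthread_cond_signal, so only one waiting worker is woken per item; all names are illustrative.

#include <pthread.h>
#include <stdio.h>

#define QUEUE_CAP 64

static int queue[QUEUE_CAP];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t mtx       = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

static void enqueue(int item)
{
    pthread_mutex_lock(&mtx);
    if (count < QUEUE_CAP) {                      /* sketch: silently drop if full */
        queue[tail] = item;
        tail = (tail + 1) % QUEUE_CAP;
        count++;
    }
    pthread_mutex_unlock(&mtx);
    pthread_cond_signal(&not_empty);              /* tell one waiting worker */
}

static void *worker(void *arg)
{
    long id = (long)arg;
    for (;;) {
        pthread_mutex_lock(&mtx);
        while (count == 0)
            pthread_cond_wait(&not_empty, &mtx);  /* releases mtx while waiting */
        int work = queue[head];
        head = (head + 1) % QUEUE_CAP;
        count--;
        pthread_mutex_unlock(&mtx);
        printf("worker %ld handles item %d\n", id, work);   /* the "do(work)" part */
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int item = 0; item < 20; item++)
        enqueue(item);
    pthread_join(threads[0], NULL);   /* workers loop forever in this sketch */
    return 0;
}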
I'm new to multithreading in Win32, and I have an assignment involving a semaphore, but I cannot understand this.
Assume that we have 20 tasks (each task is the same as the others). If we use a semaphore, then there are two possible setups:
First, there could be 20 child threads, so that each thread handles 1 task.
Or:
Second, there would be n child threads. When a thread finishes a task, it picks up another task?
The second problem I encounter is that I cannot find any samples for semaphores in the Win32 API other than the console sample I found on MSDN.
Can you help me with the "20 tasks" problem and tell me how to write a semaphore in a WinAPI application (where should I place the CreateSemaphore() function, ...)?
Your suggestion will be appreciated.
You can start a thread for every task, which is a common approach, or you can use a "threadpool" where threads are reused. This is up to you. In both scenarios, you may or may not use a semaphore; the difference is only in how you start the multiple threads.
Now, concerning your question about where to place the CreateSemaphore() call: you should call it before starting any further threads. The reason is that these threads need to access the semaphore, but they can't do that if it doesn't exist yet. You could of course pass it to the threads after they have started, but that again would give you the problem of how to pass it safely without any race conditions, which is something that semaphores and other synchronization primitives are there to avoid. In other words, you would only complicate things by creating a chicken-and-egg problem.
Note that if this doesn't help you any further, you should perhaps provide more info. What are the goals? What have you done yourself so far? Any related questions here that you read but that didn't fully present answers to your problem?
Well, if you are constrained to using semaphores only, you could use two semaphores to create an unbounded producer-consumer queue class that you could use to implement a thread pool.
You need a 'SimpleQueue' class for task objects. I assume you either have one already, can easily build one or whatever.
In the ctor of your 'ProducerConsumerQueue' class (or in main(), or in some factory function that returns a *ProducerConsumerQueue struct, whatever your language has), create a SimpleQueue and two semaphores: a 'QueueCount' semaphore, initialized with a count of 0, and a 'QueueAccess' semaphore, initialized with a count of 1.
Add 'push(*task)' and '*task pop()' methods/member functions to the ProducerConsumerQueue:
In 'push', first call the 'WaitForSingleObject()' API on QueueAccess, then push the *task onto the SimpleQueue, then call the ReleaseSemaphore() API on QueueAccess. This pushes the *task in a thread-safe manner. Then call ReleaseSemaphore() on QueueCount - this will signal any waiting threads.
In pop(), first call the 'WaitForSingleObject()' API on QueueCount - this ensures that a calling consumer thread has to wait until there is a *task in the queue. Then call the 'WaitForSingleObject()' API on QueueAccess, pop the task from the SimpleQueue, then call the ReleaseSemaphore() API on QueueAccess and return the task - this dequeues the *task in a thread-safe manner.
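A sketch in C with the Win32 API of such a ProducerConsumerQueue might look like this. It assumes the SimpleQueue mentioned above exists with sq_init()/sq_push()/sq_pop() functions, and Task stands in for whatever your task objects are:

#include <windows.h>
#include <limits.h>

typedef struct Task Task;                /* placeholder for your task type */

typedef struct ProducerConsumerQueue {
    SimpleQueue queue;                   /* assumed unbounded FIFO of Task* */
    HANDLE      hQueueCount;             /* counts queued tasks, starts at 0 */
    HANDLE      hQueueAccess;            /* guards the queue, starts at 1 */
} ProducerConsumerQueue;

void PCQ_Init(ProducerConsumerQueue *pcq)
{
    sq_init(&pcq->queue);
    pcq->hQueueCount  = CreateSemaphore(NULL, 0, LONG_MAX, NULL);
    pcq->hQueueAccess = CreateSemaphore(NULL, 1, 1, NULL);
}

void PCQ_Push(ProducerConsumerQueue *pcq, Task *task)
{
    WaitForSingleObject(pcq->hQueueAccess, INFINITE);  /* lock the queue */
    sq_push(&pcq->queue, task);
    ReleaseSemaphore(pcq->hQueueAccess, 1, NULL);      /* unlock the queue */
    ReleaseSemaphore(pcq->hQueueCount, 1, NULL);       /* one more task is available */
}

Task *PCQ_Pop(ProducerConsumerQueue *pcq)
{
    WaitForSingleObject(pcq->hQueueCount, INFINITE);   /* wait until a task exists */
    WaitForSingleObject(pcq->hQueueAccess, INFINITE);  /* lock the queue */
    Task *task = sq_pop(&pcq->queue);
    ReleaseSemaphore(pcq->hQueueAccess, 1, NULL);      /* unlock the queue */
    return task;
}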
Once you have created your ProducerConsumerQueue, create some threads to run the tasks. In CreateThread(), pass the same *ProducerConsumerQueue as the 'auxiliary' *void parameter.
In the thread function, cast the *void back to *ProducerConsumerQueue and then just loop around for ever, calling pop() and then running the returned task.
OK, your pool of threads is now ready to do stuff. If you want to run 20 tasks, create them in a loop and push them onto the ProducerConsumerQueue. The threads will then run them all.
You can create as many threads as you want in the pool (within reason). As many threads as cores is reasonable for tasks that are CPU-intensive. If the tasks make blocking calls, you may want to create many more threads for the quickest overall throughput.
A useful enhancement is to check for 'null' in the thread function loop after each task is received and, if it is null, clean up and exit the thread, so terminating it. This allows the threads to be easily terminated by queueing up nulls, making it easier to shut down your thread pool (should you need to) and also to control the number of threads in the pool at runtime.
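Continuing the sketch above, the worker thread function and the null-based shutdown could look like this (WorkerThread, RunTask and MakeTask are placeholders for however your tasks are represented and run):

/* worker thread: pull tasks off the queue forever, until a NULL arrives */
DWORD WINAPI WorkerThread(LPVOID param)
{
    ProducerConsumerQueue *pcq = (ProducerConsumerQueue *)param;
    for (;;) {
        Task *task = PCQ_Pop(pcq);
        if (task == NULL)                /* a queued NULL means "shut down" */
            return 0;
        RunTask(task);                   /* placeholder for running one task */
    }
}

#define NUM_THREADS 4

void RunTwentyTasks(void)
{
    ProducerConsumerQueue pcq;
    PCQ_Init(&pcq);
    for (int i = 0; i < NUM_THREADS; ++i)
        CreateThread(NULL, 0, WorkerThread, &pcq, 0, NULL);  /* pass the queue as the void* parameter */
    for (int i = 0; i < 20; ++i)
        PCQ_Push(&pcq, MakeTask(i));     /* placeholder task factory */
    for (int i = 0; i < NUM_THREADS; ++i)
        PCQ_Push(&pcq, NULL);            /* one NULL per worker shuts the pool down */
}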