What are the main purposes for joining pthreads in Linux/UNIX? - linux

I'm a student and I'm going over threads right now, and despite reading TLPI very carefully, I still don't have a good understanding as to why one might join two pthreads.
From what I've gleaned, it can be used either as a way for one thread to pass a return value to another OR it can be used as a waiting mechanism between threads. That said, it's entirely possible that I've misunderstood the entire point. Would someone mind explaining it a bit for me?

Threads are mainly used for parallel processing. Joining/Exiting threads means the work/purpose of the thread is fulfilled. When the purpose is fulfilled then the resources should be freed and made available to other threads/processes. Resources could be any of following:
Stack (as Basile Starynkevitch said)
Processor time
Opened files/Shared Memory/Any other resource locked/booked by the thread.
Joining threads can be done for just shifting the control also Or it might be done for transferring values as return values (as Michael Burr said).

Related

Is it safe to update an object in a thread without locks if other threads will not access it?

I have a vector of entities. At update cycle I iterate through vector and update each entity: read it's position, calculate current speed, write updated position. Also, during updating process I can change some other objects in other part of program, but each that object related only to current entity and other entities will not touch that object.
So, I want to run this code in threads. I separate vector into few chunks and update each chunk in different threads. As I see, threads are fully independent. Each thread on each iteration works with independent memory regions and doesn't affect other threads work.
Do I need any locks here? I assume, that everything should work without any mutexes, etc. Am I right?
Short answer
No, you do not need any lock or synchronization mechanism as your problem appear to be a embarrassingly parallel task.
Longer answer
A race conditions that can only appear if two threads might access the same memory at the same time and at least one of the access is a write operation. If your program exposes this characteristic, then you need to make sure that threads access the memory in an ordered fashion. One way to do it is by using locks (it is not the only one though). Otherwise the result is UB.
It seems that you found a way to split the work among your threads s.t. each thread can work independently from the others. This is the best case scenario for concurrent programming as it does not require any synchronization. The complexity of the code is dramatically decreased and usually speedup will jump up.
Please note that as #acelent pointed out in the comment section, if you need changes made by one thread to be visible in another thread, then you might need some sort of synchronization due to the fact that depending on the memory model and on the HW changes made in one thread might not be immediately visible in the other.
This means that you might write from Thread 1 to a variable and after some time read the same memory from Thread 2 and still not being able to see the write made by Thread 1.
"I separate vector into few chunks and update each chunk in different threads" - in this case you do not need any lock or synchronization mechanism, however, the system performance might degrade considerably due to false sharing depending on how the chunks are allocated to threads. Note that the compiler may eliminate false sharing using thread-private temporal variables.
You can find plenty of information in books and wiki. Here is some info https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads
Also there is a stackoverflow post here does false sharing occur when data is read in openmp?

Semaphores & threads - what is the point?

I've been reading about semaphores and came across this article:
www.csc.villanova.edu/~mdamian/threads/posixsem.html
So, this page states that if there are two threads accessing the same data, things can get ugly. The solution is to allow only one thread to access the data at the same time.
This is clear and I understand the solution, only why would anyone need threads to do this? What is the point? If the threads are blocked so that only one can execute, why use them at all? There is no advantage. (or maybe this is a just a dumb example; in such a case please point me to a sensible one)
Thanks in advance.
Consider this:
void update_shared_variable() {
sem_wait( &g_shared_variable_mutex );
g_shared_variable++;
sem_post( &g_shared_variable_mutex );
}
void thread1() {
do_thing_1a();
do_thing_1b();
do_thing_1c();
update_shared_variable(); // may block
}
void thread2() {
do_thing_2a();
do_thing_2b();
do_thing_2c();
update_shared_variable(); // may block
}
Note that all of the do_thing_xx functions still happen simultaneously. The semaphore only comes into play when the threads need to modify some shared (global) state or use some shared resource. So a thread will only block if another thread is trying to access the shared thing at the same time.
Now, if the only thing your threads are doing is working with one single shared variable/resource, then you are correct - there is no point in having threads at all (it would actually be less efficient than just one thread, due to context switching.)
When you are using multithreading not everycode that runs will be blocking. For example, if you had a queue, and two threads are reading from that queue, you would make sure that no thread reads at the same time from the queue, so that part would be blocking, but that's the part that will probably take the less time. Once you have retrieved the item to process from the queue, all the rest of the code can be run asynchronously.
The idea behind the threads is to allow simultaneous processing. A shared resource must be governed to avoid things like deadlocks or starvation. If something can take a while to process, then why not create multiple instances of those processes to allow them to finish faster? The bottleneck is just what you mentioned, when a process has to wait for I/O.
Being blocked while waiting for the shared resource is small when compared to the processing time, this is when you want to use multiple threads.
This is of course a SSCCE (Short, Self Contained, Correct Example)
Let's say you have 2 worker threads that do a lot of work and write the result to a file.
you only need to lock the file (shared resource) access.
The problem with trivial examples....
If the problem you're trying to solve can be broken down into pieces that can be executed in parallel then threads are a good thing.
A slightly less trivial example - imagine a for loop where the data being processed in each iteration is different every time. In that circumstance you could execute each iteration of the for loop simultaneously in separate threads. And indeed some compilers like Intel's will convert suitable for loops to threads automatically for you. In that particular circumstances no semaphores are needed because of the iterations' data independence.
But say you were wanting to process a stream of data, and that processing had two distinct steps, A and B. The threadless approach would involve reading in some data then doing A then B and then output the data before reading more input. Or you could have a thread reading and doing A, another thread doing B and output. So how do you get the interim result from the first thread to the second?
One way would be to have a memory buffer to contain the interim result. The first thread could write the interim result to a memory buffer and the second could read from it. But with two threads operating independently there's no way for the first thread to know if it's safe to overwrite that buffer, and there's no way for the second to know when to read from it.
That's where you can use semaphores to synchronise the action of the two threads. The first thread takes a semaphore that I'll call empty, fills the buffer, and then posts a semaphore called filled. Meanwhile the second thread will take the filled semaphore, read the buffer, and then post empty. So long as filled is initialised to 0 and empty is initialised to 1 it will work. The second thread will process the data only after the first has written it, and the first won't write it until the second has finished with it.
It's only worth it of course if the amount of time each thread spends processing data outweighs the amount of time spent waiting for semaphores. This limits the extent to which splitting code up into threads yields a benefit. Going beyond that tends to mean that the overall execution is effectively serial.
You can do multithreaded programming without semaphores at all. There's the Actor model or Communicating Sequential Processes (the one I favour). It's well worth looking up JCSP on Wikipedia.
In these programming styles data is shared between threads by sending it down communication channels. So instead of using semaphores to grant another thread access to data it would be sent a copy of that data down something a bit like a network socket, or a pipe. The advantage of CSP (which limits that communication channel to send-finishes-only-if-receiver-has-read) is that it stops you falling into the many many pitfalls that plague multithreaded do programs. It sounds inefficient (copying data is inefficient), but actually it's not so bad with Intel's QPI architecture, AMD's Hypertransport. And it means hat the 'channel' really could be a network connection; scalability built in by design.

Definition of Multi-threading

Not really programming related this question, but I still hope it fits somehow here :).
I wrote the following sentence in my work:
Mulitthreading refers to the ability of an OS to subdivide an application into
threads, where each of the them are capable to execute independently.
I was told, that this definition of thread is too narrow. I am not really sure why this is the case, could somebody be so kind to explain me what I missed?
Thank you
Usually, it is the application that decides when to create threads, not the OS. Also, you may want to mention that threads share address space, while each process has its own.
A thread fundamentally, is a saved execution context - a set of saved registers and a stack, that you can resume and continue execution of. This thread can be executed on a processor (these days, many machines of course can execute multiple threads at the same time).
The critical aspect of "multi-threading" is, that an operating system can emulate execution of many threads at the same time, by preempting (stopping) a thread once it has run for a certain amount of time (a "quantum"), then scheduling another thread to run, based on a certain algorithm that is OS-specific.

fork in multi-threaded program

I've heard that mixing forking and threading in a program could be very problematic, often resulting with mysterious behavior, especially when dealing with shared resources, such as locks, pipes, file descriptors. But I never fully understand what exactly the dangers are and when those could happen. It would be great if someone with expertise in this area could explain a bit more in detail what pitfalls are and what needs to be care when programming in a such environment.
For example, if I want to write a server that collects data from various different resources, one solution I've thought is to have the server spawns a set of threads, each popen to call out another program to do the actual work, open pipes to get the data back from the child. Each of these threads responses for its own work, no data interexchange in b/w them, and when the data is collected, the main thread has a queue and these worker threads will just put the result in the queue. What could go wrong with this solution?
Please do not narrow your answer by just "answering" my example scenario. Any suggestions, alternative solutions, or experiences that are not related to the example but helpful to provide a clean design would be great! Thanks!
The problem with forking when you do have some threads running is that the fork only copies the CPU state of the one thread that called it. It's as if all of the other threads just died, instantly, wherever they may be.
The result of this is locks aren't released, and shared data (such as the malloc heap) may be corrupted.
pthread does offer a pthread_atfork function - in theory, you could take every lock in the program before forking, release them after, and maybe make it out alive - but it's risky, because you could always miss one. And, of course, the stacks of the other threads won't be freed.
It is really quite simple. The problems with multiple threads and processes always arise from shared data. If there is not shared data then there can be no possible issues arising.
In your example the shared data is the queue owned by the main thread - any potential contention or race conditions will arise here. Typical methods for "solving" these issues involve locking schemes - a worker thread will lock the queue before inserting any data, and the main thread will lock the queue before removing it.

How to define threadsafe?

Threadsafe is a term that is thrown around documentation, however there is seldom an explanation of what it means, especially in a language that is understandable to someone learning threading for the first time.
So how do you explain Threadsafe code to someone new to threading?
My ideas for options are the moment are:
Do you use a list of what makes code
thread safe vs. thread unsafe
The book definition
A useful metaphor
Multithreading leads to non-deterministic execution - You don't know exactly when a certain piece of parallel code is run.
Given that, this wonderful multithreading tutorial defines thread safety like this:
Thread-safe code is code which has no indeterminacy in the face of any multithreading scenario. Thread-safety is achieved primarily with locking, and by reducing the possibilities for interaction between threads.
This means no matter how the threads are run in particular, the behaviour is always well-defined (and therefore free from race conditions).
Eric Lippert says:
When I'm asked "is this code thread safe?" I always have to push back and ask "what are the exact threading scenarios you are concerned about?" and "exactly what is correct behaviour of the object in every one of those scenarios?".
It is unhelpful to say that code is "thread safe" without somehow communicating what undesirable behaviors the utilized thread safety mechanisms do and do not prevent.
G'day,
A good place to start is to have a read of the POSIX paper on thread safety.
Edit: Just the first few paragraphs give you a quick overview of thread safety and re-entrant code.
HTH
cheers,
i maybe wrong but one of the criteria for being thread safe is to use local variables only. Using global variables can have undefined result if the same function is called from different threads.
A thread safe function / object (hereafter referred to as an object) is an object which is designed to support multiple concurrent calls. This can be achieved by serialization of the parallel requests or some sort of support for intertwined calls.
Essentially, if the object safely supports concurrent requests (from multiple threads), it is thread safe. If it is not thread safe, multiple concurrent calls could corrupt its state.
Consider a log book in a hotel. If a person is writing in the book and another person comes along and starts to concurrently write his message, the end result will be a mix of both messages. This can also be demonstrated by several threads writing to an output stream.
I would say to understand thread safe, start with understanding difference between thread safe function and reentrant function.
Please check The difference between thread-safety and re-entrancy for details.
Tread-safe code is code that won't fail because the same data was changed in two places at once. Thread safe is a smaller concept than concurrency-safe, because it presumes that it was in fact two threads of the same program, rather than (say) hardware modifying data, or the OS.
A particularly valuable aspect of the term is that it lies on a spectrum of concurrent behavior, where thread safe is the strongest, interrupt safe is a weaker constraint than thread safe, and reentrant even weaker.
In the case of thread safe, this means that the code in question conforms to a consistent api and makes use of resources such that other code in a different thread (such as another, concurrent instance of itself) will not cause an inconsistency, so long as it also conforms to the same use pattern. the use pattern MUST be specified for any reasonable expectation of thread safety to be had.
The interrupt safe constraint doesn't normally appear in modern userland code, because the operating system does a pretty good job of hiding this, however, in kernel mode this is pretty important. This means that the code will complete successfully, even if an interrupt is triggered during its execution.
The last one, reentrant, is almost guaranteed with all modern languages, in and out of userland, and it just means that a section of code may be entered more than once, even if execution has not yet preceeded out of the code section in older cases. This can happen in the case of recursive function calls, for instance. It's very easy to violate the language provided reentrancy by accessing a shared global state variable in the non-reentrant code.

Resources