Are simultaneous reads of a variable thread-safe? - multithreading

Assuming that the variable isn't in any risk of being modified during the reads, are there any inherent problems in a variable being read by 2 or more threads at the same time?

No, this operation is not inherently thread-safe.
Even though the variable is not currently being written to, previous writes to it may not yet be visible to all threads. This means two threads can read the same variable at the same time and get different values, creating a race condition.
This can be prevented through memory barriers, correct use of volatile, or a few other mechanisms. We'd need to know more about your environment, in particular the language, to give a complete explanation.
A slight restatement of your question, though, yields a better answer: assuming there are no further writes and all previous writes are visible to the reading threads, then yes, reading the value from multiple threads is safe.
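To make the visibility point concrete, here is a minimal C++ sketch (the question doesn't name a language, so this is just one way to picture it): publishing the value through a std::atomic makes the write visible to every thread that reads it afterwards, after which any number of simultaneous reads are safe.
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

std::atomic<int> config{0};   // atomic store/load gives cross-thread visibility

int main() {
    config.store(42, std::memory_order_release);   // last write, before any reader starts

    std::vector<std::thread> readers;
    for (int i = 0; i < 4; ++i)
        readers.emplace_back([] {
            // Simultaneous reads of the published value: safe, no lock needed
            int v = config.load(std::memory_order_acquire);
            std::cout << v << '\n';   // output may interleave, but the reads are race-free
        });
    for (auto& t : readers) t.join();
}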

If your assumption holds, then there are no problems.

As long as it's a plain variable, there's no risk.
If it's a property, reading it can have side effects, so it is not guaranteed to be thread-safe.
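For example (a hypothetical C++ getter, since the answer above doesn't name a language), a "read" that lazily initializes a cached value actually writes shared state, so two simultaneous readers race:
#include <string>

class Config {
    mutable std::string cached_;    // shared mutable state behind a "read"
    mutable bool loaded_ = false;
public:
    // Looks like a pure read, but the first call writes cached_ and loaded_.
    // Two threads calling this at the same time is a data race.
    const std::string& value() const {
        if (!loaded_) {
            cached_ = "expensively computed";   // side effect of the "read"
            loaded_ = true;
        }
        return cached_;
    }
};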

Yes in three characters.
Edit:
Whoops. Yes, it's thread-safe. No, there are no problems. People usually ask if something is thread-safe, not if it's thread-unsafe.

Given that databases can commonly use shared read locks, where any number of clients may read the same block, I would suggest that there are no direct inherent problems.
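The same pattern exists in C++17 as std::shared_mutex: any number of readers may hold the lock at once, mirroring a database's shared read lock. A minimal sketch:
#include <shared_mutex>

std::shared_mutex mtx;
int shared_value = 0;

int read_value() {
    std::shared_lock lock(mtx);    // many readers may hold this at the same time
    return shared_value;
}

void write_value(int v) {
    std::unique_lock lock(mtx);    // a writer gets exclusive access
    shared_value = v;
}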

Related

Do all shared variables need to be atomic?

I have started working to learn multicore programming. I started with learning C++11 atomics. I would like to know if all shared variables need to be atomic.
The only time a variable needs to be "atomic", i.e., updatable in one fell swoop without any other thread being able to read it meanwhile, is if it can actually be read while someone else is updating it. For instance, if it's set at initialization and never changes, no one can ever read it while it's changing, so it doesn't have to be atomic. On the other hand, if it ever changes after initialization and there's a risk that any thread other than the one changing it reads it while it's changing, then it needs to be atomic (via atomic intrinsics, protection by a mutex, or otherwise).
If multiple threads access (read/write) the same variable, then it should be atomic.
Also, you can go through this.
Not necessarily in all scenarios. Also, note that atomicity of variable access alone does not guarantee full thread safety; it just ensures that the particular variable being read is fetched as a whole. On some architectures, a read does not happen in a single assembly instruction. For example, if you are reading a 64-bit value, the compiler might implement the read as two load instructions, the first reading the lower 32 bits and the second reading the higher 32 bits. A write can then slip in between the two loads, producing a torn read, which in turn can lead to a race condition. So atomic reads are preferred.
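A minimal C++11 sketch of that fix (names are illustrative): on a 32-bit target, the plain 64-bit counter below can be observed half-updated, while the std::atomic version is always read as a whole.
#include <atomic>
#include <cstdint>

uint64_t plain_counter = 0;                  // may tear on 32-bit hardware
std::atomic<uint64_t> atomic_counter{0};     // always read/written as a whole

void writer() {
    plain_counter = 0xFFFFFFFF00000000ull;   // a reader may see half-new, half-old bits
    atomic_counter.store(0xFFFFFFFF00000000ull);
}

uint64_t reader() {
    return atomic_counter.load();            // never observes a torn value
}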

Perl ithreads :shared variables - multiprocessor kernel threads - visibility

perlthrtut excerpt:
Note that a shared variable guarantees that if two or more threads try
to modify it at the same time, the internal state of the variable will
not become corrupted. However, there are no guarantees beyond this, as
explained in the next section.
I'm working on Linux with support for multiprocessor kernel threads.
Is there a guarantee that all threads will see the updated shared variable value ?
According to the perlthrtut doc quoted above, there is no such guarantee.
Now the question: What can be done programmatically to guarantee that?
You ask
Is there a guarantee that all threads will see the updated shared variable value ?
Yes. :shared is that guarantee. The value will be safely and consistently and freshly updated.
The problem is simply that, without other synchronization, you don't know the order of these updates.
According to the perlthrtut doc quoted above, there is no such guarantee.
You didn't read far enough. :)
The very next section in perlthrtut explains the kind of pitfalls you do face with perl threads: data races, which is to say, application-logic races concerning shared data. Again, the shared data will be consistent and fresh and immune to corruption, because perl opcodes are (more or less) atomic. However, the high-level perl operations you perform on that shared data are not guaranteed to be atomic. $shared_var++, for instance, might be more than one atomic operation.
(If I may hazard a guess, you are perhaps thinking too much about other languages' lower level threading interfaces with their cache inconsistencies, torn words, reordered instructions, and lions and tigers and bears. Perl's model takes care of those low-level concerns for you.)
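For contrast, here is what one of those lower-level races looks like in C++ (purely an illustration of the kind of hazard Perl's model hides from you): a plain ++ compiles to a separate load, add, and store that two threads can interleave, while the atomic version never loses an update.
#include <atomic>
#include <iostream>
#include <thread>

int plain = 0;                  // ++plain is load, add, store: a data race
std::atomic<int> counter{0};    // fetch_add is a single indivisible step

int main() {
    auto work = [] {
        for (int i = 0; i < 100000; ++i) {
            ++plain;                                          // updates can be lost (and the race is UB in C++)
            counter.fetch_add(1, std::memory_order_relaxed);  // never lost
        }
    };
    std::thread a(work), b(work);
    a.join(); b.join();
    std::cout << plain << ' ' << counter << '\n';   // plain is often < 200000; counter is exactly 200000
}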
Using :shared on a variable causes all threads to reference it in the same physical memory address, so it doesn't matter which processor/core/hyper-thread they happen to be executing in. The perlthrtut talk of guarantees is in reference to race conditions, and in short, that you need to take into account that shared variables can be modified by any thread at any time. If this is a problem you'll need to make use of synchronization functions (e.g. lock() and cond_wait()) to control access.
You seem to be confused as to what :shared does. It makes a variable shared by all threads.
A variable is indeed guaranteed to have the value it has, no matter which thread accesses it. That's a tautology, so nothing needs to be done programmatically to guarantee it.

Handling concurrent reads?

I'm new to concurrent programming and I have a specific situation in mind that I'd like some input on. If I have a variable that I will be accessing from multiple threads but only to read the value (the only reason it wouldn't be a constant is that I'd need to set it at runtime), do I need a mutex for it? Or do you only need to worry about race conditions when there are also writes going to a shared resource?
If you set the value before you start up the threads, you do not need a mutex.
If you set the value after you start up the threads, you will need a mutex to ensure all the threads read the same, up-to-date value.
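A sketch of the second case in C++ (the names are illustrative): the value is set at runtime after the threads already exist, so both the write and the reads go through the same mutex.
#include <mutex>

std::mutex mtx;
int runtime_setting = 0;    // set once at runtime, then only read

void set_setting(int v) {
    std::lock_guard<std::mutex> lock(mtx);
    runtime_setting = v;    // the one write, made visible to all readers
}

int get_setting() {
    std::lock_guard<std::mutex> lock(mtx);    // readers are guaranteed to see the write
    return runtime_setting;
}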
Logically, if you are only reading shared data, then you may not need a mutex. But in the case of large programs, you must use one to avoid confusion.
That depends on what language and machine architecture you're talking about and what "reading the variable" means in that language. When "reading the variable" translates into only reading from memory at the machine-level, concurrent reads are in themselves generally safe. You need to be sure, of course, that nothing else in your program translates into writing to those same memory areas.
Many mainstream languages (Java, C#, C, C++) give only very weak guarantees about how your program translates into memory accesses. At the same time, the guarantees you do get tend to take the form of very complex rules, say, about which sequences of statements may be reordered when. To avoid introducing really hard-to-find bugs, it's very often better to request the synchronisation properties you need in as unsubtle and concrete a form as possible, that is, to use mutexes.

Should access to a shared resource be locked by a parent thread before spawning a child thread that accesses it?

If I have the following pseudocode:
sharedVariable = somevalue;
CreateThread(threadWhichUsesSharedVariable);
Is it theoretically possible for a multicore CPU to execute code in threadWhichUsesSharedVariable() which reads the value of sharedVariable before the parent thread writes to it? For full theoretical avoidance of even the remote possibility of a race condition, should the code look like this instead:
sharedVariableMutex.lock();
sharedVariable = somevalue;
sharedVariableMutex.unlock();
CreateThread(threadWhichUsesSharedVariable);
Basically I want to know if the spawning of a thread explicitly linearizes the CPU at that point, and is guaranteed to do so.
I know that the overhead of thread creation probably takes enough time that this would never matter in practice, but the perfectionist in me is afraid of the theoretical race condition. In extreme conditions, where some threads or cores might be severely lagged and others are running fast and efficiently, I can imagine that it might be remotely possible for the order of execution (or memory access) to be reversed unless there was a lock.
I would say that your pseudocode is safe on any correctly functioning multiprocessor system. The C++ compiler cannot generate a call to CreateThread() before sharedVariable has received a correct value unless it can prove to itself that doing so is safe. You are guaranteed that your single-threaded code executes equivalently to a completely non-reordered linear execution path. Any system that "time warps" the thread creation ahead of the variable assignment is seriously broken.
I don't think declaring sharedVariable as volatile does anything useful in this case.
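In C++11 terms the guarantee is explicit: the completion of the std::thread constructor synchronizes-with the start of the new thread's function, so the pattern from the question is safe without a mutex. A minimal sketch:
#include <iostream>
#include <thread>

int sharedVariable = 0;    // plain int: no atomics, no mutex

void threadWhichUsesSharedVariable() {
    // The write in main happens-before this read, courtesy of
    // the std::thread constructor's synchronization guarantee.
    std::cout << sharedVariable << '\n';    // always prints 42
}

int main() {
    sharedVariable = 42;                            // write first...
    std::thread t(threadWhichUsesSharedVariable);   // ...then spawn the reader
    t.join();
}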
Given your example, if you were using Java, then the answer would be "No". In Java it is not possible for the thread to spawn and read your value before the assignment operation is complete. In some other languages this might be a different story.
"Variables shared between multiple threads (e.g., instance variables of objects) have atomic assignment guaranteed by the Java language specification for all data types except longs and doubles... If a method consists solely of a single variable access or assignment, there is no need to make it synchronized for thread-safety, and every reason not to do so for performance."
reference
If your double or long is declared volatile, then you are also guaranteed that the assignment is an atomic operation.
Update:
Your example is going to work in C++ just like it works in Java. Theoretically there is no way that the thread spawning will begin or complete before the assignment, even with out-of-order execution.
Note that your example is VERY specific, and in any other case it is recommended that you ensure the shared resource is properly protected. The new C++ standard is coming out with a lot of atomic support, so you could declare your variable as atomic and the assignment will be visible to all threads without the need for locking. CAS (compare and set) is your next best option.
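A sketch of what that looks like with C++11's std::atomic (newly standardized when this answer was written): the store is visible to all threads without a lock, and compare_exchange_strong is the CAS operation mentioned.
#include <atomic>

std::atomic<int> sharedVariable{0};

void writer() {
    sharedVariable.store(42);    // visible to all threads, no lock needed
}

bool try_update(int expected, int desired) {
    // CAS: succeeds only if the value is still `expected`; otherwise
    // `expected` is overwritten with the value actually found.
    return sharedVariable.compare_exchange_strong(expected, desired);
}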

How to define threadsafe?

Threadsafe is a term that is thrown around in documentation; however, there is seldom an explanation of what it means, especially in language that is understandable to someone learning threading for the first time.
So how do you explain Threadsafe code to someone new to threading?
My ideas for options at the moment are:
A list of what makes code thread-safe vs. thread-unsafe
The book definition
A useful metaphor
Multithreading leads to non-deterministic execution - You don't know exactly when a certain piece of parallel code is run.
Given that, this wonderful multithreading tutorial defines thread safety like this:
Thread-safe code is code which has no indeterminacy in the face of any multithreading scenario. Thread-safety is achieved primarily with locking, and by reducing the possibilities for interaction between threads.
This means no matter how the threads are run in particular, the behaviour is always well-defined (and therefore free from race conditions).
Eric Lippert says:
When I'm asked "is this code thread safe?" I always have to push back and ask "what are the exact threading scenarios you are concerned about?" and "exactly what is correct behaviour of the object in every one of those scenarios?".
It is unhelpful to say that code is "thread safe" without somehow communicating what undesirable behaviors the utilized thread safety mechanisms do and do not prevent.
A good place to start is to have a read of the POSIX paper on thread safety.
Edit: Just the first few paragraphs give you a quick overview of thread safety and re-entrant code.
I may be wrong, but one of the criteria for being thread-safe is to use only local variables. Using global variables can have undefined results if the same function is called from different threads.
A thread-safe function / object (hereafter referred to as an object) is an object which is designed to support multiple concurrent calls. This can be achieved by serializing the parallel requests or by some sort of support for intertwined calls.
Essentially, if the object safely supports concurrent requests (from multiple threads), it is thread safe. If it is not thread safe, multiple concurrent calls could corrupt its state.
Consider a log book in a hotel. If a person is writing in the book and another person comes along and starts to concurrently write his message, the end result will be a mix of both messages. This can also be demonstrated by several threads writing to an output stream.
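The output-stream version of that metaphor, sketched in C++ (hypothetical names): without the mutex the two messages can interleave character by character, exactly like the log book.
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

std::mutex log_mutex;

void log_line(const std::string& msg) {
    std::lock_guard<std::mutex> lock(log_mutex);   // one "guest" writes at a time
    std::cout << msg << '\n';                      // the whole line, never mixed
}

int main() {
    std::thread a(log_line, "guest A was here");
    std::thread b(log_line, "guest B was here");
    a.join(); b.join();
}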
I would say to understand thread safe, start with understanding difference between thread safe function and reentrant function.
Please check The difference between thread-safety and re-entrancy for details.
Thread-safe code is code that won't fail because the same data was changed in two places at once. Thread-safe is a narrower concept than concurrency-safe, because it presumes that it was in fact two threads of the same program, rather than (say) hardware or the OS modifying data.
A particularly valuable aspect of the term is that it lies on a spectrum of concurrent behavior, where thread-safe is the strongest constraint, interrupt-safe a weaker one, and reentrant weaker still.
In the case of thread-safe, this means that the code in question conforms to a consistent API and makes use of resources such that other code in a different thread (such as another, concurrent instance of itself) will not cause an inconsistency, so long as it also conforms to the same use pattern. The use pattern MUST be specified for any reasonable expectation of thread safety to be had.
The interrupt safe constraint doesn't normally appear in modern userland code, because the operating system does a pretty good job of hiding this, however, in kernel mode this is pretty important. This means that the code will complete successfully, even if an interrupt is triggered during its execution.
The last one, reentrant, is almost guaranteed by all modern languages, in and out of userland; it just means that a section of code may be entered more than once even if execution has not yet proceeded out of that section, as can happen with recursive function calls, for instance. It's very easy to forfeit the reentrancy the language provides by accessing a shared global state variable inside the code section.
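A quick C++ illustration of that last pitfall (hypothetical functions): the first is non-reentrant because it parks its result in a global buffer; the second keeps all state in caller-provided storage, so it can safely be re-entered recursively or from a signal handler.
#include <cstddef>
#include <cstdio>

char scratch[64];    // shared global state

// Non-reentrant: a nested call clobbers `scratch`
// while an outer call is still using it.
const char* format_bad(int n) {
    std::snprintf(scratch, sizeof scratch, "%d", n);
    return scratch;
}

// Reentrant: all state lives in the caller's buffer.
void format_good(int n, char* buf, std::size_t len) {
    std::snprintf(buf, len, "%d", n);
}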
