I'd like to clarify something about ConcurrentHashMap vs ConcurrentSkipListMap based on the API documentation.
From my understanding, ConcurrentHashMap guarantees thread safety for insertions by multiple threads. So if you have a map that will only be populated concurrently by multiple threads, there are no issues. The API, however, goes on to suggest that it does not guarantee locking for retrieval, so you may get misleading results here?
In contrast, for the ConcurrentSkipListMap it is stated that: "Insertion, removal, update, and access operations safely execute concurrently by multiple threads". So I assume this does not have the aforementioned retrieval issue that the hash map has, but obviously this would generally come with a performance cost?
In practice, has anyone found the need to use the ConcurrentSkipListMap because of this particular behaviour, or does it generally not matter that retrievals may give an out of date view?
ConcurrentHashMap
Retrievals reflect the results of the most recently completed update
operations holding upon their onset. For aggregate operations such as
putAll and clear, concurrent retrievals may reflect insertion or
removal of only some entries.
It uses volatile semantics for get(key). If Thread1 calls put(key1, value1) and Thread2 calls get(key1) right afterwards, Thread2 will not wait for Thread1 to finish its put; the two threads are not synchronized with each other, and Thread2 can get the old associated value. But if put(key1, value1) completed in Thread1 before Thread2 tries get(key1), then Thread2 is guaranteed to see this update (value1).
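A minimal Java sketch of that guarantee (class and key names here are just illustrative):

import java.util.concurrent.ConcurrentHashMap;

public class HappensBeforeDemo {
    static final ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> map.put("key1", "value1"));
        writer.start();
        writer.join(); // once join() returns, the put() has completed

        // The completed put() happens-before this get(), so "value1" is
        // guaranteed here. Without the join(), get() could legitimately
        // return null instead.
        System.out.println(map.get("key1"));
    }
}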
ConcurrentSkipListMap is sorted and provides
expected average log(n) time cost for the containsKey, get,
put and remove operations and their variants
ConcurrentSkipListMap isn't as fast, but it is useful when you need a sorted, thread-safe map.
The API, however, goes on to suggest that it does not guarantee locking for retrieval, so you may get misleading results here?
Interestingly enough, neither does the ConcurrentSkipListMap; in fact, the CSLM is completely non-blocking.
In Java 7, the CHM is, for all intents and purposes, non-blocking when executing reads. In fact, Java 8's updated CHM implementation has completely non-blocking reads.
The point here is that the CHM and CSLM have similar read semantics, the difference is time complexity.
From your question, you seem to have come to the conclusion that only insertions into ConcurrentHashMap are thread safe.
From my understanding, ConcurrentHashMap guarantees thread safety for insertions by multiple threads. So if you have a map that will only be populated concurrently by multiple threads, there are no issues.
How did you come to this conclusion? The first line of the documentation for ConcurrentHashMap implies that all operations are thread safe:
A hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates.
Additionally, it implies that get() operations can sustain a higher level of concurrency than put() operations.
Simply put, ConcurrentHashMap does not have the retrieval issue that you think it has. In most cases you should be using ConcurrentHashMap instead of ConcurrentSkipListMap, since the performance of ConcurrentHashMap is generally better than that of ConcurrentSkipListMap. You should only use ConcurrentSkipListMap when you need a ConcurrentMap that has a predictable iteration order, or when you need the facilities of a NavigableMap.
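For instance, here is a small sketch of the NavigableMap facilities that make ConcurrentSkipListMap worth its cost (the keys and values are made up):

import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> medals = new ConcurrentSkipListMap<>();
        medals.put(30, "bronze");
        medals.put(10, "gold");
        medals.put(20, "silver");

        // Iteration order is always ascending key order, regardless of insertion order.
        System.out.println(medals.keySet());     // [10, 20, 30]

        // NavigableMap facilities that ConcurrentHashMap does not offer:
        System.out.println(medals.firstEntry()); // 10=gold
        System.out.println(medals.floorKey(25)); // 20
        System.out.println(medals.headMap(30));  // {10=gold, 20=silver}
    }
}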
Related
I have a vector of entities. On each update cycle I iterate through the vector and update each entity: read its position, calculate its current speed, write the updated position. Also, during the update process I may change some other objects in other parts of the program, but each such object relates only to the current entity, and other entities will not touch that object.
So I want to run this code in threads. I split the vector into a few chunks and update each chunk in a different thread. As far as I can see, the threads are fully independent: each thread, on each iteration, works with its own memory regions and doesn't affect the other threads' work.
Do I need any locks here? I assume that everything should work without any mutexes, etc. Am I right?
Short answer
No, you do not need any lock or synchronization mechanism, as your problem appears to be an embarrassingly parallel task.
Longer answer
A race condition can only appear if two threads might access the same memory at the same time and at least one of the accesses is a write operation. If your program exhibits this characteristic, then you need to make sure that threads access the memory in an ordered fashion. One way to do this is by using locks (it is not the only way, though). Otherwise the result is undefined behavior (UB).
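In C++ such a data race is outright undefined behavior; the following Java analogue of the same hazard (all names are illustrative) at least shows the symptom, lost updates:

public class RaceDemo {
    static int counter = 0; // shared, unsynchronized

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: not atomic
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Almost always prints less than 200000: increments were lost in the race.
        System.out.println(counter);
    }
}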
It seems that you found a way to split the work among your threads such that each thread can work independently from the others. This is the best-case scenario for concurrent programming, as it does not require any synchronization. The complexity of the code decreases dramatically, and the speedup is usually substantial.
Please note that, as @acelent pointed out in the comment section, if you need changes made by one thread to be visible to another thread, then you might need some sort of synchronization, because depending on the memory model and on the hardware, changes made in one thread might not be immediately visible in the other.
This means that you might write to a variable from Thread 1, read the same memory from Thread 2 some time later, and still not see the write made by Thread 1.
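A hedged Java sketch of exactly that visibility trap (the variable names are made up); in C++ the analogous fix would be std::atomic:

public class VisibilityDemo {
    static volatile boolean ready = false; // remove volatile and the loop may spin forever

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the write becomes visible */ }
            System.out.println("saw the write");
        });
        reader.start();
        Thread.sleep(100);
        ready = true; // the volatile write happens-before the reader's volatile read
        reader.join();
    }
}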
"I separate vector into few chunks and update each chunk in different threads" - in this case you do not need any lock or synchronization mechanism, however, the system performance might degrade considerably due to false sharing depending on how the chunks are allocated to threads. Note that the compiler may eliminate false sharing using thread-private temporal variables.
You can find plenty of information in books and on the web. Here is a starting point: https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads
There is also a Stack Overflow post on the topic: does false sharing occur when data is read in openmp?
I have read the docs and understand that in many cases you do not need to manually call refresh on the Realm instance. However, in this pretty common scenario it turned out to be necessary because the completion block may query the Realm before the start of the next run loop.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [[RLMRealm defaultRealm] transactionWithBlock:^{
        // Add some RLMObjects
    }];
    if (completion) {
        dispatch_async(dispatch_get_main_queue(), ^{
            [[RLMRealm defaultRealm] refresh]; // necessary if it queries realm
            completion();
        });
    }
});
I thought I was being a good citizen by performing write operations on a background thread, but now that I have to call refresh, I'm wondering if the overhead involved in this call defeats the point of doing the background processing.
So my questions are:
1) What is the performance cost of calling refresh on the Realm?
2) Adding just one object to the Realm in this pattern is probably pointless. After adding how many objects in this pattern would I see an advantage over the alternative of just performing the write transaction on the main thread synchronously?
Really great questions!
1) What is the performance cost of calling refresh on the Realm?
tl;dr: refreshing isn't that expensive
The cost will be proportionally linear to the number of live "accessors" backed by the Realm instance being advanced on that thread. Accessors in Realm Objective-C are RLMObjects, RLMArrays and RLMResults.
Since Realm uses an MVCC versioning system under the hood, similar to git's inner workings, calling -[RLMRealm refresh] is a matter of advancing that Realm's "current transaction pointer" to the latest stable state, much like a git pull operation.
It's worth noting that for Realms on a thread with a runloop and with autorefresh set to YES (which is generally the case for Realms on the main thread), -[RLMRealm refresh] will be called automatically at every iteration of the runloop.
So in the vast majority of cases, refreshing a Realm will have a negligible performance impact, unless you have a very large number of live "accessors" on that thread.
2) Adding just one object to the Realm in this pattern is probably pointless. After adding how many objects in this pattern would I see an advantage over the alternative of just performing the write transaction on the main thread synchronously?
tl;dr: performing writes in the background is safer
In the vast majority of cases without contention, the overhead of performing a write transaction on the main thread will be below the 1/60th-of-a-second threshold needed for smooth UIs. However, writes in Realm are blocking, so if a large write transaction is happening concurrently in the background, writing from the main thread will block until it completes, which is less than ideal because it causes the UI to stutter or freeze.
For this reason, we recommend that all write transactions, no matter how small and fast, be performed on a background thread unless you're sure there won't be any contention.
I realize that performing writes on a background thread is complicated by Realm's strict thread confinement of accessors, which is why we're tracking, in #3136, the addition of APIs for asynchronous writes that allow safe handover of accessors across threads.
Since read operations in Realm aren't blocked by other reads or writes (thanks to MVCC mentioned above!), it's perfectly acceptable to perform those on any thread.
I have a threading question and what I'd qualify as a modest threading background.
Suppose I have the following (oversimplified) design and behavior:
Object ObjectA - has a reference to object ObjectB and a method MethodA().
Object ObjectB - has a reference to ObjectA, an array of elements ArrayB and a method MethodB().
ObjectA is responsible for instantiating ObjectB. ObjectB.ObjectA will point to ObjectB's instantiator.
Now, whenever some conditions are met, a new element is added in ObjectB.ArrayB and a new thread is started for this element, say ThreadB_x, where x goes from 1 to ObjectB.ArrayB.Length. Each such thread calls ObjectB.MethodB() to pass some data in, which in turn calls ObjectB.ObjectA.MethodA() for data processing.
So multiple threads call the same method ObjectB.MethodB(), and it's very likely that they do so at the very same time. There's a lot of code in MethodB that creates and initializes new objects, so I don't think there are problems there. But then this method calls ObjectB.ObjectA.MethodA(), and I don't have the slightest idea of what's going on in there. Based on the results I get, nothing wrong, apparently, but I'd like to be sure of that.
For now, I enclosed the call to ObjectB.ObjectA.MethodA() in a lock statement inside ObjectB.MethodB(), so I'm thinking this will ensure there are no clashing calls to MethodA(), though I'm not 100% sure of that. But what happens if each ThreadB_x calls ObjectB.MethodB() many times, very fast? Will I have a queue of calls waiting for ObjectB.ObjectA.MethodA() to finish?
Thanks.
Your question is very difficult to answer because of the lack of information. It depends on the average time spent in methodA, how many times that method is called per thread, how many cores are allocated to the process, and the OS scheduling policy, to name a few parameters.
All things being equal, as the number of threads grows toward infinity, you can easily imagine that the probability of two threads requesting access to a shared resource simultaneously tends to one. This probability grows faster in proportion to the amount of time spent on the shared resource. That intuition is probably the reason for your question.
The main idea of multithreading is to parallelize code that can effectively be computed concurrently, and to avoid contention as much as possible. In your setup, if methodA is not pure, i.e. if it may change the state of the process (or, in C++ parlance, if it cannot be made const), then it is a source of contention (recall that a function can only be pure if its body uses only pure functions and constants).
One way of dealing with a shared resource is to protect it with a mutex, as you've done in your code. Another way is to turn its use into an asynchronous service, with one thread handling it and the others submitting computation requests to that thread. In effect, you end up with an explicit queue of requests, but the threads making those requests are free to work on something else in the meantime. The goal is always to maximize computation time, as opposed to thread-management time, which is paid each time a thread gets rescheduled.
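A minimal Java sketch of that second pattern (all names are illustrative): the single-threaded executor is the explicit request queue, and callers block only when they actually need the result.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncServiceDemo {
    static final ExecutorService service = Executors.newSingleThreadExecutor();

    static int sharedState = 0; // only ever touched by the service thread

    static Future<Integer> methodA(int data) {
        return service.submit(() -> {
            sharedState += data; // no lock needed: single-threaded access
            return sharedState;
        });
    }

    public static void main(String[] args) throws Exception {
        Future<Integer> f = methodA(42); // the caller is free to do other work here
        System.out.println(f.get());     // block only when the result is needed
        service.shutdown();
    }
}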
Of course, it is not always possible to do so, e.g. when the result of methodA belongs to a strongly ordered chain of computation.
What is a realistic performance loss due to the fact that in C++0x all other threads shall wait in a case like this:
#include <string>
using std::string;

string& program_name() {
    static string instance = "Parallel Pi";
    return instance;
}
Let's assume the optimal scenario: the programmer was very careful that, even with 100 threads, only the main thread calls the function program_name; all the other 99 worker threads are busy doing useful stuff that does not involve calling this "critical" function.
I quote from the new C++0x standard, § 6.7(4) [stmt.dcl]:
...such an object is initialized the first time control passes through its declaration... If control enters the declaration concurrently while the object is being initialized, the concurrent execution shall wait for completion of the initialization...
What is a realistic overhead that a real-world compiler needs to impose on me to ensure that this static initialization is done as required by the standard?
Is a lock/mutex required? I assume they are expensive, even when not really needed?
If they are expensive, will this be done by less expensive mechanisms?
edit: added string...
If control enters the declaration concurrently while the object is being initialized, the
concurrent execution shall wait for completion of the initialization...
I think this is a reasonable and very normal thing to do in concurrent programming. In any case, this statement doesn't say that all other threads must wait for the initialization; they only have to wait if they access the object while it is being initialized.
Is a lock/mutex required? I assume they are expensive, even when not really needed?
Could be. Mutexes/locks aren't that expensive, actually; they become expensive only when the locked code fragment is accessed frequently by many or even all threads.
If they are expensive, will this be done by less expensive mechanisms?
There are also non-lock-based solutions, AFAIK.
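For illustration only, here is a Java analogue of the usual trick: a lock-free fast path over an initialized flag, with the lock taken only while the initialization actually runs. This roughly mirrors what C++ compilers typically emit for function-local statics (a guard variable); the class below is made up.

public class LazyName {
    private static volatile String instance; // volatile makes the publication safe

    static String programName() {
        String local = instance;
        if (local == null) {                 // fast path: just a volatile read, no lock
            synchronized (LazyName.class) {  // slow path: only the very first callers
                local = instance;
                if (local == null) {
                    instance = local = "Parallel Pi";
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(programName()); // Parallel Pi
    }
}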
If you were really concerned about the price of the lock, you could simply call the function before you start your worker threads, which would initialise the static. If you call it after the threads start, either you or the compiler has to arrange for locking of some sort, so there is no real extra overhead.
Thread-safe is a term that is thrown around in documentation, yet there is seldom an explanation of what it means, especially in language that is understandable to someone learning threading for the first time.
So how do you explain thread-safe code to someone new to threading?
My ideas for options at the moment are:
- A list of what makes code thread-safe vs. thread-unsafe
- The book definition
- A useful metaphor
Multithreading leads to non-deterministic execution: you don't know exactly when a certain piece of parallel code will run.
Given that, this wonderful multithreading tutorial defines thread safety like this:
Thread-safe code is code which has no indeterminacy in the face of any multithreading scenario. Thread-safety is achieved primarily with locking, and by reducing the possibilities for interaction between threads.
This means no matter how the threads are run in particular, the behaviour is always well-defined (and therefore free from race conditions).
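A minimal sketch of that idea (purely illustrative): locking makes the final count well-defined under any interleaving.

public class LockedCounter {
    private int count = 0;

    synchronized void increment() { count++; } // one thread at a time
    synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        LockedCounter c = new LockedCounter();
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(c.get()); // always 200000: no indeterminacy left
    }
}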
Eric Lippert says:
When I'm asked "is this code thread safe?" I always have to push back and ask "what are the exact threading scenarios you are concerned about?" and "exactly what is correct behaviour of the object in every one of those scenarios?".
It is unhelpful to say that code is "thread safe" without somehow communicating what undesirable behaviors the utilized thread safety mechanisms do and do not prevent.
G'day,
A good place to start is to have a read of the POSIX paper on thread safety.
Edit: Just the first few paragraphs give you a quick overview of thread safety and re-entrant code.
HTH
cheers,
I may be wrong, but one of the criteria for being thread-safe is to use local variables only. Using global variables can have undefined results if the same function is called from different threads.
A thread safe function / object (hereafter referred to as an object) is an object which is designed to support multiple concurrent calls. This can be achieved by serialization of the parallel requests or some sort of support for intertwined calls.
Essentially, if the object safely supports concurrent requests (from multiple threads), it is thread safe. If it is not thread safe, multiple concurrent calls could corrupt its state.
Consider a log book in a hotel. If a person is writing in the book and another person comes along and starts to write their message concurrently, the end result will be a mix of both messages. This can also be demonstrated by several threads writing to the same output stream.
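A toy Java rendering of the log book (illustrative only): each single character write is atomic, but the messages as a whole interleave.

public class LogBookDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable guest = () -> {
            for (char c : "hello from a guest\n".toCharArray()) {
                System.out.print(c); // one char at a time, no lock around the message
            }
        };
        Thread t1 = new Thread(guest), t2 = new Thread(guest);
        t1.start(); t2.start();
        t1.join(); t2.join(); // the two messages typically come out mixed together
    }
}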
To understand thread safety, I would say start by understanding the difference between a thread-safe function and a reentrant function.
Please check The difference between thread-safety and re-entrancy for details.
Thread-safe code is code that won't fail because the same data was changed in two places at once. Thread-safe is a narrower concept than concurrency-safe, because it presumes that it was in fact two threads of the same program involved, rather than (say) hardware modifying the data, or the OS.
A particularly valuable aspect of the term is that it lies on a spectrum of concurrent behavior, where thread-safe is the strongest constraint, interrupt-safe is weaker, and reentrant weaker still.
In the case of thread-safe, this means that the code in question conforms to a consistent API and uses resources in such a way that other code in a different thread (such as another, concurrent instance of itself) will not cause an inconsistency, so long as it also conforms to the same usage pattern. The usage pattern MUST be specified for any reasonable expectation of thread safety to be had.
The interrupt-safe constraint doesn't normally appear in modern userland code, because the operating system does a pretty good job of hiding it; in kernel mode, however, it is pretty important. It means that the code will complete successfully even if an interrupt is triggered during its execution.
The last one, reentrant, is almost guaranteed by all modern languages, in and out of userland; it just means that a section of code may be entered more than once, even if execution has not yet proceeded out of that section from an earlier call. This can happen with recursive function calls, for instance. It's very easy to violate the language-provided reentrancy by accessing a shared global state variable inside such a section.
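A small made-up Java example of exactly that violation: state that belongs in a local variable is kept in a shared global, so the recursive re-entry clobbers it.

public class NonReentrant {
    static int sharedScratch; // shared global state: this is what breaks reentrancy

    static int sumTo(int n) {
        sharedScratch = n;           // stashed in a global instead of a local
        if (n == 0) return 0;
        int rest = sumTo(n - 1);     // re-enters and overwrites sharedScratch
        return sharedScratch + rest; // sees the innermost call's 0, not this call's n
    }

    public static void main(String[] args) {
        System.out.println(sumTo(3)); // prints 0, not the expected 1 + 2 + 3 = 6
    }
}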