I'm trying to demonstrate the problem of using an ordinary Map with multiple concurrent tasks. The following example (which compiles and runs) is intended to show the Map failing:
import java.util.*;
import java.util.stream.*;
import java.util.concurrent.*;
class BreakMap2 implements Runnable {
private Map<Integer, Integer> map;
public BreakMap2(Map<Integer, Integer> map) {
this.map = map;
}
@Override
public void run() {
while(true) {
int key = ThreadLocalRandom.current().nextInt(10_000);
if(map.containsKey(key)) {
assert map.get(key) == key;
}
map.put(key, key);
}
}
}
public class MapBreaker2 {
public static void main(String[] args) {
Map<Integer, Integer> map = new HashMap<>();
IntStream.range(0, 1000)
.mapToObj(i -> new BreakMap2(map))
.map(CompletableFuture::runAsync)
.collect(Collectors.toList())
.forEach(CompletableFuture::join);
}
}
This doesn't demonstrate the problem (it doesn't fail). How can I do this more effectively? Is there an approach that will fail quickly and reliably?
To clarify, I'm trying to show how it's unsafe to have multiple tasks writing to a Map that is not designed for concurrent use. I'm trying to create something that will show an incorrect write to a Map because of concurrent access.
Edit: I've simplified the example so now it just runs forever until you hit Control-C. What I'd like instead is for a failure to stop the program.
It’s not clear which kind of “incorrect write” you are trying to provoke.
There is a broad variety of problems that can occur with concurrent updates of a data structure. One infamous problem with unsynchronized concurrent updates of HashMap, which indeed shows up in real applications, is HashMap.get being stuck in an infinite loop but, of course, your program which runs infinitely anyway wouldn’t be able to spot that.
The only thing you are testing is the value of a stored Integer with assert map.get(key) == key; This doesn’t test the object identity (otherwise it was doomed to fail due to the unspecified object identity of auto-boxed values), but rather the contained int value, which benefits from guarantees made for final fields, even if their owner object is published via a data race. So you are unable to see an uninitialized value here and it’s hard to imagine any scenario in which you could encounter a wrong value.
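To make the point about the comparison concrete, here is a small sketch. The class and method names are made up for illustration. It shows that comparing a boxed Integer with an int unboxes the Integer and compares numeric values, whereas comparing two boxed Integers with == compares references, whose result is only guaranteed for the small cached range (-128 to 127).

```java
class BoxedCompareDemo {
    // Integer == int: the Integer is unboxed, so this is a numeric comparison,
    // just like "map.get(key) == key" with an int key.
    static boolean equalsByValue(Integer boxed, int primitive) {
        return boxed == primitive;
    }

    // Integer == Integer: a reference comparison. Outside the cached range
    // (-128..127) the result is unspecified, so code must not rely on it.
    static boolean sameObject(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        Integer stored = 5_000; // autoboxed, like the values put into the map
        int key = 5_000;
        System.out.println("value comparison: " + equalsByValue(stored, key));
        System.out.println("equals():         " + stored.equals(key));
        System.out.println("identity (unspecified): " + sameObject(stored, 5_000));
    }
}
```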
Since you are storing immutable objects whose values are immune to data races, only the structural updates to the Map itself may have an effect. But these are rare. You are filling the Map with random keys from 0 to 9999 in an infinite loop, so once all ten thousand distinct keys in that range are encountered, there will be no structural change anymore. Chances are good that the first asynchronous task will reach that state even before the next task starts its work. Even if there is a short overlapping phase, the likelihood of encountering a data race right in that time window is low.
After that short time window, you are only replacing the values of existing mappings, as said, with objects immune to data races and, since they represent the same boxed values, the JVM may even optimize away the entire update.
If you want a program with a higher likelihood of failing, you may try the following. It performs a simple naive transfer from one Map to another, probing an entry and putting it into the target map if it exists. It looks simple and will indeed run smoothly with a thread count of one, but will fail badly in most environments when using any other thread count.
public class MapBreaker2 {
public static void main(String[] args) throws InterruptedException {
int threadCount = 2; // try varying that number
Map<Integer, Integer> source = IntStream.range(0, 10_000)
.boxed().collect(Collectors.toMap(i->i, i->i));
System.out.println("trying to copy "+source.size()+" mappings without synchronizing");
Map<Integer, Integer> target = new HashMap<>();
Callable<?> job=() -> {
while(!source.isEmpty()) {
int key = ThreadLocalRandom.current().nextInt(10_000);
Integer value=source.remove(key);
if(value!=null)
target.put(key, value);
}
return null;
};
ExecutorService pool = Executors.newCachedThreadPool();
pool.invokeAll(Collections.nCopies(threadCount, job));
pool.shutdown();
System.out.println(target.size());
assert source.isEmpty();
assert target.size()==10_000;
}
}
But it should be emphasized that multi-threading without synchronization still is unpredictable, so it may run without a noticeable error in one or another test run…
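For contrast, here is a minimal sketch of the same transfer using ConcurrentHashMap for both maps. The class name SafeTransferDemo and the method transfer are hypothetical. Because ConcurrentHashMap.remove is atomic, each key is handed to exactly one thread, so the target should end up with every mapping regardless of the thread count.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;

class SafeTransferDemo {
    // Moves `entries` mappings from one concurrent map to another using
    // `threadCount` threads and returns the size of the target map.
    static int transfer(int entries, int threadCount) {
        Map<Integer, Integer> source = new ConcurrentHashMap<>();
        for (int i = 0; i < entries; i++) {
            source.put(i, i);
        }
        Map<Integer, Integer> target = new ConcurrentHashMap<>();

        Callable<Void> job = () -> {
            while (!source.isEmpty()) {
                int key = ThreadLocalRandom.current().nextInt(entries);
                Integer value = source.remove(key); // atomic: one thread wins
                if (value != null) {
                    target.put(key, value);
                }
            }
            return null;
        };

        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            // invokeAll blocks until every job has completed.
            pool.invokeAll(Collections.nCopies(threadCount, job));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return target.size();
    }

    public static void main(String[] args) {
        System.out.println(transfer(10_000, 4));
    }
}
```

Swapping both maps back to HashMap turns this into the failing version again, which makes the pair useful as a before/after demonstration.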
Just trying to understand threads and race conditions and how they affect the expected output. In the code below, I once had an output that began with
"2 Thread-1" then "1 Thread-0" .... How could such an output happen? What I understand is as follows:
Step1:Assuming Thread 0 started, it incremented counter to 1,
Step2: Before printing it, Thread 1 incremented it to 2 and printed it,
Step3: Thread 0 prints counter which should be 2 but is printing 1.
How could Thread 0 print counter as 1 when Thread 1 already incremented it to 2?
P.S: I know that the synchronized keyword could deal with such race conditions, but I just want to understand the concepts first.
public class Counter {
static int count=0;
public void add(int value) {
count=count+value;
System.out.println(count+" "+ Thread.currentThread().getName());
}
}
public class CounterThread extends Thread {
Counter counter;
public CounterThread(Counter c) {
counter=c;
}
public void run() {
for(int i=0;i<5;i++) {
counter.add(1);
}
}
}
public class Main {
public static void main(String args[]) {
Counter counter= new Counter();
Thread t1= new CounterThread(counter);
Thread t2= new CounterThread(counter);
t1.start();
t2.start();
}
}
How could Thread 0 print counter as 1 when Thread 1 already incremented it to 2?
There's a lot more going on in these two lines than meets the eye:
count=count+value;
System.out.println(count+" "+ Thread.currentThread().getName());
First of all, the compiler doesn't know anything about threads. Its job is to emit code that will achieve the same end result when executed in a single thread. That is, when all is said and done, the count must be incremented, and the message must be printed.
The compiler has a lot of freedom to re-order operations, and to store values in temporary registers in order to ensure that the correct end result is achieved in the most efficient way possible. So, for example, the count in the expression count+" "+... will not necessarily cause the compiler to fetch the latest value of the global count variable. In fact it probably will not fetch from the global variable because it knows that the result of the + operation still is sitting in a CPU register. And, since it doesn't acknowledge that other threads could exist, then it knows that there's no way that the value in the register could be any different from what it stored into the global variable after doing the +.
Second of all, the hardware itself is allowed to stash values in temporary places and re-order operations for efficiency, and it too is allowed to assume that there are no other threads. So, even when the compiler emits code that says to actually fetch from or store to the global variable instead of to or from a register, the hardware does not necessarily store to or fetch from the actual address in memory.
Assuming your code example is Java code, then all of that changes when you make appropriate use of synchronized blocks. If you would add synchronized to the declaration of your add method for example:
public synchronized void add(int value) {
count=count+value;
System.out.println(count+" "+ Thread.currentThread().getName());
}
That forces the compiler to acknowledge the existence of other threads, and the compiler will emit instructions that force the hardware to acknowledge other threads as well.
By adding synchronized to the add method, you force the hardware to deliver the actual value of the global variable on entry to the method, you force it to actually write the global by the time the method returns, and you prevent more than one thread from being in the method at the same time.
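As a rough, self-contained sketch of that effect (the class name and the runTwoThreads helper are invented here, and an instance field is used instead of the static one from the question), two threads can hammer a synchronized add method and the final total still comes out exact:

```java
class SynchronizedCounterDemo {
    private int count = 0;

    // Only one thread at a time can be inside add() for a given instance, and
    // each thread sees the writes made by the previous holder of the lock.
    synchronized int add(int value) {
        count = count + value;
        return count;
    }

    // Runs two threads that each call add(1) the given number of times and
    // returns the final count.
    static int runTwoThreads(int incrementsPerThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.add(1);
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.add(0); // read the final value under the same lock
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000));
    }
}
```

Removing the synchronized keyword from add would allow lost updates, so the returned total could then be lower than expected on some runs.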
I tried making a class that extends thread which simply takes an array of strings and prints the first 2 strings out alternately for 10000 iterations. I keep track of the index to print from using an AtomicInteger (counter), however the output sometimes prints this:
hello
hello
hello
w
hello
hello
etc.
instead of alternating at each iteration. Why is this and how could I fix it without just putting 'synchronized' in the run method?
public class MyThreadDelegate implements Runnable {
List<String> words;
AtomicInteger counter = new AtomicInteger(0);
public MyThreadDelegate(List<String> words) {
this.words = words;
}
@Override
public void run() {
for (int i = 0; i < 10000; i++) {
System.out.println(words.get(counter.getAndIncrement()%2) + counter.get());
}
}
public static void main(String[] args) {
MyThreadDelegate myThreadDelegate = new MyThreadDelegate(Arrays.asList("hello", "w"));
Thread t1 = new Thread(myThreadDelegate);
Thread t2 = new Thread(myThreadDelegate);
t1.start();
t2.start();
}
}
While the numbers are retrieved one by one, the rest of the method isn't synced up. So sometimes this might happen:
t1: gets value 0 from the counter
t2: gets value 1 from the counter
t2: prints w
t1: prints hello
A quick fix would be to put the whole System.out line in a synchronized block, but that would not guarantee that the threads take turns. It just guarantees that each value is retrieved, incremented and printed before the next one.
If you want to have the threads actually take turns, you'll have to implement some sort of locking. But if you want to have the threads wait for each other, why are you using multiple threads?
EDIT: also, you should probably have MyThread implement Runnable instead of extending Thread if you're going to use it this way. See this link for more: https://www.baeldung.com/java-runnable-vs-extending-thread (Solomon Slow beat me to it :)
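The quick fix described above might look roughly like the following sketch. The names AlternatingOutputDemo and run are hypothetical, and the lines are collected in a list rather than printed so that the order can be checked afterwards; a System.out.println call would go in the same place inside the synchronized block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AlternatingOutputDemo {
    // Two threads share one counter. Fetching the index and emitting the line
    // happen inside one synchronized block, so lines come out in counter order
    // even though the threads themselves do not strictly take turns.
    static List<String> run(List<String> words, int iterationsPerThread) {
        AtomicInteger counter = new AtomicInteger(0);
        List<String> output = new ArrayList<>(); // guarded by lock
        Object lock = new Object();

        Runnable task = () -> {
            for (int i = 0; i < iterationsPerThread; i++) {
                synchronized (lock) {
                    int index = counter.getAndIncrement(); // read the counter once
                    output.add(words.get(index % words.size()) + index);
                }
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;
    }

    public static void main(String[] args) {
        run(Arrays.asList("hello", "w"), 5).forEach(System.out::println);
    }
}
```

Note that the counter is read only once per iteration here; calling counter.get() a second time, as in the question, would read a value that another thread may already have changed.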
For a JMH class, the number of threads is limited to 1 via @Threads(1).
However, when I get the number of threads using Thread.activeCount(), it shows that there are 2 threads.
The simplified version of the code is below:
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Threads(1)
public class MyBenchmark {
@State(Scope.Benchmark)
public static class BState {
@Setup(Level.Trial)
public void initTrial() {
}
@TearDown(Level.Trial)
public void tearDownTrial(){
}
}
@Benchmark
public List<Integer> get(BState state) {
System.out.println("Thread number: " + Thread.activeCount());
...
List<byte[]> l = new ArrayList<byte[]>(state.dict.get(k));
...
}
}
The benchmark tries to look up a value in the dictionary by its key. However, when 2 threads exist, the lookup fails and the list l ends up empty ([]).
Why is the key not found? This is the reason I limit the number of threads to 1.
Thread.activeCount() returns an estimate of the number of active threads in the current thread's thread group and its subgroups, not the number of benchmark threads. Using that to divide the work between the benchmark threads is dangerous because of this fundamental disconnect. ThreadParams may help to get the benchmark thread indexes, if needed, see the relevant example.
If you want a more conclusive answer, you need to provide an MCVE that clearly highlights your problem.
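The disconnect can be seen without JMH at all. The following plain-Java sketch (class and method names are invented for illustration) starts a single helper thread and then asks Thread.activeCount(); the result counts every live thread in the caller's thread group, so it is at least 2 even though only one extra thread was started by the code, and it may be higher if the runtime or a test harness has threads of its own in that group.

```java
import java.util.concurrent.CountDownLatch;

class ActiveCountDemo {
    // Starts one helper thread, waits until it is running, and returns what
    // Thread.activeCount() reports while the helper is still alive.
    static int activeCountWithOneHelper() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread helper = new Thread(() -> {
            started.countDown();
            try {
                release.await(); // stay alive until the caller has counted
            } catch (InterruptedException ignored) {
                // nothing to clean up in this sketch
            }
        }, "helper");
        helper.setDaemon(true);
        helper.start();
        try {
            started.await();
            return Thread.activeCount(); // the calling thread plus the helper, at least
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Thread.activeCount() = " + activeCountWithOneHelper());
    }
}
```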
I have code listed here: Threading and Sockets.
The answer to that question was to modify isListening with volatile. As I remarked, that modifier allowed me to access the variable from another thread. After reading MSDN, I realized that I was reading isListening from the following newly created thread process.
So, my questions now:
Is volatile the preferred method, since I am basically making a non-thread safe request on a variable? I have read about the Interlocked class and wondered if this was something that would be better to use in my code. Interlocked looks similar to what lock(myObj) is doing - but with a little more 'flair' and control. I do know that simply applying a lock(myObj) code block around isListening did not work.
Should I implement the Interlocked class?
Thank you for your time and responses.
If all you are doing is reading and writing a variable across multiple threads in C#, then you do not have to worry about synchronizing access to (locking) that variable, provided its type is bool, char, byte, sbyte, short, ushort, int, uint, float, or a reference type. See here for details.
In the example from your other post, the reason you have to mark the field as volatile is to ensure that it is not subject to compiler optimizations and that the most current value is present in the field at all times. See here for details on the volatile keyword. Doing this allows that field to be read and written across threads without having to lock (synchronize access to) it. But keep in mind, the volatile keyword can only be used for your field because it is of type bool. Had it been a double, for example, the volatile keyword wouldn't work, and you'd have to use a lock.
The Interlocked class is used for a specialized purpose, namely incrementing, decrementing, and exchanging values of (typically) numeric types. When performed with the ordinary operators, these operations are not atomic. For example, if you are incrementing a value in one thread and trying to read the resulting value in another thread, you would normally have to lock the variable to prevent reading intermediate results. The Interlocked class simply provides some convenience functions so you don't have to lock the variable yourself while the increment operation is performed.
What you are doing with the isListening flag does not require use of the Interlocked class. Marking the field as volatile is sufficient.
Edit due to lunchtime rushed answer..
The lock statement used in your previous code is locking an object instance that is created in the scope of a method so it will have no effect on another thread calling into the same method. Each thread must be able to lock the same instance of an object in order to synchronise access to the given block of code. One way to do this (depending on the semantics you require) is to make the locking object a private static variable of the class that it is used in. This will allow multiple instances of a given object to synchronise access to a block of code or a single shared resource. If synchronisation is required for individual instances of an object or a resource that is instance specific then static should be omitted.
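The same principle can be sketched in Java, where synchronized (obj) plays the role of C#'s lock (obj). This is an analogue rather than the code from the question, and all names in it are hypothetical. Locking a fresh object created inside the method excludes nobody, because every call locks a different object; locking one shared static object makes the increments mutually exclusive.

```java
class SharedLockDemo {
    private static final Object SHARED_LOCK = new Object();
    private static int total;

    // Broken: each call creates and locks its own object, so no two threads
    // ever contend on the same lock and increments can be lost.
    static void incrementWithLocalLock() {
        Object localLock = new Object();
        synchronized (localLock) {
            total++;
        }
    }

    // Correct: every thread locks the same object.
    static void incrementWithSharedLock() {
        synchronized (SHARED_LOCK) {
            total++;
        }
    }

    // Runs two threads that each increment `timesPerThread` times and returns
    // the resulting total.
    static int run(boolean useSharedLock, int timesPerThread) {
        total = 0;
        Runnable work = () -> {
            for (int i = 0; i < timesPerThread; i++) {
                if (useSharedLock) {
                    incrementWithSharedLock();
                } else {
                    incrementWithLocalLock();
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("shared lock: " + run(true, 1_000_000));
        System.out.println("local lock:  " + run(false, 1_000_000)); // may be lower
    }
}
```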
Volatile doesn't guarantee that reads or writes to the given variable will be atomic amongst different threads. It is a compiler hint to preserve ordering of instructions and prevents the variable from being cached inside a register. In general, unless you are working on something extremely performance sensitive (low locking / lock free algorithms, data structures etc.) or really know what you are doing, I would opt for using Interlocked. The performance difference between using volatile / interlocked / lock in most applications will be negligible, so if you are unsure it's best to use whatever gives you the safest guarantee (read Joe Duffy's blog & book).
For example, using volatile in the example below is not thread safe and the incremented counter does not reach 10,000,000 (when I ran the test it reached 8848450). This is because volatile only guarantees reading the latest value (e.g. not cached in a register); it does not make the read-modify-write of count++ atomic. When using Interlocked the operation is thread safe and the counter does reach 10,000,000.
public class Incrementor
{
private volatile int count;
public int Count
{
get { return count; }
}
public void UnsafeIncrement()
{
count++;
}
public void SafeIncrement()
{
Interlocked.Increment(ref count);
}
}
[TestFixture]
public class ThreadingTest
{
private const int fiveMillion = 5000000;
private const int tenMillion = 10000000;
[Test]
public void UnsafeCountShouldNotCountToTenMillion()
{
const int iterations = fiveMillion;
Incrementor incrementor = new Incrementor();
Thread thread1 = new Thread(() => UnsafeIncrement(incrementor, iterations));
Thread thread2 = new Thread(() => UnsafeIncrement(incrementor, iterations));
thread1.Start();
thread2.Start();
thread1.Join();
thread2.Join();
Assert.AreEqual(tenMillion, incrementor.Count);
}
[Test]
public void SafeIncrementShouldCountToTenMillion()
{
const int iterations = fiveMillion;
Incrementor incrementor = new Incrementor();
Thread thread1 = new Thread(() => SafeIncrement(incrementor, iterations));
Thread thread2 = new Thread(() => SafeIncrement(incrementor, iterations));
thread1.Start();
thread2.Start();
thread1.Join();
thread2.Join();
Assert.AreEqual(tenMillion, incrementor.Count);
}
private void UnsafeIncrement(Incrementor incrementor, int times)
{
for (int i =0; i < times; ++i)
incrementor.UnsafeIncrement();
}
private void SafeIncrement(Incrementor incrementor, int times)
{
for (int i = 0; i < times; ++i)
incrementor.SafeIncrement();
}
}
If you search for 'interlocked volatile' you will find a number of answers to your question. The one below for example addresses your question:
A simple example below shows
Volatile vs. Interlocked vs. lock
"One way to do this is to make the locking object a private static variable of the class that it is used in."
Why should it be static? You can access the same function from multiple threads as long as they work on different object. I am not saying that it would not work, but would seriously slow the speed of the application without any advantages. Or am I missing something?
And here is what MSDN says about volatiles:
"Also, when optimizing, the compiler must maintain ordering among references to volatile objects as well as references to other global objects. In particular,
A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.
A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.
This allows volatile objects to be used for memory locks and releases in multithreaded applications."
I think that NUnit isn't functioning properly when threading code is involved:
Here's the sample code:
public class multiply
{
public Thread myThread;
public int Counter
{
get;
private set;
}
public string name
{
get;
private set;
}
private Object thisLock = new Object();
public void RunConsolePrint()
{
//lock(thisLock)
//{
Console.WriteLine("Now thread " + name + " has started");
for (int i = 1; i<= Counter; i++)
{
Console.WriteLine(name + ": count has reached " + i+ ": total count is "+Counter);
}
Console.WriteLine("Thread " + name + " has finished");
//}
}
public multiply(string pname, int pCounter)
{
name = pname;
Counter = pCounter;
myThread = new Thread(new ThreadStart(RunConsolePrint));
}
}
And here's the test code:
[Test]
public void Main()
{
counter=100;
multiply m2=new multiply("Second", counter);
multiply m1 = new multiply("First", counter);
m1.myThread.Start();
m2.myThread.Start();
}
And the output is a sequential run of m1 and m2, which means that the loop in m1 always executes before the one in m2, at least that's what my testing shows. I ran the tests a few times and I always get this.
Is this a bug? Or an expected behavior?
If I copy the above code to a console program and run, I can see the threading effect clearly.
I am running the test using the TestDriven.net runner.
The exact interleaving of two or more threads cannot really be predicted.
There are a couple of factors to consider for your example. First of all each thread will execute for its quantum before (potentially) getting switched to another thread. You can't expect thread switches to happen on any special place in your code. I.e. one thread might finish before the other one is started (especially since the task is relatively short in your case).
Secondly, since you're writing to the console, the threads are synchronized on that access. This also affects the interleaving.
The result also depends on the number of available cores on your machine (as well as the general load on the machine when you run the code).
In short, you cannot predict how the two threads will run.
It's non-deterministic whether m1 or m2 starts executing first. If the counter executes fast enough, I wouldn't be surprised to see one of them finish before the other starts. Change the count to something very large (e.g. a million) and I'm pretty sure you'll see separate threads executing concurrently.
Any result is possible - what result are you seeing that suggests there's a bug?