How is it possible for some threads to never execute?

int x = 0; // global shared variable
T1: for (i = 0; i < 100; i++) x++;
T2: x++;   // no loop, just one increment
T1 and T2 are separate threads. I am told the final value of x can be anything from 1 to 101. How is this possible? In particular, I am wondering how it could possibly end up as just 1.
Obviously, something fails in the execution sequence, but I'm wondering what.

x++ is not an atomic operation (at least in most languages); it actually works like this:
tmp = x;
tmp = tmp + 1;
x = tmp;
Now assume the following execution order:
T2: tmp = x; // tmp is 0
T1: run all loop iterations, finally x is 100
T2: tmp = tmp+1; x = tmp; // x is 1
To get any other number, imagine this order instead:
T1: started loop, at some point x is 45
T2: tmp = x; // tmp is 45
T1: finished loop, x is 100
T2: tmp = tmp+1; x = tmp; // x is 46

The reason for this behavior is memory caching. Since the threads can be executed on independent CPUs, the following situation is possible:
T1: loads x value
T2: loads x value
T1: runs loop 10 times (T1_x=10)
T2: increments (T2_x=1)
T1: saves value 10 to memory
T2: saves value 1 to memory
This is why you need thread synchronization. You can read more here: Mutex example / tutorial?
Thanks to @Lashane for the correction.
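For example, here is a minimal sketch of what the synchronized version could look like (using C++ std::thread and std::mutex purely for illustration; the question's code is language-agnostic and the names below are my own):

#include <iostream>
#include <mutex>
#include <thread>

int x = 0;        // global shared variable
std::mutex m;     // protects x

int main() {
    std::thread t1([] {
        for (int i = 0; i < 100; i++) {
            std::lock_guard<std::mutex> guard(m); // only one thread may touch x at a time
            x++;
        }
    });
    std::thread t2([] {
        std::lock_guard<std::mutex> guard(m);
        x++; // the single increment
    });
    t1.join();
    t2.join();
    std::cout << x << std::endl; // always 101 once the increments are serialized
}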

Related

Multiply numbers from two iterators in order and without duplicates

I have this code and I want every combination to be multiplied:
fn main() {
    let min = 1;
    let max = 9;
    for i in (min..=max).rev() {
        for j in (min..=max).rev() {
            println!("{}", i * j);
        }
    }
}
Result is something like:
81
72
[...]
9
72
64
[...]
8
6
4
2
9
8
7
6
5
4
3
2
1
Is there a clever way to produce the results in descending order (without collecting and sorting) and without duplicates?
Note that this answer provides a solution for this specific problem (multiplication table) but the title asks a more general question (any two iterators).
The naive solution of storing all elements in a vector and then sorting it uses O(n^2 log n) time and O(n^2) space (where n is the size of the multiplication table).
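For reference, a quick sketch of that naive approach (my own illustration, separate from the heap-based solution that follows): collect every product, sort descending, then drop the duplicates, which end up adjacent after sorting.

fn main() {
    let n = 9;
    // Collect every entry of the multiplication table: O(n^2) space.
    let mut products: Vec<i32> = (1..=n)
        .flat_map(|i| (1..=n).map(move |j| i * j))
        .collect();
    // Sort descending: O(n^2 log n) time.
    products.sort_unstable_by(|a, b| b.cmp(a));
    // Duplicates are adjacent after sorting, so dedup removes all of them.
    products.dedup();
    for p in products {
        println!("{p}");
    }
}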
You can use a priority queue to reduce the memory to O(n):
use std::collections::BinaryHeap;

fn main() {
    let n = 9;
    let mut heap = BinaryHeap::new();
    for j in 1..=n {
        heap.push((9 * j, j));
    }
    let mut last = n * n + 1;
    while let Some((val, j)) = heap.pop() {
        if val < last {
            println!("{val}");
            last = val;
        }
        if val > j {
            heap.push((val - j, j));
        }
    }
}
playground.
The conceptual idea behind the algorithm is to consider 9 separate sequences
9*9, 9*8, 9*7, .., 9*1
8*9, 8*8, 8*7, .., 8*1
...
1*9, 1*8, 1*7, .., 1*1
Since they are all decreasing, at a given moment, we only need to consider one element of each sequence (the largest one we haven't reached yet).
These are inserted into the priority queue which allows us to efficiently find the maximum one.
Once we have printed a given element, we move on to the next one in its sequence and insert that into the priority queue.
By keeping track of the last element printed we can avoid duplicates.

Understanding shadowing in Rust

I'm on a journey learning Rust at my own pace. I'm primarily a C++ and Python programmer.
Instead of just following the manual, I try to do weird things in the language and understand the behaviour.
I'm experimenting with the concept of shadowing variables in Rust. Can I get an understanding of why this program prints 1 continuously?
fn main() {
    let counter = 0;
    let counter = loop {
        let counter = counter + 1;
        println!("{}", counter);
        if counter == 10 {
            break counter * 2;
        }
    };
    println!("The result is {}", counter);
}
Here is what I thought would occur. I haven't tried it in a debugger yet, though.
1. counter is initialized and assigned 0.
2. The loop is run:
   2.1 counter will be incremented by 1.
   2.2 it will print 1.
   2.3 check if counter is 10; if so, break with counter * 2, which is returned out of the loop.
   2.4 else, just continue...
3. counter is now 20.
4. print "The result is 20"
I'd expect to see:
1
2
3
4
5
6
7
8
9
10
The result is 20
What I get:
1
1
1
1
1
1
1
1
1
.. and so on
Is it that shadowing reinitializes the named variable to 0? That is what seems to occur here.
In
let counter = 0;
let counter = loop {
    let counter = counter + 1;
    ...
};
At each iteration you start from a new scope, so the previous iteration's binding no longer exists. The right-hand side of let counter = counter + 1; therefore always refers to the outer counter, which is always 0.
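If the goal is the behaviour you expected, one possible rewrite (my own sketch, not part of the original answer) is to mutate a single mutable binding instead of shadowing it inside the loop:

fn main() {
    let mut counter = 0;
    let result = loop {
        counter += 1;              // updates the same binding on every iteration
        println!("{}", counter);
        if counter == 10 {
            break counter * 2;     // the loop evaluates to 20
        }
    };
    println!("The result is {}", result);
}

This prints 1 through 10 and then "The result is 20", matching the output you expected.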

Can I determine the result of a data race without reading the value?

I'm trying to better understand lock-free programming:
Suppose we have two threads in a data race:
// Thread 1
x = 1
// Thread 2
x = 2
Is there a lock-free way a third thread can know the result of the race without being able to read x?
Suppose thread 3 consumes a lock-free queue, and the code is:
// Thread 1
x = 1
queue.push(1)
// Thread 2
x = 2
queue.push(2)
Then the operations could be ordered as:
x = 1
x = 2
queue.push(1)
queue.push(2)
or
x = 1
x = 2
queue.push(2)
queue.push(1)
So having a lock-free queue alone would not suffice for thread 3 to know the value of x after the race.
If you know the value of x before the race began, the following code using atomic Read-Modify-Write operations should do the job.
// Notes:
// x == 0 initially
// x and winner are both atomic
// atomic_swap swaps the contents of two variables atomically,
// meaning that no other thread can interfere with this operation

// thread-1:
t = 1;
atomic_swap(x, t);
if (t != 0) {
    // x was non-zero when thread-1 called the swap operation
    // --> thread-2 was faster
    winner = 1;
}

// thread-2:
t = 2;
atomic_swap(x, t);
if (t != 0) {
    // x was non-zero when thread-2 called the swap operation
    // --> thread-1 was faster
    winner = 2;
}

// thread-3:
while (winner == 0) {}
print("Winner is " + winner);
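As one possible concretization of that pseudocode (my own sketch, assuming C++; here atomic_swap corresponds to std::atomic<int>::exchange, and the spin-wait mirrors thread-3 above):

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> x{0};       // the raced-over variable, 0 before the race
std::atomic<int> winner{0};  // 0 means "not decided yet"

void racer(int id) {
    // exchange() atomically stores `id` into x and returns the previous value.
    int prev = x.exchange(id);
    if (prev != 0) {
        // x was already non-zero: the other thread wrote first, so this
        // thread's write is the last one and decides the final value of x.
        winner.store(id);
    }
}

int main() {
    std::thread t1(racer, 1);
    std::thread t2(racer, 2);
    std::thread t3([] {
        while (winner.load() == 0) {} // spin, as in the pseudocode above
        std::cout << "Winner is " << winner.load() << std::endl;
    });
    t1.join();
    t2.join();
    t3.join();
}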

Go thread deadlock error - what is the correct way to use go routines?

I am writing a program that calculates a Riemann sum based on user input. The program will split the function into 1000 rectangles (yes, I know I haven't gotten that math in there yet) and sum them up and return the answer. I am using goroutines to compute the 1000 rectangles, but I am getting:
fatal error: all goroutines are asleep - deadlock!
What is the correct way to handle multiple goroutines? I have been looking around and haven't seen an example that resembles my case. I'm new and want to adhere to standards. Here is my code (it is runnable if you'd like to see what a typical use case of this is, however it does break):
package main

import "fmt"
import "time"

//Data type to hold 'part' of function; ie. "4x^2"
type Pair struct {
    coef, exp int
}

//Calculates the y-value of a 'part' of the function and writes this to the channel
func calc(c *chan float32, p Pair, x float32) {
    val := x
    //Raise our x value to the power, contained in 'p'
    for i := 1; i < p.exp; i++ {
        val = val * val
    }
    //Read existing answer from channel
    ans := <-*c
    //Write new value to the channel
    *c <- float32(ans + (val * float32(p.coef)))
}

var c chan float32    //Channel
var m map[string]Pair //Map to hold function 'parts'

func main() {
    c = make(chan float32, 1001) //Buffered at 1001
    m = make(map[string]Pair)
    var counter int
    var temp_coef, temp_exp int
    var check string
    var up_bound, low_bound float32
    var delta float32
    counter = 1
    check = "default"
    //Loop through as long as we have more function 'parts' to enter
    for check != "n" {
        fmt.Print("Enter the coefficient for term ", counter, ": ")
        fmt.Scanln(&temp_coef)
        fmt.Print("Enter the exponent for term ", counter, ": ")
        fmt.Scanln(&temp_exp)
        fmt.Print("Do you have more terms to enter (y or n): ")
        fmt.Scanln(&check)
        fmt.Println("")
        //Put data into our map
        m[string(counter)] = Pair{temp_coef, temp_exp}
        counter++
    }
    fmt.Print("Enter the lower bound: ")
    fmt.Scanln(&low_bound)
    fmt.Print("Enter the upper bound: ")
    fmt.Scanln(&up_bound)
    //Calculate the delta; ie. our x delta for the riemann sum
    delta = (float32(up_bound) - float32(low_bound)) / float32(1000)
    //Make our go routines here to add
    for i := low_bound; i < up_bound; i = i + delta {
        //'counter' is indicative of the number of function 'parts' we have
        for j := 1; j < counter; j++ {
            //Go routines made here
            go calc(&c, m[string(j)], i)
        }
    }
    //Wait for the go routines to finish
    time.Sleep(5000 * time.Millisecond)
    //Read the result?
    ans := <-c
    fmt.Print("Answer: ", ans)
}
It deadlocks because both the calc() and the main() functions read from the channel before anything has been written to it.
So you will end up having every (non-main) goroutine blocking at:
ans := <-*c
waiting for some other goroutine to put a value into the channel. Therefore none of them gets to the next line, where they would actually write to the channel. And the main() routine will block at:
ans := <-c
Everyone is waiting = deadlock.
Using buffered channels
Your solution should have the calc() function only writing to the channel, while main() could read from it in a for-range loop, summing up the values coming from the goroutines.
You will also need to add a way for main() to know when there will be no more values arriving, perhaps by using a sync.WaitGroup (maybe not the best, since main isn't supposed to wait but rather sum things up) or an ordinary counter; a stripped-down sketch of the counter variant follows below.
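Here is that sketch (my own illustration, with hard-coded inputs and the rectangle-width factor of a real Riemann sum omitted): calc only sends, and main is the only reader, so an ordinary counter tells it exactly how many values to expect.

package main

import "fmt"

// calc evaluates one term (coef * x^exp) at a single sample point and only
// writes its result to the channel; it never reads from it.
func calc(c chan<- float32, coef, exp int, x float32) {
    val := float32(1)
    for i := 0; i < exp; i++ {
        val *= x
    }
    c <- val * float32(coef)
}

func main() {
    c := make(chan float32)
    jobs := 0
    // Hard-coded example: evaluate 4x^2 at 1000 sample points in [0, 1).
    for i := 0; i < 1000; i++ {
        go calc(c, 4, 2, float32(i)*0.001)
        jobs++
    }
    var sum float32
    // main is the only reader and knows exactly how many values to expect,
    // so a plain counter replaces the time.Sleep; no WaitGroup is needed.
    for i := 0; i < jobs; i++ {
        sum += <-c
    }
    fmt.Println("Answer:", sum)
}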
Using shared memory
Sometimes it is not necessarily a channel you need. Having a shared value that you update with the sync/atomic package (atomic add doesn't work on floats, though) or protect with a sync.Mutex works fine too.

Does multithreading conflict with Map in F#?

let len = 25000000
let map = Map.ofArray [| for i = 1 to len do yield (i, i + 1) |]
let maparr = [| map; map; map; map |]

let f1 i =
    for i1 = 1 to len do
        let l1 = maparr.[i-1].Item(i1)
        ()

let index = [| 1..4 |]
let _ = index |> Array.Parallel.map f1
printf "done"
I found that only one core is working at full speed with the code above. But what I expect is all four threads working together with a high level of CPU usage. So it seems multithreading conflicts with Map, am I right? If not, how can I achieve my initial goal? Thank you in advance.
So I think you were tripping a heuristic where the library assumed that, with only a small number of tasks, it would be fastest to just use a single thread.
This code maxes out all threads on my computer:
let len = 1000000
let map = Map.ofArray [| for i = 1 to len do yield (i, i + 1) |]
let maparr = [| map; map; map; map |]

let f1 (m: Map<_,_>) =
    let mutable sum = 0
    for i1 = 1 to len do
        let l1 = m.Item(i1)
        for i = 1 to 10000 do
            sum <- sum + 1
    printfn "%i" sum

let index = [| 1..40 |]
printfn "starting"
index |> Array.map (fun t -> maparr.[(t-1)/10]) |> Array.Parallel.iter f1
printf "done"
Important changes:
Reduced len significantly. In your code, almost all the time was spent creating the matrix.
Actually do work in the loop. In your code, it is possible that the loop was optimised to a no-op.
Run many more tasks. This tricked the scheduler into using more threads, and all is good.
