Understanding mutex behviour - multithreading

I was thinking mutex in Go would lock the data and won't allow read/write by any other goroutine unless the fist goroutine releases the lock. It seems like my understanding was wrong. The only way to block read/write from other goroutine is to call lock in other goroutines as well. This would ensure critical section is accessed by one and only one goroutine.
So, I would expect this code to have a deadlock:
package main
import(
"fmt"
"sync"
)
type myMap struct {
m map[string]string
mutex sync.Mutex
}
func main() {
done := make(chan bool)
ch := make(chan bool)
myM := &myMap{
m: make(map[string]string),
}
go func() {
myM.mutex.Lock()
myM.m["x"] = "i"
fmt.Println("Locked. Won't release the Lock")
ch <- true
}()
go func() {
<- ch
fmt.Println("Trying to write to the myMap")
myM.m["a"] = "b"
fmt.Println(myM)
done <- true
}()
<- done
}
Since the fist goroutine locks the struct, I would expect the second goroutine to fail to read/write to the struct but that not happening here.
If I will add mux.Lock() in second goroutine then there will be a deadlock.
I find it a little weird the way mutex works in Go. If I lock then Go shouldn't allow any other goroutine to read/write to it.
Can someone explain to me the mutex concept in Go?

There's no magical force field that surrounds a mutex, protecting any datastructure it happens to be embedded in. If you lock a mutex, it prevents other code from locking it until it's unlocked. Nothing more, nothing less. It's well documented in the sync package.
So in your code, where there's exactly one myM.mutex.Lock(), the effect is the same as if there was no mutex.
A correct use of a mutex that protects data involves locking the mutex before updating or reading the data, and then unlocking it afterwards. Often this code will be wrapped in a function so that defer can be used:
func doSomething(myM *myMap) {
myM.mutex.Lock()
defer myM.mutex.Unlock()
... read or update myM
}

Related

Peterson's algorithm and deadlock

I am trying to experiment with some mutual execution algorithms. I have implemented the Peterson's algorithm. It prints the correct counter value but sometimes it seems just like some kind of a deadlock had occurred which stalls the execution indefinitely. This should not be possible since this algorithm is deadlock free.
PS: Is this related to problems with compiler optimizations often mentioned when addressing the danger of "benign" data races? If this is the case then how to disable such optimizations?
PPS: When atomically storing/loading the victim field, the problem seems to disappear which makes the compiler's optimizations more suspicious
package main
import (
"fmt"
"sync"
)
type mutex struct {
flag [2]bool
victim int
}
func (m *mutex) lock(id int) {
m.flag[id] = true // I'm interested
m.victim = id // you can go before me if you want
for m.flag[1-id] && m.victim == id {
// while the other thread is inside the CS
// and the victime was me (I expressed my interest after the other one already did)
}
}
func (m *mutex) unlock(id int) {
m.flag[id] = false // I'm not intersted anymore
}
func main() {
var wg sync.WaitGroup
var mu mutex
var cpt, n = 0, 100000
for i := 0; i < 2; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
for j := 0; j < n; j++ {
mu.lock(id)
cpt = cpt + 1
mu.unlock(id)
}
}(i)
}
wg.Wait()
fmt.Println(cpt)
}
There is no "benign" data race. Your program has data race, and the behavior is undefined.
At the core of the problem is the mutex implementation. Modifications made to a shared object from one goroutine are not necessarily observable from others until those goroutines communicate using one of the synchronization primitives. You are writing to mutex.victim from multiple goroutines, and won't be observed. You are also reading the mutex.flag elements written by other goroutines, and won't necessarily be seen. That is, there may be cases where the for-loop won't terminate even if the other goroutine changes the variables.
And since the mutex implementation is broken, the updates to cpt will not necessarily be correct either.
To implement this correctly, you need the sync/atomic package.
See the Go Memory Model: https://go.dev/ref/mem
For Peterson's algorithm (same goes for Dekker), you need to ensure that your code is sequential consistent. In Go you can do that using atomics. This will prevent the compiler and the hardware to mess things up.

Go Memory Model Happens Before (Channels with Shared State)

I'm trying to more fully understand the nature of the Happens-Before relationship between channels and other shared state. Specifically, I want to see if some kind of memory fence is created on a channel send and receive operation.
For example, if I send a message on a channel, do all other operations surrounding modification of shared state "happen before" the send/receive operation. In my particular example, I'm only ever writing from a single go routine and then reading from a single go routine.
(Aside: the obvious answer in the example below is to put an instance of the Person struct on the channel directly, but that's not what I'm asking.)
package main
func main() {
channel := make(chan int, 128)
go func() {
person := &sharedState[0]
person.Name = "Hello, World!"
channel <- 0
}()
index := <-channel
person := sharedState[index]
if person.Name != "Hello, World!" {
// unintended race condition
}
}
type Person struct{ Name string }
var sharedState = make([]Person, 1024)
The memory model guarantees that when the channel write operation executes, all operations in that goroutine that comes before the channel operation are visible. So in your example, the "unintended race condition" cannot happen, because when the channel is read, the assignment happened in the goroutine is visible. This, of course, assumes that there isn't another goroutine that's writing to that same variable. If there was another goroutine writing to that same variable, then you would need to synchronize that goroutine as well to avoid the race.

Self-Synchronizing Goroutines end up with Deadlock

I have a stress test issue that I want to solve with simple synchronization in Go. So far I have tried to find documenation on my specific usecase regarding synchronization in Go, but didn't find anything that fits.
To be a bit more specific:
I must fulfill a task where I have to start a large amount of threads (in this example only illustrated with two threads) in the main routine. All of the initiated workers are supposed to prepare some initialization actions by themselves in unordered manner. Until they reach a small sequence of commands, which I want them to be executed by all goroutines at once, which is why I want to self-synchronize the goroutines with each other. It is very vital for my task that the delay through the main routine, which instantiates all other goroutines, does not affect the true parallelism of the workers execution (at the label #maximum parallel in the comment). For this purpose I do initialize a wait group with the amount of running goroutines in the main routine and pass it over to all routines so they can synchronize each others workflow.
The code looks similar to this example:
import sync
func worker_action(wait_group *sync.WaitGroup) {
// ...
// initialization
// ...
defer wait_group.Done()
wait_group.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wait_group sync.WaitGroup
wait_group.Add(numThreads)
for i := 0; i < numThreads; i++ {
go worker_action(&wait_group)
}
// ...
}
Unfortunately my setup runs into a deadlock, as soon as all goroutines have reached the Wait instruction (labeled with #wait in the comment). This is true for any amount of threads that I start with the main routine (even two threads are caught in a deadlock within no time).
From my point of view a deadlock should not occur, due to the fact that immediately before the wait instruction each goroutine executes the done function on the same wait group.
Do I have a wrong understanding of how wait groups work? Is it for instance not allowed to execute the wait function inside of a goroutine other than the main routine? Or can someone give me a hint on what else I am missing?
Thank you very much in advance.
EDIT:
Thanks a lot #tkausl. It was indeed the unnecessary "defer" that caused the problem. I do not know how I could not see it myself.
There are several issues in your code. First the form. Idiomatic Go should use camelCase. wg is a better name for the WaitGroup.
But more important is the use where your code is waiting. Not inside your Goroutines. It should wait inside the main func:
func workerAction(wg *sync.WaitGroup) {
// ...
// initialization
// ...
defer wg.Done()
// wg.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wg sync.WaitGroup
wg.Add(numThreads)
for i := 0; i < numThreads; i++ {
go workerAction(&wg)
}
wg.Wait() // you need to wait here
// ...
}
Again thanks #tkausl. The issue was resolved by removing the unnecessary "defer" instruction from the line that was meant to let the worker goroutines increment the number of finished threads.
I.e. "defer wait_group.Done()" -> "wait_group.Done()"

How to safely interact with channels in goroutines in Golang

I am new to go and I am trying to understand the way channels in goroutines work. To my understanding, the keyword range could be used to iterate over a the values of the channel up until the channel is closed or the buffer runs out; hence, a for range c will repeatedly loops until the buffer runs out.
I have the following simple function that adds value to a channel:
func main() {
c := make(chan int)
go printchannel(c)
for i:=0; i<10 ; i++ {
c <- i
}
}
I have two implementations of printchannel and I am not sure why the behaviour is different.
Implementation 1:
func printchannel(c chan int) {
for range c {
fmt.Println(<-c)
}
}
output: 1 3 5 7
Implementation 2:
func printchannel(c chan int) {
for i:=range c {
fmt.Println(i)
}
}
output: 0 1 2 3 4 5 6 7 8
And I was expecting neither of those outputs!
Wanted output: 0 1 2 3 4 5 6 7 8 9
Shouldnt the main function and the printchannel function run on two threads in parallel, one adding values to the channel and the other reading the values up until the channel is closed? I might be missing some fundamental go/thread concept here and pointers to that would be helpful.
Feedback on this (and my understanding to channels manipulation in goroutines) is greatly appreciated!
Implementation 1. You're reading from the channel twice - range c and <-c are both reading from the channel.
Implementation 2. That's the correct approach. The reason you might not see 9 printed is that two goroutines might run in parallel threads. In that case it might go like this:
main goroutine sends 9 to the channel and blocks until it's read
second goroutine receives 9 from the channel
main goroutine unblocks and exits. That terminates whole program which doesn't give second goroutine a chance to print 9
In case like that you have to synchronize your goroutines. For example, like so
func printchannel(c chan int, wg *sync.WaitGroup) {
for i:=range c {
fmt.Println(i)
}
wg.Done() //notify that we're done here
}
func main() {
c := make(chan int)
wg := sync.WaitGroup{}
wg.Add(1) //increase by one to wait for one goroutine to finish
//very important to do it here and not in the goroutine
//otherwise you get race condition
go printchannel(c, &wg) //very important to pass wg by reference
//sync.WaitGroup is a structure, passing it
//by value would produce incorrect results
for i:=0; i<10 ; i++ {
c <- i
}
close(c) //close the channel to terminate the range loop
wg.Wait() //wait for the goroutine to finish
}
As to goroutines vs threads. You shouldn't confuse them and probably should understand the difference between them. Goroutines are green threads. There're countless blog posts, lectures and stackoverflow answers on that topic.
In implementation 1, range reads into channel once, then again in Println. Hence you're skipping over 2, 4, 6, 8.
In both implementations, once the final i (9) has been sent to goroutine, the program exits. Thus goroutine does not have the time to print out 9. To solve it, use a WaitGroup as has been mentioned in the other answer, or a done channel to avoid semaphore/mutex.
func main() {
c := make(chan int)
done := make(chan bool)
go printchannel(c, done)
for i:=0; i<10 ; i++ {
c <- i
}
close(c)
<- done
}
func printchannel(c chan int, done chan bool) {
for i := range c {
fmt.Println(i)
}
done <- true
}
The reason your first implementation only returns every other number is because you are, in effect "taking" from c twice each time the loop runs: first with range, then again with <-. It just happens that you're not actually binding or using the first value taken off the channel, so all you end up printing is every other one.
An alternative approach to your first implementation would be to not use range at all, e.g.:
func printchannel(c chan int) {
for {
fmt.Println(<-c)
}
}
I could not replicate the behavior of your second implementation, on my machine, but the reason for that is that both of your implementations are racy - they will terminate whenever main ends, regardless of what data may be pending in a channel or however many goroutines may be active.
As a closing note, I'd warn you not to think about goroutines as explicitly being "threads", though they have a similar mental model and interface. In a simple program like this it's not at all unlikely that Go might just do it all using a single OS thread.
Your first loop does not work as you have 2 blocking channel receivers and they do not execute at the same time.
When you call the goroutine the loop starts, and it waits for the first value to be sent to the channel. Effectively think of it as <-c .
When the for loop in the main function runs it sends 0 on the Chan. At this point the range c recieves the value and stops blocking the execution of the loop.
Then it is blocked by the reciever at fmt.println(<-c) . When 1 is sent on the second iteration of the loop in main the recieved at fmt.println(<-c) reads from the channel, allowing fmt.println to execute thus finishing the loop and waiting for a value at the for range c .
Your second implementation of the looping mechanism is the correct one.
The reason it exits before printing to 9 is that after the for loop in main finishes the program goes ahead and completes execution of main.
In Go func main is launched as a goroutine itself while executing. Thus when the for loop in main completes it goes ahead and exits, and as the print is within a parallel goroutine that is closed, it is never executed. There is no time for it to print as there is nothing to block main from completing and exiting the program.
One way to solve this is to use wait groups http://www.golangprograms.com/go-language/concurrency.html
In order to get the expected result you need to have a blocking process running in main that provides enough time or waits for confirmation of the execution of the goroutine before allowing the program to continue.

Reading values from a different thread

I'm writing software in Go that does a lot of parallel computing. I want to collect data from worker threads and I'm not really sure how to do it in a safe way. I know that I could use channels but in my scenario they make it more complicated since I have to somehow synchronize messages (wait until every thread sent something) in the main thread.
Scenario
The main thread creates n Worker instances and launches their work() method in a goroutine so that the workers each run in their own thread. Every 10 seconds the main thread should collect some simple values (e.g. iteration count) from the workers and print a consolidated statistic.
Question
Is it safe to read values from the workers? The main thread will only read values and each individual thread will write it's own values. It would be ok if the values are a few nanoseconds off while reading.
Any other ideas on how to implement this in an easy way?
In Go no value is safe for concurrent access from multiple goroutines without synchronization if at least one of the accesses is a write. Your case meets the conditions listed, so you must use some kind of synchronization, else the behavior would be undefined.
Channels are used if goroutine(s) want to send values to another. Your case is not exactly this: you don't want your workers to send updates in every 10 seconds, you want your main goroutine to fetch status in every 10 seconds.
So in this example I would just protect the data with a sync.RWMutex: when the workers want to modify this data, they have to acquire a write lock. When the main goroutine wants to read this data, it has to acquire a read lock.
A simple implementation could look like this:
type Worker struct {
iterMu sync.RWMutex
iter int
}
func (w *Worker) Iter() int {
w.iterMu.RLock()
defer w.iterMu.RUnlock()
return w.iter
}
func (w *Worker) setIter(n int) {
w.iterMu.Lock()
w.iter = n
w.iterMu.Unlock()
}
func (w *Worker) incIter() {
w.iterMu.Lock()
w.iter++
w.iterMu.Unlock()
}
Using this example Worker, the main goroutine can fetch the iteration using Worker.Iter(), and the worker itself can change / update the iteration using Worker.setIter() or Worker.incIter() at any time, without any additional synchronization. The synchronization is ensured by the proper use of Worker.iterMu.
Alternatively for the iteration counter you could also use the sync/atomic package. If you choose this, you may only read / modify the iteration counter using functions of the atomic package like this:
type Worker struct {
iter int64
}
func (w *Worker) Iter() int64 {
return atomic.LoadInt64(&w.iter)
}
func (w *Worker) setIter(n int64) {
atomic.StoreInt64(&w.iter, n)
}
func (w *Worker) incIter() {
atomic.AddInt64(&w.iter, 1)
}

Resources