How to safely interact with channels in goroutines in Golang


I am new to Go and I am trying to understand how channels work with goroutines. To my understanding, the keyword range can be used to iterate over the values of a channel until the channel is closed or the buffer runs out; hence, a for range c will loop repeatedly until the buffer runs out.
I have the following simple function that adds values to a channel:
func main() {
c := make(chan int)
go printchannel(c)
for i:=0; i<10 ; i++ {
c <- i
}
}
I have two implementations of printchannel and I am not sure why the behaviour is different.
Implementation 1:
func printchannel(c chan int) {
for range c {
fmt.Println(<-c)
}
}
output: 1 3 5 7
Implementation 2:
func printchannel(c chan int) {
for i:=range c {
fmt.Println(i)
}
}
output: 0 1 2 3 4 5 6 7 8
And I was expecting neither of those outputs!
Wanted output: 0 1 2 3 4 5 6 7 8 9
Shouldn't the main function and the printchannel function run on two threads in parallel, one adding values to the channel and the other reading the values until the channel is closed? I might be missing some fundamental Go/thread concept here, and pointers to that would be helpful.
Feedback on this (and on my understanding of channel manipulation in goroutines) is greatly appreciated!

Implementation 1. You're reading from the channel twice: range c and <-c both receive from the channel.
Implementation 2. That's the correct approach. The reason you might not see 9 printed is that the two goroutines may run on parallel threads. In that case it might go like this:
main goroutine sends 9 to the channel and blocks until it's read
second goroutine receives 9 from the channel
main goroutine unblocks and exits. That terminates the whole program, which doesn't give the second goroutine a chance to print 9
In a case like that you have to synchronize your goroutines. For example, like so:
func printchannel(c chan int, wg *sync.WaitGroup) {
for i:=range c {
fmt.Println(i)
}
wg.Done() //notify that we're done here
}
func main() {
c := make(chan int)
wg := sync.WaitGroup{}
wg.Add(1) //increase by one to wait for one goroutine to finish
//very important to do it here and not in the goroutine
//otherwise you get race condition
go printchannel(c, &wg) //very important to pass a pointer to wg
//sync.WaitGroup is a struct, passing it
//by value would copy it and produce incorrect results
for i:=0; i<10 ; i++ {
c <- i
}
close(c) //close the channel to terminate the range loop
wg.Wait() //wait for the goroutine to finish
}
As for goroutines vs. threads: you shouldn't confuse them, and it's worth understanding the difference. Goroutines are green threads, scheduled by the Go runtime onto OS threads. There are countless blog posts, lectures and Stack Overflow answers on that topic.

In implementation 1, range receives from the channel once, then Println receives again. Hence you're skipping over 0, 2, 4, 6, 8.
In both implementations, once the final i (9) has been sent to the goroutine, the program exits. Thus the goroutine does not have time to print 9. To solve it, use a WaitGroup as mentioned in the other answer, or a done channel to avoid a semaphore/mutex.
func main() {
c := make(chan int)
done := make(chan bool)
go printchannel(c, done)
for i:=0; i<10 ; i++ {
c <- i
}
close(c)
<- done
}
func printchannel(c chan int, done chan bool) {
for i := range c {
fmt.Println(i)
}
done <- true
}

The reason your first implementation only returns every other number is because you are, in effect "taking" from c twice each time the loop runs: first with range, then again with <-. It just happens that you're not actually binding or using the first value taken off the channel, so all you end up printing is every other one.
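A minimal sketch of that double receive, assuming a buffered channel that is filled and closed up front so the result is deterministic (skipEveryOther is a made-up name, not something from the question):

```go
package main

import "fmt"

// skipEveryOther mimics Implementation 1: the range clause receives one
// value and discards it, then <-c receives the next one.
func skipEveryOther(c chan int) []int {
	var printed []int
	for range c {
		v, ok := <-c
		if !ok {
			break // channel was closed between the two receives
		}
		printed = append(printed, v)
	}
	return printed
}

func main() {
	c := make(chan int, 10) // buffered so main can fill it up front
	for i := 0; i < 10; i++ {
		c <- i
	}
	close(c)
	fmt.Println(skipEveryOther(c)) // [1 3 5 7 9]
}
```

Here 9 shows up because nothing exits early; in the question the program ends before the goroutine gets to print it.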
An alternative approach to your first implementation would be to not use range at all, e.g.:
func printchannel(c chan int) {
for {
fmt.Println(<-c)
}
}
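One thing to watch with the bare for loop above: it never returns, and once c is closed, <-c yields the zero value immediately on every iteration. A sketch of the same idea using the two-value receive form, so the loop stops when the channel is closed (drain is a hypothetical name, and the sender here is only for illustration):

```go
package main

import "fmt"

// drain receives until c is closed, using the two-value receive form
// instead of range. It returns what it received so callers can check it.
func drain(c <-chan int) []int {
	var got []int
	for {
		v, ok := <-c
		if !ok {
			return got // c is closed and empty
		}
		fmt.Println(v)
		got = append(got, v)
	}
}

func main() {
	c := make(chan int)
	go func() {
		defer close(c) // closing lets drain return
		for i := 0; i < 3; i++ {
			c <- i
		}
	}()
	drain(c) // runs in main, so the program cannot exit before printing
}
```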
I could not replicate the behavior of your second implementation, on my machine, but the reason for that is that both of your implementations are racy - they will terminate whenever main ends, regardless of what data may be pending in a channel or however many goroutines may be active.
As a closing note, I'd warn you not to think about goroutines as explicitly being "threads", though they have a similar mental model and interface. In a simple program like this it's not at all unlikely that Go might just do it all using a single OS thread.

Your first loop does not work because you have 2 blocking channel receivers, and they do not execute at the same time.
When you start the goroutine, the loop starts and waits for the first value to be sent to the channel. Effectively, think of it as <-c.
When the for loop in the main function runs, it sends 0 on the channel. At this point range c receives the value and stops blocking the execution of the loop.
Then it is blocked by the receiver at fmt.Println(<-c). When 1 is sent on the second iteration of the loop in main, the receiver at fmt.Println(<-c) reads from the channel, allowing fmt.Println to execute, thus finishing the loop iteration and going back to waiting for a value at for range c.
Your second implementation of the looping mechanism is the correct one.
The reason it exits before printing 9 is that after the for loop in main finishes, the program goes ahead and completes execution of main.
In Go, func main itself runs as a goroutine. Thus when the for loop in main completes, main goes ahead and exits, and because the print happens in a separate goroutine that is terminated along with the program, it is never executed. There is no time for it to print, as nothing blocks main from completing and exiting the program.
One way to solve this is to use wait groups: http://www.golangprograms.com/go-language/concurrency.html
In order to get the expected result, main needs to block until the goroutine confirms it has finished (or at least wait long enough) before allowing the program to exit.

Related

Self-Synchronizing Goroutines end up with Deadlock

I have a stress test issue that I want to solve with simple synchronization in Go. So far I have tried to find documentation on my specific use case regarding synchronization in Go, but didn't find anything that fits.
To be a bit more specific:
I must fulfill a task where I have to start a large number of threads (illustrated here with only two) from the main routine. All of the started workers are supposed to perform some initialization on their own, in no particular order, until they reach a short sequence of commands that I want all goroutines to execute at once, which is why I want the goroutines to synchronize with each other. It is vital for my task that the delay in the main routine, which starts all other goroutines, does not affect the true parallelism of the workers' execution (at the label #maximum parallel in the comment). For this purpose I initialize a wait group with the number of running goroutines in the main routine and pass it to all routines so they can synchronize with each other.
The code looks similar to this example:
import sync
func worker_action(wait_group *sync.WaitGroup) {
// ...
// initialization
// ...
defer wait_group.Done()
wait_group.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wait_group sync.WaitGroup
wait_group.Add(numThreads)
for i := 0; i < numThreads; i++ {
go worker_action(&wait_group)
}
// ...
}
Unfortunately my setup runs into a deadlock, as soon as all goroutines have reached the Wait instruction (labeled with #wait in the comment). This is true for any amount of threads that I start with the main routine (even two threads are caught in a deadlock within no time).
From my point of view a deadlock should not occur, due to the fact that immediately before the wait instruction each goroutine executes the done function on the same wait group.
Do I have a wrong understanding of how wait groups work? Is it for instance not allowed to execute the wait function inside of a goroutine other than the main routine? Or can someone give me a hint on what else I am missing?
Thank you very much in advance.
EDIT:
Thanks a lot @tkausl. It was indeed the unnecessary "defer" that caused the problem. I do not know how I could not see it myself.

There are several issues in your code. First, the form: idiomatic Go uses camelCase, and wg is a better name for the WaitGroup.
More important is where your code waits: not inside your goroutines. It should wait inside the main func:
func workerAction(wg *sync.WaitGroup) {
// ...
// initialization
// ...
defer wg.Done()
// wg.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wg sync.WaitGroup
wg.Add(numThreads)
for i := 0; i < numThreads; i++ {
go workerAction(&wg)
}
wg.Wait() // you need to wait here
// ...
}

Again, thanks @tkausl. The issue was resolved by removing the unnecessary "defer" from the line that was meant to let the worker goroutines increment the number of finished threads.
I.e. "defer wait_group.Done()" -> "wait_group.Done()"
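For illustration, here is a sketch of that fix used as a start barrier: each worker calls Done right away (not deferred) and then Wait, so all workers are released together. The second WaitGroup (finished), the counter and the name runWorkers are my additions so the example can be run and checked; they are not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n workers that each finish their own setup, call
// Done, and then Wait until every other worker has also called Done.
// It returns how many workers made it past the barrier.
func runWorkers(n int) int {
	var barrier sync.WaitGroup  // start barrier shared by the workers
	var finished sync.WaitGroup // lets the caller wait for the workers
	var mu sync.Mutex
	passed := 0

	barrier.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			// ... per-worker initialization ...
			barrier.Done() // not deferred: it must run before Wait
			barrier.Wait() // released once all n workers called Done
			// ... section that should start together ...
			mu.Lock()
			passed++
			mu.Unlock()
		}()
	}
	finished.Wait()
	return passed
}

func main() {
	fmt.Println(runWorkers(8)) // 8
}
```

With the deferred Done from the question, every worker would sit in Wait while its own Done is still pending, which is exactly the deadlock described.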

waitgroup on subset of go routines

I have a situation wherein the main goroutine will create "x" goroutines, but it is interested in only "y" (y < x) of them finishing.
I was hoping to use a WaitGroup. But a WaitGroup only allows me to wait on all goroutines. I cannot, for example, do this:
1. wg.Add(y)
2. Create "x" goroutines. These routines will call wg.Done() when finished.
3. wg.Wait()
This panics when the (y+1)th goroutine calls wg.Done() because the wg counter goes negative.
I sure can use channels to solve this, but I am interested in whether WaitGroup solves this.

As noted in Adrian's answer, sync.WaitGroup is a simple counter whose Wait method will block until the counter value reaches zero. It is intended to allow you to block (or join) on a number of goroutines before allowing a main flow of execution to proceed.
The interface of WaitGroup is not sufficiently expressive for your usecase, nor is it designed to be. In particular, you cannot use it naïvely by simply calling wg.Add(y) (where y < x). The call to wg.Done by the (y+1)th goroutine will cause a panic, as it is an error for a wait group to have a negative internal value. Furthermore, we cannot be "smart" by observing the internal counter value of the WaitGroup; this would break an abstraction and, in any event, its internal state is not exported.
Implement your own!
You can implement the relevant logic yourself using some channels, per the code below. Observe from the console that 10 goroutines are started, but after two have completed, we fall through and continue execution in the main function.
package main
import (
"fmt"
"time"
)
// Set goroutine counts here
const (
// The number of goroutines to spawn
x = 10
// The number of goroutines to wait for completion
// (y <= x) must hold.
y = 2
)
func doSomeWork() {
// do something meaningful
time.Sleep(time.Second)
}
func main() {
// Accumulator channel, used by each goroutine to signal completion.
// It is buffered to ensure the [y+1, ..., x) goroutines do not block
// when sending to the channel, which would cause a leak. It will be
// garbage collected when all goroutines end and the channel falls
// out of scope. We receive y values, so only need capacity to receive
// (x-y) remaining values.
accChan := make(chan struct{}, x-y)
// Spawn "x" goroutines
for i := 0; i < x; i += 1 {
// Wrap our work function with the local signalling logic
go func(id int, doneChan chan<- struct{}) {
fmt.Printf("starting goroutine #%d\n", id)
doSomeWork()
fmt.Printf("goroutine #%d completed\n", id)
// Communicate completion of goroutine
doneChan <- struct{}{}
}(i, accChan)
}
for doneCount := 0; doneCount < y; doneCount += 1 {
<-accChan
}
// Continue working
fmt.Println("Carrying on without waiting for more goroutines")
}
Avoid leaking resources
As this does not wait for the [y+1, ..., x) goroutines to complete, you should take special care in the doSomeWork function to remove or minimize the risk that the work can block indefinitely, which would also cause a leak. Remove, where possible, the feasibility of indefinite blocking on I/O (including channel operations) or falling into infinite loops.
You could use a context to signal to the additional goroutines when their results are no longer required to have them break out of execution.
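A rough sketch of the context idea, assuming the work can be interrupted at a select. The function name waitForFirst and the timer standing in for real work are made up for this example:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForFirst starts x workers, returns after y of them have reported
// completion, and cancels the context so the rest can stop early.
func waitForFirst(x, y int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // tells the remaining workers to give up

	done := make(chan int, x) // capacity x: no worker ever blocks on send
	for i := 0; i < x; i++ {
		go func(id int) {
			select {
			case <-time.After(time.Duration(id+1) * 10 * time.Millisecond):
				done <- id // simulated work finished
			case <-ctx.Done():
				// result no longer needed; return without reporting
			}
		}(i)
	}

	count := 0
	for count < y {
		<-done
		count++
	}
	return count
}

func main() {
	fmt.Println(waitForFirst(10, 2)) // 2
}
```

Real work would need to check ctx.Done() (or pass ctx on to blocking calls) at suitable points; a goroutine that never looks at the context cannot be stopped this way.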

WaitGroup doesn't actually wait on goroutines, it waits until its internal counter reaches zero. If you only Add() the number of goroutines you care about, and you only call Done() in those goroutines you care about, then Wait() will only block until those goroutines you care about have finished. You are in complete control of the logic and flow, there are no restrictions on what WaitGroup "allows".
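To illustrate, a small sketch where x goroutines are started but only y of them are registered with the WaitGroup. It assumes you know up front which goroutines you care about (here simply the first y); runTracked and the counter are hypothetical names added so the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runTracked starts x goroutines but registers only the first y with
// the WaitGroup. It returns how many tracked goroutines had finished
// when Wait returned.
func runTracked(x, y int) int64 {
	var wg sync.WaitGroup
	var trackedDone int64

	wg.Add(y) // count only the goroutines we care about
	for i := 0; i < x; i++ {
		tracked := i < y // assumption: the interesting ones are known up front
		go func() {
			// ... work ...
			if tracked {
				atomic.AddInt64(&trackedDone, 1)
				wg.Done() // untracked goroutines never touch wg
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&trackedDone)
}

func main() {
	fmt.Println(runTracked(10, 3)) // 3
}
```

This does not cover the "any y out of x, whichever finish first" case, since Done would then be called more than y times.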

Are these y specific go-routines that you are trying to track, or any y out of the x? What are the criteria?
Update:
1. If you have control over the criteria used to pick the matching y goroutines:
You can call wg.Add(1) and wg.Done() from inside the goroutine based on your condition, by passing the WaitGroup as a pointer argument into the goroutine, if your condition can't be checked outside the goroutine.
Something like below sample code. Will be able to be more specific if you provide more details of what you are trying to do.
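// condition1 is a placeholder for your own criterion; x is the number of goroutines.
func sampleGoroutine(z int, b string, wg *sync.WaitGroup) {
	defer func() {
		if condition1 {
			wg.Done()
		}
	}() // the deferred closure must be called
	if condition1 {
		wg.Add(1) // note: Add inside the goroutine can race with wg.Wait() in main
		// do stuff
	}
}
func main() {
	wg := sync.WaitGroup{}
	for i := 0; i < x; i++ {
		go sampleGoroutine(1, "one", &wg)
	}
	wg.Wait()
}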
2. If you have no control over which ones, and just want the first y:
Based on your comment, that you have no control/desire to pick any specific goroutines, but the ones that finish first. If you would want to do it in a generic way, you can use the below custom waitGroup implementation that fits your use case. (It's not copy-safe, though. Also doesn't have/need wg.Add(int) method)
type CountedWait struct {
wait chan struct{}
limit int
}
func NewCountedWait(limit int) *CountedWait {
return &CountedWait{
wait: make(chan struct{}, limit),
limit: limit,
}
}
func (cwg *CountedWait) Done() {
cwg.wait <- struct{}{}
}
func (cwg *CountedWait) Wait() {
count := 0
for count < cwg.limit {
<-cwg.wait
count += 1
}
}
Which can be used as follows:
func sampleGoroutine(z int, b string, wg *CountedWait) {
success := false
defer func() {
if success == true {
fmt.Printf("goroutine %d finished successfully\n", z)
wg.Done()
}
}()
fmt.Printf("goroutine %d started\n", z)
time.Sleep(time.Second)
if rand.Intn(10)%2 == 0 {
success = true
}
}
func main() {
x := 10
y := 3
wg := NewCountedWait(y)
for i := 0; i < x; i += 1 {
// Wrap our work function with the local signalling logic
go sampleGoroutine(i, "something", wg)
}
wg.Wait()
fmt.Printf("%d out of %d goroutines finished successfully.\n", y, x)
}
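One caveat with the CountedWait type above, as far as I can tell: Done does a blocking send, so if many more than limit goroutines call Done, the later ones can block forever once the buffer is full and Wait has returned. A possible variation (my own sketch, not the answer's code) is to make Done non-blocking with select and default:

```go
package main

import "fmt"

// CountedWait mirrors the type above, but with a non-blocking Done.
type CountedWait struct {
	wait  chan struct{}
	limit int
}

func NewCountedWait(limit int) *CountedWait {
	return &CountedWait{wait: make(chan struct{}, limit), limit: limit}
}

// Done records a completion. If the buffer is already full, at least
// limit signals are pending, so the extra one can be dropped safely.
func (cwg *CountedWait) Done() {
	select {
	case cwg.wait <- struct{}{}:
	default:
	}
}

// Wait blocks until limit completions have been received.
func (cwg *CountedWait) Wait() {
	for i := 0; i < cwg.limit; i++ {
		<-cwg.wait
	}
}

func main() {
	cw := NewCountedWait(3)
	for i := 0; i < 10; i++ {
		cw.Done() // with a blocking send, the 4th call here would deadlock
	}
	cw.Wait()
	fmt.Println("done")
}
```

Dropping a signal is safe here because it only happens while limit signals are already buffered, which is enough for Wait to finish.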
3. You can also combine a context with approach 2 to ensure that the remaining goroutines don't leak
You may not be able to run this on play.golang, as it has some long sleeps.
Below is a sample output:
(note that, there may be more than y=3 goroutines marking Done, but you are only waiting till 3 finish)
goroutine 9 started
goroutine 0 started
goroutine 1 started
goroutine 2 started
goroutine 3 started
goroutine 4 started
goroutine 5 started
goroutine 5 marking done
goroutine 6 started
goroutine 7 started
goroutine 7 marking done
goroutine 8 started
goroutine 3 marking done
continuing after 3 out of 10 goroutines finished successfully.
goroutine 9 will be killed, bcz cancel
goroutine 8 will be killed, bcz cancel
goroutine 6 will be killed, bcz cancel
goroutine 1 will be killed, bcz cancel
goroutine 0 will be killed, bcz cancel
goroutine 4 will be killed, bcz cancel
goroutine 2 will be killed, bcz cancel
Play links
https://play.golang.org/p/l5i6X3GClBq
https://play.golang.org/p/Bcns0l9OdFg
https://play.golang.org/p/rkGSLyclgje

Reading values from a different thread

I'm writing software in Go that does a lot of parallel computing. I want to collect data from worker threads and I'm not really sure how to do it in a safe way. I know that I could use channels but in my scenario they make it more complicated since I have to somehow synchronize messages (wait until every thread sent something) in the main thread.
Scenario
The main thread creates n Worker instances and launches their work() method in a goroutine so that the workers each run in their own thread. Every 10 seconds the main thread should collect some simple values (e.g. iteration count) from the workers and print a consolidated statistic.
Question
Is it safe to read values from the workers? The main thread will only read values and each individual thread will write its own values. It would be OK if the values are a few nanoseconds off while reading.
Any other ideas on how to implement this in an easy way?

In Go no value is safe for concurrent access from multiple goroutines without synchronization if at least one of the accesses is a write. Your case meets the conditions listed, so you must use some kind of synchronization, else the behavior would be undefined.
Channels are used if goroutine(s) want to send values to another. Your case is not exactly this: you don't want your workers to send updates every 10 seconds, you want your main goroutine to fetch the status every 10 seconds.
So in this example I would just protect the data with a sync.RWMutex: when the workers want to modify this data, they have to acquire a write lock. When the main goroutine wants to read this data, it has to acquire a read lock.
A simple implementation could look like this:
type Worker struct {
iterMu sync.RWMutex
iter int
}
func (w *Worker) Iter() int {
w.iterMu.RLock()
defer w.iterMu.RUnlock()
return w.iter
}
func (w *Worker) setIter(n int) {
w.iterMu.Lock()
w.iter = n
w.iterMu.Unlock()
}
func (w *Worker) incIter() {
w.iterMu.Lock()
w.iter++
w.iterMu.Unlock()
}
Using this example Worker, the main goroutine can fetch the iteration using Worker.Iter(), and the worker itself can change / update the iteration using Worker.setIter() or Worker.incIter() at any time, without any additional synchronization. The synchronization is ensured by the proper use of Worker.iterMu.
Alternatively for the iteration counter you could also use the sync/atomic package. If you choose this, you may only read / modify the iteration counter using functions of the atomic package like this:
type Worker struct {
iter int64
}
func (w *Worker) Iter() int64 {
return atomic.LoadInt64(&w.iter)
}
func (w *Worker) setIter(n int64) {
atomic.StoreInt64(&w.iter, n)
}
func (w *Worker) incIter() {
atomic.AddInt64(&w.iter, 1)
}

Can go channel keep a value for multiple reads [duplicate]

This question already has answers here:
Multiple goroutines listening on one channel
(7 answers)
Closed 5 years ago.
I understand the regular behavior of a channel is that it empties after a read. Is there a way to keep an unbuffered channel value for multiple reads without the value being removed from the channel?
For example, I have a goroutine that generates a single piece of data for multiple downstream goroutines to use. I don't want to have to create multiple channels or use a buffered channel, which would require me to duplicate the source data (I don't even know how many copies I will need). Effectively, I want to be able to do something like the following:
main{
ch := make(ch chan dType)
ch <- sourceDataGenerator()
for _,_ := range DynamicRange{
go TargetGoRoutine(ch)
}
close(ch) // would want this to remove the value and the channel
}
func(ch chan dType) TargetGoRoutine{
targetCollection <- ch // want to keep the channel value after read
}
EDIT
Some feel this is a duplicate question. Perhaps, but not sure. The solution here seems simple in the end as n-canter pointed out. All it needs is for every go routine to "recycle" the data by putting it back to the channel after use. None of the supposedly "duplicates" provided this solution. Here is a sample:
package main
import (
"fmt"
"sync"
)
func main() {
c := make(chan string)
var wg sync.WaitGroup
wg.Add(5)
for i := 0; i < 5; i++ {
go func(i int) {
wg.Done()
msg := <-c
fmt.Printf("Data:%s, From go:%d\n", msg, i)
c <-msg
}(i)
}
c <- "Original"
wg.Wait()
fmt.Println(<-c)
}
https://play.golang.org/p/EXBbf1_icG

You may re-add the value to the channel after reading, but then all your goroutines will read the shared value sequentially, and you'll also need some synchronization primitive so that the last goroutine does not block.
As far as I know, the only case where you can use a single channel for broadcasting is closing it. In this case all readers will be notified.
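A sketch of broadcasting by close, assuming the shared value is written once before the channel is closed and only read afterwards (broadcast, ready and results are names I made up for the example):

```go
package main

import (
	"fmt"
	"sync"
)

// broadcast publishes one value to n readers by closing a channel.
// The value is written before close, so every reader that gets past
// <-ready is guaranteed to observe it.
func broadcast(n int, produce func() string) []string {
	var data string
	ready := make(chan struct{})
	results := make([]string, n)

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(i int) {
			defer wg.Done()
			<-ready           // unblocks for all readers once ready is closed
			results[i] = data // read-only access to the shared value
		}(i)
	}

	data = produce()
	close(ready) // the single signal that wakes every reader
	wg.Wait()
	return results
}

func main() {
	fmt.Println(broadcast(3, func() string { return "Original" }))
	// [Original Original Original]
}
```

The data is not copied per reader and the number of readers does not need to be known by the producer; the trade-off is that the channel carries only the "ready" signal, not the value itself.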
If you don't want to duplicate large data, maybe you'd better use a global variable. But use it carefully, because it goes against the Go proverb: "Don't communicate by sharing memory; share memory by communicating."
Also look at this question How to broadcast message using channel

Why following code generates deadlock

Golang newbie here. Can somebody explain why the following code generates a deadlock?
I am aware of sending true to boolean <- done channel but I don't want to use it.
package main
import (
"fmt"
"sync"
"time"
)
var wg2 sync.WaitGroup
func producer2(c chan<- int) {
for i := 0; i < 5; i++ {
time.Sleep(time.Second * 10)
fmt.Println("Producer Writing to chan %d", i)
c <- i
}
}
func consumer2(c <-chan int) {
defer wg2.Done()
fmt.Println("Consumer Got value %d", <-c)
}
func main() {
c := make(chan int)
wg2.Add(5)
fmt.Println("Starting .... 1")
go producer2(c)
go consumer2(c)
fmt.Println("Starting .... 2")
wg2.Wait()
}
Following is my understanding, and I know that it is wrong:
1. The channel will be blocked the moment 0 is written to it within the loop of the producer function.
2. So I expect the channel to be emptied by the consumer afterwards.
3. As the channel is emptied in step 2, the producer function can again put in another value and then get blocked, and step 2 repeats again.

Your original deadlock is caused by wg2.Add(5), you were waiting for 5 goroutines to finish, but only one did; you called wg2.Done() once. Change this to wg2.Add(1), and your program will run without error.
However, I suspect that you intended to consume all the values in the channel not just one as you do. If you change consumer function to:
func consumer2(c <-chan int) {
defer wg2.Done()
for i := range c {
fmt.Printf("Consumer Got value %d\n", i)
}
}
You will get another deadlock because the channel is not closed in producer function, and consumer is waiting for more values that never arrive. Adding close(c) to the producer function will fix it.

Why does it fail?
Running your code produces the following error:
➜ gochannel go run dl.go
Starting .... 1
Starting .... 2
Producer Writing to chan 0
Consumer Got value 0
Producer Writing to chan 1
fatal error: all goroutines are asleep - deadlock!
Here is why:
There are three goroutines in your code: main, producer2 and consumer2. When it runs,
producer2 sends the number 0 to the channel
consumer2 receives 0 from the channel, and exits
producer2 sends 1 to the channel, but no one is consuming, since consumer2 has already exited
producer2 is waiting
main executes wg2.Wait(), but the WaitGroup counter has not reached zero, so main is waiting
Two goroutines are waiting here, doing nothing, and nothing will happen no matter how long you wait. It is a deadlock! The Go runtime detects it and aborts with a fatal error.
There are two concepts you are confusing here:
how a WaitGroup works
how to receive all values from a channel
I'll explain them here briefly; there are already many articles about them on the internet.
How a WaitGroup works
A WaitGroup is a way to wait for all goroutines to finish. When running goroutines in the background, it's important to know when all of them have quit, so that certain actions can then be carried out.
In your case, we run two goroutines, so at the beginning we should call wg2.Add(2), and each goroutine should call wg2.Done() to signal that it is done.
Receive data from a channel
When receiving data from a channel, if you know exactly how many values will be sent, use a for loop this way:
for i:=0; i<N; i++ {
data = <-c
process(data)
}
Otherwise use it this way:
for data := range c {
process(data)
}
Also, don't forget to close the channel when there is no more data to send.
How to fix it?
With the above explanation, the code can be fixed as:
package main
import (
"fmt"
"sync"
"time"
)
var wg2 sync.WaitGroup
func producer2(c chan<- int) {
defer wg2.Done()
for i := 0; i < 5; i++ {
time.Sleep(time.Second * 1)
fmt.Printf("Producer Writing to chan %d\n", i)
c <- i
}
close(c)
}
func consumer2(c <-chan int) {
defer wg2.Done()
for i := range c {
fmt.Printf("Consumer Got value %d\n", i)
}
}
func main() {
c := make(chan int)
wg2.Add(2)
fmt.Println("Starting .... 1")
go producer2(c)
go consumer2(c)
fmt.Println("Starting .... 2")
wg2.Wait()
}
Here is another possible way to fix it.
The expected output
Fixed code gives the following output:
➜ gochannel go run dl.go
Starting .... 1
Starting .... 2
Producer Writing to chan 0
Consumer Got value 0
Producer Writing to chan 1
Consumer Got value 1
Producer Writing to chan 2
Consumer Got value 2
Producer Writing to chan 3
Consumer Got value 3
Producer Writing to chan 4
Consumer Got value 4
