How to "keep main thread running" even though routine has "runtime error" occurred in Golang? - multithreading

I am new to Goland and I did Java in the past. I've written a Golang function to calculate the integer number part of the result. What I am thinking is using a timer to do the calculation and generate the random number. But one problem I met is if the routine has some error, the main thread will stop. Is there any way to keep the main thread running? Even though there are errors in routine?
Below is the code for test:
func main() {
ticker := time.NewTicker(1*1000 * time.Millisecond)
for _ = range ticker.C {
rand.Seed(time.Now().Unix())
divisor := rand.Intn(20)
go calculate(divisor)
}
}
func calculate(divisor int){
result:= 100/divisor
fmt.Print("1/"+strconv.Itoa(divisor)+"=")
fmt.Println(result)
}
As the error handling for Golang really confused me, as what I am thinking is the error occurs in the "thread", the main function is just responsible to create the thread and assign task, it should never mind whether there are exceptions occurs in the "threads" and main should always keep going. If I do this in Java, I could use try catch to surround with
try{
result = 1/divisor;
}
catch(Exception e){
e.printTrace();
}
even every time I give divisor a 0 value in a separate thread, the main progress will not exit, but for Golang, I think
go calculate(divisor)
is opening a new "thread" and run calculate
inside the "thread", but why the main progress will quit.
Is there any possible method to prevent the main progress to quit?
Thanks.

use the defer/ recover feature
package main
import (
"fmt"
"time"
"math/rand"
)
func main() {
ticker := time.NewTicker(1*1000 * time.Millisecond)
for _ = range ticker.C {
rand.Seed(time.Now().Unix())
divisor := rand.Intn(20)
go calculate(divisor)
}
fmt.Println("that's all")
}
func calculate(divisor int){
defer func() {
if r := recover(); r != nil {
fmt.Println("Recovered in f", r)
}
}()
result:= 100/divisor
fmt.Printf("1/%d=", divisor)
fmt.Println(result)
}

Related

Synchronize write to file from heavy operations in different threads

I need to elaborate a file (potentially a big file) one block at a time and write the result to a new file.
To put it simply, I have the basic function to elaborate a block:
func elaborateBlock(block []byte) []byte { ... }
Every block needs to be elaborated and then written to the output file sequentially (preserving original order).
The one-thread implementation is trivial:
for {
buffer := make([]byte, BlockSize)
_, err := inputFile.Read(buffer)
if err == io.EOF {
break
}
processedData := elaborateBlock(buffer)
outputFile.Write(processedData)
}
But the elaboration can be heavy and every block can be processed separately, so a multi-threaded implementation is the natural evolution.
The solution I came up with is to create an array of channels, compute every block in a different thread and sync the final write by looping the channel array:
Utility function:
func blockThread(channel chan []byte, block []byte) {
channel <- elaborateBlock(block)
}
In the main program:
chans = []chan []byte {}
for {
buffer := make([]byte, BlockSize)
_, err := inputFile.Read(buffer)
if err == io.EOF {
break
}
channel := make(chan []byte)
chans = append(chans, channel)
go blockThread(channel, buffer)
}
for i := range chans {
data := <- chans[i]
outputFile.Write(data)
}
This approach works but can be problematic with large files because it requires to load the whole file in memory before starting writing the output.
Do you think there can be a better solution, with also better performance overall?
If blocks do need to be written out in order
If you want to work on multiple blocks concurrently, obviously you need to hold multiple blocks in memory at the same time.
You may decide how many blocks you want to process concurrently, and it's enough to read as many into memory at the same time. E.g. you may say you want to process 5 blocks concurrently. This will limit memory usage, and still utilize your CPU resources potentially to the max. Recommended to pick a number based on your available CPU cores (if processing a block does not already use multi cores). This can be queried using runtime.GOMAXPROCS(0).
You should have a single goroutine that reads the input file sequentially, and prodocue the blocks wrapped in Jobs (which also contain the block index).
You should have multiple worker goroutines, preferable as many as cores you have (but experiment with smaller and higher values too). Each worker goroutine just receives jobs, and calls elaborateBlock() on the data, and delivers it on the results channel.
There should be a single, designated consumer which receives completed jobs, and writes them in order to the output file. Since goroutines run concurrently and we have no control in which order the blocks are completed, the consumer should keep track of the index of the next block to be written to the output. Blocks arriving out of order should only be stored, and only proceed with writing if the subsequent block arrives.
This is an (incomplete) example how to do all these:
const BlockSize = 1 << 20 // 1 MB
func elaborateBlock(in []byte) []byte { return in }
type Job struct {
Index int
Block []byte
}
func producer(jobsCh chan<- *Job) {
// Init input file:
var inputFile *os.File
for index := 0; ; index++ {
job := &Job{
Index: index,
Block: make([]byte, BlockSize),
}
_, err := inputFile.Read(job.Block)
if err != nil {
break
}
jobsCh <- job
}
}
func worker(jobsCh <-chan *Job, resultCh chan<- *Job) {
for job := range jobsCh {
job.Block = elaborateBlock(job.Block)
resultCh <- job
}
}
func consumer(resultCh <-chan *Job) {
// Init output file:
var outputFile *os.File
nextIdx := 0
jobMap := map[int]*Job{}
for job := range resultCh {
jobMap[job.Index] = job
// Write out all blocks we have in contiguous index range:
for {
j := jobMap[nextIdx]
if j == nil {
break
}
if _, err := outputFile.Write(j.Block); err != nil {
// handle error, maybe terminate?
}
delete(nextIdx) // This job is written out
nextIdx++
}
}
}
func main() {
jobsCh := make(chan *Job)
resultCh := make(chan *Job)
for i := 0; i < 5; i++ {
go worker(jobsCh, resultCh)
}
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
consumer(resultCh)
}()
// Start producing jobs:
producer(jobsCh)
// No more jobs:
close(jobsCh)
// Wait for consumer to complete:
wg.Wait()
}
One thing to note here: this alone won't guarantee limiting the used memory. Imagine a case where the first block would require an enormous time to calculate, while subsequent blocks do not. What would happen? The first block would occupy a worker, and the other workers would "quickly" complete the subsequent blocks. The consumer would store all in memory, waiting for the first block to complete (as that has to be written out first). This could increase memory usage.
How could we avoid this?
By introducing a job pool. New jobs could not be created arbitrarily, but taken from a pool. If the pool is empty, the producer has to wait. So when the producer needs a new Job, takes one from a pool. When the consumer has written out a Job, puts it back into the pool. Simple as that. This would also reduce pressure on the garbage collector, as jobs (and large []byte buffers) are not created and thrown away, they could be re-used.
For a simple Job pool implementation you could use a buffered channel. For details, see How to implement Memory Pooling in Golang.
If blocks can be written in any order
Another option could be to allocate the output file in advance. If the size of the output blocks are also deterministic, you can do so (e.g. outsize := (insize / blocksize) * outblockSize).
To what end?
If you have the output file pre-allocated, the consumer does not need to wait input blocks in order. Once an input block is calculated, you can calculate the position where it will go in the output, seek to that position and just write it. For this you may use File.Seek().
This solution still requires to send the block index from the producer to the consumer, but the consumer won't need to store blocks arriving out-of-order, so the consumer can be simpler, and does not need to store completed blocks until the subsequent one arrives in order to proceed with writing the output file.
Note that this solution naturally does not pose a memory threat, as completed jobs are never accumulated / cached, they are written out in the order of completion.
See related questions for more details and techniques:
Is this an idiomatic worker thread pool in Go?
How to collect values from N goroutines executed in a specific order?
here is a working example that should work and is as close as possible to your original code.
the idea is to turn your array into a channel of channels of bytes. then
first fire up a consumer that will read on this channel of channels , get the channel of bytes, read from it and write the result.
Back on the main thread you create a channel of bytes, write it to the channel of channels (now the consumer reading sequentially from them will read the results in order) and then fire up the process that will do the work and write on the allocated channel (producers).
what will happen now is that the there will be a "race" between the procuders and the consumer, as soon as a produced block is read from the consumer and written the resources associated with it will be deallocated. this could be an improvement to your original design.
here is the code and the playground link:
package main
import (
"bytes"
"fmt"
"io"
"sync"
)
func elaborateBlock(b []byte) []byte {
return []byte("werkwerkwerk")
}
func blockThread(channel chan []byte, block []byte, wg *sync.WaitGroup) {
channel <- elaborateBlock(block)
wg.Done()
}
func main() {
chans := make(chan chan []byte)
BlockSize := 3
inputBytes := bytes.NewBuffer([]byte("transmutemetowerkwerkwerk"))
producewg := sync.WaitGroup{}
consumewg := sync.WaitGroup{}
consumewg.Add(1)
go func() {
chancount := 0
for ch := range chans {
data := <-ch
fmt.Printf("got %d block, result:%s\n", chancount, data)
chancount++
}
fmt.Printf("done receiving\n")
consumewg.Done()
}()
for {
buffer := make([]byte, BlockSize)
_, err := inputBytes.Read(buffer)
if err == io.EOF {
go func() {
//wait for all the procuders to finish
producewg.Wait()
//then close the main channel to notify the consumer
close(chans)
}()
break
}
channel := make(chan []byte)
chans <- channel //give the channel that we return the result to the receiver
producewg.Add(1)
go blockThread(channel, buffer, &producewg)
}
consumewg.Wait()
fmt.Printf("main exiting")
}
playground link
as a minor point i don't feel right about the "read the whole file into memory" statement cause you are just reading a block every time from the Reader, maybe "holding the result of the whole computation in memory" is more appropriate?

Why following code generates deadlock

Golang newbie here. Can somebody explain why the following code generates a deadlock?
I am aware of sending true to boolean <- done channel but I don't want to use it.
package main
import (
"fmt"
"sync"
"time"
)
var wg2 sync.WaitGroup
func producer2(c chan<- int) {
for i := 0; i < 5; i++ {
time.Sleep(time.Second * 10)
fmt.Println("Producer Writing to chan %d", i)
c <- i
}
}
func consumer2(c <-chan int) {
defer wg2.Done()
fmt.Println("Consumer Got value %d", <-c)
}
func main() {
c := make(chan int)
wg2.Add(5)
fmt.Println("Starting .... 1")
go producer2(c)
go consumer2(c)
fmt.Println("Starting .... 2")
wg2.Wait()
}
Following is my understanding and I know that it is wrong:
The channel will be blocked the moment 0 is written to it within the
loop of producer function
So I expect channel to be emptied by the
consumer afterwards.
As the channel is emptied in the step 2,
producer function can again put in another value and then get
blocked and steps 2 repeats again.
Your original deadlock is caused by wg2.Add(5), you were waiting for 5 goroutines to finish, but only one did; you called wg2.Done() once. Change this to wg2.Add(1), and your program will run without error.
However, I suspect that you intended to consume all the values in the channel not just one as you do. If you change consumer function to:
func consumer2(c <-chan int) {
defer wg2.Done()
for i := range c {
fmt.Printf("Consumer Got value %d\n", i)
}
}
You will get another deadlock because the channel is not closed in producer function, and consumer is waiting for more values that never arrive. Adding close(c) to the producer function will fix it.
Why it error?
Running your code gets the following error:
➜ gochannel go run dl.go
Starting .... 1
Starting .... 2
Producer Writing to chan 0
Consumer Got value 0
Producer Writing to chan 1
fatal error: all goroutines are asleep - deadlock!
Here is why:
There are three goroutines in your code: main,producer2 and consumer2. When it runs,
producer2 sends a number 0 to the channel
consumer2 recives 0 from the channel, and exits
producer2 sends 1 to the channel, but no one is consuming, since consumer2 already exits
producer2 is waiting
main executes wg2.Wait(), but not all waitgroup are closed. So main is waiting
Two goroutines are waiting here, does nothing, and nothing will be done no matter how long you wait. It is a deadlock! Golang detects it and panic.
There are two concepts you are confused here:
how waitgourp works
how to receive all values from a channel
I'll explain them here briefly, there are alreay many articles out there on the internet.
how waitgroup works
WaitGroup if a way to wait for all groutine to finish. When running goroutines in the background, it's important to know when all of them quits, then certain action can be conducted.
In your case, we run two goroutines, so at the beginning we should set wg2.Add(2), and each goroutine should add wg2.Done() to notify it is done.
Receive data from a channel
When receiving data from a channel. If you know exactly how many data it will send, use for loop this way:
for i:=0; i<N; i++ {
data = <-c
process(data)
}
Otherwise use it this way:
for data := range c {
process(data)
}
Also, Don't forget to close channel when there is no more data to send.
How to fix it?
With the above explanation, the code can be fixed as:
package main
import (
"fmt"
"sync"
"time"
)
var wg2 sync.WaitGroup
func producer2(c chan<- int) {
defer wg2.Done()
for i := 0; i < 5; i++ {
time.Sleep(time.Second * 1)
fmt.Printf("Producer Writing to chan %d\n", i)
c <- i
}
close(c)
}
func consumer2(c <-chan int) {
defer wg2.Done()
for i := range c {
fmt.Printf("Consumer Got value %d\n", i)
}
}
func main() {
c := make(chan int)
wg2.Add(2)
fmt.Println("Starting .... 1")
go producer2(c)
go consumer2(c)
fmt.Println("Starting .... 2")
wg2.Wait()
}
Here is another possible way to fix it.
The expected output
Fixed code gives the following output:
➜ gochannel go run dl.go
Starting .... 1
Starting .... 2
Producer Writing to chan 0
Consumer Got value 0
Producer Writing to chan 1
Consumer Got value 1
Producer Writing to chan 2
Consumer Got value 2
Producer Writing to chan 3
Consumer Got value 3
Producer Writing to chan 4
Consumer Got value 4

How does Golang share variables between goroutines? [duplicate]

This question already has answers here:
Why does Go handle closures differently in goroutines?
(2 answers)
Closed 10 months ago.
I'm learning Go and trying to understand its concurrency features.
I have the following program.
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
x := i
go func() {
defer wg.Done()
fmt.Println(x)
}()
}
wg.Wait()
fmt.Println("Done")
}
When executed I got:
4
0
1
3
2
It's just what I want. However, if I make slight modification to it:
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
go func() {
defer wg.Done()
fmt.Println(i)
}()
}
wg.Wait()
fmt.Println("Done")
}
What I got will be:
5
5
5
5
5
I don't quite understand the difference. Can anyone help to explain what happened here and how Go runtime execute this code?
You have new variable on each run of x := i,
This code shows difference well, by printing the address of x inside goroutine:
The Go Playground:
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
x := i
go func() {
defer wg.Done()
fmt.Println(&x)
}()
}
wg.Wait()
fmt.Println("Done")
}
output:
0xc0420301e0
0xc042030200
0xc0420301e8
0xc0420301f0
0xc0420301f8
Done
And build your second example with go build -race and run it:
You will see: WARNING: DATA RACE
And this will be fine The Go Playground:
//go build -race
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
go func(i int) {
defer wg.Done()
fmt.Println(i)
}(i)
}
wg.Wait()
fmt.Println("Done")
}
output:
0
4
1
2
3
Done
The general rule is, don't share data between goroutines. In the first example, you essentially give each goroutine their own copy of x, and they print it out in whatever order they get to the print statement. In the second example, they all reference the same loop variable, and it is incremented to 5 by the time any of them print it. I don't believe the output there is guaranteed, it just happens that the loop creating goroutines finished faster than the goroutines themselves got to the printing part.
It's a bit hard to explain in plain english, but I'll try my best.
You see, every time you spawn a new goroutine, there is an initialization time, no matter how minuscule it might be, it's always there. So, in your second case, the entire loop has finished incrementing the variable 5 times before any of the goroutines even started. And when the goroutines finish initialization, all they see is the final variable value which is 5.
In your first case though, the x variable keeps a copy of the i variable so that when the goroutines start, x get's passed to them. Remember, it is i that is being incremented here, not x. x is fixed. So, when the goroutines start, they get a fixed value.

How would you define a pool of goroutines to be executed at once?

TL;DR: Please just go to the last part and tell me how you would solve this problem.
I've begun using Go this morning coming from Python. I want to call a closed-source executable from Go several times, with a bit of concurrency, with different command line arguments. My resulting code is working just well but I'd like to get your input in order to improve it. Since I'm at an early learning stage, I'll also explain my workflow.
For the sake of simplicity, assume here that this "external closed-source program" is zenity, a Linux command line tool that can display graphical message boxes from the command line.
Calling an executable file from Go
So, in Go, I would go like this:
package main
import "os/exec"
func main() {
cmd := exec.Command("zenity", "--info", "--text='Hello World'")
cmd.Run()
}
This should be working just right. Note that .Run() is a functional equivalent to .Start() followed by .Wait(). This is great, but if I wanted to execute this program just once, the whole programming stuff would not be worth it. So let's just do that multiple times.
Calling an executable multiple times
Now that I had this working, I'd like to call my program multiple times, with custom command line arguments (here just i for the sake of simplicity).
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8 // Number of times the external program is called
for i:=0; i<NumEl; i++ {
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
}
Ok, we did it! But I still can't see the advantage of Go over Python … This piece of code is actually executed in a serial fashion. I have a multiple-core CPU and I'd like to take advantage of it. So let's add some concurrency with goroutines.
Goroutines, or a way to make my program parallel
a) First attempt: just add "go"s everywhere
Let's rewrite our code to make things easier to call and reuse and add the famous go keyword:
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8
for i:=0; i<NumEl; i++ {
go callProg(i) // <--- There!
}
}
func callProg(i int) {
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
Nothing! What is the problem? All the goroutines are executed at once. I don't really know why zenity is not executed but AFAIK, the Go program exited before the zenity external program could even be initialized. This was confirmed by the use of time.Sleep: waiting for a couple of seconds was enough to let the 8 instance of zenity launch themselves. I don't know if this can be considered a bug though.
To make it worse, the real program I'd actually like to call takes a while to execute itself. If I execute 8 instances of this program in parallel on my 4-core CPU, it's gonna waste some time doing a lot of context switching … I don't know how plain Go goroutines behave, but exec.Command will launch zenity 8 times in 8 different threads. To make it even worse, I want to execute this program more than 100,000 times. Doing all of that at once in goroutines won't be efficient at all. Still, I'd like to leverage my 4-core CPU!
b) Second attempt: use pools of goroutines
The online resources tend to recommend the use of sync.WaitGroup for this kind of work. The problem with that approach is that you are basically working with batches of goroutines: if I create of WaitGroup of 4 members, the Go program will wait for all the 4 external programs to finish before calling a new batch of 4 programs. This is not efficient: CPU is wasted, once again.
Some other resources recommended the use of a buffered channel to do the work:
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8 // Number of times the external program is called
NumCore := 4 // Number of available cores
c := make(chan bool, NumCore - 1)
for i:=0; i<NumEl; i++ {
go callProg(i, c)
c <- true // At the NumCoreth iteration, c is blocking
}
}
func callProg(i int, c chan bool) {
defer func () {<- c}()
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
This seems ugly. Channels were not intended for this purpose: I'm exploiting a side-effect. I love the concept of defer but I hate having to declare a function (even a lambda) to pop a value out of the dummy channel that I created. Oh, and of course, using a dummy channel is, by itself, ugly.
c) Third attempt: die when all the children are dead
Now we are nearly finished. I have just to take into account yet another side effect: the Go program closes before all the zenity pop-ups are closed. This is because when the loop is finised (at the 8th iteration), nothing prevents the program from finishing. This time, sync.WaitGroup will be useful.
package main
import (
"os/exec"
"strconv"
"sync"
)
func main() {
NumEl := 8 // Number of times the external program is called
NumCore := 4 // Number of available cores
c := make(chan bool, NumCore - 1)
wg := new(sync.WaitGroup)
wg.Add(NumEl) // Set the number of goroutines to (0 + NumEl)
for i:=0; i<NumEl; i++ {
go callProg(i, c, wg)
c <- true // At the NumCoreth iteration, c is blocking
}
wg.Wait() // Wait for all the children to die
close(c)
}
func callProg(i int, c chan bool, wg *sync.WaitGroup) {
defer func () {
<- c
wg.Done() // Decrease the number of alive goroutines
}()
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
Done.
My questions
Do you know any other proper way to limit the number of goroutines executed at once?
I don't mean threads; how Go manages goroutines internally is not relevant. I really mean limiting the number of goroutines launched at once: exec.Command creates a new thread each time it is called, so I should control the number of time it is called.
Does that code look fine to you?
Do you know how to avoid the use of a dummy channel in that case?
I can't convince myself that such dummy channels are the way to go.
I would spawn 4 worker goroutines that read the tasks from a common channel. Goroutines that are faster than others (because they are scheduled differently or happen to get simple tasks) will receive more task from this channel than others. In addition to that, I would use a sync.WaitGroup to wait for all workers to finish. The remaining part is just the creation of the tasks. You can see an example implementation of that approach here:
package main
import (
"os/exec"
"strconv"
"sync"
)
func main() {
tasks := make(chan *exec.Cmd, 64)
// spawn four worker goroutines
var wg sync.WaitGroup
for i := 0; i < 4; i++ {
wg.Add(1)
go func() {
for cmd := range tasks {
cmd.Run()
}
wg.Done()
}()
}
// generate some tasks
for i := 0; i < 10; i++ {
tasks <- exec.Command("zenity", "--info", "--text='Hello from iteration n."+strconv.Itoa(i)+"'")
}
close(tasks)
// wait for the workers to finish
wg.Wait()
}
There are probably other possible approaches, but I think this is a very clean solution that is easy to understand.
A simple approach to throttling (execute f() N times but maximum maxConcurrency concurrently), just a scheme:
package main
import (
"sync"
)
const maxConcurrency = 4 // for example
var throttle = make(chan int, maxConcurrency)
func main() {
const N = 100 // for example
var wg sync.WaitGroup
for i := 0; i < N; i++ {
throttle <- 1 // whatever number
wg.Add(1)
go f(i, &wg, throttle)
}
wg.Wait()
}
func f(i int, wg *sync.WaitGroup, throttle chan int) {
defer wg.Done()
// whatever processing
println(i)
<-throttle
}
Playground
I wouldn't probably call the throttle channel "dummy". IMHO it's an elegant way (it's not my invention of course), how to limit concurrency.
BTW: Please note that you're ignoring the returned error from cmd.Run().
🧩 Modules
Golang Concurrency Manager
📃 Template
package main
import (
"fmt"
"github.com/zenthangplus/goccm"
"math/rand"
"runtime"
)
func main() {
semaphore := goccm.New(runtime.NumCPU())
for {
semaphore.Wait()
go func() {
fmt.Println(rand.Int())
semaphore.Done()
}()
}
semaphore.WaitAllDone()
}
🎰 Optimal routine quantity
If the operation is CPU bounded: runtime.NumCPU()
Otherwise test with: time go run *.go
🔨 Configure
export GOPATH="$(pwd)/gopath"
go mod init *.go
go mod tidy
🧹 CleanUp
find "${GOPATH}" -exec chmod +w {} \;
rm --recursive --force "${GOPATH}"
try this:
https://github.com/korovkin/limiter
limiter := NewConcurrencyLimiter(10)
limiter.Execute(func() {
zenity(...)
})
limiter.Wait()
You could use Worker Pool pattern described here in this post.
This is how an implementation would look like ...
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8
pool := 4
intChan := make(chan int)
for i:=0; i<pool; i++ {
go callProg(intChan) // <--- launch the worker routines
}
for i:=0;i<NumEl;i++{
intChan <- i // <--- push data which will be received by workers
}
close(intChan) // <--- will safely close the channel & terminate worker routines
}
func callProg(intChan chan int) {
for i := range intChan{
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
}

Why do my goroutines wait for each other instead of finishing when done?

I'm pretty new to Go and there is one thing in my code which I don't understand.
I wrote a simple bubblesort algorithm (I know it's not really efficient ;)).
Now I want to start 3 GoRoutines. Each thread should sort his array independent from the other ones. When finished, the func. should print a "done"-Message.
Here is my Code:
package main
import (
"fmt"
"time" //for time functions e.g. Now()
"math/rand" //for pseudo random numbers
)
/* Simple bubblesort algorithm*/
func bubblesort(str string, a []int) []int {
for n:=len(a); n>1; n-- {
for i:=0; i<n-1; i++ {
if a[i] > a[i+1] {
a[i], a[i+1] = a[i+1], a[i] //swap
}
}
}
fmt.Println(str+" done") //done message
return a
}
/*fill slice with pseudo numbers*/
func random_fill(a []int) []int {
for i:=0; i<len(a); i++ {
a[i] = rand.Int()
}
return a
}
func main() {
rand.Seed( time.Now().UTC().UnixNano()) //set seed for rand.
a1 := make([]int, 34589) //create slice
a2 := make([]int, 42) //create slice
a3 := make([]int, 9999) //create slice
a1 = random_fill(a1) //fill slice
a2 = random_fill(a2) //fill slice
a3 = random_fill(a3) //fill slice
fmt.Println("Slices filled ...")
go bubblesort("Thread 1", a1) //1. Routine Start
go bubblesort("Thread 2", a2) //2. Routine Start
go bubblesort("Thread 3", a3) //3. Routine Start
fmt.Println("Main working ...")
time.Sleep(1*60*1e9) //Wait 1 minute for the "done" messages
}
This is what I get:
Slices filled ...
Main working ...
Thread 1 done
Thread 2 done
Thread 3 done
Should'nt Thread 2 finish first, since his slice is the smallest?
It seems that all the threads are waiting for the others to finish, because the "done"-messages appear at the same time, no matter how big the slices are..
Where is my brainbug? =)
Thanks in advance.
*Edit:
When putting "time.Sleep(1)" in the for-loop in the bubblesort func. it seems to work.. but I want to clock the duration on different machines with this code (I know, i have to change the random thing), so sleep would falsify the results.
Indeed, there is no garantee regarding the order in which your goroutines will be executed.
However if you force the true parallel processing by explicitly letting 2 processor cores run :
import (
"fmt"
"time" //for time functions e.g. Now()
"math/rand" //for pseudo random numbers
"runtime"
)
...
func main() {
runtime.GOMAXPROCS(2)
rand.Seed( time.Now().UTC().UnixNano()) //set seed for rand.
...
Then you will get the expected result :
Slices filled ...
Main working ...
Thread 2 done
Thread 3 done
Thread 1 done
Best regards
The important thing is the ability to "yield" the processor to other processes, before the whole potentialy long-running workload is finished. This holds true as well in single-core context or multi-core context (because Concurrency is not the same as Parallelism).
This is exactly what the runtime.Gosched() function does :
Gosched yields the processor, allowing other goroutines to run. It
does not suspend the current goroutine, so execution resumes
automatically.
Be aware that a "context switch" is not free : it costs a little time each time.
On my machine without yielding, your program runs in 5.1s.
If you yield in the outer loop (for n:=len(a); n>1; n--), it runs in 5.2s : small overhead.
If you yield in the inner loop (for i:=0; i<n-1; i++), it runs in 61.7s : huge overhead !!
Here is the modified program correctly yielding, with the small overhead :
package main
import (
"fmt"
"math/rand"
"runtime"
"time"
)
/* Simple bubblesort algorithm*/
func bubblesort(str string, a []int, ch chan []int) {
for n := len(a); n > 1; n-- {
for i := 0; i < n-1; i++ {
if a[i] > a[i+1] {
a[i], a[i+1] = a[i+1], a[i] //swap
}
}
runtime.Gosched() // yield after part of the workload
}
fmt.Println(str + " done") //done message
ch <- a
}
/*fill slice with pseudo numbers*/
func random_fill(a []int) []int {
for i := 0; i < len(a); i++ {
a[i] = rand.Int()
}
return a
}
func main() {
rand.Seed(time.Now().UTC().UnixNano()) //set seed for rand.
a1 := make([]int, 34589) //create slice
a2 := make([]int, 42) //create slice
a3 := make([]int, 9999) //create slice
a1 = random_fill(a1) //fill slice
a2 = random_fill(a2) //fill slice
a3 = random_fill(a3) //fill slice
fmt.Println("Slices filled ...")
ch1 := make(chan []int) //create channel of result
ch2 := make(chan []int) //create channel of result
ch3 := make(chan []int) //create channel of result
go bubblesort("Thread 1", a1, ch1) //1. Routine Start
go bubblesort("Thread 2", a2, ch2) //2. Routine Start
go bubblesort("Thread 3", a3, ch3) //3. Routine Start
fmt.Println("Main working ...")
<-ch1 // Wait for result 1
<-ch2 // Wait for result 2
<-ch3 // Wait for result 3
}
Output :
Slices filled ...
Main working ...
Thread 2 done
Thread 3 done
Thread 1 done
I also used channels to implement the rendez-vous, as suggested in my previous comment.
Best regards :)
Since the release of Go 1.2, the original program now works may work fine without modification. You may try it in Playground.
This is explained in the Go 1.2 release notes :
In prior releases, a goroutine that was looping forever could starve
out other goroutines on the same thread, a serious problem when
GOMAXPROCS provided only one user thread. In Go 1.2, this is partially
addressed: The scheduler is invoked occasionally upon entry to a
function.

Resources