Why does this program run faster when it's allocated fewer threads?

Why does this program run faster when it's allocated fewer threads? - multithreading

I have a fairly simple Go program designed to compute random Fibonacci numbers to test some strange behavior I observed in a worker pool I wrote. When I allocate one thread, the program finishes in 1.78s. When I allocate 4, it finishes in 9.88s.
The code is as follows:
var workerWG sync.WaitGroup
func worker(fibNum chan int) {
for {
var tgt = <-fibNum
workerWG.Add(1)
var a, b float64 = 0, 1
for i := 0; i < tgt; i++ {
a, b = a+b, a
}
workerWG.Done()
}
}
func main() {
rand.Seed(time.Now().UnixNano())
runtime.GOMAXPROCS(1) // LINE IN QUESTION
var fibNum = make(chan int)
for i := 0; i < 4; i++ {
go worker(fibNum)
}
for i := 0; i < 500000; i++ {
fibNum <- rand.Intn(1000)
}
workerWG.Wait()
}
If I replace runtime.GOMAXPROCS(1) with 4, the program takes four times as long to run.
What's going on here? Why does adding more available threads to a worker pool slow the entire pool down?
My personal theory is that it has to do with the processing time of the worker being less than the overhead of thread management, but I'm not sure. My reservation is caused by the following test:
When I replace the worker function with the following code:
for {
<-fibNum
time.Sleep(500 * time.Millisecond)
}
both one available thread and four available threads take the same amount of time.

I revised your program to look like the following:
package main
import (
"math/rand"
"runtime"
"sync"
"time"
)
var workerWG sync.WaitGroup
func worker(fibNum chan int) {
for tgt := range fibNum {
var a, b float64 = 0, 1
for i := 0; i < tgt; i++ {
a, b = a+b, a
}
}
workerWG.Done()
}
func main() {
rand.Seed(time.Now().UnixNano())
runtime.GOMAXPROCS(1) // LINE IN QUESTION
var fibNum = make(chan int)
for i := 0; i < 4; i++ {
go worker(fibNum)
workerWG.Add(1)
}
for i := 0; i < 500000; i++ {
fibNum <- rand.Intn(100000)
}
close(fibNum)
workerWG.Wait()
}
I cleaned up the wait group usage.
I changed rand.Intn(1000) to rand.Intn(100000)
On my machine that produces:
$ time go run threading.go (GOMAXPROCS=1)
real 0m20.934s
user 0m20.932s
sys 0m0.012s
$ time go run threading.go (GOMAXPROCS=8)
real 0m10.634s
user 0m44.184s
sys 0m1.928s
This means that in your original code, the work performed vs synchronization (channel read/write) was negligible. The slowdown came from having to synchronize across threads instead of one and only perform a very small amount of work inbetween.
In essence, synchronization is expensive compared to calculating fibonacci numbers up to 1000. This is why people tend to discourage micro-benchmarks. Upping that number gives a better perspective. But an even better idea is to benchmark actual work being done i.e. including IO, syscalls, processing, crunching, writing output, formatting, etc.
Edit: As an experiment, I upped the number of workers to 8 with GOMAXPROCS set to 8 and the result was:
$ time go run threading.go
real 0m4.971s
user 0m35.692s
sys 0m0.044s

The code written by #thwd is correct and idiomatic Go.
Your code was being serialized due to the atomic nature of sync.WaitGroup. Both workerWG.Add(1) and workerWG.Done() will block until they're able to atomically update the internal counter.
Since the workload is between 0 and 1000 recursive calls, the bottleneck of a single core was enough to keep data races on the waitgroup counter to a minimum.
On multiple cores, the processor spends a lot of time spinning to fix the collisions of waitgroup calls. Add that to the fact that the waitgroup counter is kept on one core and you now have added communication between cores (taking up even more cycles).
A couple hints for simplifying code:
For a small, set number of goroutines, a complete channel (chan struct{} to avoid allocations) is cheaper to use.
Use the send channel close as a kill signal for goroutines and have them signal that they've exited (waitgroup or channel). Then, close to complete channel to free them up for the GC.
If you need a waitgroup, aggressively minimize the number of calls to it. Those calls must be internally serialized, so extra calls forces added synchronization.

Your main computation routine in worker does not allow the scheduler to run.
Calling the scheduler manually like
for i := 0; i < tgt; i++ {
a, b = a+b, a
if i%300 == 0 {
runtime.Gosched()
}
}
Reduces wall clock by 30% when switching from one to two threads.
Such artificial microbenchmarks are really hard to get right.

Related

Peterson's algorithm and deadlock

I am trying to experiment with some mutual execution algorithms. I have implemented the Peterson's algorithm. It prints the correct counter value but sometimes it seems just like some kind of a deadlock had occurred which stalls the execution indefinitely. This should not be possible since this algorithm is deadlock free.
PS: Is this related to problems with compiler optimizations often mentioned when addressing the danger of "benign" data races? If this is the case then how to disable such optimizations?
PPS: When atomically storing/loading the victim field, the problem seems to disappear which makes the compiler's optimizations more suspicious
package main
import (
"fmt"
"sync"
)
type mutex struct {
flag [2]bool
victim int
}
func (m *mutex) lock(id int) {
m.flag[id] = true // I'm interested
m.victim = id // you can go before me if you want
for m.flag[1-id] && m.victim == id {
// while the other thread is inside the CS
// and the victime was me (I expressed my interest after the other one already did)
}
}
func (m *mutex) unlock(id int) {
m.flag[id] = false // I'm not intersted anymore
}
func main() {
var wg sync.WaitGroup
var mu mutex
var cpt, n = 0, 100000
for i := 0; i < 2; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
for j := 0; j < n; j++ {
mu.lock(id)
cpt = cpt + 1
mu.unlock(id)
}
}(i)
}
wg.Wait()
fmt.Println(cpt)
}

There is no "benign" data race. Your program has data race, and the behavior is undefined.
At the core of the problem is the mutex implementation. Modifications made to a shared object from one goroutine are not necessarily observable from others until those goroutines communicate using one of the synchronization primitives. You are writing to mutex.victim from multiple goroutines, and won't be observed. You are also reading the mutex.flag elements written by other goroutines, and won't necessarily be seen. That is, there may be cases where the for-loop won't terminate even if the other goroutine changes the variables.
And since the mutex implementation is broken, the updates to cpt will not necessarily be correct either.
To implement this correctly, you need the sync/atomic package.
See the Go Memory Model: https://go.dev/ref/mem

For Peterson's algorithm (same goes for Dekker), you need to ensure that your code is sequential consistent. In Go you can do that using atomics. This will prevent the compiler and the hardware to mess things up.

Self-Synchronizing Goroutines end up with Deadlock

I have a stress test issue that I want to solve with simple synchronization in Go. So far I have tried to find documenation on my specific usecase regarding synchronization in Go, but didn't find anything that fits.
To be a bit more specific:
I must fulfill a task where I have to start a large amount of threads (in this example only illustrated with two threads) in the main routine. All of the initiated workers are supposed to prepare some initialization actions by themselves in unordered manner. Until they reach a small sequence of commands, which I want them to be executed by all goroutines at once, which is why I want to self-synchronize the goroutines with each other. It is very vital for my task that the delay through the main routine, which instantiates all other goroutines, does not affect the true parallelism of the workers execution (at the label #maximum parallel in the comment). For this purpose I do initialize a wait group with the amount of running goroutines in the main routine and pass it over to all routines so they can synchronize each others workflow.
The code looks similar to this example:
import sync
func worker_action(wait_group *sync.WaitGroup) {
// ...
// initialization
// ...
defer wait_group.Done()
wait_group.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wait_group sync.WaitGroup
wait_group.Add(numThreads)
for i := 0; i < numThreads; i++ {
go worker_action(&wait_group)
}
// ...
}
Unfortunately my setup runs into a deadlock, as soon as all goroutines have reached the Wait instruction (labeled with #wait in the comment). This is true for any amount of threads that I start with the main routine (even two threads are caught in a deadlock within no time).
From my point of view a deadlock should not occur, due to the fact that immediately before the wait instruction each goroutine executes the done function on the same wait group.
Do I have a wrong understanding of how wait groups work? Is it for instance not allowed to execute the wait function inside of a goroutine other than the main routine? Or can someone give me a hint on what else I am missing?
Thank you very much in advance.
EDIT:
Thanks a lot #tkausl. It was indeed the unnecessary "defer" that caused the problem. I do not know how I could not see it myself.

There are several issues in your code. First the form. Idiomatic Go should use camelCase. wg is a better name for the WaitGroup.
But more important is the use where your code is waiting. Not inside your Goroutines. It should wait inside the main func:
func workerAction(wg *sync.WaitGroup) {
// ...
// initialization
// ...
defer wg.Done()
// wg.Wait() // #label: wait
// sequence of maximum parallel instructions // #label: maximum parallel
// ...
}
func main() {
var numThreads int = 2 // the number of threads shall be much higher for the actual stress test
var wg sync.WaitGroup
wg.Add(numThreads)
for i := 0; i < numThreads; i++ {
go workerAction(&wg)
}
wg.Wait() // you need to wait here
// ...
}

Again thanks #tkausl. The issue was resolved by removing the unnecessary "defer" instruction from the line that was meant to let the worker goroutines increment the number of finished threads.
I.e. "defer wait_group.Done()" -> "wait_group.Done()"

waitgroup on subset of go routines

I have situation where in, the main go routines will create "x" go routines. but it is interested only in "y" ( y < x ) go routines to finish.
I was hoping to use Waitgroup. But Waitgroup only allows me to wait on all go routines. I cannot, for example do this,
1. wg.Add (y)
2 create "x" go routines. These routines will call wg.Done() when finished.
3. wg. Wait()
This panics when the y+1 go routine calls wg.Done() because the wg counter goes negative.
I sure can use channels to solve this but I am interested if Waitgroup solves this.

As noted in Adrian's answer, sync.WaitGroup is a simple counter whose Wait method will block until the counter value reaches zero. It is intended to allow you to block (or join) on a number of goroutines before allowing a main flow of execution to proceed.
The interface of WaitGroup is not sufficiently expressive for your usecase, nor is it designed to be. In particular, you cannot use it naïvely by simply calling wg.Add(y) (where y < x). The call to wg.Done by the (y+1)th goroutine will cause a panic, as it is an error for a wait group to have a negative internal value. Furthermore, we cannot be "smart" by observing the internal counter value of the WaitGroup; this would break an abstraction and, in any event, its internal state is not exported.
Implement your own!
You can implement the relevant logic yourself using some channels per the code below (playground link). Observe from the console that 10 goroutines are started, but after two have completed, we fallthrough to continue execution in the main method.
package main
import (
"fmt"
"time"
)
// Set goroutine counts here
const (
// The number of goroutines to spawn
x = 10
// The number of goroutines to wait for completion
// (y <= x) must hold.
y = 2
)
func doSomeWork() {
// do something meaningful
time.Sleep(time.Second)
}
func main() {
// Accumulator channel, used by each goroutine to signal completion.
// It is buffered to ensure the [y+1, ..., x) goroutines do not block
// when sending to the channel, which would cause a leak. It will be
// garbage collected when all goroutines end and the channel falls
// out of scope. We receive y values, so only need capacity to receive
// (x-y) remaining values.
accChan := make(chan struct{}, x-y)
// Spawn "x" goroutines
for i := 0; i < x; i += 1 {
// Wrap our work function with the local signalling logic
go func(id int, doneChan chan<- struct{}) {
fmt.Printf("starting goroutine #%d\n", id)
doSomeWork()
fmt.Printf("goroutine #%d completed\n", id)
// Communicate completion of goroutine
doneChan <- struct{}{}
}(i, accChan)
}
for doneCount := 0; doneCount < y; doneCount += 1 {
<-accChan
}
// Continue working
fmt.Println("Carrying on without waiting for more goroutines")
}
Avoid leaking resources
As this does not wait for the [y+1, ..., x) goroutines to complete, you should take special care in the doSomeWork function to remove or minimize the risk that the work can block indefinitely, which would also cause a leak. Remove, where possible, the feasibility of indefinite blocking on I/O (including channel operations) or falling into infinite loops.
You could use a context to signal to the additional goroutines when their results are no longer required to have them break out of execution.

WaitGroup doesn't actually wait on goroutines, it waits until its internal counter reaches zero. If you only Add() the number of goroutines you care about, and you only call Done() in those goroutines you care about, then Wait() will only block until those goroutines you care about have finished. You are in complete control of the logic and flow, there are no restrictions on what WaitGroup "allows".

Are these y specific go-routines that you are trying to track, or any y out of the x? What are the criteria?
Update:
1. If you hve control over any criteria to pick matching y go-routines:
You can do wp.wg.Add(1) and wp.wg.Done() from inside the goroutine based on your condition by passing it as a pointer argument into the goroutine, if your condition can't be checked outside the goroutine.
Something like below sample code. Will be able to be more specific if you provide more details of what you are trying to do.
func sampleGoroutine(z int, b string, wg *sync.WaitGroup){
defer func(){
if contition1{
wg.Done()
}
}
if contition1 {
wg.Add(1)
//do stuff
}
}
func main() {
wg := sync.WaitGroup{}
for i := 0; i < x; i++ {
go sampleGoroutine(1, "one", &wg)
}
wg.Wait()
}
2. If you have no control over which ones, and just want the first y:
Based on your comment, that you have no control/desire to pick any specific goroutines, but the ones that finish first. If you would want to do it in a generic way, you can use the below custom waitGroup implementation that fits your use case. (It's not copy-safe, though. Also doesn't have/need wg.Add(int) method)
type CountedWait struct {
wait chan struct{}
limit int
}
func NewCountedWait(limit int) *CountedWait {
return &CountedWait{
wait: make(chan struct{}, limit),
limit: limit,
}
}
func (cwg *CountedWait) Done() {
cwg.wait <- struct{}{}
}
func (cwg *CountedWait) Wait() {
count := 0
for count < cwg.limit {
<-cwg.wait
count += 1
}
}
Which can be used as follows:
func sampleGoroutine(z int, b string, wg *CountedWait) {
success := false
defer func() {
if success == true {
fmt.Printf("goroutine %d finished successfully\n", z)
wg.Done()
}
}()
fmt.Printf("goroutine %d started\n", z)
time.Sleep(time.Second)
if rand.Intn(10)%2 == 0 {
success = true
}
}
func main() {
x := 10
y := 3
wg := NewCountedWait(y)
for i := 0; i < x; i += 1 {
// Wrap our work function with the local signalling logic
go sampleGoroutine(i, "something", wg)
}
wg.Wait()
fmt.Printf("%d out of %d goroutines finished successfully.\n", y, x)
}
3. You can also club in context with 2 to ensure that the remaining goroutines don't leak
You may not be able to run this on play.golang, as it has some long sleeps.
Below is a sample output:
(note that, there may be more than y=3 goroutines marking Done, but you are only waiting till 3 finish)
goroutine 9 started
goroutine 0 started
goroutine 1 started
goroutine 2 started
goroutine 3 started
goroutine 4 started
goroutine 5 started
goroutine 5 marking done
goroutine 6 started
goroutine 7 started
goroutine 7 marking done
goroutine 8 started
goroutine 3 marking done
continuing after 3 out of 10 goroutines finished successfully.
goroutine 9 will be killed, bcz cancel
goroutine 8 will be killed, bcz cancel
goroutine 6 will be killed, bcz cancel
goroutine 1 will be killed, bcz cancel
goroutine 0 will be killed, bcz cancel
goroutine 4 will be killed, bcz cancel
goroutine 2 will be killed, bcz cancel
Play links
https://play.golang.org/p/l5i6X3GClBq
https://play.golang.org/p/Bcns0l9OdFg
https://play.golang.org/p/rkGSLyclgje

recursively scan tree in parallel in golang

I have a binary tree that accessing a node is relatively fast, with the exception of the leaves - they might be 100-1000 times slower. I have a recursive algorithm that I would like to implement in go (I am new to it).
Because I have to get to the leaves to get benefits from the parallelism I need to parallelize the execution higher in the tree. This though might result in millions of goroutines. Limiting this with semaphore does not seem the 'go' way- there is no such sync primitive. Another concern I have is how expensive is, in fact, a channel, should I use wait group instead.
My tree is abstract and the algorithm runs over it identifying the items by level and index.
// l3 0
// / \
// l2 0 1
// / \ / \
// l1 0 1 2 3
// / \ / \ / \ / \
// l0 0 1 2 3 4 5 6 7
For example, I can use such to compute sum of all items in a vector with the function:
func Sum(level, index int, items []int) int {
if level == 0 {return items[index]}
return Sum(level-1, index*2, items) + Sum(level-1, index*2+1, items)
}
What should be my approach? Can someone point me to a recursive tree multithreaded algorithm implemented in go?

It sounds like you need a worker pool. Here's an example I just wrote: https://play.golang.org/p/NRM0yyQi8X
package main
import (
"fmt"
"sync"
"time"
)
type Leaf struct {
// Whatever
}
func worker(i int, wg *sync.WaitGroup, in <-chan Leaf) {
for leaf := range in {
time.Sleep(time.Millisecond * 500)
fmt.Printf("worker %d finished work: %#v\n", i, leaf)
}
fmt.Printf("worker %d exiting\n", i)
wg.Done()
}
func main() {
var jobQueue = make(chan Leaf)
var numWorkers = 10
// the waitgroup will allow us to wait for all the goroutines to finish at the end
var wg = new(sync.WaitGroup)
for i := 0; i < numWorkers; i++ {
wg.Add(1)
go worker(i, wg, jobQueue)
}
// enqueue work (this goes inside your tree traversal.)
for i := 0; i < 100; i++ {
jobQueue <- Leaf{}
}
// closing jobQueue will cause all goroutines to exit the loop on the channel.
close(jobQueue)
// Wait for all the goroutines to finish
wg.Wait()
}

I strongly suggest reading this excellent blog post from top to bottom:
https://blog.golang.org/pipelines
It covers not only an example of exactly what you need (i.e. parallelized file-tree walk calculating MD5 file checksums), but much much more:
fan-in/fan-out channel techniques
parallelism
pipeline cancellations via done channels
pipeline error chaining via error channels
bounded parallelism
The last topic, bounded parallelism, is used to ensure 'walking' large-node directory-trees do not create excessive go-routines: bounded.go

How would you define a pool of goroutines to be executed at once?

TL;DR: Please just go to the last part and tell me how you would solve this problem.
I've begun using Go this morning coming from Python. I want to call a closed-source executable from Go several times, with a bit of concurrency, with different command line arguments. My resulting code is working just well but I'd like to get your input in order to improve it. Since I'm at an early learning stage, I'll also explain my workflow.
For the sake of simplicity, assume here that this "external closed-source program" is zenity, a Linux command line tool that can display graphical message boxes from the command line.
Calling an executable file from Go
So, in Go, I would go like this:
package main
import "os/exec"
func main() {
cmd := exec.Command("zenity", "--info", "--text='Hello World'")
cmd.Run()
}
This should be working just right. Note that .Run() is a functional equivalent to .Start() followed by .Wait(). This is great, but if I wanted to execute this program just once, the whole programming stuff would not be worth it. So let's just do that multiple times.
Calling an executable multiple times
Now that I had this working, I'd like to call my program multiple times, with custom command line arguments (here just i for the sake of simplicity).
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8 // Number of times the external program is called
for i:=0; i<NumEl; i++ {
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
}
Ok, we did it! But I still can't see the advantage of Go over Python … This piece of code is actually executed in a serial fashion. I have a multiple-core CPU and I'd like to take advantage of it. So let's add some concurrency with goroutines.
Goroutines, or a way to make my program parallel
a) First attempt: just add "go"s everywhere
Let's rewrite our code to make things easier to call and reuse and add the famous go keyword:
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8
for i:=0; i<NumEl; i++ {
go callProg(i) // <--- There!
}
}
func callProg(i int) {
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
Nothing! What is the problem? All the goroutines are executed at once. I don't really know why zenity is not executed but AFAIK, the Go program exited before the zenity external program could even be initialized. This was confirmed by the use of time.Sleep: waiting for a couple of seconds was enough to let the 8 instance of zenity launch themselves. I don't know if this can be considered a bug though.
To make it worse, the real program I'd actually like to call takes a while to execute itself. If I execute 8 instances of this program in parallel on my 4-core CPU, it's gonna waste some time doing a lot of context switching … I don't know how plain Go goroutines behave, but exec.Command will launch zenity 8 times in 8 different threads. To make it even worse, I want to execute this program more than 100,000 times. Doing all of that at once in goroutines won't be efficient at all. Still, I'd like to leverage my 4-core CPU!
b) Second attempt: use pools of goroutines
The online resources tend to recommend the use of sync.WaitGroup for this kind of work. The problem with that approach is that you are basically working with batches of goroutines: if I create of WaitGroup of 4 members, the Go program will wait for all the 4 external programs to finish before calling a new batch of 4 programs. This is not efficient: CPU is wasted, once again.
Some other resources recommended the use of a buffered channel to do the work:
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8 // Number of times the external program is called
NumCore := 4 // Number of available cores
c := make(chan bool, NumCore - 1)
for i:=0; i<NumEl; i++ {
go callProg(i, c)
c <- true // At the NumCoreth iteration, c is blocking
}
}
func callProg(i int, c chan bool) {
defer func () {<- c}()
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
This seems ugly. Channels were not intended for this purpose: I'm exploiting a side-effect. I love the concept of defer but I hate having to declare a function (even a lambda) to pop a value out of the dummy channel that I created. Oh, and of course, using a dummy channel is, by itself, ugly.
c) Third attempt: die when all the children are dead
Now we are nearly finished. I have just to take into account yet another side effect: the Go program closes before all the zenity pop-ups are closed. This is because when the loop is finised (at the 8th iteration), nothing prevents the program from finishing. This time, sync.WaitGroup will be useful.
package main
import (
"os/exec"
"strconv"
"sync"
)
func main() {
NumEl := 8 // Number of times the external program is called
NumCore := 4 // Number of available cores
c := make(chan bool, NumCore - 1)
wg := new(sync.WaitGroup)
wg.Add(NumEl) // Set the number of goroutines to (0 + NumEl)
for i:=0; i<NumEl; i++ {
go callProg(i, c, wg)
c <- true // At the NumCoreth iteration, c is blocking
}
wg.Wait() // Wait for all the children to die
close(c)
}
func callProg(i int, c chan bool, wg *sync.WaitGroup) {
defer func () {
<- c
wg.Done() // Decrease the number of alive goroutines
}()
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
Done.
My questions
Do you know any other proper way to limit the number of goroutines executed at once?
I don't mean threads; how Go manages goroutines internally is not relevant. I really mean limiting the number of goroutines launched at once: exec.Command creates a new thread each time it is called, so I should control the number of time it is called.
Does that code look fine to you?
Do you know how to avoid the use of a dummy channel in that case?
I can't convince myself that such dummy channels are the way to go.

I would spawn 4 worker goroutines that read the tasks from a common channel. Goroutines that are faster than others (because they are scheduled differently or happen to get simple tasks) will receive more task from this channel than others. In addition to that, I would use a sync.WaitGroup to wait for all workers to finish. The remaining part is just the creation of the tasks. You can see an example implementation of that approach here:
package main
import (
"os/exec"
"strconv"
"sync"
)
func main() {
tasks := make(chan *exec.Cmd, 64)
// spawn four worker goroutines
var wg sync.WaitGroup
for i := 0; i < 4; i++ {
wg.Add(1)
go func() {
for cmd := range tasks {
cmd.Run()
}
wg.Done()
}()
}
// generate some tasks
for i := 0; i < 10; i++ {
tasks <- exec.Command("zenity", "--info", "--text='Hello from iteration n."+strconv.Itoa(i)+"'")
}
close(tasks)
// wait for the workers to finish
wg.Wait()
}
There are probably other possible approaches, but I think this is a very clean solution that is easy to understand.

A simple approach to throttling (execute f() N times but maximum maxConcurrency concurrently), just a scheme:
package main
import (
"sync"
)
const maxConcurrency = 4 // for example
var throttle = make(chan int, maxConcurrency)
func main() {
const N = 100 // for example
var wg sync.WaitGroup
for i := 0; i < N; i++ {
throttle <- 1 // whatever number
wg.Add(1)
go f(i, &wg, throttle)
}
wg.Wait()
}
func f(i int, wg *sync.WaitGroup, throttle chan int) {
defer wg.Done()
// whatever processing
println(i)
<-throttle
}
Playground
I wouldn't probably call the throttle channel "dummy". IMHO it's an elegant way (it's not my invention of course), how to limit concurrency.
BTW: Please note that you're ignoring the returned error from cmd.Run().

🧩 Modules
Golang Concurrency Manager
📃 Template
package main
import (
"fmt"
"github.com/zenthangplus/goccm"
"math/rand"
"runtime"
)
func main() {
semaphore := goccm.New(runtime.NumCPU())
for {
semaphore.Wait()
go func() {
fmt.Println(rand.Int())
semaphore.Done()
}()
}
semaphore.WaitAllDone()
}
🎰 Optimal routine quantity
If the operation is CPU bounded: runtime.NumCPU()
Otherwise test with: time go run *.go
🔨 Configure
export GOPATH="$(pwd)/gopath"
go mod init *.go
go mod tidy
🧹 CleanUp
find "${GOPATH}" -exec chmod +w {} \;
rm --recursive --force "${GOPATH}"

try this:
https://github.com/korovkin/limiter
limiter := NewConcurrencyLimiter(10)
limiter.Execute(func() {
zenity(...)
})
limiter.Wait()

You could use Worker Pool pattern described here in this post.
This is how an implementation would look like ...
package main
import (
"os/exec"
"strconv"
)
func main() {
NumEl := 8
pool := 4
intChan := make(chan int)
for i:=0; i<pool; i++ {
go callProg(intChan) // <--- launch the worker routines
}
for i:=0;i<NumEl;i++{
intChan <- i // <--- push data which will be received by workers
}
close(intChan) // <--- will safely close the channel & terminate worker routines
}
func callProg(intChan chan int) {
for i := range intChan{
cmd := exec.Command("zenity", "--info", "--text='Hello from iteration n." + strconv.Itoa(i) + "'")
cmd.Run()
}
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Why does this program run faster when it's allocated fewer threads? - multithreading

Related

Peterson's algorithm and deadlock

Self-Synchronizing Goroutines end up with Deadlock

waitgroup on subset of go routines

recursively scan tree in parallel in golang

How would you define a pool of goroutines to be executed at once?

Categories

Resources