I recently realized that I don't know how to properly Read and Close concurrently in Go. In my particular case I need to do that with a serial port, but the problem is more general.
If we do that without any extra effort to synchronize things, we get a race condition. A simple example:
package main

import (
    "fmt"
    "os"
    "time"
)

func main() {
    f, err := os.Open("/dev/ttyUSB0")
    if err != nil {
        panic(err)
    }

    // Start a goroutine which keeps reading from a serial port
    go reader(f)

    time.Sleep(1000 * time.Millisecond)
    fmt.Println("closing")
    f.Close()
    time.Sleep(1000 * time.Millisecond)
}

func reader(f *os.File) {
    b := make([]byte, 100)
    for {
        f.Read(b)
    }
}
If we save the above as main.go and run go run -race main.go, the output will look as follows:
closing
==================
WARNING: DATA RACE
Write at 0x00c4200143c0 by main goroutine:
os.(*file).close()
/usr/local/go/src/os/file_unix.go:143 +0x124
os.(*File).Close()
/usr/local/go/src/os/file_unix.go:132 +0x55
main.main()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:20 +0x13f
Previous read at 0x00c4200143c0 by goroutine 6:
os.(*File).read()
/usr/local/go/src/os/file_unix.go:228 +0x50
os.(*File).Read()
/usr/local/go/src/os/file.go:101 +0x6f
main.reader()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:27 +0x8b
Goroutine 6 (running) created at:
main.main()
/home/dimon/mydata/projects/go/src/dmitryfrank.com/testfiles/main.go:16 +0x81
==================
Found 1 data race(s)
exit status 66
Ok, but how do we handle that properly? Of course, we can't just lock some mutex before calling f.Read(), because the mutex would end up locked basically all the time. To make it work properly, we'd need some sort of cooperation between reading and locking, like condition variables provide: the mutex gets unlocked before the goroutine is put to sleep, and is locked back when the goroutine wakes up.
I would implement something like this manually, but then I need some way to select on the data while reading. Like this (pseudocode):
select {
case b := <-f.NextByte():
    // process the byte somehow
default:
}
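For illustration, something like this sketch is what I have in mind: pumping reads through a dedicated goroutine makes them selectable, though it doesn't by itself solve the Read/Close race (NextByte and byteStream are made-up names, not a real API):

package main

import "os"

// byteStream is a hypothetical helper: it pumps reads from f into a
// channel so that callers can select on individual bytes.
func byteStream(f *os.File) <-chan byte {
    ch := make(chan byte)
    go func() {
        defer close(ch)
        buf := make([]byte, 100)
        for {
            n, err := f.Read(buf)
            for _, b := range buf[:n] {
                ch <- b
            }
            if err != nil {
                return // EOF, or the file was closed under us
            }
        }
    }()
    return ch
}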
I examined the docs of the os and sync packages, and so far I don't see any way to do that.
I believe you need 2 signals:

main -> reader, to tell it to stop reading
reader -> main, to tell that the reader has terminated

Of course, you can pick whichever Go signaling primitive (channel, WaitGroup, context, etc.) you prefer.

In the example below I use a WaitGroup and a context. The reason is that you can spin up multiple readers and only need to cancel the context to tell all the reader goroutines to stop. I created multiple goroutines just as an example, to show that you can even coordinate several of them this way.
package main

import (
    "context"
    "fmt"
    "os"
    "sync"
    "time"
)

func main() {
    ctx, cancelFn := context.WithCancel(context.Background())
    f, err := os.Open("/dev/ttyUSB0")
    if err != nil {
        panic(err)
    }

    var wg sync.WaitGroup
    for i := 0; i < 3; i++ {
        wg.Add(1)
        // Start a goroutine which keeps reading from a serial port
        go func(i int) {
            defer wg.Done()
            reader(ctx, f)
            fmt.Printf("reader %d closed\n", i)
        }(i)
    }

    time.Sleep(1000 * time.Millisecond)
    fmt.Println("closing")
    cancelFn() // signal all readers to stop
    wg.Wait()  // wait until all readers have finished
    f.Close()
    fmt.Println("file closed")
    time.Sleep(1000 * time.Millisecond)
}

func reader(ctx context.Context, f *os.File) {
    b := make([]byte, 100)
    for {
        select {
        case <-ctx.Done():
            return
        default:
            f.Read(b)
        }
    }
}
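One caveat worth noting: the f.Read(b) in the default branch can still block until data arrives, so the ctx.Done() check only runs between reads. A minimal sketch of one way around that, assuming the platform supports read deadlines on this file (os.File.SetReadDeadline returns os.ErrNoDeadline where it doesn't; this is my addition, not part of the answer above):

package main

import (
    "context"
    "errors"
    "os"
    "time"
)

// readerWithDeadline wakes up periodically so the ctx.Done() check
// gets a chance to run even when no data arrives.
func readerWithDeadline(ctx context.Context, f *os.File) {
    b := make([]byte, 100)
    for {
        select {
        case <-ctx.Done():
            return
        default:
        }
        if err := f.SetReadDeadline(time.Now().Add(100 * time.Millisecond)); err != nil {
            return // deadlines not supported on this file
        }
        if _, err := f.Read(b); err != nil && !errors.Is(err, os.ErrDeadlineExceeded) {
            return // real error, not just a timeout
        }
    }
}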
I have a golang grpc server which has a streaming endpoint. Earlier I was doing all the work sequentially and sending on the stream, but then I realized I can make the work concurrent and then send on the stream. From the grpc-go docs I understood that I can make the work concurrent, but I can't make sending on the stream concurrent, so I came up with the code below, which does the job.
Below is the code I have in my streaming endpoint, which sends data back to the client in a streaming way. It does all the work concurrently.
// get "allCids" from lot of files and load in memory.
allCids := .....
var data = allCids.([]int64)
out := make(chan *custPbV1.CustomerResponse, len(data))
wg := &sync.WaitGroup{}
wg.Add(len(data))
go func() {
wg.Wait()
close(out)
}()
for _, cid := range data {
go func (id int64) {
defer wg.Done()
pd := repo.GetCustomerData(strconv.FormatInt(cid, 10))
if !pd.IsCorrect {
return
}
resources := us.helperCom.GenerateResourceString(pd)
val, err := us.GenerateInfo(clientId, resources, cfg)
if err != nil {
return
}
out <- val
}(cid)
}
for val := range out {
if err := stream.Send(val); err != nil {
log.Printf("send error %v", err)
}
}
Now the problem I have is that the size of the data slice can be around a million, and I don't want to spawn a million goroutines to do the job. How do I handle that scenario here? If I use 100 instead of len(data), will that work for me, or do I need to slice data into 100 sub-arrays as well? I am just confused about the best way to deal with this problem.
I recently started with golang, so pardon me if there are any mistakes in my above code while making it concurrent.
Please check this pseudocode:
func main() {
    works := make(chan int64, 100)
    errChan := make(chan error, 100)
    out := make(chan *custPbV1.CustomerResponse, 100)

    // spawn fixed workers
    var workerWg sync.WaitGroup
    for i := 0; i < 100; i++ {
        workerWg.Add(1)
        go worker(&workerWg, works, errChan, out)
    }

    // feed the input
    go func() {
        for _, cid := range data {
            // this will block if all the workers are busy and no space is left in the channel.
            works <- cid
        }
        close(works)
    }()

    var analyzeResults sync.WaitGroup
    analyzeResults.Add(2)

    // process errors
    go func() {
        for err := range errChan {
            log.Printf("error %v", err)
        }
        analyzeResults.Done()
    }()

    // process output
    go func() {
        for val := range out {
            if err := stream.Send(val); err != nil {
                log.Printf("send error %v", err)
            }
        }
        analyzeResults.Done()
    }()

    workerWg.Wait()
    close(out)
    close(errChan)
    analyzeResults.Wait()
}

func worker(job *sync.WaitGroup, works chan int64, errChan chan error, out chan *custPbV1.CustomerResponse) {
    defer job.Done()
    // An idle worker takes the next job from this channel.
    for cid := range works {
        pd := repo.GetCustomerData(strconv.FormatInt(cid, 10))
        if !pd.IsCorrect {
            errChan <- fmt.Errorf("pd for cid %d is incorrect", cid)
            // We cannot return here, as that would shrink the worker pool.
            // If all workers did this, there might be no workers left to do the job.
            continue
        }
        resources := us.helperCom.GenerateResourceString(pd)
        val, err := us.GenerateInfo(clientId, resources, cfg)
        if err != nil {
            errChan <- fmt.Errorf("got error: %w", err)
            continue
        }
        out <- val
    }
}
Explanation:
This is a worker pool implementation where we spawn a fixed number of goroutines (100 workers here) to do the same job (GetCustomerData() and GenerateInfo() here) with different input data (cid here). 100 workers here does not mean the work is parallel, only concurrent (it depends on GOMAXPROCS). If one worker is waiting on an I/O result (basically some blocking operation), that particular goroutine is context-switched out and another worker goroutine gets a chance to execute. Increasing the number of goroutines (workers) may not give much more performance, and can instead lead to contention on the channel, as more workers wait for input jobs on it.
The benefit over splitting the 1 million items into subslices is this: say we have 1000 jobs and 100 workers, and each worker is statically assigned jobs 1-10, 11-20, and so on. What if the first 10 jobs take more time than the others? Then the first worker is overloaded, while the other workers finish their tasks and sit idle even though there are pending tasks. To avoid that situation, this is the better solution, as an idle worker simply takes the next job, so no worker ends up more overloaded than the others. A related bounding technique is sketched below.
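For comparison, concurrency can also be bounded without a fixed pool by using a buffered channel as a counting semaphore. A minimal sketch, where process(cid) is a hypothetical stand-in for the per-item work (not from the original code):

package main

import "sync"

// process is a hypothetical stand-in for the real per-item work
// (GetCustomerData + GenerateInfo in the question).
func process(cid int64) {}

func main() {
    data := []int64{1, 2, 3, 4, 5}
    sem := make(chan struct{}, 100) // at most 100 goroutines in flight
    var wg sync.WaitGroup
    for _, cid := range data {
        wg.Add(1)
        sem <- struct{}{} // blocks while 100 goroutines are already running
        go func(id int64) {
            defer wg.Done()
            defer func() { <-sem }() // release the slot
            process(id)
        }(cid)
    }
    wg.Wait()
}

The worker pool above is usually preferable for a million items, since it creates 100 goroutines in total rather than one per item, but the semaphore version is a smaller change to the original code.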
Sorry, this title might be misleading. Actually the full code is here below:
package main

import (
    "fmt"
    "sync"
)

type Button struct {
    Clicked *sync.Cond
}

func main() {
    button := Button{
        Clicked: sync.NewCond(&sync.Mutex{}),
    }

    subscribe := func(c *sync.Cond, fn func()) {
        var wg sync.WaitGroup
        wg.Add(1)
        go func() {
            wg.Done()
            c.L.Lock()
            defer c.L.Unlock()
            c.Wait()
            fn()
        }()
        wg.Wait()
    }

    var clickRegistered sync.WaitGroup
    clickRegistered.Add(2)

    subscribe(button.Clicked, func() {
        fmt.Println("maximizing window")
        clickRegistered.Done()
    })

    subscribe(button.Clicked, func() {
        fmt.Println("displaying dialog")
        clickRegistered.Done()
    })

    button.Clicked.Broadcast()
    clickRegistered.Wait()
}
When I comment out some lines and run it again, it throws a fatal error: "all goroutines are asleep - deadlock!"
The altered subscribe function looks like this:
subscribe := func(c *sync.Cond, fn func()) {
    //var wg sync.WaitGroup
    //wg.Add(1)
    go func() {
        //wg.Done()
        c.L.Lock()
        defer c.L.Unlock()
        c.Wait()
        fn()
    }()
    //wg.Wait()
}
What confuses me is whether the go func is executed before the outer subscribe function returns. I thought the go func would keep running as a daemon even after the outer function returned, so the wg variable seemed unnecessary, but it turns out I'm totally wrong. So if the go func may not have been scheduled yet, does that mean we must use a sync.WaitGroup in every function or code block to make sure the goroutine is scheduled to execute before the function or code block returns?
Thank you all.
The problem is that c.Wait() in either call is not guaranteed to run before button.Clicked.Broadcast(); and even your original code's use of WaitGroup does not guarantee it (since it is the c.Wait() call, not the spawning of the goroutine, that matters).
The modified subscribe:
subscribe := func(c *sync.Cond, subWG *sync.WaitGroup, fn func()) {
    go func() {
        c.L.Lock()
        defer c.L.Unlock()
        subWG.Done() // [2]
        c.Wait()
        fn()
    }()
}
And the waiting code:

subWG.Wait()
button.Clicked.L.Lock()
button.Clicked.L.Unlock()
This is based on the observation that [2] can only execute either at the very beginning or after all previous goroutines that executed [2] are already blocked in c.Wait(), because of the locker they share. So once subWG.Wait() returns, meaning [2] has executed 2 times (or however many subscriptions there are), at most one goroutine can still be between [2] and c.Wait(), and that case is ruled out by locking and unlocking the locker one more time.
Playground: https://play.golang.org/p/6mjUEcn3ec5
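For clarity, here is how those pieces might be assembled into a full program (my reconstruction; the playground link above has the author's version):

package main

import (
    "fmt"
    "sync"
)

func main() {
    cond := sync.NewCond(&sync.Mutex{})

    subscribe := func(c *sync.Cond, subWG *sync.WaitGroup, fn func()) {
        go func() {
            c.L.Lock()
            defer c.L.Unlock()
            subWG.Done() // [2]
            c.Wait()
            fn()
        }()
    }

    var clickRegistered, subWG sync.WaitGroup
    clickRegistered.Add(2)
    subWG.Add(2)

    subscribe(cond, &subWG, func() {
        fmt.Println("maximizing window")
        clickRegistered.Done()
    })
    subscribe(cond, &subWG, func() {
        fmt.Println("displaying dialog")
        clickRegistered.Done()
    })

    subWG.Wait()
    // At most one goroutine may have run [2] without yet reaching
    // c.Wait(); it still holds the lock, so locking once more here
    // can only succeed after it has called c.Wait().
    cond.L.Lock()
    cond.L.Unlock()

    cond.Broadcast()
    clickRegistered.Wait()
}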
With the wg WaitGroup (as coded in your current version): when the subscribe function returns, you know that the waiting goroutine has at least started its execution.
So when your main function reaches button.Clicked.Broadcast(), there's a good chance the two goroutines are actually waiting on their button.Clicked.Wait() call.
Without the wg, you have no guarantee that the goroutines have even started, and your code may call button.Clicked.Broadcast() too soon.
Note that your use of wg merely makes the deadlock less probable; it does not prevent it in all cases.
Try compiling your binary with -race and running it in a loop (e.g. from bash: for i in {1..100}; do ./myprogram; done); I think you will see the same problem happen sometimes.
In Java I can make threads run for long periods of time, and I don't need to stay within the function that started the thread.
Goroutines, Go's answer to threads, seem to stop running after I return from the function that started them.
How can I make these goroutines stay running and still return from the calling function?
Thanks
Goroutines do continue running after the function that invokes them exits: Playground
package main

import (
    "fmt"
    "time"
)

func countToTen() chan bool {
    done := make(chan bool)
    go func() {
        for i := 0; i < 10; i++ {
            time.Sleep(1 * time.Second)
            fmt.Println(i)
        }
        done <- true
    }()
    return done
}

func main() {
    done := countToTen()
    fmt.Println("countToTen() exited")
    // reading from the 'done' channel will block the main thread
    // until there is something to read, which won't happen until
    // countToTen()'s goroutine is finished
    <-done
}
Note that we need to block the main thread until countToTen()'s goroutine completes. If we don't do this, the main thread will exit and all other goroutines will be stopped even if they haven't completed their task yet.
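The same "wait for the goroutine to finish" idea can also be expressed with a sync.WaitGroup instead of a channel; a minimal sketch of that variation:

package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := 0; i < 10; i++ {
            time.Sleep(1 * time.Second)
            fmt.Println(i)
        }
    }()
    fmt.Println("goroutine started; main now blocks until it finishes")
    wg.Wait()
}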
You can.
If you want a goroutine to keep running in the background indefinitely, you need some kind of infinite loop with a graceful stopping mechanism in place, usually via a channel. Invoke the goroutine from some other function, so that even after this other function terminates, your goroutine will still be running.
For example:
// Goroutine which will run indefinitely,
// unless you send a signal on the quit channel.
func goroutine(quit chan bool) {
    for {
        select {
        case <-quit:
            fmt.Println("quit")
            return
        default:
            fmt.Println("Do your thing")
        }
    }
}

// The goroutine will still be running
// after you return from this function.
func invoker() {
    q := make(chan bool)
    go goroutine(q)
}
Here, you can call invoker when you want to start the goroutine, and even after invoker returns, your goroutine will still be running in the background.
The only exception is that when the main function returns, all goroutines in the application are terminated.
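To show the whole lifecycle, here is a sketch of stopping the background goroutine later, assuming invoker is changed to return the quit channel (my variation on the code above; the pacing sleep is only there to keep the demo output readable):

package main

import (
    "fmt"
    "time"
)

func goroutine(quit chan bool) {
    for {
        select {
        case <-quit:
            fmt.Println("quit")
            return
        default:
            fmt.Println("Do your thing")
            time.Sleep(500 * time.Millisecond) // pace the demo output
        }
    }
}

func invoker() chan bool {
    q := make(chan bool)
    go goroutine(q)
    return q // hand the stop switch to the caller
}

func main() {
    quit := invoker()
    time.Sleep(2 * time.Second)        // goroutine keeps running after invoker returned
    quit <- true                       // ask it to stop
    time.Sleep(100 * time.Millisecond) // give it a moment to print "quit" (demo only)
}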
I have a Go program that runs continuously and relies entirely on goroutines + 1 manager thread. The main thread simply calls goroutines and otherwise sleeps.
There is a memory leak. The program uses more and more memory until it drains all 16GB RAM + 32GB SWAP and then each goroutine panics. It is actually OS memory that causes the panic, usually the panic is fork/exec ./anotherapp: cannot allocate memory when I try to execute anotherapp.
When this happens, all of the worker goroutines panic, are recovered, and restart... at which point the memory usage does not decrease; it remains at 48GB even though there is now virtually nothing allocated. This means all goroutines will always panic, as there is never enough memory, until the entire executable is killed and restarted completely.
The entire thing is about 50,000 lines, but the actual problematic area is as follows:
type queue struct {
    identifier string
    kind       bool // named "type" in the original post; renamed here because "type" is a reserved word
}

func main() {
    // Set the number of goroutines that can be run
    var xthreads int32 = 10
    var usedthreads int32
    runtime.GOMAXPROCS(14)
    ready := make(chan *queue, 5)

    // Start the manager goroutine, which prepares identifiers in the background, always keeping 5 ready to go
    go manager(ready)

    // Start creating goroutines to process as they are ready
    for obj := range ready { // loops through the "ready" channel and waits when there is nothing
        // This section uses atomic instead of a blocking channel, from an earlier attempt to stop the memory leak, but it didn't work
        for atomic.LoadInt32(&usedthreads) >= xthreads {
            time.Sleep(time.Second)
        }
        debug.FreeOSMemory()             // Try to clean up the memory; also did not stop the leak
        atomic.AddInt32(&usedthreads, 1) // Mark goroutine as started

        // Unleak obj, probably unnecessary, but just to be safe
        cp := new(queue)
        cp.identifier = unleak.String(obj.identifier) // unleak is a 3rd party package that makes a copy of the string
        cp.kind = obj.kind
        go runit(cp, &usedthreads) // Start the processing goroutine
    }
    fmt.Println(`END`) // This should never happen, as the channels are never closed
}

func manager(ready chan *queue) {
    // This goroutine communicates with another server and fills the "ready" channel
}

// This is the goroutine
func runit(obj *queue, threadcount *int32) {
    defer func() {
        if r := recover(); r != nil {
            // Panicked
            erstring := fmt.Sprint(r)
            reportFatal(obj.identifier, erstring)
        } else {
            // Completed successfully
            reportDone(obj.identifier)
        }
        atomic.AddInt32(threadcount, -1) // Mark goroutine as finished
    }()
    do(obj) // This function does the actual processing
}
As far as I can see, when the do function (last line) ends, either by having finished or having panicked, the runit function then ends, which ends the goroutine entirely, which means all of the memory from that goroutine should now be freed. This is not what happens. What happens is that the app just uses more and more memory until it becomes unable to function, all the runit goroutines panic, and yet the memory does not decrease.
Profiling does not reveal anything suspicious. The leak appears to be outside of the profiler's scope.
Please consider inverting the pattern; see here or below:
package main

import (
    "log"
    "math/rand"
    "sync"
    "time"
)

// I do work
func worker(id int, work chan int) {
    for i := range work {
        // Work simulation
        log.Printf("Worker %d, sleeping for %d seconds\n", id, i)
        time.Sleep(time.Duration(rand.Intn(i)) * time.Second)
    }
}

// Return some fake work
func getWork() int {
    return rand.Intn(2) + 1
}

func main() {
    wg := new(sync.WaitGroup)
    work := make(chan int)

    // run 10 workers
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(i int) {
            worker(i, work)
            wg.Done()
        }(i)
    }

    // main "thread"
    for i := 0; i < 100; i++ {
        work <- getWork()
    }

    // signal there is no more work to be done
    close(work)

    // Wait for the workers to exit
    wg.Wait()
}
I am currently studying Go, and I miss setTimeout from Node.js. I haven't read much yet, and I'm wondering if I could implement the same thing in Go, like an interval or a callback loop.
Is there a way I can translate this from Node to Go? I heard Go handles concurrency very well; might this be done with some goroutines or something else?
//Nodejs
function main() {
    //Do something
    setTimeout(main, 3000)
    console.log('Server is listening to 1337')
}
Thank you in advance!
//Go version
func main() {
    for t := range time.Tick(3 * time.Second) {
        fmt.Printf("working %s \n", t)
    }
    //basically this will not execute..
    fmt.Printf("will be called 1st")
}
The closest equivalent is the time.AfterFunc function:
import "time"
...
time.AfterFunc(3*time.Second, somefunction)
This will spawn a new goroutine and run the given function after the specified amount of time. There are other related functions in the package that may be of use:
time.After: this version will return a channel that will send a value after the given amount of time. This can be useful in combination with the select statement if you want a timeout while waiting on one or more channels.
time.Sleep: this version will simply block until the timer expires. In Go it is more common to write synchronous code and rely on the scheduler to switch to other goroutines, so sometimes simply blocking is the best solution.
There is also the time.Timer and time.Ticker types that can be used for less trivial cases where you may need to cancel the timer.
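For instance, a small sketch of cancelling a pending timer with the Stop method, for the less trivial cases mentioned above:

package main

import (
    "fmt"
    "time"
)

func main() {
    t := time.NewTimer(3 * time.Second)
    if t.Stop() { // Stop reports whether it cancelled the timer before it fired
        fmt.Println("timer cancelled before firing")
    }

    // time.AfterFunc also returns a *time.Timer, so a scheduled
    // function call can be cancelled the same way.
    t2 := time.AfterFunc(3*time.Second, func() { fmt.Println("never printed") })
    t2.Stop()
}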
This website provides an interesting example and explanation of timeouts involving channels and the select statement.
// _Timeouts_ are important for programs that connect to
// external resources or that otherwise need to bound
// execution time. Implementing timeouts in Go is easy and
// elegant thanks to channels and `select`.
package main

import (
    "fmt"
    "time"
)

func main() {
    // For our example, suppose we're executing an external
    // call that returns its result on a channel `c1`
    // after 2s.
    c1 := make(chan string, 1)
    go func() {
        time.Sleep(2 * time.Second)
        c1 <- "result 1"
    }()

    // Here's the `select` implementing a timeout.
    // `res := <-c1` awaits the result and `<-time.After`
    // awaits a value to be sent after the timeout of
    // 1s. Since `select` proceeds with the first
    // receive that's ready, we'll take the timeout case
    // if the operation takes more than the allowed 1s.
    select {
    case res := <-c1:
        fmt.Println(res)
    case <-time.After(1 * time.Second):
        fmt.Println("timeout 1")
    }

    // If we allow a longer timeout of 3s, then the receive
    // from `c2` will succeed and we'll print the result.
    c2 := make(chan string, 1)
    go func() {
        time.Sleep(2 * time.Second)
        c2 <- "result 2"
    }()
    select {
    case res := <-c2:
        fmt.Println(res)
    case <-time.After(3 * time.Second):
        fmt.Println("timeout 2")
    }
}
You can also run it on the Go Playground
Another solution could be to use an immediately-invoked function expression (IIFE), like:
go func() {
    time.Sleep(time.Second * 3)
    // your code here
}()
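The question's Node snippet reschedules itself, and that pattern maps naturally onto time.AfterFunc calling its own function again. A minimal sketch (my illustration):

package main

import (
    "fmt"
    "time"
)

func main() {
    var tick func()
    tick = func() {
        fmt.Println("Server is listening to 1337")
        time.AfterFunc(3*time.Second, tick) // reschedule, like the recursive setTimeout
    }
    tick()
    select {} // block forever so main doesn't exit (demo only)
}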
You can also do this by using the time.Sleep function, giving it the duration you need:
package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Println("First")
    time.Sleep(5 * time.Second)
    fmt.Println("second")
}