Should idle threads be left around in long running process? - multithreading

I am creating a go program that is intended to run long term and listen for work. When it receives a request, it runs the work on a process queue.
I am new to golang and systems programming, so my question is this: should I spin up the process queue (with its multiple idle worker threads) at program launch (they will just sit there until work comes in), or should I spin them up when work arrives and shut them down when finished?
I am unclear as to the overall system impact multiple idle threads will have, but I am assuming since they are idle there will be no impact until work arrives. That being said, I want to make sure my program is a "good neighbor" and as efficient as possible.
--EDIT--
To clarify, the "process pool" is a group of worker go routines waiting for work on a channel. Should they be started/stopped when work arrives, or started when the program launches and left waiting until work comes in?

First of all, you can't create a thread directly with the standard Go library. In Go you use goroutines, which are so-called green threads.
Usually you shouldn't keep "reusable" goroutines around. They are cheap to create, so create one on demand when a job arrives and let it finish (return from the goroutine) as soon as the work is completed.
Also, don't hesitate to create nested goroutines. In general, spawn them freely whenever something should run concurrently, and don't try to reuse them, as it makes little sense.

There is very little cost either way. goroutines don't require a separate OS thread and consume practically no resources while blocking on a channel receive, but also cost very little to spin up, so there's no great reason to leave them open either.
My code rarely uses worker pools. Generally my producer will spawn a goroutine for every unit of work it produces and hands it off directly along with a response channel, then spawns a "listener" that does some formatting for the work output and pipes all the responses back to the main thread. A common pattern for me looks like:
func Foo(input []interface{}) chan interface{} {
    var wg sync.WaitGroup
    resp := make(chan interface{})
    listen := make(chan interface{})
    theWork := makeWork(input)
    // do work
    for _, unitOfWork := range theWork {
        wg.Add(1)
        // pass unitOfWork as an argument so each goroutine gets its own copy
        go func(w interface{}) {
            defer wg.Done()
            // doWork has signature:
            // func doWork(w interface{}, ch chan interface{})
            doWork(w, listen)
        }(unitOfWork)
    }
    // format the output of listen chan and send to resp chan
    // then close resp chan so main can continue
    go func() {
        for r := range listen {
            resp <- doFormatting(r)
        }
        close(resp)
    }()
    // close listen chan after work is done
    go func() {
        wg.Wait()
        close(listen)
    }()
    return resp
}
Then my main function passes it some input and listens on the response channel
func main() {
    // Foo takes []interface{}, so the input must be declared with that type
    loremipsum := []interface{}{"foo", "bar", "spam", "eggs"}
    response := Foo(loremipsum)
    for output := range response {
        fmt.Println(output)
    }
}

The pattern of a task queue with waiting workers is common in Go. Goroutines are cheap, but their order of execution is nondeterministic. So if you want your system's behavior to be predictable, you are better off controlling how the workers rendezvous with the main routine, for example through unbuffered channels that they receive from in a loop. Otherwise some of them may be spawned but remain idle, which is legal.
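As a rough illustration of the queue-with-waiting-workers pattern, here is a minimal sketch. The function name sumSquares and the squaring "job" are made up for the example; in a long-running service the jobs channel would simply stay open and the idle workers would sit parked on the receive.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical job: it squares each input on a fixed pool of
// workers and adds up the results.
func sumSquares(workers int, inputs []int) int {
	jobs := make(chan int) // unbuffered: a send completes only when a worker receives
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// An idle worker is parked on this receive until a job arrives.
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// Feed the queue, then close it so the workers leave their loops.
	go func() {
		for _, n := range inputs {
			jobs <- n
		}
		close(jobs)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(sumSquares(3, []int{1, 2, 3, 4})) // prints 30
}
```

Which worker handles which job is not determined, but the unbuffered jobs channel guarantees that each job is handed to exactly one worker that is ready for it.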

Related

Best way to wake 0-N sleeping goroutines at once

I'm writing a program where I start N (N is a command-line argument) worker threads, and at any time 0 to N-1 of them can be waiting on another to update a variable. What's the best way for the threads to wait for this event, and the best way for one of the threads to notify all the others at once of the event occurring? This event will be sent multiple times by each thread.
sync.Cond isn't appropriate because the threads don't need to lock a resource upon waking from sleep. sync.WaitGroup won't work because I don't know how many times to call wg.Done().
Solution #1: I could use a sync.Mutex and have the thread that will eventually notify the others acquire the lock and then unlock it to notify the others, but it seems really inefficient for the others to all fight over a lock when they all just need to pop out of sleep, read a variable to see if that particular worker is now the master, and then either go back to sleep or start working.
Solution #2: Create a wrapper for sync.WaitGroup that allows keeping track of the number of waiting threads so that I can call wg.Add(-numWaitingThreads) to wake them. This sounds like a headache to figure out how to code it without all sorts of race conditions.
Solution #3: Until someone comes up with a better idea, I'll be using a list of N channels and have the notifier non-blocking-send to all of the channels except its own. Is this really the best way?
More details: I give each worker some unique credits and have a central variable for "which credit is the next to be written to the output file". When a worker finishes its work for whichever credit ID it was working on, it needs to do the following:
for centralNextCreditID != creditID {
wait_for_centralNextCreditID_to_change()
}
saveWorkToFile()
centralNextCreditID++
wake_other_threads_waiting_for_centralNextCreditID_to_change()
To me it does seem like this is an appropriate use case for sync.Cond. You can use a *RWMutex.RLocker() for Cond.L so all goroutines can acquire the read lock simultaneously once the Cond.Broadcast() is sent.
Additionally, it may be worth making sure you hold a write lock when changing this "who's master" variable to avoid race conditions, which would make sync.Cond an even better fit.
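A minimal sketch of that suggestion, assuming the "central next credit ID" from the question is a plain int. The type sequencer and its methods are names invented for this example; the point is only that waiters re-check the variable under the read lock while the updater changes it under the write lock and then broadcasts.

```go
package main

import (
	"fmt"
	"sync"
)

// sequencer is a hypothetical wrapper around the shared "next ID" variable.
type sequencer struct {
	mu   sync.RWMutex
	cond *sync.Cond
	next int
}

func newSequencer(first int) *sequencer {
	s := &sequencer{next: first}
	// Waiters only need the read lock, so they can all wake up together.
	s.cond = sync.NewCond(s.mu.RLocker())
	return s
}

// waitFor blocks until it is id's turn.
func (s *sequencer) waitFor(id int) {
	s.cond.L.Lock() // this is RLock on the underlying RWMutex
	for s.next != id {
		s.cond.Wait()
	}
	s.cond.L.Unlock()
}

// advance moves to the next ID under the write lock and wakes all waiters.
func (s *sequencer) advance() {
	s.mu.Lock()
	s.next++
	s.mu.Unlock()
	s.cond.Broadcast()
}

// runInOrder starts n workers in reverse order and returns the order in
// which they were allowed to "save" their work.
func runInOrder(n int) []int {
	s := newSequencer(0)
	var order []int
	var wg sync.WaitGroup
	for id := n - 1; id >= 0; id-- {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			s.waitFor(id)
			order = append(order, id) // stands in for saveWorkToFile()
			s.advance()
		}(id)
	}
	wg.Wait()
	return order
}

func main() {
	fmt.Println(runInOrder(5)) // prints [0 1 2 3 4]
}
```

Only the worker whose turn it is touches the output between waitFor and advance, so the append needs no extra lock in this sketch.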
sync.WaitGroup won't work because I don't know how many times to call wg.Done().
wg can be used in this case. Create a wg with a count of 1 and pass it to the N goroutines. Have all of them call wg.Wait(), except the one that updates the variable.
The goroutine updating the variable calls wg.Done() after a successful update, which causes the waiting goroutines to come out of Wait() and continue executing.
The title says that you want to wake 0-N sleeping goroutines, but the body of the question indicates that you only need to wake the goroutine for the next id (if there is a goroutine waiting).
Here's how to implement the problem described in the body of the question:
// waiter sequences work according to an incrementing id.
type waiter struct {
    mu      sync.Mutex
    id      int
    waiting map[int]chan struct{}
}
func NewWaiter(firstID int) *waiter {
    return &waiter{id: firstID, waiting: make(map[int]chan struct{})}
}
// wait waits for id's turn in the sequence.
func (w *waiter) wait(id int) {
    w.mu.Lock()
    if w.id == id {
        // This id is next. Nothing to do.
        w.mu.Unlock()
        return
    }
    // Wait for our turn.
    c := make(chan struct{})
    w.waiting[id] = c
    w.mu.Unlock()
    <-c
}
// done signals that the work for the previous id is done.
func (w *waiter) done() {
    w.mu.Lock()
    w.id++
    c, ok := w.waiting[w.id]
    if ok {
        delete(w.waiting, w.id)
    }
    w.mu.Unlock()
    if ok {
        // closing c makes the pending receive on c return a zero value
        close(c)
    }
}
Here's how to use it:
for _, creditID := range creditIDs {
    doWorkFor(creditID)
    waiter.wait(creditID)
    saveWorkToFile()
    waiter.done()
}
WaitGroup is the best option. The reason is that it keeps its signalled state, so you are safe from deadlock if the main thread signals too early.
If you use Cond there is a risk that the main thread calls cond.Broadcast BEFORE the worker thread calls cond.Wait(). Since Cond doesn't remember that it was signalled, the worker thread will wait for the event to happen.
Here is an example: https://go.dev/play/p/YLfvEGO2A18
The main thread broadcasts too early, the worker threads run into a deadlock.
Same case with sync.WaitGroup: https://go.dev/play/p/R6_-ULo2eJ2
The main thread releases the wait group too early, but there is no deadlock.
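The "too early" case can be sketched locally as follows. This is not the code behind the playground links, just a small illustration with invented names (releaseEarly, gate): Done() is called before any worker reaches Wait(), and every worker still gets through because the counter is already zero.

```go
package main

import (
	"fmt"
	"sync"
)

// releaseEarly opens the gate before any worker waits on it and returns how
// many workers passed the gate.
func releaseEarly(workers int) int {
	var gate sync.WaitGroup
	gate.Add(1)
	gate.Done() // signalled "too early": nobody is waiting yet

	var finished sync.WaitGroup
	var mu sync.Mutex
	passed := 0
	for i := 0; i < workers; i++ {
		finished.Add(1)
		go func() {
			defer finished.Done()
			gate.Wait() // returns at once, the counter is already zero
			mu.Lock()
			passed++
			mu.Unlock()
		}()
	}
	finished.Wait()
	return passed
}

func main() {
	fmt.Println(releaseEarly(4)) // prints 4
}
```

Note that a gate like this opens only once. For an event that fires repeatedly, as in the question, each round would need a fresh gate.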

Golang: can WaitGroup leak with go-routines

I am planning to implement a go-routine and have a sync.WaitGroup to synchronize end of a created go-routine. I create a thread essentially using go <function>. So it is something like:
main() {
var wg sync.WaitGroup
for <some condition> {
go myThread(wg)
wg.Add(1)
}
wg.wait()
}
myThread(wg sync.WaitGroup) {
defer wg.Done()
}
I have earlier worked with pthread_create, which does fail to create a thread under some circumstances. With that context, is it possible for the above go myThread(wg) to fail to start, and/or fail to run wg.Done(), if the rest of the routine behaves correctly? If so, what would be reported and how would the error be caught? My concern is a possible leak in wg due to wg.Add(1) following the thread creation. (Of course it may be possible to use wg.Add(1) within the function, but that leads to other races between the increment and the main program waiting.)
I have read through a lot of documentation on go-routines and there is no mention of a failure in scheduling or thread creation anywhere. What would be the case if I create billions of threads and exhaust bookkeeping space? Would the go-routine still work and threads still be created?
I don't know of any possible way for this to fail, and if it is possible, it would result in a panic (and therefore an application crash). I have never seen it happen, and I'm aware of examples of applications running millions of goroutines. The only limiting factor is available memory to allocate the goroutine stack.
go foo() is not like pthread_create. Goroutines are lightweight green threads handled by the Go runtime, and scheduled to run on OS threads. Starting a goroutine does not start a new OS thread.
The problem with your code is not in starting a goroutine (which cannot "fail" per se) or the like, but in the use of sync.WaitGroup. Your code has two major bugs:
You must do wg.Add(1) before launching the goroutine as otherwise the Done() could be executed before the Add(1).
You must not copy a sync.WaitGroup. Your code makes a copy while calling myThread().
Both issues are explained in the official documentation to sync.WaitGroup and the given example in https://golang.org/pkg/sync/#WaitGroup
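A corrected version of the question's skeleton might look like the sketch below. The counter and the names worker and runWorkers are additions for the sake of a runnable example; the two relevant changes are that wg.Add(1) happens before the go statement and that the WaitGroup is passed by pointer.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// worker receives a pointer, so it calls Done on the caller's WaitGroup
// rather than on a copy.
func worker(wg *sync.WaitGroup, counter *int64) {
	defer wg.Done()
	atomic.AddInt64(counter, 1) // placeholder for the real work
}

// runWorkers starts n goroutines and returns how many of them ran.
func runWorkers(n int) int64 {
	var wg sync.WaitGroup
	var counter int64
	for i := 0; i < n; i++ {
		wg.Add(1) // Add before starting the goroutine, so Done cannot run first
		go worker(&wg, &counter)
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(runWorkers(100)) // prints 100
}
```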

two way communication through channels in golang

I have several functions that I want them to be executed atomically since they deal with sensitive data structures. Suppose the following scenario:
There are two functions: lock(sth) and unlock(sth), which can be called at any time by a goroutine to lock or unlock sth in a global array. I was thinking about having a command channel so that goroutines send lock and unlock commands into the channel, and on the receiving side of the channel some kind of handler handles the lock and unlock requests sequentially, by grabbing commands from the channel. That's fine, but what if the handler wants to send the result back to the requester? Is it possible to do so using golang channels? I know that it is possible to use some kind of lock mechanism like a mutex, but I was wondering if it's possible to use channels for such a use case. I saw somewhere that it is recommended to use channels instead of golang's low-level lock structs.
In a single sentence:
In a channel with the capacity of 1, I want the receiver side to be able to reply back to the goroutine which sent the message.
or equivalently:
A goroutine sends something to a channel; the message is received by another goroutine and handled leading to some result; how does the sender become aware of the result?
The sync package includes a Mutex lock, sync.Mutex, which can be locked and unlocked from any goroutine in a threadsafe way. Instead of using a channel to send a command to lock something, how about just using a mutex lock from the sender?
mutex := new(sync.Mutex)
sensitiveData := make([]string, 0)
// when someone wants to operate on a sensitiveData,
// ...
mutex.Lock()
operate(sensitiveData)
mutex.Unlock()
When you say how does the sender become aware of the result, I think you're talking about how does the handler receive the result -- that would be with a chan. You can send data through channels.
Alternatively, if you just want to be aware, a semaphore, sync.WaitGroup might do the job. This struct can be Add()ed to, and then the sender can wg.Wait() until the handler calls wg.Done(), which will indicate to the sender (which is waiting) that the handler is done doing such and such.
If your question is about whether to use locks or channels, the wiki has a terse answer:
A common Go newbie mistake is to over-use channels and goroutines just because it's possible, and/or because it's fun. Don't be afraid to use a sync.Mutex if that fits your problem best. Go is pragmatic in letting you use the tools that solve your problem best and not forcing you into one style of code.
As a general guide, though:
Channel: passing ownership of data, distributing units of work, communicating async results
Mutex: caches, state
If you absolutely want to avoid anything but chans :), try not altering the sensitive array to begin with. Rather, use channels to send data to different goroutines, at each step processing the data, and then funneling the processed data into a final type goroutine. That is, avoid using an array at all and store the data in chans.
As the motto goes,
Do not communicate by sharing memory; instead, share memory by communicating.
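To make the "funnel data through stages" idea concrete, here is a small pipeline sketch. The stage names (generate, upper, collect) and the upper-casing step are invented for illustration; each value is owned by exactly one goroutine at a time, so no lock is needed.

```go
package main

import (
	"fmt"
	"strings"
)

// generate sends each word on a channel and closes it when done.
func generate(words []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, w := range words {
			out <- w
		}
	}()
	return out
}

// upper is a processing stage: it transforms each value and passes it on.
func upper(in <-chan string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for w := range in {
			out <- strings.ToUpper(w)
		}
	}()
	return out
}

// collect is the final stage that gathers the processed values.
func collect(in <-chan string) []string {
	var all []string
	for w := range in {
		all = append(all, w)
	}
	return all
}

func main() {
	fmt.Println(collect(upper(generate([]string{"foo", "bar"})))) // prints [FOO BAR]
}
```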
If you want to prevent race conditions then sync primitives should work just fine, as described in @Nevermore's answer. It leaves the code much more readable and easier to reason about.
However, if you want channels to perform syncing for you, you can always try something like below:
// A global, shared channel used as a lock. Capacity of 1 allows for only
// one thread to access the protected resource at a time.
var lock = make(chan struct{}, 1)
// Operate performs the access/modification on the protected resource.
func Operate(f func() error) error {
    lock <- struct{}{}
    defer func() { <-lock }()
    return f()
}
To use this Operate, pass in a closure that accesses the protected resource.
// Some value that requires concurrent access.
var arr = []int{1, 2, 3, 4, 5}
// Used to sync up goroutines.
var wg sync.WaitGroup
wg.Add(len(arr))
for i := 0; i < len(arr); i++ {
    go func(j int) {
        defer wg.Done()
        // Access to arr remains protected.
        Operate(func() error {
            arr[j] *= 2
            return nil
        })
    }(i)
}
wg.Wait()
Working example: https://play.golang.org/p/Drh-yJDVNh
Or you can entirely bypass Operate and use lock directly for more readability:
go func(j int) {
    defer wg.Done()
    lock <- struct{}{}
    defer func() { <-lock }()
    arr[j] *= 2
}(i)
Working example: https://play.golang.org/p/me3K6aIoR7
As you can see, arr access is protected using a channel here.
The other answers have covered locking well, but I wanted to address the other part of the question, about using channels to send a response back to a caller. There is a not-uncommon pattern in Go of sending a response channel with the request. For example, you might send commands to a handler over a channel; these commands would be a struct with implementation-specific details, and the struct would include a channel for sending the result back, typed to the result type. Each command sent would include a new channel, which the handler would use to send back the response, and then close. To illustrate:
type Command struct {
    // command parameters etc
    Param1  string
    Param2  int
    Results chan Result
}
type Result struct {
    // Whatever a result is in this case
}
var workQueue = make(chan Command)
// Example for executing synchronously
func Example(param1 string, param2 int) Result {
    results := make(chan Result)
    workQueue <- Command{
        Param1:  param1,
        Param2:  param2,
        Results: results,
    }
    // Block until the handler sends the result back on the per-request channel.
    return <-results
}
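For completeness, here is a self-contained sketch of both sides of this pattern applied to the lock/unlock scenario from the question. The field names Op and Index, the OK flag, and the handler and request functions are assumptions made for this example, not part of the answer above.

```go
package main

import "fmt"

// Command is one request; Results is the per-request reply channel.
type Command struct {
	Op      string // "lock" or "unlock" (hypothetical command names)
	Index   int    // which slot to operate on; assumed to be in range
	Results chan Result
}

// Result reports whether the command succeeded.
type Result struct {
	OK bool
}

// handler owns the locked slice and processes commands one at a time, so no
// other goroutine ever touches the state directly.
func handler(queue <-chan Command, size int) {
	locked := make([]bool, size)
	for cmd := range queue {
		var res Result
		switch cmd.Op {
		case "lock":
			if !locked[cmd.Index] {
				locked[cmd.Index] = true
				res.OK = true
			}
		case "unlock":
			if locked[cmd.Index] {
				locked[cmd.Index] = false
				res.OK = true
			}
		}
		cmd.Results <- res
		close(cmd.Results)
	}
}

// request sends a command and waits for its reply.
func request(queue chan<- Command, op string, index int) Result {
	results := make(chan Result, 1) // buffered so the handler never blocks on the reply
	queue <- Command{Op: op, Index: index, Results: results}
	return <-results
}

func main() {
	queue := make(chan Command)
	go handler(queue, 4)
	fmt.Println(request(queue, "lock", 2).OK)   // prints true
	fmt.Println(request(queue, "lock", 2).OK)   // prints false: already locked
	fmt.Println(request(queue, "unlock", 2).OK) // prints true
}
```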

Do multiple goroutine will invoke a method on a Conn simultaneously?

My program like this:
func handle(conn net.Conn) {
    msg := "hello, world!"
    for i := 0; i < 100000; i++ {
        go func() {
            if err := write(conn, msg); err != nil {
                log.Println(err)
            }
        }()
    }
}
func write(conn net.Conn, msg string) error {
mlen := fmt.Sprintf("%04d", len(msg))
_, err := conn.Write([]byte(mlen + msg))
return err
}
The program runs 100000 goroutines at the same time, and all of them send messages to the same connection.
I expected the server to receive garbled messages like "hellohelloworldworld", but there is no such problem when the program runs on my Ubuntu 14.04 LTS machine.
So, can multiple goroutines invoke a method on a Conn simultaneously?
=========================================================================
How can I keep the Write method atomic?
The documentation states:
Multiple goroutines may invoke methods on a Conn simultaneously.
There is no mention of whether each individual write is atomic. While the current implementation may ensure that each call to Write happens completely before the next call can begin, there is no guarantee in the language specification.
This answer implies writes are atomic.
Specifically, implementors of the io.Writer interface are required to return an error if a partial write occurs. net.Conn handles this on Unix by acquiring a lock and calling write in a loop until the whole buffer is written. On Windows it calls WSASend, which guarantees to send the whole buffer unless an error occurs. But the docs do have this warning:
The order of calls made to WSASend is also the order in which the
buffers are transmitted to the transport layer. WSASend should not be
called on the same stream-oriented socket concurrently from different
threads, because some Winsock providers may split a large send request
into multiple transmissions, and this may lead to unintended data
interleaving from multiple concurrent send requests on the same
stream-oriented socket.
Which means it wouldn't necessarily be atomic, unless Go acquires a mutex - which it does.
So basically it is atomic in practice. It is conceivable that an implementation could define thread-safety as just not crashing and allow interleaved writes by unlocking the mutex around calls to write (or not acquiring it at all on windows.) That doesn't make sense to me though, and the developers have clearly shown the opposite intent.
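If you would rather not depend on that implementation detail, one option is to serialize the writes yourself. The sketch below is an assumption-laden illustration: syncWriter and sendAll are invented names, and net.Pipe stands in for a real TCP connection so the example can run without a server. Each message is framed into one buffer and written under a mutex, so frames cannot interleave regardless of what the underlying Conn guarantees.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// syncWriter serializes whole messages onto a shared writer.
type syncWriter struct {
	mu sync.Mutex
	w  io.Writer
}

// writeMsg builds the length-prefixed frame first and writes it in a single
// Write call while holding the lock.
func (s *syncWriter) writeMsg(msg string) error {
	frame := []byte(fmt.Sprintf("%04d%s", len(msg), msg))
	s.mu.Lock()
	defer s.mu.Unlock()
	_, err := s.w.Write(frame)
	return err
}

// sendAll writes msg from n goroutines and counts the intact frames that
// arrive on the other end of an in-memory connection.
func sendAll(n int, msg string) (int, error) {
	client, server := net.Pipe()
	defer server.Close()
	sw := &syncWriter{w: client}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = sw.writeMsg(msg)
		}()
	}
	// Close the sending side once every writer is done, so the reader sees EOF.
	go func() {
		wg.Wait()
		client.Close()
	}()

	want := fmt.Sprintf("%04d%s", len(msg), msg)
	buf := make([]byte, len(want))
	intact := 0
	for {
		if _, err := io.ReadFull(server, buf); err != nil {
			if err == io.EOF {
				return intact, nil
			}
			return intact, err
		}
		if string(buf) == want {
			intact++
		}
	}
}

func main() {
	n, err := sendAll(50, "hello, world!")
	fmt.Println(n, err)
}
```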

Will Go block the current thread when doing I/O inside a goroutine?

I am confused over how Go handles non-blocking I/O. Go's APIs look mostly synchronous to me, and when watching presentations on Go, it's not uncommon to hear comments like "and the call blocks".
Is Go using blocking I/O when reading from files or the network? Or is there some kind of magic that re-writes the code when used from inside a goroutine?
Coming from a C# background, this feels very unintuitive, as in C# we have the await keyword when consuming async APIs, which clearly communicates that the API can yield the current thread and continue later inside a continuation.
TL;DR: will Go block the current thread when doing I/O inside a goroutine, or will the code be transformed into a C#-like async/await state machine using continuations?
Go has a scheduler that lets you write synchronous code, and does context switching on its own and uses async I/O under the hood. So if you're running several goroutines, they might run on a single system thread, and when your code is blocking from the goroutine's view, it's not really blocking. It's not magic, but yes, it masks all this stuff from you.
The scheduler will allocate system threads when they're needed, and during operations that are really blocking (file I/O is blocking, for example, or calling C code). But if you're doing some simple http server, you can have thousands and thousands of goroutines using actually a handful of "real threads".
You can read more about the inner workings of Go here.
You should read @Not_a_Golfer's answer first, and the link he provided, to understand how goroutines are scheduled. My answer is more of a deeper dive into network IO specifically. I assume you understand how Go achieves cooperative multitasking.
Go can and does use only blocking calls because everything runs in goroutines, and they're not real OS threads. They're green threads. So you can have many of them all blocking on IO calls and they will not eat all of your memory and CPU like OS threads would.
File IO is just syscalls. Not_a_Golfer already covered that. Go will use a real OS thread to wait on a syscall and will unblock the goroutine when it returns. Here you can see the file read implementation for Unix.
Network IO is different. The runtime uses a "network poller" to determine which goroutine should unblock from an IO call. Depending on the target OS, it will use the available asynchronous APIs to wait for network IO events. Calls look like blocking calls, but inside everything is done asynchronously.
For example, when you call read on a TCP socket, the goroutine will first try to read using a syscall. If nothing has arrived yet, it will block and wait to be resumed. By blocking here I mean parking, which puts the goroutine in a queue where it awaits resuming. That's how a "blocked" goroutine yields execution to other goroutines when you use network IO.
func (fd *netFD) Read(p []byte) (n int, err error) {
    if err := fd.readLock(); err != nil {
        return 0, err
    }
    defer fd.readUnlock()
    if err := fd.pd.PrepareRead(); err != nil {
        return 0, err
    }
    for {
        n, err = syscall.Read(fd.sysfd, p)
        if err != nil {
            n = 0
            if err == syscall.EAGAIN {
                if err = fd.pd.WaitRead(); err == nil {
                    continue
                }
            }
        }
        err = fd.eofError(n, err)
        break
    }
    if _, ok := err.(syscall.Errno); ok {
        err = os.NewSyscallError("read", err)
    }
    return
}
https://golang.org/src/net/fd_unix.go?s=#L237
When data arrives, the network poller returns the goroutines that should be resumed. You can see here the findrunnable function, which searches for goroutines that can be run. It calls the netpoll function, which returns the goroutines that can be resumed. You can find the kqueue implementation of netpoll here.
As for async/await in C#: async network IO will also use asynchronous APIs (IO completion ports on Windows). When something arrives, the OS will execute a callback on one of the thread pool's completion port threads, which will put the continuation on the current SynchronizationContext. In a sense, there are some similarities (parking/unparking does look like calling continuations, but on a much lower level), but these models are very different, not to mention the implementations. Goroutines by default are not bound to a specific OS thread; they can be resumed on any one of them, it doesn't matter. There are no UI threads to deal with. Async/await is specifically made for the purpose of resuming the work on the same OS thread using SynchronizationContext. And because there are no green threads or a separate scheduler, async/await has to split your function into multiple callbacks that get executed on the SynchronizationContext, which is basically an infinite loop that checks a queue of callbacks that should be executed. You can even implement it yourself, it's really easy.

Resources