how to safely close a chan chan T in Go? - multithreading

I'm implementing a simple worker-pool algorithm where one sender (the dispatcher) sends jobs to M worker goroutines. It uses a channel of channels to hand each available job to the first idle worker:
// builds the pool
func NewWorkerPool(maxWorkers int) WorkerPool {
pool := make(chan chan Job, maxWorkers)
workers := make([]Worker, 0)
return WorkerPool{
WorkerPool: pool,
Workers: workers,
maxWorkers: maxWorkers,
waitGroup: sync.WaitGroup{}}
}
// Starts the WorkerPool
func (p *WorkerPool) Run(queue chan Job) {
// take a pointer: copying a sync.WaitGroup would leave p.waitGroup itself unused
w := &p.waitGroup
// starting n number of workers
for i := 0; i < p.maxWorkers; i++ {
worker := NewWorker(p.WorkerPool)
p.Workers = append(p.Workers, worker)
w.Add(1)
worker.Start(w)
}
go p.dispatch(queue)
}
// dispatches a job to be handled by an idle Worker of the pool
func (p *WorkerPool) dispatch(jobQueue chan Job) {
for {
select {
case job := <-jobQueue:
// a model request has been received
go func(job Job) {
// try to obtain a worker model channel that is available.
// this will block until a worker is idle
jobChannel := <-p.WorkerPool
// dispatch the model to the worker model channel
jobChannel <- job
}(job)
}
}
}
// checks if a Worker Pool is open or closed - If we can receive on the channel then it is NOT closed
func (p *WorkerPool) IsOpen() bool {
_, ok := <-p.WorkerPool
return ok
}
The worker Start and Stop methods
// Start method starts the run loop for the worker, listening for a quit channel in
// case we need to stop it
func (w Worker) Start(wg *sync.WaitGroup) {
go func() {
defer wg.Done()
for {
// register the current worker into the worker queue.
w.WorkerPool <- w.JobChannel
select {
case job := <-w.JobChannel:
// we have received a work request.
result := job.Run()
job.ReturnChannel <- result
// once result is returned close the job output channel
close(job.ReturnChannel)
case <-w.quit:
// we have received a signal to stop
return
}
}
}()
}
// Stop signals the worker to stop listening for work requests.
func (w Worker) Stop() {
go func() {
w.quit <- true
}()
}
Now I'm trying to close the pool with the following method. I use a sync.WaitGroup to wait for all the workers to shut down:
// stops the Pool
func (p *WorkerPool) Stop() bool {
// stops all workers
for _, worker := range p.Workers {
worker.Stop()
}
p.waitGroup.Wait() //Wait for the goroutines to shutdown
close(p.WorkerPool)
more := p.IsOpen()
fmt.Printf(" more? %t", more)
return more
}
// prints more? TRUE
Even though I wait for the workers to quit and then invoke close(p.WorkerPool), the channel still appears to be open. What is missing here, and how do I close the channels correctly?

Closing a channel indicates that no more values will be sent on it. This can be useful to communicate completion to the channel's receivers.
Closing does not discard values that are already buffered in the channel: receivers still get them, with ok == true, until the buffer is empty. So after closing the pool you also have to drain the worker channels that are still queued inside it, like this:
// Stop stops the Pool and free all the channels
func (p *WorkerPool) Stop() bool {
// stops all workers
for _, worker := range p.Workers {
worker.Stop()
}
p.waitGroup.Wait() //Wait for the goroutines to shutdown
close(p.WorkerPool)
for range p.WorkerPool {
fmt.Println("Freeing channel") // drain the worker channels still buffered in the pool
}
more := p.IsOpen()
fmt.Printf(" more? %t", more)
return more
}
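BTW, a receive of the form _, ok := <-ch is not a reliable way to check whether a channel is closed: it consumes a value if one is buffered, and it blocks if the channel is open and empty. I would suggest a different name for the function.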
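The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the asker's pool: `drain` is a hypothetical helper, and `chan chan int` stands in for `chan chan Job`. It shows that a closed buffered channel keeps delivering its buffered values with ok == true, and only reports ok == false once it is empty.

```go
package main

import "fmt"

// drain receives from a closed channel until it is empty and reports how
// many buffered values were still waiting in it. (Hypothetical helper.)
func drain(ch chan chan int) int {
	n := 0
	for range ch {
		n++
	}
	return n
}

func main() {
	// Two "worker channels" are registered in the pool, then the pool is closed.
	pool := make(chan chan int, 2)
	pool <- make(chan int)
	pool <- make(chan int)
	close(pool)

	// The channel is closed, but a buffered value is still delivered.
	_, ok := <-pool
	fmt.Println("first receive ok:", ok) // first receive ok: true

	// One value is left in the buffer.
	fmt.Println("drained:", drain(pool)) // drained: 1

	// Closed and empty: only now does the receive report ok == false.
	_, ok = <-pool
	fmt.Println("after drain ok:", ok) // after drain ok: false
}
```

This is why the original IsOpen printed true right after close: it received one of the worker channels that were still buffered in the pool.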

Related

Goroutines - send critical data to the single goroutine and wait for result

I have many goroutines running in my application, and I have another goroutine that must handle only one request at a time and then send the result back to the caller.
This means the other goroutines should wait while that single processing goroutine is busy.
[goroutine 1] <--+
                 |
[goroutine 2] <--+--> [process some data in a single goroutine and send the result back to the caller]
                 |
[goroutine 3] <--+
This diagram shows how it should look.
I'm very new to Go and don't know how this should be implemented correctly.
Could someone provide a working example that I can run on the Go Playground?
Here is a code snippet with a few worker goroutines and one processor goroutine. Only one worker goroutine at a time can hand something to the processor, because processorChannel is unbuffered and holds no more than one entry in flight. When the processor is done, it sends the response back to the worker it got the work from.
package main
import (
"fmt"
"time"
)
type WorkPackage struct {
value int
responseChannel chan int
}
func main() {
processorChannel := make(chan *WorkPackage)
for i := 0; i < 3; i++ {
go runWorker(processorChannel)
}
go runProcessor(processorChannel)
// Do some clever waiting here like with wait groups
time.Sleep(5 * time.Second)
}
func runWorker(processorChannel chan *WorkPackage) {
responseChannel := make(chan int)
for i := 0; i < 10; i++ {
processorChannel <- &WorkPackage{
value: i,
responseChannel: responseChannel,
}
fmt.Printf("** Sent %d\n", i)
response := <-responseChannel
fmt.Printf("** Received the response %d\n", response)
// Do some work
time.Sleep(300 * time.Millisecond)
}
}
func runProcessor(processorChannel chan *WorkPackage) {
for workPackage := range processorChannel {
fmt.Printf("## Received %d\n", workPackage.value)
// Do some processing work
time.Sleep(100 * time.Millisecond)
workPackage.responseChannel <- workPackage.value * 100
}
}
I'll describe the approach with a goroutine that adds two numbers.
Declare request and response types for the goroutine. Include a channel of response values in the request:
type request struct {
a, b int // add these two numbers
ch chan response
}
type response struct {
n int // the result of adding the numbers
}
Kick off a goroutine that receives requests, executes the action and sends the response to the channel in the request:
func startAdder() chan request {
ch := make(chan request)
go func() {
for req := range ch {
req.ch <- response{req.a + req.b}
}
}()
return ch
}
To add the numbers, send a request to the goroutine with a response channel. Receive on the response channel. Return the response value.
func add(ch chan request, a, b int) int {
req := request{ch: make(chan response), a: a, b: b}
ch <- req
return (<-req.ch).n
}
Use it like this:
ch := startAdder()
fmt.Println(add(ch, 1, 2))
Run it on the Go Playground.

How do I make this program thread-safe, would channels be the best implementation, if so, how?

I'm using Go and trying to make this program thread-safe. It takes a number as a parameter (the number of consumer tasks to start), reads lines from an input, and accumulates a word count. I want it to be thread-safe, but efficient rather than just locking everything. Should I use channels, and if so, how?
package main
import (
"bufio"
"fmt"
"log"
"os"
"sync"
)
// Consumer task to operate on queue
func consumer_task(task_num int) {
fmt.Printf("I'm consumer task #%v ", task_num)
fmt.Println("Line being popped off queue: " + queue[0])
queue = queue[1:]
}
// Initialize queue
var queue = make([]string, 0)
func main() {
// Initialize wait group
var wg sync.WaitGroup
// Get number of tasks to run from user
var numof_tasks int
fmt.Print("Enter number of tasks to run: ")
fmt.Scan(&numof_tasks)
// Open file
file, err := os.Open("test.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
// Scanner to scan the file
scanner := bufio.NewScanner(file)
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
// Loop through each line in the file and append it to the queue
for scanner.Scan() {
line := scanner.Text()
queue = append(queue, line)
}
// Start specified # of consumer tasks
for i := 1; i <= numof_tasks; i++ {
wg.Add(1)
go func(i int) {
consumer_task(i)
wg.Done()
}(i)
}
wg.Wait()
fmt.Println("All done")
fmt.Println(queue)
}
You have a data race on the slice queue. Concurrent goroutines that pop elements off the head of the queue must do so in a controlled manner, either by guarding the slice with a sync.Mutex or by using a channel to manage the "queue" of work items.
To convert what you have to use channels, update the worker to take an input channel as your queue and range over the channel, so each worker can handle more than one task:
func consumer_task(task_num int, ch <-chan string) {
fmt.Printf("I'm consumer task #%v\n", task_num)
for item := range ch {
fmt.Printf("task %d consuming: Line item: %v\n", task_num, item)
}
// each worker will drop out of their loop when channel is closed
}
Change queue from a slice to a channel and feed items in like so:
queue := make(chan string)
go func() {
// Loop through each line in the file and send it to the queue
for scanner.Scan() {
queue <- scanner.Text()
}
close(queue) // signal to workers that there are no more items
}()
Then just update your work dispatcher code to pass the channel in:
go func(i int) {
consumer_task(i, queue) // add the queue parameter
wg.Done()
}(i)
https://go.dev/play/p/AzHyztipUZI

Multiple concurrent dynamic locks and timeouts if failure to acquire locks

I have a use case where I need to lock on arguments of a function.
The function itself can be accessed concurrently
Function signature is something like
func (m objectType) operate(key string) (bool) {
// get lock on "key" (return false if unable to get lock in X ms - eg: 100 ms)
// operate
// release lock on "key"
return true;
}
The data space which can be locked is in the range of millions (~10 million)
Concurrent access to operate() is in the range of thousands (1 - 5k)
Expected contention is low though possible in case of hotspots in key (hence the lock)
What is the right way to implement this? These are the options I explored, both based on a concurrent hash map:
sync.Map - this is suited to cases with append-only entries and a high ratio of reads to writes, so it is not applicable here.
Sharded hashmap where each shard is locked by an RWMutex - https://github.com/orcaman/concurrent-map - While this would work, concurrency is limited by the number of shards rather than by actual contention between keys. It also doesn't support the timeout scenario when a lot of contention happens on a subset of keys.
Though the timeout is a P1 requirement, the P0 requirement is to increase throughput through more granular locking if possible.
Is there a good way to achieve this?
I would do it by using a map of buffered channels:
to acquire a "mutex", try to fill a buffered channel with a value
work
when done, empty the buffered channel so that another goroutine can use it
Example:
package main
import (
"fmt"
"sync"
"time"
)
type MutexMap struct {
mut sync.RWMutex // handle concurrent access of chanMap
chanMap map[int](chan bool) // dynamic mutexes map
}
func NewMutextMap() *MutexMap {
var mut sync.RWMutex
return &MutexMap{
mut: mut,
chanMap: make(map[int](chan bool)),
}
}
// Acquire a lock, with timeout
func (mm *MutexMap) Lock(id int, timeout time.Duration) error {
// get global lock to read from map and get a channel
mm.mut.Lock()
if _, ok := mm.chanMap[id]; !ok {
mm.chanMap[id] = make(chan bool, 1)
}
ch := mm.chanMap[id]
mm.mut.Unlock()
// try to write to buffered channel, with timeout
select {
case ch <- true:
return nil
case <-time.After(timeout):
return fmt.Errorf("working on %v just timed out", id)
}
}
// release lock
func (mm *MutexMap) Release(id int) {
mm.mut.Lock()
ch := mm.chanMap[id]
mm.mut.Unlock()
<-ch
}
func work(id int, mm *MutexMap) {
// acquire lock with timeout
if err := mm.Lock(id, 100*time.Millisecond); err != nil {
fmt.Printf("ERROR: %s\n", err)
return
}
fmt.Printf("working on task %v\n", id)
// do some work...
time.Sleep(time.Second)
fmt.Printf("done working on %v\n", id)
// release lock
mm.Release(id)
}
func main() {
mm := NewMutextMap()
var wg sync.WaitGroup
for i := 0; i < 50; i++ {
wg.Add(1)
id := i % 10
go func(id int, mm *MutexMap, wg *sync.WaitGroup) {
defer wg.Done() // defer first so Done runs even if work panics
work(id, mm)
}(id, mm, &wg)
}
wg.Wait()
}
EDIT: different version, where we also handle the concurrent access to the chanMap itself

Simple queue model example

Is there a simple program that demonstrates how queues work in Go?
I just need something that adds the numbers 1 to 10 to a queue and pulls them from the queue in parallel using another thread.
A queue that is safe for concurrent use is basically a language construct: channel.
A channel is, by design, safe for concurrent send and receive. This is detailed here: If I am using channels properly should I need to use mutexes? Values sent on a channel are received in the order they were sent.
You can read more about channels here: What are golang channels used for?
A very simple example:
c := make(chan int, 10) // buffer for 10 elements
// Producer: send elements in a new goroutine
go func() {
for i := 0; i < 10; i++ {
c <- i
}
close(c)
}()
// Consumer: receive all elements sent on it before it was closed:
for v := range c {
fmt.Println("Received:", v)
}
Output (try it on the Go Playground):
Received: 0
Received: 1
Received: 2
Received: 3
Received: 4
Received: 5
Received: 6
Received: 7
Received: 8
Received: 9
Note that the channel buffer (10 in this example) has nothing to do with the number of elements you want to send "through" it. The buffer size tells how many elements the channel may "store", or in other words, how many elements you may send on it without blocking when nobody is receiving from it. When the channel's buffer is full, further sends will block until someone receives a value from it.
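The effect of the buffer size can be observed with a non-blocking send. This is a small sketch; `trySend` is a hypothetical helper that uses select with a default case so that a send which would block is reported instead of waited for.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(c chan int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		return false // buffer is full and no receiver is ready
	}
}

func main() {
	c := make(chan int, 2) // buffer for 2 elements

	ok := trySend(c, 1)
	fmt.Println(ok, len(c), cap(c)) // true 1 2

	ok = trySend(c, 2)
	fmt.Println(ok, len(c), cap(c)) // true 2 2

	// The buffer is full: a plain send would block here.
	ok = trySend(c, 3)
	fmt.Println(ok, len(c), cap(c)) // false 2 2

	<-c // receiving one value frees a slot

	ok = trySend(c, 3)
	fmt.Println(ok, len(c), cap(c)) // true 2 2
}
```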
You could use a channel (which is safe for concurrent use) together with a wait group to read from the queue concurrently:
package main
import (
"fmt"
"sync"
)
func main() {
queue := make(chan int)
wg := new(sync.WaitGroup)
wg.Add(1)
defer wg.Wait()
go func(wg *sync.WaitGroup) {
for {
r, ok := <-queue
if !ok {
wg.Done()
return
}
fmt.Println(r)
}
}(wg)
for i := 1; i <= 10; i++ {
queue <- i
}
close(queue)
}
Playground link: https://play.golang.org/p/A_Amqcf2gwU
Another option is to create and implement a queue interface, with a backing type of a channel for concurrency. For convenience, I've made a gist.
Here's how you can use it.
queue := GetIntConcurrentQueue()
defer queue.Close()
// queue.Enqueue(1)
// myInt, errQueueClosed := queue.DequeueBlocking()
// myInt, errIfNoInt := queue.DequeueNonBlocking()
Longer example here - https://play.golang.org/p/npb2Uj9hGn1
Full implementation below, and again here's the gist of it.
// Can be any backing type, even 'interface{}' if desired.
// See stackoverflow.com/q/11403050/3960399 for type conversion instructions.
type IntConcurrentQueue interface {
// Inserts the int into the queue
Enqueue(int)
// Will return error if there is nothing in the queue or if Close() was already called
DequeueNonBlocking() (int, error)
// Will block until there is a value in the queue to return.
// Will error if Close() was already called.
DequeueBlocking() (int, error)
// Close should be called with defer after initializing
Close()
}
func GetIntConcurrentQueue() IntConcurrentQueue {
return &intChannelQueue{c: make(chan int)}
}
type intChannelQueue struct {
c chan int
}
func (q *intChannelQueue) Enqueue(i int) {
q.c <- i
}
func (q *intChannelQueue) DequeueNonBlocking() (int, error) {
select {
case i, ok := <-q.c:
if ok {
return i, nil
} else {
return 0, fmt.Errorf("queue was closed")
}
default:
return 0, fmt.Errorf("queue has no value")
}
}
func (q *intChannelQueue) DequeueBlocking() (int, error) {
i, ok := <-q.c
if ok {
return i, nil
}
return 0, fmt.Errorf("queue was closed")
}
func (q *intChannelQueue) Close() {
close(q.c)
}

how to use multiple processes with http

How can I make use of all CPUs and spawn an HTTP process for each CPU?
Get num of CPUs
numCPU := runtime.NumCPU()
Start http
package main
import (
"fmt"
"net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hi there, I love %s!", r.URL.Path[1:])
}
func main() {
http.HandleFunc("/", handler)
http.ListenAndServe(":8080", nil)
}
If your goal is just to have your request-processing code run on all CPU cores, net/http already starts a goroutine (a vaguely thread-like thing with a Go-specific implementation) per connection, and Go arranges for NumCPU OS threads to run by default so that goroutines can be spread across all available CPU cores.
The Accept loop runs in a single goroutine, but the actual work of parsing requests and generating responses runs in one per connection.
You can't do this natively; you have to write your own wrapper:
package main
import (
"log"
"net"
"net/http"
"runtime"
"sync"
"time"
)
// copied from http://golang.org/src/pkg/net/http/server.go#L1942
type tcpKeepAliveListener struct {
*net.TCPListener
}
func (ln tcpKeepAliveListener) Accept() (c net.Conn, err error) {
tc, err := ln.AcceptTCP()
if err != nil {
return
}
tc.SetKeepAlive(true)
tc.SetKeepAlivePeriod(3 * time.Minute)
return tc, nil
}
func ListenAndServe(addr string, num int) error {
if addr == "" {
addr = ":http"
}
ln, err := net.Listen("tcp", addr)
if err != nil {
return err
}
var wg sync.WaitGroup
for i := 0; i < num; i++ {
wg.Add(1)
go func(i int) {
log.Println("listener number", i)
log.Println(http.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)}, nil))
wg.Done()
}(i)
}
wg.Wait()
return nil
}
func main() {
num := runtime.NumCPU()
runtime.GOMAXPROCS(num) //so the goroutine listeners would try to run on multiple threads
log.Println(ListenAndServe(":9020", num))
}
Or if you use a recent enough Linux Kernel you can use the patch from http://comments.gmane.org/gmane.comp.lang.go.general/121122 and actually spawn multiple processes.

Resources