golang: how to bind code with thread? - multithreading

I have nearly finished implementing a face recognition server in Go. My face recognition algorithm uses Caffe, a library that binds to the thread that initializes it, which means I have to initialize and call the algorithm from the same thread, so I looked into LockOSThread().
LockOSThread pins a single thread, but my server has 4 GPUs.
In C/C++ I would create 4 threads, initialize the algorithm in each thread, and use sem_wait and sem_post to assign tasks, one thread per GPU.
How do I do the same thing in Go, i.e. how do I bind code to a thread?

You spawn some number of goroutines, call runtime.LockOSThread() in each,
and then initialize your library in each.
You then use regular Go communication primitives to send tasks to those
goroutines. The simplest approach is usually to have each goroutine read "tasks" from a channel and send the results back, as in:
type Task struct {
    Data   DataTypeToContainRecognitionTask
    Result chan<- DataTypeToContainRecognitionResult
}

func GoroutineLoop(tasks <-chan Task) {
    runtime.LockOSThread() // pin this goroutine to one OS thread
    // initialize the library here, on this thread
    for task := range tasks {
        task.Result <- recognize(task.Data)
    }
}

tasks := make(chan Task)
for n := 4; n > 0; n-- {
    go GoroutineLoop(tasks)
}

for {
    res := make(chan DataTypeToContainRecognitionResult)
    tasks <- Task{
        Data:   makeRecognitionData(),
        Result: res,
    }
    result := <-res
    // Do something with the result
}
As to figuring out how many goroutines to start, there are different
strategies.
The simplest approach is probably to query runtime.NumCPU()
and use that number, or, in this case, the number of GPUs (four).
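For completeness, here is a minimal, runnable sketch of the pattern described above. The initCaffeOnThisThread and recognize calls are only hinted at in comments (they are placeholders, not real APIs), and plain strings stand in for the real recognition data and result types; only the goroutine / LockOSThread / channel wiring is the point.

package main

import (
    "fmt"
    "runtime"
)

// Task couples an input with a channel for its result.
type Task struct {
    Data   string        // placeholder for the real recognition input
    Result chan<- string // placeholder for the real recognition result
}

// worker pins itself to an OS thread, initializes the library for one GPU,
// and then serves tasks from the channel forever.
func worker(gpuID int, tasks <-chan Task) {
    runtime.LockOSThread() // everything below runs on this one OS thread
    // initCaffeOnThisThread(gpuID) would go here in the real program.
    for task := range tasks {
        // recognize(task.Data) would go here; we fake it.
        task.Result <- fmt.Sprintf("gpu %d processed %q", gpuID, task.Data)
    }
}

func main() {
    tasks := make(chan Task)
    for gpu := 0; gpu < 4; gpu++ { // one worker per GPU
        go worker(gpu, tasks)
    }

    // Submit a few tasks and wait for each result.
    for i := 0; i < 8; i++ {
        res := make(chan string, 1)
        tasks <- Task{Data: fmt.Sprintf("frame-%d", i), Result: res}
        fmt.Println(<-res)
    }
}

Each worker owns one GPU for its lifetime, and the unbuffered tasks channel naturally load-balances work across whichever worker happens to be free.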

Related

How to wait for *any* of a group of goroutines to signal something without requiring that we wait for them to do so

There are plenty of examples of how to use WaitGroup to wait for all of a group of goroutines to finish, but what if you want to wait for any one of them to finish without using a semaphore system where some process must be waiting? For example, a producer/consumer scenario where multiple producer threads add multiple entries to a data structure while a consumer is removing them one at a time. In this scenario:
We can't just use the standard producer/consumer semaphore system, because production:consumption is not 1:1, and also because the data structure acts as a cache, so the producers can be "free-running" instead of blocking until a consumer is "ready" to consume their product.
The data structure may be emptied by the consumer, in which case, the consumer wants to wait until any one of the producers finishes (meaning that there might be new things in the data structure)
Question: Is there a standard way to do that?
I've only been able to devise two methods of doing this, both using channels as semaphores:
var unitary_channel chan int = make(chan int, 1)

func my_goroutine() {
    // Produce, produce, produce!!!
    unitary_channel <- 0 // Try to push a value to the channel
    <-unitary_channel    // Remove it, in case nobody was waiting
}

func main() {
    go my_goroutine()
    go my_goroutine()
    go my_goroutine()
    for len(stuff_to_consume) > 0 { /* Consume, consume, consume */ }
    // Ran out of stuff to consume
    <-unitary_channel
    unitary_channel <- 0 // To unblock the goroutine which was exiting
    // Consume more
}
Now, this simplistic example has some glaring (but solvable) issues, like the fact that main() can't exit unless at least one my_goroutine() is still running.
The second method, instead of requiring producers to remove the value they just pushed to the channel, uses select to allow the producers to exit when the channel would block them.
var empty_channel chan int = make(chan int)

func my_goroutine() {
    // Produce, produce, produce!!!
    select {
    case empty_channel <- 0: // Push if you can
    default: // Or don't if you can't
    }
}

func main() {
    go my_goroutine()
    go my_goroutine()
    go my_goroutine()
    for len(stuff_to_consume) > 0 { /* Consume, consume, consume */ }
    // Ran out of stuff to consume
    <-empty_channel
    // Consume more
}
Of course, this one will also block main() forever if all of the goroutines have already terminated. So, if the answer to the first question is "No, there's no standard solution to this other than the ones you've come up with", is there a compelling reason why one of these should be used instead of the other?
You could use a channel with a buffer, like this:
// create a channel with a buffer of 1
var Items = make(chan int, 1)

// MyArray is shared between goroutines, so guard it with a mutex (sync.Mutex)
var MyArray []int
var mu sync.Mutex

func main() {
    go addItems()
    go addItems()
    go addItems()
    go sendToChannel()
    for {
        fmt.Println(<-Items)
    }
}

// push a number to the array
func addItems() {
    for x := 0; x < 10; x++ {
        mu.Lock()
        MyArray = append(MyArray, x)
        mu.Unlock()
    }
}

// push to Items and pop the array
func sendToChannel() {
    for {
        for {
            mu.Lock()
            if len(MyArray) == 0 {
                mu.Unlock()
                break
            }
            item := MyArray[0]
            MyArray = MyArray[1:]
            mu.Unlock()
            Items <- item
        }
        time.Sleep(10 * time.Second)
    }
}
The for loop in main will loop forever and print anything that gets added to the channel, while sendToChannel sleeps when the array is empty.
This way a producer is never blocked for long, and the consumer can consume whenever one or more items are available.
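As an aside, the standard-library primitive that most directly expresses "wait until any producer has added something" is sync.Cond. A minimal sketch, where the fixed item counts exist only so the example terminates:

package main

import (
    "fmt"
    "sync"
    "time"
)

var (
    mu    sync.Mutex
    cond  = sync.NewCond(&mu)
    queue []int
)

// producer appends items and signals the consumer that something arrived.
func producer(id int) {
    for i := 0; i < 5; i++ {
        time.Sleep(100 * time.Millisecond) // simulate work
        mu.Lock()
        queue = append(queue, id*100+i)
        mu.Unlock()
        cond.Signal() // wake the consumer if it is waiting
    }
}

func main() {
    for id := 1; id <= 3; id++ {
        go producer(id)
    }

    // Consume 15 items; whenever the queue is empty, wait until any producer signals.
    for consumed := 0; consumed < 15; consumed++ {
        mu.Lock()
        for len(queue) == 0 {
            cond.Wait() // releases mu while waiting, reacquires it on wake-up
        }
        item := queue[0]
        queue = queue[1:]
        mu.Unlock()
        fmt.Println("consumed", item)
    }
}

Whether this is preferable to the channel tricks above is a judgment call: sync.Cond states the intent directly, while channels compose better with select.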

Troubles with gitlab scraping via golang

I'm a newbie in programming and I need help. I'm trying to write a GitLab scraper in Go.
Something goes wrong when I try to get information about projects concurrently.
Here is the code:
func (g *Gitlab) getAPIResponse(url string, structure interface{}) error {
    response, err := http.Get(url)
    if err != nil {
        return err
    }
    ret, _ := ioutil.ReadAll(response.Body)
    if string(ret) != "[]" {
        err := json.Unmarshal(ret, structure)
        return err
    }
    return errors.New(error_emptypage)
}
...
func (g *Gitlab) GetProjects() {
    projects_chan := make(chan Project, g.LatestProjectID)
    var waitGroup sync.WaitGroup
    queue := make(chan struct{}, 50)
    for i := g.LatestProjectID; i > 0; i-- {
        url := g.BaseURL + projects_url + "/" + strconv.Itoa(i) + g.Token
        waitGroup.Add(1)
        go func(url string, channel chan Project) {
            queue <- struct{}{}
            defer waitGroup.Done()
            var oneProject Project
            err := g.getAPIResponse(url, &oneProject)
            if err != nil {
                fmt.Println(err.Error())
            }
            fmt.Printf(".")
            channel <- oneProject
            <-queue
        }(url, projects_chan)
    }
    go func() {
        waitGroup.Wait()
        close(projects_chan)
    }()
    for project := range projects_chan {
        if project.ID != 0 {
            g.Projects = append(g.Projects, project)
        }
    }
}
And here is the output:
$ ./gitlab-auditor
latest project = 1532
Gathering projects...
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Get https://gitlab.example.com/api/v4/projects/563&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/558&private_token=SeCrEt_ToKeN: unexpected EOF
..Get https://gitlab.example.com/api/v4/projects/531&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/571&private_token=SeCrEt_ToKeN: unexpected EOF
.Get https://gitlab.example.com/api/v4/projects/570&private_token=SeCrEt_ToKeN: unexpected EOF
..Get https://gitlab.example.com/api/v4/projects/467&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/573&private_token=SeCrEt_ToKeN: unexpected EOF
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Every time it's different projects, but their IDs are around 550.
When I curl the links from the output, I get normal JSON. When I run this code with queue := make(chan struct{}, 1) (i.e. effectively single-threaded), everything is fine.
What can it be?
I would say this is not a very clear way to achieve concurrency.
What seems to be happening here is:
You create a buffered channel with a capacity of 50.
Then you fire up 1532 goroutines.
The first 50 of them enqueue themselves and start processing; every time one of them does <-queue and frees up some space, a random one of the rest manages to get onto the queue.
As people say in the comments, you most certainly hit some limit around the time the blast has made it to around id 550. GitLab's API gets angry at you and rate-limits.
Then another goroutine is fired that closes the channel to notify the main goroutine.
The main goroutine reads the messages.
The talk Go Concurrency Patterns
as well as the blog post Concurrency in Go might help.
Personally, I rarely use buffered channels. For your problem I would go like this (see the sketch below):
Define a number of workers.
Have the main goroutine fire up the workers, each running a func that listens on a channel of ints, does the API call, and writes to a channel of projects.
Have the main goroutine send the project numbers to be fetched on the channel of ints and read from the channel of projects.
Maybe rate-limit by firing up a ticker and having main read from it before it sends the next request.
Main closes the number channel to notify the workers to die.
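A minimal sketch of that shape, with the real GitLab call replaced by a hypothetical fetchProject stub so that only the wiring is shown (the feeding loop is moved into its own goroutine so main can collect results from unbuffered channels):

package main

import (
    "fmt"
    "time"
)

// Project is a stand-in for the real decoded API object.
type Project struct {
    ID int
}

// fetchProject is a placeholder for the real HTTP call plus JSON decoding.
func fetchProject(id int) Project {
    time.Sleep(50 * time.Millisecond) // pretend to talk to the API
    return Project{ID: id}
}

// worker reads IDs until the channel is closed, fetching each one.
func worker(ids <-chan int, results chan<- Project) {
    for id := range ids {
        results <- fetchProject(id)
    }
}

func main() {
    const numWorkers = 5
    const latestProjectID = 50

    ids := make(chan int)
    results := make(chan Project)

    for i := 0; i < numWorkers; i++ {
        go worker(ids, results)
    }

    // Rate-limit submissions: one ID per tick.
    ticker := time.NewTicker(20 * time.Millisecond)
    defer ticker.Stop()

    go func() {
        for id := latestProjectID; id > 0; id-- {
            <-ticker.C
            ids <- id
        }
        close(ids) // tells the workers to exit once they drain the channel
    }()

    var projects []Project
    for i := 0; i < latestProjectID; i++ {
        projects = append(projects, <-results)
    }
    fmt.Println("fetched", len(projects), "projects")
}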

Make goroutines stay running after returning from function

In Java I can make threads run for long periods of time, and I don't need to stay within the function that started the thread.
Goroutines, Go's answer to threads, seem to stop running after I return from the function that started the routine.
How can I make these routines stay running and still return from the calling function?
Thanks
Goroutines do continue running after the function that invokes them exits: Playground
package main

import (
    "fmt"
    "time"
)

func countToTen() chan bool {
    done := make(chan bool)
    go func() {
        for i := 0; i < 10; i++ {
            time.Sleep(1 * time.Second)
            fmt.Println(i)
        }
        done <- true
    }()
    return done
}

func main() {
    done := countToTen()
    fmt.Println("countToTen() exited")

    // reading from the 'done' channel will block the main thread
    // until there is something to read, which won't happen until
    // countToTen()'s goroutine is finished
    <-done
}
Note that we need to block the main thread until countToTen()'s goroutine completes. If we don't do this, the main thread will exit and all other goroutines will be stopped even if they haven't completed their task yet.
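If you prefer not to hand-roll a done channel, sync.WaitGroup gives the same blocking behaviour; a minimal sketch of the same example:

package main

import (
    "fmt"
    "sync"
    "time"
)

func countToTen(wg *sync.WaitGroup) {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for i := 0; i < 10; i++ {
            time.Sleep(1 * time.Second)
            fmt.Println(i)
        }
    }()
}

func main() {
    var wg sync.WaitGroup
    countToTen(&wg)
    fmt.Println("countToTen() returned")
    wg.Wait() // block until the goroutine calls Done
}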
You can.
If you want to have a goroutine running in the background forever, you need some kind of infinite loop with a graceful stopping mechanism in place, usually via a channel, and you invoke the goroutine from some other function, so that even after this other function terminates, your goroutine will still be running.
For example:
// Goroutine which will run indefinitely,
// unless you send a signal on the quit channel.
func goroutine(quit chan bool) {
    for {
        select {
        case <-quit:
            fmt.Println("quit")
            return
        default:
            fmt.Println("Do your thing")
        }
    }
}

// The goroutine will still be running
// after you return from this function.
func invoker() {
    q := make(chan bool)
    go goroutine(q)
}
Here, you can call invoker when you want to start the goroutine, and even after invoker returns, your goroutine will still be running in the background.
The only exception to this is that when the main function returns, all goroutines in the application are terminated.
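As an aside, the same stop signal is nowadays often carried by a context.Context instead of a bare quit channel; a minimal sketch under that assumption:

package main

import (
    "context"
    "fmt"
    "time"
)

// loop runs until its context is cancelled.
func loop(ctx context.Context) {
    ticker := time.NewTicker(500 * time.Millisecond)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            fmt.Println("quit:", ctx.Err())
            return
        case <-ticker.C:
            fmt.Println("Do your thing")
        }
    }
}

// invoker starts the background goroutine and returns a function to stop it.
func invoker() (stop func()) {
    ctx, cancel := context.WithCancel(context.Background())
    go loop(ctx)
    return cancel
}

func main() {
    stop := invoker()
    time.Sleep(2 * time.Second) // the goroutine keeps running after invoker returned
    stop()                      // signal it to exit
    time.Sleep(100 * time.Millisecond)
}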

Golang Memory Leak Concerning Goroutines

I have a Go program that runs continuously and relies entirely on goroutines + 1 manager thread. The main thread simply calls goroutines and otherwise sleeps.
There is a memory leak. The program uses more and more memory until it drains all 16GB RAM + 32GB SWAP and then each goroutine panics. It is actually OS memory that causes the panic, usually the panic is fork/exec ./anotherapp: cannot allocate memory when I try to execute anotherapp.
When this happens all of the worker threads will panic and be recovered and restarted. So each goroutine will panic, be recovered and restarted... at which point the memory usage will not decrease, it remains at 48GB even though there is now virtually nothing allocated. This means all goroutines will always panic as there is never enough memory, until the entire executable is killed and restarted completely.
The entire thing is about 50,000 lines, but the actual problematic area is as follows:
type queue struct {
    identifier string
    kind       bool
}

func main() {
    // Set the number of goroutines that can be run
    var xthreads int32 = 10
    var usedthreads int32
    runtime.GOMAXPROCS(14)
    ready := make(chan *queue, 5)

    // Start the manager goroutine, which prepares identifiers in the background ready for processing, always with 5 waiting to go
    go manager(ready)

    // Start creating goroutines to process as they are ready
    for obj := range ready { // loops through the "ready" channel and waits when there is nothing
        // This section uses atomic instead of a blocking channel, in an earlier attempt to stop the memory leak, but it didn't work
        for atomic.LoadInt32(&usedthreads) >= xthreads {
            time.Sleep(time.Second)
        }
        debug.FreeOSMemory()             // Try to clean up the memory, also did not stop the leak
        atomic.AddInt32(&usedthreads, 1) // Mark goroutine as started

        // Unleak obj, probably unnecessary, but just to be safe
        copy := new(queue)
        copy.identifier = unleak.String(obj.identifier) // unleak is a 3rd party package that makes a copy of the string
        copy.kind = obj.kind
        go runit(copy, &usedthreads) // Start the processing goroutine
    }
    fmt.Println(`END`) // This should never happen as the channels are never closed
}

func manager(ready chan *queue) {
    // This thread communicates with another server and fills the "ready" channel
}

// This is the goroutine
func runit(obj *queue, threadcount *int32) {
    defer func() {
        if r := recover(); r != nil {
            // Panicked
            erstring := fmt.Sprint(r)
            reportFatal(obj.identifier, erstring)
        } else {
            // Completed successfully
            reportDone(obj.identifier)
        }
        atomic.AddInt32(threadcount, -1) // Mark goroutine as finished
    }()
    do(obj) // This function does the actual processing
}
As far as I can see, when the do function (last line) ends, either by finishing or by panicking, the runit function then ends, which ends the goroutine entirely, which means all of the memory from that goroutine should now be free. This is not what happens. What happens is that this app just uses more and more memory until it becomes unable to function, all the runit goroutines panic, and yet the memory does not decrease.
Profiling does not reveal anything suspicious. The leak appears to be outside of the profiler's scope.
Please consider inverting the pattern, see here or below....
package main

import (
    "log"
    "math/rand"
    "sync"
    "time"
)

// I do work
func worker(id int, work chan int) {
    for i := range work {
        // Work simulation
        log.Printf("Worker %d, sleeping for %d seconds\n", id, i)
        time.Sleep(time.Duration(rand.Intn(i)) * time.Second)
    }
}

// Return some fake work
func getWork() int {
    return rand.Intn(2) + 1
}

func main() {
    wg := new(sync.WaitGroup)
    work := make(chan int)

    // run 10 workers
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(i int) {
            worker(i, work)
            wg.Done()
        }(i)
    }

    // main "thread"
    for i := 0; i < 100; i++ {
        work <- getWork()
    }

    // signal there is no more work to be done
    close(work)

    // Wait for the workers to exit
    wg.Wait()
}

What is the nodejs setTimeout equivalent in Golang?

I am currently studying Go, and I miss setTimeout from Node.js. I haven't read much yet, and I'm wondering if I could implement the same thing in Go, like an interval or a loopback.
Is there a way I can translate this from Node to Go? I heard Go handles concurrency very well; might this be a job for goroutines or something else?
// Node.js
function main() {
    // Do something
    setTimeout(main, 3000)
    console.log('Server is listening to 1337')
}
Thank you in advance!
// Go version
func main() {
    for t := range time.Tick(3 * time.Second) {
        fmt.Printf("working %s \n", t)
    }
    // basically this will not execute..
    fmt.Printf("will be called 1st")
}
The closest equivalent is the time.AfterFunc function:
import "time"
...
time.AfterFunc(3*time.Second, somefunction)
This will spawn a new goroutine and run the given function after the specified amount of time. There are other related functions in the package that may be of use:
time.After: this version will return a channel that will send a value after the given amount of time. This can be useful in combination with the select statement if you want a timeout while waiting on one or more channels.
time.Sleep: this version will simply block until the timer expires. In Go it is more common to write synchronous code and rely on the scheduler to switch to other goroutines, so sometimes simply blocking is the best solution.
There are also the time.Timer and time.Ticker types that can be used for less trivial cases where you may need to cancel the timer.
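To mimic the self-rescheduling setTimeout(main, 3000) from the question, one possible sketch (the 3-second interval and the log line come from the question; the blocking select at the end is just to keep the program alive):

package main

import (
    "fmt"
    "time"
)

var loop func()

func main() {
    loop = func() {
        // Do something
        fmt.Println("Server is listening to 1337")
        time.AfterFunc(3*time.Second, loop) // reschedule ourselves, like setTimeout(main, 3000)
    }
    loop()

    // Block forever so the timers keep firing; in a real server this is
    // where you would listen for connections instead.
    select {}
}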
This website provides an interesting example and explanation of timeouts involving channels and the select statement.
// _Timeouts_ are important for programs that connect to
// external resources or that otherwise need to bound
// execution time. Implementing timeouts in Go is easy and
// elegant thanks to channels and `select`.
package main

import (
    "fmt"
    "time"
)

func main() {
    // For our example, suppose we're executing an external
    // call that returns its result on a channel `c1`
    // after 2s.
    c1 := make(chan string, 1)
    go func() {
        time.Sleep(2 * time.Second)
        c1 <- "result 1"
    }()

    // Here's the `select` implementing a timeout.
    // `res := <-c1` awaits the result and `<-time.After`
    // awaits a value to be sent after the timeout of
    // 1s. Since `select` proceeds with the first
    // receive that's ready, we'll take the timeout case
    // if the operation takes more than the allowed 1s.
    select {
    case res := <-c1:
        fmt.Println(res)
    case <-time.After(1 * time.Second):
        fmt.Println("timeout 1")
    }

    // If we allow a longer timeout of 3s, then the receive
    // from `c2` will succeed and we'll print the result.
    c2 := make(chan string, 1)
    go func() {
        time.Sleep(2 * time.Second)
        c2 <- "result 2"
    }()
    select {
    case res := <-c2:
        fmt.Println(res)
    case <-time.After(3 * time.Second):
        fmt.Println("timeout 2")
    }
}
You can also run it on the Go Playground
Another solution could be to use an immediately-invoked function expression (IIFE), like:
go func() {
    time.Sleep(time.Second * 3)
    // your code here
}()
You can do this by using the Sleep function
and giving it the duration you need:
package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Println("First")
    time.Sleep(5 * time.Second)
    fmt.Println("second")
}
