Postgres lock behaves inconsistently over different goroutines - multithreading

I've encountered some inconsistent behaviour with advisory locks across goroutines on PostgreSQL 9.5.
When I create additional goroutines and call SELECT pg_try_advisory_lock(4); the results are inconsistent.
I can try to get the lock at some moment, fail, try again and manage to get it, without anyone explicitly releasing the lock in the meantime.
I've created a small program to reproduce the issue.
Program flow
Create 10 goroutines. Each one of them tries to get the same lock on initialization.
Every second, each instance tries to get the lock again if it doesn't have it already.
Every second I probe all instances and count how many already got the lock.
Expected behaviour:
Only 1 goroutine would have the lock at any given moment.
Actual results:
The number of goroutines that manage to acquire the lock increases over time.
package main

import (
    "fmt"
    "sync"
    "time"

    "github.com/go-gorp/gorp" // import path may vary with the gorp version in use
)

var dbMap *gorp.DbMap // init code omitted for brevity

func main() {
    setup(10)
    fmt.Println("after initialization,", countOwners(), "instances of lockOwners have the lock!")
    for {
        if _, err := dbMap.Exec("SELECT pg_sleep(1)"); err != nil {
            panic(err)
        }
        fmt.Println(countOwners(), "instances of lockOwners have the lock!")
    }
}

func countOwners() int {
    possessLock := 0
    for _, lo := range los {
        if lo.hasLock { // note: hasLock is written by other goroutines without synchronization
            possessLock++
        }
    }
    return possessLock
}

var los []*lockOwner

func setup(instanceCount int) {
    var wg sync.WaitGroup
    for i := 0; i < instanceCount; i++ {
        wg.Add(1)
        newInstance := lockOwner{}
        los = append(los, &newInstance)
        go newInstance.begin(time.Second, &wg, i+1)
    }
    wg.Wait()
}

type lockOwner struct {
    id      int
    ticker  *time.Ticker
    hasLock bool
}

func (lo *lockOwner) begin(interval time.Duration, wg *sync.WaitGroup, id int) {
    lo.ticker = time.NewTicker(interval)
    lo.id = id
    go func() {
        lo.tryToGetLock()
        wg.Done()
        for range lo.ticker.C {
            lo.tryToGetLock()
        }
    }()
}

func (lo *lockOwner) tryToGetLock() {
    if lo.hasLock {
        return
    }
    locked, err := dbMap.SelectStr("SELECT pg_try_advisory_lock(4);")
    if err != nil {
        panic(err)
    }
    if locked == "true" {
        fmt.Println(lo.id, "Did get lock!")
        lo.hasLock = true
    }
}
The output of this program varies, but is usually something along these lines:
1 Did get lock!
after initialization, 1 instances of lockOwners have the lock!
1 instances of lockOwners have the lock!
2 Did get lock!
2 instances of lockOwners have the lock!
2 instances of lockOwners have the lock!
7 Did get lock!
3 instances of lockOwners have the lock!
3 instances of lockOwners have the lock!
6 Did get lock!
4 instances of lockOwners have the lock!
My questions:
What should I expect to be protected when using pg_locks this way?
What is the reason some goroutine fails to acquire the lock?
What is the reason the same goroutine succeeds in doing so on the next attempt?
Could it be that the underlying thread is the resource being locked, and that each time a goroutine runs it may do so from a different thread? That would explain the inconsistent behaviour.

After some months of experience with PostgreSQL via gorp, I think I understand this behaviour:
gorp maintains a connection pool.
Whenever we create a transaction, one of these connections is picked, seemingly at random.
dbMap.SomeCommand() creates a transaction, executes some command and commits the transaction.
pg_try_advisory_lock works on the session level.
A session is synonymous with a TCP connection, so it is by no means stable or permanent.
A connection from the pool keeps using the same session when possible, but will reset it when needed.
Whenever we execute dbMap.SelectStr("SELECT pg_try_advisory_lock(4);"), a connection is selected from the pool. Then a transaction is created, it acquires the lock on the session level and then it's committed.
When another goroutine attempts to do the same, there's a chance it will be handed the same session, depending on which connection is taken from the pool. Since the locking is done at the session level, the new transaction is able to acquire the lock again.
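To illustrate the fix, here is a minimal sketch (my own, not the original program) that pins the advisory lock to a single session by checking one connection out of the pool with database/sql's Conn method (Go 1.9+). The lib/pq driver, the connection string and lock key 4 are assumptions carried over from the example above:

package main

import (
    "context"
    "database/sql"
    "errors"
    "log"

    _ "github.com/lib/pq" // assumed driver; any database/sql Postgres driver works
)

func main() {
    // Connection string is a placeholder.
    db, err := sql.Open("postgres", "dbname=test sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
    if err := tryLockOnOneSession(context.Background(), db); err != nil {
        log.Fatal(err)
    }
}

// tryLockOnOneSession checks a single connection out of the pool, so that
// acquiring the advisory lock, doing the work and releasing the lock all
// happen on the same session.
func tryLockOnOneSession(ctx context.Context, db *sql.DB) error {
    conn, err := db.Conn(ctx) // pin one connection (= one session)
    if err != nil {
        return err
    }
    defer conn.Close() // returns the connection (and its session) to the pool

    var locked bool
    if err := conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock(4)").Scan(&locked); err != nil {
        return err
    }
    if !locked {
        return errors.New("advisory lock 4 is held by another session")
    }
    // ... do the work the lock protects ...

    // Unlock on the very session that acquired the lock.
    _, err = conn.ExecContext(ctx, "SELECT pg_advisory_unlock(4)")
    return err
}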

Related

How to make work concurrent while sending data on a stream in golang?

I have a golang grpc server which has a streaming endpoint. Earlier I was doing all the work sequentially and sending on the stream, but then I realized I can make the work concurrent and then send on the stream. From the grpc-go docs I understood that I can make the work concurrent but can't make sending on the stream concurrent, so I got the below code which does the job.
Below is the code I have in my streaming endpoint which sends data back to the client in a streaming way. This does all the work concurrently.
// get "allCids" from lot of files and load in memory.
allCids := .....
var data = allCids.([]int64)
out := make(chan *custPbV1.CustomerResponse, len(data))
wg := &sync.WaitGroup{}
wg.Add(len(data))
go func() {
wg.Wait()
close(out)
}()
for _, cid := range data {
go func (id int64) {
defer wg.Done()
pd := repo.GetCustomerData(strconv.FormatInt(cid, 10))
if !pd.IsCorrect {
return
}
resources := us.helperCom.GenerateResourceString(pd)
val, err := us.GenerateInfo(clientId, resources, cfg)
if err != nil {
return
}
out <- val
}(cid)
}
for val := range out {
if err := stream.Send(val); err != nil {
log.Printf("send error %v", err)
}
}
Now the problem I have is that the size of the data slice can be about a million, so I don't want to spawn a million goroutines to do the job. How do I handle that scenario here? If I use 100 instead of len(data), will that work for me, or do I need to slice data into 100 sub-arrays as well? I am just confused about the best way to deal with this problem.
I recently started with golang, so pardon me if there are any mistakes in the above code I made while making it concurrent.
Please check this pseudo code
func main() {
    works := make(chan int64, 100)
    errChan := make(chan error, 100)
    out := make(chan *custPbV1.CustomerResponse, 100)

    // spawn fixed workers
    var workerWg sync.WaitGroup
    for i := 0; i < 100; i++ {
        workerWg.Add(1)
        go worker(&workerWg, works, errChan, out)
    }

    // feed the input
    go func() {
        for _, cid := range data {
            // this will block if all the workers are busy and no space is left in the channel.
            works <- cid
        }
        close(works)
    }()

    var analyzeResults sync.WaitGroup
    analyzeResults.Add(2)

    // process errors
    go func() {
        for err := range errChan {
            log.Printf("error %v", err)
        }
        analyzeResults.Done()
    }()

    // process output
    go func() {
        for val := range out {
            if err := stream.Send(val); err != nil {
                log.Printf("send error %v", err)
            }
        }
        analyzeResults.Done()
    }()

    workerWg.Wait()
    close(out)
    close(errChan)
    analyzeResults.Wait()
}

func worker(job *sync.WaitGroup, works chan int64, errChan chan error, out chan *custPbV1.CustomerResponse) {
    defer job.Done()
    // An idle worker takes the next job from this channel.
    for cid := range works {
        pd := repo.GetCustomerData(strconv.FormatInt(cid, 10))
        if !pd.IsCorrect {
            errChan <- fmt.Errorf("pd for cid %d is incorrect", cid)
            // We cannot return here, or the pool would shrink by one worker.
            // If every worker did this, there might be no workers left to do the job.
            continue
        }
        resources := us.helperCom.GenerateResourceString(pd)
        val, err := us.GenerateInfo(clientId, resources, cfg)
        if err != nil {
            errChan <- fmt.Errorf("GenerateInfo: %v", err)
            continue
        }
        out <- val
    }
}
Explanation:
This is a worker pool implementation where we spawn a fixed number of goroutines (100 workers here) to do the same job (GetCustomerData() & GenerateInfo() here) but with different input data (cid here). 100 workers here does not mean that it is parallel, but concurrent (depending on GOMAXPROCS). If one worker is waiting for an io result (basically some blocking operation), that particular goroutine is context-switched and another worker goroutine gets a chance to execute. But increasing the number of goroutines (workers) may not give much performance and can lead to contention on the channel, as more workers are waiting for input jobs on that channel.
The benefit over splitting the 1 million items into subslices: say we have 1000 jobs and 100 workers. With static splitting, each worker gets assigned jobs 1-10, 11-20, and so on. What if the first 10 jobs take more time than the others? In that case the first worker is overloaded, while the other workers finish their tasks and sit idle even though there are pending tasks. To avoid this situation, a shared queue is the best solution: an idle worker takes the next job, so no worker is overloaded compared to the others.
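As a side note, if pulling in a dependency is acceptable, golang.org/x/sync/errgroup can express the same bound on concurrency with its SetLimit method. This is a hedged sketch, not a drop-in replacement: process stands in for the GetCustomerData/GenerateInfo work, and the data slice is a placeholder for the question's million-entry input:

package main

import (
    "fmt"

    "golang.org/x/sync/errgroup"
)

// process is a stand-in for the per-cid work from the question.
func process(cid int64) error {
    fmt.Println("processing", cid)
    return nil
}

func main() {
    data := []int64{1, 2, 3, 4, 5} // stand-in for the million-entry slice

    var g errgroup.Group
    g.SetLimit(100) // at most 100 goroutines in flight; Go blocks until a slot frees up

    for _, cid := range data {
        cid := cid // capture the loop variable (not needed from Go 1.22 on)
        g.Go(func() error {
            return process(cid)
        })
    }
    if err := g.Wait(); err != nil {
        fmt.Println("first error:", err)
    }
}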

If each thread needs to resume work whenever any thread finds some new information, how do you wait until all threads have finished?

A Seemingly Simple Synchronization Problem
TL;DR
Several threads depend on each other. Whenever one of them finds some new information, all of them need to process that information. How to determine that all threads are ready?
Background
I have (almost) parallelized a function Foo(input) that solves a problem which is known to be P-complete and may be thought of as some type of search. Unsurprisingly, so far nobody has managed to successfully exploit parallelism beyond two threads for solving that problem. However, I had a promising idea and managed to fully implement it, except for this seemingly simple problem.
Details
Information is exchanged between the threads implicitly, using some kind of shared graph-like database g of type G, such that the threads have all information immediately and do not really need to notify each other explicitly. More precisely, each time a piece of information i is found by some thread, that thread calls a thread-safe function g.addInformation(i), which among other things basically places i at the end of some array. One aspect of my new implementation is that threads can use an information i during their search even before i has been enqueued at the end of the array. Nevertheless, each thread needs to additionally process i separately after it has been enqueued in that array. Enqueueing i may happen after the thread that added i has returned from g.addInformation(i), because some other thread may take over the responsibility to enqueue it.
Each thread s calls a function s.processAllInformation() in order to process all information in that array in g, in order. A call to s.processAllInformation() by some thread is a noop, i.e. does nothing, if that thread has already processed all information or there is no (new) information.
As soon as a thread has finished processing all information, it should wait for all other threads to finish, and it should resume work if any of the other threads finds some new information i. I.e. each time some thread calls g.addInformation(i), all threads that had finished processing all previously known information need to resume their work and process that (and any other) newly added information.
My Problem
Every solution I could think of does not work and suffers from a variation of the same problem: one thread finishes processing all information and sees that all other threads are ready, too. Hence, this thread leaves. But then another thread notices that new information has been added, resumes work and finds yet more new information. That new information is then not processed by the thread that has already left.
A solution to this problem may be straightforward, but I cannot think of one. Ideally, a solution should not depend on time-consuming operations during a call to g.addInformation(i) whenever new information is found, because of how many times per second this situation is predicted to occur (1 or 2 million times per second, see below).
Even more background
In my initially sequential application, the function Foo(input) is called roughly 100k times a second on modern hardware, and my application spends 80% to 90% of its time executing Foo(input). Actually, all function calls to Foo(input) depend on each other; we kind of search for something in a very large space in an iterative manner. Solving a reasonably sized problem typically takes about one or two hours with the sequential version of the application.
Each time Foo(input) is called, between zero and many hundreds of new pieces of information are found. On average, during the execution of my application, 1 or 2 million pieces of information are found per second, i.e. we find 10 to 20 new pieces of information on each call to Foo(input). All of these statistics probably have a very high standard deviation (which I haven't measured yet, though).
Currently I am writing a prototype for the parallel version of Foo(input) in Go, so I prefer answers in Go. The sequential application is written in C (actually it's C++, but it's written like a C program), so answers in C or C++ (or pseudo-code) are no problem. I haven't benchmarked my prototype yet, since wrong code is infinitely slower than slow code.
Code
These code examples are meant to clarify. Since I haven't solved the problem, feel free to make any changes to the code. (I appreciate unrelated helpful remarks, too.)
Global situation
We have some type G, and Foo() is a method of G. If g is an object of type G and g.Foo(input) is called, g creates some workers s[1], ..., s[g.numThreads] that obtain a pointer to g, such that they have access to the member variables of g and are able to call g.addInformation(i) whenever they find new information. Then for each worker s[j] the method FooInParallel() is called in parallel.
type G struct {
    s          []worker
    numThreads int
    // some data that the workers need access to
}

func (g *G) initializeWith(input InputType) {
    // Some code...
}

func (g *G) Foo(input InputType) int {
    // Initialize data structures:
    g.initializeWith(input)
    // Initialize workers:
    g.s = make([]worker, g.numThreads)
    for j := range g.s {
        g.s[j] = newWorker(g) // workers get a pointer to g
    }
    // Note: This wait group doesn't solve the problem. See remark below.
    wg := new(sync.WaitGroup)
    wg.Add(g.numThreads)
    // Actual computation in parallel:
    for j := 0; j < g.numThreads-1; j++ {
        // Start g.numThreads - 1 goroutines in parallel.
        go g.s[j].FooInParallel(wg)
    }
    // The last worker runs on this goroutine, such that we have
    // g.numThreads goroutines in total.
    g.s[g.numThreads-1].FooInParallel(wg)
    wg.Wait()
}
// This function is thread-safe insofar as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess the others leave behind (and even
// in bad cases that is not too much).
func (g *G) addInformation(i infoType) {
    // Step 1: Make the information available to all threads.
    // Step 2: Enqueue the information at the end of some array.
    // Step 3: Possibly, call g.notifyAll().
}

// If new information has been added, we must ensure
// that every thread that had finished resumes work
// and processes any newly added information.
func (g *G) notifyAll() {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}

// If a thread has finished processing all information,
// it must ensure that all threads have finished and
// that no new information has been added since.
func (g *G) allThreadsReady() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}
Remark: The only purpose of the wait group is to ensure that Foo(input) is not called again before the last worker has returned. However, you can completely ignore this.
Local Situation
Each worker contains a pointer to the global data structure and searches for either a treasure or new information until it has processed all information that has been enqueued by this or other threads. If it finds new information i, it calls g.addInformation(i) and continues its search. If it finds a treasure, it sends the treasure via a channel it obtained as an argument, and returns. If all threads have finished processing all information, each of them can send a dummy treasure to the channel and return. However, determining whether all threads are ready is exactly my problem.
type worker struct {
    // Each worker contains a pointer to g,
    // such that it has access to g's member
    // variables and is able to call
    // g.addInformation(i) as soon as it
    // finds some information i.
    g *G
    // Also contains some other stuff.
}

func (s *worker) FooInParallel(wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        a := s.processAllInformation()
        // The following is the problem. Feel free to make any
        // changes to the following block.
        s.notifyAll()
        for !s.needsToResumeWork() {
            if s.allThreadsReady() {
                return
            }
        }
    }
}

func (s *worker) notifyAll() {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
    // An example:
    // Step 1: Possibly, do something else first.
    // Step 2: Call g.notifyAll().
}

func (s *worker) needsToResumeWork() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}

func (s *worker) allThreadsReady() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
    // If all threads are ready, return true.
    // Otherwise, return false.
    // Alternatively, spin as long as no new information
    // has been added, and return false as soon as some
    // new information has been added, or true if no new
    // information has been added and all other threads
    // are ready.
    //
    // However, this doesn't really matter, because a
    // call to processAllInformation is cheap if no new
    // information is available.
}

// A call to this function is cheap if no new work has
// been added since the last call.
func (s *worker) processAllInformation() treasureType {
    // Accesses the member variables of g and searches
    // for information or treasures.
    // If a new information i is found, calls
    // g.addInformation(i).
    // Returns once all information that has been
    // enqueued to g has been processed by this thread.
}
My best attempt to solve the problem
Well, by now I am rather tired, so I might need to double-check my solution later. However, even my most careful attempt doesn't work. So, in order to give you an idea of what I have been trying so far (among many other things), I share it immediately.
I tried the following. Each of the workers contains a member variable needsToResumeWorkFlag that is atomically set to one whenever new information has been added. Setting this member variable to one several times does no harm; it is only important that the thread resumes work after the last information has been added.
In order to reduce the workload for a thread calling g.addInformation(i) whenever information i is found, instead of notifying all threads individually, the thread that enqueues the information (not necessarily the thread that called g.addInformation(i)) afterwards sets the member variable notifyAllFlag of g to one, indicating that all threads need to be notified about the latest information.
Whenever a thread that has finished processing all enqueued information calls the function g.notifyAll(), it checks whether the member variable notifyAllFlag is set to one. If so, it tries to atomically compare g.allInformedFlag with 1 and swap it with 0. If it could not write g.allInformedFlag, it assumes some other thread has taken the responsibility to inform all threads. If the operation is successful, this thread has taken over the responsibility to notify all threads and proceeds to do so by setting the member variable needsToResumeWorkFlag to one for every thread. Afterwards, it atomically sets g.numThreadsReady and g.notifyAllFlag to zero, and g.allInformedFlag to one.
type G struct {
    s               []worker // the workers, as in the first listing
    numThreads      int
    numThreadsReady *uint32 // initialize to 0 somewhere appropriate
    notifyAllFlag   *uint32 // initialize to 0 somewhere appropriate
    allInformedFlag *uint32 // initialize to 1 somewhere appropriate (1 is not a typo)
    // some data that the workers need access to
}

// This function is thread-safe insofar as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess the others leave behind (and even
// in bad cases that is not too much).
func (g *G) addInformation(i infoType) {
    // Step 1: Make the information available to all threads.
    // Step 2: Enqueue the information at the end of some array.
    // Since the responsibility to enqueue an information may
    // be passed to another thread, it is important that the
    // last step is executed by the thread which enqueues the
    // information(s), in order to ensure that the information
    // has successfully been enqueued.
    // Step 3:
    atomic.StoreUint32(g.notifyAllFlag, 1) // all threads need to be notified
}

// If new information has been added, we must ensure
// that every thread that had finished resumes work
// and processes any newly added information.
func (g *G) notifyAll() {
    if atomic.LoadUint32(g.notifyAllFlag) == 1 {
        // Somebody needs to notify all threads.
        if atomic.CompareAndSwapUint32(g.allInformedFlag, 1, 0) {
            // This thread has taken over the responsibility to inform
            // all other threads. All threads are hindered from accessing
            // their member variable s.needsToResumeWorkFlag.
            for j := range g.s {
                atomic.StoreUint32(g.s[j].needsToResumeWorkFlag, 1)
            }
            atomic.StoreUint32(g.notifyAllFlag, 0)
            atomic.StoreUint32(g.numThreadsReady, 0)
            atomic.StoreUint32(g.allInformedFlag, 1)
        } else {
            // Some other thread has taken responsibility to inform
            // all threads.
        }
    }
}
Whenever a thread finishes processing all enqueued information, it checks whether it needs to resume work by atomically comparing its member variable needsToResumeWorkFlag with 1 and swapping it with 0. However, since one of the threads is responsible for notifying all others, it cannot do so immediately.
First, it must call g.notifyAll(), and then it must check whether the latest thread to call g.notifyAll() has finished notifying all threads. Hence, after calling g.notifyAll(), it must spin until g.allInformedFlag is one before it checks whether its member variable s.needsToResumeWorkFlag is one, in which case it atomically sets it to zero and resumes work. (I guess here is a mistake, but I also tried several other things here without success.) If s.needsToResumeWorkFlag is already zero, it atomically increments g.numThreadsReady by one, if it hasn't done so before. (Recall that g.numThreadsReady is reset during a call to g.notifyAll().) Then it atomically checks whether g.numThreadsReady is equal to g.numThreads, in which case it can leave (after sending a dummy treasure to the channel). Otherwise, we start all over again until either this thread has been notified (possibly by itself) or all threads are ready.
type worker struct {
    // Each worker contains a pointer to g,
    // such that it has access to g's member
    // variables and is able to call
    // g.addInformation(i) as soon as it
    // finds some information i.
    g *G
    // If new work has been added, the thread
    // is notified by setting the uint32
    // that needsToResumeWorkFlag points to, to 1.
    needsToResumeWorkFlag *uint32 // initialize to 0 somewhere appropriate
    // Also contains some other stuff.
}

func (s *worker) FooInParallel(wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        a := s.processAllInformation()
        numReadyIncremented := false
        for !s.needsToResumeWork() {
            if !numReadyIncremented {
                atomic.AddUint32(s.g.numThreadsReady, 1)
                numReadyIncremented = true
            }
            if s.allThreadsReady() {
                return
            }
        }
    }
}

func (s *worker) needsToResumeWork() bool {
    s.notifyAll()
    for {
        if atomic.LoadUint32(s.g.allInformedFlag) == 1 {
            return atomic.CompareAndSwapUint32(s.needsToResumeWorkFlag, 1, 0)
        }
    }
}

func (s *worker) notifyAll() {
    s.g.notifyAll()
}

func (s *worker) allThreadsReady() bool {
    return atomic.LoadUint32(s.g.numThreadsReady) == uint32(s.g.numThreads)
}
As mentioned, my solution doesn't work.
I found a solution myself. We exploit the fact that a call to s.processAllInformation() does nothing (and is cheap) if no new information has been added. The trick is to use an atomic variable as a lock, which each thread uses both to notify all others if necessary and to check whether it has been notified. Then, if the lock cannot be acquired, we simply call s.processAllInformation() again. A thread then uses the notifications to check whether it has to increment the counter of ready threads, instead of to see whether it needs to resume work.
Global situation
type G struct {
    s               []worker // the workers, as in the first listing
    numThreads      int
    numThreadsReady *uint32 // initialize to 0 somewhere appropriate
    notifyAllFlag   *uint32 // initialize to 0 somewhere appropriate
    allCanGoFlag    *uint32 // initialize to 0 somewhere appropriate
    lock            *uint32 // initialize to 0 somewhere appropriate
    // some data that the workers need access to
}

// This function is thread-safe insofar as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess the others leave behind (and even
// in bad cases that is not too much).
func (g *G) addInformation(i infoType) {
    // Step 1: Make the information available to all threads.
    // Step 2: Enqueue the information at the end of some array.
    // Since the responsibility to enqueue an information may
    // be passed to another thread, it is important that the
    // last step is executed by the thread which enqueues the
    // information(s), in order to ensure that the information
    // has successfully been enqueued.
    // Step 3:
    atomic.StoreUint32(g.notifyAllFlag, 1) // all threads need to be notified
}

// If new information has been added, we must ensure
// that every thread, that had finished, resumes work
// and processes any newly added information.
//
// This function is not thread-safe. Make sure not to
// have several threads call it concurrently if those
// calls are not guarded by some lock.
func (g *G) notifyAll() {
    if atomic.LoadUint32(g.notifyAllFlag) == 1 {
        for j := range g.s {
            atomic.StoreUint32(g.s[j].needsToResumeWorkFlag, 1)
        }
        atomic.StoreUint32(g.notifyAllFlag, 0)
        atomic.StoreUint32(g.numThreadsReady, 0)
    }
}
Local situation
type worker struct {
    // Each worker contains a pointer to g,
    // such that it has access to g's member
    // variables and is able to call
    // g.addInformation(i) as soon as it
    // finds some information i.
    g *G
    // If new work has been added, the thread
    // is notified by setting the uint32
    // that needsToResumeWorkFlag points to, to 1.
    needsToResumeWorkFlag   *uint32 // initialize to 0 somewhere appropriate
    incrementedNumReadyFlag *uint32 // initialize to 0 somewhere appropriate
    // Also contains some other stuff.
}

func (s *worker) FooInParallel(wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        a := s.processAllInformation()
        if atomic.LoadUint32(s.g.allCanGoFlag) == 1 {
            return
        }
        if atomic.CompareAndSwapUint32(s.g.lock, 0, 1) { // If possible, lock.
            s.g.notifyAll() // It is essential that this is also guarded by the lock.
            if atomic.LoadUint32(s.needsToResumeWorkFlag) == 1 {
                atomic.StoreUint32(s.needsToResumeWorkFlag, 0)
                // Some new information was found, and this thread can't be sure
                // whether it has already processed it. Since the counter for
                // how many threads are ready has been reset, we must increment
                // that counter after the next call to processAllInformation()
                // in the following iteration.
                atomic.StoreUint32(s.incrementedNumReadyFlag, 0)
            } else {
                // Increment the number of ready threads by one, if this thread
                // has not done so before (since the last newly found information).
                if atomic.CompareAndSwapUint32(s.incrementedNumReadyFlag, 0, 1) {
                    atomic.AddUint32(s.g.numThreadsReady, 1)
                }
                // If all threads are ready, give them all a signal.
                if atomic.LoadUint32(s.g.numThreadsReady) == uint32(s.g.numThreads) {
                    atomic.StoreUint32(s.g.allCanGoFlag, 1)
                }
            }
            atomic.StoreUint32(s.g.lock, 0) // Unlock.
        }
    }
}
Later I may add some ordering for the threads' access to the lock under heavy contention, but for now that'll do.
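For comparison, here is a sketch of the same termination condition built on sync.Cond instead of spinning on atomics. It is my own illustration, not the original code: it assumes processAllInformation can report the version (array length) it has fully processed, the names coordinator/waitForMore are invented, and a Broadcast per addInformation may well be too slow for the millions of additions per second mentioned above.

package main

import (
    "fmt"
    "sync"
)

// coordinator tracks how many workers are caught up with the current
// "version" of the shared information array.
type coordinator struct {
    mu      sync.Mutex
    cond    *sync.Cond
    version int // bumped whenever new information is enqueued
    idle    int // workers currently caught up and waiting
    workers int
    done    bool
}

func newCoordinator(workers int) *coordinator {
    c := &coordinator{workers: workers}
    c.cond = sync.NewCond(&c.mu)
    return c
}

// notifyNewWork is what addInformation would call once the new item
// is visible to all workers.
func (c *coordinator) notifyNewWork() {
    c.mu.Lock()
    c.version++
    c.mu.Unlock()
    c.cond.Broadcast()
}

func (c *coordinator) currentVersion() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.version
}

// waitForMore is called by a worker with the version it has fully processed.
// It returns false exactly once per worker: when every worker is caught up
// with the same version, i.e. nobody can produce new information any more.
func (c *coordinator) waitForMore(processed int) bool {
    c.mu.Lock()
    defer c.mu.Unlock()
    if processed < c.version {
        return true // more information is already waiting
    }
    c.idle++
    if c.idle == c.workers {
        c.done = true // everyone is caught up: terminate
        c.cond.Broadcast()
        return false
    }
    for processed == c.version && !c.done {
        c.cond.Wait()
    }
    if c.done {
        return false
    }
    c.idle-- // new information arrived; this worker resumes
    return true
}

func main() {
    c := newCoordinator(2)
    var wg sync.WaitGroup
    for w := 0; w < 2; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for {
                // s.processAllInformation() would go here; this demo
                // merely observes the latest version and catches up.
                v := c.currentVersion()
                if !c.waitForMore(v) {
                    return
                }
            }
        }()
    }
    c.notifyNewWork() // simulate one piece of information being added
    wg.Wait()
    fmt.Println("all workers terminated")
}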

Troubles with gitlab scraping via golang

I'm a newbie in programming and I need help. I'm trying to write a gitlab scraper in golang.
Something goes wrong when I try to get information about projects in multithreading mode.
Here is the code:
func (g *Gitlab) getAPIResponse(url string, structure interface{}) error {
    response, err := http.Get(url)
    if err != nil {
        return err
    }
    defer response.Body.Close() // the body was never closed; unclosed bodies leak connections
    ret, err := ioutil.ReadAll(response.Body)
    if err != nil {
        return err
    }
    if string(ret) != "[]" {
        return json.Unmarshal(ret, structure)
    }
    return errors.New(error_emptypage)
}
...
func (g *Gitlab) GetProjects() {
    projects_chan := make(chan Project, g.LatestProjectID)
    var waitGroup sync.WaitGroup
    queue := make(chan struct{}, 50)
    for i := g.LatestProjectID; i > 0; i-- {
        url := g.BaseURL + projects_url + "/" + strconv.Itoa(i) + g.Token
        waitGroup.Add(1)
        go func(url string, channel chan Project) {
            queue <- struct{}{}
            defer waitGroup.Done()
            var oneProject Project
            err := g.getAPIResponse(url, &oneProject)
            if err != nil {
                fmt.Println(err.Error())
            }
            fmt.Printf(".")
            channel <- oneProject
            <-queue
        }(url, projects_chan)
    }
    go func() {
        waitGroup.Wait()
        close(projects_chan)
    }()
    for project := range projects_chan {
        if project.ID != 0 {
            g.Projects = append(g.Projects, project)
        }
    }
}
And here is the output:
$ ./gitlab-auditor
latest project = 1532
Gathering projects...
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Get https://gitlab.example.com/api/v4/projects/563&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/558&private_token=SeCrEt_ToKeN: unexpected EOF
..Get https://gitlab.example.com/api/v4/projects/531&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/571&private_token=SeCrEt_ToKeN: unexpected EOF
.Get https://gitlab.example.com/api/v4/projects/570&private_token=SeCrEt_ToKeN: unexpected EOF
..Get https://gitlab.example.com/api/v4/projects/467&private_token=SeCrEt_ToKeN: unexpected EOF
Get https://gitlab.example.com/api/v4/projects/573&private_token=SeCrEt_ToKeN: unexpected EOF
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Every time it's different projects, but their IDs are around 550.
When I try to curl the links from the output, I get normal JSON. When I run this code with queue := make(chan struct{}, 1) (i.e. in a single thread), everything is fine.
What can it be?
I would say this is not a very clear way to achieve concurrency.
What seems to be happening here is:
you create a buffered channel that has a size of 50.
then you fire up 1532 goroutines.
the first 50 of them enqueue themselves and start processing. By the time they <-queue and free up some space, a random one of the rest manages to get onto the queue.
as people say in the comments, you most certainly hit some limits around the time the blast has made it to around id 550. Then GitLab's API is angry at you and rate-limits you.
then another goroutine is fired that will close the channel to notify the main goroutine.
the main goroutine reads the messages.
The talk Go Concurrency Patterns as well as the blog post Concurrency in Go might help.
Personally, I rarely use buffered channels. For your problem I would go like this (a sketch follows below):
define a number of workers
have the main goroutine fire up the workers, each with a func listening on a channel of ints, doing the API call, and writing to a channel of projects
have the main goroutine send the project numbers to be fetched to the channel of ints, and read from the channel of projects
maybe rate-limit by firing up a ticker and having main read from it before it sends the next request
main closes the ints channel to notify the workers to die.
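A rough sketch of that shape, not drop-in code: Project, fetch (the API call), the tick interval and the constants are placeholders for the question's own types and logic.

package main

import (
    "fmt"
    "time"
)

type Project struct{ ID int }

// fetch stands in for the real API call.
func fetch(id int) Project {
    return Project{ID: id}
}

func worker(ids <-chan int, projects chan<- Project) {
    for id := range ids {
        projects <- fetch(id)
    }
}

func main() {
    const latestProjectID = 1532
    const numWorkers = 10

    ids := make(chan int)
    projects := make(chan Project)

    // Fire up a fixed number of workers.
    for w := 0; w < numWorkers; w++ {
        go worker(ids, projects)
    }

    // Rate-limit the feed: one id per tick.
    ticker := time.NewTicker(50 * time.Millisecond)
    defer ticker.Stop()

    go func() {
        for i := latestProjectID; i > 0; i-- {
            <-ticker.C
            ids <- i
        }
        close(ids) // notify the workers to die
    }()

    // Every id yields exactly one result, so read that many.
    for i := 0; i < latestProjectID; i++ {
        p := <-projects
        if p.ID != 0 {
            // collect p
        }
    }
    fmt.Println("done")
}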

Golang Memory Leak Concerning Goroutines

I have a Go program that runs continuously and relies entirely on goroutines plus one manager thread. The main thread simply spawns goroutines and otherwise sleeps.
There is a memory leak. The program uses more and more memory until it drains all 16GB RAM + 32GB SWAP, and then each goroutine panics. It is actually the OS running out of memory that causes the panic; usually the panic is fork/exec ./anotherapp: cannot allocate memory when I try to execute anotherapp.
When this happens, all of the worker threads panic, are recovered and restarted. So each goroutine panics, is recovered and restarted... at which point the memory usage does not decrease; it remains at 48GB even though there is now virtually nothing allocated. This means all goroutines will always panic, as there is never enough memory, until the entire executable is killed and restarted completely.
The entire thing is about 50,000 lines, but the actual problematic area is as follows:
type queue struct {
    identifier string
    kind       bool // was "type" in the original sketch, which is a reserved word in Go
}

func main() {
    // Set the number of goroutines that can be run
    var xthreads int32 = 10
    var usedthreads int32
    runtime.GOMAXPROCS(14)
    ready := make(chan *queue, 5)

    // Start the manager goroutine, which prepares identifiers in the
    // background ready for processing, always with 5 waiting to go
    go manager(ready)

    // Start creating goroutines to process as they are ready
    for obj := range ready { // loops through the "ready" channel and waits when there is nothing
        // This section uses atomic instead of a blocking channel, in an earlier
        // attempt to stop the memory leak, but it didn't work
        for atomic.LoadInt32(&usedthreads) >= xthreads {
            time.Sleep(time.Second)
        }
        debug.FreeOSMemory()             // Try to clean up the memory; also did not stop the leak
        atomic.AddInt32(&usedthreads, 1) // Mark goroutine as started

        // Unleak obj, probably unnecessary, but just to be safe
        // (renamed from "copy", which shadows the builtin)
        cp := new(queue)
        cp.identifier = unleak.String(obj.identifier) // unleak is a 3rd party package that makes a copy of the string
        cp.kind = obj.kind
        go runit(cp, &usedthreads) // Start the processing thread
    }
    fmt.Println(`END`) // This should never happen as the channels are never closed
}

func manager(ready chan *queue) {
    // This goroutine communicates with another server and fills the "ready" channel
}

// This is the goroutine
func runit(obj *queue, threadcount *int32) {
    defer func() {
        if r := recover(); r != nil {
            // Panicked
            erstring := fmt.Sprint(r)
            reportFatal(obj.identifier, erstring)
        } else {
            // Completed successfully
            reportDone(obj.identifier)
        }
        atomic.AddInt32(threadcount, -1) // Mark goroutine as finished
    }()
    do(obj) // This function does the actual processing
}
As far as I can see, when the do function (last line) ends, either by having finished or having panicked, the runit function then ends, which ends the goroutine entirely, which means all of the memory from that goroutine should now be freed. This is not what happens. What happens is that the app just uses more and more memory until it becomes unable to function, all the runit goroutines panic, and yet the memory does not decrease.
Profiling does not reveal anything suspicious. The leak appears to be outside of the profiler's scope.
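As an aside (my suggestion, not part of the original post): if the allocations are visible to the Go runtime at all, exposing the heap profile over HTTP with the standard net/http/pprof package is a cheap way to check; the port here is arbitrary.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on DefaultServeMux
)

func main() {
    // Inspect with: go tool pprof http://localhost:6060/debug/pprof/heap
    log.Println(http.ListenAndServe("localhost:6060", nil))
}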
Please consider inverting the pattern, see here or below....
package main

import (
    "log"
    "math/rand"
    "sync"
    "time"
)

// I do work
func worker(id int, work chan int) {
    for i := range work {
        // Work simulation (log the actual sleep, which the original did not)
        d := time.Duration(rand.Intn(i)) * time.Second
        log.Printf("Worker %d, sleeping for %v\n", id, d)
        time.Sleep(d)
    }
}

// Return some fake work
func getWork() int {
    return rand.Intn(2) + 1
}

func main() {
    wg := new(sync.WaitGroup)
    work := make(chan int)

    // run 10 workers
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(i int) {
            worker(i, work)
            wg.Done()
        }(i)
    }

    // main "thread"
    for i := 0; i < 100; i++ {
        work <- getWork()
    }

    // signal there is no more work to be done
    close(work)

    // Wait for the workers to exit
    wg.Wait()
}

"window procedure" of a newly created thread without window

I want to create a thread for some DB writes that should not block the UI in case the DB is not there. For synchronizing with the main thread, I'd like to use Windows messages. The main thread sends the data to be written to the writer thread.
Sending is no problem, since CreateThread returns the handle of the newly created thread. I thought about creating a standard Windows event loop for processing the messages. But how do I get a window procedure as a target for DispatchMessage without a window?
Standard Windows event loop (from MSDN):
while( (bRet = GetMessage( &msg, NULL, 0, 0 )) != 0)
{
    if (bRet == -1)
    {
        // handle the error and possibly exit
    }
    else
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
}
Why Windows messages? Because they are fast (Windows relies on them) and thread-safe. This case is also special in that there is no need for the second thread to read any data. It just has to receive data, write it to the DB and then wait for the next data to arrive. But that's just what the standard event loop does: GetMessage waits for the data, then the data is processed, and everything starts again. There's even a well-understood, defined signal for terminating the thread - WM_QUIT.
Other synchronizing constructs block one of the threads every now and then (critical section, semaphore, mutex). As for the events mentioned in the comment - I don't know them.
It might seem contrary to common sense, but for messages that don't have windows, it's actually better to create a hidden window with your window proc than to manually filter the results of GetMessage() in a message pump.
The fact that you have an HWND means that as long as the right thread has a message pump going, the message is going to get routed somewhere. Consider that many functions, even internal Win32 ones, have their own message pumps (for example MessageBox()). And the code for MessageBox() isn't going to know to invoke your custom code after its GetMessage(), unless there's a window handle and window proc that DispatchMessage() will know about.
By creating a hidden window, you're covered by any message pump running in your thread, even if it isn't written by you.
EDIT: but don't just take my word for it, check these articles from Microsoft's Raymond Chen.
Thread messages are eaten by modal loops
Why do messages posted by PostThreadMessage disappear?
Why isn't there a SendThreadMessage function?
NOTE: Refer to this code only when you don't need any UI-related or COM-related code. Other than such corner cases, this code works correctly; it is especially good for a purely computation-bound worker thread.
DispatchMessage and TranslateMessage are not necessary if the thread does not have a window, so simply ignore them. HWND has nothing to do with your scenario; you don't actually need to create any window at all. Note that those two *Message functions are needed to handle Windows-UI-related messages such as WM_KEYDOWN and WM_PAINT.
I also prefer Windows messages for synchronizing and communicating between threads, using PostThreadMessage and GetMessage, or PeekMessage. I wanted to cut and paste from my code, but I'll just briefly sketch the idea.
#define WM_MY_THREAD_MESSAGE_X (WM_USER + 100)
#define WM_MY_THREAD_MESSAGE_Y (WM_USER + 101) // was (WM_USER + 100) twice; an apparent copy-paste slip

// Worker Thread: No Window in this thread
unsigned int CALLBACK WorkerThread(void* data)
{
    MSG msg;
    BOOL bRet;
    // Get the master thread's ID
    DWORD master_tid = ...;
    while( (bRet = GetMessage( &msg, NULL, 0, 0 )) != 0)
    {
        if (bRet == -1)
        {
            // handle the error and possibly exit
        }
        else
        {
            if (msg.message == WM_MY_THREAD_MESSAGE_X)
            {
                // Do your task
                // If you want to respond:
                PostThreadMessage(master_tid, WM_MY_THREAD_MESSAGE_X, ... ...);
            }
            //...
            if (msg.message == WM_QUIT)
                break;
        }
    }
    return 0;
}

// In the Master Thread
//
// Spawn the worker thread
CreateThread( ... WorkerThread ... &worker_tid);

// Send a message to the worker thread
PostThreadMessage(worker_tid, WM_MY_THREAD_MESSAGE_X, ... ...);

// If you want the worker thread to quit:
// (PostQuitMessage posts WM_QUIT only to the calling thread,
// so post WM_QUIT to the worker directly instead)
PostThreadMessage(worker_tid, WM_QUIT, 0, 0);

// If you want to receive messages from the worker thread, it's simple:
// you just need to write a message handler for WM_MY_THREAD_MESSAGE_X
LRESULT OnMyThreadMessage(WPARAM, LPARAM)
{
    ...
}
I'm not sure whether this is exactly what you wanted, but I think the code is very easy to understand. In general, a thread is created without a message queue; once a window-message-related function is called, the message queue for the thread is initialized. Please note again that no window is necessary to post/receive window messages.
You don't need a window procedure in your thread unless the thread has actual windows to manage. Once the thread has called Peek/GetMessage(), it already has the same message that a window procedure would receive, and thus can act on it immediately. Dispatching the message is only necessary when actual windows are involved. It is a good idea to dispatch any messages that you do not care about, in case other objects used by your thread have their own windows internally (ActiveX/COM does, for instance). For example:
while( (bRet = GetMessage(&msg, NULL, 0, 0)) != 0 )
{
    if (bRet == -1)
    {
        // handle the error and possibly exit
    }
    else
    {
        switch( msg.message )
        {
            case ...: // process a message
                ...
                break;
            case ...: // process a message
                ...
                break;
            default: // everything else
                TranslateMessage(&msg);
                DispatchMessage(&msg);
                break;
        }
    }
}
