A Go map thread safety problem while reading a cache DIY book - multithreading

I am reading a book that teaches how to write a simple Redis-like cache.
The goal is a distributed hash, so the project needs key migration, which in turn needs an iterator. I think the book's iterator has a problem.
The book iterates over a map, but it does not hold the read lock continuously during the iteration: it releases the lock while sending each pair so that the main cache path is not blocked. I believe this is not thread safe, because the main cache goroutine can still write to the map in between. I wrote a demo, but I am not sure it proves the point.
//book code
type inMemoryScanner struct {
pair
pairCh chan *pair
closeCh chan struct{}
}
func (c *inMemoryCache) NewScanner() Scanner {
pairCh := make(chan *pair)
closeCh := make(chan struct{})
go func() {
defer close(pairCh)
c.mutex.RLock()
//the c.c is book's map
for k, v := range c.c {
c.mutex.RUnlock()
select {
case <-closeCh:
return
case pairCh <- &pair{k, v}:
}
c.mutex.RLock()
}
c.mutex.RUnlock()
}()
return &inMemoryScanner{pair{}, pairCh, closeCh}
}
//my demo
func main() {
testMap := make(map[string]string)
mutex := sync.RWMutex{}
for i := 0; i < 64; i ++ {
mutex.Lock()
testMap[uuid.New().String()] = uuid.New().String()
mutex.Unlock()
fmt.Println("Write")
}
go func() {
for {
mutex.Lock()
testMap[uuid.New().String()] = uuid.New().String()
time.Sleep(100 * time.Millisecond)
mutex.Unlock()
fmt.Println("Write")
}
} ()
for k, v := range testMap {
mutex.RLock()
fmt.Println("k" + k + "v" + v)
mutex.RUnlock()
time.Sleep(100 * time.Millisecond)
}
}
In my demo, the number of 'Write' lines and the number of map entries printed are not equal! And I believe that in a real project rebalancing can't happen just once; it has to be continuous background work, doesn't it?

You have a data race. Your results are undefined.
Simplifying your code so that it compiles and runs,
package main
import (
"sync"
"time"
)
func main() {
testMap := make(map[string]string)
mutex := sync.RWMutex{}
for i := 0; i < 64; i++ {
mutex.Lock()
now := time.Now().String()
testMap[now] = now
mutex.Unlock()
}
go func() {
for {
mutex.Lock()
now := time.Now().String()
testMap[now] = now
time.Sleep(100 * time.Millisecond)
mutex.Unlock()
}
}()
for k, v := range testMap {
mutex.RLock()
_, _ = k, v
mutex.RUnlock()
time.Sleep(100 * time.Millisecond)
}
}
Output:
$ go run -race racer.go
==================
WARNING: DATA RACE
Read at 0x00c00008c060 by main goroutine:
runtime.mapiternext()
/home/peter/go/src/runtime/map.go:844 +0x0
main.main()
/home/peter/gopath/src/racer.go:26 +0x217
Previous write at 0x00c00008c060 by goroutine 5:
runtime.mapassign_faststr()
/home/peter/go/src/runtime/map_faststr.go:202 +0x0
main.main.func1()
/home/peter/gopath/src/racer.go:21 +0x9b
Goroutine 5 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:17 +0x17b
==================
==================
WARNING: DATA RACE
Read at 0x00c0000a6638 by main goroutine:
main.main()
/home/peter/gopath/src/racer.go:26 +0x1d0
Previous write at 0x00c0000a6638 by goroutine 5:
main.main.func1()
/home/peter/gopath/src/racer.go:21 +0xb0
Goroutine 5 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:17 +0x17b
==================
fatal error: concurrent map iteration and map write
goroutine 1 [running]:
runtime.throw(0x4b1eb7, 0x26)
/home/peter/go/src/runtime/panic.go:617 +0x72 fp=0xc000059e48 sp=0xc000059e18 pc=0x44d722
runtime.mapiternext(0xc000059f28)
/home/peter/go/src/runtime/map.go:851 +0x55e fp=0xc000059ed0 sp=0xc000059e48 pc=0x434c2e
main.main()
/home/peter/gopath/src/racer.go:26 +0x218 fp=0xc000059f98 sp=0xc000059ed0 pc=0x48a1f8
runtime.main()
/home/peter/go/src/runtime/proc.go:200 +0x20c fp=0xc000059fe0 sp=0xc000059f98 pc=0x44f06c
runtime.goexit()
/home/peter/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000059fe8 sp=0xc000059fe0 pc=0x475751
goroutine 4 [sleep]:
runtime.goparkunlock(...)
/home/peter/go/src/runtime/proc.go:307
time.Sleep(0x5f5e100)
/home/peter/go/src/runtime/time.go:105 +0x159
main.main.func1(0xc00001c280, 0xc00008c060)
/home/peter/gopath/src/racer.go:22 +0x3e
created by main.main
/home/peter/gopath/src/racer.go:17 +0x17c
exit status 2
$
You are not locking the map reads,
for k, v := range testMap {
mutex.RLock()
_, _ = k, v
mutex.RUnlock()
time.Sleep(100 * time.Millisecond)
}
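The range clause in for k, v := range testMap { ... } is what reads the map, and it runs outside the lock. k and v are local variables, so locking around the loop body protects nothing that is shared.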
You need to lock the map reads,
mutex.RLock()
for k, v := range testMap {
_, _ = k, v
time.Sleep(100 * time.Millisecond)
}
mutex.RUnlock()
Go: Data Race Detector
The Go Blog: Introducing the Go Race Detector
GopherCon 2016: Keith Randall - Inside the Map Implementation
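Holding the read lock for the whole range loop fixes the race, but it also blocks writers while a slow consumer drains the scanner, which is what the book was trying to avoid. One possible compromise, sketched below under assumptions, is to copy the entries into a slice while holding the read lock and then send the copy without any lock. The cache and pair types and the snapshot and Scan methods are hypothetical stand-ins, not the book's code.

```go
package main

import (
	"fmt"
	"sync"
)

// pair and cache are hypothetical stand-ins for the book's types.
type pair struct {
	k, v string
}

type cache struct {
	mutex sync.RWMutex
	c     map[string]string
}

// Set stores a value under the write lock.
func (c *cache) Set(k, v string) {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	c.c[k] = v
}

// snapshot copies every entry while holding the read lock for the
// whole iteration, so no write can interleave with the range loop.
func (c *cache) snapshot() []pair {
	c.mutex.RLock()
	defer c.mutex.RUnlock()
	pairs := make([]pair, 0, len(c.c))
	for k, v := range c.c {
		pairs = append(pairs, pair{k, v})
	}
	return pairs
}

// Scan sends the snapshot on a channel without holding any lock, so a
// slow consumer never blocks writers. Closing done stops the scan early.
func (c *cache) Scan(done <-chan struct{}) <-chan pair {
	out := make(chan pair)
	pairs := c.snapshot()
	go func() {
		defer close(out)
		for _, p := range pairs {
			select {
			case <-done:
				return
			case out <- p:
			}
		}
	}()
	return out
}

func main() {
	c := &cache{c: make(map[string]string)}
	for i := 0; i < 64; i++ {
		c.Set(fmt.Sprintf("key-%d", i), "value")
	}
	done := make(chan struct{})
	defer close(done)
	n := 0
	for range c.Scan(done) {
		c.Set("written-during-scan", "value") // safe: no lock is held while sending
		n++
	}
	fmt.Println("scanned", n, "pairs")
}
```

The trade-off is that the scan sees the map as it was when the snapshot was taken: keys written afterwards are not included, and the copy costs memory proportional to the map size.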

Related

Issue modifying map from goroutine func

scores := make(map[string]int)
percentage := make(map[string]float64)
total := 0
for i, ans := range answers {
answers[i] = strings.ToLower(ans)
}
wg := sync.WaitGroup{}
go func() {
wg.Add(1)
body, _ := google(question)
for _, ans := range answers {
count := strings.Count(body, ans)
total += count
scores[ans] += 5 // <------------------- This doesn't work
}
wg.Done()
}()
Here's a snippet of code. My issue is that I am unable to modify scores. I've tried using pointers, I've tried doing it normally, and I've tried passing it as a parameter.
Package sync
import "sync"
type WaitGroup
A WaitGroup waits for a collection of goroutines to finish. The main
goroutine calls Add to set the number of goroutines to wait for. Then
each of the goroutines runs and calls Done when finished. At the same
time, Wait can be used to block until all goroutines have finished.
You have provided us with a non-working fragment of code. See How to create a Minimal, Complete, and Verifiable example.
As a guess, your use of a sync.WaitGroup looks strange. For example, by simply following the instructions in the sync.WaitGroup documentation, I would expect something more like the following:
package main
import (
"fmt"
"strings"
"sync"
)
func google(string) (string, error) { return "yes", nil }
func main() {
question := "question?"
answers := []string{"yes", "no"}
scores := make(map[string]int)
total := 0
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
body, _ := google(question)
for _, ans := range answers {
count := strings.Count(body, ans)
total += count
scores[ans] += 5 // <-- This does work
}
}()
wg.Wait()
fmt.Println(scores, total)
}
Playground: https://play.golang.org/p/sZmB2Dc5RjL
Output:
map[yes:5 no:5] 1
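The example above has a single goroutine, so the WaitGroup alone is enough. If several goroutines update the same map, the map also needs a lock. The following is a minimal sketch of that case; countAll and its inputs are hypothetical and are not taken from the question.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// countAll counts each answer in each body concurrently. It is a
// hypothetical helper, not part of the original question.
func countAll(bodies, answers []string) (map[string]int, int) {
	scores := make(map[string]int)
	total := 0
	var mu sync.Mutex // guards scores and total
	var wg sync.WaitGroup
	for _, body := range bodies {
		wg.Add(1) // Add before starting the goroutine, never inside it
		go func(body string) {
			defer wg.Done()
			for _, ans := range answers {
				count := strings.Count(body, ans)
				mu.Lock()
				total += count
				scores[ans] += count
				mu.Unlock()
			}
		}(body)
	}
	wg.Wait() // every write has finished once Wait returns
	return scores, total
}

func main() {
	scores, total := countAll(
		[]string{"yes yes no", "no maybe yes"},
		[]string{"yes", "no"},
	)
	fmt.Println(scores["yes"], scores["no"], total) // prints "3 2 5"
}
```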

Go channel takes each letter as string instead of the whole string

I'm creating a simple channel that takes string values. But apparently I'm pushing each letter in the string instead of the whole string in each loop.
I'm probably missing something very fundamental. What am I doing wrong ?
https://play.golang.org/p/-6E-f7ALbD
Code:
func doStuff(s string, ch chan string) {
ch <- s
}
func main() {
c := make(chan string)
loops := [5]int{1, 2, 3, 4, 5}
for i := 0; i < len(loops); i++ {
go doStuff("helloooo", c)
}
results := <-c
fmt.Println("channel size = ", len(results))
// print the items in channel
for _, r := range results {
fmt.Println(string(r))
}
}
Your code sends strings on the channel properly:
func doStuff(s string, ch chan string){
ch <- s
}
The problem is at the receiver side:
results := <- c
fmt.Println("channel size = ", len(results))
// print the items in channel
for _,r := range results {
fmt.Println(string(r))
}
results will be a single value received from the channel (the first value sent on it). And you print the length of this string.
Then you loop over this string (results) using a for range which loops over its runes, and you print those.
What you want is to loop over the values of the channel:
// print the items in channel
for s := range c {
fmt.Println(s)
}
When run, this prints the received values and then stops with a fatal runtime error:
fatal error: all goroutines are asleep - deadlock!
This happens because you never close the channel, and a for range over a channel runs until the channel is closed. So you have to close the channel at some point.
For example let's wait 1 second, then close it:
go func() {
time.Sleep(time.Second)
close(c)
}()
This way your app will run and quit after 1 second. Try it on the Go Playground.
Another, nicer solution is to use sync.WaitGroup: this waits until all goroutines are done doing their work (sending a value on the channel), then it closes the channel (so there is no unnecessary wait / delay).
var wg = sync.WaitGroup{}
func doStuff(s string, ch chan string) {
ch <- s
wg.Done()
}
// And in main():
for i := 0; i < len(loops); i++ {
wg.Add(1)
go doStuff("helloooo", c)
}
go func() {
wg.Wait()
close(c)
}()
Try this one on the Go Playground.
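For reference, the WaitGroup fragments above can be assembled into one runnable program. This is a sketch: the collect helper is a hypothetical name introduced here so that the WaitGroup stays local instead of being a package-level variable.

```go
package main

import (
	"fmt"
	"sync"
)

// collect starts n senders, closes the channel once all of them have
// sent, and returns every value received.
func collect(n int, msg string) []string {
	c := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c <- msg
		}()
	}
	// Close the channel only after every sender is done, so the
	// range loop below terminates instead of deadlocking.
	go func() {
		wg.Wait()
		close(c)
	}()
	var results []string
	for s := range c {
		results = append(results, s)
	}
	return results
}

func main() {
	for _, s := range collect(5, "helloooo") {
		fmt.Println(s)
	}
}
```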
Notes:
To repeat something 5 times, you don't need that ugly loops array. Simply do:
for i := 0; i < 5; i++ {
// Do something
}
The reason you are getting back letters instead of the whole string is that you assign a single value received from the channel to a variable and then iterate over that variable. In your case the value is a string, and in Go a for range loop over a string yields its runes.
You can simply print the received value without iterating over it.
package main
import (
"fmt"
)
func doStuff(s string, ch chan string){
ch <- s
}
func main() {
c := make(chan string)
loops := [5]int{1,2,3,4,5}
for i := 0; i < len(loops) ; i++ {
go doStuff("helloooo", c)
}
results := <- c
fmt.Println("channel size = ", len(results))
fmt.Println(results) // will print helloooo
}
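To make the string-versus-channel distinction concrete, here is a small sketch showing what a for range loop over a single received string yields. The runesOf helper and the sample value are assumptions chosen for illustration; the non-ASCII letter is there to show that len counts bytes, not runes.

```go
package main

import "fmt"

// runesOf returns what a for-range loop over s yields, one string per rune.
func runesOf(s string) []string {
	var out []string
	for _, r := range s {
		out = append(out, string(r))
	}
	return out
}

func main() {
	c := make(chan string, 1)
	c <- "héllo"
	results := <-c // one whole string, not a collection of strings

	fmt.Println(len(results))          // 6: length in bytes, not "channel size"
	fmt.Println(len(runesOf(results))) // 5: number of runes the loop visits
}
```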

Synchronisation of threads in Go lang

I want to understand a bit more about how synchronisation of threads works in Go. Below is a working version of my program, which uses a done channel for synchronisation.
package main
import (
. "fmt"
"runtime"
)
func Goroutine1(i_chan chan int, done chan bool) {
for x := 0; x < 1000000; x++ {
i := <-i_chan
i++
i_chan <- i
}
done <- true
}
func Goroutine2(i_chan chan int, done chan bool) {
for x := 0; x < 1000000; x++ {
i := <-i_chan
i--
i_chan <- i
}
done <- true
}
func main() {
i_chan := make(chan int, 1)
done := make(chan bool, 2)
i_chan <- 0
runtime.GOMAXPROCS(runtime.NumCPU())
go Goroutine1(i_chan, done)
go Goroutine2(i_chan, done)
<-done
<-done
Printf("This is the value of i:%d\n", <-i_chan)
}
However, the problem appears when I run it without any synchronisation, using only a sleep and no channel to signal when the goroutines are done:
const MAX = 1000000
func Goroutine1(i_chan chan int) {
for x := 0; x < MAX-23; x++ {
i := <-i_chan
i++
i_chan <- i
}
}
func main() {
i_chan := make(chan int, 1)
i_chan <- 0
runtime.GOMAXPROCS(runtime.NumCPU())
go Goroutine1(i_chan)
go Goroutine2(i_chan)
time.Sleep(100 * time.Millisecond)
Printf("This is the value of i:%d\n", <-i_chan)
}
It prints the wrong value of i. If you extend the wait to, say, 1 second, it finishes and prints the correct value. I more or less understand that this is because both goroutines have not finished before main prints what is on i_chan; I'm just curious about how this works.
Note that your first example would deadlock, since it never calls Goroutine2 (the OP has since edited the question).
If it calls Goroutine2, then the expected value of i is indeed 0.
Without synchronization, (as in this example), there is no guarantee that the main() doesn't exit before the completion of Goroutine1() and Goroutine2().
For a 1000000 loop, a 1 millisecond wait seems enough, but again, no guarantee.
func main() {
i_chan := make(chan int, 1)
i_chan <- 0
runtime.GOMAXPROCS(runtime.NumCPU())
go Goroutine2(i_chan)
go Goroutine1(i_chan)
time.Sleep(1 * time.Millisecond)
Printf("This is the value of i:%d\n", <-i_chan)
}
See more at "How to Wait for All Goroutines to Finish Executing Before Continuing", where the canonical way is to use the sync package's WaitGroup structure, as in this runnable example.
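As a sketch of the WaitGroup approach for this particular program, the two goroutines can be folded into one closure that adds a delta to the value held in the one-slot channel. The run and step names are hypothetical; the logic otherwise mirrors the question's code.

```go
package main

import (
	"fmt"
	"sync"
)

// run starts one incrementing and one decrementing goroutine, each doing
// n steps on the value held in a one-slot channel, and returns the result.
func run(n int) int {
	iChan := make(chan int, 1)
	iChan <- 0

	var wg sync.WaitGroup
	step := func(delta int) {
		defer wg.Done()
		for x := 0; x < n; x++ {
			i := <-iChan
			iChan <- i + delta
		}
	}

	wg.Add(2)
	go step(+1)
	go step(-1)
	wg.Wait() // blocks until both goroutines have finished, however long that takes

	return <-iChan
}

func main() {
	fmt.Printf("This is the value of i:%d\n", run(1000000))
}
```

Unlike time.Sleep, wg.Wait does not depend on guessing how long the work takes, so the printed value is 0 regardless of machine speed.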

Go channels and I/O

The first function, readF2C, takes a filename and a channel, reads from the file, and sends the values into the channel.
The second function, writeC2F, takes two channels and a filename, takes a value from each channel, and saves the lower value in the output file. I'm sure there are a few syntax errors, but I'm new to Go.
package main
import (
"fmt"
"bufio"
"os"
"strconv"
)
func main() {
fmt.Println("Hello World!\n\n")
cs1 := make (chan int)
var nameinput string = "input.txt"
readF2C(nameinput,cs1)
cs2 := make (chan int)
cs3 := make (chan int)
cs2 <- 10
cs2 <- 16
cs2 <- 7
cs2 <- 2
cs2 <- 5
cs3 <- 8
cs3 <- 15
cs3 <- 14
cs3 <- 1
cs3 <- 6
var nameoutput string = "output.txt"
writeC2F (nameoutput,cs2,cs3)
}
func readF2C (fn string, ch chan int){
f,err := os.Open(fn)
r := bufio.NewReader(f)
for err != nil { // not end of file
fmt.Println(r.ReadString('\n'))
ch <- r.ReadString('\n')
}
if err != nil {
fmt.Println(r.ReadString('\n'))
ch <- -1
}
}
func writeC2F(fn string, // output text file
ch1 chan int, // first input channel
ch2 chan int){
var j int = 0
var channel1temp int
var channel2temp int
f,_ := os.Create(fn)
w := bufio.NewWriter(f)
channel1temp = <-ch1
channel2temp = <-ch2
for j := 1; j <= 5; j++ {
if (channel2temp < channel1temp){
n4, err := w.WriteString(strconv.Itoa(channel1temp))
} else{
n4, err := w.WriteString(strconv.Itoa(channel2temp))
}
w.flush()
}
}
This is the error messages I get:
prog.go:38: multiple-value r.ReadString() in single-value context
prog.go:65: w.flush undefined (cannot refer to unexported field or method bufio.(*Writer)."".flush)
There are multiple errors:
1)
Unlike C, Go requires the opening curly brace to be on the same line as the statement. So for an if statement (and the same for func), instead of doing it like this:
if (channel2temp < channel1temp)
{
use this
if channel2temp < channel1temp {
2)
There is no while in Go. Use for
for {
...
}
or
for channel1temp != -1 || channel2temp != -1 {
...
}
3)
Usage of non-declared variables. Often easy to fix by making a short variable declaration the first time you initialize the variable. So instead of:
r = bufio.NewReader(file)
use
r := bufio.NewReader(file)
4)
Trying to assign a multi-value return to a single variable. If a function returns two values and you only need one, the value you don't want can be discarded by assigning it to _. So instead of:
file := os.Open(fn)
use
file, _ := os.Open(fn)
but best practice would be to catch that error:
file, err := os.Open(fn)
if err != nil {
panic(err)
}
There are more errors on top of this, but maybe it will get you started.
I also suggest reading Effective Go since it will explain many of the things I've just mentioned.
Edit:
And there is certainly help online. It might be a new language, but the online material is really useful. Below are a few resources that I used when learning Go:
Effective Go: Good document on how to write idiomatic Go code
The Go programming language Tour: Online tour of Go with interactive examples.
Go By Example: Interactive examples of Go programs, starting with Hello World.
Go Specification: Surprisingly readable for being a specification. Maybe not a start point, but very useful.

throw: all goroutines are asleep - deadlock

Given the following simple Go program
package main
import (
"fmt"
)
func total(ch chan int) {
res := 0
for iter := range ch {
res += iter
}
ch <- res
}
func main() {
ch := make(chan int)
go total(ch)
ch <- 1
ch <- 2
ch <- 3
fmt.Println("Total is ", <-ch)
}
I am wondering if someone can enlighten me as to why I get
throw: all goroutines are asleep - deadlock!
thank you
As you never close the ch channel, the range loop will never finish.
You can't send back the result on the same channel. A solution is to use a different one.
Your program could be adapted like this :
package main
import (
"fmt"
)
func total(in chan int, out chan int) {
res := 0
for iter := range in {
res += iter
}
out <- res // sends back the result
}
func main() {
ch := make(chan int)
rch := make(chan int)
go total(ch, rch)
ch <- 1
ch <- 2
ch <- 3
close (ch) // this will end the loop in the total function
result := <- rch // waits for total to give the result
fmt.Println("Total is ", result)
}
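The same two-channel idea can be packaged so that the summing function creates and returns its own result channel. This is only a sketch of one alternative shape; the directional channel types and the buffered result channel are choices made here, not part of the answer above.

```go
package main

import "fmt"

// total sums everything received on in and delivers the sum on the
// returned channel once in has been closed.
func total(in <-chan int) <-chan int {
	out := make(chan int, 1) // buffered so the goroutine can always finish
	go func() {
		res := 0
		for v := range in {
			res += v
		}
		out <- res
	}()
	return out
}

func main() {
	ch := make(chan int)
	result := total(ch)
	for _, v := range []int{1, 2, 3} {
		ch <- v
	}
	close(ch) // ends the range loop inside total
	fmt.Println("Total is", <-result)
}
```

Declaring in as receive-only and returning a receive-only channel lets the compiler reject the original mistake of sending the result back on the input channel.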
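This also works: if the receiver knows how many values to expect, it can receive exactly that many with a counted loop instead of ranging over the channel, so the channel never needs to be closed.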
package main
import "fmt"
func main() {
c := make(chan int)
go do(c)
c <- 1
c <- 2
// close(c)
fmt.Println("Total is ", <-c)
}
func do(c chan int) {
res := 0
// for v := range c {
// res = res + v
// }
for i := 0; i < 2; i++ {
res += <-c
}
c <- res
fmt.Println("something")
}

Resources