goroutine or multithreading is not working in golang - multithreading

I was trying to implement multithreading in golang. I am able to implement go routines but it is not working as expected. below is the sample program which i have prepared,
func test(s string, fo *os.File) {
var s1 [105]int
count :=0
for x :=1000; x<1101;x++ {
s1[count] = x;
count++
}
//fmt.Println(s1[0])
for i := range s1 {
runtime.Gosched()
sd := s + strconv.Itoa(i)
var fileMutex sync.Mutex
fileMutex.Lock()
fmt.Fprintf(fo,sd)
defer fileMutex.Unlock()
}
}
func main() {
fo,err :=os.Create("D:/Output.txt")
if err != nil {
panic(err)
}
for i := 0; i < 4; i++ {
go test("bye",fo)
}
}
OUTPUT - good0bye0bye0bye0bye0good1bye1bye1bye1bye1good2bye2bye2bye2bye2.... etc.
the above program will create a file and write "Hello" and "bye" in the file.
My problem is i am trying to create 5 thread and wanted to process different values values with different thread. if you will see the above example it is printing "bye" 4 times.
i wanted output like below using 5 thread,
good0bye0good1bye1good2bye2....etc....
any idea how can i achieve this?

First, you need to block in your main function until all other goroutines return. The mutexes in your program aren't blocking anything, and since they're re-initialized in each loop, they don't even block within their own goroutine. You can't defer an unlock if you're not returning from the function, you need to explicitly unlock in each iteration of the loop. You aren't using any of the values in your array (though you should use a slice instead), so we can drop that entirely. You also don't need runtime.GoSched in a well-behaved program, and it does nothing here.
An equivalent program that will run to completion would look like:
var wg sync.WaitGroup
var fileMutex sync.Mutex
func test(s string, fo *os.File) {
defer wg.Done()
for i := 0; i < 105; i++ {
fileMutex.Lock()
fmt.Fprintf(fo, "%s%d", s, i)
fileMutex.Unlock()
}
}
func main() {
fo, err := os.Create("D:/output.txt")
if err != nil {
log.Fatal(err)
}
for i := 0; i < 4; i++ {
wg.Add(1)
go test("bye", fo)
}
wg.Wait()
}
Finally though, there's no reason to try and write serial values to a single file from multiple goroutines, and it's less efficient to do so. If you want the values ordered over the entire file, you will need to use a single goroutine anyway.

Related

How do I make this program thread-safe, would channels be the best implementation, if so, how?

I'm using Golang, I'm trying to make this program thread-safe. It takes a number as a parameter (which is the number of consumer tasks to start), reads lines from an input, and accumulates word count. I want the threads to be safe (but I don't want it to just lock everything, it needs to be efficient) should I use channels? How do I do this?
package main
import (
"bufio"
"fmt"
"log"
"os"
"sync"
)
// Consumer task to operate on queue
func consumer_task(task_num int) {
fmt.Printf("I'm consumer task #%v ", task_num)
fmt.Println("Line being popped off queue: " + queue[0])
queue = queue[1:]
}
// Initialize queue
var queue = make([]string, 0)
func main() {
// Initialize wait group
var wg sync.WaitGroup
// Get number of tasks to run from user
var numof_tasks int
fmt.Print("Enter number of tasks to run: ")
fmt.Scan(&numof_tasks)
// Open file
file, err := os.Open("test.txt")
if err != nil {
log.Fatal(err)
}
defer file.Close()
// Scanner to scan the file
scanner := bufio.NewScanner(file)
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
// Loop through each line in the file and append it to the queue
for scanner.Scan() {
line := scanner.Text()
queue = append(queue, line)
}
// Start specified # of consumer tasks
for i := 1; i <= numof_tasks; i++ {
wg.Add(1)
go func(i int) {
consumer_task(i)
wg.Done()
}(i)
}
wg.Wait()
fmt.Println("All done")
fmt.Println(queue)
}
You have a data race on the slice queue. Concurrent goroutines, when popping elements off the head of the queue to do so in a controlled manner either via a sync.Mutex lock. Or use a channel to manage the "queue" of work items.
To convert what you have to using channels, update the worker to take an input channel as your queue - and range on the channel, so each worker can handle more than one task:
func consumer_task(task_num int, ch <-chan string) {
fmt.Printf("I'm consumer task #%v\n", task_num)
for item := range ch {
fmt.Printf("task %d consuming: Line item: %v\n", task_num, item)
}
// each worker will drop out of their loop when channel is closed
}
change queue from a slice to a channel & feed items in like so:
queue := make(chan string)
go func() {
// Loop through each line in the file and append it to the queue
for scanner.Scan() {
queue <- scanner.Text()
}
close(queue) // signal to workers that there is no more items
}()
then just update your work dispatcher code to add the channel input:
go func(i int) {
consumer_task(i, queue) // add the queue parameter
wg.Done()
}(i)
https://go.dev/play/p/AzHyztipUZI

Proper way to optimize multi-threading with WaitGroups and goroutines?

I am trying to make my application run as fast as possible. I purchased a semi-powerful container off of Google Cloud and I am just itching to see how many iterations per second I can get out of this program. However, I am new to Go and so far my implementation is showing to be very messy and not working well.
The way I have it set up now, it will start out at a high rate (around 11,000 iterations per second) but then quickly dwindle down to 2,000. My goal is for a far bigger number than even 11,000. Also, the infofunc(i) function can't seem to keep up with fast speeds and using a goroutine for that function causes overlap of the printing to the console. Also, it will on occasion reuse the WaitGroup before the Wait has returned.
I don't like to be the person to ask to be spoon-fed code, but I am at a loss as to how to implement this. There seems to be so many different methods when it comes to parallelism, multithreading, etc. and it is confusing to me.
import (
"fmt"
"math/big"
"os"
"os/exec"
"sync"
"time"
)
var found = 0
var pages_queried = 0
var start_time = time.Now()
var bignum = new(big.Int)
var foundAddresses = 0
var wg sync.WaitGroup
var set = make(map[string]bool)
var addresses = []string{"6ab42gyr", "lo08n4g6"}
func main() {
bignum.SetString("1000000000000000000000000000", 10)
pick := os.Args[1]
kpp := 128
switch pick {
case "btc":
i := new(big.Int)
i, ok := i.SetString(os.Args[2], 10)
if ok {
cmd := exec.Command("clear")
cmd.Stdout = os.Stdout
cmd.Run()
for i.Cmp(bignum) < 0 {
wg.Add(1)
go func(i *big.Int) {
defer wg.Done()
go printKeys(i.String(), kpp)
i.Add(i, big.NewInt(1))
pages_queried += 1
infofunc(i)
}(i)
wg.Wait()
}
}
}
}
func infofunc(i *big.Int) {
elapsed := time.Now().Sub(start_time)
duration, _ := time.ParseDuration(elapsed.String())
duration2 := int(duration.Seconds())
if duration2 != 0 {
fmt.Printf("\033[5;0H")
fmt.Printf("Started at %s. Found: %d. Elapsed: %s. Queried: %d pages. Current page: %s. Rate: %d/s", start_time.String(), found, elapsed.String(), pages_queried, i.String(), (pages_queried / duration2))
}
}
func printKeys(pageNumber string, keysPerPage int) {
keys := generateKeys(pageNumber, keysPerPage)
length := len(keys)
var addressesLen = len(addresses)
for i := 0; i < length; i++ {
wg.Add(1)
go func(i int) {
defer wg.Done()
for ii := 0; ii < addressesLen; ii++ {
wg.Add(1)
go func(i int, ii int, keys []key) {
defer wg.Done()
for _, v := range addresses {
if set[keys[i].compressed] || set[keys[i].uncompressed] {
fmt.Print("Found an address: " + v + "!\n")
fmt.Printf("%v", keys[i])
fmt.Print("\n")
foundAddresses += 1
found += 1
}
}
}(i, ii, keys)
}
}(i)
foundAddresses = 0
}
}
I would not use a global sync.WaitGroup, it is hard to understand what is happening. Instead, just define it wherever you need.
You are calling wg.Wait() inside the loop block. That is basically blocking the loop every iteration waiting for goroutine to complete. What you really want is to spawn all the goroutines and only then wait for their completition.
if ok {
cmd := exec.Command("clear")
cmd.Stdout = os.Stdout
cmd.Run()
var wg sync.WaitGroup //I am about to spawn goroutines, I need to wait for them
for i.Cmp(bignum) < 0 {
wg.Add(1)
go func(i *big.Int) {
defer wg.Done()
go printKeys(i.String(), kpp)
i.Add(i, big.NewInt(1))
pages_queried += 1
infofunc(i)
}(i)
}
wg.Wait() //Now that all goroutines are working, let's wait
}
You cannot avoid the print overlap when you have multiple goroutines. If that's a problem you might think of using the Go's log stdlib, which will add timestamps for you. Then, you should be able to sort them in chronological order.
Anyway, split the code in more goroutines does not ensure a speed up. If the problem you are trying to solve is intrinsically sequential, then more goroutines will just add more contention and pressure on Go scheduler, leading to the opposite result. More details here. Thus, a goroutine for infofunc will not help. But it can be improved by using a logger library instead of plain fmt package.
func infofunc(i *big.Int) {
duration := time.Since(start_time).Seconds()
if duration != 0 {
log.Printf("\033[5;0H")
log.Printf("Started at %s. Found: %d. Elapsed: %s. Queried: %d pages. Current page: %s. Rate: %d/s", start_time.String(), found, elapsed.String(), pages_queried, i.String(), (pages_queried / duration2))
}
}
For printKeys, I would not create so many goroutines, they are not going to help if work they need to perform is CPU bound, which seems to be the case here.
func printKeys(pageNumber string, keysPerPage int) {
keys := generateKeys(pageNumber, keysPerPage)
length := len(keys)
var addressesLen = len(addresses)
var wg sync.WaitGroup //Local WaitGroup
for i := 0; i < length; i++ {
wg.Add(1)
go func(i int) { //This goroutine could be removed, in my opinion.
defer wg.Done()
for ii := 0; ii < addressesLen; ii++ {
for _, v := range addresses {
if set[keys[i].compressed] || set[keys[i].uncompressed] {
log.Printf("Found an address: %v\n", v)
log.Printf("%v", keys[i])
log.Printf("\n")
foundAddresses += 1
found += 1
}
}
}
}(i)
foundAddresses = 0
}
wg.Wait()
}
I would suggest to write a benchmark on these functions and then enable tracing. In this way you should get an idea where your code is spending most of the time.

How to convert the following Thread statement in go

I am trying to convert the following java statement of threads in go;
int num = 5;
Thread[] threads = new Thread[5];
for (int i = 0; i < num; i++)
{
threads[i] = new Thread(new NewClass(i));
threads[i].start();
}
for (int i = 0; i < numT; i++)
threads[i].join();
I wanted to know, how to convert this to go?
Thanks
Golang uses a concept called "goroutines" (inspired by "coroutines") instead of "threads" used by many other languages.
Your specific example looks like a common use for the sync.WaitGroup type:
wg := sync.WaitGroup{}
for i := 0; i < 5; i++ {
wg.Add(1) // Increment the number of routines to wait for.
go func(x int) { // Start an anonymous function as a goroutine.
defer wg.Done() // Mark this routine as complete when the function returns.
SomeFunction(x)
}(i) // Avoid capturing "i".
}
wg.Wait() // Wait for all routines to complete.
Note that SomeFunction(...) in the example can be work of any sort and will be executed concurrently with all other invocations; however, if it uses goroutines itself then it should be sure to use a similar mechanism as shown here to prevent from returning from the "SomeFunction" until it is actually done with its work.

Simple queue model example

Is there a simple program which demonstrates how queues work in Go.
I just need something like add number 1 to 10 in queue and pull those from the queue in parallel using another thread.
A queue that is safe for concurrent use is basically a language construct: channel.
A channel–by design–is safe for concurrent send and receive. This is detaild here: If I am using channels properly should I need to use mutexes? Values sent on it are received in the order they were sent.
You can read more about channels here: What are golang channels used for?
A very simple example:
c := make(chan int, 10) // buffer for 10 elements
// Producer: send elements in a new goroutine
go func() {
for i := 0; i < 10; i++ {
c <- i
}
close(c)
}()
// Consumer: receive all elements sent on it before it was closed:
for v := range c {
fmt.Println("Received:", v)
}
Output (try it on the Go Playground):
Received: 0
Received: 1
Received: 2
Received: 3
Received: 4
Received: 5
Received: 6
Received: 7
Received: 8
Received: 9
Note that the channel buffer (10 in this example) has nothing to do with the number of elements you want to send "through" it. The buffer tells how many elements the channel may "store", or in other words, how many elements you may send on it without blocking when there are nobody is receiving from it. When the channel's buffer is full, further sends will block until someone starts receiving values from it.
You could use channel(safe for concurrent use) and wait group to read from queue concurrently
package main
import (
"fmt"
"sync"
)
func main() {
queue := make(chan int)
wg := new(sync.WaitGroup)
wg.Add(1)
defer wg.Wait()
go func(wg *sync.WaitGroup) {
for {
r, ok := <-queue
if !ok {
wg.Done()
return
}
fmt.Println(r)
}
}(wg)
for i := 1; i <= 10; i++ {
queue <- i
}
close(queue)
}
Playground link: https://play.golang.org/p/A_Amqcf2gwU
Another option is to create and implement a queue interface, with a backing type of a channel for concurrency. For convenience, I've made a gist.
Here's how you can use it.
queue := GetIntConcurrentQueue()
defer queue.Close()
// queue.Enqueue(1)
// myInt, errQueueClosed := queue.DequeueBlocking()
// myInt, errIfNoInt := queue.DequeueNonBlocking()
Longer example here - https://play.golang.org/p/npb2Uj9hGn1
Full implementation below, and again here's the gist of it.
// Can be any backing type, even 'interface{}' if desired.
// See stackoverflow.com/q/11403050/3960399 for type conversion instructions.
type IntConcurrentQueue interface {
// Inserts the int into the queue
Enqueue(int)
// Will return error if there is nothing in the queue or if Close() was already called
DequeueNonBlocking() (int, error)
// Will block until there is a value in the queue to return.
// Will error if Close() was already called.
DequeueBlocking() (int, error)
// Close should be called with defer after initializing
Close()
}
func GetIntConcurrentQueue() IntConcurrentQueue {
return &intChannelQueue{c: make(chan int)}
}
type intChannelQueue struct {
c chan int
}
func (q *intChannelQueue) Enqueue(i int) {
q.c <- i
}
func (q *intChannelQueue) DequeueNonBlocking() (int, error) {
select {
case i, ok := <-q.c:
if ok {
return i, nil
} else {
return 0, fmt.Errorf("queue was closed")
}
default:
return 0, fmt.Errorf("queue has no value")
}
}
func (q *intChannelQueue) DequeueBlocking() (int, error) {
i, ok := <-q.c
if ok {
return i, nil
}
return 0, fmt.Errorf("queue was closed")
}
func (q *intChannelQueue) Close() {
close(q.c)
}

Does it make sense to make expensive syscalls from different goroutines?

If application does some heavy lifting with multiple file descriptors (e.g., opening - writing data - syncing - closing), what actually happens to Go runtime? Does it block all the goroutines at the time when expensive syscall occures (like syscall.Fsync)? Or only the calling goroutine is blocked while the others are still operating?
So does it make sense to write programs with multiple workers that do a lot of user space - kernel space context switching? Does it make sense to use multithreading patterns for disk input?
package main
import (
"log"
"os"
"sync"
)
var data = []byte("some big data")
func worker(filenamechan chan string, wg *sync.waitgroup) {
defer wg.done()
for {
filename, ok := <-filenamechan
if !ok {
return
}
// open file is a quite expensive operation due to
// the opening new descriptor
f, err := os.openfile(filename, os.o_create|os.o_wronly, os.filemode(0644))
if err != nil {
log.fatal(err)
continue
}
// write is a cheap operation,
// because it just moves data from user space to the kernel space
if _, err := f.write(data); err != nil {
log.fatal(err)
continue
}
// syscall.fsync is a disk-bound expensive operation
if err := f.sync(); err != nil {
log.fatal(err)
continue
}
if err := f.close(); err != nil {
log.fatal(err)
}
}
}
func main() {
// launch workers
filenamechan := make(chan string)
wg := &sync.waitgroup{}
for i := 0; i < 2; i++ {
wg.add(1)
go worker(filenamechan, wg)
}
// send tasks to workers
filenames := []string{
"1.txt",
"2.txt",
"3.txt",
"4.txt",
"5.txt",
}
for i := range filenames {
filenamechan <- filenames[i]
}
close(filenamechan)
wg.wait()
}
https://play.golang.org/p/O0omcPBMAJ
If a syscall blocks, the Go runtime will launch a new thread so that the number of threads available​ to run goroutines remains the same.
A fuller explanation can be found here: https://morsmachine.dk/go-scheduler

Resources