Go program slowing down when increasing number of goroutines - multithreading

I'm doing a small project for my parallelism course and have tried it with buffered channels, unbuffered channels, and without channels using pointers to slices, etc. I've also tried to optimize it as much as possible (not in its current state), but I still get the same result: increasing the number of goroutines (even by 1) slows down the whole program. Can someone please tell me what I'm doing wrong, and whether a parallel speedup is even possible in this situation?
Here is part of the code:
func main() {
	rand.Seed(time.Now().UnixMicro())

	numAgents := 2
	fmt.Println("Please pick a number of goroutines: ")
	fmt.Scanf("%d", &numAgents)

	numFiles := 4
	fmt.Println("How many files do you want?")
	fmt.Scanf("%d", &numFiles)

	start := time.Now()
	numAssist := numFiles
	channel := make(chan []File, numAgents)
	files := make([]File, 0)

	for i := 0; i < numAgents; i++ {
		if i == numAgents-1 {
			go generateFiles(numAssist, channel)
		} else {
			go generateFiles(numFiles/numAgents, channel)
			numAssist -= numFiles / numAgents
		}
	}

	for i := 0; i < numAgents; i++ {
		files = append(files, <-channel...)
	}

	elapsed := time.Since(start)
	fmt.Printf("Function took %s\n", elapsed)
}
func generateFiles(numFiles int, channel chan []File) {
	magicNumbersMap := getMap()
	files := make([]File, 0)

	for i := 0; i < numFiles; i++ {
		content := randElementFromMap(&magicNumbersMap)
		length := rand.Intn(400) + 100
		hexSlice := getHex()

		for j := 0; j < length; j++ {
			content = content + hexSlice[rand.Intn(len(hexSlice))]
		}

		hash := getSHA1Hash([]byte(content))
		file := File{
			content: content,
			hash:    hash,
		}
		files = append(files, file)
	}

	channel <- files
}
My expectation was that adding goroutines would make the program run faster, up to a certain number of goroutines, after which adding more would give the same execution time or make it only slightly slower.
EDIT: All the functions that are used:
import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"math/rand"
	"time"
)

type File struct {
	content string
	hash    string
}

func getMap() map[string]string {
	return map[string]string{
		"D4C3B2A1": "Libcap file format",
		"EDABEEDB": "RedHat Package Manager (RPM) package",
		"4C5A4950": "lzip compressed file",
	}
}

func getHex() []string {
	return []string{
		"0", "1", "2", "3", "4", "5",
		"6", "7", "8", "9", "A", "B",
		"C", "D", "E", "F",
	}
}

func randElementFromMap(m *map[string]string) string {
	x := rand.Intn(len(*m))
	for k := range *m {
		if x == 0 {
			return k
		}
		x--
	}
	return "Error"
}

func getSHA1Hash(content []byte) string {
	h := sha1.New()
	h.Write(content)
	return base64.URLEncoding.EncodeToString(h.Sum(nil))
}

Simply speaking, the file generation code is not complex enough to justify parallel execution. All the context switching and moving data through the channel eat up the benefit of parallel processing.
If you add something like time.Sleep(time.Millisecond * 10) inside the loop in your generateFiles function, as if it were doing something more complex, you'll see what you expected to see: more goroutines make it faster. But again, only up to a certain level, where the extra cost of coordinating the parallel processing outweighs the benefit (see the sketch below).
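For illustration, a minimal sketch of that experiment; the 10 ms value is arbitrary and just stands in for heavier per-file work:
// Inside the loop in generateFiles: simulate more expensive per-file work
// so the goroutines have something substantial to run in parallel.
for i := 0; i < numFiles; i++ {
	time.Sleep(10 * time.Millisecond) // stand-in for heavier computation
	// ... generate the content, hash it and append the File as before ...
}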
Note also that the execution time of the last bit of your program:
for i := 0; i < numAgents; i++ {
	files = append(files, <-channel...)
}
directly depends on the number of goroutines. Since all goroutines finish at approximately the same time, this loop is almost never executed in parallel with your workers, and the time it takes to run is simply added to the total time.
Next, when you append to the files slice multiple times, it has to grow several times and copy the data over to a new location. You can avoid this by creating the slice with enough capacity for all the resulting elements up front (luckily, you know exactly how many you'll need). A sketch follows.
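A minimal sketch of that change in main (the total count is known before the workers start):
// Reserve capacity for every file up front so append never reallocates.
files := make([]File, 0, numFiles)
for i := 0; i < numAgents; i++ {
	files = append(files, <-channel...)
}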

Related

Issue modifying map from goroutine func

scores := make(map[string]int)
percentage := make(map[string]float64)
total := 0

for i, ans := range answers {
	answers[i] = strings.ToLower(ans)
}

wg := sync.WaitGroup{}
go func() {
	wg.Add(1)
	body, _ := google(question)
	for _, ans := range answers {
		count := strings.Count(body, ans)
		total += count
		scores[ans] += 5 // <------------------- This doesn't work
	}
	wg.Done()
}()
Here's a snippet of my code. My issue is that I am unable to modify scores: I've tried using pointers, I've tried doing it normally, and I've tried passing it as a parameter.
Package sync
import "sync"
type WaitGroup
A WaitGroup waits for a collection of goroutines to finish. The main
goroutine calls Add to set the number of goroutines to wait for. Then
each of the goroutines runs and calls Done when finished. At the same
time, Wait can be used to block until all goroutines have finished.
You have provided us with a non-working fragment of code. See How to create a Minimal, Complete, and Verifiable example.
As a guess, your use of a sync.WaitGroup looks strange. For example, by simply following the instructions in the sync.WaitGroup documentation, I would expect something more like the following:
package main

import (
	"fmt"
	"strings"
	"sync"
)

func google(string) (string, error) { return "yes", nil }

func main() {
	question := "question?"
	answers := []string{"yes", "no"}
	scores := make(map[string]int)
	total := 0

	wg := sync.WaitGroup{}
	wg.Add(1)
	go func() {
		defer wg.Done()
		body, _ := google(question)
		for _, ans := range answers {
			count := strings.Count(body, ans)
			total += count
			scores[ans] += 5 // <-- This does work
		}
	}()
	wg.Wait()

	fmt.Println(scores, total)
}
Playground: https://play.golang.org/p/sZmB2Dc5RjL
Output:
map[yes:5 no:5] 1
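One caveat the answer above does not cover: this works because a single goroutine is the only writer to scores before wg.Wait() returns. If you fan the work out over several goroutines (say, one per question in a hypothetical questions slice), the map writes need to be serialized, for example with a sync.Mutex. A rough sketch, reusing the names from the answer:
var mu sync.Mutex
for _, q := range questions { // hypothetical: one goroutine per question
	wg.Add(1)
	go func(q string) {
		defer wg.Done()
		body, _ := google(q)
		for _, ans := range answers {
			c := strings.Count(body, ans)
			mu.Lock() // serialize writes to the shared map and counter
			total += c
			scores[ans] += 5
			mu.Unlock()
		}
	}(q)
}
wg.Wait()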

Can I perform an action on all slice items at once in go?

I have the following code:
func myfunction() {
	results := make([]SomeCustomStruct, 0)
	// ... results gets populated ...
	for index, value := range results {
		results[index].Body = cleanString(value.Body)
	}
	// ... when done, more things happen ...
}

func cleanString(in string) (out string) {
	s := sanitize.HTML(in)
	s = strings.Replace(s, "\n", " ", -1)
	out = strings.TrimSpace(s)
	return
}
The slice will never contain more than 100 or so entries. Is there any way I can exploit goroutines here to perform the cleanString function on each slice item at the same time rather than one by one?
Thanks!
If the slice only has 100 items or less and that's the entirety of cleanString, you're not going to get a lot of speedup unless the body strings are fairly large.
Parallelizing it with goroutines would look something like:
var wg sync.WaitGroup
for index, value := range results {
	wg.Add(1)
	go func(index int, body string) {
		defer wg.Done()
		results[index].Body = cleanString(body)
	}(index, value.Body)
}
wg.Wait()
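This is safe only because each goroutine writes to a distinct element of results. If cleanString ever becomes heavy enough for this to pay off, you may also want to bound the number of goroutines; a possible sketch using a buffered channel as a semaphore (the limit of 8 is arbitrary):
sem := make(chan struct{}, 8) // at most 8 goroutines clean at once
var wg sync.WaitGroup
for index, value := range results {
	wg.Add(1)
	sem <- struct{}{} // acquire a slot
	go func(index int, body string) {
		defer wg.Done()
		defer func() { <-sem }() // release the slot
		results[index].Body = cleanString(body)
	}(index, value.Body)
}
wg.Wait()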

Go channel takes each letter as string instead of the whole string

I'm creating a simple channel that takes string values. But apparently I'm pushing each letter in the string instead of the whole string in each loop.
I'm probably missing something very fundamental. What am I doing wrong ?
https://play.golang.org/p/-6E-f7ALbD
Code:
func doStuff(s string, ch chan string) {
	ch <- s
}

func main() {
	c := make(chan string)
	loops := [5]int{1, 2, 3, 4, 5}
	for i := 0; i < len(loops); i++ {
		go doStuff("helloooo", c)
	}

	results := <-c
	fmt.Println("channel size = ", len(results))

	// print the items in channel
	for _, r := range results {
		fmt.Println(string(r))
	}
}
Your code sends strings on the channel properly:
func doStuff(s string, ch chan string) {
	ch <- s
}
The problem is at the receiver side:
results := <-c
fmt.Println("channel size = ", len(results))

// print the items in channel
for _, r := range results {
	fmt.Println(string(r))
}
results will be a single value received from the channel (the first value sent on it). And you print the length of this string.
Then you loop over this string (results) using a for range which loops over its runes, and you print those.
What you want is loop over the values of the channel:
// print the items in channel
for s := range c {
	fmt.Println(s)
}
When run, this will end with a fatal runtime error:
fatal error: all goroutines are asleep - deadlock!
Because you never close the channel, and a for range on a channel runs until the channel is closed. So you have to close the channel sometime.
For example let's wait 1 second, then close it:
go func() {
	time.Sleep(time.Second)
	close(c)
}()
This way your app will run and quit after 1 second. Try it on the Go Playground.
Another, nicer solution is to use sync.WaitGroup: this waits until all goroutines are done doing their work (sending a value on the channel), then it closes the channel (so there is no unnecessary wait / delay).
var wg = sync.WaitGroup{}

func doStuff(s string, ch chan string) {
	ch <- s
	wg.Done()
}

// And in main():
for i := 0; i < len(loops); i++ {
	wg.Add(1)
	go doStuff("helloooo", c)
}

go func() {
	wg.Wait()
	close(c)
}()
Try this one on the Go Playground.
Notes:
To repeat something 5 times, you don't need that ugly loops array. Simply do:
for i := 0; i < 5; i++ {
	// Do something
}
The reason you are getting back letters instead of the string is that you assign a single value received from the channel to a variable and then iterate over that value, which is a string; in Go, a for range over a string yields its runes. You can simply print the received value without iterating over it:
package main

import (
	"fmt"
)

func doStuff(s string, ch chan string) {
	ch <- s
}

func main() {
	c := make(chan string)
	loops := [5]int{1, 2, 3, 4, 5}
	for i := 0; i < len(loops); i++ {
		go doStuff("helloooo", c)
	}

	results := <-c
	fmt.Println("channel size = ", len(results))
	fmt.Println(results) // will print helloooo
}

Finding data in large binary file and output with context

Prologue / Context
Last week my root filesystem was remounted read-only several times, so I took a complete snapshot via ddrescue. Sadly the filesystem was already damaged and some files are missing. At the moment I'm trying to find my ejabberd user database, which should be somewhere within the image. Testdisk found the required file (marked as deleted) but could not restore it. Since the file is pretty small and I have a backup from a few months ago, I thought about doing a binary search over the whole image.
So now I have a 64GB file with a damaged filesystem and would like to extract some 4kb blocks which contain a certain pattern.
Question
How can I find the data within the 64GB large file and extract the result with some context (4kb)?
Since the filesystem image resides on my server, I would prefer a Linux CLI tool.
The Tool
Since I couldn't find a tool that meets my requirements, I wrote one myself in Go. I call it bima (for binary match). It isn't pretty, but it did the job:
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"math"
	"os"

	"gopkg.in/alecthomas/kingpin.v1"
)

var (
	debug       = kingpin.Flag("debug", "Enable debug mode.").Short('d').Bool()
	bsize       = kingpin.Flag("blocksize", "Blocksize").Short('b').Default("126976").Int()
	debugDetail = kingpin.Flag("debugdetail", "Debug Detail").Short('v').Default("10").Int()

	matchCommand      = kingpin.Command("match", "Match a value")
	matchCommandValue = matchCommand.Arg("value", "The value (Hex Encoded e.g.: 616263 == abc)").Required().String()
	matchCommandFile  = matchCommand.Arg("file", "The file").Required().String()
)

func main() {
	kingpin.Version("0.1")
	mode := kingpin.Parse()

	if *bsize <= 0 {
		log.Fatal("The blocksize has to be larger than 0")
	}
	if *debugDetail <= 0 {
		log.Fatal("The Debug Detail has to be larger than 0")
	}

	if mode == "match" {
		searchBytes, err := hex.DecodeString(*matchCommandValue)
		if err != nil {
			log.Fatal(err)
		}
		scanFile(searchBytes, *matchCommandFile)
	}
}

func scanFile(search []byte, path string) {
	searchLength := len(search)
	blocksize := *bsize

	f, err := os.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}
	filesize := fi.Size()

	expectedRounds := int(math.Ceil(float64(filesize-int64(searchLength))/float64(blocksize)) + 1)
	if expectedRounds <= 0 {
		expectedRounds = 1
	}

	data := make([]byte, 0, blocksize+searchLength-1)
	data2 := make([]byte, 0, blocksize+searchLength-1)
	offset := make([]byte, searchLength-1)

	// reading the len of the slice or less (but not the cap)
	readCount, err := f.Read(offset)
	if err == io.EOF {
		fmt.Println("The files seems to be empty")
		return
	} else if err != nil {
		log.Fatal(err)
	}
	data = append(data, offset...)

	buffer := make([]byte, blocksize)
	var blockpos int
	var idx int
	blockpos = 0
	lastLevel := -1
	roundLevel := 0
	idxOffset := 0

	for round := 0; ; round++ {
		if *debug {
			roundLevel = ((round * 100) / expectedRounds)
			if (roundLevel % *debugDetail == 0) && (roundLevel > lastLevel) {
				lastLevel = roundLevel
				fmt.Fprintln(os.Stderr, "Starting round", round+1, "of", expectedRounds, "--", ((round * 100) / expectedRounds))
			}
		}

		// At EOF, the count will be zero and err will be io.EOF
		readCount, err = f.Read(buffer)
		if err != nil {
			if err == io.EOF {
				if *debug {
					fmt.Fprintln(os.Stderr, "Done - Found EOF")
				}
				break
			}
			fmt.Println(err)
			return
		}

		data = append(data, buffer[:readCount]...)
		data2 = data
		idxOffset = 0

		for {
			idx = bytes.Index(data2, search)
			if idx >= 0 {
				fmt.Println(blockpos + idxOffset + idx)
				if idx+searchLength < len(data2) {
					data2 = data2[idx+searchLength:]
					idxOffset += idx
				} else {
					break
				}
			} else {
				break
			}
		}

		data = data[readCount:]
		blockpos += readCount
	}
}
The Story
For completeness, here is what I did to solve my problem:
At first I used hexedit to find out that all DB files have the same header. Encoded in hex it looks like this: 0102030463584d0b0000004b62574c41
So I used my tool to find all occurrences within my sda.image file:
./bima match 0102030463584d0b0000004b62574c41 ./sda.image >DBfiles.txt
For the 64 GB image this took about 8 minutes, and I think the HDD was the limiting factor.
The result was about 1200 occurrences, which I extracted from the image with dd. As I didn't know the exact size of the files, I simply extracted chunks of 20,000 bytes:
for f in $(cat DBfiles.txt); do
	dd if=sda.image of=$f.dunno bs=1 ibs=1 skip=$f count=20000
done
Now I had about 1200 files and had to find the right ones. In a first step I searched for the passwd files (passwd.DCD and passwd.DCL). Later I did the same for the roster files. As the header of the files contains the name, I simply grepped for passwd:
for f in *.dunno; do
	if [ "$(cat $f | head -c 200 | grep "passwd" | wc -l)" == "1" ]; then
		echo "$f" | sed 's/\.$//g' >> passwd_files.list
	fi
done
Because the chunks were larger than the files, I had to find the end of each file manually. I did the corrections with Curses Hexedit.
During that process I could see that the head of each file contained either dcl_logk or dcd_logk. So I knew which of the files were DCL files and which were DCD files.
In the end I had up to ten copies of each file and had to decide which version to use. In general I took the largest one. After putting the files in the DB directory of the new ejabberd server and restarting it, all accounts were back again. :-)

How to get CPU usage

My Go program needs to know the current cpu usage percentage of all system and user processes.
How can I obtain that?
Check out the package http://github.com/c9s/goprocinfo; the goprocinfo package does the parsing for you.
stat, err := linuxproc.ReadStat("/proc/stat")
if err != nil {
	t.Fatal("stat read fail")
}

for _, s := range stat.CPUStats {
	// s.User
	// s.Nice
	// s.System
	// s.Idle
	// s.IOWait
}
I had a similar issue and never found a lightweight implementation. Here is a slimmed-down version of my solution that answers your specific question. I sample the /proc/stat file just like tylerl recommends. You'll notice that I wait 3 seconds between samples to match top's output, but I have also had good results with 1 or 2 seconds. I run similar code in a loop within a goroutine, then I access the CPU usage when I need it from other goroutines.
You can also parse the output of top -n1 | grep -i cpu to get the CPU usage, but it only samples for half a second on my Linux box and it was way off during heavy load. Regular top seemed to match very closely when I synchronized it with the following program:
package main

import (
	"fmt"
	"io/ioutil"
	"strconv"
	"strings"
	"time"
)

func getCPUSample() (idle, total uint64) {
	contents, err := ioutil.ReadFile("/proc/stat")
	if err != nil {
		return
	}
	lines := strings.Split(string(contents), "\n")
	for _, line := range lines {
		fields := strings.Fields(line)
		if fields[0] == "cpu" {
			numFields := len(fields)
			for i := 1; i < numFields; i++ {
				val, err := strconv.ParseUint(fields[i], 10, 64)
				if err != nil {
					fmt.Println("Error: ", i, fields[i], err)
				}
				total += val // tally up all the numbers to get total ticks
				if i == 4 {  // idle is the 5th field in the cpu line
					idle = val
				}
			}
			return
		}
	}
	return
}

func main() {
	idle0, total0 := getCPUSample()
	time.Sleep(3 * time.Second)
	idle1, total1 := getCPUSample()

	idleTicks := float64(idle1 - idle0)
	totalTicks := float64(total1 - total0)
	cpuUsage := 100 * (totalTicks - idleTicks) / totalTicks

	fmt.Printf("CPU usage is %f%% [busy: %f, total: %f]\n", cpuUsage, totalTicks-idleTicks, totalTicks)
}
It seems I'm allowed to link to the full implementation I wrote on Bitbucket; if not, feel free to delete this. It only works on Linux so far, though: systemstat.go
The mechanism for getting CPU usage is OS-dependent, since the numbers mean slightly different things to different OS kernels.
On Linux, you can query the kernel to get the latest stats by reading the pseudo-files in the /proc/ filesystem. These are generated on-the-fly when you read them to reflect the current state of the machine.
Specifically, the /proc/<pid>/stat file for each process contains the associated process accounting information. It's documented in proc(5). You're interested specifically in fields utime, stime, cutime and cstime (starting at the 14th field).
You can calculate the percentage easily enough: just read the numbers, wait some time interval, and read them again. Take the difference, divide by the amount of time you waited, and there's your average. This is precisely what the top program does (as well as all other programs that perform the same service). Bear in mind that you can have over 100% cpu usage if you have more than 1 CPU.
If you just want a system-wide summary, that's reported in /proc/stat -- calculate your average using the same technique, but you only have to read one file.
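A minimal sketch of that per-process technique (my own, Linux only; the 100 ticks-per-second USER_HZ is an assumption, the exact value comes from sysconf(_SC_CLK_TCK)):
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

const clkTck = 100.0 // assumed USER_HZ; query sysconf(_SC_CLK_TCK) to be exact

// processTicks returns utime+stime (in clock ticks) for the given pid.
func processTicks(pid int) uint64 {
	b, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return 0
	}
	// The comm field may contain spaces, so cut after the closing ')'.
	rest := string(b)[strings.LastIndex(string(b), ")")+2:]
	fields := strings.Fields(rest)
	// rest starts at field 3 (state), so utime and stime (fields 14 and 15)
	// sit at indexes 11 and 12.
	utime, _ := strconv.ParseUint(fields[11], 10, 64)
	stime, _ := strconv.ParseUint(fields[12], 10, 64)
	return utime + stime
}

func main() {
	pid := os.Getpid()
	t0 := processTicks(pid)
	time.Sleep(time.Second)
	t1 := processTicks(pid)

	busySeconds := float64(t1-t0) / clkTck
	fmt.Printf("process %d used %.1f%% CPU over the last second\n", pid, busySeconds*100)
}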
You can use the os/exec package to execute the ps command and get the result.
Here is a program issuing the ps aux command, parsing the result, and printing the CPU usage of all processes on Linux:
package main

import (
	"bytes"
	"log"
	"os/exec"
	"strconv"
	"strings"
)

type Process struct {
	pid int
	cpu float64
}

func main() {
	cmd := exec.Command("ps", "aux")
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		log.Fatal(err)
	}

	processes := make([]*Process, 0)
	for {
		line, err := out.ReadString('\n')
		if err != nil {
			break
		}
		tokens := strings.Split(line, " ")
		ft := make([]string, 0)
		for _, t := range tokens {
			if t != "" && t != "\t" {
				ft = append(ft, t)
			}
		}
		log.Println(len(ft), ft)
		pid, err := strconv.Atoi(ft[1])
		if err != nil {
			continue
		}
		cpu, err := strconv.ParseFloat(ft[2], 64)
		if err != nil {
			log.Fatal(err)
		}
		processes = append(processes, &Process{pid, cpu})
	}

	for _, p := range processes {
		log.Println("Process ", p.pid, " takes ", p.cpu, " % of the CPU")
	}
}
Here is an OS-independent solution using cgo to harness the clock() function provided by the C standard library:
//#include <time.h>
import "C"
import "time"

var startTime = time.Now()
var startTicks = C.clock()

func CpuUsagePercent() float64 {
	clockSeconds := float64(C.clock()-startTicks) / float64(C.CLOCKS_PER_SEC)
	realSeconds := time.Since(startTime).Seconds()
	return clockSeconds / realSeconds * 100
}
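A brief usage sketch, assuming the snippet above lives in a main package with fmt imported. Note that clock() measures the CPU time of the current process, so this reports the program's own usage rather than system-wide usage:
func main() {
	// Burn some CPU so there is something to measure.
	sum := 0
	for i := 0; i < 200000000; i++ {
		sum += i
	}
	fmt.Println("sum:", sum)
	fmt.Printf("average CPU usage since start: %.1f%%\n", CpuUsagePercent())
}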
I recently had to take CPU usage measurements from a Raspberry Pi (Raspbian OS) and used github.com/c9s/goprocinfo combined with what is proposed here:
Accurate calculation of CPU usage given in percentage in Linux?
The idea comes from the htop source code and is to have two measurements (previous / current) in order to calculate the CPU usage:
func calcSingleCoreUsage(curr, prev linuxproc.CPUStat) float32 {
	PrevIdle := prev.Idle + prev.IOWait
	Idle := curr.Idle + curr.IOWait

	PrevNonIdle := prev.User + prev.Nice + prev.System + prev.IRQ + prev.SoftIRQ + prev.Steal
	NonIdle := curr.User + curr.Nice + curr.System + curr.IRQ + curr.SoftIRQ + curr.Steal

	PrevTotal := PrevIdle + PrevNonIdle
	Total := Idle + NonIdle
	// fmt.Println(PrevIdle, Idle, PrevNonIdle, NonIdle, PrevTotal, Total)

	// differentiate: actual value minus the previous one
	totald := Total - PrevTotal
	idled := Idle - PrevIdle

	CPU_Percentage := (float32(totald) - float32(idled)) / float32(totald)

	return CPU_Percentage
}
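A possible usage sketch of my own (not part of the original answer), taking two samples one second apart with the same goprocinfo ReadStat call shown earlier:
prev, _ := linuxproc.ReadStat("/proc/stat")
time.Sleep(time.Second)
curr, _ := linuxproc.ReadStat("/proc/stat")

for i := range curr.CPUStats {
	usage := calcSingleCoreUsage(curr.CPUStats[i], prev.CPUStats[i])
	fmt.Printf("cpu%d: %.1f%%\n", i, usage*100) // the function returns a fraction, not a percentage
}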
For more you can also check https://github.com/tgogos/rpi_cpu_memory

Resources