My Go program needs to know the current CPU usage percentage of all system and user processes.
How can I obtain that?
Check out the goprocinfo package (http://github.com/c9s/goprocinfo); it does the /proc parsing for you.
package main

import (
	"fmt"
	"log"

	linuxproc "github.com/c9s/goprocinfo/linux"
)

func main() {
	stat, err := linuxproc.ReadStat("/proc/stat")
	if err != nil {
		log.Fatal("stat read fail")
	}
	for _, s := range stat.CPUStats {
		// Per-CPU tick counters since boot: User, Nice, System, Idle, IOWait, ...
		fmt.Println(s.Id, s.User, s.Nice, s.System, s.Idle, s.IOWait)
	}
}
I had a similar issue and never found a lightweight implementation. Here is a slimmed-down version of my solution that answers your specific question. I sample the /proc/stat file just as tylerl recommends. You'll notice that I wait 3 seconds between samples to match top's output, but I have also had good results with 1 or 2 seconds. I run similar code in a loop inside a goroutine, then access the CPU usage from other goroutines when I need it.
You can also parse the output of top -n1 | grep -i cpu to get the CPU usage, but it only samples for half a second on my Linux box, and it was way off during heavy load. Regular top matched very closely when I synchronized it with the following program:
package main

import (
	"fmt"
	"io/ioutil"
	"strconv"
	"strings"
	"time"
)

func getCPUSample() (idle, total uint64) {
	contents, err := ioutil.ReadFile("/proc/stat")
	if err != nil {
		return
	}
	lines := strings.Split(string(contents), "\n")
	for _, line := range lines {
		fields := strings.Fields(line)
		if fields[0] == "cpu" {
			numFields := len(fields)
			for i := 1; i < numFields; i++ {
				val, err := strconv.ParseUint(fields[i], 10, 64)
				if err != nil {
					fmt.Println("Error: ", i, fields[i], err)
				}
				total += val // tally up all the numbers to get total ticks
				if i == 4 {  // idle is the 5th field in the cpu line
					idle = val
				}
			}
			return
		}
	}
	return
}

func main() {
	idle0, total0 := getCPUSample()
	time.Sleep(3 * time.Second)
	idle1, total1 := getCPUSample()

	idleTicks := float64(idle1 - idle0)
	totalTicks := float64(total1 - total0)
	cpuUsage := 100 * (totalTicks - idleTicks) / totalTicks

	fmt.Printf("CPU usage is %f%% [busy: %f, total: %f]\n", cpuUsage, totalTicks-idleTicks, totalTicks)
}
It seems I'm allowed to link to the full implementation I wrote on Bitbucket; if not, feel free to delete this. It only works on Linux so far, though: systemstat.go
The mechanism for getting CPU usage is OS-dependent, since the numbers mean slightly different things to different OS kernels.
On Linux, you can query the kernel to get the latest stats by reading the pseudo-files in the /proc/ filesystem. These are generated on-the-fly when you read them to reflect the current state of the machine.
Specifically, the /proc/<pid>/stat file for each process contains the associated process accounting information. It's documented in proc(5). You're interested specifically in fields utime, stime, cutime and cstime (starting at the 14th field).
You can calculate the percentage easily enough: just read the numbers, wait some time interval, and read them again. Take the difference, divide by the amount of time you waited, and there's your average. This is precisely what the top program does (as well as every other program that performs the same service). Bear in mind that you can have over 100% CPU usage if you have more than 1 CPU.
If you just want a system-wide summary, that's reported in /proc/stat -- calculate your average using the same technique, but you only have to read one file.
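Here is a minimal sketch of the per-process approach described above (Linux only). It assumes the usual USER_HZ of 100 clock ticks per second (check with getconf CLK_TCK) and samples only utime and stime (fields 14 and 15); add cutime/cstime if you also want waited-for children.

package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// processTicks returns utime+stime (in clock ticks) for one PID.
func processTicks(pid int) (uint64, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return 0, err
	}
	// The comm field may contain spaces, so split after the closing ')'.
	s := string(data)
	fields := strings.Fields(s[strings.LastIndexByte(s, ')')+1:])
	// fields[0] is the state (field 3), so utime/stime are fields[11] and fields[12].
	utime, err := strconv.ParseUint(fields[11], 10, 64)
	if err != nil {
		return 0, err
	}
	stime, err := strconv.ParseUint(fields[12], 10, 64)
	if err != nil {
		return 0, err
	}
	return utime + stime, nil
}

func main() {
	const hz = 100.0 // assumed USER_HZ
	pid := os.Getpid()
	t0, _ := processTicks(pid)
	time.Sleep(time.Second)
	t1, _ := processTicks(pid)
	// Ticks spent on the CPU during the 1-second interval, as a percentage.
	fmt.Printf("pid %d: %.1f%% CPU over the last second\n", pid, float64(t1-t0)/hz*100)
}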
You can use the os/exec package to execute the ps command and get the result.
Here is a program issuing the ps aux command, parsing the result, and printing the CPU usage of all processes on Linux:
package main

import (
	"bytes"
	"log"
	"os/exec"
	"strconv"
	"strings"
)

type Process struct {
	pid int
	cpu float64
}

func main() {
	cmd := exec.Command("ps", "aux")
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		log.Fatal(err)
	}
	processes := make([]*Process, 0)
	for {
		line, err := out.ReadString('\n')
		if err != nil {
			break
		}
		tokens := strings.Split(line, " ")
		ft := make([]string, 0)
		for _, t := range tokens {
			if t != "" && t != "\t" {
				ft = append(ft, t)
			}
		}
		log.Println(len(ft), ft)
		pid, err := strconv.Atoi(ft[1])
		if err != nil {
			continue
		}
		cpu, err := strconv.ParseFloat(ft[2], 64)
		if err != nil {
			log.Fatal(err)
		}
		processes = append(processes, &Process{pid, cpu})
	}
	for _, p := range processes {
		log.Println("Process ", p.pid, " takes ", p.cpu, " % of the CPU")
	}
}
Here is an OS-independent solution using cgo to harness the clock() function provided by the C standard library:
//#include <time.h>
import "C"
import "time"
var startTime = time.Now()
var startTicks = C.clock()
func CpuUsagePercent() float64 {
	clockSeconds := float64(C.clock()-startTicks) / float64(C.CLOCKS_PER_SEC)
	realSeconds := time.Since(startTime).Seconds()
	return clockSeconds / realSeconds * 100
}
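A rough usage sketch (assuming the snippet above sits in a package that also imports fmt). Note that clock() measures the CPU time consumed by the calling process itself, so this reports your own program's usage rather than system-wide load:

func main() {
	// Spin for a bit so there is some CPU time to measure.
	deadline := time.Now().Add(500 * time.Millisecond)
	for time.Now().Before(deadline) {
	}
	fmt.Printf("this process used roughly %.1f%% of one CPU\n", CpuUsagePercent())
}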
I recently had to take CPU usage measurements from a Raspberry Pi (Raspbian OS) and used github.com/c9s/goprocinfo combined with what is proposed here:
Accurate calculation of CPU usage given in percentage in Linux?
The idea comes from the htop source code and is to have two measurements (previous / current) in order to calculate the CPU usage:
func calcSingleCoreUsage(curr, prev linuxproc.CPUStat) float32 {
	PrevIdle := prev.Idle + prev.IOWait
	Idle := curr.Idle + curr.IOWait

	PrevNonIdle := prev.User + prev.Nice + prev.System + prev.IRQ + prev.SoftIRQ + prev.Steal
	NonIdle := curr.User + curr.Nice + curr.System + curr.IRQ + curr.SoftIRQ + curr.Steal

	PrevTotal := PrevIdle + PrevNonIdle
	Total := Idle + NonIdle
	// fmt.Println(PrevIdle, Idle, PrevNonIdle, NonIdle, PrevTotal, Total)

	// differentiate: actual value minus the previous one
	totald := Total - PrevTotal
	idled := Idle - PrevIdle

	// Note: this is a fraction in [0, 1]; multiply by 100 for a percentage.
	CPU_Percentage := (float32(totald) - float32(idled)) / float32(totald)

	return CPU_Percentage
}
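A rough usage sketch, assuming the goprocinfo import shown earlier (linuxproc "github.com/c9s/goprocinfo/linux") plus fmt, log and time: take two samples a second apart and compare matching per-core entries. Multiply by 100 since the function returns a fraction.

prev, err := linuxproc.ReadStat("/proc/stat")
if err != nil {
	log.Fatal(err)
}
time.Sleep(time.Second)

curr, err := linuxproc.ReadStat("/proc/stat")
if err != nil {
	log.Fatal(err)
}

for i := range curr.CPUStats {
	// Pair each core's current sample with its previous one.
	usage := calcSingleCoreUsage(curr.CPUStats[i], prev.CPUStats[i])
	fmt.Printf("cpu%d: %.1f%%\n", i, usage*100)
}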
For more you can also check https://github.com/tgogos/rpi_cpu_memory
Related
I'm doing a small project for my parallelism course and have tried it with buffered channels, unbuffered channels, and without channels using pointers to slices, etc. I have also tried to optimize it as much as possible (not the current state), but I still get the same result: increasing the number of goroutines (even by 1) slows down the whole program. Can someone tell me what I'm doing wrong, and whether any parallelism improvement is even possible in this situation?
Here is part of the code:
func main() {
	rand.Seed(time.Now().UnixMicro())

	numAgents := 2
	fmt.Println("Please pick a number of goroutines: ")
	fmt.Scanf("%d", &numAgents)

	numFiles := 4
	fmt.Println("How many files do you want?")
	fmt.Scanf("%d", &numFiles)

	start := time.Now()

	numAssist := numFiles
	channel := make(chan []File, numAgents)
	files := make([]File, 0)

	for i := 0; i < numAgents; i++ {
		if i == numAgents-1 {
			go generateFiles(numAssist, channel)
		} else {
			go generateFiles(numFiles/numAgents, channel)
			numAssist -= numFiles / numAgents
		}
	}

	for i := 0; i < numAgents; i++ {
		files = append(files, <-channel...)
	}

	elapsed := time.Since(start)
	fmt.Printf("Function took %s\n", elapsed)
}
func generateFiles(numFiles int, channel chan []File) {
	magicNumbersMap := getMap()
	files := make([]File, 0)

	for i := 0; i < numFiles; i++ {
		content := randElementFromMap(&magicNumbersMap)
		length := rand.Intn(400) + 100
		hexSlice := getHex()
		for j := 0; j < length; j++ {
			content = content + hexSlice[rand.Intn(len(hexSlice))]
		}
		hash := getSHA1Hash([]byte(content))
		file := File{
			content: content,
			hash:    hash,
		}
		files = append(files, file)
	}

	channel <- files
}
My expectation was that increasing the number of goroutines would make the program run faster, up to a certain point, after which adding more goroutines would give the same execution time or slightly worse.
EDIT: All the functions that are used:
import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"math/rand"
	"time"
)

type File struct {
	content string
	hash    string
}

func getMap() map[string]string {
	return map[string]string{
		"D4C3B2A1": "Libcap file format",
		"EDABEEDB": "RedHat Package Manager (RPM) package",
		"4C5A4950": "lzip compressed file",
	}
}

func getHex() []string {
	return []string{
		"0", "1", "2", "3", "4", "5",
		"6", "7", "8", "9", "A", "B",
		"C", "D", "E", "F",
	}
}

func randElementFromMap(m *map[string]string) string {
	x := rand.Intn(len(*m))
	for k := range *m {
		if x == 0 {
			return k
		}
		x--
	}
	return "Error"
}

func getSHA1Hash(content []byte) string {
	h := sha1.New()
	h.Write(content)
	return base64.URLEncoding.EncodeToString(h.Sum(nil))
}
Simply speaking, the file-generation code is not complex enough to justify parallel execution. All the context switching and moving data through the channel eats up any benefit of parallel processing.
If you add something like time.Sleep(time.Millisecond * 10) inside the loop in your generateFiles function, as if it were doing something more complex, you'll see what you expected: more goroutines work faster. But again, only up to a certain level, where the extra work of parallel processing outweighs the benefit.
Note also that the execution time of the last bit of your program:
for i := 0; i < numAgents; i++ {
	files = append(files, <-channel...)
}
directly depends on the number of goroutines. Since all goroutines finish at approximately the same time, this loop is almost never executed in parallel with your workers, and the time it takes is simply added to the total time.
Next, when you append to the files slice multiple times, it has to grow several times and copy the data over to the new location. You can avoid this by initially creating a slice with capacity for all your resulting elements (luckily, you know exactly how many you'll need), as in the sketch below.
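A small sketch of that suggestion, reusing the names from the question:

// Reserve capacity for every expected file up front so append never has to
// grow the backing array and copy the existing elements.
files := make([]File, 0, numFiles)
for i := 0; i < numAgents; i++ {
	files = append(files, <-channel...)
}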
scores := make(map[string]int)
percentage := make(map[string]float64)
total := 0

for i, ans := range answers {
	answers[i] = strings.ToLower(ans)
}

wg := sync.WaitGroup{}
go func() {
	wg.Add(1)
	body, _ := google(question)
	for _, ans := range answers {
		count := strings.Count(body, ans)
		total += count
		scores[ans] += 5 // <------------------- This doesn't work
	}
	wg.Done()
}()
Here's a snippet of my code. My issue is that I am unable to modify scores; I've tried using pointers, I've tried doing it normally, and I've tried passing it as a parameter.
Package sync
import "sync"
type WaitGroup
A WaitGroup waits for a collection of goroutines to finish. The main
goroutine calls Add to set the number of goroutines to wait for. Then
each of the goroutines runs and calls Done when finished. At the same
time, Wait can be used to block until all goroutines have finished.
You have provided us with a non-working fragment of code. See How to create a Minimal, Complete, and Verifiable example.
As a guess, your use of sync.WaitGroup looks strange. For example, simply following the instructions in the sync.WaitGroup documentation, I would expect something more like the following:
package main

import (
	"fmt"
	"strings"
	"sync"
)

func google(string) (string, error) { return "yes", nil }

func main() {
	question := "question?"
	answers := []string{"yes", "no"}
	scores := make(map[string]int)
	total := 0

	wg := sync.WaitGroup{}
	wg.Add(1)
	go func() {
		defer wg.Done()
		body, _ := google(question)
		for _, ans := range answers {
			count := strings.Count(body, ans)
			total += count
			scores[ans] += 5 // <-- This does work
		}
	}()
	wg.Wait()

	fmt.Println(scores, total)
}
Playground: https://play.golang.org/p/sZmB2Dc5RjL
Output:
map[yes:5 no:5] 1
I currently have a script that performs an OS command that returns a great deal of data; at the end of the data it gives a total such that:
N Total.
N can be any number from 0 upward.
I want to perform this command and store N in a variable. I have the command running and I'm storing its output in a bytes.Buffer, but I'm unsure how to scrape it so that I only get the number. The "N Total." string is always at the end of the output. Any help would be appreciated, as the various methods I've seen all seem quite convoluted.
You can use a bufio.Scanner to read the command's output line-wise. Then just remember the last line and parse it once the command has finished.
package main

import (
	"bufio"
	"fmt"
	"io"
	"os/exec"
	"strings"
)

func main() {
	r, w := io.Pipe()

	cmd := exec.Command("fortune")
	cmd.Stdout = w

	go func() {
		// Close the write end once the command finishes so the scanner sees EOF.
		cmd.Run()
		w.Close()
	}()

	sc := bufio.NewScanner(r)
	var lastLine string
	for sc.Scan() {
		line := sc.Text()
		fmt.Println("debug:", line)
		if strings.TrimSpace(line) != "" {
			lastLine = line
		}
	}

	fmt.Println(lastLine)
}
Sample output:
debug: "Get back to your stations!"
debug: "We're beaming down to the planet, sir."
debug: -- Kirk and Mr. Leslie, "This Side of Paradise",
debug: stardate 3417.3
stardate 3417.3
Parsing lastLine is left as an exercise for the reader.
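For example, assuming the last line really does look like "1001 Total." (strconv and log are additional imports on top of the program above):

// lastLine is e.g. "1001 Total." -- grab the first field and strip the dot.
fields := strings.Fields(lastLine)
n, err := strconv.Atoi(strings.TrimSuffix(fields[0], "."))
if err != nil {
	log.Fatalf("could not parse %q: %v", lastLine, err)
}
fmt.Println("total:", n)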
You can split the string by \n and get the last line.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

func main() {
	output := `
Some os output
Some more os output
Again some os output
1001 Total`

	// If you're getting the string from the bytes.Buffer do this:
	// output := myBytesBuffer.String()

	outputSplit := strings.Split(output, "\n") // Break into lines

	// Get the last line from the end.
	// -1 assumes the number is in the last line. Change it if it's not.
	lastLine := outputSplit[len(outputSplit)-1]

	lastLine = strings.Replace(lastLine, " Total", "", -1) // Remove text
	number, _ := strconv.Atoi(lastLine)                    // Convert from text to number

	fmt.Println(number)
}
peterSO points out that for big output the above may be slow.
Here's another way that uses a compiled regular expression to match against a small subset of the bytes.
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"regexp"
	"strconv"
)

func main() {
	// Create regular expression. You only create this once.
	// Would be regexpNumber := regexp.MustCompile(`(\d+) Total`) for you
	regexpNumber := regexp.MustCompile(`(\d+) bits physical`)

	// Whatever your os command is
	command := exec.Command("cat", "/proc/cpuinfo")
	output, _ := command.Output()

	// Your bytes.Buffer
	var b bytes.Buffer
	b.Write(output)

	// Get end of bytes slice
	var end []byte
	if b.Len()-200 > 0 {
		end = b.Bytes()[b.Len()-200:]
	} else {
		end = b.Bytes()
	}

	// Get matches. matches[1] contains your number
	matches := regexpNumber.FindSubmatch(end)

	// Convert bytes to int
	number, _ := strconv.Atoi(string(matches[1])) // Convert from text to number
	fmt.Println(number)
}
Prologue / Context
Last week my root filesystem was remounted read-only several times, and I took a complete snapshot via ddrescue. Sadly the filesystem was already damaged and some files are missing. At the moment I am trying to find my ejabberd user database, which should be somewhere within the image. TestDisk found the required file (marked as deleted) but could not restore it. Since the file is pretty small and I have a backup from some months ago, I thought about doing a binary search over the whole image.
So now I have a 64 GB file with a damaged filesystem and would like to extract some 4 KB blocks which contain a certain pattern.
Question
How can I find the data within the 64 GB file and extract the result with some context (4 KB)?
Since the filesystem image resides on my server, I would prefer a Linux CLI tool.
The Tool
Since I couldn't find a tool that met my requirements, I wrote one myself in Go. I call it bima (for binary match). It isn't pretty, but it did the job:
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"gopkg.in/alecthomas/kingpin.v1"
	"io"
	"log"
	"math"
	"os"
)

var (
	debug       = kingpin.Flag("debug", "Enable debug mode.").Short('d').Bool()
	bsize       = kingpin.Flag("blocksize", "Blocksize").Short('b').Default("126976").Int()
	debugDetail = kingpin.Flag("debugdetail", "Debug Detail").Short('v').Default("10").Int()

	matchCommand      = kingpin.Command("match", "Match a value")
	matchCommandValue = matchCommand.Arg("value", "The value (Hex Encoded e.g.: 616263 == abc)").Required().String()
	matchCommandFile  = matchCommand.Arg("file", "The file").Required().String()
)

func main() {
	kingpin.Version("0.1")
	mode := kingpin.Parse()

	if *bsize <= 0 {
		log.Fatal("The blocksize has to be larger than 0")
	}
	if *debugDetail <= 0 {
		log.Fatal("The Debug Detail has to be larger than 0")
	}

	if mode == "match" {
		searchBytes, err := hex.DecodeString(*matchCommandValue)
		if err != nil {
			log.Fatal(err)
		}
		scanFile(searchBytes, *matchCommandFile)
	}
}

func scanFile(search []byte, path string) {
	searchLength := len(search)
	blocksize := *bsize

	f, err := os.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}
	filesize := fi.Size()

	expectedRounds := int(math.Ceil(float64(filesize-int64(searchLength))/float64(blocksize)) + 1)
	if expectedRounds <= 0 {
		expectedRounds = 1
	}

	data := make([]byte, 0, blocksize+searchLength-1)
	data2 := make([]byte, 0, blocksize+searchLength-1)
	offset := make([]byte, searchLength-1)

	// reading the len of the slice or less (but not the cap)
	readCount, err := f.Read(offset)
	if err == io.EOF {
		fmt.Println("The file seems to be empty")
		return
	} else if err != nil {
		log.Fatal(err)
	}
	data = append(data, offset...)

	buffer := make([]byte, blocksize)
	var blockpos int
	var idx int
	blockpos = 0
	lastLevel := -1
	roundLevel := 0
	idxOffset := 0

	for round := 0; ; round++ {
		if *debug {
			roundLevel = ((round * 100) / expectedRounds)
			if (roundLevel%*debugDetail == 0) && (roundLevel > lastLevel) {
				lastLevel = roundLevel
				fmt.Fprintln(os.Stderr, "Starting round", round+1, "of", expectedRounds, "--", ((round * 100) / expectedRounds))
			}
		}

		// At EOF, the count will be zero and err will be io.EOF
		readCount, err = f.Read(buffer)
		if err != nil {
			if err == io.EOF {
				if *debug {
					fmt.Fprintln(os.Stderr, "Done - Found EOF")
				}
				break
			}
			fmt.Println(err)
			return
		}

		data = append(data, buffer[:readCount]...)
		data2 = data
		idxOffset = 0
		for {
			idx = bytes.Index(data2, search)
			if idx >= 0 {
				fmt.Println(blockpos + idxOffset + idx)
				if idx+searchLength < len(data2) {
					data2 = data2[idx+searchLength:]
					idxOffset += idx
				} else {
					break
				}
			} else {
				break
			}
		}

		data = data[readCount:]
		blockpos += readCount
	}
}
The Story
For completeness, here is what I did to solve my problem:
At first I used hexedit to find out that all DB files have the same header. Encoded in hex it looks like this: 0102030463584d0b0000004b62574c41
So I used my tool to find all occurrences within my sda.image file:
./bima match 0102030463584d0b0000004b62574c41 ./sda.image >DBfiles.txt
For the 64 GB this took about 8 minutes, and I think the HDD was the limiting factor.
The result was about 1200 occurrences, which I extracted from the image with dd. As I didn't know the exact size of the files, I simply extracted chunks of 20,000 bytes:
for f in $(cat DBfiles.txt); do
    dd if=sda.image of=$f.dunno bs=1 ibs=1 skip=$f count=20000
done
Now I had about 1200 files and had to find the right ones. In a first step I searched for the passwd files (passwd.DCD and passwd.DCL). Later I did the same for the roster files. As the header of each file contains the name, I simply grepped for passwd:
for f in *.dunno; do
    if [ "$(cat $f | head -c 200 | grep "passwd" | wc -l)" == "1" ]; then
        echo "$f" | sed 's/\.$//g' >> passwd_files.list
    fi
done
Because the chunks were larger than the files, I had to find the end of each file manually. I made the corrections with Curses Hexedit.
During that process I could see that the head of each file contained either dcl_logk or dcd_logk, so I knew which of the files were DCL files and which were DCD files.
In the end I had each file up to ten times and had to decide which version I wanted to use. In general I took the largest file. After putting the files in the DB directory of the new ejabberd server and restarting it, all accounts were back again. :-)
I want to generate long random strings in Go, like an [a-zA-Z0-9] string:
na1dopW129T0anN28udaZ
or hexadecimal string:
8c6f78ac23b4a7b8c0182d
By long I mean 2K and more characters.
This does about 200 MB/s on my box. There's obvious room for improvement.
type randomDataMaker struct {
	src rand.Source
}

func (r *randomDataMaker) Read(p []byte) (n int, err error) {
	for i := range p {
		p[i] = byte(r.src.Int63() & 0xff)
	}
	return len(p), nil
}
You'd just use io.CopyN to produce the string you want. Obviously you could adjust the character set on the way in or whatever.
The nice thing about this model is that it's just an io.Reader, so you can use it to feed anything.
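A hedged sketch of that io.CopyN usage: copy n random bytes into a buffer, then remap each byte onto whatever character set you want (the remapping step and the alphabet are my additions, and the remap is slightly biased because 256 isn't a multiple of the alphabet size):

func randString(n int64) string {
	maker := &randomDataMaker{rand.NewSource(time.Now().UnixNano())}

	// Pull n random bytes through the io.Reader interface.
	var buf bytes.Buffer
	if _, err := io.CopyN(&buf, maker, n); err != nil {
		panic(err)
	}

	// Remap raw bytes onto the desired alphabet.
	const alphanum = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
	out := buf.Bytes()
	for i, b := range out {
		out[i] = alphanum[int(b)%len(alphanum)]
	}
	return string(out)
}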
Test is below:
func BenchmarkRandomDataMaker(b *testing.B) {
	randomSrc := randomDataMaker{rand.NewSource(1028890720402726901)}
	for i := 0; i < b.N; i++ {
		b.SetBytes(int64(i))
		_, err := io.CopyN(ioutil.Discard, &randomSrc, int64(i))
		if err != nil {
			b.Fatalf("Error copying at %v: %v", i, err)
		}
	}
}
On one core of my 2.2GHz i7:
BenchmarkRandomDataMaker 50000 246512 ns/op 202.83 MB/s
EDIT
Since I wrote the benchmark, I figured I'd do the obvious improvement (call out to the random source less frequently). With 1/8 the calls to rand, it runs about 4x faster, though it's a bit uglier:
New version:
func (r *randomDataMaker) Read(p []byte) (n int, err error) {
	todo := len(p)
	offset := 0
	for {
		val := int64(r.src.Int63())
		for i := 0; i < 8; i++ {
			p[offset] = byte(val & 0xff)
			todo--
			if todo == 0 {
				return len(p), nil
			}
			offset++
			val >>= 8
		}
	}

	panic("unreachable")
}
New benchmark:
BenchmarkRandomDataMaker 200000 251148 ns/op 796.34 MB/s
EDIT 2
Took out the masking in the cast to byte since it was redundant. Got a good deal faster:
BenchmarkRandomDataMaker 200000 231843 ns/op 862.64 MB/s
(this is so much easier than real work sigh)
EDIT 3
This came up in irc today, so I released a library. Also, my actual benchmark tool, while useful for relative speed, isn't sufficiently accurate in its reporting.
I created randbo that you can reuse to produce random streams wherever you may need them.
You can use the Go package uniuri to generate random strings (or view the source code to see how they're doing it). You'll want to use:
func NewLen(length int) string
NewLen returns a new random string of the provided length, consisting of standard characters.
Or, to specify the set of characters used:
func NewLenChars(length int, chars []byte) string
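A short usage sketch (assuming the package lives at github.com/dchest/uniuri and these signatures are still current):

package main

import (
	"fmt"

	"github.com/dchest/uniuri"
)

func main() {
	fmt.Println(uniuri.NewLen(2048))                                   // standard characters
	fmt.Println(uniuri.NewLenChars(2048, []byte("0123456789abcdef"))) // custom set, hex-style
}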
This is actually a little biased towards the first 8 characters in the set (since 256 is not a multiple of len(alphanum)), but it will get you most of the way there.
import (
	"crypto/rand"
)

func randString(n int) string {
	const alphanum = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	var bytes = make([]byte, n)
	rand.Read(bytes)
	for i, b := range bytes {
		bytes[i] = alphanum[b%byte(len(alphanum))]
	}
	return string(bytes)
}
If you want to generate a cryptographically secure random string, I recommend you take a look at this page. Here is a helper function that reads n random bytes from your OS's source of randomness and then base64-encodes them. Note that the resulting string will be longer than n because of the base64 encoding.
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

func GenerateRandomBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	_, err := rand.Read(b)
	if err != nil {
		return nil, err
	}
	return b, nil
}

func GenerateRandomString(s int) (string, error) {
	b, err := GenerateRandomBytes(s)
	return base64.URLEncoding.EncodeToString(b), err
}

func main() {
	token, _ := GenerateRandomString(32)
	fmt.Println(token)
}
Here is Evan Shaw's answer re-worked without the bias towards the first 8 characters of the string. Note that it uses lots of expensive big.Int operations, so it probably isn't that quick! The answer is crypto-strong, though.
It uses rand.Int to make an integer of exactly the right size len(alphanum) ** n, then does what is effectively a base conversion into base len(alphanum).
There is almost certainly a better algorithm for this which would involve keeping a much smaller remainder and adding random bytes to it as necessary. This would get rid of the expensive long integer arithmetic.
import (
	"crypto/rand"
	"fmt"
	"math/big"
)

func randString(n int) string {
	const alphanum = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	symbols := big.NewInt(int64(len(alphanum)))
	states := big.NewInt(0)
	states.Exp(symbols, big.NewInt(int64(n)), nil)
	r, err := rand.Int(rand.Reader, states)
	if err != nil {
		panic(err)
	}
	var bytes = make([]byte, n)
	r2 := big.NewInt(0)
	symbol := big.NewInt(0)
	for i := range bytes {
		r2.DivMod(r, symbols, symbol)
		r, r2 = r2, r
		bytes[i] = alphanum[symbol.Int64()]
	}
	return string(bytes)
}