What is the idiomatic way to include data in error messages?

I’ve encountered different ways to incorporate variables into error messages in Go. In the following example, which way is the idiomatic one? Is there a better alternative?
Which is safer when things start to break? For example, when there is very little memory left available, the option that allocates fewer bytes would be preferable.
Which is faster, in case we need to generate a lot of errors?
The full runnable code can be seen in the Go Play Space or in the official Go Playground.
func f() error {
	return SepError("Sepuled " + strconv.Itoa(n) + " sepulcas " + strconv.Itoa(t) +
		" times each")
}

func g() error {
	return SepError(strings.Join([]string{
		"Sepuled", strconv.Itoa(n), "sepulcas", strconv.Itoa(t), "times each"}, " "))
}

func h() error {
	return SepError(fmt.Sprintf("Sepuled %d sepulcas %d times each", n, t))
}

Unless you have very little memory, or are going to be generating a HUGE number of these errors, I wouldn't worry about it. As far as idiomatic Go goes, I would opt for the h() option because it is the easiest to read.
The nice thing here is that allocations, memory used, and speed can all be measured with some simple benchmarks:
func BenchmarkF(b *testing.B) {
	for i := 0; i < b.N; i++ {
		f()
	}
}

func BenchmarkG(b *testing.B) {
	for i := 0; i < b.N; i++ {
		g()
	}
}

func BenchmarkH(b *testing.B) {
	for i := 0; i < b.N; i++ {
		h()
	}
}
Output of `go test -bench . -benchmem`:
BenchmarkF-8 10000000 169 ns/op 72 B/op 4 allocs/op
BenchmarkG-8 10000000 204 ns/op 120 B/op 5 allocs/op
BenchmarkH-8 5000000 237 ns/op 80 B/op 4 allocs/op
As you can see, f() is the fastest, uses the least memory, and is tied for the fewest allocations. In my opinion, though, those savings are not worth the additional cost in readability.
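For completeness: if SepError only wraps a string, the most common standard-library idiom builds the error directly with fmt.Errorf, which formats like fmt.Sprintf and returns an error. A sketch, assuming n and t as in the snippets above:
func hErrorf() error {
	// Same message as h(), without a custom error type.
	return fmt.Errorf("Sepuled %d sepulcas %d times each", n, t)
}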

string vs integer as map key for memory utilization in golang?

I have the read function below, which is called by multiple goroutines to read S3 files; it populates two concurrent maps, as shown.
During server startup, the app calls read to populate both concurrent maps.
Then, every 30 seconds, it calls read again to pick up new S3 files and repopulate the maps with new data.
So at any given point during the app's lifecycle, both concurrent maps hold data and are also being updated periodically.
func (r *clientRepository) read(file string, bucket string) error {
	var err error
	//... read s3 file
	for {
		rows, err := pr.ReadByNumber(r.cfg.RowsToRead)
		if err != nil {
			return errs.Wrap(err)
		}
		if len(rows) <= 0 {
			break
		}
		byteSlice, err := json.Marshal(rows)
		if err != nil {
			return errs.Wrap(err)
		}
		var productRows []ParquetData
		err = json.Unmarshal(byteSlice, &productRows)
		if err != nil {
			return errs.Wrap(err)
		}
		for i := range productRows {
			var flatProduct definitions.CustomerInfo
			err = r.ConvertData(spn, &productRows[i], &flatProduct)
			if err != nil {
				return errs.Wrap(err)
			}
			// populate first concurrent map here
			r.products.Set(strconv.FormatInt(flatProduct.ProductId, 10), &flatProduct)
			for _, catalogId := range flatProduct.Catalogs {
				strCatalogId := strconv.FormatInt(int64(catalogId), 10)
				// upsert second concurrent map here
				r.productCatalog.Upsert(strCatalogId, flatProduct.ProductId, func(exists bool, valueInMap interface{}, newValue interface{}) interface{} {
					productID := newValue.(int64)
					if valueInMap == nil {
						return map[int64]struct{}{productID: {}}
					}
					oldIDs := valueInMap.(map[int64]struct{})
					// value is irrelevant, no need to check if key exists
					oldIDs[productID] = struct{}{}
					return oldIDs
				})
			}
		}
	}
	return nil
}
In the code above, flatProduct.ProductId and strCatalogId are integers, but I convert them to strings because the concurrent map only works with string keys. I then have the following three functions, which my main application threads use to read data from the maps populated above.
func (r *clientRepository) GetProductMap() *cmap.ConcurrentMap {
	return r.products
}

func (r *clientRepository) GetProductCatalogMap() *cmap.ConcurrentMap {
	return r.productCatalog
}

func (r *clientRepository) GetProductData(pid string) *definitions.CustomerInfo {
	pd, ok := r.products.Get(pid)
	if ok {
		return pd.(*definitions.CustomerInfo)
	}
	return nil
}
My use case is to populate the maps from multiple goroutines and then read them from a bunch of main application threads, so access needs to be thread safe, and it should be fast as well, without much locking.
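For context, the startup-plus-refresh flow described above could be wired roughly like this (start is a hypothetical name, not part of the real code; read is the function shown earlier):
func (r *clientRepository) start(ctx context.Context, file, bucket string) error {
	// Initial population at server startup.
	if err := r.read(file, bucket); err != nil {
		return err
	}
	go func() {
		ticker := time.NewTicker(30 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if err := r.read(file, bucket); err != nil {
					log.Printf("refresh failed: %v", err) // keep serving the previous data
				}
			}
		}
	}()
	return nil
}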
Problem Statement
I am dealing with a lot of data, 30-40 GB worth across all these files, which I read into memory. The concurrent map solves most of my concurrency issues, but its keys are strings, and it has no implementation where the key can be an integer. In my case the key is just a product ID, which fits in an int32, so is it worth storing all those product IDs as strings in this concurrent map? A string takes more memory to store than an integer, at least in C/C++, so I am assuming the same holds in Go.
Is there anything I can improve here w.r.t. map usage, so that I reduce memory utilization without losing performance when the main threads read from these maps?
I am using the concurrent map from this repo, which has no integer-key implementation.
Update
I am trying out cmap_int in my code:
type clientRepo struct {
	customers        *cmap.ConcurrentMap
	customersCatalog *cmap.ConcurrentMap
}

func NewClientRepository(logger log.Logger) (ClientRepository, error) {
	// ....
	customers := cmap.New[string]()
	customersCatalog := cmap.New[string]()
	r := &clientRepo{
		customers:        &customers,
		customersCatalog: &customersCatalog,
	}
	// ....
	return r, nil
}
But I am getting an error:
Cannot use '&products' (type *ConcurrentMap[V]) as the type *cmap.ConcurrentMap
What do I need to change in my clientRepo struct so that it works with the new version of the concurrent map, which uses generics?
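For what it's worth, the error message suggests the struct fields just need to carry the map's type parameter as well. A minimal sketch, assuming the single-type-parameter generic API implied by the error (string values, matching the New[string]() calls above):
type clientRepo struct {
	// The field type must name the same instantiation that cmap.New[string]() returns.
	customers        *cmap.ConcurrentMap[string]
	customersCatalog *cmap.ConcurrentMap[string]
}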
I don't know the implementation details of the concurrent map in Go, but if it's using a string as a key I'm guessing that behind the scenes it's storing both the string and a hash of the string (which will be used for actual indexing operations).
That is going to be something of a memory hog, and there'll be nothing that can be done about that as long as the concurrent map uses only strings for keys.
If there were some sort of map that did use integers, it'd likely be using hashes of those integers anyway. A smooth hash distribution is a necessary feature for good and uniform lookup performance, in the event that the key data itself is not uniformly distributed. It's almost like you need a very simple map implementation!
I'm wondering if a simple array would do, if your product IDs fit within 32 bits (or can be munged to do so, or down to some other acceptable integer length). Yes, that way you'd have a large amount of memory allocated, possibly with large tracts unused. However, indexing is super-rapid, and the OS's virtual memory subsystem would ensure that areas of the array you don't index aren't swapped in. Caveat: I'm thinking very much in terms of C and fixed-size objects here, less so Go, so this may be a bogus suggestion.
To persevere: so long as there's nothing about the array that implies initialisation-on-allocation (e.g. in C the array wouldn't get initialised by the compiler), allocation doesn't automatically mean it's all in memory all at once, and only the most commonly used areas of the array will be in RAM, courtesy of the OS's virtual memory subsystem.
EDIT
You could have a map of arrays, where each array covers a range of product IDs; see the sketch below. This would get close to the same effect, trading off storage of hashes and strings against storage of null references. If product IDs are clumped in some sort of structured way, this could work well.
Also, just a thought, and I'm showing a total lack of knowledge of Go here: does Go store objects by reference? In that case, wouldn't an array of objects actually be an array of references (so, fixed in size), with the actual objects allocated only as needed (i.e. a lot of the array being null references)? That doesn't sound good for my one-big-array suggestion...
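A minimal sketch of the map-of-arrays idea above (names and bucket size are illustrative; assumes non-negative IDs, and CustomerInfo stands in for the real definitions.CustomerInfo):
package main

import "fmt"

const bucketSize = 1024

type CustomerInfo struct{ ProductId int32 } // stand-in for the real type

// One bucket covers a contiguous range of IDs; unused slots are nil pointers.
type bucket [bucketSize]*CustomerInfo

// bucketMap is keyed by id / bucketSize, so it holds one entry per range
// instead of one entry per ID.
type bucketMap map[int32]*bucket

func (m bucketMap) set(id int32, v *CustomerInfo) {
	b, ok := m[id/bucketSize]
	if !ok {
		b = new(bucket)
		m[id/bucketSize] = b
	}
	b[id%bucketSize] = v
}

func (m bucketMap) get(id int32) *CustomerInfo {
	if b, ok := m[id/bucketSize]; ok {
		return b[id%bucketSize]
	}
	return nil
}

func main() {
	m := bucketMap{}
	m.set(123456, &CustomerInfo{ProductId: 123456})
	fmt.Println(m.get(123456).ProductId) // 123456
}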
The library you use is relatively simple, and you can just replace string with int32 (and modify the hashing function) and it will still work fine.
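For illustration, "modify the hashing function" could look like this for int32 keys. The Fibonacci-hashing constant and names are my choice, not something the library prescribes:
package main

import "fmt"

// shardCount must be a power of two for the bit-shift below.
const shardCount = 32

// getShard mixes the key with a multiplicative (Fibonacci) hash, then takes
// the top 5 bits to pick one of the 32 shards.
func getShard(key int32) uint32 {
	h := uint32(key) * 2654435761
	return h >> 27 // 32 - log2(shardCount)
}

func main() {
	fmt.Println(getShard(1), getShard(2), getShard(3)) // nearby keys scatter
}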
I ran a tiny (and not that rigorous) benchmark against the replaced version:
$ go test -bench=. -benchtime=10x -benchmem
goos: linux
goarch: amd64
pkg: maps
BenchmarkCMapAlloc-4 10 174272711 ns/op 49009948 B/op 33873 allocs/op
BenchmarkCMapAllocSS-4 10 369259624 ns/op 102535456 B/op 1082125 allocs/op
BenchmarkCMapUpdateAlloc-4 10 114794162 ns/op 0 B/op 0 allocs/op
BenchmarkCMapUpdateAllocSS-4 10 192165246 ns/op 16777216 B/op 1048576 allocs/op
BenchmarkCMap-4 10 1193068438 ns/op 5065 B/op 41 allocs/op
BenchmarkCMapSS-4 10 2195078437 ns/op 536874022 B/op 33554471 allocs/op
Benchmarks with the SS suffix are the original string version. So using integers as keys takes less memory and runs faster, as anyone would expect. The string version allocates about 50 more bytes per insertion. (This is not the actual memory usage, though.)
Basically, a string in Go is just a struct:
type stringStruct struct {
	str unsafe.Pointer
	len int
}
So on a 64-bit machine, it takes at least 8 bytes (pointer) + 8 bytes (length) + len(underlying bytes) bytes to store a string. Turning it into an int32 or int64 will definitely save memory. However, I assume that the CustomerInfo structs and the catalog sets take the most memory, so I don't think there will be a great improvement.
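Those header sizes are easy to check with unsafe.Sizeof (a quick sketch; the comments assume a 64-bit machine):
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	var s string
	fmt.Println(unsafe.Sizeof(s))        // 16: 8-byte pointer + 8-byte length
	fmt.Println(unsafe.Sizeof(int32(0))) // 4
	fmt.Println(unsafe.Sizeof(int64(0))) // 8
	// The string's underlying bytes come on top of its 16-byte header.
}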
(By the way, tuning the SHARD_COUNT in the library might also help a bit.)

Is there a standard way to write this tightly coupled for loop so that it doesn't burn up 100 percent of my CPU

I'm pretty new to Go, but I've been enjoying it so far. I have this one piece of code that burns up quite a bit of my CPU, and I was wondering if there is a more idiomatic, Go-oriented way to write such a loop. I think this kind of loop must be common, so there should be a better way to write it, maybe using Go's sync package.
The loop functionally looks something like this, given that es.SomeCondition is being manipulated by other threads...
type ExampleStruct struct {
	SomeCondition bool
	mu            sync.Mutex
}

func (es *ExampleStruct) runTimeOut() int {
	timeout := 800
	segmentsOfSleep := 20
	timeSleep := timeout / segmentsOfSleep
	for i := 0; i < segmentsOfSleep; i++ {
		es.mu.Lock()
		if es.SomeCondition {
			es.mu.Unlock()
			return 1
		}
		es.mu.Unlock()
		time.Sleep(time.Duration(timeSleep) * time.Millisecond)
	}
	return 0
}
The above code eats up my CPU and I can see why but I can't think of a solution.
Anybody have any thoughts?
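One idiomatic alternative (a sketch, not from the thread): signal the condition over a channel and wait with select, so nothing polls at all. The condMet field and the timings are illustrative:
package main

import (
	"fmt"
	"time"
)

type ExampleStruct struct {
	condMet chan struct{} // closed exactly once when the condition becomes true
}

func (es *ExampleStruct) runTimeOut() int {
	select {
	case <-es.condMet: // condition was signalled by another goroutine
		return 1
	case <-time.After(800 * time.Millisecond): // timed out waiting
		return 0
	}
}

func main() {
	es := &ExampleStruct{condMet: make(chan struct{})}
	go func() {
		time.Sleep(100 * time.Millisecond)
		close(es.condMet) // another goroutine flips the condition
	}()
	fmt.Println(es.runTimeOut()) // prints 1
}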

extra allocation when returning interface{} instead of int64

I have a function that generates a random int64 and returns it as an interface{} like this:
func Val1(rnd rand.Source) interface{} {
	return rnd.Int63()
}
Now consider this function, which does the same thing but returns an int64:
func Val2(rnd rand.Source) int64 {
	return rnd.Int63()
}
I benchmarked the two functions with this (go test -bench=. -benchmem):
func BenchmarkVal1(b *testing.B) {
	var rnd = rand.NewSource(time.Now().UnixNano())
	for n := 0; n < b.N; n++ {
		Val1(rnd)
	}
}

func BenchmarkVal2(b *testing.B) {
	var rnd = rand.NewSource(time.Now().UnixNano())
	for n := 0; n < b.N; n++ {
		Val2(rnd)
	}
}
and got the following results:
BenchmarkVal1-4 50000000 32.4 ns/op 8 B/op 1 allocs/op
BenchmarkVal2-4 200000000 7.47 ns/op 0 B/op 0 allocs/op
Where does the extra allocation in Val1() come from? Can it be avoided when returning an interface{}?
An interface value is a wrapper under the hood, a pair of the concrete value stored in the interface value and its type descriptor.
Read this for more information: The Laws of Reflection #The representation of an interface
So if you want to return a value of interface{} type, an interface{} value will be implicitly created (if the value being returned is not already of that type), which will hold the integer number and its type descriptor denoting the int64 type. You can't avoid this.
interface{} is a special interface type (it has 0 methods). An interface value itself is a two-word pair: for interface{} it holds the type descriptor and a pointer to the data, while non-empty interface types hold an itab pointer instead, which also identifies the static method set of the interface type (besides the dynamic type and value). The 8 B/op in the benchmark output is the heap-allocated copy of the int64 that the interface's data pointer refers to.
Also be sure to check out this informative answer: Go: What's the meaning of interface{}?
If you want more information about the implementation / internals, I recommend this post: How Interfaces Work in Golang
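A quick way to see where the allocation comes from is the compiler's escape-analysis output (a sketch; the file name is illustrative). Building with go build -gcflags='-m' box.go reports something like "rnd.Int63() escapes to heap" for Val1 and no such report for Val2:
package main

import "math/rand"

// Val1 boxes the int64 into an interface{}; the value escapes to the heap.
func Val1(rnd rand.Source) interface{} { return rnd.Int63() }

// Val2 returns the int64 directly; no allocation.
func Val2(rnd rand.Source) int64 { return rnd.Int63() }

func main() {
	src := rand.NewSource(1)
	_ = Val1(src)
	_ = Val2(src)
}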

Why does this program run faster when it's allocated fewer threads?

I have a fairly simple Go program designed to compute random Fibonacci numbers to test some strange behavior I observed in a worker pool I wrote. When I allocate one thread, the program finishes in 1.78s. When I allocate 4, it finishes in 9.88s.
The code is as follows:
var workerWG sync.WaitGroup

func worker(fibNum chan int) {
	for {
		var tgt = <-fibNum
		workerWG.Add(1)
		var a, b float64 = 0, 1
		for i := 0; i < tgt; i++ {
			a, b = a+b, a
		}
		workerWG.Done()
	}
}

func main() {
	rand.Seed(time.Now().UnixNano())
	runtime.GOMAXPROCS(1) // LINE IN QUESTION
	var fibNum = make(chan int)
	for i := 0; i < 4; i++ {
		go worker(fibNum)
	}
	for i := 0; i < 500000; i++ {
		fibNum <- rand.Intn(1000)
	}
	workerWG.Wait()
}
If I replace runtime.GOMAXPROCS(1) with 4, the program takes four times as long to run.
What's going on here? Why does adding more available threads to a worker pool slow the entire pool down?
My personal theory is that it has to do with the processing time of the worker being less than the overhead of thread management, but I'm not sure. My reservation is caused by the following test:
When I replace the worker function with the following code:
for {
	<-fibNum
	time.Sleep(500 * time.Millisecond)
}
both one available thread and four available threads take the same amount of time.
I revised your program to look like the following:
package main

import (
	"math/rand"
	"runtime"
	"sync"
	"time"
)

var workerWG sync.WaitGroup

func worker(fibNum chan int) {
	for tgt := range fibNum {
		var a, b float64 = 0, 1
		for i := 0; i < tgt; i++ {
			a, b = a+b, a
		}
	}
	workerWG.Done()
}

func main() {
	rand.Seed(time.Now().UnixNano())
	runtime.GOMAXPROCS(1) // LINE IN QUESTION
	var fibNum = make(chan int)
	for i := 0; i < 4; i++ {
		go worker(fibNum)
		workerWG.Add(1)
	}
	for i := 0; i < 500000; i++ {
		fibNum <- rand.Intn(100000)
	}
	close(fibNum)
	workerWG.Wait()
}
I cleaned up the wait group usage and changed rand.Intn(1000) to rand.Intn(100000).
On my machine that produces:
$ time go run threading.go (GOMAXPROCS=1)
real 0m20.934s
user 0m20.932s
sys 0m0.012s
$ time go run threading.go (GOMAXPROCS=8)
real 0m10.634s
user 0m44.184s
sys 0m1.928s
This means that in your original code, the work performed per synchronization (channel read/write) was negligible. The slowdown came from having to synchronize across threads instead of one, while only performing a very small amount of work in between.
In essence, synchronization is expensive compared to calculating Fibonacci numbers up to 1000. This is why people tend to discourage micro-benchmarks. Upping that number gives a better perspective. But an even better idea is to benchmark actual work being done, i.e. including IO, syscalls, processing, crunching, writing output, formatting, etc.
Edit: As an experiment, I upped the number of workers to 8 with GOMAXPROCS set to 8 and the result was:
$ time go run threading.go
real 0m4.971s
user 0m35.692s
sys 0m0.044s
The code written by @thwd is correct and idiomatic Go.
Your code was being serialized due to the atomic nature of sync.WaitGroup. Both workerWG.Add(1) and workerWG.Done() will block until they're able to atomically update the internal counter.
Since the workload is between 0 and 1000 loop iterations, the bottleneck of a single core was enough to keep contention on the waitgroup counter to a minimum.
On multiple cores, the processor spends a lot of time spinning to resolve collisions on the waitgroup calls. Add to that the fact that the waitgroup counter is kept on one core, and you have now added communication between cores (taking up even more cycles).
A couple of hints for simplifying the code (see the sketch after this list):
For a small, fixed number of goroutines, a completion channel (chan struct{}, to avoid allocations) is cheaper to use.
Use closing of the send channel as a kill signal for the goroutines, and have them signal that they've exited (via waitgroup or channel). Then close the completion channel to free them up for the GC.
If you need a waitgroup, aggressively minimize the number of calls to it. Those calls must be internally serialized, so extra calls force added synchronization.
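A minimal sketch of the completion-channel pattern from these hints (names are illustrative; the inner loop is the same Fibonacci workload as above):
package main

import "fmt"

func worker(jobs <-chan int, done chan<- struct{}) {
	for tgt := range jobs { // closing jobs is the kill signal
		var a, b float64 = 0, 1
		for i := 0; i < tgt; i++ {
			a, b = a+b, a
		}
	}
	done <- struct{}{} // signal exit; struct{} carries no data
}

func main() {
	const workers = 4
	jobs := make(chan int)
	done := make(chan struct{})
	for i := 0; i < workers; i++ {
		go worker(jobs, done)
	}
	for i := 0; i < 500000; i++ {
		jobs <- i % 1000
	}
	close(jobs) // tell workers to exit
	for i := 0; i < workers; i++ {
		<-done // wait for each worker without a WaitGroup
	}
	fmt.Println("all workers exited")
}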
Your main computation routine in worker does not allow the scheduler to run.
Calling the scheduler manually, like this:
for i := 0; i < tgt; i++ {
	a, b = a+b, a
	if i%300 == 0 {
		runtime.Gosched()
	}
}
reduces wall clock time by 30% when switching from one to two threads.
Such artificial microbenchmarks are really hard to get right.

What do the return values of node.js process.memoryUsage() stand for?

From the official documentation (source):
process.memoryUsage()
Returns an object describing the memory usage of the Node process
measured in bytes.
var util = require('util');
console.log(util.inspect(process.memoryUsage()));
This will generate:
{ rss: 4935680, heapTotal: 1826816, heapUsed: 650472 }
heapTotal and heapUsed refer to V8's memory usage.
Exactly what do rss, heapTotal, and heapUsed stand for?
It might seem like a trivial question, but I've been looking and I could not find a clear answer so far.
In order to answer this question, one has to understand V8’s Memory Scheme first.
A running program is always represented through some space allocated in memory. This space is called Resident Set. V8 uses a scheme similar to the Java Virtual Machine and divides the memory into segments:
Code: the actual code being executed
Stack: contains all value types (primitives like integer or Boolean) with pointers referencing objects on the heap and pointers defining the control flow of the program
Heap: a memory segment dedicated to storing reference types like objects, strings and closures.
Now it is easy to answer the question:
rss: Resident Set Size
heapTotal: Total Size of the Heap
heapUsed: Heap actually Used
Ref: http://apmblog.dynatrace.com/2015/11/04/understanding-garbage-collection-and-hunting-memory-leaks-in-node-js/
RSS is the resident set size, the portion of the process's memory held in RAM (as opposed to the swap space or the part held in the filesystem).
The heap is the portion of memory that newly allocated objects come from (think of malloc in C, or new in JavaScript).
You can read more about the heap at Wikipedia.
The Node.js documentation describes it as follows:
heapTotal and heapUsed refer to V8's memory usage. external refers to
the memory usage of C++ objects bound to JavaScript objects managed by
V8. rss, Resident Set Size, is the amount of space occupied in the
main memory device (that is a subset of the total allocated memory)
for the process, which includes the heap, code segment and stack.
All mentioned values are expressed in bytes. So, if you just want to print them, you probably want to rescale them to MB:
const used = process.memoryUsage();
for (let key in used) {
  console.log(`Memory: ${key} ${Math.round(used[key] / 1024 / 1024 * 100) / 100} MB`);
}
That will give you an output like:
Memory: rss 522.06 MB
Memory: heapTotal 447.3 MB
Memory: heapUsed 291.71 MB
Memory: external 0.13 MB
Let's do this with an example
The following example shows how an increase in memory usage will actually increase rss and heapTotal:
const numeral = require('numeral');
let m = new Map();
for (let i = 0; i < 100000; i++) {
  m.set(i, i);
  if (i % 10000 === 0) {
    const { rss, heapTotal } = process.memoryUsage();
    console.log('rss', numeral(rss).format('0.0 ib'), heapTotal, numeral(heapTotal).format('0.0 ib'));
  }
}
Running the above will give you something like this:
rss 22.3 MiB 4734976 4.5 MiB
rss 24.2 MiB 6483968 6.2 MiB
rss 27.6 MiB 9580544 9.1 MiB
rss 27.6 MiB 9580544 9.1 MiB
rss 29.3 MiB 11419648 10.9 MiB
rss 29.3 MiB 11419648 10.9 MiB
rss 29.3 MiB 11419648 10.9 MiB
rss 32.8 MiB 15093760 14.4 MiB
rss 32.9 MiB 15093760 14.4 MiB
rss 32.9 MiB 15093760 14.4 MiB
This clearly shows how a continuously growing structure increases the space it requires, which is reflected in heapTotal and, correspondingly, in the Resident Set Size (rss).
RSS
RSS is a reasonable measure for the "total memory usage of the Node.js interpreter process". You simply won't be able to run your program if that goes above the available RAM. Note however that it excludes some types of memory, so the actual memory consumption on a server that just runs a single process could be higher (VSZ is the worst case).
The concept of RSS is defined in the Linux kernel itself as mentioned at: What is RSS and VSZ in Linux memory management and measures the total memory usage of the process. This value can therefore be measured by external programs such as ps without knowledge of Node.js internals, e.g. as shown at: Retrieve CPU usage and memory usage of a single process on Linux?
heapTotal and heapUsed
These are concepts internal to the Node.js implementation. It would be good to look at the V8 source code to understand them more precisely; notably, I wonder if V8 just obtains those values from glibc with functions such as those mentioned at: API call to get current heap size of process? or if it does its own heap management on top of that.
For the concept of a heap in general, see also: What and where are the stack and heap? and What is the function of the push / pop instructions used on registers in x86 assembly? The heap is overwhelmingly likely to take the majority of memory in a JavaScript program; I don't think you will ever bother to look for that memory elsewhere (besides perhaps typed arrays, which show up separately under process.memoryUsage()).
Runnable test
The following code example can be used to do simple tests which I have tried to analyze at: https://cirosantilli.com/javascript-memory-usage-benchmark But unlike languages without garbage collection like C++, it is very difficult to predict why memory usage is so overblown sometimes, especially when we have smaller numbers of objects. I'm not sure other garbage collected languages do any better though.
You have to run the program with:
node --expose-gc main.js
main.js
#!/usr/bin/env node

// CLI arguments.
let arr = false
let array_buffer = false
let dealloc = false
let klass = false
let obj = false
let n = 1000000
let objn = 0
for (let i = 2; i < process.argv.length; i++) {
  switch (process.argv[i]) {
    case 'arr':
      arr = true
      break
    case 'array-buffer':
      array_buffer = true
      break
    case 'class':
      klass = true
      break
    case 'dealloc':
      dealloc = true
      break
    case 'obj':
      obj = true
      break
    case 'n':
      i++
      n = parseInt(process.argv[i], 10)
      break
    case 'objn':
      i++
      objn = parseInt(process.argv[i], 10)
      break
    default:
      console.error(`unknown option: ${process.argv[i]}`)
      break
  }
}

class MyClass {
  constructor(a, b) {
    this.a = a
    this.b = b
  }
}

let a
if (array_buffer) {
  a = new Int32Array(new ArrayBuffer(n * 4))
  for (let i = 0; i < n; i++) {
    a[i] = i
  }
} else if (obj) {
  a = []
  for (let i = 0; i < n; i++) {
    a.push({ a: i, b: -i })
  }
} else if (objn) {
  a = []
  for (let i = 0; i < n; i++) {
    const obj = {}
    for (let j = 0; j < objn; j++) {
      obj[String.fromCharCode(65 + j)] = i
    }
    a.push(obj)
  }
} else if (klass) {
  a = []
  for (let i = 0; i < n; i++) {
    a.push(new MyClass(i, -i))
  }
} else if (arr) {
  a = []
  for (let i = 0; i < n; i++) {
    a.push([i, -i])
  }
} else {
  a = []
  for (let i = 0; i < n; i++) {
    a.push(i)
  }
}
if (dealloc) {
  a = undefined
}
let j
while (true) {
  if (!dealloc) {
    j = 0
    // The collector somehow removes a if we don't reference it here.
    for (let i = 0; i < n; i++) {
      if (obj || klass) {
        j += a[i].a + a[i].b
      } else if (objn) {
        const obj = a[i]
        for (let k = 0; k < objn; k++) {
          j += obj[String.fromCharCode(65 + k)]
        }
      } else if (arr) {
        j += a[i][0] + a[i][1]
      } else {
        j += a[i]
      }
    }
    console.error(j)
  }
  global.gc()
  console.error(process.memoryUsage())
}
Some things we learn on Node 16 on Ubuntu 21.10:
with node --expose-gc main.js n 1 we see that the minimum RSS is 30 MiB and the minimum heapUsed is 3.7 MB. For comparison, RSS for a C hello world on the same system is 770 kB.
