GoLang - memory allocation - []byte vs string

In the below code:
c := "fool"
d := []byte("fool")
fmt.Printf("c: %T, %d\n", c, unsafe.Sizeof(c)) // 16 bytes
fmt.Printf("d: %T, %d\n", d, unsafe.Sizeof(d)) // 24 bytes
To decide which data type to use for receiving JSON data from CloudFoundry, I am testing the sample code above to understand the memory allocation of []byte versus string.
I expected the size of the string variable c to be 1 byte x 4 ASCII-encoded letters = 4 bytes, but the reported size is 16 bytes.
For the []byte variable d, Go embeds the string in the executable as a string literal and converts it to a byte slice at run time using the runtime.stringtoslicebyte function, giving something like []byte{102, 111, 111, 108}.
I again expected the size of d to be 1 byte x 4 ASCII values = 4 bytes, but the reported size is 24 bytes, which I assumed was the capacity of its underlying array.
Why is the size of both variables not 4 bytes?

Both slices and strings in Go are struct-like headers:
reflect.SliceHeader:
type SliceHeader struct {
Data uintptr
Len int
Cap int
}
reflect.StringHeader:
type StringHeader struct {
Data uintptr
Len int
}
The sizes reported by unsafe.Sizeof() are the sizes of these headers, excluding the size of the pointed arrays:
Sizeof takes an expression x of any type and returns the size in bytes of a hypothetical variable v as if v was declared via var v = x. The size does not include any memory possibly referenced by x. For instance, if x is a slice, Sizeof returns the size of the slice descriptor, not the size of the memory referenced by the slice.
To get the actual ("recursive") size of some arbitrary value, use Go's builtin testing and benchmarking framework. For details, see How to get memory size of variable in Go?
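As a rough sketch of the benchmarking approach mentioned above, the testing package can be driven from an ordinary program through testing.Benchmark. The helper name allocatedBytesPerOp and the sink variable are illustrative assumptions, not part of the linked answer.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so the compiler cannot keep the allocation on
// the stack or optimize it away.
var sink []byte

// allocatedBytesPerOp (hypothetical helper) reports the average number of
// heap bytes allocated per call of f, as measured by testing.Benchmark.
func allocatedBytesPerOp(f func()) int64 {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			f()
		}
	})
	return res.AllocedBytesPerOp()
}

func main() {
	n := allocatedBytesPerOp(func() { sink = make([]byte, 1024) })
	fmt.Println("heap bytes allocated per op:", n)
}
```

The figure includes whatever the allocator actually hands out, so it can be larger than the requested size.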
For strings specifically, see String memory usage in Golang. The complete memory required by a string value can be computed like this:
var str string = "some string"
stringSize := len(str) + int(unsafe.Sizeof(str))
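To make the header/data split concrete, here is a minimal sketch that adds the header size to the size of the referenced data. The helper names stringTotalSize and byteSliceTotalSize are made up for this illustration, and using cap() for the slice assumes the whole backing array should be counted.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringTotalSize returns the string header size plus the byte length
// of the data the header points to.
func stringTotalSize(s string) int {
	return int(unsafe.Sizeof(s)) + len(s)
}

// byteSliceTotalSize returns the slice header size plus the capacity
// of the backing array (which may exceed the length).
func byteSliceTotalSize(b []byte) int {
	return int(unsafe.Sizeof(b)) + cap(b)
}

func main() {
	c := "fool"
	d := []byte("fool")
	fmt.Println("string header:", unsafe.Sizeof(c), "len:", len(c), "total:", stringTotalSize(c))
	fmt.Println("slice header:", unsafe.Sizeof(d), "len:", len(d), "cap:", cap(d), "total:", byteSliceTotalSize(d))
}
```

The capacity chosen for []byte("fool") is up to the runtime, so the slice total may be larger than header + 4.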

Related

String memory usage in Golang

I was optimizing code that uses a map[string]string where each value was only ever "A" or "B". So I thought a map[string]bool would obviously be much better, as the map holds around 50 million elements.
var a = "a"
var a2 = "Why This ultra long string take the same amount of space in memory as 'a'"
var b = true
var c = make(map[string]string) // must be initialized: assigning to a nil map panics
var d = make(map[string]bool)
c["t"] = "A"
d["t"] = true
fmt.Printf("a: %T, %d\n", a, unsafe.Sizeof(a))
fmt.Printf("a2: %T, %d\n", a2, unsafe.Sizeof(a2))
fmt.Printf("b: %T, %d\n", b, unsafe.Sizeof(b))
fmt.Printf("c: %T, %d\n", c, unsafe.Sizeof(c))
fmt.Printf("d: %T, %d\n", d, unsafe.Sizeof(d))
fmt.Printf("c2: %T, %d\n", c, unsafe.Sizeof(c["t"]))
fmt.Printf("d2: %T, %d\n", d, unsafe.Sizeof(d["t"]))
And the result was:
a: string, 8
a2: string, 8
b: bool, 1
c: map[string]string, 4
d: map[string]bool, 4
c2: map[string]string, 8
d2: map[string]bool, 1
While testing I found something weird: why does a2, with a really long string, use 8 bytes, the same as a, which has only one letter?
unsafe.Sizeof() does not recursively go into data structures, it just reports the "shallow" size of the value passed. Quoting from its doc:
The size does not include any memory possibly referenced by x. For instance, if x is a slice, Sizeof returns the size of the slice descriptor, not the size of the memory referenced by the slice.
Maps in Go are implemented as pointers, so unsafe.Sizeof(somemap) will report the size of that pointer.
Strings in Go are just headers containing a pointer and a length. See reflect.StringHeader:
type StringHeader struct {
Data uintptr
Len int
}
So unsafe.Sizeof(somestring) will report the size of the above struct, which is independent of the length of the string value (which is the value of the Len field).
To get the actual memory requirement of a map ("deeply"), see How much memory do golang maps reserve? and also How to get memory size of variable in Go?
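A small sketch of the "shallow size" behaviour described above: the reported size of a string or a map does not change with its contents. The helper shallowSizes is a name invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// shallowSizes (hypothetical helper) returns what unsafe.Sizeof reports
// for a string, a map, and a pointer.
func shallowSizes(s string, m map[string]bool) (str, mp, ptr uintptr) {
	return unsafe.Sizeof(s), unsafe.Sizeof(m), unsafe.Sizeof(&s)
}

func main() {
	small := map[string]bool{}
	big := make(map[string]bool)
	for i := 0; i < 1000; i++ {
		big[fmt.Sprint(i)] = true
	}

	s1, m1, p := shallowSizes("a", small)
	s2, m2, _ := shallowSizes("a much, much longer string value", big)

	fmt.Println("string:", s1, s2) // identical: only the header is measured
	fmt.Println("map:   ", m1, m2) // identical: only the pointer is measured
	fmt.Println("ptr:   ", p)
}
```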
Go stores the UTF-8 encoded byte sequences of string values in memory. The builtin function len() reports the byte-length of a string, so basically the memory required to store a string value in memory is:
var str string = "some string"
stringSize := len(str) + int(unsafe.Sizeof(str))
Also don't forget that a string value may be constructed by slicing another, bigger string, and thus even if the original string is no longer referenced (and thus no longer needed), the bigger backing array will still be required to be kept in memory for the smaller string slice.
For example:
s := "some loooooooong string"
s2 := s[:2]
Here, even though the memory requirement for s2 would be len(s2) + unsafe.Sizeof(s2) = 2 + unsafe.Sizeof(s2), the whole backing array of s will still be retained.
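The sharing can be observed directly. This sketch assumes Go 1.20 or later (for unsafe.StringData) and uses strings.Clone, available since Go 1.18, to make an independent copy. The helper names sharesData and detach are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sharesData reports whether a and b begin at the same memory address.
func sharesData(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

// detach returns a copy of s with its own backing array, so the
// original, larger string is no longer kept alive by it.
func detach(s string) string {
	return strings.Clone(s)
}

func main() {
	s := "some loooooooong string"
	s2 := s[:2]      // still points into s's backing array
	s3 := detach(s2) // fresh 2-byte allocation

	fmt.Println(sharesData(s, s2)) // true
	fmt.Println(sharesData(s, s3)) // false
	fmt.Println(s2 == s3)          // true: same contents
}
```

Copying trades a small allocation for letting the large string be garbage collected once nothing else refers to it.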

Returning string from a remote server using rpcgen

I am going through an RPC tutorial and learning a few rpcgen techniques. I understand how to add and multiply values of different data types using rpcgen.
But I have not found any clue as to how to declare a function in a .x file that returns a string. I am trying to build a procedure that returns a random string (the array of random strings lives on the server).
Can anyone advise me how to proceed? A pointer to any tutorial covering this string/pointer return issue would also be helpful.
Thank you in advance.
Ok, answering to the original question (more than 2 years old), the first answer is correct but a little tricky.
In your .x file, you define your structure with the string inside, having defined previously the size of the string:
typedef string str_t<255>;
struct my_result {
str_t data;
};
...
Then you invoke rpcgen on your .x file to generate client and server stubs and .xdr file:
$ rpcgen -N file.x
Now you can compile the client and server, in addition to any program where you intend to use the remote functions. To do so, I followed the "rpcgen Tutorial" on Oracle's web page:
https://docs.oracle.com/cd/E19683-01/816-1435/rpcgenpguide-21470/index.html
The tricky part is that, although you defined a string of size m (an array of m characters), what rpcgen and the .xdr file create is a pointer to allocated memory. Something like this:
.h file
typedef char *str_t;
struct my_result {
int res;
str_t data;
};
typedef struct my_result my_result;
.xdr file
bool_t xdr_str_t (XDR *xdrs, str_t *objp)
{
register int32_t *buf;
if (!xdr_string (xdrs, objp, 255))
return FALSE;
return TRUE;
}
So keep in mind when using this structure on the server side that it is not a string of size m but a char pointer, for which you have to reserve memory before using it, or you will get the same error I did on execution:
Segmentation fault!
To use it on the server you can write:
static my_result response;
static char text[255];
memset(&response, '\0', sizeof(my_result));
memset(text, '\0', sizeof(text));
response.data = text;
And from there you are ready to use it wisely! :)
According to the XDR protocol specification you can define a string type where m is the length of the string in bytes:
The standard defines a string of n (numbered 0 to n -1) bytes to be the number n encoded as an unsigned integer (as described above), and followed by the n bytes of the string. Each byte must be regarded by the implementation as being 8-bit transparent data. This allows use of arbitrary character set encodings. Byte m of the string always precedes byte m +1 of the string, and byte 0 of the string always follows the string's length. If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count a multiple of four.
string object<m>;
You can then define a struct with the string type str_t as one of the variables:
typedef string str_t<255>;
struct my_result {
str_t data;
};
Then in your .x file you can define an RPC in your program which returns a struct of type my_result. Since rpcgen will give you a pointer to this struct (which I have called res), you can print the message with printf("%s\n", res->data);.
program HELLO_PROG {
version HELLO_VERSION {
my_result abc() = 1;
} = 1;
} = 1000;
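To illustrate the XDR string layout quoted above (length, then the bytes, then zero padding to a multiple of four), here is a minimal sketch written in Go rather than C. The function name encodeXDRString is an assumption for this example, and it is not what rpcgen generates. The big-endian 4-byte length follows the XDR definition of an unsigned integer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeXDRString encodes s as an XDR string: a 4-byte big-endian
// unsigned length n, the n bytes of s, then 0 to 3 zero bytes so the
// total is a multiple of four.
func encodeXDRString(s string) []byte {
	n := len(s)
	pad := (4 - n%4) % 4
	buf := make([]byte, 4, 4+n+pad)
	binary.BigEndian.PutUint32(buf, uint32(n))
	buf = append(buf, s...)
	buf = append(buf, make([]byte, pad)...)
	return buf
}

func main() {
	fmt.Printf("% x\n", encodeXDRString("fool"))  // 4 length bytes + 4 data bytes
	fmt.Printf("% x\n", encodeXDRString("hello")) // 4 length bytes + 5 data bytes + 3 padding bytes
}
```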

How to calculate actual memory used by string variable?

Strings in Delphi are located in dynamic memory.
How do I calculate the actual memory (in bytes) used by a string variable?
I know a string must store some additional information, at least a reference count and a length, but how many bytes does it use besides the characters?
var
S: string;
Delphi 2010, XE, XE2 used
The layout on 32 bit UNICODE DELPHI taken from official Embarcadero documentation is like this:
Note that there's an additional longint field in the 64 bit version for 16 byte alignment. The StrRec record in 'system.pas' looks like this:
StrRec = packed record
{$IF defined(CPUX64)}
_Padding: LongInt; // Make 16 byte align for payload..
{$IFEND}
codePage: Word;
elemSize: Word;
refCnt: Longint;
length: Longint;
end;
The payload is always 2*(Length+1) in size. The overhead is 12 or 16 bytes, for 32 or 64 bit targets. Note that the actual memory block may be larger than needed as determined by the memory manager.
Finally, there has been much mis-information in this question. On 64 bit targets, strings are still indexed by 32 bit signed integers.
For String specifically, you can use SysUtils.ByteLength() to get the byte length of the character data, and if not zero then increment the result by SizeOf(System.StrRec) (which is the header in front of the character data) and SizeOf(Char) (for the null-terminator that is not included in the length), eg:
var
S: string;
len: Integer;
begin
S := ...;
len := ByteLength(s);
if len > 0 then Inc(len, SizeOf(StrRec) + SizeOf(Char));
end;
On the other hand, if you want to calculate the byte size of other string types, like AnsiString, AnsiString(N) (such as UTF8String), RawByteString, etc, you need to use System.StringElementSize() instead, eg:
var
S: SomeStringType;
len: Integer;
begin
S := ...;
len := Length(S) * StringElementSize(S);
if len > 0 then Inc(len, SizeOf(StrRec) + StringElementSize(s));
end;
In either case, the reason you only increment the length if the string has characters in it is because empty strings do not take up any memory at all, they are nil pointers.
To answer the question:
How to calculate actual memory (in bytes) used by string variable?
MemSize = Overhead + CharSize * (Length + 1)
CharSize = 1 // for Ansi strings
CharSize = 2 // for Unicode strings
Overhead = 8 // for 32 bit strings
Overhead = 16 // for 64 bit strings
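The formula above is simple arithmetic. Here it is worked through as a sketch in Go rather than Delphi. The function name stringMemSize is made up for the example, and the overhead is a parameter because the answers in this thread quote different header sizes. The values 12 and 16 used in main are the ones given earlier for the Unicode StrRec on 32-bit and 64-bit targets.

```go
package main

import "fmt"

// stringMemSize applies MemSize = Overhead + CharSize * (Length + 1).
// An empty string is a nil pointer and occupies no heap memory.
func stringMemSize(overhead, charSize, length int) int {
	if length == 0 {
		return 0
	}
	return overhead + charSize*(length+1)
}

func main() {
	// 10-character Unicode string (2 bytes per character).
	fmt.Println(stringMemSize(12, 2, 10)) // 12 + 2*11 = 34
	fmt.Println(stringMemSize(16, 2, 10)) // 16 + 2*11 = 38
}
```

The memory manager may still round the block up, so this is a lower bound on what is actually reserved.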

Converting an int or String to a char array on Arduino

I am getting an int value from one of the analog pins on my Arduino. How do I concatenate this to a String and then convert the String to a char[]?
It was suggested that I try char msg[] = myString.getChars();, but I am receiving a message that getChars does not exist.
To convert and append an integer, use operator += (or member function concat):
String stringOne = "A long integer: ";
stringOne += 123456789;
To get the string as type char[], use toCharArray():
char charBuf[50];
stringOne.toCharArray(charBuf, 50);
In the example, there is only space for 49 characters (presuming it is terminated by null). You may want to make the size dynamic.
Overhead
The cost of bringing in String (it is not included if not used anywhere in the sketch), is approximately 1212 bytes of program memory (flash) and 48 bytes RAM.
This was measured using Arduino IDE version 1.8.10 (2019-09-13) for an Arduino Leonardo sketch.
Risk
There must be sufficient free RAM available. Otherwise, the result may be lockup/freeze of the application or other strange behaviour (UB).
Just as a reference, below is an example of how to convert between String and char[] with a dynamic length -
// Define
String str = "This is my string";
// Length (with one extra character for the null terminator)
int str_len = str.length() + 1;
// Prepare the character array (the buffer)
char char_array[str_len];
// Copy it over
str.toCharArray(char_array, str_len);
Yes, this is painfully obtuse for something as simple as a type conversion, but somehow it's the easiest way.
You can convert it to char* if you don't need a modifiable string by using:
(char*) yourString.c_str();
This would be very useful when you want to publish a String variable via MQTT in arduino.
None of that stuff worked. Here's a much simpler way .. the label str is the pointer to what IS an array...
String str = String(yourNumber, DEC); // Obviously .. get your int or byte into the string
str = str + '\r' + '\n'; // Add the required carriage return, optional line feed
byte str_len = str.length();
// Get the length of the whole lot .. C will kindly
// place a null at the end of the string which makes
// it by default an array[].
// The [0] element is the highest digit... so we
// have a separate place counter for the array...
byte arrayPointer = 0;
while (str_len)
{
// I was outputting the digits to the TX buffer
if ((UCSR0A & (1<<UDRE0))) // Is the TX buffer empty?
{
UDR0 = str[arrayPointer];
--str_len;
++arrayPointer;
}
}
With all the answers here, I'm surprised no one has brought up using itoa already built in.
It inserts the string representation of the integer into the given pointer.
int a = 4625;
char cStr[5]; // number of digits + 1 for null terminator
itoa(a, cStr, 10); // int value, pointer to string, base number
Or if you're unsure of the length of the string:
int b = 80085;
int len = String(b).length();
char cStr[len + 1]; // String.length() does not include the null terminator
itoa(b, cStr, 10); // or you could use String(b).toCharArray(cStr, len + 1);

Sizeof struct in Go

I'm having a look at Go, which looks quite promising.
I am trying to figure out how to get the size of a go struct, for
example something like
type Coord3d struct {
X, Y, Z int64
}
Of course I know that it's 24 bytes, but I'd like to know it programmatically..
Do you have any ideas how to do this ?
Roger already showed how to use the Sizeof function from the unsafe package. Make sure you read this before relying on the value returned by the function:
The size does not include any memory possibly referenced by x. For
instance, if x is a slice, Sizeof returns the size of the slice
descriptor, not the size of the memory referenced by the slice.
In addition to this I wanted to explain how you can easily calculate the size of any struct using a couple of simple rules. And then how to verify your intuition using a helpful service.
The size depends on the types it consists of and the order of the fields in the struct (because different padding will be used). This means that two structs with the same fields can have different size.
For example this struct will have a size of 32
struct {
a bool
b string
c bool
}
and a slight modification will have a size of 24 (a 25% difference just due to a more compact ordering of fields)
struct {
a bool
c bool
b string
}
In the second example, one of the paddings is removed by moving a field so that it takes advantage of the padding that already follows the first field. An alignment can be 1, 2, 4, or 8 bytes. Padding is the space inserted after a field so that the next field starts at an address matching its alignment (basically wasted space).
Knowing this rule and remembering that:
bool, int8/uint8 take 1 byte
int16, uint16 - 2 bytes
int32, uint32, float32 - 4 bytes
int64, uint64, float64, pointer - 8 bytes
string - 16 bytes (2 alignments of 8 bytes)
any slice takes 24 bytes (3 alignments of 8 bytes). So []bool, [][][]string are the same (do not forget to reread the citation I added in the beginning)
an array of length n takes n times the size of its element type.
Armed with the knowledge of padding, alignment and sizes in bytes, you can quickly figure out how to improve your struct (but still it makes sense to verify your intuition using the service).
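The effect of field order can be checked with unsafe.Sizeof and unsafe.Offsetof. This is a minimal sketch; the type names loose and compact and the helper structSizes are invented for the example. The byte counts quoted in the text (32 and 24) assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// loose has the string between the two bools, so each bool is
// followed by padding.
type loose struct {
	a bool
	b string
	c bool
}

// compact places both bools first, so they share one padded word.
type compact struct {
	a bool
	c bool
	b string
}

// structSizes returns the sizes of the two layouts.
func structSizes() (looseSize, compactSize uintptr) {
	return unsafe.Sizeof(loose{}), unsafe.Sizeof(compact{})
}

func main() {
	var l loose
	var c compact
	fmt.Println("loose:  ", unsafe.Sizeof(l), "offsets b, c:", unsafe.Offsetof(l.b), unsafe.Offsetof(l.c))
	fmt.Println("compact:", unsafe.Sizeof(c), "offsets c, b:", unsafe.Offsetof(c.c), unsafe.Offsetof(c.b))
}
```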
import "unsafe"
/* Structure describing an inotify event. */
type INotifyInfo struct {
Wd int32 // Watch descriptor
Mask uint32 // Watch mask
Cookie uint32 // Cookie to synchronize two events
Len uint32 // Length (including NULs) of name
}
func doSomething() {
var info INotifyInfo
const infoSize = unsafe.Sizeof(info)
...
}
NOTE: The OP is mistaken. The unsafe.Sizeof does return 24 on the example Coord3d struct. See comment below.
binary.TotalSize was also an option in early (pre-Go 1) releases; in current Go the corresponding function is binary.Size. Note there's a slight difference in behavior between that and unsafe.Sizeof: binary.Size reports the number of bytes the encoded value occupies (without padding, and for a slice of fixed-size values it counts the slice contents), while unsafe.Sizeof only returns the size of the top level descriptor. Here's an example of how to use it.
package main
import (
"encoding/binary"
"fmt"
)
type T struct {
a uint32
b int8
}
func main() {
var t T
s := binary.Size(t) // encoded size: 4 + 1 bytes, no padding
fmt.Println(s)
}
This is subject to change but last I looked there is an outstanding compiler bug (bug260.go) related to structure alignment. The end result is that packing a structure might not give the expected results. That was for compiler 6g version 5383 release.2010-04-27 release. It may not be affecting your results, but it's something to be aware of.
UPDATE: The only bug left in go test suite is bug260.go, mentioned above, as of release 2010-05-04.
Hotei
To avoid the overhead of initializing a structure, it would be faster to use a pointer to Coord3d:
package main
import (
"fmt"
"unsafe"
)
type Coord3d struct {
X, Y, Z int64
}
func main() {
var dummy *Coord3d
fmt.Printf("sizeof(Coord3d) = %d\n", unsafe.Sizeof(*dummy))
}
/*
returns the number of bytes v occupies when gob-encoded
(requires the "bytes" and "encoding/gob" packages)
*/
func getRealSizeOf(v interface{}) (int, error) {
b := new(bytes.Buffer)
if err := gob.NewEncoder(b).Encode(v); err != nil {
return 0, err
}
return b.Len(), nil
}
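The approaches in this thread measure different things. As a rough side-by-side sketch: unsafe.Sizeof gives the in-memory size including padding, binary.Size gives the fixed-size binary encoding, and a gob encoding also carries type information. The helper names fixedSize and gobSize are assumptions for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/gob"
	"fmt"
	"unsafe"
)

type Coord3d struct {
	X, Y, Z int64
}

// fixedSize wraps binary.Size, which returns -1 for values that are
// not fixed-size.
func fixedSize(v interface{}) int {
	return binary.Size(v)
}

// gobSize returns the length of v's gob encoding, which includes a
// description of the type as well as the data.
func gobSize(v interface{}) (int, error) {
	var b bytes.Buffer
	if err := gob.NewEncoder(&b).Encode(v); err != nil {
		return 0, err
	}
	return b.Len(), nil
}

func main() {
	p := Coord3d{1, 2, 3}
	fmt.Println("unsafe.Sizeof:", unsafe.Sizeof(p)) // three int64 fields: 24
	fmt.Println("binary.Size:  ", fixedSize(p))     // 3 * 8 = 24

	n, err := gobSize(p)
	if err != nil {
		fmt.Println("gob error:", err)
		return
	}
	fmt.Println("gob length:   ", n)
}
```

The gob figure depends on the values stored and on the type description, so it is a serialized size rather than a memory size.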

Resources