I'm having a look at Go, which looks quite promising.
I am trying to figure out how to get the size of a Go struct, for example something like
type Coord3d struct {
X, Y, Z int64
}
Of course I know that it's 24 bytes, but I'd like to know it programmatically.
Do you have any ideas how to do this?
Roger already showed how to use the Sizeof function from the unsafe package. Make sure you read this before relying on the value it returns:
The size does not include any memory possibly referenced by x. For
instance, if x is a slice, Sizeof returns the size of the slice
descriptor, not the size of the memory referenced by the slice.
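For reference, basic usage looks like this (a minimal sketch):
package main

import (
    "fmt"
    "unsafe"
)

type Coord3d struct {
    X, Y, Z int64
}

func main() {
    var c Coord3d
    fmt.Println(unsafe.Sizeof(c)) // 24: three 8-byte int64 fields, no padding needed
}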
In addition to this, I want to explain how you can easily calculate the size of any struct using a couple of simple rules, and then how to verify your intuition programmatically.
The size depends on the types the struct consists of and on the order of its fields (because different padding will be used). This means that two structs with the same fields can have different sizes.
For example, on a 64-bit platform, this struct has a size of 32 bytes:
struct {
a bool
b string
c bool
}
and a slight modification has a size of 24 bytes (a 25% reduction, just from a more compact ordering of fields):
struct {
a bool
c bool
b string
}
In the second example, one run of padding disappears because the two bool fields are packed next to each other before the string, reusing space that was previously wasted. A field's alignment can be 1, 2, 4, or 8 bytes, and padding is the space inserted to satisfy that alignment (essentially wasted space).
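You can verify both layouts yourself; a minimal sketch (the type names loose and compact are mine; sizes are for 64-bit platforms):
package main

import (
    "fmt"
    "unsafe"
)

type loose struct {
    a bool
    b string
    c bool
}

type compact struct {
    a bool
    c bool
    b string
}

func main() {
    fmt.Println(unsafe.Sizeof(loose{}))   // 32: 1 + 7 padding + 16 + 1 + 7 padding
    fmt.Println(unsafe.Sizeof(compact{})) // 24: 1 + 1 + 6 padding + 16
}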
Knowing this rule and remembering that, on a 64-bit platform:
bool, int8 and uint8 take 1 byte
int16 and uint16 take 2 bytes
int32, uint32 and float32 take 4 bytes
int64, uint64, float64 and pointers take 8 bytes
string takes 16 bytes (two 8-byte words: a pointer and a length)
any slice takes 24 bytes (three 8-byte words: pointer, length and capacity), so []bool and [][][]string are the same size (do not forget to reread the citation I added at the beginning)
an array of length n takes n times the size of its element type (the sketch below verifies these numbers)
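A quick check of this table, assuming a 64-bit platform:
package main

import (
    "fmt"
    "unsafe"
)

func main() {
    fmt.Println(unsafe.Sizeof(false))      // 1
    fmt.Println(unsafe.Sizeof(int16(0)))   // 2
    fmt.Println(unsafe.Sizeof(float32(0))) // 4
    fmt.Println(unsafe.Sizeof(int64(0)))   // 8
    fmt.Println(unsafe.Sizeof(""))         // 16: pointer + length
    fmt.Println(unsafe.Sizeof([]bool{}))   // 24: pointer + length + capacity
    fmt.Println(unsafe.Sizeof([4]int64{})) // 32: 4 * 8
}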
Armed with this knowledge of padding, alignment and sizes in bytes, you can quickly figure out how to improve your struct (though it still makes sense to verify your intuition with unsafe.Sizeof).
import "unsafe"
/* Structure describing an inotify event. */
type INotifyInfo struct {
Wd int32 // Watch descriptor
Mask uint32 // Watch mask
Cookie uint32 // Cookie to synchronize two events
Len uint32 // Length (including NULs) of name
}
func doSomething() {
var info INotifyInfo
const infoSize = unsafe.Sizeof(info)
// ... use infoSize ...
}
NOTE: unsafe.Sizeof does return 24 on the example Coord3d struct, contrary to what an earlier version of this answer claimed. See the comments below.
binary.TotalSize is also an option, but note there's a slight difference in behavior between it and unsafe.Sizeof: binary.TotalSize includes the size of the contents of slices, while unsafe.Sizeof only returns the size of the top-level descriptor. Here's an example of how to use TotalSize.
package main
import (
"encoding/binary"
"fmt"
"reflect"
)
type T struct {
a uint32
b int8
}
func main() {
var t T
r := reflect.ValueOf(t)
s := binary.TotalSize(r)
fmt.Println(s)
}
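Note that binary.TotalSize predates Go 1 and is no longer in the standard library; in current releases the closest equivalent is binary.Size, which takes the value directly instead of a reflect.Value. A minimal sketch (it reports the packed encoding size, not the padded in-memory size):
package main

import (
    "encoding/binary"
    "fmt"
)

type T struct {
    A uint32
    B int8
}

func main() {
    var t T
    fmt.Println(binary.Size(t)) // 5: 4 + 1, no padding counted
    // unsafe.Sizeof(t) would report 8 here, because of struct padding
}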
This is subject to change, but last I looked there was an outstanding compiler bug (bug260.go) related to structure alignment. The end result is that packing a structure might not give the expected results. That was for the 6g compiler, version 5383, release.2010-04-27. It may not be affecting your results, but it's something to be aware of.
UPDATE: As of release 2010-05-04, the only bug left in the Go test suite is bug260.go, mentioned above.
Hotei
To avoid the overhead of initializing a structure, it can be faster to use a pointer to Coord3d:
package main
import (
"fmt"
"unsafe"
)
type Coord3d struct {
X, Y, Z int64
}
func main() {
var dummy *Coord3d
fmt.Printf("sizeof(Coord3d) = %d\n", unsafe.Sizeof(*dummy))
}
import (
    "bytes"
    "encoding/gob"
)

/*
getRealSizeOf returns the number of bytes v occupies when gob-encoded.
Note that gob output includes type information, so this is the
serialized size, which can differ from the in-memory size that
unsafe.Sizeof reports.
*/
func getRealSizeOf(v interface{}) (int, error) {
    b := new(bytes.Buffer)
    if err := gob.NewEncoder(b).Encode(v); err != nil {
        return 0, err
    }
    return b.Len(), nil
}
Related
I have encountered an interesting issue: a PERCPU_ARRAY created on one system with 2 processors has 2 per-CPU elements, while on another system, also with 2 processors, it has 128 per-CPU elements. The latter was rather unexpected to me!
The way I discovered this behavior is that a program that allocated an array for the number of CPUs (using get_nprocs_conf(3)) and then read the PERCPU_ARRAY into it (using bpf_map_lookup_elem()) ended up writing past the end of the array and crashing.
I would like to find out the proper way for a program that reads BPF maps to determine the number of elements in a PERCPU_ARRAY used on a system.
Failing that, I think the second-best approach is to pick a read buffer that is "large enough." Here the problem is similar: what is that number, and is there a way to learn it at runtime?
The question comes from reading the source of bpftool, which figures this out:
unsigned int get_possible_cpus(void)
{
int cpus = libbpf_num_possible_cpus();
if (cpus < 0) {
p_err("Can't get # of possible cpus: %s", strerror(-cpus));
exit(-1);
}
return cpus;
}
int libbpf_num_possible_cpus(void)
{
static const char *fcpu = "/sys/devices/system/cpu/possible";
static int cpus;
int err, n, i, tmp_cpus;
bool *mask;
/* ---8<--- snip */
}
So that's how they do it!
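Putting that together, a minimal sketch of how a map reader could size its buffer (assuming libbpf; the helper name read_percpu_u64, the __u64 value type and the map fd are mine):
#include <stdlib.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* Read one PERCPU_ARRAY element into a buffer sized from the kernel's
   "possible" CPU count rather than from get_nprocs_conf(). */
int read_percpu_u64(int map_fd, __u32 key, __u64 **out, int *n_cpus)
{
    int cpus = libbpf_num_possible_cpus();
    if (cpus < 0)
        return cpus; /* negative errno */

    __u64 *values = calloc(cpus, sizeof(*values));
    if (!values)
        return -1;

    if (bpf_map_lookup_elem(map_fd, &key, values)) {
        free(values);
        return -1;
    }

    *out = values;
    *n_cpus = cpus;
    return 0;
}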
Is assigning a pointer atomic in Go?
Do I need to assign a pointer in a lock? Suppose I just want to assign the pointer to nil, and would like other threads to be able to see it. I know in Java we can use volatile for this, but there is no volatile in Go.
The only things which are guaranteed to be atomic in Go are the operations in sync/atomic.
So if you want to be certain, you'll either need to take a lock, e.g. a sync.Mutex, or use one of the atomic primitives. I don't recommend using the atomic primitives, though, as you'll have to use them everywhere you use the pointer, and they are difficult to get right.
Using a mutex is perfectly good Go style - you could define a function to return the current pointer with locking very easily, e.g. something like
import "sync"
var secretPointer *int
var pointerLock sync.Mutex
func CurrentPointer() *int {
pointerLock.Lock()
defer pointerLock.Unlock()
return secretPointer
}
func SetPointer(p *int) {
pointerLock.Lock()
secretPointer = p
pointerLock.Unlock()
}
These functions return a copy of the pointer to their clients which will stay constant even if the master pointer is changed. This may or may not be acceptable depending on how time critical your requirement is. It should be enough to avoid any undefined behaviour - the garbage collector will ensure that the pointers remain valid at all times even if the memory pointed to is no longer used by your program.
An alternative approach is to do all pointer access from a single goroutine and use channels to command that goroutine to do things. That is considered more idiomatic Go, but may not suit your application exactly; a sketch follows.
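A minimal sketch of that idea (all names are mine): one goroutine owns the pointer, and all other goroutines read or write it through a channel:
package main

// request asks the owner either to set the pointer (set == true) or just
// to read it; the current value is always sent back on reply.
type request struct {
    set   bool
    value *int
    reply chan *int
}

// pointerOwner is the only goroutine that ever touches p, so no locks or
// atomics are needed.
func pointerOwner(reqs chan request) {
    var p *int
    for req := range reqs {
        if req.set {
            p = req.value
        }
        req.reply <- p
    }
}

func main() {
    reqs := make(chan request)
    go pointerOwner(reqs)

    v := 42
    reply := make(chan *int)
    reqs <- request{set: true, value: &v, reply: reply}
    <-reply // pointer is now set

    reqs <- request{reply: reply}
    println(*<-reply) // 42
}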
Update
Here is an example showing how to use atomic.StorePointer. It is rather ugly due to the use of unsafe.Pointer. However, unsafe.Pointer casts compile to nothing, so the runtime cost is small.
package main

import (
"fmt"
"sync/atomic"
"unsafe"
)
type Struct struct {
p unsafe.Pointer // some pointer
}
func main() {
data := 1
info := Struct{p: unsafe.Pointer(&data)}
fmt.Printf("info is %d\n", *(*int)(info.p))
otherData := 2
atomic.StorePointer(&info.p, unsafe.Pointer(&otherData))
fmt.Printf("info is %d\n", *(*int)(info.p))
}
Since the spec doesn't specify you should assume it is not. Even if it is currently atomic it's possible that it could change without ever violating the spec.
In addition to Nick's answer, since Go 1.4 there is the atomic.Value type. Its Store(interface{}) and Load() interface{} methods take care of the unsafe.Pointer conversion for you.
Simple example:
package main
import (
"sync/atomic"
)
type stats struct{}
type myType struct {
stats atomic.Value
}
func main() {
var t myType
s := new(stats)
t.stats.Store(s)
s = t.stats.Load().(*stats)
}
Or a more extended example from the documentation on the Go playground.
Since Go 1.19, a generic atomic.Pointer type has been added to sync/atomic:
The sync/atomic package defines new atomic types Bool, Int32, Int64, Uint32, Uint64, Uintptr, and Pointer. These types hide the underlying values so that all accesses are forced to use the atomic APIs. Pointer also avoids the need to convert to unsafe.Pointer at call sites. Int64 and Uint64 are automatically aligned to 64-bit boundaries in structs and allocated data, even on 32-bit systems.
Sample
package main

import (
    "fmt"
    "net"
    "sync/atomic"
    "time"
)

type ServerConn struct {
Connection net.Conn
ID string
}
func ShowConnection(p *atomic.Pointer[ServerConn]) {
for {
time.Sleep(10 * time.Second)
fmt.Println(p, p.Load())
}
}
func main() {
c := make(chan bool)
p := atomic.Pointer[ServerConn]{}
s := ServerConn{ID: "first_conn"}
p.Store(&s)
go ShowConnection(&p)
go func() {
for {
time.Sleep(13 * time.Second)
newConn := ServerConn{ID: "new_conn"}
p.Swap(&newConn)
}
}()
<-c
}
Please note that atomicity has nothing to do with "I just want to assign the pointer to nil, and would like other threads to be able to see it". The latter property is called visibility.
The answer to the former, as of right now, is yes: assigning (loading/storing) a pointer is atomic in Go. This follows from the updated Go memory model:
Otherwise, a read r of a memory location x that is not larger than a machine word must observe some write w such that r does not happen before w and there is no write w' such that w happens before w' and w' happens before r. That is, each read must observe a value written by a preceding or concurrent write.
Regarding visibility, the question does not have enough information to be answered concretely. If you merely want to know whether you can dereference the pointer safely, then a plain load/store would be enough. However, the most likely case is that you want to communicate some information based on the nullness of the pointer. That requires sync/atomic, which provides synchronisation capabilities; a sketch follows.
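A minimal sketch of the nil-as-signal case (requires Go 1.19+ for the generic atomic.Pointer):
package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

func main() {
    var p atomic.Pointer[int]
    v := 42
    p.Store(&v)

    done := make(chan struct{})
    go func() {
        // Poll until another goroutine publishes nil.
        for p.Load() != nil {
            time.Sleep(time.Millisecond)
        }
        fmt.Println("observed nil")
        close(done)
    }()

    p.Store(nil) // the reader above is guaranteed to eventually observe this
    <-done
}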
I am going through an RPC tutorial and learning a few techniques with rpcgen. I understand how to add and multiply different data types using rpcgen.
But I have not found any clue about how to declare a function in a .x file that returns a string. I am actually trying to build a procedure that returns a random string (the array of random strings lives on the server).
Can anyone advise me on how to proceed? It would be helpful if you could point me to any tutorial that covers this returning-a-string/pointer issue.
Thank you in advance.
OK, answering the original question (more than 2 years old): the first answer is correct but a little tricky.
In your .x file you define your structure with the string inside, having previously defined the size of the string:
typedef string str_t<255>;
struct my_result {
str_t data;
};
...
Then you invoke rpcgen on your .x file to generate the client and server stubs and the XDR file:
$ rpcgen -N file.x
Now you can compile the client and the server, in addition to any program in which you intend to use the remote functions. To do so, I followed the "rpcgen Tutorial" on Oracle's web page:
https://docs.oracle.com/cd/E19683-01/816-1435/rpcgenpguide-21470/index.html
The tricky part is that although you defined a string of size m (an array of m characters), what rpcgen and the XDR file create is a pointer to allocated memory. Something like this:
.h file
typedef char *str_t;
struct my_result {
int res;
str_t data;
};
typedef struct my_result my_result;
.xdr file
bool_t xdr_str_t (XDR *xdrs, str_t *objp)
{
register int32_t *buf;
if (!xdr_string (xdrs, objp, 255))
return FALSE;
return TRUE;
}
So just take into account, when using this structure on the server side, that it is not a string of size m but a char pointer for which you'll have to reserve memory before using it, or you'll get the same error I did at execution time:
Segmentation fault!
To use it on the server you can write:
static my_result response;
static char text[255];
memset(&response, '\0', sizeof(my_result));
memset(text, '\0', sizeof(text));
response.data = text;
And from there you are ready to use it wisely! :)
According to the XDR protocol specification, you can define a string type, where m is the maximum length of the string in bytes:
The standard defines a string of n (numbered 0 to n -1) bytes to be the number n encoded as an unsigned integer (as described above), and followed by the n bytes of the string. Each byte must be regarded by the implementation as being 8-bit transparent data. This allows use of arbitrary character set encodings. Byte m of the string always precedes byte m +1 of the string, and byte 0 of the string always follows the string's length. If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count a multiple of four.
string object<m>;
You can then define a struct with the string type str_t as one of the variables:
typedef string str_t<255>;
struct my_result {
str_t data;
};
Then in your .x file you can define an RPC in your program that returns a struct of type my_result. Since rpcgen will give you a pointer to this struct (which I have called res), you can print the message with printf("%s\n", res->data);.
program HELLO_PROG {
version HELLO_VERSION {
my_result abc() = 1;
} = 1;
} = 1000;
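For completeness, a sketch of the matching server-side procedure (the exact stub signature depends on your rpcgen flags; this assumes rpcgen -N, and the header name hello.h stands in for whatever rpcgen generated from your .x file):
#include <string.h>
#include "hello.h" /* hypothetical: generated from your .x file */

my_result *abc_1_svc(struct svc_req *rqstp)
{
    /* static storage so the result survives the return; the XDR layer
       reads it while building the reply */
    static my_result result;
    static char text[255];

    memset(text, 0, sizeof(text));
    strncpy(text, "a string chosen by the server", sizeof(text) - 1);
    result.data = text;

    return &result;
}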
Requirement:
For my tiny graphics engine, I need an array of all objects to draw. For performance reasons this array needs to be sorted on the attributes. In short:
Store a lot of attributes per struct, add the struct to an array of structs
Efficiently sort the array
walk over the array and perform operations (modesetting and drawing) depending on the attributes
Approach: bitfields in a union (i.e.: let the compiler do the masking and shifting for me)
I thought I had an elegant plan to accomplish this, based on this article: http://realtimecollisiondetection.net/blog/?p=86. The idea is as follows: each attribute is a bitfield, which can be read and written (step 1). After writing, the sorting procedure looks at the bitfield struct as an integer and sorts on it (step 2). Afterwards (step 3), the bitfields are read again.
Sometimes code says more than a thousand words; a high-level view:
union key {
/* useful for accessing */
struct {
unsigned int some_attr : 2;
unsigned int another_attr : 3;
/* ... */
} bitrep;
/* useful for sorting */
uint64_t intrep;
};
I would just make sure that the bit-representation was as large as the integer representation (64 bits in this case). My first approach went like this:
union key {
/* useful for accessing */
struct {
/* generic part: 11 bits */
unsigned int layer : 2;
unsigned int viewport : 3;
unsigned int viewportLayer : 3;
unsigned int translucency : 2;
unsigned int type : 1;
/* depends on type-bit: 64 - 11 bits = 53 bits */
union {
struct {
unsigned int sequence : 8;
unsigned int id : 32;
unsigned int padding : 13;
} cmd;
struct {
unsigned int depth : 24;
unsigned int material : 29;
} normal;
};
};
/* useful for sorting */
uint64_t intrep;
};
Note that in this case there is a decision bitfield called type. Based on it, either the cmd struct or the normal struct gets filled in, just like in the article mentioned above. However, this failed horribly: with clang 3.3 on OS X 10.9 (x86 MacBook Pro), the key union is 16 bytes, while it should be 8.
Unable to coerce clang into packing the struct better, I took another approach, based on some other Stack Overflow answers and the preprocessor, to avoid repeating myself:
/* 2 + 3 + 3 + 2 + 1 + 5 = 16 bits */
#define GENERIC_FIELDS \
unsigned int layer : 2; \
unsigned int viewport : 3; \
unsigned int viewportLayer : 3; \
unsigned int translucency : 2; \
unsigned int type : 1; \
unsigned int : 5;
/* 8 + 32 + 8 = 48 bits */
#define COMMAND_FIELDS \
unsigned int sequence : 8; \
unsigned int id : 32; \
unsigned int : 8;
/* 24 + 24 = 48 bits */
#define MODEL_FIELDS \
unsigned int depth : 24; \
unsigned int material : 24;
struct generic {
/* 16 bits */
GENERIC_FIELDS
};
struct command {
/* 16 bits */
GENERIC_FIELDS
/* 48 bits */
COMMAND_FIELDS
} __attribute__((packed));
struct model {
/* 16 bits */
GENERIC_FIELDS
/* 48 bits */
MODEL_FIELDS
} __attribute__((packed));
union alkey {
struct generic gen;
struct command cmd;
struct model mod;
uint64_t intrep;
};
Without __attribute__((packed)), the command and model structs are 12 bytes. With it, they are 8 bytes, exactly what I wanted! So it would seem that I have found my solution. However, my small experience with bitfields has taught me to be leery.
My questions are:
Can I get this to be cleaner (i.e.: more like my first big union-within-struct-within-union) and still keep it 8 bytes for the key, for fast sorting?
Is there a better way to accomplish this?
Is this safe? Will it fail on x86/ARM? (Really exotic architectures are not much of a concern; I'm targeting the two most prevalent ones.) What about setting one bitfield and then finding out that an adjacent one has been clobbered? Will different compilers vary wildly on this?
What issues can I expect from different compilers? Currently I'm just aiming for clang 3.3+ and gcc 4.9+ with -std=c11. However it would be quite nice if I could use MSVC as well in the future.
Related question and webpages I've looked up:
Variable-sized bitfields with aliasing
Unions within unions
For those (like me) scratching their heads about what happens with bitfields that are not byte-aligned, and about endianness, look no further: http://mjfrazer.org/mjfrazer/bitfields/
Sadly, no single answer got me all the way there.
EDIT: While experimenting, setting some values and reading the integer representation, I noticed something that I had forgotten about: endianness. This opens another can of worms. Is it even possible to do what I want using bitfields, or will I have to fall back to bit-shifting operations?
The layout of bitfields is highly implementation (that is, compiler) dependent. In essence, compilers are free to place consecutive bitfields in the same byte/word or not, as they see fit. Thus, without extensions like the packed attribute that you mention, you can never be sure that your bitfields are squeezed into one word.
Then, if the bitfields are not squeezed into one word, or if you have spare bits that you don't use, you may be in even more trouble: these so-called padding bits can have arbitrary values, so your sorting idea could never work in a portable setting.
For all these reasons, bitfields are relatively rarely used in real code. What you see more often is the use of macros for the bits of your uint64_t that you need. For each of the bitfields you have now, you'd need two macros: one to extract the bits and one to set them (see the sketch below). Such code is then portable to all platforms with a C99/C11 compiler without problems.
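A sketch of that macro approach (the field name and width are illustrative, borrowed from the layer field above):
#include <stdint.h>

/* layer occupies bits [0, 2) of the 64-bit key */
#define LAYER_SHIFT 0
#define LAYER_BITS  2
#define LAYER_MASK  ((((uint64_t)1 << LAYER_BITS) - 1) << LAYER_SHIFT)

/* extract the field from a key */
#define GET_LAYER(key) (((key) & LAYER_MASK) >> LAYER_SHIFT)

/* return a copy of key with the field set to v */
#define SET_LAYER(key, v) \
    (((key) & ~LAYER_MASK) | (((uint64_t)(v) << LAYER_SHIFT) & LAYER_MASK))
Because the key stays a plain uint64_t, the comparison used for sorting is well defined on every platform, and endianness no longer matters.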
Minor point:
In the declaration of a union it is better to put the basic integer field first. The default initializer for a union uses the first field, so this ensures that your union is initialized to all-bits-zero by such an initializer. An initializer through the struct field would only guarantee that the individual fields are set to 0; the padding bits, if any, would be unspecified.
I've made a variant type to use instead of boost::variant. Mine works by storing the index of the current type in a list of the possible types, and by storing the data in a byte array with enough space to hold the biggest type.
unsigned char data[my_types::max_size];
int type;
The trouble comes when I write a value to this variant type. I use the following:
template<typename T>
void set(T a) {
int t = type_index(T);
if (t != -1) {
type = t;
puts("writing atom data");
*((T *) data) = a; //THIS PART CRASHES!!!!
puts("did it!");
} else {
throw atom_bad_assignment;
}
}
The line that crashes is the one that stores the data in the internal buffer. As you can see, I just cast the byte array directly to a pointer of the desired type. This gives me bad-address signals and bus errors when writing some values.
I'm using GCC on a 64-bit system. How do I set the alignment of the byte array to make sure its address is 64-bit aligned (or properly aligned for any architecture I might port this project to)?
EDIT: Thank you all, but the mistake was somewhere else. Apparently, x86 doesn't strictly require alignment: aligned access is faster, but unaligned access still works, and the program runs fine this way. My problem was that I didn't clear the data buffer before writing, which caused trouble with the constructors of some types. I will not, however, mark the question as answered, so more people can give me tips on alignment ;)
See http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Variable-Attributes.html
unsigned char data[my_types::max_size] __attribute__ ((aligned));
int type;
I believe
#pragma pack(64)
will work on all modern compilers; it definitely works on GCC.
A more correct solution (that doesn't mess with packing globally) would be:
#pragma pack(push, 64)
// define union here
#pragma pack(pop)
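A more modern alternative that avoids packing pragmas entirely: since C++11 you can request the alignment directly on the buffer with alignas. A sketch, with the template parameters standing in for your own constants:
#include <cstddef>

// Storage aligned strictly enough for any standard type by default.
template <std::size_t Size, std::size_t Align = alignof(std::max_align_t)>
struct variant_storage {
    alignas(Align) unsigned char data[Size];
    int type;
};

// e.g. variant_storage<my_types::max_size> storage;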