Modifying Individual Bytes in Rust [duplicate] - rust

This question already has answers here:
Converting number primitives (i32, f64, etc) to byte representations
(5 answers)
Closed 2 years ago.
In C, if I want to modify the most significant byte of an integer, I can do it in two different ways:
#include <stdint.h>
#include <stdio.h>
int main() {
uint32_t i1 = 0xDDDDDDDD;
printf("i1 original: 0x%X\n", i1);
uint32_t i1msb = 0xAA000000;
uint32_t i1_mask = 0x00FFFFFF;
uint32_t i1msb_mask = 0xFF000000;
i1 = (i1 & i1_mask) | (i1msb & i1msb_mask);
printf("i1 modified: 0x%X\n", i1);
uint32_t i2 = 0xDDDDDDDD;
printf("i2 original: 0x%X\n", i2);
uint32_t *i2ptr = &i2;
char *c2ptr = (char *) i2ptr;
c2ptr += 3;
*c2ptr = 0xAA;
printf("i2 modified: 0x%X\n", i2);
}
i1 original: 0xDDDDDDDD
i1 modified: 0xAADDDDDD
i2 original: 0xDDDDDDDD
i2 modified: 0xAADDDDDD
The masking approach works in both C and Rust, but I don't have not found any way to do the direct byte manipulation approach in (safe) Rust. Although it has some endianness issues that masking does not, I think these can all be resolved at compile time to provide a safe, cross-platform interface, so why is direct byte manipulation not allowed in safe Rust? Or if it is, please correct me as I have been unable to find a function for it - I am thinking something like
fn change_byte(i: &u32, byte_num: usize, new_val: u8) -> Result<(), ErrorType> { ... }
which checks that byte_num is within range before performing the operation, and in which byte_num always goes from least significant byte to most significant.

Rust doesn't have a function to modify bytes directly. It does, however, have methods to convert integers to byte arrays, and byte arrays to integers. For example, one might write
fn set_low_byte(number:u32, byte:u8)->u32{
let mut arr = number.to_be_bytes();
arr[3] = byte;
u32::from_be_bytes(arr)
}
There are also methods for little and native endian byte order. The bitmask approach is also perfectly doable, and the pointer stuff will also work if you don't mind unsafe.
playground link

Related

What does applying XOR between the input instructions & account data accomplish in this Solana smart contract?

https://github.com/solana-labs/break/blob/master/program/src/lib.rs
use solana_program::{
account_info::AccountInfo, entrypoint, entrypoint::ProgramResult, pubkey::Pubkey,
};
entrypoint!(process_instruction);
fn process_instruction<'a>(
_program_id: &Pubkey,
accounts: &'a [AccountInfo<'a>],
instruction_data: &[u8],
) -> ProgramResult {
// Assume a writable account is at index 0
let mut account_data = accounts[0].try_borrow_mut_data()?;
// xor with the account data using byte and bit from ix data
let index = u16::from_be_bytes([instruction_data[0], instruction_data[1]]);
let byte = index >> 3;
let bit = (index & 0x7) as u8;
account_data[byte as usize] ^= 1 << (7 - bit);
Ok(())
}
This is from one of their example applications, really not sure what to make of this, or where one might even begin to look in to understanding what the intent is here and how it functions..
Thanks in advance.
EDIT:
Is this done to create a program-derived address? I found this on their API, and the above seems to make sense as an implementation of this I would imagine.
1 << n sets nth bit of what's called a mask, such as 1 << 1 = 0010.
XOR is an useful operation that allows comparing bits, and in this case, it makes use of that property. If current bit is 0, it will be set to 1, and if it is 1, it will be set to 0.
Using the mask from above, we can select one single specific bit to compare, or in this case, switch based on what value it currently is.
1111 ^ 0010 = 1101, the result is a difference, with the bit that matched being set to 0.
1101 ^ 0010 = 1111, every single bit here is different, and so the bit that didn't match is also set to 1.
In short, it toggles a bit, it is a common idiom in bit manipulation code.
bits ^= 1 << n
Related: https://stackoverflow.com/a/47990/15971564

Dynamic memory in a struct

So, my struct is like this:
struct player{
char name[20];
int time;
}s[50];
I don't know how many players i am going to add to the struct, and i also have to use dynamic memory for this. So how can i allocate and reallocate more space when i add a player to my struct?
I am inexperienced programmer, but i have been googling for this for a long time and i also don't perfectly understand structs.
This site doesn't accept my question so let's put some more text to this post
I assume you are programming in C/C++.
Your struct player has static-allocated fields, so, when you use malloc on it, you are asking space for a 20-byte char array a for one integer.
My suggestion is to store in a variable (or #define a symbol) the initial number of structures you can accept. Then, use malloc to allocate a static array that contains these structures.
Also you have to think about a strategy to store new players coming. The simplest one could be have an index variable to store last free position and use it to add over that position.
A short example follows:
#define init_cap 50
struct player {
char name[20];
int time;
};
int main() {
int index;
struct player* players;
players = (struct player*) malloc(init_cap * sizeof(struct player));
for(i = 0; i < init_cap; i++) {
strcpy(players[i].name, "peppe");
players[i].time = i;
}
free(players);
return 0;
}
At this point you should also think about reallocating memory if the number of players you get a runtime exceeds your initial capacity. You can use:
players = (struct player*) realloc(2 * init_cap * sizeof(struct player));
in order to double the initial capacity.
At the end, always remember to free the requested memory.

Returning string from a remote server using rpcgen

I am going through RPC tutorial and learn few techniques in rpcgen. I have the idea of adding, multiplying different data types using rpcgen.
But I have not found any clue that how could I declare a function in .x file which will return a string. Actually I am trying to build a procedure which will return a random string(rand string array is in server).
Can any one advise me how to proceed in this issue? It will be helpful if you advise me any tutorial regarding this returning string/pointer issue.
Thank you in advance.
Ok, answering to the original question (more than 2 years old), the first answer is correct but a little tricky.
In your .x file, you define your structure with the string inside, having defined previously the size of the string:
typedef string str_t<255>;
struct my_result {
str_t data;
};
...
Then you invoke rpcgen on your .x file to generate client and server stubs and .xdr file:
$rpcgen -N *file.x*
Now you can compile client and server in addition to any program where you pretend to use the remote functions. To do so, I followed the "repcgen Tutorial" in ORACLE's web page:
https://docs.oracle.com/cd/E19683-01/816-1435/rpcgenpguide-21470/index.html
The tricky part is, although you defined a string of size m (array of m characters) what rpcgen and .xdr file create is a pointer to allocated memmory. Something like this:
.h file
typedef char *str_t;
struct my_result {
int res;
str_t data;
};
typedef struct my_result my_result;
.xdr file
bool_t xdr_str_t (XDR *xdrs, str_t *objp)
{
register int32_t *buf;
if (!xdr_string (xdrs, objp, 255))
return FALSE;
return TRUE;
}
So just take into account when using this structure in your server side that it is not a string of size m, but a char pointer for which you'll have to reserve memory before using it or you'll be prompted the same error than me on execution:
Segmentation fault!
To use it on the server you can write:
static my_result response;
static char text[255];
memset(&response, '\0', sizeof(my_result));
memset(text, '\0', sizeof(text));
response.data = text;
And from there you are ready to use it wisely! :)
According to the XDR protocol specification you can define a string type where m is the length of the string in bytes:
The standard defines a string of n (numbered 0 to n -1) bytes to be the number n encoded as an unsigned integer (as described above), and followed by the n bytes of the string. Each byte must be regarded by the implementation as being 8-bit transparent data. This allows use of arbitrary character set encodings. Byte m of the string always precedes byte m +1 of the string, and byte 0 of the string always follows the string's length. If n is not a multiple of four, then the n bytes are followed by enough (0 to 3) residual zero bytes, r, to make the total byte count a multiple of four.
string object<m>;
You can then define a struct with the string type str_t as one of the variables:
typedef string str_t<255>;
struct my_result {
str_t data;
};
Then in your .x file you can define an RPC in your program which returns a struct of type my_result. Since rpcgen will give you a pointer to this struct (which I have called res) you can print the message with prinf("%s\n", res->data);.
program HELLO_PROG {
version HELLO_VERSION {
my_result abc() = 1;
} = 1;
} = 1000;

using bitfields as a sorting key in modern C (C99/C11 union)

Requirement:
For my tiny graphics engine, I need an array of all objects to draw. For performance reasons this array needs to be sorted on the attributes. In short:
Store a lot of attributes per struct, add the struct to an array of structs
Efficiently sort the array
walk over the array and perform operations (modesetting and drawing) depending on the attributes
Approach: bitfields in a union (i.e.: let the compiler do the masking and shifting for me)
I thought I had an elegant plan to accomplish this, based on this article: http://realtimecollisiondetection.net/blog/?p=86. The idea is as follows: each attribute is a bitfield, which can be read and written to (step 1). After writing, the sorting procedure look at the bitfield struct as an integer, and sorts on it (step 2). Afterwards (step 3), the bitfields are read again.
Sometimes code says more than a 1000 words, a high-level view:
union key {
/* useful for accessing */
struct {
unsigned int some_attr : 2;
unsigned int another_attr : 3;
/* ... */
} bitrep;
/* useful for sorting */
uint64_t intrep;
};
I would just make sure that the bit-representation was as large as the integer representation (64 bits in this case). My first approach went like this:
union key {
/* useful for accessing */
struct {
/* generic part: 11 bits */
unsigned int layer : 2;
unsigned int viewport : 3;
unsigned int viewportLayer : 3;
unsigned int translucency : 2;
unsigned int type : 1;
/* depends on type-bit: 64 - 11 bits = 53 bits */
union {
struct {
unsigned int sequence : 8;
unsigned int id : 32;
unsigned int padding : 13;
} cmd;
struct {
unsigned int depth : 24;
unsigned int material : 29;
} normal;
};
};
/* useful for sorting */
uint64_t intrep;
};
Note that in this case, there is a decision bitfield called type. Based on that, either the cmd struct or the normal struct gets filled in, just like in the mentioned article. However this failed horribly. With clang 3.3 on OSX 10.9 (x86 macbook pro), the key union is 16 bytes, while it should be 8.
Unable to coerce clang to pack the struct better, I took another approach based on some other stack overflow answers and the preprocessor to avoid me having to repeat myself:
/* 2 + 3 + 3 + 2 + 1 + 5 = 16 bits */
#define GENERIC_FIELDS \
unsigned int layer : 2; \
unsigned int viewport : 3; \
unsigned int viewportLayer : 3; \
unsigned int translucency : 2; \
unsigned int type : 1; \
unsigned int : 5;
/* 8 + 32 + 8 = 48 bits */
#define COMMAND_FIELDS \
unsigned int sequence : 8; \
unsigned int id : 32; \
unsigned int : 8;
/* 24 + 24 = 48 bits */
#define MODEL_FIELDS \
unsigned int depth : 24; \
unsigned int material : 24;
struct generic {
/* 16 bits */
GENERIC_FIELDS
};
struct command {
/* 16 bits */
GENERIC_FIELDS
/* 48 bits */
COMMAND_FIELDS
} __attribute__((packed));
struct model {
/* 16 bits */
GENERIC_FIELDS
/* 48 bits */
MODEL_FIELDS
} __attribute__((packed));
union alkey {
struct generic gen;
struct command cmd;
struct model mod;
uint64_t intrep;
};
Without including the __attribute__((packed)), the command and model structs are 12 bytes. But with the __attribute__((packed)), they are 8 bytes, exactly what I wanted! So it would seems that I have found my solution. However, my small experience with bitfields has taught me to be leery. Which is why I have a few questions:
My questions are:
Can I get this to be cleaner (i.e.: more like my first big union-within-struct-within-union) and still keep it 8 bytes for the key, for fast sorting?
Is there a better way to accomplish this?
Is this safe? Will this fail on x86/ARM? (really exotic architectures are not much of a concern, I'm targeting the 2 most prevalent ones). What about setting a bitfield and then finding out that an adjacent one has already been written to. Will different compilers vary wildly on this?
What issues can I expect from different compilers? Currently I'm just aiming for clang 3.3+ and gcc 4.9+ with -std=c11. However it would be quite nice if I could use MSVC as well in the future.
Related question and webpages I've looked up:
Variable-sized bitfields with aliasing
Unions within unions
for those (like me) scratching their heads about what happens with bitfields that are not byte-aligned and endianness, look no further: http://mjfrazer.org/mjfrazer/bitfields/
Sadly no answer got me the entirety of the way there.
EDIT: While experimenting, setting some values and reading the integer representation. I noticed something that I had forgotten about: endianness. This opens up another can of worms. Is it even possible to do what I want using bitfields or will I have to go for bitshifting operations?
The layout for bitfields is highly implementation (=compiler) dependent. In essence, compilers are free to place consecutive bitfields in the same byte/word if it sees fit, or not. Thus without extensions like the packed attribute that you mention, you can never be sure that your bitfields are squeezed into one word.
Then, if the bitfields are not squeezed into one word, or if you have just some spare bits that you don't use, you may be even more in trouble. These so-called padding bits can have arbitrary values, thus your sorting idea could never work in a portable setting.
For all these reasons, bitfields are relatively rarely used in real code. What you can see more often is the use of macros for the bits of your uint64_t that you need. For each of your bitfields that you have now, you'd need two macros, one to extract the bits and one to set them. Such a code then would be portable on all platforms that have a C99/C11 compiler without problems.
Minor point:
In the declaration of a union it is better to but the basic integer field first. The default initializer for a union uses the first field, so this would then ensure that your union would be initialized to all bits zero by such an initializer. The initializer of the struct would only guarantee that the individual fields are set to 0, the padding bits, if any, would be unspecific.

Sizeof struct in Go

I'm having a look at Go, which looks quite promising.
I am trying to figure out how to get the size of a go struct, for
example something like
type Coord3d struct {
X, Y, Z int64
}
Of course I know that it's 24 bytes, but I'd like to know it programmatically..
Do you have any ideas how to do this ?
Roger already showed how to use SizeOf method from the unsafe package. Make sure you read this before relying on the value returned by the function:
The size does not include any memory possibly referenced by x. For
instance, if x is a slice, Sizeof returns the size of the slice
descriptor, not the size of the memory referenced by the slice.
In addition to this I wanted to explain how you can easily calculate the size of any struct using a couple of simple rules. And then how to verify your intuition using a helpful service.
The size depends on the types it consists of and the order of the fields in the struct (because different padding will be used). This means that two structs with the same fields can have different size.
For example this struct will have a size of 32
struct {
a bool
b string
c bool
}
and a slight modification will have a size of 24 (a 25% difference just due to a more compact ordering of fields)
struct {
a bool
c bool
b string
}
As you see from the pictures, in the second example we removed one of the paddings and moved a field to take advantage of the previous padding. An alignment can be 1, 2, 4, or 8. A padding is the space that was used to fill in the variable to fill the alignment (basically wasted space).
Knowing this rule and remembering that:
bool, int8/uint8 take 1 byte
int16, uint16 - 2 bytes
int32, uint32, float32 - 4 bytes
int64, uint64, float64, pointer - 8 bytes
string - 16 bytes (2 alignments of 8 bytes)
any slice takes 24 bytes (3 alignments of 8 bytes). So []bool, [][][]string are the same (do not forget to reread the citation I added in the beginning)
array of length n takes n * type it takes of bytes.
Armed with the knowledge of padding, alignment and sizes in bytes, you can quickly figure out how to improve your struct (but still it makes sense to verify your intuition using the service).
import unsafe "unsafe"
/* Structure describing an inotify event. */
type INotifyInfo struct {
Wd int32 // Watch descriptor
Mask uint32 // Watch mask
Cookie uint32 // Cookie to synchronize two events
Len uint32 // Length (including NULs) of name
}
func doSomething() {
var info INotifyInfo
const infoSize = unsafe.Sizeof(info)
...
}
NOTE: The OP is mistaken. The unsafe.Sizeof does return 24 on the example Coord3d struct. See comment below.
binary.TotalSize is also an option, but note there's a slight difference in behavior between that and unsafe.Sizeof: binary.TotalSize includes the size of the contents of slices, while unsafe.Sizeof only returns the size of the top level descriptor. Here's an example of how to use TotalSize.
package main
import (
"encoding/binary"
"fmt"
"reflect"
)
type T struct {
a uint32
b int8
}
func main() {
var t T
r := reflect.ValueOf(t)
s := binary.TotalSize(r)
fmt.Println(s)
}
This is subject to change but last I looked there is an outstanding compiler bug (bug260.go) related to structure alignment. The end result is that packing a structure might not give the expected results. That was for compiler 6g version 5383 release.2010-04-27 release. It may not be affecting your results, but it's something to be aware of.
UPDATE: The only bug left in go test suite is bug260.go, mentioned above, as of release 2010-05-04.
Hotei
In order to not to incur the overhead of initializing a structure, it would be faster to use a pointer to Coord3d:
package main
import (
"fmt"
"unsafe"
)
type Coord3d struct {
X, Y, Z int64
}
func main() {
var dummy *Coord3d
fmt.Printf("sizeof(Coord3d) = %d\n", unsafe.Sizeof(*dummy))
}
/*
returns the size of any type of object in bytes
*/
func getRealSizeOf(v interface{}) (int, error) {
b := new(bytes.Buffer)
if err := gob.NewEncoder(b).Encode(v); err != nil {
return 0, err
}
return b.Len(), nil
}

Resources