In certain unusual cases it may be a requirement that a struct isn't padded (so the memory of the struct is ensured not to contain uninitialized bytes for example).
While it's possible to use #[repr(packed)], this means you may have members which aren't optimally aligned for access.
In C, some software uses manual padding, where GCC's -Wpadded can be used to warn if the struct is padded.
Is there a way to warn/error when a struct is padded?
Or some way to ensure manually padded structs don't have any padding?
The closest thing to this I could come up with is to define the struct twice, pack one, then check if the sizes differ, but this is a kludge.
With some careful use of include! it may be possible to avoid actually writing out the struct twice, but it's still a last resort.
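The kludge described above can be sketched as follows. This is only an illustration: `Header` and `HeaderPacked` are hypothetical names, and `_pad` stands in for a manually added padding field. The check runs at compile time, so a layout with hidden padding fails the build rather than producing a warning.

```rust
use std::mem::size_of;

// Hypothetical struct with manual padding (`_pad`).
#[allow(dead_code)]
#[repr(C)]
struct Header {
    id: u32,
    kind: u16,
    _pad: u16,
}

// The same fields again, this time packed, purely for the size comparison.
#[allow(dead_code)]
#[repr(C, packed)]
struct HeaderPacked {
    id: u32,
    kind: u16,
    _pad: u16,
}

// Compilation fails here if `Header` contains compiler-inserted padding.
const _: () = assert!(size_of::<Header>() == size_of::<HeaderPacked>());

fn main() {
    println!("Header occupies {} bytes", size_of::<Header>());
}
```

Removing `_pad` from both structs makes the two sizes differ on typical targets, so the assertion stops the build.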
Related
Copy means that the struct could be copied just by copying bytes as is. As a result, it should be easily possible to re-interpret such a struct as [u8]. What's the most idiomatic way to do so, preferably without involving unsafe?
I want to have an optimized struct which could be easily sent between processes, over the wire, or to disk. I understand that there are a lot of details which need to be taken care of, like alignment, and I am looking for a solution for such a high-performance use case, i.e. close to zero-copy, high-performance serialization.
Copy means that the struct could be copied just by copying bytes as is.
This is true.
As a result, it should be easily possible to re-interpret such a struct as [u8].
This is not true, because Copy structs can still contain padding, which is not permitted to be read except incidentally while copying.
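To make the padding point concrete, here is a small sketch with a hypothetical `Sample` struct. It assumes a typical target where `u32` is 4-byte aligned; there the struct is larger than the sum of its fields, and the extra bytes are padding.

```rust
use std::mem::size_of;

// Hypothetical Copy struct; repr(C) keeps the field order as written.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct Sample {
    tag: u8,    // 1 byte, then padding so that `value` is aligned
    value: u32, // 4 bytes
}

fn main() {
    let field_bytes = size_of::<u8>() + size_of::<u32>();
    println!("sum of fields: {}", field_bytes);
    println!("size of struct: {}", size_of::<Sample>());
}
```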
What's the most idiomatic way to do so, preferably without involving unsafe?
You should start with bytemuck. It is a library which provides trivial conversion to and from [u8] when it is safe to do so. In particular, it checks that there is no padding in the struct, and that the representation is well-defined (not subject to the whims of the compiler).
You will still need to consider alignment, and for that purpose may need to introduce explicit “padding” fields (whose value is explicitly set rather than being left undefined) so that the alignment of other fields is satisfied.
Your program's data will also not be compatible with machines of different endianness unless you take care. (However, it is possible to do so, in ways which have zero run-time overhead if not necessary, and most machines are little-endian today so that cost will almost never actually apply.)
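As an illustration of taking care of endianness, here is a minimal sketch that uses only the standard library. It is a field-by-field encoding with a fixed little-endian byte order, not the zero-copy reinterpretation that bytemuck offers, and `Record`, `encode` and `decode` are hypothetical names. On a little-endian machine `to_le_bytes` needs no byte swap.

```rust
// Hypothetical record type used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Record {
    id: u32,
    count: u16,
}

// Write each field in little-endian order; the output contains no padding.
fn encode(r: &Record) -> [u8; 6] {
    let mut out = [0u8; 6];
    out[..4].copy_from_slice(&r.id.to_le_bytes());
    out[4..].copy_from_slice(&r.count.to_le_bytes());
    out
}

// Read the fields back, again assuming little-endian order.
fn decode(bytes: &[u8; 6]) -> Record {
    Record {
        id: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        count: u16::from_le_bytes([bytes[4], bytes[5]]),
    }
}

fn main() {
    let r = Record { id: 0x0102_0304, count: 0x0506 };
    let bytes = encode(&r);
    println!("{:?} -> {:?} -> {:?}", r, bytes, decode(&bytes));
}
```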
How is an Option laid out in memory? Since an i32 already takes up an even number of bytes, is Rust forced to use a full byte to store the single bit None/Some?
EDIT: According to this answer, Rust in fact uses an extra 4 (!) bytes. Why?
For structs and enums declared without special layout modifiers, the Rust docs state
Nominal types without a repr attribute have the default representation. Informally, this representation is also called the rust representation.
There are no guarantees of data layout made by this representation.
Option cannot possibly be repr(transparent) or repr(i*) since it is neither a newtype struct nor a fieldless enum, and we can check the source code and see that it's not declared repr(C). So no guarantees are made about the layout.
If it were declared repr(C), then we'd get the C representation, which is what you're envisioning. We need one integer to indicate whether it's None or Some (which size of integer is implementation-defined) and then we need enough space to store the actual i32.
In reality, since Rust is given a lot of leeway here, it can do clever things. If you have a variable which is only ever Some, it needn't store the tag bit (and, again, no guarantees are made about layout, so it's free to make this change internally). If you have an i32 that starts at 0 and goes up to 10, it's provably never negative, so Rust might choose to use, say, -1 to indicate None.
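A quick way to see what the compiler actually does is to print the sizes. The sketch below also includes two cases where the type itself has a spare bit pattern (a "niche"): `NonZeroI32` can never be 0 and a reference can never be null, so `None` can reuse that value. The size of `Option<i32>` is not guaranteed, so no particular output is claimed for it here.

```rust
use std::mem::size_of;
use std::num::NonZeroI32;

fn main() {
    println!("i32:                {}", size_of::<i32>());
    // Every bit pattern of i32 is a valid value, so a tag needs extra space.
    println!("Option<i32>:        {}", size_of::<Option<i32>>());
    // Documented to be the same size as i32: 0 is used for None.
    println!("Option<NonZeroI32>: {}", size_of::<Option<NonZeroI32>>());
    // Same size as a plain reference: the null pointer is used for None.
    println!("Option<&i32>:       {}", size_of::<Option<&i32>>());
}
```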
Context
I have a pair of related structs in my program, Rom and ProfiledRom. They both store a list of u8 values and implement a common trait, GetRom, to provide access to those values.
trait GetRom {
fn get(&self, index: usize) -> u8;
}
The difference is that Rom just wraps a simple Vec<u8>, but ProfiledRom wraps each byte in a ProfiledByte type that counts the number of times it is returned by get.
struct Rom(Vec<u8>);
struct ProfiledRom(Vec<ProfiledByte>);
struct ProfiledByte {
    value: u8,
    get_count: u32,
}
Much of my program operates on trait GetRom values, so I can substitute in Rom or ProfiledRom type/value depending on whether I want profiling to occur.
Question
I have implemented From<Rom> for ProfiledRom, because converting a Rom to a ProfiledRom just involves wrapping each byte in a new ProfiledByte: a simple and lossless operation.
However, I'm not sure whether it's appropriate to implement From<ProfiledRom> for Rom, because ProfiledRom contains information (the get counts) that can't be represented in a Rom. If you did a round-trip conversion, these values would be lost/reset.
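For reference, the two conversions under discussion could look roughly like this. It is a sketch under assumptions: the counts are taken to start at 0, and the second impl is exactly the one whose appropriateness is in question, shown only to make clear what gets dropped.

```rust
struct Rom(Vec<u8>);
struct ProfiledRom(Vec<ProfiledByte>);
struct ProfiledByte {
    value: u8,
    get_count: u32,
}

// Lossless direction: every byte is wrapped with a fresh counter.
impl From<Rom> for ProfiledRom {
    fn from(rom: Rom) -> Self {
        ProfiledRom(
            rom.0
                .into_iter()
                .map(|value| ProfiledByte { value, get_count: 0 })
                .collect(),
        )
    }
}

// Lossy direction: the byte values survive, the get counts are discarded.
impl From<ProfiledRom> for Rom {
    fn from(profiled: ProfiledRom) -> Self {
        Rom(profiled.0.into_iter().map(|byte| byte.value).collect())
    }
}

fn main() {
    let mut profiled = ProfiledRom::from(Rom(vec![1, 2, 3]));
    profiled.0[0].get_count = 7; // pretend the byte was read a few times
    let rom = Rom::from(profiled); // the count of 7 is gone after this
    println!("{:?}", rom.0);
}
```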
Is it appropriate to implement the From trait when only parts of the source object will be used?
Related
I have seen that the standard library doesn't implement integer conversions like From<i64> for i32 because these could result in bytes being truncated/lost. However, that seems like a somewhat distinct case from what we have here.
With the potentially-truncating integer conversion, you would need to inspect the original i64 to know whether it would be converted appropriately. If you didn't, the behaviour of your code could change unexpectedly when you get an out-of-bounds value. However, in our case above, it's always statically clear what data is being preserved and what data is being lost. The conversion's behaviour won't suddenly change. It should be safer, but is it an appropriate use of the From trait?
From implementations are usually lossless, but there is currently no strict requirement that they be.
The ongoing discussion at rust-lang/rfcs#2484 is related. Some possibilities include adding a FromLossy trait and more exactly prescribing the behaviour of From. We'll have to see where that goes.
For consideration, here are some Target::from(Source) implementations in the standard library:
Lossless conversions
Each Source value is converted into a distinct Target value.
u16::from(u8), i16::from(u8) and other conversions to strictly-larger integer types.
Vec<u8>::from(String)
Vec<T>::from(BinaryHeap<T>)
OsString::from(String)
char::from(u8)
Lossy conversions
Multiple Source values may be converted into the same Target value.
BinaryHeap<T>::from(Vec<T>) loses the order of elements.
Box<[T]>::from(Vec<T>) and Box<str>::from(String) lose any excess capacity.
Vec<T>::from(VecDeque<T>) loses the internal split of elements exposed by .as_slices().
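Two of the conversions listed above can be tried out directly. In this sketch the helper names `round_trip` and `heap_sorted` are made up for the example; the `From` impls themselves are the standard library ones.

```rust
use std::collections::BinaryHeap;

// Lossless: String -> Vec<u8> keeps every byte, so the text can be recovered.
fn round_trip(s: &str) -> String {
    let bytes = Vec::<u8>::from(String::from(s));
    String::from_utf8(bytes).expect("bytes came from a valid String")
}

// Lossy: once a Vec is in a BinaryHeap its original element order is gone;
// the heap can only hand the elements back in its own order.
fn heap_sorted(v: Vec<i32>) -> Vec<i32> {
    BinaryHeap::from(v).into_sorted_vec()
}

fn main() {
    println!("{}", round_trip("rom"));
    // Two differently ordered inputs become indistinguishable.
    println!("{:?}", heap_sorted(vec![1, 3, 2]));
    println!("{:?}", heap_sorted(vec![3, 2, 1]));
}
```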
Running this code in Rust:
fn main() {
println!("{:?}", std::mem::size_of::<[u8; 1024]>());
println!("{:?}", std::mem::size_of::<[bool; 1024]>());
}
1024
1024
This is not what I expected. So I compiled and ran in release mode. But I got the same answer.
Why does the Rust compiler seemingly allocate a whole byte for each single boolean? To me it seems to be a simple optimization to only allocate 128 bytes instead. This project implies I'm not the first to think this.
Is this a case of compilers being way harder than they seem? Or is this not optimized because it isn't a realistic scenario? Or am I not understanding something here?
Pointers and references.
(1) There is an assumption that you can always take a reference to an item of a slice, a field of a struct, etc.
(2) There is an assumption in the language that any reference to an instance of a statically sized type can be transmuted to a type-erased pointer *mut ().
Those two assumptions together mean that:
due to (2), it is not possible to create a "bit-reference" which would allow sub-byte addressing,
due to (1), it is not possible not to have references.
This essentially means that any type must have a minimum alignment of one byte.
Note that this is not necessarily an issue. Opting in to a 128 bytes representation should be done cautiously, as it implies trading off speed (and convenience) for memory. It's not a pure win.
Prior art (in the name of std::vector<bool> in C++) is widely considered a mistake in hindsight.
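If the 128-byte representation is worth the trade-off, it can be opted into by hand. The following is a minimal sketch with a hypothetical `BitFlags` type; note that it can only offer `get` and `set` by value, not a `&bool` or `&mut bool` to an individual flag, which is exactly the limitation described above.

```rust
use std::mem::size_of;

// Hypothetical bit-packed set of 1024 flags, 8 flags per byte.
struct BitFlags {
    bytes: [u8; 128],
}

impl BitFlags {
    fn new() -> Self {
        BitFlags { bytes: [0; 128] }
    }

    // Read flag `i` by shifting its bit down and masking it out.
    fn get(&self, i: usize) -> bool {
        (self.bytes[i / 8] >> (i % 8)) & 1 == 1
    }

    // Set or clear flag `i`; every access pays for the shift and mask.
    fn set(&mut self, i: usize, value: bool) {
        let mask = 1u8 << (i % 8);
        if value {
            self.bytes[i / 8] |= mask;
        } else {
            self.bytes[i / 8] &= !mask;
        }
    }
}

fn main() {
    let mut flags = BitFlags::new();
    flags.set(1023, true);
    println!("{} bytes, flag 1023 = {}", size_of::<BitFlags>(), flags.get(1023));
}
```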
I am looking for a way to implement something like a memory pool in Rust.
I want to allocate a set of related small objects in chunks, and delete the set of objects at once. The objects won't be freed separately. There are several benefits to this approach:
It reduces fragmentation.
It saves memory.
Is there any way to create an allocator like this in Rust?
It sounds like you want the typed arena crate, which is stable and can be used in Rust 1.0.
extern crate typed_arena;
#[derive(Debug)]
struct Foo {
a: u8,
b: u8,
}
fn main() {
let allocator = typed_arena::Arena::new();
let f = allocator.alloc(Foo { a: 42, b: 101 });
println!("{:?}", f)
}
This does have limitations - all the objects must be of the same type. In my usage, I have a very small set of types that I wish to have, so I have just created a set of Arenas, one for each type.
If that isn't suitable, you can look to arena::Arena, which is unstable and slower than a typed arena.
The basic premise of both allocators is simple - you allow the arena to consume an item and it moves the bits around to its own memory allocation.
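To illustrate that premise, here is a deliberately simplified, index-based sketch using only the standard library. It is not how typed_arena or arena::Arena are implemented (those hand out references rather than indices); `Pool`, `alloc` and `get` are hypothetical names. Values are moved into one shared backing allocation and are all freed together when the pool is dropped.

```rust
#[derive(Debug)]
struct Foo {
    a: u8,
    b: u8,
}

// Hypothetical pool: all values live in one Vec and are freed together.
struct Pool<T> {
    items: Vec<T>,
}

impl<T> Pool<T> {
    fn new() -> Self {
        Pool { items: Vec::new() }
    }

    // The pool consumes the value and returns an index that identifies it.
    fn alloc(&mut self, value: T) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }

    // Look a value up again by the index returned from `alloc`.
    fn get(&self, id: usize) -> &T {
        &self.items[id]
    }
}

fn main() {
    let mut pool = Pool::new();
    let first = pool.alloc(Foo { a: 42, b: 101 });
    let second = pool.alloc(Foo { a: 1, b: 2 });
    println!("{:?} {:?}", pool.get(first), pool.get(second));
    // Dropping `pool` here releases every Foo at once.
}
```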
Another meaning for the word "allocator" is what is used when you box a value. It is planned that Rust will gain support for "placement new" at some point, and the box syntax is reserved for that.
In unstable versions of Rust, you can do something like box Foo(42), and a (hypothetical) enhancement to that would allow you to say something like box my_arena Foo(42), which would use the specified allocator. This capability is a few versions away from existing it seems.
Funny thing is, the allocator you want is already available in arena crate. It is unstable, so you have to use nightlies to use this crate. You can look at its sources if you want to know how it is implemented.
You may want to look at arena::TypedArena in the standard library (Note: this is not stable and, as a result, is only available in nightly builds).
If this doesn't fit your needs, you can always examine the source code (you can click the [src] link in the top right of the documentation) to see how it's done.