Difference between struct and enum - rust

I'm confused about this statement from the Rust Book:
There’s another advantage to using an enum rather than a struct: each variant can have different types and amounts of associated data. Version four type IP addresses will always have four numeric components that will have values between 0 and 255. If we wanted to store V4 addresses as four u8 values but still express V6 addresses as one String value, we wouldn’t be able to with a struct. Enums handle this case with ease:
#![allow(unused_variables)]
fn main() {
enum IpAddr {
V4(u8, u8, u8, u8),
V6(String),
}
let home = IpAddr::V4(127, 0, 0, 1);
let loopback = IpAddr::V6(String::from("::1"));
}
But when I tried the same thing with a struct, storing the V4 address as four u8 values and the V6 address as one String value, it also compiled and ran without any errors.
#[derive(Debug)]
struct IpAddr {
V4:(u8, u8, u8, u8),
V6:String,
}
fn main () {
let home = IpAddr {
V4: (127, 1, 1, 1),
V6: String::from("Hello"),
};
println!("{:#?}", home);
}

It's not the same. All enum elements have the very same size! The size of an enum element is the size of the largest variant plus the variant identifier.
With a struct it's a bit different. If we ignore padding, the size of the struct is the sum of the sizes of its members. With padding it will be a bit more:
fn main() {
let size = std::mem::size_of::<TheEnum>();
println!("Enum: {}", size * 8);
let size = std::mem::size_of::<TheStruct>();
println!("Struct: {}", size * 8);
}
struct TheStruct {
a: u64,
b: u8,
c: u64
}
enum TheEnum {
A(u64),
B(u8),
C(u64)
}
Here we can see the difference:
Enum: 128; 64 bits for the largest variant, plus the variant identifier, which is small but gets padded out to 64 bits to preserve alignment.
Struct: 192; aligned to 64 bits, so we have 56 bits of padding after the u8 field.
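The arithmetic behind those numbers can be checked directly. This is only a sketch: it assumes a typical 64-bit target and the default Rust layout, so the sizes are what current compilers produce in practice rather than a language guarantee. The helper name struct_padding_bytes is made up for the illustration.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct TheStruct {
    a: u64,
    b: u8,
    c: u64,
}

#[allow(dead_code)]
enum TheEnum {
    A(u64),
    B(u8),
    C(u64),
}

/// Hypothetical helper: total size of the struct minus the sum of its field sizes.
fn struct_padding_bytes() -> usize {
    size_of::<TheStruct>() - (size_of::<u64>() + size_of::<u8>() + size_of::<u64>())
}

fn main() {
    // Both types contain a u64, so both take the alignment of u64.
    println!("struct align:   {} bytes", align_of::<TheStruct>());
    println!("enum align:     {} bytes", align_of::<TheEnum>());

    // 17 bytes of fields, rounded up to a multiple of the alignment.
    println!("struct size:    {} bytes", size_of::<TheStruct>());
    println!("struct padding: {} bytes", struct_padding_bytes());

    // Largest variant plus a discriminant, rounded up to the alignment.
    println!("enum size:      {} bytes", size_of::<TheEnum>());
}
```

Multiplying the printed byte counts by 8 gives the bit counts quoted above.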
Another difference is in the way you use enums and structs. With an enum, a value holds exactly one of the variants: in your case, either V4 or V6. With a struct like the one in your example, you have to provide both the V4 and the V6 address every time. You cannot provide only V4 or only V6.
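To make that concrete, here is a small sketch using the IpAddr enum from the question. The describe function is a hypothetical helper, not something from the Rust Book; it shows that a value carries exactly one variant and that match has to handle each possibility.

```rust
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

/// Hypothetical helper: formats whichever variant the value actually holds.
fn describe(addr: &IpAddr) -> String {
    match addr {
        IpAddr::V4(a, b, c, d) => format!("{}.{}.{}.{}", a, b, c, d),
        IpAddr::V6(text) => text.clone(),
    }
}

fn main() {
    // Each value is built from one variant only; no dummy data for the other.
    let home = IpAddr::V4(127, 0, 0, 1);
    let loopback = IpAddr::V6(String::from("::1"));
    println!("{}", describe(&home));
    println!("{}", describe(&loopback));
}
```

With the struct version there is no equivalent of this: every value would carry both fields, and nothing in the type says which one is meaningful.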

Related

Zero-copy convert slice of integers to slice of bytes

How to convert e.g. &[u64] to &[u8]? I contend that it's safe to do with this method (edited to make harder to misuse):
use num_traits::PrimInt;
use std::mem::{align_of, size_of};
/// Reinterpret a slice of T as a slice of bytes without copying.
/// Only use with simple copy types like integers, floats, bools, etc. Don't use with structs or enums.
pub fn get_bytes<T: PrimInt>(array: &[T]) -> &[u8] {
// Add some checks to try and catch unsound use
debug_assert!(size_of::<T>() <= 16);
debug_assert!(size_of::<T>().is_power_of_two());
debug_assert_eq!(size_of::<T>(), align_of::<T>());
// Safety: &[u64] can be safely converted to &[u8]
// (so why doesn't rust have a safe method for this?)
unsafe { std::slice::from_raw_parts(array.as_ptr() as *const u8, array.len() * std::mem::size_of::<T>()) }
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8f30b03d44aadd6c720057337ac41236
That's how it would be written in C or C++. It's not safe to do the inverse conversion, because the alignment of the types differs. But casting down into a slice of bytes works, and it's why you can cast everything to char* in C.
Does Rust expose a safe method to do this? I'm currently just using the code above, but it'd be nice to get rid of one more unsafe block if I can. If not, why not? Is it unsafe for some reason I haven't considered?
The bytemuck crate is the usual choice for this kind of thing. It provides the cast_slice() function:
pub fn get_bytes<T: bytemuck::NoUninit>(array: &[T]) -> &[u8] {
bytemuck::cast_slice(array)
}
However, your function is unsound: it allows calling with types with padding bytes, but reinterpreting padding bytes (essentially uninit) as u8 is UB. bytemuck::cast_slice() prohibits this by requiring the type to implement NoUninit. You can #[derive(NoUninit)] for your types, as long as they satisfy all requirements.
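If zero-copy is not a hard requirement, the standard library alone offers a safe route by copying the bytes out. The sketch below is an assumption about what might be acceptable in your case, not a replacement for cast_slice(); to_bytes_copy is a hypothetical name, and native byte order is used so the result matches what a reinterpreting cast would show.

```rust
/// Hypothetical helper: copies a slice of u64 into a byte vector using the
/// platform's native byte order. Safe, but it allocates and copies.
fn to_bytes_copy(values: &[u64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 8);
    for v in values {
        // to_ne_bytes returns the 8 bytes of the integer as a [u8; 8].
        out.extend_from_slice(&v.to_ne_bytes());
    }
    out
}

fn main() {
    let values = [1u64, 2u64];
    let bytes = to_bytes_copy(&values);
    println!("{} bytes: {:?}", bytes.len(), bytes);
}
```

Swapping to_ne_bytes for to_le_bytes or to_be_bytes would pin the byte order, which a zero-copy cast cannot do.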
Your function is unsound:
#[derive(Debug, Clone, Copy)]
struct Thingy {
a: u32,
b: u16,
}
/// Reinterpret a slice of T as a slice of bytes without copying.
/// Only use with simple copy types like integers, floats, bools, etc. Don't use with structs or enums.
pub fn get_bytes<T: Copy>(array: &[T]) -> &[u8] {
// Safety: &[u64] can be safely converted to &[u8]
// (so why doesn't rust have a safe method for this?)
unsafe { std::slice::from_raw_parts(array.as_ptr() as *const u8, array.len() * std::mem::size_of::<T>()) }
}
fn main() {
let a = [Thingy {
a: 0xca_fc_e2_50,
b: 0x12_34,
}, Thingy {
a: 0x98_76_54_32,
b: 0xca_fc,
}];
let b: &[u8] = get_bytes(&a);
println!("{:?}", b);
// [80, 226, 252, 202, 52, 18, 0, 0, 50, 84, 118, 152, 252, 202, 0, 0]
}
As you can see, we can read the padding bytes between the Thingys (they happen to show up as 0 here), even though they were never written.
If the caller has to uphold some constraints, the function itself should be marked unsafe.
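A sketch of what that could look like: the function is declared unsafe and documents the obligation it places on the caller. The name get_bytes_unchecked is hypothetical, and the wording of the safety contract is my own reading of the problem described above.

```rust
use std::mem::{size_of, size_of_val};

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Thingy {
    a: u32,
    b: u16,
}

/// Reinterpret a slice of T as a slice of bytes without copying.
///
/// # Safety
/// `T` must not contain padding bytes (or any other uninitialized bytes).
unsafe fn get_bytes_unchecked<T: Copy>(array: &[T]) -> &[u8] {
    // SAFETY: the pointer and length cover exactly the memory of `array`;
    // the caller promises that every byte in it is initialized.
    unsafe { std::slice::from_raw_parts(array.as_ptr() as *const u8, size_of_val(array)) }
}

fn main() {
    // Thingy's fields take 6 bytes, but its size is rounded up to its
    // alignment, so it must not be passed to get_bytes_unchecked.
    println!("size_of::<Thingy>() = {}", size_of::<Thingy>());

    let values = [1u32, 2u32];
    // SAFETY: u32 has no padding bytes.
    let bytes = unsafe { get_bytes_unchecked(&values) };
    println!("{:?}", bytes);
}
```

The unsafe keyword does not make the padding case any less wrong; it only moves the responsibility to a place where the caller can see it.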

How to write a Vec of structs to a file? [duplicate]

I want to send my struct via a TcpStream. I could send String or u8, but I can not send an arbitrary struct. For example:
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = convert_struct(my_struct); // how??
tcp_stream.write(bytes);
After receiving the data, I want to convert &[u8] back to MyStruct. How can I convert between these two representations?
I know Rust has a JSON module for serializing data, but I don't want to use JSON because I want to send data as fast and small as possible, so I want no overhead, or very little.
A correctly sized struct as zero-copied bytes can be done using stdlib and a generic function.
In the example below there is a reusable function called any_as_u8_slice instead of convert_struct, since this is a utility that wraps the cast and the slice creation.
Note that although the question asks about converting, this example creates a read-only slice, which has the advantage of not needing to copy the memory.
Here's a working example based on the question:
unsafe fn any_as_u8_slice<T: Sized>(p: &T) -> &[u8] {
::core::slice::from_raw_parts(
(p as *const T) as *const u8,
::core::mem::size_of::<T>(),
)
}
fn main() {
struct MyStruct {
id: u8,
data: [u8; 1024],
}
let my_struct = MyStruct { id: 0, data: [1; 1024] };
let bytes: &[u8] = unsafe { any_as_u8_slice(&my_struct) };
// tcp_stream.write(bytes);
println!("{:?}", bytes);
}
Note 1) even though 3rd party crates might be better in some cases, this is such a primitive operation that it's useful to know how to do in Rust.
Note 2) at time of writing (Rust 1.15), there is no support for const functions. Once there is, it will be possible to cast into a fixed sized array instead of a slice.
Note 3) the any_as_u8_slice function is marked unsafe because any padding bytes in the struct may be uninitialized memory (giving undefined behavior).
If there were a way to ensure input arguments used only structs which were #[repr(packed)], then it could be safe.
Otherwise the function is fairly safe: it prevents buffer over-runs because the output is read-only, has a fixed number of bytes, and has its lifetime bound to the input. If you wanted a version that returned a &mut [u8], that would be quite dangerous, since modifying the bytes could easily create inconsistent or corrupt data.
(Shamelessly stolen and adapted from Renato Zannon's comment on a similar question)
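The question also asks how to get from &[u8] back to MyStruct, which the slice cast above does not cover. For a struct this simple, one option is to write the two directions by hand without any unsafe code. This is a sketch that copies the data rather than borrowing it; to_bytes, from_bytes and ENCODED_LEN are hypothetical names.

```rust
struct MyStruct {
    id: u8,
    data: [u8; 1024],
}

impl MyStruct {
    /// Encoded length: one byte for `id` plus the 1024 data bytes.
    const ENCODED_LEN: usize = 1 + 1024;

    /// Copy the fields into a byte buffer, `id` first.
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(Self::ENCODED_LEN);
        out.push(self.id);
        out.extend_from_slice(&self.data);
        out
    }

    /// Rebuild the struct; returns None if the input has the wrong length.
    fn from_bytes(bytes: &[u8]) -> Option<MyStruct> {
        if bytes.len() != Self::ENCODED_LEN {
            return None;
        }
        let mut data = [0u8; 1024];
        data.copy_from_slice(&bytes[1..]);
        Some(MyStruct { id: bytes[0], data })
    }
}

fn main() {
    let my_struct = MyStruct { id: 7, data: [1; 1024] };
    let bytes = my_struct.to_bytes();
    let decoded = MyStruct::from_bytes(&bytes).expect("length matches");
    println!("id = {}, first data byte = {}", decoded.id, decoded.data[0]);
}
```

Because the layout is spelled out field by field, padding never enters the picture, at the cost of one copy in each direction.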
Perhaps a solution like bincode would suit your case? Here's a working excerpt:
Cargo.toml
[package]
name = "foo"
version = "0.1.0"
authors = ["An Devloper <an.devloper@example.com>"]
edition = "2018"
[dependencies]
bincode = "1.0"
serde = { version = "1.0", features = ["derive"] }
main.rs
use serde::{Deserialize, Serialize};
use std::fs::File;
#[derive(Serialize, Deserialize)]
struct A {
id: i8,
key: i16,
name: String,
values: Vec<String>,
}
fn main() {
let a = A {
id: 42,
key: 1337,
name: "Hello world".to_string(),
values: vec!["alpha".to_string(), "beta".to_string()],
};
// Encode to something implementing `Write`
let mut f = File::create("/tmp/output.bin").unwrap();
bincode::serialize_into(&mut f, &a).unwrap();
// Or just to a buffer
let bytes = bincode::serialize(&a).unwrap();
println!("{:?}", bytes);
}
You would then be able to send the bytes wherever you want. I assume you are already aware of the issues with naively sending bytes around (like potential endianness issues or versioning), but I'll mention them just in case ^_^.
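To illustrate the endianness point with nothing but the standard library: if you pick a byte order explicitly when encoding, both ends agree regardless of the machine they run on. The encode and decode functions below are hypothetical helpers for the id and key fields only; they are not what bincode does internally.

```rust
/// Hypothetical helper: encode an id/key pair with an explicit little-endian layout.
fn encode(id: i8, key: i16) -> [u8; 3] {
    let k = key.to_le_bytes();
    [id as u8, k[0], k[1]]
}

/// Hypothetical helper: the inverse of `encode`.
fn decode(bytes: [u8; 3]) -> (i8, i16) {
    (bytes[0] as i8, i16::from_le_bytes([bytes[1], bytes[2]]))
}

fn main() {
    let bytes = encode(42, 1337);
    // 1337 is 0x0539, so the low byte 0x39 (57) comes first.
    println!("{:?}", bytes); // [42, 57, 5]
    println!("{:?}", decode(bytes)); // (42, 1337)
}
```

A raw memory cast would instead emit whatever order the sending machine uses, which is where the trouble starts.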

How do I use cbindgen to return and free a Box<Vec<_>>?

I have a struct returned to C code from Rust. I have no idea if it's a good way to do things, but it does work for rebuilding the struct and freeing memory without leaks.
#[repr(C)]
pub struct s {
// ...
}
#[repr(C)]
#[allow(clippy::box_vec)]
pub struct s_arr {
arr: *const s,
n: i8,
vec: Box<Vec<s>>,
}
/// Frees memory that was returned to C code
pub unsafe extern "C" fn free_s_arr(a: *mut s_arr) {
drop(Box::from_raw(a));
}
/// Generates an array for the C code
pub unsafe extern "C" fn gen_s_arr() -> *mut s_arr {
let mut many_s: Box<Vec<s>> = Box::new(Vec::new());
// ... logic here
Box::into_raw(Box::new(s_arr {
arr: many_s.as_mut_ptr(),
n: many_s.len() as i8,
vec: many_s,
}))
}
The C header is currently written by hand, but I wanted to try out cbindgen. The manual C definition for s_arr is:
struct s_arr {
struct s *arr;
int8_t n;
void *_;
};
cbindgen generates the following for s_arr:
typedef struct Box_Vec_s Box_Vec_s;
typedef struct s_arr {
const s *arr;
int8_t n;
Box_Vec_s vec;
} s_arr;
This doesn't work since struct Box_Vec_s is not defined. Ideally I would just want to override the cbindgen type generated for vec to make it void * since it requires no code changes and thus no additional testing, but I am open to other suggestions.
I have looked through the cbindgen documentation, though not the examples, and couldn't find anything.
Your question is a bit unclear, but I think that if I understood you right, you're confusing two things and being led down a dark alley as a result.
In C, a dynamically-sized array, as you probably know, is identified by two things:
Its starting position, as a pointer
Its length
Rust follows the same convention: a Vec<_>, under the hood, has the same structure (well, almost; it has a capacity as well, but that's beside the point).
Passing the boxed vector on top of a pointer is not only overkill, but extremely unwise. FFI bindings may be smart, but they're not smart enough to deal with a boxed complex type most of the time.
To solve this, we're going to simplify your bindings. I've added a single element in struct S to show you how it works. I've also cleaned up your FFI boundary:
#[repr(C)]
pub struct S {
foo: u8
}
#[repr(C)]
pub struct s_arr {
arr: *mut S,
n: usize,
cap: usize
}
// Retrieve the vector back
pub unsafe extern "C" fn recombine_s_arr(ptr: *mut S, n: usize, cap: usize) -> Vec<S> {
Vec::from_raw_parts(ptr, n, cap)
}
#[no_mangle]
pub unsafe extern "C" fn gen_s_arr() -> s_arr {
let mut many_s: Vec<S> = Vec::new();
let output = s_arr {
arr: many_s.as_mut_ptr(),
n: many_s.len(),
cap: many_s.capacity()
};
std::mem::forget(many_s);
output
}
With this, cbindgen returns the expected header definitions:
typedef struct {
uint8_t foo;
} so58311426S;
typedef struct {
so58311426S *arr;
uintptr_t n;
uintptr_t cap;
} so58311426s_arr;
so58311426s_arr gen_s_arr(void);
This allows us to call gen_s_arr() from either C or Rust and retrieve a struct that is usable across both parts of the FFI boundary (so58311426s_arr). This struct contains all we need to be able to modify our array of S (well, so58311426S according to cbindgen).
When passing through FFI, you need to make sure of a few simple things:
You cannot pass raw boxes or non-primitive types; you will almost universally need to convert down to a set of pointers or change your definitions to accommodate (as I have done here).
You most definitely do not pass raw vectors. At most, you pass the raw parts of a slice, that is, a pointer and a length, as those are primitive types (see the point above).
You make sure to std::mem::forget() whatever you do not want to deallocate, and make sure to remember to deallocate it or reform it somewhere else.
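The last point can be sketched as a matching pair of functions: one hands the Vec out as raw parts, the other rebuilds it so that Rust frees the allocation. This is an illustration under assumptions, not the code from the answer above: free_s_arr takes the struct by value, ManuallyDrop is swapped in for std::mem::forget, the vector is filled with three sample elements, and the no_mangle attribute that a real export needs is left out to keep the sketch minimal.

```rust
use std::mem::ManuallyDrop;

#[repr(C)]
pub struct S {
    foo: u8,
}

#[repr(C)]
#[allow(non_camel_case_types)]
pub struct s_arr {
    arr: *mut S,
    n: usize,
    cap: usize,
}

/// Hand a Vec<S> to the caller as pointer, length and capacity.
pub extern "C" fn gen_s_arr() -> s_arr {
    let many_s = vec![S { foo: 1 }, S { foo: 2 }, S { foo: 3 }];
    // ManuallyDrop keeps the Vec's destructor from running, so the
    // allocation stays alive after this function returns.
    let mut many_s = ManuallyDrop::new(many_s);
    s_arr {
        arr: many_s.as_mut_ptr(),
        n: many_s.len(),
        cap: many_s.capacity(),
    }
}

/// Rebuild the Vec from its raw parts and drop it, freeing the allocation.
///
/// # Safety
/// `a` must come from `gen_s_arr` and must not be freed twice.
pub unsafe extern "C" fn free_s_arr(a: s_arr) {
    drop(unsafe { Vec::from_raw_parts(a.arr, a.n, a.cap) });
}

fn main() {
    let a = gen_s_arr();
    // SAFETY: `a.arr` points at `a.n` (here 3) initialized elements.
    let first = unsafe { (*a.arr).foo };
    println!("n = {}, first.foo = {}", a.n, first);
    // SAFETY: `a` was produced by gen_s_arr and is freed exactly once.
    unsafe { free_s_arr(a) };
}
```

The capacity has to travel with the pointer and length because Vec::from_raw_parts needs all three to rebuild the vector correctly.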
I will edit this question in an hour; I have a plane to get on to. Let me know if any of this needs clarifications and I'll get to it once I'm in the right country :-)
