I'm reading a series of bytes from a socket and I need to put each segment of n bytes as an item in a struct.
use std::mem;
#[derive(Debug)]
struct Things {
x: u8,
y: u16,
}
fn main() {
let array = [22 as u8, 76 as u8, 34 as u8];
let foobar: Things;
unsafe {
foobar = mem::transmute::<[u8; 3], Things>(array);
}
println!("{:?}", foobar);
}
I'm getting errors that say that foobar is 32 bits when array is 24 bits. Shouldn't foobar be 24 bits (8 + 16 = 24)?
The issue here is that the y field is 16-bit-aligned. So your memory layout is actually
x
padding
y
y
Note that swapping the order of x and y doesn't help, because Rust's default memory layout for structs is unspecified: the compiler is free to reorder fields. Either way the struct is 32 bits, because its size is rounded up to a multiple of its 16-bit alignment. If you depend on the default layout you will get undefined behavior.
The reasons for alignment are explained in Purpose of memory alignment.
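To see the padding concretely, here is a minimal sketch that prints the size and alignment of the same two fields under three layouts. The struct names and the layout_of helper are made up for illustration, and the numbers in the comments assume a target where u16 has 2-byte alignment, which holds on common desktop targets.

```rust
use std::mem::{align_of, size_of};

// Default (unspecified) Rust layout.
#[allow(dead_code)]
struct ThingsRust {
    x: u8,
    y: u16,
}

// C-compatible layout: fields in declaration order, padded for alignment.
#[allow(dead_code)]
#[repr(C)]
struct ThingsC {
    x: u8,
    y: u16,
}

// C field order with all padding removed.
#[allow(dead_code)]
#[repr(C, packed)]
struct ThingsPacked {
    x: u8,
    y: u16,
}

/// Returns (size, alignment) of `T` in bytes.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    println!("default: {:?}", layout_of::<ThingsRust>()); // 4 bytes, 2-byte aligned
    println!("repr(C): {:?}", layout_of::<ThingsC>()); // 4 bytes, 2-byte aligned
    println!("packed:  {:?}", layout_of::<ThingsPacked>()); // 3 bytes, 1-byte aligned
}
```

Only the packed variant has the 3-byte size that the transmute in the question expects.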
You can prevent padding from being inserted by adding the attribute repr(packed) to your struct, but you'll lose performance and the ability to take references to fields:
#[repr(C, packed)] // repr(C) additionally fixes the field order
struct Things {
x: u8,
y: u16,
}
The best way would be to not use transmute at all, but to extract the values manually and hope the optimizer makes it fast:
let foobar = Things {
x: array[0],
y: ((array[1] as u16) << 8) | (array[2] as u16),
};
A crate like byteorder may simplify the process of reading different sizes and endianness from the bytes.
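If pulling in a crate is not an option, the standard library's from_be_bytes and from_le_bytes cover the same ground for single integers. The sketch below assumes the wire format is x followed by y in big-endian order, matching the shifts above; parse_things is a hypothetical helper name.

```rust
#[derive(Debug, PartialEq)]
struct Things {
    x: u8,
    y: u16,
}

/// Decodes one `Things` from three bytes: x first, then y in big-endian order
/// (the byte order is an assumption about the wire format).
fn parse_things(bytes: [u8; 3]) -> Things {
    Things {
        x: bytes[0],
        y: u16::from_be_bytes([bytes[1], bytes[2]]),
    }
}

fn main() {
    let array = [22u8, 76, 34];
    let foobar = parse_things(array);
    println!("{:?}", foobar);
}
```

If the sender uses little-endian order instead, swapping in u16::from_le_bytes is the only change needed.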
bincode and serde can do this quite simply.
use bincode::{deserialize};
use serde::{Deserialize};
#[derive(Deserialize, Debug)]
struct Things {
x: u8,
y: u16,
}
fn main() {
let array = [22 as u8, 76 as u8, 34 as u8];
let foobar: Things = deserialize(&array).unwrap();
println!("{:?}", foobar);
}
This also works well for serializing a struct into bytes.
use bincode::{serialize};
use serde::{Serialize};
#[derive(Serialize, Debug)]
struct Things {
x: u8,
y: u16,
}
fn main() {
let things = Things{
x: 22,
y: 8780,
};
let baz = serialize(&things).unwrap();
println!("{:?}", baz);
}
I was having issues using the byteorder crate when dealing with structs that also had char arrays. I couldn't get past the compiler errors. I ended up casting like this:
#[repr(C, packed)] // repr(C) guarantees x comes before y; plain packed does not
struct Things {
x: u8,
y: u16,
}
fn main() {
let data: [u8; 3] = [0x22, 0x76, 0x34];
unsafe {
let things_p: *const Things = data.as_ptr() as *const Things;
let things: &Things = &*things_p;
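// Copy the fields out first: formatting takes references, and references
// to unaligned fields of a packed struct are not allowed.
let (x, y) = (things.x, things.y);
println!("{:x} {:x}", x, y);
}
}
Note that with packed, passing things.y directly to println! takes a reference to an unaligned field. Older compilers accepted that with the warning below, and newer ones reject it:
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!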
If you can, change Things to behave like a C struct:
#[repr(C)]
struct Things2 {
x: u8,
y: u16,
}
Then initialize data like this. Note the extra byte for alignment purposes.
let data: [u8; 4] = [0x22, 0, 0x76, 0x34];
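A byte array only has 1-byte alignment, so turning its pointer into a &Things2 is still not guaranteed to be sound. One way around that, sketched here under the assumption that the buffer is exactly as large as the struct, is to copy the value out with ptr::read_unaligned. The read_things2 helper is hypothetical, and y is decoded in the machine's native byte order.

```rust
use std::mem::size_of;
use std::ptr;

#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct Things2 {
    x: u8,
    y: u16,
}

/// Copies a `Things2` out of a 4-byte buffer without requiring the buffer
/// to be 2-byte aligned. Byte 1 is the padding byte and is ignored.
fn read_things2(data: &[u8; 4]) -> Things2 {
    // The buffer must cover the whole struct, padding included.
    assert_eq!(size_of::<Things2>(), data.len());
    // Sound here because every bit pattern is a valid u8 and a valid u16.
    unsafe { ptr::read_unaligned(data.as_ptr() as *const Things2) }
}

fn main() {
    let data: [u8; 4] = [0x22, 0, 0x76, 0x34];
    let things = read_things2(&data);
    println!("{:x} {:x}", things.x, things.y);
}
```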
fn main() {
let bytes = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0xff];
// A u64 is 8 bytes wide, so point at an 8-byte slice. The Vec's buffer is
// not guaranteed to be 8-byte aligned, so use an unaligned read.
let data_ptr = bytes[0..8].as_ptr() as *const u64;
let data: u64 = unsafe { data_ptr.read_unaligned() };
println!("{:#x}", data);
}
TL;DR: I thought that the packed attribute in Rust always strips any padding between the fields but apparently this is only true for packed(1).
I want my struct to represent the exact bytes in memory without any additional padding between fields, but the struct also needs to be page-aligned. The compiler output isn't what I expect it to be in my code example. From the language reference [0] I understood that packed(N) aligns the struct to an N-byte boundary. I expected only the beginning of the struct to be aligned, with no padding between the fields. However, I found out that:
#[repr(C, packed(4096))]
struct Foo {
first: u8,
second: u32,
}
let foo = Foo { first: 0, second: 0 };
println!("foo is page-aligned: {}", &foo as *const _ as usize & 0xfff == 0);
println!("{:?}", &foo.first as *const _);
println!("{:?}", &foo.second as *const _);
println!("padding between fields: {}", &foo.second as *const _ as usize - &foo.first as *const _ as usize);
results in
foo is page-aligned: false
0x7ffc85be5eb8
0x7ffc85be5ebc
padding between fields: 4
Why is the struct not page-aligned and why is there padding between the fields? I found out that I can achieve what I want with
#[repr(align(4096))]
struct PageAligned<T>(T);
#[repr(C, packed)]
struct Foo {
first: u8,
second: u32,
}
let foo = Foo { first: 0, second: 0 };
let aligned_foo = PageAligned(Foo { first: 0, second: 0 });
it results in
foo is page-aligned: true
0x7ffd18c12000
0x7ffd18c12001
padding between fields: 1
but I think this is counter-intuitive. Is this how it is supposed to work? I'm on Rust stable 1.57.
To meet the requirements with the available tools, your best option may be to construct a substitute for your u32 field which naturally has an alignment of 1:
#[repr(C, align(4096))]
struct Foo {
first: u8,
second: [u8; 4],
}
impl Foo {
fn second(&self) -> u32 {
u32::from_ne_bytes(self.second)
}
fn set_second(&mut self, value: u32) {
self.second = u32::to_ne_bytes(value);
}
}
This struct's layout passes your tests.
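The same checks as in the question can be expressed as a small test program. This is only a sketch: offset_of_second and is_page_aligned are hypothetical helper names, and placing the value on the stack assumes the target can honor a 4096-byte alignment request for locals.

```rust
use std::mem::{align_of, size_of};

#[repr(C, align(4096))]
struct Foo {
    first: u8,
    second: [u8; 4],
}

impl Foo {
    fn second(&self) -> u32 {
        u32::from_ne_bytes(self.second)
    }
}

/// Byte offset of `second` from the start of the struct.
fn offset_of_second(foo: &Foo) -> usize {
    foo.second.as_ptr() as usize - foo as *const Foo as usize
}

/// True if the struct starts on a 4096-byte boundary.
fn is_page_aligned(foo: &Foo) -> bool {
    (foo as *const Foo as usize) & 0xfff == 0
}

fn main() {
    let foo = Foo { first: 1, second: 7u32.to_ne_bytes() };
    println!("first = {}, second = {}", foo.first, foo.second());
    println!("size = {}, align = {}", size_of::<Foo>(), align_of::<Foo>());
    println!("foo is page-aligned: {}", is_page_aligned(&foo));
    println!("offset of second: {}", offset_of_second(&foo));
}
```

Note that align(4096) also rounds the size of the struct up to 4096 bytes, since a type's size is always a multiple of its alignment.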
I need to write a function that returns array of u16 integers in Rust. This function then should be used by FFI.
extern crate libc;
use libc::{uint16_t};
#[no_mangle]
pub extern fn ffi_test() -> *const uint16_t {
let test: [u16;4] = [1,2,3,4];
test.as_ptr()
}
Rust code compiles without errors. I used Ruby to test the ffi call:
# coding: utf-8
require 'ffi'
module MyMod
extend FFI::Library
ffi_lib 'my_ffi_test_lib'
attach_function :ffi_test, [], :pointer
end
a_ptr = MyMod.ffi_test
size = 4
result_array = a_ptr.read_array_of_uint16(size)
p result_array
But the results are totally wrong (expected: [1, 2, 3, 4]):
$ ruby ffi_test.rb
[57871, 25191, 32767, 0]
It looks as if I am reading a totally different memory address. Should I perhaps not use #as_ptr() on a Rust array?
EDIT
As per the recommendation of @FrenchBoiethios, I tried to box the array:
extern crate libc;
use libc::{uint16_t};
#[no_mangle]
pub extern fn ffi_test() -> *mut uint16_t {
let test: [u16;4] = [1,2,3,4];
let b = Box::new(test);
Box::into_raw(b)
}
This gives compile error:
note: expected type `std::boxed::Box<u16>`
found type `std::boxed::Box<[u16; 4]>`
Your array is on the stack, so there is a lifetime issue when you return a pointer to it: the pointer refers to a local variable that no longer exists once the function returns. You must allocate it on the heap:
#[no_mangle]
pub extern "C" fn ffi_test() -> *mut u16 {
let mut test = vec![1, 2, 3, 4];
let ptr = test.as_mut_ptr();
std::mem::forget(test); // so that it is not destructed at the end of the scope
ptr
}
or
#[no_mangle]
pub extern "C" fn ffi_test() -> *mut u16 {
let test = Box::new([1u16, 2, 3, 4]); // type must be explicit here...
Box::into_raw(test) as *mut _ // ... because this cast can convert
// *mut [i32; 4] to *mut u16
}
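Both variants leak the allocation unless the foreign caller hands the pointer back to Rust for freeing. A minimal sketch of such a pairing follows; ffi_test_free is a hypothetical companion function that is not part of the answer above, and the export attribute is left as a comment so the sketch compiles under any edition.

```rust
// In a real library, mark both functions #[no_mangle]
// (spelled #[unsafe(no_mangle)] in the 2024 edition).

/// Hands a heap-allocated [u16; 4] to the caller, who now owns it.
pub extern "C" fn ffi_test() -> *mut u16 {
    let test = Box::new([1u16, 2, 3, 4]);
    Box::into_raw(test) as *mut u16
}

/// Hypothetical counterpart: takes the pointer back and frees it.
///
/// # Safety
/// `ptr` must come from `ffi_test` and must not be used afterwards.
pub unsafe extern "C" fn ffi_test_free(ptr: *mut u16) {
    if !ptr.is_null() {
        // Rebuild the Box with its original type so the right layout is freed.
        drop(unsafe { Box::from_raw(ptr as *mut [u16; 4]) });
    }
}

fn main() {
    let ptr = ffi_test();
    // Copy the values out while the allocation is still alive.
    let values = unsafe { std::slice::from_raw_parts(ptr, 4) }.to_vec();
    unsafe { ffi_test_free(ptr) };
    println!("{:?}", values);
}
```

On the Ruby side this would mean attaching the free function as well and calling it once the array has been read.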
I am trying to learn Rust FFI; these implementations are a Frankenstein creation from different sources on the internet, so take them with a grain of salt.
Currently I am using two approaches:
a) Release the array from Rust's ownership (Rust has no GC) and return the pointer. The user needs to promise to call free later.
#[repr(C)]
pub struct V2 {
pub x: i32,
pub y: i32,
}
#[repr(C)]
struct Buffer {
len: i32,
data: *mut V2,
}
#[no_mangle]
extern "C" fn generate_data() -> Buffer {
let mut buf = vec![V2 { x: 1, y: 0 }, V2 { x: 2, y: 0}].into_boxed_slice();
let data = buf.as_mut_ptr();
let len = buf.len() as i32;
std::mem::forget(buf);
Buffer { len, data }
}
#[no_mangle]
extern "C" fn free_buf(buf: Buffer) {
// Rebuild the slice (pointer plus length) so the whole boxed slice is freed, not just one V2.
let s = unsafe { std::slice::from_raw_parts_mut(buf.data, buf.len as usize) };
unsafe {
drop(Box::from_raw(s as *mut [V2]));
}
}
b) Send the array through an FFI callback function. The user needs to promise not to keep references, but doesn't need to call free.
#[no_mangle]
pub extern "C" fn context_get_byte_responses(callback: extern "stdcall" fn (*mut u8, i32)) -> bool {
let mut bytes: Vec<u8> = vec![]; // as_mut_ptr needs a mutable binding
callback(bytes.as_mut_ptr(), bytes.len() as i32);
true
}
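To show how approach b) looks from the caller's side, here is a self-contained sketch. It deviates from the snippet above in ways that are assumptions for illustration: the callback uses the "C" ABI and a usize length rather than "stdcall" and i32, and with_bytes, sum_bytes and SUM are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Where the example callback stores its result.
static SUM: AtomicUsize = AtomicUsize::new(0);

/// Calls `callback` with a pointer and length that are valid only for the
/// duration of the call; the Vec is freed when this function returns.
pub extern "C" fn with_bytes(callback: extern "C" fn(*const u8, usize)) -> bool {
    let bytes: Vec<u8> = vec![1, 2, 3];
    callback(bytes.as_ptr(), bytes.len());
    true
}

/// Example callback: reads the bytes immediately and keeps no pointer.
extern "C" fn sum_bytes(data: *const u8, len: usize) {
    let slice = unsafe { std::slice::from_raw_parts(data, len) };
    let total: usize = slice.iter().map(|&b| b as usize).sum();
    SUM.store(total, Ordering::SeqCst);
}

fn main() {
    let ok = with_bytes(sum_bytes);
    println!("ok = {}, sum = {}", ok, SUM.load(Ordering::SeqCst));
}
```

The callback has to copy whatever it wants to keep, because the buffer is gone as soon as with_bytes returns.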
I have a structure with some fixed-sized arrays:
struct PublicHeaderBlock_LAS14 {
file_signature: [u8; 4],
file_source_id: u16,
global_encoding: u16,
project_id_data_1: u32,
project_id_data_2: u16,
project_id_data_3: u16,
project_id_data_4: [u8; 8],
version_major: u8,
version_minor: u8,
systemIdentifier: [u8; 32], // ...
}
I'm reading in bytes from a file into a fixed size array and am copying those bytes into the struct bit by bit.
fn create_header_struct_las14(&self, buff: &[u8; 373]) -> PublicHeaderBlock_LAS14 {
PublicHeaderBlock_LAS14 {
file_signature: [buff[0], buff[1], buff[2], buff[3]],
file_source_id: (buff[4] | buff[5] << 7) as u16,
global_encoding: (buff[6] | buff[7] << 7) as u16,
project_id_data_1: (buff[8] | buff[9] << 7 | buff[10] << 7 | buff[11] << 7) as u32,
project_id_data_2: (buff[12] | buff[13] << 7) as u16,
project_id_data_3: (buff[14] | buff[15] << 7) as u16,
project_id_data_4: [buff[16], buff[17], buff[18], buff[19], buff[20], buff[21], buff[22], buff[23]],
version_major: buff[24],
version_minor: buff[25],
systemIdentifier: buff[26..58]
}
}
The last line (systemIdentifier) doesn't work, because in the struct it is a [u8; 32] and buff[26..58] is a slice. Can I convert a slice over a range to a fixed-size array like that, instead of doing what I've done for, say, file_signature?
Edit: Since Rust 1.34, you can use TryInto, which is available through the TryFrom<&[T]> implementation for [T; N]:
use std::convert::TryInto; // already in the prelude since the 2021 edition
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
// The unwrap cannot fail: the range 26..58 is exactly 32 bytes long
let arr: [u8; 32] = s[26..58].try_into().unwrap();
Foo { arr }
}
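Applied to a header like the one in the question, the same idea might look like the sketch below. It covers only three of the fields, the Header and parse_header names are hypothetical, and it assumes multi-byte integers are stored little-endian, which is why it uses u16::from_le_bytes instead of manual shifts.

```rust
use std::convert::TryInto; // redundant from the 2021 edition onwards

#[derive(Debug)]
struct Header {
    file_signature: [u8; 4],
    file_source_id: u16,
    system_identifier: [u8; 32],
}

/// Parses a subset of the header fields from a fixed-size buffer.
/// Integers are assumed to be little-endian.
fn parse_header(buff: &[u8; 373]) -> Header {
    Header {
        // Each range has exactly the target array's length, so unwrap cannot fail.
        file_signature: buff[0..4].try_into().unwrap(),
        file_source_id: u16::from_le_bytes([buff[4], buff[5]]),
        system_identifier: buff[26..58].try_into().unwrap(),
    }
}

fn main() {
    let mut buff = [0u8; 373];
    buff[0..4].copy_from_slice(b"LASF");
    buff[4] = 0x34;
    buff[5] = 0x12;
    let header = parse_header(&buff);
    println!("{:?}", header);
}
```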
Original answer from 2016:
There is no safe way to initialize an array in a struct with a slice. You need to either resort to an unsafe block that operates directly on uninitialized memory, or use one of the following two initialize-then-mutate strategies:
Construct the desired array, then use it to initialize the struct.
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
let mut a: [u8; 32] = Default::default();
a.copy_from_slice(&s[26..58]);
Foo { arr: a }
}
Or initialize the struct, then mutate the array inside the struct.
#[derive(Default)]
struct Foo {
arr: [u8; 32],
}
fn fill(s: &[u8; 373]) -> Foo {
let mut f: Foo = Default::default();
f.arr.copy_from_slice(&s[26..58]);
f
}
The first one is cleaner if your struct has many members. The second one may be a little faster if the compiler cannot optimize out the intermediate copy. But you probably will use the unsafe method if this is the performance bottleneck of your program.
Thanks to @malbarbo we can use this helper function:
use std::convert::AsMut;
fn clone_into_array<A, T>(slice: &[T]) -> A
where A: Sized + Default + AsMut<[T]>,
T: Clone
{
let mut a = Default::default();
<A as AsMut<[T]>>::as_mut(&mut a).clone_from_slice(slice);
a
}
to get a much neater syntax:
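// Example was not defined in the original snippet; the field types here are
// assumed from the slice lengths used below.
#[derive(Debug)]
struct Example {
a: [i32; 4],
b: [i32; 6],
}
fn main() {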
let original = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let e = Example {
a: clone_into_array(&original[0..4]),
b: clone_into_array(&original[4..10]),
};
println!("{:?}", e);
}
as long as T: Default + Clone.
It will panic! if the target array and the passed-in slice do not have the same length, because clone_from_slice does.
As far as I know, the Rust compiler is allowed to pack, reorder, and add padding to each field of a struct. How can I specify the precise memory layout if I need it?
In C#, I have the StructLayout attribute, and in C/C++, I could use various compiler extensions. I could verify the memory layout by checking the byte offset of expected value locations.
I'd like to write OpenGL code employing custom shaders, which needs precise memory layout. Is there a way to do this without sacrificing performance?
As described in the FFI guide, you can add attributes to structs to use the same layout as C:
#[repr(C)]
struct Object {
a: i32,
// other members
}
and you also have the ability to pack the struct:
#[repr(C, packed)]
struct Object {
a: i32,
// other members
}
And for detecting that the memory layout is ok, you can initialize a struct and check that the offsets are ok by casting the pointers to integers:
#[repr(C, packed)]
struct Object {
a: u8,
b: u16,
c: u32, // other members
}
fn main() {
let obj = Object {
a: 0xaa,
b: 0xbbbb,
c: 0xcccccccc,
};
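// addr_of! yields raw pointers without creating references to unaligned packed fields
let a_ptr: *const u8 = std::ptr::addr_of!(obj.a);
let b_ptr: *const u16 = std::ptr::addr_of!(obj.b);
let c_ptr: *const u32 = std::ptr::addr_of!(obj.c);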
let base = a_ptr as usize;
println!("a: {}", a_ptr as usize - base);
println!("b: {}", b_ptr as usize - base);
println!("c: {}", c_ptr as usize - base);
}
outputs:
a: 0
b: 1
c: 3
There's no longer to_uint. In current Rust (1.51 or later, which provides addr_of!), the code can be:
#[repr(C, packed)]
struct Object {
a: i8,
b: i16,
c: i32, // other members
}
fn main() {
let obj = Object {
a: 0x1a,
b: 0x1bbb,
c: 0x1ccccccc,
};
let base = &obj as *const _ as usize;
// addr_of! avoids taking references to unaligned packed fields
let a_off = std::ptr::addr_of!(obj.a) as usize - base;
let b_off = std::ptr::addr_of!(obj.b) as usize - base;
let c_off = std::ptr::addr_of!(obj.c) as usize - base;
println!("a: {}", a_off);
println!("b: {}", b_off);
println!("c: {}", c_off);
}
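On newer toolchains the offsets can also be checked without constructing a value at all. The sketch below uses std::mem::offset_of!, which to my knowledge requires Rust 1.77 or later; the const assertions turn a layout mismatch into a compile error.

```rust
use std::mem::{offset_of, size_of};

#[allow(dead_code)]
#[repr(C, packed)]
struct Object {
    a: u8,
    b: u16,
    c: u32,
}

// Compile-time layout checks: the build fails if the layout ever changes.
const _: () = assert!(offset_of!(Object, b) == 1);
const _: () = assert!(offset_of!(Object, c) == 3);
const _: () = assert!(size_of::<Object>() == 7);

fn main() {
    println!("a: {}", offset_of!(Object, a));
    println!("b: {}", offset_of!(Object, b));
    println!("c: {}", offset_of!(Object, c));
    println!("size: {}", size_of::<Object>());
}
```

This also sidesteps the question of references to packed fields, because no field is ever accessed.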
You also can set memory layout for "data-carrying enums" like this.
#[repr(u8)] // "Int" stands for any primitive integer type; u8 is used here as an example
enum MyEnum {
A(u32),
B(f32, u64),
C { x: u32, y: u8 },
D,
}
Details are described in the unsafe code guidelines and RFC 2195:
https://rust-lang.github.io/unsafe-code-guidelines/layout/enums.html
https://rust-lang.github.io/rfcs/2195-really-tagged-unions.html#motivation
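As a small illustration of what that guaranteed layout buys you, the sketch below reads the tag of a #[repr(u8)] enum through a pointer cast, which is valid because such an enum starts with its u8 discriminant. The tag helper is a hypothetical name, and the tag values follow declaration order starting at 0 since no explicit discriminants are given.

```rust
#[allow(dead_code)]
#[repr(u8)]
enum MyEnum {
    A(u32),
    B(f32, u64),
    C { x: u32, y: u8 },
    D,
}

/// Reads the discriminant byte of a `#[repr(u8)]` enum.
fn tag(e: &MyEnum) -> u8 {
    // Valid only because #[repr(u8)] places a u8 tag at offset 0.
    unsafe { *(e as *const MyEnum as *const u8) }
}

fn main() {
    println!("A -> {}", tag(&MyEnum::A(5)));
    println!("B -> {}", tag(&MyEnum::B(1.0, 2)));
    println!("C -> {}", tag(&MyEnum::C { x: 1, y: 2 }));
    println!("D -> {}", tag(&MyEnum::D));
}
```

Without the repr attribute the same cast would be unsound, because the default enum layout is unspecified.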