Rust: Using two u8 struct fields as a u16

For a Gameboy emulator, you have two u8 fields for registers A and F, but they can sometimes be accessed as AF, a combined u16 register.
In C, it looks like you can do something like this:
struct {
union {
struct {
unsigned char f;
unsigned char a;
};
unsigned short af;
};
};
(Taken from here)
Is there a way in Rust, ideally without unsafe, to access two u8s as registers.a/registers.f while also being able to use them as the u16 registers.af?

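I can give you a couple of ways to do it. The first is a straightforward unsafe analogue of the C code with no extra boilerplate; the second is safe but more explicit.
Unions in Rust are very similar to C unions, so you can translate it to this (as in C, which byte of af each field aliases depends on the host's endianness):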
#[repr(C)]
struct Inner {
f: u8,
a: u8,
}
#[repr(C)]
union S {
inner: Inner,
af: u16,
}
// Usage:
// Putting data is safe:
let s = S { af: 12345 };
// but retrieving is not:
let a = unsafe { s.inner.a };
Alternatively, you can do all of the conversions explicitly and wrap them in a structure:
#[repr(transparent)]
// The derives are optional, but Copy allows method chaining.
// You may remove them and change the method
// signatures to `&self` and `&mut self`.
#[derive(Clone, Copy)]
struct T(u16);
impl T {
pub fn from_af(af: u16) -> Self {
Self(af)
}
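pub fn from_a_f(a: u8, f: u8) -> Self {
// Little-endian byte order: f is the low byte, a is the high byte,
// matching the indices used by f() and a() below.
Self::from_af(u16::from_le_bytes([f, a]))
}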
pub fn af(self) -> u16 {
self.0
}
pub fn f(self) -> u8 {
self.0.to_le_bytes()[0]
}
pub fn set_f(self, f: u8) -> Self {
Self::from_a_f(self.a(), f)
}
pub fn a(self) -> u8 {
self.0.to_le_bytes()[1]
}
pub fn set_a(self, a: u8) -> Self {
Self::from_a_f(a, self.f())
}
}
// Usage:
let t = T::from_af(12345);
let a = t.a();
let new_af = t.set_a(12).set_f(t.f() + 1).af();
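
If you would rather keep the two u8 fields and mutate them in place, the combined register can be computed on demand. This is only a minimal sketch: the `Registers` name and its methods are illustrative, and it assumes A is the high byte and F the low byte of AF (which is what the C union gives on a little-endian host).

```rust
/// Hypothetical register file fragment: A and F stay separate u8 fields.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Registers {
    pub a: u8,
    pub f: u8,
}

impl Registers {
    /// Combined view: A is the high byte, F is the low byte (assumption).
    pub fn af(&self) -> u16 {
        u16::from_be_bytes([self.a, self.f])
    }

    /// Splits a u16 back into the two u8 fields.
    pub fn set_af(&mut self, af: u16) {
        let [a, f] = af.to_be_bytes();
        self.a = a;
        self.f = f;
    }
}
```

Because the byte order is spelled out with from_be_bytes/to_be_bytes, the result does not depend on the endianness of the machine running the emulator, and no unsafe is needed.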

Related

Is there a way to emit a compile error if a struct contains padding?

I'm really after an opt-in derivable trait that safely returns an object's unique representation as bytes. In my application, I noticed a ~20x speedup by hashing as a byte array over the derived implementation. AFAIK, this is safe for Copy types with a well-defined representation and no padding. The current implementation expands to something like this.
use core::mem::{size_of, transmute, MaybeUninit};
pub trait ByteValued: Copy {
fn byte_repr(&self) -> &[u8];
}
pub struct AssertByteValued<T: ByteValued> {
_phantom: ::core::marker::PhantomData<T>
}
macro_rules! impl_byte_repr {
() => {
fn byte_repr(&self) -> &[u8] {
let len = size_of::<Self>();
unsafe {
let self_ptr: *const u8 = transmute(self as *const Self);
core::slice::from_raw_parts(self_ptr, len)
}
}
}
}
// Manual implementations for builtin/std types
impl ByteValued for u32 { impl_byte_repr!{} }
impl ByteValued for usize { impl_byte_repr!{} }
impl<T: ByteValued> ByteValued for MaybeUninit<T> { impl_byte_repr!{} }
impl<T: ByteValued, const N: usize> ByteValued for [T; N] { impl_byte_repr!{} }
// Expanded version of a proc_macro generated derived implementation
pub struct ArrayVec<T, const CAP: usize> {
data: [MaybeUninit<T>; CAP],
len: usize,
}
impl<T: Clone, const CAP: usize> Clone for ArrayVec<T, CAP> {
fn clone(&self) -> Self { todo!() }
}
impl<T: Copy, const CAP: usize> Copy for ArrayVec<T, CAP> {}
// This is only valid if all unused capacity is always consistently represented
impl<T: ByteValued, const CAP: usize> ByteValued for ArrayVec<T, CAP> {
fn byte_repr(&self) -> &[u8] {
// Compile-time check that all fields are also ByteValued
let _: AssertByteValued<[MaybeUninit<T>; CAP]>;
let _: AssertByteValued<usize>;
// Runtime check for no padding
let _self_size = size_of::<Self>();
let _field_size = size_of::<[MaybeUninit<T>; CAP]>() + size_of::<usize>();
assert!(_self_size == _field_size, "Must not contain padding");
let len = size_of::<Self>();
unsafe {
let self_ptr: *const u8 = transmute(self as *const Self);
::core::slice::from_raw_parts(self_ptr, len)
}
}
}
fn main() {
let x = ArrayVec::<u32, 4> {
data: unsafe { MaybeUninit::zeroed().assume_init() },
len: 0
};
let bytes = x.byte_repr();
assert_eq!(bytes, &[0; 24]);
// This unconditionally panics, but I want a compile error
let y = ArrayVec::<u32, 3> {
data: unsafe { MaybeUninit::zeroed().assume_init() },
len: 0
};
let _ = y.byte_repr();
}
The tricky bit here is asserting no padding in byte_repr. As written, this checks the object size against the sum of the sizes of its fields at runtime. I would like to make that assert const to get a compile error, but that wouldn't work because it depends on the generic types. So, is there a way to emit a compile error (potentially from a proc_macro) if a struct contains padding between its fields?

I suggest starting with bytemuck::NoUninit. This is a derivable trait which guarantees that the type has no uninitialized bytes of any sort (including padding). After implementing it, you can use bytemuck::bytes_of() to get the &[u8] you want to work with.
This cannot simply be derived for your ArrayVec, since you are explicitly using MaybeUninit, but you can add a T: NoUninit bound to ArrayVec and blanket-implement your ByteValued for all NoUninit types. That both checks the condition on T that you care about and reduces the number of impls you need to write.
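
Separately from bytemuck, the size comparison from the question can itself be turned into a build-time check by putting the assert into an associated const. The following is a rough sketch using only core; `NoPaddingCheck`, `Pair` and `checked_size` are hypothetical names, and the error only appears when the const is actually used with concrete types in a full build.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

/// Hypothetical helper for a struct `S` whose two fields have types `A` and `B`.
pub struct NoPaddingCheck<S, A, B>(PhantomData<(S, A, B)>);

impl<S, A, B> NoPaddingCheck<S, A, B> {
    /// Evaluated once per concrete (S, A, B); a failing assert is a compile error.
    pub const OK: () = assert!(
        size_of::<S>() == size_of::<A>() + size_of::<B>(),
        "Must not contain padding"
    );
}

/// Illustrative two-field struct with a defined layout.
#[repr(C)]
pub struct Pair<A, B> {
    pub first: A,
    pub second: B,
}

/// Returns the size of `Pair<A, B>`, but only builds if the pair has no padding.
pub fn checked_size<A, B>() -> usize {
    // Referencing the const forces its evaluation for these type arguments.
    let () = NoPaddingCheck::<Pair<A, B>, A, B>::OK;
    size_of::<Pair<A, B>>()
}
```

With this sketch, `checked_size::<u32, u32>()` builds, while `checked_size::<u32, u8>()` should be rejected, because `Pair<u32, u8>` occupies 8 bytes while its fields add up to 5.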

How can I get the size of a C object in Rust

bindgen has nicely given me
extern "C" {
pub fn Hacl_Bignum4096_new_bn_from_bytes_be(len: u32, b: *mut u8) -> *mut u64;
}
returning something of type *mut u64. Unfortunately, there is no reliable way (that I have found) to determine how many u64s are allocated. This makes it very hard (for me) to extract the data pointed to into something I can safely persist in a Rust struct instance.
As a consequence, any time I want to use any function from the Hacl library I have to perform that conversion and free up the created pointers in an unsafe block.
impl Bignum {
/// Returns true if self < other
pub fn lt(&self, other: &Bignum) -> Result<bool, Error> {
let hacl_result: HaclBnWord;
unsafe {
let a = self.get_hacl_bn()?;
let b = other.get_hacl_bn()?;
hacl_result = Hacl_Bignum4096_lt_mask(a, b);
free_hacl_bn(a);
free_hacl_bn(b);
}
Ok(hacl_result != 0 as HaclBnWord)
}
}
unsafe fn get_hacl_bn(&self) is suitably defined and calls Hacl_Bignum4096_new_bn_from_bytes_be() appropriately. And unsafe fn free_hacl_bn(bn: HaclBnType) also lives in this module.
I haven't benchmarked anything yet, but having to perform the conversion to a Hacl_Bignum from bytes each and every time feels wasteful.
So is there a way to determine the size of what is pointed to or is there a way to copy the data out of it into something safe?

You write that "having to perform the conversion to a Hacl_Bignum from bytes each and every time feels wasteful". It seems like you are not letting the library do its job. Rather than keeping a copy of the bignum data in your Rust struct Bignum, keep only the pointer you get from the library. Something like:
extern "C" {
pub fn Hacl_Bignum4096_new_bn_from_bytes_be(len: u32, b: *mut u8) -> *mut u64;
pub fn Hacl_Bignum4096_lt_mask(a: *mut u64, b: *mut u64) -> u64;
}
struct Bignum {
handle: *mut u64,
}
struct BignumError {}
impl Bignum {
pub fn new(bytes: &mut [u8]) -> Result<Self, BignumError> {
unsafe {
let handle =
Hacl_Bignum4096_new_bn_from_bytes_be(bytes.len() as u32, bytes.as_mut_ptr());
if handle.is_null() {
Err(BignumError {})
} else {
Ok(Self { handle })
}
}
}
/// Returns true if self < other
pub fn lt(&self, other: &Bignum) -> bool {
unsafe { Hacl_Bignum4096_lt_mask(self.handle, other.handle) == u64::MAX }
}
}
PS. I used the comments in this file, which seems to be the library in question.
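
Once the struct owns the pointer, it also becomes responsible for releasing it, which is a natural fit for Drop. The sketch below shows that ownership pattern only. The three `mock_*` functions are hypothetical stand-ins written in Rust so the example runs without linking the C library, and the fixed limb count is an assumption made for the mock, not a statement about the real allocation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts live mock allocations so the Drop behaviour can be observed.
pub static LIVE_HANDLES: AtomicUsize = AtomicUsize::new(0);
/// Assumed limb count for the mock only.
const MOCK_LIMBS: usize = 64;

/// Stand-in for the library constructor: big-endian bytes in, heap pointer out.
fn mock_new_bn(bytes: &[u8]) -> *mut u64 {
    if bytes.is_empty() || bytes.len() > MOCK_LIMBS * 8 {
        return std::ptr::null_mut();
    }
    let mut limbs = Box::new([0u64; MOCK_LIMBS]);
    for (i, b) in bytes.iter().rev().enumerate() {
        limbs[i / 8] |= (*b as u64) << (8 * (i % 8));
    }
    LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(limbs) as *mut u64
}

/// Stand-in for the library's deallocation routine.
unsafe fn mock_free_bn(p: *mut u64) {
    unsafe { drop(Box::from_raw(p as *mut [u64; MOCK_LIMBS])) };
    LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
}

/// Stand-in for the comparison: all ones if a < b, zero otherwise.
unsafe fn mock_lt_mask(a: *const u64, b: *const u64) -> u64 {
    let a = unsafe { std::slice::from_raw_parts(a, MOCK_LIMBS) };
    let b = unsafe { std::slice::from_raw_parts(b, MOCK_LIMBS) };
    // Compare starting from the most significant limb.
    for i in (0..MOCK_LIMBS).rev() {
        if a[i] != b[i] {
            return if a[i] < b[i] { u64::MAX } else { 0 };
        }
    }
    0
}

/// Owns the handle for its whole lifetime and frees it exactly once.
pub struct Bignum {
    handle: *mut u64,
}

impl Bignum {
    pub fn new(bytes: &[u8]) -> Option<Self> {
        let handle = mock_new_bn(bytes);
        if handle.is_null() { None } else { Some(Self { handle }) }
    }

    /// Returns true if self < other.
    pub fn lt(&self, other: &Bignum) -> bool {
        unsafe { mock_lt_mask(self.handle, other.handle) == u64::MAX }
    }
}

impl Drop for Bignum {
    fn drop(&mut self) {
        // The handle is non-null by construction and has not been freed yet.
        unsafe { mock_free_bn(self.handle) }
    }
}
```

With the real bindings, the body of drop would call whatever release function the library documents for these pointers, and the unsafe code stays confined to the wrapper.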

How to declare a generic type that is a raw pointer?

Given a struct that wraps a pointer,
pub struct Ptr<T> {
ptr: T
}
Is it possible to declare that T must be a raw pointer type, e.g. *mut SomeStruct or *const SomeStruct?
Without this, I'm unable to perform operations like &*self.ptr within a method, since Rust doesn't know ptr can be treated like a pointer.
Note that this can be made to work:
pub struct Ptr<T> {
ptr: *mut T
}
But in that case, it hard-codes *mut, where we might want *const in other cases.
See: this answer to give some context.

I'm not convinced this is worth doing, but if you're sure, then you can just write a trait:
pub trait RawPtr: Sized {
type Value;
fn as_const(self) -> *const Self::Value {
self.as_mut() as *const _
}
fn as_mut(self) -> *mut Self::Value {
self.as_const() as *mut _
}
}
impl<T> RawPtr for *const T {
type Value = T;
fn as_const(self) -> Self { self }
}
impl<T> RawPtr for *mut T {
type Value = T;
fn as_mut(self) -> Self { self }
}
You can then require P: RawPtr when implementing functions:
pub struct Ptr<P> {
ptr: P
}
impl<P: RawPtr> Ptr<P> {
unsafe fn get(self) -> P::Value
where P::Value: Copy
{
*self.ptr.as_const()
}
}
Additionally, it's possible to define methods that are only available when P is a mutable pointer:
impl<T> Ptr<*mut T> {
unsafe fn get_mut(&mut self) -> *mut T {
self.ptr
}
}
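
To see the pieces working together, here is a small self-contained sketch that repeats the trait and reads through both pointer kinds. The `Ptr::new` constructor and the `read_both` function are hypothetical additions for the demonstration.

```rust
pub trait RawPtr: Sized {
    type Value;
    // Each impl overrides one of these; the other falls back to a cast.
    fn as_const(self) -> *const Self::Value {
        self.as_mut() as *const _
    }
    fn as_mut(self) -> *mut Self::Value {
        self.as_const() as *mut _
    }
}

impl<T> RawPtr for *const T {
    type Value = T;
    fn as_const(self) -> Self { self }
}

impl<T> RawPtr for *mut T {
    type Value = T;
    fn as_mut(self) -> Self { self }
}

pub struct Ptr<P> {
    ptr: P,
}

impl<P: RawPtr> Ptr<P> {
    /// Hypothetical constructor, added for this example.
    pub fn new(ptr: P) -> Self {
        Ptr { ptr }
    }

    /// Safety: the wrapped pointer must be valid for reads.
    pub unsafe fn get(self) -> P::Value
    where
        P::Value: Copy,
    {
        unsafe { *self.ptr.as_const() }
    }
}

/// Reads one value through a *const wrapper and one through a *mut wrapper.
pub fn read_both() -> (i32, i32) {
    let mut value = 10;
    let from_const = unsafe { Ptr::new(&value as *const i32).get() };
    let raw_mut: *mut i32 = &mut value;
    unsafe { *raw_mut += 1 };
    let from_mut = unsafe { Ptr::new(raw_mut).get() };
    (from_const, from_mut)
}
```

The same `get` method serves both `Ptr<*const i32>` and `Ptr<*mut i32>`, which is the point of bounding on the trait instead of hard-coding one pointer kind.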

error: `line` does not live long enough (but I know it does)

I am trying to write some kind of FFI binding to a library written in C, but got stuck. Here is a test case:
extern crate libc;
use libc::{c_void, size_t};
// this is C library api call
unsafe fn some_external_proc(_handler: *mut c_void, value: *const c_void,
value_len: size_t) {
println!("received: {:?}" , std::slice::from_raw_buf(
&(value as *const u8), value_len as usize));
}
// this is Rust wrapper for C library api
pub trait MemoryArea {
fn get_memory_area(&self) -> (*const u8, usize);
}
impl MemoryArea for u64 {
fn get_memory_area(&self) -> (*const u8, usize) {
(unsafe { std::mem::transmute(self) }, std::mem::size_of_val(self))
}
}
impl <'a> MemoryArea for &'a str {
fn get_memory_area(&self) -> (*const u8, usize) {
let bytes = self.as_bytes();
(bytes.as_ptr(), bytes.len())
}
}
#[allow(missing_copy_implementations)]
pub struct Handler<T> {
obj: *mut c_void,
}
impl <T> Handler<T> {
pub fn new() -> Handler<T> { Handler{obj: std::ptr::null_mut(),} }
pub fn invoke_external_proc(&mut self, value: T) where T: MemoryArea {
let (area, area_len) = value.get_memory_area();
unsafe {
some_external_proc(self.obj, area as *const c_void,
area_len as size_t)
};
}
}
// this is Rust wrapper user code
fn main() {
let mut handler_u64 = Handler::new();
let mut handler_str = Handler::new();
handler_u64.invoke_external_proc(1u64); // OK
handler_str.invoke_external_proc("Hello"); // also OK
loop {
match std::io::stdin().read_line() {
Ok(line) => {
let key =
line.trim_right_matches(|&: c: char| c.is_whitespace());
//// error: `line` does not live long enough
// handler_str.invoke_external_proc(key)
}
Err(std::io::IoError { kind: std::io::EndOfFile, .. }) => break ,
Err(error) => panic!("io error: {}" , error),
}
}
}
I get the "`line` does not live long enough" error if I uncomment the line inside the loop. I realize that Rust is afraid I could store a short-lived reference to a slice somewhere inside the Handler object, but I am quite sure that I won't, and I also know that it is safe to pass pointers to the external proc (the memory is immediately copied on the C library side).
Is there any way for me to bypass this check?

The problem is that you are parameterizing your struct when you really want to parameterize the function. When you create your current Handler, the struct is specialized with a type that includes a lifetime. However, line only lives for the block, so there can be no lifetime for Handler that lasts across multiple loop iterations.
What you want is for the lifetime to be tied to the function call, not to the life of the struct. As you noted, if you put the lifetime on the struct, then the struct is able to store references with that lifetime. You don't need that, so put the generic type on the function instead (and drop it from the struct):
impl Handler {
pub fn new() -> Handler { Handler{obj: std::ptr::null_mut(),} }
pub fn invoke_external_proc<T>(&mut self, value: T) where T: MemoryArea {
let (area, area_len) = value.get_memory_area();
unsafe {
some_external_proc(self.obj, area as *const c_void,
area_len as size_t)
};
}
}
Amended answer
Since you want to specialize the struct on a type, but don't care too much about the lifetime of the type, let's try this:
#[allow(missing_copy_implementations)]
pub struct Handler<T: ?Sized> {
obj: *mut c_void,
}
impl<T: ?Sized> Handler<T> {
pub fn new() -> Handler<T> { Handler{ obj: std::ptr::null_mut() } }
pub fn invoke_external_proc(&mut self, value: &T) where T: MemoryArea {
let (area, area_len) = value.get_memory_area();
unsafe {
some_external_proc(self.obj, area as *const c_void,
area_len as size_t)
};
}
}
Here, we allow the type to be unsized. Since you can't pass an unsized value as a parameter, we now have to take a reference instead. We also have to change the impl:
impl MemoryArea for str {
fn get_memory_area(&self) -> (*const u8, usize) {
let bytes = self.as_bytes();
(bytes.as_ptr(), bytes.len())
}
}
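
The code in this thread predates Rust 1.0, and current compilers reject a struct with an unused type parameter. A rough sketch of the amended design for current Rust might look like the following. The PhantomData field is my addition, and `some_external_proc` is a stand-in that returns the bytes it copied so the effect can be observed; the real C call would not do that.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

pub trait MemoryArea {
    fn get_memory_area(&self) -> (*const u8, usize);
}

impl MemoryArea for u64 {
    fn get_memory_area(&self) -> (*const u8, usize) {
        (self as *const u64 as *const u8, std::mem::size_of::<u64>())
    }
}

impl MemoryArea for str {
    fn get_memory_area(&self) -> (*const u8, usize) {
        (self.as_ptr(), self.len())
    }
}

/// Stand-in for the C API call: copies the bytes immediately.
unsafe fn some_external_proc(
    _handler: *mut c_void,
    value: *const c_void,
    value_len: usize,
) -> Vec<u8> {
    unsafe { std::slice::from_raw_parts(value as *const u8, value_len).to_vec() }
}

pub struct Handler<T: ?Sized> {
    obj: *mut c_void,
    // Records T without owning or borrowing a T value.
    _marker: PhantomData<fn(&T)>,
}

impl<T: ?Sized + MemoryArea> Handler<T> {
    pub fn new() -> Self {
        Handler { obj: std::ptr::null_mut(), _marker: PhantomData }
    }

    /// The borrow of `value` ends when this call returns.
    pub fn invoke_external_proc(&mut self, value: &T) -> Vec<u8> {
        let (area, area_len) = value.get_memory_area();
        unsafe { some_external_proc(self.obj, area as *const c_void, area_len) }
    }
}
```

A `Handler<str>` built this way can be reused across loop iterations with a freshly read and trimmed line each time, since the handler's type no longer mentions the lifetime of any particular string.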

How would I create a handle manager in Rust?

pub struct Storage<T>{
vec: Vec<T>
}
impl<T: Clone> Storage<T>{
pub fn new() -> Storage<T>{
Storage{vec: Vec::new()}
}
pub fn get<'r>(&'r self, h: &Handle<T>)-> &'r T{
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<T>, t: T){
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<T>{
self.vec.push(t);
Handle{id: self.vec.len()-1}
}
}
struct Handle<T>{
id: uint
}
I am currently trying to create a handle system in Rust and I have some problems. The code above is a simple example of what I want to achieve.
The code works but has one weakness.
let mut s1 = Storage::<uint>::new();
let mut s2 = Storage::<uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1); // works
s2.get(&handle1); // unsafe: handle1 belongs to s1
I would like to associate a handle with a specific storage like this
//Pseudo code
struct Handle<T>{
id: uint,
storage: &Storage<T>
}
impl<T> Handle<T>{
pub fn get(&self) -> &T;
}
The problem is that Rust doesn't allow this. If I did that and created a handle holding a reference to a Storage, I wouldn't be allowed to mutate the Storage anymore.
I could implement something similar with a channel but then I would have to clone T every time.
How would I express this in Rust?

The simplest way to model this is to use a phantom type parameter on Storage which acts as a unique ID, like so:
use std::kinds::marker;
pub struct Storage<Id, T> {
marker: marker::InvariantType<Id>,
vec: Vec<T>
}
impl<Id, T> Storage<Id, T> {
pub fn new() -> Storage<Id, T>{
Storage {
marker: marker::InvariantType,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<Id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<Id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<Id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantType,
id: self.vec.len() - 1
}
}
}
pub struct Handle<Id, T> {
id: uint,
marker: marker::InvariantType<Id>
}
fn main() {
struct A; struct B;
let mut s1 = Storage::<A, uint>::new();
let s2 = Storage::<B, uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
s2.get(&handle1); // won't compile, since A != B
}
This solves your problem in the simplest case, but has some downsides. Mainly, it depends on the user to define and use all of these different phantom types and to ensure that they are unique. It doesn't prevent bad behavior on the user's part, such as using the same phantom type for multiple Storage instances. In today's Rust, however, this is the best we can do.
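
The std::kinds::marker types used here predate Rust 1.0. In current Rust the same phantom-type idea is usually written with std::marker::PhantomData. A minimal sketch, where the `fn(..) -> ..` shape inside PhantomData is my choice for keeping the parameters invariant without owning them:

```rust
use std::marker::PhantomData;

pub struct Storage<Id, T> {
    vec: Vec<T>,
    // Tags the storage with Id without storing a value of that type.
    _id: PhantomData<fn(Id) -> Id>,
}

pub struct Handle<Id, T> {
    id: usize,
    _marker: PhantomData<fn(Id, T) -> (Id, T)>,
}

impl<Id, T> Storage<Id, T> {
    pub fn new() -> Self {
        Storage { vec: Vec::new(), _id: PhantomData }
    }

    pub fn get(&self, h: &Handle<Id, T>) -> &T {
        &self.vec[h.id]
    }

    pub fn set(&mut self, h: &Handle<Id, T>, t: T) {
        self.vec[h.id] = t;
    }

    pub fn create(&mut self, t: T) -> Handle<Id, T> {
        self.vec.push(t);
        Handle { id: self.vec.len() - 1, _marker: PhantomData }
    }
}

/// Tag types; one per Storage instance, by convention only.
pub struct A;
pub struct B;
```

As before, a handle created by a `Storage<A, u32>` has type `Handle<A, u32>` and is not accepted by a `Storage<B, u32>`, but nothing stops two storages from sharing the same tag.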
An alternative solution, which doesn't work today for reasons I'll get into later but might work in the future, uses lifetimes as anonymous ID types. This code uses the InvariantLifetime marker, which removes all subtyping relationships with other lifetimes for the lifetime it uses.
Here is the same system, rewritten to use InvariantLifetime instead of InvariantType:
use std::kinds::marker;
pub struct Storage<'id, T> {
marker: marker::InvariantLifetime<'id>,
vec: Vec<T>
}
impl<'id, T> Storage<'id, T> {
pub fn new() -> Storage<'id, T>{
Storage {
marker: marker::InvariantLifetime,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<'id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<'id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<'id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<'id, T> {
id: uint,
marker: marker::InvariantLifetime<'id>
}
fn main() {
let mut s1 = Storage::<uint>::new();
let s2 = Storage::<uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
// In theory this won't compile, since the lifetime of s2
// is *slightly* shorter than the lifetime of s1.
//
// However, this is not how the compiler works, and as of today
// s2 gets the same lifetime as s1 (since they can be borrowed for the same period)
// and this (unfortunately) compiles without error.
s2.get(&handle1);
}
In a hypothetical future, the assignment of lifetimes may change and we may grow a better mechanism for this sort of tagging. However, for now, the best way to accomplish this is with phantom types.
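
For readers on current Rust, a variant of the lifetime idea is commonly made to work by handing each Storage to a closure that must accept any brand lifetime, so the compiler treats every call as having its own. The following is only a sketch of that pattern under my own naming (`Brand`, `with_storage`, `demo`); it is not the mechanism the answer speculates about, and `Handle` drops its T parameter for brevity.

```rust
use std::marker::PhantomData;

/// Invariant in 'id, so two different brands never unify.
type Brand<'id> = PhantomData<fn(&'id ()) -> &'id ()>;

pub struct Storage<'id, T> {
    vec: Vec<T>,
    _brand: Brand<'id>,
}

pub struct Handle<'id> {
    id: usize,
    _brand: Brand<'id>,
}

impl<'id, T> Storage<'id, T> {
    pub fn get(&self, h: &Handle<'id>) -> &T {
        &self.vec[h.id]
    }

    pub fn create(&mut self, t: T) -> Handle<'id> {
        self.vec.push(t);
        Handle { id: self.vec.len() - 1, _brand: PhantomData }
    }
}

/// The closure must work for every 'id, so it cannot assume that its
/// brand equals the brand of any other storage.
pub fn with_storage<T, R, F>(f: F) -> R
where
    F: for<'id> FnOnce(Storage<'id, T>) -> R,
{
    f(Storage { vec: Vec::new(), _brand: PhantomData })
}

pub fn demo() -> u32 {
    with_storage::<u32, _, _>(|mut s1| {
        with_storage::<u32, _, _>(|mut s2| {
            let h1 = s1.create(5);
            let h2 = s2.create(7);
            // s2.get(&h1) is expected to be rejected here: the brands differ.
            *s1.get(&h1) + *s2.get(&h2)
        })
    })
}
```

The cost of this design is that storages and handles only exist inside the closure, so it suits scoped use better than long-lived global managers.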

Resources