Correct way of removing trait objects from container

Is it correct to remove trait objects by using a trait method that casts the object's address to usize, as in this example? Or is there a safer or more idiomatic way?
trait IObserver {
    fn update(&mut self, message: &str);
    fn addr(&self) -> usize;
}
impl IObserver for Observer {
    fn addr(&self) -> usize {
        self as *const Self as usize
    }
    // ...
}
struct Subject {
    observers: Vec<Rc<RefCell<dyn IObserver>>>
}
impl Subject {
    fn unsubscribe(&mut self, observer: &dyn IObserver) {
        let pos = self.observers.iter().position(|x| {
            x.borrow().addr() == observer.addr()
        });
        if let Some(p) = pos {
            self.observers.remove(p);
        }
    }
    // ...
}

According to this comment, it is important to cast to *const () before comparing raw pointers, so that only the data addresses are compared and the vtable pointer is discarded:
https://github.com/rust-lang/rust/issues/46139#issuecomment-346971153
So the correct code should be:
fn unsubscribe(&mut self, observer: &dyn IObserver) {
    self.observers.retain(|x| {
        // Cast both fat pointers to thin pointers so only the data address is compared.
        let p1 = x.as_ptr() as *const dyn IObserver as *const ();
        let p2 = observer as *const dyn IObserver as *const ();
        !ptr::eq(p1, p2)
    });
}
ptr::eq doesn't perform any cast, so it can compare fat pointers directly, but the result can be unexpected in that case:
Comparing trait object pointers also compares vtable addresses, which are not guaranteed to be unique and can vary between code generation units (https://rust-lang.github.io/rust-clippy/master/index.html#vtable_address_comparisons)
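For illustration, here is a minimal self-contained sketch of the same idea. It is not the code from the question: it assumes the caller still holds the Rc handle it subscribed with, and the Observer type, the Shared alias, and the subscribe and notify methods are hypothetical additions. Only the data addresses are compared, since the cast to *const () discards the vtable half of the fat pointer.

```rust
use std::cell::RefCell;
use std::rc::Rc;

trait IObserver {
    fn update(&mut self, message: &str);
}

// Hypothetical observer that records every message it receives.
struct Observer {
    log: Vec<String>,
}

impl IObserver for Observer {
    fn update(&mut self, message: &str) {
        self.log.push(message.to_string());
    }
}

// Hypothetical alias for the shared handle type used by Subject.
type Shared = Rc<RefCell<dyn IObserver>>;

struct Subject {
    observers: Vec<Shared>,
}

impl Subject {
    fn subscribe(&mut self, observer: Shared) {
        self.observers.push(observer);
    }

    fn unsubscribe(&mut self, observer: &Shared) {
        // Casting to *const () drops the vtable half of the fat pointer,
        // so only the allocation addresses are compared.
        let target = Rc::as_ptr(observer) as *const ();
        self.observers
            .retain(|x| Rc::as_ptr(x) as *const () != target);
    }

    fn notify(&self, message: &str) {
        for o in &self.observers {
            o.borrow_mut().update(message);
        }
    }
}
```

Taking the Rc handle instead of &dyn IObserver also avoids having to borrow the RefCell just to compute an address, at the cost of requiring callers to keep their handle around.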

Related

Shared ownership inside a struct (mutable_borrow_reservation_conflict warning)

The following code compiles and runs, but emits a mutable_borrow_reservation_conflict warning.
My goal is to have a field all_ops that owns a set of (read-only) Op implementations, where each op can be referenced from another container in the same struct (and when the main all_ops container is cleared, access through used_ops becomes illegal, as expected).
Of course, one could use Rc, but it causes performance issues.
Do you have an idea for doing this properly (i.e. in a way that will not become a hard error in the (near?) future)?
trait Op {
    fn f(&self);
}
struct OpA;
impl Op for OpA {
    fn f(&self) {
        println!("OpA");
    }
}
struct OpB;
impl Op for OpB {
    fn f(&self) {
        println!("OpB");
    }
}
struct Container<'a> {
    all_ops: Vec<Box<dyn Op>>,
    used_ops: Vec<&'a Box<dyn Op>>, // data pointing to data in all_ops field
}
fn main() {
    let v: Vec<Box<dyn Op>> = vec![Box::new(OpA), Box::new(OpB)];
    let mut c = Container { all_ops: v, used_ops: Vec::new() };
    c.used_ops.push(&c.all_ops.get(0).unwrap());
    c.used_ops.push(&c.all_ops.get(1).unwrap());
    c.used_ops.push(&c.all_ops.get(0).unwrap());
    for op in c.used_ops {
        op.f();
    }
    c.all_ops.clear();
    // c.used_ops.first().unwrap().f(); // cannot borrow `c.all_ops` as mutable because it is also borrowed as immutable
}

If I replace used_ops: Vec<&'a Box<dyn Op>>
with used_ops: Vec<&'a dyn Op>, that seems sufficient to fix the warning.
Unfortunately, Container is then not movable, even though all references in used_ops point to heap-allocated objects (and I understand why, since there are references to (inner parts of) the object).
trait Op {
    fn f(&self);
}
struct OpA;
impl Op for OpA {
    fn f(&self) {
        println!("OpA");
    }
}
struct OpB;
impl Op for OpB {
    fn f(&self) {
        println!("OpB");
    }
}
struct Container<'a> {
    all_ops: Vec<Box<dyn Op>>,
    used_ops: Vec<&'a dyn Op>, // data pointing to data in all_ops field
}
fn main() {
    let v: Vec<Box<dyn Op>> = vec![Box::new(OpA), Box::new(OpB)];
    let mut c = Container { all_ops: v, used_ops: Vec::new() };
    c.used_ops.push(c.all_ops.get(0).unwrap().as_ref());
    c.used_ops.push(c.all_ops.get(1).unwrap().as_ref());
    c.used_ops.push(c.all_ops.get(0).unwrap().as_ref());
    for op in c.used_ops.iter() {
        op.f();
    }
    // let c2 = c; // cannot move out of `c` because it is borrowed
}
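One possible workaround, which is not taken from the original post, is to store indices into all_ops instead of references. The sketch below assumes that lookup by index is acceptable. The run_used method is hypothetical, and f is changed to return a name rather than print it, purely so the result is easy to check.

```rust
trait Op {
    // Returns a name instead of printing (an assumption made for this sketch).
    fn f(&self) -> &'static str;
}

struct OpA;
impl Op for OpA {
    fn f(&self) -> &'static str {
        "OpA"
    }
}

struct OpB;
impl Op for OpB {
    fn f(&self) -> &'static str {
        "OpB"
    }
}

// No lifetime parameter: used_ops holds indices into all_ops, not references.
struct Container {
    all_ops: Vec<Box<dyn Op>>,
    used_ops: Vec<usize>,
}

impl Container {
    // Hypothetical helper: runs every used op and collects the results.
    fn run_used(&self) -> Vec<&'static str> {
        self.used_ops
            .iter()
            .filter_map(|&i| self.all_ops.get(i)) // stale indices are skipped
            .map(|op| op.f())
            .collect()
    }
}
```

With this layout the container can be moved freely. The trade-off is that a stale index after all_ops.clear() is only detected at runtime (here it is silently skipped), not rejected by the borrow checker.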

How do I handle an FFI unsized type that could be owned or borrowed?

c_strange_t is an opaque C type that is only seen behind a pointer. When wrapping this type, there are times when it is our responsibility to free the memory using c_free_strange_t(*c_strange_t), and other times when we are not responsible for freeing the data and are only responsible for accurately controlling the lifetime.
It would be ergonomic if this type could be mapped to two Rust types that work in a similar way to str and String, where there is impl Deref<Target=str> for String. The borrowed type would need to be marked as only valid behind a reference.
Is this possible, and how would it be done?

This appears to work, but it does require a small amount of unsafe code, so you should test it with the usual tools such as Miri and Valgrind. The primary assumption made here [1] is that c_void cannot be constructed normally. #[repr(transparent)] is used to ensure that the FooBorrowed newtype has the same memory layout as a c_void. Everything should end up as "just a pointer":
use std::{ffi::c_void, mem, ops::Deref};
#[repr(transparent)]
struct FooBorrowed(c_void);
struct FooOwned(*mut c_void);
// Stand-in for the C constructor.
fn fake_foo_new(v: u8) -> *mut c_void {
    println!("C new called");
    Box::into_raw(Box::new(v)) as *mut c_void
}
// Stand-in for the C destructor.
fn fake_foo_free(p: *mut c_void) {
    println!("C free called");
    let p = p as *mut u8;
    if !p.is_null() {
        // Rebuild the Box and drop it explicitly to free the allocation.
        unsafe { drop(Box::from_raw(p)) };
    }
}
// Stand-in for a C accessor; returns 255 for a null pointer.
fn fake_foo_value(p: *const c_void) -> u8 {
    println!("C value called");
    let p = p as *const u8;
    unsafe {
        p.as_ref().map_or(255, |p| *p)
    }
}
impl FooBorrowed {
    fn value(&self) -> u8 {
        fake_foo_value(&self.0)
    }
}
impl FooOwned {
    fn new(v: u8) -> FooOwned {
        FooOwned(fake_foo_new(v))
    }
}
impl Deref for FooOwned {
    type Target = FooBorrowed;
    fn deref(&self) -> &Self::Target {
        // Reinterpret the raw pointer as a reference tied to the lifetime of &self.
        unsafe { mem::transmute(self.0) }
    }
}
impl Drop for FooOwned {
    fn drop(&mut self) {
        fake_foo_free(self.0)
    }
}
fn use_it(foo: &FooBorrowed) {
    println!("{}", foo.value())
}
fn main() {
    let f = FooOwned::new(42);
    use_it(&f);
}
If the C library actually hands you a borrowed pointer, you need a little more unsafe code:
fn fake_foo_borrowed() -> *const c_void {
    println!("C borrow called");
    static VALUE_OWNED_ELSEWHERE: u8 = 99;
    &VALUE_OWNED_ELSEWHERE as *const u8 as *const c_void
}
impl FooBorrowed {
    unsafe fn new<'a>(p: *const c_void) -> &'a FooBorrowed {
        mem::transmute(p)
    }
}
fn main() {
    let f2 = unsafe { FooBorrowed::new(fake_foo_borrowed()) };
    use_it(f2);
}
As you identified, FooBorrowed::new returns a reference with an unrestricted lifetime; this is pretty dangerous. In many cases, you can construct a smaller scope and use something that provides a lifetime:
impl FooBorrowed {
    unsafe fn new<'a>(p: &'a *const c_void) -> &'a FooBorrowed {
        mem::transmute(*p)
    }
}
fn main() {
    let p = fake_foo_borrowed();
    let f2 = unsafe { FooBorrowed::new(&p) };
    use_it(f2);
}
This prevents you from using the reference beyond the point where the pointer variable is valid. That is not guaranteed to be the true lifetime, but it is "close enough" in many cases. It is better for the lifetime to be too short than too long!
[1] In future versions of Rust, once extern types are available, you should use them to create a guaranteed opaque type:
extern "C" {
    type my_opaque_t;
}
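A different design, sketched here only as a hedged alternative to the answer above, avoids creating Rust references to the opaque data at all. The borrowed side becomes a small Copy handle that carries the raw pointer plus a lifetime via PhantomData. The names FooRef, from_ptr and as_foo_ref are hypothetical, and the fake_foo_* functions are stand-ins for the C API, as in the answer.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

// Stand-ins for the C API, mirroring the fakes used in the answer.
fn fake_foo_new(v: u8) -> *mut c_void {
    Box::into_raw(Box::new(v)) as *mut c_void
}

fn fake_foo_free(p: *mut c_void) {
    if !p.is_null() {
        unsafe { drop(Box::from_raw(p as *mut u8)) };
    }
}

fn fake_foo_value(p: *const c_void) -> u8 {
    unsafe { *(p as *const u8) }
}

/// Owning handle: frees the C object when dropped.
struct FooOwned(*mut c_void);

/// Borrowed handle: a raw pointer plus a lifetime; never freed by Rust.
#[derive(Clone, Copy)]
struct FooRef<'a> {
    ptr: *const c_void,
    _lifetime: PhantomData<&'a c_void>,
}

impl<'a> FooRef<'a> {
    /// Safety: `ptr` must point to a live object for all of `'a`.
    unsafe fn from_ptr(ptr: *const c_void) -> FooRef<'a> {
        FooRef { ptr, _lifetime: PhantomData }
    }

    fn value(self) -> u8 {
        fake_foo_value(self.ptr)
    }
}

impl FooOwned {
    fn new(v: u8) -> FooOwned {
        FooOwned(fake_foo_new(v))
    }

    /// The returned handle cannot outlive `self`.
    fn as_foo_ref(&self) -> FooRef<'_> {
        unsafe { FooRef::from_ptr(self.0) }
    }
}

impl Drop for FooOwned {
    fn drop(&mut self) {
        fake_foo_free(self.0)
    }
}
```

This gives up the Deref ergonomics the question asks for (a FooRef<'_> is a value, not a &FooBorrowed), so whether it is preferable depends on how much the str/String-like API matters.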

Is there a way to determine the offsets of each of the trait methods in the VTable?

I thought I could try to build a trait object more or less from scratch, without using impl blocks. To elaborate:
trait SomeTrait {
    fn fn_1(&self);
    fn fn_2(&self, a: i64);
    fn fn_3(&self, a: i64, b: i64);
}
struct TraitObject {
    data: *mut (),
    vtable: *mut (),
}
fn dtor(this: *mut ()) {
    // ...
}
fn imp_1(this: *mut ()) {
    // ...
}
fn imp_2(this: *mut (), a: i64) {
    // ...
}
fn imp_3(this: *mut (), a: i64, b: i64) {
    // ...
}
fn main() {
    let data = &... as *mut (); // something to be the object
    let vtable = [dtor as *mut (),
                  8 as *mut (),
                  8 as *mut (),
                  imp_1 as *mut (),
                  imp_2 as *mut (),
                  imp_3 as *mut ()]; // ignore any errors in typecasting,
                                     // this is not what I am worried about getting right
    let to = TraitObject {
        data: data,
        vtable: vtable.as_ptr() as *mut (),
    };
    // again, ignore any typecast errors,
    let obj: &dyn SomeTrait = unsafe { mem::transmute(to) };
    // ...
    obj.fn_1();
    obj.fn_2(123);
    obj.fn_3(123, 456);
}
From what I understand, the order in which the methods appear in the trait definition is not always the same as the order in which the function pointers appear in the vtable. Is there a way to determine the offsets of each of the trait methods in the vtable?

If you don't mind detecting the layout at runtime, you can read the function addresses at specific vtable offsets and compare them with the addresses of a known dummy implementation to match them up. This assumes that you know how many methods there are in the trait, since you may need to read all of them. Keep in mind that the vtable layout is an unspecified implementation detail of the compiler and may change between versions.
use std::mem;
trait SomeTrait {
    fn fn_1(&self);
    fn fn_2(&self, a: i64);
    fn fn_3(&self, a: i64, b: i64);
}
struct Dummy;
impl SomeTrait for Dummy {
    fn fn_1(&self) { unimplemented!() }
    fn fn_2(&self, _a: i64) { unimplemented!() }
    fn fn_3(&self, _a: i64, _b: i64) { unimplemented!() }
}
// Mirrors the (unspecified) two-pointer representation of a trait object.
#[repr(C)]
struct TraitObject {
    data: *mut (),
    vtable: *mut (),
}
fn main() {
    unsafe {
        let fn_1 = Dummy::fn_1 as *const ();
        let fn_2 = Dummy::fn_2 as *const ();
        let fn_3 = Dummy::fn_3 as *const ();
        let dummy = &mut Dummy as &mut dyn SomeTrait;
        let dummy: TraitObject = mem::transmute(dummy);
        let vtable = dummy.vtable as *const *const ();
        // Slots 0..3 are assumed to hold the destructor, size and alignment.
        let vtable_0 = *vtable.offset(3);
        let vtable_1 = *vtable.offset(4);
        let vtable_2 = *vtable.offset(5);
        // Mapping vtable offsets to methods is left as an exercise to the reader. ;)
        println!("{:p} {:p} {:p}", fn_1, fn_2, fn_3);
        println!("{:p} {:p} {:p}", vtable_0, vtable_1, vtable_2);
    }
}

error: `line` does not live long enough (but I know it does)

I am trying to write some kind of FFI wrapper for a library written in C, but I got stuck. Here is a test case:
extern crate libc;
use libc::{c_void, size_t};
// this is C library api call
unsafe fn some_external_proc(_handler: *mut c_void, value: *const c_void,
                             value_len: size_t) {
    println!("received: {:?}", std::slice::from_raw_buf(
        &(value as *const u8), value_len as usize));
}
// this is Rust wrapper for C library api
pub trait MemoryArea {
    fn get_memory_area(&self) -> (*const u8, usize);
}
impl MemoryArea for u64 {
    fn get_memory_area(&self) -> (*const u8, usize) {
        (unsafe { std::mem::transmute(self) }, std::mem::size_of_val(self))
    }
}
impl<'a> MemoryArea for &'a str {
    fn get_memory_area(&self) -> (*const u8, usize) {
        let bytes = self.as_bytes();
        (bytes.as_ptr(), bytes.len())
    }
}
#[allow(missing_copy_implementations)]
pub struct Handler<T> {
    obj: *mut c_void,
}
impl<T> Handler<T> {
    pub fn new() -> Handler<T> { Handler { obj: std::ptr::null_mut() } }
    pub fn invoke_external_proc(&mut self, value: T) where T: MemoryArea {
        let (area, area_len) = value.get_memory_area();
        unsafe {
            some_external_proc(self.obj, area as *const c_void,
                               area_len as size_t)
        };
    }
}
// this is Rust wrapper user code
fn main() {
    let mut handler_u64 = Handler::new();
    let mut handler_str = Handler::new();
    handler_u64.invoke_external_proc(1u64); // OK
    handler_str.invoke_external_proc("Hello"); // also OK
    loop {
        match std::io::stdin().read_line() {
            Ok(line) => {
                let key =
                    line.trim_right_matches(|&: c: char| c.is_whitespace());
                //// error: `line` does not live long enough
                // handler_str.invoke_external_proc(key)
            }
            Err(std::io::IoError { kind: std::io::EndOfFile, .. }) => break,
            Err(error) => panic!("io error: {}", error),
        }
    }
}
I get the "line does not live long enough" error if I uncomment the line inside the loop. I realize that Rust is afraid that I could store a short-lived reference to a slice somewhere inside the Handler object, but I am quite sure that I won't, and I also know that it is safe to pass pointers to the external proc (the memory is immediately copied on the C library side).
Is there any way for me to bypass this check?

The problem is that you are incorrectly parameterizing your struct, when you really want to do it for the function. When you create your current Handler, the struct will be specialized with a type that includes a lifetime. However, the lifetime of line is only for the block, so there can be no lifetime for Handler that lasts multiple loop iterations.
What you want is for the lifetime to be tied to the function call, not the life of the struct. As you noted, if you put the lifetime on the struct, then the struct is able to store references of that length. You don't need that, so put the generic type on the function instead:
impl Handler {
    pub fn new() -> Handler { Handler { obj: std::ptr::null_mut() } }
    pub fn invoke_external_proc<T>(&mut self, value: T) where T: MemoryArea {
        let (area, area_len) = value.get_memory_area();
        unsafe {
            some_external_proc(self.obj, area as *const c_void,
                               area_len as size_t)
        };
    }
}
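The question's code predates Rust 1.0, so here is a hedged sketch of the same idea in present-day syntax. The some_external_proc function is a stand-in that assumes the C side copies the bytes immediately, and the received field is a hypothetical addition used only to observe what was passed. Because T is chosen per call, a key borrowed from a short-lived line can be passed on every loop iteration.

```rust
use std::slice;

// Stand-in for the C call; assumed to copy the bytes immediately.
unsafe fn some_external_proc(value: *const u8, value_len: usize) -> Vec<u8> {
    unsafe { slice::from_raw_parts(value, value_len).to_vec() }
}

pub trait MemoryArea {
    fn get_memory_area(&self) -> (*const u8, usize);
}

impl MemoryArea for u64 {
    fn get_memory_area(&self) -> (*const u8, usize) {
        (self as *const u64 as *const u8, std::mem::size_of::<u64>())
    }
}

impl<'a> MemoryArea for &'a str {
    fn get_memory_area(&self) -> (*const u8, usize) {
        (self.as_ptr(), self.len())
    }
}

pub struct Handler {
    // Hypothetical: what the fake C side has copied so far.
    received: Vec<Vec<u8>>,
}

impl Handler {
    pub fn new() -> Handler {
        Handler { received: Vec::new() }
    }

    // T is picked per call, so no lifetime from `value` ends up in Handler's type.
    pub fn invoke_external_proc<T: MemoryArea>(&mut self, value: T) {
        let (area, area_len) = value.get_memory_area();
        // The pointer is valid here because `value` is still alive.
        let copy = unsafe { some_external_proc(area, area_len) };
        self.received.push(copy);
    }
}
```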
Amended answer
Since you want to specialize the struct on a type, but don't care too much about the lifetime of the type, let's try this:
#[allow(missing_copy_implementations)]
pub struct Handler<T: ?Sized> {
    obj: *mut c_void,
}
impl<T: ?Sized> Handler<T> {
    pub fn new() -> Handler<T> { Handler { obj: std::ptr::null_mut() } }
    pub fn invoke_external_proc(&mut self, value: &T) where T: MemoryArea {
        let (area, area_len) = value.get_memory_area();
        unsafe {
            some_external_proc(self.obj, area as *const c_void,
                               area_len as size_t)
        };
    }
}
Here, we allow the type to be unsized. Since you can't pass an unsized value as a parameter, we now have to take a reference instead. We also have to change the impl:
impl MemoryArea for str {
    fn get_memory_area(&self) -> (*const u8, usize) {
        let bytes = self.as_bytes();
        (bytes.as_ptr(), bytes.len())
    }
}
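On current Rust, a struct with an unused type parameter is rejected (error E0392), so a present-day version of the amended Handler needs a marker field. The following is a minimal sketch under that assumption: the _marker field is a hypothetical addition, and some_external_proc is a stand-in that just reports how many bytes it was given.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;
use std::ptr;

// Stand-in for the C call; returns the number of bytes it was handed.
unsafe fn some_external_proc(
    _handler: *mut c_void,
    _value: *const c_void,
    value_len: usize,
) -> usize {
    value_len
}

pub trait MemoryArea {
    fn get_memory_area(&self) -> (*const u8, usize);
}

impl MemoryArea for str {
    fn get_memory_area(&self) -> (*const u8, usize) {
        let bytes = self.as_bytes();
        (bytes.as_ptr(), bytes.len())
    }
}

pub struct Handler<T: ?Sized> {
    obj: *mut c_void,
    // Records that Handler "consumes" &T without storing one.
    _marker: PhantomData<fn(&T)>,
}

impl<T: ?Sized + MemoryArea> Handler<T> {
    pub fn new() -> Handler<T> {
        Handler { obj: ptr::null_mut(), _marker: PhantomData }
    }

    pub fn invoke_external_proc(&mut self, value: &T) -> usize {
        let (area, area_len) = value.get_memory_area();
        unsafe { some_external_proc(self.obj, area as *const c_void, area_len) }
    }
}
```

A Handler<str> built this way accepts any &str, regardless of how short-lived the underlying String is, because the struct's type no longer mentions a lifetime.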

generic deserialisation (type-punning) of structs (fighting with the borrow checker)

I am using packed structs and I need to be able to go from raw bytes to structs and vice versa without any decoding/encoding overhead.
I wrote some code that seemed to work:
#[packed]
struct Test {
    data: u64
}
impl Test {
    fn from_byte_slice(bs: &[u8]) -> Option<Test> {
        if bs.len() != std::mem::size_of::<Test>() {
            None
        } else {
            let p: *const u8 = &bs[0];
            let p2: *const Test = p as *const Test;
            unsafe {
                Some(*p2)
            }
        }
    }
}
However, I have a couple of different structs that need to be serialized/deserialized, so I wanted to use a generic function to reduce code duplication.
The following code fails to compile with the error message: "error: cannot move out of dereference of *-pointer"
fn from_byte_slice<T>(bs: &[u8]) -> Option<T> {
    if bs.len() != std::mem::size_of::<T>() {
        None
    } else {
        let p: *const u8 = &bs[0];
        let p2: *const T = p as *const T;
        unsafe {
            Some(*p2)
        }
    }
}
What is weird is that if, instead of returning an Option<T>, I return an Option<&T>, then the code compiles:
fn from_byte_slice<'a, T>(bs: &'a [u8]) -> Option<&'a T> {
    if bs.len() != std::mem::size_of::<T>() {
        None
    } else {
        let p: *const u8 = &bs[0];
        let p2: *const T = p as *const T;
        unsafe {
            Some(&*p2)
        }
    }
}
Am I doing something wrong or have I run into a bug in the borrow checker?

The argument bs: &[u8] is a borrowed slice. A borrow only gives temporary access, so you can't move the data out of it, and *p2 attempts exactly that: it moves ownership of the data out.
You need to clone it:
fn from_byte_slice<T: Clone>(bs: &[u8]) -> Option<T> {
    if bs.len() != std::mem::size_of::<T>() {
        None
    } else {
        let p: *const u8 = &bs[0];
        let p2: *const T = p as *const T;
        unsafe {
            Some((*p2).clone())
        }
    }
}
Using transmute, you can probably make this work with a Vec<u8> instead, if you don't mind moving an owned vector into the function.
The direct impl in the first case works because all fields of Test are Copy, so Test is implicitly copied (instead of needing the explicit clone()).
This will probably change soon: Copy will have to be derived explicitly in the future.
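In current Rust, Copy is indeed opt-in via #[derive(Clone, Copy)], and dereferencing a pointer that is not properly aligned for T is undefined behavior. A present-day variant, sketched here under the assumption that every bit pattern is a valid T (which is why the function is marked unsafe), can use std::ptr::read_unaligned to copy the value out of the byte slice:

```rust
use std::mem;
use std::ptr;

#[repr(C, packed)]
#[derive(Clone, Copy)]
struct Test {
    data: u64,
}

/// Copies a `T` out of `bs` if the length matches exactly.
///
/// Safety: every bit pattern of `size_of::<T>()` bytes must be a valid `T`.
unsafe fn from_byte_slice<T: Copy>(bs: &[u8]) -> Option<T> {
    if bs.len() != mem::size_of::<T>() {
        None
    } else {
        // read_unaligned makes a bitwise copy and does not require `bs`
        // to be aligned for `T`.
        Some(unsafe { ptr::read_unaligned(bs.as_ptr() as *const T) })
    }
}
```

The T: Copy bound guarantees that a bitwise copy is a legitimate way to duplicate the value, so no clone() call is needed.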
