I'm creating a function which returns a Weak reference to a trait object. In situations where the object cannot be found (it's a lookup function), I want to return an empty Weak reference using Weak::new():
use std::rc::{self, Rc, Weak};
use std::cell::RefCell;
pub trait Part {}
pub struct Blah {}
impl Part for Blah {}
fn main() {
let blah = Blah {};
lookup(Rc::new(RefCell::new(blah)));
}
fn lookup(part: Rc<RefCell<Part>>) -> Weak<RefCell<Part>> {
if true {
Rc::downgrade(&part)
} else {
Weak::new()
}
}
This has the following error during compilation:
error[E0277]: the trait bound `Part + 'static: std::marker::Sized` is not satisfied in `std::cell::RefCell<Part + 'static>`
--> <anon>:19:9
|
19 | Weak::new()
| ^^^^^^^^^ within `std::cell::RefCell<Part + 'static>`, the trait `std::marker::Sized` is not implemented for `Part + 'static`
|
= note: `Part + 'static` does not have a constant size known at compile-time
= note: required because it appears within the type `std::cell::RefCell<Part + 'static>`
= note: required by `<std::rc::Weak<T>>::new`
Why is it that I can successfully create a Weak<RefCell<Part>> from Rc::downgrade() but cannot use the same type to create a new Weak reference with Weak::new()?
Is there a way for me to annotate Weak::new() to help the compiler or will I have to wrap this in an Option to let the user know the part wasn't found?
Working minimal example
The type inferred for Weak::new() is Weak<RefCell<Part>>, and the Part part cannot be instantiated because it's a trait!
That's what the Sized error is all about. The trait is not a concrete structure, it has no size known at compile time, so the compiler wouldn't know how much space to allocate.
Why is it that I can successfully create a Weak<RefCell<Part>> from Rc::downgrade()
It is because Rc<RefCell<Part>> points to a structure that is already allocated. The compiler can reference it through a trait pointer even though it doesn't know whether it's a Blah or some other implementation of the Part trait.
Is there a way for me to annotate Weak::new() to help the compiler
You can indeed annotate Weak::new(), pointing the compiler to the implementation of Part that you want instantiated, like this:
use std::rc::{Rc, Weak};
use std::cell::RefCell;
pub trait Part {}
pub struct Blah {}
impl Part for Blah {}
fn main() {
let blah = Blah {};
lookup(Rc::new(RefCell::new(blah)));
}
fn lookup(part: Rc<RefCell<Part>>) -> Weak<RefCell<Part>> {
if true {
Rc::downgrade(&part)
} else {
Weak::<RefCell<Blah>>::new()
}
}
TL;DR: Fat pointers are hard.
And therefore you need to specify the concrete type explicitly before coercion takes place:
Weak::<RefCell<Blah>>::new()
Note: if Blah takes a lot of memory, create a Zero-Sized Type Fool, implement Part for it (all functions unimplemented!()), then use Weak::<RefCell<Fool>>::new() to avoid allocating memory uselessly.
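As a minimal sketch of that note, reusing the types from the example above (Fool, lookup_with_fool, and the found flag are names added here for illustration):
pub struct Fool {}
impl Part for Fool {}
fn lookup_with_fool(part: Rc<RefCell<Part>>, found: bool) -> Weak<RefCell<Part>> {
    if found {
        Rc::downgrade(&part)
    } else {
        // Coerces to Weak<RefCell<Part>> exactly like the Blah version,
        // but Fool occupies zero bytes, so nothing useless is allocated.
        Weak::<RefCell<Fool>>::new()
    }
}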
I believe that the underlying issue is simply an implementation limitation.
It does not seem unfixable, but may require quite some work to cover all corner cases.
First, let's expose the issue.
The implementation of Weak::new:
impl<T> Weak<T> {
pub fn new() -> Weak<T> {
unsafe {
Weak {
ptr: Shared::new(Box::into_raw(box RcBox {
strong: Cell::new(0),
weak: Cell::new(1),
value: uninitialized(),
})),
}
}
}
}
For homogeneity, all Shared elements are wrapping a RcBox, which contains two Cell (the counters) and the actual value.
The mere fact of building an RcBox<T> requires that the size of T be known, which is why, unlike most Weak methods, T is NOT marked as ?Sized in this impl.
Now, since the memory is left uninitialized, it is clear that it will never be used, so actually any size would have been fine.
This is supported by the fact that RcBox can actually carry unsized data, which is necessary to go from RcBox<Struct> to RcBox<Trait>, and therefore the strong and weak fields are always laid out first (only the last field can be unsized).
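For reference, RcBox looks roughly like this (a simplified sketch of the standard library's internal definition, matching the field order described above):
use std::cell::Cell;
struct RcBox<T: ?Sized> {
    strong: Cell<usize>,
    weak: Cell<usize>,
    // The potentially unsized field must come last.
    value: T,
}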
Thus, we would like
Allocate a RcBox<()>, which would save memory AND not require that T be Sized,
Then transmuted to RcBox<T>, whatever T is.
Alright, let's do it!
Our desired implementation will look something like this:
impl<T: ?Sized> Weak<T> {
pub fn new() -> Weak<T> {
unsafe {
Weak {
ptr: Shared::new(transmute(Box::into_raw(box RcBox {
strong: Cell::new(0),
weak: Cell::new(1),
value: (),
}))),
}
}
}
}
which utterly fails to compile.
Why? Because *mut RcBox<()> is a thin pointer, whereas *mut RcBox<T> is either a thin pointer OR a fat pointer (see raw memory representation), depending on whether T is Sized or not.
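A small standalone check of that claim (a sketch; the sizes assume a typical platform where a pointer is one usize, and it also shows the slice case mentioned further down):
use std::mem::size_of;
fn main() {
    // A pointer to a Sized type is thin: one machine word.
    assert_eq!(size_of::<*mut ()>(), size_of::<usize>());
    // A pointer to a trait object is fat: data pointer + vtable pointer.
    assert_eq!(size_of::<*mut dyn std::fmt::Debug>(), 2 * size_of::<usize>());
    // A pointer to a slice is also fat: data pointer + length.
    assert_eq!(size_of::<*mut [u8]>(), 2 * size_of::<usize>());
}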
Now, trait pointers can be handled (warning: contains a simplified and totally unsafe implementation of Rc) with the following implementation of Weak::new:
impl<T: ?Sized> Weak<T> {
pub fn new() -> Weak<T> {
unsafe {
let boxed = Box::into_raw(box RcBox {
strong: Cell::new(0),
weak: Cell::new(1),
value: (),
});
let ptr = if size_of::<*mut ()>() == size_of::<*mut T>() {
let ptr: *mut RcBox<T> = transmute_copy(&boxed);
ptr
} else {
let ptr: *mut RcBox<T> = transmute_copy(&TraitObject {
data: boxed as *mut (),
vtable: null_mut(),
});
ptr
};
Weak { ptr: Shared::new(ptr) }
}
}
}
However this implementation only accounts for trait pointers, and there are other kinds of fat pointers for which it would... probably completely break down.
I want to create some references to a str with Rc, without cloning str:
fn main() {
let s = Rc::<str>::from("foo");
let t = Rc::clone(&s); // Creating a new pointer to the same address is easy
let u = Rc::clone(&s[1..2]); // But how can I create a new pointer to a part of `s`?
let w = Rc::<str>::from(&s[0..2]); // This seems to clone str
assert_ne!(&w as *const _, &s as *const _);
}
playground
How can I do this?
While it's possible in principle, the standard library's Rc does not support the case you're trying to create: a counted reference to a part of reference-counted memory.
However, we can get the effect for strings using a fairly straightforward wrapper around Rc which remembers the substring range:
use std::ops::{Deref, Range};
use std::rc::Rc;
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct RcSubstr {
string: Rc<str>,
span: Range<usize>,
}
impl RcSubstr {
fn new(string: Rc<str>) -> Self {
let span = 0..string.len();
Self { string, span }
}
fn substr(&self, span: Range<usize>) -> Self {
// A full implementation would also have bounds checks to ensure
// the requested range is not larger than the current substring
Self {
string: Rc::clone(&self.string),
span: (self.span.start + span.start)..(self.span.start + span.end)
}
}
}
impl Deref for RcSubstr {
type Target = str;
fn deref(&self) -> &str {
&self.string[self.span.clone()]
}
}
fn main() {
let s = RcSubstr::new(Rc::<str>::from("foo"));
let u = s.substr(1..2);
// We need to deref to print the string rather than the wrapper struct.
// A full implementation would `impl Debug` and `impl Display` to produce
// the expected substring.
println!("{}", &*u);
}
There are a lot of conveniences missing here, such as suitable implementations of Display, Debug, AsRef, Borrow, From, and Into — I've provided only enough code to illustrate how it can work. Once supplemented with the appropriate trait implementations, this should be just as usable as Rc<str> (with the one edge case that it can't be passed to a library type that wants to store Rc<str> in particular).
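For example, the Display implementation alluded to above might look like this (a sketch added here, not part of the original code; it assumes the RcSubstr type defined above is in scope):
use std::fmt;
impl fmt::Display for RcSubstr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the substring itself rather than the wrapper struct.
        fmt::Display::fmt(&**self, f)
    }
}
// If Debug should also show just the substring, the derive above would be
// replaced with a manual impl written the same way.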
The crate arcstr claims to offer a finished version of this basic idea, but I haven't used or studied it and so can't guarantee its quality.
The crate owning_ref provides a way to hold references to parts of an Rc or other smart pointer, but there are concerns about its soundness and I don't fully understand which circumstances that applies to (issue search which currently has 3 open issues).
Given the following code:
trait Function {
fn filter (&self);
}
#[derive(Debug, Copy, Clone)]
struct Kidney {}
impl Function for Kidney {
fn filter (&self) {
println!("filtered");
}
}
fn main() {
let k = Kidney {};
let f: &Function = &k;
//let k1 = (*f); //--> This gives a "size not satisfied" error
(*f).filter(); //--> Works; what exactly happens here?
}
I am not sure why it compiles. I was expecting the last statement to fail. I guess I have overlooked some fundamentals while learning Rust, as I am failing to understand why dereferencing a trait object (which lives behind a pointer) should compile.
Is this issue similar to the following case?
let v = vec![1, 2, 3, 4];
//let s: &[i32] = *v;
println!("{}", (*v)[0]);
*v gives a slice, but a slice is unsized, so again it is not clear to me how this compiles. If I uncomment the second statement I get
| let s:&[i32]= *v;
| ^^
| |
| expected &[i32], found slice
| help: consider borrowing here: `&*v`
|
= note: expected type `&[i32]`
found type `[{integer}]`
Does expected type &[i32] mean "expected a reference of slice"?
Dereferencing a trait object is no problem. In fact, it must be dereferenced at some point, otherwise it would be quite useless.
let k1 = (*f); fails not because of dereferencing but because you try to put the raw trait object on the stack (this is where local variables live). Values on the stack must have a size known at compile time, which is not the case for trait objects because any type could implement the trait.
Here is an example where structs with different sizes implement the trait:
trait Function {
fn filter (&self);
}
#[derive(Debug, Copy, Clone)]
struct Kidney {}
impl Function for Kidney {
fn filter (&self) {
println!("filtered");
}
}
#[derive(Debug, Copy, Clone)]
struct Liver {
size: f32
}
impl Function for Liver {
fn filter (&self) {
println!("filtered too!");
}
}
fn main() {
let k = Kidney {};
let l = Liver {size: 1.0};
let f: &Function;
if true {
f = &k;
} else {
f = &l;
}
// Now what is the size of *f - Kidney (0 bytes) or Liver (4 bytes)?
}
(*f).filter(); works because the temporarily dereferenced object is not put on the stack. In fact, this is the same as f.filter(). Rust automatically applies as many dereferences as required to get to an actual object. This is documented in the book.
What happens in the second case is that Vec implements Deref to slices, so it gets all methods implemented for slices for free. *v gives you the dereferenced slice itself ([i32]), which you then try to assign to a &[i32]; that mismatch between an unsized slice and a reference to a slice is the type error the compiler reports.
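As the compiler's help text suggests, adding a borrow makes the assignment compile (a small sketch):
fn main() {
    let v = vec![1, 2, 3, 4];
    // *v is the unsized slice [i32]; re-borrowing it gives a &[i32], which is
    // a Sized fat pointer and can live in a local variable.
    let s: &[i32] = &*v;
    println!("{}", s[0]);
}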
Judging by the MIR produced by the first piece of code, (*f).filter() is equivalent to f.filter(); it appears that the compiler is aware that since filter is a method on &self, dereferencing it doesn't serve any purpose and is omitted altogether.
The second case, however, is different, because dereferencing the slice introduces bounds-checking code. In my opinion the compiler should also be able to tell that this operation (dereferencing) doesn't introduce any meaningful changes (and/or that there won't be an out-of-bounds error) and treat it as regular slice indexing, but there might be some reason behind this.
I'm new to Rust and have seen some examples of people using Box to allow pushing many types that implement a certain Trait onto a Vec. When using a Trait with Generics, I have run into an issue.
error[E0038]: the trait `collision::collision_detection::Collidable` cannot be made into an object
--> src/collision/collision_detection.rs:19:5
|
19 | collidables: Vec<Box<Collidable<P, M>>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `collision::collision_detection::Collidable` cannot be made into an object
|
= note: method `get_ncollide_shape` has generic type parameters
error: aborting due to previous error
error: Could not compile `game_proto`.
To learn more, run the command again with --verbose.
Here is my code
extern crate ncollide;
extern crate nalgebra as na;
use self::ncollide::shape::Shape;
use self::ncollide::math::Point;
use self::ncollide::math::Isometry;
use self::na::Isometry2;
pub trait Collidable<P: Point, M> {
fn get_ncollide_shape<T: Shape<P, M>>(&self) -> Box<T>;
fn get_isometry(&self) -> Isometry2<f64>;
}
pub struct CollisionRegistry<P, M>
where
P: Point,
M: Isometry<P>,
{
collidables: Vec<Box<Collidable<P, M>>>,
}
impl<P: Point, M: Isometry<P>> CollisionRegistry<P, M> {
pub fn new() -> Self {
let objs: Vec<Box<Collidable<P, M>>> = Vec::new();
CollisionRegistry { collidables: objs }
}
pub fn register<D>(&mut self, obj: Box<D>)
where
D: Collidable<P, M>,
{
self.collidables.push(obj);
}
}
I'm trying to use collidables as a list of heterogenous game objects that will give me ncollide compatible Shapes back to feed into the collision detection engine.
EDIT:
To clear up some confusion. I'm not trying to construct and return an instance of a Trait. I'm just trying to create a Vec that will allow any instance of the Collidable trait to be pushed onto it.
Rust is a compiled language, so when it compiles your code, it needs to know all of the information it might need to generate machine code.
When you say
trait MyTrait {
fn do_thing(&self) -> Box<u32>;
}
struct Foo {
field: Box<MyTrait>
}
you are telling Rust that Foo will contain a box holding anything that implements MyTrait. By boxing the value, the compiler erases any information about the concrete type that isn't covered by the trait. These trait objects are implemented as a pointer to the data plus a table of functions (called a vtable) that contains the functions exposed by the trait, so they can still be called.
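A self-contained sketch of that erasure (A and B are placeholder types, and the trait is restated so the snippet compiles on its own):
trait MyTrait {
    fn do_thing(&self) -> Box<u32>;
}
struct A;
struct B { extra: u64 }
impl MyTrait for A {
    fn do_thing(&self) -> Box<u32> { Box::new(1) }
}
impl MyTrait for B {
    fn do_thing(&self) -> Box<u32> { Box::new(self.extra as u32) }
}
fn main() {
    // Two differently sized concrete types, one vtable-dispatched object type.
    let objects: Vec<Box<dyn MyTrait>> = vec![Box::new(A), Box::new(B { extra: 7 })];
    for o in &objects {
        println!("{}", o.do_thing());
    }
}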
When you change
fn do_thing(&self) -> Box<u32>;
to
fn do_thing<T>(&self) -> Box<T>;
it may look similar, but the behavior is much different. Let's take a normal function example
fn do_thing<T>(val: T) { }
fn main() {
do_thing(true);
do_thing(45 as u32);
}
the compiler performs what is called monomorphization, which means your code essentially becomes
fn do_thing_bool(val: bool) { }
fn do_thing_num(val: u32) { }
fn main() {
do_thing_bool(true);
do_thing_num(45 as u32);
}
The key thing to realize is that you are asking it to do the same thing for your trait. The problem is that the compiler can't do it. The example above relies on knowing ahead of time that do_thing is called with a number in one case and a boolean in another, and it can know with 100% certainty that those are the only two ways the function is used.
With your code
trait MyTrait {
fn do_thing<T>(&self) -> Box<T>;
}
the compiler does not know what types do_thing will be called with, so it has no way to generate the functions you'd need to call. To build a trait object, it would have to know, at every point where you convert the struct implementing Collidable into a boxed object, every possible return type get_ncollide_shape could have, and that is not supported.
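One possible way forward is to return a boxed trait object instead of a caller-chosen generic type, which keeps the trait object-safe. Whether ncollide's Shape trait can be boxed like that is an assumption I haven't verified, so the sketch below uses placeholder names (Shapeish, Collidable2, Circle, Square) rather than ncollide types:
trait Shapeish {
    fn area(&self) -> f64;
}
// Object-safe: the method has no generic type parameters; the implementor,
// not the caller, chooses the concrete shape behind the Box.
trait Collidable2 {
    fn get_shape(&self) -> Box<dyn Shapeish>;
}
struct Circle { r: f64 }
struct Square { side: f64 }
impl Shapeish for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}
impl Shapeish for Square {
    fn area(&self) -> f64 { self.side * self.side }
}
impl Collidable2 for Circle {
    fn get_shape(&self) -> Box<dyn Shapeish> { Box::new(Circle { r: self.r }) }
}
impl Collidable2 for Square {
    fn get_shape(&self) -> Box<dyn Shapeish> { Box::new(Square { side: self.side }) }
}
fn main() {
    // The heterogeneous list the question was after:
    let collidables: Vec<Box<dyn Collidable2>> = vec![
        Box::new(Circle { r: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    for c in &collidables {
        println!("{}", c.get_shape().area());
    }
}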
Other links for this:
Understanding Traits and Object Safety
https://www.reddit.com/r/rust/comments/3an132/how_to_wrap_a_trait_object_that_has_generic/
In the Rustonomicon's guide to PhantomData, there is a part about what happens if a Vec-like struct has *const T field, but no PhantomData<T>:
The drop checker will generously determine that Vec<T> does not own any values of type T. This will in turn make it conclude that it doesn't need to worry about Vec dropping any T's in its destructor for determining drop check soundness. This will in turn allow people to create unsoundness using Vec's destructor.
What does it mean? If I implement Drop for a struct and manually destroy all Ts in it, why should I care if compiler knows that my struct owns some Ts?
The PhantomData<T> within Vec<T> (held indirectly via a Unique<T> within RawVec<T>) communicates to the compiler that the vector may own instances of T, and therefore the vector may run destructors for T when the vector is dropped.
Deep dive: We have a combination of factors here:
We have a Vec<T> which has an impl Drop (i.e. a destructor implementation).
Under the rules of RFC 1238, this would usually imply a relationship between instances of Vec<T> and any lifetimes that occur within T, by requiring that all lifetimes within T strictly outlive the vector.
However, the destructor for Vec<T> specifically opts out of these semantics for just that destructor (of Vec<T> itself) via the use of special unstable attributes (see RFC 1238 and RFC 1327). This allows a vector to hold references that have the same lifetime as the vector itself. This is considered sound; after all, the vector itself will not dereference the data pointed to by such references (all it's doing is dropping values and deallocating the backing array), as long as an important caveat holds.
The important caveat: While the vector itself will not dereference pointers within its contained values while destructing itself, it will drop the values held by the vector. If those values of type T themselves have destructors, those destructors for T get run. And if those destructors access the data held within their references, then we would have a problem if we allowed dangling pointers within those references.
So, diving in even more deeply: the way we confirm dropck validity for a given structure S is this: we first check whether S itself has an impl Drop for S (and if so, we enforce rules on S with respect to its type parameters). But even after that step, we then recursively descend into the structure of S itself and check, for each of its fields, that everything is kosher according to dropck. (Note that we do this even if a type parameter of S is tagged with #[may_dangle].)
In this specific case, we have a Vec<T> which (indirectly via RawVec<T>/Unique<T>) owns a collection of values of type T, represented in a raw pointer *const T. However, the compiler attaches no ownership semantics to *const T; that field alone in a structure S implies no relationship between S and T, and thus enforces no constraint in terms of the relationship of lifetimes within the types S and T (at least from the viewpoint of dropck).
Therefore, if the Vec<T> had solely a *const T, the recursive descent into the structure of the vector would fail to capture the ownership relation between the vector and the instances of T contained within the vector. That, combined with the #[may_dangle] attribute on T, would cause the compiler to accept unsound code (namely cases where destructors for T end up trying to access data that has already been deallocated).
BUT: Vec<T> does not solely contain a *const T. There is also a PhantomData<T>, and that conveys to the compiler "hey, even though you can assume (due to the #[may_dangle] T) that the destructor for Vec won't access data of T when the vector is dropped, it is still possible that some destructor of T itself will access data of T as the vector is dropped."
The end effect: Given Vec<T>, if T doesn't have a destructor, then the compiler provides you with more flexibility (namely, it allows a vector to hold data with references to data that lives for the same amount of time as the vector itself, even though such data may be torn down before the vector is). But if T does have a destructor (and that destructor is not otherwise communicating to the compiler that it won't access any referenced data), then the compiler is more strict, requiring any referenced data to strictly outlive the vector (thus ensuring that when the destructor for T runs, all the referenced data will still be valid).
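Before the longer exploration below, here is a minimal standalone sketch of that end effect (NoisyDrop is a hypothetical type; the comments describe the expected dropck behavior rather than exact diagnostics):
struct NoisyDrop<'a>(&'a str);
impl<'a> Drop for NoisyDrop<'a> {
    fn drop(&mut self) {
        // The destructor reads through the reference, so dropck must ensure the
        // borrowed String is still alive when the Vec drops its elements.
        println!("dropping {}", self.0);
    }
}
fn main() {
    let mut v = Vec::new();          // declared first, so dropped last
    let s = String::from("hi");
    v.push(NoisyDrop(&s));
    // Expected to be rejected: `s` is dropped before `v`, yet `v`'s destructor
    // runs NoisyDrop::drop, which reads `&s`. Declaring `s` before `v` (so `s`
    // strictly outlives `v`) makes it compile; so does removing the Drop impl.
}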
If one wants to try to understand this via concrete exploration, you can try comparing how the compiler differs in its treatment of little container types that vary in their use of #[may_dangle] and PhantomData.
Here is some sample code I have whipped up to illustrate this:
// Illustration of a case where PhantomData is providing necessary ownership
// info to rustc.
//
// MyBox2<T> uses just a `*const T` to hold the `T` it owns.
// MyBox3<T> has both a `*const T` AND a PhantomData<T>; the latter communicates
// its ownership relationship with `T`.
//
// Skim down to `fn f2()` to see the relevant case,
// and compare it to `fn f3()`. When you run the program,
// the output will include:
//
// drop PrintOnDrop(mb2b, PrintOnDrop("v2b", 13, INVALID), Valid)
//
// (However, in the absence of #[may_dangle], the compiler will constrain
// things in a manner that may indeed imply that PhantomData is unnecessary;
// pnkfelix is not 100% sure of this claim yet, though.)
#![feature(alloc, dropck_eyepatch, generic_param_attrs, heap_api)]
extern crate alloc;
use alloc::heap;
use std::fmt;
use std::marker::PhantomData;
use std::mem;
use std::ptr;
#[derive(Copy, Clone, Debug)]
enum State { INVALID, Valid }
#[derive(Debug)]
struct PrintOnDrop<T: fmt::Debug>(&'static str, T, State);
impl<T: fmt::Debug> PrintOnDrop<T> {
fn new(name: &'static str, t: T) -> Self {
PrintOnDrop(name, t, State::Valid)
}
}
impl<T: fmt::Debug> Drop for PrintOnDrop<T> {
fn drop(&mut self) {
println!("drop PrintOnDrop({}, {:?}, {:?})",
self.0,
self.1,
self.2);
self.2 = State::INVALID;
}
}
struct MyBox1<T> {
v: Box<T>,
}
impl<T> MyBox1<T> {
fn new(t: T) -> Self {
MyBox1 { v: Box::new(t) }
}
}
struct MyBox2<T> {
v: *const T,
}
impl<T> MyBox2<T> {
fn new(t: T) -> Self {
unsafe {
let p = heap::allocate(mem::size_of::<T>(), mem::align_of::<T>());
let p = p as *mut T;
ptr::write(p, t);
MyBox2 { v: p }
}
}
}
unsafe impl<#[may_dangle] T> Drop for MyBox2<T> {
fn drop(&mut self) {
unsafe {
// We want this to be *legal*. This destructor is not
// allowed to call methods on `T` (since it may be in
// an invalid state), but it should be allowed to drop
// instances of `T` as it deconstructs itself.
//
// (Note however that the compiler has no knowledge
// that `MyBox2<T>` owns an instance of `T`.)
ptr::read(self.v);
heap::deallocate(self.v as *mut u8,
mem::size_of::<T>(),
mem::align_of::<T>());
}
}
}
struct MyBox3<T> {
v: *const T,
_pd: PhantomData<T>,
}
impl<T> MyBox3<T> {
fn new(t: T) -> Self {
unsafe {
let p = heap::allocate(mem::size_of::<T>(), mem::align_of::<T>());
let p = p as *mut T;
ptr::write(p, t);
MyBox3 { v: p, _pd: Default::default() }
}
}
}
unsafe impl<#[may_dangle] T> Drop for MyBox3<T> {
fn drop(&mut self) {
unsafe {
ptr::read(self.v);
heap::deallocate(self.v as *mut u8,
mem::size_of::<T>(),
mem::align_of::<T>());
}
}
}
fn f1() {
// `let (v, _mb1);` and `let (_mb1, v)` won't compile due to dropck
let v1; let _mb1;
v1 = PrintOnDrop::new("v1", 13);
_mb1 = MyBox1::new(PrintOnDrop::new("mb1", &v1));
}
fn f2() {
{
let (v2a, _mb2a); // Sound, but not distinguished from below by rustc!
v2a = PrintOnDrop::new("v2a", 13);
_mb2a = MyBox2::new(PrintOnDrop::new("mb2a", &v2a));
}
{
let (_mb2b, v2b); // Unsound!
v2b = PrintOnDrop::new("v2b", 13);
_mb2b = MyBox2::new(PrintOnDrop::new("mb2b", &v2b));
// namely, v2b dropped before _mb2b, but latter contains
// value that attempts to access v2b when being dropped.
}
}
fn f3() {
let v3; let _mb3; // `let (v, mb3);` won't compile due to dropck
v3 = PrintOnDrop::new("v3", 13);
_mb3 = MyBox3::new(PrintOnDrop::new("mb3", &v3));
}
fn main() {
f1(); f2(); f3();
}
Caveat emptor — I'm not that strong in the extremely deep theory that truly answers your question. I'm just a layperson who has used Rust a bit and has read the related RFCs. Always refer back to those original sources for a less-diluted version of the truth.
RFC 769 introduced the actual Drop-Check Rule:
Let v be some value (either temporary or named) and 'a be some lifetime (scope); if the type of v owns data of type D, where
(1.) D has a lifetime- or type-parametric Drop implementation, and
(2.) the structure of D can reach a reference of type &'a _, and
(3.) either:
(A.) the Drop impl for D instantiates D at 'a directly, i.e. D<'a>, or,
(B.) the Drop impl for D has some type parameter with a trait bound T where T is a trait that has at least one method,
then 'a must strictly outlive the scope of v.
It then goes further to define some of those terms, including what it means for one type to own another. This goes further to mention PhantomData specifically:
Therefore, as an additional special case to the criteria above for when the type E owns data of type D, we include:
If E is PhantomData<T>, then recurse on T.
A key problem occurs when two variables are defined at the same time:
struct Noisy<'a>(&'a str);
impl<'a> Drop for Noisy<'a> {
fn drop(&mut self) { println!("Dropping {}", self.0 )}
}
fn main() -> () {
let (mut v, s) = (Vec::new(), "hi".to_string());
let noisy = Noisy(&s);
v.push(noisy);
}
As I understand it, without the Drop-Check Rule and the indication that Vec owns Noisy, code like this might compile. When the Vec is dropped, the drop implementation could access an invalid reference, introducing unsafety.
Returning to your points:
If I implement Drop for a struct and manually destroy all Ts in it, why should I care if compiler knows that my struct owns some Ts?
The compiler must know that you own the value because you can/will call drop. Since the implementation of drop is arbitrary, if you are going to call it, the compiler must forbid you from accepting values that would cause unsafe behavior during drop.
Always remember that any arbitrary T can be a value, a reference, a value containing a reference, etc. When trying to puzzle out these types of things, it's important to try to use the most complicated variant for any thought experiments.
All of that should provide enough pieces to connect-the-dots; for full understanding, reading the RFC a few times is probably better than relying on my flawed interpretation.
Then it gets more complicated. RFC 1238 further modifies The Drop-Check Rule, removing this specific reasoning. It does say:
parametricity is a necessary but not sufficient condition to justify the inferences that dropck makes
Continuing to use PhantomData seems the safest thing to do, but it may not be required. An anonymous Twitter benefactor pointed out this code:
use std::marker::PhantomData;
#[derive(Debug)] struct MyGeneric<T> { x: Option<T> }
#[derive(Debug)] struct MyDropper<T> { x: Option<T> }
#[derive(Debug)] struct MyHiddenDropper<T> { x: *const T }
#[derive(Debug)] struct MyHonestHiddenDropper<T> { x: *const T, boo: PhantomData<T> }
impl<T> Drop for MyDropper<T> { fn drop(&mut self) { } }
impl<T> Drop for MyHiddenDropper<T> { fn drop(&mut self) { } }
impl<T> Drop for MyHonestHiddenDropper<T> { fn drop(&mut self) { } }
fn main() {
// Does Compile! (magic annotation on destructor)
{
let (a, mut b) = (0, vec![]);
b.push(&a);
}
// Does Compile! (no destructor)
{
let (a, mut b) = (0, MyGeneric { x: None });
b.x = Some(&a);
}
// Doesn't Compile! (has destructor, no attribute)
{
let (a, mut b) = (0, MyDropper { x: None });
b.x = Some(&a);
}
{
let (a, mut b) = (0, MyHiddenDropper { x: 0 as *const _ });
b.x = &&a;
}
{
let (a, mut b) = (0, MyHonestHiddenDropper { x: 0 as *const _, boo: PhantomData });
b.x = &&a;
}
}
This suggests that the changes in RFC 1238 made the compiler more conservative, such that simply having a lifetime or type parameter is enough to prevent it from compiling.
You can also note that Vec doesn't have this problem because it uses the unsafe_destructor_blind_to_params attribute described in the RFC.
I have a program that involves examining a complex data structure to see if it has any defects. (It's quite complicated, so I'm posting example code.) All of the checks are unrelated to each other, and will all have their own modules and tests.
More importantly, each check has its own error type that contains different information about how the check failed for each number. I'm doing it this way instead of just returning an error string so I can test the errors (it's why Error relies on PartialEq).
My Code So Far
I have traits for Check and Error:
trait Check {
type Error;
fn check_number(&self, number: i32) -> Option<Self::Error>;
}
trait Error: std::fmt::Debug + PartialEq {
fn description(&self) -> String;
}
And two example checks, with their error structs. In this example, I want to show errors if a number is negative or even:
#[derive(PartialEq, Debug)]
struct EvenError {
number: i32,
}
struct EvenCheck;
impl Check for EvenCheck {
type Error = EvenError;
fn check_number(&self, number: i32) -> Option<EvenError> {
if number % 2 == 0 {
Some(EvenError { number: number })
} else {
None
}
}
}
impl Error for EvenError {
fn description(&self) -> String {
format!("{} is even", self.number)
}
}
#[derive(PartialEq, Debug)]
struct NegativeError {
number: i32,
}
struct NegativeCheck;
impl Check for NegativeCheck {
type Error = NegativeError;
fn check_number(&self, number: i32) -> Option<NegativeError> {
if number < 0 {
Some(NegativeError { number: number })
} else {
None
}
}
}
impl Error for NegativeError {
fn description(&self) -> String {
format!("{} is negative", self.number)
}
}
I know that in this example, the two structs look identical, but in my code, there are many different structs, so I can't merge them. Lastly, an example main function, to illustrate the kind of thing I want to do:
fn main() {
let numbers = vec![1, -4, 64, -25];
let checks = vec![
Box::new(EvenCheck) as Box<Check<Error = Error>>,
Box::new(NegativeCheck) as Box<Check<Error = Error>>,
]; // What should I put for this Vec's type?
for number in numbers {
for check in checks {
if let Some(error) = check.check_number(number) {
println!("{:?} - {}", error, error.description())
}
}
}
}
You can see the code in the Rust playground.
Solutions I've Tried
The closest thing I've come to a solution is to remove the associated types and have the checks return Option<Box<Error>>. However, I get this error instead:
error[E0038]: the trait `Error` cannot be made into an object
--> src/main.rs:4:55
|
4 | fn check_number(&self, number: i32) -> Option<Box<Error>>;
| ^^^^^ the trait `Error` cannot be made into an object
|
= note: the trait cannot use `Self` as a type parameter in the supertraits or where-clauses
because of the PartialEq in the Error trait. Rust has been great to me thus far, and I really hope I'm able to bend the type system into supporting something like this!
When you write an impl Check and specialize your type Error with a concrete type, you end up with different types.
In other words, Check<Error = NegativeError> and Check<Error = EvenError> are statically different types. Although you might expect Check<Error> to describe both, note that in Rust NegativeError and EvenError are not sub-types of Error. They are guaranteed to implement all methods defined by the Error trait, but calls to those methods will be statically dispatched to physically different functions that the compiler creates (one version for NegativeError, one for EvenError).
Therefore, you can't put them in the same Vec, even boxed (as you discovered). It's not so much a matter of knowing how much space to allocate; it's that Vec requires its elements to be of one homogeneous type (you can't have a vec![1u8, 'a'] either, even though the character 'a' would fit in a u8).
Rust's way to "erase" some of the type information and gain the dynamic dispatch part of subtyping is, as you discovered, trait objects.
If you want to give another try to the trait object approach, you might find it more appealing with a few tweaks...
You might find it much easier if you used the Error trait in std::error instead of your own version of it.
You may need to impl Display to create a description with a dynamically built String, like so:
impl fmt::Display for EvenError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{} is even", self.number)
}
}
impl Error for EvenError {
fn description(&self) -> &str { "even error" }
}
Now you can drop the associated type and have Check return a trait object:
trait Check {
fn check_number(&self, number: i32) -> Option<Box<Error>>;
}
your Vec now has an expressible type:
let mut checks: Vec<Box<Check>> = vec![
Box::new(EvenCheck) ,
Box::new(NegativeCheck) ,
];
The best part of using std::error::Error...
is that now you don't need to use PartialEq to understand what error was thrown. Error has various types of downcasts and type checks if you do need to retrieve the concrete Error type out of your trait object.
for number in numbers {
for check in &mut checks {
if let Some(error) = check.check_number(number) {
println!("{}", error);
if let Some(s_err)= error.downcast_ref::<EvenError>() {
println!("custom logic for EvenErr: {} - {}", s_err.number, s_err)
}
}
}
}
full example on the playground
I eventually found a way to do it that I'm happy with. Instead of having a vector of Box<Check<???>> objects, have a vector of closures that all have the same type, abstracting away the very functions that get called:
fn main() {
type Probe = Box<Fn(i32) -> Option<Box<Error>>>;
let numbers: Vec<i32> = vec![ 1, -4, 64, -25 ];
let checks = vec![
Box::new(|num| EvenCheck.check_number(num).map(|u| Box::new(u) as Box<Error>)) as Probe,
Box::new(|num| NegativeCheck.check_number(num).map(|u| Box::new(u) as Box<Error>)) as Probe,
];
for number in numbers {
for check in checks.iter() {
if let Some(error) = check(number) {
println!("{}", error.description());
}
}
}
}
Not only does this allow for a vector of Box<Error> objects to be returned, it allows the Check objects to provide their own Error associated type which doesn't need to implement PartialEq. The multiple as casts look a little messy, but on the whole it's not that bad.
I'd suggest some refactoring.
First, I'm pretty sure that vectors must be homogeneous in Rust, so there is no way to supply elements of different types to them. Also, you cannot downcast traits to reduce them to a common base trait (as I remember, there was a question about this on SO).
So I'd use an algebraic data type with an explicit match for this task, like this:
enum Checker {
Even(EvenCheck),
Negative(NegativeCheck),
}
let checks = vec![
Checker::Even(EvenCheck),
Checker::Negative(NegativeCheck),
];
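The explicit match could look something like this (a sketch building on the question's Check trait and Error::description):
impl Checker {
    fn check_number(&self, number: i32) -> Option<String> {
        // Dispatch by hand to the concrete checker; each arm knows its own
        // concrete error type and turns it into a plain description string.
        match self {
            Checker::Even(c) => c.check_number(number).map(|e| e.description()),
            Checker::Negative(c) => c.check_number(number).map(|e| e.description()),
        }
    }
}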
As for error handling, consider using the FromError machinery, so you will be able to use the try! macro in your code and convert error types from one to another.
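In today's Rust that conversion idea is spelled with the From trait and the ? operator; here is a sketch using the question's error types (CheckError, check_even, and run are hypothetical names added for illustration):
#[derive(Debug)]
enum CheckError {
    Even(EvenError),
    Negative(NegativeError),
}
impl From<EvenError> for CheckError {
    fn from(e: EvenError) -> Self { CheckError::Even(e) }
}
impl From<NegativeError> for CheckError {
    fn from(e: NegativeError) -> Self { CheckError::Negative(e) }
}
// A Result-returning wrapper around one of the question's checks.
fn check_even(number: i32) -> Result<(), EvenError> {
    match EvenCheck.check_number(number) {
        Some(e) => Err(e),
        None => Ok(()),
    }
}
fn run(number: i32) -> Result<(), CheckError> {
    // `?` converts the EvenError into a CheckError via the From impl above.
    check_even(number)?;
    Ok(())
}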