How does Box<dyn Trait> deconstruct itself? - rust

Since it doesn't know the concrete type of the data and only contains a vptr for dyn Trait, how does it drop itself when it goes out of scope? Does every virtual table in Rust contain a drop method implementation?

When the Box holding the concrete type is unsized into a trait object, the drop code for that type (its drop glue, which includes any Drop implementation) goes into the vtable. A pointer (any pointer-like thing in Rust, e.g. a reference, Box, raw pointer, etc.) whose pointee is a trait object is laid out as follows in memory*:
struct FooTraitDynPointer {
ptr: *[const/mut] (),
vtable: &'static VTableImplForFooTrait
}
The ptr field in my example points to the actual data. We could say that's the original Box.
The vtable field in my example points to a static vtable. Say we have the following Foo trait:
trait Foo {
fn bar(&self) -> usize;
}
Our vtable will look as follows*:
struct VTableImplForFooTrait {
dropper: unsafe fn(*mut ()),
size: usize,
align: usize,
bar: unsafe fn(*const ()) -> usize,
}
We can see that the drop function is in there. Alongside it are size and align fields, which allow owning types to deallocate (or reallocate) the right amount of memory.
Here's an example program which crudely extracts the size of a struct from within a pointer to a trait object:
#![feature(raw)]
trait Foo {
fn bar(&self) -> usize;
}
struct Baz {
field: f64
}
impl Foo for Baz {
fn bar(&self) -> usize {
self.field as usize
}
}
#[derive(Clone)]
struct FooVTable {
dropper: unsafe fn(*mut ()),
size: usize,
align: usize,
bar: unsafe fn(*const ()) -> usize,
}
fn main() {
use std::{mem, raw};
let value = Baz { field: 20.0 };
let boxed = Box::new(value) as Box<dyn Foo>;
let deconstructed: raw::TraitObject = unsafe { mem::transmute(boxed) };
let vtable = deconstructed.vtable as *mut FooVTable;
let vtable = unsafe { (*vtable).clone() };
println!("size: {}, align: {}", vtable.size, vtable.align);
let result = unsafe { (vtable.bar)(deconstructed.data) };
println!("Value: {}", result);
}
Playground
(Currently) prints:
size: 8, align: 8
Value: 20
However this may very well change in the future so I'm leaving this timestamp here for someone who reads this in a future where the behaviour has been changed. June 5, 2020.
*: The layout of trait objects, and especially of their vtables, is NOT guaranteed, so do not rely on it in actual code.
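The same facts can be observed on stable Rust without assuming anything about the vtable layout. The sketch below is an illustration, not part of the original answer: the `drops` counter and the `inspect` helper are hypothetical names. It reads the size and alignment through `std::mem::size_of_val` and `std::mem::align_of_val`, and counts drops to show that the concrete `Drop` implementation still runs when only a `Box<dyn Foo>` is in scope.

```rust
use std::cell::Cell;
use std::mem::{align_of_val, size_of_val};
use std::rc::Rc;

trait Foo {
    fn bar(&self) -> usize;
}

// Hypothetical variant of `Baz` that counts how often it is dropped.
struct Baz {
    field: f64,
    drops: Rc<Cell<u32>>,
}

impl Foo for Baz {
    fn bar(&self) -> usize {
        self.field as usize
    }
}

impl Drop for Baz {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns (size, align, bar(), number of drops seen once the box is gone).
fn inspect() -> (usize, usize, usize, u32) {
    let drops = Rc::new(Cell::new(0));
    let boxed: Box<dyn Foo> = Box::new(Baz {
        field: 20.0,
        drops: Rc::clone(&drops),
    });
    // Only the trait object type is visible here, so size and alignment
    // have to be looked up at run time.
    let size = size_of_val(&*boxed);
    let align = align_of_val(&*boxed);
    let value = boxed.bar();
    // Dropping the box runs `Baz::drop`, although `Baz` is not named in its type.
    drop(boxed);
    (size, align, value, drops.get())
}
```

Calling `inspect()` from `main` and printing the tuple shows the size and alignment of `Baz` on the current platform, followed by the result of `bar()` and a drop count of one.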

Related

How can I reinterpret a RefMut<T> as a RefMut<U>

I have a collection containing elements wrapped in RefCell that I borrow. I have a wrapper struct to make the api usable like so:
pub struct RefWrapper<'a, T> {
inner: RefMut<'a, T>,
}
impl<'a, T> Deref for RefWrapper<'a, T> {
type Target = T;
fn deref(&self) -> &T { &self.inner }
}
impl<'a, T> DerefMut for RefWrapper<'a, T> {
fn deref_mut(&mut self) -> &mut T { &mut self.inner }
}
Foo and Bar have the same memory layout, so I can safely transmute between references to them. However, RefMut is not repr(C) and transmuting it would be unsound for other reasons, so I can't safely transmute RefWrapper<Foo> into RefWrapper<Bar>. Is there any way to convert from RefWrapper<Foo> to RefWrapper<Bar>?
If you can get a &mut U from a &mut T, then you can use RefMut::map.
Here is a code example:
use std::cell::{RefCell, RefMut};
#[repr(C)]
#[derive(Debug)]
struct Foo {
a: i32,
b: i32,
}
#[repr(C)]
#[derive(Debug)]
struct Bar {
a: i32,
b: i32,
}
fn convert_foo_to_bar(foo: &mut Foo) -> &mut Bar {
unsafe { std::mem::transmute(foo) }
}
fn main() {
let foo = RefCell::new(Foo { a: 42, b: 69 });
{
let foo_ref = foo.borrow_mut();
let mut bar_ref = RefMut::map(foo_ref, convert_foo_to_bar);
println!("{:?}", bar_ref);
bar_ref.b = 420;
}
println!("{:?}", foo);
}
Bar { a: 42, b: 69 }
RefCell { value: Foo { a: 42, b: 420 } }
Be aware that this requires manually keeping the memory layout of Foo and Bar in sync. It will create very subtle and hard-to-find bugs if Foo and Bar are not 100% identical.
Of course, one could generalize the concept by implementing From/Into for &mut Foo and &mut Bar instead of using convert_foo_to_bar. You can then use this trait in your RefWrapper to convert between the two.
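Applied to the wrapper from the question, the idea could look like the sketch below. It assumes RefWrapper owns its RefMut in a field named `inner`, as in the question. The `From` impl and the `demo` helper are illustrative additions, and the pointer cast is only sound while Foo and Bar keep identical `#[repr(C)]` layouts.

```rust
use std::cell::{RefCell, RefMut};
use std::ops::{Deref, DerefMut};

#[repr(C)]
struct Foo {
    a: i32,
    b: i32,
}

#[repr(C)]
struct Bar {
    a: i32,
    b: i32,
}

struct RefWrapper<'a, T> {
    inner: RefMut<'a, T>,
}

impl<'a, T> Deref for RefWrapper<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner
    }
}

impl<'a, T> DerefMut for RefWrapper<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.inner
    }
}

// Hypothetical conversion: re-wrap the RefMut after mapping the reference.
impl<'a> From<RefWrapper<'a, Foo>> for RefWrapper<'a, Bar> {
    fn from(wrapper: RefWrapper<'a, Foo>) -> Self {
        RefWrapper {
            inner: RefMut::map(wrapper.inner, |foo| {
                // SAFETY (assumption): Foo and Bar are both #[repr(C)] with
                // the same fields in the same order.
                unsafe { &mut *(foo as *mut Foo as *mut Bar) }
            }),
        }
    }
}

// Writes through the Bar view and reads the result back through Foo.
fn demo() -> (i32, i32) {
    let cell = RefCell::new(Foo { a: 42, b: 69 });
    {
        let wrapper = RefWrapper { inner: cell.borrow_mut() };
        let mut as_bar: RefWrapper<Bar> = wrapper.into();
        as_bar.b = 420;
    } // the mutable borrow ends here
    let foo = cell.borrow();
    (foo.a, foo.b)
}
```

Because RefMut::map consumes the original guard and returns a new one, the borrow flag of the RefCell stays correctly tracked throughout the conversion.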

How to filter RCed trait objects vector of specific subtrait in Rust?

The task is to pick out the subtrait objects from a vector of base trait objects:
use std::rc::Rc;
use std::any::Any;
pub trait TraitA {
fn get_title(&self) -> &str;
fn as_any(&self) -> &dyn Any;
}
pub trait TraitB: TraitA {
fn get_something(&self) -> &str;
}
pub fn filter_b(input: Vec<Rc<dyn TraitA>>) -> Vec<Rc<dyn TraitB>> {
// bs.filter(|it| /* How to do it? */).collect();
}
Is it even possible? Any clue or advice?
I know as_any() can be used to downcast, but I'm not sure how it's meant to work with Rc, as it takes ownership (and thus requires the instance).
I was first expecting the answer to be "absolutely not!", since Any doesn't help if you don't know the concrete type. But it turns out you can... with caveats, and I'm not 100% sure it's totally safe.
To go from Rc<T> to Rc<U>, you can use the escape hatches into_raw and from_raw. The docs of the latter read:
Constructs an Rc from a raw pointer.
The raw pointer must have been previously returned by a call to Rc<U>::into_raw where U must have the same size and alignment as T. This is trivially true if U is T. Note that if U is not T but has the same size and alignment, this is basically like transmuting references of different types. See mem::transmute for more information on what restrictions apply in this case.
The user of from_raw has to make sure a specific value of T is only dropped once.
This function is unsafe because improper use may lead to memory unsafety, even if the returned Rc<T> is never accessed.
With that in mind, since we only have access to TraitA, it'll need an as_b() function to get itself as a TraitB. The fact that the target is a subtrait doesn't really help. Then we can write a crosscast function like so:
use std::rc::Rc;
trait TraitA {
fn print_a(&self);
// SAFETY: the resulting `dyn TraitB` must have the *exact* same address,
// size, alignment, and drop implementation for `crosscast` to work safely.
// Basically it must be `self` or maybe a transparently wrapped object.
unsafe fn as_b(&self) -> Option<&(dyn TraitB + 'static)>;
}
trait TraitB {
fn print_b(&self);
}
fn crosscast(a: Rc<dyn TraitA>) -> Option<Rc<dyn TraitB>> {
unsafe {
let b_ptr = a.as_b()? as *const dyn TraitB;
let a_ptr = Rc::into_raw(a);
// sanity check
assert!(a_ptr as *const () == b_ptr as *const ());
Some(Rc::from_raw(b_ptr))
}
}
With this function at our disposal, your problem becomes trivial by using .filter_map():
struct Both {
data: String,
}
impl TraitA for Both {
fn print_a(&self) { println!("A: {}", self.data); }
unsafe fn as_b(&self) -> Option<&(dyn TraitB + 'static)> { Some(self) }
}
impl TraitB for Both {
fn print_b(&self) { println!("B: {}", self.data); }
}
struct OnlyA {
data: String,
}
impl TraitA for OnlyA {
fn print_a(&self) { println!("A: {}", self.data); }
unsafe fn as_b(&self) -> Option<&(dyn TraitB + 'static)> { None }
}
fn main() {
let vec_a = vec![
Rc::new(Both{ data: "both".to_owned() }) as Rc<dyn TraitA>,
Rc::new(OnlyA{ data: "only a".to_owned() })
];
for a in &vec_a {
a.print_a();
}
println!();
let vec_b = vec_a
.into_iter()
.filter_map(crosscast)
.collect::<Vec<_>>();
for b in &vec_b {
b.print_b();
}
}
See it all together on the playground.
I would still recommend not doing this if at all possible. For example, going from &Rc<dyn TraitA> to Option<&dyn TraitB> through an as_b() method is perfectly safe and needs none of these restrictions, so as_b() could then be declared as a safe method. Something like this has neither the restrictions nor the unsafety:
for b in vec_a.iter().filter_map(|a| a.as_b()) {
// ...
}
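A fully safe variant could look like the following sketch. The borrowing `as_b` is the safe form of the method discussed above. The owning `into_b`, which uses an `Rc<Self>` receiver, is an addition that is not part of the original answer. The method names and the `title`/`something` accessors are assumptions made for illustration.

```rust
use std::rc::Rc;

trait TraitA {
    fn title(&self) -> String;
    // Safe borrowing cast: no address or layout contract is involved.
    fn as_b(&self) -> Option<&dyn TraitB>;
    // Owning cast through an `Rc<Self>` receiver (hypothetical addition).
    fn into_b(self: Rc<Self>) -> Option<Rc<dyn TraitB>>;
}

trait TraitB: TraitA {
    fn something(&self) -> String;
}

struct Both {
    data: String,
}

struct OnlyA {
    data: String,
}

impl TraitA for Both {
    fn title(&self) -> String {
        format!("A: {}", self.data)
    }
    fn as_b(&self) -> Option<&dyn TraitB> {
        Some(self)
    }
    fn into_b(self: Rc<Self>) -> Option<Rc<dyn TraitB>> {
        // Rc<Both> coerces to Rc<dyn TraitB> without any unsafe code.
        Some(self)
    }
}

impl TraitB for Both {
    fn something(&self) -> String {
        format!("B: {}", self.data)
    }
}

impl TraitA for OnlyA {
    fn title(&self) -> String {
        format!("A: {}", self.data)
    }
    fn as_b(&self) -> Option<&dyn TraitB> {
        None
    }
    fn into_b(self: Rc<Self>) -> Option<Rc<dyn TraitB>> {
        None
    }
}

// Borrowing filter: the input vector stays usable afterwards.
fn borrow_b(input: &[Rc<dyn TraitA>]) -> Vec<&dyn TraitB> {
    input.iter().filter_map(|a| a.as_b()).collect()
}

// Owning filter with the signature from the question.
fn filter_b(input: Vec<Rc<dyn TraitA>>) -> Vec<Rc<dyn TraitB>> {
    input.into_iter().filter_map(|a| a.into_b()).collect()
}
```

The cost of this design is that every implementor of TraitA has to spell out both methods, but each implementation knows its own concrete type, so the compiler can perform the unsizing coercion itself.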

How to implement an iterator over chunks of an array in a struct?

I want to implement an iterator for the struct with an array as one of its fields. The iterator should return a slice of that array, but this requires a lifetime parameter. Where should that parameter go?
The Rust version is 1.37.0
struct A {
a: [u8; 100],
num: usize,
}
impl Iterator for A {
type Item = &[u8]; // this requires a lifetime parameter, but there is none declared
fn next(&mut self) -> Option<Self::Item> {
if self.num >= 10 {
return None;
}
let res = &self.a[10*self.num..10*(self.num+1)];
self.num += 1;
Some(res)
}
}
I wouldn't implement my own. Instead, I'd reuse the existing chunks iterator and implement IntoIterator for a reference to the type:
struct A {
a: [u8; 100],
num: usize,
}
impl<'a> IntoIterator for &'a A {
type Item = &'a [u8];
type IntoIter = std::slice::Chunks<'a, u8>;
fn into_iter(self) -> Self::IntoIter {
self.a.chunks(self.num)
}
}
fn example(a: A) {
for chunk in &a {
println!("{}", chunk.iter().sum::<u8>())
}
}
When you return a reference from a function, its lifetime needs to be tied to something else. Otherwise, the compiler wouldn't know how long the reference is valid (the exception to this is a 'static lifetime, which lasts for the duration of the whole program).
So we need an existing reference to the slices. One standard way to do this is to tie the reference to the iterator itself. For example,
struct Iter<'a> {
slice: &'a [u8; 100],
num: usize,
}
Then what you have works almost verbatim. (I've changed the names of the types and fields to be a little more informative).
impl<'a> Iterator for Iter<'a> {
type Item = &'a [u8];
fn next(&mut self) -> Option<Self::Item> {
if self.num >= 100 {
return None;
}
let res = &self.slice[10 * self.num..10 * (self.num + 1)];
self.num += 1;
Some(res)
}
}
Now, you probably still have an actual [u8; 100] somewhere, not just a reference. If you still want to work with that, what you'll want is a separate struct that has a method to create an Iter. For example:
struct Data {
array: [u8; 100],
}
impl Data {
fn iter<'a>(&'a self) -> Iter<'a> {
Iter {
slice: &self.array,
num: 0,
}
}
}
Thanks to lifetime elision, the lifetimes on iter can be left out:
impl Data {
fn iter(&self) -> Iter {
Iter {
slice: &self.array,
num: 0,
}
}
}
(playground)
Just a few notes. There was one compiler error with [0u8; 100]. This may have been a typo for [u8; 100], but just in case, here's why we can't do that. In the fields for a struct definition, only the types are specified. There aren't default values for the fields or anything like that. If you're trying to have a default for the struct, consider using the Default trait.
Second, you're probably aware of this, but there's already an implementation of a chunk iterator for slices. If slice is a slice (or can be deref coerced into a slice - vectors and arrays are prime examples), then slice.chunks(n) is an iterator over chunks of that slice with length n. I gave an example of this in the code linked above. Interestingly, that implementation uses a very similar idea: slice.chunks(n) returns a new struct with a lifetime parameter and implements Iterator. This is almost exactly the same as our Data::iter.
Finally, your implementation of next has a bug in it that causes an out-of-bounds panic when run. See if you can spot it!
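As a small illustration of the built-in chunk iterator mentioned in the notes above, the sketch below exposes it from the Data struct. The `chunks` method on Data and the `chunk_lengths` helper are hypothetical names added for this example.

```rust
struct Data {
    array: [u8; 100],
}

impl Data {
    // Delegates to the slice method; the returned iterator borrows `self`.
    fn chunks(&self, n: usize) -> std::slice::Chunks<'_, u8> {
        self.array.chunks(n)
    }
}

// Collects the length of every chunk so the behaviour is easy to inspect.
fn chunk_lengths(data: &Data, n: usize) -> Vec<usize> {
    data.chunks(n).map(|chunk| chunk.len()).collect()
}
```

With a chunk size of 10 this yields ten chunks of length 10. When the chunk size does not divide the array length, the last chunk is simply shorter. Note that `chunks` panics if it is given a chunk size of 0.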

What's the Rust idiom to define a field pointing to a C opaque pointer?

Given a struct:
#[repr(C)]
pub struct User {
pub name: *const c_char,
pub age: u8,
pub ctx: ??,
}
the field ctx would only be manipulated by C code; it's a pointer to a C struct UserAttr.
According to the Rust FFI documentation, the choice would be defined as an opaque type pub enum UserAttr {}. However, I found that Rust is unable to copy its value, e.g. why does the address of an object change across methods.
What's the right way in Rust to define such an opaque pointer, so that its value (as a pointer) gets copied across methods?
The future
RFC 1861 introduced the concept of an extern type. While implemented, it is not yet stabilized. Once it is, it will become the preferred implementation:
#![feature(extern_types)]
extern "C" {
type Foo;
}
type FooPtr = *mut Foo;
Today
The documentation states:
To do this in Rust, let’s create our own opaque types:
#[repr(C)] pub struct Foo { private: [u8; 0] }
#[repr(C)] pub struct Bar { private: [u8; 0] }
extern "C" {
pub fn foo(arg: *mut Foo);
pub fn bar(arg: *mut Bar);
}
By including a private field and no constructor, we create an opaque
type that we can’t instantiate outside of this module. An empty array
is both zero-size and compatible with #[repr(C)]. But because our
Foo and Bar types are different, we’ll get type safety between the
two of them, so we cannot accidentally pass a pointer to Foo to
bar().
An opaque pointer is created such that there's no normal way of creating such a type; you can only create pointers to it.
mod ffi {
use std::ptr;
#[repr(C)]
pub struct MyTypeFromC { _private: [u8; 0] }
pub fn constructor() -> *mut MyTypeFromC {
ptr::null_mut()
}
pub fn something(_thing: *mut MyTypeFromC) {
println!("Doing a thing");
}
}
use ffi::*;
struct MyRustType {
score: u8,
the_c_thing: *mut MyTypeFromC,
}
impl MyRustType {
fn new() -> MyRustType {
MyRustType {
score: 42,
the_c_thing: constructor(),
}
}
fn something(&mut self) {
println!("My score is {}", self.score);
ffi::something(self.the_c_thing);
self.score += 1;
}
}
fn main() {
let mut my_thing = MyRustType::new();
my_thing.something();
}
Breaking it down a bit:
// opaque -----V~~~~~~~~~V
*mut MyTypeFromC
// ^~~^ ------------ pointer
Thus it's an opaque pointer. Moving the struct MyRustType will not change the value of the pointer.
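Applied to the User struct from the question, this could look like the sketch below. The zero-sized UserAttr definition follows the documentation quoted above. The `ctx_after_moves` helper is a hypothetical name, used only to show that the pointer value survives moves of the containing struct.

```rust
use std::os::raw::c_char;
use std::ptr;

// Opaque stand-in for the C struct; it cannot be constructed from Rust.
#[repr(C)]
pub struct UserAttr {
    _private: [u8; 0],
}

#[repr(C)]
pub struct User {
    pub name: *const c_char,
    pub age: u8,
    pub ctx: *mut UserAttr,
}

// Moves a `User` around and returns the pointer it ends up holding.
fn ctx_after_moves(ctx: *mut UserAttr) -> *mut UserAttr {
    let user = User {
        name: ptr::null(),
        age: 30,
        ctx,
    };
    let moved = user; // move into a new binding
    let boxed = Box::new(moved); // move onto the heap
    boxed.ctx
}
```

The struct itself may end up at a different address after each move, but the `ctx` field is copied bit for bit, so it keeps pointing at the same C object.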
The past
Previous iterations of this answer and the documentation suggested using an empty enum (enum MyTypeFromC {}). An enum with no variants is semantically equivalent to the never type (!), which is a type that cannot exist. There were concerns that using such a construct could lead to undefined behavior, so moving to an empty array was deemed safer.

How do you actually use dynamically sized types in Rust?

In theory, Dynamically-Sized Types (DST) have landed and we should now be able to use dynamically sized type instances. Practically speaking, I can neither make it work, nor understand the tests around it.
Everything seems to revolve around the Sized? keyword... but how exactly do you use it?
I can put some types together:
// Note that this code example predates Rust 1.0
// and is no longer syntactically valid
trait Foo for Sized? {
fn foo(&self) -> u32;
}
struct Bar;
struct Bar2;
impl Foo for Bar { fn foo(&self) -> u32 { return 9u32; }}
impl Foo for Bar2 { fn foo(&self) -> u32 { return 10u32; }}
struct HasFoo<Sized? X> {
pub f:X
}
...but how do I create an instance of HasFoo, which is DST, to have either a Bar or Bar2?
Attempting to do so always seems to result in:
<anon>:28:17: 30:4 error: trying to initialise a dynamically sized struct
<anon>:28 let has_foo = &HasFoo {
I understand broadly speaking that you can't have a bare dynamically sized type; you can only interface with one through a pointer, but I can't figure out how to do that.
Disclaimer: these are just the results of a few experiments I did, combined with reading Niko Matsakis's blog.
DSTs are types where the size is not necessarily known at compile time.
Before DSTs
A slice like [i32] or a bare trait like IntoIterator were not valid object types because they do not have a known size.
A struct could look like this:
// [i32; 2] is a fixed-size array with 2 i32 elements
struct Foo {
f: [i32; 2],
}
or like this:
// & is basically a pointer.
// The compiler always knows the size of a
// pointer on a specific architecture, so whatever
// size the [i32] has, its address (the pointer) is
// a statically-sized type too
struct Foo2<'a> {
f: &'a [i32],
}
but not like this:
// f is (statically) unsized, so Foo is unsized too
struct Foo {
f: [i32],
}
This was true for enums and tuples too.
With DSTs
You can declare a struct (or enum or tuple) like Foo above, containing an unsized type. A type containing an unsized type will be unsized too.
While defining Foo was easy, creating an instance of Foo is still hard and subject to change. Since by definition you can't create an unsized value directly, you have to create a sized counterpart of Foo first: for example, Foo { f: [1, 2, 3] }, a Foo<[i32; 3]>, which has a statically known size. You then need some plumbing to let the compiler know how to coerce it into its statically unsized counterpart, Foo<[i32]>. The way to do this in safe and stable Rust is still being worked on as of Rust 1.5 (here is the RFC for DST coercions for more info).
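On current stable Rust, this particular coercion works behind a pointer when the struct is generic over its last field. The sketch below assumes a generic `Foo<T: ?Sized>` rather than the non-generic Foo shown earlier, and the helper names are hypothetical.

```rust
// A struct whose last (and only) field may be unsized.
struct Foo<T: ?Sized> {
    f: T,
}

// Builds the sized form first, then lets the compiler coerce it.
fn make_unsized() -> Box<Foo<[i32]>> {
    let sized: Box<Foo<[i32; 3]>> = Box::new(Foo { f: [1, 2, 3] });
    // The return position is a coercion site:
    // Box<Foo<[i32; 3]>> becomes Box<Foo<[i32]>>.
    sized
}

// Uses the unsized value through the box: (length, sum of elements).
fn describe() -> (usize, i32) {
    let foo = make_unsized();
    (foo.f.len(), foo.f.iter().sum())
}
```

The array length is no longer part of the type after the coercion. It travels with the box as pointer metadata, which is why `foo.f.len()` still works.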
Luckily, defining a new DST is not something you will be likely to do, unless you are creating a new type of smart pointer (like Rc), which should be a rare enough occurrence.
Imagine Rc is defined like our Foo above. Since it has all the plumbing to do the coercion from sized to unsized, it can be used to do this:
use std::rc::Rc;
trait Foo {
fn foo(&self) {
println!("foo")
}
}
struct Bar;
impl Foo for Bar {}
fn main() {
let data: Rc<dyn Foo> = Rc::new(Bar);
// we're creating a statically sized Rc<Bar>
// and coercing it (the : Rc<dyn Foo> on the left-hand side)
// to its unsized trait object counterpart.
// dyn Foo is a trait object type, so it has no statically
// known size
data.foo();
}
playground example
?Sized bound
Since you're unlikely to create a new DST, what are DSTs useful for in your everyday Rust coding? Most frequently, they let you write generic code that works both on sized types and on their existing unsized counterparts. Most often these will be Vec/[] slices or String/str.
The way you express this is through the ?Sized "bound". ?Sized is in some ways the opposite of a bound; it actually says that T can be either sized or unsized, so it widens the possible types we can use, instead of restricting them the way bounds typically do.
Contrived example time! Let's say that we have a FooSized struct that just wraps a reference and a simple Print trait that we want to implement for it.
struct FooSized<'a, T>(&'a T)
where
T: 'a;
trait Print {
fn print(&self);
}
We want to define a blanket impl for all the wrapped T's that implement Display.
impl<'a, T> Print for FooSized<'a, T>
where
T: 'a + fmt::Display,
{
fn print(&self) {
println!("{}", self.0)
}
}
Let's try to make it work:
// Does not compile. "hello" is a &'static str, so T is str
// (which is not sized)
let h_s = FooSized("hello");
h_s.print();
// to make it work we need a &&str or a &String
let s = "hello"; // &'static str
let h_s = FooSized(&s); // wraps a & &str, so T is &str (sized)
h_s.print(); // now self.0 is a &&str
Eh... this is awkward... Luckily we have a way to generalize the struct to work directly with str (and unsized types in general): ?Sized
//same as before, only added the ?Sized bound
struct Foo<'a, T: ?Sized>(&'a T)
where
T: 'a;
impl<'a, T: ?Sized> Print for Foo<'a, T>
where
T: 'a + fmt::Display,
{
fn print(&self) {
println!("{}", self.0)
}
}
Now this works:
let h = Foo("hello");
h.print();
playground
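Put together as one self-contained piece, the ?Sized version could look like this sketch. The names `Wrapper`, `render` and `show` are hypothetical, and the method returns a String instead of printing so the result is easy to check.

```rust
use std::fmt;

// Wraps a reference to any displayable value, sized or not.
struct Wrapper<'a, T: ?Sized>(&'a T);

trait Print {
    fn render(&self) -> String;
}

impl<'a, T: ?Sized + fmt::Display> Print for Wrapper<'a, T> {
    fn render(&self) -> String {
        format!("{}", self.0)
    }
}

// A generic function gets the same flexibility from the same bound.
fn show<T: ?Sized + fmt::Display>(value: &T) -> String {
    Wrapper(value).render()
}
```

`Wrapper("hello")` instantiates T as str, while `Wrapper(&42)` instantiates it as i32. Without the `?Sized` relaxation only the second form would compile.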
For a less contrived (but simple) actual example, you can look at the Borrow trait in the standard library.
Back to your question
trait Foo for ?Sized {
fn foo(&self) -> i32;
}
The for ?Sized syntax is now obsolete. It used to refer to the type of Self, declaring that Foo can be implemented by an unsized type, but this is now the default. Any trait can now be implemented for an unsized type, i.e. you can now have:
trait Foo {
fn foo(&self) -> i32;
}
//[i32] is unsized, but the compiler does not complain for this impl
impl Foo for [i32] {
fn foo(&self) -> i32 {
5
}
}
If you don't want your trait to be implementable for unsized types, you can use the Sized bound:
// now the impl Foo for [i32] is illegal
trait Foo: Sized {
fn foo(&self) -> i32;
}
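To see an impl on an unsized type in use, here is a minimal sketch. The `Total` trait and the `total_of` helper are hypothetical names chosen for this example.

```rust
trait Total {
    fn total(&self) -> i32;
}

// The impl targets the unsized slice type directly.
impl Total for [i32] {
    fn total(&self) -> i32 {
        self.iter().sum()
    }
}

// A generic caller has to opt in to unsized types with ?Sized.
fn total_of<T: Total + ?Sized>(value: &T) -> i32 {
    value.total()
}
```

Anything that derefs to a slice picks the method up through auto-deref, so `vec![4, 5].total()` works as well. Adding a `Sized` supertrait to `Total` would turn the impl for `[i32]` into a compile error.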
To amend the example that Paolo Falabella has given, here is a different way of looking at it, using a named field.
struct Foo<'a, T>
where
T: 'a + ?Sized,
{
printable_object: &'a T,
}
impl<'a, T> Print for Foo<'a, T>
where
T: 'a + ?Sized + fmt::Display,
{
fn print(&self) {
println!("{}", self.printable_object);
}
}
fn main() {
let h = Foo {
printable_object: "hello",
};
h.print();
}
At the moment, to create a HasFoo storing a type-erased Foo, you first need to create one with a fixed concrete type and then coerce a pointer to it to the DST form, that is:
let has_foo: &HasFoo<dyn Foo> = &HasFoo { f: Bar };
Calling has_foo.f.foo() then does what you expect.
In future these DST casts will almost certainly be possible with as, but for the moment coercion via an explicit type hint is required.
Here is a complete example based on huon's answer. The important trick is to make the type that you want to contain the DST a generic type where the generic need not be sized (via ?Sized). You can then construct a concrete value using Bar1 or Bar2 and then immediately convert it.
struct HasFoo<F: ?Sized = dyn Foo>(F);
impl HasFoo<dyn Foo> {
fn use_it(&self) {
println!("{}", self.0.foo())
}
}
fn main() {
// Could likewise use `&HasFoo` or `Rc<HasFoo>`, etc.
let ex1: Box<HasFoo> = Box::new(HasFoo(Bar1));
let ex2: Box<HasFoo> = Box::new(HasFoo(Bar2));
ex1.use_it();
ex2.use_it();
}
trait Foo {
fn foo(&self) -> u32;
}
struct Bar1;
impl Foo for Bar1 {
fn foo(&self) -> u32 {
9
}
}
struct Bar2;
impl Foo for Bar2 {
fn foo(&self) -> u32 {
10
}
}
