How to offer an API that stores values of different types and can return them with the original type restored? - rust

I want to offer a safe API like below FooManager. It should be able to store arbitrary user-defined values that implement a trait Foo. It should also be able to hand them back later - not as trait object (Box<dyn Foo>) but as the original type (Box<T> where T: Foo). At least conceptually it should be possible to offer this as a safe API, by using generic handles (Handle<T>), see below.
Additional criteria:
The solution should work in stable Rust (internal usage of unsafe blocks is perfectly okay though).
I don't want to modify the trait Foo, as e.g. suggested in How to get a reference to a concrete type from a trait object?. It should work without adding a method as_any(). Reasoning: Foo shouldn't have any knowledge about the fact that it might be stored in containers and be restored to the actual type.
trait Foo {}
struct Handle<T> {
// ...
}
struct FooManager {
// ...
}
impl FooManager {
// A real-world API would complain if the value is already stored.
pub fn keep_foo<T: Foo>(&mut self, foo: Box<T>) -> Handle<T> {
// ...
}
// In a real-world API this would return an `Option`.
pub fn return_foo<T: Foo>(&mut self, handle: Handle<T>) -> Box<T> {
// ...
}
}
I came up with this (Rust Playground) but not sure if there's a better way or if it's safe even. What do you think of that approach?

Related

How do I inspect function arguments at runtime in Rust?

Say I have a trait that looks like this:
use std::{error::Error, fmt::Debug};
use super::CheckResult;
/// A Checker is a component that is responsible for checking a
/// particular aspect of the node under investigation, be that metrics,
/// system information, API checks, load tests, etc.
#[async_trait::async_trait]
pub trait Checker: Debug + Sync + Send {
type Input: Debug;
/// This function is expected to take input, whatever that may be,
/// and return a vec of check results.
async fn check(&self, input: &Self::Input) -> anyhow::Result<Vec<CheckResult>>;
}
And say I have two implementations of this trait:
pub struct ApiData {
some_response: String,
}
pub MetricsData {
number_of_events: u64,
}
pub struct ApiChecker;
impl Checker for ApiChecker {
type Input = ApiData;
// implement check function
}
pub struct MetricsChecker;
impl Checker for MetricsChecker {
type Input = MetricsData;
// implement check function
}
In my code I have a Vec of these Checkers that looks like this:
pub struct MyServer {
checkers: Vec<Box<dyn Checker>>,
}
What I want to do is figure out, based on what Checkers are in this Vec, what data I need to fetch. For example, if it just contained an ApiChecker, I would only need to fetch the ApiData. If both ApiChecker and MetricsChecker were there, I'd need both ApiData and MetricsData. You can also imagine a third checker where Input = (ApiData, MetricsData). In that case I'd still just need to fetch ApiData and MetricsData once.
I imagine an approach where the Checker trait has an additional function on it that looks like this:
fn required_data(&self) -> HashSet<DataId>;
This could then return something like [DataId::Api, DataId::Metrics]. I would then run this for all Checkers in my vec and then I'd end up a complete list of data I need to get. I could then do some complicated set of checks like this:
let mut required_data = HashSet::new();
for checker in checkers {
required_data.union(&mut checker.required_data());
}
let api_data: Option<ApiData> = None;
if required_data.contains(DataId::Api) {
api_data = Some(get_api_data());
}
And so on for each of the data types.
I'd then pass them into the check calls like this:
api_checker.check(
api_data.expect("There was some logic error and we didn't get the API data even though a Checker declared that it needed it")
);
The reasons I want to fetch the data outside of the Checkers is:
To avoid fetching the same data multiple times.
To support memoization between unrelated calls where the arguments are the same (this could be done inside some kind of Fetcher trait implementation for example).
To support generic retry logic.
By now you can probably see that I've got two big problems:
The declaration of what data a specific Checker needs is duplicated, once in the function signature and again from the required_data function. This naturally introduces bug potential. Ideally this information would only be declared once.
Similarly, in the calling code, I have to trust that the data that the Checkers said they needed was actually accurate (the expect in the previous snippet). If it's not, and we didn't get data we needed, there will be problems.
I think both of these problems would be solved if the function signature, and specifically the Input associated type, was able to express this "required data" declaration on its own. Unfortunately I'm not sure how to do that. I see there is a nightly feature in any that implements Provider and Demand: https://doc.rust-lang.org/std/any/index.html#provider-and-demand. This sort of sounds like what I want, but I have to use stable Rust, plus I figure I must be missing something and there is an easier way to do this without going rogue with semi dynamic typing.
tl;dr: How can I inspect what types the arguments are for a function (keeping in mind that the input might be more complex than just one thing, such as a struct or tuple) at runtime from outside the trait implementer? Alternatively, is there a better way to design this code that would eliminate the need for this kind of reflection?
Your problems start way earlier than you mention:
checkers: Vec<Box<dyn Checker>>
This is an incomplete type. The associated type Input means that Checker<Input = ApiData> and Checker<Input = MetricsData> are incompatible. How would you call checkers[0].check(input)? What type would input be? If you want a collection of "checkers" then you'll need a unified API, where the arguments to .check() are all the same.
I would suggest a different route altogether: Instead of providing the input, provide a type that can retrieve the input that they ask for. That way there's no need to coordinate what type the checkers will ask for in a type-safe way, it'll be inherent to the methods the checkers themselves call. And if your primary concern is repeatedly retrieving the same data for different checkers, then all you need to do is implement caching in the provider. Same with retry logic.
Here's my suggestion:
struct DataProvider { /* cached api and metrics */ }
impl DataProvider {
fn fetch_api_data(&mut self) -> anyhow::Result<ApiData> { todo!() }
fn fetch_metrics_data(&mut self) -> anyhow::Result<MetricsData> { todo!() }
}
#[async_trait::async_trait]
trait Checker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>>;
}
struct ApiAndMetricsChecker;
#[async_trait::async_trait]
impl Checker for ApiAndMetricsChecker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>> {
let _api_data = data.fetch_api_data()?;
let _metrics_data = data.fetch_metrics_data()?;
// do something with api and metrics data
todo!()
}
}

How do I avoid Enum + Trait pattern when a struct is not object safe?

I get the implications of object safety, but I'm trying to find an idiomatic way to solve for this situation.
Say I have two structs that share common behavior and also need to derive PartialEq for comparison in another part of the program:
trait Growl:PartialEq {
fn growl(&self);
}
#[derive(PartialEq)]
struct Pikachu;
#[derive(PartialEq)]
struct Porygon;
impl Growl for Pikachu {
fn growl(&self) {
println!("pika");
}
}
impl Growl for Porygon {
fn growl(&self) {
println!("umm.. rawr?");
}
}
In another struct, I want to hold a Vec of these objects. Since I can't use a trait object with Vec<Box<Growl>>...
struct Region{
pokemon: Vec<Box<dyn Growl>>,
}
// ERROR: `Growl` cannot be made into an object
... I need to get more creative. I read this article, which suggests using an enum or changing the trait. I haven't yet explored type erasure, but it seems heavy-handed for my use case. Using an enum like this is what I've ended up doing but it feels unnecessarily complex
enum Pokemon {
Pika(Pikachu),
Pory(Porygon),
}
Someone coming through this code in the future now needs to understand the individual structs, the trait (which provides all functionality for the structs), and the wrapper enum type to make changes.
Is there a better solution for this pattern?
I read this article, which suggests using an enum or changing the trait. I haven't yet explored type erasure, but it seems heavy-handed for my use case.
Type erasure is just a synonym term for dynamic dispatch - even your original Box<dyn Growl> "erases the type" of the Pokemon. What you want here is to continue in the same vein, by creating a new trait better taylored to your use case and providing a blanket implementation of that trait for any type that implements the original trait.
It sounds complex, but it's actually very simple, much simpler than erased-serde, which has to deal with serde's behemoth traits. Let's go through it step by step. First, you create a trait that won't cause issues with dynamic dispatch:
/// Like Growl, but without PartialEq
trait Gnarl {
// here you'd have only the methods which are actually needed by Region::pokemon.
// Let's assume it needs growl().
fn growl(&self);
}
Then, provide a blanket implementation of your new Gnarl trait for all types that implement the original Growl:
impl<T> Gnarl for T
where
T: Growl,
{
fn growl(&self) {
// Here in the implementation `self` is known to implement `Growl`,
// so you can make use of the full `Growl` functionality, *including*
// things not exposed to `Gnarl` like PartialEq
<T as Growl>::growl(self);
}
}
Finally, use the new trait to create type-erased pokemon:
struct Region {
pokemon: Vec<Box<dyn Gnarl>>,
}
fn main() {
let _region = Region {
pokemon: vec![Box::new(Pikachu), Box::new(Porygon)],
};
}
Playground

Overriding or dynamic dispatch equivalent in rust

I've been learning rust for a while and loving it. I've hit a wall though trying to do something which ought to be simple and elegant so I'm sure I'm missing the obvious.
So I'm parsing JavaScript using the excellent RESSA crate and end up with an AST which is a graph of structs defined by the crate. Now I need to traverse this many times and 'visit' certain nodes with my logic. So I've written a traverser that does that but when it hits a certain nodes it needs to call a callback. In my niavity, I thought I'd define a struct with an attribute for every type with an Option<Fn()> value. In my traverser, I check for the Some value and call it. This works fine but it's ugly because I have to populate this enormous struct with dozens of attributes most of which are None because I'm not interested in those types. Then I thought traits, I'd define a trait 'Visit' which defines the function with a default implementation that does nothing. Then I can just redefine the trait implementation with my desired implementation but this is no good because all the types must have an implementation and then the implementation cannot be redefined. Is there as nice way I can just provide a specific implementation for a few types and leave the rest as default or check for the existence of a function before calling it ? I must be missing an idiomatic way to do this.
You can look at something like syn::Visit, which is a visitor in a popular Rust AST library, for inspiration.
The Visit trait is implemented by the visitor only, and has one method for each node type, with the default implementation only visiting the children:
// this snippet has been slightly altered from the source
pub trait Visit<'ast> {
fn visit_expr(&mut self, i: &'ast Expr) {
visit_expr(self, i);
}
fn visit_expr_array(&mut self, i: &'ast ExprArray) {
visit_expr_array(self, i);
}
fn visit_expr_assign(&mut self, i: &'ast ExprAssign) {
visit_expr_assign(self, i);
}
// ...
}
pub fn visit_expr<'ast, V>(v: &mut V, node: &'ast Expr)
where
V: Visit<'ast> + ?Sized,
{
match node {
Expr::Array(_binding_0) => v.visit_expr_array(_binding_0),
Expr::Assign(_binding_0) => v.visit_expr_assign(_binding_0),
// ...
}
}
pub fn visit_expr_array<'ast, V>(v: &mut V, node: &'ast ExprArray)
where
V: Visit<'ast> + ?Sized,
{
for el in &node.elems {
v.visit_expr(el);
}
}
// ...
With this pattern, you can create a visitor where you only implement the methods you need, and whatever you don't implement will just get the default behavior.
Additionally, because the default methods call separate functions that do the default behavior, you can call those within your custom visitor methods if you need to invoke the default behavior of visiting the children. (Rust doesn't let you invoke default implementations of an overriden trait method directly.)
So for example, a visitor to print all array expressions in a Rust program using syn::Visit could look like:
struct MyVisitor;
impl Visit<'ast> for MyVisitor {
fn visit_expr_array(&mut self, i: &'ast ExprArray) {
println("{:?}", i);
// call default visitor method to visit this node's children as well
visit_expr_array(i);
}
}
fn main() {
let root = get_ast_root_node();
MyVisitor.visit_expr(&root);
}

How to assign an impl trait in a struct?

Consider some struct (HiddenInaccessibleStruct) that is not accessible, but implements an API trait. The only way to obtain an object of this hidden type is by calling a function, that returns an opaque implementation of this type. Another struct owns some type, that makes use of this API trait. Right now, it seems not possible to assign this field in fn new(). The code below can also be found in rust playgrounds.
// -- public api
trait Bound {
fn call(&self) -> Self;
}
// this is not visible
#[derive(Default)]
struct HiddenInaccessibleStruct;
impl Bound for HiddenInaccessibleStruct {
fn call(&self) -> Self { }
}
// -- public api
pub fn load() -> impl Bound {
HiddenInaccessibleStruct::default()
}
struct Abc<T> where T : Bound {
field : T
}
impl<T> Abc<T> where T : Bound {
pub fn new() -> Self {
let field = load();
Abc {
field // this won't work, since `field` has an opaque type.
}
}
}
Update
The API trait Bound declares a function, that returns Self, hence it is not Sized.
There are two concepts in mid-air collision here: Universal types and existential types. An Abc<T> is a universal type and we, including Abc, can refer to whatever T actually is as T (simple as that). impl Trait-types are Rust's closest approach to existential types, where we only promise that such a type exists, but we can't refer to it (there is no T which holds the solution). This also means your constructor can't actually create a Abc<T>, because it can't decide what T is. Also see this article.
One solution is to kick the problem upstairs: Change the constructor to take a T from the outside, and pass the value into it:
impl<T> Abc<T>
where
T: Bound,
{
pub fn new(field: T) -> Self {
Abc { field }
}
}
fn main() {
let field = load();
let abc = Abc::new(field);
}
See this playground.
This works, but it only shifts the problem: The type of abc in main() is Abc<impl Bound>, which is (currently) impossible to write down. If you change the line to let abc: () = ..., the compiler will complain that you are trying to assign Abc<impl Bound> to (). If you try to comply with the advice and change the line to let abc: Abc<impl Bound> = ..., the compiler will complain that this type is invalid. So you have to leave the type of abc being implied. This brings some useability issues with Abc<impl Bound>, because you can't easily put values of that type into other structs etc.; basically, the existential type "infects" the outer type containing it.
impl Trait-types are mostly useful for immediate consumption, e.g. impl Iterator<Item=...>. In your case, with the aim apparently being to hide the type, you may get away with sealing Bound. In a more general case, it may be better to use dynamic dispatch (Box<dyn Bound>).

How to Box a trait that has associated types?

I'm very new to Rust so I may have terminology confused.
I want to use the hashes crates to do some hashing and I want to dynamically pick which algorithm (sha256, sha512, etc.) to use at runtime.
I'd like to write something like this:
let hasher = match "one of the algorithms" {
"sha256" => Box::new(Sha256::new()) as Box<Digest>,
"sha512" => Box::new(Sha512::new()) as Box<Digest>
// etc...
};
I sort of get that that doesn't work because the associated types required by Digest aren't specified. If I attempt to fill them in:
"sha256" => Box::new(Sha256::new()) as Box<Digest<<OutputSize = U32, BlockSize = U64>>>,
I'm left with an error: the trait 'digest::Digest' cannot be made into an object. I think this approach will fail anyway because match will be returning slightly different types in cases where different algorithms have different associated types.
Am I missing something obvious? How can I dynamically create an instance of something that implements a trait and then hold on to that thing and use it through the trait interface?
The message refers to object safety (longer article). The Digest trait has two incompatibilities:
It uses associated types (this can be worked around by explicitly setting all type parameters to values compatible for all Digest objects).
It has a method (fn result(self) -> …) taking self by value. You won't be able to call it, which ruins usability of this trait.
Once a trait object is created, information about its subtype-specific features such as memory layout or associated types is erased. All calls to the trait object's methods are done via a vtable pointer. This means they all must be compatible, and Rust can't allow you to call any methods that could vary in these aspects.
A workaround is to create your custom wrapper trait/adapter that is object-compatible. I'm not sure if that's the best implementation, but it does work:
trait Digest {
type Assoc;
fn result(self);
}
struct Sha;
impl Digest for Sha {
type Assoc = u8;
fn result(self) {}
}
///////////////////////////////////////////
trait MyWrapper {
fn result(&mut self); // can't be self/Sized
}
impl<T: Digest> MyWrapper for Option<T> {
fn result(&mut self) {
// Option::take() gives owned from non-owned
self.take().unwrap().result()
}
}
fn main() {
let mut digest: Box<MyWrapper> = Box::new(Some(Sha));
digest.result();
}

Resources