It would help me organise my Rust code if I could implement a trait that defines a specific set of associated functions. The top-level code would then hold a variable pointing to a specific implementation of the trait and call the associated functions through it.
For example:
trait Bar {
fn hello();
}
struct FooBar;
impl Bar for FooBar {
fn hello() {
println!("Hello World!");
}
}
// this works fine
let foobar_fn = FooBar::hello;
foobar_fn();
// but how do I do this?
let foobar = FooBar;
foobar::hello();
I'd like to be able to call several associated functions from the same struct, hence why function pointers alone don't give me a strong solution. I'm not trying to store any data, so dynamic traits are not relevant. If I missed this in the documentation, apologies! Please point me in the right direction.
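One possible approach, sketched under the assumption that the implementation can be chosen at compile time: a value such as `foobar` cannot be used as a path, but a generic type parameter or a type alias can. The function `greet`, the alias `Selected`, and the second associated function `goodbye` are hypothetical additions for illustration, and the functions return `String` here only so the result is easy to inspect.

```rust
trait Bar {
    fn hello() -> String;
    fn goodbye() -> String;
}

struct FooBar;

impl Bar for FooBar {
    fn hello() -> String {
        "Hello World!".to_string()
    }
    fn goodbye() -> String {
        "Goodbye World!".to_string()
    }
}

// The implementation is selected by the type parameter, not by a value,
// so every associated function of the trait is reachable through `T::`.
fn greet<T: Bar>() -> Vec<String> {
    vec![T::hello(), T::goodbye()]
}

fn main() {
    for line in greet::<FooBar>() {
        println!("{line}");
    }

    // A type alias is another way to "point at" one implementation.
    type Selected = FooBar;
    println!("{}", Selected::hello());
}
```

Switching implementations then means changing the type argument or the alias in one place, with no data stored and no trait objects involved.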
Say I have a trait that looks like this:
use std::{error::Error, fmt::Debug};
use super::CheckResult;
/// A Checker is a component that is responsible for checking a
/// particular aspect of the node under investigation, be that metrics,
/// system information, API checks, load tests, etc.
#[async_trait::async_trait]
pub trait Checker: Debug + Sync + Send {
type Input: Debug;
/// This function is expected to take input, whatever that may be,
/// and return a vec of check results.
async fn check(&self, input: &Self::Input) -> anyhow::Result<Vec<CheckResult>>;
}
And say I have two implementations of this trait:
pub struct ApiData {
some_response: String,
}
pub struct MetricsData {
number_of_events: u64,
}
pub struct ApiChecker;
impl Checker for ApiChecker {
type Input = ApiData;
// implement check function
}
pub struct MetricsChecker;
impl Checker for MetricsChecker {
type Input = MetricsData;
// implement check function
}
In my code I have a Vec of these Checkers that looks like this:
pub struct MyServer {
checkers: Vec<Box<dyn Checker>>,
}
What I want to do is figure out, based on what Checkers are in this Vec, what data I need to fetch. For example, if it just contained an ApiChecker, I would only need to fetch the ApiData. If both ApiChecker and MetricsChecker were there, I'd need both ApiData and MetricsData. You can also imagine a third checker where Input = (ApiData, MetricsData). In that case I'd still just need to fetch ApiData and MetricsData once.
I imagine an approach where the Checker trait has an additional function on it that looks like this:
fn required_data(&self) -> HashSet<DataId>;
This could then return something like [DataId::Api, DataId::Metrics]. I would then run this for all Checkers in my vec and end up with a complete list of the data I need to get. I could then do some complicated set of checks like this:
let mut required_data = HashSet::new();
for checker in &checkers {
    // extend() merges in place; union() would only return a lazy iterator
    required_data.extend(checker.required_data());
}
let mut api_data: Option<ApiData> = None;
if required_data.contains(&DataId::Api) {
    api_data = Some(get_api_data());
}
And so on for each of the data types.
I'd then pass them into the check calls like this:
api_checker.check(
api_data.expect("There was some logic error and we didn't get the API data even though a Checker declared that it needed it")
);
The reasons I want to fetch the data outside of the Checkers are:
To avoid fetching the same data multiple times.
To support memoization between unrelated calls where the arguments are the same (this could be done inside some kind of Fetcher trait implementation for example).
To support generic retry logic.
By now you can probably see that I've got two big problems:
The declaration of what data a specific Checker needs is duplicated, once in the function signature and again from the required_data function. This naturally introduces bug potential. Ideally this information would only be declared once.
Similarly, in the calling code, I have to trust that the data that the Checkers said they needed was actually accurate (the expect in the previous snippet). If it's not, and we didn't get data we needed, there will be problems.
I think both of these problems would be solved if the function signature, and specifically the Input associated type, was able to express this "required data" declaration on its own. Unfortunately I'm not sure how to do that. I see there is a nightly feature in any that implements Provider and Demand: https://doc.rust-lang.org/std/any/index.html#provider-and-demand. This sort of sounds like what I want, but I have to use stable Rust, plus I figure I must be missing something and there is an easier way to do this without going rogue with semi dynamic typing.
tl;dr: How can I inspect what types the arguments are for a function (keeping in mind that the input might be more complex than just one thing, such as a struct or tuple) at runtime from outside the trait implementer? Alternatively, is there a better way to design this code that would eliminate the need for this kind of reflection?
Your problems start way earlier than you mention:
checkers: Vec<Box<dyn Checker>>
This is an incomplete type. The associated type Input means that Checker<Input = ApiData> and Checker<Input = MetricsData> are incompatible. How would you call checkers[0].check(input)? What type would input be? If you want a collection of "checkers" then you'll need a unified API, where the arguments to .check() are all the same.
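To make the incompatibility concrete, here is a minimal synchronous sketch (no `async_trait`, and `Vec<String>` stands in for `Vec<CheckResult>`; both are simplifications, and `run_all` is a hypothetical helper). A trait object type only becomes complete once the associated type is named, and at that point the collection can hold checkers for that one input type only.

```rust
trait Checker {
    type Input;
    fn check(&self, input: &Self::Input) -> Vec<String>;
}

struct ApiData {
    some_response: String,
}

struct ApiChecker;

impl Checker for ApiChecker {
    type Input = ApiData;
    fn check(&self, input: &ApiData) -> Vec<String> {
        vec![format!("api response was {} bytes", input.some_response.len())]
    }
}

// `Box<dyn Checker>` alone is rejected because `Input` is unspecified.
// Naming the associated type completes the type, but every element
// must then share the same `Input`.
fn run_all(checkers: &[Box<dyn Checker<Input = ApiData>>], input: &ApiData) -> Vec<String> {
    checkers.iter().flat_map(|c| c.check(input)).collect()
}

fn main() {
    let checkers: Vec<Box<dyn Checker<Input = ApiData>>> = vec![Box::new(ApiChecker)];
    let data = ApiData { some_response: "ok".to_string() };
    println!("{:?}", run_all(&checkers, &data));
}
```

A `MetricsChecker` with `Input = MetricsData` could not be pushed into this vector, which is why a unified `check` signature is needed.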
I would suggest a different route altogether: Instead of providing the input, provide a type that can retrieve the input that they ask for. That way there's no need to coordinate what type the checkers will ask for in a type-safe way, it'll be inherent to the methods the checkers themselves call. And if your primary concern is repeatedly retrieving the same data for different checkers, then all you need to do is implement caching in the provider. Same with retry logic.
Here's my suggestion:
struct DataProvider { /* cached api and metrics */ }
impl DataProvider {
fn fetch_api_data(&mut self) -> anyhow::Result<ApiData> { todo!() }
fn fetch_metrics_data(&mut self) -> anyhow::Result<MetricsData> { todo!() }
}
#[async_trait::async_trait]
trait Checker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>>;
}
struct ApiAndMetricsChecker;
#[async_trait::async_trait]
impl Checker for ApiAndMetricsChecker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>> {
let _api_data = data.fetch_api_data()?;
let _metrics_data = data.fetch_metrics_data()?;
// do something with api and metrics data
todo!()
}
}
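The caching mentioned above could be sketched roughly as follows. This is a synchronous, dependency-free illustration rather than the real implementation: `load_api_data` is a hypothetical stand-in for the actual fetch, `api_fetches` exists only to make the caching observable, and `String` replaces `anyhow::Error`.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ApiData {
    some_response: String,
}

struct DataProvider {
    api: Option<ApiData>, // cache slot, filled on the first request
    api_fetches: u32,     // counts real fetches, for illustration only
}

impl DataProvider {
    fn new() -> Self {
        DataProvider { api: None, api_fetches: 0 }
    }

    // Hypothetical stand-in for the real (possibly slow) fetch.
    fn load_api_data(&mut self) -> Result<ApiData, String> {
        self.api_fetches += 1;
        Ok(ApiData { some_response: "ok".to_string() })
    }

    // Returns the cached value if present, otherwise fetches and stores it.
    fn fetch_api_data(&mut self) -> Result<ApiData, String> {
        if let Some(cached) = &self.api {
            return Ok(cached.clone());
        }
        let fresh = self.load_api_data()?;
        self.api = Some(fresh.clone());
        Ok(fresh)
    }
}

fn main() {
    let mut provider = DataProvider::new();
    let first = provider.fetch_api_data().unwrap();
    let second = provider.fetch_api_data().unwrap();
    assert_eq!(first, second);
    println!("real fetches: {}", provider.api_fetches);
}
```

Retry logic would fit in the same place, wrapped around the call to the real fetch, so no checker needs to know about it.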
I've been learning Rust for a while and loving it. I've hit a wall, though, trying to do something which ought to be simple and elegant, so I'm sure I'm missing the obvious.
So I'm parsing JavaScript using the excellent RESSA crate and end up with an AST, which is a graph of structs defined by the crate. Now I need to traverse this many times and 'visit' certain nodes with my logic. So I've written a traverser that does that, but when it hits certain nodes it needs to call a callback. In my naivety, I thought I'd define a struct with an attribute for every type with an Option<Fn()> value. In my traverser, I check for the Some value and call it. This works fine, but it's ugly because I have to populate this enormous struct with dozens of attributes, most of which are None because I'm not interested in those types. Then I thought of traits: I'd define a trait 'Visit' which defines the function with a default implementation that does nothing, and then redefine the trait implementation with my desired behaviour. But this is no good, because all the types must have an implementation and then the implementation cannot be redefined. Is there a nice way I can provide a specific implementation for just a few types and leave the rest as default, or check for the existence of a function before calling it? I must be missing an idiomatic way to do this.
You can look at something like syn::Visit, which is a visitor in a popular Rust AST library, for inspiration.
The Visit trait is implemented by the visitor only, and has one method for each node type, with the default implementation only visiting the children:
// this snippet has been slightly altered from the source
pub trait Visit<'ast> {
fn visit_expr(&mut self, i: &'ast Expr) {
visit_expr(self, i);
}
fn visit_expr_array(&mut self, i: &'ast ExprArray) {
visit_expr_array(self, i);
}
fn visit_expr_assign(&mut self, i: &'ast ExprAssign) {
visit_expr_assign(self, i);
}
// ...
}
pub fn visit_expr<'ast, V>(v: &mut V, node: &'ast Expr)
where
V: Visit<'ast> + ?Sized,
{
match node {
Expr::Array(_binding_0) => v.visit_expr_array(_binding_0),
Expr::Assign(_binding_0) => v.visit_expr_assign(_binding_0),
// ...
}
}
pub fn visit_expr_array<'ast, V>(v: &mut V, node: &'ast ExprArray)
where
V: Visit<'ast> + ?Sized,
{
for el in &node.elems {
v.visit_expr(el);
}
}
// ...
With this pattern, you can create a visitor where you only implement the methods you need, and whatever you don't implement will just get the default behavior.
Additionally, because the default methods call separate functions that do the default behavior, you can call those within your custom visitor methods if you need to invoke the default behavior of visiting the children. (Rust doesn't let you invoke default implementations of an overridden trait method directly.)
So for example, a visitor to print all array expressions in a Rust program using syn::Visit could look like:
struct MyVisitor;
impl<'ast> Visit<'ast> for MyVisitor {
    fn visit_expr_array(&mut self, i: &'ast ExprArray) {
        println!("{:?}", i);
        // call the default visitor function to visit this node's children as well
        visit_expr_array(self, i);
    }
}
fn main() {
let root = get_ast_root_node();
MyVisitor.visit_expr(&root);
}
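For a version that compiles without `syn`, here is a small self-contained sketch of the same pattern. The toy `Expr` type, the `walk_*` function names, and the `ArrayCounter` visitor are all hypothetical and far simpler than a real AST; the point is only the split between overridable trait methods and free functions that carry the default traversal.

```rust
// Toy AST, hypothetical and much smaller than a real one.
enum Expr {
    Num(i64),
    Array(Vec<Expr>),
    Neg(Box<Expr>),
}

trait Visit {
    fn visit_expr(&mut self, e: &Expr) {
        walk_expr(self, e);
    }
    fn visit_num(&mut self, _n: i64) {}
    fn visit_array(&mut self, elems: &[Expr]) {
        walk_array(self, elems);
    }
}

// Default traversal lives in free functions so that an overriding
// method can still call it.
fn walk_expr<V: Visit + ?Sized>(v: &mut V, e: &Expr) {
    match e {
        Expr::Num(n) => v.visit_num(*n),
        Expr::Array(elems) => v.visit_array(elems),
        Expr::Neg(inner) => v.visit_expr(inner),
    }
}

fn walk_array<V: Visit + ?Sized>(v: &mut V, elems: &[Expr]) {
    for el in elems {
        v.visit_expr(el);
    }
}

#[derive(Default)]
struct ArrayCounter {
    arrays: usize,
}

// Only the method of interest is overridden; the rest keep their defaults.
impl Visit for ArrayCounter {
    fn visit_array(&mut self, elems: &[Expr]) {
        self.arrays += 1;
        walk_array(self, elems); // keep visiting the children
    }
}

fn count_arrays(root: &Expr) -> usize {
    let mut v = ArrayCounter::default();
    v.visit_expr(root);
    v.arrays
}

fn main() {
    let root = Expr::Array(vec![Expr::Num(1), Expr::Neg(Box::new(Expr::Array(vec![])))]);
    println!("{}", count_arrays(&root));
}
```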
I need to implement the fmt::Display method for an object coming from an external crate, so I created a wrapper for this object. I'd like to be able to use all the methods from the original object, without having to redefine all of them. I tried to implement Deref as advised on the awesome IRC channel #rust-beginners:
struct CustomMap(ObjectComingFromAnExternalCrate<char, char>);
impl std::ops::Deref for CustomMap {
type Target = ObjectComingFromAnExternalCrate<char, char>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
fn main() {
let cm = CustomMap::with_capacity(10);
println!("Hello, world!");
}
However, I'm getting this error :
error: no associated item named `with_capacity` found for type `CustomMap` in the current scope
--> <anon>:16:13
|
16 | let a = CustomMap::with_capacity(10);
| ^^^^^^^^^^^^^^^^^^^^^^^^
I assume it's because deref() doesn't work with associated functions.
How can I work around this? Reimplementing every associated function I use, just to be able to implement one method I need seems like overkill.
Newtypes are specifically designed to provide encapsulation, so they do not necessarily lend themselves well to just "adding new stuff".
That being said, a combination of:
Deref and DerefMut to get access to the methods
From and Into to easily convert from one to the other
OR making the inner type pub
should be able to tackle this.
The From/Into recommendation comes from the fact that most associated functions are constructors [1].
impl From<ObjectComingFromAnExternalCrate<char, char>> for CustomMap { ... }
and then you can do:
let cm: CustomMap = ObjectComingFromAnExternalCrate::<char, char>::with_capacity(10).into();
The other solution is to define CustomMap as:
struct CustomMap(pub ObjectComingFromAnExternalCrate<char, char>);
and then:
let cm = CustomMap(ObjectComingFromAnExternalCrate::<char, char>::with_capacity(10));
If you do not wish to enforce any other invariant, and do not care about encapsulation, either should get you going.
[1] Pointer types, such as Rc, rely heavily on associated functions to avoid hiding methods of the type they deref to.
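Putting the pieces together, a minimal sketch might look like the following. It assumes `std::collections::HashMap` as a stand-in for `ObjectComingFromAnExternalCrate`, and the `Display` format is invented purely for illustration.

```rust
use std::collections::HashMap;
use std::fmt;
use std::ops::{Deref, DerefMut};

// `HashMap` stands in for the external type.
struct CustomMap(HashMap<char, char>);

impl Deref for CustomMap {
    type Target = HashMap<char, char>;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for CustomMap {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

impl From<HashMap<char, char>> for CustomMap {
    fn from(inner: HashMap<char, char>) -> Self {
        CustomMap(inner)
    }
}

impl fmt::Display for CustomMap {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Sort so that the output does not depend on hash order.
        let mut pairs: Vec<_> = self.0.iter().collect();
        pairs.sort();
        for (k, v) in pairs {
            write!(f, "{k}->{v};")?;
        }
        Ok(())
    }
}

fn main() {
    // The associated function is called on the inner type, then converted.
    let mut cm: CustomMap = HashMap::<char, char>::with_capacity(10).into();
    cm.insert('a', 'b'); // method reached through DerefMut
    cm.insert('c', 'd');
    println!("{cm} (len {})", cm.len()); // len() reached through Deref
}
```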
When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer, C-like example:
struct SomeTool {
int type;
void *custom_data;
};
void invoke(SomeTool *tool) {
    StructOnlyForThisTool *data = malloc(sizeof(*data));
    /* ... fill in the data ... */
    tool->custom_data = data;
}
void execute(SomeTool *tool) {
    StructOnlyForThisTool *data = tool->custom_data;
    if (data->foo_bar) { /* do something */ }
}
When writing something similar in Rust, I replaced void * with Option<Box<dyn Any>>. However, I'm finding that accessing the data is unreasonably verbose, e.g.:
use std::any::Any;
struct SomeTool {
    tool_type: i32, // `type` is a reserved keyword in Rust
    custom_data: Option<Box<dyn Any>>,
}
fn invoke(tool: &mut SomeTool) {
    let data = StructOnlyForThisTool { /* my custom data */ };
    /* ... fill in the data ... */
    tool.custom_data = Some(Box::new(data));
}
fn execute(tool: &mut SomeTool) {
    let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
    if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_mut().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, it's a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option, just use Box<dyn Any> with Box::new(()) representing None so access can be simplified a little.
Use a macro or function to hide verbosity, passing in the Option<Box<dyn Any>>: will work of course, but I'd prefer not to and would use it as a last resort.
Add a trait to Option<Box<dyn Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope. I found the only reliable way to do this was to write a macro.
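The extension-trait option from the list above could be sketched roughly as follows. The trait name `CustomDataExt` and the panic messages are assumptions, and the struct uses `tool_type` because `type` is a keyword. Since the methods are called on the `custom_data` field rather than on the whole tool, only that field is borrowed, which may sidestep the borrow-checker problem described in Update 2.

```rust
use std::any::Any;

trait CustomDataExt {
    fn unwrap_box<T: Any>(&self) -> &T;
    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T;
}

impl CustomDataExt for Option<Box<dyn Any>> {
    fn unwrap_box<T: Any>(&self) -> &T {
        self.as_ref()
            .expect("tool has no custom data")
            .downcast_ref::<T>()
            .expect("custom data has an unexpected type")
    }

    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T {
        self.as_mut()
            .expect("tool has no custom data")
            .downcast_mut::<T>()
            .expect("custom data has an unexpected type")
    }
}

struct StructOnlyForThisTool {
    foo_bar: bool,
}

struct SomeTool {
    tool_type: i32,
    custom_data: Option<Box<dyn Any>>,
}

fn execute(tool: &mut SomeTool) -> bool {
    // Only the `custom_data` field is borrowed here.
    let data = tool.custom_data.unwrap_box_mut::<StructOnlyForThisTool>();
    tool.tool_type += 1; // a different field, still accessible
    data.foo_bar = !data.foo_bar;
    data.foo_bar
}

fn main() {
    let mut tool = SomeTool {
        tool_type: 0,
        custom_data: Some(Box::new(StructOnlyForThisTool { foo_bar: false })),
    };
    println!("{} {}", execute(&mut tool), tool.tool_type);
}
```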
If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field, using (self.impl_)(). For a more general approach that also works when the implementation has more methods, read on.
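A minimal sketch of the closure variant, assuming a single `execute` operation: `counting_tool` is a hypothetical constructor, and the closure returns a `String` here only so the effect is visible, whereas the original `execute` returns nothing.

```rust
struct SomeTool {
    // Type-erased callable; each tool's data lives in the closure's captures.
    impl_: Box<dyn FnMut() -> String>,
}

impl SomeTool {
    fn execute(&mut self) -> String {
        // Parentheses are needed to call a closure stored in a field.
        (self.impl_)()
    }
}

// Hypothetical tool whose private state is a call counter.
fn counting_tool() -> SomeTool {
    let mut calls = 0u32;
    SomeTool {
        impl_: Box::new(move || {
            calls += 1;
            format!("called {calls} time(s)")
        }),
    }
}

fn main() {
    let mut tool = counting_tool();
    println!("{}", tool.execute());
    println!("{}", tool.execute());
}
```

The captured `calls` variable plays the role of `custom_data`: it is created by the tool, owned by the tool, and never needs a cast.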
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<dyn SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make implementation an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.
Consider the following implementation:
pub struct BST {
root: Link,
}
type Link = Option<Box<Node>>;
struct Node {
left: Link,
elem: i32,
right: Link,
}
impl Link { /* misc */ }
impl BST { /* misc */ }
I keep getting the error:
cannot define inherent impl for a type outside of the crate where the type is defined; define and implement a trait or new type instead
I was able to find others had this same issue back in February, but there was seemingly no solution at the time.
Is there any fix or another way for me to implement my Link typedef in Rust?
Is there any fix
Not really. A type alias (type Foo = Bar) does not create a new type. All it does is create a different name that refers to the existing type.
In Rust, you are not allowed to implement inherent methods for a type that comes from another crate.
another way for me to implement
The normal solution is to create a brand new type. In fact, it goes by the name newtype!
struct Link(Option<Box<Node>>);
impl Link {
// methods all up in here
}
There's no runtime disadvantage to this - both versions will take the exact same amount of space. Additionally, you won't accidentally expose any methods you didn't mean to. For example, do you really want clients of your code to be able to call Option::take?
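As a rough illustration of the newtype route, here is a sketch with two hypothetical methods, `insert` and `contains`, that treat the structure as a binary search tree. Neither comes from the original code; they only show that inherent methods on `Link` now compile.

```rust
struct Node {
    left: Link,
    elem: i32,
    right: Link,
}

// A tuple struct is a distinct type defined in this crate,
// so inherent methods are allowed on it.
struct Link(Option<Box<Node>>);

impl Link {
    fn empty() -> Self {
        Link(None)
    }

    // Hypothetical method: returns false if the element was already present.
    fn insert(&mut self, elem: i32) -> bool {
        match self.0.as_mut() {
            Some(node) => {
                if elem == node.elem {
                    false
                } else if elem < node.elem {
                    node.left.insert(elem)
                } else {
                    node.right.insert(elem)
                }
            }
            None => {
                self.0 = Some(Box::new(Node {
                    left: Link::empty(),
                    elem,
                    right: Link::empty(),
                }));
                true
            }
        }
    }

    fn contains(&self, elem: i32) -> bool {
        match &self.0 {
            None => false,
            Some(node) if elem == node.elem => true,
            Some(node) if elem < node.elem => node.left.contains(elem),
            Some(node) => node.right.contains(elem),
        }
    }
}

fn main() {
    let mut root = Link::empty();
    root.insert(5);
    root.insert(3);
    println!("{} {}", root.contains(3), root.contains(7));
}
```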
Another solution is to create your own trait, and then implement it for your type. From the caller's point of view, it looks basically the same:
type Link = Option<Box<Node>>;
trait LinkMethods {
fn cool_method(&self);
}
impl LinkMethods for Link {
fn cool_method(&self) {
// ...
}
}
The annoyance here is that the trait LinkMethods has to be in scope to call these methods. You also cannot implement a trait you don't own for a type you don't own.
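A compilable sketch of the trait route, keeping `Link` as a plain alias: the method `total` and the helper `leaf` are hypothetical and stand in for `cool_method`.

```rust
struct Node {
    left: Link,
    elem: i32,
    right: Link,
}

type Link = Option<Box<Node>>;

// The trait is local to this crate, so it may be implemented
// for a foreign type such as Option<Box<Node>>.
trait LinkMethods {
    fn total(&self) -> i32;
}

impl LinkMethods for Link {
    // Sums every element reachable from this link.
    fn total(&self) -> i32 {
        match self {
            None => 0,
            Some(node) => node.left.total() + node.elem + node.right.total(),
        }
    }
}

// Hypothetical helper to build a node without children.
fn leaf(elem: i32) -> Link {
    Some(Box::new(Node { left: None, elem, right: None }))
}

fn main() {
    let tree: Link = Some(Box::new(Node { left: leaf(1), elem: 2, right: leaf(4) }));
    println!("{}", tree.total());
}
```

Code in another module would need `use` for `LinkMethods` before `total` becomes callable, which is the annoyance mentioned above.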
See also:
How do I implement a trait I don't own for a type I don't own?