How do I inspect function arguments at runtime in Rust? - rust

Say I have a trait that looks like this:
use std::{error::Error, fmt::Debug};
use super::CheckResult;
/// A Checker is a component that is responsible for checking a
/// particular aspect of the node under investigation, be that metrics,
/// system information, API checks, load tests, etc.
#[async_trait::async_trait]
pub trait Checker: Debug + Sync + Send {
    type Input: Debug;
    /// This function is expected to take input, whatever that may be,
    /// and return a vec of check results.
    async fn check(&self, input: &Self::Input) -> anyhow::Result<Vec<CheckResult>>;
}
And say I have two implementations of this trait:
pub struct ApiData {
    some_response: String,
}
pub struct MetricsData {
    number_of_events: u64,
}
pub struct ApiChecker;
impl Checker for ApiChecker {
    type Input = ApiData;
    // implement check function
}
pub struct MetricsChecker;
impl Checker for MetricsChecker {
    type Input = MetricsData;
    // implement check function
}
In my code I have a Vec of these Checkers that looks like this:
pub struct MyServer {
    checkers: Vec<Box<dyn Checker>>,
}
What I want to do is figure out, based on what Checkers are in this Vec, what data I need to fetch. For example, if it just contained an ApiChecker, I would only need to fetch the ApiData. If both ApiChecker and MetricsChecker were there, I'd need both ApiData and MetricsData. You can also imagine a third checker where Input = (ApiData, MetricsData). In that case I'd still just need to fetch ApiData and MetricsData once.
I imagine an approach where the Checker trait has an additional function on it that looks like this:
fn required_data(&self) -> HashSet<DataId>;
This could then return something like [DataId::Api, DataId::Metrics]. I would then run this for all Checkers in my vec and end up with a complete list of the data I need to get. I could then do some complicated set of checks like this:
let mut required_data = HashSet::new();
for checker in &checkers {
    // Merge each checker's requirements into one set.
    required_data.extend(checker.required_data());
}
let mut api_data: Option<ApiData> = None;
if required_data.contains(&DataId::Api) {
    api_data = Some(get_api_data());
}
And so on for each of the data types.
I'd then pass them into the check calls like this:
api_checker.check(
    api_data.as_ref().expect("There was some logic error and we didn't get the API data even though a Checker declared that it needed it")
).await?;
The reasons I want to fetch the data outside of the Checkers are:
To avoid fetching the same data multiple times.
To support memoization between unrelated calls where the arguments are the same (this could be done inside some kind of Fetcher trait implementation for example).
To support generic retry logic.
By now you can probably see that I've got two big problems:
The declaration of what data a specific Checker needs is duplicated, once in the function signature and again from the required_data function. This naturally introduces bug potential. Ideally this information would only be declared once.
Similarly, in the calling code, I have to trust that the data that the Checkers said they needed was actually accurate (the expect in the previous snippet). If it's not, and we didn't get data we needed, there will be problems.
I think both of these problems would be solved if the function signature, and specifically the Input associated type, was able to express this "required data" declaration on its own. Unfortunately I'm not sure how to do that. I see there is a nightly feature in std::any that implements Provider and Demand: https://doc.rust-lang.org/std/any/index.html#provider-and-demand. This sort of sounds like what I want, but I have to use stable Rust, and I figure I must be missing something: surely there is an easier way to do this without going rogue with semi-dynamic typing.
tl;dr: How can I inspect what types the arguments are for a function (keeping in mind that the input might be more complex than just one thing, such as a struct or tuple) at runtime from outside the trait implementer? Alternatively, is there a better way to design this code that would eliminate the need for this kind of reflection?

Your problems start way earlier than you mention:
checkers: Vec<Box<dyn Checker>>
This is an incomplete type. The associated type Input means that Checker<Input = ApiData> and Checker<Input = MetricsData> are incompatible. How would you call checkers[0].check(input)? What type would input be? If you want a collection of "checkers" then you'll need a unified API, where the arguments to .check() are all the same.
I would suggest a different route altogether: Instead of providing the input, provide a type that can retrieve the input that they ask for. That way there's no need to coordinate what type the checkers will ask for in a type-safe way, it'll be inherent to the methods the checkers themselves call. And if your primary concern is repeatedly retrieving the same data for different checkers, then all you need to do is implement caching in the provider. Same with retry logic.
Here's my suggestion:
struct DataProvider { /* cached api and metrics */ }
impl DataProvider {
    fn fetch_api_data(&mut self) -> anyhow::Result<ApiData> { todo!() }
    fn fetch_metrics_data(&mut self) -> anyhow::Result<MetricsData> { todo!() }
}
#[async_trait::async_trait]
trait Checker {
    async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>>;
}
struct ApiAndMetricsChecker;
#[async_trait::async_trait]
impl Checker for ApiAndMetricsChecker {
    async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>> {
        let _api_data = data.fetch_api_data()?;
        let _metrics_data = data.fetch_metrics_data()?;
        // do something with api and metrics data
        todo!()
    }
}
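To make the caching idea concrete, here is a minimal sketch of what the provider's internals could look like. It is only an illustration under several assumptions: it is synchronous and uses only the standard library (no async_trait or anyhow), the fetches are stubs, check results are plain strings, and the `api_fetches`/`metrics_fetches` counters and the `run_all` helper are hypothetical names added just to make the caching observable.

```rust
#[derive(Clone)]
struct ApiData {
    some_response: String,
}

#[derive(Clone)]
struct MetricsData {
    number_of_events: u64,
}

/// Caches each kind of data after the first fetch.
#[derive(Default)]
struct DataProvider {
    api: Option<ApiData>,
    metrics: Option<MetricsData>,
    // Hypothetical counters, present only to show how often a real fetch happens.
    api_fetches: u32,
    metrics_fetches: u32,
}

impl DataProvider {
    fn fetch_api_data(&mut self) -> Result<ApiData, String> {
        if self.api.is_none() {
            self.api_fetches += 1;
            // Stand-in for the real request; retry logic would also live here.
            self.api = Some(ApiData { some_response: "ok".to_string() });
        }
        self.api.clone().ok_or_else(|| "api data missing".to_string())
    }

    fn fetch_metrics_data(&mut self) -> Result<MetricsData, String> {
        if self.metrics.is_none() {
            self.metrics_fetches += 1;
            self.metrics = Some(MetricsData { number_of_events: 42 });
        }
        self.metrics.clone().ok_or_else(|| "metrics data missing".to_string())
    }
}

trait Checker {
    fn check(&self, data: &mut DataProvider) -> Result<Vec<String>, String>;
}

struct ApiChecker;
impl Checker for ApiChecker {
    fn check(&self, data: &mut DataProvider) -> Result<Vec<String>, String> {
        let api = data.fetch_api_data()?;
        Ok(vec![format!("api response: {}", api.some_response)])
    }
}

struct ApiAndMetricsChecker;
impl Checker for ApiAndMetricsChecker {
    fn check(&self, data: &mut DataProvider) -> Result<Vec<String>, String> {
        let api = data.fetch_api_data()?;
        let metrics = data.fetch_metrics_data()?;
        Ok(vec![format!("{} with {} events", api.some_response, metrics.number_of_events)])
    }
}

/// Runs every checker against the same provider so fetched data is shared.
fn run_all(checkers: &[Box<dyn Checker>], data: &mut DataProvider) -> Result<Vec<String>, String> {
    let mut results = Vec::new();
    for checker in checkers {
        results.extend(checker.check(data)?);
    }
    Ok(results)
}

fn main() {
    let checkers: Vec<Box<dyn Checker>> = vec![Box::new(ApiChecker), Box::new(ApiAndMetricsChecker)];
    let mut data = DataProvider::default();
    let results = run_all(&checkers, &mut data).expect("stub fetches do not fail");
    println!("{:?}, api fetched {} time(s)", results, data.api_fetches);
}
```

Both checkers ask for the API data, but the provider only performs the underlying fetch once. Cloning the cached value keeps the borrow of the provider short; returning references or `Arc`s would be an alternative if the data is large.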

Related

Type handling in the trait object

I'm new to Rust and I recently ran into a problem with traits.
I have a trait that is used as the source of a message and is stored in a structure as a Box trait object. I simplified my logic and the code looks something like this.
#[derive(Debug)]
enum Message {
    MessageTypeA(i32),
    MessageTypeB(f32),
}
enum Config {
    ConfigTypeA,
    ConfigTypeB,
}
trait Source {
    fn next(&mut self) -> Message;
}
struct SourceA;
impl Source for SourceA {
    fn next(&mut self) -> Message {
        Message::MessageTypeA(1)
    }
}
struct SourceB;
impl Source for SourceB {
    fn next(&mut self) -> Message {
        Message::MessageTypeB(1.1)
    }
}
struct Test {
    source: Box<dyn Source>,
}
impl Test {
    fn new(config: Config) -> Self {
        Test {
            source: match config {
                Config::ConfigTypeA => Box::new(SourceA{}),
                Config::ConfigTypeB => Box::new(SourceB{}),
            }
        }
    }
    fn do_sth(&mut self) -> String {
        match self.source.next() {
            Message::MessageTypeA(a) => format!("a is {:?}", a),
            Message::MessageTypeB(b) => format!("b is {:?}", b),
        }
    }
    fn do_sth_else(&mut self, message: Message) -> String {
        match message {
            Message::MessageTypeA(a) => format!("a is {:?}", a),
            Message::MessageTypeB(b) => format!("b is {:?}", b),
        }
    }
}
Different types of Source return different types of Message; the Test structure needs to create the corresponding trait object according to config and call next() in the do_sth function.
As you can see, there are two parallel enum types, Config and Message, which feels like a strange usage to me, but I can't say what exactly is strange about it.
I tried to use an associated type on the trait, but then I need to specify the associated type when I declare the Test structure, like source: Box<dyn Source<Item=xxxx>>, and I don't actually know the exact type when creating the struct object.
Then I tried to use a generic type parameter, but because of the requirements of the calling code, Test cannot be generic.
So please help me: is there a more elegant or idiomatic solution to this situation?
So what might be going on here is that you're misusing the idea of trait objects. The whole idea of a trait object is that you want to write code that relies purely on its interface. As soon as you find yourself checking for the underlying type of a trait object, that should raise a red flag.
Of course in your example you're not using any sort of weird run-time type checking; instead, you're checking the type implicitly via the enums.
Still, you recognize this as problematic. First, it becomes very clumsy when you try to add another variant to your sources, because now you have to go and add that to your message and config enums as well. Second, you then have to add that handling logic to everywhere the trait object is used. And finally, the type system "lies" a bit. It seems to me that source A will only ever send messages of the first variant and source B will only ever send messages of the second variant, and that's how we're telling them apart.
So what's a way out here?
First, trait objects should be designed such that they can be used without having to know which implementation we're dealing with. Traits represent roles that structs can play, and code that uses trait objects says "I'm happy to work with anyone who can play that role".
If your code isn't happy to work with anyone who can play that trait's role, there's a flaw in the design.
It would be good to know how you are, in general, processing the messages returned by the sources.
For example, does it matter for the rest of your program that source A only ever returns integers and source B only ever returns floats? And if so, what is it about that distinction that matters, and could it be abstracted behind a trait?
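As one possible illustration of "abstracting it behind a trait", here is a sketch that assumes the only thing the rest of the program needs from a message is a textual description. That assumption may not hold for the real code. The `Message` trait with its `describe` method and the `IntMessage`/`FloatMessage` types are hypothetical names introduced for this sketch; they replace the `Message` enum from the question.

```rust
/// What consumers need from a message. Here that is assumed to be
/// only a description; a real program would put its own needs here.
trait Message {
    fn describe(&self) -> String;
}

struct IntMessage(i32);
struct FloatMessage(f32);

impl Message for IntMessage {
    fn describe(&self) -> String {
        format!("a is {:?}", self.0)
    }
}

impl Message for FloatMessage {
    fn describe(&self) -> String {
        format!("b is {:?}", self.0)
    }
}

trait Source {
    fn next(&mut self) -> Box<dyn Message>;
}

struct SourceA;
impl Source for SourceA {
    fn next(&mut self) -> Box<dyn Message> {
        Box::new(IntMessage(1))
    }
}

struct SourceB;
impl Source for SourceB {
    fn next(&mut self) -> Box<dyn Message> {
        Box::new(FloatMessage(1.1))
    }
}

enum Config {
    ConfigTypeA,
    ConfigTypeB,
}

struct Test {
    source: Box<dyn Source>,
}

impl Test {
    fn new(config: Config) -> Self {
        let source: Box<dyn Source> = match config {
            Config::ConfigTypeA => Box::new(SourceA),
            Config::ConfigTypeB => Box::new(SourceB),
        };
        Test { source }
    }

    // No match on the message variant: the message knows how to describe itself.
    fn do_sth(&mut self) -> String {
        self.source.next().describe()
    }
}

fn main() {
    println!("{}", Test::new(Config::ConfigTypeA).do_sth());
    println!("{}", Test::new(Config::ConfigTypeB).do_sth());
}
```

With this shape, adding a third source means adding one source type and one message type; `Test` does not change. Whether it fits depends on the answers to the questions above.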

How to use PalletB to save record from PalletA without PalletA knowing anything about internals of the saving in substrate and rust

I'd like to save a record from PalletA in PalletB by simply passing the raw data and waiting for the return.
I have tried the following:
// ./PalletB/lib.rs
pub trait PutInStorage {
    fn put_rule_in_storage(value: u32);
}
impl<T: Trait> PutInStorage for Module<T> {
    fn put_rule_in_storage(value: u32) {
        SimpleCounter::put(value);
    }
}
then in
// ./PalletA/lib.rs
use palletB::{PutInStorage, Trait as PalletBTrait};
///The pallet's configuration trait.
pub trait Trait: system::Trait + PalletBTrait {
/// The overarching event type.
type Event: From<Event<Self>> + Into<<Self as system::Trait>::Event>;
type ExternalStorage: PutInStorage;
}
then I added the definition to the runtime like this:
// ./runtime/lib.rs
// near the construct_runtime macro
impl palletA::Trait for Runtime {
type Event = Event;
type ExternalStorage = palletB::Module<Runtime>;
}
So far this passes the check but not the test. The test configuration for the trait is like this:
use palletB::{PutInStorage, Trait as PalletBTrait};
impl Trait for Test {
type Event = ();
type ExternalStorage = PutInStorage;
}
and this fails with the:
type ExternalRulesStorage = PutInStorage;
^^^^^^^^^^^^ help: use `dyn`: `dyn PutInStorage`
impl Trait for Test
------------------- in this `impl` item
type Event = ();
type ExternalRulesStorage = PutInStorage;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
type ExternalRulesStorage = PutInStorage;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `pallet_rules::PutInStorage` cannot be made into an object
I tried all the suggestions the Rust compiler gives me, but without any luck. Before someone asks why I need this in my test: a dispatchable fn in decl_module! checks whether a certain record exists before it starts processing and saving its own records, so it depends on that record existing.
To make the compiler happy, your test config must also have an instance of PalletB, or anything else that implements PutInStorage.
Similar to what you've done already in ./runtime/lib.rs:
impl Trait for Test {
type Event = ();
type ExternalStorage = palletB::Module<Test>;
}
Note that now struct Test is the one playing the role of Runtime. I think this is the only thing that you are missing.
That being said, you seem to be on the wrong track in the overall design.
PalletA already depends on PalletB. Given that you also have a trait (PutInStorage) to link the two, this is not a good design. Generally, you should always choose one of the following:
The two pallets depend on one another. In this case you don't need traits. If one needs to put something into the storage of the other, you just do it directly. In your example, I assume that PalletB has a storage item called pub Foo: u32 in decl_storage and PutInStorage writes to this. If this is the case, there's no need for a trait. From PalletA you can just say: palletB::Foo::put(value).
Note that this approach should be chosen with care, otherwise you might end up with a lot of pallets depending on one another, which is not good.
You decide that your pallets do NOT depend on one another, in which case you use some trait like PutInStorage. Your code seems to be aligned with this approach, except that PalletA's Trait should then be defined as pub trait Trait: system::Trait. There's no need to depend on PalletB here, and of course you can wipe it from Cargo.toml as well.
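The second option can be sketched in plain Rust, without Substrate. This is an analogy only: `Config`, `ModuleA`, `MockStorage` and the thread-local `STORED` cell are hypothetical stand-ins and not Substrate APIs. It shows the same rule the compiler error was pointing at: the associated type has to be set to a concrete type that implements PutInStorage, never to the trait name itself.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

// The only thing "pallet A" knows about storage: this trait.
trait PutInStorage {
    fn put_rule_in_storage(value: u32);
}

// Stand-in for PalletA's configuration trait.
trait Config {
    type ExternalStorage: PutInStorage;
}

// Stand-in for PalletA's module; it never names a concrete storage type.
struct ModuleA<T: Config>(PhantomData<T>);

impl<T: Config> ModuleA<T> {
    fn save(value: u32) {
        T::ExternalStorage::put_rule_in_storage(value);
    }
}

// Test side: a mock that records the last stored value.
thread_local! {
    static STORED: Cell<Option<u32>> = Cell::new(None);
}

struct MockStorage;

impl PutInStorage for MockStorage {
    fn put_rule_in_storage(value: u32) {
        STORED.with(|s| s.set(Some(value)));
    }
}

fn stored() -> Option<u32> {
    STORED.with(|s| s.get())
}

// The test "runtime" picks a concrete implementor, not the trait.
struct Test;

impl Config for Test {
    type ExternalStorage = MockStorage;
}

fn main() {
    ModuleA::<Test>::save(7);
    println!("stored: {:?}", stored());
}
```

In the real runtime the associated type would be set to PalletB's module instead of the mock, and PalletA's crate would not need PalletB as a dependency at all.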

How to offer an API that stores values of different types and can return them with the original type restored?

I want to offer a safe API like the FooManager below. It should be able to store arbitrary user-defined values that implement a trait Foo. It should also be able to hand them back later, not as a trait object (Box<dyn Foo>) but as the original type (Box<T> where T: Foo). At least conceptually it should be possible to offer this as a safe API by using generic handles (Handle<T>), see below.
Additional criteria:
The solution should work in stable Rust (internal usage of unsafe blocks is perfectly okay though).
I don't want to modify the trait Foo, as e.g. suggested in How to get a reference to a concrete type from a trait object?. It should work without adding a method as_any(). Reasoning: Foo shouldn't have any knowledge about the fact that it might be stored in containers and be restored to the actual type.
trait Foo {}
struct Handle<T> {
    // ...
}
struct FooManager {
    // ...
}
impl FooManager {
    // A real-world API would complain if the value is already stored.
    pub fn keep_foo<T: Foo>(&mut self, foo: Box<T>) -> Handle<T> {
        // ...
    }
    // In a real-world API this would return an `Option`.
    pub fn return_foo<T: Foo>(&mut self, handle: Handle<T>) -> Box<T> {
        // ...
    }
}
I came up with this (Rust Playground), but I'm not sure whether there's a better way, or even whether it's safe. What do you think of that approach?
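For comparison, here is one way such an API could be sketched in safe, stable Rust using std::any::Any. This is not the playground code referred to above, whose contents are not shown here. It assumes stored types are 'static, and it has a notable limitation: while a value is stored as Box<dyn Any>, the manager cannot call Foo methods on it. `Apple` and `Pear` are hypothetical example types.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;

trait Foo {}

/// Typed ticket for a stored value; `T` records the original type.
struct Handle<T> {
    id: u64,
    _marker: PhantomData<fn() -> T>,
}

#[derive(Default)]
struct FooManager {
    next_id: u64,
    // Values are stored type-erased; `Foo` itself needs no `as_any` method.
    stored: HashMap<u64, Box<dyn Any>>,
}

impl FooManager {
    pub fn keep_foo<T: Foo + 'static>(&mut self, foo: Box<T>) -> Handle<T> {
        let id = self.next_id;
        self.next_id += 1;
        self.stored.insert(id, foo);
        Handle { id, _marker: PhantomData }
    }

    /// Returns `None` if nothing is stored under the handle's id, or if the
    /// stored value is not a `T` (for example a handle from another manager).
    pub fn return_foo<T: Foo + 'static>(&mut self, handle: Handle<T>) -> Option<Box<T>> {
        self.stored.remove(&handle.id)?.downcast::<T>().ok()
    }
}

// Hypothetical user-defined types.
struct Apple(u32);
impl Foo for Apple {}

struct Pear(String);
impl Foo for Pear {}

fn main() {
    let mut manager = FooManager::default();
    let apple_handle = manager.keep_foo(Box::new(Apple(3)));
    let pear_handle = manager.keep_foo(Box::new(Pear("conference".to_string())));

    let apple: Box<Apple> = manager.return_foo(apple_handle).expect("apple was stored");
    let pear: Box<Pear> = manager.return_foo(pear_handle).expect("pear was stored");
    println!("{} {}", apple.0, pear.0);
}
```

Because `return_foo` consumes the handle and `Handle` is not `Clone`, a value cannot be requested twice through the same handle. The runtime downcast is what keeps the API safe if a handle from a different manager is passed in.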

How to Box a trait that has associated types?

I'm very new to Rust so I may have terminology confused.
I want to use the hashes crates to do some hashing and I want to dynamically pick which algorithm (sha256, sha512, etc.) to use at runtime.
I'd like to write something like this:
let hasher = match "one of the algorithms" {
"sha256" => Box::new(Sha256::new()) as Box<Digest>,
"sha512" => Box::new(Sha512::new()) as Box<Digest>
// etc...
};
I sort of get that that doesn't work because the associated types required by Digest aren't specified. If I attempt to fill them in:
"sha256" => Box::new(Sha256::new()) as Box<Digest<OutputSize = U32, BlockSize = U64>>,
I'm left with an error: the trait 'digest::Digest' cannot be made into an object. I think this approach will fail anyway because match will be returning slightly different types in cases where different algorithms have different associated types.
Am I missing something obvious? How can I dynamically create an instance of something that implements a trait and then hold on to that thing and use it through the trait interface?
The message refers to object safety (longer article). The Digest trait has two incompatibilities:
It uses associated types (this can be worked around by explicitly setting all type parameters to values compatible for all Digest objects).
It has a method (fn result(self) -> …) taking self by value. You won't be able to call it, which ruins usability of this trait.
Once a trait object is created, information about its subtype-specific features such as memory layout or associated types is erased. All calls to the trait object's methods are done via a vtable pointer. This means they all must be compatible, and Rust can't allow you to call any methods that could vary in these aspects.
A workaround is to create your custom wrapper trait/adapter that is object-compatible. I'm not sure if that's the best implementation, but it does work:
trait Digest {
    type Assoc;
    fn result(self);
}
struct Sha;
impl Digest for Sha {
    type Assoc = u8;
    fn result(self) {}
}
///////////////////////////////////////////
trait MyWrapper {
    fn result(&mut self); // can't be self/Sized
}
impl<T: Digest> MyWrapper for Option<T> {
    fn result(&mut self) {
        // Option::take() gives owned from non-owned
        self.take().unwrap().result()
    }
}
fn main() {
    // `dyn` is required for trait objects in current Rust editions.
    let mut digest: Box<dyn MyWrapper> = Box::new(Some(Sha));
    digest.result();
}

How to provide type-only argument to a function?

In my short Rust experience I ran into this pattern several times, and I'm not sure if the way I solve it is actually adequate...
Let's assume I have some trait that looks like this:
trait Container {
    type Item;
    fn describe_container() -> String;
}
And some struct that implements this trait:
struct ImAContainerType;
struct ImAnItemType;
impl Container for ImAContainerType {
type Item = ImAnItemType;
fn describe_container() -> String { "some container that contains items".to_string() }
}
This may be a container that knows the type of the items it contains, as in this example, or, as another example, a request that knows what type of response should be returned, etc.
And now I find myself in a situation where I need to implement a function that takes an item (the associated type) and invokes a static function of the container (the parent trait). This is the first naive attempt:
fn describe_item_container<C: Container>(item: C::Item) -> String {
C::describe_container()
}
This does not compile, because associated types are not injective, and Item can have several possible Containers, so this whole situation is ambiguous. I need to somehow provide the actual Container type, but without providing any container data. I may not have the container data itself at all when I invoke this function!
In search for a solution, I find the documentation for std::marker::PhantomData. It says:
PhantomData allows you to describe that a type acts as if it stores a value of type T, even though it does not.
This has to be Rust's replacement for Haskell's Proxy type, right? Let's try to use it:
fn describe_item_container<C: Container>(container: PhantomData<C>, item: C::Item) -> String {
C::describe_container()
}
let s = describe_item_container(PhantomData::<PhantomData<ImAContainerType>>, ImAnItemType);
println!("{}", s);
Compiling... Error:
error[E0277]: the trait bound `std::marker::PhantomData<ImAContainerType>: Container` is not satisfied
I ask #rust-beginners and get a response: PhantomData is not meant to be used that way at all! I was also advised to simply add a backward associated-type link from Item to Container. Something like this:
trait Item {
type C: Container;
}
fn describe_item_container<I: Item>(item: I) -> String {
I::C::describe_container()
}
It should work, but makes things much more complicated (especially for cases when item can be placed in different container kinds)...
After a lot more experimentation, I do the following change and everything compiles and works correctly:
let s = describe_item_container(PhantomData::<ImAContainerType>, ImAnItemType);
println!("{}", s);
The change is ::<PhantomData<ImAContainerType>> to ::<ImAContainerType>.
Playground example.
It works, but now I'm completely confused. Is this the correct way to use PhantomData? Why does it work at all? Is there some other, better way to provide type-only argument to a function in Rust?
EDIT: There is some oversimplification in my example, because in that particular case it would be easier to just invoke ImAContainerType::describe_container(). Here is a more complicated case, where the function actually does something with an Item and still requires container type information.
If you want to pass a type argument to a function, you can just do it. You don't have to leave it out to be inferred.
This is how it looks for your second example (playground):
fn pack_item<C: Container>(item: C::Item) -> ItemPacket {
    ItemPacket {
        container_description: C::describe_container(),
        _payload: item.get_payload(),
    }
}
fn main() {
    let s = pack_item::<ImAContainerType>(ImAnItemType);
    println!("{}", s.container_description);
    let s = pack_item::<ImAnotherContainerType>(ImAnItemType);
    println!("{}", s.container_description);
}
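The snippet above relies on definitions from the linked playground. A self-contained version of the same idea, applied to the original describe_item_container function from the question, could look like the following sketch. `ImAnotherContainerType` and its description string are assumptions made for illustration.

```rust
trait Container {
    type Item;
    fn describe_container() -> String;
}

struct ImAnItemType;
struct ImAContainerType;
struct ImAnotherContainerType;

impl Container for ImAContainerType {
    type Item = ImAnItemType;
    fn describe_container() -> String {
        "some container that contains items".to_string()
    }
}

// A second container with the same Item type: this is why `C` cannot
// be inferred from the argument alone.
impl Container for ImAnotherContainerType {
    type Item = ImAnItemType;
    fn describe_container() -> String {
        "another container that contains items".to_string()
    }
}

fn describe_item_container<C: Container>(_item: C::Item) -> String {
    C::describe_container()
}

fn main() {
    // The container type is passed explicitly with the turbofish syntax.
    let first = describe_item_container::<ImAContainerType>(ImAnItemType);
    let second = describe_item_container::<ImAnotherContainerType>(ImAnItemType);
    println!("{}", first);
    println!("{}", second);
}
```

No PhantomData argument is needed: the function definition itself was always fine, and only the call site has to say which container is meant.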
