Type handling in the trait object

Type handling in the trait object - rust

I'm new to rust and I recently ran into a problem with trait
I have a trait that is used as the source of a message and is stored in a structure as a Box trait object. I simplified my logic and the code looks something like this.
#[derive(Debug)]
enum Message {
MessageTypeA(i32),
MessageTypeB(f32),
}
enum Config {
ConfigTypeA,
ConfigTypeB,
}
trait Source {
fn next(&mut self) -> Message;
}
struct SourceA;
impl Source for SourceA {
fn next(&mut self) -> Message {
Message::MessageTypeA(1)
}
}
struct SourceB;
impl Source for SourceB {
fn next(&mut self) -> Message {
Message::MessageTypeB(1.1)
}
}
struct Test {
source: Box<dyn Source>,
}
impl Test {
fn new(config: Config) -> Self {
Test {
source: match config {
Config::ConfigTypeA => Box::new(SourceA{}),
Config::ConfigTypeB => Box::new(SourceB{}),
}
}
}
fn do_sth(&mut self) -> String {
match self.source.next() {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
fn do_sth_else(&mut self, message: Message) -> String {
match message {
Message::MessageTypeA(a) => format!("a is {:?}", a),
Message::MessageTypeB(b) => format!("b is {:?}", b),
}
}
}
Different types of Source return different types of Message, the Test structure needs to create the corresponding trait object according to config and call next() in the do_sth function.
So you can see two enum types Config and Message, which I feel is a strange usage, but I don't know what's strange about it.
I tried to use trait association type, but then I need to specify the association type when I declare the Test structure like source: Box<dyn Source<Item=xxxx>> but I don't actually know the exact type when creating the struct object.
Then I tried to use Generic type, but because of the need of the upper code, Test cannot use Generic.
So please help me, is there a more elegant or rustic solution to this situation?

So what might be going on here is that you're mis-using the idea of trait objects. The whole idea of a trait object is that you want to write code that relies purely on its interface. As soon as you find yourself checking for the underlying type of a trait object, that should raise a red flag.
Of course in your example you're not using any sort of weird run-time type checking; instead, you're checking the type implicitly via the enums.
Still, you recognize this as problematic. First, it becomes very clumsy when you try to add another variant to your sources, because now you have to go and add that to your message and config enums as well. Second, you then have to add that handling logic to everywhere the trait object is used. And finally, the type system "lies" a bit. It seems to me that source A will only ever send messages of the first variant and source B will only ever send messages of the second variant, and that's how we're telling them apart.
So what's a way out here?
First, trait objects should be designed such that they can be used without having to know which implementation we're dealing with. Traits represent roles that structs can play, and code that uses trait objects says "I'm happy to work with anyone who can play that role".
If your code isn't happy to work with anyone who can play that trait's role, there's a flaw in the design.
It would be good to know how you are, in general, processing the messages returned by the sources.
For example, does it matter for the rest of your program that source A only ever returns integers and source B only ever returns floats? And if so, what about it is it that matters, and could that be abstracted behind a trait?

Related

How do I inspect function arguments at runtime in Rust?

Say I have a trait that looks like this:
use std::{error::Error, fmt::Debug};
use super::CheckResult;
/// A Checker is a component that is responsible for checking a
/// particular aspect of the node under investigation, be that metrics,
/// system information, API checks, load tests, etc.
#[async_trait::async_trait]
pub trait Checker: Debug + Sync + Send {
type Input: Debug;
/// This function is expected to take input, whatever that may be,
/// and return a vec of check results.
async fn check(&self, input: &Self::Input) -> anyhow::Result<Vec<CheckResult>>;
}
And say I have two implementations of this trait:
pub struct ApiData {
some_response: String,
}
pub MetricsData {
number_of_events: u64,
}
pub struct ApiChecker;
impl Checker for ApiChecker {
type Input = ApiData;
// implement check function
}
pub struct MetricsChecker;
impl Checker for MetricsChecker {
type Input = MetricsData;
// implement check function
}
In my code I have a Vec of these Checkers that looks like this:
pub struct MyServer {
checkers: Vec<Box<dyn Checker>>,
}
What I want to do is figure out, based on what Checkers are in this Vec, what data I need to fetch. For example, if it just contained an ApiChecker, I would only need to fetch the ApiData. If both ApiChecker and MetricsChecker were there, I'd need both ApiData and MetricsData. You can also imagine a third checker where Input = (ApiData, MetricsData). In that case I'd still just need to fetch ApiData and MetricsData once.
I imagine an approach where the Checker trait has an additional function on it that looks like this:
fn required_data(&self) -> HashSet<DataId>;
This could then return something like [DataId::Api, DataId::Metrics]. I would then run this for all Checkers in my vec and then I'd end up a complete list of data I need to get. I could then do some complicated set of checks like this:
let mut required_data = HashSet::new();
for checker in checkers {
required_data.union(&mut checker.required_data());
}
let api_data: Option<ApiData> = None;
if required_data.contains(DataId::Api) {
api_data = Some(get_api_data());
}
And so on for each of the data types.
I'd then pass them into the check calls like this:
api_checker.check(
api_data.expect("There was some logic error and we didn't get the API data even though a Checker declared that it needed it")
);
The reasons I want to fetch the data outside of the Checkers is:
To avoid fetching the same data multiple times.
To support memoization between unrelated calls where the arguments are the same (this could be done inside some kind of Fetcher trait implementation for example).
To support generic retry logic.
By now you can probably see that I've got two big problems:
The declaration of what data a specific Checker needs is duplicated, once in the function signature and again from the required_data function. This naturally introduces bug potential. Ideally this information would only be declared once.
Similarly, in the calling code, I have to trust that the data that the Checkers said they needed was actually accurate (the expect in the previous snippet). If it's not, and we didn't get data we needed, there will be problems.
I think both of these problems would be solved if the function signature, and specifically the Input associated type, was able to express this "required data" declaration on its own. Unfortunately I'm not sure how to do that. I see there is a nightly feature in any that implements Provider and Demand: https://doc.rust-lang.org/std/any/index.html#provider-and-demand. This sort of sounds like what I want, but I have to use stable Rust, plus I figure I must be missing something and there is an easier way to do this without going rogue with semi dynamic typing.
tl;dr: How can I inspect what types the arguments are for a function (keeping in mind that the input might be more complex than just one thing, such as a struct or tuple) at runtime from outside the trait implementer? Alternatively, is there a better way to design this code that would eliminate the need for this kind of reflection?

Your problems start way earlier than you mention:
checkers: Vec<Box<dyn Checker>>
This is an incomplete type. The associated type Input means that Checker<Input = ApiData> and Checker<Input = MetricsData> are incompatible. How would you call checkers[0].check(input)? What type would input be? If you want a collection of "checkers" then you'll need a unified API, where the arguments to .check() are all the same.
I would suggest a different route altogether: Instead of providing the input, provide a type that can retrieve the input that they ask for. That way there's no need to coordinate what type the checkers will ask for in a type-safe way, it'll be inherent to the methods the checkers themselves call. And if your primary concern is repeatedly retrieving the same data for different checkers, then all you need to do is implement caching in the provider. Same with retry logic.
Here's my suggestion:
struct DataProvider { /* cached api and metrics */ }
impl DataProvider {
fn fetch_api_data(&mut self) -> anyhow::Result<ApiData> { todo!() }
fn fetch_metrics_data(&mut self) -> anyhow::Result<MetricsData> { todo!() }
}
#[async_trait::async_trait]
trait Checker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>>;
}
struct ApiAndMetricsChecker;
#[async_trait::async_trait]
impl Checker for ApiAndMetricsChecker {
async fn check(&self, data: &mut DataProvider) -> anyhow::Result<Vec<CheckResult>> {
let _api_data = data.fetch_api_data()?;
let _metrics_data = data.fetch_metrics_data()?;
// do something with api and metrics data
todo!()
}
}

How can I implement `From<some trait's associated type>?`

I'm writing a new crate, and I want it to be useable with any implementation of a trait (defined in another crate). The trait looks something like this:
pub trait Trait {
type Error;
...
}
I have my own Error type, but sometimes I just want to forward the underlying error unmodified. My instinct is to define a type like this:
pub enum Error<T: Trait> {
TraitError(T::Error),
...
}
This is similar to the pattern encouraged by thiserror, and appears to be idiomatic. It works fine, but I also want to use ? in my implementation, so I need to implement From:
impl<T: Trait> From<T::Error> for Error<T> {
fn from(e: T::Error) -> Self { Self::TraitError(e) }
}
That fails, because it conflicts with impl<T> core::convert::From<T> for T. I think I understand why — some other implementor of Trait could set type Error = my_crate::Error such that both impls would apply — but how else can I achieve similar semantics?
I've looked at a few other crates, and they seem to handle this by making their Error (or equivalent) generic over the error type itself, rather than the trait implementation. That works, of course, but:
until we have inherent associated types, it's much more verbose. My T actually implements multiple traits, each with their own Error types, so I'd now have to return types like Result<..., Error<<T as TraitA>::Error, <T as TraitB>::Error>> etc;
it's arguably less expressive (because the relationship to Trait is lost).
Is making my Error generic over the individual types the best (most idiomatic) option today?

Instead of implementing From for your Error enum, consider instead using Result::map_err in combination with ? to specify which variant to return.
This works even for enums generic over a type using associated types, as such:
trait TraitA {
type Error;
fn do_stuff(&self) -> Result<(), Self::Error>;
}
trait TraitB {
type Error;
fn do_other_stuff(&self) -> Result<(), Self::Error>;
}
enum Error<T: TraitA + TraitB> {
DoStuff(<T as TraitA>::Error),
DoOtherStuff(<T as TraitB>::Error),
}
fn my_function<T: TraitA + TraitB>(t: T) -> Result<(), Error<T>> {
t.do_stuff().map_err(Error::DoStuff)?;
t.do_other_stuff().map_err(Error::DoOtherStuff)?;
Ok(())
}
On the playground
Here the important bits are that Error has no From implementations (aside from blanket ones), and that the variant is specified using map_err. This works as Error::DoStuff can be interpreted as a fn(<T as TraitA>::Error) -> Error when passed to map_err. Same happens with Error::DoOtherStuff.
This approach is scaleable with however many variants Error has and whether or not they are the same type. It might also be clearer to someone reading the function, as they can figure out that a certain error comes from a certain place without needing to check From implementations and where the type being converted from appears in the function.

How to offer an API that stores values of different types and can return them with the original type restored?

I want to offer a safe API like below FooManager. It should be able to store arbitrary user-defined values that implement a trait Foo. It should also be able to hand them back later - not as trait object (Box<dyn Foo>) but as the original type (Box<T> where T: Foo). At least conceptually it should be possible to offer this as a safe API, by using generic handles (Handle<T>), see below.
Additional criteria:
The solution should work in stable Rust (internal usage of unsafe blocks is perfectly okay though).
I don't want to modify the trait Foo, as e.g. suggested in How to get a reference to a concrete type from a trait object?. It should work without adding a method as_any(). Reasoning: Foo shouldn't have any knowledge about the fact that it might be stored in containers and be restored to the actual type.
trait Foo {}
struct Handle<T> {
// ...
}
struct FooManager {
// ...
}
impl FooManager {
// A real-world API would complain if the value is already stored.
pub fn keep_foo<T: Foo>(&mut self, foo: Box<T>) -> Handle<T> {
// ...
}
// In a real-world API this would return an `Option`.
pub fn return_foo<T: Foo>(&mut self, handle: Handle<T>) -> Box<T> {
// ...
}
}
I came up with this (Rust Playground) but not sure if there's a better way or if it's safe even. What do you think of that approach?

Best practice to return a Result<_, impl Error> and not a Result<_, &str> in Rust?

Is it ok practice to have this style Result?
fn a() -> Result<u32, &'static str>
And then what is the purpose of the Error trait? https://doc.rust-lang.org/std/error/trait.Error.html
Is an impl Error Result better practice?
impl Error for MyError {..... }
fn a() -> Result<u32, MyError>

In short: No, it's not okay. String as error throws away information about details and cause, making errors useless for callers as they won't be able to inspect and possibly recover from it.
In case you just need to fill Error parameter with something, create a unit struct. It's not much useful, but it's also not as volative as string. And you can easily distinguish foo::SomeError from bar::SomeError.
#[derive(Debug)]
pub struct SomeError; // No fields.
In case you can enumerate error variants, use enum.
It is also sometimes useful to "include" other errors into it.
#[derive(Debug)]
pub enum PasswordError {
Empty,
ToShort,
NoDigits,
NoLetters,
NoSpecials
}
#[derive(Debug)]
pub enum ConfigLoadError {
InvalidValues,
DeserializationError(serde::de::Error),
IoError(std::io::Error),
}
Nobody stops you from using structs.
They're particularly useful when you intentionaly want to hide some information from caller (In contrast to enums whose variants always have public visibility). E.g. caller has nothing to do with error message, but can use kind to handle it:
pub enum RegistrationErrorKind {
InvalidName { wrong_char_idx: usize },
NonUniqueName,
WeakPassword,
DatabaseError(db::Error),
}
#[derive(Debug)]
pub struct RegistrationError {
message: String, // Private field
pub kind: RegistrationErrorKind, // Public field
}
impl Error - existential type - makes no sense here. You can't return different error types with it in error place, if this was your intent. And opaque errors are not much useful, just like strings.
std::error::Error trait ensures that your SomeError type has implementation for std::fmt::{Display, Debug} (For displaying error to user and developer, correspondingly) and provides some useful methods like source (This returns the cause of this error); is, downcast, downcast_ref, downcast_mut. Last 4 are for error type erasure.
Error type erasure
Error type erasure has it's tradeoffs, but it's also worth mentioning.
It's also especially useful when writing somewht high-level application code. But in case of libraries you should think twice before deciding to use this approach, because it will make your library unusable with 'no_std'.
Say you have some function with non-trivial logic that can return values of some error types, not exactly one. In this case you can use (But don't abuse) error type erasure:
use std::error::Error;
use std::fmt;
use std::fs;
use std::io::Error as IoError;
use std::net::AddrParseError;
use std::net::Ipv4Addr
use std::path::Path;
// Error for case where file contains '127.0.0.1'
#[derive(Debug)]
pub struct AddressIsLocalhostError;
// Display implementation is required for std::error::Error.
impl fmt::Display for AddressIsLocalhostError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "Address is localhost")
}
}
impl Error for AddresIsLocalhostError {} // Defaults are okay here.
// Now we have a function that takes a path and returns
// non-localhost Ipv4Addr on success.
// On fail it can return either of IoError, AddrParseError or AddressIsLocalhostError.
fn non_localhost_ipv4_from_file(path: &Path) -> Result<Ipv4Addr, Box<dyn Error + 'static>> {
// Opening and reading file may cause IoError.
// ? operator will automatically convert it to Box<dyn Error + 'static>.
// (via From trait implementation)
// This way concrete type of error is "erased": we don't know what's
// in a box, in fact it's kind of black box now, but we still can call
// methods that Error trait provides.
let content = fs::read_to_string(path)?;
// Parsing Ipv4Addr from string [slice]
// may cause another error: AddressParseError.
// And ? will convert it to to the same type: Box<dyn Error + 'static>
let addr: Ipv4Addr = content.parse()?;
if addr == Ipv4Add::new(127, 0, 0, 1) {
// Here we perform manual conversion
// from AddressIsLocalhostError
// to Box<dyn Error + 'static> and return error.
return Err(AddressIsLocalhostError.into());
}
// Everyhing is okay, returning addr.
Ok(Ipv4Addr)
}
fn main() {
// Let's try to use our function.
let maybe_address = non_localhost_ipv4_from_file(
"sure_it_contains_localhost.conf"
);
// Let's see what kind of magic Error trait provides!
match maybe_address {
// Print address on success.
Ok(addr) => println!("File was containing address: {}", addr),
Err(err) => {
// We sure can just print this error with.
// println!("{}", err.as_ref());
// Because Error implementation implies Display implementation.
// But let's imagine we want to inspect error.
// Here deref coercion implicitly converts
// `&Box<dyn Error>` to `&dyn Error`.
// And downcast_ref tries to convert this &dyn Error
// back to &IoError, returning either
// Some(&IoError) or none
if Some(err) = err.downcast_ref::<IoError>() {
println!("Unfortunately, IO error occured: {}", err)
}
// There's also downcast_mut, which does the same, but gives us
// mutable reference.
if Some(mut err) = err.downcast_mut::<AddressParseError>() {
// Here we can mutate err. But we'll only print it.
println!(
"Unfortunately, what file was cantaining, \
was not in fact an ipv4 address: {}",
err
);
}
// Finally there's 'is' and 'downcast'.
// 'is' comapres "erased" type with some concrete type.
if err.is::<AddressIsLocalhostError>() {
// 'downcast' tries to convert Box<dyn Error + 'static>
// to box with value of some concrete type.
// Here - to Box<AddressIsLocalhostError>.
let err: Box<AddressIsLocalhostError> =
Error::downcast(err).unwrap();
}
}
};
}
To summarize: errors should (I'd say - must) provide useful information to caller, besides ability to just display them, thus they should not be strings. And errors must implement Error at least to preserve more-less consistent error handling experience across all crates. All the rest depends on situation.
Caio alredy mentioned The Rust Book.
But these links might be also useful:
std::any module level API documentation
std::error::Error API documentation

For simple use-cases, a opaque error type like Result<u32, &'static str> or Result<u32, String> is enough but for more complex libraries, it is useful and even encouraged to create your own error type like a struct MyError or enum AnotherLibError, which helps you to define better your intentions. You may also want to read the Error Handling chapter of the Rust by Example book.
The Error trait, as being part of std, helps developers in a generic and centralized manner to define their own error types to describe what happened and the possible root causes (backtrace). It is currently somewhat limited but there are plans to help to improve its usability.
When you use impl Error, you are telling the compiler that you don't care about the type being returned, as long it implements the Error trait. This approach is useful when the error type is too complex or when you want to generalize over the return type. E.g.:
fn example() -> Result<Duration, impl Error> {
let sys_time = SystemTime::now();
sleep(Duration::from_secs(1));
let new_sys_time = SystemTime::now();
sys_time.duration_since(new_sys_time)
}
The method duration_since returns a Result<Duration, SystemTimeError> type but in the above method signature, you can see that for the Err part of the Result, it is returning anything that implements the Error trait.
Summarizing everything, if you read the Rust book and know what you are doing, you can choose the approach that best fits your needs. Otherwise, it is best to define your own types for your errors or use some third-party utilities like the error-chain or failure crates.

Static method in trait dynamic dispatch

Trying to get dynamic dispatch working in a trait static method but get a type must be known error.
I'm trying to achieve something like
F#
https://github.com/Thorium/SimpleCQRS-FSharp/blob/master/CommandSide/Domain.fs
C#
https://github.com/gregoryyoung/m-r/blob/master/SimpleCQRS/Domain.cs..
Is the only way to make the trait generic?
pub struct Aggregate<T: AggregateRoot>
{
pub id: Uuid,
agg: T,
changes: Vec<Box<Any>>
}
impl <T :AggregateRoot > Aggregate<T>
{
fn GetUncomittedChanges(&self) -> Vec<Box<Any>> { self.changes}
fn MarkChangesAsCommitted(&self) { self.changes.drain(..);}
}
trait AggregateRoot
{
fn new2() -> Self; //should be private
fn new(id: Uuid) -> Self;
fn LoadsFromHistory(changes : Vec<Box<Any>> ) -> Self
where Self: Sized
{
let newAgg = AggregateRoot::new2 ();
changes.iter().map( |e| newAgg.Apply(e) );
newAgg.MarkChangesAsCommitted();
newAgg
}
fn Apply<U: Any>(&self, arg: U) ;
fn GetId(&self) -> Uuid;
}
currently trying but gives 2 params expected 1 supplied.

Let's start with issues in how you asked the question, in the hopes that you will be able to ask better questions in the future. The complete error you are getting is:
<anon>:27:37: 27:52 error: the type of this value must be known in this context
<anon>:27 changes.iter().map( |e| newAgg.Apply(e) );
^~~~~~~~~~~~~~~
Note that the compiler error message shows you exactly which bit of code is at fault. It's useful to include that error when asking a question.
You've also included extraneous detail. For example, GetUncomittedChanges, id and GetId are all unused in your example. When solving a problem, you should produce an MCVE. This helps you understand the problem better and also allows people helping you to look at less code which usually results in faster turnaround.
Your code has a number of problems, but let's start at the first error:
let newAgg = AggregateRoot::new2 ();
This says "for any possible AggregateRoot, create a new one". Many concrete types can implement a trait (which is the point of traits), but the compiler needs to know how much space to allocate for a given instance. There might be a struct that takes 1 byte or 200 bytes; how much space needs to be allocated on the stack in this case?
To progress, you can use Self::new2 instead. That means to create a new instance of the current implementor.
The next error is
<anon>:20:16: 20:40 error: no method named `MarkChangesAsCommitted` found for type `Self` in the current scope
<anon>:20 newAgg.MarkChangesAsCommitted();
^~~~~~~~~~~~~~~~~~~~~~~~
You are calling a method on a concrete type from a trait implementation; this simply doesn't make any sense. What would happen if a bool implements this trait? It doesn't have a MarkChangesAsCommitted method. I don't know what you intended in this case, so I'll just delete it.
Now you get this error:
<anon>:19:9: 19:16 error: `changes` does not live long enough
<anon>:19 changes.iter().map( |e| newAgg.Apply(e) );
^~~~~~~
note: reference must be valid for the static lifetime...
<anon>:17:5: 21:6 note: ...but borrowed value is only valid for the scope of parameters for function at 17:4
That's because your method Apply expects to be given a type that implements Any. However, you are passing a &Box<Any>. Any has a lifetime bound of 'static, and that reference is not static. A straightforward change is to accept a reference to a type that implements Any:
fn Apply<U: Any>(&self, arg: &U);
Now that the code compiles, there's a number of stylistic issues to fix:
no space before :
no space after >
no space before (
no space inside ()
map should not be used for side effects
function and variable names are camel_case
most of the time, accept a &[T] instead of a Vec<T> as a function argument.
use "Egyptian" braces, except when you are using a where clause.
All together, your code looks like:
use std::any::Any;
struct Aggregate<T: AggregateRoot> {
agg: T,
changes: Vec<Box<Any>>
}
impl<T: AggregateRoot> Aggregate<T> {
fn mark_changes_as_committed(&self) { }
}
trait AggregateRoot {
fn new() -> Self;
fn load_from_history(changes: &[Box<Any>]) -> Self
where Self: Sized
{
let new_agg = Self::new();
for change in changes { new_agg.apply(change) }
new_agg
}
fn apply<U: Any>(&self, arg: &U);
}
fn main() {}
Is there a way to constrain the concrete types of the AggregateRoot to Aggregates so mark_changes can be called?
Not that I'm aware of. It sounds like you want to move mark_changes to the trait and force all implementors of the trait to implement it:
trait AggregateRoot {
fn load_from_history(changes: &[Box<Any>]) -> Self
where Self: Sized
{
let new_agg = Self::new();
for change in changes { new_agg.apply(change) }
new_agg.mark_changes_as_committed();
new_agg
}
fn mark_changes_as_committed(&self);
// ...
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string