1.) I have made a big knot in my code it seems. I defined my own structs e.g.
struct State { // some float values }
and require them to be multiplied by f64 and Complex64 values and added together member-wise. Now I tried to abstract away the f64 and Complex64 into a trait called "Weight" and I have two structs that need to implement it:
struct WeightReal {
strength: f64,
}
struct WeightComplex {
strength: num_complex::Complex<f64>,
}
Now it's more complicated as I need custom multiplication for these Weights with my struct "State" AND also with f64 itself (because I do other things as well). So I need "Weight" x State and "Weight" x f64 for both possible weight-types. Do I have to define all of these multiplications myself now? I used the derive_more-crate in the past, but I think now it's at its limits. Or I fundamentally misunderstood something here. Another question is: Do I need to define a struct here? I tried type-aliases before but I think there was an error, because I couldn't define custom multiplication on type aliases (at least it seemed so to me). It could've just been me doing it incorrectly.
The Rust way of defining multiplication / overloading of the "*"-operator somehow flies right over my head. With "cargo expand" I looked at a multiplication derived through the derive_more-crate:
impl<__RhsT: ::core::marker::Copy> ::core::ops::Mul<__RhsT> for State
where
f64: ::core::ops::Mul<__RhsT, Output = f64>,
{
type Output = State;
#[inline]
fn mul(self, rhs: __RhsT) -> State {
State {
value1: <f64 as ::core::ops::Mul<__RhsT>>::mul(self.value1, rhs),
value2: <f64 as ::core::ops::Mul<__RhsT>>::mul(self.value2, rhs),
}
}
}
If someone could explain a few of the parts here: what does the "<f64 as ::core.... "-part mean?
I understand that "__RhsT" means "Right-Hand-Side-Type", but I don't understand why it is still generic, because in this example shouldn't it be specifically f64? The third line is also puzzling me, why is it necessary?
I am really confused. The rust docs regarding multiplication are also unclear to me as they seem to be abstracted away in some macro.
There is a lot of noise in the generated code, which is fairly typical of code from macros. That's in order to reduce the chance of naming conflicts or remove ambiguity if there are several traits in scope with the same method name.
This is a bit more readable:
use std::ops::Mul;
impl<Rhs: Copy> Mul<Rhs> for State
where
f64: Mul<Rhs, Output = f64>,
{
type Output = State;
fn mul(self, rhs: Rhs) -> State {
State {
value1: <f64 as Mul<Rhs>>::mul(self.value1, rhs),
value2: <f64 as Mul<Rhs>>::mul(self.value2, rhs),
}
}
}
f64::mul(a, b) is another way to call a method, a.mul(b), while being precise about exactly which mul function you mean. That's needed because it's possible for there to be multiple possible methods with the same name. These could be inherent, from different traits, or from different parametrisations of the same trait.
Rhs is a generic parameter rather than just f64 because it's possible to implement Mul several times for the same type, using different type parameters. For example, it is reasonable to multiply an f64 by another f64, but it could also make sense to multiply by an f32, u8, i32 etc. An implementation of Mul<u8> for f64 would allow you to write 1.0f64 * 1u8 (the standard library does not provide that particular impl, but the trait permits it).
<f64 as Mul<Rhs>>::mul(a, b) is specifying to call the mul method of Mul where the left hand side is an f64, but where the right hand side, Rhs, can be any type.
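To make both points concrete (several Mul impls for one type, and the fully qualified call syntax), here is a small sketch. The Meters type and the three helper functions are made up for illustration and are not part of your code; returning a bare f64 from Meters * Meters is just a simplifying assumption.

```rust
use std::ops::Mul;

// Hypothetical newtype, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

// First parametrisation: Meters * f64 scales a length.
impl Mul<f64> for Meters {
    type Output = Meters;
    fn mul(self, rhs: f64) -> Meters {
        Meters(self.0 * rhs)
    }
}

// Second parametrisation of the same trait: Meters * Meters.
// Returning a plain f64 (an area) is a simplification.
impl Mul<Meters> for Meters {
    type Output = f64;
    fn mul(self, rhs: Meters) -> f64 {
        self.0 * rhs.0
    }
}

// Operator form: the impl is selected from the operand types.
fn scale_with_operator(len: Meters, factor: f64) -> Meters {
    len * factor
}

// Fully qualified form: names the type, the trait and its parameter.
fn scale_fully_qualified(len: Meters, factor: f64) -> Meters {
    <Meters as Mul<f64>>::mul(len, factor)
}

// The same syntax selects the other impl when Rhs is Meters.
fn area(a: Meters, b: Meters) -> f64 {
    <Meters as Mul<Meters>>::mul(a, b)
}
```

The two scale functions do exactly the same thing; the second one only spells out what the operator resolves to.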
As for your first question, it's hard to understand what you are actually attempting, but the difficulty may hint that implementing Mul is not the right thing to do in the first place. If you have several different ways to multiply, then perhaps you should just have a different method for each one. It will probably end up being clearer and simpler. There isn't a big advantage in being able to use the * operator.
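A rough sketch of what that could look like for the real-valued weight, assuming State has the two f64 fields value1 and value2 seen in the expanded code. The method names scale_state and scale_value are invented here. If you still want the operator, the Mul impls can simply forward to the named methods; WeightComplex would need its own pair, which I have left out.

```rust
use std::ops::Mul;

// Field names taken from the expanded derive output in the question.
#[derive(Debug, Clone, Copy, PartialEq)]
struct State {
    value1: f64,
    value2: f64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct WeightReal {
    strength: f64,
}

impl WeightReal {
    // Named method (hypothetical name): weight applied to every member of a State.
    fn scale_state(self, s: State) -> State {
        State {
            value1: self.strength * s.value1,
            value2: self.strength * s.value2,
        }
    }

    // Named method (hypothetical name): weight applied to a plain number.
    fn scale_value(self, x: f64) -> f64 {
        self.strength * x
    }
}

// Optional: the operator forms just forward to the named methods.
impl Mul<State> for WeightReal {
    type Output = State;
    fn mul(self, rhs: State) -> State {
        self.scale_state(rhs)
    }
}

impl Mul<f64> for WeightReal {
    type Output = f64;
    fn mul(self, rhs: f64) -> f64 {
        self.scale_value(rhs)
    }
}
```

So yes, each Weight x State and Weight x f64 combination needs its own impl (or method), but each one is only a few lines.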
In this question, an issue arose that could be solved by changing an attempt at using a generic type parameter into an associated type. That prompted the question "Why is an associated type more appropriate here?", which made me want to know more.
The RFC that introduced associated types says:
This RFC clarifies trait matching by:
Treating all trait type parameters as input types, and
Providing associated types, which are output types.
The RFC uses a graph structure as a motivating example, and this is also used in the documentation, but I'll admit to not fully appreciating the benefits of the associated type version over the type-parameterized version. The primary thing is that the distance method doesn't need to care about the Edge type. This is nice but seems a bit shallow of a reason for having associated types at all.
I've found associated types to be pretty intuitive to use in practice, but I find myself struggling when deciding where and when I should use them in my own API.
When writing code, when should I choose an associated type over a generic type parameter, and when should I do the opposite?
This is now touched on in the second edition of The Rust Programming Language. However, let's dive in a bit in addition.
Let us start with a simpler example.
When is it appropriate to use a trait method?
There are multiple ways to provide late binding:
trait MyTrait {
fn hello_world(&self) -> String;
}
Or:
struct MyTrait<T> {
t: T,
hello_world: fn(&T) -> String,
}
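impl<T> MyTrait<T> {
fn new(t: T, hello_world: fn(&T) -> String) -> MyTrait<T> {
MyTrait { t, hello_world }
}
fn hello_world(&self) -> String {
// Call the stored function pointer with a reference to the stored value.
(self.hello_world)(&self.t)
}
}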
Disregarding any implementation/performance strategy, both excerpts above allow the user to specify in a dynamic manner how hello_world should behave.
The one difference (semantically) is that the trait implementation guarantees that for a given type T implementing the trait, hello_world will always have the same behavior whereas the struct implementation allows having a different behavior on a per instance basis.
Whether using a method is appropriate or not depends on the usecase!
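To illustrate the per-type versus per-instance difference, here is a small sketch. All names (Greet, English, Greeter, as_decimal, as_hex) are made up for this example.

```rust
// Trait version: behaviour is fixed per implementing type.
trait Greet {
    fn hello_world(&self) -> String;
}

struct English;

impl Greet for English {
    fn hello_world(&self) -> String {
        "Hello, world!".to_string()
    }
}

// Struct version: behaviour is chosen per instance via a function pointer.
struct Greeter<T> {
    t: T,
    hello_world: fn(&T) -> String,
}

impl<T> Greeter<T> {
    fn new(t: T, hello_world: fn(&T) -> String) -> Self {
        Greeter { t, hello_world }
    }

    fn hello_world(&self) -> String {
        (self.hello_world)(&self.t)
    }
}

// Two different behaviours for the same T = u32.
fn as_decimal(n: &u32) -> String {
    n.to_string()
}

fn as_hex(n: &u32) -> String {
    format!("{:x}", n)
}
```

Every English value greets the same way, whereas Greeter::new(255u32, as_decimal) and Greeter::new(255u32, as_hex) have the same type but behave differently.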
When is it appropriate to use an associated type?
Similarly to the trait methods above, an associated type is a form of late binding (though it occurs at compilation), allowing the user of the trait to specify for a given instance which type to substitute. It is not the only way (thus the question):
trait MyTrait {
type Return;
fn hello_world(&self) -> Self::Return;
}
Or:
trait MyTrait<Return> {
fn hello_world(&self) -> Return;
}
Are equivalent to the late binding of methods above:
the first one enforces that for a given Self there is a single Return associated
the second one, instead, allows implementing MyTrait for Self for multiple Return
Which form is more appropriate depends on whether it makes sense to enforce unicity or not. For example:
Deref uses an associated type because without unicity the compiler would go mad during inference
Add uses an associated type because its author thought that given the two arguments there would be a logical return type
As you can see, while Deref is an obvious usecase (technical constraint), the case of Add is less clear cut: maybe it would make sense for i32 + i32 to yield either i32 or Complex<i32> depending on the context? Nonetheless, the author exercised their judgment and decided that overloading the return type for additions was unnecessary.
My personal stance is that there is no right answer. Still, beyond the unicity argument, I would mention that associated types make using the trait easier as they decrease the number of parameters that have to be specified, so in case the benefits of the flexibility of using a regular trait parameter are not obvious, I suggest starting with an associated type.
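A minimal sketch of the two forms side by side, using invented names (ConvertTo, Measurement, Celsius). It shows that the type-parameter form can be implemented several times for one type, at the price of the caller having to say which one is meant, while the associated-type form has exactly one answer.

```rust
// Type parameter: Celsius may implement this once per target type.
trait ConvertTo<T> {
    fn convert(&self) -> T;
}

// Associated type: Celsius picks exactly one Raw type.
trait Measurement {
    type Raw;
    fn raw(&self) -> Self::Raw;
}

struct Celsius(f64);

impl ConvertTo<f64> for Celsius {
    fn convert(&self) -> f64 {
        self.0
    }
}

impl ConvertTo<String> for Celsius {
    fn convert(&self) -> String {
        format!("{} C", self.0)
    }
}

impl Measurement for Celsius {
    type Raw = f64;
    fn raw(&self) -> f64 {
        self.0
    }
}

// With the type parameter, the expected type selects the impl.
fn as_number(c: &Celsius) -> f64 {
    c.convert()
}

fn as_text(c: &Celsius) -> String {
    c.convert()
}

// With the associated type, nothing has to be selected.
fn raw_value(c: &Celsius) -> f64 {
    c.raw()
}
```

Without the return type annotations on as_number and as_text, the call c.convert() would be ambiguous; c.raw() never is.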
Associated types are a grouping mechanism, so they should be used when it makes sense to group types together.
The Graph trait introduced in the documentation is an example of this. You want a Graph to be generic, but once you have a specific kind of Graph, you don't want the Node or Edge types to vary anymore. A particular Graph isn't going to want to vary those types within a single implementation, and in fact, wants them to always be the same. They're grouped together, or one might even say associated.
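As a rough sketch of that grouping (this is not the exact trait from the documentation; the method names, AdjacencyList and directly_connected are assumptions made for the example):

```rust
// A graph fixes its Node and Edge types once per implementation.
trait Graph {
    type Node;
    type Edge;

    fn edges_from(&self, node: &Self::Node) -> Vec<Self::Edge>;
    fn has_edge(&self, from: &Self::Node, to: &Self::Node) -> bool;
}

// Hypothetical implementation: nodes are indices into an adjacency list.
struct AdjacencyList {
    neighbours: Vec<Vec<usize>>,
}

impl Graph for AdjacencyList {
    type Node = usize;
    type Edge = (usize, usize);

    // Panics if `node` is out of range; good enough for a sketch.
    fn edges_from(&self, node: &usize) -> Vec<(usize, usize)> {
        self.neighbours[*node].iter().map(|&to| (*node, to)).collect()
    }

    fn has_edge(&self, from: &usize, to: &usize) -> bool {
        self.neighbours[*from].contains(to)
    }
}

// Generic code only mentions G; it never has to name the Edge type.
fn directly_connected<G: Graph>(g: &G, a: &G::Node, b: &G::Node) -> bool {
    g.has_edge(a, b) || g.has_edge(b, a)
}
```

With type parameters instead (Graph<N, E>), directly_connected would have to carry N and E in its signature even though it does not use E.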
Associated types can be used to tell the compiler "these two types between these two implementations are the same". Here's a double dispatch example that compiles, and is loosely similar to how the standard library relates Iterator to Sum:
trait MySum {
type Item;
fn sum<I>(iter: I)
where
I: MyIter<Item = Self::Item>;
}
trait MyIter {
type Item;
fn next(&self) {}
fn sum<S>(self)
where
S: MySum<Item = Self::Item>;
}
struct MyU32;
impl MySum for MyU32 {
type Item = MyU32;
fn sum<I>(iter: I)
where
I: MyIter<Item = Self::Item>,
{
iter.next()
}
}
struct MyVec;
impl MyIter for MyVec {
type Item = MyU32;
fn sum<S>(self)
where
S: MySum<Item = Self::Item>,
{
S::sum::<Self>(self)
}
}
fn main() {}
Also, https://blog.thomasheartman.com/posts/on-generics-and-associated-types has some good information on this as well:
In short, use generics when you want a type A to be able to implement a trait any number of times for different type parameters, such as in the case of the From trait.
Use associated types if it makes sense for a type to only implement the trait once, such as with Iterator and Deref.
The main goal is to implement a computation graph that handles nodes with values and nodes with operators (think of simple arithmetic operators like add, subtract, multiply, etc.). An operator node can take up to two value nodes, and "produces" a resulting value node.
Up to now, I'm using an enum to differentiate between a value and operator node:
pub enum Node<'a, T> where T : Copy + Clone {
Value(ValueNode<'a, T>),
Operator(OperatorNode)
}
pub struct ValueNode<'a, T> {
id: usize,
value_object : &'a dyn ValueType<T>
}
Update: Node::Value contains a struct, which itself contains a reference to a trait object ValueType, which is being implemented by a variety of concrete types.
But here comes the problem. During compilation, the generic types are monomorphized, i.e. replaced by the actual types. The generic type T is also being propagated throughout the computation graph (obviously):
pub struct ComputationGraph<'a, T> where T : Copy + Clone {
nodes: Vec<Node<'a, T>>
}
This actually restricts the usage of ComputeGraph to one specific ValueType.
Furthermore the generic type T cannot be Sized, since a value node can be an opaque type handled by a different backend not available through Rust (think of C opaque types made available through FFI).
One possible solution to this problem would be to introduce an additional enum type that "mirrors" the concrete implementations of the ValueType trait mentioned above. This approach would be similar to what enum dispatch does.
Is there anything I haven't thought of to use multiple implementations of ValueType?
update:
What I want to achieve is the following code:
pub struct Scalar<T> where T : Copy + Clone{
data : T
}
fn main() {
let cg = ComputeGraph::new();
// a new scalar type. doesn't have to be a tuple struct
let a = Scalar::new::<f32>(1.0);
let b_size = 32;
let b = Container::new::<opaque_type>(32);
let op = OperatorAdd::new();
// cg.insert_operator_node constructs four nodes: 3 value nodes
// and one operator nodes internally.
let result = cg.insert_operator_node::<Container>(&op, &a, &b);
}
update
ValueType<T> looks like this
pub trait ValueType<T> {
fn get_size(&self) -> usize;
fn get_value(&self) -> T;
}
update
To further increase the clarity of my question think of a small BLAS library backed by OpenCL. The memory management and device interaction shall be transparent to the user. A Matrix type allocates space on an OpenCL device with types as a primitive type buffer, and the appropriate call will return a pointer to that specific region of memory. Think of an operation that will scale the matrix by a scalar type, that is being represented by a primitive value. Both the (pointer to the) buffer and the scalar can be passed to a kernel function. Going back to the ComputeGraph, it seems obvious, that all BLAS operations form some type of computational graph, which can be reduced to a linear list of instructions ( think here of setting kernel arguments, allocating buffers, enqueue the kernel, storing the result, etc... ). Having said all that, a computation graph needs to be able to store value nodes with a variety of types.
As always the answer to the problem posed in the question is obvious. The graph expects one generic type (with trait bounds). Using an enum to "cluster" various subtypes was the solution, as already sketched out in the question.
An example to illustrate the solution. Consider following "subtypes":
struct Buffer<T> {
// fields
}
struct Scalar<T> {
// fields
}
struct Kernel {
// fields
}
The value containing types can be packed into an enum:
enum MemType {
Buffer(Buffer<f32>),
Scalar(Scalar<f32>),
// more enum variants ..
}
MemType and Kernel can now be packed into an enum as well:
enum Node {
Value(MemType),
Operator(Kernel),
}
Node can now be used as the main type for nodes/vertices inside the graph. The solution might not be very elegant, but it does the trick for now. Maybe some code restructuring might be done in the future.
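For completeness, a compilable sketch of how such a Node could be consumed. The fields (data, name) and the functions describe and sample_graph are placeholders invented for this example; the real structs would hold backend handles instead.

```rust
// Placeholder fields; the real types would wrap backend resources.
struct Buffer<T> {
    data: Vec<T>,
}

struct Scalar<T> {
    data: T,
}

struct Kernel {
    name: String,
}

enum MemType {
    Buffer(Buffer<f32>),
    Scalar(Scalar<f32>),
}

enum Node {
    Value(MemType),
    Operator(Kernel),
}

// One match handles every kind of node; no generic parameter on Node.
fn describe(node: &Node) -> String {
    match node {
        Node::Value(MemType::Buffer(b)) => format!("buffer of {} f32 values", b.data.len()),
        Node::Value(MemType::Scalar(s)) => format!("scalar {}", s.data),
        Node::Operator(k) => format!("kernel {}", k.name),
    }
}

// A graph can then store heterogeneous nodes in a single Vec.
fn sample_graph() -> Vec<Node> {
    vec![
        Node::Value(MemType::Buffer(Buffer { data: vec![1.0, 2.0, 3.0] })),
        Node::Value(MemType::Scalar(Scalar { data: 2.5 })),
        Node::Operator(Kernel { name: "scale".to_string() }),
    ]
}
```

The compiler checks that the match is exhaustive, so adding a new MemType variant later flags every place that needs updating.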
I want to pass Iterators to a function, which then computes some value from these iterators.
I am not sure what a robust signature for such a function would look like.
Let's say I want to iterate over f64 values.
You can find the code in the playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c614429c541f337adb102c14518cf39e
My first attempt was
fn dot(a : impl std::iter::Iterator<Item = f64>,b : impl std::iter::Iterator<Item = f64>) -> f64 {
a.zip(b).map(|(x,y)| x*y).sum()
}
This fails to compile if we try to iterate over slices, because a slice iterator yields &f64 rather than f64.
So you can do
fn dot<'a>(a : impl std::iter::Iterator<Item = &'a f64>,b : impl std::iter::Iterator<Item = &'a f64>) -> f64 {
a.zip(b).map(|(x,y)| x*y).sum()
}
This fails to compile if I try to iterate over mapped ranges.
(Why does the compiler require the lifetime parameters here?)
So I tried to accept references and not references generically:
pub fn dot<T : Borrow<f64>, U : Borrow<f64>>(a : impl std::iter::Iterator::<Item = T>, b: impl std::iter::Iterator::<Item = U>) -> f64 {
a.zip(b).map(|(x,y)| x.borrow()*y.borrow()).sum()
}
This works with all combinations I tried, but it is quite verbose and I don't really understand every aspect of it.
Are there more cases?
What would be the best practice of solving this problem?
There is no right way to write a function that can accept Iterators, but there are some general principles that we can apply to make your function general and easy to use.
Write functions that accept impl IntoIterator<...>. Because all Iterators implement IntoIterator, this is strictly more general than a function that accepts only impl Iterator<...>.
Borrow<T> is the right way to abstract over T and &T.
When trait bounds get verbose, it's often easier to read if you write them in where clauses instead of in-line.
With those in mind, here's how I would probably write dot:
use std::borrow::Borrow;

fn dot<I, J>(a: I, b: J) -> f64
where
I: IntoIterator,
J: IntoIterator,
I::Item: Borrow<f64>,
J::Item: Borrow<f64>,
{
a.into_iter()
.zip(b)
.map(|(x, y)| x.borrow() * y.borrow())
.sum()
}
However, I also agree with TobiP64's answer in that this level of generality may not be necessary in every case. This dot is nice because it can accept a wide range of arguments, so you can call dot(&some_vec, some_iterator) and it just works. It's optimized for readability at the call site. On the other hand, if you find the Borrow trait complicates the definition too much, there's nothing wrong with optimizing for readability at the definition, and forcing the caller to add a .iter().copied() sometimes. The only thing I would definitely change about the first dot function is to replace Iterator with IntoIterator.
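To show what "it just works" means at the call site, here is the same dot again with a few calls. The function call_site_examples and its sample values are made up for this illustration.

```rust
use std::borrow::Borrow;

fn dot<I, J>(a: I, b: J) -> f64
where
    I: IntoIterator,
    J: IntoIterator,
    I::Item: Borrow<f64>,
    J::Item: Borrow<f64>,
{
    a.into_iter()
        .zip(b)
        .map(|(x, y)| x.borrow() * y.borrow())
        .sum()
}

// Hypothetical helper that exercises three kinds of arguments.
fn call_site_examples() -> (f64, f64, f64) {
    let v: Vec<f64> = vec![1.0, 2.0, 3.0];
    let w: [f64; 3] = [4.0, 5.0, 6.0];

    // Both arguments borrowed: the items are &f64.
    let by_ref = dot(&v, &w);

    // A borrowed Vec together with a mapped range yielding f64 by value.
    let mixed = dot(&v, (1..=3).map(|i| i as f64));

    // Both arguments owned: the items are f64.
    let by_value = dot(v, vec![4.0, 5.0, 6.0]);

    (by_ref, mixed, by_value)
}
```

None of the calls needs .iter(), .copied() or an explicit lifetime.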
You can iterate over slices with the first dot implementation like this:
dot([0.0, 1.0, 2.0].iter().cloned(), [0.0, 1.0, 2.0].iter().cloned());
(https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.cloned)
or
dot([0.0, 1.0, 2.0].iter().copied(), [0.0, 1.0, 2.0].iter().copied());
(https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.copied)
Why does the compiler require the lifetime parameters here?
As far as I know, every reference in Rust has a lifetime, but the compiler can infer it in simple cases. In this case, however, the compiler is not yet smart enough, so you need to tell it how long the references yielded by the iterator live.
Are there more cases?
You can always use iterator methods, like the solution above, to get an iterator over f64, so you don't have to deal with lifetimes or generics.
What would be the best practice of solving this problem?
I would recommend the first version (and thus leaving it to the caller to transform the iterator to Iterator<Item = f64>), simply because it's the most readable.
I have written a problem solver in Rust which as a subroutine needs to make calls to a function which is given as a black box (essentially I would like to give an argument of type Fn(f64) -> f64).
Essentially I have a function defined as fn solve<F>(f: F) where F : Fn(f64) -> f64 { ... } which means that I can call solve like this:
solve(|x| x);
What I would like to do is to pass a more complex function to the solver, i.e. a function which depends on multiple parameters etc.
I would like to be able to pass a struct with a suitable trait implementation to the solver. I tried the following:
struct Test;
impl Fn<(f64,)> for Test {}
This yields the following error:
error: the precise format of `Fn`-family traits' type parameters is subject to change. Use parenthetical notation (Fn(Foo, Bar) -> Baz) instead (see issue #29625)
I would also like to add a trait which includes the Fn trait (which I don't know how to define, unfortunately). Is that possible as well?
Edit:
Just to clarify: I have been developing in C++ for quite a while, the C++ solution would be to overload the operator()(args). In that case I could use a struct or class like a function. I would like to be able to
Pass both functions and structs to the solver as arguments.
Have an easy way to call the functions. Calling obj.method(args) is more complicated than obj(args) (in C++). But it seems that this behavior is not achievable currently.
The direct answer is to do exactly as the error message says:
Use parenthetical notation instead
That is, instead of Fn<(A, B)>, use Fn(A, B)
The real problem is that you are not allowed to implement the Fn* family of traits yourself in stable Rust.
The real question you are asking is harder to be sure of because you haven't provided an MCVE, so we are reduced to guessing. I'd say you should flip it around the other way; create a new trait, and implement it for both closures and your type:
trait Solve {
type Output;
fn solve(&mut self) -> Self::Output;
}
impl<F, T> Solve for F
where
F: FnMut() -> T,
{
type Output = T;
fn solve(&mut self) -> Self::Output {
(self)()
}
}
struct Test;
impl Solve for Test {
// interesting things
}
fn main() {}
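Adapted to the Fn(f64) -> f64 shape from the question, the same idea might look like the sketch below. The trait name Function, the Quadratic struct and the bisection inside solve are all assumptions for illustration; your real solver will look different.

```rust
// Hypothetical trait: anything the solver can evaluate at a point.
trait Function {
    fn eval(&self, x: f64) -> f64;
}

// Every closure or fn with the right signature gets the trait for free.
impl<F> Function for F
where
    F: Fn(f64) -> f64,
{
    fn eval(&self, x: f64) -> f64 {
        self(x)
    }
}

// A struct carrying extra parameters implements the trait by hand.
struct Quadratic {
    a: f64,
    b: f64,
    c: f64,
}

impl Function for Quadratic {
    fn eval(&self, x: f64) -> f64 {
        self.a * x * x + self.b * x + self.c
    }
}

// Stand-in solver: bisection for a root in [lo, hi],
// assuming f changes sign exactly once on that interval.
fn solve<F: Function>(f: &F, mut lo: f64, mut hi: f64) -> f64 {
    for _ in 0..60 {
        let mid = 0.5 * (lo + hi);
        if (f.eval(lo) < 0.0) == (f.eval(mid) < 0.0) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    0.5 * (lo + hi)
}
```

Both solve(&|x: f64| x - 1.0, 0.0, 3.0) and solve(&Quadratic { a: 1.0, b: 0.0, c: -4.0 }, 0.0, 5.0) go through the same code path; the only thing you give up compared to C++ is writing obj(args) for the struct.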
I am implementing a quick geometry crate for practice, and I want to implement two structs, Vector and Normal (this is because standard vectors and normal vectors map through certain transformations differently). I've implemented the following trait:
trait Components {
fn new(x: f32, y: f32, z: f32) -> Self;
fn x(&self) -> f32;
fn y(&self) -> f32;
fn z(&self) -> f32;
}
I'd also like to be able to add two vectors together, as well as two normals, so I have blocks that look like this:
impl Add<Vector> for Vector {
type Output = Vector;
fn add(self, rhs: Vector) -> Vector {
Vector { vals: [
self.x() + rhs.x(),
self.y() + rhs.y(),
self.z() + rhs.z()] }
}
}
And almost the exact same impl for Normals. What I really want is to provide a default Add impl for every struct that implements Components, since typically, they all will add the same way (e.g. a third struct called Point will do the same thing). Is there a way of doing this besides writing out three identical implementations for Point, Vector, and Normal? Something that might look like this:
impl Add<Components> for Components {
type Output = Components;
fn add(self, rhs: Components) -> Components {
Components::new(
self.x() + rhs.x(),
self.y() + rhs.y(),
self.z() + rhs.z())
}
}
Where "Components" would automatically get replaced by the appropriate type. I suppose I could do it in a macro, but that seems a little hacky to me.
In Rust, it is possible to define generic impls, but there are some important restrictions that result from the coherence rules. You'd like an impl that goes like this:
impl<T: Components> Add<T> for T {
type Output = T;
fn add(self, rhs: T) -> T {
T::new(
self.x() + rhs.x(),
self.y() + rhs.y(),
self.z() + rhs.z())
}
}
Unfortunately, this does not compile:
error: type parameter T must be used as the type parameter for some local type (e.g. MyStruct<T>); only traits defined in the current crate can be implemented for a type parameter [E0210]
Why? Suppose your Components trait were public. Now, a type in another crate could implement the Components trait. That type might also try to implement the Add trait. Whose implementation of Add should win, your crate's or that other crate's? By Rust's current coherence rules, the other crate gets this privilege.
For now, the only option, besides repeating the impls, is to use a macro. Rust's standard library uses macros in many places to avoid repeating impls (especially for the primitive types), so you don't have to feel dirty! :P
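A possible shape for such a macro, as a sketch only. It assumes each struct stores its components in a vals: [f32; 3] field like the Vector in the question; the macro name impl_components_and_add is made up, and for brevity it generates the Components impl as well as the Add impl.

```rust
use std::ops::Add;

trait Components {
    fn new(x: f32, y: f32, z: f32) -> Self;
    fn x(&self) -> f32;
    fn y(&self) -> f32;
    fn z(&self) -> f32;
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector {
    vals: [f32; 3],
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Normal {
    vals: [f32; 3],
}

// Hypothetical helper macro: expands to one Components impl and
// one Add impl for every type name passed in.
macro_rules! impl_components_and_add {
    ($($t:ident),* $(,)?) => {
        $(
            impl Components for $t {
                fn new(x: f32, y: f32, z: f32) -> Self {
                    $t { vals: [x, y, z] }
                }
                fn x(&self) -> f32 { self.vals[0] }
                fn y(&self) -> f32 { self.vals[1] }
                fn z(&self) -> f32 { self.vals[2] }
            }

            impl Add for $t {
                type Output = $t;
                fn add(self, rhs: $t) -> $t {
                    $t::new(
                        self.x() + rhs.x(),
                        self.y() + rhs.y(),
                        self.z() + rhs.z(),
                    )
                }
            }
        )*
    };
}

impl_components_and_add!(Vector, Normal);
```

Adding Point later is then a one-word change to the macro invocation, and Vector + Normal still fails to compile, which is presumably what you want.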
At present, macros are the only way to do this. Coherence rules prevent multiple implementations that could overlap, so you can’t use a generic solution.