Writing generic implementation of sum on an iterator in Rust

I have been learning Rust, coming from a Swift, C and C++ background. I feel like I have a basic understanding of ownership, borrowing and traits. To exercise a bit, I decided to implement a sum function on a generic slice [T] where T has a default value and can be added to itself.
This is how far I got:
use std::ops::Add;
trait Summable {
    type Result;
    fn sum(&self) -> Self::Result;
}
impl<T> Summable for [T]
where
    T: Add<Output = T> + Default,
{
    type Result = T;
    fn sum(&self) -> T {
        let x = T::default();
        self.iter().fold(x, |a, b| a + b)
    }
}
The compiler complains with "expected type parameter T, found &T" for a + b.
I understand why the error happens, but not exactly how to fix it. Yes, the type of x is T. It cannot be &T because, if nothing else, if the slice is empty, that's the value that is returned and I cannot return a reference to something created inside the function. Plus, the default function returns a new value that the code inside the function owns. Makes sense. And yes, b should be a shared reference to the values in the slice since I don't want to consume them (not T) and I don't want to mutate them (not &mut T).
But that means I need to add T to &T, and return a T because I am returning a new value (the sum) which will be owned by the caller. How?
PS: Yes, I know this function exists already. This is a learning exercise.

The std::ops::Add trait has an optional Rhs type parameter that defaults to Self:
pub trait Add<Rhs = Self> {
    type Output;
    fn add(self, rhs: Rhs) -> Self::Output;
}
Because you've omitted the Rhs type parameter from the T: Add<Output = T> bound, it defaults to T: hence to your a you can add a T, but not an &T.
Either specify that T: for<'a> Add<&'a T, Output = T>; or else somehow obtain an owned T from b, e.g. via T: Copy or T: Clone.
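Both options can be sketched as follows. This is only an illustration: the free function sum_cloned is a name made up here, and it is written as a function rather than a second impl because two blanket impls of Summable for [T] would conflict.

```rust
use std::ops::Add;

trait Summable {
    type Result;
    fn sum(&self) -> Self::Result;
}

// Option 1: require that `T + &T` is defined and yields a `T`.
// The `for<'a>` makes the bound hold for every borrow lifetime.
impl<T> Summable for [T]
where
    T: Default + for<'a> Add<&'a T, Output = T>,
{
    type Result = T;

    fn sum(&self) -> T {
        // `acc` is an owned `T`, `item` is a `&T` borrowed from the slice.
        self.iter().fold(T::default(), |acc, item| acc + item)
    }
}

// Option 2 (hypothetical helper): keep the plain `Add<Output = T>` bound
// and obtain an owned `T` from each element by cloning it.
fn sum_cloned<T>(items: &[T]) -> T
where
    T: Default + Clone + Add<Output = T>,
{
    items.iter().cloned().fold(T::default(), |acc, item| acc + item)
}
```

The built-in numeric types implement addition with a reference on the right-hand side, so option 1 works for them without any cloning. Option 2 accepts more types at the cost of one clone per element.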

Related

What is the difference between `filter(func)` and `filter(|x| func(x))`?

What is the difference between filter(|x| func(x)) and filter(func)? Perhaps a good place to start would be to understand how filter(func) could be written using syntax akin to filter(|x| func(x)). My code looks like this:
use std::collections::HashSet;
use std::hash::Hash;
fn filter_out_duplicates(vec_of_vecs: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    vec_of_vecs
        .into_iter()
        .filter(all_unique)
        .collect()
}
pub fn all_unique<T>(iterable: T) -> bool
where
    T: IntoIterator,
    T::Item: Eq + Hash,
{
    let mut unique = HashSet::new();
    iterable.into_iter().all(|x| unique.insert(x))
}
error[E0599]: the method `collect` exists for struct `Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>`, but its trait bounds were not satisfied
--> src/main.rs:44:56
|
44 | vec_of_vecs.into_iter().filter(all_unique).collect()
| ^^^^^^^ method cannot be called on `Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>` due to unsatisfied trait bounds
|
::: /.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/iter/adapters/filter.rs:15:1
|
15 | pub struct Filter<I, P> {
| ----------------------- doesn't satisfy `_: Iterator`
|
= note: the following trait bounds were not satisfied:
`<fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>} as FnOnce<(&Vec<u8>,)>>::Output = bool`
which is required by `Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>: Iterator`
`fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}: FnMut<(&Vec<u8>,)>`
which is required by `Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>: Iterator`
`Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>: Iterator`
which is required by `&mut Filter<std::vec::IntoIter<Vec<u8>>, fn(&Vec<u8>) -> bool {tmp::all_unique::<&Vec<u8>>}>: Iterator`
which is required by `&mut Filter<std::vec::IntoIter<Vec<Placement>>, fn(&Vec<Placement>) -> bool {all_unique::<&Vec<Placement>>}>: Iterator`
But the code compiles if I use |x| all_unique(x). I know deciphering compiler errors is the recommended way of solving problems in Rust but I find this error pretty impenetrable.
I found a discussion that seemed to commiserate about the error more than explain coercions but I found the chapter on coercions in the Rustonomicon too short to provide understanding.
This case is not related to coercions. This is another case of late-bound vs. early-bound lifetimes.
Rust has two kinds of lifetimes: early-bound and late-bound. The difference boils down to when it is decided which lifetime to use.
For late bound lifetimes, you get a Higher-Ranked Trait Bound - something like for<'a> fn(&'a i32). Then, a lifetime is picked only when the function is called.
For early-bound lifetimes, on the other hand, you get fn(&'some_concrete_lifetime i32). The lifetime may be inferred, sometimes omitted, but it's there. And it has to be decided at the time we decide the type for the function pointer/item.
filter() expects a HRTB function, that is, with a late bound lifetime. This is because the desugaring for FnMut(&Self::Item) -> bool, which is the bound in filter(), is for<'a> FnMut(&'a Self::Item) -> bool, or, if you wish, for<'a> FnMut<(&'a Self::Item,), Output = bool>.
Your all_unique(), however, is generic over T: IntoIterator. If we set T = &'a Vec<u8>, then 'a is early bound, because lifetimes that come from generic type parameters are always early bound. Generic arguments cannot be late-bound: there is no way in Rust to express for<T>, since generic type parameters are monomorphized.
So, if we reveal the elided lifetimes, you want to satisfy the trait bound fn(&'some_lifetime Vec<u8>) -> bool: for<'all_lifetimes> FnMut(&'all_lifetimes Vec<u8>) -> bool, and this bound is false. This is the reason for the error you saw.
If we use a closure, however, we generate a closure that is specific for the type &'lifetime Vec<u8>. Since it is not generic over the type, the lifetime can be late bound.
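A stripped-down sketch of the same situation, independent of filter(). All names here (read_late, read_early, call_with_local, demo) are made up for illustration:

```rust
use std::ops::Deref;

// The lifetime 'a is a parameter of the function itself and appears only in
// the argument type, so it is late bound: it is chosen at each call.
fn read_late<'a>(x: &'a i32) -> i32 {
    *x
}

// Here the reference type (including its lifetime) is hidden inside `T`.
// Choosing `T = &'x i32` fixes 'x when the function item is named,
// so the lifetime is early bound.
fn read_early<T: Deref<Target = i32>>(x: T) -> i32 {
    *x
}

// Like `filter`, this requires a callable that works for *every* lifetime.
fn call_with_local<F>(f: F) -> i32
where
    F: for<'a> Fn(&'a i32) -> i32,
{
    let local = 7;
    f(&local)
}

fn demo() -> (i32, i32) {
    // Accepted: `read_late` is higher-ranked over 'a.
    let a = call_with_local(read_late);
    // `call_with_local(read_early)` is expected to be rejected, for the same
    // reason as `filter(all_unique)`: no single `T` covers all lifetimes.
    // Wrapping the call in a closure lets the closure be higher-ranked:
    let b = call_with_local(|x| read_early(x));
    (a, b)
}
```

The closure picks a fresh T = &'a i32 on each call, which is exactly what the function item on its own cannot do.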
I'm not 100% sure about what is happening here. It could even be considered a compiler limitation/bug, depending on your point of view, but I think that this is what happens:
When you write filter(all_unique), since all_unique is a generic function, it is resolved as taking a reference to the item being iterated over, as per the definition of filter:
fn filter<P>(self, predicate: P) -> Filter<Self, P> where
P: FnMut(&Self::Item) -> bool,
So you are actually calling all_unique<&Vec<u8>>. You may think that all is ok, because that &Vec<u8> actually implements IntoIterator and the other constraints.
But the issue is with lifetimes. In filter, the constraint P: FnMut(&Self::Item) -> bool is actually syntactic sugar for P: for<'a> FnMut(&'a Self::Item) -> bool; that is, the function must be able to accept any lifetime, and yours cannot.
But wait! You may say that your function all_unique<T> certainly can take T: 'a for any lifetime 'a. That is true, but it is not what is happening here: you are calling filter<P> with P = all_unique::<&'f Vec<u8>>, where 'f is one particular lifetime. And that lifetime is not any lifetime! Now your all_unique function is tainted with a particular lifetime and it does not satisfy the for<'a> bound above.
Surely, you do not actually need that for<'a> here because you are calling the function with the proper lifetime, but the syntax is what it is and it forces the error.
The obvious solution is to write all_unique to take a reference:
pub fn all_unique<T>(iterable: &T) -> bool
That is actually syntactic sugar for:
pub fn all_unique<'a, T>(iterable: &'a T) -> bool
where the universality of 'a (that for<'a> thing) is implicit.
And now calling filter(all_unique) selects the generic all_unique::<Vec<u8>>, that is untainted with lifetimes and can be called with any 'a.
This is indeed a syntactic limitation; you can just write:
pub fn all_unique<T>(iterable: T) -> bool { /* ... */ }
pub fn all_unique_ref<'a, T>(iterable: &'a T) -> bool {
    all_unique::<&T>(iterable)
}
And writing filter(all_unique_ref) will work while filter(all_unique) will not.
Your solution of using a lambda expression:
filter(|x| all_unique(x))
is just like that all_unique_ref but anonymous.
TL;DR; The original error is caused because the lifetime of the argument is captured in the type of the generic function instead of in the usage of that function. And that makes filter() unhappy because its argument does not look generic enough.
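For reference, a compilable sketch of both workarounds applied to the code from the question. To keep the bounds simple, the wrapper here is specialized to &Vec<u8> instead of being generic over T; that specialization and the names filter_with_closure and filter_with_wrapper are assumptions of this sketch, not part of the original code.

```rust
use std::collections::HashSet;
use std::hash::Hash;

pub fn all_unique<T>(iterable: T) -> bool
where
    T: IntoIterator,
    T::Item: Eq + Hash,
{
    let mut seen = HashSet::new();
    // `insert` returns false on the first repeated element.
    iterable.into_iter().all(|x| seen.insert(x))
}

// Named wrapper: the elided lifetime of `v` belongs to this function,
// so the function item can be called with a reference of any lifetime.
pub fn all_unique_ref(v: &Vec<u8>) -> bool {
    all_unique(v)
}

// Workaround 1: an anonymous wrapper (closure).
pub fn filter_with_closure(vec_of_vecs: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    vec_of_vecs.into_iter().filter(|v| all_unique(v)).collect()
}

// Workaround 2: the named wrapper passed directly.
pub fn filter_with_wrapper(vec_of_vecs: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    vec_of_vecs.into_iter().filter(all_unique_ref).collect()
}
```

Both functions keep only the inner vectors without repeated elements; an empty inner vector counts as all unique.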

Why does `Fn() -> T` constrain `T` but `Fn(T) -> T` does not

The following code compiles fine:
struct StructA<F>(F);
impl<F, T> StructA<F> where F: Fn() -> T {}
Although T doesn't show up in StructA's type parameters, it is still constrained due to the where clause. This trick is used, for example, in std::iter::Map so Map<I, F> only needs two type parameters while the impl<B, I, F> Iterator for Map<I, F> takes three.
However the following code does not compile:
struct StructB<F>(F);
impl<F, T> StructB<F> where F: Fn(T) -> T {}
error[E0207]: the type parameter `T` is not constrained by the impl trait, self type, or predicates
--> src/lib.rs:5:9
|
5 | impl<F, T> StructB<F> where F: Fn(T) -> T {}
| ^ unconstrained type parameter
For more information about this error, try `rustc --explain E0207`.
error: could not compile `playground` due to previous error
This is unintuitive: why would using T in more places make it less constrained? Is this intended or is it a limitation in Rust?
Note this also happens with regular traits, i.e. the desugared version of Fn:
trait FnTrait<Args> {
    type Output;
}
// Works
struct StructA<F>(F);
impl<F, T> StructA<F> where F: FnTrait<(), Output = T> {}
// Fails
struct StructB<F>(F);
impl<F, T> StructB<F> where F: FnTrait<(T,), Output = T> {}
Consider if we implement Fn manually (of course this requires nightly)...
#![feature(fn_traits, unboxed_closures)]
struct MyFunction;
impl<T> FnOnce<(T,)> for MyFunction {
    type Output = T;
    extern "rust-call" fn call_once(self, (v,): (T,)) -> T { v }
}
Now imagine your struct:
struct StructA<F>(F);
impl<F: FnOnce(T) -> T, T> StructA<F>{
fn foo(self) -> T { (self.0)() }
}
let s: StructA<MyFunction> = ...;
s.foo(); // What is `T`?
While the reference says:
Generic parameters constrain an implementation if the parameter appears at least once in one of:
...
As an associated type in the bounds of a type that contains another parameter that constrains the implementation
This is inaccurate. Citing the RFC:
Type parameters are legal if they are "constrained" according to the following inference rules:
...
If <T0 as Trait<T1...Tn>>::U == V appears in the impl predicates, and T0...Tn are constrained and T0 as Trait<T1...Tn> is not the impl trait reference then V is constrained.
That is, all type parameters that appear in the trait reference must be constrained, not just one of them.
I've opened an issue in the reference repo.
Related: https://github.com/rust-lang/rust/issues/25041.
You always need to be able to deduce the generic type parameters of an impl from the self type of the impl (and possibly the trait being implemented, if it's a trait impl). In the first case, F: Fn() -> T, it is possible to derive T from the self type StructA<F>. However, with the trait bound F: Fn(T) -> T, this is not possible.
The difference between the two cases results from the fact that the return type of the closure trait Fn is an associated type, while the argument types are generic parameters. In other words, you can only implement Fn() -> T for any type F once, and that implementation will have a fixed return type T. The trait Fn(T) -> T, on the other hand, could be implemented for multiple types T for the same F, so given F you can't deduce what T is in general.
In practice it is of course very uncommon that multiple Fn traits are implemented for the same type, and when only using closures it's even impossible. However, since it's possible, the compiler needs to account for it.
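The same distinction can be reproduced on stable Rust with ordinary traits. In this sketch, Produce, Transform, Seven, Identity and Wrapper are all hypothetical names; Produce plays the role of Fn() -> T (output as an associated type) and Transform<Arg> the role of Fn(T) -> T (argument as a generic parameter). One common way around E0207, shown at the end, is to move the type parameter from the impl to the method.

```rust
// Output is an associated type: at most one impl per implementing type.
trait Produce {
    type Output;
    fn produce(&self) -> Self::Output;
}

// Arg is a generic parameter: a type may implement this many times.
trait Transform<Arg> {
    fn transform(&self, arg: Arg) -> Arg;
}

struct Seven;
impl Produce for Seven {
    type Output = u8;
    fn produce(&self) -> u8 {
        7
    }
}

struct Identity;
// Two impls for the same type, differing only in the generic argument.
impl Transform<i32> for Identity {
    fn transform(&self, arg: i32) -> i32 {
        arg
    }
}
impl Transform<String> for Identity {
    fn transform(&self, arg: String) -> String {
        arg
    }
}

struct Wrapper<F>(F);

// Accepted: once `F` is known, `T` is determined by `F::Output`.
impl<F, T> Wrapper<F>
where
    F: Produce<Output = T>,
{
    fn run(&self) -> T {
        self.0.produce()
    }
}

// `impl<F, T> Wrapper<F> where F: Transform<T> {}` would be rejected with
// E0207, because `Wrapper<Identity>` does not say which `T` is meant.
// Declaring `T` on the method lets each call site pick it instead.
impl<F> Wrapper<F> {
    fn apply<T>(&self, arg: T) -> T
    where
        F: Transform<T>,
    {
        self.0.transform(arg)
    }
}
```

With the method-level parameter, Wrapper(Identity).apply(3_i32) and Wrapper(Identity).apply(String::from("a")) each select a different Transform impl, which is the ambiguity the impl-level parameter could not resolve.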

What is the difference between "eq()" and "=="?

This is what the std says:
pub trait PartialEq<Rhs: ?Sized = Self> {
    /// This method tests for `self` and `other` values to be equal, and is used
    /// by `==`.
    #[must_use]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn eq(&self, other: &Rhs) -> bool;
    /// This method tests for `!=`.
    #[inline]
    #[must_use]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn ne(&self, other: &Rhs) -> bool {
        !self.eq(other)
    }
}
And the link: https://doc.rust-lang.org/src/core/cmp.rs.html#207
This is my code:
fn main() {
    let a = 1;
    let b = &a;
    println!("{}", a==b);
}
and the compiler told me:
error[E0277]: can't compare `{integer}` with `&{integer}`
--> src\main.rs:4:21
|
4 | println!("{}", a==b);
| ^^ no implementation for `{integer} == &{integer}`
|
= help: the trait `PartialEq<&{integer}>` is not implemented for `{integer}`
But when I used eq(), it compiled:
fn main() {
    let a = 1;
    let b = &a;
    println!("{}", a.eq(b));
}
It's actually quite simple, but it requires a bit of knowledge. The expression a == b is syntactic sugar for PartialEq::eq(&a, &b) (otherwise, we'd be moving a and b by trying to test if they're equal if we're dealing with non-Copy types).
In our case, the function PartialEq::eq needs to take two arguments, both of which are of type &i32. We see that a : i32 and b : &i32. Thus, &b will have type &&i32, not &i32.
It makes sense that we'd get a type error by trying to compare two things with different types. a has type i32 and b has type &i32, so it makes sense that no matter how the compiler secretly implements a == b, we might get a type error for trying to do it.
On the other hand, in the case where a : i32, the expression a.eq(b) is syntactic sugar for PartialEq::eq(&a, b). There's a subtle difference here - there's no &b. In this case, both &a and b have type &i32, so this is totally fine.
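To make the types line up explicitly, here is a small sketch of three comparisons that do compile. The function name compare_forms is made up for illustration:

```rust
// Returns the results of three well-typed ways to compare `a` with `b`.
fn compare_forms() -> (bool, bool, bool) {
    let a: i32 = 1;
    let b: &i32 = &a;

    // Dereference the right-hand side: i32 == i32.
    let deref_rhs = a == *b;
    // Reference the left-hand side: &i32 == &i32.
    let ref_lhs = &a == b;
    // Method call: the receiver `a` is auto-referenced, `b` is passed as is.
    let method = a.eq(b);

    (deref_rhs, ref_lhs, method)
}
```

In each case both operands end up at the same level of indirection, which is what the PartialEq implementations for the integer types require.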
The difference between a.eq(b) and a == b is that the dot operator performs autoref/autoderef on the receiver type for call-by-reference methods.
So when you write a.eq(b), the compiler looks at the PartialEq::eq(&self, other: &Rhs) signature, sees the &self receiver and automatically takes a reference to a.
When you write a == b, it desugars to PartialEq::eq(&a, &b) with no further adjustment of either operand. With a: i32 and b: &i32 in your case, that requires i32: PartialEq<&i32>, which is not implemented; hence the error no implementation for `{integer} == &{integer}`.
Why doesn't it do the same for operators? See:
Tracking issue: Allow autoderef and autoref in operators (experiment) #44762
Related information:
What are Rust's exact auto-dereferencing rules?

Why does this Rust code compile with a lifetime bound on the struct, but give a lifetime error if the bound is only on the impl?

Recently, I tried to write a piece of code similar to the following:
pub struct Foo<'a, F> /* where F: Fn(&u32) -> bool */ {
    u: &'a u32,
    f: F
}
impl<'a, F> Foo<'a, F>
    where F: Fn(&u32) -> bool
{
    pub fn new_foo<G: 'static>(&self, g: G) -> Foo<impl Fn(&u32) -> bool + '_>
        where G: Fn(&u32) -> bool
    {
        Foo { u: self.u, f: move |x| (self.f)(x) && g(x) }
    }
}
Here, an instance of Foo represents a condition on a piece of data (the u32), where a more restrictive Foo can be built from a less restrictive one via new_foo, without consuming the old. However, the above code does not compile as written, but gives the rather cryptic error message:
error[E0308]: mismatched types
--> src/lib.rs:9:52
|
9 | pub fn new_foo<G: 'static>(&self, g: G) -> Foo<impl Fn(&u32) -> bool + '_>
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ one type is more general than the other
|
= note: expected type `std::ops::FnOnce<(&u32,)>`
found type `std::ops::FnOnce<(&u32,)>`
error: higher-ranked subtype error
--> src/lib.rs:9:5
|
9 | / pub fn new_foo<G: 'static>(&self, g: G) -> Foo<impl Fn(&u32) -> bool + '_>
10 | | where G: Fn(&u32) -> bool
11 | | {
12 | | Foo { u: self.u, f: move |x| (self.f)(x) && g(x) }
13 | | }
| |_____^
error: aborting due to 2 previous errors
After much experimentation, I did find a way to make the code compile, and I believe it then functions as intended. I am used to the convention of placing bounds on impls rather than declarations when the declaration can be written without relying on those bounds, but for some reason uncommenting the where clause above, that is, copying the bound F: Fn(&u32) -> bool from the impl to the declaration of Foo itself resolved the problem. However, I don't have a clue why this makes a difference (nor do I really understand the error message in the first place). Does anyone have an explanation of what's going on here?
The only subtypes that exist in Rust are lifetimes, so your errors (cryptically) hint that there's some sort of lifetime problem at play. Furthermore, the error clearly points at the signature of your closure, which involves two lifetimes:
the lifetime of the closure itself, which you have explicitly stated outlives the anonymous lifetime '_; and
the lifetime of its argument &u32, which you have not explicitly stated, so a higher-ranked lifetime is inferred as if you had stated the following:
pub fn new_foo<G: 'static>(&self, g: G) -> Foo<impl for<'b> Fn(&'b u32) -> bool + '_>
where G: Fn(&u32) -> bool
Using the more explicit signature above gives a (very) slightly more helpful error:
error[E0308]: mismatched types
--> src/lib.rs:9:52
|
9 | pub fn new_foo<G: 'static>(&self, g: G) -> Foo<impl for<'b> Fn(&'b u32) -> bool + '_>
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ one type is more general than the other
|
= note: expected type `std::ops::FnOnce<(&'b u32,)>`
found type `std::ops::FnOnce<(&u32,)>`
At least we can now see that "one type is more general than the other": we expected a closure that can take an argument with any lifetime but for some reason Rust thinks that what we have instead is a closure that takes an argument that may have some more restricted range of lifetimes.
What's happened? Well, the function's return value is the following expression:
Foo { u: self.u, f: move |x| (self.f)(x) && g(x) }
This is of course an instance of struct Foo<'a, F>, where this F bears no relation to that declared on the impl block (with its trait bound). Indeed, since there's no explicit bound on F in the struct definition, the compiler must fully infer this type F from the expression itself. By giving the struct definition a trait bound, you are telling the compiler that instances of Foo, including the above expression, have an F that implements for<'b> Fn(&'b u32) -> bool: i.e. the range of lifetimes for the &u32 argument is unbounded.
Okay, so the compiler needs to infer F instead, and indeed it does infer that it implements Fn(&u32) -> bool. However, it's just not quite smart enough to determine to what range of lifetimes that &u32 argument might be restricted. Adding an explicit type annotation, as suggested in @rodrigo's comment above, states that the argument can indeed have any lifetime.
If there are in fact some restrictions on the possible lifetimes of the closure's argument, you would need to indicate that more explicitly by changing the definition of 'b from a higher-ranked trait bound (i.e. for<'b> in the return type above) to whatever is appropriate for your situation.
Hopefully once chalk is fully integrated into the compiler it will be able to perform this inference in both the unrestricted and restricted cases. In the meantime, the compiler is erring on the side of caution and not making potentially erroneous assumptions. The errors could definitely have been a bit more helpful, though!
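A sketch of the type-annotation fix, leaving the bound off the struct definition. Annotating the closure parameter as x: &u32 tells the compiler that the closure accepts a reference of any lifetime. The check method is a hypothetical addition, included only so that the result can be exercised.

```rust
pub struct Foo<'a, F> {
    u: &'a u32,
    f: F,
}

impl<'a, F> Foo<'a, F>
where
    F: Fn(&u32) -> bool,
{
    pub fn new_foo<G: 'static>(&self, g: G) -> Foo<'_, impl Fn(&u32) -> bool + '_>
    where
        G: Fn(&u32) -> bool,
    {
        // The `x: &u32` annotation makes the closure higher-ranked over the
        // lifetime of its argument instead of leaving it to inference.
        Foo { u: self.u, f: move |x: &u32| (self.f)(x) && g(x) }
    }

    // Hypothetical helper: evaluates the stored condition on the stored value.
    pub fn check(&self) -> bool {
        (self.f)(self.u)
    }
}
```

Usage, assuming the code lives in the same module so the private fields are accessible: build a base condition with Foo { u: &value, f: |x: &u32| *x > 5 }, then call base.new_foo(|x: &u32| *x % 2 == 0) to get a stricter condition that borrows from base without consuming it.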

How do I make a closure that avoids a redundant clone of captured variables?

I'm trying to implement the classic make_adder function which takes an addable thing and returns a closure which takes another addable thing and returns the sum. Here is my attempt:
use std::ops::Add;
fn make_adder<T: Add + Clone>(x: T) -> impl Fn(T) -> T::Output {
    move |y| x.clone() + y
}
Because I don't want to restrict T to be Copy, I'm calling clone() inside the closure. I think this also means there will always be one redundant x captured by the closure as the "prototype". Can I somehow do this better?
Realistically, you cannot avoid this. You never know if the closure will be called another time; you will need to keep the value in case it is. I wouldn't worry about performing the clone until profiling has identified that this is a bottleneck.
In certain cases, you might be able to change your closure type to FnOnce, which enforces that it can be called at most once:
fn make_adder<T>(x: T) -> impl FnOnce(T) -> T::Output
where
    T: Add,
{
    move |y| x + y
}
In other cases, you might be able to add some indirection to the problem. For example, instead of passing a T, pass in a closure that generates a T (presumably not by cloning its own captured variable...). This T can always be consumed directly:
fn make_adder<T>(x: impl Fn() -> T) -> impl Fn(T) -> T::Output
where
    T: Add,
{
    move |y| x() + y
}
Perhaps you can use a reference, if you’re using types that support addition on references (probably all the useful ones do, including the built-in numeric types).
fn make_adder<T, U>(x: T) -> impl Fn(T) -> U
where
    for<'a> &'a T: Add<T, Output = U>,
{
    move |y| &x + y
}
or
fn make_adder<'a, T>(x: &'a T) -> impl Fn(T) -> <&'a T as Add<T>>::Output
where
    &'a T: Add<T>,
{
    move |y| x + y
}
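As an illustration of the first reference-based variant with a type that is neither Copy nor Clone, here is a sketch using a made-up Text wrapper that implements &Text + Text. The Text type and its Add impl are assumptions for this example only.

```rust
use std::ops::Add;

// Hypothetical non-Clone, non-Copy type.
struct Text(String);

// Addition with a borrowed left-hand side and an owned right-hand side.
impl<'a> Add<Text> for &'a Text {
    type Output = Text;

    fn add(self, rhs: Text) -> Text {
        Text(format!("{}{}", self.0, rhs.0))
    }
}

fn make_adder<T, U>(x: T) -> impl Fn(T) -> U
where
    for<'a> &'a T: Add<T, Output = U>,
{
    // `x` is moved into the closure once and only borrowed on each call,
    // so no clone is needed no matter how often the closure runs.
    move |y| &x + y
}
```

Calling make_adder(Text("Hello, ".to_string())) yields a closure that can be invoked repeatedly, each time borrowing the captured prefix rather than cloning it.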
