How to write trait & impl with lifetimes for iterators? - rust

I'm trying to understand how to write a trait and an impl for it for my own types that will process some input data. I'm starting with a simple example where I want to process the input 1, 2, 3, 4 with a trait Processor. One implementation will skip the first element and double all remaining inputs. It should therefore look like this:
trait Processor {} // TBD
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {} // TBD
let numbers = vec![1, 2, 3, 4];
let it = numbers.iter();
let it = Box::new(it);
let proc = SkipOneTimesTwo;
let four_to_eight = proc.process(it);
assert_eq!(Some(4), four_to_eight.next());
assert_eq!(Some(6), four_to_eight.next());
assert_eq!(Some(8), four_to_eight.next());
assert_eq!(None, four_to_eight.next());
So my assumption is that my trait and the corresponding implementation would look like this:
trait Processor {
// Arbitrarily convert from `i32` to `u32`
fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>>;
}
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {
fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>> {
let p = it.skip(1).map(|i| 2 * (i as u32));
Box::new(p)
}
}
This code doesn't work as-is. I get the following error:
7 | let four_to_eight = proc.process(it);
| ^^ expected `i32`, found reference
|
= note: expected type `i32`
found reference `&{integer}`
= note: required for the cast to the object type `dyn Iterator<Item = i32>`
If my input data were very large, I wouldn't want the entire dataset to be kept in memory (the whole point of using Iterator), so I assume that using Iterator<Item = T> should stream data through from the original source of input until it is eventually aggregated or otherwise handled. I don't know what this means, however, in terms of what lifetimes I need to annotate here.
Eventually, my Processor may hold some intermediate data from the input (eg, for a running average calculation), so I will probably have to specify a lifetime on my struct.
Working with some of the compiler errors, I've tried adding 'a, 'static, and '_ lifetimes to my dyn Iterator<...>, but I can't quite figure out how to pass along an input iterator and modify the values lazily.
Is this even a reasonable approach? I could probably store the input Iterator<Item = i32> in my struct and impl Iterator<Item = u32> for SkipOneTimesTwo, but then I would presumably lose some of the abstraction of being able to pass around the Processor trait.

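All iterators in Rust are lazy, so nothing is held in memory beyond what the source itself holds. You also don't need lifetimes here: numbers.iter() yields references (&i32), which is why the compiler reports "expected `i32`, found reference", whereas numbers.into_iter() consumes the vector and yields owned i32 values. Just use into_iter() instead of iter() and your code compiles: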
trait Processor {
fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>>;
}
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {
fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>> {
let p = it.skip(1).map(|i| 2 * (i as u32));
Box::new(p)
}
}
fn main() {
let numbers = vec![1, 2, 3, 4];
let it = numbers.into_iter(); // only change here
let it = Box::new(it);
let pro = SkipOneTimesTwo;
let mut four_to_eight = pro.process(it);
assert_eq!(Some(4), four_to_eight.next());
assert_eq!(Some(6), four_to_eight.next());
assert_eq!(Some(8), four_to_eight.next());
assert_eq!(None, four_to_eight.next());
}
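If the input should be borrowed rather than consumed, one possible variation is to put an explicit lifetime on the boxed trait objects, since a bare Box<dyn Iterator<...>> defaults to a 'static bound. The following is only a sketch under that assumption: the lifetime parameter 'a on process and the helper process_borrowed are additions for illustration and are not part of the original code.

```rust
trait Processor {
    // `'a` allows the boxed iterator to borrow from data owned by the caller.
    fn process<'a>(
        &self,
        it: Box<dyn Iterator<Item = i32> + 'a>,
    ) -> Box<dyn Iterator<Item = u32> + 'a>;
}

struct SkipOneTimesTwo;

impl Processor for SkipOneTimesTwo {
    fn process<'a>(
        &self,
        it: Box<dyn Iterator<Item = i32> + 'a>,
    ) -> Box<dyn Iterator<Item = u32> + 'a> {
        // Still lazy: nothing runs until the returned iterator is consumed.
        Box::new(it.skip(1).map(|i| 2 * (i as u32)))
    }
}

// Hypothetical helper: processes a borrowed slice without taking ownership.
fn process_borrowed(numbers: &[i32]) -> Vec<u32> {
    let pro = SkipOneTimesTwo;
    // `copied()` turns the `&i32` items from `iter()` into owned `i32` values.
    let out: Vec<u32> = pro.process(Box::new(numbers.iter().copied())).collect();
    out
}

fn main() {
    let numbers = vec![1, 2, 3, 4];
    assert_eq!(process_borrowed(&numbers), vec![4, 6, 8]);
    // The vector is still usable because it was only borrowed.
    assert_eq!(numbers.len(), 4);
}
```

With this signature the caller keeps ownership of numbers, at the cost of a slightly noisier trait definition.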

Related

Why explicitly non-dispatchable methods in Iterator are dispatchable?

The object-safety section of the Rust reference confused me for a while. It says:
Explicitly non-dispatchable functions require:
Have a where Self: Sized bound (receiver type of Self (i.e. self) implies this).
But I found that core::iter::Iterator has dozens of methods declared as explicitly non-dispatchable functions; one of them is shown below:
pub trait Iterator {
...
fn count(self) -> usize
where
Self: Sized,
{
self.fold(
0,
#[rustc_inherit_overflow_checks]
|count, _| count + 1,
)
}
...
}
However, all of them are dispatchable by trait-object at rust-playground:
fn main() {
let it: &mut dyn Iterator<Item = u32> = &mut [1, 2, 3].into_iter();
assert_eq!(3, it.count()); // ok
}
That looked good, so I tried to write a working example of my own, but it cannot be dispatched at rust-playground and fails with the expected compiler error "the dispatch method cannot be invoked on a trait object":
fn main() {
pub trait Sup {
fn dispatch(self) -> String
where
Self: Sized,
{
"sup".to_string()
}
}
struct Sub;
impl Sup for Sub {
fn dispatch(self) -> String {
"sub".to_string()
}
}
let it: &mut dyn Sup = &mut Sub;
assert_eq!("trait", it.dispatch());
}
Why are the explicitly non-dispatchable methods in core::iter::Iterator dispatchable? Is there some magic that I haven't found?
The reason is simple once we ask: what type are we invoking the method count on?
Is it dyn Iterator<Item = u32>? Let's check:
assert_eq!(3, <dyn Iterator<Item = u32>>::count(it));
Nope, there are two errors:
error[E0308]: mismatched types
--> src/main.rs:3:53
|
3 | assert_eq!(3, <dyn Iterator<Item = u32>>::count(it));
| ^^ expected trait object `dyn Iterator`, found mutable reference
|
= note: expected trait object `dyn Iterator<Item = u32>`
found mutable reference `&mut dyn Iterator<Item = u32>`
error[E0277]: the size for values of type `dyn Iterator<Item = u32>` cannot be known at compilation time
--> src/main.rs:3:53
|
3 | assert_eq!(3, <dyn Iterator<Item = u32>>::count(it));
| --------------------------------- ^^ doesn't have a size known at compile-time
| |
| required by a bound introduced by this call
|
= help: the trait `Sized` is not implemented for `dyn Iterator<Item = u32>`
note: required by a bound in `count`
OK, well... is it &mut dyn Iterator, then?
assert_eq!(3, <&mut dyn Iterator<Item = u32>>::count(it));
Now it compiles. It's understandable that the second error goes away - &mut T is always Sized. But why does &mut dyn Iterator have access to the methods of Iterator?
The answer is in the documentation. First, dyn Iterator obviously implements Iterator - that's true for any trait that can be made into a trait object. Second, there's an implementation of Iterator for any &mut I, where I: Iterator + ?Sized - which our dyn Iterator satisfies.
Now, one may ask - what code is executed when we use this implementation? After all, count requires consuming self, so calling it on a reference can't delegate to the dyn Iterator - otherwise we'd be back to square one, dispatching the non-dispatchable.
Here, the answer lies in the structure of the Iterator trait. As one can see, it has only a single required method, namely next, which takes &mut self; all other methods are provided, that is, they have default implementations built on next - for example, here is the one for count:
fn count(self) -> usize
where
Self: Sized,
{
self.fold(
0,
#[rustc_inherit_overflow_checks]
|count, _| count + 1,
)
}
where fold, in turn, is the following:
fn fold<B, F>(mut self, init: B, mut f: F) -> B
where
Self: Sized,
F: FnMut(B, Self::Item) -> B,
{
let mut accum = init;
while let Some(x) = self.next() {
accum = f(accum, x);
}
accum
}
As you can see, knowing just next, the compiler can derive fold and then count.
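To make this concrete, here is a small sketch with a made-up iterator (the Countdown type is hypothetical and exists only for this illustration). It implements nothing but next, yet count and fold are available through the provided methods:

```rust
// Hypothetical iterator that counts down from a starting value to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    // The only required method of the Iterator trait.
    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

fn main() {
    // `count` and `fold` come from the default implementations built on `next`.
    assert_eq!(Countdown(3).count(), 3);
    assert_eq!(Countdown(3).fold(0, |acc, x| acc + x), 6);
    assert_eq!(Countdown(3).collect::<Vec<_>>(), vec![3, 2, 1]);
}
```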
Now, back to our &mut dyn Iterators. Let's check what exactly this implementation looks like; it turns out to be quite simple:
#[stable(feature = "rust1", since = "1.0.0")]
impl<I: Iterator + ?Sized> Iterator for &mut I {
type Item = I::Item;
#[inline]
fn next(&mut self) -> Option<I::Item> {
(**self).next()
}
fn size_hint(&self) -> (usize, Option<usize>) {
(**self).size_hint()
}
fn advance_by(&mut self, n: usize) -> Result<(), usize> {
(**self).advance_by(n)
}
fn nth(&mut self, n: usize) -> Option<Self::Item> {
(**self).nth(n)
}
}
You can see that the &self and &mut self methods, i.e. the ones which can be called on the trait object, are forwarded by the reference to the inner value and dispatched dynamically.
The self methods, i.e. the ones which cannot be called on the trait object, are dispatched statically using their default implementations, which consume the reference and pass it, eventually, into one of these non-consuming, dynamically dispatched methods.
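The same pattern can be reproduced for a user-defined trait. The sketch below reworks the Sup example from the question under two assumptions that are not in the original: a required method name(&self) is added so that there is something to forward dynamically, and a forwarding impl for &mut T is written by hand, mirroring the impl of Iterator for &mut I. The function call_through_trait_object is a hypothetical helper.

```rust
trait Sup {
    // Required method; takes `&self`, so it can be dispatched dynamically.
    fn name(&self) -> String;

    // Explicitly non-dispatchable: consumes `self` and requires `Self: Sized`.
    fn dispatch(self) -> String
    where
        Self: Sized,
    {
        format!("dispatched {}", self.name())
    }
}

struct Sub;

impl Sup for Sub {
    fn name(&self) -> String {
        "sub".to_string()
    }
}

// Forwarding impl, analogous to `impl<I: Iterator + ?Sized> Iterator for &mut I`.
impl<T: Sup + ?Sized> Sup for &mut T {
    fn name(&self) -> String {
        (**self).name()
    }
}

// Hypothetical helper: here `Self` is `&mut dyn Sup`, which is `Sized`, so the
// provided `dispatch` is chosen statically and calls `name` dynamically.
fn call_through_trait_object(it: &mut dyn Sup) -> String {
    it.dispatch()
}

fn main() {
    assert_eq!(call_through_trait_object(&mut Sub), "dispatched sub");
}
```

Without the forwarding impl, the call in call_through_trait_object is rejected, which matches the error reported in the question.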

Is it possible to create a wrapper around an &mut that acts like an &mut

The following code fails to compile because MutRef is not Copy. It cannot be made Copy because &'a mut i32 is not Copy. Is there any way to give MutRef semantics similar to &'a mut i32?
The motivation for this is being able to package up a large set of function parameters into a struct so that they can be passed as a group instead of needing to be passed individually.
struct MutRef<'a> {
v: &'a mut i32
}
fn wrapper_use(s: MutRef) {
}
fn raw_use(s: &mut i32) {
}
fn raw_ref() {
let mut s: i32 = 9;
let q = &mut s;
raw_use(q);
raw_use(q);
}
fn wrapper() {
let mut s: i32 = 9;
let q = MutRef{ v: &mut s };
wrapper_use(q);
wrapper_use(q);
}
No.
The name for this feature is "implicit reborrowing" and it happens when you pass a &mut reference where the compiler expects a &mut reference of a possibly different lifetime. The compiler only implicitly reborrows when the actual type and the expected type are both &mut references. It does not work with generic arguments or structs that contain &mut references. There is no way in current Rust to make a custom type that can be implicitly reborrowed. There is an open issue about this limitation dating from 2015, but so far nobody has proposed any way to lift it.
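As a rough illustration of the difference, the sketch below contrasts an implicit reborrow with the explicit form &mut *q. The functions generic_use and reborrow_demo are hypothetical names introduced only for this example, and raw_use is given a body so the effect is observable.

```rust
fn raw_use(s: &mut i32) {
    *s += 1;
}

// Hypothetical generic function: the parameter type is a bare `T`, so the
// compiler does not reborrow implicitly here.
fn generic_use<T>(_t: T) {}

fn reborrow_demo() -> i32 {
    let mut s: i32 = 9;
    let q = &mut s;
    raw_use(q); // implicit reborrow: `q` is still usable afterwards
    raw_use(&mut *q); // the same reborrow, written out explicitly
    generic_use(&mut *q); // explicit reborrow; passing `q` itself would move it
    raw_use(q);
    s
}

fn main() {
    // Three calls to `raw_use`, each adding 1 to the initial 9.
    assert_eq!(reborrow_demo(), 12);
}
```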
You can always implement your own method to explicitly reborrow:
impl<'a> MutRef<'a> {
// equivalent to fn reborrow(&mut self) -> MutRef<'_>
fn reborrow<'b>(&'b mut self) -> MutRef<'b> {
MutRef {v: self.v}
}
}
fn wrapper() {
let mut s: i32 = 9;
let mut q = MutRef{ v: &mut s };
wrapper_use(q.reborrow()); // does not move q
wrapper_use(q); // moves q
}
See also
Why is the mutable reference not moved here?
Type inference and borrowing vs ownership transfer

IntoIterator as a function argument doesn't accept adapter struct

I want to have a function that accepts &IntoIterator<Item=u32>, so that I can pass it both &Vec<u32> and iterator adapter structs (like Map, Filter and others, which I believe all implement IntoIterator).
So I have a function like
pub fn f<'a, T>(it_src: &'a T) -> u32
where &'a T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32;
// more more usage
result
}
And this is how I tried to use it (same signature, but different name)
pub fn f_with_feature<'a, T>(it_src: &'a T) -> u32
where &'a T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(&adjusted_values)
}
What I've got is an error
error[E0308]: mismatched types
--> src\main.rs:14:7
|
14 | f(&adjusted_values)
| ^^^^^^^^^^^^^^^^ expected type parameter, found struct `std::iter::Map`
|
= note: expected type `&T`
found type `&std::iter::Map<<&T as std::iter::IntoIterator>::IntoIter, [closure#src\main.rs:13:14: 13:27]>`
Why doesn't Map match T?
I've also come to think that passing iterator adaptors with static dispatch may not be a good idea, since each distinct closure used to build a Map creates a new function specialization. Yet static dispatch seems to be the idiomatic approach in Rust most of the time. How should I handle this situation?
I think you want the trait bound on T (and not on &'a T). A reference to an adapter such as &Map<…> does not implement IntoIterator; only the adapter itself does, because every Iterator is also an IntoIterator. So I guess you actually want the following:
pub fn f<'a, T>(it_src: &'a T) -> u32
where T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<'a, T>(it_src: &'a T) -> u32
where T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(&adjusted_values)
}
Which brings us to the next problem: IntoIterator's into_iter consumes self, which means that you cannot call it_src.into_iter if you only borrow it_src.
So if you really want to use into_iter, you can try this:
pub fn f<T>(it_src: T) -> u32
where T: IntoIterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<T>(it_src: T) -> u32
where T: IntoIterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(adjusted_values)
}
The above, however, requires you to move the values into f and f_with_feature, respectively.
In my experience, just taking an iterator (and doing the conversion at the call site if necessary) leads to simple, straightforward solutions:
pub fn f<T>(it_src: T) -> u32
where T: Iterator<Item = u32> {
let it = it_src.into_iter();
let result: u32 = 1;
// more more usage
result
}
pub fn f_with_feature<T>(it_src: T) -> u32
where T: Iterator<Item = u32> {
let adjusted_values = it_src.into_iter()
.map(|e| adjust(e));
f(adjusted_values)
}
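For completeness, here is a self-contained sketch of the by-value IntoIterator variant together with call sites. The body of adjust (adding 1) and the use of sum as the "more usage" are placeholders chosen for illustration; the original does not say what either does.

```rust
// Placeholder for the `adjust` function, which the original does not define.
fn adjust(e: u32) -> u32 {
    e + 1
}

pub fn f<T>(it_src: T) -> u32
where
    T: IntoIterator<Item = u32>,
{
    // Placeholder for the elided usage: sum all items.
    it_src.into_iter().sum()
}

pub fn f_with_feature<T>(it_src: T) -> u32
where
    T: IntoIterator<Item = u32>,
{
    // `Map` is an `Iterator`, and every `Iterator` is an `IntoIterator`.
    f(it_src.into_iter().map(adjust))
}

fn main() {
    let v = vec![1u32, 2, 3];
    // Borrowing call sites: convert at the call site with `iter().copied()`.
    assert_eq!(f(v.iter().copied()), 6);
    assert_eq!(f_with_feature(v.iter().copied()), 9);
    // Owning call site: this moves `v` into `f`.
    assert_eq!(f(v), 6);
}
```

Doing the conversion at the call site keeps f itself free of lifetime parameters.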

Wrong inferred lifetime due to associated type

The following code sample is a minified version of a problem I have.
trait Offset: Default {}
trait Reader {
type Offset: Offset;
}
impl Offset for usize {}
impl<'a> Reader for &'a [u8] {
type Offset = usize;
}
// OK
// struct Header<R: Reader>(R, usize);
// Bad
struct Header<R: Reader>(R, R::Offset);
impl <R: Reader<Offset=usize>> Header<R> {
fn new(r: R) -> Self {
Header(r, 0)
}
}
fn test<R: Reader>(_: Header<R>, _: Header<R>) {}
fn main() {
let buf1 = [0u8];
let slice1 = &buf1[..];
let header1 = Header::new(slice1);
let buf2 = [0u8];
let slice2 = &buf2[..];
let header2 = Header::new(slice2);
test(header1, header2);
}
I currently have the code working using usize instead of the Offset associated type. I'm trying to generalize my code so it can work with other types for offset. However, adding this associated type has caused lots of existing code to stop compiling with errors like this:
error[E0597]: `buf2` does not live long enough
--> src/main.rs:37:1
|
33 | let slice2 = &buf2[..];
| ---- borrow occurs here
...
37 | }
| ^ `buf2` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
Reversing the order of header1 and buf2 fixes the problem for this example, but I don't want to have to make this change everywhere (and may not be able to), and I don't understand why it is a problem.
Cause
Variance is the cause of the problem.
In struct Header<R: Reader>(R, usize);, Header<R> is covariant w.r.t. R.
However, in struct Header<R: Reader>(R, R::Offset);, Header<R> is invariant w.r.t. R.
Subtyping, in this context, is the safe conversion of a longer lifetime into a shorter one. For example, &'static [u8] can be converted to &'a [u8].
Variance describes how subtyping is lifted to complex types. For example, if Header<_> is covariant and R is a subtype of S, then Header<R> is a subtype of Header<S>. This is not the case for invariant structs.
In current Rust, traits are always invariant in their parameters, because trait variance can neither be inferred nor specified in the current syntax. The same restriction applies to projected types like R::Offset.
In your code, since Header is invariant, Header<&'a [u8]> can't be converted to Header<&'b [u8]> even if 'a: 'b. Since fn test requires the same type for both arguments, the compiler requires the same lifetime for slice1 and slice2.
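To illustrate the covariant case only, here is a small sketch with a made-up struct (Covariant, same_lifetime and covariance_demo are hypothetical names, not taken from the question). Because the struct is covariant in its lifetime, a value borrowing from a longer-lived buffer can be passed where the shorter lifetime is expected:

```rust
// Hypothetical struct that is covariant in `'a`: it only holds a shared reference.
struct Covariant<'a>(&'a [u8]);

// Both arguments must have the same type, hence the same lifetime `'a`.
fn same_lifetime<'a>(a: Covariant<'a>, b: Covariant<'a>) -> usize {
    a.0.len() + b.0.len()
}

fn covariance_demo() -> usize {
    let long = [0u8; 3];
    let a = Covariant(&long[..]);
    let total = {
        let short = [0u8; 2];
        let b = Covariant(&short[..]);
        // `a` borrows `long`, which outlives `short`; covariance lets its
        // lifetime shrink to match the lifetime of `b`.
        same_lifetime(a, b)
    };
    total
}

fn main() {
    assert_eq!(covariance_demo(), 5);
}
```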
Solution
One possible ad-hoc solution is to generalize the signature for fn test, if it is feasible.
fn test<R: Reader, S: Reader>(_: Header<R>, _: Header<S>) {}
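Applied to the code from the question, the generalized signature might look like the following sketch. The function build_and_test is a hypothetical stand-in for main that returns a value so the result can be checked; everything else follows the question's definitions.

```rust
trait Offset: Default {}

trait Reader {
    type Offset: Offset;
}

impl Offset for usize {}

impl<'a> Reader for &'a [u8] {
    type Offset = usize;
}

struct Header<R: Reader>(R, R::Offset);

impl<R: Reader<Offset = usize>> Header<R> {
    fn new(r: R) -> Self {
        Header(r, 0)
    }
}

// Two independent type parameters: the arguments no longer have to share a lifetime.
fn test<R: Reader, S: Reader>(_: Header<R>, _: Header<S>) {}

// Hypothetical stand-in for `main` from the question.
fn build_and_test() -> usize {
    let buf1 = [0u8];
    let header1 = Header::new(&buf1[..]);
    let buf2 = [0u8];
    let header2 = Header::new(&buf2[..]);
    let total = header1.1 + header2.1;
    test(header1, header2);
    total
}

fn main() {
    assert_eq!(build_and_test(), 0);
}
```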
Another solution is to make Header covariant somehow.
Maybe it is safe to assume Header to be covariant if type Offset has 'static bound, but the current compiler doesn't do such a clever inference.
Perhaps you can split out lifetimes as a parameter for Header. This recovers covariance.
trait Offset: Default {}
trait Reader {
type Offset: Offset;
}
impl Offset for usize {}
impl Reader for [u8] {
type Offset = usize;
}
struct Header<'a, R: Reader + ?Sized + 'a>(&'a R, R::Offset);
impl <'a, R: Reader<Offset=usize> + ?Sized> Header<'a, R> {
fn new(r: &'a R) -> Self {
Header(r, 0)
}
}
fn test<R: Reader + ?Sized>(_: Header<R>, _: Header<R>) {}
fn main() {
let buf1 = [0u8];
let slice1 = &buf1[..];
let header1 = Header::new(slice1);
let buf2 = [0u8];
let slice2 = &buf2[..];
let header2 = Header::new(slice2);
test(header1, header2);
}

Index and IndexMut implementations to return borrowed vectors

I've been working on a multi-dimensional array library, toying around with different interfaces, and ran into an issue I can't seem to solve. This may be a simple misunderstanding of lifetimes, but I've tried just about every solution I can think of, to no success.
The goal: implement the Index and IndexMut traits to return a borrowed vector from a 2d matrix, so this syntax can be used mat[rowind][colind].
A (very simplified) version of the data structure definition is below.
pub struct Matrix<T> {
shape: [uint, ..2],
dat: Vec<T>
}
impl<T: FromPrimitive+Clone> Matrix<T> {
pub fn new(shape: [uint, ..2]) -> Matrix<T> {
let size = shape.iter().fold(1, |a, &b| { a * b});
// println!("Creating MD array of size: {} and shape: {}", size, shape)
Matrix{
shape: shape,
dat: Vec::<T>::from_elem(size, FromPrimitive::from_uint(0u).expect("0 must be convertible to parameter type"))
}
}
pub fn mut_index(&mut self, index: uint) -> &mut [T] {
let base = index*self.shape[1];
self.dat.mut_slice(base, base + self.shape[1])
}
}
fn main(){
let mut m = Matrix::<f32>::new([4u,4]);
println!("{}", m.dat)
println!("{}", m.mut_index(3)[0])
}
The mut_index method works exactly as I would like the IndexMut trait to work, except of course that it doesn't have the syntax sugar. The first attempt at implementing IndexMut made me wonder, since it returns a borrowed reference to the specified type, I really want to specify [T] as a type, but it isn't a valid type. So the only option is to specify &mut [T] like this.
impl<T: FromPrimitive+Clone> IndexMut<uint, &mut [T]> for Matrix<T> {
fn index_mut(&mut self, index: &uint) -> &mut(&mut[T]) {
let base = index*self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
This complains about a missing lifetime specifier on the trait impl line. So I try adding one.
impl<'a, T: FromPrimitive+Clone> IndexMut<uint, &'a mut [T]> for Matrix<T> {
fn index_mut(&'a mut self, index: &uint) -> &mut(&'a mut[T]) {
let base = index*self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
Now I get method `index_mut` has an incompatible type for trait: expected concrete lifetime, but found bound lifetime parameter 'a [E0053]. Aside from this I've tried just about every combination of one and two lifetimes I can think of, as well as creating a secondary structure to hold a reference that is stored in the outer structure during the indexing operation so a reference to that can be returned instead, but that's not possible for Index. The final answer may just be that this isn't possible, given the response on this old github issue, but that would seem to be a problematic limitation of the Index and IndexMut traits. Is there something I'm missing?
At present, this is not possible, but when Dynamically Sized Types lands I believe it will become possible.
Let’s look at the signature:
pub trait IndexMut<Index, Result> {
fn index_mut<'a>(&'a mut self, index: &Index) -> &'a mut Result;
}
(Note the addition of the <'a> compared with what the docs say; I’ve filed #16228 about that.)
'a is an arbitrary lifetime, but it is important that it is specified on the method, not on the impl as a whole: it is in absolute truth a generic parameter to the method. I’ll show how it all comes out here with the names 'ρ₀ and 'ρ₁. So then, in this attempt:
impl<'ρ₀, T: FromPrimitive + Clone> IndexMut<uint, &'ρ₀ mut [T]> for Matrix<T> {
fn index_mut<'ρ₁>(&'ρ₁ mut self, index: &uint) -> &'ρ₁ mut &'ρ₀ mut [T] {
let base = index * self.shape[1];
&mut self.dat.mut_slice(base, base + self.shape[1])
}
}
This satisfies the requirements that (a) all lifetimes must be explicit in the impl header, and (b) that the method signature matches the trait definition: Index is uint and Result is &'ρ₀ mut [T]. Because 'ρ₀ is defined on the impl block (so that it can be used as a parameter there) and 'ρ₁ on the method (because that’s what the trait defines), 'ρ₀ and 'ρ₁ cannot be combined into a single named lifetime. (You could call them both 'a, but this is shadowing and does not change anything except for the introduction of a bit more confusion!)
However, this is not enough to make it all work, and it will indeed not compile, because 'ρ₀ is not tied to anything, nor is there anything in the signature to tie it to. And so you cannot cast self.dat.mut_slice(…), which is of type &'ρ₁ mut [T], to &'ρ₀ mut [T], as the lifetimes do not match, nor is there any known subtyping relationship between them (that is, it cannot structurally be demonstrated that the lifetime 'ρ₀ is less than, i.e. a subtype of, 'ρ₁; although the return type of the method would make that clear, it is not so at the basic type level, and so it is not permitted).
Now as it happens, IndexMut isn’t as useful as it should be anyway owing to #12825, as matrix[1] would always use IndexMut and never Index if you have implemented both. I’m not sure if that’s any consolation, though!
The solution comes in Dynamically Sized Types. When that is here, [T] will be a legitimate unsized type which can be used as the type for Result and so this will be the way to write it:
impl<T: FromPrimitive + Clone> IndexMut<uint, [T]> for Matrix<T> {
fn index_mut<'a>(&'a mut self, index: &uint) -> &'a mut [T] {
let base = index * self.shape[1];
self.dat.mut_slice(base, base + self.shape[1])
}
}
… but that’s not here yet.
This code works in Rust 1.25.0 (and probably has for quite a while)
extern crate num;
use num::Zero;
pub struct Matrix<T> {
shape: [usize; 2],
dat: Vec<T>,
}
impl<T: Zero + Clone> Matrix<T> {
pub fn new(shape: [usize; 2]) -> Matrix<T> {
let size = shape.iter().product();
Matrix {
shape: shape,
dat: vec![T::zero(); size],
}
}
pub fn mut_index(&mut self, index: usize) -> &mut [T] {
let base = index * self.shape[1];
&mut self.dat[base..][..self.shape[1]]
}
}
fn main() {
let mut m = Matrix::<f32>::new([4; 2]);
println!("{:?}", m.dat);
println!("{}", m.mut_index(3)[0]);
}
You can enhance it to support Index and IndexMut:
use std::ops::{Index, IndexMut};
impl<T> Index<usize> for Matrix<T> {
type Output = [T];
fn index(&self, index: usize) -> &[T] {
let base = index * self.shape[1];
&self.dat[base..][..self.shape[1]]
}
}
impl<T> IndexMut<usize> for Matrix<T> {
fn index_mut(&mut self, index: usize) -> &mut [T] {
let base = index * self.shape[1];
&mut self.dat[base..][..self.shape[1]]
}
}
fn main() {
let mut m = Matrix::<f32>::new([4; 2]);
println!("{:?}", m.dat);
println!("{}", m[3][0]);
m[3][0] = 42.42;
println!("{:?}", m.dat);
println!("{}", m[3][0]);
}
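If pulling in the num crate is not desired, a std-only variant is possible. The sketch below assumes that T: Default + Clone is an acceptable replacement for Zero (Default::default() is 0.0 for f32, but it is not guaranteed to mean "zero" for every type); that bound is a substitution made for this example, not part of the answer above.

```rust
use std::ops::{Index, IndexMut};

pub struct Matrix<T> {
    shape: [usize; 2],
    dat: Vec<T>,
}

impl<T: Default + Clone> Matrix<T> {
    // Assumption: `T::default()` stands in for `T::zero()` from the `num` crate.
    pub fn new(shape: [usize; 2]) -> Matrix<T> {
        let size: usize = shape.iter().product();
        Matrix {
            shape,
            dat: vec![T::default(); size],
        }
    }
}

impl<T> Index<usize> for Matrix<T> {
    // The output is the unsized slice type; `index` returns a reference to it.
    type Output = [T];

    fn index(&self, row: usize) -> &[T] {
        let base = row * self.shape[1];
        &self.dat[base..base + self.shape[1]]
    }
}

impl<T> IndexMut<usize> for Matrix<T> {
    fn index_mut(&mut self, row: usize) -> &mut [T] {
        let base = row * self.shape[1];
        &mut self.dat[base..base + self.shape[1]]
    }
}

fn main() {
    let mut m = Matrix::<f32>::new([4, 4]);
    m[3][0] = 42.5;
    assert_eq!(m[3][0], 42.5);
    // Row 3, column 0 of a 4x4 matrix lives at flat index 12.
    assert_eq!(m.dat[12], 42.5);
    assert_eq!(m[0].len(), 4);
}
```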
