How can I construct and pass an iterator of iterators? - rust

I am trying to grok Rust by implementing simple algorithms in it. I managed to make a generic merge_sorted, which ended up having the following signature:
fn merge_sorted<IL, ILL, I: Ord>(mut arrays: ILL) -> Vec<I>
where
    IL: Iterator<Item = I>,
    ILL: Iterator<Item = IL>,
{
    // ...
}
This compiles on its own. The signature makes sense to me: the function consumes the top-level iterator and all the iterators it yields. However, I am unable to construct a valid value to pass to this function:
fn main() {
    let v1 = vec![1, 2];
    let vectors = vec![v1.iter()];
    merge_sorted(vectors.iter());
}
As expected, vectors in this sample has the type:
std::vec::Vec<std::slice::Iter<'_, i32>>
This is the error message I get:
error[E0277]: the trait bound `&std::slice::Iter<'_, {integer}>: std::iter::Iterator` is not satisfied
--> src\main.rs:58:5
|
58 | merge_sorted(vectors.iter());
| ^^^^^^^^^^^^ `&std::slice::Iter<'_, {integer}>` is not an iterator; maybe try calling `.iter()` or a similar method
|
= help: the trait `std::iter::Iterator` is not implemented for `&std::slice::Iter<'_, {integer}>`
note: required by `merge_sorted`
Where does the & come from?

Vec::iter borrows the items the vector contains, so you are iterating over borrowed iterators (&std::slice::Iter), and those references do not implement Iterator. To consume the vector and take ownership of its items, call Vec::into_iter instead:
fn main() {
    let v1 = vec![1, 2];
    let vectors = vec![v1.iter()]; // You can use `into_iter` there to iterate over ints.
    merge_sorted(vectors.into_iter());
}
You can also accept IntoIterator instead of Iterator, which makes your API easier to use:
fn merge_sorted<IterT, IterIterT, T: Ord>(mut arrays: IterIterT) -> Vec<T>
where
    IterT: IntoIterator<Item = T>,
    IterIterT: IntoIterator<Item = IterT>,
{
    panic!();
}
fn main() {
    let v1 = vec![1, 2];
    let vectors = vec![v1];
    merge_sorted(vectors);
}
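The answer leaves the function body out (it uses panic!() as a placeholder). As a rough sketch of how the IntoIterator-based signature could be filled in, the version below keeps one "head" element per input and repeatedly takes the smallest one. The body and the local names (iters, heads, merged) are my own assumptions, not the asker's implementation, and a linear scan is used instead of a heap to keep it short.

```rust
/// Merges already-sorted inputs into a single sorted `Vec`.
/// Sketch only: scans all heads on every step instead of using a heap.
fn merge_sorted<IterT, IterIterT, T: Ord>(arrays: IterIterT) -> Vec<T>
where
    IterT: IntoIterator<Item = T>,
    IterIterT: IntoIterator<Item = IterT>,
{
    // Turn every inner value into an iterator that can be advanced step by step.
    let mut iters: Vec<IterT::IntoIter> =
        arrays.into_iter().map(IntoIterator::into_iter).collect();
    // `heads[i]` holds the next unmerged element of `iters[i]`, if any.
    let mut heads: Vec<Option<T>> = iters.iter_mut().map(|it| it.next()).collect();
    let mut merged = Vec::new();

    loop {
        // Index of the input whose current head is smallest (first one wins on ties).
        let smallest = heads
            .iter()
            .enumerate()
            .filter_map(|(i, head)| head.as_ref().map(|value| (i, value)))
            .min_by(|a, b| a.1.cmp(b.1))
            .map(|(i, _)| i);

        match smallest {
            Some(i) => {
                // Move the head into the output and refill it from the same input.
                if let Some(value) = heads[i].take() {
                    merged.push(value);
                }
                heads[i] = iters[i].next();
            }
            // Every input is exhausted.
            None => break,
        }
    }
    merged
}
```

Because the bounds are IntoIterator, the same function accepts owned data such as vec![vec![1, 4], vec![2, 3]] as well as a Vec of borrowed iterators such as vec![v1.iter(), v2.iter()]; in the second case the result is a Vec<&i32>.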

Related

How are return values of type `impl Trait` borrow-checked?

The following code fails to compile:
fn foo<'a, F: Fn() -> &'a str>(vec: Vec<i32>, fun: F) -> impl Iterator<Item = i32> {
    println!("{}", fun());
    vec.into_iter()
}
fn main() {
    let s = "hello, world!".to_string();
    let iter = foo(vec![1, 2, 3], || &s);
    drop(s);
    for x in iter {
        println!("{}", x);
    }
}
error[E0505]: cannot move out of `s` because it is borrowed
--> src/main.rs:9:10
|
8 | let iter = foo(vec![1, 2, 3], || &s);
| -- - borrow occurs due to use in closure
| |
| borrow of `s` occurs here
9 | drop(s);
| ^ move out of `s` occurs here
10 |
11 | for x in iter {
| ---- borrow later used here
It compiles if I replace foo's signature with
fn foo<'a, F: Fn() -> &'a str>(vec: Vec<i32>, fun: F) -> <Vec<i32> as IntoIterator>::IntoIter {
    // ...
}
It makes me believe that impl Trait types are borrow-checked more conservatively: the compiler assumes that the returned object captures fun even though it doesn't.
However, this interesting example compiles fine:
fn foo(s: &str) -> impl Iterator<Item = i32> {
    println!("{}", s);
    vec![1, 2, 3].into_iter()
}
fn main() {
    let s = "hello, world!".to_string();
    let iter = foo(&s);
    drop(s);
    for x in iter {
        println!("{}", x);
    }
}
Here the compiler seems not to assume that the returned impl Iterator<Item = i32> borrows s.
How exactly are returned impl Trait types borrow-checked? When are they assumed to borrow other function arguments, like in the first case? When are they assumed not to, like in the latter case?
I believe this issue comment tells the story here. It sounds like it's an intentional limitation of the type system to be conservative, but I agree with the issue author that this would be good to be able to opt out of:
The core reason for this behaviour is the variance behaviour of impl Trait in return position: the returned impl Trait is always variant over all generic input parameters, even if not technically used. This is done so that if you change the internal implementation of a public API returning impl Trait you don't have to worry about introducing an additional variance param to the API. This could break downstream code and is thus not desirable for Rust's semver system.
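To make the consequence concrete, here is a small sketch of two ways to keep the generic parameter F out of the return type, so the borrow held by the closure ends when the call returns. The function names foo_concrete and foo_boxed are hypothetical. The first mirrors the concrete-type signature from the question; the second is an assumption of mine that relies on Box<dyn Iterator<Item = i32>> defaulting to a 'static bound in return position.

```rust
// Naming the concrete iterator type: nothing in the return type mentions `F`,
// so the borrow checker does not tie the result to the closure.
fn foo_concrete<'a, F: Fn() -> &'a str>(vec: Vec<i32>, fun: F) -> std::vec::IntoIter<i32> {
    println!("{}", fun());
    vec.into_iter()
}

// Boxing as a trait object: the implicit `'static` bound on the trait object
// means the result cannot hold on to anything borrowed through `fun`.
fn foo_boxed<'a, F: Fn() -> &'a str>(vec: Vec<i32>, fun: F) -> Box<dyn Iterator<Item = i32>> {
    println!("{}", fun());
    Box::new(vec.into_iter())
}
```

With either version, a caller can pass a closure that borrows a String, drop the String right after the call, and still iterate over the result. The trade-off is that the concrete type leaks an implementation detail into the signature, while the boxed version costs an allocation and dynamic dispatch.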

Why do I get the error "the trait `Iterator` is not implemented" for a reference to a generic type even though it implements IntoIterator?

In the following example, MyTrait extends IntoIterator but the compiler doesn't recognize it when used in a loop.
pub trait MyTrait: IntoIterator<Item = i32> {
    fn foo(&self);
}
pub fn run<M: MyTrait>(my: &M) {
    for a in my {
        println!("{}", a);
    }
}
I get the error:
error[E0277]: `&M` is not an iterator
--> src/lib.rs:6:14
|
6 | for a in my {
| ^^ `&M` is not an iterator
|
= help: the trait `Iterator` is not implemented for `&M`
= note: required because of the requirements on the impl of `IntoIterator` for `&M`
= note: required by `into_iter`
Only M implements IntoIterator, but you're trying to iterate over a &M, which doesn't have to.
It's not clear what you hope to achieve with run, but removing the reference might be a start:
pub fn run<M: MyTrait>(my: M) {
    for a in my {
        println!("{}", a);
    }
}
Note that M itself may be (or contain) a reference, so writing it this way doesn't mean you can't use it with borrowed data. Here's one way to use run to iterate over a &Vec:
impl<I> MyTrait for I
where
    I: IntoIterator<Item = i32>,
{
    fn foo(&self) {}
}
fn main() {
    let v = vec![10, 12, 20];
    run(v.iter().copied());
}
This uses .copied() to turn an Iterator<Item = &i32> into an Iterator<Item = i32>.
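If the function really has to take &M, another option is to put the IntoIterator bound on the reference itself. This is not what the answer above does; it is a sketch of an alternative, and the function name sum_by_ref is hypothetical. It returns a sum rather than printing so the effect is easy to check.

```rust
// The bound says: for every lifetime 'a, a shared reference `&'a M` can be
// iterated and yields `&'a i32`. `Vec<i32>` and `[i32; N]` satisfy this.
pub fn sum_by_ref<M>(my: &M) -> i32
where
    for<'a> &'a M: IntoIterator<Item = &'a i32>,
{
    let mut total = 0;
    // `my` is a `&M`, which is exactly the type the bound talks about.
    for a in my {
        total += *a;
    }
    total
}
```

The collection is only borrowed, so a caller can keep using it after sum_by_ref(&v) returns. The cost is a more verbose bound that a supertrait like MyTrait: IntoIterator<Item = i32> cannot express on its own.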
Related questions
Why is a borrowed range not an iterator, but the range is?
Am I incorrectly implementing IntoIterator for a reference or is this a Rust bug that should be reported?
How to implement Iterator and IntoIterator for a simple struct?
What is an idiomatic way to collect an iterator of &T into a collection of Ts?
How to properly pass Iterators to a function in Rust

How to write trait & impl with lifetimes for iterators?

I'm trying to understand how to write a trait and an impl for it for my own types that will process some input data. I'm starting with a simple example where I want to process the input 1, 2, 3, 4 with a trait Processor. One implementation will skip the first element and double all remaining inputs. It should therefore look like this:
trait Processor {} // TBD
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {} // TBD
let numbers = vec![1, 2, 3, 4];
let it = numbers.iter();
let it = Box::new(it);
let proc = SkipOneTimesTwo;
let four_to_eight = proc.process(it);
assert_eq!(Some(4), four_to_eight.next());
assert_eq!(Some(6), four_to_eight.next());
assert_eq!(Some(8), four_to_eight.next());
assert_eq!(None, four_to_eight.next());
So my assumption is that my trait and the corresponding implementation would look like this:
trait Processor {
    // Arbitrarily convert from `i32` to `u32`
    fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>>;
}
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {
    fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>> {
        let p = it.skip(1).map(|i| 2 * (i as u32));
        Box::new(p)
    }
}
This code doesn't work as-is. I get the following error:
7 | let four_to_eight = proc.process(it);
| ^^ expected `i32`, found reference
|
= note: expected type `i32`
found reference `&{integer}`
= note: required for the cast to the object type `dyn Iterator<Item = i32>`
If my input data were very large, I wouldn't want the entire dataset to be kept in-memory (the whole point of using Iterator), so I assume that using Iterator<T> should stream data through from the original source of input until it is eventually aggregated or otherwise handled. I don't know what this means, however, in terms of what lifetimes I need to annotate here.
Eventually, my Processor may hold some intermediate data from the input (eg, for a running average calculation), so I will probably have to specify a lifetime on my struct.
Working with some of the compiler errors, I've tried adding 'a, 'static, and '_ lifetimes to my dyn Iterator<...>, but I can't quite figure out how to pass along an input iterator and modify the values lazily.
Is this even a reasonable approach? I could probably store the input Iterator<Item = i32> in my struct and impl Iterator<Item = u32> for SkipOneTimesTwo, but then I would presumably lose some of the abstraction of being able to pass around the Processor trait.
All iterators in Rust are lazy, so nothing is processed until next() is called. You also don't need lifetimes here: use into_iter() instead of iter(), so the iterator yields i32 rather than &i32, and your code compiles:
trait Processor {
    fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>>;
}
struct SkipOneTimesTwo;
impl Processor for SkipOneTimesTwo {
    fn process(&self, it: Box<dyn Iterator<Item = i32>>) -> Box<dyn Iterator<Item = u32>> {
        let p = it.skip(1).map(|i| 2 * (i as u32));
        Box::new(p)
    }
}
fn main() {
    let numbers = vec![1, 2, 3, 4];
    let it = numbers.into_iter(); // only change here
    let it = Box::new(it);
    let pro = SkipOneTimesTwo;
    let mut four_to_eight = pro.process(it);
    assert_eq!(Some(4), four_to_eight.next());
    assert_eq!(Some(6), four_to_eight.next());
    assert_eq!(Some(8), four_to_eight.next());
    assert_eq!(None, four_to_eight.next());
}
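The into_iter() version consumes the Vec. If the input should only be borrowed, one possible variation (my assumption, not part of the answer above) is to give the boxed iterators an explicit lifetime and feed in numbers.iter().copied(). The trait and struct names are reused from the question; the 'a parameter on process is the only change.

```rust
trait Processor {
    // `+ 'a` lets the boxed iterator borrow from data that lives for 'a,
    // instead of the implicit `'static` bound of a bare `Box<dyn Iterator>`.
    fn process<'a>(
        &self,
        it: Box<dyn Iterator<Item = i32> + 'a>,
    ) -> Box<dyn Iterator<Item = u32> + 'a>;
}

struct SkipOneTimesTwo;

impl Processor for SkipOneTimesTwo {
    fn process<'a>(
        &self,
        it: Box<dyn Iterator<Item = i32> + 'a>,
    ) -> Box<dyn Iterator<Item = u32> + 'a> {
        // Still lazy: nothing is skipped or doubled until the caller pulls items.
        Box::new(it.skip(1).map(|i| 2 * (i as u32)))
    }
}
```

A caller would then write SkipOneTimesTwo.process(Box::new(numbers.iter().copied())) and can keep using numbers afterwards, since the Vec is only borrowed while the returned iterator is alive.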

Why does adding mut to passed Iterator reference solve this?

In the following Rust snippet:
fn main() {
    let list1: Vec<i32> = vec![0, 1, 2, 3, 4];
    let it1 = list1.iter();
    let tens = it1.map(|x| x * 10).collect::<Vec<i32>>();
    println!("{:?}", tens);
    let it2 = list1.iter();
    let doubled_from_iter = scale_it_iter(&it2);
    println!("{:?}", doubled_from_iter);
}
fn scale_it_iter(l: &dyn Iterator<Item = &i32>) -> Vec<i32> {
    l.map(|x| x * 2).collect()
}
I get this error:
error: the `map` method cannot be invoked on a trait object
--> src/main.rs:15:7
|
15 | l.map(|x| x * 2).collect()
| ^^^
|
::: /home/xolve/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:625:15
|
625 | Self: Sized,
| ----- this has a `Sized` requirement
|
= note: you need `&mut dyn Iterator<Item = &i32>` instead of `&dyn Iterator<Item = &i32>`
error: aborting due to previous error
Adding mut as suggested by the compiler solves this.
I do not understand why it is needed. It's not needed in main when I call it1.map.
I don't understand the error messages.
the `map` method cannot be invoked on a trait object is solved by adding mut to the trait reference. This seems contradictory.
How is the message about the Sized trait bound related to the error?
The "the `map` method cannot be invoked on a trait object" and "this has a `Sized` requirement" messages appear because map() takes the iterator by value: its signature requires Self: Sized. A dyn Trait is an unsized type, so it cannot be passed to a function by value and therefore cannot be consumed.
It works for it1 because 1) it's not a trait object, it's the concrete type Iter, and 2) it's not a reference, so it is consumed.
&mut dyn Iterator works because a mutable reference to an iterator itself implements Iterator, and a reference is Sized. Calling map() then consumes only the reference, and the underlying iterator is advanced through it. A shared &dyn Iterator cannot do this, because next() needs mutable access.
If you want to follow convention, I'd make scale_it_iter consume the iterator like so:
fn scale_it_iter<'a>(l: impl Iterator<Item = &'a i32>) -> Vec<i32> {
    l.map(|x| x * 2).collect()
}
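For comparison, here is a minimal sketch of both variants side by side. The name scale_it_dyn is hypothetical; scale_it_iter is the signature suggested above.

```rust
// Trait-object version: `map` is called on the `&mut dyn Iterator` itself,
// which is Sized and implements Iterator by forwarding to the inner iterator.
fn scale_it_dyn(l: &mut dyn Iterator<Item = &i32>) -> Vec<i32> {
    l.map(|x| x * 2).collect()
}

// Generic version: takes the iterator by value, like most iterator-consuming APIs.
fn scale_it_iter<'a>(l: impl Iterator<Item = &'a i32>) -> Vec<i32> {
    l.map(|x| x * 2).collect()
}
```

With scale_it_dyn(&mut it) the caller keeps ownership of it, but it has been advanced to the end once the call returns. scale_it_iter(list.iter()) takes ownership of the iterator and is monomorphized per iterator type, so it avoids dynamic dispatch.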

Why does an Iterator trait object referring to a sibling field fail to compile when the concrete type works?

I'd like to have an iterator that points into a Vec of the same struct.
The following works fine:
struct Holder1<'a> {
    vec: Vec<i32>,
    iterator: Option<Box<std::slice::Iter<'a, i32>>>,
}
fn holder1_test() {
    let vec = vec![1, 2, 3, 4];
    let mut holder = Holder1 {
        vec,
        iterator: None,
    };
    let iterator: Box<std::slice::Iter<'_, i32>> = Box::new(holder.vec.iter());
    holder.iterator = Some(iterator);
    for iter_elem in holder.iterator.as_mut().unwrap() {
        println!("iter: {}", iter_elem);
    }
}
(I know the Box isn't needed here, I just wanted to keep this as close as possible to the next code snippet.)
I'd like to use a trait object, dyn Iterator, instead of the concrete type. I've slightly modified the example from above for that:
struct Holder2<'a> {
    vec: Vec<i32>,
    iterator: Option<Box<dyn Iterator<Item = &'a i32>>>,
}
fn holder2_test() {
    let vec = vec![1, 2, 3, 4];
    let mut holder = Holder2 {
        vec,
        iterator: None,
    };
    let iterator: Box<dyn Iterator<Item = &'_ i32>> = Box::new(holder.vec.iter());
    holder.iterator = Some(iterator);
    for iter_elem in holder.iterator.as_mut().unwrap() {
        println!("iter: {}", iter_elem);
    }
}
This fails to compile:
error[E0597]: `holder.vec` does not live long enough
--> src/lib.rs:12:64
|
12 | let iterator: Box<dyn Iterator<Item = &'_ i32>> = Box::new(holder.vec.iter());
| ^^^^^^^^^^ borrowed value does not live long enough
...
18 | }
| -
| |
| `holder.vec` dropped here while still borrowed
| borrow might be used here, when `holder` is dropped and runs the destructor for type `Holder2<'_>`
What makes the second example so different to the first example that causes the compilation failure? Both iterators point to an element in the Vec of the same struct - so what's the conceptual difference here? Is there a way to get this to work with trait objects?
I'm aware that using an index instead of an iterator would solve this, but I'm rather interested in the underlying reasons of why this doesn't work.
