Create peek_while for iterator - rust

I'm trying to create a peek_while method for my iterator. It should behave like take_while, but consume only the items that match the predicate, leaving the first non-matching item in the iterator. I've taken some inspiration from https://stackoverflow.com/a/30540952/10315665 and from the actual take_while source code https://github.com/rust-lang/rust/blob/2c7bc5e33c25e29058cbafefe680da8d5e9220e9/library/core/src/iter/adapters/take_while.rs#L42-L54 and arrived at this result:
pub struct PeekWhile<I: Iterator, P> {
iter: Peekable<I>,
predicate: P,
}
impl<I, P> fmt::Debug for PeekWhile<I, P>
where
I: Iterator + Debug,
<I as Iterator>::Item: Debug,
{
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("PeekWhile")
.field("iter", &self.iter)
.finish()
}
}
impl<I, P> Iterator for PeekWhile<I, P>
where
I: Iterator,
P: FnMut(&I::Item) -> bool,
{
type Item = I::Item;
fn next(&mut self) -> Option<I::Item> {
let n = self.iter.peek()?;
while let Some(n) = self.iter.peek() {
if (self.predicate)(&n) {
return self.iter.next();
} else {
break;
}
}
None
}
}
pub trait PeekWhileExt: Iterator {
fn peek_while<P>(self, predicate: P) -> PeekWhile<Self, P>
where
Self: Iterator,
Self: Sized,
P: FnMut(&Self::Item) -> bool,
{
PeekWhile {
iter: self.peekable(),
predicate,
}
}
}
impl<I: Iterator> PeekWhileExt for I {}
This is causing an infinite loop, and while I'm not sure why, I see that the take_while::next method does not have a loop, so I changed that to:
let n = self.iter.peek()?;
if (self.predicate)(n) {
Some(n)
} else {
None
}
which now gives me:
mismatched types
expected associated type `<I as Iterator>::Item`
found reference `&<I as Iterator>::Item`
So how would I go about creating such an iterator? Is the code so far correct, and how do I complete it? I know that itertools has https://docs.rs/itertools/0.10.1/itertools/trait.Itertools.html#method.peeking_take_while which sounds promising, but this is a learning project, both in programming concepts and in Rust itself (I'm creating a JSON parser, by the way), so I would very much like to complete that portion of the code without any libraries.
Example use case (not tested):
let chars = "keyword:".chars();
assert_eq!(chars.peek_while(|c| c.is_alphabetic()).collect::<String>(), "keyword");
assert_eq!(chars.next().unwrap(), ':');
// ^
// Very important that the next char doesn't get lost
Appreciate any help! 😊
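One possible way to get this behaviour without a crate is sketched below. It assumes the caller already holds a Peekable and that the standard library's Peekable::next_if (Rust 1.51 or later) is available. The names PeekWhile and PeekWhileExt follow the question; the idea of borrowing the Peekable mutably instead of owning it is an assumption of this sketch, chosen so that the first non-matching item stays in the underlying iterator after the adapter is dropped.

```rust
use std::iter::Peekable;

/// Yields items while the predicate holds. The first non-matching item
/// is left inside the borrowed `Peekable`.
pub struct PeekWhile<'a, I: Iterator, P> {
    iter: &'a mut Peekable<I>,
    predicate: P,
}

impl<'a, I, P> Iterator for PeekWhile<'a, I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // `next_if` consumes the next item only when the closure returns true.
        let predicate = &mut self.predicate;
        self.iter.next_if(|item| predicate(item))
    }
}

/// Extension trait (hypothetical shape) implemented for `Peekable` only.
pub trait PeekWhileExt<I: Iterator> {
    fn peek_while<P>(&mut self, predicate: P) -> PeekWhile<'_, I, P>
    where
        P: FnMut(&I::Item) -> bool;
}

impl<I: Iterator> PeekWhileExt<I> for Peekable<I> {
    fn peek_while<P>(&mut self, predicate: P) -> PeekWhile<'_, I, P>
    where
        P: FnMut(&I::Item) -> bool,
    {
        PeekWhile { iter: self, predicate }
    }
}
```

With this shape the use case would start from "keyword:".chars().peekable(), declared as mut, and call peek_while on that binding; the ':' is then still available from next() afterwards.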

Related

What is the idiomatic way to implement `IntoIterator` when some items need to be substituted?

I have a custom collection like this:
struct VecChoice<T> {
v1: Vec<T>,
v2: Vec<T>,
use_v1: Vec<bool>,
}
In the impl, I can iterate over this collection like this:
fn foo(&self, ...) {
let item_refs: Vec<_> = (0..self.v1.len()).map(|i| {
if self.use_v1[i] {
&self.v1[i]
} else {
&self.v2[i]
}
}).collect();
// ... do whatever I want with chosen references
}
However, I am failing to make it iterable:
impl<'a, T> IntoIterator for &'a VecChoice<T> {
type Item = &'a T;
// this fails because the trait `Sized` is not implemented for `(dyn FnMut(usize) -> Self::Item + 'static)`
type IntoIter = Map<usize, dyn FnMut(usize) -> Self::Item>;
fn into_iter(self) -> Self::IntoIter {
(0..self.v1.len()).map(|i| {
if self.use_v1[i] {
&self.v1[i]
} else {
&self.v2[i]
}
})
}
}
I could probably collect the results into a Vec<&T> as above and then use its into_iter, but I suspect there should be a way to do it without constructing an intermediate Vec.
The closure that you have passed to map actually does have a size. The problem though is that this type isn't nameable. You've tried to solve that with dyn, which isn't quite the right solution because the closure is sized but dyn makes it so that it isn't. dyn would be appropriate if there were different possible sizes, but then you'd have to put it behind a pointer of some kind so that the IntoIter type is Sized.
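To illustrate the "behind a pointer" remark, here is a minimal sketch that boxes the iterator as a trait object so that IntoIter becomes a Sized, nameable type. It is only one option: it costs a heap allocation and dynamic dispatch per iterator. The struct definition is repeated from the question so that the block stands alone.

```rust
struct VecChoice<T> {
    v1: Vec<T>,
    v2: Vec<T>,
    use_v1: Vec<bool>,
}

impl<'a, T> IntoIterator for &'a VecChoice<T> {
    type Item = &'a T;
    // A `Box` has a known size even though the closure type cannot be named.
    type IntoIter = Box<dyn Iterator<Item = &'a T> + 'a>;

    fn into_iter(self) -> Self::IntoIter {
        // `self` is a shared reference (Copy), so `move` just copies it
        // into the closure.
        Box::new((0..self.v1.len()).map(move |i| {
            if self.use_v1[i] {
                &self.v1[i]
            } else {
                &self.v2[i]
            }
        }))
    }
}
```

Like the original closure, this indexes all three vectors by the length of v1, so it would panic if v2 or use_v1 were shorter.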
This is one of those cases where it is probably better to implement the Iterator manually, rather than using combinators.
struct VecChoiceIter<'a, T> {
index: usize,
vec_choice: &'a VecChoice<T>,
}
impl<'a, T> Iterator for VecChoiceIter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.vec_choice.v1.len() {
None
} else {
let i = self.index;
self.index += 1;
let use_v1 = self.vec_choice.use_v1[i];
if use_v1 {
Some(&self.vec_choice.v1[i])
} else {
Some(&self.vec_choice.v2[i])
}
}
}
}
This gives you a Sized and nameable type that you can use for the IntoIterator implementation:
impl<'a, T> IntoIterator for &'a VecChoice<T> {
type Item = &'a T;
type IntoIter = VecChoiceIter<'a, T>;
fn into_iter(self) -> Self::IntoIter {
VecChoiceIter { index: 0, vec_choice: self }
}
}
There are some interesting RFCs in progress that could make this work more like how you originally wanted, in particular RFC-2515. This would let you write your IntoIterator implementation as you originally tried, but without having to name the type (playground - nightly):
impl<'a, T> IntoIterator for &'a VecChoice<T> {
type Item = &'a T;
// This is an "existential" type. That is, tell the compiler that there is
// exactly one possibility for what this type can be, which it can infer
// from the usage.
type IntoIter = impl Iterator<Item = Self::Item>;
fn into_iter(self) -> Self::IntoIter {
(0..self.v1.len()).map(move |i| {
if self.use_v1[i] {
&self.v1[i]
} else {
&self.v2[i]
}
})
}
}
It's often very tempting to try to make an iterator out of a pre-made collection, but unfortunately this tends to run into a practical problem a lot of the time: you need some way to store an offset into that collection, so you serve the right chunk of data out of it when next is called. Consequently, you almost always need to provide some custom iterator type.
In this case, you can do so like this:
struct VecChoice<T> {
v1: Vec<T>,
v2: Vec<T>,
use_v1: Vec<bool>,
}
struct VecChoiceIter<'a, T> {
off: usize,
collection: &'a VecChoice<T>,
}
impl<'a, T> Iterator for VecChoiceIter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
let off = self.off;
self.off += 1;
if *self.collection.use_v1.get(off)? {
self.collection.v1.get(off)
} else {
self.collection.v2.get(off)
}
}
}
impl<'a, T> IntoIterator for &'a VecChoice<T> {
type Item = &'a T;
type IntoIter = VecChoiceIter<'a, T>;
fn into_iter(self) -> Self::IntoIter {
VecChoiceIter {
off: 0,
collection: self,
}
}
}
Note that in this case, I've switched use_v1 to a Vec<bool>, because this is not C and only booleans can be used in conditionals.
You could also do the conversion up front and store it in its own Vec, but in my experience people don't expect creating an iterator, whether by calling iter or into_iter, to be expensive. Iterators are pretty fundamental in Rust, and as a consequence it's very common for folks to create lots of them, often implicitly, and making those functions be expensive would be undesirable in many cases.
Probably the simplest way is to use .zip() and return an opaque impl Iterator from a method on the type (so you don't have to write out the actual type):
struct VecChoice<T> {
v1: Vec<T>,
v2: Vec<T>,
use_v1: Vec<bool>,
}
impl<T> VecChoice<T> {
fn iter(&self) -> impl Iterator<Item = &T> {
self.v1
.iter()
.zip(self.v2.iter())
.zip(self.use_v1.iter())
.map(|((v1, v2), &use_v1)| if use_v1 { v1 } else { v2 })
}
}
This iterates over all three Vecs in lockstep, stopping as soon as the shortest one is exhausted, and yields a reference from either v1 or v2.
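The truncation behaviour of zip can be seen in a small standalone sketch. The free function choose is a hypothetical helper written for this illustration, taking slices instead of the VecChoice struct.

```rust
/// Picks from `v1` or `v2` according to each flag. Because `zip` stops at
/// the shortest input, the output has as many items as the shortest slice.
fn choose<'a, T>(v1: &'a [T], v2: &'a [T], use_v1: &[bool]) -> Vec<&'a T> {
    v1.iter()
        .zip(v2.iter())
        .zip(use_v1.iter())
        .map(|((a, b), &flag)| if flag { a } else { b })
        .collect()
}
```

Calling it with three-element v1 and v2 but only two flags produces two items, with no panic, which differs from the index-based versions above.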
Notice that I switched use_v1 from a Vec<T> to a Vec<bool>, which seems to be what you have, given the way you use it.

Optionally call `skip` in a custom iterator `next()` function

I have a custom iterator and I would like to optionally call .skip(...) in the custom .next() method. However, I get a type error because Skip != Iterator.
Sample code is as follows:
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter
}
// lots of code here working with the new iterator
iter.next()
}
}
The issue is that after calling .skip(3), the type of iter has changed. One solution would be to duplicate the // lots of code ... in each branch of the if statement, but I'd rather not.
My question is: Is there a way to conditionally apply skip(...) to an iterator and continue working with it without duplicating a bunch of code?
skip is designed to construct a new iterator, which is very useful in situations where you want your code to remain, at least on the surface, immutable. However, in your case, you want to advance the existing iterator while still leaving it valid.
There is advance_by which does what you want, but it's Nightly so it won't run on Stable Rust.
if self.index == 0 {
self.index += 3;
self.iter.advance_by(3);
}
We can abuse nth to get what we want, but it's not very idiomatic.
if self.index == 0 {
self.index += 3;
self.iter.nth(2);
}
If I saw that code in production, I'd be quite puzzled.
The simplest and not terribly satisfying answer is to just reimplement advance_by as a helper function. The source is available and pretty easy to adapt:
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
for i in 0..n {
iter.next().ok_or(i)?;
}
Ok(())
}
All this being said, if your use case is actually just to skip the first three elements, all you need is to start with the skip call and assume your iterator is always Skip:
struct CrossingIter<'a, T> {
index: usize,
iter: std::iter::Skip<std::slice::Iter<'a, T>>,
}
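A fuller sketch of that Skip-based struct follows. The constructor new, the T: Copy bound (used in place of num's Float to avoid the external crate), and the choice to start index at 3 are all assumptions made for this illustration.

```rust
use std::iter::Skip;
use std::slice::Iter;

struct CrossingIter<'a, T> {
    index: usize,
    iter: Skip<Iter<'a, T>>,
}

impl<'a, T> CrossingIter<'a, T> {
    /// Hypothetical constructor: the skip happens once, up front.
    fn new(data: &'a [T]) -> Self {
        CrossingIter {
            index: 3,
            iter: data.iter().skip(3),
        }
    }
}

impl<'a, T: Copy> Iterator for CrossingIter<'a, T> {
    type Item = (usize, T);

    fn next(&mut self) -> Option<(usize, T)> {
        // Returns None once the underlying slice iterator is exhausted.
        let value = *self.iter.next()?;
        let i = self.index;
        self.index += 1;
        Some((i, value))
    }
}
```

Because the field type is fixed as Skip, there is no branch in next() and therefore no type mismatch to work around.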
I think @Silvio's answer offers a better perspective.
That said, you can call skip(0) instead of returning iter itself in the else branch, so that both branches have the same type.
Also, the item type produced by enumerate doesn't match your signature, fn next(&mut self) -> Option<(usize, T)>, so you need to map it.
Here is a working example:
use num::Float;
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let mut iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter.skip(0)
};
// lots of code here working with the new iterator
iter.next().map(|(i, &v)| (i, v))
}
}

Lifetime problem when implementing Iterator with item type &str [duplicate]

I am having trouble expressing the lifetime of the return value of an Iterator implementation. How can I compile this code without changing the return value of the iterator? I'd like it to return a vector of references.
It is obvious that I am not using the lifetime parameter correctly but after trying various ways I just gave up, I have no idea what to do with it.
use std::iter::Iterator;
struct PermutationIterator<T> {
vs: Vec<Vec<T>>,
is: Vec<usize>,
}
impl<T> PermutationIterator<T> {
fn new() -> PermutationIterator<T> {
PermutationIterator {
vs: vec![],
is: vec![],
}
}
fn add(&mut self, v: Vec<T>) {
self.vs.push(v);
self.is.push(0);
}
}
impl<T> Iterator for PermutationIterator<T> {
type Item = Vec<&'a T>;
fn next(&mut self) -> Option<Vec<&T>> {
'outer: loop {
for i in 0..self.vs.len() {
if self.is[i] >= self.vs[i].len() {
if i == 0 {
return None; // we are done
}
self.is[i] = 0;
self.is[i - 1] += 1;
continue 'outer;
}
}
let mut result = vec![];
for i in 0..self.vs.len() {
let index = self.is[i];
result.push(self.vs[i].get(index).unwrap());
}
*self.is.last_mut().unwrap() += 1;
return Some(result);
}
}
}
fn main() {
let v1: Vec<_> = (1..3).collect();
let v2: Vec<_> = (3..5).collect();
let v3: Vec<_> = (1..6).collect();
let mut i = PermutationIterator::new();
i.add(v1);
i.add(v2);
i.add(v3);
loop {
match i.next() {
Some(v) => {
println!("{:?}", v);
}
None => {
break;
}
}
}
}
(Playground link)
error[E0261]: use of undeclared lifetime name `'a`
--> src/main.rs:23:22
|
23 | type Item = Vec<&'a T>;
| ^^ undeclared lifetime
As far as I understand, you want the iterator to return a vector of references into itself, right? Unfortunately, that is not possible in Rust.
This is the trimmed down Iterator trait:
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
Note that there is no lifetime connection between &mut self and Option<Item>. This means that next() method can't return references into the iterator itself. You just can't express a lifetime of the returned references. This is basically the reason that you couldn't find a way to specify the correct lifetime - it would've looked like this:
fn next<'a>(&'a mut self) -> Option<Vec<&'a T>>
except that this is not a valid next() method for Iterator trait.
Such iterators (the ones which can return references into themselves) are called streaming iterators. You can find more here, here and here, if you want.
Update. However, you can return a reference to some other structure from your iterator - that's how most collection iterators work. It could look like this:
pub struct PermutationIterator<'a, T> {
vs: &'a [Vec<T>],
is: Vec<usize>
}
impl<'a, T> Iterator for PermutationIterator<'a, T> {
type Item = Vec<&'a T>;
fn next(&mut self) -> Option<Vec<&'a T>> {
...
}
}
Note how the lifetime 'a is now declared on the impl block. It is OK to do so (required, in fact) because you need to specify the lifetime parameter on the structure. Then you can use the same 'a both in Item and in the next() return type. Again, that's how most collection iterators work.
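For reference, a complete version of this borrowing design might look like the sketch below. The constructor new is hypothetical, and the body of next() is adapted from the question's loop; it is one way to fill in the elided part, not necessarily what the answer had in mind.

```rust
pub struct PermutationIterator<'a, T> {
    vs: &'a [Vec<T>],
    is: Vec<usize>,
}

impl<'a, T> PermutationIterator<'a, T> {
    /// Hypothetical constructor: one index per borrowed vector, all zero.
    pub fn new(vs: &'a [Vec<T>]) -> Self {
        PermutationIterator { vs, is: vec![0; vs.len()] }
    }
}

impl<'a, T> Iterator for PermutationIterator<'a, T> {
    type Item = Vec<&'a T>;

    fn next(&mut self) -> Option<Vec<&'a T>> {
        // Copy the shared reference out so the items borrow from the data
        // for 'a and not from `self`.
        let vs = self.vs;
        'outer: loop {
            // Carry overflowing indices to the left; stop when the first
            // index overflows.
            for i in 0..vs.len() {
                if self.is[i] >= vs[i].len() {
                    if i == 0 {
                        return None;
                    }
                    self.is[i] = 0;
                    self.is[i - 1] += 1;
                    continue 'outer;
                }
            }
            let mut result = Vec::with_capacity(vs.len());
            for i in 0..vs.len() {
                result.push(&vs[i][self.is[i]]);
            }
            // With no input vectors there is no last index; return None.
            *self.is.last_mut()? += 1;
            return Some(result);
        }
    }
}
```

The returned references have the lifetime 'a of the borrowed data, so they can be collected and kept after further calls to next().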
@VladimirMatveev's answer is correct in how it explains why your code cannot compile. In a nutshell, it says that an Iterator cannot yield borrowed values from within itself.
However, it can yield borrowed values from something else. This is what is achieved with Vec and Iter: the Vec owns the values, and the Iter is just a wrapper able to yield references into the Vec.
Here is a design which achieves what you want. The iterator is, like with Vec and Iter, just a wrapper over other containers that actually own the values.
use std::iter::Iterator;
struct PermutationIterator<'a, T: 'a> {
vs : Vec<&'a [T]>,
is : Vec<usize>
}
impl<'a, T> PermutationIterator<'a, T> {
fn new() -> PermutationIterator<'a, T> { ... }
fn add(&mut self, v : &'a [T]) { ... }
}
impl<'a, T> Iterator for PermutationIterator<'a, T> {
type Item = Vec<&'a T>;
fn next(&mut self) -> Option<Vec<&'a T>> { ... }
}
fn main() {
let v1 : Vec<i32> = (1..3).collect();
let v2 : Vec<i32> = (3..5).collect();
let v3 : Vec<i32> = (1..6).collect();
let mut i = PermutationIterator::new();
i.add(&v1);
i.add(&v2);
i.add(&v3);
loop {
match i.next() {
Some(v) => { println!("{:?}", v); }
None => {break;}
}
}
}
(Playground)
Unrelated to your initial problem. If this were just me, I would ensure that all borrowed vectors are taken at once. The idea is to remove the repeated calls to add and to pass directly all borrowed vectors at construction:
use std::iter::{Iterator, repeat};
struct PermutationIterator<'a, T: 'a> {
...
}
impl<'a, T> PermutationIterator<'a, T> {
fn new(vs: Vec<&'a [T]>) -> PermutationIterator<'a, T> {
let n = vs.len();
PermutationIterator {
vs: vs,
is: repeat(0).take(n).collect(),
}
}
}
impl<'a, T> Iterator for PermutationIterator<'a, T> {
...
}
fn main() {
let v1 : Vec<i32> = (1..3).collect();
let v2 : Vec<i32> = (3..5).collect();
let v3 : Vec<i32> = (1..6).collect();
let vall: Vec<&[i32]> = vec![&v1, &v2, &v3];
let mut i = PermutationIterator::new(vall);
}
(Playground)
(EDIT: Changed the iterator design to take a Vec<&'a [T]> rather than a Vec<Vec<&'a T>>. It's easier to take a ref to container than to build a container of refs.)
As mentioned in other answers, this is called a streaming iterator and it requires different guarantees from Rust's Iterator. One crate that provides such functionality is aptly called streaming-iterator and it provides the StreamingIterator trait.
Here is one example of implementing the trait:
extern crate streaming_iterator;
use streaming_iterator::StreamingIterator;
struct Demonstration {
scores: Vec<i32>,
position: usize,
}
// Since `StreamingIterator` requires that we be able to call
// `advance` before `get`, we have to start "before" the first
// element. We assume that there will never be the maximum number of
// entries in the `Vec`, so we use `usize::MAX` as our sentinel value.
impl Demonstration {
fn new() -> Self {
Demonstration {
scores: vec![1, 2, 3],
position: std::usize::MAX,
}
}
fn reset(&mut self) {
self.position = std::usize::MAX;
}
}
impl StreamingIterator for Demonstration {
type Item = i32;
fn advance(&mut self) {
self.position = self.position.wrapping_add(1);
}
fn get(&self) -> Option<&Self::Item> {
self.scores.get(self.position)
}
}
fn main() {
let mut example = Demonstration::new();
loop {
example.advance();
match example.get() {
Some(v) => {
println!("v: {}", v);
}
None => break,
}
}
example.reset();
loop {
example.advance();
match example.get() {
Some(v) => {
println!("v: {}", v);
}
None => break,
}
}
}
Unfortunately, streaming iterators will be limited until generic associated types (GATs) from RFC 1598 are implemented.
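On toolchains where generic associated types are available (stable since Rust 1.65, as far as I know), a streaming iterator can be expressed directly. The sketch below is an illustration under that assumption: the trait LendingIterator, the struct WindowsMut, and the helper running_totals are hypothetical names, not items from the standard library or from the streaming-iterator crate.

```rust
/// Hypothetical trait: each item may borrow from the iterator itself.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

/// Overlapping mutable windows over a slice, which the ordinary
/// `Iterator` trait cannot express.
struct WindowsMut<'s, T> {
    slice: &'s mut [T],
    start: usize,
    size: usize,
}

impl<'s, T> LendingIterator for WindowsMut<'s, T> {
    type Item<'a> = &'a mut [T] where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        let end = self.start.checked_add(self.size)?;
        // `get_mut` returns None once the window would run past the end.
        let window = self.slice.get_mut(self.start..end)?;
        self.start += 1;
        Some(window)
    }
}

/// Turns the slice into its running totals using two-element windows.
fn running_totals(data: &mut [i32]) {
    let mut windows = WindowsMut { slice: data, start: 0, size: 2 };
    while let Some(w) = windows.next() {
        w[1] += w[0];
    }
}
```

Each window must be dropped before the next call to next(), which is exactly the guarantee that the regular Iterator trait cannot state.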
I wrote this code not long ago and somehow stumbled on this question here. It does exactly what the question asks: it shows how to implement an iterator that passes its callbacks a reference to itself.
It adds an .iter_map() method to IntoIterator instances. Initially I thought it should be implemented for Iterator itself, but that was a less flexible design decision.
I created a small crate for it and posted my code to GitHub in case you want to experiment with it, you can find it here.
WRT the OP's trouble with defining lifetimes for the items, I didn't run into any such trouble implementing this while relying on the default elided lifetimes.
Here's an example of usage. Note the parameter the callback receives is the iterator itself, the callback is expected to pull the data from it and either pass it along as is or do whatever other operations.
use iter_map::IntoIterMap;
let mut b = true;
let s = "hello world!".chars().peekable().iter_map(|iter| {
if let Some(&ch) = iter.peek() {
if ch == 'o' && b {
b = false;
Some('0')
} else {
b = true;
iter.next()
}
} else { None }
}).collect::<String>();
assert_eq!(&s, "hell0o w0orld!");
Because the IntoIterMap generic trait is implemented for IntoIterator, you can get an "iter map" off anything that supports that interface. For instance, one can be created directly off an array, like so:
use iter_map::*;
fn main()
{
let mut i = 0;
let v = [1, 2, 3, 4, 5, 6].iter_map(move |iter| {
i += 1;
if i % 3 == 0 {
Some(0)
} else {
iter.next().copied()
}
}).collect::<Vec<_>>();
assert_eq!(v, vec![1, 2, 0, 3, 4, 0, 5, 6, 0]);
}
Here's the full code - it was amazing it took such little code to implement, and everything just seemed to work smoothly while putting it together. It gave me a new appreciation for the flexibility of Rust itself and its design decisions.
/// Adds `.iter_map()` method to all IntoIterator classes.
///
impl<F, I, J, R, T> IntoIterMap<F, I, R, T> for J
//
where F: FnMut(&mut I) -> Option<R>,
I: Iterator<Item = T>,
J: IntoIterator<Item = T, IntoIter = I>,
{
/// Returns an iterator that invokes the callback in `.next()`, passing it
/// the original iterator as an argument. The callback can return any
/// arbitrary type within an `Option`.
///
fn iter_map(self, callback: F) -> ParamFromFnIter<F, I>
{
ParamFromFnIter::new(self.into_iter(), callback)
}
}
/// A trait to add the `.iter_map()` method to any existing class.
///
pub trait IntoIterMap<F, I, R, T>
//
where F: FnMut(&mut I) -> Option<R>,
I: Iterator<Item = T>,
{
/// Returns a `ParamFromFnIter` iterator which wraps the iterator it's
/// invoked on.
///
/// # Arguments
/// * `callback` - The callback that gets invoked by `.next()`.
/// This callback is passed the original iterator as its
/// parameter.
///
fn iter_map(self, callback: F) -> ParamFromFnIter<F, I>;
}
/// Implements an iterator that can be created from a callback.
/// does pretty much the same thing as `std::iter::from_fn()` except the
/// callback signature of this class takes a data argument.
pub struct ParamFromFnIter<F, D>
{
callback: F,
data: D,
}
impl<F, D, R> ParamFromFnIter<F, D>
//
where F: FnMut(&mut D) -> Option<R>,
{
/// Creates a new `ParamFromFnIter` iterator instance.
///
/// This provides a flexible and simple way to create new iterators by
/// defining a callback.
/// # Arguments
/// * `data` - Data that will be passed to the callback on each
/// invocation.
/// * `callback` - The callback that gets invoked when `.next()` is invoked
/// on the returned iterator.
///
pub fn new(data: D, callback: F) -> Self
{
ParamFromFnIter { callback, data }
}
}
/// Implements Iterator for ParamFromFnIter.
///
impl<F, D, R> Iterator for ParamFromFnIter<F, D>
//
where F: FnMut(&mut D) -> Option<R>,
{
type Item = R;
/// Iterator method that returns the next item.
/// Invokes the client code provided iterator, passing it `&mut self.data`.
///
fn next(&mut self) -> Option<Self::Item>
{
(self.callback)(&mut self.data)
}
}
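For comparison, the first usage example above can also be written with std::iter::from_fn from the standard library, letting the closure capture the peekable iterator instead of receiving it as a parameter. This is a sketch for comparison only; the function name zero_before_o is made up for this illustration.

```rust
use std::iter;

/// Inserts a '0' before every 'o', mirroring the `iter_map` example.
fn zero_before_o(input: &str) -> String {
    let mut chars = input.chars().peekable();
    let mut b = true;
    iter::from_fn(move || {
        // Stop when the underlying iterator has nothing left to peek at.
        let &ch = chars.peek()?;
        if ch == 'o' && b {
            // Emit the extra '0' first; the 'o' itself stays in `chars`.
            b = false;
            Some('0')
        } else {
            b = true;
            chars.next()
        }
    })
    .collect()
}
```

The difference is mostly one of ownership: from_fn keeps the state inside the closure, while ParamFromFnIter stores it next to the callback and hands it over on every call.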


How do I specify the lifetime for the associated type of an iterator that refers to itself but does not mutate itself?

I have this struct:
struct RepIter<T> {
item: T
}
I want to implement Iterator for it so that it returns a reference to its item every time:
impl<T> Iterator for RepIter<T> {
type Item = &T;
fn next(&mut self) -> Option<Self::Item> {
return Some(&self.item);
}
}
This doesn't compile since a lifetime must be specified for type Item = &T;. Searching for a way to do this I found this question. The first solution doesn't seem applicable since I'm implementing a preexisting trait. Trying to copy the second solution directly I get something like this:
impl<'a, T> Iterator for &'a RepIter<T> {
type Item = &'a T;
fn next(self) -> Option<&'a T> {
return Some(&self.item);
}
}
This doesn't work either since I need a mutable self as argument to next. The only way I was able to get it to compile was to write it like this:
impl<'a, T> Iterator for &'a RepIter<T> {
type Item = &'a T;
fn next(&mut self) -> Option<&'a T> {
return Some(&self.item);
}
}
But now self is a reference to a reference, right? I don't know how to call next on an instance of RepIter. For example, this doesn't work:
fn main() {
let mut iter: RepIter<u64> = RepIter { item: 5 };
let res = iter.next();
}
This makes me think my implementation of the trait could be written in a better way.
As discussed in the question that Shepmaster linked to, this is a bit tricky because you really want to change the type of next(), but you can't because it's part of the trait. There are a couple of approaches to solve this though.
Making minimal changes to your code, you can just use the Iterator implementation on the &'a RepIter<T>:
pub fn main() {
let mut iter = RepIter { item: 5 };
let res = (&iter).next();
}
It's a bit unpleasant though.
Another way of looking at this is to change the ownership of your item. If it was already borrowed, then you can make all the types match up nicely:
struct RepIter<'a, T: 'a> {
item: &'a T,
}
impl<'a, T> Iterator for RepIter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<&'a T> {
Some(&self.item)
}
}
pub fn main() {
let val: u64 = 5;
let mut iter = RepIter { item: &val };
let res = iter.next();
}
When designing an iterator, it's often useful to have distinct types for the collection and for the iterator over that collection. Usually, the collection will own the data, and the iterator will borrow from the collection. Collection types typically implement IntoIterator and don't implement Iterator. This means that creating an iterator happens in two steps: we need to create the collection first, then create the iterator from the collection.
Here's a solution that turns your RepIter type into a collection. I'll use Shepmaster's proposition to use iter::repeat to produce the iterator.
use std::iter::{self, Repeat};
struct RepIter<T> {
item: T,
}
impl<T> RepIter<T> {
// When IntoIterator is implemented on `&Self`,
// then by convention, an inherent iter() method is provided as well.
fn iter(&self) -> Repeat<&T> {
iter::repeat(&self.item)
}
}
impl<'a, T> IntoIterator for &'a RepIter<T> {
type Item = &'a T;
type IntoIter = Repeat<&'a T>;
fn into_iter(self) -> Self::IntoIter {
self.iter()
}
}
fn main() {
let iter: RepIter<u64> = RepIter { item: 5 };
let res = iter.iter().next();
println!("{:?}", res);
let res = iter.iter().fuse().next();
println!("{:?}", res);
let res = iter.iter().by_ref().next();
println!("{:?}", res);
}
I would recommend writing your code as:
use std::iter;
fn main() {
let val = 5u64;
let mut iter = iter::repeat(&val);
let res = iter.next();
}
One thing that I don't quite understand yet is that your existing code almost works, but only for certain Iterator methods; those that take self by value instead of reference!
struct RepIter<T> {
item: T,
}
impl<'a, T> Iterator for &'a RepIter<T> {
type Item = &'a T;
fn next(&mut self) -> Option<&'a T> {
return Some(&self.item);
}
}
fn main() {
let iter: RepIter<u64> = RepIter { item: 5 };
// Works
let res = iter.fuse().next();
println!("{:?}", res);
// Doesn't work
let res = iter.by_ref().next();
println!("{:?}", res);
}
There's probably some interesting interaction happening.
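One possible explanation, offered as my reading of Rust's method lookup rather than something the answer confirms: the compiler adds at most one automatic reference to the receiver. fuse takes self by value, so a single autoref from RepIter to &RepIter reaches the type that implements Iterator. next and by_ref take &mut self, which would need &mut &RepIter, that is, two references starting from a RepIter value. Starting from a binding that already is a &RepIter removes one step. The helper take_three below is hypothetical and exists only to demonstrate this.

```rust
struct RepIter<T> {
    item: T,
}

impl<'a, T> Iterator for &'a RepIter<T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        Some(&self.item)
    }
}

/// Collects three items through `next` and `by_ref`.
fn take_three<T: Copy>(rep: &RepIter<T>) -> Vec<T> {
    // `it` has type `&RepIter<T>`, the type that implements Iterator,
    // so one automatic `&mut` is enough for `next` and `by_ref`.
    let mut it = rep;
    let mut out = Vec::new();
    out.push(*it.next().unwrap());
    out.extend(it.by_ref().take(2).copied());
    out
}
```

This is the same effect as writing (&iter).next() as in the earlier answer, just with the reference stored in a mutable binding.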
