How to implement a multi-valued iterator pattern in Rust?

I have an iterator-like object that can return zero, one or more items each time it's called. I want to implement the standard Iterator API, i.e. next returns Option<Self::Item>, so it can be consumed item by item.
In Clojure I would probably do this with mapcat ("map and concatenate").
My current solution (thanks to @Ryan) uses flat_map but still requires a lot of allocation:
// Desired input:
// A stateful object that implements an iterator which returns a number of results each time.
// The real code is a bit more complicated, this is the minimal example.
struct MyThing {
counter: i32,
}
impl Iterator for MyThing {
type Item = Vec<String>;
fn next(&mut self) -> Option<Vec<String>> {
self.counter += 1;
if self.counter == 4 {
self.counter = 1;
}
match self.counter {
1 => Some(vec!["One".to_string()]),
2 => Some(vec!["One".to_string(), "Two".to_string()]),
3 => Some(vec![
"One".to_string(),
"Two".to_string(),
"Three".to_string(),
]),
_ => Some(vec![]),
}
}
}
fn main() {
let things = MyThing { counter: 0 };
// Missing piece, though the following line does the job:
let flattened = things.flat_map(|x| x);
// However this requires a heap allocation at each loop.
// Desired output: I can iterate, item by item.
for item in flattened {
println!("{:?}", item);
}
}
Given the innovative things I have seen, I wonder if there's a more idiomatic, less expensive way of accomplishing this pattern.

If you know how to generate the "inner" values programmatically, replace Vec<String> with a struct you define that implements Iterator<Item = String>. (Technically only IntoIterator is necessary, but Iterator is sufficient.)
struct Inner {
index: usize,
stop: usize,
}
impl Inner {
fn new(n: usize) -> Self {
Inner { index: 0, stop: n }
}
}
impl Iterator for Inner {
type Item = String;
fn next(&mut self) -> Option<String> {
static WORDS: [&str; 3] = ["One", "Two", "Three"];
let result = if self.index < self.stop {
WORDS.get(self.index).map(|r| r.to_string())
} else {
None
};
self.index += 1;
result
}
}
Because Inner implements Iterator<Item = String>, it can be iterated over much like Vec<String>. Unlike the Vec version, Inner does not pre-allocate all of its items only to hand them out one by one; it creates each String lazily, on demand.
The "outer" iterator is just a struct that implements Iterator<Item = Inner>, likewise constructing each Inner lazily:
struct Outer {
counter: i32,
}
impl Iterator for Outer {
type Item = Inner;
fn next(&mut self) -> Option<Inner> {
self.counter = 1 + self.counter % 3;
Some(Inner::new(self.counter as usize))
}
}
As you know, Iterator::flat_map flattens nested structure, so something like the following works:
let things = Outer { counter: 0 };
for item in things.flat_map(|x| x).take(100) {
println!("{:?}", item);
}
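When the closure passed to flat_map is the identity, as it is here, Iterator::flatten says the same thing a little more directly. A minimal sketch, where nested is a hypothetical stand-in for Outer (it yields ranges rather than Inner values, just to keep the example short):

```rust
use std::ops::Range;

// Hypothetical stand-in for `Outer`: yields 0..1, 0..2, 0..3, then repeats.
fn nested() -> impl Iterator<Item = Range<usize>> {
    (1..=3usize).cycle().map(|n| 0..n)
}

// Flattening with an identity closure...
fn via_flat_map(limit: usize) -> Vec<usize> {
    nested().flat_map(|x| x).take(limit).collect()
}

// ...is the same as calling `flatten` directly.
fn via_flatten(limit: usize) -> Vec<usize> {
    nested().flatten().take(limit).collect()
}
```

Both functions walk the inner iterators lazily; the only allocation is the Vec the results are collected into.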
In real-life code, Inner and Outer are probably pretty different from this example most of the time. For example, it's not necessarily possible to write Inner without doing the equivalent of allocating a Vec. So the precise shapes and semantics of these iterators depend on concrete information about the use case.
The above assumes that Inner is somehow useful, or easier to implement on its own. You could easily write a single struct that iterates over the sequence without needing to be flattened, but you have to also put the inner iterator state (the index field) into Outer:
struct Outer {
index: usize,
counter: i32,
}
impl Iterator for Outer {
type Item = String;
fn next(&mut self) -> Option<String> {
static WORDS: [&str; 3] = ["One", "Two", "Three"];
let result = WORDS.get(self.index).map(|r| r.to_string());
self.index += 1;
if self.index >= self.counter as usize {
self.counter = 1 + self.counter % 3;
self.index = 0;
};
result
}
}
fn main() {
let things = Outer { counter: 1, index: 0 };
for item in things.take(100) {
println!("{:?}", item);
}
}
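For the minimal example specifically, it may not be necessary to write any struct at all. The following is only a sketch under that assumption (the real code is said to be more complicated): it builds the same sequence from standard adapters, borrowing slices of a static table so that no Vec is allocated per step. The function name words is made up for illustration.

```rust
static WORDS: [&str; 3] = ["One", "Two", "Three"];

// Yields "One", then "One", "Two", then "One", "Two", "Three", and repeats.
// Each inner iterator borrows a prefix of the static table, so the only
// allocations are the Strings themselves.
fn words() -> impl Iterator<Item = String> {
    (1..=WORDS.len())
        .cycle()
        .flat_map(|n| WORDS[..n].iter().map(|w| w.to_string()))
}
```

Calling words().take(100) then plays the role of things.take(100) above.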

Related

Iterator that skips every nth element

Rather than taking every Nth element from an iterator which I can do with Iterator::step_by, I would like to skip every Nth element. How can I achieve this idiomatically? Is there maybe even a standard library or itertools function?
This is what I came up with to skip every 7th element, say. It requires enumerate, filter, and map, though one could use a filter_map instead of the latter two.
(0..100).enumerate()
.filter(|&(i, x)| (i + 1) % 7 != 0)
.map(|(i, x)| x);
How could I cast this into a function so that I could simply write:
(0..100).skip_every(7)
If you want to get the exact interface you asked for, your best option at this time is to implement a custom iterator adapter type. Here's a basic version of such a type:
pub struct SkipEvery<I> {
inner: I,
every: usize,
index: usize,
}
impl<I> SkipEvery<I> {
fn new(inner: I, every: usize) -> Self {
assert!(every > 1);
let index = 0;
Self {
inner,
every,
index,
}
}
}
impl<I: Iterator> Iterator for SkipEvery<I> {
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.every - 1 {
self.index = 1;
self.inner.nth(1)
} else {
self.index += 1;
self.inner.next()
}
}
}
pub trait IteratorSkipEveryExt: Iterator + Sized {
fn skip_every(self, every: usize) -> SkipEvery<Self> {
SkipEvery::new(self, every)
}
}
impl<I: Iterator + Sized> IteratorSkipEveryExt for I {}
A more complete implementation could also add optimized versions of further Iterator methods, as well as implementations of DoubleEndedIterator and ExactSizeIterator -- see the implementation of StepBy as an example.
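As one possible piece of such a fuller implementation, here is a sketch of a size_hint override. It restates the adapter so the block stands alone; the field name pos and the helper kept are my own naming, not part of the code above, and the arithmetic assumes the inner iterator's size_hint is trustworthy.

```rust
pub struct SkipEvery<I> {
    inner: I,
    every: usize,
    // Number of inner elements consumed so far in the current group of `every`.
    pos: usize,
}

impl<I: Iterator> SkipEvery<I> {
    pub fn new(inner: I, every: usize) -> Self {
        assert!(every > 1);
        SkipEvery { inner, every, pos: 0 }
    }

    // How many of the next `remaining` inner elements will be yielded.
    fn kept(&self, remaining: usize) -> usize {
        let rem = remaining % self.every;
        // One extra element is skipped if the partial group crosses a boundary.
        let carry = usize::from(self.pos >= self.every - rem);
        remaining - (remaining / self.every + carry)
    }
}

impl<I: Iterator> Iterator for SkipEvery<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        if self.pos == self.every - 1 {
            // Drop one element, return the one after it.
            self.pos = 1;
            self.inner.nth(1)
        } else {
            self.pos += 1;
            self.inner.next()
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let (lo, hi) = self.inner.size_hint();
        (self.kept(lo), hi.map(|h| self.kept(h)))
    }
}
```

With an exact inner hint (a range, for instance) this gives an exact hint, which lets collect reserve the right capacity up front.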
Your code is pretty easy to turn into a function:
fn skip_every<I: Iterator> (iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
iter.enumerate()
.filter_map(move |(i, v)| if (i + 1) % n != 0 { Some (v) } else { None })
}
fn main() {
println!("{:?}", skip_every (0..20, 7).collect::<Vec<_>>());
}
Or avoiding the expensive modulo:
fn skip_every2<I: Iterator> (iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
iter.zip ((0..n).rev().cycle()).filter_map (|(v, i)| if i != 0 { Some (v) } else { None })
}

How can I intersperse a rust iterator with a value every n items?

I have an iterator of characters, and I want to add a newline every N characters:
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = todo!();
let string: String = iter_with_newlines.collect();
assert_eq!("abcdefghij\nklmnopqrst\nuvwxyz", string);
So basically, I want to intersperse the iterator with a newline every n characters. How can I do this?
Some Ideas I had
It would be great if I could do something like this, where chunks would be a method to make Iterator<T> into Iterator<Iterator<T>>: iter.chunks(10).intersperse('\n').flatten()
It would also be cool if I could do something like this: iter.chunks.intersperseEvery(10, '\n'), where intersperseEvery is a method that would only intersperse the value every n items.
You can do it without temporary allocation using enumerate and flat_map:
use either::Either;
fn main() {
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = iter
.enumerate()
.flat_map(|(i, c)| {
if i % 10 == 0 {
Either::Left(['\n', c].into_iter())
} else {
Either::Right(std::iter::once(c))
}
})
.skip(1); // The code above adds a newline in first position, so skip it
let string: String = iter_with_newlines.collect();
assert_eq!("abcdefghij\nklmnopqrst\nuvwxyz", string);
}
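If you'd rather not pull in the either crate, one possible variant (a sketch, with the function name with_newlines invented here) returns the same iterator type from every call of the closure: an Option holding the separator, chained with the character itself. It also avoids the leading newline, so no skip(1) is needed.

```rust
use std::iter::once;

// Inserts '\n' before every n-th character except the very first one.
fn with_newlines(s: &str, n: usize) -> String {
    assert!(n > 0, "n must be non-zero");
    s.chars()
        .enumerate()
        .flat_map(move |(i, c)| {
            // Some('\n') at positions n, 2n, 3n, ...; None everywhere else.
            let separator = (i != 0 && i % n == 0).then_some('\n');
            separator.into_iter().chain(once(c))
        })
        .collect()
}
```

bool::then_some needs a reasonably recent stable toolchain; on older ones an if/else producing Some or None does the same job.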
Here's what I ended up doing:
// src/intersperse_sparse.rs
use core::iter::Peekable;
/// An iterator adaptor to insert a particular value
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator: I::Item,
}
impl<I> IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator: I::Item) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator,
index: 0,
}
}
}
impl<I> Iterator for IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some(self.separator.clone())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// An iterator adaptor to insert a particular value created by a function
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator_closure: G,
}
impl<I, G> IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator_closure: G) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator_closure,
index: 0,
}
}
}
impl<I, G> Iterator for IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some((self.separator_closure)())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// Import this trait to use `iter.intersperse_sparse(n, item)` and `iter.intersperse_sparse_with(n, || item)` on all iterators.
pub trait IntersperseSparseAdapter: Iterator {
fn intersperse_sparse(self, chunk_size: usize, separator: Self::Item) -> IntersperseSparse<Self>
where
Self: Sized,
Self::Item: Clone,
{
IntersperseSparse::new(self, chunk_size, separator)
}
fn intersperse_sparse_with<G>(
self,
chunk_size: usize,
separator_closure: G,
) -> IntersperseSparseWith<Self, G>
where
Self: Sized,
G: FnMut() -> Self::Item,
{
IntersperseSparseWith::new(self, chunk_size, separator_closure)
}
}
impl<I> IntersperseSparseAdapter for I where I: Iterator {}
To use it:
// src/main.rs
mod intersperse_sparse;
use intersperse_sparse::IntersperseSparseAdapter;
fn main() {
let string = "abcdefg";
let new_string: String = string.chars().intersperse_sparse(3, '\n').collect();
assert_eq!(new_string, "abc\ndef\ng");
}
If you don't particularly care about performance, you can use chunks from itertools, collect the chunks into Vecs, and then intersperse your element as a single-element Vec, just to flatten the whole thing finally.
use itertools::Itertools;
iter
.chunks(3)
.into_iter()
.map(|chunk| chunk.collect::<Vec<_>>())
.intersperse(vec![','])
.flat_map(|chunk| chunk.into_iter())
.collect::<String>();
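In the same "performance doesn't matter" spirit, a std-only version is possible when the input is a string. This is a sketch rather than a drop-in replacement for the itertools chain: it collects the characters into a Vec first and uses slice chunks, and wrap_every is a name made up for the example.

```rust
// Breaks `s` into lines of at most `n` characters.
fn wrap_every(s: &str, n: usize) -> String {
    assert!(n > 0, "chunk size must be non-zero");
    // Collect into a Vec so the slice method `chunks` is available.
    let chars: Vec<char> = s.chars().collect();
    chars
        .chunks(n)
        .map(|chunk| chunk.iter().collect::<String>())
        .collect::<Vec<String>>()
        .join("\n")
}
```

It allocates one String per chunk plus the intermediate Vecs, so it trades efficiency for brevity.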
Other than that, consider writing your own iterator extension trait, the same way itertools provides its adapters.
Build an Iterator with from_fn:
let mut iter = "abcdefghijklmnopqrstuvwxyz".chars().peekable();
let mut count = 0;
let iter_with_newlines = std::iter::from_fn(move || match iter.peek() {
Some(_) => {
if count < 10 {
count += 1;
iter.next()
} else {
count = 0;
Some('\n')
}
}
None => None,
});
assert_eq!(
"abcdefghij\nklmnopqrst\nuvwxyz",
iter_with_newlines.collect::<String>()
);

Optionally call `skip` in a custom iterator `next()` function

I have a custom iterator and I would like to optionally call .skip(...) in the custom .next() method. However, I get a type error because Skip != Iterator.
Sample code is as follows:
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter
};
// lots of code here working with the new iterator
iter.next()
}
}
The issue is that after calling .skip(3), the type of iter has changed. One solution would be to duplicate the // lots of code ... in each branch of the if statement, but I'd rather not.
My question is: Is there a way to conditionally apply skip(...) to an iterator and continue working with it without duplicating a bunch of code?
skip is designed to construct a new iterator, which is very useful in situations where you want your code to remain, at least on the surface, immutable. However, in your case, you want to advance the existing iterator while still leaving it valid.
There is advance_by, which does what you want, but it is only available on nightly, so it won't compile on stable Rust.
if self.index == 0 {
self.index += 3;
self.iter.advance_by(3);
}
We can abuse nth to get what we want, but it's not very idiomatic.
if self.index == 0 {
self.index += 3;
self.iter.nth(2);
}
If I saw that code in production, I'd be quite puzzled.
The simplest, if not terribly satisfying, answer is to reimplement advance_by as a helper function. The source is available and pretty easy to adapt:
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
for i in 0..n {
iter.next().ok_or(i)?;
}
Ok(())
}
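To show how the helper behaves, here is a small sketch. The helper is repeated so the block compiles on its own; fourth_item is a hypothetical caller, not something from the question.

```rust
// Same helper as above, repeated so this sketch is self-contained.
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
    for i in 0..n {
        // On exhaustion, report how many steps were actually taken.
        iter.next().ok_or(i)?;
    }
    Ok(())
}

// Hypothetical caller: skip three items in place, then keep using the
// very same iterator, whose type has not changed.
fn fourth_item(data: &[i32]) -> Option<i32> {
    let mut iter = data.iter();
    my_advance_by(&mut iter, 3).ok()?;
    iter.next().copied()
}
```

Because the iterator is advanced through a mutable reference, its type stays the same, which is exactly what skip could not offer.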
All this being said, if your use case is actually just to skip the first three elements, all you need is to start with the skip call and assume your iterator is always Skip:
struct CrossingIter<'a, T> {
index: usize,
iter: std::iter::Skip<std::slice::Iter<'a, T>>,
}
I think @Silvio's answer offers the better perspective.
That said, you can call skip(0) in the else branch instead of returning iter itself, so that both branches have the same type.
Also, the item type produced by enumerate doesn't match your signature, fn next(&mut self) -> Option<(usize, T)>: it yields (usize, &T), so you need to map it.
Here is a working example:
use num::Float;
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let mut iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter.skip(0)
};
// lots of code here working with the new iterator
iter.next().map(|(i, &v)| (i, v))
}
}
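Since both branches now call skip, the condition can also move into the argument, which removes the branching on iterator types altogether. A sketch as a free function (next_after_optional_skip and its skip_first flag are invented for the example, and T: Copy stands in for the Float bound):

```rust
use std::slice::Iter;

// Optionally skips three items, then returns the next enumerated item.
fn next_after_optional_skip<T: Copy>(
    iter: &mut Iter<'_, T>,
    skip_first: bool,
) -> Option<(usize, T)> {
    // One expression, one type: Skip<Enumerate<&mut Iter<T>>> either way.
    let to_skip = if skip_first { 3 } else { 0 };
    let mut rest = iter.by_ref().enumerate().skip(to_skip);
    // `enumerate` yields (usize, &T); copy the value out to match the item type.
    rest.next().map(|(i, &v)| (i, v))
}
```

Note that enumerate is recreated on every call, so the index it reports restarts at zero each time, just as in the code above.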

Efficiently mutate a vector while also iterating over the same vector

I have a vector of structs, and I'm comparing every element in the vector against every other element, and in certain cases mutating the current element.
My issue is that you can't have both a mutable and immutable borrow happening at the same time, but I'm not sure how to reframe my problem to get around this without cloning either the current element or the entire vector, which seems like a waste since I'm only ever mutating the current element, and it doesn't need to be compared to itself (I skip that case).
I'm sure there's an idiomatic way to do this in Rust.
struct MyStruct {
a: i32,
}
fn main() {
let mut v = vec![MyStruct { a: 1 }, MyStruct { a: 2 }, MyStruct { a: 3 }];
for elem in v.iter_mut() {
for other_elem in v.iter() {
if other_elem.a > elem.a {
elem.a += 1;
}
}
}
}
The simplest way is to just use indices, which don't involve any long-lived borrows:
for i in 0..v.len() {
for j in 0..v.len() {
if i == j { continue; }
if v[j].a > v[i].a {
v[i].a += 1;
}
}
}
If you really, really want to use iterators, you can do it by dividing up the Vec into disjoint slices:
fn process(elem: &mut MyStruct, other: &MyStruct) {
if other.a > elem.a {
elem.a += 1;
}
}
for i in 0..v.len() {
let (left, mid_right) = v.split_at_mut(i);
let (mid, right) = mid_right.split_at_mut(1);
let elem = &mut mid[0];
for other in left {
process(elem, other);
}
for other in right {
process(elem, other);
}
}
If you can modify the type of v, and the elements of v are Copy, you can wrap MyStruct in Cell.
#[derive(Copy, Clone)]
struct MyStruct {
a: i32,
}
fn main() {
use std::cell::Cell;
let v = vec![
Cell::new(MyStruct { a: 1 }),
Cell::new(MyStruct { a: 2 }),
Cell::new(MyStruct { a: 3 }),
];
for elem in v.iter() {
for other_elem in v.iter() {
let mut e = elem.get();
if other_elem.get().a > e.a {
e.a += 1;
elem.set(e);
}
}
}
}
If instead you're passed a &mut to a slice (or &mut that can be converted into a slice), use Cell::from_mut and Cell::as_slice_of_cells and use the same trick as above (assuming the elements of the slice are Copy).
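A minimal sketch of that last suggestion, assuming the data arrives as &mut [MyStruct] (the function name bump_towards_larger is made up here):

```rust
use std::cell::Cell;

#[derive(Copy, Clone, Debug, PartialEq)]
struct MyStruct {
    a: i32,
}

fn bump_towards_larger(items: &mut [MyStruct]) {
    // View the exclusive slice as a shared slice of cells; no copying involved.
    let cells: &[Cell<MyStruct>] = Cell::from_mut(items).as_slice_of_cells();
    for elem in cells {
        for other in cells {
            let mut e = elem.get();
            if other.get().a > e.a {
                e.a += 1;
                elem.set(e);
            }
        }
    }
}
```

The caller keeps an ordinary Vec<MyStruct> and passes &mut v; the Cell view only exists inside the function.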

How to implement Iterator for boxed values?

I have a struct that gives numbers by method next from trait Iterator:
struct Numbers{
number: usize,
count: usize
}
impl Iterator for Numbers {
type Item = usize;
fn next(&mut self) -> Option<Self::Item> {
if self.count > 0 {
self.count -= 1;
return Some(self.number);
}
return None;
}
}
fn main(){
let numbers = Numbers{
number: 777,
count: 10
};
for n in numbers {
println!{"{:?}", n};
}
}
It works properly with the usize type.
But the same code with a Box type gives a compilation error:
struct Numbers{
number: Box<usize>,
count: usize
}
impl Iterator for Numbers {
type Item = Box<usize>;
fn next(&mut self) -> Option<Self::Item> {
if self.count > 0 {
self.count -= 1;
return Some(self.number);
}
return None;
}
}
fn main(){
let numbers = Numbers{
number: Box::new(777),
count: 10
};
for n in numbers {
println!{"{:?}", n};
}
}
./numbers.rs:12:25: 12:29 error: cannot move out of borrowed content
./numbers.rs:12 return Some(self.number);
How to implement Iterator for boxed values properly?
This comes down to Rust’s ownership model and the distinction between copy and move semantics; Box<T> has move semantics, not implementing Copy, and so return Some(self.number); would move self.number, taking ownership of it; but this is not permitted because it would require consuming self, which is only taken by mutable reference.
You have a few choices (where I write “the object with move semantics,” I mean in this specific case self.number):
Don’t return the object with move semantics; return something else with copy semantics instead, such as a reference to the boxed value (returning a reference will require the iterator object to be different from the object being iterated over so that you can write the lifetime in Item; thus it doesn’t apply to your specific use case) or the unboxed number.
Construct a new value to return based on the object with move semantics:
impl Iterator for Numbers {
type Item = Box<usize>;
fn next(&mut self) -> Option<Self::Item> {
if self.count > 0 {
self.count -= 1;
Some(Box::new(*self.number)) // copy the usize out of the box into a new Box
} else {
None
}
}
}
Clone the object with move semantics (this is a simplified form of the second option, really):
impl Iterator for Numbers {
type Item = Box<usize>;
fn next(&mut self) -> Option<Self::Item> {
if self.count > 0 {
self.count -= 1;
Some(self.number.clone())
} else {
None
}
}
}
Construct a new value to substitute in place of the object with move semantics:
use std::mem;
impl Iterator for Numbers {
type Item = Box<usize>;
fn next(&mut self) -> Option<Self::Item> {
if self.count > 0 {
self.count -= 1;
let number = mem::replace(&mut self.number, Box::new(0));
// self.number now contains 0
Some(number)
} else {
None
}
}
}
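For completeness, here is roughly what the first option could look like if a separate iterator type is acceptable. This is a sketch only: NumbersIter and the iter method are names I made up, and the items become &usize instead of Box<usize>.

```rust
struct Numbers {
    number: Box<usize>,
    count: usize,
}

// Separate iterator type that borrows from `Numbers`, so the lifetime
// of the returned references can be named in `Item`.
struct NumbersIter<'a> {
    number: &'a usize,
    remaining: usize,
}

impl Numbers {
    fn iter(&self) -> NumbersIter<'_> {
        NumbersIter {
            number: &*self.number,
            remaining: self.count,
        }
    }
}

impl<'a> Iterator for NumbersIter<'a> {
    type Item = &'a usize;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining > 0 {
            self.remaining -= 1;
            // Shared references are Copy, so nothing is moved out of `self`.
            Some(self.number)
        } else {
            None
        }
    }
}
```

Nothing is cloned or re-boxed here, and Numbers itself can be iterated any number of times.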
