Iterator that skips every nth element - rust

Rather than taking every Nth element from an iterator which I can do with Iterator::step_by, I would like to skip every Nth element. How can I achieve this idiomatically? Is there maybe even a standard library or itertools function?
This is what I came up with to skip every 7th element, say. It requires enumerate, filter, and map, though one could use filter_map instead of the latter two.
(0..100).enumerate()
.filter(|&(i, x)| (i + 1) % 7 != 0)
.map(|(i, x)| x);
How could I cast this into a function so that I could simply write:
(0..100).skip_every(7)

If you want to get the exact interface you asked for, your best option at this time is to implement a custom iterator adapter type. Here's a basic version of such a type:
pub struct SkipEvery<I> {
inner: I,
every: usize,
index: usize,
}
impl<I> SkipEvery<I> {
fn new(inner: I, every: usize) -> Self {
assert!(every > 1);
let index = 0;
Self {
inner,
every,
index,
}
}
}
impl<I: Iterator> Iterator for SkipEvery<I> {
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.every - 1 {
self.index = 1;
self.inner.nth(1)
} else {
self.index += 1;
self.inner.next()
}
}
}
pub trait IteratorSkipEveryExt: Iterator + Sized {
fn skip_every(self, every: usize) -> SkipEvery<Self> {
SkipEvery::new(self, every)
}
}
impl<I: Iterator + Sized> IteratorSkipEveryExt for I {}
A more complete implementation could also add optimized versions of further Iterator methods, as well as implementations of DoubleEndedIterator and ExactSizeIterator -- see the implementation of StepBy as an example.
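As a rough sketch of one such addition, here is the same adapter with a size_hint override. The kept helper and its arithmetic are my own assumptions about how to count the surviving items (they are not taken from StepBy), so treat this as illustrative rather than definitive:

```rust
pub struct SkipEvery<I> {
    inner: I,
    every: usize,
    // Number of items yielded since the last skipped one (always < every).
    index: usize,
}

impl<I> SkipEvery<I> {
    pub fn new(inner: I, every: usize) -> Self {
        assert!(every > 1);
        Self { inner, every, index: 0 }
    }

    // Hypothetical helper: how many of the next `n` inner items are yielded.
    // The skipped count is floor((index + n) / every), computed in two
    // parts so that `index + n` can never overflow.
    fn kept(&self, n: usize) -> usize {
        let full_cycles = n / self.every;
        let partial = if n % self.every >= self.every - self.index { 1 } else { 0 };
        n - (full_cycles + partial)
    }
}

impl<I: Iterator> Iterator for SkipEvery<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        if self.index == self.every - 1 {
            // Drop one item and return the one after it.
            self.index = 1;
            self.inner.nth(1)
        } else {
            self.index += 1;
            self.inner.next()
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let (lower, upper) = self.inner.size_hint();
        (self.kept(lower), upper.map(|u| self.kept(u)))
    }
}
```

With an accurate size_hint, collect() can reserve the right capacity up front, and an ExactSizeIterator implementation becomes possible whenever the inner iterator is one.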

Your code is pretty easy to turn into a function:
fn skip_every<I: Iterator> (iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
iter.enumerate()
.filter_map(move |(i, v)| if (i + 1) % n != 0 { Some (v) } else { None })
}
fn main() {
println!("{:?}", skip_every (0..20, 7).collect::<Vec<_>>());
}
Or avoiding the expensive modulo:
fn skip_every2<I: Iterator> (iter: I, n: usize) -> impl Iterator<Item = <I as Iterator>::Item> {
iter.zip ((0..n).rev().cycle()).filter_map (|(v, i)| if i != 0 { Some (v) } else { None })
}

Related

In Rust, is it OK for an (non-moving) iter to return owned values rather than references?

For example, this works:
pub struct SquareVecIter<'a> {
current: f64,
iter: core::slice::Iter<'a, f64>,
}
pub fn square_iter<'a>(vec: &'a Vec<f64>) -> SquareVecIter<'a> {
SquareVecIter {
current: 0.0,
iter: vec.iter(),
}
}
impl<'a> Iterator for SquareVecIter<'a> {
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
if let Some(next) = self.iter.next() {
self.current = next * next;
Some(self.current)
} else {
None
}
}
}
// switch to test module
#[cfg(test)]
mod tests_2 {
use super::*;
#[test]
fn test_square_vec() {
let vec = vec![1.0, 2.0];
let mut iter = square_iter(&vec);
assert_eq!(iter.next(), Some(1.0));
assert_eq!(iter.next(), Some(4.0));
assert_eq!(iter.next(), None);
}
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=531edc40dcca4a79d11af3cbd29943b7
But if I have to return references to self.current then I can't get the lifetimes to work.
Yes.
An Iterator cannot yield elements that reference itself simply due to how the trait is designed (search "lending iterator" for more info). So even if you wanted to return Some(&self.current) and deal with the implications therein, you could not.
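To make the "lending iterator" idea a bit more concrete, here is a minimal sketch using generic associated types. The LendingIterator trait, the SquareLender type and the square_lender function are all hypothetical names; nothing like this trait exists in the standard library, and the type cannot be used in a for loop:

```rust
// Hypothetical trait: items may borrow from the iterator itself.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

struct SquareLender<'v> {
    current: f64,
    iter: std::slice::Iter<'v, f64>,
}

fn square_lender(values: &[f64]) -> SquareLender<'_> {
    SquareLender { current: 0.0, iter: values.iter() }
}

impl<'v> LendingIterator for SquareLender<'v> {
    type Item<'a> = &'a f64
    where
        Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        let x = self.iter.next()?;
        self.current = x * x;
        // The returned reference is tied to this `&mut self` borrow,
        // which the standard `Iterator` trait has no way to express.
        Some(&self.current)
    }
}
```

Each returned reference has to be dropped before next can be called again, which is exactly the restriction the regular Iterator trait does not impose on its items.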
Returning an f64 (non-reference) is perfectly acceptable and would be expected because this is a kind of generative iterator. And you wouldn't need to store current at all:
pub struct SquareVecIter<'a> {
iter: core::slice::Iter<'a, f64>,
}
pub fn square_iter<'a>(vec: &'a Vec<f64>) -> SquareVecIter<'a> {
SquareVecIter {
iter: vec.iter(),
}
}
impl<'a> Iterator for SquareVecIter<'a> {
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
if let Some(next) = self.iter.next() {
Some(next * next)
} else {
None
}
}
}
For an example of this in the standard library, look at the Chars iterator for getting characters of a string. It keeps a reference to the original str, but it yields owned chars and not references.
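A tiny sketch of that Chars behaviour, in case it helps (owned_chars is just an illustrative name):

```rust
fn owned_chars(text: &str) -> Vec<char> {
    // `Chars<'_>` borrows `text`, but every item it yields is a `char` by value.
    let chars: std::str::Chars<'_> = text.chars();
    // The collected items stay usable after the iterator itself is gone.
    chars.collect()
}
```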
What @kmdreko says, of course.
I just wanted to add a couple of nitpicks:
The if let Some(...) = ... { Some(...) } else { None } pattern is so common that the standard library provides it as Option::map.
Taking a &Vec<f64> is an antipattern. Use &[f64] instead. It is more general without any drawbacks.
All of your lifetime annotations (except for the one in the struct definition) can be elided, so you can simply omit them.
pub struct SquareVecIter<'a> {
iter: core::slice::Iter<'a, f64>,
}
pub fn square_iter(vec: &[f64]) -> SquareVecIter {
SquareVecIter { iter: vec.iter() }
}
impl Iterator for SquareVecIter<'_> {
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
self.iter.next().map(|next| next * next)
}
}
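Going one step further than the answers above (so this is my own variation, not something they proposed): if the named SquareVecIter type is not needed by callers, the struct can probably be dropped entirely in favour of returning impl Iterator. A minimal sketch:

```rust
// Same behaviour as `SquareVecIter`, without a hand-written iterator type.
// The `+ '_` ties the returned iterator to the borrow of `values`.
pub fn square_iter(values: &[f64]) -> impl Iterator<Item = f64> + '_ {
    values.iter().map(|x| x * x)
}
```

The trade-off is that the concrete type can no longer be named, for example as a struct field.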

How can I intersperse a rust iterator with a value every n items?

I have an iterator of characters, and I want to add a newline every N characters:
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = todo!();
let string: String = iter_with_newlines.collect();
assert_eq!("abcdefghij\nklmnopqrst\nuvwxyz", string);
So basically, I want to intersperse the iterator with a newline every n characters. How can I do this?
Some Ideas I had
It would be great if I could do something like this, where chunks would be a method to make Iterator<T> into Iterator<Iterator<T>>: iter.chunks(10).intersperse('\n').flatten()
It would also be cool if I could do something like this: iter.chunks.intersperseEvery(10, '\n'), where intersperseEvery is a method that would only intersperse the value every n items.
You can do it without temporary allocation using enumerate and flat_map:
use either::Either;
fn main() {
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = iter
.enumerate()
.flat_map(|(i, c)| {
if i % 10 == 0 {
Either::Left(['\n', c].into_iter())
} else {
Either::Right(std::iter::once(c))
}
})
.skip(1); // The code above adds a newline in the first position, so skip it
let string: String = iter_with_newlines.collect();
assert_eq!("abcdefghij\nklmnopqrst\nuvwxyz", string);
}
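If pulling in the either crate is not desirable, the same enumerate and flat_map idea can probably be expressed with the standard library alone, because an Option chained with once has the same type in both cases. A sketch, with with_newlines as a made-up helper name:

```rust
fn with_newlines(text: &str, n: usize) -> String {
    assert!(n > 0, "n must be non-zero");
    text.chars()
        .enumerate()
        .flat_map(|(i, c)| {
            // `Some('\n')` in front of every n-th character, except the very first one.
            let separator = (i > 0 && i % n == 0).then_some('\n');
            separator.into_iter().chain(std::iter::once(c))
        })
        .collect()
}
```

Because the i > 0 check suppresses the leading newline, no skip(1) is needed here.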
Here's what I ended up doing:
// src/intersperse_sparse.rs
use core::iter::Peekable;
/// An iterator adaptor to insert a particular value
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator: I::Item,
}
impl<I> IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator: I::Item) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator,
index: 0,
}
}
}
impl<I> Iterator for IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some(self.separator.clone())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// An iterator adaptor to insert a particular value created by a function
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator_closure: G,
}
impl<I, G> IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator_closure: G) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator_closure,
index: 0,
}
}
}
impl<I, G> Iterator for IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some((self.separator_closure)())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// Import this trait to use `iter.intersperse_sparse(n, item)` and `iter.intersperse_sparse_with(n, || item)` on all iterators.
pub trait IntersperseSparseAdapter: Iterator {
fn intersperse_sparse(self, chunk_size: usize, separator: Self::Item) -> IntersperseSparse<Self>
where
Self: Sized,
Self::Item: Clone,
{
IntersperseSparse::new(self, chunk_size, separator)
}
fn intersperse_sparse_with<G>(
self,
chunk_size: usize,
separator_closure: G,
) -> IntersperseSparseWith<Self, G>
where
Self: Sized,
G: FnMut() -> Self::Item,
{
IntersperseSparseWith::new(self, chunk_size, separator_closure)
}
}
impl<I> IntersperseSparseAdapter for I where I: Iterator {}
To use it:
// src/main.rs
mod intersperse_sparse;
use intersperse_sparse::IntersperseSparseAdapter;
fn main() {
let string = "abcdefg";
let new_string: String = string.chars().intersperse_sparse(3, '\n').collect();
assert_eq!(new_string, "abc\ndef\ng");
}
If you don't particularly care about performance, you can use chunks from itertools, collect the chunks into Vecs, and then intersperse your element as a single-element Vec, just to flatten the whole thing finally.
use itertools::Itertools;
iter
.chunks(3)
.into_iter()
.map(|chunk| chunk.collect::<Vec<_>>())
.intersperse(vec![','])
.flat_map(|chunk| chunk.into_iter())
.collect::<String>();
Other than that, consider writing your own iterator extension trait, which is how itertools itself provides its methods.
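For comparison, the same "collect first, don't worry about performance" idea can be sketched with only the standard library, using slice chunks instead of the itertools method. The wrap_every name is an assumption of mine, and it joins with newlines rather than the ',' used above:

```rust
fn wrap_every(text: &str, n: usize) -> String {
    assert!(n > 0, "chunk size must be non-zero");
    // Collect into a Vec so that slice::chunks is available.
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(n)
        .map(|chunk| chunk.iter().collect::<String>())
        .collect::<Vec<String>>()
        .join("\n")
}
```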
Build an Iterator with from_fn:
let mut iter = "abcdefghijklmnopqrstuvwxyz".chars().peekable();
let mut count = 0;
let iter_with_newlines = std::iter::from_fn(move || match iter.peek() {
Some(_) => {
if count < 10 {
count += 1;
iter.next()
} else {
count = 0;
Some('\n')
}
}
None => None,
});
assert_eq!(
"abcdefghij\nklmnopqrst\nuvwxyz",
iter_with_newlines.collect::<String>()
);

Enumerate over indices without values in rust

Is there a cleaner way to do this?
for i in collection.iter().enumerate().map(|(i, _)| i) {...}
In other words I'm looking for a method like enumerate but which only gives the indices, not the values. Something like
for i in collection.iter().indices() {...}
Does such a method exist?
I want to chain a bunch of other methods but I don't want to type |(i, _)| in each closure, I want to just type |i|
[...]
Does such a method exist?
No, but you can write an extension trait that provides one (playground):
trait IndicesExt<I> {
fn indices(self) -> Indices<I>;
}
struct Indices<I> {
next: usize,
iter: I,
}
impl<I: Iterator> IndicesExt<I> for I {
fn indices(self) -> Indices<I> {
Indices {
next: 0,
iter: self,
}
}
}
impl<I: Iterator> Iterator for Indices<I> {
type Item = usize;
fn next(&mut self) -> Option<usize> {
if let Some(_) = self.iter.next() {
let current = self.next;
self.next += 1;
Some(current)
} else {
None
}
}
}
The implementation can be simplified by reusing the Enumerate iterator returned by Iterator::enumerate() (playground):
struct Indices<I>(std::iter::Enumerate<I>);
impl<I: Iterator> IndicesExt<I> for I {
fn indices(self) -> Indices<I> {
Indices(self.enumerate())
}
}
impl<I: Iterator> Iterator for Indices<I> {
type Item = usize;
fn next(&mut self) -> Option<usize> {
self.0.next().map(|(idx, _)| idx)
}
}
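If the collection knows its own length (slices, Vec and similar), a plain range may already be enough and avoids the extension trait. This is an alternative that assumes a len() method is available, which is not true for arbitrary iterators; indices_of_negatives is only an example name:

```rust
fn indices_of_negatives(collection: &[i32]) -> Vec<usize> {
    // `0..len` yields just the indices, so closures can take `|i|` directly.
    (0..collection.len())
        .filter(|&i| collection[i] < 0)
        .collect()
}
```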

Optionally call `skip` in a custom iterator `next()` function

I have a custom iterator and I would like to optionally call .skip(...) in the custom .next() method. However, I get a type error because Skip != Iterator.
Sample code is as follows:
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter
}
// lots of code here working with the new iterator
iter.next()
}
}
The issue is that after calling .skip(3), the type of iter has changed. One solution would be to duplicate the // lots of code ... in each branch of the if statement, but I'd rather not.
My question is: Is there a way to conditionally apply skip(...) to an iterator and continue working with it without duplicating a bunch of code?
skip is designed to construct a new iterator, which is very useful in situations where you want your code to remain, at least on the surface, immutable. However, in your case, you want to advance the existing iterator while still leaving it valid.
There is advance_by, which does what you want, but it is nightly-only, so it won't compile on stable Rust.
if self.index == 0 {
self.index += 3;
self.iter.advance_by(3);
}
We can abuse nth to get what we want, but it's not very idiomatic.
if self.index == 0 {
self.index += 3;
self.iter.nth(2);
}
If I saw that code in production, I'd be quite puzzled.
The simplest, if not terribly satisfying, answer is to just reimplement advance_by as a helper function. The source is available and pretty easy to adapt:
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
for i in 0..n {
iter.next().ok_or(i)?;
}
Ok(())
}
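A small usage sketch of that helper, repeated here so the snippet stands on its own. The after_first_three function is hypothetical and only shows that the iterator keeps its original type after being advanced:

```rust
fn my_advance_by(iter: &mut impl Iterator, n: usize) -> Result<(), usize> {
    for i in 0..n {
        // On exhaustion, report how many steps were actually taken.
        iter.next().ok_or(i)?;
    }
    Ok(())
}

// Hypothetical caller: drop the first three items, keep the rest.
fn after_first_three(data: &[i32]) -> Vec<i32> {
    let mut iter = data.iter();
    // `iter` is still a `slice::Iter` afterwards, not a `Skip<...>`.
    let _ = my_advance_by(&mut iter, 3);
    iter.copied().collect()
}
```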
All this being said, if your use case is actually just to skip the first three elements, all you need is to start with the skip call and assume your iterator is always a Skip:
struct CrossingIter<'a, T> {
index: usize,
iter: std::iter::Skip<std::slice::Iter<'a, T>>,
}
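Fleshed out a little, that might look like the sketch below. I am assuming T: Copy instead of the Float bound to avoid the num dependency, adding a new constructor that is not in the question, and wrapping Enumerate inside Skip so the indices come for free and the index field is no longer needed:

```rust
struct CrossingIter<'a, T> {
    // Skipping is decided once, at construction time.
    iter: std::iter::Skip<std::iter::Enumerate<std::slice::Iter<'a, T>>>,
}

impl<'a, T: Copy> CrossingIter<'a, T> {
    // Hypothetical constructor, not part of the original question.
    fn new(data: &'a [T]) -> Self {
        Self { iter: data.iter().enumerate().skip(3) }
    }
}

impl<'a, T: Copy> Iterator for CrossingIter<'a, T> {
    type Item = (usize, T);

    fn next(&mut self) -> Option<(usize, T)> {
        // Copy the value out of the slice reference.
        self.iter.next().map(|(i, &v)| (i, v))
    }
}
```

Since enumerate runs before skip, the yielded indices still refer to positions in the original slice.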
I think @Silvio's answer is a better perspective.
You can call skip(0) instead of using the iter itself in the else branch, so that both branches have the same Skip type.
And the return value of the iterator generated by enumerate doesn't match your definition: fn next(&mut self) -> Option<(usize, T)>. You need to map it.
Here is a working example:
use num::Float;
struct CrossingIter<'a, T> {
index: usize,
iter: std::slice::Iter<'a, T>,
}
impl<'a, T: Float> Iterator for CrossingIter<'a, T> {
type Item = (usize, T);
fn next(&mut self) -> Option<(usize, T)> {
let iter = (&mut self.iter).enumerate();
let mut iter = if self.index == 0 {
self.index += 3;
iter.skip(3)
} else {
iter.skip(0)
};
// lots of code here working with the new iterator
iter.next().map(|(i, &v)| (i, v))
}
}

How do I iterate over a range with a custom step?

How can I iterate over a range in Rust with a step other than 1? I'm coming from a C++ background so I'd like to do something like
for(auto i = 0; i <= n; i+=2) {
//...
}
In Rust I need to use the range function, and it doesn't seem like there is a third argument available for having a custom step. How can I accomplish this?
range_step_inclusive and range_step are long gone.
As of Rust 1.28, Iterator::step_by is stable:
fn main() {
for x in (1..10).step_by(2) {
println!("{}", x);
}
}
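The C++ loop in the question uses i <= n, so the closest equivalent is probably an inclusive range. A short sketch (the function names are only for illustration), including a descending variant:

```rust
/// Ascending: 0, 2, 4, ... up to and including `n` when `n` is even.
fn evens_up_to(n: u32) -> Vec<u32> {
    (0..=n).step_by(2).collect()
}

/// Descending: reverse the range first, then step.
fn down_from(n: u32) -> Vec<u32> {
    (0..=n).rev().step_by(2).collect()
}
```

Note that step_by panics if the step is 0.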
It seems to me that until the .step_by method is made stable, one can easily accomplish what you want with an Iterator (which is what Ranges really are anyway):
struct SimpleStepRange(isize, isize, isize); // start, end, and step
impl Iterator for SimpleStepRange {
type Item = isize;
#[inline]
fn next(&mut self) -> Option<isize> {
if self.0 < self.1 {
let v = self.0;
self.0 = v + self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in SimpleStepRange(0, 10, 2) {
println!("{}", i);
}
}
If one needs to iterate multiple ranges of different types, the code can be made generic as follows:
use std::ops::Add;
struct StepRange<T>(T, T, T)
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone;
impl<T> Iterator for StepRange<T>
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone
{
type Item = T;
#[inline]
fn next(&mut self) -> Option<T> {
if self.0 < self.1 {
let v = self.0.clone();
self.0 = &v + &self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in StepRange(0u64, 10u64, 2u64) {
println!("{}", i);
}
}
I'll leave it to you to eliminate the upper bounds check to create an open ended structure if an infinite loop is required...
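For the open-ended case, one option that avoids a custom struct altogether is std::iter::successors. This is a different technique from the StepRange above, sketched here for u64 only; open_ended is a made-up name:

```rust
// Yields start, start + step, start + 2 * step, ... with no upper bound.
// `checked_add` ends the sequence on overflow instead of panicking.
fn open_ended(start: u64, step: u64) -> impl Iterator<Item = u64> {
    std::iter::successors(Some(start), move |&x| x.checked_add(step))
}
```

Callers would typically bound it themselves with take or take_while.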
The advantages of this approach are that it works with for-loop sugar and will continue to work even when unstable features become usable; also, unlike the desugared approach using the standard ranges, it doesn't lose efficiency through multiple .next() calls. The disadvantage is that it takes a few lines of code to set up the iterator, so it may only be worth it for code that has a lot of loops.
If you are stepping by something predefined, and small like 2, you may wish to use the iterator to step manually. e.g.:
let mut iter = 1..10;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
iter.next();
}
You could even use this to step by an arbitrary amount (although this is definitely getting longer and harder to digest):
let mut iter = 1..10;
let step = 4;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
for _ in 0..step-1 {
iter.next();
}
}
Use the num crate with range_step
You'd write your C++ code:
for (auto i = 0; i <= n; i += 2) {
//...
}
...in Rust like so:
let mut i = 0;
while i <= n {
// ...
i += 2;
}
I think the Rust version is more readable too.
