Iterator that returns each Nth value - rust

I have an iterator iter; is it possible to convert it into an iterator that iterates over each Nth element? Something like iter.skip_each(n - 1)?

As of Rust 1.26, there is the Iterator::step_by method in the standard library:
Basic usage:
let a = [0, 1, 2, 3, 4, 5];
let mut iter = a.iter().step_by(2);
assert_eq!(iter.next(), Some(&0));
assert_eq!(iter.next(), Some(&2));
assert_eq!(iter.next(), Some(&4));
assert_eq!(iter.next(), None);

As Dogbert said, use itertools' step.
You are going to be in a world of hurt if you can't use external crates. The Rust ecosystem strongly encourages everything to be pushed to crates. Maybe you should just clone the repository locally and use it that way if you can.
Otherwise, write it yourself:
use std::iter::Fuse;
struct Nth<I> {
n: usize,
iter: Fuse<I>,
}
impl<I> Iterator for Nth<I>
where I: Iterator
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
let mut nth = None;
for _ in 0..self.n {
nth = self.iter.next();
}
nth
}
}
trait EveryNth: Iterator + Sized {
fn every_nth(self, n: usize) -> Nth<Self> {
Nth { n: n, iter: self.fuse() }
}
}
impl<I> EveryNth for I where I: Iterator {}
fn main() {
let x = [1,2,3,4,5,6,7,8,9];
for v in x.iter().every_nth(1) { println!("{}", v) }
println!("---");
for v in x.iter().every_nth(2) { println!("{}", v) }
println!("---");
for v in x.iter().every_nth(3) { println!("{}", v) }
println!("---");
for v in x.iter().every_nth(4) { println!("{}", v) }
println!("---");
for v in x.iter().every_nth(5) { println!("{}", v) }
println!("---");
for v in x.iter().every_nth(6) { println!("{}", v) }
}
Note this is different behavior from itertools. You didn't specify where in the cycle of N the iterator picks from, so I chose the one that was easiest to code.

Related

How can I intersperse a rust iterator with a value every n items?

I have an iterator of characters, and I want to add a newline every N characters:
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = todo!();
let string: String = iter_with_newlines.collect();
assert_eq("abcdefghij\nklmnopqrst\nuvwxyz", string);
So basically, I want to intersperse the iterator with a newline every n characters. How can I do this?
Some Ideas I had
It would be great if I could do something like this, where chunks would be a method to make Iterator<T> into Iterator<Iterator<T>: iter.chunks(10).intersperse('\n').flatten()
It would also be cool if I could do something like this: iter.chunks.intersperseEvery(10, '\n'), where intersperseEvery is a method that would only intersperse the value every n items.
You can do it without temporary allocation using enumerate and flat_map:
use either::Either;
fn main() {
let iter = "abcdefghijklmnopqrstuvwxyz".chars();
let iter_with_newlines = iter
.enumerate()
.flat_map(|(i, c)| {
if i % 10 == 0 {
Either::Left(['\n', c].into_iter())
} else {
Either::Right(std::iter::once(c))
}
})
.skip(1); // The above code add a newline in first position -> skip it
let string: String = iter_with_newlines.collect();
assert_eq!("abcdefghij\nklmnopqrst\nuvwxyz", string);
}
Playground
Here's what I ended up doing:
// src/intersperse_sparse.rs
use core::iter::Peekable;
/// An iterator adaptor to insert a particular value
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator: I::Item,
}
impl<I> IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator: I::Item) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator,
index: 0,
}
}
}
impl<I> Iterator for IntersperseSparse<I>
where
I: Iterator,
I::Item: Clone,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some(self.separator.clone())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// An iterator adaptor to insert a particular value created by a function
/// every n elements of the adapted iterator.
///
/// Iterator element type is `I::Item`
pub struct IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
iter: Peekable<I>,
step_length: usize,
index: usize,
separator_closure: G,
}
impl<I, G> IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
#[allow(unused)] // Although this function isn't explicitly exported, it is called in the default implementation of the IntersperseSparseAdapter, which is exported.
fn new(iter: I, step_length: usize, separator_closure: G) -> Self {
if step_length == 0 {
panic!("Chunk size cannot be 0!")
}
Self {
iter: iter.peekable(),
step_length,
separator_closure,
index: 0,
}
}
}
impl<I, G> Iterator for IntersperseSparseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if self.index == self.step_length && self.iter.peek().is_some() {
self.index = 0;
Some((self.separator_closure)())
} else {
self.index += 1;
self.iter.next()
}
}
}
/// Import this trait to use the `iter.intersperse_sparse(n, item)` and `iter.intersperse_sparse(n, ||item)` on all iterators.
pub trait IntersperseSparseAdapter: Iterator {
fn intersperse_sparse(self, chunk_size: usize, separator: Self::Item) -> IntersperseSparse<Self>
where
Self: Sized,
Self::Item: Clone,
{
IntersperseSparse::new(self, chunk_size, separator)
}
fn intersperse_sparse_with<G>(
self,
chunk_size: usize,
separator_closure: G,
) -> IntersperseSparseWith<Self, G>
where
Self: Sized,
G: FnMut() -> Self::Item,
{
IntersperseSparseWith::new(self, chunk_size, separator_closure)
}
}
impl<I> IntersperseSparseAdapter for I where I: Iterator {}
To use it:
// src/main.rs
mod intersperse_sparse;
use intersperse_sparse::IntersperseSparseAdapter;
fn main() {
let string = "abcdefg";
let new_string: String = string.chars().intersperse_sparse(3, '\n').collect();
assert_eq!(new_string, "abc\ndef\ng");
}
If you don't particularly care about performance, you can use chunks from itertools, collect the chunks into Vecs, and then intersperse your element as a single-element Vec, just to flatten the whole thing finally.
use itertools::Itertools;
iter
.chunks(3)
.into_iter()
.map(|chunk| chunk.collect::<Vec<_>>())
.intersperse(vec![','])
.flat_map(|chunk| chunk.into_iter())
.collect::<String>();
Playground
Other than that, consider writing your own iterator extension trait, just like itertools is one?
Build an Iterator with from_fn:
let mut iter = "abcdefghijklmnopqrstuvwxyz".chars().peekable();
let mut count = 0;
let iter_with_newlines = std::iter::from_fn(move || match iter.peek() {
Some(_) => {
if count < 10 {
count += 1;
iter.next()
} else {
count = 0;
Some('\n')
}
}
None => None,
});
assert_eq!(
"abcdefghij\nklmnopqrst\nuvwxyz",
iter_with_newlines.collect::<String>()
);
Playground

How can concatenated &[u8] slices implement the Read trait without additional copying?

The Read trait is implemented for &[u8]. How can I get a Read trait over several concatenated u8 slices without actually doing any concatenation first?
If I concatenate first, there will be two copies -- multiple arrays into a single array followed by copying from single array to destination via the Read trait. I would like to avoid the first copying.
I want a Read trait over &[&[u8]] that treats multiple slices as a single continuous slice.
fn foo<R: std::io::Read + Send>(data: R) {
// ...
}
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(c); // <- this won't compile because `c` is not a slice of bytes.
You could use the multi_reader crate, which can concatenate any number of values that implement Read:
let a: &[u8] = &[1, 2, 3, 4, 5];
let b: &[u8] = &[1, 2];
let c: &[&[u8]] = &[a, b];
foo(multi_reader::MultiReader::new(c.iter().copied()));
If you don't want to depend on an external crate, you can wrap the slices in a struct of your own and implement Read for it:
struct MultiRead<'a> {
sources: &'a [&'a [u8]],
pos_in_current: usize,
}
impl<'a> MultiRead<'a> {
fn new(sources: &'a [&'a [u8]]) -> MultiRead<'a> {
MultiRead {
sources,
pos_in_current: 0,
}
}
}
impl Read for MultiRead<'_> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let current = loop {
if self.sources.is_empty() {
return Ok(0); // EOF
}
let current = self.sources[0];
if self.pos_in_current < current.len() {
break current;
}
self.pos_in_current = 0;
self.sources = &self.sources[1..];
};
let read_size = buf.len().min(current.len() - self.pos_in_current);
buf[..read_size].copy_from_slice(&current[self.pos_in_current..][..read_size]);
self.pos_in_current += read_size;
Ok(read_size)
}
}
Playground
Create a wrapper type around the slices and implement Read for it. Compared to user4815162342's answer, I delegate down to the implementation of Read for slices:
use std::{io::Read, mem};
struct Wrapper<'a, 'b>(&'a mut [&'b [u8]]);
impl<'a, 'b> Read for Wrapper<'a, 'b> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let slices = mem::take(&mut self.0);
match slices {
[head, ..] => {
let n_bytes = head.read(buf)?;
if head.is_empty() {
// Advance the child slice
self.0 = &mut slices[1..];
} else {
// More to read, put back all the child slices
self.0 = slices;
}
Ok(n_bytes)
}
_ => Ok(0),
}
}
}
fn main() {
let parts: &mut [&[u8]] = &mut [b"hello ", b"world"];
let mut w = Wrapper(parts);
let mut buf = Vec::new();
w.read_to_end(&mut buf).unwrap();
assert_eq!(b"hello world", &*buf);
}
A more efficient implementation would implement further methods from Read, such as read_to_end or read_vectored.
See also:
How do I implement a trait I don't own for a type I don't own?

Chain two iterators while lazily constructing the second one

I'd like a method like Iterator::chain() that only computes the argument iterator when it's needed. In the following code, expensive_function should never be called:
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for &i in nums.iter().chain(expensive_function().iter()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}
One possible approach: delegate the expensive computation to an iterator adaptor.
let nums = [1, 2, 3];
for i in nums.iter()
.cloned()
.chain([()].into_iter().flat_map(|_| expensive_function()))
{
if i > 2 {
break;
} else {
println!("{}", i);
}
}
Playground
The passed iterator is the result of flat-mapping a dummy unit value () to the list of values, which is lazy. Since the iterator needs to own the respective outcome of that computation, I chose to copy the number from the array.
You can create your own custom iterator adapter that only evaluates a closure when the original iterator is exhausted.
trait IteratorExt: Iterator {
fn chain_with<F, I>(self, f: F) -> ChainWith<Self, F, I::IntoIter>
where
Self: Sized,
F: FnOnce() -> I,
I: IntoIterator<Item = Self::Item>,
{
ChainWith {
base: self,
factory: Some(f),
iterator: None,
}
}
}
impl<I: Iterator> IteratorExt for I {}
struct ChainWith<B, F, I> {
base: B,
factory: Option<F>,
iterator: Option<I>,
}
impl<B, F, I> Iterator for ChainWith<B, F, I::IntoIter>
where
B: Iterator,
F: FnOnce() -> I,
I: IntoIterator<Item = B::Item>,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if let Some(b) = self.base.next() {
return Some(b);
}
// Exhausted the first, generate the second
if let Some(f) = self.factory.take() {
self.iterator = Some(f().into_iter());
}
self.iterator
.as_mut()
.expect("There must be an iterator")
.next()
}
}
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
panic!("You lose, good day sir");
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for i in nums.iter().cloned().chain_with(|| expensive_function()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}

How to define mutual recursion with closures?

I can do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
fn foo(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in i+1..u.len() {
u[k] += 1;
bar(u, v, k, j);
}
}
fn bar(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in j+1..v.len() {
v[k] += 1;
foo(u, v, i, k);
}
}
foo(&mut u, &mut v, 0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
but I would prefer to do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
let foo = |i, j| {
for k in i+1..u.len() {
u[k] += 1;
bar(k, j);
}
};
let bar = |i, j| {
for k in j+1..v.len() {
v[k] += 1;
foo(i, k);
}
};
foo(0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
The second example doesn't compile with the error: unresolved name bar.
In my task I can do it through one recursion, but it will not look clear.
Does anyone have any other suggestions?
I have a solution for mutually recursive closures, but it doesn't work with multiple mutable borrows, so I couldn't extend it to your example.
There is a way to use define mutually recursive closures, using an approach similar to how this answer does single recursion. You can put the closures together into a struct, where each of them takes a borrow of that struct as an extra argument.
fn func(n: u32) -> bool {
struct EvenOdd<'a> {
even: &'a Fn(u32, &EvenOdd<'a>) -> bool,
odd: &'a Fn(u32, &EvenOdd<'a>) -> bool
}
let evenodd = EvenOdd {
even: &|n, evenodd| {
if n == 0 {
true
} else {
(evenodd.odd)(n - 1, evenodd)
}
},
odd: &|n, evenodd| {
if n == 0 {
false
} else {
(evenodd.even)(n - 1, evenodd)
}
}
};
(evenodd.even)(n, &evenodd)
}
fn main() {
println!("{}", func(5));
println!("{}", func(6));
}
While defining mutually recursive closures works in some cases, as demonstrated in the answer by Alex Knauth, I don't think that's an approach you should usually take. It is kind of opaque, has some limitations pointed out in the other answer, and it also has a performance overhead since it uses trait objects and dynamic dispatch at runtime.
Closures in Rust can be thought of as functions with associated structs storing the data you closed over. So a more general solution is to define your own struct storing the data you want to close over, and define methods on that struct instead of closures. For this case, the code could look like this:
pub struct FooBar {
pub u: Vec<i32>,
pub v: Vec<i32>,
}
impl FooBar {
fn new(u: Vec<i32>, v: Vec<i32>) -> Self {
Self { u, v }
}
fn foo(&mut self, i: usize, j: usize) {
for k in i+1..self.u.len() {
self.u[k] += 1;
self.bar(k, j);
}
}
fn bar(&mut self, i: usize, j: usize) {
for k in j+1..self.v.len() {
self.v[k] += 1;
self.foo(i, k);
}
}
}
fn main() {
let mut x = FooBar::new(vec![0;5], vec![0;5]);
x.foo(0, 0);
println!("{:?}", x.u);
println!("{:?}", x.v);
}
(Playground)
While this can get slightly more verbose than closures, and requires a few more explicit type annotations, it's more flexible and easier to read, so I would generally prefer this approach.

How do I iterate over a range with a custom step?

How can I iterate over a range in Rust with a step other than 1? I'm coming from a C++ background so I'd like to do something like
for(auto i = 0; i <= n; i+=2) {
//...
}
In Rust I need to use the range function, and it doesn't seem like there is a third argument available for having a custom step. How can I accomplish this?
range_step_inclusive and range_step are long gone.
As of Rust 1.28, Iterator::step_by is stable:
fn main() {
for x in (1..10).step_by(2) {
println!("{}", x);
}
}
It seems to me that until the .step_by method is made stable, one can easily accomplish what you want with an Iterator (which is what Ranges really are anyway):
struct SimpleStepRange(isize, isize, isize); // start, end, and step
impl Iterator for SimpleStepRange {
type Item = isize;
#[inline]
fn next(&mut self) -> Option<isize> {
if self.0 < self.1 {
let v = self.0;
self.0 = v + self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in SimpleStepRange(0, 10, 2) {
println!("{}", i);
}
}
If one needs to iterate multiple ranges of different types, the code can be made generic as follows:
use std::ops::Add;
struct StepRange<T>(T, T, T)
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone;
impl<T> Iterator for StepRange<T>
where for<'a> &'a T: Add<&'a T, Output = T>,
T: PartialOrd,
T: Clone
{
type Item = T;
#[inline]
fn next(&mut self) -> Option<T> {
if self.0 < self.1 {
let v = self.0.clone();
self.0 = &v + &self.2;
Some(v)
} else {
None
}
}
}
fn main() {
for i in StepRange(0u64, 10u64, 2u64) {
println!("{}", i);
}
}
I'll leave it to you to eliminate the upper bounds check to create an open ended structure if an infinite loop is required...
Advantages of this approach is that is works with for sugaring and will continue to work even when unstable features become usable; also, unlike the de-sugared approach using the standard Ranges, it doesn't lose efficiency by multiple .next() calls. Disadvantages are that it takes a few lines of code to set up the iterator so may only be worth it for code that has a lot of loops.
If you are stepping by something predefined, and small like 2, you may wish to use the iterator to step manually. e.g.:
let mut iter = 1..10;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
iter.next();
}
You could even use this to step by an arbitrary amount (although this is definitely getting longer and harder to digest):
let mut iter = 1..10;
let step = 4;
loop {
match iter.next() {
Some(x) => {
println!("{}", x);
},
None => break,
}
for _ in 0..step-1 {
iter.next();
}
}
Use the num crate with range_step
You'd write your C++ code:
for (auto i = 0; i <= n; i += 2) {
//...
}
...in Rust like so:
let mut i = 0;
while i <= n {
// ...
i += 2;
}
I think the Rust version is more readable too.

Resources