How to zip two iterators of unequal length with a default?


I'm trying to zip two iterators of unequal length, but zip only yields a pair while both iterators have a value and ignores the rest of the longer iterator.
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1.iter().rev().zip(num2.iter().rev()) {
println!("{:?}", i);
}
}
This prints (2, 3). How do I make it return:
(2, 3)
(1, 0) // default is the 0 here.
Is there any other way to do it?

You could use the zip_longest provided by the itertools crate.
use itertools::{
Itertools,
EitherOrBoth::*,
};
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for pair in num1.iter().rev().zip_longest(num2.iter().rev()) {
match pair {
Both(l, r) => println!("({:?}, {:?})", l, r),
Left(l) => println!("({:?}, 0)", l),
Right(r) => println!("(0, {:?})", r),
}
}
}
Which would produce the following output:
(2, 3)
(1, 0)

Zip will stop as soon as one of the iterators stops producing values. If you know which one is longer, you can pad the shorter one with your default value:
use std::iter;
fn main() {
let longer = vec![1, 2];
let shorter = vec![3];
for i in longer
.iter()
.rev()
.zip(shorter.iter().rev().chain(iter::repeat(&0)))
{
println!("{:?}", i);
}
}
If you don't know which is longer, you should use itertools, as Peter Varo suggests.
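If pulling in itertools is not an option, the same padding idea can be sketched with only the standard library. The helper below, zip_or, is a hypothetical name and not part of std or itertools; it assumes the padding values are Clone and fuses both iterators so that next() can safely be called after either one has ended.

```rust
use std::iter;

// Hypothetical helper (not part of std or itertools): zips two iterators and
// pads whichever one runs out first with the given default.
fn zip_or<A, B>(
    a: impl IntoIterator<Item = A>,
    b: impl IntoIterator<Item = B>,
    pad_a: A,
    pad_b: B,
) -> impl Iterator<Item = (A, B)>
where
    A: Clone,
    B: Clone,
{
    // fuse() makes it safe to keep calling next() after an iterator has ended.
    let mut a = a.into_iter().fuse();
    let mut b = b.into_iter().fuse();
    iter::from_fn(move || match (a.next(), b.next()) {
        // Stop only when both sides are exhausted.
        (None, None) => None,
        (x, y) => Some((
            x.unwrap_or_else(|| pad_a.clone()),
            y.unwrap_or_else(|| pad_b.clone()),
        )),
    })
}

fn main() {
    let num1 = vec![1, 2];
    let num2 = vec![3];
    // Prints (2, 3) and then (1, 0).
    for pair in zip_or(num1.iter().rev(), num2.iter().rev(), &0, &0) {
        println!("{:?}", pair);
    }
}
```

Unlike the chain(repeat(..)) version, this does not need to know in advance which input is longer.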

The key is to detect that one iterator is shorter than the other. In your case you could do that up front, because the Vec iterators implement ExactSizeIterator, but a general solution is a custom .zip().
itertools already offers a general solution, .zip_longest():
use itertools::EitherOrBoth::{Both, Left, Right};
use itertools::Itertools;
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.rev()
.zip_longest(num2.iter().rev())
.map(|x| match x {
Both(a, b) => (a, b),
Left(a) => (a, &0),
Right(b) => (&0, b),
})
{
println!("{:?}", i);
}
}
This requires you to write the closure every time. If you need this feature a lot, you could implement a custom trait on Iterator with a .zip_default() method, where A and B implement Default:
use std::default::Default;
use std::iter::Fuse;
pub trait MyIterTools: Iterator {
fn zip_default<J>(self, other: J) -> ZipDefault<Self, J::IntoIter>
where
J: IntoIterator,
Self: Sized,
{
ZipDefault::new(self, other.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
i: Fuse<I>,
j: Fuse<J>,
}
impl<I, J> ZipDefault<I, J>
where
I: Iterator,
J: Iterator,
{
fn new(i: I, j: J) -> Self {
Self {
i: i.fuse(),
j: j.fuse(),
}
}
}
impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
T: Iterator<Item = A>,
U: Iterator<Item = B>,
A: Default,
B: Default,
{
type Item = (A, B);
fn next(&mut self) -> Option<Self::Item> {
match (self.i.next(), self.j.next()) {
(Some(a), Some(b)) => Some((a, b)),
(Some(a), None) => Some((a, B::default())),
(None, Some(b)) => Some((A::default(), b)),
(None, None) => None,
}
}
}
impl<T: ?Sized> MyIterTools for T where T: Iterator {}
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.copied()
.rev()
.zip_default(num2.iter().copied().rev())
{
println!("{:?}", i);
}
}
Using itertools we can delegate some logic:
use std::default::Default;
use itertools::Itertools;
use itertools::ZipLongest;
use itertools::EitherOrBoth::{Both, Left, Right};
pub trait MyIterTools: Iterator {
fn zip_default<J>(self, j: J) -> ZipDefault<Self, J::IntoIter>
where
Self: Sized,
J: IntoIterator,
{
ZipDefault::new(self, j.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
inner: ZipLongest<I, J>,
}
impl<I, J> ZipDefault<I, J>
where
I: Iterator,
J: Iterator,
{
fn new(i: I, j: J) -> Self {
Self {
inner: i.zip_longest(j),
}
}
}
impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
T: Iterator<Item = A>,
U: Iterator<Item = B>,
A: Default,
B: Default,
{
type Item = (A, B);
fn next(&mut self) -> Option<Self::Item> {
match self.inner.next()? {
Both(a, b) => Some((a, b)),
Left(a) => Some((a, B::default())),
Right(b) => Some((A::default(), b)),
}
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.inner.size_hint()
}
}
impl<T: ?Sized> MyIterTools for T where T: Iterator {}
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.copied()
.rev()
.zip_default(num2.iter().copied().rev())
{
println!("{:?}", i);
}
}

I saw this neat trick in someone else's LeetCode solution. If you have access to the lengths, you can swap the iterators so that iter1 is the longer one.
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
let mut iter1 = num1.iter();
let mut iter2 = num2.iter();
if iter1.len() < iter2.len() {
std::mem::swap(&mut iter1, &mut iter2);
} // now iter1 is the longer one
for i in iter1.rev().zip(iter2.rev().chain(std::iter::repeat(&0))) {
println!("{:?}", i);
}
}
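One thing to keep in mind with the swap is that it also swaps which input ends up on the left of each pair. The sketch below records whether a swap happened and flips the pairs back; zip_padded is a hypothetical helper name, and the padding value 0 is hard-coded as in the example above.

```rust
use std::iter;

// Hypothetical helper: reverse-zips two slices, padding the shorter one
// with 0, and keeps items from `xs` on the left of every pair.
fn zip_padded(xs: &[i32], ys: &[i32]) -> Vec<(i32, i32)> {
    let (mut long, mut short) = (xs.iter(), ys.iter());
    let swapped = long.len() < short.len();
    if swapped {
        std::mem::swap(&mut long, &mut short);
    } // now `long` is the longer iterator
    long.rev()
        .zip(short.rev().chain(iter::repeat(&0)))
        // Undo the swap so the tuple order matches the argument order.
        .map(|(&l, &s)| if swapped { (s, l) } else { (l, s) })
        .collect()
}

fn main() {
    println!("{:?}", zip_padded(&[1, 2], &[3])); // [(2, 3), (1, 0)]
    println!("{:?}", zip_padded(&[3], &[1, 2])); // [(3, 2), (0, 1)]
}
```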

If you can get the lengths of the iterators, as in this case, a quick and dirty way could be:
use std::iter::repeat;
fn main() {
let a = vec![1, 2, 3];
let b = vec![4, 5, 6, 7];
for i in a
.iter()
.rev()
.chain(repeat(&0).take(b.len().saturating_sub(a.len())))
.zip(
b.iter()
.rev()
.chain(repeat(&0).take(a.len().saturating_sub(b.len()))),
)
{
println!("{:?}", i);
}
}
You can also implement a trait containing a zip_default() using this approach:
use std::iter::{repeat, Chain, Repeat, Take, Zip};
pub trait MyIterTools<X: Default + Clone>: ExactSizeIterator<Item = X> {
fn zip_default<J, Y>(self, j: J) -> ZipDefault<Self, J::IntoIter, X, Y>
where
Self: Sized,
J: IntoIterator<Item = Y>,
J::IntoIter: ExactSizeIterator,
Y: Default + Clone,
{
ZipDefault::new(self, j.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> {
inner: Zip<Chain<I, Take<Repeat<X>>>, Chain<J, Take<Repeat<Y>>>>,
}
impl<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> ZipDefault<I, J, X, Y>
{
fn new(a: I, b: J) -> Self {
let a_len = a.len();
let b_len = b.len();
Self {
inner: a
.chain(repeat(X::default()).take(b_len.saturating_sub(a_len)))
.zip(b.chain(repeat(Y::default()).take(a_len.saturating_sub(b_len)))),
}
}
}
impl<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> Iterator for ZipDefault<I, J, X, Y>
{
type Item = (X, Y);
fn next(&mut self) -> Option<Self::Item> {
self.inner.next()
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.inner.size_hint()
}
}
impl<T: ExactSizeIterator<Item = X>, X: Default + Clone> MyIterTools<X> for T {}
fn main() {
let a = vec![1, 2, 3];
let b = vec![4, 5, 6, 7];
a.into_iter()
.zip_default(b.into_iter())
.for_each(|i| println!("{:?}", i));
}

Related

Can I iterate by chunks using zip in Rust?

In python I can write the following to iterate by tuples.
it = iter(range(10))
zip(it, it) # [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
Rust won't let me borrow the iterator twice (or use the same iterator twice, because of the move).
let mut i1 = a.iter();
let i2 = &mut i1;
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
I know about the itertools crate; I just wonder whether there's some hack that would let me get by without it if I only need this functionality.
Obviously you can do something like this:
struct Chunks<I: Iterator<Item = T>, T> {
seq: I,
}
impl<I: Iterator<Item = T>, T> Iterator for Chunks<I, T> {
type Item = (T, T);
fn next(&mut self) -> Option<Self::Item> {
self.seq.next().zip(self.seq.next())
}
}
But that works only for pairs, not for triples. Triples would require some kind of macro.
With std::iter::from_fn you can create a one-liner (thanks to @user4815162342).
let mut seq = a.iter();
let chunks = std::iter::from_fn(move || seq.next().zip(seq.next()));
chunks.for_each(|(a, b)| println!("a: {}, b: {}", a, b));
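For a fixed arity such as triples, the same from_fn idea can be written without a macro by using the ? operator inside the closure. This is only a sketch: triples is a hypothetical helper name, one such function is needed per arity, and a trailing incomplete group is silently dropped.

```rust
use std::iter;

// Hypothetical helper: groups an iterator into non-overlapping triples,
// dropping a trailing incomplete group.
fn triples<T>(items: impl IntoIterator<Item = T>) -> impl Iterator<Item = (T, T, T)> {
    let mut seq = items.into_iter();
    // Each `?` returns None from the closure as soon as the source runs dry.
    iter::from_fn(move || Some((seq.next()?, seq.next()?, seq.next()?)))
}

fn main() {
    for (a, b, c) in triples(0..10) {
        println!("a: {}, b: {}, c: {}", a, b, c);
    }
}
```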

This could be achieved with interior mutability. Note that it is only the next method call that requires a mutable borrow.
use std::cell::RefCell;
struct CellIter<I: Iterator>(RefCell<I>);
impl<I: Iterator> CellIter<I> {
pub fn new(iter: I) -> Self {
Self(RefCell::new(iter))
}
}
impl<I: Iterator> Iterator for &CellIter<I> {
type Item = <I as Iterator>::Item;
fn next(&mut self) -> Option<Self::Item> {
self.0.borrow_mut().next()
}
}
fn main() {
let a = CellIter::new(0..10);
let i1 = &a;
let i2 = &a;
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
}
Output:
a: 0, b: 1
a: 2, b: 3
a: 4, b: 5
a: 6, b: 7
a: 8, b: 9

Here is a solution based on the same idea as Alsein's solution (using a RefCell), but somewhat shorter using std::iter::from_fn as helper:
use std::cell::RefCell;
fn share<T>(iter: impl Iterator<Item = T>) -> impl Fn() -> Option<T> {
let wrapped_iter = RefCell::new(iter);
move || wrapped_iter.borrow_mut().next()
}
fn main() {
let a = share(0..10);
let i1 = std::iter::from_fn(&a);
let i2 = std::iter::from_fn(&a);
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
}

You can use slice windows rather than zipping the iterators: https://doc.rust-lang.org/std/slice/struct.Windows.html
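Note that windows yields overlapping sub-slices, whereas the Python snippet in the question produces non-overlapping pairs; for slices, chunks_exact may be closer to that behaviour. The sketch below contrasts the two; the function names are made up for illustration.

```rust
// Overlapping pairs: (s[0], s[1]), (s[1], s[2]), ...
fn overlapping_pairs(s: &[i32]) -> Vec<(i32, i32)> {
    s.windows(2).map(|w| (w[0], w[1])).collect()
}

// Non-overlapping pairs; a trailing odd element is left out.
fn disjoint_pairs(s: &[i32]) -> Vec<(i32, i32)> {
    s.chunks_exact(2).map(|c| (c[0], c[1])).collect()
}

fn main() {
    let a: Vec<i32> = (0..6).collect();
    println!("{:?}", overlapping_pairs(&a)); // [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    println!("{:?}", disjoint_pairs(&a)); // [(0, 1), (2, 3), (4, 5)]
}
```

Both methods only exist on slices, so an arbitrary iterator would have to be collected first.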

How to generate iterator with sliding window pairs?

I'd like to create an iterator that for this input:
[1, 2, 3, 4]
Will contain the following:
(1, 2)
(2, 3)
(3, 4)
Peekable seems ideal for this, but I'm new to Rust, so this naïve version doesn't work:
fn main() {
let i = ['a', 'b', 'c']
.iter()
.peekable();
let j = i.map(|x| (x, i.peek()));
println!("{:?}", j);
println!("Hello World!");
}
What am I doing wrong?

You can use the windows method on slices, and then map the arrays into tuples:
fn main() {
let i = [1, 2, 3, 4]
.windows(2)
.map(|pair| (pair[0], pair[1]));
println!("{:?}", i.collect::<Vec<_>>());
}
If you want a solution that works for all iterators (and not just slices) and are willing to use a 3rd-party library you can use the tuple_windows method from itertools.
use itertools::{Itertools, TupleWindows}; // 0.10.0
fn main() {
let i: TupleWindows<_, (i32, i32)> = vec![1, 2, 3, 4]
.into_iter()
.tuple_windows();
println!("{:?}", i.collect::<Vec<_>>());
}
If you're not willing to use a 3rd-party library it's still simple enough that you can implement it yourself! Here's an example generic implementation that works for any Iterator<Item = T> where T: Clone:
use std::collections::BTreeSet;
struct PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
iterator: I,
last_item: Option<T>,
}
impl<I, T> PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
fn new(iterator: I) -> Self {
PairIter {
iterator,
last_item: None,
}
}
}
impl<I, T> Iterator for PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
type Item = (T, T);
fn next(&mut self) -> Option<Self::Item> {
if self.last_item.is_none() {
self.last_item = self.iterator.next();
}
if self.last_item.is_none() {
return None;
}
let curr_item = self.iterator.next();
if curr_item.is_none() {
return None;
}
let temp_item = curr_item.clone();
let result = (self.last_item.take().unwrap(), curr_item.unwrap());
self.last_item = temp_item;
Some(result)
}
}
fn example<T: Clone>(iterator: impl Iterator<Item = T>) -> impl Iterator<Item = (T, T)> {
PairIter::new(iterator)
}
fn main() {
let mut set = BTreeSet::new();
set.insert(String::from("a"));
set.insert(String::from("b"));
set.insert(String::from("c"));
set.insert(String::from("d"));
dbg!(example(set.into_iter()).collect::<Vec<_>>());
}

You can use tuple_windows() from the itertools crate as a drop-in replacement:
use itertools::Itertools;
fn main() {
let data = vec![1, 2, 3, 4];
for (a, b) in data.iter().tuple_windows() {
println!("({}, {})", a, b);
}
}
(1, 2)
(2, 3)
(3, 4)
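When the iterator itself is cheap to clone (slice iterators, ranges, chars and so on), a std-only alternative is to zip the iterator with a copy of itself that is offset by one element. This is a sketch under that assumption; sliding_pairs is a hypothetical helper name.

```rust
// Hypothetical helper: sliding pairs for any iterator that is cheap to clone.
fn sliding_pairs<I>(iter: I) -> impl Iterator<Item = (I::Item, I::Item)>
where
    I: Iterator + Clone,
{
    // The clone starts at the same position; skipping one element offsets it.
    iter.clone().zip(iter.skip(1))
}

fn main() {
    // Prints (1, 2), (2, 3), (3, 4).
    for (a, b) in sliding_pairs([1, 2, 3, 4].iter()) {
        println!("({}, {})", a, b);
    }
}
```

The source is traversed twice, so this fits iterators whose clone is a lightweight cursor rather than ones that redo expensive work.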

Chain two iterators while lazily constructing the second one

I'd like a method like Iterator::chain() that only computes the argument iterator when it's needed. In the following code, expensive_function should never be called:
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for &i in nums.iter().chain(expensive_function().iter()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}

One possible approach: delegate the expensive computation to an iterator adaptor.
let nums = [1, 2, 3];
for i in nums.iter()
.cloned()
.chain([()].into_iter().flat_map(|_| expensive_function()))
{
if i > 2 {
break;
} else {
println!("{}", i);
}
}
The passed iterator is the result of flat-mapping a dummy unit value () to the list of values, which is lazy. Since the iterator needs to own the respective outcome of that computation, I chose to copy the number from the array.
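The dummy unit value can also be avoided with std::iter::once_with, which calls its closure only when the element is first requested. The sketch below wraps the loop from the question in a hypothetical function, small_items, so that the laziness is easy to observe; the closure stands in for expensive_function.

```rust
use std::iter;

// Hypothetical helper for illustration: yields items from `nums` followed by
// the items produced by `expensive`, stopping at the first item above 2.
fn small_items(nums: &[u64], expensive: impl FnOnce() -> Vec<u64>) -> Vec<u64> {
    nums.iter()
        .copied()
        // once_with defers the call until the chain actually polls this half.
        .chain(iter::once_with(expensive).flatten())
        .take_while(|&i| i <= 2)
        .collect()
}

fn main() {
    let out = small_items(&[1, 2, 3], || {
        // Not reached here: iteration stops at 3, before `nums` is exhausted.
        println!("expensive call");
        vec![4, 5, 6]
    });
    println!("{:?}", out); // [1, 2]
}
```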

You can create your own custom iterator adapter that only evaluates a closure when the original iterator is exhausted.
trait IteratorExt: Iterator {
fn chain_with<F, I>(self, f: F) -> ChainWith<Self, F, I::IntoIter>
where
Self: Sized,
F: FnOnce() -> I,
I: IntoIterator<Item = Self::Item>,
{
ChainWith {
base: self,
factory: Some(f),
iterator: None,
}
}
}
impl<I: Iterator> IteratorExt for I {}
struct ChainWith<B, F, I> {
base: B,
factory: Option<F>,
iterator: Option<I>,
}
impl<B, F, I> Iterator for ChainWith<B, F, I::IntoIter>
where
B: Iterator,
F: FnOnce() -> I,
I: IntoIterator<Item = B::Item>,
{
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
if let Some(b) = self.base.next() {
return Some(b);
}
// Exhausted the first, generate the second
if let Some(f) = self.factory.take() {
self.iterator = Some(f().into_iter());
}
self.iterator
.as_mut()
.expect("There must be an iterator")
.next()
}
}
use std::{thread, time};
fn expensive_function() -> Vec<u64> {
panic!("You lose, good day sir");
thread::sleep(time::Duration::from_secs(5));
vec![4, 5, 6]
}
pub fn main() {
let nums = [1, 2, 3];
for i in nums.iter().cloned().chain_with(|| expensive_function()) {
if i > 2 {
break;
} else {
println!("{}", i);
}
}
}

Extend Iterator with a "mean" method

I'm trying to implement a mean method for Iterator, like it is done with sum.
However, sum is an Iterator method, so I decided to implement a trait for any type that implements Iterator:
pub trait Mean<A = Self>: Sized {
fn mean<I: Iterator<Item = A>>(iter: I) -> f64;
}
impl Mean for u64 {
fn mean<I: Iterator<Item = u64>>(iter: I) -> f64 {
//use zip to start enumeration from 1, not 0
iter.zip((1..))
.fold(0., |s, (e, i)| (e as f64 + s * (i - 1) as f64) / i as f64)
}
}
impl<'a> Mean<&'a u64> for u64 {
fn mean<I: Iterator<Item = &'a u64>>(iter: I) -> f64 {
iter.zip((1..))
.fold(0., |s, (&e, i)| (e as f64 + s * (i - 1) as f64) / i as f64)
}
}
trait MeanIterator: Iterator {
fn mean(self) -> f64;
}
impl<T: Iterator> MeanIterator for T {
fn mean(self) -> f64 {
Mean::mean(self)
}
}
fn main() {
assert_eq!([1, 2, 3, 4, 5].iter().mean(), 3.);
}
The error:
error[E0282]: type annotations needed
--> src/main.rs:26:9
|
26 | Mean::mean(self)
| ^^^^^^^^^^ cannot infer type for `Self`
Is there any way to fix the code, or is it impossible in Rust?

"like it is done with sum"
Let's review how sum works:
pub fn sum<S>(self) -> S
where
S: Sum<Self::Item>,
sum is implemented on any iterator, so long as the result type S implements Sum for the iterated value. The caller gets to pick the result type. Sum is defined as:
pub trait Sum<A = Self>: Sized {
fn sum<I>(iter: I) -> Self
where
I: Iterator<Item = A>;
}
Sum::sum takes an iterator of A and produces a value of the type it is implemented for.
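As a small illustration of "the caller gets to pick the result type", the sketch below sums the same slice three ways. The function name sums is made up for this example; the Sum implementations it relies on are the ones std provides for the primitive numeric types.

```rust
// Hypothetical helper: sums one slice by reference, by value, and as f64.
fn sums(xs: &[u32]) -> (u32, u32, f64) {
    // Item = &u32, so this goes through `impl Sum<&u32> for u32`.
    let by_ref: u32 = xs.iter().sum();
    // Item = u32, so this goes through `impl Sum<u32> for u32`.
    let by_val = xs.iter().copied().sum::<u32>();
    // Item = f64 after the map; the annotation picks `impl Sum<f64> for f64`.
    let as_float: f64 = xs.iter().map(|&x| f64::from(x)).sum();
    (by_ref, by_val, as_float)
}

fn main() {
    println!("{:?}", sums(&[1, 2, 3])); // (6, 6, 6.0)
}
```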
We can copy-paste the structure, replacing Sum with Mean, and add straightforward implementations:
trait MeanExt: Iterator {
fn mean<M>(self) -> M
where
M: Mean<Self::Item>,
Self: Sized,
{
M::mean(self)
}
}
impl<I: Iterator> MeanExt for I {}
trait Mean<A = Self> {
fn mean<I>(iter: I) -> Self
where
I: Iterator<Item = A>;
}
impl Mean for f64 {
fn mean<I>(iter: I) -> Self
where
I: Iterator<Item = f64>,
{
let mut sum = 0.0;
let mut count: usize = 0;
for v in iter {
sum += v;
count += 1;
}
if count > 0 {
sum / (count as f64)
} else {
0.0
}
}
}
impl<'a> Mean<&'a f64> for f64 {
fn mean<I>(iter: I) -> Self
where
I: Iterator<Item = &'a f64>,
{
iter.copied().mean()
}
}
fn main() {
let mean: f64 = [1.0, 2.0, 3.0].iter().mean();
println!("{:?}", mean);
let mean: f64 = IntoIterator::into_iter([-1.0, 2.0, 1.0]).mean();
println!("{:?}", mean);
}

You can do it like this, for example:
pub trait Mean {
fn mean(self) -> f64;
}
impl<F, T> Mean for T
where T: Iterator<Item = F>,
F: std::borrow::Borrow<f64>
{
fn mean(self) -> f64 {
self.zip((1..))
.fold(0.,
|s, (e, i)| (*e.borrow() + s * (i - 1) as f64) / i as f64)
}
}
fn main() {
assert_eq!([1f64, 2f64, 3f64, 4f64, 5f64].iter().mean(), 3.);
assert_eq!(vec![1f64, 2f64, 3f64, 4f64, 5f64].iter().mean(), 3.);
assert_eq!(vec![1f64, 2f64, 3f64, 4f64, 5f64].into_iter().mean(), 3.);
}
I used the Borrow trait to support iterators over both f64 and &f64.
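Because the fold above starts from 0., an empty iterator yields 0.0, which cannot be told apart from a genuine mean of zero. A possible variation, sketched here with a hypothetical free function running_mean, returns an Option instead and uses an algebraically equivalent form of the same incremental update.

```rust
// Hypothetical helper: incremental mean that returns None for an empty input.
fn running_mean(values: impl IntoIterator<Item = f64>) -> Option<f64> {
    let mut mean = 0.0;
    let mut count = 0u32;
    for x in values {
        count += 1;
        // Equivalent to (x + mean * (count - 1)) / count, rearranged.
        mean += (x - mean) / f64::from(count);
    }
    if count == 0 {
        None
    } else {
        Some(mean)
    }
}

fn main() {
    println!("{:?}", running_mean(vec![1.0, 2.0, 3.0, 4.0, 5.0]));
    println!("{:?}", running_mean(Vec::new()));
}
```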

How to define mutual recursion with closures?

I can do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
fn foo(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in i+1..u.len() {
u[k] += 1;
bar(u, v, k, j);
}
}
fn bar(u: &mut [i32], v: &mut [i32], i: usize, j: usize) {
for k in j+1..v.len() {
v[k] += 1;
foo(u, v, i, k);
}
}
foo(&mut u, &mut v, 0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
but I would prefer to do something like this:
fn func() -> (Vec<i32>, Vec<i32>) {
let mut u = vec![0;5];
let mut v = vec![0;5];
let foo = |i, j| {
for k in i+1..u.len() {
u[k] += 1;
bar(k, j);
}
};
let bar = |i, j| {
for k in j+1..v.len() {
v[k] += 1;
foo(i, k);
}
};
foo(0, 0);
(u,v)
}
fn main() {
let (u,v) = func();
println!("{:?}", u);
println!("{:?}", v);
}
The second example doesn't compile with the error: unresolved name bar.
In my task I could do it with a single recursive function, but it would not be as clear.
Does anyone have any other suggestions?

I have a solution for mutually recursive closures, but it doesn't work with multiple mutable borrows, so I couldn't extend it to your example.
There is a way to define mutually recursive closures, using an approach similar to the one used for a single recursive closure. You can put the closures together into a struct, where each of them takes a borrow of that struct as an extra argument.
fn func(n: u32) -> bool {
struct EvenOdd<'a> {
even: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool,
odd: &'a dyn Fn(u32, &EvenOdd<'a>) -> bool,
}
let evenodd = EvenOdd {
even: &|n, evenodd| {
if n == 0 {
true
} else {
(evenodd.odd)(n - 1, evenodd)
}
},
odd: &|n, evenodd| {
if n == 0 {
false
} else {
(evenodd.even)(n - 1, evenodd)
}
}
};
(evenodd.even)(n, &evenodd)
}
fn main() {
println!("{}", func(5));
println!("{}", func(6));
}

While defining mutually recursive closures works in some cases, as demonstrated in the answer by Alex Knauth, I don't think that's an approach you should usually take. It is kind of opaque, has some limitations pointed out in the other answer, and it also has a performance overhead since it uses trait objects and dynamic dispatch at runtime.
Closures in Rust can be thought of as functions with associated structs storing the data you closed over. So a more general solution is to define your own struct storing the data you want to close over, and define methods on that struct instead of closures. For this case, the code could look like this:
pub struct FooBar {
pub u: Vec<i32>,
pub v: Vec<i32>,
}
impl FooBar {
fn new(u: Vec<i32>, v: Vec<i32>) -> Self {
Self { u, v }
}
fn foo(&mut self, i: usize, j: usize) {
for k in i+1..self.u.len() {
self.u[k] += 1;
self.bar(k, j);
}
}
fn bar(&mut self, i: usize, j: usize) {
for k in j+1..self.v.len() {
self.v[k] += 1;
self.foo(i, k);
}
}
}
fn main() {
let mut x = FooBar::new(vec![0;5], vec![0;5]);
x.foo(0, 0);
println!("{:?}", x.u);
println!("{:?}", x.v);
}
While this can get slightly more verbose than closures, and requires a few more explicit type annotations, it's more flexible and easier to read, so I would generally prefer this approach.
