In Python, I can write the following to iterate over a sequence in pairs:
it = iter(range(10))
list(zip(it, it))  # [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
Rust won't let me borrow the iterator mutably twice (or use the same iterator twice, because of the move):
let mut i1 = a.iter();
let i2 = &mut i1;
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
I know about the itertools crate; I just wonder if there's some hack that would let me get by without it if I only need this functionality.
Obviously, you can do something like this:
struct Chunks<I: Iterator<Item = T>, T> {
seq: I,
}
impl<I: Iterator<Item = T>, T> Iterator for Chunks<I, T> {
type Item = (T, T);
fn next(&mut self) -> Option<Self::Item> {
self.seq.next().zip(self.seq.next())
}
}
But that works only for pairs, not for triples. Triples would require some kind of macro.
With std::iter::from_fn you can write it as a one-liner (thanks to @user4815162342):
let mut seq = a.iter();
let chunks = std::iter::from_fn(move || seq.next().zip(seq.next()));
chunks.for_each(|(a, b)| println!("a: {}, b: {}", a, b));
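The same from_fn approach extends to triples without macros. A minimal sketch, assuming a hypothetical helper named triples (not part of std) and assuming that a trailing incomplete group should simply be dropped:

```rust
use std::iter;

// Hypothetical helper: groups an iterator into non-overlapping triples.
// Leftover items that do not fill a whole triple are discarded.
fn triples<I: Iterator>(mut seq: I) -> impl Iterator<Item = (I::Item, I::Item, I::Item)> {
    // `?` ends the iteration as soon as the underlying iterator runs dry.
    iter::from_fn(move || Some((seq.next()?, seq.next()?, seq.next()?)))
}
```

For example, triples(0..7) yields (0, 1, 2) and (3, 4, 5); the trailing 6 is dropped.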
This could be achieved with interior mutability. Note that it is only the next method call that requires a mutable borrow.
use std::cell::RefCell;
struct CellIter<I: Iterator>(RefCell<I>);
impl<I: Iterator> CellIter<I> {
pub fn new(iter: I) -> Self {
Self(RefCell::new(iter))
}
}
impl<I: Iterator> Iterator for &CellIter<I> {
type Item = <I as Iterator>::Item;
fn next(&mut self) -> Option<Self::Item> {
self.0.borrow_mut().next()
}
}
fn main() {
let a = CellIter::new(0..10);
let i1 = &a;
let i2 = &a;
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
}
Output:
a: 0, b: 1
a: 2, b: 3
a: 4, b: 5
a: 6, b: 7
a: 8, b: 9
Here is a solution based on the same idea as Alsein's solution (using a RefCell), but somewhat shorter using std::iter::from_fn as helper:
use std::cell::RefCell;
fn share<T>(iter: impl Iterator<Item = T>) -> impl Fn() -> Option<T> {
let wrapped_iter = RefCell::new(iter);
move || wrapped_iter.borrow_mut().next()
}
fn main() {
let a = share(0..10);
let i1 = std::iter::from_fn(&a);
let i2 = std::iter::from_fn(&a);
i1.zip(i2).for_each(|(a, b)| println!("a: {}, b: {}", a, b));
}
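If the data lives in a slice, you can use slice methods rather than zipping the iterators. Note that windows yields overlapping windows ([0, 1], [1, 2], ...); for the non-overlapping pairs asked about here, chunks(2) or chunks_exact(2) is the closer match: https://doc.rust-lang.org/std/slice/struct.Windows.html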
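For non-overlapping pairs from a slice, chunks_exact is closer to the Python snippet than windows, which yields overlapping windows. A minimal sketch, assuming a hypothetical helper named pairs and Copy elements:

```rust
// Hypothetical helper: non-overlapping pairs from a slice.
// `chunks_exact(2)` skips a trailing odd element instead of yielding a short chunk.
fn pairs<T: Copy>(items: &[T]) -> Vec<(T, T)> {
    items.chunks_exact(2).map(|c| (c[0], c[1])).collect()
}
```

With the input [0, 1, 2, 3, 4] this gives (0, 1) and (2, 3); the unpaired 4 is skipped.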
Uncompilable version:
fn main() {
let v = vec![1, 2, 3, 4, 5, 6];
let mut b = Buffer::new(v.as_slice());
let r1 = b.read();
let r2 = b.read();
out2(r1, r2);
}
struct Buffer<'a> {
buf : &'a [u8],
pos : usize,
}
fn out2(a: &[u8], b: &[u8]){
println!("{:#?} {:#?}", a, b);
}
impl<'a> Buffer<'a> {
fn new(a : &'a [u8] ) -> Buffer<'a> {
Buffer { buf: (a), pos: (0) }
}
fn read(&'a mut self) -> &'a [u8] {
self.pos += 3;
&self.buf[self.pos - 3..self.pos]
}
}
Version that compiles successfully:
fn main() {
let v = vec![1, 2, 3, 4, 5, 6];
let mut b = Buffer::new(v.as_slice());
let r1 = b.read();
let r2 = b.read();
out2(r1, r2);
}
struct Buffer<'a> {
buf : &'a [u8],
pos : usize,
}
fn out2(a: &[u8], b: &[u8]){
println!("{:#?} {:#?}", a, b);
}
// 'a outlives 'b
impl<'b, 'a : 'b> Buffer<'a> {
fn new(a : &'a [u8] ) -> Buffer<'a> {
Buffer { buf: (a), pos: (0) }
}
fn read(&'b mut self) -> &'a [u8] {
self.pos += 3;
&self.buf[self.pos - 3..self.pos]
}
}
In both versions, r1 and r2 each hold a reference to part of the buffer, and neither holds a mutable reference.
The main difference is that in the second version the lifetime of read's return value is longer than that of &mut self.
But I can't understand why that matters.
The second snippet is equivalent to the following:
impl<'a> Buffer<'a> {
fn read<'b>(&'b mut self) -> &'a [u8] {
self.pos += 3;
&self.buf[self.pos - 3..self.pos]
}
}
Basically, &'a mut self where 'a is defined in the struct is almost always wrong. It says the struct must be mutably borrowed for as long as the data it holds lives. Since that data lives from the creation of the instance until its end, so does this borrow. In effect, you are saying this method can be called only once.
The second snippet, on the other hand, takes a fresh, smaller lifetime on self and therefore can be called multiple times.
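One consequence of returning &'a [u8] from a short-lived &mut self borrow is that the returned slices borrow from the underlying data, not from the Buffer, so they can even outlive it. A minimal sketch, assuming a hypothetical helper named read_two:

```rust
struct Buffer<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Buffer<'a> {
    fn new(buf: &'a [u8]) -> Self {
        Buffer { buf, pos: 0 }
    }

    // The `&mut self` borrow gets its own short (elided) lifetime,
    // while the result is tied to the data's lifetime `'a`.
    fn read(&mut self) -> &'a [u8] {
        let buf = self.buf; // copy the `&'a [u8]` out of `self`
        let start = self.pos;
        self.pos += 3;
        &buf[start..self.pos]
    }
}

// Hypothetical helper: the slices are returned even though `b` is dropped here.
fn read_two(data: &[u8]) -> (&[u8], &[u8]) {
    let mut b = Buffer::new(data);
    let r1 = b.read();
    let r2 = b.read(); // fine: the first mutable borrow has already ended
    (r1, r2)
}
```

With &'a mut self instead, the first call to read would keep b mutably borrowed for all of 'a, and the second call would be rejected.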
I would like my perform function to accept an iterator with either &usize or usize items, as below.
fn perform<'a, I>(values: I) -> usize
where
I: Iterator<Item = &'a usize>,
{
*values.max().unwrap()
}
fn main() {
let v: Vec<usize> = vec![1, 2, 3, 4];
// Works.
let result = perform(v.iter());
print!("Result: {}", result);
// Does not work: `expected `&usize`, found `usize``.
let result = perform(v.iter().map(|v| v * 2));
print!("Result again: {}", result)
}
In this case, by making your function more polymorphic, we actually get the behavior for free. Your perform works for any Ord value.
fn perform<'a, I, T>(values: I) -> T
where
I: Iterator<Item = T>,
T: Ord,
{
values.max().unwrap()
}
Now usize is Ord and so is &usize (by a blanket impl that makes references to any orderable type orderable).
Another way may be accepting owned values in perform like this:
fn perform<I>(values: I) -> usize
where
I: Iterator<Item = usize>,
{
values.max().unwrap()
}
fn main() {
let v: Vec<usize> = vec![1, 2, 3, 4];
// Works.
let result = perform(v.iter().map(|x| *x));
print!("Result: {}", result);
// Also works: the closure already yields owned `usize` values.
let result = perform(v.iter().map(|v| v * 2));
print!("Result again: {}", result)
}
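If perform should always return a usize yet accept both kinds of iterators, a different technique is to bound the item type with std::borrow::Borrow<usize>. A minimal sketch of that alternative (it is not what the answers above use):

```rust
use std::borrow::Borrow;

// Accepts iterators over `usize` as well as over `&usize`,
// because both types implement `Borrow<usize>`.
fn perform<I>(values: I) -> usize
where
    I: Iterator,
    I::Item: Borrow<usize>,
{
    values
        .map(|v| {
            let n: &usize = v.borrow();
            *n
        })
        .max()
        .unwrap() // panics on an empty iterator, like the original
}
```

Both perform(v.iter()) and perform(v.iter().map(|v| v * 2)) type-check against this signature.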
I'd like to create an iterator that for this input:
[1, 2, 3, 4]
Will contain the following:
(1, 2)
(2, 3)
(3, 4)
Peekable seems ideal for this, but I'm new to Rust, so this naïve version doesn't work:
fn main() {
let i = ['a', 'b', 'c']
.iter()
.peekable();
let j = i.map(|x| (x, i.peek()));
println!("{:?}", j);
println!("Hello World!");
}
What am I doing wrong?
You can use the windows method on slices, and then map the resulting sub-slices into tuples:
fn main() {
let i = [1, 2, 3, 4]
.windows(2)
.map(|pair| (pair[0], pair[1]));
println!("{:?}", i.collect::<Vec<_>>());
}
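For iterators that are cheap to clone (slice iterators are), a std-only alternative is to zip the iterator with a copy of itself advanced by one item. A minimal sketch, assuming a hypothetical helper named adjacent_pairs:

```rust
// Hypothetical helper: overlapping pairs from any cloneable iterator.
fn adjacent_pairs<I>(it: I) -> impl Iterator<Item = (I::Item, I::Item)>
where
    I: Iterator + Clone,
{
    // The clone starts at the first item, the original at the second.
    it.clone().zip(it.skip(1))
}
```

Note that the underlying iterator is driven twice, so this is a poor fit when producing items is expensive or has side effects.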
If you want a solution that works for all iterators (and not just slices) and are willing to use a 3rd-party library you can use the tuple_windows method from itertools.
use itertools::{Itertools, TupleWindows}; // 0.10.0
fn main() {
let i: TupleWindows<_, (i32, i32)> = vec![1, 2, 3, 4]
.into_iter()
.tuple_windows();
println!("{:?}", i.collect::<Vec<_>>());
}
If you're not willing to use a 3rd-party library it's still simple enough that you can implement it yourself! Here's an example generic implementation that works for any Iterator<Item = T> where T: Clone:
use std::collections::BTreeSet;
struct PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
iterator: I,
last_item: Option<T>,
}
impl<I, T> PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
fn new(iterator: I) -> Self {
PairIter {
iterator,
last_item: None,
}
}
}
impl<I, T> Iterator for PairIter<I, T>
where
I: Iterator<Item = T>,
T: Clone,
{
type Item = (T, T);
fn next(&mut self) -> Option<Self::Item> {
if self.last_item.is_none() {
self.last_item = self.iterator.next();
}
if self.last_item.is_none() {
return None;
}
let curr_item = self.iterator.next();
if curr_item.is_none() {
return None;
}
let temp_item = curr_item.clone();
let result = (self.last_item.take().unwrap(), curr_item.unwrap());
self.last_item = temp_item;
Some(result)
}
}
fn example<T: Clone>(iterator: impl Iterator<Item = T>) -> impl Iterator<Item = (T, T)> {
PairIter::new(iterator)
}
fn main() {
let mut set = BTreeSet::new();
set.insert(String::from("a"));
set.insert(String::from("b"));
set.insert(String::from("c"));
set.insert(String::from("d"));
dbg!(example(set.into_iter()).collect::<Vec<_>>());
}
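The same idea can be written more compactly with std::iter::from_fn. A minimal sketch, assuming a hypothetical function named pairwise in place of the PairIter struct:

```rust
use std::iter;

// Hypothetical compact variant of PairIter: yields overlapping pairs
// from any iterator whose items are Clone.
fn pairwise<T: Clone>(mut it: impl Iterator<Item = T>) -> impl Iterator<Item = (T, T)> {
    let mut prev = it.next(); // first item, if any
    iter::from_fn(move || {
        let next = it.next()?;
        // Store a clone of `next` for the following pair and take the old value out.
        let last = prev.replace(next.clone())?;
        Some((last, next))
    })
}
```

As with PairIter, each item except the first and last is cloned once, and inputs with fewer than two items yield nothing.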
You can use tuple_windows() from the itertools crate as a drop-in replacement:
use itertools::Itertools;
fn main() {
let data = vec![1, 2, 3, 4];
for (a, b) in data.iter().tuple_windows() {
println!("({}, {})", a, b);
}
}
(1, 2)
(2, 3)
(3, 4)
I'm trying to zip two iterators of unequal length, but zip only yields pairs while both iterators have a value and ignores the rest of the longer iterator.
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1.iter().rev().zip(num2.iter().rev()) {
println!("{:?}", i);
}
}
This returns (2, 3). How do I make it return:
(2, 3)
(1, 0) // default is the 0 here.
Is there any other way to do it?
You could use the zip_longest provided by the itertools crate.
use itertools::{
Itertools,
EitherOrBoth::*,
};
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for pair in num1.iter().rev().zip_longest(num2.iter().rev()) {
match pair {
Both(l, r) => println!("({:?}, {:?})", l, r),
Left(l) => println!("({:?}, 0)", l),
Right(r) => println!("(0, {:?})", r),
}
}
}
Which would produce the following output:
(2, 3)
(1, 0)
Zip will stop as soon as one of iterators stops producing values. If you know which is the longest, you can pad the shorter one with your default value:
use std::iter;
fn main() {
let longer = vec![1, 2];
let shorter = vec![3];
for i in longer
.iter()
.rev()
.zip(shorter.iter().rev().chain(iter::repeat(&0)))
{
println!("{:?}", i);
}
}
If you don't know which is longest, you should use itertools, as Peter Varo suggests.
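If pulling in itertools is not an option, the unknown-length case can also be handled with std alone. A minimal sketch, assuming a hypothetical helper named zip_or_default and item types that implement Default:

```rust
use std::iter;

// Hypothetical helper: zips two iterators, substituting Default::default()
// for the side that ran out first.
fn zip_or_default<A, B>(
    a: impl Iterator<Item = A>,
    b: impl Iterator<Item = B>,
) -> impl Iterator<Item = (A, B)>
where
    A: Default,
    B: Default,
{
    // fuse() guarantees that an exhausted iterator keeps returning None.
    let (mut a, mut b) = (a.fuse(), b.fuse());
    iter::from_fn(move || match (a.next(), b.next()) {
        (None, None) => None,
        (x, y) => Some((x.unwrap_or_default(), y.unwrap_or_default())),
    })
}
```

This works on owned items, so iterators over references need .copied() or .cloned() first, since &i32 does not implement Default.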
The key is to detect that one iterator is shorter than the other. In your case you could do that beforehand, because the vector iterators implement ExactSizeIterator, but a general solution would be a custom .zip().
itertools already offers a general solution, .zip_longest():
use itertools::EitherOrBoth::{Both, Left, Right};
use itertools::Itertools;
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.rev()
.zip_longest(num2.iter().rev())
.map(|x| match x {
Both(a, b) => (a, b),
Left(a) => (a, &0),
Right(b) => (&0, b),
})
{
println!("{:?}", i);
}
}
This requires you to write the closure every time. If you need this feature a lot, you could implement a custom trait on iterators with a .zip_default() method, where A and B implement Default:
use std::default::Default;
use std::iter::Fuse;
pub trait MyIterTools: Iterator {
fn zip_default<J>(self, other: J) -> ZipDefault<Self, J::IntoIter>
where
J: IntoIterator,
Self: Sized,
{
ZipDefault::new(self, other.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
i: Fuse<I>,
j: Fuse<J>,
}
impl<I, J> ZipDefault<I, J>
where
I: Iterator,
J: Iterator,
{
fn new(i: I, j: J) -> Self {
Self {
i: i.fuse(),
j: j.fuse(),
}
}
}
impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
T: Iterator<Item = A>,
U: Iterator<Item = B>,
A: Default,
B: Default,
{
type Item = (A, B);
fn next(&mut self) -> Option<Self::Item> {
match (self.i.next(), self.j.next()) {
(Some(a), Some(b)) => Some((a, b)),
(Some(a), None) => Some((a, B::default())),
(None, Some(b)) => Some((A::default(), b)),
(None, None) => None,
}
}
}
impl<T: ?Sized> MyIterTools for T where T: Iterator {}
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.copied()
.rev()
.zip_default(num2.iter().copied().rev())
{
println!("{:?}", i);
}
}
Using itertools we can delegate some logic:
use std::default::Default;
use itertools::Itertools;
use itertools::ZipLongest;
use itertools::EitherOrBoth::{Both, Left, Right};
pub trait MyIterTools: Iterator {
fn zip_default<J>(self, j: J) -> ZipDefault<Self, J::IntoIter>
where
Self: Sized,
J: IntoIterator,
{
ZipDefault::new(self, j.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<I, J> {
inner: ZipLongest<I, J>,
}
impl<I, J> ZipDefault<I, J>
where
I: Iterator,
J: Iterator,
{
fn new(i: I, j: J) -> Self {
Self {
inner: i.zip_longest(j),
}
}
}
impl<T, U, A, B> Iterator for ZipDefault<T, U>
where
T: Iterator<Item = A>,
U: Iterator<Item = B>,
A: Default,
B: Default,
{
type Item = (A, B);
fn next(&mut self) -> Option<Self::Item> {
match self.inner.next()? {
Both(a, b) => Some((a, b)),
Left(a) => Some((a, B::default())),
Right(b) => Some((A::default(), b)),
}
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.inner.size_hint()
}
}
impl<T: ?Sized> MyIterTools for T where T: Iterator {}
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
for i in num1
.iter()
.copied()
.rev()
.zip_default(num2.iter().copied().rev())
{
println!("{:?}", i);
}
}
I saw this neat trick in someone else's LeetCode solution. If you have access to the lengths, you can swap the iterators so that iter1 is the longest.
fn main() {
let num1 = vec![1, 2];
let num2 = vec![3];
let mut iter1 = num1.iter();
let mut iter2 = num2.iter();
if iter1.len() < iter2.len(){
std::mem::swap(&mut iter1, &mut iter2);
} // now iter1 is the longest (if a swap happened, the tuple order is swapped too)
for i in iter1.rev().zip(iter2.rev().chain(std::iter::repeat(&0))) {
println!("{:?}", i);
}
}
If you can get the length of the iterators, as is in this case, a quick and dirty way could be:
use std::iter::repeat;
fn main() {
let a = vec![1, 2, 3];
let b = vec![4, 5, 6, 7];
for i in a
.iter()
.rev()
.chain(repeat(&0).take(b.len().saturating_sub(a.len())))
.zip(
b.iter()
.rev()
.chain(repeat(&0).take(a.len().saturating_sub(b.len()))),
)
{
println!("{:?}", i);
}
}
You can also implement a trait containing a zip_default() using this approach:
use std::iter::{repeat, Chain, Repeat, Take, Zip};
pub trait MyIterTools<X: Default + Clone>: ExactSizeIterator<Item = X> {
fn zip_default<J, Y>(self, j: J) -> ZipDefault<Self, J::IntoIter, X, Y>
where
Self: Sized,
J: IntoIterator<Item = Y>,
J::IntoIter: ExactSizeIterator,
Y: Default + Clone,
{
ZipDefault::new(self, j.into_iter())
}
}
#[derive(Clone, Debug)]
pub struct ZipDefault<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> {
inner: Zip<Chain<I, Take<Repeat<X>>>, Chain<J, Take<Repeat<Y>>>>,
}
impl<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> ZipDefault<I, J, X, Y>
{
fn new(a: I, b: J) -> Self {
let a_len = a.len();
let b_len = b.len();
Self {
inner: a
.chain(repeat(X::default()).take(b_len.saturating_sub(a_len)))
.zip(b.chain(repeat(Y::default()).take(a_len.saturating_sub(b_len)))),
}
}
}
impl<
I: ExactSizeIterator<Item = X>,
J: ExactSizeIterator<Item = Y>,
X: Default + Clone,
Y: Default + Clone,
> Iterator for ZipDefault<I, J, X, Y>
{
type Item = (X, Y);
fn next(&mut self) -> Option<Self::Item> {
self.inner.next()
}
fn size_hint(&self) -> (usize, Option<usize>) {
self.inner.size_hint()
}
}
impl<T: ExactSizeIterator<Item = X>, X: Default + Clone> MyIterTools<X> for T {}
fn main() {
let a = vec![1, 2, 3];
let b = vec![4, 5, 6, 7];
a.into_iter()
.zip_default(b.into_iter())
.for_each(|i| println!("{:?}", i));
}
I am attempting to write a function that validates a given collection using a closure. The function takes ownership of a collection, iterates over the contents, and if no invalid item was found, returns ownership of the collection. This is so it can be used like this (without creating a temp for the Vec): let col = validate(vec![1, 2], |&v| v < 10)?;
This is the current implementation of the function:
use std::fmt::Debug;
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
It does compile, but it doesn't work when I try to use it:
use std::collections::BTreeMap;
use std::iter::{FromIterator, once};
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&&v| v <= 3));
// ^^^^^^^^ expected bound lifetime parameter 'c, found concrete lifetime
println!("Map: {:?}",
validate(BTreeMap::from_iter(once((1, 2))), |&(&k, &v)| k <= 3));
}
Is what I'm trying to accomplish here possible?
Background
I am writing a parser for a toy project of mine and was wondering if I
could write a single validate function that works with all the collection
types I use:
Vecs,
VecDeques,
BTreeSets,
BTreeMaps,
&[T] slices.
Each of these collections implements the IntoIterator trait for a reference of itself,
which can be used to call .into_iter() on a reference without consuming the items
in the collection:
Vec impl
VecDeque impl
BTreeSet impl
BTreeMap impl
&[T] slices impl
This is what the for<'c> &'c C: IntoIterator<Item = V> in the function declaration
refers to. Since the reference is defined in the function body itself, we can't just
use a lifetime that's declared on the function (like fn validate<'c, ...), because this
would imply that the reference has to outlive the function (which it cannot). Instead we
have to use a Higher-Rank Trait Bound to
declare this lifetime.
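A minimal sketch of this kind of bound in isolation, assuming a hypothetical function total_len that borrows the collection internally and then hands ownership back:

```rust
// The higher-rank bound says: for every lifetime 'c, a `&'c C` can be
// iterated and yields `&'c String` items.
fn total_len<C>(col: C) -> (C, usize)
where
    for<'c> &'c C: IntoIterator<Item = &'c String>,
{
    // The borrow of `col` lives only for this statement.
    let n: usize = (&col).into_iter().map(|s| s.len()).sum();
    (col, n) // ownership goes back to the caller
}
```

The same function accepts a Vec<String> or a BTreeSet<String>, because both implement IntoIterator for references to themselves.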
It seems to me that this lifetime is also the source of the trouble, since a version of
the function that takes and returns a reference to the collection works fine:
// This works just fine.
fn validate<'c, C, F, V>(col: &'c C, pred: F) -> Result<&'c C, String>
where C: Debug,
&'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = col.into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
Furthermore, I managed to implement two other versions of the
function, one which works for Vec, VecDeque, BTreeSet and &[T] slices, and another
which works for BTreeMap and probably other mappings:
use std::fmt::Debug;
pub fn validate_collection<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|&v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
pub fn validate_mapping<C, F, K, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = (&'c K, &'c V)>,
F: Fn(&K, &V) -> bool,
K: Debug,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|&(k, v)| !pred(k, v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
In the end I hope to create a Validate trait. Currently, I can only impl
it for either collections or mappings, because the impls conflict.
use std::fmt::Debug;
trait Validate<V>: Sized {
fn validate<F>(self, pred: F) -> Result<Self, String> where F: Fn(&V) -> bool;
}
// Impl that only works for collections, not mappings.
impl<C, V> Validate<V> for C
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
V: Debug
{
fn validate<F>(self, pred: F) -> Result<C, String>
where F: Fn(&V) -> bool
{
if let Some(val) = (&self).into_iter().find(|&v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", self, val))?;
}
Ok(self)
}
}
fn main() {
println!("Vec: {:?}", vec![1, 2, 3, 4].validate(|&v| v <= 3));
}
Looking at your trait bounds (reformatted a little):
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug {
the problem is that &C won't implement IntoIterator<Item = V>; references tend to iterate over references.
Fixing that (and the extra reference in the closure) makes it work:
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&v| v <= 3));
}
To extend this to work with BTreeMap values, we can abstract over the method used to generate the iterators. Let's add a trait HasValueIterator which knows how to get an iterator over values:
trait HasValueIterator<'a, V: 'a> {
type ValueIter : Iterator<Item=&'a V>;
fn to_value_iter(&'a self) -> Self::ValueIter;
}
and use that instead of IntoIterator:
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> C: HasValueIterator<'c, V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).to_value_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
Now we can implement it for Vec and BTreeMap (the latter using .values()), though you have to name the iterator types:
impl<'c, V:'c> HasValueIterator<'c, V> for Vec<V> {
type ValueIter = std::slice::Iter<'c,V>;
fn to_value_iter(&'c self) -> Self::ValueIter {
self.iter()
}
}
impl<'c, V:'c, K:'c> HasValueIterator<'c, V> for BTreeMap<K, V> {
type ValueIter = std::collections::btree_map::Values<'c, K, V>;
fn to_value_iter(&'c self) -> Self::ValueIter {
self.values()
}
}
Now this works with both Vec and BTreeMap, at least with values:
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&v| v <= 3));
let mut map = BTreeMap::new();
map.insert("first", 1);
map.insert("second", 2);
map.insert("third", 3);
println!("Map: {:?}", validate(map, |&v| v<=2));
}
This outputs:
Vec: Err("[1, 2, 3, 4] contains invalid item: 4.")
Map: Err("{\"first\": 1, \"second\": 2, \"third\": 3} contains invalid item: 3.")
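Other collections can be plugged in the same way. A minimal self-contained sketch, repeating the HasValueIterator trait and validate function from this answer and adding a hypothetical impl for BTreeSet (the early return replaces the Err(...)? idiom but behaves the same):

```rust
use std::collections::{btree_set, BTreeSet};
use std::fmt::Debug;

trait HasValueIterator<'a, V: 'a> {
    type ValueIter: Iterator<Item = &'a V>;
    fn to_value_iter(&'a self) -> Self::ValueIter;
}

// Hypothetical extra impl: a BTreeSet iterates over its elements.
impl<'c, V: 'c> HasValueIterator<'c, V> for BTreeSet<V> {
    type ValueIter = btree_set::Iter<'c, V>;
    fn to_value_iter(&'c self) -> Self::ValueIter {
        self.iter()
    }
}

fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where
    C: Debug,
    for<'c> C: HasValueIterator<'c, V>,
    F: Fn(&V) -> bool,
    V: Debug,
{
    // Find the first value rejected by the predicate, if any.
    if let Some(val) = (&col).to_value_iter().find(|v| !pred(*v)) {
        return Err(format!("{:?} contains invalid item: {:?}.", col, val));
    }
    Ok(col)
}
```

Each new collection only needs to name its borrowing iterator type and forward to the matching method.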