Why does std::iter::Peekable::peek mutably borrow the self argument?

I am struggling to understand why the peek() method is borrowing the self argument mutably.
The documentation says:
"Returns a reference to the next() value without advancing the iterator."
Since it is not advancing the iterator, what is the point behind borrowing the argument as mutable?
I looked at the implementation of peek() and noticed it is calling a next() method.
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
Is peek() designed to borrow mutably only because it calls next(), or is there another semantic reason behind peek() that really requires the mutable borrow?
In other words, what is it that gets mutated when the peek() method is called?

As you have already done, let's look at its source, which reveals a little about how it works internally:
pub struct Peekable<I: Iterator> {
iter: I,
/// Remember a peeked value, even if it was None.
peeked: Option<Option<I::Item>>,
}
Together with its implementation for next():
impl<I: Iterator> Iterator for Peekable<I> {
// ...
fn next(&mut self) -> Option<I::Item> {
match self.peeked.take() {
Some(v) => v,
None => self.iter.next(),
}
}
// ...
}
and its implementation for peek():
impl<I: Iterator> Peekable<I> {
// ...
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
// ...
}
Peekable wraps an existing iterator, and existing iterators are not peekable by themselves.
So what Peekable does is:
on peek():
- take the next() item from the wrapped iterator and store it in self.peeked (unless self.peeked already contains the next item)
- return a reference to the peeked item
on next():
- check whether self.peeked currently holds an item
- if yes, return that one
- if no, take the next() item from the underlying iterator
So as you already realized, the peek() action needs &mut self because it might have to generate the next peeked item by calling next() on the underlying iterator.
So here is the reason, if you look at it from a more abstract point of view: The next item might not even exist yet. So peeking might involve actually generating that next item, which is definitely a mutating action on the underlying iterator.
Not all iterators are over arrays/slices where the items already exist; an iterator might be anything that generates a number of items, including lazy generators that only create those items when they are asked for.
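The effect can be observed with a lazy source. The following is a minimal sketch, not taken from the standard library: the helper name pulls_around_peek and the Cell-based counter are illustrative assumptions, used only to count how often the wrapped iterator is asked for an item.

```rust
use std::cell::Cell;

/// Hypothetical helper: records how many items have been pulled from the
/// wrapped iterator before peek(), after the first peek(), after the
/// second peek(), and after next().
fn pulls_around_peek() -> [usize; 4] {
    let pulled = Cell::new(0usize);
    // `inspect` runs its closure each time the wrapped iterator yields an item.
    let mut it = (1..=3)
        .inspect(|_| pulled.set(pulled.get() + 1))
        .peekable();

    let before = pulled.get();
    assert_eq!(it.peek(), Some(&1)); // has to pull one item from the range
    let after_first_peek = pulled.get();
    assert_eq!(it.peek(), Some(&1)); // answered from the stored value
    let after_second_peek = pulled.get();
    assert_eq!(it.next(), Some(1)); // hands out the stored value
    let after_next = pulled.get();

    [before, after_first_peek, after_second_peek, after_next]
}

fn main() {
    println!("{:?}", pulls_around_peek());
}
```

Only the first peek() touches the wrapped iterator; that single pull, plus storing its result, is the mutation that &mut self accounts for.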
Could they have implemented it differently?
Yes, it could have been done differently. They could have called next() on the underlying iterator during new(). Then, when someone calls next() on the Peekable, it could return the currently peeked value and fetch the following one right away. With that design, peek() could have been a &self method.
Why they went that way is unclear, but most likely it was to keep the iterator as lazy as possible: a prefetching variant pulls an item from the underlying iterator before anyone has asked for one. Lazy iterators are a good thing in most cases.
That said, here is a proof of concept of how a prefetching peekable iterator could be implemented that doesn't require &mut for peek():
pub struct PrefetchingPeekingIterator<I: Iterator> {
iter: I,
next_item: Option<I::Item>,
}
impl<I: Iterator> PrefetchingPeekingIterator<I> {
fn new(mut iter: I) -> Self {
let next_item = iter.next();
Self { iter, next_item }
}
fn peek(&self) -> Option<&I::Item> {
self.next_item.as_ref()
}
}
impl<I: Iterator> Iterator for PrefetchingPeekingIterator<I> {
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
std::mem::replace(&mut self.next_item, self.iter.next())
}
}
fn main() {
let mut range = PrefetchingPeekingIterator::new(1..10);
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
}
[src/main.rs:27] range.next().unwrap() = 1
[src/main.rs:28] range.peek().unwrap() = 2
[src/main.rs:29] range.next().unwrap() = 2
[src/main.rs:30] range.peek().unwrap() = 3
[src/main.rs:31] range.next().unwrap() = 3
[src/main.rs:32] range.peek().unwrap() = 4
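To make the laziness trade-off concrete, here is a small sketch that compares how many items each design pulls at construction time. It repeats the proof-of-concept struct so that it is self-contained; the function name pulls_at_construction and the Cell counters are assumptions added for illustration.

```rust
use std::cell::Cell;

pub struct PrefetchingPeekingIterator<I: Iterator> {
    iter: I,
    next_item: Option<I::Item>,
}

impl<I: Iterator> PrefetchingPeekingIterator<I> {
    fn new(mut iter: I) -> Self {
        // Eagerly pulls the first item from the wrapped iterator.
        let next_item = iter.next();
        Self { iter, next_item }
    }

    fn peek(&self) -> Option<&I::Item> {
        self.next_item.as_ref()
    }
}

impl<I: Iterator> Iterator for PrefetchingPeekingIterator<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        std::mem::replace(&mut self.next_item, self.iter.next())
    }
}

/// Hypothetical helper: returns (pulls made by std's Peekable, pulls made
/// by the prefetching variant) right after construction.
fn pulls_at_construction() -> (usize, usize) {
    let lazy_pulls = Cell::new(0usize);
    let eager_pulls = Cell::new(0usize);

    // std's Peekable: nothing is pulled until peek() or next() is called.
    let _lazy = (1..10)
        .inspect(|_| lazy_pulls.set(lazy_pulls.get() + 1))
        .peekable();

    // Prefetching variant: one item is pulled inside new().
    let eager = PrefetchingPeekingIterator::new(
        (1..10).inspect(|_| eager_pulls.set(eager_pulls.get() + 1)),
    );
    assert_eq!(eager.peek(), Some(&1)); // works through a shared reference

    (lazy_pulls.get(), eager_pulls.get())
}

fn main() {
    println!("{:?}", pulls_at_construction());
}
```

The prefetching version buys a &self peek() at the price of doing work that the caller may never need.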

Yes, as you noticed, Peekable::peek might have to call self.iter.next() to get an element if self.peeked doesn't already have something stored.
It then also has to store that value somewhere so that the Peekable iterator itself is not advanced.
The underlying Iterator, however, may very well get advanced by it.
Both advancing the underlying Iterator as well as storing the value in self.peeked require mutable access to self.

Related

Implementing a custom Iterator Trait

Source code
pub struct Iterating_ex {
start: u32,
end: u32,
}
impl Iterator for Iterating_ex {
type Item = u32;
fn next(&mut self) -> Option<u32> {
if self.start >= self.end {
None
} else {
let result = Some(self.start);
self.start += 1;
result
}
}
}
fn main() {
let example = Iterating_ex {
start: 0,
end: 5,
};
for i in example {
println!("{i}");
}
}
Output
0
1
2
3
4
Individually I understand what each piece of code is trying to do; however, I am having trouble understanding the following, possibly due to my lack of understanding of the generic Iterator trait:
1. Does implementing the Iterator trait for a struct automatically generate an iterable data type? In this case, I don't know why a for loop can be used on example.
2. It seems like the next method is called in a loop until None is returned. How does the code know to do so?

Ad 1. To be "iterable" in Rust means to implement the Iterator trait. Some things, however, can be turned into an iterator, and that is described by another trait, IntoIterator. The standard library provides a blanket implementation:
impl<I: Iterator> IntoIterator for I { /* ... */}
This means that any type that implements Iterator can be turned into one (it's a no-op). for loops are designed to work with types that implement IntoIterator. That's why you can write, for example:
let mut v = vec![1, 2, 3, 4, 5];
for _ in &v {}
for _ in &mut v {}
for _ in v {}
This works because the types &Vec<T>, &mut Vec<T> and Vec<T> all implement the IntoIterator trait. They are each turned into a different iterator type, of course, yielding &T, &mut T and T respectively.
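A short sketch of the three cases, with the item types written out explicitly. The function name iterate_three_ways and the chosen values are only for illustration.

```rust
/// Hypothetical helper: returns (sum read through `&v`, contents after
/// mutating through `&mut v`, last item taken by value from `v`).
fn iterate_three_ways() -> (i32, Vec<i32>, i32) {
    let mut v = vec![1, 2, 3];

    // Iterating over `&Vec<i32>` yields `&i32`; the vector is only read.
    let mut sum = 0;
    for x in &v {
        let x: &i32 = x;
        sum += *x;
    }

    // Iterating over `&mut Vec<i32>` yields `&mut i32`; items change in place.
    for x in &mut v {
        let x: &mut i32 = x;
        *x *= 10;
    }
    let after_mutation = v.clone();

    // Iterating over `Vec<i32>` yields `i32` and consumes the vector.
    let mut last = 0;
    for x in v {
        let x: i32 = x;
        last = x;
    }

    (sum, after_mutation, last)
}

fn main() {
    println!("{:?}", iterate_three_ways());
}
```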
Ad 2. As stated before, for loops can be used on types that implement IntoIterator. The documentation explains in detail how it works, but in a nutshell a for loop is just syntax sugar that turns this code:
for x in xs {
todo!()
}
Into something like this:
let mut xs_iter = xs.into_iter();
while let Some(x) = xs_iter.next() {
todo!()
}
while let loops are also syntax sugar and are desugared into a loop with a break statement, but that's not relevant here.
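Applied to an iterator like the one in the question, the two forms behave the same. This is a minimal sketch; the struct is renamed Counter here (an illustrative name, not from the question), and the two helper functions exist only so the results can be compared.

```rust
struct Counter {
    start: u32,
    end: u32,
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.start >= self.end {
            None
        } else {
            let result = Some(self.start);
            self.start += 1;
            result
        }
    }
}

/// Collects the items using a regular `for` loop.
fn with_for_loop() -> Vec<u32> {
    let counter = Counter { start: 0, end: 5 };
    let mut out = Vec::new();
    for i in counter {
        out.push(i);
    }
    out
}

/// Collects the items using roughly what the `for` loop expands to.
fn with_manual_loop() -> Vec<u32> {
    let counter = Counter { start: 0, end: 5 };
    let mut out = Vec::new();
    // For a type that is already an Iterator, into_iter() returns it unchanged.
    let mut iter = counter.into_iter();
    while let Some(i) = iter.next() {
        out.push(i);
    }
    out
}

fn main() {
    println!("{:?}", with_for_loop());
    println!("{:?}", with_manual_loop());
}
```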
Side note: I guess that this is just a learning example, and it's great, but the exact same iterator already exists in the standard library as std::ops::Range, so use that in your actual code if you need it.

Returning iterator from weak references for mapping and modifying values

I'm trying quite complex stuff with Rust where I need the following attributes, and am fighting the compiler.
1. An object which itself lives from start to finish of the application, but whose internal maps/vectors can be modified during the application's lifetime
2. Multiple references to the object that can read its internal maps/vectors
3. All single-threaded
4. Multiple nested iterators which are mapped/modified in a lazy manner to perform fast and complex calculations (see example below)
A small example, which already causes problems:
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::Weak;
pub struct Holder {
array_ref: Weak<RefCell<Vec<isize>>>,
}
impl Holder {
pub fn new(array_ref: Weak<RefCell<Vec<isize>>>) -> Self {
Self { array_ref }
}
fn get_iterator(&self) -> impl Iterator<Item = f64> + '_ {
self.array_ref
.upgrade()
.unwrap()
.borrow()
.iter()
.map(|value| *value as f64 * 2.0)
}
}
get_iterator is just one of the implementations of a trait, but even this example does not work.
The reason for Weak/Rc is to make sure that multiple places can point to the object (from point (1)) while another place can modify its internals (Vec<isize>).
What is the best way to approach this situation, given that the end goal is performance-critical?
EDIT:
Someone suggested using https://doc.rust-lang.org/std/cell/struct.Ref.html#method.map
Unfortunately I still can't get it to work. I am not sure whether I should also change the return type, or whether the closure is wrong here:
fn get_iterator(&self) -> impl Iterator<Item=f64> + '_ {
let x = self.array_ref.upgrade().unwrap().borrow();
let map1 = Ref::map(x, |x| &x.iter());
let map2 = Ref::map(map1, |iter| &iter.map(|y| *y as f64 * 2.0));
map2
}
IDEA says it has the wrong return type:
the trait `Iterator` is not implemented for `Ref<'_, Map<std::slice::Iter<'_, isize>, [closure#src/bin/main.rs:30:46: 30:65]>>`

This won't work because self.array_ref.upgrade() creates a local temporary Arc value, but the Ref only borrows from it. Obviously, you can't return a value that borrows from a local.
To make this work you need a second structure to own the Arc, which can implement Iterator in this case since the produced items aren't references:
pub struct HolderIterator(Arc<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.borrow().get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> Option<impl Iterator<Item=f64>> {
self.array_ref.upgrade().map(|rc| HolderIterator(rc, 0))
}
}
Alternatively, if you want the iterator to also weakly-reference the value contained within, you can have it hold a Weak instead and upgrade on each next() call. There are performance implications, but this also makes it easier to have get_iterator() be able to return an iterator directly instead of an Option, and the iterator written so that a failed upgrade means the sequence has ended:
pub struct HolderIterator(Weak<RefCell<Vec<isize>>>, usize);
impl Iterator for HolderIterator {
type Item = f64;
fn next(&mut self) -> Option<f64> {
let r = self.0.upgrade()?
.borrow()
.get(self.1)
.map(|&y| y as f64 * 2.0);
if r.is_some() {
self.1 += 1;
}
r
}
}
// ...
impl Holder {
// ...
fn get_iterator<'a>(&'a self) -> impl Iterator<Item=f64> {
HolderIterator(Weak::clone(&self.array_ref), 0)
}
}
This will make it so that you always get an iterator, but it's empty if the Weak is dead. The Weak can also die during iteration, at which point the sequence will abruptly end.
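The behaviour can be checked with a small self-contained sketch. Note the assumption: it uses Rc together with std::rc::Weak, since the question describes a single-threaded setup, whereas the snippets above mention Arc. The function name demo_weak_iterator is also made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct HolderIterator(Weak<RefCell<Vec<isize>>>, usize);

impl Iterator for HolderIterator {
    type Item = f64;

    fn next(&mut self) -> Option<f64> {
        // A failed upgrade means the vector is gone, so the sequence ends.
        let rc = self.0.upgrade()?;
        let r = rc.borrow().get(self.1).map(|&y| y as f64 * 2.0);
        if r.is_some() {
            self.1 += 1;
        }
        r
    }
}

/// Hypothetical helper: returns (all items while the data is alive, first
/// item of a second iterator, its next item after the data was dropped).
fn demo_weak_iterator() -> (Vec<f64>, Option<f64>, Option<f64>) {
    let data = Rc::new(RefCell::new(vec![1, 2, 3]));

    let all: Vec<f64> = HolderIterator(Rc::downgrade(&data), 0).collect();

    let mut it = HolderIterator(Rc::downgrade(&data), 0);
    let first = it.next();
    drop(data); // the last strong reference goes away mid-iteration
    let after_drop = it.next();

    (all, first, after_drop)
}

fn main() {
    println!("{:?}", demo_weak_iterator());
}
```

Because each next() call re-borrows the RefCell only briefly, other code can still call borrow_mut() between iteration steps.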

Iterate over Vec<"CustomStruct"> => unsatisfied trait bounds

I'm participating in this year's Advent of Code and wanted to take the opportunity to learn Rust. (So, if you're also participating, the following section might spoil something).
I want to iterate over the Vec<Lanternfish> vector and decrement the internal_counter value for each item in this vector. I tried the following:
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_couner()).collect();
The compiler gives me the following error: method cannot be called on Vec<Lanternfish> due to unsatisfied trait bounds
I understand that the iterator function is not available for this, however I don't understand exactly how to fix the problem.
#[derive(Debug)]
struct Lanternfish {
internal_counter: u8,
}
impl Lanternfish {
fn new() -> Self {
Lanternfish {
internal_counter: 8,
}
}
fn decrement_counter(&mut self) {
self.internal_counter -= 1
}
}
fn part_one(content: &str) {
let content: Vec<char> = content.chars().filter(|char| char.is_digit(10)).collect();
let mut fish_list: Vec<Lanternfish> = init_list(content);
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_counter()).collect();
}
fn init_list(initial_values: Vec<char>) -> Vec<Lanternfish> {
let mut all_lanternfish: Vec<_> = Vec::new();
for value in initial_values {
all_lanternfish.push(Lanternfish{internal_counter: value as u8});
}
all_lanternfish
}

The way to iterate over a Vec and call a mutating function on each element is:
for fish in &mut fish_list {
fish.decrement_counter();
}
What this line is doing:
fish_list.map(|fish| fish.decrement_couner).collect();
is:
1. Try to call map on the Vec (it doesn't have that. Iterator has it, but you'd need to call iter(), iter_mut() or into_iter() on the Vec for that).
2. Assuming you get the right map, it then calls the lambda |fish| fish.decrement_couner on each element; typo aside, this is not a function call, but a field access, and Lanternfish doesn't have a field called decrement_couner. A call would need parentheses.
3. Assuming you fix the function call, you then collect all the results of the calls (a bunch of () "unit" values, since decrement_counter doesn't return anything) into a new Vec, which is of type Vec<()>.
4. And finally, you try to bind that to a variable of type Vec<Lanternfish>, which will fail.
Meanwhile, the function calls will have modified the original Vec, if you used iter_mut(). Otherwise, the function calls will not compile.
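Put together, a working version might look like the sketch below. The helper names decrement_all, decrement_all_with_adapter and counters_after_two_passes are assumptions for illustration, and the starting counters are chosen so that the u8 subtraction cannot underflow.

```rust
#[derive(Debug)]
struct Lanternfish {
    internal_counter: u8,
}

impl Lanternfish {
    fn decrement_counter(&mut self) {
        self.internal_counter -= 1;
    }
}

/// Plain `for` loop over mutable references to the elements.
fn decrement_all(fish_list: &mut [Lanternfish]) {
    for fish in fish_list.iter_mut() {
        fish.decrement_counter();
    }
}

/// The same using iterator adapters: iter_mut() yields `&mut Lanternfish`,
/// and for_each runs the closure for its side effect only.
fn decrement_all_with_adapter(fish_list: &mut [Lanternfish]) {
    fish_list.iter_mut().for_each(|fish| fish.decrement_counter());
}

/// Hypothetical helper: decrements every fish twice and reports the counters.
fn counters_after_two_passes() -> Vec<u8> {
    let mut fish_list = vec![
        Lanternfish { internal_counter: 3 },
        Lanternfish { internal_counter: 8 },
    ];
    decrement_all(&mut fish_list);
    decrement_all_with_adapter(&mut fish_list);
    fish_list.iter().map(|fish| fish.internal_counter).collect()
}

fn main() {
    println!("{:?}", counters_after_two_passes());
}
```

No collect() is needed for the mutation itself, because the elements are changed in place.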

Can a function that takes a reference be passed as a closure argument that will provide owned values?

I am trying to simplify my closures, but I had a problem converting my closure to a reference to an associated function when the parameter is owned by the closure but the inner function call only expects a reference.
#![deny(clippy::pedantic)]
fn main() {
let borrowed_structs = vec![BorrowedStruct, BorrowedStruct];
//Selected into_iter specifically to reproduce the minimal scenario that closure gets value instead of reference
borrowed_structs
.into_iter()
.for_each(|consumed_struct: BorrowedStruct| MyStruct::my_method(&consumed_struct));
// I want to write it with static method reference like following line:
// for_each(MyStruct::my_method);
}
struct MyStruct;
struct BorrowedStruct;
impl MyStruct {
fn my_method(prm: &BorrowedStruct) {
prm.say_hello();
}
}
impl BorrowedStruct {
fn say_hello(&self) {
println!("hello");
}
}
Is it possible to simplify this code:
into_iter().for_each(|consumed_struct: BorrowedStruct| MyStruct::my_method(&consumed_struct));
To the following:
into_iter().for_each(MyStruct::my_method)
Note that into_iter is used here only to reproduce the scenario in which my closure owns the value. I know that iter can be used in such a scenario, but that is not the real scenario I am working on.

The answer to your general question is no. Types must match exactly when passing a function as a closure argument.
There are one-off workarounds, as shown in rodrigo's answer, but the general solution is to simply take the reference yourself, as you've done:
something_taking_a_closure(|owned_value| some_function_or_method(&owned_value))
I actually advocated for this case about two years ago as part of the ergonomics revamp, but no one else seemed interested.
In your specific case, you can remove the type from the closure argument to make it more succinct:
.for_each(|consumed_struct| MyStruct::my_method(&consumed_struct))

I don't think there is a for_each_ref in the Iterator trait yet. But you can write your own quite easily:
trait MyIterator {
fn for_each_ref<F>(self, mut f: F)
where
Self: Iterator + Sized,
F: FnMut(&Self::Item),
{
self.for_each(|x| f(&x));
}
}
impl<I: Iterator> MyIterator for I {}
borrowed_structs
.into_iter()
.for_each_ref(MyStruct::my_method);
Another option: if you are able to change the prototype of the my_method function, you can make it accept the value either by value or by reference with Borrow:
use std::borrow::Borrow;
impl MyStruct {
fn my_method(prm: impl Borrow<BorrowedStruct>) {
let prm = prm.borrow();
prm.say_hello();
}
}
And then your original code with .for_each(MyStruct::my_method) just works.
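As a sketch of why this works: Borrow<T> is implemented both for T itself and for &T, so the same function path can be passed whether the iterator yields owned values or references. The static GREETINGS counter and the function greet_owned_and_borrowed are assumptions added here only to make the number of calls observable.

```rust
use std::borrow::Borrow;
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative counter (not in the original code) to observe the calls.
static GREETINGS: AtomicUsize = AtomicUsize::new(0);

struct MyStruct;
struct BorrowedStruct;

impl MyStruct {
    fn my_method(prm: impl Borrow<BorrowedStruct>) {
        // The annotation makes the target type of borrow() explicit.
        let prm: &BorrowedStruct = prm.borrow();
        prm.say_hello();
    }
}

impl BorrowedStruct {
    fn say_hello(&self) {
        GREETINGS.fetch_add(1, Ordering::SeqCst);
        println!("hello");
    }
}

/// Hypothetical helper: returns how many greetings were produced.
fn greet_owned_and_borrowed() -> usize {
    let before = GREETINGS.load(Ordering::SeqCst);
    let structs = vec![BorrowedStruct, BorrowedStruct];

    // Items are `&BorrowedStruct` here ...
    structs.iter().for_each(MyStruct::my_method);
    // ... and owned `BorrowedStruct` values here.
    structs.into_iter().for_each(MyStruct::my_method);

    GREETINGS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("{}", greet_owned_and_borrowed());
}
```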
A third option is to use a generic wrapper function:
fn bind_by_ref<T>(mut f: impl FnMut(&T)) -> impl FnMut(T) {
move |x| f(&x)
}
And then call the wrapped function with .for_each(bind_by_ref(MyStruct::my_method));.

Borrowed value doesn't live long enough, trying to expose iterators instead of concrete Vec representations of the data

I have a struct representing a grid of data, and accessors for the rows and columns. I'm trying to add accessors for the rows and columns which return iterators instead of Vec.
use std::slice::Iter;
#[derive(Debug)]
pub struct Grid<Item : Copy> {
raw : Vec<Vec<Item>>
}
impl <Item : Copy> Grid <Item>
{
pub fn new( data: Vec<Vec<Item>> ) -> Grid<Item> {
Grid{ raw : data }
}
pub fn width( &self ) -> usize {
self.rows()[0].len()
}
pub fn height( &self ) -> usize {
self.rows().len()
}
pub fn rows( &self ) -> Vec<Vec<Item>> {
self.raw.to_owned()
}
pub fn cols( &self ) -> Vec<Vec<Item>> {
let mut cols = Vec::new();
for i in 0..self.width() {
let col = self.rows().iter()
.map( |row| row[i] )
.collect::<Vec<Item>>();
cols.push(col);
}
cols
}
pub fn rows_iter( &self ) -> Iter<Vec<Item>> {
// LIFETIME ERROR HERE
self.rows().iter()
}
pub fn cols_iter( &self ) -> Iter<Vec<Item>> {
// LIFETIME ERROR HERE
self.cols().iter()
}
}
Both functions rows_iter and cols_iter have the same problem: error: borrowed value does not live long enough. I've tried a lot of things, but pared it back to the simplest thing to post here.

You can use the into_iter method, which returns std::vec::IntoIter. iter only borrows the data source it iterates over, whereas into_iter takes ownership of it, so the vector lives as long as the iterator does.
pub fn cols_iter( &self ) -> std::vec::IntoIter<Vec<Item>> {
self.cols().into_iter()
}
However, I think that the design of your Grid type could be improved a lot. Always cloning a vector is not a good thing (to name one issue).

Iterators only contain borrowed references to the original data structure; they don't take ownership of it. Therefore, a vector must live longer than an iterator on that vector.
rows and cols allocate and return a new Vec. rows_iter and cols_iter are trying to return an iterator on a temporary Vec. This Vec will be deallocated before rows_iter or cols_iter return. That means that an iterator on that Vec must be deallocated before the function returns. However, you're trying to return the iterator from the function, which would make the iterator live longer than the end of the function.
There is simply no way to make rows_iter and cols_iter compile as is. I believe these methods are simply unnecessary, since you already provide the public rows and cols methods.
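If iterator accessors are still wanted, one possible direction is to borrow from the grid's own storage instead of from a freshly cloned Vec. The following is only a sketch under that assumption: it changes the signatures from the question, renames the type parameter to T, and builds each column on demand; the helper grid_demo is made up for illustration.

```rust
pub struct Grid<T: Copy> {
    raw: Vec<Vec<T>>,
}

impl<T: Copy> Grid<T> {
    pub fn new(data: Vec<Vec<T>>) -> Self {
        Grid { raw: data }
    }

    /// Borrows the rows directly from `self.raw`, so the iterator is tied
    /// to `&self` rather than to a temporary Vec.
    pub fn rows_iter(&self) -> std::slice::Iter<'_, Vec<T>> {
        self.raw.iter()
    }

    /// Builds each column lazily, one Vec per step. Assumes all rows have
    /// the same length as the first one.
    pub fn cols_iter(&self) -> impl Iterator<Item = Vec<T>> + '_ {
        let width = self.raw.first().map_or(0, |row| row.len());
        (0..width).map(move |i| self.raw.iter().map(|row| row[i]).collect::<Vec<T>>())
    }
}

/// Hypothetical helper: returns (number of rows, columns) for a 2x3 grid.
fn grid_demo() -> (usize, Vec<Vec<i32>>) {
    let grid = Grid::new(vec![vec![1, 2, 3], vec![4, 5, 6]]);
    let row_count = grid.rows_iter().count();
    let cols: Vec<Vec<i32>> = grid.cols_iter().collect();
    (row_count, cols)
}

fn main() {
    println!("{:?}", grid_demo());
}
```

This also avoids cloning the whole grid on every call, which the rows method in the question does.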
