Mutably iterate through an iterator using Itertools' tuple_windows

Mutably iterate through an iterator using Itertools' tuple_windows - rust

I'm attempting to store a series of entries inside a Vec. Later I need to reprocess through the Vec to fill in some information in each entry about the next entry. The minimal example would be something like this:
struct Entry {
curr: i32,
next: Option<i32>
}
struct History {
entries: Vec<Entry>
}
where I would like to fill in the next fields to the next entries' curr value. To achieve this, I want to make use of the tuple_windows function from Itertools on the mutable iterator. I expect I can write a function like this:
impl History {
fn fill_next_with_itertools(&mut self) {
for (a, b) in self.entries.iter_mut().tuple_windows() {
a.next = Some(b.curr);
}
}
}
(playground)
However, it refuse to compile because the iterator Item's type, &mut Entry, is not Clone, which is required by tuple_windows function. I understand there is a way to iterate through the list using the indices like this:
fn fill_next_with_index(&mut self) {
for i in 0..(self.entries.len()-1) {
self.entries[i].next = Some(self.entries[i+1].curr);
}
}
(playground)
But I feel the itertools' approach more natural and elegant. What's the best ways to achieve the same effect?

From the documentation:
tuple_window clones the iterator elements so that they can be part of successive windows, this makes it most suited for iterators of references and other values that are cheap to copy.
This means that if you were to implement it with &mut items, then you'd have multiple mutable references to the same thing which is undefined behaviour.
If you still need shared, mutable access you'd have to wrap it in Rc<RefCell<T>>, Arc<Mutex<T>> or something similar:
fn fill_next_with_itertools(&mut self) {
for (a, b) in self.entries.iter_mut().map(RefCell::new).map(Rc::new).tuple_windows() {
a.borrow_mut().next = Some(b.borrow().curr);
}
}

Related

Why does std::iter::Peekable::peek mutably borrow the self argument?

I am struggling to understand why the peek() method is borrowing the self argument mutably.
The documentation says:
"Returns a reference to the next() value without advancing the iterator."
Since it is not advancing the iterator, what is the point behind borrowing the argument as mutable?
I looked at the implementation of peek() and noticed it is calling a next() method.
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
Is it because of the use of the next() method, the peek() method is designed to borrow mutably or is there another semantic behind the peek() method that really requires the mutable borrow?
In other words, what is it that gets mutated when the peek() method is called?

As you have already done, let's look at its source, which reveals a little about how it works internally:
pub struct Peekable<I: Iterator> {
iter: I,
/// Remember a peeked value, even if it was None.
peeked: Option<Option<I::Item>>,
}
Together with its implementation for next():
impl<I: Iterator> Iterator for Peekable<I> {
// ...
fn next(&mut self) -> Option<I::Item> {
match self.peeked.take() {
Some(v) => v,
None => self.iter.next(),
}
}
// ...
}
and it's implementation for peek():
impl<I: Iterator> Peekable<I> {
// ...
pub fn peek(&mut self) -> Option<&I::Item> {
let iter = &mut self.iter;
self.peeked.get_or_insert_with(|| iter.next()).as_ref()
}
// ...
}
Peek wraps an existing iterator. And existing iterators are not peekable.
So what peek does, is:
on peek():
take the next() item from the wrapped iterator and store it in self.peeked (if self.peeked does not yet contain the next item already)
return a reference to the peeked item
on next():
see if we currently have a self.peeked item
if yes, return that one
if no, take the next() item from the underlaying iterator.
So as you already realized, the peek() action needs &mut self because it might have to generate the next peeked item by calling next() on the underlying iterator.
So here is the reason, if you look at it from a more abstract point of view: The next item might not even exist yet. So peeking might involve actually generating that next item, which is definitely a mutating action on the underlying iterator.
Not all iterators are over arrays/slices where the items already exist; an iterator might by anything that generates a number of items, including lazy generators that only create said items as they are asked for it.
Could they have implemented it differently?
Yes, there absolutely is the possibility to do it differently. They could have next()ed the underlying iterator during new(). Then, when someone calls next() on the Peekable, it could return the currently peeked value and query the next one right away. Then, peeking would have been a &self method.
Why they went that way is unclear, but most certainly to keep the iterator as lazy as possible. Lazy iterators are a good thing in most cases.
That said, here is a proof of concept how a prefetching peekable iterator could be implemented that doesn't require &mut for peek():
pub struct PrefetchingPeekingIterator<I: Iterator> {
iter: I,
next_item: Option<I::Item>,
}
impl<I: Iterator> PrefetchingPeekingIterator<I> {
fn new(mut iter: I) -> Self {
let next_item = iter.next();
Self { iter, next_item }
}
fn peek(&self) -> Option<&I::Item> {
self.next_item.as_ref()
}
}
impl<I: Iterator> Iterator for PrefetchingPeekingIterator<I> {
type Item = I::Item;
fn next(&mut self) -> Option<Self::Item> {
std::mem::replace(&mut self.next_item, self.iter.next())
}
}
fn main() {
let mut range = PrefetchingPeekingIterator::new(1..10);
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
dbg!(range.next().unwrap());
dbg!(range.peek().unwrap());
}
[src/main.rs:27] range.next().unwrap() = 1
[src/main.rs:28] range.peek().unwrap() = 2
[src/main.rs:29] range.next().unwrap() = 2
[src/main.rs:30] range.peek().unwrap() = 3
[src/main.rs:31] range.next().unwrap() = 3
[src/main.rs:32] range.peek().unwrap() = 4

Yes as you noticed Peekable::peek might have to call self.iter.next() to get an element if its self.peeked doesn't already have something stored.
It then also has to store that value somewhere to not advance the Peekable iterator.
The underlying Iterator may very well get advanced by it.
Both advancing the underlying Iterator as well as storing the value in self.peeked require mutable access to self.

Adapt a lending iterator type to a serde-style map visitor

I'm trying to adapt the data model I designed to a serde MapAccess interface. I have my own data model, which has a List type containing key-value pairs, which are exposed like this:
impl List {
fn next_item(&mut self) -> Option<(String, Item<'_>)>
}
In other words, the Item type mutably borrows from a List object, and must be fully processed before the next Item can be extracted from the List.
I need to adapt this data model to a deserializer trait that looks like this:
trait MapAccess {
fn next_key<T: Deserialize>(&mut self) -> Option<T>;
fn next_value<T: Deserialize>(&mut self) -> T;
}
The two methods should be called in alternating order; they're allowed to panic or return garbage results if you don't call them in the correct order. This trait is provided by the deserializer library I'm using, and is the one only of this problem I can't control or modify.
I'm trying to adapt a List or an &mut List into a MapAccess and I'm running into challenges. I do have a function which converts an Item<'_> to a T: Deserialize (and specifically ends the borrow of List):
fn deserialize_item<T: Deserialize>(item: Item<'_>) -> T;
The basic problem I'm running into is that MapAccess::next_key calls List::next_item and then deserializes the String as a key, and then needs to store the Item<'_> so that it can handled by next_value. I keep running into cases with types that are almost self referential (two fields that both mutably borrow the same thing). It's nothing truely self-referential so I'm not (yet) convinced that this is impossible. The closest I've come is some kind of enum that that resembles enum State {List(&'a mut List), Item(Item<'_>)} (because I don't need to access a List while I'm working on the Item, but I haven't found a way to restore access to the List for the next next_key after the Item is consumed.
As the author of the data model, I do have some leeway to change how it's designed, but if possible I'd like to hold on to some kind of borrowing model, which I use to enforce parse consistency (that is, the borrow prevents List from parsing the next item until after the Item has fully parsed the content of the current item.). I can of course design any additional types to implement this correctly.
Note: This is all serde related, but I'm using original types to simplify the interfaces. In particular I've omitted all the of the Result types and 'de lifetimes as I'm confident they're not relevant to this problem.
EDIT: Here's my attempt to make something that works. This doesn't compile, but it hopefully gets across what I'd like to do here.
fn deserialize_string<T: Deserialize>(value: String) -> T { todo!() }
fn deserialize_item<T: Deserialize>(item: Item<'_>) -> T { todo!() }
struct ListMapAccess<'a> {
list: &'a mut List,
item: Option<Item<'a>>,
}
impl<'a> MapAccess for ListMapAccess<'a> {
fn next_key<T: Deserialize>(&mut self) -> Option<T> {
let (key, item) = self.list.next_item()?;
self.item = Some(item);
Some(deserialize_string(key))
}
fn next_value<T: Deserialize>(&mut self) -> T {
let item = self.item.expect("Called next_item out of order");
deserialize_item(item)
}
}
Now, I'm really not at all tied to a ListMapAccess struct; I'm happy with any solution that integrates my List / Item interface into MapAccess.

Pass Vec as Iterator to function

What I want to do:
I have a struct that includes a Vec of another struct.
I will update this Vec from time to time.
I want to pass this Vec as a Iterator to another function.
Here follows a short code snippet of how I want to do things but I just can't get it to compile whatever I do:
struct Main {
data: Vec<OtherStruct>
}
callFunctionHere(self.data.iter());
pub fn callFunctionHere<I>(data: I)
where
I: Iterator<Item = OtherStruct>,
{
// for i in data...
}
Could I pass the data as a new copied object of some sort?

You need a trait bound of Iterator<Item = &OtherStruct>, since Vec<T>::iter returns an iterator over references to T. An iterator producing T values would have to move them out of the vector, which is safe only if the vector itself is consumed, and is what Vec<T>::into_iter() does.
Note that you'll need to also specify a lifetime of the reference, tied to lifetime of the data:
fn some_function<'a, I>(data: I)
where
I: Iterator<Item = &'a Other> + 'a,
{
for el in data {
println!("{:?}", el)
}
}
Complete example in the playground.
Finally, it's worth pointing out that in general it is preferred to request the IntoIterator bound instead of Iterator. Since Iterator implements IntoIterator, such function would keep accepting iterators, but would also accept objects that can be converted into iterators. In this case it would accept &s.data in addition to s.data.iter().

I solved it by:
callFunctionHere(data.iter().map(|i| OtherStruct{
d: i.d
})
´´´

Can a type know when a mutable borrow to itself has ended?

I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?

Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.

Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.

How to convert &Vector<Mutex> to Vector<Mutex>

I'm working my way trough the Rust examples. There is this piece of code:
fn new(name: &str, left: usize, right: usize) -> Philosopher {
Philosopher {
name: name.to_string(),
left: left,
right: right,
}
}
what is the best way to adapt this to a vector ? This works:
fn new(v: Vec<Mutex<()>>) -> Table {
Table {
forks: v
}
}
Than I tried the following:
fn new(v: &Vec<Mutex<()>>) -> Table {
Table {
forks: v.to_vec()
}
}
But that gives me:
the trait `core::clone::Clone` is not implemented
for the type `std::sync::mutex::Mutex<()>`
Which make sense. But what must I do If I want to pass a reference to Table and do not want to store a reference inside the Table struct ?

The error message actually explains a lot here. When you call to_vec on a &Vec<_>, you have to make a clone of the entire vector. That's because Vec owns the data, while a reference does not. In order to clone a vector, you also have to clone all of the contents. This is because the vector owns all the items inside of it.
However, your vector contains a Mutex, which is not able to be cloned. A mutex represents unique access to some data, so having two separate mutexes to the same data would be pointless.
Instead, you probably want to share references to the mutex, not clone it completely. Chances are, you want an Arc:
use std::sync::{Arc, Mutex};
fn main() {
let things = vec![Arc::new(Mutex::new(()))];
things.to_vec();
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Mutably iterate through an iterator using Itertools' tuple_windows - rust

Related

Why does std::iter::Peekable::peek mutably borrow the self argument?

Adapt a lending iterator type to a serde-style map visitor

Pass Vec as Iterator to function

Can a type know when a mutable borrow to itself has ended?

How to convert &Vector<Mutex> to Vector<Mutex>

Categories

Resources