Is it possible to implement an iterator with borrowed items?

I am implementing a cursor over space grids, e.g. for a space [2, 2] the cursor should visit [0, 0], [1, 0], [0, 1], [1, 1] in that order. I have written the code below, but when I use for cursor in SpaceCursor::<2>::new(&[2, 2]), I noticed that the cursor is not borrowed but copied. Since the cursor will always live within the SpaceCursor lifetime, I am wondering whether its type could be &[i32; 2] rather than [i32; 2]?
struct SpaceCursor<const D: usize> {
    cursor: [i32; D],
    boundary: [usize; D],
}
impl<const D: usize> SpaceCursor<D> {
    pub fn new(boundary: &[usize]) -> Self {
        let mut result = SpaceCursor::<D> {
            cursor: [0; D],
            boundary: boundary.try_into().expect("Bad boundary"),
        };
        if D >= 1 {
            result.cursor[0] = -1;
        }
        result
    }
}
impl<const D: usize> Iterator for SpaceCursor<D> {
    type Item = [i32; D];
    fn next(&mut self) -> Option<Self::Item> {
        let mut index = 0;
        while index < D {
            self.cursor[index] += 1;
            if self.cursor[index] < self.boundary[index] as i32 {
                break;
            }
            index += 1;
        }
        if index == D {
            None
        } else {
            for i in 0..index {
                self.cursor[i] = 0;
            }
            Some(self.cursor)
        }
    }
}

If you need an iterator to yield references then the iterator itself must hold a reference to the underlying data. This is due to the design of the Iterator trait:
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}
When implementing Iterator where Item is a reference, you would need to give it a lifetime. But you need to pick just one lifetime which then must be valid for all items, and that lifetime must outlive the iterator. This only works if all of the items are already held somewhere in memory and will live longer than the time that the iterator is being used.
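To illustrate the case that does work with the ordinary Iterator trait, here is a minimal sketch in which all items already exist in memory outside the iterator. The name GridPoints and the precomputed list of points are assumptions made for this example, not part of the question's code.

```rust
/// Iterator that borrows a slice of precomputed grid positions.
/// `GridPoints` is a hypothetical name used only for this sketch.
struct GridPoints<'a> {
    points: &'a [[i32; 2]],
    index: usize,
}

impl<'a> Iterator for GridPoints<'a> {
    // One lifetime for every item: the lifetime of the borrowed slice.
    type Item = &'a [i32; 2];

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.points.get(self.index)?;
        self.index += 1;
        Some(item)
    }
}

fn main() {
    // The points are owned here, outside the iterator, and outlive it.
    let points = vec![[0, 0], [1, 0], [0, 1], [1, 1]];
    let iter = GridPoints { points: &points, index: 0 };

    // Several references can be held at once, because they borrow from
    // `points` and not from the iterator itself.
    let all: Vec<&[i32; 2]> = iter.collect();
    println!("{all:?}");
}
```

The trade-off is memory: every position has to be stored up front, which is exactly what a cursor that overwrites a single buffer tries to avoid.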
In Rust 1.65 this will (sort of) change. A new feature, generic associated types (GATs), will be available in stable, which will allow you to define a "streaming" iterator like this:
trait StreamingIterator {
    type Item<'a> where Self: 'a;
    fn stream_next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}
The difference here is that the lifetime 'a is instantiated on every call to stream_next and can therefore be different each time. The caller also gets to choose how long to keep the reference and is limited only by the mutable borrow of the streaming iterator itself. The mutability also means you can only borrow one item at a time, so you couldn't do things like collect them into a Vec without cloning them.
Your iterator could be ported like this:
impl<const D: usize> StreamingIterator for SpaceCursor<D> {
    type Item<'a> = &'a [i32; D];
    fn stream_next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let mut index = 0;
        while index < D {
            self.cursor[index] += 1;
            if self.cursor[index] < self.boundary[index] as i32 {
                break;
            }
            index += 1;
        }
        if index == D {
            None
        } else {
            for i in 0..index {
                self.cursor[i] = 0;
            }
            Some(&self.cursor)
        }
    }
}
Note that the "limitation" of only being able to borrow one item at a time is critical to this working, because each call overwrites the cursor value. This is also another hint as to why this couldn't have worked with the non-streaming Iterator trait.
Since StreamingIterator is not a built-in trait, there is no syntactic support for it in loops, so you must use it explicitly:
let data = vec![1, 3, 7, 9];
let mut cursor = SpaceCursor::<4>::new(&data);
while let Some(i) = cursor.stream_next() {
    println!("{i:?}");
}
You can try this now in the Rust 1.65 beta, in nightly, or wait about 2 weeks for the stable release of Rust 1.65.
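As a rough sketch of the "collect without cloning" restriction mentioned above, the example below copies each item out of the streaming iterator before requesting the next one. Counter is a hypothetical stand-in for SpaceCursor, kept small so the block is self-contained; the trait is the one defined earlier, and the code assumes a compiler with stable GATs (Rust 1.65 or later).

```rust
trait StreamingIterator {
    type Item<'a> where Self: 'a;
    fn stream_next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Hypothetical stand-in for `SpaceCursor`: yields a reference to an
/// internal buffer that is overwritten on every step.
struct Counter {
    current: [i32; 1],
    limit: i32,
}

impl StreamingIterator for Counter {
    type Item<'a> = &'a [i32; 1];

    fn stream_next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        if self.current[0] + 1 >= self.limit {
            return None;
        }
        self.current[0] += 1;
        Some(&self.current)
    }
}

/// Copies each borrowed item out before asking for the next one.
fn collect_owned(mut counter: Counter) -> Vec<[i32; 1]> {
    let mut out = Vec::new();
    while let Some(item) = counter.stream_next() {
        // `*item` copies the array; the borrow of `counter` ends after this line.
        out.push(*item);
    }
    out
}

fn main() {
    let counter = Counter { current: [-1], limit: 3 };
    println!("{:?}", collect_owned(counter)); // [[0], [1], [2]]
}
```

Trying to push the references themselves into the Vec would be rejected, since each one keeps the iterator mutably borrowed.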

Related

Mutable borrow into two parts with cleanup

I have some object that I want to split into two parts via a mutable borrow, then combine those back together into the original object when the split references go out of scope.
The simplified example below is for a Count struct that holds a single i32, which we want to split into two &mut i32s that are both incorporated back into the original Count when the two mutable references go out of scope.
The approach I am taking below is to use an intermediate object CountSplit which holds a mutable reference to the original Count object and has the Drop trait implemented to do the re-combination logic.
This approach feels kludgy. In particular, this is awkward:
let mut ms = c.make_split();
let (x, y) = ms.split();
Doing this in one line like let (x, y) = c.make_split().split(); is not allowed because the intermediate object must have a longer lifetime. Ideally I would be able to do something like let (x, y) = c.magic_split(); and avoid exposing the intermediate object altogether.
Is there a way to do this which doesn't require doing two let's every time, or some other way to tackle this pattern that would be more idiomatic?
#[derive(Debug)]
struct Count {
val: i32,
}
trait MakeSplit<'a> {
type S: Split<'a>;
fn make_split(&'a mut self) -> Self::S;
}
impl<'a> MakeSplit<'a> for Count {
type S = CountSplit<'a>;
fn make_split(&mut self) -> CountSplit {
CountSplit {
top: self,
second: 0,
}
}
}
struct CountSplit<'a> {
top: &'a mut Count,
second: i32,
}
trait Split<'a> {
fn split(&'a mut self) -> (&'a mut i32, &'a mut i32);
}
impl<'a, 'b> Split<'a> for CountSplit<'b> {
fn split(&mut self) -> (&mut i32, &mut i32) {
(&mut self.top.val, &mut self.second)
}
}
impl<'a> Drop for CountSplit<'a> {
fn drop(&mut self) {
println!("custom drop occurs here");
self.top.val += self.second;
}
}
fn main() {
let mut c = Count { val: 2 };
println!("{:?}", c); // Count { val: 2 }
{
let mut ms = c.make_split();
let (x, y) = ms.split();
println!("split: {} {}", x, y); // split: 2 0
// each of these lines correctly gives a compile-time error
// c.make_split(); // can't borrow c as mutable
// println!("{:?}", c); // or immutable
// ms.split(); // also can't borrow ms
*x += 100;
*y += 5000;
println!("split: {} {}", x, y); // split: 102 5000
} // custom drop occurs here
println!("{:?}", c); // Count { val: 5102 }
}

I don't think a reference to a temporary value like yours can be made to work in today's Rust.
If it's any help, if you specifically want to call a function with two &mut i32 parameters like you mentioned in the comments, e.g.
fn foo(a: &mut i32, b: &mut i32) {
*a += 1;
*b += 2;
println!("split: {} {}", a, b);
}
you can already do that with the same number of lines as you'd have if your chaining worked.
With the chaining, you'd call
let (x, y) = c.make_split().split();
foo(x, y);
And if you just leave out the conversion to a tuple, it looks like this:
let mut ms = c.make_split();
foo(&mut ms.top.val, &mut ms.second);
You can make it a little prettier by e.g. storing the mutable reference to val directly in CountSplit as first, so that it becomes foo(&mut ms.first, &mut ms.second);. If you want it to feel even more like a tuple, I think you can use DerefMut to be able to write foo(&mut ms.0, &mut ms.1);.
Alternatively, you can of course formulate this as a function taking a function:
impl Count {
fn as_split<F: FnMut(&mut i32, &mut i32)>(&mut self, mut f: F) {
let mut second = 0;
f(&mut self.val, &mut second);
self.val += second;
}
}
and then just call
c.as_split(foo);
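Put together as a complete program, the closure-based variant might look like the sketch below. It reuses the Count struct and the as_split method from above and passes an inline closure instead of foo; the values match the ones used in the question.

```rust
#[derive(Debug)]
struct Count {
    val: i32,
}

impl Count {
    /// Lends out `val` and a temporary `second`, then folds `second` back in.
    fn as_split<F: FnMut(&mut i32, &mut i32)>(&mut self, mut f: F) {
        let mut second = 0;
        f(&mut self.val, &mut second);
        // Recombine once the closure has returned; no Drop impl is needed.
        self.val += second;
    }
}

fn main() {
    let mut c = Count { val: 2 };
    c.as_split(|x, y| {
        *x += 100;
        *y += 5000;
    });
    println!("{:?}", c); // Count { val: 5102 }
}
```

Because the two references only exist inside the closure, the intermediate object never has to be named by the caller.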

How to implement Iterator yielding mutable references [duplicate]

This question already has an answer here:
How can I create my own data structure with an iterator that returns mutable references?
(1 answer)
Closed 1 year ago.
I am trying to implement a simple lookup iterator:
pub struct LookupIterMut<'a, D> {
data : &'a mut [D],
indices : &'a [usize],
i: usize
}
impl<'a, D> Iterator for LookupIterMut<'a, D> {
type Item = &'a mut D;
fn next(&mut self) -> Option<Self::Item> {
if self.i >= self.indices.len() {
None
} else {
let index = self.indices[self.i] as usize;
self.i += 1;
Some(&mut self.data[index]) // error here
}
}
}
The idea was to allow a caller consecutive mutable access to an internal storage. However, I am getting the error "cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements".
As far as I understand I would have to change the function signature to next(&'a mut self) -> .. but this would not be an Iterator anymore.
I also discovered that I could simply use raw pointers, though I am not sure if this is appropriate here:
// ...
type Item = *mut D;
// ...
Thanks for your help

Your code is invalid because you try to return multiple mutable references to the same slice with the same lifetime 'a.
For such a thing to work, you would need a different lifetime for each returned Item so that you wouldn't hold 2 mutable references to the same slice. You cannot do that for now because it requires Generic Associated Types:
type Item<'item> = &'item mut D; // Does not work today
One solution is to check that the indices are unique and to rebind the lifetime of the referenced item to 'a in an unsafe block. This is safe because all the indices are unique, so the user cannot hold 2 mutable references to the same item.
Don't forget to encapsulate the whole code inside a module, so that the struct cannot be built without the check in new:
mod my_mod {
pub struct LookupIterMut<'a, D> {
data: &'a mut [D],
indices: &'a [usize],
i: usize,
}
impl<'a, D> LookupIterMut<'a, D> {
pub fn new(data: &'a mut [D], indices: &'a [usize]) -> Result<Self, ()> {
let mut uniq = std::collections::HashSet::new();
let all_distinct = indices.iter().all(move |&x| uniq.insert(x));
if all_distinct {
Ok(LookupIterMut {
data,
indices,
i: 0,
})
} else {
Err(())
}
}
}
impl<'a, D> Iterator for LookupIterMut<'a, D> {
type Item = &'a mut D;
fn next(&mut self) -> Option<Self::Item> {
self.indices.get(self.i).map(|&index| {
self.i += 1;
unsafe { std::mem::transmute(&mut self.data[index]) }
})
}
}
}
Note that your code will panic if one index is out of bounds.

Using unsafe
Reminder: it is unsound to have, at any time, two accessible mutable references to the same underlying value.
The crux of the problem is that the language cannot guarantee that the code abides by the above rule. Should indices contain any duplicate, the iterator as implemented would allow obtaining two mutable references to the same item in the slice at the same time, which is unsound.
When the language cannot make the guarantee on its own, then you either need to find an alternative approach or you need to do your due diligence and then use unsafe.
In this case:
use std::collections::HashSet;
impl<'a, D> LookupIterMut<'a, D> {
    pub fn new(data: &'a mut [D], indices: &'a [usize]) -> Self {
        // Reject duplicates up front; `next` relies on this for soundness.
        let set: HashSet<usize> = indices.iter().copied().collect();
        assert!(indices.len() == set.len(), "Duplicate indices!");
        Self { data, indices, i: 0 }
    }
}
impl<'a, D> Iterator for LookupIterMut<'a, D> {
    type Item = &'a mut D;
    fn next(&mut self) -> Option<Self::Item> {
        if self.i >= self.indices.len() {
            None
        } else {
            let index = self.indices[self.i];
            assert!(index < self.data.len());
            self.i += 1;
            // Safety:
            // - index is guaranteed to be within bounds.
            // - indices is guaranteed not to contain duplicates.
            Some(unsafe { &mut *self.data.as_mut_ptr().add(index) })
        }
    }
}
Performance-wise, constructing a HashSet in the constructor is rather unsatisfying, but it cannot really be avoided in general. If indices were guaranteed to be sorted, for example, the check could be performed without allocation.
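A minimal sketch of that allocation-free check, assuming the caller promises sorted input: requiring the indices to be strictly increasing rules out duplicates in a single pass. The function name is_strictly_increasing is made up for this example.

```rust
/// Returns true if `indices` is strictly increasing, which implies that
/// it contains no duplicates. No allocation is needed.
fn is_strictly_increasing(indices: &[usize]) -> bool {
    indices.windows(2).all(|pair| pair[0] < pair[1])
}

fn main() {
    assert!(is_strictly_increasing(&[0, 2, 5]));
    assert!(!is_strictly_increasing(&[0, 2, 2])); // duplicate
    assert!(!is_strictly_increasing(&[3, 1])); // not sorted
    assert!(is_strictly_increasing(&[])); // trivially true
}
```

Such a check could replace the HashSet in new, at the cost of rejecting valid but unsorted index lists.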

How to implement an iterator over chunks of an array in a struct?

I want to implement an iterator for the struct with an array as one of its fields. The iterator should return a slice of that array, but this requires a lifetime parameter. Where should that parameter go?
The Rust version is 1.37.0
struct A {
a: [u8; 100],
num: usize,
}
impl Iterator for A {
type Item = &[u8]; // this requires a lifetime parameter, but there is none declared
fn next(&mut self) -> Option<Self::Item> {
if self.num >= 10 {
return None;
}
let res = &self.a[10*self.num..10*(self.num+1)];
self.num += 1;
Some(res)
}
}

I wouldn't implement my own. Instead, I'd reuse the existing chunks iterator and implement IntoIterator for a reference to the type:
struct A {
a: [u8; 100],
num: usize,
}
impl<'a> IntoIterator for &'a A {
type Item = &'a [u8];
type IntoIter = std::slice::Chunks<'a, u8>;
fn into_iter(self) -> Self::IntoIter {
self.a.chunks(self.num)
}
}
fn example(a: A) {
for chunk in &a {
println!("{}", chunk.iter().sum::<u8>())
}
}

When you return a reference from a function, its lifetime needs to be tied to something else. Otherwise, the compiler wouldn't know how long the reference is valid (the exception to this is a 'static lifetime, which lasts for the duration of the whole program).
So we need an existing reference to the slices. One standard way to do this is to tie the reference to the iterator itself. For example,
struct Iter<'a> {
slice: &'a [u8; 100],
num: usize,
}
Then what you have works almost verbatim. (I've changed the names of the types and fields to be a little more informative).
impl<'a> Iterator for Iter<'a> {
type Item = &'a [u8];
fn next(&mut self) -> Option<Self::Item> {
if self.num >= 100 {
return None;
}
let res = &self.slice[10 * self.num..10 * (self.num + 1)];
self.num += 1;
Some(res)
}
}
Now, you probably still have an actual [u8; 100] somewhere, not just a reference. If you still want to work with that, what you'll want is a separate struct that owns the array and has a method that returns an Iter. For example:
struct Data {
array: [u8; 100],
}
impl Data {
fn iter<'a>(&'a self) -> Iter<'a> {
Iter {
slice: &self.array,
num: 0,
}
}
}
Thanks to lifetime elision, the lifetimes on iter can be left out:
impl Data {
fn iter(&self) -> Iter {
Iter {
slice: &self.array,
num: 0,
}
}
}
(playground)
Just a few notes. There was one compiler error with [0u8; 100]. This may have been a typo for [u8; 100], but just in case, here's why we can't do that. In the fields for a struct definition, only the types are specified. There aren't default values for the fields or anything like that. If you're trying to have a default for the struct, consider using the Default trait.
Second, you're probably aware of this, but there's already an implementation of a chunk iterator for slices. If slice is a slice (or can be deref coerced into a slice - vectors and arrays are prime examples), then slice.chunks(n) is an iterator over chunks of that slice with length n. I gave an example of this in the code linked above. Interestingly, that implementation uses a very similar idea: slice.chunks(n) returns a new struct with a lifetime parameter and implements Iterator. This is almost exactly the same as our Data::iter.
Finally, your implementation of next has a bug in it that causes an out-of-bounds panic when run. See if you can spot it!
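For reference, the built-in chunks method from the second note can be used roughly as in the sketch below. The helper chunk_sums is a hypothetical name; it widens each byte to u32 so that the sums cannot overflow, and it assumes n is non-zero, since chunks panics for a chunk size of 0.

```rust
/// Sums each chunk of `n` bytes. The last chunk may be shorter than `n`.
fn chunk_sums(data: &[u8], n: usize) -> Vec<u32> {
    data.chunks(n)
        .map(|chunk| chunk.iter().map(|&b| u32::from(b)).sum::<u32>())
        .collect()
}

fn main() {
    let array = [1u8; 100];
    // Ten chunks of ten bytes each.
    let sums = chunk_sums(&array, 10);
    println!("{sums:?}");
}
```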

What is the most efficient way to have an iterator over the references of a given numeric range?

One way of doing this is to create an array or vector ([0, 1, 2, ..., n]) and then use the iter() method. However, it is not memory efficient at all.
I tried the following implementation:
pub struct StaticIxIter {
max: usize,
current: usize,
next: usize,
}
impl StaticIxIter {
pub fn new(max: usize) -> Self {
StaticIxIter {
max,
current: 0,
next: 0,
}
}
}
impl Iterator for StaticIxIter {
type Item = &usize;
fn next(&mut self) -> Option<Self::Item> {
if self.next >= self.max {
return None;
}
self.current = self.next;
self.next += 1;
Some(&self.current)
}
}
fn main() {
for element in StaticIxIter::new(10) {
println!("{}", element);
}
}
It won't compile:
error[E0106]: missing lifetime specifier
  --> src/main.rs:18:17
   |
18 |     type Item = &usize;
   |                 ^ expected lifetime parameter

For iterating over a list of numbers, you might want to use Rust's range iterator.
Take a look at this iterator example, where a range is used:
for element in 0..100 {
println!("{}", element);
}
Changing this to 0..max is also perfectly fine. Don't forget to wrap the range in parentheses, like (0..100).map(...), if you want to call iterator methods on it.
About borrowing: to borrow iterator items, something has to own them. I recommend keeping your implementation as simple as possible. Why not borrow each item after the iterator has yielded it, like this?
for element in 0..100 {
println!("{}", &element);
// ^- borrow here
}
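Both points can be combined in one short sketch: a parenthesized range with an iterator method, and a borrow taken only at the point of use. The functions describe and labels are hypothetical and exist only to show a consumer that requires a &usize.

```rust
/// Hypothetical consumer that insists on a reference.
fn describe(index: &usize) -> String {
    format!("index {index}")
}

fn labels(max: usize) -> Vec<String> {
    // The range yields owned `usize` values; each one is borrowed only
    // for the duration of the call to `describe`.
    (0..max).map(|i| describe(&i)).collect()
}

fn main() {
    for label in labels(3) {
        println!("{label}");
    }
}
```

No backing array is ever allocated for the indices; the range produces each value on demand.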

How can I use `index_mut` to get a mutable reference?

Even when I implement IndexMut for my struct, I cannot get a mutable reference to an element of the struct's inner vector.
use std::ops::{Index, IndexMut};
struct Test<T> {
data: Vec<T>,
}
impl<T> Index<usize> for Test<T> {
type Output = T;
fn index<'a>(&'a self, idx: usize) -> &'a T {
return &self.data[idx];
}
}
impl<T> IndexMut<usize> for Test<T> {
fn index_mut<'a>(&'a mut self, idx: usize) -> &'a mut T {
// even here I cannot get mutable reference to self.data[idx]
return self.data.index_mut(idx);
}
}
fn main() {
let mut a: Test<i32> = Test { data: Vec::new() };
a.data.push(1);
a.data.push(2);
a.data.push(3);
let mut b = a[1];
b = 10;
// will print `[1, 2, 3]` instead of [1, 10, 3]
println!("[{}, {}, {}]", a.data[0], a.data[1], a.data[2]);
}
How can I use index_mut to get a mutable reference? Is it possible?

You're almost there. Change this:
let mut b = a[1];
b = 10;
to this:
let b = &mut a[1];
*b = 10;
Indexing syntax returns the value itself, not a reference to it. Your code extracts one i32 from your vector and modifies the variable - naturally, it does not affect the vector itself. In order to obtain a reference through the index, you need to write it explicitly.
This is fairly natural: when you use indexing to access elements of a slice or an array, you get the values of the elements, not references to them, and in order to get a reference you need to write it explicitly.
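To make the distinction concrete, the sketch below spells out what the indexing syntax stands for: reading v[idx] behaves like *v.index(idx), and writing through v[idx] behaves like *v.index_mut(idx). The helper names get_explicit and set_explicit are made up for this example, and a plain slice is used in place of the Test struct.

```rust
use std::ops::{Index, IndexMut};

/// Reads position `idx` using the explicit form of `v[idx]`.
fn get_explicit(v: &[i32], idx: usize) -> i32 {
    *v.index(idx)
}

/// Writes 10 into position `idx` using the explicit form of `v[idx] = 10`.
fn set_explicit(v: &mut [i32], idx: usize) {
    *v.index_mut(idx) = 10;
}

fn main() {
    let mut v = vec![1, 2, 3];

    // Copying the element out and changing the copy leaves `v` alone.
    let mut b = get_explicit(&v, 1);
    b += 100;
    println!("{b} {v:?}"); // 102 [1, 2, 3]

    // Going through the mutable reference changes the element in place.
    set_explicit(&mut v, 1);
    println!("{v:?}"); // [1, 10, 3]
}
```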
