Borrow checker issue in `zip`-like function with a callback - rust

I'm trying to implement a function that steps though two iterators at the same time, calling a function for each pair. This callback can control which of the iterators is advanced in each step by returning a (bool, bool) tuple. Since the iterators take a reference to a buffer in my use case, they can't implement the Iterator trait from the stdlib, but instead are used though a next_ref function, which is identical to Iterator::next, but takes an additional lifetime parameter.
// An iterator-like type, that returns references to itself
// in next_ref
struct RefIter {
value: u64
}
impl RefIter {
fn next_ref<'a>(&'a mut self) -> Option<&'a u64> {
self.value += 1;
Some(&self.value)
}
}
// Iterate over two RefIter simultaneously and call a callback
// for each pair. The callback returns a tuple of bools
// that indicate which iterators should be advanced.
fn each_zipped<F>(mut iter1: RefIter, mut iter2: RefIter, callback: F)
where F: Fn(&Option<&u64>, &Option<&u64>) -> (bool, bool)
{
let mut current1 = iter1.next_ref();
let mut current2 = iter2.next_ref();
loop {
let advance_flags = callback(&current1, &current2);
match advance_flags {
(true, true) => {
current1 = iter1.next_ref();
current2 = iter2.next_ref();
},
(true, false) => {
current1 = iter1.next_ref();
},
(false, true) => {
current2 = iter1.next_ref();
},
(false, false) => {
return
}
}
}
}
fn main() {
let mut iter1 = RefIter { value: 3 };
let mut iter2 = RefIter { value: 4 };
each_zipped(iter1, iter2, |val1, val2| {
let val1 = *val1.unwrap();
let val2 = *val2.unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:28:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
28 | current1 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter2` as mutable more than once at a time
--> src/main.rs:29:28
|
23 | let mut current2 = iter2.next_ref();
| ----- first mutable borrow occurs here
...
29 | current2 = iter2.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:32:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
32 | current1 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:35:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
35 | current2 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
I understand why it complains, but can't find a way around it. I'd appreciate any help on the subject.
Link to this snippet in the playground.

Since the iterators take a reference to a buffer in my use case, they can't implement the Iterator trait from the stdlib, but instead are used though a next_ref function, which is identical to Iterator::next, but takes an additional lifetime parameter.
You are describing a streaming iterator. There is a crate for this, aptly called streaming_iterator. The documentation describes your problem (emphasis mine):
While the standard Iterator trait's functionality is based off of
the next method, StreamingIterator's functionality is based off of
a pair of methods: advance and get. This essentially splits the
logic of next in half (in fact, StreamingIterator's next method
does nothing but call advance followed by get).
This is required because of Rust's lexical handling of borrows (more
specifically a lack of single entry, multiple exit borrows). If
StreamingIterator was defined like Iterator with just a required
next method, operations like filter would be impossible to define.
The crate does not currently have a zip function, and certainly not the variant you have described. However, it's easy enough to implement:
extern crate streaming_iterator;
use streaming_iterator::StreamingIterator;
fn each_zipped<A, B, F>(mut iter1: A, mut iter2: B, callback: F)
where
A: StreamingIterator,
B: StreamingIterator,
F: for<'a> Fn(Option<&'a A::Item>, Option<&'a B::Item>) -> (bool, bool),
{
iter1.advance();
iter2.advance();
loop {
let advance_flags = callback(iter1.get(), iter2.get());
match advance_flags {
(true, true) => {
iter1.advance();
iter2.advance();
}
(true, false) => {
iter1.advance();
}
(false, true) => {
iter1.advance();
}
(false, false) => return,
}
}
}
struct RefIter {
value: u64
}
impl StreamingIterator for RefIter {
type Item = u64;
fn advance(&mut self) {
self.value += 1;
}
fn get(&self) -> Option<&Self::Item> {
Some(&self.value)
}
}
fn main() {
let iter1 = RefIter { value: 3 };
let iter2 = RefIter { value: 4 };
each_zipped(iter1, iter2, |val1, val2| {
let val1 = *val1.unwrap();
let val2 = *val2.unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}

The issue with this code is that RefIter is being used in two ways, which are fundamentally at odds with one another:
Callers of next_ref recieve a reference to the stored value, which is tied to the lifetime of RefIter
RefIter's value needs to be mutable, so that it can be incremented on each call
This perfectly describes mutable aliasing (you're trying to modify 'value' while a reference to it is being held) - something which Rust is explicitly designed to prevent.
In order to make each_zipped work, you'll need to modify RefIter to avoid handing out references to data that you wish to mutate.
I've implemented one possibility below using a combination of RefCell and Rc:
use std::cell::RefCell;
use std::rc::Rc;
// An iterator-like type, that returns references to itself
// in next_ref
struct RefIter {
value: RefCell<Rc<u64>>
}
impl RefIter {
fn next_ref(&self) -> Option<Rc<u64>> {
let new_val = Rc::new(**self.value.borrow() + 1);
*self.value.borrow_mut() = new_val;
Some(Rc::clone(&*self.value.borrow()))
}
}
// Iterate over two RefIter simultaniously and call a callback
// for each pair. The callback returns a tuple of bools
// that indicate which iterators should be advanced.
fn each_zipped<F>(iter1: RefIter, iter2: RefIter, callback: F)
where F: Fn(&Option<Rc<u64>>, &Option<Rc<u64>>) -> (bool, bool)
{
let mut current1 = iter1.next_ref();
let mut current2 = iter2.next_ref();
loop {
let advance_flags = callback(&current1, &current2);
match advance_flags {
(true, true) => {
current1 = iter1.next_ref();
current2 = iter2.next_ref();
},
(true, false) => {
current1 = iter1.next_ref();
},
(false, true) => {
current2 = iter1.next_ref();
},
(false, false) => {
return
}
}
}
}
fn main() {
let iter1 = RefIter { value: RefCell::new(Rc::new(3)) };
let iter2 = RefIter { value: RefCell::new(Rc::new(4)) };
each_zipped(iter1, iter2, |val1, val2| {
// We can't use unwrap() directly, since we're only passed a reference to an Option
let val1 = **val1.iter().next().unwrap();
let val2 = **val2.iter().next().unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}
This version of RefIter hands out Rcs to consumers, instead of references. This avoids the issue of mutable aliasing - updating value is done by placing
a new Rc into the outer RefCell. A side effect of this is that consumers are able to hold onto an 'old' reference to the buffer (through the returned Rc), even after RefIter has advanced.

Related

How to return offset char_indices from a function

Suppose the following Rust snippet:
use std::borrow::Cow;
fn char_indices_from(s: &str, offset: usize) -> impl Iterator<Item=(usize, char)> + '_ {
s[offset..].char_indices().map(move |(i,c)| (i+offset,c))
}
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in char_indices_from(&m, offset) {
if i == 3 {
m = Cow::from("clouds and the sky");
break
}
}
}
This displeases the borrow checker:
error[E0506]: cannot assign to `m` because it is borrowed
--> src/main.rs:12:13
|
10 | for (i, c) in char_indices_from(&m, offset) {
| -----------------------------
| | |
| | borrow of `m` occurs here
| a temporary with access to the borrow is created here ...
11 | if i == 3 {
12 | m = Cow::from("clouds and the sky");
| ^ assignment to borrowed `m` occurs here
...
15 | }
| - ... and the borrow might be used here, when that temporary is dropped and runs the destructor for type `impl Iterator<Item = (usize, char)>`
Doing this, however, works just fine:
use std::borrow::Cow;
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in m[offset..].char_indices().map(|(i,c)| (i+offset, c)) {
if i == 3 {
m = Cow::from("clouds and the sky");
break
}
}
}
Those are some excellent diagnostics given by rustc. Nevertheless, I find myself confused as to how one would fix char_indices_from such that the first program satisfies Rust's borrowing rules.
Your assumption is that you can overwrite m because it's the last thing you do before break.
It's true that the Rust borrow checker is smart enough to figure this out; your second example proves this.
The borrow checker rightfully complains about the first example, though, because you forget destructors, meaning, the Drop trait. Because your return type is impl Iterator + '_, it has to assume this could be any type that implements Iterator and depends on the input lifetimes. Which includes types that use the borrowed values in their Drop implementation. This is also what the compiler tries to tell you.
You could fix that by replacing the impl return type with the actual type, proving to the borrow checker that there is no Drop implementation. Although you will also get problems with that, because your type contains a closure whose type cannot be named.
That's why usually these things return their own iterator type (for example the itertools crate, none of their functions have an impl return type).
So that's what I would do: implement your own iterator return type.
use std::{borrow::Cow, str::CharIndices};
struct CharIndicesFrom<'a> {
raw_indices: CharIndices<'a>,
offset: usize,
}
impl Iterator for CharIndicesFrom<'_> {
type Item = (usize, char);
fn next(&mut self) -> Option<Self::Item> {
self.raw_indices.next().map(|(i, c)| (i + self.offset, c))
}
}
fn char_indices_from(s: &str, offset: usize) -> CharIndicesFrom<'_> {
CharIndicesFrom {
raw_indices: s[offset..].char_indices(),
offset,
}
}
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in char_indices_from(&m, offset) {
if i == 3 {
m = Cow::from("clouds and the sky");
break;
}
}
}

Borrow inside a loop

I'm trying to learn Rust after many years of C++. I have a situation where the compiler is complaining about a borrow, and it doesn't seem to matter whether it is mutable or immutable. I don't seem to be able to use self as a parameter inside a loop that start with: for item in self.func.drain(..).I've tried calling appropriate() as a function:
Self::appropriate(&self,&item,index)
and I have tried it as a method:
self.appropriate(&item,index)
but I get the same message in either case:
The function or method appropriate() is intended imply examine the relationship among its parameters and return a bool without modifying anything. How can I call either a function or method on self without violating borrowing rules?This program is a learning exercise from exercism.org and doesn't include a main() so it won't run but should almost compile except for the error in question. Here's the code I have:
use std::collections::HashMap;
pub type Value = i32;
pub type Result = std::result::Result<(), Error>;
pub struct Forth {
v: Vec<Value>,
f: HashMap<String,usize>,
s: Vec<Vec<String>>,
func: Vec<String>
}
#[derive(Debug, PartialEq)]
pub enum Error {
DivisionByZero,
StackUnderflow,
UnknownWord,
InvalidWord,
}
impl Forth {
pub fn new() -> Forth {
let mut temp: Vec<Vec<String>> = Vec::new();
temp.push(Vec::new());
Forth{v: Vec::<Value>::new(), f: HashMap::new(), s: temp, func: Vec::new()}
}
pub fn stack(&self) -> &[Value] {
&self.v
}
pub fn eval(&mut self, input: &str) -> Result {
self.v.clear();
self.s[0].clear();
let mut count = 0;
{
let temp: Vec<&str> = input.split(' ').collect();
let n = temp.len() as i32;
for x in 0..n as usize {
self.s[0].push(String::from(temp[x]));
}
}
let mut collecting = false;
let mut xlist: Vec<(usize,usize)> = Vec::new();
let mut sx: usize = 0;
let mut z: i32 = -1;
let mut x: usize;
let mut n: usize = self.s[0].len();
loop {
count += 1;
if count > 20 {break;}
z += 1;
x = z as usize;
if x >= n {break;}
z = x as i32;
let word = &self.s[sx][x];
if word == ";" {
if collecting {
collecting = false;
let index: usize = self.s.len();
self.s.push(Vec::<String>::new());
for item in self.func.drain(..) {
if self.s[index].len() > 0 &&
Self::appropriate(&self,&item,index)
{
let sx = *self.f.get(&self.s[index][0]).unwrap();
let n = self.s[sx].len();
for x in 1..n as usize {
let symbol = self.s[sx][x].clone();
self.s[index].push(symbol);
}
}
else {
self.s[index].push(item);
}
}
self.f.insert(self.s[index][0].clone(), index);
self.func.clear();
continue;
}
if 0 < xlist.len() {
(x, n) = xlist.pop().unwrap();
continue;
}
return Err(Error::InvalidWord);
}
if collecting {
self.func.push(String::from(word));
continue;
}
if Self::is_op(word) {
if self.v.len() < 2 {
return Err(Error::StackUnderflow);
}
let b = self.v.pop().unwrap();
let a = self.v.pop().unwrap();
let c = match word.as_str() {
"+" => a + b,
"-" => a - b,
"*" => a * b,
"/" => {if b == 0 {return Err(Error::DivisionByZero);} a / b},
_ => 0
};
self.v.push(c);
continue;
}
match word.parse::<Value>() {
Ok(value) => { self.v.push(value); continue;},
_ => {}
}
if word == ":" {
collecting = true;
self.func.clear();
continue;
}
if word == "drop" {
if self.v.len() < 1 {
return Err(Error::StackUnderflow);
}
self.v.pop();
continue;
}
if word == "dup" {
if self.v.len() < 1 {
return Err(Error::StackUnderflow);
}
let temp = self.v[self.v.len() - 1];
self.v.push(temp);
continue;
}
if !self.f.contains_key(word) {
return Err(Error::UnknownWord);
}
xlist.push((sx,n));
sx = *self.f.get(word).unwrap();
n = self.s[sx].len();
z = 0;
}
Ok(())
}
fn is_op(input: &str) -> bool {
match input {"+"|"-"|"*"|"/" => true, _ => false}
}
fn appropriate(&self, item:&str, index:usize) -> bool
{
false
}
fn prev_def_is_short(&self, index: usize) -> bool {
if index >= self.s.len() {
false
}
else {
if let Some(&sx) = self.f.get(&self.func[0]) {
self.s[sx].len() == 2
}
else {
false
}
}
}
}
The error message relates to the call to appropriate(). I haven't even written the body of that function yet; I'd like to get the parameters right first. The compiler's complaint is:
As a subroutine call
error[E0502]: cannot borrow `self` as immutable because it is also borrowed as mutable
--> src/lib.rs:85:47
|
81 | for item in self.func.drain(..) {
| -------------------
| |
| mutable borrow occurs here
| mutable borrow later used here
...
85 | Self::appropriate(&self,&item,index)
| ^^^^^ immutable borrow occurs here
For more information about this error, try `rustc --explain E0502`.
as a method call
error[E0502]: cannot borrow `*self` as immutable because it is also borrowed as mutable
--> src/lib.rs:85:29
|
81 | for item in self.func.drain(..) {
| -------------------
| |
| mutable borrow occurs here
| mutable borrow later used here
...
85 | self.appropriate(&item,index)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ immutable borrow occurs here
For more information about this error, try `rustc --explain E0502`.
Is there any canonical way to deal with this situation?
The problem is that self.func.drain() will consume the elements contained in self.func, thus an exclusive (&mut) access is needed on self.func for the entire for loop.
If during the iteration you need to pass a reference to self globally, then its func member is potentially accessible while the loop holds an exclusive access to it: Rust forbids that.
Since you use drain() in order to consume all the elements inside self.func, I suggest you swap this vector with an empty one just before the loop, then iterate on this other vector that is not anymore part of self.
No copy of the content of the vector is involved here; swap() only deals with pointers.
Here is an over-simplified version of your code, adapted consequently.
struct Forth {
func: Vec<String>,
}
impl Forth {
fn eval(&mut self) {
/*
for item in self.func.drain(..) {
self.appropriate(&self);
}
*/
let mut func = Vec::new();
std::mem::swap(&mut self.func, &mut func);
for item in func.drain(..) {
let b = self.appropriate();
println!("{:?} {:?}", item, b);
}
}
fn appropriate(&self) -> bool {
false
}
}
fn main() {
let mut f = Forth {
func: vec!["aaa".into(), "bbb".into()],
};
f.eval();
}

Linked list. Can't change borrowed value

I am on my way of learning Rust. In practical purpose I decided to make my own linked list collection. But I faced with some problems soon. I tried but can't find any way to fix this problem. Can I do this using this type of structure for linked list? The main problem I faced is implementation of deleting function.This function should delete element from the list with replacing address of the current node with address of the next node. But I can't change the value because it was borrowed already.
#[derive(Debug, PartialEq)]
enum AlmostList {
Cons(i32, Box<AlmostList>),
Nil,
}
The base of linked list is enumeration.
#[derive(Debug, PartialEq)]
struct List {
length: u32,
data: AlmostList,
}
But I want the instance of list knows its own length. Because of it I decided to use structure to store the length of the list in it.
use crate::AlmostList::{Cons, Nil};
//Implementetion of functions
impl List {
fn new() -> List {
List{ length: 0_u32, data: AlmostList::Nil }
}
fn append(&mut self, elem: i32) {
let mut alist = &mut self.data;
loop {
match alist {
Cons(_, ref mut ptr) => alist = ptr,
Nil => {
let node = Cons(elem, Box::new(Nil));
self.length += 1;
*alist = node;
break;
},
}
}
}
fn show(&self) {
let mut alist = &self.data;
if let Nil = alist {
panic!("List is empty");
}
loop {
match alist {
Cons(value, ptr) => {
print!("{} ", value);
alist = &*ptr;
},
Nil => break,
}
}
}
}
Everything was fine until I started to write function of deleting elements.This function should delete element from the list with replacing address of the current node with address of the next node.
fn delete(&mut self, number: i32) -> Option<i32> {
let mut alist = &mut self.data;
let result = loop {
match alist {
Cons(value, ptr) => {
if *value != number {
alist = ptr;
}
self.length -= 1;
*alist = *ptr.clone();
break Some(number);
},
Nil => break None,
}
};
result
}
}
Got this error:
error[E0506]: cannot assign to `*alist` because it is borrowed
--> main.rs:79:25
|
73 | Cons(value, ptr) => {
| --- borrow of `*alist` occurs here
...
79 | *alist = *ptr.clone();
| ^^^^^^
| |
| assignment to borrowed `*alist` occurs here
| borrow later used here
error[E0502]: cannot borrow `*ptr` as immutable because it is also borrowed as mutable
--> main.rs:79:35
|
75 | alist = ptr;
| --- mutable borrow occurs here
...
79 | *alist = *ptr.clone();
| ------ ^^^ immutable borrow occurs here
| |
| mutable borrow later used here
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0502, E0506.
For more information about an error, try `rustc --explain E0502`.
Can I solve this problem somehow?
Yes, you can solve this problem. I fixed it in that way:
pub fn delete(&mut self, number: i32) -> Option<i32> {
let mut alist = &mut self.data;
let result = loop {
match alist {
AlmostList::Cons(value, ref mut ptr) => {
if *value != number {
alist = ptr;
} else {
break Some(ptr);
}
}
AlmostList::Nil => break None,
}
};
if let Some(ptr) = result {
*ptr = Box::new(AlmostList::Nil);
Some(number)
} else {
None
}
}
At least the code above was compiled successfully. Full rust playground example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5bd28014bdec30e5c2e851827f15dad5
Additional hints:
write better namings. Examples: AlmostLeast -> ListNode (or just Node), Nil -> None, Cons -> Some. I have feeling that you have experience with Lisp in the past :)
format your code with cargo fmt and check with cargo clippy
any data structure implementations should not panic. Use Result type
Update:
Code above is wrong. I've implemented it using recursion: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ea6dbd3a627a59014df823be29e296b2

Double for loop mutable references

I want to use a double For loop to compare each element with each and if these properties are the same, make a change to the objects. Below is a little demo code of what I want to do.
But the Rust compiler tells me here that I can't have 2 mutable references from the same object. How can I implement the whole thing differently?
fn main() {
struct S {
a: i32
}
let mut v = vec![S{a: 1}, S{a: 1}, S{a: 1}];
let size = v.len();
for i in 0..size {
for j in 0..size {
if i == j {
continue;
}
let a = &mut v[i];
let b = &mut v[j];
if a.a == b.a {
a.a = 5;
}
}
}
}
Console:
error[E0499]: cannot borrow `v` as mutable more than once at a time
--> src\main.rs:35:26
|
34 | let a = &mut v[i];
| - first mutable borrow occurs here
35 | let b = &mut v[j];
| ^ second mutable borrow occurs here
36 |
37 | if a.a == b.a {
| --- first borrow later used here
First thing to note is that you don't need two mutable references since you are only mutating v[i].
Writing it like this will create two immutable borrows and inside the if a mutable borrow:
if v[i].a == v[j].a {
v[i].a = 5;
}
Being very explicit this is what is happening (but its nicer the other way):
let a = &v[i];
let b = &v[j];
if a.a == b.a {
let mut_a = &mut v[i];
mut_a.a = 5;
}
One of the rules of the borrow checker is that no two mutable borrows can exist to the same thing at the same time.

How to process every value in a HashMap and optionally reject some?

I would like to process the values from a HashMap one by one, while maybe removing some of them.
For example, I would like to do an equivalent of:
use std::collections::HashMap;
fn example() {
let mut to_process = HashMap::new();
to_process.insert(1, true);
loop {
// get an arbitrary element
let ans = to_process.iter().next().clone(); // get an item from the hash
match ans {
Some((k, v)) => {
if condition(&k,&v) {
to_process.remove(&k);
}
}
None => break, // work finished
}
}
}
But this fails to compile:
error[E0502]: cannot borrow `to_process` as mutable because it is also borrowed as immutable
--> src/lib.rs:12:17
|
9 | let ans = to_process.iter().next().clone();
| ---------- immutable borrow occurs here
...
12 | to_process.remove(&k);
| ^^^^^^^^^^^------^^^^
| | |
| | immutable borrow later used by call
| mutable borrow occurs here
I know I really would need https://github.com/rust-lang/rust/issues/27804 (which is for HashSet but for HashMap would be the same)
and I cannot implement the provided solutions without having a non-mut and mutable reference still or using unsafe.
Is there a simple way I am missing?
Note If you need to alter keys or add kvps to the HashMap during processing, see #edwardw's answer. Otherwise ...
Use HashMap::retain. You can change your process function to return a bool indicating whether to keep that key value pair. For example
let mut to_process: HashMap<u32, String> = HashMap::new();
to_process.insert(1, "ok".to_string());
to_process.insert(2, "bad".to_string());
to_process.retain(process);
fn process(k: &u32, v: &mut String) -> bool {
// do stuff with k and v
v == "ok"
}
This looks like an awfully good fit for Iterator::filter_map:
The closure must return an Option<T>. filter_map creates an iterator which calls this closure on each element. If the closure returns Some(element), then that element is returned. If the closure returns None, it will try again, and call the closure on the next element, seeing if it will return Some.
The following process_and_maybe_add is very simple, but you get the idea:
use std::collections::HashMap;
fn main() {
let mut data = HashMap::new();
data.insert(1, "a");
data.insert(2, "b");
data.insert(3, "c");
let processed = data
.into_iter()
.filter_map(process_and_maybe_add)
.collect::<HashMap<_, _>>();
dbg!(processed);
}
fn process_and_maybe_add((k, v): (u32, &str)) -> Option<(u32, String)> {
if k % 2 != 0 {
Some((k + 100, v.to_owned() + v))
} else {
None
}
}

Resources