Borrow two mutable values from the same HashMap - rust

I have the following code:
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &mut HashMap<i32, HashSet<i32>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get_mut(&start).unwrap();
let pipes = conns.get(&num).unwrap();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
The logic is not very important, I'm trying to create a function which will itself and walk over pipes.
My issue is that this doesn't compile:
error[E0502]: cannot borrow `*conns` as immutable because it is also borrowed as mutable
--> src/main.rs:10:17
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- mutable borrow occurs here
10 | let pipes = conns.get(&num).unwrap();
| ^^^^^ immutable borrow occurs here
...
19 | }
| - mutable borrow ends here
error[E0499]: cannot borrow `*conns` as mutable more than once at a time
--> src/main.rs:16:46
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- first mutable borrow occurs here
...
16 | populate_connections(start, num, conns, ancs);
| ^^^^^ second mutable borrow occurs here
...
19 | }
| - first borrow ends here
I don't know how to make it work. At the beginning, I'm trying to get two HashSets stored in a HashMap (orig_conns and pipes).
Rust won't let me have both mutable and immutable variables at the same time. I'm confused a bit because this will be completely different objects but I guess if &start == &num, then I would have two different references to the same object (one mutable, one immutable).
Thats ok, but then how can I achieve this? I want to iterate over one HashSet and read and modify other one. Let's assume that they won't be the same HashSet.

If you can change your datatypes and your function signature, you can use a RefCell to create interior mutability:
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &HashMap<i32, RefCell<HashSet<i32>>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get(&start).unwrap().borrow_mut();
let pipes = conns.get(&num).unwrap().borrow();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
Note that if start == num, the thread will panic because this is an attempt to have both mutable and immutable access to the same HashSet.
Safe alternatives to RefCell
Depending on your exact data and code needs, you can also use types like Cell or one of the atomics. These have lower memory overhead than a RefCell and only a small effect on codegen.
In multithreaded cases, you may wish to use a Mutex or RwLock.

Use hashbrown::HashMap
If you can switch to using hashbrown, you may be able to use a method like get_many_mut:
use hashbrown::HashMap; // 0.12.1
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
if let Some([a, b]) = map.get_many_mut([&1, &2]) {
std::mem::swap(a, b);
}
dbg!(&map);
}
As hashbrown is what powers the standard library hashmap, this is also available in nightly Rust as HashMap::get_many_mut.
Unsafe code
If you can guarantee that your two indices are different, you can use unsafe code and avoid interior mutability:
use std::collections::HashMap;
fn get_mut_pair<'a, K, V>(conns: &'a mut HashMap<K, V>, a: &K, b: &K) -> (&'a mut V, &'a mut V)
where
K: Eq + std::hash::Hash,
{
unsafe {
let a = conns.get_mut(a).unwrap() as *mut _;
let b = conns.get_mut(b).unwrap() as *mut _;
assert_ne!(a, b, "The two keys must not resolve to the same value");
(&mut *a, &mut *b)
}
}
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
let (a, b) = get_mut_pair(&mut map, &1, &2);
std::mem::swap(a, b);
dbg!(&map);
}
Similar code can be found in libraries like multi_mut.
This code tries to have an abundance of caution. An assertion enforces that the two values are distinct pointers before converting them back into mutable references and we explicitly add lifetimes to the returned variables.
You should understand the nuances of unsafe code before blindly using this solution. Notably, previous versions of this answer were incorrect. Thanks to #oberien for finding the unsoundness in the original implementation of this and proposing a fix. This playground demonstrates how purely safe Rust code could cause the old code to result in memory unsafety.
An enhanced version of this solution could accept an array of keys and return an array of values:
fn get_mut_pair<'a, K, V, const N: usize>(conns: &'a mut HashMap<K, V>, mut ks: [&K; N]) -> [&'a mut V; N]
It becomes more difficult to ensure that all the incoming keys are unique, however.
Note that this function doesn't attempt to solve the original problem, which is vastly more complex than verifying that two indices are disjoint. The original problem requires:
tracking three disjoint borrows, two of which are mutable and one that is immutable.
tracking the recursive call
must not modify the HashMap in any way which would cause resizing, which would invalidate any of the existing references from a previous level.
must not alias any of the references from a previous level.
Using something like RefCell is a much simpler way to ensure you do not trigger memory unsafety.

Related

Rust not allowing mutable borrow when splitting properly

struct Test {
a: i32,
b: i32,
}
fn other(x: &mut i32, _refs: &Vec<&i32>) {
*x += 1;
}
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut refs: Vec<&i32> = Vec::new();
for y in &xes {
refs.push(&y.a);
}
xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
}
Although refs only holds references to the a-member of the elements in xes and the function other uses the b-member, rust produces following error:
error[E0502]: cannot borrow `xes` as mutable because it is also borrowed as immutable
--> /src/main.rs:16:5
|
13 | for y in &xes {
| ---- immutable borrow occurs here
...
16 | xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
| ^^^ mutable borrow occurs here ---- immutable borrow later captured here by closure
Playground
Is there something wrong with the closure? Usually splitting borrows should allow this. What am I missing?
Splitting borrows only works from within one function. Here, though, you're borrowing field a in main and field b in the closure (which, apart from being able to consume and borrow variables from the outer scope, is a distinct function).
As of Rust 1.43.1, function signatures cannot express fine-grained borrows; when a reference is passed (directly or indirectly) to a function, it gets access to all of it. Borrow checking across functions is based on function signatures; this is in part for performance (inference across functions is more costly), in part for ensuring compatibility as a function evolves (especially in a library): what constitutes a valid argument to the function shouldn't depend on the function's implementation.
As I understand it, your requirement is that you need to be able to update field b of your objects based on the value of field a of the whole set of objects.
I see two ways to fix this. First, we can capture all mutable references to b at the same time as we capture the shared references to a. This is a proper example of splitting borrows. A downside of this approach is that we need to allocate two Vecs just to perform the operation.
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut x_as: Vec<&i32> = Vec::new();
let mut x_bs: Vec<&mut i32> = Vec::new();
for x in &mut xes {
x_as.push(&x.a);
x_bs.push(&mut x.b);
}
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Here's an equivalent way of building the two Vecs using iterators:
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let (x_as, mut x_bs): (Vec<_>, Vec<_>) =
xes.iter_mut().map(|x| (&x.a, &mut x.b)).unzip();
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Another way is to avoid mutable references completely and to use interior mutability instead. The standard library has Cell, which works well for Copy types such as i32, RefCell, which works for all types but does borrowing checking at runtime, adding some slight overhead, and Mutex and RwLock, which can be used in multiple threads but perform lock checks at runtime so at most one thread gets access to the inner value at any time.
Here's an example with Cell. We can eliminate the two temporary Vecs with this approach, and we can pass the whole collection of objects to the other function instead of just references to the a field.
use std::cell::Cell;
struct Test {
a: i32,
b: Cell<i32>,
}
fn other(x: &Cell<i32>, refs: &[Test]) {
x.set(x.get() + 1);
}
fn main() {
let xes: Vec<Test> = vec![Test { a: 3, b: Cell::new(5) }];
xes.iter().for_each(|x| other(&x.b, &xes));
}

Is there a way to use the value of an element from a map to update another element without interior mutability? [duplicate]

I have the following code:
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &mut HashMap<i32, HashSet<i32>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get_mut(&start).unwrap();
let pipes = conns.get(&num).unwrap();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
The logic is not very important, I'm trying to create a function which will itself and walk over pipes.
My issue is that this doesn't compile:
error[E0502]: cannot borrow `*conns` as immutable because it is also borrowed as mutable
--> src/main.rs:10:17
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- mutable borrow occurs here
10 | let pipes = conns.get(&num).unwrap();
| ^^^^^ immutable borrow occurs here
...
19 | }
| - mutable borrow ends here
error[E0499]: cannot borrow `*conns` as mutable more than once at a time
--> src/main.rs:16:46
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- first mutable borrow occurs here
...
16 | populate_connections(start, num, conns, ancs);
| ^^^^^ second mutable borrow occurs here
...
19 | }
| - first borrow ends here
I don't know how to make it work. At the beginning, I'm trying to get two HashSets stored in a HashMap (orig_conns and pipes).
Rust won't let me have both mutable and immutable variables at the same time. I'm confused a bit because this will be completely different objects but I guess if &start == &num, then I would have two different references to the same object (one mutable, one immutable).
Thats ok, but then how can I achieve this? I want to iterate over one HashSet and read and modify other one. Let's assume that they won't be the same HashSet.
If you can change your datatypes and your function signature, you can use a RefCell to create interior mutability:
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &HashMap<i32, RefCell<HashSet<i32>>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get(&start).unwrap().borrow_mut();
let pipes = conns.get(&num).unwrap().borrow();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
Note that if start == num, the thread will panic because this is an attempt to have both mutable and immutable access to the same HashSet.
Safe alternatives to RefCell
Depending on your exact data and code needs, you can also use types like Cell or one of the atomics. These have lower memory overhead than a RefCell and only a small effect on codegen.
In multithreaded cases, you may wish to use a Mutex or RwLock.
Use hashbrown::HashMap
If you can switch to using hashbrown, you may be able to use a method like get_many_mut:
use hashbrown::HashMap; // 0.12.1
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
if let Some([a, b]) = map.get_many_mut([&1, &2]) {
std::mem::swap(a, b);
}
dbg!(&map);
}
As hashbrown is what powers the standard library hashmap, this is also available in nightly Rust as HashMap::get_many_mut.
Unsafe code
If you can guarantee that your two indices are different, you can use unsafe code and avoid interior mutability:
use std::collections::HashMap;
fn get_mut_pair<'a, K, V>(conns: &'a mut HashMap<K, V>, a: &K, b: &K) -> (&'a mut V, &'a mut V)
where
K: Eq + std::hash::Hash,
{
unsafe {
let a = conns.get_mut(a).unwrap() as *mut _;
let b = conns.get_mut(b).unwrap() as *mut _;
assert_ne!(a, b, "The two keys must not resolve to the same value");
(&mut *a, &mut *b)
}
}
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
let (a, b) = get_mut_pair(&mut map, &1, &2);
std::mem::swap(a, b);
dbg!(&map);
}
Similar code can be found in libraries like multi_mut.
This code tries to have an abundance of caution. An assertion enforces that the two values are distinct pointers before converting them back into mutable references and we explicitly add lifetimes to the returned variables.
You should understand the nuances of unsafe code before blindly using this solution. Notably, previous versions of this answer were incorrect. Thanks to #oberien for finding the unsoundness in the original implementation of this and proposing a fix. This playground demonstrates how purely safe Rust code could cause the old code to result in memory unsafety.
An enhanced version of this solution could accept an array of keys and return an array of values:
fn get_mut_pair<'a, K, V, const N: usize>(conns: &'a mut HashMap<K, V>, mut ks: [&K; N]) -> [&'a mut V; N]
It becomes more difficult to ensure that all the incoming keys are unique, however.
Note that this function doesn't attempt to solve the original problem, which is vastly more complex than verifying that two indices are disjoint. The original problem requires:
tracking three disjoint borrows, two of which are mutable and one that is immutable.
tracking the recursive call
must not modify the HashMap in any way which would cause resizing, which would invalidate any of the existing references from a previous level.
must not alias any of the references from a previous level.
Using something like RefCell is a much simpler way to ensure you do not trigger memory unsafety.

How do I remove a value from a BTreeSet by using a reference to the value to remove? [duplicate]

I would like to remove items from a BTreeMap which have been found through iteration.
As it is not possible to remove items while iterating, I put the items to delete into a vector. The main issue is that it is not possible to use a vector of references, but only a vector of values. All the keys for which the entry has to be removed must then be cloned (assuming the key implements the Clone trait).
For instance, this short sample does not compile:
use std::collections::BTreeMap;
pub fn clean() {
let mut map = BTreeMap::<String, i32>::new();
let mut to_delete = Vec::new();
{
for (k, v) in map.iter() {
if *v > 10 {
to_delete.push(k);
}
}
}
for k in to_delete.drain(..) {
map.remove(k);
}
}
fn main() {}
It generates the following errors when it is compiled:
error[E0502]: cannot borrow `map` as mutable because it is also borrowed as immutable
--> src/main.rs:17:9
|
9 | for (k, v) in map.iter() {
| --- immutable borrow occurs here
...
17 | map.remove(k);
| ^^^ mutable borrow occurs here
18 | }
19 | }
| - immutable borrow ends here
Changing to_delete.push(k) with to_delete.push(k.clone()) makes this snippet compile correctly but it is quite costly if each key to delete must be cloned.
Is there a better solution?
TL;DR: you cannot.
As far as the compiler is concerned, the implementation of BTreeMap::remove might do this:
pub fn remove<Q>(&mut self, key: &Q) -> Option<V>
where
K: Borrow<Q>,
Q: Ord + ?Sized,
{
// actual deleting code, which destroys the value in the set
// now what `value` pointed to is gone and `value` points to invalid memory
// And now we access that memory, causing undefined behavior
key.borrow();
}
The compiler thus has to prevent using the reference to the value when the collection will be mutated.
To do this, you'd need something like the hypothetical "cursor" API for collections. This would allow you to iterate over the collection, returning a special type that hold the mutable innards of the collection. This type could give you a reference to check against and then allow you to remove the item.
I'd probably look at the problem from a bit different direction. Instead of trying to keep the map, I'd just create a brand new one:
use std::collections::BTreeMap;
pub fn main() {
let mut map = BTreeMap::new();
map.insert("thief", 5);
map.insert("troll", 52);
map.insert("gnome", 7);
let map: BTreeMap<_, _> =
map.into_iter()
.filter(|&(_, v)| v <= 10)
.collect();
println!("{:?}", map); // troll is gone
}
If your condition is equality on the field that makes the struct unique (a.k.a is the only field used in PartialEq and Hash), you can implement Borrow for your type and directly grab it / delete it:
use std::collections::BTreeMap;
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Monster(String);
use std::borrow::Borrow;
impl Borrow<str> for Monster {
fn borrow(&self) -> &str { &self.0 }
}
pub fn main() {
let mut map = BTreeMap::new();
map.insert(Monster("thief".into()), 5);
map.insert(Monster("troll".into()), 52);
map.insert(Monster("gnome".into()), 7);
map.remove("troll");
println!("{:?}", map); // troll is gone
}
See also:
How to implement HashMap with two keys?
How to avoid temporary allocations when using a complex key for a HashMap?
As of 1.53.0, there is a BTreeMap.retain method which allows to remove/retain items based on return value of a closure, while iterating:
map.retain(|_, v| *v<=10);

Why does HashMap have iter_mut() but HashSet doesn't?

What is the design rationale for supplying an iter_mut function for HashMap but not HashSet in Rust?
Would it be a faux pas to roll one's own (assuming that can even be done)?
Having one could alleviate situations that give rise to
previous borrow of X occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of X until the borrow ends
Example
An extremely convoluted example (Gist) that does not show-case why the parameter passing is the way that it is. Has a short comment explaining the pain-point:
use std::collections::HashSet;
fn derp(v: i32, unprocessed: &mut HashSet<i32>) {
if unprocessed.contains(&v) {
// Pretend that v has been processed
unprocessed.remove(&v);
}
}
fn herp(v: i32) {
let mut unprocessed: HashSet<i32> = HashSet::new();
unprocessed.insert(v);
// I need to iterate over the unprocessed values
while let Some(u) = unprocessed.iter().next() {
// And them pass them mutably to another function
// as I will process the values inside derp and
// remove them from the set.
//
// This is an extremely convoluted example but
// I need for derp to be a separate function
// as I will employ recursion there, as it is
// much more succinct than an iterative version.
derp(*u, &mut unprocessed);
}
}
fn main() {
println!("Hello, world!");
herp(10);
}
The statement
while let Some(u) = unprocessed.iter().next() {
is an immutable borrow, hence
derp(*u, &mut unprocessed);
is impossible as unprocessed cannot be borrowed mutably. The immutable borrow does not end until the end of the while-loop.
I have tried to use this as reference and essentially ended up with trying to fool the borrow checker through various permutations of assignments, enclosing braces, but due to the coupling of the intended expressions the problem remains.
You have to think about what HashSet actually is. The IterMut that you get from HashMap::iter_mut() is only mutable on the value part: (&key, &mut val), ((&'a K, &'a mut V))
HashSet is basically a HashMap<T, ()>, so the actual values are the keys, and if you would modify the keys the hash of them would have to be updated or you get an invalid HashMap.
If your HashSet contains a Copy type, such as i32, you can work on a copy of the value to release the borrow on the HashSet early. To do this, you need to eliminate all borrows from the bindings in the while let expression. In your original code, u is of type &i32, and it keeps borrowing from unprocessed until the end of the loop. If we change the pattern to Some(&u), then u is of type i32, which doesn't borrow from anything, so we're free to use unprocessed as we like.
fn herp(v: i32) {
let mut unprocessed: HashSet<i32> = HashSet::new();
unprocessed.insert(v);
while let Some(&u) = unprocessed.iter().next() {
derp(u, &mut unprocessed);
}
}
If the type is not Copy or is too expensive to copy/clone, you can wrap them in Rc or Arc, and clone them as you iterate on them using cloned() (cloning an Rc or Arc doesn't clone the underlying value, it just clones the Rc pointer and increments the reference counter).
use std::collections::HashSet;
use std::rc::Rc;
fn derp(v: &i32, unprocessed: &mut HashSet<Rc<i32>>) {
if unprocessed.contains(v) {
unprocessed.remove(v);
}
}
fn herp(v: Rc<i32>) {
let mut unprocessed: HashSet<Rc<i32>> = HashSet::new();
unprocessed.insert(v);
while let Some(u) = unprocessed.iter().cloned().next() {
// If you don't use u afterwards,
// you could also pass if by value to derp.
derp(&u, &mut unprocessed);
}
}
fn main() {
println!("Hello, world!");
herp(Rc::new(10));
}

Removing items from a BTreeMap or BTreeSet found through iteration

I would like to remove items from a BTreeMap which have been found through iteration.
As it is not possible to remove items while iterating, I put the items to delete into a vector. The main issue is that it is not possible to use a vector of references, but only a vector of values. All the keys for which the entry has to be removed must then be cloned (assuming the key implements the Clone trait).
For instance, this short sample does not compile:
use std::collections::BTreeMap;
pub fn clean() {
let mut map = BTreeMap::<String, i32>::new();
let mut to_delete = Vec::new();
{
for (k, v) in map.iter() {
if *v > 10 {
to_delete.push(k);
}
}
}
for k in to_delete.drain(..) {
map.remove(k);
}
}
fn main() {}
It generates the following errors when it is compiled:
error[E0502]: cannot borrow `map` as mutable because it is also borrowed as immutable
--> src/main.rs:17:9
|
9 | for (k, v) in map.iter() {
| --- immutable borrow occurs here
...
17 | map.remove(k);
| ^^^ mutable borrow occurs here
18 | }
19 | }
| - immutable borrow ends here
Changing to_delete.push(k) with to_delete.push(k.clone()) makes this snippet compile correctly but it is quite costly if each key to delete must be cloned.
Is there a better solution?
TL;DR: you cannot.
As far as the compiler is concerned, the implementation of BTreeMap::remove might do this:
pub fn remove<Q>(&mut self, key: &Q) -> Option<V>
where
K: Borrow<Q>,
Q: Ord + ?Sized,
{
// actual deleting code, which destroys the value in the set
// now what `value` pointed to is gone and `value` points to invalid memory
// And now we access that memory, causing undefined behavior
key.borrow();
}
The compiler thus has to prevent using the reference to the value when the collection will be mutated.
To do this, you'd need something like the hypothetical "cursor" API for collections. This would allow you to iterate over the collection, returning a special type that hold the mutable innards of the collection. This type could give you a reference to check against and then allow you to remove the item.
I'd probably look at the problem from a bit different direction. Instead of trying to keep the map, I'd just create a brand new one:
use std::collections::BTreeMap;
pub fn main() {
let mut map = BTreeMap::new();
map.insert("thief", 5);
map.insert("troll", 52);
map.insert("gnome", 7);
let map: BTreeMap<_, _> =
map.into_iter()
.filter(|&(_, v)| v <= 10)
.collect();
println!("{:?}", map); // troll is gone
}
If your condition is equality on the field that makes the struct unique (a.k.a is the only field used in PartialEq and Hash), you can implement Borrow for your type and directly grab it / delete it:
use std::collections::BTreeMap;
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Monster(String);
use std::borrow::Borrow;
impl Borrow<str> for Monster {
fn borrow(&self) -> &str { &self.0 }
}
pub fn main() {
let mut map = BTreeMap::new();
map.insert(Monster("thief".into()), 5);
map.insert(Monster("troll".into()), 52);
map.insert(Monster("gnome".into()), 7);
map.remove("troll");
println!("{:?}", map); // troll is gone
}
See also:
How to implement HashMap with two keys?
How to avoid temporary allocations when using a complex key for a HashMap?
As of 1.53.0, there is a BTreeMap.retain method which allows to remove/retain items based on return value of a closure, while iterating:
map.retain(|_, v| *v<=10);

Resources