Running into an ownership issue when attempting to reference multiple values from a HashMap in a struct as parameters in a function call. Here is a PoC of the issue.
use std::collections::HashMap;
struct Resource {
map: HashMap<String, String>,
}
impl Resource {
pub fn new() -> Self {
Resource {
map: HashMap::new(),
}
}
pub fn load(&mut self, key: String) -> &mut String {
self.map.get_mut(&key).unwrap()
}
}
fn main() {
// Initialize struct containing a HashMap.
let mut res = Resource {
map: HashMap::new(),
};
res.map.insert("Item1".to_string(), "Value1".to_string());
res.map.insert("Item2".to_string(), "Value2".to_string());
// This compiles and runs.
let mut value1 = res.load("Item1".to_string());
single_parameter(value1);
let mut value2 = res.load("Item2".to_string());
single_parameter(value2);
// This has ownership issues.
// multi_parameter(value1, value2);
}
fn single_parameter(value: &String) {
println!("{}", *value);
}
fn multi_parameter(value1: &mut String, value2: &mut String) {
println!("{}", *value1);
println!("{}", *value2);
}
Uncommenting multi_parameter results in the following error:
28 | let mut value1 = res.load("Item1".to_string());
| --- first mutable borrow occurs here
29 | single_parameter(value1);
30 | let mut value2 = res.load("Item2".to_string());
| ^^^ second mutable borrow occurs here
...
34 | multi_parameter(value1, value2);
| ------ first borrow later used here
It would technically be possible for me to break up the function calls (using the single_parameter function approach), but it would be more convenient to pass the
variables to a single function call.
For additional context, the actual program where I'm encountering this issue is an SDL2 game where I'm attempting to pass multiple textures into a single function call to be drawn, where the texture data may be modified within the function.
This is currently not possible, without resorting to unsafe code or interior mutability at least. There is no way for the compiler to know if two calls to load will yield mutable references to different data as it cannot always infer the value of the key. In theory, mutably borrowing both res.map["Item1"] and res.map["Item2"] would be fine as they would refer to different values in the map, but there is no way for the compiler to know this at compile time.
The easiest way to do this, as already mentioned, is to use a structure that allows interior mutability, like RefCell, which typically enforces the memory safety rules at run-time before returning a borrow of the wrapped value. You can also work around the borrow checker in this case by dealing with mut pointers in unsafe code:
pub fn load_many<'a, const N: usize>(&'a mut self, keys: [&str; N]) -> [&'a mut String; N] {
// TODO: Assert that keys are distinct, so that we don't return
// multiple references to the same value
keys.map(|key| self.load(key) as *mut _)
.map(|ptr| unsafe { &mut *ptr })
}
Rust Playground
The TODO is important, as this assertion is the only way to ensure that the safety invariant of only having one mutable reference to any value at any time is upheld.
It is, however, almost always better (and easier) to use a known safe interior mutation abstraction like RefCell rather than writing your own unsafe code.
Related
struct Test {
a: i32,
b: i32,
}
fn other(x: &mut i32, _refs: &Vec<&i32>) {
*x += 1;
}
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut refs: Vec<&i32> = Vec::new();
for y in &xes {
refs.push(&y.a);
}
xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
}
Although refs only holds references to the a-member of the elements in xes and the function other uses the b-member, rust produces following error:
error[E0502]: cannot borrow `xes` as mutable because it is also borrowed as immutable
--> /src/main.rs:16:5
|
13 | for y in &xes {
| ---- immutable borrow occurs here
...
16 | xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
| ^^^ mutable borrow occurs here ---- immutable borrow later captured here by closure
Playground
Is there something wrong with the closure? Usually splitting borrows should allow this. What am I missing?
Splitting borrows only works from within one function. Here, though, you're borrowing field a in main and field b in the closure (which, apart from being able to consume and borrow variables from the outer scope, is a distinct function).
As of Rust 1.43.1, function signatures cannot express fine-grained borrows; when a reference is passed (directly or indirectly) to a function, it gets access to all of it. Borrow checking across functions is based on function signatures; this is in part for performance (inference across functions is more costly), in part for ensuring compatibility as a function evolves (especially in a library): what constitutes a valid argument to the function shouldn't depend on the function's implementation.
As I understand it, your requirement is that you need to be able to update field b of your objects based on the value of field a of the whole set of objects.
I see two ways to fix this. First, we can capture all mutable references to b at the same time as we capture the shared references to a. This is a proper example of splitting borrows. A downside of this approach is that we need to allocate two Vecs just to perform the operation.
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut x_as: Vec<&i32> = Vec::new();
let mut x_bs: Vec<&mut i32> = Vec::new();
for x in &mut xes {
x_as.push(&x.a);
x_bs.push(&mut x.b);
}
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Here's an equivalent way of building the two Vecs using iterators:
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let (x_as, mut x_bs): (Vec<_>, Vec<_>) =
xes.iter_mut().map(|x| (&x.a, &mut x.b)).unzip();
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Another way is to avoid mutable references completely and to use interior mutability instead. The standard library has Cell, which works well for Copy types such as i32, RefCell, which works for all types but does borrowing checking at runtime, adding some slight overhead, and Mutex and RwLock, which can be used in multiple threads but perform lock checks at runtime so at most one thread gets access to the inner value at any time.
Here's an example with Cell. We can eliminate the two temporary Vecs with this approach, and we can pass the whole collection of objects to the other function instead of just references to the a field.
use std::cell::Cell;
struct Test {
a: i32,
b: Cell<i32>,
}
fn other(x: &Cell<i32>, refs: &[Test]) {
x.set(x.get() + 1);
}
fn main() {
let xes: Vec<Test> = vec![Test { a: 3, b: Cell::new(5) }];
xes.iter().for_each(|x| other(&x.b, &xes));
}
I have the following code:
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &mut HashMap<i32, HashSet<i32>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get_mut(&start).unwrap();
let pipes = conns.get(&num).unwrap();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
The logic is not very important, I'm trying to create a function which will itself and walk over pipes.
My issue is that this doesn't compile:
error[E0502]: cannot borrow `*conns` as immutable because it is also borrowed as mutable
--> src/main.rs:10:17
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- mutable borrow occurs here
10 | let pipes = conns.get(&num).unwrap();
| ^^^^^ immutable borrow occurs here
...
19 | }
| - mutable borrow ends here
error[E0499]: cannot borrow `*conns` as mutable more than once at a time
--> src/main.rs:16:46
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- first mutable borrow occurs here
...
16 | populate_connections(start, num, conns, ancs);
| ^^^^^ second mutable borrow occurs here
...
19 | }
| - first borrow ends here
I don't know how to make it work. At the beginning, I'm trying to get two HashSets stored in a HashMap (orig_conns and pipes).
Rust won't let me have both mutable and immutable variables at the same time. I'm confused a bit because this will be completely different objects but I guess if &start == &num, then I would have two different references to the same object (one mutable, one immutable).
Thats ok, but then how can I achieve this? I want to iterate over one HashSet and read and modify other one. Let's assume that they won't be the same HashSet.
If you can change your datatypes and your function signature, you can use a RefCell to create interior mutability:
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &HashMap<i32, RefCell<HashSet<i32>>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get(&start).unwrap().borrow_mut();
let pipes = conns.get(&num).unwrap().borrow();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
Note that if start == num, the thread will panic because this is an attempt to have both mutable and immutable access to the same HashSet.
Safe alternatives to RefCell
Depending on your exact data and code needs, you can also use types like Cell or one of the atomics. These have lower memory overhead than a RefCell and only a small effect on codegen.
In multithreaded cases, you may wish to use a Mutex or RwLock.
Use hashbrown::HashMap
If you can switch to using hashbrown, you may be able to use a method like get_many_mut:
use hashbrown::HashMap; // 0.12.1
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
if let Some([a, b]) = map.get_many_mut([&1, &2]) {
std::mem::swap(a, b);
}
dbg!(&map);
}
As hashbrown is what powers the standard library hashmap, this is also available in nightly Rust as HashMap::get_many_mut.
Unsafe code
If you can guarantee that your two indices are different, you can use unsafe code and avoid interior mutability:
use std::collections::HashMap;
fn get_mut_pair<'a, K, V>(conns: &'a mut HashMap<K, V>, a: &K, b: &K) -> (&'a mut V, &'a mut V)
where
K: Eq + std::hash::Hash,
{
unsafe {
let a = conns.get_mut(a).unwrap() as *mut _;
let b = conns.get_mut(b).unwrap() as *mut _;
assert_ne!(a, b, "The two keys must not resolve to the same value");
(&mut *a, &mut *b)
}
}
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
let (a, b) = get_mut_pair(&mut map, &1, &2);
std::mem::swap(a, b);
dbg!(&map);
}
Similar code can be found in libraries like multi_mut.
This code tries to have an abundance of caution. An assertion enforces that the two values are distinct pointers before converting them back into mutable references and we explicitly add lifetimes to the returned variables.
You should understand the nuances of unsafe code before blindly using this solution. Notably, previous versions of this answer were incorrect. Thanks to #oberien for finding the unsoundness in the original implementation of this and proposing a fix. This playground demonstrates how purely safe Rust code could cause the old code to result in memory unsafety.
An enhanced version of this solution could accept an array of keys and return an array of values:
fn get_mut_pair<'a, K, V, const N: usize>(conns: &'a mut HashMap<K, V>, mut ks: [&K; N]) -> [&'a mut V; N]
It becomes more difficult to ensure that all the incoming keys are unique, however.
Note that this function doesn't attempt to solve the original problem, which is vastly more complex than verifying that two indices are disjoint. The original problem requires:
tracking three disjoint borrows, two of which are mutable and one that is immutable.
tracking the recursive call
must not modify the HashMap in any way which would cause resizing, which would invalidate any of the existing references from a previous level.
must not alias any of the references from a previous level.
Using something like RefCell is a much simpler way to ensure you do not trigger memory unsafety.
I have the following code:
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &mut HashMap<i32, HashSet<i32>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get_mut(&start).unwrap();
let pipes = conns.get(&num).unwrap();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
The logic is not very important, I'm trying to create a function which will itself and walk over pipes.
My issue is that this doesn't compile:
error[E0502]: cannot borrow `*conns` as immutable because it is also borrowed as mutable
--> src/main.rs:10:17
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- mutable borrow occurs here
10 | let pipes = conns.get(&num).unwrap();
| ^^^^^ immutable borrow occurs here
...
19 | }
| - mutable borrow ends here
error[E0499]: cannot borrow `*conns` as mutable more than once at a time
--> src/main.rs:16:46
|
9 | let mut orig_conns = conns.get_mut(&start).unwrap();
| ----- first mutable borrow occurs here
...
16 | populate_connections(start, num, conns, ancs);
| ^^^^^ second mutable borrow occurs here
...
19 | }
| - first borrow ends here
I don't know how to make it work. At the beginning, I'm trying to get two HashSets stored in a HashMap (orig_conns and pipes).
Rust won't let me have both mutable and immutable variables at the same time. I'm confused a bit because this will be completely different objects but I guess if &start == &num, then I would have two different references to the same object (one mutable, one immutable).
Thats ok, but then how can I achieve this? I want to iterate over one HashSet and read and modify other one. Let's assume that they won't be the same HashSet.
If you can change your datatypes and your function signature, you can use a RefCell to create interior mutability:
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
fn populate_connections(
start: i32,
num: i32,
conns: &HashMap<i32, RefCell<HashSet<i32>>>,
ancs: &mut HashSet<i32>,
) {
let mut orig_conns = conns.get(&start).unwrap().borrow_mut();
let pipes = conns.get(&num).unwrap().borrow();
for pipe in pipes.iter() {
if !ancs.contains(pipe) && !orig_conns.contains(pipe) {
ancs.insert(*pipe);
orig_conns.insert(*pipe);
populate_connections(start, num, conns, ancs);
}
}
}
fn main() {}
Note that if start == num, the thread will panic because this is an attempt to have both mutable and immutable access to the same HashSet.
Safe alternatives to RefCell
Depending on your exact data and code needs, you can also use types like Cell or one of the atomics. These have lower memory overhead than a RefCell and only a small effect on codegen.
In multithreaded cases, you may wish to use a Mutex or RwLock.
Use hashbrown::HashMap
If you can switch to using hashbrown, you may be able to use a method like get_many_mut:
use hashbrown::HashMap; // 0.12.1
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
if let Some([a, b]) = map.get_many_mut([&1, &2]) {
std::mem::swap(a, b);
}
dbg!(&map);
}
As hashbrown is what powers the standard library hashmap, this is also available in nightly Rust as HashMap::get_many_mut.
Unsafe code
If you can guarantee that your two indices are different, you can use unsafe code and avoid interior mutability:
use std::collections::HashMap;
fn get_mut_pair<'a, K, V>(conns: &'a mut HashMap<K, V>, a: &K, b: &K) -> (&'a mut V, &'a mut V)
where
K: Eq + std::hash::Hash,
{
unsafe {
let a = conns.get_mut(a).unwrap() as *mut _;
let b = conns.get_mut(b).unwrap() as *mut _;
assert_ne!(a, b, "The two keys must not resolve to the same value");
(&mut *a, &mut *b)
}
}
fn main() {
let mut map = HashMap::new();
map.insert(1, true);
map.insert(2, false);
dbg!(&map);
let (a, b) = get_mut_pair(&mut map, &1, &2);
std::mem::swap(a, b);
dbg!(&map);
}
Similar code can be found in libraries like multi_mut.
This code tries to have an abundance of caution. An assertion enforces that the two values are distinct pointers before converting them back into mutable references and we explicitly add lifetimes to the returned variables.
You should understand the nuances of unsafe code before blindly using this solution. Notably, previous versions of this answer were incorrect. Thanks to #oberien for finding the unsoundness in the original implementation of this and proposing a fix. This playground demonstrates how purely safe Rust code could cause the old code to result in memory unsafety.
An enhanced version of this solution could accept an array of keys and return an array of values:
fn get_mut_pair<'a, K, V, const N: usize>(conns: &'a mut HashMap<K, V>, mut ks: [&K; N]) -> [&'a mut V; N]
It becomes more difficult to ensure that all the incoming keys are unique, however.
Note that this function doesn't attempt to solve the original problem, which is vastly more complex than verifying that two indices are disjoint. The original problem requires:
tracking three disjoint borrows, two of which are mutable and one that is immutable.
tracking the recursive call
must not modify the HashMap in any way which would cause resizing, which would invalidate any of the existing references from a previous level.
must not alias any of the references from a previous level.
Using something like RefCell is a much simpler way to ensure you do not trigger memory unsafety.
I want my method of struct to perform in a synchronized way. I wanted to do this by using Mutex (Playground):
use std::sync::Mutex;
use std::collections::BTreeMap;
pub struct A {
map: BTreeMap<String, String>,
mutex: Mutex<()>,
}
impl A {
pub fn new() -> A {
A {
map: BTreeMap::new(),
mutex: Mutex::new(()),
}
}
}
impl A {
fn synchronized_call(&mut self) {
let mutex_guard_res = self.mutex.try_lock();
if mutex_guard_res.is_err() {
return
}
let mut _mutex_guard = mutex_guard_res.unwrap(); // safe because of check above
let mut lambda = |text: String| {
let _ = self.map.insert("hello".to_owned(),
"d".to_owned());
};
lambda("dd".to_owned());
}
}
Error message:
error[E0500]: closure requires unique access to `self` but `self.mutex` is already borrowed
--> <anon>:23:26
|
18 | let mutex_guard_res = self.mutex.try_lock();
| ---------- borrow occurs here
...
23 | let mut lambda = |text: String| {
| ^^^^^^^^^^^^^^ closure construction occurs here
24 | if let Some(m) = self.map.get(&text) {
| ---- borrow occurs due to use of `self` in closure
...
31 | }
| - borrow ends here
As I understand when we borrow anything from the struct we are unable to use other struct's fields till our borrow is finished. But how can I do method synchronization then?
The closure needs a mutable reference to the self.map in order to insert something into it. But closure capturing works with whole bindings only. This means, that if you say self.map, the closure attempts to capture self, not self.map. And self can't be mutably borrowed/captured, because parts of self are already immutably borrowed.
We can solve this closure-capturing problem by introducing a new binding for the map alone such that the closure is able to capture it (Playground):
let mm = &mut self.map;
let mut lambda = |text: String| {
let _ = mm.insert("hello".to_owned(), text);
};
lambda("dd".to_owned());
However, there is something you overlooked: since synchronized_call() accepts &mut self, you don't need the mutex! Why? Mutable references are also called exclusive references, because the compiler can assure at compile time that there is only one such mutable reference at any given time.
Therefore you statically know, that there is at most one instance of synchronized_call() running on one specific object at any given time, if the function is not recursive (calls itself).
If you have mutable access to a mutex, you know that the mutex is unlocked. See the Mutex::get_mut() method for more explanation. Isn't that amazing?
Rust mutexes do not work the way you are trying to use them. In Rust, a mutex protects specific data relying on the borrow-checking mechanism used elsewhere in the language. As a consequence, declaring a field Mutex<()> doesn't make sense, because it is protecting read-write access to the () unit object that has no values to mutate.
As Lukas explained, your call_synchronized as declared doesn't need to do synchronization because its signature already requests an exclusive (mutable) reference to self, which prevents it from being invoked from multiple threads on the same object. In other words, you need to change the signature of call_synchronized because the current one does not match the functionality it is intended to provide.
call_synchronized needs to accept a shared reference to self, which will signal to Rust that it can be called from multiple threads in the first place. Inside call_synchronized a call to Mutex::lock will simultaneously lock the mutex and provide a mutable reference to the underlying data, carefully scoped so that the lock is held for the duration of the reference:
use std::sync::Mutex;
use std::collections::BTreeMap;
pub struct A {
synced_map: Mutex<BTreeMap<String, String>>,
}
impl A {
pub fn new() -> A {
A {
synced_map: Mutex::new(BTreeMap::new()),
}
}
}
impl A {
fn synchronized_call(&self) {
let mut map = self.synced_map.lock().unwrap();
// omitting the lambda for brevity, but it would also work
// (as long as it refers to map rather than self.map)
map.insert("hello".to_owned(), "d".to_owned());
}
}
I'm learning Rust and I'm trying to cargo-cult this code into compiling:
use std::vec::Vec;
use std::collections::BTreeMap;
struct Occ {
docnum: u64,
weight: f32,
}
struct PostWriter<'a> {
bytes: Vec<u8>,
occurrences: BTreeMap<&'a [u8], Vec<Occ>>,
}
impl<'a> PostWriter<'a> {
fn new() -> PostWriter<'a> {
PostWriter {
bytes: Vec::new(),
occurrences: BTreeMap::new(),
}
}
fn add_occurrence(&'a mut self, term: &[u8], occ: Occ) {
let occurrences = &mut self.occurrences;
match occurrences.get_mut(term) {
Some(x) => x.push(occ),
None => {
// Add the term bytes to the big vector of all terms
let termstart = self.bytes.len();
self.bytes.extend(term);
// Create a new occurrences vector
let occs = vec![occ];
// Take the appended term as a slice to use as a key
// ERROR: cannot borrow `*occurrences` as mutable more than once at a time
occurrences.insert(&self.bytes[termstart..], occs);
}
}
}
}
fn main() {}
I get an error:
error[E0499]: cannot borrow `*occurrences` as mutable more than once at a time
--> src/main.rs:34:17
|
24 | match occurrences.get_mut(term) {
| ----------- first mutable borrow occurs here
...
34 | occurrences.insert(&self.bytes[termstart..], occs);
| ^^^^^^^^^^^ second mutable borrow occurs here
35 | }
36 | }
| - first borrow ends here
I don't understand... I'm just calling a method on a mutable reference, why would that line involve borrowing?
I'm just calling a method on a mutable reference, why would that line involve borrowing?
When you call a method on an object that's going to mutate the object, you can't have any other references to that object outstanding. If you did, your mutation could invalidate those references and leave your program in an inconsistent state. For example, say that you had gotten a value out of your hashmap and then added a new value. Adding the new value hits a magic limit and forces memory to be reallocated, your value now points off to nowhere! When you use that value... bang goes the program!
In this case, it looks like you want to do the relatively common "append or insert if missing" operation. You will want to use entry for that:
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
{
let nicknames = map.entry("joe").or_insert(Vec::new());
nicknames.push("shmoe");
// Using scoping to indicate that we are done with borrowing `nicknames`
// If we didn't, then we couldn't borrow map as
// immutable because we could still change it via `nicknames`
}
println!("{:?}", map)
}
Because you're calling a method that borrows as mutable
I had a similar question yesterday about Hash, until I noticed something in the docs. The docs for BTreeMap show a method signature for insert starting with fn insert(&mut self..
So when you call .insert, you're implicitly asking that function to borrow the BTreeMap as mutable.