Rust: Modify value in HashMap while immutably borrowing the whole HashMap - rust

I'm trying to learn Rust by using it in a project of mine.
However, I've been struggling with the borrow checker quite a bit in some code which has a very similar form to the following:
use std::collections::HashMap;
use std::pin::Pin;
use std::vec::Vec;
struct MyStruct<'a> {
value: i32,
substructs: Option<Vec<Pin<&'a MyStruct<'a>>>>,
}
struct Toplevel<'a> {
my_structs: HashMap<String, Pin<Box<MyStruct<'a>>>>,
}
fn main() {
let mut toplevel = Toplevel {
my_structs: HashMap::new(),
};
// First pass: add the elements to the HashMap
toplevel.my_structs.insert(
"abc".into(),
Pin::new(Box::new(MyStruct {
value: 0,
substructs: None,
})),
);
toplevel.my_structs.insert(
"def".into(),
Pin::new(Box::new(MyStruct {
value: 5,
substructs: None,
})),
);
toplevel.my_structs.insert(
"ghi".into(),
Pin::new(Box::new(MyStruct {
value: -7,
substructs: None,
})),
);
// Second pass: for each MyStruct, add substructs
let subs = vec![
toplevel.my_structs.get("abc").unwrap().as_ref(),
toplevel.my_structs.get("def").unwrap().as_ref(),
toplevel.my_structs.get("ghi").unwrap().as_ref(),
];
toplevel.my_structs.get_mut("abc").unwrap().substructs = Some(subs);
}
When compiling, I get the following message:
error[E0502]: cannot borrow `toplevel.my_structs` as mutable because it is also borrowed as immutable
--> src/main.rs:48:5
|
44 | toplevel.my_structs.get("abc").unwrap().as_ref(),
| ------------------- immutable borrow occurs here
...
48 | toplevel.my_structs.get_mut("abc").unwrap().substructs = Some(subs);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^--------------------
| |
| mutable borrow occurs here
| immutable borrow later used here
I think I understand why this happens: toplevel.my_structs.get_mut(...) borrows toplevel.my_structs as mutable. However, in the same block, toplevel.my_structs.get(...) also borrows toplevel.my_structs (though this time as immutable).
I also see how this would indeed be a problem if the function which borrows &mut toplevel.my_structs, say, added a new key.
However, all that is done here in the &mut toplevel.my_structs borrow is modify the value corresponding to a specific key, which shouldn't change memory layout (and that's guaranteed, thanks to Pin). Right?
Is there a way to communicate this to the compiler, so that I can compile this code? This appears to be somewhat similar to what motivates the hashmap::Entry API, but I need to be able to access other keys as well, not only the one I want to modify.

Your current problem is about conflicting mutable and immutable borrows, but there's a deeper problem here. This data structure cannot work for what you're trying to do:
struct MyStruct<'a> {
value: i32,
substructs: Option<Vec<Pin<&'a MyStruct<'a>>>>,
}
struct Toplevel<'a> {
my_structs: HashMap<String, Pin<Box<MyStruct<'a>>>>,
}
Any time a type has a lifetime parameter, that lifetime necessarily outlives (or lives exactly as long as) the values of that type. A container Toplevel<'a> which contains references &'a MyStruct must refer to MyStructs which were created before the Toplevel — unless you're using special tools like an arena allocator.
(It's possible to straightforwardly build a tree of references, but they must be constructed leaves first and not using a recursive algorithm; this is usually impractical for dynamic input data.)
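A minimal sketch of that leaves-first construction, assuming a hypothetical `Node` type and `total` helper that are not part of the question's code. Every child must already exist before the parent that borrows it is built:

```rust
// Hypothetical tree node that borrows its children instead of owning them.
struct Node<'a> {
    value: i32,
    children: Vec<&'a Node<'a>>,
}

// Recursively sums the values in the tree rooted at `node`.
fn total(node: &Node) -> i32 {
    node.value + node.children.iter().map(|child| total(child)).sum::<i32>()
}

// Builds the leaves first, then a root that borrows them.
fn leaves_first_total() -> i32 {
    let leaf_a = Node { value: 5, children: Vec::new() };
    let leaf_b = Node { value: -7, children: Vec::new() };
    let root = Node { value: 0, children: vec![&leaf_a, &leaf_b] };
    total(&root)
}
```

Adding a child to `leaf_a` after `root` has borrowed it would be rejected by the borrow checker, which is why this shape is awkward for dynamic input.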
In general, references are not really suitable for creating data structures; rather they're for temporarily “borrowing” parts of data structures.
In your case, if you want to have a collection of all the MyStructs and also be able to add connections between them after they are created, you need both shared ownership and interior mutability:
use std::collections::HashMap;
use std::cell::RefCell;
use std::rc::Rc;
struct MyStruct {
value: i32,
substructs: Option<Vec<Rc<RefCell<MyStruct>>>>,
}
struct Toplevel {
my_structs: HashMap<String, Rc<RefCell<MyStruct>>>,
}
The shared ownership via Rc allows both Toplevel and any number of MyStructs to refer to other MyStructs. The interior mutability via RefCell allows the MyStruct's substructs field to be modified even while it's being referred to from other elements of the overall data structure.
Given these definitions, you can write the code that you wanted:
fn main() {
let mut toplevel = Toplevel {
my_structs: HashMap::new(),
};
// First pass: add the elements to the HashMap
toplevel.my_structs.insert(
"abc".into(),
Rc::new(RefCell::new(MyStruct {
value: 0,
substructs: None,
})),
);
toplevel.my_structs.insert(
"def".into(),
Rc::new(RefCell::new(MyStruct {
value: 5,
substructs: None,
})),
);
toplevel.my_structs.insert(
"ghi".into(),
Rc::new(RefCell::new(MyStruct {
value: -7,
substructs: None,
})),
);
// Second pass: for each MyStruct, add substructs
let subs = vec![
toplevel.my_structs["abc"].clone(),
toplevel.my_structs["def"].clone(),
toplevel.my_structs["ghi"].clone(),
];
toplevel.my_structs["abc"].borrow_mut().substructs = Some(subs);
}
Note that because you're having "abc" refer to itself, this creates a reference cycle, which will not be freed when the Toplevel is dropped. To fix this, you can impl Drop for Toplevel and explicitly remove all the substructs references.
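One way the `Drop` approach could look, as a sketch rather than a definitive implementation. The `strong_count_after_drop` function is a hypothetical probe added only to show that the self-reference is gone once the `Toplevel` has been dropped:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<Rc<RefCell<MyStruct>>>>,
}

struct Toplevel {
    my_structs: HashMap<String, Rc<RefCell<MyStruct>>>,
}

impl Drop for Toplevel {
    fn drop(&mut self) {
        for node in self.my_structs.values() {
            // Take the list out first so the RefCell borrow ends before the
            // list (and the Rc handles inside it) is dropped.
            let taken = node.borrow_mut().substructs.take();
            drop(taken);
        }
    }
}

// Hypothetical probe: builds the structure from the answer, drops it, and
// reports how many strong references to "abc" are left.
fn strong_count_after_drop() -> usize {
    let mut toplevel = Toplevel { my_structs: HashMap::new() };
    for (key, value) in [("abc", 0), ("def", 5), ("ghi", -7)] {
        toplevel.my_structs.insert(
            key.to_string(),
            Rc::new(RefCell::new(MyStruct { value, substructs: None })),
        );
    }
    let subs: Vec<_> = ["abc", "def", "ghi"]
        .iter()
        .map(|key| Rc::clone(&toplevel.my_structs[*key]))
        .collect();
    toplevel.my_structs["abc"].borrow_mut().substructs = Some(subs);

    let probe = Rc::clone(&toplevel.my_structs["abc"]);
    drop(toplevel);
    Rc::strong_count(&probe)
}
```

Only the probe's own handle remains after the drop. Without the `Drop` impl, the cycle through "abc" would keep one extra strong reference alive. Note that `borrow_mut` panics if a node is still borrowed elsewhere when the `Toplevel` is dropped.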
Another option, arguably more 'Rusty', is to store keys (acting as indices) for the cross-references instead of references. This has several pros and cons:
Adds the cost of additional hash lookups.
Removes the cost of reference counting and interior mutability.
Can have “dangling references”: a key could be removed from the map, invalidating the references to it.
use std::collections::HashMap;
struct MyStruct {
value: i32,
substructs: Option<Vec<String>>,
}
struct Toplevel {
my_structs: HashMap<String, MyStruct>,
}
fn main() {
let mut toplevel = Toplevel {
my_structs: HashMap::new(),
};
// First pass: add the elements to the HashMap
toplevel.my_structs.insert(
"abc".into(),
MyStruct {
value: 0,
substructs: None,
},
);
toplevel.my_structs.insert(
"def".into(),
MyStruct {
value: 5,
substructs: None,
},
);
toplevel.my_structs.insert(
"ghi".into(),
MyStruct {
value: -7,
substructs: None,
},
);
// Second pass: for each MyStruct, add substructs
toplevel.my_structs.get_mut("abc").unwrap().substructs =
Some(vec!["abc".into(), "def".into(), "ghi".into()]);
}
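With this layout, following a cross-reference is another map lookup. The sketch below assumes a hypothetical `substruct_sum` helper (not in the original answer) that resolves the stored keys and skips any that have since been removed from the map:

```rust
use std::collections::HashMap;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<String>>,
}

struct Toplevel {
    my_structs: HashMap<String, MyStruct>,
}

impl Toplevel {
    // Hypothetical helper: sums the values of the substructs of `key`.
    // Returns None if `key` is unknown or has no substructs; dangling
    // keys inside the list are silently skipped.
    fn substruct_sum(&self, key: &str) -> Option<i32> {
        let node = self.my_structs.get(key)?;
        let keys = node.substructs.as_ref()?;
        Some(
            keys.iter()
                .filter_map(|k| self.my_structs.get(k))
                .map(|sub| sub.value)
                .sum::<i32>(),
        )
    }
}

// Builds the same three-element map as the example above.
fn sample_toplevel() -> Toplevel {
    let mut toplevel = Toplevel { my_structs: HashMap::new() };
    for (key, value) in [("abc", 0), ("def", 5), ("ghi", -7)] {
        toplevel
            .my_structs
            .insert(key.to_string(), MyStruct { value, substructs: None });
    }
    toplevel.my_structs.get_mut("abc").unwrap().substructs =
        Some(vec!["abc".into(), "def".into(), "ghi".into()]);
    toplevel
}
```

Whether a dangling key should be skipped, reported, or treated as a bug is a design choice; skipping is just the simplest option for a sketch.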

In your code, you are attempting to modify a value referenced in the vector as immutable, which is not allowed. You could store mutable references in the vector instead, and mutate them directly, like this:
let subs = vec![
toplevel.my_structs.get_mut("abc").unwrap(),
toplevel.my_structs.get_mut("def").unwrap(),
toplevel.my_structs.get_mut("ghi").unwrap(),
];
(*subs[0]).substructs = Some(subs.clone());
However, it's easier (although more expensive) to store clones of the structs instead of references:
let subs = vec![
toplevel.my_structs.get("abc").unwrap().clone(),
toplevel.my_structs.get("def").unwrap().clone(),
toplevel.my_structs.get("ghi").unwrap().clone(),
];
(*toplevel.my_structs.get_mut("abc").unwrap()).substructs = Some(subs);

Related

Multiple Mutable Borrows from Struct Hashmap

Running into an ownership issue when attempting to reference multiple values from a HashMap in a struct as parameters in a function call. Here is a PoC of the issue.
use std::collections::HashMap;
struct Resource {
map: HashMap<String, String>,
}
impl Resource {
pub fn new() -> Self {
Resource {
map: HashMap::new(),
}
}
pub fn load(&mut self, key: String) -> &mut String {
self.map.get_mut(&key).unwrap()
}
}
fn main() {
// Initialize struct containing a HashMap.
let mut res = Resource {
map: HashMap::new(),
};
res.map.insert("Item1".to_string(), "Value1".to_string());
res.map.insert("Item2".to_string(), "Value2".to_string());
// This compiles and runs.
let mut value1 = res.load("Item1".to_string());
single_parameter(value1);
let mut value2 = res.load("Item2".to_string());
single_parameter(value2);
// This has ownership issues.
// multi_parameter(value1, value2);
}
fn single_parameter(value: &String) {
println!("{}", *value);
}
fn multi_parameter(value1: &mut String, value2: &mut String) {
println!("{}", *value1);
println!("{}", *value2);
}
Uncommenting multi_parameter results in the following error:
28 | let mut value1 = res.load("Item1".to_string());
| --- first mutable borrow occurs here
29 | single_parameter(value1);
30 | let mut value2 = res.load("Item2".to_string());
| ^^^ second mutable borrow occurs here
...
34 | multi_parameter(value1, value2);
| ------ first borrow later used here
It would technically be possible for me to break up the function calls (using the single_parameter function approach), but it would be more convenient to pass the variables to a single function call.
For additional context, the actual program where I'm encountering this issue is an SDL2 game where I'm attempting to pass multiple textures into a single function call to be drawn, where the texture data may be modified within the function.
This is currently not possible, without resorting to unsafe code or interior mutability at least. There is no way for the compiler to know if two calls to load will yield mutable references to different data as it cannot always infer the value of the key. In theory, mutably borrowing both res.map["Item1"] and res.map["Item2"] would be fine as they would refer to different values in the map, but there is no way for the compiler to know this at compile time.
The easiest way to do this, as already mentioned, is to use a structure that allows interior mutability, like RefCell, which typically enforces the memory safety rules at run-time before returning a borrow of the wrapped value. You can also work around the borrow checker in this case by dealing with mut pointers in unsafe code:
pub fn load_many<'a, const N: usize>(&'a mut self, keys: [&str; N]) -> [&'a mut String; N] {
// TODO: Assert that keys are distinct, so that we don't return
// multiple references to the same value
keys.map(|key| self.load(key) as *mut _)
.map(|ptr| unsafe { &mut *ptr })
}
The TODO is important, as this assertion is the only way to ensure that the safety invariant of only having one mutable reference to any value at any time is upheld.
It is, however, almost always better (and easier) to use a known safe interior mutation abstraction like RefCell rather than writing your own unsafe code.
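A minimal sketch of the RefCell route, assuming the map's values are changed to `RefCell<String>` (a change to the question's `Resource` type, not something the original code does). `load` then only needs `&self`, and the uniqueness check moves to run time:

```rust
use std::cell::{RefCell, RefMut};
use std::collections::HashMap;

struct Resource {
    map: HashMap<String, RefCell<String>>,
}

impl Resource {
    // Only needs `&self`: RefCell checks uniqueness at run time.
    // Panics if the key is missing or the value is already borrowed.
    fn load(&self, key: &str) -> RefMut<'_, String> {
        self.map.get(key).expect("unknown key").borrow_mut()
    }
}

fn multi_parameter(value1: &mut String, value2: &mut String) {
    value1.push_str(" (drawn)");
    value2.push_str(" (drawn)");
}

// Borrows two different entries mutably at the same time.
fn draw_both() -> (String, String) {
    let mut res = Resource { map: HashMap::new() };
    res.map.insert("Item1".to_string(), RefCell::new("Value1".to_string()));
    res.map.insert("Item2".to_string(), RefCell::new("Value2".to_string()));
    {
        let mut value1 = res.load("Item1");
        let mut value2 = res.load("Item2");
        multi_parameter(&mut value1, &mut value2);
    }
    let first = res.map["Item1"].borrow().clone();
    let second = res.map["Item2"].borrow().clone();
    (first, second)
}
```

Calling `load` twice with the same key while the first guard is still alive would panic instead of producing two aliasing mutable references, which is the run-time counterpart of the compile-time error in the question.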

Amend Vector of structs within hashmap in rust. Explicit lifetime annotation in return statement

I have the following code:
#![allow(unused)]
#![allow(unused_must_use)]
use std::collections::HashMap;
#[derive(Clone, Debug)]
struct Product {
name: String,
description: Option<String>,
barcode: String,
price: String
}
fn main() {
println!("Loading product list");
let mut h: HashMap<&str, Vec<Product>> = HashMap::new();
let plastic_bag = Product{ name: "Plastic Bag".to_string(), description: None, barcode: "0001A".to_string(), price: "4.50".to_string() };
let recyclable_bag = Product{ name: "Recyclable Bag".to_string(), description: None, barcode: "0001B".to_string(), price: "15.50".to_string() };
let category = vec![recyclable_bag, plastic_bag];
h.insert("checkout", category);
println!("{:#?}", h);
let mut h = make_free("checkout", &h);
println!("{:#?}", h);
}
fn make_free<'a>(category: &'a str, checkout_category: &'a mut HashMap<&str, Vec<Product>>) -> &'a mut HashMap<&'a str, Vec<Product>> {
let mut category = checkout_category.get_mut(category).unwrap();
for product in category {
product.price = "0.00".to_string();
println!("{:#?}", product);
}
return checkout_category
}
I have a pre-filled list of products, and a function that should set the price of every product in the given category of the borrowed map to $0.
I receive the following two errors:
Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
--> src/main.rs:31:38
|
31 | let mut h = make_free("checkout", &h);
| ^^ types differ in mutability
|
= note: expected mutable reference `&mut std::collections::HashMap<&str, std::vec::Vec<Product>>`
found reference `&std::collections::HashMap<&str, std::vec::Vec<Product>>`
error[E0621]: explicit lifetime required in the type of `checkout_category`
--> src/main.rs:45:11
|
36 | fn make_free<'a>(category: &'a str, checkout_category: &'a mut HashMap<&str, Vec<Product>>) -> &'a mut HashMap<&'a str, Vec<Product>> {
| ----------------------------------- help: add explicit lifetime `'a` to the type of `checkout_category`: `&'a mut std::collections::HashMap<&'a str, std::vec::Vec<Product>>`
...
45 | return checkout_category
| ^^^^^^^^^^^^^^^^^ lifetime `'a` required
I'm really confused on how you can add a lifetime specifier to the return statement, and why the types are different.
There are a number of problems with this code and they all relate to Rust's concept of ownership and shared and mutable references. I recommend reading the Understanding Ownership section of the free The Rust Programming Language book or even better the respective sections in "Programming Rust" from O'Reilly.
In summary, the most important rules regarding ownership are as follows:
(1) Values have a single owner. Re-assigning a value to a new variable moves the value and makes the previously owning variable invalid/unusable. Rust tracks this at compile time. When the variable that owns a value goes out of scope, the value is dropped (=deleted).
(2) There can be many shared references (e.g. &HashMap<..>) to a value. While one or more shared references exist, the value is immutable.
(3) Alternatively, there can be one (and only one) mutable reference (e.g. &mut HashMap<..>) to a value. While a mutable reference exists, no other reference (shared or mutable) can exist. Mutable references are unique.
(4) References must never outlive the value they refer to.
(There are ways to bend these rules, but these are the base rules in Rust and important to understand.)
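A small sketch of these rules in action. The function name and the string contents are made up for illustration:

```rust
fn ownership_rules_demo() -> (usize, String) {
    // Single owner: assigning moves the value, so `original` is
    // unusable after this line.
    let original = String::from("bag");
    let mut owner = original;

    // Shared references: any number may coexist, but the value cannot
    // be modified while they are in use.
    let first = &owner;
    let second = &owner;
    let combined_len = first.len() + second.len();

    // Mutable reference: allowed here because `first` and `second`
    // are not used again. Only one may exist at a time.
    let unique = &mut owner;
    unique.push('s');

    // No reference outlives `owner`: all of them are dead before the
    // value is moved out of the function.
    (combined_len, owner)
}
```

Using `first` after the `push` call, or using `original` after the move, would each be rejected at compile time.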
The other part that is confusing in Rust are the differences between String and &str. Again, I recommend reading more on this, but the gist is
String owns a string value on the heap
&str is a reference to a string that somebody else owns.
Now, looking at the code
let mut h: HashMap<&str, Vec<Product>> = HashMap::new();
This part is a little bit weird (it might be what you wanted, but probably not): the variable h owns the HashMap, but the HashMap does not own its keys; it only holds references to them. Per rule (4), h must not live longer than any of the keys put into it. In practice, this makes the hashmap awkward to use with anything but &str references that live for the whole program. These are called &'static str, and the most common examples are string literals.
To get a HashMap which owns its keys (the usual case), you'd use
let mut h: HashMap<String, Vec<Product>> = HashMap::new();
// ^^^^^^-- instead of &str
Now the make_free function wants to modify the hashmap it receives. There are two idiomatic ways of doing this: (a) take ownership of the hashmap and return a new hashmap or (b) take a mutable reference &mut HashMap<..> and modify it in place but don't return anything. In this case using a mutable reference would be more natural:
fn make_free(category: &str, checkout_category: &mut HashMap<String, Vec<Product>>) {
let mut category = checkout_category.get_mut(category).unwrap();
for product in category {
product.price = "0.00".to_string();
println!("{:#?}", product);
}
}
Notice that you don't need lifetimes in this case, you only need them when you return a reference (which is rare).
Using Strings owned by the hashmap and this version of make_free the code becomes (playground link):
#![allow(unused)]
#![allow(unused_must_use)]
use std::collections::HashMap;
#[derive(Clone, Debug)]
struct Product {
name: String,
description: Option<String>,
barcode: String,
price: String
}
fn main() {
println!("Loading product list");
let mut h: HashMap<String, Vec<Product>> = HashMap::new();
let plastic_bag = Product{ name: "Plastic Bag".to_string(), description: None, barcode: "0001A".to_string(), price: "4.50".to_string() };
let recyclable_bag = Product{ name: "Recyclable Bag".to_string(), description: None, barcode: "0001B".to_string(), price: "15.50".to_string() };
let category = vec![recyclable_bag, plastic_bag];
h.insert("checkout".to_string(), category);
println!("{:#?}", h);
make_free("checkout", &mut h);
println!("{:#?}", h);
}
fn make_free(category: &str, checkout_category: &mut HashMap<String, Vec<Product>>) {
let mut category = checkout_category.get_mut(category).unwrap();
for product in category {
product.price = "0.00".to_string();
println!("{:#?}", product);
}
}
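For comparison, option (a), taking ownership of the hashmap and returning it, might look like the sketch below. The `Product` struct is trimmed to two fields for brevity, and `free_checkout_prices` is a hypothetical driver, not part of the original answer:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
struct Product {
    name: String,
    price: String,
}

// Takes the map by value, modifies it, and hands it back to the caller.
fn make_free(
    category: &str,
    mut products: HashMap<String, Vec<Product>>,
) -> HashMap<String, Vec<Product>> {
    if let Some(list) = products.get_mut(category) {
        for product in list {
            product.price = "0.00".to_string();
        }
    }
    products
}

// Hypothetical driver: returns the prices in "checkout" after make_free.
fn free_checkout_prices() -> Vec<String> {
    let mut h = HashMap::new();
    h.insert(
        "checkout".to_string(),
        vec![
            Product { name: "Recyclable Bag".to_string(), price: "15.50".to_string() },
            Product { name: "Plastic Bag".to_string(), price: "4.50".to_string() },
        ],
    );
    // The old `h` is moved into the call; the result is bound to a new `h`.
    let h = make_free("checkout", h);
    let prices: Vec<String> = h["checkout"].iter().map(|p| p.price.clone()).collect();
    prices
}
```

No lifetimes are needed here either, because no reference is returned. Unlike the `unwrap` version, this sketch silently does nothing when the category is missing.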
Ownership and lifetime problems aside, your main issue is that the input parameter is missing the lifetime on the first type argument: you have HashMap<&str, Vec<Product>> rather than HashMap<&'a str, Vec<Product>> which is the return type. Therefore the inferred lifetime of checkout_category is not 'a, which is required by the method signature. Adding the missing 'a fixes this.
Next, your function make_free takes an &mut reference but you borrowed h as an & reference. This is easily fixed:
let mut h = make_free("checkout", &mut h);
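Putting both fixes together, a version that keeps the &str keys and still returns the reference could be sketched as follows. It uses a second lifetime parameter `'k` for the keys, which is a variation on the single-`'a` fix described above (an assumption of this sketch, chosen because it is less restrictive for callers). `Product` is trimmed to two fields, and `first_price_after_make_free` is a hypothetical driver:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
struct Product {
    name: String,
    price: String,
}

// 'a is the lifetime of the mutable borrow of the map,
// 'k is the lifetime of the borrowed keys stored inside it.
fn make_free<'a, 'k>(
    category: &str,
    checkout_category: &'a mut HashMap<&'k str, Vec<Product>>,
) -> &'a mut HashMap<&'k str, Vec<Product>> {
    if let Some(products) = checkout_category.get_mut(category) {
        for product in products {
            product.price = "0.00".to_string();
        }
    }
    checkout_category
}

// Hypothetical driver: note the `&mut h` at the call site.
fn first_price_after_make_free() -> String {
    let mut h: HashMap<&str, Vec<Product>> = HashMap::new();
    h.insert(
        "checkout",
        vec![Product { name: "Plastic Bag".to_string(), price: "4.50".to_string() }],
    );
    let returned = make_free("checkout", &mut h);
    let price = returned["checkout"][0].price.clone();
    price
}
```

The parameter type and the return type now name the same key lifetime, which is what the E0621 error was asking for.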

Rust not allowing mutable borrow when splitting properly

struct Test {
a: i32,
b: i32,
}
fn other(x: &mut i32, _refs: &Vec<&i32>) {
*x += 1;
}
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut refs: Vec<&i32> = Vec::new();
for y in &xes {
refs.push(&y.a);
}
xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
}
Although refs only holds references to the a-member of the elements in xes and the function other uses the b-member, rust produces following error:
error[E0502]: cannot borrow `xes` as mutable because it is also borrowed as immutable
--> /src/main.rs:16:5
|
13 | for y in &xes {
| ---- immutable borrow occurs here
...
16 | xes.iter_mut().for_each(|val| other(&mut val.b, &refs));
| ^^^ mutable borrow occurs here ---- immutable borrow later captured here by closure
Is there something wrong with the closure? Usually splitting borrows should allow this. What am I missing?
Splitting borrows only works from within one function. Here, though, you're borrowing field a in main and field b in the closure (which, apart from being able to consume and borrow variables from the outer scope, is a distinct function).
As of Rust 1.43.1, function signatures cannot express fine-grained borrows; when a reference is passed (directly or indirectly) to a function, it gets access to all of it. Borrow checking across functions is based on function signatures; this is in part for performance (inference across functions is more costly), in part for ensuring compatibility as a function evolves (especially in a library): what constitutes a valid argument to the function shouldn't depend on the function's implementation.
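For contrast, here is a sketch of a split borrow that the compiler does accept, because both field borrows are created inside the same function body. The `add_into` and `split_within_one_function` names are made up for this illustration:

```rust
struct Test {
    a: i32,
    b: i32,
}

// Receives the two fields as separate references, so its signature
// never claims access to the whole struct.
fn add_into(target: &mut i32, amount: &i32) {
    *target += *amount;
}

fn split_within_one_function() -> i32 {
    let mut t = Test { a: 3, b: 5 };
    // The compiler sees both borrows in one body and knows that
    // `t.a` and `t.b` are disjoint.
    let a_ref = &t.a;
    let b_ref = &mut t.b;
    add_into(b_ref, a_ref);
    t.b
}
```

The difference from the question's code is that there the shared borrows were taken through `&xes` and the mutable ones through `xes.iter_mut()`, each of which borrows the whole Vec.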
As I understand it, your requirement is that you need to be able to update field b of your objects based on the value of field a of the whole set of objects.
I see two ways to fix this. First, we can capture all mutable references to b at the same time as we capture the shared references to a. This is a proper example of splitting borrows. A downside of this approach is that we need to allocate two Vecs just to perform the operation.
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let mut x_as: Vec<&i32> = Vec::new();
let mut x_bs: Vec<&mut i32> = Vec::new();
for x in &mut xes {
x_as.push(&x.a);
x_bs.push(&mut x.b);
}
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Here's an equivalent way of building the two Vecs using iterators:
fn main() {
let mut xes: Vec<Test> = vec![Test { a: 3, b: 5 }];
let (x_as, mut x_bs): (Vec<_>, Vec<_>) =
xes.iter_mut().map(|x| (&x.a, &mut x.b)).unzip();
x_bs.iter_mut().for_each(|b| other(b, &x_as));
}
Another way is to avoid mutable references completely and to use interior mutability instead. The standard library has Cell, which works well for Copy types such as i32; RefCell, which works for all types but does borrow checking at runtime, adding some slight overhead; and Mutex and RwLock, which can be used across multiple threads but perform lock checks at runtime so that at most one thread gets mutable access to the inner value at any time.
Here's an example with Cell. We can eliminate the two temporary Vecs with this approach, and we can pass the whole collection of objects to the other function instead of just references to the a field.
use std::cell::Cell;
struct Test {
a: i32,
b: Cell<i32>,
}
fn other(x: &Cell<i32>, refs: &[Test]) {
x.set(x.get() + 1);
}
fn main() {
let xes: Vec<Test> = vec![Test { a: 3, b: Cell::new(5) }];
xes.iter().for_each(|x| other(&x.b, &xes));
}

What's the fastest idiomatic way to mutate multiple struct fields at the same time?

Many libraries allow you to define a type which implements a given trait to be used as a callback handler. This requires you to lump all of the data you'll need to handle the event together in a single data type, which complicates borrows.
For instance, mio allows you to implement Handler and provide your struct when you run the EventLoop. Consider an example with these trivialized data types:
struct A {
pub b: Option<B>
}
struct B;
struct MyHandlerType {
pub map: BTreeMap<Token, A>,
pub pool: Pool<B>
}
Your handler has a map from Token to items of type A. Each item of type A may or may not already have an associated value of type B. In the handler, you want to look up the A value for a given Token and, if it doesn't already have a B value, get one out of the handler's Pool<B>.
impl Handler for MyHandlerType {
fn ready(&mut self, event_loop: &mut EventLoop<MyHandlerType>,
token: Token, events: EventSet) {
let a : &mut A = self.map.get_mut(token).unwrap();
let b : B = a.b.take().or_else(|| self.pool.new()).unwrap();
// Continue working with `a` and `b`
// ...
}
}
In this arrangement, even though it's intuitively possible to see that self.map and self.pool are distinct entities, the borrow checker complains that self is already borrowed (via self.map) when we go to access self.pool.
One possible approach to this would be to wrap each field in MyHandlerType in Option<>. Then, at the start of the method call, take() those values out of self and restore them at the end of the call:
struct MyHandlerType {
// Wrap these fields in `Option`
pub map: Option<BTreeMap<Token, A>>,
pub pool: Option<Pool<B>>
}
// ...
fn ready(&mut self, event_loop: &mut EventLoop<MyHandlerType>,
token: Token, events: EventSet) {
// Move these values out of `self`
let mut map = self.map.take().unwrap();
let mut pool = self.pool.take().unwrap();
let a : &mut A = map.get_mut(token).unwrap();
let b : B = a.b.take().or_else(|| pool.new()).unwrap();
// Continue working with `a` and `b`
// ...
// Restore these values to `self`
self.map = Some(map);
self.pool = Some(pool);
}
This works but feels a bit kluge-y. It also introduces the overhead of moving values in and out of self for each method call.
What's the best way to do this?
To get simultaneous mutable references to different parts of the struct, use destructuring. Example here.
struct Pair {
x: Vec<u32>,
y: Vec<u32>,
}
impl Pair {
fn test(&mut self) -> usize {
let Pair{ ref mut x, ref mut y } = *self;
// Both references coexist now
return x.len() + y.len();
}
}
fn main() {
let mut nums = Pair {
x: vec![1, 2, 3],
y: vec![4, 5, 6, 7],
};
println!("{}", nums.test());
}
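Applied to the handler from the question, the same destructuring might look like the sketch below. The mio types are replaced by stand-ins so the example is self-contained: `Token` becomes a plain `u32`, and `Pool` is a hypothetical counter-based type with a `create` method, not mio's real pool API:

```rust
use std::collections::BTreeMap;

struct B(u32);

struct A {
    b: Option<B>,
}

// Hypothetical stand-in for the question's Pool<B>.
struct Pool {
    next_id: u32,
}

impl Pool {
    fn create(&mut self) -> Option<B> {
        let id = self.next_id;
        self.next_id += 1;
        Some(B(id))
    }
}

struct MyHandlerType {
    map: BTreeMap<u32, A>,
    pool: Pool,
}

impl MyHandlerType {
    // Returns the id of the B associated with `token`, allocating one
    // from the pool if the entry does not have one yet.
    fn ready(&mut self, token: u32) -> Option<u32> {
        // Split `self` into independent mutable borrows of its fields.
        let MyHandlerType { map, pool } = self;
        let a = map.get_mut(&token)?;
        let b = a.b.take().or_else(|| pool.create())?;
        let id = b.0;
        a.b = Some(b); // put the value back for the next event
        Some(id)
    }
}

fn sample_handler() -> MyHandlerType {
    let mut map = BTreeMap::new();
    map.insert(7, A { b: None });
    MyHandlerType { map, pool: Pool { next_id: 0 } }
}
```

The closure captures only `pool`, so it no longer conflicts with the borrow of `map` held by `a`, and no values have to be moved in and out of `self`.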

Why does the borrow from `HashMap::get` not end when the function returns?

Here is a reproduction of my problem, where a borrow ends too late:
use std::collections::HashMap;
struct Item {
capacity: u64
}
struct Petrol {
name: String,
fuel: HashMap<&'static str, Item>
}
fn buy_gaz(p: &mut Petrol) {
match p.fuel.get("gaz") {
Some(gaz) => {
fire_petrol(p);
}
None => ()
}
}
fn fire_petrol(p: &mut Petrol) {
println!("Boom!");
p.fuel.remove("gaz");
p.fuel.remove("benzin");
}
fn main() {
let mut bt = Petrol {
name: "Britii Petrovich".to_string(),
fuel: HashMap::new()
};
bt.fuel.insert("gaz", Item { capacity: 1000 });
bt.fuel.insert("benzin", Item { capacity: 5000 });
buy_gaz(&mut bt);
}
When compiling I get:
note: previous borrow of `p.fuel` occurs here; the immutable borrow prevents subsequent moves or mutable borrows of `p.fuel` until the borrow ends
match p.fuel.get("gaz") {
^~~~~~
Why does the borrow end so late and not on exit from HashMap::get? How do I fix my case?
PS: I edited my original post to store a struct in the HashMap, because the solution below worked for simple types (those with a default Clone trait, I think) but doesn't work for custom structs.
If you look at the documentation of HashMap::get, you can see that it returns an Option<&V>. The reference into the map allows you to do zero-copy accesses into a hash map. The downside is that, as long as you hold a reference, you cannot modify the hash map, as that might invalidate your reference.
The branch Some(gaz) causes the binding gaz to have type &u64, where the reference points into your hashmap. If you change that to Some(&gaz) you get a copy of the value instead of a reference, and may modify the hash map even inside that branch.
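For the struct-valued map in the edited question, the same idea can be sketched by copying out only the field that is needed, so that no reference into the map survives the lookup. The `Option<u64>` return value and the `sample_petrol` helper are additions made for this illustration:

```rust
use std::collections::HashMap;

struct Item {
    capacity: u64,
}

struct Petrol {
    name: String,
    fuel: HashMap<&'static str, Item>,
}

fn fire_petrol(p: &mut Petrol) {
    println!("Boom!");
    p.fuel.remove("gaz");
    p.fuel.remove("benzin");
}

// Returns the capacity that was found, or None if there was no "gaz".
fn buy_gaz(p: &mut Petrol) -> Option<u64> {
    // `capacity` is a plain u64 copy, so the borrow created by `get`
    // ends with this statement.
    let capacity = p.fuel.get("gaz").map(|gaz| gaz.capacity)?;
    fire_petrol(p);
    Some(capacity)
}

fn sample_petrol() -> Petrol {
    let mut bt = Petrol {
        name: "Britii Petrovich".to_string(),
        fuel: HashMap::new(),
    };
    bt.fuel.insert("gaz", Item { capacity: 1000 });
    bt.fuel.insert("benzin", Item { capacity: 5000 });
    bt
}
```

If the whole `Item` is needed rather than one field, deriving Clone for it and calling `.cloned()` on the result of `get` would serve the same purpose at the cost of a copy.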

Resources