I am writing a small program in Rust to calculate the critical path on a PERT diagram (https://en.wikipedia.org/wiki/Program_evaluation_and_review_technique).
I am storing Task objects inside a hashmap. The hashmap is owned by an object called Pert. Each Task object owns two Vec<String> objects, identifying the task's prerequisites and successors, and has an i32 to specify its duration. The tasks are created inside main.rs and added to the Pert object through an add function.
task.rs:
pub struct Task {
name: String,
duration: i32,
followers: Vec<String>,
prerequisites: Vec<String>
// Additional fields, not relevant for the example
}
impl Task {
pub fn new(name: &str, duration: i32) -> Task {
Task {
name: String::from(name),
duration: duration,
followers: Vec::new(),
prerequisites: Vec::new(),
}
}
pub fn name(&self) -> &str {
&self.name
}
pub fn duration(&self) -> i32 {
self.duration
}
pub fn get_prerequisites(&self) -> & Vec<String> {
&self.prerequisites
}
pub fn get_followers(&self) -> & Vec<String> {
&self.followers
}
}
To evaluate the critical path it is necessary to calculate the maximum sum of durations over all paths through the tasks and to record the earliest and latest start and finish times of each task. One way to do this is to add a "begin" and an "end" task that mark the beginning and the end of the graph respectively. Starting from the "begin" task, a BFS is performed on the graph until we get to the "end" task. The BFS is done inside the completion_time method of the Pert object.
In my current implementation I hit a problem with the borrow checker, since I am mutably borrowing the hashmap containing the tasks more than once. I don't see a way to do this without borrowing twice, but I am very new to Rust and have no functional programming experience, so if there is an easy way to do this with functional programming, I can't see it either.
pert.rs:
pub struct Pert {
tasks: HashMap<String, Task>
}
impl Pert {
pub fn completion_time(&mut self) -> i32 {
let mut time = 0;
let mut q = VecDeque::<&mut Task>::new();
// put "begin" task at the top of the queue, first mutable borrow of self.tasks
q.push_back(self.tasks.get_mut("begin").unwrap());
while !q.is_empty() {
let old_time = time;
let mut curr_task = q.pop_front().unwrap();
for x in curr_task.get_followers() {
// second mutable borrow of self.tasks happens here
let task = self.tasks.get_mut(x).unwrap();
// additional piece of code here modifying other task properties
time = std::cmp::max(old_time, old_time + task.duration())
}
}
time
}
}
Building the project with an empty main.rs should be enough to trigger the following error message:
error[E0499]: cannot borrow `self.tasks` as mutable more than once at a time
--> src/pert.rs:84:28
|
79 | q.push_back(self.tasks.get_mut("begin").unwrap());
| ---------- first mutable borrow occurs here
80 | while !q.is_empty() {
| - first borrow later used here
...
84 | let task = self.tasks.get_mut(x).unwrap();
| ^^^^^^^^^^ second mutable borrow occurs here
The issue here is that you're trying to get multiple mutable references from the HashMap, which owns the tasks and can only safely give out one mutable reference at a time. By changing the VecDeque to take a &Task, and using .get() instead of .get_mut() on the hashmap in completion_time(), the program will compile.
It doesn't look like you're mutating the task in this example, but assuming that you want to modify this example to mutate the task, the best way is to use interior mutability within the Task struct itself, which is usually achieved with the RefCell type. Any value inside of the Task struct that you want to mutate, you can wrap in a RefCell<>, and when you need to mutate the value, you can call .borrow_mut() on the struct field to get a temporarily mutable reference. This answer explains it in a bit more detail: Borrow two mutable values from the same HashMap
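As a rough illustration of the RefCell approach, here is a minimal sketch. It assumes a simplified Task, and the earliest_start field and the sample_pert helper are hypothetical names standing in for the "additional fields" the question leaves out. Only the mutated field is wrapped in a RefCell, so completion_time can take &self and use .get() throughout. Note that this simple BFS may visit a task more than once.

```rust
use std::cell::RefCell;
use std::collections::{HashMap, VecDeque};

// Simplified Task; `earliest_start` is a hypothetical field, not from the question.
pub struct Task {
    duration: i32,
    followers: Vec<String>,
    earliest_start: RefCell<i32>,
}

impl Task {
    pub fn new(duration: i32, followers: &[&str]) -> Task {
        Task {
            duration,
            followers: followers.iter().map(|s| s.to_string()).collect(),
            earliest_start: RefCell::new(0),
        }
    }
}

pub struct Pert {
    tasks: HashMap<String, Task>,
}

impl Pert {
    // `&self` is enough: the map is only borrowed immutably,
    // and mutation goes through the RefCell inside each Task.
    pub fn completion_time(&self) -> i32 {
        let mut time = 0;
        let mut q: VecDeque<&Task> = VecDeque::new();
        q.push_back(self.tasks.get("begin").unwrap());
        while let Some(curr) = q.pop_front() {
            let finish = *curr.earliest_start.borrow() + curr.duration;
            time = time.max(finish);
            for name in &curr.followers {
                let next = self.tasks.get(name).unwrap();
                {
                    // Short-lived mutable borrow of a single field.
                    let mut start = next.earliest_start.borrow_mut();
                    *start = (*start).max(finish);
                }
                q.push_back(next);
            }
        }
        time
    }
}

// Hypothetical sample graph: begin -> a -> c -> end, a -> end, begin -> b -> end.
pub fn sample_pert() -> Pert {
    let mut tasks = HashMap::new();
    tasks.insert("begin".to_string(), Task::new(0, &["a", "b"]));
    tasks.insert("a".to_string(), Task::new(3, &["c", "end"]));
    tasks.insert("b".to_string(), Task::new(5, &["end"]));
    tasks.insert("c".to_string(), Task::new(4, &["end"]));
    tasks.insert("end".to_string(), Task::new(0, &[]));
    Pert { tasks }
}

fn main() {
    println!("completion time: {}", sample_pert().completion_time());
}
```

The trade-off is that borrow conflicts on earliest_start would now show up as a runtime panic rather than a compile error, which is why the mutable borrow is kept inside its own small scope.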
Related
I have a simple struct like below called TopicTree:
#[derive(Debug, PartialEq, Eq, Hash, Default, Clone)]
// #[allow(dead_code)]
pub struct TopicTree {
topic_name: String,
child: Option<Vec<Box<TopicTree>>>,
data: Option<Vec<String>>
}
And another struct called App which has an impl block as below:
struct App {
// Topic Tree
topic_tree_root:TopicTree,
}
impl App {
pub fn parse_topic_to_tree(& mut self, topic: &str){
let mut temp_node = & mut self.topic_tree_root;
let mut found = false;
for text in topic.split("/") {
for item in temp_node.child.as_mut().unwrap() {
if item.topic_name == text {
temp_node = item.as_mut();
found = true;
break;
}
}
}
}
}
When I try to compile the code, rustc gives me this error:
error[E0499]: cannot borrow `temp_node.child` as mutable more than once at a time
--> src/pub/lib/app.rs:22:26
|
22 | for j in temp_node.child.as_mut().unwrap() {
| ^^^^^^^^^^^^^^^^^^^^^^^^ `temp_node.child` was mutably borrowed here in the previous iteration of the loop
So my question is: isn't the variable item locally scoped? If it is not, how can I iterate over temp_node.child in a nested loop? This is necessary because temp_node is also mutable.
For the inner loop to execute, the compiler has to 1) create an implicit borrow on temp_node in order to 2) borrow temp_node.child, in order to call as_mut() (which takes &mut self) and then bind the result to item. The lifetime of item depends on temp_node being alive, because of this borrow-chain.
In a subsequent iteration of the outer loop, a conflict occurs: If temp_node = item.as_mut() has executed, you need to mutably borrow temp_node in the for item in ... line. But it is already being borrowed to keep temp_node alive, which came from item, which came from temp_node... Here, the circular logic might become apparent: as the code is written (and notwithstanding that the data structure wouldn't support this), there can be no guarantee that temp_node and item don't end up being the same object, which would cause two mutable borrows on the same value.
There might be some confusion with respect to mut and &mut here. temp_node needs to be mut (as in let mut, because you change temp_node), but it does not need to be a &mut (as in "mutable borrow", because you are not modifying the data behind the reference).
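To illustrate the last point, here is a small sketch of the walk using a mut binding that holds a shared reference. It assumes a reduced TopicTree without the data field, and the find and node functions are hypothetical helpers, not part of the question. Because only & references are handed around, reassigning the binding inside the loop raises no borrow conflict.

```rust
#[derive(Debug, Default)]
pub struct TopicTree {
    topic_name: String,
    child: Option<Vec<Box<TopicTree>>>,
}

// Hypothetical helper: follows the '/'-separated segments of `topic`
// down the tree and returns the node reached, if every segment matches.
fn find<'a>(root: &'a TopicTree, topic: &str) -> Option<&'a TopicTree> {
    // `mut` because the binding is reassigned; `&` because nothing is modified.
    let mut node = root;
    for text in topic.split('/') {
        let children = node.child.as_ref()?;
        node = children
            .iter()
            .map(|c| &**c) // &Box<TopicTree> -> &TopicTree
            .find(|c| c.topic_name == text)?;
    }
    Some(node)
}

// Hypothetical constructor used only to build a test tree.
fn node(name: &str, children: Vec<TopicTree>) -> TopicTree {
    TopicTree {
        topic_name: name.to_string(),
        child: Some(children.into_iter().map(Box::new).collect()),
    }
}

fn main() {
    let root = node("root", vec![node("a", vec![node("b", vec![])])]);
    println!("{:?}", find(&root, "a/b").map(|n| &n.topic_name));
}
```

If the node that is found needs to be modified afterwards, that is a separate problem; this sketch only covers the read-only traversal.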
I'm running into an ownership issue when attempting to pass references to multiple values from a HashMap in a struct as parameters in a function call. Here is a PoC of the issue.
use std::collections::HashMap;
struct Resource {
map: HashMap<String, String>,
}
impl Resource {
pub fn new() -> Self {
Resource {
map: HashMap::new(),
}
}
pub fn load(&mut self, key: String) -> &mut String {
self.map.get_mut(&key).unwrap()
}
}
fn main() {
// Initialize struct containing a HashMap.
let mut res = Resource {
map: HashMap::new(),
};
res.map.insert("Item1".to_string(), "Value1".to_string());
res.map.insert("Item2".to_string(), "Value2".to_string());
// This compiles and runs.
let mut value1 = res.load("Item1".to_string());
single_parameter(value1);
let mut value2 = res.load("Item2".to_string());
single_parameter(value2);
// This has ownership issues.
// multi_parameter(value1, value2);
}
fn single_parameter(value: &String) {
println!("{}", *value);
}
fn multi_parameter(value1: &mut String, value2: &mut String) {
println!("{}", *value1);
println!("{}", *value2);
}
Uncommenting multi_parameter results in the following error:
28 | let mut value1 = res.load("Item1".to_string());
| --- first mutable borrow occurs here
29 | single_parameter(value1);
30 | let mut value2 = res.load("Item2".to_string());
| ^^^ second mutable borrow occurs here
...
34 | multi_parameter(value1, value2);
| ------ first borrow later used here
It would technically be possible for me to break up the function calls (using the single_parameter function approach), but it would be more convenient to pass the variables to a single function call.
For additional context, the actual program where I'm encountering this issue is an SDL2 game where I'm attempting to pass multiple textures into a single function call to be drawn, where the texture data may be modified within the function.
This is currently not possible, without resorting to unsafe code or interior mutability at least. There is no way for the compiler to know if two calls to load will yield mutable references to different data as it cannot always infer the value of the key. In theory, mutably borrowing both res.map["Item1"] and res.map["Item2"] would be fine as they would refer to different values in the map, but there is no way for the compiler to know this at compile time.
The easiest way to do this, as already mentioned, is to use a structure that allows interior mutability, like RefCell, which typically enforces the memory safety rules at run-time before returning a borrow of the wrapped value. You can also work around the borrow checker in this case by dealing with mut pointers in unsafe code:
pub fn load_many<'a, const N: usize>(&'a mut self, keys: [&str; N]) -> [&'a mut String; N] {
// TODO: Assert that keys are distinct, so that we don't return
// multiple references to the same value
// `load` takes an owned String, so convert each &str key first
keys.map(|key| self.load(key.to_string()) as *mut _)
.map(|ptr| unsafe { &mut *ptr })
}
The TODO is important, as this assertion is the only way to ensure that the safety invariant of only having one mutable reference to any value at any time is upheld.
It is, however, almost always better (and easier) to use a known safe interior mutation abstraction like RefCell rather than writing your own unsafe code.
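A minimal sketch of the RefCell variant might look like the following. It assumes the map values are changed to RefCell<String> and that load returns the cell rather than a &mut String; both are modifications to the question's code, not something the question already does. The multi_parameter body is also changed to mutate its arguments so the effect is visible.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

struct Resource {
    // Assumption: values are wrapped in RefCell so they can be
    // mutated through a shared borrow of the map.
    map: HashMap<String, RefCell<String>>,
}

impl Resource {
    // `&self` suffices, so several results of `load` can coexist.
    fn load(&self, key: &str) -> &RefCell<String> {
        self.map.get(key).unwrap()
    }
}

fn multi_parameter(value1: &mut String, value2: &mut String) {
    value1.push('!');
    value2.push('?');
}

fn main() {
    let mut res = Resource { map: HashMap::new() };
    res.map.insert("Item1".to_string(), RefCell::new("Value1".to_string()));
    res.map.insert("Item2".to_string(), RefCell::new("Value2".to_string()));

    {
        // Two runtime-checked mutable borrows of *different* cells.
        let mut v1 = res.load("Item1").borrow_mut();
        let mut v2 = res.load("Item2").borrow_mut();
        multi_parameter(&mut v1, &mut v2);
    }

    println!("{}", res.load("Item1").borrow());
    println!("{}", res.load("Item2").borrow());
}
```

Passing the same key twice would compile but panic at the second borrow_mut call, which is the runtime equivalent of the distinct-keys check the unsafe version has to do by hand.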
I'm writing a game engine. In the engine, I've got a game state which contains the list of entities in the game.
I want to provide a function on my gamestate update which will in turn tell each entity to update. Each entity needs to be able to refer to the gamestate in order to correctly update itself.
Here's a simplified version of what I have so far.
pub struct GameState {
pub entities: Vec<Entity>,
}
impl GameState {
pub fn update(&mut self) {
for mut t in self.entities.iter_mut() {
t.update(self);
}
}
}
pub struct Entity {
pub value: i64,
}
impl Entity {
pub fn update(&mut self, container: &GameState) {
self.value += container.entities.len() as i64;
}
}
fn main() {
let mut c = GameState { entities: vec![] };
c.entities.push(Entity { value: 1 });
c.entities.push(Entity { value: 2 });
c.entities.push(Entity { value: 3 });
c.update();
}
The problem is the borrow checker doesn't like me passing the gamestate to the entity:
error[E0502]: cannot borrow `*self` as immutable because `self.entities` is also borrowed as mutable
--> example.rs:8:22
|
7 | for mut t in self.entities.iter_mut() {
| ------------- mutable borrow occurs here
8 | t.update(self);
| ^^^^ immutable borrow occurs here
9 | }
| - mutable borrow ends here
error: aborting due to previous error
Can anyone give me some suggestions on better ways to design this that fits with Rust better?
Thanks!
First, let's answer the question you didn't ask: Why is this not allowed?
The answer lies around the guarantees that Rust makes about & and &mut pointers. A & pointer is guaranteed to point to an immutable object, i.e. it's impossible for the objects behind the pointer to mutate while you can use that pointer. A &mut pointer is guaranteed to be the only active pointer to an object, i.e. you can be sure that nobody is going to observe or mutate the object while you're mutating it.
Now, let's look at the signature of Entity::update:
impl Entity {
pub fn update(&mut self, container: &GameState) {
// ...
}
}
This method takes two parameters: a &mut Entity and a &GameState. But hold on, we can get another reference to self through the &GameState! For example, suppose that self is the first entity. If we do this:
impl Entity {
pub fn update(&mut self, container: &GameState) {
let self_again = &container.entities[0];
// ...
}
}
then self and self_again alias each other (i.e. they refer to the same thing), which is not allowed as per the rules I mentioned above because one of the pointers is a mutable pointer.
What can you do about this?
One option is to remove an entity from the entities vector before calling update on it, then inserting it back after the call. This solves the aliasing problem because we can't get another alias to the entity from the game state. However, removing the entity from the vector and reinserting it are operations with linear complexity (the vector needs to shift all the following items), and if you do it for each entity, then the main update loop runs in quadratic complexity. You can work around that by using a different data structure; this can be as simple as a Vec<Option<Entity>>, where you simply take the Entity from each Option, though you might want to wrap this into a type that hides all None values to external code. A nice consequence is that when an entity has to interact with other entities, it will automatically skip itself when iterating on the entities vector, since it's no longer there!
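The Vec<Option<Entity>> idea can be sketched roughly as follows. The Entity::update body here is an illustrative assumption (it sums the values of the other entities) chosen to show that the entity being updated is skipped automatically.

```rust
pub struct Entity {
    pub value: i64,
}

pub struct GameState {
    // Each slot is None while its entity is being updated.
    pub entities: Vec<Option<Entity>>,
}

impl GameState {
    pub fn update(&mut self) {
        for i in 0..self.entities.len() {
            // Move the entity out of its slot (constant time), leaving None behind.
            if let Some(mut entity) = self.entities[i].take() {
                entity.update(self);
                // Put it back once the update is done.
                self.entities[i] = Some(entity);
            }
        }
    }
}

impl Entity {
    pub fn update(&mut self, container: &GameState) {
        // `flatten` skips the None slot, i.e. the entity being updated.
        let others: i64 = container.entities.iter().flatten().map(|e| e.value).sum();
        self.value += others;
    }
}

fn main() {
    let mut state = GameState {
        entities: vec![
            Some(Entity { value: 1 }),
            Some(Entity { value: 2 }),
            Some(Entity { value: 3 }),
        ],
    };
    state.update();
    for e in state.entities.iter().flatten() {
        println!("{}", e.value);
    }
}
```

Indexing by position instead of using iter_mut is what keeps self free to be borrowed immutably inside the loop body.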
A variation on the above is to simply take ownership of the whole vector of entities and temporarily replace the game state's vector of entities with an empty one.
impl GameState {
pub fn update(&mut self) {
let mut entities = std::mem::replace(&mut self.entities, vec![]);
for mut t in entities.iter_mut() {
t.update(self);
}
self.entities = entities;
}
}
This has one major downside: Entity::update will not be able to interact with the other entities.
Another option is to wrap each entity in a RefCell.
use std::cell::RefCell;
pub struct GameState {
pub entities: Vec<RefCell<Entity>>,
}
impl GameState {
pub fn update(&mut self) {
for t in self.entities.iter() {
t.borrow_mut().update(self);
}
}
}
By using RefCell, we can avoid retaining a mutable borrow on self. Here, we can use iter instead of iter_mut to iterate on entities. In return, we now need to call borrow_mut to obtain a mutable pointer to the value wrapped in the RefCell.
RefCell essentially performs borrow checking at runtime. This means that you can end up writing code that compiles fine but panics at runtime. For example, if we write Entity::update like this:
impl Entity {
pub fn update(&mut self, container: &GameState) {
for entity in container.entities.iter() {
self.value += entity.borrow().value;
}
}
}
the program will panic:
thread 'main' panicked at 'already mutably borrowed: BorrowError', ../src/libcore/result.rs:788
That's because we end up calling borrow on the entity that we're currently updating, which is still borrowed by the borrow_mut call done in GameState::update. Entity::update doesn't have enough information to know which entity is self, so you would have to use try_borrow or borrow_state (which are both unstable as of Rust 1.12.1) or pass additional data to Entity::update to avoid panics with this approach.
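For reference, RefCell::try_borrow has since been stabilized (in Rust 1.13, as far as I know), so the panic can be avoided along these lines. This is only a sketch: it treats any cell that cannot be borrowed as "the entity currently being updated", which holds here because GameState::update is the only place that takes a mutable borrow.

```rust
use std::cell::RefCell;

pub struct Entity {
    pub value: i64,
}

pub struct GameState {
    pub entities: Vec<RefCell<Entity>>,
}

impl GameState {
    pub fn update(&self) {
        for t in self.entities.iter() {
            t.borrow_mut().update(self);
        }
    }
}

impl Entity {
    pub fn update(&mut self, container: &GameState) {
        for cell in container.entities.iter() {
            // try_borrow returns Err for the cell that is already mutably
            // borrowed, which in this sketch is the entity being updated.
            if let Ok(other) = cell.try_borrow() {
                self.value += other.value;
            }
        }
    }
}

fn main() {
    let state = GameState {
        entities: vec![
            RefCell::new(Entity { value: 1 }),
            RefCell::new(Entity { value: 2 }),
            RefCell::new(Entity { value: 3 }),
        ],
    };
    state.update();
    for e in &state.entities {
        println!("{}", e.borrow().value);
    }
}
```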
I want a method of my struct to run in a synchronized way. I tried to do this by using a Mutex:
use std::sync::Mutex;
use std::collections::BTreeMap;
pub struct A {
map: BTreeMap<String, String>,
mutex: Mutex<()>,
}
impl A {
pub fn new() -> A {
A {
map: BTreeMap::new(),
mutex: Mutex::new(()),
}
}
}
impl A {
fn synchronized_call(&mut self) {
let mutex_guard_res = self.mutex.try_lock();
if mutex_guard_res.is_err() {
return
}
let mut _mutex_guard = mutex_guard_res.unwrap(); // safe because of check above
let mut lambda = |text: String| {
let _ = self.map.insert("hello".to_owned(),
"d".to_owned());
};
lambda("dd".to_owned());
}
}
Error message:
error[E0500]: closure requires unique access to `self` but `self.mutex` is already borrowed
--> <anon>:23:26
|
18 | let mutex_guard_res = self.mutex.try_lock();
| ---------- borrow occurs here
...
23 | let mut lambda = |text: String| {
| ^^^^^^^^^^^^^^ closure construction occurs here
24 | if let Some(m) = self.map.get(&text) {
| ---- borrow occurs due to use of `self` in closure
...
31 | }
| - borrow ends here
As I understand it, when we borrow anything from the struct, we are unable to use the struct's other fields until our borrow is finished. But how can I do method synchronization then?
The closure needs a mutable reference to self.map in order to insert something into it. But closure capturing works with whole bindings only. This means that if you say self.map, the closure attempts to capture self, not self.map. And self can't be mutably borrowed/captured, because parts of self are already immutably borrowed. (This describes closures before the 2021 edition; from Rust 2021 on, closures capture individual fields, so this particular error should no longer occur there.)
We can solve this closure-capturing problem by introducing a new binding for the map alone such that the closure is able to capture it:
let mm = &mut self.map;
let mut lambda = |text: String| {
let _ = mm.insert("hello".to_owned(), text);
};
lambda("dd".to_owned());
However, there is something you overlooked: since synchronized_call() accepts &mut self, you don't need the mutex! Why? Mutable references are also called exclusive references, because the compiler can assure at compile time that there is only one such mutable reference at any given time.
Therefore you statically know that at most one instance of synchronized_call() is running on one specific object at any given time, unless the function is recursive (calls itself).
If you have mutable access to a mutex, you know that the mutex is unlocked. See the Mutex::get_mut() method for more explanation. Isn't that amazing?
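A small sketch of what Mutex::get_mut allows: with a &mut Mutex in hand, the inner value is reachable without locking. The insert_exclusive function is a hypothetical name used only for this illustration.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

// Hypothetical helper: exclusive access to the Mutex itself means
// nobody else can hold the lock, so no locking is needed.
fn insert_exclusive(m: &mut Mutex<BTreeMap<String, String>>) {
    // get_mut only fails if the mutex was poisoned by a panicking thread.
    let map = m.get_mut().unwrap();
    map.insert("hello".to_owned(), "d".to_owned());
}

fn main() {
    let mut m = Mutex::new(BTreeMap::new());
    insert_exclusive(&mut m);
    println!("{:?}", m.lock().unwrap().get("hello"));
}
```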
Rust mutexes do not work the way you are trying to use them. In Rust, a mutex protects specific data relying on the borrow-checking mechanism used elsewhere in the language. As a consequence, declaring a field Mutex<()> doesn't make sense, because it is protecting read-write access to the () unit object that has no values to mutate.
As Lukas explained, your synchronized_call as declared doesn't need to do synchronization because its signature already requests an exclusive (mutable) reference to self, which prevents it from being invoked from multiple threads on the same object. In other words, you need to change the signature of synchronized_call because the current one does not match the functionality it is intended to provide.
synchronized_call needs to accept a shared reference to self, which will signal to Rust that it can be called from multiple threads in the first place. Inside synchronized_call, a call to Mutex::lock will simultaneously lock the mutex and provide a mutable reference to the underlying data, carefully scoped so that the lock is held for the duration of the reference:
use std::sync::Mutex;
use std::collections::BTreeMap;
pub struct A {
synced_map: Mutex<BTreeMap<String, String>>,
}
impl A {
pub fn new() -> A {
A {
synced_map: Mutex::new(BTreeMap::new()),
}
}
}
impl A {
fn synchronized_call(&self) {
let mut map = self.synced_map.lock().unwrap();
// omitting the lambda for brevity, but it would also work
// (as long as it refers to map rather than self.map)
map.insert("hello".to_owned(), "d".to_owned());
}
}
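To show the &self signature in use from several threads, here is a rough sketch. It assumes the struct is shared through an Arc, and it changes synchronized_call to take a key argument; run_threads and len are hypothetical helpers added so the result can be observed.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::thread;

pub struct A {
    synced_map: Mutex<BTreeMap<String, String>>,
}

impl A {
    pub fn new() -> A {
        A { synced_map: Mutex::new(BTreeMap::new()) }
    }

    // Assumption: a `key` parameter is added so each thread inserts its own entry.
    fn synchronized_call(&self, key: String) {
        // The guard holds the lock until it goes out of scope.
        let mut map = self.synced_map.lock().unwrap();
        map.insert(key, "d".to_owned());
    }

    fn len(&self) -> usize {
        self.synced_map.lock().unwrap().len()
    }
}

// Hypothetical driver: spawns `n` threads that all call the method
// on the same object, then reports how many entries were inserted.
fn run_threads(n: usize) -> usize {
    let shared = Arc::new(A::new());
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let a = Arc::clone(&shared);
            thread::spawn(move || a.synchronized_call(format!("key{}", i)))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    shared.len()
}

fn main() {
    println!("entries: {}", run_threads(4));
}
```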
I'm learning Rust and I'm trying to cargo-cult this code into compiling:
use std::vec::Vec;
use std::collections::BTreeMap;
struct Occ {
docnum: u64,
weight: f32,
}
struct PostWriter<'a> {
bytes: Vec<u8>,
occurrences: BTreeMap<&'a [u8], Vec<Occ>>,
}
impl<'a> PostWriter<'a> {
fn new() -> PostWriter<'a> {
PostWriter {
bytes: Vec::new(),
occurrences: BTreeMap::new(),
}
}
fn add_occurrence(&'a mut self, term: &[u8], occ: Occ) {
let occurrences = &mut self.occurrences;
match occurrences.get_mut(term) {
Some(x) => x.push(occ),
None => {
// Add the term bytes to the big vector of all terms
let termstart = self.bytes.len();
self.bytes.extend(term);
// Create a new occurrences vector
let occs = vec![occ];
// Take the appended term as a slice to use as a key
// ERROR: cannot borrow `*occurrences` as mutable more than once at a time
occurrences.insert(&self.bytes[termstart..], occs);
}
}
}
}
fn main() {}
I get an error:
error[E0499]: cannot borrow `*occurrences` as mutable more than once at a time
--> src/main.rs:34:17
|
24 | match occurrences.get_mut(term) {
| ----------- first mutable borrow occurs here
...
34 | occurrences.insert(&self.bytes[termstart..], occs);
| ^^^^^^^^^^^ second mutable borrow occurs here
35 | }
36 | }
| - first borrow ends here
I don't understand... I'm just calling a method on a mutable reference, why would that line involve borrowing?
I'm just calling a method on a mutable reference, why would that line involve borrowing?
When you call a method on an object that's going to mutate the object, you can't have any other references to that object outstanding. If you did, your mutation could invalidate those references and leave your program in an inconsistent state. For example, say that you had gotten a reference to a value out of your map and then added a new value. If adding the new value hits a magic limit and forces memory to be reallocated, your reference now points off to nowhere! When you use that value... bang goes the program!
In this case, it looks like you want to do the relatively common "append or insert if missing" operation. You will want to use entry for that:
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
{
let nicknames = map.entry("joe").or_insert(Vec::new());
nicknames.push("shmoe");
// Using scoping to indicate that we are done with borrowing `nicknames`
// If we didn't, then we couldn't borrow map as
// immutable because we could still change it via `nicknames`
}
println!("{:?}", map)
}
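Applied to the question's PostWriter, the entry approach could look roughly like this. Note the assumption: the keys are changed to owned Vec<u8> instead of slices borrowed from self.bytes, which sidesteps the lifetime parameter but costs one allocation per call. That is a different design from the one in the question, not a drop-in fix.

```rust
use std::collections::BTreeMap;

struct Occ {
    docnum: u64,
    weight: f32,
}

struct PostWriter {
    // Assumption: owned keys rather than `&'a [u8]` slices into a byte buffer.
    occurrences: BTreeMap<Vec<u8>, Vec<Occ>>,
}

impl PostWriter {
    fn new() -> PostWriter {
        PostWriter { occurrences: BTreeMap::new() }
    }

    fn add_occurrence(&mut self, term: &[u8], occ: Occ) {
        // One lookup: fetch the existing vector or insert an empty one, then push.
        self.occurrences.entry(term.to_vec()).or_default().push(occ);
    }
}

fn main() {
    let mut writer = PostWriter::new();
    writer.add_occurrence(b"rust", Occ { docnum: 1, weight: 0.5 });
    writer.add_occurrence(b"rust", Occ { docnum: 2, weight: 1.5 });
    for (term, occs) in &writer.occurrences {
        for occ in occs {
            println!("{:?} doc {} weight {}", term, occ.docnum, occ.weight);
        }
    }
}
```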
Because you're calling a method that borrows as mutable
I had a similar question yesterday about Hash, until I noticed something in the docs. The docs for BTreeMap show a method signature for insert starting with fn insert(&mut self..
So when you call .insert, you're implicitly asking that function to borrow the BTreeMap as mutable.