HashMap borrow issue when trying to implement find or insert

HashMap borrow issue when trying to implement find or insert - rust

I tried to implement own analogue of find_or_insert method that looks like this:
use std::collections::HashMap;
pub struct SomeManager {
next: i32,
types: HashMap<i32, i32>,
}
impl SomeManager {
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
match self.types.get(&k) {
Some(ref x) => return *x,
None => {
self.types.insert(k, self.next);
self.next += 1;
return self.types.get(&k).unwrap();
}
}
}
}
fn main() {}
Error:
error[E0502]: cannot borrow `self.types` as mutable because it is also borrowed as immutable
--> src/main.rs:13:17
|
10 | match self.types.get(&k) {
| ---------- immutable borrow occurs here
...
13 | self.types.insert(k, self.next);
| ^^^^^^^^^^ mutable borrow occurs here
...
18 | }
| - immutable borrow ends here
I know that there are some standard methods that implement this functionality, but I want this method to be as light as possible - it will be called very-very often and almost all of the time the values will already exist.
As I understand it, when we call self.types.get we borrow it to scope of match statement, so we can't call self.types.insert here. I have tried to move methods from None branch out of match statement, but it also fails.
The only working solution that I found requires invoking get twice:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let is_none = match self.types.get(&k) {
Some(ref x) => false,
None => true,
};
if is_none {
self.types.insert(k, self.next);
self.next += 1;
}
self.types.get(&k).unwrap()
}
How can I work around such situations?

There are a handful of methods on HashMap to achieve these sorts of complex cases. Most notably, for your case, HashMap::entry and Entry::or_insert_with:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
self.types.entry(k).or_insert_with(|| {
let value = self.next;
self.next += 1;
value
})
}
In your case, however, there’s the borrow of self inside, so that won’t do.
We thus shift the borrow of self.next outside of the closure so the compiler can reason about it as disjoint from self.types. Problem solved with only one lookup, as it should be.
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let next = &mut self.next;
self.types.entry(k).or_insert_with(|| {
let value = *next;
*next += 1;
value
})
}

Note that in your first case you're doing one lookup when the key exists in the map and three when it does not exist. Your last attempt does two lookups in either case. This is somewhat prettified version of the latter:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let contains = self.types.contains_key(&k);
if !contains {
self.types.insert(k, self.next);
self.next += 1;
}
self.types.get(&k).unwrap()
}
I don't think it is possible to avoid the second lookup without some support from the map's implementation because of borrowing restrictions.
In any case, using the solution by Chris Morgan is superior to the above one (for example, it may be more efficient and in fact require less lookups), so I suggest sticking with it.

Related

Passing both &mut self and &self to the same function

Stripped down to the bare essentials, my problematic code looks as follows:
pub struct Item;
impl Item {
/// Partial copy. Not the same as simple assignment.
pub fn copy_from(&mut self, _other: &Item) {
}
}
pub struct Container {
items: Vec<Item>,
}
impl Container {
pub fn copy_from(&mut self, self_idx: usize, other: &Container, other_idx: usize) {
self.items[self_idx].copy_from(&other.items[other_idx]);
}
}
fn main() {
let mut container = Container { items: vec![Item, Item] };
container.copy_from(0, &container, 1);
}
This is of course rejected by the borrow checker:
error[E0502]: cannot borrow `container` as mutable because it is also borrowed as immutable
--> src/main.rs:21:5
|
21 | container.copy_from(0, &container, 1);
| ^^^^^^^^^^---------^^^^----------^^^^
| | | |
| | | immutable borrow occurs here
| | immutable borrow later used by call
| mutable borrow occurs here
For more information about this error, try `rustc --explain E0502`.
I understand why that happens, but I don't have a good solution.
I've considered adding a dedicated copy_from_self function that callers need to use in cases where self == other:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
if to_idx != from_idx {
unsafe {
let from_item: *const Item = &self.items[from_idx];
self.items[to_idx].copy_from(&*from_item);
}
}
}
But this is un-ergonomic, bloats the API surface, and needs unsafe code inside.
Note that in reality, the internal items data structure is not a simple Vec, so any approach specific to Vec or slice will not work.
Is there an elegant, idiomatic solution to this problem?

If I understand the comments on the question correctly, a general solution seems to be impossible, so this answer is necessarily specific to my actual situation.
As mentioned, the actual data structure is not a Vec. If it were a Vec, we could use split_at_mut to at least implement copy_from_self safely.
But as it happens, my actual data structure is backed by a Vec, so I was able to add a helper function:
/// Returns a pair of mutable references to different items. Useful if you need to pass
/// a reference to one item to a function that takes `&mut self` on another item.
/// Panics if `a == b`.
fn get_mut_2(&mut self, a: usize, b: usize) -> (&mut T, &mut T) {
assert!(a != b);
if a < b {
let (first, second) = self.items.split_at_mut(b);
(&mut first[a], &mut second[0])
} else if a > b {
let (first, second) = self.items.split_at_mut(a);
(&mut second[0], &mut first[b])
} else {
panic!("cannot call get_mut_2 with the same index {} == {}", a, b);
}
}
Now we can implement copy_from_self without unsafe code:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
let (to, from) = self.items.get_mut_2(to_idx, from_idx);
to.unwrap().copy_from(from.unwrap());
}

Borrow a struct field mutably and pass it to mutable methods

Because rust doesn't track individual field borrows across method calls - it assumes you've borrowed from the whole struct, it's not possible to borrow a field from the struct, return it, and then pass that into other mutable methods. That sounds contrived but I have this exact issue right now.
struct Foo {
x: i32,
y: i32
}
impl Foo {
pub fn borrow_x(&mut self) -> &mut i32 {
&mut self.x
}
pub fn cross_product(&mut self, other: &mut i32) {
self.y *= *other;
*other *= self.y;
}
}
fn main() {
let mut foo = Foo{x: 2, y: 4};
let x_ref = foo.borrow_x();
foo.cross_product(x_ref);
println!("x={} y={}", foo.x, foo.y);
}
This of course isn't permitted in rust, the error from the compiler is:
error[E0499]: cannot borrow `foo` as mutable more than once at a time
--> src/main.rs:20:5
|
19 | let x_ref = foo.borrow_x();
| --- first mutable borrow occurs here
20 | foo.cross_product(x_ref);
| ^^^ ----- first borrow later used here
| |
| second mutable borrow occurs here
My first reaction is to use unsafe to tell the compiler that it doesn't really borrow from self:
pub fn borrow_x<'a>(&mut self) -> &'a mut i32 {
unsafe {
std::mem::transmute::<&mut i32, &'a mut i32>(&mut self.x)
}
}
It works, but it requires that you don't actually break rust's aliasing rules and reference foo.x elsewhere by accident.
Then I thought maybe I could split the struct so that the borrow and cross_product methods are on different sibling structs. This does work:
struct Bar {
x: X,
y: Y,
}
struct X {
inner: i32
}
impl X {
pub fn borrow_x(&mut self) -> &mut i32 {
&mut self.inner
}
}
struct Y {
inner: i32
}
impl Y {
pub fn cross_product(&mut self, other: &mut i32) {
self.inner *= *other;
*other *= self.inner;
}
}
fn main() {
let mut bar = Bar{x: X{inner: 2}, y: Y{inner: 4}};
let x_ref = bar.x.borrow_x();
bar.y.cross_product(x_ref);
println!("x={} y={}", bar.x.inner, bar.y.inner);
}
I think one could ditch the method all-together and pass references to both fields into cross_product as a free function. That should be equivalent to the last solution. I find that kind of ugly though, especially as the number of fields needed from the struct grow.
Are there other solutions to this problem?

The thing to keep in mind is that rust is working at a per-function level.
pub fn cross_product(&mut self, other: &mut i32) {
self.y *= *other;
*other *= self.y;
}
The mutability rules ensure that self.y and other are not the same thing.
If you're not careful you can run into this kind of bug in C++/C/Java quite easily.
i.e. imagine if instead of
let x_ref = foo.borrow_x();
foo.cross_product(x_ref);
you were doing
let y_ref = foo.borrow_y();
foo.cross_product(y_ref);
What value should foo.y end up with? (Allowing these to refer to the same object can cause certain performance optimisations to not be applicable too)
As you mentioned, you could go with a free function
fn cross_product(a: &mut i32, b: &mut i32) {
a *= b;
b *= a;
}
but I'd consider ditching mutability altogether
fn cross_product(a:i32, b:i32) -> (i32,i32) {
(a*b, a*b*b)
}
ASIDE: If the return value seems odd to you (it did to me initially too) and you expected (a*b, a*b) you need to think a little harder... and I'd say that alone is a good reason to avoid the mutable version.
Then I can use it like this
let (_x,_y) = cross_product(foo.x, foo.y);
foo.x = _x;
foo.y = _y;
This is a bit verbose, but when RFC-2909 lands we can instead write:
(foo.y, foo.x) = cross_product(foo.x, foo.y);

Mutable borrow from HashMap and lifetime elision

Chapter 13 of the Rust book (2nd edition) contains an example of a simple calculation cache. Cacher takes the calculation function as a constructor param, and will cache the result - after the first call, it will not invoke the function again, but simply return the cached result:
struct Cacher<T>
where
T: Fn(u32) -> u32,
{
calculation: T,
value: Option<u32>,
}
impl<T> Cacher<T>
where
T: Fn(u32) -> u32,
{
fn new(calculation: T) -> Cacher<T> {
Cacher {
calculation,
value: None,
}
}
fn value(&mut self, arg: u32) -> u32 {
match self.value {
Some(v) => v,
None => {
let v = (self.calculation)(arg);
self.value = Some(v);
v
}
}
}
}
It is very limited, and there are a couple of improvement suggestions in the book, left as an exercise for the reader.
So I'm trying to make it cache multiple values. The Cacher holds a HashMap with result values, instead of just one value. When asked for a value, if it has in the map (cache), return it. Otherwise, calculate it, store it the cache, and then return it.
The cacher takes references, because it doesn't want to own the inputs. When using it (see unit test), I'm borrowing, because the cacher owns the results.
Here's my attempt:
use std::collections::HashMap;
struct Cacher<T>
where
T: Fn(&u32) -> u32,
{
calculation: T,
values: HashMap<u32, u32>,
}
impl<T> Cacher<T>
where
T: Fn(&u32) -> u32,
{
fn new(calculation: T) -> Cacher<T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value(&mut self, arg: u32) -> &u32 {
let values = &mut self.values;
match values.get(&arg) {
Some(v) => &v,
None => {
let v = (self.calculation)(&arg);
values.insert(arg, v);
&values.get(&arg).unwrap()
}
}
}
}
#[test]
fn call_with_different_values() {
let mut c = Cacher::new(|a| a + 1);
let v1 = c.value(1);
assert_eq!(*v1, 2);
let v2 = c.value(2);
assert_eq!(*v2, 3);
}
Compiler output:
22 | fn value(&mut self, arg: u32) -> &u32 {
| - let's call the lifetime of this reference `'1`
23 | let values = &mut self.values;
24 | match values.get(&arg) {
| ------ immutable borrow occurs here
25 | Some(v) => &v,
| -- returning this value requires that `*values` is borrowed for `'1`
...
28 | values.insert(arg, v);
| ^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
I'm borrowing self.values as mutable on line 23. However, when I try to use it on the following line, I get the "immutable borrow" error. How is match values.get(arg) an immutable borrow of values ? Didn't the borrowing already happen ?
There is also the error for line 25. From my understanding of lifetime elision rules, the 3rd one should apply here - we have &mut self as a method parameter, so it's lifetime should be automatically assigned to the return value ?

I do not think that you are going to be happy with that signature:
fn value(&mut self, arg: u32) -> &u32
Without lifetime elision this is read as:
fn value(&'a mut self, arg: u32) -> &'a u32
Even if you implement it correctly this has huge implications. For example you can not call value again as long as old results are still in use. And rightly so, after all nothing is preventing the function body to delete old values from the cache.
Better to follow #hellow advice and make the return type u32.
Another misunderstanding: Just because you borrowed values already, does not mean you cannot borrow from it again.
Now to answer your original question:
The compiler is not lying to you values.get(arg) is indeed an immutable borrow of values. The technical explanation is that the signature of that method call is (simplified) get(&self, k: &Q) -> Option<&V>. So the borrow of self (aka values) is valid as long as &V can still be referenced. &V however needs to be valid for the entire function body at least (so it can be returned). Now you tried to mutably borrow in the None case which means &V never existed in the first place. So if the compiler gets smarter it may allow your code to run, but for now (1.34.0) it isn't.

Can't borrow mutably within two different closures in the same scope

My goal is to make a function (specifically, floodfill) that works independent of the underlying data structure. I tried to do this by passing in two closures: one for querying, that borrows some data immutably, and another for mutating, that borrows the same data mutably.
Example (tested on the Rust Playground):
#![feature(nll)]
fn foo<F, G>(n: i32, closure: &F, mut_closure: &mut G)
where
F: Fn(i32) -> bool,
G: FnMut(i32) -> (),
{
if closure(n) {
mut_closure(n);
}
}
fn main() {
let mut data = 0;
let closure = |n| data == n;
let mut mut_closure = |n| {
data += n;
};
foo(0, &closure, &mut mut_closure);
}
Error: (Debug, Nightly)
error[E0502]: cannot borrow `data` as mutable because it is also borrowed as immutable
--> src/main.rs:16:27
|
15 | let closure = |n| data == n;
| --- ---- previous borrow occurs due to use of `data` in closure
| |
| immutable borrow occurs here
16 | let mut mut_closure = |n| {
| ^^^ mutable borrow occurs here
17 | data += n;
| ---- borrow occurs due to use of `data` in closure
18 | };
19 | foo(0, &closure, &mut mut_closure);
| -------- borrow later used here
I did come up with a solution, but it is very ugly. It works if I combine the closures into one and specify which behavior I want with a parameter:
// #![feature(nll)] not required for this solution
fn foo<F>(n: i32, closure: &mut F)
where
F: FnMut(i32, bool) -> Option<bool>,
{
if closure(n, false).unwrap() {
closure(n, true);
}
}
fn main() {
let mut data = 0;
let mut closure = |n, mutate| {
if mutate {
data += n;
None
} else {
Some(data == n)
}
};
foo(0, &mut closure);
}
Is there any way I can appease the borrow checker without this weird way of combining closures?

The problem is rooted in the fact that there's information that you know that the compiler doesn't.
As mentioned in the comments, you cannot mutate a value while there is a immutable reference to it — otherwise it wouldn't be immutable! It happens that your function needs to access the data immutably once and then mutably, but the compiler doesn't know that from the signature of the function. All it can tell is that the function can call the closures in any order and any number of times, which would include using the immutable data after it's been mutated.
I'm guessing that your original code indeed does that — it probably loops and accesses the "immutable" data after mutating it.
Compile-time borrow checking
One solution is to stop capturing the data in the closure. Instead, "promote" the data to an argument of the function and the closures:
fn foo<T, F, G>(n: i32, data: &mut T, closure: F, mut mut_closure: G)
where
F: Fn(&mut T, i32) -> bool,
G: FnMut(&mut T, i32),
{
if closure(data, n) {
mut_closure(data, n);
}
}
fn main() {
let mut data = 0;
foo(
0,
&mut data,
|data, n| *data == n,
|data, n| *data += n,
);
}
However, I believe that two inter-related closures like that will lead to poor maintainability. Instead, give a name to the concept and make a trait:
trait FillTarget {
fn test(&self, _: i32) -> bool;
fn do_it(&mut self, _: i32);
}
fn foo<F>(n: i32, mut target: F)
where
F: FillTarget,
{
if target.test(n) {
target.do_it(n);
}
}
struct Simple(i32);
impl FillTarget for Simple {
fn test(&self, n: i32) -> bool {
self.0 == n
}
fn do_it(&mut self, n: i32) {
self.0 += n
}
}
fn main() {
let data = Simple(0);
foo(0, data);
}
Run-time borrow checking
Because your code has a temporal need for the mutability (you only need it either mutable or immutable at a given time), you could also switch from compile-time borrow checking to run-time borrow checking. As mentioned in the comments, tools like Cell, RefCell, and Mutex can be used for this:
use std::cell::Cell;
fn main() {
let data = Cell::new(0);
foo(
0,
|n| data.get() == n,
|n| data.set(data.get() + n),
);
}
See also:
When I can use either Cell or RefCell, which should I choose?
Situations where Cell or RefCell is the best choice
Need holistic explanation about Rust's cell and reference counted types
Unsafe programmer-brain-time borrow checking
RefCell and Mutex have a (small) amount of runtime overhead. If you've profiled and determined that to be a bottleneck, you can use unsafe code. The usual unsafe caveats apply: it's now up to you, the fallible programmer, to ensure your code doesn't perform any undefined behavior. This means you have to know what is and is not undefined behavior!
use std::cell::UnsafeCell;
fn main() {
let data = UnsafeCell::new(0);
foo(
0,
|n| unsafe { *data.get() == n },
|n| unsafe { *data.get() += n },
);
}
I, another fallible programmer, believe this code to be safe because there will never be temporal mutable aliasing of data. However, that requires knowledge of what foo does. If it called one closure at the same time as the other, this would most likely become undefined behavior.
Additional comments
There's no reason to take references to your generic closure types for the closures
There's no reason to use -> () on the closure type, you can just omit it.

Why does a call to `fn pop(&mut self) -> Result<T, &str>` continue to borrow my data structure?

I am developing some basic data structures to learn the syntax and Rust in general. Here is what I came up with for a stack:
#[allow(dead_code)]
mod stack {
pub struct Stack<T> {
data: Vec<T>,
}
impl<T> Stack<T> {
pub fn new() -> Stack<T> {
return Stack { data: Vec::new() };
}
pub fn pop(&mut self) -> Result<T, &str> {
let len: usize = self.data.len();
if len > 0 {
let idx_to_rmv: usize = len - 1;
let last: T = self.data.remove(idx_to_rmv);
return Result::Ok(last);
} else {
return Result::Err("Empty stack");
}
}
pub fn push(&mut self, elem: T) {
self.data.push(elem);
}
pub fn is_empty(&self) -> bool {
return self.data.len() == 0;
}
}
}
mod stack_tests {
use super::stack::Stack;
#[test]
fn basics() {
let mut s: Stack<i16> = Stack::new();
s.push(16);
s.push(27);
let pop_result = s.pop().expect("");
assert_eq!(s.pop().expect("Empty stack"), 27);
assert_eq!(s.pop().expect("Empty stack"), 16);
let pop_empty_result = s.pop();
match pop_empty_result {
Ok(_) => panic!("Should have had no result"),
Err(_) => {
println!("Empty stack");
}
}
if s.is_empty() {
println!("O");
}
}
}
I get this interesting error:
error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
--> src/main.rs:58:12
|
49 | let pop_empty_result = s.pop();
| - mutable borrow occurs here
...
58 | if s.is_empty() {
| ^ immutable borrow occurs here
...
61 | }
| - mutable borrow ends here
Why can't I just call pop on my mutable struct?
Why does pop borrow the value? If I add a .expect() after it, it is ok, it doesn't trigger that error. I know that is_empty takes an immutable reference, if I switch it to mutable I just get a second mutable borrow.

Your pop function is declared as:
pub fn pop(&mut self) -> Result<T, &str>
Due to lifetime elision, this expands to
pub fn pop<'a>(&'a mut self) -> Result<T, &'a str>
This says that the Result::Err variant is a string that lives as long as the stack you are calling it on. Since the input and output lifetimes are the same, the returned value might be pointing somewhere into the Stack data structure so the returned value must continue to hold the borrow.
If I add a .expect() after it, it is ok, it doesn't trigger that error.
That's because expect consumes the Result, discarding the Err variant without ever putting it into a variable binding. Since that's never stored, the borrow cannot be saved anywhere and it is released.
To solve the problem, you need to have distinct lifetimes between the input reference and output reference. Since you are using a string literal, the easiest solution is to denote that using the 'static lifetime:
pub fn pop(&mut self) -> Result<T, &'static str>
Extra notes:
Don't call return explicitly at the end of the block / method: return Result::Ok(last) => Result::Ok(last).
Result, Result::Ok, and Result::Err are all imported via the prelude, so you don't need to qualify them: Result::Ok(last) => Ok(last).
There's no need to specify types in many cases let len: usize = self.data.len() => let len = self.data.len().

This happens because of lifetimes. When you construct a method which takes a reference the compiler detects that and if no lifetimes are specified it "generates" them:
pub fn pop<'a>(&'a mut self) -> Result<T, &'a str> {
let len: usize = self.data.len();
if len > 0 {
let idx_to_rmv: usize = len - 1;
let last: T = self.data.remove(idx_to_rmv);
return Result::Ok(last);
} else {
return Result::Err("Empty stack");
}
}
This is what compiler sees actually. So, you want to return a static string, then you have to specify the lifetime for a &str explicitly and let the lifetime for the reference to mut self be inferred automatically:
pub fn pop(&mut self) -> Result<T, &'static str> {

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

HashMap borrow issue when trying to implement find or insert - rust

Related

Passing both &mut self and &self to the same function

Borrow a struct field mutably and pass it to mutable methods

Mutable borrow from HashMap and lifetime elision

Can't borrow mutably within two different closures in the same scope

Why does a call to `fn pop(&mut self) -> Result<T, &str>` continue to borrow my data structure?

Categories

Resources