Chapter 13 of the Rust book (2nd edition) contains an example of a simple calculation cache. Cacher takes the calculation function as a constructor param, and will cache the result - after the first call, it will not invoke the function again, but simply return the cached result:
struct Cacher<T>
where
T: Fn(u32) -> u32,
{
calculation: T,
value: Option<u32>,
}
impl<T> Cacher<T>
where
T: Fn(u32) -> u32,
{
fn new(calculation: T) -> Cacher<T> {
Cacher {
calculation,
value: None,
}
}
fn value(&mut self, arg: u32) -> u32 {
match self.value {
Some(v) => v,
None => {
let v = (self.calculation)(arg);
self.value = Some(v);
v
}
}
}
}
It is very limited, and there are a couple of improvement suggestions in the book, left as an exercise for the reader.
So I'm trying to make it cache multiple values. The Cacher holds a HashMap with result values, instead of just one value. When asked for a value, if it has in the map (cache), return it. Otherwise, calculate it, store it the cache, and then return it.
The cacher takes references, because it doesn't want to own the inputs. When using it (see unit test), I'm borrowing, because the cacher owns the results.
Here's my attempt:
use std::collections::HashMap;
struct Cacher<T>
where
T: Fn(&u32) -> u32,
{
calculation: T,
values: HashMap<u32, u32>,
}
impl<T> Cacher<T>
where
T: Fn(&u32) -> u32,
{
fn new(calculation: T) -> Cacher<T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value(&mut self, arg: u32) -> &u32 {
let values = &mut self.values;
match values.get(&arg) {
Some(v) => &v,
None => {
let v = (self.calculation)(&arg);
values.insert(arg, v);
&values.get(&arg).unwrap()
}
}
}
}
#[test]
fn call_with_different_values() {
let mut c = Cacher::new(|a| a + 1);
let v1 = c.value(1);
assert_eq!(*v1, 2);
let v2 = c.value(2);
assert_eq!(*v2, 3);
}
Compiler output:
22 | fn value(&mut self, arg: u32) -> &u32 {
| - let's call the lifetime of this reference `'1`
23 | let values = &mut self.values;
24 | match values.get(&arg) {
| ------ immutable borrow occurs here
25 | Some(v) => &v,
| -- returning this value requires that `*values` is borrowed for `'1`
...
28 | values.insert(arg, v);
| ^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
I'm borrowing self.values as mutable on line 23. However, when I try to use it on the following line, I get the "immutable borrow" error. How is match values.get(arg) an immutable borrow of values ? Didn't the borrowing already happen ?
There is also the error for line 25. From my understanding of lifetime elision rules, the 3rd one should apply here - we have &mut self as a method parameter, so it's lifetime should be automatically assigned to the return value ?
I do not think that you are going to be happy with that signature:
fn value(&mut self, arg: u32) -> &u32
Without lifetime elision this is read as:
fn value(&'a mut self, arg: u32) -> &'a u32
Even if you implement it correctly this has huge implications. For example you can not call value again as long as old results are still in use. And rightly so, after all nothing is preventing the function body to delete old values from the cache.
Better to follow #hellow advice and make the return type u32.
Another misunderstanding: Just because you borrowed values already, does not mean you cannot borrow from it again.
Now to answer your original question:
The compiler is not lying to you values.get(arg) is indeed an immutable borrow of values. The technical explanation is that the signature of that method call is (simplified) get(&self, k: &Q) -> Option<&V>. So the borrow of self (aka values) is valid as long as &V can still be referenced. &V however needs to be valid for the entire function body at least (so it can be returned). Now you tried to mutably borrow in the None case which means &V never existed in the first place. So if the compiler gets smarter it may allow your code to run, but for now (1.34.0) it isn't.
Related
I'm having difficulties understanding lifetime parameters in the following code snippet.
struct C {
data: Vec<u32>,
cols: usize
}
trait M<'s> {
fn get(&'s self, r: usize, c: usize) -> u32;
fn get_mut(&'s mut self, r: usize, c: usize) -> &'s mut u32;
}
impl<'s> M<'s> for C {
fn get(&'s self, r: usize, c: usize) -> u32 {
return self.data[self.cols*r+c];
}
fn get_mut(&'s mut self, r: usize, c: usize) -> &'s mut u32 {
return &mut self.data[self.cols*r+c];
}
}
#[cfg(test)]
mod tests {
use super::*;
fn create() -> C {
let data = vec![0u32,1u32,2u32,3u32,4u32,5u32];
return C{data, cols: 3};
}
fn select<'s, 'r: 's>(data: &'r mut dyn M<'s>) {
let mut _val: u32 = 0;
for r in 0..2 {
for c in 0..3 {
_val += *data.get_mut(r,c);
}
}
}
#[test]
fn test_select() {
let mut data = create();
select(&mut data);
}
}
The code snippet does not compile, because it complains that *data is borrowed multiple times in the function fn select<'s, 'r: 's>(data: &'r mut dyn M<'s>) {} when calling get_mut (once in every loop iteration). Even safeguarding the questionable line with curly braces (and thus creating a new context) does not help. My expectation (in both cases) would be, that the mutable borrow of &mut data should end right after the execution of that line.
On the other hand, when I remove all lifetime parameters, everything works as expected.
Can anyone explain what's the difference between the two versions (with and without explicit lifetimes)?
I've also tried to find information about additional lifetime parameters for traits, in particular specifying their meaning, but I have found none. So I assume, that they are just a declaration of the used labels inside the trait. But if that is so, then I would assume that leaving out the lifetime parameters completely and applying the eliding rules would lead to the same result.
There are two things to consider. The first is when you use a generic lifetime for a function, that lifetime must be larger than the life of the function call simply by construction. And the second is since the lifetime self is tied to the lifetime parameter of the trait, when you call .get_mut(), data is borrowed for the lifetime of 's. Combining those two principles, data is borrowed for longer than the function call so you can't call it again (its already mutably borrowed).
On the other hand, when I remove all lifetime parameters, everything works as expected. Can anyone explain what's the difference between the two versions (with and without explicit lifetimes)?
Without a generic lifetime on M, the methods will behave as if defined as so:
impl M for C {
fn get<'a>(&'a self, r: usize, c: usize) -> u32 {
return self.data[self.cols * r + c];
}
fn get_mut<'a>(&'a mut self, r: usize, c: usize) -> &'a mut u32 {
return &mut self.data[self.cols * r + c];
}
}
Thus there is no lifetime associated with the trait; the lifetimes given and returned from the function are generic only to those method calls. And since the compiler can choose a new lifetime 'a for each call and it will always pick the shorted lifetime to satisfy its usage, you can then call data.get_mut() multiple times without worry. And I'll be honest, having the lifetime on the trait didn't make much sense with the original code; as mentioned, the code works with all lifetime annotations removed: playground.
Stripped down to the bare essentials, my problematic code looks as follows:
pub struct Item;
impl Item {
/// Partial copy. Not the same as simple assignment.
pub fn copy_from(&mut self, _other: &Item) {
}
}
pub struct Container {
items: Vec<Item>,
}
impl Container {
pub fn copy_from(&mut self, self_idx: usize, other: &Container, other_idx: usize) {
self.items[self_idx].copy_from(&other.items[other_idx]);
}
}
fn main() {
let mut container = Container { items: vec![Item, Item] };
container.copy_from(0, &container, 1);
}
This is of course rejected by the borrow checker:
error[E0502]: cannot borrow `container` as mutable because it is also borrowed as immutable
--> src/main.rs:21:5
|
21 | container.copy_from(0, &container, 1);
| ^^^^^^^^^^---------^^^^----------^^^^
| | | |
| | | immutable borrow occurs here
| | immutable borrow later used by call
| mutable borrow occurs here
For more information about this error, try `rustc --explain E0502`.
I understand why that happens, but I don't have a good solution.
I've considered adding a dedicated copy_from_self function that callers need to use in cases where self == other:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
if to_idx != from_idx {
unsafe {
let from_item: *const Item = &self.items[from_idx];
self.items[to_idx].copy_from(&*from_item);
}
}
}
But this is un-ergonomic, bloats the API surface, and needs unsafe code inside.
Note that in reality, the internal items data structure is not a simple Vec, so any approach specific to Vec or slice will not work.
Is there an elegant, idiomatic solution to this problem?
If I understand the comments on the question correctly, a general solution seems to be impossible, so this answer is necessarily specific to my actual situation.
As mentioned, the actual data structure is not a Vec. If it were a Vec, we could use split_at_mut to at least implement copy_from_self safely.
But as it happens, my actual data structure is backed by a Vec, so I was able to add a helper function:
/// Returns a pair of mutable references to different items. Useful if you need to pass
/// a reference to one item to a function that takes `&mut self` on another item.
/// Panics if `a == b`.
fn get_mut_2(&mut self, a: usize, b: usize) -> (&mut T, &mut T) {
assert!(a != b);
if a < b {
let (first, second) = self.items.split_at_mut(b);
(&mut first[a], &mut second[0])
} else if a > b {
let (first, second) = self.items.split_at_mut(a);
(&mut second[0], &mut first[b])
} else {
panic!("cannot call get_mut_2 with the same index {} == {}", a, b);
}
}
Now we can implement copy_from_self without unsafe code:
pub fn copy_from_self(&mut self, to_idx: usize, from_idx: usize) {
let (to, from) = self.items.get_mut_2(to_idx, from_idx);
to.unwrap().copy_from(from.unwrap());
}
I am trying to write a small wrapper around VecDeque.
Specifically I have the code (playground):
use std::collections::VecDeque;
trait VecCircleTraits<T: Eq> {
fn new() -> VecCircle<T>;
fn find_and_remove(&self, _: T) -> Option<T>;
}
#[derive(Debug)]
struct VecCircle<T: Eq>(VecDeque<T>);
impl<T: Eq> VecCircleTraits<T> for VecCircle<T> {
fn new() -> VecCircle<T> {
return VecCircle(VecDeque::<T>::new());
}
fn find_and_remove(&self, key: T) -> Option<T> {
let search_index: Option<usize> = self.0.into_iter().position(|x| x == key); //error 1
if let Some(index) = search_index {
return self.0.remove(index); // error 2
} else {
return None;
}
}
}
Which gives me the following errors:
error: cannot borrow immutable anonymous field `self.0` as mutable
--> <anon>:20:20
|>
20 |> return self.0.remove(index); // error 2
|> ^^^^^^
error: cannot move out of borrowed content [--explain E0507]
--> <anon>:18:44
|>
18 |> let search_index: Option<usize> = self.0.into_iter().position(|x| x == key); //error 1
|> ^^^^ cannot move out of borrowed content
However, I am little confused as who has ownership over self.0? If I am understanding the docs correctly, wouldn't the memory region be bounded to self.0 and therefore giving it the ownership? Sorry for the shallow logic there but I am still trying to understand the ownership system.
In find_and_remove, you specified &self in the parameter list. This means that the method will receive a borrowed pointer to self; i.e. the type of self is &VecCircle<T>. Therefore, the method doesn't have ownership of the VecCircle<T>.
find_and_remove tries to call into_iter on a VecDeque, and into_iter receives its argument by value (self rather than &self or &mut self). Because of this, Rust interprets self.0 as trying to move the VecDeque out of the VecCircle. However, that's not allowed as you can't move anything out of borrowed content, as moving from some location makes that location invalid. But we can't just tell the caller "Hey, I just invalidated self, stop using it!"; if we wanted to do that, we'd have to specify self in the parameter list, rather than &self.
But that's not what you're trying to do here. into_iter would take ownership of the VecDeque and therefore destroy it. There are other ways to obtain an iterator for the VecDeque without destroying it. Here, we should use iter, which takes &self.
Then, find_and_remove tries to call remove. remove takes &mut self, i.e. a mutable reference to a VecDeque. However, we can't borrow self.0 as mutable, because self is not itself a mutable borrow. We can't just upgrade an immutable borrow to a mutable borrow: it is invalid to have both an immutable borrow and a mutable borrow usable at the same time. The solution here is to change &self to &mut self in the parameter list.
use std::collections::VecDeque;
trait VecCircleTraits<T: Eq> {
fn new() -> VecCircle<T>;
fn find_and_remove(&mut self, _: &T) -> Option<T>;
}
#[derive(Debug)]
struct VecCircle<T: Eq>(VecDeque<T>);
impl<T: Eq> VecCircleTraits<T> for VecCircle<T> {
fn new() -> VecCircle<T> {
return VecCircle(VecDeque::<T>::new());
}
fn find_and_remove(&mut self, key: &T) -> Option<T> {
let search_index: Option<usize> = self.0.iter().position(|x| x == key);
if let Some(index) = search_index {
self.0.remove(index)
} else {
None
}
}
}
Note: I also changed the key parameter to &T to solve another error, this time in the closure passed to position. Since iter iterates over references to the items in the VecDeque, position passes references to the closure. Since find_and_remove doesn't actually need to take ownership of the key, it should just receive an immutable borrow to it, so that both x and key are of type &T and thus we can apply == to them.
No, you don't have ownership of the VecCircle inside the find_and_remove method. All you need to know is in the function definition:
impl<T: Eq> VecCircleTraits<T> for VecCircle<T> {
fn find_and_remove(&self, key: T) -> Option<T>
}
This means that you are borrowing a reference to VecCircle. The longer way to write this would be
fn find_and_remove(self: &VecCircle, key: T) -> Option<T>
Perhaps that is more evident?
Since you don't have ownership of self, you cannot have ownership of self.0.
I've this snippet that doesn't pass the borrow checker:
use std::collections::HashMap;
enum Error {
FunctionNotFound,
}
#[derive(Copy, Clone)]
struct Function<'a> {
name: &'a str,
code: &'a [u32],
}
struct Context<'a> {
program: HashMap<&'a str, Function<'a>>,
call_stack: Vec<Function<'a>>,
}
impl<'a> Context<'a> {
fn get_function(&'a mut self, fun_name: &'a str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&'a mut self, fun_name: &'a str) -> Result<(), Error> {
let fun = try!(self.get_function(fun_name));
self.call_stack.push(fun);
Ok(())
}
}
fn main() {}
error[E0499]: cannot borrow `self.call_stack` as mutable more than once at a time
--> src/main.rs:29:9
|
27 | let fun = try!(self.get_function(fun_name));
| ---- first mutable borrow occurs here
28 |
29 | self.call_stack.push(fun);
| ^^^^^^^^^^^^^^^ second mutable borrow occurs here
...
32 | }
| - first borrow ends here
My gut feeling is that the problem is tied to the fact that HashMap returns either None or a reference of the value inside the data structure. But I don't want that: my intention is that self.get_function should return either a byte copy of the stored value or an error (that's why I put .map(|f| *f), and Function is Copy).
Changing &'a mut self to something else doesn't help.
However, the following snippet, somewhat similar in spirit, is accepted:
#[derive(Debug)]
enum Error {
StackUnderflow,
}
struct Context {
stack: Vec<u32>,
}
impl Context {
fn pop(&mut self) -> Result<u32, Error> {
self.stack.pop().ok_or(Error::StackUnderflow)
}
fn add(&mut self) -> Result<(), Error> {
let a = try!(self.pop());
let b = try!(self.pop());
self.stack.push(a + b);
Ok(())
}
}
fn main() {
let mut a = Context { stack: vec![1, 2] };
a.add().unwrap();
println!("{:?}", a.stack);
}
Now I'm confused. What is the problem with the first snippet? Why doesn't it happen in the second?
The snippets are part of a larger piece of code. In order to provide more context, this on the Rust Playground shows a more complete example with the faulty code, and this shows an earlier version without HashMap, which passes the borrow checker and runs normally.
You have fallen into the lifetime-trap. Adding the same lifetime to more references will constrain your program more. Adding more lifetimes and giving each reference the minimal possible lifetime will permit more programs. As #o11c notes, removing the constraints to the 'a lifetime will solve your issue.
impl<'a> Context<'a> {
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&mut self, fun_name: &str) -> Result<(), Error> {
let fun = try!(self.get_function(fun_name));
self.call_stack.push(fun);
Ok(())
}
}
The reason this works is that Rust inserts new lifetimes, so in the compiler your function's signatures will look like this:
fn get_function<'b>(&'b mut self, fun_name: &'b str) -> Result<Function<'a>, Error>
fn call<'b>(&'b mut self, fun_name: &'b str) -> Result<(), Error>
Always try to not use any lifetimes and let the compiler be smart. If that fails, don't spray lifetimes everywhere, think about where you want to pass ownership, and where you want to limit the lifetime of a reference.
You only need to remove unnecessary lifetime qualifiers in order for your code to compile:
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> { ... }
fn call(&mut self, fun_name: &str) -> Result<(), Error> { ... }
Your problem was that you tied the lifetime of &mut self and the lifetime of the value stored in it (Function<'a>), which is in most cases unnecessary. With this dependency which was present in get_function() definition, the compiler had to assume that the result of the call self.get_function(...) borrows self, and hence it prohibits you from borrowing it again.
Lifetime on &str argument is also unnecessary - it just limits the possible set of argument values for no reason. Your key can be a string with arbitrary lifetime, not just 'a.
I tried to implement own analogue of find_or_insert method that looks like this:
use std::collections::HashMap;
pub struct SomeManager {
next: i32,
types: HashMap<i32, i32>,
}
impl SomeManager {
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
match self.types.get(&k) {
Some(ref x) => return *x,
None => {
self.types.insert(k, self.next);
self.next += 1;
return self.types.get(&k).unwrap();
}
}
}
}
fn main() {}
Error:
error[E0502]: cannot borrow `self.types` as mutable because it is also borrowed as immutable
--> src/main.rs:13:17
|
10 | match self.types.get(&k) {
| ---------- immutable borrow occurs here
...
13 | self.types.insert(k, self.next);
| ^^^^^^^^^^ mutable borrow occurs here
...
18 | }
| - immutable borrow ends here
I know that there are some standard methods that implement this functionality, but I want this method to be as light as possible - it will be called very-very often and almost all of the time the values will already exist.
As I understand it, when we call self.types.get we borrow it to scope of match statement, so we can't call self.types.insert here. I have tried to move methods from None branch out of match statement, but it also fails.
The only working solution that I found requires invoking get twice:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let is_none = match self.types.get(&k) {
Some(ref x) => false,
None => true,
};
if is_none {
self.types.insert(k, self.next);
self.next += 1;
}
self.types.get(&k).unwrap()
}
How can I work around such situations?
There are a handful of methods on HashMap to achieve these sorts of complex cases. Most notably, for your case, HashMap::entry and Entry::or_insert_with:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
self.types.entry(k).or_insert_with(|| {
let value = self.next;
self.next += 1;
value
})
}
In your case, however, there’s the borrow of self inside, so that won’t do.
We thus shift the borrow of self.next outside of the closure so the compiler can reason about it as disjoint from self.types. Problem solved with only one lookup, as it should be.
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let next = &mut self.next;
self.types.entry(k).or_insert_with(|| {
let value = *next;
*next += 1;
value
})
}
Note that in your first case you're doing one lookup when the key exists in the map and three when it does not exist. Your last attempt does two lookups in either case. This is somewhat prettified version of the latter:
pub fn get_type<'a>(&'a mut self, k: i32) -> &'a i32 {
let contains = self.types.contains_key(&k);
if !contains {
self.types.insert(k, self.next);
self.next += 1;
}
self.types.get(&k).unwrap()
}
I don't think it is possible to avoid the second lookup without some support from the map's implementation because of borrowing restrictions.
In any case, using the solution by Chris Morgan is superior to the above one (for example, it may be more efficient and in fact require less lookups), so I suggest sticking with it.