What is the difference between Rc<RefCell<T>> and RefCell<Rc<T>>?

The Rust documentation covers Rc<RefCell<T>> pretty extensively but doesn't go into RefCell<Rc<T>>, which I am now encountering.
Do these effectively give the same result? Is there an important difference between them?

Do these effectively give the same result?
They are very different.
Rc is a pointer with shared ownership while RefCell provides interior mutability. The order in which they are composed makes a big difference to how they can be used.
Usually, you compose them as Rc<RefCell<T>>; the whole thing is shared and each shared owner gets to mutate the contents. The effect of mutating the contents will be seen by all of the shared owners of the outer Rc because the inner data is shared.
You can't share a RefCell<Rc<T>> except by reference, so this configuration is more limited in how it can be used. In order to mutate the inner data, you would need to mutably borrow from the outer RefCell, but then you'd have access to an immutable Rc. The only way to mutate it would be to replace it with a completely different Rc. For example:
let a = Rc::new(1);
let b = Rc::new(2);
let c = RefCell::new(Rc::clone(&a));
let d = RefCell::new(Rc::clone(&a));
*d.borrow_mut() = Rc::clone(&b); // this doesn't affect c
There is no way to mutate the values in a and b. This seems far less useful than Rc<RefCell<T>>.
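To make the contrast concrete, here is a minimal sketch of both compositions. The function names shared_then_mutated and handle_replaced are made up for illustration; only Rc and RefCell come from the standard library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Rc<RefCell<T>>: every clone of the Rc points at the same RefCell,
// so a mutation through one handle is visible through the others.
fn shared_then_mutated() -> (i32, i32) {
    let first = Rc::new(RefCell::new(1));
    let second = Rc::clone(&first);
    *second.borrow_mut() += 10;
    let seen_by_first = *first.borrow();
    let seen_by_second = *second.borrow();
    (seen_by_first, seen_by_second)
}

// RefCell<Rc<T>>: each RefCell owns its own Rc handle. Replacing the
// handle in one cell touches neither the other cell nor the shared value.
fn handle_replaced() -> (i32, i32, i32) {
    let a = Rc::new(1);
    let b = Rc::new(2);
    let c = RefCell::new(Rc::clone(&a));
    let d = RefCell::new(Rc::clone(&a));
    *d.borrow_mut() = Rc::clone(&b);
    let in_c = **c.borrow();
    let in_d = **d.borrow();
    (in_c, in_d, *a)
}

fn main() {
    println!("{:?}", shared_then_mutated());
    println!("{:?}", handle_replaced());
}
```

In the first function both handles observe the update; in the second, only the cell whose handle was swapped sees a different value, and the value behind a is unchanged.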

Related

Erroneous mutable borrow (E0502) when trying to remove and insert into a HashMap

I am a beginner to Rust and tried using a HashMap<u64, u64>. I want to remove an element and insert it with a modified value:
let mut r = HashMap::new();
let mut i = 2;
...
if r.contains_key(&i) {
let v = r.get(&i).unwrap();
r.remove(&i);
r.insert(i, v+1);
}
Now, the borrow checker complains that r is borrowed immutably, then mutably, and then immutably again in the three lines of the if-block.
I don't understand what's going on...I guess since the get, remove and insert methods have r as implicit argument, it is borrowed in the three calls. But why is it a problem that this borrow in the remove call is mutable?
But why is it a problem that this borrow in the remove call is mutable?
The problem is the overlap: Rust allows either any number of immutable borrows or a single mutable borrow at a time, and the two kinds cannot overlap.
The issue here is that v is a reference to the map contents, meaning the existence of v requires borrowing the map until v stops being used. That borrow overlaps with both the remove and insert calls, and so forbids them.
Now there are various ways to fix this. Since in this specific case you're using u64 which is Copy, you can just dereference and it'll copy the value you got from the map, removing the need for a borrow:
if r.contains_key(&i) {
let v = *r.get(&i).unwrap();
r.remove(&i);
r.insert(i, v+1);
}
This is limited in its flexibility, though, as it only works for Copy types[0].
In this specific case it probably doesn't matter that much, because Copy is cheap, but it would still make more sense to use the advanced APIs Rust provides, for safety, for clarity, and because you'll eventually need them for less trivial types.
The simplest is to just use get_mut: where get returns an Option<&V>, get_mut returns an Option<&mut V>, meaning you can... update the value in-place, you don't need to get it out, and you don't need to insert it back in (nor do you need a separate lookup but you already didn't need that really):
if let Some(v) = r.get_mut(&i) {
*v += 1;
}
That is more than sufficient for your use case.
The second option is the Entry API, and the thing which will ruin every other hashmap API for you forever. I'm not joking, every other language becomes ridiculously frustrating, you may want to avoid clicking on that link (though you will eventually need to learn about it anyway, as it solves real borrowing and efficiency issues).
It doesn't really show its stuff here because your use case is simple and get_mut more than does the job, but anyway, you could write the increment as:
r.entry(i).and_modify(|v| *v+=1);
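As a sketch of where the Entry API starts to pay off: and_modify alone does nothing when the key is absent, while chaining or_insert covers the "insert if missing" case in the same lookup. The helper names bump_existing and bump_or_insert are hypothetical.

```rust
use std::collections::HashMap;

// Increment the value for `key` only if it is already present.
fn bump_existing(map: &mut HashMap<u64, u64>, key: u64) {
    map.entry(key).and_modify(|v| *v += 1);
}

// Increment the value for `key`, or insert 1 if the key was absent.
fn bump_or_insert(map: &mut HashMap<u64, u64>, key: u64) {
    map.entry(key).and_modify(|v| *v += 1).or_insert(1);
}

fn main() {
    let mut r = HashMap::new();
    r.insert(2, 10);
    bump_existing(&mut r, 2); // 10 becomes 11
    bump_existing(&mut r, 3); // key 3 is absent, so nothing happens
    bump_or_insert(&mut r, 3); // key 3 is inserted with value 1
    println!("{:?} {:?}", r.get(&2), r.get(&3));
}
```

Both helpers hash the key once, which is the efficiency point: there is no separate contains_key or get before the update.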
Incidentally in most languages (and certainly in Rust as well) when you insert an item in a hashmap, the old value gets evicted if there was one. So the remove call was already redundant and wholly unnecessary.
And pattern-matching an Option (such as that returned by HashMap::get) is generally safer, cleaner, and faster than painstakingly and procedurally doing all the low-level bits.
So even without using advanced APIs, the original code can be simplified to:
if let Some(&v) = r.get(&i) {
r.insert(i, v+1);
}
I'd still recommend the get_mut version over that as it is simpler, avoids the double lookup, and works on non-Copy types, but YMMV.
Also, unlike most languages, Rust's HashMap::insert returns the old value (if any). That is not a concern here but can be useful in some cases.
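A small sketch of that return value, using a hypothetical wrapper named replace_value: the first insert for a key yields None, and a later insert for the same key yields the evicted value.

```rust
use std::collections::HashMap;

// Overwrite the value for `key` and report what was stored before, if anything.
fn replace_value(map: &mut HashMap<u64, u64>, key: u64, value: u64) -> Option<u64> {
    map.insert(key, value)
}

fn main() {
    let mut r = HashMap::new();
    println!("{:?}", replace_value(&mut r, 2, 10)); // nothing was stored yet
    println!("{:?}", replace_value(&mut r, 2, 11)); // the old value is handed back
}
```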
[0] as well as Clone ones, by explicitly calling .clone(), that may or may not translate to a significant performance impact depending on the type you're cloning.
The problem is that you keep an immutable reference when getting v. Since it is a u64, just clone (or copy) the value explicitly so that no reference is involved anymore:
let v = r.get(&i).unwrap().clone();

Understanding usage of Rc<RefCell<SomeStruct>> in Rust

I'm looking at some code that uses
Rc<RefCell<SomeStruct>>
So I went out to read about the differences between Rc and RefCell:
Here is a recap of the reasons to choose Box, Rc, or RefCell:
Rc enables multiple owners of the same data; Box and RefCell have single owners.
Box allows immutable or mutable borrows checked at compile time; Rc allows only immutable borrows checked at compile time; RefCell allows immutable or mutable borrows checked at runtime.
Because RefCell allows mutable borrows checked at runtime, you can mutate the value inside the RefCell even when the RefCell is immutable.
So, Rc makes sure that SomeStruct is accessible by many people at the same time. But how do I access? I only see the get_mut method, which returns a mutable reference. But the text explained that "Rc allows only immutable borrows".
If it's possible to access Rc's object in mut and not mut way, why a RefCell is needed?
So, Rc makes sure that SomeStruct is accessible by many people at the same time. But how do I access?
By dereferencing. If you have a variable x of type Rc<...>, you can access the inner value using *x. In many cases this happens implicitly; for example you can call methods on x simply with x.method(...).
I only see the get_mut method, which returns a mutable reference. But the text explained that "Rc allows only immutable borrows".
The get_mut() method is probably more recent than the explanation stating that Rc only allows immutable borrows. Moreover, it only returns a mutable borrow if there currently is only a single owner of the inner value, i.e. if you currently wouldn't need Rc in the first place. As soon as there are multiple owners, get_mut() will return None.
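The behaviour of Rc::get_mut can be sketched as follows. The helper try_increment is a made-up name; Rc::get_mut, Rc::clone, and drop are the standard library items.

```rust
use std::rc::Rc;

// Try to bump the value in place. This succeeds only while `rc` is the
// sole owner, because that is when Rc::get_mut returns Some.
fn try_increment(rc: &mut Rc<i32>) -> bool {
    match Rc::get_mut(rc) {
        Some(value) => {
            *value += 1;
            true
        }
        None => false,
    }
}

fn main() {
    let mut x = Rc::new(5);
    assert!(try_increment(&mut x)); // single owner: mutation allowed
    let y = Rc::clone(&x);
    assert!(!try_increment(&mut x)); // two owners: get_mut returns None
    drop(y);
    assert!(try_increment(&mut x)); // back to a single owner
    println!("{}", x);
}
```

Note that get_mut is an associated function, called as Rc::get_mut(&mut x) rather than x.get_mut(), so it does not shadow a method on the inner type.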
If it's possible to access Rc's object in mut and not mut way, why a RefCell is needed?
RefCell will allow you to get mutable access even when multiple owners exist, and even if you only hold a shared reference to the RefCell. It will dynamically check at runtime that only a single mutable reference exists at any given time, and it will panic if you request a second, concurrent one (or return an error for the try_borrow methods, respectively). This functionality is not offered by Rc.
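A minimal sketch of that runtime check, using try_borrow_mut so that the conflict shows up as an error value rather than a panic. The function names are hypothetical.

```rust
use std::cell::RefCell;

// Reports whether a second mutable borrow is granted while the first
// one is still alive. With RefCell this is decided at runtime.
fn second_borrow_while_first_alive(cell: &RefCell<Vec<i32>>) -> bool {
    let _first = cell.borrow_mut();
    let second_ok = cell.try_borrow_mut().is_ok();
    second_ok
}

// Sequential borrows are fine: each guard is dropped at the end of
// its statement before the next one is requested.
fn push_twice(cell: &RefCell<Vec<i32>>) {
    cell.borrow_mut().push(1);
    cell.borrow_mut().push(2);
}

fn main() {
    let cell = RefCell::new(Vec::new());
    println!("{}", second_borrow_while_first_alive(&cell));
    push_twice(&cell);
    println!("{:?}", cell.borrow());
}
```

Replacing try_borrow_mut with borrow_mut in the first function would panic instead of returning an error.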
So in summary, Rc gives you shared ownership. The inner value has multiple owners, and reference counting makes sure the data stays alive as long as at least one owner still holds onto it. This is useful if your data doesn't have a clear single owner. RefCell gives you interior mutability, i.e. you can borrow the inner value dynamically at runtime, and modify it even with a shared reference. The combination Rc<RefCell<...>> gives you both: a value with multiple owners that can be borrowed mutably by any one of the owners.
For further details, you can read the relevant chapters of the Rust book:
Rc<T>, the Reference Counted Smart Pointer
RefCell<T> and the Interior Mutability Pattern
If it's possible to access Rc's object in mut and not mut way, why a RefCell is needed?
The Rc pointer gives you shared ownership. Since ownership is shared, the value owned by an Rc is immutable.
The RefCell smart pointer represents single ownership over the data it holds, much like the Box smart pointer. The difference is that Box enforces the borrowing rules at compile time, whereas RefCell enforces them at run time.
If you combine them, you get a smart pointer that can have multiple owners, any of which can mutate the value (one at a time). A good use case is a doubly linked list in Rust.
use std::cell::RefCell;
use std::rc::Rc;
struct LinkedList<T> {
    head: Pointer<T>,
    tail: Pointer<T>,
}
struct Node<T> {
    element: T,
    next: Pointer<T>,
    prev: Pointer<T>,
}
// We need multiple owners who can mutate the data.
// It is an Option because "end.next" would be None.
type Pointer<T> = Option<Rc<RefCell<Node<T>>>>;
Picture three nodes, "front", "middle", and "end": "front" and "end" both point to "middle", and both can mutate it. If you need to insert a new node after "front", you have to mutate "front.next". So in a doubly linked list you need multiple ownership and mutability at the same time.
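The three-node scenario can be sketched as below. The helpers new_node, link, and collect_forward are hypothetical names introduced for illustration, and this is a sketch rather than a complete list implementation.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Pointer<T> = Option<Rc<RefCell<Node<T>>>>;

struct Node<T> {
    element: T,
    next: Pointer<T>,
    prev: Pointer<T>,
}

// Hypothetical helper: allocate a node that is not linked to anything yet.
fn new_node<T>(element: T) -> Rc<RefCell<Node<T>>> {
    Rc::new(RefCell::new(Node { element, next: None, prev: None }))
}

// Hypothetical helper: place `back` directly after `front`.
fn link<T>(front: &Rc<RefCell<Node<T>>>, back: &Rc<RefCell<Node<T>>>) {
    front.borrow_mut().next = Some(Rc::clone(back));
    back.borrow_mut().prev = Some(Rc::clone(front));
}

// Walk forward from `start`, collecting clones of the elements.
fn collect_forward<T: Clone>(start: &Rc<RefCell<Node<T>>>) -> Vec<T> {
    let mut out = Vec::new();
    let mut current = Some(Rc::clone(start));
    while let Some(node) = current {
        out.push(node.borrow().element.clone());
        current = node.borrow().next.clone();
    }
    out
}

fn main() {
    let front = new_node(1);
    let middle = new_node(2);
    let end = new_node(3);
    link(&front, &middle);
    link(&middle, &end);

    // Mutate "middle" through front.next ...
    if let Some(node) = &front.borrow().next {
        node.borrow_mut().element = 20;
    }
    // ... and again through end.prev.
    if let Some(node) = &end.borrow().prev {
        node.borrow_mut().element += 1;
    }
    println!("{:?}", collect_forward(&front));
}
```

One caveat with this layout: next and prev form a cycle of strong references, so the nodes are never freed automatically. Using std::rc::Weak for prev is a common way to avoid that.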

How to modify private mutable state when the trait dictates a non-mutable self reference? [duplicate]

When would you be required to use Cell or RefCell? It seems like there are many other type choices that would be suitable in place of these, and the documentation warns that using RefCell is a bit of a "last resort".
Is using these types a "code smell"? Can anyone show an example where using these types makes more sense than using another type, such as Rc or even Box?
It is not entirely correct to ask when Cell or RefCell should be used over Box and Rc because these types solve different problems. Indeed, more often than not RefCell is used together with Rc in order to provide mutability with shared ownership. So yes, use cases for Cell and RefCell are entirely dependent on the mutability requirements in your code.
Interior and exterior mutability are very nicely explained in the official Rust book, in the designated chapter on mutability. External mutability is very closely tied to the ownership model, and mostly when we say that something is mutable or immutable we mean exactly the external mutability. Another name for external mutability is inherited mutability, which probably explains the concept more clearly: this kind of mutability is defined by the owner of the data and inherited to everything you can reach from the owner. For example, if your variable of a structural type is mutable, so are all fields of the structure in the variable:
struct Point { x: u32, y: u32 }
// the variable is mutable...
let mut p = Point { x: 10, y: 20 };
// ...and so are fields reachable through this variable
p.x = 11;
p.y = 22;
let q = Point { x: 10, y: 20 };
q.x = 33; // compilation error
Inherited mutability also defines which kinds of references you can get out of the value:
{
let px: &u32 = &p.x; // okay
}
{
let py: &mut u32 = &mut p.y; // okay, because p is mut
}
{
let qx: &u32 = &q.x; // okay
}
{
let qy: &mut u32 = &mut q.y; // compilation error since q is not mut
}
Sometimes, however, inherited mutability is not enough. The canonical example is the reference-counted pointer, called Rc in Rust. The following code is entirely valid:
{
let x1: Rc<u32> = Rc::new(1);
let x2: Rc<u32> = x1.clone(); // create another reference to the same data
let x3: Rc<u32> = x2.clone(); // even another
} // here all references are destroyed and the memory they were pointing at is deallocated
At first glance it is not clear how mutability is related to this, but recall that reference-counted pointers are called so because they contain an internal reference counter which is modified when a reference is duplicated (clone() in Rust) and destroyed (goes out of scope in Rust). Hence Rc has to modify itself even though it is stored inside a non-mut variable.
This is achieved via internal mutability. There are special types in the standard library, the most basic of them being UnsafeCell, which allow one to work around the rules of external mutability and mutate something even if it is stored (transitively) in a non-mut variable.
Another way to say that something has internal mutability is that this something can be modified through a &-reference - that is, if you have a value of type &T and you can modify the state of T which it points at, then T has internal mutability.
For example, Cell can contain Copy data and it can be mutated even if it is stored in non-mut location:
let c: Cell<u32> = Cell::new(1);
c.set(2);
assert_eq!(c.get(), 2);
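The same mechanism covers the case where a trait method only receives &self but the implementation wants to update private state. A minimal sketch, assuming a hypothetical Greeter trait and CountingGreeter type (neither exists in the standard library):

```rust
use std::cell::Cell;

// Hypothetical trait whose method only receives `&self`.
trait Greeter {
    fn greet(&self) -> String;
}

// Hypothetical type that counts how often it has been called.
struct CountingGreeter {
    calls: Cell<u32>,
}

impl Greeter for CountingGreeter {
    fn greet(&self) -> String {
        // Cell::get and Cell::set both take `&self`, so no `&mut self` is needed.
        self.calls.set(self.calls.get() + 1);
        format!("hello #{}", self.calls.get())
    }
}

fn main() {
    let g = CountingGreeter { calls: Cell::new(0) }; // not declared `mut`
    println!("{}", g.greet());
    println!("{}", g.greet());
}
```

Cell works here because u32 is Copy; for non-Copy state, RefCell would be the usual substitute.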
RefCell can contain non-Copy data and it can give you &mut pointers to its contained value, and absence of aliasing is checked at runtime. This is all explained in detail on their documentation pages.
As it turns out, in an overwhelming number of situations you can easily go with external mutability only. Most existing high-level code in Rust is written that way. Sometimes, however, internal mutability is unavoidable or makes the code much clearer. One example, the Rc implementation, is already described above. Another is when you need shared mutable ownership (that is, you need to access and modify the same value from different parts of your code); this is usually achieved via Rc<RefCell<T>>, because it can't be done with references alone. Yet another example is Arc<Mutex<T>>, Mutex being another type for internal mutability which is also safe to use across threads.
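For the thread-safe variant, a minimal sketch of Arc<Mutex<T>> might look like this. The function parallel_count and its parameters are made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawn `threads` workers that each add 1 to a shared counter
// `per_thread` times, then return the final total.
fn parallel_count(threads: usize, per_thread: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own Arc handle to the same Mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The lock guard is released at the end of the statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

Where RefCell panics on a conflicting borrow, Mutex blocks the calling thread until the lock is free, which is what makes the pattern usable across threads.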
So, as you can see, Cell and RefCell are not replacements for Rc or Box; they solve the task of providing you mutability somewhere where it is not allowed by default. You can write your code without using them at all; and if you get into a situation when you would need them, you will know it.
Cells and RefCells are not a code smell; the only reason why they are described as a "last resort" is that they move the task of checking mutability and aliasing rules from the compiler to the runtime code. Take RefCell: you can't have two &muts pointing to the same data at the same time, and this is statically enforced by the compiler, but with RefCells you can ask the same RefCell to give you as many &muts as you like, except that if you do it more than once at the same time it will panic at you, enforcing aliasing rules at runtime. Panics are arguably worse than compilation errors because you can only find the errors causing them at runtime rather than at compilation time. Sometimes, however, the static analyzer in the compiler is too restrictive, and you do indeed need to "work around" it.
No, Cell and RefCell aren't "code smells". Normally, mutability is inherited, that is, you can mutate a field or a part of a data structure if and only if you have exclusive access to the whole data structure, and hence you can opt into mutability at that level with mut (i.e., foo.x inherits its mutability or lack thereof from foo). This is a very powerful pattern and should be used whenever it works well (which is surprisingly often). But it's not expressive enough for all code everywhere.
Box and Rc have nothing to do with this. Like almost all other types, they respect inherited mutability: you can mutate the contents of a Box if you have exclusive, mutable access to the Box (because that means you have exclusive access to the contents, too). Conversely, you generally cannot get a &mut to the contents of an Rc because by its nature Rc is shared (i.e. there can be multiple Rcs referring to the same data); the exception is Rc::get_mut, which only succeeds while there is a single owner.
One common case of Cell or RefCell is that you need to share mutable data between several places. Having two &mut references to the same data is normally not allowed (and for good reason!). However, sometimes you need it, and the cell types enable doing it safely.
This could be done via the common combination of Rc<RefCell<T>>, which allows the data to stick around for as long as anyone uses it and allows everyone (but only one at a time!) to mutate it. Or it could be as simple as &Cell<i32> (even if the cell is wrapped in a more meaningful type). The latter is also commonly used for internal, private, mutable state like reference counts.
The documentation actually has several examples of where you'd use Cell or RefCell. A good example is Rc itself. When cloning an Rc, the reference count must be increased, but the reference count is shared between all Rcs, so, by inherited mutability, this couldn't possibly work. Rc practically has to use a Cell.
A good guideline is to try writing as much code as possible without cell types, but using them when it hurts too much without them. In some cases, there is a good solution without cells, and, with experience, you'll be able to find those when you previously missed them, but there will always be things that just aren't possible without them.
Suppose you want or need to create some object of the type of your choice and dump it into an Rc.
let x = Rc::new(5i32);
Now, you can easily create another Rc that points to the exact same object and therefore memory location:
let y = x.clone();
let yval: i32 = *y;
Since in Rust you may never have a mutable reference to a memory location to which any other reference exists, these Rc containers can never be modified again.
So, what if you wanted to be able to modify those objects and have multiple Rc pointing to one and the same object?
This is the issue that Cell and RefCell solve. The solution is called "interior mutability", and it means that Rust's aliasing rules are enforced at runtime instead of compile-time.
Back to our original example:
let x = Rc::new(RefCell::new(5i32));
let y = x.clone();
To get a mutable reference to your type, you use borrow_mut on the RefCell.
let mut yval = x.borrow_mut(); // the guard must be mut to write through it
*yval = 45;
In case you already borrowed the value your Rcs point to either mutably or non-mutably, the borrow_mut function will panic, and therefore enforce Rust's aliasing rules.
Rc<RefCell<T>> is just one example of using RefCell; there are many other legitimate uses. But the documentation is right: if there is another way, use it, because the compiler cannot help you reason about RefCells.

Idiomatically access an element of a vector mutably and immutably

How would you mutate a vector in such a way where you would need an immutable reference to said vector to determine how you would need to mutate the vector? For example, I have a piece of code that looks something like this, and I want to duplicate the last element of the vector:
let mut vec: Vec<usize> = vec![123, 42, 10];
// Doesn't work of course:
vec.push(*vec.last().unwrap())
// Works, but is this necessary?
let x = *vec.last().unwrap();
vec.push(x);
immutable reference [...] to determine how you would need to mutate the vector?
The short answer is you don't. Any mutation to the vector could possibly invalidate all existing references, making any future operations access invalid data, potentially causing segfaults. Safe Rust doesn't allow for that possibility.
Your second example creates a copy of the value in the vector, so it no longer matters what happens to the vector; that value will continue to be valid.
What's unfortunate about the first example is that if you follow the order of operations, a human can tell that the immutable value is retrieved before the mutation happens. In fact, that's why the multiple-statement version is possible at all! This was a limitation of the Rust borrow checker when this answer was written. Compilers with two-phase borrows (part of the non-lexical lifetimes work) accept method calls of this shape, so the one-line form should compile on a current toolchain.
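Two ways to write the "duplicate the last element" operation so that the read finishes before the write begins are sketched here. The function names are hypothetical; Vec::extend_from_within is a standard library method that requires a reasonably recent toolchain.

```rust
// Copy the last element out first, then push it. The shared borrow from
// `last()` ends once the value has been copied into `last`.
fn duplicate_last(vec: &mut Vec<usize>) {
    if let Some(&last) = vec.last() {
        vec.push(last);
    }
}

// Alternative: let the vector clone a range of its own elements.
fn duplicate_last_in_place(vec: &mut Vec<usize>) {
    if !vec.is_empty() {
        let last_index = vec.len() - 1;
        vec.extend_from_within(last_index..);
    }
}

fn main() {
    let mut vec: Vec<usize> = vec![123, 42, 10];
    duplicate_last(&mut vec);
    duplicate_last_in_place(&mut vec);
    println!("{:?}", vec);
}
```

Both versions also handle the empty vector without panicking, which the unwrap-based snippets in the question do not.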

