Referencing a containing struct in Rust (and calling methods on it) - rust

Editor's note: This code example is from a version of Rust prior to 1.0 and is not syntactically valid Rust 1.0 code. Updated versions of this code produce different errors, but the answers still contain valuable information.
I'm trying to write a container structure in Rust where its elements also store a reference to the containing container so that they can call methods on it. As far as I could figure out, I need to do this via Rc<RefCell<T>>. Is this correct?
So far, I have something like the following:
struct Container {
elems: ~[~Element]
}
impl Container {
pub fn poke(&mut self) {
println!("Got poked.");
}
}
struct Element {
datum: int,
container: Weak<RefCell<Container>>
}
impl Element {
pub fn poke_container(&mut self) {
let c1 = self.container.upgrade().unwrap(); // Option<Rc>
let mut c2 = c1.borrow().borrow_mut(); // &RefCell
c2.get().poke();
// self.container.upgrade().unwrap().borrow().borrow_mut().get().poke();
// -> Error: Borrowed value does not live long enough * 2
}
}
fn main() {
let container = Rc::new(RefCell::new(Container{ elems: ~[] }));
let mut elem1 = Element{ datum: 1, container: container.downgrade() };
let mut elem2 = Element{ datum: 2, container: container.downgrade() };
elem1.poke_container();
}
I feel like I am missing something here. Is accessing the contents of a Rc<RefCell<T>> really this difficult (in poke_container)? Or am I approaching the problem the wrong way?
Lastly, and assuming the approach is correct, how would I write an add method for Container so that it could fill in the container field in Element (assuming I changed the field to be of type Option<Rc<RefCell<T>>>)? I can't create another Rc from &mut self as far as I know.

The long chain of method calls actually works for me on master without any changes, because the lifetime of "r-values" (e.g. the result of function calls) has changed so that temporary return values last until the end of the statement, rather than the end of the next method call (which seemed to be how the old rule worked).
As Vladimir hints, overloadable dereference will likely reduce it to
self.container.upgrade().unwrap().borrow_mut().poke();
which is nicer.
In any case, "mutating" shared ownership is always going to be (slightly) harder to write in Rust than either single-ownership code or immutable shared-ownership code, because it's very easy for such code to be memory unsafe (and memory safety is the core goal of Rust).
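For readers on current Rust, the following is a minimal sketch of the same idea, not the original poster's code. It assumes a hypothetical `pokes` counter (added only so the effect is observable) and a hypothetical free function `add` that takes the `Rc` handle instead of `&mut self`, since a `Weak` can only be created from an `Rc`.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Container {
    elems: Vec<Element>,
    pokes: u32, // hypothetical counter, not in the original question
}

impl Container {
    fn poke(&mut self) {
        self.pokes += 1;
        println!("Got poked.");
    }
}

struct Element {
    datum: i32,
    container: Weak<RefCell<Container>>,
}

impl Element {
    fn poke_container(&self) {
        // upgrade() yields None once the container has been dropped.
        if let Some(container) = self.container.upgrade() {
            container.borrow_mut().poke();
        }
    }
}

/// Hypothetical helper: works on the Rc handle, because `&mut self`
/// alone cannot produce a Weak pointing back at the container.
fn add(container: &Rc<RefCell<Container>>, datum: i32) {
    let elem = Element { datum, container: Rc::downgrade(container) };
    container.borrow_mut().elems.push(elem);
}

fn main() {
    let container = Rc::new(RefCell::new(Container { elems: Vec::new(), pokes: 0 }));
    add(&container, 1);

    // An element that is not stored inside the container.
    let elem = Element { datum: 2, container: Rc::downgrade(&container) };
    elem.poke_container();
    println!("element {} poked; total pokes: {}", elem.datum, container.borrow().pokes);
}
```

One caveat with this sketch: calling poke_container while the container is already borrowed (for example while iterating over elems through borrow()) would panic at run time in borrow_mut(), because RefCell checks the borrowing rules dynamically.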


Rust mutable container of immutable elements?

With Rust, is it in general possible to have a mutable container of immutable values?
Example:
struct TestStruct { value: i32 }
fn test_fn()
{
let immutable_instance = TestStruct{value: 123};
let immutable_box = Box::new(immutable_instance);
let mut mutable_vector = vec!(immutable_box);
mutable_vector[0].value = 456;
}
Here, my TestStruct instance is wrapped in two containers: a Box, then a Vec. From the perspective of a new Rust user it's surprising that moving the Box into the Vec makes both the Box and the TestStruct instance mutable.
Is there a similar construct whereby the boxed value is immutable, but the container of boxes is mutable? More generally, is it possible to have multiple "layers" of containers without the whole tree being either mutable or immutable?
Not really. You could easily create one (just create a wrapper object which implements Deref but not DerefMut), but the reality is that Rust doesn't really see (im)mutability that way, because its main concern is controlling sharing / visibility.
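As a rough illustration of the wrapper mentioned above, here is a minimal sketch. The name `ReadOnly` is hypothetical; the only point is that it implements Deref and deliberately omits DerefMut.

```rust
use std::ops::Deref;

struct TestStruct {
    value: i32,
}

/// Hypothetical wrapper: grants shared access to the boxed value only.
struct ReadOnly<T>(Box<T>);

impl<T> ReadOnly<T> {
    fn new(value: T) -> Self {
        ReadOnly(Box::new(value))
    }
}

impl<T> Deref for ReadOnly<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let mut v = vec![ReadOnly::new(TestStruct { value: 123 })];
    // v[0].value = 456; // rejected: ReadOnly does not implement DerefMut
    v.push(ReadOnly::new(TestStruct { value: 456 })); // the Vec itself is mutable
    v[0] = ReadOnly::new(TestStruct { value: 789 }); // replacing an element is still allowed
    println!("{} {}", v[0].value, v[1].value);
}
```

Note that code in the same module can still reach the inner Box through the tuple field, so for real protection the wrapper would have to live in its own module with the field kept private.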
After all, for an external observer what difference is there between
mutable_vector[0].value = 456;
and
mutable_vector[0] = Box::new(TestStruct{value: 456});
?
None is the answer, because Rust's ownership system means it's not possible for an observer to have kept a handle on the original TestStruct, thus they can't know whether that structure was replaced or modified in place[1][2].
If you want to secure your internal state, use visibility instead: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8a9346072b32cedcf2fccc0eeb9f55c5
mod foo {
pub struct TestStruct { value: i32 }
impl TestStruct {
pub fn new(value: i32) -> Self { Self { value } }
}
}
fn test_fn() {
let immutable_instance = foo::TestStruct{value: 123};
let immutable_box = Box::new(immutable_instance);
let mut mutable_vector = vec!(immutable_box);
mutable_vector[0].value = 456;
}
does not compile because from the point of view of test_fn, TestStruct::value is not accessible. Therefore test_fn has no way to mutate a TestStruct unless you add an &mut method on it.
[1]: technically they could check the address in memory and that might tell them, but even then it's not a sure thing (in either direction) hence pinning being a thing.
[2]: this observability distinction is also embraced by other languages, for instance the Clojure language largely falls on the "immutable all the things" side, however it has a concept of transients which allow locally mutable objects

having two struct reference each other - rust

I'm quite new to Rust programming, and I'm trying to convert some code that I had in JS to Rust.
A plain concept of it is as below:
fn main() {
let mut ds=DataSource::new();
let mut pp =Processor::new(&mut ds);
}
struct DataSource {
st2r: Option<&Processor>,
}
struct Processor {
st1r: &DataSource,
}
impl DataSource {
pub fn new() -> Self {
DataSource {
st2r: None,
}
}
}
impl Processor {
pub fn new(ds: &mut DataSource) -> Self {
let pp = Processor {
st1r: ds,
};
ds.st2r = Some(&pp);
pp
}
}
As you can see I have two main modules in my system that are inter-connected to each other and I need a reference of each in another.
Well, this code would complain about lifetimes and such stuff, of course 😑. So I started throwing lifetime specifiers around like a madman and even after all that, it still complains that in "Processor::new" I can't return something that has been borrowed. Legit. But I can't find any solution around it! No matter how I try to handle the referencing of each other, it ends with this borrowing error.
So, can anyone point out a solution for this situation? Is my app's structure not valid in Rust and I should do it in another way? or there's a trick to this that my inexperienced mind can't find?
Thanks.
What you're trying to do can't be expressed with references and lifetimes because:
The DataSource must live longer than the Processor so that pp.st1r is guaranteed to be valid,
and the Processor must live longer than the DataSource so that ds.st2r is guaranteed to be valid. You might think that since ds.st2r is an Option and since the None variant doesn't contain a reference this allows a DataSource with a None value in st2r to outlive any Processors, but unfortunately the compiler can't know at compile-time whether st2r contains Some value, and therefore must assume it does.
Your problem is compounded by the fact that you need a mutable reference to the DataSource so that you can set its st2r field at a time when you also have an immutable outstanding reference inside the Processor, which Rust won't allow.
You can make your code work by switching to dynamic lifetime and mutability tracking using Rc (for dynamic lifetime tracking) and RefCell (for dynamic mutability tracking):
use std::cell::RefCell;
use std::rc::{ Rc, Weak };
fn main() {
let ds = Rc::new (RefCell::new (DataSource::new()));
let pp = Processor::new (Rc::clone (&ds));
}
struct DataSource {
st2r: Weak<Processor>,
}
struct Processor {
st1r: Rc<RefCell<DataSource>>,
}
impl DataSource {
pub fn new() -> Self {
DataSource {
st2r: Weak::new(),
}
}
}
impl Processor {
pub fn new(ds: Rc<RefCell<DataSource>>) -> Rc<Self> {
let pp = Rc::new (Processor {
st1r: ds,
});
pp.st1r.borrow_mut().st2r = Rc::downgrade (&pp);
pp
}
}
Note that I've replaced your Option<&Processor> with a Weak<Processor>. It would be possible to use an Option<Rc<Processor>> but this would risk leaking memory if you dropped all references to DataSource without setting st2r to None first. The Weak<Processor> behaves more or less like an Option<Rc<Processor>> that is set to None automatically when all other references are dropped, ensuring that memory will be freed properly.
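To make the Weak behaviour concrete, here is a small self-contained sketch that reuses the types from the answer (constructing DataSource inline instead of through DataSource::new) and checks whether the back-reference can be upgraded before and after the Processor is dropped.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct DataSource {
    st2r: Weak<Processor>,
}

struct Processor {
    st1r: Rc<RefCell<DataSource>>,
}

impl Processor {
    fn new(ds: Rc<RefCell<DataSource>>) -> Rc<Self> {
        let pp = Rc::new(Processor { st1r: ds });
        pp.st1r.borrow_mut().st2r = Rc::downgrade(&pp);
        pp
    }
}

fn main() {
    let ds = Rc::new(RefCell::new(DataSource { st2r: Weak::new() }));
    let pp = Processor::new(Rc::clone(&ds));

    // While `pp` is alive, the back-reference can be upgraded to an Rc.
    let back = ds.borrow().st2r.upgrade();
    println!("processor reachable: {}", back.is_some());
    drop(back);

    // Once the last strong reference is gone, upgrade() returns None.
    drop(pp);
    println!("processor reachable: {}", ds.borrow().st2r.upgrade().is_some());
}
```

Because the DataSource only holds a Weak, dropping pp frees the Processor even though the DataSource is still alive, so no reference cycle keeps either value in memory.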

replace a value behind a mutable reference by moving and mapping the original

TLDR: I want to replace a T behind &mut T with a new T that I construct from the old T
Note: please forgive me if the solution to this problem is easy to find. I did a lot of googling, but I am not sure how to word the problem correctly.
Sample code (playground):
struct T { s: String }
fn main() {
let ref mut t = T { s: "hello".to_string() };
*t = T {
s: t.s + " world"
}
}
This obviously fails because the add impl on String takes self by value, and therefore would require being able to move out of T, which is however not possible, since T is behind a reference.
From what I was able to find, the usual way to achieve this is to do something like
let old_t = std::mem::replace(t, T { s: Default::default() });
t.s = old_t.s + " world";
but this requires that it's possible and feasible to create some placeholder T until we can fill it with real data.
Fortunately, in my use case I can create a placeholder T, but it's still not clear to me why an API similar to this is not possible:
map_in_place(t, |old_t: T| T { s: old_t.s + " world" });
Is there a reason that is not possible or commonly done?
Is there a reason [map_in_place] is not possible or commonly done?
A map_in_place is indeed possible:
// XXX unsound, don't use
pub fn map_in_place<T>(place: &mut T, f: impl FnOnce(T) -> T) {
let place = place as *mut T;
unsafe {
let val = std::ptr::read(place);
let new_val = f(val);
std::ptr::write(place, new_val);
}
}
But unfortunately it's not sound. If f() panics, *place will be dropped twice. First it will be dropped while unwinding the scope of f(), which thinks it owns the value it received. Then it will be dropped a second time by the owner of the value place is borrowed from, which never got the memo that the value it thinks it owns is actually garbage because it was already dropped. This can even be reproduced in the playground where a simple panic!() in the closure results in a double free.
For this reason an implementation of map_in_place would itself have to be marked unsafe, with a safety contract that f() not panic. But since pretty much anything in Rust can panic (e.g. any slice access), it would be hard to ensure that safety contract and the function would be somewhat of a footgun.
The replace_with crate does offer such functionality, with several recovery options in case of panic. Judging by the documentation, the authors are keenly aware of the panic issue, so if you really need that functionality, that might be a good place to get it from.
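When a placeholder is available, a safe variant can be written with only the standard library. The sketch below is an assumption-laden illustration, not the replace_with crate's API: the function name `map_in_place_default` and the `Greeting` struct (standing in for the question's T) are hypothetical, and it requires `T: Default`.

```rust
/// Hypothetical safe variant: requires `T: Default` so that a placeholder
/// can occupy `*place` while `f` runs.
pub fn map_in_place_default<T: Default>(place: &mut T, f: impl FnOnce(T) -> T) {
    // mem::take moves the old value out and leaves T::default() behind.
    let old = std::mem::take(place);
    *place = f(old);
}

#[derive(Default)]
struct Greeting {
    s: String,
}

fn main() {
    let mut g = Greeting { s: "hello".to_string() };
    map_in_place_default(&mut g, |old| Greeting { s: old.s + " world" });
    println!("{}", g.s); // prints "hello world"
}
```

If f panics here, *place still holds a valid default value, so the owner drops that placeholder and nothing is dropped twice. The cost is constructing the default, which is cheap for types like String.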

Why does the Index trait allow returning a reference to a temporary value?

Consider this simple code:
use std::ops::Index;
use std::collections::HashMap;
enum BuildingType {
Shop,
House,
}
struct Street {
buildings: HashMap<u32, BuildingType>,
}
impl Index<u32> for Street {
type Output = BuildingType;
fn index(&self, pos: u32) -> &Self::Output {
&self.buildings[&pos]
}
}
It compiles with no issues, but I cannot understand why the borrow checker is not complaining about returning a reference to temporary value in the index function.
Why is it working?
Your example looks fine.
The Index trait is only able to "view" what is in the object already, and it's not usable for returning arbitrary dynamically-generated data.
It's not possible in Rust to return a reference to a value created inside a function, if that value isn't stored somewhere permanently (references don't exist on their own, they always borrow some value owned somewhere).
The reference can't be borrowed from a variable inside the function, because all variables will be destroyed before the function returns. Lifetimes only describe what the program does, and can't "make" something live longer than it already does.
fn index(&self) -> &u32 {
let tmp = 1;
&tmp // not valid, because tmp isn't stored anywhere
}
fn index(&self) -> &u32 {
// ok, because the value is stored in self,
// which existed before this function has been called
&self.tmp
}
You may find that returning &1 works. That's because 1 is stored in your program's executable which, as far as the program is concerned, is permanent storage. But 'static is an exception for literals and leaked memory, so it's not something you can rely on in most cases.
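The following self-contained sketch puts both cases side by side: the Street type from the question, whose index returns a reference into self.buildings, and a hypothetical function `one` that returns a reference to a literal. The derives on BuildingType are added here only so the values can be printed and compared.

```rust
use std::collections::HashMap;
use std::ops::Index;

#[derive(Debug, PartialEq)]
enum BuildingType {
    Shop,
    House,
}

struct Street {
    buildings: HashMap<u32, BuildingType>,
}

impl Index<u32> for Street {
    type Output = BuildingType;
    // The returned reference borrows from `self.buildings`, not from a
    // temporary: HashMap's own Index impl hands back a reference into the map.
    fn index(&self, pos: u32) -> &Self::Output {
        &self.buildings[&pos]
    }
}

// Hypothetical example: the literal is promoted to static storage,
// so returning a reference to it compiles as well.
fn one() -> &'static u32 {
    &1
}

fn main() {
    let mut buildings = HashMap::new();
    buildings.insert(1, BuildingType::Shop);
    buildings.insert(2, BuildingType::House);
    let street = Street { buildings };
    println!("{:?} {:?} {}", street[1], street[2], one());
}
```

Indexing with a position that is not in the map panics, which is the usual behaviour of HashMap's Index implementation.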

Borrow data out of a mutex "borrowed value does not live long enough"

How can I return an iterator over data within a mutex that is itself contained within a struct? The error the compiler gives is "borrowed value does not live long enough".
How do I get the lifetime of the value to extend into the outer scope?
Here is a minimal demo of what I am trying to achieve.
use std::sync::{Mutex, Arc};
use std::vec::{Vec};
use std::slice::{Iter};
#[derive(Debug)]
struct SharedVec {
pub data: Arc<Mutex<Vec<u32>>>,
}
impl SharedVec {
fn iter(& self) -> Iter<u32> {
self.data.lock().unwrap().iter()
}
}
fn main() {
let sv = SharedVec {
data: Arc::new(Mutex::new(vec![1, 2, 3, 4, 5]))
};
for element in sv.data.lock().unwrap().iter() { // This works
println!("{:?}", element);
}
for element in sv.iter() { // This does not work
println!("{:?}", element);
}
}
Rust playground link: http://is.gd/voukyN
You cannot do it exactly how you have written here.
Mutexes in Rust use the RAII pattern for acquisition and release: you acquire a mutex by calling the corresponding method on it, which returns a special guard value. When this guard goes out of scope, the mutex is released.
To make this pattern safe, Rust uses its borrowing system. You can access the value inside the mutex only through the guard returned by lock(), and you can only do so by reference - MutexGuard<T> implements Deref<Target=T> and DerefMut<Target=T>, so you can get &T or &mut T out of it.
This means that every value you derive from a mutexed value will necessarily have its lifetime linked to the lifetime of the guard. However, in your case you're trying to return Iter<u32> with its lifetime parameter tied to the lifetime of self. The following is the full signature of the iter() method, without lifetime elision, and its body with explicit temporary variables:
fn iter<'a>(&'a self) -> Iter<'a, u32> {
let guard = self.data.lock().unwrap();
guard.iter()
}
Here the lifetime of the guard.iter() result is tied to that of guard, which is strictly shorter than 'a because guard only lives inside the scope of the method body. This violates the borrowing rules, so the compiler fails with an error.
When iter() returns, guard is destroyed and the lock is released, so Rust in fact prevented you from making an actual logical error! The same code in C++ would compile and behave incorrectly because you would access protected data without locking it, causing data races at the very least. Just another demonstration of the power of Rust :)
I don't think you'll be able to do what you want without nasty hacks or boilerplate wrappers around standard types. And I personally think this is good - you have to manage your mutexes as explicitly as possible in order to avoid deadlocks and other nasty concurrency problems. And Rust already makes your life much easier because it enforces the absence of data races through its borrowing system, which is exactly why the guard system behaves as described above.
As Vladimir Matveev's answer mentions, this isn't possible with return values. You can achieve your goal if you pass the iterator into a function instead of returning it:
impl SharedVec {
fn iter<R>(&self, func: impl FnOnce(Iter<'_, u32>) -> R) -> R {
let guard = self.data.lock().unwrap();
func(guard.iter())
}
}
This function is used like this:
sv.iter(|iter| {
for element in iter {
println!("{:?}", element);
}
});
This type of function wrapping will have to be repeated with every type of iterator. If you end up doing that, it may be easier to hand over a mutable slice or &mut SharedVec instead, making the closure choose the iteration method.
This method works because the lock is held for the entire time the closure runs, so the data stays protected from concurrent writes by other threads while it is being iterated.
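The slice-based variant suggested above could look roughly like the sketch below. The method name `with_slice` is hypothetical; it hands the closure a mutable slice and lets the closure decide how to iterate.

```rust
use std::sync::{Arc, Mutex};

struct SharedVec {
    data: Arc<Mutex<Vec<u32>>>,
}

impl SharedVec {
    /// Hypothetical helper: runs `func` with a mutable slice while the lock is held.
    fn with_slice<R>(&self, func: impl FnOnce(&mut [u32]) -> R) -> R {
        let mut guard = self.data.lock().unwrap();
        // `guard` is dropped when this function returns, releasing the lock
        // only after `func` has finished.
        func(guard.as_mut_slice())
    }
}

fn main() {
    let sv = SharedVec { data: Arc::new(Mutex::new(vec![1, 2, 3, 4, 5])) };

    // The closure picks the iteration method itself.
    let sum: u32 = sv.with_slice(|s| s.iter().sum());
    sv.with_slice(|s| s.iter_mut().for_each(|x| *x *= 2));

    println!("sum before doubling: {}", sum);
    println!("{:?}", sv.with_slice(|s| s.to_vec()));
}
```

One thing to watch with this design: calling with_slice again from inside the closure would try to lock the same mutex on the same thread, which std's Mutex does not support (it may deadlock or panic), so the closure should not call back into the SharedVec.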
