I'm quite new to Rust programming, and I'm trying to convert a code that I had in js to Rust.
A plain concept of it is as below:
fn main() {
let mut ds=DataSource::new();
let mut pp =Processor::new(&mut ds);
}
struct DataSource {
st2r: Option<&Processor>,
}
struct Processor {
st1r: &DataSource,
}
impl DataSource {
pub fn new() -> Self {
DataSource {
st2r: None,
}
}
}
impl Processor {
pub fn new(ds: &mut DataSource) -> Self {
let pp = Processor {
st1r: ds,
};
ds.st2r = Some(&pp);
pp
}
}
As you can see I have two main modules in my system that are inter-connected to each other and I need a reference of each in another.
Well, this code would complain about lifetimes and such stuff, of course 😑. So I started throwing lifetime specifiers around like a madman and even after all that, it still complains that in "Processor::new" I can't return something that has been borrowed. Legit. But I can't find any solution around it! No matter how I try to handle the referencing of each other, it ends with this borrowing error.
So, can anyone point out a solution for this situation? Is my app's structure not valid in Rust and I should do it in another way? or there's a trick to this that my inexperienced mind can't find?
Thanks.
What you're trying to do can't be expressed with references and lifetimes because:
The DataSource must live longer than the Processor so that pp.st1r is guaranteed to be valid,
and the Processor must live longer than the DataSource so that ds.st2r is guaranteed to be valid. You might think that since ds.st2r is an Option and since the None variant doesn't contain a reference this allows a DataSource with a None value in st2r to outlive any Processors, but unfortunately the compiler can't know at compile-time whether st2r contains Some value, and therefore must assume it does.
Your problem is compounded by the fact that you need a mutable reference to the DataSource so that you can set its st2r field at a time when you also have an immutable outstanding reference inside the Processor, which Rust won't allow.
You can make your code work by switching to dynamic lifetime and mutability tracking using Rc (for dynamic lifetime tracking) and RefCell (for dynamic mutability tracking):
use std::cell::RefCell;
use std::rc::{ Rc, Weak };
fn main() {
let ds = Rc::new (RefCell::new (DataSource::new()));
let pp = Processor::new (Rc::clone (&ds));
}
struct DataSource {
st2r: Weak<Processor>,
}
struct Processor {
st1r: Rc<RefCell<DataSource>>,
}
impl DataSource {
pub fn new() -> Self {
DataSource {
st2r: Weak::new(),
}
}
}
impl Processor {
pub fn new(ds: Rc::<RefCell::<DataSource>>) -> Rc<Self> {
let pp = Rc::new (Processor {
st1r: ds,
});
pp.st1r.borrow_mut().st2r = Rc::downgrade (&pp);
pp
}
}
Playground
Note that I've replaced your Option<&Processor> with a Weak<Processor>. It would be possible to use an Option<Rc<Processor>> but this would risk leaking memory if you dropped all references to DataSource without setting st2r to None first. The Weak<Processor> behaves more or less like an Option<Rc<Processor>> that is set to None automatically when all other references are dropped, ensuring that memory will be freed properly.
Related
I'm tring to replace a value in a mutable borrow; moving part of it into the new value:
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
*self = match self {
&mut Foo::Bar(val) => Foo::Baz(val),
&mut Foo::Baz(val) => Foo::Bar(val),
}
}
}
The code above doesn't work, and understandibly so, moving the value out of self breaks the integrity of it. But since that value is dropped immediately afterwards, I (if not the compiler) could guarantee it's safety.
Is there some way to achieve this? I feel like this is a job for unsafe code, but I'm not sure how that would work.
mem:uninitialized has been deprecated since Rust 1.39, replaced by MaybeUninit.
However, uninitialized data is not required here. Instead, you can use ptr::read to get the data referred to by self.
At this point, tmp has ownership of the data in the enum, but if we were to drop self, that data would attempt to be read by the destructor, causing memory unsafety.
We then perform our transformation and put the value back, restoring the safety of the type.
use std::ptr;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// I copied this code from Stack Overflow without reading
// the surrounding text that explains why this is safe.
unsafe {
let tmp = ptr::read(self);
// Must not panic before we get to `ptr::write`
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
ptr::write(self, new);
}
}
}
More advanced versions of this code would prevent a panic from bubbling out of this code and instead cause the program to abort.
See also:
replace_with, a crate that wraps this logic up.
take_mut, a crate that wraps this logic up.
Change enum variant while moving the field to the new variant
How can I swap in a new value for a field in a mutable reference to a structure?
The code above doesn't work, and understandibly so, moving the value
out of self breaks the integrity of it.
This is not exactly what happens here. For example, same thing with self would work nicely:
impl<T> Foo<T> {
fn switch(self) {
self = match self {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
}
}
}
Rust is absolutely fine with partial and total moves. The problem here is that you do not own the value you're trying to move - you only have a mutable borrowed reference. You cannot move out of any reference, including mutable ones.
This is in fact one of the frequently requested features - a special kind of reference which would allow moving out of it. It would allow several kinds of useful patterns. You can find more here and here.
In the meantime for some cases you can use std::mem::replace and std::mem::swap. These functions allow you to "take" a value out of mutable reference, provided you give something in exchange.
Okay, I figured out how to do it with a bit of unsafeness and std::mem.
I replace self with an uninitialized temporary value. Since I now "own" what used to be self, I can safely move the value out of it and replace it:
use std::mem;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// This is safe since we will overwrite it without ever reading it.
let tmp = mem::replace(self, unsafe { mem::uninitialized() });
// We absolutely must **never** panic while the uninitialized value is around!
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
let uninitialized = mem::replace(self, new);
mem::forget(uninitialized);
}
}
fn main() {}
Like some kind of Box that holds the reference to the value or something? I'd have to check whether the value is still alive or not before reading it, like when a Option is pattern matched.
A mock example:
struct Whatever {
thing: AliveOrNot<i32>,
}
fn main() {
let mut w = Whatever { thing: Holder::Empty };
w.thing = AliveOrNot::new(100);
match w.thing {
Empty => println!("doesn't exist"),
Full(value) => println!("{}", value),
}
}
The exact case is that I'm using a sdl2 Font and I want to store instances of that struct in another struct. I don't want to do something like this because the Parent would have to live exactly as long as the Font:
struct Font<'a, 'b> {
aa: &'a i32,
bb: &'b i32,
}
struct Parent<'a, 'b, 'c> {
f: &'c Font<'a, 'b>
}
What I want is for the Parent to work regardless of whether that field is still alive or not.
You may be interested in std::rc::Weak or std::sync::Weak:
use std::rc::{Rc, Weak};
struct Whatever {
thing: Weak<i32>,
}
impl Whatever {
fn do_it(&self) {
match self.thing.upgrade() {
Some(value) => println!("{}", value),
None => println!("doesn't exist"),
}
}
}
fn its_dead_jim() -> Whatever {
let owner = Rc::new(42);
let thing = Rc::downgrade(&owner);
let whatever = Whatever { thing };
whatever.do_it();
whatever
}
fn main() {
let whatever = its_dead_jim();
whatever.do_it();
}
42
doesn't exist
There is no way to do this in safe Rust using non-'static references. A huge point of Rust's references is that it's impossible to refer to invalid values.
You could leak the memory, creating a &'static i32, but that's not sustainable if you need to do this multiple times.
You could use unsafe code and deal with raw pointers that have no notion of lifetimes. You then assume the responsibility of ensuring you don't introduce memory unsafety.
See also:
Need holistic explanation about Rust's cell and reference counted types
Situations where Cell or RefCell is the best choice
Is there a shared pointer with a single strong owner and multiple weak references?
Why can't I store a value and a reference to that value in the same struct?
How to convert a String into a &'static str
Is there any way to return a reference to a variable created in a function?
How to solve this rust lifetime bound issue of SDL2?
Cannot infer an appropriate lifetime for autoref due to conflicting requirements
This question already has answers here:
Why can't I store a value and a reference to that value in the same struct?
(4 answers)
Implement graph-like data structure in Rust
(3 answers)
What is the right smart pointer to have multiple strong references and allow mutability?
(1 answer)
Mutating one field while iterating over another immutable field
(3 answers)
How can I borrow from a HashMap to read and write at the same time?
(2 answers)
Closed 3 years ago.
I am trying to implement the JVM in Rust as a fun project but I am struggling with an issue involving references. When classes are loaded from class files, their references to other classes are represented as Strings. After the loading phase, a JVM should link classes with actual references to each other. I cannot figure out how to do this with Rust's references. Here is a stripped down version of what I am trying to achieve. (an MCV example)
use ClassRef::{Symbolic, Static};
use std::collections::HashMap;
fn main() {
let mut loader = ClassLoader { classes: HashMap::new() };
loader.load_class("java/lang/String", "java/lang/Object");
// java.lang.Object is the only class without a direct superclass
loader.load_class("java/lang/Object", "");
loader.link_classes();
}
struct ClassLoader<'a> {
classes: HashMap<String, Class<'a>>
}
impl<'a> ClassLoader<'a> {
pub fn load_class(&mut self, name: &str, super_name: &str) {
self.classes.insert(name.to_owned(), Class::new(name, super_name));
}
pub fn link_classes(&mut self) {
// Issue 1: Editing the stored class requires a mutable borrow
for (name, class) in self.classes.iter_mut() {
// Issue 2: I am not allowed to move this value to itself
class.super_class = match class.super_class {
// Issue 3: Storing a reference to another value in the map
Symbolic(ref super_name) => Static(self.classes.get(super_name)),
a => a
}
}
}
}
struct Class<'a> {
super_class: ClassRef<'a>,
name: String
}
impl<'a> Class<'a> {
pub fn new(name: &str, super_name: &str) -> Self {
Class {
name: name.to_owned(),
super_class: Symbolic(super_name.to_owned())
}
}
}
enum ClassRef<'a> {
Symbolic(String),
Static(Option<&'a Class<'a>>)
}
Currently, there are three issues with the process that I am unable to solve.
First, when I iterate through the list of loaded classes, I need to be able to change the field Class#super_class, so I need a mutable iterator. This means that I have a mutable borrow on self.classes. I also need an immutable borrow on self.classes during the iteration to look up a reference to the superclass. The way I feel like this should be solved is having a way of proving to the compiler that I am not mutating the map itself, but only a key. I feel like it has to do with std::cell::Cell as shown in Choosing your Guarantees, but the examples seem so simple I don't know how to correlate it to what I am trying to do here.
Second, I need the previous value of Class#super_class in order to calculate the new one. Essentially, I am moving the previous value out, changing it, and then moving it back in. I feel like this could be solved with some sort of higher order Map function to work on the value in place, but once again I'm not sure.
Third, when I give one class a reference to another class, there is no way of proving to the compiler that the other class won't be removed from the hashmap, causing the reference to point to deallocated memory. When classes are loaded, one class name will always point to the same class and classes are never unloaded, so I know that the value will always be there. The way I think this should be solved is with a Hashmap implementation which doesn't allow values to be overwritten or removed. I tried to poke around for such a thing but I couldn't find it.
EDIT: Solution
The duplicates linked were certainly helpful, but I ended up not using reference counting pointers (Rc<>) because the relationships between classes can sometimes be cyclic, so trying to drop the ClassLoader would end up leaking memory. The solution I settled on is using an immutable reference to a
Typed-Arena allocator. This allocator is an insertion only allocator which ensures the references it distributes are valid for the entire life of the reference to the allocator, as long as said reference is immutable. I then used the hashmap to store the references. That way everything is dropped when I drop the ClassLoader. Hopefully anyone that trying to implement a relationship similar to this finds the following code helpful.
extern crate typed_arena;
use ClassRef::{Symbolic, Static};
use std::collections::HashMap;
use std::cell::RefCell;
use core::borrow::BorrowMut;
use typed_arena::Arena;
fn main() {
let mut loader = ClassLoader { class_map: HashMap::new(), classes: &Arena::new() };
loader.load_class("java/lang/String", "java/lang/Object");
// java.lang.Object is the only class without a direct superclass
loader.load_class("java/lang/Object", "");
loader.link_classes();
}
struct ClassLoader<'a> {
classes: &'a Arena<RefCell<Class<'a>>>,
class_map: HashMap<String, &'a RefCell<Class<'a>>>
}
impl<'a> ClassLoader<'a> {
pub fn load_class(&mut self, name: &str, super_name: &str) {
let super_opt = if super_name.len() > 0 {
Some(super_name)
} else {
None
};
let class = Class::new(name, super_opt);
let class_ref = self.classes.alloc(RefCell::new(class));
self.class_map.insert(name.to_owned(), class_ref);
}
pub fn link_classes(&mut self) {
for (_name, class_ref) in &self.class_map {
let mut class = (*class_ref).borrow_mut();
if let Some(Symbolic(super_name)) = &class.super_class {
let super_class = self.class_map.get(super_name);
let super_class = super_class.map(|c| Static(c.clone()));
*class.super_class.borrow_mut() = super_class;
}
}
}
}
struct Class<'a> {
super_class: Option<ClassRef<'a>>,
name: String
}
impl<'a> Class<'a> {
pub fn new(name: &str, super_name: Option<&str>) -> Self {
Class {
name: name.to_owned(),
super_class: super_name.map(|name| Symbolic(name.to_owned()))
}
}
}
enum ClassRef<'a> {
Symbolic(String),
Static(&'a RefCell<Class<'a>>)
}
There is a new Pin type in unstable Rust and the RFC is already merged. It is said to be kind of a game changer when it comes to passing references, but I am not sure how and when one should use it.
Can anyone explain it in layman's terms?
What is pinning ?
In programming, pinning X means instructing X not to move.
For example:
Pinning a thread to a CPU core, to ensure it always executes on the same CPU,
Pinning an object in memory, to prevent a Garbage Collector to move it (in C# for example).
What is the Pin type about?
The Pin type's purpose is to pin an object in memory.
It enables taking the address of an object and having a guarantee that this address will remain valid for as long as the instance of Pin is alive.
What are the usecases?
The primary usecase, for which it was developed, is supporting Generators.
The idea of generators is to write a simple function, with yield, and have the compiler automatically translate this function into a state machine. The state that the generator carries around is the "stack" variables that need to be preserved from one invocation to another.
The key difficulty of Generators that Pin is designed to fix is that Generators may end up storing a reference to one of their own data members (after all, you can create references to stack values) or a reference to an object ultimately owned by their own data members (for example, a &T obtained from a Box<T>).
This is a subcase of self-referential structs, which until now required custom libraries (and lots of unsafe). The problem of self-referential structs is that if the struct move, the reference it contains still points to the old memory.
Pin apparently solves this years-old issue of Rust. As a library type. It creates the extra guarantee that as long as Pin exist the pinned value cannot be moved.
The usage, therefore, is to first create the struct you need, return it/move it at will, and then when you are satisfied with its place in memory initialize the pinned references.
One of the possible uses of the Pin type is self-referencing objects; an article by ralfj provides an example of a SelfReferential struct which would be very complicated without it:
use std::ptr;
use std::pin::Pin;
use std::marker::PhantomPinned;
struct SelfReferential {
data: i32,
self_ref: *const i32,
_pin: PhantomPinned,
}
impl SelfReferential {
fn new() -> SelfReferential {
SelfReferential { data: 42, self_ref: ptr::null(), _pin: PhantomPinned }
}
fn init(self: Pin<&mut Self>) {
let this : &mut Self = unsafe { self.get_unchecked_mut() };
// Set up self_ref to point to this.data.
this.self_ref = &mut this.data as *const i32;
}
fn read_ref(self: Pin<&Self>) -> Option<i32> {
let this : &Self= self.get_ref();
// Dereference self_ref if it is non-NULL.
if this.self_ref == ptr::null() {
None
} else {
Some(unsafe { *this.self_ref })
}
}
}
fn main() {
let mut data: Pin<Box<SelfReferential>> = Box::new(SelfReferential::new()).into();
data.as_mut().init();
println!("{:?}", data.as_ref().read_ref()); // prints Some(42)
}
Most of this is boilerplate, provided as a compilable example. Scroll down.
use std::rc::{Rc, Weak};
use std::cell::RefCell;
use std::any::{Any, AnyRefExt};
struct Shared {
example: int,
}
struct Widget {
parent: Option<Weak<Box<Widget>>>,
specific: RefCell<Box<Any>>,
shared: RefCell<Shared>,
}
impl Widget {
fn new(specific: Box<Any>,
parent: Option<Rc<Box<Widget>>>) -> Rc<Box<Widget>> {
let parent_option = match parent {
Some(parent) => Some(parent.downgrade()),
None => None,
};
let shared = Shared{pos: 10};
Rc::new(box Widget{
parent: parent_option,
specific: RefCell::new(specific),
shared: RefCell::new(shared)})
}
}
struct Woo {
foo: int,
}
impl Woo {
fn new() -> Box<Any> {
box Woo{foo: 10} as Box<Any>
}
}
fn main() {
let widget = Widget::new(Woo::new(), None);
{
// This is a lot of work...
let cell_borrow = widget.specific.borrow();
let woo = cell_borrow.downcast_ref::<Woo>().unwrap();
println!("{}", woo.foo);
}
// Can the above be made into a function?
// let woo = widget.get_specific::<Woo>();
}
I'm learning Rust and trying to figure out some workable way of doing a widget hierarchy. The above basically does what I need, but it is a bit cumbersome. Especially vexing is the fact that I have to use two statements to convert the inner widget (specific member of Widget). I tried several ways of writing a function that does it all, but the amount of reference and lifetime wizardry is just beyond me.
Can it be done? Can the commented out method at the bottom of my example code be made into reality?
Comments regarding better ways of doing this entire thing are appreciated, but put it in the comments section (or create a new question and link it)
I'll just present a working simplified and more idiomatic version of your code and then explain all the changed I made there:
use std::rc::{Rc, Weak};
use std::any::{Any, AnyRefExt};
struct Shared {
example: int,
}
struct Widget {
parent: Option<Weak<Widget>>,
specific: Box<Any>,
shared: Shared,
}
impl Widget {
fn new(specific: Box<Any>, parent: Option<Rc<Widget>>) -> Widget {
let parent_option = match parent {
Some(parent) => Some(parent.downgrade()),
None => None,
};
let shared = Shared { example: 10 };
Widget{
parent: parent_option,
specific: specific,
shared: shared
}
}
fn get_specific<T: 'static>(&self) -> Option<&T> {
self.specific.downcast_ref::<T>()
}
}
struct Woo {
foo: int,
}
impl Woo {
fn new() -> Woo {
Woo { foo: 10 }
}
}
fn main() {
let widget = Widget::new(box Woo::new() as Box<Any>, None);
let specific = widget.get_specific::<Woo>().unwrap();
println!("{}", specific.foo);
}
First of all, there are needless RefCells inside your structure. RefCells are needed very rarely - only when you need to mutate internal state of an object using only & reference (instead of &mut). This is useful tool for implementing abstractions, but it is almost never needed in application code. Because it is not clear from your code that you really need it, I assume that it was used mistakingly and can be removed.
Next, Rc<Box<Something>> when Something is a struct (like in your case where Something = Widget) is redundant. Putting an owned box into a reference-counted box is just an unnecessary double indirection and allocation. Plain Rc<Widget> is the correct way to express this thing. When dynamically sized types land, it will be also true for trait objects.
Finally, you should try to always return unboxed values. Returning Rc<Box<Widget>> (or even Rc<Widget>) is unnecessary limiting for the callers of your code. You can go from Widget to Rc<Widget> easily, but not the other way around. Rust optimizes by-value returns automatically; if your callers need Rc<Widget> they can box the return value themselves:
let rc_w = box(RC) Widget::new(...);
Same thing is also true for Box<Any> returned by Woo::new().
You can see that in the absence of RefCells your get_specific() method can be implemented very easily. However, you really can't do it with RefCell because RefCell uses RAII for dynamic borrow checks, so you can't return a reference to its internals from a method. You'll have to return core::cell::Refs, and your callers will need to downcast_ref() themselves. This is just another reason to use RefCells sparingly.