Store references inside a struct - rust

I'm stuck on how to use references and lifetimes within a struct.
I have the following setup:
struct Item { ... }
struct Container {
    items: Vec<Rc<RefCell<Item>>>,
    active_item: RefMut<Item>,
}
There's a collection of "Item" where I'm going to select an active one which will be modified.
To simplify things I wanted to store a mutable reference to the active Item to avoid taking from the list and the RefCell every time. But storing a RefMut requires a lifetime parameter, so my struct would look like this:
struct Container<'a> {
    items: Vec<Rc<RefCell<Item>>>,
    active_item: RefMut<'a, Item>,
}
Is there a way to avoid the lifetime parameter when the reference only points within the struct itself? Or is there something like a 'self lifetime, so I don't need to expose the lifetime parameter to the outside? None of the items in the list are referenced outside the struct. Also, I don't necessarily need Rc or RefCell if it can be done without them.
Thank you
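One common way to sidestep the lifetime parameter is to remember which item is active by its position rather than by a borrow. The following is only a minimal sketch under that assumption: the `value` field, the `select` and `active_item_mut` methods, and the plain `Vec<Item>` (without Rc or RefCell) are hypothetical and not part of the question.

```rust
// Hypothetical stand-in for the question's `Item { ... }`.
struct Item {
    value: i32,
}

struct Container {
    items: Vec<Item>,
    // Position of the active item in `items`; no borrow is held between calls.
    active: Option<usize>,
}

impl Container {
    fn new(items: Vec<Item>) -> Self {
        Container { items, active: None }
    }

    /// Marks `index` as active; returns false if it is out of range.
    fn select(&mut self, index: usize) -> bool {
        if index < self.items.len() {
            self.active = Some(index);
            true
        } else {
            false
        }
    }

    /// Borrows the active item mutably, only for as long as the caller needs it.
    fn active_item_mut(&mut self) -> Option<&mut Item> {
        let index = self.active?;
        self.items.get_mut(index)
    }
}
```

The trade-off is that the index has to be kept in sync by hand if items are ever removed or reordered.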

Related

How can I return a reference to an item in a static hashmap inside a mutex?

I am trying to access a static hashmap for reading and writing, but I always get an error:
use std::collections::HashMap;
use std::sync::Mutex;
pub struct ModuleItem {
    pub absolute_path: String,
}
lazy_static! {
    static ref MODULE_MAP: Mutex<HashMap<i32, ModuleItem>> = Mutex::new(HashMap::new());
}
pub fn insert(identity_hash: i32, module_item: ModuleItem) {
    MODULE_MAP
        .lock()
        .unwrap()
        .insert(identity_hash, module_item);
}
pub fn get(identity_hash: i32) -> Option<&'static ModuleItem> {
    MODULE_MAP.lock().unwrap().get(&identity_hash).clone()
}
But I am getting an error on the get function: "cannot return value referencing temporary value".
I tried with .cloned(), .clone() or even nothing but I don't manage to get it to work. Can you help me?
All Option::clone does is clone the underlying structure, which in this case is an &ModuleItem so it just clones the reference, and you still have a reference, which you can't return because you only have access to the hashmap's contents while you hold the lock (otherwise it could not work).
Option::cloned actually clones the object being held by reference, but doesn't compile here because ModuleItem can't be cloned.
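The difference between the two methods can be sketched in isolation. This is only an illustration: it assumes ModuleItem derives Clone (which the question's version does not), and the two helper functions are hypothetical names.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ModuleItem {
    absolute_path: String,
}

/// `Option<&T>::clone` copies the reference, so the result still borrows `item`.
fn clone_keeps_borrow(item: &ModuleItem) -> Option<&ModuleItem> {
    let found: Option<&ModuleItem> = Some(item);
    found.clone()
}

/// `Option<&T>::cloned` clones the pointee, so the result is an owned value.
fn cloned_gives_owned(item: &ModuleItem) -> Option<ModuleItem> {
    let found: Option<&ModuleItem> = Some(item);
    found.cloned()
}
```

Only the second function returns something that can outlive the borrow it started from.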
First, you have to return an Option<ModuleItem>. You cannot return a reference to the map's contents, since the lock is released at the end of the function, and you can't keep a handle on the HashMap's contents across mutex boundaries because they could go away at any moment (e.g. another thread could move them, or even clear the map entirely).
Then copy the ModuleItem, either by deriving Clone on ModuleItem (then calling Option::cloned) or by creating a new ModuleItem "by hand" e.g.
pub fn get(identity_hash: i32) -> Option<ModuleItem> {
    MODULE_MAP.lock().unwrap().get(&identity_hash).map(|m|
        ModuleItem { absolute_path: m.absolute_path.clone() }
    )
}
If you need to look up keys a lot and are worried about performance, you could always store Arc<ModuleItem>. That has something of a cost (as it's a pointer, your string is now behind two pointers); however, cloning an Arc is very cheap.
To avoid the double pointer you could make ModuleItem into an unsized type and have it store an str but... that's pretty difficult to work with so I wouldn't recommend it.
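The Arc<ModuleItem> variant might look roughly like the sketch below. To keep it free of external crates, std::sync::OnceLock stands in for lazy_static! here, and the module_map helper is a hypothetical name; neither is part of the original code.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

pub struct ModuleItem {
    pub absolute_path: String,
}

// `OnceLock` from the standard library stands in for `lazy_static!` here
// so that the sketch needs no external crate.
fn module_map() -> &'static Mutex<HashMap<i32, Arc<ModuleItem>>> {
    static MODULE_MAP: OnceLock<Mutex<HashMap<i32, Arc<ModuleItem>>>> = OnceLock::new();
    MODULE_MAP.get_or_init(|| Mutex::new(HashMap::new()))
}

pub fn insert(identity_hash: i32, module_item: ModuleItem) {
    module_map()
        .lock()
        .unwrap()
        .insert(identity_hash, Arc::new(module_item));
}

pub fn get(identity_hash: i32) -> Option<Arc<ModuleItem>> {
    // Cloning the `Arc` only bumps a reference count; the `String` is not copied.
    module_map().lock().unwrap().get(&identity_hash).cloned()
}
```

Callers receive a handle that stays valid after the lock is released, even if the entry is later removed from the map.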
The function get cannot use a static lifetime because the data does not live for the entire life of the program (from the Rust book):
As a reference lifetime 'static indicates that the data pointed to by the reference lives for the entire lifetime of the running program. It can still be coerced to a shorter lifetime.
So you have to return either a non-static reference or a copy of the value in the HashMap. Returning a reference is not possible because MODULE_MAP.lock().unwrap() returns a MutexGuard, which is a temporary local to the function, and the HashMap is only reachable through it. HashMap::get in turn returns a reference that borrows from that guard.
Because the temporary MutexGuard is destroyed at the end of the function, the reference returned by get would refer to a value borrowed from a temporary.
To fix this, you could make ModuleItem cloneable and return a copy of the value:
use std::collections::HashMap;
use std::sync::Mutex;
#[derive(Clone)]
pub struct ModuleItem {
    pub absolute_path: String,
}
lazy_static::lazy_static! {
    static ref MODULE_MAP: Mutex<HashMap<i32, ModuleItem>> = Mutex::new(HashMap::new());
}
pub fn insert(identity_hash: i32, module_item: ModuleItem) {
    MODULE_MAP
        .lock()
        .unwrap()
        .insert(identity_hash, module_item);
}
pub fn get(identity_hash: i32) -> Option<ModuleItem> {
    MODULE_MAP.lock().unwrap().get(&identity_hash).cloned()
}

Why does the Index trait allow returning a reference to a temporary value?

Consider this simple code:
use std::ops::Index;
use std::collections::HashMap;
enum BuildingType {
    Shop,
    House,
}
struct Street {
    buildings: HashMap<u32, BuildingType>,
}
impl Index<u32> for Street {
    type Output = BuildingType;
    fn index(&self, pos: u32) -> &Self::Output {
        &self.buildings[&pos]
    }
}
It compiles with no issues, but I cannot understand why the borrow checker is not complaining about returning a reference to a temporary value in the index function.
Why is it working?
Your example looks fine.
The Index trait is only able to "view" what is in the object already, and it's not usable for returning arbitrary dynamically-generated data.
It's not possible in Rust to return a reference to a value created inside a function, if that value isn't stored somewhere permanently (references don't exist on their own, they always borrow some value owned somewhere).
The reference can't be borrowed from a variable inside the function, because all variables will be destroyed before the function returns. Lifetimes only describe what the program does, and can't "make" something live longer than it already does.
fn index(&self) -> &u32 {
    let tmp = 1;
    &tmp // not valid, because tmp isn't stored anywhere
}
fn index(&self) -> &u32 {
    // ok, because the value is stored in self,
    // which existed before this function has been called
    &self.tmp
}
You may find that returning &1 works. That's because 1 is stored in your program's executable which, as far as the program is concerned, is permanent storage. But 'static is an exception for literals and leaked memory, so it's not something you can rely on in most cases.
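The two exceptions mentioned here, literals and leaked memory, can be illustrated with a small sketch. The function names are hypothetical and chosen only for this example.

```rust
/// Compiles: the literal `1` is promoted to a constant with a `'static` lifetime.
fn literal() -> &'static u32 {
    &1
}

/// Compiles: `Box::leak` deliberately never frees the allocation,
/// so the reference stays valid for the rest of the program.
fn leaked(value: u32) -> &'static u32 {
    Box::leak(Box::new(value))
}
```

The leaked version trades memory for the lifetime: every call allocates a u32 that is never reclaimed.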

How can I simultaneously iterate over a Rust HashMap and modify some of its values?

I'm trying Advent of Code in Rust this year, as a way of learning the language. I've parsed the input (from day 7) into the following structure:
struct Process {
    name: String,
    weight: u32,
    children: Vec<String>,
    parent: Option<String>
}
These are stored in a HashMap<String, Process>. Now I want to iterate over the values in the map and update the parent values, based on what I find in the parent's "children" vector.
What doesn't work is
for p in self.processes.values() {
    for child_name in p.children {
        let mut child = self.processes.get_mut(child_name).expect("Child not found.");
        child.parent = p.name;
    }
}
I can't have both a mutable reference to the HashMap (self.processes) and a non-mutable reference, or two mutable references.
So, what is the most idiomatic way to accomplish this in Rust? The two options I can see are:
Copy the parent/child relationships into a new temporary data structure in one pass, and then update the Process structs in a second pass, after the immutable reference is out of scope.
Change my data structure to put "parent" in its own HashMap.
Is there a third option?
Yes, you can give the HashMap's values interior mutability using RefCell:
use std::cell::RefCell;
use std::collections::HashMap;
struct ProcessTree {
    processes: HashMap<String, RefCell<Process>>, // change #1
}
impl ProcessTree {
    fn update_parents(&self) {
        for p in self.processes.values() {
            let p = p.borrow(); // change #2
            for child_name in &p.children {
                let mut child = self.processes
                    .get(child_name) // change #3
                    .expect("Child not found.")
                    .borrow_mut(); // change #4
                child.parent = Some(p.name.clone());
            }
        }
    }
}
borrow_mut will panic at runtime if the child is already borrowed with borrow. This happens if a process is its own parent (which should presumably never happen, but in a more robust program you'd want to give a meaningful error message instead of just panicking).
I invented some names and made a few small changes (besides the ones specifically indicated) to make this code compile. Notably, p.name.clone() makes a full copy of p.name. This is necessary because both name and parent are owned Strings.
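If panicking is not acceptable, a variant along the following lines could report the problem instead. It is only a sketch: the try_update_parents name and the String error type are assumptions made for illustration, and the weight field is omitted for brevity.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

struct Process {
    name: String,
    children: Vec<String>,
    parent: Option<String>,
}

struct ProcessTree {
    processes: HashMap<String, RefCell<Process>>,
}

impl ProcessTree {
    /// Like `update_parents`, but reports problems as `Err` instead of panicking.
    fn try_update_parents(&self) -> Result<(), String> {
        for cell in self.processes.values() {
            let p = cell.borrow();
            for child_name in &p.children {
                let child_cell = self
                    .processes
                    .get(child_name)
                    .ok_or_else(|| format!("child {child_name} not found"))?;
                // `try_borrow_mut` fails instead of panicking when the cell is
                // already borrowed, which happens if `p` lists itself as a child.
                let mut child = child_cell
                    .try_borrow_mut()
                    .map_err(|_| format!("{child_name} is listed as its own child"))?;
                child.parent = Some(p.name.clone());
            }
        }
        Ok(())
    }
}
```

A real program would probably use a dedicated error enum rather than String, but the borrowing pattern stays the same.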

"borrowed value does not live long enough" error in trie insertion

I have a simple trie implementation where an Edge contains a character and a reference to another Node:
struct Edge<'a> {
    ch: char,
    to: &'a Node<'a>,
}
A Node contains a vector of edges:
pub struct Node<'a> {
    edges: Vec<Edge<'a>>,
}
I'm trying to implement the method to insert/get a character into a node. I think the return value should be a reference to a Node: if the character is already in one of the edges, then we directly return the existing Node; if not, we return the newly created Node. This is where I get into trouble:
impl<'a> Node<'a> {
    fn get_or_create(&mut self, ch: char) -> &Node<'a> {
        match self.edges.binary_search_by(|e| e.ch.cmp(&ch)) {
            Ok(idx) => {
                return &self.edges.get(idx).unwrap().to;
            }
            Err(idx) => {
                let to = &Node { edges: Vec::new() };
                let e = Edge { ch: ch, to: to };
                self.edges.insert(idx, e);
                return to;
            }
        }
    }
}
The to is said to not live long enough.
I'm quite sure what I wrote is far from idiomatic Rust. Initially when I included the reference to Node in Edge, I didn't add the lifetime parameter, and was prompted to do so, then I had to add it everywhere. However it looks quite weird. I wonder what would be the correct way to do it?
Maybe what I should really have used is some other wrapper type abstraction in Edge to refer to heap-allocated Node, e.g. Box? I shall read the section on this topic in The Rust Programming Language carefully.
This data structure can't work as designed. The red flag is the following sentence:
I think the return value should be a reference to a Node: if the character is already in one of the edges, then we directly return the existing Node; if not, we return the newly created Node.
The code doesn't return the newly created node, it attempts to return a reference to the newly created node. Returning a reference to an object is only safe if the object is stored in a place where it will outlive the reference. Otherwise the reference would end up pointing to the location on the stack where the object used to reside, resulting in a crash when used. Mistakes like this one were a frequent source of crashes in C and C++ and are precisely the kind of bug that Rust's borrow checker was designed to prevent.
Rust tracks reference lifetimes using the lifetime parameters on functions and data. To prove that the reference will not outlive the object, Rust prohibits the lifetime of the reference from extending beyond the lifetime of the object. Since the new node is dropped at the end of the function and the reference is returned from the function, the object's lifetime is too short and the code is correctly rejected as invalid.
There are several possible fixes:
Store the Node directly inside Edge. This was shown to compile.
Change &Node to Rc<Node>. This allows shared ownership of a single node by more than one edge, and automatic deallocation.
In both cases explicit lifetime management will no longer be necessary, and ownership will "just work". If you know C++11, an Rc<> is roughly equivalent to a std::shared_ptr.
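The first fix, storing the Node directly inside Edge, might look like the following sketch. The derived Default, the &mut Node return type, and the insert helper are choices made for this illustration and are not taken from the question.

```rust
struct Edge {
    ch: char,
    to: Node, // the edge owns its target node, so no lifetime parameter is needed
}

#[derive(Default)]
pub struct Node {
    edges: Vec<Edge>,
}

impl Node {
    /// Returns the child reached via `ch`, creating it first if it is missing.
    fn get_or_create(&mut self, ch: char) -> &mut Node {
        let idx = match self.edges.binary_search_by(|e| e.ch.cmp(&ch)) {
            Ok(idx) => idx,
            Err(idx) => {
                // Insert at the position that keeps `edges` sorted by character.
                self.edges.insert(idx, Edge { ch, to: Node::default() });
                idx
            }
        };
        // The node now lives inside `self.edges`, so borrowing it from `self` is fine.
        &mut self.edges[idx].to
    }

    /// Walks (and extends) the trie one character at a time.
    fn insert(&mut self, word: &str) {
        let mut node = self;
        for ch in word.chars() {
            node = node.get_or_create(ch);
        }
    }
}
```

Because the returned reference borrows from self, where the new node is now stored, the borrow checker accepts it.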

How do I hold a collection of one struct in another where lifetimes are not predictable?

I want to manage a collection of objects in another object but I can't predict the lifetime of the elements in this collection.
I found this example in Syntax of Rust lifetime specifier that demonstrates what I can't do:
struct User<'a> {
    name: &'a str,
}
// ... impls omitted
struct ChatRoom<'a> {
    name: &'a str,
    users: HashMap<&'a str, User<'a>>,
}
ChatRoom holds a map of Users. Each User is a copy although the name within User is a shared reference. The User and the ChatRoom have an explicit lifetime so when they are joined the compiler enforces that Users must live longer than the ChatRoom they're going into.
But what if my User was created after the ChatRoom? I can't use lifetimes because the compiler will complain. What if I delete a User before the ChatRoom? I can't do that either.
How can the ChatRoom hold Users who might be created after it or destroyed before it? I vaguely suspect that something could be done with boxes to implement this but Rust's box documentation is quite poor so I am not certain.
In Rust, some types come in pairs: a borrowed and an owned counterpart. For strings, the borrowed version is &'a str and the owned version is String. Owned versions don't have a lifetime parameter, because they own all their data. That doesn't mean they don't contain pointers internally; String stores its data on the heap, and the actual String object only contains a pointer to that data.
By using String instead of &'a str, you can avoid issues with the order of construction, because you can move owned data around freely (so long as it isn't borrowed elsewhere). For example, when you create a User, you first need to create a String, which you would then move into the new User, and finally you then move the User into the ChatRoom's HashMap.
struct User {
    name: String,
}
struct ChatRoom {
    name: String,
    users: HashMap<String, User>,
}
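With owned data, users can be created after the room and removed before it goes away. The sketch below illustrates this; the new, join, and leave methods are hypothetical names introduced only for the example.

```rust
use std::collections::HashMap;

struct User {
    name: String,
}

struct ChatRoom {
    name: String,
    users: HashMap<String, User>,
}

impl ChatRoom {
    fn new(name: &str) -> Self {
        ChatRoom { name: name.to_string(), users: HashMap::new() }
    }

    /// Moves `user` into the room; the room owns it from here on.
    fn join(&mut self, user: User) {
        self.users.insert(user.name.clone(), user);
    }

    /// Moves the user back out of the room, if present.
    fn leave(&mut self, name: &str) -> Option<User> {
        self.users.remove(name)
    }
}
```

Note that the key is a separate copy of the name, which is the duplication that shared ownership would avoid.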
However, if you need shared references to the same string, you need to wrap the String in a type that provides shared ownership. In a single-threaded program, you can use Rc for this. If you need to access the data from multiple threads, Rc won't work; you need Arc instead.
struct User {
    name: Rc<String>,
}
struct ChatRoom {
    name: String,
    users: HashMap<String, User>,
}
In order to create a new Rc<String> that points to the same string, you simply call the clone() method on the first Rc<String>.
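As a small sketch of that cloning step (the share_name helper is a hypothetical name used only here):

```rust
use std::rc::Rc;

struct User {
    name: Rc<String>,
}

/// Creates a second `User` that shares the first one's name allocation.
fn share_name(original: &User) -> User {
    // `Rc::clone` bumps the reference count; the `String` itself is not copied.
    User { name: Rc::clone(&original.name) }
}
```

Writing Rc::clone(&x) instead of x.clone() is a common convention that makes it visible that only the pointer is cloned.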
Now, the String in an Rc<String> is effectively immutable, because Rc only hands out shared references to its value (Rc::get_mut succeeds only while no other Rc or Weak points to the same allocation). If you need to mutate shared data, pair Rc with RefCell (or use Arc with Mutex or RwLock in a multi-threaded program).
struct User {
    name: Rc<RefCell<String>>,
}
struct ChatRoom {
    name: String,
    users: HashMap<String, User>,
}
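Mutating through the shared handle could then look like this sketch, where rename is a hypothetical helper introduced for illustration:

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct User {
    name: Rc<RefCell<String>>,
}

/// Renames through one handle; every `User` sharing the `Rc` sees the change.
fn rename(user: &User, new_name: &str) {
    // `borrow_mut` panics if the name is currently borrowed elsewhere.
    let mut name = user.name.borrow_mut();
    name.clear();
    name.push_str(new_name);
}
```

The borrow rules are still enforced, only at runtime instead of at compile time.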

Resources