I have a game I'm trying to code and I want to memoize a next_guess function, as it's costly. I know there are memoization crates, but I have some unusual requirements and the whole project is an exercise in learning Rust, so I wanted to know how a true Rustacean would think about it. The next_guess function is in the impl of a Node struct. The game tree branches very quickly, so each level has dozens of possible next-step nodes to analyze. If I add a reference to the memoization HashMap to the Node struct for next_guess to use, I can't make it mutable, since there can't be multiple mutable references to it; but I need it to be mutable so I can add new values. I thought using globals was a no-no, but is making the HashMap a lazy_static the right approach, or should I use an unsafe block to access it (can I even do that)? Thanks
Just my two cents:
Having a HashMap within each Node will consume lots of space.
If you already have a specific Node at hand, do you really want to look it up in a HashMap? Couldn't the Node itself store the cached value (possibly in an Option<BestNextGuess>)?
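To make that concrete, here is a rough sketch of caching inside the node (BestNextGuess and the method names are just placeholders for whatever the game actually uses):

struct BestNextGuess; // stand-in for the real result type

struct Node {
    // ... game state ...
    cached_guess: Option<BestNextGuess>, // None until first computed
}

impl Node {
    fn next_guess(&mut self) -> &BestNextGuess {
        if self.cached_guess.is_none() {
            // The expensive analysis only runs the first time.
            self.cached_guess = Some(self.compute_guess());
        }
        self.cached_guess.as_ref().unwrap()
    }

    fn compute_guess(&self) -> BestNextGuess {
        BestNextGuess // placeholder for the costly search
    }
}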
I ended up refactoring to create a game struct that owned the HashMap and then passed all the guess attempts through the game struct first before calling into the node struct.
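Roughly, that shape looks like this (all names here are simplified for illustration):

use std::collections::HashMap;

#[derive(Clone, PartialEq, Eq, Hash)]
struct NodeState; // whatever uniquely identifies a position

#[derive(Clone)]
struct BestNextGuess;

struct Node {
    state: NodeState,
}

impl Node {
    fn compute_next_guess(&self) -> BestNextGuess {
        BestNextGuess // the costly analysis lives here
    }
}

struct Game {
    memo: HashMap<NodeState, BestNextGuess>,
}

impl Game {
    // All guess attempts go through the Game, which owns the single
    // mutable memo table, so no shared-mutability problem arises.
    fn next_guess(&mut self, node: &Node) -> BestNextGuess {
        self.memo
            .entry(node.state.clone())
            .or_insert_with(|| node.compute_next_guess())
            .clone()
    }
}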
Related
I'm trying to load JSON files that refer to structs implementing a trait. When the JSON files are loaded, the struct is grabbed from a hashmap. The problem is that I'll probably end up inserting a lot of structs into that hashmap all over my code, and I would like that to happen automatically. To me this seems doable with procedural macros, something like:
#[my_proc_macro(type=ImplementedType)]
struct MyStruct {}
impl ImplementedType for MyStruct {}
fn load_implementors() {
    let mut implementors = HashMap::new();
    load_implementors!(implementors, ImplementedType);
}
Is there a way to do this?
No
There is a core issue that makes it difficult to skip manually inserting into a structure. Consider this simplified example, where we simply want to print values that are provided separately in the code-base:
my_register!(alice);
my_register!(bob);
fn main() {
    my_print(); // prints "alice" and "bob"
}
In typical Rust, there is no mechanism to link the my_print() call to the multiple invocations of my_register!. There is no support for declaration merging, run-time/compile-time reflection, or run-before-main execution, which other languages might use to make this possible (unless, of course, there's something I'm missing).
But Also Yes
There are third party crates built around link-time or run-time tricks that can make this possible:
ctor allows you to define functions that are executed before main(). With it, you can have my_register!() create individual functions for alice and bob that, when executed, add themselves to some global structure which can then be accessed by my_print().
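A minimal sketch of that pattern (the registry and function names are made up; it assumes the ctor crate plus a Mutex-guarded global):

use std::sync::Mutex;

use ctor::ctor;

// Global registry that the before-main functions push into.
static REGISTRY: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

#[ctor]
fn register_alice() {
    REGISTRY.lock().unwrap().push("alice");
}

#[ctor]
fn register_bob() {
    REGISTRY.lock().unwrap().push("bob");
}

fn my_print() {
    for name in REGISTRY.lock().unwrap().iter() {
        println!("{name}");
    }
}

fn main() {
    my_print(); // prints "alice" and "bob" (registration order not guaranteed)
}

In a real version, my_register!(alice) would simply expand to one of those #[ctor] functions.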
linkme allows you to define a slice made from elements that are declared separately and combined at link time. my_register!() simply needs to use this crate's attributes to add an element to the slice, which my_print() can easily access.
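And the equivalent sketch with linkme's distributed_slice (again, the names are invented):

use linkme::distributed_slice;

// The slice is declared once...
#[distributed_slice]
static REGISTERED_NAMES: [&'static str] = [..];

// ...and elements can be added from anywhere in the code-base.
#[distributed_slice(REGISTERED_NAMES)]
static ALICE: &'static str = "alice";

#[distributed_slice(REGISTERED_NAMES)]
static BOB: &'static str = "bob";

fn my_print() {
    for name in REGISTERED_NAMES.iter() {
        println!("{name}");
    }
}

fn main() {
    my_print(); // prints "alice" and "bob" (in no particular order)
}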
I understand skepticism of these methods since the declarative approach is often clearer to me, but sometimes they are necessary or the ergonomic benefits outweigh the "magic".
I have a Vec<Box> and I'd like to store all my widgets in here for rendering, but I'd also like to keep references to certain widgets. When I push the widgets into my vector they are moved, and the borrow checker complains when I try to reference them. How can I get around this?
The issue you're seeing here is that you cannot mutate the vector while also holding references into it. This is a good guard to have, as mutating a vector while references into it exist can leave those references dangling (for example if the vector reallocates).
There are a few ways to fix this, the first being reference counting with either Rc or Arc. This changes your vector to Vec<Rc/Arc<WidgetType>>, or Vec<Rc/Arc<dyn WidgetTrait>> if you're doing virtual calls. Then, instead of referencing those widgets elsewhere, you clone the reference-counted pointer. This can be done relatively easily, but has the downsides that a pointer indirection is added and the items can still exist even if they're removed from your render queue.
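As a sketch (with a made-up WidgetTrait), the reference-counted version looks like this:

use std::rc::Rc;

trait WidgetTrait {
    fn render(&self);
}

struct Button;

impl WidgetTrait for Button {
    fn render(&self) {
        println!("rendering a button");
    }
}

fn main() {
    // The render list owns shared handles to the widgets.
    let button: Rc<dyn WidgetTrait> = Rc::new(Button);
    let mut widgets: Vec<Rc<dyn WidgetTrait>> = vec![Rc::clone(&button)];

    for widget in &widgets {
        widget.render();
    }

    // `button` is a second handle to the same widget, usable independently.
    button.render();
    widgets.clear(); // the widget stays alive as long as `button` exists
}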
The next option is to store indexes/keys into the render list. This can be done with a usize index into a Vec, as @trentcl said in the comments, or an arbitrary key with a HashMap. The downsides are index invalidation when items move within the vector, or a lot of hashing with a HashMap.
The best option in my opinion requires a library but covers all of the other downsides. The library is slotmap. It creates a map with a vector backing that returns keys you can use in place of your references. There are a few options within it for how the data is stored and which operations are optimized, all covered in the documentation. The main issue is that you have to add a dependency, which may not be preferable depending on your specific case.
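A small sketch of the slotmap approach (assuming slotmap 1.x and the same made-up WidgetTrait as above):

use slotmap::{DefaultKey, SlotMap};

trait WidgetTrait {
    fn render(&self);
}

struct Button;

impl WidgetTrait for Button {
    fn render(&self) {
        println!("rendering a button");
    }
}

fn main() {
    let mut widgets: SlotMap<DefaultKey, Box<dyn WidgetTrait>> = SlotMap::new();

    // Keep the key instead of a reference; it stays valid until the
    // widget is removed, even if other insertions or removals happen.
    let button_key = widgets.insert(Box::new(Button));

    for (_key, widget) in &widgets {
        widget.render();
    }

    widgets[button_key].render(); // look a specific widget up by its key
}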
I have a struct called Pizza. It contains a single Base struct and a vector of Topping structs.
I have a helper method that returns a Pizza. In the (near) future, I see the toppings being a collection somewhere and Pizzas being dynamically created from this collection of Toppings (mix and match style).
My question is about how the struct should reference Bases and Toppings. If I give ownership to the struct, everything is easy to handle (no lifetime declarations, and helper methods are simpler since they no longer need to create the values outside their own scope). However, by keeping the Toppings outside the scope of my helper method and declaring lifetimes, I avoid duplicating them in memory.
How do people reason about these problems? Is there a recommended rule of thumb to follow? Is it possible to have both?
If Topping is small (e.g. an enum), then you can just copy it (e.g. into Vec<Topping>).
If Topping is large and you want only one copy in memory, the easiest to work with is to use Arc<Topping> which is a shared pointer, and can be cheaply cloned and easily passed around (e.g. into Vec<Arc<Topping>>).
If both Pizza and Topping are used only in a specific statically-known scope (e.g. you create all toppings in main() and don't change them later, or you use a memory pool), you may get away with using &'a Topping in Pizza<'a>, but this is likely to be an unnoticeably small performance improvement compared to Rc/Arc, and keeping track of the temporary lifetime will be annoying.
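A sketch of the Arc version (field names are just for illustration):

use std::sync::Arc;

struct Base;

struct Topping {
    name: String,
}

struct Pizza {
    base: Base,
    toppings: Vec<Arc<Topping>>, // shared handles, cheap to clone
}

fn main() {
    // One catalogue of toppings, shared by every Pizza built from it.
    let catalogue = vec![
        Arc::new(Topping { name: "pepperoni".to_string() }),
        Arc::new(Topping { name: "mushroom".to_string() }),
    ];

    let pizza = Pizza {
        base: Base,
        toppings: vec![Arc::clone(&catalogue[0]), Arc::clone(&catalogue[1])],
    };

    for topping in &pizza.toppings {
        println!("{}", topping.name);
    }
}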
I have a struct that contains a field that is rather expensive to initialize, so I want to be able to do so lazily. However, this may be necessary in a method that takes &self. The field also needs to be able to be modified once it is initialized, but this will only occur in methods that take &mut self.
What is the correct (as in idiomatic, as well as in thread-safe) way to do this in Rust? It seems to me that it would be trivial with either of the two constraints:
If it only needed to be lazily initialized, and not mutated, I could simply use lazy-init's Lazy<T> type.
If it only needed to be mutable and not lazy, then I could just use a normal field (obviously).
However, I'm not quite sure what to do with both in place. RwLock seems relevant, but it appears that there is considerable trickiness to thread-safe lazy initialization given what I've seen of lazy-init's source, so I am hesitant to roll my own solution based on it.
The simplest solution is RwLock<Option<T>>.
> However, I'm not quite sure what to do with both in place. RwLock seems relevant, but it appears that there is considerable trickiness to thread-safe lazy initialization given what I've seen of lazy-init's source, so I am hesitant to roll my own solution based on it.
lazy-init uses tricky code because it guarantees lock-free access after creation. Lock-free is always a bit trickier.
Note that in Rust it's easy to tell whether something is tricky or not: tricky means using an unsafe block. Since you can use RwLock<Option<T>> without any unsafe block there is nothing for you to worry about.
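A minimal sketch of that approach (the field type and the init closure here are placeholders):

use std::sync::RwLock;

struct Thing {
    expensive: RwLock<Option<String>>,
}

impl Thing {
    fn new() -> Self {
        Thing { expensive: RwLock::new(None) }
    }

    // Lazily initialize behind &self; callers only ever see the value.
    fn expensive(&self) -> String {
        if let Some(value) = self.expensive.read().unwrap().as_ref() {
            return value.clone();
        }
        let mut slot = self.expensive.write().unwrap();
        // Another thread may have initialized it between the two locks.
        slot.get_or_insert_with(|| "expensive computation".to_string()).clone()
    }

    // Plain mutation is easy once you have &mut self.
    fn set(&mut self, value: String) {
        *self.expensive.get_mut().unwrap() = Some(value);
    }
}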
A variant of RwLock<Option<T>> may be necessary if you want to capture a closure for initialization once, rather than having to pass it at each potential initialization call-site.
In this case, you'll need something like RwLock<SimpleLazy<T>> where:
enum SimpleLazy<T> {
    Initialized(T),
    Uninitialized(Box<dyn FnOnce() -> T>),
}
You don't have to worry about making SimpleLazy<T> Sync as RwLock will take care of that for you.
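For completeness, a sketch of how SimpleLazy<T> (the enum above) might hand out the value; the placeholder-swap is just one way to call the boxed FnOnce by value:

use std::mem;

impl<T> SimpleLazy<T> {
    fn new(init: impl FnOnce() -> T + 'static) -> Self {
        SimpleLazy::Uninitialized(Box::new(init))
    }

    // Initialize on first access, then return a reference to the value.
    fn get(&mut self) -> &T {
        if let SimpleLazy::Uninitialized(_) = self {
            // Swap in a dummy closure so the real one can be taken out
            // of the enum and called by value.
            let dummy = SimpleLazy::Uninitialized(Box::new(|| unreachable!()));
            if let SimpleLazy::Uninitialized(init) = mem::replace(self, dummy) {
                *self = SimpleLazy::Initialized(init());
            }
        }
        match self {
            SimpleLazy::Initialized(value) => value,
            SimpleLazy::Uninitialized(_) => unreachable!(),
        }
    }
}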
I am writing a service that will collect a great number of values and build large structures around them. For some of these, lookup tables are needed, and due to memory constraints I do not want to copy the key or value passed to the HashMap. However, using references gets me into trouble with the borrow checker (see example below). What is the preferred way of working with run-time created instances?
use std::collections::HashMap;
#[derive(PartialEq, Eq, PartialOrd, Ord, Hash)]
struct LargeKey;
struct LargeValue;
fn main() {
    let mut lots_of_lookups: HashMap<&LargeKey, &LargeValue> = HashMap::new();

    let run_time_created_key = LargeKey;
    let run_time_created_value = LargeValue;

    lots_of_lookups.insert(&run_time_created_key, &run_time_created_value);

    lots_of_lookups.clear();
}
I was expecting clear() to release the borrows, but even if it actually does so, perhaps the compiler cannot figure that out?
> Also, I was expecting clear() to release the borrows, but even if it actually does so, perhaps the compiler cannot figure that out?
At the moment, borrowing is purely scope based. Only a method which consumes the borrower can revoke the borrow, which is not always ideal.
> What is the preferred way of working with shared run-time created instances?
The simplest way to express shared ownership is to use shared ownership. It does come with some syntactic overhead; however, it greatly simplifies reasoning.
In Rust, there are two simple standard ways of expressing shared ownership:
Rc<RefCell<T>>, for sharing within a thread,
Arc<Mutex<T>>, for sharing across threads.
There are some variations (using Cell instead of RefCell, or RwLock instead of Mutex), however those are the basics.
Beyond syntactic overhead, there's also some amount of run-time overhead going into (a) increasing/decreasing the reference count whenever you make a clone and (b) checking/marking/clearing the usage flag when accessing the wrapped instance of T.
There is one non-negligible downside to this approach, though. The borrowing rules are now checked at runtime instead of compile time, and therefore violations lead to panic instead of compile time errors.
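Applied to the example from the question, the single-threaded version might look like this (Rc alone is enough here since nothing mutates the stored values; add RefCell or Mutex only if you need to mutate through the shared handles):

use std::collections::HashMap;
use std::rc::Rc;

#[derive(PartialEq, Eq, Hash)]
struct LargeKey;
struct LargeValue;

fn main() {
    // The map owns cheap Rc handles instead of borrowing local variables,
    // so no lifetime ties it to the scope that created the data.
    let mut lots_of_lookups: HashMap<Rc<LargeKey>, Rc<LargeValue>> = HashMap::new();

    let run_time_created_key = Rc::new(LargeKey);
    let run_time_created_value = Rc::new(LargeValue);

    lots_of_lookups.insert(
        Rc::clone(&run_time_created_key),
        Rc::clone(&run_time_created_value),
    );

    // The originals remain usable, and clearing the map is no problem.
    lots_of_lookups.clear();
    drop((run_time_created_key, run_time_created_value));
}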