How to back a HashMap with a Vec - rust

I tried implementing a generic A* tree search algorithm. The important part is in the function hucs marked with a TODO:
use std::collections::BinaryHeap;
use std::collections::HashMap;
use std::cmp::Ordering;
pub trait SearchTree<A> {
fn available_actions(&self) -> Vec<A>;
fn apply_action(&self, act: &A) -> Self;
}
pub trait CostSearchTree<A>: SearchTree<A> + Eq {
fn action_cost(&self, act: &A) -> f64;
}
/// Node in the expanded search tree for uniform cost search with heuristic
struct HucsNode<A, T>
where
T: CostSearchTree<A>,
{
cost: f64,
heuristic_cost: f64,
parent_index: usize,
action: Option<A>,
tree: T,
}
impl<A, T> PartialEq for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn eq(&self, other: &HucsNode<A, T>) -> bool {
// Can be used for closed list checking if we just compare the trees
return self.tree == other.tree;
}
}
impl<A, T> Eq for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
}
impl<A, T> PartialOrd for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn partial_cmp(&self, other: &HucsNode<A, T>) -> Option<Ordering> {
Some(self.cmp(other))
}
}
impl<A, T> Ord for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn cmp(&self, other: &HucsNode<A, T>) -> Ordering {
let self_cost = self.cost + self.heuristic_cost;
let other_cost = other.cost + other.heuristic_cost;
// Flip for min-heap
match other_cost.partial_cmp(&self_cost) {
Some(order) => order,
_ => Ordering::Equal,
}
}
}
/// Perform a uniform cost search with a monotonous heuristic function on a search tree.
/// Returns a sequence of actions if a state is found that satisfies the predicate or None if the search terminates before.
pub fn hucs<A, T: CostSearchTree<A> + Hash>(
tree: T,
predicate: &Fn(&T) -> bool,
heuristic: &Fn(&T) -> f64,
) -> Option<Vec<A>> {
let mut node_heap = BinaryHeap::new() as BinaryHeap<HucsNode<A, T>>;
// Push the initial node onto the tree
node_heap.push(HucsNode {
action: None,
parent_index: usize::max_value(),
cost: 0.0,
heuristic_cost: heuristic(&tree),
tree: tree,
});
let mut old_nodes = Vec::new();
let mut last_node_index = 0 as usize;
'outer: while let Some(current_node) = node_heap.pop() {
// Break borrows with scope so current_node can be moved out
{
if predicate(&current_node.tree) {
return Some(form_action_sequence(current_node, old_nodes));
}
// Check if visited nodes already contains this tree with less cost
// TODO: Time complexity is hardly ideal
for old_node in old_nodes.iter() {
if old_node.tree == current_node.tree && old_node.cost <= current_node.cost {
continue 'outer;
}
}
let ref current_tree = current_node.tree;
for action in current_tree.available_actions() {
let action_cost = current_tree.action_cost(&action);
let new_tree = current_tree.apply_action(&action);
let new_cost = current_node.cost + action_cost;
let new_node = HucsNode {
action: Some(action),
cost: new_cost,
parent_index: last_node_index,
heuristic_cost: heuristic(&new_tree),
tree: new_tree,
};
node_heap.push(new_node);
}
}
old_nodes.push(current_node);
last_node_index += 1;
}
return None;
}
/// Restore the sequence of actions that was used to get to this node by climbing the tree of expanded nodes
fn form_action_sequence<A, T: CostSearchTree<A>>(
leaf: HucsNode<A, T>,
mut older_nodes: Vec<HucsNode<A, T>>,
) -> Vec<A> {
let mut action_vector = Vec::new();
let mut current = leaf;
while let Some(action) = current.action {
action_vector.insert(0, action);
// Safe to swap as nodes' parents are always before them
current = older_nodes.swap_remove(current.parent_index);
}
return action_vector;
}
The problem is that looking up whether the current node was in the old nodes by scanning over the old nodes takes way too long. Therefore I wanted to add a HashMap. Since I however also need to be able to access the old nodes by indices to form the solution action sequence at the end, I also need to keep the Vec. To solve this I tried adding a wrapper that I can insert into the HashMap as a key that just looks up its content in the Vec like this:
use std::hash::Hash;
use std::hash::Hasher;
struct BackedHashWrapper<'a, T: 'a + Hash + Eq> {
source: &'a Vec<T>,
index: usize,
}
impl<A, T> Hash for HucsNode<A, T>
where
T: CostSearchTree<A> + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.tree.hash(state);
}
}
impl<'a, T> Hash for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.source[self.index].hash(state);
}
}
impl<'a, T> PartialEq for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn eq(&self, other: &BackedHashWrapper<T>) -> bool {
self.source[self.index] == other.source[other.index]
}
}
impl<'a, T> Eq for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
}
I cannot figure out how to implement this in the hucs method, I tried the following just for adding elements to the hashmap
...
let mut old_nodes = Vec::new();
let mut hash_map = HashMap::new();
...
...
hash_map.insert(BackedHashWrapper {source: &old_nodes, index: last_node_index}, current_node.cost);
old_nodes.push(current_node);
last_node_index += 1;
...
but the borrow checker will not allow me to create such a BackedHashWrapper while the source vector is mutable. Clearly I am doing this completely the wrong way, so how could I accomplish this without having to clone either the tree or any actions?

I suppose it is easier to use other type of backing storage (TypedArena from typed-arena crate, for example).
But taking the question at face value, the problem you are dealing with is caused by Rust borrowing rules. That is you can't have shared (&) and mutable (&mut) references or multiple mutable references to the same object in the same scope.
hash_map in your example holds shared references to the vector, "freezing" it, which makes it impossible to modify the vector while hash_map is in the scope.
Solution to this problem is interior mutability pattern.
In your case, you can use RefCell<Vec<T>> to be able to modify vector while holding multiple references to it.
use std::cell::RefCell;
type RVec<T> = RefCell<Vec<T>>;
struct BackedHashWrapper<'a, T: 'a + Hash + Eq> {
source: &'a RVec<T>,
index: usize,
}
...
impl<'a, T> Hash for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.source.borrow()[self.index].hash(state);
}
}
...
// Similar changes for Eq and PartialEq
...
let mut old_nodes: RVec<_> = RefCell::default();
let mut hash_map = HashMap::new();
...
...
hash_map.insert(BackedHashWrapper {source: &old_nodes, index: last_node_index}, current_node.cost);
old_nodes.borrow_mut().push(current_node);
last_node_index += 1;
...
Maybe a couple of borrow()s and borrow_mut()s will be required in other places.

Related

Trouble with stricter lifetime requirements when implementing FromIterator for LazyList

I was playing with the code from this answer but the FromIterator impl does not compile any more:
error[E0276]: impl has stricter requirements than trait --> src/lib.rs:184:9
| 184 | fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ impl has extra requirement `I: 'a`
For more information about this error, try `rustc --explain E0276`.
The slightly updated code is on the playground.
// only necessary because Box<FnOnce() -> R> doesn't work...
mod thunk {
pub trait Invoke<R = ()> {
fn invoke(self: Box<Self>) -> R;
}
impl<R, F: FnOnce() -> R> Invoke<R> for F {
#[inline(always)]
fn invoke(self: Box<F>) -> R { (*self)() }
}
}
// Lazy is lazily evaluated contained value using the above Invoke trait
// instead of the desire Box<FnOnce() -> T> or a stable FnBox (currently not)...
pub mod lazy {
use crate::thunk::Invoke;
use std::cell::UnsafeCell;
use std::mem::replace;
use std::ops::Deref;
// Lazy is lazily evaluated contained value using the above Invoke trait
// instead of the desire Box<FnOnce() -> T> or a stable FnBox (currently not)...
pub struct Lazy<'a, T: 'a>(UnsafeCell<LazyState<'a, T>>);
enum LazyState<'a, T: 'a> {
Unevaluated(Box<dyn Invoke<T> + 'a>),
EvaluationInProgress,
Evaluated(T),
}
use self::LazyState::*;
impl<'a, T: 'a> Lazy<'a, T> {
#[inline]
pub fn new<F: 'a + FnOnce() -> T>(func: F) -> Lazy<'a, T> {
Lazy(UnsafeCell::new(Unevaluated(Box::new(func))))
}
#[inline]
pub fn evaluated(val: T) -> Lazy<'a, T> {
Lazy(UnsafeCell::new(Evaluated(val)))
}
#[inline(always)]
fn force(&self) {
unsafe {
match *self.0.get() {
Evaluated(_) => {}, // nothing required; already Evaluated
EvaluationInProgress => panic!("Lazy::force called recursively!!!"),
_ => {
let ue = replace(&mut *self.0.get(), EvaluationInProgress);
if let Unevaluated(thnk) = ue {
*self.0.get() = Evaluated(thnk.invoke());
} // no other possiblity!
}
}
}
}
#[inline]
pub fn unwrap<'b>(self) -> T where T: 'b { // consumes the object to produce the value
self.force(); // evaluatate if not evealutated
match { self.0.into_inner() } {
Evaluated(v) => v,
_ => unreachable!() // previous code guarantees never not Evaluated
}
}
}
impl<'a, T: 'a> Deref for Lazy<'a, T> {
type Target = T;
#[inline]
fn deref(&self) -> &T {
self.force(); // evaluatate if not evalutated
match *unsafe { &*self.0.get() } {
Evaluated(ref v) => v,
_ => unreachable!(),
}
}
}
}
// LazyList is an immutable lazily-evaluated persistent (memoized) singly-linked list
// similar to lists in Haskell, although here only tails are lazy...
// depends on the contained type being Clone so that the LazyList can be
// extracted from the reference-counted Rc heap objects in which embedded.
pub mod lazylist {
use crate::lazy::Lazy;
use std::rc::Rc;
use std::iter::FromIterator;
use std::mem::{replace, swap};
#[derive(Clone)]
pub enum LazyList<'a, T: 'a + Clone> {
Empty,
Cons(T, RcLazyListNode<'a, T>),
}
pub use self::LazyList::Empty;
use self::LazyList::Cons;
type RcLazyListNode<'a, T> = Rc<Lazy<'a, LazyList<'a, T>>>;
// impl<'a, T:'a> !Sync for LazyList<'a, T> {}
impl<'a, T: 'a + Clone> LazyList<'a, T> {
#[inline]
pub fn singleton(v: T) -> LazyList<'a, T> {
Cons(v, Rc::new(Lazy::evaluated(Empty)))
}
#[inline]
pub fn cons<F>(v: T, cntf: F) -> LazyList<'a, T>
where F: 'a + FnOnce() -> LazyList<'a, T>
{
Cons(v, Rc::new(Lazy::new(cntf)))
}
#[inline]
pub fn head(&self) -> &T {
if let Cons(ref hd, _) = *self {
return hd;
}
panic!("LazyList::head called on an Empty LazyList!!!")
}
#[inline]
pub fn tail<'b>(&'b self) -> &'b Lazy<'a, LazyList<'a, T>> {
if let Cons(_, ref rlln) = *self {
return &*rlln;
}
panic!("LazyList::tail called on an Empty LazyList!!!")
}
#[inline]
pub fn unwrap(self) -> (T, RcLazyListNode<'a, T>) {
// consumes the object
if let Cons(hd, rlln) = self {
return (hd, rlln);
}
panic!("LazyList::unwrap called on an Empty LazyList!!!")
}
#[inline]
fn iter(&self) -> Iter<'a, T> {
Iter(self)
}
}
impl<'a, T: 'a + Clone> Iterator for LazyList<'a, T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
match replace(self, Empty) {
Cons(hd, rlln) => {
let mut newll = (*rlln).clone();
swap(self, &mut newll); // self now contains tail, newll contains the Empty
Some(hd)
}
_ => None,
}
}
}
pub struct Iter<'a, T: 'a + Clone>(*const LazyList<'a, T>);
impl<'a, T: 'a + Clone> Iterator for Iter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
unsafe {
if let LazyList::Cons(ref v, ref r) = *self.0 {
self.0 = &***r;
Some(v)
} else {
None
}
}
}
}
impl<'i, 'l, T: 'i + Clone> IntoIterator for &'l LazyList<'i, T> {
type Item = &'i T;
type IntoIter = Iter<'i, T>;
fn into_iter(self) -> Self::IntoIter {
self.iter()
}
}
impl<'a, T: 'a + Clone, > FromIterator<T> for LazyList<'a, T> {
fn from_iter<I: IntoIterator<Item = T> + 'a>(itrbl: I) -> LazyList<'a, T> {
let itr = itrbl.into_iter();
#[inline(always)]
fn next_iter<'b, R, Itr>(mut iter: Itr) -> LazyList<'b, R>
where R: 'b + Clone,
Itr: 'b + Iterator<Item = R>
{
match iter.next() {
Some(val) => LazyList::cons(val, move || next_iter(iter)),
None => Empty,
}
}
next_iter(itr)
}
}
}
Unfortunately I've exhausted ideas on how to try and fix this.
The code in the question (though not in the referenced answer, which has since been updated) relies on a soundness bug in older versions of the compiler (#18937) which has since been fixed.
It is not possible to implement FromIterator for LazyList, or indeed for any data structure, by storing the iterator inside the structure. This is because the FromIterator trait allows the implementor (Self) to outlive the iterator type (I::IntoIter). That the compiler ever accepted it was an oversight.
When copying code from the internet, be conscious of the age of the source. This code is also out of date in several other respects, notably:
it uses Rust 2015-style paths
it omits dyn on trait object types
the Invoke workaround is no longer needed, since dyn FnOnce() -> T works properly now.

Rust function that accepts either HashMap and BtreeMap

fn edit_map_values(
map1: &mut HashMap<String, i128> || &mut BTreeMap<String, i128>){
for tuple in map1.iter_mut() {
if !map1.contains_key(&"key1") {
*tuple.1 += 1;
}
}
map1.insert(&"key2", 10);
}
How do I write one function that accepts either HashMap and BtreeMap like in the example above?
It is possible to abstract over types by using traits and for your specific use-case, you can take a look at this more constrained example.
use core::{borrow::Borrow, hash::Hash};
use std::collections::{BTreeMap, HashMap};
trait GenericMap<K, V> {
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord;
fn each_mut<F>(&mut self, cb: F)
where
F: FnMut((&K, &mut V));
fn insert(&mut self, key: K, value: V) -> Option<V>;
}
impl<K, V> GenericMap<K, V> for HashMap<K, V>
where
K: Eq + Hash,
{
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord,
{
self.contains_key(k)
}
fn each_mut<F>(&mut self, mut cb: F)
where
F: FnMut((&K, &mut V)),
{
self.iter_mut().for_each(|x| cb(x))
}
fn insert(&mut self, key: K, value: V) -> Option<V> {
self.insert(key, value)
}
}
impl<K, V> GenericMap<K, V> for BTreeMap<K, V>
where
K: Ord,
{
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord,
{
self.contains_key(k)
}
fn each_mut<F>(&mut self, mut cb: F)
where
F: FnMut((&K, &mut V)),
{
self.iter_mut().for_each(|x| cb(x))
}
fn insert(&mut self, key: K, value: V) -> Option<V> {
self.insert(key, value)
}
}
fn edit_map_values<T: GenericMap<String, i128>>(map: &mut T) {
map.each_mut(|(k, v)| {
if k != "key1" {
*v += 1;
}
});
map.insert("key2".into(), 10);
}
fn main() {
let mut hm: HashMap<String, i128> = [("One".into(), 1), ("Two".into(), 2)]
.iter()
.cloned()
.collect();
let mut btm: BTreeMap<String, i128> = [("Five".into(), 5), ("Six".into(), 6)]
.iter()
.cloned()
.collect();
dbg!(&hm);
dbg!(&btm);
edit_map_values(&mut hm);
edit_map_values(&mut btm);
dbg!(&hm);
dbg!(&btm);
}
Way back before the 1.0 release, there used to be Map and MutableMap traits, but they have been removed before stabilization. The Rust type system is currently unable to express these traits in a nice way due to the lack of higher kinded types.
The eclectic crate provides experimental collection traits, but they haven't been updated for a year, so I'm not sure they are still useful for recent versions of Rust.
Further information:
Does Rust have Collection traits?
No common trait for Map types? (Rust language forum)
Associated type constructors, part 1: basic concepts and introduction (blog post by Niko Matsakis)
Generic associated type RFC
While there is no common Map trait, you could use a combination of other traits to operate on an Iterator to achieve similar functionality. Although this might not be very memory efficient due to cloning, and also a bit involved depending on the kind of operation you are trying to perform. The operation you tried to do may be implemented like this:
fn edit_map_values<I>(map: &mut I)
where
I: Clone + IntoIterator<Item = (String, i128)> + std::iter::FromIterator<(String, i128)>,
{
// Since into_iter consumes self, we have to clone here.
let (keys, _values): (Vec<String>, Vec<_>) = map.clone().into_iter().unzip();
*map = map
.clone()
.into_iter()
// iterating while mutating entries can be done with map
.map(|mut tuple| {
if !keys.contains(&"key1".to_string()) {
tuple.1 += 1;
}
tuple
})
// inserting an element can be done with chain and once
.chain(std::iter::once(("key2".into(), 10)))
.collect();
// removing an element could be done with filter
// removing and altering elements could be done with filter_map
// etc.
}
fn main() {
use std::collections::{BTreeMap, HashMap};
{
let mut m = HashMap::new();
m.insert("a".to_string(), 0);
m.insert("key3".to_string(), 1);
edit_map_values(&mut m);
println!("{:#?}", m);
}
{
let mut m = BTreeMap::new();
m.insert("a".to_string(), 0);
m.insert("key3".to_string(), 1);
edit_map_values(&mut m);
println!("{:#?}", m);
}
}
Both times the output is the same, except for the order of the HashMap of course:
{
"a": 1,
"key2": 10,
"key3": 2,
}

How do I remove excessive `clone` calls from a struct that caches arbitrary results?

I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the Cacher implementation given before. I gave it a try:
use std::clone::Clone;
use std::cmp::Eq;
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
calculation: T,
values: HashMap<K, V>,
}
impl<T, K, V> Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
fn new(calculation: T) -> Cacher<T, K, V> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value(&mut self, arg: K) -> V {
match self.values.clone().get(&arg) {
Some(v) => v.clone(),
None => {
self.values
.insert(arg.clone(), (self.calculation)(arg.clone()));
self.values.get(&arg).unwrap().clone()
}
}
}
}
After creating a version that finally works, I am really unhappy with it. What really bugs me is that cacher.value(...) has 5(!) calls to clone() in it. Is there a way to avoid this?
Your suspicion is correct, the code contains too many calls to clone(), defeating the very optimizations Cacher is designed to achieve.
Cloning the entire cache
The one to start with is the call to self.values.clone() - it creates a copy of the entire cache on every single access.
After non-lexical lifetimes
Remove this clone.
Before non-lexical lifetimes
As you likely discovered yourself, simply removing .clone() doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of match. The shared reference returned by HashMap::get points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by HashMap::insert. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before insert is invoked:
// avoids unnecessary clone of the whole map
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(arg.clone());
self.values.insert(arg, v.clone());
v
}
}
This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an Rc<RealValue>, which buys object sharing for free. In that case, clone() will simply increment the reference count on the object.)
Clone on cache miss
In case of cache miss, the key must be cloned, because calculation is declared to consume it. A single cloning will be sufficient, though, so we can pass the original arg to insert without cloning it again. The key clone still feels unnecessary, though - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bounds of T to T: Fn(&K) -> V allows the following formulation of value():
// avoids unnecessary clone of the key
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(&arg);
self.values.insert(arg, v.clone());
v
}
}
Avoiding double lookups
Now are left with exactly two calls to clone(), one in each code path. This is optimal, as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of cache miss, the hash table lookup will effectively happen twice for the same key: once in the call to HashMap::get, and then once more in HashMap::insert. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing get() and insert() with entry():
// avoids the second lookup on cache miss
fn value(&mut self, arg: K) -> V {
match self.values.entry(arg) {
Entry::Occupied(entry) => entry.into_mut(),
Entry::Vacant(entry) => {
let v = (self.calculation)(entry.key());
entry.insert(v)
}
}.clone()
}
We've also taken the opportunity to move the .clone() call after the match.
Runnable example in the playground.
I was solving the same exercise and ended with the following code:
use std::thread;
use std::time::Duration;
use std::collections::HashMap;
use std::hash::Hash;
use std::fmt::Display;
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
let key_copy = key.clone();
self.values
.entry(key_copy)
.or_insert_with(|| (calculation)(&key))
}
}
It only makes a single copy of the key in the value() method. It does not copy the resulting value, but instead returns a reference with a lifetime specifier, which is equal to the lifetime of the enclosing Cacher instance (which is logical, I think, because values in the map will continue to exist until the Cacher itself is dropped).
Here's a test program:
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
R: Display,
{
println!("{}", cacher.value(intensity));
}
And its output:
calculating slowly...
10
calculating slowly...
20
10
calculating slowly...
3
calculating slowly...
6
3
If you remove the requirement of returning values, you don't need to perform any clones by making use of the Entry:
use std::{
collections::{hash_map::Entry, HashMap},
fmt::Display,
hash::Hash,
thread,
time::Duration,
};
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
match self.values.entry(key) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let result = (calculation)(e.key());
e.insert(result)
}
}
}
}
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(1));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Display,
{
println!("{}", cacher.value(intensity));
}
You could then choose to wrap this in another struct that performed a clone:
struct ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
cacher: Cacher<P, R, T>,
}
impl<P, R, T> ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
fn new(calculation: T) -> Self {
Self {
cacher: Cacher::new(calculation),
}
}
fn value(&mut self, key: P) -> R {
self.cacher.value(key).clone()
}
}

How can I explicitly specify a lifetime when implementing a trait?

Given the implementation below, where essentially I have some collection of items that can be looked up via either a i32 id field or a string field. To be able to use either interchangeably, a trait "IntoKey" is used, and a match dispatches to the appropriate lookup map; this all works fine for my definition of get within the MapCollection impl:
use std::collections::HashMap;
use std::ops::Index;
enum Key<'a> {
I32Key(&'a i32),
StringKey(&'a String),
}
trait IntoKey<'a> {
fn into_key(&'a self) -> Key<'a>;
}
impl<'a> IntoKey<'a> for i32 {
fn into_key(&'a self) -> Key<'a> { Key::I32Key(self) }
}
impl<'a> IntoKey<'a> for String {
fn into_key(&'a self) -> Key<'a> { Key::StringKey(self) }
}
#[derive(Debug)]
struct Bar {
i: i32,
n: String,
}
struct MapCollection
{
items: Vec<Bar>,
id_map: HashMap<i32, usize>,
name_map: HashMap<String, usize>,
}
impl MapCollection {
fn new(items: Vec<Bar>) -> MapCollection {
let mut is = HashMap::new();
let mut ns = HashMap::new();
for (idx, item) in items.iter().enumerate() {
is.insert(item.i, idx);
ns.insert(item.n.clone(), idx);
}
MapCollection {
items: items,
id_map: is,
name_map: ns,
}
}
fn get<'a, K>(&self, key: &'a K) -> Option<&Bar>
where K: IntoKey<'a> //'
{
match key.into_key() {
Key::I32Key(i) => self.id_map.get(i).and_then(|idx| self.items.get(*idx)),
Key::StringKey(s) => self.name_map.get(s).and_then(|idx| self.items.get(*idx)),
}
}
}
fn main() {
let bars = vec![Bar { i:1, n:"foo".to_string() }, Bar { i:2, n:"far".to_string() }];
let map = MapCollection::new(bars);
if let Some(bar) = map.get(&1) {
println!("{:?}", bar);
}
if map.get(&3).is_none() {
println!("no item numbered 3");
}
if let Some(bar) = map.get(&"far".to_string()) {
println!("{:?}", bar);
}
if map.get(&"baz".to_string()).is_none() {
println!("no item named baz");
}
}
However, if I then want to implement std::ops::Index for this struct, if I attempt to do the below:
impl<'a, K> Index<K> for MapCollection
where K: IntoKey<'a> {
type Output = Bar;
fn index<'b>(&'b self, k: &K) -> &'b Bar {
self.get(k).expect("no element")
}
}
I hit a compiler error:
src/main.rs:70:18: 70:19 error: cannot infer an appropriate lifetime for automatic coercion due to conflicting requirements
src/main.rs:70 self.get(k).expect("no element")
^
src/main.rs:69:5: 71:6 help: consider using an explicit lifetime parameter as shown: fn index<'b>(&'b self, k: &'a K) -> &'b Bar
src/main.rs:69 fn index<'b>(&'b self, k: &K) -> &'b Bar {
src/main.rs:70 self.get(k).expect("no element")
src/main.rs:71 }
I can find no way to specify a distinct lifetime here; following the compiler's recommendation is not permitted as it changes the function signature and no longer matches the trait, and anything else I try fails to satisfy the lifetime specification.
I understand that I can implement the trait for each case (i32, String) separately instead of trying to implement it once for IntoKey, but I am more generally trying to understand lifetimes and appropriate usage. Essentially:
Is there actually an issue the compiler is preventing? Is there something unsound about this approach?
Am I specifying my lifetimes incorrectly? To me, the lifetime 'a in Key/IntoKey is dictating that the reference need only live long enough to do the lookup; the lifetime 'b associated with the index fn is stating that the reference resulting from the lookup will live as long as the containing MapCollection.
Or am I simply not utilizing the correct syntax to specify the needed information?
(using rustc 1.0.0-nightly (b63cee4a1 2015-02-14 17:01:11 +0000))
Do you intend on implementing IntoKey on struct's that are going to store references of lifetime 'a? If not, you can change your trait and its implementations to:
trait IntoKey {
fn into_key<'a>(&'a self) -> Key<'a>;
}
This is the generally recommended definition style, if you can use it. If you can't...
Let's look at this smaller reproduction:
use std::collections::HashMap;
use std::ops::Index;
struct Key<'a>(&'a u8);
trait IntoKey<'a> { //'
fn into_key(&'a self) -> Key<'a>;
}
struct MapCollection;
impl MapCollection {
fn get<'a, K>(&self, key: &'a K) -> &u8
where K: IntoKey<'a> //'
{
unimplemented!()
}
}
impl<'a, K> Index<K> for MapCollection //'
where K: IntoKey<'a> //'
{
type Output = u8;
fn index<'b>(&'b self, k: &K) -> &'b u8 { //'
self.get(k)
}
}
fn main() {
}
The problem lies in get:
fn get<'a, K>(&self, key: &'a K) -> &u8
where K: IntoKey<'a>
Here, we are taking a reference to K that must live as long as the Key we get out of it. However, the Index trait doesn't guarantee that:
fn index<'b>(&'b self, k: &K) -> &'b u8
You can fix this by simply giving a fresh lifetime to key:
fn get<'a, 'b, K>(&self, key: &'b K) -> &u8
where K: IntoKey<'a>
Or more succinctly:
fn get<'a, K>(&self, key: &K) -> &u8
where K: IntoKey<'a>

How would I create a handle manager in Rust?

pub struct Storage<T>{
vec: Vec<T>
}
impl<T: Clone> Storage<T>{
pub fn new() -> Storage<T>{
Storage{vec: Vec::new()}
}
pub fn get<'r>(&'r self, h: &Handle<T>)-> &'r T{
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<T>, t: T){
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<T>{
self.vec.push(t);
Handle{id: self.vec.len()-1}
}
}
struct Handle<T>{
id: uint
}
I am currently trying to create a handle system in Rust and I have some problems. The code above is a simple example of what I want to achieve.
The code works but has one weakness.
let mut s1 = Storage<uint>::new();
let mut s2 = Storage<uint>::new();
let handle1 = s1.create(5);
s1.get(handle1); // works
s2.get(handle1); // unsafe
I would like to associate a handle with a specific storage like this
//Pseudo code
struct Handle<T>{
id: uint,
storage: &Storage<T>
}
impl<T> Handle<T>{
pub fn get(&self) -> &T;
}
The problem is that Rust doesn't allow this. If I would do that and create a handle with the reference of a Storage I wouldn't be allowed to mutate the Storage anymore.
I could implement something similar with a channel but then I would have to clone T every time.
How would I express this in Rust?
The simplest way to model this is to use a phantom type parameter on Storage which acts as a unique ID, like so:
use std::kinds::marker;
pub struct Storage<Id, T> {
marker: marker::InvariantType<Id>,
vec: Vec<T>
}
impl<Id, T> Storage<Id, T> {
pub fn new() -> Storage<Id, T>{
Storage {
marker: marker::InvariantType,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<Id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<Id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<Id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<Id, T> {
id: uint,
marker: marker::InvariantType<Id>
}
fn main() {
struct A; struct B;
let mut s1 = Storage::<A, uint>::new();
let s2 = Storage::<B, uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
s2.get(&handle1); // won't compile, since A != B
}
This solves your problem in the simplest case, but has some downsides. Mainly, it depends on the use to define and use all of these different phantom types and to prove that they are unique. It doesn't prevent bad behavior on the user's part where they can use the same phantom type for multiple Storage instances. In today's Rust, however, this is the best we can do.
An alternative solution that doesn't work today for reasons I'll get in to later, but might work later, uses lifetimes as anonymous id types. This code uses the InvariantLifetime marker, which removes all sub typing relationships with other lifetimes for the lifetime it uses.
Here is the same system, rewritten to use InvariantLifetime instead of InvariantType:
use std::kinds::marker;
pub struct Storage<'id, T> {
marker: marker::InvariantLifetime<'id>,
vec: Vec<T>
}
impl<'id, T> Storage<'id, T> {
pub fn new() -> Storage<'id, T>{
Storage {
marker: marker::InvariantLifetime,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<'id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<'id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<'id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<'id, T> {
id: uint,
marker: marker::InvariantLifetime<'id>
}
fn main() {
let mut s1 = Storage::<uint>::new();
let s2 = Storage::<uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
// In theory this won't compile, since the lifetime of s2
// is *slightly* shorter than the lifetime of s1.
//
// However, this is not how the compiler works, and as of today
// s2 gets the same lifetime as s1 (since they can be borrowed for the same period)
// and this (unfortunately) compiles without error.
s2.get(&handle1);
}
In a hypothetical future, the assignment of lifetimes may change and we may grow a better mechanism for this sort of tagging. However, for now, the best way to accomplish this is with phantom types.

Resources