fn edit_map_values(
map1: &mut HashMap<String, i128> || &mut BTreeMap<String, i128>){
for tuple in map1.iter_mut() {
if !map1.contains_key(&"key1") {
*tuple.1 += 1;
}
}
map1.insert(&"key2", 10);
}
How do I write one function that accepts either a HashMap or a BTreeMap, as in the example above?
It is possible to abstract over types by using traits, and for your specific use case you can take a look at this more constrained example.
use core::{borrow::Borrow, hash::Hash};
use std::collections::{BTreeMap, HashMap};
trait GenericMap<K, V> {
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord;
fn each_mut<F>(&mut self, cb: F)
where
F: FnMut((&K, &mut V));
fn insert(&mut self, key: K, value: V) -> Option<V>;
}
impl<K, V> GenericMap<K, V> for HashMap<K, V>
where
K: Eq + Hash,
{
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord,
{
self.contains_key(k)
}
fn each_mut<F>(&mut self, mut cb: F)
where
F: FnMut((&K, &mut V)),
{
self.iter_mut().for_each(|x| cb(x))
}
fn insert(&mut self, key: K, value: V) -> Option<V> {
self.insert(key, value)
}
}
impl<K, V> GenericMap<K, V> for BTreeMap<K, V>
where
K: Ord,
{
fn contains_key<Q>(&self, k: &Q) -> bool
where
K: Borrow<Q>,
Q: Hash + Eq + Ord,
{
self.contains_key(k)
}
fn each_mut<F>(&mut self, mut cb: F)
where
F: FnMut((&K, &mut V)),
{
self.iter_mut().for_each(|x| cb(x))
}
fn insert(&mut self, key: K, value: V) -> Option<V> {
self.insert(key, value)
}
}
fn edit_map_values<T: GenericMap<String, i128>>(map: &mut T) {
map.each_mut(|(k, v)| {
if k != "key1" {
*v += 1;
}
});
map.insert("key2".into(), 10);
}
fn main() {
let mut hm: HashMap<String, i128> = [("One".into(), 1), ("Two".into(), 2)]
.iter()
.cloned()
.collect();
let mut btm: BTreeMap<String, i128> = [("Five".into(), 5), ("Six".into(), 6)]
.iter()
.cloned()
.collect();
dbg!(&hm);
dbg!(&btm);
edit_map_values(&mut hm);
edit_map_values(&mut btm);
dbg!(&hm);
dbg!(&btm);
}
Way back before the 1.0 release, there used to be Map and MutableMap traits, but they were removed before stabilization. The Rust type system is currently unable to express these traits in a nice way due to the lack of higher-kinded types.
The eclectic crate provides experimental collection traits, but it hasn't been updated in a year, so I'm not sure it is still useful for recent versions of Rust.
Further information:
Does Rust have Collection traits?
No common trait for Map types? (Rust language forum)
Associated type constructors, part 1: basic concepts and introduction (blog post by Niko Matsakis)
Generic associated type RFC
While there is no common Map trait, you can combine other traits and operate on an Iterator to achieve similar functionality. This might not be very memory efficient due to the cloning, and it can get a bit involved depending on the kind of operation you are trying to perform. The operation you tried could be implemented like this:
fn edit_map_values<I>(map: &mut I)
where
I: Clone + IntoIterator<Item = (String, i128)> + std::iter::FromIterator<(String, i128)>,
{
// Since into_iter consumes self, we have to clone here.
let (keys, _values): (Vec<String>, Vec<_>) = map.clone().into_iter().unzip();
*map = map
.clone()
.into_iter()
// iterating while mutating entries can be done with map
.map(|mut tuple| {
if !keys.contains(&"key1".to_string()) {
tuple.1 += 1;
}
tuple
})
// inserting an element can be done with chain and once
.chain(std::iter::once(("key2".into(), 10)))
.collect();
// removing an element could be done with filter
// removing and altering elements could be done with filter_map
// etc.
}
fn main() {
use std::collections::{BTreeMap, HashMap};
{
let mut m = HashMap::new();
m.insert("a".to_string(), 0);
m.insert("key3".to_string(), 1);
edit_map_values(&mut m);
println!("{:#?}", m);
}
{
let mut m = BTreeMap::new();
m.insert("a".to_string(), 0);
m.insert("key3".to_string(), 1);
edit_map_values(&mut m);
println!("{:#?}", m);
}
}
Both times the output is the same, except for the order of the HashMap of course:
{
"a": 1,
"key2": 10,
"key3": 2,
}
Related
As an aspiring Rustacean, I've been working my way through The Rust Programming Language book, and in chapter 13 I attempted to generalize the Cacher struct, whose purpose is to implement lazy evaluation around a closure. While I was able to use generics to generalize the closure signature to any one parameter type and any one output type, I can't figure out how to generalize this to closures with any number of parameters. I feel like there should be a way to do this.
struct Cacher<'a, Args, V: Clone>
{
calculation: &'a dyn Fn(Args) -> V,
value: Option<V>
}
impl<'a, Args, V: Clone> Cacher<'a, Args, V>
{
fn new(calculation: &'a dyn Fn(Args) -> V) -> Cacher<Args, V> {
Cacher {
calculation: calculation,
value: None,
}
}
fn value(&mut self, arg: Args) -> V {
// all this cloning is probably not the best way to do this
match self.value.clone() {
Some(v) => v,
None => {
let v = (self.calculation)(arg);
self.value = Some(v.clone());
v
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn it_works() {
let mut cached_func = Cacher::new(&(|asd| asd + 1));
assert_eq!(cached_func.value(1), 2);
assert_eq!(cached_func.value(4), 2);
}
#[test]
fn it_works_too() {
// compiler hates this
let mut cached_func = Cacher::new(&(|asd, qwe| asd + qwe));
assert_eq!(cached_func.value(1, 1), 2);
assert_eq!(cached_func.value(4, 1), 2);
}
}
You can do this on nightly using the fn_traits (and closely related unboxed_closures) features. This allows you to use Fn like Fn<Args, Output = V> where Args is a tuple type of all the parameters passed to the function.
#![feature(unboxed_closures)]
#![feature(fn_traits)]
struct Cacher<'a, Args, V: Clone>
{
calculation: &'a dyn Fn<Args, Output = V>,
value: Option<V>
}
impl<'a, Args, V: Clone> Cacher<'a, Args, V>
{
fn new(calculation: &'a dyn Fn<Args, Output = V>) -> Cacher<Args, V> {
Cacher {
calculation: calculation,
value: None,
}
}
fn value(&mut self, args: Args) -> V {
// all this cloning is probably not the best way to do this
match self.value.clone() {
Some(v) => v,
None => {
let v = self.calculation.call(args);
self.value = Some(v.clone());
v
}
}
}
}
This does require you to call value() with a tuple:
let mut cache1 = Cacher::new(&|a| a + 1);
let value1 = cache1.value((7,));
let mut cache2 = Cacher::new(&|a, b| a + b);
let value2 = cache2.value((7, 8));
However, you can make it nicer to use if you're willing to write the boilerplate for the numerous tuple types:
impl<'a, T, V: Clone> Cacher<'a, (T,), V>
{
fn value2(&mut self, arg1: T) -> V {
self.value((arg1, ))
}
}
impl<'a, T, U, V: Clone> Cacher<'a, (T, U), V>
{
fn value2(&mut self, arg1: T, arg2: U) -> V {
self.value((arg1, arg2))
}
}
// ...
let mut cache1 = Cacher::new(&|a: usize| a + 1);
let value1 = cache1.value2(7);
let mut cache2 = Cacher::new(&|a: usize, b: usize| a + b);
let value2 = cache2.value2(7, 8);
See it running on the playground.
This only works on nightly because it has not yet been stabilized whether this is how closures will be supported generically in the future.
In Rust, functions do not take a variable number of arguments, except in some cases for compatibility with C. This answer provides more background.
In your example, you could achieve some generic lazy evaluation with the lazy_static crate. You don't pass a closure to it, at least not explicitly; instead, you put the body of the closure in a static that lazy_static evaluates on first access (a bit like a closure taking () whose result is stored in a Cacher, if you will).
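A minimal sketch of that idea (the static name and the computation here are invented for illustration):
use lazy_static::lazy_static;

lazy_static! {
    // The "closure body" lives here; it runs once, on first access,
    // and the result is kept for the rest of the program.
    static ref EXPENSIVE_SUM: u64 = (0..1_000_000u64).sum();
}

fn main() {
    // The first access triggers the computation; later accesses reuse it.
    println!("{}", *EXPENSIVE_SUM);
    println!("{}", *EXPENSIVE_SUM);
}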
It's fairly hard to understand exactly what it is that you need, so here's my guess:
struct Cacher<'a, Args, V: Copy>
{
calculation: &'a dyn Fn(Args) -> V,
value: Option<V>
}
impl<'a, Args, V: Copy> Cacher<'a, Args, V>
{
fn new(calculation: &'a dyn Fn(Args) -> V) -> Cacher<Args, V> {
Cacher {
calculation: calculation,
value: None,
}
}
fn value(&mut self, arg: Args) -> V {
// Cloning fixed
match self.value {
Some(v) => v,
None => {
let v = (self.calculation)(arg);
self.value = Some(v);
v
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn it_works() {
let mut cached_func = Cacher::new(&(|asd| asd + 1));
assert_eq!(cached_func.value(1), 2);
assert_eq!(cached_func.value(4), 2);
}
#[test]
fn it_works_too() {
// The compiler is fine
// Although now, it's not multiple arguments but rather one arg, acting as many
let mut cached_func = Cacher::new(&(|asd: (usize, usize)| asd.0 + asd.1));
assert_eq!(cached_func.value((1, 1)), 2);
assert_eq!(cached_func.value((4, 1)), 2);
}
}
Remember that Rust's generics can be thought of as algebraic data types, so only enums, structs and functions are allowed (closures too, if you consider them distinct from functions). The second test works because tuples can be considered structs.
Because of this, it's impossible to be generic over the number of arguments in a single function definition.
The usual way Rust works around this is with macros (a sketch follows below), although method macros don't exist in Rust yet.
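To illustrate the macro route, here is a minimal sketch (the cached! name is invented for this example): a macro_rules! macro accepts any number of arguments and packs them into a tuple before calling value, which fits the tuple-based Cacher above.
// Hypothetical helper macro: packs a variable number of arguments
// into a single tuple argument for the tuple-based Cacher.
macro_rules! cached {
    ($cacher:expr, $($arg:expr),+ $(,)?) => {
        $cacher.value(($($arg,)+))
    };
}

// Usage with the tuple-argument closure from the second test:
// let mut cached_func = Cacher::new(&(|asd: (usize, usize)| asd.0 + asd.1));
// assert_eq!(cached!(cached_func, 1, 1), 2);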
So I'm a bit stuck, trying to merge two HashMaps.
It's easy to do it inline:
fn inline() {
let mut first_context = HashMap::new();
first_context.insert("Hello", "World");
let mut second_context = HashMap::new();
second_context.insert("Hey", "There");
let mut new_context = HashMap::new();
for (key, value) in first_context.iter() {
new_context.insert(*key, *value);
}
for (key, value) in second_context.iter() {
new_context.insert(*key, *value);
}
println!("Inline:\t\t{}", new_context);
println!("Inline:\t\t{}\t{} [Initial Maps Still Usable]", first_context, second_context);
}
It's easy enough to make a function:
fn abstracted() {
fn merge<'a>(first_context: &HashMap<&'a str, &'a str>, second_context: &HashMap<&'a str, &'a str>) -> HashMap<&'a str, &'a str> {
let mut new_context = HashMap::new();
for (key, value) in first_context.iter() {
new_context.insert(*key, *value);
}
for (key, value) in second_context.iter() {
new_context.insert(*key, *value);
}
new_context
}
let mut first_context = HashMap::new();
first_context.insert("Hello", "World");
let mut second_context = HashMap::new();
second_context.insert("Hey", "There");
println!("Abstracted:\t{}", merge(&first_context, &second_context));
println!("Abstracted:\t{}\t{} [Initial Maps Still Usable]", first_context, second_context);
}
However, I can't seem to get the generic version to work:
fn generic() {
fn merge<'a, K: Hash + Eq, V>(first_context: &HashMap<&'a K, &'a V>, second_context: &HashMap<&'a K, &'a V>) -> HashMap<&'a K, &'a V> {
let mut new_context = HashMap::new();
for (key, value) in first_context.iter() {
new_context.insert(*key, *value);
}
for (key, value) in second_context.iter() {
new_context.insert(*key, *value);
}
new_context
}
let mut first_context = HashMap::new();
first_context.insert("Hello", "World");
let mut second_context = HashMap::new();
second_context.insert("Hey", "There");
println!("Generic:\t{}", merge(&first_context, &second_context));
println!("Generic:\t{}\t{} [Initial Maps Still Usable]", first_context, second_context);
}
The above code on play.rust-lang.org.
Compiling it:
error: the trait `core::kinds::Sized` is not implemented for the type `str`
I get that the compiler is confused about the size of the generic value, but I'm not sure why "str" doesn't have a strict memory size? I know it's a string slice and not a type, but still this should work, no? Is this a bug?
I thought this would be a relatively trivial function. If someone has a good solution, I'd love to learn. Ideally, I'd actually love to see a solution with a Mergeable trait and a decorator for HashMap<&K, &V>, so that I can call let new_context = first_context.merge(&second_context); but that can be a different question.
A more up-to-date answer, from this tweet:
use std::collections::HashMap;
// Mutating one map
fn merge1(map1: &mut HashMap<(), ()>, map2: HashMap<(), ()>) {
map1.extend(map2);
}
// Without mutation
fn merge2(map1: HashMap<(), ()>, map2: HashMap<(), ()>) -> HashMap<(), ()> {
map1.into_iter().chain(map2).collect()
}
// If you only have a reference to the map to be merged in
fn merge_from_ref(map: &mut HashMap<(), ()>, map_ref: &HashMap<(), ()>) {
map.extend(map_ref.into_iter().map(|(k, v)| (k.clone(), v.clone())));
}
Rust Playground Link
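For concreteness, here is the same idea with real key/value types instead of the () placeholders (the generic signature is my own sketch, not from the tweet):
use std::collections::HashMap;
use std::hash::Hash;

// Same approach as merge2 above, made generic over the key/value types.
fn merge<K: Hash + Eq, V>(map1: HashMap<K, V>, map2: HashMap<K, V>) -> HashMap<K, V> {
    // chain() accepts any IntoIterator, so map2 can be passed directly.
    map1.into_iter().chain(map2).collect()
}

fn main() {
    let mut first = HashMap::new();
    first.insert("Hello", "World");
    let mut second = HashMap::new();
    second.insert("Hey", "There");
    println!("{:?}", merge(first, second));
}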
This version does work:
use std::collections::HashMap;
use std::hash::Hash;
fn main() {
fn merge<K: Hash + Eq + Copy, V: Copy>(first_context: &HashMap<K, V>, second_context: &HashMap<K, V>) -> HashMap<K, V> {
let mut new_context = HashMap::new();
for (key, value) in first_context.iter() {
new_context.insert(*key, *value);
}
for (key, value) in second_context.iter() {
new_context.insert(*key, *value);
}
new_context
}
let mut first_context = HashMap::new();
first_context.insert("Hello", "World");
let mut second_context = HashMap::new();
second_context.insert("Hey", "There");
println!("Generic:\t{}", merge(&first_context, &second_context));
println!("Generic:\t{}\t{} [Initial Maps Still Usable]", first_context, second_context);
}
The difference is in the signature of merge(). Here is yours:
fn merge<'a, K: Hash + Eq, V>(first_context: &HashMap<&'a K, &'a V>, second_context: &HashMap<&'a K, &'a V>) -> HashMap<&'a K, &'a V>
Here is mine:
fn merge<K: Hash + Eq + Copy, V: Copy>(first_context: &HashMap<K, V>, second_context: &HashMap<K, V>) -> HashMap<K, V>
For some reason you are trying to abstract HashMap<&str, &str> to HashMap<&K, &V>, but this is not really correct: while &str is a borrowed pointer, it is special - it points to the dynamically sized type str. The size of str is not known to the compiler, so it can only be used behind a pointer, and a generic parameter like K is implicitly required to be Sized. Hence I've changed HashMap<&'a K, &'a V> to HashMap<K, V>.
The second problem is that, in general, you can't write your function if it takes only references to the maps. Your non-generic merge function works only because &str is a reference, and references are implicitly copyable. In the general case, however, both keys and values can be non-copyable, and merging them into a single map would require moving the maps into the function. Adding a Copy bound sidesteps this by copying the entries instead.
You can also add a Clone bound instead of Copy and call clone() explicitly:
fn merge<K: Hash + Eq + Clone, V: Clone>(first_context: &HashMap<K, V>, second_context: &HashMap<K, V>) -> HashMap<K, V> {
// ...
for (key, value) in first_context.iter() {
new_context.insert(key.clone(), value.clone());
}
// ...
}
The most general way, however, is moving maps into the function:
fn merge<K: Hash + Eq, V>(first_context: HashMap<K, V>, second_context: HashMap<K, V>) -> HashMap<K, V> {
// ...
for (key, value) in first_context.into_iter() {
new_context.insert(key, value);
}
// ...
}
Note the into_iter() method, which consumes the map but yields tuples of owned values instead of references. A complete version of this by-move variant is sketched below.
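Put together, the by-move variant might look like this (a sketch following the signature above):
use std::collections::HashMap;
use std::hash::Hash;

fn merge<K: Hash + Eq, V>(
    first_context: HashMap<K, V>,
    second_context: HashMap<K, V>,
) -> HashMap<K, V> {
    let mut new_context = HashMap::new();
    // into_iter() consumes each map and yields owned (K, V) pairs,
    // so no Copy or Clone bound is required.
    for (key, value) in first_context.into_iter() {
        new_context.insert(key, value);
    }
    for (key, value) in second_context.into_iter() {
        new_context.insert(key, value);
    }
    new_context
}

fn main() {
    let mut first_context = HashMap::new();
    first_context.insert("Hello".to_string(), "World".to_string());
    let mut second_context = HashMap::new();
    second_context.insert("Hey".to_string(), "There".to_string());
    println!("{:?}", merge(first_context, second_context));
}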
I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the Cacher implementation given before. I gave it a try:
use std::clone::Clone;
use std::cmp::Eq;
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
calculation: T,
values: HashMap<K, V>,
}
impl<T, K, V> Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
fn new(calculation: T) -> Cacher<T, K, V> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value(&mut self, arg: K) -> V {
match self.values.clone().get(&arg) {
Some(v) => v.clone(),
None => {
self.values
.insert(arg.clone(), (self.calculation)(arg.clone()));
self.values.get(&arg).unwrap().clone()
}
}
}
}
After creating a version that finally works, I am really unhappy with it. What really bugs me is that cacher.value(...) has 5(!) calls to clone() in it. Is there a way to avoid this?
Your suspicion is correct: the code contains too many calls to clone(), defeating the very optimizations Cacher is designed to achieve.
Cloning the entire cache
The one to start with is the call to self.values.clone() - it creates a copy of the entire cache on every single access.
After non-lexical lifetimes
Remove this clone.
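Sketched, this is the original value() with only the map-wide clone removed; under non-lexical lifetimes the shared borrow taken by get() is no longer considered live in the None arm, so the insert() is accepted:
fn value(&mut self, arg: K) -> V {
    match self.values.get(&arg) {
        Some(v) => v.clone(),
        None => {
            self.values
                .insert(arg.clone(), (self.calculation)(arg.clone()));
            self.values.get(&arg).unwrap().clone()
        }
    }
}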
Before non-lexical lifetimes
As you likely discovered yourself, simply removing .clone() doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of match. The shared reference returned by HashMap::get points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by HashMap::insert. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before insert is invoked:
// avoids unnecessary clone of the whole map
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(arg.clone());
self.values.insert(arg, v.clone());
v
}
}
This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an Rc<RealValue>, which buys object sharing for free. In that case, clone() will simply increment the reference count on the object.)
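As a sketch of that last point (reusing the Cacher from the question with the value() shown above; the closure and sizes are made up): storing Rc<V> turns the necessary clone into a cheap reference-count increment.
use std::rc::Rc;

fn rc_example() {
    // Hypothetical: the cached value is a large Vec, but it is stored as
    // Rc<Vec<u64>>, so the clone on a cache hit only bumps a reference count.
    let mut cacher = Cacher::new(|n: u32| Rc::new(vec![u64::from(n); 1_000_000]));
    let first = cacher.value(7);  // computes and stores the value
    let second = cacher.value(7); // cheap Rc clone of the stored value
    assert!(Rc::ptr_eq(&first, &second));
}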
Clone on cache miss
In case of a cache miss, the key must be cloned, because calculation is declared to consume it. A single clone is sufficient, though, so we can pass the original arg to insert without cloning it again. Even that key clone feels unnecessary - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bound of T to T: Fn(&K) -> V allows the following formulation of value():
// avoids unnecessary clone of the key
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(&arg);
self.values.insert(arg, v.clone());
v
}
}
Avoiding double lookups
Now we are left with exactly two calls to clone(), one in each code path. This is optimal as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of a cache miss, the hash table lookup effectively happens twice for the same key: once in the call to HashMap::get, and then once more in HashMap::insert. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing get() and insert() with entry():
// avoids the second lookup on cache miss
fn value(&mut self, arg: K) -> V {
match self.values.entry(arg) {
Entry::Occupied(entry) => entry.into_mut(),
Entry::Vacant(entry) => {
let v = (self.calculation)(entry.key());
entry.insert(v)
}
}.clone()
}
We've also taken the opportunity to move the .clone() call after the match.
Runnable example in the playground.
I was solving the same exercise and ended up with the following code:
use std::thread;
use std::time::Duration;
use std::collections::HashMap;
use std::hash::Hash;
use std::fmt::Display;
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
let key_copy = key.clone();
self.values
.entry(key_copy)
.or_insert_with(|| (calculation)(&key))
}
}
It only makes a single copy of the key in the value() method. It does not copy the resulting value, but instead returns a reference with a lifetime tied to the borrow of the enclosing Cacher instance (which is logical, I think, because the values in the map continue to exist until the Cacher itself is dropped).
Here's a test program:
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
R: Display,
{
println!("{}", cacher.value(intensity));
}
And its output:
calculating slowly...
10
calculating slowly...
20
10
calculating slowly...
3
calculating slowly...
6
3
If you remove the requirement of returning owned values, you don't need to perform any clones at all by making use of the Entry API:
use std::{
collections::{hash_map::Entry, HashMap},
fmt::Display,
hash::Hash,
thread,
time::Duration,
};
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
match self.values.entry(key) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let result = (calculation)(e.key());
e.insert(result)
}
}
}
}
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(1));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Display,
{
println!("{}", cacher.value(intensity));
}
You could then choose to wrap this in another struct that performs the clone:
struct ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
cacher: Cacher<P, R, T>,
}
impl<P, R, T> ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
fn new(calculation: T) -> Self {
Self {
cacher: Cacher::new(calculation),
}
}
fn value(&mut self, key: P) -> R {
self.cacher.value(key).clone()
}
}
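A brief usage sketch of the wrapper (the closure and numbers are made up; this relies on the Cacher and ValueCacher definitions above):
fn main() {
    // The wrapper hands back owned (cloned) values again.
    let mut cacher = ValueCacher::new(|n: &u32| n * 2);
    assert_eq!(cacher.value(21), 42);
    // A second call with the same key returns the cached result.
    assert_eq!(cacher.value(21), 42);
}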
I tried implementing a generic A* tree search algorithm. The important part is in the function hucs marked with a TODO:
use std::collections::BinaryHeap;
use std::collections::HashMap;
use std::cmp::Ordering;
pub trait SearchTree<A> {
fn available_actions(&self) -> Vec<A>;
fn apply_action(&self, act: &A) -> Self;
}
pub trait CostSearchTree<A>: SearchTree<A> + Eq {
fn action_cost(&self, act: &A) -> f64;
}
/// Node in the expanded search tree for uniform cost search with heuristic
struct HucsNode<A, T>
where
T: CostSearchTree<A>,
{
cost: f64,
heuristic_cost: f64,
parent_index: usize,
action: Option<A>,
tree: T,
}
impl<A, T> PartialEq for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn eq(&self, other: &HucsNode<A, T>) -> bool {
// Can be used for closed list checking if we just compare the trees
return self.tree == other.tree;
}
}
impl<A, T> Eq for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
}
impl<A, T> PartialOrd for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn partial_cmp(&self, other: &HucsNode<A, T>) -> Option<Ordering> {
Some(self.cmp(other))
}
}
impl<A, T> Ord for HucsNode<A, T>
where
T: CostSearchTree<A>,
{
fn cmp(&self, other: &HucsNode<A, T>) -> Ordering {
let self_cost = self.cost + self.heuristic_cost;
let other_cost = other.cost + other.heuristic_cost;
// Flip for min-heap
match other_cost.partial_cmp(&self_cost) {
Some(order) => order,
_ => Ordering::Equal,
}
}
}
/// Perform a uniform cost search with a monotone heuristic function on a search tree.
/// Returns a sequence of actions if a state is found that satisfies the predicate or None if the search terminates before.
pub fn hucs<A, T: CostSearchTree<A> + Hash>(
tree: T,
predicate: &Fn(&T) -> bool,
heuristic: &Fn(&T) -> f64,
) -> Option<Vec<A>> {
let mut node_heap = BinaryHeap::new() as BinaryHeap<HucsNode<A, T>>;
// Push the initial node onto the tree
node_heap.push(HucsNode {
action: None,
parent_index: usize::max_value(),
cost: 0.0,
heuristic_cost: heuristic(&tree),
tree: tree,
});
let mut old_nodes = Vec::new();
let mut last_node_index = 0 as usize;
'outer: while let Some(current_node) = node_heap.pop() {
// Break borrows with scope so current_node can be moved out
{
if predicate(&current_node.tree) {
return Some(form_action_sequence(current_node, old_nodes));
}
// Check if visited nodes already contains this tree with less cost
// TODO: Time complexity is hardly ideal
for old_node in old_nodes.iter() {
if old_node.tree == current_node.tree && old_node.cost <= current_node.cost {
continue 'outer;
}
}
let ref current_tree = current_node.tree;
for action in current_tree.available_actions() {
let action_cost = current_tree.action_cost(&action);
let new_tree = current_tree.apply_action(&action);
let new_cost = current_node.cost + action_cost;
let new_node = HucsNode {
action: Some(action),
cost: new_cost,
parent_index: last_node_index,
heuristic_cost: heuristic(&new_tree),
tree: new_tree,
};
node_heap.push(new_node);
}
}
old_nodes.push(current_node);
last_node_index += 1;
}
return None;
}
/// Restore the sequence of actions that was used to get to this node by climbing the tree of expanded nodes
fn form_action_sequence<A, T: CostSearchTree<A>>(
leaf: HucsNode<A, T>,
mut older_nodes: Vec<HucsNode<A, T>>,
) -> Vec<A> {
let mut action_vector = Vec::new();
let mut current = leaf;
while let Some(action) = current.action {
action_vector.insert(0, action);
// Safe to swap as nodes' parents are always before them
current = older_nodes.swap_remove(current.parent_index);
}
return action_vector;
}
The problem is that checking whether the current node is among the old nodes by scanning over all of them takes far too long, so I wanted to add a HashMap. However, since I also need to access the old nodes by index to form the solution action sequence at the end, I need to keep the Vec as well. To solve this I tried adding a wrapper that I can insert into the HashMap as a key, and which just looks up its contents in the Vec, like this:
use std::hash::Hash;
use std::hash::Hasher;
struct BackedHashWrapper<'a, T: 'a + Hash + Eq> {
source: &'a Vec<T>,
index: usize,
}
impl<A, T> Hash for HucsNode<A, T>
where
T: CostSearchTree<A> + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.tree.hash(state);
}
}
impl<'a, T> Hash for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.source[self.index].hash(state);
}
}
impl<'a, T> PartialEq for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn eq(&self, other: &BackedHashWrapper<T>) -> bool {
self.source[self.index] == other.source[other.index]
}
}
impl<'a, T> Eq for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
}
I cannot figure out how to use this in the hucs function. I tried the following just for adding elements to the hash map:
...
let mut old_nodes = Vec::new();
let mut hash_map = HashMap::new();
...
...
hash_map.insert(BackedHashWrapper {source: &old_nodes, index: last_node_index}, current_node.cost);
old_nodes.push(current_node);
last_node_index += 1;
...
but the borrow checker will not allow me to create such a BackedHashWrapper while the source vector is mutable. Clearly I am doing this completely the wrong way, so how could I accomplish this without having to clone either the tree or any actions?
I suppose it is easier to use another type of backing storage (TypedArena from the typed-arena crate, for example).
But taking the question at face value, the problem you are dealing with is caused by Rust's borrowing rules. That is, you can't have shared (&) and mutable (&mut) references, or multiple mutable references, to the same object in the same scope.
hash_map in your example holds shared references to the vector, "freezing" it, which makes it impossible to modify the vector while hash_map is in scope.
The solution to this problem is the interior mutability pattern.
In your case, you can use RefCell<Vec<T>> to be able to modify the vector while holding multiple references to it.
use std::cell::RefCell;
type RVec<T> = RefCell<Vec<T>>;
struct BackedHashWrapper<'a, T: 'a + Hash + Eq> {
source: &'a RVec<T>,
index: usize,
}
...
impl<'a, T> Hash for BackedHashWrapper<'a, T>
where
T: Eq + Hash,
{
fn hash<H>(&self, state: &mut H)
where
H: Hasher,
{
self.source.borrow()[self.index].hash(state);
}
}
...
// Similar changes for Eq and PartialEq
...
let mut old_nodes: RVec<_> = RefCell::default();
let mut hash_map = HashMap::new();
...
...
hash_map.insert(BackedHashWrapper {source: &old_nodes, index: last_node_index}, current_node.cost);
old_nodes.borrow_mut().push(current_node);
last_node_index += 1;
...
Maybe a couple of borrow()s and borrow_mut()s will be required in other places.
I am attempting to write a function that validates a given collection using a closure. The function takes ownership of a collection, iterates over the contents, and if no invalid item was found, returns ownership of the collection. This is so it can be used like this (without creating a temp for the Vec): let col = validate(vec![1, 2], |&v| v < 10)?;
This is the current implementation of the function:
use std::fmt::Debug;
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
It does compile, but it doesn't work when I try to use it:
use std::collections::BTreeMap;
use std::iter::{FromIterator, once};
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&&v| v <= 3));
// ^^^^^^^^ expected bound lifetime parameter 'c, found concrete lifetime
println!("Map: {:?}",
validate(BTreeMap::from_iter(once((1, 2))), |&(&k, &v)| k <= 3));
}
Rust Playground
Is what I'm trying to accomplish here possible?
Background
I am writing a parser for a toy project of mine and was wondering if I
could write a single validate function that works with all the collection
types I use:
Vecs,
VecDeques,
BTreeSets,
BTreeMaps,
&[T] slices.
Each of these collections implements the IntoIterator trait for a reference of itself,
which can be used to call .into_iter() on a reference without consuming the items
in the collection:
Vec impl
VecDeque impl
BTreeSet impl
BTreeMap impl
&[T] slices impl
This is what the for<'c> &'c C: IntoIterator<Item = V> in the function declaration
refers to. Since the reference is defined in the function body itself, we can't just
use a lifetime that's declared on the function (like fn validate<'c, ...), because this
would imply that the reference has to outlive the function (which it cannot). Instead we
have to use a Higher-Rank Trait Bound to
declare this lifetime.
It seems to me that this lifetime is also the source of the trouble, since a version of
the function that takes and returns a reference to the collection works fine:
// This works just fine.
fn validate<'c, C, F, V>(col: &'c C, pred: F) -> Result<&'c C, String>
where C: Debug,
&'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = col.into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
Rust Playground
Furthermore, I managed to implement two other versions of the
function, one which works for Vec, VecDeque, BTreeSet and &[T] slices, and another
which works for BTreeMap and probably other mappings:
use std::fmt::Debug;
pub fn validate_collection<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|&v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
pub fn validate_mapping<C, F, K, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = (&'c K, &'c V)>,
F: Fn(&K, &V) -> bool,
K: Debug,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|&(k, v)| !pred(k, v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
Rust Playground
In the end I hope to create a Validate trait. Currently, I can only impl
it for either collections or mappings, because the impls conflict.
use std::fmt::Debug;
trait Validate<V>: Sized {
fn validate<F>(self, F) -> Result<Self, String> where F: Fn(&V) -> bool;
}
// Impl that only works for collections, not mappings.
impl<C, V> Validate<V> for C
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
V: Debug
{
fn validate<F>(self, pred: F) -> Result<C, String>
where F: Fn(&V) -> bool
{
if let Some(val) = (&self).into_iter().find(|&v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", self, val))?;
}
Ok(self)
}
}
fn main() {
println!("Vec: {:?}", vec![1, 2, 3, 4].validate(|&v| v <= 3));
}
Rust Playground
Looking at your trait bounds (reformatted a little):
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = V>,
F: Fn(&V) -> bool,
V: Debug {
the problem is that &C won't implement IntoIterator<Item = V>; references tend to iterate over references.
Fixing that (and the extra reference in the closure) makes it work:
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> &'c C: IntoIterator<Item = &'c V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).into_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&v| v <= 3));
}
Playground
To extend this to work with BTreeMap values, we can abstract over the method used to generate the iterators. Let's add a trait HasValueIterator which knows how to get an iterator over values:
trait HasValueIterator<'a, V: 'a> {
type ValueIter : Iterator<Item=&'a V>;
fn to_value_iter(&'a self) -> Self::ValueIter;
}
and use that instead of IntoIterator:
fn validate<C, F, V>(col: C, pred: F) -> Result<C, String>
where C: Debug,
for<'c> C: HasValueIterator<'c, V>,
F: Fn(&V) -> bool,
V: Debug
{
if let Some(val) = (&col).to_value_iter().find(|v| !pred(v)) {
Err(format!("{:?} contains invalid item: {:?}.", col, val))?;
}
Ok(col)
}
Now we can implement it for Vec and BTreeMap (the latter using .values()), though you have to name the iterator types:
impl<'c, V:'c> HasValueIterator<'c, V> for Vec<V> {
type ValueIter = std::slice::Iter<'c,V>;
fn to_value_iter(&'c self) -> Self::ValueIter {
self.iter()
}
}
impl<'c, V:'c, K:'c> HasValueIterator<'c, V> for BTreeMap<K, V> {
type ValueIter = std::collections::btree_map::Values<'c, K, V>;
fn to_value_iter(&'c self) -> Self::ValueIter {
self.values()
}
}
Now this works with both Vec and BTreeMap, at least with values:
fn main() {
println!("Vec: {:?}", validate(vec![1, 2, 3, 4], |&v| v <= 3));
let mut map = BTreeMap::new();
map.insert("first", 1);
map.insert("second", 2);
map.insert("third", 3);
println!("Map: {:?}", validate(map, |&v| v<=2));
}
Playground
This outputs:
Vec: Err("[1, 2, 3, 4] contains invalid item: 4.")
Map: Err("{\"first\": 1, \"second\": 2, \"third\": 3} contains invalid item: 3.")