HashMap implements the get and insert methods which take a single immutable borrow and a single move of a value respectively.
I want a trait which is just like this but which takes two keys instead of one. It uses the map inside, but it's just a detail of implementation.
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
map: HashMap<(A, B), f64>,
}
impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
fn get(&self, a: &A, b: &B) -> f64 {
let key: &(A, B) = ??;
*self.map.get(key).unwrap()
}
fn set(&mut self, a: A, b: B, v: f64) {
self.map.insert((a, b), v);
}
}
This is certainly possible. The signature of get is
fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
where
K: Borrow<Q>,
Q: Hash + Eq,
The problem here is to implement a &Q type such that
(A, B): Borrow<Q>
Q implements Hash + Eq
To satisfy condition (1), we need to think of how to write
fn borrow(self: &(A, B)) -> &Q
The trick is that &Q does not need to be a simple pointer, it can be a trait object! The idea is to create a trait Q which will have two implementations:
impl Q for (A, B)
impl Q for (&A, &B)
The Borrow implementation will simply return self and we can construct a &dyn Q trait object from the two elements separately.
The full implementation is like this:
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
// See explanation (1).
trait KeyPair<A, B> {
/// Obtains the first element of the pair.
fn a(&self) -> &A;
/// Obtains the second element of the pair.
fn b(&self) -> &B;
}
// See explanation (2).
impl<'a, A, B> Borrow<dyn KeyPair<A, B> + 'a> for (A, B)
where
A: Eq + Hash + 'a,
B: Eq + Hash + 'a,
{
fn borrow(&self) -> &(dyn KeyPair<A, B> + 'a) {
self
}
}
// See explanation (3).
impl<A: Hash, B: Hash> Hash for dyn KeyPair<A, B> + '_ {
fn hash<H: Hasher>(&self, state: &mut H) {
self.a().hash(state);
self.b().hash(state);
}
}
impl<A: Eq, B: Eq> PartialEq for dyn KeyPair<A, B> + '_ {
fn eq(&self, other: &Self) -> bool {
self.a() == other.a() && self.b() == other.b()
}
}
impl<A: Eq, B: Eq> Eq for dyn KeyPair<A, B> + '_ {}
// OP's Table struct
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
map: HashMap<(A, B), f64>,
}
impl<A: Eq + Hash, B: Eq + Hash> Table<A, B> {
fn new() -> Self {
Table {
map: HashMap::new(),
}
}
fn get(&self, a: &A, b: &B) -> f64 {
*self.map.get(&(a, b) as &dyn KeyPair<A, B>).unwrap()
}
fn set(&mut self, a: A, b: B, v: f64) {
self.map.insert((a, b), v);
}
}
// Boring stuff below.
impl<A, B> KeyPair<A, B> for (A, B) {
fn a(&self) -> &A {
&self.0
}
fn b(&self) -> &B {
&self.1
}
}
impl<A, B> KeyPair<A, B> for (&A, &B) {
fn a(&self) -> &A {
self.0
}
fn b(&self) -> &B {
self.1
}
}
//----------------------------------------------------------------
#[derive(Eq, PartialEq, Hash)]
struct A(&'static str);
#[derive(Eq, PartialEq, Hash)]
struct B(&'static str);
fn main() {
let mut table = Table::new();
table.set(A("abc"), B("def"), 4.0);
table.set(A("123"), B("456"), 45.0);
println!("{:?} == 45.0?", table.get(&A("123"), &B("456")));
println!("{:?} == 4.0?", table.get(&A("abc"), &B("def")));
// Should panic below.
println!("{:?} == NaN?", table.get(&A("123"), &B("def")));
}
Explanation:
The KeyPair trait takes the role of Q we mentioned above. We'd need to impl Eq + Hash for dyn KeyPair, but Eq and Hash are both not object safe. We add the a() and b() methods to help implementing them manually.
Now we implement the Borrow trait from (A, B) to dyn KeyPair + 'a. Note the 'a — this is a subtle bit that is needed to make Table::get actually work. The arbitrary 'a allows us to say that an (A, B) can be borrowed to the trait object for any lifetime. If we don't specify the 'a, the unsized trait object will default to 'static, meaning the Borrow trait can only be applied when the implementation like (&A, &B) outlives 'static, which is certainly not the case.
Finally, we implement Eq and Hash. Same reason as point 2, we implement for dyn KeyPair + '_ instead of dyn KeyPair (which means dyn KeyPair + 'static in this context). The '_ here is a syntax sugar meaning arbitrary lifetime.
Using trait objects will incur indirection cost when computing the hash and checking equality in get(). The cost can be eliminated if the optimizer is able to devirtualize that, but whether LLVM will do it is unknown.
An alternative is to store the map as HashMap<(Cow<A>, Cow<B>), f64>. Using this requires less "clever code", but there is now a memory cost to store the owned/borrowed flag as well as runtime cost in both get() and set().
Unless you fork the standard HashMap and add a method to look up an entry via Hash + Eq alone, there is no guaranteed-zero-cost solution.
A Memory trait that take two keys, set by value and get by reference:
trait Memory<A: Eq + Hash, B: Eq + Hash> {
fn get(&self, a: &A, b: &B) -> Option<&f64>;
fn set(&mut self, a: A, b: B, v: f64);
}
You can impl such trait using a Map of Maps:
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
table: HashMap<A, HashMap<B, f64>>,
}
impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
fn get(&self, a: &A, b: &B) -> Option<&f64> {
self.table.get(a)?.get(b)
}
fn set(&mut self, a: A, b: B, v: f64) {
let inner = self.table.entry(a).or_insert(HashMap::new());
inner.insert(b, v);
}
}
Please note that if the solution is somewhat elegant the memory footprint of an HashMap of HashMaps have to be considered when thousands of HashMap instances has to be managed.
Full example
Related
I have some problems generalizing a trait working for &str to other string types (e.g. Rc<str>, Box<str>,String).
First of all my example function should work for:
assert_eq!(count("ababbc", 'a'), 2); // already working
assert_eq!(count(Rc::from("ababbc"), 'a'), 2); // todo
assert_eq!(count("ababbc".to_string(), 'a'), 2); // todo
This is the working code, which makes the first test run:
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A, I>
where
I: Iterator<Item = A>,
A: Atom,
{
fn atoms(&self) -> I;
}
impl<'a> Atoms<char, std::str::Chars<'a>> for &'a str {
fn atoms(&self) -> std::str::Chars<'a> {
self.chars()
}
}
pub fn count<I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A>,
T: Atoms<A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
To make the next tests run, I changed the signature of count and Atoms in following way:
pub trait Atoms<'a, A, I>
where
I: Iterator<Item = A> + 'a,
A: Atom,
{
fn atoms<'b>(&'b self) -> I
where
'b: 'a;
}
impl<'a, S> Atoms<'a, char, std::str::Chars<'a>> for S
where
S: AsRef<str> + 'a,
{
fn atoms<'b>(&'b self) -> std::str::Chars<'b>
where
'b: 'a,
{
self.as_ref().chars()
}
}
but now the function count does not compile any more:
pub fn count<'a, I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A> + 'a,
T: Atoms<'a, A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
Playground-Link
The compiler error is: the parameter type 'T' may not live long enough ... consider adding an explicit lifetime bound...: 'T: 'a'. This is not completely understandable for me, so I tried to apply the help T: Atoms<'a, A, I> + 'a. Now the error is: 'pattern' does not live long enough ... 'pattern' dropped here while still borrowed.
Since the latter error also occurs without implementations of Atoms and by just replacing the function body by pattern.atoms();return 1; I suspect that the type signature of Atoms is not suitable for my purpose.
Has anybody a hint what could be wrong with the type signature of Atoms or count?
trait Atoms<'a, A, I> where I: Iterator<Item = A> + 'a ...
By writing this, you're requiring the iterator to outlive the lifetime 'a. Since 'a is part of the trait's generic parameters, it must be a lifetime that extends before you start using (implicitly) <String as Atoms<'a, char, std::str::Chars>>::atoms. This directly contradicts the idea of returning a new iterator object, since that object did not exist before the call to atoms().
Unfortunately, I'm not sure how to rewrite this so that it will work, without generic associated types (a feature not yet stabilized), but I hope that at least this explanation of the problem helps.
I hope I clarify at least some parts of the problem. After doing some internet research on HRTBs and GATs I come to the conclusion that my problem is hardly solvable with stable rust.
The main problem is that one cannot
have a trait with different lifetime signature than its implementations
keep lifetimes generic in a trait for later instantiation in the implementation
limit the upper bound of a results lifetime if it is owned
I tried several approaches to but most evolve to fail:
at compiling the implementation (because the implementations lifetimes conflict with those of the trait)
at compiling the caller of the trait because a compiling implementation limits the lifetimes in a way, that no object can satisfy them.
At last I found two solutions:
Implement the trait for references
the function atoms(self) does now expect Self and not &Self
Atoms<A,I> is implemented for &'a str and &'a S where S:AsRef<str>
This gives us control of the lifetimes of the self objects ('a) and strips the lifetime completely from the trait.
The disadvantage of this approach is that we have to pass references to our count function even for smart references.
Playground-Link
use std::fmt::Display;
use std::fmt::Debug;
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A, I>
where
I: Iterator<Item = A>,
A: Atom,
{
fn atoms(self) -> I;
}
impl<'a> Atoms<char, std::str::Chars<'a>> for &'a str {
fn atoms(self) -> std::str::Chars<'a> {
self.chars()
}
}
impl<'a, S> Atoms<char, std::str::Chars<'a>> for &'a S
where
S: AsRef<str>,
{
fn atoms(self) -> std::str::Chars<'a> {
self.as_ref().chars()
}
}
pub fn count<I, A, T>(pattern: T, item: A) -> usize
where
A: Atom,
I: Iterator<Item = A>,
T: Atoms<A, I>,
{
pattern.atoms().filter(|i| *i == item).count()
}
#[cfg(test)]
mod tests {
use std::rc::Rc;
use super::*;
#[test]
fn test_example() {
assert_eq!(count("ababbc", 'a'), 2);
assert_eq!(count(&"ababbc".to_string(), 'a'), 2);
assert_eq!(count(&Rc::from("ababbc"), 'a'), 2);
}
}
Switch to Generic Associated Types (unstable)
reduce the generic type Atoms<A,I> to an type Atoms<A> with an generic associated type I<'a> (which is instantiable at implementations)
now the function count can refer to the lifetime of I like this fn atoms<'a>(&'a self) -> Self::I<'a>
and all implementations just have to define how the want to map the lifetime 'a to their own lifetime (for example to Chars<'a>)
In this case we have all lifetime constraints in the trait, the implementation can consider to map this lifetime or ignore it. The trait, the implementation and the call site are concise and do not require references or helper lifetimes.
The disadvantage of this solution is that it is unstable, I do not know whether this means that runtime failures would probably occur or that api could change (or both). You will have to activate #![feature(generic_associated_types)] to let it run.
Playground-Link
use std::{fmt::Display, str::Chars};
use std::{fmt::Debug, rc::Rc};
pub trait Atom: Copy + Eq + Ord + Display + Debug {}
impl Atom for char {}
pub trait Atoms<A>
where
A: Atom,
{
type I<'a>: Iterator<Item = A>;
fn atoms<'a>(&'a self) -> Self::I<'a>;
}
impl Atoms<char> for str
{
type I<'a> = Chars<'a>;
fn atoms<'a>(&'a self) -> Chars<'a> {
self.chars()
}
}
impl <S> Atoms<char> for S
where
S: AsRef<str>,
{
type I<'a> = Chars<'a>;
fn atoms<'a>(&'a self) -> Chars<'a> {
self.as_ref().chars()
}
}
pub fn count<A, S>(pattern: S, item: A) -> usize
where
A: Atom,
S: Atoms<A>,
{
pattern.atoms().filter(|i| *i == item).count()
}
#[cfg(test)]
mod tests {
use std::rc::Rc;
use super::*;
#[test]
fn test_example() {
assert_eq!(count("ababbc", 'a'), 2);
assert_eq!(count("ababbc".to_string(), 'a'), 2);
assert_eq!(count(Rc::from("ababbc"), 'a'), 2);
}
}
I have a function that takes a collection of &T (represented by IntoIterator) with the requirement that every element is unique.
fn foo<'a, 'b, T: std::fmt::Debug, I>(elements: &'b I)
where
&'b I: IntoIterator<Item = &'a T>,
T: 'a,
'b: 'a,
I would like to also write a wrapper function which can work even if the elements are not unique, by using a HashSet to remove the duplicate elements first.
I tried the following implementation:
use std::collections::HashSet;
fn wrap<'a, 'b, T: std::fmt::Debug + Eq + std::hash::Hash, J>(elements: &'b J)
where
&'b J: IntoIterator<Item = &'a T>,
T: 'a,
'b: 'a,
{
let hashset: HashSet<&T> = elements.into_iter().into_iter().collect();
foo(&hashset);
}
playground
However, the compiler doesn't seem happy with my assumption that HashSet<&T> implements IntoIterator<Item = &'a T>:
error[E0308]: mismatched types
--> src/lib.rs:10:9
|
10 | foo(&hashset);
| ^^^^^^^^ expected type parameter, found struct `std::collections::HashSet`
|
= note: expected type `&J`
found type `&std::collections::HashSet<&T>`
= help: type parameters must be constrained to match other types
= note: for more information, visit https://doc.rust-lang.org/book/ch10-02-traits.html#traits-as-parameters
I know I could use a HashSet<T> by cloning all the input elements, but I want to avoid unnecessary copying and memory use.
If you have a &HashSet<&T> and need an iterator of &T (not &&T) that you can process multiple times, then you can use Iterator::copied to convert the iterator's &&T to a &T:
use std::{collections::HashSet, fmt::Debug, hash::Hash, marker::PhantomData};
struct Collection<T> {
item: PhantomData<T>,
}
impl<T> Collection<T>
where
T: Debug,
{
fn foo<'a, I>(elements: I) -> Self
where
I: IntoIterator<Item = &'a T> + Clone,
T: 'a,
{
for element in elements.clone() {
println!("{:?}", element);
}
for element in elements {
println!("{:?}", element);
}
Self { item: PhantomData }
}
}
impl<T> Collection<T>
where
T: Debug + Eq + Hash,
{
fn wrap<'a, I>(elements: I) -> Self
where
I: IntoIterator<Item = &'a T>,
T: 'a,
{
let set: HashSet<_> = elements.into_iter().collect();
Self::foo(set.iter().copied())
}
}
#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
Collection::<Foo>::wrap(&v);
}
See also:
Using the same iterator multiple times in Rust
Does cloning an iterator copy the entire underlying vector?
Why does cloning my custom type result in &T instead of T?
Note that the rest of this answer made the assumption that a struct named Collection<T> was a collection of values of type T. OP has clarified that this is not true.
That's not your problem, as shown by your later examples. That can be boiled down to this:
struct Collection<T>(T);
impl<T> Collection<T> {
fn new(value: &T) -> Self {
Collection(value)
}
}
You are taking a reference to a type (&T) and trying to store it where a T is required; these are different types and will generate an error. You are using PhantomData for some reason and accepting references via the iterator, but the problem is the same.
In fact, PhantomData makes the problem harder to see as you can just make up values that don't work. For example, we never have any kind of string here but we "successfully" created the struct:
use std::marker::PhantomData;
struct Collection<T>(PhantomData<T>);
impl Collection<String> {
fn new<T>(value: &T) -> Self {
Collection(PhantomData)
}
}
Ultimately, your wrap function doesn't make sense, either:
impl<T: Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
This is equivalent to
impl<T: Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Collection<T>
where
I: IntoIterator<Item = T>,
Which says that, given an iterator of elements T, you will return a collection of those elements. However, you put them in a HashMap and iterate on a reference to it, which yields &T. Thus this function signature cannot be right.
It seems most likely that you want to accept an iterator of owned values instead:
use std::{collections::HashSet, fmt::Debug, hash::Hash};
struct Collection<T> {
item: T,
}
impl<T> Collection<T> {
fn foo<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
for<'a> &'a I: IntoIterator<Item = &'a T>,
T: Debug,
{
for element in &elements {
println!("{:?}", element);
}
for element in &elements {
println!("{:?}", element);
}
Self {
item: elements.into_iter().next().unwrap(),
}
}
}
impl<T> Collection<T>
where
T: Eq + Hash,
{
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator<Item = T>,
T: Debug,
{
let s: HashSet<_> = elements.into_iter().collect();
Self::foo(s)
}
}
#[derive(Debug, Hash, PartialEq, Eq)]
struct Foo(i32);
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
let c = Collection::wrap(v);
println!("{:?}", c.item)
}
Here we place a trait bound on the generic iterator type directly and a second higher-ranked trait bound on a reference to the iterator. This allows us to use a reference to the iterator as an iterator itself.
See also:
How does one generically duplicate a value in Rust?
Is there any way to return a reference to a variable created in a function?
How do I write the lifetimes for references in a type constraint when one of them is a local reference?
There were a number of orthogonal issues with my code that Shepmaster pointed out, but to solve the issue of using a HashSet<&T> as an IntoIterator<Item=&T>, I found that one way to solve it is with a wrapper struct:
struct Helper<T, D: Deref<Target = T>>(HashSet<D>);
struct HelperIter<'a, T, D: Deref<Target = T>>(std::collections::hash_set::Iter<'a, D>);
impl<'a, T, D: Deref<Target = T>> Iterator for HelperIter<'a, T, D>
where
T: 'a,
{
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
self.0.next().map(|x| x.deref())
}
}
impl<'a, T, D: Deref<Target = T>> IntoIterator for &'a Helper<T, D> {
type Item = &'a T;
type IntoIter = HelperIter<'a, T, D>;
fn into_iter(self) -> Self::IntoIter {
HelperIter((&self.0).into_iter())
}
}
Which is used as follows:
struct Collection<T> {
item: PhantomData<T>,
}
impl<T: Debug> Collection<T> {
fn foo<I>(elements: I) -> Self
where
I: IntoIterator + Copy,
I::Item: Deref<Target = T>,
{
for element in elements {
println!("{:?}", *element);
}
for element in elements {
println!("{:?}", *element);
}
return Self { item: PhantomData };
}
}
impl<T: Debug + Eq + Hash> Collection<T> {
fn wrap<I>(elements: I) -> Self
where
I: IntoIterator + Copy,
I::Item: Deref<Target = T> + Eq + Hash,
{
let helper = Helper(elements.into_iter().collect());
Self::foo(&helper);
return Self { item: PhantomData };
}
}
fn main() {
let v = vec![Foo(1), Foo(2), Foo(4)];
Collection::<Foo>::wrap(&v);
}
I'm guessing that some of this may be more complicated than it needs to be, but I'm not sure how.
full playground
I am reading the section on closures in the second edition of the Rust book. At the end of this section, there is an exercise to extend the Cacher implementation given before. I gave it a try:
use std::clone::Clone;
use std::cmp::Eq;
use std::collections::HashMap;
use std::hash::Hash;
struct Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
calculation: T,
values: HashMap<K, V>,
}
impl<T, K, V> Cacher<T, K, V>
where
T: Fn(K) -> V,
K: Eq + Hash + Clone,
V: Clone,
{
fn new(calculation: T) -> Cacher<T, K, V> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value(&mut self, arg: K) -> V {
match self.values.clone().get(&arg) {
Some(v) => v.clone(),
None => {
self.values
.insert(arg.clone(), (self.calculation)(arg.clone()));
self.values.get(&arg).unwrap().clone()
}
}
}
}
After creating a version that finally works, I am really unhappy with it. What really bugs me is that cacher.value(...) has 5(!) calls to clone() in it. Is there a way to avoid this?
Your suspicion is correct, the code contains too many calls to clone(), defeating the very optimizations Cacher is designed to achieve.
Cloning the entire cache
The one to start with is the call to self.values.clone() - it creates a copy of the entire cache on every single access.
After non-lexical lifetimes
Remove this clone.
Before non-lexical lifetimes
As you likely discovered yourself, simply removing .clone() doesn't compile. This is because the borrow checker considers the map referenced for the entire duration of match. The shared reference returned by HashMap::get points to the item inside the map, which means that while it exists, it is forbidden to create another mutable reference to the same map, which is required by HashMap::insert. For the code to compile, you need to split up the match in order to force the shared reference to go out of scope before insert is invoked:
// avoids unnecessary clone of the whole map
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(arg.clone());
self.values.insert(arg, v.clone());
v
}
}
This is much better and probably "good enough" for most practical purposes. The hot path, where the value is already cached, now consists of only a single clone, and that one is actually necessary because the original value must remain in the hash map. (Also, note that cloning doesn't need to be expensive or imply deep copying - the stored value can be an Rc<RealValue>, which buys object sharing for free. In that case, clone() will simply increment the reference count on the object.)
Clone on cache miss
In case of cache miss, the key must be cloned, because calculation is declared to consume it. A single cloning will be sufficient, though, so we can pass the original arg to insert without cloning it again. The key clone still feels unnecessary, though - a calculation function shouldn't require ownership of the key it is transforming. Removing this clone boils down to modifying the signature of the calculation function to take the key by reference. Changing the trait bounds of T to T: Fn(&K) -> V allows the following formulation of value():
// avoids unnecessary clone of the key
fn value(&mut self, arg: K) -> V {
if let Some(v) = self.values.get(&arg).map(V::clone) {
return v;
} else {
let v = (self.calculation)(&arg);
self.values.insert(arg, v.clone());
v
}
}
Avoiding double lookups
Now are left with exactly two calls to clone(), one in each code path. This is optimal, as far as value cloning is concerned, but the careful reader will still be nagged by one detail: in case of cache miss, the hash table lookup will effectively happen twice for the same key: once in the call to HashMap::get, and then once more in HashMap::insert. It would be nice if we could instead reuse the work done the first time and perform only one hash map lookup. This can be achieved by replacing get() and insert() with entry():
// avoids the second lookup on cache miss
fn value(&mut self, arg: K) -> V {
match self.values.entry(arg) {
Entry::Occupied(entry) => entry.into_mut(),
Entry::Vacant(entry) => {
let v = (self.calculation)(entry.key());
entry.insert(v)
}
}.clone()
}
We've also taken the opportunity to move the .clone() call after the match.
Runnable example in the playground.
I was solving the same exercise and ended with the following code:
use std::thread;
use std::time::Duration;
use std::collections::HashMap;
use std::hash::Hash;
use std::fmt::Display;
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
let key_copy = key.clone();
self.values
.entry(key_copy)
.or_insert_with(|| (calculation)(&key))
}
}
It only makes a single copy of the key in the value() method. It does not copy the resulting value, but instead returns a reference with a lifetime specifier, which is equal to the lifetime of the enclosing Cacher instance (which is logical, I think, because values in the map will continue to exist until the Cacher itself is dropped).
Here's a test program:
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash + Clone,
R: Display,
{
println!("{}", cacher.value(intensity));
}
And its output:
calculating slowly...
10
calculating slowly...
20
10
calculating slowly...
3
calculating slowly...
6
3
If you remove the requirement of returning values, you don't need to perform any clones by making use of the Entry:
use std::{
collections::{hash_map::Entry, HashMap},
fmt::Display,
hash::Hash,
thread,
time::Duration,
};
struct Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
calculation: T,
values: HashMap<P, R>,
}
impl<P, R, T> Cacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
{
fn new(calculation: T) -> Cacher<P, R, T> {
Cacher {
calculation,
values: HashMap::new(),
}
}
fn value<'a>(&'a mut self, key: P) -> &'a R {
let calculation = &self.calculation;
match self.values.entry(key) {
Entry::Occupied(e) => e.into_mut(),
Entry::Vacant(e) => {
let result = (calculation)(e.key());
e.insert(result)
}
}
}
}
fn main() {
let mut cacher1 = Cacher::new(|num: &u32| -> u32 {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(1));
*num
});
calculate_and_print(10, &mut cacher1);
calculate_and_print(20, &mut cacher1);
calculate_and_print(10, &mut cacher1);
let mut cacher2 = Cacher::new(|str: &&str| -> usize {
println!("calculating slowly...");
thread::sleep(Duration::from_secs(2));
str.len()
});
calculate_and_print("abc", &mut cacher2);
calculate_and_print("defghi", &mut cacher2);
calculate_and_print("abc", &mut cacher2);
}
fn calculate_and_print<P, R, T>(intensity: P, cacher: &mut Cacher<P, R, T>)
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Display,
{
println!("{}", cacher.value(intensity));
}
You could then choose to wrap this in another struct that performed a clone:
struct ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
cacher: Cacher<P, R, T>,
}
impl<P, R, T> ValueCacher<P, R, T>
where
T: Fn(&P) -> R,
P: Eq + Hash,
R: Clone,
{
fn new(calculation: T) -> Self {
Self {
cacher: Cacher::new(calculation),
}
}
fn value(&mut self, key: P) -> R {
self.cacher.value(key).clone()
}
}
HashMap implements the get and insert methods which take a single immutable borrow and a single move of a value respectively.
I want a trait which is just like this but which takes two keys instead of one. It uses the map inside, but it's just a detail of implementation.
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
map: HashMap<(A, B), f64>,
}
impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
fn get(&self, a: &A, b: &B) -> f64 {
let key: &(A, B) = ??;
*self.map.get(key).unwrap()
}
fn set(&mut self, a: A, b: B, v: f64) {
self.map.insert((a, b), v);
}
}
This is certainly possible. The signature of get is
fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
where
K: Borrow<Q>,
Q: Hash + Eq,
The problem here is to implement a &Q type such that
(A, B): Borrow<Q>
Q implements Hash + Eq
To satisfy condition (1), we need to think of how to write
fn borrow(self: &(A, B)) -> &Q
The trick is that &Q does not need to be a simple pointer, it can be a trait object! The idea is to create a trait Q which will have two implementations:
impl Q for (A, B)
impl Q for (&A, &B)
The Borrow implementation will simply return self and we can construct a &dyn Q trait object from the two elements separately.
The full implementation is like this:
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
// See explanation (1).
trait KeyPair<A, B> {
/// Obtains the first element of the pair.
fn a(&self) -> &A;
/// Obtains the second element of the pair.
fn b(&self) -> &B;
}
// See explanation (2).
impl<'a, A, B> Borrow<dyn KeyPair<A, B> + 'a> for (A, B)
where
A: Eq + Hash + 'a,
B: Eq + Hash + 'a,
{
fn borrow(&self) -> &(dyn KeyPair<A, B> + 'a) {
self
}
}
// See explanation (3).
impl<A: Hash, B: Hash> Hash for dyn KeyPair<A, B> + '_ {
fn hash<H: Hasher>(&self, state: &mut H) {
self.a().hash(state);
self.b().hash(state);
}
}
impl<A: Eq, B: Eq> PartialEq for dyn KeyPair<A, B> + '_ {
fn eq(&self, other: &Self) -> bool {
self.a() == other.a() && self.b() == other.b()
}
}
impl<A: Eq, B: Eq> Eq for dyn KeyPair<A, B> + '_ {}
// OP's Table struct
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
map: HashMap<(A, B), f64>,
}
impl<A: Eq + Hash, B: Eq + Hash> Table<A, B> {
fn new() -> Self {
Table {
map: HashMap::new(),
}
}
fn get(&self, a: &A, b: &B) -> f64 {
*self.map.get(&(a, b) as &dyn KeyPair<A, B>).unwrap()
}
fn set(&mut self, a: A, b: B, v: f64) {
self.map.insert((a, b), v);
}
}
// Boring stuff below.
impl<A, B> KeyPair<A, B> for (A, B) {
fn a(&self) -> &A {
&self.0
}
fn b(&self) -> &B {
&self.1
}
}
impl<A, B> KeyPair<A, B> for (&A, &B) {
fn a(&self) -> &A {
self.0
}
fn b(&self) -> &B {
self.1
}
}
//----------------------------------------------------------------
#[derive(Eq, PartialEq, Hash)]
struct A(&'static str);
#[derive(Eq, PartialEq, Hash)]
struct B(&'static str);
fn main() {
let mut table = Table::new();
table.set(A("abc"), B("def"), 4.0);
table.set(A("123"), B("456"), 45.0);
println!("{:?} == 45.0?", table.get(&A("123"), &B("456")));
println!("{:?} == 4.0?", table.get(&A("abc"), &B("def")));
// Should panic below.
println!("{:?} == NaN?", table.get(&A("123"), &B("def")));
}
Explanation:
The KeyPair trait takes the role of Q we mentioned above. We'd need to impl Eq + Hash for dyn KeyPair, but Eq and Hash are both not object safe. We add the a() and b() methods to help implementing them manually.
Now we implement the Borrow trait from (A, B) to dyn KeyPair + 'a. Note the 'a — this is a subtle bit that is needed to make Table::get actually work. The arbitrary 'a allows us to say that an (A, B) can be borrowed to the trait object for any lifetime. If we don't specify the 'a, the unsized trait object will default to 'static, meaning the Borrow trait can only be applied when the implementation like (&A, &B) outlives 'static, which is certainly not the case.
Finally, we implement Eq and Hash. Same reason as point 2, we implement for dyn KeyPair + '_ instead of dyn KeyPair (which means dyn KeyPair + 'static in this context). The '_ here is a syntax sugar meaning arbitrary lifetime.
Using trait objects will incur indirection cost when computing the hash and checking equality in get(). The cost can be eliminated if the optimizer is able to devirtualize that, but whether LLVM will do it is unknown.
An alternative is to store the map as HashMap<(Cow<A>, Cow<B>), f64>. Using this requires less "clever code", but there is now a memory cost to store the owned/borrowed flag as well as runtime cost in both get() and set().
Unless you fork the standard HashMap and add a method to look up an entry via Hash + Eq alone, there is no guaranteed-zero-cost solution.
A Memory trait that take two keys, set by value and get by reference:
trait Memory<A: Eq + Hash, B: Eq + Hash> {
fn get(&self, a: &A, b: &B) -> Option<&f64>;
fn set(&mut self, a: A, b: B, v: f64);
}
You can impl such trait using a Map of Maps:
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
table: HashMap<A, HashMap<B, f64>>,
}
impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
fn get(&self, a: &A, b: &B) -> Option<&f64> {
self.table.get(a)?.get(b)
}
fn set(&mut self, a: A, b: B, v: f64) {
let inner = self.table.entry(a).or_insert(HashMap::new());
inner.insert(b, v);
}
}
Please note that if the solution is somewhat elegant the memory footprint of an HashMap of HashMaps have to be considered when thousands of HashMap instances has to be managed.
Full example
How can I make this code compile ?
trait Pair<'a, A, B> {
fn first_ref(&'a self) -> &'a A;
fn second_ref(&'a self) -> &'a B;
};
struct PairOwned<A, B> {
first: A,
second: B,
}
// Only implemented for the cases we are interested in ...
impl<'a, A, B> Pair<'a, A, B> for &'a PairOwned<&'a A,&'a B> {
fn first_ref(&'a self) -> &'a A {
self.first
}
fn second_ref(&'a self) -> &'a B {
self.second
}
}
impl<'a, A, B> Pair<'a, A, B> for &'a(&'a A, &'a B) {
fn first_ref(&'a self) -> &'a A {
self.0
}
fn second_ref(&'a self) -> &'a B {
self.1
}
}
fn pair_transformer<'a, I, T>(pairs: I) -> String
where T: Pair<'a, &'a Str, &'a Str> + 'a,
I: Iterator<Item=T> {
let mut s = String::new();
for pair in pairs {
s = s
+ pair.first_ref().as_slice()
+ pair.second_ref().as_slice();
}
s
}
pair_transformer([PairOwned { first: "a", second: "b" }].iter());
pair_transformer([("a", "b")].iter());
The compiler says:
tests/lang.rs:902:5: 902:21 error: the trait `pair_trait_for_iteration::Pair<'_, &core::str::Str, &core::str::Str>` is not implemented for the type `&pair_trait_for_iteration::PairOwned<&str, &str>` [E0277]
tests/lang.rs:902 pair_transformer([PairOwned { first: "a", second: "b" }].iter());
^~~~~~~~~~~~~~~~
tests/lang.rs:903:5: 903:21 error: the trait `pair_trait_for_iteration::Pair<'_, &core::str::Str, &core::str::Str>` is not implemented for the type `&(&str, &str)` [E0277]
tests/lang.rs:903 pair_transformer([("a", "b")].iter());
Notes
I have the feeling it is somehow related to the various ways to specify what a trait should be implemented for, something I have not fully understood yet.
// As stated in the answer at
// http://stackoverflow.com/questions/28283641/what-is-the-preferred-way-to-implement-the-add-trait-efficiently-for-vector-type
impl Add<YourType> for YourType { ... }
impl<'r> Add<YourType> for &'r YourType { ... }
impl<'a> Add<&'a YourType> for YourType { ... }
impl<'r, 'a> Add<&'a YourType> for &'r YourType { ... }
Using rustc 1.0.0-nightly (522d09dfe 2015-02-19) (built 2015-02-19)
There are a couple of mistakes in your code:
You probably want to implement your trait directly for your type, as the methods defined by the trait take the trait by reference (which is not the case of the Add trait in the other post you linked)
Your use of OwnedPair { first: "a", second: "b"} isn't actually owned: your type will be OwnedPair<&'static str, &'static str> so I included examples with String (which are owned) as I assume that is what you wanted
The items returned by your iterator are actually references, so you probably want to bind I to Iterator<Item=&'a T>
As I tried to be as generic as possible (and for the example to compile with both OwnedPair<&str,&str> and OwnedPair<String,String>) I used the trait std::borrow::Borrow, which basically means that it is possible to borrow a reference to the type T from the type by which this trait is implemented.
I also needed to use ?Sized as a bound for most type parameters. This allows to use types which size is not known at compile time, and will be used behind a "fat pointer". More information in this blog post (a little bit old)
Here is the full corrected code (runnable in playpen)
use std::borrow::Borrow;
trait Pair<'a, A: ?Sized, B: ?Sized> {
fn first_ref(&'a self) -> &'a A;
fn second_ref(&'a self) -> &'a B;
}
struct PairOwned<A, B> {
first: A,
second: B,
}
// Only implemented for the cases we are interested in ...
impl<'a, ARef: ?Sized, BRef: ?Sized, A: Borrow<ARef>, B: Borrow<BRef>> Pair<'a, ARef, BRef> for PairOwned<A,B> {
fn first_ref(&'a self) -> &'a ARef {
self.first.borrow()
}
fn second_ref(&'a self) -> &'a BRef {
self.second.borrow()
}
}
// It should also be possible to be more generic here with Borrow
// But I wanted to leave your original implementation
impl<'a, A: ?Sized, B: ?Sized> Pair<'a, A, B> for (&'a A, &'a B) {
fn first_ref(&'a self) -> &'a A {
self.0
}
fn second_ref(&'a self) -> &'a B {
self.1
}
}
fn pair_transformer<'a, I, T>(pairs: I) -> String
where T: Pair<'a, str, str> + 'a,
I: Iterator<Item=&'a T> {
let mut s = String::new();
for pair in pairs {
s = s
+ pair.first_ref().as_slice()
+ pair.second_ref().as_slice();
}
s
}
fn main() {
pair_transformer([PairOwned { first: "a".to_string(), second: "b".to_string() }].iter());
pair_transformer([PairOwned { first: "a".to_string(), second: "b" }].iter()); // It is even possible to mix String and &str
pair_transformer([PairOwned { first: "a", second: "b" }].iter());
pair_transformer([("a", "b")].iter());
}