Creating a HashMap with a function with reference paramaters as key - rust

I'm trying to create a HashMap where the key type is a function with a reference paramater.
let mut hm: HashMap<fn(&u32) -> u32, u32> = HashMap::new();. This works fine, but i cant insert anything into the map. The compiler says it's not legal since trait bounds weren't satisfied
rules.insert(
|v: &u32| v + 1,
0,
);
gives
the method `insert` exists but the following trait bounds were not satisfied:
`for<'r> fn(&'r u32) -> u32: Eq`
`for<'r> fn(&'r u32) -> u32: Hash`
I've read about Lifetimes in the various texts, but i can't figure out how to solve this.
Background: I'm implementing wave function collapse. I want my implementation to be generic in the sense that any "grid" can be used, 1d, 2d, 3d, nd, hexagonal etc. To do this i use a "ruleset" which is a hashmap where the key is a function that takes a cell coordinate and returns its neighbours, and the value is the rule for how to collapse said neighbours' states. The key function takes an index and a reference to the "grid" and returns the index of the neihbour.

If you want to support multiple grid implementations, the most natural approach is to define a Grid trait that abstracts differences between the grids. Here's one I made up, loosely based on your use case description:
enum CollapseRule {
A,
B,
}
trait Grid {
const COLLAPSE_RULE: CollapseRule;
fn neighbours(&self, index: usize) -> Vec<usize>;
}
#[derive(Clone, Debug, Default)]
struct Grid1d {
vertices: Vec<f64>,
}
impl Grid for Grid1d {
const COLLAPSE_RULE: CollapseRule = CollapseRule::A;
fn neighbours(&self, index: usize) -> Vec<usize> {
let len = self.vertices.len();
match index {
0 => vec![1],
_ if index < len => vec![index - 1, index + 1],
_ if index == len => vec![index - 1],
_ => panic!(),
}
}
}
Whereever your code needs to accept a grid, you can accept a generic type G with a trait bound G: Grid, and any interaction with the grid happens via the trait. Your trait will likely need more functions than in my simplistic example.

The immediate issue you're running into is that Rust's HashMap requires its keys to be Hash and Eq, and the closure you provide does not implement either trait. Function pointers (fn(T, U) -> V) are Hash and Eq, but less flexible than closures.
Even if closures could implement Hash and Eq, you couldn't make them useful as HashMap keys. From the Rust reference, ยง10.1.12:
A closure expression produces a closure value with a unique, anonymous type that cannot be written out. [emphasis added]
In other words, no two closures can have the same type, even if their contents are identical! As a result, the types of any two closures you attempt to add as keys will conflict.
I'm not too familiar with wave function collapse (which you mention in your comment), but a solution here could be to use some type that describes the function as the key, rather than the function itself. Take this very simple example:
#[derive(PartialEq, Eq, Hash)]
pub struct KeyFunction {
to_add: i32,
}
impl KeyFunction {
pub fn apply(&self, param: &i32) -> i32 {
*param + self.to_add
}
}
type Ruleset<V> = HashMap<KeyFunction, V>;
This will limit the keys you can use in a particular ruleset based on what information the KeyFunction type stores. However, you could use a trait to genericize over different kinds of key function (playground link):
use std::collections::HashMap;
pub trait KeyFunction {
type Coord;
fn apply(&self, input: &Self::Coord) -> Self::Coord;
}
/// Descriptor type for simple, contextless 2-D key functions.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct D2KeyFunction {
x_ofs: i32,
y_ofs: i32,
}
impl KeyFunction for D2KeyFunction {
type Coord = (i32, i32);
fn apply(&self, input: &Self::Coord) -> Self::Coord {
(input.0 + self.x_ofs, input.1 + self.y_ofs)
}
}
type D2Ruleset<V> = HashMap<D2KeyFunction, V>;
This allows you to have flexible key functions while working within Rust's type system, and also enforces boundaries between incompatible key functions.

Related

Generalizing iteraton method in Rust

Suppose I have some custom collection of Foos:
struct Bar {}
struct Foo {
bar: Bar
}
struct SubList {
contents: Vec<Foo>,
}
and suppose I also have a SuperList which is a custom collection of SubLists:
struct SuperList {
contents: Vec<SubList>,
}
SubList and SuperList each provide a method bars:
impl SubList {
fn bars(&self) -> impl Iterator<Item = &Bar> + '_ {
self.contents.iter().map(|x| &x.bar)
}
}
impl SuperList {
fn bars(&self) -> impl Iterator<Item = &Bar> + '_ {
self.contents.iter().flat_map(|x| x.items())
}
}
I want to define a trait that provides a method items, and implement that trait on SubList and SuperList so that SubList::items is equivalent to SubList::bars and SuperList::items is equivalent to SuperList::bars, so that I can do this:
fn do_it<T: Buz<Bar>>(buz: &T) {
for item in buz.items() {
println!("yay!")
}
}
fn main() {
let foos = vec![Foo{ bar: Bar{} }];
let sublist = SubList{ contents: foos };
do_it(&sublist);
let superlist = SuperList{ contents: vec![sublist] };
do_it(&superlist);
}
I can do what I want with dynamic dispatch:
trait Buz<T> {
fn items(&self) -> Box<dyn Iterator<Item = &T> + '_>;
}
impl Buz<Bar> for SubList {
fn items(&self) -> Box<dyn Iterator<Item = &Bar> + '_> {
SubList::bars(self)
}
}
impl Buz<Bar> for SuperList {
fn items(&self) -> Box<dyn Iterator<Item = &Bar> + '_> {
SuperList::bars(self)
}
}
However, the following doesn't work:
trait Baz<T> {
fn items(&self) -> impl Iterator<Item = &T> + '_;
}
impl Baz<Bar> for SubList {
fn items(&self) -> impl Iterator<Item = &Bar> + '_ {
SubList::bars(self)
}
}
impl Baz<Bar> for SuperList {
fn items(&self) -> impl Iterator<Item = &Bar> + '_ {
SuperList::bars(self)
}
}
(error[E0562]: `impl Trait` not allowed outside of function and inherent method return types)
Here's a playground link to what I've tried so far
How can I define a trait Baz which provides an items method to abstract over the bars methods of SubList and SuperList without using dynamic dispatch?
Unfortunately, what you are trying to do is not really possible in Rust right now. Not by design, but simply because some relevant type level features are not implemented or stabilized yet.
Unless you have spare time, an interest in type level stuff and are willing to use nightly: just use boxed iterators. They are simple, they just work, and in most cases it will likely not even hurt performance in a meaningful way.
You're still reading? Ok let's talk about it.
As you intuitively tried, impl Trait in return type position would be the obvious solution here. But as you noticed, it doesn't work: error[E0562]: `impl Trait` not allowed outside of function and inherent method return types. Why is that? RFC 1522 says:
Initial limitations:
impl Trait may only be written within the return type of a freestanding or inherent-impl function, not in trait definitions [...] Eventually, we will want to allow the feature to be used within traits [...]
These initial limitations were put in place because the type level machinery to make this work was/is not in place yet:
One important usecase of abstract return types is to use them in trait methods.
However, there is an issue with this, namely that in combinations with generic trait methods, they are effectively equivalent to higher kinded types. Which is an issue because Rust's HKT story is not yet figured out, so any "accidental implementation" might cause unintended fallout.
The following explanation in the RFC is also worth reading.
That said, some uses of impl Trait in traits can be achieved already today: with associated types! Consider this:
trait Foo {
type Bar: Clone;
fn bar() -> Self::Bar;
}
struct A;
struct B;
impl Foo for A {
type Bar = u32;
fn bar() -> Self::Bar { 0 }
}
impl Foo for B {
type Bar = String;
fn bar() -> Self::Bar { "hello".into() }
}
This works and is "basically equivalent" to:
trait Foo {
fn bar() -> impl Clone;
}
Each impl block can choose a different return type as long as it implements a trait. So why then does impl Trait not simply desugar to an associated type? Well, let's try with your example:
trait Baz<T> {
type Iter: Iterator<Item = &Bar>;
fn items(&self) -> Self::Iter;
}
We get a missing lifetime specifier error:
4 | type Iter: Iterator<Item = &Bar>;
| ^ expected named lifetime parameter
Trying to add a lifetime parameter... we notice that we can't do that. What we need is to use the lifetime of &self here. But we can't get it at that point (in the associated type definition). This limitation is very unfortunate and is encountered in many situations (search term "streaming iterator"). The solution here are GATs: generic associated types. They allow us to write this:
trait Baz<T> {
// vvvv That's what GATs allow
type Iter<'s>: Iterator<Item = &'s Bar>;
fn items(&self) -> Self::Iter<'_>;
}
GATs are not fully implemented and certainly not stable yet. See the tracking issue.
But even with GATs, we cannot fully make your example work. That's because you use iterator types that are unnameable, due to using closures. So the impl Baz for ... blocks would not be able to provide the type Iter<'s> definition. Here we can use another feature that is not stable yet: impl Trait in type aliases!
impl Baz<Bar> for SubList {
type Iter<'s> = impl Iterator<Item = &'s Bar>;
fn items(&self) -> Self::Iter<'_> {
SubList::bars(self)
}
}
This actually works! (Again, on nightly, with these unstable features.) You can see your full, working example in this playground.
This type system work has been going on for a long time and it seems like it's slowly reaching a state of being usable. (Which I am very happy about ^_^). I expect that a few of the foundational features, or at least subsets of them, will be stabilized in the not-too-distant future. And once these are used in practice and we see how they work, more convenience features (like impl Trait in traits) will be reconsidered and stabilized. Or so I think.
Also worth noting that async fns in traits are also blocked by this, since they basically desugar into a method returning impl Future<Output = ...>. And those are also a highly requested feature.
In summary: these limitations have been a pain point for quite some time and they resurface in different practical situations (async, streaming iterator, your example, ...). I'm not involved in compiler development or language team discussions, but I kept an eye on this topic for a long time and I think we are getting close to finally resolving a lot of these issues. Certainly not in the next releases, but I see a decent chance we get some of this in the next year or two.

`cannot infer type` on Borrow::borrow despite constraints

I want to write a piece of code that can take references or owned values of a copyable type, and return an owned version of that type. I've reduced the problems I'm having with the type inference to the following code, which errors:
use std::borrow::Borrow;
fn copy<R, E>(val: E) -> R
where
R: Default + Copy,
E: Borrow<R>,
{
*val.borrow()
}
fn main() {
assert_eq!(6, copy(&6));
assert_eq!(6, copy(6));
assert_eq!(6.0, copy(&6.0));
assert!((6.0f64 - copy(&6.0f64)).abs() < 1e-6);
}
The error comes from the last assert:
error[E0282]: type annotations needed
--> src/main.rs:15:13
|
15 | assert!((6.0f64 - copy(&6.0f64)).abs() < 1e-6);
| ^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type
|
= note: type must be known at this point
My only hypothesis that the Sub trait on f64 allows f64 or &f64, and if the Default constraint weren't there, then a valid expression for the last copy would be copy::<&f64, &f64>(&6.0f64), however that isn't allowed because &f64 doesn't implement Default. If we pass in an f64 by value it works, presumably because then it restricts R to be f64 instead of either.
What I'm not clear on is why the compiler can't further restrict the return type of copy, or how to indicate to the compiler that the value returned won't be a reference.
Nothing in copy constrains R to a specific concrete type. In particular, &f64 could implement Borrow<R> for multiple values of R (not just f64). It doesn't in your current code, but the lack of alternatives is not considered grounds to pick a specific implementation.
I can even add an implementation that matches:
#[derive(Copy, Clone, Debug, Default)]
struct Foo;
impl Borrow<Foo> for &f64 {
fn borrow(&self) -> &Foo { &Foo }
}
(This trait implementation is permitted even though f64 is a standard library type because Foo is a type defined in the current crate.)
Now we can actually use the choice:
fn main() {
dbg!(copy::<f64, _>(&1.0));
dbg!(copy::<Foo, _>(&1.0));
}
[src/main.rs:19] copy::<f64, _>(&1.0) = 1.0
[src/main.rs:20] copy::<Foo, _>(&1.0) = Foo
A function like copy can only have its return type derived from its argument type when the return type actually depends on the argument type: for example, if it is an associated type of trait implemented by the argument. Both AsRef and Borrow have a type parameter rather than an associated type (and can therefore be implemented multiple times for the same implementing type); Deref has an associated Target type instead, but Deref doesn't offer going from f64 to f64. You could implement your own trait for this:
trait DerefCopy: Copy {
type Output;
fn deref_copy(self) -> Self::Output;
}
impl<T: Copy> DerefCopy for &T {
type Output = T;
fn deref_copy(self) -> T {
*self
}
}
impl DerefCopy for f64 {
type Output = Self;
fn deref_copy(self) -> Self {
self
}
}
fn main() {
assert_eq!(6, (&6).deref_copy());
assert_eq!(6, (6).deref_copy());
assert_eq!(6.0, (&6.0).deref_copy());
assert!((6.0f64 - (&6.0f64).deref_copy()).abs() < 1e-6);
}
However, this would require you to implement DerefCopy for every non-reference type you wish to use it with, because it's not possible to write a blanket implementation for all non-reference Ts; the reason Borrow can have a blanket implementation is that impl Borrow<T> for T doesn't conflict with impl Borrow<T> for &T because if we suppose T is itself a reference &U, we get impl Borrow<&U> for &&U which is still not the same as impl Borrow<T> for T.

Eliminate lifetime parameter from a trait whose implementation wraps a HashMap?

I'd like to wrap a few methods of HashMap such as insert and keys. This attempt compiles, and the tests pass:
use std::collections::HashMap;
use std::hash::Hash;
pub trait Map<'a, N: 'a> {
type ItemIterator: Iterator<Item=&'a N>;
fn items(&'a self) -> Self::ItemIterator;
fn insert(&mut self, item: N);
}
struct MyMap<N> {
map: HashMap<N, ()>
}
impl<N: Eq + Hash> MyMap<N> {
fn new() -> Self {
MyMap { map: HashMap::new() }
}
}
impl<'a, N: 'a + Eq + Hash> Map<'a, N> for MyMap<N> {
type ItemIterator = std::collections::hash_map::Keys<'a, N, ()>;
fn items(&'a self) -> Self::ItemIterator {
self.map.keys()
}
fn insert(&mut self, item: N) {
self.map.insert(item, ());
}
}
#[cfg(test)]
mod tests {
use super::*;
#[derive(Eq, Hash, PartialEq, Debug)]
struct MyItem;
#[test]
fn test() {
let mut map = MyMap::new();
let item = MyItem { };
map.insert(&item);
let foo = map.items().collect::<Vec<_>>();
for it_item in map.items() {
assert_eq!(it_item, &&item);
}
assert_eq!(foo, vec![&&item]);
}
}
I'd like to eliminate the need for the lifetime parameter in Map if possible, but so far haven't found a way. The problem seems to result from the definition of std::collections::hash_map::Keys, which requires a lifetime parameter.
Attempts to redefine the Map trait work until it becomes necessary to supply the lifetime parameter on Keys:
use std::collections::HashMap;
use std::hash::Hash;
pub trait Map<N> {
type ItemIterator: Iterator<Item=N>;
fn items(&self) -> Self::ItemIterator;
fn insert(&mut self, item: N);
}
struct MyMap<N> {
map: HashMap<N, ()>
}
impl<N: Eq + Hash> MyMap<N> {
fn new() -> Self {
MyMap { map: HashMap::new() }
}
}
// ERROR: "unconstrained lifetime parameter"
impl<'a, N> Map<N> for MyMap<N> {
type ItemIterator = std::collections::hash_map::Keys<'a, N, ()>;
}
The compiler issues an error about an unconstrained lifetime parameter that I haven't been able to fix without re-introducing the lifetime into the Map trait.
The main goal of this experiment was to see how I could also eliminate Box from previous attempts. As this question explains, that's another way to return an iterator. So I'm not interested in that approach at the moment.
How can I set up Map and an implementation without introducing a lifetime parameter or using Box?
Something to think about is that since hash_map::Keys has a generic lifetime parameter, it is probably necessary for some reason, so your trait to abstract over Keys will probably need it to.
In this case, in the definition of Map, you need some way to specify how long the ItemIterator's Item lives. (The Item is &'a N).
This was your definition:
type ItemIterator: Iterator<Item=&'a N>
You are trying to say that for any struct that implements Map, the struct's associated ItemIterator must be an iterator of references; however, this constraint alone is useless without any further information: we also need to know how long the reference lives for (hence why type ItemIterator: Iterator<Item=&N> throws an error: it is missing this information, and it cannot currently be elided AFAIK).
So, you choose 'a to name a generic lifetime that you guarantee each &'a N will be valid for. Now, in order to satisfy the borrow checker, prove that &'a N will be valid for 'a, and establish some useful promises about 'a, you specify that:
Any value for the reference &self given to items() must live at least as long as 'a. This ensures that for each of the returned items (&'a N), the &self reference must still be valid in order for the item reference to remain valid, in other words, the items must outlive self. This invariant allows you to reference &self in the return value of items(). You have specified this with fn items(&'a self). (Side note: my_map.items() is really shorthand for MyMap::items(&my_map)).
Each of the Ns themselves must also remain valid for as long as 'a. This is important if the objects contain any references that won't live forever (aka non-'static references); this ensures that all of the references that the item N contains live at least as long as 'a. You have specified this with the constraint N: 'a.
So, to recap, the definition of Map<'a, N> requires that an implementors' items() function must return an ItemIterator of references that are valid for 'a to items that are valid for 'a. Now, your implementation:
impl<'a, N: 'a + Eq + Hash> Map<'a, N> for MyMap<N> { ... }
As you can see, the 'a parameter is completely unconstrained, so you can use any 'a with the methods from Map on an instance of MyMap, as long as N fulfills its constraints of N: 'a + Eq + Hash. 'a should automatically become the longest lifetime for which both N and the map passed to items() are valid.
Anyway, what you're describing here is known as a streaming iterator, which has been a problem in years. For some relevant discussion, see the approved but currently unimplemented RFC 1598 (but prepare to be overwhelmed).
Finally, as some people have commented, it's possible that your Map trait might be a bad design from the start since it may be better expressed as a combination of the built-in IntoIterator<Item=&'a N> and a separate trait for insert(). This would mean that the default iterator used in for loops, etc. would be the items iterator, which is inconsistent with the built-in HashMap, but I am not totally clear on the purpose of your trait so I think your design likely makes sense.

Trait method which returns `Self`-type with a different type and/or lifetime parameter

The Rust compiler doesn't allow specifying new type parameters on Self in the return types of trait methods. That is, a trait method can return Self, but not Self<T>. I'm interested in what the proper way is to accomplish the following; or if it's fundamentally impossible, then why that's the case.
Suppose we have two structs as follows:
pub struct SomeStruct<T> {
id: usize,
description: String,
extra_data: T,
}
pub struct OtherStruct<T> {
permit_needed: bool,
registration_number: usize,
extra_data: T,
}
Both of these structs have an extra_data field of type T, where T is a generic type parameter. We can implement methods such as get_extra_data(), and more interestingly, transform(), on both structs:
impl<T> SomeStruct<T> {
pub fn get_extra_data(&self) -> &T {
&self.extra_data
}
pub fn transform<S>(&self, data: S) -> SomeStruct<S> {
SomeStruct {
id: self.id,
description: self.description.clone(),
extra_data: data,
}
}
}
impl<T> OtherStruct<T> {
pub fn get_extra_data(&self) -> &T {
&self.extra_data
}
pub fn transform<S>(&self, data: S) -> OtherStruct<S> {
OtherStruct {
permit_needed: self.permit_needed,
registration_number: self.registration_number,
extra_data: data,
}
}
}
But I'm having trouble creating a trait Transformable that unifies these two types with respect to these operations:
trait Transformable<T> {
fn get_extra_data(&self) -> &T;
fn transform<S>(&self, data: S) -> Self<S>;
}
The transform() function needs to return something of the same type as Self, just with a different type parameter. Unfortunately, the rust compiler rejects this with the error message E0109: type arguments are not allowed for this type.
Even the weaker signature fn transform<S>(&self, data: S) -> impl Transformable<S> is rejected, with error E0562: impl Trait not allowed outside of function and inherent method return types. The traditional solution here, to my knowledge, would be to return an associated type instead of an impl, but that would require generic associated types in this case. And at any rate that would seem much more roundabout than the stronger Self<S> signature which is what we really want.
Playground link with the above code
An analogous problem happens exactly the same way with lifetime parameters. I'm actually more interested in the lifetime parameters case, but I thought the type parameters case might be easier to understand and talk about, and also make it clear that the issue isn't lifetime-specific.

How to return an anonymous type from a trait method without using Box?

I have an extension trait whose methods are just shorthands for adapters/combinators:
fn foo(self) -> ... { self.map(|i| i * 2).foo().bar() }
The return type of Trait::foo() is some nested Map<Foo<Bar<Filter..., including closures, and is therefor anonymous for all practical purposes. My problem is how to return such a type from a trait method, preferably without using Box.
impl Trait in return position would be the way to go, yet this feature is not implemented for trait methods yet.
Returning a Box<Trait>is possible, yet I don't want to allocate for every adapter shorthanded by the trait.
I can't put the anonymous type into a struct and return that, because struct Foo<T> { inner: T } can't be implemented (I promise an impl for all T, yet only return a specific Foo<Map<Filter<Bar...).
Existential types would probably solve the above problem, yet they won't be implemented for some time.
I could also just avoid the problem and use a macro or a freestanding function; this also feels unhygienic, though.
Any more insights?
What is the correct way to return an Iterator (or any other trait)? covers all the present solutions. The one you haven't used is to replace closures with function pointers and then use a type alias (optionally wrapping in a newtype). This isn't always possible, but since you didn't provide a MCVE of your code, we can't tell if this will work for you or not:
use std::iter;
type Thing<T> = iter::Map<iter::Filter<T, fn(&i32) -> bool>, fn(i32) -> i32>;
trait IterExt: Iterator<Item = i32> {
fn thing(self) -> Thing<Self>
where
Self: Sized + 'static,
{
// self.filter(|&v| v > 10).map(|v| v * 2)
fn a(v: &i32) -> bool { *v > 10 }
fn b(v: i32) -> i32 { v * 2 }
self.filter(a as fn(&i32) -> bool).map(b as fn(i32) -> i32)
}
}
impl<I> IterExt for I
where
I: Iterator<Item = i32>,
{}
fn main() {}
Honestly, in these cases I would create a newtype wrapping the boxed trait object. That way, I have the flexibility to internally re-implement it with a non-boxed option in an API-compatible fashion when it becomes practical to do so.

Resources