How to create a lazy_static HashMap with function references as value? - rust

I tried to create a HashMap with functions as the values:
#[macro_use]
extern crate lazy_static;
use std::collections::HashMap;
lazy_static! {
static ref HASHES: HashMap<&'static str, &'static Fn([u8])> = {
let mut m = HashMap::new();
m.insert("md5", &md5);
m
};
}
fn md5(bytes: &[u8]) -> String {
String::default()
}
The compiler gives me an error:
error[E0277]: the trait bound `std::ops::Fn([u8]) + 'static: std::marker::Sync` is not satisfied in `&'static std::ops::Fn([u8]) + 'static`
--> src/main.rs:6:1
|
6 | lazy_static! {
| _^ starting here...
7 | | static ref HASHES: HashMap<&'static str, &'static Fn([u8])> = {
8 | | let mut m = HashMap::new();
9 | | m.insert("md5", &md5);
10 | | m
11 | | };
12 | | }
| |_^ ...ending here: within `&'static std::ops::Fn([u8]) + 'static`, the trait `std::marker::Sync` is not implemented for `std::ops::Fn([u8]) + 'static`
|
= note: `std::ops::Fn([u8]) + 'static` cannot be shared between threads safely
= note: required because it appears within the type `&'static std::ops::Fn([u8]) + 'static`
= note: required because of the requirements on the impl of `std::marker::Sync` for `std::collections::hash::table::RawTable<&'static str, &'static std::ops::Fn([u8]) + 'static>`
= note: required because it appears within the type `std::collections::HashMap<&'static str, &'static std::ops::Fn([u8]) + 'static>`
= note: required by `lazy_static::lazy::Lazy`
= note: this error originates in a macro outside of the current crate
I don't understand what I should do to fix this error, and I don't know any other way of creating such a HashMap.

Your code has multiple issues. The error presented by the compiler is telling you that your code would allow memory unsafety:
`std::ops::Fn([u8]) + 'static` cannot be shared between threads safely
The type you are storing in your HashMap has no guarantee that it can be shared.
You can "fix" that by specifying such a bound by changing your value type to &'static (Fn([u8]) + Sync). This unlocks the next error, due to the fact that your function signatures don't match up:
expected type `std::collections::HashMap<&'static str, &'static std::ops::Fn([u8]) + std::marker::Sync + 'static>`
found type `std::collections::HashMap<&str, &fn(&[u8]) -> std::string::String {md5}>`
"Fixing" that with &'static (Fn(&[u8]) -> String + Sync) leads to esoteric higher-ranked lifetime errors:
expected type `std::collections::HashMap<&'static str, &'static for<'r> std::ops::Fn(&'r [u8]) -> std::string::String + std::marker::Sync + 'static>`
found type `std::collections::HashMap<&str, &fn(&[u8]) -> std::string::String {md5}>`
Which can be "fixed" by casting the function with &md5 as &'static (Fn(&[u8]) -> String + Sync), which leads to
note: borrowed value must be valid for the static lifetime...
note: consider using a `let` binding to increase its lifetime
This bottoms out because the reference you've made is to a temporary value that doesn't live outside of the scope.
I put fix in scare quotes because this isn't really the right solution. The right thing is to just use a function pointer:
lazy_static! {
static ref HASHES: HashMap<&'static str, fn(&[u8]) -> String> = {
let mut m = HashMap::new();
m.insert("md5", md5 as fn(&[u8]) -> std::string::String);
m
};
}
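For readers who cannot pull in the lazy_static crate, the same function-pointer map can be sketched with std::sync::OnceLock from the standard library (assuming Rust 1.70 or later). This swaps lazy_static for OnceLock, so it is not the code the answer itself uses; the hashes() accessor and the hex function are hypothetical names added for illustration.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// A plain function pointer is Copy, Send and Sync, so no trait-object bounds are needed.
type HashFn = fn(&[u8]) -> String;

// Stub from the question; a real implementation would compute the digest.
fn md5(_bytes: &[u8]) -> String {
    String::default()
}

// Hypothetical second entry so that a lookup returns something observable.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

// Hypothetical accessor: builds the map on first use and returns a shared reference.
fn hashes() -> &'static HashMap<&'static str, HashFn> {
    static HASHES: OnceLock<HashMap<&'static str, HashFn>> = OnceLock::new();
    HASHES.get_or_init(|| {
        let mut m: HashMap<&'static str, HashFn> = HashMap::new();
        // Function items coerce to the fn pointer type of the map's values.
        m.insert("md5", md5);
        m.insert("hex", hex);
        m
    })
}

fn main() {
    // Copy the function pointer out of the map, then call it.
    let f = hashes()["hex"];
    println!("{}", f(&[1, 255]));
}
```

Annotating the map's type up front lets the function items coerce to fn pointers at each insert, so no explicit `as` cast is needed in this variant.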
Honestly, I'd say that a HashMap is probably overkill; I'd use an array. A small array is probably faster than a small HashMap:
type HashFn = fn(&[u8]) -> String;
static HASHES: &'static [(&'static str, HashFn)] = &[
("md5", md5),
];
You can start by just iterating through the list, or maybe be fancy and alphabetize it and then use binary_search when it gets a bit bigger.
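Both lookup strategies mentioned above can be sketched as follows. The function bodies are placeholders, and sha1, find_linear and find_sorted are hypothetical names introduced only for this illustration; the binary search is valid only while the array stays sorted by name.

```rust
type HashFn = fn(&[u8]) -> String;

// Placeholder bodies; real implementations would compute digests.
fn md5(bytes: &[u8]) -> String {
    format!("md5 of {} bytes", bytes.len())
}

fn sha1(bytes: &[u8]) -> String {
    format!("sha1 of {} bytes", bytes.len())
}

// Kept sorted by name so that binary search works.
static HASHES: &[(&str, HashFn)] = &[("md5", md5), ("sha1", sha1)];

// Linear scan: fine for a handful of entries.
fn find_linear(name: &str) -> Option<HashFn> {
    HASHES.iter().find(|(n, _)| *n == name).map(|(_, f)| *f)
}

// Binary search on the name: requires HASHES to be sorted by name.
fn find_sorted(name: &str) -> Option<HashFn> {
    HASHES
        .binary_search_by_key(&name, |(n, _)| *n)
        .ok()
        .map(|i| HASHES[i].1)
}

fn main() {
    if let Some(f) = find_linear("md5") {
        println!("{}", f(b"hello"));
    }
    println!("{}", find_sorted("unknown").is_some());
}
```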

Related

How to establish relation with respect to higher ranked trait bounds

I am trying to create a higher-order function that takes a function from &str to an iterator of &str. My difficulty is relating the lifetime variable introduced by for<'a> on that function to the function's return type. It is hard to describe in words, so let's jump right into the code:
use std::fs::File;
use std::io::{BufRead, BufReader, Result};
fn static_dispatcher<'b, I>(
tokenize: impl for<'a> Fn(&'a str) -> I,
ifs: BufReader<File>,
) -> Result<()>
where
I: Iterator<Item = &'b str>,
{
ifs.lines()
.map(|line| tokenize(&line.unwrap()).count())
.for_each(move |n| {
println!("{n}");
});
Ok(())
}
fn main() -> Result<()> {
let ifs = BufReader::new(File::open("/dev/stdin")?);
static_dispatcher(|line| line.split_whitespace(), ifs)
// static_dispatcher(|line| line.split('\t'), ifs)
}
The compiler complains that the lifetime relation of the tokenize's input 'a and output 'b is not specified.
--> src/main.rs:21:30
|
21 | static_dispatcher(|line| line.split_whitespace(), ifs)
| ----- ^^^^^^^^^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
| | |
| | return type of closure is SplitWhitespace<'2>
| has type `&'1 str`
|
= note: requirement occurs because of the type `SplitWhitespace<'_>`, which makes the generic argument `'_` invariant
= note: the struct `SplitWhitespace<'a>` is invariant over the parameter `'a`
= help: see <https://doc.rust-lang.org/nomicon/subtyping.html> for more information about variance
I want to specify 'a = 'b, but I can't because 'a comes from for<'a>, which is not visible for the type I.
I also tried
fn static_dispatcher<'a, I>(
tokenize: impl Fn(&'a str) -> I,
ifs: BufReader<File>,
) -> Result<()>
where
I: Iterator<Item = &'a str>,
but this does not work either, because the lifetime of the tokenize argument must be generic, i.e., it must be introduced with for<'a>.
How can I fix this problem?
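One possible direction, offered as an assumption rather than a confirmed answer to this question, is to name the return type inside the for<'a> bound itself, which is possible if the iterator is boxed. The sketch below trades static dispatch for a Box per line; the Tokens alias and count_tokens are hypothetical names, and it takes a slice of lines instead of a BufReader so that it runs without any input file.

```rust
// Boxed iterator whose items borrow from the input line.
type Tokens<'a> = Box<dyn Iterator<Item = &'a str> + 'a>;

// The return type mentions 'a directly, so input and output lifetimes are tied together.
fn count_tokens<F>(tokenize: F, lines: &[String]) -> Vec<usize>
where
    F: for<'a> Fn(&'a str) -> Tokens<'a>,
{
    lines
        .iter()
        .map(|line| tokenize(line.as_str()).count())
        .collect()
}

fn main() {
    let lines = vec!["a b  c".to_string(), "x\ty".to_string()];
    // Each closure boxes its concrete iterator; the Box coerces to Tokens<'_>.
    let by_space = count_tokens(|line| Box::new(line.split_whitespace()), &lines);
    let by_tab = count_tokens(|line| Box::new(line.split('\t')), &lines);
    println!("{:?} {:?}", by_space, by_tab);
}
```

The cost is one heap allocation and dynamic dispatch per line, which may or may not matter for the intended use.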

Understanding &type + 'a syntax

I have started to learn Rust. Currently I'm trying to learn how to properly use lifetime annotations and think I have understood the basics quite well. However, I have on several occasions encountered the following structure:
fn<'a> foo(a: &'a str, ...) -> &str + 'a
The str is not relevant; it could be any type. My question is specifically what &str + 'a means, as opposed to &'a str (I might not be using it correctly, which is why I'm asking). As a real-world example, I encountered it in this tutorial for async Rust, where they write:
fn foo_expanded<'a>(x: &'a u8) -> impl Future<Output = u8> + 'a
I'm speculating that it has to do with Future being a trait and not a type, but I have been unable to verify this in any official documentation and have not found any source on what the syntax means.
First of all, the syntax shown in your post is not allowed.
fn<'a> foo(a: &'a str, ...) -> &str + 'a
There are two reasons:
1. Lifetime generics must be specified after the function name.
2. The displayed way of specifying return lifetimes is allowed only for traits, not complete types.
Otherwise you would get one of the two following errors:
error[E0178]: expected a path on the left-hand side of `+`, not `&str`
--> ./ex_056.rs:11:43
|
11 | fn _get<'a>(ms: &'a MyStruct, s: &str) -> &str + 'a {
| ^^^^^^^^^ help: try adding parentheses: `&(str + 'a)`
or
error[E0404]: expected trait, found builtin type `str`
--> ./ex_056.rs:15:31
|
15 | fn _get2<'a>(s: &'a str) -> &(str + 'a) {
| ^^^ not a trait
Thus it's not valid.
My guess is that you saw this notation applied not to a complete type but to a trait object. Writing a trait object without dyn was allowed in Rust 2015 but is now deprecated, as the following warning shows:
warning: trait objects without an explicit `dyn` are deprecated
--> ./ex_056.rs:15:31
|
15 | fn _get2<'a>(s: &'a str) -> &(str + 'a) {
| ^^^^^^^^ help: use `dyn`: `dyn str + 'a`
|
= warning: this is accepted in the current edition (Rust 2015) but is a hard error in Rust 2021!
= note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2021/warnings-promoted-to-error.html>
Your first example (&str + 'a) is not valid. The + 'a notation can only be applied to a trait.
Your second example: impl Future<Output = u8> + 'a means that foo_expanded returns some unknown type that implements the trait Future<Output = u8> and that this unknown type may contain references with the 'a lifetime. Therefore you won't be able to use the returned value once the 'a lifetime expires.
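The same `+ 'a` bound can be seen without an async runtime. The sketch below uses Iterator and Display in place of Future; bytes_of and boxed_display are hypothetical names chosen for illustration.

```rust
use std::fmt::Display;

// The returned iterator borrows from `s`, so the opaque type is bounded by 'a.
fn bytes_of<'a>(s: &'a str) -> impl Iterator<Item = u8> + 'a {
    s.bytes()
}

// The same bound on a trait object: the boxed value may hold references that live for 'a.
fn boxed_display<'a>(s: &'a str) -> Box<dyn Display + 'a> {
    Box::new(s)
}

fn main() {
    let owned = String::from("ab");
    let bytes: Vec<u8> = bytes_of(&owned).collect();
    println!("{:?} {}", bytes, boxed_display(&owned));
    // Neither returned value could be kept alive past the point where `owned` is dropped.
}
```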

Closure arguments: passing a function that mutates the inner variables

The idea is to have one closure (change_x in this case) that captures state (x in this case) and takes a function as its parameter (alterer) that dictates how the inner state changes.
pub fn plus17(h: & u64) -> u64 {
*h + 17
}
pub fn main() {
let x = 0; //take x by reference
let mut change_x = move |alterer: &dyn FnOnce(&u64) ->u64 | alterer(&x) ;
change_x(&mut plus17);
println!("{}", x);
}
I can't seem to get the types right however:
error[E0161]: cannot move a value of type dyn for<'r> FnOnce(&'r u64) -> u64: the size of dyn for<'r> FnOnce(&'r u64) -> u64 cannot be statically determined
--> playground/src/main.rs:19:69
|
19 | let mut increment_x = move |alterer: &dyn FnOnce(&u64) ->u64 | alterer(&x) ;
| ^^^^^^^
error[E0507]: cannot move out of `*alterer` which is behind a shared reference
--> playground/src/main.rs:19:69
|
19 | let mut increment_x = move |alterer: &dyn FnOnce(&u64) ->u64 | alterer(&x) ;
| ^^^^^^^ move occurs because `*alterer` has type `dyn for<'r> FnOnce(&'r u64) -> u64`, which does not implement the `Copy` trait
I'm not sure if I'm justified in putting the dyn where I put it; it was the compiler's suggestion and I'm not really sure why I have to put it there. Is it because the alterer can be of arbitrary size despite the input/return type of &u64 -> u64?
I have also tried to make alterer an FnMut as opposed to FnOnce, but I'm pretty shaky on the distinction, and FnOnce seemed reasonable given that a given alterer would run only once (at the moment of invocation by the outer closure change_x).
FnOnce needs an owned self. Thus alterer cannot be FnOnce, because it is not owned but a reference.
You can either make it &dyn Fn or &mut dyn FnMut (I'd recommend going with FnMut), or take Box<dyn FnOnce>.
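A minimal sketch of two of these options follows, assuming the intent is for change_x to store the result back into x (the original snippet discards it). The wrapper functions run_with_fn_mut and run_with_boxed_fn_once are hypothetical and exist only to make the result easy to inspect.

```rust
fn plus17(h: &u64) -> u64 {
    *h + 17
}

// Option 1: borrow the alterer as &mut dyn FnMut, so it can be called through a reference.
fn run_with_fn_mut() -> u64 {
    let mut x: u64 = 0;
    let mut change_x = |alterer: &mut dyn FnMut(&u64) -> u64| {
        x = alterer(&x);
    };
    change_x(&mut plus17);
    change_x(&mut plus17);
    x
}

// Option 2: take ownership via Box<dyn FnOnce>, which allows a single call per alterer.
fn run_with_boxed_fn_once() -> u64 {
    let mut x: u64 = 0;
    let mut change_x = |alterer: Box<dyn FnOnce(&u64) -> u64>| {
        x = alterer(&x);
    };
    let offset: u64 = 5;
    change_x(Box::new(plus17));
    change_x(Box::new(move |v: &u64| *v + offset));
    x
}

fn main() {
    println!("{} {}", run_with_fn_mut(), run_with_boxed_fn_once());
}
```

The dyn is needed in both cases because a closure parameter must have a single concrete type, and a trait object behind a reference or a Box has a known size regardless of which function or closure it wraps.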

Rust "doesn't have a size known at compile-time" error for iterators?

I'm refactoring the cartesian product code in Rust's itertools [1] as a way of learning Rust. The cartesian product is formed from an Iterator I and an IntoIterator J. The IntoIterator J is converted into an iterator multiple times and iterated over.
So far I have the code below, which is a minor modification of the code in the itertools source code. The biggest change is specifying a specific type (i8) instead of using generic types.
struct Product {
a: dyn Iterator<Item=i8>,
a_cur: Option<i8>,
b: dyn IntoIterator<Item=i8, IntoIter=dyn Iterator<Item=i8>>,
b_iter: dyn Iterator<Item=i8>
}
impl Iterator for Product {
type Item = (i8, i8);
fn next(&mut self) -> Option<Self::Item> {
let elt_b = match self.b_iter.next() {
None => {
self.b_iter = self.b.into_iter();
match self.b_iter.next() {
None => return None,
Some(x) => {
self.a_cur = self.a.next();
x
}
}
}
Some(x) => x
};
match self.a_cur {
None => None,
Some(ref a) => {
Some((a, elt_b))
}
}
}
}
fn cp(i: impl Iterator<Item=i8>, j: impl IntoIterator<Item=i8>) -> Product {
let p = Product{
a: i,
a_cur: i.next(),
b: j,
b_iter: j.into_iter()};
return p
}
fn main() {
for foo in cp(vec![1,4,7], vec![2,3,9]) {
println!("{:?}", foo);
}
}
Unfortunately the compiler is giving errors I have been unable to fix. I've attempted the fixes suggested by the compiler, but when I make them I get many more "doesn't have size known at compile time" errors.
I'm especially confused because the implementation in Rust's itertools library (link below) has a very similar structure and didn't require specifying lifetimes, borrowing, using Boxes, or the dyn keyword. I'd love to know what I changed that led to the Rust compiler suggesting using borrowing and/or Boxes.
error[E0277]: the size for values of type `(dyn Iterator<Item = i8> + 'static)` cannot be known at compilation time
--> src/main.rs:15:8
|
15 | a: dyn Iterator<Item=i8>,
| ^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `(dyn Iterator<Item = i8> + 'static)`
= note: only the last field of a struct may have a dynamically sized type
= help: change the field's type to have a statically known size
help: borrowed types always have a statically known size
|
15 | a: &dyn Iterator<Item=i8>,
| ^
help: the `Box` type always has a statically known size and allocates its contents in the heap
|
15 | a: Box<dyn Iterator<Item=i8>>,
| ^^^^ ^
error[E0277]: the size for values of type `(dyn IntoIterator<Item = i8, IntoIter = (dyn Iterator<Item = i8> + 'static)> + 'static)` cannot be known at compilation time
--> src/main.rs:17:8
|
17 | b: dyn IntoIterator<Item=i8, IntoIter=dyn Iterator<Item=i8>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `(dyn IntoIterator<Item = i8, IntoIter = (dyn Iterator<Item = i8> + 'static)> + 'static)`
= note: only the last field of a struct may have a dynamically sized type
= help: change the field's type to have a statically known size
help: borrowed types always have a statically known size
|
17 | b: &dyn IntoIterator<Item=i8, IntoIter=dyn Iterator<Item=i8>>,
| ^
help: the `Box` type always has a statically known size and allocates its contents in the heap
|
17 | b: Box<dyn IntoIterator<Item=i8, IntoIter=dyn Iterator<Item=i8>>>,
| ^^^^ ^
error: aborting due to 2 previous errors
[1] Docs at https://nozaq.github.io/shogi-rs/itertools/trait.Itertools.html#method.cartesian_product and code at https://github.com/rust-itertools/itertools/blob/master/src/adaptors/mod.rs#L286 .
You didn't just change the item type, you also removed the generic iterator. In the itertools crate, there is:
pub struct Product<I, J>
where I: Iterator
{
a: I,
…
Meaning that a is an iterator whose exact type will be specified by the user (but still at compile time).
You have removed the generic I parameter and instead you have written:
pub struct Product
{
a: dyn Iterator<Item=i8>,
…
If that worked, it would mean that a is an iterator whose item type is i8 but whose exact type will be specified at runtime. Therefore at compile time the compiler can't know the exact type of a, nor how much space it should allocate to store a inside Product.
If you want your cartesian product to work for any iterator whose items are i8, you need to keep the generic parameter I with an extra constraint:
pub struct Product<I, J>
where I: Iterator<Item=i8>
{
a: I,
…
And a similar change will be required for J in impl Iterator.
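Putting this together, a complete generic version might look like the sketch below, using i8 items as in the question. It is a simplified illustration rather than the itertools implementation: the Clone bound on J (used to restart the inner iteration) and the exact shape of cp are assumptions made here.

```rust
struct Product<I, J>
where
    I: Iterator<Item = i8>,
    J: IntoIterator<Item = i8> + Clone,
{
    a: I,
    a_cur: Option<i8>,
    b: J,
    b_iter: J::IntoIter,
}

impl<I, J> Iterator for Product<I, J>
where
    I: Iterator<Item = i8>,
    J: IntoIterator<Item = i8> + Clone,
{
    type Item = (i8, i8);

    fn next(&mut self) -> Option<Self::Item> {
        let elt_b = match self.b_iter.next() {
            Some(x) => x,
            None => {
                // Inner iterator exhausted: restart it and advance the outer one.
                self.b_iter = self.b.clone().into_iter();
                self.a_cur = self.a.next();
                // An empty inner collection ends the whole product.
                self.b_iter.next()?
            }
        };
        // Once the outer iterator is exhausted, a_cur is None and so is the result.
        self.a_cur.map(|a| (a, elt_b))
    }
}

fn cp<I, J>(i: I, j: J) -> Product<I::IntoIter, J>
where
    I: IntoIterator<Item = i8>,
    J: IntoIterator<Item = i8> + Clone,
{
    let mut a = i.into_iter();
    let a_cur = a.next();
    let b_iter = j.clone().into_iter();
    Product { a, a_cur, b: j, b_iter }
}

fn main() {
    for pair in cp(vec![1, 4, 7], vec![2, 3, 9]) {
        println!("{:?}", pair);
    }
}
```

Because every field has a concrete type chosen at compile time, no Box, borrow, or dyn is needed.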

Why does this result of a binary operator need an appropriate lifetime?

This is related to my earlier question on making a modular exponentiation method generic. I've now arrived at the following code:
fn powm<T>(fbase: &T, exponent: &T, modulus: &T) -> T
where
T: Mul<T, Output = T>
+ From<u8>
+ PartialEq<T>
+ Rem<T, Output = T>
+ Copy
+ for<'a> Rem<&'a T, Output = T>
+ Clone
+ PartialOrd<T>
+ ShrAssign<T>,
for<'a> &'a T: PartialEq<T> + Rem<&'a T, Output = T>,
{
if modulus == T::from(1) {
T::from(0)
} else {
let mut result = T::from(1);
let mut base = fbase % modulus;
let mut exp = exponent.clone();
while exp > T::from(0) {
if exp % T::from(2) == T::from(1) {
result = (result * base) % modulus;
}
exp >>= T::from(1);
base = (base * base) % modulus;
}
result
}
}
My understanding is that the trait bound for<'a> &'a T: Rem<&'a T, Output = T> lets me use the modulo operator % on two operands of type &'a T, with a result of type T. However, I get the following error:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
--> src/main.rs:20:30
|
20 | let mut base = fbase % modulus;
| ^
|
note: first, the lifetime cannot outlive the anonymous lifetime #3 defined on the function body at 3:1...
--> src/main.rs:3:1
|
3 | / fn powm<T>(fbase: &T, exponent: &T, modulus: &T) -> T
4 | | where
5 | | T: Mul<T, Output = T>
6 | | + From<u8>
... |
30 | | }
31 | | }
| |_^
note: ...so that reference does not outlive borrowed content
--> src/main.rs:20:32
|
20 | let mut base = fbase % modulus;
| ^^^^^^^
note: but, the lifetime must be valid for the anonymous lifetime #1 defined on the function body at 3:1...
--> src/main.rs:3:1
|
3 | / fn powm<T>(fbase: &T, exponent: &T, modulus: &T) -> T
4 | | where
5 | | T: Mul<T, Output = T>
6 | | + From<u8>
... |
30 | | }
31 | | }
| |_^
note: ...so that types are compatible (expected std::ops::Rem, found std::ops::Rem<&T>)
--> src/main.rs:20:30
|
20 | let mut base = fbase % modulus;
| ^
The code does work if I replace the line in question by
let mut base = fbase.clone() % modulus;
I don't see why I would need to clone in the first place if I can use the modulo operator already to return a "fresh" element of type T. Do I need to modify my trait bounds instead? Why does this go wrong?
When programming, it's very useful to learn how to create a Minimal, Complete, and Verifiable example (MCVE). This allows you to ignore irrelevant details and focus on the core of the problem.
As one example, your entire blob of code can be reduced down to:
use std::ops::Rem;
fn powm<T>(fbase: &T, modulus: &T)
where
for<'a> &'a T: Rem<&'a T, Output = T>,
{
fbase % modulus;
}
fn main() {}
Once you have a MCVE, you can make permutations to it to explore. For example, we can remove the lifetime elision:
fn powm<'a, 'b, T>(fbase: &'a T, modulus: &'b T)
where
for<'x> &'x T: Rem<&'x T, Output = T>,
{
fbase % modulus;
}
Now we start to see something: what is the relation between all three lifetimes? Well, there isn't one, really. What happens if we make one?
If we say that the input references can be unified to the same lifetime, it works:
fn powm<'a, T>(fbase: &'a T, modulus: &'a T)
If we say that 'b outlives 'a, it works:
fn powm<'a, 'b: 'a, T>(fbase: &'a T, modulus: &'b T)
If we say that we can have two different lifetimes in the operator, it works:
for<'x, 'y> &'x T: Rem<&'y T, Output = T>,
What about if we poke at the call site?
If we directly call the Rem::rem method, it works:
Rem::rem(fbase, modulus);
If we dereference and re-reference, it works:
&*fbase % &*modulus;
I don't know exactly why the original doesn't work — conceptually both the input references should be able to be unified to one lifetime. It's possible that there's a piece of inference that either cannot or isn't happening, but I'm not aware of it.
Some further discussion with a Rust compiler developer led to an issue being filed, as this behavior doesn't quite seem right. The issue has now been resolved, and the fix should theoretically be available in Rust 1.23.
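Two of the variants listed above are written out in full below as a sketch, each returning the remainder so the result can be checked. The function names rem_two_lifetimes and rem_direct_call are hypothetical.

```rust
use std::ops::Rem;

// Variant: allow two independent lifetimes in the operator bound.
fn rem_two_lifetimes<'a, 'b, T>(fbase: &'a T, modulus: &'b T) -> T
where
    for<'x, 'y> &'x T: Rem<&'y T, Output = T>,
{
    fbase % modulus
}

// Variant: keep the single-lifetime bound but call the trait method directly.
fn rem_direct_call<T>(fbase: &T, modulus: &T) -> T
where
    for<'x> &'x T: Rem<&'x T, Output = T>,
{
    Rem::rem(fbase, modulus)
}

fn main() {
    let (base, modulus) = (17u64, 5u64);
    println!("{}", rem_two_lifetimes(&base, &modulus));
    println!("{}", rem_direct_call(&base, &modulus));
}
```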
