How to store state within a module between function calls? - rust

I have the following functions in a module:
pub fn square(s: u32) -> u64 {
    if s < 1 || s > 64 {
        panic!("Square must be between 1 and 64")
    }
    total_for_square(s) - total_for_square(s - 1)
}

fn total_for_square(s: u32) -> u64 {
    if s == 64 {
        return u64::max_value();
    }
    2u64.pow(s) - 1
}

pub fn total() -> u64 {
    u64::max_value()
}
This works fine when calling individual functions directly. However, I want to optimize it by caching the values of total_for_square to speed up future lookups (storing them in a HashMap). How should I approach where to store the HashMap so it's available between calls? I know I could refactor to put all of this in a struct, but in this case, I cannot change the API.
In other, higher level languages I have used, I would just have a variable in the same scope as the functions. However, it's not clear if that is possible in Rust on the module level.

In other, higher level languages I have used, I would just have a variable in the same scope as the functions.
You can use something similar in Rust but it's syntactically more complicated: you need to create a global for your cache using lazy_static or once_cell for instance.
The cache will need to be thread-safe though, so either a regular map sitting behind a Mutex or RwLock, or some sort of concurrent map.
Although given you only have 64 inputs, you could just precompute the entire thing and return precomputed values directly.
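A minimal sketch of the global-cache approach, using std::sync::LazyLock (stable since Rust 1.80; on older toolchains, once_cell::sync::Lazy or lazy_static work the same way):

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Global, lazily initialized, thread-safe cache shared between calls.
static CACHE: LazyLock<Mutex<HashMap<u32, u64>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

fn total_for_square(s: u32) -> u64 {
    // Fast path: return the cached value if we have one.
    if let Some(&v) = CACHE.lock().unwrap().get(&s) {
        return v;
    }
    // Slow path: compute, store, and return.
    let v = if s == 64 { u64::MAX } else { 2u64.pow(s) - 1 };
    CACHE.lock().unwrap().insert(s, v);
    v
}

fn main() {
    assert_eq!(total_for_square(3), 7);
    assert_eq!(total_for_square(64), u64::MAX);
    assert_eq!(total_for_square(3), 7); // second call hits the cache
}
```

The Mutex is what makes the global sound: Rust requires shared mutable statics to be thread-safe, which is the "syntactically more complicated" part compared to a module-level variable in higher-level languages.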

The cached crate comes in handy:
use cached::proc_macro::cached;

#[cached]
fn total_for_square(s: u32) -> u64 {
    if s == 64 {
        return u64::MAX;
    }
    2u64.pow(s) - 1
}
Indeed, you only need to write two lines, and the crate will take care of everything. Internally, the cached values are stored in a hash map.
(Note that u64::max_value() has been superseded by u64::MAX)
Side note: in this specific case, the simplest solution is probably to modify square so that it computes its result directly as 1u64 << (s - 1) (since total_for_square(s) - total_for_square(s - 1) is just 2^(s-1)), which makes the cache unnecessary.

Related

Stable alternative to collect_into - or - how do I collect a sized queue?

I have some pipeline which manipulates an iterator to a very big data set, and at the end, I wish to just keep the N top values.
I wrote a wrapper around a Vec - a struct which holds the Vec and its max size, and implements insertion such that the data in the vec is always ordered, and values which are too small would get ignored (could have also used a BTreeSet, if N is large enough).
Anyway, I thought I'd use it as follows:
let mut q = SizedQueue(5);
<my iterator pipeline>.collect_into(&mut q);
but I was disappointed to discover that collect_into is unstable and could potentially be dropped because it might be deemed unnecessary; the reason given is that the same thing can be done differently.
My question is - how could it be done differently (other than me just implementing a Trait for Iterator with this functionality myself)?
collect_into() is just a convenient shortcut to calling Extend::extend():
let mut q = SizedQueue(5);
q.extend(<my iterator pipeline>);
Of course, you need to implement Extend for your type. A simple implementation may look like:
impl<T: PartialOrd> Extend<T> for SizedQueue<T> {
    fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
        for item in iter {
            self.push(item);
        }
    }
}
But if this is only for one use site where you call extend(), you may as well just inline it and loop and push().
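For completeness, here is a sketch of what the question's SizedQueue might look like with such an Extend impl. The struct layout and push logic are assumptions based on the question's description (a Vec kept ordered, with a maximum size, ignoring values that are too small):

```rust
// Hypothetical reconstruction of the question's SizedQueue: keeps the N largest values.
struct SizedQueue<T> {
    items: Vec<T>, // always sorted in descending order
    max_size: usize,
}

impl<T: PartialOrd> SizedQueue<T> {
    fn new(max_size: usize) -> Self {
        Self { items: Vec::new(), max_size }
    }

    // Insert in sorted position; values that fall off the end are dropped.
    fn push(&mut self, item: T) {
        let pos = self
            .items
            .iter()
            .position(|x| item > *x)
            .unwrap_or(self.items.len());
        if pos < self.max_size {
            self.items.insert(pos, item);
            self.items.truncate(self.max_size);
        }
    }
}

impl<T: PartialOrd> Extend<T> for SizedQueue<T> {
    fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
        for item in iter {
            self.push(item);
        }
    }
}

fn main() {
    let mut q = SizedQueue::new(3);
    q.extend([5, 1, 9, 3, 7]);
    assert_eq!(q.items, vec![9, 7, 5]); // top 3 of the input, largest first
}
```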

How to dynamically signal to Rust compiler that a given variable is non-zero?

I'd like to try to eliminate bounds checking in code generated by Rust. I have variables that are rarely zero, and my code paths ensure they do not run into trouble. But because they can be zero, I cannot use NonZeroU64. When I am sure they are non-zero, how can I signal this to the compiler?
For example, if I have the following function, I know it will be non-zero. Can I tell the compiler this or do I have to have the unnecessary check?
pub fn f(n: u64) -> u32 {
    n.trailing_zeros()
}
I can wrap the number in NonZeroU64 when I am sure, but then I've already incurred the check, which defeats the purpose ...
Redundant checks within a single function body can usually be optimized out. So you just need to convert the number to NonZeroU64 before calling trailing_zeros(), and rely on the compiler to optimize the checks away.
use std::num::NonZeroU64;

pub fn g(n: NonZeroU64) -> u32 {
    n.trailing_zeros()
}

pub fn other_fun(n: u64) -> u32 {
    if n != 0 {
        println!("Do something with non-zero!");
        let n = NonZeroU64::new(n).unwrap();
        g(n)
    } else {
        42
    }
}
In the above code, the if n != 0 ensures n cannot be zero within the block, and the compiler is smart enough to remove the unwrap call, making NonZeroU64::new(n).unwrap() a zero-cost operation. You can check the generated assembly to verify this.
core::intrinsics::assume
Informs the optimizer that a condition is always true. If the
condition is false, the behavior is undefined.
No code is generated for this intrinsic, but the optimizer will try to
preserve it (and its condition) between passes, which may interfere
with optimization of surrounding code and reduce performance. It
should not be used if the invariant can be discovered by the optimizer
on its own, or if it does not enable any significant optimizations.
This intrinsic does not have a stable counterpart.
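Since that documentation was written, a stable counterpart has landed: std::hint::assert_unchecked (stable as of Rust 1.81). A sketch of using it for the question's function:

```rust
use std::hint;

pub fn f(n: u64) -> u32 {
    // SAFETY: the caller must guarantee n != 0; passing zero here is
    // undefined behavior, exactly as with core::intrinsics::assume.
    unsafe { hint::assert_unchecked(n != 0) };
    n.trailing_zeros()
}

fn main() {
    assert_eq!(f(8), 3);
    assert_eq!(f(1), 0);
}
```

The same caveat from the intrinsic's documentation applies: only reach for this when the optimizer cannot discover the invariant on its own, and verify with the generated assembly that it actually helps.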

How Can I Hash By A Raw Pointer?

I want to create a function that provides a two step write and commit, like so:
// Omitting locking for brevity
struct States {
    commited_state: u64,
    // By reference is just a placeholder - I don't know how to do this
    pending_states: HashSet<i64>
}

impl States {
    fn read_dirty(&self) -> u64 {
        // Sum committed state and all non committed states
        self.commited_state +
            pending_states.into_iter().fold(sum_all_values).unwrap_or(0)
    }
    fn read_committed(&self) -> u64 {
        self.commited_state
    }
}

let state_container = States::default();

async fn update_state(state_container: States, new_state: i64) -> Future {
    // This is just pseudo code missing locking and such
    // I'd like to add a reference to new_state
    state_container.pending_states.insert(new_state);
    async move {
        // I would like to defer the commit
        // I add the state to the commited state
        state_container.commited_state += new_state;
        // Then remove it *by reference* from the pending states
        state_container.pending_states.remove(new_state)
    }
}
I'd like to be in a situation where I can call it like so
let commit_handler = update_state(state_container, 3).await;
// Do some external transactional stuff
third_party_transactional_service(...)?
// Commit if the above line does not error
commit_handler.await;
The problem I have is that HashMap and HashSet hash entries based on their value, not their address, so I can't remove them by reference.
I appreciate this is a bit of a long question, but I'm just trying to give some more context about what I'm trying to do. I know that in a typical database you'd generally have an atomic counter to generate the transaction ID, but that feels a bit overkill when the pointer reference would be enough.
However, I don't want to get the pointer value using unsafe, because reaching for unsafe seems a bit off for something relatively simple.
Values in Rust don't have an identity like they do in other languages. You need to ascribe them an identity somehow. You've hit on two ways to do this in your question: an ID contained within the value, or the address of the value as a pointer.
Option 1: An ID contained in the value
It's trivial to have a usize ID with a static AtomicUsize (atomics have interior mutability).
use std::sync::atomic::{AtomicUsize, Ordering};

// No impl of Clone/Copy as we want these IDs to be unique.
#[derive(Debug, Hash, PartialEq, Eq)]
#[repr(transparent)]
pub struct OpaqueIdentifier(usize);

impl OpaqueIdentifier {
    pub fn new() -> Self {
        static COUNTER: AtomicUsize = AtomicUsize::new(0);
        Self(COUNTER.fetch_add(1, Ordering::Relaxed))
    }

    pub fn id(&self) -> usize {
        self.0
    }
}
Now your map key becomes usize, and you're done.
Having this be a separate type that doesn't implement Copy or Clone allows you to have a concept of an "owned unique ID" and then every type with one of these IDs is forced not to be Copy, and a Clone impl would require obtaining a new ID.
(You can use a different integer type than usize. I chose it semi-arbitrarily.)
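To connect this back to the question, a sketch of using such IDs as set keys; PendingState is an illustrative name, not from the question:

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};

#[derive(Debug, Hash, PartialEq, Eq)]
#[repr(transparent)]
struct OpaqueIdentifier(usize);

impl OpaqueIdentifier {
    fn new() -> Self {
        static COUNTER: AtomicUsize = AtomicUsize::new(0);
        Self(COUNTER.fetch_add(1, Ordering::Relaxed))
    }
    fn id(&self) -> usize {
        self.0
    }
}

// A pending state carries its own identity, independent of its value.
struct PendingState {
    id: OpaqueIdentifier,
    value: i64,
}

fn main() {
    let mut pending: HashSet<usize> = HashSet::new();
    let a = PendingState { id: OpaqueIdentifier::new(), value: 3 };
    let b = PendingState { id: OpaqueIdentifier::new(), value: 3 }; // same value, distinct identity
    pending.insert(a.id.id());
    pending.insert(b.id.id());
    assert_eq!(pending.len(), 2); // both tracked despite equal values
    pending.remove(&a.id.id()); // remove exactly one of them by identity
    assert_eq!(pending.len(), 1);
    let _ = (a.value, b.value);
}
```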
Option 2: A pointer to the value
This is more challenging in Rust since values in Rust are movable by default. In order for this approach to be viable, you have to remove this capability by pinning.
To make this work, both of the following must be true:
You pin the value you're using to provide identity, and
The pinned value is !Unpin (otherwise pinning still allows moves!), which can be forced by adding a PhantomPinned member to the value's type.
Note that the pin contract is only upheld if the object remains pinned for its entire lifetime. To enforce this, your factory for such objects should only dispense pinned boxes.
This could complicate your API as you cannot obtain a mutable reference to a pinned value without unsafe. The pin documentation has examples of how to do this properly.
Assuming that you have done all of this, you can then use *const T as the key in your map (where T is the pinned type). Note that conversion to a pointer is safe -- it's conversion back to a reference that isn't. So you can just use some_pin_box.get_ref() as *const _ to obtain the pointer you'll use for lookup.
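A minimal sketch of this pointer-as-key approach; the Anchored type is illustrative:

```rust
use std::collections::HashSet;
use std::marker::PhantomPinned;
use std::pin::Pin;

// PhantomPinned makes this type !Unpin, so pinning actually forbids moves.
struct Anchored {
    value: i64,
    _pin: PhantomPinned,
}

fn main() {
    // Only dispense pinned boxes, so the address is stable for the value's lifetime.
    let a: Pin<Box<Anchored>> = Box::pin(Anchored { value: 3, _pin: PhantomPinned });
    let b: Pin<Box<Anchored>> = Box::pin(Anchored { value: 3, _pin: PhantomPinned });

    let mut pending: HashSet<*const Anchored> = HashSet::new();
    // Converting a reference to a raw pointer is safe; only the reverse needs unsafe.
    pending.insert(a.as_ref().get_ref() as *const _);
    pending.insert(b.as_ref().get_ref() as *const _);
    assert_eq!(pending.len(), 2); // equal values, distinct identities

    pending.remove(&(a.as_ref().get_ref() as *const _));
    assert_eq!(pending.len(), 1);
    let _ = (a.as_ref().get_ref().value, b.as_ref().get_ref().value);
}
```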
The pinned box approach comes with pretty significant drawbacks:
All values being used to provide identity have to be allocated on the heap (unless using local pinning, which is unlikely to be ergonomic -- the pin! macro making this simpler is experimental).
The implementation of the type providing identity has to accept self as Pin<&Self> or Pin<&mut Self>, requiring unsafe code to mutate the contents.
In my opinion, it's not even a good semantic fit for the problem. "Location in memory" and "identity" are different things, and it's only kind of by accident that the former can sometimes be used to implement the latter. It's a bit silly that moving a value in memory would change its identity, no?
I'd just go with adding an ID to the value. This is a substantially more obvious pattern, and it has no serious drawbacks.

Refactoring out `clone` when Copy trait is not implemented?

Is there a way to get rid of clone(), given the restrictions I've noted in the comments? I would really like to know if it's possible to use borrowing in this case, where modifying the third-party function signature is not possible.
// We should keep the "data" hidden from the consumer
mod le_library {
    pub struct Foobar {
        data: Vec<i32> // Something that doesn't implement Copy
    }

    impl Foobar {
        pub fn new() -> Foobar {
            Foobar {
                data: vec![1, 2, 3],
            }
        }

        pub fn foo(&self) -> String {
            let i = third_party(self.data.clone()); // Refactor out clone?
            format!("{}{}", "foo!", i)
        }
    }

    // Can't change the signature, suppose this comes from a crate
    pub fn third_party(data: Vec<i32>) -> i32 {
        data[0]
    }
}
use le_library::Foobar;
fn main() {
    let foobar = Foobar::new();
    let foo = foobar.foo();
    let foo2 = foobar.foo();
    println!("{}", foo);
    println!("{}", foo2);
}
As long as your foo() method accepts &self, it is not possible, because the
pub fn third_party(data: Vec<i32>) -> i32
signature is quite unambiguous: regardless of what this third_party function does, its API states that it needs its own instance of Vec, by value. This precludes borrowing of any form, and because foo() accepts self by reference, you can't really do anything except clone.
Also, supposedly this third_party is written without any weird unsafe hacks, so it is quite safe to assume that the Vec which is passed into it is eventually dropped and deallocated. Therefore, unsafely creating a copy of the original Vec without cloning it (by copying internal pointers) is out of question - you'll definitely get a use-after-free if you do it.
While your question does not state it, the fact that you want to preserve the original value of data is kind of a natural assumption. If this assumption can be relaxed, and you're actually okay with giving the data instance out and e.g. replacing it with an empty vector internally, then there are several things you can potentially do:
Switch foo(&self) to foo(&mut self), then you can quite easily extract data and replace it with an empty vector.
Use Cell or RefCell to store the data. This way, you can continue to use foo(&self), at the cost of some runtime checks when you extract the value out of a cell and replace it with some default value.
Both these approaches, however, will result in you losing the original Vec. With the given third-party API there is no way around that.
If you still can somehow influence this external API, then the best solution would be to change it to accept &[i32], which can easily be obtained from Vec<i32> with borrowing.
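A sketch of the RefCell option above. Note that after the first call the data really is gone (replaced by an empty Vec), so a second call to foo() would index an empty vector and panic:

```rust
use std::cell::RefCell;

pub struct Foobar {
    data: RefCell<Vec<i32>>,
}

impl Foobar {
    pub fn new() -> Foobar {
        Foobar {
            data: RefCell::new(vec![1, 2, 3]),
        }
    }

    pub fn foo(&self) -> String {
        // take() moves the Vec out and leaves an empty one behind:
        // no clone, but the original data is lost after this call.
        let i = third_party(self.data.take());
        format!("{}{}", "foo!", i)
    }
}

pub fn third_party(data: Vec<i32>) -> i32 {
    data[0]
}

fn main() {
    let foobar = Foobar::new();
    assert_eq!(foobar.foo(), "foo!1");
    assert!(foobar.data.borrow().is_empty()); // the data is gone now
}
```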
No, you can't get rid of the call to clone here.
The problem here is with the third-party library. As the function third_party is written now, it's true that it could be using an &Vec<i32>; it doesn't require ownership, since it's just moving out a value that's Copy. However, since the implementation is outside of your control, there's nothing preventing the person maintaining the function from changing it to take advantage of owning the Vec. It's possible that whatever it is doing would be easier or require less memory if it were allowed to overwrite the provided memory, and the function writer is leaving the door open to do so in the future. If that's not the case, it might be worth suggesting a change to the third-party function's signature and relying on clone in the meantime.

Does partial application in Rust have overhead?

I like using partial application, because it permits (among other things) to split a complicated function call, that is more readable.
An example of partial application:
fn add(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    let add7 = |x| add(7, x);
    println!("{}", add7(35));
}
Is there overhead to this practice?
Here is the kind of thing I like to do (from a real code):
fn foo(n: u32, things: Vec<Things>) {
    let create_new_multiplier = |thing| ThingMultiplier::new(thing, n); // ThingMultiplier is an Iterator
    let new_things = things.clone().into_iter().flat_map(create_new_multiplier);
    things.extend(new_things);
}
This is purely for readability; I do not like to nest things too deeply.
There should not be a performance difference between defining the closure before it's used and defining and using it directly. There is a type-system difference, though: the compiler doesn't fully know how to infer types in a closure that isn't immediately called.
In code:
let create_new_multiplier = |thing| ThingMultiplier::new(thing, n);
things.clone().into_iter().flat_map(create_new_multiplier)
will be the exact same as
things.clone().into_iter().flat_map(|thing| {
    ThingMultiplier::new(thing, n)
})
In general, there should not be a performance cost for using closures. This is what Rust means by "zero-cost abstraction": the programmer could not have written it better themselves.
The compiler converts a closure into implementations of the Fn* traits on an anonymous struct. At that point, all the normal compiler optimizations kick in. Because of techniques like monomorphization, it may even be faster. This does mean that you need to do normal profiling to see if they are a bottleneck.
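As a rough mental model of that desugaring (the real version implements the Fn traits, which can't be hand-written on stable Rust, so an ordinary method stands in here), |x| add(7, x) becomes something like:

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// Roughly what the compiler generates for `|x| add(7, x)`:
// an anonymous struct holding the captured value, plus a call method.
struct Add7 {
    captured: i32,
}

impl Add7 {
    fn call(&self, x: i32) -> i32 {
        add(self.captured, x)
    }
}

fn main() {
    let add7 = Add7 { captured: 7 };
    assert_eq!(add7.call(35), 42);
}
```

Since Add7 is an ordinary struct, inlining, constant propagation, and the rest of the optimizer apply to it just like any other code.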
In your particular example, yes, extend can get inlined as a loop, containing another loop for the flat_map which in turn just puts ThingMultiplier instances into the same stack slots holding n and thing.
But you're barking up the wrong efficiency tree here. Instead of wondering whether an allocation of a small struct holding two fields gets optimized away you should rather wonder how efficient that clone is, especially for large inputs.