How to access item in nested hashmap without multiple clones? - rust

There is a TokenBalances struct defined using a nested map. I want to implement a the balance_of method which takes two keys: token, account as input, and returns the balance as output.
use std::collections::*;
type TokenId = u128;
type AccountId = u128;
type AccountBalance = u128;
#[derive(Default, Clone)]
struct TokenBalances {
balances: HashMap<TokenId, HashMap<AccountId, AccountBalance>>,
}
impl TokenBalances {
fn balance_of(&self, token: TokenId, account: AccountId) -> AccountBalance {
self.balances
.get(&token)
.cloned()
.unwrap_or_default()
.get(&account)
.cloned()
.unwrap_or_default()
}
}
fn main() {
println!("{}", TokenBalances::default().balance_of(0, 0)) // 0
}
It uses cloned twice to turn an Option<&T> to Option<T>
I'm aware of to_owned as an alternative to cloned, but in its docs it says
Creates owned data from borrowed data, usually by cloning.
I'm wondering if the clones are really necessary. Is there any idomatic way to rewrite the method without cloning twice? And are clones completely avoidable?

You can use Option::and_then:
self.balances
.get(&token)
.and_then(|map| map.get(&account))
.copied()
.unwrap_or_default()
The and_then call returns an Option<&AccountBalance>. This is then cloned/copied with copied. This is fine as it's just a u128 right now. If it ever becomes a more complex type where copying isn't cheap, make balance_of return &AccountBalance instead. Then you can remove the copied() call and unwrap_or(&0) for example.
Last note: the unwrap_or_default might hint at a code smell. I don't know your context, but it might make more sense to actually return Option<AccountBalance> from the method instead of defaulting to 0.

Your need to clone comes from the fact that you are using unwrap_or_default inbetween. Without the clone, you have an Option<&HashMap> (as per HashMap::get on the outer map) and &HashMap does not implement Default - where should this default reference point to? Luckily, we don't actually need an owned HashMap, the reference to the inner map totally suffices to do a lookup. To chain two functions that may return None, one uses Option::and_then. It is like Option::map, but flattens an Option<Option<T>> into just an Option<T>. In your case, the code would look like this:
impl TokenBalances {
fn balance_of(&self, token: TokenId, account: AccountId) -> AccountBalance {
self.balances
.get(&token)
.and_then(|inner_map| inner_map.get(&account))
.copied()
.unwrap_or_default()
}
}
I also changed the final cloned to copied, which indicates that this clone is cheap.

Related

Is there a way to reference an anonymous type as a struct member?

Consider the following Rust code:
use std::future::Future;
use std::pin::Pin;
fn main() {
let mut v: Vec<_> = Vec::new();
for _ in 1..10 {
v.push(wrap_future(Box::pin(async {})));
}
}
fn wrap_future<T>(a: Pin<Box<dyn Future<Output=T>>>) -> impl Future<Output=T> {
async {
println!("doing stuff before awaiting");
let result=a.await;
println!("doing stuff after awaiting");
result
}
}
As you can see, the futures I'm putting into the Vec don't need to be boxed, since they are all the same type and the compiler can infer what that type is.
I would like to create a struct that has this Vec<...> type as one of its members, so that I could add a line at the end of main():
let thing = MyStruct {myvec: v};
without any additional overhead (i.e. boxing).
Type inference and impl Trait syntax aren't allowed on struct members, and since the future type returned by an async block exists entirely within the compiler and is exclusive to that exact async block, there's no way to reference it by name. It seems to me that what I want to do is impossible. Is it? If so, will it become possible in a future version of Rust?
I am aware that it would be easy to sidestep this problem by simply boxing all the futures in the Vec as I did the argument to wrap_future() but I'd prefer not to do this if I can avoid it.
I am well aware that doing this would mean that there could be only one async block in my entire codebase whose result values could possibly be added to such a Vec, and thus that there could be only one function in my entire codebase that could create values that could possibly be pushed to it. I am okay with this limitation.
Nevermind, I'm stupid. I forgot that structs could have type parameters.
struct MyStruct<F> where F: Future<Output=()> {
myvec: Vec<F>,
}

How can return a reference to an item in a static hashmap inside a mutex?

I am trying to access a static hashmap for reading and writing but I am always getting error:
use std::collections::HashMap;
use std::sync::Mutex;
pub struct ModuleItem {
pub absolute_path: String,
}
lazy_static! {
static ref MODULE_MAP: Mutex<HashMap<i32, ModuleItem>> = Mutex::new(HashMap::new());
}
pub fn insert(identity_hash: i32, module_item: ModuleItem) {
MODULE_MAP
.lock()
.unwrap()
.insert(identity_hash, module_item);
}
pub fn get(identity_hash: i32) -> Option<&'static ModuleItem> {
MODULE_MAP.lock().unwrap().get(&identity_hash).clone()
}
But I am getting an error on the get function cannot return value referencing temporary value
I tried with .cloned(), .clone() or even nothing but I don't manage to get it to work. Can you help me?
I tried with .cloned(), .clone() or even nothing but I don't manage to get it to work. Can you help me?
All Option::clone does is clone the underlying structure, which in this case is an &ModuleItem so it just clones the reference, and you still have a reference, which you can't return because you only have access to the hashmap's contents while you hold the lock (otherwise it could not work).
Option::cloned actually clones the object being held by reference, but doesn't compile here because ModuleItem can't be cloned.
First you have to return a Option<ModuleItem>, you can not return a reference to the map contents since the lock is going to be released at the end of the function, and you can't keep a handle on hashmap contents across mutex boundaries as they could go away at any moment (e.g. an other thread could move them, or even clear the map entirely).
Then copy the ModuleItem, either by deriving Clone on ModuleItem (then calling Option::cloned) or by creating a new ModuleItem "by hand" e.g.
pub fn get(identity_hash: i32) -> Option<ModuleItem> {
MODULE_MAP.lock().unwrap().get(&identity_hash).map(|m|
ModuleItem { absolute_path: m.absolute_path.clone() }
)
}
If you need to get keys out a lot and are worried about performances, you could always store Arc<ModuleItem>. That has something of a cost (as it's a pointer so your string is now behind two pointers) however cloning an Arc is very cheap.
To avoid the double pointer you could make ModuleItem into an unsized type and have it store an str but... that's pretty difficult to work with so I wouldn't recommend it.
The function get cannot use a static lifetime because the data does not live for the entire life of the program (from the Rust book):
As a reference lifetime 'static indicates that the data pointed to by the reference lives for the entire lifetime of the running program. It can still be coerced to a shorter lifetime.
So you have to return either a none-static reference or a copy of the value of the HashMap. Reference is not possible because MODULE_MAP.lock().unwrap() returns a MutexGuard which is a local and therefore a temporary variable that hold the HashMap. And get() of the HashMap returns a reference.
Due to the fact that the temporary MutexGuard will be destroy at the end of the function, the reference returned by get would point to a temporary value.
To fix this, you could make ModuleItem clonable and return a copy of the value:
use std::collections::HashMap;
use std::sync::Mutex;
#[derive(Clone)]
pub struct ModuleItem {
pub absolute_path: String,
}
lazy_static::lazy_static! {
static ref MODULE_MAP: Mutex<HashMap<i32, ModuleItem>> = Mutex::new(HashMap::new());
}
pub fn insert(identity_hash: i32, module_item: ModuleItem) {
MODULE_MAP
.lock()
.unwrap()
.insert(identity_hash, module_item);
}
pub fn get(identity_hash: i32) -> Option<ModuleItem> {
MODULE_MAP.lock().unwrap().get(&identity_hash).cloned()
}

Am I moving or cloning this String

I have an enum defined like this:
#[derive(Clone, Debug)]
pub enum JsonState {
NameReadingState(String),
StringState(String),
}
impl JsonState {
pub fn get_name_read(self) -> String {
if let JsonState::NameReadingState(name) = self {
return name;
} else {
panic!(
"Error: attempted to get name from non name state {:#?}",
self
);
}
}
}
If I were to call get_name_read on an instance of JsonState would the string be moved out of the enum or would it be copied? My understanding is that since I am passing self and not &self I am taking ownership of the instance inside the function and so I should be able to simply move the string out of it.
It is moved.
This is, in my opinion, one of the great advantages of Rust over C++: if you don't see a .clone() anywhere, then you are not cloning! In Rust, there are no implicit deep copies like in C++. If you want to create a copy/clone then you have to do it explicitly by calling a method that clones your instance.
All of this comes with one exception: types that implement Copy. These types use copy semantics instead of move semantics. It should be noted that Copy can only be implemented for types "whose values can be duplicated simply by copying bits", i.e. very simple types. String and any other types that manage heap memory do not implement Copy.

It is possible to have single-threaded zero-cost memoization without mutability in safe Rust?

I'm interested in finding or implementing a Rust data structure that provides a zero-cost way to memoize a single computation with an arbitrary output type T. Specifically I would like a generic type Cache<T> whose internal data takes up no more space than an Option<T>, with the following basic API:
impl<T> Cache<T> {
/// Return a new Cache with no value stored in it yet.
pub fn new() -> Self {
// ...
}
/// If the cache has a value stored in it, return a reference to the
/// stored value. Otherwise, compute `f()`, store its output
/// in the cache, and then return a reference to the stored value.
pub fn call<F: FnOnce() -> T>(&self, f: F) -> &T {
// ...
}
}
The goal here is to be able to share multiple immutable references to a Cache within a single thread, with any holder of such a reference being able to access the value (triggering the computation if it is the first time). Since we're only requiring to be able to share a Cache within a single thread, it is not necessary for it to be Sync.
Here is a way to implement the API safely (or at least I think it's safe) using unsafe under the hood:
use std::cell::UnsafeCell;
pub struct Cache<T> {
value: UnsafeCell<Option<T>>
}
impl<T> Cache<T> {
pub fn new() -> Self {
Cache { value: UnsafeCell::new(None) }
}
pub fn call<F: FnOnce() -> T>(&self, f: F) -> &T {
let ptr = self.value.get();
unsafe {
if (*ptr).is_none() {
let t = f();
// Since `f` potentially could have invoked `call` on this
// same cache, to be safe we must check again that *ptr
// is still None, before setting the value.
if (*ptr).is_none() {
*ptr = Some(t);
}
}
(*ptr).as_ref().unwrap()
}
}
}
Is it possible to implement such a type in safe Rust (i.e., not writing our own unsafe code, only indirectly relying on unsafe code in the standard library)?
Clearly, the call method requires mutating self, which means that Cache must use some form of interior mutability. However, it does not seem possible to do with a Cell, because Cell provides no way to retrieve a reference to the enclosed value, as is required by the desired API of call above. And this is for a good reason, as it would be unsound for Cell to provide such a reference, because it would have no way to ensure that the referenced value is not mutated over the lifetime of the reference. On the other hand, for the Cache type, after call is called once, the API above does not provide any way for the stored value to be mutated again, so it is safe for it to hand out a reference with a lifetime that can be a long as that of the Cache itself.
If Cell can't work, I'm curious if the Rust standard library might provide some other safe building blocks for interior mutability which could be used to implement this Cache.
Neither RefCell nor Mutex fulfill the goals here:
they are not zero-cost: they involve storing more data than that of an Option<T> and add unnecessary runtime checks.
they do not seem to provide any way to return a real reference with the lifetime that we want -- instead we could only return a Ref or MutexGuard which is not the same thing.
Using only an Option would not provide the same functionality: if we share immutable references to the Cache, any holder of such a reference can invoke call and obtain the desired value (and mutating the Cache in the process so that future calls will retrieve the same value); whereas, sharing immutable references to an Option, it would not be possible to mutate the Option, so it could not work.

Access nested structures without moving

I've got these structs:
#[derive(Debug, RustcDecodable)]
struct Config {
ssl: Option<SslConfig>,
}
#[derive(Debug, RustcDecodable)]
struct SslConfig {
key: Option<String>,
cert: Option<String>,
}
They get filled from a toml file. This works perfectly fine. Since I got an Option<T> in it I either have to call unwrap() or do a match.
But if I want to do the following:
let cfg: Config = read_config(); // Reads a File, parses it and returns the Config-Struct
let keypath = cfg.ssl.unwrap().key.unwrap();
let certpath = cfg.ssl.unwrap().cert.unwrap();
It won't work because cfg.ssl gets moved to keypath. But why does it get moved? I call unwrap() on ssl to get the key (and unwrap() it to). So the result of key.unwrap() should get moved?
Or am I missing a point? Whats the best way to make these structs accessible like this (or in a other neat way)? I tried to implement #[derive(Debug, RustcDecodable, Copy, Clone)] but this won't work because I have to implement Copy to String as well. Then I have to implement Copy to Vec<u8> and so on. There must be a more convenient solution?
What is the definition of Option::unwrap? From the documentation:
fn unwrap(self) -> T
it consumes its input (cfg.ssl here).
This is not what you want, you instead want to go from Option<T> to &T, which will start by consuming &self (by reference, not value)... or you want to clone the Option before calling unwrap.
Cloning is rarely the solution... the alternative here is as_ref:
fn as_ref(&self) -> Option<&T>
And therefore you can write:
let keypath /*: &String*/ = cfg.ssl.as_ref().unwrap().key.as_ref().unwrap();
^~~~~~~ ^~~~~~~~
So the result of key.unwrap() should get moved?
Yes, but not only that. The key insight here is, that a variable get's moved into of unwrap(). Let's look at the function signature:
fn unwrap(self) -> T { ... }
It takes self, so the object is moved into the function. But this applies to ssl.unwrap(), too!
So when writing:
cfg.ssl.unwrap().key.unwrap();
You first move cfg.ssl into unwrap(), then you access one field of the result and move that field into unwrap() again. So yes, cfg.ssl is moved. In order to solve this, you can save the temporary result of the first unwrap() call, like so:
let ssl = cfg.ssl.unwrap();
let keypath = ssl.key.unwrap();
let certpath = ssl.cert.unwrap();
Or you can look at the as_ref() method, if you don't want to move (which is probably the case).

Resources