Implement Borrow on something behind a RefCell? - rust

I have the structs Value and RefValue in my project. RefValue is a reference-counted, dynamically borrowable Value. Now Value may contain a HashMap of RefValue, where both the key and the value are RefValues.
type ValueMap = HashMap<RefValue, RefValue>;
#[derive(Debug, PartialEq, Eq)]
enum Value {
Integer(i64),
String(String),
Map(ValueMap),
}
#[derive(Debug, PartialEq, Eq)]
struct RefValue {
value: Rc<RefCell<Value>>,
}
I've implemented Hash on RefValue on my own, and some From-traits separately in this playground.
What I want to achieve is something like this main program:
fn main() {
// Simple values
let x = RefValue::from(42);
let y = RefValue::from("Hello");
// Make a map from these values
let mut z = ValueMap::new();
z.insert(RefValue::from("x"), x);
z.insert(RefValue::from("y"), y);
// Make a value from the map
let z = RefValue::from(z);
println!("z = {:?}", z);
// Try to access "x"
if let Value::Map(m) = &*z.borrow() {
println!("m[x] = {:?}", m["x"]); // <- here I want to access by &str
};
}
Unfortunately I'm getting strange results, as you can find in the playground comments. I'm also quite unsure if there's not a better implementation of the entire problem, as the RefCell cannot return a borrowed value of its contained element.
Can anybody give me a hint?

When you implement Borrow<T>, your Hash implementation must return the same hash value as T's whenever the underlying values are equivalent. That is, x.hash() must be equal to x.borrow().hash(). HashMap relies on this property when you index into it: it requires Key: Borrow<Idx> and then uses this rule to ensure it can find the value.
Your impl Borrow<str> for RefValue does not follow this rule. RefValue::hash() for RefValue::String calls write_u8(2) before hashing the string. Since you broke the contract, the hashmap is allowed to do anything (excluding undefined behavior), like panicking, aborting the process, or not finding your key, which is what it does in this case.
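The contract can be checked directly. The following is a minimal sketch, not the code from the question: hash_of and hash_with_tag are hypothetical helpers, and hash_with_tag only imitates the write_u8(2) prefix described above. String and str hash identically, which is what lets a HashMap<String, _> be indexed by &str, while the tagged variant produces a different hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical helper: hash any value with a fresh DefaultHasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

// Hypothetical helper: imitates a Hash impl that writes a
// discriminant byte before hashing the string.
fn hash_with_tag(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write_u8(2);
    s.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let owned = String::from("x");
    // String: Borrow<str> upholds the contract, so the hashes agree.
    assert_eq!(hash_of(&owned), hash_of("x"));
    // The extra byte feeds different input to the hasher, so a lookup
    // by plain &str would probe for a different hash.
    assert_ne!(hash_with_tag("x"), hash_of("x"));
}
```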
To fix that, you should just not hash the discriminant (removed it from the others too, for consistency):
impl Hash for RefValue {
fn hash<H: Hasher>(&self, state: &mut H) {
match &*self.borrow() {
Value::Integer(i) => {
i.hash(state);
}
Value::String(s) => {
s.hash(state);
}
Value::Map(m) => {
(m as *const ValueMap as usize).hash(state); // Object address
}
}
}
}
Now it panics in your Borrow implementation, like you expected (playground).
But you should not implement Borrow, since implementing it means your value is a reflection of the borrowed value. RefValue is by no means str. It can be integers, or maps, too. Thus, you should not implement Borrow for any of those. You could implement Borrow<Value>, but this is impossible because you use RefCell and thus need to return Ref but Borrow mandates returning a reference. You're out of luck. Your only option is to index with RefValues.
Lastly, you should avoid interior mutability in keys. Once you change a key (and it is easy to do so by mistake), its hash and equality change, and you have broken your contract with the map once again.
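For illustration, here is a sketch of indexing with a RefValue instead of a &str. It assumes From<i64> and From<&str> impls similar to the ones the question mentions (the exact playground code is not shown here, so these are stand-ins), and lookup_x is a hypothetical helper.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

type ValueMap = HashMap<RefValue, RefValue>;

#[derive(Debug, PartialEq, Eq)]
enum Value {
    Integer(i64),
    String(String),
    Map(ValueMap),
}

#[derive(Debug, PartialEq, Eq)]
struct RefValue {
    value: Rc<RefCell<Value>>,
}

impl Hash for RefValue {
    fn hash<H: Hasher>(&self, state: &mut H) {
        match &*self.value.borrow() {
            Value::Integer(i) => i.hash(state),
            Value::String(s) => s.hash(state),
            // Object address, as in the answer above.
            Value::Map(m) => (m as *const ValueMap as usize).hash(state),
        }
    }
}

// Assumed conversions, standing in for the question's From impls.
impl From<i64> for RefValue {
    fn from(i: i64) -> Self {
        RefValue { value: Rc::new(RefCell::new(Value::Integer(i))) }
    }
}

impl From<&str> for RefValue {
    fn from(s: &str) -> Self {
        RefValue { value: Rc::new(RefCell::new(Value::String(s.to_string()))) }
    }
}

// Hypothetical helper: look a key up by building a temporary RefValue.
fn lookup_x() -> Option<i64> {
    let mut map = ValueMap::new();
    map.insert(RefValue::from("x"), RefValue::from(42_i64));
    let found = map.get(&RefValue::from("x"))?;
    // Bind the result first so the RefCell guard is dropped before `map`.
    let result = match &*found.value.borrow() {
        Value::Integer(i) => Some(*i),
        _ => None,
    };
    result
}

fn main() {
    assert_eq!(lookup_x(), Some(42));
}
```

The lookup allocates a temporary key, which is the price of not being able to implement Borrow here.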

Related

Hashmap with enum values

Rust Newbie.
I'd like to create a hashmap that contains values of different types. I got as far as shown, and I can store the values, but I cannot cast them back to the original type when reading them. I'm sure I'm missing something basic, but I'm still struggling with the enum concept in Rust.
#[derive(Debug)]
struct My1 { value: i32 }
#[derive(Debug)]
struct My2 { value: String }
#[derive(Debug)]
enum MyValueType {
MyOne(Vec<My1>),
MyTwo(Vec<My2>)
}
fn main() {
use std::collections::HashMap;
let mut map: HashMap<&str, MyValueType> = HashMap::new();
let a1 = vec!(My1 { value: 100 });
let a2 = vec!(My2 { value: "onehundred".into() });
map.insert("one", MyValueType::MyOne(a1));
map.insert("two", MyValueType::MyTwo(a2));
//let b: &Vec<My1> = map.get("one").unwrap().into(); // err
for (key, value) in &map {
println!("{}: {:?}", key, value);
}
let k1: Vec<My1> = *map.get("one").unwrap().into(); // err: type annotation needed
let k2: Vec<My2> = *map.get("two").unwrap().into(); // err: type annotation needed
}
How should I implement this so I can cast the value of type MyValueType back to Vec<My1> or Vec<My2> as the case may be? Or am I fundamentally wrong on how I'm setting this up in general?
Starting with:
let v = map.get("one");
The hash map will return an option holding a reference to the enum (Option<&MyValueType>). After unwrapping the option, you’re left with a reference to the enum.
let v = map.get("one").unwrap(); // v is &MyValueType (specifically MyOne)
This enum has one of the possible values of MyOne or MyTwo, but we don’t yet know which (more specifically — the compiler doesn’t know, even if we can tell just by looking that it’s MyOne). If you want to reach in to MyOne or MyTwo and grab one of the Vecs that are stored there, you need to match against the enum. For example:
match map.get("one").unwrap() {
    MyValueType::MyOne(vector) => {
        // do something with `vector`, which is a &Vec<My1>
    },
    MyValueType::MyTwo(_) => panic!("unexpected"),
}
This intentionally forces you to check that the enum is the value you are expecting before you are able to access the data within. Typically you won’t know the exact type of the enum when you are writing code (otherwise why use an enum!) which is why this might seem a bit verbose.
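If the match feels too verbose at every call site, it can be wrapped once in an accessor. The sketch below reuses the types from the question; as_my_one and build_map are hypothetical names introduced for illustration only.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct My1 { value: i32 }

#[derive(Debug)]
struct My2 { value: String }

#[derive(Debug)]
enum MyValueType {
    MyOne(Vec<My1>),
    MyTwo(Vec<My2>),
}

impl MyValueType {
    // Hypothetical accessor: returns the inner slice only if this
    // value is the MyOne variant.
    fn as_my_one(&self) -> Option<&[My1]> {
        match self {
            MyValueType::MyOne(v) => Some(v.as_slice()),
            _ => None,
        }
    }
}

// Hypothetical helper that builds the map from the question.
fn build_map() -> HashMap<&'static str, MyValueType> {
    let mut map = HashMap::new();
    map.insert("one", MyValueType::MyOne(vec![My1 { value: 100 }]));
    map.insert("two", MyValueType::MyTwo(vec![My2 { value: "onehundred".into() }]));
    map
}

fn main() {
    let map = build_map();
    // Both "is the key present?" and "is it the right variant?" are
    // expressed as Option, so no unwrap or panic is needed.
    if let Some(items) = map.get("one").and_then(|v| v.as_my_one()) {
        println!("first value: {}", items[0].value);
    }
}
```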

Rust: how to assign `iter().map()` or `iter().enumerate()` to same variable

struct A {...whatever...};
const MY_CONST_USIZE:usize = 127;
// somewhere in function
// vec1_of_A:Vec<A> vec2_of_A_refs:Vec<&A> have values from different data sources and have different inside_item types
let my_iterator;
if my_rand_condition() { // my_rand_condition is random and compiles for sake of simplicity
my_iterator = vec1_of_A.iter().map(|x| (MY_CONST_USIZE, &x)); // Map<Iter<Vec<A>>>
} else {
my_iterator = vec2_of_A_refs.iter().enumerate(); // Enumerate<Iter<Vec<&A>>>
}
How can I make this code compile?
In the end (depending on the condition) I would like to have a single iterator that can be built from either input, and I don't know how to fit these Map and Enumerate types into a single variable without calling collect() to materialize the iterator as a Vec.
Reading material is welcome.
In the vec1_of_A case, first you need to replace &x with x in your map function. The code you have will never compile because the mapping closure tries to return a reference to one of its parameters, which is never allowed in Rust. To make the types match up, you need to dereference the &&A in the vec2_of_A_refs case to &A instead of trying to add a reference to the other.
Also, -127 is an invalid value for usize, so you need to pick a valid value, or use a different type than usize.
Having fixed those, now you need some type of dynamic dispatch. The simplest approach would be boxing into a Box<dyn Iterator>.
Here is a complete example:
#![allow(unused)]
#![allow(non_snake_case)]
struct A;
// Fixed to be a valid usize.
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator: Box<dyn Iterator<Item=(usize, &A)>>;
if my_rand_condition() {
// Fixed to return x instead of &x
my_iterator = Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
// Added map to deref &&A to &A to make the types match
my_iterator = Box::new(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Instead of a boxed trait object, you could also use the Either type from the either crate. This is an enum with Left and Right variants, but the Either type itself implements Iterator if both the left and right types also do, with the same type for the Item associated type. For example:
#![allow(unused)]
#![allow(non_snake_case)]
use either::Either;
struct A;
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator;
if my_rand_condition() {
my_iterator = Either::Left(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
my_iterator = Either::Right(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Why would you choose one approach over the other?
Pros of the Either approach:
It does not require a heap allocation to store the iterator.
It implements dynamic dispatch via match which is likely (but not guaranteed) to be faster than dynamic dispatch via vtable lookup.
Pros of the boxed trait object approach:
It does not depend on any external crates.
It scales easily to many different types of iterators; the Either approach quickly becomes unwieldy with more than two types.
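The examples above leave my_rand_condition as todo!(), so they compile but do not run. For a version that runs end to end, here is a sketch of the boxed approach wrapped in a function. The pairs function, its use_first flag, and the A(u32) payload are assumptions made for illustration, not part of the question.

```rust
#[derive(Debug, PartialEq)]
struct A(u32);

const MY_CONST_USIZE: usize = usize::MAX;

// Hypothetical function: returns one of two differently typed
// iterators behind a single boxed trait object.
fn pairs<'a>(
    use_first: bool,
    first: &'a [A],
    second: &'a [&'a A],
) -> Box<dyn Iterator<Item = (usize, &'a A)> + 'a> {
    if use_first {
        // Iter<A> yields &A; pair each item with the constant.
        Box::new(first.iter().map(|x| (MY_CONST_USIZE, x)))
    } else {
        // Iter<&A> yields &&A; copied() turns that into &A.
        Box::new(second.iter().copied().enumerate())
    }
}

fn main() {
    let owned = vec![A(1), A(2)];
    let other = A(3);
    let refs = vec![&other];

    for (index, item) in pairs(true, &owned, &refs) {
        println!("{} -> {:?}", index, item);
    }
    for (index, item) in pairs(false, &owned, &refs) {
        println!("{} -> {:?}", index, item);
    }
}
```

Only references are copied in either branch, so the elements themselves are never cloned.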
You can do this using a Boxed trait object like so:
let my_iterator: Box<dyn Iterator<Item = _>> = if my_rand_condition() {
Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)))
} else {
Box::new(vec2_of_A_refs.iter().enumerate().map(|(i, x)| (i, *x)))
};
I don't think this is a good idea generally though. A few things to note:
The use of trait objects means the types here must be resolved dynamically. This adds a lot of overhead.
The closure in vec1's iterator's map method cannot return a reference to its argument. Instead the second map must be added to vec2's iterator. The effect of this is that all the items are being copied regardless. If you are doing this, why not collect()? The overhead for creating the Vec or whatever you choose should be less than that of the dynamic resolution.
Bit pedantic, but remember if statements are expressions in Rust, and so the assignment can be expressed a little more cleanly as I have done above.

Transferring ownership between enum variants [duplicate]

I'm trying to replace a value in a mutable borrow, moving part of it into the new value:
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
*self = match self {
&mut Foo::Bar(val) => Foo::Baz(val),
&mut Foo::Baz(val) => Foo::Bar(val),
}
}
}
The code above doesn't work, and understandably so: moving the value out of self breaks its integrity. But since that value is dropped immediately afterwards, I (if not the compiler) could guarantee its safety.
Is there some way to achieve this? I feel like this is a job for unsafe code, but I'm not sure how that would work.
mem::uninitialized has been deprecated since Rust 1.39, replaced by MaybeUninit.
However, uninitialized data is not required here. Instead, you can use ptr::read to get the data referred to by self.
At this point, tmp has ownership of the data in the enum, but if we were to drop self, the destructor would attempt to read that same data, causing memory unsafety.
We then perform our transformation and put the value back, restoring the safety of the type.
use std::ptr;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// I copied this code from Stack Overflow without reading
// the surrounding text that explains why this is safe.
unsafe {
let tmp = ptr::read(self);
// Must not panic before we get to `ptr::write`
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
ptr::write(self, new);
}
}
}
More advanced versions of this code would prevent a panic from bubbling out of this code and instead cause the program to abort.
See also:
replace_with, a crate that wraps this logic up.
take_mut, a crate that wraps this logic up.
Change enum variant while moving the field to the new variant
How can I swap in a new value for a field in a mutable reference to a structure?
"The code above doesn't work, and understandably so, moving the value out of self breaks the integrity of it."
This is not exactly what happens here. For example, the same thing with self taken by value would work nicely:
impl<T> Foo<T> {
fn switch(mut self) {
self = match self {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
}
}
}
Rust is absolutely fine with partial and total moves. The problem here is that you do not own the value you're trying to move - you only have a mutable borrowed reference. You cannot move out of any reference, including mutable ones.
This is in fact one of the frequently requested features - a special kind of reference which would allow moving out of it. It would allow several kinds of useful patterns. You can find more here and here.
In the meantime for some cases you can use std::mem::replace and std::mem::swap. These functions allow you to "take" a value out of mutable reference, provided you give something in exchange.
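As a sketch of the std::mem::replace route for the enum in the question: this assumes T: Default (a bound the original code does not have) so that a cheap placeholder can be parked in *self while the old value is taken apart. No unsafe code is needed under that assumption.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum Foo<T> {
    Bar(T),
    Baz(T),
}

// Assumption: T: Default, so we have something to give in exchange.
impl<T: Default> Foo<T> {
    fn switch(&mut self) {
        // Swap a placeholder into `*self`; we now own the old value.
        let old = mem::replace(self, Foo::Bar(T::default()));
        // Rebuild the other variant from the owned payload and store it.
        // If anything panicked in between, `*self` would still hold a
        // valid (placeholder) value.
        *self = match old {
            Foo::Bar(val) => Foo::Baz(val),
            Foo::Baz(val) => Foo::Bar(val),
        };
    }
}

fn main() {
    let mut foo = Foo::Bar(String::from("hello"));
    foo.switch();
    assert_eq!(foo, Foo::Baz(String::from("hello")));
}
```

When T has no sensible default, wrapping the payload in an Option and using Option::take is a common alternative.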
Okay, I figured out how to do it with a bit of unsafeness and std::mem.
I replace self with an uninitialized temporary value (using mem::uninitialized, which has been deprecated since Rust 1.39 in favor of MaybeUninit). Since I now "own" what used to be self, I can safely move the value out of it and replace it:
use std::mem;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// This is safe since we will overwrite it without ever reading it.
let tmp = mem::replace(self, unsafe { mem::uninitialized() });
// We absolutely must **never** panic while the uninitialized value is around!
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
let uninitialized = mem::replace(self, new);
mem::forget(uninitialized);
}
}
fn main() {}

`HashMap::get_mut` leading to "returns reference to local value", any efficient work-around?

There have been a fair number of questions around this, and the solution is mostly "use Entry".
However this is an issue because HashMap::entry() requires an owned value meaning possibly expensive copies / allocations even when the key is already present and we just want to update the value in-place, hence the use of get_mut. However the use of get_mut on a reference to a local leads rustc to assume that said reference gets stored into the hashmap, and thus that returning the hashmap is an error:
use std::borrow::Cow;
use std::collections::HashMap;
fn get_string() -> String { String::from("xxxxxxx") }
fn foo() -> HashMap<Cow<'static, str>, usize> {
let mut v = HashMap::new();
// stand-in for "get a string slice as key",
// real case is getting a String from an
// mpsc and the key being a segment of that string
let s = get_string();
// stand-in for a structure which contains an `Option<Cow>`
let k = Cow::from(&s[2..3]);
// because of get_mut, `&s` is apparently considered to be stored in `v`?
if let Some(e) = v.get_mut(&k) {
*e += 1;
} else {
v.insert(Cow::from(k.into_owned()), 0);
}
v
}
Note that the manipulations at lines 9~13 are there to clarify the point of the pattern, but get_mut alone is sufficient to trigger the issue
Is there a way around without the efficiency hit, or is an eager allocation the only way? (note: because this is a static issue, dynamic gates like contains_key or get obviously don't do anything).
According to the docs, HashMap::get_mut() requires a value of type &Q such that the key type of the map implements Borrow<Q>.
The key of your map is Cow<'static, str>, which implements Borrow<str>. This means that you can use either a &Cow<'static, str> or a &str. But you are passing a &Cow<'local, str> for some 'local lifetime. The compiler tries to match that 'local with 'static and issues a somewhat confusing error message about lifetimes.
The solution is actually easy, because you can get an &str from the Cow either calling k.as_ref() or doing &*k, and the lifetime of the &str is unrestricted: (playground)
let k = Cow::from(&s[2..3]);
if let Some(e) = v.get_mut(k.as_ref()) { /* ...*/ }
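Put together, the pattern might look like the sketch below. The count_segment function and its signature are assumptions for illustration; it uses &*k, which the answer lists as equivalent to k.as_ref(), and it only allocates when the key is missing.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical helper: bump the counter for a segment of `s`,
// inserting it (with the question's initial value of 0) if absent.
fn count_segment(v: &mut HashMap<Cow<'static, str>, usize>, s: &str) {
    // Borrowed Cow tied to the lifetime of `s`, not 'static.
    // Note: the slice panics if `s` is shorter than three bytes.
    let k = Cow::from(&s[2..3]);
    // Look up by &str: Cow<'static, str>: Borrow<str>, and the &str
    // lifetime is not forced to match the key's 'static.
    if let Some(e) = v.get_mut(&*k) {
        *e += 1;
    } else {
        // Only now pay for the allocation.
        v.insert(Cow::from(k.into_owned()), 0);
    }
}

fn main() {
    let mut v = HashMap::new();
    count_segment(&mut v, "xxxxxxx");
    count_segment(&mut v, "xxxxxxx");
    assert_eq!(v["x"], 1);
}
```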

A built-in Object in Rust

Rust doesn't have built-in Object type I take it? If so, how do I, say, create a HashMap of "something" that in Java would be Object:
fn method1(my_hash_map: HashMap<&str, ???>) { ... } // Rust
void method1(Map<String, Object> myMap) { ... } // Java
If you want a HashMap that can mix values of many different types, you'll have to use Any. The most direct equivalent to Map<String, Object> would be HashMap<String, Box<dyn Any>>. I switched &str to String because &str without a lifetime is probably not what you want and in any case even further removed from Java String than Rust's String already is.
However, if you simply don't care about the type of the values, it's simpler and more efficient to make method1 generic:
fn method1<T>(my_hash_map: HashMap<String, T>) { ... }
Of course, you can also add constraints T:Trait to do more interesting things with the values (cf. Object allows equality comparisons and hashing).
To expand on rightføld's comment, Any is the closest you can really get in Rust, though it does come with a major restriction: it is only implemented by types which satisfy the 'static lifetime; that is, you can't treat any type which contains non-static references as an Any.
A second complication is that Object in Java has reference semantics and gives you shared ownership. As such, you'd need something like Rc<RefCell<dyn Any>> to get something roughly comparable. Note, however, that this is heavily discouraged since it basically moves a lot of checks to runtime. Something like this should be a fallback of last resort.
Finally, note that, insofar as I'm aware, there's no way to do a dynamic downcast on an Any to anything other than the erased type; so you can't take a reference to a value that, say, implements Debug, turn it into an &dyn Any, and then downcast to a &dyn Debug.
Better alternatives, if applicable, include generalising the value type (so use generic functions and structs), using an enum if there is a fixed, finite list of types you want to support, or write and implement a custom trait, in that order.
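As a sketch of the enum alternative for a fixed, finite list of types: the Value enum, its variants, and the describe function below are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

// Hypothetical closed set of value types the map may hold.
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Int(i64),
    Float(f64),
}

// The compiler checks that every variant is handled, which is not
// possible with runtime downcasts on Any.
fn describe(value: &Value) -> String {
    match value {
        Value::Text(s) => format!("text: {}", s),
        Value::Int(i) => format!("int: {}", i),
        Value::Float(x) => format!("float: {}", x),
    }
}

fn main() {
    let mut map: HashMap<String, Value> = HashMap::new();
    map.insert("a".to_string(), Value::Text("blah".to_string()));
    map.insert("b".to_string(), Value::Int(42));
    map.insert("c".to_string(), Value::Float(3.5));

    for (key, value) in &map {
        println!("{}: {}", key, describe(value));
    }
}
```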
To give you an example of working with Any, however, I threw the following together. Note that we have to try explicitly downcasting to every supported type.
use std::any::Any;
use std::collections::HashMap;

fn main() {
    // Annotating the value type lets each Box coerce to Box<dyn Any>.
    let mut map: HashMap<String, Box<dyn Any>> = HashMap::new();
    map.insert("a".to_string(), Box::new("blah"));
    map.insert("b".to_string(), Box::new(42usize));
    map.insert("c".to_string(), Box::new(3.14159f64));
    println!("{:?}", map);
    splang(&map);
}

fn splang(map: &HashMap<String, Box<dyn Any>>) {
    for (k, v) in map.iter() {
        // Try each supported type in turn; f64 is deliberately not handled.
        if let Some(v) = v.downcast_ref::<&str>() {
            println!("[\"{}\"]: &str = \"{}\"", k, v);
        } else if let Some(v) = v.downcast_ref::<usize>() {
            println!("[\"{}\"]: usize = {}", k, v);
        } else {
            println!("[\"{}\"]: ? = {:?}", k, v);
        }
    }
}
When run, it outputs something like the following (the order of the entries varies between runs):
{"c": Any { .. }, "a": Any { .. }, "b": Any { .. }}
["c"]: ? = Any { .. }
["a"]: &str = "blah"
["b"]: usize = 42
