Rust doesn't have a built-in Object type, I take it? If so, how do I, say, create a HashMap of "something" that in Java would be Object:
fn method1(my_hash_map: HashMap<&str, ???>) { ... } // Rust
void method1(Map<String, Object> myMap) { ... } // Java
If you want a HashMap that can mix values of many different types, you'll have to use Any. The most direct equivalent to Map<String, Object> would be HashMap<String, Box<Any>> (spelled Box<dyn Any> in current Rust). I switched &str to String because &str without a lifetime is probably not what you want, and in any case it is even further removed from Java's String than Rust's String already is.
However, if you simply don't care about the type of the values, it's simpler and more efficient to make method1 generic:
fn method1<T>(my_hash_map: HashMap<String, T>) { ... }
Of course, you can also add constraints such as T: Trait to do more interesting things with the values (cf. Java's Object, which allows equality comparisons and hashing).
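As a minimal sketch of the generic approach with a trait bound: the helper name describe and the choice of Debug as the bound are only illustrative assumptions, not something from the question.

```rust
use std::collections::HashMap;
use std::fmt::Debug;

// Hypothetical helper: works for any value type, as long as it implements Debug.
fn describe<T: Debug>(map: &HashMap<String, T>) -> Vec<String> {
    let mut lines: Vec<String> = map
        .iter()
        .map(|(k, v)| format!("{} = {:?}", k, v))
        .collect();
    // HashMap iteration order is unspecified, so sort for a stable result.
    lines.sort();
    lines
}

fn main() {
    let mut map = HashMap::new();
    map.insert("a".to_string(), 1);
    map.insert("b".to_string(), 2);
    for line in describe(&map) {
        println!("{}", line);
    }
}
```

All values in one map still share a single type T here; the function is generic, the map is not heterogeneous.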
To expand on rightføld's comment, Any is the closest you can really get in Rust, though it does come with a major restriction: it is only implemented by types which satisfy the 'static lifetime; that is, you can't treat any type which contains non-static references as an Any.
A second complication is that Object in Java has reference semantics and gives you shared ownership. As such, you'd need something like Rc<RefCell<Any>> to get something roughly comparable. Note, however, that this is heavily discouraged since it basically moves a lot of checks to runtime. Something like this should be a fallback of last resort.
Finally, note that, as far as I'm aware, there's no way to dynamically downcast an Any to anything other than the erased type; so you can't take a reference to a value that, say, implements Show, turn it into an &Any, and then cast that to a &Show.
Better alternatives, if applicable, include generalising the value type (so use generic functions and structs), using an enum if there is a fixed, finite list of types you want to support, or writing and implementing a custom trait, in that order.
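For the enum alternative, a rough sketch could look like the following. The enum Value, its variants, and the helper describe are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

// Hypothetical closed set of value types the map is allowed to hold.
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Int(i64),
    Float(f64),
}

// The compiler checks that every variant is handled; no runtime type tests needed.
fn describe(v: &Value) -> String {
    match v {
        Value::Text(s) => format!("text {:?}", s),
        Value::Int(i) => format!("int {}", i),
        Value::Float(f) => format!("float {}", f),
    }
}

fn main() {
    let mut map: HashMap<String, Value> = HashMap::new();
    map.insert("a".to_string(), Value::Text("blah".to_string()));
    map.insert("b".to_string(), Value::Int(42));
    map.insert("c".to_string(), Value::Float(3.14159));
    println!("{}", describe(&map["b"])); // prints "int 42"
}
```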
To give you an example of working with Any, however, I threw the following together. Note that we have to explicitly try downcasting to every supported type.
#![feature(if_let)]
use std::any::{Any, AnyRefExt};
use std::collections::HashMap;
fn main() {
let val_a = box "blah";
let val_b = box 42u;
let val_c = box 3.14159f64;
let mut map = HashMap::new();
map.insert("a".into_string(), val_a as Box<Any>);
map.insert("b".into_string(), val_b as Box<Any>);
map.insert("c".into_string(), val_c as Box<Any>);
println!("{}", map);
splang(&map);
}
fn splang(map: &HashMap<String, Box<Any>>) {
for (k, v) in map.iter() {
if let Some(v) = v.downcast_ref::<&str>() {
println!("[\"{}\"]: &str = \"{}\"", k, *v);
} else if let Some(v) = v.downcast_ref::<uint>() {
println!("[\"{}\"]: uint = {}", k, *v);
} else {
println!("[\"{}\"]: ? = {}", k, v);
}
}
}
When run, it outputs:
{c: Box<Any>, a: Box<Any>, b: Box<Any>}
["c"]: ? = Box<Any>
["a"]: &str = "blah"
["b"]: uint = 42
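The listing above targets a pre-1.0 compiler (box syntax, uint, AnyRefExt). A rough equivalent for current stable Rust might look like the sketch below; the helper name describe is made up for this example, and since Box<dyn Any> has no Display or Debug output of its own, unknown types get a placeholder string instead.

```rust
use std::any::Any;
use std::collections::HashMap;

// Hypothetical helper: try each supported concrete type in turn.
fn describe(v: &dyn Any) -> String {
    if let Some(s) = v.downcast_ref::<&str>() {
        format!("&str = {:?}", s)
    } else if let Some(n) = v.downcast_ref::<usize>() {
        format!("usize = {}", n)
    } else {
        "? = <unknown type>".to_string()
    }
}

fn main() {
    let mut map: HashMap<String, Box<dyn Any>> = HashMap::new();
    map.insert("a".to_string(), Box::new("blah"));
    map.insert("b".to_string(), Box::new(42usize));
    map.insert("c".to_string(), Box::new(3.14159f64));

    for (k, v) in &map {
        // `&**v` reaches the boxed value; passing `v` directly would erase
        // the Box itself as the Any, and every downcast would fail.
        println!("[{:?}]: {}", k, describe(&**v));
    }
}
```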
struct A {...whatever...};
const MY_CONST_USIZE:usize = -127;
// somewhere in function
// vec1_of_A:Vec<A> vec2_of_A_refs:Vec<&A> have values from different data sources and have different inside_item types
let my_iterator;
if my_rand_condition() { // my_rand_condition is random and compiles for sake of simplicity
my_iterator = vec1_of_A.iter().map(|x| (MY_CONST_USIZE, &x)); // Map<Iter<Vec<A>>>
} else {
my_iterator = vec2_of_A_refs.iter().enumerate(); // Enumerate<Iter<Vec<&A>>>
}
How can I make this code compile?
At the end (based on the condition) I would like to have an iterator that can be built from either input, and I don't know how to fit these Map and Enumerate types into a single variable without calling collect() to materialize the iterator as a Vec.
Reading material is welcome.
In the vec1_of_A case, first you need to replace &x with x in your map function. The code you have will never compile because the mapping closure tries to return a reference to one of its parameters, which is never allowed in Rust. To make the types match up, you need to dereference the &&A in the vec2_of_A_refs case to &A instead of trying to add a reference to the other.
Also, -127 is an invalid value for usize, so you need to pick a valid value, or use a different type than usize.
Having fixed those, now you need some type of dynamic dispatch. The simplest approach would be boxing into a Box<dyn Iterator>.
Here is a complete example:
#![allow(unused)]
#![allow(non_snake_case)]
struct A;
// Fixed to be a valid usize.
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator: Box<dyn Iterator<Item=(usize, &A)>>;
if my_rand_condition() {
// Fixed to return x instead of &x
my_iterator = Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
// Added map to deref &&A to &A to make the types match
my_iterator = Box::new(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Instead of a boxed trait object, you could also use the Either type from the either crate. This is an enum with Left and Right variants, but the Either type itself implements Iterator if both the left and right types also do, with the same type for the Item associated type. For example:
#![allow(unused)]
#![allow(non_snake_case)]
use either::Either;
struct A;
const MY_CONST_USIZE: usize = usize::MAX;
fn my_rand_condition() -> bool { todo!(); }
fn example() {
let vec1_of_A: Vec<A> = vec![];
let vec2_of_A_refs: Vec<&A> = vec![];
let my_iterator;
if my_rand_condition() {
my_iterator = Either::Left(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)));
} else {
my_iterator = Either::Right(vec2_of_A_refs.iter().map(|x| *x).enumerate());
}
for item in my_iterator {
// ...
}
}
Why would you choose one approach over the other?
Pros of the Either approach:
It does not require a heap allocation to store the iterator.
It implements dynamic dispatch via match which is likely (but not guaranteed) to be faster than dynamic dispatch via vtable lookup.
Pros of the boxed trait object approach:
It does not depend on any external crates.
It scales easily to many different types of iterators; the Either approach quickly becomes unwieldy with more than two types.
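To make "dynamic dispatch via match" concrete without pulling in a crate, here is a hand-rolled sketch of the same idea. OneOf and indexed are hypothetical names; the either crate's Either type provides this (and much more) out of the box.

```rust
// Hypothetical two-variant wrapper, similar in spirit to `Either`.
enum OneOf<L, R> {
    Left(L),
    Right(R),
}

// The wrapper is itself an iterator when both sides yield the same Item type.
impl<L, R> Iterator for OneOf<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // A plain match picks the active variant; no Box and no vtable.
        match self {
            OneOf::Left(l) => l.next(),
            OneOf::Right(r) => r.next(),
        }
    }
}

// Both branches produce the same concrete type: OneOf<Map<..>, Enumerate<..>>.
fn indexed<'a>(
    use_constant: bool,
    items: &'a [u32],
) -> impl Iterator<Item = (usize, &'a u32)> + 'a {
    if use_constant {
        OneOf::Left(items.iter().map(|x| (usize::MAX, x)))
    } else {
        OneOf::Right(items.iter().enumerate())
    }
}

fn main() {
    let items = [10, 20, 30];
    let pairs: Vec<_> = indexed(false, &items).collect();
    println!("{:?}", pairs); // prints [(0, 10), (1, 20), (2, 30)]
}
```

With three or more iterator shapes the enum needs one variant per shape, which is where the boxed trait object tends to become the more convenient option.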
You can do this using a Boxed trait object like so:
let my_iterator: Box<dyn Iterator<Item = _>> = if my_rand_condition() {
Box::new(vec1_of_A.iter().map(|x| (MY_CONST_USIZE, x)))
} else {
Box::new(vec2_of_A_refs.iter().enumerate().map(|(i, x)| (i, *x)))
};
I don't think this is a good idea generally though. A few things to note:
The use of trait objects means the concrete iterator type is resolved dynamically. This costs a heap allocation for the Box plus a vtable call for every item.
The closure in vec1's iterator's map method cannot return a reference to its argument. Instead, the second map must be added to vec2's iterator. The effect of this is that every item is copied regardless. If you are doing this, why not collect()? The overhead of creating the Vec, or whatever collection you choose, may well be less than that of the dynamic dispatch.
Bit pedantic, but remember if statements are expressions in Rust, and so the assignment can be expressed a little more cleanly as I have done above.
I have the structs Value and RefValue in my project. RefValue is a reference-counted, dynamically borrowable Value. Now Value may contain a HashMap of RefValue, where both the key and the value are RefValues.
type ValueMap = HashMap<RefValue, RefValue>;
#[derive(Debug, PartialEq, Eq)]
enum Value {
Integer(i64),
String(String),
Map(ValueMap),
}
#[derive(Debug, PartialEq, Eq)]
struct RefValue {
value: Rc<RefCell<Value>>,
}
I've implemented Hash on RefValue on my own, and some From-traits separately in this playground.
What I want to achieve is something like this main program:
fn main() {
// Simple values
let x = RefValue::from(42);
let y = RefValue::from("Hello");
// Make a map from these values
let mut z = ValueMap::new();
z.insert(RefValue::from("x"), x);
z.insert(RefValue::from("y"), y);
// Make a value from the map
let z = RefValue::from(z);
println!("z = {:?}", z);
// Try to access "x"
if let Value::Map(m) = &*z.borrow() {
println!("m[x] = {:?}", m["x"]); // <- here I want to access by &str
};
}
Unfortunately I'm getting strange results, as you can find in the playground comments. I'm also quite unsure if there's not a better implementation of the entire problem, as the RefCell cannot return a borrowed value of its contained element.
Can anybody give me a hint?
When you implement Borrow<T>, your Hash implementation must return the same hash value as T's whenever the underlying values are equivalent. That is, x.hash() must be equal to x.borrow().hash(). HashMap relies on this property when you index into it: it requires Key: Borrow<Idx> and then uses this rule to ensure it can find the value.
Your impl Borrow<str> for RefValue does not follow this rule. RefValue::hash() for RefValue::String calls write_u8(2) before hashing the string. Since you broke the contract, the hashmap is allowed to do anything (excluding undefined behavior), like panicking, aborting the process, or not finding your key, which is what it does in this case.
To fix that, you should just not hash the discriminant (removed it from the others too, for consistency):
impl Hash for RefValue {
fn hash<H: Hasher>(&self, state: &mut H) {
match &*self.borrow() {
Value::Integer(i) => {
i.hash(state);
}
Value::String(s) => {
s.hash(state);
}
Value::Map(m) => {
(m as *const ValueMap as usize).hash(state); // Object address
}
}
}
}
Now it panics in your Borrow implementation, like you expected (playground).
But you should not implement Borrow, since implementing it means your value is a reflection of the borrowed value. RefValue is by no means str. It can be integers, or maps, too. Thus, you should not implement Borrow for any of those. You could implement Borrow<Value>, but this is impossible because you use RefCell and thus need to return Ref but Borrow mandates returning a reference. You're out of luck. Your only option is to index with RefValues.
Lastly, you should avoid interior mutability in keys. Once you change them (and it's easy to change them by mistake) and their hash or equality changes, you have broken your contract with the map once again.
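For contrast, here is a sketch of a key type for which Borrow<str> is legitimate, because it is always a string and hashes exactly like one. Name and lookup are hypothetical and not part of the question's code.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical key type that is only ever a string.
#[derive(Debug, PartialEq, Eq)]
struct Name(String);

impl Hash for Name {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Delegate to str's Hash with no extra prefix or discriminant,
        // so Name("x") hashes exactly like "x".
        self.0.as_str().hash(state);
    }
}

impl Borrow<str> for Name {
    fn borrow(&self) -> &str {
        &self.0
    }
}

// Lookup by &str works because Hash and Eq agree between Name and str.
fn lookup<'a>(map: &'a HashMap<Name, i64>, key: &str) -> Option<&'a i64> {
    map.get(key)
}

fn main() {
    let mut map = HashMap::new();
    map.insert(Name("x".to_string()), 42);
    println!("{:?}", lookup(&map, "x")); // prints Some(42)
}
```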
There have been a fair number of questions around this, and the solution is mostly "use Entry".
However this is an issue because HashMap::entry() requires an owned value meaning possibly expensive copies / allocations even when the key is already present and we just want to update the value in-place, hence the use of get_mut. However the use of get_mut on a reference to a local leads rustc to assume that said reference gets stored into the hashmap, and thus that returning the hashmap is an error:
use std::borrow::Cow;
use std::collections::HashMap;
fn get_string() -> String { String::from("xxxxxxx") }
fn foo() -> HashMap<Cow<'static, str>, usize> {
let mut v = HashMap::new();
// stand-in for "get a string slice as key",
// real case is getting a String from an
// mpsc and the key being a segment of that string
let s = get_string();
// stand-in for a structure which contains an `Option<Cow>`
let k = Cow::from(&s[2..3]);
// because of get_mut, `&s` is apparently considered to be stored in `v`?
if let Some(e) = v.get_mut(&k) {
*e += 1;
} else {
v.insert(Cow::from(k.into_owned()), 0);
}
v
}
Note that the stand-in manipulations around s and k are there to clarify the point of the pattern, but get_mut alone is sufficient to trigger the issue.
Is there a way around without the efficiency hit, or is an eager allocation the only way? (note: because this is a static issue, dynamic gates like contains_key or get obviously don't do anything).
According to the docs, HashMap::get_mut() requires a value of type &Q such that the key of the map implements Borrow<Q>.
The key of your hash is Cow<'static, str>, that implements Borrow<str>. This means that you can use either a &Cow<'static, str> or a &str. But you are passing a &Cow<'local, str> for some 'local lifetime. The compiler tries to match that 'local with 'static and issues a somewhat confusing error message about lifetimes.
The solution is actually easy, because you can get a &str from the Cow either by calling k.as_ref() or by writing &*k, and the lifetime of that &str is unrestricted:
let k = Cow::from(&s[2..3]);
if let Some(e) = v.get_mut(k.as_ref()) { /* ...*/ }
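Put together, a self-contained sketch of the pattern might look like this. The function count_segment and its parameters are hypothetical, and it uses &*k, which is equivalent to k.as_ref() for this purpose.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical helper: bumps the counter keyed by a one-byte segment of `line`.
// Slicing panics if `line` is shorter than 3 bytes; fine for a sketch.
fn count_segment(counts: &mut HashMap<Cow<'static, str>, usize>, line: &str) {
    let k = Cow::from(&line[2..3]);
    // `&*k` is a plain `&str`, so the lookup goes through Borrow<str> and
    // nothing forces the borrow of `line` to be 'static.
    if let Some(e) = counts.get_mut(&*k) {
        *e += 1;
    } else {
        // Allocate an owned key only when the entry is missing.
        counts.insert(Cow::Owned(k.into_owned()), 0);
    }
}

fn main() {
    let mut counts = HashMap::new();
    count_segment(&mut counts, "xxxxxxx");
    count_segment(&mut counts, "yyxyyyy");
    println!("{:?}", counts); // prints {"x": 1}
}
```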
tl;dr in Rust, is there a "strong" type alias (or typing mechanism) such that the rustc compiler will reject (emit an error) for mix-ups that may be the same underlying type?
Problem
Currently, type aliases of the same underlying type may be defined
type WidgetCounter = usize;
type FoobarTally = usize;
However, the compiler will not reject (emit an error or a warning) if I mistakenly mix up instances of the two type aliases.
fn tally_the_foos(tally: FoobarTally) -> FoobarTally {
// ...
tally
}
fn main() {
let wc: WidgetCounter = 33;
let ft: FoobarTally = 1;
// whoops, passed the wrong variable!
let tally_total = tally_the_foos(wc);
}
Possible Solutions?
I'm hoping for something like an additional keyword strong
strong type WidgetCounter = usize;
strong type FoobarTally = usize;
such that the previous code, when compiled, would cause a compiler error:
error[E4444]: mismatched strong alias type WidgetCounter,
expected a FoobarTally
Or maybe there is a clever trick with structs that would achieve this?
Or a cargo module that defines a macro to accomplish this?
I know I could "hack" this by type aliasing different number types, i.e. i32, then u32, then i64, etc. But that's an ugly hack for many reasons.
Is there a way to have the compiler help me avoid these custom type alias mixups?
Rust has a nice trick called the New Type Idiom just for this. By wrapping a single item in a tuple struct, you can create a "strong" or "distinct" type wrapper.
This idiom is also mentioned briefly in the tuple struct section of the Rust docs.
The "New Type Idiom" link has a great example. Here is one similar to the types you are looking for:
// Defines two distinct types. Counter and Tally are incompatible with
// each other, even though they contain the same item type.
struct Counter(usize);
struct Tally(usize);
// You can destructure the parameter here to easily get the contained value.
fn print_tally(Tally(value): &Tally) {
println!("Tally is {}", value);
}
fn return_tally(tally: Tally) -> Tally {
tally
}
fn print_value(value: usize) {
println!("Value is {}", value);
}
fn main() {
let count: Counter = Counter(12);
let mut tally: Tally = Tally(10);
print_tally(&tally);
tally = return_tally(tally);
// This is a compile time error.
// Counter is not compatible with type Tally.
// print_tally(&count);
// The contained value can be obtained through destructuring
// or by position.
let Tally(tally_value) = tally;
let tally_value_from_position: usize = tally.0;
print_value(tally_value);
print_value(tally_value_from_position);
}
The recommended way to create a regular boxed slice (i.e. Box<[T]>) seems to be to first create a std::Vec<T>, and use .into_boxed_slice(). However, nothing similar to this seems to work if I want the slice to be wrapped in UnsafeCell.
A solution with unsafe code is fine, but I'd really like to avoid having to manually manage the memory.
The only (not-unsafe) way to create a Box<[T]> is via Box::from, given a &[T] as the parameter. This is because [T] is ?Sized and can't be passed as a parameter. This in turn effectively requires T: Copy, because T has to be copied from behind the reference into the new Box. But UnsafeCell is not Copy, regardless of whether T is. Discussion about making UnsafeCell Copy has been going on for years, yielding no final conclusion, due to safety concerns.
If you really, really want a Box<UnsafeCell<[T]>>, there are only two ways:
Because Box and UnsafeCell both implement CoerceUnsized, and [T; N] implements Unsize<[T]>, you can create a Box<UnsafeCell<[T; N]>> and coerce it to a Box<UnsafeCell<[T]>>. This limits you to initializing from fixed-size arrays.
Unsize coercion:
fn main() {
use std::cell::UnsafeCell;
let x: [u8;3] = [1,2,3];
let c: Box<UnsafeCell<[_]>> = Box::new(UnsafeCell::new(x));
}
Because UnsafeCell is #[repr(transparent)], you can create a Box<[T]> and unsafely transmute it to a Box<UnsafeCell<[T]>>, as the UnsafeCell<[T]> is guaranteed to have the same memory layout as a [T], given that [T] doesn't use niche-values (even if T does).
Transmute:
use std::cell::UnsafeCell;
use std::mem;
// enclose the transmute in a function accepting and returning proper type-pairs
fn into_boxed_unsafecell<T>(inp: Box<[T]>) -> Box<UnsafeCell<[T]>> {
unsafe {
mem::transmute(inp)
}
}
fn main() {
let x = vec![1,2,3];
let b = x.into_boxed_slice();
let c: Box<UnsafeCell<[_]>> = into_boxed_unsafecell(b);
}
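If you would rather avoid transmute, the same conversion can probably be written as a raw-pointer cast. This is only a sketch under the same layout assumption (UnsafeCell<[T]> being #[repr(transparent)] over [T]); the function name is reused from the listing above.

```rust
use std::cell::UnsafeCell;

fn into_boxed_unsafecell<T>(inp: Box<[T]>) -> Box<UnsafeCell<[T]>> {
    // Give up ownership and obtain the fat pointer (address + length).
    let raw: *mut [T] = Box::into_raw(inp);
    // SAFETY (assumed): UnsafeCell<[T]> has the same layout as [T], so the
    // allocation can be re-owned under the new type with the same length.
    unsafe { Box::from_raw(raw as *mut UnsafeCell<[T]>) }
}

fn main() {
    let c = into_boxed_unsafecell(vec![1u32, 2, 3].into_boxed_slice());
    unsafe {
        // SAFETY: no other reference into the cell exists at this point.
        let slice: &mut [u32] = &mut *c.get();
        slice[1] = 10;
        println!("{:?}", slice); // prints [1, 10, 3]
    }
}
```

The Box still frees the memory when it is dropped, so nothing has to be managed by hand.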
Having said all this: I strongly suspect you are running into the XY problem. A Box<UnsafeCell<[T]>> is a very strange type (especially compared to UnsafeCell<Box<[T]>>). You may want to give details on what you are trying to accomplish with such a type.
Just swap the pointer types to UnsafeCell<Box<[T]>>:
use std::cell::UnsafeCell;
fn main() {
let mut res: UnsafeCell<Box<[u32]>> = UnsafeCell::new(vec![1, 2, 3, 4, 5].into_boxed_slice());
unsafe {
println!("{}", (*res.get())[1]);
res.get_mut()[1] = 10;
println!("{}", (*res.get())[1]);
}
}