Is it possible to build a HashMap of &str referencing environment variables?

I'm trying to make a read-only map of environment variables.
fn os_env_hashmap() -> HashMap<&'static str, &'static str> {
let mut map = HashMap::new();
use std::env;
for (key,val) in env::vars_os() {
let k = key.to_str();
if k.is_none() { continue }
let v = val.to_str();
if v.is_none() { continue }
k.unwrap();
//map.insert( k.unwrap(), v.unwrap() );
}
return map;
}
I can't seem to uncomment the "insert" line near the bottom without compiler errors about key, val, k, and v being local.
I might be able to fix the compiler error by using String instead of str, but str seems perfect for a read-only result.
Feel free to suggest a more idiomatic way to do this.

This is unfortunately not straightforward using only the facilities of the Rust standard library.
If env::vars_os() returned an iterator over &'static OsStr instead of OsString, this would be trivial. Unfortunately, not all platforms allow creating an &OsStr to the contents of an environment variable. In particular, on Windows, the native encoding is UTF-16 but the encoding needed by OsStr is WTF-8. For this reason, there really is no OsStr anywhere you could take a reference to, until you create an OsString by iterating over env::vars_os().
The simplest thing, as the question comments mention, is to return owned Strings:
use std::collections::HashMap;
use std::env;
fn os_env_hashmap() -> HashMap<String, String> {
let mut map = HashMap::new();
for (key, val) in env::vars_os() {
// Use pattern bindings instead of testing .is_some() followed by .unwrap()
if let (Ok(k), Ok(v)) = (key.into_string(), val.into_string()) {
map.insert(k, v);
}
}
return map;
}
The result is not "read-only", but it is not shared, so you cannot cause data races or other weird bugs by mutating it.
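If the goal is a map that callers can read but not modify, one option is to keep the owned map in one place and hand out shared borrows. A minimal sketch, assuming a hypothetical helper named `lookup` (not part of the answer above) and using an iterator-based variant of `os_env_hashmap`:

```rust
use std::collections::HashMap;
use std::env;

/// Collects every environment variable whose name and value are valid Unicode.
fn os_env_hashmap() -> HashMap<String, String> {
    env::vars_os()
        .filter_map(|(key, val)| Some((key.into_string().ok()?, val.into_string().ok()?)))
        .collect()
}

/// Hypothetical helper: borrows a value as `&str` for as long as the map is borrowed.
fn lookup<'a>(map: &'a HashMap<String, String>, key: &str) -> Option<&'a str> {
    map.get(key).map(String::as_str)
}
```

Code that only receives a `&HashMap<String, String>` cannot insert or remove entries, which is often all that "read-only" needs to mean.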
See also
Is there any way to return a reference to a variable created in a function?
Return local String as a slice (&str)
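If the `HashMap<&'static str, &'static str>` signature from the question really is required, one technique not mentioned in the answer is to leak the owned strings deliberately with `Box::leak`. This is only a sketch, and the function name `leaked_env_hashmap` is made up for illustration. The leaked memory is never freed, so it is best suited to a map that is built once and kept for the life of the program.

```rust
use std::collections::HashMap;
use std::env;

/// Builds the map once; every key and value is intentionally leaked.
fn leaked_env_hashmap() -> HashMap<&'static str, &'static str> {
    let mut map = HashMap::new();
    for (key, val) in env::vars_os() {
        if let (Ok(k), Ok(v)) = (key.into_string(), val.into_string()) {
            // Box::leak gives up ownership, so the bytes live until the process exits.
            let k: &'static str = Box::leak(k.into_boxed_str());
            let v: &'static str = Box::leak(v.into_boxed_str());
            map.insert(k, v);
        }
    }
    map
}
```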

Related

`HashMap::get_mut` leading to "returns reference to local value", any efficient work-around?

There have been a fair number of questions around this, and the solution is mostly "use Entry".
However, this is an issue because HashMap::entry() requires an owned key, which can mean expensive copies or allocations even when the key is already present and we just want to update the value in place; hence the use of get_mut. But calling get_mut with a reference to a local leads rustc to assume that the reference gets stored in the HashMap, and thus that returning the HashMap is an error:
use std::borrow::Cow;
use std::collections::HashMap;
fn get_string() -> String { String::from("xxxxxxx") }
fn foo() -> HashMap<Cow<'static, str>, usize> {
let mut v = HashMap::new();
// stand-in for "get a string slice as key",
// real case is getting a String from an
// mpsc and the key being a segment of that string
let s = get_string();
// stand-in for a structure which contains an `Option<Cow>`
let k = Cow::from(&s[2..3]);
// because of get_mut, `&s` is apparently considered to be stored in `v`?
if let Some(e) = v.get_mut(&k) {
*e += 1;
} else {
v.insert(Cow::from(k.into_owned()), 0);
}
v
}
Note that the steps that build s and k are there to clarify the point of the pattern, but get_mut alone is sufficient to trigger the issue.
Is there a way around without the efficiency hit, or is an eager allocation the only way? (note: because this is a static issue, dynamic gates like contains_key or get obviously don't do anything).
According to the docs, HashMap::get_mut() requires a value of type &Q such that the key type of the map implements Borrow<Q>.
The key of your hash is Cow<'static, str>, that implements Borrow<str>. This means that you can use either a &Cow<'static, str> or a &str. But you are passing a &Cow<'local, str> for some 'local lifetime. The compiler tries to match that 'local with 'static and issues a somewhat confusing error message about lifetimes.
The solution is actually easy, because you can get a &str from the Cow by either calling k.as_ref() or writing &*k, and the lifetime of that &str is unrestricted:
let k = Cow::from(&s[2..3]);
if let Some(e) = v.get_mut(k.as_ref()) { /* ...*/ }
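Put together, the lookup-then-insert pattern might look like the sketch below. The helper name `bump` is hypothetical, and unlike the question's code it starts new entries at 1 so that the value is an occurrence count. It uses `&*k`, the second spelling mentioned above.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Hypothetical helper: counts how often `key` has been seen.
fn bump(counts: &mut HashMap<Cow<'static, str>, usize>, key: &str) {
    // A borrowed Cow whose lifetime is shorter than 'static.
    let k: Cow<'_, str> = Cow::from(key);
    // `&*k` is a plain `&str`, so the lookup does not tie `k` to the map's key type.
    if let Some(count) = counts.get_mut(&*k) {
        *count += 1;
    } else {
        // An owned key is allocated only when the entry is first inserted.
        counts.insert(Cow::Owned(k.into_owned()), 1);
    }
}
```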

How to create a Box<UnsafeCell<[T]>>

The recommended way to create a regular boxed slice (i.e. Box<[T]>) seems to be to first create a std::Vec<T>, and use .into_boxed_slice(). However, nothing similar to this seems to work if I want the slice to be wrapped in UnsafeCell.
A solution with unsafe code is fine, but I'd really like to avoid having to manually manage the memory.
The only (not-unsafe) way to create a Box<[T]> is via Box::from, given a &[T] as the parameter. This is because [T] is ?Sized and can't be passed as a parameter. This in turn effectively requires T: Copy, because T has to be copied from behind the reference into the new Box. But UnsafeCell is not Copy, regardless of whether T is. Discussion about making UnsafeCell Copy has been going on for years, yielding no final conclusion, due to safety concerns.
If you really, really want a Box<UnsafeCell<[T]>>, there are only two ways:
Because Box and UnsafeCell are both CoerceUnsized, and [T; N] is Unsize, you can create a Box<UnsafeCell<[T; N]>> and coerce it to a Box<UnsafeCell<[T]>>. This limits you to initializing from fixed-size arrays.
Unsize coercion:
fn main() {
use std::cell::UnsafeCell;
let x: [u8;3] = [1,2,3];
let c: Box<UnsafeCell<[_]>> = Box::new(UnsafeCell::new(x));
}
Because UnsafeCell is #[repr(transparent)], you can create a Box<[T]> and unsafely transmute it to a Box<UnsafeCell<[T]>>, as the UnsafeCell<[T]> is guaranteed to have the same memory layout as a [T], given that [T] doesn't use niche values (even if T does).
Transmute:
use std::cell::UnsafeCell;
use std::mem;
// enclose the transmute in a function accepting and returning proper type-pairs
fn into_boxed_unsafecell<T>(inp: Box<[T]>) -> Box<UnsafeCell<[T]>> {
unsafe {
mem::transmute(inp)
}
}
fn main() {
let x = vec![1,2,3];
let b = x.into_boxed_slice();
let c: Box<UnsafeCell<[_]>> = into_boxed_unsafecell(b);
}
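The same conversion can also be spelled with a raw-pointer cast instead of mem::transmute. This is only a sketch that relies on the same layout guarantee described above; the `demo` function is a hypothetical usage example.

```rust
use std::cell::UnsafeCell;

fn into_boxed_unsafecell<T>(inp: Box<[T]>) -> Box<UnsafeCell<[T]>> {
    // Take the allocation out of the Box without freeing it.
    let raw: *mut [T] = Box::into_raw(inp);
    // SAFETY: UnsafeCell<[T]> is #[repr(transparent)] over [T], so the pointer,
    // its length metadata, and the allocation layout are unchanged by the cast.
    unsafe { Box::from_raw(raw as *mut UnsafeCell<[T]>) }
}

/// Hypothetical usage: mutate through the cell, then copy the contents out.
fn demo() -> Vec<i32> {
    let mut cell = into_boxed_unsafecell(vec![1, 2, 3].into_boxed_slice());
    // With exclusive access, `get_mut` is safe and needs no raw pointers.
    cell.get_mut()[1] = 10;
    cell.get_mut().to_vec()
}
```

The Box still owns the allocation afterwards, so the memory is freed normally when the returned value is dropped.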
Having said all this: I strongly suspect you are running into the XY problem. A Box<UnsafeCell<[T]>> is a very strange type (especially compared to UnsafeCell<Box<[T]>>). You may want to give details on what you are trying to accomplish with such a type.
Just swap the pointer types to UnsafeCell<Box<[T]>>:
use std::cell::UnsafeCell;
fn main() {
let mut res: UnsafeCell<Box<[u32]>> = UnsafeCell::new(vec![1, 2, 3, 4, 5].into_boxed_slice());
unsafe {
println!("{}", (*res.get())[1]);
res.get_mut()[1] = 10;
println!("{}", (*res.get())[1]);
}
}

Lifetime of references in closures

I need a closure to refer to parts of an object in its enclosing environment. The object is created within the environment and is scoped to it, but once created it could be safely moved to the closure.
The use case is a function that does some preparatory work and returns a closure that will do the rest of the work. The reason for this design are execution constraints: the first part of the work involves allocation, and the remainder must do no allocation. Here is a minimal example:
fn stage_action() -> Box<dyn Fn()> {
// split a freshly allocated string into pieces
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
// the returned closure refers to the substrings vector of
// slices without any further allocation or modification
Box::new(move || {
for sub in substrings.iter() {
println!("{}", sub);
}
})
}
fn main() {
let action = stage_action();
// ...executed some time later:
action();
}
This fails to compile, correctly stating that &string[0..1] and others must not outlive string. But if string were moved into the closure, there would be no problem. Is there a way to force that to happen, or another approach that would allow the closure to refer to parts of an object created just outside of it?
I've also tried creating a struct with the same functionality to make the move fully explicit, but that doesn't compile either. Again, compilation fails with the error that &later[0..1] and others only live until the end of function, but "borrowed value must be valid for the static lifetime".
Even completely avoiding a Box doesn't appear to help - the compiler complains that the object doesn't live long enough.
There's nothing specific to closures here; it's the equivalent of:
fn main() {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
let string = string;
}
You are attempting to move the String while there are outstanding borrows. In my example here, it's to another variable; in your example it's to the closure's environment. Either way, you are still moving it.
Additionally, you are trying to move the substrings into the same closure environment as the owning string. That makes the entire problem equivalent to Why can't I store a value and a reference to that value in the same struct?:
struct Environment<'a> {
string: String,
substrings: Vec<&'a str>,
}
fn thing<'a>() -> Environment<'a> {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
Environment {
string: string,
substrings: substrings,
}
}
"The object is created within the environment and is scoped to it"
I'd disagree; string and substrings are created outside of the closure's environment and moved into it. It's that move that's tripping you up.
"once created it could be safely moved to the closure."
In this case that's true, but only because you, the programmer, can guarantee that the address of the string data inside the String will remain constant. You know this for two reasons:
String is internally implemented with a heap allocation, so moving the String doesn't move the string data.
The String will never be mutated, which could cause the string to reallocate, invalidating any references.
The easiest solution for your example is to simply convert the slices to Strings and let the closure own them completely. This may even be a net benefit if that means you can free a large string in favor of a few smaller strings.
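A minimal sketch of that approach follows. The `usize` return value is an addition made here so the closure's effect can be checked; the original closure returns nothing.

```rust
fn stage_action() -> impl Fn() -> usize {
    // All allocation happens here, before the closure is returned.
    let string = String::from("a:b:c");
    let substrings: Vec<String> = string.split(':').map(str::to_owned).collect();
    // `string` is dropped when this function returns; the closure owns only the pieces.
    move || {
        for sub in &substrings {
            println!("{}", sub);
        }
        // Reports how many pieces were printed.
        substrings.len()
    }
}
```

Because the closure owns a `Vec<String>` and borrows nothing from the enclosing function, no lifetime annotations are needed.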
Otherwise, you meet the criteria laid out under "There is a special case where the lifetime tracking is overzealous" in Why can't I store a value and a reference to that value in the same struct?, so you can use crates like:
owning_ref
use owning_ref::RcRef; // 0.4.1
use std::rc::Rc;
fn stage_action() -> impl Fn() {
let string = RcRef::new(Rc::new(String::from("a:b:c")));
let substrings = vec![
string.clone().map(|s| &s[0..1]),
string.clone().map(|s| &s[2..3]),
string.clone().map(|s| &s[4..5]),
];
move || {
for sub in &substrings {
println!("{}", &**sub);
}
}
}
fn main() {
let action = stage_action();
action();
}
ouroboros
use ouroboros::self_referencing; // 0.2.3
fn stage_action() -> impl Fn() {
#[self_referencing]
struct Thing {
string: String,
#[borrows(string)]
substrings: Vec<&'this str>,
}
let thing = ThingBuilder {
string: String::from("a:b:c"),
substrings_builder: |s| vec![&s[0..1], &s[2..3], &s[4..5]],
}
.build();
move || {
thing.with_substrings(|substrings| {
for sub in substrings {
println!("{}", sub);
}
})
}
}
fn main() {
let action = stage_action();
action();
}
Note that I'm no expert user of either of these crates, so these examples may not be the best use of them.
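A dependency-free alternative, not taken from the answer above, is to store byte ranges instead of slices and re-slice the owned String on demand. The sketch below assumes a hypothetical `Pieces` type; as in the previous variations, the closure returns a count only so that its behavior can be checked.

```rust
use std::ops::Range;

/// Hypothetical container: owns the text and remembers where each piece lives.
struct Pieces {
    string: String,
    ranges: Vec<Range<usize>>,
}

impl Pieces {
    fn split_on_colon(string: String) -> Self {
        let mut ranges = Vec::new();
        let mut start = 0;
        // ':' is ASCII, so every recorded index is a valid char boundary.
        for (i, byte) in string.bytes().enumerate() {
            if byte == b':' {
                ranges.push(start..i);
                start = i + 1;
            }
        }
        ranges.push(start..string.len());
        Pieces { string, ranges }
    }

    /// Re-creates the slices on demand; nothing is allocated here.
    fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        self.ranges.iter().map(move |r| &self.string[r.clone()])
    }
}

fn stage_action() -> impl Fn() -> usize {
    let pieces = Pieces::split_on_colon(String::from("a:b:c"));
    move || {
        let mut printed = 0;
        for sub in pieces.iter() {
            println!("{}", sub);
            printed += 1;
        }
        printed
    }
}
```

Ranges are plain integers, so the struct holds no reference into itself and can be moved into the closure freely.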

A built-in Object in Rust

Rust doesn't have a built-in Object type, I take it? If so, how do I, say, create a HashMap of "something" that in Java would be Object:
fn method1(my_hash_map: HashMap<&str, ???>) { ... } // Rust
void method1(Map<String, Object> myMap) { ... } // Java
If you want a HashMap that can mix values of many different types, you'll have to use Any. The most direct equivalent to Map<String, Object> would be HashMap<String, Box<Any>>. I switched &str to String because &str without a lifetime is probably not what you want and in any case even further removed from Java String than Rust's String already is.
However, if you simply don't care about the type of the values, it's simpler and more efficient to make method1 generic:
fn method1<T>(my_hash_map: HashMap<String, T>) { ... }
Of course, you can also add constraints T:Trait to do more interesting things with the values (cf. Object allows equality comparisons and hashing).
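As a sketch of such a constraint, the version below requires the values to implement Display so they can be formatted. Taking the map by reference and returning sorted lines are choices made here for illustration, not part of the original signature.

```rust
use std::collections::HashMap;
use std::fmt::Display;

/// Works for any single value type `T` that can be printed.
fn method1<T: Display>(my_hash_map: &HashMap<String, T>) -> Vec<String> {
    let mut lines: Vec<String> = my_hash_map
        .iter()
        .map(|(k, v)| format!("{k} = {v}"))
        .collect();
    // HashMap iteration order is unspecified, so sort for a stable result.
    lines.sort();
    lines
}
```

Each call still uses one concrete `T`; mixing value types within a single map needs one of the other approaches discussed here.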
To expand on rightføld's comment, Any is the closest you can really get in Rust, though it does come with a major restriction: it is only implemented by types which satisfy the 'static lifetime; that is, you can't treat any type which contains non-static references as an Any.
A second complication is that Object in Java has reference semantics and gives you shared ownership. As such, you'd need something like Rc<RefCell<Any>> to get something roughly comparable. Note, however, that this is heavily discouraged since it basically moves a lot of checks to runtime. Something like this should be a fallback of last resort.
Finally, note that, insofar as I'm aware, there's no way to dynamically cast an Any to anything other than the concrete type that was erased; so you can't take a reference to a value that, say, implements Show, turn it into an &Any, and then cast it to a &Show.
Better alternatives, if applicable, include generalising the value type (so use generic functions and structs), using an enum if there is a fixed, finite list of types you want to support, or write and implement a custom trait, in that order.
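The enum option might look like the following sketch. The `Value` type and its three variants are hypothetical and stand in for whatever fixed list of types is actually needed.

```rust
use std::collections::HashMap;

/// Hypothetical closed set of value types the map may hold.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Int(i64),
    Float(f64),
}

/// The compiler checks that every variant is handled.
fn describe(value: &Value) -> String {
    match value {
        Value::Text(s) => format!("text {s:?}"),
        Value::Int(n) => format!("int {n}"),
        Value::Float(x) => format!("float {x}"),
    }
}

fn sample_map() -> HashMap<String, Value> {
    let mut map = HashMap::new();
    map.insert("a".to_string(), Value::Text("blah".to_string()));
    map.insert("b".to_string(), Value::Int(42));
    map.insert("c".to_string(), Value::Float(3.5));
    map
}
```

Compared with Any, no runtime type checks are involved, and adding a new variant makes every non-exhaustive match fail to compile.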
To give you an example of working with Any, however, I threw the following together. Note that we have to try explicitly downcasting to every supported type.
#![feature(if_let)]
use std::any::{Any, AnyRefExt};
use std::collections::HashMap;
fn main() {
let val_a = box "blah";
let val_b = box 42u;
let val_c = box 3.14159f64;
let mut map = HashMap::new();
map.insert("a".into_string(), val_a as Box<Any>);
map.insert("b".into_string(), val_b as Box<Any>);
map.insert("c".into_string(), val_c as Box<Any>);
println!("{}", map);
splang(&map);
}
fn splang(map: &HashMap<String, Box<Any>>) {
for (k, v) in map.iter() {
if let Some(v) = v.downcast_ref::<&str>() {
println!("[\"{}\"]: &str = \"{}\"", k, *v);
} else if let Some(v) = v.downcast_ref::<uint>() {
println!("[\"{}\"]: uint = {}", k, *v);
} else {
println!("[\"{}\"]: ? = {}", k, v);
}
}
}
When run, it outputs:
{c: Box<Any>, a: Box<Any>, b: Box<Any>}
["c"]: ? = Box<Any>
["a"]: &str = "blah"
["b"]: uint = 42
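The listing above uses pre-1.0 syntax (`box`, `uint`, `AnyRefExt`, the `if_let` feature gate) that current compilers reject. A rough modern equivalent is sketched below, with `usize` standing in for `uint`; the helper names `describe`, `describe_all`, and `sample_map` are made up for illustration.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Tries each supported type in turn; anything else is reported as "?".
fn describe(value: &dyn Any) -> String {
    if let Some(s) = value.downcast_ref::<&str>() {
        format!("&str = {s:?}")
    } else if let Some(n) = value.downcast_ref::<usize>() {
        format!("usize = {n}")
    } else {
        String::from("?")
    }
}

fn sample_map() -> HashMap<String, Box<dyn Any>> {
    let mut map: HashMap<String, Box<dyn Any>> = HashMap::new();
    map.insert("a".to_string(), Box::new("blah"));
    map.insert("b".to_string(), Box::new(42usize));
    map.insert("c".to_string(), Box::new(3.14159f64));
    map
}

fn describe_all(map: &HashMap<String, Box<dyn Any>>) -> Vec<String> {
    let mut lines: Vec<String> = map
        .iter()
        // `&**v` borrows the boxed value itself. Passing `v` directly risks
        // treating the Box as the Any value, in which case the downcasts fail.
        .map(|(k, v)| format!("[{k:?}]: {}", describe(&**v)))
        .collect();
    // HashMap iteration order is unspecified, so sort for a stable result.
    lines.sort();
    lines
}
```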

How would I create and use a string to string Hashmap in Rust?

How would I idiomatically create a string-to-string HashMap in Rust? The following works, but is it the right way to do it? Is there a different kind of string I should be using?
use std::collections::hashmap::HashMap;
//use std::str;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{0}", mymap["foo".to_string()]);
}
Assuming you would like the flexibility of String, HashMap<String, String> is correct. The other choice is &str, but that imposes significant restrictions on how the HashMap can be used and where it can be passed around; if it works for you, though, changing one or both parameters to &str will be more efficient. This choice should be dictated by what sort of ownership semantics you need and how dynamic the strings are; see this answer and the strings guide for more.
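To make the trade-off concrete, here is a small sketch written against current Rust rather than the compiler version this answer targeted. The function names are hypothetical. A map of string literals can use `&'static str` with no allocation, while a map filled from runtime data needs owned Strings.

```rust
use std::collections::HashMap;

/// Keys and values are literals baked into the binary, so no String is allocated.
fn static_table() -> HashMap<&'static str, &'static str> {
    HashMap::from([("foo", "bar"), ("baz", "qux")])
}

/// The inputs may be short-lived borrows, so the map copies them into owned Strings.
fn owned_table(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```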
BTW, searching a HashMap<String, ...> with a String can be expensive: if you don't already have one, it requires allocating a new String. We have a workaround in the form of find_equiv, which allows you to pass a string literal (and, more generally, any &str) without allocating a new String:
use std::collections::HashMap;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{}", mymap.find_equiv(&"foo"));
println!("{}", mymap.find_equiv(&"not there"));
}
Note that I've left the Option in the return value; one could call .unwrap() or handle a missing key properly.
Another slightly different option (more general in some circumstances, less in others), is the std::string::as_string function, which allows viewing the data in &str as if it were a &String, without allocating (as the name suggests). It returns an object that can be dereferenced to a String, e.g.
use std::collections::HashMap;
use std::string;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{}", mymap[*string::as_string("foo")]);
}
(There is a similar std::vec::as_vec.)
I'm writing this answer for future readers. huon's answer was correct at the time, but the *_equiv methods were removed some time ago.
The HashMap documentation provides an example on using String-String hashmaps, where &str can be used.
The following code will work just fine. No new String allocation necessary:
use std::collections::HashMap;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{0}", mymap["foo"]);
println!("{0}", mymap.get("foo").unwrap());
}
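Indexing with `mymap["foo"]` panics when the key is absent, and so does `.unwrap()` on the result of `get`. Where a missing key is expected, a fallback can be supplied instead. A minimal sketch, assuming a hypothetical helper named `lookup_or`:

```rust
use std::collections::HashMap;

/// Returns the stored value, or `default` when the key is missing.
/// Neither path allocates a new String.
fn lookup_or<'a>(map: &'a HashMap<String, String>, key: &str, default: &'a str) -> &'a str {
    map.get(key).map(String::as_str).unwrap_or(default)
}
```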

Resources