Pattern matching in rust and mutability - rust

Let's say that I have an Option<String> and want to pattern match to get a mutable reference to the String. I can do the following (_a needs to be mutable):
let mut _a: Option<String> = Some(String::from("foo"));
if let Some(ref mut aa) = _a {
aa.push_str("_");
println!("aa: {:?}", aa)
}
Now let's say that I have two Option<String> values that I want to pattern match over.
let _b: Option<String> = Some(String::from("bar"));
let _c: Option<String> = Some(String::from("baz"));
if let (Some(ref mut bb), Some(ref mut cc)) = (_b, _c) {
bb.push_str("_");
cc.push_str("_");
println!("bb: {:?}, cc: {:?}", bb, cc);
}
Strangely, I can use ref mut in the patterns, even though neither _b nor _c are mutable, and I can mutate the strings. Why is that allowed here in this case? I'd expect this not to compile unless both _b and _c are declared as mutable similar to the first example above.
I think what may be happening is that a tuple is constructed in the pattern match, i.e. (_b, _c), and then some compiler magic allows the ref mut on the pattern that is "bound" to this tuple. Is that correct?
Rust version:
rustc 1.41.1 (f3e1a954d 2020-02-24)

The mut isn't necessary if the value is wrapped in a tuple, because the tuple is a new value (playground):
let a = Some(String::from("foo"));
if let (Some(mut aa),) = (a,) {
aa.push_str("_");
println!("aa: {:?}", aa)
}

Related

How to return a reference to something inside a moved box?

I want to store a Box in a map, and take the address of the contents of the Box and hand it out to C code. This is why I don't just store the value directly in the map, as map values move when a map grows.
I'd like a function that returns a reference to the contents of the Box, that I can use in Rust code and either use as-is in Rust code, or convert the returned reference to a raw pointer for C.
This code illustrates what I'm trying to do, but it doesn't work for obvious reasons. However, it's not clear to me how to fix it.
use std::collections::{HashMap};
fn box_and_ref<'a>(map: &'a mut HashMap<String, Box<Vec<u8>>>) -> &'a Vec<u8> {
let v = vec!{b'h', b'e', b'l', b'l', b'o'};
let b = Box::new(v);
let r = b.as_ref();
map.insert("foo".to_string(), b);
r
}
fn main() {
let mut map: HashMap<String, Box<Vec<u8>>> = HashMap::new();
let v = box_and_ref(&mut map);
println!("{:?}", v);
}
playground link
For any “avoid double lookup” issues, use the entry API:
fn box_and_ref<'a>(map: &'a mut HashMap<String, Box<Vec<u8>>>) -> &'a Vec<u8> {
let v = vec!{b'h', b'e', b'l', b'l', b'o'};
let b = Box::new(v);
map.entry("foo".to_string()).or_insert(b)
}

Mutate vector within filter

So, I have the following code successfully performing filter in vector:
let mut v1 : Vec<i32> = vec!(1,2,3);
let v2 : Vec<&mut i32> = v1.iter_mut().filter(|x| {**x == 2}).collect();
println!("{:?}", v2);
Since the type signature of the predicate in the filter function is
FnMut(&Self::Item) -> bool, I was assuming that that mutation inside
the closure will work:
let mut v1 : Vec<i32> = vec!(1,2,3);
let v2 : Vec<&mut i32> = v1.iter_mut().filter(|x| {**x = 3; **x == 2}).collect();
println!("{:?}", v2);
But the above code results in a compile error. How to fix that ? Note
that I'm playing with rust to get a better understanding, so the abpve
example doesn't make sense (usually, nobody will try to mutate
things inside filter).
You are confusing two concepts: FnMut means that a function can change its captured variables, like:
fn main() {
let v1 = vec![1, 2, 3];
let mut i = 0usize;
let v2: Vec<_> = v1
.into_iter()
.filter(|x| {
i = i + 1;
*x == 2
})
.collect();
println!("We iterate {} times and produce {:?}", i, v2);
}
This doesn't mean that every parameter of a function will be mutable.
In your code, filter() takes a &Self::Item, which is very different from the map() one that takes Self::Item. Because the real type will translate to Map<Item=&mut i32> and Filter<Item=&&mut i32>. Rust forbids you from mutating a reference if it's behind a non mutable reference:
fn test(a: &&mut i32) {
**a = 5;
}
error[E0594]: cannot assign to `**a` which is behind a `&` reference
This is because Rust follows the the-rules-of-references:
At any given time, you can have either one mutable reference or any number of immutable references.
References must always be valid.
This means you can have more than one &&mut but only one &mut &mut. If Rust didn't stop you, you could mutate a &&mut and that would poison any other &&mut.
Unfortunately the full error description of E0594 is still not available, see #61137.
Note: Avoid side effects when you use the iterator API, I think it's OK to mutate your FnMut state but not the item, you should do this in a for loop, like:
fn main() {
let mut v1 = vec![1, 2, 3];
for x in v1.iter_mut().filter(|x| **x == 2) {
*x = 1;
}
println!("{:?}", v1);
}

Using str and String interchangably

Suppose I'm trying to do a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
for t in v.iter_mut() {
if (t.contains("$world")) {
*t = &t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Rust has exactly what you want in form of a Cow (Clone On Write) type.
use std::borrow::Cow;
fn main() {
let mut v: Vec<_> = "Hello there $world!".split_whitespace()
.map(|s| Cow::Borrowed(s))
.collect();
for t in v.iter_mut() {
if t.contains("$world") {
*t.to_mut() = t.replace("$world", "Earth");
}
}
println!("{:?}", &v);
}
as #sellibitze correctly notes, the to_mut() creates a new String which causes a heap allocation to store the previous borrowed value. If you are sure you only have borrowed strings, then you can use
*t = Cow::Owned(t.replace("$world", "Earth"));
In case the Vec contains Cow::Owned elements, this would still throw away the allocation. You can prevent that using the following very fragile and unsafe code (It does direct byte-based manipulation of UTF-8 strings and relies of the fact that the replacement happens to be exactly the same number of bytes.) inside your for loop.
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
let p = pos + last_pos; // find always starts at last_pos
last_pos = pos + 5;
unsafe {
let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
s.remove(p); // remove $ sign
for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
*sc = c;
}
}
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
std::borrow::Cow, specifically used as Cow<'a, str>, where 'a is the lifetime of the string being parsed.
use std::borrow::Cow;
fn main() {
let mut v: Vec<Cow<'static, str>> = vec![];
v.push("oh hai".into());
v.push(format!("there, {}.", "Mark").into());
println!("{:?}", v);
}
Produces:
["oh hai", "there, Mark."]

Default mutable value from HashMap

Suppose I have a HashMap and I want to get a mutable reference to an entry, or if that entry does not exist I want a mutable reference to a new object, how can I do it? I've tried using unwrap_or(), something like this:
fn foo() {
let mut map: HashMap<&str, Vec<&str>> = HashMap::new();
let mut ref = map.get_mut("whatever").unwrap_or( &mut Vec::<&str>::new() );
// Modify ref.
}
But that doesn't work because the lifetime of the Vec isn't long enough. Is there any way to tell Rust that I want the returned Vec to have the same lifetime as foo()? I mean there is this obvious solution but I feel like there should be a better way:
fn foo() {
let mut map: HashMap<&str, Vec<&str>> = HashMap::new();
let mut dummy: Vec<&str> = Vec::new();
let mut ref = map.get_mut("whatever").unwrap_or( &dummy );
// Modify ref.
}
As mentioned by Shepmaster, here is an example of using the entry pattern. It seems verbose at first, but this avoids allocating an array you might not use unless you need it. I'm sure you could make a generic function around this to cut down on the chatter :)
use std::collections::HashMap;
use std::collections::hash_map::Entry::{Occupied, Vacant};
fn foo() {
let mut map = HashMap::<&str, Vec<&str>>::new();
let mut result = match map.entry("whatever") {
Vacant(entry) => entry.insert(Vec::new()),
Occupied(entry) => entry.into_mut(),
};
// Do the work
result.push("One thing");
result.push("Then another");
}
This can also be shortened to or_insert as I just discovered!
use std::collections::HashMap;
fn foo() {
let mut map = HashMap::<&str, Vec<&str>>::new();
let mut result = map.entry("whatever").or_insert(Vec::new());
// Do the work
result.push("One thing");
result.push("Then another");
}
If you want to add your dummy into the map, then this is a duplicate of How to properly use HashMap::entry? or Want to add to HashMap using pattern match, get borrow mutable more than once at a time (or any question about the entry API).
If you don't want to add it, then your code is fine, you just need to follow the compiler error messages to fix it. You are trying to use a keyword as an identifier (ref), and you need to get a mutable reference to dummy (& mut dummy):
use std::collections::HashMap;
fn foo() {
let mut map: HashMap<&str, Vec<&str>> = HashMap::new();
let mut dummy: Vec<&str> = Vec::new();
let f = map.get_mut("whatever").unwrap_or( &mut dummy );
}
fn main() {}

Converting from Option<String> to Option<&str>

Very often I have obtained an Option<String> from a calculation, and I would like to either use this value or a default hardcoded value.
This would be trivial with an integer:
let opt: Option<i32> = Some(3);
let value = opt.unwrap_or(0); // 0 being the default
But with a String and a &str, the compiler complains about mismatched types:
let opt: Option<String> = Some("some value".to_owned());
let value = opt.unwrap_or("default string");
The exact error here is:
error[E0308]: mismatched types
--> src/main.rs:4:31
|
4 | let value = opt.unwrap_or("default string");
| ^^^^^^^^^^^^^^^^
| |
| expected struct `std::string::String`, found reference
| help: try using a conversion method: `"default string".to_string()`
|
= note: expected type `std::string::String`
found type `&'static str`
One option is to convert the string slice into an owned String, as suggested by rustc:
let value = opt.unwrap_or("default string".to_string());
But this causes an allocation, which is undesirable when I want to immediately convert the result back to a string slice, as in this call to Regex::new():
let rx: Regex = Regex::new(&opt.unwrap_or("default string".to_string()));
I would rather convert the Option<String> to an Option<&str> to avoid this allocation.
What is the idiomatic way to write this?
As of Rust 1.40, the standard library has Option::as_deref to do this:
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_deref().unwrap_or("default string");
}
See also:
How can I iterate on an Option<Vec<_>>?
You can use as_ref() and map() to transform an Option<String> into an Option<&str>.
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_ref().map(|x| &**x).unwrap_or("default string");
}
First, as_ref() implicitly takes a reference on opt, giving an &Option<String> (because as_ref() takes &self, i.e. it receives a reference), and turns it into an Option<&String>. Then we use map to convert it to an Option<&str>. Here's what &**x does: the rightmost * (which is evaluated first) simply dereferences the &String, giving a String lvalue. Then, the leftmost * actually invokes the Deref trait, because String implements Deref<Target=str>, giving us a str lvalue. Finally, the & takes the address of the str lvalue, giving us a &str.
You can simplify this a bit further by using map_or to combine map and unwrap_or in a single operation:
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_ref().map_or("default string", |x| &**x);
}
If &**x looks too magical to you, you can write String::as_str instead:
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_ref().map_or("default string", String::as_str);
}
or String::as_ref (from the AsRef trait, which is in the prelude):
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_ref().map_or("default string", String::as_ref);
}
or String::deref (though you need to import the Deref trait too):
use std::ops::Deref;
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.as_ref().map_or("default string", String::deref);
}
For either of these to work, you need to keep an owner for the Option<String> as long as the Option<&str> or unwrapped &str needs to remain available. If that's too complicated, you could use Cow.
use std::borrow::Cow::{Borrowed, Owned};
fn main() {
let opt: Option<String> = Some("some value".to_owned());
let value = opt.map_or(Borrowed("default string"), |x| Owned(x));
}
A nicer way could be to implement this generically for T: Deref:
use std::ops::Deref;
trait OptionDeref<T: Deref> {
fn as_deref(&self) -> Option<&T::Target>;
}
impl<T: Deref> OptionDeref<T> for Option<T> {
fn as_deref(&self) -> Option<&T::Target> {
self.as_ref().map(Deref::deref)
}
}
which effectively generalizes as_ref.
Although I love Veedrac's answer (I used it), if you need it at just one point and you would like something that is expressive you can use as_ref(), map and String::as_str chain:
let opt: Option<String> = Some("some value".to_string());
assert_eq!(Some("some value"), opt.as_ref().map(String::as_str));
Here's one way you can do it. Keep in mind that you have to keep the original String around, otherwise what would the &str be a slice into?
let opt = Some(String::from("test")); // kept around
let unwrapped: &str = match opt.as_ref() {
Some(s) => s, // deref coercion
None => "default",
};
playpen

Resources