&&str.to_owned() doesn't result in a String - reference

I've got the following code:
    use std::collections::HashMap;

    fn main() {
        let xs: Vec<&str> = vec!("a", "b", "c", "d");
        let ys: Vec<i32> = vec!(1, 2, 3, 4);
        let mut map: HashMap<String, i32> = HashMap::new();
        for (x, y) in xs.iter().zip(ys) {
            map.insert(x.to_owned(), y);
        }
        println!("{:?}", map);
    }
Which results in error:
    <anon>:8:20: 8:32 error: mismatched types:
     expected `collections::string::String`,
        found `&str`
    (expected struct `collections::string::String`,
        found &-ptr) [E0308]
    <anon>:8     map.insert(x.to_owned(), y);
But it doesn't make sense to me. x should be &&str at this point. So why doesn't &&str.to_owned() automagically Deref the same way x.to_string() does at this point? (Why is x.to_owned() a &str?)
I know I can fix this by either using x.to_string(), or xs.into_iter() instead.

Because ToOwned is implemented for T where T: Clone, and Clone is implemented for &T. You need to roughly understand how pattern matching on &self works when both T and &T are available. Using a pseudo-syntax for exposition:

str → String
  - str doesn't match &self
  - &str (auto-ref) matches &self with self == str
  - Thus ToOwned<str> kicks in.

&str → String
  - &str matches &self with self == str
  - Thus ToOwned<str> kicks in.

&&str → &str
  - &&str matches &self with self == &str
  - Thus ToOwned<&T> kicks in.
Note that in this case, auto-deref can never kick in, since &T will always match in cases where T might, which lowers the complexity a bit. Note also that auto-ref only kicks in once (and once more for each auto-deref'd type).
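Concretely, here's a small sketch of the resolution described above (the type annotations just confirm what each call produces):

    fn main() {
        let x: &&str = &"hi";
        let a = x.to_owned();    // resolves with Self == &str (the Clone-based impl): `a` is &str
        let b = (*x).to_owned(); // resolves with Self == str: `b` is String
        let _: &str = a;
        let _: String = b;
    }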
To copy from huon's much better answer than mine,
The core of the algorithm is:
  - For each "dereference step" U (that is, set U = T and then U = *T, ...):
      - if there's a method bar where the receiver type (the type of self in the method) matches U exactly, use it (a "by value method");
      - otherwise, add one auto-ref (take & or &mut of the receiver), and, if some method's receiver matches &U, use it (an "autoref'd method").
FWIW, .into() is normally prettier than .to_owned() (especially when types are implied; oft even when not), so I suggest that here. You still need a manual dereference, though.
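Applied to the question's code, any of the fixes already mentioned compiles; a quick sketch:

    use std::collections::HashMap;

    fn main() {
        let xs: Vec<&str> = vec!["a", "b", "c", "d"];
        let ys: Vec<i32> = vec![1, 2, 3, 4];
        let mut map: HashMap<String, i32> = HashMap::new();
        for (x, y) in xs.iter().zip(ys) {
            // `x` is `&&str` here; one explicit dereference makes
            // ToOwned for str (or Into<String>) the impl that applies.
            map.insert((*x).to_owned(), y);
            // Equivalent fixes from the question and answer:
            // map.insert(x.to_string(), y);
            // map.insert((*x).into(), y);
        }
        println!("{:?}", map);
    }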

Why do I need the type annotation here?

In this leetcode invert binary tree problem, I'm trying to borrow a node wrapped in an Rc mutably. Here is the code.
    use std::rc::Rc;
    use std::cell::RefCell;

    impl Solution {
        pub fn invert_tree(root: Option<Rc<RefCell<TreeNode>>>) -> Option<Rc<RefCell<TreeNode>>> {
            let mut stack: Vec<Option<Rc<RefCell<TreeNode>>>> = vec![root.clone()];
            while stack.len() > 0 {
                if let Some(node) = stack.pop().unwrap() {
                    let n: &mut TreeNode = &mut node.borrow_mut();
                    std::mem::swap(&mut n.left, &mut n.right);
                    stack.extend(vec![n.left.clone(), n.right.clone()]);
                }
            }
            root
        }
    }
If I change the line let n: &mut TreeNode to just let n = &mut node.borrow_mut(), I get a compiler error on the next line, "cannot borrow *n as mutable more than once at a time"
It seems like the compiler infers n to be of type &mut RefMut<TreeNode>, but it all works out when I explicitly say it is &mut TreeNode. Any reason why?
A combination of borrow splitting and deref-coercion causes the seemingly identical code to behave differently.
The compiler infers n to be of type RefMut<TreeNode>, because that's what borrow_mut actually returns:
    pub fn borrow_mut(&self) -> RefMut<'_, T>
RefMut is a funny little type that's designed to look like a &mut, but it's actually a separate thing. It implements Deref and DerefMut, so it will happily pretend to be a &mut TreeNode when needed. But Rust is still inserting calls to .deref() in there for you.
Now, why does one work and not the other? Without the type annotation, after deref insertion, you get
    let n = &mut node.borrow_mut();
    std::mem::swap(&mut n.deref_mut().left, &mut n.deref_mut().right);
So we're trying to call deref_mut (which takes a &mut self) twice in the same line on the same variable. That's not allowed by Rust's borrow rules, so it fails.
(Note that the &mut on the first line simply borrows an owned value for no reason. Temporary lifetime extension lets us get away with this, even though you don't need the &mut at all in this case)
Now, on the other hand, if you do put in the type annotation, then Rust sees that borrow_mut returns a RefMut<'_, TreeNode> but you asked for a &mut TreeNode, so it inserts the deref_mut on the first line. You get
    let n: &mut TreeNode = &mut node.borrow_mut().deref_mut();
    std::mem::swap(&mut n.left, &mut n.right);
Now the only deref_mut call is on the first line. Then, on the second line, we access n.left and n.right, both mutably, simultaneously. It looks like we're accessing n mutably twice at once, but Rust is actually smart enough to see that we're accessing two disjoint parts of n simultaneously, so it allows it. This is called borrow splitting. Rust will split borrows on different instance fields, but it's not smart enough to see the split across a deref_mut call (function calls could, in principle, do anything, so Rust's borrow checker refuses to try to do advanced reasoning about their return value).
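For illustration, a minimal sketch of borrow splitting on its own (the Pair struct is made up for the example):

    struct Pair {
        left: i32,
        right: i32,
    }

    fn main() {
        let mut p = Pair { left: 1, right: 2 };
        // Two simultaneous mutable borrows are allowed because they target
        // disjoint fields of the same struct (a split borrow).
        std::mem::swap(&mut p.left, &mut p.right);
        println!("{} {}", p.left, p.right); // prints "2 1"
    }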

Clarification on Deref coercion

Consider this example:
    fn main() {
        let string: String = "A string".to_string();
        let string_ref: &String = &string;

        let str_ref_a: &str = string_ref; // A
        let str_ref_b: &str = &string;    // B
    }
How exactly are lines A and B different? string_ref is of type &String, so my understanding is that in line A we have an example of Deref coercion. What about line B though? Is it correct to say that it has nothing to do with Deref coercion and we simply have "a direct borrowing of a String as str", due to this:
    impl Borrow<str> for String {
        #[inline]
        fn borrow(&self) -> &str {
            &self[..]
        }
    }
Both are essentially equivalent and involve deref coercion:
    let str_ref_a: &str = string_ref; // A
    let str_ref_b: &str = &string;    // B
The value string above is of type String, so the expression &string is of type &String which coerces into &str due to deref coercion as String implements Deref<Target=str>.
Regarding your question about Borrow: No, you aren't calling borrow() on the string value. Instead, that would be:
    let str_ref_b: &str = string.borrow(); // Borrow::borrow() on String
That is, unlike deref(), the call to borrow() isn't inserted automatically.
No. Both lines involve deref coercion.
The Borrow trait is not special in any way - it is not known to the compiler (not a lang item). The Deref trait is.
The difference between Deref and Borrow (and also AsRef) is that Deref can only have one implementation per type (since Target is an associated type and not a generic parameter), while AsRef (and Borrow) take a generic parameter and thus can be implemented multiple times. This is because Deref is for smart pointers: you should implement Deref<Target = T> if your type is a T (note that the accurate definition of "smart pointer" is in flux). A String is a str (plus additional features), a Vec<T> is a slice, and a Box<T> is a T.
On the other hand, AsRef and Borrow are conversion traits. You should implement AsRef<T> if your type can be viewed as a T. A String can be viewed as a str. But take, for example, str and OsStr: a str is not an OsStr, but it can be treated as if it were one, so str implements AsRef<OsStr>. Borrow is the same as AsRef except that it has additional requirements (namely, the borrowed form must have the same Eq, Hash and Ord behavior as the original value).
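A small sketch of the practical difference: the deref coercion is inserted automatically and has a single possible target, while AsRef conversions are explicit calls that can target several types (the variable names here are just for the example):

    use std::ffi::OsStr;

    fn main() {
        let s: String = "hello".to_string();

        // Deref coercion: inserted automatically by the compiler,
        // and only one target type (str) is possible.
        let a: &str = &s;

        // AsRef: called explicitly. String implements both AsRef<str>
        // and AsRef<OsStr>, something a single Deref impl could not express.
        let b: &str = s.as_ref();
        let c: &OsStr = s.as_ref();

        println!("{a} {b} {c:?}");
    }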

Proper signature for a function accepting an iterator of strings

I'm confused about the proper type to use for an iterator yielding string slices.
    fn print_strings<'a>(seq: impl IntoIterator<Item = &'a str>) {
        for s in seq {
            println!("- {}", s);
        }
    }

    fn main() {
        let arr: [&str; 3] = ["a", "b", "c"];
        let vec: Vec<&str> = vec!["a", "b", "c"];
        let it: std::str::Split<'_, char> = "a b c".split(' ');

        print_strings(&arr);
        print_strings(&vec);
        print_strings(it);
    }
Using <Item = &'a str>, the arr and vec calls don't compile. If, instead, I use <Item = &'a'a str>, they work, but the it call doesn't compile.
Of course, I can make the Item type generic too, and do
    fn print_strings<'a, I: std::fmt::Display>(seq: impl IntoIterator<Item = I>)
but it's getting silly. Surely there must be a single canonical "iterator of string values" type?
The error you are seeing is expected, because seq is &Vec<&str>, and &Vec<T> implements IntoIterator with Item = &T. So with your code you end up with Item = &&str where you are expecting Item = &str in all cases.
The correct way to do this is to expand the Item type so that it can handle both &str and &&str. You can do this by using more generics, e.g.
    fn print_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) {
        for s in seq {
            let s = s.as_ref();
            println!("- {}", s);
        }
    }
This requires the Item to be something that you can retrieve a &str from, and then in your loop .as_ref() will return the &str you are looking for.
This also has the added bonus that your code will also work with Vec<String> and any other type that implements AsRef<str>.
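For instance, with the print_strings above, the question's callers compile unchanged, and so does a Vec<String> (a quick sketch):

    fn main() {
        let arr: [&str; 3] = ["a", "b", "c"];
        let strings: Vec<String> = vec!["a".to_string(), "b".to_string()];

        print_strings(&arr);               // Item = &&str, and &&str: AsRef<str>
        print_strings(&strings);           // Item = &String, and &String: AsRef<str>
        print_strings("a b c".split(' ')); // Item = &str
    }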
TL;DR: The signature you use is fine; it's the callers that are providing iterators with the wrong Item, which is easily fixed.
As explained in the other answer, print_strings() doesn't accept &arr and &vec because IntoIterator for &[T; N] and &Vec<T> yields references to T. This is because &Vec, itself a reference, is not allowed to consume the Vec in order to move T values out of it. What it can do is hand out references to T items sitting inside the Vec, i.e. items of type &T. In the case of your callers that don't compile, the containers contain &str, so their iterators hand out &&str.
Other than making print_strings() more generic, another way to fix the issue is to call it correctly to begin with. For example, these all compile:
    print_strings(arr.iter().map(|sref| *sref));
    print_strings(vec.iter().copied());
    print_strings(it);
iter() is the method provided by slices (and therefore available on arrays and Vec) that iterates over references to elements, just like IntoIterator of &Vec. We call it explicitly to be able to call map() to convert &&str to &str the obvious way - by using the * operator to dereference the &&str. The copied() iterator adapter is another way of expressing the same, possibly a bit less cryptic than map(|x| *x). (There is also cloned(), equivalent to map(|x| x.clone()).)
It's also possible to call print_strings() if you have a container with String values:
    let v = vec!["foo".to_owned(), "bar".to_owned()];
    print_strings(v.iter().map(|s| s.as_str()));

Dereferencing strings and HashMaps in Rust

I'm trying to understand how HashMaps work in Rust and I have come up with this example.
    use std::collections::HashMap;

    fn main() {
        let mut roman2number: HashMap<&'static str, i32> = HashMap::new();
        roman2number.insert("X", 10);
        roman2number.insert("I", 1);

        let roman_num = "XXI".to_string();
        let r0 = roman_num.chars().take(1).collect::<String>();
        let r1: &str = &r0.to_string();

        println!("{:?}", roman2number.get(r1)); // This works
        // println!("{:?}", roman2number.get(&r0.to_string())); // This doesn't
    }
When I try to compile the code with last line uncommented, I get the following error
    error: the trait bound `&str: std::borrow::Borrow<std::string::String>` is not satisfied [E0277]
        println!("{:?}", roman2number.get(&r0.to_string()));
                                          ^~~
    note: in this expansion of format_args!
    note: in this expansion of print! (defined in <std macros>)
    note: in this expansion of println! (defined in <std macros>)
    help: run `rustc --explain E0277` to see a detailed explanation
The Trait implementation section of the docs gives the dereferencing as fn deref(&self) -> &str
So what is happening here?
The error is caused by the compiler selecting, during type inference, the generic function HashMap::get over String, but you want HashMap::get over str.
So just change
    println!("{:?}", roman2number.get(&r0.to_string()));
to
    println!("{:?}", roman2number.get::<str>(&r0.to_string()));
to make it explicit. This helps the compiler to select the right function.
It looks to me like the Deref<Target> coercion can only happen when we know the target type. When the compiler tries to infer which HashMap::get to use, it sees &r0.to_string() as type &String but never as &str, and &'static str does not implement Borrow<String>, which results in a type error. When we specify HashMap::get::<str>, the function expects &str, and the coercion can then be applied to &String to get the matching &str.
You can check out Deref coercion and String Deref for more details.
The other answers are correct, but I wanted to point out that you have an unneeded to_string (you've already collected into a String) and an alternate way of coercing to a &str, using as:
    let r0: String = roman_num.chars().take(1).collect();
    println!("{:?}", roman2number.get(&r0 as &str));
In this case, I'd probably just rewrite the map to contain char as the key though:
    use std::collections::HashMap;

    fn main() {
        let mut roman2number = HashMap::new();
        roman2number.insert('X', 10);
        roman2number.insert('I', 1);

        let roman_num = "XXI";
        for c in roman_num.chars() {
            println!("{:?}", roman2number.get(&c));
        }
    }
Note there's no need to have an explicit type for the map, it will be inferred.
The definition of the get method looks as follows
    fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
        where K: Borrow<Q>, Q: Hash + Eq
The first part is the type of the object you pass in: Q. There are two constraints on Q:
  - the key type K needs to implement the Borrow trait over Q, and
  - Q needs to implement the Hash and Eq traits.
Replacing this with your actual types means that the key-type &'static str needs to implement Borrow<String>. By the definition of Borrow, this means that a &'static str needs to be convertible to &String. But all the docs/texts I've read state that everywhere you'd use &String you should be using &str instead. So it makes little sense to offer a &str -> &String conversion, even if it would make life a little easier sometimes.
Since every reference type is borrowable as a shorter-lived reference type, &'static str implements Borrow<str>, so you can pass a &str when &'static str is the key type.
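A quick sketch of lookups that rely on that Borrow<str> implementation:

    use std::collections::HashMap;

    fn main() {
        let mut roman2number: HashMap<&'static str, i32> = HashMap::new();
        roman2number.insert("X", 10);

        let r0: String = "X".to_string();
        // &'static str implements Borrow<str>, so any &str works as a lookup key:
        println!("{:?}", roman2number.get(r0.as_str())); // Some(10)
        println!("{:?}", roman2number.get(&*r0));        // same thing
    }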

How does Vec<T> implement iter()?

I am looking at the code of Vec<T> to see how it implements iter() as I want to implement iterators for my struct:
    pub struct Column<T> {
        name: String,
        vec: Vec<T>,
        ...
    }
My goal is to not expose the fields, but instead provide iterators to do looping, max, min, sum, avg, etc. for a column.
    fn test() {
        let col: Column<f32> = ...;
        let max = col.iter().max();
    }
I thought I would see how Vec<T> does iteration. I can see that iter() is defined in SliceExt, but it's implemented for [T] and not Vec<T>, so I am stumped as to how you can call iter() on a Vec<T>.
Indeed, as fjh said, this happens due to how the dereference operator functions in Rust and how methods are resolved.
Rust has a special Deref trait which allows values of the types implementing it to be "dereferenced" to obtain another type, usually one which is naturally connected to the source type. For example, an implementation like this one:
    impl<T> Deref for Vec<T> {
        type Target = [T];

        fn deref<'a>(&'a self) -> &'a [T] { self.as_slice() }
    }
means that applying the unary * operator to a Vec<T> would yield [T], which you would need to borrow again:
    let v: Vec<u32> = vec![0; 10];
    let s: &[u32] = &*v;
(note that even though deref() returns a reference, the dereference operator * returns Target, not &Target - the compiler inserts automatic dereference if you do not borrow the dereferenced value immediately).
This is the first piece of puzzle. The second one is how methods are resolved. Basically, when you write something like
    v.iter()
the compiler first tries to find iter() defined on the type of v (in this case Vec<u32>). If no such method can be found, the compiler tries to insert an appropriate number of *s and &s so that the method invocation becomes valid. In this case it finds that the following is indeed a valid invocation:
    (&*v).iter()
Remember, Deref on Vec<T> returns &[T], and slices do have iter() method defined on them. This is also how you can invoke e.g. a method taking &self on a regular value - the compiler automatically inserts a reference operation for you.
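Applying the same idea to the Column<T> from the question, one possible sketch (assuming only the fields shown; the fold is used because f32 is not Ord, so Iterator::max() is unavailable):

    pub struct Column<T> {
        name: String,
        vec: Vec<T>,
    }

    impl<T> Column<T> {
        // Expose iteration by delegating to the slice's iter(),
        // just as Vec does through Deref.
        pub fn iter(&self) -> std::slice::Iter<'_, T> {
            self.vec.iter()
        }
    }

    fn main() {
        let col = Column { name: "price".to_string(), vec: vec![1.0f32, 3.5, 2.2] };
        // Fold with f32::max instead of calling Iterator::max().
        let max = col.iter().cloned().fold(f32::MIN, f32::max);
        println!("max of {}: {}", col.name, max);
    }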
