Why is "&&" being used in closure arguments? - reference

I have two questions regarding this example:
let a = [1, 2, 3];
assert_eq!(a.iter().find(|&&x| x == 2), Some(&2));
assert_eq!(a.iter().find(|&&x| x == 5), None);
Why is &&x used in the closure arguments rather than just x? I understand that & is passing a reference to an object, but what does using it twice mean?
I don't understand what the documentation says:
Because find() takes a reference, and many iterators iterate over references, this leads to a possibly confusing situation where the argument is a double reference. You can see this effect in the examples below, with &&x.
Why is Some(&2) used rather than Some(2)?

a is of type [i32; 3]; an array of three i32s.
[i32; 3] does not define an iter method itself, but when you call a method on it the compiler borrows the array and coerces it to a slice, &[i32].
&[i32] implements an iter method which produces an iterator.
This iterator implements Iterator<Item=&i32>.
It uses &i32 rather than i32 because the iterator has to work on arrays of any type, and not all types can be safely copied. So rather than restrict itself to copyable types, it iterates over the elements by reference rather than by value.
find is a method defined for all Iterators. It lets you look at each element and return the one that matches the predicate. Problem: if the iterator produces non-copyable values, then passing the value into the predicate would make it impossible to return it from find. The value cannot be re-generated, since iterators are not (in general) rewindable or restartable. Thus, find has to pass the element to the predicate by-reference rather than by-value.
So, if you have an iterator that implements Iterator<Item=T>, then Iterator::find requires a predicate that takes a &T and returns a bool. [i32]::iter produces an iterator that implements Iterator<Item=&i32>. Thus, Iterator::find called on an array iterator requires a predicate that takes a &&i32. That is, it passes the predicate a pointer to a pointer to the element in question.
So if you were to write a.iter().find(|x| ..), the type of x would be &&i32. This cannot be directly compared to the literal i32 value 2. There are several ways of fixing this. One is to explicitly dereference x: a.iter().find(|x| **x == 2). The other is to use pattern matching to destructure the double reference: a.iter().find(|&&x| x == 2). These two approaches are, in this case, doing exactly the same thing. [1]
As for why Some(&2) is used: because a.iter() is an iterator over &i32, not an iterator of i32. If you look at the documentation for Iterator::find, you'll see that for Iterator<Item=T>, it returns an Option<T>. Hence, in this case, it returns an Option<&i32>, so that's what you need to compare it against.
[1]: The differences only matter when you're talking about non-Copy types. For example, |&&x| .. wouldn't work on a &&String, because you'd have to be able to move the String out from behind the reference, and that's not allowed. However, |x| **x .. would work, because that is just reaching inside the reference without moving anything.
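To make the types above concrete, here is a minimal sketch of the two closure styles side by side, plus the non-Copy case from the footnote. The function names (find_by_pattern, find_by_deref, find_name) are made up for illustration and are not part of the original example.

```rust
// `a.iter()` yields `&i32`, and `find` hands the predicate a reference to
// each item, so the closure argument has type `&&i32`.

// Destructure the double reference in the pattern; `x` is a plain `i32`.
fn find_by_pattern(a: &[i32], wanted: i32) -> Option<&i32> {
    a.iter().find(|&&x| x == wanted)
}

// Keep `x` as `&&i32` and dereference it twice in the body instead.
fn find_by_deref(a: &[i32], wanted: i32) -> Option<&i32> {
    a.iter().find(|x| **x == wanted)
}

// `String` is not `Copy`, so `|&&s|` would try to move it out of the
// reference and fail to compile; working through the reference is fine.
fn find_name<'a>(names: &'a [String], wanted: &str) -> Option<&'a String> {
    names.iter().find(|s| s.as_str() == wanted)
}

fn main() {
    let a = [1, 2, 3];
    assert_eq!(find_by_pattern(&a, 2), Some(&2));
    assert_eq!(find_by_deref(&a, 5), None);

    let names = vec![String::from("ann"), String::from("bob")];
    assert_eq!(find_name(&names, "bob"), Some(&names[1]));
}
```

Both i32 versions return an Option<&i32>, which is why the assertions compare against Some(&2) rather than Some(2).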

1) I thought the book's explanation was good, but maybe my example with .cloned() below will be useful. Because .iter() iterates over references and find passes its predicate a reference to each item, the closure receives a reference to a reference.
2) .iter() is iterating over references; therefore, you find a reference.
You could use .cloned() to see what it would look like if you didn't have to deal with references:
assert_eq!(a.iter().cloned().find(|&x| x == 2), Some(2));

Related

Calculate object A, then return object B that references A in Rust

In my code I often want to calculate a new value A, and then return some view of that value B, because B is a type that's more convenient to work with. The simplest case is where A is a vector and B is a slice that I would like to return. Let's say I want to write a function that returns a set of indices. Ideally this would return a slice directly because then I can use it immediately to index a string.
If I return a vector instead of a slice, I have to use to_slice:
fn all_except(except: usize, max: usize) -> Vec<usize> {
    (0..except).chain((except + 1)..max).collect()
}
"abcdefg"[all_except(1, 7)]
string indices are ranges of `usize`
the type `str` cannot be indexed by `Vec<usize>`
help: the trait `SliceIndex<str>` is not implemented for `Vec<usize>`
I can't return a slice directly:
fn all_except(except: usize, max: usize) -> &[usize] {
    (0..except).chain((except + 1)..max).collect()
}
"abcdefg"[all_except(1, 7)]
^ expected named lifetime parameter
missing lifetime specifier
help: this function's return type contains a borrowed value with an elided lifetime, but the lifetime cannot be derived from the arguments
help: consider using the `'static` lifetime
I can't even return the underlying vector and a slice of it, for the same reason
pub fn except(index: usize, max: usize) -> (&[usize], Vec<usize>) {
    let v: Vec<usize> = (0..index).chain((index + 1)..max).collect();
    (v.as_slice(), v)
}
"abcdefg"[except(1, 7).0]
Now it may be possible to hack this particular example using deref coercion (I'm not sure), but I have encountered this problem with more complex types. For example, I have a function that loads an ndarray::Array2<T> from a CSV file and then splits it into two parts using array.split_at(). That returns two ArrayView2<T> values which reference the original Array2<T>, so I encounter the same issue. Is there a general solution to this problem? Can I somehow tell the compiler to move A into the parent frame's scope, or let me return a tuple of (A, B), where it realises that the slice is still valid because A is still alive?
Your code doesn't seem to make any sense: you can't index a string using a slice. If you could, the first snippet would have worked with just an as_slice in the caller or something, since Vecs trivially coerce to slices. That's exactly what the compiler error is telling you: the compiler is looking for a SliceIndex<str>, and a Vec (or slice) is definitely not that.
That aside,
Can I somehow tell the compiler to move A into the parent frame's scope, or let me return a tuple of (A, B), where it realises that the slice is still valid because A is still alive?
There are packages like owning_ref which can bundle owner and reference to avoid extra allocations. It tends to be somewhat fiddly.
I don't think there's any other general solution. Rust reasons at the function level, and the type checker has no notion of "move A into the parent scope". So you need a construct which works around the borrow checker.
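One common way to sidestep the problem without an extra crate is to return only the owner and let the caller borrow the view from it, so the borrow never has to leave the function that created the value. The sketch below assumes that picking characters by index is what was intended; pick_chars is a hypothetical helper, not something from the question.

```rust
// Return the owning Vec; nothing borrowed escapes this function.
fn all_except(except: usize, max: usize) -> Vec<usize> {
    (0..except).chain((except + 1)..max).collect()
}

// Hypothetical helper: takes the indices as a borrowed slice (the "view")
// and collects the characters at those positions, skipping any index that
// is out of range.
fn pick_chars(s: &str, indices: &[usize]) -> String {
    indices.iter().filter_map(|&i| s.chars().nth(i)).collect()
}

fn main() {
    // The caller keeps the owner alive...
    let indices = all_except(1, 7);
    // ...and borrows a slice from it only where a slice is needed.
    let picked = pick_chars("abcdefg", &indices);
    assert_eq!(picked, "acdefg");
}
```

The same shape applies to the ndarray case: return the Array2<T> and create the views in the caller, where the owner is guaranteed to live long enough.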

Convert Vec<Cow<'_, [u8]>> to &str

Using a lib (quick_xml) function (attributes()), I end up with a value of type Vec<Cow<'_, [u8]>>.
The exact line is e.attributes().map(|a| a.unwrap().value).collect::<Vec<_>>() and the printed value = [[116, 101, 115, 116]].
How can I convert it to a string ("test" in this case) so I can use it later?
I assume you are referencing this example. In the future, please give us the whole source code – this makes answering the question much easier.
Understanding the code
Let's take it one step at a time:
e.attributes().map(|a| a.unwrap().value).collect::<Vec<_>>()
^
e is a BytesStart struct, so it represents an opening XML tag, in your case <tag1 att1 = "test">.
e.attributes().map(|a| a.unwrap().value).collect::<Vec<_>>()
  ^^^^^^^^^^^^
This is the attributes method of BytesStart. It returns the Attributes struct, which represents the set of attributes that one tag has. In your case, that is only one attribute: it has the name att1 and the value test.
Attributes is an iterator, which means you can iterate over the contained Attribute values (note that Attributes contains multiple Attribute values – these are not the same type!). If you want to learn more about iterators, you may want to read the chapter about it in the Rust book.
e.attributes().map(|a| a.unwrap().value).collect::<Vec<_>>()
               ^^^^^^^^^^^^^^^^^^^^^^^^^
Here, we call the map method of the Iterator trait. It lets us transform one iterator (in this case the Attributes struct) into another iterator by transforming each value of the iterator. We call it with a closure (if you don't know what this is, the Rust book also has a chapter about this) that takes one value of the original iterator and returns the transformed value of the new iterator. Now, let's look at that closure:
|a| a.unwrap().value
^^^
This closure takes one argument named a, which is, as I said above, the type that the original iterator contains. I said above that Attributes contains multiple Attribute values. While this is true, it is not the full picture: the iterator iterates over Result<Attribute>, and that is the type of a.
|a| a.unwrap().value
    ^^^^^^^^^^
When operating normally, a will always be an instance of Result::Ok containing your Attribute, but if your XML is somehow invalid, a may also be a Result::Err to indicate some kind of parse error. We don't want to care about error handling here, so we just call the unwrap method of Result, which returns the contained Attribute and panics if there was an error.
|a| a.unwrap().value
               ^^^^^
The Attribute struct contains two values: key and value. We are only interested in value, so let's select it. The value field is of type Cow<'a, [u8]>. Cow is a smart pointer with some interesting properties that aren't really relevant here. If you want to learn more about it, you may be interested in the documentation of Cow (although this may be a bit too complicated for a Rust newbie). For the remainder of this explanation, I will just pretend value is of type &[u8] (a reference to a u8 slice).
We now have determined that the closure returns a &[u8], therefore the iterator returned by map iterates over &[u8].
e.attributes().map(|a| a.unwrap().value).collect::<Vec<_>>()
                                         ^^^^^^^^^^^^^^^^^^^
Now we call the collect method of Iterator, which transforms the iterator into a collection. The type of the collection is given as a generic parameter and is Vec<_>. The underscore tells rustc to infer the element type from context, or to report an error if that is not possible. The only type possible here is &[u8]; therefore, this method returns a Vec<&[u8]>.
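The walkthrough above can be reproduced without quick_xml. The following is only a sketch: Attr is a hypothetical stand-in for the library's Attribute type (assumed here to carry a key and a Cow value), and collect_values is a made-up name. It keeps the real Cow type instead of pretending the value is a &[u8].

```rust
use std::borrow::Cow;

// Hypothetical stand-in for quick_xml's `Attribute`; only the shape of the
// `value` field matters here.
struct Attr<'a> {
    #[allow(dead_code)]
    key: &'a [u8],
    value: Cow<'a, [u8]>,
}

// Same shape as the line being discussed: unwrap each `Result`, keep only
// the `value` field, and collect the values into a `Vec`.
fn collect_values<'a>(attrs: Vec<Result<Attr<'a>, String>>) -> Vec<Cow<'a, [u8]>> {
    attrs.into_iter().map(|a| a.unwrap().value).collect::<Vec<_>>()
}

fn main() {
    let attrs = vec![Ok(Attr { key: b"att1", value: Cow::Borrowed(&b"test"[..]) })];
    let values = collect_values(attrs);
    // Debug-printing shows the raw bytes of "test".
    println!("{:?}", values);
}
```

Debug-printing the result gives the bytes 116, 101, 115, 116, which matches the value printed in the question.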
The solution
You can use the unescape_and_decode_value method of Attribute. This transforms the Attribute value to a String and also unescapes escape sequences if the attribute contains them.
e.attributes().map(|a| a.unwrap().unescape_and_decode_value(&reader).unwrap()).collect::<Vec<_>>()
Note that this still returns a Vec<String>, not a String. The vector contains the values of all attributes assigned to this element. In this case, it's just the attribute value "test".
You can use std::str::from_utf8 for fallible conversion of &[u8] to &str:
use std::borrow::Cow;
fn main() {
    let s = "test";
    let v = s.as_bytes();
    let c = Cow::Borrowed(v);
    println!("{}", std::str::from_utf8(&*c).unwrap());
}
The crucial part is the deref and reborrow of the Cow, since from_utf8 takes a &[u8] rather than a Cow. Cow<'_, T> implements Deref<Target = T>; here T is [u8], so you can get a &[u8] via &*.
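In your concrete example, a &str obtained this way would borrow from the temporary Attribute and could not leave the closure, so collect owned Strings (a Vec<String>) instead:
e.attributes().map(|a| String::from_utf8(a.unwrap().value.into_owned()).unwrap()).collect::<Vec<_>>()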
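If the values have already been collected into a Vec<Cow<'_, [u8]>>, as in the question, the same from_utf8 call can borrow &str slices from that vector. This is a minimal sketch using only the standard library; as_strs is a made-up name, and it returns the UTF-8 error instead of unwrapping.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

// Borrow every value as `&str`. The returned slices point into `values`,
// so `values` must outlive the result.
fn as_strs<'a>(values: &'a [Cow<'_, [u8]>]) -> Result<Vec<&'a str>, Utf8Error> {
    // `&**v` goes from `&Cow<[u8]>` to `&[u8]`, which is what from_utf8 takes.
    values.iter().map(|v| std::str::from_utf8(&**v)).collect()
}

fn main() {
    let values: Vec<Cow<'_, [u8]>> = vec![Cow::Borrowed(&b"test"[..])];
    let strs = as_strs(&values).unwrap();
    assert_eq!(strs, vec!["test"]);
}
```

Collecting into a Result stops at the first value that is not valid UTF-8.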

In Rust, why does std::iter::Iterator's min function return a reference?

In Rust, why does std::iter::Iterator's min function return a reference?
Take this example from the documentation page linked above:
let a = vec![1, 2, 3];
assert_eq!(a.iter().min(), Some(&1));
Why is the result a reference to the value 1 wrapped inside the Option type instead of the literal value 1? This little detail tripped me up recently. I found I had to dereference the result after unwrapping it before I could use it in math operations.
Technically it does not: min() returns an Option<Self::Item> where Self is the Iterator. That is, min() returns whatever the iterator yields. Since the iterator is created via .iter() on the Vec, you get an Iterator over references, therefore min() returns a reference. If you use a.into_iter().min() you get an owned value.
Your question suggests you wondered about the Option as well: The iterator may not yield any items at all. In this case, min() has no value and None is returned.
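A short sketch of the three variants, to show how the item type of the iterator decides what min() returns. The function names are made up for illustration, and copied() is an addition not mentioned in the answer; it converts each &i32 into an i32.

```rust
// Borrowing iterator: items are `&i32`, so the minimum is an `Option<&i32>`.
fn min_ref(v: &[i32]) -> Option<&i32> {
    v.iter().min()
}

// `copied()` turns each `&i32` into an `i32` before `min` sees it.
fn min_copied(v: &[i32]) -> Option<i32> {
    v.iter().copied().min()
}

// Consuming iterator: items are `i32`, and the vector is used up.
fn min_owned(v: Vec<i32>) -> Option<i32> {
    v.into_iter().min()
}

fn main() {
    let a = vec![1, 2, 3];
    assert_eq!(min_ref(&a), Some(&1));
    assert_eq!(min_copied(&a), Some(1));
    // Dereference after unwrapping to use the value in arithmetic.
    assert_eq!(*min_ref(&a).unwrap() + 10, 11);
    assert_eq!(min_owned(a), Some(1));
    // An empty iterator has no minimum.
    assert_eq!(min_ref(&[]), None);
}
```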

Why does the argument for the find closure need two ampersands?

I have been playing with Rust by porting my Score4 AI engine to it - basing the work on my functional-style implementation in OCaml. I specifically wanted to see how Rust fares with functional-style code.
The end result: It works, and it's very fast - much faster than OCaml. It almost touches the speed of imperative-style C/C++ - which is really cool.
There's a thing that troubles me, though — why do I need two ampersands in the last line of this code?
let moves_and_scores: Vec<_> = moves_and_boards
    .iter()
    .map(|&(column,board)| (column, score_board(&board)))
    .collect();
let target_score = if maximize_or_minimize {
    ORANGE_WINS
} else {
    YELLOW_WINS
};
if let Some(killer_move) = moves_and_scores.iter()
    .find(|& &(_,score)| score==target_score) {
    ...
I added them because the compiler errors "guided" me to it; but I am trying to understand why... I used the trick mentioned elsewhere in Stack Overflow to "ask" the compiler to tell me what type something is:
let moves_and_scores: Vec<_> = moves_and_boards
    .iter()
    .map(|&(column,board)| (column, score_board(&board)))
    .collect();
let () = moves_and_scores;
...which caused this error:
src/main.rs:108:9: 108:11 error: mismatched types:
expected `collections::vec::Vec<(u32, i32)>`,
found `()`
(expected struct `collections::vec::Vec`,
found ()) [E0308]
src/main.rs:108 let () = moves_and_scores;
...as I expected, moves_and_scores is a vector of tuples: Vec<(u32, i32)>. But then, in the immediate next line, iter() and find() force me to use the hideous double ampersands in the closure parameter:
if let Some(killer_move) = moves_and_scores.iter()
    .find(|& &(_,score)| score==target_score) {
Why does the find closure need two ampersands? I could see why it may need one (pass the tuple by reference to save time/space) but why two? Is it because of the iter? That is, is the iter creating references, and then find expects a reference on each input, so a reference on a reference?
If this is so, isn't this, arguably, a rather ugly design flaw in Rust?
In fact, I would expect find and map and all the rest of the functional primitives to be parts of the collections themselves. Forcing me to iter() to do any kind of functional-style work seems burdensome, and even more so if it forces this kind of "double ampersands" in every possible functional chain.
I am hoping I am missing something obvious - any help/clarification most welcome.
This here
moves_and_scores.iter()
gives you an iterator over borrowed vector elements. If you look up its type in the API docs, you'll notice that it's just the iterator for a borrowed slice, and that it implements Iterator with Item=&T, where T is (u32, i32) in your case.
Then, you use find, which takes a predicate that takes a &Item as its parameter. Since Item already is a reference in your case, the predicate has to take a &&(u32, i32).
pub trait Iterator {
    ...
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
        where P: FnMut(&Self::Item) -> bool {...}
    ...                ^
It was probably defined like this because it's only supposed to inspect the item and return a bool. This does not require the item being passed by value.
If you want an iterator over (u32, i32) you could write
moves_and_scores.iter().cloned()
cloned() converts the iterator from one with an Item type &T to one with an Item type T if T is Clone. Another way to do it would be to use into_iter() instead of iter().
moves_and_scores.into_iter()
The difference between the two is that the first option clones the borrowed elements while the 2nd one consumes the vector and moves the elements out of it.
By writing the lambda like this
|&&(_, score)| score == target_score
you destructure the "double reference" and create a local copy of the i32. This is allowed since i32 is a simple type that is Copy.
Instead of destructuring the parameter of your predicate you could also write
|move_and_score| move_and_score.1 == target_score
because the dot operator automatically dereferences as many times as needed.
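The options described in this answer can be compared in one small sketch. The function names and the sample scores are invented for illustration; only the closure shapes come from the answer.

```rust
// Destructure the `&&(u32, i32)` argument; `score` is a copied `i32`.
fn find_destructured(scores: &[(u32, i32)], target: i32) -> Option<&(u32, i32)> {
    scores.iter().find(|&&(_, score)| score == target)
}

// Leave the argument as `&&(u32, i32)`; field access auto-dereferences.
fn find_with_dot(scores: &[(u32, i32)], target: i32) -> Option<&(u32, i32)> {
    scores.iter().find(|move_and_score| move_and_score.1 == target)
}

// `cloned()` makes the item type `(u32, i32)`, so the predicate gets a
// single reference and `find` returns an owned tuple.
fn find_owned(scores: &[(u32, i32)], target: i32) -> Option<(u32, i32)> {
    scores.iter().cloned().find(|&(_, score)| score == target)
}

fn main() {
    let scores = vec![(0u32, -5i32), (3, 1000), (4, 7)];
    assert_eq!(find_destructured(&scores, 1000), Some(&(3, 1000)));
    assert_eq!(find_with_dot(&scores, 7), Some(&(4, 7)));
    assert_eq!(find_owned(&scores, 1000), Some((3, 1000)));
}
```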

Why does cloned() allow this function to compile

I'm starting to learn Rust and I tried to implement a function to reverse a vector of strings. I found a solution but I don't understand why it works.
This works:
fn reverse_strings(strings: Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.iter().cloned().rev().collect();
    return actual;
}
But this doesn't.
fn reverse_strings(strings: Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.iter().rev().collect(); // without clone
    return actual;
}
Error message
src/main.rs:28:10: 28:16 error: mismatched types:
expected `collections::vec::Vec<&str>`,
found `collections::vec::Vec<&&str>`
(expected str,
found &-ptr) [E0308]
Can someone explain to me why? What happens in the second function? Thanks!
So the call to .cloned() is essentially like doing .map(|i| i.clone()) in the same position (i.e. you can replace the former with the latter).
The thing is that when you call iter(), you're iterating/operating on references to the items being iterated. Notice that the vector already consists of 'references', specifically string slices.
So to zoom in a bit, let's replace cloned() with the equivalent map() that I mentioned above, for pedagogical purposes, since they are equivalent. This is what it actually looks like:
.map(|i: & &str| i.clone())
So notice that that's a reference to a reference (slice), because like I said, iter() operates on references to the items, not the items themselves. So since a single element in the vector being iterated is of type &str, then we're actually getting a reference to that, i.e. & &str. By calling clone() on each of these items, we go from a & &str to a &str, just like calling .clone() on a &i64 would result in an i64.
So to bring everything together, iter() iterates over references to the elements. So if you create a new vector from the collected items yielded by the iterator you construct (which you constructed by calling iter()) you would get a vector of references to references, that is:
let actual: Vec<& &str> = strings.iter().rev().collect();
So first of all realize that this is not the same as the type you're saying the function returns, Vec<&str>. More fundamentally, however, the lifetimes of these references would be local to the function, so even if you changed the return type to Vec<& &str> you would get a lifetime error.
Something else you could do, however, is to use the into_iter() method. This method actually does iterate over each element, not a reference to it. However, this means that the elements are moved from the original iterator/container. This is only possible in your situation because you're passing the vector by value, so you're allowed to move elements out of it.
fn reverse_strings(strings: Vec<&str>) -> Vec<&str> {
    let actual: Vec<_> = strings.into_iter().rev().collect();
    return actual;
}
This probably makes a bit more sense than cloning, since we are passed the vector by value, we're allowed to do anything with the elements, including moving them to a different location (in this case the new, reversed vector). And even if we don't, the vector will be dropped at the end of that function anyways, so we might as well. Cloning would be more appropriate if we're not allowed to do that (e.g. if we were passed the vector by reference, or a slice instead of a vector more likely).
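For the by-reference case mentioned at the end, a minimal sketch might look like this. It assumes the caller passes a slice of string slices; the explicit lifetime 'a is there so the returned &str values are tied to the string data rather than to the borrowed slice.

```rust
// Taking a slice means the elements cannot be moved out, so each `&str` is
// cloned instead (a cheap pointer-and-length copy, not a copy of the text).
fn reverse_strings<'a>(strings: &[&'a str]) -> Vec<&'a str> {
    strings.iter().cloned().rev().collect()
}

fn main() {
    let words = vec!["one", "two", "three"];
    let reversed = reverse_strings(&words);
    assert_eq!(reversed, vec!["three", "two", "one"]);
    // The original vector is still usable because it was only borrowed.
    assert_eq!(words.len(), 3);
}
```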
