Value and references when providing a closure to Iterator::find - rust

I have still quite a long way to go in learning Rust, but I find the way values and references are used to be inconsistent. This may be born from my own ignorance of the language.
For example, this works:
let x = (1..100).find(|a| a % 2 == 0);
But let x = (1..100).find(|a| a > 50); does not, and I am not sure why.
Using let x = (1..100).find(|&a| a > 50); fixes the error, but then I thought using &a is like asking for reference of element from the range and hence the following should work, but it does not:
let x = (1..100).find(|&a| *a > 50);
Again no idea why!

but then I thought using &a is like asking for reference of element from the range
This is the wrong part of your reasoning. Using & in a pattern does exactly the opposite: it dereferences the matched value and binds the result:
let &a = &10;
// a is 10, not &10 or &&10
As you probably already know, find() accepts a closure which satisfies FnMut(&T) -> bool, that is, this closure accepts a reference to each element of the iterator, so if you write (1..100).find(|a| ...), a will be of type &i32.
let x = (1..100).find(|a| a % 2 == 0) works because arithmetic operators are overloaded to work on references, so you can apply % to a reference and it still would be able to compile.
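Comparison operators are only implemented between operands at the same reference level (i32 with i32, or &i32 with &i32), not between a reference and a plain value. Here a > 50 compares an &i32 with an integer, so you need to get an i32 from the &i32. This can be done in two ways. First, as you already did: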
let x = (1..100).find(|&a| a > 50)
Here we use & patterns to implicitly dereference the function argument. It is equivalent to this one:
let x = (1..100).find(|a| { let a = *a; a > 50 })
Another way would be to dereference the argument explicitly:
let x = (1..100).find(|a| *a > 50)
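The variants discussed above can be put side by side in a small sketch. The helper names numbers and first_above are made up for this illustration, and the range is pinned to i32 to keep type inference out of the picture:

```rust
// Hypothetical helper: the range from the question, pinned to i32.
fn numbers() -> std::ops::Range<i32> {
    1..100
}

// Hypothetical helper: three ways to test an element that `find` passes as `&i32`.
fn first_above(limit: i32) -> Option<i32> {
    // `&a` pattern: `a` is bound to the i32 behind the reference.
    let by_pattern = numbers().find(|&a| a > limit);
    // Explicit dereference in the closure body.
    let by_deref = numbers().find(|a| *a > limit);
    // Compare reference with reference instead of unwrapping.
    let by_ref = numbers().find(|a| a > &limit);

    assert_eq!(by_pattern, by_deref);
    assert_eq!(by_deref, by_ref);
    by_pattern
}

fn main() {
    println!("{:?}", first_above(50));
}
```

All three closures receive the same &i32; they differ only in where the reference is removed or matched.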

I thought using &a is like asking for reference of element from the range
Sometimes & is used as an operator, and sometimes it is used in a pattern. For the closure parameter (|&a|), it is being used as a pattern. This means that the variable a is bound to the value behind the reference, so it is already an i32 and cannot be dereferenced again. It is also equivalent to do
let x = (1..100).find(|a| *a > 50);

Non-trivial patterns usually destructure something, i.e., break something into its components. This usually mirrors some construction syntax, so it looks very similar but is actually the inverse. This dualism applies to records, to tuples, to boxes (once those are properly implemented), and also to references:
The expression &x creates a reference to whatever x evaluates to. Here, the & turns a value of type T into one of type &T.
The pattern &a, on the other hand, eliminates the reference, so a is bound to what is behind the reference (note that a could also be another, more complicated pattern). Here, the & goes from a &T value to a T value.
The closures in your examples are all of type &i32 -> bool [1]. So they accept a reference to an integer, and you can either work with that reference (which you do in the first example, which works because arithmetic operators are overloaded for references too) or you can use the pattern &a. In the latter case, a is an i32 (compare the general explanation above, substitute i32 for T), so of course you can't dereference it further.
[1] This is not actually a real type, but it's close enough for our purposes.
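The duality between the expression &x and the pattern &a can be seen in a minimal sketch. The function name round_trip is invented for this example:

```rust
// Hypothetical helper: build a reference with an expression, then take it
// apart again with a pattern.
fn round_trip(x: i32) -> i32 {
    let r: &i32 = &x; // expression `&x`: goes from i32 to &i32
    let &a = r;       // pattern `&a`: goes from &i32 back to i32
    // `a` is a plain i32 here, so writing `*a` would not compile.
    a
}

fn main() {
    assert_eq!(round_trip(10), 10);
    println!("{}", round_trip(10));
}
```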

Related

Why does Rust fail to correctly infer the type?

Why does this work fine:
let items = [1, 2, 3];
let mut cumulator = 0;
for next in items.iter() {
cumulator += next;
}
println!("Final {}", cumulator);
But this fail?:
let items = [1, 2, 3];
let mut cumulator = 0;
for next in items.iter() {
cumulator += next.pow(2);
}
println!("Final {}", cumulator);
Error on .pow(2):
no method named `pow` found for reference `&{integer}` in the current scope
method not found in `&{integer}`rustc (E0599)
My IDE identifies next as i32 and the first code example works fine. But the compiler has an issue the moment I reference next.pow() or any function on next. The compiler complains that next is an ambiguous integer type.
Sure, I can fix this either by explicitly declaring the array as [i32; 3] or by using an interim variable, explicitly declared as i32, before cumulator. But these seem unnecessary and a bit clunky.
So why is compiler happy in the first case and not in the second?
Calling methods on objects is kind of funny, because it conveys zero information. That is, if I write
a + b
Then, even if Rust knows nothing about a and b, it can now assume that a is Add where the Rhs type is the type of b. We can refine the types and, hopefully, get more information down the road. Similarly, if I write foobar(), where foobar is a local variable, then Rust knows it has to be at least FnOnce.
However, if foo is a variable, and I write
foo.frobnicate()
Then Rust has no idea what to do with that. Is it an inherent impl on some type? Is it a trait function? It could be literally anything. If it's inherent, then it could even be in a module that we haven't imported, so we can't simply check everything that's in scope.
In your case, pow isn't even a trait function, it's actually several different functions. Even if it was a trait function, we couldn't say anything, because we don't, a priori, know which trait. So Rust sees next.pow(2) and bails out immediately, rather than trying to do something unexpected.
In your other case, Rust is able to infer the type. At the end of the function, all it knows about the type is that it's an {integer} on which Add is defined, and Rust has integer defaulting rules that kick in to turn that into i32, in the absence of any other information.
Could they have applied integer defaulting to next.pow(2)? Possibly, but I'm glad they didn't. Integers are already a special case in Rust (integers and floats are the only types with polymorphic literals), so minimizing the amount of special casing required by the compiler is, at least in my mind, a good thing. The defaulting rules kicked in in the first example because nothing caused it to bail out, and they would have in the second if it hadn't already encountered the bigger error condition of "calling an impl function on an unknown type".
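As a sketch of one possible workaround, assuming i32 is the intended element type: giving a single literal a type suffix is enough for the compiler to know the concrete type before it reaches the method call. The function name sum_of_squares is made up for this example:

```rust
// Hypothetical helper: the failing loop from the question, with the element
// type pinned by a literal suffix.
fn sum_of_squares() -> i32 {
    // `1_i32` fixes the array type to [i32; 3], so `next` is a `&i32`.
    let items = [1_i32, 2, 3];
    let mut cumulator = 0;
    for next in items.iter() {
        // Method lookup now knows the concrete type and finds `i32::pow`.
        cumulator += next.pow(2);
    }
    cumulator
}

fn main() {
    println!("Final {}", sum_of_squares());
}
```

An explicit annotation such as let items: [i32; 3] = [1, 2, 3]; would have the same effect.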

How do references work in patterns in binding expressions? [duplicate]

I came across the example below in the Rust book.
for &item in list.iter() {
if item > largest {
largest = item;
}
}
I suppose it means list.iter() returns the reference to the elements in the list hence &item but while comparing it with largest number why are we not using *item? Also, when I change the &item to item in the first line I am forced to use *item in 2nd and 3rd line by the compiler.
I have seen another example online.
(0..).map(|x| x * x)
.take_while(|&x| x <= limit)
.filter(|x| is_even(*x))
Here the closure in take_while accepts &x but uses x directly but the closure in filter takes x without reference but passes *x to is_even.
So how does this work in Rust?
What you are seeing here is called destructuring. This is a feature where you can take apart a structure with a pattern.
You probably already saw something like let (a, b) = returns_a_tuple();. Here, a tuple is destructured. You can also destructure references:
// The comments at the end of the line tell you the type of the variable
let i = 3; // : i32
let ref_i = &i; // : &i32
let ref_ref_i = &ref_i; // : &&i32
let &x = ref_i; // : i32
let &y = ref_ref_i; // : &i32
let &&z = ref_ref_i; // : i32
// All of these error because we try to destructure more layers of references
// than there are.
let &a = i;
let &&b = ref_i;
let &&&c = ref_ref_i;
This has the counter-intuitive effect that the more & you add in the pattern, the fewer & will the type of the variable have. But it does make sense in the context of destructuring: when you already mention the structure in the pattern, the structure won't be in the bound variables anymore.
(It is worth noting that this "destructuring references away" only works when the referent type is Copy. Otherwise you will get a "cannot move out of borrowed content" error.)
Now what does that have to do with your for loop and the closures? Turns out: patterns are everywhere. The slot between for and in in the for loop is a pattern, and arguments of functions and closures are patterns as well! This works:
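To illustrate the Copy restriction, here is a small sketch using String, which is not Copy. The function name borrow_or_clone is hypothetical:

```rust
// Hypothetical helper: two ways to work with a non-Copy value behind a
// reference without destructuring the reference away.
fn borrow_or_clone(r: &String) -> (usize, String) {
    // let &moved = r; // rejected: this would move the String out of a borrow

    let borrowed: &String = r;      // keep working through the reference
    let cloned: String = r.clone(); // or make an owned copy explicitly
    (borrowed.len(), cloned)
}

fn main() {
    let name = String::from("ferris");
    let (len, copy) = borrow_or_clone(&name);
    println!("{} {}", len, copy);
}
```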
// Destructuring a tuple in the for loop pattern
let v = vec![3];
for (i, elem) in v.iter().enumerate() {}
// Destructuring an array in the function argument (works the same for closures)
fn foo([x, y, z]: [f32; 3]) {}
I suppose it means list.iter() returns the reference to the elements in the list
Exactly.
... hence &item
"hence" is not correct here. The author of this code didn't want to work with the reference to the item, but instead work with the real value. So they added the & in the pattern to destructure the reference away.
but while comparing it with largest number why are we not using *item?
Because the reference was already removed by the destructuring pattern.
Also, when I change the &item to item in the first line I am forced to use *item in 2nd and 3rd line by the compiler.
Yes, because now the pattern doesn't destructure the reference anymore, so item is a reference again. This is the basic gist with all of this: most of the time you can either remove the reference in the pattern by adding a & or you can remove the reference when using the variable by adding a *.
Here the closure in take_while accepts &x but uses x directly but the closure in filter takes x without reference but passes *x to is_even.
It should be clear by now why that is, right? The take_while closure removes the reference via destructuring in the pattern while the filter closure does it via standard dereferencing.
You can read more about all of this in this chapter of the book.
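Both snippets from the question can be turned into a compilable sketch. The function bodies follow the question's code; the signatures, the is_even helper, the name even_squares_up_to, and the u32 element type are assumptions made for this example:

```rust
// The loop from the book example. Assumes `list` is non-empty,
// since `list[0]` panics otherwise.
fn largest(list: &[i32]) -> i32 {
    let mut largest = list[0];
    // `&item` destructures the `&i32` yielded by `iter()`, so `item` is an i32.
    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }
    largest
}

// Assumed helper; the question does not show its definition.
fn is_even(n: u32) -> bool {
    n % 2 == 0
}

// Hypothetical wrapper around the iterator chain from the question.
fn even_squares_up_to(limit: u32) -> Vec<u32> {
    (0_u32..)
        .map(|x| x * x)
        // Pattern `&x` removes the reference before the body runs.
        .take_while(|&x| x <= limit)
        // Here `x` stays a `&u32`, so the body dereferences it.
        .filter(|x| is_even(*x))
        .collect()
}

fn main() {
    println!("{}", largest(&[3, 9, 2]));
    println!("{:?}", even_squares_up_to(50));
}
```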

String equality in Rust: how does referencing and dereferencing work?

As a Rust newbie, I'm working through the Project Euler problems to help me get a feel for the language. Problem 4 deals with palindromes, and I found two solutions for creating a vector of palindromes, but I'm not sure how either of them work.
I'm using a vector of strings, products, that's calculated like this:
let mut products = Vec::new();
for i in 100..500 {
for j in 500..1000 {
products.push((i * j).to_string());
}
}
For filtering these products to only those that are palindromic, I have the following two solutions:
Solution 1:
let palindromes: Vec<_> = products
.iter()
.filter(|&x| x == &x.chars().rev().collect::<String>())
.collect();
Solution 2:
let palindromes: Vec<_> = products
.iter()
.filter(|&x| *x == *x.chars().rev().collect::<String>())
.collect();
They both yield the correct result, but I have no idea why!
In Solution 1, we're comparing a reference of a string to a reference of a string we've just created?
In Solution 2, we dereference a reference to a string and compare it to a dereferenced new string?
What I would expect to be able to do:
let palindromes: Vec<_> = products
.iter()
.filter(|x| x == x.chars().rev().collect::<String>())
.collect();
I'm hoping somebody will be able to explain to me:
What is the difference between my two solutions, and why do they both work?
Why can't I just use x without referencing or dereferencing it in my filter function?
Thank you!
Vec<String>.iter() returns an iterator over references (&String).
The closure argument of .filter() takes a reference to an iterator's item. So the type that is passed to the closure is a double reference &&String.
|&x| tells the closure to expect a reference, so x is now of type &String.
First solution: collect returns a String, of which & takes the reference. x is also a reference to a string, so the comparison is between two &String.
Second solution: The dereference operator * is applied to x, which results in a String. The right hand side is interesting: The String result of collect is dereferenced. This results in a string slice because String implements Deref<Target=str>. Now the comparison is between String and str, which works because it is implemented in the standard library (Note that a == b is equivalent to a.eq(&b)).
Third solution: The compiler explains why it does not work.
the trait std::cmp::PartialEq<std::string::String> is not implemented for &&std::string::String
The left side is a double reference to string (&&String) and the right side is just a String. You need to get both sides to the same "reference level". All of these work:
products.iter().filter(|x| x == &&x.chars().rev().collect::<String>());
products.iter().filter(|x| *x == &x.chars().rev().collect::<String>());
products.iter().filter(|x| **x == x.chars().rev().collect::<String>());
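As a self-contained sketch of the last variant, the filter can be wrapped in a function. The name palindromes and the slice parameter are assumptions for this example, not part of the question's code:

```rust
// Hypothetical helper: keep only the strings that read the same backwards.
fn palindromes(products: &[String]) -> Vec<&String> {
    products
        .iter() // yields &String
        // `filter` passes a reference to each item, so `x` is a `&&String`.
        // `**x` is the String itself; `==` compares by reference and moves nothing.
        .filter(|x| **x == x.chars().rev().collect::<String>())
        .collect()
}

fn main() {
    let products: Vec<String> = vec![9009, 1234, 121]
        .into_iter()
        .map(|n: u32| n.to_string())
        .collect();
    println!("{:?}", palindromes(&products));
}
```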

Why is "&&" being used in closure arguments?

I have two questions regarding this example:
let a = [1, 2, 3];
assert_eq!(a.iter().find(|&&x| x == 2), Some(&2));
assert_eq!(a.iter().find(|&&x| x == 5), None);
Why is &&x used in the closure arguments rather than just x? I understand that & is passing a reference to an object, but what does using it twice mean?
I don't understand what the documentation says:
Because find() takes a reference, and many iterators iterate over references, this leads to a possibly confusing situation where the argument is a double reference. You can see this effect in the examples below, with &&x.
Why is Some(&2) used rather than Some(2)?
a is of type [i32; 3]; an array of three i32s.
[i32; 3] does not itself implement an iter method, but method lookup coerces the array to a slice, [i32].
&[i32] implements an iter method which produces an iterator.
This iterator implements Iterator<Item=&i32>.
It uses &i32 rather than i32 because the iterator has to work on arrays of any type, and not all types can be safely copied. So rather than restrict itself to copyable types, it iterates over the elements by reference rather than by value.
find is a method defined for all Iterators. It lets you look at each element and return the one that matches the predicate. Problem: if the iterator produces non-copyable values, then passing the value into the predicate would make it impossible to return it from find. The value cannot be re-generated, since iterators are not (in general) rewindable or restartable. Thus, find has to pass the element to the predicate by-reference rather than by-value.
So, if you have an iterator that implements Iterator<Item=T>, then Iterator::find requires a predicate that takes a &T and returns a bool. [i32]::iter produces an iterator that implements Iterator<Item=&i32>. Thus, Iterator::find called on an array iterator requires a predicate that takes a &&i32. That is, it passes the predicate a pointer to a pointer to the element in question.
So if you were to write a.iter().find(|x| ..), the type of x would be &&i32. This cannot be directly compared to the literal i32 value 2. There are several ways of fixing this. One is to explicitly dereference x: a.iter().find(|x| **x == 2). The other is to use pattern matching to destructure the double reference: a.iter().find(|&&x| x == 2). These two approaches are, in this case, doing exactly the same thing. [1]
As for why Some(&2) is used: because a.iter() is an iterator over &i32, not an iterator of i32. If you look at the documentation for Iterator::find, you'll see that for Iterator<Item=T>, it returns an Option<T>. Hence, in this case, it returns an Option<&i32>, so that's what you need to compare it against.
[1]: The differences only matter when you're talking about non-Copy types. For example, |&&x| .. wouldn't work on a &&String, because you'd have to be able to move the String out from behind the reference, and that's not allowed. However, |x| **x .. would work, because that is just reaching inside the reference without moving anything.
1) I thought the book explanation was good; maybe my example with .cloned() below will be useful. Since .iter() iterates over references and find passes its closure a reference to each item, the closure ends up with a reference to a reference.
2) .iter() is iterating over references; therefore, you find a reference.
You could use .cloned() to see what it would look like if you didn't have to deal with references:
assert_eq!(a.iter().cloned().find(|&x| x == 2), Some(2));
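The approaches from both answers can be compared in one sketch. The function names are invented for this example, and copied() is used as the Copy-only counterpart of cloned():

```rust
// Hypothetical helper: search by reference, so the result is an Option<&i32>.
fn find_two(a: &[i32]) -> Option<&i32> {
    // Pattern `&&x` strips both reference layers.
    let by_pattern = a.iter().find(|&&x| x == 2);
    // Explicit double dereference does the same in the body.
    let by_deref = a.iter().find(|x| **x == 2);
    assert_eq!(by_pattern, by_deref);
    by_pattern
}

// Hypothetical helper: copy the items first, so only one `&` is left
// and the result is an Option<i32>.
fn find_two_by_value(a: &[i32]) -> Option<i32> {
    a.iter().copied().find(|&x| x == 2)
}

fn main() {
    let a = [1, 2, 3];
    println!("{:?} {:?}", find_two(&a), find_two_by_value(&a));
}
```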

Why does the argument for the find closure need two ampersands?

I have been playing with Rust by porting my Score4 AI engine to it - basing the work on my functional-style implementation in OCaml. I specifically wanted to see how Rust fares with functional-style code.
The end result: It works, and it's very fast - much faster than OCaml. It almost touches the speed of imperative-style C/C++ - which is really cool.
There's a thing that troubles me, though — why do I need two ampersands in the last line of this code?
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let target_score = if maximize_or_minimize {
ORANGE_WINS
} else {
YELLOW_WINS
};
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
...
I added them because the compiler errors "guided" me to it, but I am trying to understand why... I used the trick mentioned elsewhere on Stack Overflow to "ask" the compiler to tell me what type something is:
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let () = moves_and_scores;
...which caused this error:
src/main.rs:108:9: 108:11 error: mismatched types:
expected `collections::vec::Vec<(u32, i32)>`,
found `()`
(expected struct `collections::vec::Vec`,
found ()) [E0308]
src/main.rs:108 let () = moves_and_scores;
...as I expected, moves_and_scores is a vector of tuples: Vec<(u32, i32)>. But then, in the immediate next line, iter() and find() force me to use the hideous double ampersands in the closure parameter:
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
Why does the find closure need two ampersands? I could see why it may need one (pass the tuple by reference to save time/space) but why two? Is it because of the iter? That is, is the iter creating references, and then find expects a reference on each input, so a reference on a reference?
If this is so, isn't this, arguably, a rather ugly design flaw in Rust?
In fact, I would expect find and map and all the rest of the functional primitives to be parts of the collections themselves. Forcing me to iter() to do any kind of functional-style work seems burdensome, and even more so if it forces this kind of "double ampersands" in every possible functional chain.
I am hoping I am missing something obvious - any help/clarification most welcome.
This here
moves_and_scores.iter()
gives you an iterator over borrowed vector elements. If you look up this type in the API docs, you'll notice that it's just the iterator for a borrowed slice, and that it implements Iterator with Item=&T, where T is (u32, i32) in your case.
Then, you use find, which takes a predicate that takes a &Item as its parameter. Since Item already is a reference in your case, the predicate has to take a &&(u32, i32).
pub trait Iterator {
...
fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
where P: FnMut(&Self::Item) -> bool {...}
... ^
It was probably defined like this because it's only supposed to inspect the item and return a bool. This does not require the item being passed by value.
If you want an iterator over (u32, i32) you could write
moves_and_scores.iter().cloned()
cloned() converts the iterator from one with an Item type &T to one with an Item type T if T is Clone. Another way to do it would be to use into_iter() instead of iter().
moves_and_scores.into_iter()
The difference between the two is that the first option clones the borrowed elements while the 2nd one consumes the vector and moves the elements out of it.
By writing the lambda like this
|&&(_, score)| score == target_score
you destructure the "double reference" and create a local copy of the i32. This is allowed since i32 is a simple type that is Copy.
Instead of destructuring the parameter of your predicate you could also write
|move_and_score| move_and_score.1 == target_score
because the dot operator automatically dereferences as many times as needed.
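Both predicate styles can be checked against each other in a short sketch. The function name find_move, its slice parameter, and the sample scores are assumptions for this example:

```rust
// Hypothetical helper: return the column of the first move with the target score.
fn find_move(moves_and_scores: &[(u32, i32)], target_score: i32) -> Option<u32> {
    // Destructure the `&&(u32, i32)` in the closure parameter.
    let by_pattern = moves_and_scores
        .iter()
        .find(|&&(_, score)| score == target_score);
    // Or let the dot operator dereference as needed.
    let by_field = moves_and_scores
        .iter()
        .find(|move_and_score| move_and_score.1 == target_score);

    assert_eq!(by_pattern, by_field);
    // `find` yields an Option<&(u32, i32)>; pull the column out of it.
    by_pattern.map(|&(column, _)| column)
}

fn main() {
    let scores = [(0, 5), (1, -3), (2, 7)];
    println!("{:?}", find_move(&scores, 7));
}
```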

Resources