What does this Rust Closure argument syntax mean?


I modified code found on the internet to create a function that obtains the statistical mode of any Hashable type that implements Eq, but I do not understand some of the syntax. Here is the function:
use std::hash::Hash;
use std::collections::HashMap;
pub fn mode<'a, I, T>(items: I) -> &'a T
where I: IntoIterator<Item = &'a T>,
T: Hash + Clone + Eq, {
let mut occurrences: HashMap<&T, usize> = HashMap::new();
for value in items.into_iter() {
*occurrences.entry(value).or_insert(0) += 1;
}
occurrences
.into_iter()
.max_by_key(|&(_, count)| count)
.map(|(val, _)| val)
.expect("Cannot compute the mode of zero items")
}
(I think requiring Clone may be overkill.)
The syntax I do not understand is in the closure passed to max_by_key:
|&(_, count)| count
What is the &(_, count) doing? I gather the underscore means I can ignore that parameter. Is this some sort of destructuring of a tuple in a parameter list? Does this make count take the reference of the tuple's second item?

.max_by_key(|&(_, count)| count) is equivalent to .max_by_key(f) where f is this:
fn f<T>(t: &(T, usize)) -> usize {
(*t).1
}
f() could also be written using pattern matching, like this:
fn f2<T>(&(_, count): &(T, usize)) -> usize {
count
}
And f2() is much closer to the first closure you're asking about.
The second closure is essentially the same, except there is no reference slightly complicating matters.
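To see the two patterns side by side, here is a minimal sketch. The names most_frequent and key_of are made up for illustration, and the input is a plain Vec of (char, usize) pairs rather than the HashMap from the question.

```rust
/// Picks the label with the largest count from a list of `(label, count)` pairs.
fn most_frequent(pairs: Vec<(char, usize)>) -> Option<char> {
    pairs
        .into_iter()
        // `max_by_key` hands the closure a `&(char, usize)`. The pattern
        // `&(_, count)` strips the reference, ignores the first field, and
        // copies the second field into `count` (a plain `usize`, not a reference).
        .max_by_key(|&(_, count)| count)
        // `map` receives the tuple by value, so no `&` is needed in the pattern.
        .map(|(label, _)| label)
}

/// The same key function written without a pattern in the parameter list.
fn key_of(pair: &(char, usize)) -> usize {
    pair.1
}
```

So count does not hold a reference to the tuple's second item. Because usize is Copy, the pattern copies the value out from behind the reference.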

Related

Swap two elements in a vector in rust

I want to swap two elements in a vector.
I wrote this function to swap elements, but it gives an error.
fn swap<T>(arr: &mut Vec<T>, i: usize, j: usize) {
let temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
error[E0507]: cannot move out of index of `Vec<T>`
--> src/quick_sort.rs:2:16
|
2 | let temp = arr[i];
| ^^^^^^
| |
| move occurs because value has type `T`, which does not implement the `Copy` trait
| help: consider borrowing here: `&arr[i]`

You are in luck: the implementers anticipated that people would want an easy way to swap elements and added Vec::swap. The method is defined on slices, so it works for slices and arrays as well. If you want to swap the values behind two mutable references, you can use std::mem::swap.
fn swap<T>(arr: &mut Vec<T>, i: usize, j: usize) {
arr.swap(i, j);
}
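As a rough illustration of the two standard-library tools mentioned above, the sketch below wraps each one in a small helper. The helper names swap_in_slice and swap_values are hypothetical. Only the swap method on slices and std::mem::swap come from the standard library.

```rust
/// Swaps two elements of a slice (or Vec) with the built-in `swap` method.
/// Works for any `T`, including types that are not `Copy`.
fn swap_in_slice<T>(items: &mut [T], i: usize, j: usize) {
    items.swap(i, j); // panics if either index is out of bounds
}

/// Swaps the values behind two distinct mutable references.
fn swap_values<T>(left: &mut T, right: &mut T) {
    std::mem::swap(left, right);
}
```

Neither helper needs a Copy or Clone bound, because the values are exchanged in place rather than moved out of their containers.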
Alternatively, while it is a bit of a pain to do, you can split a slice or array into two or more non-overlapping mutable slices of the original. This allows you to hold multiple mutable references into a slice at once.
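use std::cmp::Ordering;

pub fn swap<T>(arr: &mut [T], i: usize, j: usize) {
    // Order the indices so the split point lies between them.
    let (low, high) = match i.cmp(&j) {
        Ordering::Less => (i, j),
        Ordering::Greater => (j, i),
        Ordering::Equal => return, // nothing to swap
    };
    // `a` covers [0, high) and `b` covers [high, len), so they never overlap.
    let (a, b) = arr.split_at_mut(high);
    std::mem::swap(&mut a[low], &mut b[0]);
}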

Because you haven't added any constraints to T, your generic swap<T>() function needs to be able to work for any type T. Importantly, it needs to work for types that don't implement the Copy trait, so the assignment operator (=) performs a move. You can't move the value out of the vector like this, because that would invalidate the vector. Of course, you plan to fix up the vector so that it is valid again, but the compiler doesn't see the big picture here; it only sees that the initial move would invalidate the vector, and therefore rejects it.
To implement swap here, you would need to use unsafe code. However, swap is a common problem, so the Rust standard library exposes functions to do this so you don't have to (std::mem::swap() or Vec::swap(), as @Locke mentioned).
Alternatively, you could specify that your swap function only works for types which implement the Copy trait, like so:
fn swap<T: Copy>(arr: &mut Vec<T>, i: usize, j: usize) {
let temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
However, there is no advantage to writing your own swap over the standard library's Vec::swap() or std::mem::swap().

`fold` values into a HashMap

After reading this article Learning Programming Concepts by Jumping in at the Deep End I can't seem to understand how exactly fold() is working in this context. Mainly how fold() knows to grab the word variable from split().
Here's the example:
use std::collections::HashMap;
fn count_words(text: &str) -> HashMap<&str, usize> {
text.split(' ').fold(
HashMap::new(),
|mut map, word| { *map.entry(word).or_insert(0) += 1; map }
)
}
Rust docs say:
fold() takes two arguments: an initial value, and a closure with two arguments: an ‘accumulator’, and an element. The closure returns the value that the accumulator should have for the next iteration.
Iterator - fold
So I get that mut map is the accumulator, and I get that split() returns an iterator and therefore fold() is iterating over those values, but how does fold know to grab that value? It's being implicitly passed, but I can't seem to wrap my head around this. How is that being mapped to the word variable...
Not sure if I have the right mental model for this...
Thanks!

but how does fold know to grab that value?
fold() is a method on the iterator. That means that it has access to self which is the actual iterator, so it can call self.next() to get the next item (in this case the word, since self is of type Split, so its next() does get the next word). You could imagine fold() being implemented with the following pseudocode:
fn fold<B, F>(mut self, init: B, mut f: F) -> B
where
Self: Sized,
F: FnMut(B, Self::Item) -> B,
{
let mut accum = init;
while let Some(x) = self.next() {
accum = f(accum, x);
}
accum
}
Ok, the above is not pseudocode, it's the actual implementation.
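To connect that loop back to the question, here is a sketch that writes the word count twice: once with fold, and once with the loop unrolled by hand. The function names count_words_fold and count_words_loop are made up for this comparison.

```rust
use std::collections::HashMap;

/// Word count written with `fold`, as in the question.
fn count_words_fold(text: &str) -> HashMap<&str, usize> {
    text.split(' ').fold(HashMap::new(), |mut map, word| {
        *map.entry(word).or_insert(0) += 1;
        map
    })
}

/// The same computation with the loop that `fold` runs internally:
/// each `word` comes from calling `next()` on the `Split` iterator.
fn count_words_loop(text: &str) -> HashMap<&str, usize> {
    let mut words = text.split(' ');
    let mut map: HashMap<&str, usize> = HashMap::new();
    while let Some(word) = words.next() {
        *map.entry(word).or_insert(0) += 1;
    }
    map
}
```

In the fold version, word is simply the second argument that fold passes to the closure on each iteration, exactly as x is passed to f in the loop above.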

How is it that I can circumvent "cannot borrow as mutable more than once at a time" with semantically equivalent code?

I have the following for a merge sort problem with huge files:
struct MergeIterator<'a, T> where T: Copy {
one: &'a mut dyn Iterator<Item=T>,
two: &'a mut dyn Iterator<Item=T>,
a: Option<T>,
b: Option<T>
}
impl<'m, T> MergeIterator<'m, T> where T: Copy {
pub fn new(i1: &'m mut dyn Iterator<Item=T>,
i2: &'m mut dyn Iterator<Item=T>) -> MergeIterator<'m, T> {
let mut m = MergeIterator {one:i1, two:i2, a: None, b: None};
m.a = m.one.next();
m.b = m.two.next();
m
}
}
This seems to make rustc happy. However, I started with this (imho) less clumsy body of the new() function:
MergeIterator {one:i1, two:i2, a: i1.next(), b: i2.next()}
and got harsh feedback from the compiler saying
cannot borrow `*i1` as mutable more than once at a time
and likewise for i2.
I'd like to understand where the semantic difference is between initializing the data elements through the m.one reference vs the i1 argument? Why must I write clumsy imperative code here to achieve what I need?

This will probably be clearer if you write it in lines so that the sequence of operations is visible:
MergeIterator {
one: i1,
two: i2,
a: i1.next(),
b: i2.next(),
}
You're giving i1 to the new struct first, so you no longer have it by the time you try to call next.
The solution is to change the order of operations to call next first before giving away the mutable reference:
MergeIterator {
a: i1.next(),
b: i2.next(),
one: i1,
two: i2,
}
To make it clearer: i1.next() borrows i1 only for the duration of the call, whereas one: i1 gives the mutable reference away for good. Field initializers are evaluated in the order they are written, so reversing the order isn't equivalent.
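The ordering rule can be checked on a smaller example. The sketch below uses a hypothetical Peeked struct with a single iterator instead of the two in the question, and the field names first and rest are assumptions made for illustration.

```rust
/// Hypothetical, cut-down stand-in for the struct in the question.
struct Peeked<'a> {
    first: Option<u32>,
    rest: &'a mut dyn Iterator<Item = u32>,
}

impl<'a> Peeked<'a> {
    fn new(source: &'a mut dyn Iterator<Item = u32>) -> Peeked<'a> {
        Peeked {
            // Evaluated first: a short reborrow of `source` that ends
            // as soon as `next()` returns.
            first: source.next(),
            // Evaluated second: `source` is moved into the struct here.
            rest: source,
        }
    }
}
```

Swapping the two field initializers moves source into rest before next() is called, which should reproduce the borrow error from the question.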

Proper signature for a function accepting an iterator of strings

I'm confused about the proper type to use for an iterator yielding string slices.
fn print_strings<'a>(seq: impl IntoIterator<Item = &'a str>) {
for s in seq {
println!("- {}", s);
}
}
fn main() {
let arr: [&str; 3] = ["a", "b", "c"];
let vec: Vec<&str> = vec!["a", "b", "c"];
let it: std::str::Split<'_, char> = "a b c".split(' ');
print_strings(&arr);
print_strings(&vec);
print_strings(it);
}
Using <Item = &'a str>, the arr and vec calls don't compile. If, instead, I use <Item = &'a &'a str>, they work, but the it call doesn't compile.
Of course, I can make the Item type generic too, and do
fn print_strings<'a, I: std::fmt::Display>(seq: impl IntoIterator<Item = I>)
but it's getting silly. Surely there must be a single canonical "iterator of string values" type?

The error you are seeing is expected because seq is &Vec<&str> and &Vec<T> implements IntoIterator with Item=&T, so with your code, you end up with Item=&&str where you are expecting it to be Item=&str in all cases.
The correct way to do this is to widen the Item type so that it can handle both &str and &&str. You can do this by using more generics, e.g.
fn print_strings(seq: impl IntoIterator<Item = impl AsRef<str>>) {
for s in seq {
let s = s.as_ref();
println!("- {}", s);
}
}
This requires the Item to be something that you can retrieve a &str from, and then in your loop .as_ref() will return the &str you are looking for.
This also has the added bonus that your code will also work with Vec<String> and any other type that implements AsRef<str>.
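As a quick check of which item types the AsRef<str> bound accepts, here is a sketch with a hypothetical helper named bulleted. It formats the items the same way but returns the lines instead of printing them, so the result can be compared.

```rust
/// Formats each item like `print_strings` does, but returns the lines.
/// `bulleted` is a made-up name used only for this illustration.
fn bulleted(seq: impl IntoIterator<Item = impl AsRef<str>>) -> Vec<String> {
    seq.into_iter()
        // `as_ref()` yields a `&str` whether the item is `&str`, `&&str`,
        // `String`, or `&String`.
        .map(|s| format!("- {}", s.as_ref()))
        .collect()
}
```

With this signature, bulleted(&arr), bulleted(&vec), bulleted("a b".split(' ')), and a call with a Vec<String> should all compile, because every one of those item types implements AsRef<str>.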

TL;DR: The signature you use is fine; it's the callers that are providing iterators with the wrong Item type, and that is easily fixed.
As explained in the other answer, print_strings() doesn't accept &arr and &vec because the IntoIterator implementations for &[T; N] and &Vec<T> yield references to T. This is because &Vec, itself a reference, is not allowed to consume the Vec in order to move T values out of it. What it can do is hand out references to the T items sitting inside the Vec, i.e. items of type &T. In the case of your callers that don't compile, the containers contain &str, so their iterators hand out &&str.
Other than making print_strings() more generic, another way to fix the issue is to call it correctly to begin with. For example, these all compile:
print_strings(arr.iter().map(|sref| *sref));
print_strings(vec.iter().copied());
print_strings(it);
iter() is the method provided by slices (and therefore available on arrays and Vec) that iterates over references to elements, just like IntoIterator of &Vec. We call it explicitly to be able to call map() to convert &&str to &str the obvious way - by using the * operator to dereference the &&str. The copied() iterator adapter is another way of expressing the same, possibly a bit less cryptic than map(|x| *x). (There is also cloned(), equivalent to map(|x| x.clone()).)
It's also possible to call print_strings() if you have a container with String values:
let v = vec!["foo".to_owned(), "bar".to_owned()];
print_strings(v.iter().map(|s| s.as_str()));

How do I compare a vector against a reversed version of itself?

Why won't this compile?
fn isPalindrome<T>(v: Vec<T>) -> bool {
return v.reverse() == v;
}
I get
error[E0308]: mismatched types
--> src/main.rs:2:25
|
2 | return v.reverse() == v;
| ^ expected (), found struct `std::vec::Vec`
|
= note: expected type `()`
found type `std::vec::Vec<T>`

Since you only need to look at the front half and back half, you can use the DoubleEndedIterator trait (methods .next() and .next_back()) to look at pairs of front and back elements this way:
/// Determine if an iterable equals itself reversed
fn is_palindrome<I>(iterable: I) -> bool
where
I: IntoIterator,
I::Item: PartialEq,
I::IntoIter: DoubleEndedIterator,
{
let mut iter = iterable.into_iter();
while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
if front != back {
return false;
}
}
true
}
This version is a bit more general, since it supports any iterable that is double ended, for example slice and chars iterators.
It only examines each element once, and it automatically skips the remaining middle element if the iterator was of odd length.
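To make the skipped middle element visible, the sketch below records which pairs the same while-let loop would compare for a string. The function name compared_pairs is hypothetical, and it works on chars() only to keep the example short.

```rust
/// Collects the (front, back) pairs that a `next()` / `next_back()` loop
/// compares, without actually comparing them.
fn compared_pairs(text: &str) -> Vec<(char, char)> {
    let mut iter = text.chars();
    let mut pairs = Vec::new();
    // The loop ends as soon as either end of the iterator is exhausted.
    while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
        pairs.push((front, back));
    }
    pairs
}
```

For "level" this yields the pairs ('l', 'l') and ('e', 'e'). The middle 'v' is taken by next(), after which next_back() returns None and the loop stops.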

Read up on the documentation for the function you are using:
Reverse the order of elements in a slice, in place.
Or check the function signature:
fn reverse(&mut self)
The return value of the method is the unit type, an empty tuple (). You can't compare that against a vector.
Stylistically, Rust uses 4 space indents, snake_case identifiers for functions and variables, and has an implicit return at the end of blocks. You should adjust to these conventions in a new language.
Additionally, you should take a &[T] instead of a Vec<T> if you are not adding items to the vector.
To solve your problem, we will use iterators to compare the slice. You can get forward and backward iterators of a slice, which requires a very small amount of space compared to reversing the entire array. Iterator::eq allows you to do the comparison succinctly.
You also need to state that the T is comparable against itself, which requires Eq or PartialEq.
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq,
{
v.iter().eq(v.iter().rev())
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
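On the choice between Eq and PartialEq: the iterator comparison only needs PartialEq, so a looser bound also admits floating-point slices. A minimal sketch follows, with the function renamed to is_palindrome_partial (a made-up name) to keep it apart from the version above.

```rust
/// Same comparison as above, but with the weaker `PartialEq` bound,
/// which lets it accept types such as `f64` that do not implement `Eq`.
fn is_palindrome_partial<T: PartialEq>(v: &[T]) -> bool {
    v.iter().eq(v.iter().rev())
}
```

With this bound, a call such as is_palindrome_partial(&[1.5, 2.0, 1.5]) compiles, whereas the Eq version rejects f64.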
If you wanted to do the less space-efficient version, you would have to allocate a new vector yourself:
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq + Clone,
{
let mut reverse = v.to_vec();
reverse.reverse();
reverse == v
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
Note that we are now also required to Clone the items in the vector, so we add that trait bound to the method.
