Flatten a set of pairs in Rust - rust

I am working with sets of pairs of integers in Rust using HashSet<(usize, usize)>, and my goal is to compute a set HashSet<usize> which contain all nodes appearing in those pairs. Essentially, this is a flatten operation.
Let's suppose variable set: HashSet<(usize, usize)> contains the initial set of pairs. My first attempt to compute the flattened version was:
let res: HashSet<usize> = set.iter().flatten().collect()
however I encounter the error: "&(usize, usize) is not an iterator the trait Iterator is not implemented for &(usize, usize) required for &(usize, usize) to implement IntoIterator. Required by a bound in flatten."
I thought the issue was that the iterator was over references so I also tried using into_iter instead, but I get the same error message, this time saying "(usize, usize) is not an iterator..."
I do have a piece of code that works fine using a for loop and the push operation, but I would like to do it in a more functional way with flatten.

Simply convert the tuple to an array, which is iterable:
let res: HashSet<usize> = set.iter().flat_map(|&(a, b)| [a, b]).collect();

Related

Chain Vector and IntoIterator element in Rust

I want to write a function in Rust that will return the vector composed of start integer, then all intermediate integers and then end integer. The assertion it should hold is this:
assert_eq!(intervals(0, 4, 1..4), vec![0, 1, 2, 3, 4]);
The hint is to use chain method for iterators. The function declaration is predefined, I implemented it in one way, which is the following code:
pub fn intervals< I>(start: u32, end: u32, intermediate: I) -> Vec<u32>
where
I: IntoIterator<Item = u32>,
{
let mut a1 = vec![];
a1.push(start);
let inter: Vec<u32> = intermediate.into_iter().collect();
let mut iter : Vec<u32> = a1.iter().chain(inter.iter()).map(|x| *x).collect();
iter.push(end);
return iter;
}
But I am quite convinced this is not really optimal way to do this. I am sure I am doing lots of unnecessary things in the middle two lines. I tried to use intermediate directly like this:
let mut iter: Vec<u32> = a1.iter().chain(intermediate).map(|x| *x).collect();
But I am getting this error for chain method and I don't know how to solve it:
type mismatch resolving <I as std::iter::IntoIterator>::Item==&u32,
expected u32, found &u32
I am super new in Rust so any advice would be helpful to understand what's the right way to use intermediate parameter here.
Here are a few hints:
You have created three separate vectors (one explicitly, two using collect) when in fact you only need one.
You can use the std::iter::once iterator to produce iterators for the start and end integers
No need to collect the intermediate range. The intermediate argument implements IntoIterator, so you can feed it directly to chain. So, you can chain together the start, intermediate and end.
No need to use the 'return' keyword at the end of a function - the result of a function is the value of the last expression in it (as long as there is no semicolon on the end).
Applying those tips your function would look like this:
use std::iter::once;
pub fn intervals< I>(start: u32, end: u32, intermediate: I) -> Vec<u32>
where
I: IntoIterator<Item = u32>,
{
once(start).chain(intermediate).chain(once(end)).collect()
}
One additional thing to note, to answer your question from the comments:
why trying this: a1.iter().chain(intermediate) gives an error with chain method
Calling Vec::iter() returns an iterator that returns references to the values in the vector. This makes sense: calling iter() does not consume the vector, and its contents remain intact: you could iterate over it multiple times if you wanted.
On the other hand, invoking into_iter() from the IntoIterator trait returns an iterator that returns the values. This also makes sense: into_iter() does consume the object you are calling it on, so the iterator then takes ownership of the items that were previously owned by the object.
Trying to chain together two such iterators does not work because they are each iterating different types. One resolution would be to consume a1 as well, like this:
let mut iter : Vec<u32> = a1.into_iter().chain(intermediate).collect();

Is there a way to remove entries from a generic first vector that are present in another vector?

I have a problem understanding ownership when a higher order function is called. I am supposed to remove entries from the first vector if the elements exist in the second vector so I came up with this attempt:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
a.iter()
.filter(|incoming| !b.contains(incoming))
.collect::<Vec<T>>()
}
I can't change the function signature. The .collect() call doesn't work because all I am getting is a reference to elements in a. While this is generic, I don't know if the result is copy-able or clone-able. I also probably can't dereference the elements in a.
Is there a way to fix this piece of code without rewriting it from scratch?
For this particular test ... you can consume the vector instead of relying on references. The signature yields values and not references. As such, to pass the test you only have to use into_iter instead:
a.into_iter() // <----------- call into_iter
.filter(|incoming| !b.contains(incoming))
.collect::<Vec<T>>()
This consumes the values and returns them out again.
Destroying the incoming allocation to create a new allocation isn't very efficient. Instead, write code that is more directly in line with the problem statement:
fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
a.retain(|aa| !b.contains(aa));
a
}
Adding mut in the signature doesn't change the signature because no one can tell that you've added it. It's the exact same as:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
let mut a = a;
a.retain(|aa| !b.contains(aa));
a
}

How to convert from &[u8] to Vec<u8>?

I'm attempting to simply convert a slice to a vector. The following code:
let a = &[0u8];
let b: Vec<u8> = a.iter().collect();
fails with the following error message:
3 | let b: Vec<u8> = a.iter().collect();
| ^^^^^^^ a collection of type `std::vec::Vec<u8>` cannot be built from an iterator over elements of type `&u8`
What am I missing?
Collecting into a Vec is so common that slices have a method to_vec that does exactly this:
let b = a.to_vec();
You get the same thing as CodesInChaos's answer, but more concisely.
Notice that to_vec requires T: Clone. To get a Vec<T> out of a &[T] you have to be able to get an owned T out of a non-owning &T, which is what Clone does.
Slices also implement ToOwned, so you can use to_owned instead of to_vec if you want to be generic over different types of non-owning container. If your code only works with slices, prefer to_vec instead.
The iterator only returns references to the elements (here &u8). To get owned values (here u8), you can used .cloned().
let a: &[u8] = &[0u8];
let b: Vec<u8> = a.iter().cloned().collect();

Why does the argument for the find closure need two ampersands?

I have been playing with Rust by porting my Score4 AI engine to it - basing the work on my functional-style implementation in OCaml. I specifically wanted to see how Rust fares with functional-style code.
The end result: It works, and it's very fast - much faster than OCaml. It almost touches the speed of imperative-style C/C++ - which is really cool.
There's a thing that troubles me, though — why do I need two ampersands in the last line of this code?
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let target_score = if maximize_or_minimize {
ORANGE_WINS
} else {
YELLOW_WINS
};
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
...
I added them is because the compiler errors "guided" me to it; but I am trying to understand why... I used the trick mentioned elsewhere in Stack Overflow to "ask" the compiler to tell me what type something is:
let moves_and_scores: Vec<_> = moves_and_boards
.iter()
.map(|&(column,board)| (column, score_board(&board)))
.collect();
let () = moves_and_scores;
...which caused this error:
src/main.rs:108:9: 108:11 error: mismatched types:
expected `collections::vec::Vec<(u32, i32)>`,
found `()`
(expected struct `collections::vec::Vec`,
found ()) [E0308]
src/main.rs:108 let () = moves_and_scores;
...as I expected, moves_and_scores is a vector of tuples: Vec<(u32, i32)>. But then, in the immediate next line, iter() and find() force me to use the hideous double ampersands in the closure parameter:
if let Some(killer_move) = moves_and_scores.iter()
.find(|& &(_,score)| score==target_score) {
Why does the find closure need two ampersands? I could see why it may need one (pass the tuple by reference to save time/space) but why two? Is it because of the iter? That is, is the iter creating references, and then find expects a reference on each input, so a reference on a reference?
If this is so, isn't this, arguably, a rather ugly design flaw in Rust?
In fact, I would expect find and map and all the rest of the functional primitives to be parts of the collections themselves. Forcing me to iter() to do any kind of functional-style work seems burdensome, and even more so if it forces this kind of "double ampersands" in every possible functional chain.
I am hoping I am missing something obvious - any help/clarification most welcome.
This here
moves_and_scores.iter()
gives you an iterator over borrowed vector elements. If you follow the API doc what type this is, you'll notice that it's just the iterator for a borrowed slice and this implements Iterator with Item=&T where T is (u32, i32) in your case.
Then, you use find which takes a predicate which takes a &Item as parameter. Sice Item already is a reference in your case, the predicate has to take a &&(u32, i32).
pub trait Iterator {
...
fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
where P: FnMut(&Self::Item) -> bool {...}
... ^
It was probably defined like this because it's only supposed to inspect the item and return a bool. This does not require the item being passed by value.
If you want an iterator over (u32, i32) you could write
moves_and_scores.iter().cloned()
cloned() converts the iterator from one with an Item type &T to one with an Item type T if T is Clone. Another way to do it would be to use into_iter() instead of iter().
moves_and_scores.into_iter()
The difference between the two is that the first option clones the borrowed elements while the 2nd one consumes the vector and moves the elements out of it.
By writing the lambda like this
|&&(_, score)| score == target_score
you destructure the "double reference" and create a local copy of the i32. This is allowed since i32 is a simple type that is Copy.
Instead of destructuring the parameter of your predicate you could also write
|move_and_score| move_and_score.1 == target_score
because the dot operator automatically dereferences as many times as needed.

Why does cloned() allow this function to compile

I'm starting to learn Rust and I tried to implement a function to reverse a vector of strings. I found a solution but I don't understand why it works.
This works:
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.iter().cloned().rev().collect();
return actual;
}
But this doesn't.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.iter().rev().collect(); // without clone
return actual;
}
Error message
src/main.rs:28:10: 28:16 error: mismatched types:
expected `collections::vec::Vec<&str>`,
found `collections::vec::Vec<&&str>`
(expected str,
found &-ptr) [E0308]
Can someone explain to me why? What happens in the second function? Thanks!
So the call to .cloned() is essentially like doing .map(|i| i.clone()) in the same position (i.e. you can replace the former with the latter).
The thing is that when you call iter(), you're iterating/operating on references to the items being iterated. Notice that the vector already consists of 'references', specifically string slices.
So to zoom in a bit, let's replace cloned() with the equivalent map() that I mentioned above, for pedagogical purposes, since they are equivalent. This is what it actually looks like:
.map(|i: & &str| i.clone())
So notice that that's a reference to a reference (slice), because like I said, iter() operates on references to the items, not the items themselves. So since a single element in the vector being iterated is of type &str, then we're actually getting a reference to that, i.e. & &str. By calling clone() on each of these items, we go from a & &str to a &str, just like calling .clone() on a &i64 would result in an i64.
So to bring everything together, iter() iterates over references to the elements. So if you create a new vector from the collected items yielded by the iterator you construct (which you constructed by calling iter()) you would get a vector of references to references, that is:
let actual: Vec<& &str> = strings.iter().rev().collect();
So first of all realize that this is not the same as the type you're saying the function returns, Vec<&str>. More fundamentally, however, the lifetimes of these references would be local to the function, so even if you changed the return type to Vec<& &str> you would get a lifetime error.
Something else you could do, however, is to use the into_iter() method. This method actually does iterate over each element, not a reference to it. However, this means that the elements are moved from the original iterator/container. This is only possible in your situation because you're passing the vector by value, so you're allowed to move elements out of it.
fn reverse_strings(strings:Vec<&str>) -> Vec<&str> {
let actual: Vec<_> = strings.into_iter().rev().collect();
return actual;
}
playpen
This probably makes a bit more sense than cloning, since we are passed the vector by value, we're allowed to do anything with the elements, including moving them to a different location (in this case the new, reversed vector). And even if we don't, the vector will be dropped at the end of that function anyways, so we might as well. Cloning would be more appropriate if we're not allowed to do that (e.g. if we were passed the vector by reference, or a slice instead of a vector more likely).

Resources