rust combine similar values in a sequence [duplicate] - rust

This question already has answers here:
Is there a simple way remove duplicate elements from an array?
(4 answers)
Closed 1 year ago.
In Rust, given some sequence (an array, a vector, etc.), what is a good way to combine all equal values and return a sequence with unique values?
For a specific example, given some array [1, 2, 2, 3, 2], modify the returned array (or create a new array or vector) so that each u32 value is only contained once, i.e. it becomes [1, 2, 3].
Later, I want to iterate over the result.
In this case, "good way" means not too complicated, grokkable. The solution could use std::collections.

Simple: dedup! As the Rust documentation for Vec::dedup says:
Removes consecutive repeated elements in the vector according to the PartialEq trait implementation.
If the vector is sorted, this removes all duplicates.
let mut vec = vec![1, 2, 2, 3, 2];
vec.dedup();
assert_eq!(vec, [1, 2, 3, 2]);
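As the documentation says, all duplicates are removed only if the vector is sorted. To get [1, 2, 3] from the example, sort the vector first (see Vec::sort) and then call dedup; note that sorting changes the order of the elements.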
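A minimal sketch of the sort-then-dedup approach described above. The helper name `unique_sorted` is made up for illustration and is not part of std; it copies the input so the caller's data is left untouched.

```rust
/// Returns the distinct values of `input` in ascending order.
/// `unique_sorted` is a hypothetical helper name, not a std function.
fn unique_sorted(input: &[u32]) -> Vec<u32> {
    let mut values = input.to_vec();
    // Sorting places equal values next to each other ...
    values.sort_unstable();
    // ... so dedup, which only removes consecutive repeats, removes all of them.
    values.dedup();
    values
}
```

Calling `unique_sorted(&[1, 2, 2, 3, 2])` yields `[1, 2, 3]`, which can then be iterated over as usual.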

If you do not care about ordering, you could use HashSet:
use std::collections::HashSet;
fn main() {
    let data = vec![1, 2, 2, 3, 2];
    let res: Vec<u32> = data
        .iter()
        .copied()
        .collect::<HashSet<_>>()
        .into_iter()
        .collect();
    println!("{:?}", res);
}
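The idea is to go from Vec -> HashSet -> Vec again. Because a HashSet has no defined iteration order, the order of the resulting values is arbitrary.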
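If the original order does matter, one possible variation is to use the HashSet only as a "seen" filter instead of as the intermediate collection. This is a sketch, not part of the answer above; `unique_in_order` is a hypothetical helper name.

```rust
use std::collections::HashSet;

/// Keeps the first occurrence of each value and preserves the input order.
/// `unique_in_order` is a hypothetical helper name.
fn unique_in_order(input: &[u32]) -> Vec<u32> {
    let mut seen = HashSet::new();
    input
        .iter()
        .copied()
        // HashSet::insert returns false when the value was already present,
        // so repeated values are filtered out.
        .filter(|value| seen.insert(*value))
        .collect()
}
```

For the question's input `[1, 2, 2, 3, 2]` this returns `[1, 2, 3]` without sorting.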

Related

How to compare the value of next element with the current element in an iterator without a loop and for_each?

I have a vector like [1, 2, 4, 3], and I want to remove 3 because 3 is smaller than 4.
I want to solve this with iterators, without using a loop or for_each.
I think the first step is vec.into_iter(), but I don't know what to do next.
To reformulate, you want to remove any element smaller than the previous element.
Let's write a function to do so. Since you want to work exclusively with iterators, that is, in a functional style, we will treat the input vector as immutable: the function takes a slice as input and returns a new Vec:
fn remove_smaller<T: Ord + Copy>(v: &[T]) -> Vec<T> {
    v.iter()
        .rev()
        .collect::<Vec<_>>()
        .windows(2)
        .filter(|a| a[0] > a[1])
        .map(|a| *a[0])
        .chain([v[0]])
        .rev()
        .collect()
}
Let's explain what this function is doing, using vec![1, 2, 4, 3] as sample input.
We first reverse the order of the vector so we can operate on windows looking at the previous value, and collect it into a new vector (needed as windows is implemented for slices only).
&[3, 4, 2, 1]
windows(2) returns an iterator that will yield overlapping pairs of elements of the slice, except the last element, which has no next:
&[3, 4], &[4, 2], &[2, 1]
We then filter with filter(|a| a[0] > a[1]), meaning we keep only the pairs whose first value (the later element in the original order) is larger than the second (which is why the element type needs to be Ord):
&[4, 2], &[2, 1]
We then map with map(|a| *a[0]) to keep the first value of each pair, which requires T to be Copy:
4, 2
Now, since we are missing the first element of the input array, we need to add it again, using .chain([v[0]]) giving us:
4, 2, 1
We then reverse the iterator to obtain the output array in correct order:
1, 2, 4
See it in action in the playground.
This is not a very efficient way to achieve the result, as it allocates an intermediate vector of references in addition to the output. It also panics on an empty slice, because v[0] is then out of bounds.
You can use zip and skip to pair each element of the slice with its successor.
Following in the footsteps of @sirdarius, here is how your function could look:
fn remove_smaller<T: Ord + Copy>(v: &[T]) -> Vec<T> {
    let mut res = vec![v[0]];
    res.extend(
        v.iter()
            .zip(v.iter().skip(1))
            .filter(|(a, b)| a < b)
            .map(|(_, b)| *b),
    );
    res
}
Walk-through:
We first create the result vector and push the first element into it, since the first element is always part of the answer.
We then extend the result vector with an iterator that does the following:
It pairs each element with its successor by zipping the slice with itself shifted by one position (v.iter().skip(1)).
It keeps only the pairs in which the successor is larger than the element before it, and finally maps each pair to the successor.
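As a sketch of a third variation (not from either answer), windows(2) can be applied to the input slice directly, which avoids both the reversal and the intermediate vector, and v.first() avoids the panic on an empty slice. The name `remove_smaller_windows` is hypothetical. Like both versions above, it uses a strict comparison, so an element equal to its predecessor is also dropped.

```rust
/// Keeps the first element and every element strictly greater than its predecessor.
/// `remove_smaller_windows` is a hypothetical name for this variation.
fn remove_smaller_windows<T: Ord + Copy>(v: &[T]) -> Vec<T> {
    v.first()
        .copied()
        // An Option iterates over zero or one items, so an empty slice yields nothing.
        .into_iter()
        // Each window is [previous, current]; keep `current` when it is larger.
        .chain(v.windows(2).filter(|w| w[0] < w[1]).map(|w| w[1]))
        .collect()
}
```

With the sample input `[1, 2, 4, 3]` this produces `[1, 2, 4]`.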
There is an iterator-only way to do what you wanted in O(1) extra space.
use std::iter::once;

fn non_decreasing(v: Vec<i32>) -> Vec<i32> {
    if v.is_empty() {
        return v;
    }
    let first = v[0];
    once(first)
        .chain(
            v.into_iter()
                .skip(1)
                // The scan state is the largest element kept so far.
                .scan(first, |last_max, cur_elem| {
                    if cur_elem < *last_max {
                        Some(None)
                    } else {
                        *last_max = cur_elem;
                        Some(Some(cur_elem))
                    }
                })
                .flatten(),
        )
        .collect()
}
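This function does not build any intermediate collection; the only allocation is the output vector. (Whether the output can reuse the input's allocation depends on rustc's in-place collection optimization and is not guaranteed.) It returns a non-decreasing vector, that is, each element in the result is >= the previous one.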
If you wanted to compare the elements only to the previous element and not the previous largest, then just add the *last_max = cur_elem line to the if branch as well.
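A sketch of that second variant, under the assumption that "previous element" means the element directly before it in the input, whether or not that element was kept. The name `drop_smaller_than_previous` is hypothetical.

```rust
use std::iter::once;

/// Drops every element that is smaller than the element directly before it
/// in the input (not the running maximum).
/// `drop_smaller_than_previous` is a hypothetical name.
fn drop_smaller_than_previous(v: Vec<i32>) -> Vec<i32> {
    if v.is_empty() {
        return v;
    }
    let first = v[0];
    once(first)
        .chain(
            v.into_iter()
                .skip(1)
                .scan(first, |previous, current| {
                    // Decide first, then always remember the current element.
                    let keep = current >= *previous;
                    *previous = current;
                    Some(keep.then_some(current))
                })
                .flatten(),
        )
        .collect()
}
```

The two variants differ on an input such as `[1, 5, 3, 4]`: the running-maximum version returns `[1, 5]`, while this one returns `[1, 5, 4]` because 4 is only compared with 3.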

How do I output multiple values from .map() or use map twice in one iteration?

How do I use map twice on a single into_iter? Currently I have:
let res_arr_to: Vec<String> = v.result.transactions.into_iter().map( |x| x.to).collect();
let res_arr_from: Vec<String> = v.result.transactions.into_iter().map( |x| x.from).collect();
What I want is both arrays in one array; the order doesn't matter. I need either a closure that outputs two values (if that is even possible with a closure), or a way to use map twice in one iteration, without using the generated value but instead using the untouched iterator, if that makes sense and is possible. I am a total noob in functional programming, so if there is a completely different way to do this, another explanation is fine too.
v is an EthBlockTxResponse:
#[derive(Debug, Deserialize)]
struct EthTransactionObj {
from: String,
to: String
}
#[derive(Debug, Deserialize)]
struct EthTransactions {
transactions : Vec<EthTransactionObj>
}
#[derive(Debug, Deserialize)]
struct EthBlockTxResponse {
result : EthTransactions
}
Thanks
You can use .unzip() to collect two vectors at once like this:
let (res_arr_to, res_arr_from): (Vec<_>, Vec<_>) =
v.result.transactions.into_iter().map(|x| (x.to, x.from)).unzip();
Note that into_iter consumes v.result.transactions - moving out of that field. This is probably not what you want, and you should copy the strings instead in that case:
let (res_arr_to, res_arr_from): (Vec<_>, Vec<_>) =
v.result.transactions.iter().map(|x| (x.to.clone(), x.from.clone())).unzip();
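A self-contained sketch of the borrowing variant. `Transaction` is a simplified stand-in for the question's EthTransactionObj (without the serde derive), and `split_addresses` is a hypothetical helper name.

```rust
// Simplified stand-in for the question's EthTransactionObj (no serde derive).
struct Transaction {
    from: String,
    to: String,
}

/// Splits borrowed transactions into (to, from) address vectors in one pass.
/// `split_addresses` is a hypothetical helper name.
fn split_addresses(transactions: &[Transaction]) -> (Vec<String>, Vec<String>) {
    transactions
        .iter()
        // Clone because we only borrow the transactions here.
        .map(|t| (t.to.clone(), t.from.clone()))
        .unzip()
}
```

Because the function only borrows the slice, the caller can keep using the transactions afterwards.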
I find the question a bit vague, but think you're trying to get both the x.to and the x.from at the same time instead of having to iterate the data twice and build two vectors. I'll address that first and then some cases of what you might have meant by some other things you mentioned.
One way you can do it is with .flat_map(). This will produce one flat vector, removing the extra level of nesting. If you wanted tuples, you could just use .map(|x| (x.from, x.to)). Because into_iter() yields owned values, x.to and x.from are moved into the result, so nothing needs to be Copy or cloned. I'm assuming you actually want everything in a single vector without nesting.
let res_arr_combined = v.result.transactions.into_iter()
.flat_map( |x| [x.to, x.from])
.collect::<Vec<_>>();
Reference:
Iterator::flat_map()
Excerpt:
The map adapter is very useful, but only when the closure argument produces values. If it produces an iterator instead, there’s an extra layer of indirection. flat_map() will remove this extra layer on its own.
fn main()
{
// Adding more data to an iterator stream.
(0..5).flat_map(|n| [n, n * n])
.for_each(|n| print!("{}, ", n));
println!("");
}
output:
0, 0, 1, 1, 2, 4, 3, 9, 4, 16,
You may not really require the following, but wrt your comment about wanting to get data from an iterator without using the value or changing the state of the iterator, there is a .peek() operation you can invoke on iterators wrapped in Peekable.
To get a peekable iterator, you just invoke .peekable() on any iterator.
let mut p = [1, 2, 3, 4].into_iter().peekable();
println!("{:?}", p.peek());
println!("{:?}", p.next());
output:
Some(1)
Some(1)
The peekable iterator behaves the same way as the iterator it wraps, but adds a couple of interesting methods such as .next_if(|x| *x > 0), which consumes and returns the next item only if the predicate holds; otherwise it returns None and leaves that item in the iterator. Calling it repeatedly lets you pull items until the condition first fails, without consuming the item that failed.
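A small sketch of how next_if behaves when called repeatedly. The helper name `take_leading_positive` is made up for illustration.

```rust
/// Consumes the leading positive numbers and returns them together with
/// whatever is left in the iterator.
/// `take_leading_positive` is a hypothetical helper name.
fn take_leading_positive(values: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let mut iter = values.iter().copied().peekable();
    let mut leading = Vec::new();
    // next_if yields Some(item) and advances only while the predicate holds.
    while let Some(value) = iter.next_if(|x| *x > 0) {
        leading.push(value);
    }
    // The first rejected item was not consumed, so it is still available here.
    let rest: Vec<i32> = iter.collect();
    (leading, rest)
}
```

For `[1, 2, -3, 4]` the leading part is `[1, 2]` and the rest is `[-3, 4]`, showing that -3 was only peeked at, not consumed.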
And one last topic in line with "using map twice in one iteration": perhaps you meant pulling items from a slice in chunks of 2. If v.result.transactions is itself a Vec, you can use the .chunks() method to group its items in 2s, or 3s as I have below:
let a = [1, 2, 3, 4, 5, 6, 7, 8, 9].chunks(3).collect::<Vec<_>>();
println!("{:?}", a);
output:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

Implementing PHP array_column in Rust

I'm in the process of learning Rust, but I could not find an answer to this question.
In PHP, there's the array_column method and it works this way:
given an array of arrays (this would be a vector of vectors in Rust):
$records = [
[1,2,3],
[1,2,3],
[1,2,3],
[1,2,3]
];
if I want to get an array containing all the first elements (a "column") of the inner arrays I can do:
$column = array_column($records, 0);
This way, for example, I get [1,1,1,1]. If I change that 0 with 1, I get [2,2,2,2] and so on.
Since there's no array_column equivalent in Rust (that is: I could not find it), what could be the best way to implement a similar behavior with a vector of vectors?
I decided to play with iterators, as you tried in the comments.
This version works with any clonable values (numbers included). We iterate over subvectors, and for each we call a get method, which either yields an element of the vector Some(&e) or None if we ask out of bounds.
and_then then accepts the value from get: if it was None, None is returned; if it was Some(&e), Some(e.clone()) is returned. We clone because get only gives us a reference to the value, which we cannot move out of the vector. (Option::cloned does the same thing more concisely.)
collect then works with an iterator of Option<T> and conveniently turns it into Option<Vec<T>>: it returns None if any None was in the iterator (which means some inner vector was too short), or Some(Vec<T>) if everything is fine.
fn main() {
let array = vec![
vec![1, 2, 3, 4],
vec![1, 2, 3, 4, 5],
vec![1, 2, 3, 4],
vec![1, 2, 3, 4],
];
let ac = array_column(&array, 0);
println!("{:?}", ac); // Some([1, 1, 1, 1])
let ac = array_column(&array, 3);
println!("{:?}", ac); // Some([4, 4, 4, 4])
let ac = array_column(&array, 4);
println!("{:?}", ac); // None
}
fn array_column<T: Clone>(array: &Vec<Vec<T>>, column: usize) -> Option<Vec<T>> {
array.iter()
.map( |subvec| subvec.get(column).and_then(|e| Some(e.clone())) )
.collect()
}
Alex's version is good, but you can generalize it by returning references, so the items no longer need to be Clone:
fn array_column<'a, T>(array: &'a Vec<Vec<T>>, column: usize) -> Option<Vec<&'a T>> {
array.iter()
.map( |subvec| subvec.get(column) )
.collect()
}
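For comparison, a sketch of two further variations, neither taken from the answers: the cloning version written with Option::cloned and a slice parameter, and a lenient version that skips rows that are too short instead of returning None. The name `array_column_lenient` is hypothetical.

```rust
/// Returns column `column` of `rows`, or None if any row is too short.
fn array_column<T: Clone>(rows: &[Vec<T>], column: usize) -> Option<Vec<T>> {
    rows.iter()
        // get returns Option<&T>; cloned turns it into Option<T>.
        .map(|row| row.get(column).cloned())
        .collect()
}

/// Returns column `column` of `rows`, silently skipping rows that are too short.
/// `array_column_lenient` is a hypothetical name for this variation.
fn array_column_lenient<T: Clone>(rows: &[Vec<T>], column: usize) -> Vec<T> {
    rows.iter()
        .filter_map(|row| row.get(column).cloned())
        .collect()
}
```

Taking `&[Vec<T>]` rather than `&Vec<Vec<T>>` lets callers pass a vector, an array of vectors, or a sub-slice.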

Why does the closure executed in .map not change the captured value? [duplicate]

This question already has answers here:
How do I cope with lazy iterators?
(3 answers)
Closed 3 years ago.
Here is my code:
let mut v = Vec::new();
let _ = (0..5).map(|i| v.push(i));
println!("{:?}", v); //output: []
The captured value is v. I expect the code above to print [0, 1, 2, 3, 4], but it prints [].
Why is that?
The map method does not iterate through elements immediately. Instead, it creates a lazy iterator which you can use later. One of the ways to force the new iterator to perform its job is the Iterator::collect method. In your case, it will produce a new collection filled with empty values (() because it's the type of v.push(i)):
let mut v = Vec::new();
let v2: Vec<()> = (0..5).map(|i| v.push(i)).collect();
println!("{:?}", v); //output: [0, 1, 2, 3, 4]
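This does excess work by building the throwaway vector v2 just to drive the iterator. When you only need the side effects, prefer a plain for loop or Iterator::for_each, both of which run eagerly.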
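Two eager alternatives, sketched as small functions so the result can be inspected. The function names are hypothetical.

```rust
/// Fills a vector through side effects with for_each, which runs eagerly.
fn push_with_for_each() -> Vec<i32> {
    let mut v = Vec::new();
    (0..5).for_each(|i| v.push(i));
    v
}

/// Same result without a closure: extend consumes the iterator directly.
fn push_with_extend() -> Vec<i32> {
    let mut v = Vec::new();
    v.extend(0..5);
    v
}
```

Both return `[0, 1, 2, 3, 4]`; neither creates a second collection.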

How to remove an element from a vector given the element?

Is there a simple way to remove an element from a Vec<T>?
There's a method called remove(), and it takes an index: usize, but there isn't even an index_of() method that I can see.
I'm looking for something (hopefully) simple and O(n).
This is what I have come up with so far (it also makes the borrow checker happy):
let index = xs.iter().position(|x| *x == some_x).unwrap();
xs.remove(index);
I'm still waiting to find a better way to do this as this is pretty ugly.
Note: my code assumes the element does exist (hence the .unwrap()).
You can use the retain method but it will delete every instance of the value:
fn main() {
let mut xs = vec![1, 2, 3];
let some_x = 2;
xs.retain(|&x| x != some_x);
println!("{:?}", xs); // prints [1, 3]
}
Your question is under-specified: do you want to return all items equal to your needle or just one? If one, the first or the last? And what if there is no single element equal to your needle? And can it be removed with the fast swap_remove or do you need the slower remove? To force programmers to think about those questions, there is no simple method to "remove an item" (see this discussion for more information).
Remove first element equal to needle
// Panic if no such element is found
vec.remove(vec.iter().position(|x| *x == needle).expect("needle not found"));
// Ignore if no such element is found
if let Some(pos) = vec.iter().position(|x| *x == needle) {
vec.remove(pos);
}
You can of course handle the None case however you like (panic and ignoring are not the only possibilities).
Remove last element equal to needle
Like the first element, but replace position with rposition.
Remove all elements equal to needle
vec.retain(|x| *x != needle);
... or with swap_remove
Remember that remove has a runtime of O(n) as all elements after the index need to be shifted. Vec::swap_remove has a runtime of O(1) as it swaps the to-be-removed element with the last one. If the order of elements is not important in your case, use swap_remove instead of remove!
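A sketch of the swap_remove and rposition cases, which the text above describes but does not show. The helper names are hypothetical, and i32 is used only to keep the example short.

```rust
/// Removes the first element equal to `needle`. The removal itself is O(1),
/// but the last element is moved into the freed slot, so order is not preserved.
fn swap_remove_first(values: &mut Vec<i32>, needle: i32) -> Option<i32> {
    let index = values.iter().position(|x| *x == needle)?;
    Some(values.swap_remove(index))
}

/// Removes the last element equal to `needle`, keeping the remaining order.
fn remove_last(values: &mut Vec<i32>, needle: i32) -> Option<i32> {
    let index = values.iter().rposition(|x| *x == needle)?;
    Some(values.remove(index))
}
```

Both return None when the needle is absent, leaving it to the caller to decide whether that is an error.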
There is a position() method for iterators which returns the index of the first element matching a predicate. Related question: Is there an equivalent of JavaScript's indexOf for Rust arrays?
And a code example:
fn main() {
let mut vec = vec![1, 2, 3, 4];
println!("Before: {:?}", vec);
let removed = vec.iter()
.position(|&n| n > 2)
.map(|e| vec.remove(e))
.is_some();
println!("Did we remove anything? {}", removed);
println!("After: {:?}", vec);
}
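A newer option than the answers above is drain_filter(), which seems similar to Kai's answer. It requires a nightly compiler (note the feature attribute), and later Rust versions renamed this API to extract_if: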
#![feature(drain_filter)]
let mut numbers = vec![1, 2, 3, 4, 5, 6, 8, 9, 11, 13, 14, 15];
numbers.drain_filter(|x| *x % 2 == 0).collect::<Vec<_>>();
assert_eq!(numbers, vec![1, 3, 5, 9, 11, 13, 15]);
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain_filter
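If a nightly feature is not an option, a similar split can be sketched with Iterator::partition, which is stable. This is a substitute technique rather than the same API: it consumes the vector and builds two new ones instead of filtering in place. `split_even` is a hypothetical helper name.

```rust
/// Splits `numbers` into (removed, kept) using only stable std APIs.
/// `split_even` is a hypothetical helper name.
fn split_even(numbers: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    // Items for which the predicate is true go into the first vector.
    numbers.into_iter().partition(|x| *x % 2 == 0)
}
```

When the removed elements are not needed at all, Vec::retain is simpler.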
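If your data is sorted, use binary search to find the element in O(log n), which can be much faster than a linear scan for large inputs. The removal itself is still O(n), since Vec::remove shifts all later elements.
// `value` must be a reference to the element type here.
match values.binary_search(value) {
    Ok(removal_index) => {
        // Discard the removed element so both arms have type ().
        values.remove(removal_index);
    }
    Err(_) => {} // value not contained.
}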
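A self-contained sketch of the same idea as a function. It assumes the vector is already sorted in ascending order; `remove_sorted` is a hypothetical helper name.

```rust
/// Removes one occurrence of `value` from a sorted vector and reports
/// whether it was found. Assumes `values` is sorted in ascending order.
fn remove_sorted(values: &mut Vec<i32>, value: i32) -> bool {
    match values.binary_search(&value) {
        Ok(index) => {
            values.remove(index);
            true
        }
        // Err carries the insertion point, which is not needed here.
        Err(_) => false,
    }
}
```

If the vector contains several equal elements, binary_search may find any one of them, so this removes one unspecified occurrence.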

Resources