How to group elements of a vector by a pattern? - rust

How to break a vector such as [9,7,6,3,4,0,1,7,3,9] -> [[9,7,6,3],[4,1],[7,3],[9]] -> [25,5,10,9]?
The logic is that the vector is split into subvectors in which each element is smaller than the previous one (zeros are ignored), i.e. into descending sequences. Once the subvectors are formed, each one is replaced with the sum of its elements.
https://www.codewars.com/kata/5f8fb3c06c8f520032c1e091

Iterate over the elements of the nums to build up the split. For every number, compare it to the last number to decide whether to create a sublist, or append to the existing one:
let nums = vec![9, 7, 6, 3, 4, 0, 1, 7, 3, 9];
let mut split: Vec<Vec<i32>> = vec![vec![]];
// Zeros are skipped entirely.
for &num in nums.iter().filter(|&&n| n != 0) {
    let sublist = split.last_mut().unwrap();
    match sublist.last() {
        // Larger than the previous number: the descending run ends, start a new sublist.
        Some(&prev) if num > prev => split.push(vec![num]),
        // First number, or the run continues: append to the current sublist.
        _ => sublist.push(num),
    }
}
let split = split; // make split immutable
let summed: Vec<i32> = split.iter().map(|v| v.iter().sum()).collect();
It's probably possible to make a more elegant solution using Iterator::partition_in_place, but that fn is sadly unstable for now.
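If only the sums are needed, the intermediate sublists can be skipped. Here is a minimal single-pass sketch, assuming the same rules as above (zeros are ignored, a run ends when a number is larger than the previous one). The function name sum_descending_runs is made up for illustration:

```rust
/// Sums each descending run of `nums`, ignoring zeros.
fn sum_descending_runs(nums: &[i32]) -> Vec<i32> {
    let mut sums: Vec<i32> = Vec::new();
    let mut prev: Option<i32> = None;
    for &n in nums.iter().filter(|&&n| n != 0) {
        // The run continues while the current number is not larger than the previous one.
        let continues_run = prev.map_or(false, |p| n <= p);
        if continues_run {
            // `sums` is non-empty here because `prev` is only set after a push.
            if let Some(total) = sums.last_mut() {
                *total += n;
            }
        } else {
            sums.push(n);
        }
        prev = Some(n);
    }
    sums
}
```

Calling sum_descending_runs(&[9, 7, 6, 3, 4, 0, 1, 7, 3, 9]) gives [25, 5, 10, 9], matching the example in the question.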

Related

Check serde_json Value for value with key sets of arbitrary depth or add if null Rust

With something like the vec below, I'd like to add arbitrary depth to a JSON object.
let set = vec![vec!["123","apple","orange","999"],vec!["1234","apple"],vec!["12345","apple","orange"]];
Once created the above would look something like:
{"123":{"apple":{"orange":"999"}}, "1234":"apple", "12345":{"apple":"orange"}}
I've tried recursion, but I'm having trouble reasoning through it. The wall I've hit is: how do I refer up the chain of values?
Is there a method I'm missing here? Surely I'm not the only person who's wanted to do this...
I would prefer, if at all possible, not to write something cumbersome that takes the length of a key set vec and matches on it to create the nesting, e.g.:
match keys.len() {
    2 => json_obj[keys[0]] = json!(keys[1]),
    3 => json_obj[keys[0]][keys[1]] = json!(keys[2]),
    4 => json_obj[keys[0]][keys[1]][keys[2]] = json!(keys[3]),
    ...
    _ => ()
}
Any ideas?
You can do this with iteration: on each loop you walk one level deeper into the structure and one step further into the iterator. The trick is that at each step you need to know whether there are more elements beyond the current one, because the final element needs to be a string instead of an object. We'll do this using a match construct that matches on the next two items in the sequence at once.
We can further generalize the function to take "anything that can be turned into an iterator that produces items from which we can obtain a &str". This will accept both an iterator of String or an iterator of &str, for example, or even directly a Vec of either.
use std::borrow::Borrow;
use serde_json::Value;
fn set_path(
    mut obj: &mut Value,
    path: impl IntoIterator<Item = impl Borrow<str>>,
) {
    let mut path = path.into_iter();
    // Start with nothing in "a" and the first item in "b".
    let mut a;
    let mut b = path.next();
    loop {
        // Shift "b" down into "a" and put the next item into "b".
        a = b;
        b = path.next();
        // Move "a" but borrow "b" because we will use it on the next iteration.
        match (a, &b) {
            (Some(key), Some(_)) => {
                // This level is an object, rebind deeper.
                obj = &mut obj[key.borrow()];
            }
            (Some(s), None) => {
                // This is the final string in the sequence.
                *obj = Value::String(s.borrow().to_owned());
                break;
            }
            // We were given an empty iterator.
            (None, _) => { break; }
        }
    }
}
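Since the question mentions getting stuck on recursion: the same walk can also be written recursively, and there is no need to refer back up the chain, because each call only touches the node it was handed. The sketch below uses a hypothetical Node enum built on std's BTreeMap instead of serde_json::Value, purely so that it runs without extra crates. It illustrates the idea and is not the code from the answer above:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON value: either a string or a nested object.
#[derive(Debug, PartialEq)]
enum Node {
    Leaf(String),
    Branch(BTreeMap<String, Node>),
}

/// Treats every item but the last as a key and the last item as the value.
fn set_path_recursive(node: &mut Node, path: &[&str]) {
    match path {
        // Empty path: nothing to do.
        [] => {}
        // Last item: store it as a string at the current position.
        [value] => *node = Node::Leaf(value.to_string()),
        // More items follow: make sure this level is an object, then recurse.
        [key, rest @ ..] => {
            if !matches!(*node, Node::Branch(_)) {
                // Assumption of this sketch: an existing leaf is overwritten.
                *node = Node::Branch(BTreeMap::new());
            }
            if let Node::Branch(map) = node {
                let child = map
                    .entry(key.to_string())
                    .or_insert_with(|| Node::Branch(BTreeMap::new()));
                set_path_recursive(child, rest);
            }
        }
    }
}
```

Starting from an empty Node::Branch and calling set_path_recursive once per key set builds the nested structure one path at a time.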

How to find a string of multiple occurrences in a vector?

I have a vector of strings, and I want to find a string that occurs more than once. I've tried this, but it didn't work.
let strings = vec!["Rust", "Rest", "Rust"]; // I want to find "Rust" in this case
let val = strings
    .into_iter()
    .find(|x| o.into_iter().filter(|y| x == y).count() >= 2)
    // sorry o ^ here is supposed to be strings
    .unwrap();
There are two issues in your code:
o doesn't exist. I assume you meant to use strings instead.
into_iter takes ownership of the value, so once you have called into_iter on strings (or o), you can't call it again. You should use plain iter instead.
Here's a fixed version:
let strings = vec!["Rust", "Rest", "Rust"]; // I want to find "Rust" in this case
let val = strings
    .iter()
    .find(|x| strings.iter().filter(|y| x == y).count() >= 2)
    .unwrap();
Note however that this is pretty slow. Depending on your requirements, there are more efficient alternatives:
Sort the strings array first. Then you only need to look at the next item to see if it is duplicated instead of needing to go through the whole array over and over. Advantage: no extra memory used. Drawback: you lose the original order.
Use an auxiliary variable to store the values you've already seen and/or count the number of occurrences of each string. This may be a HashSet, BTreeSet, HashMap or BTreeMap. See @Netwave's answer. Advantage: doesn't change the input array. Drawback: uses memory to keep track of the duplicates.
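Both alternatives can be sketched with the standard library alone. The function names below are made up for illustration, and each returns the first duplicate it encounters (for the sorting variant, first in sorted order):

```rust
use std::collections::HashSet;

/// Sort first, then compare neighbours. Reorders the input slice.
fn duplicate_by_sorting<'a>(items: &mut [&'a str]) -> Option<&'a str> {
    items.sort_unstable();
    items
        .windows(2)
        .find(|pair| pair[0] == pair[1])
        .map(|pair| pair[0])
}

/// Remember what has been seen so far. Leaves the input untouched.
fn duplicate_by_set<'a>(items: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already in the set.
    items.iter().copied().find(|item| !seen.insert(*item))
}
```

The set-based variant stops at the first repeated string, so it does not need to scan the whole input when a duplicate appears early.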
You can count the occurrences in O(n) expected time with a hash table (a tree-based map such as BTreeMap works too, at O(n log n)):
use std::collections::HashMap;
fn main() {
    let strings = vec!["Rust", "Rest", "Rust"];
    // Map each string to the number of times it occurs.
    let mut sorted_data: HashMap<&str, u32> = HashMap::new();
    for &item in &strings {
        *sorted_data.entry(item).or_insert(0) += 1;
    }
    println!("{:?}", sorted_data);
}
Then just use the one with the biggest count, for example with reduce (named fold_first while it was still unstable):
let result = sorted_data.iter().reduce(|(k1, v1), (k2, v2)| if v2 > v1 { (k2, v2) } else { (k1, v1) }).unwrap();

How does one get an iterator to the max value element in Rust?

I want to access the element next to the maximal one in a Vec<i32>. I'm looking for something like this:
let v = vec![1, 3, 2];
let it = v.iter().max_element();
assert_eq!(Some(&2), it.next());
In C++, I would go with std::max_element and then just increase the iterator (with or without bounds checking, depending on how adventurous I feel at the moment). The Rust max only returns a reference to the element, which is not good enough for my use case.
The only solution I came up with is using enumerate to get the index of the item - but this seems manual and cumbersome when compared to the C++ way.
I would prefer something in the standard library.
This example is simplified - I actually want to attach to the highest value and then from that point loop over the whole container (possibly with cycle() or something similar).
C++ iterators are not the same as Rust iterators. Rust iterators are forward-only and can only be traversed once. C++ iterators can be thought of as cursors. See What are the main differences between a Rust Iterator and C++ Iterator? for more details.
In order to accomplish your goal in the most generic way possible, you have to walk through the entire iterator to find the maximum value. Along the way, you have to duplicate the iterator each time you find a new maximum value. At the end, you can return the iterator corresponding to the point after the maximum value.
trait MaxElement {
    type Iter;
    fn max_element(self) -> Self::Iter;
}
impl<I> MaxElement for I
where
    I: Iterator + Clone,
    I::Item: PartialOrd,
{
    type Iter = Self;
    fn max_element(mut self) -> Self::Iter {
        let mut max_iter = self.clone();
        let mut max_val = None;
        while let Some(val) = self.next() {
            if max_val.as_ref().map_or(true, |m| &val > m) {
                max_iter = self.clone();
                max_val = Some(val);
            }
        }
        max_iter
    }
}
fn main() {
    let v = vec![1, 3, 2];
    let mut it = v.iter().max_element();
    assert_eq!(Some(&2), it.next());
}
See also:
How can I add new methods to Iterator?
I actually want to attach to the highest value and then from that point loop over the whole container (possibly with cycle() or something similar).
In that case, I'd attempt to be more obvious:
fn index_of_max(values: &[i32]) -> Option<usize> {
    values
        .iter()
        .enumerate()
        .max_by_key(|&(_idx, &val)| val)
        .map(|(idx, _val)| idx)
}
fn main() {
    let v = vec![1, 3, 2];
    let idx = index_of_max(&v).unwrap_or(0);
    // Split at the maximum, then iterate "max.., ..max" and skip the maximum itself.
    let (a, b) = v.split_at(idx);
    let mut it = b.iter().chain(a).skip(1);
    assert_eq!(Some(&2), it.next());
}
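For the stated goal of starting at the maximum and then looping over the whole container, the index can also be combined with cycle(). A small sketch: the helper name and the choice to collect into a Vec are mine, and note that max_by_key returns the last maximum when there are ties:

```rust
/// Returns the elements of `values` in cyclic order, starting at the maximum.
fn cycle_from_max(values: &[i32]) -> Vec<i32> {
    let idx = match values
        .iter()
        .enumerate()
        .max_by_key(|&(_idx, &val)| val)
    {
        Some((idx, _val)) => idx,
        None => return Vec::new(), // empty input
    };
    values
        .iter()
        .copied()
        .cycle()
        .skip(idx)
        .take(values.len())
        .collect()
}
```

For vec![1, 3, 2] this yields [3, 2, 1]. Dropping the take(values.len()) call would keep cycling forever, which may be what is wanted in a streaming setting.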
See also:
What's the fastest way of finding the index of the maximum value in an array?
Using max_by_key on a vector of floats
What is the idiomatic way to get the index of a maximum or minimum floating point value in a slice or Vec in Rust?
Find the item in an array with the largest property
A simple solution is to use fold. The following code prints "largest num is: 99":
let vv: Vec<i32> = (1..100).collect();
let largest = vv.iter().fold(i32::MIN, |a, b| a.max(*b));
println!("largest num is: {}", largest);
If all you want is the value of the item following the maximum, I would do it with a simple call to fold, keeping track of the max found so far and the corresponding next value:
fn main() {
    let v = vec![1, 3, 2];
    let nxt = v.iter().fold(
        (None, None),
        |acc, x| {
            match acc {
                (Some(max), _) if x > max => (Some(x), None),
                (Some(max), None) => (Some(max), Some(x)),
                (None, _) => (Some(x), None),
                _ => acc
            }
        }
    ).1;
    assert_eq!(Some(&2), nxt);
}
Depending on what you want to do with the items following the max, a similar approach may allow you to do it in a single pass.

Collect items from an iterator at a specific index

I was wondering if it is possible to use .collect() on an iterator to grab items at a specific index. For example if I start with a string, I would normally do:
let line = "Some line of text for example";
let l = line.split(" ");
let lvec: Vec<&str> = l.collect();
let text = &lvec[3];
But what would be nice is something like:
let text: &str = l.collect(index=(3));
No, it's not; however you can easily filter before you collect, which in practice achieves the same effect.
If you wish to filter by index, you need to add the index in and then strip it afterwards:
enumerate (to add the index to the element)
filter based on this index
map to strip the index from the element
Or in code:
fn main() {
    let line = "Some line of text for example";
    let l = line.split(" ")
        .enumerate()
        .filter(|&(i, _)| i == 3)
        .map(|(_, e)| e);
    let lvec: Vec<&str> = l.collect();
    let text = &lvec[0];
    println!("{}", text);
}
If you only wish to get a single index (and thus element), then using nth is much easier. It returns an Option<&str> here, which you need to take care of:
fn main() {
    let line = "Some line of text for example";
    let text = line.split(" ").nth(3).unwrap();
    println!("{}", text);
}
If you have an arbitrary predicate but want only the first element that matches, then collecting into a Vec is inefficient: it will consume the whole iterator (no laziness) and potentially allocate a lot of memory that is not needed at all.
You are thus better off simply asking for the first element using the next method of the iterator, which returns an Option<&str> here:
fn main() {
    let line = "Some line of text for example";
    let text = line.split(" ")
        .enumerate()
        .filter(|&(i, _)| i % 7 == 3)
        .map(|(_, e)| e)
        .next()
        .unwrap();
    println!("{}", text);
}
If you want to select part of the result, by index, you may also use skip and take before collecting, but I guess you have enough alternatives presented here already.
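For completeness, the skip and take variant might look like this (the function name and parameters are only illustrative):

```rust
/// Collects up to `count` words starting at word index `start`.
fn words_in_range(line: &str, start: usize, count: usize) -> Vec<&str> {
    line.split(' ').skip(start).take(count).collect()
}
```

With the line from the question, words_in_range(line, 2, 3) gives ["of", "text", "for"]. A range that runs past the end simply yields fewer words instead of panicking.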
There is a nth function on Iterator that does this:
let text = line.split(" ").nth(3).unwrap();
No; you can use skip and next, though:
let line = "Some line of text for example";
let l = line.split(" ");
let text = l.skip(3).next();
Note that this results in text being an Option<&str>, as there's no guarantee that the sequence actually has at least four elements.
Addendum: using nth is definitely shorter, though I prefer to be explicit about the fact that accessing the nth element of an iterator necessarily consumes all the elements before it.
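Either way, the Option can be handled without unwrap when the line may be too short. A small sketch with a made-up helper name:

```rust
/// Returns the word at `index`, or `fallback` if the line has too few words.
fn word_or<'a>(line: &'a str, index: usize, fallback: &'a str) -> &'a str {
    line.split(' ').nth(index).unwrap_or(fallback)
}
```

A match or if let on the Option works just as well when the missing case needs more than a default value.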
For anyone who may be interested, you can do loads of cool things with iterators (thanks Matthieu M). For example, to get multiple 'words' from a string according to their index, you can use filter along with logical or (||) to test for multiple indexes:
let line = "FCC2CCMACXX:4:1105:10758:14389# 81 chrM 1 32 10S90M = 16151 16062";
let words: Vec<&str> = line.split(" ")
    .enumerate()
    .filter(|&(i, _)| i == 1 || i == 3 || i == 6)
    .map(|(_, e)| e)
    .collect();
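When the list of indexes is not known up front, the chain of || tests can be replaced by a lookup in a slice. A sketch, with pick_words and wanted as illustrative names:

```rust
/// Keeps only the words whose position appears in `wanted`.
fn pick_words<'a>(line: &'a str, wanted: &[usize]) -> Vec<&'a str> {
    line.split(' ')
        .enumerate()
        .filter(|(i, _)| wanted.contains(i))
        .map(|(_, word)| word)
        .collect()
}
```

The words come back in their original order regardless of the order of wanted. For a long list of indexes a HashSet would avoid the linear contains scan.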

What's the best way to compare 2 vectors or strings element by element?

What's the best way to compare 2 vectors or strings element by element in Rust, while being able to do processing on each pair of elements? For example if you wanted to keep count of the number of differing elements. This is what I'm using:
let mut diff_count: i32 = 0i32;
for (x, y) in a.chars().zip(b.chars()) {
    if x != y {
        diff_count += 1i32;
    }
}
Is that the correct way or is there something more canonical?
To get the count of matching elements, I'd probably use filter and count.
fn main() {
    let a = "Hello";
    let b = "World";
    let matching = a.chars().zip(b.chars()).filter(|&(a, b)| a == b).count();
    println!("{}", matching);
    let a = [1, 2, 3, 4, 5];
    let b = [1, 1, 3, 3, 5];
    let matching = a.iter().zip(&b).filter(|&(a, b)| a == b).count();
    println!("{}", matching);
}
Iterator::zip takes two iterators and produces another iterator of the tuple of each iterator's values.
Iterator::filter takes a reference to the iterator's value and discards any value where the predicate closure returns false. This performs the comparison.
Iterator::count counts the number of elements in the iterator.
Note that Iterator::zip stops iterating when one iterator is exhausted. If you need different behavior, you may also be interested in Itertools::zip_longest or Itertools::zip_eq.
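If pulling in itertools is not an option, the length difference can be accounted for by hand. A sketch using only std, in which every position present in one string but not the other counts as a difference (that counting rule is my assumption, not something from the question):

```rust
/// Counts positions where `a` and `b` differ, including any unmatched tail.
fn count_differences(a: &str, b: &str) -> usize {
    let len_a = a.chars().count();
    let len_b = b.chars().count();
    // Differences within the common prefix length.
    let mismatched = a.chars().zip(b.chars()).filter(|(x, y)| x != y).count();
    // Every extra character in the longer string counts as one difference.
    mismatched + (len_a.max(len_b) - len_a.min(len_b))
}
```

For "Hello" and "World" this gives 4, since only the second "l" lines up.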
If you wanted to use @Shepmaster's answer as the basis of an assertion to be used in a unit test, try this:
fn do_vecs_match<T: PartialEq>(a: &[T], b: &[T]) -> bool {
    // Count the pairs that are equal; both slices must consist entirely of such pairs.
    let matching = a.iter().zip(b.iter()).filter(|&(a, b)| a == b).count();
    matching == a.len() && matching == b.len()
}
Of course, be careful when using this on floats! Those pesky NaNs won't compare, and you might want to use a tolerance for comparing the other values. And you might want to make it fancy by telling the index of the first nonmatching value.
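A rough sketch of that fancier version for f64 slices, reporting the index of the first mismatch and using an absolute tolerance. The name first_mismatch, the tol parameter, and the decision to treat NaN as a mismatch are all my own choices here:

```rust
/// Returns the index of the first pair that differs by more than `tol`,
/// or the length of the shorter slice if only the lengths differ.
fn first_mismatch(a: &[f64], b: &[f64], tol: f64) -> Option<usize> {
    a.iter()
        .zip(b)
        // Written as a negation so that NaN (for which `<=` is false) is a mismatch.
        .position(|(x, y)| !((x - y).abs() <= tol))
        .or_else(|| {
            if a.len() != b.len() {
                Some(a.len().min(b.len()))
            } else {
                None
            }
        })
}
```

In a test, assert_eq!(first_mismatch(&got, &want, 1e-9), None) then fails with the offending index in the message. A relative tolerance may be more appropriate when the values span many orders of magnitude.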
