I have two approaches to counting how often each character occurs in a string. One uses std::collections::HashMap and the other uses itertools::Itertools::group_by. Unfortunately, the Itertools version gives me unexpected results.
Example input word: "Barbara"
Using std::collections::HashMap
let map1 = word.to_lowercase()
    .chars()
    .fold(HashMap::new(), |mut acc, c| {
        *acc.entry(c).or_insert(0) += 1;
        acc
    });
Result {'a': 3, 'b': 2, 'r': 2}
And using itertools::Itertools::group_by
let map2: HashMap<char, u32> = word.to_lowercase()
    .chars()
    .group_by(|&x| x)
    .into_iter()
    .map(|(k, v)| (k, v.count() as u32))
    .collect();
Result {'r': 1, 'a': 1, 'b': 1}
Oddly enough, when the input string contains identical characters in succession, Itertools does count those characters together.
The question is, what makes it return different results?
You're looking for into_group_map_by. group_by only groups consecutive elements according to the docs.
use itertools::Itertools;
use std::collections::HashMap;

fn main() {
    let word = "Barbara";

    let map1 = word
        .to_lowercase()
        .chars()
        .fold(HashMap::new(), |mut acc, c| {
            *acc.entry(c).or_insert(0) += 1;
            acc
        });
    println!("{:?}", map1);

    let map2: HashMap<char, u32> = word
        .to_lowercase()
        .chars()
        .into_group_map_by(|&x| x)
        .into_iter()
        .map(|(k, v)| (k, v.len() as u32))
        .collect();
    println!("{:?}", map2);
}
Output:
{'b': 2, 'a': 3, 'r': 2}
{'b': 2, 'r': 2, 'a': 3}
There's also into_grouping_map_by, which can be used for this like:
let map2: HashMap<char, u32> = word
    .to_lowercase()
    .chars()
    .into_grouping_map_by(|&x| x)
    .fold(0, |acc, _key, _value| acc + 1);
The documentation says (emphasis added):
fn group_by<K, F>(self, key: F) -> GroupBy<K, Self, F>
where
    Self: Sized,
    F: FnMut(&Self::Item) -> K,
    K: PartialEq,
Return an iterable that can group iterator elements. Consecutive elements that map to the same key (“runs”), are assigned to the same group.
It only groups consecutive elements. You'll need to sort the characters before calling group_by.
let map2: HashMap<char, u32> = word.to_lowercase()
    .chars()
    .sorted()
    .group_by(|&x| x)
    ...
Output:
{'a': 3, 'r': 2, 'b': 2}
{'b': 2, 'a': 3, 'r': 2}
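To see why the unsorted version reports 1 for every character, it may help to look at the runs directly. The sketch below uses only the standard library; `runs` is a hypothetical helper that mimics grouping consecutive equal characters, not the itertools implementation. Collecting (key, count) pairs into a HashMap keeps only the last pair for each key, so for "barbara" every character ends up with a run length of 1, while the sorted input produces one run per character. (As far as I know, recent itertools releases rename group_by to chunk_by with the same consecutive-only behavior.)

```rust
use std::collections::HashMap;

// Hypothetical helper: collapse consecutive equal characters
// into (character, run length) pairs.
fn runs(s: &str) -> Vec<(char, usize)> {
    let mut out: Vec<(char, usize)> = Vec::new();
    for c in s.chars() {
        if let Some((prev, n)) = out.last_mut() {
            if *prev == c {
                // Same character as the previous one: extend the current run.
                *n += 1;
                continue;
            }
        }
        // Different character (or first character): start a new run.
        out.push((c, 1));
    }
    out
}

fn main() {
    let word = "barbara";

    // Unsorted: seven runs of length 1; later pairs overwrite earlier ones.
    let unsorted: HashMap<char, usize> = runs(word).into_iter().collect();

    // Sorted: equal characters become adjacent, so each run is a full count.
    let mut chars: Vec<char> = word.chars().collect();
    chars.sort_unstable();
    let sorted_word: String = chars.into_iter().collect();
    let sorted: HashMap<char, usize> = runs(&sorted_word).into_iter().collect();

    println!("{unsorted:?}");
    println!("{sorted:?}");
}
```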
I have a vector containing two vectors of different sizes:
let vectors = vec![
    vec![0, 1],
    vec![2, 3, 4]
];
I would like to create an iterator to cycle over the elements of each vector, returning:
0: [0, 2]
1: [1, 3]
2: [0, 4]
3: [1, 2]
...
In this example there are two vectors, but I would like to generalize this for k vectors.
I have tried this:
let cycles = vectors
    .into_iter()
    .map(|x| x.into_iter().cycle());

loop {
    let output: Vec<_> = cycles
        .map(|x| x.next().unwrap())
        .collect();
}
However, it does not work, because x cannot be borrowed as mutable.
error[E0596]: cannot borrow `x` as mutable, as it is not declared as mutable
  --> src/main.rs:14:22
   |
14 |             .map(|x| x.next().unwrap())
   |                   -  ^^^^^^^^ cannot borrow as mutable
   |                   |
   |                   help: consider changing this to be mutable: `mut x`
I understand the error, but I fail to think of an alternative way to build this iterator.
You have to collect the iterators into some data structure, such as a Vec.
You can then use iter_mut to iterate over mutable references, which let you advance the collected iterators.
fn main() {
    let vectors = vec![vec![0, 1], vec![2, 3, 4]];

    let mut cycles = vectors
        .into_iter()
        .map(|x| x.into_iter().cycle())
        .collect::<Vec<_>>();

    for i in 0.. {
        let output: Vec<_> = cycles.iter_mut().map(|x| x.next().unwrap()).collect();
        println!("{i}: {output:?}");
    }
}
Do you mean:
0: [0, 2]
1: [1, 3]
2: [0, 4]
3: [1, 2] <- was 3
...
If so:
let vectors: Vec<Vec<u8>> = vec![vec![0, 1], vec![2, 3, 4]];
let mut cycles: Vec<_> = vectors.iter().map(|x| x.iter().cycle()).collect();
for i in 0..4 {
    let output: Vec<_> = cycles.iter_mut().map(|x| x.next().unwrap()).collect();
    println!("{i}: {output:?}");
}
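For k vectors, the same idea can be wrapped in a reusable iterator. This is only a sketch: `cycle_rows` is a hypothetical name, and it assumes that stopping is acceptable when any inner vector is empty (a cycle over an empty vector yields nothing, so the unwrap in the versions above would panic there).

```rust
// Hypothetical helper: yields one row per step, taking the next element
// from each cycling inner vector. Ends immediately if any inner vector is empty.
fn cycle_rows<T: Clone>(vectors: Vec<Vec<T>>) -> impl Iterator<Item = Vec<T>> {
    // Collect the cycling iterators so they can be advanced through iter_mut.
    let mut cycles: Vec<_> = vectors
        .into_iter()
        .map(|v| v.into_iter().cycle())
        .collect();

    // Collecting into Option<Vec<T>> gives None as soon as one iterator is exhausted.
    std::iter::from_fn(move || {
        cycles
            .iter_mut()
            .map(|c| c.next())
            .collect::<Option<Vec<T>>>()
    })
}

fn main() {
    let vectors = vec![vec![0, 1], vec![2, 3, 4]];
    // The iterator is infinite for non-empty inputs, so bound it with take.
    for (i, row) in cycle_rows(vectors).take(4).enumerate() {
        println!("{i}: {row:?}");
    }
}
```

Because the iterator never ends for non-empty inputs, callers need something like take or a break condition.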
I have a vec: vec![1, 2, 3, 4]. I want to know how to:
Iterate its prefixes from shortest to longest:
&[], &[1], &[1, 2], &[1, 2, 3], &[1, 2, 3, 4]
Iterate its prefixes from longest to shortest:
&[1, 2, 3, 4], &[1, 2, 3], &[1, 2], &[1], &[]
Iterate its suffixes from shortest to longest:
&[], &[4], &[3, 4], &[2, 3, 4], &[1, 2, 3, 4]
Iterate its suffixes from longest to shortest:
&[1, 2, 3, 4], &[2, 3, 4], &[3, 4], &[4], &[]
See also: How to iterate prefixes and suffixes of str or String in rust?
What works for a &[T] (i.e. a slice) will also work for &Vec<T> due to Deref coercion.
To construct a range of indexes from 0 to slice.len() inclusive: 0..=slice.len()
An inclusive range like this is a std::ops::RangeInclusive, which implements both Iterator and DoubleEndedIterator. This allows you to use the rev() method:
(0..=slice.len()).rev()
To get a prefix of a given length: &slice[..len]
To get a suffix without the first cut items: &slice[cut..]
Putting it all together
To iterate from shortest to longest:
pub fn prefixes_asc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    (0..=slice.len()).map(move |len| &slice[..len])
}

pub fn suffixes_asc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    (0..=slice.len()).rev().map(move |cut| &slice[cut..])
}
To reverse just use .rev():
pub fn prefixes_desc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    prefixes_asc(slice).rev()
}

pub fn suffixes_desc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    suffixes_asc(slice).rev()
}
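A short usage sketch may make the four orders easier to compare. It repeats the two ascending helpers so that it compiles on its own and relies on Deref coercion to pass a &Vec<i32> where a slice is expected; the descending orders are obtained with rev() at the call site.

```rust
// Same ascending helpers as above, repeated so this example is self-contained.
fn prefixes_asc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    (0..=slice.len()).map(move |len| &slice[..len])
}

fn suffixes_asc<T>(slice: &[T]) -> impl Iterator<Item = &[T]> + DoubleEndedIterator {
    (0..=slice.len()).rev().map(move |cut| &slice[cut..])
}

fn main() {
    let v = vec![1, 2, 3, 4];

    // &v is a &Vec<i32>; it coerces to &[i32] at the call.
    for p in prefixes_asc(&v) {
        println!("prefix (asc):  {p:?}");
    }
    for p in prefixes_asc(&v).rev() {
        println!("prefix (desc): {p:?}");
    }
    for s in suffixes_asc(&v) {
        println!("suffix (asc):  {s:?}");
    }
    for s in suffixes_asc(&v).rev() {
        println!("suffix (desc): {s:?}");
    }
}
```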
In Python it's done this way:
>>> x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
>>> {k: v for k, v in sorted(x.items(), key=lambda item: item[1])}
{0: 0, 2: 1, 1: 2, 4: 3, 3: 4}
How do I sort a HashMap by values in Rust?
My code so far:
use std::collections::HashMap;

fn main() {
    let mut count: HashMap<String, u32> = HashMap::new();
    count.insert(String::from("A"), 5);
    count.insert(String::from("B"), 2);
    count.insert(String::from("C"), 11);
    count.insert(String::from("D"), 10);

    let highest = count.iter().max_by(|a, b| a.1.cmp(&b.1)).unwrap();
    println!("largest hash: {:?}", highest); // largest hash: ("C", 11)
}
Unlike Python's dict, Rust's "built-in" HashMap does not preserve any order, so there is no way to sort it in place.
If you need an ordered map for some reason, you should use indexmap. Alternatively, BTreeMap is sorted based on the key.
As you don't really present any sort of compelling use case it's hard to provide counsel though.
Yes, I sorted it by converting it to a vector:
use std::collections::HashMap;

fn main() {
    let mut count: HashMap<String, u32> = HashMap::new();
    count.insert(String::from("A"), 5);
    count.insert(String::from("B"), 2);
    count.insert(String::from("C"), 11);
    count.insert(String::from("D"), 10);

    let mut hash_vec: Vec<(&String, &u32)> = count.iter().collect();
    println!("{:?}", hash_vec);
    hash_vec.sort_by(|a, b| b.1.cmp(a.1));
    println!("Sorted: {:?}", hash_vec); // Sorted: [("C", 11), ("D", 10), ("A", 5), ("B", 2)]
}
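One caveat with sorting the collected entries: HashMap iteration order is unspecified, so entries with equal values can come out in a different relative order from run to run. A possible refinement, sketched below, breaks ties by key so the result is deterministic. The helper name `sorted_by_value_desc` and the extra "E" entry are illustrative assumptions, not part of the original code.

```rust
use std::collections::HashMap;

// Hypothetical helper: entries sorted by value (descending),
// with ties broken by key (ascending) so the order is deterministic.
fn sorted_by_value_desc(map: &HashMap<String, u32>) -> Vec<(&String, &u32)> {
    let mut entries: Vec<(&String, &u32)> = map.iter().collect();
    entries.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));
    entries
}

fn main() {
    let mut count: HashMap<String, u32> = HashMap::new();
    count.insert(String::from("A"), 5);
    count.insert(String::from("B"), 2);
    count.insert(String::from("C"), 11);
    count.insert(String::from("D"), 10);
    count.insert(String::from("E"), 10); // same value as "D" to show the tie-break

    for (key, value) in sorted_by_value_desc(&count) {
        println!("{key}: {value}");
    }
}
```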
I have a vector that I want to sort by two criteria. The first criterion is frequency; the second is position in the vector. If two elements occur the same number of times, the most recently seen element should take precedence and come first. Finally, I want to remove duplicate elements.
For instance, if the input is this:
fn main() {
    let history = vec![3, 2, 4, 6, 2, 4, 3, 3, 4, 5, 6, 3, 2, 4, 5, 5, 3];
}
The output should be:
3 4 5 2 6
How can I do this in Rust?
A straightforward method is to build hash maps for frequencies and positions of the elements:
use std::collections::HashMap;

fn frequency_map(nums: &[i32]) -> HashMap<i32, usize> {
    let mut map = HashMap::new();
    for &n in nums {
        *map.entry(n).or_insert(0) += 1;
    }
    map
}

fn position_map(nums: &[i32]) -> HashMap<i32, usize> {
    let mut map = HashMap::new();
    for (pos, &n) in nums.iter().enumerate() {
        map.insert(n, pos);
    }
    map
}
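And then do an unstable sort by last position (most recent first), remove the now-adjacent duplicates, and finish with a stable sort by frequency:
fn custom_sort(nums: &mut Vec<i32>) {
    let freq_map = frequency_map(nums);
    let pos_map = position_map(nums);
    // Equal values share the same last position, so they end up adjacent.
    nums.sort_unstable_by(|a, b| pos_map.get(b).unwrap().cmp(pos_map.get(a).unwrap()));
    nums.dedup();
    // The stable sort keeps the recency order among elements with equal frequency.
    nums.sort_by(|a, b| freq_map.get(b).unwrap().cmp(freq_map.get(a).unwrap()));
}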
Example:
use itertools::Itertools;

fn main() {
    let mut history = vec![3, 2, 4, 6, 2, 4, 3, 3, 4, 5, 6, 3, 2, 4, 5, 5, 3];
    custom_sort(&mut history);
    println!("[{}]", history.iter().format(", "));
}
Output:
[3, 4, 5, 2, 6]
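As an alternative to the two-pass sort, the frequency and the last position can be kept in a single map and compared as a tuple. This is a different technique from the one above, sketched with the standard library only; `rank_history` is a hypothetical name. Tuples compare lexicographically, so the frequency decides first and the last position breaks ties.

```rust
use std::collections::HashMap;

// Hypothetical helper: distinct values ordered by frequency (descending),
// then by most recent position (descending).
fn rank_history(history: &[i32]) -> Vec<i32> {
    // value -> (number of occurrences, index of last occurrence)
    let mut stats: HashMap<i32, (usize, usize)> = HashMap::new();
    for (pos, &n) in history.iter().enumerate() {
        let entry = stats.entry(n).or_insert((0, pos));
        entry.0 += 1;
        entry.1 = pos;
    }

    // The map keys are already the deduplicated values.
    let mut ranked: Vec<i32> = stats.keys().copied().collect();
    // Last positions are unique per value, so an unstable sort is sufficient.
    ranked.sort_unstable_by(|a, b| stats[b].cmp(&stats[a]));
    ranked
}

fn main() {
    let history = vec![3, 2, 4, 6, 2, 4, 3, 3, 4, 5, 6, 3, 2, 4, 5, 5, 3];
    println!("{:?}", rank_history(&history));
}
```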
The Rust standard library has a fold() which collapses an iterator into a single result:
let a = [1, 2, 3];
// the sum of all of the elements of the array
let sum = a.iter().fold(0, |acc, x| acc + x);
assert_eq!(sum, 6);
Does the standard library have an equivalent version that yields each element? That is, something like:
let partial_sums = a.iter()
    .what_goes_here(0, |acc, x| acc + x)
    .collect::<Vec<_>>();
assert_eq!(partial_sums, vec![1, 3, 6]);
Effectively, iter.fold(init, f) is semantically equivalent to
iter
    .what_goes_here(init, f)
    .last()
    .unwrap_or(init)
For anyone in the same boat as me, I'm looking for the Rust equivalent of the C++ algorithm partial_sum.
You want Iterator::scan:
fn main() {
    let v = vec![1, 2, 3];
    let res = v
        .iter()
        .scan(0, |acc, &x| {
            *acc += x;
            Some(*acc)
        })
        .collect::<Vec<_>>();
    assert_eq!(res, vec![1, 3, 6]);
}
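scan takes a closure that mutates the state through a reference, rather than a fold-style closure that returns the new accumulator. If the fold-shaped signature from the question is preferred, a thin wrapper over scan is one way to get it. The sketch below is an assumption-laden illustration: `running_fold` is a hypothetical name, not a standard library function, and it requires the accumulator to be Clone.

```rust
// Hypothetical adapter: like fold, but yields every intermediate accumulator.
fn running_fold<I, B, F>(iter: I, init: B, mut f: F) -> impl Iterator<Item = B>
where
    I: Iterator,
    B: Clone,
    F: FnMut(B, I::Item) -> B,
{
    iter.scan(init, move |acc, x| {
        // Compute the next accumulator from the current one, store it, and yield it.
        *acc = f(acc.clone(), x);
        Some(acc.clone())
    })
}

fn main() {
    let a = [1, 2, 3];

    let partial_sums: Vec<i32> = running_fold(a.iter(), 0, |acc, x| acc + x).collect();
    println!("{partial_sums:?}");

    // The last yielded value matches what fold returns for the same closure.
    let last = running_fold(a.iter(), 0, |acc, x| acc + x).last().unwrap_or(0);
    println!("{last} == {}", a.iter().fold(0, |acc, x| acc + x));
}
```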