How to count HashMap values using a predicate in Rust? - rust

I'm trying this, but it doesn't work:
let map = HashMap::new();
map.insert(1, "aaa");
map.insert(2, "bbb");
let a = map.counts_by(|k, v| v.starts_with("a"));
What is the right way?

Anything that iterates over collections in Rust goes through the Iterator API. Unlike in Java, where iterators are often used implicitly, it's very common in Rust to explicitly ask for an iterator (with .iter()) and work on it directly in a functional style. In your case, there are three things to do.
Get the values of the HashMap. This can be done with the values method, which returns an iterator.
Keep only the ones satisfying a particular predicate. This is a filter operation and will produce another iterator. Note that this does not yet iterate over the hash map; it merely produces another iterator capable of doing so later.
Count the matches, using count.
Putting it all together, we have
map.values().filter(|v| v.starts_with("a")).count()
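As a minimal, self-contained sketch of the three steps above (the helper name count_values_starting_with and the extra "abc" entry are made up for illustration and are not part of the question):

```rust
use std::collections::HashMap;

// Hypothetical helper: counts the values that start with `prefix`.
fn count_values_starting_with(map: &HashMap<i32, &str>, prefix: &str) -> usize {
    map.values()                              // 1. iterate over the values only
        .filter(|v| v.starts_with(prefix))    // 2. keep the matching ones (lazy)
        .count()                              // 3. consume the iterator and count
}

fn main() {
    // The map must be `mut` for `insert` to compile.
    let mut map = HashMap::new();
    map.insert(1, "aaa");
    map.insert(2, "bbb");
    map.insert(3, "abc");
    println!("{}", count_values_starting_with(&map, "a")); // prints 2
}
```

Nothing is traversed until count runs; values and filter only build up the iterator chain.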

You should filter an iterator of the HashMap, then count the elements of the iterator:
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "aaa");
map.insert(2, "bbb");
assert_eq!(
map.iter().filter(|(_k, v)| v.starts_with("a")).count(),
1
);
}
Notice that the map also has to be marked as mut in order to insert new elements, and the filter closure destructures into a tuple containing the key and the value, rather than accepting two separate parameters.

Related

How to iterate over HashMap starting from given key?

Given a HashMap of n elements, how does one start iteration from the (n-x)th element?
The order of elements does not matter, the only problem I need to solve is to start iteration from given key.
Example:
let mut map: HashMap<&str, i32> = HashMap::new();
map.insert("one", 1);
map.insert("two", 2);
map.insert("three", 3);
map.insert("four", 4);
[...]
for (k, v) in map {
//how to start iteration from third item and not the first one
}
Tried to google it but no examples found so far.
"Tried to google it but no examples found so far."
That's because, as Chayim Friedman notes, it doesn't really make sense: a hashmap has an essentially random internal order, which means it has an arbitrary iteration order. Iterating from or between keys (or entries) thus doesn't make much sense.
So this sounds a lot like an XY problem: why are you trying to iterate "starting from a given key"?
Though if you really want that, you can just use the skip_while adapter, and skip while you have not found the key you're looking for.
Alternatively, since your post is ambiguous (you talk about both key and position) you can use the skip adapter to skip over a fixed number of items.
Technically, neither will start iterating from that entry: both start iterating from the beginning but only yield items from the specified break point onward. The standard library's hashmap has no support for range iteration (because that doesn't really make sense on a hashmap), and its iterators are not random access either (for similar reasons).
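A rough sketch of both adapters, using the map from the question (the helper names entries_from_key and entries_after_n are hypothetical). Because HashMap iteration order is arbitrary, which entries come after the break point can differ from run to run:

```rust
use std::collections::HashMap;

// Hypothetical helper: yields entries starting at `start`, in the map's
// arbitrary iteration order. Every earlier entry is still visited and discarded.
fn entries_from_key(map: &HashMap<&'static str, i32>, start: &str) -> Vec<(&'static str, i32)> {
    map.iter()
        .skip_while(|(k, _)| **k != start)
        .map(|(k, v)| (*k, *v))
        .collect()
}

// Hypothetical helper: skips a fixed number of entries instead of looking for a key.
fn entries_after_n(map: &HashMap<&'static str, i32>, n: usize) -> Vec<(&'static str, i32)> {
    map.iter().skip(n).map(|(k, v)| (*k, *v)).collect()
}

fn main() {
    let mut map: HashMap<&str, i32> = HashMap::new();
    map.insert("one", 1);
    map.insert("two", 2);
    map.insert("three", 3);
    map.insert("four", 4);

    // The first element is always ("three", 3); what follows is unspecified.
    println!("{:?}", entries_from_key(&map, "three"));
    // Always two entries, but which two is unspecified.
    println!("{:?}", entries_after_n(&map, 2));
}
```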
You may want to use a BTreeMap, which has sorted keys and a range function which iterates over a range of keys.
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
map.insert(1, "one");
map.insert(2, "two");
map.insert(3, "three");
for (&key, &value) in map.range(2..) {
println!("{key}: {value}");
}
}
// 2: two
// 3: three

rust zip function that copies values rather than referencing them

I would like to zip two vectors together, but what I get when calling the zip function is (&i32, &i32). I would like to get (i32, i32) - copy values from both vectors into a new vector.
let v1 = vec![1,2,3];
let v2 = vec![4,5,6];
// what I want
let zipped : Vec<(i32, i32)> = v1.iter().zip(v2.iter()).collect();
// what I actually get
let zipped : Vec<(&i32, &i32)> = v1.iter().zip(v2.iter()).collect();
Is it possible to force the zip function to copy the values?
zip() doesn't influence the values you're iterating over, it simply creates an iterator over tuples of the first and second iterator's values.
If you want to get owned values, you can use into_iter() on the Vecs. This will consume the vectors, so you can't use them anymore after the call. If you need to keep those vectors around, there's a copied() method that can be called on iterators over types that implement Copy, which is the case for i32. So you can get the same result while keeping the Vecs around by v1.iter().copied().zip(v2.iter().copied()).collect().
You can use cloned:
let zipped : Vec<(i32, i32)> = v1.iter().cloned().zip(v2.iter().cloned()).collect();
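The two approaches from the answers above can be sketched side by side (the function names zip_copied and zip_owned are made up for illustration). For a Copy type such as i32, copied() and cloned() produce the same result:

```rust
// Borrows both inputs and copies each element, so the callers keep their vectors.
fn zip_copied(a: &[i32], b: &[i32]) -> Vec<(i32, i32)> {
    a.iter().copied().zip(b.iter().copied()).collect()
}

// Takes ownership of both vectors; they cannot be used after the call.
fn zip_owned(a: Vec<i32>, b: Vec<i32>) -> Vec<(i32, i32)> {
    a.into_iter().zip(b).collect()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![4, 5, 6];

    let zipped = zip_copied(&v1, &v2);
    println!("{:?}", zipped); // [(1, 4), (2, 5), (3, 6)]

    // v1 and v2 are still usable here, and are consumed by the next call.
    let zipped_owned = zip_owned(v1, v2);
    assert_eq!(zipped, zipped_owned);
}
```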

How to turn two nested HashMaps in a Vec of tuples without for loops?

Take the following data type:
let mut items = HashMap::<u64, HashMap<u64, bool>>::new();
I successfully managed to turn it into a vector of tuples like this:
let mut vector_of_tuples: Vec<(u64, u64, bool)> = vec![];
for (outer_id, inner_hash_map) in items.iter() {
for (inner_id, state) in inner_hash_map.iter() {
vector_of_tuples.push((*outer_id, *inner_id, *state));
}
}
But I want to shrink this code logic, possibly with the help of the Map and Zip functions from the Rust standard library.
How can I achieve the same result without using for loops?
"How can I achieve the same result without using for loops?"
You can use collect() to build a vector from an iterator without an explicit loop:
let vector_of_tuples: Vec<(u64, u64, bool)> = items
.iter()
// ...
.collect();
To expand the contents of the inner hash maps into the iterator, you can use flat_map:
let vector_of_tuples: Vec<_> = items
.iter()
.flat_map(|(&outer_id, inner_hash_map)| {
inner_hash_map
.iter()
.map(move |(&inner_id, &state)| (outer_id, inner_id, state))
})
.collect();
In many cases chaining iterator adapters will yield more understandable code than the equivalent for loop because iterators are written in a declarative style and tend not to require side effects. However, in this particular case the original for loop might actually be the more readable option. YMMV, the best option will sometimes depend on the programming background of you and other project contributors.
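For reference, a self-contained version of the flat_map approach might look like the following. The function name flatten and the sample data are assumptions for illustration, and the final sort is only there to make the output deterministic, since HashMap iteration order is arbitrary:

```rust
use std::collections::HashMap;

// Hypothetical helper: flattens the nested maps into (outer_id, inner_id, state) tuples.
fn flatten(items: &HashMap<u64, HashMap<u64, bool>>) -> Vec<(u64, u64, bool)> {
    let mut out: Vec<(u64, u64, bool)> = items
        .iter()
        .flat_map(|(&outer_id, inner)| {
            // `move` copies outer_id into the inner closure.
            inner
                .iter()
                .map(move |(&inner_id, &state)| (outer_id, inner_id, state))
        })
        .collect();
    out.sort(); // not part of the original answer; gives a stable order
    out
}

fn main() {
    let mut items = HashMap::<u64, HashMap<u64, bool>>::new();
    items.entry(1).or_default().insert(10, true);
    items.entry(1).or_default().insert(11, false);
    items.entry(2).or_default().insert(20, true);

    println!("{:?}", flatten(&items)); // [(1, 10, true), (1, 11, false), (2, 20, true)]
}
```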

How does one create a HashMap with a default value in Rust?

Being fairly new to Rust, I was wondering how to create a HashMap with a default value for a key. For example, having a default value of 0 for any key inserted in the HashMap.
In Rust, I know this creates an empty HashMap:
let mut mymap: HashMap<char, usize> = HashMap::new();
I am looking to maintain a counter for a set of keys, for which one way to go about it seems to be:
for ch in "AABCCDDD".chars() {
mymap.insert(ch, 0);
}
Is there a way to do it in a much better way in Rust, maybe something equivalent to what Ruby provides:
mymap = Hash.new(0)
mymap["b"] = 1
mymap["a"] # 0
Answering the problem you have...
"I am looking to maintain a counter for a set of keys."
Then you want to look at How to lookup from and insert into a HashMap efficiently?. Hint: *map.entry(key).or_insert(0) += 1
Answering the question you asked...
"How does one create a HashMap with a default value in Rust?"
No, HashMaps do not have a place to store a default. Doing so would cause every user of that data structure to allocate space to store it, which would be a waste. You'd also have to handle the case where there is no appropriate default, or when a default cannot be easily created.
Instead, you can look up a value using HashMap::get and provide a default if it's missing using Option::unwrap_or:
use std::collections::HashMap;
fn main() {
let mut map: HashMap<char, usize> = HashMap::new();
map.insert('a', 42);
let a = map.get(&'a').cloned().unwrap_or(0);
let b = map.get(&'b').cloned().unwrap_or(0);
println!("{}, {}", a, b); // 42, 0
}
If unwrap_or doesn't work for your case, there are several similar functions that might:
Option::unwrap_or_else
Option::map_or
Option::map_or_else
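A short sketch of how these alternatives might be used with HashMap::get. The helper names and the expensive_default function are hypothetical stand-ins:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a default that is costly to compute.
fn expensive_default() -> usize {
    7
}

// unwrap_or_else: the closure only runs when the key is missing.
fn lookup_or_computed(map: &HashMap<char, usize>, key: char) -> usize {
    map.get(&key).copied().unwrap_or_else(|| expensive_default())
}

// map_or: transform the value if present, otherwise use a fixed default.
fn doubled_or_zero(map: &HashMap<char, usize>, key: char) -> usize {
    map.get(&key).map_or(0, |v| v * 2)
}

// map_or_else: like map_or, but the default is also computed lazily.
fn describe(map: &HashMap<char, usize>, key: char) -> String {
    map.get(&key)
        .map_or_else(|| String::from("missing"), |v| v.to_string())
}

fn main() {
    let mut map: HashMap<char, usize> = HashMap::new();
    map.insert('a', 42);

    println!("{}", lookup_or_computed(&map, 'b')); // 7
    println!("{}", doubled_or_zero(&map, 'a'));    // 84
    println!("{}", describe(&map, 'b'));           // missing
}
```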
Of course, you are welcome to wrap this in a function or a data structure to provide a nicer API.
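One possible shape for such a wrapper is sketched below. DefaultMap is a made-up type, not something from the standard library, and it only covers insert and get. Unlike the C++ behaviour discussed next, get never inserts anything, so it only needs &self:

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical wrapper that stores a default next to the map.
struct DefaultMap<K, V> {
    inner: HashMap<K, V>,
    default: V,
}

impl<K: Eq + Hash, V> DefaultMap<K, V> {
    fn new(default: V) -> Self {
        Self { inner: HashMap::new(), default }
    }

    fn insert(&mut self, key: K, value: V) -> Option<V> {
        self.inner.insert(key, value)
    }

    // Returns the stored value, or a reference to the default if the key is absent.
    fn get(&self, key: &K) -> &V {
        self.inner.get(key).unwrap_or(&self.default)
    }
}

fn main() {
    // Roughly mirrors the Ruby snippet from the question.
    let mut mymap = DefaultMap::new(0);
    mymap.insert("b", 1);
    println!("{}", mymap.get(&"a")); // 0
    println!("{}", mymap.get(&"b")); // 1
}
```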
ArtemGr brings up an interesting point:
in C++ there's a notion of a map inserting a default value when a key is accessed. That always seemed a bit leaky though: what if the type doesn't have a default? Rust is less demanding on the mapped types and more explicit about the presence (or absence) of a key.
Rust adds an additional wrinkle to this. Actually inserting a value would require that simply getting a value can also change the HashMap. This would invalidate any existing references to values in the HashMap, as a reallocation might be required. Thus you'd no longer be able to get references to two values at the same time! That would be very restrictive.
What about using entry to get an element from the HashMap and then modifying it?
From the docs:
fn entry(&mut self, key: K) -> Entry<K, V>
Gets the given key's corresponding entry in the map for in-place
manipulation.
example
use std::collections::HashMap;
let mut letters = HashMap::new();
for ch in "a short treatise on fungi".chars() {
let counter = letters.entry(ch).or_insert(0);
*counter += 1;
}
assert_eq!(letters[&'s'], 2);
assert_eq!(letters[&'t'], 3);
assert_eq!(letters[&'u'], 1);
assert_eq!(letters.get(&'y'), None);
.or_insert() and .or_insert_with()
Adding to the existing example for .entry().or_insert(), I wanted to mention that if the default value passed to .or_insert() is dynamically generated, it's better to use .or_insert_with().
Using .or_insert_with() as below, the default value is not generated if the key already exists. It only gets created when necessary.
for v in 0..s.len() {
components.entry(unions.get_root(v))
.or_insert_with(|| vec![]) // vec only created if needed.
.push(v);
}
In the snippet below, the default vector passed to .or_insert() is generated on every call. If the key exists, a vector is created and then immediately discarded, which can be wasteful.
components.entry(unions.get_root(v))
.or_insert(vec![]) // vec always created.
.push(v);
So for fixed values that don't have much creation overhead, use .or_insert(), and for values that have appreciable creation overhead, use .or_insert_with().
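To make the difference observable, here is a small sketch that counts how often the default is constructed in each case. The function count_default_constructions and its counters are invented for this illustration. Note that an empty Vec does not allocate, so for vec![] specifically the saving is small; the effect matters more for defaults that are genuinely costly to build.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical helper: returns (eager constructions, lazy constructions)
// after inserting every key into two maps.
fn count_default_constructions(keys: &[u32]) -> (usize, usize) {
    let eager_calls = Cell::new(0);
    let lazy_calls = Cell::new(0);

    let make_eager = || {
        eager_calls.set(eager_calls.get() + 1);
        Vec::<u32>::new()
    };
    let make_lazy = || {
        lazy_calls.set(lazy_calls.get() + 1);
        Vec::<u32>::new()
    };

    let mut eager: HashMap<u32, Vec<u32>> = HashMap::new();
    let mut lazy: HashMap<u32, Vec<u32>> = HashMap::new();

    for &k in keys {
        // The argument is evaluated before or_insert runs, on every iteration.
        eager.entry(k).or_insert(make_eager()).push(k);
        // The closure is only called when the key is not present yet.
        lazy.entry(k).or_insert_with(|| make_lazy()).push(k);
    }

    (eager_calls.get(), lazy_calls.get())
}

fn main() {
    let (eager, lazy) = count_default_constructions(&[1, 1, 2, 1]);
    println!("or_insert: {eager}, or_insert_with: {lazy}"); // or_insert: 4, or_insert_with: 2
}
```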
One way to start a map with initial values is to construct it from a vector of tuples. For instance, consider the code below:
let map = vec![("field1".to_string(), value1), ("field2".to_string(), value2)].into_iter().collect::<HashMap<_, _>>();
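A runnable variant of this, with the integers 1 and 2 standing in as placeholder values for value1 and value2 (the function names are hypothetical). On Rust 1.56 or later, HashMap::from with an array of tuples should give the same result without the intermediate Vec:

```rust
use std::collections::HashMap;

// Builds the map by collecting an iterator of (key, value) tuples.
fn build_from_vec() -> HashMap<String, i32> {
    vec![("field1".to_string(), 1), ("field2".to_string(), 2)]
        .into_iter()
        .collect()
}

// Same content, built directly from an array of tuples.
fn build_from_array() -> HashMap<String, i32> {
    HashMap::from([("field1".to_string(), 1), ("field2".to_string(), 2)])
}

fn main() {
    let map = build_from_vec();
    println!("{}", map["field1"]); // 1
    assert_eq!(map, build_from_array());
}
```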

How to get a slice from an Iterator?

I started to use clippy as a linter. Sometimes, it shows this warning:
writing `&Vec<_>` instead of `&[_]` involves one more reference and cannot be
used with non-Vec-based slices. Consider changing the type to `&[...]`,
#[warn(ptr_arg)] on by default
I changed the parameter to a slice but this adds boilerplate on the call side. For instance, the code was:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
but now it is:
let names = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect::<Vec<_>>();
function(&names);
otherwise, I get the following error:
error: the trait `core::marker::Sized` is not implemented for the type
`[collections::string::String]` [E0277]
So I wonder if there is a way to convert an Iterator to a slice or avoid having to specify the collected type in this specific case.
"So I wonder if there is a way to convert an Iterator to a slice"
There is not.
An iterator only provides one element at a time, whereas a slice is about getting several elements at a time. This is why you first need to collect all the elements yielded by the Iterator into a contiguous array (Vec) before being able to use a slice.
The first obvious answer is not to worry about the slight overhead, though personally I would prefer placing the type hint next to the variable (I find it more readable):
let names: Vec<_> = args.arguments.iter().map(|arg| {
arg.name.clone()
}).collect();
function(&names);
Another option would be for function to take an Iterator instead (and an iterator of references, at that):
let names = args.arguments.iter().map(|arg| &arg.name);
function(names);
After all, iterators are more general, and you can always "realize" the slice inside the function if you need to.
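A sketch of what an iterator-accepting function could look like. The Argument struct, the name join_names, and the joining logic are all assumptions, since the question does not show what function actually does:

```rust
// Hypothetical stand-in for the elements of args.arguments.
struct Argument {
    name: String,
}

// Accepts anything that can be iterated to yield &String,
// so the caller does not have to collect or clone first.
fn join_names<'a, I>(names: I) -> String
where
    I: IntoIterator<Item = &'a String>,
{
    // "Realize" a slice inside the function only because join needs one.
    let realized: Vec<&str> = names.into_iter().map(|s| s.as_str()).collect();
    let slice: &[&str] = &realized;
    slice.join(", ")
}

fn main() {
    let arguments = vec![
        Argument { name: "alpha".to_string() },
        Argument { name: "beta".to_string() },
    ];

    // No clone and no type annotation on the call side.
    let joined = join_names(arguments.iter().map(|arg| &arg.name));
    println!("{joined}"); // alpha, beta

    // A borrowed Vec<String> works too, since &Vec<String> yields &String.
    let owned = vec!["x".to_string(), "y".to_string()];
    println!("{}", join_names(&owned)); // x, y
}
```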
"So I wonder if there is a way to convert an Iterator to a slice"
There is, in applicable cases.
I got here searching for "rust iter to slice"; for my use case, there was a solution:
fn main() {
// example struct
#[derive(Debug)]
struct A(u8);
let list = vec![A(5), A(6), A(7)];
// list_ref passed into a function somewhere ...
let list_ref: &[A] = &list;
let mut iter = list_ref.iter();
// consume some ...
let _a5: Option<&A> = iter.next();
// now want to eg. return a slice of the rest
let slice: &[A] = iter.as_slice();
println!("{:?}", slice); // [A(6), A(7)]
}
That said, .as_slice is defined on an iterator over an existing slice, so the previous answerer was correct: if you've got, e.g., a map iterator, you would need to collect it first (so there is something to slice from).
docs: https://doc.rust-lang.org/std/slice/struct.Iter.html#method.as_slice
