Get amount of elements in a HashMap - rust

I would like to know how to get the amount of elements in a HashMap with rust.
I'm currently using this to check if a HashMap is empty or not, so if there is a more idiomatic way to get that as well, I would love to know both.

std::HashMap has a len method to check for the number of elements, but you can use is_empty method to check if it contains any items.
let mut map = HashMap::new();
assert!(map.is_empty());
assert_eq!(map.len(), 0);
a.insert(1, "a");
assert!(!map.is_empty());
assert_eq!(map.len(), 1);

Related

How to iterate over HashMap starting from given key?

Given a HashMap of n elements how does one start iteration from n-x element.
The order of elements does not matter, the only problem I need to solve is to start iteration from given key.
Example:
let mut map: HashMap<&str, i32> = HashMap::new();
map.insert("one", 1);
map.insert("two", 2);
map.insert("three", 3);
map.insert("four", 4);
[...]
for (k, v) in map {
//how to start iteration from third item and not the first one
}
Tried to google it but no examples found so far.
Tried to google it but no examples found so far.
That's because as Chayim Friedman notes it doesn't really make sense, a hashmap has an essentially random internal order, which means it has an arbitrary iteration order. Iterating from or between keys (/ entries) thus doesn't make much sense.
So it sounds a lot like an XY problem, what is the reason why you're trying to iterate "starting from a given key"?
Though if you really want that, you can just use the skip_while adapter, and skip while you have not found the key you're looking for.
Alternatively, since your post is ambiguous (you talk about both key and position) you can use the skip adapter to skip over a fixed number of items.
Technically neither will start iterating from that entry, they'll both start iterating from 0 but will only yield items following the specified break point. The standard library's hashmap has no support for range iteration (because that doesn't really make any sense on hashmap), and its iterators are not random access either (for similar reason).
You may want to use a BTreeMap, which has sorted keys and a range function which iterates over a range of keys.
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
map.insert(1, "one");
map.insert(2, "two");
map.insert(3, "three");
for (&key, &value) in map.range(2..) {
println!("{key}: {value}");
}
}
// 2: two
// 3: three

How to count HashMap values using a predicate in Rust?

I'm trying this, but doesn't work:
let map = HashMap::new();
map.insert(1, "aaa");
map.insert(2, "bbb");
let a = map.counts_by(|k, v| v.starts_with("a"));
What is the right way?
Anything that iterates over collections in Rust is going to factor through the Iterator API, and unlike in Java where iterators are often implicitly used, it's very common in Rust to explicitly ask for an iterator (with .iter()) and do some work directly on it in a functional style. In your case, there are three things we need to do here.
Get the values of the HashMap. This can be done with the values method, which returns an iterator.
Keep only the ones satisfying a particular predicate. This is a filter operation and will produce another iterator. Note that this does not yet iterate over the hash map; it merely produces another iterator capable of doing so later.
Count the matches, using count.
Putting it all together, we have
map.values().filter(|v| v.starts_with("a")).count()
You should filter an iterator of the HashMap, then count the elements of the iterator:
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "aaa");
map.insert(2, "bbb");
assert_eq!(
map.iter().filter(|(_k, v)| v.starts_with("a")).count(),
1
);
}
Notice that the map also has to be marked as mut in order to insert new elements, and the filter closure destructures into a tuple containing the key and the value, rather than accepting two separate parameters.

What is the proper way of modifying a value of an entry in a HashMap?

I am a beginner in Rust, I haven't finished the "Book" yet, but one thing made me ask this question.
Considering this code:
fn main() {
let mut entries = HashMap::new();
entries.insert("First".to_string(), 10);
entries.entry("Second".to_string()).or_insert(20);
assert_eq!(10, *entries.get("First").unwrap());
entries.entry(String::from("First")).and_modify(|value| { *value = 20});
assert_eq!(20, *entries.get("First").unwrap());
entries.insert("First".to_string(), 30);
assert_eq!(30, *entries.get("First").unwrap());
}
I have used two ways of modifying an entry:
entries.entry(String::from("First")).and_modify(|value| { *value = 20});
entries.insert("First".to_string(), 30);
The insert way looks clunkish, and I woundn't personally use it to modify a value in an entry, but... it works. Nevertheless, is there a reason not to use it other than semantics? As I said, I'd rather use the entry construct than just bruteforcing an update using insert with an existing key. Something a newbie Rustacean like me could not possibly know?
insert() is a bit more idiomatic when you are replacing an entire value, particularly when you don't know (or care) if the value was present to begin with.
get_mut() is more idiomatic when you want to do something to a value that requires mutability, such as replacing only one field of a struct or invoking a method that requires a mutable reference. If you know the key is present you can use .unwrap(), otherwise you can use one of the other Option utilities or match.
entry(...).and_modify(...) by itself is rarely idiomatic; it's more useful when chaining other methods of Entry together, such as where you want to modify a value if it exists, otherwise add a different value. You might see this pattern when working with maps where the values are totals:
entries.entry(key)
.and_modify(|v| *v += 1)
.or_insert(1);

How does one create a HashMap with a default value in Rust?

Being fairly new to Rust, I was wondering on how to create a HashMap with a default value for a key? For example, having a default value 0 for any key inserted in the HashMap.
In Rust, I know this creates an empty HashMap:
let mut mymap: HashMap<char, usize> = HashMap::new();
I am looking to maintain a counter for a set of keys, for which one way to go about it seems to be:
for ch in "AABCCDDD".chars() {
mymap.insert(ch, 0)
}
Is there a way to do it in a much better way in Rust, maybe something equivalent to what Ruby provides:
mymap = Hash.new(0)
mymap["b"] = 1
mymap["a"] # 0
Answering the problem you have...
I am looking to maintain a counter for a set of keys.
Then you want to look at How to lookup from and insert into a HashMap efficiently?. Hint: *map.entry(key).or_insert(0) += 1
Answering the question you asked...
How does one create a HashMap with a default value in Rust?
No, HashMaps do not have a place to store a default. Doing so would cause every user of that data structure to allocate space to store it, which would be a waste. You'd also have to handle the case where there is no appropriate default, or when a default cannot be easily created.
Instead, you can look up a value using HashMap::get and provide a default if it's missing using Option::unwrap_or:
use std::collections::HashMap;
fn main() {
let mut map: HashMap<char, usize> = HashMap::new();
map.insert('a', 42);
let a = map.get(&'a').cloned().unwrap_or(0);
let b = map.get(&'b').cloned().unwrap_or(0);
println!("{}, {}", a, b); // 42, 0
}
If unwrap_or doesn't work for your case, there are several similar functions that might:
Option::unwrap_or_else
Option::map_or
Option::map_or_else
Of course, you are welcome to wrap this in a function or a data structure to provide a nicer API.
ArtemGr brings up an interesting point:
in C++ there's a notion of a map inserting a default value when a key is accessed. That always seemed a bit leaky though: what if the type doesn't have a default? Rust is less demanding on the mapped types and more explicit about the presence (or absence) of a key.
Rust adds an additional wrinkle to this. Actually inserting a value would require that simply getting a value can also change the HashMap. This would invalidate any existing references to values in the HashMap, as a reallocation might be required. Thus you'd no longer be able to get references to two values at the same time! That would be very restrictive.
What about using entry to get an element from the HashMap, and then modify it.
From the docs:
fn entry(&mut self, key: K) -> Entry<K, V>
Gets the given key's corresponding entry in the map for in-place
manipulation.
example
use std::collections::HashMap;
let mut letters = HashMap::new();
for ch in "a short treatise on fungi".chars() {
let counter = letters.entry(ch).or_insert(0);
*counter += 1;
}
assert_eq!(letters[&'s'], 2);
assert_eq!(letters[&'t'], 3);
assert_eq!(letters[&'u'], 1);
assert_eq!(letters.get(&'y'), None);
.or_insert() and .or_insert_with()
Adding to the existing example for .entry().or_insert(), I wanted to mention that if the default value passed to .or_insert() is dynamically generated, it's better to use .or_insert_with().
Using .or_insert_with() as below, the default value is not generated if the key already exists. It only gets created when necessary.
for v in 0..s.len() {
components.entry(unions.get_root(v))
.or_insert_with(|| vec![]) // vec only created if needed.
.push(v);
}
In the snipped below, the default vector passed to .or_insert() is generated on every call. If the key exists, a vector is being created and then disposed of, which can be wasteful.
components.entry(unions.get_root(v))
.or_insert(vec![]) // vec always created.
.push(v);
So for fixed values that don't have much creation overhead, use .or_insert(), and for values that have appreciable creation overhead, use .or_insert_with().
A way to start a map with initial values is to construct the map from a vector of tuples. For instance, considering, the code below:
let map = vec![("field1".to_string(), value1), ("field2".to_string(), value2)].into_iter().collect::<HashMap<_, _>>();

Sort HashMap data by value

I want to sort HashMap data by value in Rust (e.g., when counting character frequency in a string).
The Python equivalent of what I’m trying to do is:
count = {}
for c in text:
count[c] = count.get('c', 0) + 1
sorted_data = sorted(count.items(), key=lambda item: -item[1])
print('Most frequent character in text:', sorted_data[0][0])
My corresponding Rust code looks like this:
// Count the frequency of each letter
let mut count: HashMap<char, u32> = HashMap::new();
for c in text.to_lowercase().chars() {
*count.entry(c).or_insert(0) += 1;
}
// Get a sorted (by field 0 ("count") in reversed order) list of the
// most frequently used characters:
let mut count_vec: Vec<(&char, &u32)> = count.iter().collect();
count_vec.sort_by(|a, b| b.1.cmp(a.1));
println!("Most frequent character in text: {}", count_vec[0].0);
Is this idiomatic Rust? Can I construct the count_vec in a way so that it would consume the HashMaps data and owns it (e.g., using map())? Would this be more idomatic?
Is this idiomatic Rust?
There's nothing particularly unidiomatic, except possibly for the unnecessary full type constraint on count_vec; you could just use
let mut count_vec: Vec<_> = count.iter().collect();
It's not difficult from context to work out what the full type of count_vec is. You could also omit the type constraint for count entirely, but then you'd have to play shenanigans with your integer literals to have the correct value type inferred. That is to say, an explicit annotation is eminently reasonable in this case.
The other borderline change you could make if you feel like it would be to use |a, b| a.1.cmp(b.1).reverse() for the sort closure. The Ordering::reverse method just reverses the result so that less-than becomes greater-than, and vice versa. This makes it slightly more obvious that you meant what you wrote, as opposed to accidentally transposing two letters.
Can I construct the count_vec in a way so that it would consume the HashMaps data and owns it?
Not in any meaningful way. Just because HashMap is using memory doesn't mean that memory is in any way compatible with Vec. You could use count.into_iter() to consume the HashMap and move the elements out (as opposed to iterating over pointers), but since both char and u32 are trivially copyable, this doesn't really gain you anything.
This could be another way to address the matter without the need of an intermediary vector.
// Count the frequency of each letter
let mut count: HashMap<char, u32> = HashMap::new();
for c in text.to_lowercase().chars() {
*count.entry(c).or_insert(0) += 1;
}
let top_char = count.iter().max_by(|a, b| a.1.cmp(&b.1)).unwrap();
println!("Most frequent character in text: {}", top_char.0);
use BTreeMap for sorted data
BTreeMap sorts its elements by key by default, therefore exchanging the place of your key and value and putting them into a BTreeMap
let count_b: BTreeMap<&u32,&char> = count.iter().map(|(k,v)| (v,k)).collect();
should give you a sorted map according to character frequency.
Some character of the same frequency shall be lost though. But if you only want the most frequent character, it does not matter.
You can get the result using
println!("Most frequent character in text: {}", count_b.last_key_value().unwrap().1);

Resources