How to iterate over HashMap starting from given key? - rust

Given a HashMap of n elements, how does one start iteration from the (n-x)th element?
The order of elements does not matter, the only problem I need to solve is to start iteration from given key.
Example:
let mut map: HashMap<&str, i32> = HashMap::new();
map.insert("one", 1);
map.insert("two", 2);
map.insert("three", 3);
map.insert("four", 4);
[...]
for (k, v) in map {
//how to start iteration from third item and not the first one
}
Tried to google it but no examples found so far.

"Tried to google it but no examples found so far."
That's because, as Chayim Friedman notes, it doesn't really make sense: a hashmap has an essentially random internal order, which means it has an arbitrary iteration order. Iterating from or between keys (or entries) is therefore not a meaningful operation.
So it sounds a lot like an XY problem, what is the reason why you're trying to iterate "starting from a given key"?
Though if you really want that, you can just use the skip_while adapter, and skip while you have not found the key you're looking for.
Alternatively, since your post is ambiguous (you talk about both key and position), you can use the skip adapter to skip over a fixed number of items.
Technically neither will start iterating from that entry: they'll both start iterating from 0 but will only yield items following the specified break point. The standard library's hashmap has no support for range iteration (because that doesn't really make any sense on a hashmap), and its iterators are not random access either (for similar reasons).
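As a minimal sketch of the two adapters mentioned above: `entries_from` is a hypothetical helper name, not part of the standard library. It still walks the map from its first bucket and discards entries until the key matches, and which entries follow the key is arbitrary.

```rust
use std::collections::HashMap;

/// Hypothetical helper: collects the entries the iterator yields from `start` onward.
/// The set of entries that follow `start` is arbitrary because HashMap order is unspecified.
fn entries_from<'a>(map: &HashMap<&'a str, i32>, start: &str) -> Vec<(&'a str, i32)> {
    map.iter()
        // Discard entries until the requested key shows up.
        .skip_while(|(k, _)| **k != start)
        .map(|(k, v)| (*k, *v))
        .collect()
}

fn main() {
    let map = HashMap::from([("one", 1), ("two", 2), ("three", 3), ("four", 4)]);

    // Skip by key: the first yielded entry is the requested key.
    for (k, v) in entries_from(&map, "three") {
        println!("from key: {k} = {v}");
    }

    // Skip by position: drop the first two entries, whatever they happen to be.
    for (k, v) in map.iter().skip(2) {
        println!("from position: {k} = {v}");
    }
}
```

If the key is absent, `skip_while` discards everything and the result is empty.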

You may want to use a BTreeMap, which has sorted keys and a range function which iterates over a range of keys.
use std::collections::BTreeMap;
fn main() {
let mut map = BTreeMap::new();
map.insert(1, "one");
map.insert(2, "two");
map.insert(3, "three");
for (&key, &value) in map.range(2..) {
println!("{key}: {value}");
}
}
// 2: two
// 3: three

Related

Re-use already advanced iterator for different function

While iterating over the lines in a file I need to first do "task_A" and then "task_B". The first few lines contain data that I need to put into a data structure (task_A), and the lines after that describe how the data inside that data structure is manipulated (task_B). Right now I use a for-loop with enumerate and if-else statements that switch depending on the line number:
let file = File::open("./example.txt").unwrap();
let reader = BufReader::new(file);
for (i, lines) in reader.lines().map(|l| l.unwrap()).enumerate() {
if i < n {
do_task_a(&lines);
} else {
do_task_b(&lines);
}
}
There is also the take_while()-method for iterators. But this only solves one part. Ideally I would pass the iterator for n steps to one function and after that to another function. I want to have a solution that only needs to iterate over the file one time.
(For anyone wondering: I want a more elegant solution for the 5th day of Advent of Code 2022.) Is there a way to do that? To "re-use" the iterator when it is already advanced n steps?
Looping or using an iterator adapter will consume an iterator. But if I is an iterator then so is &mut I!
You can use that instance to partially iterate through the iterator with one adapter and then continue with another. The first use consumes only the mutable reference, but not the iterator itself. For example using take:
let mut it = reader.lines().map(|l| l.unwrap());
for lines in (&mut it).take(n) {
do_task_a(&lines);
}
for lines in it {
do_task_b(&lines);
}
But I think your original code is still completely fine.
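A self-contained sketch of the same idea, assuming the lines are already available as an iterator of `String`. The name `split_at_line` is hypothetical, and `by_ref()` is simply the method form of `&mut it`.

```rust
/// Hypothetical helper: feeds the first `n` lines to one consumer and the rest to another,
/// walking the underlying iterator only once.
fn split_at_line(mut lines: impl Iterator<Item = String>, n: usize) -> (Vec<String>, Vec<String>) {
    // `by_ref()` borrows the iterator mutably, so `take` consumes the borrow, not the iterator.
    let setup: Vec<String> = lines.by_ref().take(n).collect();
    // The iterator is still usable here and continues where `take` stopped.
    let moves: Vec<String> = lines.collect();
    (setup, moves)
}

fn main() {
    let lines = ["stack 1", "stack 2", "move a", "move b"]
        .iter()
        .map(|s| s.to_string());
    let (setup, moves) = split_at_line(lines, 2);
    println!("task A input: {setup:?}");
    println!("task B input: {moves:?}");
}
```

One caveat when swapping `take` for `take_while`: the first element that fails the predicate is consumed and dropped, so it will not be seen by the second loop.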

How to check two HashMap are identical in Rust?

I have two HashMaps (playground):
let mut m1: HashMap<u8, usize, _> = HashMap::new();
m1.insert(1, 100);
m1.insert(2, 200);
let mut m2: HashMap<u8, usize, _> = HashMap::new();
m2.insert(2, 200);
m2.insert(1, 100);
How can I check if the two maps m1 and m2 are identical?
By "identical", I mean all of the following conditions are satisfied.
Type of keys is same.
Type of values is same.
Two maps have exactly the same key set. Insertion order shall not matter.
Two maps have exactly the same value for every key (i.e. m1.get(k) == m2.get(k) for every existing key k).
As far as I have tested, just m1 == m2 works. However, is this behavior guaranteed? I want some sort of guarantee (thus I added the #language-lawyer tag).
I've already read the official documentation of HashMap.
Also, what about HashSet and Vec? (I've also read their documentation.)
Looking through the source of the std libraries you can find the implementation of PartialEq for those different collections:
HashMap first compares the lengths, then iterates over all key/value pairs, checks whether the other map has a corresponding entry for each key, and checks whether the values are equal (source).
HashSet first compares the lengths, then iterates over the keys and checks whether the other set contains each key (source).
Vec calls eq on the underlying slice, which either iterates over every value and compares them (source) or does a bitwise comparison by calling memcmp if the type allows it (source).
I don't know whether there is any guarantee that this behavior will never change, but since these are stable, widely used APIs, I don't see them ever changing.
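A small sketch of the behavior described above, shown as observed behavior rather than a documented guarantee. `build` is a hypothetical helper that inserts pairs in the order given.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper: builds a map by inserting the pairs in slice order.
fn build(pairs: &[(u8, usize)]) -> HashMap<u8, usize> {
    pairs.iter().copied().collect()
}

fn main() {
    // Same entries, different insertion order: the maps compare equal.
    let m1 = build(&[(1, 100), (2, 200)]);
    let m2 = build(&[(2, 200), (1, 100)]);
    assert!(m1 == m2);

    // A differing value or an extra key makes them unequal.
    assert!(m1 != build(&[(1, 100), (2, 201)]));
    assert!(m1 != build(&[(1, 100), (2, 200), (3, 300)]));

    // HashSet equality also ignores order; Vec equality does not.
    assert_eq!(HashSet::from([1, 2, 3]), HashSet::from([3, 2, 1]));
    assert_ne!(vec![1, 2], vec![2, 1]);
}
```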

What is the proper way of modifying a value of an entry in a HashMap?

I am a beginner in Rust, I haven't finished the "Book" yet, but one thing made me ask this question.
Considering this code:
use std::collections::HashMap;
fn main() {
let mut entries = HashMap::new();
entries.insert("First".to_string(), 10);
entries.entry("Second".to_string()).or_insert(20);
assert_eq!(10, *entries.get("First").unwrap());
entries.entry(String::from("First")).and_modify(|value| { *value = 20});
assert_eq!(20, *entries.get("First").unwrap());
entries.insert("First".to_string(), 30);
assert_eq!(30, *entries.get("First").unwrap());
}
I have used two ways of modifying an entry:
entries.entry(String::from("First")).and_modify(|value| { *value = 20});
entries.insert("First".to_string(), 30);
The insert way looks clunky, and I wouldn't personally use it to modify a value in an entry, but... it works. Nevertheless, is there a reason not to use it other than semantics? As I said, I'd rather use the entry construct than just brute-forcing an update using insert with an existing key. Is there something a newbie Rustacean like me could not possibly know?
insert() is a bit more idiomatic when you are replacing an entire value, particularly when you don't know (or care) if the value was present to begin with.
get_mut() is more idiomatic when you want to do something to a value that requires mutability, such as replacing only one field of a struct or invoking a method that requires a mutable reference. If you know the key is present you can use .unwrap(), otherwise you can use one of the other Option utilities or match.
entry(...).and_modify(...) by itself is rarely idiomatic; it's more useful when chaining other methods of Entry together, such as where you want to modify a value if it exists, otherwise add a different value. You might see this pattern when working with maps where the values are totals:
entries.entry(key)
.and_modify(|v| *v += 1)
.or_insert(1);
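To illustrate the get_mut() case from the list above, here is a minimal sketch. The `Player` struct and the `add_points` function are made up for the example.

```rust
use std::collections::HashMap;

// Hypothetical value type with more than one field.
#[derive(Debug, PartialEq)]
struct Player {
    name: String,
    score: u32,
}

/// Adds points to an existing player; returns false if the id is unknown.
fn add_points(players: &mut HashMap<u32, Player>, id: u32, points: u32) -> bool {
    match players.get_mut(&id) {
        // Only one field changes; the rest of the value stays in place.
        Some(player) => {
            player.score += points;
            true
        }
        None => false,
    }
}

fn main() {
    let mut players = HashMap::new();
    players.insert(7, Player { name: "Ferris".to_string(), score: 10 });

    assert!(add_points(&mut players, 7, 5));
    assert!(!add_points(&mut players, 8, 5));
    println!("{:?}", players[&7]);
}
```

Compared with insert(), this avoids rebuilding the whole value and makes the "key is missing" case explicit.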

How to get the lower bound and upper bound of an element in a BTreeSet?

Reading the BTreeSet documentation, I can't seem to figure out how to get the least value greater than, or greatest value less than an element from a BTreeSet in logarithmic time.
I see there is a range method that can give the values in an arbitrary (min, max) range, but what if I don't know the range and I just want the previous and/or the next element in logarithmic time?
This would be similar to lower_bound and upper_bound in std::set in C++.
but what if I don't know the range
Then use an unbounded range:
use std::collections::BTreeSet;
fn neighbors(tree: &BTreeSet<i32>, val: i32) -> (Option<&i32>, Option<&i32>) {
use std::ops::Bound::*;
let mut before = tree.range((Unbounded, Excluded(val)));
let mut after = tree.range((Excluded(val), Unbounded));
(before.next_back(), after.next())
}
fn main() {
let tree: BTreeSet<_> = [1, 3, 5].iter().cloned().collect();
let (prev, next) = neighbors(&tree, 2);
println!("greatest less than 2: {:?}", prev);
println!("least bigger than 2: {:?}", next);
}
greatest less than 2: Some(1)
least bigger than 2: Some(3)
BTreeSet::range returns a double-ended iterator, so you can pull from either side of it.
Note that we are using explicit Bound values (Excluded) so that we do not include the value we are looking around.
There have been discussions about enhancing BTreeMap / BTreeSet to have a "cursor" API that might allow you to find an element and then "move around" inside the tree. This would allow you to avoid searching through the tree to find the start node twice, but it has not been implemented.
A pull request was opened to do so, but it was closed because it was deemed that there should be more discussion about how such an API should look and work.
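For readers coming from C++, the same range approach can be sketched to mirror `std::set::lower_bound` (first element not less than the value) and `upper_bound` (first element greater than the value). The function names below are hypothetical free functions, not methods of `BTreeSet`.

```rust
use std::collections::BTreeSet;
use std::ops::Bound::{Excluded, Unbounded};

/// First element that is >= `val`, like C++ `lower_bound`.
fn lower_bound(tree: &BTreeSet<i32>, val: i32) -> Option<&i32> {
    tree.range(val..).next()
}

/// First element that is > `val`, like C++ `upper_bound`.
fn upper_bound(tree: &BTreeSet<i32>, val: i32) -> Option<&i32> {
    tree.range((Excluded(val), Unbounded)).next()
}

fn main() {
    let tree = BTreeSet::from([1, 3, 5]);
    println!("lower_bound(3): {:?}", lower_bound(&tree, 3));
    println!("upper_bound(3): {:?}", upper_bound(&tree, 3));
    println!("upper_bound(5): {:?}", upper_bound(&tree, 5));
}
```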
Well... if you don't mind modifying the current collection and taking a performance hit... it appears that you can use split_off creatively.
let mut tree = BTreeSet::new();
tree.insert(1);
tree.insert(3);
tree.insert(5);
let other = tree.split_off(&2);
println!("{:?}", tree);
println!("{:?}", other);
This prints {1} and {3, 5}:
the lower bound is the first element of the second set,
the upper bound is the first element of the second set if it is not equal to the searched value, and the second element otherwise.
Once you are done, you can reassemble the tree using tree.append(other).
And yes, it's really less than ideal...
If you can change your data structure, you can use intrusive collections.
You have the desired methods:
RBTree::lower_bound
RBTree::upper_bound

How does one create a HashMap with a default value in Rust?

Being fairly new to Rust, I was wondering on how to create a HashMap with a default value for a key? For example, having a default value 0 for any key inserted in the HashMap.
In Rust, I know this creates an empty HashMap:
let mut mymap: HashMap<char, usize> = HashMap::new();
I am looking to maintain a counter for a set of keys, for which one way to go about it seems to be:
for ch in "AABCCDDD".chars() {
mymap.insert(ch, 0);
}
Is there a way to do it in a much better way in Rust, maybe something equivalent to what Ruby provides:
mymap = Hash.new(0)
mymap["b"] = 1
mymap["a"] # 0
Answering the problem you have...
I am looking to maintain a counter for a set of keys.
Then you want to look at "How to lookup from and insert into a HashMap efficiently?". Hint: *map.entry(key).or_insert(0) += 1
Answering the question you asked...
How does one create a HashMap with a default value in Rust?
No, HashMaps do not have a place to store a default. Doing so would cause every user of that data structure to allocate space to store it, which would be a waste. You'd also have to handle the case where there is no appropriate default, or when a default cannot be easily created.
Instead, you can look up a value using HashMap::get and provide a default if it's missing using Option::unwrap_or:
use std::collections::HashMap;
fn main() {
let mut map: HashMap<char, usize> = HashMap::new();
map.insert('a', 42);
let a = map.get(&'a').cloned().unwrap_or(0);
let b = map.get(&'b').cloned().unwrap_or(0);
println!("{}, {}", a, b); // 42, 0
}
If unwrap_or doesn't work for your case, there are several similar functions that might:
Option::unwrap_or_else
Option::map_or
Option::map_or_else
Of course, you are welcome to wrap this in a function or a data structure to provide a nicer API.
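One possible shape for such a wrapper, as a rough sketch only: `DefaultMap` is a made-up type that stores the fallback next to the map and returns a reference to it on a miss, roughly like Ruby's `Hash.new(0)` for reads.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical wrapper: a HashMap plus a fallback value returned for missing keys.
struct DefaultMap<K, V> {
    inner: HashMap<K, V>,
    default: V,
}

impl<K: Eq + Hash, V> DefaultMap<K, V> {
    fn new(default: V) -> Self {
        DefaultMap { inner: HashMap::new(), default }
    }

    fn insert(&mut self, key: K, value: V) -> Option<V> {
        self.inner.insert(key, value)
    }

    /// Looks up `key` without inserting anything; falls back to the stored default.
    fn get(&self, key: &K) -> &V {
        self.inner.get(key).unwrap_or(&self.default)
    }
}

fn main() {
    let mut map = DefaultMap::new(0usize);
    map.insert('b', 1);
    println!("{}, {}", map.get(&'b'), map.get(&'a'));
}
```

Because `get` takes `&self` and never inserts, a lookup cannot invalidate other references into the map.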
ArtemGr brings up an interesting point:
in C++ there's a notion of a map inserting a default value when a key is accessed. That always seemed a bit leaky though: what if the type doesn't have a default? Rust is less demanding on the mapped types and more explicit about the presence (or absence) of a key.
Rust adds an additional wrinkle to this. Actually inserting a value would require that simply getting a value can also change the HashMap. This would invalidate any existing references to values in the HashMap, as a reallocation might be required. Thus you'd no longer be able to get references to two values at the same time! That would be very restrictive.
What about using entry to get an element from the HashMap and then modifying it?
From the docs:
fn entry(&mut self, key: K) -> Entry<K, V>
Gets the given key's corresponding entry in the map for in-place
manipulation.
Example:
use std::collections::HashMap;
let mut letters = HashMap::new();
for ch in "a short treatise on fungi".chars() {
let counter = letters.entry(ch).or_insert(0);
*counter += 1;
}
assert_eq!(letters[&'s'], 2);
assert_eq!(letters[&'t'], 3);
assert_eq!(letters[&'u'], 1);
assert_eq!(letters.get(&'y'), None);
.or_insert() and .or_insert_with()
Adding to the existing example for .entry().or_insert(), I wanted to mention that if the default value passed to .or_insert() is dynamically generated, it's better to use .or_insert_with().
Using .or_insert_with() as below, the default value is not generated if the key already exists. It only gets created when necessary.
for v in 0..s.len() {
components.entry(unions.get_root(v))
.or_insert_with(|| vec![]) // vec only created if needed.
.push(v);
}
In the snippet below, the default vector passed to .or_insert() is generated on every call. If the key exists, a vector is created and then disposed of, which can be wasteful.
components.entry(unions.get_root(v))
.or_insert(vec![]) // vec always created.
.push(v);
So for fixed values that don't have much creation overhead, use .or_insert(), and for values that have appreciable creation overhead, use .or_insert_with().
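The snippets above depend on `unions` and `components`, which are not shown. A self-contained sketch of the same laziness, using a made-up `group_by_len` function and a counter to show how often the closure actually runs:

```rust
use std::collections::HashMap;

/// Hypothetical example: groups words by length and counts how many default
/// vectors the `or_insert_with` closure actually created.
fn group_by_len(words: &[&str]) -> (HashMap<usize, Vec<String>>, usize) {
    let mut groups: HashMap<usize, Vec<String>> = HashMap::new();
    let mut created = 0;
    for word in words {
        groups
            .entry(word.len())
            // The closure runs only when the key is missing.
            .or_insert_with(|| {
                created += 1;
                Vec::new()
            })
            .push(word.to_string());
    }
    (groups, created)
}

fn main() {
    let (groups, created) = group_by_len(&["a", "b", "cc"]);
    println!("{} groups, {} vectors created", groups.len(), created);
}
```

When the value type implements Default, `.or_default()` is a shorter way to get the same lazy behavior.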
One way to start a map with initial values is to construct it from a vector of tuples. For instance, consider the code below:
let map = vec![("field1".to_string(), value1), ("field2".to_string(), value2)].into_iter().collect::<HashMap<_, _>>();
