Process value in Hashmap based on another value in same Hashmap - rust

I have a HashMap acting as a lookup table in my code, mapping IDs <-> Data.
I need to lookup some data (let's call it Data A) based on my ID, then read the contents. Based on entries in the content, I would then need to lookup another value in the same lookup table, read those data, and do some calculations, updating my original data A.
Here is a minimal working example:
use std::collections::HashMap;
struct MyData {
id: i32,
result: i32,
complex_data: Vec<i32>
}
impl MyData {
fn new(id: i32) -> Self {
MyData {
id,
result: 0,
complex_data: Vec::new()
}
}
}
fn main() {
let mut lookup_table = HashMap::new();
// init data
lookup_table.insert(1, MyData::new(1));
lookup_table.insert(2, MyData::new(2));
lookup_table.insert(3, MyData::new(3));
lookup_table.insert(4, MyData::new(4));
// process data based on an ID. In this example, hard coded as "1"
if let Some(data) = lookup_table.get_mut(&1) {
// process each entry
for c in data.complex_data.iter() {
// lookup some more values based the entry
if let Some(lookup_data) = lookup_table.get(c) {
//^^^^^^^^^^^^^^^^^^^ - cannot borrow `lookup_table` as immutable
// do some calculation and store result
data.result = lookup_data.result + 42; // random calculation as an example
}
}
}
println!("Hello, world!");
}
The error occurs because I'm borrowing lookup_table twice. From what I understand, the compiler is worried that my second lookup might also look up ID = 1, which would mean holding a mutable reference and an immutable reference to the data for ID = 1 at the same time.
I am fine with this, however, since my second read is immutable, and also this whole thing is single-threaded, so I'm not worried about any race conditions.
How can I restructure my code to make the Rust compiler happy whilst achieving my functionality?

I think you can work around the issue by doing all the reads with an immutable borrow at the first part of the if statement, saving the calculation results into a temporary vector, and doing all the writes with a mutable borrow at the second part. See the code below.
// process data based on an ID. In this example, hard coded as "1"
if let Some(data) = lookup_table.get(&1) {
let mut results = Vec::new();
// process each entry
for c in data.complex_data.iter() {
// lookup some more values based the entry
if let Some(lookup_data) = lookup_table.get(c) {
// do some calculation and store result
results.push(lookup_data.result);
}
}
let data = lookup_table.get_mut(&1).unwrap();
for v in results {
data.result = v + 42;
}
}
The latter assignment to data shadows the previous one and ends the lifetime of the immutable borrow.

You can use the interior mutability pattern on the result field. This lets you take an immutable borrow &MyData in the outer loop and still mutate its result field in the inner loop. The borrow checker doesn't complain because the borrow rules for result are checked at runtime instead of at compile time.
And at runtime, you never have several mutable references at the same time.
use std::{cell::RefCell, collections::HashMap};
struct MyData {
id: i32,
result: RefCell<i32>,
complex_data: Vec<i32>,
}
impl MyData {
fn new(id: i32) -> Self {
MyData {
id,
result: RefCell::new(0),
complex_data: vec![1, 2, 3, 4],
}
}
fn set_result(&self, result: i32) {
*self.result.borrow_mut() = result;
}
fn get_result(&self) -> i32 {
// read the value without resetting it (RefCell::take would leave 0 behind)
*self.result.borrow()
}
}
fn main() {
let mut lookup_table = HashMap::new();
// init data
lookup_table.insert(1, MyData::new(1));
lookup_table.insert(2, MyData::new(2));
lookup_table.insert(3, MyData::new(3));
lookup_table.insert(4, MyData::new(4));
// process data based on an ID. In this example, hard coded as "1"
if let Some(data) = lookup_table.get(&1) {
// process each entry
for c in data.complex_data.iter() {
// lookup some more values based the entry
if let Some(lookup_data) = lookup_table.get(c) {
// do some calculation and store result (random calculation as an example)
data.set_result(lookup_data.get_result() + 42);
}
}
}
}
If you don't want to pay the runtime cost, you can use the interior mutability pattern with Cell instead of RefCell.
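A minimal sketch of the Cell variant, assuming result stays a Copy type such as i32. The helper names process and sample_table are hypothetical and only exist for illustration; the id field is left out for brevity.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Same shape as MyData above, but `result` is a Cell, so it can be
// updated through a shared reference without runtime borrow tracking.
struct MyData {
    result: Cell<i32>,
    complex_data: Vec<i32>,
}

// Hypothetical helper: recompute the result of entry `id` from the
// entries it refers to, and return the new value.
fn process(lookup_table: &HashMap<i32, MyData>, id: i32) -> Option<i32> {
    let data = lookup_table.get(&id)?;
    for c in &data.complex_data {
        if let Some(lookup_data) = lookup_table.get(c) {
            // Cell::get copies the value out; Cell::set overwrites it.
            data.result.set(lookup_data.result.get() + 42);
        }
    }
    Some(data.result.get())
}

// Hypothetical helper that builds a small table for demonstration.
fn sample_table() -> HashMap<i32, MyData> {
    let mut table = HashMap::new();
    table.insert(1, MyData { result: Cell::new(0), complex_data: vec![2, 3] });
    table.insert(2, MyData { result: Cell::new(10), complex_data: vec![] });
    table.insert(3, MyData { result: Cell::new(20), complex_data: vec![] });
    table
}
```

Note that process only needs a shared reference to the map, because every write goes through the Cell.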

Interior mutability is one option, as presented in a previous answer. Collecting the values is another option, as presented in another previous answer. Depending on the nature of your calculation, you might not need any allocation at all: just accumulate the intermediate result in a local variable and assign it at the end.
For example, this compiles:
use std::collections::HashMap;
// MyData is defined as in the question
fn main() {
let mut lookup_table = HashMap::from([
(1, MyData::new(1)),
(2, MyData::new(2)),
(3, MyData::new(3)),
(4, MyData::new(4)),
]);
let data_key = 1;
let mut to_store = None;
if let Some(data) = lookup_table.get(&data_key) {
let mut result = data.result;
for subkey in &data.complex_data {
if let Some(sub_data) = lookup_table.get(subkey) {
result += sub_data.result + 42;
}
}
to_store = Some(result);
}
if let Some(to_store) = to_store {
lookup_table.get_mut(&data_key).unwrap().result = to_store;
}
println!("Hello, world!");
}

Related

Remove found entry from HashMap without cloning the key

I am finding a certain entry in a HashMap (in this case it's the one that's "least used"). I now want to remove that entry from the map. Since the key was obtained from the HashMap itself (and thus is a reference), how can I now use it to remove the entry without cloning the key?
Here is a small, runnable example:
use std::collections::HashMap;
fn main() {
let mut usage = HashMap::<String, usize>::new();
usage.insert("entry one".to_owned(), 5);
usage.insert("entry two".to_owned(), 1);
let mut least_used: Option<(&String, &usize)> = None;
for curr in usage.iter() {
if let Some(prev) = least_used {
if curr.1 < prev.1 {
least_used = Some(curr);
}
} else {
least_used = Some(curr);
}
}
println!("{:?}", least_used);
usage.remove(least_used.unwrap().0);
}
The error I'm getting is:
cannot borrow `usage` as mutable because it is also borrowed as immutable
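One way to sidestep the borrow conflict, sketched under the assumption that a second pass over the map is acceptable: copy out the smallest count (a usize is Copy, so no reference into the map is kept), then let HashMap::retain drop the first entry that has it. The helper name remove_least_used is hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical helper: removes one entry with the smallest count and
// returns that count, or None if the map is empty.
fn remove_least_used(usage: &mut HashMap<String, usize>) -> Option<usize> {
    // Counts are Copy, so no borrow of the map outlives this statement.
    let min = usage.values().copied().min()?;
    let mut removed = false;
    usage.retain(|_key, count| {
        if !removed && *count == min {
            removed = true;
            false // drop this entry
        } else {
            true // keep everything else
        }
    });
    Some(min)
}
```

The key is never cloned, at the cost of visiting every entry a second time. If several entries share the smallest count, which one is removed depends on the map's iteration order.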

Iterating over a vector gives me a different value than what is inside the vector Rust

I have been using Petgraph recently to make simple graphs with Structs for nodes and custom edges, but I have come across a problem which I am unsure if it comes from the library or Rust.
I have a graph with multiple nodes, and each node has a name. I then put the indices of all the nodes (of type NodeIndex) in a vector, since Petgraph doesn't have a function to give all the nodes from a graph. I then want to create a function that, given a string, returns the index of the node whose name matches.
My problem is that somehow the type in the vector containing the nodes seems to change. I store it as NodeIndex yet the types somehow change by themselves to u32 without me changing anything. Since it changes automatically, I can't pass the values inside Petgraph functions since they require NodeIndex as inputs and not u32.
The code following is what I have so far and the problem arises in the function find_node_index_with_name where the types seem to change even though I pass a vector of NodeIndex as input so when I iterate over it, I should also get NodeIndex back.
use petgraph::adj::NodeIndex;
use petgraph::stable_graph::StableGraph;
use petgraph::dot::Dot;
#[derive(Clone,Debug,Default)]
struct ControlBloc
{
name:String,
value:u32,
}
fn create_bloc(name:String,value:u32) -> ControlBloc
{
ControlBloc
{
name,
value,
}
}
fn find_node_index_with_name(gr:StableGraph<ControlBloc,u32> , nodes:Vec<NodeIndex> , name_search:String) -> Option<NodeIndex>
{
for i in 0..nodes.len()
{
if gr.node_weight(nodes[i]).unwrap().name == name_search
{
return nodes[i];
}
}
return None;
}
fn main() {
let mut graph = StableGraph::<ControlBloc,u32>::new();
let m = create_bloc(String::from("Main"),10);
let b1 = create_bloc(String::from("sub1"),20);
let b2 = create_bloc(String::from("sub2"),30);
let main = graph.add_node(m);
let sub1 = graph.add_node(b1);
let sub2 = graph.add_node(b2);
let all_nodes = vec![main,sub1,sub2];
println!("{:?}",find_node_index_with_name(graph, all_nodes, String::from("Main")));
}
I am a bit stumped as to why the types change.
Thank you for any inputs!
graph.add_node() returns a petgraph::graph::NodeIndex.
But you imported petgraph::adj::NodeIndex, which is a different type (in the adj module, NodeIndex is a type alias for the plain index integer, u32 by default), hence the type mismatch.
I took the liberty of changing your code a bit to use references where you used owned values.
use petgraph::graph::NodeIndex; // graph not adj
use petgraph::stable_graph::StableGraph;
#[derive(Clone, Debug, Default)]
struct ControlBloc {
name: String,
value: u32,
}
fn create_bloc(
name: String,
value: u32,
) -> ControlBloc {
ControlBloc { name, value }
}
fn find_node_index_with_name(
gr: &StableGraph<ControlBloc, u32>,
nodes: &[NodeIndex],
name_search: &str,
) -> Option<NodeIndex> {
nodes
.iter()
.copied()
.find(|n| gr.node_weight(*n).unwrap().name == name_search)
/*
for i in 0..nodes.len() {
if gr.node_weight(nodes[i]).unwrap().name == name_search {
return Some(nodes[i]);
}
}
None
*/
}
fn main() {
let mut graph = StableGraph::<ControlBloc, u32>::new();
let m = create_bloc(String::from("Main"), 10);
let b1 = create_bloc(String::from("sub1"), 20);
let b2 = create_bloc(String::from("sub2"), 30);
let main = graph.add_node(m);
let sub1 = graph.add_node(b1);
let sub2 = graph.add_node(b2);
let all_nodes = vec![main, sub1, sub2];
for n in ["Main", "sub1", "sub2"] {
println!("{:?}", find_node_index_with_name(&graph, &all_nodes, n));
}
}
/*
Some(NodeIndex(0))
Some(NodeIndex(1))
Some(NodeIndex(2))
*/

Splitting structure or duplicate calls or a better way in rust?

The following code is used as an example of my problem. A structure named State contains a number of Residents.
Now there is a function that needs to modify both the State property and one of the Resident's properties.
Since it is not possible to hold mutable borrows of the State and of one of the Residents in that State at the same time, the code does not compile.
I can think of two ways to solve it.
One is to give modify_state_and_resident() only one parameter, a mutable reference to State. But then I have to look up the Resident in the hash map again inside modify_state_and_resident(), which is expensive.
Another way is to split the State structure, separating its properties and its residents into separate variables. But this would bring other logical problems. After all, this is a single entity that has to be referenced everywhere as a whole.
I don't know if there is a better way to solve it.
#[derive(Debug)]
struct Resident {
age: i32,
name: String,
viewcnt: i32,
}
use std::collections::HashMap;
#[derive(Debug)]
struct State {
version: i32,
residents: HashMap<String, Resident>,
}
// This function cannot be invoked from modify_state_and_resident
fn show_resident(resident: &Resident) {
println!("{:?}", resident);
}
fn modify_state_and_resident(class: &mut State, resident: &mut Resident) {
// I do not want to call hash get again.
class.version = class.version + 1;
resident.viewcnt = resident.viewcnt + 1;
}
#[test]
fn whole_part_mutable() {
let mut s = State {
version: 1,
residents: HashMap::from([
(
String::from("this is a man who named Aaron"),
Resident{age: 18, name: String::from("Aaron"), viewcnt: 0}
),
])};
// get is expensive, I just want to call it when necessary
let r = s.residents.get_mut("this is a man who named Aaron").unwrap();
// can not call from other function
show_resident(r);
modify_state_and_resident(&mut s, r);
}
You can destructure a &mut State to get independent mutable access to both fields:
fn modify_state_and_resident(version: &mut i32, resident: &mut Resident) {
// I do not want to call hash get again.
*version = *version + 1;
resident.viewcnt = resident.viewcnt + 1;
}
#[test]
fn whole_part_mutable() {
let mut s = State {
version: 1,
residents: HashMap::from([
(
String::from("this is a man who named Aaron"),
Resident{age: 18, name: String::from("Aaron"), viewcnt: 0}
),
])};
let State { version, residents } = &mut s;
// get is expensive, I just want to call it when necessary
let r = residents.get_mut("this is a man who named Aaron").unwrap();
// can not call from other function
show_resident(r);
modify_state_and_resident(version, r);
println!("{:?}", s);
}
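The same idea also works without the destructuring pattern, because the borrow checker tracks the fields of a struct separately. A minimal sketch, where bump is a hypothetical helper that does a single hash lookup and then updates both the version and the resident:

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct Resident {
    age: i32,
    name: String,
    viewcnt: i32,
}

#[derive(Debug)]
struct State {
    version: i32,
    residents: HashMap<String, Resident>,
}

fn modify_state_and_resident(version: &mut i32, resident: &mut Resident) {
    *version += 1;
    resident.viewcnt += 1;
}

// Hypothetical helper: returns the new (version, viewcnt) pair,
// or None when the key is not in the map.
fn bump(state: &mut State, key: &str) -> Option<(i32, i32)> {
    // Borrows only the `residents` field mutably.
    let r = state.residents.get_mut(key)?;
    // `version` is a different field, so this second mutable borrow is allowed.
    modify_state_and_resident(&mut state.version, r);
    Some((state.version, r.viewcnt))
}
```

This only works while the field accesses are visible in one function body. A method taking &mut self borrows the whole struct, which is why passing &mut s alongside r is rejected.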

How to pass &mut str and change the original mut str without a return?

I'm learning Rust from the Book and I was tackling the exercises at the end of chapter 8, but I'm hitting a wall with the one about converting words into Pig Latin. I wanted to see specifically if I could pass a &mut String to a function that takes a &mut str (to also accept slices) and modify the referenced string inside it so the changes are reflected back outside without the need of a return, like in C with a char **.
I'm not quite sure if I'm just messing up the syntax or if it's more complicated than it sounds due to Rust's strict rules, which I have yet to fully grasp. For the lifetime errors inside to_pig_latin() I remember reading something that explained how to properly handle the situation but right now I can't find it, so if you could also point it out for me it would be very appreciated.
Also what do you think of the way I handled the chars and indexing inside strings?
use std::io::{self, Write};
fn main() {
let v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
for s in &v {
// cannot borrow `s` as mutable, as it is not declared as mutable
// cannot borrow data in a `&` reference as mutable
to_pig_latin(&mut s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
io::stdout().flush().unwrap();
}
fn to_pig_latin(mut s: &mut str) {
let first = s.chars().nth(0).unwrap();
let mut pig;
if "aeiouAEIOU".contains(first) {
pig = format!("{}-{}", s, "hay");
s = &mut pig[..]; // `pig` does not live long enough
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
pig = format!("{}-{}{}", word, first.to_lowercase(), "ay");
s = &mut pig[..]; // `pig` does not live long enough
}
}
Edit: here's the fixed code with the suggestions from below.
fn main() {
// added mut
let mut v = vec![
String::from("kaka"),
String::from("Apple"),
String::from("everett"),
String::from("Robin"),
];
// iterate over mutable references to the strings
for s in &mut v {
to_pig_latin(s);
}
for (i, s) in v.iter().enumerate() {
print!("{}", s);
if i < v.len() - 1 {
print!(", ");
}
}
println!();
}
// converted into &mut String
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay");
} else {
// added code to make the new first letter uppercase
let second = s.chars().nth(1).unwrap();
*s = format!(
"{}{}-{}ay",
second.to_uppercase(),
// the slice starts at the third char of the string; the first two chars may differ in byte length
&s[first.len_utf8() + second.len_utf8()..],
first.to_lowercase()
);
}
}
What you're trying to do can't work: with a mutable reference you can update the referent in place, but this is extremely limited here:
- a &mut str can't change its length or anything of that sort
- a &mut str is still just a reference, and the memory has to live somewhere. Here you're creating new Strings inside your function and then trying to use them as the new backing buffers for the reference, which, as the compiler tells you, doesn't work: the String will be deallocated at the end of the function
What you could do is take an &mut String, which lets you modify the owned string itself in place and is much more flexible. In fact, it corresponds exactly to your request: an &mut str corresponds to a char*, a pointer to a place in memory.
A String is also a pointer, so an &mut String is a double pointer to a zone in memory.
So something like this:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
*s = format!("{}-{}", s, "hay");
} else {
let mut word = String::new();
for (i, c) in s.char_indices() {
if i != 0 {
word.push(c);
}
}
*s = format!("{}-{}{}", word, first.to_lowercase(), "ay");
}
}
You can also likely avoid some of the complete string allocations by using somewhat finer methods e.g.
use std::fmt::Write; // needed for write! on a String
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay")
} else {
// remove the first char, keeping the rest of the word
s.replace_range(..first.len_utf8(), "");
write!(s, "-{}ay", first.to_lowercase()).unwrap();
}
}
although the replace_range + write! is not very readable and not super likely to be much of a gain, so that might as well be a format!, something along the lines of:
fn to_pig_latin(s: &mut String) {
let first = s.chars().nth(0).unwrap();
if "aeiouAEIOU".contains(first) {
s.push_str("-hay")
} else {
*s = format!("{}-{}ay", &s[first.len_utf8()..], first.to_lowercase());
}
}
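To illustrate the limits described above, here is a small sketch of what each reference type allows. The function names shout and add_suffix are hypothetical: a &mut str supports length-preserving edits such as make_ascii_uppercase, while anything that grows the text needs the owning String.

```rust
// A &mut str can be edited in place only when the byte length stays the same.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

// Growing the text requires access to the owning String and its buffer.
fn add_suffix(s: &mut String) {
    s.push_str("-hay");
}
```

A &mut String coerces to &mut str, so shout(&mut word) and shout(&mut word[..1]) both work, whereas add_suffix cannot be called with a slice.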

Why does the Rust compiler complain that I use a moved value when I've replaced it with a new value?

I am working on two singly linked lists, named longer and shorter. The length of the longer one is guaranteed to be no less than the shorter one.
I pair the lists element-wise and do something to each pair. If the longer list has more unpaired elements, process the rest of them:
struct List {
next: Option<Box<List>>,
}
fn drain_lists(mut shorter: Option<Box<List>>, mut longer: Option<Box<List>>) {
// Pair the elements in the two lists.
while let (Some(node1), Some(node2)) = (shorter, longer) {
// Actual work elided.
shorter = node1.next;
longer = node2.next;
}
// Process the rest in the longer list.
while let Some(node) = longer {
// Actual work elided.
longer = node.next;
}
}
However, the compiler complains on the second while loop that
error[E0382]: use of moved value
--> src/lib.rs:13:20
|
5 | fn drain_lists(mut shorter: Option<Box<List>>, mut longer: Option<Box<List>>) {
| ---------- move occurs because `longer` has type `std::option::Option<std::boxed::Box<List>>`, which does not implement the `Copy` trait
6 | // Pair the elements in the two lists.
7 | while let (Some(node1), Some(node2)) = (shorter, longer) {
| ------ value moved here
...
13 | while let Some(node) = longer {
| ^^^^ value used here after move
However, I do set a new value for shorter and longer at the end of the loop, so that I will never use a moved value of them.
How should I cater to the compiler?
I think that the problem is caused by the tuple temporary in the first loop. Creating a tuple moves its components into the new tuple, and that happens even when the subsequent pattern matching fails.
First, let me write a simpler version of your code. This compiles fine:
struct Foo(i32);
fn main() {
let mut longer = Foo(0);
while let Foo(x) = longer {
longer = Foo(x + 1);
}
println!("{:?}", longer.0);
}
But if I add a temporary to the while let then I'll trigger a compiler error similar to yours:
fn fwd<T>(t: T) -> T { t }
struct Foo(i32);
fn main() {
let mut longer = Foo(0);
while let Foo(x) = fwd(longer) {
longer = Foo(x + 1);
}
println!("{:?}", longer.0);
// Error: ^ borrow of moved value: `longer`
}
The solution is to add a local variable with the value to be destructured, instead of relying on a temporary. In your code:
struct List {
next: Option<Box<List>>
}
fn drain_lists(shorter: Option<Box<List>>,
longer: Option<Box<List>>) {
// Pair the elements in the two lists.
let mut twolists = (shorter, longer);
while let (Some(node1), Some(node2)) = twolists {
// Actual work elided.
twolists = (node1.next, node2.next);
}
// Process the rest in the longer list.
let (_, mut longer) = twolists;
while let Some(node) = longer {
// Actual work elided.
longer = node.next;
}
}
Other than getting rid of the tuple (shown by others), you can capture a mutable reference to the nodes:
while let (&mut Some(ref mut node1), &mut Some(ref mut node2)) = (&mut shorter, &mut longer) {
shorter = node1.next.take();
longer = node2.next.take();
}
The use of take() enables this to work: shorter = node1.next would complain of moving a field out of a reference, which is not allowed (it would leave the node in an undefined state). But taking it is ok because it leaves None in the next field.
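A self-contained sketch of the same take()-based idea, written with Option::as_mut instead of ref mut patterns. The build helper and the returned counters are hypothetical additions so that the behaviour can be observed; they stand in for the elided work.

```rust
struct List {
    next: Option<Box<List>>,
}

// Hypothetical helper: builds a list with `len` nodes.
fn build(len: usize) -> Option<Box<List>> {
    let mut head = None;
    for _ in 0..len {
        head = Some(Box::new(List { next: head }));
    }
    head
}

// Returns (number of pairs processed, number of leftover nodes in `longer`).
fn drain_lists(
    mut shorter: Option<Box<List>>,
    mut longer: Option<Box<List>>,
) -> (usize, usize) {
    let mut pairs = 0;
    let mut rest = 0;
    // The tuple only holds mutable references, so nothing is moved
    // out of `shorter` or `longer` when the pattern fails to match.
    while let (Some(node1), Some(node2)) = (shorter.as_mut(), longer.as_mut()) {
        pairs += 1;
        // take() moves the tail out and leaves None behind in the node.
        let next1 = node1.next.take();
        let next2 = node2.next.take();
        shorter = next1;
        longer = next2;
    }
    // Process the rest of the longer list.
    while let Some(node) = longer {
        rest += 1;
        longer = node.next;
    }
    (pairs, rest)
}
```

For example, drain_lists(build(2), build(5)) processes two pairs and then the three remaining nodes of the longer list.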
Looks like the destructuring on line 7 moves the value even when the block afterwards is not evaluated. (Edit: as @Sven Marnach pointed out in the comments, a temporary tuple gets created here, which causes the move.)
I've uglified your code to prove that point :)
struct List {
next: Option<Box<List>>
}
fn drain_lists(mut shorter: Option<Box<List>>,
mut longer: Option<Box<List>>) {
// Pair the elements in the two lists.
match(shorter, longer) {
(Some(node1), Some(node2)) => {
shorter = node1.next;
longer = node2.next;
},
(_, _) => return // without this you get the error
}
// Process the rest in the longer list.
while let Some(node) = longer {
// Actual work elided.
longer = node.next;
}
}
When I added the return for the default case, the code compiled.
One solution is to avoid the tuple and consequently the move of longer into the tuple.
fn actual_work(node1: &Box<List>, node2: &Box<List>) {
// Actual work elided
}
fn drain_lists(mut shorter: Option<Box<List>>, mut longer: Option<Box<List>>) {
while let Some(node1) = shorter {
if let Some(node2) = longer.as_ref() {
actual_work(&node1, node2);
}
shorter = node1.next;
longer = longer.and_then(|l| l.next);
}
// Process the rest in the longer list.
while let Some(node) = longer {
// Actual work elided.
longer = node.next;
}
}
