How to get value from HashMap or create new one in Rust - struct

I am wondering what would be an elegant way to either get an existing value from a HashMap or create a new one and assign it to a variable.
I was thinking about something like this ...
struct Test {
name: String,
valid: bool
}
let tests: HashMap<String, Test> = HashMap::new();
let key = String::from("a");
let test = match tests.get(&key) {
Some(t) => t,
None => {
let t = Test {
name: key,
valid: false,
};
&t
}
};
This results in the expected error:
error[E0597]: `t` does not live long enough
--> src/main.rs:33:12
|
33 | &t
| ^^
| |
| borrowed value does not live long enough
| borrow later used here
34 | }
| - `t` dropped here while still borrowed
A kind of workaround is to do this ...
let t;
let test = match tests.get(&key) {
Some(node) => node,
None => {
t = Test {
name: key,
valid: false,
};
&t
}
};
Introducing the extra variable t in advance, just so that a reference to the newly created structure can outlive the match arm when the key is not found in the HashMap, seems a bit clumsy to me.
Is there a better way to do this?
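One possible approach, assuming it is acceptable to make the map mutable and to store the newly created value in it, is the HashMap entry API. The sketch below is only an illustration: the helper name get_or_create is hypothetical, and whether inserting the default fits the real use case is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct Test {
    name: String,
    valid: bool,
}

// Hypothetical helper (not from the question): returns the value stored
// under `key`, inserting a default `Test` first if the key is missing.
fn get_or_create<'a>(tests: &'a mut HashMap<String, Test>, key: &str) -> &'a Test {
    tests.entry(key.to_owned()).or_insert_with(|| Test {
        name: key.to_owned(),
        valid: false,
    })
}

fn main() {
    let mut tests: HashMap<String, Test> = HashMap::new();
    // The key is missing, so a default `Test` is created and stored.
    let test = get_or_create(&mut tests, "a");
    println!("{:?}", test);
}
```

Because the map owns the new value, the returned reference lives as long as the borrow of the map, so no extra holder variable is needed.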

Related

rusqlite: How can I pass a vector with different data types as params to the execute function of a prepared statement

Background
To give a little bit of context: I have to mention that I am fairly new to Rust, so some of my questions might be really basic/silly - apologies for that! I am trying to implement a thin wrapper around rusqlite that caches inserts to the database in a HashMap and then executes the actual inserts in batches to improve insert performance. The concept is taken from this post, in particular the batched version of the code.
To achieve this, I have thought of the following construct:
A struct DataBase, which holds some basic attributes of the database and a HashMap cache which contains the information of the cached tables
A struct CachedTable holding a vector with the field names and a vector of vectors holding the records to insert.
An enum SQLDataType enumerating a "Text" and "Integer" variant (for now)
The code looks like this:
pub enum SQLDataType {
Text(String),
Integer(isize),
}
#[derive(Debug)]
struct CachedTable {
data: RefCell<Vec<Vec<SQLDataType>>>,
fields: Vec<String>,
}
#[derive(Debug)]
pub struct DataBase {
name: String,
conn: Connection,
cache: HashMap<String, CachedTable>,
batch_size: usize,
}
Then I have a function commit_writes which does the following:
Creates a vector of the tables in the cache
Loops through the tables vector and creates a prepared INSERT INTO statement based on the field names and the number of records (in the batched approach, the list of value placeholders must be repeated once per record within the VALUES part of the statement)
Creates the params vector
Calls statement.execute with the prepared params vector
Issue
I have tried a number of versions and got error messages about missing ToSql traits, lifetime and borrowing errors, etc. After a bit of searching I found this stackoverflow question, but the code does not compile anymore, and even if it did, it handles only a single data type (String) in the params vector. What I would like to achieve is to pass a vector of mixed data types (Strings and Integers for now) to the statement.execute function. My current version of commit_writes is as follows - the full code can be found here:
impl DataBase {
pub fn new(db_name: &str) -> Self {
//... snip ...//
}
fn add_to_cache(&mut self, table_name: &str, record: Vec<SQLDataType>) {
//... snip ...//
}
pub fn commit_writes(&mut self) {
// collect all keys to then iterate over the cache
// collecting all keys avoids the "move issue" of iterators
// over a mutable reference to the 'cache' HashMap
let mut tables: Vec<String> = Vec::new();
for key in self.cache.keys() {
tables.push(key.to_owned());
}
// process all cached tables and write to the DB
for table in &tables {
// only process cached tables that do contain data
let no_of_records = self.cache[table].data.borrow().len();
if no_of_records > 0 {
// create the field list
let field_list = self.cache[table].fields.join(", ");
// get the number of elements and create the params part of the SQL
let no_elems = self.cache[table].fields.len();
let params_string = vec!["?"; no_elems].join(", ").repeat(no_of_records);
// create the SQL statement and prepare it
let sql_ins = format!(
"INSERT INTO {} ({}) VALUES ({})",
table, field_list, params_string
);
let stmt = self.conn.prepare_cached(sql_ins.as_str()).unwrap();
// create the param values vector
let mut param_values: Vec<_> = Vec::new();
let mut int_value: isize = 0;
let mut string_value: String = "default".to_string();
for record in self.cache[table].data.borrow().iter() {
for item_value in record.iter() {
match item_value {
SQLDataType::Integer(v) => {
int_value = *v;
param_values.push(int_value as &dyn ToSql);
}
SQLDataType::Text(v) => {
string_value = *v;
param_values.push(string_value as &dyn ToSql);
}
}
}
}
// finally execute the batch of inserts
stmt.execute(&*param_values).unwrap();
// now clear the cached table's data
self.cache[table].data.borrow_mut().clear();
}
}
}
}
and this is the output of cargo check:
Checking utilrs v0.1.0 (C:\LocalData\Rust\utilrs)
error[E0605]: non-primitive cast: `isize` as `&dyn ToSql`
--> src\persistence.rs:104:51
|
104 | ... param_values.push(int_value as &dyn ToSql);
| ^^^^^^^^^^^^^^^^^^^^^^^ invalid cast
|
help: consider borrowing the value
|
104 | param_values.push(&int_value as &dyn ToSql);
| +
error[E0605]: non-primitive cast: `String` as `&dyn ToSql`
--> src\persistence.rs:108:51
|
108 | ... param_values.push(string_value as &dyn ToSql);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ invalid cast
|
help: consider borrowing the value
|
108 | param_values.push(&string_value as &dyn ToSql);
| +
For more information about this error, try `rustc --explain E0605`.
error: could not compile `utilrs` due to 2 previous errors
But then if I add the borrow suggested by the compiler, this leads to the following errors related to the borrow checker:
error[E0506]: cannot assign to `int_value` because it is borrowed
--> src\persistence.rs:103:33
|
103 | ... int_value = *v;
| ^^^^^^^^^^^^^^ assignment to borrowed `int_value` occurs here
104 | ... param_values.push(&int_value as &dyn ToSql);
| -------------------------------------------
| | |
| | borrow of `int_value` occurs here
| borrow later used here
error[E0506]: cannot assign to `string_value` because it is borrowed
--> src\persistence.rs:107:33
|
107 | ... string_value = *v;
| ^^^^^^^^^^^^ assignment to borrowed `string_value` occurs here
108 | ... param_values.push(&string_value as &dyn ToSql);
| ----------------------------------------------
| | |
| | borrow of `string_value` occurs here
| borrow later used here
error[E0507]: cannot move out of `*v` which is behind a shared reference
--> src\persistence.rs:107:48
|
107 | ... string_value = *v;
| ^^ move occurs because `*v` has type `String`, which does not implement the `Copy` trait
error[E0596]: cannot borrow `stmt` as mutable, as it is not declared as mutable
--> src\persistence.rs:115:17
|
93 | let stmt = self.conn.prepare_cached(sql_ins.as_str()).unwrap();
| ---- help: consider changing this to be mutable: `mut stmt`
...
115 | stmt.execute(&*param_values).unwrap();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable
Some errors have detailed explanations: E0506, E0507, E0596.
For more information about an error, try `rustc --explain E0506`.
So I am going around in circles and would really appreciate any help with this!
I would suggest using rusqlite::types::Value values directly instead of the Box<dyn ToSql> approach from my solution in a comment above, which also works. Here's the new version:
let mut param_values: Vec<rusqlite::types::Value> = Vec::new();
for record in self.cache[table].data.iter() {
for item_value in record.iter() {
match item_value {
SQLDataType::Integer(v) => {
param_values.push((*v).into());
}
SQLDataType::Text(v) => {
param_values.push(v.clone().into());
}
}
}
}
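To illustrate why collecting owned values avoids the borrow errors from the question, here is a minimal sketch that does not depend on rusqlite. OwnedValue is a hypothetical stand-in for an owned SQL value type such as rusqlite::types::Value, and flatten_params is an illustrative helper name, not part of any library.

```rust
// Copy of the question's enum, with extra derives for the demo.
#[derive(Debug, Clone, PartialEq)]
enum SQLDataType {
    Text(String),
    Integer(isize),
}

// Hypothetical stand-in for an owned SQL value type.
#[derive(Debug, PartialEq)]
enum OwnedValue {
    Integer(i64),
    Text(String),
}

impl From<&SQLDataType> for OwnedValue {
    fn from(item: &SQLDataType) -> Self {
        match item {
            SQLDataType::Integer(v) => OwnedValue::Integer(*v as i64),
            // Cloning gives the parameter list its own copy of the string,
            // so nothing borrows from a temporary that is reassigned later.
            SQLDataType::Text(v) => OwnedValue::Text(v.clone()),
        }
    }
}

// Flattens all records into one parameter list of owned values.
fn flatten_params(records: &[Vec<SQLDataType>]) -> Vec<OwnedValue> {
    records.iter().flatten().map(OwnedValue::from).collect()
}

fn main() {
    let records = vec![vec![
        SQLDataType::Text("John Doe".to_string()),
        SQLDataType::Integer(35),
    ]];
    println!("{:?}", flatten_params(&records));
}
```

Each element of the resulting vector owns its data, which is the property the shared int_value and string_value temporaries in the question could not provide.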
Other than that, I went ahead and removed unnecessary RefCell uses and cleaned up some small things. The final code:
use rusqlite::Connection;
use std::collections::HashMap;
// See also https://stackoverflow.com/questions/40559931/vector-store-mixed-types-of-data-in-rust
#[derive(Debug)]
pub enum SQLDataType {
Text(String),
Integer(isize),
}
#[derive(Debug)]
struct CachedTable {
data: Vec<Vec<SQLDataType>>,
fields: Vec<String>,
}
#[derive(Debug)]
pub struct DataBase {
name: String,
conn: Connection,
cache: HashMap<String, CachedTable>,
batch_size: usize,
}
impl DataBase {
pub fn new(db_name: &str) -> Self {
let db_conn = Connection::open(db_name).unwrap();
let mut db = DataBase {
name: db_name.to_owned(),
conn: db_conn,
cache: HashMap::new(),
batch_size: 50,
};
db.cache.insert(
String::from("User"),
CachedTable {
data: Vec::new(),
fields: vec![
String::from("Name"),
String::from("Age"),
String::from("Gender"),
],
},
);
db
}
pub fn add_to_cache(&mut self, table_name: &str, record: Vec<SQLDataType>) {
if let Some(cached_table) = self.cache.get_mut(table_name) {
cached_table.data.push(record);
}
}
pub fn commit_writes(&mut self) {
// collect all keys to then iterate over the cache
// collecting all keys avoids the "move issue" of iterators
// over a mutable reference to the 'cache' HashMap
let mut tables: Vec<String> = Vec::new();
for key in self.cache.keys() {
tables.push(key.to_owned());
}
// process all cached tables and write to the DB
for table in &tables {
// only process cached tables that do contain data
let no_of_records = self.cache[table].data.len();
if no_of_records > 0 {
// create the field list
let field_list = self.cache[table].fields.join(", ");
// get the number of elements and create one "(?, ?, ...)" placeholder group per record
let no_elems = self.cache[table].fields.len();
let row_placeholders = format!("({})", vec!["?"; no_elems].join(", "));
let params_string = vec![row_placeholders; no_of_records].join(", ");
// create the SQL statement and prepare it
let sql_ins = format!(
"INSERT INTO {} ({}) VALUES {}",
table, field_list, params_string
);
let mut stmt = self.conn.prepare_cached(sql_ins.as_str()).unwrap();
// create the param values vector
let mut param_values: Vec<rusqlite::types::Value> = Vec::new();
for record in self.cache[table].data.iter() {
for item_value in record.iter() {
match item_value {
SQLDataType::Integer(v) => {
param_values.push((*v).into());
}
SQLDataType::Text(v) => {
param_values.push(v.clone().into());
}
}
}
}
// finally execute the batch of inserts
stmt.execute(rusqlite::params_from_iter(param_values))
.unwrap();
// now clear the cached table's data
self.cache.get_mut(table).unwrap().data.clear();
}
}
}
}
fn main() {
let mut db = DataBase::new("test.db");
let record: Vec<SQLDataType> = vec![
SQLDataType::Text("John Doe".to_string()),
SQLDataType::Integer(35),
SQLDataType::Text("male".to_string()),
];
db.add_to_cache("User", record);
db.commit_writes();
}
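For a batch of more than one record, a multi-row INSERT needs one parenthesized placeholder group per record, separated by commas. A minimal sketch of building such a statement string follows; build_insert_sql is a hypothetical helper name, and the table and field names are taken from the example above.

```rust
// Hypothetical helper: builds a multi-row INSERT statement with one
// "(?, ?, ...)" placeholder group per record.
fn build_insert_sql(table: &str, fields: &[&str], no_of_records: usize) -> String {
    // One group with as many "?" as there are fields, e.g. "(?, ?, ?)".
    let row = format!("({})", vec!["?"; fields.len()].join(", "));
    // Repeat the group once per record, separated by commas.
    let rows = vec![row; no_of_records].join(", ");
    format!(
        "INSERT INTO {} ({}) VALUES {}",
        table,
        fields.join(", "),
        rows
    )
}

fn main() {
    let sql = build_insert_sql("User", &["Name", "Age", "Gender"], 2);
    println!("{}", sql);
}
```

The flattened parameter list passed to execute then has to contain the values in the same row-by-row order as the placeholder groups.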

Get HashMap entry or add it if there isn't one

I want to do something like this:
fn some_fn() {
let mut my_map = HashMap::from([
(1, "1.0".to_string()),
(2, "2.0".to_string()),
]);
let key = 3;
let res = match my_map.get(&key) {
Some(child) => child,
None => {
let value = "3.0".to_string();
my_map.insert(key, value);
&value // HERE IT FAILS
}
};
println!("{}", res);
}
but it fails to compile with these errors:
error[E0597]: `value` does not live long enough
--> src/lib.rs:16:13
|
16 | &value // HERE IT FAILS
| ^^^^^^
| |
| borrowed value does not live long enough
| borrow later used here
17 | }
| - `value` dropped here while still borrowed
error[E0382]: borrow of moved value: `value`
--> src/lib.rs:16:13
|
14 | let value = "3.0".to_string();
| ----- move occurs because `value` has type `String`, which does not implement the `Copy` trait
15 | my_map.insert(key, value);
| ----- value moved here
16 | &value // HERE IT FAILS
| ^^^^^^ value borrowed here after move
How can I fix it elegantly? Making a copy of the string seems suboptimal to me.
The issue here is that you're inserting the value string then trying to return a reference to it, despite it already being moved into the HashMap. What you really want is to insert the value then get a reference to the inserted value, which could look something like this:
let res = match my_map.get(&key) {
Some(child) => child,
None => {
let value = "3.0".to_string();
my_map.insert(key, value);
my_map.get(&key).unwrap() // unwrap is guaranteed to work
}
};
BUT DON'T DO THIS. It's ugly and slow, as it has to look up the key in the map twice. Rust has a convenient Entry type that lets you get an entry of a HashMap then perform operations on that:
// or_insert returns a mutable reference to the existing value, or inserts the given default and returns a reference to that
let res = my_map.entry(key).or_insert("3.0".to_string());
Alternatively, if the operation to generate the default value is expensive, you can use or_insert_with to pass a closure:
// this won't call "3.0".to_string() unless it needs to
let res = my_map.entry(key).or_insert_with(|| "3.0".to_string());
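Put together, a small self-contained sketch of the entry-based version could look like this. The helper name get_or_insert is hypothetical and only there to make the behaviour easy to check.

```rust
use std::collections::HashMap;

// Hypothetical helper: returns the value for `key`, inserting `default`
// (converted to a String) only when the key is missing.
fn get_or_insert<'a>(
    map: &'a mut HashMap<i32, String>,
    key: i32,
    default: &str,
) -> &'a String {
    // A single lookup: the entry remembers where the key is or would go.
    map.entry(key).or_insert_with(|| default.to_string())
}

fn main() {
    let mut my_map = HashMap::from([
        (1, "1.0".to_string()),
        (2, "2.0".to_string()),
    ]);
    // Key 3 is missing, so the closure runs and the value is inserted.
    println!("{}", get_or_insert(&mut my_map, 3, "3.0"));
    // Key 1 exists, so the closure is not called and the map is unchanged.
    println!("{}", get_or_insert(&mut my_map, 1, "unused"));
}
```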

Iterate through a whole file one character at a time

I'm new to Rust and I'm struggling with the concept of lifetimes. I want to make a struct that iterates through a file a character at a time, but I'm running into issues where I need lifetimes. I've tried to add them where I thought they should be but the compiler isn't happy. Here's my code:
struct Advancer<'a> {
line_iter: Lines<BufReader<File>>,
char_iter: Chars<'a>,
current: Option<char>,
peek: Option<char>,
}
impl<'a> Advancer<'a> {
pub fn new(file: BufReader<File>) -> Result<Self, Error> {
let mut line_iter = file.lines();
if let Some(Ok(line)) = line_iter.next() {
let char_iter = line.chars();
let mut advancer = Advancer {
line_iter,
char_iter,
current: None,
peek: None,
};
// Prime the pump. Populate peek so the next call to advance returns the first char
let _ = advancer.next();
Ok(advancer)
} else {
Err(anyhow!("Failed reading an empty file."))
}
}
pub fn next(&mut self) -> Option<char> {
self.current = self.peek;
if let Some(char) = self.char_iter.next() {
self.peek = Some(char);
} else {
if let Some(Ok(line)) = self.line_iter.next() {
self.char_iter = line.chars();
self.peek = Some('\n');
} else {
self.peek = None;
}
}
self.current
}
pub fn current(&self) -> Option<char> {
self.current
}
pub fn peek(&self) -> Option<char> {
self.peek
}
}
fn main() -> Result<(), Error> {
let file = File::open("input_file.txt")?;
let file_buf = BufReader::new(file);
let mut advancer = Advancer::new(file_buf)?;
while let Some(char) = advancer.next() {
print!("{}", char);
}
Ok(())
}
And here's what the compiler is telling me:
error[E0515]: cannot return value referencing local variable `line`
--> src/main.rs:37:13
|
25 | let char_iter = line.chars();
| ---- `line` is borrowed here
...
37 | Ok(advancer)
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0597]: `line` does not live long enough
--> src/main.rs:49:34
|
21 | impl<'a> Advancer<'a> {
| -- lifetime `'a` defined here
...
49 | self.char_iter = line.chars();
| -----------------^^^^--------
| | |
| | borrowed value does not live long enough
| assignment requires that `line` is borrowed for `'a`
50 | self.peek = Some('\n');
51 | } else {
| - `line` dropped here while still borrowed
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0515, E0597.
For more information about an error, try `rustc --explain E0515`.
error: could not compile `advancer`.
Some notes:
The Chars iterator borrows from the String it was created from. So you can't drop the String while the iterator is alive. But that's what happens in your new() method, the line variable owning the String disappears while the iterator referencing it is stored in the struct.
You could also try storing the current line in the struct, then it would live long enough, but that's not an option – a struct cannot hold a reference to itself.
Can you make a char iterator on a String that doesn't store a reference into the String? Yes, probably, for instance by storing the current position in the string as an integer – it shouldn't be the index of the char, because chars can be more than one byte long, so you'd need to deal with the underlying bytes yourself (using e.g. is_char_boundary() to take the next bunch of bytes starting from your current index that form a char).
Is there an easier way? Yes, if performance is not of the highest importance, one solution is to make use of Vec's IntoIterator implementation (which uses unsafe magic to create an object that hands out parts of itself):
let char_iter = file_buf.lines().flat_map(|line_res| {
let line = line_res.unwrap_or(String::new());
line.chars().collect::<Vec<_>>()
});
Note that just returning line.chars() would have the same problem as the first point.
You might think that String should have a similar IntoIterator instance, and I wouldn't disagree.
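As a rough sketch of how the flat_map idea could replace the hand-written struct: the function below collects each line into an owned Vec<char>, and the standard Peekable adapter supplies the peek behaviour. The name chars_of is hypothetical, a Cursor stands in for the file, and treating unreadable lines as empty is an assumption carried over from the snippet above.

```rust
use std::io::{BufRead, Cursor};

// Hypothetical helper: yields every character of the input, line by line.
// Each line is collected into an owned Vec<char>, so nothing borrows from
// a String that is dropped at the end of the closure.
fn chars_of<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader.lines().flat_map(|line_res| {
        // Assumption: a line that fails to read is treated as empty.
        let mut chars: Vec<char> = line_res.unwrap_or_default().chars().collect();
        // `lines()` strips the line terminator, so add one back. This also
        // adds a newline after the last line, even if the input had none.
        chars.push('\n');
        chars
    })
}

fn main() {
    // A Cursor stands in for BufReader<File> in this sketch.
    let input = Cursor::new("ab\ncd\n");
    let mut advancer = chars_of(input).peekable();
    while let Some(current) = advancer.next() {
        let upcoming = advancer.peek().copied();
        println!("{:?} (next: {:?})", current, upcoming);
    }
}
```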

How can I borrow the item in an Option or create a new item when it's None?

When I have an Option and want a reference to what's inside or create something if it's a None I get an error.
Example Code:
fn main() {
let my_opt: Option<String> = None;
let ref_to_thing = match my_opt {
Some(ref t) => t,
None => &"new thing created".to_owned(),
};
println!("{:?}", ref_to_thing);
}
Error:
error[E0597]: borrowed value does not live long enough
--> src/main.rs:6:18
|
6 | None => &"new thing created".to_owned(),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
| | |
| | temporary value dropped here while still borrowed
| temporary value does not live long enough
...
10 | }
| - temporary value needs to live until here
Basically the created value doesn't live long enough. What is the best way to get a reference to the value in a Some or create a value if it's a None and use the reference?
You can also just write:
None => "new thing created"
With this adjustment your initial variant of the code will compile without the need of an extra variable binding.
An alternative could also be:
let ref_to_thing = my_opt.unwrap_or("new thing created".to_string());
The only way I've found is to create a "dummy variable" to hold the created item and give it a lifetime:
fn main() {
let my_opt: Option<String> = None;
let value_holder;
let ref_to_thing = match my_opt {
Some(ref t) => t,
None => {
value_holder = "new thing created".to_owned();
&value_holder
}
};
println!("{:?}", ref_to_thing);
}
If you don't mind mutating your Option in place, you can use Option::get_or_insert_with:
fn main() {
let mut my_opt: Option<String> = None;
let ref_to_thing = my_opt.get_or_insert_with(|| "new thing created".to_owned());
println!("{:?}", ref_to_thing);
}
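If a &str is enough and the fallback is a string literal, another option that neither mutates the Option nor allocates is Option::as_deref followed by unwrap_or. This is a sketch under that assumption; thing_or_default is a hypothetical helper name.

```rust
// Hypothetical helper: borrows the contained string, or falls back to a
// static default. Nothing is allocated and the Option is not modified.
fn thing_or_default(my_opt: &Option<String>) -> &str {
    // as_deref turns &Option<String> into Option<&str>.
    my_opt.as_deref().unwrap_or("new thing created")
}

fn main() {
    let my_opt: Option<String> = None;
    let ref_to_thing = thing_or_default(&my_opt);
    println!("{:?}", ref_to_thing);
}
```

This only works because the default is a 'static string literal; a default that has to be built at runtime still needs an owner, as in the variants above.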

Swapping values between two hashmaps

Edit note: This code now compiles; see "What are non-lexical lifetimes?".
I have two HashMaps and want to swap a value between them under certain conditions. If the key does not exist in the second HashMap, it should be inserted. I do not want to clone the value, since that is too expensive.
The (simplified) critical code that is not working is as follows:
use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::mem;
#[derive(Debug)]
struct ExpensiveStruct {
replace: bool,
// imagine a lot of heap data here
}
fn main() {
let mut hm : HashMap<usize, ExpensiveStruct> = HashMap::new();
let mut hm1 : HashMap<usize, ExpensiveStruct> = HashMap::new();
let dummy = ExpensiveStruct { replace: false };
hm.insert(1, ExpensiveStruct { replace: true});
hm1.insert(1, ExpensiveStruct { replace: true});
match hm1.get_mut(&1) {
Some(ref mut x) =>
match hm.entry(1) {
Entry::Occupied(mut y) => { if y.get().replace {
mem::swap(x, &mut y.get_mut());
}
},
Entry::Vacant(y) => { y.insert(mem::replace(x, dummy)); }
},
None => {}
}
println!("{:?}", hm);
}
(On the Rust Playground)
I get the error:
error[E0597]: `y` does not live long enough
--> src/main.rs:28:9
|
23 | mem::swap(x, &mut y.get_mut());
| - borrow occurs here
...
28 | },
| ^ `y` dropped here while still borrowed
29 | None => {}
30 | }
| - borrowed value needs to live until here
I am really confused about this borrow problem and I do not see a way to fix it. If I replace the Entry by a match hm.get_mut(1), I cannot insert in the None case, since the matching mutably borrows the HashMap.
You're giving references to references where you should be giving references.
&mut y.get_mut()
for instance is
&mut &mut ExpensiveStruct
and you're having a similar issue with
match hm1.get_mut(&1) {
Some(ref mut x) =>
Your code works as expected when the types are trimmed down to &mut ExpensiveStruct:
match hm1.get_mut(&1) {
Some(x) => match hm.entry(1) {
Entry::Occupied(mut y) => if y.get().replace {
mem::swap(x, y.get_mut());
(On the Playground)
Note that the ref mut is removed because hm1.get_mut(&1) already returns an Option containing a mutable reference, and the &mut is removed because y.get_mut() already returns a mutable reference.
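For reference, a complete version of the corrected logic might look like the sketch below. The swap_between function and the payload field are hypothetical additions that make the effect of the swap observable; they are not part of the original code.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::mem;

#[derive(Debug)]
struct ExpensiveStruct {
    replace: bool,
    // Hypothetical field standing in for "a lot of heap data".
    payload: u32,
}

// Hypothetical helper wrapping the corrected match from the answer.
fn swap_between(
    hm: &mut HashMap<usize, ExpensiveStruct>,
    hm1: &mut HashMap<usize, ExpensiveStruct>,
    key: usize,
    dummy: ExpensiveStruct,
) {
    // get_mut already yields `&mut ExpensiveStruct`; no `ref mut` needed.
    if let Some(x) = hm1.get_mut(&key) {
        match hm.entry(key) {
            Entry::Occupied(mut y) => {
                if y.get().replace {
                    // Both arguments are `&mut ExpensiveStruct`.
                    mem::swap(x, y.get_mut());
                }
            }
            Entry::Vacant(y) => {
                // Move the value out of hm1, leaving the dummy behind.
                y.insert(mem::replace(x, dummy));
            }
        }
    }
}

fn main() {
    let mut hm = HashMap::new();
    let mut hm1 = HashMap::new();
    hm.insert(1, ExpensiveStruct { replace: true, payload: 10 });
    hm1.insert(1, ExpensiveStruct { replace: true, payload: 20 });
    let dummy = ExpensiveStruct { replace: false, payload: 0 };
    swap_between(&mut hm, &mut hm1, 1, dummy);
    println!("{:?}", hm);
    println!("{:?}", hm1);
}
```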

Resources