Deserialize tuple value life time issue - rust

When I try to deserialise a value inside a tuple, it won't allow me to push the result into an array.
fn main() {
let raw_value = vec![(1, ("a", "a", 1))];
//serialise
let mut serialized_value = vec![];
for val in raw_value.iter() {
serialized_value.push((
bincode::serialize(&val.0).unwrap(),
bincode::serialize(&val.1).unwrap(),
));
}
//deserialise
let mut deserialized_value: Vec<(&str, &str, i32)> = vec![];
for val in serialized_value {
deserialized_value.push(bincode::deserialize(&val.1).unwrap());
}
}
It issues the following error:
error[E0597]: `val.1` does not live long enough
--> src/main.rs:18:54
|
18 | deserialized_value.push(bincode::deserialize(&val.1).unwrap());
| ---------------------------------------------^^^^^^-----------
| | |
| | borrowed value does not live long enough
| borrow later used here
19 | }
| - `val.1` dropped here while still borrowed
But this works fine if I refactor the value to not be inside a tuple.
Any ideas why?

You are iterating destructively through serialized_value.
Because of that, after the given loop iteration ends, the val is dropped and after the whole loop the serialized_value is fully consumed.
If you iterate through &serialized_value instead, then the val will be reference to the element in the container and serialized_value will be kept alive on the stack, because it's being referenced.

Adding to #miszcz2137's answer, here is the working version:
fn main() {
let raw_value = vec![(1, ("a", "a", 1))];
//serialise
let mut serialized_value = vec![];
for val in raw_value.iter() {
serialized_value.push((
bincode::serialize(&val.0).unwrap(),
bincode::serialize(&val.1).unwrap(),
));
}
//deserialise
let mut deserialized_value: Vec<(&str, &str, i32)> = vec![];
for val in &serialized_value {
deserialized_value.push(bincode::deserialize(&val.1).unwrap());
}
println!("{:?}", deserialized_value);
}
[("a", "a", 1)]
That said, it's not quite idiomatic to fill a vector through a for loop. It's much better practice to use iterators and .collect():
fn main() {
let raw_value = vec![(1, ("a", "a", 1))];
//serialise
let serialized_value: Vec<_> = raw_value
.iter()
.map(|val| {
(
bincode::serialize(&val.0).unwrap(),
bincode::serialize(&val.1).unwrap(),
)
})
.collect();
//deserialise
let deserialized_value: Vec<(&str, &str, i32)> = serialized_value
.iter()
.map(|val| bincode::deserialize(&val.1).unwrap())
.collect();
println!("{:?}", deserialized_value);
}
[("a", "a", 1)]

Related

return statement causes Rust to seemingly have a mutable and shared reference at the same time

I have the following code :
fn main() {
let mut v = vec![1, 2, 3];
for _ in v.iter() {
v[0] = 0;
// return;
}
}
It does not compile, and I see why, since I'm attempting to have a shared (v.iter()) and mutable (v[0] = 0) reference of v at the same time.
However, uncommenting the return statement makes this code compile, and I'm not sure why. I guess that the compiler somehow manages to know that the shared reference to v can be dropped before the mutable borrow occurs, but how exactly does it knows that ?
In this exemple, what would be the lifetimes of the two borrows of v ?
A for loop is just syntactic sugar for a loop with a manual iterator:
for x in c {
//...
}
is equivalent to:
let mut __iterator = c.into_iter();
while let Some(x) = __iterator.next() {
//...
}
So for your original code that would be:
fn main() {
let mut v = vec![1, 2, 3];
let mut __iterator = v.iter().into_iter();
while let Some(_) = __iterator.next() {
v[0] = 0;
// return;
}
}
(The call to into_iter() is redundant here, because using that with an iterator just returns itself.)
This code fails just like yours, as expected.
But if you uncomment the return, then the compiler knows that the loop never repeats, so it is transformed as if:
fn main() {
let mut v = vec![1, 2, 3];
let mut __iterator = v.iter().into_iter();
if let Some(_) = __iterator.next() {
v[0] = 0;
}
}
Now, non-lexical lifetimes (NLL) kick in, and sees that iterator is not used beyond the call to next() so it can be dropped just there. The _ does not count because it is not a real bind, but even if you wrote _x instead, NLL would drop it immediately as it is not used further. Thus v is no longer borrowed and the assignment to v[0] is perfectly safe and legal.
If you keep the original borrow of v alive it will fail again:
fn main() {
let mut v = vec![1, 2, 3];
let mut __iterator = v.iter().into_iter();
if let Some(x) = __iterator.next() {
v[0] = 0; // error! v is still borrowed
dbg!(x);
}
}
This fails with a nice explanation:
error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
--> src/main.rs:5:9
|
3 | let mut __iterator = v.iter().into_iter();
| -------- immutable borrow occurs here
4 | if let Some(x) = __iterator.next() {
5 | v[0] = 0;
| ^ mutable borrow occurs here
6 | dbg!(x);
| - immutable borrow later used here

How to use common BTreeMap variable in rust(single thread)

Here is my original simplified code, I want to use a global variable instead of the variables in separate functions. What's the suggestion method in rust?
BTW, I've tried to use global or change to function parameter, both are nightmare for a beginner. Too difficult to solve the lifetime & variable type cast issue.
This simple program is only a single thread tool, so, in C language, it is not necessary the extra mutex.
// version 1
use std::collections::BTreeMap;
// Trying but failed
// let mut guess_number = BTreeMap::new();
// | ^^^ expected item
fn read_csv() {
let mut guess_number = BTreeMap::new();
let lines = ["Tom,4", "John,6"];
for line in lines.iter() {
let split = line.split(",");
let vec: Vec<_> = split.collect();
println!("{} {:?}", line, vec);
let number: u16 = vec[1].trim().parse().unwrap();
guess_number.insert(vec[0], number);
}
for (k, v) in guess_number {
println!("{} {:?}", k, v);
}
}
fn main() {
let mut guess_number = BTreeMap::new();
guess_number.insert("Tom", 3);
guess_number.insert("John", 7);
if guess_number.contains_key("John") {
println!("John's number={:?}", guess_number.get("John").unwrap());
}
read_csv();
}
To explain how hard it is for a beginner, by pass parameter
// version 2
use std::collections::BTreeMap;
fn read_csv(guess_number: BTreeMap) {
// ^^^^^^^^ expected 2 generic arguments
let lines = ["Tom,4", "John,6"];
for line in lines.iter() {
let split = line.split(",");
let vec: Vec<_> = split.collect();
println!("{} {:?}", line, vec);
let number: u16 = vec[1].trim().parse().unwrap();
guess_number.insert(vec[0], number);
}
}
fn main() {
let mut guess_number = BTreeMap::new();
guess_number.insert("Tom", 3);
guess_number.insert("John", 7);
if guess_number.contains_key("John") {
println!("John's number={:?}", guess_number.get("John").unwrap());
}
read_csv(guess_number);
for (k, v) in guess_number {
println!("{} {:?}", k, v);
}
}
After some effort, try & error to get the possible work type BTreeMap<&str, i32>
// version 3
use std::collections::BTreeMap;
fn read_csv(guess_number: &BTreeMap<&str, i32>) {
// let mut guess_number = BTreeMap::new();
let lines = ["Tom,4", "John,6"];
for line in lines.iter() {
let split = line.split(",");
let vec: Vec<_> = split.collect();
println!("{} {:?}", line, vec);
let number: i32 = vec[1].trim().parse().unwrap();
guess_number.insert(vec[0], number);
}
for (k, v) in guess_number {
println!("{} {:?}", k, v);
}
}
fn main() {
let mut guess_number: BTreeMap<&str, i32> = BTreeMap::new();
guess_number.insert("Tom", 3);
guess_number.insert("John", 7);
if guess_number.contains_key("John") {
println!("John's number={:?}", guess_number.get("John").unwrap());
}
read_csv(&guess_number);
for (k, v) in guess_number {
println!("{} {:?}", k, v);
}
}
will cause following error
7 | fn read_csv(guess_number: &BTreeMap<&str, i32>) {
| -------------------- help: consider changing this to be a mutable reference: `&mut BTreeMap<&str, i32>`
...
16 | guess_number.insert(vec[0], number);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `guess_number` is a `&` reference, so the data it refers to cannot be borrowed as mutable
The final answer (seems not suggest use global in Rust, so use 'mutable reference').
// version 4
use std::collections::BTreeMap;
fn read_csv(guess_number: &mut BTreeMap<&str, i32>) {
let lines = ["Tom,4", "John,6"];
for line in lines.iter() {
let split = line.split(",");
let vec: Vec<_> = split.collect();
println!("{} {:?}", line, vec);
let number: i32 = vec[1].trim().parse().unwrap();
guess_number.insert(vec[0], number);
}
}
fn main() {
let mut guess_number: BTreeMap<&str, i32> = BTreeMap::new();
guess_number.insert("Tom", 3);
guess_number.insert("John", 7);
if guess_number.contains_key("John") {
println!("John's number={:?}", guess_number.get("John").unwrap());
}
read_csv(&mut guess_number);
for (k, v) in guess_number {
println!("{} {:?}", k, v);
}
}
This question is not specific to BTreeMaps but for pretty much all data types, such as numbers, strings, vectors, enums, etc.
If you want to pass a variable (value) from one function to another, you can do that in various ways in Rust. Typically you either move the value or you pass a reference to it. Moving is something quite specific to Rust and its ownership model. This is really essential, so if you have serious intentions to learn Rust, I strongly suggest you read the chapter Understanding Ownership from "the book". Don't get discouraged if you don't understand it from one reading. Spend as much time as needed, as you really can't move forward w/o this knowledge.
As for global variables, there are very few situations where they should be used. In Rust using global variables is slightly more difficult, compared to most other languages. This thread is quite useful, although you might find it a bit difficult to comprehend. My advice to a beginner would be to first fully understand the basic concept of moving and passing references.

Cannot borrow value from a hashmap as mutable because it is also borrowed as immutable

I want to get two values from a hashmap at the same time, but I can't escape the following error, I have simplified the code as follows, can anyone help me to fix this error.
#[warn(unused_variables)]
use hashbrown::HashMap;
fn do_cal(a: &[usize], b: &[usize]) -> usize {
a.iter().sum::<usize>() + b.iter().sum::<usize>()
}
fn do_check(i: usize, j:usize) -> bool {
i/2 < j - 10
}
fn do_expensive_cal(i: usize) -> Vec<usize> {
vec![i,i,i]
}
fn main() {
let size = 1000000;
let mut hash: HashMap<usize, Vec<usize>> = HashMap::new();
for i in 0..size{
if i > 0 {
hash.remove(&(i - 1));
}
if !hash.contains_key(&i){
hash.insert(i, do_expensive_cal(i));
}
let data1 = hash.get(&i).unwrap();
for j in i + 1..size {
if do_check(i, j) {
break
}
if !hash.contains_key(&j){
hash.insert(j, do_expensive_cal(j));
}
let data2 = hash.get(&j).unwrap();
let res = do_cal(data1, data2);
println!("res:{}", res);
}
}
}
Playground
error[E0502]: cannot borrow hash as mutable because it is also borrowed as immutable
--> src/main.rs:26:8
|
19 | let data1 = hash.get(&i).unwrap();
| ------------ immutable borrow occurs here
...
26 | hash.insert(j, vec![1,2,3]);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
...
29 | let res = do_cal(data1, data2);
| ----- immutable borrow later used here
For more information about this error, try rustc --explain E0502.
error: could not compile playground due to previous error
Consider this: the borrow checker doesn't know that hash.insert(j, …) will leave the data you inserted with hash.insert(i, …) alone. For the borrow checker, hash.insert(…) may do anything to any element in hash, including rewriting or removing it. So you can't be allowed to hold the reference data1 over hash.insert(j, …).
How to get over that? The easiest is probably to move let data1 = hash.get(…) so it doesn't have to live for so long:
let data1 = hash.get(&i).unwrap();
let data2 = hash.get(&j).unwrap();
let res = do_cal(data1, data2);
This will of course look up data1 every loop iteration (and it must, since hash.insert(j, …) may have resized and thus realocated the content of the hashmap, giving data1 a new storage location in the hashmap). For completeness's sake, there are ways to get around that, but I don't recommend you do any of this:
Clone: let data1 = hash.get(&i).unwrap().clone() (if your vecs are short, this may actually be reasonable…)
As a way of making the cloning cheap, you could use a HashMap<usize, Rc<Vec<usize>>> instead (where you only need to clone the Rc, no the entire Vec)
If you ever need mutable references to both arguments of do_call, you could combine the Rc with a RefCell: Rc<RefCell<Vec<…>>>
If you need to overengineer it even more, you could replace the Rcs with references obtained from allocating in a bump allocator, e.g. bumpalo.
Since the keys to the hash table are the integers 0..100 you can use a Vec to perform these steps, temporarily splitting the Vec into 2 slices to allow the mutation on one side. If you need a HashMap for later computations, you can then create a HashMap from the Vec.
The following code compiles but panics because the j - 10 calculation underflows:
fn do_cal(a: &[usize], b: &[usize]) -> usize {
a.iter().sum::<usize>() + b.iter().sum::<usize>()
}
fn do_check(i: usize, j:usize) -> bool {
i/2 < j - 10
}
fn main() {
let size = 100;
let mut v: Vec<Option<Vec<usize>>> = vec![None; size];
for i in 0..size {
let (v1, v2) = v.split_at_mut(i + 1);
if v1[i].is_none() {
v1[i] = Some(vec![1,2,3]);
}
let data1 = v1[i].as_ref().unwrap();
for (j, item) in (i + 1..).zip(v2.iter_mut()) {
if do_check(i, j) {
break
}
if item.is_none() {
*item = Some(vec![1,2,3]);
}
let data2 = item.as_ref().unwrap();
let res = do_cal(data1, data2);
println!("res:{}", res);
}
}
}

Mutate a field of a Vec's element while looping over a field of another element in the same Vec

I'm writing a Trie data structure in Rust to implement the Aho-Corasick algorithm. The TrieNode struct respresents a node and looks like this:
use std::collections::{HashMap, HashSet, VecDeque};
struct TrieNode {
next_ids: HashMap<char, usize>,
kw_indices: HashSet<usize>,
fail_id: Option<usize>,
}
I use the same strategy as a generational arena to implement the trie, where all the nodes are stored in one Vec and reference each other using their indices. When building the automaton after creating the all the nodes, I'm trying to get the following code work without using the clone() method:
fn build_ac_automaton(nodes: &mut Vec<TrieNode>) {
let mut q = VecDeque::new();
for &i in nodes[0].next_ids.values() {
q.push_back(i);
nodes[i].fail_id = Some(0);
}
// ...
}
but the borrow checker is not very happy about it:
error[E0502]: cannot borrow `*nodes` as mutable because it is also borrowed as immutable
|
| for &i in nodes[0].next_ids.values() {
| ----- - immutable borrow ends here
| |
| immutable borrow occurs here
| q.push_back(i);
| nodes[i].fail_id = Some(0);
| ^^^^^ mutable borrow occurs here
What is another way (if any) to achieve the above without using the costly clone() method?
Split the slice:
fn build_ac_automaton(nodes: &mut Vec<TrieNode>) {
let mut q = VecDeque::new();
let (first, rest) = nodes.split_first_mut();
for &i in first.next_ids.values() {
q.push_back(i);
if i == 0 {
first.fail_id = Some(0);
} else {
rest[i-1].fail_id = Some(0);
}
}
...
}
However it might be less costly to simply clone only the next_ids:
fn build_ac_automaton(nodes: &mut Vec<TrieNode>) {
let mut q = VecDeque::new();
let ids: Vec<_> = nodes[0].next_ids.values().cloned().collect();
for &i in ids {
q.push_back(i);
nodes[i].fail_id = Some(0);
}
...
}

Borrow checker issue in `zip`-like function with a callback

I'm trying to implement a function that steps though two iterators at the same time, calling a function for each pair. This callback can control which of the iterators is advanced in each step by returning a (bool, bool) tuple. Since the iterators take a reference to a buffer in my use case, they can't implement the Iterator trait from the stdlib, but instead are used though a next_ref function, which is identical to Iterator::next, but takes an additional lifetime parameter.
// An iterator-like type, that returns references to itself
// in next_ref
struct RefIter {
value: u64
}
impl RefIter {
fn next_ref<'a>(&'a mut self) -> Option<&'a u64> {
self.value += 1;
Some(&self.value)
}
}
// Iterate over two RefIter simultaneously and call a callback
// for each pair. The callback returns a tuple of bools
// that indicate which iterators should be advanced.
fn each_zipped<F>(mut iter1: RefIter, mut iter2: RefIter, callback: F)
where F: Fn(&Option<&u64>, &Option<&u64>) -> (bool, bool)
{
let mut current1 = iter1.next_ref();
let mut current2 = iter2.next_ref();
loop {
let advance_flags = callback(&current1, &current2);
match advance_flags {
(true, true) => {
current1 = iter1.next_ref();
current2 = iter2.next_ref();
},
(true, false) => {
current1 = iter1.next_ref();
},
(false, true) => {
current2 = iter1.next_ref();
},
(false, false) => {
return
}
}
}
}
fn main() {
let mut iter1 = RefIter { value: 3 };
let mut iter2 = RefIter { value: 4 };
each_zipped(iter1, iter2, |val1, val2| {
let val1 = *val1.unwrap();
let val2 = *val2.unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:28:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
28 | current1 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter2` as mutable more than once at a time
--> src/main.rs:29:28
|
23 | let mut current2 = iter2.next_ref();
| ----- first mutable borrow occurs here
...
29 | current2 = iter2.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:32:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
32 | current1 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
error[E0499]: cannot borrow `iter1` as mutable more than once at a time
--> src/main.rs:35:28
|
22 | let mut current1 = iter1.next_ref();
| ----- first mutable borrow occurs here
...
35 | current2 = iter1.next_ref();
| ^^^^^ second mutable borrow occurs here
...
42 | }
| - first borrow ends here
I understand why it complains, but can't find a way around it. I'd appreciate any help on the subject.
Link to this snippet in the playground.
Since the iterators take a reference to a buffer in my use case, they can't implement the Iterator trait from the stdlib, but instead are used though a next_ref function, which is identical to Iterator::next, but takes an additional lifetime parameter.
You are describing a streaming iterator. There is a crate for this, aptly called streaming_iterator. The documentation describes your problem (emphasis mine):
While the standard Iterator trait's functionality is based off of
the next method, StreamingIterator's functionality is based off of
a pair of methods: advance and get. This essentially splits the
logic of next in half (in fact, StreamingIterator's next method
does nothing but call advance followed by get).
This is required because of Rust's lexical handling of borrows (more
specifically a lack of single entry, multiple exit borrows). If
StreamingIterator was defined like Iterator with just a required
next method, operations like filter would be impossible to define.
The crate does not currently have a zip function, and certainly not the variant you have described. However, it's easy enough to implement:
extern crate streaming_iterator;
use streaming_iterator::StreamingIterator;
fn each_zipped<A, B, F>(mut iter1: A, mut iter2: B, callback: F)
where
A: StreamingIterator,
B: StreamingIterator,
F: for<'a> Fn(Option<&'a A::Item>, Option<&'a B::Item>) -> (bool, bool),
{
iter1.advance();
iter2.advance();
loop {
let advance_flags = callback(iter1.get(), iter2.get());
match advance_flags {
(true, true) => {
iter1.advance();
iter2.advance();
}
(true, false) => {
iter1.advance();
}
(false, true) => {
iter1.advance();
}
(false, false) => return,
}
}
}
struct RefIter {
value: u64
}
impl StreamingIterator for RefIter {
type Item = u64;
fn advance(&mut self) {
self.value += 1;
}
fn get(&self) -> Option<&Self::Item> {
Some(&self.value)
}
}
fn main() {
let iter1 = RefIter { value: 3 };
let iter2 = RefIter { value: 4 };
each_zipped(iter1, iter2, |val1, val2| {
let val1 = *val1.unwrap();
let val2 = *val2.unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}
The issue with this code is that RefIter is being used in two ways, which are fundamentally at odds with one another:
Callers of next_ref recieve a reference to the stored value, which is tied to the lifetime of RefIter
RefIter's value needs to be mutable, so that it can be incremented on each call
This perfectly describes mutable aliasing (you're trying to modify 'value' while a reference to it is being held) - something which Rust is explicitly designed to prevent.
In order to make each_zipped work, you'll need to modify RefIter to avoid handing out references to data that you wish to mutate.
I've implemented one possibility below using a combination of RefCell and Rc:
use std::cell::RefCell;
use std::rc::Rc;
// An iterator-like type, that returns references to itself
// in next_ref
struct RefIter {
value: RefCell<Rc<u64>>
}
impl RefIter {
fn next_ref(&self) -> Option<Rc<u64>> {
let new_val = Rc::new(**self.value.borrow() + 1);
*self.value.borrow_mut() = new_val;
Some(Rc::clone(&*self.value.borrow()))
}
}
// Iterate over two RefIter simultaniously and call a callback
// for each pair. The callback returns a tuple of bools
// that indicate which iterators should be advanced.
fn each_zipped<F>(iter1: RefIter, iter2: RefIter, callback: F)
where F: Fn(&Option<Rc<u64>>, &Option<Rc<u64>>) -> (bool, bool)
{
let mut current1 = iter1.next_ref();
let mut current2 = iter2.next_ref();
loop {
let advance_flags = callback(&current1, &current2);
match advance_flags {
(true, true) => {
current1 = iter1.next_ref();
current2 = iter2.next_ref();
},
(true, false) => {
current1 = iter1.next_ref();
},
(false, true) => {
current2 = iter1.next_ref();
},
(false, false) => {
return
}
}
}
}
fn main() {
let iter1 = RefIter { value: RefCell::new(Rc::new(3)) };
let iter2 = RefIter { value: RefCell::new(Rc::new(4)) };
each_zipped(iter1, iter2, |val1, val2| {
// We can't use unwrap() directly, since we're only passed a reference to an Option
let val1 = **val1.iter().next().unwrap();
let val2 = **val2.iter().next().unwrap();
println!("{}, {}", val1, val2);
(val1 < 10, val2 < 10)
});
}
This version of RefIter hands out Rcs to consumers, instead of references. This avoids the issue of mutable aliasing - updating value is done by placing
a new Rc into the outer RefCell. A side effect of this is that consumers are able to hold onto an 'old' reference to the buffer (through the returned Rc), even after RefIter has advanced.

Resources