rust E0597: borrowed value does not live lnog enough - rust

I am trying to rewrite an algorithm from javascript to rust. In the following code, I get borrowed value does not live long enough error at line number 17.
[dependencies]
scraper = "0.11.0"
use std::fs;
fn get_html(fname: &str) -> String {
fs::read_to_string(fname).expect("Something went wrong reading the file")
}
pub mod diff_html {
use scraper::{element_ref::ElementRef, Html};
pub struct DiffNode<'a> {
node_ref: ElementRef<'a>,
}
impl<'a> DiffNode<'a> {
fn from_html(html: &str) -> Self {
let doc = Self::get_doc(&html);
let root_element = doc.root_element().to_owned();
let diffn = Self {
node_ref: root_element,
};
diffn
}
fn get_doc(html: &str) -> Html {
Html::parse_document(html).to_owned()
}
}
pub fn diff<'a>(html1: &str, _html2: &str) -> DiffNode<'a> {
let diff1 = DiffNode::from_html(&html1);
diff1
}
}
fn main() {
//read strins
let filename1: &str = "test/test1.html";
let filename2: &str = "test/test2.html";
let html1: &str = &get_html(filename1);
let html2: &str = &get_html(filename2);
let diff1 = diff_html::diff(html1, html2);
//write html
//fs::write("test_outs/testx.html", html1).expect("unable to write file");
//written output file.
}
warning: unused variable: `diff1`
--> src\main.rs:43:9
|
43 | let diff1 = diff_html::diff(html1, html2);
| ^^^^^ help: if this is intentional, prefix it with an underscore: `_diff1`
|
= note: `#[warn(unused_variables)]` on by default
error[E0597]: `doc` does not live long enough
--> src\main.rs:17:32
|
14 | impl<'a> DiffNode<'a> {
| -- lifetime `'a` defined here
...
17 | let root_element = doc.root_element().to_owned();
| ^^^--------------------------
| |
| borrowed value does not live long enough
| assignment requires that `doc` is borrowed for `'a`
...
22 | }
| - `doc` dropped here while still borrowed
I want a detailed explanation/solution if possible.

root_element which is actually an ElementRef has reference to objects inside doc, not the actual owned object. The object doc here is created in from_html function and therefore owned by the function. Because doc is not returned, it is dropped / deleted from memory at the end of from_html function block.
ElementRef needs doc, the thing it is referencing to, to be alive when it is returned from the memory.
pub mod diff_html {
use scraper::{element_ref::ElementRef, Html};
pub struct DiffNode<'a> {
node_ref: ElementRef<'a>,
}
impl<'a> DiffNode<'a> {
fn from_html(html: &'a scraper::html::Html) -> Self {
Self {
node_ref: html.root_element(),
}
}
}
pub fn diff<'a>(html1_string: &str, _html2_string: &str) {
let html1 = Html::parse_document(&html1_string);
let diff1 = DiffNode::from_html(&html1);
// do things here
// at the end of the function, diff1 and html1 is dropped together
// this way the compiler doesn't yell at you
}
}
More or less you need to do something like this with diff function to let the HTML and ElementRef's lifetime to be the same.
This behavior is actually Rust's feature to guard values in memory so that it doesn't leak or reference not referencing the wrong memory address.
Also if you want to feel like operating detachable objects and play with reference (like java, javascript, golang) I suggest reading this https://doc.rust-lang.org/book/ch15-05-interior-mutability.html

Related

Rust async function parameters and lifetimes

I'd like to understand why test1 causes an error but test2 compiles.
It seems like rust is being clever, and realising that when the .await is called directly on the async function result it knows to keep the parameter around for execution of the future but when the async is called on a separate line it can't do this.
Would love to have a link to the relevant functionality that makes this work to learn the details.
async fn do_async_thing(s: &String) {
println!("{s}");
}
fn get_string() -> String {
"sf".to_string()
}
#[tokio::test]
async fn test1() {
let a = do_async_thing(&get_string());
a.await;
}
#[tokio::test]
async fn test2() {
do_async_thing(&get_string()).await;
}
The error
error[E0716]: temporary value dropped while borrowed
--> crates/dynamo/src/error.rs:11:29
|
11 | let a = do_async_thing(&get_string());
| ^^^^^^^^^^^^ - temporary value is freed at the end of this statement
| |
| creates a temporary value which is freed while still in use
12 | a.await;
| - borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
It is not directly to do with async, its because the future returned from do_async_thing holds the string reference.
You can create your own future with the same result
struct DoAsyncThingFuture<'a> {
s: &'a String
}
impl<'a> Future for DoAsyncThingFuture<'a> {
type Output = ();
fn poll(self: std::pin::Pin<&mut Self>, cx: &mut std::task::Context<'_>) -> Poll<Self::Output> {
println!("{}", self.s);
Poll::Ready(())
}
}
fn do_async_thing(s: &String) -> DoAsyncThingFuture {
DoAsyncThingFuture {
s
}
}
And even get the same result without a future
fn do_sync_thing(s: &String) -> &String {
s
}
Attempting to use the return value from either of these functions will give the same error. This happens the return value of get_string does not have an owner so it is dropped after the call to do_sync_thing witch means the return reference is dangling. So as why one works and the other does not:
let a = do_sync_thing(&get_string());
println!("{}", a);
//Same as
let _temp_value = get_string();
let a = do_async_thing(&_temp_value);
drop(_temp_value);
println!("{}", a);
vs
println!("{}", do_sync_thing(&get_string()));
//Same as
let _temp_value = get_string();
println!("{}", do_async_thing(&_temp_value));
drop(_temp_value);

Iterate through a whole file one character at a time

I'm new to Rust and I'm struggle with the concept of lifetimes. I want to make a struct that iterates through a file a character at a time, but I'm running into issues where I need lifetimes. I've tried to add them where I thought they should be but the compiler isn't happy. Here's my code:
struct Advancer<'a> {
line_iter: Lines<BufReader<File>>,
char_iter: Chars<'a>,
current: Option<char>,
peek: Option<char>,
}
impl<'a> Advancer<'a> {
pub fn new(file: BufReader<File>) -> Result<Self, Error> {
let mut line_iter = file.lines();
if let Some(Ok(line)) = line_iter.next() {
let char_iter = line.chars();
let mut advancer = Advancer {
line_iter,
char_iter,
current: None,
peek: None,
};
// Prime the pump. Populate peek so the next call to advance returns the first char
let _ = advancer.next();
Ok(advancer)
} else {
Err(anyhow!("Failed reading an empty file."))
}
}
pub fn next(&mut self) -> Option<char> {
self.current = self.peek;
if let Some(char) = self.char_iter.next() {
self.peek = Some(char);
} else {
if let Some(Ok(line)) = self.line_iter.next() {
self.char_iter = line.chars();
self.peek = Some('\n');
} else {
self.peek = None;
}
}
self.current
}
pub fn current(&self) -> Option<char> {
self.current
}
pub fn peek(&self) -> Option<char> {
self.peek
}
}
fn main() -> Result<(), Error> {
let file = File::open("input_file.txt")?;
let file_buf = BufReader::new(file);
let mut advancer = Advancer::new(file_buf)?;
while let Some(char) = advancer.next() {
print!("{}", char);
}
Ok(())
}
And here's what the compiler is telling me:
error[E0515]: cannot return value referencing local variable `line`
--> src/main.rs:37:13
|
25 | let char_iter = line.chars();
| ---- `line` is borrowed here
...
37 | Ok(advancer)
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0597]: `line` does not live long enough
--> src/main.rs:49:34
|
21 | impl<'a> Advancer<'a> {
| -- lifetime `'a` defined here
...
49 | self.char_iter = line.chars();
| -----------------^^^^--------
| | |
| | borrowed value does not live long enough
| assignment requires that `line` is borrowed for `'a`
50 | self.peek = Some('\n');
51 | } else {
| - `line` dropped here while still borrowed
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0515, E0597.
For more information about an error, try `rustc --explain E0515`.
error: could not compile `advancer`.
Some notes:
The Chars iterator borrows from the String it was created from. So you can't drop the String while the iterator is alive. But that's what happens in your new() method, the line variable owning the String disappears while the iterator referencing it is stored in the struct.
You could also try storing the current line in the struct, then it would live long enough, but that's not an option – a struct cannot hold a reference to itself.
Can you make a char iterator on a String that doesn't store a reference into the String? Yes, probably, for instance by storing the current position in the string as an integer – it shouldn't be the index of the char, because chars can be more than one byte long, so you'd need to deal with the underlying bytes yourself (using e.g. is_char_boundary() to take the next bunch of bytes starting from your current index that form a char).
Is there an easier way? Yes, if performance is not of highest importance, one solution is to make use of Vec's IntoIterator instance (which uses unsafe magic to create an object that hands out parts of itself) :
let char_iter = file_buf.lines().flat_map(|line_res| {
let line = line_res.unwrap_or(String::new());
line.chars().collect::<Vec<_>>()
});
Note that just returning line.chars() would have the same problem as the first point.
You might think that String should have a similar IntoIterator instance, and I wouldn't disagree.

How to use wirefilter over an infinite stream of data

I am writing a program to use wirefilter in order to filter data from an infinite stream.
But it seems that I cannot use a compiled ast in a loop because of lifetimes and when I try to compile, this is the output:
error: borrowed data cannot be stored outside of its closure
--> src/main.rs:34:33
|
31 | let filter = ast.compile();
| ------ ...so that variable is valid at time of its declaration
32 |
33 | for my_struct in data.filter(|my_struct| {
| ----------- borrowed data cannot outlive this closure
34 | let execution_context = my_struct.execution_context();
| ^^^^^^^^^ ----------------- cannot infer an appropriate lifetime...
| |
| cannot be stored outside of its closure
error: aborting due to previous error
error: Could not compile `wirefilter_playground`.
To learn more, run the command again with --verbose.
main.rs
use wirefilter::{ExecutionContext, Scheme};
lazy_static::lazy_static! {
static ref SCHEME: Scheme = Scheme! {
port: Int
};
}
#[derive(Debug)]
struct MyStruct {
port: i32,
}
impl MyStruct {
fn scheme() -> &'static Scheme {
&SCHEME
}
fn execution_context(&self) -> ExecutionContext {
let mut ctx = ExecutionContext::new(Self::scheme());
ctx.set_field_value("port", self.port).unwrap();
ctx
}
}
fn main() -> Result<(), failure::Error> {
let data = expensive_data_iterator();
let scheme = MyStruct::scheme();
let ast = scheme.parse("port in {2 5}")?;
let filter = ast.compile();
for my_struct in data.filter(|my_struct| {
let execution_context = my_struct.execution_context();
filter.execute(&execution_context).unwrap()
}).take(10) {
println!("{:?}", my_struct);
}
Ok(())
}
fn expensive_data_iterator() -> impl Iterator<Item=MyStruct> {
(0..).map(|port| MyStruct { port })
}
Cargo.toml
[package]
name = "wirefilter_playground"
version = "0.1.0"
edition = "2018"
[dependencies]
wirefilter-engine = "0.6.1"
failure = "0.1.5"
lazy_static = "1.3.0"
is it possible to make it work? I would like to yield only the filtered data for the final user otherwise the amount of data would be huge in memory.
Thank you in advance!
It looks like the problem is with the lifetime elision in return structs. In particular this code:
fn execution_context(&self) -> ExecutionContext {
//...
}
is equivalent to this one:
fn execution_context<'s>(&'s self) -> ExecutionContext<'s> {
//...
}
Which becomes obvious once you realize that ExecutionContext has an associated lifetime.
The lifetime of ExecutionContext does not have to match that of the MyStruct so you probably want to write:
fn execution_context<'e>(&self) -> ExecutionContext<'e> {
//...
}
or maybe:
fn execution_context<'s, 'e>(&'s self) -> ExecutionContext<'e>
where 'e: 's {
//...
}
depending on whether your context will eventually refer to any content of MyStruct.

Cursor of HashMap records with RwLockGuard in Rust

I am new to Rust, and I am trying to implement a simple, thread-safe memory key-value store, using a HashMap protected within a RwLock. My code looks like this:
use std::sync::{ Arc, RwLock, RwLockReadGuard };
use std::collections::HashMap;
use std::collections::hash_map::Iter;
type SimpleCollection = HashMap<String, String>;
struct Store(Arc<RwLock<SimpleCollection>>);
impl Store {
fn new() -> Store { return Store(Arc::new(RwLock::new(SimpleCollection::new()))) }
fn get(&self, key: &str) -> Option<String> {
let map = self.0.read().unwrap();
return map.get(&key.to_string()).map(|s| s.clone());
}
fn set(&self, key: &str, value: &str) {
let mut map = self.0.write().unwrap();
map.insert(key.to_string(), value.to_string());
}
}
So far, this code works OK. The problem is that I am trying to implement a scan() function, which returns a Cursor object that can be used to iterate over all the records. I want the Cursor object to hold a RwLockGuard, which is not released until the cursor itself is released (basically I don't want to allow modifications while a Cursor is alive).
I tried this:
use ...
type SimpleCollection = HashMap<String, String>;
struct Store(Arc<RwLock<SimpleCollection>>);
impl Store {
...
fn scan(&self) -> Cursor {
let guard = self.0.read().unwrap();
let iter = guard.iter();
return Cursor { guard, iter };
}
}
struct Cursor<'l> {
guard: RwLockReadGuard<'l, SimpleCollection>,
iter: Iter<'l, String, String>
}
impl<'l> Cursor<'l> {
fn next(&mut self) -> Option<(String, String)> {
return self.iter.next().map(|r| (r.0.clone(), r.1.clone()));
}
}
But that did not work, as I got this compilation error:
error[E0597]: `guard` does not live long enough
--> src/main.rs:24:20
|
24 | let iter = guard.iter();
| ^^^^^ borrowed value does not live long enough
25 | return Cursor { guard, iter };
26 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the method body at 22:5...
--> src/main.rs:22:5
|
22 | / fn scan(&self) -> Cursor {
23 | | let guard = self.0.read().unwrap();
24 | | let iter = guard.iter();
25 | | return Cursor { guard, iter };
26 | | }
| |_____^
Any ideas?
As mentioned in the comments, the problem is that structs generally can't be self-referential in Rust. The Cursor struct you are trying to construct contains both the MutexGuard and the iterator borrowing the MutexGuard, which is not possible (for good reasons – see the linked question).
The easiest fix in this case is to introduce a separate struct storing the MutexGuard, e.g.
struct StoreLock<'a> {
guard: RwLockReadGuard<'a, SimpleCollection>,
}
On the Store, we can then introduce a method returning a StoreLock
fn lock(&self) -> StoreLock {
StoreLock { guard: self.0.read().unwrap() }
}
and the StoreLock can expose the actual scan() method (and possibly others requiring a persistent lock):
impl<'a> StoreLock<'a> {
fn scan(&self) -> Cursor {
Cursor { iter: self.guard.iter() }
}
}
The Cursor struct itself only contains the iterator:
struct Cursor<'a> {
iter: Iter<'a, String, String>,
}
Client code first needs to obtain the lock, then get the cursor:
let lock = s.lock();
let cursor = lock.scan();
This ensures that the lock lives long enough to finish scanning.
Full code on the playground

Bind a reference to a struct property to a variable inside a function returning a mutable reference [duplicate]

I am learning Rust and I don't quite get why this is not working.
#[derive(Debug)]
struct Node {
value: String,
}
#[derive(Debug)]
pub struct Graph {
nodes: Vec<Box<Node>>,
}
fn mk_node(value: String) -> Node {
Node { value }
}
pub fn mk_graph() -> Graph {
Graph { nodes: vec![] }
}
impl Graph {
fn add_node(&mut self, value: String) {
if let None = self.nodes.iter().position(|node| node.value == value) {
let node = Box::new(mk_node(value));
self.nodes.push(node);
};
}
fn get_node_by_value(&self, value: &str) -> Option<&Node> {
match self.nodes.iter().position(|node| node.value == *value) {
None => None,
Some(idx) => self.nodes.get(idx).map(|n| &**n),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn some_test() {
let mut graph = mk_graph();
graph.add_node("source".to_string());
graph.add_node("destination".to_string());
let source = graph.get_node_by_value("source").unwrap();
let dest = graph.get_node_by_value("destination").unwrap();
graph.add_node("destination".to_string());
}
}
(playground)
This has the error
error[E0502]: cannot borrow `graph` as mutable because it is also borrowed as immutable
--> src/main.rs:50:9
|
47 | let source = graph.get_node_by_value("source").unwrap();
| ----- immutable borrow occurs here
...
50 | graph.add_node("destination".to_string());
| ^^^^^ mutable borrow occurs here
51 | }
| - immutable borrow ends here
This example from Programming Rust is quite similar to what I have but it works:
pub struct Queue {
older: Vec<char>, // older elements, eldest last.
younger: Vec<char>, // younger elements, youngest last.
}
impl Queue {
/// Push a character onto the back of a queue.
pub fn push(&mut self, c: char) {
self.younger.push(c);
}
/// Pop a character off the front of a queue. Return `Some(c)` if there /// was a character to pop, or `None` if the queue was empty.
pub fn pop(&mut self) -> Option<char> {
if self.older.is_empty() {
if self.younger.is_empty() {
return None;
}
// Bring the elements in younger over to older, and put them in // the promised order.
use std::mem::swap;
swap(&mut self.older, &mut self.younger);
self.older.reverse();
}
// Now older is guaranteed to have something. Vec's pop method // already returns an Option, so we're set.
self.older.pop()
}
pub fn split(self) -> (Vec<char>, Vec<char>) {
(self.older, self.younger)
}
}
pub fn main() {
let mut q = Queue {
older: Vec::new(),
younger: Vec::new(),
};
q.push('P');
q.push('D');
assert_eq!(q.pop(), Some('P'));
q.push('X');
let (older, younger) = q.split(); // q is now uninitialized.
assert_eq!(older, vec!['D']);
assert_eq!(younger, vec!['X']);
}
A MRE of your problem can be reduced to this:
// This applies to the version of Rust this question
// was asked about; see below for updated examples.
fn main() {
let mut items = vec![1];
let item = items.last();
items.push(2);
}
error[E0502]: cannot borrow `items` as mutable because it is also borrowed as immutable
--> src/main.rs:4:5
|
3 | let item = items.last();
| ----- immutable borrow occurs here
4 | items.push(2);
| ^^^^^ mutable borrow occurs here
5 | }
| - immutable borrow ends here
You are encountering the exact problem that Rust was designed to prevent. You have a reference pointing into the vector and are attempting to insert into the vector. Doing so might require that the memory of the vector be reallocated, invalidating any existing references. If that happened and you used the value in item, you'd be accessing uninitialized memory, potentially causing a crash.
In this particular case, you aren't actually using item (or source, in the original) so you could just... not call that line. I assume you did that for some reason, so you could wrap the references in a block so that they go away before you try to mutate the value again:
fn main() {
let mut items = vec![1];
{
let item = items.last();
}
items.push(2);
}
This trick is no longer needed in modern Rust because non-lexical lifetimes have been implemented, but the underlying restriction still remains — you cannot have a mutable reference while there are other references to the same thing. This is one of the rules of references covered in The Rust Programming Language. A modified example that still does not work with NLL:
let mut items = vec![1];
let item = items.last();
items.push(2);
println!("{:?}", item);
In other cases, you can copy or clone the value in the vector. The item will no longer be a reference and you can modify the vector as you see fit:
fn main() {
let mut items = vec![1];
let item = items.last().cloned();
items.push(2);
}
If your type isn't cloneable, you can transform it into a reference-counted value (such as Rc or Arc) which can then be cloned. You may or may not also need to use interior mutability:
struct NonClone;
use std::rc::Rc;
fn main() {
let mut items = vec![Rc::new(NonClone)];
let item = items.last().cloned();
items.push(Rc::new(NonClone));
}
this example from Programming Rust is quite similar
No, it's not, seeing as how it doesn't use references at all.
See also
Cannot borrow `*x` as mutable because it is also borrowed as immutable
Pushing something into a vector depending on its last element
Why doesn't the lifetime of a mutable borrow end when the function call is complete?
How should I restructure my graph code to avoid an "Cannot borrow variable as mutable more than once at a time" error?
Why do I get the error "cannot borrow x as mutable more than once"?
Why does Rust want to borrow a variable as mutable more than once at a time?
Try to put your immutable borrow inside a block {...}.
This ends the borrow after the block.
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn some_test() {
let mut graph = mk_graph();
graph.add_node("source".to_string());
graph.add_node("destination".to_string());
{
let source = graph.get_node_by_value("source").unwrap();
let dest = graph.get_node_by_value("destination").unwrap();
}
graph.add_node("destination".to_string());
}
}
So for anyone else banging their head against this problem and wanting a quick way out - use clones instead of references. Eg I'm iterating this list of cells and want to change an attribute so I first copy the list:
let created = self.cells
.into_iter()
.map(|c| {
BoardCell {
x: c.x,
y: c.y,
owner: c.owner,
adjacency: c.adjacency.clone(),
}
})
.collect::<Vec<BoardCell>>();
And then modify the values in the original by looping the copy:
for c in created {
self.cells[(c.x + c.y * self.size) as usize].adjacency[dir] = count;
}
Using Vec<&BoardCell> would just yield this error. Not sure how Rusty this is but hey, it works.

Resources