nom parser borrow checker issue - rust

I have this Rust program using nom 4.2.2. (I have taken the liberty of expanding the nom parser function.)
extern crate failure;
extern crate nom;
use failure::Error;
use std::fs::File;
use std::io::Read;
fn nom_parser(i: &[u8]) -> ::nom::IResult<&[u8], String, u32> {
{ ::nom::lib::std::result::Result::Ok((i, ("foo".to_owned()))) }
}
fn my_parser(buf: &[u8]) -> Result<(&[u8], String), Error> {
Ok((buf, "foo".to_owned()))
}
fn main() -> Result<(), Error> {
let handler = |mut entries: String| { entries.clear() };
loop {
let mut buf = Vec::new();
File::open("/etc/hosts")?.read_to_end(&mut buf)?;
let res = nom_parser(&buf)?.1;
// let res = my_parser(&buf)?.1;
handler(res);
}
}
Compiling this program with rustc 1.33.0 (2aa4c46cf 2019-02-28) yields the following issue:
error[E0597]: `buf` does not live long enough
--> nom-parsing/src/main.rs:21:26
|
21 | let res = nom_parser(&buf)?.1;
| -----------^^^^-
| | |
| | borrowed value does not live long enough
| argument requires that `buf` is borrowed for `'static`
...
24 | }
| - `buf` dropped here while still borrowed
With the commented-out version of the parser, the program compiles just fine. How are my_parser and nom_parser different? Who is borrowing buf? How should I change the program to placate the borrow checker?

let res = nom_parser(&buf)?.1;
                          ^ here
You are using the ? operator to propagate the error out of main. IResult<&[u8], String, u32> is an alias for Result<(&[u8], String), nom::Err<&[u8], u32>>, so in the error case a reference into buf is returned as part of the error. That reference would have to stay valid after main returns, but it cannot, because buf is a local variable inside main. In addition, failure::Error can only be built from error types that are 'static, which is why the compiler says that buf must be borrowed for 'static.
In your case nom_parser never returns an error, but the borrow checker only looks at the types and function signatures.
To fix it, you should process the error somehow before propagating it up. For example:
let res = nom_parser(&buf).map_err(|_| failure::format_err!("Parsing failed!"))?.1;
Note that Err in the IResult is not always a hard error. It could be nom::Err::Incomplete, meaning that parsing may succeed if more data is supplied; nom::Err::Error, meaning that the input was not matched by the parser (so perhaps another parser in alt! could succeed); or nom::Err::Failure, meaning that something went really wrong during parsing. Depending on the situation, you may treat them all as failures or handle them differently.
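The same situation can be reproduced with the standard library alone. The following is a minimal sketch, not nom's or failure's actual code: BorrowingError, parse and run are hypothetical names. BorrowingError stands in for nom::Err<&[u8], u32> (it keeps a reference to the input), and Box<dyn Error> stands in for failure::Error (both require a 'static error).

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for nom::Err<&[u8], u32>: the error keeps a
// reference to the input it failed on.
#[derive(Debug)]
struct BorrowingError<'a> {
    remaining: &'a [u8],
}

impl fmt::Display for BorrowingError<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse failed with {} bytes left", self.remaining.len())
    }
}

impl Error for BorrowingError<'_> {}

// Hypothetical parser: succeeds only if the input starts with b"foo".
fn parse(input: &[u8]) -> Result<(&[u8], String), BorrowingError<'_>> {
    if input.starts_with(b"foo") {
        Ok((&input[3..], "foo".to_owned()))
    } else {
        Err(BorrowingError { remaining: input })
    }
}

fn run(text: &str) -> Result<String, Box<dyn Error>> {
    // `buf` is local, just like in the question.
    let buf: Vec<u8> = text.as_bytes().to_vec();
    // `parse(&buf)?` is rejected here: Box<dyn Error> means
    // Box<dyn Error + 'static>, so the error may not borrow from `buf`.
    // Converting the error to an owned String first ends that borrow.
    let parsed = parse(&buf).map_err(|e| e.to_string())?.1;
    Ok(parsed)
}

fn main() {
    println!("{:?}", run("foobar"));
    println!("{:?}", run("bar"));
}
```

The design choice is the same as in the map_err line above: turn the borrowing error into an owned value before ? gets to see it.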

The problem appears to be in IResult<I, O, E = u32>, which expands to Result<(I, O), Err<I, E>>.
As you can see, when you use ?, the Err that you may return can still contain a value of type I, which here is your &[u8], and that value would leave your function.
The only way for the function to return this reference is for the reference to have a lifetime that does not end with the function: 'static.
A simple solution to your problem would be to change the &[u8] to a Vec<u8>, although I'm not sure what you're trying to do with it.

Related

Why rust requires manual drop of a variable in this case?

The following code gives an error unless cnode is manually dropped after each iteration. Since cnode goes out of scope, it should be dropped automatically, and I don't think anything can outlive the borrowed value. But the compiler complains that the borrowed value does not live long enough.
#[cfg(feature = "localrun")]
#[derive(Debug, PartialEq, Eq)]
pub struct TreeNode {
pub val: i32,
pub left: Option<Rc<RefCell<TreeNode>>>,
pub right: Option<Rc<RefCell<TreeNode>>>,
}
#[cfg(feature = "localrun")]
impl TreeNode {
#[inline]
pub fn new(val: i32) -> Self {
TreeNode {
val,
left: None,
right: None
}
}
}
// use std::borrow::Borrow;
use std::rc::Rc;
use std::cell::RefCell;
impl Solution {
pub fn max_depth(root: Option<Rc<RefCell<TreeNode>>>) -> i32 {
let mut depth = 0;
let mut vec = vec![];
if let Some(vnode) = &root {
vec.push(vnode.clone());
}
while !vec.is_empty() {
let mut size = vec.len();
while size > 0 {
size -= 1;
let cnode = vec.pop().unwrap();
if let Some(lnode) = cnode.borrow().left.as_ref() {
vec.push(lnode.clone());
}
if let Some(rnode) = cnode.borrow().right.as_ref() {
vec.push(rnode.clone());
}
// drop(cnode); // <---- Uncommenting This fixes the error though
}
}
unimplemented!()
}
}
#[cfg(feature = "localrun")]
struct Solution{}
Edit: Full error msg:
error[E0597]: `cnode` does not live long enough
--> src/w35_w36_easy/n104.rs:105:30
|
105 | if let Some(rnode) = cnode.borrow().right.as_ref() {
| ^^^^^^^^^^^^^^
| |
| borrowed value does not live long enough
| a temporary with access to the borrow is created here ...
...
109 | }
| -
| |
| `cnode` dropped here while still borrowed
| ... and the borrow might be used here, when that temporary is dropped and runs the destructor for type `Ref<'_, TreeNode>`
|
help: consider adding semicolon after the expression so its temporaries are dropped sooner, before the local variables declared by the block are dropped
|
107 | };
| +
I just noticed the helpful error message from the compiler. It says to add a semicolon after the if block, and that did fix the error! I'm still not sure why the semicolon makes the difference, though.
This is actually quite interesting, but requires delving a little more deeply than you might expect.
A while loop is, like most things in Rust, an expression—which is to say that the loop itself evaluates to some value (currently it's always of unit type (), although that could change in the future) and it can therefore be used in expression position, such as on the right hand side of an assignment:
let _: () = while false {};
The block of a while loop is also an expression (albeit one that must also always be of unit type); this value comes from the final expression in the block—which, in your case, is the final if let:
let _: () = if let Some(rnode) = cnode.borrow().right.as_ref() {
vec.push(rnode.clone());
};
The borrow of cnode is held by the temporary Ref that cnode.borrow() creates, and that temporary lives to the end of this expression. Because the if let is the tail expression of the block, its temporaries are dropped only when the block itself is finished, which is after the block's local variables (including cnode) have already been dropped. The Ref's destructor would therefore run while still holding a borrow of a cnode that no longer exists. Hence the error, and the suggestion to add a semicolon at the end of the if let expression (thus converting it into a statement, so that the temporary Ref is dropped at the semicolon, before cnode):
if let Some(rnode) = cnode.borrow().right.as_ref() {
vec.push(rnode.clone());
};
Perhaps Rust could/should be more intelligent here, and recognise the borrow cannot possibly be required for so long, but no doubt that would add considerable additional complexity and may make it harder to introduce future changes to the language in a backwards-compatible way.
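For reference, here is a self-contained sketch of the traversal with the semicolon in place. Node, Link and leaf are hypothetical names introduced for this illustration (the original TreeNode also carries a val field), and the depth counting is one possible way to fill in the unimplemented!() of the question.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Link = Option<Rc<RefCell<Node>>>;

// Simplified stand-in for the TreeNode of the question.
#[derive(Debug, Default)]
struct Node {
    left: Link,
    right: Link,
}

fn max_depth(root: Link) -> i32 {
    let mut depth = 0;
    let mut queue: Vec<Rc<RefCell<Node>>> = root.into_iter().collect();
    while !queue.is_empty() {
        depth += 1;
        let mut next = Vec::new();
        while !queue.is_empty() {
            let cnode = queue.pop().unwrap();
            if let Some(lnode) = cnode.borrow().left.as_ref() {
                next.push(Rc::clone(lnode));
            }
            if let Some(rnode) = cnode.borrow().right.as_ref() {
                next.push(Rc::clone(rnode));
            }; // The semicolon turns the final `if let` into a statement,
               // so the temporary `Ref` is dropped here, before `cnode`.
        }
        queue = next;
    }
    depth
}

// Helper that builds a node without children.
fn leaf() -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node::default()))
}

fn main() {
    let root = leaf();
    root.borrow_mut().left = Some(leaf());
    println!("depth = {}", max_depth(Some(root)));
}
```

Binding the guard to a local declared after cnode (let node = cnode.borrow();) should work as well, because locals are dropped in reverse order of declaration.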

Taking ownership of a &String without copying

Context
Link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a9ffa99023735f4fbedec09e1c7ac55
Here's a contrived repro of what I'm running into
fn main() {
let mut s = String::from("Hello World");
example(&mut s);
}
fn example(s: &mut str) -> Option<String> {
other_func(Some(s.to_owned()))
// other random mutable stuff happens
}
fn other_func(s: Option<String>) {
match s {
Some(ref s) => other_func2(*s),
None => panic!()
}
}
fn other_func2(s: String) {
println!("{}", &s)
}
and the error
Compiling playground v0.0.1 (/playground)
error[E0507]: cannot move out of `*s` which is behind a shared reference
--> src/main.rs:12:36
|
12 | Some(ref s) => other_func2(*s),
| ^^ move occurs because `*s` has type `String`, which does not implement the `Copy` trait
Question
In the following code, why can't I dereference the &String without having to do some sort of clone/copy? i.e. this doesn't work
fn other_func(s: Option<String>) {
match s {
Some(ref s) => other_func2(*s),
None => panic!()
}
}
but it works if I replace *s with s.to_owned()/s.to_string()/s.clone()
As an aside, I understand this can probably be solved by refactoring to use &str, but I'm specifically interested in turning &String -> String
Why would the compiler allow you to?
s is &String. And you cannot get a String from a &String without cloning. That's obvious.
And the fact that it was created from an owned String? The compiler doesn't care, and it is right. This is not different from the following code:
let s: String = ...;
let r: &String = ...;
let s2: String = *r; // Error
Which is in turn not different from the following code, for instance, as far as the compiler is concerned:
let r: &String = ...;
let s: String = *r; // Error
And we no longer have an owned string at the beginning. In general, the compiler doesn't track data flow. And rightfully so: when it type-checks the move, it cannot even confirm that this reference isn't aliased, or that the owned value is not used anymore. References are just references; they give you no right to drop the value.
Changing that would not be feasible in the general case (for example, the compiler would have to track data flow across function calls), and it would require some form of manual annotation to say "this value is mine". And you already have such an annotation: use an owned value, String, instead of &String. This is exactly what it's about.
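To make the three cases concrete, here is a small sketch of what each kind of access allows. The function names (consume, from_owned, from_mut, from_shared) are hypothetical and only serve this illustration.

```rust
// Takes ownership of the String and reuses its buffer.
fn consume(mut s: String) -> String {
    s.push('!');
    s
}

// Owning the Option: matching by value (no `ref`) moves the String out.
fn from_owned(opt: Option<String>) -> String {
    match opt {
        Some(s) => consume(s),
        None => String::new(),
    }
}

// Only `&mut` access: std::mem::take leaves an empty String behind and
// hands back the original one without copying its contents.
fn from_mut(s: &mut String) -> String {
    consume(std::mem::take(s))
}

// Only `&` access: the value is not ours, so cloning is the only option.
fn from_shared(s: &String) -> String {
    consume(s.clone())
}

fn main() {
    let mut s = String::from("Hello World");
    println!("{}", from_shared(&s));
    println!("{}", from_mut(&mut s));
    println!("{}", from_owned(Some(String::from("Hello again"))));
}
```

In the question's other_func, dropping ref from the pattern (Some(s) => other_func2(s)) corresponds to the from_owned case, since the function already owns the Option<String>.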

Best practice for handling partial move issues when working with nested structs [duplicate]

This is the code I am trying to execute:
fn my_fn(arg1: &Option<Box<i32>>) -> i32 {
if arg1.is_none() {
return 0;
}
let integer = arg1.unwrap();
*integer
}
fn main() {
let integer = 42;
my_fn(&Some(Box::new(integer)));
}
(on the Rust playground)
I get the following error in previous versions of Rust:
error[E0507]: cannot move out of borrowed content
--> src/main.rs:5:19
|
5 | let integer = arg1.unwrap();
| ^^^^ cannot move out of borrowed content
And in more modern versions:
error[E0507]: cannot move out of `*arg1` which is behind a shared reference
--> src/main.rs:5:19
|
5 | let integer = arg1.unwrap();
| ^^^^
| |
| move occurs because `*arg1` has type `std::option::Option<std::boxed::Box<i32>>`, which does not implement the `Copy` trait
| help: consider borrowing the `Option`'s content: `arg1.as_ref()`
I see there is already a lot of documentation about borrow checker issues, but after reading it, I still can't figure out the problem.
Why is this an error and how do I solve it?
Option::unwrap() consumes the option, that is, it accepts the option by value. However, you don't have a value, you only have a reference to it. That's what the error is about.
Your code should idiomatically be written like this:
fn my_fn(arg1: &Option<Box<i32>>) -> i32 {
match arg1 {
Some(b) => **b,
None => 0,
}
}
fn main() {
let integer = 42;
my_fn(&Some(Box::new(integer)));
}
(on the Rust playground)
Or you can use Option combinators like Option::as_ref or Option::as_mut paired with Option::map_or, as Shepmaster has suggested:
fn my_fn(arg1: &Option<Box<i32>>) -> i32 {
arg1.as_ref().map_or(0, |n| **n)
}
This code relies on i32 being Copy. If the type inside the Box weren't Copy, you couldn't obtain the inner value by value at all; you could only clone it or return a reference, for example:
fn my_fn2(arg1: &Option<Box<i32>>) -> &i32 {
arg1.as_ref().map_or(&0, |n| n)
}
Since you only have an immutable reference to the option, you can only return an immutable reference to its contents. Rust is smart enough to promote the literal 0 to a static value, so a reference to it can be returned when the input value is absent.
Since Rust 1.40 there is Option::as_deref, so now you can do:
fn my_fn(arg1: &Option<Box<i32>>) -> i32 {
*arg1.as_deref().unwrap_or(&0)
}
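For a payload that is not Copy, the same combinators can either hand out a reference or clone the value. A minimal sketch, assuming a Box<String> payload (name_ref and name_owned are hypothetical names used only here):

```rust
// Borrowing: no allocation, the result lives as long as the option does.
fn name_ref(arg: &Option<Box<String>>) -> &str {
    arg.as_deref().map_or("", |s| s.as_str())
}

// Cloning: produces an owned String, falling back to an empty one.
fn name_owned(arg: &Option<Box<String>>) -> String {
    arg.as_deref().cloned().unwrap_or_default()
}

fn main() {
    let some = Some(Box::new(String::from("abc")));
    println!("{} {}", name_ref(&some), name_owned(&some));
    println!("{:?} {:?}", name_ref(&None), name_owned(&None));
}
```

Neither function moves anything out of the borrowed Option, so both compile where arg1.unwrap() does not.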

Can match on Result here be replaced with map_err and "?"

I have some code which looks like this (greatly simplified version). A function takes two function arguments of type LoadClient and CheckApproval and returns either an error or a string.
pub struct Client {
pub id: String,
}
pub enum MyErr {
RequiresApproval(Client, String),
LoadFailed,
}
pub fn authorize<LoadClient, CheckApproval>(load_client: LoadClient, check_approval: CheckApproval) -> Result<String, MyErr>
where
LoadClient: FnOnce(String) -> Result<Client, String>,
CheckApproval: for<'a> FnOnce(&'a Client, &str) -> Result<&'a str, ()>,
{
let client = load_client("hello".to_string()).map_err(|_| MyErr::LoadFailed)?;
let permission = "something";
// This doesn't compile
// let authorized = check_approval(&client, permission).map_err(|_| MyErr::RequiresApproval(client, permission.to_string()))?;
// Ok(authorized.to_string())
// This version does
match check_approval(&client, permission) {
Err(_) => Err(MyErr::RequiresApproval(client, permission.to_string())),
Ok(authorized) => Ok(authorized.to_string()),
}
}
I'd like to use ? with the check_approval call (as the commented-out code shows) for simpler code and to avoid the extra nesting: the Ok branch in the final match is actually a much longer block.
Unfortunately that doesn't compile:
error[E0505]: cannot move out of `client` because it is borrowed
--> src/lib.rs:19:66
|
19 | let authorized = check_approval(&client, permission).map_err(|_| MyErr::RequiresApproval(client, permission.to_string()))?;
| ------- ------- ^^^ ------ move occurs due to use in closure
| | | |
| | | move out of `client` occurs here
| | borrow later used by call
| borrow of `client` occurs here
These seem similar (to my untrained eye). Hasn't the borrowed reference to client been returned by the time map_err is called?
My main question: Is there a way to get round this and write the code without using match?
rust playground link.
While your two versions of the code are semantically equivalent, they are actually quite different for the compiler.
The failing one calls Result::map_err() with a closure that captures client by value. That is, client is moved into the closure while it is still borrowed by the call to check_approval(). That is the error: a borrowed value cannot be moved.
You may think that this borrow should end when the function returns, but it does not, because the return type is Result<&'a str, ()>, where 'a is precisely the lifetime of that borrow. The borrow of client lasts for as long as this 'a is in use. That is why your second version works: when you match on the Result, 'a does not extend into the Err(()) branch, only into the Ok(&'a str) branch, so the Err(()) branch is free to move client.
Is there a way to get round this and write the code without using match?
Well, you are calling authorized.to_string() on the returned &'a str, converting it to an owned String. So, if you can change your CheckApproval constraint to:
CheckApproval: FnOnce(&Client, &str) -> Result<String, ()>,
the problem just goes away.
If you cannot change that, another option is to do the to_string() before moving the client into the closure, finishing the borrow before it can do harm:
let authorized = check_approval(&client, permission)
.map(|a| a.to_string())
.map_err(|_| MyErr::RequiresApproval(client, permission.to_string()))?;
Ok(authorized)
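Put together, the second option can be exercised as follows. This is a sketch built from the code in the question; the helper functions load, approve and deny and the Debug/PartialEq derives are additions made only so that both outcomes can be shown.

```rust
#[derive(Debug, PartialEq)]
pub struct Client {
    pub id: String,
}

#[derive(Debug, PartialEq)]
pub enum MyErr {
    RequiresApproval(Client, String),
    LoadFailed,
}

pub fn authorize<LoadClient, CheckApproval>(
    load_client: LoadClient,
    check_approval: CheckApproval,
) -> Result<String, MyErr>
where
    LoadClient: FnOnce(String) -> Result<Client, String>,
    CheckApproval: for<'a> FnOnce(&'a Client, &str) -> Result<&'a str, ()>,
{
    let client = load_client("hello".to_string()).map_err(|_| MyErr::LoadFailed)?;
    let permission = "something";
    let authorized = check_approval(&client, permission)
        // Copy the &str into a String first: the borrow of `client` ends here.
        .map(|a| a.to_string())
        // Now `client` can be moved into the error value.
        .map_err(|_| MyErr::RequiresApproval(client, permission.to_string()))?;
    Ok(authorized)
}

// Hypothetical helpers used to exercise both outcomes.
fn load(id: String) -> Result<Client, String> {
    Ok(Client { id })
}

fn approve<'a>(client: &'a Client, _permission: &str) -> Result<&'a str, ()> {
    Ok(&client.id)
}

fn deny<'a>(_client: &'a Client, _permission: &str) -> Result<&'a str, ()> {
    Err(())
}

fn main() {
    println!("{:?}", authorize(load, approve));
    println!("{:?}", authorize(load, deny));
}
```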

Why can't I reuse a &mut reference after passing it to a function that accepts a generic type?

Why doesn't this code compile:
use std::io;
fn use_cursor(cursor: &mut io::Cursor<&mut Vec<u8>>) {
// do some work
}
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(data);
use_cursor(&mut buf);
}
data.len();
}
fn produce_data() {
let mut data = Vec::new();
take_reference(&mut data);
data.len();
}
The error in this case is:
error[E0382]: use of moved value: `*data`
--> src/main.rs:14:5
|
9 | let mut buf = io::Cursor::new(data);
| ---- value moved here
...
14 | data.len();
| ^^^^ value used here after move
|
= note: move occurs because `data` has type `&mut std::vec::Vec<u8>`, which does not implement the `Copy` trait
The signature of io::Cursor::new is such that it takes ownership of its argument. In this case, the argument is a mutable reference to a Vec.
pub fn new(inner: T) -> Cursor<T>
It sort of makes sense to me; because Cursor::new takes ownership of its argument (and not a reference) we can't use that value later on. At the same time it doesn't make sense: we essentially only pass a mutable reference and the cursor goes out of scope afterwards anyway.
In the produce_data function we also pass a mutable reference to take_reference, and it doesn't produce an error when trying to use data again, unlike inside take_reference.
I found it possible to 'reclaim' the reference by using Cursor::into_inner(), but it feels a bit weird to do it manually, since in normal use-cases the borrow-checker is perfectly capable of doing it itself.
Is there a nicer solution to this problem than using .into_inner()? Maybe there's something else I don't understand about the borrow-checker?
Normally, when you pass a mutable reference to a function, the compiler implicitly performs a reborrow. This produces a new borrow with a shorter lifetime.
When the parameter is generic (and is not of the form &mut T), the compiler doesn't do this reborrowing automatically [1]. However, you can do it manually by dereferencing your existing mutable reference and then referencing it again:
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(&mut *data);
use_cursor(&mut buf);
}
data.len();
}
[1] This is because the current compiler architecture only allows a chance to do a coercion if both the source and target types are known at the coercion site.
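A complete version of the reborrow fix that can be compiled as is. The body of use_cursor (writing three bytes) and the usize return value are assumptions added so that the effect is observable; the original only says "do some work".

```rust
use std::io::{self, Write};

fn use_cursor(cursor: &mut io::Cursor<&mut Vec<u8>>) {
    // Stand-in for "do some work": append a few bytes through the cursor.
    cursor
        .write_all(b"abc")
        .expect("writing to a Vec-backed cursor failed");
}

fn take_reference(data: &mut Vec<u8>) -> usize {
    {
        // `&mut *data` is an explicit reborrow: the cursor gets a new,
        // shorter-lived `&mut Vec<u8>` and `data` itself is not moved.
        let mut buf = io::Cursor::new(&mut *data);
        use_cursor(&mut buf);
    }
    // The reborrow ended with the inner block, so `data` is usable again.
    data.len()
}

fn main() {
    let mut data = Vec::new();
    let len = take_reference(&mut data);
    println!("{} bytes written: {:?}", len, data);
}
```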

Resources