Lifetime of a parser struct in Rust

I'm trying to rewrite my parser to allow for strings to be passed into the parse method, instead of being bound to the struct.
Previously, my code looked like this:
use std::collections::HashMap;
use std::str;
#[derive(Debug)]
pub enum ParserError {
Generic
}
pub struct Resource(
pub HashMap<String, String>
);
pub struct Parser<'a> {
source: str::Chars<'a>
}
impl<'a> Parser<'a> {
pub fn new(source: &str) -> Parser {
Parser { source: source.chars() }
}
pub fn parse(&mut self) -> Result<Resource, ParserError> {
let entries = HashMap::new();
Ok(Resource(entries))
}
}
fn main() {
let mut parser = Parser::new("key1 = Value 1");
let res = parser.parse();
}
and in my new code I'm trying something like this:
use std::collections::HashMap;
use std::str;
#[derive(Debug)]
pub enum ParserError {
Generic
}
pub struct Resource(
pub HashMap<String, String>
);
pub struct Parser<'a> {
source: Option<str::Chars<'a>>
}
impl<'a> Parser<'a> {
pub fn new() -> Parser<'a> {
Parser { source: None }
}
pub fn parse(&mut self, source: &str) -> Result<Resource, ParserError> {
self.source = Some(source.chars());
let entries = HashMap::new();
Ok(Resource(entries))
}
}
fn main() {
let parser = Parser::new();
parser.parse("key1 = Value 1");
parser.parse("key2 = Value 2");
}
but it seems like I'm messing with lifetimes in a way that I'm not fully comfortable with. The error I get is:
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
--> test.rs:22:35
|
22 | self.source = Some(source.chars());
|
What's the canonical way of handling this? How can I take a String and clone it into the lifetime of the Parser struct?

The full error message is:
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
--> src/main.rs:22:35
|
22 | self.source = Some(source.chars());
| ^^^^^
|
help: consider using an explicit lifetime parameter as shown: fn parse(&mut self, source: &'a str) -> Result<Resource, ParserError>
--> src/main.rs:21:5
|
21 | pub fn parse(&mut self, source: &str) -> Result<Resource, ParserError> {
| ^
Doing as it suggests:
pub fn parse(&mut self, source: &'a str) -> Result<Resource, ParserError>
This allows the code to compile and run (after fixing the unrelated mutability mismatch in main: parser must be declared with let mut because parse takes &mut self).
To understand the difference, you must first understand lifetime elision.
Your original code was:
fn new(source: &str) -> Parser // with elision
fn new<'b>(source: &'b str) -> Parser<'b> // without elision
In words, the generic lifetime parameter 'a of the struct was tied to the lifetime of the incoming string.
Your new code was more complicated:
fn new() -> Parser<'a>
// with elision
fn parse(&mut self, source: &str) -> Result<Resource, ParserError>
// without elision
fn parse<'c, 'd>(&'c mut self, source: &'d str) -> Result<Resource, ParserError>
In words, the generic lifetime parameter 'a of the struct is still defined by the caller of new, but now it's not tied to anything from the constructor. When calling parse, you were attempting to pass in a string of an unrelated lifetime and store a reference to it (through the Chars iterator). Since the two lifetimes were unrelated, the compiler cannot be sure that the string will live as long as the parser that holds a reference into it.
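As a minimal sketch of the fix suggested by the compiler, the version below ties the source argument of parse to the struct's 'a. The toy "key = value" splitting is an assumption added only to make the example observable; it is not the asker's grammar.

```rust
use std::collections::HashMap;
use std::str;

#[derive(Debug)]
pub enum ParserError {
    Generic,
}

pub struct Resource(pub HashMap<String, String>);

pub struct Parser<'a> {
    source: Option<str::Chars<'a>>,
}

impl<'a> Parser<'a> {
    pub fn new() -> Parser<'a> {
        Parser { source: None }
    }

    // `source` now shares the struct's lifetime 'a, so storing its
    // Chars iterator inside `self` is accepted by the borrow checker.
    pub fn parse(&mut self, source: &'a str) -> Result<Resource, ParserError> {
        self.source = Some(source.chars());

        // Toy parsing (an assumption for illustration): a single `key = value` pair.
        let text = self.source.as_ref().map(|c| c.as_str()).unwrap_or("");
        let (key, value) = text.split_once('=').ok_or(ParserError::Generic)?;

        let mut entries = HashMap::new();
        entries.insert(key.trim().to_string(), value.trim().to_string());
        Ok(Resource(entries))
    }
}
```

One consequence of this design is that every string passed to parse must outlive the parser's 'a. If the parser does not need to keep the source between calls, keeping the Chars iterator in a local variable inside parse avoids the lifetime parameter entirely.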

Related

Generic parameter with reference used as function pointer argument

I am having trouble figuring out what lifetime parameter will work for this, so my current workarounds include transmutes or raw pointers. I have a structure holding a function pointer with a generic as a parameter:
struct CB<Data> {
cb: fn(Data) -> usize
}
I would like to store an instance of that, parameterized by some type containing a reference, in some other structure that implements a trait with one method, and use that trait method to call the function pointer in CB.
struct Holder<'a> {
c: CB<Option<&'a usize>>
}
trait Exec {
fn exec(&self, v: &usize) -> usize;
}
impl<'a> Holder<'a> {
fn exec_aux(&self, v: &'a usize) -> usize {
(self.c.cb)(Some(v))
}
}
impl<'a> Exec for Holder<'a> {
fn exec(&self, v: &usize) -> usize
{
self.exec_aux(v)
}
}
This gives me a lifetime error for the 'Exec' impl of Holder:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter `'a` due to conflicting requirements
Simply calling exec_aux works fine as long as I don't define that Exec impl:
fn main() {
let h = Holder { c: CB{cb:cbf}};
let v = 12;
println!("{}", h.exec_aux(&v));
}
Also, making CB not generic also makes this work:
struct CB {
cb: fn(Option<&usize>) -> usize
}
The parameter in my actual code is not a usize but something big that I would rather not copy.
The lifetimes in your Exec trait are implicitly this:
trait Exec {
fn exec<'s, 'a>(&'s self, v: &'a usize) -> usize;
}
In other words, types that implement Exec need to accept any lifetimes 's and 'a. However, your Holder::exec_aux method expects a specific lifetime 'a that's tied to the lifetime parameter of the Holder type.
To make this work, you need to add 'a as a lifetime parameter to the Exec trait instead, so that you can implement the trait specifically for that lifetime:
trait Exec<'a> {
    // 'a is now a parameter of the trait and is used for `v`
    fn exec(&self, v: &'a usize) -> usize;
}
impl<'a> Exec<'a> for Holder<'a> {
    // the impl uses the same 'a as Holder<'a> for `v`
    fn exec(&self, v: &'a usize) -> usize
    {
        self.exec_aux(v)
    }
}
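For reference, here is a minimal self-contained sketch of this approach. The callback cbf is not defined in the question, so the version here is a hypothetical stand-in, as is the run helper.

```rust
struct CB<Data> {
    cb: fn(Data) -> usize,
}

struct Holder<'a> {
    c: CB<Option<&'a usize>>,
}

// The trait carries the lifetime, so implementors can demand a specific 'a.
trait Exec<'a> {
    fn exec(&self, v: &'a usize) -> usize;
}

impl<'a> Holder<'a> {
    fn exec_aux(&self, v: &'a usize) -> usize {
        (self.c.cb)(Some(v))
    }
}

impl<'a> Exec<'a> for Holder<'a> {
    fn exec(&self, v: &'a usize) -> usize {
        self.exec_aux(v)
    }
}

// Hypothetical callback: returns the referenced value, or 0 for None.
fn cbf(v: Option<&usize>) -> usize {
    v.copied().unwrap_or(0)
}

// Hypothetical helper that goes through the trait method.
fn run() -> usize {
    let v = 12;
    let h = Holder { c: CB { cb: cbf } };
    h.exec(&v)
}
```

The cost of this design is that code which is generic over Exec now has to name the lifetime as well, for example T: Exec<'a>.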
The problem here is that the Exec trait is too generic to be used in this way by Holder. First, consider the definition:
trait Exec {
fn exec(&self, v: &usize) -> usize;
}
This definition will cause the compiler to automatically assign two anonymous lifetimes in exec, one for &self and one for v. It's basically the same as:
fn exec<'a, 'b>(&'a self, v: &'b usize) -> usize;
Note that there is no restriction on who needs to outlive whom, the references just need to be alive for the duration of the method call.
Now consider the definition
impl<'a> Holder<'a> {
fn exec_aux(&self, v: &'a usize) -> usize {
// ... doesn't matter
}
}
Since we know that &self is a &Holder<'a> (this is what the impl refers to), the borrow of self cannot outlive 'a. More importantly, v is required to have exactly the lifetime 'a that is baked into the Holder<'a> type, not an arbitrary lifetime chosen by the caller.
Where it all goes wrong is when you try to combine the two. The trait forces you into the following signature, which (again) has two distinct implicit lifetimes, neither of which is related to 'a. But the actual Holder on which you then try to call exec_aux forces v to have the lifetime 'a.
fn exec(&self, v: &usize) -> usize {
// Holder<'a> needs `v` to be `'a` when calling exec_aux
// But the trait doesn't say so.
self.exec_aux(v)
}
One solution is to redefine the trait as
trait Exec<'a> {
fn exec(&'a self, v: &'a usize) -> usize;
}
and then implement it as
impl<'a> Exec<'a> for Holder<'a> {
fn exec(&'a self, v: &'a usize) -> usize {
self.exec_aux(v)
}
}
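An alternative that leaves the Exec trait unchanged, and that the question already observed to work, is to make CB non-generic. The field type fn(Option<&usize>) -> usize is then a function pointer that accepts a reference of any lifetime, so no lifetime has to be fixed on Holder. A rough sketch, where double_or_zero and run_exec are hypothetical names used only for illustration:

```rust
struct CB {
    // Higher-ranked: callable with a reference of any lifetime.
    cb: for<'x> fn(Option<&'x usize>) -> usize,
}

struct Holder {
    c: CB,
}

trait Exec {
    fn exec(&self, v: &usize) -> usize;
}

impl Exec for Holder {
    fn exec(&self, v: &usize) -> usize {
        (self.c.cb)(Some(v))
    }
}

// Hypothetical callback used only for this example.
fn double_or_zero(v: Option<&usize>) -> usize {
    v.map_or(0, |n| n * 2)
}

// Hypothetical helper that calls the callback through the trait.
fn run_exec() -> usize {
    let h = Holder { c: CB { cb: double_or_zero } };
    let v = 21;
    h.exec(&v)
}
```

This only helps when the callback does not need to be generic over its argument type, which is the trade-off the question was trying to avoid.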

Passing in method reference to a struct

I have a struct other_struct with a bunch of methods that I need to call depending on the situation (in this example there is only foo()). I'd like other_struct to have a field called fmap that stores a HashMap of other_struct methods.
use std::collections::HashMap;
pub struct fn_struct {
pub func: Option<fn(&other_struct) -> ()>,
}
pub struct other_struct<'a> {
fmap: HashMap<String, fn_struct>,
some_str: &'a str,
}
impl<'a> other_struct<'a> {
fn new(some_str: &str) -> other_struct {
let mut new_struct = other_struct {
fmap: HashMap::new(),
some_str: some_str,
};
new_struct.fmap.insert(
String::from("foo"),
fn_struct {
func: Some(other_struct::foo),
},
);
new_struct
}
pub fn foo(&self) {
println!("Do some stuff foo");
}
}
fn main() {
let test_str = "test";
let mut new_o = other_struct::new(test_str);
new_o.fmap.get("foo").unwrap().func.unwrap()(&new_o);
}
I'm struggling with dealing with the lifetimes, as I get the following error:
error[E0308]: mismatched types
--> src/main.rs:22:28
|
22 | func: Some(other_struct::foo),
| ^^^^^^^^^^^^^^^^^ one type is more general than the other
|
= note: expected fn pointer `for<'r, 's> fn(&'r other_struct<'s>)`
found fn pointer `for<'r> fn(&'r other_struct<'_>)`
I've been reading the high ranked trait bound documentation but it's unclear to me what's happening here. Does this mean the compiler wants me to specify the lifetime of the instance of other_struct somehow in relation to the lifetime of the pointer to the method?
The fn(&other_struct) -> () part of fn_struct is a function pointer that must be able to accept an other_struct of any lifetime: for<'r, 's> fn(&'r other_struct<'s>) -> (). However, other_struct::foo only accepts one specific lifetime for other_struct<'_>.
You can fix it by specifying that lifetime in some manner, here basically saying it's always going to pass in itself:
pub struct fn_struct<'a> {
pub func: Option<fn(&other_struct<'a>) -> ()>,
}
pub struct other_struct<'a> {
fmap: HashMap<String, fn_struct<'a>>,
some_str: &'a str,
}
Or by making foo more generic by unbinding it from 'a.
pub fn foo(_self: &other_struct) {
println!("Do some stuff foo");
}
The difference is subtle, but the original is other_struct<'a>::foo<'b>(&'b other_struct<'a>) -> () and the fixed version is other_struct<'a>::foo<'b, 'c>(&'b other_struct<'c>) -> ().
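Putting the first fix together, a complete version might look like the sketch below. Two things are assumptions made for illustration: foo returns a String instead of printing, so the result can be checked, and the call helper is a hypothetical convenience for the lookup-and-invoke step.

```rust
use std::collections::HashMap;

#[allow(non_camel_case_types)]
pub struct fn_struct<'a> {
    // The pointer only has to accept an other_struct with this exact 'a.
    pub func: Option<fn(&other_struct<'a>) -> String>,
}

#[allow(non_camel_case_types)]
pub struct other_struct<'a> {
    fmap: HashMap<String, fn_struct<'a>>,
    some_str: &'a str,
}

impl<'a> other_struct<'a> {
    fn new(some_str: &'a str) -> other_struct<'a> {
        let mut new_struct = other_struct {
            fmap: HashMap::new(),
            some_str,
        };
        new_struct.fmap.insert(
            String::from("foo"),
            fn_struct {
                func: Some(other_struct::foo),
            },
        );
        new_struct
    }

    // Returns a String (an assumption) so the effect can be asserted on.
    pub fn foo(&self) -> String {
        format!("foo called on {}", self.some_str)
    }

    // Hypothetical helper: look up a method by name and call it on self.
    pub fn call(&self, name: &str) -> Option<String> {
        let f = self.fmap.get(name)?.func?;
        Some(f(self))
    }
}
```

Function pointers are Copy, so the helper can copy the pointer out of the map before calling it, which keeps the borrow of fmap short.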

Borrowing an object as mutable twice for unrelated, sequential uses

I'm trying to implement an abstraction that allows me to read from either a directory or a zip file. I start by implementing something of this sort:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
But then I realize that the zip-rs package's zip::ZipFile is constructed from a mutable reference to the zip::ZipArchive which it is located in, so I end up with the following code:
use std::path::Path;
use std::error::Error;
use std::fs::File;
use std::io::prelude::*;
use zip::{ZipArchive, read::ZipFile};
use std::marker::PhantomData;
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a mut self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
pub struct ZipFileOpener<'a, R: Read + Seek> {
zip: ZipArchive<R>,
phantom: PhantomData<&'a Self>
}
impl<'a, R: Read + Seek> ZipFileOpener<'a, R> {
pub fn new(zip: ZipArchive<R>) -> Self {
ZipFileOpener { zip, phantom: PhantomData }
}
}
impl<'a, R: Read + Seek> FileOpener<'a> for ZipFileOpener<'a, R> {
type ReaderType = ZipFile<'a>;
fn open(&'a mut self, file_name: &str) -> Result<ZipFile<'a>, Box<dyn Error>> {
Ok(self.zip.by_name(file_name)?)
}
}
I'm not sure if that's the most optimal way to write that, but at least it compiles. Then I try to use it as such:
fn load(root: &Path) -> Result<...> {
let mut opener = io::DirectoryFileOpener::new(root);
let a = Self::parse_a(opener.open("a.txt")?)?;
let b = Self::parse_b(opener.open("b.txt")?, a)?;
}
and I get cannot borrow 'opener' as mutable more than once at a time. This does not surprise me much, since I call open(), which borrows opener mutably, twice. Although a is only a u64 and, from my point of view, unrelated to the lifetime of the reader returned by opener.open(), the compiler treats the first borrow as still alive on the line below it, and thus we attempt to borrow opener as mutable twice.
However, I then look at the following code, which compiles and works well and which I started this whole thing by trying to improve:
fn load_zip(root: &Path) -> Result<...> {
let file = File::open(root)?;
let mut zip = ZipArchive::new(file)?;
let a = Self::parse_a(zip.by_name("a.txt")?)?;
let b = Self::parse_b(zip.by_name("b.txt")?, a)?;
}
This throws me off completely, because the function by_name() also borrows zip as mutable, and is also called twice! Why is it allowed to borrow zip as mutable twice here but not in the previous case?
After researching the issue and Rust's semantics deeper, and building on top of the notes by trentcl, I came to realize that the problem essentially boils down to defining the FileOpener trait where the lifetime argument is bound to the associated type and not to the trait itself, e.g.
pub trait FileOpener {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
impl<'a, R: Read + Seek> FileOpener for ZipFileOpener<R> {
type ReaderType = ZipFile<'a>;
...
}
However, this requires generic associated types (GATs), which were not yet supported in Rust when this answer was written (they have since been stabilized, in Rust 1.65). The GAT RFC does however mention that in some cases the problem can be circumvented by binding the lifetime to the trait itself and using higher-rank trait bounds (HRTB) in the receiving function, which yields the following working solution to this question:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
...
fn load<T: for<'a> FileOpener<'a>>(opener: T) -> ... {
let a = parse_a(opener.open("a.txt")?)?;
let b = parse_b(opener.open("b.txt")?, a)?;
}
This works because the HRTB allows us to bind T to a FileOpener without binding a specific lifetime to it, which enables the late binding of a different lifetime for each call to opener.open().
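To make the pattern concrete without depending on the zip crate, here is a small sketch that uses only the standard library. MemoryOpener, load's return type, and demo are hypothetical stand-ins: the opener hands out readers that borrow from it mutably, which is the same shape as ZipArchive::by_name. Note that the struct itself carries no lifetime parameter; the lifetime appears only on the trait impl, so each call to open can pick a fresh, short borrow.

```rust
use std::collections::HashMap;
use std::io::{self, Read};

pub trait FileOpener<'a> {
    type ReaderType: Read;
    fn open(&'a mut self, file_name: &str) -> io::Result<Self::ReaderType>;
}

// Hypothetical in-memory opener used in place of a zip archive.
pub struct MemoryOpener {
    files: HashMap<String, Vec<u8>>,
}

// Implemented for every 'a, so MemoryOpener satisfies `for<'a> FileOpener<'a>`.
impl<'a> FileOpener<'a> for MemoryOpener {
    type ReaderType = &'a [u8];

    fn open(&'a mut self, file_name: &str) -> io::Result<&'a [u8]> {
        self.files
            .get(file_name)
            .map(|bytes| bytes.as_slice())
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

// Each call to `open` borrows `opener` only for the duration of its statement.
pub fn load<T: for<'a> FileOpener<'a>>(mut opener: T) -> io::Result<(String, String)> {
    let mut a = String::new();
    opener.open("a.txt")?.read_to_string(&mut a)?;
    let mut b = String::new();
    opener.open("b.txt")?.read_to_string(&mut b)?;
    Ok((a, b))
}

// Hypothetical helper that builds an opener with two files and loads them.
pub fn demo() -> io::Result<(String, String)> {
    let mut files = HashMap::new();
    files.insert("a.txt".to_string(), b"alpha".to_vec());
    files.insert("b.txt".to_string(), b"beta".to_vec());
    load(MemoryOpener { files })
}
```

By contrast, when the struct is declared as Opener<'a> and the method takes &'a mut self, the first call borrows the opener for the whole of 'a, which is why a second call is rejected.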

Struct containing reference to a file in Rust fails to borrow

Not sure what I am missing here, the lifetime is declared, therefore the struct should use the path to create the file and return a Struct with the mutable File reference for me to be able to call "write" wrapper later...
use std::path::Path;
use std::fs::File;
// use std::io::Write;
#[derive(Debug)]
pub struct Foo<'a> {
file: &'a mut File,
}
impl<'a> Foo<'a> {
pub fn new(path: &'a Path) -> Result<Self, std::io::Error> {
let mut f: &'a File = &File::create(path)?;
Ok(Self { file: &mut f })
}
//pub fn write(&self, b: [u8]) {
// self.file.write(b);
//}
}
Error:
| impl<'a> Foo<'a> {
| -- lifetime `'a` defined here
11 | pub fn new(path: &'a Path) -> Result<Self, std::io::Error> {
12 | let mut f: &'a File = &File::create(path)?;
| -------- ^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
| |
| type annotation requires that borrow lasts for `'a`
...
15 | }
| - temporary value is freed at the end of this statement
As @E_net4 mentioned, I don't want a mutable reference; I want to own the value. Rather than trying to play with lifetimes, I can simply own the file and treat the whole struct as mutable when writing to the file!
use std::path::PathBuf;
use std::fs::File;
use std::io::Write;
use std::env;
#[derive(Debug)]
pub struct Foo {
file: File,
}
impl Foo {
pub fn new(path: PathBuf) -> Self {
Self {
file: File::create(path).unwrap(),
}
}
pub fn write(&mut self, b: &[u8]) -> Result<usize, std::io::Error> {
self.file.write(b)
}
}
fn main() {
let mut tmp_dir = env::temp_dir();
tmp_dir.push("foo23");
let mut f = Foo::new(tmp_dir);
f.write(b"test2").unwrap();
}
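A possible variation on the same idea, sketched here as an assumption rather than as part of the original answer, keeps the Result return type from the question's constructor instead of calling unwrap, and uses write_all so that a short write is not silently ignored.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

#[derive(Debug)]
pub struct Foo {
    // The struct owns the file, so no lifetime parameter is needed.
    file: File,
}

impl Foo {
    // Propagates the I/O error to the caller instead of panicking.
    pub fn new(path: &Path) -> io::Result<Self> {
        Ok(Self {
            file: File::create(path)?,
        })
    }

    // write_all keeps writing until the whole buffer has been written.
    pub fn write(&mut self, b: &[u8]) -> io::Result<()> {
        self.file.write_all(b)
    }
}
```

The path is only needed while the file is being created, so borrowing it as &Path is enough and no lifetime ends up on Foo.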

Cannot borrow as mutable more than once at a time in one code - but can in another very similar

I've this snippet that doesn't pass the borrow checker:
use std::collections::HashMap;
enum Error {
FunctionNotFound,
}
#[derive(Copy, Clone)]
struct Function<'a> {
name: &'a str,
code: &'a [u32],
}
struct Context<'a> {
program: HashMap<&'a str, Function<'a>>,
call_stack: Vec<Function<'a>>,
}
impl<'a> Context<'a> {
fn get_function(&'a mut self, fun_name: &'a str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&'a mut self, fun_name: &'a str) -> Result<(), Error> {
let fun = try!(self.get_function(fun_name));
self.call_stack.push(fun);
Ok(())
}
}
fn main() {}
error[E0499]: cannot borrow `self.call_stack` as mutable more than once at a time
--> src/main.rs:29:9
|
27 | let fun = try!(self.get_function(fun_name));
| ---- first mutable borrow occurs here
28 |
29 | self.call_stack.push(fun);
| ^^^^^^^^^^^^^^^ second mutable borrow occurs here
...
32 | }
| - first borrow ends here
My gut feeling is that the problem is tied to the fact that HashMap returns either None or a reference of the value inside the data structure. But I don't want that: my intention is that self.get_function should return either a byte copy of the stored value or an error (that's why I put .map(|f| *f), and Function is Copy).
Changing &'a mut self to something else doesn't help.
However, the following snippet, somewhat similar in spirit, is accepted:
#[derive(Debug)]
enum Error {
StackUnderflow,
}
struct Context {
stack: Vec<u32>,
}
impl Context {
fn pop(&mut self) -> Result<u32, Error> {
self.stack.pop().ok_or(Error::StackUnderflow)
}
fn add(&mut self) -> Result<(), Error> {
let a = try!(self.pop());
let b = try!(self.pop());
self.stack.push(a + b);
Ok(())
}
}
fn main() {
let mut a = Context { stack: vec![1, 2] };
a.add().unwrap();
println!("{:?}", a.stack);
}
Now I'm confused. What is the problem with the first snippet? Why doesn't it happen in the second?
The snippets are part of a larger piece of code. In order to provide more context, this on the Rust Playground shows a more complete example with the faulty code, and this shows an earlier version without HashMap, which passes the borrow checker and runs normally.
You have fallen into the lifetime trap. Adding the same lifetime to more references will constrain your program more. Adding more lifetimes and giving each reference the minimal possible lifetime will permit more programs. As @o11c notes, removing the constraints to the 'a lifetime will solve your issue.
impl<'a> Context<'a> {
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&mut self, fun_name: &str) -> Result<(), Error> {
// `?` replaces the deprecated try! macro
let fun = self.get_function(fun_name)?;
self.call_stack.push(fun);
Ok(())
}
}
The reason this works is that lifetime elision gives each elided reference its own fresh lifetime, so to the compiler your function signatures look like this:
fn get_function<'b, 'c>(&'b mut self, fun_name: &'c str) -> Result<Function<'a>, Error>
fn call<'b, 'c>(&'b mut self, fun_name: &'c str) -> Result<(), Error>
Always try to start without explicit lifetimes and let the compiler be smart. If that fails, don't spray lifetimes everywhere; think about where you want to pass ownership and where you want to limit the lifetime of a reference.
You only need to remove unnecessary lifetime qualifiers in order for your code to compile:
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> { ... }
fn call(&mut self, fun_name: &str) -> Result<(), Error> { ... }
Your problem was that you tied the lifetime of &mut self to the lifetime of the value stored in it (Function<'a>), which is in most cases unnecessary. With this dependency present in the definition of get_function(), the compiler had to assume that the result of the call self.get_function(...) borrows self, and hence it prohibits you from borrowing it again.
The lifetime on the &str argument is also unnecessary: it just limits the possible set of argument values for no reason. Your key can be a string with an arbitrary lifetime, not just 'a.
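For completeness, here is a self-contained sketch of the corrected code written with the ? operator. The demo helper and the derives added for comparison are assumptions made for illustration; get_function also takes &self here, since it only reads from the map.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Error {
    FunctionNotFound,
}

#[derive(Copy, Clone, Debug, PartialEq)]
struct Function<'a> {
    name: &'a str,
    code: &'a [u32],
}

struct Context<'a> {
    program: HashMap<&'a str, Function<'a>>,
    call_stack: Vec<Function<'a>>,
}

impl<'a> Context<'a> {
    // The returned Function<'a> is a copy; it does not borrow `self`.
    fn get_function(&self, fun_name: &str) -> Result<Function<'a>, Error> {
        self.program
            .get(fun_name)
            .copied()
            .ok_or(Error::FunctionNotFound)
    }

    fn call(&mut self, fun_name: &str) -> Result<(), Error> {
        let fun = self.get_function(fun_name)?;
        self.call_stack.push(fun);
        Ok(())
    }
}

// Hypothetical helper: one successful call, one failed lookup, final stack depth.
fn demo() -> (Result<(), Error>, Result<(), Error>, usize) {
    let code = [1u32, 2, 3];
    let mut program = HashMap::new();
    program.insert("main", Function { name: "main", code: &code });
    let mut ctx = Context { program, call_stack: Vec::new() };

    // The lookup key may live for a shorter time than 'a.
    let name = String::from("main");
    let ok = ctx.call(&name);
    let missing = ctx.call("nope");
    (ok, missing, ctx.call_stack.len())
}
```

Because the lookup key is a plain &str with its own elided lifetime, a temporary String can be used to look up a function whose name is stored with the longer lifetime 'a.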

Resources