Borrowing an object as mutable twice for unrelated, sequential uses - rust

I'm trying to implement an abstraction that allows me to read from either a directory or a zip file. I start by implementing something of this sort:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
But then I realize that the zip-rs package's zip::ZipFile is constructed from a mutable reference to the zip::ZipArchive which it is located in, so I end up with the following code:
use std::path::Path;
use std::error::Error;
use std::fs::File;
use std::io::prelude::*;
use zip::{ZipArchive, read::ZipFile};
use std::marker::PhantomData;
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a mut self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
pub struct ZipFileOpener<'a, R: Read + Seek> {
zip: ZipArchive<R>,
phantom: PhantomData<&'a Self>
}
impl<'a, R: Read + Seek> ZipFileOpener<'a, R> {
pub fn new(zip: ZipArchive<R>) -> Self {
ZipFileOpener { zip, phantom: PhantomData }
}
}
impl<'a, R: Read + Seek> FileOpener<'a> for ZipFileOpener<'a, R> {
type ReaderType = ZipFile<'a>;
fn open(&'a mut self, file_name: &str) -> Result<ZipFile<'a>, Box<dyn Error>> {
Ok(self.zip.by_name(file_name)?)
}
}
I'm not sure if that's the most optimal way to write that, but at least it compiles. Then I try to use it as such:
fn load(root: &Path) -> Result<...> {
let mut opener = io::DirectoryFileOpener::new(root);
let a = Self::parse_a(opener.open("a.txt")?)?;
let b = Self::parse_b(opener.open("b.txt")?, a)?;
}
and I get cannot borrow 'opener' as mutable more than once at a time. This does not surprise me much, as I indeed use open(), which borrows opener as mutable, twice - although a is only a u64, and from my point of view it is unrelated to the lifetime of opener.open(), from the compiler's point of view it has to be in the same lifetime of the line below it, and thus we attempt to borrow opener as mutable twice.
However, I then look at the following code, which compiles and works well and which I started this whole thing by trying to improve:
fn load_zip(root: &Path) -> Result<...> {
let file = File::open(root)?;
let mut zip = ZipArchive::new(file)?;
let a = Self::parse_a(zip.by_name("a.txt")?)?;
let b = Self::parse_b(zip.by_name("b.txt")?, a)?;
}
This throws me off completely, because the function by_name() also borrows zip as mutable, and is also called twice! Why is it allowed to borrow zip as mutable twice here but not in the previous case?

After researching the issue and Rust's semantics deeper, and building on top of the notes by trentcl, I came to realize that the problem essentially boils down to defining the FileOpener trait where the lifetime argument is bound to the associated type and not to the trait itself, e.g.
pub trait FileOpener {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
impl<'a, R: Read + Seek> FileOpener for ZipFileOpener<R> {
type ReaderType = ZipFile<'a>;
...
}
However, this is known as generic associated types (GAT), and is not yet supported in Rust. The GAT RFC does however mention that in some cases the problem can be circumvented by binding the lifetime to the trait itself and using higher-rank trait bounds (HRTB) in the receiving function, which yields the following working solution to this question:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
...
fn load<T: for<'a> FileOpener<'a>>(opener: T) -> ... {
let a = parse_a(opener.open("a.txt")?)?;
let b = parse_b(opener.open("b.txt")?, a)?;
}
This is because the HRTB allows us to bind T to a FileOpener without binding a specific lifetime to it, which enables the late binding of different lifetimes for each call to opener.open()

Related

Generic parameter with reference used as function pointer argument

I am having trouble figuring out what lifetime parameter will work for this, so my current workarounds include transmutes or raw pointers. I have a structure holding a function pointer with a generic as a parameter:
struct CB<Data> {
cb: fn(Data) -> usize
}
I would like to store an instance of that, parameterized by some type containing a reference, in some other structure that implements a trait with one method, and use that trait method to call the function pointer in CB.
struct Holder<'a> {
c: CB<Option<&'a usize>>
}
trait Exec {
fn exec(&self, v: &usize) -> usize;
}
impl<'a> Holder<'a> {
fn exec_aux(&self, v: &'a usize) -> usize {
(self.c.cb)(Some(v))
}
}
impl<'a> Exec for Holder<'a> {
fn exec(&self, v: &usize) -> usize
{
self.exec_aux(v)
}
}
This gives me a lifetime error for the 'Exec' impl of Holder:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter `'a` due to conflicting requirements
Simply calling exec_aux works fine as long as I don't define that Exec impl:
fn main() {
let h = Holder { c: CB{cb:cbf}};
let v = 12;
println!("{}", h.exec_aux(&v));
}
Also, making CB not generic also makes this work:
struct CB {
cb: fn(Option<&usize>) -> usize
}
The parameter in my actual code is not a usize but something big that I would rather not copy.
The lifetimes in your Exec trait are implicitly this:
trait Exec {
fn exec<'s, 'a>(&'s self, v: &'a usize) -> usize;
}
In other words, types that implement Exec need to accept any lifetimes 's and 'a. However, your Holder::exec_aux method expects a specific lifetime 'a that's tied to the lifetime parameter of the Holder type.
To make this work, you need to add 'a as a lifetime parameter to the Exec trait instead, so that you can implement the trait specifically for that lifetime:
trait Exec<'a> {
// ^^^^ vv
fn exec(&self, v: &'a usize) -> usize;
}
impl<'a> Exec<'a> for Holder<'a> {
// ^^^^ vv
fn exec(&self, v: &'a usize) -> usize
{
self.exec_aux(v)
}
}
The problem here is that the Exec trait is too generic to be used in this way by Holder. First, consider the definition:
trait Exec {
fn exec(&self, v: &usize) -> usize;
}
This definition will cause the compiler to automatically assign two anonymous lifetimes for &self and &v in exec. It's basically the same as
fn exec<'a, 'b>(&'a self, v: &'b usize) -> usize;
Note that there is no restriction on who needs to outlive whom, the references just need to be alive for the duration of the method call.
Now consider the definition
impl<'a> Holder<'a> {
fn exec_aux(&self, v: &'a usize) -> usize {
// ... doesn't matter
}
}
Since we know that &self is a &Holder<'a> (this is what the impl refers to), we need to have at least a &'a Holder<'a> here, because &'_ self can't have a lifetime shorter than 'a in Holder<'a>. So this is saying that the two parameters have the same lifetime: &'a self, &'a usize.
Where it all goes wrong is when you try to combine the two. The trait forces you into the following signature, which (again) has two distinct implicit lifetimes. But the actual Holder which you then try to call a method on forces you to have the same lifetimes for &self and &v.
fn exec(&self, v: &usize) -> usize {
// Holder<'a> needs `v` to be `'a` when calling exec_aux
// But the trait doesn't say so.
self.exec_aux(v)
}
One solution is to redefine the trait as
trait Exec<'a> {
fn exec(&'a self, v: &'a usize) -> usize;
}
and then implement it as
impl<'a> Exec<'a> for Holder<'a> {
fn exec(&'a self, v: &'a usize) -> usize {
self.exec_aux(v)
}
}

Cannot move out of borrowed content for a struct

I'm trying to implement deserializer for a BERT data which comes from an another program via sockets. For the following code:
use std::io::{self, Read};
#[derive(Clone, Copy)]
pub struct Deserializer<R: Read> {
reader: R,
header: Option<u8>,
}
impl<R: Read> Read for Deserializer<R> {
#[inline]
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
self.reader.read(buf)
}
}
impl<R: Read> Deserializer<R> {
/// Creates the BERT parser from an `std::io::Read`.
#[inline]
pub fn new(reader: R) -> Deserializer<R> {
Deserializer {
reader: reader,
header: None,
}
}
#[inline]
pub fn read_string(&mut self, len: usize) -> io::Result<String> {
let mut string_buffer = String::with_capacity(len);
self.reader.take(len as u64).read_to_string(&mut string_buffer);
Ok(string_buffer)
}
}
The Rust compiler generates an error, when I'm trying to read a string from a passed data:
error: cannot move out of borrowed content [E0507]
self.reader.take(len as u64).read_to_string(&mut string_buffer);
^~~~
help: run `rustc --explain E0507` to see a detailed explanation
How can I fix this even if my Deserializer<R> struct has had Clone/Copy traits?
The take method takes self:
fn take(self, limit: u64) -> Take<Self> where Self: Sized
so you cannot use it on anything borrowed.
Use the by_ref method. Replace the error line with this:
{
let reference = self.reader.by_ref();
reference.take(len as u64).read_to_string(&mut string_buffer);
}

Cannot borrow as mutable more than once at a time in one code - but can in another very similar

I've this snippet that doesn't pass the borrow checker:
use std::collections::HashMap;
enum Error {
FunctionNotFound,
}
#[derive(Copy, Clone)]
struct Function<'a> {
name: &'a str,
code: &'a [u32],
}
struct Context<'a> {
program: HashMap<&'a str, Function<'a>>,
call_stack: Vec<Function<'a>>,
}
impl<'a> Context<'a> {
fn get_function(&'a mut self, fun_name: &'a str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&'a mut self, fun_name: &'a str) -> Result<(), Error> {
let fun = try!(self.get_function(fun_name));
self.call_stack.push(fun);
Ok(())
}
}
fn main() {}
error[E0499]: cannot borrow `self.call_stack` as mutable more than once at a time
--> src/main.rs:29:9
|
27 | let fun = try!(self.get_function(fun_name));
| ---- first mutable borrow occurs here
28 |
29 | self.call_stack.push(fun);
| ^^^^^^^^^^^^^^^ second mutable borrow occurs here
...
32 | }
| - first borrow ends here
My gut feeling is that the problem is tied to the fact that HashMap returns either None or a reference of the value inside the data structure. But I don't want that: my intention is that self.get_function should return either a byte copy of the stored value or an error (that's why I put .map(|f| *f), and Function is Copy).
Changing &'a mut self to something else doesn't help.
However, the following snippet, somewhat similar in spirit, is accepted:
#[derive(Debug)]
enum Error {
StackUnderflow,
}
struct Context {
stack: Vec<u32>,
}
impl Context {
fn pop(&mut self) -> Result<u32, Error> {
self.stack.pop().ok_or(Error::StackUnderflow)
}
fn add(&mut self) -> Result<(), Error> {
let a = try!(self.pop());
let b = try!(self.pop());
self.stack.push(a + b);
Ok(())
}
}
fn main() {
let mut a = Context { stack: vec![1, 2] };
a.add().unwrap();
println!("{:?}", a.stack);
}
Now I'm confused. What is the problem with the first snippet? Why doesn't it happen in the second?
The snippets are part of a larger piece of code. In order to provide more context, this on the Rust Playground shows a more complete example with the faulty code, and this shows an earlier version without HashMap, which passes the borrow checker and runs normally.
You have fallen into the lifetime-trap. Adding the same lifetime to more references will constrain your program more. Adding more lifetimes and giving each reference the minimal possible lifetime will permit more programs. As #o11c notes, removing the constraints to the 'a lifetime will solve your issue.
impl<'a> Context<'a> {
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> {
self.program
.get(fun_name)
.map(|f| *f)
.ok_or(Error::FunctionNotFound)
}
fn call(&mut self, fun_name: &str) -> Result<(), Error> {
let fun = try!(self.get_function(fun_name));
self.call_stack.push(fun);
Ok(())
}
}
The reason this works is that Rust inserts new lifetimes, so in the compiler your function's signatures will look like this:
fn get_function<'b>(&'b mut self, fun_name: &'b str) -> Result<Function<'a>, Error>
fn call<'b>(&'b mut self, fun_name: &'b str) -> Result<(), Error>
Always try to not use any lifetimes and let the compiler be smart. If that fails, don't spray lifetimes everywhere, think about where you want to pass ownership, and where you want to limit the lifetime of a reference.
You only need to remove unnecessary lifetime qualifiers in order for your code to compile:
fn get_function(&mut self, fun_name: &str) -> Result<Function<'a>, Error> { ... }
fn call(&mut self, fun_name: &str) -> Result<(), Error> { ... }
Your problem was that you tied the lifetime of &mut self and the lifetime of the value stored in it (Function<'a>), which is in most cases unnecessary. With this dependency which was present in get_function() definition, the compiler had to assume that the result of the call self.get_function(...) borrows self, and hence it prohibits you from borrowing it again.
Lifetime on &str argument is also unnecessary - it just limits the possible set of argument values for no reason. Your key can be a string with arbitrary lifetime, not just 'a.

How can I explicitly specify a lifetime when implementing a trait?

Given the implementation below, where essentially I have some collection of items that can be looked up via either a i32 id field or a string field. To be able to use either interchangeably, a trait "IntoKey" is used, and a match dispatches to the appropriate lookup map; this all works fine for my definition of get within the MapCollection impl:
use std::collections::HashMap;
use std::ops::Index;
enum Key<'a> {
I32Key(&'a i32),
StringKey(&'a String),
}
trait IntoKey<'a> {
fn into_key(&'a self) -> Key<'a>;
}
impl<'a> IntoKey<'a> for i32 {
fn into_key(&'a self) -> Key<'a> { Key::I32Key(self) }
}
impl<'a> IntoKey<'a> for String {
fn into_key(&'a self) -> Key<'a> { Key::StringKey(self) }
}
#[derive(Debug)]
struct Bar {
i: i32,
n: String,
}
struct MapCollection
{
items: Vec<Bar>,
id_map: HashMap<i32, usize>,
name_map: HashMap<String, usize>,
}
impl MapCollection {
fn new(items: Vec<Bar>) -> MapCollection {
let mut is = HashMap::new();
let mut ns = HashMap::new();
for (idx, item) in items.iter().enumerate() {
is.insert(item.i, idx);
ns.insert(item.n.clone(), idx);
}
MapCollection {
items: items,
id_map: is,
name_map: ns,
}
}
fn get<'a, K>(&self, key: &'a K) -> Option<&Bar>
where K: IntoKey<'a> //'
{
match key.into_key() {
Key::I32Key(i) => self.id_map.get(i).and_then(|idx| self.items.get(*idx)),
Key::StringKey(s) => self.name_map.get(s).and_then(|idx| self.items.get(*idx)),
}
}
}
fn main() {
let bars = vec![Bar { i:1, n:"foo".to_string() }, Bar { i:2, n:"far".to_string() }];
let map = MapCollection::new(bars);
if let Some(bar) = map.get(&1) {
println!("{:?}", bar);
}
if map.get(&3).is_none() {
println!("no item numbered 3");
}
if let Some(bar) = map.get(&"far".to_string()) {
println!("{:?}", bar);
}
if map.get(&"baz".to_string()).is_none() {
println!("no item named baz");
}
}
However, if I then want to implement std::ops::Index for this struct, if I attempt to do the below:
impl<'a, K> Index<K> for MapCollection
where K: IntoKey<'a> {
type Output = Bar;
fn index<'b>(&'b self, k: &K) -> &'b Bar {
self.get(k).expect("no element")
}
}
I hit a compiler error:
src/main.rs:70:18: 70:19 error: cannot infer an appropriate lifetime for automatic coercion due to conflicting requirements
src/main.rs:70 self.get(k).expect("no element")
^
src/main.rs:69:5: 71:6 help: consider using an explicit lifetime parameter as shown: fn index<'b>(&'b self, k: &'a K) -> &'b Bar
src/main.rs:69 fn index<'b>(&'b self, k: &K) -> &'b Bar {
src/main.rs:70 self.get(k).expect("no element")
src/main.rs:71 }
I can find no way to specify a distinct lifetime here; following the compiler's recommendation is not permitted as it changes the function signature and no longer matches the trait, and anything else I try fails to satisfy the lifetime specification.
I understand that I can implement the trait for each case (i32, String) separately instead of trying to implement it once for IntoKey, but I am more generally trying to understand lifetimes and appropriate usage. Essentially:
Is there actually an issue the compiler is preventing? Is there something unsound about this approach?
Am I specifying my lifetimes incorrectly? To me, the lifetime 'a in Key/IntoKey is dictating that the reference need only live long enough to do the lookup; the lifetime 'b associated with the index fn is stating that the reference resulting from the lookup will live as long as the containing MapCollection.
Or am I simply not utilizing the correct syntax to specify the needed information?
(using rustc 1.0.0-nightly (b63cee4a1 2015-02-14 17:01:11 +0000))
Do you intend on implementing IntoKey on struct's that are going to store references of lifetime 'a? If not, you can change your trait and its implementations to:
trait IntoKey {
fn into_key<'a>(&'a self) -> Key<'a>;
}
This is the generally recommended definition style, if you can use it. If you can't...
Let's look at this smaller reproduction:
use std::collections::HashMap;
use std::ops::Index;
struct Key<'a>(&'a u8);
trait IntoKey<'a> { //'
fn into_key(&'a self) -> Key<'a>;
}
struct MapCollection;
impl MapCollection {
fn get<'a, K>(&self, key: &'a K) -> &u8
where K: IntoKey<'a> //'
{
unimplemented!()
}
}
impl<'a, K> Index<K> for MapCollection //'
where K: IntoKey<'a> //'
{
type Output = u8;
fn index<'b>(&'b self, k: &K) -> &'b u8 { //'
self.get(k)
}
}
fn main() {
}
The problem lies in get:
fn get<'a, K>(&self, key: &'a K) -> &u8
where K: IntoKey<'a>
Here, we are taking a reference to K that must live as long as the Key we get out of it. However, the Index trait doesn't guarantee that:
fn index<'b>(&'b self, k: &K) -> &'b u8
You can fix this by simply giving a fresh lifetime to key:
fn get<'a, 'b, K>(&self, key: &'b K) -> &u8
where K: IntoKey<'a>
Or more succinctly:
fn get<'a, K>(&self, key: &K) -> &u8
where K: IntoKey<'a>

How would I create a handle manager in Rust?

pub struct Storage<T>{
vec: Vec<T>
}
impl<T: Clone> Storage<T>{
pub fn new() -> Storage<T>{
Storage{vec: Vec::new()}
}
pub fn get<'r>(&'r self, h: &Handle<T>)-> &'r T{
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<T>, t: T){
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<T>{
self.vec.push(t);
Handle{id: self.vec.len()-1}
}
}
struct Handle<T>{
id: uint
}
I am currently trying to create a handle system in Rust and I have some problems. The code above is a simple example of what I want to achieve.
The code works but has one weakness.
let mut s1 = Storage<uint>::new();
let mut s2 = Storage<uint>::new();
let handle1 = s1.create(5);
s1.get(handle1); // works
s2.get(handle1); // unsafe
I would like to associate a handle with a specific storage like this
//Pseudo code
struct Handle<T>{
id: uint,
storage: &Storage<T>
}
impl<T> Handle<T>{
pub fn get(&self) -> &T;
}
The problem is that Rust doesn't allow this. If I would do that and create a handle with the reference of a Storage I wouldn't be allowed to mutate the Storage anymore.
I could implement something similar with a channel but then I would have to clone T every time.
How would I express this in Rust?
The simplest way to model this is to use a phantom type parameter on Storage which acts as a unique ID, like so:
use std::kinds::marker;
pub struct Storage<Id, T> {
marker: marker::InvariantType<Id>,
vec: Vec<T>
}
impl<Id, T> Storage<Id, T> {
pub fn new() -> Storage<Id, T>{
Storage {
marker: marker::InvariantType,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<Id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<Id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<Id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<Id, T> {
id: uint,
marker: marker::InvariantType<Id>
}
fn main() {
struct A; struct B;
let mut s1 = Storage::<A, uint>::new();
let s2 = Storage::<B, uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
s2.get(&handle1); // won't compile, since A != B
}
This solves your problem in the simplest case, but has some downsides. Mainly, it depends on the use to define and use all of these different phantom types and to prove that they are unique. It doesn't prevent bad behavior on the user's part where they can use the same phantom type for multiple Storage instances. In today's Rust, however, this is the best we can do.
An alternative solution that doesn't work today for reasons I'll get in to later, but might work later, uses lifetimes as anonymous id types. This code uses the InvariantLifetime marker, which removes all sub typing relationships with other lifetimes for the lifetime it uses.
Here is the same system, rewritten to use InvariantLifetime instead of InvariantType:
use std::kinds::marker;
pub struct Storage<'id, T> {
marker: marker::InvariantLifetime<'id>,
vec: Vec<T>
}
impl<'id, T> Storage<'id, T> {
pub fn new() -> Storage<'id, T>{
Storage {
marker: marker::InvariantLifetime,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<'id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<'id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<'id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<'id, T> {
id: uint,
marker: marker::InvariantLifetime<'id>
}
fn main() {
let mut s1 = Storage::<uint>::new();
let s2 = Storage::<uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
// In theory this won't compile, since the lifetime of s2
// is *slightly* shorter than the lifetime of s1.
//
// However, this is not how the compiler works, and as of today
// s2 gets the same lifetime as s1 (since they can be borrowed for the same period)
// and this (unfortunately) compiles without error.
s2.get(&handle1);
}
In a hypothetical future, the assignment of lifetimes may change and we may grow a better mechanism for this sort of tagging. However, for now, the best way to accomplish this is with phantom types.

Resources