Borrowed value does not live long enough when creating a Vec - rust

Editor's note: This question was asked before Rust 1.0. Since then, many functions and types have changed, as have certain language semantics. The code in the question is no longer valid, but the ideas expressed in the answers may be.
I'm trying to list the files in a directory and copy the filename to my own Vec. I've tried several solutions, but it always ends up with a problem of not being able to create long enough living variables. I don't understand my mistake.
fn getList(action_dir_path : &str) -> Vec<&str> {
let v = fs::readdir(&Path::new(action_dir_path))
.unwrap()
.iter()
.map(|&x| x.filestem_str().unwrap())
.collect();
return v;
}
Why does the compiler complain about "x" ? I don't care about x, I want the &str inside it and I thought &str were static.
I tried this way, but I got the same result with the compiler complaining about "paths" not living long enough.
fn getList2(action_dir_path : &str) -> Vec<&str> {
let paths = fs::readdir(&Path::new(action_dir_path)).unwrap();
let mut v : Vec<&str> = Vec::new();
for path in paths.iter(){
let aSlice = path.filestem_str().unwrap();
v.push(aSlice);
}
return v;
}

The most literal translation of your code that supports Rust 1.0 is this:
use std::{fs, path::Path, ffi::OsStr};
fn getList(action_dir_path: &str) -> Vec<&OsStr> {
let v = fs::read_dir(&Path::new(action_dir_path))
.unwrap()
.map(|x| x.unwrap().path().file_stem().unwrap())
.collect();
return v;
}
This produces the error messages:
Rust 2015
error[E0597]: borrowed value does not live long enough
--> src/lib.rs:6:18
|
6 | .map(|x| x.unwrap().path().file_stem().unwrap())
| ^^^^^^^^^^^^^^^^^ - temporary value only lives until here
| |
| temporary value does not live long enough
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 3:1...
--> src/lib.rs:3:1
|
3 | / fn getList(action_dir_path: &str) -> Vec<&OsStr> {
4 | | let v = fs::read_dir(&Path::new(action_dir_path))
5 | | .unwrap()
6 | | .map(|x| x.unwrap().path().file_stem().unwrap())
7 | | .collect();
8 | | return v;
9 | | }
| |_^
Rust 2018
error[E0515]: cannot return value referencing temporary value
--> src/lib.rs:6:18
|
6 | .map(|x| x.unwrap().path().file_stem().unwrap())
| -----------------^^^^^^^^^^^^^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| temporary value created here
The problem comes from Path::file_stem. This is the signature:
pub fn file_stem(&self) -> Option<&OsStr>
This indicates that the method will return a borrowed reference to an OsStr. The PathBuf struct is the owner of the string. When the closure passed to map returns, there's nowhere left that owns the PathBuf, so it will be dropped. This means that any references into the PathBuf will no longer be valid. This is Rust preventing you from having references to memory that is no longer allocated, yay for Rust!
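To see the ownership relationship in isolation, here is a minimal sketch. The helper name `stem_owned` and the sample path are made up for illustration; the point is only that copying the stem into an owned OsString lets it outlive the PathBuf it came from.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

// Hypothetical helper: takes ownership of a PathBuf and returns an owned
// copy of its file stem, so the result does not borrow from `path`.
fn stem_owned(path: PathBuf) -> Option<OsString> {
    // `file_stem` borrows from `path`; `to_os_string` copies the data out
    // before `path` is dropped at the end of this function.
    path.file_stem().map(|stem| stem.to_os_string())
}

fn main() {
    let stem = stem_owned(PathBuf::from("/tmp/report.txt"));
    println!("{:?}", stem);
}
```

Returning `path.file_stem()` directly from a function that owns `path` would be rejected for the same reason as the closure in the question.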
The easiest thing you can do is return a Vec<String>. String owns the string inside of it, so we don't need to worry about it being freed when we leave the function:
fn get_list(action_dir_path: &str) -> Vec<String> {
fs::read_dir(action_dir_path)
.unwrap()
.map(|x| {
x.unwrap()
.path()
.file_stem()
.unwrap()
.to_str()
.unwrap()
.to_string()
})
.collect()
}
I also updated the style (at no charge!) to be more Rust-like:
Use snake_case for items
No space before the colon in type definitions
There's no reason to set a variable just to return it.
Don't use explicit return statements unless you are exiting from a function early.
There's no need to wrap the path in a Path.
However, I'm not a fan of all of the unwrapping. I'd write the function like this:
use std::{ffi::OsString, fs, io, path::Path};
fn get_list(action_dir_path: impl AsRef<Path>) -> io::Result<Vec<OsString>> {
fs::read_dir(action_dir_path)?
.map(|entry| entry.map(|e| e.file_name()))
.collect()
}
fn main() {
println!("{:?}", get_list("/etc"));
}
In addition to the changes above:
I use a generic type for the input path.
I return a Result to propagate errors to the caller.
I directly ask the DirEntry for the filename.
I leave the type as an OsString.
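If a caller does need String values, one option is to convert at the edge rather than inside get_list. The following is only a sketch: `to_display_names` is a hypothetical helper, and it assumes a lossy conversion (invalid UTF-8 replaced with U+FFFD) is acceptable for your use case.

```rust
use std::ffi::OsString;

// Hypothetical helper: converts file names to `String`, replacing any
// invalid UTF-8 sequences instead of panicking.
fn to_display_names(names: Vec<OsString>) -> Vec<String> {
    names
        .iter()
        .map(|name| name.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    let names = vec![OsString::from("passwd"), OsString::from("hosts")];
    println!("{:?}", to_display_names(names));
}
```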

One small related point:
I thought &str were static.
&'static strs are static, but that's only one kind of &str. It can have any kind of lifetime.
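A small sketch of the difference, with made-up function names: a string literal can be returned as &'static str, while a &str that points into a String is only valid as long as that String is.

```rust
// A string literal lives in the program binary, so it can be `&'static str`.
fn literal() -> &'static str {
    "hello"
}

// This `&str` borrows from the argument; by lifetime elision the result
// has the same lifetime as `s`, which is usually much shorter than 'static.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("borrowed value");
    let word = first_word(&owned); // tied to `owned`, not 'static
    println!("{} {}", literal(), word);
}
```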


Rust String::to_bytes | What does the Rust compiler mean here, exactly?

I am a newbie in the Rust world.
As an exercise, this is the problem I am trying to solve:
fn main() {
let s = give_ownership();
println!("{}", s);
}
// Only modify the code below!
fn give_ownership() -> String {
let s = String::from("hello, world");
// Convert String to Vec
let _s = s.into_bytes();
s
}
I have gotten through. My solution works.
However, when I compile the exercise code-snippet above unchanged, I don't quite get what the compiler is telling me here, as a note below:
Compiling playground v0.0.1 (/playground)
error[E0382]: use of moved value: `s`
--> src/main.rs:12:5
|
9 | let s = String::from("hello, world");
| - move occurs because `s` has type `String`, which does not implement the `Copy` trait
10 | // Convert String to Vec
11 | let _s = s.into_bytes();
| ------------ `s` moved due to this method call
12 | s
| ^ value used here after move
|
note: this function takes ownership of the receiver `self`, which moves `s`
My guess is that the note is about the function into_bytes(). The RustDoc says this about the function:
This consumes the String, so we do not need to copy its contents.
Could someone please elaborate on this?
into_bytes() takes self (i.e. an owned self, not a reference).
This means that it takes ownership of the string it's called on. It's conceptually the same as this:
fn main() {
let s = String::from("hello");
take_string(s);
println!("{s}"); // ERROR
}
fn take_string(s: String) {}
This is useful because it allows you to turn a String into a Vec<u8>, while reusing the allocation. A String is really just a Vec<u8> with the guarantee that the bytes are valid UTF-8.
So once you write let _s = s.into_bytes(), the data that was in s has now moved to _s, so you can't return s from your function. There's nothing there.
If you just want to return the string, you can just return String::from("stuff")
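If you do want both the bytes and the string, here are two possible ways to write the function, as a sketch. The function names are invented for this example; which option fits depends on what the exercise is really asking for.

```rust
// Option 1: borrow the bytes instead of consuming the String.
fn give_ownership_borrowing() -> String {
    let s = String::from("hello, world");
    let _bytes: &[u8] = s.as_bytes(); // `s` is only borrowed here
    s
}

// Option 2: consume the String, then rebuild it from the same buffer.
fn give_ownership_round_trip() -> String {
    let s = String::from("hello, world");
    let bytes: Vec<u8> = s.into_bytes(); // `s` is moved into `bytes`
    // `from_utf8` validates the bytes and takes over the Vec's allocation.
    String::from_utf8(bytes).expect("bytes came from a valid String")
}

fn main() {
    println!("{}", give_ownership_borrowing());
    println!("{}", give_ownership_round_trip());
}
```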

Is this the idiomatic way to make self-referential structures?

I am interested in knowing the idiomatic/canonical way of making self-referential structures in Rust. The related question Why can't I store a value and a reference to that value in the same struct explains the problem, but try as I might, I couldn't figure out the answer in the existing question (although there were some useful hints).
I have come up with a solution, but I am unsure of how safe it is, or if it is the idiomatic way to solve this problem; if it isn't, I would very much like to know what the usual solution is.
I have an existing structure in my program that holds a reference to a sequence. Sequences hold information about chromosomes so they can be rather long, and copying them isn't a viable idea.
// My real Foo is more complicated than this and is an existing
// type I'd rather not have to rewrite if I can avoid it...
struct Foo<'a> {
x: &'a [usize],
// more here...
}
impl<'a> Foo<'a> {
pub fn new(x: &'a [usize]) -> Self {
Foo {
x, /* more here... */
}
}
}
I now need a new structure that reduces the sequence to something smaller and then builds a Foo structure over the reduced string, and since someone has to own both reduced string and Foo object, I would like to put both in a structure.
// My real Bar is slightly more complicated, but it boils down to having
// a vector it owns and a Foo over that vector.
struct Bar<'a> {
x: Vec<usize>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let x = vec![1, 2, 3];
let y = Foo::new(&x);
Bar { x, y }
}
}
Now, this doesn't work because the Foo object in a Bar refers into the Bar object, and if the Bar object moves, the reference will point into memory that is no longer occupied by the Bar object.
To avoid this problem, the x element in Bar must sit on the heap and not move around. (I think the data in a Vec already sits happily on the heap, but that doesn't seem to help me here).
A pinned box should do the trick, I believe.
struct Bar<'a> {
x: Pin<Box<Vec<usize>>>,
y: Foo<'a>,
}
Now the vector sits behind a pinned box on the heap, and when I move the structure, the references point to the same memory.
However, moving x to the heap isn't enough for the type-checker. It still thinks that moving the pinned box will move what it points to.
If I implement Bar's constructor like this:
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::pin(v);
let y = Foo::new(&x);
Bar { x, y }
}
}
I get the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(&x);
| -- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(&x);
| -- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
Even though the object I take a reference of sits on the heap, and doesn't move, the checker still sees me borrowing from an object that moves, and that, of course, is a no-no.
Here, you might stop and notice that I am trying to make two pointers to the same object, so Rc or Arc is an obvious solution. And it is, but I would have to change the implementation of Foo to have an Rc member instead of a reference. While I do have control of the source code for Foo, and I could update it and all the code that uses it, I am reluctant to make such a major change if I can avoid it. And I could have been in a situation where I am not in control of the Foo, so I couldn't change its implementation, and I would love to know how I would solve that situation then.
The only solution I could get to work was to get a raw pointer to x, so the type-checker doesn't see that I borrow it, and then connect x and y through that.
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::new(v);
let (x, y) = unsafe {
let ptr: *mut Vec<usize> = Box::into_raw(x);
let w: &Vec<usize> = ptr.as_ref().unwrap();
(Pin::new(Box::from_raw(ptr)), Foo::new(&w))
};
Bar { x, y }
}
}
What I don't know is if this is the right way to do it. It seems rather complicated, but perhaps it is the only way to make a structure like this in Rust? That some sort of unsafe is needed to trick the compiler. So that is the first of my questions.
The second is, if this is safe to do? Of course it is unsafe in the technical sense, but am I risking creating a reference to memory that might not be valid later? It is my impression that Pin should guarantee that the object remains where it is supposed to sit, and that the lifetime of the Bar<'a> and Foo<'a> objects should ensure that the reference doesn't out-live the vector, but once I have gone unsafe, could that promise be broken?
Update
The owning_ref crate has functionality that looks like what I need. You can create owned objects that present their references as well.
There is an OwningRef type that wraps an object and a reference, and it would be wonderful if you could have the slice in that and getting the reference wasn't seen as borrowing from the object, but obviously that isn't the case. With code such as this
use owning_ref::OwningRef;
struct Bar<'a> {
x: OwningRef<Vec<usize>, [usize]>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = OwningRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
you get the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
error: could not compile `foo` due to 2 previous errors
The reason is the same as before: I borrow a reference to x and then I move it.
There are different wrapper objects in the crate, and in various combinations they will let me get close to a solution and then snatch it away from me, because what I borrow I still cannot move later, e.g.:
use owning_ref::{BoxRef, OwningRef};
struct Bar<'a> {
x: OwningRef<Box<Vec<usize>>, Vec<usize>>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let v = Box::new(v); // Vector on the heap
let x = BoxRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:23:9
|
22 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
23 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:23:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
22 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
23 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
I can get around this by going unsafe and work with a pointer, of course, but then I am back to the solution I had with Pin and pointer hacking. I strongly feel that there is a solution here, (especially because having a Box<Vec<...>> and the corresponding Vec<...> isn't adding much to the table so there must be more to the crate), but what it is is eluding me.
(I think the data in a Vec already sits happily on the heap, but that doesn't seem to help me here).
Indeed the data in a Vec does already sit on the heap, and the x: &'a [usize] in Foo is already a reference to that heap allocation; so your problem here is not (as shown in your graphics) that moving Bar would result in (the undefined behaviour of) a dangling reference.
However, what happens if the Vec were to outgrow its current allocation? It would reallocate and be moved from its present heap allocation to another—and this would result in a dangling reference. Hence the borrow checker must enforce that, so long as anything (e.g. a Foo) that borrows from the Vec exists, the Vec cannot be mutated. Yet here we already have an expressivity problem: the Rust language has no way to annotate Bar to indicate this relationship.
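The reallocation can be observed directly. In this sketch, `grow` is a hypothetical helper that compares the buffer address before and after growing the Vec; whether the address actually changes depends on the allocator, so the example reports the result rather than assuming it.

```rust
// Hypothetical helper: grows the Vec well past its capacity and reports
// whether the heap buffer ended up at a different address.
fn grow(v: &mut Vec<usize>) -> bool {
    let before = v.as_ptr();
    v.extend(2..=1000); // usually forces a new, larger allocation
    let after = v.as_ptr();
    before != after
}

fn main() {
    let mut v: Vec<usize> = Vec::with_capacity(1);
    v.push(1);
    println!("buffer moved: {}", grow(&mut v));

    // The borrow checker rejects the dangerous version of this:
    // let first = &v[0];
    // v.push(0);           // error[E0502]: cannot borrow `v` as mutable
    // println!("{first}"); // because it is also borrowed as immutable
}
```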
Your proposed unsafe solution uses <*mut _>::as_ref, whose safety documentation includes the following requirement (emphasis added):
You must enforce Rust’s aliasing rules, since the returned lifetime 'a is arbitrarily chosen and does not necessarily reflect the actual lifetime of the data. In particular, for the duration of this lifetime, the memory the pointer points to must not get mutated (except inside UnsafeCell).
This is the key bit of the compiler's safety checks that you are trying to opt out of—but because accessing Bar now requires that one uphold this requirement, you do not have a completely safe abstraction. In my view, a raw pointer would be a tad safer here because it forces one to consider the safety of every access.
For example, one issue that immediately springs to mind is that x is declared before y in Bar and therefore, upon destruction, it will be dropped first: the Vec's heap allocation will be freed while Foo still holds references into it: undefined behaviour! Simply reordering the fields would avoid this particular problem, but there would have been no such problem with raw pointers (and any attempt to dereference them in Foo's drop handler would have forced one to consider whether they were still dereferenceable at that time).
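The drop order mentioned here is easy to check. The following sketch uses invented types (`Noisy`, `Pair`) that record their names in a shared log when dropped; it shows that struct fields are dropped in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Fields are dropped in declaration order: `x` first, then `y`.
#[allow(dead_code)]
struct Pair {
    x: Noisy,
    y: Noisy,
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let pair = Pair {
        x: Noisy { name: "x", log: Rc::clone(&log) },
        y: Noisy { name: "y", log: Rc::clone(&log) },
    };
    drop(pair);
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

With Bar declared as in the question, x plays the role of the first field, so the Vec would be freed while y still refers to it.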
Personally, I would try to avoid self-referencing here and probably use an arena.
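One simple way to avoid the self-reference (not an arena, just a sketch of splitting ownership from the borrowing view) is to let Bar own only the vector and build the Foo on demand. This assumes that constructing a Foo is cheap; if your real Foo does expensive preprocessing, this approach may not fit.

```rust
// Same shape as the Foo in the question.
struct Foo<'a> {
    x: &'a [usize],
}

impl<'a> Foo<'a> {
    pub fn new(x: &'a [usize]) -> Self {
        Foo { x }
    }
}

// Bar only owns the reduced sequence; it stores no reference into itself.
struct Bar {
    x: Vec<usize>,
}

impl Bar {
    pub fn new() -> Self {
        Bar { x: vec![1, 2, 3] }
    }

    // Builds a Foo that borrows from `self` for as long as the caller keeps it.
    pub fn foo(&self) -> Foo<'_> {
        Foo::new(&self.x)
    }
}

fn main() {
    let bar = Bar::new();
    let foo = bar.foo();
    println!("{:?}", foo.x);
}
```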
I think I have finally grokked ouroboros and that is an elegant solution.
You use a macro, self_referencing when defining a structure, and inside the structure you can specify that one entry borrows others. For my application, I got it to work like this:
use ouroboros::self_referencing;
#[self_referencing]
struct _Bar {
x: Vec<usize>,
#[borrows(x)]
#[covariant]
y: Foo<'this>,
}
struct Bar(pub _Bar);
The y element references x, so I specify that. I'm not sure why co-/contravariance is needed in this particular case where there is only one lifetime, but it specifies whether other references should live longer or can live shorter than the object. I've defined the struct as _Bar and then wrapped it in Bar. This is because the macro will create a new method, and I don't want the default one. At the same time I want to call my constructor new to stick with tradition. So I wrap the type and write my own constructor:
impl Bar {
pub fn new() -> Self {
let x: Vec<usize> = vec![1, 2, 3];
let _bar = _BarBuilder {
x,
y_builder: |x: &Vec<usize>| Foo::new(&x),
}
.build();
Bar(_bar)
}
}
I don't use the generated _Bar::new but a generated _BarBuilder object where I can specify how to get the y value from the x reference.
I have also written accessors to get the two values. There isn't anything special here.
impl Bar {
pub fn x(&self) -> &Vec<usize> {
self.0.borrow_x()
}
pub fn y(&self) -> &Foo {
self.0.borrow_y()
}
}
and with that my trivial little test case runs...
fn main() {
let bar = Bar::new();
let vec = bar.x();
for &i in vec {
println!("i == {}", i);
}
let vec = bar.y().x;
for &i in vec {
println!("i == {}", i);
}
}
This is probably the best solution so far, assuming that there are no hidden costs that I am currently unaware of.

Cannot borrow `*self` as mutable more than once at a time; in combination with HashMap

This is my code:
use std::collections::HashMap;
struct Foo {
pub map : HashMap<i32, String>
}
impl Foo {
fn foo(&mut self, x: &String) -> i32 {
// I'm planning to use/modify "x" here and also modify "self"
42
}
fn bar(&mut self) -> i32 {
let x = self.map.get_mut(&1).unwrap();
self.foo(x)
}
}
I'm getting:
error[E0499]: cannot borrow `*self` as mutable more than once at a time
--> src/main.rs:13:9
|
12 | let x = self.map.get_mut(&1).unwrap();
| -------------------- first mutable borrow occurs here
13 | self.foo(x)
| ^^^^^^^^^-^
| | |
| | first borrow later used here
| second mutable borrow occurs here
What's going on?
Modifying self and x here breaks memory safety (at least in the general situation, which is what Rust must deal with). Consider the following implementation of foo which is allowed by your signature (fixing &String to &str):
fn foo(&mut self, x: &str) -> i32 {
self.map.clear();
println!("{}", x);
42
}
But you're calling this with x being a reference to something inside of self.map. So x could be destroyed by the time it's used. That's invalid, and Rust can't prove you won't do that, because you said you might. (Kevin Anderson provides a helpful comment below if you're coming from a GC language like C# where "reference" has a different meaning.)
How to fix this depends on what you're really trying to do, though one approach would be to clone the string so it cannot be destroyed:
fn bar(&mut self) -> i32 {
let x = self.map.get(&1).unwrap().clone(); // <== now you have a copy
self.foo(&x)
}
Note this got rid of the get_mut(). It's unclear what that was for. If you need an exclusive (mut) reference into the map, then you'll need to do that separately, and you can't do that directly while also holding an exclusive reference to self for the reasons above. Remember that mut means "exclusive access," not "mutable." A side effect of having exclusive access is that mutation is allowed.
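If foo only needs part of self, one way to keep the exclusive reference into the map is to pass the needed fields separately, since borrows of different fields do not conflict. This is a sketch under that assumption; the `calls` field is hypothetical and stands in for whatever other state foo modifies.

```rust
use std::collections::HashMap;

struct Foo {
    map: HashMap<i32, String>,
    calls: i32, // hypothetical extra state that `foo` wants to modify
}

impl Foo {
    // Takes only the pieces it needs instead of `&mut self`, so the caller
    // can lend out a map entry and `calls` at the same time.
    fn foo(calls: &mut i32, x: &mut String) -> i32 {
        *calls += 1;
        x.push('!');
        42
    }

    fn bar(&mut self) -> i32 {
        // Two borrows of different fields of `self` do not overlap.
        let x = self.map.get_mut(&1).unwrap();
        Self::foo(&mut self.calls, x)
    }
}

fn main() {
    let mut f = Foo {
        map: HashMap::from([(1, String::from("hi"))]),
        calls: 0,
    };
    println!("{} {} {:?}", f.bar(), f.calls, f.map);
}
```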
If you really need something along these lines, you need to wrap your values (String) in Arc so that you can maintain reference counts and have shared ownership. But I would first try to redesign your algorithm to avoid this.
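A minimal sketch of the shared-ownership variant follows. It uses Rc because the example is single-threaded; Arc would be the equivalent if the map is shared across threads. The body of foo is invented to show that even clearing the map cannot invalidate x.

```rust
use std::collections::HashMap;
use std::rc::Rc;

struct Foo {
    map: HashMap<i32, Rc<String>>,
}

impl Foo {
    fn foo(&mut self, x: &str) -> i32 {
        // Clearing the map is harmless here: the caller's Rc keeps `x` alive.
        self.map.clear();
        x.len() as i32
    }

    fn bar(&mut self) -> i32 {
        // Cloning an Rc bumps a reference count; it does not copy the String.
        let x = Rc::clone(self.map.get(&1).unwrap());
        self.foo(&x)
    }
}

fn main() {
    let mut f = Foo {
        map: HashMap::from([(1, Rc::new(String::from("hello")))]),
    };
    println!("{}", f.bar());
}
```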

Chain iterators to references of different lifetimes

I want to build a recursive function for traversing a tree in Rust. The function should always get the next element and an iterator over references to the ancestor elements.
For the iterator over ancestor elements, one could in principle use the chain and once methods. Consider the following simple example, where the tree is just a Vec (for the purpose of this demonstration):
fn proceed<'a, I>(mut remaining: Vec<String>, ancestors: I)
where
I: Iterator<Item = &'a String> + Clone,
{
if let Some(next) = remaining.pop() {
let next_ancestors = ancestors.chain(std::iter::once(&next));
proceed(remaining, next_ancestors);
}
}
This fails to compile because &next has a shorter lifetime than 'a:
error[E0597]: `next` does not live long enough
--> src/lib.rs:6:62
|
1 | fn proceed<'a, I>(mut remaining: Vec<String>, ancestors: I)
| -- lifetime `'a` defined here
...
6 | let next_ancestors = ancestors.chain(std::iter::once(&next));
| --------------------------------^^^^^--
| | |
| | borrowed value does not live long enough
| argument requires that `next` is borrowed for `'a`
7 | proceed(remaining, next_ancestors);
8 | }
| - `next` dropped here while still borrowed
I tried to overcome this by adding an explicit second lifetime 'b: 'a and forcing an explicit reference by something like let next_ref: &'b String = &next, but that yields a (different) error message as well.
One solution I came up with was to call map as follows:
let next_ancestors = ancestors.map(|r| r).chain(std::iter::once(&next));
As pointed out by @trentcl, this doesn't actually solve the problem, as the compiler then gets stuck in an infinite loop when compiling proceed for all the nested Chains when one actually tries to call the function.
The pieces of the solution are already around; just to summarize:
As you already know, using map(|r| r) "decouples" the lifetime requirement of ancestors from the lifetime of &next.
As already stated in the comments, fixing the infinite recursion is a matter of changing ancestors into a trait object.
fn proceed<'a>(mut remaining: Vec<String>, ancestors: &mut dyn Iterator<Item = &'a String>) {
if let Some(next) = remaining.pop() {
let mut next_ancestors = ancestors.map(|r| r).chain(std::iter::once(&next));
proceed(remaining, &mut next_ancestors);
}
}
fn main() {
let v = vec!["a".to_string(), "b".to_string()];
proceed(v, &mut std::iter::empty());
}
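To check that the chained iterator really carries every ancestor, here is a variant of the function above as a sketch. The `out` parameter is an addition for illustration only: when the tree is exhausted, the accumulated ancestors are copied into it so they can be inspected.

```rust
fn proceed<'a>(
    mut remaining: Vec<String>,
    ancestors: &mut dyn Iterator<Item = &'a String>,
    out: &mut Vec<String>, // hypothetical sink, not part of the original
) {
    match remaining.pop() {
        Some(next) => {
            // Same trick as above: `map(|r| r)` lets the items be shortened
            // to the lifetime of `&next` before chaining.
            let mut next_ancestors = ancestors.map(|r| r).chain(std::iter::once(&next));
            proceed(remaining, &mut next_ancestors, out);
        }
        // Nothing left to visit: record the full ancestor chain.
        None => out.extend(ancestors.cloned()),
    }
}

fn main() {
    let v = vec!["a".to_string(), "b".to_string()];
    let mut out = Vec::new();
    proceed(v, &mut std::iter::empty(), &mut out);
    println!("{:?}", out);
}
```

Note that a `&mut dyn Iterator` cannot be cloned, so each level can either pass the ancestors down or consume them, not both.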

Rust - Lifetime of struct member depends on another struct member [duplicate]

This question already has answers here:
Why can't I store a value and a reference to that value in the same struct?
(4 answers)
Closed 3 years ago.
I'm trying to write a Rust struct. The struct owns a reference-counted pointer to a string and also owns a vector of string slices into the same string.
Furthermore I'm trying to write a function to generate this struct. I'm unsure how to proceed.
struct MyStruct<'a> {
rc_string: Rc<String>,
vec: Vec<&'a str>
}
fn build_my_struct<'a>(s: &Rc<String>) -> MyStruct<'a> {
let rc_string = s.clone();
let mut vec = Vec::new();
vec.push(&rc_string[0..2]);
MyStruct {
rc_string: rc_string,
vec: vec
}
}
error[E0515]: cannot return value referencing local variable `rc_string`
--> src/main.rs:13:5
|
11 | vec.push(&rc_string[0..2]);
| --------- `rc_string` is borrowed here
12 |
13 | / MyStruct {
14 | | rc_string: rc_string,
15 | | vec: vec
16 | | }
| |_____^ returns a value referencing data owned by the current function
I understand that the vec variable has borrowed the rc_string. The compiler doesn't like returning vec because it has the borrow to the local variable rc_string.
However rc_string is being returned as well? The string slices are valid for the duration of the life of MyStruct.rc_string?
You need to borrow the Rc for lifetime 'a as well. The compiler needs to know whether a slice of the String lives for 'a. If we borrow the Rc for 'a, the compiler knows that the contents of the Rc also live for 'a.
If you clone s and assign it to rc_string:
s stays in the function's scope as a borrowed Rc with lifetime 'a
rc_string becomes the owner of a new Rc pointer
and the compiler cannot tell whether a slice of rc_string lives for 'a.
Taking the slice from s works:
fn build_my_struct<'a>(s: &'a Rc<String>) -> MyStruct<'a> {
let mut vec = Vec::new();
let rc_string = s.clone();
vec.push(&s[0..2]);
MyStruct { rc_string, vec }
}
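Note that with this signature the returned MyStruct borrows from the caller's Rc, so it cannot outlive it. If that is too restrictive, one alternative (a sketch, not what the answer above proposes) is to store byte ranges instead of slices and produce the slices on demand; the `ranges` field and `slices` method are names made up for this example.

```rust
use std::ops::Range;
use std::rc::Rc;

// No lifetime parameter: the struct stores byte ranges, not references.
struct MyStruct {
    rc_string: Rc<String>,
    ranges: Vec<Range<usize>>,
}

impl MyStruct {
    // Slices are produced on demand and borrow from `self`.
    fn slices(&self) -> Vec<&str> {
        self.ranges
            .iter()
            .map(|r| &self.rc_string[r.clone()])
            .collect()
    }
}

fn build_my_struct(s: &Rc<String>) -> MyStruct {
    MyStruct {
        rc_string: Rc::clone(s),
        ranges: vec![0..2],
    }
}

fn main() {
    let s = Rc::new(String::from("hello"));
    let my_struct = build_my_struct(&s);
    println!("{:?}", my_struct.slices());
}
```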
