So I am trying to test if I understand lifetimes, and wanted to create a scenario that would fail at compile time. The code I came up with is below:
#[test]
fn lifetime() {
struct Identity<'a> {
first_name: &'a str
}
let name: Identity;
{
let first: &str = "hello";
name = Identity {
first_name: first
};
}
println!("{}", name.first_name);
}
The reasoning is that an instance of Identity should live only as long as what first_name references.
In the code I create let first: &str = "hello" in a smaller scope, store it in name (declared with let name: Identity;), and then, after first should have gone out of scope, try to print name.first_name. I was expecting this not to compile, but it compiles fine.
What am I missing in my understanding of how lifetimes work, and why did this compile?
Edit
Updating the code to use this instead made the compilation fail:
let string = String::from("hello");
let first: &str = string.as_str();
Still curious to know why the original code worked.
Because you copy first into name (shared references are Copy). first references 'static data ('static is a special lifetime that lasts for the entirety of a program): in this case a string literal, which can never go out of scope.
To make your test fail to compile, try referencing data that will go out of scope:
#[test]
fn lifetime() {
struct Identity<'a> {
first_name: &'a str
}
let name: Identity;
{
let first: String = String::from("hello");
name = Identity {
first_name: &first,
};
// `first` will go out of scope here and any references to it
// (like `&first` within `name`) will become invalid.
}
// `first` has been dropped so you can't reference it anymore here:
println!("{}", name.first_name);
}
The variable first is a reference to a string literal, which has the static lifetime. Therefore, let first: &str = "hello"; has a hidden lifetime specifier 'static and is equivalent to:
let first: &'static str = "hello";
That allows the variable name to have any lifetime up to 'static. However, its lifetime is determined by the outer scope, which is shorter but still long enough to cover that println statement.
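As a small sketch of the point above (the function name make_identity is made up for illustration), a reference to a literal can leave not only the inner block but the whole function, because only the variable first goes out of scope, not the data it points to:

```rust
struct Identity<'a> {
    first_name: &'a str,
}

// Hypothetical helper: the literal is baked into the program, so the
// reference to it may outlive the block and the function that created it.
fn make_identity() -> Identity<'static> {
    let name;
    {
        let first: &'static str = "hello";
        name = Identity { first_name: first };
    } // only the variable `first` ends here, not the string data
    name
}

fn main() {
    let id = make_identity();
    println!("{}", id.first_name);
}
```

Swapping the literal for a local String and borrowing it with &first would make the same function fail to compile, for the reason given in the answers above.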
Related
I have problems with understanding the behavior and availability of structs with multiple lifetime parameters. Consider the following:
struct My<'a,'b> {
first: &'a String,
second: &'b String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = My{
first: &first,
second: &second
}
}
println!("{}", my.first)
}
The error message says that
|
13 | second: &second
| ^^^^^^^ borrowed value does not live long enough
14 | }
15 | }
| - `second` dropped here while still borrowed
16 | println!("{}", my.first)
| -------- borrow later used here
First, I do not access the .second field of the struct, so I do not see the problem.
Second, the struct has two lifetime parameters. I assume that the compiler tracks the fields of the struct separately.
For example the following compiles fine:
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
std::mem::drop(my.second);
println!("{}", my.first)
}
Which means that even though, .second of the struct is dropped that does not invalidate the whole struct. I can still access the non-dropped elements.
Why doesn't the same work for structs with references?
The struct has two independent lifetime parameters. Just as the two type parameters of a struct are independent of each other, I would expect these two lifetimes to be independent as well. But the error message suggests that in the case of lifetimes they are not independent: the resulting struct effectively does not have two lifetime parameters but only one, the smaller of the two.
If the validity of a struct containing two references is limited to the lifetime of the shorter-lived reference, then my question is: what is the difference between
struct My1<'a,'b>{
f: &'a X,
s: &'b Y,
}
and
struct My2<'a>{
f: &'a X,
s: &'a Y
}
I would expect structs with multiple lifetime parameters to behave similarly to functions with multiple lifetime parameters. Consider these two functions:
fn fun_single<'a>(x:&'a str, y: &'a str) -> &'a str {
if x.len() <= y.len() {&x[0..1]} else {&y[0..1]}
}
fn fun_double<'a,'b>(x: &'a str, y:&'b str) -> &'a str {
&x[0..1]
}
fn main() {
let first = "first".to_string();
let second = "second".to_string();
let ref_first = &first;
let ref_second = &second;
let result_ref = fun_single(ref_first, ref_second);
std::mem::drop(second);
println!("{result_ref}")
}
In this version we get the result from a function with a single lifetime parameter. The compiler treats the two function parameters as related, so it picks the shorter lifetime for the reference returned from the function, and this version does not compile.
But if we just replace the line
let result_ref = fun_single(ref_first, ref_second);
with
let result_ref = fun_double(ref_first, ref_second);
the compiler sees that the two lifetimes are independent, so even after second is dropped result_ref is still valid: the lifetime of the returned reference is not the shorter of the two but is independent of the second parameter, and the code compiles.
I would expect structs with multiple lifetimes and functions with multiple lifetimes to behave similarly. But they don't.
What am I missing here?
I assume that the compiler tracks the fields of the struct separately.
I think that's the core of your confusion. The compiler does track each lifetime separately, but only statically at compile time, not at runtime. It follows from this that Rust generally cannot allow structs to be partially valid.
So, while you do specify two lifetime parameters, the compiler figures that the struct can only be valid as long as both of them are alive: that is, only for as long as the shorter-lived one lives.
But then how does the second example work? It relies on an exceptional feature of the compiler called partial moves. That means that whenever you move out of a struct, it allows you to move disjoint parts separately.
It is essentially syntactic sugar for the following:
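The two parameters still make a difference once a reference is copied out of the struct. The following is only a sketch (the function name keep_first is invented for illustration): the struct value itself dies with the inner block, but the &'a String copied out of it keeps its own, longer lifetime.

```rust
struct My<'a, 'b> {
    first: &'a String,
    second: &'b String,
}

// Hypothetical example: `my` is only used while `second` is alive, but the
// reference copied out of `my.first` has lifetime 'a, which is not tied to 'b.
fn keep_first() -> String {
    let first = "first".to_string();
    let kept: &String;
    {
        let second = "second".to_string();
        let my = My { first: &first, second: &second };
        println!("{} {}", my.first, my.second);
        kept = my.first; // copies the `&'a String` out of the struct
    } // `second` and `my` end here; `kept` still points at `first`
    kept.clone()
}

fn main() {
    println!("{}", keep_first());
}
```

With a single-parameter struct such as My2<'a>, both fields would share one lifetime, so using kept after the block would force the borrow of second to last that long as well, and the compiler would reject it.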
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
let Own{
first: my_first,
second: my_second,
} = my;
std::mem::drop(my_second);
println!("{}", my_first);
}
Note that this too is a static feature, so the following will not compile (even though it would work when run):
struct Own {
first: String,
second: String
}
fn main() {
let my;
let first = "first".to_string();
{
let second = "second".to_string();
my = Own{
first: first,
second: second
}
}
if false {
std::mem::drop(my.first);
}
println!("{}", my.first)
}
The struct may not be moved as a whole once it has been partially moved, so not even this allows you to have partially valid structs.
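A minimal sketch of that rule (use_after_partial_move is a made-up name): after one field is moved out, the remaining field can still be used or moved, while the struct as a whole cannot.

```rust
struct Own {
    first: String,
    second: String,
}

// Hypothetical example of a partial move out of a local struct value.
fn use_after_partial_move() -> (String, usize) {
    let my = Own {
        first: "first".to_string(),
        second: "second".to_string(),
    };
    let second = my.second; // moves only this field out of `my`
    let second_len = second.len();
    drop(second);
    // `my.first` was never moved, so it is still usable. Using `my` as a
    // whole is rejected at this point:
    // let whole = my; // error: use of partially moved value: `my`
    (my.first, second_len)
}

fn main() {
    let (first, len) = use_after_partial_move();
    println!("{} {}", first, len);
}
```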
A local variable may be partially initialized, such as in your second example. Rust can track this for local variables and give you an error if you attempt to access the uninitialized parts.
However, in your first example the variable isn't actually partially initialized; it's fully initialized (you give it both the first and second fields). Then, when second goes out of scope, my is still fully initialized, but its second field is now invalid (though initialized). Thus the compiler doesn't let the variable exist past the point where second is dropped, to avoid an invalid reference.
Rust could track this since you have 2 lifetimes and name the second lifetime a special 'never that would signal the reference is always invalid, but it currently doesn't.
Here is a simplified version of what I want to achieve:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is obviously completely safe as foo.boo is None once string goes out of scope.
Is there a way to tell this to the compiler?
This is obviously completely safe
What is obvious to humans isn't always obvious to the compiler; sometimes the compiler isn't as smart as humans (but it's way more vigilant!).
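In this case, your original code compiles when non-lexical lifetimes are enabled. (NLL has since been stabilized and is the default in current Rust, so the feature attribute below is only needed on the older nightly compilers this answer was written for and should be dropped on stable.)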
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is only because foo is never used once it would be invalid (after string goes out of scope), not because you set the value to None. Trying to print out the value after the innermost scope would still result in an error.
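As a sketch of what that means on a current stable compiler (the function name run is invented here, and no feature attribute is used), the borrow of string ends at the last use of foo, so string can even be moved out of the block afterwards:

```rust
struct Foo<'a> {
    boo: Option<&'a mut String>,
}

// Hypothetical example: compiles because `foo` is never touched again after
// the assignment of `None`, not because of the value that was assigned.
fn run() -> String {
    let mut foo = Foo { boo: None };
    let result;
    {
        let mut string = "Hello".to_string();
        foo.boo = Some(&mut string);
        if let Some(s) = foo.boo.as_mut() {
            s.push_str(", I am foo!");
        }
        foo.boo = None; // last use of `foo`
        result = string; // the borrow is over, so `string` may be moved
    }
    // Using `foo` here would keep the borrow alive past this point and the
    // compiler would reject the function:
    // println!("{}", foo.boo.is_some());
    result
}

fn main() {
    println!("{}", run());
}
```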
Is it possible to have a struct which contains a reference to a value which has a shorter lifetime than the struct?
The purpose of Rust's borrowing system is to ensure that things holding references do not live longer than the referred-to item.
After non-lexical lifetimes
Maybe, so long as you don't make use of the reference after it is no longer valid. This works, for example:
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
// This lives less than `foo`
let mut string1 = "Hello".to_string();
foo.boo = Some(&mut string1);
// This lives less than both `foo` and `string1`!
let mut string2 = "Goodbye".to_string();
foo.boo = Some(&mut string2);
}
Before non-lexical lifetimes
No. The borrow checker is not smart enough to tell that you cannot / don't use the reference after it would be invalid. It's overly conservative.
In this case, you are running into the fact that lifetimes are represented as part of the type. Said another way, the generic lifetime parameter 'a has been "filled in" with a concrete lifetime value covering the lines where string is alive. However, the lifetime of foo is longer than those lines, thus you get an error.
The compiler does not look at what actions your code takes; once it has seen that you parameterize it with that specific lifetime, that's what it is.
The usual fix I would reach for is to split the type into two parts, those that need the reference and those that don't:
struct FooCore {
size: i32,
}
struct Foo<'a> {
core: FooCore,
boo: &'a mut String,
}
fn main() {
let core = FooCore { size: 42 };
let core = {
let mut string = "Hello".to_string();
let foo = Foo { core, boo: &mut string };
foo.boo.push_str(", I am foo!");
foo.core
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Note how this removes the need for the Option — your types now tell you if the string is present or not.
An alternate solution would be to map the whole type when setting the string. In this case, we consume the whole variable and change the type by changing the lifetime:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
impl<'a> Foo<'a> {
fn set<'b>(self, boo: &'b mut String) -> Foo<'b> {
Foo { boo: Some(boo) }
}
fn unset(self) -> Foo<'static> {
Foo { boo: None }
}
}
fn main() {
let foo = Foo { boo: None };
let foo = {
let mut string = "Hello".to_string();
let mut foo = foo.set(&mut string);
foo.boo.as_mut().unwrap().push_str(", I am foo!");
foo.unset()
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Shepmaster's answer is completely correct: you can't express this with lifetimes, which are a compile time feature. But if you're trying to replicate something that would work in a managed language, you can use reference counting to enforce safety at run time.
(Safety in the usual Rust sense of memory safety. Panics and leaks are still possible in safe Rust; there are good reasons for this, but that's a topic for another question.)
Here's an example (playground). Rc pointers disallow mutation, so I had to add a layer of RefCell to imitate the code in the question.
use std::rc::{Rc,Weak};
use std::cell::RefCell;
struct Foo {
boo: Weak<RefCell<String>>,
}
fn main() {
let mut foo = Foo { boo: Weak::new() };
{
// create a string with a shorter lifetime than foo
let string = "Hello".to_string();
// move the string behind an Rc pointer
let rc1 = Rc::new(RefCell::new(string));
// weaken the pointer to store it in foo
foo.boo = Rc::downgrade(&rc1);
// accessing the string
let rc2 = foo.boo.upgrade().unwrap();
assert_eq!("Hello", *rc2.borrow());
// mutating the string
let rc3 = foo.boo.upgrade().unwrap();
rc3.borrow_mut().push_str(", I am foo!");
assert_eq!("Hello, I am foo!", *rc3.borrow());
} // rc1, rc2 and rc3 go out of scope and string is automatically dropped.
// foo.boo now refers to a dropped value and cannot be upgraded anymore.
assert!(foo.boo.upgrade().is_none());
}
Notice that I didn't have to reassign foo.boo before string went out of scope, like in your example -- the Weak pointer is automatically marked invalid when the last extant Rc pointer is dropped. This is one way in which Rust's type system still helps you enforce memory safety even after dropping the strong compile-time guarantees of shared & pointers.
I need a closure to refer to parts of an object in its enclosing environment. The object is created within the environment and is scoped to it, but once created it could be safely moved to the closure.
The use case is a function that does some preparatory work and returns a closure that will do the rest of the work. The reason for this design are execution constraints: the first part of the work involves allocation, and the remainder must do no allocation. Here is a minimal example:
fn stage_action() -> Box<dyn Fn()> {
// split a freshly allocated string into pieces
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
// the returned closure refers to the substrings vector of
// slices without any further allocation or modification
Box::new(move || {
for sub in substrings.iter() {
println!("{}", sub);
}
})
}
fn main() {
let action = stage_action();
// ...executed some time later:
action();
}
This fails to compile, correctly stating that &string[0..1] and others must not outlive string. But if string were moved into the closure, there would be no problem. Is there a way to force that to happen, or another approach that would allow the closure to refer to parts of an object created just outside of it?
I've also tried creating a struct with the same functionality to make the move fully explicit, but that doesn't compile either. Again, compilation fails with the error that &later[0..1] and others only live until the end of function, but "borrowed value must be valid for the static lifetime".
Even completely avoiding a Box doesn't appear to help - the compiler complains that the object doesn't live long enough.
There's nothing specific to closures here; it's the equivalent of:
fn main() {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
let string = string;
}
You are attempting to move the String while there are outstanding borrows. In my example here, it's to another variable; in your example it's to the closure's environment. Either way, you are still moving it.
Additionally, you are trying to move the substrings into the same closure environment as the owning string. That makes the entire problem equivalent to Why can't I store a value and a reference to that value in the same struct?:
struct Environment<'a> {
string: String,
substrings: Vec<&'a str>,
}
fn thing<'a>() -> Environment<'a> {
let string = String::from("a:b:c");
let substrings = vec![&string[0..1], &string[2..3], &string[4..5]];
Environment {
string: string,
substrings: substrings,
}
}
The object is created within the environment and is scoped to it
I'd disagree; string and substrings are created outside of the closure's environment and moved into it. It's that move that's tripping you up.
once created it could be safely moved to the closure.
In this case that's true, but only because you, the programmer, can guarantee that the address of the string data inside the String will remain constant. You know this for two reasons:
String is internally implemented with a heap allocation, so moving the String doesn't move the string data.
The String will never be mutated, which could cause the string to reallocate, invalidating any references.
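The first of these two points can be checked with a small sketch (the function name is made up for illustration): moving a String copies only its pointer, length and capacity, so the address of the heap buffer is the same before and after the move.

```rust
// Hypothetical check: returns the address of the string data before and
// after the `String` value itself is moved to another variable.
fn buffer_address_before_and_after_move() -> (*const u8, *const u8) {
    let string = String::from("a:b:c");
    let before = string.as_ptr();
    let moved = string; // moves the (pointer, length, capacity) triple only
    let after = moved.as_ptr();
    (before, after)
}

fn main() {
    let (before, after) = buffer_address_before_and_after_move();
    println!("same buffer: {}", before == after);
}
```

The borrow checker does not reason about this, which is why the original code is still rejected even though the slices would remain valid.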
The easiest solution for your example is to simply convert the slices to Strings and let the closure own them completely. This may even be a net benefit if that means you can free a large string in favor of a few smaller strings.
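A minimal sketch of that approach follows. The helper owned_pieces is a made-up name, and splitting on ':' stands in for the fixed index ranges used in the question. All allocation happens while staging; the closure only iterates over strings it already owns.

```rust
// Hypothetical helper: copy each piece into its own String during staging.
fn owned_pieces(source: &str) -> Vec<String> {
    source.split(':').map(str::to_owned).collect()
}

fn stage_action() -> impl Fn() {
    let string = String::from("a:b:c");
    let substrings = owned_pieces(&string);
    // `string` is dropped when this function returns; the closure owns
    // independent copies, so nothing borrows from it anymore.
    move || {
        for sub in &substrings {
            println!("{}", sub);
        }
    }
}

fn main() {
    let action = stage_action();
    // ...executed some time later:
    action();
}
```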
Otherwise, you meet the criteria laid out under "There is a special case where the lifetime tracking is overzealous" in Why can't I store a value and a reference to that value in the same struct?, so you can use crates like:
owning_ref
use owning_ref::RcRef; // 0.4.1
use std::rc::Rc;
fn stage_action() -> impl Fn() {
let string = RcRef::new(Rc::new(String::from("a:b:c")));
let substrings = vec![
string.clone().map(|s| &s[0..1]),
string.clone().map(|s| &s[2..3]),
string.clone().map(|s| &s[4..5]),
];
move || {
for sub in &substrings {
println!("{}", &**sub);
}
}
}
fn main() {
let action = stage_action();
action();
}
ouroboros
use ouroboros::self_referencing; // 0.2.3
fn stage_action() -> impl Fn() {
#[self_referencing]
struct Thing {
string: String,
#[borrows(string)]
substrings: Vec<&'this str>,
}
let thing = ThingBuilder {
string: String::from("a:b:c"),
substrings_builder: |s| vec![&s[0..1], &s[2..3], &s[4..5]],
}
.build();
move || {
thing.with_substrings(|substrings| {
for sub in substrings {
println!("{}", sub);
}
})
}
}
fn main() {
let action = stage_action();
action();
}
Note that I'm no expert user of either of these crates, so these examples may not be the best use of them.
Can I create a binding with the type Option<&str>? Tiny non-working example:
fn main() {
let a: Option<&str> = {
Some(&{"a".to_string() + "b"}) // Let's say the string is not static
};
}
This does not work; I need to add a lifetime (or use an Option<String> without &). So how can I declare a lifetime here? I know I can return an Option<String> and everything will be fine, but that's not what I want — I'm trying to understand some Rust mechanics. I can declare a lifetime in a function, but don't know how to do this in a simple let binding.
Absolutely:
fn main() {
let s = "a".to_string() + "b";
let a: Option<&str> = Some(&s);
}
The problem is not in creating a Option<&str>, it's that you are trying to take a reference to something that has gone out of scope. This is (a part of) the error message for the original code:
error: borrowed value does not live long enough
|> Some(&{"a".to_string() + "b"})
|> ^^^^^^^^^^^^^^^^^^^^^^^ does not live long enough
See Return local String as a slice (&str) for further information.
You need to extend lifetime of the string beyond a's initializer expression by binding it to another name:
fn main() {
let x: String = "a".to_string() + "b";
let a: Option<&str> = {
Some(&x)
};
}
You are trying to keep a reference to a value that does not live long enough, like the error says:
3 |> Some(&{"a".to_string() + "b"}) // Let's say the string is not static
|> ^^^^^^^^^^^^^^^^^^^^^^^ does not live long enough
You can have a binding of type Option<&str>, but the value being referenced must live at least as long as the binding. In your example, create a binding for the resulting string, and then take a reference to it:
fn main() {
let x: String = "a".to_string() + "b";
let a: Option<&str> = Some(&x);
}
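If the owned value is already wrapped in an Option, the same idea applies: keep the Option<String> alive in its own binding and borrow from it. This is only a sketch (the function build is invented for illustration); Option::as_deref is a standard library method that turns a borrowed Option<String> into an Option<&str>.

```rust
// Hypothetical producer of an owned, non-static string.
fn build() -> Option<String> {
    Some("a".to_string() + "b")
}

fn main() {
    // The owner lives in its own binding for the rest of `main`...
    let owner: Option<String> = build();
    // ...so a borrowed view of it can be stored next to it.
    let a: Option<&str> = owner.as_deref();
    println!("{:?}", a);
}
```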
This works:
fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
//let input: UserAddIn = json::decode(&data.post).unwrap();
//let username = input.username.as_bytes();
//let password = input.password.as_bytes();
db.put(b"Hi", b"hello");
//db.delete(username);
Ok("Hi".to_string())
}
This does not work:
fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
//let input: UserAddIn = json::decode(&data.post).unwrap();
//let username = input.username.as_bytes();
//let password = input.password.as_bytes();
let my_str = "hi".to_string();
let username = my_str.as_bytes();
db.put(username, b"hello");
//db.delete(username);
Ok("Hi".to_string())
}
Compiler output:
src/handlers.rs:85:17: 85:23 error: `my_str` does not live long enough
src/handlers.rs:85 let username = my_str.as_bytes();
^~~~~~
src/handlers.rs:80:77: 89:2 note: reference must be valid for the lifetime 'x as defined on the block at 80:76...
src/handlers.rs:80 fn user_add<'x>(data: &'x Input, db: &'x mut Database<'x>) -> HandlerOutput {
src/handlers.rs:81 //let input: UserAddIn = json::decode(&data.post).unwrap();
src/handlers.rs:82 //let username = input.username.as_bytes();
src/handlers.rs:83 //let password = input.password.as_bytes();
src/handlers.rs:84 let my_str = "hi".to_string();
src/handlers.rs:85 let username = my_str.as_bytes();
...
src/handlers.rs:84:32: 89:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 84:31
src/handlers.rs:84 let my_str = "hi".to_string();
src/handlers.rs:85 let username = my_str.as_bytes();
src/handlers.rs:86 db.put(username, b"hello");
src/handlers.rs:87 //db.delete(username);
src/handlers.rs:88 Ok("Hi".to_string())
src/handlers.rs:89 }
I've seen several questions about lifetime in Rust and I think the book is not that clear about it. I still use lifetimes as trial and error. This specific case has confused me because I've made several attempts fighting against the compiler and this is just the last error I got. If you have some Rust skills please consider editing the part about lifetimes in the book.
In the first case, b"Hi" is a byte string literal of type &'static [u8; 2], which coerces to &'static [u8], meaning “slice of u8 with infinite lifetime”. The function put needs some lifetime 'x; since 'static outlives every other lifetime, Rust is happy to use it.
In the second case
let my_str = "hi".to_string();
let username = my_str.as_bytes();
username is a reference to the inner buffer of my_str and cannot outlive it. The compiler complains because the first argument of put should have a lifetime 'x which is broader than that of my_str (local to user_add). Rust won't allow you to do that because db would point to dangling data at the end of the function call:
user_add(input, &mut db);
// `my_str` was local to `user_add` and doesn't exist anymore
// if Rust had allowed you to put it in `db`, `db` would now contain some invalid data here
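The same situation can be reproduced without the real database. The sketch below uses a made-up Store type whose put method is only assumed to resemble the one in the question: a 'static byte string is accepted for any 'x, while a borrow of a local String is not.

```rust
// Hypothetical stand-in for the database: it stores borrowed keys.
struct Store<'x> {
    keys: Vec<&'x [u8]>,
}

impl<'x> Store<'x> {
    fn put(&mut self, key: &'x [u8]) {
        self.keys.push(key);
    }
}

fn add_literal<'x>(store: &mut Store<'x>) {
    // A `'static` reference is valid for whatever `'x` the caller picked.
    store.put(b"Hi");
    // A borrow of a local is not, because the local dies with this call:
    // let my_str = "hi".to_string();
    // store.put(my_str.as_bytes()); // error: `my_str` does not live long enough
}

fn main() {
    let mut store = Store { keys: Vec::new() };
    add_literal(&mut store);
    println!("{}", store.keys.len());
}
```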
Thanks to @mcarton for explaining why the error happens. In this answer I hope it becomes clear how to solve it too.
The compiler's code generation is perfect but the error message is
just terribly confusing to me.
The problem was in another library that I made, that happens to be a
database. The database struct contains an entry that holds slices.
The lifetime of the slices was set as:
struct Entry<'a> {
key: &'a [u8],
value: &'a [u8],
}
pub struct Database<'a> {
file: File,
entries: Vec<Entry<'a>>,
}
It means that the data the slices point to needs to live longer than the
database struct. The username variable goes out of scope while the database holding a reference to it still lives. So the database could only hold data that outlives it, like static variables, which makes the database useless.
The library compiled okay. But the error showed elsewhere.
The solution was to exchange the slices for vectors, because
vectors own their data instead of pointing at data owned elsewhere. The caller's buffers can then live less than the database.
struct Entry {
key: Vec<u8>,
value: Vec<u8>,
}
pub struct Database {
file: File,
entries: Vec<Entry>,
}
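A rough sketch of how the owned version might be used follows. The put and get signatures are assumptions (the real library's API is not shown above), and the File field is left out so the example stands alone. Because put copies the bytes, the caller's temporary string no longer has to outlive the database.

```rust
struct Entry {
    key: Vec<u8>,
    value: Vec<u8>,
}

// Hypothetical in-memory variant: the `file` field is omitted here.
pub struct Database {
    entries: Vec<Entry>,
}

impl Database {
    pub fn new() -> Self {
        Database { entries: Vec::new() }
    }

    // Assumed signature: borrows the input only for the duration of the call
    // and stores owned copies.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.push(Entry {
            key: key.to_vec(),
            value: value.to_vec(),
        });
    }

    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.entries
            .iter()
            .find(|entry| entry.key.as_slice() == key)
            .map(|entry| entry.value.as_slice())
    }
}

fn user_add(db: &mut Database) {
    let my_str = "hi".to_string();
    let username = my_str.as_bytes();
    db.put(username, b"hello"); // fine: `db` keeps its own copy
} // `my_str` is dropped here without affecting `db`

fn main() {
    let mut db = Database::new();
    user_add(&mut db);
    println!("{:?}", db.get(b"hi"));
}
```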