I've encountered a number of types in Rust denoted with a single apostrophe:
'static
'r
'a
What is the significance of that apostrophe (')? Maybe it's a modifier of references (&)? Generic typing specific to references? I've no idea where the documentation for this is hiding.
These are Rust's named lifetimes.
Quoting from The Rust Programming Language:
Every reference in Rust has a lifetime, which is the scope for which that reference is valid. Most of the time lifetimes are implicit and inferred, just like most of the time types are inferred. Similarly to when we have to annotate types because multiple types are possible, there are cases where the lifetimes of references could be related in a few different ways, so Rust needs us to annotate the relationships using generic lifetime parameters so that it can make sure the actual references used at runtime will definitely be valid.
Lifetime annotations don’t change how long any of the references
involved live. In the same way that functions can accept any type when
the signature specifies a generic type parameter, functions can accept
references with any lifetime when the signature specifies a generic
lifetime parameter. What lifetime annotations do is relate the
lifetimes of multiple references to each other.
Lifetime annotations have a slightly unusual syntax: the names of
lifetime parameters must start with an apostrophe '. The names of
lifetime parameters are usually all lowercase, and like generic types,
their names are usually very short. 'a is the name most people use as
a default. Lifetime parameter annotations go after the & of a
reference, and a space separates the lifetime annotation from the
reference’s type.
Said another way, a lifetime approximates the span of execution during which the data a reference points to is valid. The Rust compiler will conservatively infer the shortest lifetime possible to be safe. If you want to tell the compiler that a reference lives longer than the shortest estimate, you can name it, saying that the output reference, for example, has the same lifetime as a given input reference.
The 'static lifetime is special: it is the longest-lived of all lifetimes and lasts for the duration of the program. A typical example is a string literal, which is always available for as long as the program/module runs.
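As a minimal sketch of the 'static case (the names `default_name`, `GREETING` and `greet` are hypothetical, chosen only for illustration): a string literal is stored in the program binary, so a reference to it can be returned without borrowing from any argument.

```rust
// A string literal has type &'static str: the text lives in the binary.
fn default_name() -> &'static str {
    "anonymous"
}

// References stored in a static item are 'static as well.
static GREETING: &str = "hello";

// Hypothetical helper that combines the two 'static strings.
fn greet() -> String {
    format!("{}, {}", GREETING, default_name())
}
```

Because neither reference borrows from a local value, both can be kept for as long as the program runs.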
You can get more information from this slide deck, starting around slide 29.
Lifetimes in Rust also discusses lifetimes in some depth.
To add to quux00's excellent answer, named lifetimes are also used to tell the Rust compiler where a returned borrowed value comes from.
This function
pub fn f(a: &str, b: &str) -> &str {
b
}
won't compile because it returns a borrowed value but does not specify whether it borrowed it from a or b.
To fix that, you'd declare a named lifetime and use the same lifetime for b and the return type:
pub fn f<'r>(a: &str, b: &'r str) -> &'r str {
// ---- --- ---
b
}
and use it as expected
f("a", "b")
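To see what the annotation buys the caller, here is a small sketch under the assumption of a hypothetical `caller` function: because the return type is tied only to `b`, the result stays usable after `a` has been dropped.

```rust
pub fn f<'r>(a: &str, b: &'r str) -> &'r str {
    let _ = a; // `a` is unused on purpose; only `b` is returned
    b
}

// Hypothetical caller: `a` dies early, the result stays usable.
pub fn caller() -> String {
    let b = String::from("b");
    let result;
    {
        let a = String::from("a");
        result = f(&a, &b); // `result` borrows only from `b`
    } // `a` is dropped here; `result` is still valid
    result.to_string()
}
```

Had the signature used the same lifetime for `a`, `b` and the return type, the compiler would be expected to reject `caller`, since `a` does not live long enough.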
All of Rust's documentation and third-party/blog examples (at least the top several results in Google) use <'a> to demonstrate lifetime annotations.
What is the significance of the name choice 'a?
Is this a new de-facto convention like the old i in for(i=0; ...)?
Do you use 'a in all the lifetime annotations in your production code?
What are some real-world examples of lifetime scope names you have used in your own code?
There is no special significance to 'a. It's just the first letter of the Latin alphabet and, if you had more than one lifetime, you might name them 'a, 'b, 'c etc.
One reason that descriptive names are not commonly used for lifetime parameters is similar to why we often use single letters for type parameters. These parameters represent all possible types with the actual type left up to the caller. In generic contexts, the actual argument could be anything and often it's completely unconstrained, so naming it might imply that the usage is narrower than it actually is. For example the T in Option<T> means the type and it wouldn't make sense to be more specific than that.
That said, lifetimes are often connected to another type and it's not uncommon to name lifetimes to be a bit more obvious where they come from. For example, you could use 't to name the lifetime of a T parameter:
fn foo<'t, T: 't>(arg: T) {}
serde is an example of a popular library that does this. When a value is borrowed from data that is being deserialized, this is usually named 'de:
pub trait Deserialize<'de>: Sized {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>;
}
An example from my own work where I've named them more carefully is in a query language that searches over in-memory data structures. I used 'd to refer to values borrowed from the data being queried and 'q for values borrowed from the query itself.
'a is just the first letter of the alphabet, and easy to type. Yes, it's a stand-in similar to i when iterating over a range of numbers or x in almost any context. It's most similar to using T for a generic parameter.
It has become a de facto convention to use 'a, 'b, etc. as lifetime names, because it reduces the amount of visual space they take up. In cases where clarity is important, you'll see longer names, often matching the name of the field or generic parameter:
struct SomeStruct<'name, 'comment> {
name: &'name str,
comment: &'comment str,
}
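A short sketch of why the two separately named lifetimes can matter. The methods `name` and `summary` and the function `keep_name` are hypothetical additions, not part of the original struct.

```rust
struct SomeStruct<'name, 'comment> {
    name: &'name str,
    comment: &'comment str,
}

impl<'name, 'comment> SomeStruct<'name, 'comment> {
    // Returns the name with its own lifetime, not the lifetime of `&self`.
    fn name(&self) -> &'name str {
        self.name
    }

    fn summary(&self) -> String {
        format!("{} ({})", self.name, self.comment)
    }
}

// Hypothetical caller: the comment is short-lived, the name is not.
fn keep_name(name: &str) -> &str {
    let comment = String::from("temporary note");
    let entry = SomeStruct { name, comment: &comment };
    println!("{}", entry.summary());
    entry.name() // tied to 'name, not to `comment` or `entry`
}
```

With a single shared lifetime for both fields, the name would be limited to the lifetime of the local `comment`, and `keep_name` could not return it.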
Is this a new de-facto convention like the old i in for(i=0; ...)?
Yes, pretty much. The names you pick for lifetime parameters aren't important (except for 'static, that's a special case). They're like variable names.
See the Rust Reference's definition of lifetime labels - the apostrophe is followed by an IDENTIFIER_OR_KEYWORD token.
Personally, I usually use 'a, 'b, etc. in my code, since the lifetimes aren't usually very tricky or important - I'm just trying to get the borrow checker happy.
But sometimes it's important to actually understand what's happening with the lifetimes involved, and thinking through it is nontrivial. In those cases I might use more informative lifetime parameter names.
I'm a newbie in Rust and I'm still struggling with lifetimes. Below is an example from a book I'm reading. Could anyone help explain how the author can get this information just by looking at the function signature? I already have a basic understanding of borrowing, references, etc., but I still can't understand it.
For example, suppose we have a parsing function that takes a slice of bytes, and returns a structure holding the results of the parse:
fn parse_record<'i>(input: &'i [u8]) -> Record<'i> { ... }
Without looking into the definition of the Record type at all, we can tell that, if we
receive a Record from parse_record, whatever references it contains must point into
the input buffer we passed in, and nowhere else (except perhaps at 'static values).
The Record can get references only from the function body. In theory, these can come from
values in input, which are all references with lifetime 'i
values defined outside parse_record, which must have 'static lifetime
values defined in parse_record. These can be
dynamically created. Such values would be dropped by the end of the function scope, so any references to them would end up as dangling pointers. As such, this isn't allowed by the compiler.
literals (1, "cat"). These are actually baked into the binary, and so are effectively defined outside the function. In the function they're 'static references
The input:
input: &'i [u8]
Says the following:
I am a reference to a series of bytes ([u8]) that will live for at least as long as 'i.
So, when saying that I've a type that looks like this: Record<'i>, I can say the following about it:
I am a struct (Named Record<'i>) that contains something (Perhaps a reference, perhaps something else) that depends on something living for at least as long as 'i.
What lifetimes on references/structs/enums/whatever are telling you is that there is a dependency that an object that lives in 'i must live as long as I do.
In other words, this function signature tells you that the Record cannot outlive the bytes referenced by input (the u8s under the reference, not the reference itself); those bytes must live at least as long as the Record does.
A lack of a lifetime parameter was recently deprecated and now causes a warning, so keep in mind when reading examples like the following:
fn parse_record(input: &[u8]) -> Record
that there may be a lifetime attached to Record, so you must consult some kind of documentation. The compiler desugars this (and the warning asks you to do this yourself) to this:
fn parse_record(input: &'_ [u8]) -> Record<'_>
This is identical to your 'i example.
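A minimal sketch of what such a function might look like. The fields of `Record` and the parsing logic are assumptions made up for illustration; the book excerpt does not show them. One field borrows from the input buffer and the other points at a 'static literal, which are the two sources the signature allows.

```rust
// Hypothetical record layout: `key` borrows from the input,
// `kind` points at a 'static string literal.
struct Record<'i> {
    key: &'i [u8],
    kind: &'i str,
}

// Explicit form: the Record borrows from `input` for 'i.
fn parse_record<'i>(input: &'i [u8]) -> Record<'i> {
    // Take everything before the first '=' (or the whole input) as the key.
    let end = input.iter().position(|&b| b == b'=').unwrap_or(input.len());
    Record { key: &input[..end], kind: "pair" }
}

// The same relationship written with the anonymous lifetime `'_`.
fn parse_record_elided(input: &[u8]) -> Record<'_> {
    parse_record(input)
}
```

A reference to a `String` or `Vec` created inside `parse_record` could not be stored in the `Record`, because that value would be dropped when the function returns.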
I have been learning the lifetimes topic for the last three days, and they start making sense to me now. However, I experimented a lot, but didn't manage to specify lifetimes in a way when they'd lead to runtime-unsafe behavior, because the compiler seems to be smart enough to prevent such cases, by not compiling.
Hence I have the chain of questions below:
Is it true that Rust compiler will catch every case of unsafe lifetime specifiers usage?
If yes, then why does Rust require manually specifying lifetimes, when it can do it on its own, by deducing the unsafe scenarios? Or is it just a relic that will go away once the compiler becomes powerful enough to make lifetime elision everywhere?
If no, what is the example (are the examples) of unsafe lifetime specifiers usage? They'd clearly prove the necessity of manually specifying lifetimes.
It is not possible (barring any compiler bugs) to induce undefined behavior with lifetime specifiers unless you use unsafe code (either in the function or elsewhere). However, lifetime specifiers are still necessary because sometimes there is ambiguity in what the proper lifetime should be. For example:
fn foo(bar: &i32, baz: &i32) -> &i32 {
// ...
}
What should the lifetime of the return type be? The compiler cannot infer this because it could be tied to either bar or baz, and each case would affect how long the return value lasts and therefore how the function can be used. The body of the function cannot be used to infer the lifetime because type and lifetime checks must be possible to complete using only the signature of the function. The only way to remove this ambiguity is to explicitly state what lifetime the return value should have:
fn foo<'a>(bar: &i32, baz: &'a i32) -> &'a i32 {
// ...
}
You can read more about the lifetime elision rules here.
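For contrast, a sketch of a case where elision is enough: with exactly one input reference there is no ambiguity, so the output lifetime is taken from that input. The function names here are hypothetical.

```rust
// One input reference: the elided output lifetime is tied to `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out explicitly.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}
```

Both functions have the same signature as far as the borrow checker is concerned; the annotation only becomes necessary once a second input reference makes the choice ambiguous.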
The third rule of lifetime elision says
If there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, then the lifetime of self is assigned to all output lifetime parameters. This makes writing methods much nicer.
Here is the tutorial describing what happened for this function
fn announce_and_return_part(&self, announcement: &str) -> &str
There are two input lifetimes, so Rust applies the first lifetime elision rule and gives both &self and announcement their own lifetimes. Then, because one of the parameters is &self, the return type gets the lifetime of &self, and all lifetimes have been accounted for.
We can show that all the lifetimes are not accounted for since it is possible that announcement will have a different lifetime than &self:
struct ImportantExcerpt<'a> {
part: &'a str,
}
impl<'a> ImportantExcerpt<'a> {
fn announce_and_return_part(&self, announcement: &str) -> &str {
println!("Attention please: {}", announcement);
announcement
}
}
fn main() {
let i = ImportantExcerpt { part: "IAOJSDI" };
let test_string_lifetime;
{
let a = String::from("xyz");
test_string_lifetime = i.announce_and_return_part(a.as_str());
}
println!("{:?}", test_string_lifetime);
}
The lifetime of announcement is not as long as &self, so it is not correct to associate the output lifetime to &self, shouldn't the output lifetime be associated to the longer of the input?
Why is the third rule of lifetime elision a valid way to assign output lifetime?
No, the elision rules do not capture every possible case for lifetimes. If they did, they wouldn't be elision rules at all: they would be the only rules, and we wouldn't need any syntax to specify explicit lifetimes.
Quoting from the documentation you linked to, emphasis mine:
The patterns programmed into Rust's analysis of references are called
the lifetime elision rules. These aren't rules for programmers to
follow; the rules are a set of particular cases that the compiler will
consider, and if your code fits these cases, you don't need to write
the lifetimes explicitly.
The elision rules don't provide full inference: if Rust
deterministically applies the rules but there's still ambiguity as to
what lifetimes the references have, it won't guess what the lifetime
of the remaining references should be. In this case, the compiler will
give you an error that can be resolved by adding the lifetime
annotations that correspond to your intentions for how the references
relate to each other.
The lifetime of announcement is not as long as &self, so it is not correct to associate the output lifetime to &self
Why is the third rule of lifetime elision a valid way to assign output lifetime?
"Correct" is probably not the right word to use here. What the elision rules have done is valid; it just doesn't happen to be what you wanted.
shouldn't the output lifetime be associated to the longer of the input?
Yes, that would be acceptable for this example, it's just not the most common case, so it's not what the elision rules were aimed to do.
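When the default picked by elision is not the relationship you want, you can state your own. A sketch, assuming the intent is for the return value to borrow from announcement rather than from &self (the lifetime name 'b and the function `announce_demo` are hypothetical):

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    // 'b is named explicitly, so the elision default (`&self`) is not used.
    fn announce_and_return_part<'b>(&self, announcement: &'b str) -> &'b str {
        println!("Attention please: {} ({})", announcement, self.part);
        announcement
    }
}

// Hypothetical caller: the excerpt is dropped before the result is used.
fn announce_demo() -> String {
    let announcement = String::from("xyz");
    let returned;
    {
        let excerpt = ImportantExcerpt { part: "IAOJSDI" };
        returned = excerpt.announce_and_return_part(&announcement);
    } // `excerpt` is gone; `returned` borrows only from `announcement`
    returned.to_string()
}
```

With this signature the compiler checks the body against the stated relationship, so returning `self.part` instead would be rejected.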
See also:
Why are explicit lifetimes needed in Rust?
When do I need to specify explicit lifetimes in Rust?
When is it useful to define multiple lifetimes in a struct?
Why would you ever use the same lifetimes for references in a struct?
I'm trying to wrap my head around Rust lifetimes (as the official guides don't really explain them that well).
Do rust lifetimes only refer to references, or can they refer to base/primitive values as well?
Lifetimes are the link between values and references to said values.
In order to understand this link, I will use a broken parallel: houses and addresses.
A house is a physical entity. It is built on a piece of land at some time, will live for a few dozen or hundred years, may be renovated multiple times during this time, and will most likely be destroyed at some point.
An address is a logical entity, it may point to a house, or to other physical entities (a field, a school, a train station, a company's HQ, ...).
The lifetime of a house is relatively clear: it represents the duration during which a house is usable, from the moment it is built to the moment it is destroyed. The house may undergo several renovations during this time, and what used to be a simple cabana may end up being a full-fledged manor, but that is of no concern to us; for our purpose the house is living throughout those transformations. Only its creation and ultimate destruction matter... even though it might be better if no one happens to be in the bedroom when we tear the roof down.
Now, imagine that you are a real estate agent. You do not keep the houses you sell in your office, it's impractical; you do, however, keep their addresses!
Without the notion of lifetime, from time to time your customers will complain because the address you sent them to... was the address of a garbage dump, and not at all that lovely two-story house you had the photograph of. You might also get a couple of inquiries from the police station asking why people holding a booklet from your office were found in a just-destroyed house; the ensuing lawsuit might shut down your business.
This is obviously a risk to your business, and therefore you should seek a better solution. What if each address could be tagged with the lifetime of the house it refers to, so that you know not to send people to their death (or disappointment) ?
You may have recognized the C manual memory management strategy in that garbage dump; in C it's up to you, the real estate agent developer, to make sure that your addresses (pointers/references) always refer to living houses.
In Rust, however, the references are tagged with a special marker: 'enough; it represents a lower bound on the lifetime of the value referred to.
When the compiler checks whether your usage of the reference is safe or not, it asks the question:
Is the value still alive?
It does not matter whether the value will be there for 100 years afterward, as long as it lives long 'enough for the use you have of it.
No, they refer to values as well. If it is not clear from the context how long they will live, they have to be annotated as well. It is then called a lifetime bound.
In the following example it is necessary to specify that the value, the reference is referring to, lives at least as long as the reference itself:
use std::num::Primitive;
struct Foo<'a, T: Primitive + 'a> {
a: &'a T
}
Try deleting the + 'a and the compiler will complain. This is required since T could be anything implementing Primitive.
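Note that std::num::Primitive comes from an early version of Rust and is not available in the current standard library. The following is a sketch of the same shape for a current compiler, with `Copy` swapped in as a stand-in bound (that substitution and the `get` method are assumptions, not the original author's code). As far as I know, recent compilers can infer the `'a` bound on struct definitions, so it is written out here only to make the relationship visible.

```rust
// `Copy` is a stand-in for the removed `std::num::Primitive` trait.
// `T: 'a` says: any references inside T must outlive 'a.
struct Foo<'a, T: Copy + 'a> {
    a: &'a T,
}

impl<'a, T: Copy + 'a> Foo<'a, T> {
    // Hypothetical accessor that copies the referenced value out.
    fn get(&self) -> T {
        *self.a
    }
}
```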
Yes, they only refer to references, however those references can refer to primitive types. Rust is not like Java (and similar languages) that make a distinction between primitive types, which are passed by value, and more complex types (Objects in Java) that are passed by reference. Complex types can be allocated on the stack and passed by value, and references can be taken to primitive types.
For example, here is a function that takes two references to i32's, and returns a reference to the larger one:
fn bigger<'a>(a: &'a i32, b: &'a i32) -> &'a i32 {
if a > b { a } else { b }
}
It uses the lifetime 'a to communicate that the lifetime of the returned reference is the same as that of the references passed in.
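A sketch of what this means for a caller, using a hypothetical `bigger_demo` function: since both arguments share 'a, the result is only usable while both x and y are alive, so the value is copied out before y goes out of scope.

```rust
fn bigger<'a>(a: &'a i32, b: &'a i32) -> &'a i32 {
    if a > b { a } else { b }
}

// Hypothetical caller: 'a covers the region where both borrows are valid.
fn bigger_demo() -> i32 {
    let x = 5;
    let max;
    {
        let y = 9;
        max = *bigger(&x, &y); // copy the i32 out of the reference
    } // `y` is dropped here; `max` is a plain i32, so this is fine
    max
}
```

Keeping the returned reference itself past the inner block would be expected to fail to compile, because it might point at `y`.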
When you see a lifetime annotation (e.g. 'a) in the code, there's almost always a reference, or borrowed pointer, involved.
The full syntax for borrowed pointers is &'a T. 'a is the lifetime of the referent. T is the type of the referent.
Structs and enums can have lifetime parameters. This is usually a consequence of the struct or enum containing a borrowed pointer. When you store a borrowed pointer in a struct or enum, you must explicitly state the referent's lifetime. For example, the Cow enum in the standard library contains a borrowed pointer in one of its variants. Therefore, it has a lifetime parameter that is used in the borrowed pointer's type to define the referent's lifetime.
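A small sketch of the Cow case, using a hypothetical `strip_spaces` function: the lifetime parameter on `Cow<'a, str>` is the lifetime of the data borrowed by the `Borrowed` variant, while the `Owned` variant borrows nothing.

```rust
use std::borrow::Cow;

// Returns the input unchanged (borrowed) when there is nothing to strip,
// and allocates a new String only when a space has to be removed.
fn strip_spaces<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', ""))
    } else {
        Cow::Borrowed(input)
    }
}
```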
Traits can have type bounds and also a lifetime bound. The lifetime bound indicates the largest region in which all the borrowed pointers in a concrete implementation of that trait are valid (i.e. their referents are alive). If the implementation contains no borrowed pointers, then the lifetime is inferred as 'static. Lifetime bounds can appear in type parameter definitions, in where clauses and on trait objects.
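A sketch of a lifetime bound on a trait object in current syntax. The trait `Describe` and the types `Label` and `Counter` are hypothetical. One implementation holds a borrowed pointer and needs a bound; the other holds none, so the default 'static bound of a boxed trait object applies.

```rust
trait Describe {
    fn describe(&self) -> String;
}

struct Label<'a>(&'a str); // holds a borrowed pointer
struct Counter(u32); // holds no borrowed pointers

impl<'a> Describe for Label<'a> {
    fn describe(&self) -> String {
        format!("label {}", self.0)
    }
}

impl Describe for Counter {
    fn describe(&self) -> String {
        format!("counter {}", self.0)
    }
}

// The trait object is only valid while `text` is alive, hence `+ 'a`.
fn boxed_label<'a>(text: &'a str) -> Box<dyn Describe + 'a> {
    Box::new(Label(text))
}

// No bound written: `Box<dyn Describe>` means `Box<dyn Describe + 'static>`.
fn boxed_counter(n: u32) -> Box<dyn Describe> {
    Box::new(Counter(n))
}
```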
Sometimes, you might want to define a struct or enum with a lifetime parameter, but without a corresponding value to borrow from. You can use a marker type, such as ContravariantLifetime<'a>, to ensure the lifetime parameter has the proper variance (ContravariantLifetime corresponds to the variance of borrowed pointers; without a marker, the lifetime would be bivariant, which means the lifetime could be substituted with any other lifetime... not very useful!). See an example of this use case here.
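ContravariantLifetime dates from early Rust; in current Rust the role of such marker types is filled by std::marker::PhantomData. A sketch with that swapped-in type, where `Arena` and `Handle` are hypothetical names: the handle stores only an index, but the marker ties it to a borrow of the arena it came from.

```rust
use std::marker::PhantomData;

struct Arena {
    items: Vec<String>,
}

// Carries the lifetime 'a without storing an actual reference.
struct Handle<'a> {
    index: usize,
    _owner: PhantomData<&'a Arena>,
}

impl Arena {
    // The returned handle is tied to this borrow of the arena.
    fn handle(&self, index: usize) -> Handle<'_> {
        Handle { index, _owner: PhantomData }
    }

    // Looks up the item a handle points at, if the index is in range.
    fn get(&self, handle: &Handle<'_>) -> Option<&str> {
        self.items.get(handle.index).map(|s| s.as_str())
    }
}
```

Using `PhantomData<&'a Arena>` makes the struct behave, for variance and borrow checking, as if it held a `&'a Arena`.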