I was reading the code of the anyhow crate in Rust. There is a particular piece of code I don't fully grasp:
{
let vtable = &ErrorVTable { ... };
construct(vtable, ...);
}
fn construct(vtable: &'static ErrorVTable, ...);
We seem to create an ErrorVTable struct and pass a reference to it which has the 'static lifetime. I'd expect the compiler to create the struct on the function's stack, so a reference to it escaping the function should cause weird memory issues.
But it appears that the compiler detects this variable for every possible E inferred at compile time and somehow creates static variables for them? How does this actually work?
Consider this simplified code:
struct Foo {
x: i32
}
fn test(_: &'static Foo) {}
fn main() {
let f = Foo{ x: 42 };
test(&f);
}
As expected, it does not compile, with the message:
f does not live long enough
However, this slight variation does compile:
fn main() {
let f = &Foo{ x: 42 };
test(f);
}
The difference is that in the former, the Foo object is local, with a local lifetime, so no 'static reference can be built. But in the latter, the actual object is a static constant so it has static lifetime, and f is just a reference to it.
To help see the difference, consider this other equivalent code:
const F: Foo = Foo{ x: 42 };
fn main() {
test(&F);
}
Or if you use an actual constant literal:
fn test_2(_: &'static i32) {}
fn main() {
let i = &42;
test_2(i);
}
Naturally, this only works if all the arguments of the Foo construction are constant. If any value is not constant, then the compiler will silently switch to a local temporary instead of a static constant and you will lose the 'static lifetime.
The precise rules for this constant promotion, as it is sometimes called, are a bit complicated and may be extended in newer compiler versions.
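To illustrate the point about non-constant values, here is a minimal sketch (the names n, ok and f are just illustrative): with constant fields the temporary is promoted as before, but with a runtime value it is not, and the commented-out call is rejected with a lifetime error.
struct Foo {
    x: i32
}

fn test(_: &'static Foo) {}

fn main() {
    // Still fine: all fields are constants, so the temporary is promoted.
    let ok = &Foo { x: 42 };
    test(ok);

    let n = std::env::args().count() as i32; // a runtime value, not a compile-time constant
    let f = &Foo { x: n };                   // borrows a local temporary, no promotion happens
    // test(f);                              // error: `f` does not live long enough
    println!("{}", f.x);
}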
Related
Consider the following Rust code:
use std::thread;
fn main() {
bar();
loop {}
}
fn bar() {
let var = b"foo";
thread::spawn(|| {
write(var);
});
}
fn write(d: &[u8]) {
println!("{:?}", d)
}
To my understanding, var is on the stack of function bar, which no longer exists after it returns.
Still, the new thread accesses it afterwards and successfully writes the data.
Why does this work?
Why doesn't the Rust compiler complain?
To my understanding, var is on the stack of function bar, which no longer exists after it returns.
var is just a reference with type &'static [u8; 3]. This reference value is what is on the stack, not the string literal.
The owner of the byte string literal b"foo" is the program binary, which results in the string literal having a 'static lifetime because it exists for the entire lifetime of the running program.
The b"foo" value doesn't live on the stack actually. It is stored in the read-only memory of the compiled binary and has a 'static lifetime.
Consider this alternative example:
fn bar() {
let var = format!("foo");
thread::spawn(|| {
write(&var);
});
}
fn write(d: &str) {
println!("{:?}", d)
}
That won't work (unless you add move before the closure), because var (of type String) lives on the stack of bar and is dropped when bar returns.
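For completeness, a minimal sketch of the move variant; the sleep at the end is only there so the spawned thread gets a chance to run, a real program would keep the JoinHandle and join it:
use std::thread;
use std::time::Duration;

fn bar() {
    let var = format!("foo");
    // `move` transfers ownership of the String into the closure, so the
    // spawned thread owns it and no borrow of bar's stack frame escapes.
    thread::spawn(move || {
        write(&var);
    });
}

fn write(d: &str) {
    println!("{:?}", d)
}

fn main() {
    bar();
    thread::sleep(Duration::from_millis(100));
}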
I have a Config struct to store the global config. The config needs to be passed to different structs (abstract layers). For config consistency and saving memory, structs store a reference to the global config instead of a copy.
Then I have the implementation like the following: Playground
/// Global Config
#[derive(Debug)]
struct Config {
pub version: String,
}
/// Layer A
#[derive(Debug)]
struct A<'cfg> {
id: u32,
config: &'cfg Config,
}
/// Layer B
#[derive(Debug)]
struct B<'cfg> {
a_id: u32,
b_id: u32,
config: &'cfg Config,
}
impl<'cfg> A<'cfg> {
pub fn new(id: u32, config: &Config) -> A {
A { id, config }
}
#[allow(unused)]
pub fn version(&self) {
println!("{}_a{}", self.config.version, self.id);
}
pub fn create_many_b(&self) -> Vec<B<'cfg>> { // <-- Removing the explicit `'cfg` lifetime for B here makes the compiler complain
let cfg = self.config;
let mut res = Vec::new();
for id in 0..10u32 {
res.push(B { a_id: self.id, b_id: id, config: cfg });
}
res
}
}
impl<'cfg> B<'cfg> {
pub fn version(&self) {
println!("{}_a{}+b{}", self.config.version, self.a_id, self.b_id);
}
}
fn create_many_a(config: &Config) -> Vec<A> {
let mut res = Vec::new();
for id in 0..5u32 {
res.push(A::new(id, config));
}
res
}
fn main() {
// Create a global config
let cfg = Config { version: "1.0".to_owned() };
println!("cfg pointer: {:p}", &cfg);
// Use the global config to create many `A`s
let a_vec = create_many_a(&cfg);
let mut b_vec = Vec::new();
// then create many `B`s
for a in a_vec.into_iter() {
b_vec.extend(a.create_many_b())
}
for b in b_vec.iter() {
b.version();
}
}
As my code comment says, the compiler rejects the build when the lifetime annotation is removed (playground). My question is why an explicit lifetime annotation for B is required in the function create_many_b. How does lifetime elision work in this example? Shouldn't the Bs created by an A via create_many_b have the same lifetime as the A, which already has the 'cfg lifetime from the Config?
With these formulations
pub fn create_many_b(&self) -> Vec<B>
fn create_many_a(config: &Config) -> Vec<A>
the compiler deduces that the lifetime information missing for the result is identical to the lifetime of the reference given as argument.
In the case of create_many_a() this is correct because the missing lifetime is exactly the lifetime of config (it's what we want).
On the other hand, in the case of create_many_b() we don't want the Bs to contain a reference whose lifetime is the lifetime of self (an A); we want the lifetime used by the config member inside this A.
Thus, writing
impl<'cfg> A<'cfg> {
...
pub fn create_many_b(&self) -> Vec<B<'cfg>>
clearly states that the lifetime information used for B is the same as the one used for A.
The A referenced by self can disappear; the reference inside the B is still valid because it does not depend on the A itself but on the Config the A depended on.
In the main() function
for a in a_vec.into_iter() {
introduces an A that disappears at each iteration (the As are consumed from a_vec), so the created Bs cannot last longer than one iteration if they depend on a, but they can if they depend on cfg, which still exists after the loop.
If you had written
for a in a_vec.iter() {
(still with the <'cfg> annotation missing) it would have been accepted by the compiler, but it would have been misleading, since the Bs inside b_vec would then depend on the As inside a_vec and not directly on cfg, which is what was intended in the definition of struct B<'cfg> { ....
With elided lifetimes, the compiler will infer the lifetimes to be:
pub fn create_many_b<'a>(&'a self) -> Vec<B<'a>>;
This means that the Bs in the vector may not outlive the caller's reference to self.
If you were borrowing from data that was owned by self, this would be the correct lifetime: the data would be invalid after self was dropped. But the data is borrowed from config which is not owned by A, so you can make it less restrictive by expressing in the lifetimes where the data actually comes from:
pub fn create_many_b(&self) -> Vec<B<'cfg>>;
Often you wouldn't notice the difference in these two signatures, but your usage of these types actually requires the more permissive lifetimes. In particular, a_vec.into_iter() consumes a. This conflicts with the inferred lifetime because you then use the Bs in the vector after a is consumed.
Using the less restrictive lifetime allows you to use the Bs as long as config has not been dropped, regardless of what happened to the A.
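As a stripped-down sketch of what the more permissive signature buys you (simplified names, and make_b is a hypothetical helper standing in for create_many_b): the B created here keeps borrowing the Config even after the A is gone.
struct Config {
    version: String,
}

struct A<'cfg> {
    config: &'cfg Config,
}

struct B<'cfg> {
    config: &'cfg Config,
}

impl<'cfg> A<'cfg> {
    // The result borrows the Config ('cfg), not the A itself.
    fn make_b(&self) -> B<'cfg> {
        B { config: self.config }
    }
}

fn main() {
    let cfg = Config { version: "1.0".to_owned() };
    let b = {
        let a = A { config: &cfg };
        a.make_b()
    }; // the A is dropped here...
    println!("{}", b.config.version); // ...but the B is still usable: it borrows `cfg`
}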
When you remove the lifetime annotation, the compiler will infer it by filling the gaps. Something like this:
pub fn create_many_b<'a>(&'a self) -> Vec<B<'a>> {
The compiler is assuming that each B will live as long as the reference to self.
When you call this function:
for a in a_vec.into_iter() {
b_vec.extend(a.create_many_b())
} <- "a" is dropped here
You would end up with a vector of Bs, each holding a reference whose lifetime is tied to a, which has already been dropped at this point.
So, basically, by removing the lifetime annotation you are telling the compiler to infer it, and it fills in the lifetime based on the &self reference the function receives, rather than on the 'cfg lifetime stored inside the A.
I hope this helps to understand the error message that you are receiving.
Here is a simplified version of what I want to achieve:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is obviously completely safe as foo.boo is None once string goes out of scope.
Is there a way to tell this to the compiler?
This is obviously completely safe
What is obvious to humans isn't always obvious to the compiler; sometimes the compiler isn't as smart as humans (but it's way more vigilant!).
In this case, your original code compiles when non-lexical lifetimes are enabled (they are the default since the 2018 edition):
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is only because foo is never used once it would be invalid (after string goes out of scope), not because you set the value to None. Trying to print out the value after the innermost scope would still result in an error.
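For instance, here is a sketch of that failing case on current Rust (NLL is the default now); the offending line is left commented out so the rest compiles:
struct Foo<'a> {
    boo: Option<&'a mut String>,
}

fn main() {
    let mut foo = Foo { boo: None };
    {
        let mut string = "Hello".to_string();
        foo.boo = Some(&mut string);
        foo.boo = None;
    } // string goes out of scope
    // Any later use of `foo` forces its lifetime parameter to outlive `string`:
    // println!("{}", foo.boo.is_some()); // error: `string` does not live long enough
}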
Is it possible to have a struct which contains a reference to a value which has a shorter lifetime than the struct?
The purpose of Rust's borrowing system is to ensure that things holding references do not live longer than the referred-to item.
After non-lexical lifetimes
Maybe, so long as you don't make use of the reference after it is no longer valid. This works, for example:
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
// This lives less than `foo`
let mut string1 = "Hello".to_string();
foo.boo = Some(&mut string1);
// This lives less than both `foo` and `string1`!
let mut string2 = "Goodbye".to_string();
foo.boo = Some(&mut string2);
}
Before non-lexical lifetimes
No. The borrow checker is not smart enough to tell that you cannot / don't use the reference after it would be invalid. It's overly conservative.
In this case, you are running into the fact that lifetimes are represented as part of the type. Said another way, the generic lifetime parameter 'a has been "filled in" with a concrete lifetime value covering the lines where string is alive. However, the lifetime of foo is longer than those lines, thus you get an error.
The compiler does not look at what actions your code takes; once it has seen that you parameterize it with that specific lifetime, that's what it is.
The usual fix I would reach for is to split the type into two parts, those that need the reference and those that don't:
struct FooCore {
size: i32,
}
struct Foo<'a> {
core: FooCore,
boo: &'a mut String,
}
fn main() {
let core = FooCore { size: 42 };
let core = {
let mut string = "Hello".to_string();
let foo = Foo { core, boo: &mut string };
foo.boo.push_str(", I am foo!");
foo.core
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Note how this removes the need for the Option — your types now tell you if the string is present or not.
An alternate solution would be to map the whole type when setting the string. In this case, we consume the whole variable and change the type by changing the lifetime:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
impl<'a> Foo<'a> {
fn set<'b>(self, boo: &'b mut String) -> Foo<'b> {
Foo { boo: Some(boo) }
}
fn unset(self) -> Foo<'static> {
Foo { boo: None }
}
}
fn main() {
let foo = Foo { boo: None };
let foo = {
let mut string = "Hello".to_string();
let mut foo = foo.set(&mut string);
foo.boo.as_mut().unwrap().push_str(", I am foo!");
foo.unset()
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Shepmaster's answer is completely correct: you can't express this with lifetimes, which are a compile time feature. But if you're trying to replicate something that would work in a managed language, you can use reference counting to enforce safety at run time.
(Safety in the usual Rust sense of memory safety. Panics and leaks are still possible in safe Rust; there are good reasons for this, but that's a topic for another question.)
Here's an example (playground). Rc pointers disallow mutation, so I had to add a layer of RefCell to imitate the code in the question.
use std::rc::{Rc,Weak};
use std::cell::RefCell;
struct Foo {
boo: Weak<RefCell<String>>,
}
fn main() {
let mut foo = Foo { boo: Weak::new() };
{
// create a string with a shorter lifetime than foo
let string = "Hello".to_string();
// move the string behind an Rc pointer
let rc1 = Rc::new(RefCell::new(string));
// weaken the pointer to store it in foo
foo.boo = Rc::downgrade(&rc1);
// accessing the string
let rc2 = foo.boo.upgrade().unwrap();
assert_eq!("Hello", *rc2.borrow());
// mutating the string
let rc3 = foo.boo.upgrade().unwrap();
rc3.borrow_mut().push_str(", I am foo!");
assert_eq!("Hello, I am foo!", *rc3.borrow());
} // rc1, rc2 and rc3 go out of scope and string is automatically dropped.
// foo.boo now refers to a dropped value and cannot be upgraded anymore.
assert!(foo.boo.upgrade().is_none());
}
Notice that I didn't have to reassign foo.boo before string went out of scope, like in your example -- the Weak pointer is automatically marked invalid when the last extant Rc pointer is dropped. This is one way in which Rust's type system still helps you enforce memory safety even after dropping the strong compile-time guarantees of shared & pointers.
How do I get over something like this:
struct Test {
foo: Option<fn()>
}
impl Test {
fn new(&mut self) {
self.foo = Option::Some(self.a);
}
fn a(&self) { /* can use Test */ }
}
I get this error:
error: attempted to take value of method `a` on type `&mut Test`
--> src/main.rs:7:36
|
7 | self.foo = Option::Some(self.a);
| ^
|
= help: maybe a `()` to call it is missing? If not, try an anonymous function
How do I pass a function pointer from a trait? Similar to what would happen in this case:
impl Test {
fn new(&mut self) {
self.foo = Option::Some(a);
}
}
fn a() { /* can't use Test */ }
What you're trying to do here is get a function pointer from a (to use Python terminology here, since Rust doesn't have a word for this) bound method. You can't.
Firstly, because Rust doesn't have a concept of "bound" methods; that is, you can't refer to a method with the invocant (the thing on the left of the .) already bound in place. If you want to construct a callable which approximates this, you'd use a closure; i.e. || self.a().
However, this still wouldn't work because closures aren't function pointers. There is no "base type" for callable things like in some other languages. Function pointers are a single, specific kind of callable; closures are completely different. Instead, there are traits which (when implemented) make a type callable. They are Fn, FnMut, and FnOnce. Because they are traits, you can't use them as types, and must instead use them from behind some layer of indirection, such as Box<FnOnce()> or &mut FnMut(i32) -> String.
Now, you could change Test to store an Option<Box<Fn()>> instead, but that still wouldn't help. That's because of the other, other problem: you're trying to store a reference to the struct inside of itself. This is not going to work well. If you manage to do this, you effectively render the Test value permanently unusable. More likely is that the compiler just won't let you get that far.
Aside: you can do it, but not without resorting to reference counting and dynamic borrow checking, which is out of scope here.
So the answer to your question as-asked is: you don't.
Let's change the question: instead of trying to crowbar a self-referential closure in, we can instead store a callable that doesn't attempt to capture the invocant at all.
struct Test {
foo: Option<Box<Fn(&Test)>>,
}
impl Test {
fn new() -> Test {
Test {
foo: Option::Some(Box::new(Self::a)),
}
}
fn a(&self) { /* can use Test */ }
fn invoke(&self) {
if let Some(f) = self.foo.as_ref() {
f(self);
}
}
}
fn main() {
let t = Test::new();
t.invoke();
}
The callable being stored is now a function that takes the invocant explicitly, side-stepping the issues with cyclic references. We can use this to store Test::a directly, by referring to it as a free function. Also note that because Test is the implementation type, I can also refer to it as Self.
Aside: I've also corrected your Test::new function. Rust doesn't have constructors, just functions that return values like any other.
If you're confident you will never want to store a closure in foo, you can replace Box<Fn(&Test)> with fn(&Test) instead. This limits you to function pointers, but avoids the extra allocation.
If you haven't already, I strongly urge you to read the Rust Book.
There are a few mistakes in your code. The new function should not (by convention) take a self reference, since it is expected to create the Self type.
But the real issue is that Test::foo expects a function of type fn(), while Test::a's type is fn(&Test) (i.e. fn a(&self)). If you change the type of foo to fn(&Test), it will work. You also need to refer to the function through the type name rather than through self: instead of assigning self.a, assign Test::a.
Here is the working version:
struct Test {
foo: Option<fn(&Test)>
}
impl Test {
fn new() -> Test {
Test {
foo: Some(Test::a)
}
}
fn a(&self) {
println!("a run!");
}
}
fn main() {
let test = Test::new();
test.foo.unwrap()(&test);
}
Also, if you are going to assign the field in the new() function and the value will always be set, then there is no need for the Option; instead it can look like this:
struct Test {
foo: fn(&Test)
}
impl Test {
fn new() -> Test {
Test {
foo: Test::a
}
}
fn a(&self) {
println!("a run!");
}
}
fn main() {
let test = Test::new();
(test.foo)(&test); // Make sure the parentheses are there
}
I'm trying to select a function to call depending on a condition. I want to store that function in a variable so that I can call it again later without carrying the condition around. Here's a working minimal example:
fn foo() {
println! ("Foo");
}
fn bar() {
println! ("Bar");
}
fn main() {
let selector = 0;
let foo: &Fn() = &foo;
let bar: &Fn() = &bar;
let test = match selector {
0 => foo,
_ => bar
};
test();
}
My question is: is it possible to get rid of the intermediate variables? I've tried simply removing them:
fn foo() {
println! ("Foo");
}
fn bar() {
println! ("Bar");
}
fn main() {
let selector = 0;
let test = match selector {
0 => &foo as &Fn(),
_ => &bar as &Fn()
};
test();
}
but then the borrow checker complains that the borrowed values are only valid until the end of the match (by the way, why? the functions are 'static anyway, so they should be valid until the end of time). I've also tried making the 'static lifetime explicit by using &foo as &'static Fn(), but that doesn't work either.
The following works, if you only need to work with static functions and not closures:
fn foo() {
println!("Foo");
}
fn bar() {
println!("Bar");
}
fn main() {
let selector = 0;
let test: fn() = match selector {
0 => foo,
_ => bar
};
test();
}
(try on playground)
Here I've used function type instead of function trait.
The reason the borrowed trait object doesn't work is probably the following. Any trait object is a fat pointer consisting of a pointer to some value and a pointer to a virtual table. When the trait object is created from a closure, everything is clear: the value is the closure itself (internally an instance of a structure containing all the captured variables), and the virtual table contains a pointer to the compiler-generated implementation of the corresponding Fn*() trait, whose body is the closure body.
With functions, however, things are not so clear. There is no value to create a trait object from, because the function itself should correspond to the implementation of the Fn() trait. Therefore, rustc probably generates an empty structure and implements Fn() for it, with that implementation calling the static function directly (not actual Rust, but something close):
struct SomeGeneratedStructFoo;
impl Fn<()> for SomeGeneratedStructFoo {
type Output = ();
fn call(&self, args: ()) -> () {
foo();
}
}
Therefore, when a trait object is created out of fn foo(), a reference is taken in fact to a temporary value of type SomeGeneratedStructFoo. However, this value is created inside the match, and only a reference to it is returned from the match, thus this value does not live long enough, and that's what the error is about.
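As a small sanity check (not part of the original answer, just an assumption-free way to see the sizes involved): the function item itself carries no data, while the fn() pointer it coerces to is a real pointer.
fn foo() {}

fn main() {
    // The function item type of `foo` is zero-sized...
    assert_eq!(std::mem::size_of_val(&foo), 0);
    // ...while the fn() pointer type it coerces to is pointer-sized.
    assert_eq!(std::mem::size_of::<fn()>(), std::mem::size_of::<usize>());
}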
fn() is a function pointer type; it is already a pointer, not a zero-sized type. You can check this with std::mem::size_of::<fn()>().
When you do &foo, you take a pointer to a stack-allocated pointer. That stack-allocated temporary does not survive very long, causing the error.
You can cast these to the generic fn() type as suggested. I would be interested in knowing why you can't cast fn() to &Fn(), though.
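For reference, here is a sketch of the inline coercion with the same foo and bar as above; it avoids both the intermediate variables and the short-lived trait-object temporaries.
fn foo() {
    println!("Foo");
}

fn bar() {
    println!("Bar");
}

fn main() {
    let selector = 0;
    // Each arm coerces the zero-sized function item to the fn() pointer type,
    // so no temporary trait object is borrowed across the match.
    let test = match selector {
        0 => foo as fn(),
        _ => bar as fn(),
    };
    test();
}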