Does the Drop trait in Rust modify scoping rules? - rust

I'll use the CustomSmartPointer from The Book, which is used to explain the Drop trait, to build an example:
struct CustomSmartPointer {
data: String,
}
impl Drop for CustomSmartPointer {
fn drop(&mut self) {
println!("Dropping CustomSmartPointer with data `{}`!", self.data);
}
}
fn main() {
println!("A");
let pointer = CustomSmartPointer {
data: String::from("my stuff"),
};
println!("B");
println!("output: {}", pointer.data);
println!("C");
}
This prints:
A
B
output: my stuff
C
Dropping CustomSmartPointer with data `my stuff`!
However, from what I learned I would have expected the last two lines to be swapped. The local variable pointer isn't used anymore after being printed, so I would have expected its scope to end after the line that prints its contents, before "C" gets printed. For example, the section on references has this example:
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
println!("{} and {}", r1, r2);
// variables r1 and r2 will not be used after this point
let r3 = &mut s; // no problem
println!("{}", r3);
So it seems like either (1) the Drop trait extends the scope until the end of the current block, or (2) it is only references whose scope ends "ASAP" and everything else already lives until the end of the current block (and that would be a major difference between references and smart pointers), or (3) something else entirely is going on here. Which one is it?
Edit: another example
I realized one thing while reading the answers so far: that (1) and (2) are equivalent if (1) includes cases where a type does not implement the Drop trait but "things happen while dropping", which is the case e.g. for a struct that contains another value that implements Drop. I tried this and it is indeed the case (using the CustomSmartPointer from above):
struct Wrapper {
csp: CustomSmartPointer,
}
fn create_wrapper() -> Wrapper {
let pointer = CustomSmartPointer {
data: String::from("my stuff"),
};
Wrapper {
csp: pointer,
}
}
fn main() {
println!("A");
let wrapper = create_wrapper();
println!("B");
println!("output: {}", wrapper.csp.data);
println!("C");
}
This still prints "Dropping CSP" last, after "C", so even a non-Drop wrapper struct that contains a value with the Drop trait has lexical scope. As hinted above, you could then equivalently say: the Drop-able value inside the struct causes a usage at the end of the block that causes the whole wrapper to be dropped only at the end of the block, or you could say that only references have NLL. The difference between the two statements is only about when a value gets dropped that is deeply free of Drop-trait values, which isn't observable.

Do not look at whether a type implements Drop; look at the return value of std::mem::needs_drop::<T>() instead.
That said, it is option (1): a value whose type needs drop has an implicit call to drop() at the end of its lexical scope. It is this call that extends the scope of the value.
So your code behaves as if:
fn main() {
println!("A");
let wrapper = create_wrapper();
println!("B");
println!("output: {}", wrapper.csp.data);
println!("C");
drop(wrapper); // <- implicitly called, wrapper scope ends here
}
Naturally, you can call drop(wrapper) anywhere to end the scope prematurely. As drop() takes its argument by value, it finishes the scope there.
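To make the effect of an explicit drop() visible, here is a minimal sketch. Noisy and drop_order are hypothetical names made up for this illustration (they are not from the question); each Noisy value writes to a shared log when it is dropped, so the order of events can be inspected afterwards:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type: records a message in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the recorded events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let early = Noisy { name: "early", log: Rc::clone(&log) };
        let _late = Noisy { name: "late", log: Rc::clone(&log) };
        log.borrow_mut().push("B".to_string());
        drop(early); // `early` is moved into drop(), so its destructor runs right here
        log.borrow_mut().push("C".to_string());
    } // `_late` is dropped here, at the end of the block
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

The explicitly dropped value is logged between "B" and "C", while the other one is only logged after "C", when the block ends.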
If the type of a value does not need drop, nothing runs for it at the end of the block, so the value is effectively finished at its last use. That is a non-lexical lifetime (NLL).
Non-lexical lifetimes affect not only references but any type that doesn't need drop. The thing is that if a value doesn't need drop and doesn't borrow anything, then its scope has no visible effect and nobody cares.
For example, this code relies on NLL for a value that is technically not a reference:
use std::marker::PhantomData;
#[derive(Debug)]
struct Foo<'a> { _pd: PhantomData<&'a ()> }
impl<'a> Foo<'a> {
fn new<T>(x: &'a mut T) -> Foo<'a> {
Foo { _pd: PhantomData }
}
}
fn main() {
let mut x = 42;
let f1 = Foo::new(&mut x);
let f2 = Foo::new(&mut x);
//dbg!(&f1); // uncomment this line and it will fail to compile
}
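Coming back to the needs_drop check from the start of this answer, here is a small sketch that asks it about the types from the question. Plain and report are hypothetical names added for the illustration. Note that the standard library documents needs_drop as a hint that may conservatively return true, so the false results below are what I would expect in practice rather than a hard guarantee:

```rust
use std::mem::needs_drop;

struct CustomSmartPointer {
    data: String,
}

impl Drop for CustomSmartPointer {
    fn drop(&mut self) {
        println!("Dropping CustomSmartPointer with data `{}`!", self.data);
    }
}

// No Drop impl of its own, but it contains a field that has one.
struct Wrapper {
    csp: CustomSmartPointer,
}

// Hypothetical type with nothing to clean up.
struct Plain {
    n: i32,
}

// Reports needs_drop for: CustomSmartPointer, Wrapper, Plain, &String.
fn report() -> [bool; 4] {
    [
        needs_drop::<CustomSmartPointer>(),
        needs_drop::<Wrapper>(),
        needs_drop::<Plain>(),
        needs_drop::<&String>(),
    ]
}

fn main() {
    println!("{:?}", report());
}
```

Wrapper needs drop even though it has no Drop impl itself, which matches the observation in the question's edit.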

It's option two. Implementing the Drop trait means that additional actions will happen when the object is dropped. But everything is dropped at some point, whether it implements Drop or not.

println! does not take ownership of your pointer; it only borrows it.
However, from what I learned I would have expected the last two lines to be swapped. The local variable pointer isn't used anymore after being printed, so I would have expected its scope to end after the line that prints its contents, before "C" gets printed. For example, the section on references has this example:
let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
println!("{} and {}", r1, r2);
// variables r1 and r2 will not be used after this point
let r3 = &mut s; // no problem
println!("{}", r3);
The thing here is that s is never moved. The borrows held by r1 and r2 end after their last use, which is why r3 can then take a mutable reference to s.
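The same snippet can be wrapped in a function to check that s stays usable throughout. This is only a sketch: shared_then_mutable is a hypothetical name, and the push_str call is added to show that the mutable borrow really is usable once the shared ones are finished:

```rust
// Hypothetical wrapper around the example from the references chapter.
fn shared_then_mutable() -> String {
    let mut s = String::from("hello");
    let r1 = &s;
    let r2 = &s;
    let joined = format!("{} and {}", r1, r2);
    // r1 and r2 are not used past this point, so their borrows have ended.
    let r3 = &mut s;
    r3.push_str(" world");
    format!("{} / {}", joined, r3)
}

fn main() {
    println!("{}", shared_then_mutable());
}
```

At no point is s moved or dropped early; only the borrows come and go, and s itself lives until the end of the function.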

Related

"Temporary value dropped while borrowed" when using string.replace()

Can someone explain which exact temporary value is dropped and what the recommended way to do this operation is?
fn main() {
let mut a = &mut String::from("Hello Ownership");
a = &mut a.replace("Ownership", "World");
println!("a is {}", a);
}
If you want to keep the &mut references (which are generally not needed in your case, of course), you can do something like this:
fn main() {
let a = &mut String::from("Hello Ownership");
let a = &mut a.replace("Ownership", "World");
println!("a is {}", a);
}
The type of a would be &mut String. In the second line we do what's known as variable shadowing (not that it's needed) and the type is still &mut String.
That doesn't quite answer your question. I don't know why exactly your version doesn't compile, but at least I thought this info might be useful. (see below)
Update
Thanks to Solomon's findings, I wanted to add that apparently in this case:
let a = &mut ...;
let b = &mut ...;
or this one (variable shadowing, basically the same as the above):
let a = &mut ...;
let a = &mut ...;
, the compiler will automatically extend the lifetime of each temporary until the end of the enclosing block. However, in the case of:
let mut a = &mut ...;
a = &mut ...;
, it seems the compiler simply doesn't do such lifetime extension, so that's why the OP's code doesn't compile, even though the code seems to be doing pretty much the same thing.
Why are you using &mut there? Try this:
fn main() {
let mut a = String::from("Hello Ownership");
a = a.replace("Ownership", "World");
println!("a is {}", a);
}
Aha, figured it out!
https://doc.rust-lang.org/nightly/error-index.html#E0716 says:
Temporaries are not always dropped at the end of the enclosing statement. In simple cases where the & expression is immediately stored into a variable, the compiler will automatically extend the lifetime of the temporary until the end of the enclosing block. Therefore, an alternative way to fix the original program is to write let tmp = &foo() and not let tmp = foo():
fn foo() -> i32 { 22 }
fn bar(x: &i32) -> &i32 { x }
let value = &foo();
let p = bar(value);
let q = *p;
Here, we are still borrowing foo(), but as the borrow is assigned directly into a variable, the temporary will not be dropped until the end of the enclosing block. Similar rules apply when temporaries are stored into aggregate structures like a tuple or struct:
// Here, two temporaries are created, but
// as they are stored directly into `value`,
// they are not dropped until the end of the
// enclosing block.
fn foo() -> i32 { 22 }
let value = (&foo(), &foo());
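Putting the answers together: if you really want to keep a &mut binding and reassign it, one option is to give the result of replace() a named owner before borrowing it, so nothing depends on temporary lifetime extension. A minimal sketch, where the function name replaced and the extra push are only there for illustration:

```rust
// Hypothetical helper showing the reassignment with a named owner.
fn replaced() -> String {
    // The temporary String here is kept alive because it is borrowed directly in a `let`.
    let mut a = &mut String::from("Hello Ownership");
    // Give the new String a named owner so it outlives the borrow below.
    let mut owner = a.replace("Ownership", "World");
    a = &mut owner;
    a.push('!');
    a.clone()
}

fn main() {
    println!("a is {}", replaced());
}
```

Here the assignment borrows a named variable rather than a temporary, so there is no value that gets dropped at the end of the statement.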

Can I have a mutable reference to a type and its trait object in the same scope? [duplicate]

Why can I have multiple mutable references to a static type in the same scope?
My code:
static mut CURSOR: Option<B> = None;
struct B {
pub field: u16,
}
impl B {
pub fn new(value: u16) -> B {
B { field: value }
}
}
struct A;
impl A {
pub fn get_b(&mut self) -> &'static mut B {
unsafe {
match CURSOR {
Some(ref mut cursor) => cursor,
None => {
CURSOR = Some(B::new(10));
self.get_b()
}
}
}
}
}
fn main() {
// first creation of A, get a mutable reference to b and change its field.
let mut a = A {};
let mut b = a.get_b();
b.field = 15;
println!("{}", b.field);
// Second creation of A, get a mutable reference to b and change its field.
let mut a_1 = A {};
let mut b_1 = a_1.get_b();
b_1.field = 16;
println!("{}", b_1.field);
// Third creation of A, get a mutable reference to b and change its field.
let mut a_2 = A {};
let b_2 = a_2.get_b();
b_2.field = 17;
println!("{}", b_1.field);
// now I can change them all
b.field = 1;
b_1.field = 2;
b_2.field = 3;
}
I am aware of the borrowing rules. At any given time you can have either:
one or more references (&T) to a resource, or
exactly one mutable reference (&mut T).
In the above code, I have a struct A with the get_b() method for returning a mutable reference to B. With this reference, I can mutate the fields of struct B.
The strange thing is that more than one mutable reference can be created in the same scope (b, b_1, b_2) and I can use all of them to modify B.
Why can I have multiple mutable references with the 'static lifetime shown in main()?
My attempt at explaining this behavior is that I am returning a mutable reference with a 'static lifetime, so every time I call get_b() it returns the same mutable reference, and in the end it is just one identical reference. Is this thought right? Why am I able to use all of the mutable references obtained from get_b() individually?
There is only one reason for this: you have lied to the compiler. You are misusing unsafe code and have violated Rust's core tenet about mutable aliasing. You state that you are aware of the borrowing rules, but then you go out of your way to break them!
unsafe code gives you a small set of extra abilities, but in exchange you are now responsible for avoiding every possible kind of undefined behavior. Multiple mutable aliases are undefined behavior.
The fact that there's a static involved is completely orthogonal to the problem. You can create multiple mutable references to anything (or nothing) with whatever lifetime you care about:
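fn foo() -> (&'static mut i32, &'static mut i32, &'static mut i32) {
let somewhere = 0x42 as *mut i32;
// Undefined behavior if ever called: three aliasing mutable references to an arbitrary address.
unsafe { (&mut *somewhere, &mut *somewhere, &mut *somewhere) }
}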
In your original code, you state that calling get_b is safe for anyone to do any number of times. This is not true. The entire function should be marked unsafe, along with copious documentation about what is and is not allowed to prevent triggering unsafety. Any unsafe block should then have corresponding comments explaining why that specific usage doesn't break the rules needed. All of this makes creating and using unsafe code more tedious than safe code, but compared to C where every line of code is conceptually unsafe, it's still a lot better.
You should only use unsafe code when you know better than the compiler. For most people in most cases, there is very little reason to create unsafe code.
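For comparison, here is one possible way to get a lazily created, shared B without any unsafe code. This is not what the question or this answer proposes, just a sketch of a safe alternative: the Mutex-guarded static and the with_b helper are my own assumptions, and a const Mutex::new in a static requires Rust 1.63 or later.

```rust
use std::sync::Mutex;

struct B {
    field: u16,
}

// Assumption: a Mutex-guarded static stands in for `static mut CURSOR`.
static CURSOR: Mutex<Option<B>> = Mutex::new(None);

// Hypothetical helper: runs `f` with exclusive access to the shared B,
// creating it with field = 10 on first use.
fn with_b<R>(f: impl FnOnce(&mut B) -> R) -> R {
    let mut guard = CURSOR.lock().unwrap();
    let b = guard.get_or_insert_with(|| B { field: 10 });
    f(b)
}

fn main() {
    with_b(|b| b.field = 15);
    println!("{}", with_b(|b| b.field));
}
```

Because the &mut B only exists inside the closure while the lock is held, two mutable references to the same B can never be live at the same time, which is exactly the guarantee the original get_b() broke.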

What happens to the ownership of a value returned but not assigned by the calling function?

Consider the following Rust code, slightly modified from examples in The Book.
I'm trying to understand what happens to the value in the second running of function dangle() in the main() function (see comment). I would imagine that because the value isn't assigned to any owner, it gets deallocated, but I've so far failed to find information to confirm that. Otherwise, I would think that calling dangle() repeatedly would constantly allocate more memory without deallocating it. Which is it?
fn main() {
// Ownership of dangle()'s return value is passed to the variable `thingamabob`.
let thingamabob = dangle();
// No ownership specified. Is the return value deallocated here?
dangle();
println!("Ref: {}", thingamabob);
}
fn dangle() -> String {
// Ownership specified.
let s = String::from("hello");
// Ownership is passed to calling function.
s
}
When a value has no owner (is not bound to a variable) it goes out of scope. Values that go out of scope are dropped. Dropping a value frees the resources associated with that value.
Anything less would lead to memory leaks, which would be a poor idea in a programming language.
See also:
Is it possible in Rust to delete an object before the end of scope?
How does Rust know whether to run the destructor during stack unwind?
Does Rust free up the memory of overwritten variables?
In your example, the second call creates an unnamed temporary value whose lifetime ends immediately after that one line of code, so it goes out of scope (and any resources are reclaimed) immediately.
If you bind the value to a name using let, then its lifetime extends until the end of the current lexical scope (closing curly brace).
You can explore some of this yourself by implementing the Drop trait on a simple type to see when its lifetime ends. Here's a small program I made to play with this (playground):
#[derive(Debug)]
struct Thing {
val: i32,
}
impl Thing {
fn new(val: i32) -> Self {
println!("Creating Thing #{}", val);
Thing { val }
}
fn foo(self, val: i32) -> Self {
Thing::new(val)
}
}
impl Drop for Thing {
fn drop(&mut self) {
println!("Dropping {:?}", self);
}
}
pub fn main() {
let _t1 = Thing::new(1);
Thing::new(2); // dropped immediately
{
let t3 = Thing::new(3);
Thing::new(4).foo(5).foo(6); // all are dropped, in order, as the next one is created
println!("Doing something with t3: {:?}", t3);
} // t3 is dropped here
} // _t1 is dropped last

Is it possible to have a struct which contains a reference to a value which has a shorter lifetime than the struct?

Here is a simplified version of what I want to achieve:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is obviously completely safe as foo.boo is None once string goes out of scope.
Is there a way to tell this to the compiler?
This is obviously completely safe
What is obvious to humans isn't always obvious to the compiler; sometimes the compiler isn't as smart as humans (but it's way more vigilant!).
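In this case, your original code compiles when non-lexical lifetimes are enabled. They are on by default in current stable Rust, so the feature attribute below is only needed on the older nightly compilers this answer was written for and should be removed when building on stable: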
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
{
let mut string = "Hello".to_string();
foo.boo = Some(&mut string);
foo.boo.unwrap().push_str(", I am foo!");
foo.boo = None;
} // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
This is only because foo is never used once it would be invalid (after string goes out of scope), not because you set the value to None. Trying to print out the value after the innermost scope would still result in an error.
Is it possible to have a struct which contains a reference to a value which has a shorter lifetime than the struct?
The purpose of Rust's borrowing system is to ensure that things holding references do not live longer than the referred-to item.
After non-lexical lifetimes
Maybe, so long as you don't make use of the reference after it is no longer valid. This works, for example:
#![feature(nll)]
struct Foo<'a> {
boo: Option<&'a mut String>,
}
fn main() {
let mut foo = Foo { boo: None };
// This lives less than `foo`
let mut string1 = "Hello".to_string();
foo.boo = Some(&mut string1);
// This lives less than both `foo` and `string1`!
let mut string2 = "Goodbye".to_string();
foo.boo = Some(&mut string2);
}
Before non-lexical lifetimes
No. The borrow checker is not smart enough to tell that you cannot / don't use the reference after it would be invalid. It's overly conservative.
In this case, you are running into the fact that lifetimes are represented as part of the type. Said another way, the generic lifetime parameter 'a has been "filled in" with a concrete lifetime value covering the lines where string is alive. However, the lifetime of foo is longer than those lines, thus you get an error.
The compiler does not look at what actions your code takes; once it has seen that you parameterize it with that specific lifetime, that's what it is.
The usual fix I would reach for is to split the type into two parts, those that need the reference and those that don't:
struct FooCore {
size: i32,
}
struct Foo<'a> {
core: FooCore,
boo: &'a mut String,
}
fn main() {
let core = FooCore { size: 42 };
let core = {
let mut string = "Hello".to_string();
let foo = Foo { core, boo: &mut string };
foo.boo.push_str(", I am foo!");
foo.core
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Note how this removes the need for the Option — your types now tell you if the string is present or not.
An alternate solution would be to map the whole type when setting the string. In this case, we consume the whole variable and change the type by changing the lifetime:
struct Foo<'a> {
boo: Option<&'a mut String>,
}
impl<'a> Foo<'a> {
fn set<'b>(self, boo: &'b mut String) -> Foo<'b> {
Foo { boo: Some(boo) }
}
fn unset(self) -> Foo<'static> {
Foo { boo: None }
}
}
fn main() {
let foo = Foo { boo: None };
let foo = {
let mut string = "Hello".to_string();
let mut foo = foo.set(&mut string);
foo.boo.as_mut().unwrap().push_str(", I am foo!");
foo.unset()
}; // string goes out of scope. foo does not reference string anymore
} // foo goes out of scope
Shepmaster's answer is completely correct: you can't express this with lifetimes, which are a compile time feature. But if you're trying to replicate something that would work in a managed language, you can use reference counting to enforce safety at run time.
(Safety in the usual Rust sense of memory safety. Panics and leaks are still possible in safe Rust; there are good reasons for this, but that's a topic for another question.)
Here's an example (playground). Rc pointers disallow mutation, so I had to add a layer of RefCell to imitate the code in the question.
use std::rc::{Rc,Weak};
use std::cell::RefCell;
struct Foo {
boo: Weak<RefCell<String>>,
}
fn main() {
let mut foo = Foo { boo: Weak::new() };
{
// create a string with a shorter lifetime than foo
let string = "Hello".to_string();
// move the string behind an Rc pointer
let rc1 = Rc::new(RefCell::new(string));
// weaken the pointer to store it in foo
foo.boo = Rc::downgrade(&rc1);
// accessing the string
let rc2 = foo.boo.upgrade().unwrap();
assert_eq!("Hello", *rc2.borrow());
// mutating the string
let rc3 = foo.boo.upgrade().unwrap();
rc3.borrow_mut().push_str(", I am foo!");
assert_eq!("Hello, I am foo!", *rc3.borrow());
} // rc1, rc2 and rc3 go out of scope and string is automatically dropped.
// foo.boo now refers to a dropped value and cannot be upgraded anymore.
assert!(foo.boo.upgrade().is_none());
}
Notice that I didn't have to reassign foo.boo before string went out of scope, like in your example -- the Weak pointer is automatically marked invalid when the last extant Rc pointer is dropped. This is one way in which Rust's type system still helps you enforce memory safety even after dropping the strong compile-time guarantees of shared & pointers.

Is using `ref` in a function argument the same as automatically taking a reference?

Rust tutorials often advocate passing an argument by reference:
fn my_func(x: &Something)
This makes it necessary to explicitly take a reference of the value at the call site:
my_func(&my_value).
It is possible to use the ref keyword usually used in pattern matching:
fn my_func(ref x: Something)
I can call this by doing
my_func(my_value)
Memory-wise, does this work like I expect or does it copy my_value on the stack before calling my_func and then get a reference to the copy?
The value is copied, and the copy is then referenced.
fn f(ref mut x: i32) {
*x = 12;
}
fn main() {
let mut x = 42;
f(x);
println!("{}", x);
}
Output: 42
Both functions declare x to be &Something. The difference is that the former takes a reference as the parameter, while the latter expects it to be a regular stack value. To illustrate:
#[derive(Debug, Clone, Copy)] // Copy lets the by-value calls below reuse the same values
struct Something;
fn by_reference(x: &Something) {
println!("{:?}", x); // prints "Something"; x has type &Something
}
fn on_the_stack(ref x: Something) {
println!("{:?}", x); // prints "Something"; x has type &Something
}
fn main() {
let value_on_the_stack: Something = Something;
let owned: Box<Something> = Box::new(Something);
let borrowed: &Something = &value_on_the_stack;
// Compiles:
on_the_stack(value_on_the_stack);
// Fail to compile:
// on_the_stack(owned);
// on_the_stack(borrowed);
// Dereferencing will do:
on_the_stack(*owned);
on_the_stack(*borrowed);
// Compiles:
by_reference(&owned); // &Box<Something> coerces to &Something
by_reference(borrowed);
// Fails to compile:
// by_reference(value_on_the_stack);
// Taking a reference will do:
by_reference(&value_on_the_stack);
}
Since on_the_stack takes a value, it gets copied, then the copy matches against the pattern in the formal parameter (ref x in your example). The match binds x to the reference to the copied value.
If you call a function like f(x) then x is always passed by value.
fn f(ref x: i32) {
// ...
}
is equivalent to
fn f(tmp: i32) {
let ref x = tmp;
// or,
let x = &tmp;
// ...
}
i.e. the referencing is completely restricted to the function call.
The difference between your two functions becomes much more pronounced if the value doesn't implement Copy. For example, Vec<T> doesn't implement Copy, because copying it would be an expensive operation. Instead, it implements Clone, which requires an explicit method call.
Assume two functions are defined as such:
fn take_ref(ref v: Vec<String>) {} // Takes a reference, ish
fn take_addr(v: &Vec<String>) {} // Takes an explicit reference
take_ref takes its argument by value and only then references it. For Vec<T>, which is not Copy, passing by value is a move. The call therefore consumes the vector, meaning the following code would produce a compiler error:
let v: Vec<String>; // assume a real value
take_ref(v);// Value is moved here
println!("{:?}", v);// Error, v was moved on the previous line
However, when the reference is explicit, as in take_addr, the Vec isn't moved but passed by reference. Therefore, this code does work as intended:
let v: Vec<String>; // assume a real value
take_addr(&v);
println!("{:?}", v);// Prints contents as you would expect
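The fragments above can be combined into one runnable sketch. The return values and the lengths function are hypothetical additions so that there is something to observe; the two signatures are the ones from this answer:

```rust
// `ref v` binds a reference inside the function, but the argument itself is moved in.
fn take_ref(ref v: Vec<String>) -> usize {
    v.len()
}

// The caller keeps ownership and only lends the vector.
fn take_addr(v: &Vec<String>) -> usize {
    v.len()
}

// Hypothetical driver that calls both functions on the same vector.
fn lengths() -> (usize, usize, usize) {
    let v = vec![String::from("a"), String::from("b")];
    let by_addr = take_addr(&v); // v is only borrowed
    let by_clone = take_ref(v.clone()); // an explicit clone is moved in; v stays usable
    let by_move = take_ref(v); // v itself is moved; using v after this line would not compile
    (by_addr, by_clone, by_move)
}

fn main() {
    println!("{:?}", lengths());
}
```

If you want to keep using the vector after calling take_ref, you have to pay for the clone yourself at the call site, whereas take_addr never needs one.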
