Why is the destructor not called for Box::from_raw()? - rust

I am passing a raw pointer to two different closures and converting the raw pointer to a reference using Box::from_raw() and the program is working fine.
However, after converting the raw pointer to reference, the destructor should be called automatically as the documentation says:
This function is unsafe because improper use may lead to memory problems. For example, a double-free may occur if the function is called twice on the same raw pointer.
However, I am able to access the reference to ABC even after calling Box::from_raw() on raw pointer twice and it's working fine.
struct ABC {}
impl ABC {
pub fn new() -> ABC {
ABC {}
}
pub fn print(&self, x: u32) {
println!("Inside handle {}", x);
}
}
fn main() {
let obj = ABC::new();
let const_obj: *const ABC = &obj;
let handle = |x| {
let abc = unsafe { Box::from_raw(const_obj as *mut ABC) };
abc.print(x);
};
handle(1);
let handle1 = |x| {
let abc = unsafe { Box::from_raw(const_obj as *mut ABC) };
abc.print(x);
};
handle1(2);
}
Why is the destructor not called for ABC after handle and before handle1, when the description of the Box::from_raw() function specifies:
Specifically, the Box destructor will call the destructor of T and free the allocated memory.
Why is Box::from_raw() working multiple times on a raw pointer?

TL;DR you are doing it wrong.
converting the raw pointer to a reference
No, you are converting it into a Box, not a reference.
the program is working fine
It is not. You are merely "lucky" that memory unsafety and undefined behavior aren't triggering a crash. This is likely because your type has no actual data.
to reference, the destructor should be called automatically
No, when references go out of scope, the destructor is not executed.
Why is the destructor not called
It is, which is one of multiple reasons your code is completely and totally broken and unsafe.
Add code to be run during destruction:
impl Drop for ABC {
fn drop(&mut self) {
println!("drop")
}
}
And you will see it is called 3 times:
Inside handle 1
drop
Inside handle 2
drop
drop
I am able to access the reference to ABC
Yes, which is unsafe. You are breaking the rules that you are supposed to be upholding when writing unsafe code. You've taken a raw pointer, done something to make it invalid, and are then accessing the original, now invalid variable.
The documentation also states:
the only valid pointer to pass to this function is the one taken from another Box via the Box::into_raw function.
You are ignoring this aspect as well.
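A minimal sketch of what the closures could do instead, assuming the goal is only to call a method through the pointer. The `Abc` type with its `drops` counter and the function names `borrow_through_raw_pointer` and `box_round_trip` are hypothetical and exist only to make the number of destructor runs observable. The first function borrows through the raw pointer with `&*ptr`, which never runs a destructor; the second pairs `Box::into_raw` with exactly one `Box::from_raw`, as the documentation requires.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for the question's `ABC`, with a shared counter of destructor runs.
struct Abc {
    drops: Rc<Cell<usize>>,
}

impl Abc {
    fn print(&self, x: u32) {
        println!("Inside handle {}", x);
    }
}

impl Drop for Abc {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Calls `print` twice through a raw pointer and returns how often the destructor ran.
fn borrow_through_raw_pointer() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let obj = Abc { drops: Rc::clone(&drops) };
        let ptr: *const Abc = &obj;
        let handle = |x: u32| {
            // SAFETY: `obj` outlives this closure and is not mutated while borrowed.
            let abc: &Abc = unsafe { &*ptr };
            abc.print(x);
        };
        handle(1);
        handle(2);
    } // `obj` is dropped here, once, by its owner.
    drops.get()
}

/// Hands a heap allocation out as a raw pointer and reclaims it exactly once.
fn box_round_trip() -> usize {
    let drops = Rc::new(Cell::new(0));
    let raw: *mut Abc = Box::into_raw(Box::new(Abc { drops: Rc::clone(&drops) }));
    {
        // SAFETY: `raw` came from `Box::into_raw` and has not been freed yet.
        let abc: &Abc = unsafe { &*raw };
        abc.print(3);
    }
    // SAFETY: this is the only `Box::from_raw` call for `raw`.
    let boxed = unsafe { Box::from_raw(raw) };
    drop(boxed);
    drops.get()
}
```

In both functions the destructor runs once, so each returns 1, in contrast to the three drops printed by the original program.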

Related

Ensuring value lives for its entire scope

I have a type (specifically CFData from core-foundation) whose memory is managed by C APIs and that I need to pass to and receive from C functions as a *const c_void. For simplicity, consider the following struct:
use std::ffi;
struct Data {
ptr: *mut ffi::c_void,
}
impl Data {
pub fn new() -> Self {
// allocate memory via an unsafe C-API
Self {
ptr: std::ptr::null_mut(), // Just so this compiles.
}
}
pub fn to_raw(&self) -> *const ffi::c_void {
self.ptr
}
}
impl Drop for Data {
fn drop(&mut self) {
unsafe {
// Free memory via a C-API
}
}
}
Its interface is safe, including to_raw(), since it only returns a raw pointer. It doesn't dereference it. And the caller doesn't dereference it. It's just used in a callback.
pub extern "C" fn called_from_C_ok(on_complete: extern "C" fn(*const ffi::c_void)) {
let data = Data::new();
// Do things to compute Data.
// This is actually async code, which is why there's a completion handler.
on_complete(data.to_raw()); // data survives through the function call
}
This is fine. Data is safe to manipulate, and (I strongly believe) Rust promises that data will live until the end of the on_complete function call.
On the other hand, this is not ok:
pub extern "C" fn called_from_C_broken(on_complete: extern "C" fn(*const ffi::c_void)) {
let data = Data::new();
// ...
let ptr = data.to_raw(); // data can be dropped at this point, so ptr is dangling.
on_complete(ptr); // This may crash when the receiver dereferences it.
}
In my code, I made this mistake and it started crashing. It's easy to see why and it's easy to fix. But it's subtle, and it's easy for a future developer (me) to modify the ok version into the broken version without realizing the problem (and it may not always crash).
What I'd like to do is to ensure data lives as long as ptr. In Swift, I'd do this with:
withExtendedLifetime(&data) { data in
// ...data cannot be dropped until this closure ends...
}
Is there a similar construct in Rust that explicitly marks the minimum lifetime for a variable to a scope (that the optimizer may not reorder), even if it's not directly accessed? (I'm sure it's trivial to build a custom with_extended_lifetime in Rust, but I'm looking for a more standard solution so that it will be obvious to other developers what's going on).
I do believe the following "works" but I'm not sure how flexible it is, or if it's just replacing a more standard solution:
fn with_extended_lifetime<T, U, F>(value: &T, f: F) -> U
where
F: Fn(&T) -> U,
{
f(value)
}
with_extended_lifetime(&data, |data| {
let ptr = data.to_raw();
on_complete(ptr)
});
The optimizer is not allowed to change when a value is dropped. If you assign a value to a variable (and that value is not then moved elsewhere or overwritten by assignment), it will always be dropped at the end of the block, not earlier.
You say that this code is incorrect:
pub extern "C" fn called_from_C_broken(on_complete: extern "C" fn(*const ffi::c_void)) {
let data = Data::new();
// ...
let ptr = data.to_raw(); // data can be dropped at this point, so ptr is dangling.
on_complete(ptr); // This may crash when the receiver dereferences it.
}
but in fact data may not be dropped at that point, and this code is sound. What you may be confusing this with is the mistake of not assigning the value to a variable:
let ptr = Data::new().to_raw();
on_complete(ptr);
In this case, the pointer is dangling, because the result of Data::new() is stored in a temporary variable within the statement, which is dropped at the end of the statement, not a local variable, which is dropped at the end of the block.
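The difference between a local variable and a temporary can be observed directly. The sketch below is only illustrative: the `Log` alias, this simplified `Data`, and the `use_ptr` stand-in for `on_complete` are assumptions made so that the order of events can be recorded, and the pointer is never dereferenced.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log (hypothetical helper for this sketch).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Data {
    log: Log,
}

impl Data {
    fn new(log: &Log) -> Self {
        Data { log: Rc::clone(log) }
    }

    fn to_raw(&self) -> *const Data {
        self as *const Data
    }
}

impl Drop for Data {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

/// Stand-in for `on_complete`: records the call without touching the pointer.
fn use_ptr(log: &Log, _ptr: *const Data) {
    log.borrow_mut().push("use");
}

/// `data` is a local variable, so it is dropped at the end of its block.
fn local_binding() -> Vec<&'static str> {
    let log = Log::default();
    {
        let data = Data::new(&log);
        let ptr = data.to_raw();
        use_ptr(&log, ptr);
    }
    let events = log.borrow().clone();
    events
}

/// The `Data` value is a temporary, so it is dropped at the end of the `let` statement.
fn temporary() -> Vec<&'static str> {
    let log = Log::default();
    let ptr = Data::new(&log).to_raw(); // `ptr` dangles from here on
    use_ptr(&log, ptr);
    let events = log.borrow().clone();
    events
}
```

`local_binding()` records "use" before "drop", while `temporary()` records "drop" before "use", which is the dangling case described above.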
If you want to adopt a programming style which makes explicit when values are dropped, the usual pattern is to use the standard drop() function to mark the exact time of drop:
let data = Data::new();
...
on_complete(data.to_raw());
drop(data); // We have stopped using the raw pointer now
(Note that drop() is not magic: it is simply a function which takes one argument and does nothing with it other than dropping. Its presence in the standard library is to give a name to this pattern, not to provide special functionality.)
However, if you want to, there isn't anything wrong with using your with_extended_lifetime (other than nonstandard style and arguably a misleading name) if you want to make the code structure even more strongly indicate the scope of the value. One nitpick: the function parameter should be FnOnce, not Fn, for maximum generality (this allows it to be passed functions that can't be called more than once).
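To illustrate the `FnOnce` remark, here is a sketch of the helper with the relaxed bound. The `demo` function and its `label` string are hypothetical; they only show a closure that moves a captured value out when called, which an `Fn` bound would reject.

```rust
/// Same helper as in the question, but accepting closures that can run only once.
fn with_extended_lifetime<T, U, F>(value: &T, f: F) -> U
where
    F: FnOnce(&T) -> U,
{
    f(value)
}

/// Passes a closure that gives away its captured `label`, so it is `FnOnce` only.
fn demo() -> (usize, String) {
    let data = vec![1u8, 2, 3];
    let label = String::from("done");
    with_extended_lifetime(&data, move |d| (d.len(), label))
}
```

With the original `Fn` bound, the closure in `demo` would not compile, because returning `label` by value consumes the captured string.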
Other than explicitly dropping as the other answer mentions, there is another way to help prevent these types of accidental drops: use a wrapper around a raw pointer that has lifetime information.
use std::marker::PhantomData;
#[repr(transparent)]
struct PtrWithLifetime<'a>{
ptr: *mut ffi::c_void,
_data: PhantomData<&'a ffi::c_void>,
}
impl Data {
fn to_raw(&self) -> PtrWithLifetime<'_>{
PtrWithLifetime{
ptr: self.ptr,
_data: PhantomData,
}
}
}
The #[repr(transparent)] attribute guarantees that PtrWithLifetime has the same in-memory layout as a bare *mut ffi::c_void, so you can adjust the declaration of on_complete to
fn called_from_c(on_complete: extern "C" fn(PtrWithLifetime<'_>)){
//stuff
}
without causing any major inconvenience to any downstream users, especially since the ffi bindings can be adjusted in a similar fashion.
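A self-contained sketch of this wrapper approach, under the assumption that `Data` holds a placeholder null pointer instead of memory from a real C API. The `on_complete` callback and `call_with_live_data` are hypothetical names used only to exercise the types.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

struct Data {
    ptr: *mut c_void,
}

/// A raw pointer that borrows from the `Data` it came from.
#[repr(transparent)]
struct PtrWithLifetime<'a> {
    ptr: *mut c_void,
    _data: PhantomData<&'a c_void>,
}

impl Data {
    fn new() -> Self {
        // Placeholder: a real implementation would allocate through the C API.
        Data { ptr: std::ptr::null_mut() }
    }

    fn to_raw(&self) -> PtrWithLifetime<'_> {
        PtrWithLifetime { ptr: self.ptr, _data: PhantomData }
    }
}

/// Hypothetical callback: only inspects the pointer value, never dereferences it.
extern "C" fn on_complete(p: PtrWithLifetime<'_>) -> bool {
    p.ptr.is_null()
}

fn call_with_live_data() -> bool {
    let data = Data::new();
    let ptr = data.to_raw();
    // Writing `drop(data);` here would be rejected by the borrow checker,
    // because `ptr` still borrows `data`.
    on_complete(ptr)
}
```

Because the wrapper carries a lifetime, moving or dropping `data` while `ptr` is still in use becomes a compile-time error rather than a dangling pointer, and the wrapper is the same size as the bare pointer.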

What happens to the ownership of a value returned but not assigned by the calling function?

Consider the following Rust code, slightly modified from examples in The Book.
I'm trying to understand what happens to the value in the second running of function dangle() in the main() function (see comment). I would imagine that because the value isn't assigned to any owner, it gets deallocated, but I've so far failed to find information to confirm that. Otherwise, I would think that calling dangle() repeatedly would constantly allocate more memory without deallocating it. Which is it?
fn main() {
// Ownership of dangle()'s return value is passed to the variable `thingamabob`.
let thingamabob = dangle();
// No ownership specified. Is the return value deallocated here?
dangle();
println!("Ref: {}", thingamabob);
}
fn dangle() -> String {
// Ownership specified.
let s = String::from("hello");
// Ownership is passed to calling function.
s
}
When a value has no owner (is not bound to a variable) it goes out of scope. Values that go out of scope are dropped. Dropping a value frees the resources associated with that value.
Anything less would lead to memory leaks, which would be a poor idea in a programming language.
See also:
Is it possible in Rust to delete an object before the end of scope?
How does Rust know whether to run the destructor during stack unwind?
Does Rust free up the memory of overwritten variables?
In your example, the second call creates an unnamed temporary value whose lifetime ends immediately after that one line of code, so it goes out of scope (and any resources are reclaimed) immediately.
If you bind the value to a name using let, then its lifetime extends until the end of the current lexical scope (closing curly brace).
You can explore some of this yourself by implementing the Drop trait on a simple type to see when its lifetime ends. Here's a small program I made to play with this:
#[derive(Debug)]
struct Thing {
val: i32,
}
impl Thing {
fn new(val: i32) -> Self {
println!("Creating Thing #{}", val);
Thing { val }
}
fn foo(self, val: i32) -> Self {
Thing::new(val)
}
}
impl Drop for Thing {
fn drop(&mut self) {
println!("Dropping {:?}", self);
}
}
pub fn main() {
let _t1 = Thing::new(1);
Thing::new(2); // dropped immediately
{
let t3 = Thing::new(3);
Thing::new(4).foo(5).foo(6); // all are dropped, in order, as the next one is created
println!("Doing something with t3: {:?}", t3);
} // t3 is dropped here
} // _t1 is dropped last

Why does the Index trait allow returning a reference to a temporary value?

Consider this simple code:
use std::ops::Index;
use std::collections::HashMap;
enum BuildingType {
Shop,
House,
}
struct Street {
buildings: HashMap<u32, BuildingType>,
}
impl Index<u32> for Street {
type Output = BuildingType;
fn index(&self, pos: u32) -> &Self::Output {
&self.buildings[&pos]
}
}
It compiles with no issues, but I cannot understand why the borrow checker is not complaining about returning a reference to a temporary value in the index function.
Why is it working?
Your example looks fine.
The Index trait is only able to "view" what is in the object already, and it's not usable for returning arbitrary dynamically-generated data.
It's not possible in Rust to return a reference to a value created inside a function, if that value isn't stored somewhere permanently (references don't exist on their own, they always borrow some value owned somewhere).
The reference can't be borrowed from a variable inside the function, because all variables will be destroyed before the function returns. Lifetimes only describe what the program does, and can't "make" something live longer than it already does.
fn index(&self) -> &u32 {
let tmp = 1;
&tmp // not valid, because tmp isn't stored anywhere
}
fn index(&self) -> &u32 {
// ok, because the value is stored in self,
// which existed before this function has been called
&self.tmp
}
You may find that returning &1 works. That's because 1 is stored in your program's executable which, as far as the program is concerned, is permanent storage. But 'static is an exception for literals and leaked memory, so it's not something you can rely on in most cases.
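As a sketch of both points, the block below repeats the question's `Index` implementation and adds a function returning `&1`. The `sample_street` and `one` helpers are hypothetical, and the `Debug` and `PartialEq` derives are added only so the values can be compared.

```rust
use std::collections::HashMap;
use std::ops::Index;

#[derive(Debug, PartialEq)]
enum BuildingType {
    Shop,
    House,
}

struct Street {
    buildings: HashMap<u32, BuildingType>,
}

impl Index<u32> for Street {
    type Output = BuildingType;

    fn index(&self, pos: u32) -> &Self::Output {
        // The map's own `Index` returns a reference into `self.buildings`,
        // so the result borrows from `self`, not from a temporary.
        &self.buildings[&pos]
    }
}

/// The literal `1` is promoted to static storage, so this reference is `'static`.
fn one() -> &'static u32 {
    &1
}

fn sample_street() -> Street {
    let mut buildings = HashMap::new();
    buildings.insert(1, BuildingType::Shop);
    buildings.insert(2, BuildingType::House);
    Street { buildings }
}
```

Indexing `sample_street()` with 1 yields a reference to the `Shop` stored in the map, and `one()` returns a reference that stays valid for the whole program.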

Is it possible in Rust to delete an object before the end of scope?

From what I understand, the compiler automatically generates code to call the destructor to delete an object when it's no longer needed, at the end of scope.
In some situations, it is beneficial to delete an object as soon as it's no longer needed, instead of waiting for it to go out of scope. Is it possible to call the destructor of an object explicitly in Rust?
Is it possible in Rust to delete an object before the end of scope?
Yes.
Is it possible to call the destructor of an object explicitly in Rust?
No.
To clarify, you can use std::mem::drop to transfer ownership of a variable, which causes it to go out of scope:
struct Noisy;
impl Drop for Noisy {
fn drop(&mut self) {
println!("Dropping Noisy!");
}
}
fn main() {
let a = Noisy;
let b = Noisy;
println!("1");
drop(b);
println!("2");
}
1
Dropping Noisy!
2
Dropping Noisy!
However, you are forbidden from calling the destructor (the implementation of the Drop trait) yourself. Doing so would lead to double-free situations, as the compiler will still insert the automatic call to the Drop trait.
Amusing side note — the implementation of drop is quite elegant:
pub fn drop<T>(_x: T) { }
The official answer is to call mem::drop:
fn do_the_thing() {
let s = "Hello, World".to_string();
println!("{}", s);
drop(s);
println!("{}", 3);
}
However, note that mem::drop is nothing special. Here is the definition in full:
pub fn drop<T>(_x: T) { }
That's all.
Any function taking ownership of a parameter will cause this parameter to be dropped at the end of said function. From the point of view of the caller, it's an early drop :)
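To make that last point concrete, here is a small sketch in which a user-written function, not `mem::drop`, causes the early drop. The `consume` function, the logging `Noisy` type, and `drop_order` are hypothetical names for this illustration.

```rust
use std::cell::RefCell;

/// Records its own drop in a shared log instead of printing.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Takes ownership and does nothing, exactly like `std::mem::drop`.
fn consume<T>(_value: T) {}

fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let b = Noisy { name: "b", log: &log };
        log.borrow_mut().push("1".to_string());
        consume(b); // `b` is dropped when `consume` returns
        log.borrow_mut().push("2".to_string());
    } // `_a` is dropped here, at the end of its scope
    log.into_inner()
}
```

The recorded order is "1", "drop b", "2", "drop a", matching the output of the `Noisy` example above.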

Referencing a containing struct in Rust (and calling methods on it)

Editor's note: This code example is from a version of Rust prior to 1.0 and is not syntactically valid Rust 1.0 code. Updated versions of this code produce different errors, but the answers still contain valuable information.
I'm trying to write a container structure in Rust where its elements also store a reference to the containing container so that they can call methods on it. As far as I could figure out, I need to do this via Rc<RefCell<T>>. Is this correct?
So far, I have something like the following:
struct Container {
elems: ~[~Element]
}
impl Container {
pub fn poke(&mut self) {
println!("Got poked.");
}
}
struct Element {
datum: int,
container: Weak<RefCell<Container>>
}
impl Element {
pub fn poke_container(&mut self) {
let c1 = self.container.upgrade().unwrap(); // Option<Rc>
let mut c2 = c1.borrow().borrow_mut(); // &RefCell
c2.get().poke();
// self.container.upgrade().unwrap().borrow().borrow_mut().get().poke();
// -> Error: Borrowed value does not live long enough * 2
}
}
fn main() {
let container = Rc::new(RefCell::new(Container{ elems: ~[] }));
let mut elem1 = Element{ datum: 1, container: container.downgrade() };
let mut elem2 = Element{ datum: 2, container: container.downgrade() };
elem1.poke_container();
}
I feel like I am missing something here. Is accessing the contents of a Rc<RefCell<T>> really this difficult (in poke_container)? Or am I approaching the problem the wrong way?
Lastly, and assuming the approach is correct, how would I write an add method for Container so that it could fill in the container field in Element (assuming I changed the field to be of type Option<Rc<RefCell<T>>>)? I can't create another Rc from &mut self as far as I know.
The long chain of method calls actually works for me on master without any changes, because the lifetime of "r-values" (e.g. the result of function calls) has changed so that temporary return values last until the end of the statement, rather than the end of the next method call (which seemed to be how the old rule worked).
As Vladimir hints, overloadable dereference will likely reduce it to
self.container.upgrade().unwrap().borrow_mut().poke();
which is nicer.
In any case, "mutating" shared ownership is always going to be (slightly) harder to write in Rust than either single ownership code or immutable shared ownership code, because it's very easy for such code to be memory unsafe (and memory safety is the core goal of Rust).
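For readers on current Rust, a rough sketch of the same idea in post-1.0 syntax follows. It is not the asker's code: the `poked_by` field, the `datum` argument to `poke`, and the two demo functions are assumptions added so the effect can be checked, and the elements are kept outside the container as in the question's `main`.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Container {
    poked_by: Vec<i32>,
}

impl Container {
    fn poke(&mut self, datum: i32) {
        self.poked_by.push(datum);
    }
}

struct Element {
    datum: i32,
    container: Weak<RefCell<Container>>,
}

impl Element {
    /// Returns `false` if the container has already been dropped.
    fn poke_container(&self) -> bool {
        match self.container.upgrade() {
            Some(container) => {
                // The whole chain fits in one statement on current Rust.
                container.borrow_mut().poke(self.datum);
                true
            }
            None => false,
        }
    }
}

fn poke_from_elements() -> Vec<i32> {
    let container = Rc::new(RefCell::new(Container { poked_by: Vec::new() }));
    let elem1 = Element { datum: 1, container: Rc::downgrade(&container) };
    let elem2 = Element { datum: 2, container: Rc::downgrade(&container) };
    elem1.poke_container();
    elem2.poke_container();
    let poked_by = container.borrow().poked_by.clone();
    poked_by
}

fn poke_after_drop() -> bool {
    let container = Rc::new(RefCell::new(Container { poked_by: Vec::new() }));
    let elem = Element { datum: 3, container: Rc::downgrade(&container) };
    drop(container); // the weak reference can no longer be upgraded
    elem.poke_container()
}
```

Matching on `upgrade()` instead of calling `unwrap()` is a design choice of this sketch: it lets an element outlive its container without panicking.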
