How Can I Hash By A Raw Pointer? - rust

I want to create a function that provides a two step write and commit, like so:
// Omitting locking for brevity
struct States {
commited_state: u64,
// By reference is just a placeholder - I don't know how to do this
pending_states: HashSet<i64>
}
impl States {
fn read_dirty(&self) -> {
// Sum committed state and all non committed states
self.commited_state +
pending_states.into_iter().fold(sum_all_values).unwrap_or(0)
}
fn read_committed(&self) {
self.commited_state
}
}
let state_container = States::default();
async fn update_state(state_container: States, new_state: i64) -> Future {
// This is just pseudo code missing locking and such
// I'd like to add a reference to new_state
state_container.pending_states.insert(
new_state
)
async move {
// I would like to defer the commit
// I add the state to the commited state
state_container.commited_state =+ new_state;
// Then remove it *by reference* from the pending states
state_container.remove(new_state)
}
}
I'd like to be in a situation where I can call it like so
let commit_handler = update_state(state_container, 3).await;
// Do some external transactional stuff
third_party_transactional_service(...)?
// Commit if the above line does not error
commit_handler.await;
The problem I have is that HashMaps and HashSets, hash values based of their value and not their actual reference - so I can't remove them by reference.
I appreciate this a bit of a long question, but I'm just trying to give a bit more context as to what I'm trying to do. I know that in a typical database you'd generally have an atomic counter to generate the transaction ID, but that feels a bit overkill when the pointer reference would be enough.
However, I don't want to get the pointer value using unsafe, because it just seems a bit off to do something relatively simple.

Values in rust don't have an identity like they do in other languages. You need to ascribe them an identity somehow. You've hit on two ways to do this in your question: an ID contained within the value, or the address of the value as a pointer.
Option 1: An ID contained in the value
It's trivial to have a usize ID with a static AtomicUsize (atomics have interior mutability).
use std::sync::atomic::{AtomicUsize, Ordering};
// No impl of clone/copy as we want these IDs to be unique.
#[derive(Debug, Hash, PartialEq, Eq)]
#[repr(transparent)]
pub struct OpaqueIdentifier(usize);
impl OpaqueIdentifier {
pub fn new() -> Self {
static COUNTER: AtomicUsize = AtomicUsize::new(0);
Self(COUNTER.fetch_add(1, Ordering::Relaxed))
}
pub fn id(&self) -> usize {
self.0
}
}
Now your map key becomes usize, and you're done.
Having this be a separate type that doesn't implement Copy or Clone allows you to have a concept of an "owned unique ID" and then every type with one of these IDs is forced not to be Copy, and a Clone impl would require obtaining a new ID.
(You can use a different integer type than usize. I chose it semi-arbitrarily.)
Option 2: A pointer to the value
This is more challenging in Rust since values in Rust are movable by default. In order for this approach to be viable, you have to remove this capability by pinning.
To make this work, both of the following must be true:
You pin the value you're using to provide identity, and
The pinned value is !Unpin (otherwise pinning still allows moves!), which can be forced by adding a PhantomPinned member to the value's type.
Note that the pin contract is only upheld if the object remains pinned for its entire lifetime. To enforce this, your factory for such objects should only dispense pinned boxes.
This could complicate your API as you cannot obtain a mutable reference to a pinned value without unsafe. The pin documentation has examples of how to do this properly.
Assuming that you have done all of this, you can then use *const T as the key in your map (where T is the pinned type). Note that conversion to a pointer is safe -- it's conversion back to a reference that isn't. So you can just use some_pin_box.get_ref() as *const _ to obtain the pointer you'll use for lookup.
The pinned box approach comes with pretty significant drawbacks:
All values being used to provide identity have to be allocated on the heap (unless using local pinning, which is unlikely to be ergonomic -- the pin! macro making this simpler is experimental).
The implementation of the type providing identity has to accept self as a &Pin or &mut Pin, requiring unsafe code to mutate the contents.
In my opinion, it's not even a good semantic fit for the problem. "Location in memory" and "identity" are different things, and it's only kind of by accident that the former can sometimes be used to implement the latter. It's a bit silly that moving a value in memory would change its identity, no?
I'd just go with adding an ID to the value. This is a substantially more obvious pattern, and it has no serious drawbacks.

Related

Rust behavior after move [duplicate]

The Rust language website claims move semantics as one of the features of the language. But I can't see how move semantics is implemented in Rust.
Rust boxes are the only place where move semantics are used.
let x = Box::new(5);
let y: Box<i32> = x; // x is 'moved'
The above Rust code can be written in C++ as
auto x = std::make_unique<int>(5);
auto y = std::move(x); // Note the explicit move
As far as I know (correct me if I'm wrong),
Rust doesn't have constructors at all, let alone move constructors.
No support for rvalue references.
No way to create functions overloads with rvalue parameters.
How does Rust provide move semantics?
I think it's a very common issue when coming from C++. In C++ you are doing everything explicitly when it comes to copying and moving. The language was designed around copying and references. With C++11 the ability to "move" stuff was glued onto that system. Rust on the other hand took a fresh start.
Rust doesn't have constructors at all, let alone move constructors.
You do not need move constructors. Rust moves everything that "does not have a copy constructor", a.k.a. "does not implement the Copy trait".
struct A;
fn test() {
let a = A;
let b = a;
let c = a; // error, a is moved
}
Rust's default constructor is (by convention) simply an associated function called new:
struct A(i32);
impl A {
fn new() -> A {
A(5)
}
}
More complex constructors should have more expressive names. This is the named constructor idiom in C++
No support for rvalue references.
It has always been a requested feature, see RFC issue 998, but most likely you are asking for a different feature: moving stuff to functions:
struct A;
fn move_to(a: A) {
// a is moved into here, you own it now.
}
fn test() {
let a = A;
move_to(a);
let c = a; // error, a is moved
}
No way to create functions overloads with rvalue parameters.
You can do that with traits.
trait Ref {
fn test(&self);
}
trait Move {
fn test(self);
}
struct A;
impl Ref for A {
fn test(&self) {
println!("by ref");
}
}
impl Move for A {
fn test(self) {
println!("by value");
}
}
fn main() {
let a = A;
(&a).test(); // prints "by ref"
a.test(); // prints "by value"
}
Rust's moving and copying semantics are very different from C++. I'm going to take a different approach to explain them than the existing answer.
In C++, copying is an operation that can be arbitrarily complex, due to custom copy constructors. Rust doesn't want custom semantics of simple assignment or argument passing, and so takes a different approach.
First, an assignment or argument passing in Rust is always just a simple memory copy.
let foo = bar; // copies the bytes of bar to the location of foo (might be elided)
function(foo); // copies the bytes of foo to the parameter location (might be elided)
But what if the object controls some resources? Let's say we are dealing with a simple smart pointer, Box.
let b1 = Box::new(42);
let b2 = b1;
At this point, if just the bytes are copied over, wouldn't the destructor (drop in Rust) be called for each object, thus freeing the same pointer twice and causing undefined behavior?
The answer is that Rust moves by default. This means that it copies the bytes to the new location, and the old object is then gone. It is a compile error to access b1 after the second line above. And the destructor is not called for it. The value was moved to b2, and b1 might as well not exist anymore.
This is how move semantics work in Rust. The bytes are copied over, and the old object is gone.
In some discussions about C++'s move semantics, Rust's way was called "destructive move". There have been proposals to add the "move destructor" or something similar to C++ so that it can have the same semantics. But move semantics as they are implemented in C++ don't do this. The old object is left behind, and its destructor is still called. Therefore, you need a move constructor to deal with the custom logic required by the move operation. Moving is just a specialized constructor/assignment operator that is expected to behave in a certain way.
So by default, Rust's assignment moves the object, making the old location invalid. But many types (integers, floating points, shared references) have semantics where copying the bytes is a perfectly valid way of creating a real copy, with no need to ignore the old object. Such types should implement the Copy trait, which can be derived by the compiler automatically.
#[derive(Copy)]
struct JustTwoInts {
one: i32,
two: i32,
}
This signals the compiler that assignment and argument passing do not invalidate the old object:
let j1 = JustTwoInts { one: 1, two: 2 };
let j2 = j1;
println!("Still allowed: {}", j1.one);
Note that trivial copying and the need for destruction are mutually exclusive; a type that is Copy cannot also be Drop.
Now what about when you want to make a copy of something where just copying the bytes isn't enough, e.g. a vector? There is no language feature for this; technically, the type just needs a function that returns a new object that was created the right way. But by convention this is achieved by implementing the Clone trait and its clone function. In fact, the compiler supports automatic derivation of Clone too, where it simply clones every field.
#[Derive(Clone)]
struct JustTwoVecs {
one: Vec<i32>,
two: Vec<i32>,
}
let j1 = JustTwoVecs { one: vec![1], two: vec![2, 2] };
let j2 = j1.clone();
And whenever you derive Copy, you should also derive Clone, because containers like Vec use it internally when they are cloned themselves.
#[derive(Copy, Clone)]
struct JustTwoInts { /* as before */ }
Now, are there any downsides to this? Yes, in fact there is one rather big downside: because moving an object to another memory location is just done by copying bytes, and no custom logic, a type cannot have references into itself. In fact, Rust's lifetime system makes it impossible to construct such types safely.
But in my opinion, the trade-off is worth it.
Rust supports move semantics with features like these:
All types are moveable.
Sending a value somewhere is a move, by default, throughout the language. For non-Copy types, like Vec, the following are all moves in Rust: passing an argument by value, returning a value, assignment, pattern-matching by value.
You don't have std::move in Rust because it's the default. You're really using moves all the time.
Rust knows that moved values must not be used. If you have a value x: String and do channel.send(x), sending the value to another thread, the compiler knows that x has been moved. Trying to use it after the move is a compile-time error, "use of moved value". And you can't move a value if anyone has a reference to it (a dangling pointer).
Rust knows not to call destructors on moved values. Moving a value transfers ownership, including responsibility for cleanup. Types don't have to be able to represent a special "value was moved" state.
Moves are cheap and the performance is predictable. It's basically memcpy. Returning a huge Vec is always fast—you're just copying three words.
The Rust standard library uses and supports moves everywhere. I already mentioned channels, which use move semantics to safely transfer ownership of values across threads. Other nice touches: all types support copy-free std::mem::swap() in Rust; the Into and From standard conversion traits are by-value; Vec and other collections have .drain() and .into_iter() methods so you can smash one data structure, move all the values out of it, and use those values to build a new one.
Rust doesn't have move references, but moves are a powerful and central concept in Rust, providing a lot of the same performance benefits as in C++, and some other benefits as well.
let s = vec!["udon".to_string(), "ramen".to_string(), "soba".to_string()];
this is how it is represented in memory
Then let's assign s to t
let t = s;
this is what happens:
let t = s MOVED the vector’s three header fields from s to t; now t is the owner of the vector. The vector’s elements stayed just
where they were, and nothing happened to the strings either. Every value still has a single owner.
Now s is freed, if I write this
let u = s
I get error: "use of moved value: s"
Rust applies move semantics to almost any use of a value (Except Copy types). Passing
arguments to functions moves ownership to the function’s parameters;
returning a value from a function moves ownership to the caller.
Building a tuple moves the values into the tuple. And so on.
Ref for example:Programming Rust by Jim Blandy, Jason Orendorff, Leonora F. S. Tindall
Primitive types cannot be empty and are fixed size while non primitives can grow and can be empty. since primitive types cannot be empty and are fixed size, therefore assigning memory to store them and handling them are relatively easy. however the handling of non primitives involves the computation of how much memory they will take as they grow and other costly operations.Wwith primitives rust will make a copy, with non primitive rust does a move
fn main(){
// this variable is stored in stack. primitive types are fixed size, we can store them on stack
let x:i32=10;
// s1 is stored in heap. os will assign memory for this. pointer of this memory will be stored inside stack.
// s1 is the owner of memory space in heap which stores "my name"
// if we dont clear this memory, os will have no access to this memory. rust uses ownership to free the memory
let s1=String::from("my name");
// s1 will be cleared from the stack, s2 will be added to the stack poniting the same heap memory location
// making new copy of this string will create extra overhead, so we MOVED the ownership of s1 into s2
let s2=s1;
// s3 is the pointer to s2 which points to heap memory. we Borrowed the ownership
// Borrowing is similar borrowing in real life, you borrow a car from your friend, but its ownership does not change
let s3=&s2;
// this is creating new "my name" in heap and s4 stored as the pointer of this memory location on the heap
let s4=s2.clone()
}
Same principle applies when we pass primitive or non-primitive type arguments to a function:
fn main(){
// since this is primitive stack_function will make copy of it so this will remain unchanged
let stack_num=50;
let mut heap_vec=vec![2,3,4];
// when we pass a stack variable to a function, function will make a copy of that and will use the copy. "move" does not occur here
stack_var_fn(stack_num);
println!("The stack_num inside the main fn did not change:{}",stack_num);
// the owner of heap_vec moved here and when function gets executed, it goes out of scope so the variable will be dropped
// we can pass a reference to reach the value in heap. so we use the pointer of heap_vec
// we use "&"" operator to indicate that we are passing a reference
heap_var_fn(&heap_vec);
println!("the heap_vec inside main is:{:?}",heap_vec);
}
// this fn that we pass an argument stored in stack
fn stack_var_fn(mut var:i32){
// we are changing the arguments value
var=56;
println!("Var inside stack_var_fn is :{}",var);
}
// this fn that we pass an arg that stored in heap
fn heap_var_fn(var:&Vec<i32>){
println!("Var:{:?}",var);
}
I would like to add that it is not necessary for move to memcpy. If the object on the stack is large enough, Rust's compiler may choose to pass the object's pointer instead.
In C++ the default assignment of classes and structs is shallow copy. The values are copied, but not the data referenced by pointers. So modifying one instance changes the referenced data of all copies. The values (f.e. used for administration) remain unchanged in the other instance, likely rendering an inconsistent state. A move semantic avoids this situation. Example for a C++ implementation of a memory managed container with move semantic:
template <typename T>
class object
{
T *p;
public:
object()
{
p=new T;
}
~object()
{
if (p != (T *)0) delete p;
}
template <typename V> //type V is used to allow for conversions between reference and value
object(object<V> &v) //copy constructor with move semantic
{
p = v.p; //move ownership
v.p = (T *)0; //make sure it does not get deleted
}
object &operator=(object<T> &v) //move assignment
{
delete p;
p = v.p;
v.p = (T *)0;
return *this;
}
T &operator*() { return *p; } //reference to object *d
T *operator->() { return p; } //pointer to object data d->
};
Such an object is automatically garbage collected and can be returned from functions to the calling program. It is extremely efficient and does the same as Rust does:
object<somestruct> somefn() //function returning an object
{
object<somestruct> a;
auto b=a; //move semantic; b becomes invalid
return b; //this moves the object to the caller
}
auto c=somefn();
//now c owns the data; memory is freed after leaving the scope

Rust Data Structure

I am currently learning Rust for fun. I have some experience in C / C++ and other experience in other programming languages that use more complex paradigms like generics.
Background
For my first project (after the tutorial), I wanted to create a N-Dimensional array (or Matrix) data structure to practice development in Rust.
Here is what I have so far for my Matrix struct and a basic fill and new initializations.
Forgive the absent bound checking and parameter testing
pub struct Matrix<'a, T> {
data: Vec<Option<T>>,
dimensions: &'a [usize],
}
impl<'a, T: Clone> Matrix<'a, T> {
pub fn fill(dimensions: &'a [usize], fill: T) -> Matrix<'a, T> {
let mut total = if dimensions.len() > 0 { 1 } else { 0 };
for dim in dimensions.iter() {
total *= dim;
}
Matrix {
data: vec![Some(fill); total],
dimensions: dimensions,
}
}
pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
...
Matrix {
data: vec![None; total],
dimensions: dimensions,
}
}
}
I wanted the ability to create an "empty" N-Dimensional array using the New fn. I thought using the Option enum would be the best way to accomplish this, as I can fill the N-Dimensional with None and it would allocate space for this T generic automatically.
So then it comes down to being able to set the entries for this. I found the IndexMut and Index traits that looked like I could do something like m[&[2, 3]] = 23. Since the logic is similar to each other here is the IndexMut impl for Matrix.
impl<'a, T> ops::IndexMut<&[usize]> for Matrix<'a, T> {
fn index_mut(&mut self, indices: &[usize]) -> &mut Self::Output {
match self.data[get_matrix_index(self.dimensions, indices)].as_mut() {
Some(x) => x,
None => {
NOT SURE WHAT TO DO HERE.
}
}
}
}
Ideally what would happen is that the value (if there) would be changed i.e.
let mut mat = Matrix::fill(&[4, 4], 0)
mat[&[2, 3]] = 23
This would set the value from 0 to 23 (which the above fn does via returning &mut x from Some(x)). But I also want None to set the value i.e.
let mut mat = Matrix::new(&[4, 4])
mat[&[2, 3]] = 23
Question
Finally, is there a way to make m[&[2,3]] = 23 possible with what the Vec struct requires to allocate the memory? If not what should I change and how can I still have an array with "empty" spots. Open to any suggestions as I am trying to learn. :)
Final Thoughts
Through my research, the Vec struct impls I see that the type T is typed and has to be Sized. This could be useful as to allocate the Vec with the appropriate size via vec![pointer of T that is null but of size of T; total]. But I am unsure of how to do this.
So there are a few ways to make this more similar to idiomatic rust, but first, let's look at why the none branch doesn't make sense.
So the Output type for IndexMut I'm going to assume is &mut T as you don't show the index definition but I feel safe in that assumption. The type &mut T means a mutable reference to an initialized T, unlike pointers in C/C++ where they can point to initialized or uninitialized memory. What this means is that you have to return an initialized T which the none branch cannot because there is no initialized value. This leads to the first of the more idiomatic ways.
Return an Option<T>
The easiest way would be to change Index::Output to be an Option<T>. This is better because the user can decide what to do if there was no value there before and is close to what you are actually storing. Then you can also remove the panic in your index method and allow the caller to choose what to do if there is no value. At this point, I think you can go a little further with gentrifying the structure in the next option.
Store a T directly
This method allows the caller to directly change what the type is that's stored rather than wrapping it in an option. This cleans up most of your index code nicely as you just have to access what's already stored. The main problem is now initialization, how do you represent uninitialized values? You were correct that option is the best way to do this1, but now the caller can decide to have this optional initialization capability by storing an Option themselves. So that means we can always store initialized Ts without losing functionality. This only really changes your new function to instead not fill with None values. My suggestion here is to make a bound T: Default for the new function2:
impl<'a, T: Default> Matrix<'a, T> {
pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
Matrix {
data: (0..total).into_iter().map(|_|Default::default()).collect(),
dimensions: dimensions,
}
}
}
This method is much more common in the rust world and allows the caller to choose whether to allow for uninitialized values. Option<T> also implements default for all T and returns None So the functionality is very similar to what you have currently.
Aditional Info
As you're new to rust there are a few comments that I can make about traps that I've fallen into before. To start your struct contains a reference to the dimensions with a lifetime. What this means is that your structs cannot exist longer than the dimension object that created them. This hasn't caused you a problem so far as all you've been passing is statically created dimensions, dimensions that are typed into the code and stored in static memory. This gives your object a lifetime of 'static, but this won't occur if you use dynamic dimensions.
How else can you store these dimensions so that your object always has a 'static lifetime (same as no lifetime)? Since you want an N-dimensional array stack allocation is out of the question since stack arrays must be deterministic at compile time (otherwise known as const in rust). This means you have to use the heap. This leaves two real options Box<[usize]> or Vec<usize>. Box is just another way of saying this is on the heap and adds Sized to values that are ?Sized. Vec is a little more self-explanatory and adds the ability to be resized at the cost of a little overhead. Either would allow your matrix object to always have a 'static lifetime.
1. The other way to represent this without Option<T>'s discriminate is MaybeUninit<T> which is unsafe territory. This allows you to have a chunk of initialized memory big enough to hold a T and then assume it's initialized unsafely. This can cause a lot of problems and is usually not worth it as Option is already heavily optimized in that if it stores a type with a pointer it uses compiler magic to store the discriminate in whether or not that value is a null pointer.
2. The reason this section doesn't just use vec![Default::default(); total] is that this requires T: Clone as the way this macro works the first part is called once and cloned until there are enough values. This is an extra requirement that we don't need to have so the interface is smoother without it.

Refactoring out `clone` when Copy trait is not implemented?

Is there a way to get rid of clone(), given the restrictions I've noted in the comments? I would really like to know if it's possible to use borrowing in this case, where modifying the third-party function signature is not possible.
// We should keep the "data" hidden from the consumer
mod le_library {
pub struct Foobar {
data: Vec<i32> // Something that doesn't implement Copy
}
impl Foobar {
pub fn new() -> Foobar {
Foobar {
data: vec![1, 2, 3],
}
}
pub fn foo(&self) -> String {
let i = third_party(self.data.clone()); // Refactor out clone?
format!("{}{}", "foo!", i)
}
}
// Can't change the signature, suppose this comes from a crate
pub fn third_party(data:Vec<i32>) -> i32 {
data[0]
}
}
use le_library::Foobar;
fn main() {
let foobar = Foobar::new();
let foo = foobar.foo();
let foo2 = foobar.foo();
println!("{}", foo);
println!("{}", foo2);
}
playground
As long as your foo() method accepts &self, it is not possible, because the
pub fn third_party(data: Vec<i32>) -> i32
signature is quite unambiguous: regardless of what this third_party function does, it's API states that it needs its own instance of Vec, by value. This precludes using borrowing of any form, and because foo() accepts self by reference, you can't really do anything except for cloning.
Also, supposedly this third_party is written without any weird unsafe hacks, so it is quite safe to assume that the Vec which is passed into it is eventually dropped and deallocated. Therefore, unsafely creating a copy of the original Vec without cloning it (by copying internal pointers) is out of question - you'll definitely get a use-after-free if you do it.
While your question does not state it, the fact that you want to preserve the original value of data is kind of a natural assumption. If this assumption can be relaxed, and you're actually okay with giving the data instance out and e.g. replacing it with an empty vector internally, then there are several things you can potentially do:
Switch foo(&self) to foo(&mut self), then you can quite easily extract data and replace it with an empty vector.
Use Cell or RefCell to store the data. This way, you can continue to use foo(&self), at the cost of some runtime checks when you extract the value out of a cell and replace it with some default value.
Both these approaches, however, will result in you losing the original Vec. With the given third-party API there is no way around that.
If you still can somehow influence this external API, then the best solution would be to change it to accept &[i32], which can easily be obtained from Vec<i32> with borrowing.
No, you can't get rid of the call to clone here.
The problem here is with the third-party library. As the function third_party is written now, it's true that it could be using an &Vec<i32>; it doesn't require ownership, since it's just moving out a value that's Copy. However, since the implementation is outside of your control, there's nothing preventing the person maintaining the function from changing it to take advantage of owning the Vec. It's possible that whatever it is doing would be easier or require less memory if it were allowed to overwrite the provided memory, and the function writer is leaving the door open to do so in the future. If that's not the case, it might be worth suggesting a change to the third-party function's signature and relying on clone in the meantime.

Forcing the order in which struct fields are dropped

I'm implementing an object that owns several resources created from C libraries through FFI. In order to clean up what's already been done if the constructor panics, I'm wrapping each resource in its own struct and implementing Drop for them. However, when it comes to dropping the object itself, I cannot guarantee that resources will be dropped in a safe order because Rust doesn't define the order that a struct's fields are dropped.
Normally, you would solve this by making it so the object doesn't own the resources but rather borrows them (so that the resources may borrow each other). In effect, this pushes the problem up to the calling code, where the drop order is well defined and enforced with the semantics of borrowing. But this is inappropriate for my use case and in general a bit of a cop-out.
What's infuriating is that this would be incredibly easy if drop took self instead of &mut self for some reason. Then I could just call std::mem::drop in my desired order.
Is there any way to do this? If not, is there any way to clean up in the event of a constructor panic without manually catching and repanicking?
You can specify drop order of your struct fields in two ways:
Implicitly
I wrote RFC 1857 specifying drop order and it was merged 2017/07/03! According to the RFC, struct fields are dropped in the same order as they are declared.
You can check this by running the example below
struct PrintDrop(&'static str);
impl Drop for PrintDrop {
fn drop(&mut self) {
println!("Dropping {}", self.0)
}
}
struct Foo {
x: PrintDrop,
y: PrintDrop,
z: PrintDrop,
}
fn main() {
let foo = Foo {
x: PrintDrop("x"),
y: PrintDrop("y"),
z: PrintDrop("z"),
};
}
The output should be:
Dropping x
Dropping y
Dropping z
Explicitly
RFC 1860 introduces the ManuallyDrop type, which wraps another type and disables its destructor. The idea is that you can manually drop the object by calling a special function (ManuallyDrop::drop). This function is unsafe, since memory is left uninitialized after dropping the object.
You can use ManuallyDrop to explicitly specify the drop order of your fields in the destructor of your type:
#![feature(manually_drop)]
use std::mem::ManuallyDrop;
struct Foo {
x: ManuallyDrop<String>,
y: ManuallyDrop<String>
}
impl Drop for Foo {
fn drop(&mut self) {
// Drop in reverse order!
unsafe {
ManuallyDrop::drop(&mut self.y);
ManuallyDrop::drop(&mut self.x);
}
}
}
fn main() {
Foo {
x: ManuallyDrop::new("x".into()),
y: ManuallyDrop::new("y".into())
};
}
If you need this behavior without being able to use either of the newer methods, keep on reading...
The issue with drop
The drop method cannot take its parameter by value, since the parameter would be dropped again at the end of the scope. This would result in infinite recursion for all destructors of the language.
A possible solution/workaround
A pattern that I have seen in some codebases is to wrap the values that are being dropped in an Option<T>. Then, in the destructor, you can replace each option with None and drop the resulting value in the right order.
For instance, in the scoped-threadpool crate, the Pool object contains threads and a sender that will schedule new work. In order to join the threads correctly upon dropping, the sender should be dropped first and the threads second.
pub struct Pool {
threads: Vec<ThreadData>,
job_sender: Option<Sender<Message>>
}
impl Drop for Pool {
fn drop(&mut self) {
// By setting job_sender to `None`, the job_sender is dropped first.
self.job_sender = None;
}
}
A note on ergonomics
Of course, doing things this way is more of a workaround than a proper solution. Also, if the optimizer cannot prove that the option will always be Some, you now have an extra branch for each access to your struct field.
Fortunately, nothing prevents a future version of Rust to implement a feature that allows specifying drop order. It would probably require an RFC, but seems certainly doable. There is an ongoing discussion on the issue tracker about specifying drop order for the language, though it has been inactive last months.
A note on safety
If destroying your structs in the wrong order is unsafe, you should probably consider making their constructors unsafe and document this fact (in case you haven't done that already). Otherwise it would be possible to trigger unsafe behavior just by creating the structs and letting them fall out of scope.

How to abstract over a reference to a value or a value itself?

I have a trait that defines an interface for objects that can hold a value. The trait has a way of getting the current value:
pub trait HasValue<T> {
fn get_current_value(&self) -> &T;
}
This is fine, but I realized that depending on the actual implementation, sometimes it's convenient to return a reference if T is stored in a field, and sometimes it's convenient to return a clone of T if the backing field was being shared across threads (for example). I'm struggling to figure out how to represent this in the trait. I could have something like this:
pub enum BorrowedOrOwned<'a, T: 'a> {
Borrowed(&'a T),
Owned(T)
}
impl<'a, T: 'a> Deref for BorrowedOrOwned<'a, T> {
type Target = T;
fn deref(&self) -> &T {
use self::BorrowedOrOwned::*;
match self {
&Borrowed(b) => b,
&Owned(ref o) => o,
}
}
}
And change get_current_value() to return a BorrowedOrOwned<T> but I'm not sure that this is idiomatic. BorrowedOrOwned<T> kind of reminds me of Cow<T> but since the point of Cow is to copy-on-write and I will be discarding any writes, that seems semantically wrong.
Is Cow<T> the correct way to abstract over a reference or an owned value? Is there a better way than BorrowedOrOwned<T>?
I suggest you use a Cow, since your BorrowedOrOwned has no difference to Cow except that it has fewer convenience methods. Anybody that gets a hold of a BorrowedOrOwned object could match on it and get the owned value or a mutable reference to it. If you want to prevent the confusion of being able to get a mutable reference or the object itself, the solution below applies, too.
For your use case i'd simply stay with &T, since there's no reason to make the API more complex. If a user wants a usize, when T is usize, they can simply dereference the reference.
An owned object only makes sense, if you expect the user to actually process it in an owned fashion. And even then, Cow is meant to abstract over big/heavy objects that you pass by ownership for the purpose of not requiring anyone to clone it. Your use case is the opposite, you want to pass small objects by ownership to prevent users from needing to copy small objects, and instead you are copying it.

Resources