Is there a difference between Pin<Box<T>> and Box<Pin<T>>? - rust

In Rust, are there any functional differences between Pin<Box<T>> and Box<Pin<T>>? I think that they should behave the same, but I'm not sure.

Pin<Box<T>> is what you want. Box<Pin<T>> will not work at all.
Pin requires its type parameter to be a pointer of some kind. It then prevents you from moving out of that pointer (if the pointee isn't Unpin) by requiring unsafe to access the pointee mutably. In Pin<Box<T>>, Box<T> is the pointer. It is common because you can create it safely (as opposed to Pin<&mut T>, which without macros can only be created unsafely): you give ownership of the Box to the Pin, so the inner T can no longer be reached except through the Pin.
Box<Pin<T>>, on the other hand, is useless. It is impossible to create if T does not implement Deref (Pin's constructors require that, because they are meant to be used with pointers), and even if T does, the Box is redundant: you already have a pointer, so there is no need to wrap it in a Box. In addition, the general safe constructor Pin::new only accepts pointers whose <T as Deref>::Target implements Unpin; otherwise you need unsafe code or a pointer-specific constructor such as Box::pin. There is also little benefit in Pin with Unpin types (it can be passed to APIs that require it, such as Future::poll(), but in that case you don't need the Box).
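As a minimal sketch of the points above: the type NotUnpin and the helper functions below are hypothetical names made up for illustration. The example shows that Pin<Box<T>> can be created safely with Box::pin, that reading through the Pin needs no unsafe, and that mutable access to a pointee that is not Unpin does.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type for illustration: PhantomPinned opts it out of Unpin.
struct NotUnpin {
    value: u32,
    _pin: PhantomPinned,
}

// Box::pin takes ownership of the allocation, so the pointee can only be
// reached through the Pin from now on. No unsafe is needed to create it.
fn make_pinned(value: u32) -> Pin<Box<NotUnpin>> {
    Box::pin(NotUnpin { value, _pin: PhantomPinned })
}

// Shared access is always available: Pin<Box<T>> dereferences to T.
fn read(pinned: &Pin<Box<NotUnpin>>) -> u32 {
    pinned.value
}

// Mutable access to a pointee that is not Unpin requires unsafe, because a
// plain `&mut T` would allow moving the value out (e.g. with mem::swap).
fn set(pinned: &mut Pin<Box<NotUnpin>>, value: u32) {
    // SAFETY: we only overwrite a field and never move the pointee.
    unsafe { pinned.as_mut().get_unchecked_mut().value = value };
}

fn main() {
    let mut pinned = make_pinned(1);
    set(&mut pinned, 2);
    println!("value = {}", read(&pinned));
}
```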

Related

How to wrap a C pointer? Use *mut T, NonNull<T> or Unique<T>?

I'm calling a C constructor function that allocates memory and returns a pointer.
I found some similar questions; they use NonNull or *mut T to wrap it.
I also found another similar type, Unique, which takes ownership of T.
This makes me wonder: what is the difference between them, and how should I choose?
Use NonNull if your pointer is never null. Use *const T or *mut T otherwise.
Unique is a private pointer type in the standard library; you cannot use it (I think it used to be exposed unstably, but not anymore). It represents an owned pointer: for example, it is used for Box and Vec. It is basically NonNull, with some differences because it is considered owned: it implements Send/Sync (and other auto traits) if T does, and it is treated as owned by the aliasing model and Miri (although it is not obvious we want that).
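A rough sketch of the NonNull approach follows. The functions c_widget_new and c_widget_free are hypothetical stand-ins for the C API (a real binding would declare them in an extern "C" block); here they are simulated with Box so the example runs on its own. The wrapper name Widget is likewise made up.

```rust
use std::ptr::NonNull;

// Hypothetical stand-in for the C constructor; Box only simulates malloc.
fn c_widget_new(value: i32) -> *mut i32 {
    Box::into_raw(Box::new(value))
}

// Hypothetical stand-in for the C destructor.
// Caller contract: `ptr` came from c_widget_new and is freed exactly once.
unsafe fn c_widget_free(ptr: *mut i32) {
    drop(unsafe { Box::from_raw(ptr) });
}

// Owning wrapper: NonNull records that the null check already happened.
struct Widget {
    ptr: NonNull<i32>,
}

impl Widget {
    // NonNull::new returns None for a null pointer, so a failed allocation
    // becomes an ordinary Option instead of a latent null dereference.
    fn new(value: i32) -> Option<Widget> {
        NonNull::new(c_widget_new(value)).map(|ptr| Widget { ptr })
    }

    fn get(&self) -> i32 {
        // SAFETY: ptr is non-null, valid, and owned by this wrapper.
        unsafe { *self.ptr.as_ptr() }
    }
}

impl Drop for Widget {
    fn drop(&mut self) {
        // SAFETY: the pointer came from c_widget_new and is freed only here.
        unsafe { c_widget_free(self.ptr.as_ptr()) };
    }
}

fn main() {
    match Widget::new(7) {
        Some(widget) => println!("widget holds {}", widget.get()),
        None => eprintln!("constructor returned null"),
    }
}
```

If the C function can legitimately return null and you want to keep that state around, storing a *mut T and checking it at each use is the alternative the answer describes.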

When should I use Pin<Arc<T>> in Rust?

I'm pretty new to Rust and have a couple different implementations of a method that includes a closure referencing self. To use the reference in the closure effectively, I've been using Arc<Self> (I am multithreading) and Pin<Arc<Self>>.
I would like to make this method as generally memory efficient as possible. I assume pinning the Arc in memory would help with this. However, (a) I've read that Arcs are pinned and (b) it seems like Pin<Arc<T>> may require additional allocations.
What is Pin<Arc<T>> good for?
Adding Pin around some pointer type does not change the behavior of the program. It only adds a restriction on what further code you can write (and even that only if the T in Pin<Arc<T>> is not Unpin; most types are Unpin).
Therefore, there is no "memory efficiency" to be gained by adding Pin.
The only use of Pin is to allow working with types that must be pinned before they can be used, such as Futures.
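To illustrate, here is a small sketch (the function names are hypothetical). Pin is a transparent wrapper around the pointer, so Pin<Arc<T>> has the same size as Arc<T>, and Arc::pin performs the same single allocation as Arc::new.

```rust
use std::mem::size_of;
use std::pin::Pin;
use std::sync::Arc;

// Pin adds no data of its own: the pinned handle is as large as the Arc.
fn same_size() -> bool {
    size_of::<Pin<Arc<String>>>() == size_of::<Arc<String>>()
}

// Reading through a pinned Arc works exactly like reading through an Arc.
fn read_through_pin() -> usize {
    let pinned: Pin<Arc<String>> = Arc::pin(String::from("hello"));
    let other_handle = pinned.clone(); // bumps the reference count, as usual
    other_handle.len()
}

fn main() {
    println!("same size: {}", same_size());
    println!("length read through the pin: {}", read_through_pin());
}
```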

When is a static lifetime not appropriate?

I have found a lot of information across the web about Rust lifetimes, including information about static lifetimes. It makes sense to me that, in certain situations, you must guarantee that a reference will outlive everything.
For instance, I have a reference that I’m passing to a thread, and the compiler is requesting that the reference be marked as static. In this scenario, that seems to make sense because the compiler can’t know how long the thread will live and thus needs to ensure the passed reference outlives the thread. (I think that’s correct?)
I don’t know where this comes from, but I am always concerned that marking something with a static lifetime is something to be skeptical of, and avoided when possible.
So I wonder if that’s correct. Should I be critical of marking things with a static lifetime? Are there situations when the compiler will want to require one, but an alternate strategy might actually be more optimal?
What are some concrete ways that I can reason about the application of a static lifetime, and possibly determine when it might not be appropriate?
As you might have already guessed, there is no definitive, technical answer to this.
To a newcomer to Rust, 'static references seem to defeat the entire purpose of the borrowing system, and there is an urge to avoid them. With more experience, that urge goes away.
First of all, 'static is not as bad as it seems, since everything that has no other lifetime associated with it is 'static, e.g. String::new(). Notice that 'static does not mean that the value in question truly lives forever. It just means that the value can be made to live forever. In your threading example, the thread can't make any promises about its own lifetime, so it needs to be able to make everything passed to it live forever. Any owned value that does not contain lifetimes shorter than 'static (like vec![1,2,3]) can be made to live forever (simply by not destroying it) and is therefore 'static.
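A short sketch of that point, with hypothetical function names: an owned Vec satisfies the 'static bound of thread::spawn even though it is created and destroyed at run time.

```rust
use std::thread;

// `data` is moved into the closure. An owned Vec<i32> contains no borrowed
// data, so the closure is 'static and thread::spawn accepts it.
fn sum_in_thread(data: Vec<i32>) -> i32 {
    thread::spawn(move || data.iter().sum::<i32>())
        .join()
        .expect("worker thread panicked")
}

// A `T: 'static` bound accepts any owned value without short-lived borrows.
// Passing a reference to a local variable here would not compile.
fn requires_static<T: 'static>(_value: T) -> bool {
    true
}

fn main() {
    println!("sum = {}", sum_in_thread(vec![1, 2, 3]));
    println!("String is 'static: {}", requires_static(String::new()));
}
```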
Second, &'static - the static reference - does not come up often anyway. If it does, you'll usually be aware of why. You won't see a lot of fn foo(bar: &'static Bar) because there simply aren't that many use-cases for it, not because it is actively avoided.
There are situations where 'static does come up in surprising ways. Off the top of my head:
A Box<dyn Trait> is implicitly a Box<dyn Trait + 'static>. This is because when the type of the value inside the Box gets erased, it might have had lifetimes associated with it, and all (different) types must be valid for as long as the Box lives. Therefore all types need to share a common denominator with respect to their lifetimes, and Rust is defined to choose 'static. This choice is usually OK, but can lead to surprising "requires 'static" errors. You can generalize this explicitly to Box<dyn Trait + 'a>.
If you have a custom impl Drop on your type, the drop checker may not be able to prove that the destructor cannot observe values that have already been dropped. To prevent the Drop impl from accessing references to such values, the compiler requires all data borrowed by the type to strictly outlive the value itself, which can show up as surprisingly strict lifetime errors. This can be overcome by an unsafe impl (using the unstable #[may_dangle] attribute), which lifts the requirement.
Instead of &'static T, pass Arc<T> to the thread. This has only a tiny cost and ensures lifetimes will not be longer than necessary.
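Two of the items above can be sketched as follows. The trait Describe, the type Borrowed, and the function names are all hypothetical. The first half shows the explicit Box<dyn Trait + 'a> form for a type that borrows data; the second shows sharing data with a thread through Arc<T> rather than a &'static T.

```rust
use std::sync::Arc;
use std::thread;

trait Describe {
    fn describe(&self) -> String;
}

// Borrows a string slice, so the type is not 'static.
struct Borrowed<'a>(&'a str);

impl Describe for Borrowed<'_> {
    fn describe(&self) -> String {
        format!("borrowed: {}", self.0)
    }
}

// The explicit `+ 'a` replaces the implied `+ 'static`. Returning a plain
// Box<dyn Describe> here would be rejected by the compiler.
fn boxed<'a>(text: &'a str) -> Box<dyn Describe + 'a> {
    Box::new(Borrowed(text))
}

// The worker gets its own Arc handle; the data is freed when the last handle
// is dropped, so nothing has to live for the whole program.
// Returns (total length computed in the thread, number of words).
fn total_len_shared(words: Vec<String>) -> (usize, usize) {
    let shared = Arc::new(words);
    let for_worker = Arc::clone(&shared);
    let worker = thread::spawn(move || for_worker.iter().map(|w| w.len()).sum::<usize>());
    let total = worker.join().expect("worker thread panicked");
    // The original handle is still usable after the thread has finished.
    (total, shared.len())
}

fn main() {
    let local = String::from("hi");
    println!("{}", boxed(&local).describe());
    println!("{:?}", total_len_shared(vec!["ab".to_string(), "cde".to_string()]));
}
```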

Why is transmuting &T to &mut T Undefined Behaviour?

I want to reinterpret an immutable reference to a mutable reference (in an unsafe block) and be responsible for the safety checks on my own, yet it appears I cannot use mem::transmute() to do so.
let map_of_vecs: HashMap<usize, Vec<_>> = ...;
let vec = &map_of_vecs[&2];
// obtain a mutable reference to vec here
I do not want to wrap the Vecs into Cells because that would affect all other areas of code that use map_of_vecs and I only need mutability in one line.
I do not have mutable access to map_of_vecs.
The Rust optimiser makes the assumption that &mut T references are unique. For example, it might deduce that a particular piece of memory can be reused because a mutable reference to that memory exists but is never accessed again.
However, if you transmute a &T to a &mut T, then you are able to create multiple mutable references to the same data. If the compiler relies on the uniqueness assumption, you could end up dereferencing a value that has been overwritten with something else.
This is just one example of how the compiler might make use of the assumption that mutable references are unique. In fact, the compiler is free to use this information in any way it sees fit — which could (and likely will) change from version to version.
Even if you think you have guaranteed that the reference isn't aliased, you can't always guarantee that users of your code won't create more references. Even if you think you can be sure of that, the existence of references is extremely subtle and it's very easy to miss one. For example when you call a method that takes &self, that's a reference.
The Rust compiler annotates &T function parameters with the LLVM noalias and readonly attributes (provided that T does not contain any UnsafeCell parts). The noalias attribute tells LLVM that the memory behind this pointer may only be written to through this pointer (and not through any other pointers), and the readonly attribute tells LLVM that it can't be written to through this pointer (but possibly other pointers). In combination, the two attributes allow the LLVM optimiser to assume the memory is not changed at all during the execution of this function, and the code can be optimised based on this assumption. The optimiser may reorder instructions or remove code in a way that is only safe to do if you actually stick to this contract.
Another way the conversion can lead to undefined behaviour is for statics: immutable statics without UnsafeCells will be placed into read-only memory, so if you actually write to them, your code will segfault.
For parameters with UnsafeCells the compiler does not emit the readonly attribute, and statics containing an UnsafeCell are placed into writable memory.
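For contrast, here is a sketch of the sound route through UnsafeCell, using RefCell (which is built on UnsafeCell). It does change the map's value type, which the question wanted to avoid, so treat it as an illustration of why the cell types exist rather than a drop-in fix. The function name push_via_shared is hypothetical.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Mutates a Vec through a shared reference to the map. RefCell contains an
// UnsafeCell, so the compiler does not assume the contents are read-only.
// Returns false if the key is missing.
fn push_via_shared(map: &HashMap<usize, RefCell<Vec<i32>>>, key: usize, value: i32) -> bool {
    match map.get(&key) {
        Some(cell) => {
            // borrow_mut panics if the Vec is already borrowed elsewhere,
            // turning what would be undefined behaviour into a runtime error.
            cell.borrow_mut().push(value);
            true
        }
        None => false,
    }
}

fn main() {
    let mut map_of_vecs = HashMap::new();
    map_of_vecs.insert(2usize, RefCell::new(vec![1, 2]));
    let pushed = push_via_shared(&map_of_vecs, 2, 3);
    println!("pushed: {pushed}, contents: {:?}", map_of_vecs[&2].borrow());
}
```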

Understand smart pointers in Rust

I am a newbie to Rust and am trying to understand "smart pointers" in Rust. I have a basic understanding of how smart pointers work in C++ and have been using them for memory management for a few years. But to my great surprise, Rust also provides such a utility explicitly.
From a tutorial here (https://pcwalton.github.io/2013/03/18/an-overview-of-memory-management-in-rust.html), it seems that every raw pointer is automatically wrapped in a smart pointer, which seems very reasonable. Then why do we still need things like Box<T>, Rc<T>, and Ref<T>, as described here: https://doc.rust-lang.org/book/ch15-00-smart-pointers.html
Any comments will be appreciated a lot. Thanks.
You can think about the difference between a T and a Box<T> as the difference between a statically allocated object and a dynamically allocated object (the latter being created via a new expression in C++ terms).
In Rust, both T and Box<T> represent a variable that has ownership over the referent object (i.e. when the variable goes out of scope, the object will be destroyed, whether it was stored by value or by reference). On the contrary, &T and &mut T represent borrowing of the object (i.e. these variables are not responsible for destroying the object, and they cannot outlive the owner of the object).
By default, you'd probably want to use T, but sometimes you might want (or have) to use Box<T>. For example, you would use a Box<T> if you want to own a T that's too large to be allocated in place. You would also use it when the object doesn't have a known size at all, which means that your only choice to store it or pass it around is through the "pointer" (the Box<T>).
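Two common cases where Box is needed can be sketched like this (List, sum, and boxed_slice_len are hypothetical names): a recursive type, whose size would otherwise be infinite, and an unsized value such as a slice.

```rust
// A recursive type needs indirection: without the Box, List would have to
// contain itself inline and would have no finite size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn sum(list: &List) -> i32 {
    match list {
        List::Cons(head, tail) => head + sum(tail),
        List::Nil => 0,
    }
}

// [u8] has no size known at compile time, so it can only be owned through a
// pointer such as Box<[u8]>.
fn boxed_slice_len(items: Vec<u8>) -> usize {
    let unsized_contents: Box<[u8]> = items.into_boxed_slice();
    unsized_contents.len()
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("sum = {}", sum(&list));
    println!("len = {}", boxed_slice_len(vec![1, 2, 3]));
}
```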
In Rust, an object is generally either mutable or aliased, but not both. If you have given out immutable references to an object, you normally need to wait until those references are no longer in use before you can mutate that object again.
Additionally, Rust's immutability is transitive. If you receive an object immutably, it means that you have access to its contents (and the contents of those contents, and so on) also immutably.
Normally, all of these things are enforced at compile time. This means that you catch errors faster, but you are limited to being able to express only what the compiler can prove statically.
Like T and Box<T>, you may sometimes use RefCell<T>, which is another ownership type. But unlike T and Box<T>, the RefCell<T> enforces the borrow checking rules at runtime instead of compile time, meaning that sometimes you can do things with it that are safe but wouldn't pass the compiler's static borrow checker. The main example for this is getting a mutable reference to the interior of an object that was received immutably (which, under the statically enforced rules of Rust, would make the entire interior immutable).
The types Ref<T> and RefMut<T> are the runtime-checked equivalents of &T and &mut T respectively.
(EDIT: This whole thing is somewhat of a lie. &mut really means "unique borrow" and & means "non-unique borrow". Certain types, like mutexes, can be non-uniquely but still mutably borrowed, because otherwise they would be useless.)
Rust's ownership model tries to push you to write programs in which objects' lifetimes are known at compile time. This works well in certain scenarios, but makes other scenarios difficult or impossible to express.
Rc<T> and its atomic sibling Arc<T> are reference-counting wrappers of T. They offer you an alternative to the ownership model.
They are useful when you want to use an object and properly dispose of it, but it is not easy (or possible) to determine, at the moment you're writing the code, which specific variable should be the owner of that object (and therefore should take care of disposing of it). Much like in C++, this means that there is no single owner of the object and that the object will be disposed of by the last reference-counting wrapper that points to it.
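The shared-ownership behaviour can be observed through the strong count. This is a sketch with a hypothetical function name; Arc<T> works the same way with Arc::strong_count.

```rust
use std::rc::Rc;

// Returns the strong count before cloning, while two handles exist, and
// after one handle has been dropped.
fn counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let before = Rc::strong_count(&first);

    let second = Rc::clone(&first); // new handle, same allocation
    let during = Rc::strong_count(&first);

    drop(second); // the String stays alive: `first` still points to it
    let after = Rc::strong_count(&first);

    (before, during, after)
    // `first` is dropped here; as the last handle, it frees the String.
}

fn main() {
    println!("{:?}", counts());
}
```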
The article you linked uses outdated syntax. Certain smart pointers used to have special names and associated syntax that were removed some time before Rust 1.0:
Box<T> replaced ~T ("owned pointers")
Rc<T> replaced @T ("managed pointers")
Because the Internet never forgets, you can still find pre-1.0 documentation and articles (such as the one you linked) that use the old syntax. Check the date of the article: if it's before May 2015, you're dealing with an early, unstable Rust.
