Is it undefined behavior to do runtime borrow management with the help of raw pointers in Rust? - rust

As part of binding a C API to Rust, I have a mutable reference ph: &mut Ph, a struct struct EnsureValidContext<'a> { ph: &'a mut Ph }, and some methods:
impl Ph {
pub fn print(&mut self, s: &str) {
/*...*/
}
pub fn with_context<F, R>(&mut self, ctx: &Context, f: F) -> Result<R, InvalidContextError>
where
F: Fn(EnsureValidContext) -> R,
{
/*...*/
}
/* some others */
}
impl<'a> EnsureValidContext<'a> {
pub fn print(&mut self, s: &str) {
self.ph.print(s)
}
pub fn close(self) {}
/* some others */
}
I don't control these. I can only use these.
Now, the closure API is nice if you want the compiler to force you to think about performance (and the tradeoffs you have to make between performance and the behaviour you want. Context validation is expensive). However, let's say you just don't care about that and want it to just work.
I was thinking of making a wrapper that handles it for you:
enum ValidPh<'a> {
Ph(&'a mut Ph),
Valid(*mut Ph, EnsureValidContext<'a>),
Poisoned,
}
impl<'a> ValidPh<'a> {
pub fn print(&mut self) {
/* whatever the case, just call .print() on the inner object */
}
pub fn set_context(&mut self, ctx: &Context) {
/*...*/
}
pub fn close(&mut self) {
/*...*/
}
/* some others */
}
This would work by, whenever necessary, checking if we're a Ph or a Valid, and if we're a Ph we'd upgrade to a Valid by going:
fn upgrade(&mut self) {
if let Ph(_) = self { // don't call mem::replace unless we need to
if let Ph(ph) = mem::replace(self, Poisoned) {
let ptr = ph as *mut _;
let evc = ph.with_context(ph.get_context(), |evc| evc);
*self = Valid(ptr, evc);
}
}
}
Downgrading is different for each method, as it has to call the target method, but here's an example close:
pub fn close(&mut self) {
if let Valid(_, _) = self {
/* ok */
} else {
self.upgrade()
}
if let Valid(ptr, evc) = mem::replace(self, Invalid) {
evc.close(); // consume the evc, dropping the borrow.
// we can now use our original borrow, but since we don't have it anymore, bring it back using our trusty ptr
*self = unsafe { Ph(&mut *ptr) };
} else {
// this can only happen due to a bug in our code
unreachable!();
}
}
You get to use a ValidPh like:
/* given a &mut vph */
vph.print("hello world!");
if vph.set_context(ctx) {
vph.print("closing existing context");
vph.close();
}
vph.print("opening new context");
vph.open("context_name");
vph.print("printing in new context");
Without vph, you'd have to juggle &mut Ph and EnsureValidContext around on your own. While the Rust compiler makes this trivial (just follow the errors), you may want to let the library handle it automatically for you. Otherwise you might end up just calling the very expensive with_context for every operation, regardless of whether the operation can invalidate the context or not.
Note that this code is rough pseudocode. I haven't compiled or tested it yet.
One might argue I need an UnsafeCell or a RefCell or some other Cell. However, from reading this it appears UnsafeCell is only a lang item because of interior mutability — it's only necessary if you're mutating state through an &T, while in this case I have &mut T all the way.
However, my reading may be flawed. Does this code invoke UB?
(Full code of Ph and EnsureValidContext, including FFI bits, available here.)

Taking a step back, the guarantees upheld by Rust are:
&T is a reference to T which is potentially aliased,
&mut T is a reference to T which is guaranteed not to be aliased.
The crux of the question therefore is: what does guaranteed not to be aliased means?
Let's consider a safe Rust sample:
struct Foo(u32);
impl Foo {
fn foo(&mut self) { self.bar(); }
fn bar(&mut self) { *self.0 += 1; }
}
fn main() { Foo(0).foo(); }
If we take a peek at the stack when Foo::bar is being executed, we'll see at least two pointers to Foo: one in bar and one in foo, and there may be further copies on the stack or in other registers.
So, clearly, there are aliases in existence. How come! It's guaranteed NOT to be aliased!
Take a deep breath: how many of those aliases can you access at the time?
Only 1. The guarantee of no aliasing is not spatial but temporal.
I would think, therefore, that at any point in time, if a &mut T is accessible, then no other reference to this instance must be accessible.
Having a raw pointer (*mut T) is perfectly fine, it requires unsafe to access; however forming a second reference may or may not be safe, even without using it, so I would avoid it.

Rust's memory model is not rigorously defined yet, so it's hard to say for sure, but I believe it's not undefined behavior to:
carry a *mut Ph around while a &'a mut Ph is also reachable from another path, so long as you don't dereference the *mut Ph, even just for reading, and don't convert it to a &Ph or &mut Ph, because mutable references grant exclusive access to the pointee.
cast the *mut Ph back to a &'a mut Ph once the other &'a mut Ph falls out of scope.

Related

How can you use an immutable Option by reference that contains a mutable reference?

Here's a Thing:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
And here's a function that tries to mutate a Thing and returns either true or false, depending on if a Thing is available:
fn try_increment(handle: Option<&mut Thing>) -> bool {
if let Some(t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
Here's a sample of usage:
fn main() {
try_increment(None);
let mut thing = Thing(0);
try_increment(Some(&mut thing));
try_increment(Some(&mut thing));
try_increment(None);
}
As written, above, it works just fine (link to Rust playground). Output below:
warning: increment failed
incremented: 1
incremented: 2
warning: increment failed
The problem arises when I want to write a function that mutates the Thing twice. For example, the following does not work:
fn try_increment_twice(handle: Option<&mut Thing>) {
try_increment(handle);
try_increment(handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
The error makes perfect sense. The first call to try_increment(handle) gives ownership of handle away and so the second call is illegal. As is often the case, the Rust compiler yields a sensible error message:
|
24 | try_increment(handle);
| ------ value moved here
25 | try_increment(handle);
| ^^^^^^ value used here after move
|
In an attempt to solve this, I thought it would make sense to pass handle by reference. It should be an immutable reference, mind, because I don't want try_increment to be able to change handle itself (assigning None to it, for example) only to be able to call mutations on its value.
My problem is that I couldn't figure out how to do this.
Here is the closest working version that I could get:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
// PROBLEM: this line is allowed!
// (*handle) = None;
if let Some(ref mut t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
This code runs, as expected, but the Option is now passed about by mutable reference and that is not what I want:
I'm allowed to mutate the Option by reassigning None to it, breaking all following mutations. (Uncomment line 12 ((*handle) = None;) for example.)
It's messy: There are a whole lot of extraneous &mut's lying about.
It's doubly messy: Heaven only knows why I must use ref mut in the if let statement while the convention is to use &mut everywhere else.
It defeats the purpose of having the complicated borrow-checking and mutability checking rules in the compiler.
Is there any way to actually achieve what I want: passing an immutable Option around, by reference, and actually being able to use its contents?
You can't extract a mutable reference from an immutable one, even a reference to its internals. That's kind of the point! Multiple aliases of immutable references are allowed so, if Rust allowed you to do that, you could have a situation where two pieces of code are able to mutate the same data at the same time.
Rust provides several escape hatches for interior mutability, for example the RefCell:
use std::cell::RefCell;
fn try_increment(handle: &Option<RefCell<Thing>>) -> bool {
if let Some(t) = handle {
t.borrow_mut().increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(handle: Option<RefCell<Thing>>) {
try_increment(&handle);
try_increment(&handle);
}
fn main() {
let mut thing = RefCell::new(Thing(0));
try_increment_twice(Some(thing));
try_increment_twice(None);
}
TL;DR: The answer is No, I can't.
After the discussions with #Peter Hall and #Stargateur, I have come to understand why I need to use &mut Option<&mut Thing> everywhere. RefCell<> would also be a feasible work-around but it is no neater and does not really achieve the pattern I was originally seeking to implement.
The problem is this: if one were allowed to mutate the object for which one has only an immutable reference to an Option<&mut T> one could use this power to break the borrowing rules entirely. Concretely, you could, essentially, have many mutable references to the same object because you could have many such immutable references.
I knew there was only one mutable reference to the Thing (owned by the Option<>) but, as soon as I started taking references to the Option<>, the compiler no longer knew that there weren't many of those.
The best version of the pattern is as follows:
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
if let Some(ref mut t) = handle {
t.increment_self();
true
}
else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
Notes:
The Option<> holds the only extant mutable reference to the Thing
try_increment_twice() takes ownership of the Option<>
try_increment() must take the Option<> as &mut so that the compiler knows that it has the only mutable reference to the Option<>, during the call
If the compiler knows that try_increment() has the only mutable reference to the Option<> which holds the unique mutable reference to the Thing, the compiler knows that the borrow rules have not been violated.
Another Experiment
The problem of the mutability of Option<> remains because one can call take() et al. on a mutable Option<>, breaking everything following.
To implement the pattern that I wanted, I need something that is like an Option<> but, even if it is mutable, it cannot be mutated. Something like this:
struct Handle<'a> {
value: Option<&'a mut Thing>,
}
impl<'a> Handle<'a> {
fn new(value: &'a mut Thing) -> Self {
Self {
value: Some(value),
}
}
fn empty() -> Self {
Self {
value: None,
}
}
fn try_mutate<T, F: Fn(&mut Thing) -> T>(&mut self, mutation: F) -> Option<T> {
if let Some(ref mut v) = self.value {
Some(mutation(v))
}
else {
None
}
}
}
Now, I thought, I can pass around &mut Handle's all day long and know that someone who has a Handle can only mutate its contents, not the handle itself. (See Playground)
Unfortunately, even this gains nothing because, if you have a mutable reference, you can always reassign it with the dereferencing operator:
fn try_increment(handle: &mut Handle) -> bool {
if let Some(_) = handle.try_mutate(|t| { t.increment_self() }) {
// This breaks future calls:
(*handle) = Handle::empty();
true
}
else {
println!("warning: increment failed");
false
}
}
Which is all fine and well.
Bottom line conclusion: just use &mut Option<&mut T>

Can a type know when a mutable borrow to itself has ended?

I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?
Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.
Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.

Rc<Trait> to Option<T>?

I'm trying to implement a method that looks like:
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
Rc::try_unwrap(rc).ok().and_then(|trait_object| {
let b: Box<Any> = unimplemented!();
b.downcast().ok().map(|b| *b)
})
}
However, try_unwrap doesn't work on trait objects (which makes sense, as they're unsized). My next thought was to try to find some function that unwraps Rc<Any> into Box<Any> directly. The closest thing I could find would be
if Rc::strong_count(&rc) == 1 {
Some(unsafe {
Box::from_raw(Rc::into_raw(rc))
})
} else {
None
}
However, Rc::into_raw() appears to require that the type contained in the Rc to be Sized, and I'd ideally not like to have to use unsafe blocks.
Is there any way to implement this?
Playground Link, I'm looking for an implementation of rc_to_box here.
Unfortunately, it appears that the API of Rc is lacking the necessary method to be able to get ownership of the wrapped type when it is !Sized.
The only method which may return the interior item of a Rc is Rc::try_unwrap, however it returns Result<T, Rc<T>> which requires that T be Sized.
In order to do what you wish, you would need to have a method with a signature: Rc<T> -> Result<Box<T>, Rc<T>>, which would allow T to be !Sized, and from there you could extract Box<Any> and perform the downcast call.
However, this method is impossible due to how Rc is implemented. Here is a stripped down version of Rc:
struct RcBox<T: ?Sized> {
strong: Cell<usize>,
weak: Cell<usize>,
value: T,
}
pub struct Rc<T: ?Sized> {
ptr: *mut RcBox<T>,
_marker: PhantomData<T>,
}
Therefore, the only Box you can get out of Rc<T> is Box<RcBox<T>>.
Note that the design is severely constrained here:
single-allocation mandates that all 3 elements be in a single struct
T: ?Sized mandates that T be the last field
so there is little room for improvement in general.
However, in your specific case, it is definitely possible to improve on the generic situation. It does, of course, require unsafe code. And while it works fairly well with Rc, implementing it with Arc would be complicated by the potential data-races.
Oh... and the code is provided as is, no warranty implied ;)
use std::any::Any;
use std::{cell, mem, ptr};
use std::rc::Rc;
struct RcBox<T: ?Sized> {
strong: cell::Cell<usize>,
_weak: cell::Cell<usize>,
value: T,
}
fn concretify<T: Any>(rc: Rc<Any>) -> Option<T> {
// Will be responsible for freeing the memory if there is no other weak
// pointer by the end of this function.
let _guard = Rc::downgrade(&rc);
unsafe {
let killer: &RcBox<Any> = {
let killer: *const RcBox<Any> = mem::transmute(rc);
&*killer
};
if killer.strong.get() != 1 { return None; }
// Do not forget to decrement the count if we do take ownership,
// as otherwise memory will not get released.
let result = killer.value.downcast_ref().map(|r| {
killer.strong.set(0);
ptr::read(r as *const T)
});
// Do not forget to destroy the content of the box if we did not
// take ownership
if result.is_none() {
let _: Rc<Any> = mem::transmute(killer as *const RcBox<Any>);
}
result
}
}
fn main() {
let x: Rc<Any> = Rc::new(1);
println!("{:?}", concretify::<i32>(x));
}
I don't think it's possible to implement your concretify function if you're expecting it to move the original value back out of the Rc; see this question for why.
If you're willing to return a clone, it's straightforward:
fn concretify<T: Any+Clone>(rc: Rc<Any>) -> Option<T> {
rc.downcast_ref().map(Clone::clone)
}
Here's a test:
#[derive(Debug,Clone)]
struct Foo(u32);
#[derive(Debug,Clone)]
struct Bar(i32);
fn main() {
let rc_foo: Rc<Any> = Rc::new(Foo(42));
let rc_bar: Rc<Any> = Rc::new(Bar(7));
let foo: Option<Foo> = concretify(rc_foo);
println!("Got back: {:?}", foo);
let bar: Option<Foo> = concretify(rc_bar);
println!("Got back: {:?}", bar);
}
This outputs:
Got back: Some(Foo(42))
Got back: None
Playground
If you want something more "movey", and creating your values is cheap, you could also make a dummy, use downcast_mut() instead of downcast_ref(), and then std::mem::swap with the dummy.

Drop a Rust void pointer stored in an FFI

I'm wrapping a C API which allows the caller to set/get an arbitrary pointer via function calls. In this way, the C API allows a caller to associate arbitrary data with one of the C API objects. This data is not used in any callbacks, it's just a pointer that a user can stash away and get at later.
My wrapper struct implements the Drop trait for the C object that contains this pointer. What I'd like to be able to do, but am not sure it's possible, is have the data dropped correctly if the pointer is not null when the wrapper struct drops. I'm not sure how I would recover the correct type though from a raw c_void pointer.
Two alternatives I'm thinking of are
Implement the behavior of these two calls in the wrapper. Don't make any calls to the C API.
Don't attempt to offer any kind of safer interface to these functions. Document that the pointer must be managed by the caller of the wrapper.
Is what I want to do possible? If not, is there a generally accepted practice for these kinds of situations?
A naive + fully automatic approach is NOT possible for the following reasons:
freeing memory does not call drop/deconstructors/...: the C API can be used from languages which can have objects which should be deconstructed properly, e.g. C++ or Rust itself. So when you only store a memory pointer you do not know you to call the proper function (you neither know which function not how the calling conventions look like).
which memory allocator?: memory allocation and deallocation isn't a trivial thing. your program needs to request memory from the OS and then manage this resources in an intelligent way to be efficient and correct. This is usually done by a library. In case of Rust, jemalloc is used (but can be changed). So even when you ask the API caller to only pass Plain Old Data (which should be easier to destruct) you still don't know which library function to call to deallocate memory. Just using libc::free won't work (it can but it could horrible fail).
Solutions:
dealloc callback: you can ask the API user to set an additional pointer to, let's say a void destruct(void* ptr) function. If this one is not NULL, you call that function during your drop. You could also use int as an return type to signal when the destruction went wrong. In that case you could for example panic!.
global callback: let's assume you requested your user to only pass POD (plain old data). To know which free function of the memory allocator to call, you could request the user to register a global void (*free)(void* ptr) pointer which is called during drop. You could also make that one optional.
Although I was able to follow the advice in this thread, I wasn't entirely satisfied with my results, so I asked the question on the Rust forums and found the answer I was really looking for. (play)
use std::any::Any;
static mut foreign_ptr: *mut () = 0 as *mut ();
unsafe fn api_set_fp(ptr: *mut ()) {
foreign_ptr = ptr;
}
unsafe fn api_get_fp() -> *mut() {
foreign_ptr
}
struct ApiWrapper {}
impl ApiWrapper {
fn set_foreign<T: Any>(&mut self, value: Box<T>) {
self.free_foreign();
unsafe {
let raw = Box::into_raw(Box::new(value as Box<Any>));
api_set_fp(raw as *mut ());
}
}
fn get_foreign_ref<T: Any>(&self) -> Option<&T> {
unsafe {
let raw = api_get_fp() as *const Box<Any>;
if !raw.is_null() {
let b: &Box<Any> = &*raw;
b.downcast_ref()
} else {
None
}
}
}
fn get_foreign_mut<T: Any>(&mut self) -> Option<&mut T> {
unsafe {
let raw = api_get_fp() as *mut Box<Any>;
if !raw.is_null() {
let b: &mut Box<Any> = &mut *raw;
b.downcast_mut()
} else {
None
}
}
}
fn free_foreign(&mut self) {
unsafe {
let raw = api_get_fp() as *mut Box<Any>;
if !raw.is_null() {
Box::from_raw(raw);
}
}
}
}
impl Drop for ApiWrapper {
fn drop(&mut self) {
self.free_foreign();
}
}
struct MyData {
i: i32,
}
impl Drop for MyData {
fn drop(&mut self) {
println!("Dropping MyData with value {}", self.i);
}
}
fn main() {
let p1 = Box::new(MyData {i: 1});
let mut api = ApiWrapper{};
api.set_foreign(p1);
{
let p2 = api.get_foreign_ref::<MyData>().unwrap();
println!("i is {}", p2.i);
}
api.set_foreign(Box::new("Hello!"));
{
let p3 = api.get_foreign_ref::<&'static str>().unwrap();
println!("payload is {}", p3);
}
}

Why is a Cell used to create unmovable objects?

So I ran into this code snippet showing how to create "unmoveable" types in Rust - moves are prevented because the compiler treats the object as borrowed for its whole lifetime.
use std::cell::Cell;
use std::marker;
struct Unmovable<'a> {
lock: Cell<marker::ContravariantLifetime<'a>>,
marker: marker::NoCopy
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: Cell::new(marker::ContravariantLifetime),
marker: marker::NoCopy
}
}
fn lock(&'a self) {
self.lock.set(marker::ContravariantLifetime);
}
fn new_in(self_: &'a mut Option<Unmovable<'a>>) {
*self_ = Some(Unmovable::new());
self_.as_ref().unwrap().lock();
}
}
fn main(){
let x = Unmovable::new();
x.lock();
// error: cannot move out of `x` because it is borrowed
// let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// error: cannot move out of `y` because it is borrowed
// let z = y;
assert_eq!(std::mem::size_of::<Unmovable>(), 0)
}
I don't yet understand how this works. My guess is that the lifetime of the borrow-pointer argument is forced to match the lifetime of the lock field. The weird thing is, this code continues working in the same way if:
I change ContravariantLifetime<'a> to CovariantLifetime<'a>, or to InvariantLifetime<'a>.
I remove the body of the lock method.
But, if I remove the Cell, and just use lock: marker::ContravariantLifetime<'a> directly, as so:
use std::marker;
struct Unmovable<'a> {
lock: marker::ContravariantLifetime<'a>,
marker: marker::NoCopy
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: marker::ContravariantLifetime,
marker: marker::NoCopy
}
}
fn lock(&'a self) {
}
fn new_in(self_: &'a mut Option<Unmovable<'a>>) {
*self_ = Some(Unmovable::new());
self_.as_ref().unwrap().lock();
}
}
fn main(){
let x = Unmovable::new();
x.lock();
// does not error?
let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// does not error?
let z = y;
assert_eq!(std::mem::size_of::<Unmovable>(), 0)
}
Then the "Unmoveable" object is allowed to move. Why would that be?
The true answer is comprised of a moderately complex consideration of lifetime variancy, with a couple of misleading aspects of the code that need to be sorted out.
For the code below, 'a is an arbitrary lifetime, 'small is an arbitrary lifetime that is smaller than 'a (this can be expressed by the constraint 'a: 'small), and 'static is used as the most common example of a lifetime that is larger than 'a.
Here are the facts and steps to follow in the consideration:
Normally, lifetimes are contravariant; &'a T is contravariant with regards to 'a (as is T<'a> in the absence of any variancy markers), meaning that if you have a &'a T, it’s OK to substitute a longer lifetime than 'a, e.g. you can store in such a place a &'static T and treat it as though it were a &'a T (you’re allowed to shorten the lifetime).
In a few places, lifetimes can be invariant; the most common example is &'a mut T which is invariant with regards to 'a, meaning that if you have a &'a mut T, you cannot store a &'small mut T in it (the borrow doesn’t live long enough), but you also cannot store a &'static mut T in it, because that would cause trouble for the reference being stored as it would be forgotten that it actually lived for longer, and so you could end up with multiple simultaneous mutable references being created.
A Cell contains an UnsafeCell; what isn’t so obvious is that UnsafeCell is magic, being wired to the compiler for special treatment as the language item named “unsafe”. Importantly, UnsafeCell<T> is invariant with regards to T, for similar sorts of reasons to the invariance of &'a mut T with regards to 'a.
Thus, Cell<any lifetime variancy marker> will actually behave the same as Cell<InvariantLifetime<'a>>.
Furthermore, you don’t actually need to use Cell any more; you can just use InvariantLifetime<'a>.
Returning to the example with the Cell wrapping removed and a ContravariantLifetime (actually equivalent to just defining struct Unmovable<'a>;, for contravariance is the default as is no Copy implementation): why does it allow moving the value? … I must confess, I don’t grok this particular case yet and would appreciate some help myself in understanding why it’s allowed. It seems back to front, that covariance would allow the lock to be shortlived but that contravariance and invariance wouldn’t, but in practice it seems that only invariance is performing the desired function.
Anyway, here’s the final result. Cell<ContravariantLifetime<'a>> is changed to InvariantLifetime<'a> and that’s the only functional change, making the lock method function as desired, taking a borrow with an invariant lifetime. (Another solution would be to have lock take &'a mut self, for a mutable reference is, as already discussed, invariant; this is inferior, however, as it requires needless mutability.)
One other thing that needs mentioning: the contents of the lock and new_in methods are completely superfluous. The body of a function will never change the static behaviour of the compiler; only the signature matters. The fact that the lifetime parameter 'a is marked invariant is the key point. So the whole “construct an Unmovable object and call lock on it” part of new_in is completely superfluous. Similarly setting the contents of the cell in lock was a waste of time. (Note that it is again the invariance of 'a in Unmovable<'a> that makes new_in work, not the fact that it is a mutable reference.)
use std::marker;
struct Unmovable<'a> {
lock: marker::InvariantLifetime<'a>,
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: marker::InvariantLifetime,
}
}
fn lock(&'a self) { }
fn new_in(_: &'a mut Option<Unmovable<'a>>) { }
}
fn main() {
let x = Unmovable::new();
x.lock();
// This is an error, as desired:
let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// Yay, this is an error too!
let z = y;
}
An interesting problem! Here's my understanding of it...
Here's another example that doesn't use Cell:
#![feature(core)]
use std::marker::InvariantLifetime;
struct Unmovable<'a> { //'
lock: Option<InvariantLifetime<'a>>, //'
}
impl<'a> Unmovable<'a> {
fn lock_it(&'a mut self) { //'
self.lock = Some(InvariantLifetime)
}
}
fn main() {
let mut u = Unmovable { lock: None };
u.lock_it();
let v = u;
}
(Playpen)
The important trick here is that the structure needs to borrow itself. Once we have done that, it can no longer be moved because any move would invalidate the borrow. This isn't conceptually different from any other kind of borrow:
struct A(u32);
fn main() {
let a = A(42);
let b = &a;
let c = a;
}
The only thing is that you need some way of letting the struct contain its own reference, which isn't possible to do at construction time. My example uses Option, which requires &mut self and the linked example uses Cell, which allows for interior mutability and just &self.
Both examples use a lifetime marker because it allows the typesystem to track the lifetime without needing to worry about a particular instance.
Let's look at your constructor:
fn new() -> Unmovable<'a> { //'
Unmovable {
lock: marker::ContravariantLifetime,
marker: marker::NoCopy
}
}
Here, the lifetime put into lock is chosen by the caller, and it ends up being the normal lifetime of the Unmovable struct. There's no borrow of self.
Let's next look at your lock method:
fn lock(&'a self) {
}
Here, the compiler knows that the lifetime won't change. However, if we make it mutable:
fn lock(&'a mut self) {
}
Bam! It's locked again. This is because the compiler knows that the internal fields could change. We can actually apply this to our Option variant and remove the body of lock_it!

Resources