Why does this variable definition imply static lifetime? - multithreading

I'm trying to execute a function on chunks of a vector and then send the result back using the message passing library.
However, I get a strange error about the lifetime of the vector that isn't even participating in the thread operations:
src/lib.rs:153:27: 154:25 error: borrowed value does not live long enough
src/lib.rs:153 let extended_segments = (segment_size..max_val)
error: src/lib.rs:154 .collect::<Vec<_>>()borrowed value does not live long enough
note: reference must be valid for the static lifetime...:153
let extended_segments = (segment_size..max_val)
src/lib.rs:153:3: 155:27: 154 .collect::<Vec<_>>()
note: but borrowed value is only valid for the statement at 153:2:
reference must be valid for the static lifetime...
src/lib.rs:
let extended_segments = (segment_size..max_val)
consider using a `let` binding to increase its lifetime
I tried moving around the iterator and adding lifetimes to different places, but I couldn't get the checker to pass and still stay on type.
The offending code is below, based on the concurrency chapter in the Rust book. (Complete code is at github.)
use std::sync::mpsc;
use std::thread;
fn sieve_segment(a: &[usize], b: &[usize]) -> Vec<usize> {
vec![]
}
fn eratosthenes_sieve(val: usize) -> Vec<usize> {
vec![]
}
pub fn segmented_sieve_parallel(max_val: usize, mut segment_size: usize) -> Vec<usize> {
if max_val <= ((2 as i64).pow(16) as usize) {
// early return if the highest value is small enough (empirical)
return eratosthenes_sieve(max_val);
}
if segment_size > ((max_val as f64).sqrt() as usize) {
segment_size = (max_val as f64).sqrt() as usize;
println!("Segment size is larger than √{}. Reducing to {} to keep resource use down.",
max_val,
segment_size);
}
let small_primes = eratosthenes_sieve((max_val as f64).sqrt() as usize);
let mut big_primes = small_primes.clone();
let (tx, rx): (mpsc::Sender<Vec<usize>>, mpsc::Receiver<Vec<usize>>) = mpsc::channel();
let extended_segments = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
for this_segment in extended_segments.clone() {
let small_primes = small_primes.clone();
let tx = tx.clone();
thread::spawn(move || {
let sieved_segment = sieve_segment(&small_primes, this_segment);
tx.send(sieved_segment).unwrap();
});
}
for _ in 1..extended_segments.count() {
big_primes.extend(&rx.recv().unwrap());
}
big_primes
}
fn main() {}
How do I understand and avoid this error? I'm not sure how to make the lifetime of the thread closure static as in this question and still have the function be reusable (i.e., not main()). I'm not sure how to "consume all things that come into [the closure]" as mentioned in this question. And I'm not sure where to insert .map(|s| s.into()) to ensure that all references become moves, nor am I sure I want to.

When trying to reproduce a problem, I'd encourage you to create a MCVE by removing all irrelevant code. In this case, something like this seems to produce the same error:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>()
.chunks(segment_size);
}
fn main() {}
Let's break that down:
Create an iterator between numbers.
Collect all of them into a Vec<usize>.
Return an iterator that contains references to the vector.
Since the vector isn't bound to any variable, it's dropped at the end of the statement. This would leave the iterator pointing to an invalid region of memory, so that's disallowed.
Check out the definition of slice::chunks:
fn chunks(&self, size: usize) -> Chunks<T>
pub struct Chunks<'a, T> where T: 'a {
// some fields omitted
}
The lifetime marker 'a lets you know that the iterator contains a reference to something. Lifetime elision has removed the 'a from the function, which looks like this, expanded:
fn chunks<'a>(&'a self, size: usize) -> Chunks<'a, T>
Check out this line of the error message:
help: consider using a let binding to increase its lifetime
You can follow that as such:
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo = (segment_size..max_val)
.collect::<Vec<_>>();
let bar = foo.chunks(segment_size);
}
fn main() {}
Although I'd write it as
fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
let foo: Vec<_> = (segment_size..max_val).collect();
let bar = foo.chunks(segment_size);
}
fn main() {}
Re-inserting this code back into your original problem won't solve the problem, but it will be much easier to understand. That's because you are attempting to pass a reference to thread::spawn, which may outlive the current thread. Thus, everything passed to thread::spawn must have the 'static lifetime. There are tons of questions that detail why that must be prevented and a litany of solutions, including scoped threads and cloning the vector.
Cloning the vector is the easiest, but potentially inefficient:
for this_segment in extended_segments.clone() {
let this_segment = this_segment.to_vec();
// ...
}

Related

Multiple Mutable Borrows from Struct Hashmap

Running into an ownership issue when attempting to reference multiple values from a HashMap in a struct as parameters in a function call. Here is a PoC of the issue.
use std::collections::HashMap;
struct Resource {
map: HashMap<String, String>,
}
impl Resource {
pub fn new() -> Self {
Resource {
map: HashMap::new(),
}
}
pub fn load(&mut self, key: String) -> &mut String {
self.map.get_mut(&key).unwrap()
}
}
fn main() {
// Initialize struct containing a HashMap.
let mut res = Resource {
map: HashMap::new(),
};
res.map.insert("Item1".to_string(), "Value1".to_string());
res.map.insert("Item2".to_string(), "Value2".to_string());
// This compiles and runs.
let mut value1 = res.load("Item1".to_string());
single_parameter(value1);
let mut value2 = res.load("Item2".to_string());
single_parameter(value2);
// This has ownership issues.
// multi_parameter(value1, value2);
}
fn single_parameter(value: &String) {
println!("{}", *value);
}
fn multi_parameter(value1: &mut String, value2: &mut String) {
println!("{}", *value1);
println!("{}", *value2);
}
Uncommenting multi_parameter results in the following error:
28 | let mut value1 = res.load("Item1".to_string());
| --- first mutable borrow occurs here
29 | single_parameter(value1);
30 | let mut value2 = res.load("Item2".to_string());
| ^^^ second mutable borrow occurs here
...
34 | multi_parameter(value1, value2);
| ------ first borrow later used here
It would technically be possible for me to break up the function calls (using the single_parameter function approach), but it would be more convenient to pass the
variables to a single function call.
For additional context, the actual program where I'm encountering this issue is an SDL2 game where I'm attempting to pass multiple textures into a single function call to be drawn, where the texture data may be modified within the function.
This is currently not possible, without resorting to unsafe code or interior mutability at least. There is no way for the compiler to know if two calls to load will yield mutable references to different data as it cannot always infer the value of the key. In theory, mutably borrowing both res.map["Item1"] and res.map["Item2"] would be fine as they would refer to different values in the map, but there is no way for the compiler to know this at compile time.
The easiest way to do this, as already mentioned, is to use a structure that allows interior mutability, like RefCell, which typically enforces the memory safety rules at run-time before returning a borrow of the wrapped value. You can also work around the borrow checker in this case by dealing with mut pointers in unsafe code:
pub fn load_many<'a, const N: usize>(&'a mut self, keys: [&str; N]) -> [&'a mut String; N] {
// TODO: Assert that keys are distinct, so that we don't return
// multiple references to the same value
keys.map(|key| self.load(key) as *mut _)
.map(|ptr| unsafe { &mut *ptr })
}
Rust Playground
The TODO is important, as this assertion is the only way to ensure that the safety invariant of only having one mutable reference to any value at any time is upheld.
It is, however, almost always better (and easier) to use a known safe interior mutation abstraction like RefCell rather than writing your own unsafe code.

How to correctly use trait objects , mutably iterating over mutable trait objects container, while mutating container itself?

I am trying to write a packet parser, where basically one builds up a packet by parsing each Layer in the packet. The packet then holds those 'layers' in a vector.
The ~pseudo code~ code with compilation errors is something like the following -
Also added comments below - for each step. I have experimented with RefCell , but could not get that working. Essentially the challenges are enumerated at the end of the code.
The basic pattern is as follows - Get the object of a Layer type (Every Layer type will return a default next object based upon some field in the current layer as a 'boxed trait object'.)
Edit: I am adding a code that's more than a pseudo code - Also added following compilation errors. May be a way to figure out how to fix these errors could solve the problems.!
#[derive(Debug, Default)]
pub struct Packet<'a> {
data: Option<&'a [u8]>,
meta: PacketMetadata,
layers: Vec<Box<dyn Layer<'a>>>,
}
pub trait Layer<'a>: Debug {
fn from_u8<'b>(&mut self, bytes: &'b [u8]) -> Result<(Option<Box<dyn Layer>>, usize), Error>;
}
#[derive(Debug, Default)]
pub struct PacketMetadata {
timestamp: Timestamp,
inface: i8,
len: u16,
caplen: u16,
}
impl<'a> Packet<'a> {
fn from_u8(bytes: &'a [u8], _encap: EncapType) -> Result<Self, Error> {
let mut p = Packet::default();
let eth = ethernet::Ethernet::default();
let mut layer: RefCell<Box<dyn Layer>> = RefCell::new(Box::new(eth));
let mut res: (Option<Box<dyn Layer>>, usize);
let mut start = 0;
loop {
let mut decode_layer = layer.borrow_mut();
// process it
res = decode_layer.from_u8(&bytes[start..])?;
if res.0.is_none() {
break;
}
// if the layer exists, get it in a layer.
let boxed = layer.replace(res.0.unwrap());
start = res.1;
// append the layer to layers.
p.layers.push(boxed);
}
Ok(p)
}
}
Compilation Errors
error[E0515]: cannot return value referencing local variable `decode_layer`
--> src/lib.rs:81:9
|
68 | res = decode_layer.from_u8(&bytes[start..])?;
| ------------ `decode_layer` is borrowed here
...
81 | Ok(p)
| ^^^^^ returns a value referencing data owned by the current function
error[E0515]: cannot return value referencing local variable `layer`
--> src/lib.rs:81:9
|
65 | let mut decode_layer = layer.borrow_mut();
| ----- `layer` is borrowed here
...
81 | Ok(p)
| ^^^^^ returns a value referencing data owned by the current function
error: aborting due to 2 previous errors; 3 warnings emitted
It's not clear why the above errors come. I am using the values returned by the calls. (The 3: warnings shown above can be ignored, they are unused warnings.)
The challenges -
p.layers.last_mut and p.layers.push are simultaneous mutable borrows - not allowed. I could somehow put it behind a RefCell, but how that's not clear.
This code is similar in pattern to syn::token::Tokens, however one basic difference being, there an Enum is used(TokenTree). In the above example I cannot use Enum because the list of protocols to be supported is potentially unbounded.
I cannot use Layer trait without Trait Objects due to the loop construct.
The pattern can be thought of as - mutably iterating over a container of Trait objects while updating the container itself.
Perhaps I am missing something very basic.
The problem with the above code is due to lifetime annotation on the Layer trait. If that lifetime annotation is removed, the above code indeed compiles with a few modifications as posted below -
// Layer Trait definition
pub trait Layer: Debug {
fn from_u8(&mut self, bytes: &[u8]) -> Result<(Option<Box<dyn Layer>>, usize), Error>;
}
impl<'a> Packet<'a> {
fn from_u8(bytes: &'a [u8], _encap: EncapType) -> Result<Self, Error> {
let mut p = Packet::default();
let eth = ethernet::Ethernet::default();
let layer: RefCell<Box<dyn Layer>> = RefCell::new(Box::new(eth));
let mut res: (Option<Box<dyn Layer>>, usize);
let mut start = 0;
loop {
{
// Do a `borrow_mut` in it's own scope, that gets dropped at the end.
let mut decode_layer = layer.borrow_mut();
res = decode_layer.from_u8(&bytes[start..])?;
}
if res.0.is_none() {
// This is just required to push something to the RefCell, that will get dropped anyways.
let fake_boxed = Box::new(FakeLayer {});
let boxed = layer.replace(fake_boxed);
p.layers.push(boxed);
break;
}
// if the layer exists, get it in a layer.
let boxed = layer.replace(res.0.unwrap());
start = res.1;
// append the layer to layers.
p.layers.push(boxed);
}
Ok(p)
}
}

How can you use an immutable Option by reference that contains a mutable reference?

Here's a Thing:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
And here's a function that tries to mutate a Thing and returns either true or false, depending on if a Thing is available:
fn try_increment(handle: Option<&mut Thing>) -> bool {
if let Some(t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
Here's a sample of usage:
fn main() {
try_increment(None);
let mut thing = Thing(0);
try_increment(Some(&mut thing));
try_increment(Some(&mut thing));
try_increment(None);
}
As written, above, it works just fine (link to Rust playground). Output below:
warning: increment failed
incremented: 1
incremented: 2
warning: increment failed
The problem arises when I want to write a function that mutates the Thing twice. For example, the following does not work:
fn try_increment_twice(handle: Option<&mut Thing>) {
try_increment(handle);
try_increment(handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
The error makes perfect sense. The first call to try_increment(handle) gives ownership of handle away and so the second call is illegal. As is often the case, the Rust compiler yields a sensible error message:
|
24 | try_increment(handle);
| ------ value moved here
25 | try_increment(handle);
| ^^^^^^ value used here after move
|
In an attempt to solve this, I thought it would make sense to pass handle by reference. It should be an immutable reference, mind, because I don't want try_increment to be able to change handle itself (assigning None to it, for example) only to be able to call mutations on its value.
My problem is that I couldn't figure out how to do this.
Here is the closest working version that I could get:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
// PROBLEM: this line is allowed!
// (*handle) = None;
if let Some(ref mut t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
This code runs, as expected, but the Option is now passed about by mutable reference and that is not what I want:
I'm allowed to mutate the Option by reassigning None to it, breaking all following mutations. (Uncomment line 12 ((*handle) = None;) for example.)
It's messy: There are a whole lot of extraneous &mut's lying about.
It's doubly messy: Heaven only knows why I must use ref mut in the if let statement while the convention is to use &mut everywhere else.
It defeats the purpose of having the complicated borrow-checking and mutability checking rules in the compiler.
Is there any way to actually achieve what I want: passing an immutable Option around, by reference, and actually being able to use its contents?
You can't extract a mutable reference from an immutable one, even a reference to its internals. That's kind of the point! Multiple aliases of immutable references are allowed so, if Rust allowed you to do that, you could have a situation where two pieces of code are able to mutate the same data at the same time.
Rust provides several escape hatches for interior mutability, for example the RefCell:
use std::cell::RefCell;
fn try_increment(handle: &Option<RefCell<Thing>>) -> bool {
if let Some(t) = handle {
t.borrow_mut().increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(handle: Option<RefCell<Thing>>) {
try_increment(&handle);
try_increment(&handle);
}
fn main() {
let mut thing = RefCell::new(Thing(0));
try_increment_twice(Some(thing));
try_increment_twice(None);
}
TL;DR: The answer is No, I can't.
After the discussions with #Peter Hall and #Stargateur, I have come to understand why I need to use &mut Option<&mut Thing> everywhere. RefCell<> would also be a feasible work-around but it is no neater and does not really achieve the pattern I was originally seeking to implement.
The problem is this: if one were allowed to mutate the object for which one has only an immutable reference to an Option<&mut T> one could use this power to break the borrowing rules entirely. Concretely, you could, essentially, have many mutable references to the same object because you could have many such immutable references.
I knew there was only one mutable reference to the Thing (owned by the Option<>) but, as soon as I started taking references to the Option<>, the compiler no longer knew that there weren't many of those.
The best version of the pattern is as follows:
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
if let Some(ref mut t) = handle {
t.increment_self();
true
}
else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
Notes:
The Option<> holds the only extant mutable reference to the Thing
try_increment_twice() takes ownership of the Option<>
try_increment() must take the Option<> as &mut so that the compiler knows that it has the only mutable reference to the Option<>, during the call
If the compiler knows that try_increment() has the only mutable reference to the Option<> which holds the unique mutable reference to the Thing, the compiler knows that the borrow rules have not been violated.
Another Experiment
The problem of the mutability of Option<> remains because one can call take() et al. on a mutable Option<>, breaking everything following.
To implement the pattern that I wanted, I need something that is like an Option<> but, even if it is mutable, it cannot be mutated. Something like this:
struct Handle<'a> {
value: Option<&'a mut Thing>,
}
impl<'a> Handle<'a> {
fn new(value: &'a mut Thing) -> Self {
Self {
value: Some(value),
}
}
fn empty() -> Self {
Self {
value: None,
}
}
fn try_mutate<T, F: Fn(&mut Thing) -> T>(&mut self, mutation: F) -> Option<T> {
if let Some(ref mut v) = self.value {
Some(mutation(v))
}
else {
None
}
}
}
Now, I thought, I can pass around &mut Handle's all day long and know that someone who has a Handle can only mutate its contents, not the handle itself. (See Playground)
Unfortunately, even this gains nothing because, if you have a mutable reference, you can always reassign it with the dereferencing operator:
fn try_increment(handle: &mut Handle) -> bool {
if let Some(_) = handle.try_mutate(|t| { t.increment_self() }) {
// This breaks future calls:
(*handle) = Handle::empty();
true
}
else {
println!("warning: increment failed");
false
}
}
Which is all fine and well.
Bottom line conclusion: just use &mut Option<&mut T>

Discerning lifetimes understanding the move keyword

I've been playing around with AudioUnit via Rust and the Rust library coreaudio-rs. Their example seems to work well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
use std::f32::consts::PI;
struct Iter {
value: f32,
}
impl Iterator for Iter {
type Item = [f32; 2];
fn next(&mut self) -> Option<[f32; 2]> {
self.value += 440.0 / 44_100.0;
let amp = (self.value * PI * 2.0).sin() as f32 * 0.15;
Some([amp, amp])
}
}
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
// 440hz sine wave generator.
let mut samples = Iter { value: 0.0 };
//let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
//let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
However, changing it up a little bit to load via a buffer doesn't work as well:
extern crate coreaudio;
use coreaudio::audio_unit::{AudioUnit, IOType};
use coreaudio::audio_unit::render_callback::{self, data};
fn main() {
run().unwrap()
}
fn run() -> Result<(), coreaudio::Error> {
let buf: Vec<[f32; 2]> = vec![[0.0, 0.0]];
let mut samples = buf.iter();
// Construct an Output audio unit that delivers audio to the default output device.
let mut audio_unit = try!(AudioUnit::new(IOType::DefaultOutput));
// Q: What is this type?
let callback = move |args| {
let Args { num_frames, mut data, .. } = args;
for i in 0..num_frames {
let sample = samples.next().unwrap();
for (channel_idx, channel) in data.channels_mut().enumerate() {
channel[i] = sample[channel_idx];
}
}
Ok(())
};
type Args = render_callback::Args<data::NonInterleaved<f32>>;
try!(audio_unit.set_render_callback(callback));
try!(audio_unit.start());
std::thread::sleep(std::time::Duration::from_millis(30000));
Ok(())
}
It says, correctly so, that buf only lives until the end of run and does not live long enough for the audio unit—which makes sense, because "borrowed value must be valid for the static lifetime...".
In any case, that doesn't bother me; I can modify the iterator to load and read from the buffer just fine. However, it does raise some questions:
Why does the Iter { value: 0.0 } have the 'static lifetime?
If it doesn't have the 'static lifetime, why does it say the borrowed value must be valid for the 'static lifetime?
If it does have the 'static lifetime, why? It seems like it would be on the heap and closed on by callback.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes. Why can't it move the buffer? Do I have to move both the buffer and the iterator into the closure? How would I do that?
Over all this, how do I figure out the expected lifetime without trying to be a compiler myself? It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.
Why does the Iter { value: 0.0 } have the 'static lifetime?
It doesn't; only references have lifetimes.
why does it say the borrowed value must be valid for the 'static lifetime
how do I figure out the expected lifetime without trying to be a compiler myself
Read the documentation; it tells you the restriction:
fn set_render_callback<F, D>(&mut self, f: F) -> Result<(), Error>
where
F: FnMut(Args<D>) -> Result<(), ()> + 'static, // <====
D: Data
This restriction means that any references inside of F must live at least as long as the 'static lifetime. Having no references is also acceptable.
All type and lifetime restrictions are expressed at the function boundary — this is a hard rule of Rust.
I understand that the move keyword allows moving inside the closure, which doesn't help me understand why it interacts with lifetimes.
The only thing that the move keyword does is force every variable directly used in the closure to be moved into the closure. Otherwise, the compiler tries to be conservative and move in references/mutable references/values based on the usage inside the closure.
Why can't it move the buffer?
The variable buf is never used inside the closure.
Do I have to move both the buffer and the iterator into the closure? How would I do that?
By creating the iterator inside the closure. Now buf is used inside the closure and will be moved:
let callback = move |args| {
let mut samples = buf.iter();
// ...
}
It doesn't seem like guessing and compiling is always a straightforward method to resolving these issues.
Sometimes it is, and sometimes you have to think about why you believe the code to be correct and why the compiler states it isn't and come to an understanding.

Why is a Cell used to create unmovable objects?

So I ran into this code snippet showing how to create "unmoveable" types in Rust - moves are prevented because the compiler treats the object as borrowed for its whole lifetime.
use std::cell::Cell;
use std::marker;
struct Unmovable<'a> {
lock: Cell<marker::ContravariantLifetime<'a>>,
marker: marker::NoCopy
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: Cell::new(marker::ContravariantLifetime),
marker: marker::NoCopy
}
}
fn lock(&'a self) {
self.lock.set(marker::ContravariantLifetime);
}
fn new_in(self_: &'a mut Option<Unmovable<'a>>) {
*self_ = Some(Unmovable::new());
self_.as_ref().unwrap().lock();
}
}
fn main(){
let x = Unmovable::new();
x.lock();
// error: cannot move out of `x` because it is borrowed
// let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// error: cannot move out of `y` because it is borrowed
// let z = y;
assert_eq!(std::mem::size_of::<Unmovable>(), 0)
}
I don't yet understand how this works. My guess is that the lifetime of the borrow-pointer argument is forced to match the lifetime of the lock field. The weird thing is, this code continues working in the same way if:
I change ContravariantLifetime<'a> to CovariantLifetime<'a>, or to InvariantLifetime<'a>.
I remove the body of the lock method.
But, if I remove the Cell, and just use lock: marker::ContravariantLifetime<'a> directly, as so:
use std::marker;
struct Unmovable<'a> {
lock: marker::ContravariantLifetime<'a>,
marker: marker::NoCopy
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: marker::ContravariantLifetime,
marker: marker::NoCopy
}
}
fn lock(&'a self) {
}
fn new_in(self_: &'a mut Option<Unmovable<'a>>) {
*self_ = Some(Unmovable::new());
self_.as_ref().unwrap().lock();
}
}
fn main(){
let x = Unmovable::new();
x.lock();
// does not error?
let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// does not error?
let z = y;
assert_eq!(std::mem::size_of::<Unmovable>(), 0)
}
Then the "Unmoveable" object is allowed to move. Why would that be?
The true answer is comprised of a moderately complex consideration of lifetime variancy, with a couple of misleading aspects of the code that need to be sorted out.
For the code below, 'a is an arbitrary lifetime, 'small is an arbitrary lifetime that is smaller than 'a (this can be expressed by the constraint 'a: 'small), and 'static is used as the most common example of a lifetime that is larger than 'a.
Here are the facts and steps to follow in the consideration:
Normally, lifetimes are contravariant; &'a T is contravariant with regards to 'a (as is T<'a> in the absence of any variancy markers), meaning that if you have a &'a T, it’s OK to substitute a longer lifetime than 'a, e.g. you can store in such a place a &'static T and treat it as though it were a &'a T (you’re allowed to shorten the lifetime).
In a few places, lifetimes can be invariant; the most common example is &'a mut T which is invariant with regards to 'a, meaning that if you have a &'a mut T, you cannot store a &'small mut T in it (the borrow doesn’t live long enough), but you also cannot store a &'static mut T in it, because that would cause trouble for the reference being stored as it would be forgotten that it actually lived for longer, and so you could end up with multiple simultaneous mutable references being created.
A Cell contains an UnsafeCell; what isn’t so obvious is that UnsafeCell is magic, being wired to the compiler for special treatment as the language item named “unsafe”. Importantly, UnsafeCell<T> is invariant with regards to T, for similar sorts of reasons to the invariance of &'a mut T with regards to 'a.
Thus, Cell<any lifetime variancy marker> will actually behave the same as Cell<InvariantLifetime<'a>>.
Furthermore, you don’t actually need to use Cell any more; you can just use InvariantLifetime<'a>.
Returning to the example with the Cell wrapping removed and a ContravariantLifetime (actually equivalent to just defining struct Unmovable<'a>;, for contravariance is the default as is no Copy implementation): why does it allow moving the value? … I must confess, I don’t grok this particular case yet and would appreciate some help myself in understanding why it’s allowed. It seems back to front, that covariance would allow the lock to be shortlived but that contravariance and invariance wouldn’t, but in practice it seems that only invariance is performing the desired function.
Anyway, here’s the final result. Cell<ContravariantLifetime<'a>> is changed to InvariantLifetime<'a> and that’s the only functional change, making the lock method function as desired, taking a borrow with an invariant lifetime. (Another solution would be to have lock take &'a mut self, for a mutable reference is, as already discussed, invariant; this is inferior, however, as it requires needless mutability.)
One other thing that needs mentioning: the contents of the lock and new_in methods are completely superfluous. The body of a function will never change the static behaviour of the compiler; only the signature matters. The fact that the lifetime parameter 'a is marked invariant is the key point. So the whole “construct an Unmovable object and call lock on it” part of new_in is completely superfluous. Similarly setting the contents of the cell in lock was a waste of time. (Note that it is again the invariance of 'a in Unmovable<'a> that makes new_in work, not the fact that it is a mutable reference.)
use std::marker;
struct Unmovable<'a> {
lock: marker::InvariantLifetime<'a>,
}
impl<'a> Unmovable<'a> {
fn new() -> Unmovable<'a> {
Unmovable {
lock: marker::InvariantLifetime,
}
}
fn lock(&'a self) { }
fn new_in(_: &'a mut Option<Unmovable<'a>>) { }
}
fn main() {
let x = Unmovable::new();
x.lock();
// This is an error, as desired:
let z = x;
let mut y = None;
Unmovable::new_in(&mut y);
// Yay, this is an error too!
let z = y;
}
An interesting problem! Here's my understanding of it...
Here's another example that doesn't use Cell:
#![feature(core)]
use std::marker::InvariantLifetime;
struct Unmovable<'a> { //'
lock: Option<InvariantLifetime<'a>>, //'
}
impl<'a> Unmovable<'a> {
fn lock_it(&'a mut self) { //'
self.lock = Some(InvariantLifetime)
}
}
fn main() {
let mut u = Unmovable { lock: None };
u.lock_it();
let v = u;
}
(Playpen)
The important trick here is that the structure needs to borrow itself. Once we have done that, it can no longer be moved because any move would invalidate the borrow. This isn't conceptually different from any other kind of borrow:
struct A(u32);
fn main() {
let a = A(42);
let b = &a;
let c = a;
}
The only thing is that you need some way of letting the struct contain its own reference, which isn't possible to do at construction time. My example uses Option, which requires &mut self and the linked example uses Cell, which allows for interior mutability and just &self.
Both examples use a lifetime marker because it allows the typesystem to track the lifetime without needing to worry about a particular instance.
Let's look at your constructor:
fn new() -> Unmovable<'a> { //'
Unmovable {
lock: marker::ContravariantLifetime,
marker: marker::NoCopy
}
}
Here, the lifetime put into lock is chosen by the caller, and it ends up being the normal lifetime of the Unmovable struct. There's no borrow of self.
Let's next look at your lock method:
fn lock(&'a self) {
}
Here, the compiler knows that the lifetime won't change. However, if we make it mutable:
fn lock(&'a mut self) {
}
Bam! It's locked again. This is because the compiler knows that the internal fields could change. We can actually apply this to our Option variant and remove the body of lock_it!

Resources