How can I write my own swap function in Rust?

std::mem::swap has the signature:
pub fn swap<T>(x: &mut T, y: &mut T)
If I try to implement it myself:
pub fn swap<T>(a: &mut T, b: &mut T) {
let t = a;
a = b;
b = t;
}
I get an error about the lifetimes of the two parameters:
error[E0623]: lifetime mismatch
--> src/lib.rs:4:9
|
1 | pub fn swap<T>(a: &mut T, b: &mut T) {
| ------ ------
| |
| these two types are declared with different lifetimes...
...
4 | b = t;
| ^ ...but data from `a` flows into `b` here
error[E0623]: lifetime mismatch
--> src/lib.rs:3:9
|
1 | pub fn swap<T>(a: &mut T, b: &mut T) {
| ------ ------ these two types are declared with different lifetimes...
2 | let t = a;
3 | a = b;
| ^ ...but data from `b` flows into `a` here
If I change the signature to:
pub fn swap_lt<'t, T>(mut a: &'t T, mut b: &'t T)
It compiles, but I get a warning which seems to mean that we're just swapping temporary copies:
warning: value assigned to `a` is never read
--> src/lib.rs:3:5
|
3 | a = b;
| ^
|
= note: `#[warn(unused_assignments)]` on by default
= help: maybe it is overwritten before being read?
warning: value assigned to `b` is never read
--> src/lib.rs:4:5
|
4 | b = t;
| ^
|
= help: maybe it is overwritten before being read?

Your code is not operating on temporary copies. It just swaps the references that were passed in, which does not have any effect on the values they are pointing to. This also explains why the compiler wants the lifetimes to match – reference x is pointing to the value reference y pointed to before and vice versa, which is only possible if the two references have the same lifetime.
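To make this concrete, here is a minimal sketch. The name swap_refs and the returned tuple are illustrative additions (not part of the question) so that the effect on the local references can be observed from the outside:

```rust
// Hypothetical variant of `swap_lt` that returns its locals so the effect is visible.
fn swap_refs<'t, T>(mut a: &'t T, mut b: &'t T) -> (&'t T, &'t T) {
    let t = a;
    a = b; // rebinds the local reference only
    b = t;
    (a, b)
}

fn demo() -> (i32, i32, i32, i32) {
    let x = 1;
    let y = 2;
    let (a, b) = swap_refs(&x, &y);
    // The two local references were exchanged, but x and y themselves are untouched.
    (*a, *b, x, y)
}
```

The caller's x and y keep their values; only the references inside the function changed places.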
When swapping the actual values, a different problem occurs. You first need to move one of the values to a temporary variable. However, since T is not necessarily Copy, you can't move a value out from behind a reference, since this would leave the reference pointing to a moved-out value, which is not allowed in Rust. If you allow T: Default, you could replace the value with its default temporarily. However, if you want to implement the function for the general case, you need to resort to unsafe code. One way of doing so is using the std::ptr::read() and std::ptr::write() functions to read and write data through raw pointers:
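use std::ptr::{read, write};

fn swap<T>(x: &mut T, y: &mut T) {
    unsafe {
        let z = read(x);   // bitwise copy of *x; the old *x must not be dropped
        write(x, read(y)); // overwrite *x without dropping its old contents
        write(y, z);       // overwrite *y without dropping; z is moved in
    }
}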
This code is trickier than it looks. The read() function returns a bitwise copy of the value without invalidating the original, so we end up with the same non-Copy value being present in two places. We need to take care that no value is dropped while a second copy of it is still in use, which happens implicitly in many cases. For example, this implementation is wrong, since the assignment implicitly drops the value x is initially pointing to, while z still holds a copy of that value and later writes it into y:
fn swap<T>(x: &mut T, y: &mut T) {
unsafe {
let z = read(x);
*x = read(y); // Wrong – drops the original value x is pointing to
write(y, z);
}
}
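One way to check that a swap like the correct one above neither drops nor duplicates anything is to swap values of a type that counts its drops. This is only a sketch; the Noisy type and the demo function are made up for illustration:

```rust
use std::cell::Cell;
use std::ptr;
use std::rc::Rc;

// Hypothetical helper type: bumps a shared counter whenever a value is dropped.
struct Noisy {
    id: u32,
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn swap<T>(x: &mut T, y: &mut T) {
    unsafe {
        let z = ptr::read(x);        // bitwise copy of *x
        ptr::write(x, ptr::read(y)); // overwrite *x without dropping the old value
        ptr::write(y, z);            // overwrite *y without dropping; z is moved in
    }
}

fn demo() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let mut a = Noisy { id: 1, drops: Rc::clone(&drops) };
    let mut b = Noisy { id: 2, drops: Rc::clone(&drops) };
    swap(&mut a, &mut b);
    // Read the counter before a and b go out of scope.
    (a.id, b.id, drops.get())
}
```

The counter is still zero right after the swap, which is what we want: both values are alive and each will be dropped exactly once when it goes out of scope.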
The actual implementation of swap() in the standard library uses a few optimizations:
It makes use of the std::ptr::copy_nonoverlapping() function instead of write(x, read(y)), which is implemented as a compiler intrinsic. The Rust compiler delegates this to LLVM to make sure the generated code is as efficient as possible for the target platform. Our code actually uses temporary storage for both x and y. Using copy_nonoverlapping(), temporary storage is only needed for one of the variables.
Values of 32 bytes or larger are swapped in blocks, so only 32 bytes of temporary storage are needed.

If you, for the sake of an exercise, don't want to use core::mem::swap or say core::ptr::swap, you could implement it as such:
pub fn swap<T>(a: &mut T, b: &mut T) {
unsafe {
let t = core::ptr::read(a);
core::ptr::copy_nonoverlapping(b, a, 1);
core::ptr::write(b, t);
}
}
Doing it using strictly safe code is not possible without having something like T: Default.

Other answers have covered unsafe implementations of swap(). A safe implementation is possible as well, but it requires additional constraints on T. For example:
pub fn swap<T: Default>(x: &mut T, y: &mut T) {
let t = std::mem::take(x);
*x = std::mem::take(y);
*y = t;
}
Here T: Default is required by std::mem::take(), which moves the value out of an &mut T reference, and leaves T::default() as replacement. A replacement is needed because the value behind the reference can and will be used again, so it must be in a valid state. For example, to move the value out of *x, we need to leave a well-defined value in *x because we will assign to *x in the subsequent line. The assignment, unaware of the previous operation, expects a valid value on the left-hand side, in order to destroy it. Leaving the old value untouched in *x would result in use-after-free and ultimately a double-free.
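As a small illustration of what std::mem::take() leaves behind, here is a sketch using String (the demo function is just for illustration; the swap is the one from above, repeated so the snippet is self-contained):

```rust
pub fn swap<T: Default>(x: &mut T, y: &mut T) {
    let t = std::mem::take(x);
    *x = std::mem::take(y);
    *y = t;
}

fn demo() -> (String, String, String, String) {
    let mut s = String::from("hello");
    // take() moves the value out and leaves String::default(), i.e. an empty string.
    let taken = std::mem::take(&mut s);

    let mut a = String::from("left");
    let mut b = String::from("right");
    swap(&mut a, &mut b);
    (taken, s, a, b)
}
```

After the call, s is a perfectly valid (empty) String, which is why assigning to it or dropping it later is fine.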
Another option is to require Clone:
pub fn swap<T: Clone>(x: &mut T, y: &mut T) {
let t = x.clone();
*x = y.clone();
*y = t;
}
For standard library containers this variant will be less efficient because T::clone() will perform a deep copy of the container, whereas T::default() will create an empty container without performing an allocation.
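To get a feeling for the cost, one can count the clone() calls. The Counted type below is hypothetical and only exists to make the two clones per swap visible:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type that counts how often it is cloned.
struct Counted {
    value: i32,
    clones: Rc<Cell<u32>>,
}

impl Clone for Counted {
    fn clone(&self) -> Self {
        self.clones.set(self.clones.get() + 1);
        Counted { value: self.value, clones: Rc::clone(&self.clones) }
    }
}

// The Clone-based swap from above, repeated so the sketch is self-contained.
pub fn swap<T: Clone>(x: &mut T, y: &mut T) {
    let t = x.clone();
    *x = y.clone();
    *y = t;
}

fn demo() -> (i32, i32, u32) {
    let clones = Rc::new(Cell::new(0));
    let mut a = Counted { value: 1, clones: Rc::clone(&clones) };
    let mut b = Counted { value: 2, clones: Rc::clone(&clones) };
    swap(&mut a, &mut b);
    (a.value, b.value, clones.get())
}
```

Each swap performs two clones; for a container type each of them would be a deep copy.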
Implementing swap() without additional constraint on T requires unsafe code, as shown in other answers.

Related

Is this the idiomatic way to make self-referential structures?

I am interested in knowing the idiomatic/canonical way of making self-referential structures in Rust. The related question Why can't I store a value and a reference to that value in the same struct explains the problem, but try as I might, I couldn't figure out the answer in the existing question (although there were some useful hints).
I have come up with a solution, but I am unsure of how safe it is, or if it is the idiomatic way to solve this problem; if it isn't, I would very much like to know what the usual solution is.
I have an existing structure in my program that holds a reference to a sequence. Sequences hold information about chromosomes so they can be rather long, and copying them isn't a viable idea.
// My real Foo is more complicated than this and is an existing
// type I'd rather not have to rewrite if I can avoid it...
struct Foo<'a> {
x: &'a [usize],
// more here...
}
impl<'a> Foo<'a> {
pub fn new(x: &'a [usize]) -> Self {
Foo {
x, /* more here... */
}
}
}
I now need a new structure that reduces the sequence to something smaller and then builds a Foo structure over the reduced string, and since someone has to own both reduced string and Foo object, I would like to put both in a structure.
// My real Bar is slightly more complicated, but it boils down to having
// a vector it owns and a Foo over that vector.
struct Bar<'a> {
x: Vec<usize>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let x = vec![1, 2, 3];
let y = Foo::new(&x);
Bar { x, y }
}
}
Now, this doesn't work because the Foo object in a Bar refers into the Bar object, and if the Bar object moves, the reference will point into memory that is no longer occupied by the Bar object.
To avoid this problem, the x element in Bar must sit on the heap and not move around. (I think the data in a Vec already sits happily on the heap, but that doesn't seem to help me here).
A pinned box should do the trick, I believe.
struct Bar<'a> {
x: Pin<Box<Vec<usize>>>,
y: Foo<'a>,
}
Now the vector sits behind the box, and when I move the structure, the references still point to the same memory.
However, moving x to the heap isn't enough for the type-checker. It still thinks that moving the pinned box will move what it points to.
If I implement Bar's constructor like this:
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::pin(v);
let y = Foo::new(&x);
Bar { x, y }
}
}
I get the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(&x);
| -- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(&x);
| -- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
Even though the object I take a reference of sits on the heap, and doesn't move, the checker still sees me borrowing from an object that moves, and that, of course, is a no-no.
Here, you might stop and notice that I am trying to make two pointers to the same object, so Rc or Arc is an obvious solution. And it is, but I would have to change the implementation of Foo to have an Rc member instead of a reference. While I do have control of the source code for Foo, and I could update it and all the code that uses it, I am reluctant to make such a major change if I can avoid it. And I could have been in a situation where I am not in control of the Foo, so I couldn't change its implementation, and I would love to know how I would solve that situation then.
The only solution I could get to work was to get a raw pointer to x, so the type-checker doesn't see that I borrow it, and then connect x and y through that.
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::new(v);
let (x, y) = unsafe {
let ptr: *mut Vec<usize> = Box::into_raw(x);
let w: &Vec<usize> = ptr.as_ref().unwrap();
(Pin::new(Box::from_raw(ptr)), Foo::new(&w))
};
Bar { x, y }
}
}
What I don't know is whether this is the right way to do it. It seems rather complicated, but perhaps it is the only way to make a structure like this in Rust, and some sort of unsafe is needed to trick the compiler? That is the first of my questions.
The second is whether this is safe to do. Of course it is unsafe in the technical sense, but am I risking creating a reference to memory that might not be valid later? It is my impression that Pin should guarantee that the object remains where it is supposed to sit, and that the lifetime of the Bar<'a> and Foo<'a> objects should ensure that the reference doesn't outlive the vector, but once I have gone unsafe, could that promise be broken?
Update
The owning_ref crate has functionality that looks like what I need. You can create owned objects that present their references as well.
There is an OwningRef type that wraps an object and a reference, and it would be wonderful if you could have the slice in that and getting the reference wasn't seen as borrowing from the object, but obviously that isn't the case. Code such as this
use owning_ref::OwningRef;
struct Bar<'a> {
x: OwningRef<Vec<usize>, [usize]>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = OwningRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
produces the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
error: could not compile `foo` due to 2 previous errors
The reason is the same as before: I borrow a reference to x and then I move it.
There are different wrapper objects in the crate, and in various combinations they will let me get close to a solution and then snatch it away from me, because what I borrow I still cannot move later, e.g.:
use owning_ref::{BoxRef, OwningRef};
struct Bar<'a> {
x: OwningRef<Box<Vec<usize>>, Vec<usize>>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let v = Box::new(v); // Vector on the heap
let x = BoxRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:23:9
|
22 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
23 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:23:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
22 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
23 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
I can get around this by going unsafe and working with a pointer, of course, but then I am back to the solution I had with Pin and pointer hacking. I strongly feel that there is a solution here (especially because having a Box<Vec<...>> and the corresponding Vec<...> isn't bringing much to the table, so there must be more to the crate), but what it is eludes me.
(I think the data in a Vec already sits happily on the heap, but that doesn't seem to help me here).
Indeed the data in a Vec does already sit on the heap, and the x: &'a [usize] in Foo is already a reference to that heap allocation; so your problem here is not (as shown in your graphics) that moving Bar would result in (the undefined behaviour of) a dangling reference.
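A quick way to convince yourself of this is to compare the buffer address before and after moving the Vec. This is just a sketch; the Holder type is made up for the demonstration:

```rust
// Hypothetical wrapper used only to move the Vec around.
struct Holder {
    data: Vec<usize>,
}

fn demo() -> bool {
    let v: Vec<usize> = vec![1, 2, 3];
    let before = v.as_ptr();         // address of the heap buffer
    let holder = Holder { data: v }; // moves the Vec (pointer, length, capacity)
    let boxed = Box::new(holder);    // moves it again, this time onto the heap
    let after = boxed.data.as_ptr();
    before == after                  // the buffer itself never moved
}
```

Moving the Vec only copies its pointer, length and capacity; the elements stay where they were allocated.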
However, what happens if the Vec were to outgrow its current allocation? It would reallocate and be moved from its present heap allocation to another—and this would result in a dangling reference. Hence the borrow checker must enforce that, so long as anything (e.g. a Foo) that borrows from the Vec exists, the Vec cannot be mutated. Yet here we already have an expressivity problem: the Rust language has no way to annotate Bar to indicate this relationship.
Your proposed unsafe solution uses <*mut _>::as_ref, whose safety documentation includes the following requirement (emphasis added):
You must enforce Rust’s aliasing rules, since the returned lifetime 'a is arbitrarily chosen and does not necessarily reflect the actual lifetime of the data. In particular, for the duration of this lifetime, the memory the pointer points to must not get mutated (except inside UnsafeCell).
This is the key bit of the compiler's safety checks that you are trying to opt out of—but because accessing Bar now requires that one uphold this requirement, you do not have a completely safe abstraction. In my view, a raw pointer would be a tad safer here because it forces one to consider the safety of every access.
For example, one issue that immediately springs to mind is that x is declared before y in Bar and therefore, upon destruction, it will be dropped first: the Vec's heap allocation will be freed while Foo still holds references into it: undefined behaviour! Simply reordering the fields would avoid this particular problem, but there would have been no such problem with raw pointers (and any attempt to dereference them in Foo's drop handler would have forced one to consider whether they were still dereferenceable at that time).
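The drop order of struct fields can be observed with a small sketch like the following (the Logged and Pair types are hypothetical and only record the order in which they are dropped):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its name in a shared log when dropped.
struct Logged {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Logged {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Fields are declared in the same order as in Bar: x first, then y.
#[allow(dead_code)]
struct Pair {
    x: Logged,
    y: Logged,
}

fn demo() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let pair = Pair {
        x: Logged { name: "x", log: Rc::clone(&log) },
        y: Logged { name: "y", log: Rc::clone(&log) },
    };
    drop(pair);
    let order = log.borrow().clone();
    order
}
```

Fields are dropped in declaration order, so x goes first, which is exactly the situation described above.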
Personally, I would try to avoid self-referencing here and probably use an arena.
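One way to avoid the self-reference (not an arena, just the simplest alternative I can think of) is to let Bar own only the data and build the Foo view on demand. This is a sketch under the assumption that constructing a Foo is cheap; the foo() method is a name I made up:

```rust
struct Foo<'a> {
    x: &'a [usize],
}

impl<'a> Foo<'a> {
    pub fn new(x: &'a [usize]) -> Self {
        Foo { x }
    }
}

// Bar owns only the data: no lifetime parameter and no self-reference.
struct Bar {
    x: Vec<usize>,
}

impl Bar {
    pub fn new() -> Self {
        Bar { x: vec![1, 2, 3] }
    }

    // Hypothetical accessor: builds a Foo that borrows from self while it is in use.
    pub fn foo(&self) -> Foo<'_> {
        Foo::new(&self.x)
    }
}

fn demo() -> usize {
    let bar = Bar::new();
    let moved = bar; // moving Bar is fine, nothing borrows from it yet
    let foo = moved.foo();
    foo.x.iter().sum()
}
```

If building Foo is expensive, this does not help, and something like an arena or a self-referencing helper crate is the more realistic option.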
I think I have finally grokked ouroboros and that is an elegant solution.
You use a macro, self_referencing when defining a structure, and inside the structure you can specify that one entry borrows others. For my application, I got it to work like this:
use ouroboros::self_referencing;
#[self_referencing]
struct _Bar {
x: Vec<usize>,
#[borrows(x)]
#[covariant]
y: Foo<'this>,
}
struct Bar(pub _Bar);
The y element references x, so I specify that. I'm not sure why co-/contravariance is needed in this particular case where there is only one lifetime, but it specifies whether other references should live longer or can live shorter than the object. I've defined the struct as _Bar and then wrapped it in Bar. This is because the macro will create a new method, and I don't want the default one. At the same time I want to call my constructor new to stick with tradition. So I wrap the type and write my own constructor:
impl Bar {
pub fn new() -> Self {
let x: Vec<usize> = vec![1, 2, 3];
let _bar = _BarBuilder {
x,
y_builder: |x: &Vec<usize>| Foo::new(&x),
}
.build();
Bar(_bar)
}
}
I don't use the generated _Bar::new but a generated _BarBuilder object where I can specify how to get the y value from the x reference.
I have also written accessors to get the two values. There isn't anything special here.
impl Bar {
pub fn x(&self) -> &Vec<usize> {
self.0.borrow_x()
}
pub fn y(&self) -> &Foo {
self.0.borrow_y()
}
}
and with that my trivial little test case runs...
fn main() {
let bar = Bar::new();
let vec = bar.x();
for &i in vec {
println!("i == {}", i);
}
let vec = bar.y().x;
for &i in vec {
println!("i == {}", i);
}
}
This is probably the best solution so far, assuming that there are no hidden costs that I am currently unaware of.

Why does moving a disjoint field capture into a closure differ when the type is a value vs a reference?

As explained in Why is the move keyword needed when returning a closure which captures a Copy type? and How to copy instead of borrow an i64 into a closure in Rust?, if a closure captures a type that implements Copy, we need to use the move keyword to eagerly copy the value:
fn use_move_to_copy_into_closure() -> impl FnOnce(i32) -> bool {
let captured = 0;
move |value| value > captured
}
As of Rust 2021, disjoint capture in closures means that only the specific fields used in a closure are captured by the closure:
struct Wrapper(i32);
fn edition_2021_only_captures_specific_fields(captured: Wrapper) -> impl FnOnce(i32) -> bool {
let ret = move |value| value > captured.0;
drop(captured); // fails in 2015, 2018, succeeds in 2021
ret
}
If I capture a Copy field belonging to a reference, however, the field is not copied:
struct Wrapper(i32);
fn capturing_a_field_of_a_reference(captured: &Wrapper) -> impl FnOnce(i32) -> bool {
move |value| value > captured.0
}
error[E0700]: hidden type for `impl Trait` captures lifetime that does not appear in bounds
--> src/lib.rs:15:60
|
15 | fn capturing_a_field_of_a_reference(captured: &Wrapper) -> impl FnOnce(i32) -> bool {
| -------- ^^^^^^^^^^^^^^^^^^^^^^^^
| |
| hidden type `[closure#src/lib.rs:16:5: 16:36]` captures the anonymous lifetime defined here
|
help: to declare that the `impl Trait` captures `'_`, you can add an explicit `'_` lifetime bound
|
15 | fn capturing_a_field_of_a_reference(captured: &Wrapper) -> impl FnOnce(i32) -> bool + '_ {
| ++++
I expected that the field .0 would be copied in, just like in the unwrapped i32 or the owned Wrapper cases. What causes this difference?
move captures the environment by value, but things in the environment that are references remain references -- the references are captured by value.
Put another way, move closures only try to move values, because otherwise there wouldn't be a way to simultaneously capture some things by value and some things by reference. For example, it's a common pattern to do this when dealing with threads in e.g. crossbeam. Assume these structs:
#[derive(Clone)]
struct Foo;
impl Foo {
pub fn baz(&mut self, _: i32) {}
}
struct Bar(pub i32);
And this snippet:
let foo = Foo;
let bar = Bar(0);
{
let mut foo = foo.clone();
let bar = &bar;
s.spawn(move |_| {
foo.baz(bar.0);
});
}
Here the closure takes ownership of the clone of foo but references bar.0. This is true even if the type of bar.0 is Copy.
If it didn't work this way, there would be no way to express that the closure should own the foo clone, but borrow the copyable value bar.0.
Remember that implementing Copy on a type only means that an attempt to move a value of that type will instead copy. Since captured is a reference in your third example, capturing captured.0 doesn't attempt to move the i32 like it does in your second example where you have an owned value, and if a move isn't attempted then no copy can happen.
It's not a question of whether the copy occurs, but when it occurs. The copy only happens when the closure is called, not when it's defined (how could it be -- a captured hasn't been provided yet). Since the closure outlives the function call, it needs to know that its reference to captured will be valid when it's actually called precisely so that it has something to copy. Therefore it needs to be annotated with a lifetime tying it to the lifetime of captured. Thanks to lifetime inference, it's sufficient to simply add '_ to the impl instead of needing to explicitly write fn capturing_a_field_of_a_reference<'a>(captured: &'a Wrapper) -> impl 'a + FnOnce(i32) -> bool.
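If what you actually want is for the closure to own the i32, you can copy the field into a local before creating the closure, so that the closure never captures the reference. A minimal sketch (copy_field_then_capture and threshold are names I made up):

```rust
struct Wrapper(i32);

fn copy_field_then_capture(captured: &Wrapper) -> impl FnOnce(i32) -> bool {
    // The copy happens here, while the reference is certainly valid.
    let threshold = captured.0;
    // The closure captures only the local i32, not the reference.
    move |value| value > threshold
}

fn demo() -> (bool, bool) {
    let w = Wrapper(10);
    let above = copy_field_then_capture(&w);
    let below = copy_field_then_capture(&w);
    (above(11), below(5))
}
```

Since the closure body no longer mentions captured, there is no reference for it to hold on to.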

Cannot borrow `*self` as mutable more than once at a time; in combination with HashMap

This is my code:
use std::collections::HashMap;
struct Foo {
pub map : HashMap<i32, String>
}
impl Foo {
fn foo(&mut self, x: &String) -> i32 {
// I'm planning to use/modify "x" here and also modify "self"
42
}
fn bar(&mut self) -> i32 {
let x = self.map.get_mut(&1).unwrap();
self.foo(x)
}
}
I'm getting:
error[E0499]: cannot borrow `*self` as mutable more than once at a time
--> src/main.rs:13:9
|
12 | let x = self.map.get_mut(&1).unwrap();
| -------------------- first mutable borrow occurs here
13 | self.foo(x)
| ^^^^^^^^^-^
| | |
| | first borrow later used here
| second mutable borrow occurs here
What's going on?
Modifying self and x here breaks memory safety (at least in the general situation, which is what Rust must deal with). Consider the following implementation of foo which is allowed by your signature (fixing &String to &str):
fn foo(&mut self, x: &str) -> i32 {
self.map.clear();
println!("{}", x);
42
}
But you're calling this with x being a reference to something inside of self.map. So x could be destroyed by the time it's used. That's invalid, and Rust can't prove you won't do that, because you said you might. (Kevin Anderson provides a helpful comment below if you're coming from a GC language like C# where "reference" has a different meaning.)
How to fix this depends on what you're really trying to do, though one approach would be to clone the string so it cannot be destroyed:
fn bar(&mut self) -> i32 {
let x = self.map.get(&1).unwrap().clone(); // <== now you have a copy
self.foo(&x)
}
Note this got rid of the get_mut(). It's unclear what that was for. If you need an exclusive (mut) reference into the map, then you'll need to do that separately, and you can't do that directly while also holding an exclusive reference to self for the reasons above. Remember that mut means "exclusive access," not "mutable." A side effect of having exclusive access is that mutation is allowed.
If you really need something along these lines, you need to wrap your values (String) in Arc so that you can maintain reference counts and have shared ownership. But I would first try to redesign your algorithm to avoid this.
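As one possible redesign, you can take the value out of the map, work on it, and put it back afterwards, so that nothing points into self.map while foo runs. This is only a sketch: the &mut String parameter, the Option return type and the key 2 are assumptions of mine, not something from the question:

```rust
use std::collections::HashMap;

struct Foo {
    pub map: HashMap<i32, String>,
}

impl Foo {
    // Assumed signature: takes the string mutably so it can modify both x and self.
    fn foo(&mut self, x: &mut String) -> i32 {
        x.push('!');                   // modify x
        self.map.insert(2, x.clone()); // and modify self
        42
    }

    fn bar(&mut self) -> Option<i32> {
        // Take ownership of the value; the map is no longer borrowed afterwards.
        let mut x = self.map.remove(&1)?;
        let result = self.foo(&mut x);
        // Put the (possibly modified) value back.
        self.map.insert(1, x);
        Some(result)
    }
}

fn demo() -> (Option<i32>, String, String) {
    let mut foo = Foo { map: HashMap::new() };
    foo.map.insert(1, String::from("hi"));
    let r = foo.bar();
    (r, foo.map[&1].clone(), foo.map[&2].clone())
}
```

While foo runs, the entry for key 1 is temporarily missing from the map, so this only works if foo does not rely on it being there.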

Why can't I create a closure that produces mutable references to what it closes on? [duplicate]

I was playing around with Rust closures when I hit this interesting scenario:
fn main() {
let mut y = 10;
let f = || &mut y;
f();
}
This gives an error:
error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
--> src/main.rs:4:16
|
4 | let f = || &mut y;
| ^^^^^^
|
note: first, the lifetime cannot outlive the lifetime as defined on the body at 4:13...
--> src/main.rs:4:13
|
4 | let f = || &mut y;
| ^^^^^^^^^
note: ...so that closure can access `y`
--> src/main.rs:4:16
|
4 | let f = || &mut y;
| ^^^^^^
note: but, the lifetime must be valid for the call at 6:5...
--> src/main.rs:6:5
|
6 | f();
| ^^^
note: ...so type `&mut i32` of expression is valid during the expression
--> src/main.rs:6:5
|
6 | f();
| ^^^
Even though the compiler is trying to explain it line by line, I still haven't understood what exactly it is complaining about.
Is it trying to say that the mutable reference cannot outlive the enclosing closure?
The compiler does not complain if I remove the call f().
Short version
The closure f stores a mutable reference to y. If it were allowed to return a copy of this reference, you would end up with two simultaneous mutable references to y (one in the closure, one returned), which is forbidden by Rust's memory safety rules.
Long version
The closure can be thought of as
struct __Closure<'a> {
y: &'a mut i32,
}
Since it contains a mutable reference, the closure is called as FnMut, essentially with the definition
fn call_mut(&mut self, args: ()) -> &'a mut i32 { self.y }
Since we only have a mutable reference to the closure itself, we can't move the field y out, nor can we copy it, since mutable references aren't Copy.
We can trick the compiler into accepting the code by forcing the closure to be called as FnOnce instead of FnMut. This code works fine:
fn main() {
let x = String::new();
let mut y: u32 = 10;
let f = || {
drop(x);
&mut y
};
f();
}
Since we are consuming x inside the scope of the closure and x is not Copy, the compiler detects that the closure can only be FnOnce. Calling an FnOnce closure passes the closure itself by value, so we are allowed to move the mutable reference out.
Another more explicit way to force the closure to be FnOnce is to pass it to a generic function with a trait bound. This code works fine as well:
fn make_fn_once<'a, T, F: FnOnce() -> T>(f: F) -> F {
f
}
fn main() {
let mut y: u32 = 10;
let f = make_fn_once(|| {
&mut y
});
f();
}
There are two main things at play here:
Closures cannot return references to their environment
A mutable reference to a mutable reference can only use the lifetime of the outer reference (unlike with immutable references)
Closures returning references to environment
Closures cannot return any references with the lifetime of self (the closure object). Why is that? Every closure can be called as FnOnce, since that's the super-trait of FnMut which in turn is the super-trait of Fn. FnOnce has this method:
fn call_once(self, args: Args) -> Self::Output;
Note that self is passed by value. So since self is consumed (and now lives within the call_once function) we cannot return references to it -- that would be equivalent to returning references to a local function variable.
In theory, call_mut would allow returning references to self (since it receives &mut self). But since call_once, call_mut and call are all implemented with the same body, closures in general cannot return references to self (that is: to their captured environment).
Just to be sure: closures can capture references and return those! And they can capture by reference and return that reference. Those things are something different. It's just about what is stored in the closure type. If there is a reference stored within the type, it can be returned. But we can't return references to anything stored within the closure type.
Nested mutable references
Consider this function (note that the argument type implies 'inner: 'outer; 'outer being shorter than 'inner):
fn foo<'outer, 'inner>(x: &'outer mut &'inner mut i32) -> &'inner mut i32 {
*x
}
This won't compile. At first glance, it seems like it should compile, since we're just peeling off one layer of references. And it does work for immutable references! But mutable references are different here to preserve soundness.
It's OK to return &'outer mut i32, though. But it's impossible to get a direct reference with the longer (inner) lifetime.
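For comparison, here is a sketch of the two variants that do compile (the function names are made up): reborrowing through the outer mutable reference, and copying the inner reference out in the shared case.

```rust
// Reborrowing through the outer reference yields the shorter, outer lifetime.
fn reborrow_outer<'outer, 'inner>(x: &'outer mut &'inner mut i32) -> &'outer mut i32 {
    &mut **x
}

// With shared references the inner reference is Copy, so it can simply be copied out.
fn peel_shared<'outer, 'inner>(x: &'outer &'inner i32) -> &'inner i32 {
    *x
}

fn demo() -> (i32, i32) {
    let mut value = 1;
    let mut inner = &mut value;
    *reborrow_outer(&mut inner) += 10;
    let after = *inner;

    let peeled = {
        let shared = &value;
        // The result keeps the inner lifetime, so it may outlive `shared` itself.
        let p = peel_shared(&shared);
        p
    };
    (after, *peeled)
}
```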
Manually writing the closure
Let's try to hand code the closure you were trying to write:
let mut y = 10;
struct Foo<'a>(&'a mut i32);
impl<'a> Foo<'a> {
fn call<'s>(&'s mut self) -> &'??? mut i32 { self.0 }
}
let mut f = Foo(&mut y);
f.call();
What lifetime should the returned reference have?
It can't be 'a, because we basically have a &'s mut &'a mut i32. And as discussed above, in such a nested mutable reference situation, we can't extract the longer lifetime!
But it also can't be 's since that would mean the closure returns something with the lifetime of 'self ("borrowed from self"). And as discussed above, closures can't do that.
So the compiler can't generate the closure impls for us.
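Note that the second restriction comes from the closure traits (their output type cannot mention the lifetime of the borrow of self), not from the struct itself. For an ordinary method, using 's is perfectly fine, as in this sketch based on the hand-written Foo from above (demo is just for illustration):

```rust
struct Foo<'a>(&'a mut i32);

impl<'a> Foo<'a> {
    // Allowed for an inherent method: the result borrows from `self` for 's.
    fn call<'s>(&'s mut self) -> &'s mut i32 {
        &mut *self.0
    }
}

fn demo() -> i32 {
    let mut y = 10;
    let mut f = Foo(&mut y);
    *f.call() += 1;
    *f.call() += 1; // a second call is fine once the first borrow has ended
    y
}
```

So a hand-written type with a plain method can hand out the reference repeatedly; it just can't do so through the Fn/FnMut/FnOnce traits.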
Consider this code:
fn main() {
let mut y: u32 = 10;
let ry = &mut y;
let f = || ry;
f();
}
It works because the compiler is able to infer ry's lifetime: the reference ry lives in the same scope as y.
Now, the equivalent version of your code:
fn main() {
let mut y: u32 = 10;
let f = || {
let ry = &mut y;
ry
};
f();
}
Now the compiler assigns to ry a lifetime associated to the scope of the closure body, not to the lifetime associated with the main body.
Also note that the immutable reference case works:
fn main() {
let mut y: u32 = 10;
let f = || {
let ry = &y;
ry
};
f();
}
This is because &T has copy semantics and &mut T has move semantics, see Copy/move semantics documentation of &T/&mut T types itself for more details.
The missing piece
The compiler throws an error related to a lifetime:
cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
but as pointed out by Sven Marnach there is also a problem related to the error
cannot move out of borrowed content
But why doesn't the compiler throw this error?
The short answer is that the compiler first executes type checking and then borrow checking.
The long answer
A closure is made up of two pieces:
the state of the closure: a struct containing all the variables captured by the closure
the logic of the closure: an implementation of the FnOnce, FnMut or Fn trait
In this case the state of the closure is the mutable reference y and the logic is the body of the closure { &mut y } that simply returns a mutable reference.
When a reference is encountered, Rust checks two aspects:
the state: if the reference points to a valid memory slice, (i.e. the read-only part of lifetime validity);
the logic: if the memory slice is aliased, in other words if it is pointed from more than one reference simultaneously;
Note that moving out of borrowed content is forbidden in order to avoid memory aliasing.
The Rust compiler executes its job through several stages, here's a simplified workflow:
.rs input -> AST -> HIR -> HIR postprocessing -> MIR -> MIR postprocessing -> LLVM IR -> binary
The compiler reports a lifetime problem because it first executes the type checking phase in HIR postprocessing (which comprises lifetime analysis) and after that, if successful, executes borrow checking in the MIR postprocessing phase.

Why can I not return a mutable reference to an outer variable from a closure?

I was playing around with Rust closures when I hit this interesting scenario:
fn main() {
let mut y = 10;
let f = || &mut y;
f();
}
This gives an error:
error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
--> src/main.rs:4:16
|
4 | let f = || &mut y;
| ^^^^^^
|
note: first, the lifetime cannot outlive the lifetime as defined on the body at 4:13...
--> src/main.rs:4:13
|
4 | let f = || &mut y;
| ^^^^^^^^^
note: ...so that closure can access `y`
--> src/main.rs:4:16
|
4 | let f = || &mut y;
| ^^^^^^
note: but, the lifetime must be valid for the call at 6:5...
--> src/main.rs:6:5
|
6 | f();
| ^^^
note: ...so type `&mut i32` of expression is valid during the expression
--> src/main.rs:6:5
|
6 | f();
| ^^^
Even though the compiler is trying to explain it line by line, I still haven't understood what exactly it is complaining about.
Is it trying to say that the mutable reference cannot outlive the enclosing closure?
The compiler does not complain if I remove the call f().
Short version
The closure f stores a mutable reference to y. If it were allowed to return a copy of this reference, you would end up with two simultaneous mutable references to y (one in the closure, one returned), which is forbidden by Rust's memory safety rules.
Long version
The closure can be thought of as
struct __Closure<'a> {
y: &'a mut i32,
}
Since it contains a mutable reference, the closure is called as FnMut, essentially with the definition
fn call_mut(&mut self, args: ()) -> &'a mut i32 { self.y }
Since we only have a mutable reference to the closure itself, we can't move the field y out, nor can we copy it, since mutable references aren't Copy.
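To make the difference between the two calling conventions concrete, here is a minimal sketch of a hand-written stand-in for the closure type. The names ClosureState, call_once, call_mut_reborrow and demo are made up for illustration; they are plain inherent methods, not the Fn trait implementations the compiler generates.

```rust
// Hypothetical stand-in for the compiler-generated closure type.
struct ClosureState<'a> {
    y: &'a mut i32,
}

impl<'a> ClosureState<'a> {
    // FnOnce-style call: `self` is taken by value, so the stored
    // reference can be moved out with its full lifetime 'a.
    fn call_once(self) -> &'a mut i32 {
        self.y
    }

    // FnMut-style call: only `&mut self` is available, so the field cannot
    // be moved out. The best we can do is reborrow, and the result is tied
    // to the borrow of `self`, not to 'a.
    fn call_mut_reborrow(&mut self) -> &mut i32 {
        &mut *self.y
    }
}

fn demo() -> i32 {
    let mut y = 10;
    let mut c = ClosureState { y: &mut y };
    *c.call_mut_reborrow() += 1; // y is now 11
    let r = c.call_once(); // consumes `c`
    *r += 1; // y is now 12
    y
}
```

Calling demo() returns 12. Changing the return type of call_mut_reborrow to &'a mut i32 makes the compiler reject it, which is the situation the closure in the question runs into.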
We can trick the compiler into accepting the code by forcing the closure to be called as FnOnce instead of FnMut. This code works fine:
fn main() {
let x = String::new();
let mut y: u32 = 10;
let f = || {
drop(x);
&mut y
};
f();
}
Since we are consuming x inside the scope of the closure and x is not Copy, the compiler detects that the closure can only be FnOnce. Calling an FnOnce closure passes the closure itself by value, so we are allowed to move the mutable reference out.
Another more explicit way to force the closure to be FnOnce is to pass it to a generic function with a trait bound. This code works fine as well:
fn make_fn_once<T, F: FnOnce() -> T>(f: F) -> F {
f
}
fn main() {
let mut y: u32 = 10;
let f = make_fn_once(|| {
&mut y
});
f();
}
There are two main things at play here:
Closures cannot return references to their environment
A mutable reference to a mutable reference can only use the lifetime of the outer reference (unlike with immutable references)
Closures returning references to environment
Closures cannot return any references with the lifetime of self (the closure object). Why is that? Every closure can be called as FnOnce, since that's the super-trait of FnMut which in turn is the super-trait of Fn. FnOnce has this method:
fn call_once(self, args: Args) -> Self::Output;
Note that self is passed by value. So since self is consumed (and now lives within the call_once function) we cannot return references to it -- that would be equivalent to returning references to a local function variable.
In theory, call_mut could return references to self (since it receives &mut self). But since call_once, call_mut and call are all implemented with the same body, closures in general cannot return references to self (that is: to their captured environment).
To be clear: closures can capture references and return those! And they can capture by reference and return that reference. That is a different matter; what counts is what is stored in the closure type. If a reference is stored within the type, it can be returned. But we can't return references to anything stored within the closure type.
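As a small illustration of the allowed case, here is a sketch of a closure that stores a shared reference and hands it back. The function name first_via_closure is made up for this example.

```rust
fn first_via_closure(data: &[i32]) -> i32 {
    let r = data; // r: &[i32], a reference that already exists outside the closure
    // The closure stores a copy of `r` (shared references are Copy) and
    // returns it. The returned reference points at `data`, not at the closure.
    let get = move || r;
    let a = get();
    let b = get(); // the closure is Fn, so it can be called repeatedly
    a[0] + b[0]
}
```

For example, first_via_closure(&[4, 5]) returns 8.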
Nested mutable references
Consider this function (note that the argument type implies 'inner: 'outer; 'outer being shorter than 'inner):
fn foo<'outer, 'inner>(x: &'outer mut &'inner mut i32) -> &'inner mut i32 {
*x
}
This won't compile. At first glance, it seems like it should compile, since we're just peeling off one layer of references. And it does work for immutable references! But mutable references are different here, in order to preserve soundness.
It's OK to return &'outer mut i32, though. But it's impossible to get a direct reference with the longer (inner) lifetime.
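The two variants that do compile can be sketched like this. The function names peel_shared, peel_mut and demo are only illustrative.

```rust
// Shared references: the inner reference is Copy, so it can be copied out
// with the longer ('inner) lifetime.
fn peel_shared<'outer, 'inner>(x: &'outer &'inner i32) -> &'inner i32 {
    *x
}

// Mutable references: only a reborrow limited to the shorter ('outer)
// lifetime is possible.
fn peel_mut<'outer, 'inner>(x: &'outer mut &'inner mut i32) -> &'outer mut i32 {
    &mut **x
}

fn demo() -> (i32, i32) {
    let a = 1;
    let ra = &a;
    let shared = *peel_shared(&ra);

    let mut b = 2;
    let mut rb = &mut b;
    *peel_mut(&mut rb) += 40; // writes through both layers of references
    (shared, b)
}
```

Here demo() returns (1, 42).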
Manually writing the closure
Let's try to hand code the closure you were trying to write:
let mut y = 10;
struct Foo<'a>(&'a mut i32);
impl<'a> Foo<'a> {
fn call<'s>(&'s mut self) -> &'??? mut i32 { self.0 }
}
let mut f = Foo(&mut y);
f.call();
What lifetime should the returned reference have?
It can't be 'a, because we basically have a &'s mut &'a mut i32. And as discussed above, in such a nested mutable reference situation, we can't extract the longer lifetime!
But it also can't be 's since that would mean the closure returns something with the lifetime of 'self ("borrowed from self"). And as discussed above, closures can't do that.
So the compiler can't generate the closure impls for us.
Consider this code:
fn main() {
let mut y: u32 = 10;
let ry = &mut y;
let f = || ry;
f();
}
It works because the compiler is able to infer ry's lifetime: the reference ry lives in the same scope as y.
Now, the equivalent version of your code:
fn main() {
let mut y: u32 = 10;
let f = || {
let ry = &mut y;
ry
};
f();
}
Now the compiler assigns ry a lifetime associated with the scope of the closure body, not the lifetime associated with the body of main.
Also note that the immutable reference case works:
fn main() {
let mut y: u32 = 10;
let f = || {
let ry = &y;
ry
};
f();
}
This is because &T has copy semantics and &mut T has move semantics; see the documentation of the &T and &mut T types for more details.
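The difference between the two kinds of reference can be seen in isolation in a minimal sketch like the following (the function name demo is made up):

```rust
fn demo() -> (i32, i32) {
    let x = 5;
    let a = &x;
    let b = a; // copies the reference; `a` stays usable
    let sum = *a + *b;

    let mut y = 1;
    let m = &mut y;
    let n = m; // moves the reference; using `m` after this line would not compile
    *n += 1;
    (sum, y)
}
```

Here demo() returns (10, 2).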
The missing piece
The compiler throws an error related to a lifetime:
cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
but as pointed out by Sven Marnach there is also a problem related to the error
cannot move out of borrowed content
But why doesn't the compiler throw this error?
The short answer is that the compiler first executes type checking and then borrow checking.
The long answer
A closure is made up of two pieces:
the state of the closure: a struct containing all the variables captured by the closure
the logic of the closure: an implementation of the FnOnce, FnMut or Fn trait
In this case the state of the closure is the mutable reference y and the logic is the body of the closure { &mut y } that simply returns a mutable reference.
When a reference is encountered, Rust checks two aspects:
the state: whether the reference points to a valid memory slice (i.e. the read-only part of lifetime validity);
the logic: whether the memory slice is aliased, in other words whether more than one reference points to it simultaneously.
Note that moving out of borrowed content is forbidden in order to avoid memory aliasing.
The Rust compiler executes its job through several stages, here's a simplified workflow:
.rs input -> AST -> HIR -> HIR postprocessing -> MIR -> MIR postprocessing -> LLVM IR -> binary
The compiler reports a lifetime problem because it first executes the type checking phase in HIR postprocessing (which comprises lifetime analysis) and after that, if successful, executes borrow checking in the MIR postprocessing phase.

Resources