Ownership, closures, FnOnce: much confusion

Ownership, closures, FnOnce: much confusion - rust

I have the following snippet of code:
fn f<T: FnOnce() -> u32>(c: T) {
println!("Hello {}", c());
}
fn main() {
let mut x = 32;
let g = move || {
x = 33;
x
};
g(); // Error: cannot borrow as mutable. Doubt 1
f(g); // Instead, this would work. Doubt 2
println!("{}", x); // 32
}
Doubt 1
I can not run my closure even once.
Doubt 2
... but I can invoke that closure as many times as I want, provided that I call it through f. Funnily, if I declare it FnMut, I get the same error as in doubt 1.
Doubt 3
What does self refer to in Fn, FnMut and FnOnce traits definition? Is that the closure itself? Or the environment?
E.g. from the documentation:
pub trait FnMut<Args>: FnOnce<Args> {
extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}

Some basics about the Fn* trait family is needed to understand how closures actually work. You have the following traits:
FnOnce, which, as the name implies, can only be run once. If we look at the docs page we'll see that the trait definition is almost the same as what you specified in your question. What is most important though, is the following: The "call" function takes self, meaning that it consumes the object which implements FnOnce, so like any trait function which takes a self as a parameter, it takes ownership of the object.
FnMut, which allows for mutation of the captured variables, or in other words, it takes &mut self. What this means, is that when you make a move || {} closure, it will move any variables you reference which are outside the scope of the closure into the closure's object. The closure's object has a type which is unnameable, meaning that it is unique to each closure. This does force the user to take some kind of mutable version of the closure, so &mut impl FnMut() -> () or mut x: impl FnMut() -> ()
Fn, which is generally considered the most flexible. This allows the user to take an immutable version of the object implementing the trait. The function signature for this trait's "call" function is the simplest to understand of the three, as it only takes a reference to the closure, meaning that you don't need to worry about ownership while passing it around or calling it.
To address your individual doubts:
Doubt 1: As seen above, when you move something into a closure, the variable is now owned by the closure. Essentially, what the compiler generates is like the following pseudocode:
struct g_Impl {
x: usize
}
impl FnOnce() -> usize for g_Impl {
fn call_once(mut self) -> usize {
}
}
impl FnMut() -> usize for g_Impl {
fn call_mut(&mut self) -> usize {
//Here starts your actual code:
self.x = 33;
self.x
}
}
//No impl Fn() -> usize.
And by default it calls the FnMut() -> usize implementation.
Doubt 2: What is happening here is that closures are Copy as long as each of their captured variables are Copy, meaning that the closure that is generated will be copied into f, so that f ends up taking a Copy of it. When you change the definition for f to take an FnMut instead, you get the error because you are facing a similar situation to doubt 1: You are trying to call a function which receives &mut self while you've declared the parameter to be c: T instead of either mut c: T or c: &mut T, either of which qualify for &mut self in the eyes of FnMut.
Finally, doubt 3, the self parameter is the closure itself, which has captured or moved some variables into itself, so it now owns them.

You are dealing with two different kinds of closures here – FnOnce and FnMut. Both types of closures have different calling conventions.
If you define your closure as
let mut x = 32;
let g = move || {
x = 33;
x
};
the compiler will infer the type of the closure as FnMut. While the closure returns the owned variable x, it can still be called multiple times, since x is Copy, so the compiler choses FnMut as the most general applicable type.
When calling an FnMut closure, the closure itself is passed by mutable reference. This explains your first question – calling g directly does not work unless you make it mutable, since otherwise you can't take a mutable reference to it. I also implicitly answered your third question here – self in the call methods of the Fn traits refers to the closure itself, which can be thought of as a struct containing all captured variables.
When calling f(g), you pass the FnMut closure g as a FnOnce closure to f(). This is allowed since all FnOnce is a supertrait of FnMut, so every closure implementing FnMut also implements FnOnce. Now that the closure has been converted to FnOnce, it is also called according to the FnOnce calling convention:
pub trait FnOnce<Args> {
type Output;
extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}
The closure is passed in by value in this case, so the call consumes the closure. You can give away ownership of any value you own – it does not need to be mutable for this to work.
The reason why you can call g multiple times when calling it through f() is that g is Copy. It only captures a single integer, so it can be copied as many times as you want. Each call to f() creates a new copy of g, which is consumed when it is called inside of f().

Related

Can a closure return a reference to data it owns? [duplicate]

Considering the following code:
fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a> {
Box::new(move || &t)
}
What I expect:
The type T has lifetime 'a.
The value t live as long as T.
t moves to the closure, so the closure live as long as t
The closure returns a reference to t which was moved to the closure. So the reference is valid as long as the closure exists.
There is no lifetime problem, the code compiles.
What actually happens:
The code does not compile:
error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
--> src/lib.rs:2:22
|
2 | Box::new(move || &t)
| ^^
|
note: first, the lifetime cannot outlive the lifetime as defined on the body at 2:14...
--> src/lib.rs:2:14
|
2 | Box::new(move || &t)
| ^^^^^^^^^^
note: ...so that closure can access `t`
--> src/lib.rs:2:22
|
2 | Box::new(move || &t)
| ^^
note: but, the lifetime must be valid for the lifetime 'a as defined on the function body at 1:8...
--> src/lib.rs:1:8
|
1 | fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a> {
| ^^
= note: ...so that the expression is assignable:
expected std::boxed::Box<(dyn std::ops::Fn() -> &'a T + 'a)>
found std::boxed::Box<dyn std::ops::Fn() -> &T>
I do not understand the conflict. How can I fix it?

Very interesting question! I think I understood the problem(s) at play here. Let me try to explain.
tl;dr: closures cannot return references to values captured by moving, because that would be a reference to self. Such a reference cannot be returned because the Fn* traits don't allow us to express that. This is basically the same as the streaming iterator problem and could be fixed via GATs (generic associated types).
Implementing it manually
As you probably know, when you write a closure, the compiler will generate a struct and impl blocks for the appropriate Fn traits, so closures are basically syntax sugar. Let's try to avoid all that sugar and build your type manually.
What you want is a type which owns another type and can return references to that owned type. And you want to have a function which returns a boxed instance of said type.
struct Baz<T>(T);
impl<T> Baz<T> {
fn call(&self) -> &T {
&self.0
}
}
fn make_baz<T>(t: T) -> Box<Baz<T>> {
Box::new(Baz(t))
}
This is pretty equivalent to your boxed closure. Let's try to use it:
let outside = {
let s = "hi".to_string();
let baz = make_baz(s);
println!("{}", baz.call()); // works
baz
};
println!("{}", outside.call()); // works too
This works just fine. The string s is moved into the Baz type and that Baz instance is moved into the Box. s is now owned by baz and then by outside.
It gets more interesting when we add a single character:
let outside = {
let s = "hi".to_string();
let baz = make_baz(&s); // <-- NOW BORROWED!
println!("{}", baz.call()); // works
baz
};
println!("{}", outside.call()); // doesn't work!
Now we cannot make the lifetime of baz bigger than the lifetime of s, since baz contains a reference to s which would be an dangling reference of s would go out of scope earlier than baz.
The point I wanted to make with this snippet: we didn't need to annotate any lifetimes on the type Baz to make this safe; Rust figured it out on its own and enforces that baz lives no longer than s. This will be important below.
Writing a trait for it
So far we only covered the basics. Let's try to write a trait like Fn to get closer to your original problem:
trait MyFn {
type Output;
fn call(&self) -> Self::Output;
}
In our trait, there are no function parameters, but otherwise it's fairly identical to the real Fn trait.
Let's implement it!
impl<T> MyFn for Baz<T> {
type Output = ???;
fn call(&self) -> Self::Output {
&self.0
}
}
Now we have a problem: what do we write instead of ???? Naively one would write &T... but we need a lifetime parameter for that reference. Where do we get one? What lifetime does the return value even have?
Let's check the function we implemented before:
impl<T> Baz<T> {
fn call(&self) -> &T {
&self.0
}
}
So here we use &T without lifetime parameter too. But this only works because of lifetime elision. Basically, the compiler fills in the blanks so that fn call(&self) -> &T is equivalent to:
fn call<'s>(&'s self) -> &'s T
Aha, so the lifetime of the returned reference is bound to the self lifetime! (more experienced Rust users might already have a feeling where this is going...).
(As a side note: why is the returned reference not dependent on the lifetime of T itself? If T references something non-'static then this has to be accounted for, right? Yes, but it is already accounted for! Remember that no instance of Baz<T> can ever live longer than the thing T might reference. So the self lifetime is already shorter than whatever lifetime T might have. Thus we only need to concentrate on the self lifetime)
But how do we express that in the trait impl? Turns out: we can't (yet). This problem is regularly mentioned in the context of streaming iterators -- that is, iterators that return an item with a lifetime bound to the self lifetime. In today's Rust, it is sadly impossible to implement this; the type system is not strong enough.
What about the future?
Luckily, there is an RFC "Generic Associated Types" which was merged some time ago. This RFC extends the Rust type system to allow associated types of traits to be generic (over other types and lifetimes).
Let's see how we can make your example (kinda) work with GATs (according to the RFC; this stuff doesn't work yet ☹). First we have to change the trait definition:
trait MyFn {
type Output<'a>; // <-- we added <'a> to make it generic
fn call(&self) -> Self::Output;
}
The function signature hasn't changed in the code, but notice that lifetime elision kicks in! The above fn call(&self) -> Self::Output is equivalent to:
fn call<'s>(&'s self) -> Self::Output<'s>
So the lifetime of the associated type is bound to the self lifetime. Just as we wanted! The impl looks like this:
impl<T> MyFn for Baz<T> {
type Output<'a> = &'a T;
fn call(&self) -> Self::Output {
&self.0
}
}
To return a boxed MyFn we would need to write this (according to this section of the RFC:
fn make_baz<T>(t: T) -> Box<for<'a> MyFn<Output<'a> = &'a T>> {
Box::new(Baz(t))
}
And what if we want to use the real Fn trait? As far as I understand, we can't, even with GATs. I think it's impossible to change the existing Fn trait to use GATs in a backwards compatible manner. So it's likely that the standard library will keep the less powerful trait as is. (side note: how to evolve the standard library in backwards incompatible ways to use new language features is something I wondered about a few times already; so far I haven't heard of any real plan in this regards; I hope the Rust team comes up with something...)
Summary
What you want is not technically impossible or unsafe (we implemented it as a simple struct and it works). However, unfortunately it is impossible to express what you want in the form of closures/Fn traits in Rust's type system right now. This is the same problem streaming iterators are dealing with.
With the planned GAT feature, it is possible to express all of this in the type system. However, the standard library would need to catch up somehow to make your exact code possible.

What I expect:
The type T has lifetime 'a.
The value t live as long as T.
This makes no sense. A value cannot "live as long" as a type, because a type doesn't live. "T has lifetime 'a" is a very imprecise statement, easy to misunderstand. What T: 'a really means is "instances of T must stay valid at least as long as lifetime 'a. For example, T must not be a reference with a lifetime shorter than 'a, or a struct containing such a reference. Note that this has nothing to do with forming references to T, i.e. &T.
The value t, then, lives as long as its lexical scope (it's a function parameter) says it does, which has nothing to do with 'a at all.
t moves to the closure, so the closure live as long as t
This is also incorrect. The closure lives as long as the closure does lexically. It is a temporary in the result expression, and therefore lives until the end of the result expression. t's lifetime concerns the closure not at all, since it has its own T variable inside, the capture of t. Since the capture is a copy/move of t, it is not in any way affected by t's lifetime.
The temporary closure is then moved into the box's storage, but that's a new object with its own lifetime. The lifetime of that closure is bound to the lifetime of the box, i.e. it is the return value of the function, and later (if you store the box outside the function) the lifetime of whatever variable you store the box in.
All of that means that a closure that returns a reference to its own capture state must bind the lifetime of that reference to its own reference. Unfortunately, this is not possible.
Here's why:
The Fn trait implies the FnMut trait, which in turn implies the FnOnce trait. That is, every function object in Rust can be called with a by-value self argument. This means that every function object must be still valid being called with a by-value self argument and returning the same thing as always.
In other words, trying to write a closure that returns a reference to its own captures expands to roughly this code:
struct Closure<T> {
captured: T,
}
impl<T> FnOnce<()> for Closure<T> {
type Output = &'??? T; // what do I put as lifetime here?
fn call_once(self, _: ()) -> Self::Output {
&self.captured // returning reference to local variable
// no matter what, the reference would be invalid once we return
}
}
And this is why what you're trying to do is fundamentally impossible. Take a step back, think of what you're actually trying to accomplish with this closure, and find some other way to accomplish it.

You expect the type T to have lifetime 'a, but t is not a reference to a value of type T. The function takes ownership of the variable t by argument passing:
// t is moved here, t lifetime is the scope of the function
fn foo<'a, T: 'a>(t: T)
You should do:
fn foo<'a, T: 'a>(t: &'a T) -> Box<Fn() -> &'a T + 'a> {
Box::new(move || t)
}

The other answers are top-notch, but I wanted to chime in with another reason your original code couldn't work. A big problem lies in the signature:
fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a>
This says that the caller may specify any lifetime when calling foo and the code will be valid and memory-safe. That cannot possibly be true for this code. It wouldn't make sense to call this with 'a set to 'static, but nothing about this signature would prevent that.

How can a closure using the `move` keyword create a FnMut closure?

Up to this moment I thought that move |...| {...} would move variables inside a closure and the closure would implement only FnOnce, because you can move variables only once. To my surprise, however, I found that this code works:
extern crate futures;
use futures::stream;
use futures::stream::{Stream, StreamExt};
use std::rc::Rc;
#[derive(Debug)]
struct Foo(i32);
fn bar(r: Rc<Foo>) -> Box<Stream<Item = (), Error = ()> + 'static> {
Box::new(stream::repeat::<_, ()>(()).map(move |_| {
println!("{:?}", r);
}))
}
fn main() {
let r = Rc::new(Foo(0));
let _ = bar(r);
}
Despite map having this signature:
fn map<U, F>(self, f: F) -> Map<Self, F>
where
F: FnMut(Self::Item) -> U,
It's surprising to me that a FnMut closure was created when using the move keyword and it even has 'static lifetime. Where can I find some details about move? Or how does it actually work?

Yes, this point is quite confusing, and I think the wording of the Rust book contributes. After I read it, I thought the same as you did: that a move closure was necessarily FnOnce, and that a non-move closure was FnMut (and may also be Fn). But this is kind-of backwards from the real situation.
The closure can capture values from the scope where it's created. move controls how those values go into the closure: either by being moved, or by reference. But it's how they're used after they're captured that determines whether the closure is FnMut or not.
If the body of the closure consumes any value it captured, then the closure can only be FnOnce. After the closure runs the first time, and consumes that value, it can't run again.
As you've mentioned, you can consume a value inside the closure by calling drop on it, or in other ways, but the most common case is to return it from the closure, which moves it out of the closure. Here's the simplest example:
let s = String::from("hello world");
let my_fnonce = move || { s };
If the body of the closure doesn't consume any of its captures, then it's FnMut, whether it was move or not. If it also doesn't mutate any of its captures, it's also Fn; any closure that is Fn is also FnMut. Here's a simple example, albeit not a very good one.
let s = "hello world";
let my_fn = move || { s.len() }
Summary
The move modifier controls how captures are moved into the closure when it's created. FnMut membership is determined by how captures are moved out of the closure (or consumed in some other way) when it's executed.

Up to this moment I thought that move |...| {...} will move variables inside closure and closure will implement only FnOnce, because you can move vars only once.
The variables are moved when the closure is created, not when it is invoked. Since you're only creating one closure, the move only happens once - regardless of how often map calls the function.

Cannot infer an appropriate lifetime for a closure that returns a reference

Considering the following code:
fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a> {
Box::new(move || &t)
}
What I expect:
The type T has lifetime 'a.
The value t live as long as T.
t moves to the closure, so the closure live as long as t
The closure returns a reference to t which was moved to the closure. So the reference is valid as long as the closure exists.
There is no lifetime problem, the code compiles.
What actually happens:
The code does not compile:
error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
--> src/lib.rs:2:22
|
2 | Box::new(move || &t)
| ^^
|
note: first, the lifetime cannot outlive the lifetime as defined on the body at 2:14...
--> src/lib.rs:2:14
|
2 | Box::new(move || &t)
| ^^^^^^^^^^
note: ...so that closure can access `t`
--> src/lib.rs:2:22
|
2 | Box::new(move || &t)
| ^^
note: but, the lifetime must be valid for the lifetime 'a as defined on the function body at 1:8...
--> src/lib.rs:1:8
|
1 | fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a> {
| ^^
= note: ...so that the expression is assignable:
expected std::boxed::Box<(dyn std::ops::Fn() -> &'a T + 'a)>
found std::boxed::Box<dyn std::ops::Fn() -> &T>
I do not understand the conflict. How can I fix it?

Very interesting question! I think I understood the problem(s) at play here. Let me try to explain.
tl;dr: closures cannot return references to values captured by moving, because that would be a reference to self. Such a reference cannot be returned because the Fn* traits don't allow us to express that. This is basically the same as the streaming iterator problem and could be fixed via GATs (generic associated types).
Implementing it manually
As you probably know, when you write a closure, the compiler will generate a struct and impl blocks for the appropriate Fn traits, so closures are basically syntax sugar. Let's try to avoid all that sugar and build your type manually.
What you want is a type which owns another type and can return references to that owned type. And you want to have a function which returns a boxed instance of said type.
struct Baz<T>(T);
impl<T> Baz<T> {
fn call(&self) -> &T {
&self.0
}
}
fn make_baz<T>(t: T) -> Box<Baz<T>> {
Box::new(Baz(t))
}
This is pretty equivalent to your boxed closure. Let's try to use it:
let outside = {
let s = "hi".to_string();
let baz = make_baz(s);
println!("{}", baz.call()); // works
baz
};
println!("{}", outside.call()); // works too
This works just fine. The string s is moved into the Baz type and that Baz instance is moved into the Box. s is now owned by baz and then by outside.
It gets more interesting when we add a single character:
let outside = {
let s = "hi".to_string();
let baz = make_baz(&s); // <-- NOW BORROWED!
println!("{}", baz.call()); // works
baz
};
println!("{}", outside.call()); // doesn't work!
Now we cannot make the lifetime of baz bigger than the lifetime of s, since baz contains a reference to s which would be an dangling reference of s would go out of scope earlier than baz.
The point I wanted to make with this snippet: we didn't need to annotate any lifetimes on the type Baz to make this safe; Rust figured it out on its own and enforces that baz lives no longer than s. This will be important below.
Writing a trait for it
So far we only covered the basics. Let's try to write a trait like Fn to get closer to your original problem:
trait MyFn {
type Output;
fn call(&self) -> Self::Output;
}
In our trait, there are no function parameters, but otherwise it's fairly identical to the real Fn trait.
Let's implement it!
impl<T> MyFn for Baz<T> {
type Output = ???;
fn call(&self) -> Self::Output {
&self.0
}
}
Now we have a problem: what do we write instead of ???? Naively one would write &T... but we need a lifetime parameter for that reference. Where do we get one? What lifetime does the return value even have?
Let's check the function we implemented before:
impl<T> Baz<T> {
fn call(&self) -> &T {
&self.0
}
}
So here we use &T without lifetime parameter too. But this only works because of lifetime elision. Basically, the compiler fills in the blanks so that fn call(&self) -> &T is equivalent to:
fn call<'s>(&'s self) -> &'s T
Aha, so the lifetime of the returned reference is bound to the self lifetime! (more experienced Rust users might already have a feeling where this is going...).
(As a side note: why is the returned reference not dependent on the lifetime of T itself? If T references something non-'static then this has to be accounted for, right? Yes, but it is already accounted for! Remember that no instance of Baz<T> can ever live longer than the thing T might reference. So the self lifetime is already shorter than whatever lifetime T might have. Thus we only need to concentrate on the self lifetime)
But how do we express that in the trait impl? Turns out: we can't (yet). This problem is regularly mentioned in the context of streaming iterators -- that is, iterators that return an item with a lifetime bound to the self lifetime. In today's Rust, it is sadly impossible to implement this; the type system is not strong enough.
What about the future?
Luckily, there is an RFC "Generic Associated Types" which was merged some time ago. This RFC extends the Rust type system to allow associated types of traits to be generic (over other types and lifetimes).
Let's see how we can make your example (kinda) work with GATs (according to the RFC; this stuff doesn't work yet ☹). First we have to change the trait definition:
trait MyFn {
type Output<'a>; // <-- we added <'a> to make it generic
fn call(&self) -> Self::Output;
}
The function signature hasn't changed in the code, but notice that lifetime elision kicks in! The above fn call(&self) -> Self::Output is equivalent to:
fn call<'s>(&'s self) -> Self::Output<'s>
So the lifetime of the associated type is bound to the self lifetime. Just as we wanted! The impl looks like this:
impl<T> MyFn for Baz<T> {
type Output<'a> = &'a T;
fn call(&self) -> Self::Output {
&self.0
}
}
To return a boxed MyFn we would need to write this (according to this section of the RFC:
fn make_baz<T>(t: T) -> Box<for<'a> MyFn<Output<'a> = &'a T>> {
Box::new(Baz(t))
}
And what if we want to use the real Fn trait? As far as I understand, we can't, even with GATs. I think it's impossible to change the existing Fn trait to use GATs in a backwards compatible manner. So it's likely that the standard library will keep the less powerful trait as is. (side note: how to evolve the standard library in backwards incompatible ways to use new language features is something I wondered about a few times already; so far I haven't heard of any real plan in this regards; I hope the Rust team comes up with something...)
Summary
What you want is not technically impossible or unsafe (we implemented it as a simple struct and it works). However, unfortunately it is impossible to express what you want in the form of closures/Fn traits in Rust's type system right now. This is the same problem streaming iterators are dealing with.
With the planned GAT feature, it is possible to express all of this in the type system. However, the standard library would need to catch up somehow to make your exact code possible.

What I expect:
The type T has lifetime 'a.
The value t live as long as T.
This makes no sense. A value cannot "live as long" as a type, because a type doesn't live. "T has lifetime 'a" is a very imprecise statement, easy to misunderstand. What T: 'a really means is "instances of T must stay valid at least as long as lifetime 'a. For example, T must not be a reference with a lifetime shorter than 'a, or a struct containing such a reference. Note that this has nothing to do with forming references to T, i.e. &T.
The value t, then, lives as long as its lexical scope (it's a function parameter) says it does, which has nothing to do with 'a at all.
t moves to the closure, so the closure live as long as t
This is also incorrect. The closure lives as long as the closure does lexically. It is a temporary in the result expression, and therefore lives until the end of the result expression. t's lifetime concerns the closure not at all, since it has its own T variable inside, the capture of t. Since the capture is a copy/move of t, it is not in any way affected by t's lifetime.
The temporary closure is then moved into the box's storage, but that's a new object with its own lifetime. The lifetime of that closure is bound to the lifetime of the box, i.e. it is the return value of the function, and later (if you store the box outside the function) the lifetime of whatever variable you store the box in.
All of that means that a closure that returns a reference to its own capture state must bind the lifetime of that reference to its own reference. Unfortunately, this is not possible.
Here's why:
The Fn trait implies the FnMut trait, which in turn implies the FnOnce trait. That is, every function object in Rust can be called with a by-value self argument. This means that every function object must be still valid being called with a by-value self argument and returning the same thing as always.
In other words, trying to write a closure that returns a reference to its own captures expands to roughly this code:
struct Closure<T> {
captured: T,
}
impl<T> FnOnce<()> for Closure<T> {
type Output = &'??? T; // what do I put as lifetime here?
fn call_once(self, _: ()) -> Self::Output {
&self.captured // returning reference to local variable
// no matter what, the reference would be invalid once we return
}
}
And this is why what you're trying to do is fundamentally impossible. Take a step back, think of what you're actually trying to accomplish with this closure, and find some other way to accomplish it.

You expect the type T to have lifetime 'a, but t is not a reference to a value of type T. The function takes ownership of the variable t by argument passing:
// t is moved here, t lifetime is the scope of the function
fn foo<'a, T: 'a>(t: T)
You should do:
fn foo<'a, T: 'a>(t: &'a T) -> Box<Fn() -> &'a T + 'a> {
Box::new(move || t)
}

The other answers are top-notch, but I wanted to chime in with another reason your original code couldn't work. A big problem lies in the signature:
fn foo<'a, T: 'a>(t: T) -> Box<Fn() -> &'a T + 'a>
This says that the caller may specify any lifetime when calling foo and the code will be valid and memory-safe. That cannot possibly be true for this code. It wouldn't make sense to call this with 'a set to 'static, but nothing about this signature would prevent that.

How do I create an iterator that invalidates the previous result each time `next` is called? [duplicate]

This would make it possible to safely iterate over the same element twice, or to hold some state for the global thing being iterated over in the item type.
Something like:
trait IterShort<Iter>
where Self: Borrow<Iter>,
{
type Item;
fn next(self) -> Option<Self::Item>;
}
then an implementation could look like:
impl<'a, MyIter> IterShort<MyIter> for &'a mut MyIter {
type Item = &'a mut MyItem;
fn next(self) -> Option<Self::Item> {
// ...
}
}
I realize I could write my own (I just did), but I'd like one that works with the for-loop notation. Is that possible?

The std::iter::Iterator trait can not do this, but you can write a different trait:
trait StreamingIterator {
type Item;
fn next<'a>(&'a mut self) -> Option<&'a mut Self::Item>;
}
Note that the return value of next borrows the iterator itself, whereas in Vec::iter for example it only borrows the vector.
The downside is that &mut is hard-coded. Making it generic would require higher-kinded types (so that StreamingIterator::Item could itself be generic over a lifetime parameter).
Alexis Beingessner gave a talk about this and more titled Who Owns This Stream of Data? at RustCamp.
As to for loops, they’re really tied to std::iter::IntoIterator which is tied to std::iter::Iterator. You’d just have to implement both.

The standard iterators can't do this as far as I can see. The very definition of an iterator is that the outside has control over the elements while the inside has control over what produces the elements.
From what I understand of what you are trying to do, I'd flip the concept around and instead of returning elements from an iterator to a surrounding environment, pass the environment to the iterator. That is, you create a struct with a constructor function that accepts a closure and implements the iterator trait. On each call to next, the passed-in closure is called with the next element and the return value of that closure or modifications thereof are returned as the current element. That way, next can handle the lifetime of whatever would otherwise be returned to the surrounding environment.

Returning a borrow from an owned resource

I am trying to write a function that maps an Arc<[T]> into an Iterable, for use with flat_map (that is, I want to call i.flat_map(my_iter) for some other i: Iterator<Item=Arc<[T]>>).
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T> {
let t: &'a [T] = &*n.clone();
t.into_iter()
}
The function above does not work because n.clone() produces an owned value of type Arc<[T]>, which I can dereference to [T] and then borrow to get &[T], but the lifetime of the borrow only lasts until the end of the function, while the 'a lifetime lasts until the client drops the returned iterator.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
Here's some sample code for the source iterator:
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
How do I implement the result of BaseIter(data).flat_map(my_iter) (which is of type Iterator<&T>) given that BaseIter is producing data, not just borrowing it? (The real thing is more complicated than this, it's not always the same result, but the ownership semantics are the same.)

You cannot do this. Remember, lifetimes in Rust are purely compile-time entities and are only used to validate that your code doesn't accidentally access dropped data. For example:
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T>
Here 'a does not "last until the client drops the returned iterator"; this reasoning is incorrect. From the point of view of slice::Iter its lifetime parameter means the lifetime of the slice it is pointing at; from the point of view of my_iter 'a is just a lifetime parameter which can be chosen arbitrarily by the caller. In other words, slice::Iter is always tied to some slice with some concrete lifetime, but the signature of my_iter states that it is able to return arbitrary lifetime. Do you see the contradiction?
As a side note, due to covariance of lifetimes you can return a slice of a static slice from such a function:
static DATA: &'static [u8] = &[1, 2, 3];
fn get_data<'a>() -> &'a [u8] {
DATA
}
The above definition compiles, but it only works because DATA is stored in static memory of your program and is always valid when your program is running; this is not so with Arc<[T]>.
Arc<[T]> implies shared ownership, that is, the data inside Arc<[T]> is jointly owned by all clones of the original Arc<[T]> value. Therefore, when the last clone of an Arc goes out of scope, the value it contains is dropped, and the respective memory is freed. Now, consider what would happen if my_iter() was allowed to compile:
let iter = {
let data: Arc<[i32]> = get_arc_slice();
my_iter(data.clone())
};
iter.map(|x| x+1).collect::<Vec<_>>();
Because in my_iter() 'a can be arbitrary and is not linked in any way to Arc<[T]> (and can not be, actually), nothing prevents this code from compilation - the user might as well choose 'static lifetime. However, here all clones of data will be dropped inside the block, and the array it contains inside will be freed. Using iter after the block is unsafe because it now provides access to the freed memory.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
So, as follows from the above, this is impossible. Only the owner of the data determines when this data should be destroyed, and borrowed references (whose existence is always implied by lifetime parameters) may only borrow the data for the time when it exists, but borrows cannot affect when and how the data is destroyed. In order for borrowed references to compile, they need to always borrow only the data which is valid through the whole time these references are active.
What you can do is to rethink your architecture. It is hard to say what exactly can be done without looking at the full code, but in the case of this particular example you can, for example, first collect the iterator into a vector and then iterate over the vector:
let items: Vec<_> = your_iter.collect();
items.iter().flat_map(my_iter)
Note that now my_iter() should indeed accept &Arc<[T]>, just as Francis Gagné has suggested; this way, the lifetimes of the output iterator will be tied to the lifetime of the input reference, and everything should work fine, because now it is guaranteed that Arcs are stored stably in the vector for their later perusal during the iteration.

There's no way to make this work by passing an Arc<[T]> by value. You need to start from a reference to an Arc<[T]> in order to construct a valid slice::Iter.
fn my_iter<'a, T>(n: &'a Arc<[T]>) -> slice::Iter<'a, T> {
n.into_iter()
}
Or, if we elide the lifetimes:
fn my_iter<T>(n: &Arc<[T]>) -> slice::Iter<T> {
n.into_iter()
}

You need to use another iterator as return type of the function my_iter. slice::Iter<'a, T> has an associated type Item = &'a T. You need an iterator with associated type Item = T. Something like vec::IntoIter<T>. You can implement such an iterator yourself:
use std::sync::Arc;
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
struct ArcIntoIter<T>(usize, Arc<[T]>);
impl<T:Clone> Iterator for ArcIntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
if self.0 < self.1.len(){
let i = self.0;
self.0+=1;
Some(self.1[i].clone())
}else{
None
}
}
}
fn my_iter<T>(n: Arc<[T]>) -> ArcIntoIter<T> {
ArcIntoIter(0, n)
}
fn main() {
let data = Arc::new(["A","B","C"]);
println!("{:?}", BaseIter(data).take(3).flat_map(my_iter).collect::<String>());
//output:"ABCABCABC"
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Ownership, closures, FnOnce: much confusion - rust

Related

Can a closure return a reference to data it owns? [duplicate]

How can a closure using the `move` keyword create a FnMut closure?

Cannot infer an appropriate lifetime for a closure that returns a reference

How do I create an iterator that invalidates the previous result each time `next` is called? [duplicate]

Returning a borrow from an owned resource

Categories

Resources