How do I build a Cacher in Rust without relying on the Copy trait? - hashmap

I am trying to implement a Cacher as mentioned in Chapter 13 of the Rust book and running into trouble.
My Cacher code looks like:
use std::collections::HashMap;
use std::hash::Hash;
pub struct Cacher<T, K, V>
where
T: Fn(K) -> V,
{
calculation: T,
values: HashMap<K, V>,
}
impl<T, K: Eq + Hash, V> Cacher<T, K, V>
where
T: Fn(K) -> V,
{
pub fn new(calculation: T) -> Cacher<T, K, V> {
Cacher {
calculation,
values: HashMap::new(),
}
}
pub fn value(&mut self, k: K) -> &V {
let result = self.values.get(&k);
match result {
Some(v) => {
return v;
}
None => {
let v = (self.calculation)(k);
self.values.insert(k, v);
&v
}
}
}
}
and my test case for this lib looks like:
mod cacher;
#[cfg(test)]
mod tests {
use cacher::Cacher;
#[test]
fn repeated_runs_same() {
let mut cacher = Cacher::new(|x| x);
let run1 = cacher.value(5);
let run2 = cacher.value(7);
assert_ne!(run1, run2);
}
}
I ran into the following problems when running my test case:
error[E0499]: cannot borrow cacher as mutable more than once at a time
Each time I make a run1, run2 value it is trying to borrow cacher as a mutable borrow. I don't understand why it is borrowing at all - I thought cacher.value() should be returning a reference to the item that is stored in the cacher which is not a borrow.
error[E0597]: v does not live long enough pointing to the v I return in the None case of value(). How do I properly move the v into the HashMap and give it the same lifetime as the HashMap? Clearly the lifetime is expiring as it returns, but I want to just return a reference to it to use as the return from value().
error[E0502]: cannot borrowself.valuesas mutable because it is also borrowed as immutable in value(). self.values.get(&k) is an immutable borrow and self.values.insert(k,v) is a mutable borrow - though I thought .get() was an immutable borrow and .insert() was a transfer of ownership.
and a few other errors related to moving which I should be able to handle separately. These are much more fundamental errors that indicate I have misunderstood the idea of ownership in Rust, yet rereading the segment of the book doesn't make clear to me what I missed.

I think there a quite a few issues to look into here:
First, for the definition of the function value(&mut self, k: K) -> &V ; the compiler will insert the lifetimes for you so that it becomes value(&'a mut self, k: K) -> &'a V. This means, the lifetime of the self cannot shrink for the sake of the function, because there is reference coming out of the function with the same lifetime, and will live for as long as the scope. Since it is a mutable reference, you cannot borrow it again. Hence the error error[E0499]: cannot borrow cacher as mutable more than once at a time.
Second, you call the calculation function that returns the value within some inner scope of the function value() and then you return the reference to it, which is not possible. You expect the reference to live longer than the the referent. Hence the error error[E0597]: v does not live long enough
The third error is a bit involved. You see, let result = self.values.get(&k); as mentioned in the first statement, causes k to be held immutably till the end of the function. result returned will live for as long your function value(), which means you cannot take a borrow(mutable) in the same scope, giving the error
error[E0502]: cannot borrow self.values as mutable because it is also borrowed as immutable in value() self.values.get(&k)
Your K needs to be a Clone, reason being k will be moved into the function calculation, rendering it unusable during insert.
So with K as a Clone, the Cacher implementation will be:
impl<T, K: Eq + Hash + Clone, V> Cacher<T, K, V>
where
T: Fn(K) -> V,
{
pub fn new(calculation: T) -> Cacher<T, K, V> {
Cacher {
calculation,
values: hash_map::HashMap::new(),
}
}
pub fn value(&mut self, k: K) -> &V {
if self.values.contains_key(&k) {
return &self.values[&k];
}
self.values.insert(k.clone(), (self.calculation)(k.clone()));
self.values.get(&k).unwrap()
}
}
This lifetimes here are based on the branching control flow. The if self.values.contains_key ... block always returns, hence the code after if block can only be executed when if self.values.contains_key ... is false. The tiny scope created for if condition, will only live within the condition check, i.e reference taken (and returned) for if self.values.contains_key(... will go away with this tiny scope.
For more please refer NLL RFC
As mentioned by #jmb in his answer, for your test to work, V will need to be a Clone (impl <... V:Clone> Cacher<T, K, V>) to return by value or use shared ownership like Rc to avoid the cloning cost.
eg.
fn value(&mut self, k: K) -> V { ..
fn value(&mut self, k: K) -> Rc<V> { ..

Returning a reference to a value is the same thing as borrowing that value. Since that value is owned by the cacher, it implicitly borrows the cacher too. This makes sense: if you take a reference to a value inside the cacher then destroy the cacher, what happens to your reference? Note also that if you modify the cacher (e.g. by inserting a new element), this could reallocate the storage, which would invalidate any references to values stored inside.
You need your values to be at least Clone so that Cacher::value can return by value instead of by reference. You can use Rc if your values are too expensive to clone and you are ok with all callers getting the same instance.
The naive way to get the instance that was stored in the HashMap as opposed to the temporary you allocated to build it would be to call self.values.get (k).unwrap() after inserting the value in the map. In order to avoid the cost of computing twice the location of the value in the map, you can use the Entry interface:
pub fn value(&mut self, k: K) -> Rc<V> {
self.values.entry (&k).or_insert_with (|| Rc::new (self.calculation (k)))
}
I believe my answer to point 2 also solves this point.

Related

How is it that I can circumvent "cannot borrow as mutable more than once at a time" with semantically equivalent code?

I have the following for a merge sort problem with huge files:
struct MergeIterator<'a, T> where T: Copy {
one: &'a mut dyn Iterator<Item=T>,
two: &'a mut dyn Iterator<Item=T>,
a: Option<T>,
b: Option<T>
}
impl<'m, T> MergeIterator<'m, T> where T: Copy {
pub fn new(i1: &'m mut dyn Iterator<Item=T>,
i2: &'m mut dyn Iterator<Item=T>) -> MergeIterator<'m, T> {
let mut m = MergeIterator {one:i1, two:i2, a: None, b: None};
m.a = m.one.next();
m.b = m.two.next();
m
}
}
This seems to make rustc happy. However, I started with this (imho) less clumsy body of the new() function:
MergeIterator {one:i1, two:i2, a: i1.next(), b: i2.next()}
and got harsh feedback from the compiler saying
cannot borrow `*i1` as mutable more than once at a time
and likewise for i2.
I'd like to understand where the semantic difference is between initializing the data elements through the m.one reference vs the i1 argument? Why must I write clumsy imperative code here to achieve what I need?
This will probably be clearer if you write it in lines so that the sequence of operations is visible:
MergeIterator {
one: i1,
two: i2,
a: i1.next(),
b: i2.next(),
}
You're giving i1 to the new struct, you don't have it anymore to call next.
The solution is to change the order of operations to call next first before giving away the mutable reference:
MergeIterator {
a: i1.next(),
b: i2.next(),
one: i1,
two: i2,
}
To make it clearer, you must understand that i1.next() borrows i1 only for the time of the function call while i1 gives away the mutable reference. Reversing the order isn't equivalent.

Compiler continues to count the borrow as mutable when it is actually immutable [duplicate]

This code fails the dreaded borrow checker (playground):
struct Data {
a: i32,
b: i32,
c: i32,
}
impl Data {
fn reference_to_a(&mut self) -> &i32 {
self.c = 1;
&self.a
}
fn get_b(&self) -> i32 {
self.b
}
}
fn main() {
let mut dat = Data{ a: 1, b: 2, c: 3 };
let aref = dat.reference_to_a();
println!("{}", dat.get_b());
}
Since non-lexical lifetimes were implemented, this is required to trigger the error:
fn main() {
let mut dat = Data { a: 1, b: 2, c: 3 };
let aref = dat.reference_to_a();
let b = dat.get_b();
println!("{:?}, {}", aref, b);
}
Error:
error[E0502]: cannot borrow `dat` as immutable because it is also borrowed as mutable
--> <anon>:19:20
|
18 | let aref = dat.reference_to_a();
| --- mutable borrow occurs here
19 | println!("{}", dat.get_b());
| ^^^ immutable borrow occurs here
20 | }
| - mutable borrow ends here
Why is this? I would have thought that the mutable borrow of dat is converted into an immutable one when reference_to_a() returns, because that function only returns an immutable reference. Is the borrow checker just not clever enough yet? Is this planned? Is there a way around it?
Lifetimes are separate from whether a reference is mutable or not. Working through the code:
fn reference_to_a(&mut self) -> &i32
Although the lifetimes have been elided, this is equivalent to:
fn reference_to_a<'a>(&'a mut self) -> &'a i32
i.e. the input and output lifetimes are the same. That's the only way to assign lifetimes to a function like this (unless it returned an &'static reference to global data), since you can't make up the output lifetime from nothing.
That means that if you keep the return value alive by saving it in a variable, you're keeping the &mut self alive too.
Another way of thinking about it is that the &i32 is a sub-borrow of &mut self, so is only valid until that expires.
As #aSpex points out, this is covered in the nomicon.
Why is this an error: While a more precise explanation was already given by #Chris some 2.5 years ago, you can read fn reference_to_a(&mut self) -> &i32 as a declaration that:
“I want to exclusively borrow self, then return a shared/immutable reference which lives as long as the original exclusive borrow” (source)
Apparently it can even prevent me from shooting myself in the foot.
Is the borrow checker just not clever enough yet? Is this planned?
There's still no way to express "I want to exclusively borrow self for the duration of the call, and return a shared reference with a separate lifetime". It is mentioned in the nomicon as #aSpex pointed out, and is listed among the Things Rust doesn’t let you do as of late 2018.
I couldn't find specific plans to tackle this, as previously other borrow checker improvements were deemed higher priority. The idea about allowing separate read/write "lifetime roles" (Ref2<'r, 'w>) was mentioned in the NLL RFC, but no-one has made it into an RFC of its own, as far as I can see.
Is there a way around it? Not really, but depending on the reason you needed this in the first place, other ways of structuring the code may be appropriate:
You can return a copy/clone instead of the reference
Sometimes you can split a fn(&mut self) -> &T into two, one taking &mut self and another returning &T, as suggested by #Chris here
As is often the case in Rust, rearranging your structs to be "data-oriented" rather than "object-oriented" can help
You can return a shared reference from the method: fn(&mut self) -> (&Self, &T) (from this answer)
You can make the fn take a shared &self reference and use interior mutability (i.e. define the parts of Self that need to be mutated as Cell<T> or RefCell<T>). This may feel like cheating, but it's actually appropriate, e.g. when the reason you need mutability as an implementation detail of a logically-immutable method. After all we're making a method take a &mut self not because it mutates parts of self, but to make it known to the caller so that it's possible to reason about which values can change in a complex program.

How can I create an Iter over a Vec contained in a RefCell? [duplicate]

Given the following struct and impl:
use std::slice::Iter;
use std::cell::RefCell;
struct Foo {
bar: RefCell<Vec<u32>>,
}
impl Foo {
pub fn iter(&self) -> Iter<u32> {
self.bar.borrow().iter()
}
}
fn main() {}
I get an error message about a lifetime issue:
error: borrowed value does not live long enough
--> src/main.rs:9:9
|
9 | self.bar.borrow().iter()
| ^^^^^^^^^^^^^^^^^ does not live long enough
10 | }
| - temporary value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 8:36...
--> src/main.rs:8:37
|
8 | pub fn iter(&self) -> Iter<u32> {
| _____________________________________^ starting here...
9 | | self.bar.borrow().iter()
10 | | }
| |_____^ ...ending here
How am I able to return and use bars iterator?
You cannot do this because it would allow you to circumvent runtime checks for uniqueness violations.
RefCell provides you a way to "defer" mutability exclusiveness checks to runtime, in exchange allowing mutation of the data it holds inside through shared references. This is done using RAII guards: you can obtain a guard object using a shared reference to RefCell, and then access the data inside RefCell using this guard object:
&'a RefCell<T> -> Ref<'a, T> (with borrow) or RefMut<'a, T> (with borrow_mut)
&'b Ref<'a, T> -> &'b T
&'b mut RefMut<'a, T> -> &'b mut T
The key point here is that 'b is different from 'a, which allows one to obtain &mut T references without having a &mut reference to the RefCell. However, these references will be linked to the guard instead and can't live longer than the guard. This is done intentionally: Ref and RefMut destructors toggle various flags inside their RefCell to force mutability checks and to force borrow() and borrow_mut() panic if these checks fail.
The simplest thing you can do is to return a wrapper around Ref, a reference to which would implement IntoIterator:
use std::cell::Ref;
struct VecRefWrapper<'a, T: 'a> {
r: Ref<'a, Vec<T>>
}
impl<'a, 'b: 'a, T: 'a> IntoIterator for &'b VecRefWrapper<'a, T> {
type IntoIter = Iter<'a, T>;
type Item = &'a T;
fn into_iter(self) -> Iter<'a, T> {
self.r.iter()
}
}
(try it on playground)
You can't implement IntoIterator for VecRefWrapper directly because then the internal Ref will be consumed by into_iter(), giving you essentially the same situation you're in now.
Alternate Solution
Here is an alternate solution that uses interior mutability as it was intended. Instead of creating an iterator for &T values, we should create an iterator for Ref<T> values, which deference automatically.
struct Iter<'a, T> {
inner: Option<Ref<'a, [T]>>,
}
impl<'a, T> Iterator for Iter<'a, T> {
type Item = Ref<'a, T>;
fn next(&mut self) -> Option<Self::Item> {
match self.inner.take() {
Some(borrow) => match *borrow {
[] => None,
[_, ..] => {
let (head, tail) = Ref::map_split(borrow, |slice| {
(&slice[0], &slice[1..])
});
self.inner.replace(tail);
Some(head)
}
},
None => None,
}
}
}
Playground
Explanation
The accepted answer has a few significant drawbacks that may confuse those new to Rust. I will explain how, in my personal experience, the accepted answer might actually be harmful to a beginner, and why I believe this alternative uses interior mutability and iterators as they were intended.
As the previous answer importantly highlights, using RefCell creates a divergent type hierarchy that isolates mutable and immutable access to a shared value, but you do not have to worry about lifetimes to solve the iteration problem:
RefCell<T> .borrow() -> Ref<T> .deref() -> &T
RefCell<T> .borrow_mut() -> RefMut<T> .deref_mut() -> &mut T
The key to solving this without lifetimes is the Ref::map method, which is critically missed in the book. Ref::map "makes a new reference to a component of the borrowed data", or in other words converts a Ref<T> of the outer type to a Ref<U> of some inner value:
Ref::map(Ref<T>, ...) -> Ref<U>
Ref::map and its counterpart RefMut::map are the real stars of the interior mutability pattern, not borrow() and borrow_mut().
Why? Because unlike borrow() and borrow_mut(), Ref::mut and RefMut::map, allow you to create references to interior values that can be "returned".
Consider adding a first() method to the Foo struct described in the question:
fn first(&self) -> &u32 {
&self.bar.borrow()[0]
}
Nope, .borrow() makes a temporary Ref that only lives until the method returns:
error[E0515]: cannot return value referencing temporary value
--> src/main.rs:9:11
|
9 | &self.bar.borrow()[0]
| ^-----------------^^^
| ||
| |temporary value created here
| returns a value referencing data owned by the current function
error: aborting due to previous error; 1 warning emitted
We can make it more obvious what is happening if we break it up and make the implicit deference explicit:
fn first(&self) -> &u32 {
let borrow: Ref<_> = self.bar.borrow();
let bar: &Vec<u32> = borrow.deref();
&bar[0]
}
Now we can see that .borrow() creates a Ref<T> that is owned by the method's scope, and isn't returned and therefore dropped even before the reference it provided can be used. So, what we really need is to return an owned type instead of a reference. We want to return a Ref<T>, as it implements Deref for us!
Ref::map will help us do just that for component (internal) values:
fn first(&self) -> Ref<u32> {
Ref::map(self.bar.borrow(), |bar| &bar[0])
}
Of course, the .deref() will still happen automatically, and Ref<u32> will be mostly be referentially transparent as &u32.
Gotcha. One easy mistake to make when using Ref::map is to try to create an owned value in the closure, which is not possible as when we tried to use borrow(). Consider the type signature of the second parameter, the function: FnOnce(&T) -> &U,. It returns a reference, not an owned type!
This is why we use a slice in the answer &v[..] instead of trying to use the vector's .iter() method, which returns an owned std::slice::Iter<'a, T>. Slices are a reference type.
Additional Thoughts
Alright, so now I will attempt to justify why this solution is better than the accepted answer.
First, the use of IntoIterator is inconsistent with the Rust standard library, and arguably the purpose and intent of the trait. The trait method consumes self: fn into_iter(self) -> ....
let v = vec![1,2,3,4];
let i = v.into_iter();
// v is no longer valid, it was moved into the iterator
Using IntoIterator indirectly for a wrapper is inconsistent as you consume the wrapper and not the collection. In my experience, beginners will benefit from sticking with the conventions. We should use a regular Iterator.
Next, the IntoIterator trait is implemented for the reference &VecRefWrapper and not the owned type VecRefWrapper.
Suppose you are implementing a library. The consumers of your API will have to seemingly arbitrarily decorate owned values with reference operators, as is demonstrated in the example on the playground:
for &i in &foo.iter() {
println!("{}", i);
}
This is a subtle and confusing distinction if you are new to Rust. Why do we have to take a reference to the value when it is anonymously owned by - and should only exist for - the scope of the loop?
Finally, the solution above shows how it is possible to drill all they way into your data with interior mutability, and makes the path forward for implementing a mutable iterator clear as well. Use RefMut.
From my research there is currently no solution to this problem. The biggest problem here is self-referentiality and the fact that rust cannot prove your code to be safe. Or at least not in the generic fashion.
I think it's safe to assume that crates like ouroboros, self-cell and owning_ref are solution if you know that your struct (T in Ref<T>) does not contain any smart pointers nor anything which could invalidate any pointers you might obtain in your "dependent" struct.
Note that self-cell does this safely with extra heap allocation which might be ok in some cases.
There was also RFC for adding map_value to Ref<T> but as you can see, there is always some way to invalidate pointers in general (which does not mean your specific case is wrong it's just that it probably will never be added to the core library/language because it cannot be guaranteed for any T)
Yeah, so no answer, sorry. impl IntoIterator for &T works but I think it's rather hack and it forces you to write for x in &iter instead of for x in iter

Why doesn't a mutable borrow of self change to immutable?

This code fails the dreaded borrow checker (playground):
struct Data {
a: i32,
b: i32,
c: i32,
}
impl Data {
fn reference_to_a(&mut self) -> &i32 {
self.c = 1;
&self.a
}
fn get_b(&self) -> i32 {
self.b
}
}
fn main() {
let mut dat = Data{ a: 1, b: 2, c: 3 };
let aref = dat.reference_to_a();
println!("{}", dat.get_b());
}
Since non-lexical lifetimes were implemented, this is required to trigger the error:
fn main() {
let mut dat = Data { a: 1, b: 2, c: 3 };
let aref = dat.reference_to_a();
let b = dat.get_b();
println!("{:?}, {}", aref, b);
}
Error:
error[E0502]: cannot borrow `dat` as immutable because it is also borrowed as mutable
--> <anon>:19:20
|
18 | let aref = dat.reference_to_a();
| --- mutable borrow occurs here
19 | println!("{}", dat.get_b());
| ^^^ immutable borrow occurs here
20 | }
| - mutable borrow ends here
Why is this? I would have thought that the mutable borrow of dat is converted into an immutable one when reference_to_a() returns, because that function only returns an immutable reference. Is the borrow checker just not clever enough yet? Is this planned? Is there a way around it?
Lifetimes are separate from whether a reference is mutable or not. Working through the code:
fn reference_to_a(&mut self) -> &i32
Although the lifetimes have been elided, this is equivalent to:
fn reference_to_a<'a>(&'a mut self) -> &'a i32
i.e. the input and output lifetimes are the same. That's the only way to assign lifetimes to a function like this (unless it returned an &'static reference to global data), since you can't make up the output lifetime from nothing.
That means that if you keep the return value alive by saving it in a variable, you're keeping the &mut self alive too.
Another way of thinking about it is that the &i32 is a sub-borrow of &mut self, so is only valid until that expires.
As #aSpex points out, this is covered in the nomicon.
Why is this an error: While a more precise explanation was already given by #Chris some 2.5 years ago, you can read fn reference_to_a(&mut self) -> &i32 as a declaration that:
“I want to exclusively borrow self, then return a shared/immutable reference which lives as long as the original exclusive borrow” (source)
Apparently it can even prevent me from shooting myself in the foot.
Is the borrow checker just not clever enough yet? Is this planned?
There's still no way to express "I want to exclusively borrow self for the duration of the call, and return a shared reference with a separate lifetime". It is mentioned in the nomicon as #aSpex pointed out, and is listed among the Things Rust doesn’t let you do as of late 2018.
I couldn't find specific plans to tackle this, as previously other borrow checker improvements were deemed higher priority. The idea about allowing separate read/write "lifetime roles" (Ref2<'r, 'w>) was mentioned in the NLL RFC, but no-one has made it into an RFC of its own, as far as I can see.
Is there a way around it? Not really, but depending on the reason you needed this in the first place, other ways of structuring the code may be appropriate:
You can return a copy/clone instead of the reference
Sometimes you can split a fn(&mut self) -> &T into two, one taking &mut self and another returning &T, as suggested by #Chris here
As is often the case in Rust, rearranging your structs to be "data-oriented" rather than "object-oriented" can help
You can return a shared reference from the method: fn(&mut self) -> (&Self, &T) (from this answer)
You can make the fn take a shared &self reference and use interior mutability (i.e. define the parts of Self that need to be mutated as Cell<T> or RefCell<T>). This may feel like cheating, but it's actually appropriate, e.g. when the reason you need mutability as an implementation detail of a logically-immutable method. After all we're making a method take a &mut self not because it mutates parts of self, but to make it known to the caller so that it's possible to reason about which values can change in a complex program.

Returning a borrow from an owned resource

I am trying to write a function that maps an Arc<[T]> into an Iterable, for use with flat_map (that is, I want to call i.flat_map(my_iter) for some other i: Iterator<Item=Arc<[T]>>).
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T> {
let t: &'a [T] = &*n.clone();
t.into_iter()
}
The function above does not work because n.clone() produces an owned value of type Arc<[T]>, which I can dereference to [T] and then borrow to get &[T], but the lifetime of the borrow only lasts until the end of the function, while the 'a lifetime lasts until the client drops the returned iterator.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
Here's some sample code for the source iterator:
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
How do I implement the result of BaseIter(data).flat_map(my_iter) (which is of type Iterator<&T>) given that BaseIter is producing data, not just borrowing it? (The real thing is more complicated than this, it's not always the same result, but the ownership semantics are the same.)
You cannot do this. Remember, lifetimes in Rust are purely compile-time entities and are only used to validate that your code doesn't accidentally access dropped data. For example:
fn my_iter<'a, T>(n: Arc<[T]>) -> slice::Iter<'a, T>
Here 'a does not "last until the client drops the returned iterator"; this reasoning is incorrect. From the point of view of slice::Iter its lifetime parameter means the lifetime of the slice it is pointing at; from the point of view of my_iter 'a is just a lifetime parameter which can be chosen arbitrarily by the caller. In other words, slice::Iter is always tied to some slice with some concrete lifetime, but the signature of my_iter states that it is able to return arbitrary lifetime. Do you see the contradiction?
As a side note, due to covariance of lifetimes you can return a slice of a static slice from such a function:
static DATA: &'static [u8] = &[1, 2, 3];
fn get_data<'a>() -> &'a [u8] {
DATA
}
The above definition compiles, but it only works because DATA is stored in static memory of your program and is always valid when your program is running; this is not so with Arc<[T]>.
Arc<[T]> implies shared ownership, that is, the data inside Arc<[T]> is jointly owned by all clones of the original Arc<[T]> value. Therefore, when the last clone of an Arc goes out of scope, the value it contains is dropped, and the respective memory is freed. Now, consider what would happen if my_iter() was allowed to compile:
let iter = {
let data: Arc<[i32]> = get_arc_slice();
my_iter(data.clone())
};
iter.map(|x| x+1).collect::<Vec<_>>();
Because in my_iter() 'a can be arbitrary and is not linked in any way to Arc<[T]> (and can not be, actually), nothing prevents this code from compilation - the user might as well choose 'static lifetime. However, here all clones of data will be dropped inside the block, and the array it contains inside will be freed. Using iter after the block is unsafe because it now provides access to the freed memory.
How do I clone the Arc in such a way that the client takes ownership of the clone, so that the value is only dropped after the client is done with the iterator (assuming no one else is using the Arc)?
So, as follows from the above, this is impossible. Only the owner of the data determines when this data should be destroyed, and borrowed references (whose existence is always implied by lifetime parameters) may only borrow the data for the time when it exists, but borrows cannot affect when and how the data is destroyed. In order for borrowed references to compile, they need to always borrow only the data which is valid through the whole time these references are active.
What you can do is to rethink your architecture. It is hard to say what exactly can be done without looking at the full code, but in the case of this particular example you can, for example, first collect the iterator into a vector and then iterate over the vector:
let items: Vec<_> = your_iter.collect();
items.iter().flat_map(my_iter)
Note that now my_iter() should indeed accept &Arc<[T]>, just as Francis Gagné has suggested; this way, the lifetimes of the output iterator will be tied to the lifetime of the input reference, and everything should work fine, because now it is guaranteed that Arcs are stored stably in the vector for their later perusal during the iteration.
There's no way to make this work by passing an Arc<[T]> by value. You need to start from a reference to an Arc<[T]> in order to construct a valid slice::Iter.
fn my_iter<'a, T>(n: &'a Arc<[T]>) -> slice::Iter<'a, T> {
n.into_iter()
}
Or, if we elide the lifetimes:
fn my_iter<T>(n: &Arc<[T]>) -> slice::Iter<T> {
n.into_iter()
}
You need to use another iterator as return type of the function my_iter. slice::Iter<'a, T> has an associated type Item = &'a T. You need an iterator with associated type Item = T. Something like vec::IntoIter<T>. You can implement such an iterator yourself:
use std::sync::Arc;
struct BaseIter<T>(Arc<[T]>);
impl<T> Iterator for BaseIter<T> {
type Item = Arc<[T]>;
fn next(&mut self) -> Option<Self::Item> {
Some(self.0.clone())
}
}
struct ArcIntoIter<T>(usize, Arc<[T]>);
impl<T:Clone> Iterator for ArcIntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
if self.0 < self.1.len(){
let i = self.0;
self.0+=1;
Some(self.1[i].clone())
}else{
None
}
}
}
fn my_iter<T>(n: Arc<[T]>) -> ArcIntoIter<T> {
ArcIntoIter(0, n)
}
fn main() {
let data = Arc::new(["A","B","C"]);
println!("{:?}", BaseIter(data).take(3).flat_map(my_iter).collect::<String>());
//output:"ABCABCABC"
}

Resources