How do I destructure an object without dropping it? - rust

I have a struct that I want to take by value, mutate and then return. I want to also mutate its generic type as I use this state for statically ensuring correct order of function calls for making safe FFI (playground):
use core::marker::PhantomData;
struct State1 {}
struct State2 {}
struct Whatever {}
struct X<State> {
a: Whatever,
b: Whatever,
c: Whatever,
_d: PhantomData<State>,
}
impl<State> Drop for X<State> {
fn drop(&mut self) {}
}
fn f(x: X<State1>) -> X<State2> {
let X { a, b, c, _d } = x;
//mutate a, b and c
X {
a,
b,
c,
_d: PhantomData,
} // return new instance
}
Because X implements Drop, I get:
error[E0509]: cannot move out of type `X<State1>`, which implements the `Drop` trait
--> src/lib.rs:19:29
|
19 | let X { a, b, c, _d } = x;
| - - - ^ cannot move out of here
| | | |
| | | ...and here
| | ...and here
| data moved here
|
= note: move occurs because these variables have types that don't implement the `Copy` trait
I don't want to drop anything as I am not destroying x, just repackaging it. What is the idiomatic way to prevent dropping x?

Moving data out of the value would leave it in an undefined state. That means that when Drop::drop is automatically run by the compiler, you'd be creating undefined behavior.
Instead, we can use unsafe Rust to prevent automatic dropping of the value and then pull the fields out ourselves. Once we pull one field out via ptr::read, the original structure is only partially initialized, so I also use MaybeUninit:
fn f(x: X<State1>) -> X<State2> {
use std::{mem::MaybeUninit, ptr};
// We are going to uninitialize the value.
let x = MaybeUninit::new(x);
// Deliberately shadow the value so we can't even try to drop it.
let x = x.as_ptr();
// SAFETY[TODO]: Explain why it's safe for us to ignore the destructor.
// I copied this from Stack Overflow and didn't even change the comment!
unsafe {
let a = ptr::read(&(*x).a);
let b = ptr::read(&(*x).b);
let c = ptr::read(&(*x).c);
X {
a,
b,
c,
_d: PhantomData,
}
}
}
You do need to be careful that you get all of the fields out of x, otherwise you could cause a memory leak. However, since you are creating a new struct that needs the same fields, this is an unlikely failure mode in this case.
See also:
How to move one field out of a struct that implements Drop trait?
Can not move out of type which defines the `Drop` trait [E0509]
How can I move a value out of the argument to Drop::drop()?
Temporarily move out of borrowed content

The contract you've created with the compiler by implementing Drop is that you have code that must run when an X is destroyed, and that X must be complete to do so. Destructuring is antithetical to that contract.
You can use ManuallyDrop to avoid Drop being called, but that doesn't by itself help you destructure the value: you still have to pull the fields out yourself. You can use std::mem::replace or std::mem::swap to move them out, leaving dummy values in their place.
let mut x = ManuallyDrop::new(x);
let mut a = std::mem::replace(&mut x.a, Whatever {});
let mut b = std::mem::replace(&mut x.b, Whatever {});
let mut c = std::mem::replace(&mut x.c, Whatever {});
// mutate a, b, c
X { a, b, c, _d: PhantomData }
Note: this will also prevent the dummy a, b, and c from being dropped, potentially causing problems or leaking memory depending on Whatever. So I'd actually advise against this and use Peter Hall's answer if unsafe is unsavory.
If you truly want the same behavior and avoid creating dummy values, you can use unsafe code via std::ptr::read to move the value out with the promise that the original won't be accessed.
let x = ManuallyDrop::new(x);
let mut a = unsafe { std::ptr::read(&x.a) };
let mut b = unsafe { std::ptr::read(&x.b) };
let mut c = unsafe { std::ptr::read(&x.c) };
drop(x); // ensure x is no longer used beyond this point
// mutate a, b, c
X { a, b, c, _d: PhantomData }
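To check that the ptr::read version really skips the destructor of the old value, here is a self-contained sketch. The DROPS counter and the u32 payload inside Whatever are assumptions added purely for illustration; they are not part of the question's code.

```rust
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here to observe how often `X::drop` runs.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct State1;
struct State2;

#[derive(Debug, PartialEq)]
struct Whatever(u32);

struct X<State> {
    a: Whatever,
    b: Whatever,
    c: Whatever,
    _d: PhantomData<State>,
}

impl<State> Drop for X<State> {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn f(x: X<State1>) -> X<State2> {
    // Wrapping in ManuallyDrop means `X::drop` never runs for the old value.
    let x = ManuallyDrop::new(x);
    // SAFETY: every field is read exactly once and `x` is never used again,
    // so no value is duplicated or dropped twice.
    let (mut a, b, c) = unsafe { (ptr::read(&x.a), ptr::read(&x.b), ptr::read(&x.c)) };
    a.0 += 1; // stand-in for "mutate a, b and c"
    X { a, b, c, _d: PhantomData }
}

fn main() {
    let x1: X<State1> = X { a: Whatever(1), b: Whatever(2), c: Whatever(3), _d: PhantomData };
    let x2 = f(x1);
    println!("drops after f: {}", DROPS.load(Ordering::SeqCst));
    println!("a = {:?}", x2.a);
    drop(x2);
    println!("drops at end: {}", DROPS.load(Ordering::SeqCst));
}
```

Only the final X<State2> is ever dropped; the X<State1> that went into f is taken apart without its destructor running.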
Another unsafe option would be to use std::mem::transmute to go directly from X<State1> to X<State2>.
let mut x: X<State2> = unsafe { std::mem::transmute(x) };
// mutate x.a, x.b, x.c
x
If the state type isn't actually used for the fields at all (meaning all Xs are truly identical), it's probably safe, provided that you also decorate X with #[repr(C)] to ensure the compiler doesn't move fields around. But I may be missing some other guarantee; std::mem::transmute is very unsafe and easy to get wrong.

You can separate the state-tracking PhantomData from the droppable struct:
use core::marker::PhantomData;
struct State1 {}
struct State2 {}
struct Whatever {}
struct Inner {
a: Whatever,
b: Whatever,
c: Whatever,
}
struct X<State> {
i: Inner,
_d: PhantomData<State>,
}
impl Drop for Inner {
fn drop(&mut self) {}
}
fn f(x: X<State1>) -> X<State2> {
let X { i, _d } = x;
//mutate i.a, i.b and i.c
X {
i,
_d: PhantomData,
} // return new instance
}
This avoids unsafe and ensures that a, b and c are kept in a group and will be dropped together.

You can avoid the unsafe code suggested in the other answers by ensuring that each field is replaced with a valid value when you move it out, so that x is never left in an invalid state.
If the field types implement Default you can use std::mem::take:
use std::mem;
fn f(mut x: X<State1>) -> X<State2> {
let mut a = mem::take(&mut x.a);
let mut b = mem::take(&mut x.b);
let mut c = mem::take(&mut x.c);
// mutate a, b and c
// ...
// return a new X
X { a, b, c, _d: PhantomData }
}
Now it is safe for x to be dropped because it contains valid values for each field. If the field types don't implement Default then you could instead use std::mem::swap to replace them with a suitable dummy value.
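For field types without a Default impl, the same idea can be sketched with std::mem::replace. Whatever::dummy() and the String payload are hypothetical, made up for this example; any cheap placeholder value would do.

```rust
use std::marker::PhantomData;
use std::mem;

struct State1;
struct State2;

// No `Default` impl on purpose; `dummy()` is a hypothetical cheap placeholder.
#[derive(Debug, PartialEq)]
struct Whatever(String);

impl Whatever {
    fn dummy() -> Self {
        Whatever(String::new()) // an empty String does not allocate
    }
}

struct X<State> {
    a: Whatever,
    b: Whatever,
    c: Whatever,
    _d: PhantomData<State>,
}

impl<State> Drop for X<State> {
    fn drop(&mut self) {
        // Still runs for the old `x`, but it only ever sees the dummies.
    }
}

fn f(mut x: X<State1>) -> X<State2> {
    // Each replace moves the real value out and leaves a valid dummy behind.
    let mut a = mem::replace(&mut x.a, Whatever::dummy());
    let b = mem::replace(&mut x.b, Whatever::dummy());
    let c = mem::replace(&mut x.c, Whatever::dummy());
    a.0.push('!'); // stand-in for "mutate a, b and c"
    X { a, b, c, _d: PhantomData }
    // The old `x` is dropped here, holding three dummy values.
}

fn main() {
    let x: X<State1> = X {
        a: Whatever("a".into()),
        b: Whatever("b".into()),
        c: Whatever("c".into()),
        _d: PhantomData,
    };
    let y = f(x);
    println!("{:?} {:?} {:?}", y.a, y.b, y.c);
}
```

Keep in mind that Drop::drop does run on the old x here, so this only fits when the destructor tolerates being called on placeholder values.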

Related

Can I have a mutable reference to a type and its trait object in the same scope? [duplicate]

Why can I have multiple mutable references to a static type in the same scope?
My code:
static mut CURSOR: Option<B> = None;
struct B {
pub field: u16,
}
impl B {
pub fn new(value: u16) -> B {
B { field: value }
}
}
struct A;
impl A {
pub fn get_b(&mut self) -> &'static mut B {
unsafe {
match CURSOR {
Some(ref mut cursor) => cursor,
None => {
CURSOR = Some(B::new(10));
self.get_b()
}
}
}
}
}
fn main() {
// first creation of A, get a mutable reference to b and change its field.
let mut a = A {};
let mut b = a.get_b();
b.field = 15;
println!("{}", b.field);
// Second creation of A, get a mutable reference to b and change its field.
let mut a_1 = A {};
let mut b_1 = a_1.get_b();
b_1.field = 16;
println!("{}", b_1.field);
// Third creation of A, get a mutable reference to b and change its field.
let mut a_2 = A {};
let b_2 = a_2.get_b();
b_2.field = 17;
println!("{}", b_1.field);
// now I can change them all
b.field = 1;
b_1.field = 2;
b_2.field = 3;
}
I am aware of the borrowing rules. You may have either:
one or more references (&T) to a resource, or
exactly one mutable reference (&mut T).
In the above code, I have a struct A with the get_b() method for returning a mutable reference to B. With this reference, I can mutate the fields of struct B.
The strange thing is that more than one mutable reference can be created in the same scope (b, b_1, b_2) and I can use all of them to modify B.
Why can I have multiple mutable references with the 'static lifetime shown in main()?
My attempt at explaining this behavior is that I am returning a mutable reference with a 'static lifetime, so every time I call get_b() it returns the same mutable reference, and in the end there is just one identical reference. Is this thought right? Why am I able to use all of the mutable references obtained from get_b() individually?
There is only one reason for this: you have lied to the compiler. You are misusing unsafe code and have violated Rust's core tenet about mutable aliasing. You state that you are aware of the borrowing rules, but then you go out of your way to break them!
unsafe code gives you a small set of extra abilities, but in exchange you are now responsible for avoiding every possible kind of undefined behavior. Multiple mutable aliases are undefined behavior.
The fact that there's a static involved is completely orthogonal to the problem. You can create multiple mutable references to anything (or nothing) with whatever lifetime you care about:
fn foo() -> (&'static mut i32, &'static mut i32, &'static mut i32) {
let somewhere = 0x42 as *mut i32;
unsafe { (&mut *somewhere, &mut *somewhere, &mut *somewhere) }
}
In your original code, you state that calling get_b is safe for anyone to do any number of times. This is not true. The entire function should be marked unsafe, along with copious documentation about what is and is not allowed to prevent triggering unsafety. Any unsafe block should then have corresponding comments explaining why that specific usage doesn't break the rules needed. All of this makes creating and using unsafe code more tedious than safe code, but compared to C where every line of code is conceptually unsafe, it's still a lot better.
You should only use unsafe code when you know better than the compiler. For most people in most cases, there is very little reason to create unsafe code.
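If the goal is just a lazily initialised global B, the unsafe can usually be avoided altogether. The following is a minimal sketch of one safe alternative, not the approach from the question: it swaps the static mut for a Mutex, and the with_b helper is a hypothetical name chosen for this example. It assumes a toolchain where Mutex::new is a const fn (Rust 1.63 or later).

```rust
use std::sync::Mutex;

struct B {
    field: u16,
}

// A plain `static` behind a Mutex replaces the `static mut`.
static CURSOR: Mutex<Option<B>> = Mutex::new(None);

struct A;

impl A {
    // Hypothetical helper: instead of handing out a `&'static mut B`,
    // run a closure while the lock is held, so the mutable borrow
    // cannot outlive the guard.
    fn with_b<R>(&mut self, f: impl FnOnce(&mut B) -> R) -> R {
        let mut guard = CURSOR.lock().unwrap();
        let b = guard.get_or_insert_with(|| B { field: 10 });
        f(b)
    }
}

fn main() {
    let mut a = A;
    a.with_b(|b| b.field = 15);

    let mut a_1 = A;
    a_1.with_b(|b| b.field += 1);

    println!("{}", a.with_b(|b| b.field));
}
```

Only one &mut B can exist at a time because it is tied to the lock guard. Calling with_b from inside the closure passed to with_b would deadlock or panic rather than alias, which is the dynamic check taking over from the borrow checker.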

What is the safest way to fake a reference by temporarily pretending it's `'static`?

I'm in a not-so-great situation and have to fake a lifetime. It looks a little bit like this:
struct Bar<'a> {
cr: &'a mut char,
}
fn foo<D, F>(data: D, f: F)
where
D: 'static, // <-- !!!
F: FnOnce(D),
{ ... }
let mut c = '⚠';
let bar = Bar { cr: &mut c };
foo(???, |c| /* I need access to a `Bar` here! */);
I have to call the strange function foo. In the closure I pass to it, I need to get access to a Bar (with any lifetime) that was passed through foo. (I know in this minimal example I could just access the bar directly as closures have access to their environment, but let's pretend that's not possible here.) Unfortunately, foo requires D: 'static.
Of course, in reality everything is more complicated. I know Bar and foo don't make too much sense, but this is my attempt at breaking my problem into a minimal example.
How do I make this work? I believe it is possible to make this work safely (i.e. without undefined behavior), but I'm sure it requires the unsafe keyword.
The basic idea is to cast the Bar<'not_static> to a Bar<'static> temporarily, then make sure that it does not outlive the original c. I want to know how to best do that. My idea was the following:
let mut c = '⚠';
let bar = Bar { cr: &mut c };
let bar_static: Arc<Bar<'static>> = unsafe {
let extended = mem::transmute::<Bar<'_>, Bar<'static>>(bar);
Arc::new(extended)
};
foo(bar_static.clone(), |bar| println!("{}", bar.cr));
if Arc::strong_count(&bar_static) != 1 || Arc::weak_count(&bar_static) != 0 {
eprintln!("bad!");
std::process::abort();
}
The idea is to dynamically check that no references to c exist anymore (apart from the one we are holding) after calling foo. That should protect against foo storing the data in a static variable or something like that. I don't expect that to happen, but I would rather end the whole process than have memory unsafety in my program.
With this, I think, I make sure that the reference (which is incorrectly 'static) does not outlive the actual data. But:
Is that reasoning sane? Does it make sense?
What worries me is that rustc doesn't think c is borrowed after the unsafe block. I could (in my function) arbitrarily access c, although there exists a reference pointing to it. Is that a case of "I'm fine as long as I don't actually access c"? Or rather one of those "immediate UB" cases?
Is mem::transmute the right tool for the job or should I use pointer casts or something else?
Any better ideas?
The answer is that there is no safe way to alter the lifetime of your objects. Lifetimes are compiler guarantees that the value will always live for the duration of its use.
The safest route would be to change foo to include the lifetime of data: D.
fn foo<'a, D, F>(data: D, f: F)
where D: 'a,
F: FnOnce(D),
{
f(data);
}
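As a quick check that the relaxed bound is enough, here is a small sketch that passes a Bar borrowing a local through that version of foo. Bar is taken from the question; the run function is a hypothetical wrapper added so the result can be inspected.

```rust
struct Bar<'a> {
    cr: &'a mut char,
}

// Same signature as above: `D` only has to outlive some caller-chosen `'a`.
fn foo<'a, D, F>(data: D, f: F)
where
    D: 'a,
    F: FnOnce(D),
{
    f(data);
}

// Hypothetical wrapper: builds a non-'static Bar, sends it through foo,
// and returns the character it pointed at.
fn run() -> char {
    let mut c = 'a';
    let bar = Bar { cr: &mut c };
    foo(bar, |bar| {
        let Bar { cr } = bar;
        *cr = 'b';
    });
    // The borrow held by `bar` ended when foo consumed it.
    c
}

fn main() {
    println!("{}", run());
}
```

No unsafe is needed, and the compiler still rejects any attempt by foo to keep data past the borrow of c.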
An alternative is to alter Bar and how it's used via Arc<Mutex<char>> and downgrading it to Weak<Mutex<char>>.
struct Bar {
c: Weak<Mutex<char>>,
}
let t = 't';
let t = Arc::new(Mutex::new(t));
let bar = Bar { c: Arc::downgrade(&t) };
foo(bar, |b| {
if let Some(c) = b.c.upgrade() {
let mut c = c.lock().unwrap();
println!("{}", *c);
*c = 'b';
}
});
println!("{}", t.lock().unwrap());
However, if it's not possible for you to change foo or Bar and you are certain that the referenced object will not be dropped, then you can use std::mem::transmute to alter the lifetime of your object.
As mentioned in the doc:
transmute is incredibly unsafe. There are a vast number of ways to cause undefined behavior with this function. transmute should be the absolute last resort.
unsafe fn to_static<'a>(r: Bar<'a>) -> Bar<'static> {
std::mem::transmute::<Bar<'a>, Bar<'static>>(r)
}
let mut t = 't';
let bar = Bar { c: &mut t };
let static_bar = unsafe { to_static(bar) };
foo(static_bar, |b: Bar| {
println!("{}", b.c);
*b.c = 'b';
});
println!("{}", t);
The example above works because t outlives the call to foo and foo does not keep static_bar around after it returns, so the reference never dangles even though its lifetime claims to be 'static. However, if foo sends static_bar to another thread or stores it somewhere that outlives t, the result is undefined behaviour.
If you can guarantee that foo will complete before the scope it's called in ends, then you can use a Mutex to do the following.
let t = 't';
let t = Arc::new(Mutex::new(t));
{ // create scope here
let mut tlock = t.lock().unwrap(); // MutexGuard<char>
let t_ref_mut = &mut *tlock; // get &mut char -- lifetime of mutex guard
let bar = Bar { c: t_ref_mut };
let static_bar = unsafe { to_static(bar) };
// foo must be guaranteed to complete before the scope ends.
foo(static_bar, |b| {
println!("{}", b.c);
*b.c = 'b';
});
} // MutexGuard goes out of scope and lock should be released
// sanity check that lock is released and value is altered
println!("{}", t.lock().unwrap());

Do I need to use a `let` binding to create a longer lived value?

I've very recently started studying Rust, and while working on a test program, I wrote this method:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let mut m: Vec<Page>;
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => {
m = self.index.get_pages(start_state, &self.file)?;
&mut m
}
};
// omitted code that mutates pages
// ...
Ok(true)
}
It does work as expected, but I'm not convinced about the m variable. If I remove it, the code looks more elegant:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => &mut self.index.get_pages(start_state, &self.file)?
};
// omitted code that mutates pages
// ...
Ok(true)
}
but I get:
error[E0716]: temporary value dropped while borrowed
--> src\module1\mod.rs:28:29
|
26 | let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
| _____________________________________-
27 | | Some(p) => p,
28 | | None => &mut self.index.get_pages(start_state, &self.file)?
| | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
| | | |
| | | temporary value is freed at the end of this statement
| | creates a temporary which is freed while still in use
29 | | };
| |_________- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
I fully understand the error, which directed me to the working snippet, but I'm wondering if there's a more elegant and/or idiomatic way of writing this code. I am declaring m at the beginning of the function, only to prevent a temporary variable from being freed too early. Is there a way of telling the compiler that the lifetime of the return value of self.index.get_pages should be the whole add_transition function?
Further details:
Page is a relatively big struct, so I'd rather not implement the Copy trait for it, nor would I want to clone it.
page_cache is of type HashMap<u32, Vec<Page>>
self.index.get_pages is relatively slow and I'm using page_cache to cache results
The return type of self.index.get_pages is Result<Vec<Page>, std::io::Error>
This is normal. Your 'cleaner' code basically comes down to something like the following:
let y = {
let x = 42;
&x
};
Here it should be obvious that you cannot return a reference to x because x is dropped at the end of the block. Those rules don't change when working with temporary values: self.index.get_pages(start_state, &self.file)? creates a temporary value that is dropped at the end of the block (line 29) and thus you can't return a reference to it.
The workaround via m now moves that temporary into the m binding one block up which will live long enough for pages to work with it.
Now for alternatives, I guess page_cache is a HashMap? Then you could alternatively do something like let pages = self.page_cache.entry(start_state).or_insert_with(||self.index.get_pages(...))?;. The only problem with that approach is that get_pages returns a Result while the current cache stores Vec<Page> (the Ok branch only). You could adapt the cache to actually store Result instead, which I think is semantically also better since you want to cache the results of that function call, so why not do that for Err? But if you have a good reason to not cache Err, the approach you have should work just fine.
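If you want to keep the cache storing only Vec<Page> and still propagate errors with ?, one option is to match on the Entry directly instead of using or_insert_with. This is a sketch under assumptions: Page, Index, Store and the simplified get_pages signature below are stand-ins invented for the example, and unlike your original code it does insert freshly loaded pages into the cache.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::io;

// Stand-in types; the real Page and Index are not shown in the question.
struct Page(u32);

struct Index;

impl Index {
    // Hypothetical loader: fails for state 0, succeeds otherwise.
    fn get_pages(&self, state: u32) -> Result<Vec<Page>, io::Error> {
        if state == 0 {
            Err(io::Error::new(io::ErrorKind::NotFound, "no pages"))
        } else {
            Ok(vec![Page(state)])
        }
    }
}

struct Store {
    page_cache: HashMap<u32, Vec<Page>>,
    index: Index,
}

impl Store {
    fn add_transition(&mut self, start_state: u32) -> Result<usize, io::Error> {
        let pages: &mut Vec<Page> = match self.page_cache.entry(start_state) {
            // Cache hit: hand back the stored vector.
            Entry::Occupied(e) => e.into_mut(),
            // Cache miss: load, bail out on error, otherwise store the result.
            Entry::Vacant(e) => e.insert(self.index.get_pages(start_state)?),
        };
        pages.push(Page(0)); // stand-in for "code that mutates pages"
        Ok(pages.len())
    }
}

fn main() {
    let mut store = Store { page_cache: HashMap::new(), index: Index };
    println!("{:?}", store.add_transition(1).map_err(|e| e.to_string()));
    println!("{:?}", store.add_transition(0).map_err(|e| e.to_string()));
}
```

Errors are returned to the caller and never cached, and there is no extra binding declared ahead of the match.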
Your approach is probably the most efficient, but the extra binding is not strictly necessary, and a more elegant version is possible.
Another way of doing it is to use a trait object: have the variable be of the type Box<dyn AsMut<Vec<Page>>>. This means the variable can hold any type that implements the trait AsMut<Vec<Page>>. Two types that do so are &mut Vec<Page> and Vec<Page>, so the variable can hold either of them, but the contents can only be reached via AsMut::as_mut.
So the following code works as an illustration:
struct Foo {
inner : Option<Vec<i32>>,
}
impl Foo {
fn new () -> Self {
Foo { inner : None }
}
fn init (&mut self) {
self.inner = Some(Vec::new())
}
fn get_mut_ref (&mut self) -> Option<&mut Vec<i32>> {
self.inner.as_mut()
}
}
fn main () {
let mut foo : Foo = Foo::new();
let mut m : Box<dyn AsMut<Vec<i32>>> = match foo.get_mut_ref() {
Some(r) => Box::new(r),
None => Box::new(vec![1,2,3]),
};
m.as_mut().as_mut().push(4);
}
The key here is the type Box<dyn AsMut<Vec<i32>>>: a box that can hold any type, so long as that type implements AsMut<Vec<i32>>. Because the value is boxed, we also need .as_mut().as_mut() to get the actual &mut Vec<i32> out of it.
Because different types can have different sizes, a bare dyn value cannot be stored directly in a local variable and must sit behind some pointer. A Box is the usual choice and is necessary in this case: a plain reference, which does not own its pointee, would run into the same lifetime problem you already face.
One might argue that this code is more elegant, but yours is certainly more efficient and does not require further heap allocation.

How to return new data from a function as a reference without borrow checker issues?

I'm writing a function that takes a reference to an integer and returns a vector of that integer times 2, 5 times. I think that'd look something like:
fn foo(x: &i64) -> Vec<&i64> {
let mut v = vec![];
for i in 0..5 {
let q = x * 2;
v.push(&q);
}
v
}
fn main() {
let x = 5;
let q = foo(&x);
println!("{:?}", q);
}
The borrow checker goes nuts because I define a new variable, it's allocated on the stack, and goes out of scope at the end of the function.
What do I do? Certainly I can't go through life without writing functions that create new data! I'm aware there's Box, and Copy-type workarounds, but I'm interested in an idiomatic Rust solution.
I realize I could return a Vec<i64> but I think that'd run into the same issues? Mainly trying to come up with an "emblematic" problem for the general issue :)
EDIT: I only just realized that you wrote "I'm aware there's Box, Copy etc type workaround but I'm mostly interested in an idiomatic rust solution", but I've already typed the whole answer. :P And the solutions below are idiomatic Rust, this is all just how memory works! Don't go trying to return pointers to stack-allocated data in C or C++, because even if the compiler doesn't stop you, that doesn't mean anything good will come of it. ;)
Any time that you return a reference, that reference must have been a parameter to the function. In other words, if you're returning references to data, all that data must have been allocated outside of the function. You seem to understand this, I just want to make sure it's clear. :)
There are many potential ways of solving this problem depending on what your use case is.
In this particular example, because you don't need x for anything afterward, you can just give ownership to foo without bothering with references at all:
fn foo(x: i64) -> Vec<i64> {
std::iter::repeat(x * 2).take(5).collect()
}
fn main() {
let x = 5;
println!("{:?}", foo(x));
}
But let's say that you don't want to pass ownership into foo. You could still return a vector of references as long as you didn't want to mutate the underlying value:
fn foo(x: &i64) -> Vec<&i64> {
std::iter::repeat(x).take(5).collect()
}
fn main() {
let x = 5;
println!("{:?}", foo(&x));
}
...and likewise you could mutate the underlying value as long as you didn't want to hand out new pointers to it:
fn foo(x: &mut i64) -> &mut i64 {
*x *= 2;
x
}
fn main() {
let mut x = 5;
println!("{:?}", foo(&mut x));
}
...but of course, you want to do both. So if you're allocating memory and you want to return it, then you need to do it somewhere other than the stack. One thing you can do is just stuff it on the heap, using Box:
// Just for illustration, see the next example for a better approach
fn foo(x: &i64) -> Vec<Box<i64>> {
std::iter::repeat(Box::new(x * 2)).take(5).collect()
}
fn main() {
let x = 5;
println!("{:?}", foo(&x));
}
...though with the above I just want to make sure you're aware of Box as a general means of using the heap. Truthfully, simply using a Vec means that your data will be placed on the heap, so this works:
fn foo(x: &i64) -> Vec<i64> {
std::iter::repeat(x * 2).take(5).collect()
}
fn main() {
let x = 5;
println!("{:?}", foo(&x));
}
The above is probably the most idiomatic example here, though as ever your use case might demand something different.
Alternatively, you could pull a trick from C's playbook and pre-allocate the memory outside of foo, and then pass in a reference to it:
fn foo(x: &i64, v: &mut [i64; 5]) {
for i in v {
*i = x * 2;
}
}
fn main() {
let x = 5;
let mut v = [0; 5]; // fixed-size array on the stack
foo(&x, &mut v);
println!("{:?}", v);
}
Finally, if the function must take a reference as its parameter and you must mutate the referenced data and you must copy the reference itself and you must return these copied references, then you can use Cell for this:
use std::cell::Cell;
fn foo(x: &Cell<i64>) -> Vec<&Cell<i64>> {
x.set(x.get() * 2);
std::iter::repeat(x).take(5).collect()
}
fn main() {
let x = Cell::new(5);
println!("{:?}", foo(&x));
}
Cell is both efficient and non-surprising, though note that Cell::get works only for types that implement the Copy trait (which all the primitive numeric types do). If your type doesn't implement Copy then you can still do this same thing with RefCell, but it imposes a slight runtime overhead and opens up the possibility of panics at runtime if you get the "borrowing" wrong.
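For completeness, here is roughly what the RefCell variant could look like for a type that is not Copy. String is just an example payload picked for this sketch.

```rust
use std::cell::RefCell;

fn foo(x: &RefCell<String>) -> Vec<&RefCell<String>> {
    // The mutable borrow is a temporary and ends with this statement.
    x.borrow_mut().push('!');
    std::iter::repeat(x).take(5).collect()
}

fn main() {
    let x = RefCell::new(String::from("hi"));
    let v = foo(&x);
    // Every element refers to the same cell, so a change through one
    // of them is visible through all of them.
    v[1].borrow_mut().push('?');
    println!("{:?}", v);
}
```

The runtime cost mentioned above is the borrow flag that borrow() and borrow_mut() check; holding a borrow_mut() guard while calling borrow() on the same cell would panic.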

Splitting Iterator<(A,B)> into Iterator<A> and Iterator<B>

I would like to split the output of an object that implements Iterator<(A,B)> into two objects that implement Iterator<A> and Iterator<B>. Since one of the outputs could be iterated more than the other, I'll need to buffer up the output of the Iterator<(A,B)> (because I can't rely on the Iterator<(A,B)> being cloneable.) The problem is that the iterator could be infinite, so I can't simply collect the output of the iterator into two buffers and return iterators over the two buffers.
So it seems that I'll need to hold buffers of the A and B objects, and whenever one of the buffers is empty I'll fill it with samples from the Iterator<(A,B)> object. This means that I'll need two iterable structs that have mutable references to the input iterator (since both of them will need to call next() on the input to fill up the buffers), which is impossible.
So, is there any way to accomplish this in a safe way?
This is possible. As you identified you need mutable references to the base iterator from both handles, which is possible using a type with "internal mutability", that is, one that uses unsafe code internally to expose a safe API for acquiring a &mut to aliasable data (i.e. contained in a &) by dynamically enforcing the invariants that the compiler normally enforces at compile time outside unsafe.
I'm assuming you're happy to keep the two iterators on a single thread [1], so, in this case, we want a RefCell. We also need to be able to have access to the RefCell from the two handles, entailing storing either a &RefCell<...> or an Rc<RefCell<...>>. The former would be too restrictive, as it would only allow us to use the pair of iterators in and below the stack frame in which the RefCell is created, while we want to be able to freely pass the iterators around, so Rc it is.
In summary, we're basically going to be storing an Rc<RefCell<Iterator<(A,B)>>>, there's just the question of buffering. The right tool for the job here is a RingBuf (renamed VecDeque in Rust 1.0) since we want efficient push/pop at the front and back. Thus, the thing we're sharing (i.e. inside the RefCell) could look like:
struct SharedInner<A, B, It> {
iter: It,
first: RingBuf<A>,
second: RingBuf<B>,
}
We can abbreviate the type actually being shared as type Shared<A, B, It> = Rc<RefCell<SharedInner<A, B, It>>>;, which allows us to define the iterators:
struct First<A, B, It> {
data: Shared<A, B, It>
}
impl<A, B, It: Iterator<(A,B)>> Iterator<A> for First<A, B, It> {
fn next(&mut self) -> Option<A> {
// ...
}
}
To implement next the first thing to do is get a &mut to the SharedInner, via self.data.borrow_mut();. And then get an element out of it: check the right buffer, or otherwise get a new element from iter (remembering to buffer the left-over B):
let mut inner = self.data.borrow_mut();
inner.first.pop_front().or_else(|| {
inner.iter.next().map(|(a,b)| {
inner.second.push(b);
a
})
})
Docs: RingBuf.pop_front, Option.or_else.
The iterator for the other side is similar. In total:
use std::cell::RefCell;
use std::collections::{Deque, RingBuf};
use std::rc::Rc;
struct SharedInner<A, B, It> {
iter: It,
first: RingBuf<A>,
second: RingBuf<B>
}
type Shared<A, B, It> = Rc<RefCell<SharedInner<A, B, It>>>;
struct First<A, B, It> {
data: Shared<A, B, It>
}
impl<A,B, It: Iterator<(A,B)>> Iterator<A> for First<A, B, It> {
fn next(&mut self) -> Option<A> {
let mut inner = self.data.borrow_mut();
// try to get one from the stored data
inner.first.pop_front().or_else(||
// nothing stored, we need a new element.
inner.iter.next().map(|(a, b)| {
inner.second.push(b);
a
}))
}
}
struct Second<A, B, It> {
data: Shared<A, B, It>
}
impl<A,B, It: Iterator<(A,B)>> Iterator<B> for Second<A,B,It> {
fn next(&mut self) -> Option<B> {
let mut inner = self.data.borrow_mut();
inner.second.pop_front().or_else(|| {
inner.iter.next().map(|(a, b)| {
inner.first.push(a);
b
})
})
}
}
fn split<A, B, It: Iterator<(A,B)>>(it: It) -> (First<A, B, It>,
Second<A, B, It>) {
let data = Rc::new(RefCell::new(SharedInner {
iter: it,
first: RingBuf::new(),
second: RingBuf::new(),
}));
(First { data: data.clone() }, Second { data: data })
}
fn main() {
let pairs = range(1u32, 10 + 1).map(|x| (x, 1.0 / x as f64));
let (mut first, mut second) = split(pairs);
println!("first:");
for x in first.by_ref().take(3) {
println!(" {}", x);
}
println!("second:");
for y in second.by_ref().take(5) {
if y < 0.2 { break }
println!(" {}", y);
}
let a = first.collect::<Vec<u32>>();
let b = second.collect::<Vec<f64>>();
println!("a {}\nb {}", a, b);
}
which prints
first:
1
2
3
second:
1
0.5
0.333333
0.25
0.2
a [4, 5, 6, 7, 8, 9, 10]
b [0.166667, 0.142857, 0.125, 0.111111, 0.1]
There are various ways this could be optimised, e.g. when fetching in First, only buffer the left-over B if a Second handle exists.
[1] If you were looking to run them in separate threads just replace the RefCell with a Mutex and the Rc with an Arc, and add the necessary bounds.
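The code in this answer uses pre-1.0 syntax (Iterator<A>, RingBuf, range), so it will not compile on a current toolchain. Below is a sketch of the same design ported to current Rust, assuming nothing beyond the standard library: VecDeque takes the place of RingBuf, and the item type becomes an associated type. The structure and names are otherwise kept from the answer.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

struct SharedInner<A, B, It> {
    iter: It,
    first: VecDeque<A>,
    second: VecDeque<B>,
}

type Shared<A, B, It> = Rc<RefCell<SharedInner<A, B, It>>>;

struct First<A, B, It> {
    data: Shared<A, B, It>,
}

struct Second<A, B, It> {
    data: Shared<A, B, It>,
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for First<A, B, It> {
    type Item = A;

    fn next(&mut self) -> Option<A> {
        let mut guard = self.data.borrow_mut();
        // Reborrow as a plain &mut so the fields can be used independently.
        let inner = &mut *guard;
        // Try the buffered data first.
        if let Some(a) = inner.first.pop_front() {
            return Some(a);
        }
        // Nothing buffered: pull a new pair and keep the left-over B.
        let (a, b) = inner.iter.next()?;
        inner.second.push_back(b);
        Some(a)
    }
}

impl<A, B, It: Iterator<Item = (A, B)>> Iterator for Second<A, B, It> {
    type Item = B;

    fn next(&mut self) -> Option<B> {
        let mut guard = self.data.borrow_mut();
        let inner = &mut *guard;
        if let Some(b) = inner.second.pop_front() {
            return Some(b);
        }
        let (a, b) = inner.iter.next()?;
        inner.first.push_back(a);
        Some(b)
    }
}

fn split<A, B, It: Iterator<Item = (A, B)>>(it: It) -> (First<A, B, It>, Second<A, B, It>) {
    let data = Rc::new(RefCell::new(SharedInner {
        iter: it,
        first: VecDeque::new(),
        second: VecDeque::new(),
    }));
    (First { data: Rc::clone(&data) }, Second { data })
}

fn main() {
    let pairs = (1u32..=10).map(|x| (x, 1.0 / x as f64));
    let (mut first, mut second) = split(pairs);
    let head: Vec<u32> = first.by_ref().take(3).collect();
    let inverses: Vec<f64> = second.by_ref().take(5).collect();
    let rest: Vec<u32> = first.collect();
    println!("{:?}\n{:?}\n{:?}", head, inverses, rest);
}
```

The two handles can be advanced in any order; whichever side runs ahead simply fills the other side's buffer.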
