Copying a struct for use on another thread - multithreading

I have a struct:
struct MyData {
x: i32
}
I want to asynchronously start a long operation on this struct.
My first attempt was this:
fn foo(&self) { //should return immediately
std::thread::Thread::spawn(move || {
println!("{:?}",self.x); //consider a very long operation
});
}
The compiler reports that it cannot infer an appropriate lifetime due to conflicting requirements: self may live on the caller's stack frame, so it cannot be guaranteed to still exist while the operation runs on another thread.
To solve this, I attempted to make a copy of self and provide that copy to the new thread:
fn foo(&self) { //should return immediately
let clone = self.clone();
std::thread::Thread::spawn(move || {
println!("{:?}",clone.x); //consider a very long operation
});
}
I think that does not compile because now clone is on the stack frame which is similar to before. I also tried to do the clone inside the thread, and that does not compile either, I think for similar reasons.
Then I decided maybe I could use a channel to push the copied data into the thread, on the theory that perhaps channel can magically move (copy?) stack-allocated data between threads, which is suggested by this example in the documentation. However the compiler cannot infer a lifetime for this either:
fn foo(&self) { //should return immediately
let (tx, rx) = std::sync::mpsc::channel();
tx.send(self.clone());
std::thread::Thread::spawn(move || {
println!("{:?}",rx.recv().unwrap().x); //consider a very long operation
});
}
Finally, I decided to just copy my struct onto the heap explicitly, and pass an Arc into the thread. But not even here can the compiler figure out a lifetime:
fn foo(&self) { //should return immediately
let arc = std::sync::Arc::new(self.clone());
std::thread::Thread::spawn(move || {
println!("{:?}",arc.clone().x); //consider a very long operation
});
}
Okay borrow checker, I give up. How do I get a copy of self onto my new thread?

I think the issue is simply that your struct does not derive the Clone trait. You can get your second example to compile and run by adding #[derive(Clone)] before the struct's definition.
What I don't understand about the compiler's behaviour is which .clone() function it tried to use. Your struct did not implement the Clone trait, so it should not have a .clone() method by default.
You may also want to consider taking self by value in your function and letting the caller decide whether to pass a clone or simply move the value.
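For readers on current Rust, here is a minimal sketch of the second attempt with Clone derived, using std::thread::spawn (the Thread::spawn path in the question predates Rust 1.0). It also suggests why the original failed: when MyData does not implement Clone, self.clone() on a &MyData resolves to the Clone impl for references and merely copies the reference, so the closure still captures a borrow. The JoinHandle return type and the doubled value are assumptions added only to make the result observable.

```rust
use std::thread::{self, JoinHandle};

#[derive(Clone, Debug)]
struct MyData {
    x: i32,
}

impl MyData {
    // Returns immediately. The clone is an owned MyData, so moving it
    // into the closure leaves no borrow of `self` behind.
    fn foo(&self) -> JoinHandle<i32> {
        let copy = self.clone();
        thread::spawn(move || {
            // Stand-in for the very long operation.
            copy.x * 2
        })
    }
}
```

Joining the returned handle yields the computed value, and the original struct remains usable on the calling thread.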

As an alternative solution, you could use thread::scoped and maintain a handle to the thread. This allows the thread to hold a reference, without the need to copy it in:
#![feature(old_io,std_misc)]
use std::thread::{self,JoinGuard};
use std::old_io::timer;
use std::time::duration::Duration;
struct MyData {
x: i32,
}
// returns immediately
impl MyData {
fn foo(&self) -> JoinGuard<()> {
thread::scoped(move || {
timer::sleep(Duration::milliseconds(300));
println!("{:?}", self.x); //consider a very long operation
timer::sleep(Duration::milliseconds(300));
})
}
}
fn main() {
let d = MyData { x: 42 };
let _thread = d.foo();
println!("I'm so fast!");
}
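Note that thread::scoped was removed from the standard library before Rust 1.0. Since Rust 1.63, std::thread::scope offers a similar capability. The sketch below is an assumption-laden illustration rather than a drop-in replacement: the method name sum_with and its values parameter are hypothetical, and unlike the JoinGuard version the scope blocks until its threads have finished.

```rust
use std::thread;

struct MyData {
    x: i32,
}

impl MyData {
    // Borrows `self` and `values` from inside a spawned thread.
    // `thread::scope` does not return until every thread spawned in it
    // has finished, which is what makes the borrows sound.
    fn sum_with(&self, values: &[i32]) -> i32 {
        thread::scope(|s| {
            let handle = s.spawn(|| values.iter().sum::<i32>() + self.x);
            handle.join().unwrap()
        })
    }
}
```

Because the scope waits for its threads, this suits fork-join work; for a fire-and-forget operation, moving an owned copy into thread::spawn remains the simpler choice.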

Related

`variable` was mutably borrowed here in the previous iteration of the loop

I'm fairly new to Rust, and I've been trying to work through an error in my code.
The code below compiles. However, if I uncomment the line that adds a packet to my buffer, it throws the error:
`interface` was mutably borrowed here in the previous iteration of the loop
How? It's not related at all to the packet at that point. I thought I was beginning to grasp references and memory management concepts, but this has me second guessing everything...
let mut buffer: VecDeque<pcap::Packet> = VecDeque::with_capacity(1000);
while let Ok(packet) = interface.next_packet() {
if start_time.is_none() {
start_time = Some(Instant::now());
}
let buf_packet = packet.to_owned();
// buffer.push_back(buf_packet);
let elapsed = start_time.unwrap().elapsed();
if elapsed >= time_limit {
break;
}
}
Packet has no owned variant. to_owned() is from the blanket ToOwned implementation in the standard library, which is implemented for anything that implements Clone. Weirdly, this type implements Clone but not Copy.
Anyway, Packet just holds two references. Cloning copies those references, so the clone still borrows from interface and does nothing meaningful with regard to the borrows or their lifetimes.
Instead, consider implementing your own "owned packet" type that owns its own data:
struct OwnedPacket {
pub header: PacketHeader,
pub data: Vec<u8>,
}
impl<'a> From<Packet<'a>> for OwnedPacket {
fn from(other: Packet<'a>) -> Self {
Self {
header: *other.header,
data: other.data.into(),
}
}
}
Now you can change your deque to VecDeque<OwnedPacket> and call packet.into() when pushing the packet to perform the copy.
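To see the pattern without depending on the pcap crate, here is a self-contained sketch. Header, Packet and Source are hypothetical stand-ins that only mimic the shape described above (a packet that borrows from the capture handle); they are not the real pcap types.

```rust
use std::collections::VecDeque;

// Hypothetical stand-ins for the library's header, packet and capture types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Header {
    len: u32,
}

struct Packet<'a> {
    header: &'a Header,
    data: &'a [u8],
}

struct Source {
    header: Header,
    buf: Vec<u8>,
    next: u8,
}

impl Source {
    // The returned packet borrows from `self`, so `self` stays mutably
    // borrowed for as long as the packet is alive.
    fn next_packet(&mut self) -> Option<Packet<'_>> {
        if self.next >= 3 {
            return None;
        }
        self.buf.clear();
        self.buf.push(self.next);
        self.header.len = 1;
        self.next += 1;
        Some(Packet { header: &self.header, data: &self.buf })
    }
}

#[derive(Debug, PartialEq)]
struct OwnedPacket {
    header: Header,
    data: Vec<u8>,
}

impl<'a> From<Packet<'a>> for OwnedPacket {
    fn from(p: Packet<'a>) -> Self {
        // Copy the header and the bytes so nothing borrows from the source.
        OwnedPacket { header: *p.header, data: p.data.to_vec() }
    }
}

fn collect_all(source: &mut Source) -> VecDeque<OwnedPacket> {
    let mut buffer = VecDeque::new();
    while let Some(packet) = source.next_packet() {
        // The borrow of `source` ends here because only owned data is kept.
        buffer.push_back(OwnedPacket::from(packet));
    }
    buffer
}
```

Pushing the borrowed Packet itself into the deque would keep the mutable borrow alive across iterations, which is the situation the error message in the question describes.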

How do I define the lifetime for a tokio task spawned from a class?

I'm attempting to write a generic set_interval function helper:
pub fn set_interval<F, Fut>(mut f: F, dur: Duration)
where
F: Send + 'static + FnMut() -> Fut,
Fut: Future<Output = ()> + Send + 'static,
{
let mut interval = tokio::time::interval(dur);
tokio::spawn(async move {
// first tick is at 0ms
interval.tick().await;
loop {
interval.tick().await;
tokio::spawn(f());
}
});
}
This works fine until it's called from inside a method that borrows self:
fn main() {}
struct Foo {}
impl Foo {
fn bar(&self) {
set_interval(|| self.task(), Duration::from_millis(1000));
}
async fn task(&self) {
}
}
self is not 'static, and we can't relax the lifetime bound to something shorter than 'static because tokio::spawn requires it.
Is it possible to modify set_interval implementation so it works in cases like this?
P.S. I tried this:
let instance = self.clone();
set_interval(move || instance.task(), Duration::from_millis(1000));
but I also get an error: error: captured variable cannot escape FnMut closure body
"Is it possible to modify set_interval implementation so it works in cases like this?"
Not really. Spawning f() doesn't help either, since it rules out a simple "callback owns the object" solution: either both the callback and the future must own the object, or only the future does.
I think that leaves two solutions:
Convert everything to shared ownership through Arc: the callback owns one Arc, and on each tick it clones that Arc and moves the clone into the future (the task method).
Have the future (task) acquire the object from some external source instead of being called on one, this way the intermediate callback doesn't need to do anything. Or the callback can do the acquiring and move that into the future, same diff.
Incidentally at this point it could make sense to just create the future directly, but allow cloning it. So instead of taking a callback set_interval would take a clonable future, and it would spawn() clones of its stored future instead of creating them anew.
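The first option can be sketched with std threads standing in for tokio tasks, so the example runs without external crates. run_ticks is a hypothetical stand-in for set_interval (it calls the callback a fixed number of times and waits), and the ticks counter exists only to make the effect observable. With tokio, the inner closure would presumably become an async move block that awaits task(), but that variant is not shown here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

struct Foo {
    ticks: AtomicUsize,
}

impl Foo {
    fn task(&self) {
        self.ticks.fetch_add(1, Ordering::SeqCst);
    }
}

// Hypothetical stand-in for set_interval: calls `f` once per tick and runs
// each returned job on its own thread, then waits for all of them.
fn run_ticks<F, J>(mut f: F, n: usize)
where
    F: FnMut() -> J,
    J: FnOnce() + Send + 'static,
{
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(f())).collect();
    for handle in handles {
        handle.join().unwrap();
    }
}

fn start(foo: &Arc<Foo>, n: usize) {
    // The callback owns one Arc for its whole lifetime.
    let owner = Arc::clone(foo);
    run_ticks(
        move || {
            // A fresh clone per tick is moved into the job, so nothing
            // captured by the callback escapes its body.
            let per_tick = Arc::clone(&owner);
            move || per_tick.task()
        },
        n,
    );
}
```

Cloning inside the callback is what avoids the "captured variable cannot escape FnMut closure body" error: each returned job owns its own Arc instead of borrowing the one held by the callback.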
As mentioned by @Masklinn, you can clone the Arc to allow for this. Note that cloning the Arc does not clone the underlying data, only the pointer (plus a reference-count increment), so it is generally fine to do and should not have a major impact on performance.
Here is an example. The following code will produce the error async block may outlive the current function, but it borrows data, which is owned by the current function:
use std::sync::Arc;
#[tokio::main]
async fn main() {
// 🛑 Error: async block may outlive the current function, but it borrows data, which is owned by the current function
let data = Arc::new("Hello, World".to_string());
tokio::task::spawn(async {
println!("1: {}", data.len());
});
tokio::task::spawn(async {
println!("2: {}", data.len());
});
}
Rust unhelpfully suggests adding move to both async blocks, but that results in a "use of moved value" error: the first block takes ownership of data, so nothing is left for the second block to move.
To fix the problem, we can clone the Arc for each task and then add the move keyword to the async blocks:
use std::sync::Arc;
#[tokio::main]
async fn main() {
let data = Arc::new("Hello, World".to_string());
let data_for_task_1 = data.clone();
tokio::task::spawn(async move {
println!("1: {}", data_for_task_1.len());
});
let data_for_task_2 = data.clone();
tokio::task::spawn(async move {
println!("2: {}", data_for_task_2.len());
});
}

Transferring ownership between enum variants [duplicate]

I'm trying to replace a value behind a mutable borrow, moving part of it into the new value:
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
*self = match self {
&mut Foo::Bar(val) => Foo::Baz(val),
&mut Foo::Baz(val) => Foo::Bar(val),
}
}
}
The code above doesn't work, and understandably so: moving the value out of self breaks its integrity. But since that value is overwritten immediately afterwards, I (if not the compiler) could guarantee its safety.
Is there some way to achieve this? I feel like this is a job for unsafe code, but I'm not sure how that would work.
mem::uninitialized has been deprecated since Rust 1.39, replaced by MaybeUninit.
However, uninitialized data is not required here. Instead, you can use ptr::read to get the data referred to by self.
At this point, tmp owns the data in the enum, but self still refers to the same bytes. If self were dropped now (for example during a panic), its destructor would read that data a second time, causing memory unsafety.
We then perform our transformation and put the value back, restoring the safety of the type.
use std::ptr;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// I copied this code from Stack Overflow without reading
// the surrounding text that explains why this is safe.
unsafe {
let tmp = ptr::read(self);
// Must not panic before we get to `ptr::write`
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
ptr::write(self, new);
}
}
}
More advanced versions of this code would prevent a panic from bubbling out of this code and instead cause the program to abort.
See also:
replace_with, a crate that wraps this logic up.
take_mut, a crate that wraps this logic up.
Change enum variant while moving the field to the new variant
How can I swap in a new value for a field in a mutable reference to a structure?
"The code above doesn't work, and understandably so, moving the value out of self breaks the integrity of it."
This is not exactly what happens here. For example, the same thing taking self by value works fine (the binding only has to be mut so that it can be reassigned):
impl<T> Foo<T> {
fn switch(mut self) {
self = match self {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
}
}
}
Rust is absolutely fine with partial and total moves. The problem here is that you do not own the value you're trying to move - you only have a mutable borrowed reference. You cannot move out of any reference, including mutable ones.
This is in fact one of the frequently requested features - a special kind of reference which would allow moving out of it. It would allow several kinds of useful patterns. You can find more here and here.
In the meantime for some cases you can use std::mem::replace and std::mem::swap. These functions allow you to "take" a value out of mutable reference, provided you give something in exchange.
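As a minimal sketch of the std::mem::replace approach with no unsafe code: the version below temporarily parks a placeholder in self while the old value is moved out. The T: Default bound is an assumption added for this example (the original enum has no such bound); it is only there so that a cheap placeholder can be built.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum Foo<T> {
    Bar(T),
    Baz(T),
}

impl<T: Default> Foo<T> {
    fn switch(&mut self) {
        // Give `self` something valid in exchange for the old value.
        let old = mem::replace(self, Foo::Bar(T::default()));
        // `old` is owned, so its payload can be moved freely.
        *self = match old {
            Foo::Bar(val) => Foo::Baz(val),
            Foo::Baz(val) => Foo::Bar(val),
        };
    }
}
```

If a panic happened between the two steps, self would hold the placeholder rather than invalid memory, which is the property the unsafe variants have to guarantee by hand.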
Okay, I figured out how to do it with a bit of unsafeness and std::mem.
I replace self with an uninitialized temporary value. Since I now "own" what used to be self, I can safely move the value out of it and replace it:
use std::mem;
enum Foo<T> {
Bar(T),
Baz(T),
}
impl<T> Foo<T> {
fn switch(&mut self) {
// This is safe since we will overwrite it without ever reading it.
let tmp = mem::replace(self, unsafe { mem::uninitialized() });
// We absolutely must **never** panic while the uninitialized value is around!
let new = match tmp {
Foo::Bar(val) => Foo::Baz(val),
Foo::Baz(val) => Foo::Bar(val),
};
let uninitialized = mem::replace(self, new);
mem::forget(uninitialized);
}
}
fn main() {}

What happens to the ownership of a value returned but not assigned by the calling function?

Consider the following Rust code, slightly modified from examples in The Book.
I'm trying to understand what happens to the value in the second running of function dangle() in the main() function (see comment). I would imagine that because the value isn't assigned to any owner, it gets deallocated, but I've so far failed to find information to confirm that. Otherwise, I would think that calling dangle() repeatedly would constantly allocate more memory without deallocating it. Which is it?
fn main() {
// Ownership of dangle()'s return value is passed to the variable `thingamabob`.
let thingamabob = dangle();
// No ownership specified. Is the return value deallocated here?
dangle();
println!("Ref: {}", thingamabob);
}
fn dangle() -> String {
// Ownership specified.
let s = String::from("hello");
// Ownership is passed to calling function.
s
}
When a value has no owner (is not bound to a variable) it goes out of scope. Values that go out of scope are dropped. Dropping a value frees the resources associated with that value.
Anything less would lead to memory leaks, which would be a poor idea in a programming language.
See also:
Is it possible in Rust to delete an object before the end of scope?
How does Rust know whether to run the destructor during stack unwind?
Does Rust free up the memory of overwritten variables?
In your example, the second call creates an unnamed temporary value whose lifetime ends immediately after that one line of code, so it goes out of scope (and any resources are reclaimed) immediately.
If you bind the value to a name using let, then its lifetime extends until the end of the current lexical scope (closing curly brace).
You can explore some of this yourself by implementing the Drop trait on a simple type to see when its lifetime ends. Here's a small program I made to play with this (playground):
#[derive(Debug)]
struct Thing {
val: i32,
}
impl Thing {
fn new(val: i32) -> Self {
println!("Creating Thing #{}", val);
Thing { val }
}
fn foo(self, val: i32) -> Self {
Thing::new(val)
}
}
impl Drop for Thing {
fn drop(&mut self) {
println!("Dropping {:?}", self);
}
}
pub fn main() {
let _t1 = Thing::new(1);
Thing::new(2); // dropped immediately
{
let t3 = Thing::new(3);
Thing::new(4).foo(5).foo(6); // #4 and #5 are each dropped once the next Thing is created; #6 is dropped at the end of this statement
println!("Doing something with t3: {:?}", t3);
} // t3 is dropped here
} // _t1 is dropped last
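The same experiment can be made checkable by recording drop events in a shared log instead of printing them. This is a sketch under assumptions: Noisy, make and drop_order are hypothetical names introduced here, not part of the program above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn make(name: &'static str, log: &Log) -> Noisy {
    Noisy { name, log: Rc::clone(log) }
}

fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // Bound to a name: lives until the end of this block.
        let _bound = make("bound", &log);
        // Not bound at all: dropped at the end of the statement.
        make("temporary", &log);
        // `_` is a pattern that does not bind, so this is dropped at once too.
        let _ = make("underscore", &log);
        log.borrow_mut().push("end of block".to_string());
    }
    let events = log.borrow().clone();
    events
}
```

Calling drop_order() returns the events "drop temporary", "drop underscore", "end of block", "drop bound", in that order, which matches the explanation above: an unbound return value is freed right away, while a let-bound one lasts until the closing brace.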

Drop a Rust void pointer stored in an FFI

I'm wrapping a C API which allows the caller to set/get an arbitrary pointer via function calls. In this way, the C API allows a caller to associate arbitrary data with one of the C API objects. This data is not used in any callbacks, it's just a pointer that a user can stash away and get at later.
My wrapper struct implements the Drop trait for the C object that contains this pointer. What I'd like to be able to do, but am not sure it's possible, is have the data dropped correctly if the pointer is not null when the wrapper struct drops. I'm not sure how I would recover the correct type though from a raw c_void pointer.
Two alternatives I'm thinking of are
Implement the behavior of these two calls in the wrapper. Don't make any calls to the C API.
Don't attempt to offer any kind of safer interface to these functions. Document that the pointer must be managed by the caller of the wrapper.
Is what I want to do possible? If not, is there a generally accepted practice for these kinds of situations?
A naive + fully automatic approach is NOT possible for the following reasons:
freeing memory does not call drop/destructors/...: the C API can be used from languages whose objects need to be destructed properly, e.g. C++ or Rust itself. So when you only store a memory pointer, you do not know how to call the proper function (you know neither which function to call nor what its calling convention looks like).
which memory allocator?: memory allocation and deallocation isn't a trivial thing. Your program needs to request memory from the OS and then manage these resources intelligently to be efficient and correct. This is usually done by a library. Rust used jemalloc by default when this was written; since Rust 1.32 the default is the system allocator, and it can be replaced. So even when you ask the API caller to only pass Plain Old Data (which should be easier to destruct), you still don't know which library function to call to deallocate the memory. Just calling libc::free is not reliable (it may work, but it could also fail horribly).
Solutions:
dealloc callback: you can ask the API user to set an additional pointer to, let's say, a void destruct(void* ptr) function. If this one is not NULL, you call that function during your drop. You could also use int as a return type to signal that the destruction went wrong. In that case you could, for example, panic!.
global callback: let's assume you requested your user to only pass POD (plain old data). To know which free function of the memory allocator to call, you could request the user to register a global void (*free)(void* ptr) pointer which is called during drop. You could also make that one optional.
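The dealloc callback idea can be sketched on the Rust side as follows. All names here (UserData, Destructor, destroy_string, stash_string, DROPS) are hypothetical and chosen for illustration; the counter exists only so the effect can be observed, and a real wrapper would store the pointer pair through the C API rather than in a Rust struct.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Signature the user supplies alongside the data pointer.
type Destructor = unsafe extern "C" fn(*mut c_void);

// Counts destructor calls, purely for demonstration.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct UserData {
    ptr: *mut c_void,
    destroy: Option<Destructor>,
}

impl Drop for UserData {
    fn drop(&mut self) {
        if let Some(destroy) = self.destroy {
            if !self.ptr.is_null() {
                // The caller promised that `destroy` matches how `ptr`
                // was allocated; the wrapper cannot verify this itself.
                unsafe { destroy(self.ptr) }
            }
        }
    }
}

// Destructor for data that was created with Box::new(String).
unsafe extern "C" fn destroy_string(ptr: *mut c_void) {
    unsafe {
        drop(Box::from_raw(ptr as *mut String));
    }
    DROPS.fetch_add(1, Ordering::SeqCst);
}

fn stash_string(s: String) -> UserData {
    UserData {
        ptr: Box::into_raw(Box::new(s)) as *mut c_void,
        destroy: Some(destroy_string),
    }
}
```

The type information that a bare void pointer loses is carried by the function pointer instead: whoever stores the data also says how to free it, so the wrapper never has to guess the type or the allocator.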
Although I was able to follow the advice in this thread, I wasn't entirely satisfied with my results, so I asked the question on the Rust forums and found the answer I was really looking for.
use std::any::Any;
// Stand-in for the C API: a single global slot that holds the user pointer.
static mut FOREIGN_PTR: *mut () = std::ptr::null_mut();
unsafe fn api_set_fp(ptr: *mut ()) {
FOREIGN_PTR = ptr;
}
unsafe fn api_get_fp() -> *mut () {
FOREIGN_PTR
}
struct ApiWrapper {}
impl ApiWrapper {
fn set_foreign<T: Any>(&mut self, value: Box<T>) {
self.free_foreign();
unsafe {
// Double-box: the outer Box yields a thin pointer to the fat Box<dyn Any>.
let raw = Box::into_raw(Box::new(value as Box<dyn Any>));
api_set_fp(raw as *mut ());
}
}
fn get_foreign_ref<T: Any>(&self) -> Option<&T> {
unsafe {
let raw = api_get_fp() as *const Box<dyn Any>;
if !raw.is_null() {
let b: &Box<dyn Any> = &*raw;
b.downcast_ref()
} else {
None
}
}
}
fn get_foreign_mut<T: Any>(&mut self) -> Option<&mut T> {
unsafe {
let raw = api_get_fp() as *mut Box<dyn Any>;
if !raw.is_null() {
let b: &mut Box<dyn Any> = &mut *raw;
b.downcast_mut()
} else {
None
}
}
}
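fn free_foreign(&mut self) {
unsafe {
let raw = api_get_fp() as *mut Box<dyn Any>;
if !raw.is_null() {
// Rebuild the outer Box so that it and the value inside are dropped here.
drop(Box::from_raw(raw));
// Clear the slot so a second call cannot free the same pointer twice.
api_set_fp(std::ptr::null_mut());
}
}
}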
}
impl Drop for ApiWrapper {
fn drop(&mut self) {
self.free_foreign();
}
}
struct MyData {
i: i32,
}
impl Drop for MyData {
fn drop(&mut self) {
println!("Dropping MyData with value {}", self.i);
}
}
fn main() {
let p1 = Box::new(MyData {i: 1});
let mut api = ApiWrapper{};
api.set_foreign(p1);
{
let p2 = api.get_foreign_ref::<MyData>().unwrap();
println!("i is {}", p2.i);
}
api.set_foreign(Box::new("Hello!"));
{
let p3 = api.get_foreign_ref::<&'static str>().unwrap();
println!("payload is {}", p3);
}
}

Resources