I would like to write a function that accepts an integer, and spawns threads that use this integer. The integer could be computed. It doesn't need to be a literal. If I use a concrete type such as usize, it would work but when I try to generalize it, it fails to compile:
fn g<A>(a: A)
where
A: num::PrimInt + Copy + std::fmt::Debug + Send,
{
let hs = (0..3).map(|_| {
std::thread::spawn(move || {
println!("{:?}", a);
})
});
for h in hs {
h.join().unwrap();
}
}
The error is:
1 | fn g<A>(a: A)
| - help: consider adding an explicit lifetime bound `A: 'static`...
...
6 | std::thread::spawn(move || {
| ^^^^^^^^^^^^^^^^^^
|
note: ...so that the type `[closure#src/main.rs:6:28: 8:10 a:A]` will meet its required lifetime bounds
--> src/main.rs:6:9
|
6 | std::thread::spawn(move || {
| ^^^^^^^^^^^^^^^^^^
Since I have the Copy trait, it should be able to copy this value for each thread and hence the lifetime bound recommendation is not necessary. How do I resolve this?
Copy and 'static are different constraints, and both are necessary for a value to be moved across threads. While you may only intend to pass integers (which satisfy both Copy and 'static) to your generic function, the compiler cannot know for sure. It must be able to prove that the function body is valid for all possible type parameters. An example of a type which is Copy but not 'static is a reference to a local variable:
fn assert_is_copy<T: Copy>(_: T) {}
fn assert_is_static<T: 'static>(_: T) {}
fn main() {
let x: usize = 5;
assert_is_copy(x);
assert_is_static(x);
assert_is_copy(&x);
//assert_is_static(&x); // FAILS TO COMPILE
}
I think you'll agree that you don't want to be passing references to your stack variables to different threads (although it can be safe in some limited cases). If you did, you could detach that thread, pop the stack frame with the local variable on it, and cause undefined behavior!
This answer gives a simple, (mostly) correct explanation of what it means to say T: 'static.
Others have explained why your code doesn't work, but if you're still wondering how to make it work, simply add + 'static to your constraints:
fn g<A>(a: A)
where
A: num::PrimInt + Copy + std::fmt::Debug + Send + 'static,
{
let hs = (0..3).map(|_| {
std::thread::spawn(move || {
println!("{:?}", a);
})
});
for h in hs {
h.join().unwrap();
}
}
The direct answer to my question based on the other answers is the following:
In Rust, it is not possible to pass a computed, non-literal PrimInt to more than one thread.
Related
I've very recently started studying Rust, and while working on a test program, I wrote this method:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let mut m: Vec<Page>;
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => {
m = self.index.get_pages(start_state, &self.file)?;
&mut m
}
};
// omitted code that mutates pages
// ...
Ok(true)
}
it does work as expected, but I'm not convinced about the m variable. If I remove it, the code looks more elegant:
pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
Some(p) => p,
None => &mut self.index.get_pages(start_state, &self.file)?
};
// omitted code that mutates pages
// ...
Ok(true)
}
but I get:
error[E0716]: temporary value dropped while borrowed
--> src\module1\mod.rs:28:29
|
26 | let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
| _____________________________________-
27 | | Some(p) => p,
28 | | None => &mut self.index.get_pages(start_state, &self.file)?
| | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
| | | |
| | | temporary value is freed at the end of this statement
| | creates a temporary which is freed while still in use
29 | | };
| |_________- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
I fully understand the error, which directed me to the working snippet, but I'm wondering if there's a more elegant and/or idiomatic way of writing this code. I am declaring m at the beginning of the function, only to prevent a temporary variable from being freed too early. Is there a way of telling the compiler that the lifetime of the return value of self.index.get_pages should be the whole add_transition function?
Further details:
Page is a relatively big struct, so I'd rather not implement the Copy trait nor I'd clone it.
page_cache is of type HashMap<u32, Vec<Page>>
self.index.get_pages is relatively slow and I'm using page_cache to cache results
The return type of self.index.get_pages is Result<Vec<Page>, std::io::Error>
This is normal, your 'cleaner' code basically comes down to do something as follows:
let y = {
let x = 42;
&x
};
Here it should be obvious that you cannot return a reference to x because x is dropped at the end of the block. Those rules don't change when working with temporary values: self.index.get_pages(start_state, &self.file)? creates a temporary value that is dropped at the end of the block (line 29) and thus you can't return a reference to it.
The workaround via m now moves that temporary into the m binding one block up which will live long enough for pages to work with it.
Now for alternatives, I guess page_cache is a HashMap? Then you could alternatively do something like let pages = self.page_cache.entry(start_state).or_insert_with(||self.index.get_pages(...))?;. The only problem with that approach is that get_pages returns a Result while the current cache stores Vec<Page> (the Ok branch only). You could adapt the cache to actually store Result instead, which I think is semantically also better since you want to cache the results of that function call, so why not do that for Err? But if you have a good reason to not cache Err, the approach you have should work just fine.
Yours is probably the most efficient way, but in theory not necessary, and one can be more elegant.
Another way of doing it is to use a trait object in this case — have the variable be of the type dyn DerefMut<Vec<Page>>. This basically means that this variable can hold any type that implements the trait DerefMut<Vec<Page>>>, two types that do so are &mut Vec<Page> and Vec<Page>, in that case the variable can hold either of these, but the contents can only be referenced via DerefMut.
So the following code works as an illustration:
struct Foo {
inner : Option<Vec<i32>>,
}
impl Foo {
fn new () -> Self {
Foo { inner : None }
}
fn init (&mut self) {
self.inner = Some(Vec::new())
}
fn get_mut_ref (&mut self) -> Option<&mut Vec<i32>> {
self.inner.as_mut()
}
}
fn main () {
let mut foo : Foo = Foo::new();
let mut m : Box<dyn AsMut<Vec<i32>>> = match foo.get_mut_ref() {
Some(r) => Box::new(r),
None => Box::new(vec![1,2,3]),
};
m.as_mut().as_mut().push(4);
}
The key here is the type Box<dyn AsMut<Vec<i32>>; this means that it can be a box that holds any type, so long the type implement AsMut<Vec<i32>>, because it's boxed in we also need .as_mut().as_mut() to get the actual &mut <Vec<i32>> out of it.
Because different types can have different sizes; they also cannot be allocated on the stack, so they must be behind some pointer, a Box is typically chosen therefore, and in this case necessary, a normal pointer that is sans ownership of it's pointee will face similar problems to those you face.
One might argue that this code is more elegant, but yours is certainly more efficient and does not require further heap allocation.
Why doesn't this code compile:
fn use_cursor(cursor: &mut io::Cursor<&mut Vec<u8>>) {
// do some work
}
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(data);
use_cursor(&mut buf);
}
data.len();
}
fn produce_data() {
let mut data = Vec::new();
take_reference(&mut data);
data.len();
}
The error in this case is:
error[E0382]: use of moved value: `*data`
--> src/main.rs:14:5
|
9 | let mut buf = io::Cursor::new(data);
| ---- value moved here
...
14 | data.len();
| ^^^^ value used here after move
|
= note: move occurs because `data` has type `&mut std::vec::Vec<u8>`, which does not implement the `Copy` trait
The signature of io::Cursor::new is such that it takes ownership of its argument. In this case, the argument is a mutable reference to a Vec.
pub fn new(inner: T) -> Cursor<T>
It sort of makes sense to me; because Cursor::new takes ownership of its argument (and not a reference) we can't use that value later on. At the same time it doesn't make sense: we essentially only pass a mutable reference and the cursor goes out of scope afterwards anyway.
In the produce_data function we also pass a mutable reference to take_reference, and it doesn't produce a error when trying to use data again, unlike inside take_reference.
I found it possible to 'reclaim' the reference by using Cursor.into_inner(), but it feels a bit weird to do it manually, since in normal use-cases the borrow-checker is perfectly capable of doing it itself.
Is there a nicer solution to this problem than using .into_inner()? Maybe there's something else I don't understand about the borrow-checker?
Normally, when you pass a mutable reference to a function, the compiler implicitly performs a reborrow. This produces a new borrow with a shorter lifetime.
When the parameter is generic (and is not of the form &mut T), the compiler doesn't do this reborrowing automatically1. However, you can do it manually by dereferencing your existing mutable reference and then referencing it again:
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(&mut *data);
use_cursor(&mut buf);
}
data.len();
}
1 — This is because the current compiler architecture only allows a chance to do a coercion if both the source and target types are known at the coercion site.
I am trying to write a method that returns a rusqlite::MappedRows:
pub fn dump<F>(&self) -> MappedRows<F>
where F: FnMut(&Row) -> DateTime<UTC>
{
let mut stmt =
self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
let c: F = |row: &Row| {
let created_at: DateTime<UTC> = row.get(0);
created_at
};
stmt.query_map(&[], c).unwrap()
}
I am getting stuck on a compiler error:
error[E0308]: mismatched types
--> src/main.rs:70:20
|
70 | let c: F = |row: &Row| {
| ____________________^ starting here...
71 | | let created_at: DateTime<UTC> = row.get(0);
72 | | created_at
73 | | };
| |_________^ ...ending here: expected type parameter, found closure
|
= note: expected type `F`
= note: found type `[closure#src/main.rs:70:20: 73:10]`
What am I doing wrong here?
I tried passing the closure directly to query_map but I get the same compiler error.
I'll divide the answer in two parts, the first about how to fix the return type without considering borrow-checker, the second about why it doesn't work even if you fixed the return type.
§1.
Every closure has a unique, anonymous type, so c cannot be of any type F the caller provides. That means this line will never compile:
let c: F = |row: &Row| { ... } // no, wrong, always.
Instead, the type should be propagated out from the dump function, i.e. something like:
// ↓ no generics
pub fn dump(&self) -> MappedRows<“type of that c”> {
..
}
Stable Rust does not provide a way to name that type. But we could do so in nightly with the "impl Trait" feature:
#![feature(conservative_impl_trait)]
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
..
}
// note: wrong, see §2.
The impl F here means that, “we are going to return a MappedRows<T> type where T: F, but we are not going to specify what exactly is T; the caller should be ready to treat anything satisfying F as a candidate of T”.
As your closure does not capture any variables, you could in fact turn c into a function. We could name a function pointer type, without needing "impl Trait".
// ↓~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<fn(&Row) -> DateTime<UTC>> {
let mut stmt = self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
fn c(row: &Row) -> DateTime<UTC> {
row.get(0)
}
stmt.query_map(&[], c as fn(&Row) -> DateTime<UTC>).unwrap()
}
// note: wrong, see §2.
Anyway, if we do use "impl Trait", since MappedRows is used as an Iterator, it is more appropriate to just say so:
#![feature(conservative_impl_trait)]
pub fn dump<'c>(&'c self) -> impl Iterator<Item = Result<DateTime<UTC>>> + 'c {
..
}
// note: wrong, see §2.
(without the 'c bounds the compiler will complain E0564, seems lifetime elision doesn't work with impl Trait yet)
If you are stuck with Stable Rust, you cannot use the "impl Trait" feature. You could wrap the trait object in a Box, at the cost of heap allocation and dynamic dispatch:
pub fn dump(&self) -> Box<Iterator<Item = Result<DateTime<UTC>>>> {
...
Box::new(stmt.query_map(&[], c).unwrap())
}
// note: wrong, see §2.
§2.
The above fix works if you want to, say, just return an independent closure or iterator. But it does not work if you return rusqlite::MappedRows. The compiler will not allow the above to work due to lifetime issue:
error: `stmt` does not live long enough
--> 1.rs:23:9
|
23 | stmt.query_map(&[], c).unwrap()
| ^^^^ does not live long enough
24 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 15:80...
--> 1.rs:15:81
|
15 | pub fn dump(conn: &Connection) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
| ^
And this is correct. MappedRows<F> is actually MappedRows<'stmt, F>, this type is valid only when the original SQLite statement object (having 'stmt lifetime) outlives it — thus the compiler complains that stmt is dead when you return the function.
Indeed, if the statement is dropped before we iterate on those rows, we will get garbage results. Bad!
What we need to do is to make sure all rows are read before dropping the statement.
You could collect the rows into a vector, thus disassociating the result from the statement, at the cost of storing everything in memory:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> Vec<Result<DateTime<UTC>>> {
..
let it = stmt.query_map(&[], c).unwrap();
it.collect()
}
Or invert the control, let dump accept a function, which dump will call while keeping stmt alive, at the cost of making the calling syntax weird:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump<F>(&self, mut f: F) where F: FnMut(Result<DateTime<UTC>>) {
...
for res in stmt.query_map(&[], c).unwrap() {
f(res);
}
}
x.dump(|res| println!("{:?}", res));
Or split dump into two functions, and let the caller keep the statement alive, at the cost of exposing an intermediate construct to the user:
#![feature(conservative_impl_trait)]
pub fn create_dump_statement(&self) -> Statement {
self.conn.prepare("SELECT '2017-03-01 12:34:56'").unwrap()
}
pub fn dump<'s>(&self, stmt: &'s mut Statement) -> impl Iterator<Item = Result<DateTime<UTC>>> + 's {
stmt.query_map(&[], |row| row.get(0)).unwrap()
}
...
let mut stmt = x.create_dump_statement();
for res in x.dump(&mut stmt) {
println!("{:?}", res);
}
The issue here is that you are implicitly trying to return a closure, so to find explanations and examples you can search for that.
The use of the generic <F> means that the caller decides the concrete type of F and not the function dump.
What you would like to achieve instead requires the long awaited feature impl trait.
I have some mutable state I need to share between threads. I followed the concurrency section of the Rust book, which shares a vector between threads and mutates it.
Instead of a vector, I need to share a generic struct that is ultimately monomorphized. Here is a distilled example of what I'm trying:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
use std::marker::PhantomData;
trait Memory {}
struct SimpleMemory;
impl Memory for SimpleMemory {}
struct SharedData<M: Memory> {
value: usize,
phantom: PhantomData<M>,
}
impl<M: Memory> SharedData<M> {
fn new() -> Self {
SharedData {
value: 0,
phantom: PhantomData,
}
}
}
fn main() {
share(SimpleMemory);
}
fn share<M: Memory>(memory: M) {
let data = Arc::new(Mutex::new(SharedData::<M>::new()));
for i in 0..3 {
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
data.value += i;
});
}
thread::sleep(Duration::from_millis(50));
}
The compiler complains with the following error:
error[E0277]: the trait bound `M: std::marker::Send` is not satisfied
--> src/main.rs:37:9
|
37 | thread::spawn(move || {
| ^^^^^^^^^^^^^
|
= help: consider adding a `where M: std::marker::Send` bound
= note: required because it appears within the type `std::marker::PhantomData<M>`
= note: required because it appears within the type `SharedData<M>`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<SharedData<M>>`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<std::sync::Mutex<SharedData<M>>>`
= note: required because it appears within the type `[closure#src/main.rs:37:23: 40:10 data:std::sync::Arc<std::sync::Mutex<SharedData<M>>>, i:usize]`
= note: required by `std::thread::spawn`
I'm trying to understand why M would need to implement Send, and what the appropriate way to accomplish this is.
I'm trying to understand why M would need to implement Send, ...
Because, as stated by the Send documentation:
Types that can be transferred across thread boundaries.
If it's not Send, it is by definition not safe to send to another thread.
Almost all of the information you need is right there in the documentation:
thread::spawn requires the callable you give it to be Send.
You're using a closure, which is only Send if all the values it captures are Send. This is true in general of most types (they are Send if everything they're made of is Send, and similarly for Sync).
You're capturing data, which is an Arc<T>, which is only Send if T is Send.
T is a Mutex<U>, which is only Send if U is Send.
U is M. Thus, M must be Send.
In addition, note that thread::spawn also requires that the callable be 'static, so you need that too. It needs that because if it didn't require that, it'd have no guarantee that the value will continue to exist for the entire lifetime of the thread (which may or may not outlive the thread that spawned it).
..., and what the appropriate way to accomplish this is.
Same way as any other constraints: M: 'static + Send + Memory.
This is similar to Parameter type may not live long enough?, but my interpretation of the solution doesn't seem to be working. My original boiled-down test case is:
use std::fmt::Debug;
use std::thread;
trait HasFeet: Debug + Send + Sync + Clone {}
#[derive(Debug, Clone)]
struct Person;
impl HasFeet for Person {}
#[derive(Debug, Copy, Clone)]
struct Cordwainer<A: HasFeet> {
shoes_for: A,
}
impl<A: HasFeet> Cordwainer<A> {
fn make_shoes(&self) {
let cloned = self.shoes_for.clone();
thread::spawn(move || {
println!("making shoes for = {:?}", cloned);
});
}
}
This gives me the error:
error[E0310]: the parameter type `A` may not live long enough
--> src/main.rs:19:9
|
16 | impl<A: HasFeet> Cordwainer<A> {
| -- help: consider adding an explicit lifetime bound `A: 'static`...
...
19 | thread::spawn(move || {
| ^^^^^^^^^^^^^
|
note: ...so that the type `[closure#src/main.rs:19:23: 21:10 cloned:A]` will meet its required lifetime bounds
--> src/main.rs:19:9
|
19 | thread::spawn(move || {
| ^^^^^^^^^^^^^
Instead of making A 'static, I go through and add an explicit lifetime to the HasFeet trait:
use std::fmt::Debug;
use std::thread;
trait HasFeet<'a>: 'a + Send + Sync + Debug {}
#[derive(Debug, Copy, Clone)]
struct Person;
impl<'a> HasFeet<'a> for Person {}
struct Cordwainer<'a, A: HasFeet<'a>> {
shoes_for: A,
}
impl<'a, A: HasFeet<'a>> Cordwainer<'a, A> {
fn make_shoes(&self) {
let cloned = self.shoes_for.clone();
thread::spawn(move || {
println!("making shoes for = {:?}", cloned);
})
}
}
This now gives me the error:
error[E0392]: parameter `'a` is never used
--> src/main.rs:11:19
|
11 | struct Cordwainer<'a, A: HasFeet<'a>> {
| ^^ unused type parameter
|
= help: consider removing `'a` or using a marker such as `std::marker::PhantomData`
I think that 'a is used as the lifetime parameter to the HasFeet trait. What am I doing wrong here?
The std::thread::spawn() function is declared with a Send + 'static bound on its closure. Anything which this closure captures must satisfy the Send + 'static bound. There is no way around this in safe Rust. If you want to pass data to other threads using this function, it must be 'static, period.
It is possible to lift the 'static restriction with the proper API, see How can I pass a reference to a stack variable to a thread? for an example.
However, a 'static bound is not as scary as it may seem. First of all, you cannot force anything to do anything with lifetime bounds (you can't force anything to do anything with any kind of bounds). Bounds just limit the set of types which can be used for the bounded type parameter and that is all; if you tried to pass a value whose type does not satisfy these bounds, the compiler will fail to compile your program, it won't magically make the values "live longer".
Moreover, a 'static bound does not mean that the value must live for the duration of the program; it means that the value must not contain borrowed references with lifetimes other than 'static. In other words, it is a lower bound for possible references inside the value; if there are no references, then the bound does not matter. For example, String or Vec<u64> or i32 satisfy the 'static bound.
'static is a very natural restriction to put on spawn(). If it weren't present, the values transferred to another thread could contain references to the stack frame of the parent thread. If the parent thread finishes before the spawned thread, these references would become dangling.