Creating pool of threads using a Vec<String> as input [duplicate] - multithreading

This question already has answers here:
Parameter type may not live long enough (with threads)
(1 answer)
How can I send non-static data to a thread in Rust and is it needed in this example?
(1 answer)
How can I pass a reference to a stack variable to a thread?
(1 answer)
Is there another option to share an Arc in multiple closures besides cloning it before each closure?
(2 answers)
Closed 2 years ago.
I'm trying to adapt code from this gist to make something like a pool of workers that could update data simultaneously for me, so I could get it from them via mpsc::channel.
I want to build this pool of workers using a Vec<String> as input, telling it how many workers will be created. I guess the input must be a Vec<String> because it is dynamically built before the pool is created.
This is a basic snippet for result replication:
use std::{
    collections::HashMap,
    sync::{
        mpsc::{self, Receiver, Sender},
        Arc, Mutex, RwLock,
    },
    thread,
};
#[derive(Debug)]
pub struct Pool {
    workers: RwLock<HashMap<String, Arc<Mutex<Worker>>>>,
}
impl Pool {
    pub fn new(symbols: Vec<String>) -> Pool {
        let mut workers: HashMap<String, Arc<Mutex<Worker>>> = HashMap::new();
        for (s, w) in symbols.iter().map(|symbol| {
            (
                symbol,
                Arc::new(Mutex::new({
                    let (in_tx, in_rx): (Sender<i32>, Receiver<i32>) = mpsc::channel();
                    let (out_tx, out_rx) = mpsc::channel();
                    let worker = Worker {
                        in_tx: in_tx,
                        out_rx: out_rx,
                    };
                    thread::spawn(move || {
                        println!("{}", symbol);
                    });
                    worker
                })),
            )
        }) {
            workers.insert(s.to_string(), w);
        }
        Pool {
            workers: RwLock::new(workers),
        }
    }
}
#[derive(Debug)]
struct Worker {
    pub in_tx: Sender<i32>,
    pub out_rx: Receiver<i32>,
}
The compiler keeps complaining:
error[E0597]: `symbols` does not live long enough
--> src/lib.rs:18:23
|
18 | for (s, w) in symbols.iter().map(|symbol| {
| ^^^^^^^-------
| |
| borrowed value does not live long enough
| argument requires that `symbols` is borrowed for `'static`
...
40 | }
| - `symbols` dropped here while still borrowed
I think I kind of understand the reasons why it's happening, but I can't figure a way of working around this.
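One possible way around the error, sketched under a few assumptions: the closure passed to thread::spawn must be 'static, so it cannot capture a &String borrowed from symbols. Iterating over symbols by value and giving each thread its own clone of the name avoids the borrow. The worker body (doubling each input) and the round_trip helper are hypothetical; they exist only to make the example runnable and are not part of the original code.

```rust
use std::{
    collections::HashMap,
    sync::{
        mpsc::{self, Receiver, Sender},
        Arc, Mutex, RwLock,
    },
    thread,
};

#[derive(Debug)]
struct Worker {
    pub in_tx: Sender<i32>,
    pub out_rx: Receiver<i32>,
}

#[derive(Debug)]
pub struct Pool {
    workers: RwLock<HashMap<String, Arc<Mutex<Worker>>>>,
}

impl Pool {
    pub fn new(symbols: Vec<String>) -> Pool {
        let mut workers = HashMap::new();
        // Iterating by value yields owned `String`s, so nothing borrows `symbols`.
        for symbol in symbols {
            let (in_tx, in_rx) = mpsc::channel::<i32>();
            let (out_tx, out_rx) = mpsc::channel::<i32>();
            // The thread owns its own copy of the name; the map keeps the original.
            let name = symbol.clone();
            thread::spawn(move || {
                // Hypothetical worker body: double each input until the sender is dropped.
                for value in in_rx {
                    println!("{} got {}", name, value);
                    if out_tx.send(value * 2).is_err() {
                        break;
                    }
                }
            });
            workers.insert(symbol, Arc::new(Mutex::new(Worker { in_tx, out_rx })));
        }
        Pool {
            workers: RwLock::new(workers),
        }
    }

    // Hypothetical helper for the demo: send one value and wait for the reply.
    pub fn round_trip(&self, symbol: &str, value: i32) -> Option<i32> {
        let workers = self.workers.read().unwrap();
        let worker = workers.get(symbol)?.lock().unwrap();
        worker.in_tx.send(value).ok()?;
        worker.out_rx.recv().ok()
    }
}

fn main() {
    let pool = Pool::new(vec!["AAPL".to_string(), "MSFT".to_string()]);
    println!("{:?}", pool.round_trip("AAPL", 21));
}
```

Cloning a short String per worker is cheap compared with spawning a thread. If the names were large, wrapping each one in an Arc<str> would be another option.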

Related

Proper way to share references to Vec between threads

I am new to Rust and I am attempting to create a Vec that lives on the main thread and to pass a reference to it to another thread, which then pushes elements onto the vector for the main thread to use.
use std::{thread};
fn main() {
    let mut v: Vec<u8> = Vec::new();
    let _ = thread::spawn(move || {
        vec_push(&mut v, 0)
    });
    for i in v.iter_mut() {
        println!("poo {}", i);
    }
}
fn vec_push(v: &mut Vec<u8>, n: u8) {
    v.push(n);
}
This is a simplified version of what I am trying to do. In my main code I want it to be a Vec of TcpStreams.
I think this post would also apply to maintaining a struct (that doesn't implement Copy) between threads.
I get this error
error[E0382]: borrow of moved value: `v`
--> src/main.rs:8:11
|
4 | let mut v: Vec<u8> = Vec::new();
| ----- move occurs because `v` has type `Vec<u8>`, which does not implement the `Copy` trait
5 | let _ = thread::spawn(move || {
| ------- value moved into closure here
6 | vec_push(&mut v, 0)
| - variable moved due to use in closure
7 | });
8 | for i in v.iter_mut() {
| ^^^^^^^^^^^^ value borrowed here after move
Is there a better way to do this? Am I missing some basic concept?
Any help would be useful. I am used to C, where I can just throw around references willy-nilly.
What you are doing is wildly unsound. You are trying to have two mutable references to an object, which is strictly forbidden in Rust. Rust forbids this to prevent data races that would result in memory unsafety.
If you want to mutate an object from different threads, you have to synchronize access to it somehow. The easiest way to do that is with a Mutex. This probably won't be very efficient in a high-contention scenario (locking a mutex can become your bottleneck), but it will be safe.
To share this Mutex between threads you can wrap it in an Arc (an atomically reference-counted smart pointer). So your code can be transformed into something like this:
use std::thread;
use std::sync::{Arc, Mutex};
fn main() {
    let v = Arc::new(Mutex::new(Vec::new()));
    let v_clone = Arc::clone(&v);
    let t = thread::spawn(move || {
        vec_push(v_clone, 0)
    });
    t.join().unwrap();
    for i in v.lock().unwrap().iter_mut() {
        println!("poo {}", i);
    }
}
fn vec_push(v: Arc<Mutex<Vec<u8>>>, n: u8) {
    v.lock().unwrap().push(n);
}
You will probably also want to join the spawned thread, which is why its JoinHandle is bound to a name (t) above.
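If the worker thread only needs the vector for a bounded time, a scoped thread is an alternative to Arc<Mutex<...>>. The sketch below assumes Rust 1.63 or later, where std::thread::scope is available; the fill_in_thread helper is hypothetical and only there to make the example self-contained.

```rust
use std::thread;

fn vec_push(v: &mut Vec<u8>, n: u8) {
    v.push(n);
}

// Hypothetical helper: fill a vector from a scoped thread and hand it back.
fn fill_in_thread() -> Vec<u8> {
    let mut v: Vec<u8> = Vec::new();
    thread::scope(|s| {
        // The scoped thread may borrow `v` mutably because `scope`
        // joins every spawned thread before it returns.
        s.spawn(|| vec_push(&mut v, 0));
    });
    // The mutable borrow has ended, so this thread can use `v` again.
    v
}

fn main() {
    for i in fill_in_thread().iter() {
        println!("value {}", i);
    }
}
```

This works only because a single thread holds the mutable borrow at any time. As soon as two threads need to push concurrently, a Mutex (or a channel) is needed again.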

Why is this Rust code for waiting for a thread to finish not working? [duplicate]

This question already has answers here:
Joining a thread in a method that takes `&mut self` (like drop) results in "cannot move out of borrowed content"
(4 answers)
Closed 1 year ago.
I have some multi-threaded code that's giving me trouble. This is as simple as I can reproduce it:
use std::thread;
use std::time;
use std::sync::{Arc, atomic::{Ordering, AtomicBool}};
use std::ops::Drop;
struct Container {
    // Join Handle for a thread
    th: Option<thread::JoinHandle<()>>,
    // Gets set to true when we want the thread to exit
    stop_thread: Arc<AtomicBool>,
}
impl Container {
    fn new() -> Self {
        // Create new instance
        let mut inst = Self {
            th: None,
            stop_thread: Arc::new(AtomicBool::new(false)),
        };
        let stop_thread = inst.stop_thread.clone();
        // Start a new thread that does some work
        let t = thread::spawn(move || {
            // Keep doing work until stop_thread gets set to true
            while !stop_thread.load(Ordering::SeqCst) {
                println!("Doing stuff...");
                thread::sleep(time::Duration::from_secs(1));
            }
            println!("Thread exited");
        });
        inst.th = Some(t);
        inst
    }
}
impl Drop for Container {
    fn drop(&mut self) {
        self.stop_thread.store(true, Ordering::SeqCst);
        if let Some(t) = self.th {
            t.join().unwrap();
        }
    }
}
fn main() {
    let c = Container::new();
    thread::sleep(time::Duration::from_secs(3));
    drop(c);
}
The idea is that when a new instance of the Container struct is created, a background thread is started that does something. It keeps running until the instance is destroyed, at which point, I need the thread to be notified that it needs to exit. I also need to actually wait for the thread to exit before proceeding.
Everything works great, except for the code in the drop function. Rust is unhappy with if let Some(t) = self.th. It says:
error[E0507]: cannot move out of `self.th.0` which is behind a mutable reference
--> src/main.rs:45:26
|
45 | if let Some(t) = self.th {
| - ^^^^^^^ help: consider borrowing here: `&self.th`
| |
| data moved here
| move occurs because `t` has type `JoinHandle<()>`, which does not implement the `Copy` trait
Why can't I do this? What is self.th.0?
When I try to take Rust's suggestion, and do if let Some(t) = &self.th instead, it still doesn't compile:
error[E0507]: cannot move out of `*t` which is behind a shared reference
--> src/main.rs:46:13
|
46 | t.join().unwrap();
| ^ move occurs because `*t` has type `JoinHandle<()>`, which does not implement the `Copy` trait
What am I doing wrong?
As described in this answer (linked by Rabbid76), you can work around this by calling .take() on the Option:
impl Drop for Container {
    fn drop(&mut self) {
        self.stop_thread.store(true, Ordering::SeqCst);
        if let Some(t) = self.th.take() {
            t.join().unwrap();
        }
    }
}
That said, consider whether blocking on another thread inside a Drop implementation is a good idea, as explained here.
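To see why take() helps, here is a small sketch of what it does in isolation. Holder and join_once are hypothetical names, not part of the question's code. Option::take replaces the field with None and returns the previous value by ownership, so nothing is moved out from behind the &mut self reference. (As far as I can tell, self.th.0 in the error message is the compiler's name for the payload of the Some variant.)

```rust
use std::thread;

// Hypothetical stand-in for `Container`: it only holds the join handle.
struct Holder {
    th: Option<thread::JoinHandle<u32>>,
}

impl Holder {
    // Works through `&mut self`: `take` swaps `None` into the field and
    // returns the old value, which can then be consumed by `join`.
    fn join_once(&mut self) -> Option<u32> {
        self.th.take().map(|t| t.join().unwrap())
    }
}

fn main() {
    let mut h = Holder {
        th: Some(thread::spawn(|| 7)),
    };
    // The first call joins the thread; the second finds the field empty.
    println!("{:?}", h.join_once());
    println!("{:?}", h.join_once());
}
```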

What is the most idiomatic way of using struct methods that modify the struct's internal state from within a loop? [duplicate]

This question already has answers here:
What do I have to do to solve a "use of moved value" error?
(3 answers)
What types are valid for the `self` parameter of a method?
(2 answers)
Closed 3 years ago.
It seems to be fine to modify a vector on its own, but as soon as it is wrapped in a struct, it can not be mutated by a method on the struct.
I've created a very simplified version of my use case below, in two versions, one with just a vector, and one with a struct.
Why is it that this code is fine:
struct Task {
    foo: String,
}
fn main() {
    let mut list_of_tasks = Vec::new();
    loop {
        list_of_tasks.push(Task {
            foo: String::from("bar"),
        });
    }
}
But this is not:
struct Task {
    foo: String,
}
struct ListOfTasks(pub Vec<Task>);
impl ListOfTasks {
    fn push(mut self, task: Task) {
        self.0.push(task);
    }
}
fn main() {
    let list_of_tasks = ListOfTasks(Vec::new());
    loop {
        list_of_tasks.push(Task {
            foo: String::from("bar"),
        });
    }
}
The second example fails with:
error[E0382]: use of moved value: `list_of_tasks`
--> src/main.rs:17:9
|
14 | let list_of_tasks = ListOfTasks(Vec::new());
| ------------- move occurs because `list_of_tasks` has type `ListOfTasks`, which does not implement the `Copy` trait
...
17 | list_of_tasks.push(Task {
| ^^^^^^^^^^^^^ value moved here, in previous iteration of loop
I think I'm not understanding something about moving a struct that uses mut self, but can't find any obvious examples online of how to approach this.
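A sketch of the usual fix, for what it's worth: a method declared with mut self takes the struct by value, so the first call consumes list_of_tasks and nothing is left for the next iteration. Declaring the receiver as &mut self borrows the struct instead. The bounded loop below is an assumption made only so that the example terminates.

```rust
struct Task {
    foo: String,
}

struct ListOfTasks(pub Vec<Task>);

impl ListOfTasks {
    // `&mut self` borrows the list for the duration of the call instead of
    // consuming it, so the caller can keep using it afterwards.
    fn push(&mut self, task: Task) {
        self.0.push(task);
    }
}

fn main() {
    // The binding must be `mut` so that a mutable borrow can be taken.
    let mut list_of_tasks = ListOfTasks(Vec::new());
    // Bounded loop (an assumption for the demo) so that the program terminates.
    for _ in 0..3 {
        list_of_tasks.push(Task {
            foo: String::from("bar"),
        });
    }
    println!(
        "{} tasks, first foo = {}",
        list_of_tasks.0.len(),
        list_of_tasks.0[0].foo
    );
}
```

This mirrors Vec::push itself, which also takes &mut self; that is why the plain-vector version compiles.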

How can I send non-static data to a thread in Rust and is it needed in this example?

I am trying to fire up a new thread using some heap data in Rust and I am getting a bunch of errors that stem from the need of the data to have 'static lifetime. I've worked my way backwards up my program but hit a problem.
use std::sync::Arc;
use std::thread;
struct ThreadData {
    vector_of_strings: Vec<String>,
    terms: Vec<&'static str>,
    quotient: usize,
}
fn perform_search(slice: &[String], terms: &[&str]) {
    /* ... */
}
fn threaded_search(td_arc: &Arc<ThreadData>) {
    let no_of_lines = td_arc.vector_of_strings.len();
    let new_tda1 = td_arc.clone();
    let strings_as_slice1 = new_tda1.vector_of_strings.as_slice();
    thread::spawn(move || {
        perform_search(&strings_as_slice1[0..td_arc.quotient], &new_tda1.terms);
    });
}
fn main() {
    let td = ThreadData {
        vector_of_strings: Vec::new(),
        terms: Vec::new(),
        quotient: 0,
    };
    let td_arc = Arc::new(td);
    threaded_search(&td_arc);
}
Error:
error[E0621]: explicit lifetime required in the type of `td_arc`
--> src/main.rs:20:5
|
14 | fn threaded_search(td_arc: &Arc<ThreadData>) {
| ---------------- help: add explicit lifetime `'static` to the type of `td_arc`: `&'static std::sync::Arc<ThreadData>`
...
20 | thread::spawn(move || {
| ^^^^^^^^^^^^^ lifetime `'static` required
The error about 'static is because the new thread created within thread::spawn may outlive the invocation of threaded_search during which the thread is initially created, which means the thread must not be permitted to use any local variables from threaded_search with a lifetime shorter than 'static.
In your code the new thread is referring to strings_as_slice1 and td_arc.
Generally with thread::spawn and Arc you will want to move ownership of one reference count into the thread and have the thread access whatever it needs through that reference counted pointer rather than from the enclosing short-lived scope directly.
fn threaded_search(td_arc: &Arc<ThreadData>) {
    // Increment reference count that we can move into the new thread.
    let td_arc = td_arc.clone();
    thread::spawn(move || {
        perform_search(&td_arc.vector_of_strings[0..td_arc.quotient], &td_arc.terms);
    });
}
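For completeness, here is a self-contained sketch of the same pattern that can be compiled and run. The body of perform_search (counting lines that contain any term) and the JoinHandle return value are assumptions added for illustration; the question elides the real search logic.

```rust
use std::sync::Arc;
use std::thread;

struct ThreadData {
    vector_of_strings: Vec<String>,
    terms: Vec<&'static str>,
    quotient: usize,
}

// Hypothetical search body: count the lines that contain any of the terms.
fn perform_search(slice: &[String], terms: &[&str]) -> usize {
    slice
        .iter()
        .filter(|line| terms.iter().any(|t| line.contains(*t)))
        .count()
}

fn threaded_search(td_arc: &Arc<ThreadData>) -> thread::JoinHandle<usize> {
    // The clone is owned by the closure, so the thread borrows nothing
    // from this function's stack frame.
    let td = Arc::clone(td_arc);
    thread::spawn(move || perform_search(&td.vector_of_strings[..td.quotient], &td.terms))
}

fn main() {
    let td_arc = Arc::new(ThreadData {
        vector_of_strings: vec!["foo bar".to_string(), "baz".to_string(), "bar".to_string()],
        terms: vec!["bar"],
        quotient: 2,
    });
    let hits = threaded_search(&td_arc).join().unwrap();
    println!("hits in the first {} lines: {}", td_arc.quotient, hits);
}
```

Returning the JoinHandle lets the caller decide when to wait for the result, and the caller's own Arc stays usable because only the clone was moved.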

How can multiple threads share an iterator?

I've been working on a function that copies a bunch of files from a source to a destination using Rust and threads. I'm having trouble making the threads share the iterator. I'm still not used to the borrowing system:
extern crate libc;
extern crate num_cpus;
use libc::{c_char, size_t};
use std::thread;
use std::fs::copy;
fn python_str_array_2_str_vec<T, U, V>(_: T, _: U) -> V {
    unimplemented!()
}
#[no_mangle]
pub extern "C" fn copyFiles(
    sources: *const *const c_char,
    destinies: *const *const c_char,
    array_len: size_t,
) {
    let src: Vec<&str> = python_str_array_2_str_vec(sources, array_len);
    let dst: Vec<&str> = python_str_array_2_str_vec(destinies, array_len);
    let mut iter = src.iter().zip(dst);
    let num_threads = num_cpus::get();
    let threads = (0..num_threads).map(|_| {
        thread::spawn(|| while let Some((s, d)) = iter.next() {
            copy(s, d);
        })
    });
    for t in threads {
        t.join();
    }
}
fn main() {}
I'm getting this compilation error that I have not been able to solve:
error[E0597]: `src` does not live long enough
--> src/main.rs:20:20
|
20 | let mut iter = src.iter().zip(dst);
| ^^^ does not live long enough
...
30 | }
| - borrowed value only lives until here
|
= note: borrowed value must be valid for the static lifetime...
error[E0373]: closure may outlive the current function, but it borrows `**iter`, which is owned by the current function
--> src/main.rs:23:23
|
23 | thread::spawn(|| while let Some((s, d)) = iter.next() {
| ^^ ---- `**iter` is borrowed here
| |
| may outlive borrowed value `**iter`
|
help: to force the closure to take ownership of `**iter` (and any other referenced variables), use the `move` keyword, as shown:
| thread::spawn(move || while let Some((s, d)) = iter.next() {
I've seen the following questions already:
Value does not live long enough when using multiple threads
I'm not using chunks; I would like to share a single iterator among the threads, although splitting the data into chunks and passing those to the threads would be the classic solution.
Unable to send a &str between threads because it does not live long enough
I've seen some of the answers to use channels to communicate with the threads, but I'm not quite sure about using them. There should be an easier way of sharing just one object through threads.
Why doesn't a local variable live long enough for thread::scoped
This got my attention. scoped is supposed to fix my error, but since it is only in the unstable channel I would like to see if there is another way of doing it using just spawn.
Can someone explain how I should fix the lifetimes so that the iterator can be accessed from the threads?
Here's a minimal, reproducible example of your problem:
use std::thread;
fn main() {
    let src = vec!["one"];
    let dst = vec!["two"];
    let mut iter = src.iter().zip(dst);
    thread::spawn(|| {
        while let Some((s, d)) = iter.next() {
            println!("{} -> {}", s, d);
        }
    });
}
There are multiple related problems:
The iterator lives on the stack and the thread's closure takes a reference to it.
The closure takes a mutable reference to the iterator.
The iterator itself has a reference to a Vec that lives on the stack.
The Vec itself has references to string slices that likely live on the stack but are not guaranteed to live longer than the thread either way.
Said another way, the Rust compiler has stopped you from executing four separate pieces of memory unsafety.
The main thing to recognize is that any thread you spawn might outlive the place where you spawned it. Even if you call join right away, the compiler cannot statically verify that this will happen, so it has to take the conservative path. This is the point of scoped threads: they guarantee that the thread exits before the stack frame in which it was started.
Additionally, you are attempting to use a mutable reference in multiple concurrent threads. There's zero guarantee that the iterator (or any of the iterators it was built on) can be safely called in parallel. It's entirely possible that two threads call next at exactly the same time. The two pieces of code run in parallel and write to the same memory address. One thread writes half of the data and the other thread writes the other half, and now your program crashes at some arbitrary point in the future.
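If sharing one iterator between plain spawn threads really is the goal, both problems have to be addressed at once: the data must be owned (no borrows from the stack), and every call to next must be synchronized. A minimal sketch, assuming the paths can be turned into owned Strings; drain_shared and its per-worker counts are hypothetical and only illustrate the idea:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `workers` threads pull pairs from one shared iterator.
// Returns how many pairs each worker processed.
fn drain_shared(pairs: Vec<(String, String)>, workers: usize) -> Vec<usize> {
    // Owned data behind a mutex, shared via `Arc`.
    let iter = Arc::new(Mutex::new(pairs.into_iter()));
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let iter = Arc::clone(&iter);
            thread::spawn(move || {
                let mut done = 0usize;
                loop {
                    // Hold the lock only while calling `next`, not while working.
                    let item = iter.lock().unwrap().next();
                    match item {
                        Some((s, d)) => {
                            println!("worker {}: {} -> {}", id, s, d);
                            done += 1;
                        }
                        None => break,
                    }
                }
                done
            })
        })
        // Collect first so that every thread is spawned before any is joined.
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let pairs = vec![
        ("one".to_string(), "two".to_string()),
        ("alpha".to_string(), "beta".to_string()),
    ];
    let counts = drain_shared(pairs, 2);
    println!("processed per worker: {:?}", counts);
}
```

Note that the item is bound with a separate let before the match. Writing while let Some(..) = iter.lock().unwrap().next() would keep the lock guard alive for the whole loop body and serialize the workers.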
Using a tool like crossbeam, your code would look something like:
use crossbeam; // 0.7.3
fn main() {
    let src = vec!["one"];
    let dst = vec!["two"];
    let mut iter = src.iter().zip(dst);
    while let Some((s, d)) = iter.next() {
        crossbeam::scope(|scope| {
            scope.spawn(|_| {
                println!("{} -> {}", s, d);
            });
        })
        .unwrap();
    }
}
As mentioned, this will only spawn one thread at a time, waiting for it to finish. An alternative to get more parallelism (the usual point of this exercise) is to interchange the calls to next and spawn. This requires transferring ownership of s and d to the thread via the move keyword:
use crossbeam; // 0.7.3
fn main() {
    let src = vec!["one", "alpha"];
    let dst = vec!["two", "beta"];
    let mut iter = src.iter().zip(dst);
    crossbeam::scope(|scope| {
        while let Some((s, d)) = iter.next() {
            scope.spawn(move |_| {
                println!("{} -> {}", s, d);
            });
        }
    })
    .unwrap();
}
If you add a sleep call inside the spawn, you can see the threads run in parallel.
I'd have written it using a for loop, however:
let iter = src.iter().zip(dst);
crossbeam::scope(|scope| {
    for (s, d) in iter {
        scope.spawn(move |_| {
            println!("{} -> {}", s, d);
        });
    }
}).unwrap();
In the end, the iterator is exercised on the current thread, and each value returned from the iterator is then handed off to a new thread. The scope guarantees that the new threads exit before the captured references become invalid.
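On newer toolchains the same shape can be written without an external crate. The sketch below assumes Rust 1.63 or later, where std::thread::scope is available in the standard library; describe_pairs is a hypothetical helper that returns the formatted lines instead of printing them from the threads.

```rust
use std::thread;

// Hypothetical helper: format each pair on its own scoped thread.
fn describe_pairs(src: &[&str], dst: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        let handles: Vec<_> = src
            .iter()
            .zip(dst)
            .map(|(s, d)| scope.spawn(move || format!("{} -> {}", s, d)))
            .collect();
        // Joining in spawn order keeps the results in input order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let src = vec!["one", "alpha"];
    let dst = vec!["two", "beta"];
    for line in describe_pairs(&src, &dst) {
        println!("{}", line);
    }
}
```

As with the crossbeam version, the iterator itself is only driven on the calling thread; the spawned threads receive individual items.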
You may be interested in Rayon, a crate that allows easy parallelization of certain types of iterators.
See also:
How can I pass a reference to a stack variable to a thread?
Lifetime troubles sharing references between threads
How do I use static lifetimes with threads?
Thread references require static lifetime?
Lifetime woes when using threads
Cannot call a function in a spawned thread because it "does not fulfill the required lifetime"
