Can I safely multithread something which isn't meant to be multithreaded? - multithreading

I'm using a library which isn't designed around multithreading (Cursive).
While it is shared between threads it will sit behind a mutex, so it can never be used from two threads at the same time.
What is Rust trying to protect me against, and can I do anything about it?
For reference, my sample code is:
extern crate cursive;
use cursive::Cursive;
use std::thread;
use std::sync::{Mutex,Arc};
fn main() {
    let mut siv = Arc::new(Mutex::new(Cursive::default()));
    let copy_siv = siv.clone();
    thread::spawn(move || {
        let mut new_siv = copy_siv.lock().unwrap();
    });
    (*(siv.lock().unwrap())).run();
}
The compiler complains at thread::spawn:
error[E0277]: `(dyn cursive::traits::View + 'static)` cannot be sent between threads safely
--> src/main.rs:16:5
|
16 | thread::spawn(move || {
| ^^^^^^^^^^^^^ `(dyn cursive::traits::View + 'static)` cannot be sent between threads safely
|
= help: the trait `std::marker::Send` is not implemented for `(dyn cursive::traits::View + 'static)`

What is rust trying to protect me against [...]
Something in what you're sending between threads contains a dyn cursive::traits::View trait object. This trait object is not Send. It needs to be Send because by putting it inside an Arc, you can no longer predict which thread will be responsible for destroying it, so it must be safe to transfer ownership between threads.
[...] can I do anything about it?
You haven't provided enough context to say for certain, but probably not.
You could maybe try using a plain borrowed reference (plus a threading library that supports scoped threads), but I can't say if that will work for you.
Why wouldn't Mutex make it sync? Isn't that the point of Mutex?
No. Mutex only manages exclusive access to a value. It can make a value that is already Send usable from several threads (Sync), but it cannot make a value safe to hand to another thread when the type does not already allow that. The only thing that can make a type thread-safe is the type in question.
Making a guess: the library was written such that it does not require thread safety, thus Arc cannot assume it's thread-safe, so it refuses to compile.

I don't know what your actual code is, but the following example replicates the exact error you have:
use std::thread;
use std::sync::{Mutex,Arc};
struct Cursive;
impl Default for Cursive {
    fn default() -> Self {
        Cursive
    }
}
trait View{
    fn run(&self);
}
impl View for Cursive{
    fn run(&self){}
}
fn main() {
    let mut siv:Arc<Mutex<dyn View>> = Arc::new(Mutex::new(Cursive::default()));
    let copy_siv = siv.clone();
    thread::spawn(move || {
        let mut new_siv = copy_siv.lock().unwrap();
    });
    (*(siv.lock().unwrap())).run();
}
You can try it in playground. The error message:
error[E0277]: `dyn View` cannot be sent between threads safely
--> src/main.rs:21:5
|
21 | thread::spawn(move || {
| ^^^^^^^^^^^^^ `dyn View` cannot be sent between threads safely
|
= help: the trait `std::marker::Send` is not implemented for `dyn View`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<dyn View>`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Arc<std::sync::Mutex<dyn View>>`
= note: required because it appears within the type `[closure#src/main.rs:21:19: 23:6 copy_siv:std::sync::Arc<std::sync::Mutex<dyn View>>]`
= note: required by `std::thread::spawn`
Analysis and Solution
The error message explains everything to an experienced user. For those new to the language: siv is a reference-counted, mutex-protected trait object. The object is only known to be a View, so the compiler has no evidence as to whether or not it is Send. For the code to work, however:
Arc<Mutex<T>> must be Send, because you are sending it to another thread. Therefore:
Mutex<T> must be Send and Sync, because Arc requires the reference-counted object to be Send and Sync. Therefore:
T must be Send, because the same object will be accessed from different threads without any further protection.
So this code does not compile. The solution is to state the Send bound on the trait object:
let mut siv:Arc<Mutex<dyn View + Send>> = ...
You can try it yourself!
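To make the fix concrete, here is a minimal self-contained sketch of the repaired example. The Dummy type, the run_on_worker helper and the string returned by run are made up for illustration; only the `dyn View + Send` bound comes from the answer above.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

trait View {
    fn run(&self) -> &'static str;
}

// Hypothetical view type standing in for the real one.
struct Dummy;

impl View for Dummy {
    fn run(&self) -> &'static str {
        "ran"
    }
}

// Clones the Arc, locks the mutex on a worker thread and calls run() there.
fn run_on_worker(view: Arc<Mutex<dyn View + Send>>) -> &'static str {
    let worker_copy = Arc::clone(&view);
    let handle = thread::spawn(move || {
        let guard = worker_copy.lock().unwrap();
        guard.run()
    });
    handle.join().unwrap()
}

fn main() {
    // The `+ Send` bound is what lets the Arc cross the thread boundary.
    let siv: Arc<Mutex<dyn View + Send>> = Arc::new(Mutex::new(Dummy));
    println!("{}", run_on_worker(siv));
}
```

This only compiles when the concrete type behind the trait object really is Send. If the library's view type is not, the bound cannot be added and the object has to stay on one thread.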
Mutex<T>: Send + Sync requires T: Send
To see why, first ask a question: what cannot be Send?
One example is a reference to something with unsynchronized interior mutability, such as &Cell<T>. If such a reference were Send, several threads could mutate the value through it at the same time and cause a data race.
Now suppose you have a Mutex<&Cell<T>>. Because the protected thing is only a reference, not the Cell itself, the Cell may still be reachable from somewhere unprotected. The compiler therefore cannot conclude that calling set() through the locked reference is free of data races, so it prevents the Mutex from being Send.
What if I have to ...
So we see that &Cell<T> is not Send, and even when it is protected by a Mutex we still cannot use it in another thread. What can we do then?
This problem is not new. Almost all UI APIs have it: the UI components are created in the UI thread, so you cannot access them from any other thread. Instead, you have to schedule a routine to be run in the UI thread and let the UI thread access the component.
Failing to do so in other languages (.NET, Java...) throws an exception at best and causes undefined behavior at worst. Once again, Rust turns such violations into compile errors without any special treatment (&Cell<T> has nothing to do with UI), which is really GOOD!
So, if this is what you want to do, you have to do the same thing: access the view object in the UI thread only. How to do so depends on the API you are using.
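The "schedule a routine on the UI thread" idea can be sketched with a channel of boxed closures. Everything here (Ui, UiJob, run_ui) is a hypothetical stand-in, not the API of any particular UI library: the worker threads never touch the Ui, they only send jobs, and the thread that owns the Ui runs them.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;

// Hypothetical UI object; the Rc field makes it neither Send nor Sync.
struct Ui {
    clicks: Rc<Cell<u32>>,
}

// A job is a Send closure that the UI thread runs against its own Ui.
type UiJob = Box<dyn FnOnce(&Ui) + Send>;

fn run_ui(workers: u32) -> u32 {
    let ui = Ui { clicks: Rc::new(Cell::new(0)) };
    let (tx, rx) = mpsc::channel::<UiJob>();

    let mut handles = Vec::new();
    for _ in 0..workers {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // The worker describes what should happen; it never sees the Ui.
            let job: UiJob = Box::new(|ui: &Ui| ui.clicks.set(ui.clicks.get() + 1));
            tx.send(job).unwrap();
        }));
    }
    drop(tx); // the loop below ends once every sender is gone

    // "UI thread" loop: the only place where the Ui is accessed.
    for job in rx {
        job(&ui);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    ui.clicks.get()
}

fn main() {
    println!("clicks = {}", run_ui(3));
}
```

Real UI libraries usually offer their own way to post callbacks to the UI thread, so check the API you are using before hand-rolling a queue like this.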

Related

Basic multithreading: How to use struct without Send trait in thread move closure

I am learning Rust and trying to make a simple egui GUI app that polls a telnet host in a separate thread, to avoid the main thread locking the GUI. I am using the telnet crate for the client.
Here is the code where I am having issues:
struct TelnetApp {
    gui_status: AppView,
    telnet_ip: String,
    telnet_port: String,
    telnet_connect_failed_display: bool,
    telnet_connect_failed_message: String,
    telnet_client: Arc<Mutex<Option<Telnet>>>,
    telnet_result : Arc<Mutex<String>>,
}
impl TelnetApp {
    // Called from gui after successfully connecting to Telnet host
    fn start_telnet_loop(&mut self) {
        let arc_telnet_result = self.telnet_result.clone();
        let arc_telnet_client = self.telnet_client.clone();
        let time = SystemTime::now();
        thread::spawn(move || { // <---- ERROR: `(dyn Stream + 'static)` cannot be sent between threads safely
            loop {
                thread::sleep(Duration::from_millis(1000));
                arc_telnet_client.lock().unwrap().unwrap().read(); // <--- This line causes error
                {
                    // I can read and modify the String, presumably because it implements Send?
                    *arc_telnet_result.lock().unwrap() = String::from(format!("Time {}", time.elapsed().unwrap().as_micros()));
                }
            }
        });
    }
}
As I marked with a comment, the thread::spawn line gives me an error, which seems to stem from the fact that arc_telnet_client does not implement the Send trait, as the error goes away when I remove the line:
arc_telnet_client.lock().unwrap().unwrap().read()
I read that wrapping in Arc<Mutex<>> is the recommended way to handle multithreading, but this still does not provide the Send trait.
Why is my approach not allowed, even when I am using a mutex to lock it? How would you implement a simple polling thread like this?
Why is my approach not allowed, even when I am using a mutex to lock it?
Because !Send means the object is not safe to move to a different thread at all. It doesn't matter how you protect it; it's just not valid.
For instance, it might be using thread-local data, kernel task resources, some sort of unprotected global or shared state, or affinity mechanics. No matter how or why, if it's !Send, it can only be accessed from and by the thread where it was created, whatever you wrap it in. An example of that is MutexGuard:
impl<T: ?Sized> !Send for MutexGuard<'_, T>
That is because it's common for mutexes to only be unlockable from the thread which locked them (doing otherwise is UB in POSIX, and the release fails on Windows).
As its trait implementations describe, a Mutex is Sync (so it can be shared between threads) if the wrapped object is Send:
impl<T: ?Sized + Send> Sync for Mutex<T>
That is because, semantically, having a mutex around a Send value is equivalent to moving the wrapped object from one thread to the next through a channel (the wrapped object is only ever used from one thread at a time).
RwLock, on the other hand, also requires Sync, since the wrapped object can be accessed concurrently from different threads (in read mode):
impl<T: ?Sized + Send + Sync> Sync for RwLock<T>
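As for how a polling thread could still be written: one arrangement that fits these rules is to create the !Send client inside the polling thread, so it never crosses a thread boundary, and to pass only Send data (such as String) back over a channel. The sketch below uses a hypothetical FakeClient in place of the telnet crate's client, and the poll function and 10 ms interval are likewise invented for illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for a client that is !Send; the Rc field makes it so.
struct FakeClient {
    reads: Rc<Cell<u32>>,
}

impl FakeClient {
    fn connect() -> Self {
        FakeClient { reads: Rc::new(Cell::new(0)) }
    }

    fn read(&self) -> String {
        self.reads.set(self.reads.get() + 1);
        format!("reply {}", self.reads.get())
    }
}

// Polls `times` times on a worker thread and returns everything it read.
fn poll(times: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // The client is created here and never leaves this thread.
        let client = FakeClient::connect();
        for _ in 0..times {
            thread::sleep(Duration::from_millis(10));
            // Only the String result crosses the thread boundary.
            if tx.send(client.read()).is_err() {
                break; // receiver is gone, stop polling
            }
        }
    });
    // The iterator ends when the worker drops its sender.
    let replies: Vec<String> = rx.iter().collect();
    handle.join().unwrap();
    replies
}

fn main() {
    for reply in poll(3) {
        println!("{}", reply);
    }
}
```

In a GUI the receiving side would more likely call try_recv once per frame instead of blocking on the iterator, and connection parameters (plain Strings) would be moved into the closure.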

Can the borrow checker know when an Arc is "released"? Can a 'static lifetime be granted temporarily?

I'm trying to speed up a computationally-heavy Rust function by making it concurrent using only the built-in thread support. In particular, I want to alternate between quick single-threaded phases (where the main thread has mutable access to a big structure) and concurrent phases (where many worker threads run with read-only access to the structure). I don't want to make extra copies of the structure or force it to be 'static. Where I'm having trouble is convincing the borrow checker that the worker threads have finished.
Ignoring the borrow checker, an Arc reference seems to do all that is needed. The reference count in the Arc increases with the .clone() for each worker, then decreases as the workers conclude and I join all the worker threads. If (and only if) the Arc reference count is 1, it should be safe for the main thread to resume. The borrow checker, however, doesn't seem to know about Arc reference counts, and insists that my structure needs to be 'static.
Here's some sample code which works fine if I don't use threads, but won't compile if I switch the comments to enable the multi-threaded case.
struct BigStruct {
    data: Vec<usize>
    // Lots more
}
pub fn main() {
    let ref_bigstruct = &mut BigStruct { data: Vec::new() };
    for i in 0..3 {
        ref_bigstruct.data.push(i); // Phase where main thread has write access
        run_threads(ref_bigstruct); // Phase where worker threads have read-only access
    }
}
fn run_threads(ref_bigstruct: &BigStruct) {
    let arc_bigstruct = Arc::new(ref_bigstruct);
    {
        let arc_clone_for_worker = arc_bigstruct.clone();
        // SINGLE-THREADED WORKS:
        worker_thread(arc_clone_for_worker);
        // MULTI-THREADED DOES NOT COMPILE:
        // let handle = thread::spawn(move || { worker_thread(arc_clone_for_worker); } );
        // handle.join();
    }
    assert!(Arc::strong_count(&arc_bigstruct) == 1);
    println!("??? How can I tell the borrow checker that all borrows of ref_bigstruct are done?")
}
fn worker_thread(my_struct: Arc<&BigStruct>) {
    println!(" worker says len()={}", my_struct.data.len());
}
I'm still learning about Rust lifetimes, but what I think (fear?) I need is an operation that will take an ordinary (not 'static) reference to my structure and give me an Arc that I can clone into immutable references with a 'static lifetime for use by the workers. Once all the worker Arc references are dropped, the borrow checker needs to allow my thread-spawning function to return. For safety, I assume this would panic if the reference count is >1. While this seems like it would generally conform with Rust's safety requirements, I don't see how to do it.
The underlying problem is not that the borrow checker fails to follow Arc, and the solution is not to use Arc. The problem is that the borrow checker cannot see that the only reason a spawned thread's data must be 'static is that the thread may outlive the spawning thread, so joining it immediately would make the borrow fine.
The solution is to use scoped threads, that is, threads that let you use non-'static data because they are always joined before the scope ends, so the spawned thread cannot outlive the data it borrows. When this answer was written, the standard library's scoped threads were still unstable; they have since been stabilized as std::thread::scope (Rust 1.63).
On an older toolchain, if you insist on not using crates for some reason, you have no choice but to use unsafe code (don't, really). If you can use external crates, the well-known crossbeam crate offers the same thing with its crossbeam::scope function.
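On current Rust the standard library provides scoped threads as std::thread::scope (stable since Rust 1.63), so the alternating write/read phases from the question can be sketched without Arc or external crates. The worker count and the returned Vec of observed lengths are additions made only so the result can be checked.

```rust
use std::thread;

struct BigStruct {
    data: Vec<usize>,
    // Lots more
}

// Read-only phase: every worker borrows `big`, and all of them are joined
// before `thread::scope` returns, so no 'static bound is needed.
fn run_threads(big: &BigStruct, workers: usize) -> Vec<usize> {
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..workers {
            handles.push(s.spawn(|| big.data.len()));
        }
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<usize>>()
    })
}

fn main() {
    let mut big = BigStruct { data: Vec::new() };
    for i in 0..3 {
        big.data.push(i); // write phase: main thread has exclusive access
        let lens = run_threads(&big, 2); // read phase: shared access only
        println!("workers saw len()={:?}", lens);
    }
}
```

Because the scope joins every thread before returning, the shared borrow of big ends at the end of run_threads and the next push is accepted by the borrow checker.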
With Arc<T> in Rust, the T can only be reached through shared references, so it is immutable by default. This means that in order to use Arc to let threads access data that is going to change, you also need to wrap the data in a type that has interior mutability.
Rust provides a type that is especially suited to either a single writer or multiple parallel readers, called RwLock.
So for your simple example, this would probably look something like this:
use std::{sync::{Arc, RwLock}, thread};
struct BigStruct {
    data: Vec<usize>
    // Lots more
}
pub fn main() {
    let arc_bigstruct = Arc::new(RwLock::new(BigStruct { data: Vec::new() }));
    for i in 0..3 {
        arc_bigstruct.write().unwrap().data.push(i); // Phase where main thread has write access
        run_threads(&arc_bigstruct); // Phase where worker threads have read-only access
    }
}
fn run_threads(ref_bigstruct: &Arc<RwLock<BigStruct>>) {
    {
        let arc_clone_for_worker = ref_bigstruct.clone();
        //MULTI-THREADED
        let handle = thread::spawn(move || { worker_thread(&arc_clone_for_worker); } );
        handle.join().unwrap();
    }
    assert!(Arc::strong_count(&ref_bigstruct) == 1);
}
fn worker_thread(my_struct: &Arc<RwLock<BigStruct>>) {
    println!(" worker says len()={}", my_struct.read().unwrap().data.len());
}
Which outputs
worker says len()=1
worker says len()=2
worker says len()=3
As for your question, the borrow checker does not know when an Arc is released, as far as I know. The references are counted at runtime.

How to defer lifetime checking to runtime

I'm trying to pass a non-static closure into tokio. Obviously this doesn't work. Is there a way to make sure the lifetimes are appropriate at runtime? Here's what I tried:
Attempt with Arc
In order to not pass the closure directly into tokio, I put it into the struct that manages our timers:
type Delays<'l, K: Eq + Hash + Debug + Copy + Send> = HashMap<K, Box<dyn FnOnce() + 'l + Send>>;
pub struct Timers<'l, K: Eq + Hash + Debug + Clone + Send> {
    delays: Arc<Mutex<Delays<'l, K>>>,
}
The impl for that struct lets us easily add and remove timers. My plan was to somehow pass a static closure into tokio, by only moving a Weak reference to the mutexed hashmap:
// remember handler function
delays.insert(key.clone(), Box::new(func));
// create a weak reference to the delay map to pass into the closure
let weak_handlers = Arc::downgrade(&self.delays);
// task that runs after a delay
let task = Delay::new(Instant::now() + delay)
    .map_err(|e| warn!("Tokio timer error: {}", e)) // Map the error type to ()
    .and_then(move |_| {
        // get the handler from the table, of which we have only a weak ref.
        let handler = Weak::upgrade(&weak_handlers)
            .ok_or(())? // If the Arc was dropped, return an error and thus abort the future
            .lock()
            .remove(&key)
            .ok_or(())?; // If the handler isn't there anymore, we can abort as well.
        // call the handler
        handler();
        Ok(())
    });
So with the Weak we make sure that we abort if the hash table was dropped.
It's important to know that the lifetime 'l is the same as that of the Timers struct, but how can I tell the compiler? Also, I think the real problem is that Weak<T>: 'static is not satisfied.
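The runtime check that the Weak provides can be seen in isolation in the sketch below. It assumes 'static handlers and std's Mutex, so it does not solve the 'l lifetime problem from the question. The Handlers alias, the fire function, the u32 keys and the returned values are hypothetical and chosen only so the behavior can be observed.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

// Hypothetical handler table: each handler runs once and returns a number.
type Handlers = HashMap<u32, Box<dyn FnOnce() -> u32 + Send>>;

// Runs the handler stored under `key`, if both the table and the handler still exist.
fn fire(weak_handlers: &Weak<Mutex<Handlers>>, key: u32) -> Option<u32> {
    // None if the owning Arc has already been dropped.
    let table = weak_handlers.upgrade()?;
    // None if the handler was removed or has already run.
    let handler = table.lock().unwrap().remove(&key)?;
    Some(handler())
}

fn main() {
    let delays: Arc<Mutex<Handlers>> = Arc::new(Mutex::new(HashMap::new()));
    delays.lock().unwrap().insert(1, Box::new(|| 7));
    let weak_handlers = Arc::downgrade(&delays);

    println!("{:?}", fire(&weak_handlers, 1)); // handler runs: Some(7)
    println!("{:?}", fire(&weak_handlers, 1)); // already consumed: None
    drop(delays);
    println!("{:?}", fire(&weak_handlers, 2)); // table dropped: None
}
```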
Writing it myself using unsafe
I tried building something similar to Sc to achieve this. First, is Sc going to work here? I read the code and understand it. I can't see any obvious problems - though it was kind of hard to come to the conclusion that the map method is actually safe, because the reference will definitely be dropped at the end of the map and not stored somewhere.
So I tried to adapt Sc for my needs. This is only a rough outline and I know there are some issues with this, but I believe something like this should be possible:
Have a struct Doa<T> that will own T
Doa::ref(&self) -> DoaRef<T> will produce an opaque object that internally contains a *const u8 to the owned object.
DoaRef doesn't contain references with non-static lifetimes and thus can be passed to tokio.
Have impl<T> Drop for Doa<T> that sets that *const u8 to null
So the DoaRef can now check if the value still exists and get a reference to it.
I also tried to make sure that the lifetime of &self in ref must be longer than the lifetimes of references in T, to ensure this works only if Doa really lives longer than the object the pointer points to.
struct Doa<'t, T: 'l> { ... }
pub fn ref(&'s self) -> DoaRef<T> where 't: 'a
But then T is lifetime-constrained, and since DoaRef is parameterized over it, DoaRef: 'static doesn't hold anymore.
Or is there some crate, or maybe even something in std that can do this?

Compiler says that data cannot be shared between threads safely even though the data is wrapped within a Mutex

I'm using Rocket, which has a State that it passes to the HTTP requests. This struct contains a Mutex<DatastoreInstance> which gives access to a SQLite database and is locked with a mutex to make reads and writes safe.
pub struct DatastoreInstance {
    conn: Connection,
}
When the DatastoreInstance struct looked like this, with only a SQLite connection everything worked fine, but I then also wanted to add a transaction object within this struct:
pub struct DatastoreInstance {
    conn: Connection,
    events_transaction: Transaction,
}
This did not compile because the Transaction object needs to reference a Connection object which should have a lifetime which it is aware of.
The Connection and Transaction objects within rusqlite which I am using are defined as follows:
pub struct Connection {
    db: RefCell<InnerConnection>,
    cache: StatementCache,
    path: Option<PathBuf>,
}
pub struct Transaction<'conn> {
    conn: &'conn Connection,
    drop_behavior: DropBehavior,
}
To solve the lifetime issues I had to add these lifetime parameters to get it working:
pub struct DatastoreInstance<'a> {
    conn: Connection,
    events_transaction: Transaction<'a>,
}
This was the result and was supposed to work according to my understanding of both lifetimes and mutexes, but I now get a compiler error telling me:
`std::cell::RefCell<lru_cache::LruCache<std::string::String, rusqlite::raw_statement::RawStatement>>` cannot be shared between threads safely
|
= help: within `rusqlite::Connection`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<lru_cache::LruCache<std::string::String, rusqlite::raw_statement::RawStatement>>`
= note: required because it appears within the type `rusqlite::cache::StatementCache`
= note: required because it appears within the type `rusqlite::Connection`
= note: required because of the requirements on the impl of `std::marker::Send` for `&rusqlite::Connection`
= note: required because it appears within the type `datastore::DatastoreInstance<'_>`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<datastore::DatastoreInstance<'_>>`
= note: required because it appears within the type `endpoints::ServerState<'_>`
= note: required by `rocket::State`
According to my understanding of mutexes, this code should be valid because the whole DatastoreInstance struct is wrapped within a Mutex which should guarantee that only one thread is referencing this object at a time.
What am I missing?
Why doesn't the compiler find RefCell to be safe anymore after being within a Connection referenced within a Transaction instead of solely within a Connection?
Do I have a bad understanding of how mutexes work? Are my lifetimes invalid and somehow break read/write safety? Is the design of having the Connection and Transaction within the same struct a bad design which breaks read/write safety? Do I need to redesign my data structures somehow to make this safe? Or am I just missing something very obvious?
A Mutex is only Send or Sync if the value it contains is itself Send:
impl<T: ?Sized + Send> Send for Mutex<T>
impl<T: ?Sized + Send> Sync for Mutex<T>
A &T is only Send when T is Sync:
impl<'a, T> Send for &'a T
where
T: Sync + ?Sized,
And a RefCell is never Sync:
impl<T> !Sync for RefCell<T>
where
T: ?Sized,
As the error message states, your transaction contains a reference to a RefCell. It doesn't matter that there's a mutex, it's inherently not memory-safe to share it across threads. A simple reproduction:
use std::{cell::RefCell, sync::Mutex};
struct Connection(RefCell<i32>);
struct Transaction<'a>(&'a Connection);
fn is_send<T: Send>(_: T) {}
fn main() {
    let c = Connection(RefCell::new(42));
    let t = Transaction(&c);
    let m = Mutex::new(t);
    is_send(m);
}
error[E0277]: `std::cell::RefCell<i32>` cannot be shared between threads safely
--> src/main.rs:13:5
|
13 | is_send(m);
| ^^^^^^^ `std::cell::RefCell<i32>` cannot be shared between threads safely
|
= help: within `Connection`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<i32>`
= note: required because it appears within the type `Connection`
= note: required because of the requirements on the impl of `std::marker::Send` for `&Connection`
= note: required because it appears within the type `Transaction<'_>`
= note: required because of the requirements on the impl of `std::marker::Send` for `std::sync::Mutex<Transaction<'_>>`
note: required by `is_send`
--> src/main.rs:6:1
|
6 | fn is_send<T: Send>(_: T) {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^
Why doesn't the compiler find RefCell to be safe anymore after being within a Connection referenced within a Transaction instead of solely within a Connection?
The RefCell is fine, it's the reference to a RefCell that is not.
Is the design of having the Connection and Transaction within the same struct a bad design [...] Do I need to redesign my data structures
Yes.
How to store rusqlite Connection and Statement objects in the same struct in Rust?
Why can't I store a value and a reference to that value in the same struct?
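One possible shape for such a redesign is to keep only the owned connection in the shared state and create a short-lived transaction each time the mutex is locked. The sketch below uses hypothetical stand-in types that mirror the shape of the rusqlite ones, not rusqlite itself, and record_event and assert_send_sync are made-up helpers.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

// Hypothetical stand-ins that mirror the shape of the rusqlite types.
struct Connection {
    rows: RefCell<Vec<String>>,
}

struct Transaction<'conn> {
    conn: &'conn Connection,
    pending: Vec<String>,
}

impl Connection {
    fn transaction(&self) -> Transaction<'_> {
        Transaction { conn: self, pending: Vec::new() }
    }
}

impl<'conn> Transaction<'conn> {
    fn execute(&mut self, row: &str) {
        self.pending.push(row.to_string());
    }

    fn commit(self) {
        self.conn.rows.borrow_mut().extend(self.pending);
    }
}

// Only the owned connection lives in the shared state; no reference is stored.
struct DatastoreInstance {
    conn: Connection,
}

// Compiles only if T can be shared across threads.
fn assert_send_sync<T: Send + Sync>(_: &T) {}

fn record_event(state: &Mutex<DatastoreInstance>, event: &str) -> usize {
    let datastore = state.lock().unwrap();
    // The transaction borrows the connection only while the lock is held.
    let mut tx = datastore.conn.transaction();
    tx.execute(event);
    tx.commit();
    // Bind the count first so the RefCell borrow ends before the guard is dropped.
    let stored = datastore.conn.rows.borrow().len();
    stored
}

fn main() {
    let state = Mutex::new(DatastoreInstance {
        conn: Connection { rows: RefCell::new(Vec::new()) },
    });
    assert_send_sync(&state);
    println!("rows stored: {}", record_event(&state, "login"));
}
```

The Mutex is Send and Sync here because DatastoreInstance owns a RefCell (which is Send) rather than holding a reference to one (which would require Sync).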

Rust mpsc::Sender cannot be shared between threads?

I thought the whole purpose of a channel was to share data between threads. I have this code, based on this example:
let tx_thread = tx.clone();
let ctx = self;
thread::spawn(|| {
    ...
    let result = ctx.method();
    tx_thread.send((String::from(result), someOtherString)).unwrap();
});
Where tx is a mpsc::Sender<(String, String)>
error[E0277]: the trait bound `std::sync::mpsc::Sender<(std::string::String, std::string::String)>: std::marker::Sync` is not satisfied
--> src/my_module/my_file.rs:137:9
|
137 | thread::spawn(|| {
| ^^^^^^^^^^^^^
|
= note: `std::sync::mpsc::Sender<(std::string::String, std::string::String)>` cannot be shared between threads safely
= note: required because of the requirements on the impl of `std::marker::Send` for `&std::sync::mpsc::Sender<(std::string::String, std::string::String)>`
= note: required because it appears within the type `[closure#src/my_module/my_file.rs:137:23: 153:10 res:&&str, ctx:&&my_module::my_submodule::Reader, tx_thread:&std::sync::mpsc::Sender<(std::string::String, std::string::String)>]`
= note: required by `std::thread::spawn`
I'm confused where I went wrong. Unless I'm looking in the wrong place and my issue is actually my use of let ctx = self;?
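Sender cannot be shared between threads, but it can be sent!
It implements the trait Send but not Sync (Sync means it is safe to access a shared reference to the Sender from several threads). This describes the standard library at the time of the question; newer releases (Rust 1.72 onward) also implement Sync for Sender<T> where T: Send.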
The design of channels intends that you .clone() the sender and pass it as a value to a thread (for each thread you have). You are missing the move keyword on the thread's closure, which instructs the closure to capture variables by taking ownership of them.
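A minimal sketch of that pattern: one clone of the sender per thread, moved into the closure. The collect_from_workers helper and the message contents are invented for illustration.

```rust
use std::sync::mpsc;
use std::thread;

// Spawns `n` workers; each owns its own clone of the sender.
fn collect_from_workers(n: usize) -> Vec<(String, String)> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for id in 0..n {
        let tx_thread = tx.clone();
        // `move` transfers ownership of tx_thread (and id) into the closure.
        handles.push(thread::spawn(move || {
            tx_thread
                .send((format!("worker {}", id), String::from("done")))
                .unwrap();
        }));
    }
    drop(tx); // drop the original so the receiver sees the channel close
    for handle in handles {
        handle.join().unwrap();
    }
    let mut results: Vec<(String, String)> = rx.iter().collect();
    results.sort(); // arrival order depends on scheduling
    results
}

fn main() {
    for (who, status) in collect_from_workers(3) {
        println!("{}: {}", who, status);
    }
}
```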
If you must share a single channel endpoint between several threads, it must be wrapped in a mutex. Mutex<Sender<T>> is Sync + Send where T: Send.
Interesting implementation note: The channel starts out for use as a stream where it has a single producer. The internal data structures are upgraded to a multi-producer implementation the first time a sender is cloned.
You may use std::sync::mpsc::SyncSender from the standard library. The difference is that it implements the Sync trait, but it may block when sending a message if there is no space in the internal buffer.
For more information:
std::sync::mpsc::channel
std::sync::mpsc::sync_channel
