Calling an unsafe library object of from a different thread - multithreading

I am using tikv/raft-rs library for implementing a consensus system. This library has a RawNode object which is a thread-unsafe object. We must execute some functions on this object periodically (example), hence I use a new thread for executing.
Here are the constraints:
I need to have this object on the main-thread doesn't have this object for accessing some its internal states. (e.g.: raw_node.is_leader)
This object must be accessed on a worker thread.
Because of these constraints, this seems impossible because:
If I create this object, and move to the new thread, the main thread cannot call its state.
If I keep this object on the main thread object, I cannot use Arc<Mutex<>> because this object doesn't implement Copy method.
Here is my code:
use std::thread::JoinHandle;
use std::sync::{Arc, Mutex, mpsc};
use std::collections::{VecDeque, HashMap};
use raft::RawNode;
use std::sync::mpsc::{Receiver, Sender, TryRecvError};
use protobuf::Message as PbMessage;
use raft::eraftpb::ConfState;
use raft::storage::MemStorage;
use raft::{prelude::*, StateRole};
use regex::Regex;
use crate::proposal::Proposal;
use crate::{proposal, batch};
use std::{str, thread};
use std::time::{Instant, Duration};
use crate::batch::Mailbox;
pub struct Node {
core: Arc<Mutex<CoreNode>>
}
#[derive(Copy)]
struct CoreNode {
raft_group: RawNode<MemStorage>,
}
impl Node {
pub fn new() -> Self {
let cfg = Config {
election_tick: 10,
heartbeat_tick: 3,
..Default::default()
};
let storage = MemStorage::new();
let core = Arc::new(Mutex::new(CoreNode {
raft_group: RawNode::new(&cfg, storage).unwrap()
}));
thread::spawn(move || {
core.lock().unwrap().run();
return;
});
Node { core: core.clone() }
}
pub fn is_leader(&self) -> bool {
return self.core.lock().unwrap().raft_group.raft.state == StateRole::Leader;
}
}
impl CoreNode {
pub fn run(mut self) {}
}
When compiling, here is the error:
22 | #[derive(Copy)]
| ^^^^
23 | struct CoreNode {
24 | raft_group: RawNode<MemStorage>,
| ------------------------------- this field does not implement `Copy`
|
My question is: How can I design around this problem.

Related

Rust SDL2 Timer: How can I return timer object from the function?

I am using sdl2 crate and I am trying to create a simple "context" object with timer inside of it. I have the following code
extern crate sdl2;
struct Ctx<'a, 'b> {
timer: sdl2::timer::Timer<'a, 'b>,
}
impl<'a,'b> Ctx<'a, 'b> {
pub fn new() -> Ctx<'a,'b> {
let sdl_context = sdl2::init().unwrap();
let timer_subsystem = sdl_context.timer().unwrap();
let timer = timer_subsystem.add_timer(100, Box::new(|| -> u32 { 100 }));
Ctx { timer }
}
}
pub fn main() {
let _ = Ctx::new();
}
The problem is that add_timer() seem to borrow timer_subsystem object, and it prevents timer to be moved out of the function.
error[E0515]: cannot return value referencing local variable `timer_subsystem`
--> src/main.rs:13:9
|
11 | let timer = timer_subsystem.add_timer(100, Box::new(|| -> u32 { 100 }));
| ----------------------------------------------------------- `timer_subsystem` is borrowed here
12 |
13 | Ctx { timer }
| ^^^^^^^^^^^^^ returns a value referencing data owned by the current function
However, if I look into add_timer(), I cannot see why does the reference is considered still alive. In fact the self reference is not used in the function at all:
https://docs.rs/sdl2/latest/src/sdl2/timer.rs.html#17-32
pub fn add_timer<'b, 'c>(&'b self, delay: u32, callback: TimerCallback<'c>) -> Timer<'b, 'c> {
unsafe {
let callback = Box::new(callback);
let timer_id = sys::SDL_AddTimer(
delay,
Some(c_timer_callback),
mem::transmute_copy(&callback),
);
Timer {
callback: Some(callback),
raw: timer_id,
_marker: PhantomData,
}
}
}
I was trying to put both timer and timer_subsystem to Ctx (to make sure their lifetime is the same), but it didn't help. Neither wrapping them with Box<> (to prevent having a dangling pointer after timer_subsystem was moved), but neither worked.
Why of I have a reference to local data here and how can I properly return the timer object?
The secret is in the PhantomData. The full type as declared inside Timer is PhantomData<&'b ()>, and 'b is specified as &'b self in add_timer(). This ties the lifetime of the returned Timer to the lifetime of the TimerSubsystem. This is because when the subsystem is dropped it is shutted down, and it is not permitted to access the timer afterward.
The best option is probably to initialize the timers subsystem (and SDL) before calling Ctx::new() and passing it as an argument, as it is hard to store both the subsystem and the timer (a self-referential struct).

How do I send read-only data to other threads without copying?

I'm trying to send a "view" of a read-only data to another thread for processing. Basically the main thread does work, and continuously updates a set of data. Whenever an update occurs, the main thread should send the updated data down to other threads where they will process it in a read-only manner. I do not want to copy the data as it may be very large. (The main thread also keeps a "cache" of the data in-memory anyway.)
I can achieve this with Arc<RwLock<T>>, where T being my data structure.
However, there is nothing stopping the side threads updating the data. The side threads can simply call lock() and the write to the data.
My question is there something similar to RwLock where the owner/creator of it has the only write access, but all other instances have read-only access? This way I will have compile time checking of any logic bugs that may occur via side threads accidentally updating data.
Regarding these questions:
Sharing read-only object between threads in Rust?
How can I pass a reference to a stack variable to a thread?
The above questions suggest solving it with Arc<Mutex<T>> or Arc<RwLock<T>> which is all fine. But it still doesn't give compile time enforcement of only one writer.
Additionally: crossbeam or rayon's scoped threads don't help here as I want my side threads to outlive my main thread.
You can create a wrapper type over an Arc<RwLock<T>> that only exposes cloning via a read only wrapper:
mod shared {
use std::sync::{Arc, LockResult, RwLock, RwLockReadGuard, RwLockWriteGuard};
pub struct Lock<T> {
inner: Arc<RwLock<T>>,
}
impl<T> Lock<T> {
pub fn new(val: T) -> Self {
Self {
inner: Arc::new(RwLock::new(val)),
}
}
pub fn write(&self) -> LockResult<RwLockWriteGuard<'_, T>> {
self.inner.write()
}
pub fn read(&self) -> LockResult<RwLockReadGuard<'_, T>> {
self.inner.read()
}
pub fn read_only(&self) -> ReadOnly<T> {
ReadOnly {
inner: self.inner.clone(),
}
}
}
pub struct ReadOnly<T> {
inner: Arc<RwLock<T>>,
}
impl<T> ReadOnly<T> {
pub fn read(&self) -> LockResult<RwLockReadGuard<'_, T>> {
self.inner.read()
}
}
}
Now you can pass read only versions of the value to spawned threads, and continue writing in the main thread:
fn main() {
let val = shared::Lock::new(String::new());
for _ in 0..10 {
let view = val.read_only();
std::thread::spawn(move || {
// view.write().unwrap().push_str("...");
// ERROR: no method named `write` found for struct `ReadOnly` in the current scope
println!("{}", view.read().unwrap());
});
}
val.write().unwrap().push_str("...");
println!("{}", val.read().unwrap());
}

Rust/rocket pass variable to endpoints

Not to my preference but I'm forced to write some Rust today so I'm trying to create a Rocket instance with only one endpoint but, on that endpoint I need to access a variable that is being created during main. The variable takes a long time to be instantiated so that's why I do it there.
My problem is that I can;t find a way to pass it safely. Whatever I do, the compiler complaints about thread safety even though the library appears to be thread safe: https://github.com/brave/adblock-rust/pull/130 (commited code is found on my local instance)
This is the error tat I get:
|
18 | / lazy_static! {
19 | | static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
20 | | }
| |_^ `std::rc::Rc<std::cell::RefCell<lifeguard::CappedCollection<std::vec::Vec<u64>>>>` cannot be sent between threads safely
|
...and this is my code:
#![feature(proc_macro_hygiene, decl_macro)]
#[macro_use]
extern crate rocket;
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use lazy_static::lazy_static;
use std::sync::Mutex;
use adblock::engine::Engine;
use adblock::lists::FilterFormat;
use rocket::request::{Form, FormError, FormDataError};
lazy_static! {
static ref rules_engine: Mutex<Vec<Engine>> = Mutex::new(vec![]);
}
fn main() {
if !Path::new("./rules.txt").exists() {
println!("rules file does not exist")
} else {
println!("loading rules");
let mut rules = vec![];
if let Ok(lines) = read_lines("./rules.txt") {
for line in lines {
if let Ok(ip) = line {
rules.insert(0, ip)
}
}
let eng = Engine::from_rules(&rules, FilterFormat::Standard);
rules_engine.lock().unwrap().push(eng);
rocket().launch();
}
}
}
#[derive(Debug, FromForm)]
struct FormInput<> {
#[form(field = "textarea")]
text_area: String
}
#[post("/", data = "<sink>")]
fn sink(sink: Result<Form<FormInput>, FormError>) -> String {
match sink {
Ok(form) => {
format!("{:?}", &*form)
}
Err(FormDataError::Io(_)) => format!("Form input was invalid UTF-8."),
Err(FormDataError::Malformed(f)) | Err(FormDataError::Parse(_, f)) => {
format!("Invalid form input: {}", f)
}
}
}
fn rocket() -> rocket::Rocket {
rocket::ignite().mount("/", routes![sink])
}
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
let file = File::open(filename)?;
Ok(io::BufReader::new(file).lines())
}
Any way of having the eng available inside the sink endpoint method?
Rc is not thread safe, even behind a mutex. It looks like Rc is used in eng.blocker.pool.pool which is a lifeguard::Pool. So no, the Engine is not thread safe (at least by default).
Fortunately, it appears that the adblock crate has a feature called "object-pooling", which enables that specific functionality. Removing that feature will (hopefully) make it thread safe.
Rocket makes it really easy to share resources between routes (and also between main or any other thread you might have spawned from main). They call their mechanism state. Check out its documentation here.
To give a short example of how it works:
You create your type that you want to share in your application and manage an instance of that type in the instance of rocket that you use for your application. In the guide they give this example:
use std::sync::atomic::AtomicUsize;
struct HitCount {
count: AtomicUsize
}
rocket::build().manage(HitCount { count: AtomicUsize::new(0) });
In a route then you access the resource like this (again from the guide):
use rocket::State;
#[get("/count")]
fn count(hit_count: &State<HitCount>) -> String {
let current_count = hit_count.count.load(Ordering::Relaxed);
format!("Number of visits: {}", current_count)
}
While I learnt rocket I needed to share a struct that contained a String, which is not thread safe per se. That means you need to wrap it into a Mutex before you can manage it with rocket.
Also, as far as I understand, only one resource of any specific type can be shared with manage. But you can just create differently named wrapper types in that case and work around that limitation.

What is the Rust equivalent of C++'s shared_from_this?

I have an object that I know that is inside an Arc because all the instances are always Arced. I would like to be able to pass a cloned Arc of myself in a function call. The thing I am calling will call me back later on other threads.
In C++, there is a standard mixin called enable_shared_from_this. It enables me to do exactly this
class Bus : public std::enable_shared_from_this<Bus>
{
....
void SetupDevice(Device device,...)
{
device->Attach(shared_from_this());
}
}
If this object is not under shared_ptr management (the closest C++ has to Arc) then this will fail at run time.
I cannot find an equivalent.
EDIT:
Here is an example of why its needed. I have a timerqueue library. It allows a client to request an arbitrary closure to be run at some point in the future. The code is run on a dedicated thread. To use it you must pass a closure of the function you want to be executed later.
use std::time::{Duration, Instant};
use timerqueue::*;
use parking_lot::Mutex;
use std::sync::{Arc,Weak};
use std::ops::{DerefMut};
// inline me keeper cos not on github
pub struct MeKeeper<T> {
them: Mutex<Weak<T>>,
}
impl<T> MeKeeper<T> {
pub fn new() -> Self {
Self {
them: Mutex::new(Weak::new()),
}
}
pub fn save(&self, arc: &Arc<T>) {
*self.them.lock().deref_mut() = Arc::downgrade(arc);
}
pub fn get(&self) -> Arc<T> {
match self.them.lock().upgrade() {
Some(arc) => return arc,
None => unreachable!(),
}
}
}
// -----------------------------------
struct Test {
data:String,
me: MeKeeper<Self>,
}
impl Test {
pub fn new() -> Arc<Test>{
let arc = Arc::new(Self {
me: MeKeeper::new(),
data: "Yo".to_string()
});
arc.me.save(&arc);
arc
}
fn task(&self) {
println!("{}", self.data);
}
// in real use case the TQ and a ton of other status data is passed in the new call for Test
// to keep things simple here the 'container' passes tq as an arg
pub fn do_stuff(&self, tq: &TimerQueue) {
// stuff includes a async task that must be done in 1 second
//.....
let me = self.me.get().clone();
tq.queue(
Box::new(move || me.task()),
"x".to_string(),
Instant::now() + Duration::from_millis(1000),
);
}
}
fn main() {
// in real case (PDP11 emulator) there is a Bus class owning tons of objects thats
// alive for the whole duration
let tq = Arc::new(TimerQueue::new());
let test = Test::new();
test.do_stuff(&*tq);
// just to keep everything alive while we wait
let mut input = String::new();
std::io::stdin().read_line(&mut input).unwrap();
}
cargo toml
[package]
name = "tqclient"
version = "0.1.0"
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
timerqueue = { git = "https://github.com/pm100/timerqueue.git" }
parking_lot = "0.11"
There is no way to go from a &self to the Arc that self is stored in. This is because:
Rust references have additional assumptions compared to C++ references that would make such a conversion undefined behavior.
Rust's implementation of Arc does not even expose the information necessary to determine whether self is stored in an Arc or not.
Luckily, there is an alternative approach. Instead of creating a &self to the value inside the Arc, and passing that to the method, pass the Arc directly to the method that needs to access it. You can do that like this:
use std::sync::Arc;
struct Shared {
field: String,
}
impl Shared {
fn print_field(self: Arc<Self>) {
let clone: Arc<Shared> = self.clone();
println!("{}", clone.field);
}
}
Then the print_field function can only be called on an Shared encapsulated in an Arc.
having found that I needed this three times in recent days I decided to stop trying to come up with other designs. Maybe poor data design as far as rust is concerned but I needed it.
Works by changing the new function of the types using it to return an Arc rather than a raw self. All my objects are arced anyway, before they were arced by the caller, now its forced.
mini util library called mekeeper
use parking_lot::Mutex;
use std::sync::{Arc,Weak};
use std::ops::{DerefMut};
pub struct MeKeeper<T> {
them: Mutex<Weak<T>>,
}
impl<T> MeKeeper<T> {
pub fn new() -> Self {
Self {
them: Mutex::new(Weak::new()),
}
}
pub fn save(&self, arc: &Arc<T>) {
*self.them.lock().deref_mut() = Arc::downgrade(arc);
}
pub fn get(&self) -> Arc<T> {
match self.them.lock().upgrade() {
Some(arc) => return arc,
None => unreachable!(),
}
}
}
to use it
pub struct Test {
me: MeKeeper<Self>,
foo:i8,
}
impl Test {
pub fn new() -> Arc<Self> {
let arc = Arc::new(Test {
me: MeKeeper::new(),
foo:42
});
arc.me.save(&arc);
arc
}
}
now when an instance of Test wants to call a function that requires it to pass in an Arc it does:
fn nargle(){
let me = me.get();
Ooddle::fertang(me,42);// fertang needs an Arc<T>
}
the weak use is what the shared_from_this does so as to prevent refcount deadlocks, I stole that idea.
The unreachable path is safe because the only place that can call MeKeeper::get is the instance of T (Test here) that owns it and that call can only happen if the T instance is alive. Hence no none return from weak::upgrade

Spawning tasks with non-static lifetimes with tokio 0.1.x

I have a tokio core whose main task is running a websocket (client). When I receive some messages from the server, I want to execute a new task that will update some data. Below is a minimal failing example:
use tokio_core::reactor::{Core, Handle};
use futures::future::Future;
use futures::future;
struct Client {
handle: Handle,
data: usize,
}
impl Client {
fn update_data(&mut self) {
// spawn a new task that updates the data
self.handle.spawn(future::ok(()).and_then(|x| {
self.data += 1; // error here
future::ok(())
}));
}
}
fn main() {
let mut runtime = Core::new().unwrap();
let mut client = Client {
handle: runtime.handle(),
data: 0,
};
let task = future::ok::<(), ()>(()).and_then(|_| {
// under some conditions (omitted), we update the data
client.update_data();
future::ok::<(), ()>(())
});
runtime.run(task).unwrap();
}
Which produces this error:
error[E0477]: the type `futures::future::and_then::AndThen<futures::future::result_::FutureResult<(), ()>, futures::future::result_::FutureResult<(), ()>, [closure#src/main.rs:13:51: 16:10 self:&mut &mut Client]>` does not fulfill the required lifetime
--> src/main.rs:13:21
|
13 | self.handle.spawn(future::ok(()).and_then(|x| {
| ^^^^^
|
= note: type must satisfy the static lifetime
The problem is that new tasks that are spawned through a handle need to be static. The same issue is described here. Sadly it is unclear to me how I can fix the issue. Even some attempts with and Arc and a Mutex (which really shouldn't be needed for a single-threaded application), I was unsuccessful.
Since developments occur rather quickly in the tokio landscape, I am wondering what the current best solution is. Do you have any suggestions?
edit
The solution by Peter Hall works for the example above. Sadly when I built the failing example I changed tokio reactor, thinking they would be similar. Using tokio::runtime::current_thread
use futures::future;
use futures::future::Future;
use futures::stream::Stream;
use std::cell::Cell;
use std::rc::Rc;
use tokio::runtime::current_thread::{Builder, Handle};
struct Client {
handle: Handle,
data: Rc<Cell<usize>>,
}
impl Client {
fn update_data(&mut self) {
// spawn a new task that updates the data
let mut data = Rc::clone(&self.data);
self.handle.spawn(future::ok(()).and_then(move |_x| {
data.set(data.get() + 1);
future::ok(())
}));
}
}
fn main() {
// let mut runtime = Core::new().unwrap();
let mut runtime = Builder::new().build().unwrap();
let mut client = Client {
handle: runtime.handle(),
data: Rc::new(Cell::new(1)),
};
let task = future::ok::<(), ()>(()).and_then(|_| {
// under some conditions (omitted), we update the data
client.update_data();
future::ok::<(), ()>(())
});
runtime.block_on(task).unwrap();
}
I obtain:
error[E0277]: `std::rc::Rc<std::cell::Cell<usize>>` cannot be sent between threads safely
--> src/main.rs:17:21
|
17 | self.handle.spawn(future::ok(()).and_then(move |_x| {
| ^^^^^ `std::rc::Rc<std::cell::Cell<usize>>` cannot be sent between threads safely
|
= help: within `futures::future::and_then::AndThen<futures::future::result_::FutureResult<(), ()>, futures::future::result_::FutureResult<(), ()>, [closure#src/main.rs:17:51: 20:10 data:std::rc::Rc<std::cell::Cell<usize>>]>`, the trait `std::marker::Send` is not implemented for `std::rc::Rc<std::cell::Cell<usize>>`
= note: required because it appears within the type `[closure#src/main.rs:17:51: 20:10 data:std::rc::Rc<std::cell::Cell<usize>>]`
= note: required because it appears within the type `futures::future::chain::Chain<futures::future::result_::FutureResult<(), ()>, futures::future::result_::FutureResult<(), ()>, [closure#src/main.rs:17:51: 20:10 data:std::rc::Rc<std::cell::Cell<usize>>]>`
= note: required because it appears within the type `futures::future::and_then::AndThen<futures::future::result_::FutureResult<(), ()>, futures::future::result_::FutureResult<(), ()>, [closure#src/main.rs:17:51: 20:10 data:std::rc::Rc<std::cell::Cell<usize>>]>`
So it does seem like in this case I need an Arc and a Mutex even though the entire code is single-threaded?
In a single-threaded program, you don't need to use Arc; Rc is sufficient:
use std::{rc::Rc, cell::Cell};
struct Client {
handle: Handle,
data: Rc<Cell<usize>>,
}
impl Client {
fn update_data(&mut self) {
let data = Rc::clone(&self.data);
self.handle.spawn(future::ok(()).and_then(move |_x| {
data.set(data.get() + 1);
future::ok(())
}));
}
}
The point is that you no longer have to worry about the lifetime because each clone of the Rc acts as if it owns the data, rather than accessing it via a reference to self. The inner Cell (or RefCell for non-Copy types) is needed because the Rc can't be dereferenced mutably, since it has been cloned.
The spawn method of tokio::runtime::current_thread::Handle requires that the future is Send, which is what is causing the problem in the update to your question. There is an explanation (of sorts) for why this is the case in this Tokio Github issue.
You can use tokio::runtime::current_thread::spawn instead of the method of Handle, which will always run the future in the current thread, and does not require that the future is Send. You can replace self.handle.spawn in the code above and it will work just fine.
If you need to use the method on Handle then you will also need to resort to Arc and Mutex (or RwLock) in order to satisfy the Send requirement:
use std::sync::{Mutex, Arc};
struct Client {
handle: Handle,
data: Arc<Mutex<usize>>,
}
impl Client {
fn update_data(&mut self) {
let data = Arc::clone(&self.data);
self.handle.spawn(future::ok(()).and_then(move |_x| {
*data.lock().unwrap() += 1;
future::ok(())
}));
}
}
If your data is really a usize, you could also use AtomicUsize instead of Mutex<usize>, but I personally find it just as unwieldy to work with.

Resources