Thread-safe mutable non-owning pointer in Rust?

I'm trying to parallelize an algorithm I have. This is a sketch of how I would write it in C++:
void thread_func(std::vector<int>& results, int threadid) {
    results[threadid] = threadid;
}

std::vector<int> foo() {
    std::vector<int> results(4);
    for (int i = 0; i < 4; i++) {
        spawn_thread(thread_func, results, i);
    }
    join_threads();
    return results;
}
The point here is that each thread has a reference to a shared, mutable object that it does not own. It seems like this is difficult to do in Rust. Should I try to cobble it together in terms of (and I'm guessing here) Mutex, Cell and &mut, or is there a better pattern I should follow?

The proper way is to use Arc<Mutex<...>> or, for example, Arc<RwLock<...>>. Arc is a thread-safe, shared-ownership pointer to immutable data, and Mutex/RwLock provide synchronized interior mutability. Your code would then look like this:
use std::sync::{Arc, Mutex};
use std::thread;

fn thread_func(results: Arc<Mutex<Vec<i32>>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Arc<Mutex<Vec<i32>>> {
    let results = Arc::new(Mutex::new(vec![0; 4]));
    let guards: Vec<_> = (0..4)
        .map(|i| {
            let results = results.clone();
            thread::spawn(move || thread_func(results, i))
        })
        .collect();
    for guard in guards {
        guard.join().unwrap();
    }
    results
}
This unfortunately requires you to return Arc<Mutex<Vec<i32>>> from the function, because the vector can't be "unwrapped" out of it while other clones of the Arc may still exist. An alternative is to clone the vector before returning.
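(For completeness: once every thread has been joined, the Arc has a single owner again, so the vector can be recovered with Arc::try_unwrap() followed by Mutex::into_inner(). A minimal sketch; unwrap_results is just an illustrative helper name, not part of the code above:)
use std::sync::{Arc, Mutex};

fn unwrap_results(results: Arc<Mutex<Vec<i32>>>) -> Vec<i32> {
    // try_unwrap() succeeds only if no other Arc clones are alive, which is
    // the case once every spawned thread has been joined and dropped its clone.
    let mutex = Arc::try_unwrap(results).expect("other Arc clones still exist");
    // With sole ownership of the Mutex, the lock can simply be consumed.
    mutex.into_inner().unwrap()
}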
However, using a crate like scoped_threadpool (whose approach only recently became possible to make sound; something like it will probably make it into the standard library, replacing the now-deprecated thread::scoped() function, which turned out to be unsound), this can be done in a much nicer way:
extern crate scoped_threadpool;

use scoped_threadpool::Pool;

fn thread_func(result: &mut i32, thread_id: i32) {
    *result = thread_id;
}

fn foo() -> Vec<i32> {
    let mut results = vec![0; 4];
    let mut pool = Pool::new(4);
    pool.scoped(|scope| {
        for (i, e) in results.iter_mut().enumerate() {
            scope.execute(move || thread_func(e, i as i32));
        }
    });
    results
}
If your thread_func needs to access the whole vector, however, you can't get away without synchronization, so you would need a Mutex, and you would still get the unwrapping problem:
extern crate scoped_threadpool;

use std::sync::Mutex;
use scoped_threadpool::Pool;

fn thread_func(results: &Mutex<Vec<i32>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Vec<i32> {
    let results = Mutex::new(vec![0; 4]);
    let mut pool = Pool::new(4);
    pool.scoped(|scope| {
        for i in 0..4 {
            let results = &results;
            scope.execute(move || thread_func(results, i));
        }
    });
    results.lock().unwrap().clone()
}
But at least you don't need any Arcs here. Note also that the execute() method is marked unsafe on older stable compilers, which lack the fix needed to make it sound; according to the crate's build script, it is safe on all compiler versions greater than 1.4.0.
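(As an aside, the prediction above did eventually come true: since Rust 1.63 the standard library provides std::thread::scope, which supports the same borrowing pattern without an external crate. A minimal sketch of the first scoped example rewritten with it, for reference:)
use std::thread;

fn foo() -> Vec<i32> {
    let mut results = vec![0; 4];
    thread::scope(|s| {
        for (i, e) in results.iter_mut().enumerate() {
            // Each thread receives a disjoint &mut i32 into the vector,
            // so no Mutex or Arc is needed; all threads are joined when
            // the scope ends.
            s.spawn(move || *e = i as i32);
        }
    });
    results
}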

Related

Compile Time HashMap in Zig

Take for example the Rust library lazy_static's example:
use lazy_static::lazy_static;
use std::collections::HashMap;

lazy_static! {
    static ref HASHMAP: HashMap<u32, &'static str> = {
        let mut m = HashMap::new();
        m.insert(0, "foo");
        m.insert(1, "bar");
        m.insert(2, "baz");
        m
    };
    static ref COUNT: usize = HASHMAP.len();
    static ref NUMBER: u32 = times_two(21);
}
How might this be done in Zig?
I have tried this, which is the only thing that makes sense to me:
const std = @import("std");

pub fn main() void {
    comptime var h = std.StringHashMap(i32).init(std.testing.allocator);
    h.put("hi", 5) catch {};
    std.debug.print("{}", .{h});
}
but this segfaults.
Is it even possible to do this in Zig?
It seems (thanks to this Reddit post) that this is implemented in the standard library and can be used via std.ComptimeStringMap. However, this does not seem to support any dynamic insertion, as there is no insert method.

How do I remove `MutexGuard` around a value?

I'm trying to use ndarray in asynchronous code to do linear algebra and the like.
I used Rust's tokio and ndarray to create the following code.
use std::sync::{Arc, Mutex};
use ndarray::prelude::*;
use futures::future::join_all;

fn print_type_of<T>(_: &T) {
    println!("{}", std::any::type_name::<T>())
}

#[tokio::main]
async fn main() {
    let db = Arc::new(Mutex::new(array![0, 0, 0, 0, 0, 0, 0, 0]));
    let mut handles = vec![];
    for i in 0..8 {
        let db = db.clone();
        handles.push(tokio::spawn(async move {
            print(i, db).await;
        }));
    }
    join_all(handles).await;

    let array = Arc::try_unwrap(db).unwrap();
    let array = array.lock().unwrap();
    print_type_of(&array); // -> std::sync::mutex::MutexGuard<ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<u32>, ndarray::dimension::dim::Dim<[usize; 1]>>>
}
async fn print(i: u32, db: Arc<Mutex<Array1<u32>>>) {
    let mut tmp = 0;
    // time-consuming process
    for k in 0..100000000 {
        tmp = k;
    }
    tmp += i;

    let mut db = db.lock().unwrap();
    db.fill(i);
    print_type_of(&db);
}
I would like to change the data std::sync::mutex::MutexGuard<ndarray::ArrayBase<OwnedRepr<u32>, Dim<[usize; 1]>>>
to ndarray::ArrayBase<OwnedRepr<u32>, Dim<[usize; 1]>>.
How can I do this?
You can't. That's the whole point of MutexGuard: if you could take the data out of the MutexGuard, then you would be able to make a reference that can be accessed without locking the mutex, defeating the whole purpose of having a mutex in the first place.
Depending on what you really want to do, one of the following solutions might apply to you:
Most of the time, you don't need to take the data out of the mutex: MutexGuard<T> implements Deref<Target=T> and DerefMut<Target=T>, so you can use the MutexGuard everywhere you would use a &T or a &mut T. Note that if you change your code to call print_type_of(&*array) instead of print_type_of(&array), it will print the inner type.
If you really need to, you can take the data out of the Mutex itself (but not the MutexGuard) with into_inner, which consumes the mutex, ensuring that no one else can ever access it:
let array = Arc::try_unwrap(db).unwrap();
let array = array.into_inner().unwrap();
print_type_of(&array); // -> ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<u32>, ndarray::dimension::dim::Dim<[usize; 1]>>
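(A minimal, self-contained sketch of the first point: thanks to Deref/DerefMut, the guard can be used anywhere a &Vec<i32> or &mut Vec<i32> is expected, and &*guard yields the plain reference:)
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(vec![1, 2, 3]);
    let mut guard = m.lock().unwrap();
    // DerefMut: the guard can be used like a &mut Vec<i32>...
    guard.push(4);
    // ...and &*guard reborrows it as a plain &Vec<i32>.
    let v: &Vec<i32> = &*guard;
    println!("{:?}", v); // [1, 2, 3, 4]
}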

How to idiomatically share data between closures with wasm-bindgen?

In my browser application, two closures access data stored in a Rc<RefCell<T>>. One closure mutably borrows the data, while the other immutably borrows it. The two closures are invoked independently of one another, and this will occasionally result in a BorrowError or BorrowMutError.
Here is my attempt at an MWE, though it uses a future to artificially inflate the likelihood of the error occurring:
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::task::{Context, Poll, Waker};
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsValue;

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace = console)]
    pub fn log(s: &str);

    #[wasm_bindgen(js_name = setTimeout)]
    fn set_timeout(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;

    #[wasm_bindgen(js_name = setInterval)]
    fn set_interval(closure: &Closure<dyn FnMut()>, millis: u32) -> i32;
}

pub struct Counter(u32);

#[wasm_bindgen(start)]
pub async fn main() -> Result<(), JsValue> {
    console_error_panic_hook::set_once();

    let counter = Rc::new(RefCell::new(Counter(0)));

    let counter_clone = counter.clone();
    let log_closure = Closure::wrap(Box::new(move || {
        let c = counter_clone.borrow();
        log(&c.0.to_string());
    }) as Box<dyn FnMut()>);
    set_interval(&log_closure, 1000);
    log_closure.forget();

    let counter_clone = counter.clone();
    let increment_closure = Closure::wrap(Box::new(move || {
        let counter_clone = counter_clone.clone();
        wasm_bindgen_futures::spawn_local(async move {
            let mut c = counter_clone.borrow_mut();
            // In reality this future would be replaced by some other
            // time-consuming operation manipulating the borrowed data
            SleepFuture::new(5000).await;
            c.0 += 1;
        });
    }) as Box<dyn FnMut()>);
    set_timeout(&increment_closure, 3000);
    increment_closure.forget();

    Ok(())
}
struct SleepSharedState {
    waker: Option<Waker>,
    completed: bool,
    closure: Option<Closure<dyn FnMut()>>,
}

struct SleepFuture {
    shared_state: Rc<RefCell<SleepSharedState>>,
}

impl Future for SleepFuture {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut shared_state = self.shared_state.borrow_mut();
        if shared_state.completed {
            Poll::Ready(())
        } else {
            shared_state.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

impl SleepFuture {
    fn new(duration: u32) -> Self {
        let shared_state = Rc::new(RefCell::new(SleepSharedState {
            waker: None,
            completed: false,
            closure: None,
        }));

        let state_clone = shared_state.clone();
        let closure = Closure::wrap(Box::new(move || {
            let mut state = state_clone.borrow_mut();
            state.completed = true;
            if let Some(waker) = state.waker.take() {
                waker.wake();
            }
        }) as Box<dyn FnMut()>);
        set_timeout(&closure, duration);

        shared_state.borrow_mut().closure = Some(closure);
        SleepFuture { shared_state }
    }
}
panicked at 'already mutably borrowed: BorrowError'
The error makes sense, but how should I go about resolving it?
My current solution is to have the closures use try_borrow or try_borrow_mut, and if unsuccessful, use setTimeout for an arbitrary amount of time before attempting to borrow again.
Think about this problem independently of Rust's borrow semantics. You have a long-running operation that's updating some shared state.
How would you do it if you were using threads? You would put the shared state behind a lock. RefCell is like a lock except that you can't block on unlocking it — but you can emulate blocking by using some kind of message-passing to wake up the reader.
How would you do it if you were using pure JavaScript? You don't automatically have anything like RefCell, so either:
The state can be safely read while the operation is still ongoing (in a concurrency-not-parallelism sense): in this case, emulate that by not holding a single RefMut (the result of borrow_mut()) alive across an await point (see the sketch after this list).
The state is not safe to be read: you'd either write something lock-like as described above, or perhaps arrange so that it's only written once when the operation is done, and until then, the long-running operation has its own private state not shared with the rest of the application (so there can be no BorrowError conflicts).
Think about what your application actually needs and pick a suitable solution. Implementing any of these solutions will most likely involve having additional interior-mutable objects used for communication.
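For the MWE above, the first option essentially means reordering the spawned future so that borrow_mut() is only called after the await. A sketch based on the increment closure from the question (not standalone code):
let increment_closure = Closure::wrap(Box::new(move || {
    let counter_clone = counter_clone.clone();
    wasm_bindgen_futures::spawn_local(async move {
        // Do the long-running work first, without holding any borrow...
        SleepFuture::new(5000).await;
        // ...then borrow mutably only for the brief, synchronous update.
        let mut c = counter_clone.borrow_mut();
        c.0 += 1;
    });
}) as Box<dyn FnMut()>);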

How to add state to a function in Rust

Rust has anonymous closures with state. Can I do the same with a named function?
(invalid pseudocode)
fn counting_function() -> i32 {
    let mut static counter = 0;
    counter = counter + 1;
    return counter.clone();
}
I understand I can use structs and functions/traits to do this. And I understand that iterators are the proper way to do it. But leaving aside structs with traits and iterators, can I do this without passing any burden (of initializing a structure) to the caller?
This is a thread-safe variant using an atomic:
use std::sync::atomic::{AtomicUsize, Ordering};

fn counting_function() -> usize {
    static COUNTER: AtomicUsize = AtomicUsize::new(0);
    // fetch_add atomically increments and returns the previous value.
    let result = COUNTER.fetch_add(1, Ordering::Relaxed);
    result
}
But it's actually a code smell I'd say.
Your pseudocode almost works as is. To work with the static mut variable, you'll need to mark the accessing and modifying parts of your code as unsafe, as these operations are not thread-safe.
fn counting_function() -> u32 {
    static mut counter: u32 = 0;
    let retval = unsafe { counter };
    unsafe {
        counter += 1;
    }
    retval
}

How to use RwLocks without scoped?

I'm trying to share a RwLock amongst several threads without using scoped threads but I can't figure out how to get the lifetimes correct. I assume that this is possible (what's the point of RwLocks otherwise?) but I can't find any examples of it.
Here is a toy example of what I'm trying to accomplish. Any advice would be appreciated.
use std::sync::{Arc, RwLock};
use std::thread;

struct Stuff {
    x: i32,
}

fn main() {
    let mut stuff = Stuff { x: 5 };
    helper(&mut stuff);
    println!("done");
}

fn helper(stuff: &mut Stuff) {
    let rwlock = RwLock::new(stuff);
    let arc = Arc::new(rwlock);
    let local_arc = arc.clone();
    for _ in 0..10 {
        let my_rwlock = arc.clone();
        thread::spawn(move || {
            let reader = my_rwlock.read().unwrap();
            // do some stuff
        });
    }
    let mut writer = local_arc.write().unwrap();
    writer.x += 1;
}
&mut references are not safe to send to a non-scoped thread, because the thread may still run after the referenced data has been deallocated. Furthermore, after helper returns, the main thread would still be able to mutate stuff, and the spawned thread would also be able to mutate stuff indirectly, which is not allowed in Rust (there can only be one mutable alias for a variable).
Instead, the RwLock should own the data, rather than borrow it. This means helper should receive a Stuff rather than a &mut Stuff.
use std::sync::{Arc, RwLock};
use std::thread;

struct Stuff {
    x: i32,
}

fn main() {
    let stuff = Stuff { x: 5 };
    helper(stuff);
    println!("done");
}

fn helper(stuff: Stuff) {
    let rwlock = RwLock::new(stuff);
    let arc = Arc::new(rwlock);
    let local_arc = arc.clone();
    for _ in 0..10 {
        let my_rwlock = arc.clone();
        thread::spawn(move || {
            let reader = my_rwlock.read().unwrap();
            // do some stuff
        });
    }
    let mut writer = local_arc.write().unwrap();
    writer.x += 1;
}
