How to use RwLocks without scoped? - rust

I'm trying to share a RwLock amongst several threads without using scoped threads but I can't figure out how to get the lifetimes correct. I assume that this is possible (what's the point of RwLocks otherwise?) but I can't find any examples of it.
Here is a toy example of what I'm trying to accomplish. Any advice would be appreciated.
rust playpen for this code
use std::sync::{Arc, RwLock};
use std::thread;

struct Stuff {
    x: i32,
}

fn main() {
    let mut stuff = Stuff { x: 5 };
    helper(&mut stuff);
    println!("done");
}

fn helper(stuff: &mut Stuff) {
    let rwlock = RwLock::new(stuff);
    let arc = Arc::new(rwlock);
    let local_arc = arc.clone();
    for _ in 0..10 {
        let my_rwlock = arc.clone();
        thread::spawn(move || {
            let reader = my_rwlock.read().unwrap();
            // do some stuff
        });
    }
    let mut writer = local_arc.write().unwrap();
    writer.x += 1;
}

&mut references are not safe to send to a non-scoped thread, because the thread may still run after the referenced data has been deallocated. Furthermore, after helper returns, the main thread would still be able to mutate stuff, and the spawned thread would also be able to mutate stuff indirectly, which is not allowed in Rust (there can only be one mutable alias for a variable).
Instead, the RwLock should own the data, rather than borrow it. This means helper should receive a Stuff rather than a &mut Stuff.
use std::sync::{Arc, RwLock};
use std::thread;

struct Stuff {
    x: i32,
}

fn main() {
    let stuff = Stuff { x: 5 };
    helper(stuff);
    println!("done");
}

fn helper(stuff: Stuff) {
    let rwlock = RwLock::new(stuff);
    let arc = Arc::new(rwlock);
    let local_arc = arc.clone();
    for _ in 0..10 {
        let my_rwlock = arc.clone();
        thread::spawn(move || {
            let reader = my_rwlock.read().unwrap();
            // do some stuff
        });
    }
    let mut writer = local_arc.write().unwrap();
    writer.x += 1;
}
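If the caller needs the Stuff back after the threads are done, the Arc can be dismantled once every clone has been dropped. A minimal sketch along those lines (the join loop and the return type are additions, not part of the answer above):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

struct Stuff {
    x: i32,
}

fn helper(stuff: Stuff) -> Stuff {
    let arc = Arc::new(RwLock::new(stuff));
    let mut handles = Vec::new();
    for _ in 0..10 {
        let my_rwlock = arc.clone();
        handles.push(thread::spawn(move || {
            let _reader = my_rwlock.read().unwrap();
            // do some stuff
        }));
    }
    arc.write().unwrap().x += 1;
    for h in handles {
        h.join().unwrap();
    }
    // After the joins, this thread holds the only Arc clone,
    // so try_unwrap succeeds and into_inner returns the Stuff.
    Arc::try_unwrap(arc)
        .ok()
        .expect("threads still hold the Arc")
        .into_inner()
        .unwrap()
}

fn main() {
    let stuff = helper(Stuff { x: 5 });
    assert_eq!(stuff.x, 6);
}
```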

Related

Sharing a reference in multiple threads inside a function

I want to build a function that takes a HashMap reference as an argument. This HashMap should be shared between threads for read-only access. The code example is very simple:
I insert some value into the HashMap, pass it to the function, and want another thread to read that value. I get an error that the borrowed value does not live long enough at the line let exit_code = test(&m);. Why is this not working?
use std::thread;
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

fn main() {
    let mut m: HashMap<u32, f64> = HashMap::new();
    m.insert(0, 0.1);
    let exit_code = test(&m);
    std::process::exit(exit_code);
}

fn test(m: &'static HashMap<u32, f64>) -> i32 {
    let map_lock = Arc::new(RwLock::new(m));
    let read_thread = thread::spawn(move || {
        if let Ok(r_guard) = map_lock.read() {
            println!("{:?}", r_guard.get(&0).unwrap());
        }
    });
    read_thread.join().unwrap();
    return 0;
}
If I don't put the 'static in the function signature for the HashMap argument, Arc::new(RwLock::new(m)) doesn't compile. How can I solve this problem?
A reference is not safe to share with a non-scoped thread unless it is 'static, meaning the referenced data will live for the entire duration of the program. Otherwise the compiler is not able to prove that the data outlives the thread.
You should build the lock outside of the function, and have the function take ownership of an Arc:
use std::thread;
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

fn main() {
    let mut map = HashMap::new();
    map.insert(0, 0.1);
    let m = Arc::new(RwLock::new(map));
    let exit_code = test(m);
    std::process::exit(exit_code);
}

fn test(map_lock: Arc<RwLock<HashMap<u32, f64>>>) -> i32 {
    let read_thread = thread::spawn(move || {
        if let Ok(r_guard) = map_lock.read() {
            println!("{:?}", r_guard.get(&0).unwrap());
        }
    });
    read_thread.join().unwrap();
    return 0;
}
Playground
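As a side note beyond the answer above: if the threads only need to borrow the map for the duration of the call, scoped threads (stabilized in the standard library as std::thread::scope in Rust 1.63) allow the original by-reference signature without Arc, RwLock, or 'static. A minimal sketch:

```rust
use std::collections::HashMap;
use std::thread;

// Same signature as the question, minus the 'static bound:
// the scope guarantees the spawned thread finishes before `m` is dropped.
fn test(m: &HashMap<u32, f64>) -> i32 {
    thread::scope(|s| {
        s.spawn(|| {
            // Borrowing `m` here is fine thanks to the scope.
            println!("{:?}", m.get(&0).unwrap());
        });
    }); // every thread spawned in the scope is joined here
    0
}

fn main() {
    let mut m = HashMap::new();
    m.insert(0, 0.1);
    assert_eq!(test(&m), 0);
}
```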

How do you pass a struct that contains a String reference between threads?

In the following code snippet I am trying to create a tuple that contains a String value and a struct that contains an attribute that is set as the reference to the first value (which would be the String) in the tuple.
My question is how do I make the String (which is the first value in the tuple) live long enough.
Here is the code:
#[derive(Debug, Clone)]
struct RefTestStruct {
    key: usize,
    ref_value: Option<&'static str>,
}

fn init3<'a>(cache: &Arc<Mutex<HashMap<usize, (String, Option<RefTestStruct>)>>>) {
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10 {
        // create reference copy of cache
        let cache_clone = Arc::clone(cache);
        let handle = thread::spawn(move || {
            // lock cache
            let mut locked_cache = cache_clone.lock().unwrap();
            // add new object to cache
            let tuple_value = (format!("value: {}", idx), None);
            let mut ts_obj = RefTestStruct {
                key: idx,
                ref_value: Some(tuple_value.0.as_str()),
            };
            tuple_value.1 = Some(ts_obj);
            locked_cache.insert(idx as usize, tuple_value);
            println!("IDX: {} - CACHE: {:?}", idx, locked_cache.get(&(idx as usize)).unwrap())
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("\n");
}

fn main() {
    // init
    let cache = HashMap::<usize, (String, Option<RefTestStruct>)>::new();
    let am_cache = Arc::new(Mutex::new(cache));
    init3(&am_cache);
    // change cache contents
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10 {
        let cache_clone = Arc::clone(&am_cache);
        let handle = thread::spawn(move || {
            let mut locked_cache = cache_clone.lock().unwrap();
            let tuple_value = locked_cache.get_mut(&(idx as usize)).unwrap();
            (*tuple_value).0 = format!("changed value: {}", (*tuple_value).1.unwrap().key + 11);
            let ts_obj = RefTestStruct {
                key: (*tuple_value).1.unwrap().key,
                ref_value: Some((*tuple_value).0.as_str()),
            };
            tuple_value.1 = Some(ts_obj);
            // locked_cache.insert(idx as usize, tuple_value);
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    // display cache contents
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10 {
        let cache_clone = Arc::clone(&am_cache);
        let handle = thread::spawn(move || {
            let locked_cache = cache_clone.lock().unwrap();
            let ts_obj = locked_cache.get(&(idx as usize)).unwrap();
            println!("IDX: {} - CACHE: {:?}", idx, &ts_obj);
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
}
ERROR:
error[E0597]: `tuple_value.0` does not live long enough
--> src\main.rs:215:42
|
215 | ref_value: Some( tuple_value.0.as_str() ),
| ^^^^^^^^^^^^^^^^^^^^^^
| |
| borrowed value does not live long enough
| argument requires that `tuple_value.0` is borrowed for `'static`
...
221 | });
| - `tuple_value.0` dropped here while still borrowed
How do you pass a struct that contains a String reference between threads?
First I'll answer the question in the title and ignore the program you posted (and I'll try to ignore my itch to guess what you really want to do).
I'll assume you want to avoid unsafe in your own code (so dependencies containing unsafe code might be OK).
To pass a reference to a thread, the reference has to live at least as long as the thread may live.
Your options:
Have a 'static lifetime on your reference.
Use threads which have a non 'static lifetime.
Regarding 1, a 'static lifetime on the reference:
The question now becomes how to obtain a reference to a String with 'static lifetime.
We can't use a constant, as we can't initialize it with runtime data.
That leaves the possibility of boxing the String and leaking it.
But of course this leaks memory, as the String can never be dropped again. Also, you get a &'static mut String out of it, so only one thread can hold it at a time and has to pass it on when done. To me this looks essentially the same as what you could do with an owned value of type String anyway, so why bother and risk the memory leak? (Putting an Arc<Mutex<..>> around it is likewise the "same thing you could do with the owned value".)
Other methods might exist which I am not aware of.
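The leaking approach can be sketched in a couple of lines (the "hello" string and the length check are just illustration):

```rust
use std::thread;

fn main() {
    // Box::leak trades the allocation for a 'static reference:
    // the memory is never reclaimed for the rest of the program.
    let r: &'static str = Box::leak(String::from("hello").into_boxed_str());
    // A &'static str can be moved into a non-scoped thread freely.
    let handle = thread::spawn(move || r.len());
    assert_eq!(handle.join().unwrap(), 5);
}
```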
Regarding 2, threads with a non-'static lifetime:
These are called scoped threads and guarantee that the thread exits before the lifetime ends. They are not in standard Rust yet, but there are crates implementing them (using unsafe internally, I would guess).
For example, crossbeam offers an implementation of this.
Looking at your program though, I guess you want to have long lived
threads.
The program you posted and what you probably want to achieve.
The program you posted has a separate issue besides.
You are trying to create a self-referential value in your cache. As far as I know, this is not possible in Rust without a shared-ownership tool like Arc.
If you use a plain Arc, the access is read-only, and users of the cache could only change which String a cache entry points to, not the pointed-to String itself.
So if one user replaced the Arc<String> in the cache, other threads would still hold a reference to the "old" String and would not see the changed value. But I guess seeing the changed value is exactly what you want to achieve.
That is a conceptual problem: you can't have unsynchronized read access to a value which can be mutated concurrently (by whatever means).
So if you want changes to the value to be visible to all prior users, this leaves you with Mutex or RwLock, depending on the access pattern you anticipate.
A solution using Arc<String> with the aforementioned defects:
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

#[derive(Debug, Clone)]
struct RefTestStruct {
    key: usize,
    _ref_value: Arc<String>,
}

type Cache = HashMap<usize, (Arc<String>, RefTestStruct)>;
type AmCache = Arc<Mutex<Cache>>;

fn init3(cache: &AmCache) {
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10_usize {
        // create reference copy of cache
        let cache_clone = Arc::clone(cache);
        let handle = thread::spawn(move || {
            // lock cache
            let mut locked_cache = cache_clone.lock().unwrap();
            // add new object to cache
            let s = Arc::new(format!("value: {}", idx));
            let ref_struct = RefTestStruct {
                key: idx,
                _ref_value: s.clone(),
            };
            let tuple_value = (s, ref_struct);
            locked_cache.insert(idx, tuple_value);
            println!(
                "IDX: {} - CACHE: {:?}",
                idx,
                locked_cache.get(&idx).unwrap()
            )
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("\n");
}

fn main() {
    // init
    let cache = Cache::new();
    let am_cache = Arc::new(Mutex::new(cache));
    init3(&am_cache);
    // change cache contents
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10_usize {
        let cache_clone = Arc::clone(&am_cache);
        let handle = thread::spawn(move || {
            let mut locked_cache = cache_clone.lock().unwrap();
            let tuple_value = locked_cache.get_mut(&idx).unwrap();
            let new_key = tuple_value.1.key + 11;
            let new_s = Arc::new(format!("changed value: {}", new_key));
            (*tuple_value).1 = RefTestStruct {
                key: new_key,
                _ref_value: new_s.clone(),
            };
            (*tuple_value).0 = new_s;
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    // display cache contents
    let mut handles: Vec<JoinHandle<()>> = vec![];
    for idx in 0..10_usize {
        let cache_clone = Arc::clone(&am_cache);
        let handle = thread::spawn(move || {
            let locked_cache = cache_clone.lock().unwrap();
            let ts_obj = locked_cache.get(&idx).unwrap();
            println!("IDX: {} - CACHE: {:?}", idx, &ts_obj);
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
}
I am sure it is not difficult to derive the Arc<Mutex<String>> version from that, if desired.
That would not be as pointless as one might think. The Mutex around the cache protects the HashMap for insertion and deletion, while the Mutex around each String value protects the string itself and can be locked independently of the Mutex around the HashMap. The usual caveats apply: if you ever need both locks at the same time, always lock them in the same order, or risk a deadlock.
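For illustration, a minimal sketch of what that Arc<Mutex<String>> variant of a cache entry could look like (the type and field names are assumptions modeled on the code above, reduced to a single entry):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Debug)]
struct RefTestStruct {
    key: usize,
    ref_value: Arc<Mutex<String>>,
}

type Cache = HashMap<usize, (Arc<Mutex<String>>, RefTestStruct)>;

fn main() {
    let mut cache = Cache::new();
    let s = Arc::new(Mutex::new(String::from("value: 0")));
    cache.insert(0, (s.clone(), RefTestStruct { key: 0, ref_value: s }));

    // The String can now be mutated in place; every holder of the Arc
    // sees the change, and only the String's own Mutex is locked,
    // not the Mutex around the whole cache.
    *cache.get(&0).unwrap().1.ref_value.lock().unwrap() = String::from("changed");
    assert_eq!(cache.get(&0).unwrap().0.lock().unwrap().as_str(), "changed");
}
```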

How do I remove `MutexGuard` around a value?

I'm trying to use ndarray in an asynchronous setting to do linear algebra and such.
I used Rust's tokio and ndarray to create the following code.
use std::sync::{Arc, Mutex};
use ndarray::prelude::*;
use futures::future::join_all;

fn print_type_of<T>(_: &T) {
    println!("{}", std::any::type_name::<T>())
}

#[tokio::main]
async fn main() {
    let db = Arc::new(Mutex::new(array![0, 0, 0, 0, 0, 0, 0, 0]));
    let mut handels = vec![];
    for i in 0..8 {
        let db = db.clone();
        handels.push(tokio::spawn(async move {
            print(i, db).await;
        }));
    }
    join_all(handels).await;
    let array = Arc::try_unwrap(db).unwrap();
    let array = array.lock().unwrap();
    print_type_of(&array); // -> std::sync::mutex::MutexGuard<ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<u32>, ndarray::dimension::dim::Dim<[usize; 1]>>>
}

async fn print(i: u32, db: Arc<Mutex<Array1<u32>>>) {
    let mut tmp = 0;
    // time-consuming process
    for k in 0..100000000 {
        tmp = k;
    }
    tmp += i;
    let mut db = db.lock().unwrap();
    db.fill(i);
    print_type_of(&db);
}
I would like to change the data std::sync::mutex::MutexGuard<ndarray::ArrayBase<OwnedRepr<u32>, Dim<[usize; 1]>>>
to ndarray::ArrayBase<OwnedRepr<u32>, Dim<[usize; 1]>>.
How can I do this?
You can't. That's the whole point of MutexGuard: if you could take the data out of the MutexGuard, then you would be able to make a reference that can be accessed without locking the mutex, defeating the whole purpose of having a mutex in the first place.
Depending on what you really want to do, one of the following solutions might apply to you:
Most of the time, you don't need to take the data out of the mutex: MutexGuard<T> implements Deref<Target=T> and DerefMut<Target=T>, so you can use the MutexGuard everywhere you would use a &T or a &mut T. Note that if you change your code to call print_type_of(&*array) instead of print_type_of(&array), it will print the inner type.
If you really need to, you can take the data out of the Mutex itself (but not the MutexGuard) with into_inner, which consumes the mutex, ensuring that no one else can ever access it:
let array = Arc::try_unwrap(db).unwrap();
let array = array.into_inner().unwrap();
print_type_of(&array); // -> ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<u32>, ndarray::dimension::dim::Dim<[usize; 1]>>

Thread-safe mutable non-owning pointer in Rust?

I'm trying to parallelize an algorithm I have. This is a sketch of how I would write it in C++:
void thread_func(std::vector<int>& results, int threadid) {
    results[threadid] = threadid;
}

std::vector<int> foo() {
    std::vector<int> results(4);
    for (int i = 0; i < 4; i++) {
        spawn_thread(thread_func, results, i);
    }
    join_threads();
    return results;
}
The point here is that each thread has a reference to a shared, mutable object that it does not own. It seems like this is difficult to do in Rust. Should I try to cobble it together in terms of (and I'm guessing here) Mutex, Cell and &mut, or is there a better pattern I should follow?
The proper way is to use Arc<Mutex<...>> or, for example, Arc<RwLock<...>>. Arc is a shared-ownership, concurrency-safe pointer to immutable data, and Mutex/RwLock introduce synchronized internal mutability. Your code would then look like this:
use std::sync::{Arc, Mutex};
use std::thread;

fn thread_func(results: Arc<Mutex<Vec<i32>>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Arc<Mutex<Vec<i32>>> {
    let results = Arc::new(Mutex::new(vec![0; 4]));
    let guards: Vec<_> = (0..4).map(|i| {
        let results = results.clone();
        thread::spawn(move || thread_func(results, i))
    }).collect();
    for guard in guards {
        guard.join();
    }
    results
}
This unfortunately requires you to return Arc<Mutex<Vec<i32>>> from the function because there is no way to "unwrap" the value. An alternative is to clone the vector before returning.
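On current Rust, the Arc can in fact be dismantled once all threads are joined, using Arc::try_unwrap followed by Mutex::into_inner, so foo can return a plain Vec<i32>. A sketch under that assumption:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn foo() -> Vec<i32> {
    let results = Arc::new(Mutex::new(vec![0; 4]));
    let guards: Vec<_> = (0..4)
        .map(|i| {
            let results = Arc::clone(&results);
            thread::spawn(move || {
                results.lock().unwrap()[i as usize] = i;
            })
        })
        .collect();
    for guard in guards {
        guard.join().unwrap();
    }
    // Every clone has been dropped by the joined threads, so this
    // thread holds the only Arc and both unwraps succeed.
    Arc::try_unwrap(results).unwrap().into_inner().unwrap()
}

fn main() {
    assert_eq!(foo(), vec![0, 1, 2, 3]);
}
```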
However, using a crate like scoped_threadpool (whose approach could only recently be made sound; something like it will probably make it into the standard library to replace the now-deprecated, unsafe thread::scoped() function), it can be done in a much nicer way:
extern crate scoped_threadpool;
use scoped_threadpool::Pool;

fn thread_func(result: &mut i32, thread_id: i32) {
    *result = thread_id;
}

fn foo() -> Vec<i32> {
    let mut results = vec![0; 4];
    let mut pool = Pool::new(4);
    pool.scoped(|scope| {
        for (i, e) in results.iter_mut().enumerate() {
            scope.execute(move || thread_func(e, i as i32));
        }
    });
    results
}
If your thread_func needs to access the whole vector, however, you can't get away without synchronization, so you would need a Mutex, and you would still get the unwrapping problem:
extern crate scoped_threadpool;
use std::sync::Mutex;
use scoped_threadpool::Pool;

fn thread_func(results: &Mutex<Vec<i32>>, thread_id: i32) {
    let mut results = results.lock().unwrap();
    results[thread_id as usize] = thread_id;
}

fn foo() -> Vec<i32> {
    let results = Mutex::new(vec![0; 4]);
    let mut pool = Pool::new(4);
    pool.scoped(|scope| {
        for i in 0..4 {
            scope.execute(move || thread_func(&results, i));
        }
    });
    results.lock().unwrap().clone()
}
But at least you don't need any Arcs here. Note that the execute() method was unsound on older stable compilers, which lacked the fix needed to make it safe; according to the crate's build script, it is safe on all compiler versions from 1.4.0 onward.

Unable to iterate over Arc Mutex

Consider the following code: I append each of my threads to a vector in order to join them to the main thread after I have spawned each thread, but I am not able to call iter() on my vector of JoinHandles.
How can I go about doing this?
fn main() {
    let requests = Arc::new(Mutex::new(Vec::new()));
    let threads = Arc::new(Mutex::new(Vec::new()));
    for _x in 0..100 {
        println!("Spawning thread: {}", _x);
        let mut client = Client::new();
        let thread_items = requests.clone();
        let handle = thread::spawn(move || {
            for _y in 0..100 {
                println!("Firing requests: {}", _y);
                let start = time::precise_time_s();
                let _res = client.get("http://jacob.uk.com")
                    .header(Connection::close())
                    .send().unwrap();
                let end = time::precise_time_s();
                thread_items.lock().unwrap().push(Request::new(end - start));
            }
        });
        threads.lock().unwrap().push(handle);
    }
    // src/main.rs:53:22: 53:30 error: type `alloc::arc::Arc<std::sync::mutex::Mutex<collections::vec::Vec<std::thread::JoinHandle<()>>>>` does not implement any method in scope named `unwrap`
    for t in threads.iter() {
        println!("Hello World");
    }
}
First, you don't need threads to be contained in an Arc and a Mutex. You can keep it as a plain Vec:
let mut threads = Vec::new();
...
threads.push(handle);
This is so because you don't share threads between, well, threads. You only access it from the main thread.
Second, if for some reason you do need to keep it in Arc (e.g. if your example does not reflect the actual structure of your program which is more complex), then you need to lock the mutex to obtain a reference to the contained vector, just as you do when pushing:
for t in threads.lock().unwrap().iter() {
...
}
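Putting both fixes together, a runnable sketch (the HTTP client and timing calls are replaced by a placeholder push, since the point here is the thread bookkeeping, and the counts are reduced for brevity):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let requests = Arc::new(Mutex::new(Vec::new()));
    let mut threads = Vec::new(); // plain Vec: only the main thread touches it

    for x in 0..4 {
        let thread_items = Arc::clone(&requests);
        threads.push(thread::spawn(move || {
            // placeholder for the real request/timing work
            thread_items.lock().unwrap().push(x);
        }));
    }

    for t in threads {
        t.join().unwrap();
    }

    let mut collected = requests.lock().unwrap().clone();
    collected.sort();
    assert_eq!(collected, vec![0, 1, 2, 3]);
}
```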
