Situation
I have an array of f32.
Several threads will each change a small part of the array, but I do not know in advance which indices will be changed.
Every thread has to lock the array and then spend some time on an expensive calculation.
Afterwards, it changes one index and releases the array.
Take a look at the commented minimal example below.
The Problem
The first thread locks the array, so no other thread can edit it anymore, which wastes a lot of time. Other threads that need to edit different indices, and would never touch the ones required by the first thread, could have been executed at the same time.
Possible Solution
I know that the array outlives all threads, so unsafe Rust is a viable option.
I already posted a solution below using the external atomic_float crate.
You may come up with a stdlib-only solution (a sketch of one follows the atomic_float example).
Minimal example:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
use rand::Rng;

fn main() {
    // Store the mutex
    let container = Arc::new(Mutex::new([0.0; 10]));
    // This will keep track of the created threads
    let mut threads = vec![];
    // Create new threads
    for _ in 0..10 {
        // Create a copy of the mutex reference
        let clone = Arc::clone(&container);
        threads.push(thread::spawn(move || {
            // The function somehow calculates the index that has to be changed.
            // Here it is simulated by picking a random index, to emphasize that
            // we do not know the index in advance.
            let mut rng = rand::thread_rng();
            let index = rng.gen_range(0..10);
            // Unfortunately we have to lock the array before the intense calculation!
            // If we could lock just one index of the array, other threads could
            // change other indices in parallel, but as it is they all wait for the lock.
            let mut myarray = clone.lock().unwrap();
            // Simulate an intense calculation
            thread::sleep(Duration::from_millis(1000));
            // Now the index can be changed
            println!("Changing index {}", index);
            myarray[index] += 1.0;
        }));
    }
    // Wait for all threads to finish
    for thread in threads {
        thread.join().unwrap();
    }
    // I know that the array outlives the runtime of all threads,
    // so someone may come up with an unsafe solution.
    // Print the result
    println!("{:?}", container);
}
This is a solution I came up with using the atomic_float crate.
Introduction to scoped threads
AtomicF32 docs
It works because scoped threads ensure that all borrowed values live at least as long as the scope.
AtomicF32 protects each value from concurrent access.
Note that this does not lock the index before the heavy work. When describing my problem I simplified it a bit: in reality I had a loop that accessed the data very often, so the array was almost constantly locked.
use atomic_float::AtomicF32;
use rand::Rng;
use std::sync::atomic::Ordering;
use std::thread::{self, sleep};
use std::time::Duration;

fn main() {
    // Create a new array of atomic floats.
    // Atomic floats ensure that each value is protected from concurrent access,
    // so effectively only individual indices are "locked".
    // Notice the array is not mutable.
    let myarray = [
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
        AtomicF32::new(0.0),
    ];
    // This is a scoped thread.
    thread::scope(|s| {
        // The loop has to be inside the scope.
        // All threads spawned within thread::scope must terminate before
        // thread::scope can return; that's how it makes sure the scoped
        // variables live at least as long as the spawned threads.
        for _ in 0..10 {
            // Create a new thread
            s.spawn(|| {
                let mut rng = rand::thread_rng();
                let index = rng.gen_range(0..10);
                // Simulate heavy work *before* touching the array
                sleep(Duration::from_millis(3000));
                // Now the index can be changed
                println!("Changing index {}", index);
                // This is an atomic operation, so the value is protected
                // from concurrent access.
                myarray[index].fetch_add(1.0, Ordering::SeqCst);
            });
        }
    });
    println!("{:?}", myarray);
}
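And here is a stdlib-only sketch of the same idea (an assumption on my part, not from the original post: it stores each f32 as its bit pattern in an AtomicU32 and updates it with a compare-exchange loop; the index computation is again just a stand-in):

use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

fn main() {
    // Each f32 is stored as its bit pattern inside an AtomicU32.
    let myarray: [AtomicU32; 10] = std::array::from_fn(|_| AtomicU32::new(0.0f32.to_bits()));
    let arr = &myarray;
    thread::scope(|s| {
        for i in 0..10 {
            s.spawn(move || {
                // Stand-in for the expensive calculation that yields the index;
                // no lock is held while it runs.
                let index = i % 10;
                // CAS loop: add 1.0 to the float stored at `index`.
                let slot = &arr[index];
                let mut current = slot.load(Ordering::Relaxed);
                loop {
                    let new = (f32::from_bits(current) + 1.0).to_bits();
                    match slot.compare_exchange_weak(current, new, Ordering::SeqCst, Ordering::Relaxed) {
                        Ok(_) => break,
                        Err(actual) => current = actual,
                    }
                }
            });
        }
    });
    let result: Vec<f32> = myarray
        .iter()
        .map(|a| f32::from_bits(a.load(Ordering::SeqCst)))
        .collect();
    println!("{:?}", result);
}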
Related
My current project requires recording some information for various events that happen during the execution of a thread. These events are saved in a global struct indexed by the thread id:
RECORDER1: HashMap<ThreadId, Vec<Entry>> = HashMap::new();
Every thread appends new Entry values to its own vector, so threads access "disjoint" vectors. Rust of course requires synchronization primitives to make the above work, so the real implementation looks like this:
struct Entry {
    // ... not important.
}

#[derive(Clone, Eq, PartialEq, Hash)]
struct ThreadId;

// lazy_static is necessary to initialize this data structure.
lazy_static! {
    /// Global data structure. Threads access disjoint entries based on their unique thread id.
    /// The "outer" mutex is necessary because lazy_static requires Sync (so we cannot use RefCell).
    static ref RECORDER2: Mutex<HashMap<ThreadId, Vec<Entry>>> = Mutex::new(HashMap::new());
}
This works, but all threads contend on the same global lock. It would be nice if a thread could "borrow" its respective vector for the lifetime of the thread so it could write all the entries it needs without needing to lock every time (I understand the outer lock is necessary for ensuring threads don't insert into the HashMap at the same time).
We can do this by adding an Arc and some more interior mutability via a Mutex for the values in the HashMap:
lazy_static! {
    static ref RECORDER: Mutex<HashMap<ThreadId, Arc<Mutex<Vec<Entry>>>>> = Mutex::new(HashMap::new());
}
Now we can "check out" our entry when a thread is spawned:
fn local_borrow() {
    std::thread::spawn(|| {
        let mut recorder = RECORDER.lock().expect("Unable to acquire outer mutex lock.");
        let my_thread_id: ThreadId = ThreadId {}; // Get thread id...

        // Insert entry in hashmap for our thread.
        // Omit logic to check if key-value pair already existed (it shouldn't).
        recorder.insert(my_thread_id.clone(), Arc::new(Mutex::new(Vec::new())));

        // Get "reference" to vector
        let local_entries: Arc<Mutex<Vec<Entry>>> = recorder
            .get(&my_thread_id)
            .unwrap() // We just inserted this entry, so unwrap.
            .clone(); // Clone on the Arc to acquire a "copy".

        // Lock once, use multiple times.
        let mut local_entries: MutexGuard<_> = local_entries.lock().unwrap();
        local_entries.push(Entry {});
        local_entries.push(Entry {});
    });
}
This works and is what I want. However, due to API constraints I have to access the MutexGuard from widely different places across the code without the ability to pass the MutexGuard as an argument to functions. So instead I use a thread local variable:
thread_local! {
    /// This variable is initialized lazily. Due to API constraints, we use this thread_local! to
    /// "pass" LOCAL_ENTRIES around.
    static LOCAL_ENTRIES: Arc<Mutex<Vec<Entry>>> = {
        let mut recorder = RECORDER.lock().expect("Unable to acquire outer mutex lock.");
        let my_thread_id: ThreadId = ThreadId {}; // Get thread id...

        // Omit logic to check if key-value pair already existed (it shouldn't).
        recorder.insert(my_thread_id.clone(), Arc::new(Mutex::new(Vec::new())));

        // Get "reference" to vector
        recorder
            .get(&my_thread_id)
            .unwrap() // We just inserted this entry, so unwrap.
            .clone() // Clone on the Arc to acquire a "copy".
    };
}
I cannot make LOCAL_ENTRIES: MutexGuard<_> since thread_local! requires a 'static lifetime. So currently I have to .lock() every time I want to access the thread-local variable:
fn main() {
    std::thread::spawn(|| {
        // Record an important message.
        LOCAL_ENTRIES.with(|entries| {
            // We have to lock every time we want to write to LOCAL_ENTRIES. It would be nice
            // to lock once and hold on to the MutexGuard for the lifetime of the thread, but
            // this is not possible due to the lifetime on the MutexGuard.
            let mut entries = entries.lock().expect("Unable to acquire lock");
            entries.push(Entry {});
        });
    });
}
Sorry for all the code and explanation, but I'm really stuck and wanted to show why it doesn't work and what I'm trying to get working. How can one get around this in Rust?
Or am I getting hung up on the cost of the mutex locking? Since each Arc<Mutex<Vec<Entry>>> is only ever locked by its own thread, the lock will always be uncontended, so the cost of the atomic locking should be tiny?
Thanks for any thoughts. Here is the complete example in Rust Playground.
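One direction that would avoid locking on every write (a sketch, not from the original post; it assumes entries only need to become globally visible once per flush, e.g. at thread exit) is to buffer in a plain thread-local RefCell, which needs no lock at all, and take the global lock once when flushing:

use std::cell::RefCell;
use std::sync::Mutex;
use std::thread::ThreadId;

struct Entry;

// Global sink, locked only once per thread, at flush time.
static RECORDER: Mutex<Vec<(ThreadId, Vec<Entry>)>> = Mutex::new(Vec::new());

thread_local! {
    // Thread-local buffer: RefCell borrows are checked at runtime but need no atomics.
    static LOCAL_BUF: RefCell<Vec<Entry>> = RefCell::new(Vec::new());
}

fn record(e: Entry) {
    LOCAL_BUF.with(|buf| buf.borrow_mut().push(e));
}

fn flush() {
    LOCAL_BUF.with(|buf| {
        let entries: Vec<Entry> = buf.borrow_mut().drain(..).collect();
        RECORDER.lock().unwrap().push((std::thread::current().id(), entries));
    });
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            std::thread::spawn(|| {
                record(Entry);
                record(Entry);
                flush(); // a single lock per thread
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("threads recorded: {}", RECORDER.lock().unwrap().len());
}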
I am writing an application that needs to run on many threads at the same time. It will process a long list of items, where one property of each item is a user_id. I am trying to make sure that items belonging to the same user_id are never processed at the same time. This means that the closure running in the worker threads needs to wait until no other thread is processing data for the same user.
I do not understand how to solve this. My simplified, current example, looks like this:
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::new(num_cpus::get());
    let mut locks: HashMap<String, Mutex<bool>> = HashMap::new();
    let queue = Arc::new(vec![
        "1".to_string(),
        "1".to_string(),
        "2".to_string(),
        "1".to_string(),
        "3".to_string(),
    ]);
    let count = queue.len();
    for i in 0..count {
        let user_id = queue[i].clone();
        // Problem: cannot borrow `locks` as mutable more than once at a time;
        // the mutable borrow starts here in the previous iteration of the loop.
        let lock = locks.entry(user_id).or_insert(Mutex::new(true));
        pool.execute(move || {
            // Wait until the user_id becomes free.
            lock.lock().unwrap();
            // Do stuff with user_id, but never process
            // the same user_id more than once at the same time.
            println!("{:?}", user_id);
        });
    }
    pool.join();
}
I am trying to keep a list of Mutexes which I then use to wait for the user_id to become free, but the borrow checker does not allow this. The queue items and the item-processing code are much more complex in the actual application I am working on.
I am not allowed to change the order of the items in the queue (but some variations will be allowed because of waiting for the lock).
How to solve this scenario?
First of all, HashMap::entry() consumes the key, so since you want to use it in the closure as well, you'll need to clone it, i.e. .entry(user_id.clone()).
Since you need to share the Mutex<bool> between the main thread and worker threads, then you need to likewise wrap that in an Arc. You can also use Entry::or_insert_with(), so you avoid needlessly creating a new Mutex unless needed.
let mut locks: HashMap<String, Arc<Mutex<bool>>> = HashMap::new();

// ...

let lock = locks
    .entry(user_id.clone())
    .or_insert_with(|| Arc::new(Mutex::new(true)))
    .clone();
Lastly, you must store the guard returned by lock(), otherwise it is immediately released.
let _guard = lock.lock().unwrap();
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::new(num_cpus::get());
    let mut locks: HashMap<String, Arc<Mutex<bool>>> = HashMap::new();
    let queue = Arc::new(vec![
        "1".to_string(),
        "1".to_string(),
        "2".to_string(),
        "1".to_string(),
        "3".to_string(),
    ]);
    let count = queue.len();
    for i in 0..count {
        let user_id = queue[i].clone();
        let lock = locks
            .entry(user_id.clone())
            .or_insert_with(|| Arc::new(Mutex::new(true)))
            .clone();
        pool.execute(move || {
            // Wait until the user_id becomes free.
            let _guard = lock.lock().unwrap();
            // Do stuff with user_id, but never process
            // the same user_id more than once at the same time.
            println!("{:?}", user_id);
        });
    }
    pool.join();
}
I would like to have a shared struct between threads. The struct has many fields that are never modified and a HashMap, which is. I don't want to lock the whole HashMap for a single update/remove, so my HashMap looks something like HashMap<u8, Mutex<u8>>. This works, but it makes no sense, since the thread will lock the whole map anyway.
Here's this working version, without threads; I don't think that's necessary for the example.
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

fn main() {
    let s = Arc::new(Mutex::new(S::new()));
    let z = s.clone();
    let _ = z.lock().unwrap();
}

struct S {
    x: HashMap<u8, Mutex<u8>>, // other non-mutable fields
}

impl S {
    pub fn new() -> S {
        S {
            x: HashMap::default(),
        }
    }
}
Playground
Is this possible in any way? Is there something obvious I missed in the documentation?
I've been trying to get this working, but I'm not sure how. Basically, in every example I see there's always a Mutex (or RwLock, or something like that) guarding the whole inner value.
I don't see how your request is possible, at least not without some exceedingly clever lock-free data structures; what should happen if multiple threads need to insert new values that hash to the same location?
In previous work, I've used a RwLock<HashMap<K, Mutex<V>>>. When inserting a value into the hash, you get an exclusive lock for a short period. The rest of the time, you can have multiple threads with reader locks to the HashMap and thus to a given element. If they need to mutate the data, they can get exclusive access to the Mutex.
Here's an example:
use std::{
    collections::HashMap,
    sync::{Arc, Mutex, RwLock},
    thread,
    time::Duration,
};

fn main() {
    let data = Arc::new(RwLock::new(HashMap::new()));

    let threads: Vec<_> = (0..10)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || worker_thread(i, data))
        })
        .collect();
    for t in threads {
        t.join().expect("Thread panicked");
    }

    println!("{:?}", data);
}

fn worker_thread(id: u8, data: Arc<RwLock<HashMap<u8, Mutex<i32>>>>) {
    loop {
        // Assume that the element already exists.
        let map = data.read().expect("RwLock poisoned");

        if let Some(element) = map.get(&id) {
            let mut element = element.lock().expect("Mutex poisoned");

            // Perform our normal work updating a specific element.
            // The entire HashMap only has a read lock, which
            // means that other threads can access it.
            *element += 1;
            thread::sleep(Duration::from_secs(1));

            return;
        }

        // If we got this far, the element doesn't exist.
        // Get rid of our read lock and switch to a write lock;
        // you want to minimize the time the writer lock is held.
        drop(map);
        let mut map = data.write().expect("RwLock poisoned");

        // We use HashMap::entry to handle the case where another thread
        // inserted the same key while we were unlocked.
        thread::sleep(Duration::from_millis(50));
        map.entry(id).or_insert_with(|| Mutex::new(0));

        // Let the loop start us over to try again.
    }
}
This takes about 2.7 seconds to run on my machine, even though it starts 10 threads that each wait for 1 second while holding the exclusive lock to the element's data.
This solution isn't without issues, however. When there's a huge amount of contention for that one master lock, getting a write lock can take a while and completely kills parallelism.
In that case, you can switch to a RwLock<HashMap<K, Arc<Mutex<V>>>>. Once you have a read or write lock, you can then clone the Arc of the value, returning it and unlocking the hashmap.
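A minimal sketch of that variant (the get_value helper is made up for illustration):

use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

type Shared = Arc<RwLock<HashMap<u8, Arc<Mutex<i32>>>>>;

// Clone the Arc out of the map so the map lock is released
// before the per-value Mutex is taken.
fn get_value(data: &Shared, id: u8) -> Option<Arc<Mutex<i32>>> {
    data.read().expect("RwLock poisoned").get(&id).cloned()
}

fn main() {
    let data: Shared = Arc::new(RwLock::new(HashMap::new()));
    data.write().unwrap().insert(1, Arc::new(Mutex::new(0)));

    if let Some(value) = get_value(&data, 1) {
        // The HashMap itself is unlocked here; only this one value is
        // locked, so long-running work does not block other keys.
        *value.lock().expect("Mutex poisoned") += 1;
    }
    println!("{:?}", data.read().unwrap().get(&1));
}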
The next step up would be to use a crate like arc-swap, which says:
Then one would lock, clone the [RwLock<Arc<T>>] and unlock. This suffers from CPU-level contention (on the lock and on the reference count of the Arc) which makes it relatively slow. Depending on the implementation, an update may be blocked for arbitrary long time by a steady inflow of readers.
The ArcSwap can be used instead, which solves the above problems and has better performance characteristics than the RwLock, both in contended and non-contended scenarios.
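For illustration, a small sketch of the arc-swap pattern (untested here, using the crate's documented ArcSwap::from_pointee, load, load_full, and store):

use arc_swap::ArcSwap;
use std::collections::HashMap;
use std::sync::Arc;

fn main() {
    // Readers get the current snapshot without taking any lock.
    let config: ArcSwap<HashMap<String, i32>> = ArcSwap::from_pointee(HashMap::new());

    // Read path: a cheap, lock-free load of the current Arc.
    println!("len = {}", config.load().len());

    // Write path: clone the current map, modify the copy, swap it in atomically.
    let mut updated = HashMap::clone(&config.load_full());
    updated.insert("answer".to_string(), 42);
    config.store(Arc::new(updated));

    println!("answer = {:?}", config.load().get("answer"));
}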
I often advocate for performing some kind of smarter algorithm instead. For example, you could spin up N threads, each with its own HashMap, and shard work among them; for the simple example above, you could route by id % N_THREADS. There are also more complicated sharding schemes that depend on your data.
As Go has done a good job of evangelizing: do not communicate by sharing memory; instead, share memory by communicating.
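A minimal sketch of that sharding-by-communicating approach (shard count and key type chosen arbitrarily for illustration):

use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

const N_SHARDS: usize = 4;

fn main() {
    // One channel per shard; each worker owns its HashMap outright, so no locks.
    let mut senders = Vec::new();
    let mut workers = Vec::new();
    for _ in 0..N_SHARDS {
        let (tx, rx) = mpsc::channel::<u8>();
        senders.push(tx);
        workers.push(thread::spawn(move || {
            let mut map: HashMap<u8, i32> = HashMap::new();
            for id in rx {
                *map.entry(id).or_insert(0) += 1;
            }
            map // returned when the channel closes
        }));
    }

    // Shard incoming work by key.
    for id in [1u8, 2, 3, 1, 2, 1] {
        senders[id as usize % N_SHARDS].send(id).unwrap();
    }
    drop(senders); // close all channels so the workers finish

    for w in workers {
        println!("{:?}", w.join().unwrap());
    }
}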
Suppose the key of the data is mappable to a u8.
You can have Arc<HashMap<u8, Mutex<HashMap<Key, Value>>>>.
When you initialize the data structure, you populate the whole first-level map before putting it in the Arc (it will be immutable after initialization).
When you want a value from the map, you need to do a double get, something like:
data.get(&map_to_u8(&key)).unwrap().lock().expect("poison").get(&key)
where the unwrap is safe because we initialized the first map with every possible key.
To write into the map, something like:
data.get(&map_to_u8(&key)).unwrap().lock().expect("poison").entry(key).or_insert_with(|| value);
It's easy to see why contention is reduced: we now have 256 Mutexes, and the probability of multiple threads asking for the same Mutex is low.
Shepmaster's example with 100 threads takes about 10 seconds on my machine; the following example takes a little more than 1 second.
use std::{
    collections::HashMap,
    sync::{Arc, Mutex},
    thread,
    time::Duration,
};

fn main() {
    let mut inner = HashMap::new();
    for i in 0..=u8::max_value() {
        inner.insert(i, Mutex::new(HashMap::new()));
    }
    let data = Arc::new(inner);

    let threads: Vec<_> = (0..100)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || worker_thread(i, data))
        })
        .collect();
    for t in threads {
        t.join().expect("Thread panicked");
    }

    println!("{:?}", data);
}

fn worker_thread(id: u8, data: Arc<HashMap<u8, Mutex<HashMap<u8, Mutex<i32>>>>>) {
    loop {
        // The first unwrap is safe because we populated an entry for every `u8`.
        if let Some(element) = data.get(&id).unwrap().lock().expect("poison").get(&id) {
            let mut element = element.lock().expect("Mutex poisoned");

            // Perform our normal work updating a specific element.
            // Only this shard's Mutex is held, so threads whose ids map
            // to other shards can proceed in parallel.
            *element += 1;
            thread::sleep(Duration::from_secs(1));

            return;
        }

        // If we got this far, the element doesn't exist yet.
        // We use HashMap::entry to handle the case where another thread
        // inserted the same key while the shard was unlocked.
        thread::sleep(Duration::from_millis(50));
        data.get(&id)
            .unwrap()
            .lock()
            .expect("poison")
            .entry(id)
            .or_insert_with(|| Mutex::new(0));

        // Let the loop start us over to try again.
    }
}
Maybe you want to consider evmap:
A lock-free, eventually consistent, concurrent multi-value map.
The trade-off is eventual consistency: readers do not see changes until the writer refreshes the map. A refresh is atomic, and the writer decides when to do it and expose the new data to the readers.
I'm trying to write a program that spawns a background thread that continuously inserts data into some collection. At the same time, I want to keep getting input from stdin and check if that input is in the collection the thread is operating on.
Here is a boiled down example:
use std::collections::HashSet;
use std::thread;

fn main() {
    let mut set: HashSet<String> = HashSet::new();

    thread::spawn(move || {
        loop {
            set.insert("foo".to_string());
        }
    });

    loop {
        let input: String = get_input_from_stdin();
        if set.contains(&input) {
            // Do something...
        }
    }
}

fn get_input_from_stdin() -> String {
    String::new()
}
However, this doesn't compile, because of ownership: the closure moves set into the spawned thread, so the main loop can no longer use it.
I'm still new to Rust but this seems like something that should be possible. I just can't find the right combination of Arcs, Rcs, Mutexes, etc. to wrap my data in.
First of all, please read Need holistic explanation about Rust's cell and reference counted types.
There are two problems to solve here:
Sharing ownership between threads,
Mutable aliasing.
To share ownership, the simplest solution is Arc. It requires its argument to be Sync (accessible safely from multiple threads) which can be achieved for any Send type by wrapping it inside a Mutex or RwLock.
To safely get aliasing in the presence of mutability, both Mutex and RwLock will work. If you had multiple readers, RwLock might have an extra performance edge. Since you have a single reader there's no point: let's use the simple Mutex.
And therefore, your type is: Arc<Mutex<HashSet<String>>>.
The next trick is passing the value to the closure to run in another thread. The value is moved, and therefore you need to first make a clone of the Arc and then pass the clone, otherwise you've moved your original and cannot access it any longer.
Finally, accessing the data requires going through the borrows and locks...
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let set = Arc::new(Mutex::new(HashSet::new()));

    let clone = set.clone();
    thread::spawn(move || {
        loop {
            clone.lock().unwrap().insert("foo".to_string());
        }
    });

    loop {
        let input: String = get_input_from_stdin();
        if set.lock().unwrap().contains(&input) {
            // Do something...
        }
    }
}

// Stub carried over from the question, so the example is self-contained.
fn get_input_from_stdin() -> String {
    String::new()
}
The call to unwrap is there because Mutex::lock returns a Result; it may be impossible to lock the Mutex if it is poisoned, which means a panic occurred while it was locked and therefore its content is possibly garbage.
What is the right approach to share a common object between many threads when the object may sometimes be written to by one owner?
I tried to create one Configuration trait object with several methods to get and set config keys. I'd like to pass this to other threads where configuration items may be read. Bonus points would be if it can be written and read by everyone.
I found a Reddit thread which talks about Rc and RefCell; would that be the right way? I think these would not allow me to borrow the object immutably multiple times and still mutate it.
Rust has a built-in concurrency primitive exactly for this task called RwLock. Together with Arc, it can be used to implement what you want:
use std::sync::mpsc;
use std::sync::{Arc, RwLock};
use std::thread;

const N: usize = 12;

fn main() {
    let shared_data = Arc::new(RwLock::new(Vec::new()));
    let (finished_tx, finished_rx) = mpsc::channel();

    for i in 0..N {
        let shared_data = shared_data.clone();
        let finished_tx = finished_tx.clone();
        if i % 4 == 0 {
            thread::spawn(move || {
                let mut guard = shared_data.write().expect("Unable to lock");
                guard.push(i);
                finished_tx.send(()).expect("Unable to send");
            });
        } else {
            thread::spawn(move || {
                let guard = shared_data.read().expect("Unable to lock");
                println!("From {}: {:?}", i, *guard);
                finished_tx.send(()).expect("Unable to send");
            });
        }
    }

    // wait until everything's done
    for _ in 0..N {
        let _ = finished_rx.recv();
    }
    println!("Done");
}
This example is very silly but it demonstrates what RwLock is and how to use it.
Also note that Rc and RefCell/Cell are not appropriate in a multithreaded environment because they are not properly synchronized. Rust won't even allow you to move them across threads with thread::spawn(). To share data between threads you must use an Arc, and to share mutable data you must additionally use one of the synchronization primitives like RwLock or Mutex.
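Applied to the configuration use case from the question, a minimal sketch (the Config struct and its keys are made up for illustration) could look like:

use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical configuration type standing in for the trait object.
#[derive(Default, Debug)]
struct Config {
    values: HashMap<String, String>,
}

fn main() {
    let config = Arc::new(RwLock::new(Config::default()));

    // One writer updates a key...
    let writer = {
        let config = Arc::clone(&config);
        thread::spawn(move || {
            config
                .write()
                .unwrap()
                .values
                .insert("verbosity".to_string(), "3".to_string());
        })
    };
    writer.join().unwrap();

    // ...and any number of readers can hold the read lock concurrently.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let config = Arc::clone(&config);
            thread::spawn(move || {
                let guard = config.read().unwrap();
                println!("verbosity = {:?}", guard.values.get("verbosity"));
            })
        })
        .collect();
    for r in readers {
        r.join().unwrap();
    }
}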