Rust: concurrency error, program hangs after first thread - multithreading

I have created a simplified version of my problem below. I have a Bag struct and an Item struct. I want to spawn 10 threads that execute the item_action method of Bag on each item in an item_list, and print a statement if both of the item's attributes are in the bag's attributes.
use std::sync::{Mutex, Arc};
use std::thread;

#[derive(Clone, Debug)]
struct Bag {
    attributes: Arc<Mutex<Vec<usize>>>
}

impl Bag {
    fn new(n: usize) -> Self {
        let mut v = Vec::with_capacity(n);
        for _ in 0..n {
            v.push(0);
        }
        Bag {
            attributes: Arc::new(Mutex::new(v)),
        }
    }

    fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(), ()> {
        if self.attributes.lock().unwrap().contains(&item_attr1) ||
           self.attributes.lock().unwrap().contains(&item_attr2) {
            println!("Item attributes {} and {} are in Bag attribute list!", item_attr1, item_attr2);
            Ok(())
        } else {
            Err(())
        }
    }
}

#[derive(Clone, Debug)]
struct Item {
    item_attr1: usize,
    item_attr2: usize,
}

impl Item {
    pub fn new(item_attr1: usize, item_attr2: usize) -> Self {
        Item {
            item_attr1: item_attr1,
            item_attr2: item_attr2
        }
    }
}

fn main() {
    let mut item_list: Vec<Item> = Vec::new();
    for i in 0..10 {
        item_list.push(Item::new(i, (i + 1) % 10));
    }

    let bag: Bag = Bag::new(10); //create 10 attributes

    let mut handles = Vec::with_capacity(10);
    for x in 0..10 {
        let bag2 = bag.clone();
        let item_list2 = item_list.clone();
        handles.push(
            thread::spawn(move || {
                bag2.item_action(item_list2[x].item_attr1, item_list2[x].item_attr2);
            })
        )
    }

    for h in handles {
        println!("Here");
        h.join().unwrap();
    }
}
When I run it, I only get one line of output, and the program just hangs there without returning.
Item attributes 0 and 1 are in Bag attribute list!
May I know what went wrong? Please see the code in the Playground.
Updated:
With the suggestion from @loganfsmyth, the program now returns... but it still only prints 1 line, as above. I expect it to print 10 lines because my item_list has 10 items. I'm not sure if my thread logic is correct.
I have added println!("Here"); where I join all the threads, and I can see Here printed 10 times, just not the actual log from item_action.

I believe this is because your

    if self.attributes.lock().unwrap().contains(&item_attr1) ||
       self.attributes.lock().unwrap().contains(&item_attr2) {

expression is not doing what you expect. The temporary MutexGuard returned by the first lock() call is kept alive until the whole if condition has finished evaluating, so whenever the first contains check returns false, the second lock() runs on the same thread while the first lock is still held. What you essentially end up with is

    let condition = {
        let lock1 = self.attributes.lock().unwrap();
        let lock2 = self.attributes.lock().unwrap();
        lock1.contains(&item_attr1) || lock2.contains(&item_attr2)
    };
    if condition {

which is causing your code to deadlock.
You should instead write:
    let attributes = self.attributes.lock().unwrap();
    if attributes.contains(&item_attr1) ||
       attributes.contains(&item_attr2) {
so that there is only one lock.
Your code might also work as-is if you used a reentrant lock (for example parking_lot::ReentrantMutex) instead of a Mutex, since that allows the same thread to acquire the lock more than once. Note that std's RwLock does not guarantee that recursive read locks on the same thread won't deadlock, so taking the lock once, as above, is the more robust fix.
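For reference, a minimal sketch of item_action with the single lock applied (same Bag struct as above, only the locking changed):

    fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(), ()> {
        // Take the lock once; the guard lives until the end of the function,
        // and both membership checks go through the same guard.
        let attributes = self.attributes.lock().unwrap();
        if attributes.contains(&item_attr1) || attributes.contains(&item_attr2) {
            println!("Item attributes {} and {} are in Bag attribute list!", item_attr1, item_attr2);
            Ok(())
        } else {
            Err(())
        }
    }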

Related

How to avoid lifetime errors when filtering with an `async` predicate?

Using an async predicate to filter a list of values makes Rust complain about lifetimes. Even though the collection is awaited, which means the predicate cannot outlive the filtered value, Rust remains skeptical.
Full repro below, with playground here. Note that it filters on a non-Copy struct we'd rather pass by reference, rather than a simple value we could just copy and forget without incurring overhead.
use futures::stream::iter;
use futures::StreamExt;

#[derive(Debug)]
struct Foo {
    bar: usize
}

impl Foo {
    fn new(bar: usize) -> Self {
        Self {
            bar
        }
    }
}

#[tokio::main]
async fn main() {
    let arr = vec![
        Foo::new(0),
        Foo::new(1),
        Foo::new(2)
    ];
    let filtered = iter(arr)
        .filter(|f| async { compute_baz(f).await > 0 })
        .collect::<Vec<_>>()
        .await;
    // should print Foo{bar:1} and Foo{bar:2}
    println!("{:?}", filtered)
}

async fn compute_baz(foo: &Foo) -> usize {
    // ...do lengthy task...
    foo.bar
}
Update
As @Ceasar pointed out below, the async functions are not run in parallel. Can that be done?
I'm trying to do something like:

    let filter_mask = join_all(items.map(predicate));
    let filtered = items.filter(|i| filter_mask[i]).collect::<Vec<_>>();

without the clutter.
An easy workaround is to avoid the closure:
    let mut filtered = vec![];
    for f in arr.iter() {
        if compute_baz(f).await > 0 {
            filtered.push(f);
        }
    }
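On the update about running the predicates concurrently: one option (a sketch, not from the original answer, assuming the same futures and tokio dependencies as the question) is to evaluate all the predicates with futures::future::join_all and then zip the resulting mask back onto the items. This runs the futures concurrently on a single task; it does not spread the work across threads:

    use futures::future::join_all;

    #[derive(Debug)]
    struct Foo {
        bar: usize
    }

    async fn compute_baz(foo: &Foo) -> usize {
        // ...do lengthy task...
        foo.bar
    }

    #[tokio::main]
    async fn main() {
        let arr = vec![Foo { bar: 0 }, Foo { bar: 1 }, Foo { bar: 2 }];
        // Evaluate every predicate concurrently; only shared borrows of `arr` are held.
        let mask = join_all(arr.iter().map(|foo| compute_baz(foo))).await;
        // Zip the mask back onto the items and keep those whose result is > 0.
        let filtered: Vec<&Foo> = arr
            .iter()
            .zip(mask)
            .filter(|(_, baz)| *baz > 0)
            .map(|(f, _)| f)
            .collect();
        println!("{:?}", filtered); // Foo { bar: 1 } and Foo { bar: 2 }
    }

Because the mask is computed against immutable borrows and the filtering happens afterwards, the lifetime complaint from the closure-based filter never arises.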

How to update in one thread and read from many?

I've failed to get this code past the borrow-checker:
use std::sync::Arc;
use std::thread::{sleep, spawn};
use std::time::Duration;

#[derive(Debug, Clone)]
struct State {
    count: u64,
    not_copyable: Vec<u8>,
}

fn bar(thread_num: u8, arc_state: Arc<State>) {
    let state = arc_state.clone();
    loop {
        sleep(Duration::from_millis(1000));
        println!("thread_num: {}, state.count: {}", thread_num, state.count);
    }
}

fn main() -> std::io::Result<()> {
    let mut state = State {
        count: 0,
        not_copyable: vec![],
    };
    let arc_state = Arc::new(state);
    for i in 0..2 {
        spawn(move || {
            bar(i, arc_state.clone());
        });
    }
    loop {
        sleep(Duration::from_millis(300));
        state.count += 1;
    }
}
I'm probably trying the wrong thing.
I want one (main) thread which can update state and many threads which can read state.
How should I do this in Rust?
I have read the Rust book on shared state, but that uses mutexes which seem overly complex for a single writer / multiple reader situation.
In C I would achieve this with a generous sprinkling of _Atomic.
Atomics are indeed a proper way to do this; there are plenty of them in std::sync::atomic. Your example needs two fixes.
Arc must be cloned before moving into the closure, so your loop becomes:
    for i in 0..2 {
        let arc_state = arc_state.clone();
        spawn(move || { bar(i, arc_state); });
    }
Using AtomicU64 is fairly straightforward, though you need to use the atomic type's methods explicitly with a specified Ordering (Playground):
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{sleep, spawn};
use std::time::Duration;

#[derive(Debug)]
struct State {
    count: AtomicU64,
    not_copyable: Vec<u8>,
}

fn bar(thread_num: u8, arc_state: Arc<State>) {
    let state = arc_state.clone();
    loop {
        sleep(Duration::from_millis(1000));
        println!(
            "thread_num: {}, state.count: {}",
            thread_num,
            state.count.load(Ordering::Relaxed)
        );
    }
}

fn main() -> std::io::Result<()> {
    let state = State {
        count: AtomicU64::new(0),
        not_copyable: vec![],
    };
    let arc_state = Arc::new(state);
    for i in 0..2 {
        let arc_state = arc_state.clone();
        spawn(move || {
            bar(i, arc_state);
        });
    }
    loop {
        sleep(Duration::from_millis(300));
        // you can't use `state` here, because it moved
        arc_state.count.fetch_add(1, Ordering::Relaxed);
    }
}
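If the shared state were more than a single counter (for example if the readers also needed the not_copyable field), a std::sync::RwLock around the whole State is the usual single-writer / many-reader tool. A rough sketch of that variant, with the same thread layout as the question (this is not part of the original answer):

    use std::sync::{Arc, RwLock};
    use std::thread::{sleep, spawn};
    use std::time::Duration;

    #[derive(Debug)]
    struct State {
        count: u64,
        not_copyable: Vec<u8>,
    }

    fn main() {
        let arc_state = Arc::new(RwLock::new(State {
            count: 0,
            not_copyable: vec![],
        }));
        for i in 0..2 {
            let arc_state = arc_state.clone();
            spawn(move || loop {
                sleep(Duration::from_millis(1000));
                // Readers take a shared lock; many threads can hold it at once.
                let state = arc_state.read().unwrap();
                println!("thread_num: {}, state.count: {}", i, state.count);
            });
        }
        loop {
            sleep(Duration::from_millis(300));
            // The single writer takes the exclusive lock briefly.
            arc_state.write().unwrap().count += 1;
        }
    }

The trade-off versus atomics is that every read takes a lock, but the lock then covers all of the fields together rather than one integer.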

Maintaining a mutable reference to struct in HashMap

Is it possible to borrow a mutable reference to the contents of a HashMap and use it for an extended period of time without impeding read-only access?
This is for trying to maintain a window into the state of various components in a system that are running independently (via Tokio) and need to be monitored.
As an example:
use std::sync::Arc;
use std::collections::HashMap;

struct Container {
    running: bool,
    count: u8
}

impl Container {
    fn run(&mut self) {
        for i in 1..100 {
            self.count = i;
        }
        self.running = false;
    }
}

fn main() {
    let mut map = HashMap::new();
    let mut container = Arc::new(
        Box::new(
            Container {
                running: true,
                count: 0
            }
        )
    );
    map.insert(0, container.clone());
    container.run();
    map.remove(&0);
}
This is for a Tokio-driven program where multiple operations will be happening asynchronously and visibility into the overall state of them is required.
There's this question where a temporary mutable reference can be borrowed, but that won't work as the run() function needs time to complete.
Based on suggestions from Jmb and Stargateur, I reworked this to use an RwLock internally. These internals could be hidden behind methods that manipulate them, but the basics are here:
use std::sync::Arc;
use std::sync::RwLock;
use std::collections::HashMap;

#[derive(Debug)]
struct ContainerState {
    running: bool,
    count: u8
}

struct Container {
    state: Arc<RwLock<ContainerState>>
}

impl Container {
    fn run(&self) {
        for i in 1..100 {
            let mut state = self.state.write().unwrap();
            state.count = i;
        }
        {
            let mut state = self.state.write().unwrap();
            state.running = false;
        }
    }
}

fn main() {
    let mut map = HashMap::new();
    let state = Arc::new(
        RwLock::new(
            ContainerState {
                running: true,
                count: 0
            }
        )
    );
    map.insert(0, state);
    let container = Container {
        state: map[&0].clone()
    };
    container.run();
    println!("Final state: {:?}", map[&0]);
    map.remove(&0);
}
The key thing I was missing is that you can have either one mutable reference or any number of immutable references, but the two are mutually exclusive. My initial understanding was that these two limits were independent.
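A tiny illustration of that rule, separate from the HashMap example:

    fn main() {
        let mut v = vec![1, 2, 3];
        let r1 = &v;        // any number of shared borrows is fine...
        let r2 = &v;
        println!("{} {}", r1[0], r2[0]);
        // ...but a mutable borrow is exclusive: uncommenting the next two lines
        // while r1/r2 are still in use is a compile error (E0502).
        // let m = &mut v;
        // println!("{} {}", r1[0], m[0]);
        v.push(4);          // fine: the shared borrows ended above
        println!("{:?}", v);
    }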

"thread '<main>' has overflowed its stack" when constructing a large tree

I implemented a tree struct:
use std::collections::VecDeque;
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct A {
    children: Option<VecDeque<Rc<RefCell<A>>>>
}

// I got thread '<main>' has overflowed its stack
fn main() {
    let mut tree_stack: VecDeque<Rc<RefCell<A>>> = VecDeque::new();
    // when num is 1000, everything works
    for i in 0..100000 {
        tree_stack.push_back(Rc::new(RefCell::new(A { children: None })));
    }
    println!("{:?}", "reach here means we are not out of mem");
    loop {
        if tree_stack.len() == 1 { break; }
        let mut new_tree_node = Rc::new(RefCell::new(A { children: None }));
        let mut tree_node_children: VecDeque<Rc<RefCell<A>>> = VecDeque::new();
        // combine last two nodes to one new node
        match tree_stack.pop_back() {
            Some(x) => {
                tree_node_children.push_front(x);
            },
            None => {}
        }
        match tree_stack.pop_back() {
            Some(x) => {
                tree_node_children.push_front(x);
            },
            None => {}
        }
        new_tree_node.borrow_mut().children = Some(tree_node_children);
        tree_stack.push_back(new_tree_node);
    }
}
Playpen link
But it crashes with
thread '<main>' has overflowed its stack
How do I fix that?
The problem you are experiencing occurs because you have a giant linked list of nodes. When that list is dropped, the first element tries to free all the members of its struct first. That means that the second element does the same, and so on, until the end of the list. This means that you will have a call stack that is proportional to the number of elements in your list!
Here's a small reproduction:
struct A {
    children: Option<Box<A>>
}

fn main() {
    let mut list = A { children: None };
    for _ in 0..1_000_000 {
        list = A { children: Some(Box::new(list)) };
    }
}
And here's how you would fix it:
impl Drop for A {
    fn drop(&mut self) {
        if let Some(mut child) = self.children.take() {
            while let Some(next) = child.children.take() {
                child = next;
            }
        }
    }
}
This code overrides the default recursive drop implementation with an iterative one. It rips the children out of the node, replacing it with a terminal item (None). It then allows the node to drop normally, but there will be no recursive calls.
The code is complicated a bit because we can't drop ourselves, so we need to do a little two-step dance to ignore the first item and then eat up all the children.
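Applying the same idea to the Rc<RefCell<...>> tree from the question might look something like the following sketch (not from the original answer; it only unwraps nodes that have a single owner, which is the case for the tree built above):

    use std::cell::RefCell;
    use std::collections::VecDeque;
    use std::rc::Rc;

    struct A {
        children: Option<VecDeque<Rc<RefCell<A>>>>,
    }

    impl Drop for A {
        fn drop(&mut self) {
            // Keep a work list of detached children instead of letting each
            // node drop its subtree recursively.
            let mut stack: VecDeque<Rc<RefCell<A>>> = self.children.take().unwrap_or_default();
            while let Some(node) = stack.pop_back() {
                // Only unwrap nodes we solely own; shared nodes are left to
                // their other owners.
                if let Ok(cell) = Rc::try_unwrap(node) {
                    let mut inner = cell.into_inner();
                    if let Some(children) = inner.children.take() {
                        stack.extend(children);
                    }
                    // `inner` drops here with `children == None`, so its own
                    // Drop impl does no further work.
                }
            }
        }
    }

    fn main() {
        // Build a deep chain like the question does, then let it drop.
        let mut node = Rc::new(RefCell::new(A { children: None }));
        for _ in 0..100_000 {
            let mut children = VecDeque::new();
            children.push_back(node);
            node = Rc::new(RefCell::new(A { children: Some(children) }));
        }
        // `node` drops here without overflowing the stack.
    }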
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
How do I move out of a struct field that is an Option?

Split/gather pattern for jobs

I have a set of jobs that I am trying to run in parallel. I want to run each task on its own thread and gather the responses on the calling thread.
Some jobs may take much longer than others, so I'd like to start using each result as it comes in, and not have to wait for all jobs to complete.
Here is an attempt:
struct Container<T> {
    items: Vec<T>
}

#[derive(Debug)]
struct Item {
    x: i32
}

impl Item {
    fn foo(&mut self) {
        self.x += 1; //consider an expensive mutating computation
    }
}

fn main() {
    use std;
    use std::sync::{Mutex, Arc};
    use std::collections::RingBuf;

    //set up a container with 2 items
    let mut item1 = Item { x: 0 };
    let mut item2 = Item { x: 1 };
    let container = Container { items: vec![item1, item2] };

    //set a gather system for our results
    let ringBuf = Arc::new(Mutex::new(RingBuf::<Item>::new()));

    //farm out each job to its own thread...
    for item in container.items {
        std::thread::Thread::spawn(|| {
            item.foo(); //job
            ringBuf.lock().unwrap().push_back(item); //push item back to caller
        });
    }

    loop {
        let rb = ringBuf.lock().unwrap();
        if rb.len() > 0 { //gather results as soon as they are available
            println!("{:?}", rb[0]);
            rb.pop_front();
        }
    }
}
For starters, this does not compile, failing with the impenetrable "cannot infer an appropriate lifetime due to conflicting requirements" error.
What am I doing wrong and how do I do it right?
You've got a couple of compounding issues, but the first one is a misuse / misunderstanding of Arc. You need to give each thread its own copy of the Arc; the Arc keeps the shared buffer alive across threads, while the Mutex inside it synchronizes the changes. The main changes were the addition of .clone() and the move keyword:
    for item in container.items {
        let mrb = ringBuf.clone();
        std::thread::Thread::spawn(move || {
            item.foo(); //job
            mrb.lock().unwrap().push_back(item); //push item back to caller
        });
    }
After changing this, you'll run into some simpler errors about forgotten mut qualifiers, and then you hit another problem: you are trying to send mutable references across threads. Your for loop would need to yield &mut Item in order to call foo, but that doesn't match how you are iterating the Vec. Changing it, we can get to something that compiles:
    for mut item in container.items.into_iter() {
        let mrb = ringBuf.clone();
        std::thread::Thread::spawn(move || {
            item.foo(); //job
            mrb.lock().unwrap().push_back(item); //push item back to caller
        });
    }
Here, we consume the input vector, moving each of the Items to the worker thread. Unfortunately, this still hits the Playpen timeout; the gathering loop in main never breaks, so the program never terminates on its own.
All that being said, I'd highly recommend using channels:
#![feature(std_misc)]

use std::sync::mpsc::channel;

#[derive(Debug)]
struct Item {
    x: i32
}

impl Item {
    fn foo(&mut self) { self.x += 1; }
}

fn main() {
    let items = vec![Item { x: 0 }, Item { x: 1 }];

    let rx = {
        let (tx, rx) = channel();
        for item in items.into_iter() {
            let my_tx = tx.clone();
            std::thread::Thread::spawn(move || {
                let mut item = item;
                item.foo();
                my_tx.send(item).unwrap();
            });
        }
        rx
    };

    for item in rx.iter() {
        println!("{:?}", item);
    }
}
This also times-out in the playpen, but works fine when compiled and run locally.
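For readers on current Rust: the Thread::spawn and RingBuf APIs above are long gone, but the same channel pattern translates directly to std::thread and std::sync::mpsc. A sketch of that translation (not the original answer's code):

    use std::sync::mpsc::channel;
    use std::thread;

    #[derive(Debug)]
    struct Item {
        x: i32,
    }

    impl Item {
        fn foo(&mut self) {
            self.x += 1; //consider an expensive mutating computation
        }
    }

    fn main() {
        let items = vec![Item { x: 0 }, Item { x: 1 }];
        let (tx, rx) = channel();
        for mut item in items {
            let my_tx = tx.clone();
            thread::spawn(move || {
                item.foo();
                my_tx.send(item).unwrap();
            });
        }
        // Drop the original sender so the receiver's iterator ends once
        // every worker has finished and dropped its clone.
        drop(tx);
        for item in rx {
            println!("{:?}", item);
        }
    }

Results are still printed in completion order, which is exactly the "use each result as it comes in" behaviour the question asks for.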
