Difficulty creating a mock 'sink' for testing purposes in Rust - rust

Context
I have a function that will use rayon to paralellize some work. Each thread should write its output to a separate consumer. In practice, these consumers will usually write to (separate) files on disc. For testing, I want them to append to vectors, which I can then make assertions on.
Problem
In fact I hit an issue even with a serial version of this code.
The function I want to test is func_under_test. I want get_sink to create and hand out a new 'sink' (a Vec<i32>) each time it is called, which will be inside func_under_test. But I also want to keep a reference to these vectors in my main function, so I can make assertions on them. I've tried writing this:
fn func_under_test<GetSink, F>(
mut get_sink: GetSink
)
where
GetSink: FnMut(usize) -> F,
F: FnMut(i32) -> (),
{
let idxs : Vec<usize> = (0..2).collect();
idxs.iter().for_each(|&i| {
let mut sink = get_sink(i);
sink(0i32);
sink(1i32);
sink(2i32);
});
}
fn main() {
let mut sinks : Vec<&Vec<i32>> = vec![];
let get_sink = |i: usize| {
let mut sink : Vec<i32> = vec![];
sinks.push(&sink);
move |x : i32| sink.push(x)
};
func_under_test(get_sink);
assert_eq!(**sinks.get(0).unwrap(), vec![0i32, 1i32, 2i32]);
}
But I get this compilation error:
error[E0597]: `sink` does not live long enough
--> src/main.rs:23:20
|
20 | let mut sinks : Vec<&Vec<i32>> = vec![];
| --------- lifetime `'1` appears in the type of `sinks`
...
23 | sinks.push(&sink);
| -----------^^^^^-
| | |
| | borrowed value does not live long enough
| argument requires that `sink` is borrowed for `'1`
24 | move |x : i32| sink.push(x)
25 | };
| - `sink` dropped here while still borrowed
error[E0505]: cannot move out of `sink` because it is borrowed
--> src/main.rs:24:9
|
20 | let mut sinks : Vec<&Vec<i32>> = vec![];
| --------- lifetime `'1` appears in the type of `sinks`
...
23 | sinks.push(&sink);
| -----------------
| | |
| | borrow of `sink` occurs here
| argument requires that `sink` is borrowed for `'1`
24 | move |x : i32| sink.push(x)
| ^^^^^^^^^^^^^^ ---- move occurs due to use in closure
| |
| move out of `sink` occurs here
Am I correct in thinking that I need to use something like an Rc in this context, since I need to have two references to each Vec<i32>?
Since only one of those references needs to be mutable, do I need to use Rc<Cell<Vec<i32>>> in one case, and just Rc<Vec<i32>> in the other?

You could use an Arc<Mutex<i32>> to share mutable access to the vectors. (Rc is not thread-safe.)
A tidier solution would be typed_arena::Arena, which would allow you to write essentially the code you tried, but with the roles swapped: the Vec<i32>s are always owned by the arena (thus it outlives func_under_tests). Then, after all of the references to the arena are gone, you can convert it into a vector of its elements.
use typed_arena::Arena;
fn main() {
let sinks: Arena<Vec<i32>> = Arena::new();
let get_sink = |_i: usize| {
let sink_ref = sinks.alloc(Vec::new());
move |x| sink_ref.push(x)
};
func_under_test(get_sink);
assert_eq!(sinks.into_vec()[0], vec![0i32, 1i32, 2i32]);
}

I think I've figured out the answer to this. The thread doing the testing needs to have access to the inner vectors, and so does the thread that is populating this. So they need to be wrapped in an Arc and a Mutex. But the outer vector does not, since it is only borrowed by the get_sink closure.
So sinks can have type Vec<Arc<Mutex<Vec<i32>>>.
Anyway, I can write this and it does compile and behave and expected:
use rayon::prelude::*;
use std::sync::{Arc, Mutex};
fn func_under_test<G, S>(
get_sink: G)
where
G: Fn(&usize) -> S + Sync,
S: FnMut(i32) -> (),
{
let idxs : Vec<usize> = vec![0, 1, 2];
idxs.par_iter().for_each(|i| {
let mut sink : S = get_sink(i);
for j in 0..3 {
sink(j + 3*(*i) as i32);
}
});
}
fn main() {
let sinks : Vec<Arc<Mutex<Vec<i32>>>> = vec![
Arc::new(Mutex::new(vec![])),
Arc::new(Mutex::new(vec![])),
Arc::new(Mutex::new(vec![])),
];
let get_sink = |i: &usize| {
let m : Arc<Mutex<Vec<i32>>> = sinks.get(*i).unwrap().clone();
move |x: i32| {
m.lock().unwrap().push(x);
}
};
func_under_test(get_sink);
assert_eq!(*sinks.get(0).unwrap().lock().unwrap(), vec![0, 1, 2]);
assert_eq!(*sinks.get(1).unwrap().lock().unwrap(), vec![3, 4, 5]);
assert_eq!(*sinks.get(2).unwrap().lock().unwrap(), vec![6, 7, 8]);
}
It also seems like G implements the correct traits here: it is Sync, since it needs to be shared between threads. But is is not Send, since it does not need to send information between threads.

Related

How do I move a vector reference into threads? [duplicate]

This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed last month.
How do I move a vector reference into threads? The closest I get is the (minimized) code below. (I realize that the costly calculation still isn't parallel, as it is locked by the mutex, but one problem at a time.)
Base problem: I'm calculating values based on information saved in a vector. Then I'm storing the results as nodes per vector element. So vector in vector (but only one vector in the example code below). The calculation takes time so I would like to divide it into threads. The structure is big, so I don't want to copy it.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let n = Nodes::init();
n.calc();
println!("Result: nodes {:?}", n);
}
#[derive(Debug)]
struct Nodes {
nodes: Vec<Info>,
}
impl Nodes {
fn init() -> Self {
let mut n = Nodes { nodes: Vec::new() };
n.nodes.push(Info::init(1));
n.nodes.push(Info::init(2));
n
}
fn calc(&self) {
Nodes::calc_associative(&self.nodes);
}
fn calc_associative(nodes: &Vec<Info>) {
let mut handles = vec![];
let arc_nodes = Arc::new(nodes);
let counter = Arc::new(Mutex::new(0));
for _ in 0..2 {
let arc_nodes = Arc::clone(&arc_nodes);
let counter = Arc::clone(&counter);
let handle = thread::spawn(move || {
let mut idx = counter.lock().unwrap();
// costly calculation
arc_nodes[*idx].set_length(arc_nodes[*idx].get_length() * 2);
*idx += 1;
});
handles.push(handle);
}
for handle in handles {
handle.join().unwrap();
}
}
}
#[derive(Debug)]
struct Info {
length: u32,
}
impl Info {
fn init(length: u32) -> Self {
Info { length }
}
fn get_length(&self) -> u32 {
self.length
}
fn set_length(&mut self, x: u32) {
self.length = x;
}
}
The compiler complains that life time of the reference isn't fulfilled, but isn't that what Arc::clone() should do? Then Arc require a deref, but maybe there are better solutions before starting to dig into that...?
Compiling threads v0.1.0 (/home/freefox/proj/threads)
error[E0596]: cannot borrow data in an `Arc` as mutable
--> src/main.rs:37:17
|
37 | arc_nodes[*idx].set_length(arc_nodes[*idx].get_length() * 2);
| ^^^^^^^^^ cannot borrow as mutable
|
= help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Arc<&Vec<Info>>`
error[E0521]: borrowed data escapes outside of associated function
--> src/main.rs:34:26
|
25 | fn calc_associative(nodes: &Vec<Info>) {
| ----- - let's call the lifetime of this reference `'1`
| |
| `nodes` is a reference that is only valid in the associated function body
...
34 | let handle = thread::spawn(move || {
| __________________________^
35 | | let mut idx = counter.lock().unwrap();
36 | | // costly calculation
37 | | arc_nodes[*idx].set_length(arc_nodes[*idx].get_length() * 2);
38 | | *idx += 1;
39 | | });
| | ^
| | |
| |______________`nodes` escapes the associated function body here
| argument requires that `'1` must outlive `'static`
Some errors have detailed explanations: E0521, E0596.
For more information about an error, try `rustc --explain E0521`.
error: could not compile `threads` due to 2 previous errors
You wrap a reference with Arc. Now the type is Arc<&Vec<Info>>. There is still a reference here, so the variable could still be destroyed before the thread return and we have a dangling reference.
Instead, you should take a &Arc<Vec<Info>>, and on the construction of the Vec wrap it in Arc, or take &[Info] and clone it (let arc_nodes = Arc::new(nodes.to_vec());). You also need a mutex along the way (either Arc<Mutex<Vec<Info>>> or Arc<Vec<Mutex<Info>>>), since you want to change the items.
Or better, since you immediately join() the threads, use scoped threads:
fn calc_associative(nodes: &[Mutex<Info>]) {
let counter = std::sync::atomic::AtomicUsize::new(0); // Changed to atomic, prefer it to mutex wherever possible
std::thread::scope(|s| {
for _ in 0..2 {
s.spawn(|| {
let idx = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
let node = &mut *nodes[idx].lock().unwrap();
// costly calculation
node.set_length(node.get_length() * 2);
});
}
});
}

Pulling a vec out of a vec of structs

I'm working on a compiler for a toy language and I want to be able to check for errors on each file. I have a MooFile struct that has a Vec<anyhow::Error> where I put errors when they are encountered. Now I want to go through the files, get the errors, and flatten that into a single vec to return from the compiler but I'm running up against the borrow checker. I'm going to post my various attempt but they keep running into similar issues.
fn compile(&mut self) -> Result<(), Vec<Error>> {
...
// Check for errors
let has_errors = self.files
.iter()
.map(|file| file.has_errors())
.reduce(|acc, value| acc | value)
.unwrap_or(false);
if has_errors {
let mut errors = vec![];
self.files
.iter()
.for_each(|&mut file| errors.append(&mut file.errors));
Err(errors)
} else {
Ok(())
}
}
error[E0596]: cannot borrow `file.errors` as mutable, as it is behind a `&` reference
--> src\lib.rs:62:48
|
62 | .for_each(|file| errors.append(&mut file.errors));
| ---- ^^^^^^^^^^^^^^^^ `file` is a `&` reference, so the data it refers to cannot be borrowed as mutable
| |
| help: consider changing this to be a mutable reference: `&mut MooFile`
fn compile(&mut self) -> Result<(), Vec<Error>> {
...
// Check for errors
let has_errors = self.files
.iter()
.map(|file| file.has_errors())
.reduce(|acc, value| acc | value)
.unwrap_or(false);
if has_errors {
let mut errors = vec![];
self.files
.into_iter()
.for_each(|mut file| errors.append(&mut file.errors));
Err(errors)
} else {
Ok(())
}
}
error[E0507]: cannot move out of `self.files` which is behind a mutable reference
--> src\lib.rs:63:13
|
63 | self.files
| ^^^^^^^^^^ move occurs because `self.files` has type `Vec<MooFile>`, which does not implement the `Copy` trait
64 | .into_iter()
| ----------- `self.files` moved due to this method call
fn compile(&mut self) -> Result<(), Vec<Error>> {
...
// Check for errors
let errors: Vec<Error> = self.files
.iter()
.map(|file| file.errors)
.flatten()
.collect();
if errors.len() > 0 {
Err(errors)
} else {
Ok(())
}
}
error[E0507]: cannot move out of `file.errors` which is behind a shared reference
--> src\lib.rs:54:25
|
54 | .map(|file| file.errors)
| ^^^^^^^^^^^ move occurs because `file.errors` has type `Vec<anyhow::Error>`, which does not implement the `Copy` trait
For more information about this error, try `rustc --explain E0507`.
error: could not compile `moo2` due to previous error
I feel like there has to be some kind of idiomatic way of doing this kind of thing that I'm missing.
Most of these errors come from how you create your iterators. If you assume x is a Vec<Y>:
x.iter(): Consumes &x and creates an iterator of &Y.
x.iter_mut(): Consumes &mut x and creates an iterator of &mut Y.
x.into_iter(): Consumes ownership of the entire Vec<Y> and creates an iterator of owned Y.
The first version uses .iter() when you want a mutable reference (.iter_mut()). The second uses .into_iter() when you want a mutable reference (.iter_mut()). And the third uses .iter() when you attempt to claim ownership of the errors by disposing of the files (.into_iter(), but you can't drop self.files without issues).
Here is one way you could Clone the errors without moving them out of each file.
let mut errors = Vec::new();
self.files.iter()
.filter(|file| file.has_errors())
.foreach(|file| errors.extend_from_slice(&file.errors));
if has_errors {
return Err(errors)
}
Ok(())
Or you could drain them into a new Vec using extend. This version also uses any to look a bit cleaner, but is not as efficient.
if self.files.iter().any(|f| f.has_errors()) {
let mut errors = Vec::new();
for file in &mut self.files {
errors.extend(file.errors.drain(..));
}
return Err(errors)
}
Ok(())

How to make multiple mutable references to the same element in an array using unsafe code?

I am trying to make a prime sieve in Rust.
I need several mutable references to the same element of the array, not mutable references to different parts of the array.
For the way this works, I know data races are not a relevant problem, so multiple mutable references are acceptable, but the Rust compiler does not accept my unsafe code.
I am using crossbeam 0.8.0.
fn extend_helper(
primes: &Vec<usize>,
true_block: &mut Vec<bool>,
segment_min: usize,
segment_len: usize,
) {
crossbeam::scope(|scope| {
for prime in primes {
let prime = prime.clone();
let segment_min = &segment_min;
let segment_len = &segment_len;
let shared = unsafe { &mut *true_block };
scope.spawn(move |_| {
let tmp = smallest_multiple_of_n_geq_m(prime, *segment_min) - *segment_min;
for j in (tmp..*segment_len).step_by(prime) {
shared[j] = false;
}
});
}
})
.unwrap();
}
fn smallest_multiple_of_n_geq_m(n: usize, m: usize) -> usize {
m + ((n - (m % n)) % n)
}
error[E0499]: cannot borrow `*true_block` as mutable more than once at a time
--> src/lib.rs:12:35
|
7 | crossbeam::scope(|scope| {
| ----- has type `&Scope<'1>`
...
12 | let shared = unsafe { &mut *true_block };
| ^^^^^^^^^^^^^^^^ `*true_block` was mutably borrowed here in the previous iteration of the loop
13 |
14 | / scope.spawn(move |_| {
15 | | let tmp = smallest_multiple_of_n_geq_m(prime, *segment_min) - *segment_min;
16 | | for j in (tmp..*segment_len).step_by(prime) {
17 | | shared[j] = false;
18 | | }
19 | | });
| |______________- argument requires that `*true_block` is borrowed for `'1`
warning: unnecessary `unsafe` block
--> src/lib.rs:12:26
|
12 | let shared = unsafe { &mut *true_block };
| ^^^^^^ unnecessary `unsafe` block
|
= note: `#[warn(unused_unsafe)]` on by default
How I should write the unsafe in a way that Rust accepts?
Rust does not allow that in its model. You want AtomicBool with relaxed ordering.
This is a somewhat common edge case in most concurrency models - if you have two non-atomic writes of the same value to one location, is that well-defined? In Rust (and C++) it is not, and you need to explicitly use atomic.
On any platform you care about, the atomic bool store with relaxed ordering will have no impact.
For a reason why this matters, consider:
pub fn do_the_thing(x: &mut bool) {
if !*x { return };
// Do some stuff.
*x = true;
}
At setting x to true, the compiler is free to assume (due to no shared mutable references) that x is still false. It may implement this assignment as, e.g., inc x in assembly, moving its value from 0 to 1.
If two threads came through this and both reached *x = true, its value may become something other than 0 or 1, which could violate other assumed invariants.

How to pass an Arc clone to a closure?

I'm trying to write a closure that uses an Arc by cloning it. Ideally I'd like to have the clone inside the closure, but I'm kinda forced to pass the original Arc, which might be the reason I'm getting the error:
use std::sync::Arc;
use std::sync::Condvar;
use std::sync::Mutex;
use std::collections::VecDeque;
type Fifo<T> = Arc<(Mutex<VecDeque<T>>, Condvar)>;
fn executor(f: Box<dyn Fn()>) {
f();
}
fn main() {
let a = Fifo::<u8>::new(
(Mutex::new(VecDeque::new()), Condvar::new())
);
let r = Box::new(||{
let f = a.clone();
f.0.lock().unwrap().push_back(0);
});
executor(r);
}
Error:
error[E0597]: `a` does not live long enough
--> src/main.rs:19:17
|
18 | let r = Box::new(||{
| -- value captured here
19 | let f = a.clone();
| ^ borrowed value does not live long enough
...
22 | executor(r);
| - cast requires that `a` is borrowed for `'static`
23 | }
| - `a` dropped here while still borrowed
error: aborting due to previous error
I thought changing to
let r = Box::new(||{
//let f = a.clone();
a.0.lock().unwrap().push_back(0);
});
would force the closure to decide to clone a, therefore fixing the problem, but I get the same error.
How can I pass an Arc to a closure?
Clone the Arc outside the closure and then move the clone into the closure. Example:
use std::collections::VecDeque;
use std::sync::Arc;
use std::sync::Condvar;
use std::sync::Mutex;
type Fifo<T> = Arc<(Mutex<VecDeque<T>>, Condvar)>;
fn executor(f: Box<dyn Fn()>) {
f();
}
fn main() {
let a = Fifo::<u8>::new((Mutex::new(VecDeque::new()), Condvar::new()));
let f = a.clone();
let r = Box::new(move || {
f.0.lock().unwrap().push_back(0);
});
executor(r);
}
playground

Lifetime issues in rust: borrowed value does not live long enough

I have reproduced my problem in the short code below.
Problem: The inner thread uses reference of variable v from the outer thread. The rust compiler throws an error because "technically" the outer thread could terminate before the inner thread and hence inner thread could loose access to variable v. However in the code below that clearly cannot happen.
Question: How shall I change this code so that it complies while maintaining the same functionality?
fn main() { //outer thread
let v = vec![0, 1];
let test = Test { v: &v }; //inner_thread
std::thread::spawn(move || test.print());
loop {
// this thread will never die because it will never leave this loop
}
}
pub struct Test<'a> {
v: &'a Vec<u32>,
}
impl<'a> Test<'a> {
fn print(&self) {
println!("{:?}", self.v);
}
}
error[E0597]: `v` does not live long enough
--> src/main.rs:3:26
|
3 | let test = Test { v: &v }; //inner_thread
| ^^ borrowed value does not live long enough
4 | std::thread::spawn(move || test.print());
| ---------------------------------------- argument requires that `v` is borrowed for `'static`
...
8 | }
| - `v` dropped here while still borrowed
The obvious solution would be to have Test own the vector instead of just have a reference.
But if you really need to borrow the value in the thread (probably because you want to use it after end of execution), then you may use crossbeam's scope:
let v = vec![0, 1];
let test = Test { v: &v }; //inner_thread
crossbeam::thread::scope(|scope| {
scope.spawn(|_| test.print());
}).unwrap();

Resources