Printing the Arc and Mutex types - rust

How can I print the values of a Vec that is encapsulated by a Mutex and Arc? I'm really new to Rust, so I'm not sure if I am phrasing this well.
This is the code I have, loosely based on the documentation.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![104, 101, 108, 108, 111]));
for i in 0..2 {
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
data[i] += 1;
});
}
println!("{:?}", String::from_utf8(data).unwrap());
thread::sleep_ms(50);
}
The error that the compiler gives me:
$ rustc datarace_fixed.rs
datarace_fixed.rs:14:37: 14:41 error: mismatched types:
expected collections::vec::Vec<u8>,
found alloc::arc::Arc<std::sync::mutex::Mutex<collections::vec::Vec<_>>>
(expected struct collections::vec::Vec,
found struct alloc::arc::Arc) [E0308]
datarace_fixed.rs:14 println!("{:?}", String::from_utf8(data).unwrap());

To work with a Mutex value you have to lock the Mutex, just like you do in the spawned threads. (playpen):
let data = data.lock().unwrap();
println!("{:?}", String::from_utf8(data.clone()).unwrap());
Note that String::from_utf8 consumes the vector (in order to wrap it in a String without extra allocations), which is obvious from it taking a value vec: Vec<u8> and not a reference. Since we aren't ready to relinquish our hold on data we have to clone it when using this method.
A cheaper alternative would be to use the slice-based version of from_utf8 (playpen):
let data = data.lock().unwrap();
println!("{:?}", from_utf8(&data).unwrap());

Related

How to use std::slice::Chunks on Arc<Mutex<Vec<u8>>> properly between threads in Rust?

i'd like to gain understanding why following seems to not work properly in Rust.
I'd like to chunk a vector and give every thread a chunk to work on it. I tried it with a Arc and a Mutex combination to have mutual access to my vec.
This was my first (obvious) attempt:
Declare the vec, chunk it, send chunk into each thread. In my understanding it should work because the Chunk methods guarantees non overlapping chunks.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![0;20]));
let chunk_size = 5;
let mut threads = vec![];
let chunks: Vec<&mut [u8]> = data.lock().unwrap().chunks_mut(chunk_size).collect();
for chunk in chunks.into_iter(){
threads.push(thread::spawn(move || {
inside_thread(chunk)
}));
}
}
fn inside_thread(chunk: &mut [u8]) {
// now do something with chunk
}
The error says data does not live enough. Silly me, with the chunking i created pointers to the array but passed no Arc reference into the thread.
So i changed a few lines, but it would make no sense because I'd have an unused reference in my thread!?
for i in 0..data.lock().unwrap().len() / 5 {
let ref_to_data = data.clone();
threads.push(thread::spawn(move || {
inside_thread(chunk, ref_to_data)
}));
}
The error still says that data does not live enough.
The next attempt didn't work either. I thought ok i could work around it and chunk it inside the thread and get my chunk by an index. But if it worked the code wouldn't be very rust idiomatic :/
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![0;20]));
let chunk_size = 5;
let mut threads = vec![];
for i in 0..data.lock().unwrap().len() / chunk_size {
let ref_to_data = data.clone();
threads.push(thread::spawn(move || {
inside_thread(ref_to_data, i, chunk_size)
}));
}
}
fn inside_thread(data: Arc<Mutex<Vec<u8>>>, index: usize, chunk_size: usize) {
let chunk: &mut [u8] = data.lock().unwrap().chunks_mut(chunk_size).collect()[index];
// now do something with chunk
}
The error says:
--> src/main.rs:18:72
|
18 | let chunk: &mut [u8] = data.lock().unwrap().chunks_mut(chunk_size).collect()[index];
| ^^^^^^^
| |
| cannot infer type for type parameter `B` declared on the method `collect`
| help: consider specifying the type argument in the method call: `collect::<B>`
|
= note: type must be known at this point
And when i try to infer it it doesn't work either. So for now i tried much and nothing worked and im out of ideas.
Ok what i always could do is to "do the chunking on my own". Just mutating the vector inside the threads works fine. But this is no idiomatic nice way to do it (in my opinion).
Question: Is there a possible way to solve this problem rust idiomatic?
Hint: I know i could work with an scoped-threadpool or something like this but i want to gain this knowledge for my thesis.
Much thanks for spending your time on this!

Vector holding mutable functions

I would like to have a vector with functions. Then I would like to iterate on this vector and execute functions one by one. The functions would mutate an external state. Additionally, I would like to be able to place the same function twice in the vector.
The problems I have are:
I cannot dereference and execute the function from the vector,
Adding the same function to the vector twice fails with, understandable, error that I cannot have two mutable references.
The closest I got is:
fn main() {
let mut c = 0;
{
let mut f = ||{c += 1};
let mut v: Vec<&mut FnMut()> = vec![];
v.push(&mut f);
// How to execute the stored function? The following complains about
// an immutable reference:
// assignment into an immutable reference
// (v[0])();
// How to store the same function twice? The following will fail with:
// cannot borrow `f` as mutable more than once at a time
// v.push(&mut f);
}
println!("c {}", c);
}
For the first problem, I don't really know why no mutable dereference happens here (in my opinion, it should), but there is a simple workaround: just do the dereference and then reference manually:
(&mut *v[0])();
Your second problem is more complex, though. There is no simple solution, because what you're trying to do violates Rust aliasing guarantees, and since you did not describe the purpose of it, I can't suggest alternatives properly. In general, however, you can overcome this error by switching to runtime borrow-checking with Cell/RefCell or Mutex (the latter is when you need concurrent access). With Cell (works nice for primitives):
use std::cell::Cell;
fn main() {
let c = Cell::new(0);
{
let f = || { c.set(c.get() + 1); };
let mut v: Vec<&Fn()> = vec![];
v.push(&f);
v.push(&f);
v[0]();
v[1]();
}
println!("c {}", c.get());
}
With RefCell (works nice for more complex types):
use std::cell::RefCell;
fn main() {
let c = RefCell::new(0);
{
let f = || { *c.borrow_mut() += 1; };
let mut v: Vec<&Fn()> = vec![];
v.push(&f);
v.push(&f);
v[0]();
v[1]();
}
println!("c {}", *c.borrow());
}
As you can see, now you have &Fn() instead of &mut FnMut(), which can be aliased freely, and whose captured environment may also contain aliased references (immutable, of course).

How to tell Rust to let me modify a shared variable hidden behind an RwLock?

Safe Rust demands the following from all references:
One or more references (&T) to a resource,
Exactly one mutable reference (&mut T).
I want to have one Vec that can be read by multiple threads and written by one, but only one of those should be possible at a time (as the language demands).
So I use an RwLock.
I need a Vec<i8>. To let it outlive the main function, I Box it and then I RwLock around that, like thus:
fn main() {
println!("Hello, world!");
let mut v = vec![0, 1, 2, 3, 4, 5, 6];
let val = RwLock::new(Box::new(v));
for i in 0..10 {
thread::spawn(move || threadFunc(&val));
}
loop {
let mut VecBox = (val.write().unwrap());
let ref mut v1 = *(*VecBox);
v1.push(1);
//And be very busy.
thread::sleep(Duration::from_millis(10000));
}
}
fn threadFunc(val: &RwLock<Box<Vec<i8>>>) {
loop {
//Use Vec
let VecBox = (val.read().unwrap());
let ref v1 = *(*VecBox);
println!("{}", v1.len());
//And be very busy.
thread::sleep(Duration::from_millis(1000));
}
}
Rust refuses to compile this:
capture of moved value: `val`
--> src/main.rs:14:43
|
14 | thread::spawn(move || threadFunc(&val));
| ------- ^^^ value captured here after move
| |
| value moved (into closure) here
Without the thread:
for i in 0..10 {
threadFunc(&val);
}
It compiles. The problem is with the closure. I have to "move" it, or else Rust complains that it can outlive main, I also can't clone val (RwLock doesn't implement clone()).
What should I do?
Note that there's no structural difference between using a RwLock and a Mutex; they just have different access patterns. See
Concurrent access to vector from multiple threads using a mutex lock for related discussion.
The problem centers around the fact that you've transferred ownership of the vector (in the RwLock) to some thread; therefore your main thread doesn't have it anymore. You can't access it because it's gone.
In fact, you'll have the same problem as you've tried to pass the vector to each of the threads. You only have one vector to give away, so only one thread could have it.
You need thread-safe shared ownership, provided by Arc:
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;
fn main() {
println!("Hello, world!");
let v = vec![0, 1, 2, 3, 4, 5, 6];
let val = Arc::new(RwLock::new(v));
for _ in 0..10 {
let v = val.clone();
thread::spawn(move || thread_func(v));
}
for _ in 0..5 {
{
let mut val = val.write().unwrap();
val.push(1);
}
thread::sleep(Duration::from_millis(1000));
}
}
fn thread_func(val: Arc<RwLock<Vec<i8>>>) {
loop {
{
let val = val.read().unwrap();
println!("{}", val.len());
}
thread::sleep(Duration::from_millis(100));
}
}
Other things to note:
I removed the infinite loop in main so that the code can actually finish.
I fixed all of the compiler warnings. If you are going to use a compiled language, pay attention to the warnings.
unnecessary parentheses
snake_case identifiers. Definitely do not use PascalCase for local variables; that's used for types. camelCase does not get used in Rust.
I added some blocks to shorten the lifetime that the read / write locks will be held. Otherwise there's a lot of contention and the child threads never have a chance to get a read lock.
let ref v1 = *(*foo); is non-idiomatic. Prefer let v1 = &**foo. You don't even need to do that at all, thanks to Deref.

How to read (std::io::Read) from a Vec or Slice?

Vecs support std::io::Write, so code can be written that takes a File or Vec, for example. From the API reference, it looks like neither Vec nor slices support std::io::Read.
Is there a convenient way to achieve this? Does it require writing a wrapper struct?
Here is an example of working code, that reads and writes a file, with a single line commented that should read a vector.
use ::std::io;
// Generic IO
fn write_4_bytes<W>(mut file: W) -> Result<usize, io::Error>
where W: io::Write,
{
let len = file.write(b"1234")?;
Ok(len)
}
fn read_4_bytes<R>(mut file: R) -> Result<[u8; 4], io::Error>
where R: io::Read,
{
let mut buf: [u8; 4] = [0; 4];
file.read(&mut buf)?;
Ok(buf)
}
// Type specific
fn write_read_vec() {
let mut vec_as_file: Vec<u8> = Vec::new();
{ // Write
println!("Writing Vec... {}", write_4_bytes(&mut vec_as_file).unwrap());
}
{ // Read
// println!("Reading File... {:?}", read_4_bytes(&vec_as_file).unwrap());
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// Comment this line above to avoid an error!
}
}
fn write_read_file() {
let filepath = "temp.txt";
{ // Write
let mut file_as_file = ::std::fs::File::create(filepath).expect("open failed");
println!("Writing File... {}", write_4_bytes(&mut file_as_file).unwrap());
}
{ // Read
let mut file_as_file = ::std::fs::File::open(filepath).expect("open failed");
println!("Reading File... {:?}", read_4_bytes(&mut file_as_file).unwrap());
}
}
fn main() {
write_read_vec();
write_read_file();
}
This fails with the error:
error[E0277]: the trait bound `std::vec::Vec<u8>: std::io::Read` is not satisfied
--> src/main.rs:29:42
|
29 | println!("Reading File... {:?}", read_4_bytes(&vec_as_file).unwrap());
| ^^^^^^^^^^^^ the trait `std::io::Read` is not implemented for `std::vec::Vec<u8>`
|
= note: required by `read_4_bytes`
I'd like to write tests for a file format encoder/decoder, without having to write to the file-system.
While vectors don't support std::io::Read, slices do.
There is some confusion here caused by Rust being able to coerce a Vec into a slice in some situations but not others.
In this case, an explicit coercion to a slice is needed because at the stage coercions are applied, the compiler doesn't know that Vec<u8> doesn't implement Read.
The code in the question will work when the vector is coerced into a slice using one of the following methods:
read_4_bytes(&*vec_as_file)
read_4_bytes(&vec_as_file[..])
read_4_bytes(vec_as_file.as_slice()).
Note:
When asking the question initially, I was taking &Read instead of Read. This made passing a reference to a slice fail, unless I'd passed in &&*vec_as_file which I didn't think to do.
Recent versions of rust you can also use as_slice() to convert a Vec to a slice.
Thanks to #arete on #rust for finding the solution!
std::io::Cursor
std::io::Cursor is a simple and useful wrapper that implements Read for Vec<u8>, so it allows to use vector as a readable entity.
let mut file = Cursor::new(vector);
read_something(&mut file);
And documentation shows how to use Cursor instead of File to write unit-tests!
Working example:
use std::io::Cursor;
use std::io::Read;
fn read_something(file: &mut impl Read) {
let _ = file.read(&mut [0; 8]);
}
fn main() {
let vector = vec![1, 2, 3, 4];
let mut file = Cursor::new(vector);
read_something(&mut file);
}
From the documentation about std::io::Cursor:
Cursors are typically used with in-memory buffers to allow them to implement Read and/or Write...
The standard library implements some I/O traits on various types which are commonly used as a buffer, like Cursor<Vec<u8>> and Cursor<&[u8]>.
Slice
The example above works for slices as well. In that case it would look like the following:
read_something(&mut &vector[..]);
Working example:
use std::io::Read;
fn read_something(file: &mut impl Read) {
let _ = file.read(&mut [0; 8]);
}
fn main() {
let vector = vec![1, 2, 3, 4];
read_something(&mut &vector[..]);
}
&mut &vector[..] is a "mutable reference to a slice" (a reference to a reference to a part of vector), so I just find the explicit option with Cursor to be more clear and elegant.
Cursor <-> Slice
Even more: if you have a Cursor that owns a buffer, and you need to emulate, for instance, a part of a "file", you can get a slice from the Cursor and pass to the function.
read_something(&mut &file.get_ref()[1..3]);

How do I share a mutable object between threads using Arc?

I'm trying to share a mutable object between threads in Rust using Arc, but I get this error:
error[E0596]: cannot borrow data in a `&` reference as mutable
--> src/main.rs:11:13
|
11 | shared_stats_clone.add_stats();
| ^^^^^^^^^^^^^^^^^^ cannot borrow as mutable
This is the sample code:
use std::{sync::Arc, thread};
fn main() {
let total_stats = Stats::new();
let shared_stats = Arc::new(total_stats);
let threads = 5;
for _ in 0..threads {
let mut shared_stats_clone = shared_stats.clone();
thread::spawn(move || {
shared_stats_clone.add_stats();
});
}
}
struct Stats {
hello: u32,
}
impl Stats {
pub fn new() -> Stats {
Stats { hello: 0 }
}
pub fn add_stats(&mut self) {
self.hello += 1;
}
}
What can I do?
Arc's documentation says:
Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc. If you need to mutate through an Arc, use Mutex, RwLock, or one of the Atomic types.
You will likely want a Mutex combined with an Arc:
use std::{
sync::{Arc, Mutex},
thread,
};
struct Stats;
impl Stats {
fn add_stats(&mut self, _other: &Stats) {}
}
fn main() {
let shared_stats = Arc::new(Mutex::new(Stats));
let threads = 5;
for _ in 0..threads {
let my_stats = shared_stats.clone();
thread::spawn(move || {
let mut shared = my_stats.lock().unwrap();
shared.add_stats(&Stats);
});
// Note: Immediately joining, no multithreading happening!
// THIS WAS A LIE, see below
}
}
This is largely cribbed from the Mutex documentation.
How can I use shared_stats after the for? (I'm talking about the Stats object). It seems that the shared_stats cannot be easily converted to Stats.
As of Rust 1.15, it's possible to get the value back. See my additional answer for another solution as well.
[A comment in the example] says that there is no multithreading. Why?
Because I got confused! :-)
In the example code, the result of thread::spawn (a JoinHandle) is immediately dropped because it's not stored anywhere. When the handle is dropped, the thread is detached and may or may not ever finish. I was confusing it with JoinGuard, a old, removed API that joined when it is dropped. Sorry for the confusion!
For a bit of editorial, I suggest avoiding mutability completely:
use std::{ops::Add, thread};
#[derive(Debug)]
struct Stats(u64);
// Implement addition on our type
impl Add for Stats {
type Output = Stats;
fn add(self, other: Stats) -> Stats {
Stats(self.0 + other.0)
}
}
fn main() {
let threads = 5;
// Start threads to do computation
let threads: Vec<_> = (0..threads).map(|_| thread::spawn(|| Stats(4))).collect();
// Join all the threads, fail if any of them failed
let result: Result<Vec<_>, _> = threads.into_iter().map(|t| t.join()).collect();
let result = result.unwrap();
// Add up all the results
let sum = result.into_iter().fold(Stats(0), |i, sum| sum + i);
println!("{:?}", sum);
}
Here, we keep a reference to the JoinHandle and then wait for all the threads to finish. We then collect the results and add them all up. This is the common map-reduce pattern. Note that no thread needs any mutability, it all happens in the master thread.

Resources