What tools does Rust provide to avoid copies when using MPSC channels?

I'm creating a multithreaded application using mpsc to share memory between my threads:
use std::thread;
use std::sync::mpsc::{self, Sender, Receiver};
#[derive(Debug)]
struct Msg {
pub content: Vec<i16>,
/* ... */
}
#[derive(Debug)]
struct MsgBack {
pub content: Vec<i16>,
pub new_content: Vec<i16>,
/* ... */
}
fn child(rx: mpsc::Receiver<Msg>, tx: mpsc::Sender<MsgBack>) {
let message = rx.recv().unwrap();
let new_content = message.content.iter().map(|x| -x).collect();
tx.send(MsgBack { // The memory is moved/copied
content: message.content,
new_content: new_content,
});
}
fn main() {
let (tx, rx): (Sender<Msg>, Receiver<Msg>) = mpsc::channel();
let (tx_back, rx_back): (Sender<MsgBack>, Receiver<MsgBack>) = mpsc::channel();
thread::spawn(move || {
child(rx, tx_back);
});
let message = Msg {
content: (0..100).map(|x| x).collect(), // Dummy initialisation
};
println!("{:#?}", message);
tx.send(message).unwrap(); // The memory is moved/copied
let answer = rx_back.recv().unwrap();
println!("{:#?}", answer);
}
I did some profiling and I saw that sending the data is responsible for 1/3 of the execution time in my real program (which sends more than just a Vec).
I want to keep this code structure but avoid moves/copies when sending a message to save a lot of time.

Moving a Vec doesn't move its contents, only the 3-word "header" (pointer, capacity, length). Therefore, unless your MsgBack contains a lot of other fields or large fixed-size arrays inline, it should be rather cheap to move.
In general, you can put things in Box to allocate them on the heap, so then the Box<T> itself is only pointer-sized. "Moving" of the Box doesn't move any data, only copies the pointer.
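As a rough check of the two claims above, here is a minimal sketch. The Big type and the roundtrip helper are made up for illustration and are not taken from the question. It compares the address of a Vec's heap buffer before it is sent and after it is received, and prints the size of a Box that holds a large inline array. The addresses should match, since only the header travels through the channel.

```rust
use std::mem::size_of;
use std::sync::mpsc;
use std::thread;

// Hypothetical message type with a large inline array, for illustration only.
#[allow(dead_code)]
struct Big {
    samples: [i16; 4096],
}

// Sends a Vec through a channel and returns the address of its heap buffer
// as seen by the sender (before) and by the receiver (after).
fn roundtrip() -> (usize, usize) {
    let (tx, rx) = mpsc::channel::<Vec<i16>>();
    let content: Vec<i16> = (0..100).collect();
    let before = content.as_ptr() as usize;

    let handle = thread::spawn(move || {
        let received = rx.recv().unwrap();
        received.as_ptr() as usize
    });

    tx.send(content).unwrap();
    let after = handle.join().unwrap();
    (before, after)
}

fn main() {
    let (before, after) = roundtrip();
    println!("same heap buffer: {}", before == after);
    println!("size of Vec<i16> header: {} bytes", size_of::<Vec<i16>>());
    println!("size of Big inline: {} bytes", size_of::<Big>());
    println!("size of Box<Big>: {} bytes", size_of::<Box<Big>>());
}
```

If a message struct embeds something like Big directly, every send copies the whole array; wrapping that field in a Box reduces the copy to one pointer.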
If your actual iterators are more complex than your example and don't have a useful size_hint, you may be seeing memcpy from .collect() reallocating the vector as it grows. You can avoid that by pre-allocating the required size:
Instead of:
let dst = iterator.collect();
use:
let mut dst = Vec::with_capacity(required_size);
dst.extend(iterator);

Related

Safely call Arc::from_raw multiple times for a raw pointer? In other words, how to hold a raw pointer of an Arc and later use it multiple times?

I need to hold a raw pointer of an Arc, and later use it multiple times. Then, is it safe to do the following?
use std::sync::Arc;
let x_ptr = Arc::into_raw(Arc::new(some_big_object));
// Use it concurrently and multiple times.
// Ignore the grammar error since this is only a demonstration.
for i in 0..10 {
thread::spawn(|| {
unsafe {
let x_recovered = ManuallyDrop::new(Arc::from_raw(x_ptr));
let x_recovered_cloned = x_recovered.clone();
make_use_of_it(x_recovered);
}
});
}
wait_and_join_the_threads();
// Here I really deallocate the memory.
drop(Arc::from_raw(x_ptr));
fn make_use_of_it(x: Arc) {...}

You can use Arc::increment_strong_count():
let x_ptr = Arc::into_raw(Arc::new(some_big_object));
for i in 0..10 {
thread::spawn(|| {
unsafe {
Arc::increment_strong_count(x_ptr);
let x_recovered = Arc::from_raw(x_ptr);
make_use_of_it(x_recovered);
}
});
}
unsafe {
Arc::decrement_strong_count(x_ptr);
// Could be also
// drop(Arc::from_raw(x_ptr));
}
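The snippet above is a demonstration and will not compile as written, because raw pointers are not Send and the closures do not move the pointer. A self-contained sketch of the same idea could look like the following. The SendPtr wrapper, the Vec<u64> payload and the run helper are assumptions added for this illustration, not part of the original answer.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical wrapper: raw pointers are not `Send`, so the sketch wraps the
// pointer in order to move it into the spawned threads.
#[derive(Clone, Copy)]
struct SendPtr(*const Vec<u64>);

// SAFETY: the pointer comes from `Arc::into_raw`, and `Arc<Vec<u64>>` is
// itself `Send + Sync`.
unsafe impl Send for SendPtr {}

impl SendPtr {
    // Taking `self` by value makes closures capture the whole wrapper rather
    // than only its raw-pointer field.
    fn get(self) -> *const Vec<u64> {
        self.0
    }
}

fn make_use_of_it(x: Arc<Vec<u64>>) -> u64 {
    x.iter().sum()
}

// Returns the sum computed by all threads and the final strong count.
fn run() -> (u64, usize) {
    let x_ptr = SendPtr(Arc::into_raw(Arc::new(vec![1u64, 2, 3])));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(move || unsafe {
                let raw = x_ptr.get();
                // Add one strong count for the Arc reconstructed below.
                Arc::increment_strong_count(raw);
                make_use_of_it(Arc::from_raw(raw))
            })
        })
        .collect();

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    // Take back the count that `into_raw` left behind; dropping `original`
    // at the end of this function frees the allocation.
    let original = unsafe { Arc::from_raw(x_ptr.get()) };
    let remaining = Arc::strong_count(&original);
    (total, remaining)
}

fn main() {
    let (total, remaining) = run();
    println!("total = {total}, strong count at the end = {remaining}");
}
```

Each Arc::from_raw consumes exactly one strong count, so every reconstruction is paired with one increment_strong_count, and the count created by into_raw is consumed once at the end.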

Are non-atomic writes safe to read if gated by an atomic operation?

I want to create an object that is "empty" but can hold complex data (here a and b) that I can update later, and then set an atomic flag to mark it as non-empty so that it can be used in other threads. Pseudo example:
use std::sync::atomic::{AtomicBool, Ordering};
use std::cell::Cell;
use std::sync::Arc;
use std::{thread, time};
struct MyObject {
is_empty: AtomicBool,
a: Cell<u64>,
b: Cell<u64>,
}
unsafe impl Sync for MyObject {}
fn main() {
let obj = Arc::new(MyObject {
is_empty: AtomicBool::new(true),
a: Cell::new(0),
b: Cell::new(0)
});
let thread_obj = obj.clone();
let t = thread::spawn(move || {
while thread_obj.is_empty.load(Ordering::SeqCst) {
thread::sleep(time::Duration::from_millis(10));
}
println!("a is: {}", thread_obj.a.get());
println!("b is: {}", thread_obj.b.get());
});
thread::sleep(time::Duration::from_millis(100));
obj.a.set(42);
obj.b.set(5);
obj.is_empty.store(false, Ordering::SeqCst);
t.join().unwrap();
}
It seems to work, but that doesn't mean much. I'm mostly concerned about whether the writes to a and b will definitely be visible to other threads that read is_empty as false. If I guarantee that:
- all writes to a and b occur before setting the flag, and
- no thread reads a and b before the flag is set,
is this OK?
I could use an AtomicPtr instead, create the object in full, and swap the pointer, but I'm curious if I can avoid the extra indirection.

You might want to use Release and Acquire instead of SeqCst.
Release:
When coupled with a store, all previous operations become ordered before any load of this value with Acquire (or stronger) ordering. In particular, all previous writes become visible to all threads that perform an Acquire (or stronger) load of this value.
Acquire:
When coupled with a load, if the loaded value was written by a store operation with Release (or stronger) ordering, then all subsequent operations become ordered after that store. In particular, all subsequent loads will see data written before the store.
Change this:
fn main() {
let obj = Arc::new(MyObject {
is_empty: AtomicBool::new(true),
a: Cell::new(0),
b: Cell::new(0)
});
let thread_obj = obj.clone();
let t = thread::spawn(move || {
while thread_obj.is_empty.load(Ordering::SeqCst) {
thread::sleep(time::Duration::from_millis(10));
}
println!("a is: {}", thread_obj.a.get());
println!("b is: {}", thread_obj.b.get());
});
thread::sleep(time::Duration::from_millis(100));
obj.a.set(42);
obj.b.set(5);
obj.is_empty.store(false, Ordering::SeqCst);
t.join().unwrap();
}
Into:
fn main() {
let obj = Arc::new(MyObject {
is_empty: AtomicBool::new(true),
a: Cell::new(0),
b: Cell::new(0)
});
let thread_obj = obj.clone();
let t = thread::spawn(move || {
while thread_obj.is_empty.load(Ordering::Acquire){ // change
thread::sleep(time::Duration::from_millis(10));
}
println!("a is: {}", thread_obj.a.get());
println!("b is: {}", thread_obj.b.get());
});
thread::sleep(time::Duration::from_millis(100));
obj.a.set(42);
obj.b.set(5);
obj.is_empty.store(false, Ordering::Release); //change
t.join().unwrap();
}
Also see the standard library docs and the Rustonomicon.
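As an alternative sketch, not part of the answer above: since Rust 1.70 the standard library provides std::sync::OnceLock, which packages the same "write once, then publish" pattern behind a safe API and stores the value inline, so no unsafe impl Sync and no extra pointer indirection are needed. The Payload type and the run helper are names invented for this illustration.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;
use std::time::Duration;

// Hypothetical payload grouping the two fields from the question.
struct Payload {
    a: u64,
    b: u64,
}

struct MyObject {
    // Empty until `set` is called; readers only ever see a fully written value.
    data: OnceLock<Payload>,
}

fn run() -> (u64, u64) {
    let obj = Arc::new(MyObject { data: OnceLock::new() });
    let thread_obj = Arc::clone(&obj);

    let reader = thread::spawn(move || {
        loop {
            // `get` returns None until the value has been published.
            if let Some(p) = thread_obj.data.get() {
                break (p.a, p.b);
            }
            thread::sleep(Duration::from_millis(1));
        }
    });

    thread::sleep(Duration::from_millis(10));
    // Publish both fields at once; `set` fails only if a value is already there.
    if obj.data.set(Payload { a: 42, b: 5 }).is_err() {
        panic!("value was already set");
    }

    reader.join().unwrap()
}

fn main() {
    let (a, b) = run();
    println!("a is: {a}");
    println!("b is: {b}");
}
```

The trade-off is that the payload can only be written once; if the object needs to be refilled later, the hand-written flag with Release and Acquire (or a lock) is still the way to go.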

How to Use Serial Port in Multiple Threads in Rust?

I am trying to read and write to my serial port on Linux to communicate with a microcontroller and I'm trying to do so in Rust.
My normal pattern when developing in say C++ or Python is to have two threads: one which sends requests out over serial periodically and one which reads bytes out of the buffer and handles them.
In Rust, I'm running into trouble with the borrow checker while using the serial crate. I understand why, but I'm unsure what the design of an asynchronous communication interface would look like in Rust. Here's a snippet of my source:
let mut port = serial::open(&device_path.as_os_str()).unwrap();
let request_temperature: Vec<u8> = vec![0xAA];
thread::spawn(|| {
let mut buffer: Vec<u8> = Vec::new();
loop {
let _bytes_read = port.read(&mut buffer);
// process data
thread::sleep(Duration::from_millis(100));
}
});
loop {
port.write(&request_temperature);
thread::sleep(Duration::from_millis(1000));
}
How can I emulate this functionality, where two threads hold onto a mutable resource, in Rust? I know this specific example could be done in a single thread, but I expect an eventual larger program to end up using multiple threads.

You can wrap your port in an Arc and a Mutex, then you can write something like:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
struct Port;
impl Port {
pub fn read(&mut self, _v: &mut Vec<u8>) {
println!("READING...");
}
pub fn write(&mut self, _v: &Vec<u8>) {
println!("WRITING...");
}
}
pub fn main() {
let port = Arc::new(Mutex::new(Port));
let p2 = port.clone();
let handle = thread::spawn(move || {
let mut buffer: Vec<u8> = Vec::new();
for j in 0..100 {
let _bytes_read = p2.lock().unwrap().read(&mut buffer);
thread::sleep(Duration::from_millis(10));
}
});
let request_temperature: Vec<u8> = vec![0xAA];
for i in 0..10 {
port.lock().unwrap().write(&request_temperature);
thread::sleep(Duration::from_millis(100));
}
handle.join().unwrap();
}
So that this will run on a test machine, I've replaced the serial port with a stub struct, reduced the sleeps and replaced the infinite loops with finite ones.
While this works, you'll probably want proper communication between the threads at some stage, at which point you'll want to look at std::sync::mpsc::channel.
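One way the channel-based design could look is sketched below, under stated assumptions: a single I/O thread owns the port, and other threads send it commands over a std::sync::mpsc channel, so no Mutex is needed. The Port stub, the Command enum and the run helper are hypothetical names for this illustration; a real program would use the port type from its serial crate instead of the stub.

```rust
use std::sync::mpsc;
use std::thread;

// Stub standing in for the real serial port; it just records what was written.
struct Port {
    written: Vec<u8>,
}

impl Port {
    fn write(&mut self, bytes: &[u8]) {
        self.written.extend_from_slice(bytes);
    }
}

// Hypothetical command set understood by the I/O thread.
enum Command {
    Write(Vec<u8>),
    Shutdown,
}

// Returns everything the I/O thread wrote to the (stub) port.
fn run() -> Vec<u8> {
    let (tx, rx) = mpsc::channel::<Command>();

    // The I/O thread is the only owner of the port.
    let io_thread = thread::spawn(move || {
        let mut port = Port { written: Vec::new() };
        for cmd in rx {
            match cmd {
                Command::Write(bytes) => port.write(&bytes),
                Command::Shutdown => break,
            }
        }
        port.written
    });

    // Any number of producers can hold a clone of the sender.
    let producer_tx = tx.clone();
    let producer = thread::spawn(move || {
        for _ in 0..3 {
            producer_tx.send(Command::Write(vec![0xAA])).unwrap();
        }
    });

    producer.join().unwrap();
    tx.send(Command::Shutdown).unwrap();
    io_thread.join().unwrap()
}

fn main() {
    println!("bytes written: {:?}", run());
}
```

Because only one thread ever touches the port, the borrow checker has nothing to object to, and received data can be forwarded to other threads over a second channel in the same way.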

Idiomatic Rust method for handling references to a buffer

I would like to be able to construct objects that contain immutable references to a mutable buffer object. The following code does not work but illustrates my use case. Is there an idiomatic Rust way to handle this?
#[derive(Debug)]
struct Parser<'a> {
buffer: &'a String
}
fn main() {
let mut source = String::from("Peter");
let buffer = &source;
let parser = Parser { buffer };
// How can I legally change source?
source.push_str(" Pan");
println!("{:?}", parser);
}

The golden rule of the Rust borrow checker is: only one writer OR multiple readers can access a resource at a time. This ensures that algorithms are safe to run in multiple threads.
You breach this rule here:
#[derive(Debug)]
struct Parser<'a> {
buffer: &'a String
}
fn main() {
// mutable access begins here
let mut source = String::from("Peter");
// immutable access begins here
let buffer = &source;
let parser = Parser { buffer };
source.push_str(" Pan");
println!("{:?}", parser);
// Both immutable and mutable access end here
}
If you are sure that your program doesn't actively access resources at the same time mutably and immutably, you can move the check from compile time to run time by wrapping your resource in a RefCell:
use std::cell::RefCell;
use std::rc::Rc;
#[derive(Debug)]
struct Parser {
buffer: Rc<RefCell<String>>
}
fn main() {
let source = Rc::new(RefCell::new(String::from("Peter")));
let parser = Parser { buffer: source.clone() };
source.borrow_mut().push_str(" Pan");
println!("{:?}", parser);
}
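To make "moving the check to run time" concrete, here is a small sketch (the demo helper is a name invented for this illustration). While a shared borrow is alive, try_borrow_mut reports a conflict instead of compiling to an error; borrow_mut would panic in the same situation.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns whether a mutable borrow was refused while a shared borrow was
// alive, and the final contents of the buffer.
fn demo() -> (bool, String) {
    let source = Rc::new(RefCell::new(String::from("Peter")));
    let reader = Rc::clone(&source);

    // A shared borrow is active here, so a mutable borrow must be refused.
    let guard = reader.borrow();
    let conflict = source.try_borrow_mut().is_err();
    drop(guard);

    // With the shared borrow released, mutation succeeds.
    source.borrow_mut().push_str(" Pan");
    let contents = reader.borrow().as_str().to_owned();
    (conflict, contents)
}

fn main() {
    let (conflict, contents) = demo();
    println!("mutable borrow refused while reading: {conflict}");
    println!("contents: {contents}");
}
```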
If you plan on passing your resource around threads, you can use an RwLock to block the thread until the resource is available:
use std::sync::{RwLock, Arc};
#[derive(Debug)]
struct Parser {
buffer: Arc<RwLock<String>>
}
fn main() {
let source = Arc::new(RwLock::new(String::from("Peter")));
let parser = Parser { buffer: source.clone() };
source.write().unwrap().push_str(" Pan");
println!("{:?}", parser);
}
On another note, you should prefer &str over &String.
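A short sketch of why &str is the more flexible choice: a field of type &str accepts a borrowed String, a string literal, or a sub-slice, whereas &String only accepts the first. The first_word helper is a hypothetical addition for this illustration.

```rust
#[derive(Debug)]
struct Parser<'a> {
    buffer: &'a str,
}

// Hypothetical helper: returns the first whitespace-separated word.
fn first_word<'a>(parser: &Parser<'a>) -> &'a str {
    parser.buffer.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("Peter Pan");

    // A &String coerces to &str automatically.
    let from_string = Parser { buffer: &owned };
    // String literals are already &'static str.
    let from_literal = Parser { buffer: "Wendy Darling" };
    // Sub-slices work too, without allocating a new String.
    let from_slice = Parser { buffer: &owned[6..] };

    println!("{}", first_word(&from_string));
    println!("{}", first_word(&from_literal));
    println!("{}", first_word(&from_slice));
}
```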

It's hard to tell what exactly you want to achieve by mutating the source; I would assume you don't want it to happen while the parser is doing its work? You can always try (depending on your specific use case) to separate the immutable from the mutable with an extra scope:
fn main() {
let mut source = String::from("Peter");
{
let buffer = &source;
let parser = Parser { buffer };
println!("{:?}", parser);
}
source.push_str(" Pan");
}
If you don't want to use RefCell, unsafe (or to simply keep a mutable reference to source in Parser and use that), I'm afraid it doesn't get better than plain refactoring.

To elaborate on how this can be done unsafely: what you've described can be achieved by using a raw const pointer to sidestep the borrowing rules. This is inherently unsafe, as is the very concept of what you've described. There are ways to make it safer, should you choose this path, but I would probably default to an Arc<RwLock> or Arc<Mutex> if safety is important.
use std::fmt::{self, Display};
#[derive(Debug)]
struct Parser {
buffer: *const String
}
impl Display for Parser {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let buffer = unsafe { &*self.buffer };
write!(f, "{}", buffer)
}
}
fn main() {
let mut source = String::from("Peter");
let buffer = &source as *const String;
let parser = Parser { buffer };
source.push_str(" Pan");
println!("{}", parser);
}

How to create and write to memory mapped files?

Editor's note: This code example is from a version of Rust prior to 1.0 and the code it uses does not exist in Rust 1.0. Some answers have been updated to answer the core question for newer versions of Rust.
I'm trying to create a memory mapped file using std::os::MemoryMap. The current approach looks as follows:
use std::os;
use std::ptr;
use std::old_io as io;
use std::os::unix::prelude::AsRawFd;
use std::os::MapOption;
let path = Path::new("test.mmap");
let f = match io::File::open_mode(&path, io::Open, io::ReadWrite) {
Ok(f) => f,
Err(err) => panic!("Could not open file: {}", err),
};
let mmap_opts = &[
MapOption::MapReadable,
MapOption::MapWritable,
MapOption::MapFd(f.as_raw_fd())
];
let mmap = match os::MemoryMap::new(1024*1024, mmap_opts) {
Ok(mmap) => {
println!("Successfully created the mmap: {}", mmap.len());
mmap
}
Err(err) => panic!("Could not read the mmap: {}", err),
};
unsafe {
let data = mmap.data();
if data.is_null() {
panic!("Could not access data from memory mapped file")
}
let src = "Hello!";
ptr::copy_memory(data, src.as_ptr(), src.as_bytes().len());
}
This program fails with
Process didn't exit successfully: `target/mmap` (status=4)
when calling ptr::copy_memory or any other operations on data.
What is the reason I cannot write (or read) the data from the MemoryMap?
What is the correct way to use MemoryMap in Rust?

The real answer is to use a crate that provides this functionality, ideally in a cross-platform manner.
use memmap; // 0.7.0
use std::{
fs::OpenOptions,
io::{Seek, SeekFrom, Write},
};
const SIZE: u64 = 1024 * 1024;
fn main() {
let src = "Hello!";
let mut f = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("test.mmap")
.expect("Unable to open file");
// Allocate space in the file first
f.seek(SeekFrom::Start(SIZE)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
let mut data = unsafe {
memmap::MmapOptions::new()
.map_mut(&f)
.expect("Could not access data from memory mapped file")
};
data[..src.len()].copy_from_slice(src.as_bytes());
}
Note that it is still possible for this code to lead to undefined behavior. Since the slice is backed by a file, the contents of the file (and thus the slice) may change from outside of the Rust program, breaking the invariants that the unsafe block is supposed to hold. The programmer needs to ensure that the file doesn't change during the lifetime of the map. Unfortunately, the crate itself does not provide much assistance to prevent this from happening or even any documentation warning the user.
If you wish to use lower-level system calls, you are missing two main parts:
- mmap doesn't allocate any space on its own, so you need to set some space in the file. Without this, I get Illegal instruction: 4 when running on macOS.
- MemoryMap (was) private by default, so you need to mark the mapping as public so that changes are written back to the file (I'm assuming you want the writes to be saved). Without this, the code runs, but the file is never changed.
Here's a version that works for me:
use libc; // 0.2.67
use std::{
fs::OpenOptions,
io::{Seek, SeekFrom, Write},
os::unix::prelude::AsRawFd,
ptr,
};
fn main() {
let src = "Hello!";
let size = 1024 * 1024;
let mut f = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("test.mmap")
.expect("Unable to open file");
// Allocate space in the file first
f.seek(SeekFrom::Start(size as u64)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
// This refers to the `File` but doesn't use lifetimes to indicate
// that. This is very dangerous, and you need to be careful.
unsafe {
let data = libc::mmap(
/* addr: */ ptr::null_mut(),
/* len: */ size,
/* prot: */ libc::PROT_READ | libc::PROT_WRITE,
// Then make the mapping *public* so it is written back to the file
/* flags: */ libc::MAP_SHARED,
/* fd: */ f.as_raw_fd(),
/* offset: */ 0,
);
if data == libc::MAP_FAILED {
panic!("Could not access data from memory mapped file")
}
ptr::copy_nonoverlapping(src.as_ptr(), data as *mut u8, src.len());
}
}
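The "allocate space in the file first" step can also be written with File::set_len from the standard library, which extends the file with zeros to exactly the requested length. Note that seeking to size and writing one byte, as in the code above, produces a file of size + 1 bytes. The sketch below shows only the allocation step, not the mapping itself; the allocate helper and the temporary file name are assumptions made for this illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;

// Opens (or creates) the file, resizes it to `size` bytes and returns the
// length reported by the file system afterwards.
fn allocate(path: &Path, size: u64) -> io::Result<u64> {
    let f = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    // Extends (or truncates) the file to exactly `size` bytes.
    f.set_len(size)?;
    Ok(f.metadata()?.len())
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file in the system temp directory.
    let path = std::env::temp_dir().join("set_len_example.mmap");
    let len = allocate(&path, 1024 * 1024)?;
    println!("file is now {len} bytes");
    fs::remove_file(&path)?;
    Ok(())
}
```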

Up to date version:
use std::ptr;
use std::fs;
use std::io::{Write, SeekFrom, Seek};
use std::os::unix::prelude::AsRawFd;
use mmap::{MemoryMap, MapOption};
// from crates.io
extern crate mmap;
extern crate libc;
fn main() {
let size: usize = 1024*1024;
let mut f = fs::OpenOptions::new().read(true)
.write(true)
.create(true)
.open("test.mmap")
.unwrap();
// Allocate space in the file first
f.seek(SeekFrom::Start(size as u64)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
let mmap_opts = &[
// Then make the mapping *public* so it is written back to the file
MapOption::MapNonStandardFlags(libc::consts::os::posix88::MAP_SHARED),
MapOption::MapReadable,
MapOption::MapWritable,
MapOption::MapFd(f.as_raw_fd()),
];
let mmap = MemoryMap::new(size, mmap_opts).unwrap();
let data = mmap.data();
if data.is_null() {
panic!("Could not access data from memory mapped file")
}
let src = "Hello!";
let src_data = src.as_bytes();
unsafe {
ptr::copy(src_data.as_ptr(), data, src_data.len());
}
}

2022-version:
use memmap2::Mmap;
use std::fs::{self};
use std::io::{Seek, SeekFrom, Write};
use std::ops::DerefMut;
pub fn memmap2() {
// How to write to a file using mmap
// First open the file with writing option
let mut file = fs::OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("mmap_write_example2.txt")
.unwrap();
// Allocate space in the file for the data to be written,
// UTF8-encode string to get byte slice.
let data_to_write: &[u8] =
"Once upon a midnight dreary as I pondered weak and weary; äåößf\n".as_bytes();
let size: usize = data_to_write.len();
file.seek(SeekFrom::Start(size as u64 - 1)).unwrap();
file.write_all(&[0]).unwrap();
file.seek(SeekFrom::Start(0)).unwrap();
// Then write to the file
let mmap = unsafe { Mmap::map(&file).unwrap() };
let mut mut_mmap = mmap.make_mut().unwrap();
mut_mmap.deref_mut().write_all(data_to_write).unwrap();
}
