Multiprocessing atomics as a spinlock in Rust?

I would like to use a spinlock in my code that will be shared by different processes. Since Rust's atomics support the swap operations a spinlock needs, I would like to know:
Can I use Rust's atomics in shared memory between processes while keeping their safety guarantees?
First I need to know how to create an instance inside shared memory, but most importantly:
Is it possible to do something like this to reuse an atomic in a second process?
use std::sync::atomic::{AtomicU8, Ordering};

fn use_atomic(ptr_value: u64) {
    // given: some memory pointer,
    // e.g. ptr_value == 0xDEADBEEFu64
    let atomic = unsafe { &*(ptr_value as *mut AtomicU8) };
    let _old_value = atomic.swap(1, Ordering::Relaxed);
}
In case this is a bad idea: is there a better way to do this in Rust? (I'm not used to assembly, but maybe there is some finished code I missed)
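For illustration, a minimal sketch of the spinlock this is building toward, assuming the pointer refers to a live, properly aligned AtomicU8 placed in shared memory (a sketch, not vetted code):

use std::sync::atomic::{AtomicU8, Ordering};

// Sketch only: spin until we swap the byte from 0 (unlocked) to 1 (locked).
unsafe fn spin_lock(lock: *const AtomicU8) {
    let lock = &*lock;
    // `swap` returns the previous value; keep spinning while it was 1.
    while lock.swap(1, Ordering::Acquire) == 1 {
        std::hint::spin_loop();
    }
}

unsafe fn spin_unlock(lock: *const AtomicU8) {
    (*lock).store(0, Ordering::Release);
}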

Related

What does "uninitialized" mean in the context of FFI?

I'm writing some GPU code for macOS using the metal crate. In doing so, I allocate a Buffer object by calling:
let buffer = device.new_buffer(num_bytes, MTLResourceOptions::StorageModeShared)
This FFIs to Apple's Metal API, which allocates a region of memory that both the CPU and GPU can access and the Rust wrapper returns a Buffer object. I can then get a pointer to this region of memory by doing:
let data = buffer.contents() as *mut u32
In the colloquial sense, this region of memory is uninitialized. However, is this region of memory "uninitialized" in the Rust sense?
Is this sound?
let num_bytes = num_u32 * std::mem::size_of::<u32>();
let buffer = device.new_buffer(num_bytes, MTLResourceOptions::StorageModeShared);
let data = buffer.contents() as *mut u32;
let as_slice = unsafe { slice::from_raw_parts_mut(data, num_u32) };
for i in as_slice {
    *i = 42u32;
}
Here I'm writing u32s to a region of memory returned to me by FFI. From the nomicon:
...The subtle aspect of this is that usually, when we use = to assign to a value that the Rust type checker considers to already be initialized (like x[i]), the old value stored on the left-hand side gets dropped. This would be a disaster. However, in this case, the type of the left-hand side is MaybeUninit<Box<u32>>, and dropping that does not do anything! See below for some more discussion of this drop issue.
None of the from_raw_parts rules are violated and u32 doesn't have a drop method.
Nonetheless, is this sound?
Would reading from the region (as u32s) before writing to it be sound (nonsense values aside)? The region of memory is valid and u32 is defined for all bit patterns.
Best practices
Now consider a type T that does have a drop method (and you've done all the bindgen and #[repr(C)] nonsense so that it can go across FFI boundaries).
In this situation, should one:
Initialize the buffer in Rust by scanning the region with pointers and calling .write()?
Do:
let as_slice = unsafe { slice::from_raw_parts_mut(data as *mut MaybeUninit<T>, num_t) };
for i in as_slice {
    *i = unsafe { MaybeUninit::new(T::new()).assume_init() };
}
Furthermore, after initializing the region, how does the Rust compiler remember this region is initialized on subsequent calls to .contents() later in the program?
Thought experiment
In some cases, the buffer is the output of a GPU kernel and I want to read the results. All the writes occurred in code outside of Rust's control, and when I call .contents(), the region of memory contains the correct uint32_t values. The following thought experiment should convey my concern.
Suppose I call C's malloc, which returns an allocated buffer of uninitialized data. Reading u32 values from this buffer (with pointers properly aligned and in bounds) should fall squarely into undefined behavior.
However, suppose I instead call calloc, which zeros the buffer before returning it. If you don't like calloc, then suppose I have an FFI function that calls malloc, explicitly writes 0 uint32_t types in C, then returns this buffer to Rust. This buffer is initialized with valid u32 bit patterns.
From Rust's perspective, does malloc return "uninitialized" data while calloc returns initialized data?
If the cases are different, how would the Rust compiler know the difference between the two with respect to soundness?
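For concreteness, a sketch of the two declarations as Rust sees them; both return the same type, so nothing in the signatures records whether the memory is initialized:

use std::ffi::c_void;

extern "C" {
    // Hands back uninitialized memory.
    fn malloc(size: usize) -> *mut c_void;
    // Hands back zeroed memory, i.e. valid u32 bit patterns.
    fn calloc(nmemb: usize, size: usize) -> *mut c_void;
}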
There are multiple parameters to consider when you have an area of memory:
The size of it is the most obvious.
Its alignment is still somewhat obvious.
Whether or not it's initialized -- and notably, for types like bool, whether it's initialized with valid values, since not all bit patterns are valid.
Whether it's concurrently read/written.
Focusing on the trickier aspects, the recommendation is:
If the memory is potentially uninitialized, use MaybeUninit.
If the memory is potentially concurrently read/written, use a synchronization method -- be it a Mutex or AtomicXXX or ....
And that's it. Doing so will always be sound, no need to look for "excuses" or "exceptions".
Hence, in your case:
let num_bytes = num_u32 * std::mem::size_of::<u32>();
assert!(num_bytes <= isize::MAX as usize);

let buffer = device.new_buffer(num_bytes, MTLResourceOptions::StorageModeShared);
let data = buffer.contents() as *mut MaybeUninit<u32>;

// Safety:
// - `data` is valid for reads and writes.
// - `data` points to `num_u32` elements.
// - Access to `data` is exclusive for the duration.
// - `num_u32 * size_of::<u32>() <= isize::MAX`.
let as_slice = unsafe { slice::from_raw_parts_mut(data, num_u32) };

for i in as_slice {
    i.write(42); // Yes you can write `*i = MaybeUninit::new(42);` too,
                 // but why would you?
}
// OR with nightly:
as_slice.write_slice(some_slice_of_u32s);
This is very similar to the post on the users forum mentioned in the comment on your question.
The answers there aren't the most organized, but it seems like there are four main issues with uninitialized memory:
Rust assumes it is initialized
Rust assumes the memory is a valid bit pattern for the type
The OS may overwrite it
Security vulnerabilities from reading freed memory
For #1, this seems not to be an issue: if another version of the FFI function returned initialized memory instead of uninitialized memory, it would look identical to Rust.
I think most people understand #2, and that's not an issue for u32.
#3 could be a problem, but since this is for a specific OS you may be able to ignore it if macOS guarantees it does not do this.
#4 may or may not be undefined behavior, but it is highly undesirable. This is why you should treat the memory as uninitialized even if Rust thinks it's a list of valid u32s. You don't want Rust to think it's valid. Therefore, you should use MaybeUninit even for u32.
MaybeUninit
It's correct to cast the pointer to a slice of MaybeUninit. Your example isn't written correctly, though: assume_init returns T, and you can't assign that to an element of a [MaybeUninit<T>]. Fixed:
let as_slice = unsafe { slice::from_raw_parts_mut(data as *mut MaybeUninit<T>, num_t) };
for i in as_slice {
    i.write(T::new());
}
Then, turning that slice of MaybeUninit into a slice of T:
let init_slice = unsafe { &mut *(as_slice as *mut [MaybeUninit<T>] as *mut [T]) };
Another issue is that a &mut may not be correct to have here at all, since you say the memory is shared between the GPU and CPU. Rust depends on your Rust code being the only thing that can access data behind a &mut, so you need to ensure any &mut are gone while the GPU accesses the memory. If you want to interleave Rust access and GPU access, you need to synchronize them somehow, and only hold a *mut while the GPU has access (or reacquire the pointer from FFI).
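A sketch of that discipline, with the kernel launch as a hypothetical placeholder (the real synchronization depends on your Metal command-buffer setup):

use std::mem::MaybeUninit;
use std::slice;

// Sketch: `data` comes from `buffer.contents()` and holds `num_u32` elements.
unsafe fn fill_then_hand_to_gpu(data: *mut u32, num_u32: usize) {
    {
        // CPU phase: the &mut slice lives only inside this scope.
        let as_slice =
            slice::from_raw_parts_mut(data as *mut MaybeUninit<u32>, num_u32);
        for i in as_slice {
            i.write(42);
        }
    } // The &mut is gone here; only the raw pointer remains.

    // GPU phase: no Rust reference to the buffer may exist while the GPU
    // accesses it, and you must wait for the GPU to finish before creating
    // a new reference.
    // run_gpu_kernel_and_wait(data, num_u32); // hypothetical
}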
Notes
The code is mainly taken from Initializing an array element-by-element in the MaybeUninit doc, plus the very useful Alternatives section from transmute. The conversion from &mut [MaybeUninit<T>] to &mut [T] is how slice_assume_init_mut is written as well. You don't need to transmute like in the other examples since it is behind a pointer. Another similar example is in the nomicon: Unchecked Uninitialized Memory. That one accesses the elements by index, but it seems like doing that, using * on each &mut MaybeUninit<T>, and calling write are all valid. I used write since it's shortest and is easy to understand. The nomicon also says that using ptr methods like write is also valid, which should be equivalent to using MaybeUninit::write.
There are some nightly [MaybeUninit] methods that will be helpful in the future, like slice_assume_init_mut.

Are read_volatile and write_volatile atomic for usize?

I want to use read_volatile and write_volatile for IPC using shared memory. Is it guaranteed that writing of an unsigned integer of usize type will be atomic?
At the time of this writing, Rust does not have a proper memory model; instead it uses the one imposed by LLVM, which is basically that of C++, which in turn is inherited from C. So the best reference for what is guaranteed when doing memory tricks is the C one.
In C, volatile should not be used for synchronization; its intended use is for memory-mapped I/O and maybe for single-threaded signal handlers. See for example this Linux-kernel specific guideline. Or this other description of volatile:
This makes volatile objects suitable for communication with a signal handler, but not with another thread of execution.
If you want concurrent access to a value you should use atomic operations. They have the volatile guarantee plus additional ones: they are guaranteed to be atomic even in the presence of concurrent access, and moreover they allow you to set the ordering mode.
For your particular case you should use AtomicUsize. Note that the availability of that type is conditioned on your architecture having the necessary support, but that is exactly what you want.
Note that an AtomicUsize has the same memory layout as a plain usize, so if you have a usize embedded in a shared struct you can access it atomically with a pointer cast. I think this code is sound:
use std::sync::atomic::{AtomicUsize, Ordering};

struct SharedData {
    // ...
    x: usize,
}

fn test(data: *mut SharedData) {
    let x = unsafe { &*(&(*data).x as *const usize as *const AtomicUsize) };
    let _ = x.load(Ordering::Relaxed);
}
Although you would be better off just declaring that x as an AtomicUsize directly (see the sketch below).
Also note that reading or writing that value using any non-atomic operation (even just reading it out of curiosity, even using volatile access) invokes Undefined Behavior.
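A sketch of the preferred layout, assuming you control the struct definition:

use std::sync::atomic::{AtomicUsize, Ordering};

struct SharedData {
    // ...
    x: AtomicUsize, // declared atomic up front, so no pointer cast is needed
}

fn test(data: &SharedData) {
    // Atomic operations only need a shared reference.
    let _ = data.x.load(Ordering::Relaxed);
    data.x.store(42, Ordering::Relaxed);
}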

Why does a boolean need to be atomic?

In rust, there is such a thing as an AtomicBool. It is defined as:
A boolean type which can be safely shared between threads.
I understand that if you're using a boolean to implement a thread lock, to be used from multiple threads to control access to a resource, doing something like:
// Acquire the lock
if thread_lock == false:
    thread_lock = true
    ...

// Release the lock
thread_lock = false
Is definitely not thread safe. Both threads can read the thread_lock variable at the same time, see that it's unlocked (false), set it to true, and both think they have exclusive access to the thread.
With a proper thread lock, you need a boolean where, when you try to set it, one of two things will happen:
Trying to acquire a lock can fail if another thread already has a lock
Trying to acquire a lock will block until no other threads have a lock
I don't know if Rust has a concept like this, but I know Python's threading.Lock does exactly that.
As far as I can tell, this is NOT the scenario that an AtomicBool addresses. An AtomicBool has a load() method and a store() method. Neither returns a Result<bool> type (implying the operation can't fail), and as far as I can tell, neither does any kind of blocking.
What exactly does an AtomicBool protect us from? Why can we not use a regular bool from different threads (other than the fact that the compiler won't let us)?
The only thing I can think of is that when one thread is writing the bits into memory, another might try to read those bits at the same time. A bool is 8 bits. If 4 of the 8 bits were written when the other thread tries to read the data, the data read will be 4 bits of the old value, and 4 bits of the new value. Is this the problem being addressed? Can this happen? It doesn't seem like even in that scenario, a bool would need to be atomic, since of the 8 bits, only one bit matters, which will either be a 0 or a 1.
What exactly does an AtomicBool protect us from? Why can we not use a regular bool from different threads (other than the fact that the compiler won't let us)?
Anything that might go wrong, whether you can think of it or not. I hate to follow this up with something I can think of, because it doesn't matter. The rules say it's not guaranteed to work, and that should end it. Thinking that it can't fail unless you can think of a way for it to fail is just wrong.
But here's one way:
// Release the lock
thread_lock = false
Say this particular CPU doesn't have a particularly good way to set a boolean to false without using a register but does have a good single operation that negates a boolean and tests if it's zero without using a register. On this CPU, in conditions of register pressure, this might get optimized to:
Negate thread_lock and test if it's zero.
If the copy of thread_lock was false, negate thread_lock again.
What happens if in-betweens steps 1 and 2 another thread observes thread_lock to be true even though it was false going into this operation and will be false when it's done?
The thread lock in Rust is Mutex. It is typically used to provide multi-threaded mutable access to a value (which is usually why you want to lock between threads), but you can also lock on an empty tuple, Mutex<()>, to lock on nothing. I can't think of a good reason to lock threads without locking on a particular value, though. For example, if you want to write to a log file from multiple threads, you might want a Mutex<fs::File>, like this:
use std::fs;
use std::io::Write;
use std::sync::{Arc, Mutex};
use std::thread;

let file = Arc::new(Mutex::new(fs::File::create("write.log")?));
for _ in 0..10 {
    let file = Arc::clone(&file);
    thread::spawn(move || {
        // do other stuff
        let mut guard = file.lock().unwrap();
        guard.write_all(b"stuff").unwrap();
        drop(guard);
        // do other stuff
    });
}
For atomic values, usually the most important primitives are not load and store but compare_exchange and friends. Atomics can be thought of as "lightweight" mutexes that only contain primitive data, where you perform the whole operation you want in a single call instead of acquiring and releasing a lock in two separate operations. Furthermore, mutexes can actually be implemented on top of an AtomicBool if the operating system doesn't provide one, like in the following code:
use std::sync::atomic::{AtomicBool, Ordering};

struct MyMutex(AtomicBool);

impl MyMutex {
    fn try_lock(&self) -> Result<(), ()> {
        // Atomically replace `false` with `true`; this succeeds only if
        // the value was `false` (i.e. the lock was free).
        match self.0.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst) {
            Ok(_) => Ok(()),   // we have acquired the lock
            Err(_) => Err(()), // someone else is holding the lock
        }
    }

    fn release(&self) {
        self.0.store(false, Ordering::Release);
    }
}
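For illustration, a try-lock round trip could then look like this (a sketch; a real mutex would also want a blocking path and an RAII guard):

use std::sync::atomic::AtomicBool;

let m = MyMutex(AtomicBool::new(false));
if m.try_lock().is_ok() {
    // ... critical section: we are the only holder here ...
    m.release();
}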
You can share any value that is Sync from multiple threads, provided that you can deal with the lifetime properly. For example, the following compiles without any unsafe code:
use std::thread;

fn process(b: &'static bool) {
    if *b { do_something() }
    else { do_something_else() }
}

fn main() {
    let boxed = Box::new(true);
    let refed: &'static bool = Box::leak(boxed);
    for _ in 0..10 {
        thread::spawn(move || process(refed));
    }
}
You can also do this with non-'static references given sufficient tools, such as wrapping them in Arcs, etc.
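For example, a sketch of the same sharing done with Arc instead of leaking:

use std::sync::Arc;
use std::thread;

let shared = Arc::new(true);
for _ in 0..10 {
    let shared = Arc::clone(&shared);
    thread::spawn(move || {
        // Each thread owns its own clone of the Arc, so no 'static
        // reference (and no leaking) is needed.
        if *shared { /* ... */ }
    });
}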
A bool is 8 bits. If 4 of the 8 bits were written when the other thread tries to read the data, the data read will be 4 bits of the old value, and 4 bits of the new value.
This cannot happen in Rust. Rust enforces ownership and borrowing very strictly. You can't even have two mutable references to the same value on the same thread, much less on different threads.
Having multiple mutable references to the same value is always Undefined Behaviour in Rust; there are no exceptions to this strict rule. By declaring a reference mutable, you allow the compiler to perform various optimizations on your code under the assumption that it is the unique place that can read/write the value; not other threads, not other functions, not even other variables (if a: &mut bool and let b = &mut *a, you can't use a before b is dropped). You will have much worse problems than writing different bits concurrently if you have multiple mutable pointers.
(By the way, "writing bits" to the same value is not the correct way to think about it; it's much more complicated than "writing bits" in modern CPUs, even without Rust's borrowing rules.)
TL;DR: If you don't have the unsafe keyword anywhere in your code, you don't need to worry about data races. Rust is a memory-safe language where memory bugs are mostly caught at compile time.

How do I trigger the release of a Rust Mutex after a panic in Wasm so that future calls will be ok?

I encountered a deadlock while developing with Rust and WebAssembly.
Due to the use of some globally accessed variables, I chose lazy_static and a Mutex (using thread_local callbacks would have caused nesting problems). I have declared a lot of Rust functions that are used by JavaScript through #[wasm_bindgen]. They read and write the lazy_static variables.
After one of the functions panics, the mutex lock cannot be released, causing other functions to panic if they need to use the same mutex.
I know that the panic problem is unexpected and needs to be fixed, but these functions are relatively independent of each other. Although the reading and writing of the lazy_static variables intersect, a certain bug may not necessarily affect other parts.
How do I trigger the release of the Mutex after a panic in Wasm to allow other calls to be ok? Is there any better practice for this kind of problem?
Rust:
use lazy_static::lazy_static;
use std::sync::Mutex;
use std::sync::PoisonError;
use wasm_bindgen::prelude::*;

pub struct CurrentStatus {
    pub index: i32,
}

impl CurrentStatus {
    fn new() -> Self {
        CurrentStatus { index: 1 }
    }

    fn get_index(&mut self) -> i32 {
        self.index += 1;
        self.index.clone()
    }

    fn add_index(&mut self) {
        self.index += 2;
    }
}

lazy_static! {
    pub static ref FOO: Mutex<CurrentStatus> = Mutex::new(CurrentStatus::new());
}

unsafe impl Send for CurrentStatus {}

#[wasm_bindgen]
pub fn add_index() {
    FOO.lock()
        .unwrap_or_else(PoisonError::into_inner)
        .add_index();
}

#[wasm_bindgen]
pub fn get_index() -> i32 {
    let mut foo = FOO.lock().unwrap_or_else(PoisonError::into_inner);
    if foo.get_index() == 6 {
        panic!();
    }
    return foo.get_index();
}
JavaScript:
const js = import("../pkg/hello_wasm.js");
js.then(js => {
    window.js = js;
    console.log(js.get_index());
    js.add_index();
    console.log(js.get_index());
    js.add_index();
    console.log(js.get_index());
    js.add_index();
    console.log(js.get_index());
    js.add_index();
    console.log(js.get_index());
    js.add_index();
});
After the panic, I can not call the function at all and it is as if the Wasm is dead.
Before answering this question I should probably mention that panic handling shouldn't be used as a general error mechanism. Panics should be used for unrecoverable errors.
Citing documentation.
This allows a program to terminate immediately and provide feedback to the caller of the program. panic! should be used when a program reaches an unrecoverable state.
Panics in Rust are actually much gentler than they may seem at first to people coming from a C++ background (which I assume is the case for some people writing in the comments). An uncaught Rust panic by default terminates the thread, while an uncaught C++ exception terminates the whole process.
Citing documentation
Fatal logic errors in Rust cause thread panic, during which a thread will unwind the stack, running destructors and freeing owned resources. While not meant as a 'try/catch' mechanism, panics in Rust can nonetheless be caught (unless compiling with panic=abort) with catch_unwind and recovered from, or alternatively be resumed with resume_unwind. If the panic is not caught the thread will exit, but the panic may optionally be detected from a different thread with join. If the main thread panics without the panic being caught, the application will exit with a non-zero exit code.
It is fine to catch_unwind and recover a thread from panic, but you should know that catch_unwind isn't guaranteed to catch all panics.
Note that this function may not catch all panics in Rust. A panic in Rust is not always implemented via unwinding, but can be implemented by aborting the process as well. This function only catches unwinding panics, not those that abort the process.
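For illustration, a minimal catch_unwind round trip (it only catches unwinding panics, per the quote above):

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        panic!("boom");
    });
    // The panic was caught; the current thread keeps running.
    assert!(result.is_err());
}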
So, we understood that recovering from panic is fine. The question is what to do when lock is poisoned.
Citing documentation
The mutexes in this module implement a strategy called "poisoning" where a mutex is considered poisoned whenever a thread panics while holding the mutex. Once a mutex is poisoned, all other threads are unable to access the data by default as it is likely tainted (some invariant is not being upheld).
There is a valid reason for poisoning, because invariants of your data may not be held. Consider panic! in the middle of some function. This is just an additional level of security, that you can bypass.
A poisoned mutex, however, does not prevent all access to the underlying data. The PoisonError type has an into_inner method which will return the guard that would have otherwise been returned on a successful lock. This allows access to the data, despite the lock being poisoned.
use std::sync::{Mutex, PoisonError};

fn main() {
    let mutex = Mutex::new(1);
    // We are prepared to face bugs if invariants are wrong
    println!("{}", mutex.lock().unwrap_or_else(PoisonError::into_inner));
}
Playground link
Of course it's always better to fix the panic than to do this.
I had a problem where I was running several integration tests in parallel, and they had a mutex (that was accessed globally), but if one test failed, all subsequent tests would fail as well. This was a problem when I wanted to debug where a test failed, simply because the output would have a lot of other failed tests (due to poison error).
The solution was to simply use the parking_lot crate (an alternative implementation of the standard synchronization primitives), which seems to clear the mutex if the thread holding it crashes.
Sadly I didn't find anything in the documentation that explains how it works, or whether it even guarantees this behavior in future versions. But the current version works well for me, and if it works for you, it's easier than the accepted answer: all you have to do is replace std::sync::Mutex with parking_lot::Mutex and you are ready to go (and also remove the .unwrap() after lock(), since it doesn't return a Result).
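A minimal sketch of that swap, assuming parking_lot has been added as a dependency:

use parking_lot::Mutex; // instead of std::sync::Mutex

fn main() {
    let foo = Mutex::new(1);
    // No unwrap(): parking_lot's lock() returns the guard directly, and a
    // panic while the lock is held does not poison it for later callers.
    *foo.lock() += 2;
    println!("{}", *foo.lock());
}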

Why does Rust have mutexes and other sychronization primitives, if sharing of mutable state between tasks is not allowed?

My understanding is that it's not possible to share mutable state between tasks in Rust, so why does Rust have things like mutexes? What's their purpose?
"Sharing mutable data between tasks is not allowed" is an oversimplification. No offense meant, it's also used in much introductory material on Rust, and for good reasons. But the truth is, Rust just wants to get rid of data races; not sharing anything is the preferred approach but not the only. Rust also wants to be a system programming language in the same sense as C and C++ are, so it won't nilly-willy completely remove some capability or performance optimization. However, in general shared mutable memory is not safe (data races etc.) so if you want it, you will have to acknowledge the responsibility by wrapping it in unsafe blocks.
Luckily, some patterns of using shared mutable memory are safe (e.g. using proper locking discipline). When these patterns are recognized and considered important enough, someone writes some unsafe code that they convince themselves (or perhaps even "prove") exposes a safe interface. In other words: Code using the interface can never violate the various safety requirements of Rust. For example, while Mutex allows you to access mutable memory from different tasks at different times, it never permits aliasing among tasks (i.e. access at the same time), so data races are prevented.
Rust defines a Mutex as
A mutual exclusion primitive useful for protecting shared data
A clear example of Mutex use can be found in the Mutex documentation. Note the use of the mut keyword to designate mutable variables:
use std::sync::{Arc, Mutex};
use std::sync::mpsc::channel;
use std::thread;

const N: usize = 10;

// Spawn a few threads to increment a shared variable (non-atomically), and
// let the main thread know once all increments are done.
//
// Here we're using an Arc to share memory among threads, and the data inside
// the Arc is protected with a mutex.
let data = Arc::new(Mutex::new(0));

let (tx, rx) = channel();
for _ in 0..N {
    let (data, tx) = (data.clone(), tx.clone());
    thread::spawn(move || {
        // The shared state can only be accessed once the lock is held.
        // Our non-atomic increment is safe because we're the only thread
        // which can access the shared state when the lock is held.
        //
        // We unwrap() the return value to assert that we are not expecting
        // threads to ever fail while holding the lock.
        let mut data = data.lock().unwrap();
        *data += 1;
        if *data == N {
            tx.send(()).unwrap();
        }
        // the lock is unlocked here when `data` goes out of scope.
    });
}

rx.recv().unwrap();
Rust also provides an unsafe keyword. Unsafe operations are those that can potentially violate the memory-safety guarantees of Rust's static semantics. So the safety guarantees are by no means absolute; they hold only as long as the unsafe code behind safe interfaces upholds its obligations.
