A type can be Send if it can be moved from one thread to another safely (according to the Rust book). I understand the non-atomic increment/decrement that Rc does but I don't understand how that would make it unsafe in the following example:
use std::rc::Rc;
use std::thread;
fn main() {
// x1 initialized - count = 1
let x1 = Rc::new(5);
// x1 cloned - count = 2. No other threads exist to cause issues due to non-atomic increment
let x2 = Rc::clone(&x1);
// x2 moved from thread-main to thread-other. This move occurs before the thread actually runs.
// x2 in main cannot be used after this point and drop won't be called on it.
// This is a move so no increment/decrement takes place.
// main can't move forward unless the move is complete so drop/decrement won't happen
thread::spawn(move || {
// Technically "moving" x2 from thread-main to thread-other did not cause problems
// so why is `Rc` not `Send`?
println!("{:?}", x2);
});
}
As far as I understand, Send only talks about the move being possible. Since no move causes an increment or decrement, why is Rc not Send?
You are asking why the compiler does not prove that your code is safe. In general, this is impossible: by Rice's theorem, no compiler can decide every non-trivial semantic property of programs written in a Turing-complete language. It may be provable for your trivial example, but outside of that it would be horrendously difficult.
So the compiler has to simplify and be pessimistic.
Send is a trait, operating at the type-system level, with no knowledge about the run-time environment. And Rc, having to assume the worst, does not implement it. Yes, it would be safe, if you pass over all the references together. But there's no way of proving that you are doing that in the general case.
If you just have one reference, you can use Rc::try_unwrap to take the value out of the Rc and move that instead. Otherwise, you can muck with unsafe, which is the escape hatch for dealing with things that the compiler can't prove are safe. But then it's up to you to ensure that the code is safe.
Also, as a note, this code would be unsafe if Rc were Send. Variables are dropped when they go out of scope. For x1, that would be at the end of main, after the thread spawns, which means two threads could touch the non-atomic reference count at the same time. But there's a simple fix for that: drop x1 beforehand.
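As a rough sketch of the two safe routes mentioned above: the first function moves the value out of a uniquely owned Rc with Rc::try_unwrap before sending it, and the second uses Arc, whose count is atomic. The function names send_unique and share_with_arc are made up for illustration and are not from the original posts.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Moves the value out of a uniquely owned `Rc` and sends it to a new thread.
/// Returns `None` if other `Rc` handles to the same value still exist.
fn send_unique(rc: Rc<i32>) -> Option<i32> {
    // try_unwrap succeeds only when the strong count is exactly 1.
    let value = Rc::try_unwrap(rc).ok()?;
    // A plain i32 is Send, so moving it into the closure is fine.
    thread::spawn(move || value + 1).join().ok()
}

/// Shares one value between two threads; `Arc` updates its count atomically.
fn share_with_arc(value: i32) -> i32 {
    let a = Arc::new(value);
    let b = Arc::clone(&a);
    let doubled = thread::spawn(move || *b * 2).join().unwrap();
    doubled + *a
}

fn main() {
    println!("{:?}", send_unique(Rc::new(5)));

    let shared = Rc::new(5);
    let _other = Rc::clone(&shared);
    // A second handle exists, so the value cannot be taken out.
    println!("{:?}", send_unique(shared));

    println!("{}", share_with_arc(5));
}
```

The point of the first function is that the thread never sees an Rc at all, so the non-atomic count cannot be raced on.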
See the example below. It seems like this might be UB by definition, but it remains unclear to me.
use std::{fs, thread};
fn main() {
let mut threads = Vec::new();
for _ in 0..100 {
let thread = thread::spawn(move || {
fs::read_to_string("./config/init.json")
.unwrap()
.trim()
.to_string()
});
threads.push(thread);
}
for handler in threads {
handler.join().unwrap();
}
}
On most operating systems only individual read operations are guaranteed to be atomic. read_to_string may perform multiple distinct reads, which means that it's not guaranteed to be atomic between multiple threads/processes. If another process is modifying this file concurrently, read_to_string could return a mixture of data from before and after the modification. In other words, each read_to_string operation is not guaranteed to return an identical result, and some may even fail while others succeed if another process deletes the file while the program is running.
However, none of this behavior is classified as "undefined." Absent hardware problems, you are guaranteed to get back a std::io::Result<String> in a valid state, which is something you can reason about. Once UB is invoked, you can no longer reason about the state of the program.
By way of analogy, consider a choose your own adventure book. At the end of each segment you'll have some instructions like "If you choose to go into the cave, go to page 53. If you choose to take the path by the river, go to page 20." Then you turn to the appropriate page and keep reading. This is a bit like Result -- if you have an Ok you do one thing, but if you have an Err you do another thing.
Once undefined behavior is invoked, this kind of choice no longer makes sense because the program is in a state where the rules of the language no longer apply. The program could do anything, including deleting random files from your hard drive. In the book analogy, the book caught fire. Trying to follow the rules of the book no longer makes any sense, and you hope the book doesn't burn your house down with it.
In Rust you're not supposed to be able to invoke UB without using the unsafe keyword, so if you don't see that keyword anywhere then UB isn't on the table.
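To make the "you always get a valid Result back" point concrete, here is a small sketch that collects each thread's std::io::Result<String> instead of unwrapping it, so both branches can be handled. The helper name read_in_threads is hypothetical.

```rust
use std::fs;
use std::io;
use std::thread;

/// Reads the same file from several threads and returns every outcome.
/// Each element is either Ok(contents) or Err(io error); nothing is undefined.
fn read_in_threads(path: &str, readers: usize) -> Vec<io::Result<String>> {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let path = path.to_owned();
            thread::spawn(move || fs::read_to_string(&path).map(|s| s.trim().to_string()))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("reader thread panicked"))
        .collect()
}

fn main() {
    for (i, result) in read_in_threads("./config/init.json", 4).into_iter().enumerate() {
        match result {
            Ok(text) => println!("reader {i}: {} bytes", text.len()),
            Err(err) => println!("reader {i}: failed with {err}"),
        }
    }
}
```

Different readers may see different results if the file changes underneath them, but each result is still a well-formed value the program can branch on.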
In Rust, there is such a thing as an AtomicBool. It is defined as:
A boolean type which can be safely shared between threads.
I understand that if you're using a boolean to implement a thread lock, to be used from multiple threads to control access to a resource, doing something like:
// Acquire the lock
if thread_lock == false:
thread_lock = true
...
// Release the lock
thread_lock = false
is definitely not thread safe. Both threads can read the thread_lock variable at the same time, see that it's unlocked (false), set it to true, and both think they have exclusive access to the resource.
With a proper thread lock, you need a boolean where, when you try to set it, one of two things will happen:
Trying to acquire a lock can fail if another thread already has a lock
Trying to acquire a lock will block until no other threads have a lock
I don't know if Rust has a concept like this, but I know Python's threading.Lock does exactly that.
As far as I can tell, this is NOT the scenario that an AtomicBool addresses. An AtomicBool has a load() method, and a store() method. Neither return a Result<bool> type (implying the operation can't fail), and as far as I can tell, neither do any kind of blocking.
What exactly does an AtomicBool protect us from? Why can we not use a regular bool from different threads (other than the fact that the compiler won't let us)?
The only thing I can think of is that when one thread is writing the bits into memory, another might try to read those bits at the same time. A bool is 8 bits. If 4 of the 8 bits were written when the other thread tries to read the data, the data read will be 4 bits of the old value, and 4 bits of the new value. Is this the problem being addressed? Can this happen? It doesn't seem like even in that scenario, a bool would need to be atomic, since of the 8 bits, only one bit matters, which will either be a 0 or a 1.
What exactly does an AtomicBool protect us from? Why can we not use a regular bool from different threads (other than the fact that the compiler won't let us)?
Anything that might go wrong, whether you can think of it or not. I hate to follow this up with something I can think of, because it doesn't matter. The rules say it's not guaranteed to work and that should end it. Thinking you have to think of a way it can fail or it can't fail is just wrong.
But here's one way:
// Release the lock
thread_lock = false
Say this particular CPU doesn't have a particularly good way to set a boolean to false without using a register but does have a good single operation that negates a boolean and tests if it's zero without using a register. On this CPU, in conditions of register pressure, this might get optimized to:
1. Negate thread_lock and test if it's zero.
2. If the copy of thread_lock was false, negate thread_lock again.
What happens if, between steps 1 and 2, another thread observes thread_lock to be true even though it was false going into this operation and will be false when it's done?
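For contrast, here is a minimal sketch of what load and store on an AtomicBool are typically used for: a stop flag that one thread sets and another polls. The function name run_until_stopped and the tick counter are assumptions made for this example only.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs a worker until the main thread raises a stop flag.
/// Returns how many loop iterations the worker completed.
fn run_until_stopped() -> u64 {
    let stop = Arc::new(AtomicBool::new(false));
    let ticks = Arc::new(AtomicU64::new(0));

    let worker = {
        let stop = Arc::clone(&stop);
        let ticks = Arc::clone(&ticks);
        thread::spawn(move || {
            // Every load is guaranteed to observe either false or true,
            // never a value the compiler or CPU invented in between.
            while !stop.load(Ordering::Acquire) {
                ticks.fetch_add(1, Ordering::Relaxed);
                thread::yield_now();
            }
        })
    };

    // Wait until the worker has made some progress, then ask it to stop.
    while ticks.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    stop.store(true, Ordering::Release);
    worker.join().unwrap();
    ticks.load(Ordering::Relaxed)
}

fn main() {
    println!("worker ran {} iterations", run_until_stopped());
}
```

Nothing here blocks or fails. The atomic type only promises that concurrent reads and writes of the flag are well-defined, which a plain bool does not.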
The thread lock in Rust is Mutex. It is typically used to provide multi-thread mutable access to a value (which is usually the reason why you want to lock between threads), but you can also use it to lock an empty tuple Mutex<()> to lock on nothing. I can't think of good reasons that you need to lock threads without needing to lock on particular values, though; for example if you want to write to a log file from multiple threads, you might want to have a Mutex<fs::File> like this:
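// Assumes: use std::{fs, io::Write, sync::{Arc, Mutex}, thread};
// and an enclosing function that returns io::Result so that `?` works.
let file = Arc::new(Mutex::new(fs::File::create("write.log")?));
for _ in 0..10 {
    let file = Arc::clone(&file);
    thread::spawn(move || {
        // do other stuff
        let mut guard = file.lock().unwrap();
        guard.write_all(b"stuff").unwrap();
        drop(guard); // release the lock before doing unrelated work
        // do other stuff
    });
}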
For atomic values, usually the most important primitives are not load and store but compare_exchange and similar read-modify-write operations. Atomics can be thought of as "lightweight" mutexes that only contain primitive data, but you perform the whole operation in a single call instead of acquiring and releasing in two separate operations. Furthermore, a mutex can actually be implemented on top of an AtomicBool if the operating system doesn't provide one, like in the following code:
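use std::sync::atomic::{AtomicBool, Ordering};
struct MyMutex(AtomicBool);
impl MyMutex {
    fn try_lock(&self) -> Result<(), ()> {
        // Atomically: if the flag is false, set it to true.
        // compare_exchange returns Ok(previous value) on success and Err(actual value) otherwise.
        let result = self.0.compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst);
        if result.is_ok() {
            Ok(()) // we have acquired the lock
        } else {
            Err(()) // someone else is holding the lock
        }
    }
    fn release(&self) {
        self.0.store(false, Ordering::Release);
    }
}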
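A usage sketch of the same idea, under the assumption that spinning is acceptable: several threads take an AtomicBool-based lock before doing a deliberately split read-then-write on a shared counter. The names SpinLock and count_with_lock are invented for this example. In real code std::sync::Mutex is the better choice.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

struct SpinLock(AtomicBool);

impl SpinLock {
    fn new() -> Self {
        SpinLock(AtomicBool::new(false))
    }
    fn try_lock(&self) -> bool {
        self.0
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }
    fn lock(&self) {
        // Busy-wait until the flag flips from false to true under our control.
        while !self.try_lock() {
            std::hint::spin_loop();
        }
    }
    fn unlock(&self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Increments a counter `threads * iters` times, one critical section at a time.
fn count_with_lock(threads: usize, iters: usize) -> usize {
    let lock = Arc::new(SpinLock::new());
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    lock.lock();
                    // Separate load and store: without the lock, updates could be lost.
                    let v = counter.load(Ordering::Relaxed);
                    counter.store(v + 1, Ordering::Relaxed);
                    lock.unlock();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_with_lock(4, 1000));
}
```

The Acquire on locking and the Release on unlocking are what make one thread's store visible to the next thread that takes the lock.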
You can share any value that is Sync from multiple threads, provided that you can deal with the lifetime properly. For example, the following compiles without any unsafe code:
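use std::thread;
// do_something and do_something_else are placeholders for your own functions.
fn process(b: &'static bool) {
    if *b { do_something() }
    else { do_something_else() }
}
fn main() {
    let boxed = Box::new(true);
    // Leaking the box yields a reference that is valid for the rest of the program.
    let refed: &'static bool = Box::leak(boxed);
    for _ in 0..10 {
        thread::spawn(move || process(refed));
    }
}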
You can also share values that are not 'static with suitable tools, such as wrapping them in an Arc.
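A minimal sketch of the Arc route, assuming nothing beyond the standard library: a plain bool is shared read-only with several threads, with no unsafe and no atomics, because nobody mutates it. The function name count_true_votes is hypothetical.

```rust
use std::sync::Arc;
use std::thread;

/// Each thread reads the shared bool and reports 1 if it is true, 0 otherwise.
fn count_true_votes(flag: bool, voters: usize) -> usize {
    let shared = Arc::new(flag);
    let handles: Vec<_> = (0..voters)
        .map(|_| {
            // Every thread gets its own handle to the same immutable bool.
            let shared = Arc::clone(&shared);
            thread::spawn(move || if *shared { 1usize } else { 0usize })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", count_true_votes(true, 10));
}
```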
A bool is 8 bits. If 4 of the 8 bits were written when the other thread tries to read the data, the data read will be 4 bits of the old value, and 4 bits of the new value.
This cannot happen in Rust. Rust enforces ownership and borrowing very strictly. You can't even have two mutable references to the same value on the same thread, much less on different threads.
Having multiple live mutable references to the same value is always Undefined Behaviour in Rust; there are no exceptions to this strict rule. When a reference is declared mutable, the compiler is allowed to optimize your code on the assumption that this reference is the only way to read or write the value: not other threads, not other functions, not even other variables (given a: &mut bool and let b = &mut *a, you can't use a again until b is no longer used). You will have much worse problems than writing different bits concurrently if you have multiple mutable pointers.
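The reborrow rule in the parenthesis can be sketched in a few lines. The function name toggle_via_reborrow is made up for the example.

```rust
/// Flips the bool through a reborrow and returns the new value.
fn toggle_via_reborrow(a: &mut bool) -> bool {
    let b = &mut *a; // b temporarily takes over exclusive access from a
    *b = !*b;
    // Using `a` between the two lines above would not compile.
    // b is not used past this point, so a is usable again.
    *a
}

fn main() {
    let mut flag = false;
    println!("{}", toggle_via_reborrow(&mut flag));
}
```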
(By the way, "writing bits" to the same value is not the right way of thinking about it; what modern CPUs do is much more complicated than "writing bits", even without Rust's borrow checking rules.)
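TL;DR: If you don't have the unsafe keyword anywhere in your code, you don't need to worry about data races. Rust is a very memory-safe language where memory bugs are mostly checked at compile time. (Logic-level race conditions, such as two threads doing things in an unexpected order, are still possible.)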
I have a function of type
f: fn(x: SomeType, y: Arc<()>) -> ISupposeTheReturnTypeDoesNotMatter
when compiled (with or without optimization), will the y be optimized away?
The intention of the y is to limit the number of the running instances of f, if y is referenced too many times, the caller of f will not call f until the reference count of y becomes lower.
Edit: clarification on my intention
The intention is to keep the number of running http requests (represented by the above f) in control, the pseudo code looks like this:
let y = Arc::new(());
let all_jobs_to_be_done = vector_of_many_jobs;
loop {
while Arc::strong_count(&y) < some_predefined_limit {
// we have some free slots, fill them up with instances of f,
// f is passed with a clone of y,
// so that the running of f would increase the ref count,
// and the death of the worker thread would decrease the ref count
let work = all_jobs_to_be_done.pop();
let ticket = y.clone();
spawn_work(move || {f(work, ticket)});
}
sleep_for_a_few_seconds();
}
The reason for this seemingly hacky work around is that I cannot find a library that meets my needs (consume a changing work queue with bounded amount of async (tokio) workers, and requeue the work if the job fails)
Will Rust optimize away unused function arguments?
Yes, LLVM (the backend for rustc) is able to optimize away unused variables when removing them does not change program behavior, although nothing guarantees it will do it. rustc has some passes before LLVM too, but the same applies.
Knowing what exactly counts as program behavior is tricky business. However, multi-threaded primitives used in refcounting mechanics are usually the sort of thing that cannot be optimized away for good reason. Refer to the Rust reference for more information (other resources that might help are the nomicon, the different GitHub repos, the Rust fora, the C++11 memory model which Rust uses, etc.).
On the other hand, if you are asking about what are the semantics of the language when it encounters unused parameters, then no, Rust does not ignore them (and hopefully never will!).
will the y be optimized away?
No, it is a type with side effects. For instance, dropping it requires running non-trivial code.
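A small sketch that makes the side effect visible: the count observed through Arc::strong_count is higher while a worker holds its ticket and drops back once the worker has finished. The channels only pin down the order of events; f here is a stand-in with the same shape as the function in the question, and count_while_running is a hypothetical name.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Stand-in for the real job; the ticket is dropped when this returns.
fn f(_work: u32, _ticket: Arc<()>) {}

/// Returns the strong count seen while the worker holds a ticket, and after it is done.
fn count_while_running() -> (usize, usize) {
    let y = Arc::new(());
    let ticket = Arc::clone(&y);
    let (started_tx, started_rx) = mpsc::channel();
    let (finish_tx, finish_rx) = mpsc::channel::<()>();

    let worker = thread::spawn(move || {
        started_tx.send(()).unwrap();
        // Hold the ticket until the main thread has looked at the count.
        finish_rx.recv().unwrap();
        f(1, ticket);
    });

    started_rx.recv().unwrap();
    let during = Arc::strong_count(&y);
    finish_tx.send(()).unwrap();
    worker.join().unwrap();
    let after = Arc::strong_count(&y);
    (during, after)
}

fn main() {
    println!("{:?}", count_while_running());
}
```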
The intention of the y is to limit the number of the running instances of f
Such an arrangement does not limit how many threads are running f since Arc is not a mutex and, even if it were some kind of mutex, you could construct as many independent ones as you wanted.
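If the goal is a hard limit on concurrently running jobs, one common alternative is a counting semaphore. The following is only a sketch built from std's Mutex and Condvar with threads standing in for async tasks; the Semaphore type and run_jobs function are written for this example and are not a library API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// A tiny counting semaphore: acquire blocks while no permits are left.
struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), cv: Condvar::new() }
    }
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

/// Runs `jobs` workers with at most `limit` inside the guarded section at once.
/// Returns the highest concurrency that was actually observed.
fn run_jobs(jobs: usize, limit: usize) -> usize {
    let sem = Arc::new(Semaphore::new(limit));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let (sem, running, peak) =
                (Arc::clone(&sem), Arc::clone(&running), Arc::clone(&peak));
            thread::spawn(move || {
                sem.acquire();
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::yield_now(); // the actual job would run here
                running.fetch_sub(1, Ordering::SeqCst);
                sem.release();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}

fn main() {
    println!("peak concurrency: {}", run_jobs(16, 3));
}
```

Unlike polling a reference count and sleeping, a worker that finishes wakes the next waiter immediately.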
I'm trying to understand what exactly the Rust aliasing/memory model allows. In particular I'm interested in when accessing memory outside the range you have a reference to (which might be aliased by other code on the same or different threads) becomes undefined behaviour.
The following examples all access memory outside what is ordinarily allowed, but in ways that would be safe if the compiler produced the obvious assembly code. In addition, I see little conflict potential with compiler optimization, but they might still violate strict aliasing rules of Rust or LLVM thus constituting undefined behavior.
The operations are all properly aligned and thus cannot cross a cache-line or page boundary.
1. Read the aligned 32-bit word surrounding the data we want to access and discard the parts outside of what we're allowed to read.
Variants of this could be useful in SIMD code.
pub fn read(x: &u8) -> u8 {
let pb = x as *const u8;
let pw = ((pb as usize) & !3) as *const u32;
let w = unsafe { *pw }.to_le();
(w >> ((pb as usize) & 3) * 8) as u8
}
2. Same as 1, but reads the 32-bit word using an atomic_load intrinsic.
pub fn read_vol(x: &u8) -> u8 {
let pb = x as *const u8;
let pw = ((pb as usize) & !3) as *const AtomicU32;
let w = unsafe { (&*pw).load(Ordering::Relaxed) }.to_le();
(w >> ((pb as usize) & 3) * 8) as u8
}
3. Replace the aligned 32-bit word containing the value we care about using CAS. It overwrites the parts outside what we're allowed to access with what's already in there, so it only affects the parts we're allowed to access.
This could be useful to emulate small atomic types using bigger ones. I used AtomicU32 for simplicity, in practice AtomicUsize is the interesting one.
pub fn write(x: &mut u8, value:u8) {
let pb = x as *const u8;
let atom_w = unsafe { &*(((pb as usize) & !3) as *const AtomicU32) };
let mut old = atom_w.load(Ordering::Relaxed);
loop {
let shift = ((pb as usize) & 3) * 8;
let new = u32::from_le((old.to_le() & !(0xFF_u32 << shift)) | ((value as u32) << shift));
match atom_w.compare_exchange_weak(old, new, Ordering::SeqCst, Ordering::Relaxed) {
Ok(_) => break,
Err(x) => old = x,
}
}
}
This is a very interesting question.
There are actually several issues with these functions, making them unsound (i.e., not safe to expose) for various formal reasons.
At the same time, I am unable to actually construct a problematic interaction between these functions and compiler optimizations.
Out-of-bounds accesses
I'd say all of these functions are unsound because they can access unallocated memory. Each of them I can call with a &*Box::new(0u8) or &mut *Box::new(0u8), resulting in out-of-bounds accesses, i.e. accesses beyond what was allocated using malloc (or whatever allocator). Neither C nor LLVM permit such accesses. (I'm using the heap because I find it easier to think about allocations there, but the same applies to the stack where every stack variable is really its own independent allocation.)
Granted, the LLVM language reference doesn't actually define when a load has undefined behavior due to the access not being inside the object. However, we can get a hint in the documentation of getelementptr inbounds, which says
The in bounds addresses for an allocated object are all the addresses that point into the object, plus the address one byte past the end.
I am fairly certain that being in bounds is a necessary but not sufficient requirement for actually using an address with load/store.
Note that this is independent of what happens on the assembly level; LLVM will do optimizations based on a much higher-level memory model that argues in terms of allocated blocks (or "objects" as C calls them) and staying within the bounds of these blocks.
C (and Rust) are not assembly, and it is not possible to use assembly-based reasoning on them.
Most of the time it is possible to derive contradictions from assembly-based reasoning (see e.g. this bug in LLVM for a very subtle example: casting a pointer to an integer and back is not a NOP).
This time, however, the only examples I can come up with are fairly far-fetched: For example, with memory-mapped IO, even reads from a location could "mean" something to the underlying hardware, and there could be such a read-sensitive location sitting right next to the one that's passed into read.
But really I don't know much about this kind of embedded/driver development, so this may be entirely unrealistic.
(EDIT: I should add that I am not an LLVM expert. Probably the llvm-dev mailing list is a better place to determine if they are willing to commit to permitting such out-of-bounds accesses.)
Data races
There is another reason at least some of these functions are not sound: Concurrency. You clearly already saw this coming, judging from the use of concurrent accesses.
Both read and read_vol are definitely unsound under the concurrency semantics of C11. Imagine x is the first element of a [u8], and another thread is writing to the second element at the same time as we execute read/read_vol. Our read of the whole 32-bit word overlaps with the other thread's write. This is a classical "data race": Two threads accessing the same location at the same time, one access being a write, and one access not being atomic. Under C11, any data race is UB so we are out. LLVM is slightly more permissive so both read and read_vol are probably allowed, but right now Rust declares that it uses the C11 model.
Also note that "vol" is a bad name (assuming you meant this as short-hand for "volatile") -- in C, atomicity has nothing to do with volatile! It is literally impossible to write correct concurrent code when using volatile and not atomics. Unfortunately, Java's volatile is about atomicity, but that's a very different volatile than the one in C.
And finally, write also introduces a data race between an atomic read-modify-write and a non-atomic write in the other thread, so it is UB in C11 as well. And this time it is also UB in LLVM: Another thread could be reading from one of the extra locations that write affects, so calling write would introduce a data race between our writing and the other thread's reading. LLVM specifies that in this case, the read returns undef. So, calling write can make safe accesses to the same location in other threads return undef, and subsequently trigger UB.
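For comparison, the same masking trick is unproblematic when the whole 32-bit word is an AtomicU32 that every thread only ever accesses atomically, because then no out-of-bounds access and no mixed atomic/non-atomic access occurs. The sketch below assumes exactly that; the names write_byte, read_byte and demo_write are invented here, and the lane index counts from the least significant byte of the value rather than from a memory address.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Replaces one byte lane (0 = least significant) of an atomic word with a CAS loop.
fn write_byte(word: &AtomicU32, lane: usize, value: u8) {
    assert!(lane < 4);
    let shift = lane * 8;
    let mask = 0xFF_u32 << shift;
    let mut old = word.load(Ordering::Relaxed);
    loop {
        // Clear the target lane, keep the other three, then insert the new byte.
        let new = (old & !mask) | ((value as u32) << shift);
        match word.compare_exchange_weak(old, new, Ordering::SeqCst, Ordering::Relaxed) {
            Ok(_) => break,
            Err(current) => old = current, // someone else changed the word; retry
        }
    }
}

/// Reads one byte lane (0 = least significant) of an atomic word.
fn read_byte(word: &AtomicU32, lane: usize) -> u8 {
    assert!(lane < 4);
    (word.load(Ordering::Relaxed) >> (lane * 8)) as u8
}

/// Small demonstration: overwrite lane 1 of 0x11223344 with 0xAB.
fn demo_write() -> u32 {
    let word = AtomicU32::new(0x1122_3344);
    write_byte(&word, 1, 0xAB);
    word.load(Ordering::Relaxed)
}

fn main() {
    println!("{:#010x}", demo_write());
}
```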
Do we have any examples of issues caused by these functions?
The frustrating part is, while I found multiple reasons to rule out your functions following the spec(s), there seems to be no good reason that these functions are ruled out! The read and read_vol concurrency issues are fixed by LLVM's model (which however has other problems, compared to C11), but write is illegal in LLVM just because read-write data races make the read return undef -- and in this case we know we are writing the same value that was already stored in these other bytes! Couldn't LLVM just say that in this special case (writing the value that's already there), the read must return that value? Probably yes, but this stuff is subtle enough that I would also not be surprised if that invalidates some obscure optimization.
Moreover, at least on non-embedded platforms the out-of-bounds accesses done by read are unlikely to cause actual trouble. I guess one could imagine a semantics which returns undef when reading an out-of-bounds byte that is guaranteed to sit on the same page as an in-bounds byte. But that would still leave write illegal, and that is a really tough one: write can only be allowed if the memory on these other locations is left absolutely unchanged. There could be arbitrary data sitting there from other allocations, parts of the stack frame, whatever. So somehow the formal model would have to let you read those other bytes, not allow you to gain anything by inspecting them, but also verify that you are not changing the bytes before writing them back with a CAS. I'm not aware of any model that would let you do that. But I thank you for bringing these nasty cases to my attention, it's always good to know that there is still plenty of stuff left to research in terms of memory models :)
Rust's aliasing rules
Finally, what you were probably wondering about is whether these functions violate any of the additional aliasing rules that Rust adds. The trouble is, we don't know -- these rules are still under development. However, all the proposals I have seen so far would indeed rule out your functions: When you hold an &mut u8 (say, one that points right next to the one that's passed to read/read_vol/write), the aliasing rules provide a guarantee that no access whatsoever will happen to that byte by anyone but you. So, your functions reading from memory that others could hold a &mut u8 to already makes them violate the aliasing rules.
However, the motivation for these rules is to conform with the C11 concurrency model and LLVM's rules for memory access. If LLVM declares something UB, we have to make it UB in Rust as well unless we are willing to change our codegen in a way that avoids the UB (and typically sacrifices performance). Moreover, given that Rust adopted the C11 concurrency model, the same holds true for that. So for these cases, the aliasing rules really don't have any choice but to make these accesses illegal. We could revisit this once we have a more permissive memory model, but right now our hands are tied.
This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
I have the following code:
fn main() {
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let t1 = std::thread::spawn(|| {
println!("{}", message);
});
t1.join();
}
rustc gives me the compilation error:
closure may outlive the current function, but it borrows message, which is owned by the current function
This is wrong since:
The function it's referring to here is (I believe) main. The threads will be killed or enter in UB once main is finished executing.
The function it's referring to clearly invokes .join() on said thread.
Is the previous code unsafe in any way? If so, why? If not, how can I get the compiler to understand that?
Edit: Yes, I am aware I can just move the message in this case. My question is specifically how I can pass a reference to it, preferably without having to heap-allocate it, similarly to how this C++ code would do it:
std::thread([&message]() -> void {/* etc */});
(Just to clarify, what I'm actually trying to do is access a thread safe data structure from two threads... other solutions to the problem that don't involve making the copy work would also help).
Edit2: The question this has been marked as a duplicate of is 5 pages long, and as such I'd consider it an invalid question in its own right.
Is the previous code 'unsafe' in any way ? If so, why ?
The goal of Rust's type-checking and borrow-checking system is to disallow unsafe programs, but that does not mean that all programs that fail to compile are unsafe. In this specific case, your code is not unsafe, but it does not satisfy the type constraints of the functions you are using.
The function it's referring to clearly invokes .join() on said thread.
But there is nothing from a type-checker standpoint that requires the call to .join. A type-checking system (on its own) can't enforce that a function has or has not been called on a given object. You could just as easily imagine an example like
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let mut handles = vec![];
for i in 0..3 {
let t1 = std::thread::spawn(|| {
println!("{} {}", message, i);
});
handles.push(t1);
}
for t1 in handles {
t1.join();
}
where a human can tell that each thread is joined before main exits. But a typechecker has no way to know that.
The function it's referring to here is (I believe) main. So presumably those threads will be killed when main exits anyway (and them running after main exits is UB).
From the standpoint of the checkers, main is just another function. There is no special knowledge that this specific function can have extra behavior. If this were any other function, the thread would not be auto-killed. Expanding on that, even for main there is no guarantee that the child threads will be killed instantly. If it takes 5ms for the child threads to be killed, that is still 5ms where the child threads could be accessing the content of a variable that has gone out of scope.
To gain the behavior that you are looking for with this specific snippet (as-is), the lifetime of the closure would have to be tied to the lifetime of the t1 object, such that the closure was guaranteed to never be used after the handles have been cleaned up. While that is certainly an option, it is significantly less flexible in the general case. Because it would be enforced at the type level, there would be no way to opt out of this behavior.
You could consider using crossbeam, specifically crossbeam::scope's .spawn, which enforces this lifetime requirement where the standard library does not, meaning a thread must stop execution before the scope is finished.
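Since Rust 1.63 the standard library offers the same pattern as std::thread::scope. A minimal sketch of borrowing message from the spawning function, with no move and no heap allocation (print_borrowed is a hypothetical name):

```rust
use std::thread;

/// Prints a borrowed message from another thread and returns its length.
fn print_borrowed(message: &str) -> usize {
    thread::scope(|s| {
        // The closure only borrows `message`; the scope guarantees the thread
        // has finished before `scope` returns, so the borrow cannot dangle.
        let handle = s.spawn(|| {
            println!("{}", message);
            message.len()
        });
        handle.join().unwrap()
    })
}

fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    print_borrowed(message);
}
```

Here the "thread must be joined" requirement is part of the API's types, which is exactly what plain std::thread::spawn cannot express.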
In your specific case, your code works fine as long as you transfer ownership of message to the child thread instead of borrowing it from the main function, because there is no risk of unsafe code with or without your call to .join. Your code works fine if you change
let t1 = std::thread::spawn(|| {
to
let t1 = std::thread::spawn(move || {