std::sync::atomic::AtomicUsize implements Sync which means immutable references are free of data races when shared between multiple threads. Why does AtomicUsize not implement Send? Is there state which is linked to the thread that created the atomic or is this a language design decision relating to the way atomics are intended to be used i.e. via a Arc<_> etc.
It's a trick! AtomicUsize does implement Send:
use std::sync::atomic::AtomicUsize;
fn checker<T>(_: T) where T: Send {}
fn main() {
checker(AtomicUsize::default());
}
In fact, there's even an automated test that ensures this is the case.
Rust 1.26
These auto traits are now documented, thanks to a change made to rustdoc.
Previous versions
The gotcha lies in how Send is implemented:
This trait is automatically derived when the compiler determines it's appropriate.
This means that Rustdoc doesn't know that Send is implemented for a type because most types don't implement it explicitly.
This explains why AtomicPtr<T> shows up in the implementers list: it has a special implementation that ignores the type of T.
Related
You can do this:
impl Foo {
fn foo(self: &Rc<Self>) {}
}
But not this:
impl Foo {
fn foo(self: &Rc<RefCell<Self>>) {}
}
The former is quite useful - e.g. I can have methods return objects containing weak references to self. But because I can't use RefCell I can't return anything that would mutate self.
There are ways around this (e.g. wrapping the whole struct in RefCell internally) but none as convenient for my current task as just allowing self: &Rc<RefCell<>>.
The grammar allowed is described here. It allows Box, Rc, Arc and Pin but not RefCell. Why?
As of the time of this writing, in August 2022, method receiver type support is still fairly limited to a handful of types, and compositions of those types. You've noticed that RefCell is not among them.
arbitrary_self_types, a feature for expanding the possible types of self, has a tracking issue for discussing its implementation. From the discussion, it seems that this feature is currently targeting types that implement Deref[Mut], which RefCell does not implement. A few comments point out that this is limiting though, so there's still a possibility.
Ultimately, I think the answer is that RefCell does not work because the full-fledged feature hasn't been designed thoroughly enough yet. The Rust team likely implemented the basic feature so that we could have Pin as a self type and thus make futures work, and added a few other "easy" types to get more bang for their buck, but a proper and generic implementation has been deferred.
futures::lock::Mutex is not implementing RefUnwindSafe trait (see https://docs.rs/futures/0.3.17/futures/lock/struct.Mutex.html#impl-RefUnwindSafe )
Why is not safe to pass a futures::lock::Mutex inside std::panic::catch_unwind?
Reproducible code:
use std::panic::catch_unwind;
use futures::lock::Mutex;
fn main() {
let m = Mutex::new(String::new());
catch_unwind(|| {
m.lock()
});
}
Could futures::lock::Mutex implement the RefUnwindSafe trait ?
The documentation for the UnwindSafe trait hints at a reason for this:
Who implements UnwindSafe?
Types such as &mut T and &RefCell are examples which are not unwind safe. The general idea is that any mutable state which can be shared across catch_unwind is not unwind safe by default. This is because it is very easy to witness a broken invariant outside of catch_unwind as the data is simply accessed as usual.
Types like &Mutex, however, are unwind safe because they implement poisoning by default. They still allow witnessing a broken invariant, but they already provide their own “speed bumps” to do so.
futures::lock::Mutex provides mutability like &mut T and RefCell<T>, but does not implement the poisoning feature of std::sync::Mutex<T>, so it does not implement UnwindSafe.
Though as the documentation points out, the UnwindSafe is less about memory safety and more about upholding logical invariants - hence why neither UnwindSafe nor AssertUnwindSafe are unsafe.
I have this code:
fn main() {
let p = Person;
let r = &p as &dyn Eatable;
Consumer::consume(r);
// Compile error
Consumer::consume_generic(r);
}
trait Eatable {}
struct Person;
impl Eatable for Person {}
struct Consumer;
impl Consumer {
fn consume(eatable: &dyn Eatable) {}
fn consume_generic<T: Eatable>(eatable: &T) {}
}
Error:
the size for values of type dyn Eatable cannot be known at
compilation time
I think it is strange. I have a method that literally takes a dyn Eatable and compiles fine, so that method knows somehow the size of Eatable. The generic method (consume_generic) will properly compile down for every used type for performance and the consume method will not.
So a few questions arise: why the compiler error? Are there things inside the body of the methods in which I can do something which I can not do in the other method? When should I prefer the one over the other?
Sidenote: I asked this question for the language Swift as well: Differences generic protocol type parameter vs direct protocol type. In Swift I get the same compile error but the underlying error is different: protocols/traits do not conform to themselves (because Swift protocols can holds initializers, static things etc. which makes it harder to generically reference them). I also tried it in Java, I believe the generic type is erased and it makes absolutely no difference.
The problem is not with the functions themselves, but with the trait bounds on types.
Every generic types in Rust has an implicit Sized bound: since this is correct in the majority of cases, it was decided not to force the developer to write this out every time. But, if you are using this type only behind some kind of reference, as you do here, you may want to lift this restriction by specifying T: ?Sized. If you add this, your code will compile fine:
impl Consumer {
fn consume(eatable: &dyn Eatable) {}
fn consume_generic<T: Eatable + ?Sized>(eatable: &T) {}
}
Playground as a proof
As for the other questions, the main difference is in static vs dynamic dispatch.
When you use the generic function (or the semantically equivalent impl Trait syntax), the function calls are dispatched statically. That is, for every type of argument you pass to the function, compiler generates the definition independently of others. This will likely result in more optimized code in most cases, but the drawbacks are possibly larger binary size and some limitations in API (e.g. you can't easily create a heterogeneous collection this way).
When you use dyn Trait syntax, you opt in for dynamic dispatch. The necessary data will be stored into the table attached to trait object, and the correct implementation for every trait method will be chosen at runtime. The consumer, however, needs to be compiled only once. This is usually slower, both due to the indirection and to the fact that individual optimizations are impossible, but more flexible.
As for the recommendations (note that this is an opinion, not the fact) - I'd say it's better to stick to generics whenever possible and only change it to trait objects if the goal is impossible to achieve otherwise.
If I mark my struct as Sync will the compiler output differ? Will the compiler implement some Mutex-like magic?
struct MyStruct {
data: RefCell<u32>,
}
unsafe impl Sync for MyStruct {}
unsafe impl Send for MyStruct {}
The compiler uses a mechanism named "language items" to reference items (types, traits, etc.) that are defined in a library (usually core) but are used by the compiler, whether that be in code generated by the compiler, for validating the code or for producing specialized error messages.
Send and Sync are defined in the core library. Sync is a language item, but Send isn't. The only reference to Sync I could find in the compiler is where it checks that the type of a static variable implements Sync. (Send and Sync used to be more special to the compiler. Before auto traits were added to the language, they were implemented as "auto traits" explicitly.)
Other than that, the compiler doesn't care about what Send and Sync mean. It's the libraries (specifically, types/functions that are generic over Send/Sync types) that give the traits their meaning.
Neither trait influences what code is emitted by the compiler regarding a particular type. Making a type "thread-safe" is not something that can be done automatically. Consider a struct with many fields: even if the fields are all atomic types, a partially updated struct might not be in a valid state. The compiler doesn't know about the invariants of a particular type; only the programmer knows them. Therefore, it's the programmer's responsibility to make the type thread-safe.
A static global C string (as in this answer) doesn't have the Sync trait.
pub static MY_STRING: &'static *const u8
= "hello" as const *u8;
// TODO: Simple assertion showing it's not Sync ;)
Sync is described as
The precise definition is: a type T is Sync if &T is thread-safe. In other words, there is no possibility of data races when passing &T references between threads.
It seems like this is entirely readonly and has static lifetime, so why isn't it safe to pass a reference?
The chapter Send and Sync in The Rustonomicon describes what it means for a type to be Send or Sync. It mentions that:
raw pointers are neither Send nor Sync (because they have no safety guards).
But that just begs the question; why doesn't *const T implement Sync? Why do the safety guards matter?
Just before that, it says:
Send and Sync are also automatically derived traits. This means that, unlike every other trait, if a type is composed entirely of Send or Sync types, then it is Send or Sync. Almost all primitives are Send and Sync, and as a consequence pretty much all types you'll ever interact with are Send and Sync.
This is the key reason why raw pointers are neither Send nor Sync. If you defined a struct that encapsulates a raw pointer, but only expose it as a &T or &mut T in the struct's API, did you really make sure that your struct respects the contracts of Send and Sync? If raw pointers were Send, then Rc<T> would also be Send by default, so it would have to explicitly opt-out. (In the source, there is in fact an explicit opt-out for Rc<T>, but it's only for documentation purposes, because it's actually redundant.)
[...] they're unsafe traits. This means that they are unsafe to implement, and other unsafe code can assume that they are correctly implemented.
OK, let's recap: they're unsafe to implement, but they're automatically derived. Isn't that a weird combination? Actually, it's not as bad as it sounds. Most primitive types, like u32, are Send and Sync. Simply compounding primitive values into a struct or enum is not enough to disqualify the type for Send or Sync. Therefore, you need a struct or enum with non-Send or non-Sync before you need to write an unsafe impl.
Send and Sync are marker traits, which means they have no methods. Therefore, when a function or type puts a Send or Sync bound on a type parameter, it's relying on the type to respect a particular contract across all of its API. Because of this:
Incorrectly implementing Send or Sync can cause Undefined Behavior.