Can I define a trait whose implementations must be `!Send`? - multithreading

I'd like to define a trait which forces its implementors to under no circumstances be sent to, or shared between, threads. It should suffice to mark the trait as !Send, but Rust doesn't seem to let me.
Is it possible?
Example (playground):
#![feature(optin_builtin_traits)]
// This is a syntax error
//trait ThreadThing : !Send {}
// This doesn't work either
trait ThreadThing { }
impl !Send for ThreadThing {}

No, you can't make !Send a condition of ThreadThing. The compiler just doesn't support that kind of logic.
If it would be possible for someone using your crate to make a type that is implicitly Send, contains no unsafe code in its implementation anywhere, and make it unsafe just by implementing ThreadThing for it -- in that case, you would make ThreadThing an unsafe trait to indicate that there is unsafe code somewhere that relies on an invariant that can't be described in the type system: the invariant "Things that are Send don't implement ThreadThing".
If, as is more likely, it's only unsafe to implement Send manually for a type that implements ThreadThing -- in that case, you don't need to do anything, because manually implementing Send is unsafe already. If an implementor of ThreadThing decides to manually implement Send, they take on the burden of guaranteeing not only their own invariants, but also ThreadThing's.

The answer is: yes, you can, under some very specific conditions. Whether you should need to do this is another matter.
You can define a negative trait implementation for another trait if the trait you are negating is:
an auto-trait.
from the current crate.
So the following will work (playground):
#![feature(optin_builtin_traits)]
auto trait Scary {}
trait ThreadThing { }
impl !Scary for ThreadThing {}
But it would not work if you were trying to do:
impl !Send for ThreadThing {}
Or if Scary was not an auto-trait.
Note however that, in general it should not be necessary to mark a trait !Send in this way. The concrete implementations of the trait will be marked Send or !Send by the Rust compiler based upon the contents of the implementing struct.

Related

How to implement !Send or !Sync for a type?

Is it possible to explicitly mark a type with unimplemented Send or Sync marker traits without some redundant fields? This can be an alternative:
struct T {
_marker: PhantomData<*const ()>,
}
but again, it has a redundant field, is super cryptic (you need to remember that a pointer is !Send + !Sync) and might be nontrivial to get a proper combination (consider !Sync + Send). I believe negative trait might be a solution, but they are not available at stable yet:
// error[E0658]: negative trait bounds are not yet fully implemented
impl !Send for T {}
Playground
Your PhantomData solution is generally the way to do this. I was sort of surprised to see that there's no PhantomUnsend or PhantomUnsync types in std::marker, in the same way that there's a PhantomPinned; my advice would be to add, in a util.rs module, something resembling:
pub type PhantomUnsync = PhantomData<Cell<()>>;
pub type PhantomUnsend = PhantomData<MutexGuard<'static, ()>>;
These typedefs make it clear that they exist to force !Send or !Sync in types that contain them, in the same way that PhantomPinned forces !Unpin.

Differences generic trait-bounded method vs 'direct' trait method

I have this code:
fn main() {
let p = Person;
let r = &p as &dyn Eatable;
Consumer::consume(r);
// Compile error
Consumer::consume_generic(r);
}
trait Eatable {}
struct Person;
impl Eatable for Person {}
struct Consumer;
impl Consumer {
fn consume(eatable: &dyn Eatable) {}
fn consume_generic<T: Eatable>(eatable: &T) {}
}
Error:
the size for values of type dyn Eatable cannot be known at
compilation time
I think it is strange. I have a method that literally takes a dyn Eatable and compiles fine, so that method knows somehow the size of Eatable. The generic method (consume_generic) will properly compile down for every used type for performance and the consume method will not.
So a few questions arise: why the compiler error? Are there things inside the body of the methods in which I can do something which I can not do in the other method? When should I prefer the one over the other?
Sidenote: I asked this question for the language Swift as well: Differences generic protocol type parameter vs direct protocol type. In Swift I get the same compile error but the underlying error is different: protocols/traits do not conform to themselves (because Swift protocols can holds initializers, static things etc. which makes it harder to generically reference them). I also tried it in Java, I believe the generic type is erased and it makes absolutely no difference.
The problem is not with the functions themselves, but with the trait bounds on types.
Every generic types in Rust has an implicit Sized bound: since this is correct in the majority of cases, it was decided not to force the developer to write this out every time. But, if you are using this type only behind some kind of reference, as you do here, you may want to lift this restriction by specifying T: ?Sized. If you add this, your code will compile fine:
impl Consumer {
fn consume(eatable: &dyn Eatable) {}
fn consume_generic<T: Eatable + ?Sized>(eatable: &T) {}
}
Playground as a proof
As for the other questions, the main difference is in static vs dynamic dispatch.
When you use the generic function (or the semantically equivalent impl Trait syntax), the function calls are dispatched statically. That is, for every type of argument you pass to the function, compiler generates the definition independently of others. This will likely result in more optimized code in most cases, but the drawbacks are possibly larger binary size and some limitations in API (e.g. you can't easily create a heterogeneous collection this way).
When you use dyn Trait syntax, you opt in for dynamic dispatch. The necessary data will be stored into the table attached to trait object, and the correct implementation for every trait method will be chosen at runtime. The consumer, however, needs to be compiled only once. This is usually slower, both due to the indirection and to the fact that individual optimizations are impossible, but more flexible.
As for the recommendations (note that this is an opinion, not the fact) - I'd say it's better to stick to generics whenever possible and only change it to trait objects if the goal is impossible to achieve otherwise.

Should trait bounds be duplicated in struct and impl?

The following code uses a struct with generic type. While its implementation is only valid for the given trait bound, the struct can be defined with or without the same bound. The struct's fields are private so no other code could create an instance anyway.
trait Trait {
fn foo(&self);
}
struct Object<T: Trait> {
value: T,
}
impl<T: Trait> Object<T> {
fn bar(object: Object<T>) {
object.value.foo();
}
}
Should the trait bound for the structure should be omitted to conform to the DRY principle, or should it be given to clarify the dependency? Or are there circumstances one solution should be preferred over the other?
I believe that the existing answers are misleading. In most cases, you should not put a bound on a struct unless the struct literally will not compile without it.
tl;dr
Bounds on structs express the wrong thing for most people. They are infectious, redundant, sometimes nearsighted, and often confusing. Even when a bound feels right, you should usually leave it off until it's proven necessary.
(In this answer, anything I say about structs applies equally to enums.)
0. Bounds on structs have to be repeated everywhere the struct is
This is the most obvious, but (to me) least compelling reason to avoid writing bounds on structs. As of this writing (Rust 1.65), you have to repeat every struct's bounds on every impl that touches it, which is a good enough reason not to put bounds on structs for now. However, there is an accepted RFC (implied_bounds) which, when implemented and stabilized, will change this by inferring the redundant bounds. But even then, bounds on structs are still usually wrong:
1. Bounds on structs leak out of abstractions.
Your data structure is special. "Object<T> only makes sense if T is Trait," you say. And perhaps you are right. But the decision affects not just Object, but any other data structure that contains an Object<T>, even if it does not always contain an Object<T>. Consider a programmer who wants to wrap your Object in an enum:
enum MyThing<T> { // error[E0277]: the trait bound `T: Trait` is not satisfied
Wrapped(your::Object<T>),
Plain(T),
}
Within the downstream code this makes sense because MyThing::Wrapped is only used with Ts that do implement Thing, while Plain can be used with any type. But if your::Object<T> has a bound on T, this enum can't be compiled without that same bound, even if there are lots of uses for a Plain(T) that don't require such a bound. Not only does this not work, but even if adding the bound doesn't make it entirely useless, it also exposes the bound in the public API of any struct that happens to use MyThing.
Bounds on structs limit what other people can do with them. Bounds on code (impls and functions) do too, of course, but those constraints are (presumably) required by your own code, while bounds on structs are a preemptive strike against anyone downstream who might use your struct in an innovative way. This may be useful, but unnecessary bounds are particularly annoying for innovators because they constrain what can compile without usefully constraining what can actually run (more on that in a moment).
2. Bounds on structs are redundant with bounds on code.
So you don't think downstream innovation is possible? That doesn't mean the struct itself needs a bound. To make it impossible to construct an Object<T> without T: Trait, it is enough to put that bound on the impl that contains Object's constructor(s); if it's impossible to call a_method on an Object<T> without T: Trait you can say that on the impl that contains a_method, or perhaps on a_method itself. (Until implied_bounds is implemented, you have to, anyway, so you don't even have the weak justification of "saving keystrokes.")
Even and especially when you can't think of any way for downstream to use an un-bounded Object<T>, you should not forbid it a priori, because...
3. Bounds on structs mean something different to the type system than bounds on code.
A T: Trait bound on Object<T> means more than "all Object<T>s have to have T: Trait"; it actually means something like "the concept of Object<T> itself does not make sense unless T: Trait", which is a more abstract idea. Think about natural language: I've never seen a purple elephant, but I can easily name the concept of "purple elephant" despite the fact that it corresponds to no real-world animal. Types are a kind of language and it can make sense to refer to the idea of Elephant<Purple>, even when you don't know how to create one and you certainly have no use for one. Similarly, it can make sense to express the type Object<NotTrait> in the abstract even if you don't and can't have one in hand right now. Especially when NotTrait is a type parameter, which may not be known in this context to implement Trait but in some other context does.
Case study: Cell<T>
For one example of a struct that originally had a trait bound which was eventually removed, look no farther than Cell<T>, which originally had a T: Copy bound. In the RFC to remove the bound many people initially made the same kinds of arguments you may be thinking of right now, but the eventual consensus was that "Cell requires Copy" was always the wrong way to think about Cell. The RFC was merged, paving the way for innovations like Cell::as_slice_of_cells, which lets you do things you couldn't before in safe code, including temporarily opt-in to shared mutation. The point is that T: Copy was never a useful bound on Cell<T>, and it would have done no harm (and possibly some good) to leave it off from the beginning.
This kind of abstract constraint can be hard to wrap one's head around, which is probably one reason why it's so often misused. Which relates to my last point:
4. Unnecessary bounds invite unnecessary parameters (which are worse).
This does not apply to all cases of bounds on structs, but it is a common point of confusion. You may, for instance, have a struct with a type parameter that should implement a generic trait, but not know what parameter(s) the trait should take. In such cases it is tempting to use PhantomData to add a type parameter to the main struct, but this is usually a mistake, not least because PhantomData is hard to use correctly. Here are some examples of unnecessary parameters added because of unnecessary bounds: 1 2 3 4 5 In the majority of such cases, the correct solution is simply to remove the bound.
Exceptions to the rule
Okay, when do you need a bound on a struct? I can think of two possible reasons.
In Shepmaster's answer, the struct will simply not compile without a bound, because the Iterator implementation for I actually defines what the struct contains. One other way that a struct won't compile without a bound is when its implementation of Drop has to use the trait somehow. Drop can't have bounds that aren't on the struct, for soundness reasons, so you have to write them on the struct as well.
When you're writing unsafe code and you want it to rely on a bound (T: Send, for example), you might need to put that bound on the struct. unsafe code is special because it can rely on invariants that are guaranteed by non-unsafe code, so just putting the bound on the impl that contains the unsafe is not necessarily enough.
But in all other cases, unless you really know what you're doing, you should avoid bounds on structs entirely.
Trait bounds that apply to every instance of the struct should be applied to the struct:
struct IteratorThing<I>
where
I: Iterator,
{
a: I,
b: Option<I::Item>,
}
Trait bounds that only apply to certain instances should only be applied to the impl block they pertain to:
struct Pair<T> {
a: T,
b: T,
}
impl<T> Pair<T>
where
T: std::ops::Add<T, Output = T>,
{
fn sum(self) -> T {
self.a + self.b
}
}
impl<T> Pair<T>
where
T: std::ops::Mul<T, Output = T>,
{
fn product(self) -> T {
self.a * self.b
}
}
to conform to the DRY principle
The redundancy will be removed by RFC 2089:
Eliminate the need for “redundant” bounds on functions and impls where
those bounds can be inferred from the input types and other trait
bounds. For example, in this simple program, the impl would no longer
require a bound, because it can be inferred from the Foo<T> type:
struct Foo<T: Debug> { .. }
impl<T: Debug> Foo<T> {
// ^^^^^ this bound is redundant
...
}
It really depends on what the type is for. If it is only intended to hold values which implement the trait, then yes, it should have the trait bound e.g.
trait Child {
fn name(&self);
}
struct School<T: Child> {
pupil: T,
}
impl<T: Child> School<T> {
fn role_call(&self) -> bool {
// check everyone is here
}
}
In this example, only children are allowed in the school so we have the bound on the struct.
If the struct is intended to hold any value but you want to offer extra behaviour when the trait is implemented, then no, the bound shouldn't be on the struct e.g.
trait GoldCustomer {
fn get_store_points(&self) -> i32;
}
struct Store<T> {
customer: T,
}
impl<T: GoldCustomer> Store {
fn choose_reward(customer: T) {
// Do something with the store points
}
}
In this example, not all customers are gold customers and it doesn't make sense to have the bound on the struct.

When is it appropriate to mark a trait as unsafe, as opposed to marking all the functions in the trait as unsafe?

Saying the same thing in code, when would I pick either of the following examples?
unsafe trait MyCoolTrait {
fn method(&self) -> u8;
}
trait MyCoolTrait {
unsafe fn method(&self) -> u8;
}
The opt-in builtin traits (OIBIT) RFC states:
An unsafe trait is a trait that is unsafe to implement, because it represents some kind of trusted assertion. Note that unsafe traits are perfectly safe to use. Send and Share (note: now called Sync) are examples of unsafe traits: implementing these traits is effectively an assertion that your type is safe for threading.
There's another example of an unsafe trait in the standard library, Searcher. It says:
The trait is marked unsafe because the indices returned by the next() methods are required to lie on valid utf8 boundaries in the haystack. This enables consumers of this trait to slice the haystack without additional runtime checks.
Unfortunately, neither of these paragraphs really help my understanding of when it is correct to mark the entire trait unsafe instead of some or all of the methods.
I've asked about marking a function as unsafe before, but this seems different.
A function is marked unsafe to indicate that it is possible to violate memory safety by calling it. A trait is marked unsafe to indicate that it is possible to violate memory safety by implementing it at all. This is commonly because the trait has invariants that other unsafe code relies on being upheld, and that these invariants cannot be expressed any other way.
In the case of Searcher, the methods themselves should be safe to call. That is, users should not have to worry about whether or not they're using a Searcher correctly; the interface contract says all calls are safe. There's nothing you can do that will cause the methods to violate memory safety.
However, unsafe code will be calling the methods of a Searcher, and such unsafe code will be relying on a given Searcher implementation to return offsets that are on valid UTF-8 code point boundaries. If this assumption is violated, then the unsafe code could end up causing a memory safety violation itself.
To put it another way: the correctness of unsafe code using Searchers depends on every single Searcher implementation also being correct. Or: implementing this trait incorrectly allows for safe code to induce a memory safety violation is unrelated unsafe code.
So why not just mark the methods unsafe? Because they aren't unsafe at all! They don't do anything that could violate memory safety in and of themselves. next_match just scans for and returns an Option<(usize, usize)>. The danger only exists when unsafe code assumes that these usizes are valid indices into the string being searched.
So why not just check the result? Because that'd be slower. The searching code wants to be fast, which means it wants to avoid redundant checks. But those checks can't be expressed in the Searcher interface... so instead, the whole trait is flagged as unsafe to warn anyone implementing it that there are extra conditions not stated or enforced in the code that must be respected.
There's also Send and Sync: implementing those when you shouldn't violates the expectations of (among other things) code that has to deal with threads. The code that lets you create threads is safe, but only so long as Send and Sync are only implemented on types for which they're appropriate.
The rule of thumb will be like this:
Use unsafe fn method() if the method user need to wrap method call in unsafe block.
Use unsafe trait MyTrait if the trait implementor need to unsafe impl MyTrait.
unsafe is a tip to Rust user: unsafe code must be written carefully.
The key point is that unsafe should be used as dual: when an author declare a trait/function as unsafe, the implementor/user need to implement/use it with unsafe.
When function is marked as unsafe, it means the user needs to use the function carefully. The function author is making assumption that the function user must keep.
When trait is marked as unsafe, it means the trait implementor needs to implement carefully. The trait requires implementor keeps certain assumption. But users of the unsafe trait can insouciantly call methods defined in the trait.
For concrete example, unsafe trait Searcher requires all Searcher implementation should return valid utf8 boundary when calling next. And all implementation are marked as unsafe impl Searcher, indicating implementation code might be unsafe. But as a user of Searcher, one can call searcher.next() without wrapping it in unsafe block.

What does the exclamation point mean in a trait implementation?

I found in the library reference for std::rc::Rc this trait implementation
impl<T> !Send for Rc<T>
where
T: ?Sized,
What does the exclamation point in front of Send mean?
I consulted both The Rust Programming Language¹ book and The Rust Reference², but didn't find an explanation. Please give a reference in your answer.
¹ especially the [section 3.19 Traits
² and sections 5.1 Traits and 5.1 Implementations
It's a negative trait implementation for the Send trait as described in RFC 19.
As a summary: The Send trait is an auto trait, which means it is automatically implemented for all types that only contain other Send types:
unsafe auto trait Send {}
(Send is also an unsafe trait, which means it is unsafe to implement, but that is not relevant to the question.)
An auto trait may not define any methods, which also makes it a marker trait. (The syntax for defining auto traits is currently only available in the standard library or on the nightly compiler, but their semantics are stable.)
To opt out of the automatic implementation of Send, you must write an explicit negative trait implementation:
impl !Send for MyType {}
This means that even though MyType only contains other types that are Send, MyType itself will not automatically implement Send.
See also the answer to another question: What is an auto trait in Rust?
This is a negative trait impl, so you can read it as opting out of the Send trait.
From reference about auto traits:
Auto traits can also have negative implementations, shown as impl !AutoTrait for T in the standard library documentation, that override the automatic implementations. For example *mut T has a negative implementation of Send, and so *mut T is not Send, even if T is.

Resources