Why doesn't Rust let me compare a Foo and an &Foo? - rust

As far as I understand, equality comparison between references compares the values of the referents, not the addresses contained in the references; that is, it implicitly dereferences the references.
This being the case, why do I need to write:
if ref_to_foo == &another_foo {
rather than
if ref_to_foo == another_foo {
when
if ref_to_foo == ref_to_another_foo {
already has both sides implicitly dereferenced?
The obvious answer is "because the compiler makes me", but I'm trying to understand why the language designers considered this to be a bad idea.

When writing a==b, the compiler understands PartialEq::eq(&a, &b).
Thus, when writing &a==&b, the compiler understands PartialEq::eq(&&a, &&b).
This documentation leads to this source code
impl<A: ?Sized, B: ?Sized> PartialEq<&B> for &A
where
A: PartialEq<B>,
{
#[inline]
fn eq(&self, other: &&B) -> bool {
PartialEq::eq(*self, *other)
}
#[inline]
fn ne(&self, other: &&B) -> bool {
PartialEq::ne(*self, *other)
}
}
showing that the implementation of PartialEq::eq(&&a, &&b) simply dereferences the arguments in order to forward the call to PartialEq::eq(&a, &b) (so, the same as a==b in the end).
There does not seem to exist any default implementation of PartialEq that dereferences only one of the two arguments, thus a==&b and &a==b should be rejected.
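To make the rule concrete, here is a minimal sketch. The type Foo and the helper same are hypothetical names chosen for illustration; the only assumption is that Foo derives PartialEq, so Foo == Foo (and therefore &Foo == &Foo) is available, while no mixed &Foo == Foo implementation exists.

```rust
#[derive(Debug, PartialEq)]
struct Foo(i32); // hypothetical type, for illustration only

fn same(ref_to_foo: &Foo, another_foo: Foo) -> bool {
    // `ref_to_foo == another_foo` would not compile: there is no
    // `impl PartialEq<Foo> for &Foo`. Either of these forms works:
    let by_ref = ref_to_foo == &another_foo; // &Foo == &Foo
    let by_value = *ref_to_foo == another_foo; // Foo == Foo
    by_ref && by_value
}

fn main() {
    let foo = Foo(1);
    assert!(same(&foo, Foo(1)));
    assert!(!same(&foo, Foo(2)));
}
```

Both working forms put the same level of reference on each side, which is what the impl for &A and &B shown above requires.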

Related

How does Rust infer lifetime while raw pointers are involved?

struct MyCell<T> {
value: T
}
impl<T> MyCell<T> {
fn new(value: T) -> Self {
MyCell { value }
}
fn get(&self) -> &T {
&self.value
}
fn set(&self, new_value: T) {
unsafe {
*(&self.value as *const T as *mut T) = new_value;
}
}
}
fn set_to_local(cell: &MyCell<&i32>) {
let local = 100;
cell.set(&local);
}
fn main() {
let cell = MyCell::new(&10);
set_to_local(&cell);
}
When calling cell.set(&local), suppose the reference stored in cell has lifetime 'x and &local has lifetime 'y. I am told that the covariance rule will change the type of cell from &MyCell<&'x i32> to &MyCell<&'y i32>.
How does the assignment inside the unsafe block affect the lifetime inference for the parameters of set()? Raw pointers do not have lifetimes, so how does the compiler know it should make cell and new_value have the same lifetime using covariance?
How does the assignment inside the unsafe block affect the lifetime inference for the parameters of set()?
It doesn't — and this is not even specific to raw pointers or unsafe. Function bodies never affect any aspect of the function signature (except for async fns which are irrelevant here).
Raw pointers do not have lifetimes, so how does the compiler know it should make cell and new_value have the same lifetime using covariance?
It sounds like you misread some advice. The standard library's Cell<T> is invariant in T, and for an interior-mutable type to be sound, it must be invariant (not covariant) in the type parameter. In the code you have, the compiler infers covariance for T because MyCell contains a field that is simply of the type T. Covariance is the “typical” case for most generic types.
Therefore, the code you have is unsound because MyCell is covariant over T but must instead be invariant over T.
Your code is also unsound because in the implementation of set(), you're creating an immutable reference to a T, &self.value, and then writing to its referent. This is “undefined behavior” no matter how you do it, because creating &self.value asserts to the compiler/optimizer that the pointed-to memory won't be modified until the reference is dropped.
If you want to reimplement the standard library's Cell, you must do it like the standard library does, with the UnsafeCell primitive:
pub struct Cell<T: ?Sized> {
value: UnsafeCell<T>,
}
UnsafeCell is how you opt out of &'s immutability guarantees: creating an &UnsafeCell<T> doesn't assert that the T won't be mutated. It is also invariant in T, which automatically makes the containing type invariant in T. Both of these are necessary here.
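As a rough sketch of what that could look like for the MyCell type from the question: the version below stores the value in an UnsafeCell and, as a simplifying assumption that is not part of the original code, restricts T to Copy types and has get return the value by copy rather than by reference. It is not a drop-in replacement for the standard library's Cell.

```rust
use std::cell::UnsafeCell;

// Sketch of a Cell-like type. Restricting to `T: Copy` is an assumption
// made here to keep the example small; std's Cell is more general.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: MyCell is !Sync (because UnsafeCell is), and no reference
        // into the cell is ever handed out, so nothing aliases this read.
        unsafe { *self.value.get() }
    }

    fn set(&self, new_value: T) {
        // SAFETY: same reasoning as in `get`; the raw pointer returned by
        // UnsafeCell::get is the sanctioned way to mutate through `&self`.
        unsafe { *self.value.get() = new_value; }
    }
}

fn main() {
    let cell = MyCell::new(10);
    cell.set(100);
    assert_eq!(cell.get(), 100);
}
```

Because UnsafeCell<T> is invariant in T, the set_to_local function from the question no longer compiles against this version: the compiler reports that local does not live long enough, which is the error you want here.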

Are Rust traits analogous to JavaScript mixins?

The Rust book (2nd Edition) suggests that "Traits are similar to a feature often called ‘interfaces’ in other languages, though with some differences." For those not familiar with interfaces, the analogy doesn't illuminate. Can traits be reasonably thought of as mixins such as those found commonly in JavaScript?
They both seem to be a way to share code and add methods to multiple types/objects without inheritance, but how crucial are the differences for conceptual understanding?
"Traits" (or "Roles" in Perl) are a way to add multiple units of functionality to a class (or struct in Rust) without the problems of multiple inheritance. Traits are "cross cutting concerns" meaning they're not part of the class hierarchy, they can be potentially implemented on any class.
Traits define an interface, meaning that in order for anything to implement that trait it must define all the required methods. Just as you can require that method parameters be of certain classes, you can require that certain parameters implement certain traits.
A good example is writing output. In many languages, you have to decide if you're writing to a FileHandle object or a Socket object. This can get frustrating because sometimes things will only write to files, but not sockets or vice versa, or maybe you want to capture the output in a string for debugging.
If you instead define a trait, you can write to anything that implements that trait. This is exactly what Rust does with std::io::Write.
pub trait Write {
fn write(&mut self, buf: &[u8]) -> Result<usize>;
fn flush(&mut self) -> Result<()>;
fn write_all(&mut self, mut buf: &[u8]) -> Result<()> {
while !buf.is_empty() {
match self.write(buf) {
Ok(0) => return Err(Error::new(ErrorKind::WriteZero,
"failed to write whole buffer")),
Ok(n) => buf = &buf[n..],
Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
Err(e) => return Err(e),
}
}
Ok(())
}
...and a few more...
}
Anything which wants to implement Write must implement write and flush. A default write_all is provided, but you can implement your own if you like.
Here's how Vec<u8> implements Write so you can "print" to a vector of bytes.
impl Write for Vec<u8> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
self.extend_from_slice(buf);
Ok(buf.len())
}
fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
self.extend_from_slice(buf);
Ok(())
}
fn flush(&mut self) -> io::Result<()> { Ok(()) }
}
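Now when you write something that needs to output stuff, instead of deciding if it should write to a File or a TcpStream (a network socket) or whatever, you say it just has to have the Write trait.
use std::io::{self, Write};
fn display(out: &mut impl Write) -> io::Result<()> {
    // Accepts a File, a TcpStream, a Vec<u8>, or anything else that implements Write.
    out.write_all(b"...whatever...")
}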
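For illustration, here is a small self-contained sketch of that idea. The function greet is a hypothetical name; it takes any Write implementor, and the same call works for an in-memory Vec<u8> and for standard output.

```rust
use std::io::{self, Write};

// Hypothetical helper: writes a fixed greeting to any `Write` implementor.
fn greet(out: &mut impl Write) -> io::Result<()> {
    out.write_all(b"hello, ")?;
    out.write_all(b"world\n")?;
    out.flush()
}

fn main() -> io::Result<()> {
    // Capture the output in memory, e.g. for debugging or tests.
    let mut buffer: Vec<u8> = Vec::new();
    greet(&mut buffer)?;
    assert_eq!(buffer, b"hello, world\n".to_vec());

    // The same function writes to the terminal without any changes.
    greet(&mut io::stdout())?;
    Ok(())
}
```

The caller decides where the bytes go; greet only relies on the methods the trait promises.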
Mixins are a severely watered down version of this. Mixins are a collection of methods which get injected into a class. That's about it. They solve the problem of multiple inheritance and cross-cutting concerns, but little else. There's no formal promise of an interface, you just call the methods and hope for the best.
Mixins are mostly functionally equivalent, but provide none of the compile time checks and high performance that traits do.
If you're familiar with mixins, traits will be a familiar way to compose functionality. The requirement to define an interface will be the struggle, but strong typing will be a struggle for anyone coming to Rust from JavaScript.
Unlike in JavaScript, where mixins are a neat add-on, traits are a fundamental part of Rust. They allow Rust to be strongly-typed, high-performance, very safe, but also extremely flexible. Traits allow Rust to perform extensive compile time checks on the validity of function arguments without the traditional restrictions of a strongly typed language.
Many core pieces of Rust are implemented with traits. std::io::Write has already been mentioned. There's also std::cmp::PartialEq, which handles == and !=; std::cmp::PartialOrd for >, >=, < and <=; std::fmt::Display for how a thing should be printed with {}; and so on.
Thinking of traits as mixins will lead you away from, rather than towards, understanding. Traits are fundamentally about the strict type system, which will be quite alien to a programmer whose native language is JavaScript.
Like most programming constructs, traits are flexible enough that one could use them in a way that resembles how mixins are idiomatically used, but that won't resemble at all how most other programmers, including the standard library, use traits.
You should think of traits as a radical novelty.
Traits or "type classes" (in Haskell, which is where Rust got traits from) are fundamentally about logical constraints on types. Traits are not fundamentally about values. Since JavaScript is unityped, mixins, which are about values, are nothing like traits/type-classes in a statically typed language like Rust or Haskell. Traits let us talk in a principled way about the commonalities between types. Unlike C++, which has "templates", Haskell and Rust type check implementations before monomorphization.
Assuming a generic function:
fn foo<T: Trait>(x: T) { /* .. */ }
or in Haskell:
foo :: Trait t => t -> IO ()
foo = ...
The bound T: Trait means that any type T you pick must satisfy the Trait. To satisfy the Trait, the type must explicitly say that it is implementing the Trait and therein provide a definition of all items required by the Trait. In order to be sound, Rust also guarantees that each type implements a given trait at most once - therefore, there can never be overlapping implementations.
Consider the following marker trait and a type which implements it:
trait Foo {}
struct Bar;
impl Foo for Bar {}
or in Haskell:
class Foo x where
data Bar = Bar
instance Foo Bar where
Notice that Foo does not have any methods, functions, or any other items. A difference between Haskell and Rust here is that x is absent in the Rust definition. This is because the first type parameter to a trait is implicit in Rust (and referred to as Self) while it is explicit in Haskell.
Speaking of type parameters, we can define the trait StudentOf between two types like so:
trait StudentOf<A> {}
struct AlanTuring;
struct AlonzoChurch;
impl StudentOf<AlonzoChurch> for AlanTuring {}
or in Haskell:
class StudentOf self a where
data AlanTuring = AlanTuring
data AlonzoChurch = AlonzoChurch
instance StudentOf AlanTuring AlonzoChurch where
Until now, we've not introduced any functions - let's do that:
trait From<T> {
fn from(x: T) -> Self;
}
struct WrapF64(f64);
impl From<f64> for WrapF64 {
fn from(x: f64) -> Self {
WrapF64(x)
}
}
or in Haskell:
class From self t where
from :: t -> self
newtype WrapDouble = WrapDouble Double
instance From WrapDouble Double where
from d = WrapDouble d
What you've seen here is also a form of return type polymorphism. Let's make it a bit more clear and consider a Monoid trait:
trait Monoid {
fn mzero() -> Self;
fn mappend(self, rhs: Self) -> Self;
}
#[derive(Clone, Copy)] // Copy lets `s` be passed by value to both mappend calls in main
struct Sum(usize);
impl Monoid for Sum {
fn mzero() -> Self { Sum(0) }
fn mappend(self, rhs: Self) -> Self { Sum(self.0 + rhs.0) }
}
fn main() {
let s: Sum = Monoid::mzero();
let s2 = s.mappend(Sum(2));
// or equivalently:
let s2 = <Sum as Monoid>::mappend(s, Sum(2));
}
or in Haskell:
class Monoid m where
mzero :: m -- Notice that we don't have any inputs here.
mappend :: m -> m -> m
...
The implementation of mzero here is inferred by the required return type Sum, which is why it is called return type polymorphism. Another subtle difference here is the self syntax in mappend - this is mostly a syntactic difference that allows us to do s.mappend(Sum(2)); in Rust.
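A generic function makes the return type polymorphism more visible. In the sketch below, mconcat is a hypothetical helper (the name is borrowed from Haskell) that never mentions Sum; which mzero runs is decided purely by the type M the caller ends up with.

```rust
trait Monoid {
    fn mzero() -> Self;
    fn mappend(self, rhs: Self) -> Self;
}

#[derive(Debug, PartialEq)]
struct Sum(usize);

impl Monoid for Sum {
    fn mzero() -> Self { Sum(0) }
    fn mappend(self, rhs: Self) -> Self { Sum(self.0 + rhs.0) }
}

// Hypothetical helper: folds any collection of monoid values into one.
// `M::mzero()` is resolved from the type parameter, not from a value.
fn mconcat<M: Monoid>(items: Vec<M>) -> M {
    items.into_iter().fold(M::mzero(), |acc, item| acc.mappend(item))
}

fn main() {
    let total = mconcat(vec![Sum(1), Sum(2), Sum(3)]);
    assert_eq!(total, Sum(6));
}
```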
Traits also allow us to require that each type which implements the trait must provide an associated item, such as associated constants:
trait Identifiable {
const ID: usize; // Each impl must provide a constant value.
}
impl Identifiable for bool {
const ID: usize = 42;
}
or associated types:
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
struct Once<T>(Option<T>);
impl<T> Iterator for Once<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
self.0.take()
}
}
Associated types also allow us to define functions on the type level rather than functions on the value level:
trait UnaryTypeFamily { type Output: Clone; }
struct InputType; // The "argument" of the type-level function.
impl UnaryTypeFamily for InputType { type Output = String; }
fn main() {
// Apply the function UnaryTypeFamily with InputType.
let foo: <InputType as UnaryTypeFamily>::Output = String::new();
}
Some traits such as Iterator are also object safe. This means that you can erase the actual type behind a pointer, and a vtable will be created for you:
fn use_boxed_iter(iter: Box<dyn Iterator<Item = u8>>) { /* .. */ }
The Haskell equivalent of trait objects is existentially quantified types, which is in fact what trait objects are in a type-theoretical sense.
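As a small sketch of the type erasure described above: sum_boxed below is a hypothetical function that accepts any boxed iterator over u8, so a vector iterator and a range, which are different concrete types, can both be passed through the same signature.

```rust
// Hypothetical function: the concrete iterator type is erased behind the Box.
fn sum_boxed(iter: Box<dyn Iterator<Item = u8>>) -> u32 {
    iter.map(u32::from).sum()
}

fn main() {
    // Two different concrete iterator types behind the same trait object type.
    let from_vec: Box<dyn Iterator<Item = u8>> = Box::new(vec![1u8, 2, 3].into_iter());
    let from_range: Box<dyn Iterator<Item = u8>> = Box::new(4u8..=6);

    assert_eq!(sum_boxed(from_vec), 6);
    assert_eq!(sum_boxed(from_range), 15);
}
```

Each call to next goes through the vtable, which is the cost paid for not knowing the concrete type at compile time.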
Finally, there's the issue of higher kinded types, which lets us be generic over type constructors. In Haskell, you can formulate what it means to be an (endo)functor like so:
class Functor (f :: * -> *) where
fmap :: (a -> b) -> (f a -> f b)
Rust has no direct equivalent, but generic associated types (GATs), stable since Rust 1.65, let you approximate it:
trait FunctorFamily {
    type Functor<T>;
    // A plain parameter is used because `self: Self::Functor<A>` is not an accepted receiver type.
    fn fmap<A, B, F>(value: Self::Functor<A>, mapper: F) -> Self::Functor<B>
    where
        F: Fn(A) -> B;
}
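To show how such a family could be instantiated, here is a sketch that assumes Rust 1.65 or later and a FunctorFamily trait whose fmap takes the container as an ordinary parameter. OptionFamily is a hypothetical marker type standing in for the type constructor Option.

```rust
trait FunctorFamily {
    type Functor<T>;
    fn fmap<A, B, F>(value: Self::Functor<A>, mapper: F) -> Self::Functor<B>
    where
        F: Fn(A) -> B;
}

// Hypothetical marker type: it stands for the type constructor `Option`.
struct OptionFamily;

impl FunctorFamily for OptionFamily {
    type Functor<T> = Option<T>;

    fn fmap<A, B, F>(value: Self::Functor<A>, mapper: F) -> Self::Functor<B>
    where
        F: Fn(A) -> B,
    {
        // Delegate to Option::map, which already has the right shape.
        value.map(mapper)
    }
}

fn main() {
    let mapped = OptionFamily::fmap(Some(2_i32), |x: i32| x + 1);
    assert_eq!(mapped, Some(3));

    let nothing = OptionFamily::fmap(None::<i32>, |x: i32| x + 1);
    assert_eq!(nothing, None);
}
```

The indirection through a marker type is the price of this encoding: you are generic over OptionFamily, not over Option itself.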
To add to schwern's answer
A mixin is a subclass specification that may be applied to various parent classes in order to extend them with the same set of features. - Traits: Composable Units of Behaviour.
The major difference compared to traits is that mixins have a "total ordering". Changing the order in which mixins are applied to a class or struct can change its behaviour. If mixins X and Y are applied to a struct or class A, then applying X after Y can give you different behaviour from applying Y after X. Traits are independent of implementation order, i.e. their code is flattened.

Why are borrows of struct members allowed in &mut self, but not of self to immutable methods?

If I have a struct that encapsulates two members, and updates one based on the other, that's fine as long as I do it this way:
struct A {
value: i64
}
impl A {
pub fn new() -> Self {
A { value: 0 }
}
pub fn do_something(&mut self, other: &B) {
self.value += other.value;
}
pub fn value(&self) -> i64 {
self.value
}
}
struct B {
pub value: i64
}
struct State {
a: A,
b: B
}
impl State {
pub fn new() -> Self {
State {
a: A::new(),
b: B { value: 1 }
}
}
pub fn do_stuff(&mut self) -> i64 {
self.a.do_something(&self.b);
self.a.value()
}
pub fn get_b(&self) -> &B {
&self.b
}
}
fn main() {
let mut state = State::new();
println!("{}", state.do_stuff());
}
That is, when I directly refer to self.b. But when I change do_stuff() to this:
pub fn do_stuff(&mut self) -> i64 {
self.a.do_something(self.get_b());
self.a.value()
}
The compiler complains: cannot borrow `*self` as immutable because `self.a` is also borrowed as mutable.
What if I need to do something more complex than just returning a member in order to get the argument for a.do_something()? Must I make a function that returns b by value and store it in a binding, then pass that binding to do_something()? What if b is complex?
More importantly to my understanding, what kind of memory-unsafety is the compiler saving me from here?
A key aspect of mutable references is that they are guaranteed to be the only way to access a particular value while they exist (unless they're reborrowed, which "disables" them temporarily).
When you write
self.a.do_something(&self.b);
the compiler is able to see that the borrow on self.a (which is taken implicitly to perform the method call) is distinct from the borrow on self.b, because it can reason about direct field accesses.
However, when you write
self.a.do_something(self.get_b());
then the compiler doesn't see a borrow on self.b, but rather a borrow on self. That's because lifetime parameters on method signatures cannot propagate such detailed information about borrows. Therefore, the compiler cannot guarantee that the value returned by self.get_b() doesn't give you access to self.a, which would create two references that can access self.a, one of them being mutable, which is illegal.
The reason field borrows don't propagate across functions is to simplify type checking and borrow checking (for machines and for humans). The principle is that the signature should be sufficient for performing those tasks: changing the implementation of a function should not cause errors in its callers.
What if I need to do something more complex than just returning a member in order to get the argument for a.do_something()?
I would move get_b from State to B and call get_b on self.b. This way, the compiler can see the distinct borrows on self.a and self.b and will accept the code.
self.a.do_something(self.b.get_b());
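Spelled out as a complete program, that suggestion might look like the sketch below. It trims the types from the question down to the relevant parts, and B::get_b is a stand-in for whatever more complex computation you need; here it simply returns a reference to itself.

```rust
struct A {
    value: i64,
}

impl A {
    fn do_something(&mut self, other: &B) {
        self.value += other.value;
    }
}

struct B {
    value: i64,
}

impl B {
    // Stand-in for "something more complex"; it only borrows `B`.
    fn get_b(&self) -> &B {
        self
    }
}

struct State {
    a: A,
    b: B,
}

impl State {
    fn do_stuff(&mut self) -> i64 {
        // The borrows are of `self.a` (mutable) and `self.b` (shared):
        // two distinct fields, so the borrow checker accepts this.
        self.a.do_something(self.b.get_b());
        self.a.value
    }
}

fn main() {
    let mut state = State { a: A { value: 0 }, b: B { value: 1 } };
    assert_eq!(state.do_stuff(), 1);
    assert_eq!(state.do_stuff(), 2);
}
```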
Yes, the compiler isolates functions for the purposes of the safety checks it makes. If it didn't, then every function would essentially have to be inlined everywhere. No one would appreciate this for at least two reasons:
Compile times would go through the roof, and many opportunities for parallelization would have to be discarded.
Changes to a function N calls away could affect the current function. See also Why are explicit lifetimes needed in Rust? which touches on the same concept.
what kind of memory-unsafety is the compiler saving me from here
None, really. In fact, it could be argued that it's creating false positives, as your example shows.
It's really more of a benefit for preserving programmer sanity.
The general advice that I give and follow when I encounter this problem is that the compiler is guiding you to discovering a new type in your existing code.
Your particular example is a bit too simplified for this to make sense, but if you had struct Foo(A, B, C) and found that a method on Foo needed A and B, that's often a good sign that there's a hidden type composed of A and B: struct Foo(Bar, C); struct Bar(A, B).
This isn't a silver bullet as you can end up with methods that need each pair of data, but in my experience it works the majority of the time.

What is the syntax and semantics of the `where` keyword?

Unfortunately, Rust's documentation regarding where is very lacking. The keyword only appears in one or two unrelated examples in the reference.
What semantic difference does where make in the following code? Is there any difference at all? Which form is preferred?
fn double_a<T>(a: T) -> T where T: std::ops::Add<Output = T> + Copy {
    a + a
}
fn double_b<T: std::ops::Add<Output = T> + Copy>(a: T) -> T {
    a + a
}
In the implementation of the CharEq trait, it seems that where is being used as some sort of "selector" to implement Trait for anything that matches some closure type. Am I correct?
Is there any way I can get a better, more complete picture of where? (full specification of usage and syntax)
In your example, the two forms are strictly equivalent.
The where clauses were introduced to allow more expressive bounds, for example:
fn foo<T>(a: T) where Bar<T>: MyTrait { /* ... */ }
Which is not possible using only the old syntax.
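Here is a runnable sketch of such a bound. The function describe is hypothetical; the point is that the constraint is placed on Option<T> rather than on T itself, which the inline <T: Bound> form has no way to express.

```rust
use std::fmt::Debug;

// The bound is on `Option<T>`, not on the bare parameter `T`,
// so it has to be written in a where clause.
fn describe<T>(value: T) -> String
where
    Option<T>: Debug,
{
    format!("{:?}", Some(value))
}

fn main() {
    assert_eq!(describe(5), "Some(5)");
    assert_eq!(describe("hi"), "Some(\"hi\")");
}
```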
Using where rather than the original syntax is generally preferred for readability even if the old syntax can still be used.
You can imagine for example constructions like
fn foo<A, B, C>(a: A, b: B, c: C)
where A: SomeTrait + OtherTrait,
B: ThirdTrait<A>+ OtherTrait,
C: LastTrait<A, B>
{
/* stuff here */
}
which are much more readable this way, even if they could still be expressed using the old syntax.
For your question about the CharEq trait, the code is:
impl<F> CharEq for F where F: FnMut(char) -> bool {
#[inline]
fn matches(&mut self, c: char) -> bool { (*self)(c) }
#[inline]
fn only_ascii(&self) -> bool { false }
}
It literally means: an implementation of the trait CharEq for every type F that already implements the trait FnMut(char) -> bool (that is, a closure or a function taking a char and returning a bool).
For more details, you can look at the RFC that introduced where clauses: https://github.com/rust-lang/rfcs/pull/135
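The same blanket-implementation pattern can be reproduced in a self-contained sketch. CharMatcher and count_matches are hypothetical names used as stand-ins here, since the sketch does not depend on any particular standard library trait.

```rust
// Hypothetical stand-in for a trait like CharEq.
trait CharMatcher {
    fn matches(&mut self, c: char) -> bool;
}

// Blanket impl: every closure or function of the right shape is a CharMatcher.
impl<F> CharMatcher for F
where
    F: FnMut(char) -> bool,
{
    fn matches(&mut self, c: char) -> bool {
        (*self)(c)
    }
}

// Hypothetical consumer that only knows about the trait.
fn count_matches<M: CharMatcher>(text: &str, mut matcher: M) -> usize {
    text.chars().filter(|&c| matcher.matches(c)).count()
}

fn main() {
    // A closure works...
    assert_eq!(count_matches("a1b22", |c: char| c.is_ascii_digit()), 3);
    // ...and so does a plain function with the same signature.
    assert_eq!(count_matches("Hello World", char::is_uppercase), 2);
}
```

In this reading, where acts less like a "selector" and more like a precondition: the impl exists for exactly those types that satisfy the bound.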

Returning a closure from a function

Note: This question was asked before Rust's first stable release. There have been lots of changes since and the syntax used in the function is not even valid anymore. Still, Shepmaster's answer is excellent and makes this question worth keeping.
Finally unboxed closures have landed, so I am experimenting with them to see what you can do.
I have this simple function:
fn make_adder(a: int, b: int) -> || -> int {
|| a + b
}
However, I get a missing lifetime specifier [E0106] error. I have tried to fix this by changing the return type to ||: 'static -> int, but then I get another error cannot infer an appropriate lifetime due to conflicting requirements.
If I understand correctly, the closure is unboxed so it owns a and b. It seems very strange to me that it needs a lifetime. How can I fix this?
As of Rust 1.26, you can use impl trait:
fn make_adder(a: i32) -> impl Fn(i32) -> i32 {
move |b| a + b
}
fn main() {
println!("{}", make_adder(1)(2));
}
This allows returning an unboxed closure even though it is impossible to specify the exact type of the closure.
This will not help you if any of these are true:
You are targeting Rust before this version
You have any kind of conditional in your function:
fn make_adder(a: i32) -> impl Fn(i32) -> i32 {
if a > 0 {
move |b| a + b
} else {
move |b| a - b
}
}
Here, there isn't a single return type; each closure has a unique, un-namable type.
You need to be able to name the returned type for any reason:
struct Example<F>(F);
fn make_it() -> Example<impl Fn()> {
Example(|| println!("Hello"))
}
fn main() {
let unnamed_type_ok = make_it();
let named_type_bad: /* No valid type here */ = make_it();
}
You cannot (yet) use impl SomeTrait as a variable type.
In these cases, you need to use indirection. The common solution is a trait object, as described in the other answer.
It is possible to return closures inside Boxes, that is, as trait objects implementing certain trait:
fn make_adder(a: i32) -> Box<dyn Fn(i32) -> i32> {
Box::new(move |b| a + b)
}
fn main() {
println!("{}", make_adder(1)(2));
}
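Boxing also covers the conditional case that impl Trait cannot express. In the sketch below, make_op is a hypothetical variant of make_adder: the two branches produce closures of different anonymous types, and both are coerced to the same Box<dyn Fn(i32) -> i32>.

```rust
// Hypothetical variant of make_adder that picks a closure at runtime.
fn make_op(a: i32) -> Box<dyn Fn(i32) -> i32> {
    if a > 0 {
        Box::new(move |b: i32| a + b)
    } else {
        Box::new(move |b: i32| a - b)
    }
}

fn main() {
    assert_eq!(make_op(1)(2), 3); // 1 + 2
    assert_eq!(make_op(-1)(2), -3); // -1 - 2
}
```

The cost is a heap allocation and a dynamic call per invocation, which is usually negligible next to the flexibility gained.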
There is also an RFC (its tracking issue) on adding unboxed abstract return types, which would allow returning closures by value, without boxes, but this RFC was postponed. According to the discussion in that RFC, some work has been done on it recently, so it is possible that unboxed abstract return types will be available relatively soon.
The || syntax is still the old boxed closures, so this doesn't work for the same reason it didn't previously.
And it won't work even using the correct unboxed closure syntax |&:| -> int, since that literally is just sugar for certain traits. At the moment, the sugar syntax is |X: args...| -> ret, where the X can be &, &mut or nothing, corresponding to the Fn, FnMut and FnOnce traits. You can also write Fn<(args...), ret> etc. for the non-sugared form. The sugar is likely to change (possibly to something like Fn(args...) -> ret).
Each unboxed closure has a unique, unnameable type generated internally by the compiler: the only way to talk about unboxed closures is via generics and trait bounds. In particular, writing
fn make_adder(a: int, b: int) -> |&:| -> int {
|&:| a + b
}
is like writing
fn make_adder(a: int, b: int) -> Fn<(), int> {
|&:| a + b
}
i.e. saying that make_adder returns an unboxed trait value; which doesn't make much sense at the moment. The first thing to try would be
fn make_adder<F: Fn<(), int>>(a: int, b: int) -> F {
|&:| a + b
}
but this is saying that make_adder is returning any F that the caller chooses, while we want to say it returns some fixed (but "hidden") type. This requires abstract return types, which say, basically, "the return value implements this trait" while still being unboxed and statically resolved. In the language of that (temporarily closed) RFC,
fn make_adder(a: int, b: int) -> impl Fn<(), int> {
|&:| a + b
}
Or with the closure sugar.
(Another minor point: I'm not 100% sure about unboxed closures, but the old closures certainly still capture things by-reference which is another thing that sinks the code as proposed in the issue. This is being rectified in #16610.)
Here's how to implement a closure based counter:
fn counter() -> impl FnMut() -> i32 {
let mut value = 0;
move || -> i32 {
value += 1;
return value;
}
}
fn main() {
let mut incre = counter();
println!("Count 1: {}", incre());
println!("Count 2: {}", incre());
}
