Am I missing something, or are mutable non-reference arguments not supported in Rust?
To give an example, I was playing with Rust and tried to implement Euclid's algorithm generic for all numeric types, and ideally I just wanted to pass arguments by value and have them mutable, but adding keyword mut to the argument type is rejected by compiler. So I have to declare a mutable copy of the argument as the function prologue. Is this idiomatic/efficent?
use std::ops::Rem;
extern crate num;
use self::num::Zero;
pub fn gcd<T: Copy + Zero + PartialOrd + Rem<Output=T>>(a : T, b : T) -> T
{
let mut aa = a;
let mut bb = b;
while bb > T::zero() {
let t = bb;
bb = aa % bb;
aa = t;
}
aa
}
It's certainly possible to say that an argument will be mutable:
use num::Zero; // 0.4.0
use std::ops::Rem;
pub fn gcd<T>(mut a: T, mut b: T) -> T
where
T: Copy + Zero + PartialOrd + Rem<Output = T>,
{
while b > T::zero() {
let t = b;
b = a % b;
a = t;
}
a
}
Is [declaring a mutable copy of the argument] idiomatic/efficient?
It should be fine from an efficiency perspective. The optimizer will see that they are the same and not do any extraneous copying.
As for idiomatic, I'm not so sure. I originally started by not putting mut in my function argument list as I felt that it was oversharing details about the implementation. Nowadays, I go ahead and put it in there.
Related
I'm currently writing a simple function to swap numbers in Rust:
fn swapnumbers() {
let a = 1;
let b = 2;
let (a, b) = (b, a);
println!("{}, {}", a, b);
}
I am now trying to make a test for it, how do I do it? All my other attempts have failed.
I would suggest modifying the function to return something instead of printing it, and then using either the assert_eq! or assert! macros to test for proper function. (docs for assert_eq!, docs for assert!)
fn swapnumbers() -> (i32, i32) {
let a = 1;
let b = 2;
let (a, b) = (b, a);
return (a, b);
}
assert_eq!(swapnumbers(), (2, 1));
(-> (i32, i32) means that this function returns a tuple of two i32s)
And if you're unfamiliar with testing in Rust, the official Rust book tutorial can help you out with that!
If you want to actually swap numbers, you would need to do something like this:
fn swapnumbers(a: &mut i32, b: &mut i32) {
std::mem::swap(a, b);
}
Note the types specified after the parameter names. &mut i32 means the passed value must be a mutable reference of an i32 The parameter must be mutable for you to be able to assign to it and change its value, and it must be a reference so that the function does not actually take ownership of the data.
I want to implement a generic fibonacci function that works with any type implementing Zero, One, and AddAssign. I first implemented a version that works fine, but is specialized for num::BigUint (see on play.rust-lang.org). I than came up with the following generic implementation (see on play.rust-lang.org):
extern crate num;
use num::{One, Zero};
use std::mem::swap;
use std::ops::AddAssign;
fn fib<T: Zero + One + AddAssign<&T>>(n: usize) -> T {
let mut f0 = Zero::zero();
let mut f1 = One::one();
for _ in 0..n {
f0 += &f1;
swap(&mut f0, &mut f1);
}
f0
}
This doesn't compile:
error[E0106]: missing lifetime specifier
--> src/main.rs:7:34
|
7 | fn fib<T: Zero + One + AddAssign<&T>>(n: usize) -> T {
| ^ expected lifetime parameter
Rust wants me to add a lifetime parameter to AddAssign<&T> but I don't know how to express the lifetime of f1.
You need to use Higher Rank Trait Bounds. This one means basically "For any lifetime 'a, T satisfies the AddAssign<&'a T> trait":
fn fib<T>(n: usize) -> T
where
for<'a> T: Zero + One + AddAssign<&'a T>,
I also had to change the way fib is called because the compiler couldn't figure out the return type, which could be literally any type that implements those traits. Declaring x's type gives sufficient context to the compiler so that it knows what you want.
fn main() {
let x: num::BigUint = fib(10);
// let x = fib::<BigUint>(10); // Also works
println!("fib(10) = {}", x);
}
playground
Let's say I have the following struct in Rust:
struct Num {
pub num: i32;
}
impl Num {
pub fn new(x: i32) -> Num {
Num { num: x }
}
}
impl Clone for Num {
fn clone(&self) -> Num {
Num { num: self.num }
}
}
impl Copy for Num { }
impl Add<Num> for Num {
type Output = Num;
fn add(self, rhs: Num) -> Num {
Num { num: self.num + rhs.num }
}
}
And then I have the following code snippet:
let a = Num::new(0);
let b = Num::new(1);
let c = a + b;
let d = a + b;
This works because Num is marked as Copy. Otherwise, the second addition would be a compilation error, since a and b had already been moved into the add function during the first addition (I think).
The question is what the emitted assembly does. When the add function is called, are two copies of the arguments made into the new stack frame, or is the Rust compiler smart enough to know that in this case, it's not necessary to do that copying?
If the Rust compiler isn't smart enough, and actually does the copying like a function with a argument passed by value in C++, how do you avoid the performance overhead in cases where it matters?
The context is I am implementing a matrix class (just to learn), and if I have a 100x100 matrix, I really don't want to be invoking two copies every time I try to do a multiply or add.
The question is what the emitted assembly does
There's no need to guess; you can just look. Let's use this code:
use std::ops::Add;
#[derive(Copy, Clone, Debug)]
struct Num(i32);
impl Add for Num {
type Output = Num;
fn add(self, rhs: Num) -> Num {
Num(self.0 + rhs.0)
}
}
#[inline(never)]
fn example() -> Num {
let a = Num(0);
let b = Num(1);
let c = a + b;
let d = a + b;
c + d
}
fn main() {
println!("{:?}", example());
}
Paste it into the Rust Playground, then select the Release mode and view the LLVM IR:
Search through the result to see the definition of the example function:
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 #_ZN10playground7example17h60e923840d8c0cd0E() unnamed_addr #2 {
start:
ret i32 2
}
That's right, this was completely and totally evaluated at compile time and simplified all the way down to a simple constant. Compilers are pretty good nowadays.
Maybe you want to try something not quite as hardcoded?
#[inline(never)]
fn example(a: Num, b: Num) -> Num {
let c = a + b;
let d = a + b;
c + d
}
fn main() {
let something = std::env::args().count();
println!("{:?}", example(Num(something as i32), Num(1)));
}
Produces
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 #_ZN10playground7example17h73d4138fe5e9856fE(i32 %a) unnamed_addr #3 {
start:
%0 = shl i32 %a, 1
%1 = add i32 %0, 2
ret i32 %1
}
Oops, the compiler saw that we we basically doing (x + 1) * 2, so it did some tricky optimizations here to get to 2x + 2. Let's try something harder...
#[inline(never)]
fn example(a: Num, b: Num) -> Num {
a + b
}
fn main() {
let something = std::env::args().count() as i32;
let another = std::env::vars().count() as i32;
println!("{:?}", example(Num(something), Num(another)));
}
Produces
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 #_ZN10playground7example17h73d4138fe5e9856fE(i32 %a, i32 %b) unnamed_addr #3 {
start:
%0 = add i32 %b, %a
ret i32 %0
}
A simple add instruction.
The real takeaway from this is:
Look at the generated assembly for your cases. Even similar-looking code might optimize differently.
Perform micro and macro benchmarking. You never know exactly how the code will play out in the big picture. Maybe all your cache will be blown, but your micro benchmarks will be stellar.
is the Rust compiler smart enough to know that in this case, it's not necessary to do that copying?
As you just saw, the Rust compiler plus LLVM are pretty smart. In general, it is possible to elide copies when it knows that the operand isn't needed. Whether it will work in your case or not is tough to answer.
Even if it did, you might not want to be passing large items via the stack as it's always possible that it will need to be copied.
And note that you don't have to implement copy for the value, you can choose to only allow it via references:
impl<'a, 'b> Add<&'b Num> for &'a Num {
type Output = Num;
fn add(self, rhs: &'b Num) -> Num {
Num(self.0 + rhs.0)
}
}
In fact, you may want to implement both ways of adding them, and maybe all 4 permutations of value / reference!
If the Rust compiler isn't smart enough, and actually does the copying like a function with a argument passed by value in C++, how do you avoid the performance overhead in cases where it matters?
The context is I am implementing a matrix class (just to learn), and if I have a 100x100 matrix, I really don't want to be invoking two copies every time I try to do a multiply or add.
All of Rust's implicit copies (be them from moves or actual Copy types) are a shallow memcpy. If you heap allocate, only the pointers and such are copied. Unlike C++, passing a vector by value will only copy three pointer-sized values.
To copy heap memory, an explicit copy must be made, normally by calling .clone(), implemented with #[derive(Clone)] or impl Clone.
I've talked in more detail about this elsewhere.
Shepmaster points out that shallow copies are often messed with a lot by the compiler - generally only heap memory and massive stack values cause problems.
This code fails as expected at let c = a; with compile error "use of moved value: a":
fn main() {
let a: &mut i32 = &mut 0;
let b = a;
let c = a;
}
a is moved into b and is no longer available for an assignment to c. So far, so good.
However, if I just annotate b's type and leave everything else alone:
fn main() {
let a: &mut i32 = &mut 0;
let b: &mut i32 = a;
let c = a;
}
the code fails again at let c = a;
But this time with a very different error message: "cannot move out of a because it is borrowed ... borrow of *a occurs here: let b: &mut i32 = a;"
So, if I just annotate b's type: no move of a into b, but instead a "re"-borrow of *a?
What am I missing?
Cheers.
So, if I just annotate b's type: no move of a into b, but instead a "re"-borrow of *a?
What am I missing?
Absolutely nothing, as in this case these two operations are semantically very similar (and equivalent if a and b belong to the same scope).
Either you move the reference a into b, making a a moved value, and no longer available.
Either you reborrow *a in b, making a unusable as long as b is in scope.
The second case is less definitive than the first, you can show this by putting the line defining b into a sub-scope.
This example won't compile because a is moved:
fn main() {
let a: &mut i32 = &mut 0;
{ let b = a; }
let c = a;
}
But this one will, because once b goes out of scope a is unlocked:
fn main() {
let a: &mut i32 = &mut 0;
{ let b = &mut *a; }
let c = a;
}
Now, to the question "Why does annotating the type of b change the behavior ?", my guess would be:
When there is no type annotation, the operation is a simple and straightforward move. Nothing is needed to be checked.
When there is a type annotation, a conversion may be needed (casting a &mut _ into a &_, or transforming a simple reference into a reference to a trait object). So the compiler opts for a re-borrow of the value, rather than a move.
For example, this code is perflectly valid:
fn main() {
let a: &mut i32 = &mut 0;
let b: &i32 = a;
}
and here moving a into b would not make any sense, as they are of different type. Still this code compiles: b simply re-borrows *a, and the value won't be mutably available through a as long as b is in scope.
To complement #Levans's answer on the specific question "Why does annotating the type change the behaviour?":
When you don't write the type, the compiler performs a simple move. When you do put the type, the let statement becomes a coercion site as documented in "Coercion sites":
Possible coercion sites are:
let statements where an explicit type is given.
In the present case the compiler performs a reborrow coercion, which is a special case of coercion going from &mut to &mut, as explained in this issue comment on GitHub.
Note that reborrowing in general and reborrow coercion in particular are currently poorly documented. There is an open issue on the Rust Reference to improve that point.
fn increment(number: &mut int) {
// this fails with: binary operation `+` cannot be applied to type `&mut int`
//let foo = number + number;
let foo = number.add(number);
println!("{}", foo);
}
fn main() {
let mut test = 5;
increment(&mut test);
println!("{}", test);
}
Why does number + number fail but number.add(number) works?
As a bonus question: The above prints out
10
5
Am I right to assume that test is still 5 because the data is copied over to increment? The only way that the original test variable could be mutated by the increment function would be if it was send as Box<int>, right?
number + number fails because they're two references, not two ints. The compiler also tells us why: the + operator isn't implemented for the type &mut int.
You have to dereference with the dereference operator * to get at the int. This would work let sum = *number + *number;
number.add(number); works because the signature of add is fn add(&self, &int) -> int;
Am I right to assume that test is still 5 because the data is copied
over to increment? The only way that the original test variable could
be mutated by the increment function would be if it was send as
Box, right?
Test is not copied over, but is still 5 because it's never actually mutated. You could mutate it in the increment function if you wanted.
PS: to mutate it
fn increment(number: &mut int) {
*number = *number + *number;
println!("{}", number);
}