Let's say I have the following struct in Rust:
use std::ops::Add;
struct Num {
pub num: i32,
}
impl Num {
pub fn new(x: i32) -> Num {
Num { num: x }
}
}
impl Clone for Num {
fn clone(&self) -> Num {
Num { num: self.num }
}
}
impl Copy for Num { }
impl Add<Num> for Num {
type Output = Num;
fn add(self, rhs: Num) -> Num {
Num { num: self.num + rhs.num }
}
}
And then I have the following code snippet:
let a = Num::new(0);
let b = Num::new(1);
let c = a + b;
let d = a + b;
This works because Num is marked as Copy. Otherwise, the second addition would be a compilation error, since a and b had already been moved into the add function during the first addition (I think).
The question is what the emitted assembly does. When the add function is called, are two copies of the arguments made into the new stack frame, or is the Rust compiler smart enough to know that in this case, it's not necessary to do that copying?
If the Rust compiler isn't smart enough, and actually does the copying like a function with an argument passed by value in C++, how do you avoid the performance overhead in cases where it matters?
The context is I am implementing a matrix class (just to learn), and if I have a 100x100 matrix, I really don't want to be invoking two copies every time I try to do a multiply or add.
The question is what the emitted assembly does
There's no need to guess; you can just look. Let's use this code:
use std::ops::Add;
#[derive(Copy, Clone, Debug)]
struct Num(i32);
impl Add for Num {
type Output = Num;
fn add(self, rhs: Num) -> Num {
Num(self.0 + rhs.0)
}
}
#[inline(never)]
fn example() -> Num {
let a = Num(0);
let b = Num(1);
let c = a + b;
let d = a + b;
c + d
}
fn main() {
println!("{:?}", example());
}
Paste it into the Rust Playground, then select the Release mode and view the LLVM IR:
Search through the result to see the definition of the example function:
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 @_ZN10playground7example17h60e923840d8c0cd0E() unnamed_addr #2 {
start:
ret i32 2
}
That's right, this was completely and totally evaluated at compile time and simplified all the way down to a simple constant. Compilers are pretty good nowadays.
Maybe you want to try something not quite as hardcoded?
#[inline(never)]
fn example(a: Num, b: Num) -> Num {
let c = a + b;
let d = a + b;
c + d
}
fn main() {
let something = std::env::args().count();
println!("{:?}", example(Num(something as i32), Num(1)));
}
Produces
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 @_ZN10playground7example17h73d4138fe5e9856fE(i32 %a) unnamed_addr #3 {
start:
%0 = shl i32 %a, 1
%1 = add i32 %0, 2
ret i32 %1
}
Oops, the compiler saw that we were basically doing (x + 1) * 2, so it did some tricky optimizations here to get to 2x + 2. Let's try something harder...
#[inline(never)]
fn example(a: Num, b: Num) -> Num {
a + b
}
fn main() {
let something = std::env::args().count() as i32;
let another = std::env::vars().count() as i32;
println!("{:?}", example(Num(something), Num(another)));
}
Produces
; playground::example
; Function Attrs: noinline norecurse nounwind nonlazybind readnone uwtable
define internal fastcc i32 @_ZN10playground7example17h73d4138fe5e9856fE(i32 %a, i32 %b) unnamed_addr #3 {
start:
%0 = add i32 %b, %a
ret i32 %0
}
A simple add instruction.
The real takeaway from this is:
Look at the generated assembly for your cases. Even similar-looking code might optimize differently.
Perform micro and macro benchmarking. You never know exactly how the code will play out in the big picture. Maybe all your cache will be blown, but your micro benchmarks will be stellar.
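For a very rough micro-measurement without pulling in a benchmarking crate, a sketch along these lines works; it reuses the Num type from above, and for anything serious you'd reach for a dedicated harness (e.g. criterion):
use std::ops::Add;
use std::time::Instant;

#[derive(Copy, Clone, Debug)]
struct Num(i32);

impl Add for Num {
    type Output = Num;
    fn add(self, rhs: Num) -> Num {
        Num(self.0 + rhs.0)
    }
}

fn main() {
    // Runtime-dependent bound so the whole loop can't be const-folded away,
    // mirroring the std::env::args() trick used above.
    let n = 10_000 + std::env::args().count() as i32;
    let start = Instant::now();
    let mut acc = Num(0);
    for i in 0..n {
        acc = acc + Num(i);
    }
    println!("{:?} computed in {:?}", acc, start.elapsed());
}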
is the Rust compiler smart enough to know that in this case, it's not necessary to do that copying?
As you just saw, the Rust compiler plus LLVM are pretty smart. In general, it is possible to elide copies when the compiler knows the operand isn't needed afterwards. Whether it will work in your case or not is tough to answer.
Even if it did, you might not want to be passing large items via the stack as it's always possible that it will need to be copied.
And note that you don't have to implement Copy for the value; you can choose to only allow addition via references:
impl<'a, 'b> Add<&'b Num> for &'a Num {
type Output = Num;
fn add(self, rhs: &'b Num) -> Num {
Num(self.0 + rhs.0)
}
}
In fact, you may want to implement both ways of adding them, and maybe all 4 permutations of value / reference!
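A minimal sketch of those four permutations for the Num type above might look like this (the by-value impl is the same one shown earlier):
use std::ops::Add;

#[derive(Copy, Clone, Debug)]
struct Num(i32);

// Num + Num (by value), as above
impl Add for Num {
    type Output = Num;
    fn add(self, rhs: Num) -> Num { Num(self.0 + rhs.0) }
}

// Num + &Num
impl<'b> Add<&'b Num> for Num {
    type Output = Num;
    fn add(self, rhs: &'b Num) -> Num { Num(self.0 + rhs.0) }
}

// &Num + Num
impl<'a> Add<Num> for &'a Num {
    type Output = Num;
    fn add(self, rhs: Num) -> Num { Num(self.0 + rhs.0) }
}

// &Num + &Num, as shown above
impl<'a, 'b> Add<&'b Num> for &'a Num {
    type Output = Num;
    fn add(self, rhs: &'b Num) -> Num { Num(self.0 + rhs.0) }
}
With all four in place, a + b, a + &b, &a + b, and &a + &b all compile.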
If the Rust compiler isn't smart enough, and actually does the copying like a function with a argument passed by value in C++, how do you avoid the performance overhead in cases where it matters?
The context is I am implementing a matrix class (just to learn), and if I have a 100x100 matrix, I really don't want to be invoking two copies every time I try to do a multiply or add.
All of Rust's implicit copies (be they from moves or actual Copy types) are a shallow memcpy. If you heap allocate, only the pointers and such are copied. Unlike C++, passing a vector by value will only copy three pointer-sized values.
To copy heap memory, an explicit copy must be made, normally by calling .clone(), implemented with #[derive(Clone)] or impl Clone.
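As a sketch (the Matrix type and the helper function here are hypothetical illustrations, not code from the question): moving such a value into a function copies only the Vec's pointer, length, and capacity plus the two dimension fields; duplicating the actual elements requires an explicit .clone().
// Hypothetical matrix type backed by heap-allocated storage.
#[derive(Clone, Debug)]
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>, // the actual elements live on the heap
}

fn frobenius_norm(m: Matrix) -> f64 {
    // Takes `m` by value: only the small header is moved in, not the elements.
    m.data.iter().map(|x| x * x).sum::<f64>().sqrt()
}

fn main() {
    let m = Matrix { rows: 100, cols: 100, data: vec![1.0; 100 * 100] };
    let backup = m.clone();       // explicit deep copy of the heap buffer
    let norm = frobenius_norm(m); // move: shallow memcpy of the header only
    println!("{} ({}x{})", norm, backup.rows, backup.cols);
}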
I've talked in more detail about this elsewhere.
Shepmaster points out that shallow copies are often optimized away by the compiler; generally only copying heap memory and very large stack values causes problems.
Related
I'm currently writing a simple function to swap numbers in Rust:
fn swapnumbers() {
let a = 1;
let b = 2;
let (a, b) = (b, a);
println!("{}, {}", a, b);
}
I am now trying to write a test for it; how do I do that? All my other attempts have failed.
I would suggest modifying the function to return something instead of printing it, and then using either the assert_eq! or assert! macros to test for proper function. (docs for assert_eq!, docs for assert!)
fn swapnumbers() -> (i32, i32) {
let a = 1;
let b = 2;
let (a, b) = (b, a);
return (a, b);
}
assert_eq!(swapnumbers(), (2, 1));
(-> (i32, i32) means that this function returns a tuple of two i32s)
And if you're unfamiliar with testing in Rust, the official Rust book tutorial can help you out with that!
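For example, a minimal test module for the modified function might look like this:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn swaps_the_numbers() {
        assert_eq!(swapnumbers(), (2, 1));
    }
}
Running cargo test will then pick it up automatically.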
If you want to actually swap numbers, you would need to do something like this:
fn swapnumbers(a: &mut i32, b: &mut i32) {
std::mem::swap(a, b);
}
Note the types specified after the parameter names. &mut i32 means the passed value must be a mutable reference to an i32. The parameter must be mutable for you to be able to assign to it and change its value, and it must be a reference so that the function does not actually take ownership of the data.
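Calling it then looks something like this:
fn main() {
    let mut a = 1;
    let mut b = 2;
    swapnumbers(&mut a, &mut b);
    assert_eq!((a, b), (2, 1));
    println!("{}, {}", a, b); // 2, 1
}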
Consider the following:
// Just a sequence of adjacent fields of the same type
#[repr(C)]
#[derive(Debug)]
struct S<T> {
a : T,
b : T,
c : T,
d : T,
}
impl<T : Sized> S<T> {
fn new(a : T, b : T, c : T, d : T) -> Self {
Self {
a,
b,
c,
d,
}
}
// reinterpret it as an array
fn as_slice(&self) -> &[T] {
unsafe { std::slice::from_raw_parts(self as *const Self as *const T, 4) }
}
}
fn main() {
let s = S::new(1, 2, 3, 4);
let a = s.as_slice();
println!("s :: {:?}\n\
a :: {:?}", s, a);
}
Is this code portable?
Is it always safe to assume a repr(C) struct with fields of the same type can be reinterpreted like an array? Why?
Yes, it is safe and portable, except for very large T (fix below). None of the points listed in the safety section of the documentation for std::slice::from_raw_parts are a concern here:
The data pointer is valid for mem::size_of::<T>() * 4, which is the size of S<T>, and is properly aligned.
All of the items are in the same allocation object, because they are in the same struct.
The pointer is not null, because it is a cast from the safe &self parameter, and it is properly aligned, because S<T> has (at least) the alignment of T.
The data parameter definitely points to 4 consecutive initialized Ts, because S is marked #[repr(C)], which is defined such that, in your struct, no padding is introduced between the fields (the default repr(Rust) makes no such guarantee).
The memory referenced is not mutated during the lifetime of the reference, which is guaranteed by the borrow checker.
The total size of the slice in bytes must not be greater than isize::MAX. The code does not check this, so it is technically a safety hole. To be sure, add a check to as_slice, before the unsafe block:
assert!(std::mem::size_of::<S<T>>() <= isize::MAX as _);
The check will normally be optimized out.
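Putting the check in place, as_slice might look like:
fn as_slice(&self) -> &[T] {
    // Guard against a slice larger than isize::MAX bytes (see the last point above).
    assert!(std::mem::size_of::<S<T>>() <= isize::MAX as usize);
    unsafe { std::slice::from_raw_parts(self as *const Self as *const T, 4) }
}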
In Rust, is there any way to get a handle on operator functions such as add or sub? I need to get a reference to those functions, but I can only find information about the traits. Here is a comparison showing what I need (like the wrapper methods) in Python:
A = 1
B = 2
A.__add__(B)
#Or maybe do something more, like
C = int(1).__add__
C(2)
You can obtain a function pointer to a trait method of a specific type via the universal function call syntax:
let fptr = <i32 as std::ops::Add>::add; // type: `fn(i32, i32) -> i32`
fptr(1, 3); // returns 4
Bigger example (Playground):
use std::ops;
fn calc(a: i32, b: i32, op: fn(i32, i32) -> i32) -> i32 {
op(a, b)
}
fn main() {
println!("{}", calc(2, 5, <i32 as ops::Add>::add)); // prints 7
println!("{}", calc(2, 5, <i32 as ops::Sub>::sub)); // prints -3
println!("{}", calc(2, 5, <i32 as ops::Mul>::mul)); // prints 10
}
Your int(1).__add__ example is a bit more complicated because we have a partially applied function here. Rust does not have this built into the language, but you can easily use closures to achieve the same effect:
let op = |b| 1 + b;
op(4); // returns 5
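If you want something closer to the bound-method flavour of int(1).__add__, one hypothetical approach is a small helper that partially applies the first operand (partial is not a standard library function, just an illustration):
use std::ops::Add;

// Hypothetical helper: fixes the first operand of a binary operation.
fn partial<T: Copy>(a: T, op: fn(T, T) -> T) -> impl Fn(T) -> T {
    move |b| op(a, b)
}

fn main() {
    let add_one = partial(1, <i32 as Add>::add);
    println!("{}", add_one(4)); // prints 5
    println!("{}", add_one(9)); // prints 10
}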
Today's Rust mystery is from section 4.9 of The Rust Programming Language, First Edition. The example of references and borrowing has this example:
fn main() {
fn sum_vec(v: &Vec<i32>) -> i32 {
return v.iter().fold(0, |a, &b| a + b);
}
fn foo(v1: &Vec<i32>) -> i32 {
sum_vec(v1)
}
let v1 = vec![1, 2, 3];
let answer = foo(&v1);
println!("{}", answer);
}
That seems reasonable. It prints "6", which is what you'd expect if the v of sum_vec is a C++ reference; it's just a name for a memory location, the vector v1 we defined in main().
Then I replaced the body of sum_vec with this:
fn sum_vec(v: &Vec<i32>) -> i32 {
return (*v).iter().fold(0, |a, &b| a + b);
}
It compiled and worked as expected. Okay, that's not… entirely crazy. The compiler is trying to make my life easier, I get that. Confusing, something that I have to memorize as a specific tic of the language, but not entirely crazy. Then I tried:
fn sum_vec(v: &Vec<i32>) -> i32 {
return (**v).iter().fold(0, |a, &b| a + b);
}
It still worked! What the hell?
fn sum_vec(v: &Vec<i32>) -> i32 {
return (***v).iter().fold(0, |a, &b| a + b);
}
This fails with "type [i32] cannot be dereferenced". Oh, thank god, something that makes sense. But I would have expected that error almost two iterations earlier!
References in Rust aren't C++ "names for another place in memory," but what are they? They're not pointers either, and the rules about them seem to be either esoteric or highly ad-hoc. What is happening such that a reference, a pointer, and a pointer-to-a-pointer all work equally well here?
The rules are neither ad-hoc nor really esoteric. Inspect the type of v and its various dereferences:
fn sum_vec(v: &Vec<i32>) {
let () = v;
}
You'll get:
v -> &std::vec::Vec<i32>
*v -> std::vec::Vec<i32>
**v -> [i32]
The first dereference you already understand. The second dereference is thanks to the Deref trait. Vec<T> dereferences to [T].
When performing method lookup, there's a straightforward set of rules:
If the type has the method, use it and exit the lookup.
If a reference to the type has the method, use it and exit the lookup.
If the type can be dereferenced, do so, then return to step 1.
Else the lookup fails.
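The same mechanism can be seen with a small (hypothetical) wrapper type:
use std::ops::Deref;

struct Wrapper(Vec<i32>);

impl Deref for Wrapper {
    type Target = Vec<i32>;
    fn deref(&self) -> &Vec<i32> {
        &self.0
    }
}

fn main() {
    let w = Wrapper(vec![1, 2, 3]);
    // `Wrapper` has no `iter` method, and `Vec<i32>` doesn't define one either;
    // lookup dereferences Wrapper -> Vec<i32> -> [i32], which does have `iter`.
    let sum: i32 = w.iter().sum();
    println!("{}", sum); // 6
}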
References in Rust aren't C++ "names for another place in memory,"
They absolutely are names for a place in memory. In fact, they compile down to the same C / C++ pointer you know.
Am I missing something, or are mutable non-reference arguments not supported in Rust?
To give an example, I was playing with Rust and tried to implement Euclid's algorithm generically for all numeric types. Ideally I just wanted to pass the arguments by value and have them be mutable, but adding the keyword mut to the argument is rejected by the compiler. So I have to declare a mutable copy of each argument in the function prologue. Is this idiomatic/efficient?
use std::ops::Rem;
extern crate num;
use self::num::Zero;
pub fn gcd<T: Copy + Zero + PartialOrd + Rem<Output=T>>(a : T, b : T) -> T
{
let mut aa = a;
let mut bb = b;
while bb > T::zero() {
let t = bb;
bb = aa % bb;
aa = t;
}
aa
}
It's certainly possible to say that an argument will be mutable:
use num::Zero; // 0.4.0
use std::ops::Rem;
pub fn gcd<T>(mut a: T, mut b: T) -> T
where
T: Copy + Zero + PartialOrd + Rem<Output = T>,
{
while b > T::zero() {
let t = b;
b = a % b;
a = t;
}
a
}
Is [declaring a mutable copy of the argument] idiomatic/efficient?
It should be fine from an efficiency perspective. The optimizer will see that they are the same and not do any extraneous copying.
As for idiomatic, I'm not so sure. I originally started by not putting mut in my function argument list as I felt that it was oversharing details about the implementation. Nowadays, I go ahead and put it in there.