When is it useful to use `ref` with a function parameter - rust

This question is not about the difference between fn foo(x: &T) and fn foo(ref x: T).
I wish to ask when it's desirable / idiomatically correct to use the latter. I'm unable to imagine a scenario where you would need the ref keyword in a function signature, because you could just declare fn foo(x: T) and use &x inside the function.

At the top level of a function parameter list, there is no reasonable use case for ref. It basically results in ownership of the value being moved into the function, but you only get a reference to work with.
The reason this is allowed in a function parameter list is consistency with the pattern matching syntax in other parts of the language. You can use any irrefutable pattern as a function parameter, just like in let statements. Syntax and semantics of these assignments are essentially the same, but not everything that's technically allowed to be in a function parameter list actually makes sense there, just as not all code that compiles is actually useful. Clippy warns against using ref at the top level of a function parameter list.
It may be useful to use ref in destructuring in a function parameter list. If you accept, say, a reference to a pair &(String, String), and you want to give individual names to the two entries, you can use
fn foo(&(ref x, ref y): &(String, String))
to achieve this. Not using ref here would be invalid, since you can't take ownership of these strings.
Since the arrival of match ergonomics, this (and most other) uses of the ref keyword can be rewritten using the more succinct syntax
fn foo((x, y): &(String, String))
I personally prefer the longer, more explicit version, since the "ergonomic" version makes the types that x and y have more opaque.
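To make the comparison concrete, here is a minimal sketch of both spellings side by side. The function names and the length computation are made up for illustration; only the two parameter patterns come from the answer above.

```rust
// Hypothetical helpers; only the parameter patterns are taken from the answer.

// Explicit form: `&( .. )` matches the reference, `ref` binds each field by reference.
fn total_len_explicit(&(ref x, ref y): &(String, String)) -> usize {
    // x and y are both `&String` here.
    x.len() + y.len()
}

// Match-ergonomics form: the tuple pattern is matched through the reference.
fn total_len_ergonomic((x, y): &(String, String)) -> usize {
    // x and y are also `&String`, although neither `&` nor `ref` is written.
    x.len() + y.len()
}

fn main() {
    let pair = (String::from("hello"), String::from("rust"));
    assert_eq!(total_len_explicit(&pair), total_len_ergonomic(&pair));
    // `pair` is still usable afterwards: neither function took ownership.
    println!("{} {}", pair.0, pair.1);
}
```

Both functions should behave identically; the difference is only in how visible the borrowing is at the signature.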

Related

Functional core lib functions ordering and return types

I have a question with regards to functional rust, and how the core lib functions work. It seems they return different things, depending on their ordering.
As an example, I have made a simple function that doubles numbers, and takes only even numbers. So, a map and a filter is needed. It looks like this:
However when I change the order of the map/filter, the return of filter changes, like so:
I understand the error, and that I need to dereference the variable, but I don't know why this change happens. Can someone explain to me what is going on here?
slice::iter returns a std::slice::Iter. That structure implements Iterator<Item = &T>, so it yields references.
Iterator::filter returns a std::iter::Filter, which implements Iterator<Item = I::Item>, aka it yields whatever the iterator it transforms yields. So in the second snippet, since iter() yields &i32, iter().filter(..) yields &i32.
In the first snippet however there's a Iterator::map between the two. That yields whatever the mapping function returns, which in your case is an i32. Therefore the filter which follows matches that, and yields an i32 as well.
The last pieces of the puzzle are that:
since it doesn't take ownership, filter's callback receives a reference to the input value, even if that value is already a reference, so if it's transforming an Iterator<Item=&T> it receives an &&T
while the . operator will dereference as many times as necessary, other operators will not dereference at all
because references are so common in Rust, rather than implement only i32 % i32 the stdlib also implements:
&i32 % i32
i32 % &i32
&i32 % &i32
However that's where the stdlib stops, there is no impl Rem<i32> for &&i32. Therefore no &&i32 % i32, therefore your second version can not find a trait implementation to call.
FWIW while you can of course dereference in the filter callback, there's an iterator adapter to avoid working with references to simple types (as, as you've discovered, that's often less convenient; it's also commonly less efficient): Iterator::copied will transform an Iterator<Item=&T> into Iterator<Item=T> as long as T is Copy (meaning trivially copiable and usually fairly small, like an integer).
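The question's original snippets are not reproduced here, so the following is only a hypothetical reconstruction of the situation described: doubling numbers and keeping the even ones, with the adapters in either order. The function names are invented for this sketch.

```rust
// Hypothetical reconstruction; not the asker's actual code.

// map first: the closure returns i32, so filter's callback receives `&i32`.
fn double_then_keep_even(nums: &[i32]) -> Vec<i32> {
    nums.iter()
        .map(|x| x * 2) // `&i32 * i32` is implemented and yields i32
        .filter(|x| x % 2 == 0) // x: &i32, and `&i32 % i32` is implemented
        .collect()
}

// filter first: the items are `&i32`, so the callback receives `&&i32`
// and has to dereference explicitly before using `%`.
fn keep_even_then_double(nums: &[i32]) -> Vec<i32> {
    nums.iter()
        .filter(|x| **x % 2 == 0)
        .map(|x| x * 2)
        .collect()
}

// copied() first: the items become plain i32, so no double reference appears.
fn keep_even_then_double_copied(nums: &[i32]) -> Vec<i32> {
    nums.iter()
        .copied()
        .filter(|x| x % 2 == 0) // x: &i32
        .map(|x| x * 2)
        .collect()
}

fn main() {
    let nums = [1, 2, 3, 4];
    println!("{:?}", double_then_keep_even(&nums));
    println!("{:?}", keep_even_then_double(&nums));
    println!("{:?}", keep_even_then_double_copied(&nums));
}
```

Note that the two orderings are not equivalent as computations (every doubled number is even), which is separate from the typing question discussed above.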

Opposite of Borrow trait for Copy types?

I've seen the Borrow trait used to define functions that accept both an owned type or a reference, e.g. T or &T. The borrow() method is then called in the function to obtain &T.
Is there some trait that allows the opposite (i.e. a function that accepts T or &T and obtains T) for Copy types?
E.g. for this example:
use std::borrow::Borrow;
fn foo<T: Borrow<u32>>(value: T) -> u32 {
*value.borrow()
}
fn main() {
println!("{}", foo(&5));
println!("{}", foo(5));
}
This calls borrow() to obtain a reference, which is then immediately dereferenced.
Is there another implementation that just copies the value if T was passed in, and dereferences if &T was given? Or is the above the idiomatic way of writing this sort of thing?
There is not really an inverse trait for Borrow, because it's not really useful as a bound on functions the same way Borrow is. The reason has to do with ownership.
Why is "inverse Borrow" less useful than Borrow?
Functions that need references
Consider a function that only needs to reference its argument:
fn puts(arg: &str) {
println!("{}", arg);
}
Accepting String would be silly here, because puts doesn't need to take ownership of the data, but accepting &str means we might sometimes force the caller to keep the data around longer than necessary:
{
let mut output = create_some_string(); // must be `mut` for push_str below
output.push_str(some_other_string);
puts(&output);
// do some other stuff but never use `output` again
} // `output` isn't dropped until here
The problem being that output isn't needed after it's passed to puts, and the caller knows this, but puts requires a reference, so output has to stay alive until the end of the block. Obviously you can always fix this in the caller by adding more blocks and sometimes a let, but puts can also be made generic to let the caller delegate the responsibility of cleaning up output:
fn puts<T: Borrow<str>>(arg: T) {
println!("{}", arg.borrow());
}
Accepting T: Borrow<str> for puts gives the caller the flexibility to decide whether to keep the argument around or to move it into the function.¹
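As a rough illustration of that flexibility, the sketch below uses a hypothetical char_count function with the same T: Borrow<str> bound (it returns a value instead of printing so the result can be checked). One caller keeps its string, the other hands it over.

```rust
use std::borrow::Borrow;

// Hypothetical function with the same bound as the generic `puts`.
fn char_count<T: Borrow<str>>(arg: T) -> usize {
    // The annotation makes it explicit which Borrow impl is meant.
    let s: &str = arg.borrow();
    s.chars().count()
} // if T is String, the string is dropped here, inside the callee

fn main() {
    let kept = String::from("world");
    let moved = String::from("hello");

    // Borrowing: `kept` stays alive and usable in the caller.
    assert_eq!(char_count(kept.as_str()), 5);
    println!("still have {kept}");

    // Moving: the callee takes over cleanup; `moved` cannot be used afterwards.
    assert_eq!(char_count(moved), 5);
}
```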
Functions that need owned values
Now consider the case of a function that actually needs to take ownership:
struct Wrapper(String);
fn wrap(arg: String) -> Wrapper {
Wrapper(arg)
}
In this case accepting &str would be silly, because wrap would have to call to_owned() on it. If the caller has a String that it's no longer using, that would needlessly copy the data that could have just been moved into the function. In this case, accepting String is the more flexible option, because it allows the caller to decide whether to make a clone or pass an existing String. Having an "inverse Borrow" trait would not add any flexibility that arg: String does not already provide.
But String isn't always the most ergonomic argument, because there are several different kinds of string: &str, Cow<str>, Box<str>... We can make wrap a little more ergonomic by saying it accepts anything that can be converted into a String.
fn wrap<T: Into<String>>(arg: T) -> Wrapper {
Wrapper(arg.into())
}
This means you can call it like wrap("hello, world") without having to call .to_owned() on the literal. Which is not really a flexibility win -- the caller can always call .into() instead without loss of generality -- but it is an ergonomic win.
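For illustration, this is roughly what the call sites look like with the Into<String> version. The derives on Wrapper are added only so the results can be compared in this sketch.

```rust
use std::borrow::Cow;

// Derives added for this sketch only, so the values can be compared.
#[derive(Debug, PartialEq)]
struct Wrapper(String);

fn wrap<T: Into<String>>(arg: T) -> Wrapper {
    Wrapper(arg.into())
}

fn main() {
    let from_literal = wrap("hello, world"); // &str: allocates a new String
    let from_string = wrap(String::from("hello, world")); // String: moved, no copy
    let from_box = wrap(Box::<str>::from("hello, world")); // Box<str>
    let from_cow = wrap(Cow::Borrowed("hello, world")); // Cow<str>

    assert_eq!(from_literal, from_string);
    assert_eq!(from_box, from_cow);
}
```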
What about Copy types?
Now, you asked about Copy types. For the most part the arguments above still apply. If you're writing a function that, like puts, only needs a &A, using T: Borrow<A> might be more flexible for the caller; for a function like wrap that needs the whole A, it's more flexible to just accept A. But for Copy types the ergonomic advantage of accepting T: Into<A> is much less clear-cut.
For integer types, because generics mess with type inference, using them usually makes it less ergonomic to use literals; you may end up having to explicitly annotate the types.
Since &u32 doesn't implement Into<u32>, that particular trick wouldn't work here anyway.
Since Copy types are readily available as owned values, it's less common to use them by reference in the first place.
Finally, turning a &A into an A when A: Copy is as simple as just adding *; being able to skip that step is probably not a compelling enough win to counterbalance the added complexity of using generics in most cases.
In conclusion, foo should almost certainly just accept value: u32 and let the caller decide how to get that value.
See also
Is it more conventional to pass-by-value or pass-by-reference when the method needs ownership of the value?
¹ For this particular function you'd probably want AsRef<str>, because you're not relying on the extra guarantees of Borrow, and the fact that all T implements Borrow<T> isn't usually relevant for unsized types such as str. But that is beside the point.
With the function you have you can only use a u32 or a type that can be borrowed as u32.
You can make your function more generic by using a second generic type parameter.
fn foo<T: Copy, N: Borrow<T>>(value: N) -> T {
*value.borrow()
}
This is however only a partial solution as it will require type annotations in some cases to work correctly.
For example, it works out of the box with usize:
let v = 0usize;
println!("{}", foo(v));
There is no problem here for the compiler to guess that foo(v) is a usize.
However, if you try foo(&v), the compiler will complain that it cannot find the right output type T because &T could implement several Borrow traits for different types. You need to explicitly specify which one you want to use as output.
let output: usize = foo(&v);

Why are len() and is_empty() not defined in a trait?

Most patterns in Rust are captured by traits (Iterator, From, Borrow, etc.).
How come a pattern as pervasive as len/is_empty has no associated trait in the standard library? Would that cause problems which I do not foresee? Was it deemed useless? Or is it only that nobody thought of it (which seems unlikely)?
Was it deemed useless?
I would guess that's the reason.
What could you do with the knowledge that something is empty or has length 15? Pretty much nothing, unless you also have a way to access the elements of the collection for example. The trait that unifies collections is Iterator. In particular an iterator can tell you how many elements its underlying collection has, but it also does a lot more.
Also note that should you need an Empty trait, you can create one and implement it for all standard collections, unlike interfaces in most languages. This is the power of traits. This also means that the standard library doesn't need to provide small utility traits for every single use case, they can be provided by libraries!
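A minimal sketch of such a user-defined trait might look like the following. The trait name Length, its method names, and the summary helper are all invented for this example; nothing like it exists in the standard library.

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical trait; not part of std.
trait Length {
    fn length(&self) -> usize;

    // Default method, so implementors only need to provide `length`.
    fn has_no_items(&self) -> bool {
        self.length() == 0
    }
}

impl<T> Length for Vec<T> {
    fn length(&self) -> usize {
        self.len()
    }
}

impl<T> Length for VecDeque<T> {
    fn length(&self) -> usize {
        self.len()
    }
}

impl<K, V> Length for HashMap<K, V> {
    fn length(&self) -> usize {
        self.len()
    }
}

impl Length for str {
    fn length(&self) -> usize {
        self.len() // length in bytes, as with str::len
    }
}

// Generic code can now depend on the trait alone.
fn summary<C: Length + ?Sized>(c: &C) -> (usize, bool) {
    (c.length(), c.has_no_items())
}

fn main() {
    println!("{:?}", summary(&vec![1, 2, 3]));
    println!("{:?}", summary(""));
    println!("{:?}", summary(&HashMap::<u8, u8>::new()));
}
```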
Just adding a late but perhaps useful answer here. Depending on what exactly you need, using the slice type might be a good option, rather than specifying a trait. Slices have len(), is_empty(), and other useful methods (see the slice documentation in std). Consider the following:
use core::fmt::Display;
fn printme<T: Display>(x: &[T]) {
println!("length: {}, empty: {}", x.len(), x.is_empty());
for c in x {
print!("{}, ", c);
}
println!("\nDone!");
}
fn main() {
let s = "This is some string";
// Vector
let vv: Vec<char> = s.chars().collect();
printme(&vv);
// Array
let x = [1, 2, 3, 4];
printme(&x);
// Empty
let y:Vec<u8> = Vec::new();
printme(&y);
}
printme can accept either a vector or an array. Most other things that it accepts will need some massaging.
I think maybe the reason for there being no Length trait is that most functions will either a) work through an iterator without needing to know its length (with Iterator), or b) require len because they do some sort of random element access, in which case a slice would be the best bet. In the first case, knowing length may be helpful to pre-allocate memory of some size, but size_hint takes care of this when used for anything like Vec::with_capacity, or ExactSizeIterator for anything that needs specific allocations. Most other cases would probably need to be collected to a vector at some point within the function, which has its len.
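The pre-allocation point can be sketched as follows. Both helper functions are hypothetical; size_hint, Vec::with_capacity, and ExactSizeIterator::len are the standard library pieces being illustrated.

```rust
// Hypothetical helper: pre-allocates from the iterator's lower size bound.
fn collect_doubled<I>(iter: I) -> Vec<u32>
where
    I: Iterator<Item = u32>,
{
    // size_hint returns (lower_bound, optional_upper_bound).
    let (lower, _upper) = iter.size_hint();
    let mut out = Vec::with_capacity(lower);
    for x in iter {
        out.push(x * 2);
    }
    out
}

// Hypothetical helper: when the exact length matters, ask for it in the bound.
fn exact_len<I: ExactSizeIterator>(iter: I) -> usize {
    iter.len()
}

fn main() {
    let doubled = collect_doubled(1..=3);
    println!("{:?}", doubled);

    let data = [10, 20, 30];
    println!("{}", exact_len(data.iter()));
}
```

Adapters such as filter cannot know how many items survive, so their lower bound is typically 0; the sketch still works in that case, it just does not pre-allocate.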
Playground link to my example here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9a034c2e8b75775449afa110c05858e7

How can I simplify multiple uses of BigInt::from()?

I wrote a program where I manipulated a lot of BigInt and BigUint values and performed some arithmetic operations.
I produced code where I frequently used BigInt::from(Xu8) because it is not possible to directly add numbers from different types (if I understand correctly).
I want to reduce the number of BigInt::from in my code. I thought about a function to "wrap" this, but I would need a function for each type I want to convert into BigInt/BigUint:
fn short_name(n: X) -> BigInt {
return BigInt::from(n)
}
Where X will be each type I want to convert.
I couldn't find any solution that is not in contradiction with the static typing philosophy of Rust.
I feel that I am missing something about traits, but I am not very comfortable with them, and I did not find a solution using them.
Am I trying to do something impossible in Rust? Am I missing an obvious solution?
To answer this part:
I produced code where I frequently used BigInt::from(Xu8) because it is not possible to directly add numbers from different types (if I understand correctly).
On the contrary, if you look at BigInt's documentation you'll see many impl Add:
impl<'a> Add<BigInt> for &'a u64
impl Add<u8> for BigInt
and so on. The first allows calling a_ref_to_u64 + a_bigint, the second a_bigint + an_u8 (and both set the associated type Output to BigInt). You don't need to convert these types to BigInt before adding them! And if you want your method to handle any such type you just need an Add bound similar to the From bound in Frxstrem's answer. Of course if you want many such operations, From may end up more readable.
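A rough sketch of what such an Add bound could look like is shown below. To keep it free of external crates, i128 stands in for BigInt here, and add_all is a made-up function; with the num-bigint crate the same shape of where clause would presumably be written with BigInt in place of i128.

```rust
use std::ops::Add;

// Sketch only: i128 is a stand-in for BigInt so no external crate is needed.
// `add_all` accepts any item type T that can be added to the accumulator.
fn add_all<T>(start: i128, items: Vec<T>) -> i128
where
    i128: Add<T, Output = i128>,
{
    items.into_iter().fold(start, |acc, n| acc + n)
}

fn main() {
    // T = i128: uses `impl Add<i128> for i128`.
    let by_value = add_all(1, vec![2i128, 3i128]);
    // T = &i128: uses `impl Add<&i128> for i128`, no conversion at the call site.
    let by_ref = add_all(1, vec![&2i128, &3i128]);
    assert_eq!(by_value, by_ref);
    println!("{by_value}");
}
```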
The From<T> trait (and the complementary Into<T> trait) is what is typically used to convert between types in Rust. In fact, the BigInt::from method comes from the From trait.
You can modify your short_name function into a generic function with a where clause to accept all types that BigInt can be converted from:
fn short_name<T>(n: T) -> BigInt // function with generic type T
where
BigInt: From<T>, // where BigInt implements the From<T> trait
{
BigInt::from(n)
}

What's the rule of thumb when dealing with passing args in Rust?

I read a couple of articles and it's still unclear to me. It looks like T and &T are kind of interchangeable as long as the compiler doesn't show any errors. But after reading the official docs, I want to pass everything by reference to take advantage of borrowing.
Could you provide any simple rule about passing an arg as T against &T when T is an object/string? E.g., in C++ there're 3 options:
T – copy the value, can't mutate the current value
&T – don't create a copy, can mutate the current value
const &T – don't create a copy, can't mutate the current value
E.g., is it a good idea to pass by T if I want to deallocate T after it goes out of scope in my child function (the function I'm passing T to), and to use &T if I want to use it in my child function in a read-only mode and then continue to use it in my current (parent) function?
Thanks!
These are the rules I personally use (in order).
Pass by value (T) if the parameter has a generic type and the trait(s) that this generic type implements all take &self or &mut self but there is a blanket impl for &T or &mut T (respectively) for all types T that implement that trait (or these traits). For example, in std::io::Write, all methods take &mut self, but there is a blanket impl impl<'a, W: Write + ?Sized> Write for &'a mut W provided by the standard library. This means that although you accept a T (where T: Write) by value, one can pass a &mut T because &mut T also implements Write.
Pass by value (T) if you must take ownership of the value (for example, because you pass it to another function/method that takes it by value, or because the alternatives would require potentially expensive clones).
Pass by mutable reference (&mut T) if you must mutate the object (by calling other functions/methods that take the object by mutable reference, or just by overwriting it and you want the caller to see the new value) but do not need to take ownership of it.
Pass by value (T) if the type is Copy and is small (my criterion for small is size_of::<T>() <= size_of::<usize>() * 2, but other people might have slightly different criteria). The primitive integer and floating-point types are examples of such types. Passing values of these types by reference would create an unnecessary indirection in memory, so the caller will have to perform an additional machine instruction to read it. When size_of::<T>() <= size_of::<usize>(), you're usually not saving anything by passing the value by reference because T and &T will usually be both passed in a single register (if the function has few enough parameters).
Pass by shared reference (&T) otherwise.
In general, prefer passing by shared reference when possible. This avoids potentially expensive clones when the type is large or manages resources other than memory, and gives the most flexibility to the caller in how the value can be used after the call.
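The rules above could translate into signatures along these lines. Every function and type in this sketch is hypothetical and chosen only to show one rule each; the Write blanket impl for &mut W is the standard library one mentioned in the first rule.

```rust
use std::io::{self, Write};

// Rule 1: generic over a trait with a blanket impl for `&mut W`.
// Taking W by value still lets callers pass `&mut writer`.
fn write_greeting<W: Write>(mut out: W) -> io::Result<()> {
    out.write_all(b"hello")
}

// Rule 2: the function stores the value, so it takes ownership.
struct Owner {
    name: String,
}

fn make_owner(name: String) -> Owner {
    Owner { name }
}

// Rule 3: the caller must see the mutation, but keeps ownership.
fn append_bang(s: &mut String) {
    s.push('!');
}

// Rule 4: small Copy type, passed by value.
fn square(x: u32) -> u32 {
    x * x
}

// Rule 5: everything else, by shared reference.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    write_greeting(&mut buf)?; // `&mut Vec<u8>` implements Write
    println!("{} bytes written", buf.len()); // `buf` is still usable

    let mut name = String::from("Ferris");
    append_bang(&mut name);
    println!("{}", first_word(&name));

    let owner = make_owner(name); // `name` is moved here
    println!("{} {}", owner.name, square(4));
    Ok(())
}
```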
E.g., is it a good idea to pass by T if I want to deallocate T after it goes out of scope in my child function (the function I'm passing T to)
You'd better have a good reason for that! If you ever decide that you actually need to use the T later in the caller, then you'll have to change the callee's signature and update all call sites (because unlike in C++, where going from T to const T& is mostly transparent, going from T to &T in Rust is not: you must add a & in front of the argument in all call sites).
I recommend you use Clippy if you're not already using it. Clippy has a lint that can notify you if you write a function that takes an argument by value but the function doesn't need to take ownership of it (this lint used to warn by default, but it no longer does 😞, so you have to enable it manually with #[warn(clippy::needless_pass_by_value)]).

Resources