Can I use a mutable reference method like a value-passing one? - rust

Can I use a mutable reference method like a value-passing one? For example, can I use
o.mth(&mut self, ...)
as
o.mth(self, ...)
This would allow me to return the result without worrying about the lifetime of o. It might involve a move closure, or some kind of wrapper?
For context, I'm trying to return a boxed iterator over CSV records using the rust-csv package but the iterator can't outlive the reader, which Reader::records(&'t mut self) borrows mutably. Contrast this with BufRead::lines(self), which consumes its reader and hence can be returned without lifetime problems.

No, you cannot. The self, &self, and &mut self receivers exist precisely because they behave differently, carry different restrictions, and allow different things.
In this case, you'd probably ultimately end up trying to create an iterator that yields references to itself, which isn't allowed, or store a value and a reference to that value in the same struct, which is also disallowed.
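To make the contrast with BufRead::lines(self) from the question concrete, here is a minimal sketch that uses only the standard library. The helper name boxed_lines is hypothetical, and this is not the rust-csv API; it only shows why a consuming method lets the iterator be returned freely.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical helper: it takes ownership of `src`, so the returned
// iterator borrows nothing from the caller.
fn boxed_lines<R: Read + 'static>(src: R) -> Box<dyn Iterator<Item = io::Result<String>>> {
    // `BufRead::lines(self)` consumes the reader and stores it inside the
    // iterator, so the iterator owns everything it needs.
    Box::new(BufReader::new(src).lines())
}
```

Because the reader is moved into the iterator, there is no borrow that the iterator could outlive. A method taking &mut self cannot offer that, since the reader would still have to live somewhere else.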

Related

Can you specify return type mutability in Rust?

Is it possible to specify the mutability of the assigned variable in rust? Something like
fn new(len: usize) -> Thing { ... }
fn new_mut(len: usize) -> mut Thing { ... }
I have a specific case where knowing the mutability of a type can be used to make several optimizations under the hood for my data structure.
Trying to enforce mutability manually is possible, but seems quite inelegant, especially when the concept of mutability is an intrinsic part of the rust language already. And you end up in weird situations like this:
// Thing::new() returns a data structure with an immutable backing type,
// but the code below looks like it should be mutable.
let mut foo = Thing::new(5);
In this case, I either have the choice of trying to figure out if someone tried to make a mutable reference to my immutable Thing manually (and panicking, I suppose), or of making new return a wrapper over Thing that hides all mutable functions (which means the mut keyword is rendered pointless and misleading).
I think there is a misconception here: the mutability of a return value is not, and should NOT be, part of the function signature. Mutability is always decided on the caller side.
The return type describes the memory slot handed back to the caller when the function returns. Ownership of the returned value is fully transferred, e.g. a Thing is fully moved to the caller. How the caller handles the returned value is not the concern of the called function, because that function has already finished and returned. There's no such thing as a mutable or immutable return type; mutability is always a property of memory slots. In your example, it is decided only when the variable foo, which holds the result on the caller side, is declared. As long as you have full ownership of a data structure, you're free to decide, or even change, the mutability of the data.
What you're looking for is maybe a separate type specialized for optimization.
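A small sketch of the point about caller-side mutability. The Thing below is a hypothetical stand-in for the type in the question, and caller_decides is an illustrative name.

```rust
// Hypothetical stand-in for the `Thing` in the question.
struct Thing {
    items: Vec<u8>,
}

impl Thing {
    fn new(len: usize) -> Thing {
        Thing { items: vec![0; len] }
    }
}

// The owner chooses mutability when it binds the value, and may change
// its mind later by moving the value into a new binding.
fn caller_decides() -> Thing {
    let frozen = Thing::new(3); // immutable binding, chosen by the caller
    let mut thawed = frozen;    // moving into a `mut` binding is allowed
    thawed.items[0] = 7;
    thawed
}
```

Nothing in Thing::new can observe or restrict which kind of binding the caller picks, which is why a different behaviour would need a different type.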

Is there a difference between using a reference, and using an owned value in Rust?

I'm reading the Rust book. It explains that when you create a function, you need to decide if your function will take ownership of its arguments, or take them as a mutable or immutable references.
What I'm not entirely clear about is if there's a syntactic difference between using an owned value within that function, or using a reference.
If you have a reference to a struct with methods, is the syntax for using those methods exactly the same as it would be if you were dealing with an owned variable? Are there any other difference between how one would use an owned variable, and how one would use a reference to that variable?
When do you need to dereference a reference variable? I've only seen dereferencing when you're trying to increment the value stored by the variable pointed to by a mutable reference to an int, or something like that. It seems like you only need to dereference it if you intend to replace the value of the variable entirely with something new. For example, if you want to run a method on a reference to a struct, you don't need to dereference, but if you want to replace the value with a completely different instance of that struct, you need to dereference. Is that right?
If you have a reference to a struct with methods, is the syntax for using those methods exactly the same as it would be if you were dealing with an owned variable? Are there any other difference between how one would use an owned variable, and how one would use a reference to that variable?
Yes, except if you have an immutable reference you're restricted to only calling methods which take immutable references, and if you have a mutable reference you can call methods which take mutable or immutable references, and if you have ownership you can call any method, including those which take ownership. Example:
struct Struct;

impl Struct {
    fn takes_self_ref(&self) {}
    fn takes_self_mut_ref(&mut self) {}
    fn takes_self(self) {}
}

fn func_owned(mut s: Struct) {
    s.takes_self_ref(); // compiles
    s.takes_self_mut_ref(); // compiles
    s.takes_self(); // compiles
}

fn func_mut_ref(s: &mut Struct) {
    s.takes_self_ref(); // compiles
    s.takes_self_mut_ref(); // compiles
    s.takes_self(); // error
}

fn func_ref(s: &Struct) {
    s.takes_self_ref(); // compiles
    s.takes_self_mut_ref(); // error
    s.takes_self(); // error
}
When do you need to dereference a reference variable?
With the dereference operator *. However, references are automatically dereferenced by the compiler on method calls, which is why you rarely see the dereference operator in Rust code in practice: it's rarely needed explicitly.
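A short sketch of where the explicit * shows up and where it does not. Counter and the function names are made up for illustration.

```rust
struct Counter {
    n: u32,
}

impl Counter {
    fn bump(&mut self) {
        self.n += 1;
    }
}

fn increment(x: &mut i32) {
    *x += 1; // explicit dereference to reach the integer behind the reference
}

fn bump_via_ref(c: &mut Counter) {
    c.bump(); // method call: the compiler dereferences `c` automatically
}

fn replace(c: &mut Counter, n: u32) {
    *c = Counter { n }; // explicit dereference to overwrite the whole value
}
```

This matches the intuition in the question: method calls and field access go through the reference without any extra syntax, while assigning to the referenced value itself needs the operator.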

Using same reference for multiple method parameters

I'll preface by saying I'm very new to Rust, and I'm still wrapping my head around the semantics of the borrow-checker. I have some understanding of why it doesn't like my code, but I'm not sure how to resolve it in an idiomatic way.
I have a method in Rust which accepts 3 parameters with a signature that looks something like this:
fn do_something(&mut self, mem: &mut impl TraitA, bus: &mut impl TraitB, int_lines: &impl TraitC) -> ()
I also have a struct which implements all three of these traits; however, the borrow-checker is complaining when I attempt to use the same reference for multiple parameters:
cannot borrow `*self` as mutable more than once at a time
And also:
cannot borrow `*self` as immutable because it is also borrowed as mutable
My first question is whether this is a shortcoming of the borrow-checker (being unable to recognize that the same reference is being passed), or by design (I suspect this is the case, since from the perspective of the called method each reference is distinct and thus the ownership of each can be regarded separately).
My second question is what the idiomatic approach would be. The two solutions I see are:
a) Combining all three traits into one. While this is technically trivial given my library's design, it would make the code decidedly less clean since the three traits are used to interface with unrelated parts of the struct's state. Furthermore, since this is a library (the do_something method is part of a test), it hinders the possibility of separating the state out into separate structs.
b) Moving each respective part of the struct's state into separate structs, which are then owned by the main struct. This seems like the better option to me, especially since it does not require any changes to the library code itself.
Please let me know if I'm missing another solution, or if there's a way to convince the borrow-checker to accept my original design.
The borrow checker is operating as designed. It only knows that you are passing three separate references into the same function, two of them mutable: it does not know what the function will do with them, even if they happen to refer to the same struct. A mutable reference must be exclusive, so the same struct cannot be borrowed mutably twice, or mutably and immutably, for a single call.
If the three different traits represent three different functional aspects, then your best approach might be to split the struct into different structs, each implementing one of the traits, as you have proposed.
If you would prefer to keep a single struct, and if the function will always be called with a single struct, then you can just pass it in once like this:
fn do_something(&mut self, proc: &mut (impl TraitA + TraitB + TraitC)) -> () { ... }
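For option (b) from the question, here is a rough sketch of splitting the state into separate structs. The trait methods, the struct names (Memory, Bus, IntLines, Machine, Cpu), and the run helper are all invented for illustration; only the do_something signature comes from the question.

```rust
trait TraitA { fn write_mem(&mut self, v: u8); }
trait TraitB { fn write_bus(&mut self, v: u8); }
trait TraitC { fn irq(&self) -> bool; }

// Each part of the state lives in its own struct (all hypothetical).
struct Memory { last: u8 }
struct Bus { last: u8 }
struct IntLines { raised: bool }

impl TraitA for Memory { fn write_mem(&mut self, v: u8) { self.last = v; } }
impl TraitB for Bus { fn write_bus(&mut self, v: u8) { self.last = v; } }
impl TraitC for IntLines { fn irq(&self) -> bool { self.raised } }

// The main struct owns the parts.
struct Machine { mem: Memory, bus: Bus, int_lines: IntLines }

struct Cpu { steps: u32 }

impl Cpu {
    fn do_something(&mut self, mem: &mut impl TraitA, bus: &mut impl TraitB, int_lines: &impl TraitC) {
        self.steps += 1;
        if int_lines.irq() {
            mem.write_mem(1);
            bus.write_bus(2);
        }
    }
}

fn run(cpu: &mut Cpu, m: &mut Machine) {
    // Borrows of distinct fields do not overlap, so this is accepted.
    cpu.do_something(&mut m.mem, &mut m.bus, &m.int_lines);
}
```

The borrow checker tracks each field separately, so the call borrows three disjoint places instead of borrowing the whole struct three times.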

Do I have to create distinct structs for both owned (easy-to-use) and borrowed (more efficient) data structures?

I have a Message<'a> which has &'a str references on a mostly short-lived buffer.
Those references mandate a specific program flow as they are guaranteed to never outlive the lifetime 'a of the buffer.
Now I also want to have an owned version of Message, such that it can be moved around, sent via threads, etc.
Is there an idiomatic way to achieve this? I thought that Cow<'a, str> might help, but unfortunately, Cow does not magically allocate in case &'a str would outlive the buffer's lifetime.
AFAIK, Cow is not special here: even when it holds the Owned variant, it must still pass the borrow checker on 'a.
Definition of std::borrow::Cow.
pub enum Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}
Is there an idiomatic way to have an owned variant of Message? For some reason we have &str and String, &[u8] and Vec<u8>, ... does that mean people generally would go for &msg and Message?
I suppose I still have to think about if an owned variant is really, really needed, but my experience shows that having an escape hatch for owned variants generally improves prototyping speed.
Yes, you need to have multiple types, one representing the owned concept and one representing the borrowed concept.
You'll see the same technique throughout the standard library and third-party crates.
See also:
How to abstract over a reference to a value or a value itself?
How to avoid writing duplicate accessor functions for mutable and immutable references in Rust?
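A minimal sketch of the two-type approach, assuming a message is just a topic and a body separated by a colon. The field names, OwnedMessage, the conversion methods, and parse are all hypothetical.

```rust
// Borrowed form: points into a short-lived buffer.
struct Message<'a> {
    topic: &'a str,
    body: &'a str,
}

// Owned form: no lifetime parameter, so it can be stored or sent to
// another thread.
struct OwnedMessage {
    topic: String,
    body: String,
}

impl<'a> Message<'a> {
    // Allocates copies of the borrowed data.
    fn to_owned_message(&self) -> OwnedMessage {
        OwnedMessage {
            topic: self.topic.to_owned(),
            body: self.body.to_owned(),
        }
    }
}

impl OwnedMessage {
    // Cheap view back into the borrowed form, analogous to String -> &str.
    fn as_message(&self) -> Message<'_> {
        Message { topic: &self.topic, body: &self.body }
    }
}

// Hypothetical parser for a "topic:body" buffer.
fn parse(buf: &str) -> Option<Message<'_>> {
    let (topic, body) = buf.split_once(':')?;
    Some(Message { topic, body })
}
```

Code that only reads can keep accepting Message<'_>, and the owned type serves as the escape hatch when a value has to outlive the buffer.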

What's the rule of thumb when dealing with passing args in Rust?

I read a couple of articles and it's still unclear to me. It looks like T and &T are kinda interchangeable as long as the compiler doesn't show any errors. But after reading the official docs, I want to pass everything by reference to take advantage of borrowing.
Could you provide any simple rule about passing an arg as T against &T when T is an object/string? E.g., in C++ there're 3 options:
T – copy the value, can't mutate the current value
&T – don't create a copy, can mutate the current value
const &T – don't create a copy, can't mutate the current value
E.g., is it a good idea to pass by T if I want to deallocate T after it goes out of scope in my child function (the function I'm passing T to); and use &T if I want to use it in my child function in a read-only mode and then continue to use it in my current (parent) function?
Thanks!
These are the rules I personally use (in order).
1. Pass by value (T) if the parameter has a generic type and the trait(s) that this generic type implements all take &self or &mut self but there is a blanket impl for &T or &mut T (respectively) for all types T that implement that trait (or these traits). For example, in std::io::Write, all methods take &mut self, but there is a blanket impl impl<'a, W: Write + ?Sized> Write for &'a mut W provided by the standard library. This means that although you accept a T (where T: Write) by value, one can pass a &mut T because &mut T also implements Write.
2. Pass by value (T) if you must take ownership of the value (for example, because you pass it to another function/method that takes it by value, or because the alternatives would require potentially expensive clones).
3. Pass by mutable reference (&mut T) if you must mutate the object (by calling other functions/methods that take the object by mutable reference, or just by overwriting it and you want the caller to see the new value) but do not need to take ownership of it.
4. Pass by value (T) if the type is Copy and is small (my criterion for small is size_of::<T>() <= size_of::<usize>() * 2, but other people might have slightly different criteria). The primitive integer and floating-point types are examples of such types. Passing values of these types by reference would create an unnecessary indirection in memory, so the caller will have to perform an additional machine instruction to read it. When size_of::<T>() <= size_of::<usize>(), you're usually not saving anything by passing the value by reference because T and &T will usually be both passed in a single register (if the function has few enough parameters).
5. Pass by shared reference (&T) otherwise.
In general, prefer passing by shared reference when possible. This avoids potentially expensive clones when the type is large or manages resources other than memory, and gives the most flexibility to the caller in how the value can be used after the call.
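The first rule is probably the least obvious one, so here is a small sketch of it. The function write_greeting is a hypothetical example; the blanket impl of Write for &mut W is the standard library one mentioned above.

```rust
use std::io::{self, Write};

// Hypothetical helper: takes the writer by value. Thanks to the blanket
// `impl Write for &mut W`, callers may pass either an owned writer or a
// mutable reference to one.
fn write_greeting<W: Write>(mut out: W, name: &str) -> io::Result<()> {
    writeln!(out, "hello, {}", name)
}
```

A caller that wants to keep its writer passes &mut buf and can reuse buf afterwards; a caller that is done with it can hand it over by value. The signature stays the same in both cases.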
E.g., is it a good idea to pass by T if I want to deallocate T after it goes out of scope in my child function (the function I'm passing T to)
You'd better have a good reason for that! If you ever decide that you actually need to use the T later in the caller, then you'll have to change the callee's signature and update all call sites (because unlike in C++, where going from T to const T& is mostly transparent, going from T to &T in Rust is not: you must add a & in front of the argument in all call sites).
I recommend you use Clippy if you're not already using it. Clippy has a lint that can notify you if you write a function that takes an argument by value but the function doesn't need to take ownership of it (this lint used to warn by default, but it no longer does 😞, so you have to enable it manually with #[warn(clippy::needless_pass_by_value)]).
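As a rough illustration of the kind of function that lint is aimed at, compare these two hypothetical helpers. The first takes ownership although it only reads its argument, and I'd expect clippy::needless_pass_by_value to flag it once enabled; the second borrows instead.

```rust
// Takes ownership but only reads: the caller loses `s` for no reason.
fn shout_owned(s: String) -> String {
    s.to_uppercase()
}

// Borrowing version: the caller keeps its value and adds a `&` at the call site.
fn shout(s: &str) -> String {
    s.to_uppercase()
}
```

Switching from the first form to the second is the signature change described above: every call site has to go from shout_owned(name) to shout(&name).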
