Using the Rust compiler to prevent forgetting to call a method

I have some code like this:
foo.move_right_by(10);
//do some stuff
foo.move_left_by(10);
It's really important that I perform both of those operations eventually, but I often forget to do the second one after the first. It causes a lot of bugs and I'm wondering if there is an idiomatic Rust way to avoid this problem. Is there a way to get the Rust compiler to let me know when I forget?
My idea was to maybe somehow have something like this:
// #[must_use] will warn us if a value of this type returned by a
// function is never used
#[must_use]
pub struct MustGoLeft {
    steps: usize,
}

impl MustGoLeft {
    fn move_left(&self, foo: &mut Foo) {
        foo.move_left_by(self.steps);
    }
}

// If we never use `left`, we'll get a warning about an unused variable
let left = foo.move_right_by(10);

// Downside: move_left() can be called multiple times, which is still a bug
// Downside: left is still available after this call; it would be nice if it
// could be dropped when move_left() is called
left.move_left(&mut foo);
Is there a better way to accomplish this?
Another idea is to implement Drop and panic! if the struct is dropped without having called that method. This isn't as good though because it's a runtime check and that is highly undesirable.
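A minimal sketch of that runtime-checked idea, assuming a stand-in Foo with just the one method (this is only an illustration, not code I actually have):
struct Foo { x: i64 }

impl Foo {
    fn move_left_by(&mut self, steps: i64) { self.x -= steps; }
}

struct MustGoLeft {
    steps: i64,
    done: bool,
}

impl MustGoLeft {
    fn move_left(mut self, foo: &mut Foo) {
        foo.move_left_by(self.steps);
        self.done = true;
    }
}

impl Drop for MustGoLeft {
    fn drop(&mut self) {
        // The mistake is only caught when the guard is dropped at run time,
        // which is exactly the drawback: nothing happens at compile time.
        if !self.done && !std::thread::panicking() {
            panic!("forgot to move left by {} steps", self.steps);
        }
    }
}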
Edit: I realized my example may have been too simple. The logic involved can get quite complex. For example, we have something like this:
foo.move_right_by(10);
foo.open_box(); // like a cardboard box, nothing to do with Box<T>
foo.move_left_by(10);
// do more stuff...
foo.close_box();
Notice how the operations aren't performed in a nice, properly nested order. The only thing that's important is that the inverse operation is always called afterwards. The order sometimes needs to be specified in a certain way in order to make the code work as expected.
We can even have something like this:
foo.move_right_by(10);
foo.open_box(); // like a cardboard box, nothing to do with Box<T>
foo.move_left_by(10);
// do more stuff...
foo.move_right_by(10);
foo.close_box();
foo.move_left_by(10);
// do more stuff...

You can use phantom types to carry around additional information, which can be used for type checking without any runtime cost. A limitation is that move_left_by and move_right_by must return a new owned object because they need to change the type, but often this won't be a problem.
Additionally, the compiler will complain if a type parameter isn't actually used in your struct, so you have to add a field that uses it. Rust's std provides the zero-sized PhantomData type as a convenience for exactly this purpose.
Your constraint could be encoded like this:
use std::marker::PhantomData;

pub struct GoneLeft;
pub struct GoneRight;
pub type Completed = (GoneLeft, GoneRight);

pub struct Thing<S = ((), ())> {
    pub position: i32,
    phantom: PhantomData<S>,
}

// private, to control how Thing can be constructed
fn new_thing<S>(position: i32) -> Thing<S> {
    Thing {
        position: position,
        phantom: PhantomData,
    }
}

impl Thing {
    pub fn new() -> Thing {
        new_thing(0)
    }
}

impl<L, R> Thing<(L, R)> {
    pub fn move_left_by(self, by: i32) -> Thing<(GoneLeft, R)> {
        new_thing(self.position - by)
    }

    pub fn move_right_by(self, by: i32) -> Thing<(L, GoneRight)> {
        new_thing(self.position + by)
    }
}
You can use it like this:
// This function can only be called if both move_right_by and move_left_by
// have been called on Thing already
fn do_something(thing: &Thing<Completed>) {
    println!("It's gone both ways: {:?}", thing.position);
}

fn main() {
    let thing = Thing::new()
        .move_right_by(4)
        .move_left_by(1);
    do_something(&thing);
}
And if you miss one of the required methods,
fn main() {
    let thing = Thing::new()
        .move_right_by(3);
    do_something(&thing);
}
then you'll get a compile error:
error[E0308]: mismatched types
  --> <anon>:49:18
   |
49 |     do_something(&thing);
   |                  ^^^^^^ expected struct `GoneLeft`, found ()
   |
   = note: expected type `&Thing<(GoneLeft, GoneRight)>`
   = note:    found type `&Thing<((), GoneRight)>`

I don't think #[must_use] is really what you want in this case. Here are two different approaches to solving your problem. The first is to wrap up what you need to do in a closure and abstract away the direct calls:
#[derive(Debug)]
pub struct Foo {
    x: isize,
    y: isize,
}

impl Foo {
    pub fn new(x: isize, y: isize) -> Foo {
        Foo { x: x, y: y }
    }

    fn move_left_by(&mut self, steps: isize) {
        self.x -= steps;
    }

    fn move_right_by(&mut self, steps: isize) {
        self.x += steps;
    }

    pub fn do_while_right<F>(&mut self, steps: isize, f: F)
        where F: FnOnce(&mut Self)
    {
        self.move_right_by(steps);
        f(self);
        self.move_left_by(steps);
    }
}

fn main() {
    let mut x = Foo::new(0, 0);
    println!("{:?}", x);

    x.do_while_right(10, |foo| {
        println!("{:?}", foo);
    });

    println!("{:?}", x);
}
The second approach is to create a wrapper type which calls the function when dropped (similar to how Mutex::lock produces a MutexGuard which unlocks the Mutex when dropped):
#[derive(Debug)]
pub struct Foo {
    x: isize,
    y: isize,
}

impl Foo {
    fn new(x: isize, y: isize) -> Foo {
        Foo { x: x, y: y }
    }

    fn move_left_by(&mut self, steps: isize) {
        self.x -= steps;
    }

    fn move_right_by(&mut self, steps: isize) {
        self.x += steps;
    }

    pub fn returning_move_right(&mut self, x: isize) -> MovedFoo {
        self.move_right_by(x);
        MovedFoo {
            inner: self,
            move_x: x,
            move_y: 0,
        }
    }
}

#[derive(Debug)]
pub struct MovedFoo<'a> {
    inner: &'a mut Foo,
    move_x: isize,
    move_y: isize,
}

impl<'a> Drop for MovedFoo<'a> {
    fn drop(&mut self) {
        self.inner.move_left_by(self.move_x);
    }
}

fn main() {
    let mut x = Foo::new(0, 0);
    println!("{:?}", x);

    {
        let wrapped = x.returning_move_right(5);
        println!("{:?}", wrapped);
    }

    println!("{:?}", x);
}

I only looked at the initial description and probably missed the details in the conversation, but one way to enforce the actions is to consume the original object (going right) and replace it with one that forces you to move left by the same amount before you can do whatever you wanted to do to finish the task.
The new type can forbid / require different calls to be made before getting to a finished state. For example (untested):
struct CanGoRight { .. }

impl CanGoRight {
    fn move_right_by(self, steps: usize) -> MustGoLeft {
        // Note: self is consumed and only `MustGoLeft` methods are allowed
        MustGoLeft { steps: steps }
    }
}

struct MustGoLeft {
    steps: usize,
}

impl MustGoLeft {
    fn move_left_by(self, steps: usize) -> Result<CanGoRight, MustGoLeft> {
        // Totally making this up as I go here...
        // If you haven't moved left at least the same amount of steps,
        // you must move a bit further to the left; otherwise you must
        // switch back to `CanGoRight` again
        if steps < self.steps {
            Err(MustGoLeft { steps: self.steps - steps })
        } else {
            Ok(CanGoRight { steps: steps - self.steps })
        }
    }

    fn open_box(self) -> MustGoLeftCanCloseBox { .. }
}
let foo = foo.move_right_by(10); // can't move right anymore
At this point foo can no longer move right as it isn't allowed by MustGoLeft but it can move left or open the box. If it moves left far enough it gets back to the CanGoRight state again. But if it opens the box then totally new rules apply. Either way you'll have to deal with both possibilities.
There's probably going to be some duplication between the states, but it should be easy enough to refactor. Adding a custom trait might help.
In the end it sounds like you're making a state machine of sorts. Maybe https://hoverbear.org/2016/10/12/rust-state-machine-pattern/ will be of use.
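To give a flavour of what that article describes, here is a minimal type-state sketch (the names and numbers are mine, not from the article): each state is its own type, and a transition consumes the old state and returns the new one.
// Minimal type-state sketch: the compiler only lets you call methods that
// exist on the state you are currently in.
struct Idle;
struct MovedRight { steps: usize }

impl Idle {
    fn move_right_by(self, steps: usize) -> MovedRight {
        MovedRight { steps }
    }
}

impl MovedRight {
    // Moving back left restores the invariant and returns to Idle.
    fn move_left_back(self) -> Idle {
        let _steps = self.steps; // move left by exactly this many steps
        Idle
    }
}

fn main() {
    let s = Idle.move_right_by(10);
    // s.move_right_by(3); // error: no such method on MovedRight
    let _s = s.move_left_back();
}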

Related

Late type in Rust

I'm working with two crates: A and B. I control both. I'd like to create a struct in A that has a field whose type is known only to B (i.e., A is independent of B, but B is dependent on A).
crate_a:
#[derive(Clone)]
pub struct Thing {
    pub foo: i32,
    pub bar: *const i32,
}

impl Thing {
    fn new(x: i32) -> Self {
        Thing { foo: x, bar: &0 }
    }
}
crate_b:
struct Value {}

fn func1() {
    let mut x = A::Thing::new(1);
    let y = Value {};
    x.bar = &y as *const Value as *const i32;
    // ...
}

fn func2() {
    // ...
    let y = unsafe { &*(x.bar as *const Value) };
    // ...
}
This works, but it doesn't feel very "rusty". Is there a cleaner way to do this? I thought about using a trait object, but ran into issues with Clone.
Note: My reason for splitting these out is that the dependencies in B make compilation very slow. Value above is actually from llvm_sys. I'd rather not leak that into A, which has no other dependency on llvm.
The standard way to implement something like this is with generics, which are kind of like type variables: they can be "assigned" a particular type, possibly within some constraints. This is how the standard library can provide types like Vec that work with types that you declare in your crate.
Basically, generics allow Thing to be defined in terms of "some unknown type that will become known later when this type is actually used."
Given the example in your code, it looks like Thing's bar field may or may not be set, which suggests that the built-in Option enum should be used. All you have to do is put a type parameter on Thing and pass that through to Option, like so:
pub mod A {
    #[derive(Clone)]
    pub struct Thing<T> {
        pub foo: i32,
        pub bar: Option<T>,
    }

    impl<T> Thing<T> {
        pub fn new(x: i32) -> Self {
            Thing { foo: x, bar: None }
        }
    }
}

pub mod B {
    use crate::A;

    struct Value;

    fn func1() {
        let mut x = A::Thing::new(1);
        let y = Value;
        x.bar = Some(y);
        // ...
    }

    fn func2(x: &A::Thing<Value>) {
        // ...
        let y: &Value = x.bar.as_ref().unwrap();
        // ...
    }
}
(Playground)
Here, the x in B::func1() has the type Thing<Value>. You can see with this syntax how Value is substituted for T, which makes the bar field Option<Value>.
If Thing's bar isn't actually supposed to be optional, just write pub bar: T instead, and accept a T in Thing::new() to initialize it:
pub mod A {
    #[derive(Clone)]
    pub struct Thing<T> {
        pub foo: i32,
        pub bar: T,
    }

    impl<T> Thing<T> {
        pub fn new(x: i32, y: T) -> Self {
            Thing { foo: x, bar: y }
        }
    }
}

pub mod B {
    use crate::A;

    struct Value;

    fn func1() {
        let mut x = A::Thing::new(1, Value);
        // ...
    }

    fn func2(x: &A::Thing<Value>) {
        // ...
        let y: &Value = &x.bar;
        // ...
    }
}
(Playground)
Note that the definition of Thing in both of these cases doesn't actually require that T implement Clone; however, Thing<T> will only implement Clone if T also does. #[derive(Clone)] will generate an implementation like:
impl<T> Clone for Thing<T> where T: Clone { /* ... */ }
This can allow your type to be more flexible -- it can now be used in contexts that don't require T to implement Clone, while also being cloneable when T does implement Clone. You get the best of both worlds this way.
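As a small illustration of that last point (the helper types here are mine, not from the code above): Thing<T> can still be used with a non-Clone T, but .clone() is only available when T implements Clone.
#[derive(Clone)]
pub struct Thing<T> {
    pub foo: i32,
    pub bar: T,
}

struct NotClone;          // does not implement Clone

#[derive(Clone)]
struct IsClone;

fn main() {
    let a = Thing { foo: 1, bar: IsClone };
    let _b = a.clone();    // fine: IsClone implements Clone

    let c = Thing { foo: 2, bar: NotClone };
    // let _d = c.clone(); // error: `NotClone` does not implement `Clone`
    let _ = c.foo;         // but Thing<NotClone> is still perfectly usable
}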

Why can I write Guess{value:1000} when value is private?

I have code like this from the official tutorial book, Chapter 9.3:
#![allow(unused)]
fn main() {
    pub struct Guess {
        value: i32,
    }

    impl Guess {
        pub fn new(value: i32) -> Guess {
            if value < 1 || value > 100 {
                panic!("Guess value must be between 1 and 100, got {}.", value);
            }

            Guess { value }
        }

        pub fn value(&self) -> i32 {
            self.value
        }
    }

    let g = Guess {
        value: 1000,
    };
    println!("{}", g.value);
}
According to the book, I should not be able to create a Guess using let g = Guess { ... }. However, this code does not cause any error and prints 1000.
There is still a problem even if I put the struct and impl outside the main func like this and delete the pub keyword.
struct Guess { /* ... */ }
impl Guess { /* ... */ }
fn main() { /* ... */ }
According to the book I should not be able to create a Guess using the let g = Guess {}
It is unlikely that the book states that.
Rust visibility works in terms of modules: anything in a module can see all the contents of its ancestor modules, regardless of their pub status; the reverse is not true.
Since main and Guess's value field are in the same module, there is no barrier. Moving Guess out of main doesn't change that; they're still in the same module. For visibility to become an issue, Guess needs to be moved into a separate non-ancestor module (e.g. a sub-module, or a sibling).
Example:
pub struct Foo(usize);

pub mod x {
    pub struct Bar(usize);

    impl Bar {
        pub fn get(&self) -> usize { self.0 }
    }

    pub fn make(n: super::Foo) -> Bar {
        // can access Foo's fields because this is a
        // descendant module, so it is "part of" the
        // same module
        Bar(n.0)
    }
}

pub fn qux(n: x::Bar) -> Foo {
    // Can't access Bar's field because it's not exposed to this module
    // Foo(n.0)
    Foo(n.get())
}
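Applied to the Guess example, a sketch of the situation the book is assuming could look like this (the module name is mine):
mod guessing {
    pub struct Guess {
        value: i32, // private field
    }

    impl Guess {
        pub fn new(value: i32) -> Guess {
            if value < 1 || value > 100 {
                panic!("Guess value must be between 1 and 100, got {}.", value);
            }
            Guess { value }
        }

        pub fn value(&self) -> i32 {
            self.value
        }
    }
}

fn main() {
    let g = guessing::Guess::new(50); // fine
    println!("{}", g.value());

    // let bad = guessing::Guess { value: 1000 };
    // error[E0451]: field `value` of struct `Guess` is private
}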

How to use traits for function overloading in Rust? [duplicate]

I am modeling an API where method overloading would be a good fit. My naïve attempt failed:
// fn attempt_1(_x: i32) {}
// fn attempt_1(_x: f32) {}
// Error: duplicate definition of value `attempt_1`
I then added an enum and worked through to:
enum IntOrFloat {
    Int(i32),
    Float(f32),
}

fn attempt_2(_x: IntOrFloat) {}

fn main() {
    let i: i32 = 1;
    let f: f32 = 3.0;

    // Can't pass the value directly
    // attempt_2(i);
    // attempt_2(f);
    // Error: mismatched types: expected enum `IntOrFloat`

    attempt_2(IntOrFloat::Int(i));
    attempt_2(IntOrFloat::Float(f));
    // Ugly that the caller has to explicitly wrap the parameter
}
Doing some quick searches, I've found some references that talk about overloading, and all of them seem to end in "we aren't going to allow this, but give traits a try". So I tried:
enum IntOrFloat {
    Int(i32),
    Float(f32),
}

trait IntOrFloatTrait {
    fn to_int_or_float(&self) -> IntOrFloat;
}

impl IntOrFloatTrait for i32 {
    fn to_int_or_float(&self) -> IntOrFloat {
        IntOrFloat::Int(*self)
    }
}

impl IntOrFloatTrait for f32 {
    fn to_int_or_float(&self) -> IntOrFloat {
        IntOrFloat::Float(*self)
    }
}

fn attempt_3(_x: &dyn IntOrFloatTrait) {}

fn main() {
    let i: i32 = 1;
    let f: f32 = 3.0;

    attempt_3(&i);
    attempt_3(&f);
    // Better, but the caller still has to explicitly take the reference
}
Is this the closest I can get to method overloading? Is there a cleaner way?
Yes, there is, and you almost got it already. Traits are the way to go, but you don't need trait objects, use generics:
#[derive(Debug)]
enum IntOrFloat {
    Int(i32),
    Float(f32),
}

trait IntOrFloatTrait {
    fn to_int_or_float(&self) -> IntOrFloat;
}

impl IntOrFloatTrait for i32 {
    fn to_int_or_float(&self) -> IntOrFloat {
        IntOrFloat::Int(*self)
    }
}

impl IntOrFloatTrait for f32 {
    fn to_int_or_float(&self) -> IntOrFloat {
        IntOrFloat::Float(*self)
    }
}

fn attempt_4<T: IntOrFloatTrait>(x: T) {
    let v = x.to_int_or_float();
    println!("{:?}", v);
}

fn main() {
    let i: i32 = 1;
    let f: f32 = 3.0;

    attempt_4(i);
    attempt_4(f);
}
See it working here.
Here's another way that drops the enum. It's an iteration on Vladimir's answer.
trait Tr {
    fn go(&self) -> ();
}

impl Tr for i32 {
    fn go(&self) {
        println!("i32")
    }
}

impl Tr for f32 {
    fn go(&self) {
        println!("f32")
    }
}

fn attempt_1<T: Tr>(t: T) {
    t.go()
}

fn main() {
    attempt_1(1 as i32);
    attempt_1(1 as f32);
}
Function Overloading is Possible!!! (well, sorta...)
This Rust Playground example has a more detailed example and shows usage of a struct variant, which may be better for documentation on the parameters.
For more serious flexible overloading where you want to have sets of any number of parameters of any sort of type, you can take advantage of the From<T> trait for conversion of a tuple to enum variants, and have a generic function that converts tuples passed into it to the enum type.
So code like this is possible:
fn main() {
    let f = Foo {};
    f.do_something(3.14f32);            // One f32.
    f.do_something((1, 2));             // Two i32's...
    f.do_something(("Yay!", 42, 3.14)); // A str, i32, and f64 !!
}
First, define the different sets of parameter combinations as an enum:
// The variants should consist of unambiguous sets of types.
enum FooParam {
    Bar(i32, i32),
    Baz(f32),
    Qux(&'static str, i32, f64),
}
Now, the conversion code; a macro can be written to do the tedious From<T> implementations, but here's what it could produce:
impl From<(i32, i32)> for FooParam {
    fn from(p: (i32, i32)) -> Self {
        FooParam::Bar(p.0, p.1)
    }
}

impl From<f32> for FooParam {
    fn from(p: f32) -> Self {
        FooParam::Baz(p)
    }
}

impl From<(&'static str, i32, f64)> for FooParam {
    fn from(p: (&'static str, i32, f64)) -> Self {
        FooParam::Qux(p.0, p.1, p.2)
    }
}
And then finally, implement the struct with generic method:
struct Foo {}

impl Foo {
    fn do_something<T: Into<FooParam>>(&self, t: T) {
        use FooParam::*;

        let fp = t.into();

        match fp {
            Bar(a, b) => print!("Bar: {:?}, {:?}\n", a, b),
            Baz(a) => print!("Baz: {:?}\n", a),
            Qux(a, b, c) => {
                print!("Qux: {:?}, {:?}, {:?}\n", a, b, c)
            }
        }
    }
}
Note: The trait bound on T needs to be specified.
Also, the variants need to be composed of combinations of types that the compiler wouldn't find ambiguous - which is an expectation for overloaded methods in other languages as well (Java/C++).
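To illustrate that ambiguity point with a made-up example: if two variants carried the same payload type, the two From implementations would conflict and the code would not compile.
enum Ambiguous {
    Metres(f64),
    Seconds(f64),
}

impl From<f64> for Ambiguous {
    fn from(p: f64) -> Self { Ambiguous::Metres(p) }
}

// A second impl for the same source type cannot exist:
// impl From<f64> for Ambiguous {
//     fn from(p: f64) -> Self { Ambiguous::Seconds(p) }
// }
// error[E0119]: conflicting implementations of trait `From<f64>` for type `Ambiguous`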
This approach has possibilities... it would be awesome if there were a derive macro available, or one were written, that did the From<T> implementations automatically when applied to an enum. Something like this:
// THIS DOESN'T EXIST - so don't expect the following to work.
// This is just an example of a macro that could be written to
// help in using the above approach to function overloading.
#[derive(ParameterOverloads)]
enum FooParam {
Bar(i32, i32),
Baz(f32),
Qux(&'static str, i32, f64),
}
// If this were written, it could eliminate the tedious
// implementations of From<...>.
The Builder
Another approach that addresses the case where you have multiple optional parameters to an action or configuration is the builder pattern. The examples below deviate somewhat from the recommendations in the link. Typically, there's a separate builder class/struct which finalizes the configuration and returns the configured object when a final method is invoked.
One of the most relevant situations this can apply to is where you want a constructor that takes a variable number of optional arguments - since Rust doesn't have built-in overloading, we can't have multiple versions of ___::new(). But we can get a similar effect using a chain of methods that return self. Playground link.
fn main() {
    // Create.
    let mut bb = BattleBot::new("Berzerker".into());

    // Configure.
    bb.flame_thrower(true)
        .locomotion(TractorTreads)
        .power_source(Uranium);

    println!("{:#?}", bb);
}
Each of the configuration methods has a signature similar to:
fn power_source(&mut self, ps: PowerSource) -> &mut Self {
    self.power_source = ps;
    self
}
These methods could also be written to consume self and return non-reference copies or clones of self.
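A sketch of that consuming variant (the type and fields here are assumptions for illustration, not the actual playground code): each method takes self by value and hands back the updated value, so calls chain without a mutable reference.
#[derive(Debug, Clone)]
struct BotConfig {
    name: String,
    flame_thrower: bool,
    armor: u32,
}

impl BotConfig {
    fn new(name: String) -> Self {
        BotConfig { name, flame_thrower: false, armor: 0 }
    }

    // Consumes the old value and returns the updated one.
    fn flame_thrower(mut self, on: bool) -> Self {
        self.flame_thrower = on;
        self
    }

    fn armor(mut self, value: u32) -> Self {
        self.armor = value;
        self
    }
}

fn main() {
    let bb = BotConfig::new("Berzerker".into())
        .flame_thrower(true)
        .armor(3);
    println!("{:#?}", bb);
}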
This approach can also be applied to actions. For instance, we could have a Command object that can be tuned with chained methods, which then performs the command when .exec() is invoked.
Applying this same idea to an "overloaded" method that we want to take a variable number of parameters, we modify our expectations a bit and have the method take an object that can be configured with the builder pattern.
let mut params = DrawParams::new();

graphics.draw_obj(params.model_path("./planes/X15.m3d")
    .skin("./skins/x15.sk")
    .location(23.64, 77.43, 88.89)
    .rotate_x(25.03)
    .effect(MotionBlur));
Alternatively, we could decide on having a GraphicsObject struct that has several config tuning methods, then performs the drawing when .draw() is invoked.

Most idiomatic way to create a default struct

To create a default struct, I used to see fn new() -> Self in Rust, but today, I discovered Default. So there are two ways to create a default struct:
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn new() -> Self {
        Point {
            x: 0,
            y: 0,
        }
    }
}

impl Default for Point {
    fn default() -> Self {
        Point {
            x: 0,
            y: 0,
        }
    }
}

fn main() {
    let _p1 = Point::new();
    let _p2: Point = Default::default();
}
What is the better / the most idiomatic way to do so?
If you had to pick one, implementing the Default trait is the better choice to allow your type to be used generically in more places while the new method is probably what a human trying to use your code directly would look for.
However, your question is a false dichotomy: you can do both, and I encourage you to do so! Of course, repeating yourself is silly, so I'd call one from the other (it doesn't really matter which way):
impl Point {
    fn new() -> Self {
        Default::default()
    }
}
Clippy even has a lint for this exact case!
I use Default::default() in structs that have member data structures where I might change out the implementation. For example, I might be currently using a HashMap but want to switch to a BTreeMap. Using Default::default gives me one less place to change.
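A small sketch of that point (an assumed example, not taken from the code above): because the field is created through Default, swapping the concrete map type only touches the field declaration.
use std::collections::HashMap;

struct Index {
    // could later become BTreeMap<String, u32> without touching new()
    entries: HashMap<String, u32>,
}

impl Index {
    fn new() -> Self {
        Index {
            entries: Default::default(), // no HashMap::new() to update
        }
    }
}

fn main() {
    let idx = Index::new();
    println!("{}", idx.entries.len());
}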
In this particular case, you can even derive Default, making it very succinct:
#[derive(Default)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn new() -> Self {
        Default::default()
    }
}

fn main() {
    let _p1 = Point::new();
    let _p2: Point = Default::default();
}

Same object with different API faces at compile time?

I have an object that can be in either of two modes: a source or a sink. It is always in one of them and it is always known at compile time (when passed the object you know if you are going to read or write to it obviously).
I can put all the methods on the same object and just assume I won't be called improperly (or return an error when I am), or I was thinking I could make two tuple structs around the single underlying object and attach the methods to those tuple structs instead. The methods are almost entirely disjoint.
It is kind of abusing the fact that both tuple structs have the same layout and there is zero overhead for the casts and tuple storage.
Think of this similar to the Java ByteBuffer and related classes where you write then flip then read then flip back and write more. Except this would catch errors in usage.
However, it does seem a little unusual and might be overly confusing for such a small problem. And it seems like there is a better way to do this -- only requirement is zero overhead so no dynamic dispatch.
https://play.rust-lang.org/?gist=280d2ec2548e4f38e305&version=stable
#[derive(Debug)]
struct Underlying {
    a: u32,
    b: u32,
}

#[derive(Debug)]
struct FaceA(Underlying);

impl FaceA {
    fn make() -> FaceA { FaceA(Underlying { a: 1, b: 2 }) }
    fn doa(&self) { println!("FaceA do A {:?}", *self); }
    fn dou(&self) { println!("FaceA do U {:?}", *self); }
    fn tob(&self) -> &FaceB { unsafe { std::mem::transmute::<&FaceA, &FaceB>(self) } }
}

#[derive(Debug)]
struct FaceB(Underlying);

impl FaceB {
    fn dob(&self) { println!("FaceB do B {:?}", *self); }
    fn dou(&self) { println!("FaceB do U {:?}", *self); }
    fn toa(&self) -> &FaceA { unsafe { std::mem::transmute::<&FaceB, &FaceA>(self) } }
}

fn main() {
    let a = FaceA::make();
    a.doa();
    a.dou();

    let b = a.tob();
    b.dob();
    b.dou();

    let aa = b.toa();
    aa.doa();
    aa.dou();
}
First of all, it seems like you don't understand how ownership works in Rust; you may want to read the Ownership chapter of the Rust Book. Specifically, the way you keep re-aliasing the original FaceA is how you would specifically enable the very thing you say you want to avoid. Also, all the borrows are immutable, so it's not clear how you intend to do any sort of mutation.
As such, I've written a new example from scratch that involves going between two types with disjoint interfaces (view on playpen).
#[derive(Debug)]
pub struct Inner {
    pub value: i32,
}

impl Inner {
    pub fn new(value: i32) -> Self {
        Inner {
            value: value,
        }
    }
}

#[derive(Debug)]
pub struct Upper(Inner);

impl Upper {
    pub fn new(inner: Inner) -> Self {
        Upper(inner)
    }

    pub fn into_downer(self) -> Downer {
        Downer::new(self.0)
    }

    pub fn up(&mut self) {
        self.0.value += 1;
    }
}

#[derive(Debug)]
pub struct Downer(Inner);

impl Downer {
    pub fn new(inner: Inner) -> Self {
        Downer(inner)
    }

    pub fn into_upper(self) -> Upper {
        Upper::new(self.0)
    }

    pub fn down(&mut self) {
        self.0.value -= 1;
    }
}

fn main() {
    let mut a = Upper::new(Inner::new(0));
    a.up();

    let mut b = a.into_downer();
    b.down();
    b.down();
    b.down();

    let mut c = b.into_upper();
    c.up();

    show_i32(c.0.value);
}

#[inline(never)]
fn show_i32(v: i32) {
    println!("v: {:?}", v);
}
Here, the into_upper and into_downer methods consume the subject value, preventing anyone from using it afterwards (try accessing a after the call to a.into_downer()).
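A quick illustration of that point (this snippet is mine, not from the original answer):
fn main() {
    let mut a = Upper::new(Inner::new(0));
    a.up();

    let _b = a.into_downer(); // `a` is moved here

    // a.up();
    // error[E0382]: borrow of moved value: `a`
}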
This should not be particularly inefficient; there is no heap allocation going on here, and Rust is pretty good at moving values around efficiently. If you're curious, this is what the main function compiles down to with optimisations enabled:
mov edi, -1
jmp _ZN8show_i3220h2a10d619fa41d919UdaE
It literally inlines the entire program (save for the show function that I specifically told it not to inline). Unless profiling shows this to be a serious performance problem, I wouldn't worry about it.
