Why is a match pattern with a guard clause not exhaustive? - rust

Consider the following code sample (playground).
#[derive(PartialEq, Clone, Debug)]
enum State {
Initial,
One,
Two,
}
enum Event {
ButtonOne,
ButtonTwo,
}
struct StateMachine {
state: State,
}
impl StateMachine {
fn new() -> StateMachine {
StateMachine {
state: State::Initial,
}
}
fn advance_for_event(&mut self, event: Event) {
// grab a local copy of the current state
let start_state = self.state.clone();
// determine the next state required
let end_state = match (start_state, event) {
// starting with initial
(State::Initial, Event::ButtonOne) => State::One,
(State::Initial, Event::ButtonTwo) => State::Two,
// starting with one
(State::One, Event::ButtonOne) => State::Initial,
(State::One, Event::ButtonTwo) => State::Two,
// starting with two
(State::Two, Event::ButtonOne) => State::One,
(State::Two, Event::ButtonTwo) => State::Initial,
};
self.transition(end_state);
}
fn transition(&mut self, end_state: State) {
// update the state machine
let start_state = self.state.clone();
self.state = end_state.clone();
// handle actions on entry (or exit) of states
match (start_state, end_state) {
// transitions out of initial state
(State::Initial, State::One) => {}
(State::Initial, State::Two) => {}
// transitions out of one state
(State::One, State::Initial) => {}
(State::One, State::Two) => {}
// transitions out of two state
(State::Two, State::Initial) => {}
(State::Two, State::One) => {}
// identity states (no transition)
(ref x, ref y) if x == y => {}
// ^^^ above branch doesn't match, so this is required
// _ => {},
}
}
}
fn main() {
let mut sm = StateMachine::new();
sm.advance_for_event(Event::ButtonOne);
assert_eq!(sm.state, State::One);
sm.advance_for_event(Event::ButtonOne);
assert_eq!(sm.state, State::Initial);
sm.advance_for_event(Event::ButtonTwo);
assert_eq!(sm.state, State::Two);
sm.advance_for_event(Event::ButtonTwo);
assert_eq!(sm.state, State::Initial);
}
In the StateMachine::transition method, the code as presented does not compile:
error[E0004]: non-exhaustive patterns: `(Initial, Initial)` not covered
--> src/main.rs:52:15
|
52 | match (start_state, end_state) {
| ^^^^^^^^^^^^^^^^^^^^^^^^ pattern `(Initial, Initial)` not covered
But that is exactly the pattern I'm trying to match! Along with the (One, One) edge and (Two, Two) edge. Importantly I specifically want this case because I want to leverage the compiler to ensure that every possible state transition is handled (esp. when new states are added later) and I know that identity transitions will always be a no-op.
I can resolve the compiler error by uncommenting the line below that one (_ => {}) but then I lose the advantage of having the compiler check for valid transitions because this will match on any states added in the future.
I could also resolve this by manually typing each identity transition such as:
(State::Initial, State::Initial) => {}
That is tedious, and at that point I'm just fighting the compiler. This could probably be turned into a macro, so I could possibly do something like:
identity_is_no_op!(State);
Or worst case:
identity_is_no_op!(State::Initial, State::One, State::Two);
The macro can automatically write this boilerplate any time a new state is added, but that feels like unnecessary work when the pattern I have written should be covering the exact case that I'm looking for.
Why doesn't this work as written?
What is the cleanest way to do what I am trying to do?
I have decided that a macro of the second form (i.e. identity_is_no_op!(State::Initial, State::One, State::Two);) is actually the preferred solution.
It's easy to imagine a future in which I do want some of the states to do something in the 'no transition' case. Using this macro would still have the desired effect of forcing the state machine to be revisited when new States are added and would just require adding the new state to the macro arglist if nothing needs to be done. A reasonable compromise IMO.
I think the question is still useful because the behavior was surprising to me as a relatively new Rustacean.

Why doesn't this work as written?
Because the Rust compiler cannot take guard expressions into account when determining if a match is exhaustive. As soon as you have a guard, it assumes that guard could fail.
Note that this has nothing to do with the distinction between refutable and irrefutable patterns. if guards are not part of the pattern, they're part of the match syntax.
What is the cleanest way to do what I am trying to do?
List every possible combination. A macro could slightly shorten writing the patterns, but you can't use macros to replace entire match arms.

Related

Reduce nested match statements with unwrap_or_else in Rust

I have a rust program that has multiple nested match statements as shown below.
match client.get(url).send() {
Ok(mut res) => {
match res.read_to_string(&mut s) {
Ok(m) => {
match get_auth(m) {
Ok(k) => k,
Err(_) => return Err(“a”);
}
},
Err(_) => {
return Err(“b”);
}
}
},
Err(_) => {
return Err(“c”);
},
};
All the variables k and m are of type String.I am looking for a way to make the code more readable by removing excessive nested match statements keeping the error handling intact since both the output and the error types are important for the problem.Is it possible to achieve this by unwrap_or_else?
The .map_err() utility converts a Result to have a new error type, leaving the success type alone. It accepts a closure that consumes the existing error value and returns the new one.
The ? operator will early-return the error in the Err case, and unwrap in the Ok case.
Combining these two allows you to express this same flow succinctly:
get_auth(
client.get(url).send().map_err(|_| "c")?
.read_to_string(&mut s).map_err(|_| "b")?
).map_err(|_| "a")?
(I suspect that you actually want to pass s to get_auth() but that's not what the code in your question does, so I'm choosing to represent the code you posted instead of imaginary code that I'm guessing about.)

How to use mail filter context data?

I am trying to write a mail filter in Rust using the milter crate. I built the example on a Linux VM and it all works fine. However, the example is using u32 as the type of context injected into their handlers, a quite simple example. I instead need to store a string from the handle_header callback through to the handle_eom handler so I can use an incoming header to set the envelope from.
If I log the value of the header in handle_header to console, it writes correctly but by the time it arrives in handle_eom, it has been corrupted/overwritten whatever. I thought that context was supposed to be specifically for this scenario but it seems weird that it uses type inference rather than e.g. a pointer to an object that you can just assign whatever you want to it.
Is my understanding of context wrong or is the code incorrect?
I tried using value and &value in handle_header and it behaves the same way.
use milter::*;
fn main() {
Milter::new("inet:3000#localhost")
.name("BounceRewriteFilter")
.on_header(header_callback)
.on_eom(eom_callback)
.on_abort(abort_callback)
.actions(Actions::ADD_HEADER | Actions::REPLACE_SENDER)
.run()
.expect("milter execution failed");
}
#[on_header(header_callback)]
fn handle_header<'a>(mut context: Context<&'a str>, header: &str, value: &'a str) -> milter::Result<Status> {
if header == "Set-Return-Path" {
match context.data.borrow_mut() {
Some(retpath) => *retpath = &value,
None => {
context.data.replace(value)?;
}
}
}
Ok(Status::Continue)
}
#[on_eom(eom_callback)]
fn handle_eom(mut context: Context<&str>) -> milter::Result<Status> {
match context.data.take() {
Ok(result) => {
println!("Set-return-path header is {}", result.unwrap());
context.api.replace_sender(result.unwrap(), None::<&str>)?;
}
Err(_error) => {}
}
Ok(Status::Continue)
}
Thanks to glts on Github, the author of the crate, the problem was that the string slices passed into the handle_header method were not borrowed by the external code that stores the data pointer so by the time that handle_eom is called, the memory has been reused for something else.
All I had to do was change Context<&str> to Context<String> and convert the strings using mystr.to_owned() and in the reverse direction val = &*mystring

How to match an argument with dots in Rust macros?

I am writing a program and it contains a lot of matchblocks as I keep calling methods and functions that return Result struct type results.
So I was thinking maybe a macro will reduce the amount of code.
And the final macro is like this:
#[macro_export]
macro_rules! ok_or_return {
//when calls on methods
($self: ident, $method: ident($($args: tt)*), $Error: ident::$err: ident) => {{
match $self.$method($($args)*) {
Ok(v) => v,
Err(e) => {
dbg!(e);
return Err($Error::$err);
}
}
}};
}
As yo can see, I use $($args: tt)* to match multiple arguments, and it goes pretty well. Even when I use struct.method() as a form of argument, it compiled.
Like:
ok_or_return!(self, meth(node.get_num()), Error::GetNumError);
However, if I use the same form to match a normal macro argument, it failed. I changed the specifier to tt, and it didn't work out. Like:
ok_or_return!(self.people, meth(node.get_num()), Error::GetNumError);
So my problem is why node.get_num() can be matched and self.people can't?

Writing Rust-y code: Keeping references to different structs depending on type of object in XML

I'm having a hard time formulating this in a rust-y manner, since my brain is still hardwired in Python. So I have a XML file:
<xml>
<car>
<name>First car</name>
<brand>Volvo</brand>
</car>
<plane>
<name>First plane</name>
<brand>Boeing</brand>
</plane>
<car>
<name>Second car</name>
<brand>Volvo</brand>
</car>
</xml>
In reality it's much more complex and the XML is about 500-1000MB large. I'm reading this using quick-xml which gives me events such as Start (tag start), Text and End (tag end) and I'm doing a state machine to keep track.
Now I want to off-load the parsing of car and plane to different modules (they need to be handled differently) but share a base-implementation/trait.
So far so good.
Now using my state machine I know when I need to offload to the car or the plane:
When I enter the main car tag I want to create a new instance of car
After that, offload everything until the corresponding </car> to it
When we reach the end I'm going to call .save() on the car implementation to store it elsewhere, and can free/destroy the instance.
But this means in my main loop I need to create a new instance of the car and keep track of it (and the same for plane if that's the main element.
let mut current_xml_section: I_DONT_KNOW_THE_TYPE = Some()
loop {
match reader.read_event(&mut buf) {
Ok(Event::Start(ref e)) => {
if state == State::Unknown {
match e.name() {
b"car" => {
state = State::InSection;
current_section = CurrentSection::Car;
state_at_depth = depth;
current_xml_section = CurrentSection::Car::new(e); // this won't work
},
b"plane" => {
state = State::InSection;
current_section = CurrentSection::Plane;
state_at_depth = depth;
current_xml_section = CurrentSection::Plane::new(e); // this won't work
},
_ => (),
};
}else{
current_xml_section.start_tag(e); // this won't work
}
depth += 1;
},
Ok(Event::End(ref e)) => {
depth -= 1;
if state == State::InSection && state_at_depth == depth {
state = State::Unknown;
current_section = CurrentSection::Unknown;
state_at_depth = 0;
current_xml_section.save(); // this won't work
// Free current_xml_section here
}else{
if state == State::InSection {
current_xml_section.end_tag(e) // this won't work
}
}
},
// unescape and decode the text event using the reader encoding
Ok(Event::Text(e)) => (
if state == State::InSection {
current_xml_section.text_data(e) // this won't work
}
),
Ok(Event::Eof) => break, // exits the loop when reaching end of file
Err(e) => panic!("Error at position {}: {:?}", reader.buffer_position(), e),
_ => (), // There are several other `Event`s we do not consider here
}
// if we don't keep a borrow elsewhere, we can clear the buffer to keep memory usage low
buf.clear();
}
}
So I basically don't know how to keep a reference in the main loop to the "current" object (I'm sorry, Python term), given that:
We may or may not have a current tag we're processing
That section might be a reference to either Car or Plane
I've also considered:
Use Serde, but it's a massive document and frankly I don't know the entire structure of it (I'm black box decoding it) so it would need to be passed to Serde in chunks (and I didn't manage to do that, even though I tried)
Keeping a reference to the latest plane, the latest car (and start by creating blank objects outside of the main loop) but it feels ugly
Using Generics
Any nudge in the right direction would be welcome as I try to un-Python my brain!
Event driven parsing of XML lends itself particularly well to a scope driven approach, where each level is parsed by a different function.
For example, your main loop could look like this:
loop {
match reader.read_event(&mut buf) {
Ok(Event::Start(ref e)) => {
match e.name() {
b"car" => handle_car(&mut reader, &mut buf)?,
b"plane" => handle_plane(&mut reader, &mut buf)?,
_ => return Err("Unexpected Tag"),
}
},
Ok(Event::Eof) => break,
_ => (),
}
}
Note that the inner match statement only has to consider the XML tags that can occur at the top level; any other tag is unexpected and should generate an error.
handle_car would look something like this:
fn handle_car(reader: &mut Reader<&[u8]>, buf:&mut Vec<u8>) -> Result<(),ErrType> {
let mut car = Car::new();
loop {
match reader.read_event(buf) {
Ok(Event::Start(ref e)) => {
match e.name() {
b"name" => {
car.name = handle_name(reader, buf)?;
},
b"brand" => {
car.brand = handle_brand(reader, buf)?;
},
_ => return Err("bad tag"),
}
},
Ok(Event::End(ref e)) => break,
Ok(Event::Eof) => return Err("Unexpected EOF"),
_ => (),
}
}
car.save();
Ok(())
}
handle_car creates its own instance of Car, which lives within the scope of that function. It has its own loop where it handles all the tags that can occur within it. If those tags contain yet more tags, you just introduce a new set of handling functions for them. The function returns a Result so that if the input structure does not match expectations the error can be passed up (as can any errors produced by quick_xml, which I have ignored but real code would handle).
This pattern has some advantages when parsing XML:
The structure of the code matches the expected structure of the XML, making it easier to read and understand.
The state is implicit in the structure of the code. No need for state variables or depth counters.
Common tags, that appear in multiple places (such as <name> and <brand> can be handled by common functions that are re-used
If the XML format you are parsing has nested structures (eg. if <car> could contain another <car>) this is handled by recursion.
Your original problem of not knowing how to store the Car / Plane within the main loop is completely avoided.

Idiomatic rust way to properly parse Clap ArgMatches

I'm learning rust and trying to make a find like utility (yes another one), im using clap and trying to support command line and config file for the program's parameters(this has nothing to do with the clap yml file).
Im trying to parse the commands and if no commands were passed to the app, i will try to load them from a config file.
Now I don't know how to do this in an idiomatic way.
fn main() {
let matches = App::new("findx")
.version(crate_version!())
.author(crate_authors!())
.about("find + directory operations utility")
.arg(
Arg::with_name("paths")
...
)
.arg(
Arg::with_name("patterns")
...
)
.arg(
Arg::with_name("operation")
...
)
.get_matches();
let paths;
let patterns;
let operation;
if matches.is_present("patterns") && matches.is_present("operation") {
patterns = matches.values_of("patterns").unwrap().collect();
paths = matches.values_of("paths").unwrap_or(clap::Values<&str>{"./"}).collect(); // this doesn't work
operation = match matches.value_of("operation").unwrap() { // I dont like this
"Append" => Operation::Append,
"Prepend" => Operation::Prepend,
"Rename" => Operation::Rename,
_ => {
print!("Operation unsupported");
process::exit(1);
}
};
}else if Path::new("findx.yml").is_file(){
//TODO: try load from config file
}else{
eprintln!("Command line parameters or findx.yml file must be provided");
process::exit(1);
}
if let Err(e) = findx::run(Config {
paths: paths,
patterns: patterns,
operation: operation,
}) {
eprintln!("Application error: {}", e);
process::exit(1);
}
}
There is an idiomatic way to extract Option and Result types values to the same scope, i mean all examples that i have read, uses match or if let Some(x) to consume the x value inside the scope of the pattern matching, but I need to assign the value to a variable.
Can someone help me with this, or point me to the right direction?
Best Regards
Personally I see nothing wrong with using the match statements and folding it or placing it in another function. But if you want to remove it there are many options.
There is the ability to use the .default_value_if() method which is impl for clap::Arg and have a different default value depending on which match arm is matched.
From the clap documentation
//sets value of arg "other" to "default" if value of "--opt" is "special"
let m = App::new("prog")
.arg(Arg::with_name("opt")
.takes_value(true)
.long("opt"))
.arg(Arg::with_name("other")
.long("other")
.default_value_if("opt", Some("special"), "default"))
.get_matches_from(vec![
"prog", "--opt", "special"
]);
assert_eq!(m.value_of("other"), Some("default"));
In addition you can add a validator to your operation OR convert your valid operation values into flags.
Here's an example converting your match arm values into individual flags (smaller example for clarity).
extern crate clap;
use clap::{Arg,App};
fn command_line_interface<'a>() -> clap::ArgMatches<'a> {
//Sets the command line interface of the program.
App::new("something")
.version("0.1")
.arg(Arg::with_name("rename")
.help("renames something")
.short("r")
.long("rename"))
.arg(Arg::with_name("prepend")
.help("prepends something")
.short("p")
.long("prepend"))
.arg(Arg::with_name("append")
.help("appends something")
.short("a")
.long("append"))
.get_matches()
}
#[derive(Debug)]
enum Operation {
Rename,
Append,
Prepend,
}
fn main() {
let matches = command_line_interface();
let operation = if matches.is_present("rename") {
Operation::Rename
} else if matches.is_present("prepend"){
Operation::Prepend
} else {
//DEFAULT
Operation::Append
};
println!("Value of operation is {:?}",operation);
}
I hope this helps!
EDIT:
You can also use Subcommands with your specific operations. It all depends on what you want to interface to be like.
let app_m = App::new("git")
.subcommand(SubCommand::with_name("clone"))
.subcommand(SubCommand::with_name("push"))
.subcommand(SubCommand::with_name("commit"))
.get_matches();
match app_m.subcommand() {
("clone", Some(sub_m)) => {}, // clone was used
("push", Some(sub_m)) => {}, // push was used
("commit", Some(sub_m)) => {}, // commit was used
_ => {}, // Either no subcommand or one not tested for...
}

Resources