Capture the entire contiguous matched input with nom - rust

I'm looking to apply a series of nom parsers and return the complete &str that matches. I want to match strings of the form a+bc+. Using the existing chain! macro I can get pretty close:
named!(aaabccc <&[u8], &str>,
map_res!(
chain!(
a: take_while!(is_a) ~
tag!("b") ~
take_while!(is_c) ,
|| {a}
),
from_utf8
));
where
fn is_a(l: u8) -> bool {
match l {
b'a' => true,
_ => false,
}
}
fn is_c(l: u8) -> bool {
match l {
b'c' => true,
_ => false,
}
}
Say we have 'aaabccc' as input. The above parser will match the input but only 'aaa' will be returned. What I would like to do is return 'aaabccc', the original input.
chain! is not the right macro for this, but there was not another that seemed more correct. What would the best way to do this be?
At the time of this writing I'm using nom 1.2.2 and rustc 1.9.0-nightly (a1e29daf1 2016-03-25).

It appears as if you want recognized!:
if the child parser was successful, return the consumed input as produced value
And an example:
#[macro_use]
extern crate nom;
use nom::IResult;
fn main() {
assert_eq!(aaabccc(b"aaabcccddd"), IResult::Done(&b"ddd"[..], "aaabccc"));
}
named!(aaabccc <&[u8], &str>,
map_res!(
recognize!(
chain!(
take_while!(is_a) ~
tag!("b") ~
take_while!(is_c),
|| {}
)
),
std::str::from_utf8
)
);
fn is_a(l: u8) -> bool {
match l {
b'a' => true,
_ => false,
}
}
fn is_c(l: u8) -> bool {
match l {
b'c' => true,
_ => false,
}
}
I'm not sure if chain! is the best way of combining sequential parsers if you don't care for the values, but it works in this case.

Related

How to more easily write !bool in const generics when calling functions?

I have code similar to the following:
fn f1<const B: bool>(x:u64) { }
fn f2<const B: bool>(x:u64) {
match B {
true => f1::<false>(x),
false => f1::<true>(x),
}
}
That is, I need to negate a const bool in function calls. Rust currently doesn't support something as plain as f1::<!B>() unfortunately. Is there a better way for me to do this other than to litter match statements everywhere?
If this is just one function, I would create a function:
fn negate_f1<const B: bool>(x: u64) {
match B {
true => f1::<false>(x),
false => f1::<true>(x),
}
}
fn f2<const B: bool>(x: u64) {
negate_f1::<B>(x);
}
If more, you can create a macro. The following macro works only for bare identifiers (foo(), not foo::bar()) and one generic parameter, extending it is difficult but possible:
macro_rules! negate {
( $fn:ident :: < ! $param:ident > ( $($arg:expr),* $(,)? ) ) => {
match $param {
true => $fn::<false>( $($arg),* ),
false => $fn::<true>( $($arg),* ),
}
}
}
fn f2<const B: bool>(x: u64) {
negate!(f1::<!B>(x));
}
On nightly, this can be achieved by enabling the generic_const_exprs feature and wrapping the constant expression in curly braces: f1::<{!B}>. However due to this bug this requires extra bounds on f2:
#![feature(generic_const_exprs)]
fn f1<const B: bool>(_x:u64) { }
pub fn f2<const B: bool>(x:u64) where [(); !B as usize]: {
f1::<{!B}>(x);
}
Playground

How to access nested enums without full match syntax

I am using the Serde crate to deserialise a JSON file, which has a nested structure like this:
struct Nested {
a: Vec<Foo>,
b: u8,
}
struct Foo {
c: Bar,
d: Vec<f32>,
}
Struct Bar {
e: u32,
f: String,
}
Part of the applications purpose is to check for missing parameters (or incorrect types in parameters), and then display a nicely printed list of errors found in the file, so I need to handle the structure missing parameters or wrongly typed.
I came across this great post that helped solved my issue, by wrapping each parameter in an enum result that contains the value if it passed, the value if it failed, or a final enum if it was missing (since the nested structures might also be missing I wrapped them in the same enum):
pub enum TryParse<T> {
Parsed(T),
Unparsed(Value),
NotPresent
}
struct Nested {
a: TryParse<Vec<Foo>>,
b: TryParse<u8>,
}
struct Foo {
c: TryParse<Bar>,
d: TryParse<Vec<f32>>,
}
Struct Bar {
e: TryParse<u32>,
f: TryParse<String>,
}
However, I'm not sure how to access them now without unpacking every step into a match statement. For example, I can access B very easily:
match file.b {
Parsed(val) => {println!("Got parameter of {}", val)},
Unparsed(val) => {println!("Invalid type: {:?}", val)}
NotPresent => {println!("b not found")},
};
However, I'm not sure how to access the nested ones (C D E and F). I can't use Unwrap or expect since this isn't technically a 'Result', so how do I unpack these?:
if file.a.c.e.Parsed() && file.a.c.e == 32 {... //invalid
if file.a.d && file.a.d.len() == 6... //invalid
I know in a way this flies against rust's 'handle every outcome' philosophy, and I want to handle them, but I want to know if there is a nicer way than 400 nested match statements (the above example is very simplified, the files I am using have up to 6 nested layers, each parameter in the top node has at least 3 layers, some are vectors as well)…
Perhaps I need to implement a function similar to unwrap() on my 'TryParse'? or would it be better to wrap each parameter in a 'Result', extend that with the deserialise trait, and then somehow store the error in the Err option that says if it was a type error or missing parameter?
EDIT
I tried adding the following, some of which works and some of which does not:
impl <T> TryParse<T> {
pub fn is_ok(self) -> bool { //works
match self {
Self::Parsed(_t) => true,
_ => false,
}
}
pub fn is_absent(self) -> bool { //works
match self {
Self::NotPresent => true,
_ => false,
}
}
pub fn is_invalid(self) -> bool { //works
match self {
Self::Unparsed(_) => true,
_ => false,
}
}
pub fn get(self) -> Result<T, dyn Error> { //doesnt work
match self {
Self::Parsed(t) => Ok(t),
Self::Unparsed(e) => Err(e),
Self::NotPresent => Err("Invalid")
}
}
}
I can't believe it is this hard just to get the result, should I just avoid nested enums or get rid of the TryParse enums/ functions all together and wrap everything in a result, so the user simply knows if it worked or didn't work (but no explanation why it failed)
Implementing unwrap() is one possibility. Using Result is another, with a custom error type. You can deserialize directly into result with #[serde(deserialize_with = "...")], or using a newtype wrapper.
However, a not-enough-used power of pattern matching is nested patterns. For example, instead of if file.a.c.e.Parsed() && file.a.c.e == 32 you can write:
if let TryParse::Parsed(a) = &file.a {
// Unfortunately we cannot combine this `if let` with the surrounding `if let`,
// because `Vec` doesn't support pattern matching (currently).
if let TryParsed::Parsed(
[Foo {
c:
TryParse::Parsed(Bar {
e: TryParse::Parsed(32),
..
}),
..
}, ..],
) = a.as_slice()
{
// ...
}
}
May not be the most Rust-y way of doing it, but for those like me moving from another language like C/Python/C++, this is the way I have done it that still allows me to quickly validate if I have an error and use the match syntax to handle it. Thanks to #Chayim Friedman for assisting with this, his way is probably better but this made the most sense for me:
#[derive(Debug)]
pub enum TryParse<T> {
Parsed(T),
Unparsed(Value),
NotPresent
}
impl<'de, T: DeserializeOwned> Deserialize<'de> for TryParse<T> {
fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
match Option::<Value>::deserialize(deserializer)? {
None => Ok(TryParse::NotPresent),
Some(value) => match T::deserialize(&value) {
Ok(t) => Ok(TryParse::Parsed(t)),
Err(_) => Ok(TryParse::Unparsed(value)),
},
}
}
}
impl <T> TryParse<T> {
//pub fn is_ok(self) -> bool { ---> Use get().is_ok(), built into result
// match self {
// Self::Parsed(_t) => true,
// _ => false,
// }
//}
pub fn is_absent(self) -> bool {
match self {
Self::NotPresent => true,
_ => false,
}
}
pub fn is_invalid(self) -> bool {
match self {
Self::Unparsed(_) => true,
_ => false,
}
}
pub fn get(&self) -> Result<&T, String> {
match self {
Self::Parsed(t) => Ok(t),
Self::Unparsed(v) => Err(format!("Unable to Parse {:?}", v)),
Self::NotPresent => Err("Parameter not Present".to_string())
}
}
// pub fn get_direct(&self) -> &T {
// match self {
// Self::Parsed(t) => t,
// _ => panic!("Can't get this value!"),
// }
// }
}
match &nested.a.get().unwrap()[1].c.get.expect("Missing C Parameter").e{
Parsed(val) => {println!("Got value of E: {}", val)},
Unparsed(val) => {println!("Invalid Type: {:?}", val)}
NotPresent => {println!("Param E Not Found")},
};
//Note the use of '&' at the beginning because we need to borrow a reference to it
I know I need to change my mindset to use the rust way of thinking, and I am completely open to other suggestions if they can demonstrate some working code.

How can I get the T from an Option<T> when using syn?

I'm using syn to parse Rust code. When I read a named field's type using field.ty, I get a syn::Type. When I print it using quote!{#ty}.to_string() I get "Option<String>".
How can I get just "String"? I want to use #ty in quote! to print "String" instead of "Option<String>".
I want to generate code like:
impl Foo {
pub set_bar(&mut self, v: String) {
self.bar = Some(v);
}
}
starting from
struct Foo {
bar: Option<String>
}
My attempt:
let ast: DeriveInput = parse_macro_input!(input as DeriveInput);
let data: Data = ast.data;
match data {
Data::Struct(ref data) => match data.fields {
Fields::Named(ref fields) => {
fields.named.iter().for_each(|field| {
let name = &field.ident.clone().unwrap();
let ty = &field.ty;
quote!{
impl Foo {
pub set_bar(&mut self, v: #ty) {
self.bar = Some(v);
}
}
};
});
}
_ => {}
},
_ => panic!("You can derive it only from struct"),
}
My updated version of the response from #Boiethios, tested and used in a public crate, with support of several syntaxes for Option:
Option
std::option::Option
::std::option::Option
core::option::Option
::core::option::Option
fn extract_type_from_option(ty: &syn::Type) -> Option<&syn::Type> {
use syn::{GenericArgument, Path, PathArguments, PathSegment};
fn extract_type_path(ty: &syn::Type) -> Option<&Path> {
match *ty {
syn::Type::Path(ref typepath) if typepath.qself.is_none() => Some(&typepath.path),
_ => None,
}
}
// TODO store (with lazy static) the vec of string
// TODO maybe optimization, reverse the order of segments
fn extract_option_segment(path: &Path) -> Option<&PathSegment> {
let idents_of_path = path
.segments
.iter()
.into_iter()
.fold(String::new(), |mut acc, v| {
acc.push_str(&v.ident.to_string());
acc.push('|');
acc
});
vec!["Option|", "std|option|Option|", "core|option|Option|"]
.into_iter()
.find(|s| &idents_of_path == *s)
.and_then(|_| path.segments.last())
}
extract_type_path(ty)
.and_then(|path| extract_option_segment(path))
.and_then(|path_seg| {
let type_params = &path_seg.arguments;
// It should have only on angle-bracketed param ("<String>"):
match *type_params {
PathArguments::AngleBracketed(ref params) => params.args.first(),
_ => None,
}
})
.and_then(|generic_arg| match *generic_arg {
GenericArgument::Type(ref ty) => Some(ty),
_ => None,
})
}
You should do something like this untested example:
use syn::{GenericArgument, PathArguments, Type};
fn extract_type_from_option(ty: &Type) -> Type {
fn path_is_option(path: &Path) -> bool {
leading_colon.is_none()
&& path.segments.len() == 1
&& path.segments.iter().next().unwrap().ident == "Option"
}
match ty {
Type::Path(typepath) if typepath.qself.is_none() && path_is_option(typepath.path) => {
// Get the first segment of the path (there is only one, in fact: "Option"):
let type_params = typepath.path.segments.iter().first().unwrap().arguments;
// It should have only on angle-bracketed param ("<String>"):
let generic_arg = match type_params {
PathArguments::AngleBracketed(params) => params.args.iter().first().unwrap(),
_ => panic!("TODO: error handling"),
};
// This argument must be a type:
match generic_arg {
GenericArgument::Type(ty) => ty,
_ => panic!("TODO: error handling"),
}
}
_ => panic!("TODO: error handling"),
}
}
There's not many things to explain, it just "unrolls" the diverse components of a type:
Type -> TypePath -> Path -> PathSegment -> PathArguments -> AngleBracketedGenericArguments -> GenericArgument -> Type.
If there is an easier way to do that, I would be happy to know it.
Note that since syn is a parser, it works with tokens. You cannot know for sure that this is an Option. The user could, for example, type std::option::Option, or write type MaybeString = std::option::Option<String>;. You cannot handle those arbitrary names.

How do I assert an enum is a specific variant if I don't care about its fields?

I'd like to check enums with fields in tests while ignoring the actual value of the fields for now.
Consider the following example:
enum MyEnum {
WithoutFields,
WithFields { field: String },
}
fn return_with_fields() -> MyEnum {
MyEnum::WithFields {
field: "some string".into(),
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn example() {
assert_eq!(return_with_fields(), MyEnum::WithFields {..});
}
}
playground
I'd like to use assert_eq! here, but the compiler tells me:
error: expected expression, found `}`
--> src/lib.rs:18:64
|
18 | assert_eq!(return_with_fields(), MyEnum::WithFields {..});
| ^ expected expression
This is similar to Why do I get an error when pattern matching a struct-like enum variant with fields?, but the solution does not apply in my case.
Of course, I can use match and do it myself, but being able to use assert_eq! would be less work.
Rust 1.42
You can use std::matches:
assert!(matches!(return_with_fields(), MyEnum::WithFields { .. }));
Previous versions
Your original code can be made to work with a new macro:
macro_rules! is_enum_variant {
($v:expr, $p:pat) => (
if let $p = $v { true } else { false }
);
}
#[test]
fn example() {
assert!(is_enum_variant!(return_with_fields(), MyEnum::WithoutFields {..}));
}
Personally, I tend to add methods to my enums:
fn is_with_fields(&self) -> bool {
match self {
MyEnum::WithFields { .. } => true,
_ => false,
}
}
I also tend to avoid struct-like enums and instead put in extra work:
enum MyEnum {
WithoutFields,
WithFields(WithFields),
}
struct WithFields { field: String }
impl MyEnum {
fn is_with_fields(&self) -> bool {
match self {
MyEnum::WithFields(_) => true,
_ => false,
}
}
fn as_with_fields(&self) -> Option<&WithFields> {
match self {
MyEnum::WithFields(x) => Some(x),
_ => None,
}
}
fn into_with_fields(self) -> Option<WithFields> {
match self {
MyEnum::WithFields(x) => Some(x),
_ => None,
}
}
}
I hope that some day, enum variants can be made into their own type to avoid this extra struct.
If you are using Rust 1.42 and later, see Shepmaster's answer below.
A simple solution here would be to do the opposite assertion:
assert!(return_with_fields() != MyEnum::WithoutFields);
or even more simply:
assert_ne!(return_with_fields(), MyEnum::WithoutFields);
Of course if you have more members in your enum, you'll have to add more asserts to cover all possible cases.
Alternatively, and this what OP probably had in mind, since assert! just panics in case of failure, the test can use pattern matching and call panic! directly in case something is wrong:
match return_with_fields() {
MyEnum::WithFields {..} => {},
MyEnum::WithoutFields => panic!("expected WithFields, got WithoutFields"),
}
I'd use a macro like #Shepmaster proposed, but with more error reporting (like the existing assert! and assert_eq! macros:
macro_rules! assert_variant {
($value:expr, $pattern:pat) => ({
let value = &$value;
if let $pattern = value {} else {
panic!(r#"assertion failed (value doesn't match pattern):
value: `{:?}`,
pattern: `{}`"#, value, stringify!($pattern))
}
})
// TODO: Additional patterns for trailing args, like assert and assert_eq
}
Rust playground demonstrating this example

Can you put another match clause in a match arm?

Can you put another match clause in one of the match results of a match like this in:
pub fn is_it_file(input_file: &str) -> String {
let path3 = Path::new(input_file);
match path3.is_file() {
true => "File!".to_string(),
false => match path3.is_dir() {
true => "Dir!".to_string(),
_ => "Don't care",
}
}
}
If not why ?
Yes you can (see Qantas' answer). But Rust often has prettier ways to do what you want. You can do multiple matches at once by using tuples.
pub fn is_it_file(input_file: &str) -> String {
let path3 = Path::new(input_file);
match (path3.is_file(), path3.is_dir()) {
(true, false) => "File!",
(false, true) => "Dir!",
_ => "Neither or Both... bug?",
}.to_string()
}
Sure you can, match is an expression:
fn main() {
fn foo() -> i8 {
let a = true;
let b = false;
match a {
true => match b {
true => 1,
false => 2
},
false => 3
}
}
println!("{}", foo()); // 2
}
You can view the results of this on the Rust playpen.
The only thing that seems off about your code to me is the inconsistent usage of .to_string() in your code, the last match case doesn't have that.

Resources