How to propagate Nom fail context out of many0?

I have a Nom parser that parses strings but fails on keywords.
The parser below correctly fails when given a keyword.
But the error message that says it failed because it encountered a keyword does not propagate.
The problem is that while label fails, many0(label) still succeeds, just with one fewer result. eof then fails because part of the input is left unparsed. The parse fails as it should, but the original error message is lost.
How do I propagate the local error message?
(See below code on playground.)
use nom::bytes::complete::{take_while, take_while1};
use nom::error::{VerboseError};
use nom::multi::{many0};
use nom::{IResult};
use nom::error::{context};
use nom::combinator::{eof, fail};
type ParseResult<'input, Out> = IResult<&'input str, Out, VerboseError<&'input str>>;
fn label(s1: &str) -> ParseResult<&str> {
let (s2, a) = take_while1(|c: char| c.is_alphabetic())(s1)?;
let (s3, _) = take_while(|c: char| c.is_whitespace())(s2)?;
if a == "derp" {
return context("this message is lost", fail)(s1);
}
Ok((s3, a))
}
fn parse(s: &str) -> ParseResult<Vec<&str>> {
let (s, labels) = many0(label)(s)?;
let (s, _) = eof(s)?;
Ok((s, labels))
}
fn main() {
println!("{:?}", parse("foo bar herp flerp"));
// Ok(("", ["foo", "bar", "herp", "flerp"]))
// This fails with Nom(Eof), both Nom(Fail) and context is lost
println!("{:?}", parse("foo bar herp derp"));
// Err(Error(VerboseError { errors: [("derp", Nom(Eof))] }))
}

You can use cut to prevent many0 from exploring alternatives (consuming less) in the failure case:
use nom::combinator::cut;
return cut(context("this message is lost", fail))(s1);
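To see why cut changes the outcome, here is a minimal std-only sketch of the mechanism. It does not use nom: ParseErr, label, many0_labels and the use_cut flag are hypothetical stand-ins. Recoverable plays the role of nom's Err::Error, which many0 treats as "stop repeating", and Fatal plays the role of Err::Failure, which is what cut turns an Err::Error into and which many0 passes through unchanged.

```rust
// Hypothetical stand-ins for nom's Err::Error and Err::Failure; not nom's real types.
#[derive(Debug, PartialEq)]
enum ParseErr {
    Recoverable(&'static str),
    Fatal(&'static str),
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

// Parses one alphabetic word plus trailing whitespace and rejects the keyword "derp".
// `use_cut` decides whether the keyword error is recoverable or fatal.
fn label(input: &str, use_cut: bool) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_alphabetic())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(ParseErr::Recoverable("expected a label"));
    }
    let (word, rest) = input.split_at(end);
    if word == "derp" {
        let msg = "keyword is not a label";
        return Err(if use_cut {
            ParseErr::Fatal(msg)
        } else {
            ParseErr::Recoverable(msg)
        });
    }
    Ok((rest.trim_start(), word))
}

// Like many0: a recoverable error ends the repetition, a fatal one is propagated.
fn many0_labels(mut input: &str, use_cut: bool) -> PResult<'_, Vec<&str>> {
    let mut out = Vec::new();
    loop {
        match label(input, use_cut) {
            Ok((rest, word)) => {
                out.push(word);
                input = rest;
            }
            Err(ParseErr::Recoverable(_)) => return Ok((input, out)),
            Err(fatal) => return Err(fatal),
        }
    }
}

// Repetition followed by an end-of-input check, as in the question's parse().
fn parse(input: &str, use_cut: bool) -> Result<Vec<&str>, ParseErr> {
    let (rest, labels) = many0_labels(input, use_cut)?;
    if rest.is_empty() {
        Ok(labels)
    } else {
        Err(ParseErr::Recoverable("expected end of input"))
    }
}
```

With use_cut set to false, parse("foo bar herp derp", false) reports only the end-of-input error, which matches the behaviour in the question. With use_cut set to true, the keyword message survives because the repetition loop never gets the chance to swallow it.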

Related

Why can't Rust find method for enum generated using proc_macro_attribute?

I am trying to write procedural macros that will accept a Rust enum like
#[repr(u8)]
enum Ty {
A,
B
}
and generate a method for the enum that will let me convert an u8 into an allowed variant like this
fn from_byte(byte: u8) -> Ty {
match byte {
0 => Ty::A,
1 => Ty::B,
_ => unreachable!()
}
}
This is what I have implemented using proc_macro lib. (no external lib)
#![feature(proc_macro_diagnostic)]
#![feature(proc_macro_quote)]
extern crate proc_macro;
use proc_macro::{TokenStream, Diagnostic, Level, TokenTree, Ident, Group, Literal};
use proc_macro::quote;
fn report_error(tt: TokenTree, msg: &str) {
Diagnostic::spanned(tt.span(), Level::Error, msg).emit();
}
fn variants_from_group(group: Group) -> Vec<Ident> {
let mut iter = group.stream().into_iter();
let mut res = vec![];
while let Some(TokenTree::Ident(id)) = iter.next() {
match iter.next() {
Some(TokenTree::Punct(_)) | None => res.push(id),
Some(tt) => {
report_error(tt, "unexpected variant. Only unit variants accepted.");
return res
}
}
}
res
}
#[proc_macro_attribute]
pub fn procmac(args: TokenStream, input: TokenStream) -> TokenStream {
let _ = args;
let mut res = TokenStream::new();
res.extend(input.clone());
let mut iter = input.into_iter()
.skip_while(|tt| if let TokenTree::Punct(_) | TokenTree::Group(_) = tt {true} else {false})
.skip_while(|tt| tt.to_string() == "pub");
match iter.next() {
Some(tt @ TokenTree::Ident(_)) if tt.to_string() == "enum" => (),
Some(tt) => {
report_error(tt, "unexpected token. this should be only used with enums");
return res
},
None => return res
}
match iter.next() {
Some(tt) => {
let variants = match iter.next() {
Some(TokenTree::Group(g)) => {
variants_from_group(g)
}
_ => return res
};
let mut match_arms = TokenStream::new();
for (i, v) in variants.into_iter().enumerate() {
let lhs = TokenTree::Literal(Literal::u8_suffixed(i as u8));
if i >= u8::MAX as usize {
report_error(lhs, "enum can have only u8::MAX variants");
return res
}
let rhs = TokenTree::Ident(v);
match_arms.extend(quote! {
$lhs => $tt::$rhs,
})
}
res.extend(quote!(impl $tt {
pub fn from_byte(byte: u8) -> $tt {
match byte {
$match_arms
_ => unreachable!()
}
}
}))
}
_ => ()
}
res
}
And this is how I am using it.
use helper_macros::procmac;
#[procmac]
#[derive(Debug)]
#[repr(u8)]
enum Ty {
A,
B
}
fn main() {
println!("TEST - {:?}", Ty::from_byte(0))
}
The problem is that this causes a compiler error. The exact error is:
error[E0599]: no variant or associated item named `from_byte` found for enum `Ty` in the current scope
--> main/src/main.rs:91:32
|
85 | enum Ty {
| ------- variant or associated item `from_byte` not found here
...
91 | println!("TEST - {:?}", Ty::from_byte(0))
| ^^^^^^^^^ variant or associated item not found in `Ty`
Running cargo expand, though, generates the proper code, and running that expanded code directly works as expected, so I am stumped. It could be that I am missing something about how proc macros should be used, since this is the first time I am playing with them, and I don't see anything that would cause this error. I am following the sorted portion of the proc_macro_workshop. The only change is that I am using TokenStream directly instead of the syn and quote crates. Also, if I mistype the method name, the Rust compiler does suggest that a method with a similar name exists.

Here is a Playground repro: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=02c1ee77bcd80c68967834a53c011e41
So, indeed what you mention is true: the expanded code could be copy-pasted and it would work. When this happens (having behavior from macro expansion and "manual copy-pasted expansion" differ), there are two possibilities:
macro_rules! metavariables
When emitting code using macro_rules! special captures, some of these captures are wrapped in special invisible parentheses that already tell the parser how the thing inside should be parsed, which makes them illegal to use in other places. For instance, one may capture a $Trait:ty, and then doing impl $Trait for ... will fail: $Trait is parsed as a type, so it is interpreted as a trait object (old syntax). See also https://github.com/danielhenrymantilla/rust-defile for other examples.
This is not your case, but it's good to keep in mind (e.g. my initial hunch was that when doing $tt::$rhs if $tt was a :path-like capture, then that could fail).
macro hygiene/transparency and Spans
Consider, for instance:
macro_rules! let_x_42 {() => (
let x = 42;
)}
let_x_42!();
let y = x;
The macro invocation above fails to compile (x is not found at the let y = x; line), yet the code it expands to, if copy-pasted, compiles fine.
Basically the name x that the macro uses is "tainted" to be different from any x used outside the macro body, precisely to avoid misinteractions when the macro needs to define helper stuff such as variables.
And it turns out that this is the same thing that has happened with your from_byte identifier: your code was emitting a from_byte with private hygiene / a def_site() span, which is something that normally never happens for method names when using classic macros, or classic proc-macros (i.e., when not using the unstable ::proc_macro::quote! macro). See this comment: https://github.com/rust-lang/rust/issues/54722#issuecomment-696510769
And so the from_byte identifier is being "tainted" in a way that allows Rust to make it invisible to code not belonging to that same macro expansion, such as the code in your fn main.
The solution, at this point, is easy: forge a from_byte identifier with an explicit non-def_site() Span (e.g., Span::call_site(), or even better: Span::mixed_site() to mimic the rules of macro_rules! macros) so as to prevent it from getting the default def_site() Span that ::proc_macro::quote! uses:
use ::proc_macro::Span;
// ...
let from_byte = TokenTree::from(Ident::new("from_byte", Span::mixed_site()));
res.extend(quote!(impl $tt {
// use an interpolated ident rather than a "hardcoded one"
// vvvvvvvvvv
pub fn $from_byte(byte: u8) -> $tt {
match byte {
$match_arms
_ => unreachable!()
}
}
}))
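The hygiene rule that the answer relies on can be observed on stable Rust with a plain macro_rules! macro. The sketch below is only an analogy: it shows local-variable hygiene, whereas the question's case is a method name that got a def_site() span from the unstable ::proc_macro::quote! macro. The names with_private_x and caller_x_plus_one are made up for the illustration.

```rust
// A macro that introduces its own local `x`. Because of hygiene, this `x`
// is distinct from any `x` written at the call site.
macro_rules! with_private_x {
    ($e:expr) => {{
        let x = 100; // visible only to tokens written inside the macro body
        let _ = x; // silence the unused-variable warning
        $e // an `x` inside `$e` still refers to the caller's `x`
    }};
}

// The caller's `x` is 1, and the macro's `x = 100` does not shadow it
// for the expression passed in from the call site.
fn caller_x_plus_one() -> i32 {
    let x = 1;
    with_private_x!(x + 1)
}
```

If the expansion were pasted in by hand, the inner let x = 100; would shadow the outer x and the function would return 101 instead. That difference between macro expansion and manual copy-paste is the same kind of effect that hides from_byte in the question.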

How to get a value from a Result?

How can I get the value of a struct which is returned in a Result from another function? Example below.
#[derive(Debug)]
pub struct Keypair(ed25519_dalek::Keypair);
pub fn keypair_from_seed(seed: &[u8]) -> Result<Keypair, Box<dyn error::Error>> {
let dalek_keypair = ed25519_dalek::Keypair { secret, public };
Ok(Keypair(dalek_keypair))
}
fn main(){
//here seed_bytes is mnemonics
let sk = keypair_from_seed(&seed_bytes);
// sk contains the secret key and the public key; I want to extract them separately
}

If you feel very confident
let sk = keypair_from_seed(&seed_bytes).unwrap();
or
let sk = keypair_from_seed(&seed_bytes).expect("my own failure message");
However, it is recommended to proceed like this
if let Ok(sk) = keypair_from_seed(&seed_bytes) {
// ... use sk ...
} else {
// ... sk is not available, may be should
// we warn the user, ask for an alternative ...
}
or, if you want to explicitly handle the error
match keypair_from_seed(&seed_bytes) {
Ok(sk) => {
// ... use sk ...
},
Err(e) => {
// ... sk is not available, and e explains why ...
},
}
Note that, if the function containing these lines is also
able to return an error, you can just propagate it with
the ? notation (if the error returned by
keypair_from_seed() is convertible into the error returned
by your function)
let sk = keypair_from_seed(&seed_bytes)?;
see
unwrap,
expect,
if let,
match,
?
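As a small illustration of the ? option, here is a std-only sketch. parse_seed is a hypothetical stand-in for keypair_from_seed(), chosen so that no external crate is needed. The point is that ? converts the inner error (ParseIntError) into the outer one (Box<dyn Error>) automatically, because that conversion exists in the standard library.

```rust
use std::error::Error;
use std::num::ParseIntError;

// Hypothetical fallible function standing in for keypair_from_seed().
fn parse_seed(text: &str) -> Result<u32, ParseIntError> {
    text.trim().parse::<u32>()
}

// `?` returns early on Err and converts ParseIntError into Box<dyn Error>.
// On Ok it unwraps the value so the rest of the function can use it directly.
fn doubled_seed(text: &str) -> Result<u64, Box<dyn Error>> {
    let seed = parse_seed(text)?;
    Ok(u64::from(seed) * 2)
}
```

Calling doubled_seed(" 21 ") yields Ok(42), while doubled_seed("abc") yields an Err carrying the parse error, with no panic in either case.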

Let's look at the definition of Result in the Rust documentation
enum Result<T, E> {
Ok(T),
Err(E),
}
So a Result is either Ok which contains a value with type T, or Err which contains a value with type E.
You have a couple of options to extract the value.
1- result.unwrap() or result.expect("error message")
This function returns the Ok value if result is Ok, or panics (the program is terminated) otherwise. It makes sense if you are sure the result doesn't contain an error, or if you just want to write the happy path first and deal with error handling later, but you shouldn't use it all the time since it crashes the app whenever the value is not Ok.
You can use it like this
let val = result.unwrap();
// or
let val = result.expect("oops not Ok");
The only difference with expect is that you can provide the error message yourself instead of getting the standard error message of unwrap.
2- Pattern matching
In Rust, pattern matching is used for enum types so that user can do the necessary thing based on the current variant of the enum. You can use it like this
match result {
Ok(val) => {
// Use val here....
},
Err(err) => {
// Do something with the error if you want
}
}
If you are going to handle only one variant, you can also use an if let statement like this
if let Ok(val) = result {
// Do something with val
}

The returned result from the function is of the type Result<Keypair, Box<dyn error::Error>>.
There are multiple ways to extract a result from the Result container. Basically rust wants you to check for any errors and handle it. If no errors, you can extract the result and use it.
if let Ok(sk) = keypair_from_seed(&seed) {
let public = sk.0.public;
let secret = sk.0.secret;
/* use your keys */
}
Notice the sk.0: it is needed because Keypair is a tuple struct. If your struct had multiple fields, something like
pub struct KeyTuple<'a>(ed25519_dalek::Keypair, i32, &'a str);
you would have used it as
let kt = keytuple_from_seed(&seed).unwrap();
let kp: ed25519_dalek::Keypair = kt.0;
let le: i32 = kt.1;
let st: &str = kt.2;
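Besides positional access such as sk.0, the fields can also be pulled out by pattern matching through the Result and the tuple struct in one step. The sketch below is std-only: RawKeypair is a hypothetical stand-in for ed25519_dalek::Keypair, and the key derivation is a toy that has nothing to do with real cryptography.

```rust
// Hypothetical stand-in for ed25519_dalek::Keypair, so the sketch needs no crates.
#[derive(Debug)]
struct RawKeypair {
    secret: [u8; 4],
    public: [u8; 4],
}

// Tuple struct wrapper, like `Keypair(ed25519_dalek::Keypair)` in the question.
#[derive(Debug)]
struct Keypair(RawKeypair);

fn keypair_from_seed(seed: &[u8]) -> Result<Keypair, String> {
    if seed.len() < 4 {
        return Err(format!("seed too short: {} bytes", seed.len()));
    }
    let mut secret = [0u8; 4];
    secret.copy_from_slice(&seed[..4]);
    // Toy derivation, only for illustration; not real cryptography.
    let public = secret.map(|b| b.wrapping_add(1));
    Ok(Keypair(RawKeypair { secret, public }))
}

// Destructure the Result, the tuple struct and the inner struct in one pattern.
fn split_keys(seed: &[u8]) -> Option<([u8; 4], [u8; 4])> {
    match keypair_from_seed(seed) {
        Ok(Keypair(RawKeypair { secret, public })) => Some((secret, public)),
        Err(_) => None,
    }
}
```

Whether positional access or a destructuring pattern reads better is mostly a matter of taste; the pattern form avoids repeating .0 when several fields are needed.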

rust program crashes when calling unwrap

My simple rust program uses jsonpath crate to lookup some values in a json document.
The code that I am using is the following
use serde::Deserialize;
extern crate jsonpath;
extern crate serde_json;
use jsonpath::Selector;
use serde_json::Value;
use std::any::{Any, TypeId};
fn main() {
let jsondoc = r#"
{
"a": 10,
"b": "a string",
"c" : false,
"point" : {
"x" : 1,
"y": 2
}
}
"#;
let json: Value = serde_json::from_str(jsondoc).unwrap(); // Parse JSON document
let selector1 = Selector::new("$.a").unwrap(); // Create a JSONPath selector
let result1: Vec<f64> = selector1.find(&json)
.map(|t| t.as_f64().unwrap())
.collect();
println!("{:?}", result1);
let selector2 = Selector::new("$.b").unwrap(); // Create a JSONPath selector
let result2: Vec<&str> = selector2.find(&json)
.map(|t| t.as_str().unwrap())
.collect();
println!("{:?}", result2);
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().unwrap())
.collect();
println!("{:?}", result3);
}
A concern is that if I change the json value "a" from 10 to "ten" (different data type) the code crashes in this statement: map(|t| t.as_f64().unwrap()) (cannot unwrap)
How do I protect the code in order to avoid panics?

As mentioned in comments, the problem here is that unwrap panics on errors. The solution is to propagate your result around your program, bubbling it up until you want to handle it.
There's one complication, though: as_*() methods don't return a Result, but rather an Option. If we want to treat it as an error case and propagate it around the program, turning it into a Result will make that much nicer.
Here's a simple way to do that using ok_or:
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or("expected boolean, found something else").unwrap())
.collect();
println!("{:?}", result3);
If you want to make this error message better and/or include more information in it, I think making a custom error enum is probably the best solution. There's a lot of detail on that in this answer to a related question about error handling, but in short, if you create your own error type containing more information, you can easily make the errors much more readable:
use snafu::Snafu; // 0.6.8
#[derive(Debug, Snafu)]
enum MyError {
#[snafu(display("expected {}, found {}", expected, actual))]
WrongValueType {
expected: &'static str,
actual: serde_json::Value,
}
}
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or_else(|| MyError::WrongValueType { expected: "boolean", actual: t.clone() }).unwrap())
.collect();
println!("{:?}", result3);
This uses the snafu crate, but there are other options (or it can be done manually). Again, see the above linked answer for more information.
Alright - we have a Result. Now, we need to propagate it up. This adds some boilerplate, but fortunately, it shouldn't have to change the style all that much. In particular, Iterator::collect can be used to collect into anything implementing FromIterator, and Result implements FromIterator.
Replacing each map with something like this will iterate until either all results are found Ok(), or one is an Err:
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or("expected boolean, found something else"))
// &'static str is our error type
.collect::<Result<Vec<bool>, &'static str>>()
.unwrap();
println!("{:?}", result3);
We're still unwrapping, so it'll still panic, but that brings the error one step up. To deal with it at the top level, you'll need to stick this into a function returning a Result itself, and then match on that result to deal with the error case. It's a common pattern to do this by turning our main function into just error handling, and then to make our "real main" a separate function returning a Result. Here's an application of that on your snippet of code:
// `Box<dyn std::error::Error>` encapsulates "any error" without giving
// access to the details
fn try_main() -> Result<(), Box<dyn std::error::Error>> {
let jsondoc = r#"
{
"a": 10,
"b": "a string",
"c" : false,
"point" : {
"x" : 1,
"y": 2
}
}
"#;
let json: Value = serde_json::from_str(jsondoc)?; // Parse JSON document
let selector1 = Selector::new("$.a").unwrap(); // Create a JSONPath selector
let result1: Vec<f64> = selector1.find(&json)
.map(|t| t.as_f64().ok_or("expected number, found something else"))
.collect::<Result<_, _>>()?;
println!("{:?}", result1);
// ... remaining code can be translated as this one.
Ok(())
}
fn main() {
match try_main() {
Ok(()) => {}
Err(e) => {
eprintln!("Error occurred: {}", e);
std::process::exit(1);
}
}
}
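The collect-into-Result step can be tried without jsonpath or serde_json. In the std-only sketch below, Value is a hypothetical stand-in with just enough variants for the demonstration, and its as_f64 only mirrors the Option-returning shape of the real method. The technique itself (ok_or_else followed by collect into a Result) is the one described above.

```rust
// Hypothetical stand-in for serde_json::Value, with only the variants needed here.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
    Bool(bool),
}

impl Value {
    // Same shape as the real as_f64: Some for numbers, None for anything else.
    fn as_f64(&self) -> Option<f64> {
        match self {
            Value::Number(n) => Some(*n),
            _ => None,
        }
    }
}

// Turns each Option into a Result, then collects into Result<Vec<_>, _>.
// Collection stops at the first Err, which becomes the overall result.
fn all_numbers(values: &[Value]) -> Result<Vec<f64>, String> {
    values
        .iter()
        .map(|v| {
            v.as_f64()
                .ok_or_else(|| format!("expected number, found {:?}", v))
        })
        .collect()
}
```

The caller can then use ? or match on the returned Result, so a value of the wrong type becomes an ordinary error instead of a panic.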

Is there a way to write a function that uses odbc::Statement in a loop?

I have working example of a simple loop (mostly taken from the odbc crate's example):
use std::io;
use odbc::*;
use odbc_safe::AutocommitOn;
fn main(){
let env = create_environment_v3().map_err(|e| e.unwrap()).unwrap();
let conn = env.connect_with_connection_string(CONN_STRING).unwrap();
let mut stmt = Statement::with_parent(&conn).unwrap();
loop {
let mut sql_text = String::new();
println!("Please enter SQL statement string: ");
io::stdin().read_line(&mut sql_text).unwrap();
stmt = match stmt.exec_direct(&sql_text).unwrap() {
Data(mut stmt) => {
let cols = stmt.num_result_cols().unwrap();
while let Some(mut cursor) = stmt.fetch().unwrap() {
for i in 1..(cols + 1) {
match cursor.get_data::<&str>(i as u16).unwrap() {
Some(val) => print!(" {}", val),
None => print!(" NULL"),
}
}
println!();
}
stmt.close_cursor().unwrap()
}
NoData(stmt) => {println!("Query executed, no data returned"); stmt}
}
}
}
I don't want to create new Statements for each query, as I just can .close_cursor().
I'd like to extract the loop's body to a function, like this:
fn exec_stmt(stmt: Statement<Allocated, NoResult, AutocommitOn>) {
//loop's body here
}
But I just can't! The .exec_direct() method consumes my Statement and returns another. I tried different ways to pass the Statement argument to the function (borrow, RefCell, etc.), but they all fail when used in a loop. I am still new to Rust, so most likely I just don't know something. Or does .exec_direct's consumption of the Statement make it impossible?

There's no nice way to move and then move back values through parameters. It's probably best to copy what .exec_direct does and just make the return type of your function a statement as well.
The usage would then look like this:
let mut stmt = Statement::with_parent(&conn).unwrap();
loop {
stmt = exec_stmt(stmt);
}
and your function signature would be:
fn exec_stmt(stmt: Statement<...>) -> Statement<...> {
match stmt.exec_direct() {
...
}
}
I probably wouldn't recommend this, but if you really wanted to get it to work you could use Option and the .take() method.
fn exec_stmt(some_stmt: &mut Option<Statement<...>>) {
let stmt = some_stmt.take().unwrap();
// do stuff ...
some_stmt.replace(stmt);
}
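Both approaches can be compared side by side in a std-only sketch. The Statement type here is hypothetical: it only imitates the shape described in the question, where executing consumes the statement and hands one back. The names exec_owned, exec_in_place and run_queries are likewise made up for the illustration.

```rust
// Hypothetical statement type; `exec` consumes self and returns a statement again,
// which is the shape the question describes for exec_direct.
#[derive(Debug)]
struct Statement {
    executed: u32,
}

impl Statement {
    fn exec(self, _sql: &str) -> Statement {
        Statement {
            executed: self.executed + 1,
        }
    }
}

// Option 1: take ownership and return the statement to the caller.
fn exec_owned(stmt: Statement, sql: &str) -> Statement {
    stmt.exec(sql)
}

// Option 2: borrow an Option slot, move the statement out, then put it back.
fn exec_in_place(slot: &mut Option<Statement>, sql: &str) {
    if let Some(stmt) = slot.take() {
        *slot = Some(stmt.exec(sql));
    }
}

// Runs every query once through each approach and reports the execution count.
fn run_queries(queries: &[&str]) -> u32 {
    let mut stmt = Statement { executed: 0 };
    for q in queries {
        stmt = exec_owned(stmt, q);
    }
    let mut slot = Some(stmt);
    for q in queries {
        exec_in_place(&mut slot, q);
    }
    slot.map(|s| s.executed).unwrap_or(0)
}
```

The first form keeps ownership visible in the signature. The second hides a possible empty slot behind the Option, which is one reason to prefer the first.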

The odbc-safe crate tried to have each state transition of ODBC reflected in a different type. The odbc-api crate also tries to protect you from errors, but is a bit more subtle about it. Your use case would be covered by the Preallocated struct.
The analog example from the odbc-api documentation looks like this:
use odbc_api::{Connection, Error};
use std::io::{self, stdin, Read};
fn interactive(conn: &Connection) -> io::Result<()>{
let mut statement = conn.preallocate().unwrap();
let mut query = String::new();
stdin().read_line(&mut query)?;
while !query.is_empty() {
match statement.execute(&query, ()) {
Err(e) => println!("{}", e),
Ok(None) => println!("No results set generated."),
Ok(Some(cursor)) => {
// ...print cursor contents...
},
}
stdin().read_line(&mut query)?;
}
Ok(())
}
This will allow you to declare a function without any trouble:
use odbc_api::Preallocated;
fn exec_statement(stmt: &mut Preallocated) {
// loops body here
}

If a Result returns Err(_), I want the whole function to return a HTTP request error

I'm trying to use the Iron framework to build a simple backend in Rust. This handler is just supposed to return the content of a certain file and I can get it to work properly with unwrap() but I want to try to do proper error handling. This is how I would imagine it would look like:
fn get_content(res: &mut Request) -> IronResult<Response> {
let mut id = String::new();
res.body.read_to_string(&mut id).unwrap();
let file_path_string = &("../content/".to_string() + &id + ".rdt");
// TODO: Error handling
match File::open(file_path_string) {
Ok(f) => {
let mut s = String::new();
f.read_to_string(&mut s);
Ok(Response::with(((status::Ok), s)))
}
Err(err) => Err(Response::with(((status::InternalServerError), "File not found")))
};
}
This throws the error not all control paths return a value [E0269], which is fine. But if I add a response after the match part:
match File::open(file_path_string) {
Ok(f) => {
let mut s = String::new();
f.read_to_string(&mut s);
Ok(Response::with(((status::Ok), s)))
}
Err(err) => Err(Response::with(((status::InternalServerError), "File not found")))
};
Err(Response::with(((status::InternalServerError), "File not found")))
I instead get the error message:
expected `iron::error::IronError`,
found `iron::response::Response`
(expected struct `iron::error::IronError`,
found struct `iron::response::Response`) [E0308]
src/main.rs:95
Err(Response::with(((status::InternalServerError), "File not found")))
I think the problem is the collision between Rust Err and Iron Err? I'm not sure though. And I have not done much web development (or Rust for that matter) in the past so any feedback on the code is also appreciated!
UPDATE: I think this is more "The Rust Way" to do it? But I'm not sure
fn get_content(res: &mut Request) -> IronResult<Response> {
let mut id = String::new();
res.body.read_to_string(&mut id).unwrap();
let file_path_string = &("../content/".to_string() + &id + ".rdt");
// TODO: Error handling
let f;
match File::open(file_path_string) {
Ok(file) => f = file,
Err(err) => Err(HttpError::Io(err))
};
let mut s = String::new();
f.read_to_string(&mut s);
Ok(Response::with(((status::Ok), s)))
}
Having the code inside the error handling seems weird as read_to_string also needs to be taken care of and that would create a nested mess of error handling? However, these matching arms are obviously of incompatible types so it won't work... any suggestions?

An Ok() takes a Response, but an Err() takes an IronError.
Hence your call Err(...) is not valid when ... is a Response!
How to correct it? Well, the first step is that you must create an IronError to send back. I believe (I am not familiar with Iron) that Iron will automatically set an appropriate error code and that it's not your job to do that. In the documentation we find one key type implementing IronError:
pub enum HttpError {
Method,
Uri(ParseError),
Version,
Header,
TooLarge,
Status,
Io(Error),
Ssl(Box<Error + 'static + Send + Sync>),
Http2(HttpError),
Utf8(Utf8Error),
// some variants omitted
}
I can't see one which allows for an arbitrary string like "file not found". However, your use case is one of an IO failure, right? So it would make sense to use HttpError::Io with the std::io::Error that you got back from File::open():
match File::open(file_path_string) {
Ok(f) => {
let mut s = String::new();
f.read_to_string(&mut s);
Ok(Response::with(((status::Ok), s)))
}
Err(err) => Err(HttpError::Io(err))
};
By the way, it also fixes your "TODO: error handling"! How beautiful!
(Code untested, please feel free to edit if compilation fails)
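The question also worries that handling read_to_string would create nested error handling. A std-only sketch of one way around that follows; it does not use Iron. read_content and status_code are hypothetical helpers, and the mapping to numeric status codes is an assumption made for the illustration rather than Iron's API.

```rust
use std::fs::File;
use std::io::{self, Read};

// Both File::open and read_to_string can fail; `?` forwards either io::Error,
// so the happy path stays flat instead of nesting match blocks.
fn read_content(path: &str) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut text = String::new();
    file.read_to_string(&mut text)?;
    Ok(text)
}

// Hypothetical mapping from the outcome to an HTTP status code; not Iron's API.
fn status_code(outcome: &io::Result<String>) -> u16 {
    match outcome {
        Ok(_) => 200,
        Err(e) if e.kind() == io::ErrorKind::NotFound => 404,
        Err(_) => 500,
    }
}
```

In a real handler, the single io::Result returned by read_content would be converted once into whatever error type the framework expects, rather than at every fallible call.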
