Custom Nom parser error without custom ErrorKind - rust

I have a small parser that fails if the number it parses is out of bounds:

use nom::IResult;
use nom::bytes::complete::tag;
use nom::character::complete::digit1;
use nom::combinator::fail;

fn dup(s: &str) -> IResult<&str, u64> {
    let (s, _) = tag("dup")(s)?;
    let (s, n) = digit1(s)?;
    let n = match n {
        "0" => 0,
        "1" => 1,
        "2" => 2,
        _ => return fail(s), // FIXME: Annotate.
    };
    Ok((s, n))
}
(See on playground.)
But I'd like for the error to be more meaningful.
How do I annotate this error with the context that n is out of bounds?
E.g. instead of fail(), use something that provides an error message.
Or somehow wrap a part of the parser in something that provides this context.
I know that you can create your own custom ErrorKind, but can it be done without? (When you have an ErrorKind, error_position!() and error_node_position!() macros will work.)

You probably want to read Error Management.
Broadly speaking, nom::error::Error is a low-overhead type for parser errors; that is why it records only the error position and ErrorKind.
If you want to attach more context, nom provides the nom::error::VerboseError type, and there are also ancillary crates which provide further error wrappers.
Finally, if that is still not sufficient, nom's error handling is built around the ParseError trait, so you can define a completely custom error type and implement that trait for it. The latter option obviously has the highest overhead, but also the highest flexibility.
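For that last option, here is a minimal sketch of what such a custom type can look like (the name DupError and its fields are invented for illustration; nom 7's ParseError trait is assumed):

use nom::error::{ErrorKind, ParseError};

// Hypothetical custom error type carrying a human-readable message.
#[derive(Debug)]
struct DupError<'a> {
    input: &'a str,
    message: String,
}

impl<'a> ParseError<&'a str> for DupError<'a> {
    // Called when a nom combinator fails.
    fn from_error_kind(input: &'a str, kind: ErrorKind) -> Self {
        DupError {
            input,
            message: format!("parse error in {:?}", kind),
        }
    }

    // Called while backtracking through enclosing parsers; a richer
    // type could accumulate a trace here instead of keeping only `other`.
    fn append(_input: &'a str, _kind: ErrorKind, other: Self) -> Self {
        other
    }
}

A parser using it would then return IResult<&str, u64, DupError<'_>>.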

Thanks to Masklinn's answer (which I've marked as accepted), a way to add custom error messages without adding a custom error type is to use VerboseError and convert_error() instead of the default Error, since VerboseError is capable of embedding context. Here's a modified example (also on playground):
use nom::IResult;
use nom::bytes::complete::tag;
use nom::character::complete::digit1;
use nom::combinator::fail;
use nom::error::{context, VerboseError};
use nom::error::convert_error;
use nom::Finish;

fn dup(s: &str) -> IResult<&str, u64, VerboseError<&str>> {
    let (s, _) = tag("dup")(s)?;
    let (sd, n) = digit1(s)?;
    let n = match n {
        "0" => 0,
        "1" => 1,
        "2" => 2,
        _ => return fail(s), // FIXME: Annotate.
    };
    Ok((sd, n))
}

fn main() {
    let input = "dup3";
    let result = context("dup", dup)(input).finish().err().unwrap();
    println!("{}", convert_error(input, result));
}
Adding context("dup", dup) provides quite a beautiful and readable context in the error message:
0: at line 1, in Fail:
dup3
^
1: at line 1, in dup:
dup3
^
but it does not add clarity at the innermost layer. If I add a context on the fail line:

let n = match n {
    "0" => 0,
    "1" => 1,
    "2" => 2,
    _ => return context("using an out-of-bounds dup", fail)(s),
};
then the message becomes
0: at line 1, in Fail:
dup3
^
1: at line 1, in using an out-of-bounds dup:
dup3
^
2: at line 1, in dup:
dup3
^
which is almost what I want! But I really just want to replace the message "in Fail" with "in using an out-of-bounds dup", not add to it. It is worth mentioning here what the Error Management docs say about convert_error():
Note that VerboseError and convert_error are meant as a starting point for language errors, but that they cannot cover all use cases. So a custom convert_error function should probably be written.
So the least complicated way I've found to add custom annotation/context to error messages is to use VerboseError, but to replace convert_error with a version that pops a leading ErrorKind::Fail or ErrorKind::Eof entry when it is followed by an entry that has a context. In that case I expect the context to reside at the same position, which would otherwise cause a duplicate entry:

use nom::error::{ErrorKind, VerboseErrorKind};

fn pretty_print_error(s: &str, mut e: VerboseError<&str>) -> String {
    let (_root_s, root_error) = e.errors[0].clone();
    if matches!(root_error, VerboseErrorKind::Nom(ErrorKind::Fail))
        || matches!(root_error, VerboseErrorKind::Nom(ErrorKind::Eof))
    {
        e.errors.remove(0);
    }
    convert_error(s, e)
}
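Wired into the earlier main, the only change is the final call (a sketch of the same example):

fn main() {
    let input = "dup3";
    let result = context("dup", dup)(input).finish().err().unwrap();
    println!("{}", pretty_print_error(input, result));
}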
Simpler solutions are welcome.


Advent of Code 2015: day 5, part 2 unknown false positives

I'm working through the Advent of Code 2015 problems in order to practise my Rust skills.
Here is the problem description:
Realizing the error of his ways, Santa has switched to a better model of determining whether a string is naughty or nice. None of the old rules apply, as they are all clearly ridiculous.
Now, a nice string is one with all of the following properties:
It contains a pair of any two letters that appears at least twice in the string without overlapping, like xyxy (xy) or aabcdefgaa (aa), but not like aaa (aa, but it overlaps).
It contains at least one letter which repeats with exactly one letter between them, like xyx, abcdefeghi (efe), or even aaa.
For example:
qjhvhtzxzqqjkmpb is nice because it has a pair that appears twice (qj) and a letter that repeats with exactly one letter between them (zxz).
xxyxx is nice because it has a pair that appears twice and a letter that repeats with one between, even though the letters used by each rule overlap.
uurcxstgmygtbstg is naughty because it has a pair (tg) but no repeat with a single letter between them.
ieodomkazucvgmuy is naughty because it has a repeating letter with one between (odo), but no pair that appears twice.
How many strings are nice under these new rules?
This is what I've managed to come up with so far:
pub fn part2(strings: &[String]) -> usize {
    strings.iter().filter(|x| is_nice(x)).count()
    /* for s in [
        String::from("qjhvhtzxzqqjkmpb"),
        String::from("xxyxx"),
        String::from("uurcxstgmygtbstg"),
        String::from("ieodomkazucvgmuy"),
        String::from("aaa"),
    ]
    .iter()
    {
        is_nice(s);
    }
    0 */
}

fn is_nice(s: &String) -> bool {
    let repeat = has_repeat(s);
    let pair = has_pair(s);
    /* println!(
        "s = {}: repeat = {}, pair = {}, nice = {}",
        s,
        repeat,
        pair,
        repeat && pair
    ); */
    repeat && pair
}

fn has_repeat(s: &String) -> bool {
    for (c1, c2) in s.chars().zip(s.chars().skip(2)) {
        if c1 == c2 {
            return true;
        }
    }
    false
}

fn has_pair(s: &String) -> bool {
    // Generate all possible pairs
    let mut pairs = Vec::new();
    for (c1, c2) in s.chars().zip(s.chars().skip(1)) {
        pairs.push((c1, c2));
    }
    // Look for overlap
    for (value1, value2) in pairs.iter().zip(pairs.iter().skip(1)) {
        if value1 == value2 {
            // Overlap has occurred
            return false;
        }
    }
    // Look for matching pair
    for value in pairs.iter() {
        if pairs.iter().filter(|x| *x == value).count() >= 2 {
            //println!("Repeat pair: {:?}", value);
            return true;
        }
    }
    // No pair found
    false
}
However, despite getting the expected results for the commented-out test values, my result when run on the actual puzzle input does not match that of community-verified regex-based implementations. I can't seem to see where the problem is, despite having thoroughly tested each function with known test values.
I would rather not use regex if at all possible.
I think has_pair has a bug:
In the word aaabbaa, we have an overlapping aa (in the aaa at the beginning), but I think you are not allowed to return false right away, because there is another, non-overlapping, aa at the end of the word.
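For illustration, one possible fix (a sketch, not the poster's code): instead of rejecting the whole string at the first overlap, look for the same pair occurring again at least two positions later:

fn has_pair(s: &str) -> bool {
    let chars: Vec<char> = s.chars().collect();
    // For every pair starting at i, search for an equal pair starting
    // at j >= i + 2, so the two occurrences cannot overlap.
    for i in 0..chars.len().saturating_sub(1) {
        for j in (i + 2)..chars.len().saturating_sub(1) {
            if chars[i] == chars[j] && chars[i + 1] == chars[j + 1] {
                return true;
            }
        }
    }
    false
}

With this version, "aaa" is still rejected (its only two aa pairs overlap), while "aaabbaa" is accepted.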

Implicit class holding mutable variable in multithreaded environment

I need to implement a parallel method, which takes two computation blocks, a and b, and starts each of them in a new thread. The method must return a tuple with the result values of both the computations. It should have the following signature:
def parallel[A, B](a: => A, b: => B): (A, B)
I managed to solve the exercise using a straightforward Java-like approach. Then I decided to come up with a solution using an implicit class. Here it is:
object ParallelApp extends App {
  implicit class ParallelOps[A](a: => A) {
    var result: A = _

    def spawn(): Unit = {
      val thread = new Thread {
        override def run(): Unit = {
          result = a
        }
      }
      thread.start()
      thread.join()
    }
  }

  def parallel[A, B](a: => A, b: => B): (A, B) = {
    a.spawn()
    b.spawn()
    (a.result, b.result)
  }

  println(parallel(1 + 2, "a" + "b"))
}
For some unknown reason, I receive the output (null,null). Could you please point out where the problem is?
Spoiler alert: It's not complicated. It's funny, like a magic trick (if you consider reading the documentation about the Java Memory Model "funny", that is). If you haven't figured it out yet, I would highly recommend trying to figure it out; otherwise it won't be funny. Someone should make a "division-by-zero proves 2 = 4"-style riddle out of it.
Consider the following shorter example:
implicit class Foo[A](a: A) {
  var result: String = "not initialized"
  def computeResult(): Unit = result = "Yay, result!"
}

val a = "a string"
a.computeResult()
println(a.result)
When run, it prints
not initialized
despite the fact that we invoked computeResult() and set result to "Yay, result!". The problem is that the two invocations a.computeResult() and a.result belong to two completely independent instances of Foo. The implicit conversion is performed twice, and the second implicitly created object doesn't know anything about the changes in the first implicitly created object. It has nothing to do with threads or JMM at all.
By the way: your code is not parallel. Calling join right after calling start doesn't gain you anything; your main thread will simply go idle and wait until the other thread finishes. At no point will there be two threads doing any useful work concurrently.
EDIT: Fixed a bug pointed out by Andrey Tyukin
One way to solve your problem is to use Scala Futures.
Documentation. Tutorial.
Useful Klang Blog.
You'll typically need some combination of these imports:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scala.util.{Failure, Success}
import scala.concurrent.duration._
An asynchronous example:

def parallelAsync[A, B](a: => A, b: => B): Future[(A, B)] = {
  // As per Andrey Tyukin's comments, the line below runs the two
  // futures sequentially and we do not get any benefit from it.
  // I will leave it here so others will not fall into my trap.
  //for { i <- Future(a); j <- Future(b) } yield (i, j)
  Future(a) zip Future(b)
}

parallelAsync(1 + 2, "a" + "b").onComplete {
  case Success(x) => println(x)
  case Failure(e) => e.printStackTrace()
}
If you must block until both are complete, you can use this:
def parallelSync[A, B](a: => A, b: => B): (A, B) = {
  // see comment above
  //val f = for { i <- Future(a); j <- Future(b) } yield (i, j)
  val tuple = Future(a) zip Future(b)
  Await.result(tuple, 5.seconds)
}

println(parallelSync(3 + 4, "c" + "d"))
When running these little examples, don't forget to sleep a little bit at the end so the program won't end before the results come back:
Thread.sleep(3000)

Check if a float can be converted to integer without loss

I wanted to check whether an integer was a power of 2. My standard approach would have been to see if log₂(x) was an integer value; however, I found no elegant way to do this. My approaches were the following:
let x = 65;
let y = (x as f64).log(2.0);

// Compute the difference between y and the result
// of truncating y by casting to int and back
let difference = y - (y as i64 as f64);

// This looks nice, but matching on float values will be phased out
match difference {
    0.0 => println!("{} is a power of 2", x),
    _ => println!("{} is NO power of 2", x),
}

// This seems kind of clunky
if difference == 0.0 {
    println!("{} is a power of 2", x);
} else {
    println!("{} is NO power of 2", x);
}
Is there a built-in option in Rust to check if a float can be converted to an integer without truncation?
Something that behaves like:

42.0f64.is_int()  // true / Ok()
42.23f64.is_int() // false / Err()

In other words, a method/macro/etc. that allows me to check whether I will lose information (decimals) by casting to an int.
I already found that checking whether an integer is a power of 2 can be done efficiently with x.count_ones() == 1.
You can use fract to check if there is a non-zero fractional part:
42.0f64.fract() == 0.0;
42.23f64.fract() != 0.0;
Note that this only works if you already know that the number is in range. If you need an extra check to test that the floating-point number is between 0 and u32::MAX (or between i32::MIN and i32::MAX), then you might as well do the conversion and check that it didn't lose precision:
x == (x as u32) as f64
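Putting the two ideas together, a small self-contained sketch (the helper name is_exact_u32 is made up; note that `as` casts from float to int saturate on out-of-range values since Rust 1.45):

fn is_exact_u32(x: f64) -> bool {
    // Round-trip through u32: any fractional part or out-of-range
    // value is lost in the cast, so equality means a lossless conversion.
    x == (x as u32) as f64
}

fn main() {
    assert!(is_exact_u32(42.0));
    assert!(!is_exact_u32(42.23));
    assert!(!is_exact_u32(-1.0)); // saturates to 0 in the cast
}

As an aside for the original power-of-two question, the standard library also provides u32::is_power_of_two().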

Swift - best practice to find the longest string in a [String] array

I'm trying to find the most effective way to get the longest string in a string array. For example:
let array = ["I'm Roi","I'm asking here","Game Of Thrones is just good"]
and the outcome will be - "Game Of Thrones is just good"
I've tried using the maxElement func, though it gives the max string in alphabetical order (maxElement()).
Any suggestions? Thanks!
Instead of sorting, which is O(n log(n)) for a good sort, use max(by:), which is O(n), on Array, providing it a closure to compare string lengths:
Swift 4:
For Swift 4 you can get the string length with the count property on String:
let array = ["I'm Roi", "I'm asking here", "Game Of Thrones is just good"]
if let max = array.max(by: { $1.count > $0.count }) {
    print(max)
}
Swift 3:
Use .characters.count on String to get the string lengths:
let array = ["I'm Roi", "I'm asking here", "Game Of Thrones is just good"]
if let max = array.max(by: { $1.characters.count > $0.characters.count }) {
    print(max)
}
Swift 2:
Use maxElement on Array providing it a closure to compare string lengths:
let array = ["I'm Roi", "I'm asking here", "Game Of Thrones is just good"]
if let max = array.maxElement({ $1.characters.count > $0.characters.count }) {
    print(max)
}
Note: maxElement is O(n). A good sort is O(n log(n)), so for large arrays, this will be much faster than sorting.
You can use reduce to do this. It will iterate through your array, keeping track of the current longest string, and then return it when finished.
For example:
let array = ["I'm Roi", "I'm asking here", "Game Of Thrones is just good"]
if let longestString = array.reduce(Optional<String>.None, combine: { $0?.characters.count > $1.characters.count ? $0 : $1 }) {
    print(longestString) // "Game Of Thrones is just good"
}
(Note that Optional.None is now Optional.none in Swift 3)
This uses a nil starting value to account for the fact that the array could be empty, as pointed out by @JHZ (it will return nil in that case). If you know your array has at least one element, you can simplify it to:

let longestString = array.reduce("") { $0.characters.count > $1.characters.count ? $0 : $1 }
Because it only iterates through each element once, it will be quicker than using sort(). I did a quick benchmark and sort() appears to be around 20x slower (and although there's no point in premature optimisation, I feel it is worth mentioning).
Edit: I recommend you go with @vacawama's solution, as it's even cleaner than reduce!
Here you go:

let array = ["I'm Roi", "I'm asking here", "Game Of Thrones is just good"]
var sortedArr = array.sort() { $0.characters.count > $1.characters.count }
let longestElement = sortedArr[0]
You can also practice with the use of Generics by creating this function:
func longestString<T: Sequence>(from stringsArray: T) -> String where T.Iterator.Element == String {
    return (stringsArray.max { $0.count < $1.count }) ?? ""
}
Explanation: Create a function named longestString. Declare a generic type T that implements the Sequence protocol (Sequence is defined here: https://developer.apple.com/documentation/swift/sequence). The function returns a single String (of course, the longest). The where clause limits the generic type T to sequences whose elements are of type String.
Inside the function, call the max function on stringsArray, comparing the lengths of the elements inside. What is returned is the longest String (an optional, as it can be nil if the array is empty). If the result is nil, then (via ??) an empty string is returned instead.
Now call it:
let longestA = longestString(from:["Shekinah", "Chesedh", "Agape Sophia"])
If you get the hang of using generics, even if the strings are hidden inside objects, you can make use of the pattern of coding above. You can change the element to objects of the same class (Person for example).
Thus:
class Person {
    let name: String
    init(name: String) {
        self.name = name
    }
}

func longestName<T: Sequence>(from stringsArray: T) -> String where T.Iterator.Element == Person {
    return (stringsArray.max { $0.name.count < $1.name.count })?.name ?? ""
}
Then call the function like this:
let longestB = longestName(from:[Person(name: "Shekinah"), Person(name: "Chesedh"), Person(name: "Agape Sophia")])
You also get to rename your function based on the appropriateness of its use. You can tweak the pattern to return something else, like the object itself, or the length (count) of the String. And finally, becoming familiar with generics may improve your coding ability.
Now, with a little tweak again, you may extend further so that you can compare strings owned by many different types as long as they implement a common protocol.
protocol Nameable {
    var name: String { get }
}
This defines a protocol named Nameable that requires conforming types to have a name variable of type String. Next, we define two different things that both implement the protocol.
class Person: Nameable {
    let name: String
    init(name: String) {
        self.name = name
    }
}

struct Pet: Nameable {
    let name: String
}
Then we tweak our generic function so that it requires that the elements must conform to Nameable, vastly different though they are.
func longestName<T: Sequence>(from stringsArray: T) -> String where T.Iterator.Element == Nameable {
    return (stringsArray.max { $0.name.count < $1.name.count })?.name ?? ""
}
Let's collect the different objects into an array. Then call our function.
let myFriends: [Nameable] = [Pet(name: "Bailey"), Person(name: "Agape Sophia")]
let longestC = longestName(from: myFriends)
Lastly, having seen where clauses and Sequence above, you may simply extend Sequence:
extension Sequence where Iterator.Element == String {
    func topString() -> String {
        self.max(by: { $0.count < $1.count }) ?? ""
    }
}
Or the protocol type:
extension Sequence where Iterator.Element == Nameable {
    func theLongestName() -> Nameable? {
        self.max(by: { $0.name.count < $1.name.count })
    }
}

How to generate tuples from strings?

I am writing a macro to parse some structured text into tuples, line by line. Most parts work now, but I am stuck at forming a tuple by extracting/converting Strings from a vector.
// Reading a tuple from a line.
// Example  : read_tuple!("1 ab 3".lines(), (i32, String, i32))
// Expected : (1, "ab", 3)
// Note: you can not use str
macro_rules! read_tuple {
    (
        $lines:ident, ( $( $t:ty ),* )
    ) => {{
        let l = ($lines).next().unwrap();
        let ws = l.trim().split(" ").collect::<Vec<_>>();
        let s: ( $($t),* ) = (
            // for w in ws {
            //     let p = w.parse().unwrap();
            //     (p),
            // }
            ws[0].parse().unwrap(),
            ws[1].parse().unwrap(),
            //...
            ws[2].parse().unwrap(),
            // Or is there any way to automatically generate these statements?
        );
        s
    }}
}

fn main() {
    let mut _x = "1 ab 3".lines();
    let a = read_tuple!(_x, (i32, String, i32));
    print!("{:?}", a);
}
How can I iterate through ws and return the tuple within this macro?
You can try it here.
A tuple is a heterogeneous collection; each element may be of a different type. And in your example, they are of different types, so each parse method is needing to produce a different type. Therefore pure runtime iteration is right out; you do need all the ws[N].parse().unwrap() statements expanded.
Sadly there is not at present any way of writing out the current iteration of a $(…)* (though it could be simulated with a compiler plugin). There is, however, a way that one can get around that: blending run- and compile-time iteration. We use iterators to pull out the strings, and the macro iteration expansion (ensuring that $t is mentioned inside the $(…) so it knows what to repeat over) to produce the right number of the same lines. This also means we can avoid using an intermediate vector as we are using the iterator directly, so we win all round.
macro_rules! read_tuple {
    (
        $lines:ident, ($($t:ty),*)
    ) => {{
        let l = $lines.next().unwrap();
        let mut ws = l.trim().split(" ");
        (
            $(ws.next().unwrap().parse::<$t>().unwrap(),)*
        )
    }}
}
A minor thing to note is how I changed ),* to ,)*; this means that you will get (), (1,), (1, 2,), (1, 2, 3,), &c. instead of (), (1), (1, 2), (1, 2, 3)—the key difference being that a single-element tuple will work (though you’ll still sadly be writing read_tuple!(lines, (T))).
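For completeness, a sketch of the revised macro in use, mirroring the original main:

fn main() {
    let mut lines = "1 ab 3".lines();
    let a = read_tuple!(lines, (i32, String, i32));
    print!("{:?}", a); // prints (1, "ab", 3)
}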
