Identifying left recursion in ANTLR4

A grammar (copied from the language manual) reports the following left recursions when I un-comment the production casting_type -> constant_primary:
error(119): The following sets of rules are mutually left-recursive [primary, method_call_root, method_call, cast]
and [casting_type, constant_cast, cast, constant_primary, constant_function_call, function_subroutine_call, primary]
and [subroutine_call, function_subroutine_call, constant_function_call, constant_primary, method_call, method_call_root, casting_type, primary, constant_cast, cast]
The error report above has three sets of rules. The third set contains two left-recursive cycles:
casting_type,constant_primary,constant_cast,casting_type
casting_type,constant_primary,constant_function_call,function_subroutine_call,subroutine_call,method_call,method_call_root,primary,cast,casting_type
Since this error was reported only after I un-commented that one production, I think it is reasonable to expect to see at least its two rule names (casting_type, constant_primary) in each set. The first set clearly lacks both names, so it cannot contain the new recursion. And the second set (I cannot post the full grammar here because it is too long) contains the first cycle above plus some extra names that do not seem relevant.
My question is: why is ANTLR printing the first and the second sets of rules at all?
Is this a bug in ANTLR (I tried 4.6 and 4.7, same result), or is it hinting at something in these sets that I am missing?
I saw a similar post elsewhere where the reported names did not indicate a recursion at first glance, but on deeper analysis a recursion was found somewhere else.

Probably nobody can really answer your question, not even the authors of ANTLR. To me it looks like you are getting follow-up errors that make little sense because a real error made the analysis impossible (or at least led it to wrong conclusions). Of course there could also be a bug in ANTLR, but I recommend focusing on one of the sets and fixing it (if you can see what makes those rules mutually left-recursive). Maybe the other errors will then disappear, or you will have to analyze again.

Related

What is the typed hole exploration development style?

While doing the CIS194 (Spring of 2013) homework 10, I got stuck on the Applicative instance of a Parser type. I sought help from Google, and I came across this Reddit post. The user ephrion gave an answer, which was also an example of the typed-hole exploration method, which I didn't quite understand. In the comments section of his answer he also said this:
It's extremely useful and one of the things that makes Haskell development so nice.
So the question is: what exactly is this method, and is there an explicit sequence of steps to it?
I still consider myself a beginner when it comes to Haskell, and by googling the subject I didn't find a very clear explanation of how this kind of development style can be used.
Almost anywhere on the right hand side of an assignment in Haskell, you can write an underscore (optionally followed by other characters) instead of a value (constant or function). Instead of compiling, GHC will then tell you which type of value you might want to replace the underscore with, and list which identifiers in scope are of that type.
Matthías Páll Gissurarson is expanding the list of hints from GHC to include compound expressions.
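To make this concrete, here is a minimal sketch (the shout function and the hole name _f are invented for illustration, not taken from the CIS194 homework):

import Data.Char (toUpper)

shout :: String -> String
shout s = map _f s

GHC refuses to compile this and instead reports something like "Found hole: _f :: Char -> Char"; reasonably recent versions also list "valid hole fits" that are in scope, such as the imported toUpper. Replacing _f with toUpper then makes the module typecheck.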

Stop GHC from warning me about one particular missing pattern

Let's say I would generally like to be warned about incomplete patterns in my code, but sometimes I know about a pattern incompleteness for a certain function and I know it's fine.
Is it still true that GHC's warning granularity is per-module, and there's no way to change warnings regarding a particular function or definition?
Yes, still true, but you can work around this by using error.
f (Just a) = show a
without a case for Nothing gives a warning, but adding
f Nothing = error "f: Nothing supplied as an argument. This shouldn't have happened. Oops."
gets rid of it.
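Putting that together, a minimal self-contained sketch (the warning flag name assumes a reasonably recent GHC; older versions spell it -fwarn-incomplete-patterns):

{-# OPTIONS_GHC -Wincomplete-patterns #-}

f :: Maybe Int -> String
f (Just a) = show a
f Nothing  = error "f: Nothing supplied as an argument. This shouldn't have happened."

main :: IO ()
main = putStrLn (f (Just 42))

Delete the Nothing equation and the warning comes back; keep it and the module compiles silently.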
A per-function solution of your problem is to give Haskell some code you think will never be run, to keep it quiet.
Please note: I think your code should be robust and cover every eventuality unless you can prove it will never happen.
Working around this restriction isn't very good practice, I think.
(You might think this is a wide-open back door that hacks away a useful compile-time check and should be stopped by -Wall, but I can obfuscate my way around any simple restriction you'd choose, and I think a complete solution to that problem would essentially require solving the halting problem, so let's not blame the compiler.)

Haskell without types

Is it possible to disable or work around the type system in Haskell? There are situations where it is convenient to have everything untyped as in Forth and BCPL or monotyped as in Mathematica. I'm thinking along the lines of declaring everything as the same type or of disabling type checking altogether.
Edit: In conformance with SO principles, this is a narrow technical question, not a request for discussion of the relative merits of different programming approaches. To rephrase the question, "Can Haskell be used in a way such that avoidance of type conflicts is entirely the responsibility of the programmer?"
Also look at Data.Dynamic which allows you to have dynamically typed values in parts of your code without disabling type-checking throughout.
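A minimal sketch of what that looks like (the example values are invented for illustration):

import Data.Dynamic (Dynamic, toDyn, fromDynamic)

-- A heterogeneous list: every element is wrapped as a Dynamic.
mixed :: [Dynamic]
mixed = [toDyn (1 :: Int), toDyn "hello", toDyn True]

main :: IO ()
main = do
  -- fromDynamic recovers a value only if you ask for the right type.
  print (fromDynamic (head mixed) :: Maybe Int)     -- Just 1
  print (fromDynamic (mixed !! 1) :: Maybe String)  -- Just "hello"
  print (fromDynamic (mixed !! 1) :: Maybe Int)     -- Nothing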
GHC 7.6 (not released yet) has a similar feature, -fdefer-type-errors:
http://hackage.haskell.org/trac/ghc/wiki/DeferErrorsToRuntime
It will defer all type errors until runtime. It's not really untyped but it allows almost as much freedom.
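A minimal sketch of the behaviour, assuming a GHC new enough to support the flag:

{-# OPTIONS_GHC -fdefer-type-errors #-}

oops :: Int
oops = "this is not an Int"   -- the type error becomes a compile-time warning

main :: IO ()
main = putStrLn "Still runs, because oops is never evaluated."

Evaluating oops (say, with print oops) would still raise the deferred type error at runtime, which is the limitation described below.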
Even with -fdefer-type-errors you wouldn't be avoiding the type system, nor does it really give you type independence. The point of the flag is to allow code with type errors to compile, as long as the erroneous expressions are never actually evaluated. In particular, any expression with a type error will still fail at runtime the moment it is evaluated.
While the prospect of untyped functions in Haskell might be tempting, it's worth noting that the type system is really at the heart of the language. The code proves its own functionality in compilation, and the rigidity of the type system prevents a large number of errors.
Perhaps if you gave a specific example of the problem you're having, the community could address it. Interconverting between number types is something that I've asked about before, and there are a number of good tricks.
Perhaps -fdefer-type-errors combined with https://hackage.haskell.org/package/base-4.14.1.0/docs/Unsafe-Coerce.html offers what you need.

Mandatory use of braces

As part of a code standards document I wrote a while back, I enforce "you must always use braces for loops and/or conditional code blocks, even (especially) if they're only one line."
Example:
// this is wrong
if (foo)
    //bar
else
    //baz

while (stuff)
    //things

// This is right.
if (foo) {
    // bar
} else {
    // baz
}

while (things) {
    // stuff
}
When you don't brace a single-line, and then someone comments it out, you're in trouble. If you don't brace a single-line, and the indentation doesn't display the same on someone else's machine... you're in trouble.
So, question: are there good reasons why this would be a mistaken or otherwise unreasonable standard? There's been some discussion on it, but no one can offer me a better counterargument than "it feels ugly".
The best counter argument I can offer is that the extra line(s) taken up by the space reduce the amount of code you can see at one time, and the amount of code you can see at one time is a big factor in how easy it is to spot errors. I agree with the reasons you've given for including braces, but in many years of C++ I can only think of one occasion when I made a mistake as a result and it was in a place where I had no good reason for skipping the braces anyway. Unfortunately I couldn't tell you if seeing those extra lines of code ever helped in practice or not.
I'm perhaps more biased because I like the symmetry of matching braces at the same indentation level (and the implied grouping of the contained statements as one block of execution) - which means that adding braces all the time adds a lot of lines to the project.
I enforce this to a point, with minor exceptions for if statements which evaluate to either return or to continue a loop.
So, this is correct by my standard:
if(true) continue;
As is this
if(true) return;
But the rule is that it is either a return or continue, and it is all on the same line. Otherwise, braces for everything.
The reasoning is both for the sake of having a standard way of doing it, and to avoid the commenting problem you mentioned.
I see this rule as overkill. Draconian standards don't make good programmers, they just decrease the odds that a slob is going to make a mess.
The examples you give are valid, but they have better solutions than forcing braces:
When you don't brace a single-line, and then someone comments it out, you're in trouble.
Two practices solve this better, pick one:
1) Comment out the if, while, etc. before the one-liner along with the one-liner itself. I.e. treat
if(foo)
    bar();
like any other multi-line statement (e.g. an assignment with multiple lines, or a multiple-line function call):
//if(foo)
//    bar();
2) Prefix the // with a ;:
if(foo)
    ;// bar();
If you don't brace a single-line, and the indentation doesn't display the same on someone else's machine... you're in trouble.
No, you're not; the code works the same but it's harder to read. Fix your indentation. Pick tabs or spaces and stick with them. Do not mix tabs and spaces for indentation. Many text editors will automatically fix this for you.
Write some Python code. That will fix at least some bad indentation habits.
Also, structures like } else { look like a nethack version of a TIE fighter to me.
are there good reasons why this would be a mistaken or otherwise unreasonable standard? There's been some discussion on it, but no one can offer me a better counterargument than "it feels ugly".
Redundant braces (and parentheses) are visual clutter. Visual clutter makes code harder to read. The harder code is to read, the easier it is to hide bugs.
int x = 0;
while(x < 10);
{
    printf("Count: %d\n", ++x);
}
Forcing braces doesn't help find the bug in the above code.
P.S. I'm a subscriber to the "every rule should say why" school, or as the Dalai Lama put it, "Know the rules so that you may properly break them."
I have yet to have anyone come up with a good reason not to always use curly braces.
The benefits far exceed any "it feels ugly" reason I've heard.
Coding standards exist to make code easier to read and to reduce errors.
This is one standard that truly pays off.
I find it hard to argue with coding standards that reduce errors and make the code more readable. It may feel ugly to some people at first, but I think it's a perfectly valid rule to implement.
I stand on the ground that braces should match according to indentation.
// This is right.
if (foo)
{
    // bar
}
else
{
    // baz
}

while (things)
{
    // stuff
}
As for your two examples, I'd consider yours slightly less readable, since finding the matching closing brace can be hard, but more readable in cases where indentation is incorrect, while making it easier to insert logic. It's not a huge difference.
Even if indentation is incorrect, the if statement will execute the next command, regardless of whether it's on the next line or not. The only reason for not putting both commands on the same line is for debugger support.
The one big advantage I see is that it's easier to add more statements to conditionals and loops that are braced, and it doesn't take many additional keystrokes to create the braces at the start.
My personal rule is if it's a very short 'if', then put it all on one line:
if(!something) doSomethingElse();
Generally I use this only when there are a bunch of ifs like this in succession.
if(something == a) doSomething(a);
if(something == b) doSomething(b);
if(something == c) doSomething(c);
That situation doesn't arise very often though, so otherwise, I always use braces.
At present, I work with a team that lives by this standard, and, while I'm resistant to it, I comply for uniformity's sake.
I object for the same reason I object to teams that forbid use of exceptions or templates or macros: If you choose to use a language, use the whole language. If the braces are optional in C and C++ and Java, mandating them by convention shows some fear of the language itself.
I understand the hazards described in other answers here, and I understand the yearning for uniformity, but I'm not sympathetic to language subsetting barring some strict technical reason, such as the only compiler for some environment not accommodating templates, or interaction with C code precluding broad use of exceptions.
Much of my day consists of reviewing changes submitted by junior programmers, and the common problems that arise have nothing to do with brace placement or statements winding up in the wrong place. The hazard is overstated. I'd rather spend my time focusing on more material problems than looking for violations of what the compiler would happily accept.
The only way coding standards can be followed well by a group of programmers is to keep the number of rules to a minimum.
Balance the benefit against the cost (every extra rule confounds and confuses programmers, and after a certain threshold, actually reduces the chance that programmers will follow any of the rules)
So, to make a coding standard:
Make sure you can justify every rule with clear evidence that it is better than the alternatives.
Look at alternatives to the rule - is it really needed? If all your programmers use whitespace (blank lines and indentation) well, an if statement is very easy to read, and there is no way that even a novice programmer can mistake a statement inside an "if" for a standalone statement. If you are getting lots of bugs related to if-scoping, the root cause is probably that you have a poor whitespace/indentation style that makes code unnecessarily difficult to read.
Prioritise your rules by their measurable effect on code quality. How many bugs can you avoid by enforcing a rule (e.g. "always check for null", "always validate parameters with an assert", "always write a unit test" versus "always add some braces even if they aren't needed"). The former rules will save you thousands of bugs a year. The brace rule might save you one. Maybe.
Keep the most effective rules, and discard the chaff. Chaff is, at a minimum, any rule that will cost you more to implement than any bugs that might occur by ignoring the rule. But probably if you have more than about 30 key rules, your programmers will ignore many of them, and your good intentions will be as dust.
Fire any programmer who comments out random bits of code without reading it :-)
P.S. My stance on bracing is: if the "if" statement and its contents are both single lines, then you may omit the braces. That means that if you have an if containing a one-line comment and a single line of code, the contents take two lines, and therefore braces are required. If the if condition spans two lines (even if the contents are a single line), then you need braces. This means you only omit braces in trivial, simple, easily read cases where mistakes are never made in practice. (When a statement is empty, I don't use braces, but I always put in a comment clearly stating that it is empty, and intentionally so. But that's bordering on a different topic: being explicit in code so that readers know you meant the scope to be empty, rather than that the phone rang and you forgot to finish the code.)
Many languages have syntax for one-liners like this (I'm thinking of Perl in particular) to deal with such "ugliness". So something like:
if (foo)
//bar
else
//baz
can be written using the ternary operator:
foo ? bar : baz
and
while (something is true)
{
blah
}
can be written as:
blah while(something is true)
However in languages that don't have this "sugar" I would definitely include the braces. Like you said it prevents needless bugs from creeping in and makes the intention of the programmer clearer.
I am not saying it is unreasonable, but in 15+ years of coding in C-like languages, I have not had a single problem with omitting the braces. Commenting out a branch sounds like a real problem in theory - I've just never seen it happen in practice.
Another advantage of always using braces is that it makes search-and-replace and similar automated operations easier.
For example: Suppose I notice that functionB is usually called immediately after functionA, with a similar pattern of arguments, and so I want to refactor that duplicated code into a new combined_function. A regex could easily handle this refactoring if you don't have a powerful enough refactoring tool (^\s+functionA.*?;\n\s+functionB.*?;) but, without braces, a simple regex approach could fail:
if (x)
    functionA(x);
else
    functionA(y);
    functionB();
would become
if (x)
    functionA(x);
else
    combined_function(y);
More complicated regexes would work in this particular case, but I've found it very handy to be able to use regex-based search-and-replace, one-off Perl scripts, and similar automated code maintenance, so I prefer a coding style that doesn't make that needlessly complicated.
I don't buy into your argument. Personally, I don't know anyone who's ever "accidentally" added a second line under an if. I would understand saying that nested if statements should have braces to avoid a dangling else, but as I see it you're enforcing a style due to a fear that, IMO, is misplaced.
Here are the unwritten (until now I suppose) rules I go by. I believe it provides readability without sacrificing correctness. It's based on a belief that the short form is in quite a few cases more readable than the long form.
Always use braces if any block of the if/else if/else statement has more than one line. Comments count, which means a comment anywhere in the conditional means all sections of the conditional get braces.
Optionally use braces when all blocks of the statement are exactly one line.
Never place the conditional statement on the same line as the condition. The line after the if statement is always conditionally executed.
If the conditional statement itself performs the necessary action, the form will be:
for (init; term; totalCount++)
{
    // Intentionally left blank
}
No need to standardize this in a verbose manner, when you can just say the following:
Never leave braces out at the expense of readability. When in doubt, choose to use braces.
I think the important thing about braces is that they very definitely express the intent of the programmer. You should not have to infer intent from indentation.
That said, I like the single-line returns and continues suggested by Gus. The intent is obvious, and it is cleaner and easier to read.
If you have the time to read through all of this, then you have the time to add extra braces.
I prefer adding braces to single-line conditionals for maintainability, but I can see how doing it without braces looks cleaner. It doesn't bother me, but some people could be turned off by the extra visual noise.
I can't offer a better counterargument either. Sorry! ;)
For things like this, I would recommend just coming up with a configuration template for your IDE's autoformatter. Then, whenever your users hit alt-shift-F (or whatever the keystroke is in your IDE of choice), the braces will be automatically added. Then just say to everyone: "go ahead and change your font coloring, PMD settings or whatever. Please don't change the indenting or auto-brace rules, though."
This takes advantage of the tools available to us to avoid arguing about something that really isn't worth the oxygen that's normally spent on it.
Depending on the language, braces around a single-line conditional or loop statement are not mandatory. In fact, I would remove them to have fewer lines of code.
C++:
Version 1:
class InvalidOperation{};
//...
Divide(10, 0);
//...
int Divide(int a, int b)
{
    if (b == 0) throw InvalidOperation();
    return a / b;
}
Version 2:
class InvalidOperation{};
//...
Divide(10, 0);
//...
int Divide(int a, int b)
{
    if (b == 0)
    {
        throw InvalidOperation();
    }
    return a / b;
}
C#:
Version 1:
foreach(string s in myList)
    Console.WriteLine(s);
Version2:
foreach(string s in myList)
{
    Console.WriteLine(s);
}
Depending on your perspective, version 1 or version 2 will be more readable. The answer is rather subjective.
Wow. NO ONE is aware of the dangling else problem? This is essentially THE reason for always using braces.
In a nutshell, you can get nasty, ambiguous-looking logic with else statements, especially when they're nested: the grammar binds each else to the nearest unmatched if, which is not always what the indentation suggests. It can be a huge problem to leave off braces if you don't know what you're doing.
Aesthetics and readability have nothing to do with it.

In Functional Programming, is it considered a bad practice to have incomplete pattern matchings

Is it generally considered bad practice to use non-exhaustive pattern matchings in functional languages like Haskell or F#, i.e. matchings whose cases don't cover all possible inputs?
In particular, should I allow code to fail with a MatchFailureException etc. or should I always cover all cases and explicitly throw an error if necessary?
Example:
let head (x::xs) = x
Or
let head list =
    match list with
    | x::xs -> x
    | _ -> failwith "Applying head to an empty list"
F# (unlike Haskell) gives a warning for the first code, since the []-case is not covered, but can I ignore it without breaking functional style conventions for the sake of succinctness? A MatchFailure does state the problem quite well after all ...
If you complete your pattern matchings with the constructor [] rather than the catch-all _, the compiler will have a chance to warn you to look at the function again the day someone adds a third constructor to the type being matched.
My colleagues and I, working on a large OCaml project (200,000+ lines), force ourselves to avoid partial-pattern-matching warnings (even if that means writing | ... -> assert false from time to time) and also to avoid so-called "fragile pattern matchings" (pattern matchings written in such a way that the addition of a constructor may not be detected). We consider that the maintainability benefit is worth it.
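The same idea in Haskell terms, as a hedged sketch with an invented Shape type (the warning flag assumed here is GHC's -Wincomplete-patterns):

{-# OPTIONS_GHC -Wincomplete-patterns #-}

data Shape = Circle Double | Square Double   -- imagine a Triangle constructor added later

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s

main :: IO ()
main = print (area (Circle 1.0))

Because there is no catch-all case, adding a Triangle constructor later makes the compiler flag area with an incomplete-pattern warning, whereas a wildcard would silently turn the omission into a runtime failure.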
Explicit is better than implicit (borrowed from the Zen of Python ;))
It's exactly the same as in a C switch over an enum: it's better to write all the cases (with a fall-through) rather than just putting a default, because the compiler will tell you if you add new elements to the enumeration and forget to handle them.
I think that it depends quite a bit on the context. Are you trying to write robust, easy to debug code, or are you trying to write something simple and succinct?
If I were working on a long term project with multiple developers, I'd put in the assert to give a more useful error message. I also agree with Pascal's comment that not using a wildcard would be ideal from a software engineering perspective.
If I were working on a smaller scale project on which I was the only developer, I wouldn't think twice about using an incomplete match. If necessary, you can always check the compiler warnings.
I think it also depends a bit on the types you're matching against. Realistically, no extra union cases will be added to the list type, so you don't need to worry about fragile matching. On the other hand, in code that you control and are actively working on, there may well be types which are in flux and have additional union cases added, which means that protecting against fragile matching may be worth it.
This is a special case of a more general question, which is "should you ever create partial functions". Incomplete pattern matches are only one example of partial functions.
As a rule, total functions are preferable. When you find yourself looking at a function that just has to be partial, ask yourself if you can solve the problem in the type system first. Sometimes that is more trouble than it's worth (e.g. creating a whole type of lists with known lengths just to avoid the "head []" problem), so it's a trade-off.
Or maybe you are just asking whether it's good practice in partial functions to say things like
head [] = error "head: empty list"
In which case the answer is YES!
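As an illustration of the type-system route mentioned above, the total version of head simply changes its result type (a sketch; this is essentially listToMaybe from Data.Maybe):

safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

main :: IO ()
main = do
  print (safeHead [1, 2, 3 :: Int])  -- Just 1
  print (safeHead ([] :: [Int]))     -- Nothing

Callers are then forced by the type to decide what an empty list should mean for them.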
The Haskell prelude (standard functions) contains many partial functions, e.g. head and tail only work on non-empty lists, but don't ask me why.
This question has two aspects.
For the user of the API, failwith ... simply throws a System.Exception, which is unspecific (and is therefore sometimes considered a bad practice in itself). On the other hand, the implicitly thrown MatchFailureException can be caught specifically using a type-test pattern, and is therefore preferable.
For the reviewer of the implementation code, failwith ... clearly documents that the implementer has at least given some thought to the possible cases, and is therefore preferable.
As the two aspects contradict each other, the right answer depends on the circumstances (see also kvb's answer). A solution which is 100% "correct" from any point of view would have to
deal with every case explicitly,
throw a specific exception where necessary, and
clearly document the exception
Example:
/// <summary>Gets the first element of the list.</summary>
/// <exception cref="ArgumentException">The list is empty.</exception>
let head list =
    match list with
    | [] -> invalidArg "list" "The list is empty."
    | x::xs -> x

Resources