Consider the following code snippet in Idris:
myList : List Int
myList = [
    1,
    2,
    3
]
The closing delimiter ] is on the same column as the declaration itself. I find this a quite natural way to want to format long, multi-line lists.
However, the equivalent snippet in Haskell fails to compile with a syntax error:
myList :: [Int]
myList = [
    1,
    2,
    3
]
>> main.hs:9:1: error:
>> parse error (possibly incorrect indentation or mismatched brackets)?
>> |
>> 9 | ]
>> | ^
Haskell instead requires that the closing delimiter ] be placed in a column strictly greater than the one where the declaration starts. Or at least, as far as I can tell, this seems to be what is going on.
Is there a reason Haskell doesn't like this syntax? I know there are some subtle interactions between the Haskell parser and lexer that implement Haskell's off-side rule, so perhaps it has something to do with that.
Well, ultimately the answer is just “because the Haskell language standard demands it to be parsed this way”: a line that starts in the same column as the enclosing layout block begins a new declaration, so a ] in column 1 is read as the start of the next top-level declaration rather than as the end of the list.
As for why this is a good idea: indentation is the primary way code is structured, and parentheses/brackets only come in locally. I find this much more consistent than Python's attitude that indentation is kind of the primary structure, but for an expression to spread over multiple lines you actually need to wrap it in parentheses. (Not saying that these are the only two ways it could be done.)
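As a minimal sketch of what the layout rule does accept: indenting the closing bracket by even a single space keeps it inside the declaration. This is only an illustration of the rule, not a style I would recommend.

```haskell
myList :: [Int]
myList = [
    1,
    2,
    3
 ]  -- one space of indentation: column 2 is greater than the declaration's column 1
```

Loading this in GHCi and evaluating myList gives [1,2,3].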
Note that if you really want, you can always disable the indentation sensitivity completely, with something like
myList :: [Int]
myList = l where {
l = [
1,
2,
3
]}
But I would not recommend it. The preferred style to write multiline lists is
myList
  = [ 1
    , 2
    , 3
    ]
or
myList = [ 1
         , 2
         , 3 ]
Again, I would argue that this leading-comma style is much preferable to the trailing-comma one most programmers in other languages use, especially for nested lists: the commas become “bullet points” aligned with the opening bracket, which makes the AST structure very clear.
myMonstrosity :: [(Int, [([Int], Int)])]
myMonstrosity
   = [ ( 1
       , [ ( [37,43]
           , 9 )
         , ( [768,4,9807,3,4,98]
           , 15 ) ]
       )
     , ( 2, [] )
     , ( 3
       , [ ( [], 300 )
         , ( [0..4000], -5 ) ]
       )
     ]
Context
I'm trying to generate a parser for BCP47 Language-Tag values, which are specified in ABNF (Augmented Backus–Naur form). I'm doing this in Haskell and would like to use the robust BNFC tool-chain, which expects LBNF (Labeled Backus–Naur form). I've searched for tooling to do this conversion automatically and could find none, so I'm basically attempting to write an LBNF for it using the ABNF as reference.
Attempted so far
I've done a lot of searching, and I think this question may be useful, but I can't get bnfc to accept any use of ε; it always reports a syntax error at that character. For example,
Convert every option [ E ] to a fresh non-terminal X and add
X = ε | E.
-- ABNF option:
-- foo = [ E ]
-- Fresh X
Foo. Foo ::= X ;
-- add
X. X ::= ε | E ;
E. E ::= "e" ;
syntax error at line 8, column 10 due to lexer error
Giving up on that, I tried to get something even simpler working:
language = 2*ALPHA
I could not.
I've seen some BNF documentation (sorry I lost the link now) with an example for digits that looked like:
number ::= digit
number ::= number digit
This makes sense to me, so I tried the following:
LanguageISO2. Language ::= ALPHA ALPHA ;
token ALPHA ( letter ) ;
This fails to parse "en", but does parse "e n". It's clear why, but what is the right way to do what I'm intending?
I can make things kind of work by abusing token,
LanguageISO2. Language ::= ALPHA_TWO ;
token ALPHA_TWO ( letter letter ) ;
But this will quickly get out of hand as I handle 3*ALPHA and 5*8ALPHA, etc.
Specific Question
Could someone convert the following to LBNF so I can see the right approach to these things?
langtag = (language
["-" script]
["-" region]
*("-" variant))
language = (2*3ALPHA [ extlang ])
extlang = *3("-" 3ALPHA) ; reserved for future use
script = 4ALPHA ; ISO 15924 code
region = 2ALPHA ; ISO 3166 code
/ 3DIGIT ; UN M.49 code
variant = 5*8alphanum ; registered variants
/ (DIGIT 3alphanum)
alphanum = (ALPHA / DIGIT) ; letters and numbers
Thanks very much in advance.
Below is an Alloy model representing this set of integers: {0, 2, 4, 6}
As you know, the plus symbol (+) denotes set union. How can 0 be unioned to 2? 0 and 2 are not sets. I thought the union operator applies only to sets? Isn't this violating a basic notion of set union?
Second question: Is there a better way to model this, one that is less cognitively jarring?
one sig List {
numbers: set Int
} {
numbers = 0 + 2 + 4 + 6
}
In Alloy, everything you work with is a set of tuples. none is the empty set, and many sets are relations (sets of tuples with arity > 1). Each integer you use is likewise a set: a relation of arity 1 and cardinality 1. That is, when you write 1 in Alloy it is really {(1)}, a set containing the single tuple that holds the atom 1. The definition is in effect something like:
enum Int {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7}
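To make the "every scalar is a singleton set" reading concrete, here is a rough sketch in Haskell (not Alloy), assuming Data.Set from the containers package. The names Rel, lit and .+ are made up for this illustration; Alloy itself defines nothing of the sort.

```haskell
import qualified Data.Set as Set

-- A unary relation, modelled here as a plain set of integer atoms.
type Rel = Set.Set Int

-- In this model an integer literal denotes a singleton set, i.e. 1 means {(1)}.
lit :: Int -> Rel
lit = Set.singleton

-- Stand-in for Alloy's + operator: ordinary set union.
infixl 6 .+
(.+) :: Rel -> Rel -> Rel
(.+) = Set.union

-- Counterpart of the constraint: numbers = 0 + 2 + 4 + 6
numbers :: Rel
numbers = lit 0 .+ lit 2 .+ lit 4 .+ lit 6

numbersAsList :: [Int]
numbersAsList = Set.toList numbers
```

Read this way, 0 + 2 is a union of two singleton sets, so nothing about set union is being violated; numbersAsList evaluates to [0,2,4,6].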
Ints in Alloy are just not very good integers :-( The finite set of atoms is normally not a problem, but with Ints there are just too few of them to be really useful. Worse, they quickly overflow, and Alloy is not good at handling this at all.
But I do agree it looks ugly. I have an even worse problem with seq.
0->A + 1->B + 2->C + 3->C
I already experimented with adding literal seq to Alloy and got an experimental version running. Maybe sets could also be implemented this way:
// does not work in Alloy 4
seq [ A, B, C, C ] = 0->A + 1->B + 2->C + 3->C
set [ 1, 2, 3, 3 ] = 1+2+3
Today you could do this:
let x[a , b ] = { a + b }
run {
x[1,x[2,x[3,4]]] = 1+2+3+4
} for 4 int
But I'm not sure I like this any better. If macros had meta fields or made the arguments available as a sequence (as most interpreters do), then we could do this:
// does not work in Alloy 4
let list[ args ... ] = Int.args // args = seq univ
run {
range[ list[1,2,3,4,4] ] = 1+2+3+4
}
If you like the seq [ A, B, C, C ] syntax or the varargs then start a thread on the AlloyTools list. As said, I got the seq [ A, B, C, C ] working in a prototype.
The documentation for Parsec.Expr.buildExpressionParser says:
Prefix and postfix operators of the same precedence can only occur
once (i.e. --2 is not allowed if - is prefix negate).
and indeed, this is biting me, since the language I am trying to parse allows arbitrary repetition of its prefix and postfix operators (think of a C expression like **a[1][2]).
So, why does Parsec make this restriction, and how can I work around it?
I think I can move my prefix/postfix parsers down into the term parser since they have the highest precedence.
i.e.
**a + 1
is parsed as
(*(*(a)))+(1)
but what could I have done if I wanted it to parse as
*(*((a)+(1)))
If buildExpressionParser did what I want, I could simply have rearranged the order of the operators in the table.
Note: see here for a better solution.
I solved it myself by using chainl1:
prefix p = Prefix . chainl1 p $ return (.)
postfix p = Postfix . chainl1 p $ return (flip (.))
These combinators use chainl1 with an op parser that always succeeds, and simply composes the functions returned by the term parser in left-to-right or right-to-left order. These can be used in the buildExpressionParser table; where you would have done this:
exprTable = [ [ Postfix subscr
              , Postfix dot
              ]
            , [ Prefix pos
              , Prefix neg
              ]
            ]
you now do this:
exprTable = [ [ postfix $ choice [ subscr
                                 , dot
                                 ]
              ]
            , [ prefix $ choice [ pos
                                , neg
                                ]
              ]
            ]
In this way, buildExpressionParser can still be used to set operator precedence, but now only sees a single Prefix or Postfix operator at each precedence. However, that operator has the ability to slurp up as many copies of itself as it can, and return a function which makes it look as if there were only a single operator.
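For a self-contained picture of how these two combinators behave, here is a small sketch, assuming the parsec package (Text.Parsec) is available. The Expr type, the operator parsers and parseMaybe are hypothetical stand-ins invented for this example; only prefix and postfix are taken from the answer above.

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String (Parser)

-- Hypothetical toy AST, only for this illustration.
data Expr
  = Lit Integer
  | Neg Expr
  | Deref Expr
  | Index Expr Integer
  | Add Expr Expr
  deriving (Eq, Show)

-- Collapse a run of prefix operators into one Prefix entry.
-- The first operator parsed ends up outermost.
prefix :: Parser (a -> a) -> Operator String () Identity a
prefix p = Prefix . chainl1 p $ return (.)

-- Collapse a run of postfix operators into one Postfix entry.
-- The first operator parsed is applied first (innermost).
postfix :: Parser (a -> a) -> Operator String () Identity a
postfix p = Postfix . chainl1 p $ return (flip (.))

-- "[n]" as a postfix subscript operator.
subscr :: Parser (Expr -> Expr)
subscr = do
  n <- between (char '[') (char ']') (many1 digit)
  return (\e -> Index e (read n))

-- Highest precedence first: subscripts, then prefix operators, then addition.
table :: [[Operator String () Identity Expr]]
table =
  [ [ postfix subscr ]
  , [ prefix (choice [Neg <$ char '-', Deref <$ char '*']) ]
  , [ Infix (Add <$ char '+') AssocLeft ]
  ]

term :: Parser Expr
term = Lit . read <$> many1 digit

expr :: Parser Expr
expr = buildExpressionParser table term

-- Parse a whole string (no whitespace handling); Nothing on any parse error.
parseMaybe :: String -> Maybe Expr
parseMaybe = either (const Nothing) Just . parse (expr <* eof) "<input>"
```

With this table, parseMaybe "--1+2" yields Just (Add (Neg (Neg (Lit 1))) (Lit 2)), so the repeated prefix operator is accepted even though buildExpressionParser sees only one Prefix entry at that level.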
How can I convert a list to a string in Erlang?
My list looks like this:
[{{19,59,51},{2011,1,14},"fff"},{{19,59,47},{2011,1,14},"ASDfff"}]
Thank you.
A very simple thing would be
List = [{{19,59,51},{2011,1,14},"fff"},
{{19,59,47},{2011,1,14},"ASDfff"}],
IOList = io_lib:format("~w", [List]),
FlatList = lists:flatten(IOList),
but as these appear to be timestamps which you may want to be formatted in a better way, something like
FormattedIOLists =
[ io_lib:format("~4..0B-~2..0B-~2..0B ~2..0B:~2..0B:~2..0B ~s",
[YYYY,M,D, HH,MM,SS, Comment])
|| {{HH,MM,SS},{YYYY,M,D},Comment} <- List ],
FormattedFlatLists =
[ lists:flatten(io_lib:format("~4..0B-~2..0B-~2..0B ~2..0B:~2..0B:~2..0B ~s",
[YYYY,M,D, HH,MM,SS, Comment]))
|| {{HH,MM,SS},{YYYY,M,D},Comment} <- List ],
could fit your bill better.
For quick and dirty interactive output on the shell,
9> [ io:format("~4..0B-~2..0B-~2..0B ~2..0B:~2..0B:~2..0B ~s~n", [YYYY,M,D, HH,MM,SS, Comment]) || {{HH,MM,SS},{YYYY,M,D},Comment} <- List ].
2011-01-14 19:59:51 fff
2011-01-14 19:59:47 ASDfff
[ok,ok]
10> lists:foreach(fun({{HH,MM,SS},{YYYY,M,D},Comment}) -> io:format("~4..0B-~2..0B-~2..0B ~2..0B:~2..0B:~2..0B ~s~n", [YYYY,M,D, HH,MM,SS, Comment]) end, List).
2011-01-14 19:59:51 fff
2011-01-14 19:59:47 ASDfff
11>
Note that in most cases building recursive lists of lists (iolists) is a much better thing to do than flattening those iolists. Most output functions directly accept iolists for output data, so you gain nothing by flattening the lists before the actual output happens.
Maybe just:
io_lib:format("~w", [[{{19,59,51},{2011,1,14},"fff"},{{19,59,47},{2011,1,14},"ASDfff"}]]).
Usually, I will use the :: primitive thus:
SomeVariable"_ :: ] DefaultValue
I'm looking for a way to wrap that ugly SOB. I'm trying to reason it. Normally, it would be with a tacit definition. This, for example:
default =: 13 : 'x"_ :: ] y'
fails miserably. Because, of course, in this context:
SomeVariable default DefaultValue
if SomeVariable doesn't exist, J will throw a valence error.
So, how can you wrap ::?
You can indeed wrap :: but if you want to give it a verb argument, you need to deal with the syntactic issues.
For example, you can use an adverb:
fault=:1 :0
u"_ :: ]
)
Or you could convert the verb you are manipulating into a gerund and pass that in (but that would be ugly, so I do not think you want that).
I use,
ORdef_z_ =: ".@[^:(_1< 4!:0@<@[)
'asd' ORdef 3 NB. asd not assigned, returns right.
3
asd =. 'asd' ORdef 3
asd=.'asd' ORdef 22 NB. will return 3 due to previous assignment
asd
3