I'd just like to clarify my understanding of how boolean operator precedence is applied in expressions. I'll give an example using the following order of precedence:
1) NOT
2) AND
3) OR
Given the expression A NOT B AND C, I know that NOT will be evaluated first. However, will it be applied to the rest of the expression (B AND C), or just to the next token, B?
My current understanding is that it is the former, so the expression should be evaluated as A NOT (B AND C). Is this correct?
Thanks
There is no universally agreed order of precedence for boolean operators, so it will depend on who or what is doing the parsing. For any given computer language this should be well defined in the documentation for the language. Also note that NOT is a unary operator (with right-to-left associativity), so your example above is not really a valid boolean expression.
It could be argued that since boolean operators have analogues in ordinary arithmetic, that we should use an analogous order of precedence. In that case NOT (analogous to unary negation) would have the highest precedence, then AND (multiplication), then OR (addition).
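Haskell, for example, follows exactly this analogy: not is a plain function (and function application binds tighter than any operator), && has precedence 3, and || has precedence 2, so NOT binds tighter than AND, which binds tighter than OR. A quick GHCi sketch:
Prelude> not True && False || True
True
Prelude> ((not True) && False) || True   -- the same expression, fully parenthesized
True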
Related: How can I find out the type of operator "+"? says operator + isn't a function.
Which does an operator such as + behave more like,
a curried function, or
a function whose argument has a pair tuple type?
By an operator, say #, I mean # and not (#). (#) is a curried function, and if # behaves like a curried function, I guess there is no need to have (#), is there?
Thanks.
According to the Haskell 2010 report:
An operator symbol [not starting with a colon] is an ordinary identifier.¹
Hence, + is an identifier. Identifiers can identify different things, and in the case of +, it identifies a function from the Num typeclass. That function behaves like any other Haskell function, except for the different parsing rules for expressions involving such functions.
Specifically, this is described in 3.2.
An operator is a function that can be applied using infix syntax (Section 3.4), or partially applied using a section (Section 3.5).
In order to determine what constitutes an operator, we can read further:
An operator is either an operator symbol, such as + or $$, or is an ordinary identifier enclosed in grave accents (backquotes), such as `op`.²
So, to sum up:
+ is an identifier. It's an operator symbol, because it doesn't start with a colon and only uses non-alphanumeric characters. When used in an expression (parsed as a part of an expression), we treat it as an operator, which in turn means it's a function.
To answer your question specifically, everything you said involves just syntax. The need for () in expressions is merely to enable prefix application. In all other cases where such disambiguation isn't necessary (such as :type), it might have simply been easier to implement that way, because you can then just expect ordinary identifiers and push the burden to provide one to the user.
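A quick GHCi illustration of the three spellings (add is a hypothetical helper defined on the spot):
Prelude> 2 + 3            -- operator symbol used infix
5
Prelude> (+) 2 3          -- parentheses enable prefix application
5
Prelude> let add x y = x + y
Prelude> 2 `add` 3        -- backticks turn an ordinary identifier into an operator
5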
¹ For what it's worth, I think the report is actually misleading here. This specific statement seems to be in conflict with other parts, crucially this one:
Dually, an operator symbol can be converted to an ordinary identifier by enclosing it in parentheses.
My understanding is that in the first quote, the context for "ordinary" means that it's not a type constructor operator but a value-category operator, hence "value category" means "ordinary". In other quotes, "ordinary" is used for identifiers which are not operator identifiers, the difference being, obviously, how they are applied. That would corroborate the fact that by enclosing an operator identifier in parens, we turn it into an ordinary identifier for the purposes of prefix application. Phew, at least I didn't write that report ;)
² I'd like to point out one additional thing here. Neither `(+)` nor (`add`) actually parses. The second one is understandable: since the report specifically says that enclosing in parens only works for operator identifiers, one can see that `add`, while being an operator, isn't an operator identifier like +.
The first case is actually a bit more tricky for me. Since we can obtain an operator by enclosing an ordinary identifier in backticks, (+) isn't exactly "as ordinary" as add. The language, or at least GHC parser that I tested this with, seems to differentiate between "ordinary ordinary" identifiers, and ordinary identifiers obtained by enclosing operator symbols with parens. Whether this actually contradicts the spec or is another case of mixed naming is beyond my knowledge at this point.
(+) and + refer to the same thing, a function of type Num a => a -> a -> a. We spell it as (+) when we want to use it prefix, and + when we want to write it infix. There is no essential difference; it is just a matter of syntax.
So + (without section) behaves just like a function which requires a pair argument [because a section is needed for partial application]. () works like currying.
While this is not unreasonable from a syntactical point of view, I wouldn't describe it like that. Uncurried functions in Haskell take a tuple as an argument (say, (a, a) -> a as opposed to a -> a -> a), and there isn't an actual tuple anywhere here. 2 + 3 is better thought of as an abbreviation for (+) 2 3, the key point being that the function being used is the same in both cases.
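To make the contrast concrete (addPair is an illustrative name), here is a genuinely uncurried function next to the curried (+):
Prelude> let addPair (x, y) = x + y   -- uncurried: takes an actual tuple
Prelude> addPair (2, 3)
5
Prelude> (+) 2 3                      -- curried: no tuple anywhere
5
Prelude> ((+) 2) 3                    -- partial application works directly
5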
One big reason prefix operators are nice is that they can avoid the need for parentheses, so that + - 10 1 2 unambiguously means (10 - 1) + 2. The infix expression becomes ambiguous if parens are dropped; you can do away with the ambiguity by having certain precedence rules, but that's messy, blah, blah, blah.
I'd like to make Haskell use prefix operations but the only way I've seen to do that sort of trades away the gains made by getting rid of parentheses.
Prelude> (-) 10 1
uses two parens.
It gets even worse when you try to compose functions because
Prelude> (+) (-) 10 1 2
yields an error, I believe because it's trying to feed the minus operation into the plus operation rather than first evaluating the minus and then feeding the result in--so now you need even more parens!
Is there a way to make Haskell intelligently evaluate prefix notation? I think if I made functions like
Prelude> let p x y = x+y
Prelude> let m x y = x-y
I would recover the initial gains on fewer parens but function composition would still be a problem. If there's a clever way to join this with $ notation to make it behave at least close to how I want, I'm not seeing it. If there's a totally different strategy available I'd appreciate hearing it.
I tried reproducing what the accepted answer did here:
Haskell: get rid of parentheses in liftM2
but in both a Prelude console and a Haskell script, the import command didn't work. And besides, this is more advanced Haskell than I'm able to understand, so I was hoping there might be some other simpler solution anyway before I do the heavy lifting to investigate whatever this is doing.
It gets even worse when you try to compose functions because
Prelude> (+) (-) 10 1 2
yields an error, I believe because it's trying to feed the minus operation into the plus operation rather than first evaluating the minus and then feeding the result in--so now you need even more parens!
Here you raise exactly the key issue that's a blocker for getting what you want in Haskell.
The prefix notation you're talking about is unambiguous for basic arithmetic operations (more generally, for any set of functions of statically known arity). But you have to know that + and - each accept 2 arguments for + - 10 1 2 to be unambiguously resolved as +(-(10, 1), 2) (where I've used explicit argument lists to denote every call).
But ignoring the specific meaning of + and -, the first function taking the second function as an argument is a perfectly reasonable interpretation! For Haskell, rather than just arithmetic, we need to support higher-order functions like map. You would want not not x to turn into not(not(x)), but map not x has to turn into map(not, x).
And what if I had f g x? How is that supposed to work? Do I need to know what f and g are bound to so that I know whether it's a case like not not x or a case like map not x, just to know how to parse the call structure? Even assuming I have all the code available to inspect, how am I supposed to figure out what things are bound to if I can't know what the call structure of any expression is?
You'd end up needing to invent disambiguation syntax like map (not) x, wrapping not in parentheses to disable its ability to act like an arity-1 function (much like Haskell's actual syntax lets you wrap operators in parentheses to disable their ability to act like infix operators). Or use the fact that all Haskell functions are arity-1, but then you have to write (map not) x and your arithmetic example has to look like (+ ((- 10) 1)) 2. Back to the parentheses!
The truth is that the prefix notation you're proposing isn't unambiguous. Haskell's normal function syntax (without operators) is; the rule is you always interpret a sequence of terms like foo bar baz qux etc as ((((foo) bar) baz) qux) etc (where each of foo, bar, etc can be an identifier or a sub-term in parentheses). You use parentheses not to disambiguate that rule, but to group terms to impose a different call structure than that hard rule would give you.
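For instance, under that one hard rule the map and not examples above need no special knowledge about the names involved (a GHCi sketch):
Prelude> map not [True, False]   -- parsed as ((map not) [True, False])
[False,True]
Prelude> not (not True)          -- parentheses impose the other grouping
True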
Infix operators do complicate that rule, and they are ambiguous without knowing something about the operators involved (their precedence and associativity, which unlike arity is associated with the name not the actual value referred to). Those complications were added to help make the code easier to understand; particularly for the arithmetic conventions most programmers are already familiar with (that + is lower precedence than *, etc).
If you don't like the additional burden of having to memorise the precedence and associativity of operators (not an unreasonable position), you are free to use a notation that is unambiguous without needing precedence rules, but it has to be Haskell's prefix notation, not Polish prefix notation. And whatever syntactic convention you're using, in any language, you'll always have to use something like parentheses to indicate grouping where the call structure you need is different from what the standard convention would indicate. So:
(+) ((-) 10 1) 2
Or:
plus (minus 10 1) 2
if you define non-operator function names.
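For instance, both spellings evaluate the same way (a GHCi sketch; plus and minus are illustrative names):
Prelude> (+) ((-) 10 1) 2
11
Prelude> let plus = (+)
Prelude> let minus = (-)
Prelude> plus (minus 10 1) 2
11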
I'm trying to understand the Haskell 2010 Report section 3.17.2 "Informal Semantics of Pattern Matching". Most of it, relating to a pattern match succeeding or failing seems straightforward, however I'm having difficulty understanding the case which is described as the pattern match "diverging".
I'm semi-persuaded it means that the match algorithm does not "converge" to an answer (hence the match function never returns). But if it doesn't return, then how can it return a value, as suggested by the parenthetical "i.e. return ⊥"? And what does it mean to "return ⊥" anyway? How does one handle that outcome?
Item 5 has the particularly confusing (to me) point "If the value is ⊥, the match diverges". Is this just saying that a value of ⊥ produces a match result of ⊥? (Setting aside that I don't know what that outcome means!)
Any illumination, possibly with an example, would be appreciated!
Addendum after a couple of lengthy answers:
Thanks Tikhon and all for your efforts.
It seems my confusion comes from there being two different realms of explanation: The realm of Haskell features and behaviors, and the realm of mathematics/semantics, and in Haskell literature these two are intermingled in an attempt to explain the former in terms of the latter, without sufficient signposts (to me) as to which elements belong to which.
Evidently "bottom" ⊥ is in the semantics domain, and does not exist as a value within Haskell (i.e. you can't type it in, and you never get a result that prints out as "⊥").
So, where the explanation says a function "returns ⊥", this refers to a function that does any of a number of inconvenient things, like not terminating, throwing an exception, or returning undefined. Is that right?
Further, those who commented that ⊥ actually is a value that can be passed around are really thinking of bindings to expressions that haven't yet actually been evaluated ("unexploded bombs" so to speak) and might never be, due to laziness, right?
The value is ⊥, usually pronounced "bottom". It is a value in the semantic sense--it is not a normal Haskell value per se. It represents computations that do not produce a normal Haskell value: exceptions and infinite loops, for example.
Semantics is about defining the "meaning" of a program. In Haskell, we usually talk about denotational semantics, where the value is a mathematical object of some sort. The most trivial example would be that the expressions 10 and 9 + 1 both have the number 10 as their denotation (rather than the Haskell value 10). We usually write ⟦9 + 1⟧ = 10, meaning that the denotation of the Haskell expression 9 + 1 is the number 10.
However, what do we do with an expression like let x = x in x? There is no Haskell value for this expression. If you tried to evaluate it, it would simply never finish. Moreover, it is not obvious what mathematical object this corresponds to. However, in order to reason about programs, we need to give some denotation for it. So, essentially, we just make up a value for all these computations, and we call the value ⊥ (bottom).
So ⊥ is just a way to define what a computation that doesn't return "means".
We also define other computations like undefined and error "some message" as ⊥ because they also do not have obvious normal values. So throwing an exception corresponds to ⊥. This is exactly what happens with a failed pattern match.
The usual way of thinking about this is that every Haskell type is "lifted"--it contains ⊥. That is, Bool corresponds to {⊥, True, False} rather than just {True, False}. This represents the fact that Haskell programs are not guaranteed to terminate and can have exceptions. This is also true when you define your own type--the type contains every value you defined for it as well as ⊥.
Interestingly, since Haskell is non-strict, ⊥ can exist in normal code. So you could have a value like Just ⊥, and if you never evaluate it, everything will work fine. A good example of this is const: const 1 ⊥ evaluates to 1. This works for failed pattern matches as well:
const 1 (let Just x = Nothing in x) -- 1
You should read the section on denotational semantics in the Haskell WikiBook. It's a very approachable introduction to the subject, which I personally find very fascinating.
Denotational semantics
So, briefly: denotational semantics, which is where ⊥ lives, is a mapping from Haskell expressions to some other space of values. You do this to give meaning to programs in a more formal manner than just talking about what programs should do—you say that they must respect their denotational semantics.
So for Haskell, you often think about how Haskell expressions denote mathematical values. You often see Strachey brackets ⟦·⟧ to denote the "semantic mapping" from Haskell to Math. Finally, we want our semantic brackets to be compatible with semantic operations. For instance
⟦x + y⟧ = ⟦x⟧ + ⟦y⟧
where on the left side + is the Haskell function (+) :: Num a => a -> a -> a and on the right side it's the binary operation in a commutative group. This is cool, because then we know that we can use the properties from the semantic map to know how our Haskell functions should work. To wit, let's write the commutative property "in Math":
⟦x⟧ + ⟦y⟧ == ⟦y⟧ + ⟦x⟧
= ⟦x + y⟧ == ⟦y + x⟧
= ⟦x + y == y + x⟧
where the third step also indicates that the Haskell (==) :: Eq a => a -> a -> a ought to have the properties of a mathematical equivalence relationship.
Well, except...
Anyway, that's all well and good until we remember that computers are finite things and Maths don't much care about that (unless you're using intuitionistic logic, and then you get Coq). So, we have to take note of places where our semantics don't follow Math quite right. Here are three examples
⟦undefined⟧ = ??
⟦error "undefined"⟧ = ??
⟦let x = x in x⟧ = ??
This is where ⊥ comes into play. We just assert that, so far as the denotational semantics of Haskell are concerned, each of those examples might as well mean (the newly introduced mathematical/semantic concept of) ⊥. What are the mathematical properties of ⊥? Well, this is where we start to really dive into what the semantic domain is and start talking about monotonicity of functions and CPOs and the like. Essentially, though, ⊥ is a mathematical object which plays roughly the same game as non-termination does. From the point of view of the semantic model, ⊥ is toxic: it infects expressions with its toxic non-termination.
But it's not a Haskell-the-language concept, just a Semantic-domain-of-the-language-Haskell thing. In Haskell we have undefined, error and infinite looping. This is important.
Extra-semantic behavior (side note)
So the semantics of ⟦undefined⟧ = ⟦error "undefined"⟧ = ⟦let x = x in x⟧ = ⊥ are clear once we understand the mathematical meanings of ⊥, but it's also clear that those each have different effects "in reality". This is sort of like "undefined behavior" of C... it's behavior that's undefined so far as the semantic domain is concerned. You might call it semantically unobservable.
So how does pattern matching return ⊥?
So what does it mean "semantically" to return ⊥? Well, ⊥ is a perfectly valid semantic value which has the infection property, modeling non-termination (or asynchronous error throwing). From the semantic point of view it's a value like any other, and it can be returned as is.
From the implementation point of view, you have a number of choices, each of which map to the same semantic value. undefined isn't quite right, nor is entering an infinite loop, so if you're going to pick a semantically undefined behavior you might as well pick one that's useful and throw an error
*** Exception: <interactive>:2:5-14: Non-exhaustive patterns in function cheers
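For example, a definition like the following (a plausible reconstruction of the cheers behind the message above) throws exactly that exception once the incomplete match is forced:
Prelude> let cheers (Just name) = "Cheers, " ++ name
Prelude> cheers (Just "world")
"Cheers, world"
Prelude> cheers Nothing
*** Exception: <interactive>:2:5-14: Non-exhaustive patterns in function cheers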
In programming language textbooks, we are always told how each operator in a given language has either left or right associativity. It seems that associativity is a fundamental property of any operator, regardless of the number of operands it takes. It also seems to me that we can assign any associativity to any operator regardless of how we assign associativity to other operators.
But why is it the case? Perhaps an example is better. Suppose I want to design a hypothetical programming language. Is it valid to assign associativity to these operators in this arbitrary way (all having the same precedence):
unary operator:
! right associative
binary operators:
+ left associative
- right associative
* left associative
/ right associative
! + - * / are my 5 operators all having the same precedence.
If yes, how would an expression like 2+2!3+5*6/3-5!3!3-3*2 be parenthesized by my hypothetical parser? And why?
EDIT:
The first example (2+2!3+5*6/3-5!3!3-3*2) is incorrect. Perhaps forget about the unary op, and let me put it this way: can we assign different associativities to operators having the same precedence, the way I did above? If yes, how would an example, say 2+3-4*5/3+2, be evaluated? Because most programming languages seem to assign the same associativity to all operators having the same precedence. But we always talk about OPERATOR ASSOCIATIVITY as if it is a property of an individual operator - not a property of a precedence level.
Let us remember what associativity means. Take any operator, say #. Its associativity, as we all know, is the rule that disambiguates expressions of the form a # b # c: if # is left associative, it's parsed as (a # b) # c; if it's right associative, a # (b # c). It could also be nonassociative, in which case a # b # c is a syntax error.
What if we have two different operators, say # and @? If one is of higher precedence than the other, there's nothing more to say, no work for associativity to do; precedence takes care of the disambiguation. However, if they are of equal precedence, we need associativity to help us. There are three easy cases:
If both operators are left associative, a # b @ c means (a # b) @ c.
If both operators are right associative, a # b @ c means a # (b @ c).
If both operators are nonassociative, then a # b @ c is a syntax error.
In the remaining cases, the operators do not agree about associativity. Which operator's choice gets precedence? You could probably devise such associativity-precedence rules, but I think the most natural rule to impose is to declare any such case a syntax error. After all, if two operators are of equal precedence, why should one have associativity-precedence over the other?
Under the natural rule I just gave, your example expression is a syntax error.
Now, we could certainly assign differing associativities to operators of the same precedence. However, this would mean that there are combinations of operators of equal precedence (such as your example!) that are syntax errors. Most language designers seem to prefer to avoid that and assign the same associativity to all operators of equal precedence; that way, all combinations are legal. It's just aesthetics, I think.
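Haskell is one language that takes exactly this approach: mixing operators of equal precedence but different associativity in a single expression is rejected. A minimal sketch (# and % are illustrative operator names):
-- Fixity.hs
infixl 6 #
infixr 6 %

(#), (%) :: Int -> Int -> Int
a # b = a + b
a % b = a - b

-- ghci> 1 # 2 % 3
-- produces an error along the lines of:
--   Precedence parsing error
--     cannot mix '#' [infixl 6] and '%' [infixr 6]
--     in the same infix expression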
You have to define associativity somehow, and most languages choose to assign associativity (and precedence) "naturally" -- to match the rules of common mathematics.
There are notable exceptions, however -- APL has strict right-to-left associativity, with all operators at the same precedence level.
I'm wondering why Haskell doesn't have a single-element tuple. Is it just because nobody has needed it so far, or are there rational reasons? I found an interesting thread in a comment at the Real World Haskell website, http://book.realworldhaskell.org/read/types-and-functions.html#funcstypes.composite, where people guessed various reasons, like:
No good syntax sugar.
It is useless.
You can think that a normal value like (1) is actually a single element tuple.
But does anyone know the reason except a guess?
There's a lib for that!
http://hackage.haskell.org/packages/archive/OneTuple/0.2.1/doc/html/Data-Tuple-OneTuple.html
Actually, we have a OneTuple we use all the time. It's called Identity, and is now used as the base of standard pure monads in the new mtl:
http://hackage.haskell.org/packages/archive/transformers/0.2.2.0/doc/html/Data-Functor-Identity.html
And it has an important use! By virtue of providing a type constructor of kind * -> *, it can be made an instance (a trivial one, granted, though not the most trivial) of Monad, Functor, etc., which lets us use it as a base for transformer stacks.
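A small sketch of Identity doing one-tuple duty (it wraps exactly one value, yet is a Functor and a Monad):
import Data.Functor.Identity

example :: Identity Int
example = do
  x <- Identity 20
  y <- Identity 22
  return (x + y)

main :: IO ()
main = print (runIdentity example)  -- prints 42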
The exact reason is that it's totally unnecessary. Why would you need a one-tuple if you can just have its value?
The syntax also tends to be a bit clunky. In Python, you can have one-tuples, but you need a trailing comma to distinguish it from a parenthesized expression:
onetuple = (3,)
All in all, there's no reason for it. I'm sure there's no "official" reason: the designers of Haskell probably never even considered a single-element tuple, since it has no use.
I don't know if you were looking for some reasons beyond the obvious, but in this case the obvious answer is the right one.
My answer is not exactly about Haskell semantics, but about the theoretical mathematical elegance of making a value the same as its one-tuple. (So this answer should not be taken as an explanation of the standard behavior expected of a Haskell implementation, because it isn't intended as such.)
In programming languages and computation models where all functions are curried, such as lambda-calculus and combinatory logic, every function has exactly one input argument and one output/return value. No more, no less.
When we want a particular function f to have more than one input argument – say 3 – we simulate it under this curried regime by creating a 1-argument function that returns a 2-argument function. Thus, f x y z = ((f x) y) z, and f x would return a 2-argument function.
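In Haskell this simulation is the default; a GHCi sketch (f and g are illustrative names):
Prelude> let f x y z = (x + y) * z
Prelude> let g = f 1 2    -- partial application: g is a 1-argument function
Prelude> g 10
30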
Likewise, sometimes we might want to return more than one value from a function. It is not literally possible under this semantics, but we can simulate it by returning a tuple. We can generalize this.
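For example, a Haskell function that conceptually returns two values actually returns one pair (divAndMod is an illustrative name):
Prelude> let divAndMod a b = (a `div` b, a `mod` b)
Prelude> divAndMod 17 5
(3,2)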
If, for uniformity, we constrain the only return value of any function to be an (n-)tuple, we are able to harmonize some interesting features of the unit value and of supposedly non-tuple return values with the features of tuples in general, as follows.
Let's adopt as the general syntax of n-tuples the following schema, where cᵢ is the component with index i: (c₁, c₂, …, cₙ).
Notice that n-tuples have delimiting parentheses in this syntax.
Under this schema, how would we represent a 0-tuple? Since it has no components, this degenerate case would be represented like this: ( ). This syntax precisely coincides with the syntax we adopt to represent the unit value. So, we are tempted to make the unit value the same as a 0-tuple.
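Haskell makes this identification literally: () is both the unit type and its only value, and it is conventionally read as the empty tuple. A one-line GHCi check:
Prelude> :type ()
() :: ()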
What about a 1-tuple? It would have this representation: (v). Here a syntactical ambiguity would immediately arise: parentheses would be used in the language both as 1-tuple delimiters and as mere grouping of values or expressions. So, in a context where (v) appears, a compiler or interpreter would be unsure whether this is a 1-tuple with a component whose value is v, or just an isolated value v inside superfluous parentheses.
A way to solve this ambiguity is to force a value to be the same as the 1-tuple that would have it as its only component. Not much would be sacrificed, since the only non-empty projection we can perform on a 1-tuple is to obtain its only value.
For this to be consistently enforced, the syntactical consequence is that we would have to relax a bit our former requirement that delimiting parentheses are mandatory for all n-tuples: now they would be optional for 1-tuples, and mandatory for all other values of n. (Or we could require all values to be delimited by parentheses, but this would be inconvenient for practical use.)
In summary, under the interpretation that a 1-tuple is the same as its only component value, we could, by making syntactic puns with parentheses, consider all return values of functions in our programming language or computing model as n-tuples: the 0-tuple in the case of the unit type, 1-tuples in the case of ordinary/"atomic" values which we usually don't think of as tuples, and pairs/triples/quadruples/... for other kinds of tuples. This heterodox interpretation is mathematically parsimonious and uniform, is expressive enough to simulate functions with multiple input arguments and multiple return values, and is not incompatible with Haskell (in the sense that no harm is done if the programmer assumes this unofficial interpretation).
This was an argument by syntactic puns. Whether you are satisfied or not with it, we can do even better. A more principled argument can be taken from the mathematical theory of relations, by exploring the Cartesian product operation.
An (n-adic) relation is extensionally defined as a uniform set of (n-)tuples. (This characterization is fundamental to relational database theory and is therefore important knowledge for professional computer programmers.)
A dyadic relation – a set of pairs (2-tuples) – is a subset of the Cartesian product of 2 sets: R ⊆ A × B. For a homogeneous relation: R ⊆ A × A = A².
A triadic relation – a set of triples (3-tuples) – is a subset of the Cartesian product of 3 sets: R ⊆ A × B × C. For a homogeneous relation: R ⊆ A × A × A = A³.
A monadic relation – a set of monuples (1-tuples) – is a subset of the Cartesian product of 1 set: R ⊆ A, since A¹ = A (by the usual mathematical convention).
As we can see, a monadic relation is just a set of atomic values! This means that a set of 1-tuples is a set of atomic values. Therefore, it is convenient to consider that 1-tuples and atomic values are the same thing.