I am reading the Haskell report 2010 and has some questions with regarding to the meta-logical representation in section 2.4Here:
In the mnemonics of "varid" and "varsym", does "var" mean variable?
My understanding are that "varid" are identifiers for variables and functions, while "varsym" are identifiers also, but for operators. Is this understanding correct?
If 1 and 2 are correct, does it mean operator is also a kind of variable? (I am very confused because this is likely not right.)
Appreciate any advice.
As far as I can tell, the report is defining the difference between symbols that are used prefix, and those that are used infix, for example:
f x y -- f is used prefix
a / b -- / is used infix
This is just a syntactic convenience, as all prefix symbols can be used infix with backticks, and all infix symbols can be used prefix with ()s:
x `f` y -- infix
(/) a b -- prefix
(a /) b -- operator section
(/ b) a -- operator section
Sub-questions:
yes, but I can't figure out any meaningful mnemonic for the id and sym parts. :(
operators are in the realm of Haskell syntax, not its semantics. They're only used to provide a more convenient syntax for writing some expressions. As far as I know, if they were removed from Haskell, the only loss would be convenient syntax -- i.e. there's nothing that you need operators for, other than convenient syntax, and you can replace every single use of operators with non-operator symbols. They are completely identical to variables -- they are variables -- but require different syntax for their use.
yes, I would agree that operator symbols are variables. However, the values bound to oerators symbols would not be variables.
Related
Basically, I need to define infix operator for function composition, flipped g . f manner.
(#) :: (a -> b) -> (b -> c) -> a -> c
(#) = flip (.)
infixl 9 #
Here, temporarily, I just have chosen the symbol: #, but I'm not certain this choice is not problematic in terms of collision to other definitions.
The weird thing is somehow I could not find a Pre-defined infix operator list in Haskell.
I want the composition operator as concise as possible, hopefully, a single character like ., and if possible, I want to make it &. Is it OK?
Is there any guidance or best pracice tutorial? Please advise.
Related Q&A:
What characters are permitted for Haskell operators?
You don't see a list of Haskell built-in operators for the same reason you don't see a list of all built-in functions. They're everywhere. Some are in Prelude, some are in Control.Monad, etc., etc. Operators aren't special in Haskell; they're ordinary functions with neat syntax. And Haskellers are generally pretty operator-happy in general. Spend any time inside your favorite lens library and you'll find plenty of amusing-looking operators.
In terms of (#), it might be best to avoid. I don't know of any built-in operators called that, but # can be a bit special when it comes to parsing. Specifically, a compiler extension enables # at the end of ordinary identifiers, and GHC defines a lot of built-in (primitive) types following this practice. For instance, Int is the usual (boxed) integer type, whereas Int# is a primitive integer. Most people don't need to interface with this directly, but it is a use that # has in Haskell that would be confused slightly by the addition of your operator.
Your (#) operator is called (>>>) in Haskell, and it works on all Category instances, including functions. Its companion is (<<<), the generalization of (.) to all Category instances. If three characters is too long for you, I've seen it called (|>) in some other languages, but Haskell already uses that operator for something else. If you're not using Data.Sequence, you could use that operator. But personally, I'd just go with (>>>). Any Haskeller will recognize it pretty quickly.
How can I find out the type of operator "+"? says operator + isn't a function.
Which does an operator such as + behave more like,
a curried function, or
a function whose argument has a pair tuple type?
By an operator say #, I mean # not (#). (#) is a curried function, and if # behaves like a curried function, I guess there is no need to have (#), isn't it?
Thanks.
According to the Haskell 2010 report:
An operator symbol [not starting with a colon] is an ordinary identifier.1
Hence, + is an identifier. Identifiers can identify different things, and in the case of +, it identifies a function from the Num typeclass. That function behaves like any other Haskell function, except for different parsing rules of expressions involving said functions.
Specifically, this is described in 3.2.
An operator is a function that can be applied using
infix syntax (Section 3.4), or partially applied using a section (Section 3.5).
In order to determine what constitutes an operator, we can read further:
An operator is either an operator symbol, such as + or $$, or is an ordinary identifier enclosed in grave accents (backquotes), such as `op`2.
So, to sum up:
+ is an identifier. It's an operator symbol, because it doesn't start with a colon and only uses non-alphanumeric characters. When used in an expression (parsed as a part of an expression), we treat it as an operator, which in turn means it's a function.
To answer your question specifically, everything you said involves just syntax. The need for () in expressions is merely to enable prefix application. In all other cases where such disambiguation isn't necessary (such as :type), it might have simply been easier to implement that way, because you can then just expect ordinary identifiers and push the burden to provide one to the user.
1 For what it's worth, I think the report is actually misleading here. This specific statement seems to be in conflict with other parts, crucially this:
Dually, an operator symbol can be converted to an ordinary identifier by enclosing it in parentheses.
My understanding it is that in the first quote, the context for "ordinary" means that it's not a type constructor operator, but a value category operator, hence "value category" means "ordinary". In other quotes, "ordinary" is used for identifiers which are not operator-identifiers; the difference being obviously the application. That would corroborate the fact that enclosing an operator identifier in parens, we turn it into an ordinary identifier for the purposes of prefix application. Phew, at least I didn't write that report ;)
2 I'd like to point out one additional thing here. Neither `(+)` nor (`add`) do actually parse. The second one is understandable, since the report specifically says that enclosing in parens only works for operator identifiers, one can see that `add`, while being an operator, isn't an operator identifier like +.
The first case is actually a bit more tricky for me. Since we can obtain an operator by enclosing an ordinary identifier in backticks, (+) isn't exactly "as ordinary" as add. The language, or at least GHC parser that I tested this with, seems to differentiate between "ordinary ordinary" identifiers, and ordinary identifiers obtained by enclosing operator symbols with parens. Whether this actually contradicts the spec or is another case of mixed naming is beyond my knowledge at this point.
(+) and + refer to the same thing, a function of type Num a => a -> a -> a. We spell it as (+) when we want to use it prefix, and + when we want to write it infix. There is no essential difference; it is just a matter of syntax.
So + (without section) behaves just like a function which requires a pair argument [because a section is needed for partial application]. () works like currying.
While this is not unreasonable from a syntactical point of view, I wouldn't describe it like that. Uncurried functions in Haskell take a tuple as an argument (say, (a, a) -> a as opposed to a -> a -> a), and there isn't an actual tuple anywhere here. 2 + 3 is better thought of as an abbreviation for (+) 2 3, the key point being that the function being used is the same in both cases.
One big reason prefix operators are nice is that they can avoid the need for parentheses so that + - 10 1 2 unambiguously means (10 - 1) + 2. The infix expression becomes ambiguous if parens are dropped, which you can do away with by having certain precedence rules but that's messy, blah, blah, blah.
I'd like to make Haskell use prefix operations but the only way I've seen to do that sort of trades away the gains made by getting rid of parentheses.
Prelude> (-) 10 1
uses two parens.
It gets even worse when you try to compose functions because
Prelude> (+) (-) 10 1 2
yields an error, I believe because it's trying to feed the minus operation into the plus operations rather than first evaluating the minus and then feeding--so now you need even more parens!
Is there a way to make Haskell intelligently evaluate prefix notation? I think if I made functions like
Prelude> let p x y = x+y
Prelude> let m x y = x-y
I would recover the initial gains on fewer parens but function composition would still be a problem. If there's a clever way to join this with $ notation to make it behave at least close to how I want, I'm not seeing it. If there's a totally different strategy available I'd appreciate hearing it.
I tried reproducing what the accepted answer did here:
Haskell: get rid of parentheses in liftM2
but in both a Prelude console and a Haskell script, the import command didn't work. And besides, this is more advanced Haskell than I'm able to understand, so I was hoping there might be some other simpler solution anyway before I do the heavy lifting to investigate whatever this is doing.
It gets even worse when you try to compose functions because
Prelude> (+) (-) 10 1 2
yields an error, I believe because it's
trying to feed the minus operation into the plus operations rather
than first evaluating the minus and then feeding--so now you need even
more parens!
Here you raise exactly the key issue that's a blocker for getting what you want in Haskell.
The prefix notation you're talking about is unambiguous for basic arithmetic operations (more generally, for any set of functions of statically known arity). But you have to know that + and - each accept 2 arguments for + - 10 1 2 to be unambiguously resolved as +(-(10, 1), 2) (where I've explicit argument lists to denote every call).
But ignoring the specific meaning of + and -, the first function taking the second function as an argument is a perfectly reasonable interpretation! For Haskell rather than arithmetic we need to support higher order functions like map. You would want not not x has to turn into not(not(x)), but map not x has to turn into map(not, x).
And what if I had f g x? How is that supposed to work? Do I need to know what f and g are bound to so that I know whether it's a case like not not x or a case like map not x, just to know how to parse the call structure? Even assuming I have all the code available to inspect, how am I supposed to figure out what things are bound to if I can't know what the call structure of any expression is?
You'd end up needing to invent disambiguation syntax like map (not) x, wrapping not in parentheses to disable it's ability to act like an arity-1 function (much like Haskell's actual syntax lets you wrap operators in parentheses to disable their ability to act like an infix operator). Or use the fact that all Haskell functions are arity-1, but then you have to write (map not) x and your arithmetic example has to look like (+ ((- 10) 1)) 2. Back to the parentheses!
The truth is that the prefix notation you're proposing isn't unambiguous. Haskell's normal function syntax (without operators) is; the rule is you always interpret a sequence of terms like foo bar baz qux etc as ((((foo) bar) baz) qux) etc (where each of foo, bar, etc can be an identifier or a sub-term in parentheses). You use parentheses not to disambiguate that rule, but to group terms to impose a different call structure than that hard rule would give you.
Infix operators do complicate that rule, and they are ambiguous without knowing something about the operators involved (their precedence and associativity, which unlike arity is associated with the name not the actual value referred to). Those complications were added to help make the code easier to understand; particularly for the arithmetic conventions most programmers are already familiar with (that + is lower precedence than *, etc).
If you don't like the additional burden of having to memorise the precedence and associativity of operators (not an unreasonable position), you are free to use a notation that is unambiguous without needing precedence rules, but it has to be Haskell's prefix notation, not Polish prefix notation. And whatever syntactic convention you're using, in any language, you'll always have to use something like parentheses to indicate grouping where the call structure you need is different from what the standard convention would indicate. So:
(+) ((-) 10 1) 2
Or:
plus (minus 10 1) 2
if you define non-operator function names.
What does the notation f >.> g = g . f mean? Specifically what does the two > symbols around the dot indicate?
It is just the name of an operator; the operator might as well have been named foo or >>>>.> or something else, as long as its a legal name in Haskell.
A domain specific meaning of the symbols might have been assigned by the author, especially in relation to other names in the same library, however, in this case, I doubt there is some deeper meaning — it's just the reverse function composition operator.
I am looking to create a DSL and I'm looking for a language where you can define your own bracket-style operators for things like the floor and ceiling functions. I'd rather not go the route of defining my own Antlr parser for a custom syntax.
As a far as I know the only languages I know of that allow you to define custom operators are all binary infix operators.
tl;dr: Which languages allow for defined paired symbol (like opening bracket/closed bracket) operators?
Also, I don't see how this question can be "too broad" if no one has named a single language that has this and the criteria are very specific and definitely in the programming domain.
Since Fortress is dead, the only languages I know of where something like this would be imaginable are those of FORTH heritage.
In all others that I know of, braces, brackets and parentheses are already heavily used, and can't be overloaded further.
I suggest giving up the quest for such stuff and get comfortable writing
floor x
ceiling y
or however function application is expressed in the language of your choice.
However, in the article you cite, it is said: Unicode contains codepoints for these symbols at U+2308–U+230B: ⌈x⌉, ⌊x⌋.
Thus you can at least define this a s operator in a Haskell like language and use like:
infix 5 ⌈⌉
(foo ⌈⌉)
The best I could come up with is like the following:
--- how to make brackets
module Test where
import Prelude.Math
infix 7 `«`
infix 6 `»`
_ « x = Math.ceil x
(») = const
x = () «2.345» ()
main _ = println x
The output was: 3.0
(This is not actually Haskell, but Frege, a Haskell-like JVM language.)
Note that I used «» instead of ⌈⌉, because I somehow have no font in my IDE that would correctly show the bracket symbols. (This is another reason not to do such.)
The way it works is that with the infix directives, we get this parsed as
(») ((«) () 2.345) ()
(One could insert any expression in place of the ())
Maybe if you ask in the Haskell group, someone finds a better solution.