What's the priority for function composition in Haskell?

I saw this code on my textbook:
double :: (Num a) => a -> a
double x = x * 2
map (double.double) [1,2,3,4]
What I don't get is this: if function composition has the highest precedence, why are parentheses needed around double.double? If I remove the parentheses, I get an error message. So what exactly is the precedence of function composition?

All of the built-in operators' respective precedences and fixities can be found in the Haskell Report section 4.4.2. In particular, . and !! have precedence 9, which is the highest among operators. However, function application is not an operator. Function application is specifically designed to have precedence higher than any operator, so
map (double.double) [1,2,3,4]
This applies the function double . double to each element of the list [1, 2, 3, 4].
map double.double [1,2,3,4]
This is attempting to compose the functions map double and double [1, 2, 3, 4], which is unlikely to be successful (though it is not technically impossible).

Precedence (and associativity) are ways of resolving the ambiguity between multiple infix operators in an expression. If there are two operators next to an operand, precedence (and associativity) tells you which of them takes the operand as an argument and which of them takes the other applied-operator expression as an argument. For example, in the expression 1 + 2 * 3, the 2 is next to both + and *. The higher precedence of * means that * gets the 2 as its left argument, while + takes the whole 2 * 3 sub-expression as its right argument.
However, that's not the case in map double.double [1, 2, 3, 4]. There's only one operator, with one operand on each side, so there's no question for precedence to answer for us. The two operands are map double and double [1, 2, 3, 4]; operands are sequences of one or more terms, not only the immediate left and right terms. Where there's more than one term, the sequence is parsed as simple chained function application (i.e. a b c d is ((a b) c) d).
Another way to think of it is that there is an "adjacency operator" which has higher precedence than can be assigned to anything else, and is invisibly present between any two non-operator terms (but not anywhere else). In this way of thinking, map double.double [1, 2, 3, 4] is really something like map # double . double # [1, 2, 3, 4] (where I've written this "adjacency operator" as #). Since # has higher precedence than ., this is (map # double) . (double # [1, 2, 3, 4]). [1]
Whichever way you choose to interpret it, there is a simple consequence: an applied-operator expression can only be passed as an argument in ordinary (non-operator) function application if the operator application is wrapped in parentheses. If there is an operator in an expression outside any parentheses, then the outermost layer of the expression is always going to be an operator application.
[1] This adjacency operator explanation seems to be pretty common. I personally think it is a poor explanation for how to parse expressions, since you need to partially parse an expression to know where to insert the adjacency operators.
It's often called the "whitespace operator", which is even more confusing since not every piece of whitespace represents this operator, and you don't always need whitespace for it to be there. e.g. length"four" + 1

Related

What kind of comprehension is this?

One of my students found that, for ell (a list of strings) and estr (a string), the following expression is True iff a member of ell is contained in estr:
any(t in estr for t in ell)
Can anyone explain why this is syntactically legal, and if so what does the comprehension generate?

This is a generator expression.
func_that_takes_any_iterable(i for i in iterable)
It's like a list comprehension, but a generator, meaning it only produces one element at a time:
>>> a = [i for i in range(10)]
>>> b = (i for i in range(10))
>>> print(a)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(b)
<generator object <genexpr> at 0x7fb9113fae40>
>>> print(list(b))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(list(b))
[]
When using a generator expression in isolation, the parentheses are required for syntax reasons. When creating one as an argument to a function, the syntax allows for the extra parentheses to be absent.
Most functions don't care what type of iterable they're given - list, tuple, dict, generator, etc - and generators are perfectly valid. They're also more memory-efficient than list comprehensions, since they don't build the entire sequence up front. For all() and any(), this is especially good, since those functions short-circuit: all() stops at the first falsy element and returns False, and any() stops at the first truthy element and returns True.
As of Python 3.8, the syntactic restrictions for the walrus operator := are similar - in isolation, it must be used inside its own set of parentheses, but inside another expression it can generally be used without them.
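To make the laziness and the parentheses rule above concrete, here is a minimal sketch. The helper tracked and the sample values for ell and estr are made up for illustration; they are not from the question.

```python
consumed = []  # records every item the generator was actually asked for

def tracked(items):
    # Yield items one at a time, logging each request.
    for item in items:
        consumed.append(item)
        yield item

ell = ["xx", "lo", "zz", "he"]  # hypothetical sample data
estr = "hello"

# Sole argument: the generator expression needs no extra parentheses.
result = any(t in estr for t in tracked(ell))
print(result)    # True
print(consumed)  # ['xx', 'lo'] -- any() stopped at the first match

# With a second argument, the parentheses become mandatory.
total = sum((x * x for x in range(4)), 10)
print(total)     # 24
```

The last two elements of ell are never tested, which is the short-circuiting described above.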

This is syntactically legal because it's not a list. Anything of the form (x for x in array) is a generator. Think of it as a lazy list that produces each value only when you ask for it.
A generator is also an iterable, so it's perfectly fine to pass one to any().
willBeTrue = any(True for i in range(20))
This generator can yield up to 20 Trues. any() looks for any true value, so it returns True (and, because it short-circuits, it stops after the first one).
Now, coming to your expression:
ans = any(t in estr for t in ell)
Here t in estr evaluates to a boolean. The generator yields one such boolean per element of ell (at most len(ell) of them), and any() returns True as soon as at least one of them is True.
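As a rough illustration of what the generator in the question produces, the sketch below materialises it into a list and compares any() with a hand-written loop. The sample data and the function name any_member_contained are assumptions made for this example.

```python
ell = ["xx", "lo", "zz", "he"]  # hypothetical sample data
estr = "hello"

# Materialising the generator shows the values any() would be offered.
flags = list(t in estr for t in ell)
print(flags)  # [False, True, False, True]

def any_member_contained(members, text):
    # Hand-written equivalent of any(t in text for t in members).
    for t in members:
        if t in text:
            return True   # stop at the first hit, as any() does
    return False          # also the result for an empty list

print(any_member_contained(ell, estr))  # True
```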

Parse mathematical expressions with pyparsing

I'm trying to parse a mathematical expression using pyparsing. I know I could just copy the example calculator from the pyparsing site, but I want to understand it so I can add to it later. I tried to understand the example and couldn't, so I did my best and got to this:
symbol = (
pp.Literal("^") |
pp.Literal("*") |
pp.Literal("/") |
pp.Literal("+") |
pp.Literal("-")
)
operation = pp.Forward()
atom = pp.Group(
pp.Literal("(").suppress() + operation + pp.Literal(")").suppress()
) | number
operation << (pp.Group(number + symbol + number + pp.ZeroOrMore(symbol + atom)) | atom)
expression = pp.OneOrMore(operation)
print(expression.parseString("9-1+27+(3-5)+9"))
That prints:
[[9, '-', 1, '+', 27, '+', [[3, '-', 5]], '+', 9]]
It works, kinda. I want precedence and everything sorted into Groups, but after trying a lot, I couldn't find a way to do it. More or less like this:
[[[[9, '-', 1], '+', 27], '+', [3, '-', 5]], '+', 9]
I want to keep it AST-looking, since I would like to generate code from it.
I did see the operatorPrecedence helper, which seems similar to Forward, but I don't think I understand how it works either.
EDIT:
I tried operatorPrecedence in more depth and got this:
expression = pp.operatorPrecedence(number, [
(pp.Literal("^"), 1, pp.opAssoc.RIGHT),
(pp.Literal("*"), 2, pp.opAssoc.LEFT),
(pp.Literal("/"), 2, pp.opAssoc.LEFT),
(pp.Literal("+"), 2, pp.opAssoc.LEFT),
(pp.Literal("-"), 2, pp.opAssoc.LEFT)
])
This doesn't handle parentheses (I don't know whether I will have to postprocess the results), and I need to handle them.

The actual name for this parsing problem is "infix notation" (and in recent versions of pyparsing, I am renaming operatorPrecedence to infixNotation). To see the typical implementation of infix notation parsing, look at the fourFn.py example on the pyparsing wiki. There you will see an implementation of this simplified BNF to implement 4-function arithmetic, with precedence of operations:
operand :: integer or real number
factor :: operand | '(' expr ')'
term :: factor ( ('*' | '/') factor )*
expr :: term ( ('+' | '-') term )*
So an expression is one or more terms separated by addition or subtraction operations.
A term is one or more factors separated by multiplication or division operations.
A factor is either a lowest-level operand (in this case, just integers or reals), OR an expr enclosed in ()'s.
Note that this is a recursive parser, since factor is used indirectly in the definition of expr, but expr is also used to define factor.
In pyparsing, this looks roughly like this (assuming that integer and real have already been defined):
LPAR,RPAR = map(Suppress, '()')
expr = Forward()
operand = real | integer
factor = operand | Group(LPAR + expr + RPAR)
term = factor + ZeroOrMore( oneOf('* /') + factor )
expr <<= term + ZeroOrMore( oneOf('+ -') + term )
Now using expr, you can parse any of these:
3
3+2
3+2*4
(3+2)*4
The infixNotation pyparsing helper method takes care of all the recursive definitions and groupings, and lets you define this as:
expr = infixNotation(operand,
[
(oneOf('* /'), 2, opAssoc.LEFT),
(oneOf('+ -'), 2, opAssoc.LEFT),
])
But this obscures all the underlying theory, so if you are trying to understand how this is implemented, look at the raw solution in fourFn.py.
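For anyone who wants to see the recursion without pyparsing, here is a minimal hand-written recursive-descent sketch of the same BNF in plain Python. It is not how pyparsing or fourFn.py implements it; the names tokenize and parse are made up for this illustration. Unlike the pyparsing grammar above, it nests each operation left-associatively, so the result has the AST-like shape asked for in the question.

```python
import re

# A number (integer or real) or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+\.\d+|\d+|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # factor :: operand | '(' expr ')'
        if peek() is None:
            raise ValueError("unexpected end of input")
        token = advance()
        if token == "(":
            node = expr()  # recursion: factor refers back to expr
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        if token in {"+", "-", "*", "/", ")"}:
            raise ValueError(f"unexpected token {token!r}")
        return float(token) if "." in token else int(token)

    def binary_chain(operand, operators):
        # operand ( operator operand )*, grouped left-associatively
        node = operand()
        while peek() in operators:
            op = advance()
            node = [node, op, operand()]
        return node

    def term():
        # term :: factor ( ('*' | '/') factor )*
        return binary_chain(factor, ("*", "/"))

    def expr():
        # expr :: term ( ('+' | '-') term )*
        return binary_chain(term, ("+", "-"))

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree

print(parse("9-1+27+(3-5)+9"))
# [[[[9, '-', 1], '+', 27], '+', [3, '-', 5]], '+', 9]
print(parse("3+2*4"))
# [3, '+', [2, '*', 4]]
```

Precedence falls out of the call order: expr calls term, which calls factor, so multiplication and division are grouped before addition and subtraction ever see their operands.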
[EDIT - 18 Dec 2022] For those looking for a pre-defined solution, I've packaged infixNotation up into its own pip-installable package called plusminus. plusminus defines a BaseArithmeticParser class for creating a ready-to-run parser and evaluator that supports these operators:
** ÷ >= ∈ in ?:
* + == ∉ not |absolute-value|
// - != ∩ and
/ < ≠ ∪ ∧
mod > ≤ & or
× <= ≥ | ∨
And these functions:
abs ceil max
round floor str
trunc min bool
The BaseArithmeticParser class allows you to define additional operators and functions for your own domain-specific expressions, and the examples show how to define parsers with custom functions and operators for dice rolling, retail price discounts, among others.

Mathematica Position of Elements of a List that fulfill an inequality

The task I have is pretty simple, but I cannot solve it in Mathematica.
Given a list
myList = {1, 3, 4}
I would like to get the position of entries smaller than a number - say 2 in the example above.
Attempts such as
Position[myList, #[[1]] < 2 &]
Position[myList, # < 2 &]
which are modeled on how the function Select is used, don't work. How can I use Position or some other function to do this? Thanks!

Reason: The reason is that Position takes a pattern not a function.
(i.e. Position[-list-,-pattern-])
Solution:
Position[myList, x_ /; x < 2]
{{1}}
Similarly:
myList2 = {1, 2, 3, 4, 5, 1, "notNumber"}
Position[myList2, x_ /; x < 3]
{{1}, {2}, {6}}
(i.e. Position[ myList, element_x where element_x < 2])
/; <-- denotes a condition (Super useful when defining functions over specific inputs too!)
x_ <-- is a named "pattern object"
x <-- is a reference to the pattern object
Deeper Reason:
I don't know exactly what the Mathematica internals look like, but I imagine it runs something like this: if you give a function instead of a pattern (i.e. #...& instead of x_ /; ...), Position looks for elements that literally match the expression "#...&", which finds nothing here (it is comparing expressions structurally, not feeding them to your function). When you give a pattern with a condition, it matches each element against x_ (which matches anything) and then keeps only the matches for which the condition evaluates to True, so you get meaningful results. The reverse applies to functions that expect a function argument, such as Select.
I love Mathematica, but it's not good at making its pattern-based functions and function-based functions obviously distinct from each other (aside from looking at the documentation).
Hope that helps.

Funny Haskell Behaviour: min function on three numbers, including a negative

I've been playing around with some Haskell functions in GHCi.
I'm getting some really funny behaviour and I'm wondering why it's happening.
I realized that the function min is only supposed to be used with two values. However, when I use three values, in my case
min 1 2 -5
I'm getting
-4
as my result.
Why is that?

You are getting that result because this expression:
min 1 2 -5
parses as if it were parenthesized like this:
(min 1 2) -5
which is the same as this:
1 -5
which is the same as this:
1 - 5
which is of course -4.
In Haskell, function application is the most tightly-binding operation, but it is not greedy. In fact, even a seemingly simple expression like min 1 2 actually results in two separate function calls: the function min is first called with a single value, 1; the return value of that function is a new, anonymous function, which will return the smaller value between 1 and its single argument. That anonymous function is then called with an argument of 2, and of course returns 1. So a more accurate fully-parenthesized version of your code is this:
((min 1) 2) - 5
But I'm not going to break everything down that far; for most purposes, the fact that what looks like a function call with multiple arguments actually turns into a series of multiple single-argument function calls is a hand-wavable implementation detail. It's important to know that if you pass too few arguments to a function, you get back a function that you can call with the rest of the arguments, but most of the time you can ignore the fact that such calls are what's happening under the covers even when you pass the right number of arguments to start with.
So to find the minimum of three values, you need to chain together two calls to min (really four calls per the logic above, but again, hand-waving that):
min (min 1 2) (-5)
The parentheses around -5 are required to ensure that the - is interpreted as prefix negation instead of infix subtraction; without them, you have the same problem as your original code, only this time you would be asking Haskell to subtract a number from a function and would get a type error.
More generally, you could let Haskell do the chaining for you by applying a fold to a list, which can then contain as many numbers as you like:
foldl1 min [1, 2, -5]
(Note that in the literal list syntax, the comma and square bracket delimit the -5, making it clearly not a subtraction operation, so you don't need the parentheses here.)
The call foldl1 fun list means "take the first two items of list and call fun on them. Then take the result of that call and the next item of list, and call fun on those two values. Then take the result of that call and the next item of the list..." And so on, continuing until there's no more list, at which point the value of the last call to fun is returned to the original caller.
For the specific case of min, there is a folded version already defined for you: minimum. So you could also write the above this way:
minimum [1, 2, -5]
That behaves exactly like my foldl1 solution; in particular, both will throw an error if handed an empty list, while if handed a single-element list, they will return that element unchanged without ever calling min.
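For readers who think more readily in Python than in Haskell, the fold described above can be sketched as follows. This is an illustration in a different language, not Haskell's actual definition; the Python function name foldl1 is a stand-in chosen here, and functools.reduce is the closest standard-library counterpart.

```python
from functools import reduce

def foldl1(fun, items):
    # Left fold that uses the first item as the starting value.
    iterator = iter(items)
    try:
        result = next(iterator)
    except StopIteration:
        raise ValueError("foldl1: empty list") from None
    for item in iterator:
        result = fun(result, item)  # combine running result with next item
    return result

print(foldl1(min, [1, 2, -5]))   # -5
print(reduce(min, [1, 2, -5]))   # -5
print(foldl1(min, [7]))          # 7 (min is never called)
```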
Thanks to JohnL for reminding me of the existence of minimum.

When you type min 1 2 -5, Haskell doesn't group it as min 1 2 (-5), as you seem to think. It instead interprets it as (min 1 2) - 5, that is, it does subtraction rather than negation. The minimum of 1 and 2 is 1, obviously, and subtracting 5 from that will (perfectly correctly) give you -4.
Generally, in Haskell, you should surround negative numbers with parentheses so that this kind of stuff doesn't happen unexpectedly.

Nothing to add to the previous answers. But you are probably looking for this function.
import Data.List
minimum [1, 2, -4]

What does the "@" symbol mean in reference to lists in Haskell?

I've come across a piece of Haskell code that looks like this:
ps@(p:pt)
What does the @ symbol mean in this context? I can't seem to find any info on Google (it's unfortunately hard to search for symbols on Google), and I can't find the function in the Prelude documentation, so I imagine it must be some sort of syntactic sugar instead.

Yes, it's just syntactic sugar, with @ read aloud as "as". ps@(p:pt) gives you names for
1. the list: ps
2. the list's head: p
3. the list's tail: pt
Without the @, you'd have to choose between (1) or (2):(3).
This syntax actually works for any constructor; if you have data Tree a = Tree a [Tree a], then t@(Tree _ kids) gives you access to both the tree and its children.

The @ symbol is used to both give a name to a parameter and match that parameter against the pattern that follows the @. It's not specific to lists and can also be used with other data structures.
This is useful if you want to "decompose" a parameter into its parts while still needing the parameter as a whole somewhere in your function. One example where this is the case is the tails function from the standard library:
tails :: [a] -> [[a]]
tails [] = [[]]
tails xxs@(_:xs) = xxs : tails xs

I want to add that @ works at all levels, meaning you can do this:
let a@(b@(Just c), Just d) = (Just 1, Just 2) in (a, b, c, d)
Which will then produce this: ((Just 1, Just 2), Just 1, 1, 2)
So basically it's a way for you to bind a name to the value matched by a pattern. This also means that it works with any kind of pattern, not just lists, as demonstrated above. This is a very useful thing to know, as it means you can use it in many more cases.
In this case, a is the entire tuple of Maybe values, b is just the first Just in the tuple, and c and d are the values contained in the first and second Just in the tuple respectively.

To add to what the other people have said, they are called as-patterns (in ML the syntax uses the keyword "as"), and are described in the section of the Haskell Report on patterns.
