What does the 'outermost part' of expression mean in Haskell - haskell

True beginner here, and I'm reading about the differences between NF and WHNF, and one of the definitions I come across states
To determine whether an expression is in weak head normal form, we only have to
look at the outermost part of the expression.
I'm not sure what criteria to apply to determine what the 'outermost' part is. For example (from a stack overflow answer by #hammer):
'h' : ("e" ++ "llo") -- the outermost part is the data constructor (:)
(1 + 1, 2 + 2) -- the outermost part is the data constructor (,)
\x -> 2 + 2 -- the outermost part is a lambda abstraction
Especially in the first example there, the (:) operator is in the middle of the 'h' and the other expression, so how is it the outermost part?
In general, when looking at an expression, how does one determine what the outermost part is?

Good question. Here, "outermost" (or "topmost") describes the location in reference to a kind of standard, abstract view of the expression that can differ from its actual syntax. For example, these two expressions:
(1, 2)
(,) 1 2
have identical meaning in Haskell. It turns out that in both cases, the constructor (,) is the outermost part of the expression, even though the expressions have different syntactic form with the comma appearing in different physical locations in the syntax.
Generally speaking, Haskell's "standard syntax" involves function application of the form:
fexpr expr1 .. exprn
However, the language also allows other kinds of syntax:
1 + 2 -- infix operators
(3, 4) -- tuples
[5,6,7,8] -- lists
"abc" -- strings
These alternate syntaxes can be converted to the standard syntax like so:
1 + 2 ==> (+) 1 2
(3, 4) ==> (,) 3 4
[5,6,7,8] ==> (:) 5 ((:) 6 ((:) 7 ((:) 8 [])))
"abc" ==> (:) 'a' ((:) 'b' ((:) 'c' []))
Though it's far from obvious, (+) is a variable whose value is a function (like sqrt); while (,) and (:) are constructors (just like True or Just). Even more confusing and further from obvious, even though [1,2,3] is special syntax, the empty list [] is a (unary) constructor! That's why I still used empty lists on the right-hand side in my standard syntax versions.
Anyway, once converted to the standard syntax, an expression will either be a constructor application, like one of these:
True -- a nullary constructor
(,) (1+2) (2+3) -- constructor expecting two args and fully applied
(:) 5 -- constructor partially applied and expecting one more arg
and we say that the "outermost part" of the expression is this constructor application, or it will be an unapplied lambda abstraction:
(\x y -> (+) x (length y))
and we say that the "outermost part" of the expression is this unapplied lambda abstraction, or it will be something else:
w -- a variable
(\x -> x) 10 -- an applied lambda abstraction
(f x) y ((*) 2 z) -- some other function application
(+) 10 (length (1:[])) -- another function application
and we say the "outermost part" of the expression is a variable reference or a function application or whatever.
Anyway, if the outermost part is either a constructor application or an unapplied lambda abstraction, then it's said to be in weak head normal form. If it's something else, it's not.
So, Just (5+6) is in WHNF because the outermost part is the application of the constructor Just. On the other hand, sqrt (5+6) is not in WHNF because the outermost part is the application of the variable sqrt, and a variable isn't a constructor.
Similarly, 5+6 itself is not in WHNF because the outermost part is the application of the variable (+) to 5 and 6, while [5,6] is in WHNF because the outermost part is the application of the (implied) constructor (:) to 5 and [6].

You have to look at the abstract syntax tree to understand what this means.
:
/ \
'h' ++
/ \
"e" "llo"
(,)
/ \
+ +
/ \ / \
1 1 2 2
\x ->
|
+
/ \
2 2
"Outermost" means the root of the tree.

Related

Why (-2) is different from (\x -> x - 2) in Haskell? [duplicate]

This question already has an answer here:
Why doesn't `-` (minus) work for operator sections? [duplicate]
(1 answer)
Closed 4 years ago.
I'm learning Haskell using CIS194(spring13) by myself. Today when I was trying to deal with homework 4 Exercise 1, I got a error with
fun1' :: [Integer] -> Integer
fun1' = foldl' (*) 1 . map (-2) . filter even
It's said:
error:
? No instance for (Num (Integer -> Integer))
arising from a use of syntactic negation
(maybe you haven't applied a function to enough arguments?)
? In the first argument of map, namely (- 2)
In the first argument of (.), namely map (- 2)
In the second argument of (.), namely map (- 2) . filter even
|
12 | fun1' = foldl' (*) 1 . map (- 2) . filter even
| ^^^
And It's OK to replace (-2) with (\x -> x - 2). I think it just like filter (==0) or map (*3) xs, (-2) should be a function with type signature Integer -> Integer and has same effect with (\x -> x - 2).
So my question is why they are different?
Haskell has a special case in its syntax for -; it is treated as a unary operator, synonymous with negate, whenever that parses. It was considered at the time the language was made that negative numbers would be more commonly needed than subtraction sections (and this still seems true today).
However, it was known that people would occasionally want what you do, and so provisions were made: the Prelude includes a function named subtract which just flips the arguments to (-).
> map (subtract 2) [1,2,3]
[-1,0,1]

Confusion on partially applied infix operators

So I was reading through a Haskell guide online, and was just curious about the combination of infix operators and filter.
Say you have a function like
filter (>5) [6, 10, 5]
That will return [6,10], which seems like the intuitive way filter should work.
However, doing
filter ((>) 5) [6, 10, 5]
returns an empty list (this still makes sense, (>) checks if its first argument is larger than the second argument).
However, filter is typically defined something like
filter :: (a -> Bool) -> [a] -> [a]
filter _ [] = []
filter p (x:xs)
| p x = x : filter p xs
| otherwise = filter p xs
When the type system knows it has an infix operator, are most of those infix operators written so that a partially applied function needs a leading argument to the original prefix function? i.e. is infix > defined as something like (butchered syntax)
infix> :: Int -> Int-> Bool
infix> x y = (>) y x
x infix> y = (>) x y
Sorry if this question doesn't make sense, I feel like I'm missing something basic in how p x is evaluated when p is a partially applied infix operator.
(>5) and ((>) 5) are two different types of expression.
The first is what is known as a section. Sections are of the form (op exp) or (exp op) where op is an infix operator and exp is another expression. A section takes an argument and applies it on the missing side, so (>5) 4 = (4 > 5) and (5>) 4 = (5 > 4). In other words, (>5) is equivalent to \x -> x > 5.
In ((>) 5), (>) is the infix operator > converted to an expression. ((>) 5) then is the application of 5 to the function (>), which gives a new function that takes the next argument. If we apply that argument, e.g., (>) 5 4, we get the prefix equivalent of (5 > 4).
This conversion of an infix operator to an expression that can be used prefix works for all infix operators. You can also go the other way and covert an identifier into an infix operator, i.e.:
`foo`
You can even say:
(`foo`)
to turn it back into an expression.

Unsure how to use the `$` function

While trying to write a function for transposing a list of lists, I saw something very curious. I tried:
> let abc xs | null (head xs) = [] | otherwise = map head xs : abc $ map tail xs
and got an error. Then I tried:
> let abc xs | null (head xs) = [] | otherwise = map head xs : abc ( map tail xs )
> abc [[1,2,3], [4,5,6], [7,8,9]]
[[1,4,7],[2,5,8],[3,6,9]]
I was led to believe that the $ operator can be used instead of the brackets, and that that's more Haskellish. Why am I getting an error?
An operator is a function that can be applied in a infix position. So $ is a function.
In Haskell, you can define your own functions that can be used in infix position - between the arguments. Then you can also define function application precedence and associativity using infix, infixr, infixl - that is, the clues telling the compiler whether to treat a $ b $ c as (a $ b) $ c, or a $ (b $ c).
The precedence of $ is such that your first expression is interpreted like (map head xs : abc) $ ...
For example, to declare $ as infix, place its name in ():
($) :: (a->b) -> a -> b
f $ x = f x
or composition:
(.) :: (b->c)->(a->b)->a->c
(f . g) x = f $ g x
Arithemtic "operators" are also defined as infix functions in class Num.
Additionally, you can use other functions as infix, by quoting them in backticks `` at the application site. Sometimes it makes the expression look prettier:
f `map` xs == map f xs
(not that in this particular case it makes it look pretty, just to show a simple example)
Adding to Sassa's correct answer, let's dissect the code snippet you provided a bit further:
map head xs : abc $ map tail xs
Two operators are used here: (:) and ($). As noted above, these are interpreted as infix by default because their names consist only of symbols.
Each operator has a precedence which decides how 'tightly' it binds or, perhaps more usefully, which operator is applied first. Your code could be interpreted either as
((map head xs) : abc) $ (map tail xs)
where (:) binds more tightly (is applied before) ($) or as
(map head xs) : (abc $ (map tail xs))
where ($) binds more tightly. Note that I have put parentheses around function application (for example map applied to tail and xs) as well. Function application binds more tightly than any operator and is thus always applied first.
To decide which of the two ways to interpret the code is correct, the compiler needs to get information about which operator should bind more tightly. This is done using a fixity declaration like
infix 8
or in general
infix i
where i is between 0 and 9. Higher values of i mean that an operator binds more tightly. (infixr and infixl may be used to additionally define associativity as explained in Sassa's answer, but this doesn't affect your specific problem.)
As it turns out, the fixity declaration for the ($) operator is
infixr 0 $
as seen in the Prelude documentation. (:) is a bit more 'magic' because it is hard-coded into the Haskell syntax, but the Haskell Report specifies that it has precedence 5.
Now we finally have enough information to conclude that the first interpretation of your code is indeed correct: the precedence of (:), 5, is higher than the precedence of ($), 0. As a general rule of thumb, ($) often doesn't interact well with other operators.
By the way, if an expression contains two different operators which have the same precedence (such as (==) and (/=)), the order in which they should be applied is unclear, so you have to use parentheses to specify it explicitly.

Partial Application with Infix Functions

While I understand a little about currying in the mathematical sense, partially
applying an infix function was a new concept which I discovered after diving
into the book Learn You a Haskell for Great Good.
Given this function:
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)
The author uses it in a interesting way:
ghci> applyTwice (++ [0]) [1]
[1,0,0]
ghci> applyTwice ([0] ++) [1]
[0,0,1]
Here I can see clearly that the resulting function had different parameters
passed, which would not happen by normal means considering it is a curried
function (would it?). So, is there any special treatment on infix sectioning by
Haskell? Is it generic to all infix functions?
As a side note, this is my first week with Haskell and functional programming,
and I'm still reading the book.
Yes, you can partially apply an infix operator by specifying either its left or right operand, just leaving the other one blank (exactly in the two examples you wrote).
So, ([0] ++) is the same as (++) [0] or \x -> [0] ++ x (remember you can turn an infix operator into a standard function by means of parenthesis), while (++ [0]) equals to \x -> x ++ [0].
It is useful to know also the usage of backticks, ( `` ), that enable you to turn any standard function with two arguments in an infix operator:
Prelude> elem 2 [1,2,3]
True
Prelude> 2 `elem` [1,2,3] -- this is the same as before
True
Prelude> let f = (`elem` [1,2,3]) -- partial application, second operand
Prelude> f 1
True
Prelude> f 4
False
Prelude> let g = (1 `elem`) -- partial application, first operand
Prelude> g [1,2]
True
Prelude> g [2,3]
False
Yes, this is the section syntax at work.
Sections are written as ( op e ) or ( e op ), where op is a binary operator and e is an expression. Sections are a convenient syntax for partial application of binary operators.
The following identities hold:
(op e) = \ x -> x op e
(e op) = \ x -> e op x
All infix operators can be used in sections in Haskell - except for - due to strangeness with unary negation. This even includes non-infix functions converted to infix by use of backticks. You can even think of the formulation for making operators into normal functions as a double-sided section:
(x + y) -> (+ y) -> (+)
Sections are (mostly, with some rare corner cases) treated as simple lambdas. (/ 2) is the same as:
\x -> (x / 2)
and (2 /) is the same as \x -> (2 / x), for an example with a non-commutative operator.
There's nothing deeply interesting theoretically going on here. It's just syntactic sugar for partial application of infix operators. It makes code a little bit prettier, often. (There are counterexamples, of course.)

What do the parentheses signify in (x:xs) when pattern matching?

when you split a list using x:xs syntax why is it wrapped in a parentheses? what is the significance of the parentheses? why not [x:xs] or just x:xs?
The cons cell doesn't have to be parenthesized in every context, but in most contexts it is because
Function application binds tighter than any infix operator.
Burn this into your brain in letters of fire.
Example:
length [] = 0
length (x:xs) = 1 + length xs
If parentheses were omitted the compiler would think you had an argument x followed by an ill-placed infix operator, and it would complain bitterly. On the other hand this is OK
length l = case l of [] -> 0
x:xs -> 1 + length xs
In this case neither x nor xs can possibly be construed as part of a function application so no parentheses are needed.
Note that the same wonderful rule function application binds tighter than any infix operator is what allows us to write length xs in 1 + length xs without any parentheses. The infix rule giveth and the infix rule taketh away.
You're simply using the cons operator :, which has low precedence. Parentheses are needed so that things stay right.
And you don't use [x:xs], because that would match a list whose only element is a list with head x and tail xs.
I don't know exact answer, but I guess that is due to what can be matched in patterns. Only constructors can be matched. Constructors can be of single word or composite. Look at the next code:
data Foo = Bar | Baz Int
f :: Foo -> Int
f Bar = 1
f (Baz x) = x - 1
Single word constructors match as is. But composite constructors must be surrounded with parens in order to avoid ambiguity. If we skip parens it looks like matching against two independent arguments:
f Baz x = x - 1
So, as (:) is composite it must be in parens. Skipping parens for Bar is a kind of syntactic sugar.
UPDATE: I realized that (as sykora noted) it is a consequence of operator precedence. It clarifies my assumptions. Function application (which is just space between function and argument) has highest precedence. Others including (:) have lower precedence. So f x:xs is to be interpreted as ((:) (f x)) xs that is presumably not what we need. While f (x:xs) is interpreted as f applied to x:xs which is in turn (:) applied to x and xs.
It's to do with parsing.
Remember, the colon : is just a constructor that's written with operator syntax. So a function like
foo [] = 0
foo (x:xs) = x + foo xs
could also be written as
foo [] = 0
foo ((:) x xs) = x + foo xs
If you drop the parenthesis in that last line, it becomes very hard to parse!
: is a data constructor, like any other pattern match, but written infix. The parentheses are purely there because of infix precedence; they're actually not required and can be safely omitted when precedence rules allow. For instance:
> let (_, a:_) = (1, [2, 3, 4]) in a
2
> let a:_ = "xyzzy"
'x'
> case [1, 2, 3] of; a:b -> a; otherwise -> 0;
1
Interestingly, that doesn't seem to work in the head of a lambda. Not sure why.
As always, the "juxtaposition" operator binds tighter than anything else, so more often than not the delimiters are necessary, but they're not actually part of the pattern match--otherwise you wouldn't be able to use patterns like (x:y:zs) instead of (x:(y:zs)).

Resources