I've taken up learning Haskell again after a short hiatus, and I am currently trying to get a better understanding of how recursion and lambda expressions work in Haskell.
In this YouTube video, there is an example function that puzzles me far more than it probably should, in terms of how it actually works:
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f = foldr (\x acc -> if f x then x else acc)
For the sake of clarity and since it wasn't immediately obvious to me, I'll give an example of applying this function to some arguments:
firstThat (>10) 2000 [10,20,30,40] --returns 20, but would return 2000, if none of the values in the list were greater than 10
Please correct me, if my assumptions are wrong.
It seems firstThat takes three arguments:
a function that takes one argument and returns a Boolean value. Since the > operator is actually an infix function, the first argument in the example above seems to be the result of a partial application of the > function – is this correct?
an unspecified value of the same type expected as the missing argument to the function provided as the first argument
a list of values of the aforementioned type
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments, I gathered there is some kind of partial application happening. The lambda expression provided as an argument to foldr seems to be missing its arguments too.
So, how exactly does this function work? I apologize if I am being too dense or fail to see the forest for the trees, but I just cannot wrap my head around it, which is frustrating.
Any helpful explanation or example(s) would be greatly appreciated.
Thanks!
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments I gathered there is some kind of partial application happening.
You are right. However, there is a nicer way of putting it than talking about "missing arguments" -- one that doesn't lead you into asking where they have gone. Here are two ways in which the arguments are not missing.
Firstly, consider this function:
add :: Num a => a -> a -> a
add x y = x + y
As you may know, we can also define it like this:
add :: Num a => a -> a -> a
add = (+)
That works because Haskell functions are values like any other. We can simply define a value, add, as being equal to another value, (+), which just happens to be a function. There is no special syntax required to declare a function. The upshot is that writing arguments explicitly is (almost) never necessary; the main reason why we do so is that it often makes code more readable (for instance, I could define firstThat without writing the f parameter explicitly, but I won't do so because the result is rather hideous).
Secondly, whenever you see a function type with three arguments...
firstThat :: (a -> Bool) -> a -> [a] -> a
... you can also read it like this...
firstThat :: (a -> Bool) -> (a -> [a] -> a)
... that is, a function of one argument that produces a function of two arguments. That works for all functions of more than one argument. The key takeaway is that, at heart, all Haskell functions take just one argument. That is why partial application works. So on seeing...
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f = foldr (\x acc -> if f x then x else acc)
... you can accurately say that you have written explicitly all parameters that firstThat takes -- that is, only one :)
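To make the "one argument at a time" reading concrete, here is a small sketch that applies firstThat step by step. The names firstBig and firstBigOr0 are made up for this illustration; only firstThat comes from the question.

```haskell
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f = foldr (\x acc -> if f x then x else acc)

-- Supplying only the predicate gives back a function of the remaining two arguments.
firstBig :: Int -> [Int] -> Int
firstBig = firstThat (> 10)

-- Supplying the default value as well gives back a function of just the list.
firstBigOr0 :: [Int] -> Int
firstBigOr0 = firstBig 0

main :: IO ()
main = do
  print (firstBigOr0 [3, 12, 40])  -- 12
  print (firstBigOr0 [1, 2, 3])    -- 0
```

Each intermediate value is an ordinary function, which is all that "partial application" means here.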
The lambda expression provided as an argument to foldr seems to be missing its arguments too.
Not really. foldr (when restricted to lists) is...
foldr :: (a -> b -> b) -> b -> [a] -> b
... and so the function passed to it takes two arguments (feel free to add air quotes around "two", given the discussion above). The lambda was written as...
\x acc -> if f x then x else acc
... with two explicit arguments, x and acc.
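It may help to see how foldr feeds those two arguments to the lambda. The sketch below writes out all three parameters of firstThat and then expands the fold by hand for the example call from the question; step, def and expanded are illustrative names, not part of the original code.

```haskell
-- firstThat with all three parameters written out.
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f def xs = foldr (\x acc -> if f x then x else acc) def xs

-- Hand expansion of: firstThat (>10) 2000 [10,20,30,40]
-- foldr step z [a,b,c,d] = step a (step b (step c (step d z)))
expanded :: Int
expanded = step 10 (step 20 (step 30 (step 40 2000)))
  where
    step x acc = if x > 10 then x else acc

main :: IO ()
main = do
  print (firstThat (> 10) 2000 [10, 20, 30, 40 :: Int])  -- 20
  print expanded                                          -- 20
  print (firstThat (> 10) 2000 [1, 2, 3 :: Int])          -- 2000
```

Here x is the current list element and acc stands for the result of folding the rest of the list, with the default value at the very end.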
a function that takes one argument and returns a Boolean value. Since the > operator is actually an infix function, the first argument in the example above seems to be the result of a partial application of the > function – is this correct?
yes: (>10) is short for \x -> x > 10, just as (10>) would be short for \x -> 10 > x.
an unspecified value of the same type expected as the missing argument to the function provided as the first argument
first of all, it's not a missing argument: by omitting an argument, you obtain a function value. however, the type of the 2nd argument does indeed match the argument type of the function (>10), just as it matches the type of the elements of the list [10,20,30,40] (which is the better line of reasoning).
a list of values of the aforementioned type
yes.
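As a quick check of the two operator sections mentioned above, this sketch maps each one over the same list; the sample numbers are arbitrary.

```haskell
main :: IO ()
main = do
  -- (> 10) fixes the right operand: \x -> x > 10
  print (map (> 10) [5, 10, 15 :: Int])  -- [False,False,True]
  -- (10 >) fixes the left operand: \x -> 10 > x
  print (map (10 >) [5, 10, 15 :: Int])  -- [True,False,False]
```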
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments, I gathered there is some kind of partial application happening. The lambda expression provided as an argument to foldr seems to be missing its arguments too.
that's because given e.g. foo x y z = x * y * z, these 2 lines are equivalent:
bar x = foo x
bar x y z = foo x y z
— that's because of a concept called currying. Currying is also the reason why function type signatures are not (a, b) -> c but instead a -> b -> c, which in turn is equivalent to a -> (b -> c) because of the right associativity of the -> type operator.
Therefore, these two lines are equivalent:
firstThat f = foldr (\x acc -> if f x then x else acc)
firstThat f z xs = foldr (\x acc -> if f x then x else acc) z xs
Note that you can also use Data.List.find combined with Data.Maybe.fromMaybe:
λ> fromMaybe 2000 $ find (>10) [10, 20, 30]
20
λ> fromMaybe 2000 $ find (>10) [1, 2, 3]
2000
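For reference, the same idea as a small standalone module. The name firstThat' is chosen here only to avoid clashing with the original; find and fromMaybe are the standard functions from Data.List and Data.Maybe.

```haskell
import Data.List (find)
import Data.Maybe (fromMaybe)

-- find returns Nothing when no element matches; fromMaybe supplies the default.
firstThat' :: (a -> Bool) -> a -> [a] -> a
firstThat' p def = fromMaybe def . find p

main :: IO ()
main = do
  print (firstThat' (> 10) 2000 [10, 20, 30 :: Int])  -- 20
  print (firstThat' (> 10) 2000 [1, 2, 3 :: Int])     -- 2000
```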
See also:
https://en.wikipedia.org/wiki/Currying
https://www.fpcomplete.com/user/EFulmer/currying-and-partial-application
http://learnyouahaskell.com/higher-order-functions
Can someone explain to me step by step what this function means?
select :: (a->a->Bool) -> a -> a -> a
As the comments pointed out, this is not a function definition, but just a type signature. It says, for any type a which you are free to choose, this function expects:
A function that takes two values of type a and gives a Bool
Two values of type a
and it returns another value of type a. So for example, we could call:
select (<) 1 2
where a is Int, since (<) is a function that takes two Ints and returns a Bool. We could not call:
select isPrefixOf 1 2
because isPrefixOf :: (Eq a) => [a] -> [a] -> Bool -- i.e. it takes two lists (provided that the element type supports equality), but numbers are not lists.
Signatures can tell us quite a lot, however, due to parametricity (aka free theorems). The details are quite technical, but we can intuit that select must return one of its two arguments, because it has no other way to construct values of type a about which it knows nothing (and this can be proven).
But beyond that we can't really tell. Often you can tell almost certainly what a function does by its signature. But as I explored this signature, I found that there were actually quite a few functions it could be, from the most obvious:
select f x y = if f x y then x else y
to some rather exotic ones:
select f x y = if f x x && f y y then x else y
And the name select doesn't help much -- it seems to tell us that it will return one of the two arguments, but the signature already told us that.
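To illustrate how many different functions fit that one signature, here is a sketch with four candidates. The numbered names are invented for this example; select1 and select2 are the two definitions shown above, while select3 and select4 simply ignore the predicate.

```haskell
select1, select2, select3, select4 :: (a -> a -> Bool) -> a -> a -> a
select1 f x y = if f x y then x else y
select2 f x y = if f x x && f y y then x else y
select3 _ x _ = x  -- always the first value
select4 _ _ y = y  -- always the second value

main :: IO ()
main = do
  print (select1 (<) 1 (2 :: Int))   -- 1
  print (select1 (>) 1 (2 :: Int))   -- 2
  print (select2 (<) 1 (2 :: Int))   -- 2, because 1 < 1 is False
  print (select2 (==) 1 (2 :: Int))  -- 1, because 1 == 1 and 2 == 2
  print (select3 (<) 1 (2 :: Int), select4 (<) 1 (2 :: Int))  -- (1,2)
```

All four type check, and all four can only ever return x or y, which matches the parametricity argument.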
In Haskell functions always take one parameter. Multiple parameters are implemented via Currying. That being the case, I can see how a function of two parameters would be defined as "func1" below. It's a function that returns a function (closure) that adds the outer function's single parameter to the returned function's single parameter.
However, although this is how curried functions work, that's not the regular Haskell syntax for defining a two-parameter function. Instead we're taught to define such a function like "func2".
I'd like to know how Haskell understands that func2 should behave the same way as func1. There's nothing about the definition of func2 that suggests to me that it is a function that returns a function. On the contrary, it actually looks like a two-parameter function, something we're told doesn't exist!
What's the trick here? Is Haskell just born knowing that we can define multi-parameter functions in this textbook way, and that they work the way we expect anyhow? That is, is this a syntax convention that doesn't seem to be clearly documented (Haskell knows what you mean and will supply the missing function return for you), or is there some other magic at work or something I'm missing?
func1 :: Int -> (Int -> Int)
func1 x = (\y -> x + y)
func2 :: Int -> Int -> Int
func2 x y = x + y
main = do
  print (func1 7 9)
  print (func2 7 9)
In the language itself, writing a function definition of the form f x y z = _ is equivalent to f = \x y z -> _, which is equivalent to f = \x -> \y -> \z -> _. There's no theoretical reason for this; it's just that those nested lambda abstractions are a terrible eye-/finger-sore and everyone thought that it would be fine to sacrifice a bit of pedantry to make some syntax sugar for it. That's all there is on the surface and is probably all you need to know, for now.
In the implementation of the language, though, things get trickier. In GHC, which is the most common implementation, there actually is a difference between f x y = _ and f = \x -> \y -> _. When GHC compiles Haskell, it assigns an arity to each declaration. The former definition of f has arity 2, and the latter has arity 0. Take (.) from GHC.Base:
(.) f g = \x -> f (g x)
(.) has arity 2, even though its type ((b -> c) -> (a -> b) -> a -> c) says that it can be applied up to thrice. This affects optimization: GHC will only inline a function that is saturated, that is, one that has at least as many arguments applied as its arity. In the call (maximum .), (.) will not inline, because it only has one argument (it is unsaturated). In the call (maximum . f), it will inline to \x -> maximum (f x), and in (maximum . f) 1, the (.) will inline first to a lambda abstraction (producing (\x -> maximum (f x)) 1), which will beta-reduce to maximum (f 1). If (.) were implemented as
(.) f g x = f (g x)
(.) would have arity 3, which means it would inline less often (specifically the f . g case, which is a very common argument to higher order functions), likely reducing performance, which is exactly what the comment on it says:
-- Make sure it has TWO args only on the left, so that it inlines
-- when applied to two functions, even if there is no final argument
Final answer: the two forms should be equivalent, according to the language's semantics, but in GHC the two forms have different characteristics when it comes to optimization, even if they always give the same result.
When talking about type signatures, there is no such thing as a "multi-parameter function". All functions are single-parameter, period. Haskell doesn't need to somehow "translate" multi-parameter functions into single-parameter ones, because the former don't exist at all.
All function type signatures look like a -> b, where a is argument type and b is return type. Sometimes b may just happen to contain more arrows ->, in which case we, humans (but not the compiler), may say that the function has multiple parameters.
When talking about the syntax for implementations, i.e. f x y = z - that is merely syntactic sugar, which gets desugared (i.e. mechanically transformed) into f = \x -> \y -> z during compilation.
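A small sketch of that desugaring, written out as three definitions that should behave identically. The names add1, add2 and add3 are made up for the example.

```haskell
-- Sugared form: parameters on the left of the equals sign.
add1 :: Int -> Int -> Int
add1 x y = x + y

-- One lambda with two parameters.
add2 :: Int -> Int -> Int
add2 = \x y -> x + y

-- Fully nested single-parameter lambdas; note the type reads the same way.
add3 :: Int -> (Int -> Int)
add3 = \x -> \y -> x + y

main :: IO ()
main = do
  print (map (\f -> f 7 9) [add1, add2, add3])  -- [16,16,16]
  print (map (add1 7) [1, 2, 3])                -- [8,9,10]
```

The three functions can live in one list because Int -> Int -> Int and Int -> (Int -> Int) are the same type.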
In working through a solution to the 8 Queens problem, a person used the following line of code:
sameDiag try qs = any (\(colDist,q) -> abs (try - q) == colDist) $ zip [1..] qs
try is an item; qs is a list of the same kind of items.
Can someone explain how colDist and q in the lambda function get bound to anything?
How did try and q used in the body of lambda function find their way into the same scope?
To the degree this is a Haskell idiom, what problem does this design approach help solve?
The function any is a higher-order function that takes 2 arguments:
the 1st argument is of type a -> Bool, i.e. a function from a to Bool
the 2nd argument is of type [a], i.e. a list of items of type a;
i.e. the 1st argument is a function that takes any element from the list passed as the 2nd argument, and returns a Bool based on that element. (Well, it can take any value of type a, not just the ones in that list, but any will clearly only invoke it with elements of the list, not with arbitrary values of type a.)
You can then simplify thinking about the original snippet by doing a slight refactoring:
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f = (\(colDist, q) -> abs (try - q) == colDist)
which can be transformed into
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f (colDist, q) = abs (try - q) == colDist
which in turn can be transformed into
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1..] qs
    f pair = abs (try - q) == colDist where (colDist, q) = pair
(Note that sameDiag could also have a more general type Integral a => a -> [a] -> Bool rather than the current monomorphic one)
So how does the pair in f pair = ... get bound to a value? Well, simple: f is just a function, so whoever calls it must pass along a value for the pair argument. When any is called with f as its first argument, it is any that calls f, passing the individual elements of the list xs as values of the argument pair.
And since xs is a list of pairs, it's OK to pass an individual pair from this list to f, as f expects exactly that.
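To see the binding in action, this sketch prints the list of pairs that any walks over and then runs the refactored sameDiag on two inputs. The sample queen positions are arbitrary numbers picked for the illustration.

```haskell
sameDiag :: Int -> [Int] -> Bool
sameDiag try qs = any f xs
  where
    xs = zip [1 ..] qs
    -- any calls f once per element of xs; each element is a (colDist, q) pair.
    f (colDist, q) = abs (try - q) == colDist

main :: IO ()
main = do
  print (zip [1 ..] [5, 3, 1] :: [(Int, Int)])  -- [(1,5),(2,3),(3,1)]
  print (sameDiag 4 [5, 3, 1])  -- True: abs (4 - 5) == 1 for the pair (1,5)
  print (sameDiag 6 [4, 1, 5])  -- False: no pair satisfies the test
```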
EDIT: a further explanation of any to address the asker's comment:
Is this a fair synthesis? This approach to designing a higher-order function allows the invoking code to change how f behaves AND invoke the higher-order function with a list that requires additional processing prior to being used to invoke f for every element in the list. Encapsulating the list processing (in this case with zip) seems the right thing to do, but is the intent of this additional processing really clear in the original one-liner above?
There's really no additional processing done by any prior to invoking f. There is just very minimalistic bookkeeping in addition to simply iterating through the passed in list xs: invoking f on the elements during the iteration, and immediately breaking the iteration and returning True the first time f returns True for any list element.
Most of the behavior of any is "implicit" though in that it's taken care of by Haskell's lazy evaluation, basic language semantics as well as existing functions, which any is composed of (well at least my version of it below, any' — I haven't taken a look at the built-in Prelude version of any yet but I'm sure it's not much different; just probably more heavily optimised).
In fact, any is so simple that it's almost trivial to re-implement it with a one-liner at the GHCi prompt:
Prelude> let any' f xs = or (map f xs)
let's see now what GHC computes as its type:
Prelude> :t any'
any' :: (a -> Bool) -> [a] -> Bool
— same as the built-in any. So let's give it some trial runs:
Prelude> any' odd [1, 2, 3] -- any odd values in the list?
True
Prelude> any' even [1, 3] -- any even ones?
False
Prelude> let adult = (>=18)
Prelude> any' adult [17, 17, 16, 15, 17, 18]
True
— see how you can sometimes write code that almost looks like English with higher-order functions?
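The point about lazy evaluation doing the "early exit" can be checked with an infinite list. This is a sketch of the same any' as a compiled program; the infinite-list call is an addition for illustration and relies on or stopping at the first True.

```haskell
any' :: (a -> Bool) -> [a] -> Bool
any' f xs = or (map f xs)

main :: IO ()
main = do
  print (any' odd [1, 2, 3 :: Int])       -- True
  print (any' even [1, 3 :: Int])         -- False
  -- map is lazy and or stops at the first True, so this terminates.
  print (any' (> 5) ([1 ..] :: [Int]))    -- True
```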
zip :: [a] -> [b] -> [(a,b)] takes two lists and joins them into pairs, dropping any leftover elements of the longer list.
any :: (a -> Bool) -> [a] -> Bool takes a function and a list of as, and returns True if the function returns True for at least one element of the list, and False otherwise.
So colDist and q are the first and second elements of the pairs in the list made by zip [1..] qs, and they are bound when any applies the lambda to each pair.
q is only bound within the body of the lambda function - this is the same as with lambda calculus. Since try was bound before in the function definition, it is still available in this inner scope. If you think of lambda calculus, the term \x.\y.x+y makes sense, despite the x and the y being bound at different times.
As for the design approach, this approach is much cleaner than trying to iterate or recurse through the list manually. It seems quite clear in its intentions to me (with respect to the larger codebase it comes from).
I have to determine the types of two functions (without using the compiler's :t). I just don't know how I should read these functions to take the correct steps.
f x = map -1 x
f x = map (-1) x
I'm a bit confused about how it will be parsed.
Function application, or "the empty space operator", has higher precedence than any operator symbol, so the first line parses as f x = map - (1 x), which will most likely [1] be a type error.
The other example is parenthesized the way it looks, but note that (-1) desugars as negate 1. This is an exception from the normal rule, where operator sections like (+1) desugar as (\x -> x + 1), so this will also likely [1] be a type error since map expects a function, not a number, as its first argument.
[1] I say likely because it is technically possible to provide Num instances for functions which may allow this to type check.
For questions like this, the definitive answer is to check the Haskell Report. The relevant syntax hasn't changed from Haskell 98.
In particular, check the section on "Expressions". That should explain how expressions are parsed, operator precedence, and the like.
These functions do not have types, because they do not type check (you will get ridiculous type class constraints). To figure out why, you need to know that (-1) has type Num n => n, and you need to read up on how a - is interpreted with or without parens before it.
The following function is the "correct" version of your function:
f x = map (subtract 1) x
You should be able to figure out the type of this function, if I say that:
subtract 1 :: Num n => n -> n
map :: (a -> b) -> [a] -> [b]
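Putting those two signatures together gives the type written in the sketch below. The second definition, g, is an illustrative equivalent using an explicit lambda, and the last line checks that (-1) really is just the number negate 1.

```haskell
-- subtract 1 :: Num b => b -> b, so map (subtract 1) :: Num b => [b] -> [b]
f :: Num b => [b] -> [b]
f x = map (subtract 1) x

-- The same function with an explicit lambda instead of subtract.
g :: Num b => [b] -> [b]
g x = map (\n -> n - 1) x

main :: IO ()
main = do
  print (f [1, 2, 3 :: Int])            -- [0,1,2]
  print (g [1, 2, 3 :: Int])            -- [0,1,2]
  print (negate 1 == (-1 :: Int))       -- True: (-1) is a number, not a section
```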
Well, I did it by myself :P
(map) - (1 x)
(-) :: Num a => a -> a -> a
1 :: Num b => b
x :: e
map :: (c -> d) -> [c] -> [d]
map :: a
a = (c -> d) -> [c] -> [d]
(1 x) :: a
1 :: e -> a
f :: (Num ((c -> d) -> [c] -> [d]), Num (e -> (c -> d) -> [c] -> [d])) => e -> (c -> d) -> [c] -> [d]
I'm a newbie to Haskell, and a relative newbie to functional programming.
In other (besides Haskell) languages, lambda forms are often very useful.
For example, in Scheme:
(define (deriv-approx f)
  (lambda (h x)
    (/ (- (f (+ x h))
          (f x))
       h)))
Would create a closure (over the function f) to approximate a derivative (at value x, with interval h).
However, this usage of a lambda form doesn't seem to be necessary in Haskell, due to its partial application:
derivApprox f h x = ( (f (x + h)) - (f x) ) / h
What are some examples where lambda forms are necessary in Haskell?
Edit: replaced 'closure' with 'lambda form'
I'm going to give two slightly indirect answers.
First, consider the following code:
module Lambda where
derivApprox f h x = ( (f (x + h)) - (f x) ) / h
I've compiled this while telling GHC to dump an intermediate representation, which is roughly a simplified version of Haskell used as part of the compilation process, to get this:
Lambda.derivApprox
  :: forall a. GHC.Real.Fractional a => (a -> a) -> a -> a -> a
[LclIdX]
Lambda.derivApprox =
  \ (@ a) ($dFractional :: GHC.Real.Fractional a) ->
    let {
      $dNum :: GHC.Num.Num a
      [LclId]
      $dNum = GHC.Real.$p1Fractional @ a $dFractional } in
    \ (f :: a -> a) (h :: a) (x :: a) ->
      GHC.Real./
        @ a
        $dFractional
        (GHC.Num.- @ a $dNum (f (GHC.Num.+ @ a $dNum x h)) (f x))
        h
If you look past the messy annotations and verbosity, you should be able to see that the compiler has turned everything into lambda expressions. We can consider this an indication that you probably don't need to do so manually.
Conversely, let's consider a situation where you might need lambdas. Here's a function that uses a fold to compose a list of functions:
composeAll :: [a -> a] -> a -> a
composeAll = foldr (.) id
What's that? Not a lambda in sight! In fact, we can go the other way, as well:
composeAll' :: [a -> a] -> a -> a
composeAll' xs x = foldr (\f g x -> f (g x)) id xs x
Not only is this full of lambdas, it's also taking two arguments to the main function and, what's more, applying foldr to all of them. Compare the type of foldr, (a -> b -> b) -> b -> [a] -> b, to the above; apparently it takes three arguments, but above we've applied it to four! Not to mention that the accumulator function takes two arguments, but we have a three argument lambda here. The trick, of course, is that both are returning a function that takes a single argument; and we're simply applying that argument on the spot, instead of juggling lambdas around.
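Both versions can be run side by side to confirm they agree. In this sketch the inner lambda parameter is renamed to y purely to avoid shadowing the outer x; the list of functions and the argument 10 are arbitrary test values.

```haskell
composeAll :: [a -> a] -> a -> a
composeAll = foldr (.) id

composeAll' :: [a -> a] -> a -> a
composeAll' xs x = foldr (\f g y -> f (g y)) id xs x

main :: IO ()
main = do
  let fs = [(+ 1), (* 2), subtract 3] :: [Int -> Int]
  -- (+ 1) . (* 2) . subtract 3 applied to 10: 10 - 3 = 7, 7 * 2 = 14, 14 + 1 = 15
  print (composeAll fs 10)   -- 15
  print (composeAll' fs 10)  -- 15
```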
All of which, hopefully, has convinced you that the two forms are equivalent. Lambda forms are never necessary, or perhaps always necessary, because who can tell the difference?
There is no semantic difference between
f x y z w = ...
and
f x y = \z w -> ...
The main difference between expression style (explicit lambdas) and declaration style is a syntactic one. One situation where it matters is when you want to use a where clause:
f x y = \z w -> ...
where ... -- x and y are in scope, z and w are not
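A concrete version of that scoping rule might look like the sketch below. The names viaLambda, viaParams and base are hypothetical, and the arithmetic is only there to give the where clause something to compute.

```haskell
-- z and w are bound by the lambda, so the where clause can only see x and y.
viaLambda :: Int -> Int -> Int -> Int -> Int
viaLambda x y = \z w -> base + z * w
  where
    base = x + y

-- With all four parameters on the left, the where clause could use z and w too.
viaParams :: Int -> Int -> Int -> Int -> Int
viaParams x y z w = base + z * w
  where
    base = x + y

main :: IO ()
main = print (viaLambda 1 2 3 4, viaParams 1 2 3 4)  -- (15,15)
```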
It is indeed possible to write any Haskell program without using an explicit lambda anywhere by replacing them with named local functions or partial application.
See also: Declaration vs. expression style.
When you can declare named curried functions (such as your Haskell derivApprox) it is never necessary to use an explicit lambda expression. Every explicit lambda expression can be replaced with a partial application of a named function that takes the free variables of the lambda expression as its first parameters.
Why one would want to do this in source code is not easy to see, but some implementations essentially work that way.
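The replacement described above can be sketched as follows. All names here (addToAll, addN, addToAllLifted) are invented for the illustration: the lambda has one free variable, n, which becomes the first parameter of the named function.

```haskell
-- Version with an explicit lambda; n is a free variable inside it.
addToAll :: Int -> [Int] -> [Int]
addToAll n xs = map (\x -> x + n) xs

-- The lambda turned into a named function, with n as its first parameter.
addN :: Int -> Int -> Int
addN n x = x + n

-- The lambda is replaced by a partial application of the named function.
addToAllLifted :: Int -> [Int] -> [Int]
addToAllLifted n xs = map (addN n) xs

main :: IO ()
main = do
  print (addToAll 10 [1, 2, 3])        -- [11,12,13]
  print (addToAllLifted 10 [1, 2, 3])  -- [11,12,13]
```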
Also, somewhat beside the point, would the following rewriting (different from what I've just described) count as avoiding lambdas for you?
derivApprox f = let myfunc h x = (f (x + h) - f x) / h in myfunc
If you only use a function once, e.g. as a parameter to map or foldr or some other higher-order function, then it is often better to use a lambda than a named function, because it immediately becomes clear that the function isn't used anywhere else - it can't be, because it doesn't have a name. When you introduce a new named function, you give people reading your code another thing to remember for the duration of the scope. So lambdas are never strictly speaking necessary, but they are often preferable to the alternative.