Haskell type signature in lambda expression - haskell

Suppose I have a lambda expression in my program like:
\x -> f $ x + 1
and I want to specify for type safety that x must be an Integer. Something like:
-- WARNING: bad code
\x::Int -> f $ x + 1

You can just write \x -> f $ (x::Int) + 1 instead. Or, perhaps more readable, \x -> f (x + 1 :: Int). Note that type signatures generally encompass everything to their left, as far left as makes syntactic sense, which is the opposite of lambdas extending to the right.
The GHC extension ScopedTypeVariables incidentally allows writing signatures directly in patterns, which would allow \(x::Int) -> f $ x + 1. But that extension also adds a bunch of other stuff you might not want to worry about; I wouldn't turn it on just for a syntactic nicety.
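All three spellings can be seen side by side in a small sketch (the function `f` here is made up just so the snippets compile):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- A hypothetical f, purely for illustration.
f :: Int -> Int
f n = n * 2

g1, g2, g3 :: Int -> Int
g1 = \x -> f $ (x :: Int) + 1   -- signature on the variable
g2 = \x -> f (x + 1 :: Int)     -- signature on the whole argument
g3 = \(x :: Int) -> f $ x + 1   -- pattern signature (needs the extension)

main :: IO ()
main = print (g1 4, g2 4, g3 4)  -- (10,10,10)
```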

I want to add to C.A.McCann's answer by noting that you don't need ScopedTypeVariables. Even if you never use the variable, you can always still do:
\x -> let _ = (x :: T) in someExpressionThatDoesNotUseX


Eta-conversion changes semantics in a strict language

Take this OCaml code:
let silly (g : (int -> int) -> int) (f : int -> int -> int) =
  g (f (print_endline "evaluated"; 0))
silly (fun _ -> 0) (fun x -> fun y -> x + y)
It prints evaluated and returns 0. But if I eta-expand f to get g (fun x -> f (print_endline "evaluated"; 0) x), evaluated is no longer printed.
Same holds for this SML code:
fun silly (g : (int -> int) -> int, f : int -> int -> int) : int =
  g (f (print "evaluated" ; 0));
silly ((fn _ => 0), fn x => fn y => x + y);
On the other hand, this Haskell code doesn't print evaluated even with the strict pragma:
{-# LANGUAGE Strict #-}
import Debug.Trace
silly :: ((Int -> Int) -> Int) -> (Int -> Int -> Int) -> Int
silly g f = g (f (trace "evaluated" 0))
main = print $ silly (const 0) (+)
(I can make it print, though, by using seq, which makes perfect sense to me.)
While I understand that OCaml and SML do the right thing theoretically, is there any practical reason to prefer this behaviour to the "lazier" one? Eta-contraction is a common refactoring tool, and I'm totally scared of using it in a strict language. I feel like I should paranoidly eta-expand everything, just because otherwise arguments to partially applied functions can be evaluated when they're not supposed to be. When is the "strict" behaviour useful?
Why and how does Haskell behave differently under the Strict pragma? Are there any references I can familiarize myself with to better understand the design space and pros and cons of the existing approaches?
To address the technical part of your question: eta-conversion also changes the meaning of expressions in lazy languages; you just need to consider the eta-rule of a different type constructor, e.g., + instead of ->.
This is the eta-rule for binary sums:
(case e of Lft y -> f (Lft y) | Rgt y -> f (Rgt y)) = f e (eta-+)
This equation holds under eager evaluation, because e will always be reduced on both sides. Under lazy evaluation, however, the r.h.s. only reduces e if f also forces it. That might make the l.h.s. diverge where the r.h.s. would not. So the equation does not hold in a lazy language.
To make it concrete in Haskell:
f x = 0
lhs = case undefined of Left y -> f (Left y); Right y -> f (Right y)
rhs = f undefined
Here, trying to print lhs will diverge, whereas rhs yields 0.
There is more that could be said about this, but the essence is that the equational theories of both evaluation regimes are sort of dual.
The underlying problem is that under a lazy regime, every type is inhabited by _|_ (non-termination), whereas under eager it is not. That has severe semantic consequences. In particular, there are no inductive types in Haskell, and you cannot prove termination of a structural recursive function, e.g., a list traversal.
There is a line of research in type theory distinguishing data types (strict) from codata types (non-strict) and providing both in a dual manner, thus giving the best of both worlds.
Edit: As for the question why a compiler should not eta-expand functions: that would utterly break every language. In a strict language with effects that's most obvious, because the ability to stage effects via multiple function abstractions is a feature. The simplest example perhaps is this:
let make_counter () =
  let x = ref 0 in
  fun () -> x := !x + 1; !x
let tick = make_counter ()
let n1 = tick ()
let n2 = tick ()
let n3 = tick ()
But effects are not the only reason. Eta-expansion can also drastically change the performance of a program! In the same way you sometimes want to stage effects you sometimes also want to stage work:
match :: String -> String -> Bool
match regex = \s -> run fsm s
  where fsm = ...expensive transformation of regex...
matchFloat = match "[0-9]+(\\.[0-9]*)?((e|E)(+|-)?[0-9]+)?"
Note that I used Haskell here, because this example shows that implicit eta-expansion is not desirable in either eager or lazy languages!
With respect to your final question (why does Haskell do this), the reason "Strict Haskell" behaves differently from a truly strict language is that the Strict extension doesn't really change the evaluation model from lazy to strict. It just makes a subset of bindings into "strict" bindings by default, and only in the limited Haskell sense of forcing evaluation to weak head normal form. Also, it only affects bindings made in the module with the extension turned on; it doesn't retroactively affect bindings made elsewhere. (Moreover, as described below, the strictness doesn't take effect in partial function application. The function needs to be fully applied before any arguments are forced.)
In your particular Haskell example, I believe the only effect of the Strict extension is as if you had explicitly written the following bang patterns in the definition of silly:
silly !g !f = g (f (trace "evaluated" 0))
It has no other effect. In particular, it doesn't make const or (+) strict in their arguments, nor does it generally change the semantics of function applications to make them eager.
So, when the term silly (const 0) (+) is forced by print, the only effect is to evaluate its arguments to WHNF as part of the function application of silly. The effect is similar to writing (in non-Strict Haskell):
let { g = const 0; f = (+) } in g `seq` f `seq` silly g f
Obviously, forcing g and f to their WHNFs (which are lambdas) isn't going to have any side effect, and when silly is applied, const 0 is still lazy in its remaining argument, so the resulting term is something like:
(\x -> 0) ((\x y -> <defn of plus>) (trace "evaluated" 0))
(which should be interpreted without the Strict extension -- these are all lazy bindings here), and there's nothing here that will force the side effect.
As noted above, there's another subtle issue that this example glosses over. Even if you had made everything in sight strict:
{-# LANGUAGE Strict #-}
import Debug.Trace
myConst :: a -> b -> a
myConst x y = x
myPlus :: Int -> Int -> Int
myPlus x y = x + y
silly :: ((Int -> Int) -> Int) -> (Int -> Int -> Int) -> Int
silly g f = g (f (trace "evaluated" 0))
main = print $ silly (myConst 0) myPlus
this still wouldn't have printed "evaluated". This is because, in the evaluation of silly when the strict version of myConst forces its second argument, that argument is a partial application of the strict version of myPlus, and myPlus won't force any of its arguments until it's been fully applied.
This also means that if you change the definition of myPlus to:
myPlus x = \y -> x + y -- now it will print "evaluated"
then you'll be able to largely reproduce the ML behavior. Because myPlus is now fully applied, it will force its argument, and this will print "evaluated". You can suppress it again by eta-expanding f in the definition of silly:
silly g f = g (\x -> f (trace "evaluated" 0) x) -- now it won't
because now when myConst forces its second argument, that argument is already in WHNF (because it's a lambda), and we never get to the application of f, full or not.
In the end, I guess I wouldn't take "Haskell plus the Strict extension and unsafe side effects like trace" too seriously as a good point in the design space. Its semantics may be (barely) coherent, but they sure are weird. I think the only serious use case is when you have some code whose semantics "obviously" don't depend on lazy versus strict evaluation but where performance would be improved by a lot of forcing. Then, you can just turn on Strict for a performance boost without having to think too hard.

Does Haskell "understand" curried function definitions?

In Haskell functions always take one parameter. Multiple parameters are implemented via Currying. That being the case, I can see how a function of two parameters would be defined as "func1" below. It's a function that returns a function (closure) that adds the outer function's single parameter to the returned function's single parameter.
However, although this is how curried functions work, that's not the regular Haskell syntax for defining a two-parameter function. Instead we're taught to define such a function like "func2".
I'd like to know how Haskell understands that func2 should behave the same way as func1. There's nothing about the definition of func2 that suggests to me that it is a function that returns a function. On the contrary, it actually looks like a two-parameter function, something we're told doesn't exist!
What's the trick here? Is Haskell just born knowing that we can define multi-parameter functions in this textbook way, and that they work the way we expect anyhow? That is, is this a syntax convention that doesn't seem to be clearly documented (Haskell knows what you mean and will supply the missing function return for you), or is there some other magic at work or something I'm missing?
func1 :: Int -> (Int -> Int)
func1 x = (\y -> x + y)
func2 :: Int -> Int -> Int
func2 x y = x + y
main = do
  print (func1 7 9)
  print (func2 7 9)
In the language itself, writing a function definition of the form f x y z = _ is equivalent to f = \x y z -> _, which is equivalent to f = \x -> \y -> \z -> _. There's no theoretical reason for this; it's just that those nested lambda abstractions are a terrible eye-/finger-sore and everyone thought that it would be fine to sacrifice a bit of pedantry to make some syntax sugar for it. That's all there is on the surface and is probably all you need to know, for now.
In the implementation of the language, though, things get trickier. In GHC, which is the most common implementation, there actually is a difference between f x y = _ and f = \x -> \y -> _. When GHC compiles Haskell, it assigns arity to declarations. The former definition of f has arity 2, and the latter has arity 0. Take (.) from GHC.Base
(.) f g = \x -> f (g x)
(.) has arity 2, even though its type ((b -> c) -> (a -> b) -> a -> c) says that it can be applied up to thrice. This affects optimization: GHC will only inline a function that is saturated, or has at least as many arguments applied as its arity. In the call (maximum .), (.) will not inline, because it only has one argument (it is unsaturated). In the call (maximum . f), it will inline to \x -> maximum (f x), and in (maximum . f) 1, the (.) will inline first to a lambda abstraction (producing (\x -> maximum (f x)) 1), which will beta-reduce to maximum (f 1). If (.) were implemented
(.) f g x = f (g x)
(.) would have arity 3, which means it would inline less often (specifically the f . g case, which is a very common argument to higher order functions), likely reducing performance, which is exactly what the comment on it says:
Make sure it has TWO args only on the left, so that it inlines
when applied to two functions, even if there is no final argument
Final answer: the two forms should be equivalent, according to the language's semantics, but in GHC the two forms have different characteristics when it comes to optimization, even if they always give the same result.
When talking about type signatures, there is no such thing as a "multi-parameter function". All functions are single-parameter, period. Haskell doesn't need to somehow "translate" multi-parameter functions into single-parameter ones, because the former doesn't exist at all.
All function type signatures look like a -> b, where a is argument type and b is return type. Sometimes b may just happen to contain more arrows ->, in which case we, humans (but not the compiler), may say that the function has multiple parameters.
When talking about the syntax for implementations, i.e. f x y = z - that is merely syntactic sugar, which gets desugared (i.e. mechanically transformed) into f = \x -> \y -> z during compilation.
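To make the desugaring concrete, here is a small sketch (with invented names) showing that the sugared and explicitly curried definitions are interchangeable, including under partial application:

```haskell
add1 :: Int -> Int -> Int
add1 x y = x + y          -- declaration-style sugar

add2 :: Int -> Int -> Int
add2 = \x -> \y -> x + y  -- what it desugars to

-- Both can be partially applied, because both really are
-- one-argument functions returning functions.
inc1, inc2 :: Int -> Int
inc1 = add1 1
inc2 = add2 1

main :: IO ()
main = print (add1 3 4, add2 3 4, inc1 41, inc2 41)  -- (7,7,42,42)
```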

Haskell - Lambda calculus equivalent syntax?

While writing some lambda functions in Haskell, I was originally writing the functions like:
tru = \t f -> t
fls = \t f -> f
However, I soon noticed from the examples online that such functions are frequently written like:
tru = \t -> \f -> t
fls = \t -> \f -> f
Specifically, each of the items passed to the function have their own \ and -> as opposed to above. When checking the types of these they appear to be the same. My question is, are they equivalent or do they actually differ in some way? And not only for these two functions, but does it make a difference for functions in general? Thank you much!
They're the same; Haskell automatically curries things to keep the syntax nice. The following are all equivalent**
foo a b = (a, b)
foo a = \b -> (a, b)
foo = \a b -> (a, b)
foo = \a -> \b -> (a, b)
-- Or we can simply eta convert leaving
foo = (,)
If you want to be idiomatic, prefer either the first or the last. Introducing unnecessary lambdas is good for teaching currying, but in real code just adds syntactic clutter.
However, in raw lambda calculus (not Haskell), most people curry manually with
\a -> \b -> a b
because people don't write a lot of lambda calculus by hand, and when they do, they tend to stick to unsugared lambda calculus to keep things simple.
** modulo the monomorphism restriction, which won't impact you anyways with a type signature.
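As a quick sanity check, here is a sketch that gives each of the forms above its own signature (which, as the footnote notes, sidesteps the monomorphism restriction) and verifies they agree:

```haskell
foo1, foo2, foo3, foo4, foo5 :: a -> b -> (a, b)
foo1 a b = (a, b)           -- declaration style
foo2 a = \b -> (a, b)       -- half sugared
foo3 = \a b -> (a, b)       -- multi-argument lambda
foo4 = \a -> \b -> (a, b)   -- fully curried lambdas
foo5 = (,)                  -- eta-converted

main :: IO ()
main = print (all (== (1 :: Int, 2 :: Int))
                  [f 1 2 | f <- [foo1, foo2, foo3, foo4, foo5]])  -- True
```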
Though, as jozefg said, they are themselves equivalent, they may lead to different execution behaviour when combined with local variable bindings. Consider
f, f' :: Int -> Int -> Int
with the two definitions
f a x = μ*x
  where μ = sum [1..a]
and
f' a = \x -> μ*x
  where μ = sum [1..a]
These sure look equivalent, and certainly will always yield the same results.
GHCi, version 7.6.2: http://www.haskell.org/ghc/ :? for help
...
[1 of 1] Compiling Main            ( def0.hs, interpreted )
Ok, modules loaded: Main.
*Main> sum $ map (f 10000) [1..10000]
2500500025000000
*Main> sum $ map (f' 10000) [1..10000]
2500500025000000
However, if you try this yourself, you'll notice that f takes quite a lot of time, whereas with f' you get the result immediately. The reason is that f' is written in a form that prompts GHC to compile it so that f' 10000 is actually evaluated before being mapped over the list. In that step, the value μ is calculated and stored in the closure of (f' 10000). On the other hand, f is treated simply as "one function of two variables"; (f 10000) is merely stored as a closure containing the parameter 10000, and μ is not calculated at all at first. Only when map applies (f 10000) to each element in the list is the whole sum [1..a] calculated, which takes some time for each element of [1..10000]. With f', this is not necessary because μ was pre-calculated.
In principle, common-subexpression elimination is an optimisation that GHC is able to do itself, so you might at times get good performance even with a definition like f. But you can't really count on it.
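One way to observe the difference directly is to put a trace on the shared value (a sketch; the exact trace counts depend on optimisation level, since at -O GHC's full-laziness pass can float μ out of either definition):

```haskell
import Debug.Trace (trace)

f, f' :: Int -> Int -> Int
f a x = mu * x
  where mu = trace "computing mu" (sum [1 .. a])

f' a = \x -> mu * x
  where mu = trace "computing mu" (sum [1 .. a])

main :: IO ()
main = do
  let g = f 10
  print (g 1 + g 2)    -- 165; typically traces "computing mu" twice (unoptimised)
  let g' = f' 10
  print (g' 1 + g' 2)  -- 165; traces it once: mu lives in g''s closure
```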

Haskell type designation

I have to designate the types of 2 functions (without using the compiler's :t). I just don't know how I should read these functions to take the correct steps.
f x = map -1 x
f x = map (-1) x
Well, I'm a bit confused about how it will be parsed.
Function application, or "the empty space operator", has higher precedence than any operator symbol, so the first line parses as f x = map - (1 x), which will most likely1 be a type error.
The other example is parenthesized the way it looks, but note that (-1) desugars as negate 1. This is an exception from the normal rule, where operator sections like (+1) desugar as (\x -> x + 1), so this will also likely1 be a type error since map expects a function, not a number, as its first argument.
1 I say likely because it is technically possible to provide Num instances for functions which may allow this to type check.
For questions like this, the definitive answer is to check the Haskell Report. The relevant syntax hasn't changed from Haskell 98.
In particular, check the section on "Expressions". That should explain how expressions are parsed, operator precedence, and the like.
These functions do not have types, because they do not type check (you will get ridiculous type class constraints). To figure out why, you need to know that (-1) has type Num n => n, and you need to read up on how a - is interpreted with or without parens before it.
The following function is the "correct" version of your function:
f x = map (subtract 1) x
You should be able to figure out the type of this function, if I say that:
subtract 1 :: Num n => n -> n
map :: (a -> b) -> [a] -> [b]
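Putting that together, the corrected version type checks and runs (a quick sketch):

```haskell
-- subtract 1 is the section-safe way to write "minus one" as a function.
f :: Num n => [n] -> [n]
f x = map (subtract 1) x

main :: IO ()
main = print (f [1, 2, 3 :: Int])  -- [0,1,2]
```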
Well, I did it by myself :P
(map) - (1 x)
(-)::Num a => a->a->a
1::Num b=> b
x::e
map::(c->d)->[c]->[d]
map::a
a\(c->d)->[c]->[d]
(1 x)::a
1::e->a
f::(Num ((c->d)->[c]->[d]),Num (e->(c->d)->[c]->[d])) => e->(c->d)->[c]->[d]

When are lambda forms necessary in Haskell?

I'm a newbie to Haskell, and a relative newbie to functional programming.
In other (besides Haskell) languages, lambda forms are often very useful.
For example, in Scheme:
(define (deriv-approx f)
  (lambda (h x)
    (/ (- (f (+ x h))
          (f x))
       h)))
Would create a closure (over the function f) to approximate a derivative (at value x, with interval h).
However, this usage of a lambda form doesn't seem to be necessary in Haskell, due to its partial application:
derivApprox f h x = ( (f (x + h)) - (f x) ) / h
What are some examples where lambda forms are necessary in Haskell?
Edit: replaced 'closure' with 'lambda form'
I'm going to give two slightly indirect answers.
First, consider the following code:
module Lambda where
derivApprox f h x = ( (f (x + h)) - (f x) ) / h
I've compiled this while telling GHC to dump an intermediate representation, which is roughly a simplified version of Haskell used as part of the compilation process, to get this:
Lambda.derivApprox
:: forall a. GHC.Real.Fractional a => (a -> a) -> a -> a -> a
[LclIdX]
Lambda.derivApprox =
\ (# a) ($dFractional :: GHC.Real.Fractional a) ->
let {
$dNum :: GHC.Num.Num a
[LclId]
$dNum = GHC.Real.$p1Fractional # a $dFractional } in
\ (f :: a -> a) (h :: a) (x :: a) ->
GHC.Real./
# a
$dFractional
(GHC.Num.- # a $dNum (f (GHC.Num.+ # a $dNum x h)) (f x))
h
If you look past the messy annotations and verbosity, you should be able to see that the compiler has turned everything into lambda expressions. We can consider this an indication that you probably don't need to do so manually.
Conversely, let's consider a situation where you might need lambdas. Here's a function that uses a fold to compose a list of functions:
composeAll :: [a -> a] -> a -> a
composeAll = foldr (.) id
What's that? Not a lambda in sight! In fact, we can go the other way, as well:
composeAll' :: [a -> a] -> a -> a
composeAll' xs x = foldr (\f g x -> f (g x)) id xs x
Not only is this full of lambdas, it's also taking two arguments to the main function and, what's more, applying foldr to all of them. Compare the type of foldr, (a -> b -> b) -> b -> [a] -> b, to the above; apparently it takes three arguments, but above we've applied it to four! Not to mention that the accumulator function takes two arguments, but we have a three argument lambda here. The trick, of course, is that both are returning a function that takes a single argument; and we're simply applying that argument on the spot, instead of juggling lambdas around.
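For instance, both versions compose right-to-left, like ordinary function composition (a small check, with arbitrarily chosen functions):

```haskell
composeAll, composeAll' :: [a -> a] -> a -> a
composeAll = foldr (.) id
composeAll' xs x = foldr (\f g y -> f (g y)) id xs x

main :: IO ()
main = do
  -- foldr (.) id [h1, h2, h3] = h1 . h2 . h3, applied right-to-left:
  -- subtract 3, then (* 2), then (+ 1)
  print (composeAll [(+ 1), (* 2), subtract 3] (10 :: Int))   -- 15
  print (composeAll' [(+ 1), (* 2), subtract 3] (10 :: Int))  -- 15
```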
All of which, hopefully, has convinced you that the two forms are equivalent. Lambda forms are never necessary, or perhaps always necessary, because who can tell the difference?
There is no semantic difference between
f x y z w = ...
and
f x y = \z w -> ...
The main difference between expression style (explicit lambdas) and declaration style is a syntactic one. One situation where it matters is when you want to use a where clause:
f x y = \z w -> ...
where ... -- x and y are in scope, z and w are not
It is indeed possible to write any Haskell program without using an explicit lambda anywhere by replacing them with named local functions or partial application.
See also: Declaration vs. expression style.
When you can declare named curried functions (such as your Haskell deriv-approx) it is never necessary to use an explicit lambda expression. Every explicit lambda expression can be replaced with a partial application of a named function that takes the free variables of the lambda expression as its first parameters.
Why one would want to do this in source code is not easy to see, but some implementations essentially work that way.
Also, somewhat beside the point, would the following rewriting (different from what I've just described) count as avoiding lambdas for you?
derivApprox f = let myfunc h x = (f (x + h) - f x) / h in myfunc
If you only use a function once, e.g. as a parameter to map or foldr or some other higher-order function, then it is often better to use a lambda than a named function, because it immediately becomes clear that the function isn't used anywhere else - it can't be, because it doesn't have a name. When you introduce a new named function, you give people reading your code another thing to remember for the duration of the scope. So lambdas are never strictly speaking necessary, but they are often preferable to the alternative.
