I have to determine the types of 2 functions (without using the compiler's :t); I just don't know how I should read these functions in order to take the correct steps.
f x = map -1 x
f x = map (-1) x
Well, I'm a bit confused about how it will be parsed.
Function application, or "the empty space operator", has higher precedence than any operator symbol, so the first line parses as f x = map - (1 x), which will most likely¹ be a type error.
The other example is parenthesized the way it looks, but note that (-1) desugars to negate 1. This is an exception to the normal rule, where operator sections like (+1) desugar to (\x -> x + 1). So this will also likely¹ be a type error, since map expects a function, not a number, as its first argument.
¹ I say likely because it is technically possible to provide Num instances for functions, which may allow this to type check.
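For the curious, here is a minimal sketch of the kind of instance that footnote alludes to, lifting Num pointwise over functions (purely for illustration, not something you would want in real code):
-- Needs a modern GHC, where Num has no Eq/Show superclasses.
-- Note that map - (1 x) would still also need a Num instance for
-- lists before it could fully type check.
instance Num b => Num (a -> b) where
  f + g       = \x -> f x + g x
  f - g       = \x -> f x - g x
  f * g       = \x -> f x * g x
  negate f    = negate . f
  abs f       = abs . f
  signum f    = signum . f
  fromInteger = const . fromInteger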
For questions like this, the definitive answer is to check the Haskell Report. The relevant syntax hasn't changed from Haskell 98.
In particular, check the section on "Expressions". That should explain how expressions are parsed, operator precedence, and the like.
These functions do not have types, because they do not type check (you will get ridiculous type class constraints). To figure out why, you need to know that (-1) has type Num n => n, and you need to read up on how a - is interpreted with or without parens before it.
The following function is the "correct" version of your function:
f x = map (subtract 1) x
You should be able to figure out the type of this function, if I say that:
subtract 1 :: Num n => n -> n
map :: (a -> b) -> [a] -> [b]
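(Spoiler, in case you want to check your answer: unifying map's argument type a -> b with n -> n forces a and b to be the same numeric type, so)
f :: Num n => [n] -> [n]
f x = map (subtract 1) x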
Well, I did it by myself :P
(map) - (1 x)
(-) :: Num a => a -> a -> a
1 :: Num b => b
x :: e
map :: (c -> d) -> [c] -> [d]
map :: a
a ~ (c -> d) -> [c] -> [d]
(1 x) :: a
1 :: e -> a
f :: (Num ((c -> d) -> [c] -> [d]), Num (e -> (c -> d) -> [c] -> [d])) => e -> (c -> d) -> [c] -> [d]
I read about the Monomorphism Restriction on the page https://www.haskell.org/tutorial/pitfalls.html and could not understand the last point:
A common violation of the restriction happens with functions defined
in a higher-order manner, as in this definition of sum from the
Standard Prelude:
sum = foldl (+) 0
As is, this would cause a static type error. We can fix the problem by
adding the type signature:
sum :: (Num a) => [a] -> a
Also note that this problem would not have arisen if we had written:
sum xs = foldl (+) 0 xs
because the restriction only applies to pattern bindings.
Why does the last point not cause any error?
because the restriction only applies to pattern bindings.
Essentially, the MR does not apply when we are defining a function using a function binding of the form
f arg1 ... argN = ...
with N > 0.
The intuition is as follows. The purpose of the MR is to avoid turning Haskell non-functions into lower-level functions accidentally. For instance,
x = 3 + 4
is not a function. However, its type is Num a => a, which is usually implemented as a function from a Num dictionary to the result of 3 + 4, where + is a function defined by the dictionary. This can lead to bad performance, since every time we use x the sum will need to be recomputed from scratch. This is unavoidable if we want to compute print (x :: Int) >> print (x :: Double), for instance. But actually using x at different types is rather uncommon.
So, the MR makes x monomorphic, preventing us from using it at more than a single type. That way, recomputation can be avoided.
However, if x is already a function there is no harm in keeping that polymorphic, since we are "recomputing" function calls anyway. So, the MR does not apply to function bindings.
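To make the contrast concrete, here is the tutorial's sum example written out both ways (sum', sum'', and sum''' are illustrative names chosen to avoid clashing with the Prelude):
-- Pattern binding: no argument on the left, so the MR applies and,
-- without a signature, the definition is rejected (or pinned to a
-- single monomorphic type by defaulting, depending on the compiler):
-- sum' = foldl (+) 0

-- Adding a signature restores the polymorphic type:
sum'' :: Num a => [a] -> a
sum'' = foldl (+) 0

-- Function binding: an argument on the left, so the MR does not
-- apply; no signature is needed and the inferred type is polymorphic.
sum''' xs = foldl (+) 0 xs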
Can someone explain to me step by step what this function means?
select :: (a->a->Bool) -> a -> a -> a
As the comments pointed out, this is not a function definition, but just a type signature. It says, for any type a which you are free to choose, this function expects:
A function that takes two values of type a and gives a Bool
Two values of type a
and it returns another value of type a. So for example, we could call:
select (<) 1 2
where a is Int, since (<) is a function that takes two Ints and returns a Bool. We could not call:
select isPrefixOf 1 2
because isPrefixOf :: (Eq a) => [a] -> [a] -> Bool -- i.e. it takes two lists (provided that the element type supports equality), but numbers are not lists.
Signatures can tell us quite a lot, however, due to parametricity (aka free theorems). The details are quite technical, but we can intuit that select must return one of its two arguments, because it has no other way to construct values of type a, about which it knows nothing (and this can be proven).
But beyond that we can't really tell. Often you can tell almost certainly what a function does by its signature. But as I explored this signature, I found that there were actually quite a few functions it could be, from the most obvious:
select f x y = if f x y then x else y
to some rather exotic ones:
select f x y = if f x x && f y y then x else y
And the name select doesn't help much -- it seems to tell us that it will return one of the two arguments, but the signature already told us that.
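A quick GHCi session (illustrative; select1 and select2 stand for the two definitions above) confirms that they really are different functions sharing the same type:
λ> let select1 f x y = if f x y then x else y
λ> let select2 f x y = if f x x && f y y then x else y
λ> select1 (<) 1 2
1
λ> select2 (<) 1 2
2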
In Haskell functions always take one parameter. Multiple parameters are implemented via Currying. That being the case, I can see how a function of two parameters would be defined as "func1" below. It's a function that returns a function (closure) that adds the outer function's single parameter to the returned function's single parameter.
However, although this is how curried functions work, that's not the regular Haskell syntax for defining a two-parameter function. Instead we're taught to define such a function like "func2".
I'd like to know how Haskell understands that func2 should behave the same way as func1. There's nothing about the definition of func2 that suggests to me that it is a function that returns a function. On the contrary, it actually looks like a two-parameter function, something we're told doesn't exist!
What's the trick here? Is Haskell just born knowing that we can define multi-parameter functions in this textbook way, and that they work the way we expect anyhow? That is, is this a syntax convention that doesn't seem to be clearly documented (Haskell knows what you mean and will supply the missing function return for you), or is there some other magic at work or something I'm missing?
func1 :: Int -> (Int -> Int)
func1 x = (\y -> x + y)
func2 :: Int -> Int -> Int
func2 x y = x + y
main = do
  print (func1 7 9)
  print (func2 7 9)
In the language itself, writing a function definition of the form f x y z = _ is equivalent to f = \x y z -> _, which is equivalent to f = \x -> \y -> \z -> _. There's no theoretical reason for this; it's just that those nested lambda abstractions are a terrible eye-/finger-sore and everyone thought that it would be fine to sacrifice a bit of pedantry to make some syntax sugar for it. That's all there is on the surface and is probably all you need to know, for now.
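Concretely (f1, f2, and f3 are just illustrative names), these three definitions are interchangeable:
f1 :: Int -> Int -> Int -> Int
f1 x y z = x + y * z

f2 :: Int -> Int -> Int -> Int
f2 = \x y z -> x + y * z

f3 :: Int -> Int -> Int -> Int
f3 = \x -> \y -> \z -> x + y * z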
In the implementation of the language, though, things get trickier. In GHC, which is the most common implementation, there actually is a difference between f x y = _ and f = \x -> \y -> _. When GHC compiles Haskell, it assigns an arity to each declaration. The former definition of f has arity 2, and the latter has arity 0. Take (.) from GHC.Base:
(.) f g = \x -> f (g x)
(.) has arity 2, even though its type ((b -> c) -> (a -> b) -> a -> c) says that it can be applied up to thrice. This affects optimization: GHC will only inline a function that is saturated, or has at least as many arguments applied as its arity. In the call (maximum .), (.) will not inline, because it only has one argument (it is unsaturated). In the call (maximum . f), it will inline to \x -> maximum (f x), and in (maximum . f) 1, the (.) will inline first to a lambda abstraction (producing (\x -> maximum (f x)) 1), which will beta-reduce to maximum (f 1). If (.) were implemented
(.) f g x = f (g x)
(.) would have arity 3, which means it would inline less often (specifically the f . g case, which is a very common argument to higher order functions), likely reducing performance, which is exactly what the comment on it says:
Make sure it has TWO args only on the left, so that it inlines
when applied to two functions, even if there is no final argument
Final answer: the two forms should be equivalent, according to the language's semantics, but in GHC the two forms have different characteristics when it comes to optimization, even if they always give the same result.
When talking about type signatures, there is no such thing as a "multi-parameter function". All functions are single-parameter, period. Haskell doesn't need to somehow "translate" multi-parameter functions into single-parameter ones, because the former doesn't exist at all.
All function type signatures look like a -> b, where a is argument type and b is return type. Sometimes b may just happen to contain more arrows ->, in which case we, humans (but not the compiler), may say that the function has multiple parameters.
When talking about the syntax for implementations, i.e. f x y = z - that is merely syntactic sugar, which gets desugared (i.e. mechanically transformed) into f = \x -> \y -> z during compilation.
I've taken up learning Haskell again after a short hiatus, and I am currently trying to get a better understanding of how recursion and lambda expressions work in Haskell.
In this YouTube video, there is a function example that puzzles me far more than it probably should, in terms of how it actually works:
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f = foldr (\x acc -> if f x then x else acc)
For the sake of clarity and since it wasn't immediately obvious to me, I'll give an example of applying this function to some arguments:
firstThat (>10) 2000 [10,20,30,40] -- returns 20, but would return 2000 if none of the values in the list were greater than 10
Please correct me, if my assumptions are wrong.
It seems firstThat takes three arguments:
a function that takes one argument and returns a Boolean value. Since the > operator is actually an infix function, the first argument in the example above seems to be the result of a partial application of the > function – is this correct?
an unspecified value of the same type expected as the missing argument to the function provided as the first argument
a list of values of the aforementioned type
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments, I gathered there is some kind of partial application happening. The lambda expression provided as an argument to foldr seems to be missing its arguments too.
So, how exactly does this function work? I apologize if I am being too dense or fail to see the forest for the trees, but I just cannot wrap my head around it, which is frustrating.
Any helpful explanation or example(s) would be greatly appreciated.
Thanks!
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments, I gathered there is some kind of partial application happening.
You are right. However, there is a nicer way of putting it than talking about "missing arguments" -- one that doesn't lead you into asking where they have gone. Here are two ways in which the arguments are not missing.
Firstly, consider this function:
add :: Num a => a -> a -> a
add x y = x + y
As you may know, we can also define it like this:
add :: Num a => a -> a -> a
add = (+)
That works because Haskell functions are values like any other. We can simply define a value, add, as being equal to another value, (+), which just happens to be a function. There is no special syntax required to declare a function. The upshot is that writing arguments explicitly is (almost) never necessary; the main reason we do so is that it often makes code more readable (for instance, I could define firstThat without writing the f parameter explicitly, but I won't do so because the result is rather hideous).
Secondly, whenever you see a function type with three arguments...
firstThat :: (a -> Bool) -> a -> [a] -> a
... you can also read it like this...
firstThat :: (a -> Bool) -> (a -> [a] -> a)
... that is, a function of one argument that produces a function of two arguments. That works for all functions of more than one argument. The key takeaway is that, at heart, all Haskell functions take just one argument. That is why partial application works. So on seeing...
firstThat :: (a -> Bool) -> a -> [a] -> a
firstThat f = foldr (\x acc -> if f x then x else acc)
... you can accurately say that you have written explicitly all parameters that firstThat takes -- that is, only one :)
The lambda expression provided as an argument to foldr seem to be missing its arguments too.
Not really. foldr (when restricted to lists) is...
foldr :: (a -> b -> b) -> b -> [a] -> b
... and so the function passed to it takes two arguments (feel free to add air quotes around "two", given the discussion above). The lambda was written as...
\x acc -> if f x then x else acc
... with two explicit arguments, x and acc.
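To see those two arguments in action, here is a sketch of how the example call unrolls (step is just an illustrative name for the lambda):
-- step x acc = if x > 10 then x else acc
firstThat (>10) 2000 [10,20,30,40]
  = foldr step 2000 [10,20,30,40]
  = step 10 (step 20 (step 30 (step 40 2000)))
  = step 20 (step 30 (step 40 2000))   -- 10 > 10 is False, so keep acc
  = 20                                 -- 20 > 10 is True; laziness stops here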
a function that takes one argument and returns a Boolean value. Since the > operator is actually an infix function, the first argument in the example above seems to be the result of a partial application of the > function – is this correct?
yes: (>10) is short for \x -> x > 10, just as (10>) would be short for \x -> 10 > x.
an unspecified value of the same type expected as the missing argument to the function provided as the first argument
first of all, it's not a missing argument: by omitting an argument, you obtain a function value. however, the type of the 2nd argument does indeed match the argument type of the function (>10), just as it matches the type of the elements of the list [10,20,30,40] (which is the better reasoning).
a list of values of the aforementioned type
yes.
But the actual function firstThat seems to be defined differently from its type declaration, with just one argument. Since foldr normally takes three arguments, I gathered there is some kind of partial application happening. The lambda expression provided as an argument to foldr seems to be missing its arguments too.
that's because given e.g. foo x y z = x * y * z, these 2 lines are equivalent:
bar x = foo x
bar x y z = foo x y z
— that's because of a concept called currying. Currying is also the reason why function type signatures are not (a, b) -> c but instead a -> b -> c, which in turn is equivalent to a -> (b -> c) because of the right associativity of the -> type operator.
Therefore, these two lines are equivalent:
firstThat f = foldr (\x acc -> if f x then x else acc)
firstThat f x y = foldr (\x acc -> if f x then x else acc) x y
Note that you can also use Data.List.find combined with Data.Maybe.fromMaybe:
λ> fromMaybe 2000 $ find (>10) [10, 20, 30]
20
λ> fromMaybe 2000 $ find (>10) [1, 2, 3]
2000
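For a self-contained version (a sketch; firstThat' is just an illustrative name):
import Data.List (find)
import Data.Maybe (fromMaybe)

-- Returns the first element satisfying p, or def if there is none.
firstThat' :: (a -> Bool) -> a -> [a] -> a
firstThat' p def = fromMaybe def . find p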
See also:
https://en.wikipedia.org/wiki/Currying.
https://www.fpcomplete.com/user/EFulmer/currying-and-partial-application
http://learnyouahaskell.com/higher-order-functions
How should one reason about function evaluation in examples like the following in Haskell:
let f x = ...
    x = ...
in map (g (f x)) xs
In GHC, sometimes (f x) is evaluated only once, and sometimes once for each element of xs, depending on what exactly f and g are. This can be important when f x is an expensive computation. It has just tripped up a Haskell beginner I was helping, and I didn't know what to tell him other than that it is up to the compiler. Is there a better story?
Update
In the following example (f x) will be evaluated 4 times:
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
With language extensions, we can create situations where f x must be evaluated repeatedly:
{-# LANGUAGE GADTs, Rank2Types #-}
module MultiEvG where

data BI where
  B :: (Bounded b, Integral b) => b -> BI

foo :: [BI] -> [Integer]
foo xs = let f :: (Integral c, Bounded c) => c -> c
             f x = maxBound - x
             g :: (forall a. (Integral a, Bounded a) => a) -> BI -> Integer
             g m (B y) = toInteger (m + y)
             x :: (Integral i) => i
             x = 3
         in map (g (f x)) xs
The crux is to have f x polymorphic even as the argument of g, and we must create a situation where the type(s) at which it is needed can't be predicted (my first stab used an Either a b instead of BI, but when optimising, that of course led to only two evaluations of f x at most).
A polymorphic expression must be evaluated at least once for each type it is used at. That's one reason for the monomorphism restriction. However, when the range of types it can be needed at is restricted, it is possible to memoise the values at each type, and in some circumstances GHC does that (needs optimising, and I expect the number of types involved mustn't be too large). Here we confront it with what is basically an inhomogeneous list, so in each invocation of g (f x), it can be needed at an arbitrary type satisfying the constraints, so the computation cannot be lifted outside the map (technically, the compiler could still build a cache of the values at each used type, so it would be evaluated only once per type, but GHC doesn't, in all likelihood it wouldn't be worth the trouble).
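A small demonstration of the "once per type" behaviour (a sketch; compiled without optimisations you would typically see the trace fire twice):
import Debug.Trace (trace)

poly :: Num a => a
poly = trace "evaluating poly" (3 + 4)

main :: IO ()
main = do
  print (poly :: Int)     -- "evaluating poly" is printed, then 7
  print (poly :: Double)  -- printed again: a separate evaluation at Double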
Monomorphic expressions need only be evaluated once; they can be shared. Whether they are is up to the implementation; by purity, it doesn't change the semantics of the programme. If the expression is bound to a name, in practice you can rely on it being shared, since that is easy and obviously what the programmer wants. If it isn't bound to a name, it's a question of optimisation. With the bytecode generator or without optimisations, the expression will often be evaluated repeatedly, but with optimisations repeated evaluation would indicate a compiler bug.
Polymorphic expressions must be evaluated at least once for every type they're used at, but with optimisations, when GHC can see that it may be used multiple times at the same type, it will (usually) still be shared for that type during a larger computation.
Bottom line: Always compile with optimisations, help the compiler by binding expressions you want shared to a name, and give monomorphic type signatures where possible.
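For instance, here is a sketch of making the sharing explicit by binding the shared expression to a name (example and expensive are illustrative stand-ins):
example :: [Int] -> [Int]
example xs = map (+ fx) xs
  where
    fx :: Int                -- monomorphic signature, per the advice above
    fx = expensive 10        -- bound to a name: computed once for the map
    expensive :: Int -> Int
    expensive n = n * n      -- stand-in for a costly computation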
Your examples are indeed quite different.
In the first example, the argument to map is g (f x), and it is passed once to map, most likely as a partially applied function.
Should g (f x), when applied to an argument within map, evaluate its first argument, then this will be done only once, and the thunk (f x) will be updated with the result.
Hence, in your first example, f x will be evaluated at most once.
Your second example requires a deeper analysis before the compiler can conclude that (f x) is always constant in the lambda expression. Perhaps it will never optimize it at all, because it may have knowledge that trace is not quite kosher. So, f x may be evaluated 4 times when tracing, and 4 times or once when not tracing.
This is really dependent on GHC's optimizations, as you've been able to tell.
The best thing to do is to study the GHC core that you get after optimizing the program. I would look at the generated Core and examine whether f x had its own let statement outside the map or not.
If you want to be sure, then you should factor f x out into its own variable assigned in a let, but there's not really a guaranteed way to figure it out other than reading through Core.
All that said, with the exception of things like trace that use unsafePerformIO, this will never change the semantics of your program: how it actually behaves.
In GHC without optimizations, the body of a function is evaluated every time the function is called. (A "call" means the function is applied to arguments and the result is evaluated.) In the following example, f x is inside a function, so it will execute each time the function is called.
(GHC may optimize this expression as discussed in the FAQ [1].)
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
However, if we move f x out of the function, it will execute only once.
let f x = trace "!" $ zip x x
    x = "abc"
in map ((\f_x i -> lookup i f_x) (f x)) "abcd"
This can be rewritten more readably as
let f x = trace "!" $ zip x x
    x = "abc"
    g f_x i = lookup i f_x
in map (g (f x)) "abcd"
The general rule is that, each time a function is applied to an argument, a new "copy" of the function body is created. Function application is the only thing that may cause an expression to re-execute. However, be warned that some functions and function calls do not look like functions syntactically.
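One classic example: an overloaded constant (a hypothetical snippet, tying back to the monomorphism restriction discussion above).
-- Looks like a plain value, but its overloaded type means it is
-- compiled (roughly) as a function from a Num dictionary to a result,
-- so a use at a new type is a fresh "call" that re-runs the body.
three :: Num a => a
three = fromInteger 3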
[1] http://www.haskell.org/haskellwiki/GHC/FAQ#Subexpression_Elimination