Symbol ! in Haskell in front of a variable name in a function [duplicate]

I came across the following definition as I try to learn Haskell using a real project to drive it. I don't understand what the exclamation mark in front of each argument means and my books didn't seem to mention it.
data MidiMessage = MidiMessage !Int !MidiMessage

It's a strictness declaration. Basically, it means that it must be evaluated to what's called "weak head normal form" when the data structure value is created. Let's look at an example, so that we can see just what this means:
data Foo = Foo Int Int !Int !(Maybe Int)
f = Foo (2+2) (3+3) (4+4) (Just (5+5))
The binding f above, when evaluated, will be a "thunk": that is, the code to execute to figure out its value. At that point, a Foo doesn't even exist yet, just the code.
But at some point someone may try to look inside it, probably through a pattern match:
case f of
Foo 0 _ _ _ -> "first arg is zero"
_ -> "first arge is something else"
This is going to execute enough code to do what it needs, and no more. So it will create a Foo with four parameters (because you can't look inside it without it existing). The first, since we're testing it, we need to evaluate all the way to 4, where we realize it doesn't match.
The second doesn't need to be evaluated, because we're not testing it. Thus, rather than 6 being stored in that memory location, we'll just store the code for possible later evaluation, (3+3). That will turn into a 6 only if someone looks at it.
The third parameter, however, has a ! in front of it, so is strictly evaluated: (4+4) is executed, and 8 is stored in that memory location.
The fourth parameter is also strictly evaluated. But here's where it gets a bit tricky: we're evaluating not fully, but only to weak head normal form. This means that we figure out whether it's Nothing or Just something, and store that, but we go no further. That means that we store not Just 10 but actually Just (5+5), leaving the thunk inside unevaluated. This is important to know, though I think that all the implications of this go rather beyond the scope of this question.
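Here's a small runnable sketch of the above, using undefined to probe exactly what the bang annotations force (a sketch, not part of the original question):
data Foo = Foo Int Int !Int !(Maybe Int)

main :: IO ()
main = do
  -- Just undefined is already in WHNF, so forcing the strict fourth
  -- field stops at the Just constructor and never touches the inside:
  let ok = Foo (2+2) (3+3) (4+4) (Just undefined)
  case ok of
    Foo _ _ _ _ -> putStrLn "constructed fine"
  -- But a strict field that is itself undefined throws as soon as the
  -- constructor application is forced:
  let bad = Foo 0 0 undefined (Just 0)
  case bad of
    Foo _ _ _ _ -> putStrLn "never printed"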
You can annotate function arguments in the same way, if you enable the BangPatterns language extension:
f x !y = x*y
f (1+1) (2+2) will return the thunk (1+1)*4.
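For the curious, a bang pattern behaves roughly like forcing the argument with seq before using it; a quick sketch of that equivalence (f' is an illustrative name):
{-# LANGUAGE BangPatterns #-}

-- The bang forces y to WHNF before the body is entered:
f :: Int -> Int -> Int
f x !y = x * y

-- Roughly equivalent, without the extension:
f' :: Int -> Int -> Int
f' x y = y `seq` x * y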

A simple way to see the difference between strict and non-strict constructor arguments is how they behave when they are undefined. Given
data Foo = Foo Int !Int
first (Foo x _) = x
second (Foo _ y) = y
Since the non-strict argument isn't evaluated by second, passing in undefined doesn't cause a problem:
> second (Foo undefined 1)
1
But the strict argument can't be undefined, even if we don't use the value:
> first (Foo 1 undefined)
*** Exception: Prelude.undefined

I believe it is a strictness annotation.
Haskell is a pure and lazy functional language, but sometimes the overhead of laziness can be too much or wasteful. So to deal with that, you can ask the compiler to evaluate the arguments to a function instead of passing thunks around.
There's more information on this page: Performance/Strictness.


How does return statement work in Haskell? [duplicate]

Consider these functions
f1 :: Maybe Int
f1 = return 1
f2 :: [Int]
f2 = return 1
Both have the same expression return 1, but the results are different: f1 gives the value Just 1 and f2 gives the value [1].
It looks like Haskell invokes two different versions of return based on the return type. I'd like to know more about this kind of function invocation. Is there a name for this feature in programming languages?
This is a long meandering answer!
As you've probably seen from the comments and Thomas's excellent (but very technical) answer, you've asked a very hard question. Well done!
Rather than try to explain the technical answer I've tried to give you a broad overview of what Haskell does behind the scenes without diving into technical detail. Hopefully it will help you to get a big picture view of what's going on.
return is an example of type inference.
Most modern languages have some notion of polymorphism. For example var x = 1 + 1 will set x equal to 2. In a statically typed language 2 will usually be an int. If you say var y = 1.0 + 1.0 then y will be a float. The operator + (which is just a function with special syntax) is polymorphic: there are different versions for different numeric types.
Most imperative languages, especially object-oriented languages, can only do type inference one way. Every variable has a fixed type. When you call a function it looks at the types of the arguments and chooses a version of that function that fits those types (or complains if it can't find one).
When you assign the result of a function to a variable the variable already has a type and if it doesn't agree with the type of the return value you get an error.
So in an imperative language the "flow" of type deduction follows time in your program: deduce the type of a variable, do something with it, and deduce the type of the result. In a dynamically typed language (such as Python or JavaScript) the type of a variable is not assigned until the value of the variable is computed (which is why there don't seem to be types). In a statically typed language the types are worked out ahead of time (by the compiler) but the logic is the same. The compiler works out what the types of variables are going to be, but it does so by following the logic of the program in the same way as the program runs.
In Haskell the type inference also follows the logic of the program. Being Haskell, it does so in a very mathematically pure way (called System F). The language of types (that is, the rules by which types are deduced) is similar to Haskell itself.
Now remember Haskell is a lazy language. It doesn't work out the value of anything until it needs it. That's why it makes sense in Haskell to have infinite data structures. It never occurs to Haskell that a data structure is infinite because it doesn't bother to work it out until it needs to.
Now all that lazy magic happens at the type level too. In the same way that Haskell doesn't work out what the value of an expression is until it really needs to, Haskell doesn't work out what the type of an expression is until it really needs to.
Consider this function
func (x : y : rest) = (x,y) : func rest
func _ = []
If you ask Haskell for the type of this function it has a look at the definition, sees [] and : and deduces that it's working with lists. But it never needs to look at the types of x and y; it just knows that they have to be the same because they come from the same list. So it deduces the type of the function as [a] -> [(a, a)], where a is a type that it hasn't bothered to work out yet.
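You can confirm this in GHCi:
:type func
func :: [a] -> [(a, a)]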
So far no magic. But it's useful to notice the difference between this idea and how it would be done in an OO language. Haskell doesn't convert the arguments to Object, do its thing and then convert back. Haskell just hasn't been asked explicitly what the type of the list is. So it doesn't care.
Now try typing the following into ghci
maxBound - length ""
maxBound : "Hello"
Now what just happened !? minBound bust be a Char because I put it on the front of a string and it must be an integer because I added it to 0 and got a number. Plus the two values are clearly very different.
So what is the type of minBound? Let's ask ghci!
:type minBound
minBound :: Bounded a => a
Aaargh! What does that mean? Basically it means that it hasn't bothered to work out exactly what a is, but it has to be Bounded. If you type :info Bounded you get three useful lines
class Bounded a where
minBound :: a
maxBound :: a
and a lot of less useful lines
So if a is Bounded there are values minBound and maxBound of type a.
In fact under the hood Bounded is passed around as just a value: a record (a "dictionary") with fields minBound and maxBound. Because it's a value, Haskell doesn't look at it until it really needs to.
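If you wrote that record out by hand it might look something like this (BoundedDict and the field names are illustrative, not GHC's real internals):
-- The class as a plain record of values (a "dictionary"):
data BoundedDict a = BoundedDict
  { minBoundD :: a
  , maxBoundD :: a
  }

-- The Char "instance" as an ordinary value, reusing the real bounds:
charBounded :: BoundedDict Char
charBounded = BoundedDict minBound maxBound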
So I appear to have meandered somewhere in the region of the answer to your question. Before we move on to return (which, as you may have noticed from the comments, is a wonderfully complex beast), let's look at read.
ghci again
read "42" + 7
read "'H'" : "ello"
length (read "[1,2,3]")
and hopefully you won't be too surprised to find that there are definitions
read :: Read a => String -> a
class Read a where
read :: String -> a
so Read a is just a record containing a single value, which is a function String -> a. It's very tempting to assume that there is one read function which looks at a string, works out what type is contained in the string, and returns that type. But it does the opposite. It completely ignores the string until it's needed. When the value is needed, Haskell first works out what type it's expecting; once it's done that it goes and gets the appropriate version of the read function and combines it with the string.
Now consider something slightly more complex:
readList :: Read a => [String] -> [a]
readList strs = map read strs
under the hood readList actually takes two arguments
readList' :: Read a -> [String] -> [a]   -- pseudo-code: the dictionary is an ordinary argument
readList' (Read {read = f}) strs = map f strs
Again, as Haskell is lazy it only bothers looking at the arguments when it needs to find the return value; at that point it knows what a is, so the compiler can go and find the right version of Read. Until then it doesn't care.
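Here's a runnable sketch of that dictionary-passing translation (ReadDict and readListWith are illustrative names, not what the compiler really generates):
-- The dictionary is just a record holding the read function:
newtype ReadDict a = ReadDict { readD :: String -> a }

readListWith :: ReadDict a -> [String] -> [a]
readListWith dict strs = map (readD dict) strs

main :: IO ()
main = print (readListWith (ReadDict read) ["1", "2", "3"] :: [Int])
-- prints [1,2,3]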
Hopefully that's given you a bit of an idea of what's happening and why Haskell can "overload" on the return type. But it's important to remember it's not overloading in the conventional sense. Every function has only one definition. It's just that one of the arguments is a bag of functions. readList' doesn't ever know what types it's dealing with. It just knows it gets a function String -> a and some Strings; to do the application it just passes the arguments to map. map in turn doesn't even know it gets strings. When you get deeper into Haskell it becomes very important that functions don't know very much about the types they're dealing with.
Now let's look at return.
Remember how I said that the type system in Haskell was very similar to Haskell itself. Remember that in Haskell functions are just ordinary values.
Does this mean I can have a type that takes a type as an argument and returns another type? Of course it does!
You've seen some type functions Maybe takes a type a and returns another type which can either be Just a or Nothing. [] takes a type a and returns a list of as. Type functions in Haskell are usually containers. For example I could define a type function BinaryTree which stores a load of a's in a tree like structure. There are of course lots of much stranger ones.
So, if these type functions are similar to ordinary types I can have a typeclass that contains type functions. One such typeclass is Monad
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
so here m is some type function. If I want to define Monad for m I need to define return and the scary looking operator below it (which is called bind)
As others have pointed out, return is a really misleading name for a fairly boring function. The team that designed Haskell have since realised their mistake and they're genuinely sorry about it. return is just an ordinary function that takes an argument and returns a Monad with that type in it. (You never asked what a Monad actually is so I'm not going to tell you.)
Let's define Monad for m = Maybe!
First I need to define return. What should return x be? Remember I'm only allowed to define the function once, so I can't look at x because I don't know what type it is. I could always return Nothing, but that seems a waste of a perfectly good function. Let's define return x = Just x because that's literally the only other thing I can do.
What about the scary bind thing? What can we say about x >>= f? Well, x is a Maybe a for some unknown type a, and f is a function that takes an a and returns a Maybe b. Somehow I need to combine these to get a Maybe b.
So I need to define Nothing >>= f. I can't call f because it needs an argument of type a, and I don't have a value of type a; I don't even know what a is. I've only got one choice, which is to define
Nothing >>= f = Nothing
What about Just x >>= f? Well, I know x is of type a and f takes an a as an argument, so f x is already a value of type Maybe b, which is exactly what I need to produce. So ...
Just x >>= f = f x
So I've got a Monad! What if m is List? Well, I can follow a similar sort of logic and define
return x = [x]
[] >>= f = []
(x : xs) >>= f = f x ++ (xs >>= f)
Hooray another Monad! It's a nice exercise to go through the steps and convince yourself that there's no other sensible way of defining this.
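If you want to check that reasoning compiles, here are the same definitions under fresh names (so they don't clash with the Prelude's return and >>=):
returnMaybe :: a -> Maybe a
returnMaybe x = Just x

bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing
bindMaybe (Just x) f = f x

returnList :: a -> [a]
returnList x = [x]

bindList :: [a] -> (a -> [b]) -> [b]
bindList []       _ = []
bindList (x : xs) f = f x ++ bindList xs f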
So what happens when I call return 1?
Nothing!
Haskell's lazy, remember. The thunk return 1 ("thunk" is the technical term) just sits there until someone needs the value. As soon as Haskell needs the value it knows what type the value should be. In particular it can deduce that m is List. Now that it knows that, Haskell can find the instance of Monad for List. As soon as it does that it has access to the correct version of return.
So finally Haskell is ready to call return, which in this case returns [1]!
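You can watch the type annotation pick the instance in GHCi:
> return 1 :: Maybe Int
Just 1
> return 1 :: [Int]
[1]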
The return function is from the Monad class:
class Applicative m => Monad (m :: * -> *) where
...
return :: a -> m a
So return takes any value of type a and results in a value of type m a. The monad, m, as you've observed is polymorphic using the Haskell type class Monad for ad hoc polymorphism.
At this point you probably realize return is not a good, intuitive name. It's not even a built-in function or a statement like in many other languages. In fact a better-named and identically-operating function exists: pure. In almost all cases return = pure.
That is, the function return is the same as the function pure (from the Applicative class) - I often think to myself "this monadic value is purely the underlying a" and I try to use pure instead of return if there isn't already a convention in the codebase.
You can use return (or pure) for any type that is an instance of Monad. This includes the Maybe monad to get a value of type Maybe a:
instance Monad Maybe where
...
return = pure -- which is from Applicative
...
instance Applicative Maybe where
pure = Just
Or for the list monad to get a value of [a]:
instance Applicative [] where
{-# INLINE pure #-}
pure x = [x]
Or, as a more complex example, Aeson's parse monad to get a value of type Parser a:
instance Applicative Parser where
pure a = Parser $ \_path _kf ks -> ks a


Can I declare a NULL value in Haskell?

Just curious: it seems that when declaring a name, we always give it some valid value, like let a = 3. In imperative languages including C and Java, there's always a "null" keyword. Does Haskell have anything similar? When could a function object be null?
There is a “null” value that you can use for variables of any type. It's called ⊥ (pronounced bottom). We don't need a keyword to produce bottom values; actually ⊥ is the value of any computation which doesn't terminate. For instance,
bottom = let x = x in x -- or simply `bottom = bottom`
will loop forever. It's obviously not a good idea to do this deliberately; however, you can use undefined as a “standard bottom value”. It's perhaps the closest thing Haskell has to Java's null keyword.
But you definitely shouldn't/can't use this for most of the applications where Java programmers would grab for null.
Since everything in Haskell is immutable, a value that's undefined will always stay undefined. It's not possible to use this as a “hold on a second, I'll define it later” indication†.
It's not possible to check whether a value is bottom or not. For rather deep theoretical reasons, in fact. So you can't use this for values that may or may not be defined.
And you know what? It's really good that Haskell doesn't allow this! In Java, you constantly need to be wary that values might be null. In Haskell, if a value is bottom then something is plain broken; it will never be part of intended behaviour or something you might need to check for. If it's intended that some value might not be defined, then you must always make this explicit by wrapping the type in a Maybe. By doing this, you make sure that anybody trying to use the value must first check whether it's there. It's not possible to forget this and run into a null-reference exception at runtime!
And because Haskell is so good at handling variant types, checking the contents of a Maybe-wrapped value is really not too cumbersome. You can just do it explicitly with pattern matching,
quun :: Int -> String
quun i = case computationWhichMayFail i of
Just j -> show j
Nothing -> "blearg, failed"
computationWhichMayFail :: Int -> Maybe Int
or you can use the fact that Maybe is a functor. Indeed it is an instance of almost every specific functor class: Functor, Applicative, Alternative, Foldable, Traversable, Monad, MonadPlus. It also lifts semigroups to monoids.
Dᴏɴ'ᴛ Pᴀɴɪᴄ now,
you don't need to know what the heck these things are. But when you've learned what they do, you will be able to write very concise code that automagically handles missing values always in the right way, with zero risk of missing a check.
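For a small taste of what those instances buy you, here's a sketch (halve is a made-up example function):
import Data.Maybe (fromMaybe)
import Control.Applicative ((<|>))

halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- Functor: transform the contents if present, defaulting on Nothing:
demo :: Int -> String
demo n = fromMaybe "blearg, failed" (show <$> halve n)

-- Alternative: take the first value that is actually there:
fallback :: Maybe Int
fallback = halve 3 <|> Just 42   -- Just 42, since halve 3 is Nothing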
†Because Haskell is lazy, you generally don't need to defer any calculations to be done later. The compiler will automatically see to it that the computation is done when it's necessary, and no sooner.
There is no null in Haskell. What you want is the Maybe monad.
data Maybe a
= Just a
| Nothing
Nothing refers to classic null and Just contains a value.
You can then pattern match against it:
foo Nothing = Nothing
foo (Just a) = Just (a * 10)
Or with case syntax:
let m = Just 10
in case m of
Just v -> print v
Nothing -> putStrLn "Sorry, there's no value. :("
Or use the superior functionality provided by the typeclass instances for Functor, Applicative, Alternative, Monad, MonadPlus and Foldable.
This could then look like this:
foo :: Maybe Int -> Maybe Int -> Maybe Int
foo x y = do
a <- x
b <- y
return $ a + b
You can even use the more general signature:
foo :: (Monad m, Num a) => m a -> m a -> m a
Which makes this function work for ANY data type that is capable of the functionality provided by Monad. So you can use foo with (Num a) => Maybe a, (Num a) => [a], (Num a) => Either e a and so on.
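Trying the generalized foo in GHCi shows the same code working in different monads:
> foo (Just 1) (Just 2)
Just 3
> foo Nothing (Just 2)
Nothing
> foo [1,2] [10,20]
[11,21,12,22]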
Haskell does not have "null". This is a design feature. It completely prevents any possibility of your code crashing due to a null-pointer exception.
If you look at code written in an imperative language, 99% of the code expects stuff to never be null, and will malfunction catastrophically if you give it null. But then 1% of the code does expect nulls, and uses this feature to specify optional arguments or whatever. But you can't easily tell, by looking at the code, which parts are expecting nulls as legal arguments, and which parts aren't. Hopefully it's documented — but don't hold your breath!
In Haskell, there is no null. If that argument is declared as Customer, then there must be an actual, real Customer there. You can't just pass in a null (intentionally or by mistake). So the 99% of the code that is expecting a real Customer will always work.
But what about the other 1%? Well, for that we have Maybe. But it's an explicit thing; you have to explicitly say "this value is optional". And you have to explicitly check when you use it. You cannot "forget" to check; it won't compile.
So yes, there is no "null", but there is Maybe which is kinda similar, but safer.
Not in Haskell (or in many other FP languages). If you have some expression of some type T, its evaluation will give a value of type T, with the following exceptions:
infinite recursion may make the program "loop forever" and fail to return anything
let f n = f (n+1) in f 0
runtime errors can abort the program early, e.g.:
division by zero, square root of negative, and other numerical errors
head [], fromJust Nothing, and other partial functions used on invalid inputs
explicit calls to undefined, error "message", or other exception-throwing primitives
Note that even though the above cases might be regarded as "special" values called "bottoms" (the name comes from domain theory), you cannot test against these values at runtime, in general. So, these are not at all the same thing as Java's null. More precisely, you can't write things like
-- assume f :: Int -> Int
if (f 5) is a division-by-zero or infinite recursion
then 12
else 4
Some exceptional values can be caught in the IO monad, but forget about that -- exceptions in Haskell are not idiomatic, and roughly only used for IO errors.
If you want an exceptional value which can be tested at run-time, use the Maybe a type, as @bash0r already suggested. This type is similar to Scala's Option[A] or Java's not-so-much-used Optional<A>.
The value of having both a type T and a type Maybe T is being able to precisely identify which functions always succeed, and which ones can fail. In Haskell the following is frowned upon, for instance:
-- Finds a value in a list. Returns -1 if not present.
findIndex :: Eq a => [a] -> a -> Int
Instead this is preferred:
-- Finds a value in a list. Returns Nothing if not present.
findIndex :: Eq a => [a] -> a -> Maybe Int
The result of the latter is less convenient than that of the former, since the Int must be unwrapped at every call. This is good, since this way each user of the function is prevented from simply "ignoring" the not-present case and writing buggy code.
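A sketch of how the Maybe-returning version might be implemented (matching the signature above, which takes the list first):
findIndex :: Eq a => [a] -> a -> Maybe Int
findIndex xs x = go 0 xs
  where
    go _ []       = Nothing
    go i (y : ys)
      | y == x    = Just i
      | otherwise = go (i + 1) ys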

Does a function in Haskell always evaluate its return value?

I'm trying to better understand Haskell's laziness, such as when it evaluates an argument to a function.
From this source:
But when a call to const is evaluated (that’s the situation we are interested in, here, after all), its return value is evaluated too ... This is a good general principle: a function obviously is strict in its return value, because when a function application needs to be evaluated, it needs to evaluate, in the body of the function, what gets returned. Starting from there, you can know what must be evaluated by looking at what the return value depends on invariably. Your function will be strict in these arguments, and lazy in the others.
So a function in Haskell always evaluates its own return value? If I have:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
head (foo [1..]) -- = 4
According to the above paragraph, map (* 2) xs must be evaluated. Intuitively, I would think that means applying the map to the entire list, resulting in an infinite loop.
But, I can successfully take the head of the result. I know that : is lazy in Haskell, so does this mean that evaluating map (* 2) xs just means constructing something else that isn't fully evaluated yet?
What does it mean to evaluate a function applied to an infinite list? If the return value of a function is always evaluated when the function is evaluated, can a function ever actually return a thunk?
Edit:
bar x y = x
var = bar (product [1..]) 1
This code doesn't hang. When I create var, does it not evaluate its body? Or does it set var to product [1..] and not evaluate that? If the latter, bar is not returning its body in WHNF, right, so did it really 'evaluate' x? How could bar be strict in x if it doesn't hang on computing product [1..]?
First of all, Haskell does not specify when evaluation happens so the question can only be given a definite answer for specific implementations.
The following is true for all non-parallel implementations that I know of, like ghc, hbc, nhc, hugs, etc (all G-machine based, btw).
BTW, something to remember is that when you hear "evaluate" for Haskell it normally means "evaluate to WHNF".
Unlike strict languages you have to distinguish between two "callers" of a function, the first is where the call occurs lexically, and the second is where the value is demanded. For a strict language these two always coincide, but not for a lazy language.
Let's take your example and complicate it a little:
foo [] = []
foo (_:xs) = map (* 2) xs
bar x = (foo [1..], x)
main = print (head (fst (bar 42)))
The foo function occurs in bar. Evaluating bar will return a pair, and the first component of the pair is a thunk corresponding to foo [1..]. So bar is what would be the caller in a strict language, but in the case of a lazy language it doesn't call foo at all, instead it just builds the closure.
Now, in the main function we actually need the value of head (fst (bar 42)) since we have to print it. So the head function will actually be called. The head function is defined by pattern matching, so it needs the value of the argument. So fst is called. It too is defined by pattern matching and needs its argument so bar is called, and bar will return a pair, and fst will evaluate and return its first component.
And now finally foo is "called"; and by called I mean that the thunk is evaluated (entered as it's sometimes called in TIM terminology), because the value is needed. The only reason the actual code for foo is called is that we want a value. So foo had better return a value (i.e., a WHNF).
The foo function will evaluate its argument and end up in the second branch. Here it will tail call into the code for map. The map function is defined by pattern match and it will evaluate its argument, which is a cons. So map will return the following {(*2) y} : {map (*2) ys}, where I have used {} to indicate a closure being built. So as you can see map just returns a cons cell with the head being a closure and the tail being a closure.
To understand the operational semantics of Haskell better I suggest you look at some paper describing how to translate Haskell to some abstract machine, like the G-machine.
I always found that the term "evaluate," which I had learned in other contexts (e.g., Scheme programming), always got me all confused when I tried to apply it to Haskell, and that I made a breakthrough when I started to think of Haskell in terms of forcing expressions instead of "evaluating" them. Some key differences:
"Evaluation," as I learned the term before, strongly connotes mapping expressions to values that are themselves not expressions. (One common technical term here is "denotations.")
In Haskell, the process of forcing is IMHO most easily understood as expression rewriting. You start with an expression, and you repeatedly rewrite it according to certain rules until you get an equivalent expression that satisfies a certain property.
In Haskell the "certain property" has the unfriendly name weak head normal form ("WHNF"), which really just means that the expression is either a nullary data constructor or an application of a data constructor.
Let's translate that to a very rough set of informal rules. To force an expression expr:
If expr is a nullary constructor or a constructor application, the result of forcing it is expr itself. (It's already in WHNF.)
If expr is a function application f arg, then the result of forcing it is obtained this way:
Find the definition of f.
Can you pattern match this definition against the expression arg? If not, then force arg and try again with the result of that.
Substitute the pattern match variables in the body of f with the parts of (the possibly rewritten) arg that correspond to them, and force the resulting expression.
One way of thinking of this is that when you force an expression, you're trying to rewrite it minimally to reduce it to an equivalent expression in WHNF.
Let's apply this to your example:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
-- We want to force this expression:
head (foo [1..])
We will need definitions for head and map:
head [] = undefined
head (x:_) = x
map _ [] = []
map f (x:xs) = f x : map f xs
-- Not real code, but a rule we'll be using for forcing infinite ranges.
[n..] ==> n : [(n+1)..]
So now:
head (foo [1..]) ==> head (foo (1 : [2..]))      -- using the forcing rule for [n..]
                 ==> head (map (*2) [2..])       -- using the definition of foo
                 ==> head (map (*2) (2 : [3..])) -- using the forcing rule for [n..]
                 ==> head (2*2 : map (*2) [3..]) -- using the definition of map
                 ==> 2*2                         -- using the definition of head
                 ==> 4                           -- using the definition of *
I believe the idea must be that in a lazy language if you're evaluating a function application, it must be because you need the result of the application for something. So whatever reason caused the function application to be reduced in the first place is going to continue to need to reduce the returned result. If we didn't need the function's result we wouldn't be evaluating the call in the first place, the whole application would be left as a thunk.
A key point is that the standard "lazy evaluation" order is demand-driven. You only evaluate what you need. Evaluating more risks violating the language spec's definition of "non-strict semantics" and looping or failing for some programs that should be able to terminate; lazy evaluation has the interesting property that if any evaluation order can cause a particular program to terminate, so can lazy evaluation.1
But if we only evaluate what we need, what does "need" mean? Generally it means either
a pattern match needs to know what constructor a particular value is (e.g. I can't know what branch to take in your definition of foo without knowing whether the argument is [] or _:xs)
a primitive operation needs to know the entire value (e.g. the arithmetic circuits in the CPU can't add or compare thunks; I need to fully evaluate two Int values to call such operations)
the outer driver that executes the main IO action needs to know what the next thing to execute is
So say we've got this program:
foo :: Num a => [a] -> [a]
foo [] = []
foo (_:xs) = map (* 2) xs
main :: IO ()
main = print (head (foo [1..]))
To execute main, the IO driver has to evaluate the thunk print (head (foo [1..])) to work out that it's print applied to the thunk head (foo [1..]). print needs to evaluate its argument in order to print it, so now we need to evaluate that thunk.
head starts by pattern matching its argument, so now we need to evaluate foo [1..], but only to WHNF - just enough to tell whether the outermost list constructor is [] or :.
foo starts by pattern matching on its argument. So we need to evaluate [1..], also only to WHNF. That's basically 1 : [2..], which is enough to see which branch to take in foo.2
The : case of foo (with xs bound to the thunk [2..]) evaluates to the thunk map (*2) [2..].
So the call to foo has been evaluated, but only to another thunk. However, we only did that because head was pattern matching to see if we had [] or x : _. We still don't know that, so we must immediately continue evaluating the result of foo.
This is what the article means when it says functions are strict in their result. Given that a call to foo is evaluated at all, its result will also be evaluated (and so, anything needed to evaluate the result will also be evaluated).
But how far it needs to be evaluated depends on the calling context. head is only pattern matching on the result of foo, so it only needs a result to WHNF. We can get an infinite list to WHNF (we already did so, with 1 : [2..]), so we don't necessarily get in an infinite loop when evaluating a call to foo. But if head were some sort of primitive operation implemented outside of Haskell that needed to be passed a completely evaluated list, then we'd be evaluating foo [1..] completely, and thus would never finish in order to come back to head.
So, just to complete my example, we're evaluating map (2 *) [2..].
map pattern matches its second argument, so we need to evaluate [2..] as far as 2 : [3..]. That's enough for map to return the thunk (2 *) 2 : map (2 *) [3..], which is in WHNF. And so it's done, we can finally return to head.
head ((2 *) 2 : map (2 *) [3..]) doesn't need to inspect either side of the :, it just needs to know that there is one so it can return the left side. So it just returns the unevaluated thunk (2 *) 2.
Again though, we only evaluated the call to head this far because print needed to know what its result is, so although head doesn't evaluate its result, its result is always evaluated whenever the call to head is.
(2 *) 2 evaluates to 4, print converts that into the string "4" (via show), and the line gets printed to the output. That was the entire main IO action, so the program is done.
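You can observe this degree-by-degree evaluation yourself in GHCi using :sprint, which prints _ for parts that are still unevaluated thunks:
> let xs = map (* 2) [1 ..] :: [Int]
> :sprint xs
xs = _
> head xs
2
> :sprint xs
xs = 2 : _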
1 Implementations of Haskell, such as GHC, do not always use "standard lazy evaluation", and the language spec does not require it. If the compiler can prove that something will always be needed, or cannot loop/error, then it's safe to evaluate it even when lazy evaluation wouldn't (yet) do so. This can often be faster so GHC optimizations do actually do this.
2 I'm skipping over a few details here, like that print does have some non-primitive implementation we could step inside and lazily evaluate, and that [1..] could be further expanded to the functions that actually implement that syntax.
Not necessarily. Haskell is lazy, meaning that it only evaluates when it needs to. This has some interesting effects. If we take the below code, for example:
-- File: lazinessTest.hs
(>?) :: a -> b -> b
a >? b = b
main = (putStrLn "Something") >? (putStrLn "Something else")
This is the output of the program:
$ ./lazinessTest
Something else
This indicates that putStrLn "Something" is never evaluated. But it's still being passed to the function, in the form of a 'thunk'. These 'thunks' are unevaluated values that, rather than being concrete values, are like a breadcrumb-trail of how to compute the value. This is how Haskell laziness works.
In our case, two 'thunks' are passed to >?, but only one is passed out, meaning that only one is evaluated in the end. This also applies in const, where the second argument can be safely ignored, and therefore is never computed. As for map, laziness means we never ask for the end of the list, so it only bothers to compute what it needs to: in your case, the doubled second element of the original list.
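You can check the const case mentioned above directly in GHCi; the ignored second argument is never forced:
> const 1 undefined
1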
However, it's best to leave the thinking about laziness to the compiler and keep coding, unless you're dealing with IO, in which case you really, really should think about laziness, because you can easily go wrong, as I've just demonstrated.
There are lots and lots of online articles on the Haskell wiki to look at, if you want more detail.
A function can evaluate to its return value:
head (x:_) = x
or to an exception/error:
head _ = error "Head: List is empty!"
or to bottom (⊥):
a = a
b = last [1 ..]

How to write the type declaration of a Haskell function with no arguments?

How do you write the type declaration of a Haskell function without arguments?
There is no such thing as a function without arguments; that would just be a value. Sure, you can write such a declaration:
five :: Int
five = 5
It might look more like what you asked for if I make it
five' :: () -> Int
five' () = 5
but that's completely equivalent (unless you write something ridiculous like five' undefined) and superfluous1.
If what you mean is something like, in C
void scream() {
printf("Aaaah!\n");
}
then that's again not a function but an action. (C programmers do call it a function, but you might better say procedure; everybody would understand.) What I said above holds pretty much the same way, you'd use
scream :: IO()
scream = putStrLn "Aaaah!"
Note that the empty () does not, in this case, have anything to do with not having arguments (that follows already from the absence of -> arrows); instead it means there is also no return value. It's just a "side-effect-only" action.
1Actually, it differs in one relevant way: five is a constant applicative form, which sort of means it's memoised. If I had defined such a constant in some roundabout way (e.g. sum $ 5 : replicate 1000000 0) then the lengthy calculation would be carried out only once, even if five is evaluated multiple times during a program run. OTOH, wherever you would have written out five' (), the calculation would have been done anew.
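A sketch contrasting the two behaviours (the sum is just an illustrative stand-in for a lengthy calculation):
-- A constant applicative form: computed at most once and shared.
five :: Int
five = sum $ 5 : replicate 1000000 0

-- A real function: every call five' () redoes the whole sum.
five' :: () -> Int
five' () = sum $ 5 : replicate 1000000 0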
Since functions in Haskell are pure (their result only depends on their arguments), the equivalent of a function with no arguments is just a value. For example, one = 1.
