Understanding Type of IO () in `let` Expression - haskell

Given:
λ: let f = putStrLn "foo" in 42
42
What is f's type? Why does "foo" not get printed before showing the result of 42?
Lastly, why doesn't the following work?
λ: :t f
<interactive>:1:1: Not in scope: ‘f’

What is f's type?
As you have correctly identified, it is IO (), which can be thought of as an IO action that returns nothing useful (the unit value ()).
Why does "foo" not get printed before showing the result of 42?
Haskell is lazily evaluated, but even forcing f with seq is not enough in this case. GHCi performs an IO action only when the expression you enter evaluates to that action, and a compiled program performs only the actions that make up main. However, there are ways to get around this limitation.
Lastly, why doesn't the following work?
Haskell's let names a value within the scope of an expression, so after the expression has been evaluated f goes out of scope.
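To make the seq point concrete, here is a small sketch. It assumes an IORef counter as a stand-in for printing, so the effect can be observed from code; bump and seqVersusRun are names made up for this illustration.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for putStrLn "foo": performing it bumps a counter.
bump :: IORef Int -> IO ()
bump ref = modifyIORef' ref (+ 1)

-- Returns the counter after forcing f with seq, and again after
-- actually sequencing f inside the do block.
seqVersusRun :: IO (Int, Int)
seqVersusRun = do
  ref <- newIORef 0
  let f = bump ref
  let n = f `seq` (42 :: Int)  -- evaluates f to an action, does not perform it
  n `seq` return ()            -- force n so the seq above really happens
  afterSeq <- readIORef ref
  f                            -- sequencing f is what performs it
  afterRun <- readIORef ref
  return (afterSeq, afterRun)
```

Evaluating seqVersusRun in GHCi should give (0,1): forcing the action leaves the counter alone, and only sequencing it changes anything.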

let f = ... simply defines f, and does not "run" anything. It is vaguely similar to a definition of a new function in imperative programming.
Your full code let f = putStrLn "foo" in 42 could be loosely translated to
{
  function f() {
    print("foo");
  }
  return 42;
}
You wouldn't expect the above to print anything, right?
By comparison, let f = putStrLn "foo" in do f; f; return 42 is similar to
{
  function f() {
    print("foo");
  }
  f();
  f();
  return 42;
}
The correspondence is not perfect, but hopefully you get the idea.
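The same comparison can be sketched in Haskell itself. This assumes a counter in an IORef instead of real printing, so the number of runs can be checked; defineOnly, defineAndRunTwice and countRuns are hypothetical names for this example.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Like `let f = putStrLn "foo" in 42`: f is defined but never used.
defineOnly :: IORef Int -> IO Int
defineOnly ref =
  let f = modifyIORef' ref (+ 1)
  in return 42

-- Like `let f = putStrLn "foo" in do f; f; return 42`.
defineAndRunTwice :: IORef Int -> IO Int
defineAndRunTwice ref =
  let f = modifyIORef' ref (+ 1)
  in do f; f; return 42

-- Runs a body against a fresh counter; returns (result, times f was performed).
countRuns :: (IORef Int -> IO Int) -> IO (Int, Int)
countRuns body = do
  ref <- newIORef 0
  result <- body ref
  runs <- readIORef ref
  return (result, runs)
```

In GHCi, countRuns defineOnly should give (42,0) and countRuns defineAndRunTwice should give (42,2).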

f will be of type IO ().
"foo" is not printed because f is never bound to the real world, that is, it never becomes part of an action that actually gets run. (I can't say this is a friendly explanation. If it sounds like nonsense, you may want to refer to a tutorial to get the idea of monads and lazy evaluation.)
let name = value in (scope) makes the value available inside the scope but not outside it, so :t won't find it in GHCi's top-level scope.
let without in makes it available to :t (this code is only valid in GHCi):
> let f = putStrLn "foo"
> :t f
f :: IO ()

There are two things going on here.
First, consider
let x = sum [1..1000000] in 42
Haskell is lazy. Since we don't actually do anything with x, it is never computed. (Which is just as well, because it would be mildly slow.) Indeed, if you compile this, the compiler will see that x is never used, and delete it (i.e., not generate any compiled code for it).
Second, calling putStrLn does not actually print anything. Rather, it returns a value of type IO (), which you can think of as a kind of "I/O command object". Merely having a command object is different from executing it. By design, the only way to "execute" an I/O command object is to return it from main. At least, it is in a complete program; GHCi has the helpful feature that if you enter an expression that returns an I/O command object, GHCi will execute it for you.
Your expression returns 42; again, f isn't used, so it doesn't do anything.
As chi rightly points out, it's a bit like declaring a local (zero-argument) function but never calling it. You wouldn't expect to see any output.
You can also do something like
actions = [print 5, print 6, print 7, print 8]
This creates a list of I/O command objects. But, again, it does not execute any of them.
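As a rough sketch of what "not executing" means for such a list: the example below assumes a log kept in an IORef in place of print, so the effects can be inspected. record and actionsDemo are hypothetical names; sequence_ is the standard Prelude function that runs a list of actions in order.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for print: appends the value to a log.
record :: IORef [Int] -> Int -> IO ()
record logRef x = modifyIORef' logRef (++ [x])

-- Returns the log before and after the actions are actually run.
actionsDemo :: IO ([Int], [Int])
actionsDemo = do
  logRef <- newIORef []
  let actions = map (record logRef) [5, 6, 7, 8]
  let count = length actions          -- inspects the list, runs nothing
  before <- count `seq` readIORef logRef
  sequence_ (reverse actions)         -- we decide if, when, and in what order
  after <- readIORef logRef
  return (before, after)
```

actionsDemo should give ([],[8,7,6,5]): the list is ordinary data until something like sequence_ turns it into one action that gets run.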
Typically when you write a function that does I/O, it's a do-block that chains everything into one giant I/O command object and returns it to the caller. In that case, you don't really need to understand or think about this distinction between defining a command object and executing it. But the distinction is still there.
It's perhaps easier to see this with a monad that has an explicit run-function. For example, runST takes an ST command object, runs it, and gives you back the answer. But (say) newSTRef, by itself, does nothing but construct an ST command; you have to runST that before anything actually "happens".
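A minimal sketch of the runST point, using newSTRef, modifySTRef' and readSTRef from Data.STRef. The names sumST and total are made up for the example.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Only describes a computation: allocate a reference, add each
-- element to it, read the final value. Nothing runs yet.
sumST :: [Int] -> ST s Int
sumST xs = do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- runST is the explicit "run" step that turns the description into a value.
total :: Int
total = runST (sumST [1 .. 10])
```

Here total evaluates to 55. Until runST is applied, sumST [1 .. 10] is just a value of type ST s Int.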

Related

Is print in Haskell a pure function?

Is print in Haskell a pure function; why or why not? I'm thinking it's not, because it does not always return the same value as pure functions should.
A value of type IO Int is not really an Int. It's more like a piece of paper which reads "hey Haskell runtime, please produce an Int value in such and such way". The piece of paper is inert and remains the same, even if the Ints eventually produced by the runtime are different.
You send the piece of paper to the runtime by assigning it to main. If the IO action never becomes part of main and instead languishes inside some container, it will never get executed.
Functions that return IO actions are pure like the others. They always return the same piece of paper. What the runtime does with those instructions is another matter.
If they weren't pure, we would have to think twice before changing
foo :: (Int -> IO Int) -> IO Int
foo f = liftA2 (+) (f 0) (f 0)
to:
foo :: (Int -> IO Int) -> IO Int
foo f = let x = f 0 in liftA2 (+) x x
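To check that the two versions really behave the same, here is a sketch with an effectful argument. It assumes a hypothetical tick function that bumps a counter and returns its new value; fooA and fooB are the two definitions above under different names.

```haskell
import Control.Applicative (liftA2)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

fooA, fooB :: (Int -> IO Int) -> IO Int
fooA f = liftA2 (+) (f 0) (f 0)
fooB f = let x = f 0 in liftA2 (+) x x

-- Hypothetical effectful function: ignores its argument, bumps a
-- counter, and returns the new count.
tick :: IORef Int -> Int -> IO Int
tick ref _ = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Runs both versions against fresh counters.
compareFoos :: IO (Int, Int)
compareFoos = do
  refA <- newIORef 0
  a <- fooA (tick refA)
  refB <- newIORef 0
  b <- fooB (tick refB)
  return (a, b)
```

compareFoos should give (3,3). Naming the action x does not cause it to run once and be reused; x is the same piece of paper, and liftA2 hands it to the runtime twice.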
Yes, print is a pure function. The value it returns has type IO (), which you can think of as a bunch of code that outputs the string you passed in. For each string you pass in, it always returns the same code.
If you read the description of the pure-function tag (a function that always evaluates to the same result value given the same argument value(s) and that does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices) and then look at the type of putStrLn (print is just putStrLn . show):
putStrLn :: String -> IO ()
You will find a trick there: it always returns a value of type IO (), and executing that value produces effects. So in terms of what happens when the action is executed, it is not pure.
For example, getLine returns IO String but it is also pure (@interjay's contribution). What I'm trying to say is that the answer depends closely on how the question is framed:
On the matter of value, IO () will always be the same IO () value for the same input.
And
On the matter of execution, it is not pure, because executing that IO () could have side effects (putting a string on the screen looks innocent in this case, but some IO could launch nuclear bombs and then return the Int 42).
You could understand this better with the nice approach of @Ben here:
"There are several ways to explain how you're "purely" manipulating
the real world. One is to say that IO is just like a state monad, only
the state being threaded through is the entire world outside your
program (so your Stuff -> IO DBThing function really has an extra
hidden argument that receives the world, and actually returns a
DBThing along with another world; it's always called with different
worlds, and that's why it can return different DBThing values even
when called with the same Stuff). Another explanation is that an IO
DBThing value is itself an imperative program; your Haskell program is
a totally pure function doing no IO, which returns an impure program
that does IO, and the Haskell runtime system (impurely) executes the
program it returns."
And @Erik Allik:
So Haskell functions that return values of type IO a, are actually not
the functions that are being executed at runtime — what gets executed
is the IO a value itself. So these functions actually are pure but
their return values represent non-pure computations.
You can find them here: Understanding pure functions in Haskell with IO

What does ! mean in record definition before field type? [duplicate]

I came across the following definition as I try to learn Haskell using a real project to drive it. I don't understand what the exclamation mark in front of each argument means and my books didn't seem to mention it.
data MidiMessage = MidiMessage !Int !MidiMessage
It's a strictness declaration. Basically, it means that it must be evaluated to what's called "weak head normal form" when the data structure value is created. Let's look at an example, so that we can see just what this means:
data Foo = Foo Int Int !Int !(Maybe Int)
f = Foo (2+2) (3+3) (4+4) (Just (5+5))
The value f above (it takes no arguments, so it is not really a function) starts out as a "thunk": that is, the code to execute to figure out its value. At that point, a Foo doesn't even exist yet, just the code.
But at some point someone may try to look inside it, probably through a pattern match:
case f of
  Foo 0 _ _ _ -> "first arg is zero"
  _ -> "first arg is something else"
This is going to execute enough code to do what it needs, and no more. So it will create a Foo with four parameters (because you can't look inside it without it existing). The first parameter, since we're testing it, has to be evaluated all the way to 4, at which point we realize it doesn't match.
The second doesn't need to be evaluated, because we're not testing it. Thus, rather than 6 being stored in that memory location, we'll just store the code for possible later evaluation, (3+3). That will turn into a 6 only if someone looks at it.
The third parameter, however, has a ! in front of it, so is strictly evaluated: (4+4) is executed, and 8 is stored in that memory location.
The fourth parameter is also strictly evaluated. But here's where it gets a bit tricky: we're evaluating not fully, but only to weak head normal form. This means that we figure out whether it's Nothing or Just something, and store that, but we go no further. That means that we store not Just 10 but actually Just (5+5), leaving the thunk inside unevaluated. This is important to know, though I think that all the implications of this go rather beyond the scope of this question.
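The "only to weak head normal form" part can be checked with undefined, as a sketch. whnfOnly and strictFieldThrows are hypothetical names; evaluate and try come from Control.Exception.

```haskell
import Control.Exception (SomeException, evaluate, try)

data Foo = Foo Int Int !Int !(Maybe Int)

-- The strict Maybe field is forced only as far as its constructor,
-- so the undefined inside Just is never touched. The two lazy
-- fields are not touched either.
whnfOnly :: Bool
whnfOnly = case Foo undefined undefined 8 (Just undefined) of
  Foo _ _ _ (Just _) -> True
  Foo _ _ _ Nothing -> False

-- Building a Foo whose strict Int field is undefined fails as soon
-- as the Foo itself is evaluated.
strictFieldThrows :: IO Bool
strictFieldThrows = do
  r <- try (evaluate (Foo 1 2 undefined Nothing)) :: IO (Either SomeException Foo)
  return (either (const True) (const False) r)
```

whnfOnly is True, and strictFieldThrows should return True as well.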
You can annotate function arguments in the same way, if you enable the BangPatterns language extension:
f x !y = x*y
f (1+1) (2+2) will return the thunk (1+1)*4.
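A small sketch of the difference a bang pattern makes, assuming two made-up functions that both ignore their second argument. throwsWhenForced is a helper invented for this example.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Exception (SomeException, evaluate, try)

-- Never looks at its second argument.
lazySecond :: Int -> Int -> Int
lazySecond x _ = x

-- Forces its second argument to WHNF before returning, even though
-- the value is not used.
strictSecond :: Int -> Int -> Int
strictSecond x !_ = x

-- True if evaluating the value throws an exception.
throwsWhenForced :: Int -> IO Bool
throwsWhenForced v = do
  r <- try (evaluate v) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

throwsWhenForced (lazySecond 1 undefined) should give False, while throwsWhenForced (strictSecond 1 undefined) should give True.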
A simple way to see the difference between strict and non-strict constructor arguments is how they behave when they are undefined. Given
data Foo = Foo Int !Int
first (Foo x _) = x
second (Foo _ y) = y
Since the non-strict argument isn't evaluated by second, passing in undefined doesn't cause a problem:
> second (Foo undefined 1)
1
But the strict argument can't be undefined, even if we don't use the value:
> first (Foo 1 undefined)
*** Exception: Prelude.undefined
I believe it is a strictness annotation.
Haskell is a pure and lazy functional language, but sometimes the overhead of laziness can be too much or wasteful. To deal with that, you can ask the compiler to evaluate the arguments to a function eagerly instead of passing thunks around.
There's more information on this page: Performance/Strictness.

Why this Haskell code never terminates?

I recently wrote some Haskell code and it never terminates. After I carefully examined my code, the problem boiled down to the following code piece
main :: IO ()
main = print $ let a = 10 in
               let a = a in
               a :: Int
I guess this must have something to do with the laziness of Haskell since the same code terminates in OCaml. However, if I wrote the following code instead
main :: IO ()
main = print $ let a = 10 in
               let b = a in
               b :: Int
the code would have no problem terminating at all. I can't see the reason, since in the original code the two a's should be considered two different variables. I don't know why their naming has anything to do with the semantics of the program.
The issue is that, unlike OCaml, let bindings in Haskell are recursive by default. So let x = x in ... is equivalent to OCaml's let rec x = x in ... and is a circular definition.
This is why shadowing variable names in Haskell (i.e. defining a multiple times) is considered bad style and even has a compiler warning, which you can turn on with the -Wall flag or more specifically -fwarn-name-shadowing (spelled -Wname-shadowing in newer GHC versions).
This default makes more sense in Haskell than OCaml because, thanks to laziness, circular values (rather than just recursive functions) are actually useful. let x = 1:x gives us an infinite list of 1, which we can use just like a normal list.
At the same time, some people don't like this for basically exactly the reason you ran into here: it's possible to introduce unintuitive infinite loops in your code, which makes some errors and typos harder to track down. This is also confusing because, by necessity, <- bindings in do-notation are not recursive by default, which is a bit inconsistent.
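A short sketch of both sides of recursive let: a useful circular value, and the fix for the question's code. The names ones, fibs and fixed are chosen for this example only.

```haskell
-- A circular value: xs refers to itself, which laziness makes usable.
ones :: [Int]
ones = let xs = 1 : xs in xs

-- The same trick gives the classic self-referential Fibonacci list.
fibs :: [Int]
fibs = let fs = 0 : 1 : zipWith (+) fs (tail fs) in fs

-- The question's code with the inner binding renamed, so the right-hand
-- side refers to the outer a instead of to itself.
fixed :: Int
fixed = let a = 10 in
        let b = a in
        b
```

take 3 ones gives [1,1,1], take 6 fibs gives [0,1,1,2,3,5], and fixed is 10. Writing let a = a in the inner binding instead would make a refer to itself, which never finishes evaluating.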
The second binding (a = a) shadows the other one. The first example is (almost) exactly equivalent to
main = print $ let xyz = 10 in
               let a = a in
               a :: Int
and I hope it's clear why that one doesn't terminate! You can get GHC to warn you about this by using the -fwarn-name-shadowing flag (or by entering :set -fwarn-name-shadowing in GHCi).

Evaluation of nullary functions in Haskell

Suppose you have a nullary function in Haskell, which is used several times in the code. Is it always evaluated only once? I already tested the following code:
import System.IO.Unsafe (unsafePerformIO)

sayHello :: Int
sayHello = unsafePerformIO $ do
  putStr "Hello"
  return 42

test :: Int -> [Int]
test 0 = []
test n = sayHello : test (n-1)
When I call test 10, it writes "Hello" only once, so it's indicating the result of function is stored after first evaluation. My question is, is it guaranteed? Will I get the same result across different compilers?
Edit
The reason I used unsafePerformIO is to check whether sayHello is evaluated more than once. I don't use that in my program. Normally I expect sayHello to have exactly the same result every time it's evaluated. But it's a time-consuming operation, so I wanted to know if it could be accessed this way, or if it should be passed as an argument wherever it's needed to ensure it is not evaluated multiple times, i.e.:
test _ 0 = []
test s n = s : test s (n-1)
...
test sayHello 10
According to the answers this should be used.
There is no such thing as a nullary function. A function in Haskell has exactly one argument, and always has type ... -> .... sayHello is a value -- an Int -- but not a function. See this article for more.
On guarantees: No, you don't really get any guarantees. The Haskell report specifies that Haskell is non-strict -- so you know what value things will eventually reduce to -- but not any particular evaluation strategy. The evaluation strategy GHC generally uses is lazy evaluation, i.e. non-strict evaluation with sharing, but it doesn't make strong guarantees about that -- the optimizer could shuffle your code around so that things are evaluated more than once.
There are also various exceptions -- for example, foo :: Num a => a is polymorphic, so it probably won't be shared (it's compiled to an actual function). Sometimes a pure value might be evaluated by more than one thread at the same time (that won't happen in this case because unsafePerformIO explicitly uses noDuplicate to avoid it). So when you program, you can generally expect laziness, but if you want any sort of guarantees you'll have to be very careful. The Report itself won't really give you anything on how your program is evaluated.
unsafePerformIO gives you even less in the way of guarantees, of course. There's a reason it's called "unsafe".
Top-level no-argument definitions like sayHello are called Constant Applicative Forms and are always memoised (at least in GHC; see http://www.haskell.org/ghc/docs/7.2.1/html/users_guide/profiling.html). You would have to resort to tricks like passing in dummy arguments and turning optimisations off to not share a CAF globally.
Edit: quote from the link above -
Haskell is a lazy language, and certain expressions are only ever
evaluated once. For example, if we write:
x = nfib 25
then x will only be evaluated once (if at all), and
subsequent demands for x will immediately get to see the cached result.
The definition x is called a CAF (Constant Applicative Form), because
it has no arguments.
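One way to watch this without unsafePerformIO is Debug.Trace, as a rough sketch. expensive and useTwice are hypothetical names, and how often the message appears depends on the compiler and optimisation settings, so treat the behaviour described below as typical for GHC rather than guaranteed.

```haskell
import Debug.Trace (trace)

-- A CAF: a top-level definition with no arguments. The trace message
-- is written to stderr whenever the thunk is actually evaluated.
expensive :: Int
expensive = trace "evaluating expensive" (sum [1 .. 1000])

-- Demands the CAF twice.
useTwice :: Int
useTwice = expensive + expensive
```

useTwice is 1001000 either way. With GHC the trace message typically shows up only once, because the second demand sees the cached result.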
If you do want "Hello" printed n times, you need to remove the unsafePerformIO, so the runtime will know it can't optimize away repeated calls to putStr. I'm not clear whether you want to return the list of Ints, so I've written two versions of test, one of which returns (), the other [Int].
sayHello2 :: IO Int
sayHello2 = do
  putStr "Hello"
  return 42
test2 :: Int -> IO ()
test2 0 = return ()
test2 n = do
  sayHello2
  test2 (n-1)
test3 :: Int -> IO [Int]
test3 0 = return []
test3 n = do
  r <- sayHello2
  l <- test3 (n-1)
  return $ r:l
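The explicit recursion in test2 and test3 can probably be replaced by replicateM_ and replicateM from Control.Monad. This is a sketch of that alternative, not the answer's own code; test2' and test3' are names made up here, and sayHello2 is repeated so the block stands alone.

```haskell
import Control.Monad (replicateM, replicateM_)

sayHello2 :: IO Int
sayHello2 = do
  putStr "Hello"
  return 42

-- Performs the action n times and discards the results.
test2' :: Int -> IO ()
test2' n = replicateM_ n sayHello2

-- Performs the action n times and collects the results.
test3' :: Int -> IO [Int]
test3' n = replicateM n sayHello2
```

test3' 3 prints Hello three times and returns [42,42,42].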

Resources