Performance of lazy evaluation in Haskell when the arguments appear several times

Let's say I have a function which calculates the fourth power of a number, defined by
let power4 x = x*x*x*x
And I try to pass x = (3 + 8)*2
let result = power4 ((3 + 8)*2)
Since in Haskell values are not evaluated until they are needed, does that mean x will be evaluated four times? If so, is there any way for the Haskell compiler to improve on this?
Thanks.

No, it will only be evaluated once. In call by name it would be evaluated four times, but all Haskell implementations are call by need (although the standard does not require this), which means each expression will be evaluated at most once.
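Operationally, call by need means the argument is bound to a single thunk that all four uses share. As a rough sketch of how the call behaves (an illustration of the semantics, not GHC's actual output):
-- power4 ((3 + 8) * 2) behaves as if it were written like this:
result :: Int
result = let x = (3 + 8) * 2   -- one unevaluated thunk is allocated
         in x * x * x * x      -- the first use forces it to 22;
                               -- the remaining uses reuse that value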
Note that this sharing applies per expression instance (per thunk), not per syntactic expression. E.g. there is no guarantee that in:
foo x = x + (1+2)
bar = foo 3 + foo 4
that when computing bar, (1+2) will be evaluated only once. In fact, it will probably be evaluated twice (if compiled without optimizations), since each call of foo builds a fresh thunk for it.
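A hedged way to observe this yourself (assuming GHCi or a build without -O, where GHC does not float the constant out; three and foo' below are illustrative names, not from the answer):
import Debug.Trace (trace)

-- Each call of foo allocates a fresh thunk for (1+2), so without
-- optimizations the message typically prints twice.
foo :: Int -> Int
foo x = x + trace "evaluating (1+2)" (1 + 2)

-- Lifting the constant to its own top-level binding (a CAF)
-- guarantees it is evaluated at most once per program run.
three :: Int
three = trace "evaluating three" (1 + 2)

foo' :: Int -> Int
foo' x = x + three

main :: IO ()
main = do
  print (foo 3 + foo 4)    -- "evaluating (1+2)" appears twice
  print (foo' 3 + foo' 4)  -- "evaluating three" appears once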

If you aren't sure, you could use trace to check (ref: http://www.haskell.org/haskellwiki/Debugging):
import Debug.Trace

power4 x = x*x*x*x
main = print $ power4 (trace "called" ((3+8)*2))
result:
called
234256
so the expression is only evaluated once.

Writing several unit definitions?

I've seen many OCaml programs that have all their functions at the top and then a unit definition at the end, like:
let rec factorial num =
  if num = 0 then 1
  else num * factorial (num-1)

let () =
  let num2 = read_int () in
  print_int (factorial num2)
Why is this? Does it act like a main function? If so, you shouldn't be able to use several of them, right?
What is the best way to handle several inputs, for example? Writing several unit definitions?
Yes, a unit expression at the top level of a module acts like the main function of the module. I.e., it gets executed at the time the program is started.
You can have many unit expressions anywhere you can have one unit expression. The ; operator is specifically intended for such cases:
let () =
  Printf.printf "hello\n";
  Printf.printf "world\n"
As a side comment, I often write a main function in my main module:
let main () =
  (* main calculation of program *)

let () = main ()
This is possibly a holdover from all the years I wrote C code.
I have also seen this in other people's code (possibly there are a lot of us who used to write C code).
I really like Jeffrey's answer, but in case you want extra details and want to know what let () = foo () means, here is some extracurricular reading.
Abstractly speaking, the operation of an OCaml program could be described as a machine that reduces expressions until they become irreducible. An irreducible expression is called a value. For example, 5 + 3 reduces to 8, and there is no way to reduce 8 further, so 8 is a value. A more complex example of a value is (fun x -> x + 1). And a more complex example of an expression would be
(fun x -> x + 1) 5
which reduces to 6.
The whole semantics of the language is defined as a set of such reduction rules. And a program in OCaml is an ordered list of definitions of the form,
let <pattern> = <expression>
So when an OCaml program is evaluated (executed), it reduces the expression part of each definition and matches the result against the pattern on the left-hand side, e.g.,
let 5 = 2 + 3
is a valid definition in OCaml. It will reduce the 2 + 3 expression to 5 and then try to match the resulting value with the left-hand side. If it matches, the next definition is evaluated, and so on. If it doesn't, the program is terminated.
Here 5 is a very simple value that matches only with 5; in general, your values will be more complex. However, there is a value that is even more primitive than 5: the value of type unit, which has only one inhabitant, written (). This is also the value to which expressions evaluated for their side effects colloquially reduce. Since in OCaml every expression must reduce to a value, we need a value that represents no value, and that is unit. For example, print_endline "foo" reduces to () with the side effect of emitting the string foo to the standard output.
Therefore, when we write
let foo () = print_endline "foo"
let () = foo ()
We evaluate (reduce) the application foo () until it reaches the () value, which indicates that foo () was fully reduced.
We could also use a wildcard matcher and write
let _ = foo ()
or bind the result to a variable, e.g.,
let bar = foo ()
But it is considered good style to use () on the left-hand side of a definition whose right-hand side evaluates to (), to indicate that it doesn't produce any interesting value. It also prevents common errors, e.g.,
let () = foo
will yield an error saying that foo has type unit -> unit, which can't be matched with unit, and will even provide a hint: Did you forget to provide `()' as argument?

List comprehension in Haskell with let and show, what is it for?

I'm studying Project Euler solutions and this is the solution to problem 4, which asks to
Find the largest palindrome made from the product of two 3-digit
numbers
problem_4 =
  maximum [x | y <- [100..999], z <- [y..999], let x = y*z, let s = show x, s == reverse s]
I understand that this code creates a list such that x is the product of all possible y and z.
However I'm having a problem understanding what s does here. It looks like everything after | is going to be executed every time a new element from this list is needed, right?
I don't think I understand what's happening here. Shouldn't everything to the right of | be constraints?
A list comprehension is a rather thin wrapper around a do expression:
import Control.Monad (guard)

problem_4 = maximum $ do
  y <- [100..999]
  z <- [y..999]
  let x = y*z
  let s = show x
  guard $ s == reverse s
  return x
Most pieces translate directly; pieces that aren't iterators (<-) or let expressions are treated as arguments to the guard function found in Control.Monad. The effect of guard is to short-circuit the evaluation; for the list monad, this means not executing return x for the particular value of x that led to the false argument.
I don't think I understand what's happening here. Shouldn't everything to the right of | be constraints?
No, at the right part you see an expression that is a comma-separated (,) list of "parts", and every part is one of the following three:
a "generator" of the form somevar <- somelist;
a let statement, which can be used to, for instance, introduce a variable that stores a subresult; and
expressions of type Bool that act like a filter.
So it is not some sort of "constraint programming" where one can simply list some constraints and hope that Haskell figures it out. (Personally, that is the difference between a "programming language" and a "specification language": in a programming language you control how the data flows; in a specification language, that is handled by a system that reads your specifications.)
Basically a generator can be compared to a "foreach" loop in many imperative programming languages. A let statement can be seen as introducing a temporary variable (but note that in Haskell you do not assign variables, you declare them, so you cannot reassign values). The filter can be seen as an if statement.
So the list comprehension would be equivalent to something in Python like:
def problem_4():
    for y in range(100, 1000):
        for z in range(y, 1000):
            x = y * z
            s = str(x)
            if s == s[::-1]:  # compare the string s, not the number x
                yield x
We thus first iterate over two ranges in a nested way, then we declare x to be the product of y and z. With let s = show x we basically convert a number (for example 15129) to its string counterpart (for example "15129"). Finally we use s == reverse s to reverse the string and check whether it is equal to the original string.
Note that there are more efficient ways to test palindromes, especially for products of two numbers.
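For instance, one such alternative is to reverse the digits arithmetically instead of round-tripping through show; a minimal sketch (isPalindrome is an illustrative name, not part of the original solution):
-- Compare a non-negative number with its digit-reversal,
-- e.g. isPalindrome 906609 == True.
isPalindrome :: Int -> Bool
isPalindrome n = n == go n 0
  where
    go 0 acc = acc
    go m acc = go (m `div` 10) (acc * 10 + m `mod` 10)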

What is this expression in Haskell, and how do I interpret it?

I'm learning basic Haskell so I can configure Xmonad, and I ran into this code snippet:
newKeys x = myKeys x `M.union` keys def x
Now I understand what the M.union in backticks means. Here's how I'm interpreting it:
newKeys(x) = M.union(myKeys(x),???)
I don't know what to make of the keys def x. Is it like keys(def(x))? Or keys(def,x)? Or is def some sort of other keyword?
It's keys(def,x).
This is basic Haskell syntax for function application: first the function itself, then its arguments separated by spaces. For example:
f x y = x + y
z = f 5 6
-- z = 11
However, it is not clear what def is without larger context.
In response to your comment: no, def couldn't be a function that takes x as an argument, with the result then passed to keys. This is because function application is left-associative, which basically means that in any bunch of things separated by spaces, only the first one is the function being applied, and the rest are its arguments. In order to express keys(def(x)), one would have to write keys (def x).
If you want to be super technical, then the right way to think about it is that all functions have exactly one parameter. When we declare a function of two parameters, e.g. f x y = x + y, what we really mean is that it's a function of one parameter, which returns another function, to which we can then pass the remaining parameter. In other words, f 5 6 means (f 5) 6.
This idea is kind of one of the core things in Haskell (and any ML offshoot) syntax. It's so important that it has its own name - "currying" (after Haskell Curry, the mathematician).
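A small sketch of both points, associativity and partial application (f and add5 are illustrative names, nothing to do with xmonad):
f :: Int -> Int -> Int
f x y = x + y

add5 :: Int -> Int
add5 = f 5            -- partial application: f 5 is itself a function

main :: IO ()
main = do
  print (f 5 6)       -- parsed as (f 5) 6, giving 11
  print (add5 6)      -- 11 again: the same call, split into two steps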

How does this Haskell code work?

I'm a new student and I'm studying Computer Science. We're tackling Haskell, and while I understand the idea of Haskell, I just can't seem to figure out how exactly this piece of code we're supposed to look at works:
module U1 where

double x = x + x
doubles (d:ds) = (double d):(doubles ds)
ds = doubles [1..]
I admit, it seems rather simple for someone who knows what's happening, but I can't wrap my head around it. If I write take 5 ds, it obviously gives back [2,4,6,8,10]. What I don't get is why.
Here's my train of thought: I call ds, which then looks for doubles. Because I also submit the value [1..], doubles (d:ds) should mean that d = 1 and ds = [2..], correct? I then double d, which returns 2 and puts it at the start of a list (array?). Then it calls upon itself, transferring ds = [2..] to d = 2 and ds = [3..], which then doubles d again and again calls upon itself, and so on and so forth, until it can return 5 values: [2,4,6,8,10].
So first of all, is my understanding right? Do I have any grave mistakes in my string of thought?
Second of all, since it seems to save all the doubled d into a list to call for later, what's the name of that list? Where exactly did I define it?
Thanks in advance, hope you can help out a student to understand this x)
I think you are right about the recursion/loop part, i.e. how doubles goes through each element of the infinite list.
Now regarding
it seems to save all doubled d into a list to call for later, what's
the name of that list? Where exactly did I define it?
This relates to a feature called Lazy Evaluation in Haskell. The list isn't precomputed and stored anywhere. Instead, you can imagine that a list is like a function object in C++ that can generate elements when needed. (The usual phrasing is that expressions are evaluated on demand.) So when you do
take 5 [1..]
[1..] can be viewed as a function object that generates numbers when used with head, take, etc. So,
take 5 [1..] == (1 : take 4 [2..])
Here [2..] is also a "function object" that gives you numbers. Similarly, you can have
take 5 [1..] == (1 : 2 : take 3 [3..]) == ... (1 : 2 : 3 : 4 : 5 : take 0 [6..])
Now, we don't need to care about [6..], because take 0 xs for any xs is []. Therefore, we can have
take 5 [1..] == (1 : 2 : 3 : 4 : 5 : [])
without needing to store any of the "infinite" lists like [2..]. They may be viewed as function objects/generators if you want to get an idea of how Lazy computation can actually happen.
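Applying the same picture to the ds from the question, here is roughly how take 2 ds unfolds under lazy evaluation (a sketch of the reduction steps, in the same style as above):
take 2 (doubles [1..])
== take 2 (double 1 : doubles [2..])
== 2 : take 1 (doubles [2..])
== 2 : take 1 (double 2 : doubles [3..])
== 2 : 4 : take 0 (doubles [3..])
== 2 : 4 : []
== [2,4]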
Your train of thought looks correct. The only minor inaccuracy lies in describing the computation with phrases such as "it doubles 2 and then calls itself ...". In pure functional programming languages, such as Haskell, there actually is no fixed evaluation order. Specifically, in
double 1 : doubles [2..]
it is left unspecified whether doubling 1 happens before or after doubling the rest of the list. Theoretical results guarantee that the order is indeed immaterial, in that -- roughly -- even if you evaluate your expression in a different order you will get the same result. I would recommend that you see this property at work on the Lambda Bubble Pop website: there you can pop bubbles in a different order to simulate any evaluation order. No matter what you do, you will get the same result.
Note that, because evaluation order does not matter, the Haskell compiler is free to choose any evaluation order it deems to be the most appropriate for your code. For instance, let ds be defined as in the final line in your code, and consider
take 5 (drop 5 ds)
this results in [12,14,16,18,20]. Note that the compiler has no need to double the first 5 numbers, since you are dropping them: they are skipped without ever being computed (!!).
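To observe this, the trace trick from the first question works here too; a sketch (doublesT is an illustrative name, with the trace call added purely for inspection):
import Debug.Trace (trace)

-- Like doubles from the question, but announces each doubling
-- at the moment the element is actually forced.
doublesT :: [Int] -> [Int]
doublesT (d:ds) = trace ("doubling " ++ show d) (2 * d) : doublesT ds

main :: IO ()
main = print (take 5 (drop 5 (doublesT [1..])))
-- prints "doubling 6" through "doubling 10" and then [12,14,16,18,20]:
-- the five dropped elements are skipped without ever being doubled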
If you want to experiment, define yourself a function which is very expensive to compute (say, write fibonacci following the recursive definition).
fibonacci 0 = 0
fibonacci 1 = 1
fibonacci n = fibonacci (n-1) + fibonacci (n-2)
Then, define
const5 n = 5
and compute
fibonacci 100
and observe how long that takes (with this doubly recursive definition, fibonacci 100 will not finish in any reasonable time, so try a smaller argument like fibonacci 30 first). Then, evaluate
const5 (fibonacci 100)
and see that the result is immediately reached -- the argument was not even computed (!) since there was no need for it.

Are there any programming languages where variables are really functions?

For example, I would write:
x = 2
y = x + 4
print(y)
x = 5
print(y)
And it would output:
6 (=2+4)
9 (=5+4)
Also, are there any cases where this could actually be useful?
Clarification: Yes, lambdas etc. solve this problem (they were how I arrived at this idea); I was wondering if there were specific languages where this was the default: no function or lambda keywords required or needed.
Haskell will meet you halfway, because essentially everything is a function, but variables are only bound once (meaning you cannot reassign x in the same scope).
It's easy to consider y = x + 4 a variable assignment, but when you look at y = map (+4) [1..] (which means add 4 to every number in the infinite list from 1 upwards), what is y now? Is it an infinite list, or is it a function that returns an infinite list? (Hint: it's the second one.) In this case, treating variables as functions can be extremely beneficial, if not an absolute necessity, when taking advantage of laziness.
Really, in Haskell, your definition of y behaves like a function accepting no arguments and returning x+4 (under the hood it is a lazily evaluated value, a thunk), where x likewise takes no arguments and returns the value 2.
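Concretely, since x itself cannot be reassigned, the closest Haskell spelling of the question's example makes the dependency on x explicit; a minimal sketch:
-- y becomes a function of x instead of a reassignable variable.
y :: Int -> Int
y x = x + 4

main :: IO ()
main = do
  print (y 2)   -- 6
  print (y 5)   -- 9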
In any language with first-class functions, it's trivial to assign anonymous functions to variables, but in most languages you'll have to add parentheses to indicate a function call.
Example Lua code:
x = function() return 2 end
y = function() return x() + 4 end
print(y())
x = function() return 5 end
print(y())
$ lua x.lua
6
9
Or the same thing in Python (sticking with functions, but we could have just used plain integers for x):
x = lambda: 2
y = lambda: x() + 4
print(y())
x = lambda: 5
print(y())
$ python x.py
6
9
You can use Func delegates with lambda expressions in C#:
Func<int, int> y = (x) => x + 5;
Console.WriteLine(y(5)); // 10
Console.WriteLine(y(3)); // 8
... or ...
int x = 0;
Func<int> y = () => x + 5;
x = 5;
Console.WriteLine(y()); // 10
x = 3;
Console.WriteLine(y()); // 8
... if you really want to program in a functional style, the first option would probably be best:
It looks more like the stuff you saw in math class.
You don't have to worry about external state.
Check out various functional languages like F#, Haskell, and Scala. Scala treats functions as objects that have an apply() method, and you can store them in variables and pass them around like you can any other kind of object. I don't know that you can print out the definition of a Scala function as code though.
Update: I seem to recall that at least some Lisps allow you to pretty-print a function as code (eg, Scheme's pretty-print function).
This is the way spreadsheets work.
It is also related to call-by-name semantics for evaluating function arguments. Algol 60 had that, but it didn't catch on; it was too complicated to implement.
The programming language Lucid does this, although it calls x and y "streams" rather than functions.
The program would be written:
y
where
y = x + 4
end
And then you'd input:
x(0): 2
y = 6
x(1): 5
y = 9
Of course, Lucid (like most interesting programming languages) is fairly obscure, so I'm not surprised that nobody else found it (or looked for it).
Try checking out F# and the Wikipedia article on functional programming languages.
I myself have not yet worked with these types of languages since I've been concentrating on OOP, but I will be delving in soon once F# is out.
Hope this helps!
The closest I've seen to this has been part of technical analysis systems in charting components (TradeStation, MetaStock, etc.), but mainly they focus on returning multiple sets of metadata (e.g. buy/sell signals) which can then be fed into other functions that accept either metadata or financial data, or be plotted directly.
My 2c:
I'd say a language such as you suggest would be highly confusing, to say the least. Functions are generally r-values for good reason. This code (JavaScript) shows how treating functions as r-values increases readability (and therefore maintainability) n-fold:
var x = 2;
var y = function() { return x+2; }
alert(y());
x = 5;
alert(y());
Self makes no distinction between fields and methods, both are slots and can be accessed in exactly the same way. A slot can contain a value or a function (so those two are still separate entities), but the distinction doesn't matter to the user of the slot.
In Scala, you have lazy values and call-by-name arguments in functions.
def foo(x: => Int) {
  println(x)
  println(x) // x is evaluated again!
}
In some way, this can have the effect you were looking for.
I believe the mathematically oriented languages like Octave, R and Maxima do that. I could be wrong, but no one else has mentioned them, so I thought I would.
