G-machine, (non-)strict contexts - why case expressions need special treatment - haskell

I'm currently reading Implementing functional languages: a tutorial by SPJ and the (sub)chapter I'll be referring to in this question is 3.8.7 (page 136).
The first remark there is that a reader following the tutorial has not yet implemented C scheme compilation (that is, of expressions appearing in non-strict contexts) of ECase expressions.
The solution proposed is to transform a Core program so that ECase expressions simply never appear in non-strict contexts. Specifically, each such occurrence creates a new supercombinator with exactly one variable which body corresponds to the original ECase expression, and the occurrence itself is replaced with a call to that supercombinator.
Below I present a (slightly modified) example of such transformation from 1
t a b = Pack{2,1} ;
f x = Pack{2,2} (case t x 7 6 of
<1> -> 1;
<2> -> 2) Pack{1,0} ;
main = f 3
== transformed into ==>
t a b = Pack{2,1} ;
f x = Pack{2,2} ($Case1 (t x 7 6)) Pack{1,0} ;
$Case1 x = case x of
<1> -> 1;
<2> -> 2 ;
main = f 3
I implemented this solution and it works like charm, that is, the output is Pack{2,2} 2 Pack{1,0}.
However, what I don't understand is - why all that trouble? I hope it's not just me, but the first thought I had of solving the problem was to just implement compilation of ECase expressions in C scheme. And I did it by mimicking the rule for compilation in E scheme (page 134 in 1 but I present that rule here for completeness): so I used
E[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
and wrote
C[[case e of alts]] p = C[[e]] p ++ [Eval] ++ [Casejump D[[alts]] p]
I added [Eval] because Casejump needs an argument on top of the stack in weak head normal form (WHNF) and C scheme doesn't guarantee that, as opposed to E scheme.
But then the output changes to enigmatic: Pack{2,2} 2 6.
The same applies when I use the same rule as for E scheme, i.e.
C[[case e of alts]] p = E[[e]] p ++ [Casejump D[[alts]] p]
So I guess that my "obvious" solution is inherently wrong - and I can see that from outputs. But I'm having trouble stating formal arguments as to why that approach was bound to fail.
Can someone provide me with such argument/proof or some intuition as to why the naive approach doesn't work?

The purpose of the C scheme is to not perform any computation, but just delay everything until an EVAL happens (which it might or might not). What are you doing in your proposed code generation for case? You're calling EVAL! And the whole purpose of C is to not call EVAL on anything, so you've now evaluated something prematurely.
The only way you could generate code directly for case in the C scheme would be to add some new instruction to perform the case analysis once it's evaluated.
But we (Thomas Johnsson and I) decided it was simpler to just lift out such expressions. The exact historical details are lost in time though. :)

Related

Under what circumstances could Common Subexpression Elimination affect the laziness of a Haskell program?

From wiki.haskell.org:
First of all, common subexpression elimination (CSE) means that if an expression appears in several places, the code is rearranged so that the value of that expression is computed only once. For example:
foo x = (bar x) * (bar x)
might be transformed into
foo x = let x' = bar x in x' * x'
thus, the bar function is only called once. (And if bar is a particularly expensive function, this might save quite a lot of work.)
GHC doesn't actually perform CSE as often as you might expect. The trouble is, performing CSE can affect the strictness/laziness of the program. So GHC does do CSE, but only in specific circumstances --- see the GHC manual. (Section??)
Long story short: "If you care about CSE, do it by hand."
I'm wondering under what circumstances CSE "affects" the strictness/laziness of the program and what kind of effect that could be.
The naive CSE rule would be
e'[e, e] ~> let x = e in e'[x, x].
That is, whenever a subexpression e occurs twice in the expression e', we use a let-binding to compute e once. This however leads itself to some trivial space leaks. For example
sum [1..n] + prod [1..n]
is typically O(1) space usage in a lazy functional programming language like Haskell (as sum and prod would tail-recurse and blah blah blah), but would become O(n) when the naive CSE rule is enacted. This can be terrible for programs when n is high!
The approach is then to make this rule more specific, restricting it to a small set of cases that we know won't have the problem. We can begin by more specifically enumerating the problems with the naive rule, which will form a set of priorities for us to develop a better CSE:
The two occurrences of e might be far apart in e', leading to a long lifetime for the let x = e binding.
The let-binding must always allocate a closure where previously there might not have been one.
This can create an unbound number of closures.
There are cases where the closure might never deallocate.
Something better
let x = e in e'[e] ~> let x = e in e'[x]
This is a more conservative rule but is much safer. Here we recognize that e appears twice but the first occurrence syntactically dominates the second expression, meaning here that the programmer has already introduced a let-binding. We can safely just reuse that let-binding and replace the second occurrence of e with x. No new closures are allocated.
Another example of syntactic domination:
case e of { x -> e'[e] } ~> case e of { x -> e'[x] }
And yet another:
case e of {
Constructor x0 x1 ... xn ->
e'[e]
}
~>
case e of {
Constructor x0 x1 ... xn ->
e'[Constructor x0 x1 ... xn]
}
These rules all take advantage of existing structure in the program to ensure that the kinetics of space usage remain the same before and after the transformation. They are much more conservative than the original CSE but they are also much safer.
See also
For a full discussion of CSE in a lazy FPL, read Chitil's (very accessible) 1997 paper. For a full treatment of how CSE works in a production compiler, see GHC's CSE.hs module, which is documented very thoroughly thanks to GHC's tradition of writing long footnotes. The comment-to-code ratio in that module is off the charts. Also note how old that file is (1993)!

Essence of Laziness. Haskell [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am learning a haskell for a few days and the laziness is something like buzzword. Because of the fact I am not familiar with laziness ( I have been working mainly with non-functional languages ) it is not easy concept for me.
So, I am asking for any excerise / example which show me what laziness is in the fact.
Thanks in advance ;)
In Haskell you can create an infinite list. For instance, all natural numbers:
[1,2..]
If Haskell loaded all the items in memory at once that wouldn't be possible. To do so you would need infinite memory.
Laziness allows you to get the numbers as you need them.
Here's something interesting: dynamic programming, the bane of every intro. algorithms student, becomes simple and natural when written in a lazy and functional language. Take the example of string edit distance. This is the problem of measuring how similar two DNA strands are or how many bytes changed between two releases of a binary executable or just how 'different' two strings are. The dynamic programming algorithm, expressed mathematically, is simple:
let:
• d_{i,j} be the edit distance of
the first string at index i, which has length m
and the second string at index j, which has length m
• let a_i be the i^th character of the first string
• let b_j be the j^th character of the second string
define:
d_{i,0} = i (0 <= i <= m)
d_{0,j} = j (0 <= j <= n)
d_{i,j} = d_{i - 1, j - 1} if a_i == b_j
d_{i,j} = min { if a_i != b_j
d_{i - 1, j} + 1 (delete)
d_{i, j - 1} + 1 (insert)
d_{i - 1, j - 1} + 1 (modify)
}
return d_{m, n}
And the algorithm, expressed in Haskell, follows the same shape of the algorithm:
distance a b = d m n
where (m, n) = (length a, length b)
a' = Array.listArray (1, m) a
b' = Array.listArray (1, n) b
d i 0 = i
d 0 j = j
d i j
| a' ! i == b' ! j = ds ! (i - 1, j - 1)
| otherwise = minimum [ ds ! (i - 1, j) + 1
, ds ! (i, j - 1) + 1
, ds ! (i - 1, j - 1) + 1
]
ds = Array.listArray bounds
[d i j | (i, j) <- Array.range bounds]
bounds = ((0, 0), (m, n))
In a strict language we wouldn't be able to define it so straightforwardly because the cells of the array would be strictly evaluated. In Haskell we're able to have the definition of each cell reference the definitions of other cells because Haskell is lazy – the definitions are only evaluated at the very end when d m n asks the array for the value of the last cell. A lazy language lets us set up a graph of standing dominoes; it's only when we ask for a value that we need to compute the value, which topples the first domino, which topples all the other dominoes. (In a strict language, we would have to set up an array of closures, doing the work that the Haskell compiler does for us automatically. It's easy to transform implementations between strict and lazy languages; it's all a matter of which language expresses which idea better.)
The blog post does a much better job of explaining all this.
So, I am asking for any excerise / example which show me what laziness is in the fact.
Click on Lazy on haskell.org to get the canonical example. There are many other examples just like it to illustrate the concept of delayed evaluation that benefits from not executing some parts of the program logic. Lazy is certainly not slow, but the opposite of eager evaluation common to most imperative programming languages.
Laziness is a consequence of non-strict function evaluation. Consider the "infinite" list of 1s:
ones = 1:ones
At the time of definition, the (:) function isn't evaluated; ones is just a promise to do so when it is necessary. Such a time would be when you pattern match:
myHead :: [a] -> a
myHead (x:rest) = x
When myHead ones is called, x and rest are needed, but the pattern match against 1:ones simply binds x to 1 and rest to ones; we don't need evaluate ones any further at this time, so we don't.
The syntax for infinite lists, using the .. "operator" for arithmetic sequences, is sugar for calls to enumFrom and enumFromThen. That is
-- An infintite list of ones
ones = [1,1..] -- enumFromThen 1 1
-- The natural numbers
nats = [1..] -- enumFrom 1
so again, laziness just comes from the non-strict evaluation of enumFrom.
Unlike with other languages, Haskell decouples the creation and definition of an object.... You can easily watch this in action using Debug.Trace.
You can define a variable like this
aValue = 100
(the value on the right hand side could include a complicated evaluation, but let's keep it simple)
To see if this code ever gets called, you can wrap the expression in Debug.Trace.trace like this
import Debug.Trace
aValue = trace "evaluating aValue" 100
Note that this doesn't change the definition of aValue, it just forces the program to output "evaluating aValue" whenever this expression is actually created at runtime.
(Also note that trace is considered unsafe for production code, and should only be used to debug).
Now, try two experiments.... Write two different mains
main = putStrLn $ "The value of aValue is " ++ show aValue
and
main = putStrLn "'sup"
When run, you will see that the first program actually creates aValue (you will see the "creating aValue" message, while the second does not.
This is the idea of laziness.... You can put as many definitions in a program as you want, but only those that are used will be actually created at runtime.
The real use of this can be seen with objects of infinite size. Many lists, trees, etc. have an infinite number of elements. Your program will use only some finite number of values, but you don't want to muddy the definition of the object with this messy fact. Take for instance the infinite lists given in other answers here....
[1..] -- = [1,2,3,4,....]
You can again see laziness in action here using trace, although you will have to write out a variant of [1..] in an expanded form to do this.
f::Int->[Int]
f x = trace ("creating " ++ show x) (x:f (x+1)) --remember, the trace part doesn't change the expression, it is just used for debugging
Now you will see that only the elements you use are created.
main = putStrLn $ "the list is " ++ show (take 4 $ f 1)
yields
creating 1
creating 2
creating 3
creating 4
the list is [1,2,3,4]
and
main = putStrLn "yo"
will not show any item being created.

GHCI Haskell not remembering bindings in command line

I am trying to learn Haskell but it is a little hard as non of my bindings are remembered from the command line; output from my terminal below.
> let b = []
> b
[]
> 1:b
[1]
> b
[]
I have no idea why this is like this can anyone please help.
What did you expect your example to do? From what you've presented, I don't see anything surprising.
Of course, that answer is probably surprising to you, or you wouldn't have asked. And I'll be honest: I can guess what you were expecting. If I'm right, you thought the output would be:
> let b = []
> b
[]
> 1:b
[1]
> b
[1]
Am I right? Supposing I am, then the question is: why isn't it?
Well, the short version is "that's not what (:) does". Instead, (:) creates a new list out of its arguments; x:xs is a new list whose first element is x and the rest of which is identical to xs. But it creates a new list. It's just like how + creates a new number that's the sum of its arguments: is the behavior
> let b = 0
> b
0
> 1+b
1
> b
0
surprising, too? (Hopefully not!)
Of course, this opens up the next question of "well, how do I update b, then?". And this is where Haskell shows its true colors: you don't. In Haskell, once a variable is bound to a value, that value will never change; it's as though all variables and all data types are const (in C-like languages or the latest Javascript standard) or val (in Scala).
This feature of Haskell – it's called being purely functional – is possibly the single biggest difference between Haskell and every single mainstream language out there. You have to think about writing programs in a very different way when you aren't working with mutable state everywhere.
For example, to go a bit further afield, it's quite possible the next thing you'll try will be something like this:
> let b = []
> b
[]
> let b = 1 : b
In that case, what do you think is going to be printed out when you type b?
Well, remember, variables don't change! So the answer is:
[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,…
forever – or until you hit control-C and abort.
This is because let b = 1 : b defines a new variable named b; you might as well have written let c = 1 : c. Thus, you're saying "b is a list which is 1 followed by b"; since we know what b is, we can substitute and get "b is a list which is 1 followed by 1 followed by b", and so on forever. Or: b = 1 : b, so substituting in for b we get b = 1 : 1 : b, and substituting in we get b = 1 : 1 : 1 : 1 : ….
(The fact that Haskell produces an infinite list, rather than going into an infinite loop, is because Haskell is non-strict, more popularly referred to as lazy – this is also possibly the single biggest difference between Haskell and every single mainstream language out there. For further information, search for "lazy evaluation" on Google or Stack Overflow.)
So, in the end, I hope you can see why I wasn't surprised: Haskell can't possibly update variable bindings. So since your definition was let b = [], then of course the final result was still [] :-)

Why must equations of a function in Haskell have the same number of arguments? [duplicate]

I noticed today that such a definition
safeDivide x 0 = x
safeDivide = (/)
is not possible. I am just curious what the (good) reason behind this is. There must be a very good one (it's Haskell after all :)).
Note: I am not looking suggestions for alternative implementations to the code above, it's a simple example to demonstrate my point.
I think it's mainly for consistency so that all clauses can be read in the same manner, so to speak; i.e. every RHS is at the same position in the type of the function. I think would mask quite a few silly errors if you allowed this, too.
There's also a slight semantic quirk: say the compiler padded out such clauses to have the same number of patterns as the other clauses; i.e. your example would become
safeDivide x 0 = x
safeDivide x y = (/) x y
Now consider if the second line had instead been safeDivide = undefined; in the absence of the previous clause, safeDivide would be ⊥, but thanks to the eta-expansion performed here, it's \x y -> if y == 0 then x else ⊥ — so safeDivide = undefined does not actually define safeDivide to be ⊥! This seems confusing enough to justify banning such clauses, IMO.
The meaning of a function with multiple clauses is defined by the Haskell standard (section 4.4.3.1) via translation to a lambda and case statement:
fn pat1a pat1b = r1
fn pat2a pat2b = r2
becomes
fn = \a b -> case (a,b) of
(pat1a, pat1b) -> r1
(pat2a, pat2b) -> r2
This is so that the function definition/case statement way of doing things is nice and consistent, and the meaning of each isn't specified redundantly and confusingly.
This translation only really makes sense when each clause has the same number of arguments. Of course, there could be extra rules to fix that, but they'd complicate the translation for little gain, since you probably wouldn't want to define things like that anyway, for your readers' sake.
Haskell does it this way because it's predecessors (like LML and Miranda) did. There is no technical reason it has to be like this; equations with fewer arguments could be eta expanded. But having a different number of arguments for different equations is probably a typo rather than intentional, so in this case we ban something sensible&rare to get better error reporting in the common case.

Translate list comprehension to Prolog

I have a list comprehension in Haskell that I want to translate to Prolog.
The point of the list comprehension is rotating a 4 by 4 grid:
rotate :: [Int] -> [Int]
rotate grid = [ grid !! (a + 4 * b) | a <- [0..3], b <- [0..3] ]
Now in Prolog, I translated it like this:
rotateGrid([T0,T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15],
[T0,T4,T8,T12,T1,T5,T9,T13,T2,T6,T10,T14,T3,T7,T11,T15]).
Can we do better?
We can use findall/3 for list comprehensions (Cf. the SWI-Prolog Documentation). E.g.,
?- findall(X, between(1,10,X), Xs).
Xs = [1,2,3,4,5,6,7,8,9,10]
Xs is a list holding all values that can unify with X when X is a number between 1 and 10. This is roughly equivalent to the Haskell expression let Xs = [x | x <- [1..10]](1). You can read a findall/3 statement thus: "find all values of [First Argument] such that [Conditions in Second Argument] hold, and put those values in the list, [Third Argument]".
I've used findall/3 to write a predicate rotate_grid(+Grid, ?RotatedGrid). Here is a list of the approximate Haskell-Prolog equivalences I used in the predicate; each line shows the relation between the value that the Haskell expression will evaluate to and the Prolog variable with the same value:
a <- [0..3] = A in between(0, 3, A)
b <- [0..3] = B in between(0, 3, B)
(a + 4 * d) = X in X is A + 4 * D
<Grid> !! <Index> = Element in nth0(Index, Grid, Element)
Then we simply need to find all the values of Element:
rotate_grid(Grid, RotatedGrid) :-
findall( Element,
( between(0,3,A),
between(0,3,B),
Index is A + 4 * B,
nth0(Index, Grid, Element) ),
RotatedGrid
).
To verify that this produces the right transformation, I down-cased the Prolog code from the question and posed the following query:
?- rotate_grid([t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15],
[t0,t4,t8,t12,t1,t5,t9,t13,t2,t6,t10,t14,t3,t7,t11,t15]).
| true.
Footnotes:
(1): between/3 isn't actually the analogue of [m..n], since the latter returns a list of values from m to n where between(M,N,X) will instantiate X with each value between M and N (inclusive) on backtracking. To get a list of numbers in SWI-Prolog, we can use numlist(M,N,Ns). So a stricter analogue for x <- [1.10] would be the conjunction member(X, Ns), numlist(1, 10, Ns).
You want a permutation of a list. The concrete elements are not considered. Therefore, you can generalize your Haskell signature to
rotate :: [x] -> [x]
This is already a very valuable hint for Prolog: the list's elements will not be considered - elements will not even be compared. So a Prolog solution should be able to handle variables directly, like so:
?- rotateGrid(L,R).
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P],
R = [_A,_E,_I,_M,_B,_F,_J,_N,_C,_G,_K,_O,_D,_H,_L,_P].
And your original definition handles this perfectly.
Your version using list comprehensions suggests itself to be realized via backtracking, certain precautions have to be taken. Using findall/3, as suggested by #aBathologist will rename variables:
?- length(L,16),rotate_grid(L,R).
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N,_O,_P],
R = [_Q,_R,_S,_T,_U,_V,_W,_X,_Y,_Z,_A1,_B1,_C1,_D1,_E1,_F1].
The built-in predicate bagof/3 addresses this problem. Note that we have to declare all local, existential variables explicitly:
rotate_grid2(Grid, RotatedGrid) :-
bagof(
Element,
A^B^Index^ % declaration of existential variables
( between(0,3,A),
between(0,3,B),
Index is A + 4 * B,
nth0(Index, Grid, Element)
),
RotatedGrid).
For lists that are shorter than 16 elements, the Haskell version produces a clean error, but here we get pretty random results:
?- L=[1,2,3,4],rotate_grid(L,R).
L = [1,2,3,4], R = [1,2,3,4].
?- L=[1,2,3,4,5],rotate_grid(L,R).
L = [1,2,3,4,5], R = [1,5,2,3,4].
This is due to the unclear separation between the part that enumerates and "generates" a concrete element. The cleanest way is to add length(Grid, 16) prior to the goal bagof/3.
List comprehensions in Prolog
Currently, only B-Prolog offers a form of list comprehensions:
R#=[E: A in 0..3,B in 0..3,[E,I],(I is A+4*B,nth0(I,L,E))].
However, it does not address the second problem:
| ?- L = [1,2,3], R#=[E: A in 0..3,B in 0..3,[E,I],(I is A+4*B,nth0(I,L,E))].
L = [1,2,3]
R = [1,2,3]
yes
Use a loop predicate foreach/4
If the comprehension should retain variables, which is for example important in constraint programming, a Prolog system could offer a predicate foreach/4. This predicate is the DCG buddy of foreach/2.
Here is how variables are not retained via findall/3, the
result R contains fresh variables according to the ISO
core semantics of findall/3:
Welcome to SWI-Prolog (threaded, 64 bits, version 7.7.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
?- functor(L,foo,5), findall(X,
(between(1,5,N), M is 6-N, arg(M,L,X)), R).
L = foo(_5140, _5142, _5144, _5146, _5148),
R = [_5210, _5204, _5198, _5192, _5186].
And here is how variables can be retained via foreach/4,
the resulting list has the same variables as the compound
we started with:
Jekejeke Prolog 3, Runtime Library 1.3.0
(c) 1985-2018, XLOG Technologies GmbH, Switzerland
?- [user].
helper(N,L) --> [X], {M is 6-N, arg(M,L,X)}.
Yes
?- functor(L,foo,5), foreach(between(1,5,N),helper(N,L),R,[]).
L = foo(_A,_G,_M,_S,_Y),
R = [_Y,_S,_M,_G,_A]
Using foreach/4 instead of bagof/3 might seem a little bit over the top. foreach/4 will probably only show its full potential when implementing Picat loops, since it can build up constraints, what bagof/3 cannot do.
foreach/4 is an implementation without the full materialization of all solution that are then backtracked. It shares with bagof/3 the reconstruct of variables, but still allows backtracking in the conjunction of the closures.

Resources