Haskell implemented without a stack?

from How does a stackless language work?
Haskell (as commonly implemented) does not have a call stack;
evaluation is based on graph reduction.
Really? That's interesting, because while I've never experienced it myself, I've read that if you don't use the strict versions of the fold functions and then force the evaluation of an infinite fold you get a stack overflow. Surely that indicates the presence of a stack. Can anyone clarify?

I'm not by any means an expert on this, but I think the answer you quoted is not entirely accurate. Haskell doesn't have the straightforward kind of stack most imperative languages have, where you can trace a path of calls through a program. Because of its laziness, evaluation is based on graph reduction, which you can read about here, but calls are still eventually placed in a stack. According to this page, "The 'stack' in GHC's execution engine bears little resemblance to the lexical call stack." So yes, there's a stack, but it's very different from one you would find in an imperative language, and it's created using graph reduction.
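The fold behaviour mentioned in the question can be sketched like this. It is only an illustration, not something taken from the linked pages, and the names lazySum and strictSum are made up for the example. With foldl the accumulator grows as a chain of unevaluated (+) thunks, and forcing that chain at the end is what uses GHC's evaluation stack; foldl' forces the accumulator at every step. Whether the lazy version actually overflows depends on the GHC version, the optimisation level, and the runtime's stack limit.

```haskell
import Data.List (foldl')

-- Lazy left fold: builds the nested thunk (((0 + 1) + 2) + ...) and only
-- evaluates it when the final result is demanded.
lazySum :: [Integer] -> Integer
lazySum = foldl (+) 0

-- Strict left fold: forces the accumulator at each step, so no long thunk
-- chain builds up.
strictSum :: [Integer] -> Integer
strictSum = foldl' (+) 0
```

Loaded in GHCi, both functions return the same value for a finite list; they differ only in how much pending work piles up before the result is forced.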

Haskell is not "stackless" or anything like it. Code generated from Haskell source still has some kind of symbols, and execution shows some stack traces, but they're only very loosely related to the source code. Here's some information about attempts at simplifying debugging/tracing/profiling:
http://www.haskell.org/wikiupload/9/9f/HIW2011-Talk-Marlow.pdf

Related

Generalizing/compiling haskell code into a lambda

I am pretty much 90% sure that the title of this question is wrong; however, I have no idea what the right title would be (I will gladly edit the title if suggestions come along!).
When reading up on Haskell and the core principles of the language you always find that it is a language "based on lambda expressions". I remember reading somewhere that this means that at the end, the main function just gets "preprocessed" into one big lambda, everything gets inlined, and basically your entire code becomes one single, huge lambda expression.
My questions are:
Is what I said above true?
If the answer to question 1 is "yes", is there any... decompiler/partial compiler/preprocessor? I know about this that lets you see the assembly code behind languages like C/C++ and Haskell, but is there anything I could use to explore the generated lambda expression?
This question is asked from a purely educational standpoint and not intended to seek a solution to a particular problem. I simply wish to learn more about a language I find extremely fascinating.
Let's make a distinction between the semantics of Haskell and the implementation of GHC. Mostly because we use different terms for language semantics than for assembly, but also because some other compiler might do things differently than GHC.
Every Haskell program defines main, which is an expression of type IO (). I don't like to call it a "lambda expression" because the type shows that it's not a function. The definition of main is some nested tree of function calls. Even the sequential lines in a do block are defined as calls to the functions (>>) and (>>=).
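As a small sketch of that last point, here is a do block next to the (>>) and (>>=) calls it stands for, following the translation rules in the Haskell Report. The function names are made up for the example, and it is written against any Monad so the two versions are easy to compare.

```haskell
-- Written with do notation.
twiceDo :: Monad m => m () -> m String -> m String
twiceDo prompt readLine = do
  prompt
  line <- readLine
  return (line ++ line)

-- The same function after the do block is rewritten by hand:
-- a statement without a binder becomes (>>), a binding becomes (>>=).
twiceDesugared :: Monad m => m () -> m String -> m String
twiceDesugared prompt readLine =
  prompt >> (readLine >>= \line -> return (line ++ line))
```

In IO you might call it as twiceDo (putStrLn "say something:") getLine; either version behaves the same way.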
GHC uses heuristics to decide what to inline, to get the best runtime performance. It will usually inline small expressions that aren't recursive. I believe the runtime system maintains a callstack of functions currently being evaluated, not unlike the runtime result of compiling function calls in C or other imperative languages.
GHC provides many options for printing intermediate stages of compilation. I'm not sure which you will find interesting. Core is the lowest-level representation that feels like Haskell. Cmm (also called C--) is the highest-level representation that feels like assembly.

Is avoiding partial functions any easier in Haskell than other languages?

We're urged to avoid partial functions with seemingly more emphasis in Haskell than other languages.
Is this because partial functions are a more frequent risk in Haskell than other languages (c.f. this question), or is it that avoiding them in other languages is impractical to the point of little consideration?
Certainly the latter. The most commonly used languages all have some notion of the null value as an inhabitant of every type, the practical effect being that every value is akin to Haskell's Maybe a.
You can argue that in Haskell we have the same issue: bottoms can hide anywhere, e.g.
uhoh :: String
uhoh = error "oops"
But this isn't really the case. In Haskell all bottoms are morally equivalent and we can reason about code as if they didn't exist. If we could catch exceptions in pure code, this would no longer be the case. Here's an interesting discussion.
And just a subjective addendum: I think intermediate Haskell developers tend to be aware of whether a function is partial, and to complain loudly when they are surprised to find they were wrong. At the same time a fair portion of the Prelude contains partial functions, such as tail and /, and these haven't changed in spite of much attention and many alternative preludes, which I think is evidence that the language and standard lib probably struck a pretty decent balance.
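For comparison, this is roughly what the total counterparts of such functions look like. The names safeTail and safeDiv are just for illustration and are not Prelude functions; the point is that the failure case shows up in the type as Maybe instead of as a hidden bottom.

```haskell
-- Total version of tail: the empty list is handled explicitly.
safeTail :: [a] -> Maybe [a]
safeTail []       = Nothing
safeTail (_ : xs) = Just xs

-- Total version of integer division: dividing by zero yields Nothing
-- instead of throwing an exception.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)
```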
EDIT I agree that Alexey Romanov's answer is an important part of the picture as well.
One reason why partial functions are significantly worse in Haskell compared to other languages is the lack of stack traces by default. When you call e.g. head on an empty list, you only get Prelude.head: empty list. Good luck figuring out which call of head is the problem or where the empty list came from! Of course, it may not even be in your code, but in some library you are using.
To get a stack trace, you need to either compile with profiling enabled or to make it available explicitly: see https://hackage.haskell.org/package/base-4.9.1.0/docs/GHC-Stack.html and https://wiki.haskell.org/Debugging. And both of these options appeared in relatively recent GHC versions (and work on improving them is ongoing).
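A minimal sketch of the explicit route, using the GHC.Stack module linked above. The functions headOrDie and callerNames are hypothetical names for this example. It assumes base 4.9 or later, where error itself carries a HasCallStack constraint.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- A partial function that carries call-site information. Because of the
-- HasCallStack constraint, the error message also reports where headOrDie
-- was called from, not only where error was called.
headOrDie :: HasCallStack => [a] -> a
headOrDie (x : _) = x
headOrDie []      = error "headOrDie: empty list"

-- Names of the functions on the current (partial) call stack,
-- most recent call first.
callerNames :: HasCallStack => [String]
callerNames = map fst (getCallStack callStack)
```

The stack only extends as far up as the chain of functions that mention HasCallStack in their signatures, which is why it has to be threaded through explicitly.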

Is Haskell really a purely functional language considering unsafePerformIO?

Haskell is generally referenced as an example of a purely functional language. How can this be justified given the existence of System.IO.Unsafe.unsafePerformIO ?
Edit: I thought with "purely functional" it was meant that it is impossible to introduce impure code into the functional part of the program.
The Languages We Call Haskell
unsafePerformIO is part of the Foreign Function Interface specification, not core Haskell 98 specification. It can be used to do local side effects that don't escape some scope, in order to expose a purely functional interface. That is, we use it to hide effects when the type checker can't do it for us (unlike the ST monad, which hides effects with a static guarantee).
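To make the ST comparison concrete, here is a small sketch (sumST is a made-up name) of a function that uses a mutable reference internally but still has a pure type. The type of runST is what guarantees the reference cannot leak out, so no unsafe function is needed.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sums a list by mutating a local accumulator. The mutation is invisible
-- to callers: sumST is an ordinary pure function.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref
```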
To illustrate precisely the multiple languages that we call "Haskell", consider the image below. Each ring corresponds to a specific set of computational features, ordered by safety, and with area correlating to expressive power (i.e. the number of programs you can write if you have that feature).
The language known as Haskell 98 is specified right down in the middle, admitting total and partial functions. Agda (or Epigram), where only total functions are allowed, is even less expressive, but "more pure" and more safe. While Haskell as we use it today includes everything out to the FFI, where unsafePerformIO lives. That is, you can write anything in modern Haskell, though if you use things from the outer rings, it will be harder to establish safety and security guarantees made simple by the inner rings.
So, Haskell programs are not typically built from 100% referentially transparent code; however, Haskell is the only moderately common language that is pure by default.
I thought with "purely functional" it was meant that it is impossible to introduce impure code...
The real answer is that unsafePerformIO is not part of Haskell, any more than say, the garbage collector or run-time system are part of Haskell. unsafePerformIO is there in the system so that the people who build the system can create a pure functional abstraction over very effectful hardware. All real languages have loopholes that make it possible for system builders to get things done in ways that are more effective than dropping down to C code or assembly code.
As to the broader picture of how side effects and I/O fit into Haskell via the IO monad, I think the easiest way to think of Haskell is that it is a pure language that describes effectful computations. When the computation described is main, the run-time system carries out those effects faithfully.
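One way to see the "describes effectful computations" view is that IO actions are ordinary values. In this sketch (the names steps and program are invented for the example), building and rearranging the list of actions performs no output at all; something is printed only when program is actually run, for instance from main or at the GHCi prompt.

```haskell
-- A list of IO actions is a plain value; constructing it prints nothing.
steps :: [IO ()]
steps = [putStrLn "first", putStrLn "second"]

-- Reordering and combining the actions is pure. The result is still only
-- a description of what to do.
program :: IO ()
program = sequence_ (reverse steps)
```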
unsafePerformIO is a way to get effects in an unsafe manner; where "unsafe" means "safety must be guaranteed by the programmer"—nothing is checked by the compiler. If you are a savvy programmer and are willing to meet heavy proof obligations, you can use unsafePerformIO. But at that point you are not programming in Haskell any more; you are programming in an unsafe language that looks a lot like Haskell.
The language/implementation is purely functional. It includes a couple "escape hatches," which you don't have to use if you don't want to.
I don't think unsafePerformIO means that haskell somehow becomes impure. You can create pure (referentially transparent) functions from impure functions.
Consider the skiplist. In order for it to work well it requires access to a RNG, an impure function, but this doesn't make the data structure impure. If you add an item and then convert it to a list, the same list will be returned every time given the item you add.
For this reason I think unsafePerformIO should be thought of as promisePureIO: it asserts that a function which has side effects, and would therefore be labelled impure by the type system, can be treated by the type system as referentially transparent.
I understand that you have to have a slightly weaker definition of pure for this to hold, though, i.e. pure functions are referentially transparent and never called because of a side effect (like print).
Unfortunately the language has to do some real world work, and this implies talking with the external environment.
The good thing is that you can (and should) limit the usage of this "out of style" code to few specific well documented portions of your program.
I have a feeling I'll be very unpopular for saying what I'm about to say, but felt I had to respond to some of the (in my opinion mis-) information presented here.
Although it's true that unsafePerformIO was officially added to the language as part of the FFI addendum, the reasons for this are largely historical rather than logical. It existed unofficially and was widely used long before Haskell ever had an FFI. It was never officially part of the main Haskell standard because, as you have observed, it was just too much of an embarrassment. I guess the hope was that it would just go away at some point in the future, somehow. Well that hasn't happened, nor will it in my opinion.
The development of the FFI addendum provided a convenient pretext for unsafePerformIO to get snuck into the official language standard, as it probably doesn't seem too bad there when compared to adding the capability to call foreign (i.e. C) code (where all bets are off regarding statically ensuring purity and type safety anyway). It was also jolly convenient to put it there for what are essentially political reasons. It fostered the myth that Haskell would be pure, if only it wasn't for all that dirty "badly designed" C, or "badly designed" operating systems, or "badly designed" hardware or... whatever. It's certainly true that unsafePerformIO is regularly used with FFI-related code, but the reasons for this are often more to do with bad design of the FFI and indeed of Haskell itself, not bad design of whatever foreign thing Haskell is trying to interface to.
So as Norman Ramsey says, the official position came to be that it was OK to use unsafePerformIO provided certain proof obligations were satisfied by whoever used it (primarily that doing this doesn't invalidate important compiler transformations like inlining and common sub-expression elimination). So far so good, or so one might think. The real kicker is that these proof obligations cannot be satisfied by what is probably the single most common use case for unsafePerformIO, which by my estimate accounts for well over 50% of all the unsafePerformIOs out there in the wild. I'm talking about the appalling idiom known as the "unsafePerformIO hack", which is provably (in fact obviously) completely unsafe (in the presence of inlining and CSE).
I don't really have the time, space or inclination to go into what the "unsafePerformIO hack" is or why it's needed in real IO libraries, but the bottom line is that folk who work on Haskell's IO infrastructure are usually "stuck between a rock and a hard place". They can either provide an inherently safe API which has no safe implementation (in Haskell), or they can provide an inherently unsafe API which can be safely implemented, but what they can rarely do is provide safety in both API design and implementation. Judging by the depressing regularity with which the "unsafePerformIO hack" appears in real world code (including the Haskell standard libraries), it seems most choose the former option as the lesser of the two evils, and just hope that the compiler won't muck things up with inlining, CSE or any other transformation.
I do wish all this was not so. Unfortunately, it is.
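For readers who have not met the idiom: the answer above does not spell it out, but the name is usually applied to a top-level mutable variable created with unsafePerformIO. The sketch below assumes that is what is meant; globalCounter and nextId are hypothetical names.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE asks GHC not to copy this definition to its use sites. If the
-- newIORef call were duplicated by inlining, different uses could end up
-- with different counters, which is the unsafety discussed above.
{-# NOINLINE globalCounter #-}
globalCounter :: IORef Int
globalCounter = unsafePerformIO (newIORef 0)

-- Returns the current counter value and increments the shared counter.
nextId :: IO Int
nextId = atomicModifyIORef' globalCounter (\n -> (n + 1, n))
```

The pragma is a request to the compiler rather than a guarantee of the language, so the code works in practice without being safe by construction.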
Safe Haskell, a recent extension of GHC, gives a new answer to this question. unsafePerformIO is a part of GHC Haskell, but not a part of the safe dialect.
unsafePerformIO should be used only to build referentially transparent functions; for example, memoization. In these cases, the author of a package marks it as "trustworthy". A safe module can import only safe and trustworthy modules; it cannot import unsafe modules.
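A minimal sketch of what opting in looks like. The module content (shout) is made up; the Safe pragma is the GHC language pragma described in the manual, and a package author would use the Trustworthy pragma instead to vouch for a module that uses unsafe features internally.

```haskell
{-# LANGUAGE Safe #-}

-- With the Safe pragma, GHC rejects imports of modules marked unsafe.
-- Adding the line
--   import System.IO.Unsafe (unsafePerformIO)
-- to this file would be a compile-time error.
import Data.Char (toUpper)

shout :: String -> String
shout = map toUpper
```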
For more information: GHC manual, Safe Haskell paper
The IO monad is actually defined in Haskell, and you can in fact see its definition here. This monad does not exist to deal with impurities but rather to handle side effects. In any case, you could actually pattern match your way out of the IO monad, so the existence of unsafePerformIO shouldn't really be troubling to you.

Language features helpful for writing quines (self-printing programs)?

OK, for those who have never encountered the term, a quine is a "self-replicating" computer program. To be more specific, one which - upon execution - produces a copy of its own source code as its only output.
The quines can, of course, be developed in many programming languages (but not all); but some languages are obviously more suited to producing quines than others (to clearly understand the somewhat subjective-sounding "more suited", look at a Haskell example vs. C example in the Wiki page - and I provide my more-objective definition below).
The question I have is, from programming language perspective, what language features (either theoretical design ones or syntax sugar) make the language more suitable/helpful for writing quines?
My definition of "more suitable" is "quines are easier to write" and "are shorter/more readable/less obfuscated". But you're welcome to add more criteria that are at least somewhat objective.
Please note that this question explicitly excludes degenerate cases, like a language which is designed to contain "print_a_quine" primitive.
I am not entirely sure, so correct me if anyone of you knows better.
I agree with both other answers, and would go further by explaining that a quine is this:
Y g
where Y is the Y fixed-point combinator (or any other fixed-point combinator), which in lambda calculus means:
Y g = g(Y g)
Now it is quite apparent that we need the code to be data, and g to be a function which prints its argument.
So, to summarize, for constructing such quines we need functions, a printing function, a fixed-point combinator and a call-by-name evaluation strategy.
The smallest language that satisfies these conditions is, AFAIK, Zot from the Iota and Jot family.
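The fixed-point equation above can be tried directly in Haskell, where Data.Function.fix satisfies fix f = f (fix f) and lazy evaluation plays the role of call-by-name. This is only a sketch of the combinator itself, not a quine; factorial and ones are example names.

```haskell
import Data.Function (fix)

-- Recursion without naming the function in its own body: the recursive
-- call is supplied by fix as the argument self.
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 1 then 1 else n * self (n - 1))

-- Because evaluation is lazy, fix can also tie the knot on data:
-- this is the infinite list 1 : 1 : 1 : ...
ones :: [Int]
ones = fix (1 :)
```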
Languages like the Io Programming Language and others allow the treating of code as data. In tree walking systems, this typically allows the language implementer to expose the abstract syntax tree as a first class citizen. In the case of Io, this is what it does. Being object oriented, the AST is modelled around Message objects, and a special sentinel is created to represent the currently executing message; this sentinel is called thisMessage. thisMessage is a full Message like any other, and responds to the print message, which prints it to the screen. As a result, the shortest quine I've ever been able to produce in any language, has come from Io and looks like this:
thisMessage print
Anyway, I just couldn't help but sharing this with you on this subject. The above certainly makes writing quines easy, but not doing it this way certainly doesn't preclude easily creating a quine.
I'm not sure if this is a useful answer from a practical point of view, but there is some useful theory in computability theory. In particular, fixed points and Kleene's recursion theorem can be used for writing quines. Apparently, the theory can be used for writing a quine in Lisp (as the Wikipedia page shows).

What functional language techniques can be used in imperative languages?

Which techniques or paradigms normally associated with functional languages can productively be used in imperative languages as well?
e.g.:
Recursion can be problematic in languages without tail-call optimization, limiting its use to a narrow set of cases, so that's of limited usefulness
Map and filter have found their way into non-functional languages, even though they have a functional sort of feel to them
I happen to really like not having to worry about state in functional languages. If I were particularly stubborn I might write C programs without modifying variables, only encapsulating my state in variables passed to functions and in values returned from functions.
Even though functions aren't first-class values, I can wrap one in an object in Java, say, and pass that into another method. Like functional programming, just less fun.
So, for veterans of functional programming, when you program in imperative languages, what ideas from FP have you applied successfully?
Pretty nearly all of them?
If you understand functional languages, you can write imperative programs that are "informed" by a functional style. That will lead you away from side effects, and toward programs in which reading the program text at any particular point is sufficient to let you really know what the meaning of the program is at that point.
Back at the Dawn of Time we used to worry about "coupling" and "cohesion". Learning an FP will lead you to write systems with optimal (minimal) coupling, and high cohesion.
Here are things that get in the way of doing FP in a non-FP language:
If the language doesn't support lambdas/closures, and doesn't have any syntactic sugar that lets you approximate them easily, you are dead in the water. You don't call map/filter without closures.
If the language is statically-typed and doesn't support generics, you are dead in the water. All the good FP stuff uses genericity.
If the language doesn't support tail-recursion, you are hindered. You can write implementations of e.g. 'map' iteratively; also often your data may not be too large and recursion will be ok.
If the language does not support algebraic data types and pattern-matching, you will be mildly hindered. It's just annoying not to have them once you've tasted them.
If the language cannot express type classes, well, oh well... you'll get by, but darn if that's not just the awesomest feature ever, but Haskell is the only remotely popular language with good support.
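On the tail-recursion point, the iterative shape of map can be sketched with an accumulator. It is written in Haskell here only to show the structure (mapIter is a made-up name); in a language without tail calls, the helper go would become a plain loop that appends to a result buffer.

```haskell
-- map written as a loop: go carries the results so far in acc and calls
-- itself only in tail position.
mapIter :: (a -> b) -> [a] -> [b]
mapIter f = go []
  where
    go acc []       = reverse acc      -- results were collected back to front
    go acc (x : xs) = go (f x : acc) xs
```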
Not having first-class functions really puts a damper on writing functional programs, but there are a few things that you can do that don't require them. The first is to eschew mutable state - try to have most or all of your classes return new objects that represent the modified state instead of making the change internally. As an example, if you were writing a linked list with an add operation, you would want to return the new linked list from add as opposed to modifying the object.
While this may make your programs less efficient (due to the increased number of objects being created and destroyed) you will gain the ability to more easily debug the program because the state and operation of the objects becomes more predictable, not to mention the ability to nest function calls more deeply because they have state inputs and outputs.
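The linked-list example reads like this when written down. The sketch is in Haskell, with a hand-rolled List type and invented names, so that the "return a new list" behaviour is explicit; the same structure carries over to a class with an add method that returns a new object.

```haskell
-- A singly linked list defined by hand.
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- add returns a new list that shares the old one as its tail; the
-- original list is left untouched.
add :: a -> List a -> List a
add x xs = Cons x xs

-- Convert to a built-in list for easy inspection.
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs
```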
I've successfully used higher-order functions a lot, especially the kind that are passed in rather than the kind that are returned. The kind that are returned can be a bit tedious but can be simulated.
All sorts of applicative data structures and recursive functions work well in imperative languages.
The things I miss the most:
Almost no imperative languages guarantee to optimize every tail call.
I know of no imperative language that supports case analysis by pattern matching.
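For anyone who hasn't seen it, this is the kind of case analysis by pattern matching being referred to, shown in Haskell with an invented Shape type. Each equation handles one constructor and binds its fields in the same step.

```haskell
-- An algebraic data type with two alternatives.
data Shape
  = Circle Double            -- radius
  | Rectangle Double Double  -- width and height
  deriving (Show)

-- One equation per constructor; the compiler can warn if a case is missing.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```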

Resources