I am aware that I should not do this
I am asking for dirty hacks to do something nasty
Goal
I wish to modify an argument in a pure function.
Thus achieving effect of pass by reference.
Illustration
For example, I have got the following function.
fn :: a -> [a] -> Bool
fn a as = elem a as
I wish to modify the arguments passed in. Such as the list as.
as' = [1, 2, 3, 4]
pointTo as as'
It is a common misconception that Haskell just deliberately prevents you from doing stuff it deems unsafe, though most other languages allow it. That's the case, for instance, with C++' const modifiers: while they are a valuable guarantee that something which is supposed to stay constant isn't messed with by mistake, it's generally assumed that the overall compiled result doesn't really change when using them; you still get basically impure functions, implemented as some assembler instructions over a stack memory frame.
What's already less known is that even in C++, these const modifiers allow the compiler to apply certain optimisations that can completely break the results when the references are modified (as possible with const_cast – which in Haskell would have at least “unsafe” in its name!); only the mutable keyword gives the guarantee that hidden modifications don't mess up the sematics you'd expect.
In Haskell, the same general issue applies, but much stronger. A Haskell function is not a bunch of assembler instructions operating on a stack frame and memory locations pointed to from the stack. Sometimes the compiler may actually generate something like that, but in general the standard is very careful not to suggest any particular implementation details. So actually, functions may get inlined, stream-fused, rule-replaced etc. to something where it totally wouldn't make sense to even think about “modifying the arguments”, because perhaps the arguments never even exist as such. The thing that Haskell guarantees is that, in a much higher-level sense, the semantics are preserved; the semantics of how you'd think about a function mathematically. In mathematics, it also doesn't make sense at all, conceptually, to talk about “modified arguments”.
Related
In Matthew Pickering's blog post about inlining, he mentions a trick to force GHC to inline recursive definitions applied to a statically known argument:
However, there is a trick that can be used in order to tell GHC that an argument is truly static. We replace the value argument with a type argument. Then by defining suitable type class instances, we can recurse on the structure of the type as we would on a normal value. This time however, GHC will happily inline each “recursive” call as each call to sum is at a different type.
Then the blog post proceeds to show a very simplistic example using type-level Nats. My question is about forcing GHC to fully inline functions over actually interesting types.
In my case, I have a computation that is via a monadic DSL. Under the hood, the DSL is an RWS. Looking at the generated Core with lots of INLINE pragmas, I can see calls to the recursive functions that implement <> for the writer output. Since these functions are recursive, GHC doesn't inline them. But if it did, since the full RWS computation is static enough that the spine of the writer output could be fully unrolled during compilation, it would, after finitely many steps, arrive at Core with no recursive definitions left.
Taking the idea from Matt's blog post, we would need to have a type-level representation of the monadic value; I'd love to see a version of this (especially with good ergonomics!), but I am not too optimistic: unlike the Servant approach, we would need a way to bind variables to results (a la monadic >>=). Alternatively, are there any other mechanisms to tell GHC that its input size is statically known, so it shouldn't shy away from aggressively inlining recrusive definitions? For example, is it possible to emulate Agda-style sized types by using a graded monad with some appropriate size measure in the index?
P.s.: My aim is not mere optimization, but working around a Clash bug, so this elimination of recursive definitions by aggressive inlining would need to be something dependable, not a nice-to-have performance improvement.
I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing. For example see the below GHCi session.
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
Prelude> head []
*** Exception: Prelude.head: empty list
Prelude> :t head
head :: [a] -> a
Prelude> :t error
error :: [Char] -> a
Prelude> error "ranch"
*** Exception: ranch
CallStack (from HasCallStack):
error, called at <interactive>:4:1 in interactive:Ghci1
Prelude>
The type of head is [a] -> a. But when you call it on the special case of an empty list, you get an exception instead. But this exception is not accounted for in the type signature.
If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.
I don't have a single, concise question to ask, but my head is swimming. What was the motivation for adding this strange exception system to an otherwise pure and elegant language? Is it still pure but I'm just missing something? If I want to take advantage of this exception feature, how would I go about doing it (ie how do I catch and handle exceptions? is there anything else I can do with them?) For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.
You are confusing two concepts: purity and totality.
Purity says that functions have no side effects.
Totality says that every function terminates and produces a value.
Haskell is pure, but is not total.
Outside of IO, nontermination (e.g., let loop = loop in loop) and exceptions (e.g., error "urk!") are the same – nonterminating and exceptional terms, when forced, do not evaluate to a value. The designers of Haskell wanted a Turing-complete language, which – as per the halting problem – means that they forwent totality. And once you have nontermination, I suppose you might as well have exceptions, too – defining error msg = error msg and having calls to error do nothing forever is much less satisfying in practice than actually seeing the error message you want in finite time!
In general, though, you're right – partial functions (those which are not defined for every input value, like head) are ugly. Modern Haskell generally prefers writing total functions instead by returning Maybe or Either values, e.g.
safeHead :: [a] -> Maybe a
safeHead [] = Nothing
safeHead (x:_) = Just x
errHead :: [a] -> Either String a
errHead [] = Left "Prelude.head: empty list"
errHead (x:_) = Right x
In this case, the Functor, Applicative, Monad, MonadError, Foldable, Traversable, etc., machinery makes combining these total functions and working with their results easy.
Should you actually come across an exception in your code – for instance, you might use error to check a complicated invariant in your code that you think you've enforced, but you have a bug – you can catch it in IO. Which returns to the question of why it's OK to interact with exceptions in IO – doesn't that make the language impure? The answer is the same as that to the question of why we can do I/O in IO, or work with mutable variables – evaluating a value of type IO A doesn't produce the side effects that it describes, it's just an action that describes what a program could do. (There are better descriptions of this elsewhere on the internet; exceptions aren't any different than other effects.)
(Also, note that there is a separate-but-related exception system in IO, which is used when e.g. trying to read a file that isn't there. People are often OK with this exception system, in moderation, because since you're in IO you're already working with impure code.)
For example, if ever I write code that uses the "head" function, surely I should take precautions for the case where an empty list somehow smuggles itself in.
A simpler solution: don't use head. There are plenty of replacements: listToMaybe from Data.Maybe, the various alternative implementations in the safe package, etc. The partial functions [1] in the base libraries -- specially ones as easy to replace as head -- are little more than historical cruft, and should be either ignored or replaced by safe variants, such as those in the aforementioned safe package. For further arguments, here is an entirely reasonable rant about partial functions.
If I want to take advantage of this exception feature, how would I go about doing it (ie how do I catch and handle exceptions? is there anything else I can do with them?)
Exceptions of the sort thrown by error can only be caught in the IO monad. If you are writing pure functions you won't want to force your users to run them in the IO monad merely for catching exceptions. Therefore, if you ever use error in a pure function, assume the error will not be caught [2]. Ideally you shouldn't use error in pure code at all, but if you are somehow compelled to do so, at least make sure to write an informative error message (that is, not "Prelude.head: empty list") so that your users know what is going on when the program crashes.
If I remember correctly it's a similar story when there is a failure during pattern matching. It doesn't matter what the type signature says, if you haven't accounted for every possible pattern, you run the risk of throwing an exception.
Indeed. The only difference from using head to writing the incomplete pattern match (\(x:_) -> x) by yourself explicitly is that in the latter case the compiler will at least warn you if you use -Wall, while with head even that is swept under the rug.
I've always wondered how the Haskell exception system fits in with the whole "Pure functional language" thing.
Technically speaking, partial functions don't affect purity (which doesn't make them any less nasty, of course). From a theoretical point of view, head [] is just as undefined as things like foo = let x = x in x. (The keyword for further reading into such subtleties is "bottom".)
[1]: Partial functions are functions that, just like head, are not defined for some values of the argument types they are supposed to take.
[2]: It is worth mentioning that exceptions in IO are a whole different issue, as you can't trivially avoid e.g. a file read failure just by using better functions. There are quite a few approaches towards handling such scenarios in a sensible way. If you are curious about the issue, here is one "highly opinionated" article about it that is illustrative of the relevant tools and trade-offs.
Haskell does not require that your functions be total, and doesn't track when they're not. (Total functions are those that have a well defined output for every possible value of their input type)
Even without exceptions or pattern match failures, you can have a function that doesn't define output for some inputs by just going on forever. An example is length (repeat 1). This continues to compute forever, but never actually throws an error.
The way Haskell semantics "copes" with this is to declare that there is an "extra" value in every single type; the so called "bottom value", and declare that any computation that doesn't properly complete and produce a normal value of its type actually produces the bottom value. It's represented by the mathematical symbol ⊥ (only when talking about Haskell; there isn't really any way in Haskell to directly refer to this value, but undefined is often also used since that is a Haskell name that is bound to an error-raising computation, and so semantically produces the bottom value).
This is a theoretical wart in the system, since it gives you the ability to create a 'value' of any type (albeit not a very useful one), and a lot of the reasoning about bits of code being correct based on types actually relies on the assumption that you can't do exactly that (if you're into the Curry-Howard isomorphism between pure functional programs and formal logic, the existence of ⊥ gives you the ability to "prove" logical contradictions, and thus to prove absolutely anything at all).
But in practice it seems to work out that all the reasoning done by pretending that ⊥ doesn't exist in Haskell still generally works well enough to be useful when you're writing "well-behaved" code that doesn't use ⊥ very much.
The main reason for tolerating this situation in Haskell is ease-of-use as a programming language rather than a system of formal logic or mathematics. It's impossible to make a compiler that could actually tell of arbitrary Haskell-like code whether or not each function is total or partial (see the Halting Problem). So a language that wanted to enforce totality would have to either remove a lot of the things you can do, or require you to jump through lots of hoops to demonstrate that your code always terminates, or both. The Haskell designers didn't want to do that.
So given that Haskell as a language is resigned to partiality and ⊥, it may as well give you things like error as a convenience. After all, you could always write a error :: String -> a function by just not terminating; getting an immediate printout of the error message rather than having the program just spin forever is a lot more useful to practicing programmers, even if those are both equivalent in the theory of Haskell semantics!
Similarly, the original designers of Haskell decided that implicitly adding a catch-all case to every pattern match that just errors out would be more convenient than forcing programmers to add the error case explicitly every time they expect a part of their code to only ever see certain cases. (Although a lot of Haskell programmers, including me, work with the incomplete-pattern-match warning and almost always treat it as an error and fix their code, and so would probably prefer the original Haskell designers went the other way on this one).
TLDR; exceptions from error and pattern match failure are there for convenience, because they don't make the system any more broken than it already has to be, without being quite a different system than Haskell.
You can program by throwing and catch exceptions if you really want, including catching the exceptions from error or pattern match failure, by using the facilities from Control.Exception.
In order to not break the purity of the system you can raise exceptions from anywhere (because the system always has to deal with the possibility of a function not properly terminating and producing a value; "raising an exception" is just another way in which that can happen), but exceptions can only be caught by constructs in IO. Because the formal semantics of IO permit basically anything to happen (because it has to interface with the real world and there aren't really any hard restrictions we can impose on that from the definition of Haskell), we can also relax most of the rules we usually need for pure functions in Haskell and still have something that technically fits in the model of Haskell code being pure.
I haven't used this very much at all (usually I prefer to keep my error handling using things that are more well-defined in terms of Haskell's semantic model than the operational model of what IO does, which can be as simple as Maybe or Either), but you can read about it if you want to.
I'm reading through Real World Haskell and in my copy, on page 59, it states:
In Haskell we don't have the equivalent of null. We could use Maybe... Instead we've decided to use a no-argument Empty constructor
Then, in the next section, on Errors, there is the Haskell function:
mySecond xs = if null (tail xs)
then error "list too short"
else head (tail xs)
Now, I don't understand what the "null" in this function definition is referring to, since it was stated clearly that there is no equivalent to (Java's) null in Haskell.
Any help appreciated.
null is a function which simply tests if a list is empty. It doesn't have anything to do with nullable values in other languages.
null :: [a] -> Bool
null [] = True
null (_:_) = False
Not really an answer to your question, but I feel it fits here to give some technical background of what the book is talking about:
Most language implementations nowadays (C++ is the most important counterexample1) store objects not right "in place" where they are used (i.e. in the function-call stack), but in some random place on the heap. All that's stored on the stack is a pointer/reference to the actual object.
One consequence of this is that you can, with mutability, easily switch a reference to point to another object, without needing to meddle with that object itself. Also, you don't even need to have an object, instead you can consider the reference a "promise" that there will be an object by the time somebody derefers it. Now if you've learned some Haskell that will heavily remind you of lazy evaluation, and indeed it's at the ground of how lazyness is implemented. This allows not only Haskell's infinite lists etc., also you can structuce both code and data more nicely – in Haskell we call it "tying the knot", and particularly in OO languages it's quite detrimental to the way you construct objects.
Imperative languages can't do this "promise" thing nicely automatic as Haskell can: lazy evaluation without referential transparency would lead to awfully unpredictable side-effects. So when you promise a value, you need to keep track of where it'll be soonest used, and make sure you manally construct the promised object before that point. Obviously, it's a big disaster to use a reference that just points to some random place in memory (as can easily happen in C), so it's now standard to point to a conventional spot that can't be a valid memory location: this is the null pointer. That way, you get at least a predictable error message, but an error it is nevertheless; Tony Hoare therefore now calls this idea of his "billion dollar mistake".
So null references are evil, Haskell cleverly avoids them. Or does it?
Actually, it can be perfectly reasonable to not have an object at all, namely when you explicitly allow empty data structures. The classic example are linked lists, and thus Haskell has a specialised null function that checks just this: is the list empty, or is there a head I can safely refer to?
Still – this isn't really idiomatic, for Haskell has a far more general and convenient way of savely resolving such "options" scenarios: pattern matching. The idiomatic version of your example would be
mySecond' (_:x2:_) = x2
mySecond' _ = error "list too short"
Note that this is not just more consise, but also safer than your code: you check null (tail xs), but if xs itself is empty then tail will already throw an error.
1C++ certainly allows storing stuff on the heap too, but experienced programmers like to avoid it for performance reasons. On the other hand, Java stores "built-in" types (you can recognise them by lowercase names, e.g. int) on the stack, to avoid redirection cost.
2Consider the words pointer and reference as synonyms, though some languages prefer either or give them slightly different meanings.
I'm struggling with what Super Combinators are:
A supercombinator is either a constant, or a combinator which contains only supercombinators as subexpressions.
And also with what Constant Applicative Forms are:
Any super combinator which is not a lambda abstraction. This includes truly constant expressions such as 12, ((+) 1 2), [1,2,3] as well as partially applied functions such as ((+) 4). Note that this last example is equivalent under eta abstraction to \ x -> (+) 4 x which is not a CAF.
This is just not making any sense to me! Isn't ((+) 4) just as "truly constant" as 12? CAFs sound like values to my simple mind.
These Haskell wiki pages you reference are old, and I think unfortunately written. Particularly unfortunate is that they mix up CAFs and supercombinators. Supercombinators are interesting but unrelated to GHC. CAFs are still very much a part of GHC, and can be understood without reference to supercombinators.
So let's start with supercombinators. Combinators derive from combinatory logic, and, in the usage here, consist of functions which only apply the values passed in to one another in one or another form -- i.e. they combine their arguments. The most famous set of combinators are S, K, and I, which taken together are Turing-complete. Supercombinators, in this context, are functions built only of values passed in, combinators, and other supercombinators. Hence any supercombinator can be expanded, through substitution, into a plain old combinator.
Some compilers for functional languages (not GHC!) use combinators and supercombinators as intermediate steps in compilation. As with any similar compiler technology, the reason for doing this is to admit optimization analysis that is more easily performed in such a simplified, minimal language. One such core language built on supercombinators is Edwin Brady's epic.
Constant Applicative Forms are something else entirely. They're a bit more subtle, and have a few gotchas. The way to think of them is as an aspect of compiler implementation with no separate semantic meaning but with a potentially profound effect on runtime performance. The following may not be a perfect description of a CAF, but it'll try to convey my intuition of what one is, since I haven't seen a really good description anywhere else for me to crib from. The clean "authoritative" description in the GHC Commentary Wiki reads as follows:
Constant Applicative Forms, or CAFs for short, are top-level values
defined in a program. Essentially, they are objects that are not
allocated dynamically at run-time but, instead, are part of the static
data of the program.
That's a good start. Pure, functional, lazy languages can be thought of in some sense as a graph reduction machine. The first time you demand the value of a node, that forces its evaluation, which in turn can demand the values of subnodes, etc. One a node is evaluated, the resultant value sticks around (although it does not have to stick around -- since this is a pure language we could always keep the subnodes live and recalculate with no semantic effect). A CAF is indeed just a value. But, in the context, a special kind of value -- one which the compiler can determine has a meaning entirely dependent on its subnodes. That is to say:
foo x = ...
where thisIsACaf = [1..10::Int]
thisIsNotACaf = [1..x::Int]
thisIsAlsoNotACaf :: Num a => [a]
thisIsAlsoNotACaf = [1..10] -- oops, polymorphic! the "num" dictionary is implicitly a parameter.
thisCouldBeACaf = const [1..10::Int] x -- requires a sufficiently smart compiler
thisAlsoCouldBeACaf _ = [1..10::Int] -- also requires a sufficiently smart compiler
So why do we care if things are CAFs? Basically because sometimes we really really don't want to recompute something (for example, a memotable!) and so want to make sure it is shared properly. Other times we really do want to recompute something (e.g. a huge boring easy to generate list -- such as the naturals -- which we're just walking over) and not have it stick around in memory forever. A combination of naming things and binding them under lets or writing them inline, etc. typically lets us specify these sorts of things in a natural, intuitive way. Occasionally, however, the compiler is smarter or dumber than we expect, and something we think should only be computed once is always recomputed, or something we don't want to hang on to gets lifted out as a CAF. Then, we need to think things through more carefully. See this discussion to get an idea about some of the trickiness involved: A good way to avoid "sharing"?
[By the way, I don't feel up to it, but anyone that wants to should feel free to take as much of this answer as they want to try and integrate it with the existing Haskell Wiki pages and improve/update them]
Matt is right in that the definition is confusing. It is even contradictory. A CAF is defined as:
Any super combinator which is not a lambda abstraction. This includes
truly constant expressions such as 12, ((+) 1 2), [1,2,3] as
well as partially applied functions such as ((+) 4).
Hence, ((+) 4) is seen as a CAF. But in the very next sentence we're told it is equivalent to something that is not a CAF:
this last example is equivalent under eta abstraction to \ x -> (+) 4 x which is not a CAF.
It would be cleaner to rule out partially applied functions on the ground that they are equivalent to lambda abstractions.
Compiled with ghc --make, these two programs produce the exact same binaries:
-- id1a.hs
main = print (id' 'a')
id' :: a -> a
id' x = x
-- id1b.hs
main = print (id' 'a')
id' :: Char -> Char
id' x = x
Is this just because of how trivial/contrived my example is, or does this
hold true as programs get more complex?
Also, is there any good reason to avoid making my types as general as
possible? I usually try keep specifics out where I don't need them, but
I am not extremely familiar with the effects of this on compiled languages,
especially Haskell/GHC.
Side Note:
I seem to recall a recent SO question where the answer was to make a type more
specific in order to improve some performance issue, though I cannot find it
now, so I may have imagined it.
Edit:
I understand from a usability / composability standpoint that more general is always better, I'm more interested in the effects this has on the compiled code. Is it possible for me to be too eager in abstracting my code? Or is this usually not a problem in Haskell?
I would go and make everything as general as possible. If you run into performance issues you can start thinking about messing with concrete implementations but IMHO this will not be a problem very often and if this really gets an problem then maybe your performance need will be as great as to think about moving into imperative-land again ;)
Is there any good reason to avoid making my types as general as possible?
No, as long as you have the Specialize pragma at your disposal for those rare situations where it might actually matter.
Is this just because of how trivial/contrived my example is
Yes. Namely, try splitting the definition of id' and main into different modules and you should see a difference.
However, Carsten is right: there may be performance-related reasons to use concrete types, but you should generally start with general types and use concrete implementations only if you actually have a problem.
General types usually make your functions more usable, in my opinion.
This may be a poor example, but if you're writing a function such as elem (takes a list and an element and returns true if the list contains that element and false otherwise), using specific types will constrain the usability of your function. ie. if you specify the type as Int, you can't use that function to check if a String contains a certain character, for example.
I'm not quite sure about performance, but I haven't experienced any issues and I use general types almost all the time.