I know a few programmers who keep talking about Haskell when they are among themselves, and here on SO everyone seems to love that language. Being good at Haskell seems somewhat like the hallmark of a genius programmer.
Can someone give a few Haskell examples that show why it is so elegant / superior?
This is the example that convinced me to learn Haskell (and boy am I glad I did).
-- program to copy a file --
import System.Environment

main = do
  --read command-line arguments
  [file1, file2] <- getArgs

  --copy file contents
  str <- readFile file1
  writeFile file2 str
OK, it's a short, readable program. In that sense it's better than a C program. But how is this so different from (say) a Python program with a very similar structure?
The answer is lazy evaluation. In most languages (even some functional ones), a program structured like the one above would result in the entire file being loaded into memory, and then written out again under a new name.
Haskell is "lazy". It doesn't calculate things until it needs to, and by extension doesn't calculate things it never needs. For instance, if you were to remove the writeFile line, Haskell wouldn't bother reading anything from the file in the first place.
As it is, Haskell realises that the writeFile depends on the readFile, and so is able to optimise this data path.
While the results are compiler-dependent, what will typically happen when you run the above program is this: the program reads a block (say 8KB) of the first file, then writes it to the second file, then reads another block from the first file, and writes it to the second file, and so on. (Try running strace on it!)
... which looks a lot like what the efficient C implementation of a file copy would do.
So, Haskell lets you write compact, readable programs - often without sacrificing a lot of performance.
Another thing I must add is that Haskell simply makes it difficult to write buggy programs. The amazing type system, lack of side effects, and of course the compactness of Haskell code reduce bugs for at least three reasons:
Better program design. Reduced complexity leads to fewer logic errors.
Compact code. Fewer lines for bugs to exist on.
Compile errors. Lots of bugs just aren't valid Haskell.
Haskell isn't for everyone. But everyone should give it a try.
The way it was pitched to me, and what I think is true after having spent a month learning Haskell, is that functional programming twists your brain in interesting ways: it forces you to think about familiar problems differently: instead of loops, think in maps, folds and filters, etc. In general, having more than one perspective on a problem makes you better able to reason about it, and to switch viewpoints as necessary.
The other really neat thing about Haskell is its type system. It's statically typed, but the type inference engine makes it feel like a Python program that magically tells you when you've made a stupid type-related mistake. Haskell's error messages in this regard are somewhat lacking, but as you get more acquainted with the language you'll say to yourself: this is what typing is supposed to be!
You are kind of asking the wrong question.
Haskell is not a language where you go look at a few cool examples and go "aha, I see now, that's what makes it good!"
It's more like, we have all these other programming languages, and they're all more or less similar, and then there's Haskell which is totally different and wacky in a way that's totally awesome once you get used to the wackiness. But the problem is, it takes quite a while to acclimate to the wackiness. Things that set Haskell apart from almost any other even-semi-mainstream language:
Lazy evaluation
No side effects (everything is pure, IO/etc happens via monads)
Incredibly expressive static type system
as well as some other aspects that are different from many mainstream languages (but shared by some):
functional
significant whitespace
type inferred
As some other posters have answered, the combination of all these features means that you think about programming in an entirely different way. And so it's hard to come up with an example (or set of examples) that adequately communicates this to Joe-mainstream-programmer. It's an experiential thing. (To make an analogy, I can show you photos of my 1970 trip to China, but after seeing the photos, you still won't know what it was like to have lived there during that time. Similarly, I can show you a Haskell 'quicksort', but you still won't know what it means to be a Haskeller.)
What really sets Haskell apart is the effort it goes to in its design to enforce functional programming. You can program in a functional style in pretty much any language, but it's all too easy to abandon at the first convenience. Haskell does not allow you to abandon functional programming, so you must take it to its logical conclusion, which is a final program that is easier to reason about, and sidesteps a whole class of the thorniest types of bugs.
When it comes to writing a program for real world use, you may find Haskell lacking in some practical fashion, but your final solution will be better for having known Haskell to begin with. I'm definitely not there yet, but so far learning Haskell has been much more enlightening than say, Lisp was in college.
Part of the fuss is that purity and static typing enable parallelism combined with aggressive optimisations. Parallel languages are hot right now, with multicore being a bit disruptive.
Haskell gives you more options for parallelism than pretty much any general purpose language, along with a fast, native code compiler. There is really no competition with this kind of support for parallel styles:
semi-implicit parallelism via thread sparks
explicit threads
data parallel arrays
actors and message passing
transactional memory
So if you care about making your multicore work, Haskell has something to say.
A great place to start is with Simon Peyton Jones' tutorial on parallel and concurrent programming in Haskell.
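To give a flavour of the first item in that list, here is a minimal sketch of spark-based parallelism using par and pseq from Control.Parallel (in the parallel package); the two sums are just stand-in work:

import Control.Parallel (par, pseq)

-- `par` hints that its first argument may be evaluated on another core
-- while we carry on evaluating the second; `pseq` forces evaluation order.
main :: IO ()
main = a `par` (b `pseq` print (a + b))
  where
    a = sum [1 .. 10000000] :: Integer
    b = sum [1 .. 20000000] :: Integer

Build with ghc -threaded -O2 and run with +RTS -N to let the runtime schedule the sparks on multiple cores.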
I've spent the last year learning Haskell and writing a reasonably large and complex project in it. (The project is an automated options trading system, and everything from the trading algorithms to the parsing and handling of low-level, high-speed market data feeds is done in Haskell.) It's considerably more concise and easier to understand (for those with appropriate background) than a Java version would be, as well as extremely robust.
Possibly the biggest win for me has been the ability to modularize control flow through things such as monoids, monads, and so on. A very simple example would be the Ordering monoid; in an expression such as
c1 `mappend` c2 `mappend` c3
where c1 and so on return LT, EQ or GT, c1 returning EQ causes the expression to continue, evaluating c2; if c2 returns LT or GT that's the value of the whole, and c3 is not evaluated. This sort of thing gets considerably more sophisticated and complex in things like monadic message generators and parsers where I may be carrying around different types of state, have varying abort conditions, or may want to be able to decide for any particular call whether abort really means "no further processing" or means, "return an error at the end, but carry on processing to collect further error messages."
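To make the Ordering example concrete, here is a small sketch; the Person type and its fields are made up purely for illustration:

import Data.Monoid (mappend)
import Data.Ord (comparing)

-- Hypothetical record type, just to illustrate multi-key comparison.
data Person = Person { firstName :: String, lastName :: String }

-- Compare by last name first; only when the last names are EQ does the
-- Ordering monoid fall through to comparing first names.
compareByName :: Person -> Person -> Ordering
compareByName a b =
  comparing lastName a b `mappend` comparing firstName a b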
This is all stuff it takes some time and probably quite some effort to learn, and thus it can be hard to make a convincing argument for it for those who don't already know these techniques. I think that the All About Monads tutorial gives a pretty impressive demonstration of one facet of this, but I wouldn't expect that anybody not familiar with the material already would "get it" on the first, or even the third, careful reading.
Anyway, there's lots of other good stuff in Haskell as well, but this is a major one that I don't see mentioned so often, probably because it's rather complex.
Software Transactional Memory is a pretty cool way to deal with concurrency. It's much more flexible than message passing, and not deadlock prone like mutexes. GHC's implementation of STM is considered one of the best.
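For a flavour of what that looks like, here is a minimal sketch using the stm package; the account-transfer example is illustrative, not taken from any particular codebase:

import Control.Concurrent.STM

-- Move money between two accounts atomically: no locks, no ordering bugs.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  writeTVar from (balance - amount)
  modifyTVar' to (+ amount)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  readTVarIO a >>= print   -- 70
  readTVarIO b >>= print   -- 30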
For an interesting example you can look at:
http://en.literateprograms.org/Quicksort_(Haskell)
What is interesting is to look at the implementation in various languages.
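For reference, the well-known (not in-place) Haskell list quicksort usually shown in such comparisons looks roughly like this:

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort [x | x <- xs, x < p]
                   ++ [p]
                   ++ quicksort [x | x <- xs, x >= p]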
What makes Haskell so interesting, along with other functional languages, is the fact that you have to think differently about how to program. For example, you will generally not use for or while loops, but will use recursion.
As is mentioned above, Haskell and other functional languages excel with parallel processing and writing applications to work on multi-cores.
I couldn't give you an example, I'm an OCaml guy, but when I'm in such a situation as yourself, curiosity just takes hold and I have to download a compiler/interpreter and give it a go. You'll likely learn far more that way about the strengths and weaknesses of a given functional language.
One thing I find very cool when dealing with algorithms or mathematical problems is Haskell's inherent lazy evaluation of computations, which is only possible because of its purely functional nature.
For example, if you want to calculate all primes, you could use
primes = sieve [2..]
  where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
and the result is actually an infinite list. But Haskell will evaluate it from left to right, so as long as you don't try to do something that requires the entire list, you can still use it without the program getting stuck in infinity, such as:
foo = sum $ takeWhile (<100) primes
which sums all primes less than 100. This is nice for several reasons. First of all, I only need to write one prime function that generates all primes and then I'm pretty much ready to work with primes. In an object-oriented programming language, I would need some way to tell the function how many primes it should compute before returning, or emulate the infinite list behavior with an object. Another thing is that in general, you end up writing code that expresses what you want to compute and not in which order to evaluate things - instead the compiler does that for you.
This is not only useful for infinite lists, in fact it gets used without you knowing it all the time when there is no need to evaluate more than necessary.
I find that for certain tasks I am incredibly productive with Haskell.
The reason is because of the succinct syntax and the ease of testing.
This is what the function declaration syntax is like:
foo a = a + 5
That's the simplest way I can think of to define a function.
If I write the inverse
inverseFoo a = a - 5
I can check that it is an inverse for any random input by writing
prop_IsInverse :: Double -> Bool
prop_IsInverse a = a == (inverseFoo $ foo a)
And calling from the command line
jonny@ubuntu: runhaskell quickCheck +names fooFileName.hs
which will check that all the properties in my file hold, by randomly testing inputs a hundred times or so.
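For reference, another common way to run such a property is with the quickCheck function from the Test.QuickCheck library, either in GHCi or from a small script like the sketch below (note that, strictly speaking, floating-point rounding can produce counterexamples for a property like this one):

import Test.QuickCheck

foo, inverseFoo :: Double -> Double
foo a = a + 5
inverseFoo a = a - 5

prop_IsInverse :: Double -> Bool
prop_IsInverse a = a == inverseFoo (foo a)

-- Tests the property against (by default) 100 random inputs and reports
-- any counterexample it finds.
main :: IO ()
main = quickCheck prop_IsInverse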
I don't think Haskell is the perfect language for everything, but when it comes to writing little functions and testing, I haven't seen anything better. If your programming has a mathematical component this is very important.
I agree with others that seeing a few small examples is not the best way to show off Haskell. But I'll give some anyway. Here's a lightning-fast solution to Project Euler problems 18 and 67, which ask you to find the maximum-sum path from the base to the apex of a triangle:
bottomUp :: (Ord a, Num a) => [[a]] -> a
bottomUp = head . bu
  where bu [bottom] = bottom
        bu (row : base) = merge row $ bu base
        merge [] [_] = []
        merge (x:xs) (y1:y2:ys) = x + max y1 y2 : merge xs (y2:ys)
Here is a complete, reusable implementation of the BubbleSearch algorithm by Lesh and Mitzenmacher. I used it to pack large media files for archival storage on DVD with no waste:
data BubbleResult i o = BubbleResult { bestResult :: o
                                     , result :: o
                                     , leftoverRandoms :: [Double]
                                     }

bubbleSearch :: (Ord result) =>
                ([a] -> result) ->      -- greedy search algorithm
                Double ->               -- probability
                [a] ->                  -- list of items to be searched
                [Double] ->             -- list of random numbers
                [BubbleResult a result] -- monotone list of results
bubbleSearch search p startOrder rs = bubble startOrder rs
  where
    bubble order rs = BubbleResult answer answer rs : walk tries
      where
        answer = search order
        tries  = perturbations p order rs
        walk ((order, rs) : rest) =
            if result > answer then bubble order rs
            else BubbleResult answer result rs : walk rest
          where result = search order

    perturbations :: Double -> [a] -> [Double] -> [([a], [Double])]
    perturbations p xs rs = xr' : perturbations p xs (snd xr')
      where xr' = perturb xs rs

    perturb :: [a] -> [Double] -> ([a], [Double])
    perturb xs rs = shift_all p [] xs rs

    shift_all p new' []  rs = (reverse new', rs)
    shift_all p new' old rs = shift_one new' old rs (shift_all p)
      where
        shift_one :: [a] -> [a] -> [Double] -> ([a] -> [a] -> [Double] -> b) -> b
        shift_one new' xs rs k = shift new' [] xs rs
          where
            shift new' prev' [x] rs = k (x:new') (reverse prev') rs
            shift new' prev' (x:xs) (r:rs)
              | r <= p    = k (x:new') (prev' `revApp` xs) rs
              | otherwise = shift new' (x:prev') xs rs

    revApp xs ys = foldl (flip (:)) ys xs
I'm sure this code looks like random gibberish. But if you read Mitzenmacher's blog entry and understand the algorithm, you'll be amazed that it's possible to package the algorithm into code without saying anything about what you're searching for.
Having given you some examples as you asked for, I will say that the best way to start to appreciate Haskell is to read the paper that gave me the ideas I needed to write the DVD packer: Why Functional Programming Matters by John Hughes. The paper actually predates Haskell, but it brilliantly explains some of the ideas that make people like Haskell.
For me, the attraction of Haskell is the promise of compiler-guaranteed correctness, even if that guarantee only covers the pure parts of the code.
I have written a lot of scientific simulation code, and have wondered so many times whether there was a bug in my earlier code that could invalidate a lot of current work.
It has no loop constructs; not many languages have this trait.
If you can wrap your head around the type system in Haskell I think that in itself is quite an accomplishment.
I agree with those that said that functional programming twists your brain into seeing programming from a different angle. I've only used it as a hobbyist, but I think it fundamentally changed the way I approach a problem. I don't think I would have been nearly as effective with LINQ without having been exposed to Haskell (and using generators and list comprehensions in Python).
To air a contrarian view: Steve Yegge writes that Hindley-Milner languages lack the flexibility required to write good systems:

"H-M is very pretty, in a totally useless formal mathematical sense. It handles a few computation constructs very nicely; the pattern matching dispatch found in Haskell, SML and OCaml is particularly handy. Unsurprisingly, it handles some other common and highly desirable constructs awkwardly at best, but they explain those scenarios away by saying that you're mistaken, you don't actually want them. You know, things like, oh, setting variables."
Haskell is worth learning, but it has its own weaknesses.
I've been working with functional programming for a while now and I think it's great so I would like to teach some of my friends Haskell.
Unfortunately, I can't think of any particular piece of code to show them and say "See, this is how it would look imperatively, see how much better functional is"
So, could someone that's more of an expert than I am (and that's a very low requirement) help me out?
This doesn't seem too opinionated to me, but in case it is, please tell me how to fix it.
Probably the best notions to carry back are so called "value semantics" and "purity".
Each of these play off one another so much it's hard to separate them in practice. In principle, however, value semantics means that each "thing" should act like a value instead of an object. It leads to simpler passing, less "spooky action at a distance" from statefulness, and it provides some amount of background to perform equational reasoning on code. Purity means that side effects do not occur wherever you have code but instead only at carefully demarcated points. This means that most of your code ends up independent and reusable while only the core "application" bits entangle themselves deeply with state and effect.
You might say that purity is having value semantics everywhere or that values are pure computations—so perhaps it's worth saying that "values" refer to the nouns (statics) of your system and "purity" the verbs (dynamics).
These techniques are well known to be useful in other languages. It's a common idea in OO languages these days to happily sacrifice some speed for value semantics due to the organizational and correctness benefits. If you become comfortable with Haskell then you will understand how value semantics and purity work if they are applied to every single aspect of an entire program without compromise. That means you've been exposed to some powerful patterns for reasoning about and building pure programs.
One place I've been thinking about making a comparison is between free monads and the Command pattern. Both are solving very similar problems—"how do I make explicit a structure containing instructions to be performed by a program and execute it at a later time, perhaps in various ways?"—but the Command pattern tends to dance around a lot of mutability in, at the very least, the interpreter if not the commands themselves.
Can we write Command patterns which behave more like Free monads? What would be the benefits? These are the kinds of questions you can ask with much more acuity if you've got a strong Haskell background.
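To make the comparison concrete, here is a minimal sketch of a free monad over a hypothetical command functor. The Free type, CmdF and both interpreters below are illustrative rather than any particular library's API (the free package provides a production-ready Free):

{-# LANGUAGE DeriveFunctor #-}

-- A hand-rolled free monad: a program is either a result, or one command
-- wrapping the rest of the program.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- The "commands": write a line, or read one.
data CmdF next = Output String next | Input (String -> next)
  deriving Functor

output :: String -> Free CmdF ()
output s = Free (Output s (Pure ()))

input :: Free CmdF String
input = Free (Input Pure)

-- A pure description of a small interactive program; nothing has run yet.
program :: Free CmdF ()
program = do
  output "name?"
  name <- input
  output ("hello, " ++ name)

-- One interpreter runs in IO; others could log, test, or batch the commands.
runIO :: Free CmdF a -> IO a
runIO (Pure a)             = return a
runIO (Free (Output s k))  = putStrLn s >> runIO k
runIO (Free (Input k))     = getLine >>= runIO . k

main :: IO ()
main = runIO program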
It's an interesting, and tricky question. There has been a trend of concepts from functional languages making their way into imperative languages for some time now, and the line between functional/imperative languages is quite blurred. For example, say you want to square every element of a list xs and store the result in a new list, ys.
>> xs = [1, 2, 3] # Python
>> xs = [1, 2, 3] -- Haskell
If you didn't know about functional idioms, you might do this:
>> ys = [0] * len(xs) # Python
>> for i in range(len(xs)):
>>     ys[i] = xs[i] * xs[i]
In Haskell you would just write
>> ys = map (\x -> x * x) xs -- Haskell
This idiom also exists in Python, of course
>> ys = map(lambda x: x * x, xs) # Python
Arguably, an even nicer way to write it is using a list comprehension
>> ys = [x * x | x <- xs] -- Haskell
which also exists in Python
>> ys = [x * x for x in xs] # Python
Certainly, the functional way is much nicer (and more composable, more reusable) than the imperative way. But in this case, you don't need to use a functional language to get the benefit - you just have to be ready to "think in a functional way."
Monads and continuations.
I have come across code like this:
synchronized(q) {
    Object o = q.poll();
    if (o == null) {
        ... // do something
    }
}
This was refactored into a much nicer API:
Object o = q.poll(doSomethingAsLambda)
The q implementation was rewritten, of course, but now synchronization is finer grained, because the implementation permits executing custom code "inside" a branch the q implementation would be aware of.
Modern functional languages (like Haskell and ML) are concise - you can say a lot in not much code.
A real benefit of concision is that you can iterate a design quickly, and in other domains (like graphic or fashion design) rapid iteration seems to be considered to be one of the "tools" for becoming a good or expert designer; therefore it's a fair belief that learning how to rapidly iterate software designs will help make you a better programmer.
Of course modern scripting languages like Python and "design" languages like Alloy (developed by Daniel Jackson at MIT) are concise and allow you to rapidly iterate designs / prototypes. So a greater theme seems to be that "lightweight" / concise languages help you iterate and improve your design skills rather than just "functional programming will make you a better 'mainstream' programmer".
I'm starting to learn functional programming languages like Haskell and ML, and most of the exercises show off things like:
foldr (+) 0 [ 1 ..10]
which is equivalent to
sum = 0
for ( i in [1..10] )
    sum += i
So that leads me to wonder: why can't the compiler recognize that this is an arithmetic progression and use the O(1) formula to calculate it?
Especially for pure FP languages without side effects?
The same applies for
sum reverse list == sum list
Given a + b = b + a
and definition of reverse, can compilers/languages prove it automatically?
Compilers generally don't try to prove this kind of thing automatically, because it's hard to implement.
As well as adding the logic to the compiler to transform one fragment of code into another, you have to be very careful that it only tries to do it when it's actually safe - i.e. there are often lots of "side conditions" to worry about. For example in your example above, someone might have written an instance of the type class Num (and hence the (+) operator) where a + b is not b + a.
However, GHC does have rewrite rules which you can add to your own source code and could be used to cover some relatively simple cases like the ones you list above, particularly if you're not too bothered about the side conditions.
For example, and I haven't tested this, you might use the following rule for one of your examples above:
{-# RULES
"sum/reverse" forall list . sum (reverse list) = sum list
#-}
Note the parentheses around reverse list - what you've written in your question actually means (sum reverse) list and wouldn't typecheck.
EDIT:
As you're looking for official sources and pointers to research, I've listed a few.
Obviously it's hard to prove a negative but the fact that no-one has given an example of a general-purpose compiler that does this kind of thing routinely is probably quite strong evidence in itself.
As others have pointed out, even simple arithmetic optimisations are surprisingly dangerous, particularly on floating point numbers, and compilers generally have flags to turn them off - for example Visual C++, gcc. Even integer arithmetic isn't always clear-cut and people occasionally have big arguments about how to deal with things like overflow.
As Joachim noted, integer variables in loops are one place where slightly more sophisticated optimisations are applied because there are actually significant wins to be had. Muchnick's book is probably the best general source on the topic but it's not that cheap. The wikipedia page on strength reduction is probably as good an introduction as any to one of the standard optimisations of this kind, and has some references to the relevant literature.
FFTW is an example of a library that does all kinds of mathematical optimization internally. Some of its code is generated by a customised compiler the authors wrote specifically for the purpose. It's worthwhile because the authors have domain-specific knowledge of optimizations that, in the specific context of the library, are both worth the effort and safe.
People sometimes use template metaprogramming to write "self-optimising libraries" that again might rely on arithmetic identities, see for example Blitz++. Todd Veldhuizen's PhD dissertation has a good overview.
If you descend into the realms of toy and academic compilers all sorts of things go. For example my own PhD dissertation is about writing inefficient functional programs along with little scripts that explain how to optimise them. Many of the examples (see Chapter 6) rely on applying arithmetic rules to justify the underlying optimisations.
Also, it's worth emphasising that the last few examples are of specialised optimisations being applied only to certain parts of the code (e.g. calls to specific libraries) where it is expected to be worthwhile. As other answers have pointed out, it's simply too expensive for a compiler to go searching for all possible places in an entire program where an optimisation might apply. The GHC rewrite rules that I mentioned above are a great example of a compiler exposing a generic mechanism for individual libraries to use in a way that's most appropriate for them.
The answer
No, compilers don’t do that kind of stuff.
One reason why
And for your examples, it would even be wrong: Since you did not give type annotations, the Haskell compiler will infer the most general type, which would be
foldr (+) 0 [ 1 ..10] :: Num a => a
and similar
(\list -> sum (reverse list)) :: Num a => [a] -> a
and the Num instance for the type that is being used might well not fulfil the mathematical laws required for the transformation you suggest. The compiler should, before everything else, avoid changing the meaning (i.e. the semantics) of your program.
More pragmatically: The cases where the compiler could detect such large-scale transformations rarely occur in practice, so it would not be worth it to implement them.
An exception
Notable exceptions are linear transformations in loops. Most compilers will rewrite
for (int i = 0; i < n; i++) {
    ... 200 + 4 * i ...
}
to
for (int i = 0, j = 200; i < n; i++, j += 4) {
    ... j ...
}
or something similar, as that pattern does often occur in code working on arrays.
The optimizations you have in mind will probably not be done even in the presence of monomorphic types, because there are so many possibilities and so much knowledge required. For example, in this example:
sum list == sum (reverse list)
The compiler would need to know or take into account the following facts:
sum = foldl (+) 0
(+) is commutative
reverse list is a permutation of list
foldl x c l, where x is commutative and c is a constant, yields the same result for all permutations of l.
This all seems trivial. Sure, the compiler can most probably look up the definition of sum and inline it. It could be required that (+) be commutative, but remember that + is just another symbol without attached meaning to the compiler. The third point would require the compiler to prove some non-trivial properties about reverse.
But the point is:
You don't want the compiler to perform those calculations on each and every expression. Remember, to make this really useful, you'd have to heap up a lot of knowledge about many, many standard functions and operators.
You still can't replace the expression above with True unless you can rule out the possibility that list or some list element is bottom. Usually, one cannot do this. You can't even do the following "trivial" optimization of f x == f x in all cases
f x `seq` True
For, consider
f x = (undefined :: Bool, x)
then
f x `seq` True ==> True
f x == f x ==> undefined
That being said, regarding your first example slightly modified for monomorphism:
f n = n * foldl (+) 0 [1..10] :: Int
it is imaginable to optimize the program by moving the expression out of its context and replace it with the name of a constant, like so:
const1 = foldl (+) 0 [1..10] :: Int
f n = n * const1
This is because the compiler can see that the expression must be constant.
What you're describing looks like super-compilation. In your case, if the expression had a monomorphic type like Int (as opposed to polymorphic Num a => a), the compiler could infer that the expression foldr (+) 0 [1 ..10] has no external dependencies, therefore it could be evaluated at compile time and replaced by 55. However, AFAIK no mainstream compiler currently does this kind of optimization.
(In functional programming "proving" is usually associated with something different. In languages with dependent types types are powerful enough to express complex proposition and then through the Curry-Howard correspondence programs become proofs of such propositions.)
As others have noted, it's unclear that your simplifications even hold in Haskell. For instance, I can define
newtype NInt = N Int

instance Num NInt where
  N a + _ = N a
  N b * _ = N b
  ... -- etc
and now sum . reverse :: Num a => [a] -> a does not equal sum :: Num a => [a] -> a, since I can specialize each to [NInt] -> NInt where sum . reverse == sum clearly does not hold.
This is one general tension that exists around optimizing "complex" operations—you actually need quite a lot of information in order to successfully prove that it's okay to optimize something. This is why the syntax-level compiler optimizations which do exist are usually monomorphic and related to the structure of programs—it's usually such a simplified domain that there's "no way" for the optimization to go wrong. Even that is often unsafe because the domain is never quite so simplified and well-known to the compiler.
As an example, a very popular "high-level" syntactic optimization is stream fusion. In this case the compiler is given enough information to know that stream fusion can occur and is basically safe, but even in this canonical example we have to skirt around notions of non-termination.
So what does it take to have \x -> sum [0..x] get replaced by \x -> x*(x + 1)/2? The compiler would need a theory of numbers and algebra built-in. This is not possible in Haskell or ML, but becomes possible in dependently typed languages like Coq, Agda, or Idris. There you could specify things like
revCommute :: (_+_ :: a -> a -> a)
           -> Commutative _+_
           -> foldr _+_ z (reverse as) == foldr _+_ z as
and then, theoretically, tell the compiler to rewrite according to revCommute. This would still be difficult and finicky, but at least we'd have enough information around. To be clear, I'm writing something very strange above, a dependent type. The type not only depends on the ability to introduce both a type and a name for the argument inline, but also the existence of the entire syntax of your language "at the type level".
There are a lot of differences between what I just wrote and what you'd do in Haskell, though. First, in order to form a basis where such promises can be taken seriously, we must throw away general recursion (and thus we already don't have to worry about questions of non-termination like stream-fusion does). We also must have enough structure around to create something like the promise Commutative _+_—this likely depends upon there being an entire theory of operators and mathematics built into the language's standard library, or else you would need to create that yourself. Finally, the richness of type system required to even express these kinds of theories adds a lot of complexity to the entire system and tosses out type inference as you know it today.
But, given all that structure, I'd never be able to create an obligation Commutative _+_ for the _+_ defined to work on NInts and so we could be certain that foldr (+) 0 . reverse == foldr (+) 0 actually does hold.
But now we'd need to tell the compiler how to actually perform that optimization. For stream-fusion, the compiler rules only kick in when we write something in exactly the right syntactic form to be "clearly" an optimization redex. The same kinds of restrictions would apply to our sum . reverse rule. In fact, already we're sunk because
foldr (+) 0 . reverse
foldr (+) 0 (reverse as)
don't match. They're "obviously" the same due to some rules we could prove about (.), but that means that now the compiler must invoke two built-in rules in order to perform our optimization.
At the end of the day, you need a very smart optimization search over the sets of known laws in order to achieve the kinds of automatic optimizations you're talking about.
So not only do we add a lot of complexity to the entire system, require a lot of base work to build-in some useful algebraic theories, and lose Turing completeness (which might not be the worst thing), we also only get a finicky promise that our rule would even fire unless we perform an exponentially painful search during compilation.
Blech.
The compromise that exists today tends to be that sometimes we have enough control over what's being written to be mostly certain that a certain obvious optimization can be performed. This is the regime of stream fusion and it requires a lot of hidden types, carefully written proofs, exploitations of parametricity, and hand-waving before it's something the community trusts enough to run on their code.
And it doesn't even always fire. For an example of battling that problem take a look at the source of Vector for all of the RULES pragmas that specify all of the common circumstances where Vector's stream-fusion optimizations should kick in.
All of this is not at all a critique of compiler optimizations or dependent type theories. Both are really incredible. Instead it's just an amplification of the tradeoffs involved in introducing such an optimization. It's not to be done lightly.
Fun fact: Given two arbitrary formulas, do they both give the same output for the same inputs? The answer to this seemingly simple question is not computable! In other words, it is mathematically impossible to write a computer program that always gives the correct answer in finite time.
Given this fact, it's perhaps not surprising that nobody has a compiler that can magically transform every possible computation into its most efficient form.
Also, isn't this the programmer's job? If you want the sum of an arithmetic sequence commonly enough that it's a performance bottleneck, why not just write some more efficient code yourself? Similarly, if you really want Fibonacci numbers (why?), use the O(1) algorithm.
I'm currently digesting the nice presentation Why learn Haskell? by Keegan McAllister. There he uses the snippet
minimum = head . sort
as an illustration of Haskell's lazy evaluation by stating that minimum has time-complexity O(n) in Haskell. However, I think the example is kind of academic in nature. I'm therefore asking for a more practical example where it's not trivially apparent that most of the intermediate calculations are thrown away.
Have you ever written an AI? Isn't it annoying that you have to thread pruning information (e.g. maximum depth, the minimum cost of an adjacent branch, or other such information) through the tree traversal function? This means you have to write a new tree traversal every time you want to improve your AI. That's dumb. With lazy evaluation, this is no longer a problem: write your tree traversal function once, to produce a huge (maybe even infinite!) game tree, and let your consumer decide how much of it to consume.
Writing a GUI that shows lots of information? Want it to run fast anyway? In other languages, you might have to write code that renders only the visible scenes. In Haskell, you can write code that renders the whole scene, and then later choose which pixels to observe. Similarly, rendering a complicated scene? Why not compute an infinite sequence of scenes at various detail levels, and pick the most appropriate one as the program runs?
You write an expensive function, and decide to memoize it for speed. In other languages, this requires building a data structure that tracks which inputs for the function you know the answer to, and updating the structure as you see new inputs. Remember to make it thread safe -- if we really need speed, we need parallelism, too! In Haskell, you build an infinite data structure, with an entry for each possible input, and evaluate the parts of the data structure that correspond to the inputs you care about. Thread safety comes for free with purity.
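As a minimal sketch of that idea (the memoFib name and the simple list-based table are illustrative; real code would use a more efficient lookup structure):

-- table !! n is an O(n) lookup, so this is only a sketch; the point is that
-- laziness plus purity gives memoization "for free", with no explicit cache
-- updates and no locking.
memoFib :: Int -> Integer
memoFib = (table !!)
  where
    table = map fib [0 ..]                    -- conceptually infinite table of results
    fib 0 = 0
    fib 1 = 1
    fib n = memoFib (n - 1) + memoFib (n - 2) -- recursion goes through the table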
Here's one that's perhaps a bit more prosaic than the previous ones. Have you ever found a time when && and || weren't the only things you wanted to be short-circuiting? I sure have! For example, I love the <|> function for combining Maybe values: it takes the first one of its arguments that actually has a value. So Just 3 <|> Nothing = Just 3; Nothing <|> Just 7 = Just 7; and Nothing <|> Nothing = Nothing. Moreover, it's short-circuiting: if it turns out that its first argument is a Just, it won't bother doing the computation required to figure out what its second argument is.
And <|> isn't built in to the language; it's tacked on by a library. That is: laziness allows you to write brand new short-circuiting forms. (Indeed, in Haskell, even the short-circuiting behavior of (&&) and (||) aren't built-in compiler magic: they arise naturally from the semantics of the language plus their definitions in the standard libraries.)
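As a small sketch of that point, here is a hand-rolled short-circuiting combinator for Maybe; it behaves like the library's <|> at this type, and it only works as a user-defined function because Haskell is lazy:

-- Take the first argument if it has a value; otherwise fall back to the second.
-- The second argument is never evaluated when the first is a Just.
orElse :: Maybe a -> Maybe a -> Maybe a
orElse (Just x) _ = Just x
orElse Nothing  y = y

main :: IO ()
main = print (Just 3 `orElse` error "never evaluated")   -- prints Just 3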
In general, the common theme here is that you can separate the production of values from the determination of which values are interesting to look at. This makes things more composable, because the choice of what is interesting to look at need not be known by the producer.
Here's a well-known example I posted to another thread yesterday. Hamming numbers are numbers that don't have any prime factors larger than 5. I.e. they have the form 2^i*3^j*5^k. The first 20 of them are:
[1,2,3,4,5,6,8,9,10,12,15,16,18,20,24,25,27,30,32,36]
The 500000th one is:
1962938367679548095642112423564462631020433036610484123229980468750
The program that printed the 500000th one (after a brief moment of computation) is:
merge xxs@(x:xs) yys@(y:ys) =
    case x `compare` y of
        LT -> x : merge xs  yys
        EQ -> x : merge xs  ys
        GT -> y : merge xxs ys

hamming = 1 : m 2 `merge` m 3 `merge` m 5
  where
    m k = map (k *) hamming

main = print (hamming !! 499999)
Computing that number with reasonable speed in a non-lazy language takes quite a bit more code and head-scratching. There are a lot of examples here
Consider generating and consuming the first n elements of an infinite sequence. Without lazy evaluation, the naive encoding would run forever in the generation step, and never consume anything. With lazy evaluation, only as many elements are generated as the code tries to consume.
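A tiny sketch of that pattern (the function and values below are illustrative):

-- `squares` is conceptually infinite, but only the five demanded elements
-- are ever computed, thanks to lazy evaluation.
firstSquaresOver :: Integer -> [Integer]
firstSquaresOver n = take 5 (filter (> n) squares)
  where squares = map (^ 2) [1 ..]

main :: IO ()
main = print (firstSquaresOver 50)   -- [64,81,100,121,144]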
A classic programming exercise is to write a Lisp/Scheme interpreter in Lisp/Scheme. The power of the full language can be leveraged to produce an interpreter for a subset of the language.
Is there a similar exercise for Haskell? I'd like to implement a subset of Haskell using Haskell as the engine. Of course it can be done, but are there any online resources available to look at?
Here's the backstory.
I am exploring the idea of using Haskell as a language to explore some of the concepts in a Discrete Structures course I am teaching. For this semester I have settled on Miranda, a smaller language that inspired Haskell. Miranda does about 90% of what I'd like it to do, but Haskell does about 2000%. :)
So my idea is to create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
Pedagogical "language levels" have been used successfully to teach Java and Scheme. By limiting what they can do, you can prevent them from shooting themselves in the foot while they are still mastering the syntax and concepts you are trying to teach. And you can offer better error messages.
I love your goal, but it's a big job. A couple of hints:
I've worked on GHC, and you don't want any part of the sources. Hugs is a much simpler, cleaner implementation but unfortunately it's in C.
It's a small piece of the puzzle, but Mark Jones wrote a beautiful paper called Typing Haskell in Haskell which would be a great starting point for your front end.
Good luck! Identifying language levels for Haskell, with supporting evidence from the classroom, would be of great benefit to the community and definitely a publishable result!
There is a complete Haskell parser: http://hackage.haskell.org/package/haskell-src-exts
Once you've parsed it, stripping out or disallowing certain things is easy. I did this for tryhaskell.org to disallow import statements, to support top-level definitions, etc.
Just parse the module:
parseModule :: String -> ParseResult Module
Then you have an AST for a module:
Module SrcLoc ModuleName [ModulePragma] (Maybe WarningText) (Maybe [ExportSpec]) [ImportDecl] [Decl]
The Decl type is extensive: http://hackage.haskell.org/packages/archive/haskell-src-exts/1.9.0/doc/html/Language-Haskell-Exts-Syntax.html#t%3ADecl
All you need to do is define a white-list -- of what declarations, imports, symbols, syntax is available, then walk the AST and throw a "parse error" on anything you don't want them to be aware of yet. You can use the SrcLoc value attached to every node in the AST:
data SrcLoc = SrcLoc
{ srcFilename :: String
, srcLine :: Int
, srcColumn :: Int
}
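A minimal sketch of that whitelist check, assuming the haskell-src-exts 1.x API quoted above (parseModule, ParseResult, and the seven-field Module constructor); newer, annotated versions of the library differ in the details:

import Language.Haskell.Exts

-- Reject any module that contains import declarations; extend the same idea
-- to walk the [Decl] list against a whitelist of allowed syntax.
checkNoImports :: String -> Either String ()
checkNoImports src =
  case parseModule src of
    ParseFailed loc err ->
      Left ("parse error at line " ++ show (srcLine loc) ++ ": " ++ err)
    ParseOk (Module _ _ _ _ _ imports _)
      | null imports -> Right ()
      | otherwise    -> Left "import declarations are not allowed in this subset"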
There's no need to re-implement Haskell. If you want to provide more friendly compile errors, just parse the code, filter it, send it to the compiler, and parse the compiler output. If it's a "couldn't match expected type a against inferred a -> b" then you know it's probably too few arguments to a function.
Unless you really really want to spend time implementing Haskell from scratch or messing with the internals of Hugs, or some dumb implementation, I think you should just filter what gets passed to GHC. That way, if your students want to take their code-base and take it to the next step and write some real fully fledged Haskell code, the transition is transparent.
Do you want to build your interpreter from scratch? Begin by implementing an easier functional language like the lambda calculus or a Lisp variant. For the latter there is a quite nice wikibook called Write Yourself a Scheme in 48 Hours, giving a cool and pragmatic introduction to parsing and interpretation techniques.
Interpreting Haskell by hand will be much more complex since you'll have to deal with highly complex features like typeclasses, an extremely powerful type system (type-inference!) and lazy-evaluation (reduction techniques).
So you should define a quite small subset of Haskell to work with and then maybe start by extending the Scheme example step by step.
Addition:
Note that in Haskell, you have full access to the interpreter's API (at least under GHC), including parsers, compilers and of course interpreters.
The package to use is hint (Language.Haskell.*). I have unfortunately neither found online tutorials on this nor tried it out by myself but it looks quite promising.
create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
I suggest a simpler (as in less work involved) solution to this problem. Instead of creating a Haskell implementation where you can turn features off, wrap a Haskell compiler with a program that first checks that the code doesn't use any feature you disallow, and then uses the ready-made compiler to compile it.
That would be similar to HLint (and also kind of its opposite):
HLint (formerly Dr. Haskell) reads Haskell programs and suggests changes that hopefully make them easier to read. HLint also makes it easy to disable unwanted suggestions, and to add your own custom suggestions.
Implement your own HLint "suggestions" to not use the features you don't allow
Disable all the standard HLint suggestions.
Make your wrapper run your modified HLint as a first step
Treat HLint suggestions as errors. That is, if HLint "complained" then the program doesn't proceed to compilation stage
Baskell is a teaching implementation, http://hackage.haskell.org/package/baskell
You might start by picking just, say, the type system to implement. That's about as complicated as an interpreter for Scheme, http://hackage.haskell.org/package/thih
The EHC series of compilers is probably the best bet: it's actively developed and seems to be exactly what you want - a series of small lambda calculi compilers/interpreters culminating in Haskell '98.
But you could also look at the various languages developed in Pierce's Types and Programming Languages, or the Helium interpreter (a crippled Haskell intended for students http://en.wikipedia.org/wiki/Helium_(Haskell)).
If you're looking for a subset of Haskell that's easy to implement, you can do away with type classes and type checking. Without type classes, you don't need type inference to evaluate Haskell code.
I wrote a self-compiling Haskell subset compiler for a Code Golf challenge. It takes Haskell subset code on input and produces C code on output. I'm sorry there isn't a more readable version available; I lifted nested definitions by hand in the process of making it self-compiling.
For a student interested in implementing an interpreter for a subset of Haskell, I would recommend starting with the following features:
Lazy evaluation. If the interpreter is in Haskell, you might not have to do anything for this.
Function definitions with pattern-matched arguments and guards. Only worry about variable, cons, nil, and _ patterns.
Simple expression syntax:
Integer literals
Character literals
[] (nil)
Function application (left associative)
Infix : (cons, right associative)
Parenthesis
Variable names
Function names
More concretely, write an interpreter that can run this:
-- tail :: [a] -> [a]
tail (_:xs) = xs
-- append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x : append xs ys
-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith _ _ _ = []
-- showList :: (a -> String) -> [a] -> String
showList _ [] = '[' : ']' : []
showList show (x:xs) = '[' : append (show x) (showItems show xs)
-- showItems :: (a -> String) -> [a] -> String
showItems show [] = ']' : []
showItems show (x:xs) = ',' : append (show x) (showItems show xs)
-- fibs :: [Int]
fibs = 0 : 1 : zipWith add fibs (tail fibs)
-- main :: String
main = showList showInt (take 40 fibs)
Type checking is a crucial feature of Haskell. However, going from nothing to a type-checking Haskell compiler is very difficult. If you start by writing an interpreter for the above, adding type checking to it should be less daunting.
You might look at Happy (a yacc-like parser in Haskell) which has a Haskell parser.
This might be a good idea - make a tiny version of NetLogo in Haskell. Here is the tiny interpreter.
See if Helium would make a better base to build upon than standard Haskell.
Uhc/Ehc is a series of compilers enabling/disabling various Haskell features.
http://www.cs.uu.nl/wiki/Ehc/WebHome#What_is_UHC_And_EHC
I've been told that Idris has a fairly compact parser, not sure if it's really suitable for alteration, but it's written in Haskell.
Andrej Bauer's Programming Language Zoo has a small implementation of a purely functional programming language somewhat cheekily named "minihaskell". It is about 700 lines of OCaml, so very easy to digest.
The site also contains toy versions of ML-style, Prolog-style and OO programming languages.
Don't you think it would be easier to take the GHC sources and strip out what you don't want, than it would be to write your own Haskell interpreter from scratch? Generally speaking, there should be a lot less effort involved in removing features as opposed to creating/adding features.
GHC is written in Haskell anyway, so technically that satisfies your requirement of a Haskell interpreter written in Haskell.
It probably wouldn't be too hard to make the whole thing statically linked and then only distribute your customized GHCi, so that the students can't load other Haskell source modules. As to how much work it would take to prevent them from loading other Haskell object files, I have no idea. You might want to disable FFI too, if you have a bunch of cheaters in your classes :)
The reason why there are so many LISP interpreters is that LISP is basically a predecessor of JSON: a simple format to encode data. This makes the frontend part quite easy to handle. Compared to that, Haskell, especially with Language Extensions, is not the easiest language to parse.
These are some syntactical constructs that sound tricky to get right:
operators with configurable precedence, associativity, and fixity,
nested comments
layout rule
pattern syntax
do-blocks and desugaring to monadic code
Each of these, except maybe the operators, could be tackled by students after their Compiler Construction Course, but it would take the focus away from how Haskell actually works. In addition to that, you might not want to implement all syntactical constructs of Haskell directly, but instead implement passes to get rid of them. Which brings us to the literal core of the issue, pun fully intended.
My suggestion is to implement typechecking and an interpreter for Core instead of full Haskell. Both of these tasks are quite intricate by themselves already.
This language, while still a strongly typed functional language, is way less complicated to deal with in terms of optimization and code generation.
However, it is still independent from the underlying machine.
Therefore, GHC uses it as an intermediate language and translates most syntactic constructs of Haskell into it.
Additionally, you should not shy away from using GHC's (or another compiler's) frontend.
I'd not consider that as cheating since custom LISPs use the host LISP system's parser (at least during bootstrapping). Cleaning up Core snippets and presenting them to students, along with the original code, should allow you to give an overview of what the frontend does, and why it is preferable to not reimplement it.
Here are a few links to the documentation of Core as used in GHC:
System FC: equality constraints and coercions
GHC/As a library
The Core type
I'm going to be teaching a lower-division course in discrete structures. I have selected the text book Discrete Structures, Logic, and Computability in part because it contains examples and concepts that are conducive to implementation with a functional programming language. (I also think it's a good textbook.)
I want an easy-to-understand FP language to illustrate DS concepts and that the students can use. Most students will have had only one or two semesters of programming in Java, at best. After looking at Scheme, Erlang, Haskell, Ocaml, and SML, I've settled on either Haskell or Standard ML. I'm leaning towards Haskell for the reasons outlined below, but I'd like the opinion of those who are active programmers in one or the other.
Both Haskell and SML have pattern matching which makes describing a recursive algorithm a cinch.
Haskell has nice list comprehensions that match nicely with the way such lists are expressed mathematically.
Haskell has lazy evaluation. Great for constructing infinite lists using the list comprehension technique.
SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell.
SML gives explicit confirmation of the function argument and return types in a syntax that's easy to understand. For example: val foo = fn : int * int -> int. Haskell's implicit curry syntax is a bit more obtuse, but not totally alien. For example: foo :: Int -> Int -> Int.
Haskell uses arbitrary-precision integers by default. It's an external library in SML/NJ. And SML/NJ truncates output to 70 characters by default.
Haskell's lambda syntax is subtle -- it uses a single backslash. SML is more explicit. Not sure if we'll ever need lambda in this class, though.
Essentially, SML and Haskell are roughly equivalent. I lean toward Haskell because I'm loving the list comprehensions and infinite lists in Haskell. But I'm worried that the extensive number of symbols in Haskell's compact syntax might cause students problems. From what I've gathered reading other posts on SO, Haskell is not recommended for beginners starting out with FP. But we're not going to be building full-fledged applications, just trying out simple algorithms.
What do you think?
Edit: Upon reading some of your great responses, I should clarify some of my bullet points.
In SML, there's no syntactic distinction between defining a function in the interpreter and defining it in an external file. Let's say you want to write the factorial function. In Haskell you can put this definition into a file and load it into GHCi:
fac 0 = 1
fac n = n * fac (n-1)
To me, that's clear, succinct, and matches the mathematical definition in the book. But if you want to write the function in GHCi directly, you have to use a different syntax:
let fac 0 = 1; fac n = n * fac (n-1)
When working with interactive interpreters, from a teaching perspective it's very, very handy when the student can use the same code in both a file and the command line.
By "explicit confirmation of the function," I meant that upon defining the function, SML right away tells you the name of the function, the types of the arguments, and the return type. In Haskell you have to use the :type command and then you get the somewhat confusing curry notation.
One more cool thing about Haskell -- this is a valid function definition:
fac 0 = 1
fac (n+1) = (n+1) * fac n
Again, this matches a definition they might find in the textbook. Can't do that in SML!
Much as I love Haskell, here are the reasons I would prefer SML for a class in discrete math and data structures (and most other beginners' classes):
Time and space costs of Haskell programs can be very hard to predict, even for experts. SML offers much more limited ways to blow the machine.
Syntax for function definition in an interactive interpreter is identical to syntax used in a file, so you can cut and paste.
Although operator overloading in SML is totally bogus, it is also simple. It's going to be hard to teach a whole class in Haskell without having to get into type classes.
Student can debug using print. (Although, as a commenter points out, it is possible to get almost the same effect in Haskell using Debug.Trace.trace.)
Infinite data structures blow people's minds. For beginners, you're better off having them define a stream type complete with ref cells and thunks, so they know how it works:
datatype 'a thunk_contents = UNEVALUATED of unit -> 'a
                           | VALUE of 'a

type 'a thunk = 'a thunk_contents ref

val delay : (unit -> 'a) -> 'a thunk
val force : 'a thunk -> 'a
Now it's not magic any more, and you can go from here to streams (infinite lists).
Layout is not as simple as in Python and can be confusing.
There are two places Haskell has an edge:
In core Haskell you can write a function's type signature just before its definition. This is hugely helpful for students and other beginners. There just isn't a nice way to deal with type signatures in SML.
Haskell has better concrete syntax. The Haskell syntax is a major improvement over ML syntax. I have written a short note about when to use parentheses in an ML program; this helps a little.
Finally, there is a sword that cuts both ways:
Haskell code is pure by default, so your students are unlikely to stumble over impure constructs (IO monad, state monad) by accident. But by the same token, they can't print, and if you want to do I/O then at minimum you have to explain do notation, and return is confusing.
On a related topic, here is some advice for your course preparation: don't overlook Purely Functional Data Structures by Chris Okasaki. Even if you don't have your students use it, you will definitely want to have a copy.
We teach Haskell to first years at our university. My feelings about this are a bit mixed. On the one hand teaching Haskell to first years means they don't have to unlearn the imperative style. Haskell can also produce very concise code which people who had some Java before can appreciate.
Some problems I've noticed students often have:
Pattern matching can be a bit difficult, at first. Students initially had some problems seeing how value construction and pattern matching are related. They also had some problems distinguishing between abstractions. Our exercises included writing functions that simplify arithmetic expressions, and some students had difficulty seeing the difference between the abstract representation (e.g., Const 1) and the meta-language representation (1).
Furthermore, if your students are supposed to write list processing functions themselves, be careful pointing out the difference between the patterns
[]
[x]
(x:xs)
[x:xs]
Depending on how much functional programming you want to teach them on the way, you may just give them a few library functions and let them play around with that.
We didn't teach our students about anonymous functions, we simply told them about where clauses. For some tasks this was a bit verbose, but worked well otherwise. We also didn't tell them about partial applications; this is probably quite easy to explain in Haskell (due to its form of writing types) so it might be worth showing to them.
They quickly discovered list comprehensions and preferred them over higher-order functions like filter, map, zipWith.
I think we missed out a bit on teaching them how to let them guide their thoughts by the types. I'm not quite sure, though, whether this is helpful to beginners or not.
Error messages are usually not very helpful to beginners, they might occasionally need some help with these. I haven't tried it myself, but there's a Haskell compiler specifically targeted at newcomers, mainly by means of better error messages: Helium
For the small programs, things like possible space leaks weren't an issue.
Overall, Haskell is a good teaching language, but there are a few pitfalls. Given that students feel a lot more comfortable with list comprehensions than higher-order functions, this might be the argument you need. I don't know how long your course is or how much programming you want to teach them, but do plan some time for teaching them basic concepts--they will need it.
BTW, the claim that "SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell" is inaccurate. Use GHCi:
Prelude> let f x = x ^ 2
Prelude> f 7
49
Prelude> f 2
4
There are also good resources for Haskell in education on the haskell.org edu. page, with experiences from different teachers. http://haskell.org/haskellwiki/Haskell_in_education
Finally, you'll be able to teach them multicore parallelism just for fun, if you use Haskell :-)
Many universities teach Haskell as a first functional language or even a first programming language, so I don't think this will be a problem.
Having done some of the teaching on one such course, I don't agree that the possible confusions you identify are that likely. The most likely sources of early confusion are parsing errors caused by bad layout, and mysterious messages about type classes when numeric literals are used incorrectly.
I'd also disagree with any suggestion that Haskell is not recommended for beginners starting out with FP. It's certainly the big bang approach in ways that strict languages with mutation aren't, but I think that's a very valid approach.
SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell.
While Hugs may have that limitation, GHCi does not:
$ ghci
GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> let hello name = "Hello, " ++ name
Prelude> hello "Barry"
"Hello, Barry"
There's many reasons I prefer GHC(i) over Hugs, this is just one of them.
SML gives explicit confirmation of the function argument and return types in a syntax that's easy to understand. For example: val foo = fn : int * int -> int. Haskell's implicit curry syntax is a bit more obtuse, but not totally alien. For example: foo :: Int -> Int -> Int.
SML has what you call "implicit curry" syntax as well.
$ sml
Standard ML of New Jersey v110.69 [built: Fri Mar 13 16:02:47 2009]
- fun add x y = x + y;
val add = fn : int -> int -> int
Essentially, SML and Haskell are roughly equivalent. I lean toward Haskell because I'm loving the list comprehensions and infinite lists in Haskell. But I'm worried that the extensive number of symbols in Haskell's compact syntax might cause students problems. From what I've gathered reading other posts on SO, Haskell is not recommended for beginners starting out with FP. But we're not going to be building full-fledged applications, just trying out simple algorithms.
I like using Haskell much more than SML, but I would still teach SML first.
Seconding nominolo's thoughts, list comprehensions do seem to slow students from getting to some higher-order functions.
If you want laziness and infinite lists, it's instructive to implement it explicitly.
Because SML is eagerly evaluated, the execution model is far easier to comprehend, and "debugging via printf" works a lot better than in Haskell.
SML's type system is also simpler. While your class likely wouldn't use them anyways, Haskell's typeclasses are still an extra bump to get over -- getting them to understand the 'a versus ''a distinction in SML is tough enough.
Most answers were technical, but I think you should consider at least one that is not: Haskell (as OCaml), at this time, has a bigger community using it in a wider range of contexts. There's also a big database of libraries and applications written for profit and fun at Hackage. That may be an important factor in keeping some of your students using the language after your course is finished, and maybe trying other functional languages (like Standard ML) later.
I am amazed you are not considering OCaml and F# given that they address so many of your concerns. Surely decent and helpful development environments are a high priority for learners? SML is way behind and F# is way ahead of all other FPLs in that respect.
Also, both OCaml and F# have list comprehensions.
Haskell. I'm ahead in my algos/theory class in CS because of the stuff I learned from using Haskell. It's such a comprehensive language, and it will teach you a ton of CS, just by using it.
However, SML is much easier to learn. Haskell has features such as lazy evaluation and control structures that make it much more powerful, but with the cost of a steep(ish) learning curve. SML has no such curve.
That said, most of learning Haskell was unlearning habits from less scientific/mathematical languages such as Ruby, ObjC, or Python.