I'm familiar with several computer languages (Java, C, C++, Python, Scheme, JavaScript) but am only vaguely familiar with the terminology for analyzing and comparing them (things like dynamic/static binding, dynamic/static types, pass-by-value vs. pass-by-reference, closures, operator overloading, etc.).
Is there a whitepaper or easily-readable-book that discusses these topics in enough depth for me to be able to look at an unfamiliar computer language and say to myself, "Oh, it has dynamic binding and static types", and to say "That's different from C++ because ... but similar because ..."?
If you like learning by example, Rosetta Code is a great resource. Its Language Comparison Table might be a good place to start.
I've found it helpful both for theoretical comparisons ("How do C++ and Java's respective exception handling systems differ?") and for practical work ("I know how to do a foreach() in PHP; what's the syntax for the equivalent operation in Perl?").
This free ebook may be somewhat heavier than what you are looking for, but is comprehensive:
Practical Foundations for Programming Languages (pdf 1.5Mb)
Here is an extract of the TOC:
I Judgements and Rules
1 Syntactic Objects
2 Inductive Definitions
3 Hypothetical and General Judgements
II Levels of Syntax
4 Concrete Syntax
5 Abstract Syntax
III Statics and Dynamics
6 Statics
7 Dynamics
8 Type Safety
9 Evaluation Dynamics
IV Function Types
10 Function Definitions and Values
11 Gödel’s System T
12 Plotkin’s PCF
V Finite Data Types
13 Product Types
14 Sum Types
15 Pattern Matching
16 Generic Programming
VI Infinite Data Types
17 Inductive and Co-Inductive Types
18 Recursive Types
VII Dynamic Types
19 The Untyped λ-Calculus
20 Dynamic Typing
21 Hybrid Typing
VIII Variable Types
22 Girard’s System F
23 Abstract Types
24 Constructors and Kinds
IX Subtyping
25 Subtyping
26 Singleton Kinds
X Classes and Methods
27 Dynamic Dispatch
28 Inheritance
XI Control Effects
29 Control Stacks
30 Exceptions
31 Continuations
XII Types and Propositions
32 Constructive Logic
33 Classical Logic
XIII Symbols
34 Symbols
35 Fluid Binding
36 Dynamic Classification
XIV Storage Effects
37 Modernized Algol
38 Mutable Data Structures
XV Laziness
39 Lazy Evaluation
40 Polarization
XVI Parallelism
41 Nested Parallelism
42 Futures and Speculation
XVII Concurrency
43 Process Calculus
45 Distributed Algol
XVIII Modularity
46 Components and Linking
47 Type Abstractions and Type Classes
48 Hierarchy and Parameterization
XIX Equivalence
49 Equational Reasoning
50 Equational Reasoning
51 Parametricity
52 Process Equivalence
XX Appendices
A Mathematical Preliminaries
Despite some experience with Lisp and ML, I'm having a great deal of trouble learning to read and (idiomatically) write Haskell because the local style seems to be
do eta elimination whenever possible
eschew parentheses in favor of exploiting operator precedence
pack half your logic into bucketloads of overloaded, non-alphanumeric infix operators
The last one is particularly difficult because there are so many predefined operators, each with their own conventions and general semantics, that often reading Haskell becomes an exercise in Hoogle and :type.
Are there any good tutorials that assume knowledge of CS/functional concepts, and instead focus on Haskell-specific idioms? I'm looking for something like Real-World Haskell, that starts off with a very naive, explicit program and then gradually transforms it into a more idiomatic style, introducing and explaining the idioms as it goes. But instead of introducing and explaining general concepts like monads and type classes, it would introduce specific monads and specific type classes, like "but this is exactly what the Alternative monoid does!"
Basic type classes like Show, Eq and Ord should be easy to grasp by reading the library documentation found by Hoogle and/or Haskell-2010 Language Report.
The numeric tower in Haskell seems convoluted (the Int type is an instance of a whopping 11 type classes according to the report), but it exists to support all the useful kinds of numbers and number representations mathematicians have invented for us: e.g. Integer is an arbitrary-size integer, Int is the usual machine-word-sized integer, and a lazy Peano representation of integers (not in the standard library) has proved useful in implementing graph algorithms. The most important numeric type classes are Num and Integral. You can convert between different integer types using the fromIntegral function. Note also that numerals such as 123 have type Num a => a, and there is a special type-defaulting mechanism designed to reduce the need for type declarations that pin down the exact numeric type. In advanced use cases this works against you, so you may want to alter the defaults.
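To make the conversion and literal-typing points concrete, here is a minimal sketch. The names average, asInteger, asByte and wrapped are made up for illustration; only fromIntegral and the standard numeric types come from the libraries.

```haskell
import Data.Word (Word8)

-- 'average' is a hypothetical helper, not a library function.
-- 'sum xs' and 'length xs' are both Int here, so each must be
-- converted explicitly before dividing as Doubles.
average :: [Int] -> Double
average xs = fromIntegral (sum xs) / fromIntegral (length xs)

-- The literal 200 has type Num a => a; the signature picks the instance.
asInteger :: Integer
asInteger = 200

asByte :: Word8
asByte = 200

-- fromIntegral also narrows; for fixed-width types the value wraps around.
wrapped :: Word8
wrapped = fromIntegral (300 :: Int)

main :: IO ()
main = do
  print (average [1, 2, 3, 4])   -- 2.5
  print (asByte + asByte)        -- 144, because Word8 arithmetic is modulo 256
  print wrapped                  -- 44
```

Without the type signatures on the literals, the defaulting rules would pick Integer, which is usually what you want in small programs.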
The situation is the same with strings: no single representation fits all uses, so several are in the wild. String, Data.ByteString and Data.Text are the most important.
Regarding more complicated type classes the best source is Typeclassopedia.
For certain type classes such as Monad, Applicative and Arrow there are a lot of dedicated tutorials and research works. Depending on your math skills, you may also want to read the original research papers on the category theory concepts behind the type classes, such as the excellent "Notions of computation and monads" by Eugenio Moggi.
As for "eta reductions", the resulting style is called Point-Free Style. You can get some information from the references mentioned at that link. You can also look at Combinatory Logic, the 1978 paper by John Backus "Can programming be liberated from the von Neumann style?", and the APL programming language to get a richer historical perspective on the point-free style.
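As a small illustration of what eta reduction does to a definition, here is a sketch with two hypothetical functions that compute the same thing. The first names every argument; the second drops the trailing argument and uses composition instead.

```haskell
-- Fully "pointed" version: every argument is named.
sumSquaresPointed :: [Int] -> Int
sumSquaresPointed xs = sum (map (\x -> x * x) xs)

-- Eta-reduced version: xs is dropped from both sides of the equation,
-- and nested application becomes function composition.
sumSquares :: [Int] -> Int
sumSquares = sum . map (^ 2)

main :: IO ()
main = do
  print (sumSquaresPointed [1, 2, 3])   -- 14
  print (sumSquares [1, 2, 3])          -- 14
```

Reading point-free code mostly comes down to mentally putting the dropped argument back on the right-hand end.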
Also there are general books on Haskell such as 'A Gentle Introduction to Haskell' and 'Learn You a Haskell for Great Good'.
As for operator precedence, there are really only a few operators you must remember: (.), ($) and (>>=) are used much more than everything else (barring arithmetic, of course, but arithmetic operators are rather unsurprising).
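For readers who want to see the three operators side by side, here is a minimal sketch. The functions shout, total, safeDiv and chained are invented for this example; only (.), ($), (>>=) and the Prelude functions are standard.

```haskell
import Data.Char (toUpper)

-- (.) composes functions right to left.
shout :: String -> String
shout = (++ "!") . map toUpper

-- ($) is ordinary application with the lowest precedence, so it stands in
-- for parentheses around "everything to the right".
total :: Int
total = sum $ map (* 2) $ filter even [1 .. 10]

-- (>>=) feeds the result of one computation into the next; in Maybe,
-- the chain stops at the first Nothing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

chained :: Maybe Int
chained = safeDiv 100 5 >>= safeDiv 60   -- safeDiv 60 20

main :: IO ()
main = do
  putStrLn (shout "hello")   -- HELLO!
  print total                -- 60
  print chained              -- Just 3
```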
The syntax for tuples and lists seems unproblematic for me too. It is just redundant: foo : bar : [] is the same as [foo, bar] and (,) foo bar is the same as (foo, bar). The prefix versions of tuples of higher arity such as (,,,,) are rarely used.
See also http://www.haskell.org/haskellwiki/Section_of_an_infix_operator for explanation of constructs such as (+ 2) and (2 +) called sections.
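The equivalences and sections mentioned above can be checked directly. This is only a sketch; the top-level names are hypothetical.

```haskell
-- The sugared and desugared list forms are the same value.
listsAgree :: Bool
listsAgree = (1 : 2 : []) == [1, 2 :: Int]

-- Likewise for the prefix tuple constructor.
tuplesAgree :: Bool
tuplesAgree = (,) 'a' True == ('a', True)

-- A right section fixes the right operand: (/ 2) means \x -> x / 2.
halves :: [Double]
halves = map (/ 2) [1, 2, 3]

-- A left section fixes the left operand: (2 /) means \x -> 2 / x.
twoOver :: [Double]
twoOver = map (2 /) [1, 2, 4]

main :: IO ()
main = do
  print (listsAgree, tuplesAgree)   -- (True,True)
  print halves                      -- [0.5,1.0,1.5]
  print twoOver                     -- [2.0,1.0,0.5]
```

One wrinkle worth knowing: (- 2) is the number negative two, not a section, so "subtract two" is written as subtract 2.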
You can also learn from the changes the HLint tool suggests to improve your code. The HLint executable can be installed with cabal install hlint.
As for advanced topics, I can recommend studying purely functional data structures (which let you design efficient immutable data structures and reason about their time costs), the call-by-need lambda calculus (which lets you reason about what is evaluated and in which order), the System FC type system (which gives you some background on how the Haskell type checker works), and denotational semantics (for reasoning about non-termination, partiality and recursion, with some insight into the notions of strictness, purity and composability).
I think the "Write Yourself a Scheme in 48 Hours" tutorial is exactly what you're looking for. This tutorial explains how to implement a Scheme interpreter in Haskell; it's the one document that really made everything click for me.
It has concrete examples of various Haskell idioms and ideas all rolled together in a cool project. This is particularly good for somebody with experience in Lisp, especially if you've read something like SICP or have implemented a Scheme interpreter before. The tutorial is great either way, but it might be a little bit easier to follow if you've done something similar before.
In System F I can define the genuine total addition function using Church numerals.
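For reference, the Church-numeral addition can be transcribed into Haskell using the RankNTypes extension, as in the sketch below. The names Church, zero, suc, add, toInt and fromInt are chosen for this example. Note that in Haskell the type is also inhabited by bottom, which is exactly the difference from System F that the question is about.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral n applies a function n times.  Its System F type is
-- forall a. (a -> a) -> a -> a.
newtype Church = Church (forall a. (a -> a) -> a -> a)

zero :: Church
zero = Church (\_ z -> z)

suc :: Church -> Church
suc (Church n) = Church (\s z -> s (n s z))

-- m + n: apply s another m times on top of the n applications.
add :: Church -> Church -> Church
add (Church m) (Church n) = Church (\s z -> m s (n s z))

-- Conversions to and from Int, for inspection only.
toInt :: Church -> Int
toInt (Church n) = n (+ 1) 0

fromInt :: Int -> Church
fromInt k = iterate suc zero !! k

main :: IO ()
main = print (toInt (add (fromInt 2) (fromInt 3)))   -- 5
```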
In Haskell I cannot define that function because of the bottom value. For example, in Haskell if x + y = x, then I cannot say that y is zero: if x is bottom, x + y = x for any y. So the addition is not the true addition but an approximation to it.
In C I cannot define that function because the C specification requires everything to have finite size. So in C the possible approximations are even worse than in Haskell.
So we have:
In System F it's possible to define the addition but it's not possible to have a complete implementation (because there is no infinite hardware).
In Haskell it's not possible to define the addition (because of the bottom), and it's not possible to have a complete implementation.
In C it's not possible to define the total addition function (because the semantics of everything is bounded) but compliant implementations are possible.
So all 3 formal systems (Haskell, System F and C) seem to have different design tradeoffs.
So what are consequences of choosing one over another?
Haskell
This is a strange problem because you're working with a vague notion of =. _|_ = _|_ only "holds" (and even then you should really use ⊑) at the domain semantic level. If we distinguish between information available at the domain semantic level and equality in the language itself, then it's perfectly correct to say that True ⊑ x + y == x --> True ⊑ y == 0.
It's not addition that's the problem, and it's not natural numbers that are the problem either -- the issue is simply distinguishing between equality in the language and statements about equality or information in the semantics of the language. Absent the issue of bottoms, we can typically reason about Haskell using naive equational logic. With bottoms, we can still use equational reasoning -- we just have to be more sophisticated with our equations.
A fuller and clearer exposition of the relationship between total languages and the partial languages defined by lifting them is given in the excellent paper "Fast and Loose Reasoning is Morally Correct".
C
You claim that C requires everything (including addressable space) to have a finite size, and therefore that C semantics "impose a limit" on the size of representable naturals. Not really. The C99 standard says the following: "Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type." The rationale document further emphasizes that "C has now been implemented on a wide range of architectures. While some of these architectures feature uniform pointers which are the size of some integer type, maximally portable code cannot assume any necessary correspondence between different pointer types and the integer types. On some implementations, pointers can even be wider than any integer type."
As you can see, there's explicitly no assumption that pointers must be of a finite size.
You have a set of theories as frameworks to do your reasoning with; finite reality, Haskell semantics, and System F are just some of them.
You can choose an appropriate theory for your work, or build a new theory from scratch or from big pieces of existing theories gathered together. For example, you can consider the set of always-terminating Haskell programs and employ bottomless semantics safely. In this case your addition will be correct.
For a low-level language there may be reasons to build finiteness in, but for a high-level language it is worth omitting such things, because more abstract theories allow wider application.
While programming, you use not the "language specification" theory but the "language specification + implementation limitations" theory, so it makes no difference whether memory limits are present in the language specification or in the language implementation. The absence of limits becomes important when you start building purely theoretical constructions in the framework of the language semantics. For example, you may want to prove some program equivalences or language translations and find that every unneeded detail in the language specification brings much pain to the proof.
I'm sure you've heard the aphorism that "in theory there is no difference between theory and practice, but in practice there is."
In this case, in theory there are differences, but all of these systems deal with the same finite amount of addressable memory so in practice there is no difference.
EDIT:
Assuming you can represent a natural number in any of these systems, you can represent addition in any of them. If the constraints you are concerned about prevent you from representing a natural number then you can't represent Nat*Nat addition.
Represent a natural number as a pair: a heuristic lower bound on the maximum bit size, and a lazily evaluated list of bits.
In the lambda calculus, you can represent the list as a function that returns a function which, when called with true, returns the 1's bit, and when called with false, returns a function that does the same for the 2's bit, and so on.
Addition is then an operation applied to the zip of those two lazy lists that propagates a carry bit.
You of course have to represent the maximum bit size heuristic as a natural number, but if you only instantiate numbers with a bit count that is strictly smaller than the number you are representing, and your operators don't break that heuristic, then the bit size is inductively a smaller problem than the numbers you want to manipulate, so operations terminate.
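The carry-propagating addition over bit lists described above might look like the following Haskell sketch. It assumes bits are stored least-significant first and leaves out the bit-size heuristic entirely; Bits, addBits, toBits and fromBits are names invented for this example.

```haskell
-- Bits are stored least-significant first; True is a 1 bit.
type Bits = [Bool]

-- Ripple-carry addition.  Each output bit is produced before the
-- recursive call is demanded, so the result is generated lazily.
addBits :: Bits -> Bits -> Bits
addBits = go False
  where
    go carry (x : xs) (y : ys) =
      let s      = (x /= y) /= carry                     -- xor of all three
          carry' = (x && y) || (carry && (x /= y))
      in  s : go carry' xs ys
    go carry xs@(_ : _) [] = go carry xs [False]         -- pad the shorter list
    go carry [] ys@(_ : _) = go carry [False] ys
    go carry [] []         = [True | carry]              -- final carry, if any

-- Conversions for non-negative Integers, for testing only.
toBits :: Integer -> Bits
toBits 0 = []
toBits n = odd n : toBits (n `div` 2)

fromBits :: Bits -> Integer
fromBits = foldr (\b acc -> (if b then 1 else 0) + 2 * acc) 0

main :: IO ()
main = print (fromBits (addBits (toBits 13) (toBits 29)))   -- 42
```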
On the ease of accounting for edge cases, C will give you very little help. You can return special values to represent overflow/underflow, and even try to make them infectious (like IEEE-754 NaN), but you won't get complaints at compile time if you fail to check. You could try to install a handler for a signal such as SIGFPE to trap problems.
I cannot say that y is zero - if x is bottom, x + y = x for any y.
If you're looking to do symbolic manipulation, Matlab and Mathematica are implemented in C and C-like languages. That said, Python has a well-optimized bigint implementation that is used for all integer types. It's probably not suitable for representing really, really large numbers, though.
I've come across references to Haskell's Data.Typeable, but it's not clear to me why I would want to use it in my code.
What problem does it solve, and how?
Data.Typeable is an encoding of a well-known approach (see e.g. Harper) to implementing delayed (dynamic) type checking in a statically typed language -- using a universal type.
Such a type wraps code for which type checking would not succeed until a later phase. Rather than reject the program as ill-typed, the compiler passes it on for runtime checking.
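As a rough illustration of the universal-type idea, the base library's Data.Dynamic module (which is built on Typeable) provides such a wrapper. The sketch below uses only toDyn and fromDynamic; the names bag, ints and strings are made up for the example.

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)
import Data.Maybe (mapMaybe)

-- A heterogeneous list: every element is wrapped in the universal type.
bag :: [Dynamic]
bag = [toDyn (1 :: Int), toDyn "two", toDyn (3 :: Int), toDyn True]

-- The type check is delayed until run time: fromDynamic returns Nothing
-- when the stored type does not match the requested one.
ints :: [Int]
ints = mapMaybe fromDynamic bag

strings :: [String]
strings = mapMaybe fromDynamic bag

main :: IO ()
main = do
  print ints      -- [1,3]
  print strings   -- ["two"]
```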
The style originated in Abadi et al., and was developed for Haskell by Cheney and Hinze as a wrapper to represent all dynamic types, with the Typeable class appearing as part of the SYB work of SPJ and Lammel.
Reference
Martín Abadi, Luca Cardelli, Benjamin Pierce and Gordon Plotkin, "Dynamic Typing in a Statically Typed Language", ACM Transactions on Programming Languages and Systems (TOPLAS), 1991.
James Cheney and Ralf Hinze, "A lightweight implementation of generics and dynamics", Haskell '02: Proceedings of the 2002 ACM SIGPLAN Workshop on Haskell, 2002.
Lammel, Ralf and Jones, Simon Peyton, "Scrap your boilerplate: a practical design pattern for generic programming", TLDI '03: Proceedings of the 2003 ACM SIGPLAN International Workshop on Types in Languages Design and Implementation, 2003.
Harper, 2011, Practical Foundations for Programming Languages.
Even in the text books: dynamic types (with typeable representations) are statically typed languages with only one type, Harper ch 20:
20.4 Untyped Means Uni-Typed
The untyped λ-calculus may be faithfully embedded in a
typed language with recursive types. This means that every
untyped λ-term has a representation as a typed expression
in such a way that execution of the representation of a
λ-term corresponds to execution of the term itself. This
embedding is not a matter of writing an interpreter for
the λ-calculus in ℒ{+×⇀µ} (which we could surely do), but
rather a direct representation of untyped λ-terms as typed
expressions in a language with recursive types.
The key observation is that the untyped λ-calculus is
really the uni-typed λ-calculus! It is not the absence
of types that gives it its power, but rather that it has
only one type, namely the recursive type
D = µt.t → t.
It's a library that allows, among other things, naming types. If a type a is declared Typeable, then you can get its name using show $ typeOf x where x is any value of type a. It also features limited type-casting.
(This is somewhat similar to C++'s RTTI or dynamic languages' reflection.)
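A minimal sketch of both features, using typeOf and cast from Data.Typeable. The wrapper functions describe and asInt are hypothetical names for this example.

```haskell
import Data.Typeable (Typeable, typeOf, cast)

-- Name the type of any Typeable value.
describe :: Typeable a => a -> String
describe x = show (typeOf x)

-- cast succeeds only when the source and target types are identical;
-- it never converts between types.
asInt :: Typeable a => a -> Maybe Int
asInt = cast

main :: IO ()
main = do
  putStrLn (describe (1 :: Int))    -- Int
  putStrLn (describe "hi")          -- [Char]
  putStrLn (describe (Just True))   -- Maybe Bool
  print (asInt (3 :: Int))          -- Just 3
  print (asInt "3")                 -- Nothing
```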
One of the earliest descriptions I could find of a Data.Typeable-like library for Haskell is by John Peterson from 1992: http://www.cs.yale.edu/publications/techreports/tr1022.pdf
The earliest "official" paper I know of introducing the actual Data.Typeable library is the first Scrap Your Boilerplate paper from 2003: http://research.microsoft.com/en-us/um/people/simonpj/Papers/hmap/index.htm
I'm sure there's lots of intervening history that someone here can chime in with!
The Data.Typeable class is used primarily for generic programming in the Scrap Your Boilerplate (SYB) style. See also Data.Data
The idea is that SYB defines a collection of combinators for performing operations such as printing, counting, searching, substituting, etc. in a uniform manner over a variety of user-created types. The Typeable typeclass provides the necessary plumbing.
In modern GHC, you can just say deriving Typeable (with the DeriveDataTypeable extension enabled) when defining your own type in order to provide it with the necessary instances.
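To give a flavour of the generic style without pulling in the separate syb package, here is a small sketch that uses only Data.Data from base. Shape, constructorName and fieldCount are hypothetical names; toConstr, showConstr and gmapQ are the library functions.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, Typeable, gmapQ, showConstr, toConstr)

-- An ordinary user-defined type; the instances are derived.
data Shape = Circle Double | Rect Double Double
  deriving (Show, Data, Typeable)

-- Works for any type with a Data instance, not just Shape.
constructorName :: Data a => a -> String
constructorName = showConstr . toConstr

-- gmapQ applies a query to each immediate field of a value.
fieldCount :: Data a => a -> Int
fieldCount = length . gmapQ (const ())

main :: IO ()
main = do
  putStrLn (constructorName (Rect 2 3))   -- Rect
  print (fieldCount (Rect 2 3))           -- 2
  print (fieldCount (Circle 1))           -- 1
```

Neither function mentions Shape, which is the point: the same code works uniformly over any type that derives Data.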
I first came across exceptions with Ada 83. As far as I know, the designers of Ada invented the concept of exceptions. Is this true, or did any earlier programming language use exceptions too?
According to c2.com's Ground Breaking Languages page it was PL/I.
It depends on how you define generics. Parametric polymorphism, which allows you to define functions and types that are not tied to particular argument or field types, was already there in ML, and that's 1973. Here is a Standard ML sample from Wikipedia:
fun reverse [] = []
  | reverse (x::xs) = (reverse xs) @ [x]
Note that this function is statically typed, but polymorphic ("generic") on any type of list.
While this example is SML (which is a later thing), so far as I know, the concept was present in earliest ML versions as well.
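For comparison, the same function in Haskell, as a sketch with an explicit type signature. It is named reverse' here only to avoid clashing with the Prelude's reverse; the type variable a is what makes it generic.

```haskell
-- One definition, usable at every element type.
reverse' :: [a] -> [a]
reverse' []       = []
reverse' (x : xs) = reverse' xs ++ [x]

main :: IO ()
main = do
  print (reverse' [1, 2, 3 :: Int])   -- [3,2,1]
  putStrLn (reverse' "abc")           -- cba
```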
From Wikipedia:
Generic programming facilities first
appeared in the 1970s in languages
like CLU and Ada, and were
subsequently adopted by many
object-based and object-oriented
languages, including BETA, C++, D,
Eiffel, Java, and DEC's now defunct
Trellis-Owl language. Implementations
of generics in languages such as Java
and C# are formally based on the
notion of parametricity, due to John
C. Reynolds.
I'm going to be teaching a lower-division course in discrete structures. I have selected the text book Discrete Structures, Logic, and Computability in part because it contains examples and concepts that are conducive to implementation with a functional programming language. (I also think it's a good textbook.)
I want an easy-to-understand FP language that illustrates DS concepts and that the students can use. Most students will have had only one or two semesters of programming in Java, at best. After looking at Scheme, Erlang, Haskell, OCaml, and SML, I've settled on either Haskell or Standard ML. I'm leaning towards Haskell for the reasons outlined below, but I'd like the opinion of those who are active programmers in one or the other.
Both Haskell and SML have pattern matching which makes describing a recursive algorithm a cinch.
Haskell has nice list comprehensions that match nicely with the way such lists are expressed mathematically.
Haskell has lazy evaluation. Great for constructing infinite lists using the list comprehension technique.
SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell.
SML gives explicit confirmation of the function argument and return types in a syntax that's easy to understand. For example: val foo = fn : int * int -> int. Haskell's implicit curry syntax is a bit more obtuse, but not totally alien. For example: foo :: Int -> Int -> Int.
Haskell uses arbitrary-precision integers by default. It's an external library in SML/NJ. And SML/NJ truncates output to 70 characters by default.
Haskell's lambda syntax is subtle -- it uses a single backslash. SML is more explicit. Not sure if we'll ever need lambda in this class, though.
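To illustrate the list-comprehension and infinite-list points from the list above, here is a short sketch. The names oddSquares and pythag are invented for the example.

```haskell
-- Set-builder style: { x^2 | x in N, x odd }.  The list is infinite,
-- but laziness means only the demanded prefix is ever computed.
oddSquares :: [Integer]
oddSquares = [x * x | x <- [1 ..], odd x]

-- Pythagorean triples with hypotenuse at most n.
pythag :: Integer -> [(Integer, Integer, Integer)]
pythag n =
  [ (a, b, c)
  | c <- [1 .. n], b <- [1 .. c], a <- [1 .. b]
  , a * a + b * b == c * c
  ]

main :: IO ()
main = do
  print (take 5 oddSquares)   -- [1,9,25,49,81]
  print (pythag 13)           -- [(3,4,5),(6,8,10),(5,12,13)]
```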
Essentially, SML and Haskell are roughly equivalent. I lean toward Haskell because I'm loving the list comprehensions and infinite lists in Haskell. But I'm worried that the extensive number of symbols in Haskell's compact syntax might cause students problems. From what I've gathered reading other posts on SO, Haskell is not recommended for beginners starting out with FP. But we're not going to be building full-fledged applications, just trying out simple algorithms.
What do you think?
Edit: Upon reading some of your great responses, I should clarify some of my bullet points.
In SML, there's no syntactic distinction between defining a function in the interpreter and defining it in an external file. Let's say you want to write the factorial function. In Haskell you can put this definition into a file and load it into GHCi:
fac 0 = 1
fac n = n * fac (n-1)
To me, that's clear, succinct, and matches the mathematical definition in the book. But if you want to write the function in GHCi directly, you have to use a different syntax:
let fac 0 = 1; fac n = n * fac (n-1)
When working with interactive interpreters, from a teaching perspective it's very, very handy when the student can use the same code in both a file and the command line.
By "explicit confirmation of the function," I meant that upon defining the function, SML right away tells you the name of the function, the types of the arguments, and the return type. In Haskell you have to use the :type command and then you get the somewhat confusing curry notation.
One more cool thing about Haskell -- this is a valid function definition:
fac 0 = 1
fac (n+1) = (n+1) * fac n
Again, this matches a definition they might find in the textbook. Can't do that in SML!
Much as I love Haskell, here are the reasons I would prefer SML for a class in discrete math and data structures (and most other beginners' classes):
Time and space costs of Haskell programs can be very hard to predict, even for experts. SML offers much more limited ways to blow the machine.
Syntax for function definition in an interactive interpreter is identical to syntax used in a file, so you can cut and paste.
Although operator overloading in SML is totally bogus, it is also simple. It's going to be hard to teach a whole class in Haskell without having to get into type classes.
Students can debug using print. (Although, as a commenter points out, it is possible to get almost the same effect in Haskell using Debug.Trace.trace.)
Infinite data structures blow people's minds. For beginners, you're better off having them define a stream type complete with ref cells and thunks, so they know how it works:
datatype 'a thunk_contents = UNEVALUATED of unit -> 'a
| VALUE of 'a
type 'a thunk = 'a thunk_contents ref
val delay : (unit -> 'a) -> 'a thunk
val force : 'a thunk -> 'a
Now it's not magic any more, and you can go from here to streams (infinite lists).
Layout is not as simple as in Python and can be confusing.
There are two places Haskell has an edge:
In core Haskell you can write a function's type signature just before its definition. This is hugely helpful for students and other beginners. There just isn't a nice way to deal with type signatures in SML.
Haskell has better concrete syntax. The Haskell syntax is a major improvement over ML syntax. I have written a short note about when to use parentheses in an ML program; this helps a little.
Finally, there is a sword that cuts both ways:
Haskell code is pure by default, so your students are unlikely to stumble over impure constructs (IO monad, state monad) by accident. But by the same token, they can't print, and if you want to do I/O then at minimum you have to explain do notation, and return is confusing.
On a related topic, here is some advice for your course preparation: don't overlook Purely Functional Data Structures by Chris Okasaki. Even if you don't have your students use it, you will definitely want to have a copy.
We teach Haskell to first years at our university. My feelings about this are a bit mixed. On the one hand teaching Haskell to first years means they don't have to unlearn the imperative style. Haskell can also produce very concise code which people who had some Java before can appreciate.
Some problems I've noticed students often have:
Pattern matching can be a bit difficult, at first. Students initially had some problems seeing how value construction and pattern matching are related. They also had some problems distinguishing between abstractions. Our exercises included writing functions that simplify arithmetic expression and some students had difficulty seeing the difference between the abstract representation (e.g., Const 1) and the meta-language representation (1).
Furthermore, if your students are supposed to write list processing functions themselves, be careful to point out the difference between the patterns
[]
[x]
(x:xs)
[x:xs]
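A sketch of how the four patterns behave; the function names are hypothetical. The first three patterns apply to any list, while [x:xs] only typechecks against a list of lists, where it matches exactly one non-empty inner list.

```haskell
describeList :: [Int] -> String
describeList []       = "empty list"
describeList [x]      = "exactly one element: " ++ show x
describeList (x : xs) = "head " ++ show x ++ ", tail of length " ++ show (length xs)

-- [x:xs] is a one-element list whose single element is itself a
-- non-empty list with head x and tail xs.
describeNested :: [[Int]] -> String
describeNested [x : xs] = "one inner list, head " ++ show x ++ ", " ++ show (length xs) ++ " more"
describeNested _        = "something else"

main :: IO ()
main = do
  putStrLn (describeList [])                 -- empty list
  putStrLn (describeList [7])                -- exactly one element: 7
  putStrLn (describeList [7, 8, 9])          -- head 7, tail of length 2
  putStrLn (describeNested [[1, 2, 3]])      -- one inner list, head 1, 2 more
  putStrLn (describeNested [[1], [2]])       -- something else
```

The order of the clauses matters: [x] must come before (x : xs), because (x : xs) also matches one-element lists.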
Depending on how much functional programming you want to teach them on the way, you may just give them a few library functions and let them play around with that.
We didn't teach our students about anonymous functions, we simply told them about where clauses. For some tasks this was a bit verbose, but worked well otherwise. We also didn't tell them about partial applications; this is probably quite easy to explain in Haskell (due to its form of writing types) so it might be worth showing to them.
They quickly discovered list comprehensions and preferred them over higher-order functions like filter, map, zipWith.
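For anyone who wants to show students both styles next to each other, the following sketch defines the same two operations once with comprehensions and once with higher-order functions. All four names are invented for the example.

```haskell
-- Keep the even numbers and double them, as a comprehension...
doubledEvensLC :: [Int] -> [Int]
doubledEvensLC xs = [2 * x | x <- xs, even x]

-- ...and as a pipeline of filter and map.
doubledEvensHOF :: [Int] -> [Int]
doubledEvensHOF = map (2 *) . filter even

-- Pairwise sums, with zip inside a comprehension...
pairSumsLC :: [Int] -> [Int] -> [Int]
pairSumsLC xs ys = [x + y | (x, y) <- zip xs ys]

-- ...and with zipWith.
pairSumsHOF :: [Int] -> [Int] -> [Int]
pairSumsHOF = zipWith (+)

main :: IO ()
main = do
  print (doubledEvensLC [1 .. 6], doubledEvensHOF [1 .. 6])          -- ([4,8,12],[4,8,12])
  print (pairSumsLC [1, 2, 3] [10, 20, 30], pairSumsHOF [1, 2, 3] [10, 20, 30])
```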
I think we missed out a bit on teaching them how to let the types guide their thinking. I'm not quite sure, though, whether this is helpful to beginners or not.
Error messages are usually not very helpful to beginners, so they might occasionally need some help with these. I haven't tried it myself, but there's a Haskell compiler specifically targeted at newcomers, mainly by means of better error messages: Helium
For the small programs, things like possible space leaks weren't an issue.
Overall, Haskell is a good teaching language, but there are a few pitfalls. Given that students feel a lot more comfortable with list comprehensions than higher-order functions, this might be the argument you need. I don't know how long your course is or how much programming you want to teach them, but do plan some time for teaching them basic concepts--they will need it.
BTW, this claim:
"SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell."
is inaccurate. Use GHCi:
Prelude> let f x = x ^ 2
Prelude> f 7
49
Prelude> f 2
4
There are also good resources for Haskell in education on the haskell.org edu. page, with experiences from different teachers. http://haskell.org/haskellwiki/Haskell_in_education
Finally, you'll be able to teach them multicore parallelism just for fun, if you use Haskell :-)
Many universities teach Haskell as a first functional language or even a first programming language, so I don't think this will be a problem.
Having done some of the teaching on one such course, I don't agree that the possible confusions you identify are that likely. The most likely sources of early confusion are parsing errors caused by bad layout, and mysterious messages about type classes when numeric literals are used incorrectly.
I'd also disagree with any suggestion that Haskell is not recommended for beginners starting out with FP. It's certainly the big bang approach in ways that strict languages with mutation aren't, but I think that's a very valid approach.
SML has a truly interactive interpreter in which functions can be both defined and used. In Haskell, functions must be defined in a separate file and compiled before being used in the interactive shell.
While Hugs may have that limitation, GHCi does not:
$ ghci
GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> let hello name = "Hello, " ++ name
Prelude> hello "Barry"
"Hello, Barry"
There are many reasons I prefer GHC(i) over Hugs; this is just one of them.
SML gives explicit confirmation of the function argument and return types in a syntax that's easy to understand. For example: val foo = fn : int * int -> int. Haskell's implicit curry syntax is a bit more obtuse, but not totally alien. For example: foo :: Int -> Int -> Int.
SML has what you call "implicit curry" syntax as well.
$ sml
Standard ML of New Jersey v110.69 [built: Fri Mar 13 16:02:47 2009]
- fun add x y = x + y;
val add = fn : int -> int -> int
Essentially, SML and Haskell are roughly equivalent. I lean toward Haskell because I'm loving the list comprehensions and infinite lists in Haskell. But I'm worried that the extensive number of symbols in Haskell's compact syntax might cause students problems. From what I've gathered reading other posts on SO, Haskell is not recommended for beginners starting out with FP. But we're not going to be building full-fledged applications, just trying out simple algorithms.
I like using Haskell much more than SML, but I would still teach SML first.
Seconding nominolo's thoughts, list comprehensions do seem to slow students from getting to some higher-order functions.
If you want laziness and infinite lists, it's instructive to implement it explicitly.
Because SML is eagerly evaluated, the execution model is far easier to comprehend, and "debugging via printf" works a lot better than in Haskell.
SML's type system is also simpler. While your class likely wouldn't use them anyways, Haskell's typeclasses are still an extra bump to get over -- getting them to understand the 'a versus ''a distinction in SML is tough enough.
Most answers were technical, but I think you should consider at least one that is not: Haskell (like OCaml), at this time, has a bigger community using it in a wider range of contexts. There's also a big database of libraries and applications written for profit and fun at Hackage. That may be an important factor in keeping some of your students using the language after your course is finished, and maybe trying other functional languages (like Standard ML) later.
I am amazed you are not considering OCaml and F# given that they address so many of your concerns. Surely decent and helpful development environments are a high priority for learners? SML is way behind and F# is way ahead of all other FPLs in that respect.
Also, both OCaml and F# have list comprehensions.
Haskell. I'm ahead in my algos/theory class in CS because of the stuff I learned from using Haskell. It's such a comprehensive language, and it will teach you a ton of CS, just by using it.
However, SML is much easier to learn. Haskell has features such as lazy evaluation and control structures that make it much more powerful, but at the cost of a steep(ish) learning curve. SML has no such curve.
That said, most of learning Haskell was unlearning stuff from less scientific/mathematical languages such as Ruby, ObjC, or Python.