Machine learning in OCaml or Haskell? - haskell

I'm hoping to use either Haskell or OCaml on a new project because R is too slow. I need to be able to use support vectory machines, ideally separating out each execution to run in parallel. I want to use a functional language and I have the feeling that these two are the best so far as performance and elegance are concerned (I like Clojure, but it wasn't as fast in a short test). I am leaning towards OCaml because there appears to be more support for integration with other languages so it could be a better fit in the long run (e.g. OCaml-R).
Does anyone know of a good tutorial for this kind of analysis, or a code example, in either Haskell or OCaml?

Hal Daume has written several major machine learning algorithms during his Ph.D. (now he is an assistant professor and rising star in machine learning community)
On his web page, there are a SVM, a simple decision tree and a logistic regression all in OCaml. By reading these code, you can have a feeling how machine learning models are implemented in OCaml.
Another good example of writing basic machine learning models is Owl library for scientific and numeric computations in OCaml.
I'd also like to mention F#, a new .Net language similar to OCaml. Here's a factor graph model written in F# analyzing Chess play data. This research also has a NIPS publication.
While FP is suitable for implementing machine learning and data mining models. But what you can get here most is NOT performance. It is right that FP supports parallel computing better than imperative languages, like C# or Java. But implementing a parallel SVM, or decision tree, has very little relation to do with the language! Parallel is parallel. The numerical optimizations behind machine learning and data mining are usually imperative, writing them pure-functionally is usually hard and less efficient. Making these sophisticated algorithms parallel is very hard task in the algorithm level, not in the language level. If you want to run 100 SVM in parallel, FP helps here. But I don't see the difficulty running 100 libsvm parallel in C++, not to consider that the single thread libsvm is more efficient than a not-well-tested haskell svm package.
Then what do FP languages, like F#, OCaml, Haskell, give?
Easy to test your code. FP languages usually have a top-level interpreter, you can test your functions on the fly.
Few mutable states. This means that passing the same parameter to a function, this function always gives the same result, thus debugging is easy in FPs.
Code is succinct. Type inference, pattern matching, closures, etc. You focus more on the domain logic, and less on the language part. So when you write the code, your mind is mainly thinking about the programming logic itself.
Writing code in FPs is fun.

The only problem I can see is that OCaml doesn't really support multicore parallelism, while GHC has excellent support and performance. If you're looking to use multiple threads of execution, on multiple calls, GHC Haskell will be a lot easier.
Secondly, the Haskell FFI is more powerful (that is, it does more with less code) than OCaml's, and more libraries are avaliable (via Hackage: http://hackage.haskell.org ) so I don't think foreign interfaces will be a deciding factor.

As far as multi-language integration goes, combining C and Haskell is remarkably easy, and I say this as someone who is (unlike dons) not really much of an expert on either. Any other language that integrates well with C shouldn't be much trickier; you can always fall back to a thin interface layer in C if nothing else. For better or worse, C is still the lingua franca of programming, so Haskell is more than acceptable for most cases.
...but. You say you're motivated by performance issues, and want to use "a functional language". From this I infer you're not previously familiar with the languages you ask about. Among Haskell's defining features are that it, by default, uses non-strict evaluation and immutable data structures--which are both incredibly useful in many ways, but it also means that optimizing Haskell for performance is often dramatically different from other languages, and well-honed instincts may lead you astray in baffling ways. You may want to browse performance-related topics on the Haskell wiki to get a feel for the issues.
Which isn't to say that you can't do what you want in Haskell--you certainly can. Both laziness and immutability can in fact be exploited for performance benefits (Chris Okasaki's thesis provides some nice examples). But be aware that there'll be a bit of a learning curve when it comes to dealing with performance.
Both Haskell and OCaml provide the lovely benefits of using an ML-family language, but for most programmers, OCaml is likely to offer a gentler learning curve and better immediate results.

It's hard to give a definitive answer on this. Haskell has the advantages that Don mentioned along with having a more powerful type system and cleaner syntax. OCaml will be easier to learn if you coming from almost any other language (this is because Haskell is as function as functional languages get), and working with mutable random access structures can be a little clunky in Haskell. You will also likely find the performance characteristics of your OCaml code more intuitive than Haskell because of Haskell's lazy evaluation.
Really, I would recommend you evaluate both if you have the time. Here are some relevant Haskell resources:
http://hackage.haskell.org/package/hslibsvm
http://hackage.haskell.org/package/HSvm
Real World Haskell: this is a great freely available book for Haskell
Learn You a Haskell: this tutorial is just plain fun to read
Oh, if you look further into Haskell be sure to sign up for the Haskell Beginners and Haskell Cafe lists. The community is friendly and eager to help out newcomers (is my bias showing?).

If speed is your prime concern then go for C. Haskell is pretty good performance wise but you are never going to get as fast as C. To my knowledge the only functional language that has bettered C in a benchmark is Stalin Scheme but that is very old and nobody really knows how it works.
I've written genetic programming libraries where performance was key and I wrote it in a functional style in C. The functional style allowed me to easily parallelise it using OMP and it scales linearly upto 8 cores within a single process. You certainly can't do that in OCaml although Haskell is improving all the time with regards to concurrency and parallelism.
The downside of using C was that it took me months to finally find all the bugs and stop the core dumps which was extremely challenging because of the concurrency. Haskell would probably have caught 90% of those bugs on the first compilation.
So speed at any cost ? Looking back I'd wish I'd used Haskell as I could stand it to be 2 - 3 times slower if I'd saved over a month in development time.

While dons is correct that multicore parallelism at the thread level is better supported in Haskell, it sounds like you could live with process level parallelism (from your phrase: ideally separating out each execution to run in parallel.) which is supported quite well in OCaml. Keith pointed out that Haskell has a more powerful type system, but it can also be said that OCaml has a more powerful module system than Haskell.
As others have pointed out, OCaml's learning curve will be lower than Haskell's; you'll likely be more productive more quickly in OCaml. That said, learning OCaml is a great stepping-stone towards learning Haskell because many of the underlying concepts are very similar, so you could always migrate to Haskell later and find a lot of things familiar there. And as you pointed out, there is an OCaml-R bridge.

As an examples of Haskell and Ocaml in machine learning see stuff at Hal Daume and Lloyd Allison homepages. IMO it's is much more straightforward to achieve C++-like performance in Ocaml, than in Haskell. Through, as already said, Haskell has much nicer community (packages, tools and support), syntax&features (i.e. FFI, probability monads via typeclasses) and parallel programming support.

Having revamped OCaml-R, I've got a few comments to make on integrating OCaml and R. It might be worthwile to use OCaml to call R code, it works, but is not yet exactly straightforward. So using it to pilot R is worthwile. Integrating R functionality much more thoroughly is still cumbersome as, for example, much remains to be done to export R's type system and data to OCaml in a seamless way (you will have work to do). Moreover, the interaction of R's GC and OCaml's GC is a delicate point: you free n values in O(n^2) time, which isn't nice (to solve this point, you either need a more flexible R API, as far as I understand it, or to implement a GC in the binding itself as a big R array for proper interaction between GCs).
In a nutshell, I'd go for the "pilot R from OCaml" approach.
Contributions on the GC interaction layer and on mapping R datatypes to OCaml are most welcome.

You may want to take a look at this : http://www.haskell.org/pipermail/haskell-cafe/2010-May/077243.html

Late answer but a machine learning library in Haskell is available here : https://github.com/mikeizbicki/HLearn
This library implements various ML algorithms who are designed to have a much faster cross-validation than the usual implementations. It is based on the following paper Algebraic classifiers: a generic approach to fast cross-validation,
online training, and parallel training. The authors claims a 400x speed-up compared to the same task in Weka.

for haskell, consider checking hasktorch (which I managed to use for my AI thesis). for ocaml there seem to be tensorflow bindings.

Related

Benefits and uses of a functional programming language [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why functional languages?
I began programming with C/C++, VB, and eventually Python - all imperative languages. I took a course about programming languages and learned my first functional language - OCaml. It was terrible.
Syntax and other horrors aside, OCaml took my imperative thought process and threw it out the window. It was frustrating. I insisted that everything that could be done functionally could also be done imperatively. I thought of functional programming as imperative programming without a limb (side effects). In response to my frustration, the only benefit my professor could come up with was an FPL's ability to parallelize side-effect-free functions.
Anyways, enough talk.
What are some advantages that FPLs offer above IPLs?
Is there anything that can easily be done in an FPL that cannot easily be done in an IPL?
Are there any real-world examples of FPLs in use, or do they mostly serve as academic exercises? (When I say real-world, I mean a project that heavily relies on the functional aspect of the language and doesn't cram an FPL into a scenario where it doesn't belong).
Thanks,
Advait
First of all, almost any language in common use today is equivalent in expressive power, be it imperative or functional, so it's natural to think that anything you can do in a functional language you can probably do in an imperative one, because it's probably true.
One of the really nice things about functional languages is that their structure permits the application of Hindley-Milner type inference. This is the type system used in SML, OCaml and a bunch of other functional languages. It genuinely seems to lead to reduced rates of errors and is capable of saving you a lot of time and energy by finding bugs up-front as compile errors.
The automatic parallelisation argument is a bit over-used, especially because the promise simply hasn't eventuated. I have written explicitly parallel code in functional languages and it is nicer, IMHO, than doing something similar in Java or the like.
Anecdotally at least, I wouldn't be the first person to claim that learning a functional language makes you a better imperative programmer! That discomfort you felt in having your "imperative" thought process interrupted when using OCaml is actually a really good process to go through. It makes you question assumptions and stops you from writing code in a particular way just because you have always done it that way.
As for real-world use, you might like to look at the proceedings of the Commercial Users of Functional Programming workshops. There are also some very large projects written in various functional languages, although most of them are probably of limited interest outside fairly small communities. The theorem provers Coq and Isabelle are written in Ocaml and SML, respectively.
Whatever you do, I would persevere. I spent a long time banging my head against ML before things finally clicked. These days I'm not sure I even remember how Java or C work, because I haven't had a need for them in a long time... I just use ML!
When one finally manages to silence his imperativelly (mis)trained mind, FP actually becomes easier and more fun than IP.
FP tends to be safer, less bug prone, due to its declarative nature.
I like to think that parallelising imperative code doubles its complexity (try yourself with a non-trivial parallel app). IMO, FP reduces the gap a lot, thanks to lack of syncronisation in many cases.
Citing Gian, learning FP make you a wiser imperative programmer...
You can read http://www.paulgraham.com/avg.html

Looking for a functional language [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm a scientist working mostly with C++, but I would like to find a better language. I'm looking for suggestions, I'm not even sure my "dream language" exist (yet), but here's my wishlist;
IMPORTANT FEATURES (in order of importance)
1.1: Performance: For science, performance is very important. I perfectly understand the importance of productivity, not just execution speed, but when your program has to run for hours, you just can't afford to write it in Python or Ruby. It doesn't need to be as fast as C++, but it has to be reasonably close (e.g.: Fortran, Java, C#, OCaml...).
1.2: High-level and elegant: I would like to be able to concentrate as most as possible on the science and get a clear code. I also dislike verbose languages like Java.
1.3: Primarely functional: I like functional programming, and I think it suits both my style and scientific programming very well. I don't care if the language supports imperative programming, it might be a plus, but it has to focus and encourage functional programming.
1.4: Portability: Should work well on Linux (especially Linux!), Mac and Windows. And no, I do not think F# works well on Linux with mono, and I'm not sure OCaml works well on windows ;)
1.5: Object-oriented, preferably under the "everything is an object" philosophy: I realized how much I liked object-oriented programming when I had to deal pure C not so long ago. I like languages with a strong commitment to object-oriented programming, not just timid support.
NOT REALLY IMPORTANT, BUT THINGS THAT WOULD BE NICE
2.1: "Not-too-strong" typing: I find Haskell's strong typing system to be annoying, I like to be able to do some implicit casting.
2.2: Tools: Good tools is always a plus, but I guess it really depends on the languages. I played with Haskell using Geany, a lightweight editor, and I never felt handicapped. On the other hand I wouldn't have done the same with Java or even Scala (Scala, in particular, seems to be lacking good tools, which is really a shame). Java is really the #1 language here, with NetBeans and Javadoc, programming with Java is easy and fun.
2.3: Garbage collected, but translated or compiled without a virtual machine. I have nothing against virtual machines, but the two giants in the domain have their problems. On paper the .net framework seems much better, and especially suited for functional programming, but in practice it's still very windows-centric and the support for Linux/MacOS is terrible not as good as it should be, so it's not really worth considering. Java is now a mature VM, but it annoys me on some levels: I dislike the ways it deals with executables, generics, and it writes terrible GUIs (although these things aren't so bad).
In my mind there are three viable candidates: Haskell, Standard ML, OCaml. (Scala is out on the grounds that it compiles to JVM codes and is therefore unlikely to be fast enough when programs must run for days.)
All are primarily functional. I will comment where I have knowledge.
Performant
OCaml gives the most stable performance for all situations, but performance is hard to improve. What you get is what you get :-)
Haskell has the best parallel performance and can get excellent use out of an 8-core or 16-core machine. If your future is parallel, I urge you to master your dislike of the type system and learn to use Haskell effectively, including the Data Parallel Haskell extensions.
The down side of Haskell performance is that it can be quite difficult to predict the space and time required to evaluate a lazy functional program. There are excellent profiling tools, but still significant effort may be required.
Standard ML with the MLton compiler gives excellent performance. MLton is a whole-program compiler and does a very good job.
High-level and elegant
Syntactically Haskell is the clear winner. The type system, however, is cluttered with the remains of recent experiments. The core of the type system is, however, high-level and elegant. The "type class" mechanism is particularly powerful.
Standard ML has ugly syntax but a very clean type system and semantics.
OCaml is the least elegant, both from a point of view of syntax and from the type system. The remains of past experiments are more obtrusive than in Haskell. Also, the standard libraries do not support functional programming as well as you might expect.
Primarily functional
Haskell is purely functional; Standard ML is very functional; OCaml is mostly functional (but watch out for mutable strings and for some surprising omissions in the libraries; for example, the list functions are not safe for long lists).
Portability
All three work very well on Linux. The Haskell developers use Windows and it is well supported (though it causes them agony). I know OCaml runs well on OSX because I use an app written in OCaml that has been ported to OSX. But I'm poorly informed here.
Object-oriented
Not to be found in Haskell or SML. OCaml has a bog-standard OO system grafted onto the core language, not well integrated with other languages.
You don't say why you are keen for object-orientation. ML functors and Haskell type classes provide some of the encapsulation and polymorphism (aka "generic programming") that are found in C++.
Type system than can be subverted
All three languages provide unsafe casts. In all three cases they are a good way to get core dumps.
I like to be able to do some implicit casting.
I think you will find Haskell's type-class system to your liking—you can get some effects that are similar to implicit casting, but safely. In particular, numeric and string literals are implicitly castable to any type you like.
Tools
There are pretty good profiling tools with Haskell. Standard ML has crappy tools. OCaml has basically standard Unix profiling plus an unusable debugger. (The debugger refuses to cross abstraction barriers, and it doesn't work on native code.)
My information may be out of date; the tools picture is changing all the time.
Garbage-collected and compiled to native code
Check. Nothing to choose from there.
Recommendation
Overcome your aversion to safe, secure type systems. Study Haskell's type classes (the original paper by Wadler and Blott and a tutorial by Mark Jones may be illuminating). Get deeper into Haskell, and be sure to learn about the huge collection of related software at Hackage.
Try Scala. It's an object-oriented functional language that runs in the JVM, so you can access everything that was ever written in Java. It has all your important features, and one of the nice to have features. (Obviously not #2.2 :) but that will probably get better quickly.) It does have very strong typing, but with type inference it doesn't really get in your way.
You just described Common Lisp...
If you like using lists for most things, and care about performance, use Haskell or Ocaml. Although Ocaml suffers significantly in that Floats on the heap need to be boxed due to the VM design (but arrays of floats and purely-float records aren't individually boxed, which is good).
If you're willing to use arrays more than lists, or plan on programming using mutable state, use Scala rather than Haskell. If you're looking to write threaded multi-core code, use Scala or Haskell (Ocaml requires you to fork).
Scala's list is polymorphic, so a list of ints is really a list of boxed Int objects. Of course you could write your own list of ints in Scala that would be as fast, but I assume you'd rather use the standard libraries. Scala does have as much tail recursion as is possible on JVM.
Ocaml fails on Vista 64 for me, I think because they just changed the linker in the latest version (3.11.1?), but earlier versions worked fine.
Scala tool support is buggy at the moment if you're using nightly builds, but should be good soon. There are eclipse and netbeans plugins. I'm using emacs instead. I've used both the eclipse and netbeans debugger GUI successfully in the past.
None of Scala, Ocaml, or Haskell, have truly great standard libraries, but at least you can easily use Java libs in Scala. If you use mapreduce, Scala wins on integration. Haskell and Ocaml have a reasonable amount of 3rd party libs. It annoys me that there are differently named combinators for 2-3 types of monad in Haskell.
http://metamatix.org/~ocaml/price-of-abstraction.html might convince you to stay with C++. It's possible to write Scala that's almost identical in performance to Java/C++, but not necessarily in a high level functional or OO style.
http://gcc.gnu.org/projects/cxx0x.html seems to suggest that auto x = ... (type inference for expressions) and lambdas are usable. C++0x with boost, if you can stomach it, seems pretty functional. The downside to C++ high performance template abusing libraries is, of course, compile time.
Your requirements seem to me to describe ocaml quite well, except for the "not-too-strong" typing. As for tools, I use and like tuareg mode for emacs. Ocaml should run on windows (I haven't used it myself though), and is pretty similar to F#, FWIW.
I'd consider the ecosystem around the language as well. In my opinion Ocaml's major drawback is that it doesn't have a huge community, and consequently lacks the large library of third-party modules that are part of what makes python so convenient. Having to write your own code or modify someone else's one-shot prototype module you found on the internet can eat up some of the time you save by writing in a nice functional language.
You can use F# on mono; perhaps worth a look? I know that mono isn't 100% perfect (nothing ever is), but it is very far from "terrible"; most of the gaps are in things like WCF/WPF, which I doubt you'd want to use from FP. This would seem to offer much of what you want (except obviously it runs in a VM - but you gain a huge set of available libraries in the bargain (i.e. most of .NET) - much more easily than OCaml which it is based on).
I would still go for Python but using NumPy or some other external module for the number crunching or alternatively do the logic in Python and the hotspots in C / assembler.
You are always giving up cycles for comfort, the more comfort the more cycles. Thus you requirements are mutual exclusive.
I think that Common Lisp fits your description quite well.
1.1: Performance: Modern CL implementations are almost on par with C. There are also foreign function interfaces to interact with C libraries, and many bindings are already done (e.g. the GNU Scientific Library).
1.2: High-level and elegant: Yep.
1.3: Primarily functional: Yes, but you can also "get imperative" wherever the need arises; CL is "multi-paradigm".
1.4: Portability: There are several implementations with differing support for each platform. Some links are at CLiki and ALU Wiki.
1.5: Object-oriented, preferably under the "everything is an object" philosophy: CLOS, the Common Lisp Object System, is much closer to being "object oriented" than any of the "curly" languages, and also has features you will sorely miss elsewhere, like multimethods.
2.1: "Not-too-strong" typing: CL has dynamic, strong typing, which seems to be what you want.
2.2: Tools: Emacs + SLIME (the Superior Lisp Interaction Mode for Emacs) is a very nice free IDE. There is also a plugin for Eclipse (Cusp), and the commercial CL implementations also oftem bring an own IDE.
2.3: Garbage collected, but translated or compiled without a virtual machine. The Lisp image that you will be working on is a kind of VM, but I think that's not what you mean.
A further advantage is the incremental development: you have a REPL (read-eval-print-loop) running that provides a live interface into the running image. You can compile and recompile individual functions on the fly, and inspect the current program state on the live system. You have no forced interruptions due to compiling.
Short Version: The D Programming Language
Yum Yum Yum, that is a big set of requirements.
As you probably know, object orientation, high-level semantics, performance, portability and all the rest of your requirements don't tend to fit together from a technical point of view. Let's split this into a different view:
Syntax Requirements
Object Orientated presentation
Low memory management complexity
Allows function style
Isn't Haskell (damn)
Backend Requirements
Fast for science
Garbage Collected
On this basis I would recommend The D programming language it is a successor to C trying to be all things to all people.
This article on D is about it's functional programming aspects. It is object-orientated, garbage collected and compiles to machine code so is fast!
Good Luck
Clojure and/or Scala are good canditates for JVM
I'm going to assume that you are familiar enough with the languages you mentioned to have ruled them out as possibilities. Given that, I don't think there is a language that fulfills all your expectations. However, there are still a few languages you could take a look at:
Clojure This really is a very nice language. It's syntax is based on LISP, and it runs on the JVM.
D This is like C++ done right. It has all the features you want except that it's kind of weak on the functional programming.
Clean This is based very heavily on Haskell, but removes some of Haskell's problems. Downsides are that it's not very mature and doesn't have a lot of libraries.
Factor Syntactically it's based on Forth, but has support for LISP-like functional programming as well as better support for classes.
Take a peek at Erlang. Originally, Erlang was intended for building fault-tolerant, highly parallel systems. It is a functional language, embracing immutability and first-class functions. It has an official Windows binary release, and the source can be compiled for many *NIX platforms (there is a MacPorts build, for example).
In terms of high-level features, Erlang support list comprehensions, pattern matching, guard clauses, structured data, and other things you would expect. It's relatively slow in sequential computation, but pretty amazing if you're doing parallel computation. Erlang does run on a VM, but it runs on its own VM, which is part of the distribution.
Erlang, while not strictly object-oriented, does benefit from an OO mindset. Erlang uses a thing called a process as its unit of concurrency. An Erlang process is actually a lot like a native thread, except with much less overhead. Each process has a mailbox, will be sent messages, and will process those messages. It's easy enough to treat processes as if they were objects.
I don't know if it has much in the way of scientific libraries. It might not be a good fit for your needs, but it's a cool language that few people seem to know about.
Are you sure that you really need a functional language? I did most of my programming in lisp, which is obviously a functional language, but I have found that functional programming is more of a mind-set than a language feature. I'm using VB right now, which I think is an excellent language (speed, support, IDE) and I basically use the same programming style that I did in lisp - functions call other functions that call other functions - and functions are usually 1-5 lines long.
I do know that Lisp has good performance, run on all platforms, but it is somewhat outdated in terms of how up to date support for features such as graphics, multi-threading etc. are.
i've taken a look at clojure but if you don't like java you probably won't like clojure. It's a functional-lisp-style language implemented on top of java - but you'll probably find yourself using java libraries all the time which adds the verbosoity of java. I like lisp but I didn't like clojure despite the hype.
Are you also sure about your performanc requirements? Matlab is an excellent language for a lot of scientific computation, but it is kind of slow and I hate reading it. You might find t useful though especially in conjunction with other languages, for prototypes/scenarios/subunits.
Many of your requirements are based on hearsay. One example: the idea that Mono is "terrible".
http://banshee-project.org/
That's the official media player of many Linux distributions. It's written in C#. (They don't even have a public Windows release of it!)
Your assertions about the relative performance of various languages are equally dubious. And requiring a language to not use a virtual machine is quite unrealistic and totally undesirable. Even an OS is a form of VM on which applications run, which virtualises the hardware devices of the machine.
Though you earn points for mentioning tools (although not with enough priority). As Knuth observed, the first question to ask about a language is "What's the debugger like?"
Looking over your requirements, I would recommend VB on either Mono, or a virtual machine running windows. As a previous poster said, the first thing to ask about a language is "What is the debugger like" and VB/C# have the best debugger. Just a result of all those Microsoft employees hammering on the debugger, and having the teams nearby to bug (no pun intended) into fixing it.
The best thing about VB and C# is the large set of developer tools, community, google help, code exapmles, libraries, softwaer that interfaces with it, etc. I've used a wide variety of software development environments over the past 27 years, and the only thing that comes close is the Xerox Lisp machine environmnets (better) and the Symbolics Lisp machines (worse).

Should I learn Haskell or F# if I already know OCaml? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am wondering if I should continue to learn OCaml or switch to F# or Haskell.
Here are the criteria I am most interested in:
Longevity
Which language will last longer? I don't want to learn something that might be abandoned in a couple years by users and developers.
Will Inria, Microsoft, University of Glasgow continue to support their respective compilers for the long run?
Practicality
Articles like this make me afraid to use Haskell. A hash table is the best structure for fast retrieval. Haskell proponents in there suggest using Data.Map which is a binary tree.
I don't like being tied to a bulky .NET framework unless the benefits are large.
I want to be able to develop more than just parsers and math programs.
Well Designed
I like my languages to be consistent.
Please support your opinion with logical arguments and citations from articles. Thank you.
Longevity
Haskell is de facto the dominant language of functional-programming research. Haskell 98 will last for many more years in stable form, and something called Haskell may last 10 to 30 years---although the language will continue to evolve. The community has a major investment in Haskell and even if the main GHC developers are hit by a bus tomorrow (the famous "bus error in Cambridge" problem), there are plenty of others who can step up to the plate. There are also other, less elaborate compilers.
Caml is controlled by a small group at INRIA, the French national laboratory. They also have a significant investment, Others are also invested in Caml, and the code is open source, and the compiler is not too complicated, so that too will be maintained for a long time. I predict Caml will be much more stable than Haskell, as the INRIA folks appear no longer to be using it as a vehicle for exploring new language ideas (or at least they are doing so at a smaller rate than in the past).
Who knows what a company will do? If F# is successful, Microsoft could support it for 20 years. If it is not successful, they could pull the plug in 2012. I can't guess and won't try.
Practicality
A hash table is the best structure for fast retrieval. Haskell proponents in there suggest using Data.Map which is a binary tree.
It depends on what you are searching. When your keys are strings, ternary search trees are often faster than hash tables. When your keys are integers, Okasaki and Gill's binary Patricia trees are competitive with hashing. If you really want to, you can build a hash table in Haskell using the IO monad, but it's rare to need to.
I think there will always be a performance penalty for lazy evaluation. But "practical" is not the same as "as fast as possible". The following are true about performance:
It is easiest to predict the time and space behavior of a Caml program.
F# is in the middle (who really knows what .NET and the JIT will do?).
It is hardest to predict the time and space behavior of Haskell programs.
Haskell has the best profiling tools, and in the long run, this is what yields the best performance.
I want to be able to develop more than just parsers and math programs.
For an idea of the range of what's possible in Haskell, check out the xmonad window manager and the vast array ofpackages at hackage.haskell.org.
I don't like being tied to a bulky .NET framework unless the benefits are large.
I can't comment:
Well Designed
I like my languages to be consistent.
Some points on which to evaluate consistency:
Haskell's concrete syntax is extremely well designed; I'm continually impressed at the good job done by the Haskell committee. OCaml syntax is OK but suffers by comparison. F# started from Caml core syntax and has many similarities.
Haskell and OCaml both have very consistent stories about operator overloading. Haskell has a consistent and powerful mechanism you can extend yourself. OCaml has no overloading of any kind.
OCaml has the simplest type system, especially if you don't write objects and functors (which many Caml programmers don't, although it seems crazy to me not to write functors if you're writing ML). Haskell's type system is ambitious and powerful, but it is continually being improved, which means there is some inconsistency as a result of history. F# essentially uses the .NET type system, plus ML-like Hindley-Milner polymorphism (See question "What is Hindley-Milner".)
OCaml is not quite consistent on whether it thinks variants should be statically typed or dynamically typed, so it provides both ("algebraic data types" and "polymorphic variants"). The resulting language has a lot of expressive power, which is great for experts, but which construct to use is not always obvious to the amateur.
OCaml's order of evaluation is officially undefined, which is a poor design choice in a language with side effects. Worse, the implementations are inconsistent: the bytecoded virtual machine uses one order and the native-code compiler uses the other.
Should you learn F# or Haskell if you know OCaml?
I believe the answer is certainly yes, ideally you should learn all three languages because each one has something to offer but F# is the only one with a significant future so, if you can only feasibly learn one language, learn F# by reading my Visual F# 2010 for Technical Computing book or subscribing to our The F#.NET Journal.
Longevity
Microsoft committed to supporting F# when they released it as part of Visual Studio 2010 in April. So F# is guaranteed a rosy future for at least a few years. With a powerful combination of practically-important features like a high performance native-code REPL, high-level constructs for parallelism built-in to .NET 4 and a production-quality IDE mode, F# is a long way ahead of any other functional programming language in terms of real world applicability now. Frankly, nobody is even working on anything that might be able to compete with F# in the near future. My own open source HLVM project is an attempt to do so but it is far from ready.
In contrast, both OCaml and Haskell are being developed in extremely unproductive directions. This has been killing OCaml for several years now and I expect Haskell to follow suit over the next few years. Most former professional OCaml and Haskell programmers already moved on to F# (e.g. Credit Suisse, Flying Frog Consultancy) and most of the rest will doubtless migrate to more practical alternatives such as Clojure and Scala in the near future.
Specifically, OCaml's QPL license prevents anyone else from fixing its growing number of fundamental design flaws (16Mb string and array limits on 32-bit machines, no shared-memory parallelism, no value types, parametric polymorphism via type erasure, interpreted REPL, cumbersome FFI etc.) because they must distribute derivative works only in the form of patches to the original and the Debian package maintainers refuse to acknowledge an alternative upstream. The new features being added to the language, such as first-class modules in OCaml 3.12, are nowhere near as valuable as multicore capability would have been.
Some projects were started in an attempt to save OCaml but they proved to be too little too late. The parallel GC is practically useless and David Teller quit the batteries included project (although it has been picked up and released in a cut-down form). Consequently, OCaml has gone from being the most popular functional language in 2007 to severe decline today, with caml-list traffic down over 50% since 2007.
Haskell has fewer industrial users than OCaml and, although it does have multicore support, it is still being developed in a very unproductive direction. Haskell is developed almost entirely by two people at Microsoft Research in Cambridge (UK). Despite the fact that purely functional programming is bad for performance by design, they are continuing to try to develop solutions for parallel Haskell aimed at multicores when the massive amounts of unnecessary copying it incurs hits the memory wall and destroys any hope of scalable parallelism on a multicore.
The only major user of Haskell in industry is Galois with around 30 full-time Haskell programmers. I doubt they will let Haskell die completely but that does not mean they will develop it into a more generally-useful language.
Practicality
I wrote the article you cited about hash tables. They are a good data structure. Other people have referred to purely functional alternatives like ternary trees and Patricia trees but these are usually ~10× slower than hash tables in practice. The reason is simply that cache misses dominate performance concerns today and trees incur an extra O(log n) pointer indirections.
My personal preference is for optional laziness and optional purity because both are generally counter productive in the real world (e.g. laziness makes performance and memory consumption wildly unpredictable and purity severely degrades average-case performance and makes interoperability a nightmare). I am one of the only people earning a living entirely from functional programming through my own company. Suffice to say, if I thought Haskell were viable I would have diversified into it years ago but I keep choosing not to because I do not believe it is commercially viable.
You said "I don't like being tied to a bulky .NET framework unless the benefits are large". The benefits are huge. You get a production-quality IDE, a production-quality JIT compiler that performs hugely-effective optimizations like type-specializing generics, production-quality libraries for everything from GUI programming (see Game of Life in 32 lines of F#) to number crunching. But the real benefit of .NET, at least for me, is that you can sell the libraries that you write in F# and earn lots of money. Nobody has ever succeeded selling libraries to OCaml and Haskell programmers (and I am one of the few people to have tried) but F# libraries already sell in significant quantities. So the bulky .NET framework is well worth it if you want to earn a living by writing software.
Well designed
These languages are all well designed but for different purposes. OCaml is specifically designed for writing theorem provers and Haskell is specifically designed for researching Haskell. F# was designed to address all of the most serious practical problems with OCaml and Haskell such as poor interoperability, lack of concurrent garbage collection and lack of mature modern libraries like WPF in order to bring a productive modern language to a large audience.
This wasn't one of your criteria but have you considered job availability? Haskell currently list 144 jobs on indeed, Ocaml list 12 and C# list 26,000. These numbers are not perfect but I bet you that once F# ships it won't be long before it blows past Haskell and Ocaml in the number of job listings.
So far every programming language included in Visual Studios has thousands of job listings for it. Seems to me that if you want the best chance to use a functional programming language as your day job then F# will soon be it.
Longevity
No one can predict the future, but
OCaml and Haskell have been surving well for a number of years, which bodes well for their future
when F# ships with VS2010, MS will have legal obligations to support it for at least 5 years
Practicality
Perf: I don't have enough first-hand experience with Haskell, but based on second-hand and third-hand info, I think OCaml or F# are more pragmatic, in the sense that I think it is unlikely you'll be able to get the same run-time perf in Haskell that you do in OCaml of F#.
Libraries: Easy access to the .Net Framework is a huge benefit of F#. You can view it as being "tied to this bulky thing" if you like, but don't forget that "you have access to a huge bulky library of often incredibly useful stuff". The 'connectivity' to .Net is one of the big selling points for F#. F# is younger and so has fewer third-party libraries, but there is already e.g. FsCheck, FParsec, Fake, and a bunch of others, in addition to the libraries "in the box" on .Net.
Tooling: I don't have enough personal experience to compare, but I think the VS integration with F# is superior to anything you'll find for OCaml/Haskell today (and F# will continue to improve a bit here over the next year).
Change:
F# is still changing as it approaches its first supported release in VS2010, so there are some breaking changes to language/library you may have to endure in the near future.
Well Designed
Haskell is definitely beautiful and consistent. I don't know enough OCaml but my hunch is it is similarly attractive. I think that F# is 'bigger' than either of those, which means more dusty corners and inconsistencies (largely as a result of mediating the impedence mismatch between FP and .Net), but overall F# still feels 'clean' to me, and the inconsistencies that do exist are at least well-reasoned/intentioned.
Overall
In my opinion you will be in 'good shape' knowing any of these three languages well. If you know a big long-term project you want to use it for, one may stand out, but I think many of the skills will be transferable (more easily between F# and OCaml than to/from Haskell, but also more easily among any of these three than with, say, Java).
There's no simple answer to that question, but here are some things to consider:
Haskell and OCaml are both mature languages with strong implementations. Actually, there are multiple good implementations of Haskell, but I don't think that's a major point in its favor for your purpose.
F# is much younger, and who can predict where Microsoft will decide to take it? How you feel about that depends more on how you feel about Microsoft than anything anyone can tell you about programming languages.
OCaml (or ML in general), is a good practical language choice that supports doing cool functional stuff without forcing you to work in a way that might be uncomfortable. You get the full benefit of things like algebraic data types, pattern matching, type inference, and everybody else's favorite stuff. Oh, and objects.
Haskell gives you all that (except objects, pretty much), but also more or less forces you to rethink everything you think you know about programming. This might be a very good thing, if you're looking to learn something new, but it might be more than you want to bite off. I say this as someone who is only maybe halfway along the path to being a productive, happy Haskell programmer.
Both OCaml and Haskell are being used to write lots of different kinds of programs, not just compilers and AI or whatever. Google is your friend.
One last note: OCaml gives you hashtable, but it's hardly sensible to use it in code if you really want to embrace functional programming. Persistent trees (like Data.Map) are really the right solution for Haskell, and have lots of nice properties, which is one of the cool things to learn about when you pick up Haskell.
F# and OCaml are very similar in syntax, though, obviously, F# is better with .NET.
Which one you learn or use should be dependent on which platform you are aiming for.
In VS2010 F# is going to be included, and since it compiles to .NET bytecode, it can be used on a windows OS that supports the .NET version you used for it. This will give you a large area, but there are limits, currently with F# that OCaml don't have, in that F# appears not to take advantage of all the processors on a machine, but, that is probably due to F# still being developed, and this may be a feature that isn't as important yet.
There are other functional languages, such as Erlang that you could look at, but, basically, if you are strong in one FP language then you should be able to pick up another fairly quickly, so, just pick one that you like and try to develop interesting and challenging applications in it.
Eventually language writers will find a way to get OO languages to work well with multi-cores and FP may fall to the wayside again, but, that doesn't appear to be happening anytime soon.
This is not directly related to the OP's question as to whether or not to learn F#, but rather, an example of real world OCaml usage in the financial sector:
http://ocaml.janestreet.com/?q=node/61
Very interesting talk.
In terms of longevity, it's very difficult to judge popularity of languages, but just doing a quick check on here, these are the numbers of questions tagged with the appropriate (functional) language :-
2672 Scala,
1936 Haskell,
1674 F#,
1126 Clojure,
709 Scheme,
332 OCaml
I'd say this was a good indication of which languages people are actively learning at the moment and therefore might be a good indication of which ones might be popular in the next few years.

What are some pros/cons to various functional languages?

I know of several functional languages - F#, Lisp and its dialects, R, and more. However, as I've never used any of them (although the three I mentioned are on my "to-learn" list), I was wondering about the pros/cons of the various functional languages out there. Are there significant pros/cons, both in learning the language and in any real-world applications of said language?
Haskell is "extreme" (lazy, pure), has active users, lots of documentation, and makes runnable applications.
SML is "less extreme" (strict, impure), has active users, formal specification, many implementations (SML/NJ, Mlton, Moscow ML, etc.). Implementations vary on how applications are deployed wrt the runtimes.
OCaml is ML with attitude. It has an object orientation, active users, documentation, add ons, and makes runnable applications.
Erlang is concurrent, strict, pure (mostly), and supports distributed apps. It needs a runtime installed separately, so deployment is different from the languages that make runnable binaries.
F# is similar to OCaml with Microsoft backing and .NET libraries.
Scala runs on the JVM and can be used as a functional language with advanced features, or as simply a souped-up Java, or both. The flexibility is cited as a drawback for learning a functional language because it's easy to slip back into imperative Java ways. Of course it is also an advantage if you want to use existing JVM libraries.
I'm not sure if your question is to functional languages in general, or differences between them. For general info on why functional:
http://paulspontifications.blogspot.com/2007/08/no-silver-bullet-and-functional.html
Why Functional Programming Matters
As far as differences between functional languages:
Distinctive traits of the functional languages
The awesome thing about functional languages is that base themselves off of the lambda calculus and other math. This results in being able to use similar algorithms and thoughts across languages more easily.
As far as which one you should learn: Pick one that will have a comfortable environment for you. For example, if you're using .NET and Visual Studio, F# is an excellent fit. (Actually, the VS integration makes F# a strong contender, period.) The book "How to Design Programs" (full text, free, online) with PLT Scheme is also a good choice.
I'm biased, but F# looks to have the biggest "real-world" potential. This is mainly because of the nice IDE/.NET integration, allowing you to fully tap .NET and OO, while keeping a lot of functional power (and extending it in ways too). Scala might be possible contender, but it's more of an OO language that has some functional features; hence Scala won't be as big a productivity gain.
Edit: Just to note JavaScript and Ruby, before someone comments on that :). Ruby is something else you could take a look at if you're doing that type of web dev, as it has a lot of functional concepts in, although not as polished as other languages.
The biggest downside is that once you see the power you can have, you won't be happy using lesser languages. This becomes a problem if you're forced to deal with people who haven't yet understood.
One final note, the only "con" is that "it's so complicated". This isn't actually true -- functional languages are often simpler -- but if you have years of C or whatnot in your brain, it can be a significant hurdle to "get" the functional concept. After it clicks, it should be relatively smooth sailing.
Lisp has a gentle learning curve. You can learn the basics in an hour, though of course it takes longer to learn idioms etc. On the down side, there are many dialects of Lisp, and it's difficult to interact with mainstream environments like Java or .NET.
I would not recommend R unless you need to do statistics. It's a strange language, and not exactly functional. You can do functional programming in R, but most people don't.
If you're familiar with the Microsoft tool stack, F# might be easy to get into. And it has a huge, well-tested library behind it, i.e. the CLR.
You can use a functional programming style in any language, though some make it easier than others. As far as that goes, you might try Python.
ML family (SML/OCaml/F#):
Pros:
Fairly simple
Have effective implementations (on the level with Java/C#)
Easily predictable resource consumption (compared to lazy languages)
Readable syntax
Strong module system
(For F#): large .Net library available
Has mutable variables
Cons:
Sometimes too simple (no typeclasses => problems with overloading)
(Except F#): standard libraries are missing some useful things
Has mutable variables :)
Cannot have infinite data structures (not lazy language)
I haven't mentioned features common to most static-typed functional languages: type inference, parametric polymorphism, higher-order functions, algrebraic data types & pattern matching.
I have learnt Haskell at the university like a pure functional languaje and I can say that's really powerful, but also I couldn't find a practical use.
However, i found this: Haskell in practice . Check it, is amazing.
The characteristics of functional paradigms sometimes are pros, and sometimes cons, depending on the situation / context.
Some of them are:
high level
lambda functions
lazy evaluation
Higher-order functions
recursion
type inference
Cite from wikipedia:
Efficiency issues
Functional programming languages have
been perceived as less efficient in
their use of CPU and memory than
imperative languages such as C and
Pascal.[26] However, for programs that
perform intensive numerical
computations, functional languages
such as OCaml and Clean are similar in
speed to C. For
programs that handle large matrices
and multidimensional databases, array
functional languages (such as J and K)
were designed with speed optimization
in mind.
Purely functional languages have a
reputation for being slower than
imperative languages.
However, immutability of data can, in
many cases, lead to execution
efficiency in allowing the compiler to
make assumptions that are unsafe in an
imperative language, vastly increasing
opportunities for inlining.
Lazy evaluation may also speed up the
program, even asymptotically, whereas
it may slow it down at most by a
constant factor (however, it may
introduce memory leaks when used
improperly).

What are the primary differences between Haskell and F#? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I've searched on the Internet for comparisons between F# and Haskell but haven't found anything really definitive. What are the primary differences and why would I want to choose one over the other?
Haskell is a "pure" functional language, where as F# has aspects of both imperative/OO and functional languages. Haskell also has lazy evaluation, which is fairly rare amongst functional languages.
What do these things mean? A pure functional language, means there are no side effects (or changes in shared state, when a function is called) which means that you are guaranteed that if you call f(x), nothing else happens besides returning a value from the function, such as console output, database output, changes to global or static variables.. and although Haskell can have non pure functions (through monads), it must be 'explicitly' implied through declaration.
Pure functional languages and 'No side effect' programming has gained popularity recently as it lends itself well to multi core concurrency, as it is much harder to get wrong with no shared state, rather than myriad locks & semaphores.
Lazy evaluation is where a function is NOT evaluated until it is absolutely necessary required. meaning that many operation can be avoided when not necessary. Think of this in a basic C# if clause such as this:
if(IsSomethingTrue() && AnotherThingTrue())
{
do something;
}
If IsSomethingTrue() is false then AnotherThingTrue() method is never evaluated.
While Haskell is an amazing language, the major benefit of F# (for the time being), is that it sits on top of the CLR. This lends it self to polyglot programming. One day, you may write your web UI in ASP.net MVC, your business logic in C#, your core algorithms in F# and your unit tests in Ironruby.... All amongst the the .Net framework.
Listen to the Software Engineering radio with Simon Peyton Jones for more info on Haskell: Episode 108: Simon Peyton Jones on Functional Programming and Haskell
Big differences:
Platform
Object orientation
Laziness
The similarities are more important than the differences. Basically, you should use F# if you are on .NET already, Haskell otherwise. Also, OO and laziness mean that F# is closer to what you (probably) already know, so it is probably easier to learn.
Platform : Haskell has its own runtime, F# uses .NET. I don't know what the performance difference is, although I suspect the average code is about the same before optimisation. F# has the advantage if you need the .NET libraries.
Object orientation : F# has OO, and is very careful to make sure that .NET classes are easy to use even if your code isn't OO. Haskell has type classes which let you do something like OO, in a weird sort of way. They are like Ruby mixins crossed with Common Lisp generic functions. They're a little like Java/C# interfaces.
Laziness : Haskell is lazy, F# is not. Laziness enables some nice tricks and makes some things that look slow actually execute fast. But I find it a lot harder to guess how fast my code will run. Both languages let you use the other model, you just have to be explicit about it in your code.
Minor differences:
Syntax : Haskell has slightly nicer syntax in my opinion. It's a little more terse and regular, and I like declaring types on a separate line. YMMV.
Tools : F# has excellent Visual Studio integration, if you like that sort of thing. Haskell also has an older Visual Studio plugin, but I don't think it ever got out of beta. Haskell has a simple emacs mode, and you can probably use OCaml's tuareg-mode to edit F#.
Side effects : Both languages make it pretty obvious when you are mutating variables. But Haskell's compiler also forces you to mark side effects whenever you use them. The practical difference is that you have to be a lot more aware of when you use libraries with side effects as well.
F# is part of the ML family of languages and is very close to OCaml. You may want to read this discussion on the differences between Haskell and OCaml.
A major difference, which is probably a result ofthe purity but I less see mentioned, is the pervasive use of monads. As is frequently pointed out, monads can be built in most any language, but life changes greatly when they are used pervasively throughout the libraries, and you use them yourself.
Monads provide something seen in a much more limited way in other languages: abstraction of flow control. They're incredibly useful and elegant ways of doing all sorts of things, and a year of Haskell has entirely changed the way I program, in the same way that moving from imperative to OO programming many years ago changed it, or, much later, using higher-order functions did.
Unfortunately, there's no way in a space like this to provide enough understanding to let you see what the difference is. In fact, no amount of writing will do it; you simply have to spend enough time learning and writing code to gain a real understanding.
As well, F# sometimes may become slightly less functional or more awkward (from the functional programming point of view) when you interface with the .NET platform/libraries, as the libraries were obviously designed from an OO point of view.
So you might consider your decision this way: are you looking to try out one of these languages in order to get a quick, relatively small increment of improvement, or are you willing to put in more time and get less immediate benefit for something bigger in the long term. (Or, at least, if you don't get something bigger, the easy ability to switch to the other quickly?) If the former, F# is your choice, if the latter, Haskell.
A couple of other unrelated points:
Haskell has slightly nicer syntax, which is no suprise, since the designers of Haskell knew ML quite well. However, F#'s 'light' syntax goes a long way toward improving ML syntax, so there's not a huge gap there.
In terms of platforms, F# is of course .NET; how well that will work on Mono I don't know. GHC compiles to machine code with its own runtime, working well under both Windows and Unix, which compares to .NET in the same way, that, say, C++ does. This can be an advantage in some circumstances, especially in terms of speed and lower-level machine access. (I had no problem writing a DDE server in Haskell/GHC, for example; I don't think you could do that in any .NET language, and regardless, MS certainly doesn't want you doing that.)
Well, for one I'd say a main advantage is that F# compiles against the .NET platform which makes it easy to deploy on windows. I've seen examples which explained using F# combined with ASP.NET to build web applications ;-)
On the other hand, Haskell has been around for waaaaay longer, so I think the group of people who are real experts on that language is a lot bigger.
For F# I've only seen one real implementation so far, which is the Singularity proof of concept OS. I've seen more real world implementations of Haskell.

Resources