Haskell libraries overview and their quality [closed]

I want to use Haskell in production. It has a lot of libraries, but not all of them are stable, ready to use, and well developed. Some libraries with interesting concepts are still experimental. Many libraries are stuck at minor versions (0.0.1, for example), and some have simply been abandoned. Hackage is too huge to monitor, so I need a brief overview of the current state of its libraries, their prospects, and their suitability for use.
I understand that the question is very broad, but this information would be useful to anyone in one way or another. We can gather information here bit by bit and then turn it into an informative reference.
So what libraries can I use for:
Fast arrays capable of handling millions of items
Fast and powerful maps (probably, Data.Map?)
Fast, generic and convenient trees
Queues, hashtables
Regular expressions
Finite state machines
Neural networks, genetic algorithms
Mathematical calculations
Physics (which can be used in game development)
GUI
Image processing (we deal with various image formats, after all)
Working with databases (maybe ORM or some DSLs to generate SQL)
Functional reactive programming
OpenGL bindings (yes, HOpenGL is good), OpenAL and OpenCL bindings
Parsing (Parsec is great I think)
Multithread and parallel programming
Network
Multipurpose game engines
Something else?
It would also be interesting to have tools for:
Testing (QuickCheck)
Logging (Maybe hslogger)
Profiling
Debugging
Here are links to similar topics:
What are the best Haskell libraries to operationalize a program?
Regex & String Libraries in Haskell
Libraries for strict data structures in Haskell
Memory efficient strings in Haskell
Which Haskell library for computer graphics geometry?
Which Haskell XML library to use?
Other links
Applications and libraries (list and brief description)
Regular expressions
Haskell libraries you should use
There are a hell of a lot of Haskell libraries now. What are we going to do about it?
Popular Haskell Packages: Q2 2010 report
Thank you.

I'll leave this as a community wiki - other people, please feel free to add items or commentary in a reasonably concise manner.
Fast arrays capable of handling millions of items: Repa, Vector (see the short sketch after this list).
Fast and powerful maps: containers and unordered-containers.
Fast, generic and convenient trees:
Queues, hashtables: See the hashtables package for the latest and greatest.
Regular expressions: regex-pcre, regex-tdfa
Finite state machines: fsmActions - but it is only at version 0.4.3 alpha; fst - but it's not exactly an FSM library. In some cases FRP will be useful instead of a true FSM.
Neural networks, genetic algorithms: HNN is well established. As for genetic algorithms, we really have one framework (GA) and something that looks more complete (hgalib), but I haven't inspected it closely.
Mathematical calculations: hmatrix
Physics: dimensional.
GUI: GTK works well. I get the sense that wxhaskell generates more questions per capita, but that's an informal impression.
Image processing: for decoding image formats, Juicy-Pixels and JuicyPixels-Repa; for image processing proper, CV, Friday, and yarr.
Working with databases: consider the persistent-* wrappers, but also look at HDBC. The PostgreSQL bindings are stable. For Cassandra there are several options, but consider cql.
Functional reactive programming: reactive-banana, netwire
OpenGL bindings: OpenGL, GL.
Parsing: Parsec, attoparsec, polyparse, frisby.
Multithreaded and parallel programming - see the parallel package and Control.Concurrent. monad-par is relatively new but frequently easier to reason about than the basic parallel library. See also async for concurrent IO.
Network - Depends. Network with blaze-builder, cereal, or binary. Also consider network-{conduit, enumerator, pipes}. There are several client/server wrappers out there as well.
Multipurpose game engines: For learning? gloss. Otherwise you probably need to roll your own but make use of OpenGL, GLUT, GTK, FRP, ogre bindings, SDL, and perhaps FunGEn if it's back on track.
Configuration management: configurator, config-ini.
XML processing: HaXml, HXT, xml-conduit - good, stable and powerful libraries.
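Since basic usage questions come up a lot, here is a minimal sketch of the first two items, using vector and containers (Data.Map.Strict needs containers >= 0.5; the example data is made up):

    import qualified Data.Map.Strict as M
    import qualified Data.Vector.Unboxed as V

    -- Unboxed vectors store millions of elements with no per-element boxing.
    squares :: V.Vector Int
    squares = V.generate 1000000 (\i -> i * i)

    -- Data.Map from containers is a balanced tree with O(log n) operations.
    capitals :: M.Map String String
    capitals = M.fromList [("France", "Paris"), ("Japan", "Tokyo")]

    main :: IO ()
    main = do
      print (V.sum squares)
      print (M.lookup "Japan" capitals)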
Tools:
Testing - QuickCheck, LazySmallCheck, test-framework, hspec, HUnit (see the sketch after this list)
Logging - Yep, hslogger or dlist with the writer monad if that's all you need.
Profiling - hpc, ThreadScope, criterion, GHC's time and space profiling utilities.
Debugging - GHCi debugging, unsafe (trace) debugging, writing better property tests.
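To show how low the entry barrier for testing is, a minimal QuickCheck sketch (the property is just an illustration):

    import Test.QuickCheck

    -- A property: reversing a list twice gives back the original list.
    -- quickCheck tests it against 100 randomly generated lists.
    prop_reverseTwice :: [Int] -> Bool
    prop_reverseTwice xs = reverse (reverse xs) == xs

    main :: IO ()
    main = quickCheck prop_reverseTwice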
Related Questions:
Haskell library for 2D drawing
Filling the enclosed areas with random colors - Haskell - Friday


What's the status of current Functional Reactive Programming implementations?

I'm trying to visualize some simple automatic physical systems (such as pendulums, robot arms, etc.) in Haskell.
Often those systems can be described by equations like
df/dt = c*f(t) + u(t)
where u(t) represents some kind of 'intelligent control'. Those systems seem to fit very nicely into the Functional Reactive Programming paradigm.
So I grabbed the book "The Haskell School of Expression" by Paul Hudak,
and found that the domain-specific language "FAL" (for Functional Animation Language) presented there actually works quite pleasantly for my simple toy systems (although some functions, notably integrate, seemed a bit too lazy for efficient use, but were easily fixable).
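For concreteness, here is roughly the kind of system I mean, as a minimal sketch in plain Haskell (no FRP library; forward-Euler integration, with the constants and step size made up for illustration):

    -- Forward-Euler integration of df/dt = c * f(t) + u(t).
    -- Laziness gives an infinite trajectory we can sample from.
    simulate :: Double              -- c
             -> (Double -> Double)  -- u, the control input
             -> Double              -- dt, the step size
             -> Double              -- f(0), the initial value
             -> [Double]
    simulate c u dt f0 = go 0 f0
      where
        go t f = f : go (t + dt) (f + dt * (c * f + u t))

    main :: IO ()
    main = mapM_ print (take 5 (simulate (-0.5) (const 1.0) 0.01 0.0))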
My question is, what's the more mature, up-to-date, well-maintained, performance-tuned alternative for more advanced, or even practical applications today?
This wiki page lists several options for Haskell, but I'm not clear about the following respects:
The status of "reactive", the project from Conal Eliott who is (as I understand it) one of the inventers of this programming paradigm, looks a bit stale. I love his code, but maybe I should try other more up-to-date alternatives? What's the primary difference between them, in terms of syntax/performance/runtime-stability?
To quote from a survey in 2011, Section 6: "... FRP implementations are still not efficient enough or predictable enough in performance to be used effectively in domains which require latency guarantees ...". Although the survey suggests some interesting possible optimizations, given that FRP has been around for more than 15 years, I get the impression that this performance problem might be very, or even inherently, difficult to solve, at least within a few years. Is this true?
The same author of the survey talks about "time leaks" in his blog. Is the problem unique to FRP, or is it something we face generally when programming in a pure, non-strict language? Have you ever found it just too difficult to stabilize an FRP-based system over time, or to keep it performant enough?
Is this still a research-level project? Are people like plant engineers, robotics engineers, financial engineers, etc. actually using these (in whatever language suits their needs)?
Although I personally prefer a Haskell implementation, I'm open to other suggestions. For example, it would be particularly fun to have an Erlang implementation --- it would then be very easy to have an intelligent, adaptive, self-learning server process!
Right now there are mainly two practical Haskell libraries out there for functional reactive programming. Each is maintained by a single person, but both receive code contributions from other Haskell programmers as well:
Netwire focuses on efficiency, flexibility and predictability. It has its own event paradigm and can be used in areas where traditional FRP does not work, including network services and complex simulations. Style: applicative and/or arrowized. Initial author and maintainer: Ertugrul Söylemez (this is me).
reactive-banana builds on the traditional FRP paradigm. While it is practical to use, it also serves as a basis for classic FRP research. Its main focus is on user interfaces, and there is a ready-made interface to wx. Style: applicative. Initial author and maintainer: Heinrich Apfelmus.
You should try both of them, but depending on your application you will likely find one or the other to be a better fit.
For games, networking, robot control and simulations you will find Netwire to be useful. It comes with ready-made wires for those applications, including various useful differentials, integrals and lots of functionality for transparent event handling. For a tutorial visit the documentation of the Control.Wire module on the page I linked.
For graphical user interfaces currently your best choice is reactive-banana. It already has a wx interface (as a separate library reactive-banana-wx) and Heinrich blogs a lot about FRP in this context including code samples.
To answer your other questions: FRP isn't suitable in scenarios where you need real-time predictability. This is largely due to Haskell, but unfortunately FRP is difficult to realize in lower level languages. As soon as Haskell itself becomes real-time-ready, FRP will get there, too. Conceptually Netwire is ready for real-time applications.
Time leaks aren't really a problem anymore, because they are largely related to the monadic framework. Practical FRP implementations simply don't offer a monadic interface. Yampa has started this and Netwire and reactive-banana both build on that.
I know of no commercial or otherwise large scale projects using FRP right now. The libraries are ready, but I think the people aren't – yet.
Although there are some good answers already, I'm going to attempt to answer your specific questions.
reactive is not usable for serious projects, due to time leak problems. (see #3). The current library with the most similar design is reactive-banana, which was developed with reactive as an inspiration, and in discussion with Conal Elliott.
Although Haskell itself is inappropriate for hard real-time applications, it is possible to use Haskell for soft real-time applications in some cases. I'm not familiar with current research, but I don't believe this is an insurmountable problem. I suspect that either systems like Yampa, or code-generation systems like Atom, are possibly the best approach to solving this.
A "time leak" is a problem specific to switchable FRP. The leak occurs when a system is unable to free old objects because it may need them if a switch were to occur at some point in the future. In addition to a memory leak (which can be quite severe), another consequence is that, when the switch occurs, the system must pause while the chain of old objects is traversed to generate current state.
Non-switchable FRP libraries such as Yampa and older versions of reactive-banana don't suffer from time leaks. Switchable FRP libraries generally employ one of two schemes: either they have a special "creation monad" in which FRP values are created, or they use an "aging" type parameter to limit the contexts in which switches can occur. elerea (and possibly netwire?) uses the former, whereas recent reactive-banana and grapefruit use the latter.
By "switchable FRP", I mean one that implements Conal's function switcher :: Behavior a -> Event (Behavior a) -> Behavior a, or something with identical semantics. This means that the shape of the network can switch dynamically as it runs.
This doesn't really contradict ertes's statement about monadic interfaces: it turns out that providing a Monad instance for an Event makes time leaks possible, and with either of the above approaches it's no longer possible to define the equivalent Monad instances.
Finally, although there's still a lot of work remaining to be done with FRP, I think some of the newer platforms (reactive-banana, elerea, netwire) are stable and mature enough that you can build reliable code from them. But you may need to spend a lot of time learning the ins and outs in order to understand how to get good performance.
I'm going to list a couple of items in the Mono and .Net space and one from the Haskell space that I found not too long ago. I'll start with Haskell.
Elm - link
Its description as per its site:
Elm aims to make front-end web development more pleasant. It introduces a new approach to GUI programming that corrects the systemic problems of HTML, CSS, and JavaScript. Elm allows you to quickly and easily work with visual layout, use the canvas, manage complicated user input, and escape from callback hell.
It has its own variant of FRP. From playing with its examples it seems pretty mature.
Reactive Extensions - link
Description from its front page:
The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using LINQ operators, and parameterize the concurrency in the asynchronous data streams using Schedulers. Simply put, Rx = Observables + LINQ + Schedulers.
Reactive Extensions comes from MSFT and implements many excellent operators that simplify handling events. It was open sourced just a couple of days ago. It's very mature and used in production; in my opinion it would have been a nicer API for the Windows 8 APIs than the one the TPL library provides, because observables can be both hot and cold and can be retried/merged etc., while tasks always represent hot or done computations that are either running, faulted or completed.
I've written server-side code using Rx for asynchrony, but I must admit that writing functionally in C# can be a bit annoying. F# has a couple of wrappers, but it's been hard to track the API development, because the group is relatively closed and isn't promoted by MSFT like other projects are.
Its open sourcing came with the open sourcing of its IL-to-JS compiler, so it could probably work well with JavaScript or Elm.
You could probably bind F#/C#/JS/Haskell together very nicely using a message broker, like RabbitMQ and SockJS.
Bling UI Toolkit - link
Description from its front page:
Bling is a C#-based library for easily programming images, animations, interactions, and visualizations on Microsoft's WPF/.NET. Bling is oriented towards design technologists, i.e., designers who sometimes program, to aid in the rapid prototyping of rich UI design ideas. Students, artists, researchers, and hobbyists will also find Bling useful as a tool for quickly expressing ideas or visualizations. Bling's APIs and constructs are optimized for the fast programming of throw-away code as opposed to the careful programming of production code.
Complementary LtU article.
I've tested this, but not worked with it on a client project. It looks awesome and has nice C# operator overloading that forms the bindings between values. It uses dependency properties in WPF/SL/(WinRT) as event sources. Its 3D animations work well on reasonable hardware. I would use this if I end up on a project in need of visualizations; probably porting it to Windows 8.
ReactiveUI - link
Paul Betts, previously at MSFT, now at GitHub, wrote that framework. I've worked with it pretty extensively and like the model. It's more decoupled than Bling (by its nature, from using Rx and its abstractions), making it easier to unit test code that uses it. The GitHub git client for Windows is written with it.
Comments
The reactive model is performant enough for most performance-demanding applications. If you are thinking of hard real-time, I'd wager that most GC languages have problems there. Rx and ReactiveUI create a number of small objects that need to be GCed, because that's how subscriptions are created/disposed and intermediate values are progressed in the reactive "monad" of callbacks. In general, on .NET I prefer reactive programming over task-based programming, because callbacks are static (known at compile time, no allocation) while tasks are dynamically allocated (not known, all calls need an instance, garbage created) - and lambdas compile into compiler-generated classes.
Obviously C# and F# are strictly evaluated, so time leaks aren't a problem here. The same goes for JS. They can be a problem with replayable or cached observables, though.

What haskell topics need to be addressed in a Real-World-Haskell style?

It has been quite some time now since RWH came out (almost 3 years). I was eager to get my copy after following the incremental writing of the book online (which is, I think, one of the best ways to write a book). What a rewarding read in the midst of all the rather academic papers a Haskell student usually encounters!
It was a sturdy companion on quite some trips and I refer back to it regularly.
Still, my copy has started to look pretty battered, and even though most of the content is still valid, there has been an abundance of new topics in the Haskell world that would be worth covering in a similar fashion.
Considering the impact RWH had (and still has,) I sincerely hope that there will be a sequel some day :)
Some of the topics for a sequel that would immediately come to my mind:
Iteratees
more on concurrent programming in Haskell
merits and dangers of lazy evaluation
possibly covering some common libraries that deal with this
in particular lazy io
new GHC features (e.g. the new I/O manager, LLVM code generator)
Memoization
..
What are the topics that the Haskell community needs an RWH-style explanation for?
This is a summary of the suggestions so far:
Concepts
Iteratees / lazy IO
Arrows
GHC event manager
Techniques
generics (uniplate, syb)
metaprogramming (Template Haskell)
data structures (use of functional data structures, designing data structures)
EDSLs (designing EDSLs)
memoization
designing with monads
best practices for imperative programming
Tools
ThreadScope
Advanced FFI tools (c2hs, using Haskell from C)
cabal
haddock
hoogle
Tuning the runtime, esp. GC flags
Djinn
Libraries
arrays and array programming (vector, repa, hmatrix)
numerics (random numbers)
parallel programming (The Par monad)
unicode and locales (text, text-icu)
parsing (attoparsec, tagsoup)
networking (snap, yesod)
web stuff (templating)
persistence (especially NoSQL storage bindings)
graphics (cairo, sdl, opengl)
xml (haxml)
crypto
processors and systems stuff
Here's my take, biased towards the ecosystem.
Libraries
arrays and array programming:
vector
repa
hmatrix
numerics
random numbers
parallel programming
The Par monad
unicode and locales
text and text-icu
parsing
attoparsec
tagsoup
networking
snap and/or yesod
web stuff
templating
persistence
databases beyond hdbc
NoSQL storage bindings
graphics
cairo
sdl
opengl
xml
haxml
crypto
processors and systems stuff
Techniques
generics
uniplate
syb
metaprogramming
Template Haskell
data structures
designing data structures
EDSLs
designing EDSLs
memoization
designing with monads
Tools
ThreadScope
Advanced FFI tools
c2hs
using Haskell from C
Tuning the runtime, esp. GC flags
I would love to see:
Cabal & Hoogle & Haddock (best practices for the daily code - build - test - deploy workflow)
Available data structures and their (real-world) usage, performance and space characteristics
Data Visualization
Best practices for imperative programming
Yesod & Snap
More on Database Connectivity (SQL and NoSQL)
More on Network Programming
The "More on..." might be better placed in a "Haskell Cookbook" though.
These are less "real worldy", but I'd like to see helpful introductions (and possible Real World applications?) to
Djinn
Template Haskell
Arrows
I've been meaning to ask this exact same question! I would buy RWH vol. 2 if it contained the items in the list so far. I would also like to see real-world examples for (in no particular order)
GADTs
type families
techniques for heterogeneous lists
Typeclassopedia style presentation of standard typeclasses
a fuller explanation of Edward Yang's Type Technology Tree
records / lenses
I would love to see an "RWH approach" to functional reactive programming - a RWH version of this, maybe covering Yampa or something similar. But maybe this topic is not quite "real-worldy" enough (yet)...
I am fairly new to Haskell and have only read a few chapters of this book and of Programming in Haskell by Graham Hutton.
However, I have to agree with Alexander in the sense that I would love to see a "Haskell Cookbook" as well as a new, more updated version of RWH (as I have yet to finish it, the latter is not as important for me personally!).
Advice and sample code for dealing with dates, generating random numbers, and the most efficient ways to implement key algorithms (sorting, etc.) would be a great addition to any such book!

When choosing a functional programming language for use with LLVM, what are the trade-offs?

Let's assume for the moment that C++ is not a functional programming language. If you want to write a compiler using LLVM for the back-end, and you want to use a functional programming language and its bindings to LLVM to do your work, you have two choices as far as I know: Objective Caml and Haskell. If there are others, then I'd like to know about those too.
I'm not asking for subjective opinions, so please don't give this the subjective tag. I want to make up my own mind about this, but I'm not sure I know what are all the trade-offs. So, StackOverflow to the rescue. What are the trade-offs?
Either OCaml or Haskell would be a good choice. Why not check out the LLVM tutorials for each language? The LLVM tutorial for OCaml is here: http://llvm.org/docs/tutorial/OCamlLangImpl1.html
Haskell has more momentum these days, but there are plenty of good parsing libraries for OCaml as well, including the PEG parser generator Aurochs, Menhir, and the GLR parser generator Dypgen. Also check out this presentation on pcl, a monadic parser combinator library for OCaml (like Parsec for Haskell); there's some good info in there comparing Haskell's and OCaml's approaches: http://osp.janestreet.com/files/pcl.pdf
Some will say that laziness gives Haskell the edge in parsing, but you can get laziness in OCaml as well.
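To give a flavor of the combinator style on the Haskell side, a minimal Parsec sketch (the grammar - a comma-separated list of integers - is made up for illustration):

    import Text.Parsec
    import Text.Parsec.String (Parser)

    -- Parse a comma-separated list of integers, e.g. "1, 2, 3".
    intList :: Parser [Int]
    intList = fmap read (many1 digit) `sepBy` (char ',' >> spaces)

    main :: IO ()
    main = print (parse intList "" "1, 2, 3")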
Haskell has higher level bindings to LLVM than OCaml (the Haskell ones provide some interesting type safety guarantees) and Haskell has by far more libraries to use (1700 packages on http://hackage.haskell.org) making it easier to glue together components.
Availability of native bindings need not constrain your choice of language. There is a third option, apart from using bindings or generating IR text directly:
You can use a language-neutral serialization format, such as Google's Protocol Buffers, to serve as the bridge from your front-end to your back-end. Protocol buffers are, after all, just ASTs in disguise.
Your front end, implemented in a functional language, then does what it is best at -- parsing, type checking, desugaring, core-to-core transformations, etc -- and the C++ backend takes the IR from your frontend and uses LLVM's feature-complete-by-definition native C++ API to do lowering from your-language-IR to LLVM IR. This makes it much easier to handle "advanced" features of LLVM such as debug metadata.
I'm using this strategy with hprotoc and associated Haskell bindings for protocol buffers, and am very happy with the results. There is much to be said for using the right tool for the job!
OCaml is the only functional language with bindings in the LLVM distro itself and documentation on llvm.org such as the Kaleidoscope tutorial. If you have OCaml installed when you build and install LLVM then it will automatically build and install the LLVM bindings for OCaml as well. Moreover, these OCaml bindings have been in use for years so they are mature and reliable.
I have been developing HLVM in OCaml using the standard LLVM bindings and found OCaml+LLVM to be an extremely powerful combination. HLVM provides tuples, arrays, unions, TCO of all tail calls, generic printing, FFI to C, JIT compilation and parallel garbage collection with a VM weighing in at under 2kLOC of OCaml code that took only a few man-weeks to develop from scratch. HLVM's numerical performance already far exceeds that of today's fastest open source FPLs including OCaml itself. I have published articles in the OCaml Journal describing how LLVM can be used from OCaml for everything from basic expression evaluation to advanced topics such as parallelism and garbage collection. You may also like this mini example.

Looking for a functional language [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm a scientist working mostly with C++, but I would like to find a better language. I'm looking for suggestions; I'm not even sure my "dream language" exists (yet), but here's my wishlist:
IMPORTANT FEATURES (in order of importance)
1.1: Performance: For science, performance is very important. I perfectly understand the importance of productivity, not just execution speed, but when your program has to run for hours, you just can't afford to write it in Python or Ruby. It doesn't need to be as fast as C++, but it has to be reasonably close (e.g.: Fortran, Java, C#, OCaml...).
1.2: High-level and elegant: I would like to be able to concentrate as much as possible on the science and get clear code. I also dislike verbose languages like Java.
1.3: Primarily functional: I like functional programming, and I think it suits both my style and scientific programming very well. I don't care if the language supports imperative programming - it might be a plus - but it has to focus on and encourage functional programming.
1.4: Portability: Should work well on Linux (especially Linux!), Mac and Windows. And no, I do not think F# works well on Linux with Mono, and I'm not sure OCaml works well on Windows ;)
1.5: Object-oriented, preferably under the "everything is an object" philosophy: I realized how much I liked object-oriented programming when I had to deal with pure C not so long ago. I like languages with a strong commitment to object-oriented programming, not just timid support.
NOT REALLY IMPORTANT, BUT THINGS THAT WOULD BE NICE
2.1: "Not-too-strong" typing: I find Haskell's strong typing system to be annoying, I like to be able to do some implicit casting.
2.2: Tools: Good tools are always a plus, but I guess it really depends on the language. I played with Haskell using Geany, a lightweight editor, and I never felt handicapped. On the other hand, I wouldn't have done the same with Java or even Scala (Scala, in particular, seems to be lacking good tools, which is a real shame). Java is really the #1 language here; with NetBeans and Javadoc, programming in Java is easy and fun.
2.3: Garbage collected, and translated or compiled without a virtual machine. I have nothing against virtual machines, but the two giants in the domain have their problems. On paper the .NET framework seems much better, and especially suited to functional programming, but in practice it's still very Windows-centric and the support for Linux/macOS is not as good as it should be, so it's not really worth considering. Java is now a mature VM, but it annoys me on some levels: I dislike the way it deals with executables and generics, and it produces terrible GUIs (although these things aren't so bad).
In my mind there are three viable candidates: Haskell, Standard ML, OCaml. (Scala is out on the grounds that it compiles to JVM bytecode and is therefore unlikely to be fast enough when programs must run for days.)
All are primarily functional. I will comment where I have knowledge.
Performant
OCaml gives the most stable performance for all situations, but performance is hard to improve. What you get is what you get :-)
Haskell has the best parallel performance and can get excellent use out of an 8-core or 16-core machine. If your future is parallel, I urge you to master your dislike of the type system and learn to use Haskell effectively, including the Data Parallel Haskell extensions.
The down side of Haskell performance is that it can be quite difficult to predict the space and time required to evaluate a lazy functional program. There are excellent profiling tools, but still significant effort may be required.
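To show what the semi-explicit parallel style looks like, a minimal sketch using the parallel package (the workload is made up; compile with ghc -threaded and run with +RTS -N):

    import Control.Parallel (par, pseq)

    -- Spark the evaluation of `a` on another core while this thread
    -- evaluates `b`, then combine the results.
    main :: IO ()
    main = print (a `par` (b `pseq` (a + b)))
      where
        a = sum [1 .. 10000000 :: Integer]
        b = sum [10000001 .. 20000000]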
Standard ML with the MLton compiler gives excellent performance. MLton is a whole-program compiler and does a very good job.
High-level and elegant
Syntactically, Haskell is the clear winner. The type system, however, is cluttered with the remains of recent experiments, although its core is high-level and elegant. The "type class" mechanism is particularly powerful.
Standard ML has ugly syntax but a very clean type system and semantics.
OCaml is the least elegant, both from a point of view of syntax and from the type system. The remains of past experiments are more obtrusive than in Haskell. Also, the standard libraries do not support functional programming as well as you might expect.
Primarily functional
Haskell is purely functional; Standard ML is very functional; OCaml is mostly functional (but watch out for mutable strings and for some surprising omissions in the libraries; for example, the list functions are not safe for long lists).
Portability
All three work very well on Linux. The Haskell developers use Windows and it is well supported (though it causes them agony). I know OCaml runs well on OSX because I use an app written in OCaml that has been ported to OSX. But I'm poorly informed here.
Object-oriented
Not to be found in Haskell or SML. OCaml has a bog-standard OO system grafted onto the core language, not well integrated with the rest of the language.
You don't say why you are keen for object-orientation. ML functors and Haskell type classes provide some of the encapsulation and polymorphism (aka "generic programming") that are found in C++.
Type system that can be subverted
All three languages provide unsafe casts. In all three cases they are a good way to get core dumps.
I like to be able to do some implicit casting.
I think you will find Haskell's type-class system to your liking—you can get some effects that are similar to implicit casting, but safely. In particular, numeric and string literals are implicitly castable to any type you like.
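A minimal sketch of what that looks like (Data.Text comes from the text package, and the OverloadedStrings extension extends the same idea to string literals):

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Text (Text)

    -- The same literal adapts to whatever type the context demands:
    x :: Double
    x = 3          -- elaborated as fromInteger 3

    y :: Rational
    y = 0.5        -- elaborated as fromRational 0.5

    s :: Text
    s = "hello"    -- elaborated as fromString "hello"

    main :: IO ()
    main = print (x, y, s)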
Tools
There are pretty good profiling tools with Haskell. Standard ML has crappy tools. OCaml has basically standard Unix profiling plus an unusable debugger. (The debugger refuses to cross abstraction barriers, and it doesn't work on native code.)
My information may be out of date; the tools picture is changing all the time.
Garbage-collected and compiled to native code
Check. Nothing to choose between them there.
Recommendation
Overcome your aversion to safe, secure type systems. Study Haskell's type classes (the original paper by Wadler and Blott and a tutorial by Mark Jones may be illuminating). Get deeper into Haskell, and be sure to learn about the huge collection of related software at Hackage.
Try Scala. It's an object-oriented functional language that runs on the JVM, so you can access everything that was ever written in Java. It has all your important features and one of the nice-to-have ones (obviously not #2.2 :) but that will probably get better quickly). It does have very strong typing, but with type inference it doesn't really get in your way.
You just described Common Lisp...
If you like using lists for most things, and care about performance, use Haskell or OCaml. OCaml suffers significantly in that floats on the heap need to be boxed due to the runtime's design (but arrays of floats and purely-float records aren't individually boxed, which is good).
If you're willing to use arrays more than lists, or plan on programming using mutable state, use Scala rather than Haskell. If you're looking to write threaded multi-core code, use Scala or Haskell (OCaml requires you to fork).
Scala's list is polymorphic, so a list of ints is really a list of boxed Int objects. Of course you could write your own list of ints in Scala that would be as fast, but I assume you'd rather use the standard libraries. Scala does have as much tail recursion as is possible on the JVM.
OCaml fails on Vista 64 for me, I think because they just changed the linker in the latest version (3.11.1?), but earlier versions worked fine.
Scala tool support is buggy at the moment if you're using nightly builds, but should be good soon. There are eclipse and netbeans plugins. I'm using emacs instead. I've used both the eclipse and netbeans debugger GUI successfully in the past.
None of Scala, OCaml, or Haskell has a truly great standard library, but at least you can easily use Java libs in Scala. If you use MapReduce, Scala wins on integration. Haskell and OCaml have a reasonable number of third-party libs. It annoys me that there are differently named combinators for 2-3 types of monad in Haskell.
http://metamatix.org/~ocaml/price-of-abstraction.html might convince you to stay with C++. It's possible to write Scala that's almost identical in performance to Java/C++, but not necessarily in a high level functional or OO style.
http://gcc.gnu.org/projects/cxx0x.html seems to suggest that auto x = ... (type inference for expressions) and lambdas are usable. C++0x with Boost, if you can stomach it, seems pretty functional. The downside to C++'s high-performance, template-abusing libraries is, of course, compile time.
Your requirements seem to me to describe OCaml quite well, except for the "not-too-strong" typing. As for tools, I use and like tuareg mode for Emacs. OCaml should run on Windows (I haven't used it there myself though), and it is pretty similar to F#, FWIW.
I'd consider the ecosystem around the language as well. In my opinion, OCaml's major drawback is that it doesn't have a huge community, and consequently lacks the large library of third-party modules that are part of what makes Python so convenient. Having to write your own code, or to modify someone else's one-shot prototype module you found on the internet, can eat up some of the time you save by writing in a nice functional language.
You can use F# on Mono; perhaps worth a look? I know that Mono isn't 100% perfect (nothing ever is), but it is very far from "terrible"; most of the gaps are in things like WCF/WPF, which I doubt you'd want to use from FP. This would seem to offer much of what you want (except, obviously, that it runs in a VM - but you gain a huge set of available libraries in the bargain, i.e. most of .NET), much more easily than the OCaml it is based on.
I would still go for Python but using NumPy or some other external module for the number crunching or alternatively do the logic in Python and the hotspots in C / assembler.
You are always giving up cycles for comfort; the more comfort, the more cycles. Thus your requirements are mutually exclusive.
I think that Common Lisp fits your description quite well.
1.1: Performance: Modern CL implementations are almost on par with C. There are also foreign function interfaces to interact with C libraries, and many bindings are already done (e.g. the GNU Scientific Library).
1.2: High-level and elegant: Yep.
1.3: Primarily functional: Yes, but you can also "get imperative" wherever the need arises; CL is "multi-paradigm".
1.4: Portability: There are several implementations with differing support for each platform. Some links are at CLiki and ALU Wiki.
1.5: Object-oriented, preferably under the "everything is an object" philosophy: CLOS, the Common Lisp Object System, is much closer to being "object oriented" than any of the "curly" languages, and also has features you will sorely miss elsewhere, like multimethods.
2.1: "Not-too-strong" typing: CL has dynamic, strong typing, which seems to be what you want.
2.2: Tools: Emacs + SLIME (the Superior Lisp Interaction Mode for Emacs) is a very nice free IDE. There is also a plugin for Eclipse (Cusp), and the commercial CL implementations often bring their own IDE.
2.3: Garbage collected, but translated or compiled without a virtual machine. The Lisp image that you will be working on is a kind of VM, but I think that's not what you mean.
A further advantage is the incremental development: you have a REPL (read-eval-print-loop) running that provides a live interface into the running image. You can compile and recompile individual functions on the fly, and inspect the current program state on the live system. You have no forced interruptions due to compiling.
Short Version: The D Programming Language
Yum Yum Yum, that is a big set of requirements.
As you probably know, object orientation, high-level semantics, performance, portability and all the rest of your requirements don't tend to fit together from a technical point of view. Let's split this into a different view:
Syntax Requirements
Object Orientated presentation
Low memory management complexity
Allows function style
Isn't Haskell (damn)
Backend Requirements
Fast for science
Garbage Collected
On this basis I would recommend the D programming language: it is a successor to C that tries to be all things to all people.
This article on D is about its functional programming aspects. It is object-oriented, garbage collected, and compiles to machine code, so it is fast!
Good Luck
Clojure and/or Scala are good candidates for the JVM.
I'm going to assume that you are familiar enough with the languages you mentioned to have ruled them out as possibilities. Given that, I don't think there is a language that fulfills all your expectations. However, there are still a few languages you could take a look at:
Clojure This really is a very nice language. Its syntax is based on LISP, and it runs on the JVM.
D This is like C++ done right. It has all the features you want except that it's kind of weak on the functional programming.
Clean This is based very heavily on Haskell, but removes some of Haskell's problems. Downsides are that it's not very mature and doesn't have a lot of libraries.
Factor Syntactically it's based on Forth, but has support for LISP-like functional programming as well as better support for classes.
Take a peek at Erlang. Originally, Erlang was intended for building fault-tolerant, highly parallel systems. It is a functional language, embracing immutability and first-class functions. It has an official Windows binary release, and the source can be compiled for many *NIX platforms (there is a MacPorts build, for example).
In terms of high-level features, Erlang supports list comprehensions, pattern matching, guard clauses, structured data, and other things you would expect. It's relatively slow at sequential computation, but pretty amazing if you're doing parallel computation. Erlang does run on a VM, but it runs on its own VM, which is part of the distribution.
Erlang, while not strictly object-oriented, does benefit from an OO mindset. Erlang uses a thing called a process as its unit of concurrency. An Erlang process is actually a lot like a native thread, except with much less overhead. Each process has a mailbox, will be sent messages, and will process those messages. It's easy enough to treat processes as if they were objects.
I don't know if it has much in the way of scientific libraries. It might not be a good fit for your needs, but it's a cool language that few people seem to know about.
Are you sure that you really need a functional language? I did most of my programming in Lisp, which is obviously a functional language, but I have found that functional programming is more of a mind-set than a language feature. I'm using VB right now, which I think is an excellent language (speed, support, IDE), and I basically use the same programming style that I did in Lisp - functions call other functions that call other functions - and functions are usually 1-5 lines long.
I do know that Lisp has good performance and runs on all platforms, but it is somewhat outdated in its support for features such as graphics, multi-threading, etc.
I've taken a look at Clojure, but if you don't like Java you probably won't like Clojure. It's a functional Lisp-style language implemented on top of Java, but you'll probably find yourself using Java libraries all the time, which adds the verbosity of Java. I like Lisp, but I didn't like Clojure despite the hype.
Are you also sure about your performance requirements? Matlab is an excellent language for a lot of scientific computation, but it is kind of slow and I hate reading it. You might find it useful, though, especially in conjunction with other languages, for prototypes/scenarios/subunits.
Many of your requirements are based on hearsay. One example: the idea that Mono is "terrible".
http://banshee-project.org/
That's the official media player of many Linux distributions. It's written in C#. (They don't even have a public Windows release of it!)
Your assertions about the relative performance of various languages are equally dubious. And requiring a language to not use a virtual machine is quite unrealistic and totally undesirable. Even an OS is a form of VM on which applications run, which virtualises the hardware devices of the machine.
Though you earn points for mentioning tools (although not with enough priority). As Knuth observed, the first question to ask about a language is "What's the debugger like?"
Looking over your requirements, I would recommend VB on either Mono, or a virtual machine running windows. As a previous poster said, the first thing to ask about a language is "What is the debugger like" and VB/C# have the best debugger. Just a result of all those Microsoft employees hammering on the debugger, and having the teams nearby to bug (no pun intended) into fixing it.
The best thing about VB and C# is the large set of developer tools, community, Google help, code examples, libraries, software that interfaces with them, etc. I've used a wide variety of software development environments over the past 27 years, and the only things that come close are the Xerox Lisp machine environments (better) and the Symbolics Lisp machines (worse).

Should I learn Haskell or F# if I already know OCaml? [closed]

I am wondering if I should continue to learn OCaml or switch to F# or Haskell.
Here are the criteria I am most interested in:
Longevity
Which language will last longer? I don't want to learn something that might be abandoned in a couple years by users and developers.
Will Inria, Microsoft, University of Glasgow continue to support their respective compilers for the long run?
Practicality
Articles like this make me afraid to use Haskell. A hash table is the best structure for fast retrieval, yet Haskell proponents there suggest using Data.Map, which is a binary tree.
I don't like being tied to a bulky .NET framework unless the benefits are large.
I want to be able to develop more than just parsers and math programs.
Well Designed
I like my languages to be consistent.
Please support your opinion with logical arguments and citations from articles. Thank you.
Longevity
Haskell is de facto the dominant language of functional-programming research. Haskell 98 will last for many more years in stable form, and something called Haskell may last 10 to 30 years---although the language will continue to evolve. The community has a major investment in Haskell and even if the main GHC developers are hit by a bus tomorrow (the famous "bus error in Cambridge" problem), there are plenty of others who can step up to the plate. There are also other, less elaborate compilers.
Caml is controlled by a small group at INRIA, the French national laboratory. They also have a significant investment in it. Others are invested in Caml as well, the code is open source, and the compiler is not too complicated, so it too will be maintained for a long time. I predict Caml will be much more stable than Haskell, as the INRIA folks appear no longer to be using it as a vehicle for exploring new language ideas (or at least they are doing so at a smaller rate than in the past).
Who knows what a company will do? If F# is successful, Microsoft could support it for 20 years. If it is not successful, they could pull the plug in 2012. I can't guess and won't try.
Practicality
A hash table is the best structure for fast retrieval. Haskell proponents in there suggest using Data.Map which is a binary tree.
It depends on what you are searching. When your keys are strings, ternary search trees are often faster than hash tables. When your keys are integers, Okasaki and Gill's binary Patricia trees are competitive with hashing. If you really want to, you can build a hash table in Haskell using the IO monad, but it's rare to need to.
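If you do want one, here is a minimal sketch of a mutable hash table in the IO monad, using the hashtables package (which postdates this answer; the example data is made up):

    import qualified Data.HashTable.IO as H

    type HashTable k v = H.BasicHashTable k v

    main :: IO ()
    main = do
      -- A mutable hash table lives in IO, unlike the pure Data.Map.
      ht <- H.new :: IO (HashTable String Int)
      H.insert ht "one" 1
      H.insert ht "two" 2
      H.lookup ht "two" >>= print   -- prints: Just 2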
I think there will always be a performance penalty for lazy evaluation. But "practical" is not the same as "as fast as possible". The following are true about performance:
It is easiest to predict the time and space behavior of a Caml program.
F# is in the middle (who really knows what .NET and the JIT will do?).
It is hardest to predict the time and space behavior of Haskell programs.
Haskell has the best profiling tools, and in the long run, this is what yields the best performance.
I want to be able to develop more than just parsers and math programs.
For an idea of the range of what's possible in Haskell, check out the xmonad window manager and the vast array of packages at hackage.haskell.org.
I don't like being tied to a bulky .NET framework unless the benefits are large.
I can't comment.
Well Designed
I like my languages to be consistent.
Some points on which to evaluate consistency:
Haskell's concrete syntax is extremely well designed; I'm continually impressed at the good job done by the Haskell committee. OCaml syntax is OK but suffers by comparison. F# started from Caml core syntax and has many similarities.
Haskell and OCaml both have very consistent stories about operator overloading. Haskell has a consistent and powerful mechanism you can extend yourself (see the sketch after these points). OCaml has no overloading of any kind.
OCaml has the simplest type system, especially if you don't write objects and functors (which many Caml programmers don't, although it seems crazy to me not to write functors if you're writing ML). Haskell's type system is ambitious and powerful, but it is continually being improved, which means there is some inconsistency as a result of history. F# essentially uses the .NET type system, plus ML-like Hindley-Milner polymorphism (See question "What is Hindley-Milner".)
OCaml is not quite consistent on whether it thinks variants should be statically typed or dynamically typed, so it provides both ("algebraic data types" and "polymorphic variants"). The resulting language has a lot of expressive power, which is great for experts, but which construct to use is not always obvious to the amateur.
OCaml's order of evaluation is officially undefined, which is a poor design choice in a language with side effects. Worse, the implementations are inconsistent: the bytecode virtual machine uses one order and the native-code compiler uses the other.
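To illustrate the Haskell overloading point, a minimal sketch of extending it yourself (the V2 type is made up): give a user-defined type a Num instance and +, *, negate, and even integer literals work on it.

    data V2 = V2 Double Double deriving Show

    instance Num V2 where
      V2 a b + V2 c d = V2 (a + c) (b + d)
      V2 a b * V2 c d = V2 (a * c) (b * d)
      negate (V2 a b) = V2 (negate a) (negate b)
      abs    (V2 a b) = V2 (abs a) (abs b)
      signum (V2 a b) = V2 (signum a) (signum b)
      fromInteger n   = V2 (fromInteger n) (fromInteger n)

    main :: IO ()
    main = print (V2 1 2 + 3)   -- the literal 3 becomes V2 3.0 3.0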
Should you learn F# or Haskell if you know OCaml?
I believe the answer is certainly yes; ideally you should learn all three languages, because each one has something to offer. But F# is the only one with a significant future, so if you can only feasibly learn one language, learn F# by reading my Visual F# 2010 for Technical Computing book or subscribing to our The F#.NET Journal.
Longevity
Microsoft committed to supporting F# when they released it as part of Visual Studio 2010 in April. So F# is guaranteed a rosy future for at least a few years. With a powerful combination of practically-important features like a high performance native-code REPL, high-level constructs for parallelism built-in to .NET 4 and a production-quality IDE mode, F# is a long way ahead of any other functional programming language in terms of real world applicability now. Frankly, nobody is even working on anything that might be able to compete with F# in the near future. My own open source HLVM project is an attempt to do so but it is far from ready.
In contrast, both OCaml and Haskell are being developed in extremely unproductive directions. This has been killing OCaml for several years now and I expect Haskell to follow suit over the next few years. Most former professional OCaml and Haskell programmers already moved on to F# (e.g. Credit Suisse, Flying Frog Consultancy) and most of the rest will doubtless migrate to more practical alternatives such as Clojure and Scala in the near future.
Specifically, OCaml's QPL license prevents anyone else from fixing its growing number of fundamental design flaws (16Mb string and array limits on 32-bit machines, no shared-memory parallelism, no value types, parametric polymorphism via type erasure, interpreted REPL, cumbersome FFI etc.) because they must distribute derivative works only in the form of patches to the original and the Debian package maintainers refuse to acknowledge an alternative upstream. The new features being added to the language, such as first-class modules in OCaml 3.12, are nowhere near as valuable as multicore capability would have been.
Some projects were started in an attempt to save OCaml but they proved to be too little too late. The parallel GC is practically useless and David Teller quit the batteries included project (although it has been picked up and released in a cut-down form). Consequently, OCaml has gone from being the most popular functional language in 2007 to severe decline today, with caml-list traffic down over 50% since 2007.
Haskell has fewer industrial users than OCaml and, although it does have multicore support, it is still being developed in a very unproductive direction. Haskell is developed almost entirely by two people at Microsoft Research in Cambridge (UK). Despite the fact that purely functional programming is bad for performance by design, they are continuing to try to develop solutions for parallel Haskell aimed at multicores, even though the massive amounts of unnecessary copying it incurs hit the memory wall and destroy any hope of scalable parallelism on a multicore.
The only major user of Haskell in industry is Galois with around 30 full-time Haskell programmers. I doubt they will let Haskell die completely but that does not mean they will develop it into a more generally-useful language.
Practicality
I wrote the article you cited about hash tables. They are a good data structure. Other people have referred to purely functional alternatives like ternary trees and Patricia trees, but these are usually ~10× slower than hash tables in practice. The reason is simply that cache misses dominate performance concerns today, and trees incur O(log n) extra pointer indirections.
My personal preference is for optional laziness and optional purity, because both are generally counterproductive in the real world (e.g. laziness makes performance and memory consumption wildly unpredictable, and purity severely degrades average-case performance and makes interoperability a nightmare). I am one of the only people earning a living entirely from functional programming through my own company. Suffice to say, if I thought Haskell were viable I would have diversified into it years ago, but I keep choosing not to because I do not believe it is commercially viable.
You said "I don't like being tied to a bulky .NET framework unless the benefits are large". The benefits are huge. You get a production-quality IDE, a production-quality JIT compiler that performs hugely-effective optimizations like type-specializing generics, production-quality libraries for everything from GUI programming (see Game of Life in 32 lines of F#) to number crunching. But the real benefit of .NET, at least for me, is that you can sell the libraries that you write in F# and earn lots of money. Nobody has ever succeeded selling libraries to OCaml and Haskell programmers (and I am one of the few people to have tried) but F# libraries already sell in significant quantities. So the bulky .NET framework is well worth it if you want to earn a living by writing software.
Well designed
These languages are all well designed but for different purposes. OCaml is specifically designed for writing theorem provers and Haskell is specifically designed for researching Haskell. F# was designed to address all of the most serious practical problems with OCaml and Haskell such as poor interoperability, lack of concurrent garbage collection and lack of mature modern libraries like WPF in order to bring a productive modern language to a large audience.
This wasn't one of your criteria, but have you considered job availability? Haskell currently lists 144 jobs on Indeed, OCaml lists 12, and C# lists 26,000. These numbers are not perfect, but I bet you that once F# ships it won't be long before it blows past Haskell and OCaml in the number of job listings.
So far, every programming language included in Visual Studio has thousands of job listings for it. It seems to me that if you want the best chance to use a functional programming language as your day job, then F# will soon be it.
Longevity
No one can predict the future, but
OCaml and Haskell have been surviving well for a number of years, which bodes well for their future
when F# ships with VS2010, MS will have legal obligations to support it for at least 5 years
Practicality
Perf: I don't have enough first-hand experience with Haskell, but based on second- and third-hand info, I think OCaml or F# are more pragmatic, in the sense that I think it is unlikely you'll be able to get the same run-time perf in Haskell that you do in OCaml or F#.
Libraries: Easy access to the .Net Framework is a huge benefit of F#. You can view it as being "tied to this bulky thing" if you like, but don't forget that "you have access to a huge bulky library of often incredibly useful stuff". The 'connectivity' to .Net is one of the big selling points for F#. F# is younger and so has fewer third-party libraries, but there is already e.g. FsCheck, FParsec, Fake, and a bunch of others, in addition to the libraries "in the box" on .Net.
Tooling: I don't have enough personal experience to compare, but I think the VS integration with F# is superior to anything you'll find for OCaml/Haskell today (and F# will continue to improve a bit here over the next year).
Change:
F# is still changing as it approaches its first supported release in VS2010, so there are some breaking changes to language/library you may have to endure in the near future.
Well Designed
Haskell is definitely beautiful and consistent. I don't know enough OCaml, but my hunch is that it is similarly attractive. I think that F# is 'bigger' than either of those, which means more dusty corners and inconsistencies (largely as a result of mediating the impedance mismatch between FP and .NET), but overall F# still feels 'clean' to me, and the inconsistencies that do exist are at least well reasoned/intentioned.
Overall
In my opinion you will be in 'good shape' knowing any of these three languages well. If you know a big long-term project you want to use it for, one may stand out, but I think many of the skills will be transferable (more easily between F# and OCaml than to/from Haskell, but also more easily among any of these three than with, say, Java).
There's no simple answer to that question, but here are some things to consider:
Haskell and OCaml are both mature languages with strong implementations. Actually, there are multiple good implementations of Haskell, but I don't think that's a major point in its favor for your purpose.
F# is much younger, and who can predict where Microsoft will decide to take it? How you feel about that depends more on how you feel about Microsoft than anything anyone can tell you about programming languages.
OCaml (or ML in general), is a good practical language choice that supports doing cool functional stuff without forcing you to work in a way that might be uncomfortable. You get the full benefit of things like algebraic data types, pattern matching, type inference, and everybody else's favorite stuff. Oh, and objects.
Haskell gives you all that (except objects, pretty much), but also more or less forces you to rethink everything you think you know about programming. This might be a very good thing, if you're looking to learn something new, but it might be more than you want to bite off. I say this as someone who is only maybe halfway along the path to being a productive, happy Haskell programmer.
Both OCaml and Haskell are being used to write lots of different kinds of programs, not just compilers and AI or whatever. Google is your friend.
One last note: OCaml gives you a hash table, but it's hardly sensible to use it if you really want to embrace functional programming. Persistent trees (like Data.Map) are really the right solution for Haskell, and they have lots of nice properties, which is one of the cool things to learn about when you pick up Haskell.
F# and OCaml are very similar in syntax, though, obviously, F# is better with .NET.
Which one you learn or use should be dependent on which platform you are aiming for.
In VS2010, F# is going to be included, and since it compiles to .NET bytecode, it can be used on any Windows OS that supports the .NET version you built against. This gives you a large reach, but there are currently limits with F# that OCaml doesn't have, in that F# appears not to take advantage of all the processors on a machine. That is probably because F# is still being developed, and this may be a feature that isn't as important yet.
There are other functional languages, such as Erlang that you could look at, but, basically, if you are strong in one FP language then you should be able to pick up another fairly quickly, so, just pick one that you like and try to develop interesting and challenging applications in it.
Eventually language designers will find a way to get OO languages to work well with multiple cores, and FP may fall by the wayside again, but that doesn't appear to be happening anytime soon.
This is not directly related to the OP's question as to whether or not to learn F#, but rather, an example of real world OCaml usage in the financial sector:
http://ocaml.janestreet.com/?q=node/61
Very interesting talk.
In terms of longevity, it's very difficult to judge the popularity of languages, but just doing a quick check on here, these are the numbers of questions tagged with the appropriate (functional) language:
2672 Scala
1936 Haskell
1674 F#
1126 Clojure
709 Scheme
332 OCaml
I'd say this was a good indication of which languages people are actively learning at the moment and therefore might be a good indication of which ones might be popular in the next few years.
