Which languages are dynamically typed and compiled (and which are statically typed and interpreted)? - programming-languages

In my reading on dynamic and static typing, I keep coming up against the assumption that statically typed languages are compiled, while dynamically typed languages are interpreted. I know that in general this is true, but I'm interested in the exceptions.
I'd really like someone to not only give some examples of these exceptions, but try to explain why it was decided that these languages should work in this way.

Here's a list of a few interesting systems. It is not exhaustive!
Dynamically typed and compiled
The Gambit Scheme compiler, Chez Scheme, Will Clinger's Larceny Scheme compiler, the Bigloo Scheme compiler, and probably many others.
Why?
Lots of people really like Scheme. Programs as data, good macro system, 35 years of development, big community. But they want performance. Hence, a number of good native-code compilers—Chez Scheme is even a successful commercial product (interpreted bytecodes are free; native codes you pay for).
The LuaJIT just-in-time compiler for Lua.
Why?
To show it could be done. And then, people started to like getting 3x speedup on their Lua programs. Lua is in a lot of games, where performance matters, plus it's creeping into other products too. 70% of the code in Adobe Lightroom is Lua.
The iconc Icon-to-C compiler.
Why?
The fifty people who used it loved Icon. Totally unusual evaluation model, the most innovative (and in my opinion, best) string-processing system ever designed. But that evaluation model was really expensive, especially on late-1980s computers. By compiling Icon to C, the Icon Project made it possible for big Icon programs to run in fewer hours.
Conclusion: people first develop an attachment to a dynamically typed language, and probably a significant code base. Eventually, the community spits out a native-code compiler so that you can get better performance and solve bigger problems.
Statically Typed and Interpreted
This category is less common, but...
Objective Caml. Dialect of ML, vehicle for lots of innovative experiments in language design.
Why?
Very portable system and very fast compilation times. People like both properties, so the new language-design ideas are desseminated widely.
Moscow ML. Standard ML with a few extra features of the modules system.
Why?
Portable, fast compilation times, easy to make an interactive read/eval/print loop. Became a popular teaching compiler.
C-Terp. An old product, I think maybe from Gimpel Software. Saber C—a product I don't think you can buy any more.
Why?
Debugging. Especially, debugging on 1980s hardware under MS-DOS. For very little resources, you could get really good help debugging C code on very limited hardware (think: 4.77MHz processor with an 8-bit bus, 640K of RAM fully loaded). Nearly impossible to get a good visual debugger for native-compiled code, but with the interpreter, fairly easy.
UCSD Pascal—the system that made "P-code" a household word.
Why?
Teachers liked Niklaus Wirth's language design, and the compiler could run on very small machines. Wirth's clean design and the UCSD P-system made an unbeatable combination, and Pascal was the standard teaching language of the 1970s. Younger people may find it hard to appreciate that in the 1970s there was no debate over what language to teach in the first course. Today I know of programs using C, C++, Haskell, Java, ML, and Scheme. In the 1970s it was always Pascal, and the UCSD P-system was a big reason way.
In case you are wondering, P stood for portable.
Summary: Interpreting a statically typed language is a great way to get an implementation into everybody's hands quickly. (It also had advantages for debugging on Bronze Age hardware.)

Objective-C is compiled and supports dynamic typing (certainly when calling methods via [target doSomething] syntax). That is, you can send any message to a target (using ordinary language syntax, without programming against a reflection API), receive only a warning at compile time that it might not be handled, and receive an exception only at runtime if the target doesn't respond to that selector (which is like a method signature); and you can ask any object (which can all be of static type id if your code doesn't know any better or doesn't care) whether it respondsToSelector: to probe its capabilities.

Java (a statically typed language) is compiled to JVM bytecode, which was interpreted on older versions of the JVM, whereas it now uses Just In Time (JIT) compilation, meaning machine code is generated at runtime. I also believe ML and its dialects can be interpreted, and ML is definitely statically typed.

Python is a dynamic language that has compilers.
See this SO question - Python - why compile?, for instance.
In general, compiling makes the program run much faster.

Actionscript has dynamic typing and compiles to bytecode.
And it even compiles right down to native machine code if you want to release a Flash app on the iPhone.

Related

Why do you use less expressive languages, and should I also?

I'm a Python programmer who knows a bit of Ruby and PHP as well. I don't really know enough about Java to do anything meaningful, and I certainly don't know C, C++, or other low-level languages. I've heard all the "Who cares about speed because hardware is cheap, but coders are expensive" arguments, and I'm not trying to raise a debate here. I want to understand 2 things about the community of lower level programming languages (whether it's C or even assembly):
What is the main reason you choose to still use it (job requirements, speed, desktop vs web, etc)?
Is it still worth me taking the time to learn C++ (monetarily speaking) or others this late in the game, or will I benefit?
Also, consider the benefits / disadvantages of dynamic vs static typing, when choosing your reasons. I primarily program for the web, but don't take that fully into consideration because it's partly due to the fact that the web is all I know.
Unashamed Fortran programmer here. Speed speed speed. Oh, and all the scientists I work with are reasonably fluent in Fortran. But then I work in computational electromagnetics on large clusters and supercomputers and wouldn't recognise a web application if it leapt up and bit me on the nose.
I wouldn't regard Fortran as low-level or less-expressive, I think it's a DSL for linear-algebra and more general number-crunching. So why did I take the bait and respond to this question ?
Dynamic vs static typing ? Static please, everything tied down at compile time so the compiler can work its socks off optimising.
For a web programmer, C/C++ will offer you virtually no advantage. It is less expressive than Perl, Ruby, Python, etc. and requires more code and attention to detail of memory management. Unfortunately, choosing a language for its "features" is often second to choosing for its platform. C++ isn't as clean and elegant as C#, most of that comes from its C compatibility. Sadly, even though there are better languages for certain things, most aren't compiled and most aren't widely supported.
If you plan to develop a commercial product that the customer will download or receive on CD, then C/C++ offers you protection of your Intellectual Property (hard to reverse engineer), and a small runtime footprint, as well as ability to target older platforms like Windows XP.
It is not too late in the game to learn C/C++. C/C++ will be around as long as all the higher level languages are around, because those languages are implemented in C/C++. It is not as if we will all move to Python one day and C/C++ will be retired. High level, non-compiled languages are not self-hoisting, so they cannot exist without C++.
It is the tool to use if you are going to implement higher level things like languages, APIs, toolkits, drivers, IDEs, etc. But C++ is not the tool to use if you want the fastest way to develop an internal GUI app or a Web app.
Just learn the tool for the job. If the job changes, or you wish to change the job, then you may want to push yourself to learn C++ to see the other side of the Computer Science world, the side between the CPU and what you currently write your web apps in.
I'd think the big three reasons here are going to be Performance, Legacy System Support, and Embedded Development.
So mega subjective it renders any answer next to useless.
For the numerical calculations I do writing straight C++ is just faster than using e.g. straight Python. Sure, I can interface my C++ libraries to higher-level languages (and I do), but since most of my work is done on the low-level numerical side, I wouldn't gain too much.
Also consider that a lot of libraries especially in scientific computation are in FORTRAN, C or C++ and linking against them from C++ is much faster (especially if you just want to get done with it) that creating wrappers and interfaces all yourself.
If you would gain anything from learning a low-level language depends a lot on your problem domain.
There are several good reasons for lower-level languages.
First, there's a lot of applications where performance does matter. The main application I work on is slow enough as it is (it does a whole lot of things), and would be unusable in Python.
Second, there are applications that require standalone executables that aren't all that big.
Third, there's a tremendous amount of legacy code in C and C++, and it isn't going away any time soon.
Fourth, operating systems are normally written in C or C++ or similar languages, and expose APIs in them. If you need to get chummy with the OS, for whatever reason, you're better off using the OS language.
Dynamic typing is very clearly better for getting an application up and running fast, and my Lisp background pushed me to believe that static typing is normally just premature optimization. However, lots of people believe that static typing is much better for enforcing correctness in large projects, and C and C++ are well suited for large projects.
For your second question, I have no idea what you want to do in the future, so I don't know if it would be worthwhile for you to learn C++. For professional development, I'd strongly suggest learning a variety of languages, including C or a similar language. There are other questions on SO about what languages to learn.
Why aren't Python, Ruby and PHP written in itself?
Mission critical applications need best possible performance and algorithms and doesn't need things like metaprogramming.
C++ has some great ideas and libraries, which I then found in those modern languages, sometimes better sometimes worse (compare power of templates in C++ and generics in Java).
Low level languages will make you learn something more about lower level abstractions of the computer, operating systems or networks.
Static restricts you in programming and often makes you write out the types, however it also let's you better express your intent in a way, which is automatically provable by compiler and you can get support for tools. Dynamic gives you free hands, you can do more and maybe easier, however you have to test better.
I may be misunderstanding you here but it sounds like you're under the impression that C++ is the appropriate language for every kind of project. It isn't. You wouldn't use a Liebherr truck for a cross country road trip. Every popular language in existence got that way because it works well for some situations. There was a time when C++ was used to write web applications which quickly gave way to Perl scripts because the trade-off between production and performance. So in response to your first question, mainly people still use it because in certain situations it's the best tool for the job.
As far as whether or not you should learn it, I say if you have the time and the desire go for it. Even if C++ is never the correct tool for any of the projects you take on a decent understanding of concepts that are necessary in C++ will make you a better developer in every language.

How to create a language these days?

I need to get around to writing that programming language I've been meaning to write. How do you kids do it these days? I've been out of the loop for over a decade; are you doing it any differently now than we did back in the pre-internet, pre-windows days? You know, back when "real" coders coded in C, used the command line, and quibbled over which shell was superior?
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily) but how do you build the compiler and standard libraries and so forth? What tools do you kids use these days?
One consideration that's new since the punched card era is the existence of virtual machines already bountifully provided with "standard libraries." Targeting the JVM or the .NET CLR instead of ye olde "language walled garden" saves you a lot of bootstrapping. If you're creating a compiled language, you may also find Java byte code or MSIL an easier compile target than machine code (of course, if you're in this for the fun of creating a tight optimising compiler then you'll see this as a bug rather than a feature).
On the negative side, the idioms of the JVM or CLR may not be what you want for your language. So you may still end up building "standard libraries" just to provide idiomatic interfaces over the platform facility. (An example is that every languages and its dog seems to provide its own method for writing to the console, rather than leaving users to manually call System.out.println or Console.WriteLine.) Nevertheless, it enables an incremental development of the idiomatic libraries, and means that the more obscure libraries for which you never get round to building idiomatic interfaces are still accessible even if in an ugly way.
If you're considering an interpreted language, .NET also has support for efficient interpretation via the Dynamic Language Runtime (DLR). (I don't know if there's an equivalent for the JVM.) This should help free you up to focus on the language design without having to worry so much about the optimisation of the interpreter.
I've written two compilers now in Haskell for small domain-specific languages, and have found it to be an incredibly productive experience. The parsec library makes playing with syntax easy, and interpreters are very simple to write over a Haskell data structure. There is a description of writing a Lisp interpreter in Haskell that I found helpful.
If you are interested in a high-performance backend, I recommend LLVM. It has a concise and elegant byte-code and the best x86/amd64 generating backend you can find. There is an optional garbage collector, and some experimental backends that target the JVM and CLR.
You can write a compiler in any language that produces LLVM bytecode. If you are adventurous enough to learn Haskell but want LLVM, there are a set of Haskell-LLVM bindings.
What has changed considerably but hasn't been mentioned yet is IDE support and interoperability:
Nowadays we pretty much expect Intellisense, step-by-step execution and state inspection "right in the editor window", new types that tell the debugger how to treat them and rather helpful diagnostic messages. The old "compile .x -> .y" executable is not enough to create a language anymore. The environment is nothing to focus on first, but affects willingness to adopt.
Also, libraries have become much more powerful, noone wants to implement all that in yet another language. Try to borrow, make it easy to call existing code, and make it easy to be called by other code.
Targeting a VM - as itowlson suggested - is probably a good way to get started. If that turns out a problem, it can still be replaced by native compilers.
I'm pretty sure you do what's always been done.
Write some code, and show your results to the world.
As compared to the olden times, there are some tools to make your job easier though. Might I suggest ANTLR for parsing your language grammar?
Speaking as someone who just built a very simple assembly like language and interpreter, I'd start out with the .NET framework or similar. Nothing can beat the powerful syntax of C# + the backing of the entire .NET community when attempting to write most things. From here i designed a simple bytecode format and assembly syntax and proceeeded to write my interpreter + assembler.
Like i said, it was a very simple language.
You should not accept wimpy solutions like using the latest tools. You should bootstrap the language by writing a minimal compiler in Visual Basic for Applications or a similar language, then write all the compilation tools in your new language and then self-compile it using only the language itself.
Also, what is the proposed name of the language?
I think recently there have not been languages with ALL CAPITAL LETTER names like COBOL and FORTRAN, so I hope you will call it something like MIKELANG with all capital letters.
Not so much an implementation but a design decision which effects implementation - if you make every statement of your language have a unique parse tree without context, you'll get something that it's easy to hand-code a parser, and that doesn't require large amounts of work to provide syntax highlighting for. Similarly simple things like using a different symbol for module namespaces and object namespaces ( unlike Java which uses . for both package and class namespaces ) means you can parse the code without loading every module that it refers to.
Standard libraries - include the equivalent of everything in C99 standard libraries other than setjmp. Add whatever else you need for your domain. Work out an easy way to do this, either something like SWIG or an in-line FFI such as Ruby's [can't remember module name] and Python's ctypes.
Building as much of the language in the language is an option, but projects which start out doing either give up (rubinius moved to using C++ for parts of its standard library), or is only for research purposes (Mozilla Narcissus)
I am actually a kid, haha. I've never written an actual compiler before or designed a language, but I have finished The Red Dragon Book, so I suppose I have somewhat of an idea (I hope).
It would depend firstly on the grammar. If it's LR or LALR I suppose tools like Bison/Flex would work well. If it's more LL, I'd use Spirit, which is a component of Boost. It allows you to write the language's grammar in C++ in an EBNF-like syntax, so no muddling around with code generators; the C++ compiler compiles the grammar for you. If any of these fail, I'd write an EBNF grammar on paper, and then proceed to do some heavy recursive descent parsing, which seems to work; if C++ can be parsed pretty well using RDP (as GCC does it), then I suppose with enough unit tests and patience you could write entire compilers using RDP.
Once I have a parser running and some sort of intermediate representation, it then depends on how it runs. If it's some bytecode or native code compiler, I'll use LLVM or libJIT to process it. LLVM is more suited for general compilation, but I like the libJIT API and documentation better. Alternatively, if I'm really lazy, I'll generate C code and let GCC do the actual compilation. Another alternative, is to target an existing VM, like Parrot or the JVM or the CLR. Parrot is the VM being designed for Perl. If it's just an interpreter, I'll walk the syntax tree.
A radical alternative is to use Prolog, which has syntax features which remarkably simulate EBNF. I have no experience with it though, and if I am not wrong (which I am almost certainly going to be), Prolog would be quite slow if used to parse heavy duty programming languages with a lot of syntactical constructs and quirks (read: C++ and Perl).
All this I'll do in C++, if only because I am more used to writing in it than C. I'd stay away from Java/Python or anything of that sort for the actual production code (writing compilers in C/C++ help to make it portable), but I could see myself using them as a prototyping language, especially Python, which I am partial towards. Of course, I've never actually done any of this before, so I'm not one to say.
On lambda-the-ultimate there's a link to Create Your Own Programming Language by Marc-André Cournoyer, which appears to describe how to leverage some modern tools for creating little languages.
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily)
Just a hint: Look at some quite different languages first, before designing a new languge (i.e. languages with a very different evaluation strategy). Haskell and Oz come to mind. Though you should also know Prolog and Scheme. A year ago I also was like "hey, let's design a language that behaves exactly as I want", but fortunatly I looked at those other languages first (or you could also say unfortunatly, because now I don't know how I want a language to behave anymore...).
Before you start creating a language you should read this:
Hanspeter Moessenboeck, The Art of Niklaus Wirth
ftp://ftp.ssw.uni-linz.ac.at/pub/Papers/Moe00b.pdf
There's a big shortcut to implementing a language that I don't see in the other answers here. If you use one of Lukasiewicz's "unparenthesized" forms (ie. Forward Polish or Reverse Polish) you don't need a parser at all! With reverse polish, the dependencies go right-to-left so you simply execute each token as it's scanned. With forward polish, it's the reverse of that, so you actually execute the program "backwards", simplifying subexpressions until reaching the starting token.
To understand why this works, you should investigate the 3 primary tree-traversal algorithms: pre-order, in-order, post-order. These three traversals are the inverse of the parsing task that a language reader (i. parser) has to perform. Only the in-order notation "requires" a recursive decent to re-construct the expression tree. With the other two, you can get away with just a stack.
This may require more "thinking' and less "implementing".
BTW, if you've already found an answer (this question is a year old), you can post that and accept it.
Real coders still code in C. Just that it's a litte sharper.
Hmmm... language design? or writing a compiler?
If you want to write a compiler, you'd use Flex + Bison. (google)
Not an easy answer, but..
You essentially want to define a set of rules written in text (tokens) and then some parser that checks these rules and assembles them into fragments.
http://www.mactech.com/articles/mactech/Vol.16/16.07/UsingFlexandBison/
People can spend years on this, The above article talks about using two tools (Flex and Bison) That can be used to turn text into code you can feed to a compiler.
First I spent a year or so to actually think how the language should look like. At the same time I helped in developing Ioke (www.ioke.org) to learn language internals.
I have chosen Objective-C as implementation platform as it's fast (enough), simple and rich language. It also provides test framework so agile approach is a go. It also has a rich standard library I can build upon.
Since my language is simple on syntactic level (no keywords, only literals, operators and messages) I could go with Ragel (http://www.complang.org/ragel/) for building scanner. It's fast as hell and simple to use.
Now I have a working object model, scanner and simple operator shuffling plus standard library bootstrap code. I can even run a simple programs - as long as they fit in one file that is :)
Of course older techniques are still common (e.g. using Flex and Bison) many newer language implementations combine the lexing and parsing phase, by using a parser based on a parsing expression grammar (PEG). This works for recursive descent parsers created using combinators, or memoizing Packrat parsers. Many compilers are built using the Antlr framework also.
Use bison/flex which is the gnu version of yacc/lex. This book is extremely helpful.
The reason to use bison is it catches any conflicts in the language. I used it and it made my life many years easier (ok so i'm on my 2nd year but the first 6months was a few years ago writing it in C++ and the parsing/conflicts/results were terrible! :(.)
If you want to write a compiler obviously you need to read the Dragon Book ;)
Here is another good book that I have just read. It is practical and easier to understand than the Dragon Book:
http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=language+implementation+patterns&x=0&y=0
Mike --
If you're interested in an efficient native-code-generating compiler for Windows so you can get your bearings -- without wading through all the unnecessary widgets, gadgets, and other nonsense that clutter today's machines -- I recommend the Osmosian Order's Plain English development system. It includes a unique interface, a simplified file manager, a friendly text editor, a handy hexadecimal dumper, the compiler/linker (of course), and a wysiwyg page-layout application for documentation. Written entirely in Plain English, it is a quick download (less than a megabyte), small enough to understand in short order (about 25,000 lines of Plain English code, with just 4,000 in the compiler/linker), yet powerful enough to reproduce itself on a bottom-of-the-line Dell in less than three seconds. Really: three seconds. And it's free to all who write and ask for a copy, including the source code and and a rather humorous tongue-in-cheek 100-page manual. See www.osmosian.com for details on how to get a copy, or write to me directly with questions or comments: Gerry.Rzeppa#pobox.com

Looking for a functional language [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm a scientist working mostly with C++, but I would like to find a better language. I'm looking for suggestions, I'm not even sure my "dream language" exist (yet), but here's my wishlist;
IMPORTANT FEATURES (in order of importance)
1.1: Performance: For science, performance is very important. I perfectly understand the importance of productivity, not just execution speed, but when your program has to run for hours, you just can't afford to write it in Python or Ruby. It doesn't need to be as fast as C++, but it has to be reasonably close (e.g.: Fortran, Java, C#, OCaml...).
1.2: High-level and elegant: I would like to be able to concentrate as most as possible on the science and get a clear code. I also dislike verbose languages like Java.
1.3: Primarely functional: I like functional programming, and I think it suits both my style and scientific programming very well. I don't care if the language supports imperative programming, it might be a plus, but it has to focus and encourage functional programming.
1.4: Portability: Should work well on Linux (especially Linux!), Mac and Windows. And no, I do not think F# works well on Linux with mono, and I'm not sure OCaml works well on windows ;)
1.5: Object-oriented, preferably under the "everything is an object" philosophy: I realized how much I liked object-oriented programming when I had to deal pure C not so long ago. I like languages with a strong commitment to object-oriented programming, not just timid support.
NOT REALLY IMPORTANT, BUT THINGS THAT WOULD BE NICE
2.1: "Not-too-strong" typing: I find Haskell's strong typing system to be annoying, I like to be able to do some implicit casting.
2.2: Tools: Good tools is always a plus, but I guess it really depends on the languages. I played with Haskell using Geany, a lightweight editor, and I never felt handicapped. On the other hand I wouldn't have done the same with Java or even Scala (Scala, in particular, seems to be lacking good tools, which is really a shame). Java is really the #1 language here, with NetBeans and Javadoc, programming with Java is easy and fun.
2.3: Garbage collected, but translated or compiled without a virtual machine. I have nothing against virtual machines, but the two giants in the domain have their problems. On paper the .net framework seems much better, and especially suited for functional programming, but in practice it's still very windows-centric and the support for Linux/MacOS is terrible not as good as it should be, so it's not really worth considering. Java is now a mature VM, but it annoys me on some levels: I dislike the ways it deals with executables, generics, and it writes terrible GUIs (although these things aren't so bad).
In my mind there are three viable candidates: Haskell, Standard ML, OCaml. (Scala is out on the grounds that it compiles to JVM codes and is therefore unlikely to be fast enough when programs must run for days.)
All are primarily functional. I will comment where I have knowledge.
Performant
OCaml gives the most stable performance for all situations, but performance is hard to improve. What you get is what you get :-)
Haskell has the best parallel performance and can get excellent use out of an 8-core or 16-core machine. If your future is parallel, I urge you to master your dislike of the type system and learn to use Haskell effectively, including the Data Parallel Haskell extensions.
The down side of Haskell performance is that it can be quite difficult to predict the space and time required to evaluate a lazy functional program. There are excellent profiling tools, but still significant effort may be required.
Standard ML with the MLton compiler gives excellent performance. MLton is a whole-program compiler and does a very good job.
High-level and elegant
Syntactically Haskell is the clear winner. The type system, however, is cluttered with the remains of recent experiments. The core of the type system is, however, high-level and elegant. The "type class" mechanism is particularly powerful.
Standard ML has ugly syntax but a very clean type system and semantics.
OCaml is the least elegant, both from a point of view of syntax and from the type system. The remains of past experiments are more obtrusive than in Haskell. Also, the standard libraries do not support functional programming as well as you might expect.
Primarily functional
Haskell is purely functional; Standard ML is very functional; OCaml is mostly functional (but watch out for mutable strings and for some surprising omissions in the libraries; for example, the list functions are not safe for long lists).
Portability
All three work very well on Linux. The Haskell developers use Windows and it is well supported (though it causes them agony). I know OCaml runs well on OSX because I use an app written in OCaml that has been ported to OSX. But I'm poorly informed here.
Object-oriented
Not to be found in Haskell or SML. OCaml has a bog-standard OO system grafted onto the core language, not well integrated with other languages.
You don't say why you are keen for object-orientation. ML functors and Haskell type classes provide some of the encapsulation and polymorphism (aka "generic programming") that are found in C++.
Type system than can be subverted
All three languages provide unsafe casts. In all three cases they are a good way to get core dumps.
I like to be able to do some implicit casting.
I think you will find Haskell's type-class system to your liking—you can get some effects that are similar to implicit casting, but safely. In particular, numeric and string literals are implicitly castable to any type you like.
Tools
There are pretty good profiling tools with Haskell. Standard ML has crappy tools. OCaml has basically standard Unix profiling plus an unusable debugger. (The debugger refuses to cross abstraction barriers, and it doesn't work on native code.)
My information may be out of date; the tools picture is changing all the time.
Garbage-collected and compiled to native code
Check. Nothing to choose from there.
Recommendation
Overcome your aversion to safe, secure type systems. Study Haskell's type classes (the original paper by Wadler and Blott and a tutorial by Mark Jones may be illuminating). Get deeper into Haskell, and be sure to learn about the huge collection of related software at Hackage.
Try Scala. It's an object-oriented functional language that runs in the JVM, so you can access everything that was ever written in Java. It has all your important features, and one of the nice to have features. (Obviously not #2.2 :) but that will probably get better quickly.) It does have very strong typing, but with type inference it doesn't really get in your way.
You just described Common Lisp...
If you like using lists for most things, and care about performance, use Haskell or Ocaml. Although Ocaml suffers significantly in that Floats on the heap need to be boxed due to the VM design (but arrays of floats and purely-float records aren't individually boxed, which is good).
If you're willing to use arrays more than lists, or plan on programming using mutable state, use Scala rather than Haskell. If you're looking to write threaded multi-core code, use Scala or Haskell (Ocaml requires you to fork).
Scala's list is polymorphic, so a list of ints is really a list of boxed Int objects. Of course you could write your own list of ints in Scala that would be as fast, but I assume you'd rather use the standard libraries. Scala does have as much tail recursion as is possible on JVM.
Ocaml fails on Vista 64 for me, I think because they just changed the linker in the latest version (3.11.1?), but earlier versions worked fine.
Scala tool support is buggy at the moment if you're using nightly builds, but should be good soon. There are eclipse and netbeans plugins. I'm using emacs instead. I've used both the eclipse and netbeans debugger GUI successfully in the past.
None of Scala, Ocaml, or Haskell, have truly great standard libraries, but at least you can easily use Java libs in Scala. If you use mapreduce, Scala wins on integration. Haskell and Ocaml have a reasonable amount of 3rd party libs. It annoys me that there are differently named combinators for 2-3 types of monad in Haskell.
http://metamatix.org/~ocaml/price-of-abstraction.html might convince you to stay with C++. It's possible to write Scala that's almost identical in performance to Java/C++, but not necessarily in a high level functional or OO style.
http://gcc.gnu.org/projects/cxx0x.html seems to suggest that auto x = ... (type inference for expressions) and lambdas are usable. C++0x with boost, if you can stomach it, seems pretty functional. The downside to C++ high performance template abusing libraries is, of course, compile time.
Your requirements seem to me to describe ocaml quite well, except for the "not-too-strong" typing. As for tools, I use and like tuareg mode for emacs. Ocaml should run on windows (I haven't used it myself though), and is pretty similar to F#, FWIW.
I'd consider the ecosystem around the language as well. In my opinion Ocaml's major drawback is that it doesn't have a huge community, and consequently lacks the large library of third-party modules that are part of what makes python so convenient. Having to write your own code or modify someone else's one-shot prototype module you found on the internet can eat up some of the time you save by writing in a nice functional language.
You can use F# on mono; perhaps worth a look? I know that mono isn't 100% perfect (nothing ever is), but it is very far from "terrible"; most of the gaps are in things like WCF/WPF, which I doubt you'd want to use from FP. This would seem to offer much of what you want (except obviously it runs in a VM - but you gain a huge set of available libraries in the bargain (i.e. most of .NET) - much more easily than OCaml which it is based on).
I would still go for Python but using NumPy or some other external module for the number crunching or alternatively do the logic in Python and the hotspots in C / assembler.
You are always giving up cycles for comfort, the more comfort the more cycles. Thus you requirements are mutual exclusive.
I think that Common Lisp fits your description quite well.
1.1: Performance: Modern CL implementations are almost on par with C. There are also foreign function interfaces to interact with C libraries, and many bindings are already done (e.g. the GNU Scientific Library).
1.2: High-level and elegant: Yep.
1.3: Primarily functional: Yes, but you can also "get imperative" wherever the need arises; CL is "multi-paradigm".
1.4: Portability: There are several implementations with differing support for each platform. Some links are at CLiki and ALU Wiki.
1.5: Object-oriented, preferably under the "everything is an object" philosophy: CLOS, the Common Lisp Object System, is much closer to being "object oriented" than any of the "curly" languages, and also has features you will sorely miss elsewhere, like multimethods.
2.1: "Not-too-strong" typing: CL has dynamic, strong typing, which seems to be what you want.
2.2: Tools: Emacs + SLIME (the Superior Lisp Interaction Mode for Emacs) is a very nice free IDE. There is also a plugin for Eclipse (Cusp), and the commercial CL implementations also oftem bring an own IDE.
2.3: Garbage collected, but translated or compiled without a virtual machine. The Lisp image that you will be working on is a kind of VM, but I think that's not what you mean.
A further advantage is the incremental development: you have a REPL (read-eval-print-loop) running that provides a live interface into the running image. You can compile and recompile individual functions on the fly, and inspect the current program state on the live system. You have no forced interruptions due to compiling.
Short Version: The D Programming Language
Yum Yum Yum, that is a big set of requirements.
As you probably know, object orientation, high-level semantics, performance, portability and all the rest of your requirements don't tend to fit together from a technical point of view. Let's split this into a different view:
Syntax Requirements
Object Orientated presentation
Low memory management complexity
Allows function style
Isn't Haskell (damn)
Backend Requirements
Fast for science
Garbage Collected
On this basis I would recommend The D programming language it is a successor to C trying to be all things to all people.
This article on D is about it's functional programming aspects. It is object-orientated, garbage collected and compiles to machine code so is fast!
Good Luck
Clojure and/or Scala are good canditates for JVM
I'm going to assume that you are familiar enough with the languages you mentioned to have ruled them out as possibilities. Given that, I don't think there is a language that fulfills all your expectations. However, there are still a few languages you could take a look at:
Clojure This really is a very nice language. It's syntax is based on LISP, and it runs on the JVM.
D This is like C++ done right. It has all the features you want except that it's kind of weak on the functional programming.
Clean This is based very heavily on Haskell, but removes some of Haskell's problems. Downsides are that it's not very mature and doesn't have a lot of libraries.
Factor Syntactically it's based on Forth, but has support for LISP-like functional programming as well as better support for classes.
Take a peek at Erlang. Originally, Erlang was intended for building fault-tolerant, highly parallel systems. It is a functional language, embracing immutability and first-class functions. It has an official Windows binary release, and the source can be compiled for many *NIX platforms (there is a MacPorts build, for example).
In terms of high-level features, Erlang support list comprehensions, pattern matching, guard clauses, structured data, and other things you would expect. It's relatively slow in sequential computation, but pretty amazing if you're doing parallel computation. Erlang does run on a VM, but it runs on its own VM, which is part of the distribution.
Erlang, while not strictly object-oriented, does benefit from an OO mindset. Erlang uses a thing called a process as its unit of concurrency. An Erlang process is actually a lot like a native thread, except with much less overhead. Each process has a mailbox, will be sent messages, and will process those messages. It's easy enough to treat processes as if they were objects.
I don't know if it has much in the way of scientific libraries. It might not be a good fit for your needs, but it's a cool language that few people seem to know about.
Are you sure that you really need a functional language? I did most of my programming in lisp, which is obviously a functional language, but I have found that functional programming is more of a mind-set than a language feature. I'm using VB right now, which I think is an excellent language (speed, support, IDE) and I basically use the same programming style that I did in lisp - functions call other functions that call other functions - and functions are usually 1-5 lines long.
I do know that Lisp has good performance, run on all platforms, but it is somewhat outdated in terms of how up to date support for features such as graphics, multi-threading etc. are.
i've taken a look at clojure but if you don't like java you probably won't like clojure. It's a functional-lisp-style language implemented on top of java - but you'll probably find yourself using java libraries all the time which adds the verbosoity of java. I like lisp but I didn't like clojure despite the hype.
Are you also sure about your performanc requirements? Matlab is an excellent language for a lot of scientific computation, but it is kind of slow and I hate reading it. You might find t useful though especially in conjunction with other languages, for prototypes/scenarios/subunits.
Many of your requirements are based on hearsay. One example: the idea that Mono is "terrible".
http://banshee-project.org/
That's the official media player of many Linux distributions. It's written in C#. (They don't even have a public Windows release of it!)
Your assertions about the relative performance of various languages are equally dubious. And requiring a language to not use a virtual machine is quite unrealistic and totally undesirable. Even an OS is a form of VM on which applications run, which virtualises the hardware devices of the machine.
Though you earn points for mentioning tools (although not with enough priority). As Knuth observed, the first question to ask about a language is "What's the debugger like?"
Looking over your requirements, I would recommend VB on either Mono, or a virtual machine running windows. As a previous poster said, the first thing to ask about a language is "What is the debugger like" and VB/C# have the best debugger. Just a result of all those Microsoft employees hammering on the debugger, and having the teams nearby to bug (no pun intended) into fixing it.
The best thing about VB and C# is the large set of developer tools, community, google help, code exapmles, libraries, softwaer that interfaces with it, etc. I've used a wide variety of software development environments over the past 27 years, and the only thing that comes close is the Xerox Lisp machine environmnets (better) and the Symbolics Lisp machines (worse).

What fast low-level languages can you recommend?

I have become interested in C-like languages for performance computing. Can you recommend some alternative programming languages which have the following attributes:
must be close to the hardware (bit fiddling, pointers or some alternative safe method like references)
no managed code (no jvm/.net languages)
has to be really fast (like C)
must be above ASM level (and yes I am interested in macro languages on top of ASM)
can be obscure, not very widespread
I am mainly interested in little-known languages.
How about Assembly language, or the D programming language?
If you don't know about it and are interested just in broadening your horizons, take a look at Forth. Reading about Forth always makes me feel C is high-level.
Well, I've always preferred C and/or C++ because there are multiple flavours (MSVC, glibc etc), it runs on many different platforms (e.g. mobile devices, Windows, linux) and devices, and it can be written cross platform (different processor architectures) and even for high end graphics (e.g. DirectX).
You get "decent" access to platform resources (conditions vary), it can be as fast as you choose to hone it, and it's a tad easier (IMHO) to write than ASM. There's also a pretty decent range of support tools and code analysis tools to make things a little easier.
Also C and C++ have been around for quite some time, so it's got (even today) an excellent and enthusiastic community!
You don't explicitly state that it can't be C in your question, so I'll go ahead and recommend C. It fulfills your three bulleted desires, and you won't have to worry about different versions of the language (like each different kind of assembler).
Forth!
Forth can be faster than machine language on some architectures. The compiled code is extremely dense, therefore, making optimal use of code caching.
assembly would be the closest to the hardware and therefore the fastest
Ada was originally designed for embedded systems (among other things).
OpenCL might be interesting. It's sort of like OpenGL shader language (a subset of C with extensions), but for general purpose parallel array computing.
You could start programming FPGAs in VHDL, Verilog, System C ...
Variations on a theme
FORTRAN is older than C, and is still one of the major players in numerical computing. Until 1990 (when the language was substantially modernized), the language didn't have any form of pointer (checked or not). This lack meant that there was no way to manage memory dynamically; it also made aliasing analysis easy for the compiler, which is one of the things that makes Fortran code fast.
ALGOL was the first structured programming language. Although it had limited success with programmers, it had a strong influence on language designers.
Ada is an imperative language with a strong type system and good modularity, which makes it good for low-level programming with strong assurance requirements (it was sponsored by the US government with military and avionics applications in mind). It was inspired by Pascal, like Modula-2 and Modula-3.
Going further from the mainstream of low-level imperative programming, there is FORTH. FORTH can be compiled for, and even interpreted on, devices with very little memory; it finds a lot of use on low-end embedded systems, including microcontrollers. The language is based on reverse polish notation, made famous by HP calculators (in fact, the language of HP calculators is strongly influenced by FORTH). Many implementations don't have variables: all data is kept on one or more stacks.
Just for fun, I'll mention INTERCAL, the grandaddy of esoteric languages.
Stuff that will blow your mind
Esoteric languages can be instructive, and a quite a few work close to the machine (usually a virtual machine, but in principle you could implement them for an actual computer if you were crazy enough). You could look at brainfuck (a sort of intermediate stage between Turing machines and C), or the many single-instruction languages, or befunge (what if memory was a two-dimensional array?).
Cyclone looks a lot like C. The syntax is the same, and Cyclone has pointers, untagged structures and unions, goto statements and manual memory management. And yet it's a safe language: you can't have a dangling pointer, or a buffer overflow. And you have access to high-level features such as pattern matching, exceptions, polymorphism, abstract types and optional automatic memory management (not just garbage collection, but also regions). Cyclone is both useful and instructive; for a C die-hard, it can be a good way of discovering what makes a safe language. Cyclone can compile to C, so you can run your programs anywhere you have a C compiler for.
Going in a different direction, if you want to be close to the hardware, while still not actually designing hardware, have a look at synchronous languages, such as Lustre and Esterel. These languages are used to program high-assurance realtime systems such as nuclear plants, airplanes and railway signaling. These languages give up Turing completeness and gain the assurance that programmers can know exactly how fast their program will run and how much memory it will require. If you think C is close to the machine, finding out what a language that is really close to the machine may come as a shock.
You can't get much closer than assembly language, unless you get a job with a chip-maker and start writing micro code!!!
If you're on Windows I think you can get hold of Microsoft MASM (macro assembler) that will allow you go get up and running quickly. I used it a long time ago and it's not a bad product.
Seems a bit awkward to answer my question, but I have found two languages:
Pyrex
Vala
They may not fulfill all of the constraints, but they are great for performance computing and both translates to C.

Which language would you use in your OS?

This is probably more of a subjective question, but which language (not API like .NET or JDK) would you use should you write your own operating system? Which language provides flexibility, simplicity, and possibly a low-level interface to the hardware? I was thinking Java or C...
C, of course.
Haskell.
Once you have flipped the right hardware bits, C is a terrible language to use for the rest of the OS. Things like the scheduler, filesystems, drivers, etc. are complex high-level algorithms, and you don't want to be writing those in assembly language (or C; same thing). It's too hard to get right. (The VM subsystem and memory manager may need to be written in something low-level, as you will need to bootstrap your high-level langauge's runtime somehow.)
Anyway, this isn't just a crazy idea that I am coming up with for SO. Here is an OS written in Haskell: http://programatica.cs.pdx.edu/House/
Lisp is another good choice; the original Lisp machines were infinitely more tweakable (at runtime) than "modern" OSes like UNIX and Windows.
Sometimes history forgets good ideas (often in the name of "maximum performance"), and that makes me sad.
D would be an interesting choice. From its own description:
D is a systems programming language. Its focus is on combining the power and high performance of C and C++ with the programmer productivity of modern languages like Ruby and Python. Special attention is given to the needs of quality assurance, documentation, management, portability and reliability.
The D runtime assumes the existence of a garbage collector, which would not be appropriate for the very lowest levels of the kernel. However, it would be appropriate for many of the higher layers.
Build the basic components like task schedulers and drivers etc with Assembly, then build the higher level components like applications and tools with C
I believe this is how Windows XP was built too (unsure about Windows Vista and Windows 7).
Definitely... C
C, ASM, C#
Singularity
Low-level in something like Haskell or D. Productivity over performance, in my opinion. You can rewrite slow parts in C++ or even assembly later if the need arises.
High-level in Python or Ruby. Ideally I'd also have a really fast JIT-capable VM for that language, but that's not going to happen for either language for a while. Lua would be a good alternative if speed gets in the way.
The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. The higher levels could be built with a combination of Java or more ideally Objective-C, and scripting languages like python and ruby, or lua.
Honestly, I would either use C or some hierarchy of languages that I had either designed or fit together completely seamlessly. What I would be looking for is a seamless experience that starts at the bare metal level and then I could move to higher and higher level languages as I moved up the problem space. I would probably chose something like:
C - for bare metal stuff like drivers, kernel, etc
Java/C# - for application-level things like administration consoles, OS apps
Python/PowerShell - for scripting activities like common administrative tasks (creating a new user, etc)
Personally, I think C/C#/PowerShell is more tightly integrated and the type of experience I'd be looking for. Of course, if I ever got so ambitious as to write an OS, I would have a lot of spare time on my hands and would probably really enjoy tackling the language stack first. So maybe it would be L/L#/LScript ...
BitC seems to have this in mind. Despite it's name it seems to be the midpoint of assembly language and lisp. The goal was to make a language with a strong correspondence with machine language but have an intermediate representation that supports stronger correctness inferences than is possible with most other common languages. The languages was created as part of the Coyotos project, an operating system with lofty goals of security and reliability. Formal verification is made significantly possible with the ir used in BitC.
Ada:
Ada is a structured, statically typed, imperative, and object-oriented high-level computer programming language, extended from Pascal and other languages. It was originally designed by a team led by Jean Ichbiah of CII Honeywell Bull under contract to the United States Department of Defense (DoD) from 1977 to 1983 to supersede the hundreds of programming languages then used by the DoD. Ada is strongly typed and compilers are validated for reliability in mission-critical applications, such as avionics software.
Ada, because it was not only specifically developed for such projects, but it also provides support for several very useful high level features (such as support for strong typing, concurrency and abstraction) that are simply not available in standard C.
So that, even as a project grows, you don't have to work around language limitations (think encapsulation, abstraction, namespaces in C).
Don't get me wrong, C works obviously for a great many of projects, but once a project has gained a certain size (think Linux kernel, gcc, GNOME), you will inevitably appreciate certain features of more high level languages to make the development process less tedious and also less obfuscated.
In C however, these features usually end up being -pretty poorly- emulated by excessive and almost pervert use of the pre-processor (this can for example be seen in the gcc code base), so that you get to see lots of nested macros, that from an implementation point of view, actually emulate features found in other programming languages.
In addition, Ada is the only programming language, that I am aware of, that actually provides standardized support for source code analysis using the ASIS, having such a facility in place is however the prerequisite to actually be able to maintain and transform/re-engineer a code base in the long run (think refactoring).
Having an interface like ASIS available, means that you can actually implement "semantic patching", where you can automatically rename a file, function or variable/data structure and it will actually work.
Java ?? no jave runs on a virtual machine which needs an os to run on top of ,
maybe C and some ASM ;)
I would go with D to see whether it can do it.
I would only pick the following 3 out of practicality.
C (good old fashioned)
C++ (C with stuff tacked on. Windows is partially written in this)
Java (the medium level language that just might have a capable garbage collector with controllable pauses with G1).
If I were going to start a new OS I'd do it with the subset of C++ recommended by the embedded industry. You can use things like classes and use it "as a better C" and be just as fast. Just avoid things that have massive overhead. You can even use some template features, if you stick to a certain subet that basically don't have any overhead. Look on embedded.com for features in C++ that have little to no overhead, but will allow you organize your code better than you ever could in C.
Oberon? I guess I miss Pascal too much some times. C paid the bills for quite a while, but I don't really love it.
Lisp of course!
Title text: Some say the world will end in fire; some say in segfaults.
For an OS, you want speed at the lowest levels. So assembly, C, C++, Objective-C, or Java seem to be the current choices. Although it's just recently that Java got fast, and it's hard for me to imagine an OS with garbage collection.
If I were writing my own, it would be a mix of assembly and C.
A C or C++ microkernel with a JIT for a highly dynamic language like Ruby or maybe a language with native support for the Prototype pattern. Even device drivers in that language.
Not because it's practical but because it's really cool. Cool in the way that NeXTStep was cool for using Obj-C for pretty much everything.
http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html - share of languages in Linux's source code.
C, by a number of reasons. Other candidates, like D, are great. However, C has this advantage: there's a lot of available open-sourced C code that you could reuse in your project (much more than is for other languages appropriate for system programming).
I would be torn between using some existing low level language and write my own based on C# but with much better generics support.
In second case I would make each method generic, but all the constraints will be resolved by compiler - to allow "duck typing" like in Scala but still language should be static. Also static virtual methods would lower the codebase.
I've had that idea for a long time, but it never seems to be doable in real timeframe, so who knows maybe in the future. :-)
Some would say Java.
Note that openfirmware is written partly in Forth, and it's very low level.
Have an open mind.
"The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. "
Um... What about FORTH?
FORTH can be low level and high level, so you could have a whole operating system written in FORTH from the ground up, and still have a nice easy REPL scripting environment on top, also in FORTH.
However, any decent operating system should support lots of langauges on top, from C all the way to Python Ruby and Javascript. Making FORTH the basis for it all has a lot of benefits though.
edit: I'd only ever attempt this for an embedded environment with a single known hardware set. Trying to write an OS that could compete with Linux or Windows is a fools job.
If this isn't a hypothetical question, and you're looking to create your own OS, I'd probably go with C because most of the examples out there are written in C.
Also, (And I haven't build an OS yet so take this with a grain of salt), I'm thinking that the c runtime libraries would be a lot easier to port to your new OS than say .NET.
Pascal + Oberon: they have the power of C and C++ but they're not as daunting to use. Both these languages are grossly under appreciated.

Resources