I have become interested in C-like languages for performance computing. Can you recommend some alternative programming languages which have the following attributes:
must be close to the hardware (bit fiddling, pointers or some alternative safe method like references)
no managed code (no jvm/.net languages)
has to be really fast (like C)
must be above ASM level (and yes I am interested in macro languages on top of ASM)
can be obscure, not very widespread
I am mainly interested in little-known languages.
How about Assembly language, or the D programming language?
If you don't know about it and are interested just in broadening your horizons, take a look at Forth. Reading about Forth always makes me feel C is high-level.
Well, I've always preferred C and/or C++ because there are multiple flavours (MSVC, glibc etc), they run on many different platforms (e.g. mobile devices, Windows, Linux) and devices, and they can be written cross-platform (for different processor architectures) and even for high-end graphics (e.g. DirectX).
You get "decent" access to platform resources (conditions vary), it can be as fast as you choose to hone it, and it's a tad easier (IMHO) to write than ASM. There's also a pretty decent range of support tools and code analysis tools to make things a little easier.
Also, C and C++ have been around for quite some time, so they have (even today) an excellent and enthusiastic community!
You don't explicitly state that it can't be C in your question, so I'll go ahead and recommend C. It fulfills your three bulleted desires, and you won't have to worry about different versions of the language (like each different kind of assembler).
Forth!
Forth can rival hand-written machine code on some architectures. The compiled code is extremely dense and therefore makes good use of the instruction cache.
Assembly would be the closest to the hardware and therefore potentially the fastest.
Ada was originally designed for embedded systems (among other things).
OpenCL might be interesting. It's sort of like OpenGL shader language (a subset of C with extensions), but for general purpose parallel array computing.
You could start programming FPGAs in VHDL, Verilog, System C ...
Variations on a theme
FORTRAN is older than C, and is still one of the major players in numerical computing. Until 1990 (when the language was substantially modernized), the language didn't have any form of pointer (checked or not). This lack meant that there was no way to manage memory dynamically; it also made aliasing analysis easy for the compiler, which is one of the things that makes Fortran code fast.
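For a C programmer, the aliasing point can be seen with C99's restrict qualifier, which lets you make by hand the no-overlap promise that a Fortran compiler could assume for its array arguments. A minimal sketch (the function names are made up for illustration, and whether a given compiler actually optimises the second loop better is not guaranteed):

```c
#include <assert.h>
#include <stddef.h>

/* Without restrict the compiler must assume dst and src may overlap,
   so it has to be conservative about reordering loads and stores. */
void scale_may_alias(double *dst, const double *src, double k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* With restrict the caller promises that dst and src do not overlap,
   which gives the optimiser the freedom Fortran had by default. */
void scale_no_alias(double *restrict dst, const double *restrict src,
                    double k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

Calling scale_no_alias with overlapping arrays is undefined behaviour; that is the price of the promise.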
ALGOL was the first structured programming language. Although it had limited success with programmers, it had a strong influence on language designers.
Ada is an imperative language with a strong type system and good modularity, which makes it good for low-level programming with strong assurance requirements (it was sponsored by the US government with military and avionics applications in mind). It was inspired by Pascal, like Modula-2 and Modula-3.
Going further from the mainstream of low-level imperative programming, there is FORTH. FORTH can be compiled for, and even interpreted on, devices with very little memory; it finds a lot of use on low-end embedded systems, including microcontrollers. The language is based on reverse polish notation, made famous by HP calculators (in fact, the language of HP calculators is strongly influenced by FORTH). Many implementations don't have variables: all data is kept on one or more stacks.
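To give a feel for the stack model, here is a rough sketch in C of evaluating reverse polish notation with a single data stack. This is not Forth and not how any real Forth is implemented; rpn_eval and STACK_MAX are hypothetical names, and only integers with + - * are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64

/* Evaluate a space-separated RPN expression such as "2 3 + 4 *".
   Numbers are pushed; an operator pops two values and pushes the result. */
long rpn_eval(const char *src)
{
    long stack[STACK_MAX];
    int depth = 0;

    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        if (isdigit((unsigned char)*src)) {
            char *end;
            long v = strtol(src, &end, 10);
            assert(depth < STACK_MAX);      /* stack overflow check */
            stack[depth++] = v;
            src = end;
            continue;
        }
        assert(depth >= 2);                 /* operators need two operands */
        long b = stack[--depth];
        long a = stack[--depth];
        switch (*src) {
        case '+': stack[depth++] = a + b; break;
        case '-': stack[depth++] = a - b; break;
        case '*': stack[depth++] = a * b; break;
        default:  assert(!"unknown operator");
        }
        src++;
    }
    assert(depth == 1);                     /* exactly one result left */
    return stack[0];
}
```

Note that there are no named variables anywhere in the expression itself: "2 3 + 4 *" works purely by pushing and popping.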
Just for fun, I'll mention INTERCAL, the granddaddy of esoteric languages.
Stuff that will blow your mind
Esoteric languages can be instructive, and quite a few work close to the machine (usually a virtual machine, but in principle you could implement them for an actual computer if you were crazy enough). You could look at brainfuck (a sort of intermediate stage between Turing machines and C), or the many single-instruction languages, or befunge (what if memory was a two-dimensional array?).
Cyclone looks a lot like C. The syntax is the same, and Cyclone has pointers, untagged structures and unions, goto statements and manual memory management. And yet it's a safe language: you can't have a dangling pointer, or a buffer overflow. And you have access to high-level features such as pattern matching, exceptions, polymorphism, abstract types and optional automatic memory management (not just garbage collection, but also regions). Cyclone is both useful and instructive; for a C die-hard, it can be a good way of discovering what makes a safe language. Cyclone can compile to C, so you can run your programs anywhere you have a C compiler.
Going in a different direction, if you want to be close to the hardware, while still not actually designing hardware, have a look at synchronous languages, such as Lustre and Esterel. These languages are used to program high-assurance realtime systems such as nuclear plants, airplanes and railway signaling. These languages give up Turing completeness and gain the assurance that programmers can know exactly how fast their program will run and how much memory it will require. If you think C is close to the machine, finding out what a language that is really close to the machine looks like may come as a shock.
You can't get much closer than assembly language, unless you get a job with a chip-maker and start writing micro code!!!
If you're on Windows I think you can get hold of Microsoft MASM (macro assembler), which will allow you to get up and running quickly. I used it a long time ago and it's not a bad product.
Seems a bit awkward to answer my question, but I have found two languages:
Pyrex
Vala
They may not fulfill all of the constraints, but they are great for performance computing and both translate to C.
I want a better C. Let me explain:
I do a lot of programming in C, which is required for applications that have real-time needs such as audio programming, robotics, device drivers, etc.
While I love C, one thing that gets on my nerves after having spent a lot of time with Haskell is the lack of a proper type system. That is, as soon as you want to write a more general-purpose function, say something that manipulates a generic pointer (like a generic linked list), you have to cast things to void* or whatever, and you lose all type information. It's an all-or-nothing system, which doesn't let you write generic functions without losing all the advantages of type checking.
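To spell out the void* problem, here is a minimal sketch of the kind of generic list I mean (struct node and sum_ints are just illustrative names):

```c
#include <assert.h>
#include <stddef.h>

/* A "generic" list node: the element type is erased to void*. */
struct node {
    void *data;
    struct node *next;
};

/* Sums a list whose elements the caller claims are ints.
   Nothing in the type of `head` records that claim. */
int sum_ints(const struct node *head)
{
    int total = 0;
    for (; head != NULL; head = head->next) {
        assert(head->data != NULL);
        total += *(const int *)head->data;  /* unchecked cast */
    }
    return total;
}
```

If some node's data actually points at a double, this still compiles without a warning, and the cast just reads garbage. The compiler has no way to help.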
C++ doesn't solve this. And I don't want to use C++ anyways. I find OO classes and templates to be a headache.
Haskell and its type classes do solve this. You can have semantically useful types, and use type constraints to write functions that operate on classes of types, that don't depend on void.
But in the domain I'm working in, I can't use Haskell, because it's not real-time capable, mostly due to garbage collection. GC is needed because it's very difficult to do functional programming, which is allocation-heavy, without automatic memory management. However, there is nothing specifically in the idea of type classes that goes against C's semantics. I want C, but with Haskell's dependable type system, to help me write well-typed systems. However, I really want C: I want to be in control of memory management, I want to know how the data structures are laid out, I want to use (well-typed) pointer arithmetic, I want mutability.
Is there any language like this? If so, why is it not more popular for low-level programming?
Aside: I know there are some small language experiments in this direction, but I'm interested in things that would be really usable in real-world projects. I'm interested in growing-to-well-developed languages, but not so much "toy" languages.
I should add, I heard of Cyclone, which is interesting, but I couldn't get it to compile for me (Ubuntu) and I haven't heard of any projects actually using it. Any other suggestions in this vein are welcome.
Thanks!
Since nobody brought it up yet: I think the ATS language is a very good candidate for a better C! Especially since you enjoy Haskell and thus functional programming with strong types. Note that ATS seems to be specifically designed for systems programming and hard real-time applications as most of it can do without garbage collection.
If you check the shootout you will find that performance is basically on par with C. I think this is quite impressive, since modern C compilers have years and years of optimization work behind them, while ATS is basically developed by one guy. And while other languages providing similar safety features usually introduce overhead, ATS checks things entirely at compile time and thus yields performance characteristics very similar to C's.
To quote the website:
What is ATS?
ATS is a statically typed programming language that unifies implementation with formal specification. It is equipped with a highly expressive type system rooted in the framework Applied Type System, which gives the language its name. In particular, both dependent types and linear types are available in ATS. The current implementation of ATS (ATS/Anairiats) is written in ATS itself. It can be as efficient as C/C++ (see The Computer Language Benchmarks Game for concrete evidence) and supports a variety of programming paradigms that include:
Functional programming. The core of ATS is a functional language based on eager (aka. call-by-value) evaluation, which can also accommodate lazy (aka. call-by-need) evaluation. The availability of linear types in ATS often makes functional programs written in it run not only with surprisingly high efficiency (when compared to C) but also with surprisingly small (memory) footprint (when compared to C as well).
Imperative programming. The novel and unique approach to imperative programming in ATS is firmly rooted in the paradigm of programming with theorem-proving. The type system of ATS allows many features considered dangerous in other languages (e.g., explicit pointer arithmetic and explicit memory allocation/deallocation) to be safely supported in ATS, making ATS a viable programming language for low-level systems programming.
Concurrent programming. ATS, equipped with a multicore-safe implementation of garbage collection, can support multithreaded programming through the use of pthreads. The availability of linear types for tracking and safely manipulating resources provides an effective means to constructing reliable programs that can take advantage of multicore architectures.
Modular programming. The module system of ATS is largely influenced by that of Modula-3, which is both simple and general as well as effective in supporting large scale programming.
In addition, ATS contains a subsystem ATS/LF that supports a form of (interactive) theorem-proving, where proofs are constructed as total functions. With this component, ATS advocates a programmer-centric approach to program verification that combines programming with theorem-proving in a syntactically intertwined manner. Furthermore, this component can serve as a logical framework for encoding deduction systems and their (meta-)properties.
What about the Nimrod or Vala languages?
Rust
Another (real) candidate for a better C is The Rust Programming Language.
Unlike some other suggestions, (Go, Nimrod, D, ...) Rust can directly compete with C and C++ because it has manual memory management and does not require garbage collection (see [1]).
What sets Rust apart is that it has safe manual memory management. (The link is to pcwalton's blog; he is one of Rust's main contributors and generally worth a read. ;) Among other things, this means it fixes the billion dollar mistake of null pointers. Many of the other languages suggested here either require garbage collection (Go) or have garbage collection turned on by default and do not provide facilities for safe manual memory management beyond what C++ provides (Nimrod, D).
While Rust has an imperative heart, it does borrow a lot of nice things from functional languages, for example sum types aka tagged unions. It is also really concerned with being a safe and performance oriented language.
[1] Right now there are two main pointer types: owned pointers (like std::unique_ptr in C++ but with better support from the typechecker) and managed pointers. As the name suggests, the latter do require task-local garbage collection, but there are thoughts to remove them from the language and only provide them as a library.
EDITED to reflect #ReneSacs comment: Garbage collection is not required in D and Nimrod.
I don't know much about Haskell, but if you want a strong type system, take a look at Ada. It is heavily used in embedded systems for aerospace applications. The SIGADA motto is "In strong typing we trust." It won't be of much use, however, if you have to do Windows/Linux type device drivers.
A few reasons it is not so popular:
verbose syntax -- designed to be read, not written
compilers were historically expensive
the relationship to DOD and design committees, which programmers seem to knock
I think the truth is that most programmers don't like strong type systems.
Nim (formerly Nimrod) has a powerful type system, with concepts and easy generics. It also features extensive compile time mechanisms with templates and macros. It also has easy C FFI and all the low level features that you expect from a system programming language, so you can write your own kernel, for example.
Currently it compiles to C, so you can use it everywhere GCC runs, for example. If you only want to use Nim as better C, you can do it via the --os:standalone compiler switch, that gives you a bare bones standard library, with no OS ties.
For example, to compile to an AVR micro-controller you can use:
nim c --cpu:avr --os:standalone --deadCodeElim:on --genScript x.nim
Nim has a soft real-time GC where you can specify when it runs and the max pause time in microseconds. If you really can't afford the GC, you can disable it completely (--gc:none compiler switch) and use only manual memory management like C, losing most of the standard library, but still retaining the much saner and more powerful type system.
Also, tagged pointers are a planned feature, that ensure you don't mix kernel level pointers with user level pointers, for example.
D might offer what you want. It has a very rich type system, but you can still control memory layout if you need to. It has unrestricted pointers like C. It’s garbage collected, but you aren’t forced to use the garbage collector and you can write your own memory management code if you really want.
However, I’m not sure to what extent you can mix the type richness with the low-level approach you want to use.
Let us know if you find something that suits your needs.
I'm not sure what state Cyclone is in, but that provided more safety for standard C. D can be also considered a "better C" to some extent, but its status is not very clear with its split-brain in standard library.
My language of choice as a "better C" is OOC. It's still young, but it's quite interesting. It gives you the OO without C++'s killer complexity. It gives you easy access to C interfaces (you can "cover" C structs and use them normally when calling external libraries / control the memory layout this way). It uses GC by default, but you can turn it off if you really don't want it (but that means you cannot use the standard library collections anymore without leaking).
The other comment mentioned Ada which I forgot about, but that reminded me: there's Oberon, which is supposed to be a safe(-er) language, but that also contains garbage collection mechanisms.
You might also want to look at BitC. It’s a serious language and not a toy, but it isn’t ready yet and probably won’t be ready in time to be of any use to you.
Nonetheless, a specific design goal of BitC is to support low-level development in conjunction with a Haskell-style type system. It was originally designed to support development of the Coyotos microkernel. I think that Coyotos was killed off, but BitC is still apparently being developed.
"C++ doesn't solve this. And I don't want to use C++ anyways. I find OO classes and templates to be a headache."
Get over this attitude. Just use C++. You can start with coding C in C++ and keep gradually moving to better style.
I've heard these terms thrown around describing languages before, like C is not quite a low-level language, C++ is a midlevel, and Python is a high-level language.
I understand that it has something to do with the way the code is compiled, and how it is written. But what places a language in one of those three categories? Are these absolute categories, or just a general idea programmers use to describe languages to each other?
Yes, they're just general terms. It's to do with abstraction, and how close you are to what the computer's actually doing.
Here's a list of programming languages ranging from very low to very high level:
Machine Code could probably be considered the lowest level programming language.
Assembly language is at the level of telling the processor what to do. There is still a conversion step towards machine code.
C is a step up from assembler, because you get to specify what you want to do in slightly more abstract terms, but you're still fairly close to the metal.
C++ does everything that C can do but adds the capability to abstract things away into classes.
Java/C# do similar things to C++ in a way, but without the opportunity to do everything you can do in C (like pointer manipulation in Java's case [thanks Joe!]). They have garbage collection, though, whereas in C++ you have to manage memory manually.
Python/Ruby are even higher level, and let you forget about a lot of the details that you would need to specify in something like Java or C#.
SQL is even higher level (it's declarative). Just say "Give me all the items in the table sorted by age" and it will work out the most efficient way to carry this out for you.
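To see the difference in level, compare the one-line SQL request with what "sorted by age" looks like when you spell it out in C. This is only a sketch: struct item, by_age and sort_by_age are made-up names, and a real database would choose its own strategy rather than call qsort.

```c
#include <assert.h>
#include <stdlib.h>

struct item {
    const char *name;
    int age;
};

/* Comparison callback for qsort: orders items by ascending age. */
static int by_age(const void *a, const void *b)
{
    const struct item *x = a;
    const struct item *y = b;
    return (x->age > y->age) - (x->age < y->age);
}

/* In C you say *how*: which array, how many elements, how big each
   one is, and how to compare two of them. */
void sort_by_age(struct item *items, size_t n)
{
    assert(items != NULL);
    qsort(items, n, sizeof items[0], by_age);
}
```

In SQL all of that collapses into an ORDER BY clause, and the engine decides how to do it.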
low level = long development time + very fast executable file
high level = shorter development time + slower executable file
mid level is between the two
Very low-level: Machine Code
Low level: Assembler, Forth
Mid level: C, C++, most system programming languages
Mid/High level: D, Go, garbage collected system programming languages
High level: Java, C#, most interpreted languages
Even Higher level: Lisp dialects
Highest level: SQL, declarative programming languages
If there is anything else to be added, tell me.
They aren't absolute. They are all relative to what other languages are being used in industry at the time. For example, there was a time when assembly was considered mid-level.
The 'level' is essentially a measure of how abstracted the programmer is from the actual hardware-based operations. In a low level language you might have to care about actual memory locations, whereas in a high-level you just create variables and let the OS handle memory.
A typical CPU works on 32- or 64-bit words. In the simplest form, think of an instruction as 32 1's and 0's in a row - that's what the processor actually interprets and executes. Writing this directly (machine code) would be the 'lowest-level'.
Low level means closer to the machine, and therefore more difficult and more powerful. The higher level you get, the more removed from the machine and "English-like" you get, but you lose a lot of the power and functionality that comes with being able to control the minute details of the machine. Higher level languages also generally tend to protect you more and have much more precautions and checks in place, while lower level languages trust you, so to speak, and let you play around at your own risk.
The term mid-level language is one I've never heard.
"Low" and "High" refer to how "close" to the machine you are in your programming. The lowest level would be machine (binary) code. Next (and still considered low) is assembler. The higher level languages involve more symbolism and constructs that are supposed to be closer to how humans normally think. C (and to some extent C++) has a reputation as a hybrid low/high-level language because it has many constructs found in high-level languages, but also has operations (e.g. shifts) that are typical of low-level languages and often absent from higher-level ones.
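As an illustration of the low-level side of C, here is a small sketch using shifts and masks on a 32-bit word. The helper names bits and set_bit are hypothetical, not from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits of `word`, starting at bit `lo`
   (bit 0 is the least significant bit). */
uint32_t bits(uint32_t word, unsigned lo, unsigned width)
{
    assert(width >= 1 && lo < 32 && lo + width <= 32);
    /* Shifting by 32 would be undefined, so handle the full-width case. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (word >> lo) & mask;
}

/* Return `word` with bit `n` set to 1. */
uint32_t set_bit(uint32_t word, unsigned n)
{
    assert(n < 32);
    return word | (UINT32_C(1) << n);
}
```

For example, bits(0xABCD1234, 8, 8) picks out the second-lowest byte, 0x12. The same function body also uses if-like expressions and typed parameters, which is the "high-level" half of the hybrid.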
From low to high, you can categorize the languages as follows.
Machine Code --> Assembly Language --> Compiled Language --> Interpreted Language
Remember that these aren't absolute black and white definitions, but rather shades of gray. This is more of a guideline than a rule.
Think of machine code as a long string of 1s and 0s understood by the native platform. Consider this your baseline... the lowest "level" you can have.
Assembly language could be considered a symbolic representation of this. I believe there is a 1 to 1 mapping between assembly code instructions and machine code instructions. This is your low level language.
Java and C++, for example, are both compiled languages, but many would consider C++ to be a lower level language than Java because it exposes low level system access, while Java runs in a protected environment (the virtual machine). Remember that a compiled language is compiled (converted, if you will) to machine code before execution. C is also a compiled language, but would be considered lower level than both Java and C++.
For our sake, we will say that C and C++ are low level languages because they offer (relatively) little abstraction from the hardware and direct memory management. In actuality, they fall somewhere between low and mid, as you will see soon enough.
We will call Java and C# (.NET) mid level languages because they have automatic memory management (garbage collection) and plenty of high-level abstractions (i.e. objects... yet C++ supports objects. Do you see why the scale is considered to be loosely defined?).
With an interpreted language, the interpreter resides in memory and reads the source code directly. These are high level languages. Python, Perl, Javascript, and PHP are all examples of high level languages.
C is a midlevel language, because we can use assembly language code in it.
The only slight difference is that pointers make it powerful (if pointers were removed from C, it would be considered a low-level language). Its portability makes it midlevel, so we can say it is a midlevel language.
It is all relative... The "level" reflects the amount of abstraction.
Once you add a spectrum of levels of a programming language you add nuance to the definition.
Clearly machine code and assembly are machine-dependent. C and C++ in theory are machine-independent, but in truth that is not universal. In C, things like alignment need to be taken into account, and you can always manage the stack in C (and in the C subset of C++) via a pointer and a single initialised variable, if you are crazy enough, so that (on x86) the RSP register (stack pointer) is never used. So C, yes, it is midlevel. Everything else is high-level and some super high-level.
Low-level languages are very close to machine language, which may be binary or RTL. They are hard to write but very quick to execute, and they can interact with the hardware directly. A high-level programming language is very easy to write, but it can be executed only after compilation.
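On the alignment point, a small sketch of what "taken into account" means in practice. The struct names and is_aligned are made up for illustration, and the exact sizes depend on the platform's ABI, so none are stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same three members in two orders. The compiler may insert padding
   so that `value` lands on a suitably aligned address. */
struct loose {
    char tag;
    uint32_t value;   /* usually preceded by padding bytes */
    char flag;
};

struct tight {
    uint32_t value;   /* widest member first leaves less room for padding */
    char tag;
    char flag;
};

/* True if address p is a multiple of alignment (a power of two). */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Comparing sizeof(struct loose) with sizeof(struct tight), or checking offsetof(struct loose, value), shows the padding on your own machine. That you have to think about member order at all is what makes C less machine-independent than it looks.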
I'm a scientist working mostly with C++, but I would like to find a better language. I'm looking for suggestions; I'm not even sure my "dream language" exists (yet), but here's my wishlist:
IMPORTANT FEATURES (in order of importance)
1.1: Performance: For science, performance is very important. I perfectly understand the importance of productivity, not just execution speed, but when your program has to run for hours, you just can't afford to write it in Python or Ruby. It doesn't need to be as fast as C++, but it has to be reasonably close (e.g.: Fortran, Java, C#, OCaml...).
1.2: High-level and elegant: I would like to be able to concentrate as much as possible on the science and get clear code. I also dislike verbose languages like Java.
1.3: Primarily functional: I like functional programming, and I think it suits both my style and scientific programming very well. I don't care if the language supports imperative programming, it might be a plus, but it has to focus on and encourage functional programming.
1.4: Portability: Should work well on Linux (especially Linux!), Mac and Windows. And no, I do not think F# works well on Linux with mono, and I'm not sure OCaml works well on windows ;)
1.5: Object-oriented, preferably under the "everything is an object" philosophy: I realized how much I liked object-oriented programming when I had to deal with pure C not so long ago. I like languages with a strong commitment to object-oriented programming, not just timid support.
NOT REALLY IMPORTANT, BUT THINGS THAT WOULD BE NICE
2.1: "Not-too-strong" typing: I find Haskell's strong typing system to be annoying, I like to be able to do some implicit casting.
2.2: Tools: Good tools are always a plus, but I guess it really depends on the language. I played with Haskell using Geany, a lightweight editor, and I never felt handicapped. On the other hand I wouldn't have done the same with Java or even Scala (Scala, in particular, seems to be lacking good tools, which is really a shame). Java is really the #1 language here: with NetBeans and Javadoc, programming with Java is easy and fun.
2.3: Garbage collected, but translated or compiled without a virtual machine. I have nothing against virtual machines, but the two giants in the domain have their problems. On paper the .net framework seems much better, and especially suited for functional programming, but in practice it's still very windows-centric and the support for Linux/MacOS is not as good as it should be, so it's not really worth considering. Java is now a mature VM, but it annoys me on some levels: I dislike the ways it deals with executables, generics, and it writes terrible GUIs (although these things aren't so bad).
In my mind there are three viable candidates: Haskell, Standard ML, OCaml. (Scala is out on the grounds that it compiles to JVM codes and is therefore unlikely to be fast enough when programs must run for days.)
All are primarily functional. I will comment where I have knowledge.
Performant
OCaml gives the most stable performance for all situations, but performance is hard to improve. What you get is what you get :-)
Haskell has the best parallel performance and can get excellent use out of an 8-core or 16-core machine. If your future is parallel, I urge you to master your dislike of the type system and learn to use Haskell effectively, including the Data Parallel Haskell extensions.
The down side of Haskell performance is that it can be quite difficult to predict the space and time required to evaluate a lazy functional program. There are excellent profiling tools, but still significant effort may be required.
Standard ML with the MLton compiler gives excellent performance. MLton is a whole-program compiler and does a very good job.
High-level and elegant
Syntactically Haskell is the clear winner. The type system, however, is cluttered with the remains of recent experiments. The core of the type system is, however, high-level and elegant. The "type class" mechanism is particularly powerful.
Standard ML has ugly syntax but a very clean type system and semantics.
OCaml is the least elegant, both from a point of view of syntax and from the type system. The remains of past experiments are more obtrusive than in Haskell. Also, the standard libraries do not support functional programming as well as you might expect.
Primarily functional
Haskell is purely functional; Standard ML is very functional; OCaml is mostly functional (but watch out for mutable strings and for some surprising omissions in the libraries; for example, the list functions are not safe for long lists).
Portability
All three work very well on Linux. The Haskell developers use Windows and it is well supported (though it causes them agony). I know OCaml runs well on OSX because I use an app written in OCaml that has been ported to OSX. But I'm poorly informed here.
Object-oriented
Not to be found in Haskell or SML. OCaml has a bog-standard OO system grafted onto the core language, not well integrated with other languages.
You don't say why you are keen for object-orientation. ML functors and Haskell type classes provide some of the encapsulation and polymorphism (aka "generic programming") that are found in C++.
Type system that can be subverted
All three languages provide unsafe casts. In all three cases they are a good way to get core dumps.
"I like to be able to do some implicit casting."
I think you will find Haskell's type-class system to your liking—you can get some effects that are similar to implicit casting, but safely. In particular, numeric and string literals are implicitly castable to any type you like.
Tools
There are pretty good profiling tools with Haskell. Standard ML has crappy tools. OCaml has basically standard Unix profiling plus an unusable debugger. (The debugger refuses to cross abstraction barriers, and it doesn't work on native code.)
My information may be out of date; the tools picture is changing all the time.
Garbage-collected and compiled to native code
Check. Nothing to choose from there.
Recommendation
Overcome your aversion to safe, secure type systems. Study Haskell's type classes (the original paper by Wadler and Blott and a tutorial by Mark Jones may be illuminating). Get deeper into Haskell, and be sure to learn about the huge collection of related software at Hackage.
Try Scala. It's an object-oriented functional language that runs in the JVM, so you can access everything that was ever written in Java. It has all your important features, and one of the nice to have features. (Obviously not #2.2 :) but that will probably get better quickly.) It does have very strong typing, but with type inference it doesn't really get in your way.
You just described Common Lisp...
If you like using lists for most things, and care about performance, use Haskell or Ocaml. Although Ocaml suffers significantly in that Floats on the heap need to be boxed due to the VM design (but arrays of floats and purely-float records aren't individually boxed, which is good).
If you're willing to use arrays more than lists, or plan on programming using mutable state, use Scala rather than Haskell. If you're looking to write threaded multi-core code, use Scala or Haskell (Ocaml requires you to fork).
Scala's list is polymorphic, so a list of ints is really a list of boxed Int objects. Of course you could write your own list of ints in Scala that would be as fast, but I assume you'd rather use the standard libraries. Scala does have as much tail recursion as is possible on JVM.
Ocaml fails on Vista 64 for me, I think because they just changed the linker in the latest version (3.11.1?), but earlier versions worked fine.
Scala tool support is buggy at the moment if you're using nightly builds, but should be good soon. There are eclipse and netbeans plugins. I'm using emacs instead. I've used both the eclipse and netbeans debugger GUI successfully in the past.
None of Scala, Ocaml, or Haskell, have truly great standard libraries, but at least you can easily use Java libs in Scala. If you use mapreduce, Scala wins on integration. Haskell and Ocaml have a reasonable amount of 3rd party libs. It annoys me that there are differently named combinators for 2-3 types of monad in Haskell.
http://metamatix.org/~ocaml/price-of-abstraction.html might convince you to stay with C++. It's possible to write Scala that's almost identical in performance to Java/C++, but not necessarily in a high level functional or OO style.
http://gcc.gnu.org/projects/cxx0x.html seems to suggest that auto x = ... (type inference for expressions) and lambdas are usable. C++0x with boost, if you can stomach it, seems pretty functional. The downside to C++ high performance template abusing libraries is, of course, compile time.
Your requirements seem to me to describe ocaml quite well, except for the "not-too-strong" typing. As for tools, I use and like tuareg mode for emacs. Ocaml should run on windows (I haven't used it myself though), and is pretty similar to F#, FWIW.
I'd consider the ecosystem around the language as well. In my opinion Ocaml's major drawback is that it doesn't have a huge community, and consequently lacks the large library of third-party modules that are part of what makes python so convenient. Having to write your own code or modify someone else's one-shot prototype module you found on the internet can eat up some of the time you save by writing in a nice functional language.
You can use F# on mono; perhaps worth a look? I know that mono isn't 100% perfect (nothing ever is), but it is very far from "terrible"; most of the gaps are in things like WCF/WPF, which I doubt you'd want to use from FP. This would seem to offer much of what you want (except obviously it runs in a VM - but you gain a huge set of available libraries in the bargain (i.e. most of .NET) - much more easily than OCaml which it is based on).
I would still go for Python but using NumPy or some other external module for the number crunching or alternatively do the logic in Python and the hotspots in C / assembler.
You are always giving up cycles for comfort: the more comfort, the more cycles. Thus your requirements are mutually exclusive.
I think that Common Lisp fits your description quite well.
1.1: Performance: Modern CL implementations are almost on par with C. There are also foreign function interfaces to interact with C libraries, and many bindings are already done (e.g. the GNU Scientific Library).
1.2: High-level and elegant: Yep.
1.3: Primarily functional: Yes, but you can also "get imperative" wherever the need arises; CL is "multi-paradigm".
1.4: Portability: There are several implementations with differing support for each platform. Some links are at CLiki and ALU Wiki.
1.5: Object-oriented, preferably under the "everything is an object" philosophy: CLOS, the Common Lisp Object System, is much closer to being "object oriented" than any of the "curly" languages, and also has features you will sorely miss elsewhere, like multimethods.
2.1: "Not-too-strong" typing: CL has dynamic, strong typing, which seems to be what you want.
2.2: Tools: Emacs + SLIME (the Superior Lisp Interaction Mode for Emacs) is a very nice free IDE. There is also a plugin for Eclipse (Cusp), and the commercial CL implementations also often bring their own IDE.
2.3: Garbage collected, but translated or compiled without a virtual machine. The Lisp image that you will be working on is a kind of VM, but I think that's not what you mean.
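To give a rough idea of what the multimethods mentioned under 1.5 are about: a method is chosen by the types of all its arguments, not just the first one. Below is a minimal sketch of that idea in Python rather than CL. The names (`defcollide`, `collide`, `Asteroid`, `Ship`) are made up for illustration, and it only matches exact types, so it is far weaker than what CLOS actually does (no inheritance-aware selection, no method combination).

```python
# Table mapping (type of first argument, type of second argument) to a method.
_collide_methods = {}


def defcollide(type_a, type_b):
    """Register a method for one specific pair of argument types (hypothetical helper)."""
    def register(function):
        _collide_methods[(type_a, type_b)] = function
        return function
    return register


def collide(a, b):
    """Dispatch on the types of BOTH arguments, unlike single-dispatch a.collide(b)."""
    method = _collide_methods.get((type(a), type(b)))
    if method is None:
        raise TypeError("no collide method for this pair of types")
    return method(a, b)


class Asteroid:
    pass


class Ship:
    pass


@defcollide(Asteroid, Ship)
def _(asteroid, ship):
    return "asteroid hits ship"


@defcollide(Ship, Ship)
def _(first, second):
    return "ships bump"
```

With this, `collide(Asteroid(), Ship())` and `collide(Ship(), Ship())` pick different methods, and a pair with no registered method (such as `Ship` then `Asteroid`) raises `TypeError`.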
A further advantage is the incremental development: you have a REPL (read-eval-print-loop) running that provides a live interface into the running image. You can compile and recompile individual functions on the fly, and inspect the current program state on the live system. You have no forced interruptions due to compiling.
Short Version: The D Programming Language
Yum Yum Yum, that is a big set of requirements.
As you probably know, object orientation, high-level semantics, performance, portability and all the rest of your requirements don't tend to fit together from a technical point of view. Let's split this into a different view:
Syntax Requirements
Object Orientated presentation
Low memory management complexity
Allows function style
Isn't Haskell (damn)
Backend Requirements
Fast for science
Garbage Collected
On this basis I would recommend the D programming language. It is a successor to C that tries to be all things to all people.
This article on D is about its functional programming aspects. It is object-orientated, garbage collected, and compiles to machine code, so it is fast!
Good Luck
Clojure and/or Scala are good candidates for the JVM
I'm going to assume that you are familiar enough with the languages you mentioned to have ruled them out as possibilities. Given that, I don't think there is a language that fulfills all your expectations. However, there are still a few languages you could take a look at:
Clojure This really is a very nice language. Its syntax is based on LISP, and it runs on the JVM.
D This is like C++ done right. It has all the features you want except that it's kind of weak on the functional programming.
Clean This is based very heavily on Haskell, but removes some of Haskell's problems. Downsides are that it's not very mature and doesn't have a lot of libraries.
Factor Syntactically it's based on Forth, but has support for LISP-like functional programming as well as better support for classes.
Take a peek at Erlang. Originally, Erlang was intended for building fault-tolerant, highly parallel systems. It is a functional language, embracing immutability and first-class functions. It has an official Windows binary release, and the source can be compiled for many *NIX platforms (there is a MacPorts build, for example).
In terms of high-level features, Erlang supports list comprehensions, pattern matching, guard clauses, structured data, and other things you would expect. It's relatively slow in sequential computation, but pretty amazing if you're doing parallel computation. Erlang does run on a VM, but it runs on its own VM, which is part of the distribution.
Erlang, while not strictly object-oriented, does benefit from an OO mindset. Erlang uses a thing called a process as its unit of concurrency. An Erlang process is actually a lot like a native thread, except with much less overhead. Each process has a mailbox, will be sent messages, and will process those messages. It's easy enough to treat processes as if they were objects.
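To make the "process with a mailbox" idea a bit more concrete, here is a rough sketch of the pattern. It is in Python, not Erlang, and uses an ordinary thread plus a `queue.Queue` as the mailbox, so it has none of the low overhead of real Erlang processes. The class name `CounterProcess`, the helper `ask_count` and the message shapes are all made up for illustration.

```python
import queue
import threading


class CounterProcess:
    """Hypothetical 'process': private state, changed only via mailbox messages."""

    def __init__(self):
        self.mailbox = queue.Queue()   # every process gets its own mailbox
        self._count = 0                # state that only the process's own thread touches
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Deliver a message; the sender does not wait for it to be handled."""
        self.mailbox.put(message)

    def _run(self):
        # Receive loop: take one message at a time and react to it.
        while True:
            message = self.mailbox.get()
            kind = message[0]
            if kind == "add":
                self._count += message[1]
            elif kind == "get":
                message[1].put(self._count)   # reply into the sender's mailbox
            elif kind == "stop":
                break

    def join(self):
        self._thread.join()


def ask_count(process):
    """Send a 'get' request and wait for the reply message."""
    reply_box = queue.Queue()
    process.send(("get", reply_box))
    return reply_box.get(timeout=5)
```

From the outside this looks a lot like calling methods on an object: `p = CounterProcess()`, then `p.send(("add", 2))`, and `ask_count(p)` to read the state back. The difference is that all communication goes through messages and nothing reaches into the state directly.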
I don't know if it has much in the way of scientific libraries. It might not be a good fit for your needs, but it's a cool language that few people seem to know about.
Are you sure that you really need a functional language? I did most of my programming in lisp, which is obviously a functional language, but I have found that functional programming is more of a mind-set than a language feature. I'm using VB right now, which I think is an excellent language (speed, support, IDE) and I basically use the same programming style that I did in lisp - functions call other functions that call other functions - and functions are usually 1-5 lines long.
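As a small illustration of that style in a language that isn't marketed as functional, here is a sketch in Python (the function names are made up): tiny functions with no side effects, each built out of the ones before it.

```python
def square(number):
    return number * number


def is_even(number):
    return number % 2 == 0


def even_squares(numbers):
    # Built from the two helpers above; returns a new list, mutates nothing.
    return [square(n) for n in numbers if is_even(n)]


def sum_of_even_squares(numbers):
    return sum(even_squares(numbers))
```

For example, `sum_of_even_squares([1, 2, 3, 4])` gives 20 (4 + 16). Nothing here needs language support beyond plain functions.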
I do know that Lisp has good performance and runs on all platforms, but its support for features such as graphics and multi-threading is somewhat outdated.
I've taken a look at Clojure, but if you don't like Java you probably won't like Clojure. It's a functional Lisp-style language implemented on top of Java - but you'll probably find yourself using Java libraries all the time, which adds the verbosity of Java. I like Lisp but I didn't like Clojure despite the hype.
Are you also sure about your performance requirements? Matlab is an excellent language for a lot of scientific computation, but it is kind of slow and I hate reading it. You might find it useful though, especially in conjunction with other languages, for prototypes/scenarios/subunits.
Many of your requirements are based on hearsay. One example: the idea that Mono is "terrible".
http://banshee-project.org/
That's the official media player of many Linux distributions. It's written in C#. (They don't even have a public Windows release of it!)
Your assertions about the relative performance of various languages are equally dubious. And requiring a language to not use a virtual machine is quite unrealistic and totally undesirable. Even an OS is a form of VM on which applications run, which virtualises the hardware devices of the machine.
Though you earn points for mentioning tools (although not with enough priority). As Knuth observed, the first question to ask about a language is "What's the debugger like?"
Looking over your requirements, I would recommend VB on either Mono, or a virtual machine running windows. As a previous poster said, the first thing to ask about a language is "What is the debugger like" and VB/C# have the best debugger. Just a result of all those Microsoft employees hammering on the debugger, and having the teams nearby to bug (no pun intended) into fixing it.
The best thing about VB and C# is the large set of developer tools, community, google help, code examples, libraries, software that interfaces with it, etc. I've used a wide variety of software development environments over the past 27 years, and the only things that come close are the Xerox Lisp machine environments (better) and the Symbolics Lisp machines (worse).
I'm curious to know if certain languages are, by design, better suited for certain processor architectures. When I say architectures, I don't mean ARM/PPC/MIPS but more stack, accumulator, or register based architectures.
For example, I can think of Forth, which is a stack architecture. Any others?
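To show what "stack-based" means here, below is a toy sketch in Python of a Forth-like evaluator. It is not real Forth, just an assumed four-operation subset plus `dup` and `swap`, and the function name is made up. Every word either pushes a number or pops its operands from the data stack and pushes the result, which is the model a stack machine implements in hardware.

```python
def run_stack_program(source):
    """Evaluate a whitespace-separated postfix program; return the final stack."""
    stack = []
    binary_ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for word in source.split():
        if word in binary_ops:
            b = stack.pop()            # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(binary_ops[word](a, b))
        elif word == "dup":
            stack.append(stack[-1])    # duplicate the top item
        elif word == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            stack.append(int(word))    # anything else is treated as an integer literal
    return stack


print(run_stack_program("2 3 + 4 *"))  # [20]
```

Note that no operation names a register or a variable; operands are always implicit, which is why such code maps so directly onto a stack architecture.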
Yes, definitely... it goes the other way as well: many hardware architectures are designed to accommodate certain languages.
RISC architectures are very much a response to people moving from assembly to compiled languages like C/C++.
Burroughs B5000 had Algol instead of assembler.
There are several different Forth chips.
Lisp machines were designed to run Lisp efficiently.
Java processors run Java bytecode in hardware.
Some ARM processors have (optional) Java acceleration technology.
Probably many more good examples are available.
Yes they do. For example, the Occam programming language was originally targeted specifically at the Transputer architecture.
Perhaps this is a bit of a smart-ass answer, but:
The assembly languages of the processors involved are tightly linked to the architecture, so, yes, there do exist some languages where it is true.
Whether higher-level languages exhibit the same is perhaps more interesting.
I saw a talk on Google Video by Simon Peyton Jones that talked about this. He mentioned that back in the day people were very interested in writing hardware that was specialized to execute a particular language, but people figured out a better way to solve the problem: make the compiler smarter. Take a look at Haskell. GHC produces some ridiculously fast code from high level constructs, yet Haskell is so much unlike x86 assembler that the two look alien to each other. The same kind of thing happened with Java and Lisp: Java and Lisp are both very fast on modern computers and take decent advantage of our processors, but Java was originally compiled for a weird stack-based bytecode and, long ago, people built Lisp machines.
Here's the video, by the way. Most of it is not relevant to the current question but you may find it interesting, it's about "why functional programming is important" and how to make unit tests the easy way.
http://video.google.com/videoplay?docid=-4991530385753299192&hl=en
It's only fairly recently (last decade or so?) that compilers have been smart enough to make Haskell and Java almost as fast as C, even though neither of them exposes much of the underlying architecture. Heck, GHC doesn't even use the stack, how wacky is that?
The best known example is of course C.
C was written in the early 1970s to suit the DEC PDP-11.
For example, on the PDP-7 the programming language B had only one data type; when it was ported to the PDP-11, which had data of different sizes, data types for variables were added to the language.
Most languages target Von Neumann architecture, which is the basis of most CPU.
Occam for Transputer, mentioned by Neil Butterworth is a notable exception.
VHDL is another exception, based on data flow concept, but it is not a programming language, it is a hardware description and simulation language.
This is probably more of a subjective question, but which language (not API like .NET or JDK) would you use should you write your own operating system? Which language provides flexibility, simplicity, and possibly a low-level interface to the hardware? I was thinking Java or C...
C, of course.
Haskell.
Once you have flipped the right hardware bits, C is a terrible language to use for the rest of the OS. Things like the scheduler, filesystems, drivers, etc. are complex high-level algorithms, and you don't want to be writing those in assembly language (or C; same thing). It's too hard to get right. (The VM subsystem and memory manager may need to be written in something low-level, as you will need to bootstrap your high-level language's runtime somehow.)
Anyway, this isn't just a crazy idea that I am coming up with for SO. Here is an OS written in Haskell: http://programatica.cs.pdx.edu/House/
Lisp is another good choice; the original Lisp machines were infinitely more tweakable (at runtime) than "modern" OSes like UNIX and Windows.
Sometimes history forgets good ideas (often in the name of "maximum performance"), and that makes me sad.
D would be an interesting choice. From its own description:
D is a systems programming language. Its focus is on combining the power and high performance of C and C++ with the programmer productivity of modern languages like Ruby and Python. Special attention is given to the needs of quality assurance, documentation, management, portability and reliability.
The D runtime assumes the existence of a garbage collector, which would not be appropriate for the very lowest levels of the kernel. However, it would be appropriate for many of the higher layers.
Build the basic components like task schedulers and drivers etc with Assembly, then build the higher level components like applications and tools with C
I believe this is how Windows XP was built too (unsure about Windows Vista and Windows 7).
Definitely... C
C, ASM, C#
Singularity
Low-level in something like Haskell or D. Productivity over performance, in my opinion. You can rewrite slow parts in C++ or even assembly later if the need arises.
High-level in Python or Ruby. Ideally I'd also have a really fast JIT-capable VM for that language, but that's not going to happen for either language for a while. Lua would be a good alternative if speed gets in the way.
The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. The higher levels could be built with a combination of Java or more ideally Objective-C, and scripting languages like python and ruby, or lua.
Honestly, I would either use C or some hierarchy of languages that I had either designed or fit together completely seamlessly. What I would be looking for is a seamless experience that starts at the bare metal level and then I could move to higher and higher level languages as I moved up the problem space. I would probably chose something like:
C - for bare metal stuff like drivers, kernel, etc
Java/C# - for application-level things like administration consoles, OS apps
Python/PowerShell - for scripting activities like common administrative tasks (creating a new user, etc)
Personally, I think C/C#/PowerShell is more tightly integrated and the type of experience I'd be looking for. Of course, if I ever got so ambitious as to write an OS, I would have a lot of spare time on my hands and would probably really enjoy tackling the language stack first. So maybe it would be L/L#/LScript ...
BitC seems to have this in mind. Despite its name it seems to be the midpoint of assembly language and Lisp. The goal was to make a language with a strong correspondence with machine language but have an intermediate representation that supports stronger correctness inferences than is possible with most other common languages. The language was created as part of the Coyotos project, an operating system with lofty goals of security and reliability. Formal verification is made much more feasible by the IR used in BitC.
Ada:
Ada is a structured, statically typed, imperative, and object-oriented high-level computer programming language, extended from Pascal and other languages. It was originally designed by a team led by Jean Ichbiah of CII Honeywell Bull under contract to the United States Department of Defense (DoD) from 1977 to 1983 to supersede the hundreds of programming languages then used by the DoD. Ada is strongly typed and compilers are validated for reliability in mission-critical applications, such as avionics software.
Ada, because it was not only specifically developed for such projects, but it also provides support for several very useful high level features (such as support for strong typing, concurrency and abstraction) that are simply not available in standard C.
So that, even as a project grows, you don't have to work around language limitations (think encapsulation, abstraction, namespaces in C).
Don't get me wrong, C obviously works for a great many projects, but once a project has gained a certain size (think Linux kernel, gcc, GNOME), you will inevitably appreciate certain features of more high level languages to make the development process less tedious and also less obfuscated.
In C, however, these features usually end up being (pretty poorly) emulated by excessive and almost perverse use of the pre-processor (this can, for example, be seen in the gcc code base), so that you get to see lots of nested macros that, from an implementation point of view, actually emulate features found in other programming languages.
In addition, Ada is the only programming language that I am aware of that actually provides standardized support for source code analysis, through ASIS. Having such a facility in place is, however, the prerequisite for actually being able to maintain and transform/re-engineer a code base in the long run (think refactoring).
Having an interface like ASIS available, means that you can actually implement "semantic patching", where you can automatically rename a file, function or variable/data structure and it will actually work.
Java?? No, Java runs on a virtual machine, which needs an OS to run on top of.
maybe C and some ASM ;)
I would go with D to see whether it can do it.
I would only pick the following 3 out of practicality.
C (good old fashioned)
C++ (C with stuff tacked on. Windows is partially written in this)
Java (the medium level language that just might have a capable garbage collector with controllable pauses with G1).
If I were going to start a new OS I'd do it with the subset of C++ recommended by the embedded industry. You can use things like classes and use it "as a better C" and be just as fast. Just avoid things that have massive overhead. You can even use some template features, if you stick to a certain subset that basically doesn't have any overhead. Look on embedded.com for features in C++ that have little to no overhead, but will allow you to organize your code better than you ever could in C.
Oberon? I guess I miss Pascal too much some times. C paid the bills for quite a while, but I don't really love it.
Lisp of course!
Title text: Some say the world will end in fire; some say in segfaults.
For an OS, you want speed at the lowest levels. So assembly, C, C++, Objective-C, or Java seem to be the current choices. Although it's just recently that Java got fast, and it's hard for me to imagine an OS with garbage collection.
If I were writing my own, it would be a mix of assembly and C.
A C or C++ microkernel with a JIT for a highly dynamic language like Ruby or maybe a language with native support for the Prototype pattern. Even device drivers in that language.
Not because it's practical but because it's really cool. Cool in the way that NeXTStep was cool for using Obj-C for pretty much everything.
http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html - share of languages in Linux's source code.
C, for a number of reasons. Other candidates, like D, are great. However, C has this advantage: there's a lot of available open-sourced C code that you could reuse in your project (much more than there is for other languages appropriate for system programming).
I would be torn between using some existing low-level language and writing my own based on C# but with much better generics support.
In the second case I would make each method generic, with all the constraints resolved by the compiler, to allow "duck typing" like in Scala while keeping the language static. Static virtual methods would also reduce the size of the codebase.
I've had that idea for a long time, but it never seems to be doable in real timeframe, so who knows maybe in the future. :-)
Some would say Java.
Note that openfirmware is written partly in Forth, and it's very low level.
Have an open mind.
"The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. "
Um... What about FORTH?
FORTH can be low level and high level, so you could have a whole operating system written in FORTH from the ground up, and still have a nice easy REPL scripting environment on top, also in FORTH.
However, any decent operating system should support lots of languages on top, from C all the way to Python, Ruby and Javascript. Making FORTH the basis for it all has a lot of benefits though.
edit: I'd only ever attempt this for an embedded environment with a single known hardware set. Trying to write an OS that could compete with Linux or Windows is a fool's job.
If this isn't a hypothetical question, and you're looking to create your own OS, I'd probably go with C because most of the examples out there are written in C.
Also (and I haven't built an OS yet, so take this with a grain of salt), I'm thinking that the C runtime libraries would be a lot easier to port to your new OS than, say, .NET.
Pascal + Oberon: they have the power of C and C++ but they're not as daunting to use. Both these languages are grossly underappreciated.