I've heard these terms thrown around describing languages before, like C is not quite a low-level language, C++ is a midlevel, and Python is a high-level language.
I understand that it has to do something with the way the code is compiled, and how it is written. But what defines a language into one of those three categories? Are these absolute categories, or just a general idea programmers use to describe languages to each other?
Yes, they're just general terms. It's to do with abstraction, and how close you are to what the computer's actually doing.
Here's a list of programming languages ranging from very low to very high level:
Machine Code could probably be considered the lowest level programming language.
Assembly language is at the level of telling the processor what to do. There is still a conversion step towards machine code.
C is a step up from assembler, because you get to specify what you want to do in slightly more abstract terms, but you're still fairly close to the metal.
C++ does everything that C can do but adds the capability to abstract things away into classes.
Java/C# do similar things to C++ in a way, but without the opportunity to do everything you can do in C (like pointer manipulation in Java's case [thanks Joe!]). They have garbage collection though, which you have to do manually in C++.
Python/Ruby are even higher level, and let you forget about a lot of the details that you would need to specify in something like Java or C#.
SQL is even higher level (it's declarative). Just say "Give me all the items in the table sorted by age" and it will work out the most efficient way to carry this out for you.
low level = long development time + very fast executable file
high level = shorter development time + slower executable file
mid level is between the two
Very low-level: Machine Code
Low level: Assembler, Forth
Mid level: C, C++, most system programming languages
Mid/High level: D, Go, garbage collected system programming languages
High level: Java, C#, most interpreted languages
Even Higher level: Lisp dialects
Highest level: SQL, declarative programming languages
If there is anything else to be added, tell me.
They aren't absolute. They are all relative to what other languages are being used in industry at the time. For example, there was a time when assembly was considered mid-level.
The 'level' is essentially a measure of how abstracted the programmer is from the actual hardware-based operations. In a low level language you might have to care about actual memory locations, whereas in a high-level you just create variables and let the OS handle memory.
A normal CPU processes either 32 or 64-bit instructions. In the simplest form, think of this as an 32 1's and 0's in a row - that's what the processor actually interprets and executes. Writing this directly (machine code) would be the 'lowest-level'.
Low level means closer to the machine, and therefore more difficult and more powerful. The higher level you get, the more removed from the machine and "English-like" you get, but you lose a lot of the power and functionality that comes with being able to control the minute details of the machine. Higher level languages also generally tend to protect you more and have much more precautions and checks in place, while lower level languages trust you, so to speak, and let you play around at your own risk.
The term mid-level language is one I've never heard.
"Low" and "High" refer to how "close" to the machine you are in your programming. The lowest level would be machine (binary) code. Next (and still considered low) is assembler. The higher level languages involve more symbolism and constructs that are supposed to be closer to how humans normally think. C (and somewhat C++) has a reputation as being somewhat a hybrid low/high level because it has many constructs that are in high level languages, but also has instructions (e.g. shifts) that are low level languages but often not in higher level languages.
From low to high, you can categorize the languages as follows.
Machine Code --> Assembly Language --> Compiled Language --> Interpreted Language
Remember that these aren't absolute black and white definitions, but rather shades of gray. This is more of a guideline than a rule.
Think of machine code as a long string of 1s and 0s understood by the native platform. Consider this your baseline... the lowest "level" you can have.
Assembly language could be considered a symbolic representation of this. I believe there is a 1 to 1 mapping between assembly code instructions and machine code instructions. This is your low level language.
Java and C++, for example, are both compiled languages, but many would consider C++ to be a lower level language than Java because it exposes low level system access, while Java runs in a protected environment (the virtual machine). Remember that a compiled language is compiled (converted, if you will) to machine code before execution. C is also a compiled language, but would be considered lower level than both Java and C++.
For our sake, we will say that C and C++ are low level languages because they offer (relatively) little abstraction from the hardware and direct memory management. In actuality, they fall somewhere between low and mid, as you will see soon enough.
We will call Java and C# (.NET) mid level languages because they have automatic memory management (garbage collection), plenty of high-level abstractions (IE objects... yet C++ supports objects. Do you see why the scale is considered to be loosely defined?)
With an interpreted language, the interpreter resides in memory and reads the source code directly. These are high level languages. Python, Perl, Javascript, and PHP are all examples of high level languages.
C is a midlevel language, because we can use code in assembly language.
The only slight difference is pointers make it powerful (if pointer remove in C then it be will considered a low-level language). Its portable features makes it midlevel, so we can say it is a midlevel language.
It is all relative... The "level" reflect the amount of abstraction.
Once you add a spectrum of levels of a programming language you add nuance to the definition.
Clearly machine code and assembly are machine-dependent. C and C++ in theory are machine-independent, but in truth that is not universal. In C, things like alignment need to be taken into account and you can always manage the stack in C and in the C subset of C++) via a pointer and a single initialised variable—if you are crazy enough—so that (x86) the RSP register (stack pointer) is never used. So C, yes it is midlevel. Everything else is high-level and some super high-level.
Low-level languages are very close to machine language that may be binary or RTL. It is hard to write and very quick to execute. It can interact with the hardware and a high-level programming language is very easy to write, but it can be executed after compilation.
Related
I am trying to gain a more holistic, general, and high-level understanding of programs and programming languages.
I would like to understand how they actually function. I understand at the lowest level is machine code which is 0 and 1s. Then you have assembly. Then you have another high level language where every instruction/function/method/call/routine whatever you want to call it maps to some instruction or group of instructions in assembly right? The higher level language cannot provide or do anything outside of what the lower level language assembly provides correct?
Similarly, since all code runs on an OS, that code can only do things that the OS provides. It is impossible for the code to do anything outside of what the OS actually provides correct?
The computer has an instruction set, machine code, which defines what can be done on the computer. Assembly code is essentially a more convenient representation of that, so assembly code can do anything the machine can do. A higher-level programming language has to run on the machine, so it can't do anything the machine can't do, though it may well be able to express it more conveniently (e.g. print "foo" rather than several dozen machine-code instructions). It is a matter of implementation choice whether the compiler for that programming language generates machine code directly, or assembly code, or any other form that might need further processing.
This brings us to the question of whether it's possible for a program (regardless of what it's written in) to do something the OS does not explicitly provide for. I find this an odd way to express it, since the point of writing a program is to give you some capability that you didn't have before, so in some sense you only write programs for things the OS does not explicitly provide you with. The problem lies in defining what the OS "provides". If it's a general-purpose OS, then its designers likely intend to "provide" the ability for you to write a wide range of programs. The OS may choose to provide some convenient feature (say, the ability to create files) but if it did not provide such a feature, you could perhaps do it yourself, given suitable motivation (and, for the file creation example, the ability to do disk I/O - possibly requiring you to write a disk driver too).
I am trying to gain a more holistic, general, and high-level
understanding of programs and programming languages.
I would like to understand how they actually function.
I recommend working through the understanding of modern hardware to get good performance and energy efficiency with the following as example:
With subword parallelism to improve matrix multiply by a factor of 4.
Doubling the performance by unrolling the loop to demonstrate the value of instruction level parallelism.
Doubling performance again by optimizing for caches using
blocking.
Finally, A speedup of 14 from 16 processors by using thread-level parallelism.
All four optimizations in total add just 24 lines of C code to a matrix multiply example you find in
Computer Organization and Design: The Hardware Software Interface (5th ed.)
or similar books.
To make a point, it really is worth it to dig deeper than just "learning python" even if it is a good start. So understanding low level really effects high level programming in many ways and thats what you are after according to your question.
There is actually not just Assembly - there are hardware description languages as VHDL thats handled in this topic f.e.:
https://electronics.stackexchange.com/questions/132611/whats-the-motivation-in-using-verilog-or-vhdl-over-c
In the context of programming language discussion/comparison, what does the term "power" mean?
Does it have a well defined meaning? Even a poorly defined meaning?
Say if someone says "language X is more powerful than language Y" or asks the same as a question, what do they mean - or what information are they trying to find out?
It does not have a well-defined meaning. In these types of discussions, "language X is more powerful than language Y" usually means little more than "I like language X more than language Y." On the other end of the spectrum, you'll also usually have someone chime in about how any Turing-complete language can accomplish the same tasks as any other Turing-complete language, so that neither is strictly more powerful than the other.
I think a good meaning for it is expressivity. When a language is highly expressive, it means less code is required to express concepts. To me, this doesn't just mean that you have to write less code to accomplish the same tasks, but also that the code is easily readable by humans. Of course, generally (to a point), having fewer lines of code to read makes the task of reading and understanding easier for humans.
Having a "powerful" standard library comes into play here along the same lines. If a language comes equipped with thorough, complete libraries, then idiomatic code in that language will be able to benefit from the existing library code and not have to repeat or reinvent common functionality in application code. The end result is, again, having to write and read less code to accomplish the same tasks.
I keep saying "generally" and "to a point", because once a language gets too terse, it gets more difficult for humans to decipher. I suppose at this extreme, a language may still be considered "more powerful" (or even "too powerful"). So I guess I'm saying my personal interpretation of "powerful" includes some aspects of "useful" and "readable" in it as well.
C is powerful, because it is low level and gives you access to hardware. Python is powerful because you can prototype quickly. Lisp is powerful because its REPL gives you fantastic debugging opportunities. SQL is powerful because you say what you want and the DMBS will figure out the best way to do it for you. Haskell is powerful because each function can be tested in isolation. C++ is powerful because it has ten times the number of syntactic constructs that any one person ever needs or uses. APL is powerful since it can squeeze a ten-screen program into ten characters. Hell, COBOL is powerful because... why else would all the banks be using it? :)
"Powerful" has no real technical meaning, but lots of people have made proposals.
A couple of the more interesting ones:
Paul Graham wants to call a language "more powerful" if you can write the same programs in fewer lines of code (or some other sane, sensible measure of program size).
Matthias Felleisen has written a very serious theoretical study called On the Expressive Power of Programming Language.
As someone who knows and uses many programming languages, I believe that there are real differences between languages, and that "power" can be a convenient shorthand to describe ways in which one language might be better than another. Nevertheless, whenever I hear a discussion or claim that one language is more powerful than another, I tend to keep one hand firmly on my wallet.
The only meaningful way to describe "power" in a programming language is "can do what I require with the least amount of resources" where "resources" is defined as "whatever costs I'd rather not pay" and could, thus, be development time, CPU time, memory space, money, etc.
So basically the definition of "power" is purely subjective and rendered meaningless in any objective discussion.
Powerful means "high in power". "Power" is something that increases your ability to do things. "Things" vary in shape, size and other things. Loosely speaking therefore, "powerful" when applied to a programming language means that it helps you to do perform your tasks quickly and efficiently.
This makes "powerful" somewhat well defined but not constant across domains. A language powerful in one domain might be crippling in another eg. C is very powerful if you want to do systems level programming since it gives you direct access to the machine and hardware and structures that let you code much faster than you would in assembly. C compilers also produce tight code that runs fast. However, once you move to web applications, C can become very "unpowerful" and crippling since it's so much effort to get something up and running and you have to worry about a lot of extraneous details like memory etc.
Sometimes, languages are "powerful" in multiple domains. This gives them a general "powerful" tag (or badge since were are on SO here). PG's claim is that with LISP, this is the case. That might be true or might not be.
At the end of the day, "powerful" is a loaded word so you should evaluate who is saying it, why he's saying it and what it means to to your work.
There are really only two meanings people are worried about:
"Powerful" in the sense of "takes less resources (time, money, programmers, LOC, etc.) to achieve the same/better result", and "powerful" in the sense of "is capable of doing a wide range of tasks".
Some languages are extrememly resource-effective for a small range of tasks. Others are not so resource-effective but can be applied to a wide range of tasks (e.g. C, which is often used in OS development, creation of compilers and runtime libraries, and work with microcontrollers).
Which of these two meanings someone has in mind when they use the term "powerful" depends on the context (and even then is not always clear). Indeed often it is a bit of both.
Typically there are two distinct meanings:
Expressive, meaning the code tends to be very short and understandable
Low level, meaning you have very fine-grained control over the hardware.
For the most languages, these two definitions are at opposite ends of the spectrum: Python is very expressive but not very low level; C is very low level but not very expressive. Depending on which definition you pick, either language is powerful or not powerful.
nothing absolutely nothing.
To high level programmers it might mean alot of available datatypes built in. Or maybe abstractions to easily create or follow Design Patterns.
Paul Graham is a very high level guy here is what he has to say:
http://www.paulgraham.com/avg.html
Java guys might tell you something about portability, the power to reach every platform.
C/UNIX programmers may tell you that its speed and efficiency, complete control over every inch of memory.
VHDL/Verilog programmers will tell you its complete control over every clock and gate so as to not waste any electricity or time.
But in my opinion a "powerful language" supports all of the features for you to complete your task. Documentation may be important, or perhaps it is portability, or the ability to do graphics. It could be anything, writing a gui from Assembly is just stupid, so is trying to design an embedded processor in flash.
Choosing a language that suits your needs perfectly will always feel like power.
I view the term as marketing fluff, no one well-defined meaning.
If you consider, say, Assembler, C, and C++. On occasions one drops from C++ "down" to C for particualr needs, and in turn from C down to assembler. So that make assembler the most powerful because it's the only language that can do everything. Or, to argue the other way, a single line of C++ code can replace several of C (hiding polymorphic dispatch via function pointers for example) and a single line of C replaces many of assembler. So C++ is more powerful because one line does "more".
I think the term had some currency when products such as early databases and spreadsheets had in-built languages, some quite restricted. So vendors would tout their language as being "powerful" because it was less restricted.
It can have several meanings. In the very basic sense there's power as far as what is computable. In that sense the most powerful languages are Turing Complete which includes pretty much every general purpose programming language (as opposed to most markup languages and domain specific languages which are often not Turing complete).
In a more pragmatic sense it often refers to how concisely (and readably) you can do certain things. Basically how easy is it to do certain tasks in one language compared to another.
What language is more powerful (besides being somewhat subjective) depends heavily on what you're trying to do. If your requirements are to get something running on a small device with 64k of memory you're likely not going to be using Java. Most likely the right language would be C or C++ (or if you're really hard core assembly). If you need a very simple CRUD app done in 1 day, maybe something like Ruby On Rails would be the way to go (I know Rails is a framework and Ruby is the language, but these days what libraries and frameworks are available factor greatly into picking a language)
I think that, perhaps coincidentally, the physics definition of power is relevant here: "The rate at which work is performed."
Of course, a toaster does not perform very quickly the work of putting out fires. Similarly, the power of a programming language is not universal, but specific to the domain or task to which it is being applied. C is a powerful language for writing device drivers or implementations of higher-level languages; Python is a powerful language for writing general-purpose applications; XPath is a powerful language for writing queries on structured data sets.
So given a problem domain, the power of a language can be said to be the rate at which a competent programmer is able to use it to solve problems in that domain.
A precise answer can be tried to reach, by not assuming that the elements that define "powerful" (in the context of languages) come from so many dimensions.
See how many could be, and a lot will be missing:
runtime speed
code size
expressiveness
supported paradigms
development / debugging time
domain specialization
standard libs
codebase
toolchain ecosystem
portability
community
support / documentation
popularity
(add more here)
These and more parameters draw together X picture of how "programming in some language" would be like at X level. That will be only the definition, though, the only real knowledge comes with the actual practice of using the language, but i digress.
The question comes down to which parameter will represent the intrinsic quality of a language. If you refer to a language in itself, its ultimate, intrinsic purpose is "express things", and thus the most representative parameter is rightfully expressiveness, and is also one that resonates frequently when someone talks about how powerful a language is.
At the moment you try to widen the question/answer to cover more than the expressiveness of the language "as a language, as a tongue", you are more talking about different kinds of "environment", social environment, development environment, commercial environment, etc.
Depending of the complexity of the environment to be defined you'll have to mix more parameters that come from multiple, vast, overlapping and sometimes contradictory dimensions, and eventually the point of getting the definition will be lost or the question will have to be narrowed.
This approximation still won't answer "what is an expressive language", but, again, a common understanding are the definitions that Vineet well points out in its answer, and Forest remarks in the comments. I agree, for me "expression" is "conveying meaning".
I remember many instructors in college calling whatever language they were teaching "powerful".
Leads me to think:
Powerful = a relative term comparing the latest way to code something vs. the original or previous way.
I find it useless to use the word "powerful" in regards to discussing anything software related. Every time my professor in college would introduce a new concept such as polymorphism he would say "so this is a really powerful feature". After a while I got annoyed. If everything is powerful then nothing is. It's all the same. You can write code to do anything. Does is really matter how much code is required to do it? You can say it's short or efficient but powerful is just useless. Nuclear energy is powerful. Code is words.
I think that power would normally refer to how quickly it can process data, for example I found that in python as soon as a list exceeds a length of approx. 2000 it becomes unbearably slow whereas in C++ a list can easily contain 20,000 entries without doing so.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
In informatics theory I hear and read about high-level and low-level languages all time.
Yet I don't understand why this is still relevant as there aren't any (relevant) low-level languages except assembler in use today.
So you get:
Low-level
Assembler
Definitely not low-level
C
BASIC
FORTRAN
COBOL
...
High-level
C++
Ruby
Python
PHP
...
And if assembler is low-level, how could you put for example C into the same list. I mean: C is extremely high-level compared to assembler. Same even for COBOL, Fortran, etc.
So why does everybody keep mentioning high and low-level languages if assembler is really the only low-level language?
You will find that
many of the truths we cling to depend upon our own point of view.
For a C programmer, Assembler is a low-level language.
For a Java programmer, C is a low-level language and so on.
I suspect the folks programming the first stored-program computer with 1s and 0s would have thought Assembler a high-level language. It's all relative.
(Quote from Return of the Jedi)
According to Wikipedia, the low level languages are machine code and assembly.
From the source:
In computer science, a low-level
programming language is a programming
language that provides little or no
abstraction from a computer's
instruction set architecture. The word
"low" refers to the small or
nonexistent amount of abstraction
between the language and machine
language; because of this, low-level
languages are sometimes described as
being "close to the hardware."
Then, to answer:
So why does everybody keep mentioning high and low-level languages if assembler is really the only low-level language.
I don't know who "everyone" is, but I would venture a guess that back when high-level languages were not as commonplace as they are today, it was more relevant to talk about low-level vs. high-level (because there was a relatively significant amount of programmers writing assembly code). In modern times it is a less important distinction. Personally, I rarely hear people using these terms except to differentiate between assembly or not (except for those times when you might hear someone raised on Python referring to C or C++ as low-level, but this is not in the spirit of the original definition).
You're asking a relatively subjective question; it's a question about terminology, that vernacular, and perspective.
For example, is Lisp a high-level or a low-level language? What if the implementation is running on a Lisp Machine?
Often, when people attempt to build a spectrum from low-level to high-level, what they are trying to quantify is a degree of "closeness to the hardware" as opposed to the degree of "abstraction."
Qualities which count toward an implementation's closeness to the hardware:
The programmer directly controls the memory layout of data and has access at run-time to memory addresses of data.
Mathematical operations are defined in terms of the hardware or loosely defined in order to conform to different types of hardware.
There may be a library providing dynamic memory allocation, but use of dynamic memory is manual.
Management of memory during string manipulation is manual.
Converse qualities which count toward an implementation's abstraction from the hardware:
The programmer does not have run-time access to address of data (references instead of pointers).
Mathematical operations are defined in specific terms not tied to specific hardware. (e.g., ActionScript 3 supports the Number type which self-converts from integer to floating-point rather than experience overflow.)
Management of dynamic memory is handled by the environment, possibly through reference counting, garbage collection, or another automated memory management scheme.
Management of memory during string manipulation is always hidden from the programmer and handled by the environment.
Other qualities might render a language very abstract compared to the hardware on which it runs:
Declarative, search-based syntax. (e.g. Prolog)
With factors like these in mind, I would revise the spectrum you have written as follows:
Lowest level:
Assembly language of the platform in question.
Low-level languages with higher-level flow control than assembly:
C, C++
Pascal
High-level languages:
FORTRAN
COBOL
Python
Perl
Highest-level languages:
PROLOG
Python
Scheme
Python appears twice by intent -- it spans a portion of the spectrum depending on how the code is written.
As low-level, I would add:
.NET IL
Java JVM
Other P-Code used in environments like VB6
The "level" of a language is a moving target. In 1973, PL/I was considered a high-level language. Today, C is considered (at least by language professionals) as a low-level language [see footnote]. Some of the reasons:
Exposes machine-level representations of numbers
"Integer" arithmetic can overflow
No real support for strings, or at the very least, strings are not first-class
Manual memory management
Address arithmetic
Unsafe
A high-level language might include
Support for integer types independent of the target machine
Default integer arithmetic never overflows unless the machine runs out of memory
Strings as first-class values with, e.g., concatenation built in
Automatic memory management with no address arithmetic
Safe
Some candidates as "high-level languages" by this definition might include Icon, Scheme, Smalltalk, and some of your favorite scripting languages.
Back in the day when I was a young scholar and dinosaurs roamed the earth, people referred to Icon as a "very high-level language". As recently as 15 years ago you could even attend a learned symposium on Very High Level Languages. But that term isn't used much any more.
Why does everybody keep mentioning high and low-level languages?
Even though the difference between "high" and "low" keeps changing, distinctions like the ones listed above are still important. And there are so many distinction that the words "high" and "low" can be a useful shorthand. But not that useful—to a cynic, a high-level language is one that looks at least as powerful as whatever my favorite language is, and a low-level language is everything else. In other words, "level" can easily degenerate into mere name-calling.
Footnote: It's hard to find citations for terminology used at professional meetings, especially when professionals don't use the terms "low-level" and "high-level" because they're not so technical. But danben asked about citations, and I found a couple:
"To provide the required precision, experimental programs are usually written in a low-level language (eg C or Pascal)," in a refereed article on computer vision.
"The C programming language is well-known for its flexibility in dealing with low-level constructs," in an important paper by Necula et al.
P.S. Don't count too heavily on Wikipedia for good information on programming languages, especially if the Wikipedia reference cites no references or sources
Purely guessing here, but this may be a case of language-shift, whereby the distinction between low- and high-level langauges is slowly evolving in peoples' minds into the difference between managed- and unmanaged-languages, typed-and untyped-languages etc.etc. (at least in the way people are using the terminology).
To a large extent, "low-level" and "high-level" not binary categories but are a continuum. There are some languages that are clearly low-level (assembly, machine code), but beyond that there is really only "higher-level" and "lower-level".
As I see it, "lower-level" languages require code that looks more like the architecture of the computer, and "higher-level" languages accept code that looks more like the structure of the problem. But with that, languages can be high-level for one problem and low-level for another.
Low-level
Binary
Assembler
ET IL
Java JVM
Other P-Code used in environments like VB6
Definitely not low-level
C
BASIC
FORTRAN
COBOL
Python
Perl
Pascal
High-level
C++
Ruby
Python
PHP
PROLOG
Scheme
I have become interested in C-like languages for performance computing. Can you recommend some alternative programming languages which have the following attributes:
must be close to the hardware (bit fiddling, pointers or some alternative safe method like references)
no managed code (no jvm/.net languages)
has to be really fast (like C)
must be above ASM level (and yes I am interested in macro languages on top of ASM)
can be obscure, not very widespread
I am mainly interested in little-known languages.
How about Assembly language, or the D programming language?
If you don't know about it and are interested just in broadening your horizons, take a look at Forth. Reading about Forth always makes me feel C is high-level.
Well, I've always preferred C and/or C++ because there are multiple flavours (MSVC, glibc etc), it runs on many different platforms (e.g. mobile devices, Windows, linux) and devices, and it can be written cross platform (different processor architectures) and even for high end graphics (e.g. DirectX).
You get "decent" access to platform resources (conditions vary), it can be as fast as you choose to hone it, and it's a tad easier (IMHO) to write than ASM. There's also a pretty decent range of support tools and code analysis tools to make things a little easier.
Also C and C++ have been around for quite some time, so it's got (even today) an excellent and enthusiastic community!
You don't explicitly state that it can't be C in your question, so I'll go ahead and recommend C. It fulfills your three bulleted desires, and you won't have to worry about different versions of the language (like each different kind of assembler).
Forth!
Forth can be faster than machine language on some architectures. The compiled code is extremely dense, therefore, making optimal use of code caching.
assembly would be the closest to the hardware and therefore the fastest
Ada was originally designed for embedded systems (among other things).
OpenCL might be interesting. It's sort of like OpenGL shader language (a subset of C with extensions), but for general purpose parallel array computing.
You could start programming FPGAs in VHDL, Verilog, System C ...
Variations on a theme
FORTRAN is older than C, and is still one of the major players in numerical computing. Until 1990 (when the language was substantially modernized), the language didn't have any form of pointer (checked or not). This lack meant that there was no way to manage memory dynamically; it also made aliasing analysis easy for the compiler, which is one of the things that makes Fortran code fast.
ALGOL was the first structured programming language. Although it had limited success with programmers, it had a strong influence on language designers.
Ada is an imperative language with a strong type system and good modularity, which makes it good for low-level programming with strong assurance requirements (it was sponsored by the US government with military and avionics applications in mind). It was inspired by Pascal, like Modula-2 and Modula-3.
Going further from the mainstream of low-level imperative programming, there is FORTH. FORTH can be compiled for, and even interpreted on, devices with very little memory; it finds a lot of use on low-end embedded systems, including microcontrollers. The language is based on reverse polish notation, made famous by HP calculators (in fact, the language of HP calculators is strongly influenced by FORTH). Many implementations don't have variables: all data is kept on one or more stacks.
Just for fun, I'll mention INTERCAL, the grandaddy of esoteric languages.
Stuff that will blow your mind
Esoteric languages can be instructive, and a quite a few work close to the machine (usually a virtual machine, but in principle you could implement them for an actual computer if you were crazy enough). You could look at brainfuck (a sort of intermediate stage between Turing machines and C), or the many single-instruction languages, or befunge (what if memory was a two-dimensional array?).
Cyclone looks a lot like C. The syntax is the same, and Cyclone has pointers, untagged structures and unions, goto statements and manual memory management. And yet it's a safe language: you can't have a dangling pointer, or a buffer overflow. And you have access to high-level features such as pattern matching, exceptions, polymorphism, abstract types and optional automatic memory management (not just garbage collection, but also regions). Cyclone is both useful and instructive; for a C die-hard, it can be a good way of discovering what makes a safe language. Cyclone can compile to C, so you can run your programs anywhere you have a C compiler for.
Going in a different direction, if you want to be close to the hardware, while still not actually designing hardware, have a look at synchronous languages, such as Lustre and Esterel. These languages are used to program high-assurance realtime systems such as nuclear plants, airplanes and railway signaling. These languages give up Turing completeness and gain the assurance that programmers can know exactly how fast their program will run and how much memory it will require. If you think C is close to the machine, finding out what a language that is really close to the machine may come as a shock.
You can't get much closer than assembly language, unless you get a job with a chip-maker and start writing micro code!!!
If you're on Windows I think you can get hold of Microsoft MASM (macro assembler) that will allow you go get up and running quickly. I used it a long time ago and it's not a bad product.
Seems a bit awkward to answer my question, but I have found two languages:
Pyrex
Vala
They may not fulfill all of the constraints, but they are great for performance computing and both translates to C.
This is probably more of a subjective question, but which language (not API like .NET or JDK) would you use should you write your own operating system? Which language provides flexibility, simplicity, and possibly a low-level interface to the hardware? I was thinking Java or C...
C, of course.
Haskell.
Once you have flipped the right hardware bits, C is a terrible language to use for the rest of the OS. Things like the scheduler, filesystems, drivers, etc. are complex high-level algorithms, and you don't want to be writing those in assembly language (or C; same thing). It's too hard to get right. (The VM subsystem and memory manager may need to be written in something low-level, as you will need to bootstrap your high-level langauge's runtime somehow.)
Anyway, this isn't just a crazy idea that I am coming up with for SO. Here is an OS written in Haskell: http://programatica.cs.pdx.edu/House/
Lisp is another good choice; the original Lisp machines were infinitely more tweakable (at runtime) than "modern" OSes like UNIX and Windows.
Sometimes history forgets good ideas (often in the name of "maximum performance"), and that makes me sad.
D would be an interesting choice. From its own description:
D is a systems programming language. Its focus is on combining the power and high performance of C and C++ with the programmer productivity of modern languages like Ruby and Python. Special attention is given to the needs of quality assurance, documentation, management, portability and reliability.
The D runtime assumes the existence of a garbage collector, which would not be appropriate for the very lowest levels of the kernel. However, it would be appropriate for many of the higher layers.
Build the basic components like task schedulers and drivers etc with Assembly, then build the higher level components like applications and tools with C
I believe this is how Windows XP was built too (unsure about Windows Vista and Windows 7).
Definitely... C
C, ASM, C#
Singularity
Low-level in something like Haskell or D. Productivity over performance, in my opinion. You can rewrite slow parts in C++ or even assembly later if the need arises.
High-level in Python or Ruby. Ideally I'd also have a really fast JIT-capable VM for that language, but that's not going to happen for either language for a while. Lua would be a good alternative if speed gets in the way.
The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. The higher levels could be built with a combination of Java or more ideally Objective-C, and scripting languages like python and ruby, or lua.
Honestly, I would either use C or some hierarchy of languages that I had either designed or fit together completely seamlessly. What I would be looking for is a seamless experience that starts at the bare metal level and then I could move to higher and higher level languages as I moved up the problem space. I would probably chose something like:
C - for bare metal stuff like drivers, kernel, etc
Java/C# - for application-level things like administration consoles, OS apps
Python/PowerShell - for scripting activities like common administrative tasks (creating a new user, etc)
Personally, I think C/C#/PowerShell is more tightly integrated and the type of experience I'd be looking for. Of course, if I ever got so ambitious as to write an OS, I would have a lot of spare time on my hands and would probably really enjoy tackling the language stack first. So maybe it would be L/L#/LScript ...
BitC seems to have this in mind. Despite it's name it seems to be the midpoint of assembly language and lisp. The goal was to make a language with a strong correspondence with machine language but have an intermediate representation that supports stronger correctness inferences than is possible with most other common languages. The languages was created as part of the Coyotos project, an operating system with lofty goals of security and reliability. Formal verification is made significantly possible with the ir used in BitC.
Ada:
Ada is a structured, statically typed, imperative, and object-oriented high-level computer programming language, extended from Pascal and other languages. It was originally designed by a team led by Jean Ichbiah of CII Honeywell Bull under contract to the United States Department of Defense (DoD) from 1977 to 1983 to supersede the hundreds of programming languages then used by the DoD. Ada is strongly typed and compilers are validated for reliability in mission-critical applications, such as avionics software.
Ada, because it was not only specifically developed for such projects, but it also provides support for several very useful high level features (such as support for strong typing, concurrency and abstraction) that are simply not available in standard C.
So that, even as a project grows, you don't have to work around language limitations (think encapsulation, abstraction, namespaces in C).
Don't get me wrong, C works obviously for a great many of projects, but once a project has gained a certain size (think Linux kernel, gcc, GNOME), you will inevitably appreciate certain features of more high level languages to make the development process less tedious and also less obfuscated.
In C however, these features usually end up being -pretty poorly- emulated by excessive and almost pervert use of the pre-processor (this can for example be seen in the gcc code base), so that you get to see lots of nested macros, that from an implementation point of view, actually emulate features found in other programming languages.
In addition, Ada is the only programming language, that I am aware of, that actually provides standardized support for source code analysis using the ASIS, having such a facility in place is however the prerequisite to actually be able to maintain and transform/re-engineer a code base in the long run (think refactoring).
Having an interface like ASIS available, means that you can actually implement "semantic patching", where you can automatically rename a file, function or variable/data structure and it will actually work.
Java ?? no jave runs on a virtual machine which needs an os to run on top of ,
maybe C and some ASM ;)
I would go with D to see whether it can do it.
I would only pick the following 3 out of practicality.
C (good old fashioned)
C++ (C with stuff tacked on. Windows is partially written in this)
Java (the medium level language that just might have a capable garbage collector with controllable pauses with G1).
If I were going to start a new OS I'd do it with the subset of C++ recommended by the embedded industry. You can use things like classes and use it "as a better C" and be just as fast. Just avoid things that have massive overhead. You can even use some template features, if you stick to a certain subet that basically don't have any overhead. Look on embedded.com for features in C++ that have little to no overhead, but will allow you organize your code better than you ever could in C.
Oberon? I guess I miss Pascal too much some times. C paid the bills for quite a while, but I don't really love it.
Lisp of course!
Title text: Some say the world will end in fire; some say in segfaults.
For an OS, you want speed at the lowest levels. So assembly, C, C++, Objective-C, or Java seem to be the current choices. Although it's just recently that Java got fast, and it's hard for me to imagine an OS with garbage collection.
If I were writing my own, it would be a mix of assembly and C.
A C or C++ microkernel with a JIT for a highly dynamic language like Ruby or maybe a language with native support for the Prototype pattern. Even device drivers in that language.
Not because it's practical but because it's really cool. Cool in the way that NeXTStep was cool for using Obj-C for pretty much everything.
http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html - share of languages in Linux's source code.
C, by a number of reasons. Other candidates, like D, are great. However, C has this advantage: there's a lot of available open-sourced C code that you could reuse in your project (much more than is for other languages appropriate for system programming).
I would be torn between using some existing low level language and write my own based on C# but with much better generics support.
In second case I would make each method generic, but all the constraints will be resolved by compiler - to allow "duck typing" like in Scala but still language should be static. Also static virtual methods would lower the codebase.
I've had that idea for a long time, but it never seems to be doable in real timeframe, so who knows maybe in the future. :-)
Some would say Java.
Note that openfirmware is written partly in Forth, and it's very low level.
Have an open mind.
"The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. "
Um... What about FORTH?
FORTH can be low level and high level, so you could have a whole operating system written in FORTH from the ground up, and still have a nice easy REPL scripting environment on top, also in FORTH.
However, any decent operating system should support lots of langauges on top, from C all the way to Python Ruby and Javascript. Making FORTH the basis for it all has a lot of benefits though.
edit: I'd only ever attempt this for an embedded environment with a single known hardware set. Trying to write an OS that could compete with Linux or Windows is a fools job.
If this isn't a hypothetical question, and you're looking to create your own OS, I'd probably go with C because most of the examples out there are written in C.
Also, (And I haven't build an OS yet so take this with a grain of salt), I'm thinking that the c runtime libraries would be a lot easier to port to your new OS than say .NET.
Pascal + Oberon: they have the power of C and C++ but they're not as daunting to use. Both these languages are grossly under appreciated.