What does it mean by a 'systems language'? - programming-languages

The Go talk 2009 pdf has a comment to explain why they came up with the go language :
No new major systems language in a decade.
What's the meaning of systems language?
Is it a language that is supposed to run on target system by generating native binary?
Is it a language that can build operating system on its own?
I can see C#/Java is `not' a systems language, and C/C++ is.

It's a rough, informal distinction, but the idea is that there are "application programming languages," targeted at programmers who develop shrinkwrapped business applications, and "systems programming languages," targeted at programmers who program tools for other programmers (compilers, etc.) and low-level software such as OS kernels, device drivers, etc.
In short, most (recently-invented, anyway) languages are designed to make it easier to develop user-facing software for dealing with some non-computing domain---finance, engineering, etc. Systems programming languages are those, such as C, FORTH, Go, etc. which are intended or at least suitable for programming in the domain of computing.
These often, but do not always, feature compilation to native code, loose type systems which permit extensive "punning," and unmanaged memory access through pointers or an equivalent construct.

Look here? Sorry if this comes off as sorta a throw away link, but really this should be all you need. Unless you are asking for something else more specific.
A reason C# is definitely not a systems language is its dependency on .NET.

Related

How to go about making your own programming language? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Learning to write a compiler
I looked around trying to find out more about programming language development, but couldn't find a whole lot online. I have found some tutorial videos, but not much for text guides, FAQs, advice etc. I am really curious about how to build my own programming language. It brings me to SO to ask:
How can you go about making your own programming language?
I would like to build a very basic language. I don't plan on having a very good language, nor do I think it will be used by anyone. I simply want to make my own language to learn more about operating systems, programming, and become better at everything.
Where does one start? Building the syntax? Building a compiler? What skills are needed? A lot of assembly and understanding of the operating system? What languages are most compilers and languages built in? I assume C.
I'd say that before you begin you might want to take a look at the Dragon Book and/or Programming Language Pragmatics. That will ground you in the theory of programming languages. The books cover compilation, and interpretation, and will enable you to build all the tools that would be needed to make a basic programming language.
I don't know how much assembly language you know, but unless you're rather comfortable with some dialect of assembly language programming I'd advise you against trying to write a compiler that compiles down to assembly code, as it's quite a bit of a challenge. You mentioned earlier that you're familiar wtih both C and C++, so perhaps you can write a compiler that compiles down to C or C++ and then use gcc/g++ or any other C/C++ compiler to convert the code to a native executable. This is what the Vala programming language does (it converts Vala syntax to C code that uses the GObject library).
As for what you can use to write the compiler, you have a lot of options. You could write it by hand in C or C++, or in order to simplify development you could use a higher level language so that you can focus on the writing of the compiler more than the memory allocations and the such that are needed for working with strings in C.
You could simply generate the grammars and have Flex and Bison generate the parser and lexical analyser. This is really useful as it allows you to do iterative development to quickly work on getting a working compiler.
Another option you have is to use ANTLR to generate your parser, the advantage to this is that you get lots of target languages that ANTLR can compile to. I've never used this but I've heard a lot about it.
Furthermore if you'd like a better grounding on the models that are used so frequently in programming language compiler/scanner/parser construction you should get a book on the Models of Computation. I'd recommend Introduction to the Theory of Computation.
You also seem to show an interest in gaining an understanding of operating systems. This I would say is something that is separate from Programming Language Design, and should be pursued separately. The book Principles of Modern Operating Systems is a pretty good starting place for learning about that. You could start with small projects like creating a shell, or writing a programme that emulates the ls command, and then go into more low level things, depending on how through you are with the system calls in C.
I hope that helps you.
EDIT: I've learnt a lot since I write this answer. I was taking the online course on programming languages that Brown University was offering when I saw this answer featured there. The professor very rightly points out that this answer talks a lot about parsers but is light on just about everything else. I'd really suggest going through the course videos and exercises if you'd like to get a better idea on how to create a programming language.
It entirely depends on what your programming language is going to be like.
Do you definitely want it to be compiled? There are interpreted languages as well... or you could implement compilation at execution time
What do you want the target platform to be? Some options:
Native code (which architectures and operating systems?)
JVM
Regular .NET
.NET using the Dynamic Language Runtime (like IronRuby/IronPython)
Parrot
Personally I would strongly consider targeting the JVM or .NET, just because then you get a lot of "safety" for free, as well as a huge set of libraries your language can use. (Obviously with native code there are plenty of libraries too, but I suspect that getting the interoperability between them right may be trickier.)
I see no reason why you'd particularly want to write a compiler (or other part of the system) in C, especially if it's only for educational purposes (so you don't need a 100-million-lines-a-second compiler). What language are you personally most productive in?
Take a look at ANTLR. It is an awesome compiler-compiler the stuff you use to build a parser for a language.
Building a language is basically about defining a grammar and adding production rules to this grammar. Doing that by hand is not trivial, but a good compiler-compiler will help you a lot.
You might also want to have a look at the classic "Dragon Book" (a book about compilers that features a knight slaying a dragon on the front page). (Google it).
Building domain specific languages is a useful skill to master. Domain specific languages is typically not full featured programming language, but typically business rules formulated in a custom made language tailor made for the project. Have a look at that topic too.
There are various tutorials online such as Write Yourself a Scheme in 48 hrs.
One place to start tho' might be with an "embedded domain specific language" (EDSL). This is a language that actually runs within the environment of another, but you have created keywords, operators, etc particularly suited to the subject (domain) that you want to work in.

What languages should a microISV use to write commercial software? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
I've been writing software in Java for many years now, but it was always for internal applications that would be deployed to a server. I'd like to get into writing desktop applications now but I don't know where to start. I've written a few Java/Swing applications but again they were for internal use.
My understanding is that Java and other semi-compiled and interpreted languages are too easy to reverse engineer, making them unsuitable for commercial software. I am aware that there are compilers for Java and some other interpreted language, but I've also heard that they are pricey and/or unreliable.
Assuming I start a microISV and wish to develop and sell applications to a broad audience, what's my best bet? I would prefer something that can be written close to once, and compiled for different operating systems but I am not opposed to .NET and a Windows-only audience if other languages would compromise the experience (installation ease & user experience) in Windows. My only issue there is that I don't have a large starting budget and paying out the wazoo for the required development tools is not really in the cards.
Why would people want to reverse engineer your software? They might pirate it, but you can't prevent pirating no matter what language you use. I doubt you have a top-secret algorithm that you're trying to hide either, in which case reverse-engineering might be an issue.
You should go with whatever you know best, and Java can work just fine.
If you are intent on switching to another language, I recommend taking a look at Qt. Qt is a free and open-source cross-platform toolkit for C++ that allows you to write applications in that will compile and run on Windows, Mac, and Linux with minimal effort. You CAN write commercial software for free with Qt with its LGPL license.
Edit: GCJ compiles Java to native code, but only supports Java 1.4.
Well, if you're trying to be an Independent Service Vendor -- and not a Software Vendor -- then in a sense it doesn't matter if you use a language like Java which can be decompiled. Because you'd be selling yourself as the best person to integrate and customize the software for your clients. The software is the delivery mechanism for the thing that will actually make you the money: you and your skills. Plenty of companies make a profit by giving away their software for free and contracting their services to set it up for their clients. You can mitigate the Java decompiling issue somewhat by using an obfuscator, but it's kind of fighting the wrong battle.
If you intent to make your money selling software and not service, then Java would be a relatively risky route to take.
It all depends on your business plan.
If you are starting a one-man company, then you are selling your personal expertise. So the language you use must be the one (or maybe two) that you are most familiar with and expert in. I'm surprised you felt it necessary to ask this.
Any code can be decompiled to some degree. I think you can obfuscate Java to a degree that will deter the casual user... but I think the other people hit the nail on the head. Of all the reasons not to use Java, the ease of decompiling should be very very low on your list. If that is all that is stopping you, go for it! Google Java obfuscater and you will find something.
I'm skeptical about the risk of reverse engineering a complex piece of software written in Java, but for purposes of your question I'm willing to stipulate it. I assume the same issues rule out any other language that is implemented only on the JVM.
The most salient aspects of Java are
Static type system
Class-based object system
Automatic memory management
No freestanding functions or modules outside the class/interface system
Generics
This combination could be replicated in a language like C#, but I assume the same objections you have about distributing JVM bytecode also apply to MSIL bytecodes.
I'm having a hard time coming up with a language that has all these features. Here are some nearby languages:
C++ has everything except automatic memory management, plus it allows freestanding functions. However the C++ generic mechanism (templates) is not for the faint of heart, and it doesn't (yet) support modular typechecking. Lots more flexibility than Java but also lots more ways to shoot your foot off. Use with caution.
Modula-3 has all of the above but it's essentially a dead language, plus like C++ there's no modular type checking for the generics.
I'm not familiar enough with Eiffel to be able to make good comparisons, but I think it's worth looking into.
Delphi may also be worth looking into. It seems to have everything above except generics. It's primarily a proprietary Windows environment (formerly known as Object Pascal), but there seems to exist an open-source 'Free Pascal' compiler that supports Delphi.
There are many object-oriented languages with automatic memory management and dynamic typing, among which one might highlight ruby, Python, and Smalltalk. None of these really compiles well and reliably to standalone native machine code, although all push toward some form of experimental compilation. And they are all dynamically typed, which is quite different from what you're used to.
If I were in your position I would probably go ahead an use Java and accept some risk of reverse engineering. Decompilers aren't as wonderful as you might think, and they don't produce wildly maintainable code, either. But if you really want to be able to produce native machine code, I would investigate Delphi and Eiffel. (I myself would use Modula-3, but that's because I once invested substantial effort in learning it. It's a very well designed language for its niche, but the user community is about gone and I think it's a dead letter. Pity.)

What fast low-level languages can you recommend?

I have become interested in C-like languages for performance computing. Can you recommend some alternative programming languages which have the following attributes:
must be close to the hardware (bit fiddling, pointers or some alternative safe method like references)
no managed code (no jvm/.net languages)
has to be really fast (like C)
must be above ASM level (and yes I am interested in macro languages on top of ASM)
can be obscure, not very widespread
I am mainly interested in little-known languages.
How about Assembly language, or the D programming language?
If you don't know about it and are interested just in broadening your horizons, take a look at Forth. Reading about Forth always makes me feel C is high-level.
Well, I've always preferred C and/or C++ because there are multiple flavours (MSVC, glibc etc), it runs on many different platforms (e.g. mobile devices, Windows, linux) and devices, and it can be written cross platform (different processor architectures) and even for high end graphics (e.g. DirectX).
You get "decent" access to platform resources (conditions vary), it can be as fast as you choose to hone it, and it's a tad easier (IMHO) to write than ASM. There's also a pretty decent range of support tools and code analysis tools to make things a little easier.
Also C and C++ have been around for quite some time, so it's got (even today) an excellent and enthusiastic community!
You don't explicitly state that it can't be C in your question, so I'll go ahead and recommend C. It fulfills your three bulleted desires, and you won't have to worry about different versions of the language (like each different kind of assembler).
Forth!
Forth can be faster than machine language on some architectures. The compiled code is extremely dense, therefore, making optimal use of code caching.
assembly would be the closest to the hardware and therefore the fastest
Ada was originally designed for embedded systems (among other things).
OpenCL might be interesting. It's sort of like OpenGL shader language (a subset of C with extensions), but for general purpose parallel array computing.
You could start programming FPGAs in VHDL, Verilog, System C ...
Variations on a theme
FORTRAN is older than C, and is still one of the major players in numerical computing. Until 1990 (when the language was substantially modernized), the language didn't have any form of pointer (checked or not). This lack meant that there was no way to manage memory dynamically; it also made aliasing analysis easy for the compiler, which is one of the things that makes Fortran code fast.
ALGOL was the first structured programming language. Although it had limited success with programmers, it had a strong influence on language designers.
Ada is an imperative language with a strong type system and good modularity, which makes it good for low-level programming with strong assurance requirements (it was sponsored by the US government with military and avionics applications in mind). It was inspired by Pascal, like Modula-2 and Modula-3.
Going further from the mainstream of low-level imperative programming, there is FORTH. FORTH can be compiled for, and even interpreted on, devices with very little memory; it finds a lot of use on low-end embedded systems, including microcontrollers. The language is based on reverse polish notation, made famous by HP calculators (in fact, the language of HP calculators is strongly influenced by FORTH). Many implementations don't have variables: all data is kept on one or more stacks.
Just for fun, I'll mention INTERCAL, the grandaddy of esoteric languages.
Stuff that will blow your mind
Esoteric languages can be instructive, and a quite a few work close to the machine (usually a virtual machine, but in principle you could implement them for an actual computer if you were crazy enough). You could look at brainfuck (a sort of intermediate stage between Turing machines and C), or the many single-instruction languages, or befunge (what if memory was a two-dimensional array?).
Cyclone looks a lot like C. The syntax is the same, and Cyclone has pointers, untagged structures and unions, goto statements and manual memory management. And yet it's a safe language: you can't have a dangling pointer, or a buffer overflow. And you have access to high-level features such as pattern matching, exceptions, polymorphism, abstract types and optional automatic memory management (not just garbage collection, but also regions). Cyclone is both useful and instructive; for a C die-hard, it can be a good way of discovering what makes a safe language. Cyclone can compile to C, so you can run your programs anywhere you have a C compiler for.
Going in a different direction, if you want to be close to the hardware, while still not actually designing hardware, have a look at synchronous languages, such as Lustre and Esterel. These languages are used to program high-assurance realtime systems such as nuclear plants, airplanes and railway signaling. These languages give up Turing completeness and gain the assurance that programmers can know exactly how fast their program will run and how much memory it will require. If you think C is close to the machine, finding out what a language that is really close to the machine may come as a shock.
You can't get much closer than assembly language, unless you get a job with a chip-maker and start writing micro code!!!
If you're on Windows I think you can get hold of Microsoft MASM (macro assembler) that will allow you go get up and running quickly. I used it a long time ago and it's not a bad product.
Seems a bit awkward to answer my question, but I have found two languages:
Pyrex
Vala
They may not fulfill all of the constraints, but they are great for performance computing and both translates to C.

Do certain languages have intrinsic processor architectures by-design

I'm curious to know if certain languages are, by design, better suited for certain processor architectures. When I say architectures, I don't mean ARM/PPC/MIPS but more stack, accumulator, or register based architectures.
For example, I can think of Forth, which is a stack architecture. Any others?
Yes, definitely... it goes the other way as well: many hardware architectures are designed to accommodate certain languages.
RISC architectures are very much an answer to that people moved from assembly to compiled languages like C/C++.
Burroughs B5000 had Algol instead of assembler.
There are several different Forth chips.
Lisp machines were designed to run Lisp efficiently.
Java processors run Java bytecode in hardware.
Some ARM processors have (optional) Java acceleration technology.
Probably many more good examples are available.
Yes they do. For example, the Occam programming language was originally targerted specifically at the Transputer architecture.
Perhaps this is a bit of a smart-ass answer, but:
The assembly languages of the processors involved are tightly linked to the architecture, so, yes, there do exist some languages where it is true.
Whether higher-level languages exhibit the same is perhaps more interesting.
I saw a talk on Google Video by Simon Peyton Jones that talked about this. He mentioned that back in the day people were very interested in writing hardware that was specialized to execute a particular language, but people figured out a better way to solve the problem: make the compiler smarter. Take a look at Haskell. GHC produces some ridiculously fast code from high level constructs, yet Haskell is so much unlike x86 assembler that the two look alien from each other. The same kind of thing happened with Java and Lisp: Java and Lisp are both very fast on modern computers and take decent advantage of our processors, but Java was originally compiled for a weird stack-based bytecode and long ago, people built Lisp machines.
Here's the video, by the way. Most of it is not relevant to the current question but you may find it interesting, it's about "why functional programming is important" and how to make unit tests the easy way.
http://video.google.com/videoplay?docid=-4991530385753299192&hl=en
It's only been fairly recent (last decade or so?) that compilers have been smart enough to make Haskell and Java almost as fast as C, even though neither of them expose much of the underlying architecture. Heck, GHC doesn't even use the stack, how wacky is that?
The best known example is of course c
C was written in the early 1970s to suit the DEC PDP-11
e.g. On the PDP-7 the programming language B only had one data type, but porting it to the PDP-11 which had different sized data types, data types for variables were added to the language.
Most languages target Von Neumann architecture, which is the basis of most CPU.
Occam for Transputer, mentioned by Neil Butterworth is a notable exception.
VHDL is another exception, based on data flow concept, but it is not a programming language, it is a hardware description and simulation language.

Which language would you use in your OS?

This is probably more of a subjective question, but which language (not API like .NET or JDK) would you use should you write your own operating system? Which language provides flexibility, simplicity, and possibly a low-level interface to the hardware? I was thinking Java or C...
C, of course.
Haskell.
Once you have flipped the right hardware bits, C is a terrible language to use for the rest of the OS. Things like the scheduler, filesystems, drivers, etc. are complex high-level algorithms, and you don't want to be writing those in assembly language (or C; same thing). It's too hard to get right. (The VM subsystem and memory manager may need to be written in something low-level, as you will need to bootstrap your high-level langauge's runtime somehow.)
Anyway, this isn't just a crazy idea that I am coming up with for SO. Here is an OS written in Haskell: http://programatica.cs.pdx.edu/House/
Lisp is another good choice; the original Lisp machines were infinitely more tweakable (at runtime) than "modern" OSes like UNIX and Windows.
Sometimes history forgets good ideas (often in the name of "maximum performance"), and that makes me sad.
D would be an interesting choice. From its own description:
D is a systems programming language. Its focus is on combining the power and high performance of C and C++ with the programmer productivity of modern languages like Ruby and Python. Special attention is given to the needs of quality assurance, documentation, management, portability and reliability.
The D runtime assumes the existence of a garbage collector, which would not be appropriate for the very lowest levels of the kernel. However, it would be appropriate for many of the higher layers.
Build the basic components like task schedulers and drivers etc with Assembly, then build the higher level components like applications and tools with C
I believe this is how Windows XP was built too (unsure about Windows Vista and Windows 7).
Definitely... C
C, ASM, C#
Singularity
Low-level in something like Haskell or D. Productivity over performance, in my opinion. You can rewrite slow parts in C++ or even assembly later if the need arises.
High-level in Python or Ruby. Ideally I'd also have a really fast JIT-capable VM for that language, but that's not going to happen for either language for a while. Lua would be a good alternative if speed gets in the way.
The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. The higher levels could be built with a combination of Java or more ideally Objective-C, and scripting languages like python and ruby, or lua.
Honestly, I would either use C or some hierarchy of languages that I had either designed or fit together completely seamlessly. What I would be looking for is a seamless experience that starts at the bare metal level and then I could move to higher and higher level languages as I moved up the problem space. I would probably chose something like:
C - for bare metal stuff like drivers, kernel, etc
Java/C# - for application-level things like administration consoles, OS apps
Python/PowerShell - for scripting activities like common administrative tasks (creating a new user, etc)
Personally, I think C/C#/PowerShell is more tightly integrated and the type of experience I'd be looking for. Of course, if I ever got so ambitious as to write an OS, I would have a lot of spare time on my hands and would probably really enjoy tackling the language stack first. So maybe it would be L/L#/LScript ...
BitC seems to have this in mind. Despite it's name it seems to be the midpoint of assembly language and lisp. The goal was to make a language with a strong correspondence with machine language but have an intermediate representation that supports stronger correctness inferences than is possible with most other common languages. The languages was created as part of the Coyotos project, an operating system with lofty goals of security and reliability. Formal verification is made significantly possible with the ir used in BitC.
Ada:
Ada is a structured, statically typed, imperative, and object-oriented high-level computer programming language, extended from Pascal and other languages. It was originally designed by a team led by Jean Ichbiah of CII Honeywell Bull under contract to the United States Department of Defense (DoD) from 1977 to 1983 to supersede the hundreds of programming languages then used by the DoD. Ada is strongly typed and compilers are validated for reliability in mission-critical applications, such as avionics software.
Ada, because it was not only specifically developed for such projects, but it also provides support for several very useful high level features (such as support for strong typing, concurrency and abstraction) that are simply not available in standard C.
So that, even as a project grows, you don't have to work around language limitations (think encapsulation, abstraction, namespaces in C).
Don't get me wrong, C works obviously for a great many of projects, but once a project has gained a certain size (think Linux kernel, gcc, GNOME), you will inevitably appreciate certain features of more high level languages to make the development process less tedious and also less obfuscated.
In C however, these features usually end up being -pretty poorly- emulated by excessive and almost pervert use of the pre-processor (this can for example be seen in the gcc code base), so that you get to see lots of nested macros, that from an implementation point of view, actually emulate features found in other programming languages.
In addition, Ada is the only programming language, that I am aware of, that actually provides standardized support for source code analysis using the ASIS, having such a facility in place is however the prerequisite to actually be able to maintain and transform/re-engineer a code base in the long run (think refactoring).
Having an interface like ASIS available, means that you can actually implement "semantic patching", where you can automatically rename a file, function or variable/data structure and it will actually work.
Java ?? no jave runs on a virtual machine which needs an os to run on top of ,
maybe C and some ASM ;)
I would go with D to see whether it can do it.
I would only pick the following 3 out of practicality.
C (good old fashioned)
C++ (C with stuff tacked on. Windows is partially written in this)
Java (the medium level language that just might have a capable garbage collector with controllable pauses with G1).
If I were going to start a new OS I'd do it with the subset of C++ recommended by the embedded industry. You can use things like classes and use it "as a better C" and be just as fast. Just avoid things that have massive overhead. You can even use some template features, if you stick to a certain subet that basically don't have any overhead. Look on embedded.com for features in C++ that have little to no overhead, but will allow you organize your code better than you ever could in C.
Oberon? I guess I miss Pascal too much some times. C paid the bills for quite a while, but I don't really love it.
Lisp of course!
Title text: Some say the world will end in fire; some say in segfaults.
For an OS, you want speed at the lowest levels. So assembly, C, C++, Objective-C, or Java seem to be the current choices. Although it's just recently that Java got fast, and it's hard for me to imagine an OS with garbage collection.
If I were writing my own, it would be a mix of assembly and C.
A C or C++ microkernel with a JIT for a highly dynamic language like Ruby or maybe a language with native support for the Prototype pattern. Even device drivers in that language.
Not because it's practical but because it's really cool. Cool in the way that NeXTStep was cool for using Obj-C for pretty much everything.
http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html - share of languages in Linux's source code.
C, by a number of reasons. Other candidates, like D, are great. However, C has this advantage: there's a lot of available open-sourced C code that you could reuse in your project (much more than is for other languages appropriate for system programming).
I would be torn between using some existing low level language and write my own based on C# but with much better generics support.
In second case I would make each method generic, but all the constraints will be resolved by compiler - to allow "duck typing" like in Scala but still language should be static. Also static virtual methods would lower the codebase.
I've had that idea for a long time, but it never seems to be doable in real timeframe, so who knows maybe in the future. :-)
Some would say Java.
Note that openfirmware is written partly in Forth, and it's very low level.
Have an open mind.
"The kernel has to be written in a low-level language, C is by far the best choice for this, because it is so memory efficient. "
Um... What about FORTH?
FORTH can be low level and high level, so you could have a whole operating system written in FORTH from the ground up, and still have a nice easy REPL scripting environment on top, also in FORTH.
However, any decent operating system should support lots of langauges on top, from C all the way to Python Ruby and Javascript. Making FORTH the basis for it all has a lot of benefits though.
edit: I'd only ever attempt this for an embedded environment with a single known hardware set. Trying to write an OS that could compete with Linux or Windows is a fools job.
If this isn't a hypothetical question, and you're looking to create your own OS, I'd probably go with C because most of the examples out there are written in C.
Also, (And I haven't build an OS yet so take this with a grain of salt), I'm thinking that the c runtime libraries would be a lot easier to port to your new OS than say .NET.
Pascal + Oberon: they have the power of C and C++ but they're not as daunting to use. Both these languages are grossly under appreciated.

Resources