How to go about making your own programming language? [duplicate]

How to go about making your own programming language? [duplicate] - programming-languages

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Learning to write a compiler
I looked around trying to find out more about programming language development, but couldn't find a whole lot online. I have found some tutorial videos, but not much for text guides, FAQs, advice etc. I am really curious about how to build my own programming language. It brings me to SO to ask:
How can you go about making your own programming language?
I would like to build a very basic language. I don't plan on having a very good language, nor do I think it will be used by anyone. I simply want to make my own language to learn more about operating systems, programming, and become better at everything.
Where does one start? Building the syntax? Building a compiler? What skills are needed? A lot of assembly and understanding of the operating system? What languages are most compilers and languages built in? I assume C.

I'd say that before you begin you might want to take a look at the Dragon Book and/or Programming Language Pragmatics. That will ground you in the theory of programming languages. The books cover compilation, and interpretation, and will enable you to build all the tools that would be needed to make a basic programming language.
I don't know how much assembly language you know, but unless you're rather comfortable with some dialect of assembly language programming I'd advise you against trying to write a compiler that compiles down to assembly code, as it's quite a bit of a challenge. You mentioned earlier that you're familiar wtih both C and C++, so perhaps you can write a compiler that compiles down to C or C++ and then use gcc/g++ or any other C/C++ compiler to convert the code to a native executable. This is what the Vala programming language does (it converts Vala syntax to C code that uses the GObject library).
As for what you can use to write the compiler, you have a lot of options. You could write it by hand in C or C++, or in order to simplify development you could use a higher level language so that you can focus on the writing of the compiler more than the memory allocations and the such that are needed for working with strings in C.
You could simply generate the grammars and have Flex and Bison generate the parser and lexical analyser. This is really useful as it allows you to do iterative development to quickly work on getting a working compiler.
Another option you have is to use ANTLR to generate your parser, the advantage to this is that you get lots of target languages that ANTLR can compile to. I've never used this but I've heard a lot about it.
Furthermore if you'd like a better grounding on the models that are used so frequently in programming language compiler/scanner/parser construction you should get a book on the Models of Computation. I'd recommend Introduction to the Theory of Computation.
You also seem to show an interest in gaining an understanding of operating systems. This I would say is something that is separate from Programming Language Design, and should be pursued separately. The book Principles of Modern Operating Systems is a pretty good starting place for learning about that. You could start with small projects like creating a shell, or writing a programme that emulates the ls command, and then go into more low level things, depending on how through you are with the system calls in C.
I hope that helps you.
EDIT: I've learnt a lot since I write this answer. I was taking the online course on programming languages that Brown University was offering when I saw this answer featured there. The professor very rightly points out that this answer talks a lot about parsers but is light on just about everything else. I'd really suggest going through the course videos and exercises if you'd like to get a better idea on how to create a programming language.

It entirely depends on what your programming language is going to be like.
Do you definitely want it to be compiled? There are interpreted languages as well... or you could implement compilation at execution time
What do you want the target platform to be? Some options:
Native code (which architectures and operating systems?)
JVM
Regular .NET
.NET using the Dynamic Language Runtime (like IronRuby/IronPython)
Parrot
Personally I would strongly consider targeting the JVM or .NET, just because then you get a lot of "safety" for free, as well as a huge set of libraries your language can use. (Obviously with native code there are plenty of libraries too, but I suspect that getting the interoperability between them right may be trickier.)
I see no reason why you'd particularly want to write a compiler (or other part of the system) in C, especially if it's only for educational purposes (so you don't need a 100-million-lines-a-second compiler). What language are you personally most productive in?

Take a look at ANTLR. It is an awesome compiler-compiler the stuff you use to build a parser for a language.
Building a language is basically about defining a grammar and adding production rules to this grammar. Doing that by hand is not trivial, but a good compiler-compiler will help you a lot.
You might also want to have a look at the classic "Dragon Book" (a book about compilers that features a knight slaying a dragon on the front page). (Google it).
Building domain specific languages is a useful skill to master. Domain specific languages is typically not full featured programming language, but typically business rules formulated in a custom made language tailor made for the project. Have a look at that topic too.

There are various tutorials online such as Write Yourself a Scheme in 48 hrs.
One place to start tho' might be with an "embedded domain specific language" (EDSL). This is a language that actually runs within the environment of another, but you have created keywords, operators, etc particularly suited to the subject (domain) that you want to work in.

Related

Tool for automated porting and language that can compile into others

I'm just asking this out of curiosity :
Is there any tool that can automatically convert a source code of reasonable complexity from one language to another ?
Is there any "meta-language" that can compile into several other languages ? For example CoffeeScript compiles into Javascript.
If you know any open-source example, it'd be great !
Thank you for your time.
PS: No idea how to tag this. Feel free to edit.

GCC converts complex C++ code into machine code and thus technically is an answer to your question. In fact, there are lots of compiler like this, but I don't think these are what you intended to ask.
There are tools that are hardwired to translate just one language to another as source code (another poster suggested "f2C", which is a perfect example). These are just like compilers... but rarer.
There are virtually no tools that will map from one language to many others, out of the box. The problem is that languages have different execution models, data types, and execution schemes, which such a translator has to simulate properly in the target language.
The are "code generators" that claim to do this, but they are largely IMHO specifications of rather simple functions that translate trivially to simple code in the target langauge.
If you want to translate one language to another in a sort of general way, you need a program transformation system, e.g., a system that can parse arbitrary langauges, and for which you can provide translation rules that map to other languages in a sort of straightforward way.
Our DMS Software Reengineering Toolkit is one of these. This SO What kinds of patterns could I enforce on the code to make it easier to translate to another programming language? discusses the issues in more detail.

You can convert Fortran code to C using the f2c tool.
For python, you can convert a subset of the language to C++ using shedskin.
The vala language is converted to C before the real compilation.

Creating programming languages and compiler designing. Are they related?

Alright, I guess this question has been asked a lot of times here.
I want to create a programming language, not necessarily starting today, but over a span of 2-3 yrs. I'm not a very good programmer, but I'm improving. What I wanted to ask is how closely creating a language and writing a compiler related?
Since, a compiler translates a language from one form into another, I guess it's all about writing a compiler for a particular piece of text. SO if I learn compiler design, will I be able to write my own programming language?

You can design a programming language without knowing anything about implementing compilers, and vice versa. The language designer can write a specification for the language, and a compiler implementor can then take that and create the compiler.
However, if this is a personal project, then you will probably have to learn how to do both. A programming language for which there is no compiler is purely theoretical, and it is difficult to figure out how good a programming language is without writing and executing real programs with it. Even if you do find someone willing to implement the compiler for you, you might not want to have to wait for that person every time you have a new idea to try, so you will want to know how to do it yourself.
Implementing a compiler is a pretty advanced programming project, so if you are just getting started as a programmer, you have a steep learning curve ahead of you. You might want to start by looking at the tutorials and examples for LLVM, although that might not actually be a suitable compiler infrastructure for your language.

Naruto, it depends on what kind of "Language" you want to create. If it is a simple, just-for-learning language, and you choose the grammar, etc, etc, you won't need to know a lot about programming. BUT, if you are going to deal with a serious one, you will have to study at least one computer programming language deep not only to use it, but to try to reach several of its concepts, for example, like OO, generics, lambda expressions, etc, etc.
Believe me, this is not a task of months, but a serious journey. Anyway, I wish you luck ;)

Intimately related. You really don't have a language unless you have a way to interpret/compile it into an executable form.

It depends on what you mean by "compiler". Compilers/interpreters usually consist of two big parts: a parser part, which reads a text in your language and builds an internal structure (AST) out of it, and a code generation/interpretation part, which reads the AST and translates it to machine or byte codes. While you definitely will need to know how to write a parser for your language, code generation is less important, at least, at the early stages. You can start by simply translating your language to C and see where you go from there.

How to create a language these days?

I need to get around to writing that programming language I've been meaning to write. How do you kids do it these days? I've been out of the loop for over a decade; are you doing it any differently now than we did back in the pre-internet, pre-windows days? You know, back when "real" coders coded in C, used the command line, and quibbled over which shell was superior?
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily) but how do you build the compiler and standard libraries and so forth? What tools do you kids use these days?

One consideration that's new since the punched card era is the existence of virtual machines already bountifully provided with "standard libraries." Targeting the JVM or the .NET CLR instead of ye olde "language walled garden" saves you a lot of bootstrapping. If you're creating a compiled language, you may also find Java byte code or MSIL an easier compile target than machine code (of course, if you're in this for the fun of creating a tight optimising compiler then you'll see this as a bug rather than a feature).
On the negative side, the idioms of the JVM or CLR may not be what you want for your language. So you may still end up building "standard libraries" just to provide idiomatic interfaces over the platform facility. (An example is that every languages and its dog seems to provide its own method for writing to the console, rather than leaving users to manually call System.out.println or Console.WriteLine.) Nevertheless, it enables an incremental development of the idiomatic libraries, and means that the more obscure libraries for which you never get round to building idiomatic interfaces are still accessible even if in an ugly way.
If you're considering an interpreted language, .NET also has support for efficient interpretation via the Dynamic Language Runtime (DLR). (I don't know if there's an equivalent for the JVM.) This should help free you up to focus on the language design without having to worry so much about the optimisation of the interpreter.

I've written two compilers now in Haskell for small domain-specific languages, and have found it to be an incredibly productive experience. The parsec library makes playing with syntax easy, and interpreters are very simple to write over a Haskell data structure. There is a description of writing a Lisp interpreter in Haskell that I found helpful.
If you are interested in a high-performance backend, I recommend LLVM. It has a concise and elegant byte-code and the best x86/amd64 generating backend you can find. There is an optional garbage collector, and some experimental backends that target the JVM and CLR.
You can write a compiler in any language that produces LLVM bytecode. If you are adventurous enough to learn Haskell but want LLVM, there are a set of Haskell-LLVM bindings.

What has changed considerably but hasn't been mentioned yet is IDE support and interoperability:
Nowadays we pretty much expect Intellisense, step-by-step execution and state inspection "right in the editor window", new types that tell the debugger how to treat them and rather helpful diagnostic messages. The old "compile .x -> .y" executable is not enough to create a language anymore. The environment is nothing to focus on first, but affects willingness to adopt.
Also, libraries have become much more powerful, noone wants to implement all that in yet another language. Try to borrow, make it easy to call existing code, and make it easy to be called by other code.
Targeting a VM - as itowlson suggested - is probably a good way to get started. If that turns out a problem, it can still be replaced by native compilers.

I'm pretty sure you do what's always been done.
Write some code, and show your results to the world.
As compared to the olden times, there are some tools to make your job easier though. Might I suggest ANTLR for parsing your language grammar?

Speaking as someone who just built a very simple assembly like language and interpreter, I'd start out with the .NET framework or similar. Nothing can beat the powerful syntax of C# + the backing of the entire .NET community when attempting to write most things. From here i designed a simple bytecode format and assembly syntax and proceeeded to write my interpreter + assembler.
Like i said, it was a very simple language.

You should not accept wimpy solutions like using the latest tools. You should bootstrap the language by writing a minimal compiler in Visual Basic for Applications or a similar language, then write all the compilation tools in your new language and then self-compile it using only the language itself.
Also, what is the proposed name of the language?
I think recently there have not been languages with ALL CAPITAL LETTER names like COBOL and FORTRAN, so I hope you will call it something like MIKELANG with all capital letters.

Not so much an implementation but a design decision which effects implementation - if you make every statement of your language have a unique parse tree without context, you'll get something that it's easy to hand-code a parser, and that doesn't require large amounts of work to provide syntax highlighting for. Similarly simple things like using a different symbol for module namespaces and object namespaces ( unlike Java which uses . for both package and class namespaces ) means you can parse the code without loading every module that it refers to.
Standard libraries - include the equivalent of everything in C99 standard libraries other than setjmp. Add whatever else you need for your domain. Work out an easy way to do this, either something like SWIG or an in-line FFI such as Ruby's [can't remember module name] and Python's ctypes.
Building as much of the language in the language is an option, but projects which start out doing either give up (rubinius moved to using C++ for parts of its standard library), or is only for research purposes (Mozilla Narcissus)

I am actually a kid, haha. I've never written an actual compiler before or designed a language, but I have finished The Red Dragon Book, so I suppose I have somewhat of an idea (I hope).
It would depend firstly on the grammar. If it's LR or LALR I suppose tools like Bison/Flex would work well. If it's more LL, I'd use Spirit, which is a component of Boost. It allows you to write the language's grammar in C++ in an EBNF-like syntax, so no muddling around with code generators; the C++ compiler compiles the grammar for you. If any of these fail, I'd write an EBNF grammar on paper, and then proceed to do some heavy recursive descent parsing, which seems to work; if C++ can be parsed pretty well using RDP (as GCC does it), then I suppose with enough unit tests and patience you could write entire compilers using RDP.
Once I have a parser running and some sort of intermediate representation, it then depends on how it runs. If it's some bytecode or native code compiler, I'll use LLVM or libJIT to process it. LLVM is more suited for general compilation, but I like the libJIT API and documentation better. Alternatively, if I'm really lazy, I'll generate C code and let GCC do the actual compilation. Another alternative, is to target an existing VM, like Parrot or the JVM or the CLR. Parrot is the VM being designed for Perl. If it's just an interpreter, I'll walk the syntax tree.
A radical alternative is to use Prolog, which has syntax features which remarkably simulate EBNF. I have no experience with it though, and if I am not wrong (which I am almost certainly going to be), Prolog would be quite slow if used to parse heavy duty programming languages with a lot of syntactical constructs and quirks (read: C++ and Perl).
All this I'll do in C++, if only because I am more used to writing in it than C. I'd stay away from Java/Python or anything of that sort for the actual production code (writing compilers in C/C++ help to make it portable), but I could see myself using them as a prototyping language, especially Python, which I am partial towards. Of course, I've never actually done any of this before, so I'm not one to say.

On lambda-the-ultimate there's a link to Create Your Own Programming Language by Marc-André Cournoyer, which appears to describe how to leverage some modern tools for creating little languages.

Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily)
Just a hint: Look at some quite different languages first, before designing a new languge (i.e. languages with a very different evaluation strategy). Haskell and Oz come to mind. Though you should also know Prolog and Scheme. A year ago I also was like "hey, let's design a language that behaves exactly as I want", but fortunatly I looked at those other languages first (or you could also say unfortunatly, because now I don't know how I want a language to behave anymore...).

Before you start creating a language you should read this:
Hanspeter Moessenboeck, The Art of Niklaus Wirth
ftp://ftp.ssw.uni-linz.ac.at/pub/Papers/Moe00b.pdf

There's a big shortcut to implementing a language that I don't see in the other answers here. If you use one of Lukasiewicz's "unparenthesized" forms (ie. Forward Polish or Reverse Polish) you don't need a parser at all! With reverse polish, the dependencies go right-to-left so you simply execute each token as it's scanned. With forward polish, it's the reverse of that, so you actually execute the program "backwards", simplifying subexpressions until reaching the starting token.
To understand why this works, you should investigate the 3 primary tree-traversal algorithms: pre-order, in-order, post-order. These three traversals are the inverse of the parsing task that a language reader (i. parser) has to perform. Only the in-order notation "requires" a recursive decent to re-construct the expression tree. With the other two, you can get away with just a stack.
This may require more "thinking' and less "implementing".
BTW, if you've already found an answer (this question is a year old), you can post that and accept it.

Real coders still code in C. Just that it's a litte sharper.
Hmmm... language design? or writing a compiler?
If you want to write a compiler, you'd use Flex + Bison. (google)

Not an easy answer, but..
You essentially want to define a set of rules written in text (tokens) and then some parser that checks these rules and assembles them into fragments.
http://www.mactech.com/articles/mactech/Vol.16/16.07/UsingFlexandBison/
People can spend years on this, The above article talks about using two tools (Flex and Bison) That can be used to turn text into code you can feed to a compiler.

First I spent a year or so to actually think how the language should look like. At the same time I helped in developing Ioke (www.ioke.org) to learn language internals.
I have chosen Objective-C as implementation platform as it's fast (enough), simple and rich language. It also provides test framework so agile approach is a go. It also has a rich standard library I can build upon.
Since my language is simple on syntactic level (no keywords, only literals, operators and messages) I could go with Ragel (http://www.complang.org/ragel/) for building scanner. It's fast as hell and simple to use.
Now I have a working object model, scanner and simple operator shuffling plus standard library bootstrap code. I can even run a simple programs - as long as they fit in one file that is :)

Of course older techniques are still common (e.g. using Flex and Bison) many newer language implementations combine the lexing and parsing phase, by using a parser based on a parsing expression grammar (PEG). This works for recursive descent parsers created using combinators, or memoizing Packrat parsers. Many compilers are built using the Antlr framework also.

Use bison/flex which is the gnu version of yacc/lex. This book is extremely helpful.
The reason to use bison is it catches any conflicts in the language. I used it and it made my life many years easier (ok so i'm on my 2nd year but the first 6months was a few years ago writing it in C++ and the parsing/conflicts/results were terrible! :(.)

If you want to write a compiler obviously you need to read the Dragon Book ;)
Here is another good book that I have just read. It is practical and easier to understand than the Dragon Book:
http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=language+implementation+patterns&x=0&y=0

Mike --
If you're interested in an efficient native-code-generating compiler for Windows so you can get your bearings -- without wading through all the unnecessary widgets, gadgets, and other nonsense that clutter today's machines -- I recommend the Osmosian Order's Plain English development system. It includes a unique interface, a simplified file manager, a friendly text editor, a handy hexadecimal dumper, the compiler/linker (of course), and a wysiwyg page-layout application for documentation. Written entirely in Plain English, it is a quick download (less than a megabyte), small enough to understand in short order (about 25,000 lines of Plain English code, with just 4,000 in the compiler/linker), yet powerful enough to reproduce itself on a bottom-of-the-line Dell in less than three seconds. Really: three seconds. And it's free to all who write and ask for a copy, including the source code and and a rather humorous tongue-in-cheek 100-page manual. See www.osmosian.com for details on how to get a copy, or write to me directly with questions or comments: Gerry.Rzeppa#pobox.com

Programming language with native code support, No framework (I write the framework)

I'm looking for a programming language. It should be an easy language to learn, and should have a Garbage Collector. It should be a basic language with features like basic types (integer, boolean), arrays and etc, and I should write the framework.
It is for a game editor I want to write. The editor's designer will write the code of the UI in this programming language. The framework will be a 2D graphics and audio framework, and in the future it'll be 3D too.
I thought about the new Go language, but it doesn't have much support and theres no binding to OpenGL and etc.
Any ideas?
Thanks.

The obvious two are [C](http://en.wikipedia.org/wiki/C_(programming_language)) or C++. However, [D](http://en.wikipedia.org/wiki/D_(programming_language)) is closer to Java and C# given that it has a garbage collector in the standard, as well as an alternative standard library that is fairly closer to Java than the C++ standard library. The downside with D is that they tools are not as mature as C++ or C and the community isn't as large.
The obvious solution though it to look down the list of compiled languages on wikipedia and see which you like the look of.

Well, that's a fairly broad question and without more specific requirements it is difficult to give a focused answer, but it sounds like C (or C++) would fit the bill for you. The languages you described all owe their syntax to C. C will compile to native code. C is basic language in that there is not much to learn beyond the basic syntax and it has all the basic primitives that you require.
Now that you've added the requirement of a garbage collected language, I suppose that you could try Go, but that language is not mature and there's always a risk there.

If you don't want to manage memory all by yourself like C or C++, you can try the new Go language. It compiles to native code (albeit for Linux and MacOSX only for now) and comes with a basic framework that can be easily replaced with your own framework.
It has a very active user base, so IMO it is possible to mature quickly.

You may want to look at Lua.
Lua is a relatively tiny language which manages to be capable and universal with just a few concepts. The BNF specification for the whole language fits easily on one page. It has numbers, booleans, tables and functions, and surprisingly that's all the datatypes it needs. It can even work in an object-oriented fashion.
There's a compiler, Luac, that compiles Lua to bytecode.
Lua is already being used as a UI programming language for games. Addons for World of Warcraft and a few other games are programmed in Lua. I believe Lua is a very good fit for this kind of task.
You want OpenGL? OK... http://luagl.wikidot.com/ is an OpenGL library for Lua.

Since we don't know what you want to do, I don't know what are the chances we success. Therefor, what about a language where you have to set the probability of your statement to fail :
Meet GOTO++.
Don't say "thanks you", it's on me.

Enjoy a challenge?
Try go.
Here's a tech talk by rob pike, and here is a discussion group: http://groups.google.com/group/golang-nuts/topics
.

C++ is Great, it's not scripting lang, so you don't even need a scripting host.

What are the most important programming languages to know for concepts? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
In your opinion, what are the most important languages for a programmer to know? I'm talking about concepts, not about how practical the language is.
List the languages and a reason. For example, Lisp for functional programming, JavaScript for prototype-based OOP, etc.

Must know:
1) C (system programming, understanding of machine architecture)
2) Perl or Python or Ruby (practical day-to-day tasks)
3) Java or C# or C++ (OOP, and quite important to get a job these days)
Really important:
1) Haskell or ML (functional programming; changes the way you think)
2) Lisp or Scheme (power of macros)
Nice additionals:
1) Forth (very low-level, explicit stack operation + joy to write your own interpreter)
2) Assembly languages (know how your CPU works)
3) Erlang (parallel processing)
4) Prolog (logic programming)
5) Smalltalk (true OOP and true interactive developent)

The assembly languages of as many chips as you can learn for low-level knowledge.
C to learn more practical knowledge of low-level workings, since almost all languages are implemented in C.
C++ for object-oriented programming on top of the low-level goodness of C.
Pascal to learn how to work with strong typing.
Java to see how you can shield yourself from low-level concerns.
Perl to learn regular expressions, weak/dynamic typing, and other good things.
Python to see strong/dynamic/duck typing.
Ruby to see how object-orientedness works on top of Perl-esque weak/dynamic typing.
Common Lisp for that functional enlightenment.
Scheme for the emphasis on recursion.
Emacs Lisp so you can extend Emacs.
Haskell to see pure functional programming done right.
APL so you learn how not to write code.
COBOL so you can make mad money maintaining legacy code.
Erlang to really learn about concurrency. (Thanks to Pete Kirkham for correcting me.)
Scala for functional programming on the JVM.
Clojure for a Lisp-like functional language on the JVM.
Prolog to understand logic programming.
D so you can see why all the D fanatics are always so pro-D.
C# so you can program for .NET (and Mono).
F# so you can do functional programming on .NET.
Forth for stack-based languages.
PHP so you can see how not to create a language. (Just kidding. Learn PHP beacause it's really useful for web development.)
JavaScript because it's basically the language for client-side web scripting.
bash for a good, general-purpose scripting language.
Visual Basic so you can read the code your boss wrote. =)
INTERCAL for "fun."
brainfuck so you can torture your friends.
LOLCODE so you can convince them to still be your friends after you subject them to brainfuck.
...And so on.

C for understanding how the most other language(-implementations) and operation systems are implemented

I think the three languages that best combine practicality and coverage of programming concepts would be
C
Python
Javascript
From these languages you can learn low-level system programming, pointers and memory management, static typing, dynamic typing, high-level scripting, event-driven programming, OO programming, functional programming.
Obviously you're not going to get as pure an intro to functional programming as you would with, say, Haskell, but you can learn a lot of the concepts in Python and (especially) Javascript.

It is not the languages rather the paradigms you should know:
procedural (like C, Pascal)
object-oriented (like Java, C++, Smalltalk)
functional (like Lisp, ML, Scala)
If you understood one of these paradigms in one language, it is easy to learn another language in the same paradigm. And there are even more fields specially supported by languages that are important to understand:
parallelism (in Erlang or Scala)
declarative templates (e.g. in C++ or Prolog)
dynamic languages (e.g. JavaScript)
At at last you should always know what goes on under the hoods, so you better have a look at assembler.

I would say:
C or Assembler to understand how the processor work.
Smalltalk (or C#, Java, Python, Ruby, etc) to understand object oriented programming.
Lisp (any Lisp, Scheme, Common Lisp, Clojure) to understand high level programming, meta programming (macros), etc.
Haskell to understand type inference and other functional concepts.
If you are into distributed systems, I'll consider learning Erlang too. Those are the language I recommend learning, even if only superficial, only for the sake of learning even if you never use them to write a real application.

Its best to know a variety. This gives you a better overall perspective of the art of programming, plus, you get to choose the best tool for the job.
My current list would be:-
C - programming close to the machine.
Python - programmers nirvana.
Perl - for when s**t happens.
Java - cause it will keep you in work.
C# - cause it will keep you in work.
lisp, scheme or something functional to get your brain out of a rut.
SQL - for managing large data sets.
JCL, COBOL, VAX DCL, CShell VB - just to remind you how bad things could be!

A good short list:
C for the machine concepts
Haskell for functional programming
Smalltalk (or maybe Ruby or Simula-67) for object-oriented programming
Prolog for logic programming
Icon for backtracking and mind-blowing string processing
Bourne shell for Unix scripting
Might also include
Scheme for macros
Awk or Perl or ... for regular expressions
FORTH for tiny bootstrapping postfix wonderfulness :-)

For concepts, I would choose assembler and Java.
The first because you should know in intimate detail how machines work.
The second because you should understand how to shield yourself from the intimate details of the way machines work :-). By that I mean a language with a rich set of data structures at hand (so really Java could be Python, C++ with Boost and so on).

Well. I'd say learn C and javascript. They are most widely used languages.
You might want to learn Java/some .Net language and/or python/ruby: they're more convenient, tho.
This have the advantage that all those languages are reasonably well designed.
For example, don't learn PHP or C++ because they're a mess. They're used widely, you might want to learn them one day, but they can seriously mess with your mind.

Limbo - a programming language with concurrency and channels, what C should have evolved into. ( see also D, another C successor )
No-one else seems to be mentioning any declarative languages, so here are a few:
Prolog - a declarative language for logic programming
Modelica - a declarative OO language for modelling systems.
XSLT - a declarative language for transforming XML.
For parallelism, you don't get much wider than shader language, and the related OpenCL - typically 512 processors in parallel on a high-end desktop, rather than Erlang's 4 processors in parallel ( though with many scheduled processes ).

This is a good question. I know a lot of people, myself included, get stuck in a rut where we are just churning out code like one would churn out burgers at a McDonald's. Coding becomes too mechanical—we understand how to get things done, but often times we forget why these things get done behind the scenes.
In my world it's been C++/C/ObjectiveC that have taught me the most, even though I write C# every day at work.
For the most part, learning C++ has helped me learn about memory management and how the various objects are stored in memory, etc.—the actual science of programming. What really opened my eyes was the Programming Paradigms class offered at Standford that you can get off of iTunes and I think YouTube.

You already mentioned two, here are a few more:
Java: Java is a good example of OOP 'cause you HAVE to use oop, and it's designed from the ground up to be an OOP language.
BASIC: Although obsolete, it's a good example of procedural languages and it has a very easy syntax.

For web programming: PHP, ANSI-SQL, javascript.
Some people may argue that HTML and CSS are not programming languages. But they are esential for web app development.
For Desktop app development, C++ with the Qt Framework. Qt gives C++ the additional "cross-platform" fizz.

Pascal or Basic for start and to master basics of procedural programming.
At school we learned Haskel for functional programming.
And then one should try assembler or C for getting deep and Java for OOP.
A have no arguments for this - that's only my taste and what I tried.

I would sat C/C++ cause it sets allot of basics for allot of other languages used around the world.
Personally I learned Java/JavaScript->VB(short course fortunately)->C#->C++, with a pinch of PHP and Perl on top of it all. Best part of that line was C# and then moving behind the scene in C++.

Look for a set of different programming languages:
C++ / C# / Java etc.
C / Assembler
Python, Ruby, Lua, Perl etc.
sh, awk, sed, regular expressions
Prolog (or similar)
Haskell / Lisp etc.
It doesn't really matter which ones you choose, but that you choose one from each "category".

Pseudocode for reading/writing documentation. :p
I think knowing C/C++ or any other low level language will help you with understanding the impacts of how managed/script languages works helps. Such as pointers will demystify variable references.

If you want to understand computers and the underlying hardware, C is the single most important language, commonly said to be the lingua franca of computers. The Stackoverflow podcast tends to cover this at least once a week.
There seem to be enough answers on this, so I'll just leave it at that.

You should know a scripting language so that you can prototype your applications faster.
Maybe python/ruby/perl . Groovy is also an alternative if you're a Java guy that likes his java libs.

Alot is already mentioned but I would definitely add C++ (already done) for the following reason:
C++ for learning how to use pointers and get the main idea about them.
Although there is the discussion if c++ is still the 'better' language (all depends on what you want to make really) it never hurts to understand pointers, just in case you do need them ever.

Having a scan through the answers so far I'm surprised I've not seen any mention of actionscript.
I think if you learn some C/C++, then some Java then that should prepare you for pretty much all of the decent languages out there.
I prefer to see my code in action and I find Actionscript 3 (not 2 or 1) along with Flex (which is MXML) great for quickly demoing visual concepts.
So C & Java helps to learn the syntax of the majority of languages.
Actionscript 3 (very similar to java syntax) & MXML for being able to express you code visually very quickly.

C - low level system programming plus to understand generic concepts about how memory is handled, stack, stack frame, heap and so on. These are helpful for understanding higher level languages
C++ - mainly std library (separation of generic algorithms and containers), templates, namespaces, but OO concepts as well. Templates meta programming will give you completely different perspective on writing software, this is compile time execution versus run time execution. Templates inheritance (static vs dynamic polymorphism).
Python - dynamic type system, list comprehension - functional programming (?), no memory management for developer, spaces for indentation
Objective-C - message dispatching (can dispatch to nil), dynamic type system (static as well), late binding, OO concepts

It may sound crazy, but I first learned to program writing VBScript macros for windows. I used a template, which is available here http://vbscript-macro-template.blogspot.com/ and I just added to it and also tried to understand everything that it did. Now, several years later I am writing my own desktop and database applications.

You should start from C and go through C++, Java and the goto WinForms,
Then better goto .NET

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string