Is there any programming language that is defined by reference implementation? - programming-languages

Some of the major programming languages (see list below) are defined by a specification. But in principle, you could also define a language by implementing a reference compiler / interpreter. (I don't want to say that this would be a good idea.) But I'm curious:
Is there any (major) language that is not defined by specification, but by reference implementation?
By major language I just want to make sure that answers don't come up with esoteric ones (Whitespace, Brainfuck, ...) or with something like "I've created this compiler a while ago. There is not specification, so your answer is 'yes'."
List of programming language specifications
C: At least 3 specifications. ISO 9899:2011 is the latest one.
C++: e.g. this
Java: Java language Specification
JavaScript: ECMAScript Language Specification
Perl: Perl 6 specification
PHP: PHP language reference
Python: Python language reference
I was not able to find a specification for Ruby and TCL

Almost all programming languages are defined by a reference implementation, until they're not. Every language I can think of with just a couple of important exceptions started with a reference implementation and a loose specification. Over time, as it matured, the specification was hardened and the compiler was required to follow its own spec.
The major change occurs when there is a second implementation. Then you need a formal spec (but you don't always get one).
Early exceptions include Algol 60, Pascal and Ada. Each of those acquired a spec very early on, and then often multiple implementations. C# is another.
In answer to your question, Delphi is now the reference implementation for its own brand of Pascal; FoxPro is the only dBase language left; Rebol; Basic (any dialect you choose); SQL (most dialects are non standard); R and S; etc.
The list of programming languages is vast, the list of standards is quite small. I rather guess your question is definitional: how many languages will you accept as 'major'?

Related

Is it possible to make a proxy between IDE and language server?

Say, I want to slightly change the behavior of some language.
Can this be done? Has it been done before?
UPD: Typescript has very limited possibilites to work with property names (for example you can not with the help of typescript create a propname that is derived from another one, e.g. uppercased or with a prefix prepended), whereas Javascript (which is what Typescript compiles to) is very flexible. So I wanted with the help of LSP to modify some of the Typescript's LSP messages that carry autocompletion options.
Yes, but how to do that is operating system specific. From the compiler's point of view, the proxy would appear as your IDE. LSP is a network protocol
I want to slightly change the behavior of some language.
So you want to change the semantics of some language (and you don't tell which). Then LSP is not the best place to do so.
For example, in some simple cases, with GCC, writing your GCC plugin is more appropriate.
In most cases, changing the "behavior" is really changing the language itself. Then you might need to make your own implementation of that language. Sometimes, you might patch an existing free software implementation of the original languages. In other cases, you'll need to make your language implementation by yourself. Then read the Dragon Book and consider compiling or transpiling to C your language in your language implementation. Be sure to specify it on paper first.
Don't confuse a programming language (which is a specification, generally written in English - perhaps with some formalization in some specific notation, e.g. n1570 for C11, R5RS for Scheme) with its implementation (which is a software). Read Scott's Programming Language Pragmatics.
Don't confuse an IDE with a language implementation. For example, all C or C++ compilers I know about don't have any IDE and are command line programs (e.g. GCC, Clang, etc...), and most of them don't even know about LSP. The IDE might start the C++ compiler, but it is not a compiler. You can code (in C, C++, Java, C#, Ocaml, ....) with a plain source code editor (even a simple language-agnostic one such as Notepad on Windows, or Leafpad or nano on Linux).
Most programming languages are defining source programs as a set of "translation units", practically of "source files", each being a sequence of characters with some complex syntax and semantics. How these source files are created is outside of the scope of the programming language specification (you could use editors, you could write your own generators for these source files, ....)
LSP is a proposal for a protocol between "IDE"s and language implementations.
Notice that autocompletion (mentioned in your comment) is not a feature of the programming language (and is not part of the semantics of C or C++). It might be a feature of some IDEs (not all of them). It could be done in a language neutral way (e.g. in Emacs).

Distinctive characteristics of programming languages [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Beyond the syntax of each language (e.g. print v. echo), what are some key distinctive characteristics to look out for to distinguish a programming language?
As a beginner in programming, I'm still confused between the strengths and weaknesses of each programming language and how to distinguish them beyond their aliases for common native functions. I think it's much easier to classify languages based on a set of distinctive characterstics e.g. OOP v. Functional.
There are many thing that define a PL, here I'l list a few:
Is it procedural, OO, imperative?
Does it has strong type checking(C#, C++, Delphi) or dynamic(PHP, Pythong, JS)
How are references handled? (Does it hide pointers like C#?)
Does it require a runtime (C#, Java) or is it native to the OS(C, C++)
Does it support threads (E.g Eiffel needs extra libraries for it)
There are may others like the prescense of garbage collectors, the handling of params, etc. The Eiffel language has an interesting feature which is Design By Contract, I haven't seen this on any other language(I think C# 4.0 has it now), but it can be pretty useful if well used.
I would recommend you to take a look on Bertrand Meyer's work to get a deeper understanding on how PL's work and the things that define them. Another thing that can define a PL is the interaction level with the system, this what makes the difference between low-level languages and high-level languages.
Hope I can help
In a domain (imperative, functional, concatenative, term rewriting), sometimes its best to look at the presence or absence of any particular set of functionality. For example, for the main stream imperative.
First order functions
Closures
Built in classes, prototypical inheritance, or toolkit (Example: C++, Self/JavaScript, Lua/Perl)
Complex data types (more than array)
In-built concurrency primitives
Futures
Pass by values, pass by name, pass by reference or an combination thereof
Garbage collected or not? What kind?
Event-based
Interface based types, class based types, or no user types (Go, Java, Lua)
etc
You can consider things like:
Can you call functions?
Can you pass functions to other functions?
Can you create new functions? (In C you can pass function pointers to functions, but you cannot create new functions)
Can you create new data types?
Can you create new data types with functions that operate on them? (the typical basis for "OO" languages)
Can you execute code that was not available at compile-time (using an eval function, maybe)?
Must all types be known at compile-time?
Are types available at run-time?
The difference between low-level and high-level languages. (Even though "low" and "high" are relative terms.)
A high-level language will use an abstraction to hide details that low-level languages would expose to the user. For example, in Matlab or Python, you can initialize an N-dimensional array in a single command. Not so in C or assembly.
IMHO the strength of a language is given by how many things you can do with it; how fast and how easy can you accomplish the goals.
The weaknesses of a language are the sum of constraints (of various types) that you encounter while you try to achieve your goal.
There are many features that a programming language may support. Additionally these features aren't always mutually exclusive. For example OCaml and F# are both functional and object oriented. Also writing a list here of all the paradigms that a language can support would be exhaustive, however there is a book Programming Language Pragmatics that is a comprehensive treatment of many paradigms found in programming languages.
However, for me the important things I need to know when working with a language are the following:
Is it dynamically or statically typed
Is it a typed language, and if it is typed is strong or weak?
Is it garbage collected
Does it support pass by value or pass by reference semantics or both?
Does it support first order functions (i.e. can functions be treated as variables)
Is it object-oriented
Polymorphism. Is it parametric or ad-hoc.
How expressive is the type system (i.e. can I create non-leaky abstractions)
Overloaded methods
Generics (templates)
Exception handling.
Type system (typed vs untyped, statically vs dynamically typed, weakly and strongly typed).
Supported paradigms (procedural, object-oriented, functional, logic, multi).
Default implementation (compiler vs interpreter vs JIT-compiler).
Memory management (manual vs automatic (reference counting or GC)).
Intended domain of use (number crunching, prototyping, scripting, DSL, ...).
Generation (1GL, 2GL, 3GL, 4GL, 5GL).
Used natural language (English vs Non-English-based). However, it's about syntax.
General remark: many of this classification scheme are not comprehensive and are not that good. And links are mostly at Wikipedia. So be aware.
You can consider other characteristics such as:
Strong vs weak and static vs dynamic typing, support for generic typing
How memory is handled (is it abstracted or do you have direct control over your data, pass by ref vs pass by value)
Compiled vs interpreted vs a bit of both
The forms of user-defined types available... classes, structures, tuples, lists etc.
Whether threading facilities are inbuilt or you need to turn to external libraries
Facility for generative coding... C++ template metaprogramming is a form of this
In the case of OOP, single vs multi inheritance, interfaces, anonymous/inner classes etc.
Whether a language is multi-paradigm (i.e. C# and its support for functional programming)
Availability of reflection
The verbosity of a language or the amount of 'syntactic sugar'... e.g. C++ is quite verbose when it comes to iterating over a vector. Java is quite succinct when anonymous inner classes are used for event-handling. Python's list comprehensions save a lot of typing.

How can a programming language be "implemented"?

maybe this is just a little misunderstanding but how can a programming language be implemented?
I'm not talking about how to implement my own programming language but about the word "implemented"?
I mean, you can implement a compiler or an interpreter, but a programming language?
What does it mean if I read "C++ is implemented in C" or "Python was implemented in C"?
I think a language is more sth. like a protocol of how someone thinks about things should be implemented. For example, if he wants do display a messagebox he can say the command for this is ShowMessageBox(string) and implement a compiler who will translate this into something that works on a computer (aside from the selected programming paradigms he imagines).
I think this question leads to the question "what is a programming language in reality"? A compiler, an interpreter or just a documented language standard about how things should be implemented in a language?
[EDIT]
Answer: Languages are never implemented, only compilers/interpreters etc. It's this simple.
Here's a very academic answer (from a longtime academic).
First I'll reframe the question:
What does it mean for a programming language to be implemented?
I'll start with "what is a programming language":
A programming language is a formal language (a set of utterances we can characterize precisely through algorithmic rules) such that a sentence in the language has a computational meaning. There are a variety of ways to give computation meaning; two of the most popular are that a computation stands for a function (from values to values, or from machine states to machine states) and that a computation stands for a machine that makes "state transitions" and interacts with the outside world.
A language is implemented when a means is provided to read in an utterance and perform the computation, that is, calculate the function or perform the behavior. The means is the implementation.
Typical implementations include
Direct interpretation of the language syntax. This model is rare but FORTH probably comes closest to it.
Translation of the syntax into virtual-machine code, also called bytecode, which is itself another language and which is interpreted. It is popular to write bytecode interpreters in C. Lua, Perl, Python, and Ruby are implemented more or less this way.
Translation of the syntax into hardware machine instructions, which is itself another language, and which is interpreted by your CPU. C and C++ are typically (but not always) implemented this way.
Direct interpretation of the language in hardware. IA-32 machine code and AMD64 machine code are implemented this way.
When a person says "Language X is implemented in Y", they are usually saying that a translator for X or an interpreter for X's bytecode is written in language Y.
One of the great secrets of compiler writers is the ability to write the compiler for language X in language X itself. If this interests you, get Andrew Appel's paper Axiomatic Bootstrapping: A Guide for Compiler Hackers.
Sometimes the answer to this question is not obvious. Squeak Smalltalk writes both a translator and a bytecode interpreter in Smalltalk, then translates the interpreter to C, which is translated to machine code. What is Squeak implemented in? Smalltalk.
Poke a professor; get a lecture.
You are right, those statements don't make any sense. It's pretty obvious that whoever made those statements doesn't understand the difference between a programming language and a compiler (or interpreter).
This is a surprisingly common problem. For example, sometimes people talk about interpreted languages or compiled languages. That's the same thing: languages aren't interpreted or compiled, they just are. Interpretation and compilation are traits of the implementation not the language.
Another goodie: Python has a GIL. No, it doesn't: one implementation of Python has a GIL, all the other implementations don't, and the Python Language itself certainly doesn't. Or: Ruby has green threads. Again, not true: Ruby has threads. Period. Whether any particular language implementation chooses to implement them as green threads, native threads, platform threads or whatever, is a trait of that particular implementation, not of Ruby. And of course my favorite: Ruby 1.9 is faster than Ruby 1.8. This doesn't even make sense: Ruby 1.9 and Ruby 1.8 are programming languages, i.e. a bunch of abstract mathematical rules. You cannot run a programming language, therefore a programming language can never be "faster" or "slower" than another one.
The most blatant confusion about the difference between programming languages and implementations is the Computer Language Benchmark Game, which claims to benchmark languages against each other but in fact benchmarks implementations.
All of these are just different expressions of the fact that apparently some people seem to be fundamentally incapable of grasping the concept of abstraction. Or at least the concept of having an abstract language and a concrete implementation of that language.
If we go back to the statement that "Python is implemented in C", it should now be obvious that that statement is not just wrong. If the statement were wrong that would imply that the statement even makes sense, i.e. that there is some possible world out there, in which it could at least theoretically be right. But that's not the case. The statement is neither wrong nor right, it simply doesn't make sense. If English were a typed language, it would be a type error.
Python is a programming language. Programming languages aren't implemented in anything. They are just implemented. Compilers and interpreters are implemented in languages. But even if you interpret the statement this way, it isn't true: Jython is implemented in Java, IronPython is implemented in C#, PyPy is implemented in RPython and Python, Pynie is implemented in PGE, NQP and PIR. (Oh, and all of those implementations have compilers, so there goes your "Python is an interpreted language".) Similar with Ruby: Rubinius is implemented in Ruby and C++, JRuby and XRuby are implemented in Java, IronRuby and Ruby.NET are implemented in C#, HotRuby is implemented in ECMAScript, Red Sun is implemented in ActionScript, RubyGoLightly is implemented in Go, Cardinal is implemented in PGE, NQP and PIR, SmallRuby is implemented in Smalltalk/X, MagLev is implemented in GemStone Smalltalk and Ruby, YARI is implemented in Io. And for C++: Clang (which is the C, C++ and Objective-C front-end for LLVM) is implemented in C++ (all three front-ends are implemented in C++).
"C++ is implemented in C". I understand this as "C++ compiler is written in C language". Quite simple, without too much philosophy.
Generally, C++ compiler can be written in any language, including C++ itself (except of the first compiler version).
"Python was implemented in C" means that at least one Python compiler (in this case the most commonly used one) is written using C. The developers of that implementation of Python made a deliberate decision not to use C++. As a statement it is incomplete as Python has also been implemented in Java, in C# and in Python.
The main relevance is that it gives you some idea of the systems you might be able to port the language onto: anything targeted by a C compiler should (at least in theory) be capable of running the C implementation of Python, but if they'd chosen to use C++ there would be a smaller set of systems that could run it.
C++ usually isn't implemented in C these days: I believe it is usually implemented in C++. It is quite common for languages to be implemented in the same language (or a subset of the language) as it means you are no longer dependent on some other unrelated language being available for the target. To bootstrap onto a new system you cross compile from some other system.
If you compile gcc for a new platform the build process involves compiling the source code once with whatever compiler is already available (perhaps an older gcc), then compiling it a second time with the newly compiled compiler, then compiling it a third time with the output from the second compilation. If the second and third versions aren't identical you get a build error. If they are identical then you've got a pretty good indication that it compiled correctly.
A programming language is a standard. Its interpreter or compiler is an implementation of this standard.
To build a new language, you don't necessarily needs to do in in low level machine code (assembly for instance). So, using another language to accomplish your goal (creating a new language here) is perfectly normal. So, when we say: Python was implemented in C, it just means that C was used to create that language. For instance, C can be complied on many different architecture, so the programmers doesn't have to take care of the different type of computers (portable).
A language is just a way to express yourself to the computer. Today, it can be done in various ways. But when you use the same syntax as the language and create your own framework, it's called a library or framework. A programming language is just a notation for writing program. If the notation change, you have a different language. Like French or Spanish comes from Latin. (French is implemented in Latin ;)
Why is there so many different languages? Because the goal of a language is to solve complex problems. So, depending on what you want to try yo accomplish, choosing the appropriate language can be an important decision.
The statement "Language X is implemented in Language Y" makes sense and is true if and only if there exists a canonical implementation of Language X and that implementation is written in Language Y. In common usage, either the first or the most popular implementation is often assumed to be canonical.
For example, Perl is one of the few languages with a definitive canon. "Python is implemented in C" makes sense if CPython is taken to be the canonical implementation of Python, and "C++ is implemented in C" is true for CFront, the original implementation of "C with classes" by Bjarne Stroustrup.
The direct answer:
Implementation in the context you are talking about just means written and language actually means compiler.
The original C++ compiler was as I understand it written in C. There is nothing (apart from knowledge and time) to stop you from writing a C++ compiler in another language.
Implementation is the code that makes software work. Often we talk about the implementation of a function as in: "the function has not been implemented yet."
eg
void foo()
{
//function has not been implemented yet
throw();
}
This often happens during the design phase of a program because the call needs to be there in order to write/debug/concept test the calling code but we haven't got round to implementing (writing the code to go insde the function)

what is a programming language?

Wikipedia says:
A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that specify the behavior of a machine, to express algorithms precisely, or as a mode of human communication.
But is this true? It occurred to me in the shower this morning that a programming language might just be a set of conventions, something that both a human and an appropriately arranged compiler can interpret. If that's the case, then isn't it this definition of a programming language misleading? If that isn't the case, then what's the difference between a compiler and the language it compiles?
Thanks!
z.
A programming language is exactly that set of conventions, but I don't see why that makes the Wikipedia entry misleading, really. If it makes you feel better, you might edit it to read something like:
A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to define programs that specify the behavior of a machine, to express algorithms precisely, or as a mode of human communication.
I understand what you are saying, and you are right. Describing a programming language as a "machine-readable artificial language designed to express computations that can be performed by a machine" is unnecessarily specific. Programming languages can be more broadly generalized as established descriptions of tasks (or "a set of conventions") that allow one entity to control the behavior of another. What we traditionally identify as programming languages are just a layer of abstraction between machine code and programmers, and are specifically designed for electronic computers.
Programming languages are not limited to traditional computers (see the K'NEX Computer), and aren't even necessarily limited to computational devices at all. For example, when I am pleased with my dog's behavior, he gets a treat. When I am displeased, he gets nothing. Over time the dog learns the treat/no treat programming and I can use the treats to control his behavior (to an extent).
I don't see what is different between what you are asking...
It occurred to me in the shower this morning that a programming language might just be a set of conventions, something that both a human and an appropriately arranged compiler can interpret.
... and the Wikipedia definition.
The key is that a programming language is just "a machine-readable artificial language".
A compiler does indeed act as an effective specification of a language in terms of a reduction to machine code - however, as it's generally difficult to understand a language by reading the compiler's source, one generally considers a programming language in terms of an abstract processing model that the compiler implements. This abstract model is what one means when one refers to the programming language.
That said, there are indeed many languages (Hi there, PHP!) in which the compiler is the only specification of the language in existence. These languages tend to change unpredictably at times as compiler bugs are fixed or introduced.
Programming languages are an abstraction layer that helps insulate the programmer from having to talk in electrical signals to the computer. The creators of the language have done all the hard work in creating a structure (language) or standard (grammar, conjugation, etc.) that then can be interpreted by a compiler in terms that the computer understands.
All programming languages are really nothing more than domain specific languages for machine code or manipulating the registers and memory of a processing entity.
This is probably the true explanation of what a programming language really is:
Step 1: Think of a language and its grammar, which is a set of rules for making syntactically valid statements using the language. For example, a language called GRID has tiles {0,1} as its alphabet and grammar rules that make sure every GRID statement has equal length and height.
Step 2 (definition of program): GRID, so far, is useless. I'd dare to think of any valid statement of GRID as just data. We need to add something else to GRID: a successor function. So GRID={Grammar, alphabet, successor function}. To make this clear, lets use the rules of "The Game of Life" as successor function.
Step 3: The Game of Life is actually Turing Complete, so GRID={Grammar, alphabet, successor function = GOL} can perform any computation that is computable.
A programming language is nothing but a language with a successor function. The environment that evaluates a valid statement of the language(program) does nothing but follow those successor functions. Variables, for example, are things whose successor functions = (STAY THE SAME)
Computers are just very fast environments ;)
Wikipedia's definition might have been taken out of context. For one thing, only programs written in machine code are machine-readable. Otherwise, you need a compiler to convert C++, Java or even assembly code to machine code so the computer can carry out your instructions. Unless you include comments that are only readable to humans, or unless you are strictly discussing a topic within the realm of your program, programming is insufficient for human communication.

Abstract (programming) language built to transpile

Introduction
Oftentimes I encounter a situation in which a library has been written in a particular programming language. That's great, if I want to use the library in the same language, but if I want to use a different language, that's going to be a problem (that doesn't mean that there might be a more or less hacky way).
For some libraries I have the feeling that they have been written in that particular programming language, simply because any language will do (and because of personal preference of the author) meaning that no language-specific high-level external 3rdparty-libraries are being used. For these situations I thought that it would be neat, if there was some sort of abstract (programming) language in which the library author can specify the algorithms, but which can then be transpiled into a lot of other programming languages. Thus if I want to use that library, I can simply use the transpiler to get that library in my language of choice.
Actual question
So what I am looking for is a language, that is specifically meant to be transpiled into most popular languages (e.g. Java, C/C++, Python). I'm interested in whether someone has gone through the effort of creating such a "universal" transpiler language before.
Note that I'm not looking for a particular transpiler from one language to another. I want to know whether there exists a (programming) language that has been designed for being transpilable into source code of a lot of different actual programming languages. Thus the language I'm looking for would probably not even run by itself (only the transpiled code would be an actual program).
Although I would be interested to hear general pros/cons for the existence of such a language, this is also not what this question is about due to the rules here on SO. Therefore I'd ask you, to not write opinion-based answers in this kind of style.
The answer to this question might very well be that there is no such language, but as my recherche hasn't brought anything up, I thought that maybe someone here knows of such a language, that I might have missed due to it not being widely used.
One language that was designed with the intention of being transpiled to various other languages is Haxe
At the time of writing it supports generating source code for:
JavaScript
ActionScript 3
PHP (including PHP7)
C++
Java
C#
Python
Lua
(reference: https://haxe.org/documentation/introduction/compiler-targets.html)
It also supports compiling directly to byte code for specific VMs

Resources