Is there a programming language that can be both high and low level? I'll elaborate...
Say, for example, you want to write an enterprise system which needs good abstractions and static typing (among other things), so you pick Java. But then you need a portion of this system to be very low latency, so you pick C++ and manage memory yourself instead of relying on a garbage collector. Then you want some kind of automated build script, so you quickly write something up in Python.
Is there a language that could do all of these tasks and be configurable, so that one can choose whether to use a GC, whether the code is interpreted or compiled, and whether typing is static? Say, a unified syntax with optional static typing and a customizable GC and compilers: a sort of customizable Frankenstein language?
Disclaimer, I have not taken any programming language/compiler course, so this may be a noob-ish question!
Real-world challenges are better dealt with using multiple programming languages and programming language implementations.
Since you mentioned Python: important parts of its libraries are implemented in C (or in Cython) for performance reasons.
The programming languages involved in a modern Web site are at least HTML, CSS, and JavaScript, plus a templating language, plus the language used for the backend, plus the language used for services, plus SQL or the like for querying databases.
The history of programming has recurring attempts at a single programming language for everything (Ada comes to mind), and each of them resulted in either inconsistent languages or complicated code. All of them were short-sighted.
Possible Duplicate:
Learning to write a compiler
I looked around trying to find out more about programming language development, but couldn't find a whole lot online. I have found some tutorial videos, but not much for text guides, FAQs, advice etc. I am really curious about how to build my own programming language. It brings me to SO to ask:
How can you go about making your own programming language?
I would like to build a very basic language. I don't plan on having a very good language, nor do I think it will be used by anyone. I simply want to make my own language to learn more about operating systems, programming, and become better at everything.
Where does one start? Building the syntax? Building a compiler? What skills are needed? A lot of assembly and understanding of the operating system? What languages are most compilers and languages built in? I assume C.
I'd say that before you begin you might want to take a look at the Dragon Book and/or Programming Language Pragmatics. That will ground you in the theory of programming languages. The books cover compilation, and interpretation, and will enable you to build all the tools that would be needed to make a basic programming language.
I don't know how much assembly language you know, but unless you're rather comfortable with some dialect of assembly language programming I'd advise you against trying to write a compiler that compiles down to assembly code, as it's quite a bit of a challenge. You mentioned earlier that you're familiar with both C and C++, so perhaps you can write a compiler that compiles down to C or C++ and then use gcc/g++ or any other C/C++ compiler to convert the code to a native executable. This is what the Vala programming language does (it converts Vala syntax to C code that uses the GObject library).
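As a rough illustration of the compile-to-C approach, here is a minimal Python sketch that turns a tiny expression tree into C source text. The tuple-based tree format and the names emit_expr and emit_program are made up for this example; this is not how Vala works internally, and a real translator would also need a parser, a type system and error handling. The generated text could then be handed to gcc.

```python
# Hypothetical mini "compiler": expression tree -> C source text.
# A node is an int literal, a variable name (str),
# or a tuple (operator, left, right).

def emit_expr(node):
    """Return a C expression string for the given tree node."""
    if isinstance(node, int):
        return str(node)
    if isinstance(node, str):
        return node
    op, left, right = node
    if op not in ("+", "-", "*", "/"):
        raise ValueError("unsupported operator: " + op)
    # Parenthesise everything so C precedence never changes the meaning.
    return "(" + emit_expr(left) + " " + op + " " + emit_expr(right) + ")"


def emit_program(variables, expr):
    """Wrap one expression in a complete C program that prints its value."""
    lines = ["#include <stdio.h>", "", "int main(void) {"]
    for name, value in variables.items():
        lines.append("    int " + name + " = " + str(value) + ";")
    lines.append('    printf("%d\\n", ' + emit_expr(expr) + ");")
    lines.append("    return 0;")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Represents x * (y + 2)
    tree = ("*", "x", ("+", "y", 2))
    print(emit_program({"x": 3, "y": 4}, tree))
```

Saving the printed text to a .c file and compiling it with any C compiler gives a native executable, which is the whole trick: the hard parts of code generation are delegated to an existing compiler.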
As for what you can use to write the compiler, you have a lot of options. You could write it by hand in C or C++, or, to simplify development, you could use a higher-level language so that you can focus on writing the compiler rather than on the memory allocation and the like that working with strings in C requires.
You could simply write the grammar and have Flex and Bison generate the lexical analyser and the parser. This is really useful, as it allows you to develop iteratively and get to a working compiler quickly.
Another option is to use ANTLR to generate your parser. The advantage is that ANTLR can generate parsers in many target languages. I've never used it, but I've heard a lot about it.
Furthermore if you'd like a better grounding on the models that are used so frequently in programming language compiler/scanner/parser construction you should get a book on the Models of Computation. I'd recommend Introduction to the Theory of Computation.
You also seem to show an interest in gaining an understanding of operating systems. This I would say is something that is separate from Programming Language Design, and should be pursued separately. The book Principles of Modern Operating Systems is a pretty good starting place for learning about that. You could start with small projects like creating a shell, or writing a programme that emulates the ls command, and then go into more low-level things, depending on how thorough you are with the system calls in C.
I hope that helps you.
EDIT: I've learnt a lot since I wrote this answer. I was taking the online course on programming languages that Brown University was offering when I saw this answer featured there. The professor very rightly points out that this answer talks a lot about parsers but is light on just about everything else. I'd really suggest going through the course videos and exercises if you'd like to get a better idea of how to create a programming language.
It entirely depends on what your programming language is going to be like.
Do you definitely want it to be compiled? There are interpreted languages as well... or you could implement compilation at execution time (just-in-time compilation).
What do you want the target platform to be? Some options:
Native code (which architectures and operating systems?)
JVM
Regular .NET
.NET using the Dynamic Language Runtime (like IronRuby/IronPython)
Parrot
Personally I would strongly consider targeting the JVM or .NET, just because then you get a lot of "safety" for free, as well as a huge set of libraries your language can use. (Obviously with native code there are plenty of libraries too, but I suspect that getting the interoperability between them right may be trickier.)
I see no reason why you'd particularly want to write a compiler (or other part of the system) in C, especially if it's only for educational purposes (so you don't need a 100-million-lines-a-second compiler). What language are you personally most productive in?
Take a look at ANTLR. It is an awesome compiler-compiler: the kind of tool you use to build a parser for a language.
Building a language is basically about defining a grammar and adding production rules to this grammar. Doing that by hand is not trivial, but a good compiler-compiler will help you a lot.
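To make "a grammar with production rules" concrete, here is a small hand-written sketch in Python, not ANTLR output. The grammar, the tokenize helper and the Parser class are invented for this example; a generator such as ANTLR produces comparable (and far more robust) code from a grammar file.

```python
import re

# Grammar (invented for this sketch):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per production rule; evaluates while parsing."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value //= self.factor()  # integer division keeps the sketch simple
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError("unexpected token: " + token)


def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token: " + parser.peek())
    return value
```

Each grammar rule maps to one method, which is why operator precedence falls out of the structure: term is called from expr, so multiplication binds tighter than addition.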
You might also want to have a look at the classic "Dragon Book" (a book about compilers that features a knight slaying a dragon on the cover). Google it.
Building domain-specific languages is a useful skill to master. A domain-specific language is typically not a full-featured programming language but a set of business rules formulated in a custom language tailor-made for the project. Have a look at that topic too.
There are various tutorials online, such as Write Yourself a Scheme in 48 Hours.
One place to start tho' might be with an "embedded domain specific language" (EDSL). This is a language that actually runs within the environment of another, but you have created keywords, operators, etc particularly suited to the subject (domain) that you want to work in.
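A minimal sketch of the EDSL idea, using Python as the host language: operator overloading lets ordinary Python expressions read like a small rule language. The Field and Rule classes and the eligibility example are hypothetical, invented only for this illustration.

```python
class Rule:
    """A predicate over a record (a dict), composable with & and |."""

    def __init__(self, test, text):
        self.test = test
        self.text = text

    def __call__(self, record):
        return self.test(record)

    def __and__(self, other):
        return Rule(lambda r: self(r) and other(r),
                    "(" + self.text + " and " + other.text + ")")

    def __or__(self, other):
        return Rule(lambda r: self(r) or other(r),
                    "(" + self.text + " or " + other.text + ")")


class Field:
    """Names a key in the record; comparisons build Rule objects."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Rule(lambda r: r[self.name] > value,
                    self.name + " > " + repr(value))

    def __lt__(self, value):
        return Rule(lambda r: r[self.name] < value,
                    self.name + " < " + repr(value))

    def __eq__(self, value):
        return Rule(lambda r: r[self.name] == value,
                    self.name + " == " + repr(value))


if __name__ == "__main__":
    age = Field("age")
    country = Field("country")
    # Parentheses are required because & and | bind tighter than comparisons.
    eligible = (age > 17) & ((country == "NO") | (country == "SE"))
    print(eligible.text)
    print(eligible({"age": 30, "country": "SE"}))
```

No parser or compiler is written here; the host language's own syntax and evaluation rules do that work, which is what makes an EDSL a gentle first step.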
I'm currently assessing which strongly-typed server-side languages I could choose to learn next. I'm coming from a background of mainly PHP development (OOP). I'm looking at strongly-typed languages because I consider the lack of strong typing a major downside of PHP (and sometimes an upside).
I know both C# and Java (JSP/Servlets) are an option, however I wanted to consider other languages that I've yet to research.
I'm mainly looking at this from a career POV, so there's no point in picking up a language that's dying out or in low demand (now or in the future).
Scala very beautifully blends object-oriented programming and functional programming to form a new programming paradigm called object-functional programming, which is, as far as my experience goes, the most scalable and productive paradigm ever.
Go for it, I would say.
Do you mean statically typed languages (checked at compile time)? If so, C# or Java really are probably your best bets for widely used server-side languages. Languages such as Python and Ruby are strongly typed, but they are dynamic like PHP.
http://en.wikipedia.org/wiki/Type_system
http://en.wikipedia.org/wiki/Strongly_typed_programming_language
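The distinction between "strongly typed" and "statically typed" can be seen in a few lines of Python. This is only an illustrative sketch; the function names describe and try_describe are made up. Python refuses to mix a str and an int implicitly (strong typing), but the mistake is only detected when the offending line actually runs (dynamic typing), whereas a statically typed language would reject a comparable mismatch before the program runs.

```python
def describe(value, verbose=False):
    # No declared types: any value is accepted when the function is called.
    if verbose:
        # Strong typing: Python will not silently convert an int to a str here,
        # but the check happens only when this line is executed.
        return "value: " + value
    return "ok"


def try_describe(value, verbose):
    """Return the result, or the name of the exception that was raised."""
    try:
        return describe(value, verbose)
    except TypeError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_describe(42, False))   # the faulty branch is never reached
    print(try_describe(42, True))    # the type error surfaces at run time
    print(try_describe("42", True))  # a str works fine
```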
First, avoid using "strongly typed"; most people misunderstand this term. (I personally refuse to give it a meaning.)
I am assuming that you are talking about a statically typed language as opposed to a dynamically typed language.
I can understand from your background with PHP that you want to learn a statically typed language, but be aware that PHP is the worst dynamically typed language that I have ever seen.
I would say that if you know C# and Java, that is enough.
IMHO learning a language from a career POV is not what will improve your skill in PL.
This is because most companies have no idea which language would best express their problems, so they choose a language based on what their competitors use. [1]
You also have to ask yourself why you want to learn a statically typed language.
Understand what the difference between static and dynamic typing is and what it implies. This is a hard question to answer. It is not as obvious as most people think.
So I can only answer which languages will improve your programming skill: LISP (DT), Smalltalk (DT), Scheme (DT), ML (ST), Haskell (ST), Prolog (DT), C (ST), Self (DT).
DT: Dynamically typed,
ST: Statically typed
[1] http://www.paulgraham.com/avg.html
A friend and I are in the first stages of creating a domain-specific language designed for game programming, for his thesis paper.
The language will be fairly low-level, it will have a C-like syntax and optional garbage collection, and it will be geared towards systems that have little memory or processing power (e.g. the Nintendo DS), but it should be powerful enough to facilitate PC development easily.
It won't be a scripting language but a compiled one. However, as we don't want to spend months writing a normal compiler, the first implementation will basically be a LanguageName-to-C translator, with TCC or GCC as the end compiler.
Now, I have a question for all you game programmers out there:
What would you like to see in such a language? What features, implementation- and syntax-wise, would be best for it? What to avoid?
Edit:
Some things we already thought up:
state-based objects - an object can exist in one of its states (or sub-states)
events and functions - events don't have to exist to be called, and can bubble up
limited dynamic allocation and pointer support - we want it to be as safe as possible
support for object compositing (Hero is composed (dynamically) of Actor, Hurtable, Steerable, etc.)
"resources" in states, loaded and unloaded automatically at beginning/end of state (for example, an OpenGL texture object is a resource)
basic support for localization and serialization
a syntax that is quickly parsable
we want to make the language as consistent as possible: everything is passed as value, every declaration has predictable syntax (eg. function retType name(type arg) is (qualifier, list) { }; no const, static, public qualifiers anywhere except in the qualifier list), etc.
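To make the "state-based objects" and "resources in states" ideas above more tangible, here is a rough Python sketch of the behaviour such a language feature might provide. Every name in it (Resource, State, StatefulObject, switch) is hypothetical and chosen for this illustration; the proposed language would presumably express this with dedicated syntax rather than classes.

```python
class Resource:
    """Stand-in for something expensive, e.g. a texture (hypothetical)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.loaded = False

    def load(self):
        self.loaded = True
        self.log.append("load " + self.name)

    def unload(self):
        self.loaded = False
        self.log.append("unload " + self.name)


class State:
    """A named state that owns the resources it needs while active."""

    def __init__(self, name, resources=()):
        self.name = name
        self.resources = list(resources)

    def enter(self):
        for resource in self.resources:
            resource.load()

    def leave(self):
        # Unload in reverse order of loading.
        for resource in reversed(self.resources):
            resource.unload()


class StatefulObject:
    """An object that is always in exactly one of its declared states."""

    def __init__(self, states, initial):
        self.states = {state.name: state for state in states}
        self.current = self.states[initial]
        self.current.enter()

    def switch(self, name):
        target = self.states[name]  # KeyError for an undeclared state
        self.current.leave()
        self.current = target
        self.current.enter()


if __name__ == "__main__":
    log = []
    menu = State("menu", [Resource("menu.png", log)])
    play = State("play", [Resource("hero.png", log), Resource("level.png", log)])
    game = StatefulObject([menu, play], "menu")
    game.switch("play")
    print(log)
```

Tying resource lifetime to state entry and exit means the programmer never frees a texture by hand, which fits the stated goal of keeping dynamic allocation limited and safe.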
Something to make concurrent programming easier. A blend of Erlang and C++ perhaps. I've been thinking about this on and off ever since the Cell processor was announced but it would take a serious chunk of R&D time to develop it and solve many of the problems that already have solutions in traditional C++ programs.
Personally I enjoy writing games that are able to access the wide, wide audience of the web. Would be beyond interesting to make it simple to interface between the desktop and the web.
That is probably more the domain of apps built with the language than the language itself, I suppose, but perhaps something that's useful to keep in mind during the design phase.
You could do worse than to read this: The Next Mainstream Programming Languages: A Game Developer's Perspective (PDF).
So, I don't really want to burst your bubble, but... maybe I should? As a professional game developer, I have to say that there really need to be three types of "languages" for game development.
First, there's your engine-level language. This is typically C++. It's all about performance. The artifacts of gameplay are not meant to be implemented here (sadly, they often are).
Next comes the gameplay language. This is lightweight, easy to understand, and designed for rapid iteration.
Finally, there is some sort of visual scripting language. This is the lightest of all and is geared toward non-programmers (level designers, etc.).
That being said:
Definitely check out UnrealScript. It's used throughout the industry (since the Unreal Engine is a cornerstone of FPS game development).
I would highly recommend supporting:
Concurrent programming (check out what CCP does with Stackless Python for Eve Online)
Network replication (check out UnrealScript, you can tag functions to run on either the server or the client, or to be safe to run on the client, etc.)
State (as mentioned) would be great. UnrealScript has this facility. This needs to be safely done (i.e., enter and exit at any point, complex transitions handled elegantly, etc.)
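As a loose illustration of the lightweight concurrency recommended above, the following sketch runs cooperative tasks in round-robin order using plain Python generators. It does not use the Stackless Python API and says nothing about how CCP structures its code; the names worker and run_round_robin are invented for this example.

```python
from collections import deque


def worker(name, steps, log):
    """A task that yields after each step so that other tasks can run."""
    for step in range(steps):
        log.append(name + ":" + str(step))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        queue.append(task)


if __name__ == "__main__":
    log = []
    run_round_robin([worker("ai", 2, log), worker("physics", 3, log)])
    print(log)
```

Because each task gives up control explicitly, thousands of such tasks can share one thread without locks, which is the property that makes this style attractive for game logic.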
Good luck!
Wikipedia says:
A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that specify the behavior of a machine, to express algorithms precisely, or as a mode of human communication.
But is this true? It occurred to me in the shower this morning that a programming language might just be a set of conventions, something that both a human and an appropriately arranged compiler can interpret. If that's the case, then isn't this definition of a programming language misleading? If that isn't the case, then what's the difference between a compiler and the language it compiles?
Thanks!
z.
A programming language is exactly that set of conventions, but I don't see why that makes the Wikipedia entry misleading, really. If it makes you feel better, you might edit it to read something like:
A programming language is a machine-readable artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to define programs that specify the behavior of a machine, to express algorithms precisely, or as a mode of human communication.
I understand what you are saying, and you are right. Describing a programming language as a "machine-readable artificial language designed to express computations that can be performed by a machine" is unnecessarily specific. Programming languages can be more broadly generalized as established descriptions of tasks (or "a set of conventions") that allow one entity to control the behavior of another. What we traditionally identify as programming languages are just a layer of abstraction between machine code and programmers, and are specifically designed for electronic computers.
Programming languages are not limited to traditional computers (see the K'NEX Computer), and aren't even necessarily limited to computational devices at all. For example, when I am pleased with my dog's behavior, he gets a treat. When I am displeased, he gets nothing. Over time the dog learns the treat/no treat programming and I can use the treats to control his behavior (to an extent).
I don't see what is different between what you are asking...
It occurred to me in the shower this morning that a programming language might just be a set of conventions, something that both a human and an appropriately arranged compiler can interpret.
... and the Wikipedia definition.
The key is that a programming language is just "a machine-readable artificial language".
A compiler does indeed act as an effective specification of a language in terms of a reduction to machine code - however, as it's generally difficult to understand a language by reading the compiler's source, one generally considers a programming language in terms of an abstract processing model that the compiler implements. This abstract model is what one means when one refers to the programming language.
That said, there are indeed many languages (Hi there, PHP!) in which the compiler is the only specification of the language in existence. These languages tend to change unpredictably at times as compiler bugs are fixed or introduced.
Programming languages are an abstraction layer that helps insulate the programmer from having to talk in electrical signals to the computer. The creators of the language have done all the hard work in creating a structure (language) or standard (grammar, conjugation, etc.) that then can be interpreted by a compiler in terms that the computer understands.
All programming languages are really nothing more than domain specific languages for machine code or manipulating the registers and memory of a processing entity.
This is probably the true explanation of what a programming language really is:
Step 1: Think of a language and its grammar, which is a set of rules for making syntactically valid statements using the language. For example, a language called GRID has tiles {0,1} as its alphabet and grammar rules that make sure every GRID statement has equal length and height.
Step 2 (definition of program): GRID, so far, is useless. I'd dare to think of any valid statement of GRID as just data. We need to add something else to GRID: a successor function. So GRID={Grammar, alphabet, successor function}. To make this clear, let's use the rules of "The Game of Life" as the successor function.
Step 3: The Game of Life is actually Turing Complete, so GRID={Grammar, alphabet, successor function = GOL} can perform any computation that is computable.
A programming language is nothing but a language with a successor function. The environment that evaluates a valid statement of the language (a program) does nothing but follow those successor functions. Variables, for example, are things whose successor function = (STAY THE SAME).
Computers are just very fast environments ;)
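The GRID construction can be sketched in a few lines of Python. The function names is_valid_grid and successor are made up for this illustration, and the grid here is finite with everything outside it treated as dead, which is a simplification: the Turing-completeness claim for the Game of Life concerns an unbounded grid.

```python
def is_valid_grid(rows):
    """GRID grammar check: a square grid over the alphabet {0, 1}."""
    size = len(rows)
    return size > 0 and all(
        len(row) == size and all(cell in (0, 1) for cell in row)
        for row in rows
    )


def successor(rows):
    """Apply one Game of Life step; cells outside the grid count as dead."""
    size = len(rows)

    def live_neighbours(r, c):
        return sum(
            rows[r + dr][c + dc]
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < size
            and 0 <= c + dc < size
        )

    result = []
    for r in range(size):
        new_row = []
        for c in range(size):
            n = live_neighbours(r, c)
            alive = rows[r][c] == 1
            # A cell lives on with 2 or 3 neighbours and is born with exactly 3.
            new_row.append(1 if n == 3 or (alive and n == 2) else 0)
        result.append(new_row)
    return result


if __name__ == "__main__":
    program = [[0, 1, 0],
               [0, 1, 0],
               [0, 1, 0]]
    assert is_valid_grid(program)
    print(successor(program))
```

In these terms the "program" is just a valid statement of GRID, and running it means applying successor over and over.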
Wikipedia's definition might have been taken out of context. For one thing, only programs written in machine code are machine-readable. Otherwise, you need a compiler to convert C++, Java or even assembly code to machine code so the computer can carry out your instructions. Unless you include comments that are only readable to humans, or unless you are strictly discussing a topic within the realm of your program, programming is insufficient for human communication.
Polyglot, or multiple-language, solutions allow you to apply each language to the problems it is best suited for. Yet, at least in my experience, software shops tend to want to apply a "super" language to all aspects of the problem they are trying to solve, sticking with that language come "hell or high water" even if another language is available which solves the problem simply and naturally. Why do you or do you not implement using polyglot solutions?
I almost always advocate more than 1 language in a solution space (actually, more than 2 since SQL is part of so many projects). Even if the client likes a language with explicit typing and a large pool of talent, I advocate the use of scripting languages for administrative, testing, data scrubbing, etc.
The advantages of many-language boil down to "right tool for the job."
There are legitimate disadvantages, though:
Harder to have collective code ownership (not everyone is versed in all languages)
Integration problems (diminished in managed platforms)
Increased runtime overhead from infrastructure libraries (this is often significant)
Increased tooling costs (IDEs, analysis tools, etc.)
Cognitive "bumps" when switching from one to another. This is a double-edged sword: for those well-versed, different paradigms are complementary and when a problem arises in one there is often a "but in X I would solve this with Z!" and problems are solved rapidly. However, for those who don't quite grok the paradigms, there can be a real slow-down when trying to comprehend "What is this?"
I also think it should be said that if you're going to go with many languages, in my opinion you should go for languages with significantly different approaches. I don't think you gain much in terms of problem-solving by having, say, both C# and VB on a project. I think in addition to your mainstream language, you want to have a scripting language (high productivity for smaller and one-off tasks) and a language with a seriously different cognitive style (Haskell, Prolog, Lisp, etc.).
I've been lucky to work in small projects with the possibility to suggest a suitable language for my task. For example C as a low-level language, extending Lua for the high-level/prototyping has served very well, getting up to speed quickly on a new embedded platform. I'd always prefer two languages for any bigger project, one domain-specific fit to that particular project. It adds a lot of expressiveness for quickly trying out new features.
However, this probably serves you best with agile development methods. In a more traditional project, the first hurdle to overcome is choosing which language to use, and scripting languages tend to immediately seem like "newcomers" with less marketing push or "seriousness" in their image.
The biggest issue with polyglot solutions is that the more languages involved, the harder it is to find programmers with the proper skill set, particularly if any of the languages are even slightly esoteric or hail from entirely different schools of design (e.g. functional vs. procedural vs. object-oriented). Yes, any good programmer should be able to learn what they need, but management often wants someone who can "hit the ground running", no matter how unrealistic that is.
Other reasons include code reuse, increased complexity interfacing between the different languages, and the inevitable turf wars over which language a particular bit of code should belong in.
All of that said, realize that many systems are polyglot by design -- anything using databases will have SQL in addition to some other language. And there's often scripting involved as well, either for actual code or for the build system.
Pretty much all of my professional programming experience has been in the above category. Generally there's a core language (C or C++), SQL of varying degrees, shell scripting, and possibly some Perl or Python code on the periphery.
My employer's attitude has always been to use what works.
This has meant that when we found some useful Perl modules (like the one that implements "Benford's Law", Statistics::Benford), I had to learn how to use ActiveState's PDK.
When we decided to add interval maths to our project, I had to learn Ada and how to use both GNAT and ObjectAda.
When a high-speed string library was requested, I had to relearn assembler and get used to MASM32 and WinAsm.
When we wanted to have a COM DLL of libiconv (based on Delphi Inspiration's code), I got reacquainted with Delphi.
When we wanted to use Dr. Bill Poser's libuninum, I had to relearn C, and how to use Visual C++ 6's IDE.
We still prototype things in VB6 and VBScript, because they're good at it.
Maybe sometime down the line I'll end up doing stuff in Forth, or Eiffel, or D, or, heaven help me, Haskell (I don't have anything against the language per se, it's just a very different paradigm.)
One issue that I've run into is that Visual Studio doesn't allow multiple languages to be mixed in a single project, forcing you to abstract things out into separate DLLs for each language, which isn't necessarily ideal.
I suspect the main reason, however, is the perception that switching back and forth between many different languages leads to programmer inefficiency. There is some truth to this: I switch constantly between JavaScript, C#, VBScript, and VB.NET, and I lose a bit of time each time I switch because I mix up my syntax.
Still, there is definitely room for more "polyglot" solutions, particularly ones that extend beyond using JavaScript and whatever back-end programming language.
Well, all the web is polyglot now with Java/PHP/Ruby in the back and JavaScript in the front...
Other examples that come to mind -- a flexible complex system written in a low level language (C or C++) with an embedded high level language (Python, Lua, Scheme) to provide customization and scripting interface. Microsoft Office and VBA, Blender and Python.
A project which can be done in a scripting language such as Python with performance critical or OS-dependent pieces done in C.
Both the JVM and the CLR are gaining lots of interesting new compatible scripting languages: Java + Groovy, C# + IronPython, etc.