Does anyone know of a comparison of programming languages by their expressive power? - programming-languages

I'm searching for a comparison that roughly states how many lines of code I need in a given programming language to solve a given problem. I thought about using the Programming Language Shootout as a source for that, but perhaps there's a different source as well.
Thank you for your help
Tobias

In general it's well known that some languages are more "expressive" than others. rosettacode.org, as mentioned by deceze, is a good source for things like "hello world" (a trivial thought/problem), but to get a stronger sense of how expressive one language is relative to another, there is a good write-up here:
http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/
It looks across many open source projects and compares them by the LOC committed per commit, in an attempt to gauge how verbose it is to express each unit of work.
Of course, the assumption is that one commit roughly translates to one computational goal/thought/expression/etc. And I think in general that's true of most coders & their codebases (we at least try our best!).
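If you wanted to reproduce a rough version of that metric yourself, here is a minimal sketch in Python; it assumes a local git checkout, and the repository path is just a placeholder:

```python
import re
import subprocess
from statistics import median

def median_loc_per_commit(repo_path):
    """Median lines added per commit, as a crude proxy for verbosity."""
    # --numstat prints "added<TAB>deleted<TAB>path" for each file touched,
    # grouped under the commit hash printed by --format=%H.
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--format=%H"],
        capture_output=True, text=True, check=True,
    ).stdout

    per_commit, current = [], 0
    for line in out.splitlines():
        if re.fullmatch(r"[0-9a-f]{40}", line):   # next commit: flush the previous one
            if current:
                per_commit.append(current)
            current = 0
        else:
            parts = line.split("\t")
            if len(parts) == 3 and parts[0].isdigit():
                current += int(parts[0])          # lines added in one file
    if current:
        per_commit.append(current)
    return median(per_commit) if per_commit else 0

print(median_loc_per_commit("/path/to/some/repo"))
```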

Related

How do I compare programming languages for projects at work?

I am wondering what specific questions I should keep in mind when comparing programming languages for use on given work projects. For instance, I am told logic programming languages like Prolog are good for natural language processing. I'm not sure why exactly; I assume it is true because experts say so, but I don't know the considerations that guide them to that conclusion. So I am looking for a simple heuristic, a checklist of questions, that I can apply to evaluate programming languages and be able to explain my decisions, so that I can say "Language X is good for Y because it does Z."
The only way I know of to figure out which programming language is most appropriate for a given problem, is to know lots of programming languages. After all, if you don't know screwdrivers exist, how will you know not to use a hammer when you encounter a screw?
Unfortunately, there are thousands (maybe tens of thousands) of programming languages, so learning even a significant portion of them is just not realistic.
However, programming languages implement paradigms, and Peter van Roy's famous poster only lists about 34 of those. He deliberately decided to ignore several aspects, including anything related to typing, so the real number is probably higher than that, but we can expect it to be well below 100.
That's still a lot, though. Thankfully, paradigms aren't atomic either; they are composed of concepts. The poster lists about a dozen of those (again ignoring typing and a couple of other things). That is significantly fewer than the number of paradigms.
Learning a significant portion of concepts is entirely feasible. Once you know them, you can look at a problem and see which concepts would be useful to have to build a solution. Then you look at which paradigms contain those concepts and which languages implement those paradigms. Pick one, learn it, use it, solve the problem.
And since you already know the concepts (and thus the paradigms) the language implements, you only need to learn the syntax, not the semantics. There aren't actually that many different syntaxes in the wild (C, C++, Objective-C, Objective-C++, D, Go, Java, C#, ECMAScript, PHP, Vala and many others share a lot of syntax, for example, as do Smalltalk, Self, Newspeak and Objective-C; SML, OCaml and F#; and so on), so chances are you'll pick it up very quickly. (Besides, with today's modern IDEs that's much less of an issue anyway.)
One small point to bear in mind: if you are an expert in language X and you are asked to develop a program in domain Y, for which language Z is supposed to be ideal -- will you deliver sooner and better by writing in the language you are an expert in, even if it is not (by some measures) ideal for the problem domain? Or will you deliver better and sooner by first learning a new language?
I think your search for a simple heuristic is in vain.
Start with what your team is familiar with. While there's a lot to be said for the philosophy that a great developer can pick up almost any language in short order, there's a practical side too: if you have a ton of .NET or Java coders, you're best served by starting from that base.
Now, within both stacks you have options on things like functional programming (F#, Erlang, etc.) and other languages on the runtime your team is most familiar with. But it really does boil down to the culture, infrastructure, and (most importantly) the experience and flexibility of the individual developers on your team.
There are several factors to consider:
What is your local expertise? If you have a company full of C programmers, it's probably not worth retraining everybody to be Lisp programmers.
What are your libraries? If you have libraries that you want to use, make sure that they are compatible with your language of choice.
If you are starting a new project with a wide-open field of options, I would recommend taking a few sample problems out of the application domain. Nothing too complex, but nothing trivial either. Then, implement (or have someone from your team implement) these samples in each candidate language. Then, choose the one that is clear, easy, and appropriate.

Is it possible to mark up all programming languages under the object-oriented paradigm using a common markup schema?

I have planned to develop a tool that converts a program written in one programming language (e.g. Java) into a common markup language (e.g. XML), and then converts that markup code into another language (e.g. C#).
In simple words, it is a programming language converter that converts a program written in one language into another language.
I think it is possible, but I don't know where to start. I want to know whether it is feasible, and about any existing systems.
What you are trying to do is extremely hard, but if you want to know what you are up for I've listed the steps you need to follow below:
First the hard bit:
Step 1: Obtain or derive an operational semantics for your source and target languages.
Step 2: Enhance the semantics to capture your source and target memory models.
Step 3: Unify the two enhanced semantics within a common operational model.
Step 4: Define a mapping from your source languages onto the common operational model.
Step 5: Define a mapping from your operational model to your target language.
Step 4, as you pointed out in your question, is trivial.
Step 1 is difficult, as most languages do not have sufficiently formal semantics specified; but I recommend checking out http://lucacardelli.name/TheoryOfObjects.html as this is the best starting point for building a traditional OO semantics.
Step 2 is almost certainly impossible in general, but may be merely obscenely difficult if you are willing to sacrifice some efficiency.
Step 3 will depend on how clean the result of step 1 turned out, but is going to be anything from delicate and tricky to impossible.
Step 5 is not going to be trivial; it is effectively writing a compiler.
Ultimately, what you propose to do is impossible in general, due to the difficulties inherent in steps 1 and 2. However, it should be difficult but doable if you are willing to: severely restrict the source language constructs supported; pretty much forget about handling threads correctly; and pick two languages with sufficiently similar semantics (i.e. Java and C# are OK, but C++ and anything else is not).
It depends on what languages you want to support, but in general this is a huge & difficult task unless you plan to only support a very small subset of each language.
The real problem is that each programming language has different features (with some areas that overlap and others that don't) and different ways of solving the same problems -- and it's pretty tricky to detect the problem the programmer is trying to solve and convert it to a new idiom. :) And think about the differences between GUIs created in different languages...
See http://xmlvm.org/ as an example (a project aimed at converting between the source code of many different languages, with an XML middle-point). The site covers in some depth the challenges they are tackling and the compromises they make, and, if you still have any interest in this kind of project, you can ask more specific follow-up questions.
Notice specifically what the output source code looks like -- it's not at all readable, maintainable, efficient, etc.
It is "technically easy" to produce XML for any single langauge: build a parser, construct and abstract syntax tree, and dump out that tree as XML. (I build tools that do this off-the-shelf for many languages). By technically easy, I mean that the community knows how to do this (see any compiler textbook, e.g., Aho&Ullman Dragon book). I do not mean this is a trivial exercise in terms of effort, because real languages are complicated and messy; there have been many attempts to build C++ parsers and few successes. (I have one of the successes, and it was expensive to get right).
What is really hard (and something I don't try to do) is producing XML according to a single schema in which the language semantics are exposed. Without that, it will be essentially impossible to write a translator from a generic XML to an arbitrary target language. This is known as the UNCOL problem, and people have been looking for the answer since 1958. I note that the Wikipedia article seems to indicate the problem is solved, but you can't find many references to UNCOL in the literature since 1961.
The closest attempt I've seen to this is the OMG's "ASTM" model (http://www.omg.org/spec/ASTM/1.0/Beta1/); it exports XMI, which is XML. But the ASTM model has lots of escapes built into it to allow languages that it doesn't model perfectly (AFAIK, that means every language) to extend the XMI in arbitrary ways so that language-specific information can be encoded. Consequently each language parser produces a custom version of the XMI, and thus each reader has to pretty much know about the extensions, and full generality vanishes.

What does "powerful" mean, when discussing programming languages?

In the context of programming language discussion/comparison, what does the term "power" mean?
Does it have a well defined meaning? Even a poorly defined meaning?
Say if someone says "language X is more powerful than language Y" or asks the same as a question, what do they mean - or what information are they trying to find out?
It does not have a well-defined meaning. In these types of discussions, "language X is more powerful than language Y" usually means little more than "I like language X more than language Y." On the other end of the spectrum, you'll also usually have someone chime in about how any Turing-complete language can accomplish the same tasks as any other Turing-complete language, so that neither is strictly more powerful than the other.
I think a good meaning for it is expressivity. When a language is highly expressive, it means less code is required to express concepts. To me, this doesn't just mean that you have to write less code to accomplish the same tasks, but also that the code is easily readable by humans. Of course, generally (to a point), having fewer lines of code to read makes the task of reading and understanding easier for humans.
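As a small illustration (Python chosen purely for concreteness), here is the same task written twice; the second version expresses the intent in a single readable line:

```python
# Squares of the even numbers in a list, written twice in Python.

numbers = [3, 1, 4, 1, 5, 9, 2, 6]

# Less expressive: spell out the iteration and accumulation by hand.
squares = []
for n in numbers:
    if n % 2 == 0:
        squares.append(n * n)

# More expressive: a comprehension states the intent in one line.
squares = [n * n for n in numbers if n % 2 == 0]
```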
Having a "powerful" standard library comes into play here along the same lines. If a language comes equipped with thorough, complete libraries, then idiomatic code in that language will be able to benefit from the existing library code and not have to repeat or reinvent common functionality in application code. The end result is, again, having to write and read less code to accomplish the same tasks.
I keep saying "generally" and "to a point", because once a language gets too terse, it gets more difficult for humans to decipher. I suppose at this extreme, a language may still be considered "more powerful" (or even "too powerful"). So I guess I'm saying my personal interpretation of "powerful" includes some aspects of "useful" and "readable" in it as well.
C is powerful because it is low level and gives you access to hardware. Python is powerful because you can prototype quickly. Lisp is powerful because its REPL gives you fantastic debugging opportunities. SQL is powerful because you say what you want and the DBMS will figure out the best way to do it for you. Haskell is powerful because each function can be tested in isolation. C++ is powerful because it has ten times the number of syntactic constructs that any one person ever needs or uses. APL is powerful since it can squeeze a ten-screen program into ten characters. Hell, COBOL is powerful because... why else would all the banks be using it? :)
"Powerful" has no real technical meaning, but lots of people have made proposals.
A couple of the more interesting ones:
Paul Graham wants to call a language "more powerful" if you can write the same programs in fewer lines of code (or some other sane, sensible measure of program size).
Matthias Felleisen has written a very serious theoretical study called On the Expressive Power of Programming Languages.
As someone who knows and uses many programming languages, I believe that there are real differences between languages, and that "power" can be a convenient shorthand to describe ways in which one language might be better than another. Nevertheless, whenever I hear a discussion or claim that one language is more powerful than another, I tend to keep one hand firmly on my wallet.
The only meaningful way to describe "power" in a programming language is "can do what I require with the least amount of resources" where "resources" is defined as "whatever costs I'd rather not pay" and could, thus, be development time, CPU time, memory space, money, etc.
So basically the definition of "power" is purely subjective and rendered meaningless in any objective discussion.
Powerful means "high in power". "Power" is something that increases your ability to do things. "Things" vary in shape, size and other respects. Loosely speaking, therefore, "powerful" when applied to a programming language means that it helps you perform your tasks quickly and efficiently.
This makes "powerful" somewhat well defined but not constant across domains. A language powerful in one domain might be crippling in another eg. C is very powerful if you want to do systems level programming since it gives you direct access to the machine and hardware and structures that let you code much faster than you would in assembly. C compilers also produce tight code that runs fast. However, once you move to web applications, C can become very "unpowerful" and crippling since it's so much effort to get something up and running and you have to worry about a lot of extraneous details like memory etc.
Sometimes, languages are "powerful" in multiple domains. This gives them a general "powerful" tag (or badge, since we are on SO here). PG's claim is that with Lisp this is the case. That might be true, or it might not be.
At the end of the day, "powerful" is a loaded word, so you should evaluate who is saying it, why they are saying it, and what it means to you and your work.
There are really only two meanings people are worried about:
"Powerful" in the sense of "takes less resources (time, money, programmers, LOC, etc.) to achieve the same/better result", and "powerful" in the sense of "is capable of doing a wide range of tasks".
Some languages are extremely resource-effective for a small range of tasks. Others are not so resource-effective but can be applied to a wide range of tasks (e.g. C, which is often used in OS development, creation of compilers and runtime libraries, and work with microcontrollers).
Which of these two meanings someone has in mind when they use the term "powerful" depends on the context (and even then is not always clear). Indeed often it is a bit of both.
Typically there are two distinct meanings:
Expressive, meaning the code tends to be very short and understandable
Low level, meaning you have very fine-grained control over the hardware.
For most languages, these two definitions are at opposite ends of the spectrum: Python is very expressive but not very low level; C is very low level but not very expressive. Depending on which definition you pick, either language is powerful or not powerful.
Nothing. Absolutely nothing.
To high-level programmers it might mean a lot of built-in data types, or perhaps abstractions that make it easy to create or follow design patterns.
Paul Graham is a very high-level guy; here is what he has to say:
http://www.paulgraham.com/avg.html
Java guys might tell you something about portability, the power to reach every platform.
C/UNIX programmers may tell you that it's speed and efficiency, complete control over every inch of memory.
VHDL/Verilog programmers will tell you it's complete control over every clock and gate, so as not to waste any electricity or time.
But in my opinion a "powerful language" supports all of the features you need to complete your task. Documentation may be important, or perhaps portability, or the ability to do graphics. It could be anything: writing a GUI in assembly is just stupid, as is trying to design an embedded processor in Flash.
Choosing a language that suits your needs perfectly will always feel like power.
I view the term as marketing fluff, with no one well-defined meaning.
Consider, say, assembler, C, and C++. On occasion one drops from C++ "down" to C for particular needs, and in turn from C down to assembler. So that makes assembler the most powerful, because it's the only language that can do everything. Or, to argue the other way, a single line of C++ code can replace several lines of C (hiding polymorphic dispatch via function pointers, for example), and a single line of C replaces many lines of assembler. So C++ is more powerful because one line does "more".
I think the term had some currency when products such as early databases and spreadsheets had in-built languages, some quite restricted. So vendors would tout their language as being "powerful" because it was less restricted.
It can have several meanings. In the very basic sense there's power as far as what is computable. In that sense the most powerful languages are Turing complete, which includes pretty much every general-purpose programming language (as opposed to most markup languages and domain-specific languages, which are often not Turing complete).
In a more pragmatic sense it often refers to how concisely (and readably) you can do certain things. Basically how easy is it to do certain tasks in one language compared to another.
Which language is more powerful (besides being somewhat subjective) depends heavily on what you're trying to do. If your requirements are to get something running on a small device with 64k of memory, you're likely not going to be using Java; most likely the right language would be C or C++ (or, if you're really hardcore, assembly). If you need a very simple CRUD app done in one day, maybe something like Ruby on Rails would be the way to go (I know Rails is a framework and Ruby is the language, but these days what libraries and frameworks are available factors greatly into picking a language).
I think that, perhaps coincidentally, the physics definition of power is relevant here: "The rate at which work is performed."
Of course, a toaster does not perform very quickly the work of putting out fires. Similarly, the power of a programming language is not universal, but specific to the domain or task to which it is being applied. C is a powerful language for writing device drivers or implementations of higher-level languages; Python is a powerful language for writing general-purpose applications; XPath is a powerful language for writing queries on structured data sets.
So given a problem domain, the power of a language can be said to be the rate at which a competent programmer is able to use it to solve problems in that domain.
A precise answer can be attempted, but only by not assuming that the elements that define "powerful" (in the context of languages) must come from so many dimensions.
Consider how many there could be (and a lot will still be missing):
runtime speed
code size
expressiveness
supported paradigms
development / debugging time
domain specialization
standard libs
codebase
toolchain ecosystem
portability
community
support / documentation
popularity
(add more here)
These and more parameters together draw a picture of what "programming in some language" would be like at some level. That is only a definition, though; the only real knowledge comes with the actual practice of using the language, but I digress.
The question comes down to which parameter best represents the intrinsic quality of a language. If you consider a language in itself, its ultimate, intrinsic purpose is to express things, and thus the most representative parameter is rightfully expressiveness, which is also the one that comes up most often when someone talks about how powerful a language is.
The moment you try to widen the question/answer to cover more than the expressiveness of the language "as a language, as a tongue", you are really talking about different kinds of "environment": social environment, development environment, commercial environment, etc.
Depending on the complexity of the environment to be defined, you'll have to mix in more parameters that come from multiple, vast, overlapping and sometimes contradictory dimensions, and eventually the point of the definition will be lost or the question will have to be narrowed.
This approximation still won't answer "what is an expressive language", but, again, a common understanding is given by the definitions that Vineet points out well in his answer, and that Forest remarks on in the comments. I agree; for me, "expression" is "conveying meaning".
I remember many instructors in college calling whatever language they were teaching "powerful".
Leads me to think:
Powerful = a relative term comparing the latest way to code something vs. the original or previous way.
I find it useless to use the word "powerful" when discussing anything software related. Every time my professor in college introduced a new concept, such as polymorphism, he would say "so this is a really powerful feature". After a while I got annoyed. If everything is powerful, then nothing is; it's all the same. You can write code to do anything. Does it really matter how much code is required to do it? You can say it's short or efficient, but "powerful" is just useless. Nuclear energy is powerful. Code is words.
I think that power would normally refer to how quickly a language can process data. For example, I found that in Python certain list operations become unbearably slow once the list exceeds a length of approx. 2,000, whereas in C++ a list can easily contain 20,000 entries without doing so.
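In fairness, the slowdown usually depends on which list operation you measure rather than on the length alone; here is a minimal timing sketch in Python (the sizes and iteration counts are arbitrary):

```python
import timeit

# Appending to a Python list is amortized O(1); inserting at the front is O(n),
# so the cost of insert(0, ...) grows noticeably with list length.
for size in (2_000, 20_000):
    append_t = timeit.timeit(
        "lst.append(0)", setup=f"lst = list(range({size}))", number=10_000)
    insert_t = timeit.timeit(
        "lst.insert(0, 0)", setup=f"lst = list(range({size}))", number=10_000)
    print(f"n={size}: append {append_t:.4f}s, insert(0) {insert_t:.4f}s")
```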

New or not so well-known paradigms, syntax features and behaviours of programming languages?

I've designed some educational programming languages and interpreters for them, but my problem always was that they ended up "normal" and "boring", mostly similar to some kind of existing language (ASM and BASIC).
I find it really hard to come up with new ideas for syntax features, "neat things", and new or heavily modified programming paradigms for it. I always thought that it was hard to come up with good new things, as opposed to fun/useless new things, in this case.
I wondered if you could help me out with your creativity:
What features in terms of language syntax and built-in functions as well as maybe even new paradigms can I work into my language to keep it useless but more fun, enjoyable, interesting and/or different to program in?
I always thought that it was hard to come up with good new things
You were right. This is why John Backus, Ken Iverson, Niklaus Wirth, Robin Milner, Kristen Nygaard and Ole-Johan Dahl, Alan Kay, and Barbara Liskov all won Turing Awards—they contributed good new ideas to the design of programming languages.
If you want to add a dash of interest to your own designs, these are excellent people to steal from.
Both ASM and BASIC are imperative languages, so you might want to consider features of functional programming languages, especially lambdas and maps. You might also want to consider interesting flows of control, for example, being able to throw an exception and then later, as a result of catching the exception and making a certain call, resume from the point that the exception was thrown (albeit using a modified environment). Also, co-routines, or other forms of language-level parallelism are often interesting.
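To make those first two suggestions concrete, here is a tiny Python sketch (illustrative only; your own language's syntax would of course differ) showing a lambda applied with map and a generator used as a simple coroutine:

```python
# Lambdas and maps: functions as values, applied over a sequence.
doubled = list(map(lambda x: x * 2, [1, 2, 3, 4]))
print(doubled)  # [2, 4, 6, 8]

# A generator used as a simple coroutine: execution is suspended at
# `yield` and resumed later with a value sent back in.
def accumulator():
    total = 0
    while True:
        value = yield total   # suspend here, handing back the running total
        total += value

acc = accumulator()
next(acc)            # prime the coroutine up to the first yield
print(acc.send(10))  # 10
print(acc.send(5))   # 15
```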
In addition to Michael's comment on functional languages, look at closures and blocks (like they're done in Objective-C). Those let you treat functions or pieces of code as first-class objects that you can pass around and call on demand. Some cool stuff can be done with that, and it's also shaping up to becoming the paradigm for programming massively multi-core systems.
You could also look into currying, which means binding some of a function's parameters, so you can then use it on fewer arguments. That way, you could create a base-b logarithm function, which you could curry to create functions for the base-2, base-10, etc. logarithm.
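A quick sketch of that logarithm example in Python, using functools.partial to do the "currying" (the function names are just illustrative):

```python
import math
from functools import partial

def log_base(base, x):
    """Logarithm of x in an arbitrary base."""
    return math.log(x) / math.log(base)

# "Curry" the base argument to get specialized functions.
log2 = partial(log_base, 2)
log10 = partial(log_base, 10)

print(log2(8))      # ~3.0
print(log10(1000))  # ~3.0, subject to floating-point rounding
```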
And something less functional (as in language): look at Ruby's way of treating everything as an object (even numbers); you can do quite a bit with that, like an object-oriented runtime with introspection, an interpreter "for free", etc. Implementing OOP stuff is easier than you'd think.
A lot of stuff has been done in the last 30-odd years, don't restrict yourself to 70s-style programming! ;) If you're looking for inspiration, check out Ruby, Python, Scala, Objective-C, JavaScript (read Douglas Crockford's JavaScript: The Good Parts), etc.
The Esolang wiki gives a good sample of the weirds and wonderfuls of all kinds of esoteric programming languages, including many user creations. Perhaps some inspiration for something sane lies therein.
Look at Forth. It is something original. Too original.
INTERCAL has plenty of unusual language features B-)
I've always thought it would be neat to apply CSP to a stack based language. Could get pretty interesting.
See Wikipedia: Programming Languages. There are many useful links, especially in the Taxonomies section.
So much of the "new" is really just "forgotten old". I will hold my thoughts on some of the "popular" programming languages of the day.
There are many things that could be explored and active research is being done on some of them. Some of the things I think would be useful are:
real continuations in a non-functional language
here is an attempt to add them on to C++: http://mainisusuallyafunction.blogspot.co.nz/2012/02/continuations-in-c-with-fork.html
languages that let the user create new syntax elements
FORTH and J might be starting points.
Pogoscript is interesting as well, because flow-control constructs like if/elseif/else and while/wend aren't special and can be created in user code.
Custom user-defined operators actually aren't new: I think Haskell, Nemerle, Kaleidoscope and several others already do this, but even that wouldn't be "boring".
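As a toy illustration of what user-defined operators can feel like in a language that doesn't support them natively, here is the well-known "infix hack" in Python; the dot "operator" is entirely made up for the example:

```python
class Infix:
    """Wrap a two-argument function so it can be used as a pseudo-infix
    operator via the | symbol, e.g. `a |op| b`."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, left):           # handles the `left |op` half
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):           # handles the `...| right` half
        return self.func(right)

# A made-up dot-product "operator", purely for illustration.
dot = Infix(lambda a, b: sum(x * y for x, y in zip(a, b)))

print([1, 2, 3] |dot| [4, 5, 6])  # 32
```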

Resources for learning a new language quickly?

The title may seem slightly self-contradictory, and I accept that you can't really learn a language quickly. However, an experienced programmer who already knows a few languages and different styles (functional, OO, imperative, etc.) often wants to get started quickly. I've seen a few websites doing effective "translations" in the form of "just show me syntax equivalence". I can't remember the sites now, but for related languages (e.g. Perl/PHP) it's quite common.
Is there a better resource that covers more languages? Is there a resource that covers idioms as well as syntax? I think this would be incredibly useful for doing small amounts of work on existing code bases where you are not familiar with the language. Looking at the existing code, as we know, is not always a good indicator of quality. Likewise, for a "learn by doing" weekend project I always have the urge to write reasonably idiomatic, clean code from the start. Such a resource could also link to known good example projects of varying sizes for those who prefer to learn by reading. Reading a well-written medium-sized code base can also be much more practical when access to development environments might be limited.
I think it's possible to find tutorials and summaries for individual languages that provide some of this functionality in disparate web locations but I'm hoping there is a good, centralised, comparative place that the busy programmer can turn to.
You generally have two main things to overcome:
Syntax
Reference
Syntax you can pick up fairly quickly with a language tutorial and a stack of sample code.
Reference (library/API calls) you need to find a proper guide to; perhaps the language reference, or perhaps google...
With those two in place, following a walkthrough (to get you used to using the development environment) will have you pretty much ready - you'll be able to look up what you want to say (reference), and know how to say it (syntax).
This, of course, applies principally to procedural/oop languages; languages that require a paradigm switch (ML/Haskell) you should go to lectures for ;)
(and for the weirder moments, there's SO!)
In the past my favourite has been "learning by doing". For example, I knew a little bit of C++ and a lot of C#.NET, but I had to write an FTP tool in Python.
So I sat for an hour or so over the syntax differences with a tutorial, then I developed the form itself and looked at the generated code. Then I found an open source Python FTP client and took pieces of code from it (not copy and paste; type it yourself to see, feel and remember the code!).
After a few hours I got it.
So: the mix is best. A book, a piece of good code, the will to learn, and a free night with plenty of coffee.
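For the curious, a minimal sketch of that kind of FTP tool in Python's standard ftplib; the host, credentials, and file names below are placeholders, not details from the original post:

```python
from ftplib import FTP

# Placeholder connection details; replace with a real server.
HOST = "ftp.example.com"
USER = "anonymous"
PASSWORD = "guest@example.com"

def list_and_fetch(remote_name, local_name):
    """Connect, print a directory listing, and download one file."""
    with FTP(HOST) as ftp:
        ftp.login(USER, PASSWORD)
        ftp.retrlines("LIST")                      # print a directory listing
        with open(local_name, "wb") as fh:
            ftp.retrbinary(f"RETR {remote_name}", fh.write)

if __name__ == "__main__":
    list_and_fetch("readme.txt", "readme.txt")
```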
At the risk of sounding cheesy, I would start with the language's website tutorial and/or FAQ, followed by asking more specific questions here. SO is my centralized location for programming knowledge.
I remember when I learned Perl. I was asked to modify some Perl code at work and I'd never seen the language before. I had experience with several other languages, however, so it wasn't hard to figure out the syntax with the online Perl docs in one window and the code in another, side-by-side. I don't know that solely reading existing code is necessarily the best way to learn. In my case, I didn't know Perl but I could tell that the person who originally wrote the code didn't know Perl either. I'm not sure I could've distinguished between good Perl and really confusing Perl. It would've been nice to be able to ask questions here at the time.
The language isn't important. What is important is learning your way around designing algorithms and the proper application of design patterns. Focus on the technique, not the language that implements a certain technique. Once you understand proper development techniques, any programming language will become really easy, no matter how obscure it is...
When you put a focus on a language, you're restricting your own knowledge.
http://devcheatsheet.com/ seems to be a step in the right direction: it aggregates cheat sheets/quick references and they are (somewhat) manually reviewed. It's also wide-ranging. It still comes up short a bit in terms of "idiomatic" quick reference: for example, the page on Ruby doesn't mention yield.
Rosetta Code appears to be an excellent resource that includes hints on coding idiomatically and moves from simple (like for-loops) to things like drawing. I haven't checked out how comprehensive it is, but there are a large number of languages and tasks listed. The drawbacks re: original question are:
Some of the linking is not accurate (navigating Python->ForLoop will take you to the top of the ForLoop page, not the Python section). It's a wiki; this can be improved.
Ideally you could "slice" the wiki however you chose, to see e.g. the top 20 tasks for two languages side-by-side.
http://hyperpolyglot.org/ seems to be an almost perfect match for what I was looking for. The quality is not always there, or idiom can be lacking, but it has the same intention and is pretty comprehensive.
