Could Lojban be used to perform better at natural language understanding than English? - nlp

Not sure if this is a good place for the question but thought I would try since there are lots of smart people here.
I am just wondering whether a constructed (i.e. not naturally evolved) and syntactically unambiguous human language like Lojban could be used to achieve better natural language understanding than, say, English, since it is a more logical language.
If anyone has explored this idea or has a better understanding of NLP I'd love some feedback.

I asked a similar question at the linguistics stack exchange.
The response I received from user #Prash was quite informative:
A few corrections: lojban, though a human language, is not a natural language; it is a conlang. AFAIK, there are no native speakers of lojban: that would require teaching lojban as one of the primary languages to a very young child.
Lojban is syntactically unambiguous, and only mostly unambiguous semantically. If there were a lojban programming language, this should not matter, because one would avoid writing semantically ambiguous forms (like metaphors). This question has come up on various forums for lojban, Prolog, Haskell, etc. The consensus on those forums seems to be that it is possible, but no one has done it yet. Some people (e.g. 1, e.g. 2) have attempted to implement such a thing, but AFAIK only for very limited domains.

Related

How do I compare programming languages for projects at work?

I am wondering what are some specific questions I should keep in mind when I am comparing programming languages for use on given work projects. For instance, I am told logic programming languages like Prolog are good for natural language processing. I'm not sure why exactly; I assume it is true because experts say so, but I don't know the consideration that guides them to that conclusion. So I am looking for a simple heuristic, a checklist of questions, I can apply to evaluate programming languages and be able to explain my decisions, so that I can say "Language X is good for Y because it does Z."
The only way I know of to figure out which programming language is most appropriate for a given problem, is to know lots of programming languages. After all, if you don't know screwdrivers exist, how will you know not to use a hammer when you encounter a screw?
Unfortunately, there are thousands (maybe tens of thousands) of programming languages, so learning even a significant portion of them is just not realistic.
However, programming languages implement paradigms, and Peter van Roy's famous poster only lists about 34 of those. He deliberately decided to ignore several aspects, including anything related to typing, so the real number is probably higher than that, but we can expect it to be well below 100.
That's still a lot, but thankfully paradigms aren't atomic either; they are composed of concepts. The poster lists about a dozen of those (again ignoring typing and a couple of other things), significantly fewer than paradigms.
Learning a significant portion of concepts is entirely feasible. Once you know them, you can look at a problem and see which concepts would be useful to have to build a solution. Then you look at which paradigms contain those concepts and which languages implement those paradigms. Pick one, learn it, use it, solve the problem.
And since you already know the concepts (and thus the paradigms) the language implements, you only need to learn the syntax, not the semantics. There aren't actually that many different syntaxes in the wild (C, C++, Objective-C, Objective-C++, D, Go, Java, C#, ECMAScript, PHP, Vala and many others share a lot of syntax, for example, as do Smalltalk, Self, Newspeak and Objective-C, or SML, OCaml and F#), so chances are you'll pick it up very quickly. (Besides, with today's modern IDEs that's much less of an issue anyway.)
One small point to bear in mind: if you are an expert in language X and you are asked to develop a program in domain Y for which language Z is supposed to be ideal -- will you deliver sooner and better by writing in the language you are an expert in, even if it is not (by some measures) ideal for the problem domain? Or will you deliver better and sooner by first learning a new language?
I think your search for a simple heuristic is in vain.
Start with what your team is familiar with. While there's a lot to be said for the philosophy that a great developer can pick up almost any language in short order, there's a practical side that goes to the fact that if you have a ton of .Net or Java coders, you're best served in starting from that base.
Now, within both stacks you have options on things like functional programming (F#, Erlang, etc.) and other languages on the runtime your team is most familiar with. But it really does boil down to the culture, infrastructure, and (most importantly) the experience and flexibility of the individual developers on your team.
There are several factors to consider:
What is your local expertise? If you have a company full of C programmers, it's probably not worth retraining everybody to be Lisp programmers.
What are your libraries? If you have libraries that you want to use, make sure that they are compatible with your language of choice.
If you are starting a new project with a wide-open field of options, I would recommend taking a few sample problems out of the application domain. Nothing too complex, but nothing trivial either. Then, implement (or have someone from your team implement) these samples in each candidate language. Then, choose the one that is clear, easy, and appropriate.

What language to learn after Haskell? [closed]

As my first programming language, I decided to learn Haskell. I'm an analytic philosophy major, and Haskell allowed me to quickly and correctly create programs of interest, for instance, transducers for natural language parsing, theorem provers, and interpreters. Although I've only been programming for two and a half months, I found Haskell's semantics and syntax much easier to learn than more traditional imperative languages, and feel comfortable (now) with the majority of its constructs.
Programming in Haskell is like sorcery, however, and I would like to broaden my knowledge of programming. I would like to choose a new programming language to learn, but I do not have enough time to pick up an arbitrary language, drop it, and repeat. So I thought I would pose the question here, along with several stipulations about the type of language I am looking for. Some are subjective, some are intended to ease the transition from Haskell.
Strong type system. One of my favorite parts of programming in Haskell is writing type declarations. This helps structure my thoughts about individual functions and their relationship to the program as a whole. It also makes informally reasoning about the correctness of my program easier. I'm concerned with correctness, not efficiency.
Emphasis on recursion rather than iteration. I use iterative constructs in Haskell, but implement them recursively. It is much easier to understand the structure of a recursive function than a complicated iterative procedure, especially when using combinators and higher-order functions like maps, folds and bind (see the sketch after this list).
Rewarding to learn. Haskell is a rewarding language to work in. It's a little like reading Kant. My experience several years ago with C, however, was not. I'm not looking for C. The language should enforce a conceptually interesting paradigm, which in my entirely subjective opinion, the C-likes do not.
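As a rough illustration of the recursion stipulation above, here is a minimal Haskell sketch (my own example, not from the question): the same sum written with explicit recursion and with a fold, where the fold names the recursion scheme directly.

-- Summing a list: explicit recursion shows the structure,
-- and foldr captures that recursion scheme as a combinator.
sumRec :: [Int] -> Int
sumRec []       = 0
sumRec (x : xs) = x + sumRec xs

sumFold :: [Int] -> Int
sumFold = foldr (+) 0

-- ghci> sumRec [1 .. 10]   ==>  55
-- ghci> sumFold [1 .. 10]  ==>  55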
Weighing the answers: These are just notes, of course. I'd just like to reply to everyone who gave well-formed responses. You have been very helpful.
1) Several responses indicated that a strong, statically typed language emphasizing recursion means another functional language. While I want to continue working strongly with Haskell, camccann and larsmans correctly pointed out that another such language would "ease the transition too much." These comments have been very helpful, because I am not looking to write Haskell in Caml! Of the proof assistants, Coq and Agda both look interesting. In particular, Coq would provide a solid introduction to constructive logic and formal type theory. I've spent a little time with first-order predicate and modal logic (Mendelson, Enderton, some of Hinman), so I would probably have a lot of fun with Coq.
2) Others heavily favored Lisp (Common Lisp, Scheme and Clojure). From what I gather, both Common Lisp and Scheme have excellent introductory material (On Lisp and The Reasoned Schemer, SICP). The material in SICP causes me to lean towards Scheme. In particular, Scheme through SICP would cover a different evaluation strategy, the implementation of laziness, and a chance to focus on topics like continuations, interpreters, symbolic computation, and so on. Finally, as others have pointed out, Lisp's treatment of code/data would be entirely new. Hence, I am leaning heavily towards option (2), a Lisp.
3) Third, Prolog. Prolog has a wealth of interesting material, and its primary domain is exactly the one I'm interested in. It has a simple syntax and is easy to read. I can't comment more at the moment, but after reading an overview of Prolog and skimming some introductory material, it ranks with (2). And it seems like Prolog's backtracking is always being hacked into Haskell!
4) Of the mainstream languages, Python looks the most interesting. Tim Yates makes the languages sound very appealing. Apparently, Python is often taught to first-year CS majors; so it's either conceptually rich or easy to learn. I'd have to do more research.
Thank you all for your recommendations! It looks like a Lisp (Scheme, Clojure), Prolog, or a proof assistant like Coq or Agda are the main languages being recommended.
I would like to broaden my knowledge of programming. (...) I thought I would pose the question here, along with several stipulations about the type of language I am looking for. Some are subjective, some are intended to ease the transition from Haskell.
Strong type system. (...) It also makes informally reasoning about the correctness of my program easier. I'm concerned with correctness, not efficiency.
Emphasis on recursion rather than iteration. (...)
You may be easing the transition a bit too much here, I'm afraid. The very strict type system and purely functional style are characteristic of Haskell and pretty much anything resembling a mainstream programming language will require compromising at least somewhat on one of these. So, with that in mind, here are a few broad suggestions aimed at retaining most of what you seem to like about Haskell, but with some major shift.
Disregard practicality and go for "more Haskell than Haskell": Haskell's type system is full of holes, due to nontermination and other messy compromises. Clean up the mess and add more powerful features and you get languages like Coq and Agda, where a function's type contains a proof of its correctness (you can even read the function arrow -> as logical implication!). These languages have been used for mathematical proofs and for programs with extremely high correctness requirements. Coq is probably the most prominent language of the style, but Agda has a more Haskell-y feel (as well as being written in Haskell itself).
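To make the arrow-as-implication remark concrete, here is a hedged sketch in plain Haskell (my example; Coq and Agda take the idea much further, since their totality checking makes the arrow a genuine proof):

-- Propositions-as-types: a total function of type a -> b
-- can be read as a proof that a implies b.

-- Modus ponens: from (a implies b) and a, conclude b.
modusPonens :: (a -> b) -> a -> b
modusPonens f x = f x

-- Conjunction is a pair type; "a and b implies a" is just fst.
andElim :: (a, b) -> a
andElim = fst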
Disregard types, add more magic: If Haskell is sorcery, Lisp is the raw, primal magic of creation. Lisp-family languages (also including Scheme and Clojure) have nearly unparalleled flexibility combined with extreme minimalism. The languages have essentially no syntax, writing code directly in the form of a tree data structure; metaprogramming in a Lisp is easier than non-meta programming in some languages.
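Since Lisp code is literally a tree, metaprogramming amounts to tree transformation. A rough, hedged model of that idea, sketched here in Haskell with a toy s-expression type (a real Lisp applies this to its native program representation, not a toy type):

-- A Lisp form like (+ 1 (+ 2 3)) is just a nested list.
data SExpr = Atom String | List [SExpr]
  deriving Show

-- "Macro-like" rewriting: turn every (+ ...) form into a (* ...) form.
plusToTimes :: SExpr -> SExpr
plusToTimes (List (Atom "+" : args)) = List (Atom "*" : map plusToTimes args)
plusToTimes (List xs)                = List (map plusToTimes xs)
plusToTimes atom                     = atom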
Compromise a bit and move closer to the mainstream: Haskell falls into the broad family of languages influenced heavily by ML, any of which you could probably shift to without too much difficulty. Haskell is one of the strictest when it comes to correctness guarantees from types and use of functional style, where others are often hybrid styles and/or make pragmatic compromises for various reasons. If you want some exposure to OOP and access to lots of mainstream technology platforms, either Scala on the JVM or F# on .NET has a lot in common with Haskell while providing easy interoperability with the Java and .NET platforms. F# is supported directly by Microsoft, but has some annoying limitations compared to Haskell and portability issues on non-Windows platforms. Scala has more direct counterparts to Haskell's type system and offers Java's cross-platform potential, but has a more heavyweight syntax and lacks the powerful first-party support that F# enjoys.
Most of those recommendations are also mentioned in other answers, but hopefully my rationale for them offers some enlightenment.
I'm going to be That Guy and suggest that you're asking for the wrong thing.
First you say that you want to broaden your horizons. Then you describe the kind of language that you want, and its horizons sound incredibly like the horizons you already have. You're not going to gain very much by learning the same thing over and over.
I would suggest you learn a Lisp — i.e. Common Lisp, Scheme/Racket or Clojure. They're all dynamically typed by default, but feature some sort of type hinting or optional static typing. Racket and Clojure are probably your best bets.
Clojure is more recent and has more Haskellisms like immutability by default and lots of lazy evaluation, but it's based on the Java Virtual Machine, which means it has some odd warts (e.g. the JVM doesn't support tail call elimination, so recursion is kind of a hack).
Racket is much older, but has picked up a lot of power along the way, such as static type support and a focus on functional programming. I think you'd probably get the most out of Racket.
The macro systems in Lisps are very interesting and vastly more powerful than anything you'll see anywhere else. That alone is worth at least looking at.
From the standpoint of what suits your major, the obvious choice seems like a logic language such as Prolog or its derivatives. Logic programming can be done very neatly in a functional language (see, e.g. The Reasoned Schemer) , but you might enjoy working with the logic paradigm directly.
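For a taste of how the logic paradigm embeds in a functional language (the "backtracking hacked into Haskell" mentioned earlier in this thread), here is a hedged sketch of my own using the list monad as a Prolog-style nondeterministic search:

import Control.Monad (guard)

-- Each <- draws from a choice point, guard prunes, and the list
-- monad backtracks over all remaining alternatives automatically.
pythagoreanTriples :: Int -> [(Int, Int, Int)]
pythagoreanTriples n = do
  a <- [1 .. n]
  b <- [a .. n]
  c <- [b .. n]
  guard (a * a + b * b == c * c)
  return (a, b, c)

-- ghci> pythagoreanTriples 20
-- [(3,4,5),(5,12,13),(6,8,10),(8,15,17),(9,12,15),(12,16,20)]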
An interactive theorem-proving system such as Twelf or Coq might also strike your fancy.
I'd advise you to learn Coq, which is a powerful proof assistant with syntax that will feel comfortable to the Haskell programmer. The cool thing about Coq is that programs written in it can be extracted to other functional languages, including Haskell. There is even a package (Meldable-Heap) on Hackage that was written in Coq, had properties proven about its operation, and was then extracted to Haskell.
Another popular language that offers more power than Haskell is Agda - I don't know Agda beyond knowing it is dependently typed, on Hackage, and well respected by people I respect, but those are good enough reasons to me.
I wouldn't expect either of these to be easy. But if you know Haskell and want to move forward to a language that gives more power than the Haskell type system then they should be considered.
As you didn't mention any restrictions besides your subjective interests, and you emphasize 'rewarding to learn' (well, OK, I'll ignore the static typing restriction), I would suggest learning a few languages of different paradigms, preferably ones which are 'exemplary' for each of them:
A Lisp dialect for the code-as-data/homoiconicity thing and because they are good, if not the best, examples of dynamic (more or less strict) functional programming languages
Prolog as the predominant logic programming language
Smalltalk as the one true OOP language (also interesting because of its usually extremely image-centric approach)
maybe Erlang or Clojure if you are interested in languages forged for concurrent/parallel/distributed programming
Forth for stack oriented programming
(Haskell for purely functional, statically typed, lazy programming)
Especially Lisps (CL not as much as Scheme) and Prolog (and Haskell) embrace recursion.
Although I am not a guru in any of these languages, I did spend some time with each of them, except Erlang and Forth, and they all gave me eye-opening and interesting learning experiences, as each one approaches problem solving from a different angle.
So, though it may seem as if I ignored the part about your having no time to try a few languages, I rather think that time spent with any of these will not be wasted, and you should have a look at all of them.
How about a stack-oriented programming language? Cat hits your high points. It is:
Statically typed with type inference.
Makes you rethink common imperative-language concepts like looping. Conditional execution and looping are handled with combinators.
Rewarding - forces you to understand yet another model of computation. Gives you another way to think about and decompose problems.
Dr. Dobbs published a short article about Cat in 2008 though the language has changed slightly.
If you want a strong(er)ly typed Prolog, Mercury is an interesting choice. I've dabbled in it in the past and I liked the different perspective it gave me. It also has moded-ness (which parameters need to be free/fixed) and determinism (how many results are there?) in the type system.
Clean is very similar to Haskell, but has uniqueness typing, which is used as an alternative to monads (more specifically, the IO monad). Uniqueness typing also does interesting things for working with arrays.
I'm a bit late, but I see that no one has mentioned a couple of paradigms and related languages that may interest you for their high level of abstraction and generality:
rewriting systems, like Maude or ELAN;
Constraint Handling Rules (CHR).
Despite its failure to meet one of your big criteria (static* typing), I'm going to make a case for Python. Here are a few reasons I think you should take a look at it:
For an imperative language, it is surprisingly functional. This was one of the things that struck me when I learned it. Take list comprehensions, for example. It has lambdas, first-class functions, and many functionally-inspired compositions on iterators (maps, folds, zips...). It gives you the option of picking whatever paradigm suits the problem best.
IMHO, it is, like Haskell, beautiful to code in. The syntax is simple and elegant.
It has a culture that focuses on doing things in a straightforward way, rather than focusing too minutely on efficiency.
I understand if you are looking for something else though. Logic programming, for instance, might be right up your alley, as others have suggested.
* I assume you mean static typing here, since you want to declare the types. Technically, Python is a strongly typed language, since you can't arbitrarily interpret, say, a string as a number. Interestingly, there are Python derivatives that allow static typing, like Boo.
I would recommend Erlang. It is not a strongly typed language, and you should try it. It is a very different approach to programming, and you may find that there are problems for which strong typing is not The Best Tool(TM). Erlang does provide tools for static type verification (typer, dialyzer), so you can use strong typing on the parts that benefit from it. It can be an interesting experience, but be prepared: it will feel very different. If you are looking for a "conceptually interesting paradigm", you can find several in Erlang: message passing, memory separation instead of sharing, distribution, OTP, and error handling and error propagation instead of error "prevention". Erlang may be far from your current experience, but still brain-tickling if you have experience with C and Haskell.
Given your description, I would suggest OCaml or F#.
The ML family are generally very good in terms of a strong type system. The emphasis on recursion, coupled with pattern matching, is also clear.
Where I am a bit hesitant is on the rewarding to learn part. Learning them was rewarding for me, no doubt. But given your restrictions and your description of what you want, it seems you are not actually looking for something much more different than Haskell.
Had you not stated those restrictions, I would have suggested Python or Erlang, both of which would take you out of your comfort zone.
In my experience, strong typing + emphasis on recursion means another functional programming language. Then again, I wonder if that's very rewarding, given that none of them will be as "pure" as Haskell.
As other posters have suggested, Prolog and Lisp/Scheme are nice, even though both are dynamically typed. Many great books with a strong theoretical "taste" to them have been published about Scheme in particular. Take a look at SICP, which also conveys a lot of general computer science wisdom (meta-circular interpreters and the like).
Factor would be a good choice.
You could start looking into Lisp.
Prolog is a cool language too.
If you decide to stray from your preference for a type system, you might be interested in the J programming language. It is outstanding in how it emphasizes function composition. If you like point-free style in Haskell, the tacit form of J will be rewarding. I've found it extraordinarily thought-provoking, especially with regard to semantics.
True, it doesn't fit your preconceptions as to what you'd like, but give it a look. Just knowing that it's out there is worth discovering. The sole source of complete implementations is J Software, jsoftware.com.
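For comparison, here is a small hedged sketch of the point-free style in Haskell that J's tacit form generalizes; the mean function mirrors the classic J fork (+/ % #):

-- Point-free style: functions built by composition, no named arguments.
countWords :: String -> Int
countWords = length . words

-- A rough Haskell analogue of J's mean, (+/ % #):
mean :: [Double] -> Double
mean = (/) <$> sum <*> fromIntegral . length

-- ghci> mean [1, 2, 3, 4]  ==>  2.5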
Go with one of the mainstream languages. Given the available resources, the future marketability of your skills, and the rich developer ecosystems, I think you should start with either Java or C#.
Great question-- I've been asking it myself recently after spending several months thoroughly enjoying Haskell, although my background is very different (organic chemistry).
Like you, I consider C and its ilk out of the question.
I've been oscillating between Python and Ruby, the two practical workhorse scripting languages today (mules?), both of which have some functional components to keep me happy. Without starting any Rubyist/Pythonist debates here, my personal pragmatic answer to this question is:
Learn the one (Python or Ruby) that you first get an excuse to apply.

Is Haskell suitable as a first language?

I have had previous exposure to imperative languages (C, some Java) however I would say I had no experience in programming. Therefore: treating me as a non-programmer, would Haskell be suitable as a first language?
My interests in Pure Mathematics and CS seem to align with the intention of most Haskell tutorials, and although I can readily recognise the current and future industry value of imperative programming, I find the potential of functional programming (in as much as it seems such a paradigm shift) fascinating.
I guess my question can be distilled as follows - would a non-programmer have to understand imperative programming to appreciate and fully utilise functional programming?
Some references:
Are there any studies on whether functional/declarative or imperative programming is easier to learn as a first language?
Which programming languages have helped you to understand programming better?
Well, the existence of SICP suggests that functional languages can be used as introductory material. Scheme is perhaps more approachable than Haskell, however.
Haskell seems to have a reputation for being "difficult" to learn, but people tend to forget that classic imperative programming is difficult to learn as well. Many people struggle at first with the concept of assigning a value to a variable, and a surprising number of programmers never actually do become comfortable with pointers and indirect references.
The connections between Haskell and abstract mathematics don't really matter as much as people sometimes assume, but for someone interested in the math anyway, looking at the analogies might provide an interesting bonus.
There has been at least one study on the effects of teaching Haskell to beginner programmers:
The Risks and Benefits of Teaching Purely Functional Programming in First Year. Manuel M. T. Chakravarty and Gabriele Keller. Journal of Functional Programming 14(1), pp 113-123, 2004.
With the following abstract:
We argue that teaching purely functional programming as such in freshman courses is detrimental to both the curriculum as well as to promoting the paradigm. Instead, we need to focus on the more general aims of teaching elementary techniques of programming and essential concepts of computing. We support this viewpoint with experience gained during several semesters of teaching large first-year classes (up to 600 students) in Haskell. These classes consisted of computer science students as well as students from other disciplines. We have systematically gathered student feedback by conducting surveys after each semester. This article contributes an approach to the use of modern functional languages in first year courses and, based on this, advocates the use of functional languages in this setting.
So, yes, you can use Haskell, but you should focus on elementary, general techniques and essential concepts, rather than functional programming per se.
There are a number of popular books for beginner programmers that also make Haskell an attractive choice for teaching these elementary concepts, including:
"Programming in Haskell"
"The Craft of Functional Programming"
Additionally, Haskell is already widely taught as a first language -- but remember, the key is to focus on the core concepts as illustrated in Haskell, not to teach the large, rich language that is Haskell itself.
I'll go against the popular opinion and say that Haskell is NOT a good first programming language for the typical first-time programmer. I don't think it is as approachable for a raw beginner as imperative languages like Ruby.
The reason for this, is that people do not think about the world in a functional manner. When they see a car driving down the street, they see the same car, with ever-changing mutable state. They don't see a series of slightly different immutable cars.
If you check out other SO questions, you'll see that Haskell is pretty much never mentioned as a good choice for a beginner.
However, if you are a mathematician, or already know enough about programming to understand the value of functional programming, I think Haskell is a fine choice.
So to summarize, I think Haskell is a perfect fit for you, but not a good fit for the typical beginner.
EDIT: Thanks for the insightful comments. Owen's point that people think in a multi-paradigm manner is very true. This strengthens my belief that a multi-paradigm language like Ruby would be easier to pick up, and has the added benefit of exposing the student to both imperative and functional thinking. Haskell is decidedly not multi-paradigm.
Chuck mentioned Haskell's sophisticated type system which is another great point. While I personally prefer statically typed languages, using a dynamic language allows a beginner to ignore that piece of the puzzle until they are curious enough to find out what is going on behind the scenes. Haskell's type system, while elegant, is in your face from day 1.
Eleven reasons to use Haskell as a mathematician
I cannot write it better than that. But to summarize:
Haskell is declarative and mathematics is the ultimate declarative language, which means that code written in Haskell is remarkably similar to what you would write as a mathematical statement (see the sketch after this list).
Haskell is high-level; no need to know details about caches, memory management and all the other hardware stuff. It also means short programs, which is always good.
Haskell is great for symbolic computation, algebra, logic ...
Haskell is pretty :)
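As a hedged illustration of the first point (my examples, not the linked article's), a mathematical definition often transcribes into Haskell almost verbatim:

-- fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2)
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

-- Set-builder notation carries over too: { x^2 | x in [1..10], x even }
evenSquares :: [Integer]
evenSquares = [x ^ 2 | x <- [1 .. 10], even x]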
To answer your question: you'll have no problem starting with a functional language as a mathematician with no programming experience. Actually, it's the better choice; you won't have to repair the brain damage you would get from C/Java/whatever.
You should also check Mathematica. Some people tend to dislike it since it is a commercial closed-source product, but I think it's a pretty good environment for doing mathematics.
If you haven't had any experience at all, it will in fact be easier for you to become productive in functional programming, especially PURE functional programming. I'm an immigrant from imperative to functional: I had to forget about 80% of what I had learned to become productive in Haskell.
In contrast, it's easier to switch from functional to imperative later on.
On one hand, I think Haskell is nice as a first language, but I suppose, for anyone seriously interested in programming, it should be learned in parallel with C or after C (or an assembly). C is necessary to learn what's happening under the hood, what are the costs of doing this and that, and finally appreciate the usefulness of higher level of abstraction and automatic resource management. I think when being exposed to both C (as a low-level imperative language) and Haskell (as a high-level functional language), most students will find Haskell both practical and expressive.
On the other hand, I think that programming is a craft. It is a practical activity, and it is important to learn the joy of creating something new, useful or interesting. So you need to get things done, and the easiest way to do that is to use a language which has tools for your problems, i.e. libraries for your data formats and algorithms for your kind of problems. At this point, Python (or Ruby) may be a better choice, because Hackage still lags behind PyPI in many areas (say, how many days would you need to teach a novice to manipulate an image, or to plot charts, in Haskell?).
So, my opinion is that some exposure to low-level imperative programming is necessary (exposure to OOP probably is not). Then you can understand the value of Haskell. But to get things done and to become productive quickly, Python is a better choice for beginners. Haskell requires a few weeks before it becomes your tool.
I would say that it is suitable as a first language, and that having learned an imperative language first would probably only interfere with the learning process (since it requires lots of unlearning first).
As a caveat, I would add that a functional language's principles would probably be best understood by someone with a mathematical background, as the concepts are abstract mathematical ones.
I know that many schools do teach it as a first functional language, but not as a first language.
Yes, it is. Real World Haskell is a great way to get into it: http://book.realworldhaskell.org/
I would hesitantly say "yes" except for the fact that in learning, finding someone as a mentor or tutor would be a much less daunting task if you chose a more imperative language to start programming. Might I suggest R or Python (with NumPy and SciPy) instead?
No.
It's very easy for a Haskell 98 program to be clearly understood. LYAH is a great tutorial for people with no experience, but trying to prevent a learner from stumbling on extensions x, y, z is going to be tricky. Soon they start to explore and become overwhelmed by advanced programming/mathematical concepts which are much harder to understand, but which need to be understood to read other people's code.
If every piece of Haskell were written in just Haskell '98/'10, I would probably say yes, though.
Without necessarily addressing the question as such, I would add: if you find haskell's persnicketiness too hard, do not be discouraged.
There are other programming languages, even functional ones, which are late bound.

Is similarity to "natural language" a convincing selling point for a programming language? [closed]

Look, for example, at AppleScript (and there are plenty of others, some admittedly quite good) which advertise their use of the natural language metaphor. Code is apparently more readable because it can be/is intended to be constructed in English-like sentences, says they. I'm sure there are people who would like nothing better than to program using only English sentences. However, I have doubts about the viability of a language that takes that paradigm too far (excepting niche cases).
So, after a certain reasonable point, is natural-languaginess a benefit or a misfeature? What if the concept is carried to an extreme -- will code necessarily be more readable? Or might it be unnecessarily long, difficult to work with, and just as capable of producing hilarity on the scale of obfuscated Perl, obfuscated C, and eye-twisting Bash script logorrhea?
I am aware of some specialty cases like "Inform" that are almost pure English, but these have a niche that they're not likely to venture out from. I hear and read about how great it would be for code to read more like English sentences, but are there discussions of the possible disadvantages? If everyday language is so clear, simple, clean, lovely, concise, understandable, why did we invent mathematical notation in the first place?
Is it really easier to describe complex instructions accurately and precisely to a machine in natural language, or isn't something closer to mathematical markup a much better choice? Where should that line be drawn? And finally, are you attracted to languages that are touted as resembling English sentences? Should this whole question have just been a one liner:
naturalLanguage > computerishLanguage ? booAndHiss : cheerLoudly;
Well, of course, natural languages are rarely clear, simple, clean, lovely, concise, or understandable, which is one of the reasons that most programming is done in languages far from natural.
My answer to this would be that the ideal programming language lies somewhere between a natural language and a very formal language.
On the one extreme, there's the formal, minimal, mathematical languages. Take for example Brainfuck:
,>++++++[<-------->-],[<+>-]<. // according to Wikipedia, this means addition
Or, what's somewhat preferable to the above mess, any type of lambda calculus.
λxy.x
λxy.y
This is one possible way of expressing the Boolean truth values in lambda calculus. Doesn't look very neat, especially when you build logical operators (such as AND being e.g. λpq.pqp) around them.
I claim that most people could not write production code in such a minimalistic, hard-to-grasp language.
The problem on the other end of the spectrum, namely natural languages as they are spoken by humans, is that languages with too much complexity and flexibility allow the programmer to express vague and indefinite things that can mean nothing to today's computers. Let's take this sample program:
MAYBE IT WILL RAIN CATS AND DOGS LATER ON. WOULD YOU LIKE THIS, DEAR COMPUTER?
IF SO, PRINT "HELLO" ON THE SCREEN.
IF YOU HATE RAIN MORE THAN GEORGE DOES, PRINT SOME VAGUE GARBAGE INSTEAD.
(IN THE LATTER CASE, IT IS UP TO YOU WHERE YOU OUTPUT THAT GARBAGE.)
Now this is an obvious case of vagueness. But sometimes you would get things wrong with more reasonable natural language programs, such as:
READ AN INTEGER NUMBER FROM THE TERMINAL.
READ ANOTHER INTEGER NUMBER FROM THE TERMINAL.
IF IT IS LARGER THAN ZERO, PRINT AN ERROR.
Which number is IT referring to? And what kind of error should be printed (you forgot to specify it)? You would have to be really careful to be extremely explicit about what you mean.
It's already too easy to misunderstand other humans. How do you expect a computer to do better?
Thus, a computer language's syntax and grammar has to be strict enough so that it doesn't allow ambiguity. A statement must evaluate in a deterministic way. (There are maybe corner cases; I'm talking about the general case here.)
I personally prefer languages with a very limited set of keywords. You can quickly learn such a language, and you don't have to choose between 10,000 ways of achieving one goal simply because there's 10,000 keywords for doing the same thing (as in: GO/WALK/RUN/TROD/SLEEPWALK/etc. TO THE FRIDGE AND GET ME A BEER!). It means if you need to think about 10,000 different ways of doing something, it won't be due to the language, but due to the fact that there are 9,999 stupid ways to do it, and 1 elegant solution that just shines more than all the others.
Note that I wrote all natural language examples in upper-case. That's because I sort of had good old GW-BASIC and COBOL in mind while I wrote this. There've been some examples of programming languages that lean on natural language, and I think history has shown that they are, in general, somewhat less widespread than e.g. terse C-style languages.
I recently read that according to Gartner there are over 400 billion lines of COBOL source code in active use worldwide today.
That doesn't prove anything other than that banks and governments are fond of their legacy code, but you could construe it as a testament to the success of English-like programming languages. I'm not aware of any other programming language that is so close to English and so verbose.
Aside from that, I tend to agree with the other respondents: Programmers prefer not to type so much, and in general a language based on mathematics-like shorthand is both more expressive and more precise than one based on English.
There's a point where terse, expressive code looks like line noise. Perl, APL and J come to mind as examples with "illegible one-liners." Programmers are humans, and it may be beneficial to leave them with some similarity to natural language to give their brains something familiar to hold on to. Thus, I advocate a happy medium that is reminiscent of, but not too close to, natural language.
"When a programming language is created that allows programmers to program in simple English, it will be discovered that programmers cannot speak English." ~ Unknown
In my (not so) humble opinion, no.
Natural language is full of ambiguities. Normally we do not think of them because humans can easily disambiguate them, based on many criteria often unavailable to the computer. First off, we have knowledge about the world (elephants don't fit in pajamas), but we also use more senses than just hearing when we speak to each other, body language to name one. The intonation and manner in which things are said also help a lot to disambiguate. It is harder to catch irony or sarcasm in written text, which is more or less a transcription of what we would say, more so in the case of IM, less so in the case of well-written articles. In general there are loads and loads of ambiguities in natural language, for instance where PPs (prepositional phrases) attach:
"Workers [dumped [sacks [with flour]]]"
"Workers [dumped [sacks] [with a fork-lift]]]"
Any human immediately tells where the PP will attach: it's reasonable to have sacks with flour in them, and it's reasonable to use a fork-lift to dump something. Another very troublesome area is the word "and", which messes up the grammar horrendously, as do all the references we use, the pronouns in general, but also more complex references, e.g. "Bill bought a Dodge Viper; sadly the car was a lemon."
So we have three options. Keep the ambiguities in and try to deal with them, accepting very many errors in disambiguation and very, very slow parsing; no LALR or LL will work here. Or try to make an artificial grammar resembling natural language, keeping it deterministic, which is more reasonable but still horrible: we now have a language that falsely resembles English, but isn't, which is confusing. We get none of the benefits of a proper syntax and none of the benefits of natural language, but an oversized, overly wordy monster with a difficult and unintuitive grammar, difficult to learn and slow to write.
The third way is realizing we need a succinct way of expressing ourselves that can also be processed by a computer, not resembling any natural language, but focusing on being an unambiguous description of an algorithm. This increases readability, especially if we compare it to a very precise natural-language counterpart. This is why many people prefer to read the pseudo-code when dealing with difficult problems or advanced algorithms: it relieves us of the trouble of dealing with ambiguities, and is more optimal for expressing computer instructions.
The issue isn't so much that it's easier to describe complex ideas using one approach or the other, but it certainly is easier to understand machine languages (at least for machines). The biggest issue is, as always, ambiguity. Computers are terrible at understanding it, so most grammars for programming languages need to be constructed to either remove all ambiguity, or the general language must be constructed so that ambiguity isn't actually a problem (this is tricky).
Any programming language that allows for ambiguity would be terribly error prone; and any natural language that doesn't allow ambiguity would be terribly verbose and convoluted (I'm looking at you, Lojban [ok, maybe Lojban isn't so bad, still…]).
The propensity some people show for preferring natural languages as programming languages might essentially be rooted in the desire to eventually be able to feed a physics textbook into a parser, whereupon it'll do your homework when asked.
Of course, that's not to say that programming languages shouldn't have hints of natural language: Especially for OOP it makes good sense to have calling grammar resemble natural grammar, like in Obj-C, which is sort of a game of mad libs:
[pot makeCoffee:strong withSugar:NO];
Doing the same in Brainfuck would be, well, a brainfuck; three full pages of code to flip a switch will do that to you.
In essence, the best languages are (probably) the ones that resemble natural languages without pretending to be one. (Avoiding the uncanny valley of programming languages, if there is such a thing. [Subclauses! Yay!])
A natural language is too ambiguous to be used as programming language. It has to be artificially constrained to eliminate ambiguities.
But that defeats the purpose of having a "natural" programming language, because then you get its verbosity with none of its expressive advantages.
I think the fourth language I coded professionally in (after Fortran, Pascal and COBOL) was Natural, a pretty obscure 4GL of 1980s vintage for developing mainframe systems against an ADABAS database.
It was called Natural, I believe, because it had pretensions to be so: supposedly management-readable like COBOL, but minus the fluff.
Which should tell you that attempts at "natural" programming languages have a commercial history of over 30 years now (more if you count COBOL), but they have pretty much lost out to languages that don't pretend to be "natural" but do allow the programmer to define the problem succinctly. When I first started coding, the 1GL -> 2GL -> 3GL evolution wasn't that old, and the progression to 4GL (defined then as more English-like programming languages) for mainstream work seemed an obvious next step. It hasn't worked out that way. If anything, getting up to speed with coding has become harder, because there are more abstract concepts to learn.
SQL was designed with natural language in mind originally. Fortunately it hasn't held on too tightly to this and advances since its conception are less "naturalistic".
But anybody who has tried to write a complicated query in SQL will tell you that it's not that easy. You have to worry about the scope of some keywords over your query. You end up with an incredibly hard-to-understand query that does some crazy shit, and you rewrite it every time you need to change something, because that's easier.
Natural language programming is a bad idea. The further you get from assembly, the more mistakes you can make, not in terms of logical errors or anything like that, but in terms of having wrong assumptions about how the script interpreter/bytecode interpreter/compiler makes your code run on the CPU.
It seems to be a great feature for beginners, or for people who program as a "secondary activity". But I doubt you could reach the complexity and versatility of actual programming languages with natural language.
If there was a programming language that actually adhered to all of the conventions of the natural language it mimics, then that would be fantastic.
In reality, however, a lot of so-called "natural" programming languages have far stricter syntax than English, which means that although they are easily readable, it is debatable whether they are actually all that easy to write.
What makes sense in English is often a syntax error in AppleScript.
Everyday language isn't so clear, simple, clean, lovely, concise and understandable - to a computer. However, to a human, readability counts for a lot, and the closer you get to a natural language, the easier it is to read. That's why we're not all using assembly language.
If you have a completely natural language, there are a lot of things that need to be handled - the sentence needs to be parsed, each word must be understood - and there is plenty of room for ambiguity. That's generally not a good thing for a programming language, because then we're venturing into psychic programming - the computer has to figure out what you were thinking, which is not at all easy to get.
However, if you can make something sufficiently close to natural language - and yes, Inform 7 is probably the best example - so sentences look natural, but still have some structure you need to follow - then the code is almost instantly readable, even to people that don't know the language. There's usually also less specialized syntax to remember - because you're really just talking (a slightly modified form of) English - but if you have to do something out of the ordinary, then you might have to jump through some hoops to do that.
In practice, most languages don't bother with this, because that makes it easier for them to allow you to be precise. However, some will still hover closer to the "natural language". This can be a good thing: if you have to translate some pseudocode algorithm to a language, you don't need to manipulate it as much to make it work, reducing the risk that you make an error in the translation.
As an example, let's compare C and Pascal. This Pascal code:
for i := 1 to 10 do begin
  j := j + 1;
end;
is equivalent to this C code:
for (i = 1; i <= 10; i++) {
  j = j + 1;
}
If you had no prior knowledge of either syntax, the Pascal version is generally going to be simpler to read, if only because it's not as complex as a C for loop.
Let's also consider operators. Pascal and C both share +, - and *. They also both have /, but with different semantics: In C, / does an integer division if both operands are integers; in Pascal, it always does a "real" division and uses div for integer division. That means that you have to take the types into account when figuring out what actually happens in that line of code.
C also has a bunch of other operators: &&, ||, &, |, ^, <<, >>. In Pascal, those operators are instead named: and, or, and, or, xor, shl, shr respectively (Pascal reuses and and or for both the logical and bitwise variants). Instead of relying on some semi-arbitrary sequence of characters, it's spelled out more. It's instantly obvious that xor is - well, XOR - unlike the C version, where there's no obvious correlation between ^ and XOR.
Of course, this is to some degree a matter of opinion: I much prefer a Pascal-like syntax to a C-like syntax, because I think it's more readable, but that doesn't mean everyone else does: A more natural language is usually going to be more verbose, and some people simply dislike that extra level of verbosity.
Basically, it's a matter of choosing what makes the most sense for the problem domain: if the problem domain is very limited (like with Inform), then a natural language makes perfect sense. If it's a very generic domain (like with C), then you either need far more advanced processing than we are currently capable of, or a lot of verbosity to fill in the details - and in that case, you have to choose a balance depending on what sort of users will be using the languages (for regular people, you need more naturalness, for people who know programming, they're usually comfortable enough with less natural languages and will prefer something closer to that end).
I think the question is, who reads and who writes the application code in question? I think, regardless of the language or architecture, a trained software developer should be writing the code, and analyze the code as bugs arise.

What is a computer programming language?

At the risk of sounding naive, I ask this question in search of a deeper understanding of the concept of programming languages in general. I write this question for my own edification and the edification of others.
What is a useful definition of a computer programming language and what are its basic and necessary components? What are the key features that differentiate languages (functional, imperative, declarative, object oriented, scripting, etc...)?
One way to think about this question. Imagine you are looking at the hardware of a modern desktop or laptop computer. Assume, that the C language or any of its variants do not exist. How would you describe to others all the things needed to make the computer expressive and functional in terms of what we expect of personal computers today?
Tangentially related, what is it about computer languages that allow other languages to exist? For example take a scripting language like Javascript, Perl, or PHP. I assume part of the definition of these is that there is an interpreter most likely implemented in C or C++ at some level. Is it possible to write an interpreter for Javascript in Javascript? Is this a requirement for a complete language? Same for Perl, PHP, etc?
I would be satisfied with a list of concepts that can be looked up or researched further.
Like any language, programming languages are simply a communication tool for expressing and conveying ideas. In this case, we're translating our ideas of how software should work into a structured and methodical form that computers (as well as other humans who know the language, in most cases) can read and understand.
What is a useful definition of a computer programming language and what are its basic and necessary components?
I would say the defining characteristic of a programming language is as follows: things written in that language are intended to eventually be transformed into something that is executed. Thus, pseudocode, while perhaps having the structure and rigor of a programming language, is not actually a programming language. Likewise, UML can express many powerful ideas in an abstract manner just like a programming language can, but it falls short because people don't generally write UML to be executed.
How would you describe to others all the things needed to make the computer expressive and functional in terms of what we expect of personal computers today?
Even if the word "programming language" wasn't part of the shared vocabulary of the group I was talking to, I think it would be obvious to the others that we'd need a way to communicate with the computer. Just as no one expects a car to drive itself (yet!) without external instructions in the form of interaction with the steering wheel and pedals, no one could expect the hardware to function without being told what to do. As noted above, a programming language is the conduit through which we can make that communication happen.
Tangentially related, what is it about computer languages that allow other languages to exist?
All useful programming languages have a property called Turing completeness. If one language in the Turing-complete set can do something, then any of them can; they are said to be computationally equivalent.
However, just because they're equally "powerful" doesn't mean they're equally nice to work with for humans. This is why many people are willing to sacrifice the unparalleled micromanagement you get from writing assembly code in exchange for the expressiveness and power you get with higher-level languages, like Ruby, Python, or C#.
Is it possible to write an interpreter for Javascript in Javascript? Is this a requirement for a complete language? Same for Perl, PHP, etc?
Since there is a Javascript interpreter written in C, it follows that it must be possible to write a Javascript interpreter in Javascript, since both are Turing-complete. However, again, note that Turing-completeness says nothing about how hard it is to do something in one language versus another -- only whether it is possible to begin with. Your Javascript-interpreter-inside-Javascript might well be horrendously inefficient, consume absurd amounts of memory, require enormous processing power, and be a hideously ugly hack. But Turing-completeness guarantees it can be done!
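To make the interpreter point concrete, here is a hedged toy sketch of my own (not from the answer): an evaluator for the untyped lambda calculus, the smallest Turing-complete language, written in Haskell. The substitution is deliberately naive, so it assumes bound variable names are kept distinct; a serious interpreter needs capture-avoiding substitution.

-- Terms of the untyped lambda calculus.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving Show

-- Naive substitution of s for x (no capture avoidance).
subst :: String -> Term -> Term -> Term
subst x s (Var y)
  | x == y    = s
  | otherwise = Var y
subst x s (Lam y body)
  | x == y    = Lam y body               -- x is shadowed; stop here
  | otherwise = Lam y (subst x s body)
subst x s (App f a) = App (subst x s f) (subst x s a)

-- Normal-order evaluation to weak head normal form.
eval :: Term -> Term
eval (App f a) =
  case eval f of
    Lam x body -> eval (subst x a body)
    f'         -> App f' a
eval t = t

-- (λx.λy.x) a b reduces to a:
-- eval (App (App (Lam "x" (Lam "y" (Var "x"))) (Var "a")) (Var "b"))
--   ==>  Var "a"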
While this doesn't directly answer your question, I am reminded of the Revenge of the Nerds essay by Paul Graham about the evolution of programming languages. It's certainly an interesting place to start your investigation.
Not a definition, but I think there are essentially two strands of development in programming languages:
Those working their way up from what the machine can do to something more expressive and less tied to the machine (Assembly, Fortran, C, C++, Java, ...)
Those going down from some mathematical or theoretical computer science concept of computation to something implementable on a real machine (Lisp, Prolog, ML, Haskell, ...)
Of course, in reality the picture is not as neat, and both strands influence each other by borrowing the best ideas.
Slightly long rant ahead.
A computer language is actually not all that different from a human language. Both are used to express ideas and concepts in commonly understood terms. Among different human languages there are syntactic differences, but you can express the same thing in every language (does that make human languages Turing complete? :)). Some languages are better suited for expressing certain things than others.
For example, although technically not completely correct, the Inuit language seems quite suited to describing various kinds of snow. Japanese, in my experience, is very suitable for expressing one's feelings and state of mind, thanks to a large, concise vocabulary in that area. German is pretty good for being very precise, thanks to a largely unambiguous grammar.
Different programming languages have different specialities as well, but they mostly differ in the level of detail required to express things. The big difference between human and programming languages is mostly that programming languages lack a lot of vocabulary and have very few "grammatical" rules. With libraries you can extend the vocabulary of a language though.
For example:
Make me coffee.
Very easy to understand for a human, but only because we know what each of the words mean.
coffee : a drink made from the roasted and ground beanlike seeds of a tropical shrub
drink : a liquid that can be swallowed
swallow : cause or allow to pass down the throat
... and so on and so on
We know all these definitions by heart, but we had to learn them at some point.
In the same way, a computer can be "taught" to "understand" words as well.
Coffee::make()->giveTo($me);
This could be a perfectly valid expression in a computer language, if the computer "knows" what Coffee, make() and giveTo() mean and if $me is defined. It expresses the same idea as the English sentence, just with a different, more rigorous syntax.
In a different environment you'd have to say slightly different things to get the same outcome. In Japanese for example you'd probably say something like:
コーヒーを作ってもらっても良いですか?
Kōhī o tsukuttemoratte mo ii desu ka?
Which would roughly translate to:
if ($Person->isAgreeable('Coffee::make()')) {
  return $Person->return(Coffee::make());
}
Same idea, same outcome, but the $me is implied and if you don't check for isAgreeable first you may get a runtime error. In computer terms that would be somewhat analogous to Ruby's implied behaviour of returning the result of the last expression ("grammatical feature") and checking for available memory first (environmental necessity).
If you're talking to a really slow person with little vocabulary, you probably have to explain things in a lot more detail:
Go to the kitchen.
Take a pot.
Fill the pot with water.
...
Just like Assembler. :o)
Anyway, the point being, a programming language is actually a language just like a human language. Their syntax is different and specialized for the problem domain (logic/math) and the "listener" (computers), but they're just ways to transport ideas and concepts.
EDIT:
Another point about "optimization for the listener" is that programming languages try to eliminate ambiguity. The "make me coffee" example could, technically, be understood as "turn me into coffee". A human can tell what's meant intuitively; a computer can't. Hence in programming languages everything usually has one and only one meaning. Where it doesn't, you can run into problems, the "+" operator in Javascript being a common example.
1 + 1 -> 2
'1' + '1' -> '11'
See "Programming Considered as a Human Activity." EWD 117.
http://www.cs.utexas.edu/~EWD/transcriptions/EWD01xx/EWD117.html
Also see http://www.csee.umbc.edu/331/current/notes/01/01introduction.pdf
A human expression which:
describes mathematical functions
makes the computer turn switches on and off
This question is very broad. My favorite definition is that a programming language is a means of expressing computations:
Precisely
At a high level
In ways we can reason about them
By computation I mean what Turing and Church meant: the Turing machine and the lambda calculus have equivalent expressive power (which is a theorem), and the Church-Turing hypothesis (which is a conjecture) says roughly that there's no more powerful notion of computation out there. In other words, the kinds of computations that can be expressed in any programming language are at best the kinds that can be expressed using Turing machines or lambda-calculus programs, and some languages will be able to express only a subset of those computations.
This definition of computation also encompasses your friendly neighborhood hardware, which is pretty easy to simulate using a Turing machine and even easier to simulate using the lambda calculus.
Expressing computations precisely means the computer can't wiggle out of its obligations: if we have a particular computation in mind, we can use a programming language to force the computer to perform that computation. (Languages with "implementation defined" or "undefined" constructs make this task more difficult. Programmers using these languages are often willing to settle for—or may be unknowingly settling for—some computation that is only closely related to the computation they had in mind.)
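To make that concrete, here is a minimal C sketch (my own illustration, not the answerer's) of how implementation-defined and undefined constructs loosen the programmer's grip on the computation:

#include <stdio.h>
#include <limits.h>

int main(void) {
    int x = -1;
    /* Implementation defined: right-shifting a negative signed int
       may be an arithmetic or a logical shift, so the result can
       differ between compilers. */
    printf("-1 >> 1 == %d\n", x >> 1);

    /* Undefined behaviour: signed overflow. The compiler may assume
       it never happens, so the computation you had in mind may not
       be the one that actually runs. */
    int big = INT_MAX;
    printf("INT_MAX + 1 == %d\n", big + 1);
    return 0;
}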
Expressing computation at a high level is what programming languages are all about. An important reason that there are so many different programming languages out there is that there are so many different high-level ways of thinking about problems. Often, if you have an important new class of problems to solve, you may be best off creating a new programming language. For example, Larry Wall's writing suggests that solving a class of problems called "systems administration" was a motivation for him to create Perl.
(Another reason there are so many different programming languages out there is that creating a new language is a lot of fun, and anyone can learn to do it.)
Finally, many programmers want languages that make it easy to reason about programs. For example, today a student of mine implemented a new algorithm that made his program run over six times faster. He had to reason very carefully about the contents of C arrays to make sure that the new algorithm would do the same job the old one did. Luckily C has decent tools for reasoning about programs, for example:
A change in a[i] cannot affect the value of a[i-1].
My student also applied a reasoning principle that isn't valid in C:
The sum of a sequence of unsigned integers will be at least as large as any integer in the sequence.
This isn't true in C because the sum might overflow. One reason some programmers prefer languages like Standard ML is that in SML, this reasoning principle is always valid. Of languages in wide use, Haskell probably has the strongest reasoning principles; Richard Bird has developed equational reasoning about programs to a high art.
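To see the overflow concretely, here is a small C sketch (my own, for illustration): unsigned arithmetic in C wraps around modulo UINT_MAX + 1, so a sum can come out smaller than its largest term:

#include <stdio.h>
#include <limits.h>

int main(void) {
    unsigned int a[] = { UINT_MAX, 1u };
    /* Unsigned overflow is well defined in C: it wraps, so
       UINT_MAX + 1u is 0, which is smaller than UINT_MAX. */
    unsigned int sum = a[0] + a[1];
    printf("largest element: %u\n", a[0]);
    printf("sum of sequence: %u\n", sum);   /* prints 0 */
    return 0;
}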
I will not attempt to address all the tangential details that follow your opening question. But I hope you will get something out of an answer that aims to give a deeper understanding, as you asked, of a fundamental question about programming languages.
One thing a lot of "IT" types forget is that there are 2 types of computer programming languages:
Software programming languages: C, Java, Perl, COBOL, etc.
Hardware programming languages: VHDL, Verilog, System Verilog, etc.
Interesting.
I'd say the defining feature of a programming language is the ability to make decisions based on input. Effectively, if and goto. Everything else is lots and lots of syntactic sugar. This is the idea that spawned Brainfuck, which is actually remarkably fun to (try to) use.
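As a rough illustration (mine, not the answerer's), here is the "desugared" form of a while loop in C, built from nothing but if and goto:

#include <stdio.h>

int main(void) {
    int i = 0;
top:
    if (i >= 5) goto done;  /* the decision based on input */
    printf("%d\n", i);      /* loop body */
    i = i + 1;
    goto top;               /* everything else is syntactic sugar */
done:
    return 0;
}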
There are places where the line blurs; for example, I doubt people would consider XSLT to really be a programming language, but it's Turing-complete. I've even solved a Project Euler problem with it. (Very, very slowly.)
Three main properties of languages come to mind:
How is it run? Is it compiled to bare metal (C), compiled to mostly bare metal with some runtime lookup (C++), run on a JIT virtual machine (Java, .NET), bytecode-interpreted (Perl), or purely interpreted (uhh..)? This doesn't comment much on the language itself, but speaks to how portable the code may be, what sort of speed I might expect (and thus what broad classes of tasks would work well), and sometimes how flexible the language is.
What paradigms does it support? Procedural? Functional? Is the standard library built with classes or functions? Is there reflection? Is there, ideally, support for pretty much whatever I want to do?
How can I represent my data? Are there arrays, and are they fixed-size or not? How easy is it to use strings? Are there structs or hashes built in? What's the type system like? Are there objects? Are they class-based or prototype-based? Is everything an object, or are there primitives? Can I inherit from built-in objects?
I realize the last one is a very large collection of potential questions, but it's all related in my mind.
I imagine rebuilding the programming language landscape entirely from scratch would work pretty much how it did the first time: iteratively. Start with assembly, the list of direct commands the processor understands, and wrap it with something a bit easier to use. Repeat until you're happy.
Yes, you can write a Javascript interpreter in Javascript, or a Python interpreter in Python (see: PyPy), or a Python interpreter in Javascript. Such implementations are called self-hosting. Have a look at Perl 6; this has been a goal for its main implementation from the start.
Ultimately, everything just has to translate to machine code, not necessarily C. You can write D or Fortran or Haskell or Lisp if you want. C just happens to be an old standard. And if you write a compiler for language Foo that can ultimately spit out machine code, by whatever means, then you can rewrite that compiler in Foo and skip the middleman. Of course, if your language is purely interpreted, this will probably result in a stack overflow...
As a friend taught me about computer languages: a language is a world, a world of communication with the machine. It is a world for implementing ideas, algorithms and functionality, as Alonzo Church and Alan Turing described. It is the technical equivalent of the mathematical structures that those scientists built. It is a language with expressions and also limits. However, as Ludwig Wittgenstein said, "The limits of my language mean the limits of my world" -- there are always limitations, and that is how one chooses the language that best fits one's needs.
This is a generic answer... some thoughts, actually, rather than an answer.
There are many definitions for this, but the one I prefer is:
Computer programming is how you instruct a computer to solve a particular technical task/problem.
There are 3 key phrases to look out for:
You: The computer will do what you (the programmer) tell it to do.
Instruct: The instruction is given to the computer in a language that it can understand.
Problem: At the end of the day, computers are (complex) tools. They are there to make our life simpler.
The answer can be lengthy but you can find more about computer programming
