How stable is the Groovy language? - groovy

We're writing a large production system in Java, and I'm considering whether or not we can write some of the components in one of the JVM-based dynamic languages. Groovy appears to be the best choice from the Java interoperability standpoint. But is the Groovy implementation reliable enough to use in production (I would assume so), and is the Groovy language specification itself stable enough so that we aren't going to have to revise our production code substantially in a year or two? What are your experiences?
Summary (5/30/09): Good comments, the sense I get is that you should be cautious in adopting Groovy for mission-critical production use, it's fine for ancillary usages like putting together test cases, and there's a middle ground where it's probably fine but do your homework first. Performance is an issue, which needs to be balanced against the increase in developer productivity. Bill and Ichorus have equally helpful answers based on Groovy use, so it was a coin toss.
Update (12/3/09): More recently I've been taking a serious look at Scala, another JVM language. It was designed and implemented by Martin Odersky, the original author of the current javac compiler and the co-designer of Java Generics. Scala is a strongly typed, but uses type inferencing to strip out a lot of boilerplate. It's a nice blend of object-oriented and functional programming. James Gosling likes it. James Strachan, the author of Groovy, likes it too. And Odersky's experience writing javac means Scala's raw performance is not far from Java's, which is impressive.
Update (4/26/11): Take a look at Groovy++, a statically typed extension of Groovy, which has performance equivalent to Java. Looks very interesting.

Edit: Here almost four years later and Groovy has become much more solid.
I can wholeheartedly recommend it for production grade projects.
I've been using Groovy to support production applications for a while and for that purpose it is stable enough. As for actually having Groovy in bona fide production code; I don't think I would do that. Groovy does too many surprising things. It has gotten much better in this regard over the past year or so, but every once in awhile I will run into a bug that is a bit difficult to track down because of the generated code (my issues seem to have revolved around scoping).
I have gotten away from Groovy (though the stuff that we use that is simple and solid is still around) and have used Python (jython implementation), which has been far more predictable in my opinion. Also, python trumps Groovy in readability.
You can write some very interesting code in Groovy with closures and operator overloading and whatnot.
These languages are used for convenience and speed on ancillary code...stuff that can be switched out on the fly if need be. None of it is in production. I don't think I would put either in production unless it was as a stopgap to get something critical out of the door in a major hurry or as a proof of concept or prototype.
And in the case of putting it in actual production, it would have to be in the most dire of circumstances and I would assign someone to rewrite it in pure Java for the next release. I am 98% sure that either would be fine in production but that 2% is too much unnecessary risk.

We've got several production apps running on Grails (using Groovy as the language). So far, no issues have resulted. As for JVM compatibility, take a look at how little the JVM byte-code has changed in the last 5 years... like 1 instruction was added, and none were made obselete.
Will new versions of Groovy come out in the next year? Yes. Will you be required to change to them? No. Though you might want to, 1.6 is a huge speed improvement.
Which brings us to the major drawback of Groovy, the speed issue. Obviously, Groovy is slower than straight Java. The current number are up to 10 TIMES slower, for certain actions. That said, is your CPU the bottleneck in your app? For us, it's mostly DB access and latency. If it's the same for you, what matter if the CPU spends 200ms processing the page request instead of 35ms?
Is that the only problem with Groovy? Nope. Dynamic languages have refactoring difficulties, since there isn't necessarily a complete class specification anywhere in the code. However, this is partially balanced by the smaller code-size which makes it easier to find the places to modify the code.
Anyway, Groovy is a perfectly fine language for production uses. Mix it with Java for your "critical" code, if you fear the reliability. That's the BEST part of Groovy... how easy it mixes with Java classes.

I'm currently exploring using Groovy for only writing unit tests. This has the effects of
Allowing the potentially tedious part of writing tests to be done in a syntax that is a bit simpler than Java.
Keeps the Groovy code out of production.
Allows a large portion of the code base to be written in a non-Java language.
Of course, we haven't started yet, but this is at least my way of attempting to introduce alternate JVM languages to our new projects (and possibly existing ones). I have the same concerns you do, and even more so around performance than stability.

Scripting languages evolve "too fast" in the aspect of syntactic features.
If you want a language for the JVM that will stay compatible for
many years,
Java is your only choice ;)
By the way, I don't think that the readability of code is
ensured by a scripting language automatically.

We used Grails/Groovy as our primary backend at my previous company, and from that experience I'd say that I would choose Groovy over Java in most circumstances I am likely to encounter, since it interoperates with Java seamlessly and is otherwise more fun and expressive. Additionally I would expect the database would almost always be the bottleneck of my application rather than language performance, and we didn't encounter any stability issues/bugs with groovy as far as I recall.
But personally it's usually not about Groovy vs Java for me in most cases -- it's about Groovy/Java + available libraries vs. other languages like Python/Jython/JavaScript/Ruby + available libraries. And there are a lot of other considerations there such as strength of community, maturity of relevant technologies for your particular application, etc. In particular, for web development, Grails was decent, but the community seemed lacking. My overall opinion is that I would use python or Node.js going forward. If I needed the JVM I'd use a jython-compatible python web development environment.

I've been playing with Groovy for a month or so. The simplicity is awesome, and the dynamic language features are really cool too. However, speed is definitely an issue. Also, the groovy console really sucks. You cannot do things that you can do e.g. in python. Every once in a while I have to restart the console, reimport, things, etc. It also keeps forgetting the values I put in the variables while in the console mode; somehow mystically they go out of scope. (Is it because of the JVM? I don't know.) I cannot come up with an example from the top of my head, but the behavior I get in the Groovy console is different from the behavior I get in the Grails console, and is different from what I get by just writing code in a script.
A few more warnings. Note that Groovy is almost, but not 100% compatible with Java. For example, this will not compile:
public class HelloWorld {
public static void main(String args[]) {
System.out.println( "Hello, world!\n");
}
}
Also take a look at How to get classpath in Groovy?

Related

Does this language have its niche | future?

I am working on a new language, targeted for web development, embeding into applications, distributed applications, high-reliability software (but this is for distant future).
Also, it's target to reduce development expenses in long term - more time to write safer code and less support later. And finally, it enforces many things that real teams have to enforce - like one crossplatform IDE, one codestyle, one web framework.
In short, the key syntax/language features are:
Open source, non-restrictive licensing. Surely crossplatform.
Tastes like C++ but simpler, Pythonic syntax with strict & static type checking. Easier to learn, no multiple inheritance and other things which nobody know anyway :-)
LLVM bytecode/compilation backend gives near-C speed.
Is has both garbage collection & explicit object destruction.
Real OS threads, native support of multicore computers. Multithreading is part of language, not a library.
Types have the same width on any platform. int(32), long(64) e.t.c
Built in post and preconditions, asserts, tiny unit tests. You write a method - you can write all these things in 1 place, so you have related things in one place. If you worry that your class sourcecode will be bloated with this - it's IDEs work to hide what you don't need now.
Java-like exception handling (i.e. you have to handle all exceptions)
I guess I'll leave web & cluster features for now...
What you think? Are there any existing similar languages which I missed?
To summarize: You language has no real selling points. It just does what a dozen other languages already did, with syntax and semantics just slightly off, depending on where the programmer comes from. This may be a good thing, as it makes the language easier to adapt, but you also have to convince people to trouble to switch. All this stuff has to be built and debugged and documented again, tools have to be programmed, people have to learn it and convince their pointy-haired bosses to use it, etc. "So it's language X with a few features from Y and nicer syntax? But it won't make my application's code 15% shorter and cleaner, it won't free me from boilerplate X, etc - and it won't work with my IDE." The last one is important. Tools matter. If there are no good tools for a language, few people will shy away, rightfully so.
And finally, it enforces many things that real teams have to enforce - like one crossplatform IDE, one codestyle, one web framework.
Sounds like a downside! How does the language "enforce one X"? How do you convince programmers this coding style is the one true style? Why shouldn't somebody go and replace the dog slow, hardly maintained, severly limited IDE you "enforce" with something better? How could one web framework possibly fit all applications? Programmers rarely like to be forced into X, and they are sometimes right.
Also, you language will have to talk to others. So you have ready-made standard solutions for multithreading and web development in mind? Maybe you should start with a FFI instead. Python can use extensions written in C or C++, use dynamic libraries through ctypes, and with Cython it's amazingly simple to wrap any given C library with a Python interface. Do you have any idea how many important libraries are written in C? Unless your language can use these, people can hardly get (real-world) stuff done with it. Just think of GUI. Most mayor GUI toolkits are C or C++. And Java has hundreds of libraries (the other JVM languages profit much from Java interop) for many many purposes.
Finally, on performance: LLVM can give you native code generation, which is a huge plus (performance-wise, but also because the result is standalone), but the LLVM optimizers are limited, too. Don't expect it to beat C. Especially not hand-tuned C compiled via icc on Intel CPUs ;)
"Are there any existing similar
languages which I missed?"
D? Compared to your features:
The compiler has a dual license - GPL and Artistic
See example code here.
LDC targets LLVM. Support for D version 2 is under development.
Built-in garbage collection or explicit memory management.
core.thread
Types
Unit tests / Pre and Post Contracts
try/catch/finally exception handling plus scope guarantees
Responding to a few of your points individually (I've omitted what I consider either unimportant or good):
targeted for web development
Most people use php. Not because it's the best language available, that's for sure.
embeding into applications
Lua.
distributed applications, high-reliability software (but this is for distant future).
Have you carefully studied Erlang, both its design and its reference implementation?
it enforces many things that real teams have to enforce - like one crossplatform IDE, one codestyle, one web framework.
If your language becomes successful, people will make other IDEs, other code styles, other web frameworks.
Multithreading is part of language, not a library.
Really good languages for multithreading forbid side effects inside threads. Yes, in practice that pretty much means Erlang only.
Types have the same width on any platform. int(32), long(64) e.t.c
Sigh... There's only one reasonable width for integers outside of machine-level languages like C: infinite.
Designing your own language will undoubtedly teach you someting. But designing a good language is like designing a good cryptosystem: lots of amateurs try, but it takes an expert to do it well.
I suggest you read some of Norman Ramsey's answers here on programming language design, starting with this thread.
Given your interest in distributed applications, knowing Erlang is a must. As for sequential programming, the minimum is one imperative language and one functional language (ideally both Lisp/Scheme and Haskell, but F# is a good start). I also recommend knowing at least one high-level language that doesn't have objects, just so you understand that not having objects can often make the programmer's life easier (because objects are complex).
As for what could drive other people to learn your language... Good tools/libraries/frameworks can't hurt (FORTRAN, php), and a big company setting the example can't hurt (Java, C#). Good design doesn't seem to be much of a factor (a ha-ha-only-serious joke has it that what makes a language successful is using {braces} to delimit blocks: C, C++, Java, C#, php)...
What you've given us is a list of features, with no coherent philosophy, or explanation as to how they will work together. None of the features are unique. At best, you're offering incremental improvements over what's already there. I'd expect there's already languages kicking around with what you've said, it's just that they're still fairly obscure, because they didn't make it.
Languages have inertia. People have to learn new languages, and sometimes new tools. They need incentive to do so, and 20% improvement in a few features doesn't cut it.
What you need, at a minimum, is a killer app and a form of elevator pitch. (The "elevator pitch" is what you tell the higher-ups about your project when you're in the elevator with them, in current US business parlance.) You need to have your language be obviously worth learning for some purpose, and you need to be able to tell people why it's worth learning before they think "just another language by somebody who wanted to write a language" and go away.
You need to form a language community. That community needs to have some localization at first: people who work in X big company, people who want to do Y, whatever. Decide on what that community is likely to be, and come up with one big reason to switch and some reasons to believe that your language can deliver what it promises.
No.
Every buzzword you have included in your feature list is an enormous amount of work to be spec'd, implemented, documented, and tested.
How many people will be actively developing the language? I guess the web is full of failed programming language projects. (Same is true for non-mainstream OSes)
Have a look at what .Net/Visual Studio or Java/Eclipse have accomplished. That's 1000s of years of specification, development, tests, documentation, feedback, bug fixes, service packs.
During my last job I heard about somebody who wrote his own programming framework, because it was "better". The resulting program code (both in the framework and in the applications) is certainly unmaintainable once the original programmer quits, or is "hit by a bus", as the saying goes.
As the list sounds like Java++ or Mono++, you'd probably be more successful in engaging in an existing project, even if it won't have your name tag on it.
Perhaps you missed one key term. Performance.
In any case, unless this new language has some really out-of-this-world features(ex: 100% increase in performance over other web development languages), I think it will be yet another fish in the pond.
Currently I'm responsible for maintaining a framework developed/owned by my company. It's a nightmare. Unless there is a mainstream community, working on this full time, it's really an elephant. I do not appreciate my company's decision to develop its own framework(because it's supposed to be "faster") day 'n night.
The language tastes good in my opinion, I don't want use java for a simple website but I would like to have types and things like that. ASP .NET is a problem because of licensing and I can't afford those licenses for a single website... Also features looks good
Remember a lot of operator overloading: I think is the biggest thing that PHP is actually missing. It allows classes to behave much more like basic types :)
When you have something to test I'll love to help you with it! Thanks
Well, if you have to reinvent the wheel, you can go for it :)
I am not going to give you any examples of languages or language features, but I will give you one advice instead:
Supporting framework is what is the most important thing. People will tend to love it or hate it, depending on how easy is to write good code that get job done. Therefore, please do usability test before releasing it. I mean ask several people how they will do certain task and create API accordingly. Then test beta API on other coders and listen carefully to their comments.
Regards and good luck :)
There's always space for another programming language. Apart from getting the design right, I think the biggest problem is coming across as just another wannabe language. So you may want to look at your marketing, you need a big sponsor who can integrate your language into their products, or you need to generate a buzz around it, easiest way is astroturfing. Good luck.
http://en.wikipedia.org/wiki/List_of_programming_languages
NB the names G and G++ aren't taken. Oh and watch out for the patent trolls.
Edit
Oops G / G++ are taken... still there are plenty more letters left.
This sounds more like a "systems" language rather than a "web development language". The major languages in this category (other than C++/C) are D and Go.
My advice to you would be to not start from scratch but examine the possibility of creating tools or libraries for those languages, and seeing just how far you can push them.

Right Language for the Job

Using the right language for the job is the key - this is the comment I read in SO and I also belive thats the right thing to do. Because of this we ended up using different languages for different parts of the project - like perl, VBA(Excel Macros), C# etc. We have three to four languages currently in use inside the project. Using the right language for the job has made it immensly more easy to do automate a job, but of late people are complaining that any new person who has to take over the project will have to learn so many different languages to get started. Also it is difficult to find such kind of person. Please note that this is a one to two person working on the project maximum at a given point of time. I would like to know if the method we are following is right or should we converge to single language and try to use it across all the job even though another language might be better suited for it. Your experenece related to this would also help.
Languages used and their purpose:
Perl - Processing large text file(log files)
C# with Silverlight for web based reporting.
LabVIEW for automation
Excel macros for processing data in excel sheets, generating graphs and exporting to powerpoint.
I'd say you are doing it right. Almost all the projects I've ever worked on have used multiple languages, without any problems. I think you may be overestimating the difficulty people have picking up new languages, particularly if they all use the same paradigm. If your project were using Haskell, Smalltalk, C++ and assembler, you might have difficulties, I must admit :-)
Using a variety languages vs maintainability is simply another design decision with cost-vs-benefits trade-offs that, like any design decision, need to considered carefully. While I like using the "absolute best" tool for the task (whatever "absolute best" means), I wouldn't necessarily use it without considering other factors such as:
do we have sufficient skill and experience to implement successfully
will we be able to find the necessary resources to maintain it
do we already use the language/tech in our system
does the increase in complexity to the overall system (e.g. integration issues, the impact on build automation) outweigh the benefits of using the "absolute best" language
is there another language that we already use and have experience in that is usable in lieu of the "absolute best" language
I worked a system with around a dozen engineers that used C++, Java, SQL, TCL, C, shell scripts, and just a touch of Perl. I was proud that we used the "best language" where they made sense, but, in one case, using the "best language" (TCL) was a mistake - not because it was TCL - but rather because we failed to observe the full costs-vs-benefits of the choice:*
we had only 1 engineer deeply familiar with TCL - the original engineer who refused to use anything but TCL for a particular target component - and then that engineer left the project
this target component was the only part of the system to use TCL and it was small relative to the other components in the system
the component could have been also implemented in another language we already used that we had plenty of experience in (C or C++) with some extra effort
the component appeared deceptively simple, but in reality had subtle corner cases that bit us in production (not something we could have known then, but always something to consider as a possibility later)
we had to implement special changes to the nightly build because the TCL compiler (yes, we compiled the TCL code into an executable) wouldn't work unless it could first throw its logo up on the screen - a screen which wasn't available during a cron-initiated automated nightly build. (We resorted to using xvfb to satisfy it.)
when bugs turned up, we had difficult time finding engineers who wanted to work on it
when problems with this component cropped up only after sustained load in the field, we lacked the experience and deep understanding of the TCL execution engine to easily debug it in the field
finally, the maintenance and sustainment team, which is a much smaller team with fewer resources than the main development team, had one more language that they needed training and experience in just to support this relatively small component
Although there were a number of things we could have done to head-off some of the issues we hit down the road (e.g. put more emphasis on getting TCL experience earlier, running better tests to detect the issues earlier, etc), my point is that the overall cost-vs-benefit didn't justify using TCL to code that single component. Again, it was not a mistake to use TCL because it was TCL (TCL is a fine language), but rather it was a mistake because we failed to give full consideration to the cost-vs-benefits.
As a Software Engineer it's your job to learn new languages if you need to. I would say you should go with the right tool for the job.
It's like sweeping the floor with an octopus. Yeah, it gets the job done... kind of... but it's probably not the best tool for the job. You're better off using a mop.
Another option is to create positions geared towards working in specific languages. So you can have a C# developer, a Perl developer, and a VBA developer who will only work with that language. It's a bit more overhead, but it is a workable solution.
Any modern software project of any scope -- even if it's a one-person job -- requires more than one language. For instance, a web project usually requires Javascript, a backend language, and a DB query language (though any of these might be created by the backend language). That said, there's a threshold that is easily reached, and then it would be very hard to find new developers to take over projects. What's the limit? Three languages? Four? Let's say: five is too many, but one would be too few for any reasonably complex project.
Using the right language for the right job is definitely appropriate - I am mainly a web programmer, and I need to know server-side programming (Rails, PHP + others), SQL, Javascript, jQuery, HTML & CSS (not programming languages strictly, but complex things I need to know) - it would be impossible for me to do all of that in a single language or technology.
If you have a team of smart developers, picking up new languages will not be a problem for them. In fact they probably will be eager to do so. Just make sure they are given adequate time (and mentoring) to learn the new language.
Sure, there will be a learning curve to implementing production code if there is a new language to learn, but you will have a stronger team member for it.
If you have developers in your teams who strongly resist learning new languages, unless there is a very good reason (e.g they are justifiably adamant that they are being asked to used a different language when it is not appropriate to do so) - then they are not the sort of developers I'd want in my teams.
And don't bother trying to hire people who know all the languages you use. Hire smart programmers (who probably know at least one language you use) - they should pick up the other languages just fine.
I would be of the opinion, that if I a programmer on my team wanted to introduce a second (or third) language into a project, that there better be VERY VERY good reason to do so; as a project manager I would need to be convinced that the cost of doing so, more than offset the problems. And it would take a lot of convincing.
Splitting up project into multiple languages makes it very expensive to hire the right person(s) to take over that project when it needs maintenance. For small and medium shops it could be a huge obstacle.
Edit: I am not talking about using javascript and c# on the same project, I am talking about using C# for most of the code, F# for a few parts and then VB or C++ for others - there would need to be a compelling reason.
KISS: 'Keep it simple stupid' is a good axiom to follow in most cases.
EDIT: I am not completely opposed to adding languages, but the burden of proof is on the person to who wants to do it. KISS (to me) applies to not only getting the project/product done and shipped, but also must take into account the lifetime maint. and support requirements. Lots of languages come and go, and programmers are attracted to new languages like a moth to a light. Most projects I have worked on, I still oversee and/or support 5 or 10 years later - last thing I want to see is some long forgotten and/or orphaned language responsible for some key part of an application I need to support.
My experience using C++ and Lua was that I wrote more glue than actual operational code and for dubious benefit.
I'd start by saying the issue is whether you are using the right paradigm for the job?
Suppose you know how to do object oriented programming in C#. I don't think the leap to Java is all that great . Although you'd have to familiarize yourself with libraries and syntax, the idea is pretty similar.
If you have procedural parts to your project, such as parsing files and various data transformations, your Perl/Excel Macros seem pretty similar.
But to address your issues, I'd say above all, you'll need clarity in code. Since your staff are using several languages, they won't be familiar with all languages to an equal degree. So make sure of this:
1) Syntactic sugar is explained in comments. Sugars are pretty language specific and may not be obvious to a new reader. For instance, in VBA I seem to remember there are default properties, so that TextBox1 = "Hello" is what you'd write instead of TextBox1.Text = "Hello". This can be confusing. Also, things like LINQ statements can have un-obvious meanings. So make sure people have comments to read.
2) Where two components from different languages have to work together, make excruciatingly specific details about how that happens. For instance, I once had to write a C component to be called from Excel VBA. There's quote a few steps and potential errors in doing this, especially as far as compiler flags. Make sure both sides know how their side of the interaction occurs.
Finally, as far as hiring people goes, I think you have to find people who aren't married to a specific language. To put it vaguely, hire an intelligent person who sees business issues, not code. He'll learn the lingo soon enough.

Why did you decide "against" using Erlang?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Have you actually "tried" (means programmed in, not just read an article on it) Erlang and decided against it for a project? If so, why? Also, if you have opted to go back to your old language, or to use another functional language like F#, Haskell, Clojure, Scala, or something else then this counts too, and state why.
I returned to Haskell for my personal projects from Erlang for the simple virtue of Haskell's amazing type system. Erlang gives you a ton of tools to handle when things go wrong. Haskell gives you tools to keep you from going wrong in the first place.
When working in a language with a strong type system you are effectively proving free theorems about your code every time you compile.
You also get a bunch of overloading sugar from Haskell's typeclass machinery, but that is largely secondary to me -- even if it does allow me to express a number of abstractions that would be terribly verbose or non-idiomatic and unusable in Erlang (e.g. Haskell's category-extras).
I love Erlang, I love its channels and its effortless scalability. I turn to it when these are the things I need. Haskell isn't a panacea. I give up a better operational understanding of space consumption. I give up the magical one pass garbage collector. I give up OTP patterns and all that effortless scalability.
But its hard for me to give up the security blanket that, as is commonly said, in Haskell, if it typechecks, it is probably correct.
We use Haskell, OCaml and (now) F# so for us it has nothing to do with lack of C-like syntax. Rather we skip Erlang because:
It's dynamically typed (we're fans of Haskell's type system)
Doesn't provide a 'real' string type (I understand why, but it's annoying that this hasn't been corrected at the language level yet)
Tends to have poor (incomplete or unmaintained) database drivers
It isn't batteries included and doesn't appear to have a community working on correcting this. If it does, it isn't highly visible. Haskell at least has Hackage, and I'd guess that's what has us choosing that language over any other. In Windows environments F# is about to have the ultimate advantage here.
There are probably other reasons I can't think of right now, but these are the major points.
The best reason to avoid Erlang is when you cannot commit to the functional way of programming.
I read an anti-Erlang blog rant a few weeks ago, and one of the author's criticisms of Erlang is that he couldn't figure out how to make a function return a different value each time he called it with the same arguments. What he really hadn't figured out is that Erlang is that way on purpose. That's how Erlang manages to run so well on multiple processors without explicit locking. Purely functional programming is side-effect-free programming. You can arm-twist Erlang into working like our ranting blogger wanted, adding side effects, but in doing so you throw away the value Erlang offers.
Pure functional programming is not the only right way to program. Not everything needs to be mathematically rigorous. If you determine your application would be best written in a language that misuses the term "function", better cross Erlang off your list.
I have used Erlang in a few project already. I often use it for restful services. Where I don't use it however is for complex front end web applications where tools like Ruby on Rails are far better. But for the powerbroker behind the scenes I know of no better tool than Erlang.
I also use a few applications written in Erlang. I use CouchDB and RabbitMQ a bit and I have set up a few EJabberd servers. These applications are the most powerful, easiest and flexible tools in their field.
Not wanting to use Erlang because it does not use JVM is in my mind pretty silly. JVM is not some magical tool that is the best in doing everything in the world. In my mind the ability to choose from an arsenal of different tools and not being stuck in a single language or framework is what separates experts from code monkeys.
PS: After reading my comment back in context I noticed it looked like I was calling oxbow_lakes a code monkey. I really wasn't and apologize if he took it like that. I was generalizing about types of programmers and I would never call an individual such a negative name based on one comment by him. He is probably a good programmer even though I encourage him to not make the JVM some sort of a deal breaker.
Whilst I haven't, others on the internet have, e.g.
We investigated the relative merits of
C++ and Erlang in the implementation
of a parallel acoustic ray tracing
algorithm for the U.S. Navy. We found
a much smaller learning curve and
better debugging environment for
parallel Erlang than for
pthreads-based C++ programming. Our
C++ implementation outperformed the
Erlang program by at least 12x.
Attempts to use Erlang on the IBM Cell
BE microprocessor were frustrated by
Erlang's memory footprint. (Source)
And something closer to my heart, which I remember reading back in the aftermath of the ICFP contest:
The coding was very straightforward,
translating pseudocode into C++. I
could have used Java or C#, but I'm at
the point where programming at a high
level in C++ is just as easy, and I
wanted to retain the option of quickly
dropping down into some low-level
bit-twiddling if it came down to it.
Erlang is my other favorite language
for hacking around in, but was worried
about running into some performance
problem that I couldn't extricate
myself from. (Source)
For me, the fact that Erlang is dynamically typed is something that makes me wary. Although I do use dynamically typed languages because some of them are just so very problem-oriented (take Python, I solve a lot of problems with it), I wish they were statically typed instead.
That said, I actually intended to give Erlang a try for some time, and I’ve just started downloading the source. So your “question” achieved something after all. ;-)
I know Erlang since university, but have never used it in my own projects so far. Mainly because I'm mostly developing desktop applications, and Erlang is not a good language for making nice GUIs. But I will soon implement a server application, and I will give Erlang a try, because that's what it's good for. But I'm worring that I need more librarys, so maybe I'll try with Java instead.
A number of reasons:
Because it looks alien from anyone used to the C family of languages
Because I wanted to be able to run on the Java Virtual Machine to take advantage of tools I knew and understood (like JConsole) and the years of effort which have gone into JIT and GC.
Because I didn't want to have to rewrite all the (Java) libraries I've built up over the years.
Because I have no idea about the Erlang "ecosystem" (database access, configuration, build etc).
Basically I am familiar with Java, its platform and ecosystem and I have invested much effort into building stuff which runs on the JVM. It was easier by far to move to scala
I Decided against using Erlang for my project that was going to be run with a lot of shared data on a single multi-processor system and went with Clojure becuase Clojure really gets shared-memory-concurrency. When I have worked on distributed data storage systems Erlang was a great fit because Erlang really shines at distributed message passing systems. I compare the project to the best feature in the language and choose accordingly
Used it for a message gateway for a proprietary, multi-layered, binary protocol. OTP patterns for servers and relationships between services as well as binary pattern matching made the development process very easy. For such a use case I'd probably favor Erlang over other languages again.
The JVM is not a tool, it is a platform. Although I am all in favour of choosing the best tool for the job the platform is mostly already determined. Unless I am developing something standalone, from scratch and without the desire to reuse any existing code/library (three aspects that are unlikely in isolation already) I may be free to choose the platform.
I do use multiple tools and languages but I mainly targetg the JVM platform. That precludes Erlang for most if not all of my projects, as interesting as some of it concepts are.
Silvio
While I liked many design aspects of the Erlang runtime and the OTP platform, I found it to be a pretty annoying program language to develop in. The commas and periods are totally lame, and often require re-writing the last character of many lines of code just to change one line. Also, some operations that are simple in Ruby or Clojure are tedious in Erlang, for example string handling.
For distributed systems relying on a shared database the Mnesia system is really powerful and probably a good option, but I program in a language to learn and to have fun, and Erlang's annoying factor started to outweigh the fun factor once I had gotten past the basic bank account tutorials and started writing plugins for an XMPP server.
I love Erlang from the concurrency standpoint. Erlang really did concurrency right. I didn't end up using erlang primarily because of syntax.
I'm not a functional programmer by trade. I generally use C++, so I'm covet my ability to switch between styles (OOP, imperative, meta, etc). It felt like Erlang was forcing me to worship the sacred cow of immutable-data.
I love it's approach to concurrency, simple, beautiful, scalable, powerful. But the whole time I was programming in Erlang I kept thinking, man I'd much prefer a subset of Java that disallowed data sharing between thread and used Erlangs concurrency model. I though Java would have the best bet of restricting the language the feature set compatible with Erlang's processes and channels.
Just recently I found that the D Programing language offers Erlang style concurrency with familiar c style syntax and multi-paradigm language. I haven't tried anything massively concurrent with D yet, so I can't say if it's a perfect translation.
So professionally I use C++ but do my best to model massively concurrent applications as I would in Erlang. At some point I'd like to give D's concurrency tools a real test drive.
I am not going to even look at Erlang.
Two blog posts nailed it for me:
Erlang machinery walks the whole list to figure out whether they have a message to process, and the only way to get message means walking the whole list (I suspect that filtering messages by pid also involves walking the whole message list)
http://www.lshift.net/blog/2010/02/28/memory-matters-even-in-erlang
There are no miracles, indeed, Erlang does not provide too many services to deal with unavoidable overloads - e.g. it is still left to the application programmer to deal checking for available space in the message queue (supposedly by walking the queue to figure out the current length and I suppose there are no built-in mechanisms to ensure some fairness between senders).
erlang - how to limit message queue or emulate it?
Both (1) and (2) are way below naive on my book, and I am sure there are more software "gems" of similar nature sitting inside Erlang machinery.
So, no Erlang for me.
It seems that once you have to deal with a large system that requires high performance under overload C++ + Boost is still the only game in town.
I am going to look at D next.
I wanted to use Erlang for a project, because of it's amazing scalability with number of CPU'S. (We use other languages and occasionally hit the wall, leaving us with having to tweak the app)
The problem was that we must deliver our application on several platforms: Linux, Solaris and AIX, and unfortunately there is no Erlang install for AIX at the moment.
Being a small operation precludes the effort in porting and maintaining an AIX version of Erlang, and asking our customers to use Linux for part of our application is a no go.
I am still hoping that an AIX Erlang will arrive so we can use it.

How to create a language these days?

I need to get around to writing that programming language I've been meaning to write. How do you kids do it these days? I've been out of the loop for over a decade; are you doing it any differently now than we did back in the pre-internet, pre-windows days? You know, back when "real" coders coded in C, used the command line, and quibbled over which shell was superior?
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily) but how do you build the compiler and standard libraries and so forth? What tools do you kids use these days?
One consideration that's new since the punched card era is the existence of virtual machines already bountifully provided with "standard libraries." Targeting the JVM or the .NET CLR instead of ye olde "language walled garden" saves you a lot of bootstrapping. If you're creating a compiled language, you may also find Java byte code or MSIL an easier compile target than machine code (of course, if you're in this for the fun of creating a tight optimising compiler then you'll see this as a bug rather than a feature).
On the negative side, the idioms of the JVM or CLR may not be what you want for your language. So you may still end up building "standard libraries" just to provide idiomatic interfaces over the platform facility. (An example is that every languages and its dog seems to provide its own method for writing to the console, rather than leaving users to manually call System.out.println or Console.WriteLine.) Nevertheless, it enables an incremental development of the idiomatic libraries, and means that the more obscure libraries for which you never get round to building idiomatic interfaces are still accessible even if in an ugly way.
If you're considering an interpreted language, .NET also has support for efficient interpretation via the Dynamic Language Runtime (DLR). (I don't know if there's an equivalent for the JVM.) This should help free you up to focus on the language design without having to worry so much about the optimisation of the interpreter.
I've written two compilers now in Haskell for small domain-specific languages, and have found it to be an incredibly productive experience. The parsec library makes playing with syntax easy, and interpreters are very simple to write over a Haskell data structure. There is a description of writing a Lisp interpreter in Haskell that I found helpful.
If you are interested in a high-performance backend, I recommend LLVM. It has a concise and elegant byte-code and the best x86/amd64 generating backend you can find. There is an optional garbage collector, and some experimental backends that target the JVM and CLR.
You can write a compiler in any language that produces LLVM bytecode. If you are adventurous enough to learn Haskell but want LLVM, there are a set of Haskell-LLVM bindings.
What has changed considerably but hasn't been mentioned yet is IDE support and interoperability:
Nowadays we pretty much expect Intellisense, step-by-step execution and state inspection "right in the editor window", new types that tell the debugger how to treat them and rather helpful diagnostic messages. The old "compile .x -> .y" executable is not enough to create a language anymore. The environment is nothing to focus on first, but affects willingness to adopt.
Also, libraries have become much more powerful, noone wants to implement all that in yet another language. Try to borrow, make it easy to call existing code, and make it easy to be called by other code.
Targeting a VM - as itowlson suggested - is probably a good way to get started. If that turns out a problem, it can still be replaced by native compilers.
I'm pretty sure you do what's always been done.
Write some code, and show your results to the world.
As compared to the olden times, there are some tools to make your job easier though. Might I suggest ANTLR for parsing your language grammar?
Speaking as someone who just built a very simple assembly like language and interpreter, I'd start out with the .NET framework or similar. Nothing can beat the powerful syntax of C# + the backing of the entire .NET community when attempting to write most things. From here i designed a simple bytecode format and assembly syntax and proceeeded to write my interpreter + assembler.
Like i said, it was a very simple language.
You should not accept wimpy solutions like using the latest tools. You should bootstrap the language by writing a minimal compiler in Visual Basic for Applications or a similar language, then write all the compilation tools in your new language and then self-compile it using only the language itself.
Also, what is the proposed name of the language?
I think recently there have not been languages with ALL CAPITAL LETTER names like COBOL and FORTRAN, so I hope you will call it something like MIKELANG with all capital letters.
Not so much an implementation but a design decision which effects implementation - if you make every statement of your language have a unique parse tree without context, you'll get something that it's easy to hand-code a parser, and that doesn't require large amounts of work to provide syntax highlighting for. Similarly simple things like using a different symbol for module namespaces and object namespaces ( unlike Java which uses . for both package and class namespaces ) means you can parse the code without loading every module that it refers to.
Standard libraries - include the equivalent of everything in C99 standard libraries other than setjmp. Add whatever else you need for your domain. Work out an easy way to do this, either something like SWIG or an in-line FFI such as Ruby's [can't remember module name] and Python's ctypes.
Building as much of the language in the language is an option, but projects which start out doing either give up (rubinius moved to using C++ for parts of its standard library), or is only for research purposes (Mozilla Narcissus)
I am actually a kid, haha. I've never written an actual compiler before or designed a language, but I have finished The Red Dragon Book, so I suppose I have somewhat of an idea (I hope).
It would depend firstly on the grammar. If it's LR or LALR I suppose tools like Bison/Flex would work well. If it's more LL, I'd use Spirit, which is a component of Boost. It allows you to write the language's grammar in C++ in an EBNF-like syntax, so no muddling around with code generators; the C++ compiler compiles the grammar for you. If any of these fail, I'd write an EBNF grammar on paper, and then proceed to do some heavy recursive descent parsing, which seems to work; if C++ can be parsed pretty well using RDP (as GCC does it), then I suppose with enough unit tests and patience you could write entire compilers using RDP.
Once I have a parser running and some sort of intermediate representation, it then depends on how it runs. If it's some bytecode or native code compiler, I'll use LLVM or libJIT to process it. LLVM is more suited for general compilation, but I like the libJIT API and documentation better. Alternatively, if I'm really lazy, I'll generate C code and let GCC do the actual compilation. Another alternative, is to target an existing VM, like Parrot or the JVM or the CLR. Parrot is the VM being designed for Perl. If it's just an interpreter, I'll walk the syntax tree.
A radical alternative is to use Prolog, which has syntax features which remarkably simulate EBNF. I have no experience with it though, and if I am not wrong (which I am almost certainly going to be), Prolog would be quite slow if used to parse heavy duty programming languages with a lot of syntactical constructs and quirks (read: C++ and Perl).
All this I'll do in C++, if only because I am more used to writing in it than C. I'd stay away from Java/Python or anything of that sort for the actual production code (writing compilers in C/C++ help to make it portable), but I could see myself using them as a prototyping language, especially Python, which I am partial towards. Of course, I've never actually done any of this before, so I'm not one to say.
On lambda-the-ultimate there's a link to Create Your Own Programming Language by Marc-André Cournoyer, which appears to describe how to leverage some modern tools for creating little languages.
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily)
Just a hint: Look at some quite different languages first, before designing a new languge (i.e. languages with a very different evaluation strategy). Haskell and Oz come to mind. Though you should also know Prolog and Scheme. A year ago I also was like "hey, let's design a language that behaves exactly as I want", but fortunatly I looked at those other languages first (or you could also say unfortunatly, because now I don't know how I want a language to behave anymore...).
Before you start creating a language you should read this:
Hanspeter Moessenboeck, The Art of Niklaus Wirth
ftp://ftp.ssw.uni-linz.ac.at/pub/Papers/Moe00b.pdf
There's a big shortcut to implementing a language that I don't see in the other answers here. If you use one of Lukasiewicz's "unparenthesized" forms (ie. Forward Polish or Reverse Polish) you don't need a parser at all! With reverse polish, the dependencies go right-to-left so you simply execute each token as it's scanned. With forward polish, it's the reverse of that, so you actually execute the program "backwards", simplifying subexpressions until reaching the starting token.
To understand why this works, you should investigate the 3 primary tree-traversal algorithms: pre-order, in-order, post-order. These three traversals are the inverse of the parsing task that a language reader (i. parser) has to perform. Only the in-order notation "requires" a recursive decent to re-construct the expression tree. With the other two, you can get away with just a stack.
This may require more "thinking' and less "implementing".
BTW, if you've already found an answer (this question is a year old), you can post that and accept it.
Real coders still code in C. Just that it's a litte sharper.
Hmmm... language design? or writing a compiler?
If you want to write a compiler, you'd use Flex + Bison. (google)
Not an easy answer, but..
You essentially want to define a set of rules written in text (tokens) and then some parser that checks these rules and assembles them into fragments.
http://www.mactech.com/articles/mactech/Vol.16/16.07/UsingFlexandBison/
People can spend years on this, The above article talks about using two tools (Flex and Bison) That can be used to turn text into code you can feed to a compiler.
First I spent a year or so to actually think how the language should look like. At the same time I helped in developing Ioke (www.ioke.org) to learn language internals.
I have chosen Objective-C as implementation platform as it's fast (enough), simple and rich language. It also provides test framework so agile approach is a go. It also has a rich standard library I can build upon.
Since my language is simple on syntactic level (no keywords, only literals, operators and messages) I could go with Ragel (http://www.complang.org/ragel/) for building scanner. It's fast as hell and simple to use.
Now I have a working object model, scanner and simple operator shuffling plus standard library bootstrap code. I can even run a simple programs - as long as they fit in one file that is :)
Of course older techniques are still common (e.g. using Flex and Bison) many newer language implementations combine the lexing and parsing phase, by using a parser based on a parsing expression grammar (PEG). This works for recursive descent parsers created using combinators, or memoizing Packrat parsers. Many compilers are built using the Antlr framework also.
Use bison/flex which is the gnu version of yacc/lex. This book is extremely helpful.
The reason to use bison is it catches any conflicts in the language. I used it and it made my life many years easier (ok so i'm on my 2nd year but the first 6months was a few years ago writing it in C++ and the parsing/conflicts/results were terrible! :(.)
If you want to write a compiler obviously you need to read the Dragon Book ;)
Here is another good book that I have just read. It is practical and easier to understand than the Dragon Book:
http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=language+implementation+patterns&x=0&y=0
Mike --
If you're interested in an efficient native-code-generating compiler for Windows so you can get your bearings -- without wading through all the unnecessary widgets, gadgets, and other nonsense that clutter today's machines -- I recommend the Osmosian Order's Plain English development system. It includes a unique interface, a simplified file manager, a friendly text editor, a handy hexadecimal dumper, the compiler/linker (of course), and a wysiwyg page-layout application for documentation. Written entirely in Plain English, it is a quick download (less than a megabyte), small enough to understand in short order (about 25,000 lines of Plain English code, with just 4,000 in the compiler/linker), yet powerful enough to reproduce itself on a bottom-of-the-line Dell in less than three seconds. Really: three seconds. And it's free to all who write and ask for a copy, including the source code and and a rather humorous tongue-in-cheek 100-page manual. See www.osmosian.com for details on how to get a copy, or write to me directly with questions or comments: Gerry.Rzeppa#pobox.com

Groovy advantages over Jython or Jruby?

Why would I choose to use Groovy when I could use Jython or Jruby? Does the language provide any inherent advantages to make up for the fact that Jython and Jruby skills are applicable to their parent languages outside of the JVM?
Keep in mind that I purposely keeping this question generic, but if there are any advantages that exist in a particular domain, please don't hesitate to describe them.
EDIT
To clarify, If I write some code in Jruby, I can now, in some cases, move that code outside of the JVM if need be, or at the very least I have gained a better understanding of Ruby. Whereas Groovy skills are applicable only when using a language that just exists inside the JVM. Jython and Jruby have this built in advantage, what does Groovy have to make up for this disadvantage?
If Groovy doesn't have any advantages that you've found, and you would suggest just using Jython or Jruby, let me know.
Edit 2
Thanks everyone for all the answers, most of them make the same point, Groovy integrates slightly better with Java then Jython or Jruby.
Follow up
Using Netbeans 6.5 as my IDE I have found that Groovy to integrates better with Java projects then Jruby. I am not sure if lack of integration is a failing of Jruby or Netbeans. But after using it for alittle Groovy definitely seems to have a leg up.
I've done pretty extensive development in Ruby and Groovy (as well as a little Jython using Grinder as a load testing tool).
Of the 3, I prefer Groovy the most. I like the closure syntax the best and I think that it has the tightest integration in how it works with other java classes on the JVM. It's been a little while since I last used JRuby, but importing Java classes and working with the classloader in JRuby didn't feel as clean to me.
The fact that Groovy is also essentially a superset of Java means that the huge population of Java programmers out there will have a quicker uptake time in picking Groovy up over Ruby/JRuby. They can start programming it like it's Java and slowly start inserting idomatic groovy as they pick it up.
More to the point of what you're asking, I think that another advantage of Groovy is that the language that you go to when you want to optimize something is almost the exact same syntax, it's Java. If you're working in the Ruby or Python worlds, you're going to have to go to either C which is a big shift or Java, which is also quite different than those languages. Programming in Groovy tends to help keep your Java skills somewhat sharp as well.
If you have particular access to a Ruby or Python infrastructure, or a team that has familiarity with those kind of environments, then I could see choosing one of those other languages.
Really, all 3 of them are very nice languages and what you pick should depend more on the problem that you're trying to fix and the resources that you have available to you. Once you've become proficient in one dynamic language, picking up a second or a third is much easier.
I would say if you need to mix Java with Jruby/Groovy, go with Groovy. As everybody said, Groovy has tighter Java integration.
But as far as the language implementation goes, I prefer the Ruby language over Groovy, the language revolves around itself, in Groovy there are some hacks that are inherent to the implementation itself (just watch a Grails stacktrace vs. a Rails stacktrace and you'll see what I mean).
I highly recommend seeing Neal Ford's comparison of Groovy and JRuby
I think Dick Wall gave a very good summary of the differences between these three on the Java Posse podcast (#213, about 34:20 in) ...
"JRuby was designed to make programmers happy ... it's a programming language developer's choice; Python has very strong roots in simplicity and education; Groovy is aimed squarely at being the choice for Java developers ... it's a very familiar environment for Java ... with support for annotations".
In terms of moving the language outside of the JVM, I don't think the Java runtime imposes much of an overhead -- it's a simple install, and you need to set some environment variables -- but it does provide a number of benefits including a mature runtime which has been highly optimised, and a large set of libraries. The JRuby team are now reporting better performance than the native MRI. http://blog.headius.com/2008/08/twas-brillig.html
I've only had experience with Jython and Groovy. The biggest disadvantage with Jython, at the moment, is that the latest release recommended for production (2.2.1) has a feature set that "roughly corresponds to that of Python-2.2" (Jython FAQ). There is a beta implementing what I assume is Python 2.5, which is now a version behind. Don't know if the same can be said for JRuby.
I don't know why you should choose Groovy because I don't know your background. If you are a Java developer Groovy feels more similar to your current language then JRuby or Jython. Groovy combines the best of Java, the language, Java, the platform, and Ruby the language.

Resources