Domain-specific languages, application generators and software reuse

James Neighbors mentioned DSLs as an approach to software reuse, but without explaining why. He just says that DSLs can be a better approach than a library of reusable components. I could not understand the relationship: what benefits do we get from using DSLs for software reuse?
Also, in the paper "When and How to Develop Domain-Specific Languages", Mernik mentions that DSLs can serve as an input language to application generators, and application generators are one of the approaches to reusing software discussed by Krueger.
Could anybody explain the relationship, or just how a DSL would be an effective approach towards software reuse? Thanks a lot for your help.

James made it very clear why DSLs are a good approach for software reuse (he and I were at UC Irvine together):
They capture the concepts of interest in the problem domain
They use a notation familiar to the community that works in that domain
They define the rules of composition of specification/solution components to produce an answer, so that a DSL fragment can be checked for sanity as it is provided
His Draco system implemented all these concepts, accepting DSL descriptions followed by a DSL instance, which Draco then compiled to low-level code by applying implementation knowledge fragments ("refinement rules") to map a high-level DSL into lower-level DSLs, optimizing within the lower-level DSL, and repeating until it finally reached a DSL at a low enough level of abstraction to hand to a conventional compiler (e.g., LISP or C or Ada or COBOL or ...).
This is his refine-and-optimize paradigm, which allows a set of DSLs to be refined through layers of hierarchy down to low-level code. Thus you get composability of layered domains, and you can work at a very high level of abstraction.
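To make the idea concrete, here is a minimal sketch of the refine-and-optimize pattern. This is not Draco's notation; the "report" and "scan" mini-DSLs, the rule names and the data are all invented for illustration. The point is only that refinement rules rewrite terms of a higher-level DSL into terms of a lower-level one, with optimizations applied at each level.

```scala
// A toy illustration of the refine-and-optimize idea (NOT Draco's actual
// notation; the two little DSLs and the rules below are invented here).
sealed trait Term
case class Report(table: String, columns: List[String]) extends Term  // high-level "report" DSL
case class Scan(table: String) extends Term                           // lower-level "table scan" DSL
case class Project(columns: List[String], source: Term) extends Term
case class Emit(source: Term) extends Term

object RefineDemo extends App {
  // One "refinement rule": rewrite a high-level construct into lower-level ones.
  def refine(t: Term): Term = t match {
    case Report(table, cols) => Emit(Project(cols, Scan(table)))
    case other               => other
  }

  // A trivial optimization applied within the lower-level DSL:
  // drop a projection that keeps every column.
  def optimize(t: Term): Term = t match {
    case Emit(Project(List("*"), src)) => Emit(src)
    case other                         => other
  }

  val highLevel = Report("policies", List("*"))
  println(optimize(refine(highLevel)))   // prints: Emit(Scan(policies))
}
```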
So you capture problem specification and implementation knowledge, and apply it to get code. Reuse of abstractions, of specifications, of implementation, wow, ... not just reuse of "code" which is where lots of folks still seem stuck, as they were in the early 80s. Code is really hard to reuse.
This is really a very nice paradigm compared to "subroutines-as-components" (the fancy term for this currently is "internal DSL", which misses the domain notation, specification checking, implementation, and compositionality elements).
I think you really ought to read his PhD thesis (accessible here along with a lot of his other papers) carefully. It is a lot more approachable than you might expect. It isn't full of arcane math; it is full of concepts and demonstrations of how to engineer his kinds of DSLs.


UML -- Is it still a popular method for markup/pre-development/visualization of software?

In my object-oriented programming class, we learned some of the main concepts of UML, and I was just wondering whether UML is common in real-world situations or whether there are more popular methods.
There are certainly organizations that rely on UML, including a few that may expect you to answer OO design questions with UML in an interview. Plus, documentation tools like Doxygen generate UML-like diagrams to describe a class hierarchy.
Beyond that though, most groups I've worked with in academia or industry don't really use it. If you want an explanation of why, read "Death by UML Fever".
Generally agree with @chrisaycock. Would add a couple of things:
You should distinguish using UML for specification versus documentation. At the peak of its hype curve, UML was touted as the former, so development processes mandated modelling in UML before moving into code. That use has diminished greatly (although there are still pockets using Executable UML, notably in real-time/embedded environments).
As a documentation tool, UML is still popular. UML class diagrams, for example, can convey the structure of a module in a way that is much more revealing and intuitive than linear code can ever be. Similarly, sequence or activity diagrams are very useful for understanding the flow of control for an action that spans a number of classes.
In the documentation context, UML diagrams are increasingly being generated automatically rather than created manually, e.g. by Doxygen (as @chrisaycock mentions).
However, it's also still useful for sketching out designs ahead of development, e.g. on a whiteboard.
hth.
I once attended a Q&A session on UML and MDA in embedded systems where the panel included the authors Bruce Powell Douglass and Stephen Mellor. Having previously studied and worked on RT-SSADM projects and the Ward-Mellor methodology, I challenged Stephen Mellor on why a new way of software design comes along every 10 years, before practitioners have hardly got to grips with or truly understood the last one. He responded, rather too honestly perhaps, with "this way I sell more books"!
To some extent therefore I suggest that the hype surrounding any particular notation or methodology is driven primarily by CASE tool vendors and publishing houses; often the authors are also employed by the tool vendors and have titles like "Chief Evangelist".
That is not to say that these tools have no value; we should all be wary of such marketing, but on the other hand we also need to communicate our ideas and designs in an unambiguous and clear manner, and using a defined notation, however inelegant, will always be better than some ad-hoc "sticks and boxes" notation that has no definitive semantics. Given that need for communication, UML (and derivatives such as SysML) is currently the most widely accepted and used notation, and currently enjoys the widest tool support. It differs from much that has gone before by being defined as a standard agreed by multiple parties, rather than the work of a single author or CASE tool vendor, so it is likely to develop rather than disappear.
I think the article linked by @chrisaycock could also have corollaries, e.g. "Death by Agile Fever", "Death by CMM Fever", "Death by RT-SSADM Fever", ... ;-)
As @sfinnie stated, it really depends upon the usage, but UML by itself is nothing more than a notation. In order to be really useful, you need to follow some development method. @Clifford's post notwithstanding, I'd recommend a mature method. Executable UML started as Shlaer-Mellor and has been in use for 19+ years. Douglass' method (not called ROPES anymore, but ???) has been around for 11 years. The Unified Process is based on the Booch, OMT, and OOSE methods, so it can be considered 19+ years old as well. Of course you might find some other UML or non-UML development method that better fits your needs.

Domain-specific languages and compilers

I was looking over the contents of Martin Fowler's recent book, Domain Specific Languages, and I noticed some ANTLR examples. That got me thinking that writing compilers will become more and more popular, since people's needs in this area will increase.
So, will compiler theory still be as arid (being subjective here) as it has been until now, or is there any chance that we'll get more applied, programmer-oriented materials?
Even though DSLs may seem to create more opportunities for creating new compilers, I don't think they will make the challenges of writing a compiler any easier. You can either use compiler tools like yacc to generate code to handle your DSL syntax, or you can hand-craft your own parser with an eye towards better internal efficiency than what the yacc generators spit out.
Either way, you have to have sufficient knowledge of how to define and manipulate a language grammar to make your DSL work and to avoid loopholes and can't-get-there-from-here problems.
Spiffy tools help to implement the solution, but they don't solve the problem for you. To quote my high school chemistry teacher: "Sure! Bring your calculators to class! Calculators only help you get the wrong answer faster!"
So, will compiler theory still be as arid (being subjective here) as it has been until now, or is there any chance that we'll get more applied, programmer-oriented materials?
I'd say that compiler theory is actually pretty rich, but may not be centered around C-style languages. If you want to look at some powerful tools commonly used by academic language designers, I suggest you check out the functional programming languages (ML, Scheme, LISP, Haskell, OCaml, Scala, Clojure, etc.). Personally I prefer Haskell with Parsec, but there are many options. I think the common consensus is that the structure of these languages is more conducive to language design and implementation, at least in a theoretical sense.
Like Kristopher said above, programmers don't necessarily make the best language designers. I've seen some really cool DSLs and I've seen some pretty awful ones (my opinion, of course; YMMV). Knowledge of language concepts is a must for designing any language, DSL or otherwise (type theory, category theory, various code analyses, machine optimization, etc.). Not to mention, if you're designing a DSL, you have to have a fairly intimate knowledge of the domain you're targeting.
Off-the-shelf tools like yacc, ANTLR, flex, and CUP can make building your compiler easier, much as buying wood from a lumberyard to build your house is easier than going off into the woods and cutting down trees. Both get you the material for the structure, but you still have to know how to build the house. We will definitely see more DSLs in the near future, and these tools will help. Will the DSLs be worth using, or even usable, however? The tools won't make a difference here, at least in my opinion. Language design employs a lot of real computer science and/or mathematics. Good language designers will have to be at least familiar with both, and good language implementers must be familiar with language design tools.
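To illustrate the "you still have to know how to build the house" point, here is a minimal hand-rolled recursive-descent parser and evaluator for a made-up arithmetic DSL. The grammar, class names and error handling are hypothetical and kept deliberately small; a real DSL front end would also need proper tokenization, error recovery and an AST rather than evaluating on the fly.

```scala
// A minimal hand-rolled recursive-descent parser for a toy expression DSL.
// Grammar (invented for this sketch):
//   expr   -> term   (('+' | '-') term)*
//   term   -> factor (('*' | '/') factor)*
//   factor -> number | '(' expr ')'
class Parser(input: String) {
  private var pos = 0
  private def peek: Char = if (pos < input.length) input.charAt(pos) else '\u0000'
  private def eat(c: Char): Unit = {
    if (peek != c) sys.error(s"expected '$c' at position $pos")
    pos += 1
  }

  def parseExpr(): Double = {
    var value = parseTerm()
    while (peek == '+' || peek == '-') {
      val op = peek; pos += 1
      val rhs = parseTerm()
      value = if (op == '+') value + rhs else value - rhs
    }
    value
  }

  private def parseTerm(): Double = {
    var value = parseFactor()
    while (peek == '*' || peek == '/') {
      val op = peek; pos += 1
      val rhs = parseFactor()
      value = if (op == '*') value * rhs else value / rhs
    }
    value
  }

  private def parseFactor(): Double =
    if (peek == '(') { eat('('); val v = parseExpr(); eat(')'); v }
    else {
      val start = pos
      while (peek.isDigit) pos += 1
      if (pos == start) sys.error(s"expected a number at position $start")
      input.substring(start, pos).toDouble
    }
}

object ParserDemo extends App {
  println(new Parser("2*(3+4)").parseExpr())   // prints 14.0
}
```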
As high-quality DSLs get easier to build, we are more likely to see more of them. There are several obstacles:
Choosing a good problem domain for a DSL. It has to be broad enough to appeal to more than just the author, and narrow enough to have good solutions (C# doesn't count).
Implementing a DSL well. Lots of people seem to think that if they have a parser they are done. Actually, you need a lot of technology: parsing, analysis, code generation, ... (See the DMS Software Reengineering Toolkit for an engine that contains what I think is needed to produce DSLs effectively.)
Acceptance of the DSL by the community. It's amazing how many people insist on coding in just the programming language they know, and nothing else.
There was an explosion of programming languages in the 70's and 80's. Then Java came along and killed everything off. Now we are in another phase of people inventing lots of languages. So, I'd say it is cyclical, and there is really nothing "new" going on.
However, one aspect that remains constant is that most programmers aren't very good at designing languages. Tools like yacc and ANTLR make some of the implementation easier, but they don't make would-be language designers any better at language design.
There already are some useful tools; look at Xtext, EMFText, JetBrains MPS, the Intentional Domain Workbench, and Microsoft's former Oslo project with the M language. All these tools make defining languages easier, although that has its cost; for a DSL, however, you may have somewhat different requirements than for a general-purpose programming language.

Specification, modeling and programming are in principle the same, right?

In formal specifications based on abstract algebraic types and equational theory, you use formulas of equational logic to specify a theory. A system which satisfies those constraints is called, in formal logic, a model.
Modeling is the process of creating a model, which abstracts away those aspects that are unnecessary details for a specific case. A concrete system then has to adhere to the created model in the observed aspects.
Programming is the process of creating a program which will have a specific behaviour (perform specific algorithms), and programming languages, through their different paradigms, enable us to think in certain specific ways that abstract away some details, usually machine-specific ones.
So could we be doing all those things at the same time, because they are in principle the same? Is declarative programming the nearest attempt to do that? Could we use some sort of programming language which would be good for programming as well as for modeling and specification?
The scientist who has done the most to advance this point of view is Tony Hoare. Tony, along with his colleague Edsger Dijkstra, advocated nondeterministic programming languages so that there would be a smoother path from specification to implementation. Tony definitely wanted a single language for both specification and implementation. For more on this view, read his book on the Algebra of Programming. Tony also did the seminal work on proving the correctness of abstractions. All of this work was done in the context of simple, imperative languages with structured control flow and classic, side-effecting procedures. So there is no necessary connection with declarative programming. And historically, work on functional programming (the main branch of declarative programming) has followed more from Backus's Turing lecture on "liberating programming from the von Neumann bottleneck"; functional programming has been about programming productivity as much as anything else.
What we have discovered since Hoare is that formal specifications and formal models are very expensive. The expense hasn't been shown to be justified except in very special circumstances, like "if the software doesn't work, the patient will die" or "if the software doesn't work, the plane will crash." Informal models and specifications are quite useful, and much cheaper to produce and work with. There is still interesting research going on around the fringes on modelling, model checking, and so on. One of my personal favorites is the Alloy language done by Daniel Jackson's group at MIT. There's also great stuff done at Microsoft Research and plenty of good stuff elsewhere. There's some work in declarative programming as well, but it too is of the "cheap and cheerful" variety rather than a comprehensive, programmatic approach like Hoare's. One of my favorites there is Claessen and Hughes's QuickCheck, which provides a way to state formal properties and explore them by random testing. No proofs or theorems, but still jolly useful.
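To give a flavour of the QuickCheck idea mentioned above, here is a hand-rolled sketch in Scala. This is not QuickCheck itself (a Haskell library) nor ScalaCheck, its usual JVM analogue; the generator and the properties are made up purely for illustration of the idea: state a property as a boolean function and probe it with random inputs rather than proving it.

```scala
import scala.util.Random

// A hand-rolled sketch of the property-based testing idea; the generator and
// the properties below are invented for illustration.
object PropertyDemo extends App {
  // Run `prop` against `trials` randomly generated inputs and report the
  // first counterexample found, if any.
  def checkProperty[A](gen: () => A, prop: A => Boolean, trials: Int = 100): Unit =
    (1 to trials).iterator.map(_ => gen()).find(a => !prop(a)) match {
      case Some(counterexample) => println(s"FAILED on input: $counterexample")
      case None                 => println(s"OK, passed $trials trials")
    }

  val randomIntList: () => List[Int] =
    () => List.fill(Random.nextInt(20))(Random.nextInt(1000))

  // A property we expect to hold: reversing a list twice gives the original list.
  checkProperty(randomIntList, (xs: List[Int]) => xs.reverse.reverse == xs)

  // A deliberately false property ("every list is sorted"), to show a counterexample.
  checkProperty(randomIntList, (xs: List[Int]) => xs == xs.sorted)
}
```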
In summary, you describe an agenda of doing formal models, specifications, and programs, all within a single framework. There is still plenty of good work going on piecemeal, but the unified agenda has been abandoned.

Is Antlr a DSL generator and an alternative to Intentional Programming?

I am struck by the ambition and creativity of Charles Simonyi's efforts to establish the field of Intentional Programming, first at Microsoft and then with his own company.
What exactly is Intentional Programming?
http://en.wikipedia.org/wiki/Intentional_programming
In this approach to software, a programmer first builds a toolbox specific to a given problem domain (such as life insurance). Domain experts, aided by the programmer, then describe the program's intended behavior in a What You See Is What You Get (WYSIWYG)-like manner. An automated system uses the program description and the toolbox to generate the final program. Successive changes are only done at the WYSIWYG level.
It seems to be such a useful and practical approach to programming, potentially circumventing many of the problems with current approaches to software development.
Essentially it seems to facilitate the creation of domain-specific languages by non-programmers (business/systems analysts) but at a stage much closer to real-life implementation than UML could provide. He says it will be completed eventually but that it is not there yet (almost 15 years later).
DSLs run the gamut from simple 5-line rule engines to complex applications like Ruby on Rails. So I imagine the delay in releasing his product has to do with the fact that he is dealing with simplifying a much higher level of abstraction because he has to essentially allow for the encapsulation of all domain languages at once.
So, my question is
(a) whether ANTLR could be an alternative to Intentional Programming, although perhaps a less user-friendly alternative, one which requires the intervention of programmers rather than permitting business analysts to generate the DSL. Could you use ANTLR to generate a DSL like Ruby on Rails (assuming it supported Ruby as an output, which I think it does not)? What can it not do? Also, I don't understand why it's called a "language parser" rather than a "language generator", since the latter describes what it is used for while the former describes how it achieves its end result.
and
(b) if ANTLR is different from Intentional Programming, is there anything similar to Intentional Programming?
In answer to part b), three systems that work in a similar space are:
JetBrains MPS
Eclipse Xtext
MetaCase MetaEdit+
Each of these products has different strengths and weaknesses, but all of them fall into the category of Language Workbenches. Intentional Software's Intentional Workbench is possibly the most ambitious product in this category to date, but is also not generally available.
MPS and Xtext are free, open-source products. MetaCase is the most mature, and is a commercial product. All of them have a steep learning curve.
I am not an expert on this, so treat with a large pinch of salt. However...
ANTLR itself is not a DSL generator, though it can be used to create code that interprets DSLs. It is a parser generator: a DSL generator would have to produce the grammar that ANTLR generates a parser from.
ANTLR is just a parser generator. In any non-trivial DSL, writing the parser is less than 50% of the effort expended in implementing the DSL. The evaluator/rule engine/code generator/scheduler, or whatever else your DSL does, probably requires more work and can't be generated the way a parser can.
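As a rough illustration of that split, here is a small Scala sketch of a hypothetical pricing-rule DSL. The AST is assumed to come out of a generated parser; the evaluator below, which gives the DSL its meaning, is the part no parser generator writes for you. All names and rules are invented for the example.

```scala
// The (hypothetical) AST of a tiny pricing-rule DSL, assumed to be produced
// by a parser; the evaluator is the hand-written part that does the real work.
sealed trait Rule
case class Discount(percent: Double)                 extends Rule
case class WhenTotalAbove(limit: Double, rule: Rule) extends Rule
case class AllOf(rules: List[Rule])                  extends Rule

object RuleEvaluator extends App {
  // Compute the total discount (as a fraction) that a rule grants for an order total.
  def evaluate(rule: Rule, orderTotal: Double): Double = rule match {
    case Discount(p)              => p / 100.0
    case WhenTotalAbove(limit, r) => if (orderTotal > limit) evaluate(r, orderTotal) else 0.0
    case AllOf(rules)             => rules.map(evaluate(_, orderTotal)).sum
  }

  // "5% off always, plus another 10% on orders over 100"
  val program = AllOf(List(Discount(5), WhenTotalAbove(100, Discount(10))))
  println(evaluate(program, orderTotal = 250.0))   // ~0.15
}
```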

What languages implement features from functional programming?

Lisp developed a set of interesting language features quite early on in the academic world, but most of them never caught on in production environments.
Some languages, like JavaScript, adapted basic features like garbage collection and lexical closures, but all the stuff that might actually change how you write programs on a large scale, like powerful macros, the code-as-data thing and custom control structures, only seems to propagate within other functional languages, none of which are practical to use for non-trivial projects.
The functional programming community also came up with a lot of other interesting ideas (apart from functional programming itself), like referential transparency, generalised case expressions (i.e., pattern matching, not crippled like C/C# switches) and curried functions, which seem obviously useful in regular programming and should be easy to integrate with existing programming practice, but for some reason seem to be stuck in the academic world forever.
Why do these features have such a hard time getting adopted? Are there any modern, practical languages that actually learn from Lisp instead of half-assedly copying "first class functions", or is there an inherent conflict that makes this impossible?
Are there any modern, practical languages that actually learn from Lisp instead of half-assedly copying "first class functions", or is there an inherent conflict that makes this impossible?
Why aren't Lisp, Haskell, OCaml, or F# modern?
You might just need to take it upon yourself to look at them and realize that they are more robust, and with libraries comparable to Java's, than you'd think.
A lot of features have been adopted from functional languages by other languages. But vice versa, too: (some) functional languages have objects, for example.
I suggest you try Clojure. Syntactically beautiful dialect, functional (in the ML sense), and fast. You get immutability, software transactional memory, multiversion concurrency control, a REPL, SLIME support, and an inexhaustible FFI. It's the Lisp (& Haskell) for the Business Programmer. I'm having a great time using it daily in my real job.
There is no known correlation between a language "catching on" and whether or not it has powerful, well-researched, well-designed features.
A lot has been said on the subject. It exists all over the place in technology, and also in the arts. We know artist A has more training and produces works of greater breadth and depth than artist B, yet artist B is far more successful in the marketplace. Is it because there's a zeitgeist? Is it because artist B has better marketing? Is it because most people won't take the time to understand artist A? Maybe artist B is secretly awful and we should mistrust experts who make judgements about artists? Probably all of the above, to some degree or another.
This drives people who study the arts, and people who study programming languages, crazy.
Scala is a cool functional/OO language with pattern matching, first class functions, and the like. It has the advantage of compiling to Java bytecode and inter-operates well with Java code.
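As a small sketch of the features that answer refers to (the shape types, names and numbers below are made up purely for illustration), here is what pattern matching and curried functions look like in Scala:

```scala
// A small Scala sketch of two features mentioned above: pattern matching that
// goes well beyond a C-style switch, and curried functions.
sealed trait Shape
case class Circle(radius: Double)                   extends Shape
case class Rectangle(width: Double, height: Double) extends Shape

object FunctionalFeatures extends App {
  // Pattern matching destructures the value and can carry guards.
  def area(s: Shape): Double = s match {
    case Circle(r)                 => math.Pi * r * r
    case Rectangle(w, h) if w == h => w * w            // special-case squares
    case Rectangle(w, h)           => w * h
  }

  // A curried function: supplying the first argument yields a new function.
  def scale(factor: Double)(x: Double): Double = factor * x
  val double: Double => Double = scale(2.0)

  println(area(Circle(1.0)))          // ~3.14159
  println(area(Rectangle(3.0, 3.0)))  // 9.0
  println(double(21.0))               // 42.0
}
```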
Common Lisp is used in the real world, albeit not widely so, I guess.
Python or Ruby. See Paul Graham's thoughts on this in the question "I like Lisp but my company won't let me use it. What should I do?".
Scala is the absolute king of languages which have adopted significant academic features. Higher kinds, self types, polymorphic pattern matching, etc. All of these are bleeding-edge (or nearly so) academic research topics that have been incorporated into Scala as fundamental features. Arguably, this has been to the detriment of the language's simplicity, but it does lead to some very interesting patterns.
C# is more mainstream than Scala, but it has also adopted fewer of these "out-there" functional features. LINQ is a limited implementation of Wadler's generalized list comprehensions, and everyone knows about lambdas. But for all that, C# (rightfully) remains a bit conservative in adopting research features from the academic world.
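For readers who haven't seen the comparison, here is a minimal Scala for-comprehension over some invented data; it has the same filter/map shape as a LINQ query expression, both being sugar over higher-order functions on sequences (the Person type and the data are hypothetical).

```scala
// A for-comprehension doing the same filter/map work a LINQ query would;
// the data is made up for the example.
object ComprehensionDemo extends App {
  case class Person(name: String, age: Int)
  val people = List(Person("Ada", 36), Person("Alan", 41), Person("Grace", 85))

  val namesOfPeopleOver40 =
    for (p <- people if p.age > 40) yield p.name.toUpperCase

  println(namesOfPeopleOver40)   // List(ALAN, GRACE)
}
```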
Erlang has recently gained renewed exposure, not only through being used by Twitter, but also through the rise of XMPP-driven messaging and implementations such as ejabberd. It sports many of the ideas coming from functional programming, being a language designed with that in mind. It was initially used to run telephone switches and was conceived by Ericsson to run the first GSM networks. It is still around, it is fully functional (as a language), and it is used in many production environments.
Lua.
It's used as a scripting/extension language for a number of games (like World of Warcraft) and applications (Snort, Nmap, Wireshark, etc.). In fact, according to an Adobe developer, Adobe's Lightroom is over 40% Lua.
The guys behind Lua have repeatedly listed Scheme and Lisp as major influences on Lua, and Lua has even been described as Scheme without the parentheses.
Have you checked out F#?
Lots of dynamic programming languages implement ideas from functional programming. The newer .NET languages (C# and VB) have what they call lambdas, but these aren't side-effect free.
It's not difficult to combine concepts from functional programming and object-oriented programming, for example, but it doesn't always make a lot of sense. Object-oriented languages (try to) encapsulate state inside objects, while functional languages encapsulate state inside functions. If you combine objects and functions in one language, it gets harder to make sense of all this.
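As a minimal sketch of the tension described above (the Counter types are invented for the example): object-oriented style keeps mutable state inside an object, while functional style threads immutable values through functions. A language that allows both, like Scala here, leaves you to decide which discipline each piece of code follows.

```scala
// Two ways of handling the same state, side by side.
class MutableCounter {          // OO style: state lives inside the object and is mutated
  private var count = 0
  def increment(): Unit = { count += 1 }
  def value: Int = count
}

final case class Counter(count: Int) {   // functional style: "updating" returns a new value
  def increment: Counter = Counter(count + 1)
}

object StateStylesDemo extends App {
  val oo = new MutableCounter
  oo.increment(); oo.increment()
  println(oo.value)                        // 2

  val fp = Counter(0).increment.increment
  println(fp.count)                        // 2
}
```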
There have been a lot of languages that have combined these paradigms by just throwing them together (F#), and this can be useful, but I think we still need a couple of decades of playing with languages like this until we can create a new paradigm that successfully combines the ideas from OO and functional programming.
C# 3.0 definitely does.
C# now has
Lambda Expressions
Higher Order Functions
Map/Reduce + Filter (folding?) over lists and all types which implement IEnumerable.
LINQ
Object + Collection Initializers.
The last two list items may not fall under proper functional programming; anyway, the answer is that C# has implemented many useful concepts from Lisp etc.
In addition to what was said, a lot of LISP goodness is based on a guaranteed lack of side effects and on using built-in data structures. Both rarely hold in the real world. ML is probably a better functional base.
Lisp developed a set of interesting language features quite early on in the academic world, but most of them never caught on in production environments.
Because the kind of people who manage software developers aren't the kind of people you can have an interesting chat with about comparing different language features. Around 2000, I wanted to use LISP to implement XML-to-HTML transforms on our corporate website (this was around the time of Amazon implementing their backend in LISP). I didn't get to. This is mildly ironic, seeing as the company I was working for made and sold a Common LISP environment.
Another "real-world" language that implements functional programming features is Javascript. Since absolutely everything has a value, then high-order functions are easily implemented. You also have other tenants of functional programming such as lambda functions, closures, and currying.
The features you refer to ("powerful" macros, the code-as-data thing and custom control structures) have not propagated within other functional languages. They died after Lisp taught us that they are a bad idea.
Modern functional languages (OCaml, Haskell, Erlang, Scala, F#, C# 3.0, JavaScript) do not have those features.
Cheers,
Jon Harrop.
