I was looking at some syntax diagrams for SQLite and was wondering if they could be used to describe all languages (like Python, C++, etc.)?
http://www.sqlite.org/lang_createtable.html
From some CS classes I took years ago I remember groups of languages that could be described by DFAs and whatnot, but I don't remember many details and think this is probably different anyway.
Any clarity would be appreciated.
You wouldn't usually call them "flow charts", but "syntax diagrams" (as you did) or "railroad diagrams". See the Wikipedia article for details, and feel free to use my Railroad Diagram Generator for generating them from an EBNF grammar.
A DFA corresponds to regular grammars, whereas EBNF and syntax diagrams describe context free grammars. These are different levels of the Chomsky hierarchy, which is the basic framework for classifying formal grammars.
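To make the difference concrete, here is a minimal sketch in Python. The function names and the two toy languages are illustrative assumptions and are not taken from the SQLite grammar. The first recognizer is a DFA with a fixed, finite set of states, which is enough for a regular language such as unsigned integers. The second recognizes nested parentheses, described by the context-free rule S ::= ( '(' S ')' )*. It needs an unbounded counter (standing in for the stack of a pushdown automaton), which no DFA can provide.

```python
DIGITS = "0123456789"


def dfa_accepts_integer(text: str) -> bool:
    """DFA for the regular language digit+ (states: start, in_number)."""
    state = "start"
    for ch in text:
        if ch in DIGITS:
            state = "in_number"
        else:
            return False  # no transition defined for this symbol: reject
    return state == "in_number"


def accepts_balanced(text: str) -> bool:
    """Recognizer for S ::= ( '(' S ')' )*, a context-free, non-regular language."""
    depth = 0  # unbounded counter; a DFA only has finitely many states
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis without a matching opener
        else:
            return False
    return depth == 0
```

A syntax diagram can express the second language because a diagram may refer to itself (or to other diagrams) by name, which is exactly the recursion in the rule for S.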
James Neighbors mentioned DSLs as an approach to software reuse, but without explaining why. He just says that DSLs can be a better approach than a library of reusable components. I could not understand the relationship, or what benefits we get from using DSLs for software reuse.
Also, in the paper "When and How to develop DSLs", Mernik mentions that DSLs can serve as an input language to application generators, and application generators are one of the approaches to software reuse discussed by Krueger.
Could anybody explain the relationship, or just how a DSL would be an effective approach to software reuse? Thanks a lot for your help.
James made it very clear why DSLs are a good approach for software reuse (he and I were at UC Irvine together):
They capture the concepts of interest in the problem domain
They use a notation familiar to the community that works in that domain
They define the rules of composition of specification/solution components to produce an answer, so that a DSL fragment can be checked for sanity as it is provided
His Draco system implemented all these concepts. It accepted DSL descriptions, followed by a DSL instance, which Draco then compiled to low-level code by applying implementation knowledge fragments ("refinement rules") that map a high-level DSL into lower-level DSLs, optimizing in the lower-level DSL, and then repeating until you finally reach a DSL at a low enough level of abstraction to hand to a conventional compiler (e.g., for LISP or C or Ada or COBOL or ...).
This is his refine-and-optimize paradigm, which allows a set of DSLs to refine through layers of hierarchy down to low-level code. Thus, you get composability of layered domains and you can work at a very high level of abstraction.
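The refine-then-optimize loop can be illustrated with a toy sketch in Python. This is not Draco's actual machinery: the operation names, the rule table, and the "optimization" (dropping no-op steps) are all hypothetical stand-ins chosen only to show the shape of the loop.

```python
# Hypothetical refinement rules: each maps one construct of a higher-level
# domain to a sequence of constructs in a lower-level domain.
REFINEMENTS = {
    "report_total": ["sum_column", "print_value"],  # report domain -> table domain
    "sum_column": ["loop_rows", "add_cell"],        # table domain -> loop domain
    "print_value": ["noop_format", "write_line"],   # table domain -> I/O domain
}


def optimize(program):
    """Toy optimization in the current domain: remove steps that do nothing."""
    return [op for op in program if not op.startswith("noop")]


def refine(program):
    """Refine and optimize repeatedly until no refinement rule applies."""
    while any(op in REFINEMENTS for op in program):
        expanded = []
        for op in program:
            # Replace a refinable construct by its lower-level expansion.
            expanded.extend(REFINEMENTS.get(op, [op]))
        program = optimize(expanded)
    return program


low_level = refine(["report_total"])
```

Here the single high-level construct ends up as the low-level sequence loop_rows, add_cell, write_line. The reusable assets are the rule table and the optimizer, not the generated code.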
So you capture problem specification and implementation knowledge, and apply it to get code. Reuse of abstractions, of specifications, of implementation, wow, ... not just reuse of "code" which is where lots of folks still seem stuck, as they were in the early 80s. Code is really hard to reuse.
This is really a very nice paradigm compared to "subroutines-as-components" (the fancy term for this currently is "inner DSL", which misses the domain notation, specification checking, implementation, and compositionality elements).
I think you really ought to read his PhD thesis (accessible here along with a lot of his other papers) carefully. It is a lot more approachable than you might expect. It isn't full of arcane math; it is full of concepts and demonstrations of how to engineer his kinds of DSLs.
Is it appropriate to think of UML as a DSL?
I know that everybody thinks of UML as a way to draw pictures, but underlying the "view" of UML diagrams there is a model...and the "L" in UML stands for Language.
Another question - is SysML a DSL?
UML is a DSL.
A "domain specific language" lets one specify a problem or a solution in a narrow area of application: banking, telephony, circuit design, .... One way to distinguish a DSL is that it cannot do general purpose computation (although there are some DSLs that can). Java, C#, Python and COBOL fail this test. (Some would say COBOL is domain-specific for "business", but its only serious concession to that is a decimal data type, and C# has that too.) ColdFusion fails this test; nonstandard syntax does not a DSL make, but IIRC ColdFusion has some support for generating HTML. Fortran fails this test, but its array(-section) sublanguage is only good for arrays and not general purpose computation. Verilog is very domain specific: it is designed to let you write down digital circuits.
UML focuses on specifying different aspects of how software is structured. [You'll note it can't do general purpose computation; one hallmark]. (It actually has 9 or more different aspects it addresses: classes, statecharts, deployment, ... I'll stick to the class aspect for this discussion). The class diagram aspect lets one describe how data is organized, and the operations on that data. You can argue this is about software, so it can't be "domain-specific". What, building software isn't a problem domain?
SysML is focused on expressing how systems are joined, so it fits this category too.
A more useful question to ask IMHO is, "If I think of UML as a DSL, what do I gain?" Here I don't think you get a lot. The concept of a DSL is useful when you are arguing for one you don't have (designed or possess), the point being better expressiveness for a common problem, and it might be useful for arguing "you don't want to implement your system entirely in it because it isn't Turing capable". It is also useful if you want to explain that your language is going to have a lot of funny notations, precisely because they serve special purposes. People already know this about UML, so... nothing learned.
While I'm a big fan of DSLs, I'm also a big fan of GPLs (general purpose languages). I think in big systems you should necessarily find a "lot" of both: the DSLs to express what they can succinctly (cuts engineering and maintenance costs), and the GPLs to provide arbitrary computation and glue between the system parts. For me what counts in a language is:
what's the class of problem it claims to address and how well does it do it?
what's the syntax (and is it relatively standard for the problem domain)?
what are the precise semantics (this is where you learn the most)?
how good is the tool support?
how well does the DSL integrate into other parts of a big system?
how big and supportive is the community?
UML has (after 15 years) arrived at pretty good answers to these questions.
Homegrown DSLs often don't do so well, partly due to poor design, but often due to the fact that tool support is difficult to get. My company provides machinery to give DSL builders excellent support to improve this situation.
UML is NOT a DSL because UML can be used to model any vertical domain (insurance software, embedded systems,...)
UML is a (horizontal) DSL because UML is a specialized language to model software systems.
So UML is and is not a DSL depending on how you look at it. You could apply the same reasoning to many other languages like HTML or SQL. They are general because they can be used to represent or manipulate any kind of data, but they are specific because they are focused on one task.
Short answer - NO - to both questions.
Think of UML as a tool that lets you describe software architectures, software interactions and so on ... describe them in a general way, language agnostic.
DSLs are specialised syntaxes meant to make it easier to describe some specific set of problems.
I think the answer to your first question depends on how to define "General" in the term "General Purpose Language". Wikipedia says it is not a DSL:
The opposite is:
a general-purpose programming language, such as C, Java or Python,
or a general-purpose modeling language such as the Unified Modeling Language (UML).
I am an MDA enthusiast so I think I can provide you a very detailed answer to your question.
What is the UML:
The Object Management Group (OMG), a consortium of companies aimed at providing standard languages and technologies, defined a meta-meta modeling language called "The Meta Object Facility" or MOF (http://www.omg.org/mof). A meta-model is a model describing a model or, in other terms, describing the vocabulary (the elements you can use in a model), the syntax (how they relate to each other) and their semantics (what each entity means and how its meaning changes in a given context, etc.). A meta-model plays the same role that context-free grammars play with respect to the languages they produce. You can thus think of a meta-meta model as a language you can use to define meta-models. This is what the OMG actually did with the UML. The UML language has a meta-model described by means of the MOF in two documents: the UML Infrastructure and the UML Superstructure (http://www.omg.org/spec/UML).
The UML meta-model was defined with the aim of being generic enough to cope with modeling different systems belonging to different domains. When you define a new UML model you create an instance of the UML meta-model. You could do that for many reasons: to analyse some characteristics of the system, to share some aspects of the system with other stakeholders, and so on. However, one of the most important aspects of the OMG vision is model transformation. You can think of a transformation as a set of rules telling an interpreter how to explore a model and produce something else. You can basically transform a model into two different kinds of things: other models (Model2Model, M2M, transformations, defined by means of the QVT language) or text such as code or documentation (Model2Text, M2T, transformations, defined by means of the MOFM2T transformation language). So it is VERY IMPORTANT to understand that a UML model is not its diagram. A diagram is just a pictorial representation of the model contents, useful for humans, but not machine readable. You can't apply transformations to a diagram.
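As a rough illustration of the Model2Text idea, here is a minimal sketch in plain Python. It does not use QVT, MOFM2T or Acceleo syntax; the classes ClassElement and Attribute and the function class_to_text are hypothetical names standing in for a tiny in-memory model and a single transformation rule.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    type_name: str


@dataclass
class ClassElement:
    """A model element: data about a class, independent of any diagram."""
    name: str
    attributes: list = field(default_factory=list)


def class_to_text(element: ClassElement) -> str:
    """Model2Text rule: walk one model class and emit a code skeleton."""
    lines = [f"class {element.name}:"]
    if not element.attributes:
        lines.append("    pass")
    for attr in element.attributes:
        lines.append(f"    {attr.name}: {attr.type_name}")
    return "\n".join(lines)


account = ClassElement(
    "Account",
    [Attribute("owner", "str"), Attribute("balance", "float")],
)
print(class_to_text(account))
```

The rule works on the model data itself, never on a drawing, which is why a transformation can be applied to a model but not to a diagram.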
The Eclipse Modeling Framework (EMF) is a very powerful (and FREE!) framework implementing all the technologies I have mentioned. A subset of the MOF is implemented in the Eclipse ECORE language. The UML meta-model is defined by means of ECORE, so graphical UML editors (e.g. Papyrus, TopCased, etc.) actually create an XMI representation of the graphically defined UML models that conforms to the ECORE representation of the UML meta-model. Such a representation can be provided as input to transformation engines. The two transformation languages, and the related engines, are also available in the EMF through the QVTo plugin and ACCELEO (implementing the MOFM2T transformation language).
As mentioned, UML is intentionally generic. However, it also provides lightweight extension mechanisms to extend the original language vocabulary with domain-specific constructs. This can be done by means of stereotypes. A stereotype is a sort of label (actually with meta-attributes) you can attach to model elements to create new entities in the language. You can, for instance, say in your models that some of the classes are requirements or something else. There are of course some rules; for instance, when you stereotype a meta-class you cannot violate its original semantics, only restrict it.
SysML is a profile of the UML (http://www.omgsysml.org/). A SysML Block is just a UML class stereotyped as Block, a SysML Requirement is just another UML class stereotyped as Requirement, and so on.
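The stereotype mechanism can be pictured with a small Python sketch. This is only an analogy under stated assumptions: Stereotype, ModelElement and apply are hypothetical names, not part of any UML tool API, and the single rule enforced here (a stereotype may only be applied to the meta-class it extends) is a simplification of the real profile rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stereotype:
    name: str
    extends: str  # name of the meta-class this stereotype may be applied to


@dataclass
class ModelElement:
    metaclass: str  # e.g. "Class" or "Association"
    name: str
    stereotypes: list = field(default_factory=list)

    def apply(self, stereotype: Stereotype) -> None:
        """Attach a stereotype, refusing meta-classes it does not extend."""
        if stereotype.extends != self.metaclass:
            raise ValueError(
                f"{stereotype.name} extends {stereotype.extends}, "
                f"not {self.metaclass}"
            )
        self.stereotypes.append(stereotype)


# Two stereotypes in the spirit of a profile: both extend the Class meta-class.
BLOCK = Stereotype("Block", extends="Class")
REQUIREMENT = Stereotype("Requirement", extends="Class")

engine = ModelElement("Class", "Engine")
engine.apply(BLOCK)  # Engine is still a UML class, now also a Block
```

Because the stereotyped element remains an ordinary class underneath, models using the profile stay compatible with plain UML.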
Profiling a meta-model like the UML is quite an easy way of creating a sort of DSL (with stereotypes you add to a more general language some constructs which belong to your domain) which is compatible with the UML (i.e. you can use SysML and UML together). There is another way of creating a DSL, which is defining its meta-model by means of the MOF (ECORE). In this case you create a brand new language which is conceptually at the same level as the UML itself.
Many people say UML is just about diagrams because in many cases they do not know what they are talking about. The topic is far more complex, interesting and promising.
UML is a general modelling language that is not specific to any domain, whilst the S in DSL stands for Specific. UML is used for modelling systems that can also be represented by multi-purpose programming languages. DSLs, on the other hand, are constrained programming/scripting languages which are specific to a particular domain.
I have seen a lot of these diagrams in some help files and src documentation
What are they called? Are there any other (for same purpose) known diagrams?
Img source : http://www.sqlite.org/images/syntax/insert-stmt.gif
They are called "railroad diagrams", because of their resemblance to a railroad track. They were often used to describe the grammar of older languages, before more formal grammars became routinely used. The problem with them is you can't easily feed them into tools like parser generators, or grammar checkers, so they are not used so much these days.
They are called syntax diagrams.
In my object oriented programming class, we learned some of the main concepts of UML and I was just wondering if UML is common in real world situations or are there more popular methods.
There are certainly organizations that rely on UML, including a few that may expect you to answer OO design questions with UML in an interview. Plus, documentation tools like Doxygen generate UML-like diagrams to describe a class hierarchy.
Beyond that though, most groups I've worked with in academia or industry don't really use it. If you want an explanation of why, read "Death by UML Fever".
Generally agree with @chrisaycock. Would add a couple of things:
You should distinguish using UML for specification versus documentation. At the peak of its hype curve, UML was touted as the former. So development processes mandated modelling in UML before moving into code. That use has diminished greatly (although there are still pockets using Executable UML, notably in real-time/embedded environments).
As a documentation tool, UML is still popular. UML class diagrams, for example, can convey the structure of a module in a way that is much more revealing and intuitive than linear code can ever be. Similarly, sequence or activity diagrams are very useful for understanding flow of control for an action that spans a number of classes.
In the documentation context UML diagrams are increasingly being generated automatically rather than being manually created, e.g. from Doxygen (as @chrisaycock mentions).
However it's also still useful for sketching out designs ahead of development e.g. on a whiteboard.
hth.
I once attended a Q&A session on UML and MDA in embedded systems where the panel included authors Bruce Powell Douglass and Stephen Mellor. Having previously studied and worked on RT-SSADM projects and the Ward-Mellor methodology, I challenged Stephen Mellor on why a new way of software design comes along every 10 years, before practitioners have gotten to grips with or truly understood the last one. He responded, rather too honestly perhaps, with "this way I sell more books"!
To some extent therefore I suggest that the hype surrounding any particular notation or methodology is driven primarily by CASE tool vendors and publishing houses; often the authors are also employed by the tool vendors and have titles like "Chief Evangelist".
That is not to say that these tools have no value; we should all be wary of such marketing, but on the other hand we also need to communicate our ideas and designs in an unambiguous and clear manner, and using a defined notation, however inelegant, will always be better than some ad-hoc "sticks and boxes" notation that has no definitive semantics. Given that need for communication, UML (and derivatives such as SysML) is currently the most widely accepted and used notation, and currently enjoys the widest tool support. It differs from much that has gone before by being defined as a standard agreed by multiple parties rather than the work of a single author or CASE tool vendor, so it is likely to develop rather than disappear.
I think the article linked by @chrisaycock could also have corollaries, e.g. "Death by Agile Fever", "Death by CMM Fever", "Death by RT-SSADM Fever", ... ;-)
As @sfinnie stated, it really depends upon the usage, but UML by itself is nothing more than a notation. In order to be really useful, you need to follow some development method. @Clifford's post notwithstanding, I'd recommend a mature method. Executable UML started as Shlaer-Mellor and has been in use for 19+ years. Douglass' method (not called ROPES anymore, but ???) has been around for 11 years. The Unified Process is based on the Booch, OMT, and OOSE methods, so it can be considered 19+ years old as well. Of course you might find some other UML or non-UML development method that better fits your needs.
It's a well-known fact that UML is not Turing complete (in contrast to the usual programming languages). But it seems to me UML is even more flexible than traditional languages. I can't imagine a problem that you can describe by means of a language such as C++ (for example) but can't describe by means of UML. Quite the contrary: it's much easier for me to imagine a construction that exists in UML but is unrealizable in C++ (Java, Delphi, VB and so on...)
Could you help me understand this point? I really can't grasp it.
I'd say that UML IS a Turing complete language since the addition of the Action Semantics package (this happened in UML version 1.5).
Now UML includes an imperative action language (not to be confused with OCL) that allows a precise definition of the behaviour of class methods. This imperative action language includes the typical set of assignments, if conditions, iterators,... you'd expect from any programming language.
This action language is one of the popular components of Executable UML approaches, but it's now part of the UML standard itself.
Interesting question. A couple of points come to mind, although there's probably a whole lot more to it. Apologies, it's quite long.
What can you describe with e.g. C++ that you can't describe with UML?
First, you have to define what you mean by "UML". Generally, people tend to mean the 'core' elements - those on Class Diagrams, State Diagrams, Activity Diagrams etc. - plus OCL (the constraint language).
Given those elements you can't specify imperative algorithms. Specifically, anything that requires assignment. You can however get very close: the steps and decision logic can be expressed using e.g. Activity Diagrams, and the function of each step defined as pre- and post-conditions in OCL. However, you never quite get to fully specifying the behaviour. Take an example of an atomic step whose purpose is to increment the value of an integer. The input is an integer - say X. The output is described by the post-condition X == X@pre+1. However, there's nothing in UML to actually implement the step.
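The gap between specifying and implementing the increment step can be sketched in Python. The function names are illustrative assumptions. The post-condition (new value equals old value plus one) is written as a predicate that only checks a result, while two different implementations both satisfy it.

```python
def check_postcondition(x_pre: int, x_post: int) -> bool:
    """Specification only: states what must hold, not how to achieve it."""
    return x_post == x_pre + 1


def increment(x: int) -> int:
    """One possible implementation of the step."""
    return x + 1


def increment_by_negation(x: int) -> int:
    """Another implementation: for Python ints, ~x == -x - 1, so -(~x) == x + 1."""
    return -(~x)


for value in (-3, 0, 41):
    # Both implementations meet the same specification.
    assert check_postcondition(value, increment(value))
    assert check_postcondition(value, increment_by_negation(value))
```

The predicate can judge any candidate result, but it cannot produce one; that missing piece is what an action language supplies.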
Now it's entirely conceivable to extend the usage of UML to address the above. The UML Action Semantics were developed precisely to enable specification of actions. Doing so makes the language computationally complete. The problems are merely practical:
There's no universally agreed and adopted syntax for the semantics;
There are very few implementations.
What can you describe with UML that can't be implemented in e.g. C++?
In essence nothing. However there are two practical limitations:
UML "specifications" are usually imprecise, ambiguous and/or incomplete. Activity Diagrams, for example, often leave paths dangling. Could it be represented directly in C++? Yes. Would it compile? No.
Some of the mappings for UML constructs to imperative, stack-based languages are non-trivial. State Models are an example: while there are well-known patterns, the mapping is quite complex. This is especially true for hierarchical and/or concurrent behaviour. In an Activity Diagram, it's easy to express that two activities happen in parallel and then synchronise before moving to the next step. That can of course be done in C++ but requires the use of e.g. threading libraries.
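The fork/join case can be sketched with a threading library. The example below uses Python's standard threading module rather than C++, purely for brevity; the same shape (start two threads, then join both) applies with std::thread in C++. The function names and the doubled values are hypothetical placeholders for real activities.

```python
import threading


def run_fork_join() -> int:
    results = {}

    def activity(name: str, value: int) -> None:
        results[name] = value * 2  # stand-in for real work

    # Fork: both activities start and may run in parallel.
    threads = [
        threading.Thread(target=activity, args=("a", 10)),
        threading.Thread(target=activity, args=("b", 11)),
    ]
    for thread in threads:
        thread.start()

    # Join: wait for both activities before moving to the next step.
    for thread in threads:
        thread.join()

    # Next step: only runs once both results are available.
    return results["a"] + results["b"]
```

What is a single fork bar and a single join bar in the diagram becomes explicit thread creation, start-up and synchronisation code in an imperative language.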
It can however be done. In fact, it's what the Executable UML tools do: Model Compilers take an executable UML model and translate it into 100% functioning imperative code.
hth.
As the name implies UML is a modelling language. It can sometimes be applied as a methodology for designing software.
Once upon a time people were dreaming up ways of generating code automatically; the results were called CASE tools. They failed to get the code generators to work effectively, although they did remove a lot of boilerplate code from the language. This augmentation became the key to UML, as it provided a way to augment the experience of designing and programming software.
I don't know if UML is "Turing Complete", but I hope it is. Wouldn't it be great to come up with solutions by describing the problem to the computer in pictorial format and letting the computer do all that hard, nasty programming for you?
UML is a meta-language for what the code does. It describes artefacts, how they relate and interact, and what they do.
UML is being added to, new design artefacts are being added year by year, and if it is not already Turing Complete I don't see why it couldn't be.
However, I think somewhere along the line I read something about languages being "Turing Equivalent" if they can both express and solve the same problems.
Since UML is the design language and code is the implementation language based on the UML design, I would say that UML and code (C#, Java, etc.) are Turing Equivalent. If they are agreed to be Turing Equivalent then UML must be Turing Complete.