graph library (c#) for Rete algorithm - c#-4.0

Can you give me suggestions of graph libraries that are best to develop Rete algorithm.
I'm using .net 4.0
I found QuickGraph but I'm not sure if it's useful in this case.

I'm not a C# dev, but I've implemented rete in another language. You want a directed acyclic graph algorithm, start looking here on github. Or perhaps here. However, you can get away with a simpler data structure with a visitor. And, if you haven't I'd read Doorenbos, 1995, which will walk you through how to implement the whole thing.

Well, I agree with Chase. I have built a rules engine using Composite and Visitor, and its working absolutely flawless. Composite helps in organizing rules in a hierarchy (nesting) and Visitor helps you draw unlimited operations like evaluators, visualizers etc. I'd suggest building a truth logic first using composite and visitor and then wrap it up with expression parsing, where expressions are represented as text, maybe XML nodes, which naturally has the hierarchical structure to represent nesting of rules. Best is that you can version expression based rules.

Related

Extracting Validations using NLP

Considering me a newbie into this but if I want to develop an engine to extract only validations from a technical documents like functional specification. Usually the validations are quick to identify.
If this can be done somehow I can use it for further automation.
I checked and few frameworks are available like
https://opennlp.apache.org/
http://nlp.stanford.edu/
I did few POC's as well , but designing an intelligent engine having generic rules is where I am getting blocked.
Any pointers will be helpful ...
For extracting only validation you need to have the grammar or similar like structure maybe a regular expression. If you dont have ANY idea what type of validation you might face then you can follow un-supervised learning approach. Designing any generic rule is not as easy as it sound so you need to work extra hard for this.

Creating source to source translator

I want to know that what are the strategies to create a source to source translator i.e creating translation from one high level language to another. The two ways that come into my mind are
1- Changing syntax tree of one language to other language syntax tree
2- Changing it to intermediate language and then converting that to other high level language
My question is that is it possible to do the conversion using both strategies and which is more feasible to do, can anyone give some refernces to any theory or implementation done by some converter like any of above methods. And is there any standard xml based intermediate language, i know that xmlvm uses xml as intermediate language but it does not provide any proper specification of the intermediate language.
Any compiler is, roughly, a source-to-source translator. Target language can be an assembly language (or directly a binary machine code language), or C, or whatever high level language you fancy. So, the general compilers theory is applicable.
And just as a word of advice - one intermediate language is normally not nearly enough. Use more. Use dozens of intermediate languages, each different from a previous one in just one tiny aspect. This way any language-to-language translation is nothing but trivial.
Another word of advice (anticipating downvotes here) - stay away from XML, especially as a representation for ASTs.
I would look at LLVM, which can do source to source. Although the output isn't pretty, it might provide some good ideas.
The converters are usually based on constructing the semantic tree of one program and then re-shaping it to the target PL. As an example, take a look at C# to Java convertor.
The second approach is also possible, but the organization of your code may change completely after conversion. So, it is better to keep the intermediate common structure (IL, ST, etc), as high level as possible.
Try Clang! It is powerful for source-to-source translation. As of now it fully supports C, C++, Objective C and Objective C++.
You may also want to look at ROSE compiler infrastructure.

Pros/cons of different language workbench tools such as Xtext and MPS?

Does anyone have experience working with language workbench tools such as Xtext, Spoofax, and JetBrains' MPS? I'm looking to try one out and am having a hard time finding a good comparison of the different tools. What are the pros and cons of each?
I'm looking to build DSLs that generate python code, so I'm especially interested to hear from people who've used one of these tools with python (all three seem pretty Java-focused... why is that?). The DLSs are primarily for my own use, so I care less about building a really pretty IDE than I do about it being KISS to define the syntax and write the code generator. The ability to type-check / do static analysis of the DLSs would be pretty cool too.
I'm a little afraid of getting far down a path, hitting a wall, and realizing that all my code is in a format that can't be ported to anything else -- is that a risk with these tools? MPS in particular seems a little scary since as I understand it you don't really generate text-based syntaxes but rather build specialized editors for ASTs.
Markus Voelter does a pretty good job comparing those three in se-radio and Software ArchitekTOUR podcasts.
The basic idea is, that Xtext is most used, therefore most stable and documented, and it is based on popular Eclipse platform and modeling ecosystem - EMF which surrounds it. On the other hand it is parser based and uses ANTLR internally, which means the kind of grammars you can define is limited and languages cannot be combined easily.
Spoofax is an academic product with least adoption of those three. It is also parser based, but uses its own parser generator internally which allows language combinations.
Jetbrains MPS is projection based, which gives much freedom to language designer and allows combinations of languages. *t also has solid support. Drawback might be the learning curve.
None of these tools is strictly Java focused as target language for code generators. Xtext uses Xpand templates, which are plain text. I don't really know how code generation in Spoofax works. MPS has its base language, which is said to be subset of Java, but there are different alternatives.
I personally use Xtext because of its simplicity and maturity, but those strong limitations given by its design make it not a very future proof choice.
I have chosen XText in the same case two weeks ago, but I don't know anything about Spoofax.
My first impression - Xtext is very simple and productive.
I have made my first realife(but very simple) project in 30 minutes, I have generated a graphviz dot graph and html report.
I don't like MPS because I prefer plain text source and destination files.
There are other systems for doing this kind of thing. If your goal is building tools, you don't necessarily have to look to an IDE with an integrated tool; sometimes you can find better tools that have focused on utility rather than IDE integration
Consider any of the pure program transformation tools:
TXL (practical, single paradigm)
Stratego (Spoofax before it was transplanted into Eclipse)
Rascal (research, very nicely designed in many ways)
DMS Software Reengineering Toolkit (happens to be mine; commercial; used to do heavy duty DSL/conventional langauge analysis and transformation including on C++)
These all provide good mechanisms for defining DSLs and transforming them.
What really matters is the support machinery for carrying out "life after parsing".
I 've experimented for a couple of days with Xtext and while the tool looks promising I was eventually put off by the tight integration with the Eclipse ecosystem and the pain one has to go through just to solve what should be given hassle-free out of the box: a headless run of the code generator you implemented. See here for some of the minutiae one has to go through (and it's not even properly documented on the Xtext web site but rather on a blog, meaning its an ad-hoc patch that could very well break on the next release).
Will take another look in half a year to see if there has been any improvement on this front.
Take a look at the Markus Völter's book. It does a very comprehensive comparison of these 3 technologies.
http://dslbook.org
XText is very well maintained but this doesn't mean it's problem-less. Getting type-system, scoping and generation running isn't as easy as advertised.
Spoofax is scannerless, (simplifying grammar composition). Not that well documented, but seems complete.
MPS is projectional. A pro for language composition and con for editing. Supports multiple editors for an AST and will soon even support a nice diagram editor. Base language documentation isn't that good. Typesystem, scoping, checking is very well handled. Model to model transformations are done by the solver. My colleagues using it complain about model to text languages. (My opinion M2M wasn't that intuitive either.)
Years ago Microsoft had the OSLO project. MGrammar and especially Quadrant were very promising. It was possible to represent your model in table, form, text or diagram view. But suddenly they've cancelled the project (and perhaps shot the people working on it)
Perhaps today the best place to compare different language workbenches is http://www.languageworkbenches.net/ and there http://www.languageworkbenches.net/past-editions/ shows how a set of Language Workbenches implement a similar kind of task: a dsl for a particular domain.
Update 2022: as links were broken and newer articles on the topic are written see the site referred above at:
https://web.archive.org/web/20160324201529/http://www.languageworkbenches.net/
References to article reviewing language workbenches include: 1) State of the art: https://link.springer.com/chapter/10.1007/978-3-319-02654-1_11 and 2) Empirical evaluation: https://hal.archives-ouvertes.fr/file/index/docid/706841/filename/Evaluation_of_Modeling_Tools_Adaptation.pdf

What is the best way to create multiple language versions of a domain?

I would like to create a set of domain objects in multiple languages, so that I can target different platforms. I have been looking at external DSLs as a way to define a language for my domain, and then potentially writing adapters that generate code for the languages I'm interested in targeting. Is this the best way to solve this problem? Or is it just simpler to maintain multiple versions of the project?
I think that Apache Thrift delivers what you are asking for.
Sorry for late answer, but as you mention C# being your main language, this practically fully supported Visual Studio based technology is exactly what you are looking for.
You have to understand what you want to abstract with your DSLs, but the multiple-platform support is trivial on top of that.
Disclaimer: This is our technology, but it's publicly open and it solves exactly the problem presented in the question.
http://abstraction.codeplex.com/
Note! Mind the very "alpha" stage of the current download, I suggest you skip the zipped download and grab the latest source. I am updating better construct in relatively near future. Check out the "Context" implementation in "Production/Dev/AbstractionTemplate" solution.
It is difficult to be helpful without understanding what you are planning to use your DSL for.
Is portability your main problem here?
To succesfully target these different platforms, you will probably have to maintain plaftorm-specific layers anyway (generated or not).
If you plan to write your whole application in your DSL, then use your own compiler to transform it into runnable code for each platform, well it is most probably a bad idea, too complex and overengineered.
However, if you have a well-defined chunk of platform-independent logic, then a DSL is a good choice. Just write an interpreter for it on each target platform (provided that performance is not critical, this is also simpler and easier than generating code).
What is the best way to create multiple language versions of a domain?
This is (was?) somehow the idea of Model Driven Architecture (MDA). Quoting Model-driven architecture from Wikipedia:
The Model-Driven Architecture approach
defines system functionality using a
platform-independent model (PIM) using
an appropriate domain-specific
language (DSL).
Then, given a platform definition
model (PDM) corresponding to CORBA,
.NET, the Web, etc., the PIM is
translated to one or more
platform-specific models (PSMs) that
computers can run. This requires
mappings and transformations and
should be modeled too.
The PSM may use different Domain
Specific Languages (DSLs), or a
General Purpose Language (GPL) like
Java, C#, PHP, Python, etc. Automated tools generally
perform this translation.
Depending on the complexity of your domain and the availability of a MDA Tool, this might be an option (with a lower implementation cost).
See also
MDA: Nice idea, shame about the ...
Language Workbenches and Model Driven Architecture
UML vs. Domain-Specific Languages
DSL in the context of UML and GPL
UML or DSL: Which Bear Is Best? (be sure to read this one)

Generate class diagram from existing javadocs

I'm using an external java library for which I only have the javadocs and do not have the source code. I'd like to generate a UML diagram from the existing javadocs so that I can visualize the class hierarchy using something like Graphviz. Is that possible? Note that what I'm looking for is a graphical version of overview-tree.html.
Please let me know if you have any ideas and/or suggestions.
Thanks,
Shirley
I don't believe that there is such a tool. Most of the reverse engineer tools depend on the actual code. The javadoc information isn't guaranteed to match the code as a 1:1 for the structure, thus making it unreliable.
I'm not familiar with any off-the-shelf solution for this purpose. Most commonly folks have the source code that generated the JavaDoc.
That being said, the overview-tree.html traditionally has a fairly straightforward HTML format.
It should not be difficult to write a script that would read the file as text or as a DOM, reconstruct the hierarchy of UL and LI tags, and use that to build an input file for graphviz. I've done similar stuff in the past with other forms of data.
It's just a matter of time and proficiency with the scripting language or appropriate tools.
The one problem of this approach is that you would only get the hierarchy of classes. You would have to make it somewhat smarter if you wanted to get the "implements XYZ" and create multiple hierarchies. Even if you could get that data, you would have to manipulate GraphViz's levels to get it to provide an appropriate layout once you have this multiple inheritance structure.
Of course, adding the details of the members would turn this into a whole new problem since you will have to access other HTML files.

Resources