Using metamorphic code to reduce boilerplate - metaprogramming

Has anyone seen metamorphic code -- that is, code that generates and runs instructions (including IL and Java Bytecode, as well as native code) -- used to reduce boilerplate code?
Regardless of the application or language, typically one has some database code to get rows from the database and return a list of objects. Of course, there are countless ways of doing this based on your database connector. You might end up accessing the cells of the row by index (awkward, because changing "SELECT Name, Age" to "SELECT Age, Name" would break your code, plus the indexes obfuscate meaning), or using myObject.Age = resultRow.getValue("Age") (awkward, because this involves simply going through every field to set its data based on the columns).
Keeping with the database theme, LINQ to SQL is awesome. However, defining data models is less awesome, especially when your database has so many tables that SSMS can't list all of them in the object browser. Also, it's not the stored procedure writing or the SQL involvement that I dislike; just the connection of objects to database.
Someone at the company at which I intern wrote a really awesome method from our SqlCommand class (which inherits from the System one) that uses .NET reflection, with System.Reflection.Emit, to generate a method that would set fields (decorated with an attribute containing the name of the column) on any model object with a nullary constructor. I would consider this metamorphic because a specific part of the program writes new methods.
This pattern of generating objects from the database is just one example. One which I came across two days ago was databinding support for SWT (via JFace). I made these perfectly clean models with setAddress(Address address) and getName() and now I have to pollute the setters with PropertyChangeSupport fire-ers and carry around a PropertyChangeSupport instance (even if it is just in an abstract base class)! Then I found PojoBindables and now I feel like a level 80 databinder, simply because I need to write less.
Specifically, things that use native code with something like this or a Java Agent would be really sweet.

Generic programming might up your alley. The Concept C++ website has a really good tutorial that covers abstraction and lifting, ideas that can be used in any language and turn boilerplate code into a positive force. By examining a bunch of boilerplate methods that are almost exactly the same, you can derive a set of requirements that unite the code conceptually ("To make X happen you must do Y, so make X1 happen you must do Y with difference 1"). From there you can use a template to capture the commonalities, and use the template inputs to specify the differences. C# and Java have their own generics implementations at this point, so it might be worth checking out.

Related

JOOQ vs SQL Queries

I am on jooq queries now...I feel the SQL queries looks more readable and maintainable and why we need to use JOOQ instead of using native SQL queries.
Can someone explains few reason for using the same?
Thanks.
Here are the top value propositions that you will never get with native (string based) SQL:
Dynamic SQL is what jOOQ is really really good at. You can compose the most complex queries dynamically based on user input, configuration, etc. and still be sure that the query will run correctly.
An often underestimated effect of dynamic SQL is the fact that you will be able to think of SQL as an algebra, because instead of writing difficult to compose native SQL syntax (with all the keywords, and weird parenthesis rules, etc.), you can think in terms of expression trees, because you're effectively building an expression tree for your queries. Not only will this allow you to implement more sophisticated features, such as SQL transformation for multi tenancy or row level security, but every day things like transforming a set of values into a SQL set operation
Vendor agnosticity. As soon as you have to support more than one SQL dialect, writing SQL manually is close to impossible because of the many subtle differences in dialects. The jOOQ documentation illustrates this e.g. with the LIMIT clause. Once this is a problem you have, you have to use either JPA (much restricted query language: JPQL) or jOOQ (almost no limitations with respect to SQL usage).
Type safety. Now, you will get type safety when you write views and stored procedures as well, but very often, you want to run ad-hoc queries from Java, and there is no guarantee about table names, column names, column data types, or syntax correctness when you do SQL in a string based fashion, e.g. using JDBC or JdbcTemplate, etc. By the way: jOOQ encourages you to use as many views and stored procedures as you want. They fit perfectly in the jOOQ paradigm.
Code generation. Which leads to more type safety. Your database schema becomes part of your client code. Your client code no longer compiles when your queries are incorrect. Imagine someone renaming a column and forgetting to refactor the 20 queries that use it. IDEs only provide some degree of safety when writing the query for the first time, they don't help you when you refactor your schema. With jOOQ, your build fails and you can fix the problem long before you go into production.
Documentation. The generated code also acts as documentation for your schema. Comments on your tables, columns turn into Javadoc, which you can introspect in your client language, without the need for looking them up in the server.
Data type bindings are very easy with jOOQ. Imagine using a library of 100s of stored procedures. Not only will you be able to access them type safely (through code generation), as if they were actual Java code, but you don't have to worry about the tedious and useless activity of binding each single in and out parameter to a type and value.
There are a ton of more advanced features derived from the above, such as:
The availability of a parser and by consequence the possibility of translating SQL.
Schema management tools, such as diffing two schema versions
Basic ActiveRecord support, including some nice things like optimistic locking.
Synthetic SQL features like type safe implicit JOIN
Query By Example.
A nice integration in Java streams or reactive streams.
Some more advanced SQL transformations (this is work in progress).
Export and import functionality
Simple JDBC mocking functionality, including a file based database mock.
Diagnostics
And, if you occasionally think something is much simpler to do with plain native SQL, then just:
Use plain native SQL, also in jOOQ
Disclaimer: As I work for the vendor, I'm obviously biased.

Best approach to data filtering

I'm building a little library applications to have a visual catalogue of my programming ebooks.
By now, I've added some of my ebooks info into a ko.observableArray in my BooksViewModel.js file.
Later, I'll be implementing a NodeJS applications with all the data saved in a MongooseDB and access them from there, but by now, I'm just experimenting directly from Knockout.js.
By default, my library shows all the books I added, desorganized, so I'm looking forward to implement "categories" by language. Every book object contains a language attribute.
I want to filter the books showed by language but I'm a little bit confused on how will be the best way to do this.
The books on the array are not organized, they are all dropped there.. some talks about javascript, other C and so on.
At first I thought about creating a separated array for each language, and then implemeting a method in the ViewModel to select the corresponding array of the language you requested.
Later, I would implementa NodeJS API, to get them by language, lets say:
GET /languages/C // will get a json corresponding all the books that talks about C
The ViewModel could contain a method:
self.findByLanguage = function(lang) {
self.books = // GET /languages/:lang
};
But that would query the database every time. I guess is better to load the whole books json first, saved all of them to an array on the client side, and then filter them. That way only one request would be made.
I could have a global array containing all the books, and then implement the filter with ko.utils.ArrayFilter.
What do you guys think will be the best approach? Maybe there is a better way.
Thanks in advance!
If "my programming ebooks" means this application is for you only, there's a trivial difference between querying all and only the selected few books as the database load will generally be close to zero in either of these cases. The number of books would be a few hundred perhaps.
But wait, what's the actual benefits from loading them all at once?
Upsides of storing the whole list client-side
If you are always looking at most of the categories will save you some milliseconds of database load and all bandwith just from changing categories.
Downsides
Bandwith usage is awful, initial page loading is slower giving you plenty of books you don't want or need.
The database system you're using is having speed as important optimization factor. Add an index on language and querying should be done in no time, anyhow. For the time you're using arrays as data source, this might not show in comparison to 'just sending the whole array'.
Opening the page in multiple windows/browsers/on multiple pcs will require you to syncrhonize all changes to all clients. If you don't do this, you'll have old objects until you reload the page which is exactly what you should avoid if having the list client-side.
If you're planning to run this on your local computer or within your local network, speed should be a trivial issue, so why not let the database do the work? If you're not and speed is an issue, I would personally value "I can load category X pretty fast" over "Initial page loading is slow, but it's fast once everything's loaded".

Is Monotouch.Dialog a suitable replacement for all UITableviews?

This question is mainly targeted towards Miguel as the creator of MT.Dialog but I would like to hear opinions of others as well.
I'm currently refactoring a project that has many table views. I'm wondering if I should replace All of them with MT.Dialog.
My pros are:
easy to use
simple code
hope that Xamarin will offer it cross platform one day
Cons:
my cells are complete custom made. Does it make sense in that case?
performance? Is that an issue?
breaking the MVC paradigms (source no longer separated from view and controller)
Is it in general better to just use MT.Dialog or inherit from it for specific use cases? What are your experiences?
To address some of your questions.
The major difference between MonoTouch.Dialog and UITableView is that with the former you "load" all the data that you want to render upfront, and then forget about it. You let MonoTouch.Dialog take care of rendering it, pushing views and taking care of sections/elements. With UITableView you need so provide callback methods for returning the number of sections, the titles for the sections and the data itself.
UITableView has the advantage that to render say a million rows with the same size and the same cells, you dont really have to load all the data upfront, you can just wait to be called back. That being said, this breaks quickly if you use cells with different heights, as UITableView will have to query for the sizes of all of your rows.
So in short:
(1) yes, even if you use custom cells, you will benefit from shorter code and a simpler programming model. Whether you use the other features of it or not, is up to you.
(2) For performance, the issue boils down to how many rows you will have. Like I mentioned before, if you are browsing a potentially large data set, you would have to load all of those cells in memory up front, or like TweetStation, add features to load "on-demand".
The reality is that it will consume more memory, because you need to load your data in MonoTouch.Dialog. Your best optimization technique is to keep your Elements very lightweight. Tweetstation for example uses a "TweetElement" that merely holds the ID to the tweet, and loads the actual contents on demand, to keep the size of the TweetElement in memory very small.
With UITableView, you do not pay for that price. But if you are not using a database of some sort, the data will still be in memory.
If your application calls for the data to be in memory, then you might as well move the data to be elements and use that as your model.
(3) This is a little bit of a straw man. Your data "source" is never really independent of UIKit. I know that people like to talk about these models as being reusable, but in practice, you wont ever be able to reuse a UITableViewSource as a source for anything but a UITableView. It's main use is to support scalable controls that do not require data to be loaded in memory up-front, it is not really about separating the Model from the View.
So what you really have is an adaptor class that bridges the world of the UITableView with your actual data model (a database, an XML list, an in-memory array, a Redis connection).
With UITableView, your adaptor code lives in the constructor and the UITableViewSource. With MonoTouch.Dialog your adatpro code lives in the code that populates the initial RootElement to DialogViewController.
So there are reasons to use UITableView over MonoTouch.Dialog, but it is none of those three Cons.
I use MonoTouch.Dialog (and it's brother QuickDialog for objc) pretty much every time I use a tableview. It does help a lot to simplify the code, and gives you a better abstraction of a table.
There's one exception, though, which is when the table will have thousands and thousands of rows, and the data is in a database. MT.D/QD requires you to load all the data upfront, so you can create the sections, and that's simply too slow if you don't already have the objects in memory.
Regarding "breaking MVC", I kind of agree with you. I never really use the reflection bindings in MT.D because of that fact. Usually I end up creating the root from scratch in code, or use something like JSON (in my fork https://github.com/escoz/MonoMobile.Forms), so that my domain objects don't need to know about MT.D.

Code generation against Sprocs?

I'm trying to understand choices for code generation tools/ORM tools and discover what solution will best meet the requirements that I have and the limitations present.
I'm creating a foundational solution to be used for new projects. It consists of ASP.NET MVC 3.0, layers for business logic and data access. The data access layer will need to go against Oracle for now, and then switch to SQL this year as the db migration is finished.
From a DTO standpoint mapping to custom types in the solution, what ORM/code generation tool will work with creating my needed code but can ONLY access Stored Procs in Oracle and SQL.?
Meaning, I need to generate the custom objects that are the artifacts from and being pushed to the stored procedures as the parameters, I don't need to generate the sprocs themselves, they already exist. I'm looking for the representation of what the sproc needs and gives back to be generated into DTOs. In some cases I can go against views and generate DTOs. I'm assuming most tools already do this. But for 90% of the time, I don't have access directly to any tables or views, only stored procs.
Does this make sense?
ORMs are best at mapping objects to tables (and/or views), not mapping objects to sprocs.
Very few tools can do automated code generation against whatever output a sproc may generate, depending on the complexity of the sproc. It's much more straight-forward to code generate the input to a sproc as that is generally well defined and clear.
I would say if you are stuck with sprocs, your options for using third party code to help reduce your development and maintenance time are severely limited.
I believe either LinqToSql or EntityFramework (or both?) are capable of some magic with regards to SQL Server to try to mostly automatically figure out what a sproc may be returning. I don't think it works all the time, it's just sophisticated guess work and I seriously doubt it would work with Oracle. I am not aware of anything else software-wise that even attempts to figure out what a sproc may return.
A sproc can return multiple diverse record sets that can be built dynamically by the sproc depending on the input and data in the database. A technical solution to automatically anticipating sproc output seems like it would require the following:
A static set of underlying data in the database
The ability to pass all possible inputs to the sproc and execute the sproc without any negative impact or side effects
That would give you a static set of possible outputs for any given valid input. A small change in the data in the database could invalidate everything.
If I recall correctly, the magic Microsoft did was something like calling the sproc passing NULL for all input parameters and assuming the output is always exactly the first recordset that comes back from the database. That is clearly an incomplete solution to the problem, but in simple cases it appears to be magic because it can work very well some of the time.

What code could be used as a string aggregator for Sybase? (Like Oracle's stragg)

In my travels in Oracle, the 'stragg' function, or 'String Aggregator' was life-saving when I had to create dynamic SQL queries on the fly.
You can read up about it here: http://www.oratechinfo.co.uk/delimited_lists_to_collections.html
The basic use of it was:
select stragg(fruit) from food;
fruit
-----------
apple,pear,banana,strawberry
1 row(s) returned
So simple to use, concatenating chr(13) turned it into a long list, and selecting information from system tables gave a 5 minute solution to dynamically generated SQL, e.g. auditing triggers.
Now I've been charged with transferring oracle functionality related to auditing into Sybase, and a function similar to Stragg would be ideal for this purpose.
E.g.
select #my_table = 'table_of_fruit'
select 'insert into '+#mytable+'_copy (' +char(10)
+ stragg(c.name) +char(10)
+ 'select '
+ stragg('inserted.'+c.name) + char(10)
+ 'from '+#mytable
from syscolumns c
where objectid(#mytable) = c.id
------------------------------------------
insert into table_of_fruit_copy
(fruit, sweetness, price)
select fruit, sweetness,price
from inserted
Done. Simple.
Except I don't know how to get a string-aggregation function working in Sybase.
Does anyone know of an attempt to do this kind of thing, or code that could work the same as stragg that could be used in this way?
The alternative at the moment is printing code based on complex cursors and such (sample LOC: 500), or select statements combining static strings and columns from user tables (sample LOC: 200). Stragg would severely reduce the complexity of this code, and would be a great deal of help in the future (sample LOC: who knows, maybe 50?)
p.s. I'm calling these selects through a shell script then piping them to file, then running the file through iSQL. Not the nicest solution, but it's better than the alternatives.
There are three separate answers
Question
You have made comments about simplicity, which need to be addressed before we get to the solution.
It is a common requirement to be able to take a delimited list of values, say A,B,C,D, and treat this data like it was a set of rows in a table, or vice versa
This one of the Top Ten Worst Programming Practices I read about recently.
In general, Sybase types tend to be somewhat more academically and Relationally qualified than Oracle types, so we simply do not do that sort of thing in SybaseLand or DB2Land.
In 20 years of working with Sybase, I have had to code that as part of my project just once, and that was for non-technical Auditor who loaded the result set into MS Access.
On the other hand, I have had to code that at least 12 times, when producing text files for importation into Oracle databases (fulfilling external requirements is outside my project, but I satisfy any such requirement free). Obviously the target databases were sub-standard and non-relational (loading a column with more than one datum breaks 1NF, and creates Update Anomalies), which is typical of what Oracle types have to do to get some speed.
Therefore, no, it is not simplicity, at least in the sense of that principle. It is by definition, complexity.
Your reference to "arrays" is incorrect. All commercial dbms handle arrays, according to the ISO/IEC/ANSI SQL (STRAGGR and LIST operators are non-standard SQL, therefore not SQL). Sybase is very strong in processing arrays. If it was an array, you would not need special hand coding to handle it (and you do, as per your question). This is not an array, there is no definition to the cells. This is a single concatenated scalar string.
Pivoting is an entirely different process, which uses set-processing; it does not require row-processing. (I understand on good authority, that Oracle is hopeless at scalar subqueries, and thus Oracle people are used to writing them as [very inefficient] joins or inline views, and then filtering: all that can be elevated to set-processing via scalar subqueries, and it will perform much faster. Particularly your Pivots.)
Even the author in your link posts as follows. Please familiarise yourself with the caveats:
It's as simple as this: If you want to have a system with no logical limitation in the number of data elements passed to a given process, then forget the following mechanisms! They are simply the wrong way to approach the problem.
Therefore, know whatever you are doing is sub-standard, non-relational, and limited; and go ahead with your eyes open. No use pretending that: it will not break; it is not limited; it is an "array"; or that Sybase doesn't have a neat little function that Oracle has. Any professional will see through all that. And if the string length is exceeded, for God's sake send some indicator back to the caller ("!Exceeded" in the string) identifying that condition.
Essentially you are turning the set-processing engine on its head, and forcing it into row-processing mode, so it will be very slow. A WHILE loop is distinctly faster than a cursor, but both are in the same class, row-processors.
The alternative at the moment is printing code based on complex cursors and such
What 200 or 500 LoC ? It is possible I am missing something, but my code is the same few lines of code identified under "Using a Table Function" in your link. Maximum 20, if you count nice formatting; the loop; initialisation; error handling. There is nothing "complex" about it. Do the exact reverse to cancatenate a single string from multiple rows. We use stored procedures for this (which oracle does not have, really, PL/SQL is a different animal). If you have ASE 15.0.2 or greater, you can use a User Defined Function, which you can then use in place of a column. Stored procs are better for true arrays.
the concatenation operator in Sybase is the plus sign. For reversal (decomposing the CSV string) you need CHARINDEX and SUBSTRING functions
You may need the Function Reference Manual, if for nothing else, to avoid writing code where we have functions.
Likewise, we do not have a RANK() function. We are quite happy with the 4 lines of code requires for the subquery. It is only required for Oracle because subqueries are crippled.
Ok, I have answered your question, Now to address the approaches.
You will be aware that code using Oracle Extensions to the SQL standard will need to be changed.
Sybase is way more automated than Oracle; if you familiarise yourself with its feature set, in many instances, you can get the same result (as you did in Oracle) without writing any code. Writing code-for-code blocks is the chain gang, rock-breaking method of building roads, in the context of bulldozers. Even if your company had good reason to use that method, you need to the aware that features work quite differently, eg. triggers, which is why I am posting so much detail.
Another issue that will annoy you is that Oracle isn't really ANSI SQL compliant (stretches the definitions in many places, in order to appear to be compliant), and Sybase, given its customer base, is rigidly SQL compliant. So in addition to the same function working differently, or in a different deployment, you need to be aware that code changes may be required to elevate Oracle code to ANSI compliance levels, just to execute on an ANSI SQL compliant platform.
I am not sure if you are trying to write code for the content of a trigger, or if you are trying to capture the changes to a database. I will provide both answers.
Auditing
Capture Changes to Database
We have an very robust, fast and configurable Auditing subsystem, fit for high volumes and banking level auditing requirements. Get your DBA to setup the sybaudit (separate) database, and to configure exactly what changes need to be captured. This facility will perform much faster than any code you or I can write in a trigger (as much as 100 times faster than your row-by-processing required for the above, as it is executed within the engine, within your executing thread). And of course the setup time is a fraction of your coding time.
Triggers
Again, I am not sure exactly what you are trying to achieve, but assuming you want to copy every insert to some table to a COPY of that table (inside the Trigger), that example code you have provided will not work (and I am not counting syntax issues).
Speaking to your example, you need to do way more work, to deal with the different datatypes; column sizes; precisions; scale; etc. And perhaps the UPDATE() function to identify which columns have changed (for an UPDATE trigger of course). If all you are trying to do is convert the various datatypes to strings, check the CONVERT() function.
Triggers are transactional.
Never place row-processing code in a Trigger (it will strangle the table)
You can't place Dynamic SQL in a Trigger.
But in Sybase even that is not necessary. Refer to the User Guide, chapter 19 is devoted to Triggers, with several variations, and examples. Inside the trigger, you should be able to simply:
INSERT table_copy
SELECT column_list -- never use * unless you want the db fixed in cement
FROM inserted
If you are trying to copy the inserts to all tables into one Audit table, then beware. Then I understand your example a little bit more. You will be forcing a highly Symmetric Muli-Threading server (oracle is not a server in the architecture sense) into single-threading through your table. Auditing is multi-threaded.
Last, the use of manual methods of any kind is not required, so if you could expand a bit more on your PS, what the requirement you are trying to fulfil is, I can identify the programmatic method for you. It appears you are trying to use the PL/SQL approach (which is very limited).
Just use the LIST() function. It's a direct replacement for stragg() function. Example:
SELECT LIST(state, ', ') FROM cities
Result:
name
CA, CA, MA, NY

Resources