How to configure jOOQ Embeddable Types from SQL Query

We’ve got a database where there are quite a lot of embedded types. Often the same embedded type occurs multiple times in the same table. The respective columns follow a naming pattern, so it’s easy to identify the different occurrences using an SQL query for example.
What’s the best way to configure jOOQ so that these embedded types get mapped by the code generator? Note that our real db contains hundreds of tables, so manually configuring this is a no-go.
Fictional example:
create table t(amount int, unit varchar2(4), amount_pend int, unit_pend varchar(4));

Re-using the same embeddable multiple times per table
As of jOOQ 3.15, it is not yet possible to reference the same embeddable type more than once per table. You can only reference it from several tables, once each.
This should obviously be fixed. It seems to be merely a code generation limitation. I've created a feature request for this:
https://github.com/jOOQ/jOOQ/issues/12608
Starting from jOOQ 3.17, you can declare:
<embeddables>
<embeddable>
<name>MONETARY_AMOUNT</name>
<referencingName>AMOUNT_WITH_UNIT</referencingName>
<tables>T</tables>
<fields>
<field><expression>AMOUNT</expression></field>
<field><expression>UNIT</expression></field>
</fields>
</embeddable>
<embeddable>
<name>MONETARY_AMOUNT</name>
<referencingName>AMOUNT_WITH_UNIT_PEND</referencingName>
<tables>T</tables>
<fields>
<field><name>AMOUNT</name><expression>AMOUNT_PEND</expression></field>
<field><name>UNIT</name><expression>UNIT_PEND</expression></field>
</fields>
</embeddable>
</embeddables>
For the time being, you'll have to produce 2 distinct embeddable types, which aren't compatible by name or type, only by structure.
Generating code generation configuration dynamically
The code generation configuration is a JAXB-annotated API, which happens to conveniently map to:
Maven: https://www.jooq.org/doc/latest/manual/code-generation/codegen-maven/
Standalone XML files following this XSD (or a newer version): https://www.jooq.org/xsd/jooq-codegen-3.16.0.xsd
But you can also use the code generation API programmatically yourself and thus generate the configuration elements dynamically:
https://www.jooq.org/doc/latest/manual/code-generation/codegen-programmatic/
Or, since it's all XML based, you could use XSLT to generate the configuration
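As a rough illustration of generating the configuration dynamically, here is a minimal plain-Java sketch that derives the embeddables XML from a list of column names. The class EmbeddableConfigGenerator, its methods, and the assumed naming pattern (AMOUNT<suffix> paired with UNIT<suffix>) are hypothetical and only mirror the fictional example from the question. In a real setup the column list would come from a dictionary view query, and referencing the same embeddable several times per table requires jOOQ 3.17 or later, as noted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of jOOQ): renders one <embeddable> element per
// (AMOUNT<suffix>, UNIT<suffix>) column pair found in a table.
final class EmbeddableConfigGenerator {

    // Returns every suffix (e.g. "" and "_PEND") for which both columns exist.
    static List<String> findSuffixes(List<String> columns) {
        Set<String> known = new LinkedHashSet<>(columns);
        List<String> suffixes = new ArrayList<>();
        for (String column : columns) {
            if (column.startsWith("AMOUNT")) {
                String suffix = column.substring("AMOUNT".length());
                if (known.contains("UNIT" + suffix)) {
                    suffixes.add(suffix);
                }
            }
        }
        return suffixes;
    }

    // Renders the <embeddables> element for a single table.
    // Assumption: table and column names are plain identifiers that need
    // no XML escaping and contain no regular expression metacharacters.
    static String render(String table, List<String> columns) {
        StringBuilder xml = new StringBuilder("<embeddables>\n");
        for (String suffix : findSuffixes(columns)) {
            xml.append("<embeddable>\n")
               .append("<name>MONETARY_AMOUNT</name>\n")
               .append("<referencingName>AMOUNT_WITH_UNIT").append(suffix).append("</referencingName>\n")
               .append("<tables>").append(table).append("</tables>\n")
               .append("<fields>\n")
               .append("<field><name>AMOUNT</name><expression>AMOUNT").append(suffix).append("</expression></field>\n")
               .append("<field><name>UNIT</name><expression>UNIT").append(suffix).append("</expression></field>\n")
               .append("</fields>\n")
               .append("</embeddable>\n");
        }
        return xml.append("</embeddables>").toString();
    }

    public static void main(String[] args) {
        // The column list is hard-coded here; it stands in for a query result.
        System.out.println(render("T", List.of("AMOUNT", "UNIT", "AMOUNT_PEND", "UNIT_PEND")));
    }
}
```

The generated fragment can then be pasted into (or merged with) the Maven or standalone XML configuration. The same grouping logic could instead populate the JAXB-annotated configuration objects if you prefer the programmatic route.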


How to update rows in Jooq without Codegen using JSON

I am using Jooq version 3.17.0 and attempting to insert data into a table without codegen.
At the minute, I am designing a system that allows data to be imported into multiple tables (one at a time, and starting with just one), yet I do not want to write specific code for each table and as of now, I haven't had a need for codegen.
The code currently works for importing data via JSON, with json being a String formatted in the 'Jooq' format. This imports data correctly into the database. This also allows us to send json data of table updates from one system to our main system that uses Jooq. Yet it gives me an error when I try to update.
I am using MySQL as my database.
The original code for insertion is :
Result<Record> convertedJson = dslContext.fetchFromJSON(json);
Loader<Record> res1 = dslContext.loadInto(table(tableName)).loadJSON(json).fields(convertedJson.fields()).execute();
However, if we try to update data by sending in the same json, but with one field changed, jooq gives an error org.jooq.exception.DataAccessException stating that there is a duplicate entry for key.
I tried to use :
Loader<Record> res2 = dslContext.loadInto(table(tableName)).onDuplicateKeyUpdate().loadJSON(json).fields(convertedJson.fields()).execute();
But then this throws the error: ON DUPLICATE KEY UPDATE only works on tables with explicit primary keys. Table is not updatable : <tableName>. It is raised in LoaderImpl.onDuplicateKeyUpdate():220 because table.getPrimaryKey() is null, which technically makes sense, since table(tableName) returns a Table that does not know its fields.
My question is probably two-fold.
Is there a way to have a table that is aware of its fields without codegen?
Is there a way for me to allow jOOQ to update rows this way?
My preference is to steer clear of codegen, unless it's really needed. I probably could switch to codegen if needed, but again I would still need to be able to execute SQL without writing specific code for each table. Using JSON is still very much desired, as that allows me to send data from one application to another for import.
Using code generation
You've run into one of those many reasons why code generation is very helpful with jOOQ. If your various tables are known at compile time, and all you're doing is switching table names, then I would go with generated code, making the lookup of the table dynamic. That would solve the problem easily.
From experience with various similar support cases, I've always recommended this first, because as soon as these kinds of troubles start, it's a good idea to re-think the code generation strategy as you will run into other, similar problems, having to work around the lack of ubiquitously available meta data all the time. There are many other benefits to using the code generator.
Emulating code generation
If for some reason you cannot (e.g. the tables aren't known at compile time) or do not want to use the code generator, then you can do the code generator's work yourself at runtime, by building CustomTable types as documented here.
Using other means of providing meta information
Another way to provide jOOQ with meta data is to use one of various forms of implementing org.jooq.Meta, which include:
Looking up meta data from the JDBC driver's DatabaseMetaData (this can be slow, depending on your schema)
Letting jOOQ interpret some DDL scripts
Using jOOQ's XML representation of the standard SQL INFORMATION_SCHEMA
Using generated code

jOOQ difference between Record and TableRecord

I would like to know what the difference is between a jOOQ Record and a TableRecord. So for example a User and a UserRecord. I can see that it has something to do with the actual nullability of a certain table, but why does everyone use the TableRecord and when should I ever use the normal Record?
Thanks!
There's a manual page about literally your question: Record vs. TableRecord. In short:
Record is the generic super type of all jOOQ records.
TableRecord is a specific type of record, which can be associated with a table in your schema. This type is typically extended by code generation output
So for example a User and a UserRecord
This might be a different question. jOOQ's code generator produces these artifacts for each table, depending on your configuration:
The Table (e.g. User). You use this to construct type safe jOOQ queries
The TableRecord (e.g. UserRecord). You can use this to simplify some CRUD operations
The POJO (e.g. User, but in a different package). You can use this to map results to simple POJOs

Is there a jOOQ tool to verify generated definitions are still correct?

I am working with classes generated by jOOQ based on a schema maintained by Liquibase. I am looking for a way to ensure that the jOOQ classes remain consistent with the actual database. The preferred approach is to create a test that can be run by our CI tool when pull requests are created.
Is there a tool to verify that the jOOQ generated definitions are still correct?
Two obvious approaches involving your build setup are:
To check in generated sources (get a diff when they're not up to date). See also the manual section about code generation and version control.
To include both Liquibase migrations and jOOQ code generation in your build. That way, the generated sources and the database are always up to date with what is defined in your Liquibase migration scripts. You can also base your code generation directly on your Liquibase files using the LiquibaseDatabase, if you're not doing anything fancy, vendor specific.
A less obvious way using programmatic jOOQ API is to compare two versions of Meta using Meta.migrateTo(Meta):
// This corresponds to the meta data from your live connection
Meta m1 = ctx.meta();
// This corresponds to the meta data from your generated catalog (or schema, table, etc)
Meta m2 = ctx.meta(catalog);
// This is a generated migration script between the two versions, should be empty
Queries queries = m1.migrateTo(m2);
The approach might work, though it has a lot of caveats, which are still being fixed as of jOOQ 3.14, 3.15. Work in progress can be seen here: https://github.com/jOOQ/jOOQ/projects/1, bug reports very welcome!

JOOQ vs SQL Queries

I am working with jOOQ queries now. I feel that SQL queries look more readable and maintainable, so why do we need to use jOOQ instead of native SQL queries?
Can someone explain a few reasons for using it?
Thanks.
Here are the top value propositions that you will never get with native (string based) SQL:
Dynamic SQL is what jOOQ is really really good at. You can compose the most complex queries dynamically based on user input, configuration, etc. and still be sure that the query will run correctly.
An often underestimated effect of dynamic SQL is the fact that you will be able to think of SQL as an algebra, because instead of writing difficult to compose native SQL syntax (with all the keywords, and weird parenthesis rules, etc.), you can think in terms of expression trees, because you're effectively building an expression tree for your queries. Not only will this allow you to implement more sophisticated features, such as SQL transformation for multi tenancy or row level security, but also everyday things like transforming a set of values into a SQL set operation.
Vendor agnosticity. As soon as you have to support more than one SQL dialect, writing SQL manually is close to impossible because of the many subtle differences in dialects. The jOOQ documentation illustrates this e.g. with the LIMIT clause. Once this is a problem you have, you have to use either JPA (much restricted query language: JPQL) or jOOQ (almost no limitations with respect to SQL usage).
Type safety. Now, you will get type safety when you write views and stored procedures as well, but very often, you want to run ad-hoc queries from Java, and there is no guarantee about table names, column names, column data types, or syntax correctness when you do SQL in a string based fashion, e.g. using JDBC or JdbcTemplate, etc. By the way: jOOQ encourages you to use as many views and stored procedures as you want. They fit perfectly in the jOOQ paradigm.
Code generation. Which leads to more type safety. Your database schema becomes part of your client code. Your client code no longer compiles when your queries are incorrect. Imagine someone renaming a column and forgetting to refactor the 20 queries that use it. IDEs only provide some degree of safety when writing the query for the first time, they don't help you when you refactor your schema. With jOOQ, your build fails and you can fix the problem long before you go into production.
Documentation. The generated code also acts as documentation for your schema. Comments on your tables, columns turn into Javadoc, which you can introspect in your client language, without the need for looking them up in the server.
Data type bindings are very easy with jOOQ. Imagine using a library of 100s of stored procedures. Not only will you be able to access them type safely (through code generation), as if they were actual Java code, but you don't have to worry about the tedious and useless activity of binding each single in and out parameter to a type and value.
There are a ton of more advanced features derived from the above, such as:
The availability of a parser and by consequence the possibility of translating SQL.
Schema management tools, such as diffing two schema versions
Basic ActiveRecord support, including some nice things like optimistic locking.
Synthetic SQL features like type safe implicit JOIN
Query By Example.
A nice integration in Java streams or reactive streams.
Some more advanced SQL transformations (this is work in progress).
Export and import functionality
Simple JDBC mocking functionality, including a file based database mock.
Diagnostics
And, if you occasionally think something is much simpler to do with plain native SQL, then just:
Use plain native SQL, also in jOOQ
Disclaimer: As I work for the vendor, I'm obviously biased.

What is the best way to store and search through object transactions?

We have a decent sized object-oriented application. Whenever an object in the app is changed, the object changes are saved back to the DB. However, this has become less than ideal.
Currently, transactions are stored as a transaction and a set of transactionLI's.
The transaction table has fields for who, what, when, why, foreignKey, and foreignTable. The first four are self-explanatory. ForeignKey and foreignTable are used to determine which object changed.
TransactionLI has timestamp, key, val, oldVal, and a transactionID. This is basically a key/value/oldValue storage system.
The problem is that these two tables are used for every object in the application, so they're pretty big tables now. Using them for anything is slow. Indexes only help so much.
So we're thinking about other ways to do something like this. Things we've considered so far:
- Sharding these tables by something like the timestamp.
- Denormalizing the two tables and merge them into one.
- A combination of the two above.
- Doing something along the lines of serializing each object after a change and storing it in subversion.
- Probably something else, but I can't think of it right now.
The whole problem is that we'd like to have some mechanism for properly storing and searching through transactional data. Yeah you can force feed that into a relational database, but really, it's transactional data and should be stored accordingly.
What is everyone else doing?
We have taken the following approach:-
All objects are serialised (using the standard XmlSerializer) but we have decorated our classes with serialisation attributes so that the resultant XML is much smaller (storing elements as attributes and dropping vowels on field names for example). This could be taken a stage further by compressing the XML if necessary.
The object repository is accessed via a SQL view. The view fronts a number of tables that are identical in structure but the table name appended with a GUID. A new table is generated when the previous table has reached critical mass (a pre-determined number of rows)
We run a nightly archiving routine that generates the new tables and modifies the views accordingly so that calling applications do not see any differences.
Finally, as part of the overnight routine we archive any old object instances that are no longer required to disk (and then tape).
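To make the view-over-many-tables idea a bit more concrete, here is a minimal Java sketch of what the nightly routine might generate when it rebuilds the view. The class ArchiveViewBuilder and the table and view names are hypothetical and not taken from the setup described above. The sketch also assumes the names are trusted identifiers that need no quoting (a GUID suffix containing hyphens would need quoting in most databases).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: builds the definition of a view that fronts several
// identically structured tables by combining them with UNION ALL.
final class ArchiveViewBuilder {

    static String viewDefinition(String viewName, List<String> tableNames) {
        if (tableNames.isEmpty()) {
            throw new IllegalArgumentException("at least one table is required");
        }
        // One SELECT per underlying table, in the order given.
        String body = tableNames.stream()
            .map(table -> "SELECT * FROM " + table)
            .collect(Collectors.joining("\nUNION ALL\n"));
        return "CREATE VIEW " + viewName + " AS\n" + body;
    }

    public static void main(String[] args) {
        System.out.println(viewDefinition("object_repository",
            List.of("objects_a1", "objects_b2")));
    }
}
```

Because callers only ever see the view, adding a new table amounts to regenerating this statement with one more name in the list and replacing the existing view.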
I've never found a great end-all solution for this type of problem. One thing you can try, if your DB supports partitioning (or even if it doesn't, you can implement the same concept yourself), is to partition this log table by object type, and then further partition by date/time or by your object ID (if your ID is numeric this works nicely; I'm not sure how a GUID would partition).
This will help keep the size of each table in check and keep all transactions related to a single instance of an object together.
One idea you could explore is instead of storing each field in a name value pair table, you could store the data as a blob (either text or binary). For example serialize the object to Xml and store it in a field.
The downside of this is that as your object changes you have to consider how this affects all historical data. If you're using XML, then there are easy ways to update the historical XML structures; if you're using binary, there are ways, but you have to be more conscious of the effort.
I've had awesome success storing a rather complex object model that has tons of interrelations as a blob (the XML serializer in .NET didn't handle the relationships between the objects). I could very easily see myself storing the binary data. A huge downside of storing it as binary data is that to access it you have to take it out of the database, whereas with XML, if you're using a modern database like MSSQL, you can access the data directly in the database.
One last approach is to split the two patterns: you could define a Difference Schema (and I assume more than one property changes at a time), so for example imagine storing this XML:
<objectDiff>
<field name="firstName" newValue="Josh" oldValue="joshua"/>
<field name="lastName" newValue="Box" oldValue="boxer"/>
</objectDiff>
This will help reduce the number of rows, and if you're using MSSQL you can define an XML Schema and get some of the rich querying ability around the object. You can still partition the table.
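A minimal Java sketch of producing such a diff document from the old and new state of an object might look like this. The class ObjectDiff and the representation of object state as a map of field names to string values are assumptions made for illustration. Fields that exist only in the old state are ignored here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: renders one <field> element per changed property.
final class ObjectDiff {

    static String diff(Map<String, String> oldState, Map<String, String> newState) {
        StringBuilder xml = new StringBuilder("<objectDiff>\n");
        for (Map.Entry<String, String> entry : newState.entrySet()) {
            String oldValue = oldState.get(entry.getKey());
            // Only changed fields are written, which keeps the document small.
            if (!Objects.equals(oldValue, entry.getValue())) {
                xml.append("<field name=\"").append(escape(entry.getKey()))
                   .append("\" newValue=\"").append(escape(entry.getValue()))
                   .append("\" oldValue=\"").append(escape(oldValue))
                   .append("\"/>\n");
            }
        }
        return xml.append("</objectDiff>").toString();
    }

    // Escapes the characters that are unsafe inside an XML attribute value.
    // A missing (null) value is rendered as an empty string.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> before = new LinkedHashMap<>();
        before.put("firstName", "joshua");
        before.put("lastName", "boxer");
        Map<String, String> after = new LinkedHashMap<>();
        after.put("firstName", "Josh");
        after.put("lastName", "Box");
        System.out.println(diff(before, after));
    }
}
```

The resulting string would be stored in a single XML column of the transaction row, replacing the per-field rows of the key/value table.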
Josh
Depending on the characteristics of your specific application an alternative approach is to keep revisions of the entities themselves in their respective tables, together with the who, what, why and when per revision. The who, what and when can still be foreign keys.
Although I would be very careful to use this approach, since this is only viable for applications with a relatively small amount of changes per entity/entity type.
If querying the data is important I would use true Partitioning in SQL Server 2005 and above if you have enterprise edition of SQL Server. We have millions of rows partitioned by year down to day for the current month - you can be as granular as your application demands with a maximum number of 1000 partitions.
Alternatively, if you are using SQL Server 2008 you could look into filtered indexes.
These are solutions that will enable you to retain the simplified structure you have whilst providing the performance you need to query that data.
Splitting/Archiving older changes obviously should be considered.
