JOOQ - Is there any similar tool like SQL 2 jOOQ Parser? - jooq

Since JOOQ 3.6+, they no longer ship with SQL 2 jOOQ Parser. Search on the internet, I can't find the tool SQL 2 jOOQ Parser anywhere.
Just wonder is there any similar tool like SQL 2 jOOQ Parser so we can generate the JOOQ code from native sql?

There's a feature request for this:
https://github.com/jOOQ/jOOQ/issues/6277
From the feature request:
This was already implemented in the past by the https://github.com/sqlparser/sql2jooq third party module, but it suffered from several flaws:
It didn't produce very good jOOQ code
It worked only for MySQL and PostgreSQL
It depended on a third party parser (by Gudu Soft), which was proprietary and not under our control
It was hard to use
The product got zero (!) user feedback over quite some time, which is never a good sign.
Eventually, we'll re-iterate the idea, but it's a lot of work, and there are probably more interesting things that can be done first. The approach most people will choose when writing jOOQ queries is they:
Choose a test driven approach where the feedback cycle is tight, such that executing a query to test if it's correct is done relatively easily
Use views (seriously, use views! Why don't people use views more often?) for your very complex static SQL and query the views from jOOQ

Related

Customize jOOQ dialect to alter the order in which LIMIT and OFFSET are rendered in a statement

I'm using jOOQ to generate queries to run against Athena (AKA PrestoDB/Trino)
To do this, I am using SQLDialects.DEFAULT, and it works because I use very basic query functionalities.
However, jOOQ renders queries like this:
select *
from "Artist"
limit 10
offset 10
God knows why, but the order of limit and offset seem to matter, and the query only works if written with the order swapped:
select *
from "Artist"
offset 10
limit 10
Is there a class I can subclass, to modify the statement render function so that the order of these are swapped? Or any other way of implementing this myself?
A generic solution in jOOQ
There isn't a simple way to change something as fundamental as the SELECT clause order (or any other SELECT clause syntax) so easily in jOOQ, simply, because this was never a requirement for core jOOQ usage, other than supporting fringe SQL dialects. Since the support of a SQL dialect is a lot of work in jOOQ (with all the integration tests, edge cases, etc.) and since market shares of those dialects are low, it has simply never been a priority to improve this in jOOQ.
You may be tempted to think that this is "just" about the order of keywords in this one case. "Only this one case." It never is. It never stops, and the subtle differences in dialects never end. Just look at the jOOQ code base to get an idea of how weirdly different vendors choose to make their dialects. In this particular case, one would think that it seems extremely obvious and simple to make this clause MySQL / PostgreSQL / SQLite compatible, so your best chance is to make a case with the vendor for a feature request. It should be in their own best interest to be more compatible with the market leaders, to facilitate migration.
Workarounds in jOOQ
You can, of course, patch your generated SQL on a low level, e.g. using an ExecuteListener and a simple regex. Whenever you encounter limit (\d+|\?) offset (\d+|\?), just swap the values (and bind values!). This might work reasonably well for top level selects. It's obviously harder if you're using LIMIT .. OFFSET in nested selects, but probably still doable.
Patching jOOQ is always an option. The class you're looking for in jOOQ 3.17 is org.jooq.impl.Limit. It contains all the rendering logic for this clause. If that's your only patch, then it might be possible to upgrade jOOQ. But obviously, patching is a slippery slope, as you may start patching all sorts of clauses, making upgrades impossible.
You can obviously use plain SQL templates for simple cases, e.g. resultQuery("{0} offset {1} limit {2}", actualSelect, val(10), val(10)). This doesn't scale well, but if it's only about 1-2 queries, it might suffice
Using the SQLDialect.DEFAULT
I must warn you, at this point, that the behaviour of SQLDialect.DEFAULT is unspecified. Its main purpose is to produce something when you call QueryPart.toString() on a QueryPart that is not an Attachable, where a better SQLDialect is unavailable. The DEFAULT dialect may change between minor releases (or even patch releases, if there's an important bug in some toString() method), so any implementation you base on this is at risk of breaking with every upgrade.
The most viable long term solution
... would be to have support for these dialects in jOOQ:
#5414 Presto
#11485 Trino

JOOQ vs SQL Queries

I am on jooq queries now...I feel the SQL queries looks more readable and maintainable and why we need to use JOOQ instead of using native SQL queries.
Can someone explains few reason for using the same?
Thanks.
Here are the top value propositions that you will never get with native (string based) SQL:
Dynamic SQL is what jOOQ is really really good at. You can compose the most complex queries dynamically based on user input, configuration, etc. and still be sure that the query will run correctly.
An often underestimated effect of dynamic SQL is the fact that you will be able to think of SQL as an algebra, because instead of writing difficult to compose native SQL syntax (with all the keywords, and weird parenthesis rules, etc.), you can think in terms of expression trees, because you're effectively building an expression tree for your queries. Not only will this allow you to implement more sophisticated features, such as SQL transformation for multi tenancy or row level security, but every day things like transforming a set of values into a SQL set operation
Vendor agnosticity. As soon as you have to support more than one SQL dialect, writing SQL manually is close to impossible because of the many subtle differences in dialects. The jOOQ documentation illustrates this e.g. with the LIMIT clause. Once this is a problem you have, you have to use either JPA (much restricted query language: JPQL) or jOOQ (almost no limitations with respect to SQL usage).
Type safety. Now, you will get type safety when you write views and stored procedures as well, but very often, you want to run ad-hoc queries from Java, and there is no guarantee about table names, column names, column data types, or syntax correctness when you do SQL in a string based fashion, e.g. using JDBC or JdbcTemplate, etc. By the way: jOOQ encourages you to use as many views and stored procedures as you want. They fit perfectly in the jOOQ paradigm.
Code generation. Which leads to more type safety. Your database schema becomes part of your client code. Your client code no longer compiles when your queries are incorrect. Imagine someone renaming a column and forgetting to refactor the 20 queries that use it. IDEs only provide some degree of safety when writing the query for the first time, they don't help you when you refactor your schema. With jOOQ, your build fails and you can fix the problem long before you go into production.
Documentation. The generated code also acts as documentation for your schema. Comments on your tables, columns turn into Javadoc, which you can introspect in your client language, without the need for looking them up in the server.
Data type bindings are very easy with jOOQ. Imagine using a library of 100s of stored procedures. Not only will you be able to access them type safely (through code generation), as if they were actual Java code, but you don't have to worry about the tedious and useless activity of binding each single in and out parameter to a type and value.
There are a ton of more advanced features derived from the above, such as:
The availability of a parser and by consequence the possibility of translating SQL.
Schema management tools, such as diffing two schema versions
Basic ActiveRecord support, including some nice things like optimistic locking.
Synthetic SQL features like type safe implicit JOIN
Query By Example.
A nice integration in Java streams or reactive streams.
Some more advanced SQL transformations (this is work in progress).
Export and import functionality
Simple JDBC mocking functionality, including a file based database mock.
Diagnostics
And, if you occasionally think something is much simpler to do with plain native SQL, then just:
Use plain native SQL, also in jOOQ
Disclaimer: As I work for the vendor, I'm obviously biased.

Raw SQL with Express without ORMs

I want to use raw SQL queries in my application but I have some questions on how to structure my application.
Some background:
I am writing a JSON API with Express and Postgres.
I am not currently using an ORM. I have used Sequelize before, but I don't believe the queries are optimized so I am hesitant to use it.
I am using camelCase in my code but Postgres is case-insensitive, so for readability, I have used under_scores in my DB tables. I constantly have to do queries like:
SELECT first_name AS "firstName" from users;
When the queries get larger, it is almost impossible to read since there is no syntax highlighting of SQL in js string templates.
I feel there is too much repetition in my queries, but that is expected.
What I am thinking:
I was not able to find a Visual Studio Code extension that can highlight SQL inside js files and strings. If there was one, I might get by.
I might write all my queries in .sql files, so that I can have syntax highlighting and load them all into memory when my application starts to prevent too many IO operations, since it would be against the reasoning why I am using raw SQL in the first place.
Anyone had this issue before? How do you structure your application when using raw SQL with Postgres and Express?
Definitely keep all the sql scripts in the corresponding .sql files.
Stick to a meaningfull naming convention. Come up with one you feel comfortable with: in future it will allow to build helpful tools around your codebase doing a lot of boring stuff automatically and making you much happier.
In case you're getting complicated rapidly, generate at least some of the duplicate/commonly used sql. Consider having some simple placeholders in your files like {{ firstName }} which is translated into first_name AS "firstName". Such a translation should happen only once - when you lode source .sql. This is more sophisticated and highly depends upon your tasks kind. Sometimes such an approach is useless, sometimes - useful.

What (in_memory) graph DB if modeling data is focused

I am out of ideas and hope to get some useful input. I am using this question to compress my experiences and share them, hoping to inspire some distributors to go the next step with modeling graph databases as a first class question/way.
I've been validating some graph database solutions usable by node.js for a few weeks. My use case is to save interactions of different social user network accounts. The need is to use CPU and memory in the most efficient way.
My most important requirements are:
in_memory (at least for indexing)
open source (and free to use)
same JavaScript/Node.js performance as first class citizen
comfortable query and modeling language
Neo4J
I really like cypher so my best choice would be Neo4j.
But the major issue about Neo4j is the JavaScript access is non-native. It uses the REST-API which is about ten times (10x) slower than direct Java access. So I took a look at node-neo4j-embedded, but it has been inactive for more than two years. It looks like its author isn't active at all (bad sign).
ArangoDB
The really nice core developers of ArangoDB answered to my question about internals. Finally it means JavaScript is first class citizen because native queries can be pushed out of JS. Looking at the open source benchmarks, I think it is fair. But I am afraid they didn't use node-neo4j-embedded for their benchmark. The benchmarks compare the REST-APIs (Edited because of #weinberger comment). I wished they compare the native APIs (maybe someone is snoopy enough and give it a try! - let us know!). Update: As I noticed now, OrientDB has answered the benchmark with a new node.js driver (using Command Cache by starting the server with -Dcommand.cache.enabled=true -Dcommand.cache.minExecutionTime=3, what isn't fair, because it wasn't a query caches benchmark!)
Because I like to use ArangoDB as a graph database I would have 3 choices (source: FAQ):
traverse JS objects
using AQLs graph functions
using the REST API
In general it isn't comfortable like cypher. And I am not sure how to compare and what is the right way modeling data (like Neo4J explains very well). I'd love to have something like this for ArangoDB Graphs. It feels like ArangoDB is focused on graph operations and Neo4J fits more the needs of using graphs if you have more relations than rows (the reason to use graphs instead of relations with joins).
MongoDB
The document based MongoDB isn't optimized for graph operations but latterly has gotten an experimental in_memory storage engine. Also there are some projects either in_memory or graph related but nothing is really compelling. And at this discussion it looks like MongoDB isn't what I like to use.
OrientDB
Because there is a comparison about OrientDB vs. MongoDB available (from OrientDB) I though about to use this one. "OrientDB has a hybrid Document-Graph engine" using SQL. I am a former PHP/MySQL expert. But where is the modeling part ? Their chapter working with graphs is not cypher like. It is like using SQL for Graphs. There is nothing wrong with that, but using cypher before I miss the modeling like feeling.
If someone did a modeling process with OrientDB and Graphs maybe you could write a tutorial like Neo4J had done.
Update: About JavaScript access like first citizen there are news:
"In the next release the speed of this driver will be comparable to the native Java one" The forked node.js driver had bin fixed last days.
Update: Before choosing OrientDB one might want to read article about some issues and discussions linked from there. The article is touching a sensitive issue and should be approached with critical mind. Note from author of this update: I'm new to editing SO and don't have enough reputation to put this to comments. I believe this information is a valid point to discussion, not sure how to place it here according to SO rules.
LokiJS
Before I was looking at Neo4J, ArangoDB and MongoDB, I played around with that JavaScript based in_memory database called LokiJS, what seams to follow the strategy to ignore everything what slows down performance and efficiency. LokiJS is trying to complete the Mongo-Style (RoadMap). The major issue is the bad ability to scale. Of cause it isn't a graph database but it was an interesting solution while the beginning of my project. Also it wasn't a perfect feeling to find all the distributed documentation (maybe they should reboot with GitBook).
Finally LokiJS is a very interesting project at all and I hope they will go forward!
LevelDB
Previously when I wrote my degree paper I was looking at levelDB. Remembering this while writing this post, I searched for LevelDB in_memory and got a promising result called MemDown (see also). I haven't tested this find, but maybe someone has experiences working and modeling for this solution. Maybe it would be the most efficient way if all the others will not fit because I would simply write a lightweight cypher clone with the goal to stay much lightweight as I can do.
Edit: Due to comment, here is a link to LevelGraph. As an idea to implement a CYPHER parser for LevelGraph/LevelDB your starting point would be to compare
Cypher:
CREATE (SUBJECT:"a") - [b:PREDICATE] -> (OBJECT:"c")
RETURN, subject, predicate, object
LevelGraph:
var RETURN = { SUBJECT: "a", PREDICATE: "b", OBJECT: "c" };
db.put(RETURN, function(err) {
// ..
});
Conclusion
As you likely noticed I am not the super hero about graphs. But this is my initial dive into this and I'm trying to get an overview. I assume there are a lot people out there who want to ask the same questions as me but haven't the time. I hope this post will help a lot people and will change by comments and answers to become a well done overview how to modeling data for graphs.
#editors: You are welcome.
#commenters: This is the result of my personal research - if you also have done a journey like me, please answer with a short summary like I have done for each DB I've evaluated (don't forget to target my 4 goals).
The idea to combine node-style performance through any of the native features (e.g. streams) and a high level query language like CYPHER is actually quite neat.
What you likely won't get is any kind of low level API, since this is rather rare with DB authors and, supposedly, not wanted in their design patterns. So, long running tcp connections shall just serve fine.
cypher-stream since to incorporate all of this, while (superficially judged) maintaining a good style.
Since you likely won't get any further with the search, I'd suggest sending him a pull request if any other features are needed :)
You should take a look at Gundb https://github.com/amark/gun
It's open source and has a very active and helpfull lead developer.
Join us at https://gitter.im/amark/gun

Subsonic 3.0 LINQ Templates with Multiple Databases

I'm evaluating SubSonic 3.0 for use in our business as a replacement for our POCO objects. I'm new to SubSonic, literally installing it yesterday. I've gotten to the point where I can connect to one database using the 3.0 LINQ T4 Templates, and have been wooed by the promise of being able to connect to multiple databases in one application using SubSonic.
My issue is I can't find any documentation on how to use the T4 Templates with multiple databases (e.g. adding another connection string, setting up the Settings.ttinclude etc).
I've searched Google and Stackoverflow for an answer to see how this would be done or if its even possible. Any help would be appreciated.
So I seemed to be able to make it work by adding another connectionString to the web.config, and then adding a 2nd set of templates for that connectionString, it works, but it doesn't seem 'clean' or even really that DRY to me.
It also seems that I could do almost the same thing with the .NET Built in LINQ by adding multiple .dbml files.
Can anyone give me some reasoning at this point why we shouldn't just use the built in LINQ support over a 3rd party ORM like SubSonic?
Cross posting from the subsonic mailing list:
Oh yeah I do this all the time, the trick is two copies of the templates(easy) or editing the templates to iterate over two sets of tables(harder). In the second settings.tt change the name of the connection string to reflect the other database. You might also want to change the namespace so that you don't have conflicts where table names are the same. It seems hacky but I don't think it is because it allows you to make changes to the templates for each database independently.
If you really want only one set of templates the easiest way to go about it is to edit SQLServer.tt (or your choice of database) and override how LoadTables works such that it will accept a list of connections rather than a single one. I have to say this is a pain and it is going to be much harder than having 2 copies of the files.
(In reply to your answer of your question)
Can anyone give me some reasoning at this point why I shouldn't just use the built in LINQ over a 3rd party ORM like SubSonic?
On immediate thought: SubSonic supports more than just Microsoft Sql Server.

Resources