jOOQ XML Database generation

I am manually defining a database XML schema in order to use jOOQ's capabilities to generate the corresponding code from the definition.
I am using Gradle to generate the code with jOOQ:
jooq {
    version = '3.13.5'
    edition = nu.studer.gradle.jooq.JooqEdition.OSS
    configurations {
        crate {
            generationTool {
                logging = org.jooq.meta.jaxb.Logging.INFO
                generator {
                    database {
                        name = 'org.jooq.meta.xml.XMLDatabase'
                        properties {
                            property {
                                key = 'dialect'
                                value = 'POSTGRES'
                            }
                            property {
                                key = 'xmlFile'
                                value = 'src/main/resources/crate_information_schema.xml'
                            }
                        }
                    }
                    target {
                        packageName = 'it.fox.crate'
                        directory = 'src/generated/crate'
                    }
                    strategy.name = "it.fox.generator.CrateGenerationStrategy"
                }
            }
        }
    }
}
and this is the XML file crate_information_schema.xml I am referencing:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<information_schema xmlns="http://www.jooq.org/xsd/jooq-meta-3.14.0.xsd">
    <schemata>
        <schema>
            <catalog_name></catalog_name>
            <schema_name>doc</schema_name>
            <comment></comment>
        </schema>
    </schemata>
    <tables>
        <table>
            <table_catalog></table_catalog>
            <table_schema>doc</table_schema>
            <table_name>events</table_name>
            <table_type>BASE TABLE</table_type>
            <comment></comment>
        </table>
    </tables>
    <columns>
        <column>
            <table_catalog></table_catalog>
            <table_schema>doc</table_schema>
            <table_name>events</table_name>
            <column_name>data_block['angularPositionArray']</column_name>
            <data_type>real_array</data_type>
            <character_maximum_length>0</character_maximum_length>
            <numeric_precision>19</numeric_precision>
            <numeric_scale>0</numeric_scale>
            <ordinal_position>1</ordinal_position>
            <is_nullable>false</is_nullable>
            <comment>angularPositionArray</comment>
        </column>
        <column>
            <table_catalog></table_catalog>
            <table_schema>doc</table_schema>
            <table_name>events</table_name>
            <column_name>data_block['eventId']</column_name>
            <data_type>bigint(20)</data_type>
            <character_maximum_length>0</character_maximum_length>
            <numeric_precision>19</numeric_precision>
            <numeric_scale>0</numeric_scale>
            <ordinal_position>1</ordinal_position>
            <is_nullable>false</is_nullable>
            <comment>eventId</comment>
        </column>
    </columns>
</information_schema>
The generated code is not good, because it indicates that the data type used is unknown:
/**
 * @deprecated Unknown data type. Please define an explicit {@link org.jooq.Binding} to specify how this type should be handled. Deprecation can be turned off using {@literal <deprecationOnUnknownTypes/>} in your code generator configuration.
 */
@java.lang.Deprecated
public final TableField<EventsRecord, Object> angularPositionArray = createField(DSL.name("data_block['angularPositionArray']"), org.jooq.impl.DefaultDataType.getDefaultDataType("\"real_array\"").nullable(false), this, "angularPositionArray");
I have a couple of questions:
Which is the correct data type for a real array?
Where is the list of supported data types, with the keys to use in the XML?
N.B. CrateDB is an unsupported database, but jOOQ can talk to it using the PostgreSQL driver; the only problem is creating the schema manually.

Which is the correct data type for a real array?
Use <data_type>REAL ARRAY</data_type> (with a whitespace, and in upper case; see the comments and issue #12611)
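Applied to the XML from the question, the first column definition becomes (only the data_type element changes):
<column>
    <table_catalog></table_catalog>
    <table_schema>doc</table_schema>
    <table_name>events</table_name>
    <column_name>data_block['angularPositionArray']</column_name>
    <data_type>REAL ARRAY</data_type>
    <character_maximum_length>0</character_maximum_length>
    <numeric_precision>19</numeric_precision>
    <numeric_scale>0</numeric_scale>
    <ordinal_position>1</ordinal_position>
    <is_nullable>false</is_nullable>
    <comment>angularPositionArray</comment>
</column>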
Where is the list of supported data types, with the keys to use in the XML?
It's the same as for any other code generation data source: all the types in SQLDataType are supported. The convention around array types is currently undocumented, but any of HSQLDB's or PostgreSQL's notations should work. The feature request to formally support array types as user-defined types via the standard SQL INFORMATION_SCHEMA.ELEMENT_TYPES is here: https://github.com/jOOQ/jOOQ/issues/8090
N.B. CrateDB is an unsupported database, but jOOQ can talk to it using the PostgreSQL driver; the only problem is creating the schema manually.
You can obviously use the XMLDatabase for this. I'm guessing you cannot use the JDBCDatabase, because the INFORMATION_SCHEMA is too different, and the PG_CATALOG schema doesn't exist? However, you could easily implement your own org.jooq.meta.Database, too, if that makes more sense.

Related

Using full Java object package paths inline in autogenerated code

We have a pretty comic situation: a Postgres DB with a schema called Internal:
public class Internal extends SchemaImpl
Now we have to create an enum, with one of the values being called... Internal. The autogenerated code for this enum doesn't compile due to the collision between the enum value name and the schema class name returned by the getSchema() method:
///
import blabla.jooq.internal.Internal;
///
@Generated(
    value = {
        "http://www.jooq.org",
        "jOOQ version:3.11.12"
    },
    comments = "This class is generated by jOOQ"
)
@SuppressWarnings({ "all", "unchecked", "rawtypes" })
public enum TypeEnum implements EnumType {
    Internal("Internal"), External("External");
    /////
    /**
     * {@inheritDoc}
     */
    @Override
    public Schema getSchema() {
        return Internal.INTERNAL; // <<< the compiler resolves Internal to the enum value, not the schema class's static field, and fails
    }
The two options to fix this are:
Rename the enum (which for some reason we would like to avoid)
Make the autogenerated code include full package paths inline with the object references
Is there any configuration that will let us achieve option 2?
TIA
Bug in the code generator
That looks like a bug in jOOQ's code generator. I've created: https://github.com/jOOQ/jOOQ/issues/13692
Work around by renaming generated objects
You can work around such bugs by using a "generator strategy":
Programmatic: https://www.jooq.org/doc/latest/manual/code-generation/codegen-generatorstrategy/
Configurative: https://www.jooq.org/doc/latest/manual/code-generation/codegen-matcherstrategy/
With such a strategy, you can rename either the schema or enum class name, or both, of course, depending on your tastes:
<configuration>
    <generator>
        <strategy>
            <matchers>
                <schemas>
                    <schema>
                        <expression>INTERNAL</expression>
                        <schemaClass>
                            <transform>PASCAL</transform>
                            <expression>$0_SCHEMA</expression>
                        </schemaClass>
                    </schema>
                </schemas>
                <enums>
                    <enum>
                        <expression>INTERNAL</expression>
                        <enumClass>
                            <transform>PASCAL</transform>
                            <expression>$0_ENUM</expression>
                        </enumClass>
                    </enum>
                </enums>
            </matchers>
        </strategy>
    </generator>
</configuration>
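With the matchers above, the schema class would be renamed to InternalSchema (and an enum named internal to InternalEnum), so a reference like the one from the question would presumably compile again (a sketch, not the exact generator output):
import blabla.jooq.internal.InternalSchema;

public enum TypeEnum implements EnumType {
    Internal("Internal"), External("External");

    @Override
    public Schema getSchema() {
        // the schema class no longer shares its name with the enum constant
        return InternalSchema.INTERNAL;
    }
}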
Work around by avoiding imports for certain types
Alternatively, you can specify which type references should always remain fully qualified, rather than imported, see:
https://www.jooq.org/doc/latest/manual/code-generation/codegen-advanced/codegen-config-generate/codegen-generate-fully-qualified-types/
For example:
<configuration>
    <generator>
        <generate>
            <fullyQualifiedTypes>.*\.INTERNAL</fullyQualifiedTypes>
        </generate>
    </generator>
</configuration>
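With that configuration, the generated code refers to the schema by its fully qualified name rather than through an import, so the enum constant can no longer shadow it (a sketch of the expected output, using the package from the question):
@Override
public Schema getSchema() {
    // fully qualified reference; no import to collide with the enum constant
    return blabla.jooq.internal.Internal.INTERNAL;
}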

Unmarshalling SOS GetCapabilities via JSONIX yields only abstract offering data

I am trying to use Jsonix to unmarshal a GetCapabilities response from SOS_2_0. Below is the code I wrote to unmarshal the XML string. It seems to work; however, not all of the elements are mapped correctly.
function jsonixParseSensors(xmlStr) {
    var module = SOS_2_0_Module_Factory();
    var context = new Jsonix.Context([XLink_1_0, GML_3_2_1, OWS_1_1_0, SWE_2_0, SWES_2_0, WSN_T_1, WS_Addr_1_0_Core, OM_2_0, ISO19139_GMD_20070417, ISO19139_GCO_20070417, ISO19139_GSS_20070417, ISO19139_GTS_20070417, ISO19139_GSR_20070417, Filter_2_0, SOS_2_0]);
    var unmarshaller = context.createUnmarshaller();
    var data = unmarshaller.unmarshalString(xmlStr);
    return data;
}
In the screenshot (not shown), it is apparent that all of the 'offerings' in 'contents' are defaulted to the abstract type (SWES_2_0.AbstractContentsType.Offering) and carry no information about the sensor/observation offering. It's odd, because other elements such as 'filtercapabilities' contain all their info and attributes. I have tried this both with and without passing namespace arguments to unmarshalString, and it does not seem to make a difference. Is there something I am fundamentally misunderstanding?
SOS GetCapabilities xml from Botts-Geo
SOS GetCapabilities xml from Sensiasoft
The problem was in the SWES_2_0 mapping. The abstractOffering property of the SWES_2_0.AbstractContentsType.Offering type was generated as an "element" property:
{
    ln: 'AbstractContentsType.Offering',
    tn: null,
    ps: [{
        n: 'abstractOffering',
        rq: true,
        en: 'AbstractOffering',
        ti: '.AbstractOfferingType'
    }]
}
This should have been an "element reference" property to allow the swes:AbstractOffering element to be replaced by other elements via substitution groups.
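The corrected mapping would declare the property as an element reference instead, roughly like this (a sketch; 'er' is the compact notation for element-reference properties in recent Jsonix versions, so the exact syntax in your generated mappings may differ):
{
    ln: 'AbstractContentsType.Offering',
    tn: null,
    ps: [{
        t: 'er', // element reference: substitution group members are resolved
        n: 'abstractOffering',
        rq: true,
        en: 'AbstractOffering',
        ti: '.AbstractOfferingType'
    }]
}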
This should be fixed now in ogc-schemas trunk, see the test.

Setting types of parsed values in Antlr

I have a rule that looks like this:
INTEGER : [0-9]+;
myFields : uno=INTEGER COMMA dos=INTEGER ;
Right now, to access uno, I need to write:
Integer i = Integer.parseInt(myFields.uno.getText());
It would be much cleaner if I could tell ANTLR to do that conversion for me; then I would just need to write:
Integer i = myFields.uno;
What are my options?
You could write the code as an action, but it would still be an explicit conversion (eventually). The parser (like every parser) parses the text, and then it's up to the "parsing events" (achieved by a listener, a visitor, or actions in ANTLR4) to create meaningful structures/objects.
Of course you could extend some of the generated or built-in classes and then get the type directly, but as mentioned before, at some point you'll always need to convert text to the type needed.
A standard way of handling custom operations on tokens is to embed them in a custom token class:
public class MyToken extends CommonToken {
    ....

    public Integer getInt() {
        return Integer.parseInt(getText()); // TODO: error handling
    }
}
Also create
public class MyTokenFactory implements TokenFactory<MyToken> { .... }
to source the custom tokens. Add the factory to the lexer using Lexer#setTokenFactory().
Within the custom TokenFactory, override the method
Symbol create(int type, String text); // (typically override both factory methods)
to construct and return a new MyToken.
Given that the signature includes the target token type type, custom type-specific token subclasses could be returned, each with their own custom methods.
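A minimal sketch of such a factory (assuming MyToken declares constructors mirroring CommonToken's):
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.TokenFactory;
import org.antlr.v4.runtime.TokenSource;
import org.antlr.v4.runtime.misc.Pair;

public class MyTokenFactory implements TokenFactory<MyToken> {

    @Override
    public MyToken create(Pair<TokenSource, CharStream> source, int type, String text,
            int channel, int start, int stop, int line, int charPositionInLine) {
        MyToken token = new MyToken(source, type, channel, start, stop);
        token.setLine(line);
        token.setCharPositionInLine(charPositionInLine);
        if (text != null) {
            token.setText(text);
        }
        return token;
    }

    @Override
    public MyToken create(int type, String text) {
        return new MyToken(type, text);
    }
}
Attach it before lexing: lexer.setTokenFactory(new MyTokenFactory());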
There are a couple of issues with this, though. First, in practice it is typically not needed: the assignment variable is statically typed, so as in the OP's example,
Integer i = myFields.uno.getInt(); // no cast required
If an Integer is desired and expected, use getInt(). If a Boolean ....
Second, the ANTLR options block allows setting a TokenLabelType to preclude the requirement to manually cast custom tokens:
options { TokenLabelType = "MyToken"; }
Only one token label type is supported, so to use multiple token types, manual casting is required.

How to define a CAS in database as external resource for an annotator in uimaFIT?

I am trying to structure my data processing pipeline using uimaFIT as follows:
[annotatorA] => [Consumer to dump annotatorA's annotations from CAS into DB]
[annotatorB (should take on annotatorA's annotations from DB as input)]=>[Consumer for annotatorB]
The driver code:
/* Step 0: Create a reader */
CollectionReader readerInstance = CollectionReaderFactory.createCollectionReader(
        FilePathReader.class, typeSystem,
        FilePathReader.PARAM_INPUT_FILE, "/path/to/file/to/be/processed");

/* Step 1: Define annotator A */
AnalysisEngineDescription annotatorAInstance =
        AnalysisEngineFactory.createPrimitiveDescription(
                annotatorADbConsumer.class, typeSystem,
                annotatorADbConsumer.PARAM_DB_URL, "localhost",
                annotatorADbConsumer.PARAM_DB_NAME, "xyz",
                annotatorADbConsumer.PARAM_DB_USER_NAME, "name",
                annotatorADbConsumer.PARAM_DB_USER_PWD, "pw");
builder.add(annotatorAInstance);

/* Step 2: Define binding for annotator B to take
   what annotator A put in the DB above as input */

/* Step 3: Define annotator B */
AnalysisEngineDescription annotatorBInstance =
        AnalysisEngineFactory.createPrimitiveDescription(
                GateDateTimeLengthAnnotator.class, typeSystem);
builder.add(annotatorBInstance);

/* Step 4: Run the pipeline */
SimplePipeline.runPipeline(readerInstance, builder.createAggregate());
Questions I have are:
Is the above approach correct?
How do we define the dependency on annotatorA's output in annotatorB in step 2?
Is the approach suggested at https://code.google.com/p/uimafit/wiki/ExternalResources#Resource_injection the right direction to achieve it?
You can define the dependency with @TypeCapability, like this:
@TypeCapability(inputs = { "com.myproject.types.MyType", ... }, outputs = { ... })
public class MyAnnotator extends JCasAnnotator_ImplBase {
    ....
}
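Applied to the pipeline from the question, annotator B could declare annotator A's output types as its inputs, for example (a sketch with a hypothetical type name, using the current Apache uimaFIT packages; the older org.uimafit packages are analogous):
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.TypeCapability;
import org.apache.uima.jcas.JCas;

// "com.myproject.types.DateTimeAnnotation" is a hypothetical type produced by annotator A
@TypeCapability(inputs = { "com.myproject.types.DateTimeAnnotation" })
public class GateDateTimeLengthAnnotator extends JCasAnnotator_ImplBase {

    @Override
    public void process(JCas jCas) throws AnalysisEngineProcessException {
        // consume the annotations that annotator A produced upstream
    }
}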
Note that it defines a contract at the annotation level, not the engine level (meaning that any Engine could create com.myproject.types.MyType).
I don't think there are ways to enforce it.
I did create some code to check that an engine is provided with the required annotations upstream of a pipeline, printing an error log otherwise (see Pipeline.checkAndAddCapabilities() and Pipeline.addCapabilities()). Note, however, that it only works if all engines define their TypeCapabilities, which is often not the case when one uses external engines/libraries.

ektorp / CouchDB mix HashMap and Annotations

In jcouchdb I used to extend BaseDocument and could then, in a transparent manner, mix annotated and undeclared fields.
Example:
import org.jcouchdb.document.BaseDocument;

public class SiteDocument extends BaseDocument {
    private String site;

    @org.svenson.JSONProperty(value = "site", ignoreIfNull = true)
    public String getSite() {
        return site;
    }

    public void setSite(String name) {
        site = name;
    }
}
and then use it:
// Create a SiteDocument
SiteDocument site2 = new SiteDocument();
site2.setProperty("site", "http://www.starckoverflow.com/index.html");
// Set value using setSite
site2.setSite("www.stackoverflow.com");
// and using setProperty
site2.setProperty("description", "Questions & Answers");
db.createOrUpdateDocument(site2);
Here I use both a document field (site) that is defined via an annotation and a property field (description) that is not declared, and both get serialized when I save the document.
This is convenient for me since I can work with semi-structured documents.
When I try to do the same with Ektorp, I have documents using annotations and documents using a HashMap, but I couldn't find an easy way to get a mix of both (I've tried writing my own serializers, but that seems like too much work for something I get for free in jcouchdb). I also tried annotating a HashMap field, but then it is serialized as an object: the fields are saved automatically, but nested inside an object with the name of the HashMap field.
Is it possible to do this (easily/for free) using Ektorp?
It is definitely possible. You have two options:
Base your class on org.ektorp.support.OpenCouchDbDocument
Annotate your class with @JsonAnySetter and @JsonAnyGetter. Read more here: http://wiki.fasterxml.com/JacksonFeatureAnyGetter
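A minimal sketch of the second option (assuming a recent Ektorp with Jackson 2 annotations; older versions use the org.codehaus.jackson equivalents):
import java.util.HashMap;
import java.util.Map;

import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import org.ektorp.support.CouchDbDocument;

public class SiteDocument extends CouchDbDocument {

    private String site;
    private Map<String, Object> dynamic = new HashMap<String, Object>();

    public String getSite() {
        return site;
    }

    public void setSite(String name) {
        site = name;
    }

    // serializes each map entry as a top-level JSON field of the document
    @JsonAnyGetter
    public Map<String, Object> getDynamicProperties() {
        return dynamic;
    }

    // catches any JSON field without a matching setter and stores it in the map
    @JsonAnySetter
    public void setDynamicProperty(String name, Object value) {
        dynamic.put(name, value);
    }
}
Here setDynamicProperty("description", "Questions & Answers") plays the role of jcouchdb's setProperty. This is essentially what org.ektorp.support.OpenCouchDbDocument already does for you in option 1.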
