When using jOOQ's code generator, it splits generated code into nested classes to prevent overly large static initialisers, so Indexes.java may contain nested classes such as Indexes0, Indexes1, ...
With the generator at <version>3.12.3</version>, it generates two nested classes, Indexes0 and Indexes1; Indexes0 contains 501 indexes and Indexes1 contains 264.
With <version>3.13.4</version>, it only generates a single nested class Indexes0, containing 387 indexes.
Is this a bug in the jOOQ generator, or do I need to add some other configuration?
jOOQ 3.13 introduced a new <includeSystemIndexes/> flag, defaulting to false, see #9601. This flag turns off the generation of code for system-generated indexes, i.e. indexes you wouldn't want to be declared explicitly when calling Meta.ddl() on your generated code.
If you want those indexes to be included, just turn the flag back on. However, I doubt you need them.
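If you do want them back, the flag belongs in the code generator's <database/> configuration. A minimal sketch of the relevant fragment (assuming the standard XML-based codegen configuration layout):

<generator>
  <database>
    <!-- re-enable code generation for system-generated indexes (default: false since 3.13) -->
    <includeSystemIndexes>true</includeSystemIndexes>
  </database>
</generator>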
The problem I have is how to map a jOOQ select query to an object. If I use the default jOOQ mapper, it works, but all fields must be mentioned, and in exact order. If I use SimpleFlatMapper, I have problems with MULTISET.
The problem with SimpleFlatMapper:
class Student {
    private final String id;
    Set<String> bookIds;
}
private static final SelectQueryMapper<Student> studentMapper =
        SelectQueryMapperFactory.newInstance().newMapper(Student.class);
var students = studentMapper.asList(
    context.select(
                STUDENT.ID.as("id"),
                multiset(
                    select(BOOK.ID).from(BOOK).where(BOOK.STUDENT_ID.eq(STUDENT.ID))
                ).convertFrom(r -> r.intoSet(BOOK.ID)).as("bookIds"))
           .from(STUDENT).where(STUDENT.ID.eq("<id>"))
);
SimpleFlatMapper, for the attribute bookIds, returns:
a Set of exactly one String ["[[book_id_1], [book_id_2]]"] instead of ["book_id_1", "book_id_2"].
As I already mentioned, this works with the default jOOQ mapper, but in my case not all attributes are mentioned in the projected columns, and it is possible that some attributes will be added to the class which are not present in the table.
The question is: is there any way to tell SimpleFlatMapper that the mapping is one-to-one (Set to Set), or to have a default jOOQ mapper which ignores non-matching and out-of-order fields?
Also, what is the best approach in these situations?
Once you start using jOOQ's MULTISET and ad-hoc conversion capabilities, I doubt you still need third parties like SimpleFlatMapper, which I don't think can deserialise jOOQ's internally generated JSON serialisation format (which is currently an array of arrays, not an array of objects, but there's no specification for this format, and it might change in any version).
Just use ad-hoc converters.
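For the record, here's roughly what that could look like, mapping directly into Student without SimpleFlatMapper. This is a sketch: it assumes a Student(String id, Set<String> bookIds) constructor (not shown in the question) and uses org.jooq.Records.mapping (available since jOOQ 3.15):

List<Student> students = context
    .select(
        STUDENT.ID,
        multiset(
            select(BOOK.ID).from(BOOK).where(BOOK.STUDENT_ID.eq(STUDENT.ID))
        ).convertFrom(r -> r.intoSet(BOOK.ID)))
    .from(STUDENT)
    .where(STUDENT.ID.eq("<id>"))
    .fetch(Records.mapping(Student::new));   // type-safe constructor-reference mapping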
If I use default jooq mapper, it works but all fields must be mentioned, and in exact order
You should see that as a feature, not a bug. It increases type safety and forces you to think about your exact projection, helping you avoid projecting too much data (which would heavily slow down your queries!)
But you don't have to use the programmatic RecordMapper approach that is currently being advocated in the jOOQ manual and blog posts. The "old" reflective DefaultRecordMapper will continue to work, where you simply need matching column aliases and target type getters/setters/member names.
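A sketch of that reflective variant, assuming the column aliases match Student's member names (which the query from the question already does with "id" and "bookIds"):

List<Student> students = context
    .select(
        STUDENT.ID.as("id"),
        multiset(
            select(BOOK.ID).from(BOOK).where(BOOK.STUDENT_ID.eq(STUDENT.ID))
        ).convertFrom(r -> r.intoSet(BOOK.ID)).as("bookIds"))
    .from(STUDENT)
    .fetchInto(Student.class);   // DefaultRecordMapper matches aliases to members by name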
If I run this query:
FOR i IN [
my_collection[*].my_prop,
my_other_collection[*].my_prop,
]
RETURN i
I get this error:
Query: AQL: collection not found: my_other_collection (while parsing)
It's true that 'my_other_collection' may not exist, but I still want the result from 'my_collection'.
How can I make this error non-blocking?
A missing collection will cause an error, and this cannot be ignored or suppressed. There is also no concept of late-bound collections, which would allow you to evaluate a string as a collection reference at runtime. In short: this is not supported.
The question is why you would want to use such a pattern in the first place. Assuming both collections exist, the query would materialize the full array before returning anything, which is presumably memory-intensive.
It would be much better to either keep the documents of both collections in a single collection (you may add an extra attribute type to distinguish between them) or to use an ArangoSearch view, so that you can search indexed attributes across collections.
Beyond the two methods already mentioned in the previous answer (ArangoSearch and a single collection), you could do this in JavaScript, probably inside Foxx:
Check if the collections exist with db._collection(collection_name).
Then build the query using aql fragments and use UNION() to merge the fragments to pull results from the different collections.
Note that, if the collections are large, you will probably want to filter the results instead of just pulling all the documents. A sketch of this approach follows below.
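A rough sketch (db._collection() and db._query() are standard @arangodb APIs; for simplicity, this merges the per-collection results in JavaScript rather than with AQL's UNION(), which requires at least two arrays):

const { db, aql } = require("@arangodb");

// The two collection names from the question
const names = ["my_collection", "my_other_collection"];

// Keep only the collections that actually exist
const existing = names.filter((name) => db._collection(name) !== null);

// Query each existing collection separately and merge the results
let results = [];
for (const name of existing) {
  results = results.concat(
    db._query(aql`
      FOR doc IN ${db._collection(name)}
      RETURN doc.my_prop
    `).toArray()
  );
}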
For example, one of my entities has two sets of IDs.
One that is continuous (which apparently is necessary to create the EntitySet), and one to use as a foreign key when merging with my other table.
This results in featuretools including the ID in the set of features to aggregate. SUM(ID) isn't a feature I am interested in though.
Is there a way to exclude certain features when running deep feature synthesis?
There are three ways to exclude features when calling ft.dfs.
Use the ignore_variables parameter to specify variables in an entity that should not be used to create features. It is a dictionary mapping an entity id to a list of variable names to ignore.
Use drop_contains to drop features that contain any of the strings listed in this parameter.
Use drop_exact to drop features that exactly match any of the strings listed in this parameter.
Here is an example usage of all three in a ft.dfs call:
ft.dfs(target_entity="customers",
       ignore_variables={
           "transactions": ["amount"],
           "customers": ["age", "gender", "date_of_birth"]
       },  # ignore these variables
       drop_contains=["customers.SUM("],  # drop features that contain these strings
       drop_exact=["STD(transactions.quantity)"],  # drop features named exactly this
       ...
      )
These 3 parameters are all documented here.
The final thing to consider, if you are getting features you don't want, is the variable types of the variables in your entity set. If you are seeing the sum of an ID variable, that must mean that featuretools thinks the ID variable is a numeric value. If you tell featuretools it is an ID, it will not apply a numeric aggregation to it.
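For example, with the pre-1.0 featuretools API, the variable type can be declared when the dataframe is added to the entity set. A sketch (customers_df and the column names are placeholders):

import featuretools as ft
from featuretools import variable_types as vtypes

es = ft.EntitySet(id="my_data")

# Declaring the foreign-key column as an Id keeps featuretools from
# treating it as numeric, so aggregations like SUM() won't be applied to it.
es = es.entity_from_dataframe(
    entity_id="customers",
    dataframe=customers_df,                  # placeholder pandas DataFrame
    index="customer_id",                     # the continuous index
    variable_types={"other_id": vtypes.Id},  # the foreign-key column
)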
I am a new ArangoDB user and I am using the following query
FOR i IN meteo
FILTER
i.`POM` == "Maxial"
&& TO_NUMBER(i.`TMP`) < 4.2
&& DATE_TIMESTAMP(i.`DTM`) > DATE_TIMESTAMP("2014-12-10")
&& DATE_TIMESTAMP(i.`DTM`) < DATE_TIMESTAMP("2014-12-15")
RETURN
i.`TMP`
on a collection of 2 million documents. It has an index on the three fields that are filtered. It takes approx. 9 seconds in the web interface.
Is it possible to run it faster?
Thank you
Hugo
I have no access to the underlying data and data distribution, nor the exact index definitions, so I can only give rather general advice:
1. Use the explain() command in order to see if the query makes use of indexes, and if yes, which ones.
2. If explain() shows that no index is used, check whether the attributes contained in the query's FILTER conditions are actually indexed. There is the db.<collection>.getIndexes() command to check which attributes are indexed.
3. If indexes are present but not used by the query, the indexes may have the wrong type. For example, a hash index will only be used for equality comparisons (i.e. ==) and not for other comparison types (<, <=, >, >= etc.). A hash index will also only be used if all the indexed attributes are used in the query's FILTER conditions. A skiplist index will only be used if at least its first attribute is used in a FILTER condition. If further skiplist index attributes are specified in the query (from left to right), they may also be used and allow filtering more documents.
4. Only a single index will be picked when scanning a collection. Having multiple, separate indexes on "POM", "TMP", and "DTM" won't help this query, because it will only use one of them per collection that it iterates over. Instead, try to put multiple attributes into a single index if the query can benefit from it.
5. The more selective an index is, the better. For example, an index on a single attribute may filter a lot of documents, but a combined index on multiple attributes may filter even more. For this particular query, a skiplist index on [ "POM", "DTM" ] may be the right choice (in combination with 6.).
6. The only attribute for which the optimizer may consider an index lookup in the given original query is the "POM" attribute. The reason is that the other attributes are used inside function calls (i.e. TO_NUMBER(), DATE_TIMESTAMP()). In general, indexes will not be used for attributes which are wrapped in functions; e.g. for TO_NUMBER(i.TMP) < 4.2 no index will be used, and the same goes for DATE_TIMESTAMP(i.DTM) > DATE_TIMESTAMP("2014-12-10"). Modifying the conditions so the indexed attributes are compared directly to a constant or a one-time calculated value can enable more candidate indexes. If possible, rewrite the conditions so that only the indexed attribute is present on one side of the comparison. For this particular query, it would be better to use i.DTM > "2014-12-10" instead of DATE_TIMESTAMP(i.DTM) > DATE_TIMESTAMP("2014-12-10").
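Putting 5. and 6. together, a sketch (arangosh syntax; this assumes DTM is stored as an ISO-8601 date string, which the string comparison in 6. implies):

// create one combined skiplist index instead of three separate ones
db.meteo.ensureIndex({ type: "skiplist", fields: [ "POM", "DTM" ] });

And the query rewritten so the indexed attributes are not wrapped in function calls:

FOR i IN meteo
  FILTER i.POM == "Maxial"
    && i.DTM > "2014-12-10"
    && i.DTM < "2014-12-15"
    && TO_NUMBER(i.TMP) < 4.2
  RETURN i.TMP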
I know where we can define Chef attributes and attribute types, and also their precedence levels. I just want to understand how they are stored internally.
Suppose I declare an attribute
default[:app][:install] = "/etc/app"
1) How is it stored internally? Is it a tree structure (hierarchy) in the node object, or hashmaps, or a list of variables in the node object?
2) Also, in most of the cookbooks I see that attributes are declared in 2 or 3 levels, as above. I don't understand whether this is a standard or a best practice. Are there any guidelines for the way attributes have to be declared? Does it have something to do with their internal storage? Can't I declare the attribute as
default[:appinstall] = "/etc/app"
and access it as below in my recipe?
node[:appinstall]
Just four Mashes (Mash is a subclass of Hash which handles the string-vs-symbol key fixups), one per precedence level: default, normal, override, and automatic. When you access the merged view via node['foo'], it uses a Chef::Node::Attribute object to traverse all four in parallel until it finds a leaf value.
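To make the "four Mashes" concrete, a quick sketch using the standard node APIs (the attribute names are illustrative):

node.default['app']['install']  = '/etc/app'   # default precedence level
node.override['app']['install'] = '/opt/app'   # override precedence level

node['app']['install']   # => '/opt/app' -- the merged view; override beats default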
What you have shown is correct for setting and using attributes, though string keys are preferred over symbols. You should also in general scope your attributes with the name of the cookbook like:
default['mycookbook']['appinstall'] = '/etc/app'
This will reduce the chances of collisions with other cookbooks.
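Putting it together, a minimal sketch of the convention (the 'mycookbook' name is from the example above; the file layout is the standard cookbook structure):

# attributes/default.rb
default['mycookbook']['appinstall'] = '/etc/app'

# recipes/default.rb -- read the merged value back with string keys
directory node['mycookbook']['appinstall'] do
  owner 'root'
  mode  '0755'
end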