Suppose I have two tables USER_GROUP and USER_GROUP_DATASOURCE. I have a classic relation where one userGroup can have multiple dataSources, and a dataSource is simply a String.
Due to some reasons, I have a custom RecordMapper creating a Java UserGroup POJO (mainly for compatibility with the other code in the codebase, and to always be explicit about what's happening). This mapper sometimes creates simple POJOs containing data only from the USER_GROUP table, and sometimes also the left-joined dataSources.
Currently, I am trying to write the Multiset query along with the custom record mapper. My query thus far looks like this:
List<UserGroup> userGroups = ctx
    .select(
        asterisk(),
        multiset(select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
            .from(USER_GROUP_DATASOURCE)
            .where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID))
        ).as("datasources").convertFrom(r -> r.map(Record1::value1))
    )
    .from(USER_GROUP)
    .where(condition)
    .fetch(new UserGroupMapper());
Now my question is: How to create the UserGroupMapper? I am stuck right here:
public class UserGroupMapper implements RecordMapper<Record, UserGroup> {

    @Override
    public UserGroup map(Record rec) {
        UserGroup grp = new UserGroup(rec.getValue(USER_GROUP.ID),
            rec.getValue(USER_GROUP.NAME),
            rec.getValue(USER_GROUP.DESCRIPTION),
            javaParseTags(rec.getValue(USER_GROUP.TAGS))
        );

        // Convention: if we have an additional field "datasources", we assume it to be
        // a list of dataSources to be filled in
        if (rec.indexOf("datasources") >= 0) {
            // How to make `rec.getValue` return my List<String>????
            List<String> dataSources = ?????;
            grp.dataSources.addAll(dataSources);
        }
        return grp;
    }
}
My guess is to have something like List<String> dataSources = rec.getValue(..), where I pass in a Field<List<String>>, but I have no clue how I could create such a Field<List<String>> with something like DSL.field().
How to get a type-safe reference to your field from your RecordMapper
There are mostly two ways to do this:
Keep a reference to your multiset() field definition somewhere, and reuse that. Keep in mind that every jOOQ query is a dynamic SQL query, so you can use this feature of jOOQ to assign arbitrary query fragments to local variables (or return them from methods) in order to improve code reuse. A sketch of this approach follows below.
You can just raw-type-cast the value and not care about type safety. It's always an option, even if not the cleanest one.
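For approach 1, here is a minimal sketch, assuming jOOQ 3.17+, static imports of the generated tables from the question, and a hypothetical UserGroupQueries holder class. The key point is that the same Field<List<String>> reference is used both in the projection and inside the mapper, so Record.get(Field<T>) returns the already-converted List<String>:

import static org.jooq.impl.DSL.asterisk;
import static org.jooq.impl.DSL.multiset;
import static org.jooq.impl.DSL.select;

import java.util.List;
import org.jooq.Condition;
import org.jooq.DSLContext;
import org.jooq.Field;
import org.jooq.Record1;

public class UserGroupQueries {

    // Shared, type-safe reference to the converted multiset field: Field<List<String>>
    static final Field<List<String>> DATASOURCES =
        multiset(select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
            .from(USER_GROUP_DATASOURCE)
            .where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID)))
        .as("datasources")
        .convertFrom(r -> r.map(Record1::value1));

    List<UserGroup> findUserGroups(DSLContext ctx, Condition condition) {
        return ctx
            .select(asterisk(), DATASOURCES)
            .from(USER_GROUP)
            .where(condition)
            .fetch(new UserGroupMapper());
    }
}

Inside UserGroupMapper, the question's placeholder then becomes:

if (rec.indexOf("datasources") >= 0)
    grp.dataSources.addAll(rec.get(DATASOURCES)); // typed as List<String>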
How to improve your query
Unless you're re-using that RecordMapper several times for different types of queries, why not use Java's type inference instead? The main reason you're not getting type information in your output is your asterisk() usage. But what if you did this instead:
List<UserGroup> userGroups = ctx
    .select(
        USER_GROUP, // Instead of asterisk()
        multiset(
            select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
            .from(USER_GROUP_DATASOURCE)
            .where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID))
        ).as("datasources").convertFrom(r -> r.map(Record1::value1))
    )
    .from(USER_GROUP)
    .where(condition)
    .fetch(r -> {
        UserGroupRecord ug = r.value1();
        List<String> list = r.value2(); // Type information available now
        // ...
    });
The above uses jOOQ 3.17+'s support for Table as SelectField. There are other ways, e.g. in jOOQ 3.16+ you can use row(USER_GROUP.fields()).
The important part is that you avoid the asterisk() expression, which removes type safety. You could even convert the USER_GROUP to your UserGroup type using USER_GROUP.convertFrom(r -> ...) when you project it:
List<UserGroup> userGroups = ctx
    .select(
        USER_GROUP.convertFrom(r -> ...),
        // ...
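To make that last idea concrete, here is a hedged sketch of what the fully converted projection could look like, assuming the UserGroup constructor from the question and the usual generated getters on UserGroupRecord (getId(), getName(), getDescription(), getTags()):

List<UserGroup> userGroups = ctx
    .select(
        // Convert the nested USER_GROUP record straight into the POJO
        USER_GROUP.convertFrom(r -> new UserGroup(
            r.getId(), r.getName(), r.getDescription(), javaParseTags(r.getTags()))),
        multiset(select(USER_GROUP_DATASOURCE.DATASOURCE_ID)
            .from(USER_GROUP_DATASOURCE)
            .where(USER_GROUP.ID.eq(USER_GROUP_DATASOURCE.USER_GROUP_ID)))
        .as("datasources")
        .convertFrom(r -> r.map(Record1::value1)))
    .from(USER_GROUP)
    .where(condition)
    .fetch(rec -> {
        UserGroup grp = rec.value1();          // already a UserGroup
        grp.dataSources.addAll(rec.value2());  // already a List<String>
        return grp;
    });

With that, the custom RecordMapper is no longer needed for this particular query; all of the mapping happens in the projection itself.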
After reading the "Asynchronous queries with the Java driver" article on the DataStax blog, I was trying to implement a solution similar to the one in the section called 'Case study: multi-partition query, a.k.a. "client-side SELECT...IN"'.
I currently have code that looks something like this:
public Future<List<ResultSet>> executeMultipleAsync(final BoundStatement statement, final Object... partitionKeys) {
    List<Future<ResultSet>> futures = Lists.newArrayListWithExpectedSize(partitionKeys.length);
    for (Object partitionKey : partitionKeys) {
        Statement bs = statement.bind(partitionKey);
        futures.add(executeWithRetry(bs));
    }
    return Futures.successfulAsList(futures);
}
But I'd like to improve on that. In the CQL query this BoundStatement holds, I'd like to have something that looks like this:
SELECT * FROM <column_family_name> WHERE <param1> = :p1_name AND <param2> = :p2_name AND <partition_key_name> = ?;
I'd like the clients of this method to give me a BoundStatement with already-bound parameters (two parameters in this case) and a list of partition keys. In that case, all I need to do is bind the partition keys and execute the queries. Unfortunately, when I bind the key to this statement, I fail with an error: com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Long provided. The problem is that I end up binding the key to the first parameter rather than the last one, and the first parameter is a string, not a long.
I can solve this either by giving the partition parameter a name, but then I'd have to get that name via a method parameter, or by specifying its index, which again would require an additional method parameter. Either way, if I use the name or the index, I have to bind it with a specific type, for instance bs.setLong("<key_name>", partitionKey);. For some reason, I can't leave it to the BoundStatement to interpret the type of the last parameter.
I'd like to avoid passing the parameter name explicitly and bypass the type problem. Is there anything that can be done?
Thanks!
I've posted the same question on the 'DataStax Java Driver for Apache Cassandra User Mailing List' and got an answer saying that the functionality I'm missing may be added in the next version (2.2) of the DataStax Java driver.
In JAVA-721 (to be introduced in 2.2) we are tentatively planning on adding the following methods to BoundStatement:
public <V> BoundStatement setObject(int i, V v)
public <V> BoundStatement setObject(String name, V v)
and
You can emulate setObject in 2.1:
void setObject(BoundStatement bs, int position, Object object,
        ProtocolVersion protocolVersion) {
    DataType type = bs.preparedStatement().getVariables().getType(position);
    ByteBuffer buffer = type.serialize(object, protocolVersion);
    bs.setBytesUnsafe(position, buffer);
}
To avoid passing the parameter name, one thing you could do is look
for a position that isn't bound yet:
int findUnsetPosition(BoundStatement bs) {
    int size = bs.preparedStatement().getVariables().size();
    for (int i = 0; i < size; i++)
        if (!bs.isSet(i))
            return i;
    throw new IllegalArgumentException("found no unset position");
}
I don't recommend it though, because it's ugly and unpredictable if
the user forgot to bind one of the non-PK variables.
The way I would do it is require the user to pass a callback that sets
the PK:
interface PKBinder<T> {
    void bind(BoundStatement bs, T pk);
}

public <T> Future<List<ResultSet>> executeMultipleAsync(final BoundStatement statement,
        PKBinder<T> pkBinder, final T... partitionKeys)
As a bonus, this will also work with composite partition keys.
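For illustration, a usage sketch of that callback-based variant. The table and bind-marker names (users_by_status, p1_name, p2_name, user_id) are made up for the example, and the query shape simply mirrors the one from the question; the driver calls used are plain BoundStatement setters:

PreparedStatement ps = session.prepare(
    "SELECT * FROM users_by_status WHERE status = :p1_name AND region = :p2_name AND user_id = :user_id");

// The caller binds the non-PK parameters up front...
BoundStatement bs = ps.bind()
    .setString("p1_name", "ACTIVE")
    .setString("p2_name", "EU");

// ...and the binder sets only the partition key, with its concrete type, per query
Future<List<ResultSet>> results = executeMultipleAsync(bs,
    new PKBinder<Long>() {
        @Override
        public void bind(BoundStatement statement, Long pk) {
            statement.setLong("user_id", pk);
        }
    },
    42L, 43L, 44L);

This keeps the type decision (setLong here) with the caller, which is also what makes composite partition keys work.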
Background: the project is a data import utility for importing data from TSV files into an EF5 database through DbContext.
Problem: I need to do a lookup for foreign keys while doing the import. I have a way to do that, but the retrieval of the ID is not functioning.
So I have a TSV file; an example would be:
Code       Name       MyFKTableId
codevalue  namevalue  select * from MyFKTable where Code = 'SE'
So when I process the file and find a '...Id' column, I know I need to do a lookup to find the FK. The '...' is always the entity type, so this is super simple. The problem I have is that I don't have access to the properties of the result of foundEntity:
string childEntity = column.Substring(0, column.Length - 2);
DbEntityEntry recordType = myContext.Entry(childEntity.GetEntityOfReflectedType());
DbSqlQuery foundEntity = myContext.Set(recordType.Entity.GetType()).SqlQuery(dr[column]);
Any suggestion would be appreciated. I need to keep this generic, so we can't use known-type casting. The Id property is accessible from IBaseEntity, so I can cast to that, but all other entity types must not be fixed.
Note: the SQL in the MyFKTableId value is not a requirement. If there is a better option that lets me get away from SqlQuery(), I would be open to suggestions.
SOLVED:
What I did was create a class called IdClass that has only a Guid property for Id, modify my SQL to return only the Id, and then issue the SqlQuery(sql) call on the Database rather than on Set([Type]), like so:
IdClass x = ImportFactory.AuthoringContext.Database.SqlQuery<IdClass>(sql).FirstOrDefault();
With SubSonic 3 / ActiveRecord, is there an easy way to compare two records without having to compare them column by column? For example, I'd like a function that does something like this (without having to write a custom comparer for each table in my database):
public partial class MyTable
{
    public IList<SubSonic.Schema.IColumn> Compare(MyTable m)
    {
        IList<SubSonic.Schema.IColumn> columnsThatDontMatch = new List<SubSonic.Schema.IColumn>();
        if (this.Field1 != m.Field1)
        {
            columnsThatDontMatch.Add(Field1_Column);
        }
        if (this.Field2 != m.Field2)
        {
            columnsThatDontMatch.Add(Field2_Column);
        }
        ...
        return columnsThatDontMatch;
    }
}
In the end, what I really need is a function that tests for equality between two rows, excluding the primary key columns. The pseudo-code above is a more general form of this. I believe that once I get the columns that don't match, I'll be able to check whether any of them are primary key fields.
I've looked through the Columns property without finding anything I can use. Ideally, the solution would be something I can toss into the t4 file and generate for all the tables in my database.
The best way, if you are using SQL Server as your backend (since this can be auto-populated), is to create a computed (derived) column whose definition uses CHECKSUM to hash the values of the "selected" columns, forming a uniqueness check outside of the primary key.
EDIT: if you are not using SQL Server, then this hashing will need to be done in code as you save or edit the row.
[UPDATE] Chosen approach is below, as a response to this question
Hi,
I've been looking around on this subject but I can't really find what I'm looking for...
By code tables I mean things like 'marital status', gender, specific legal or social statuses... More specifically, these types have only a fixed set of properties, and the items are not about to change soon (but could). The properties are an Id, a name and a description.
I'm wondering how to handle these best in the following technologies:
in the database (multiple tables, one table with different code-keys...?)
creating the classes (probably something like inheriting ICode with ICode.Name and ICode.Description)
creating the view/presenter for this: there should be a screen containing all of them, so a list of the types (gender, marital status ...), and then a list of values for that type, with a name & description for each item in the value list.
These are things that appear in every single project, so there must be some best practice on how to handle these...
For the record, I'm not really fond of using enums for these situations... Any arguments on using them here are welcome too.
[FOLLOW UP]
Ok, I've gotten nice answers from CodeToGlory and Ahsteele. Let's refine this question.
Say we're not talking about gender or marital status, whose values will definitely not change, but about "stuff" that has a Name and a Description, but nothing more. For example: social statuses, legal statuses.
UI:
I want only one screen for this: a listbox with the possible NameAndDescription Types (I'll just call them that), a listbox with the possible values for the selected NameAndDescription Type, and then a Name and Description field for the selected NameAndDescription Type item.
How could this be handled in the view & presenters? The difficulty I see is that the NameAndDescription Types would then need to be extracted from the class name.
DB:
What are the pros/cons of multiple lookup tables vs. a single one?
Using database-driven code tables can be very useful. You can do things like define the life of the data (using begin and end dates), add data to the table in real time so you don't have to deploy code, and allow users (with the right privileges, of course) to add data through admin screens.
I would recommend always using an autonumber primary key rather than the code or description. This allows you to use multiple codes (of the same name but different descriptions) over different periods of time. Plus, most DBAs (in my experience) would rather use an autonumber than a text-based primary key.
I would use a single table per coded list. You can put multiple unrelated code lists into one table (using a matrix of sorts), but that gets messy, and I have only found a couple of situations where it was even useful.
Couple of things here:
Use Enumerations that are explicitly clear and will not change. For example, MaritalStatus, Gender etc.
Use lookup tables for items that are not fixed as above and may change or grow/shrink over time.
It is very typical to have lookup tables in the database. Define a key/value object in your business tier that can work with your view/presentation.
I have decided to go with this approach:
CodeKeyManager mgr = new CodeKeyManager();
CodeKey maritalStatuses = mgr.ReadByCodeName(Code.MaritalStatus);
Where:
CodeKeyManager can retrieve CodeKeys from DB (CodeKey=MaritalStatus)
Code is a class filled with constants returning strings, so Code.MaritalStatus = "maritalStatus". These constants map to the CodeKey table > CodeKeyName.
In the database, I have 2 tables:
CodeKey with Id, CodeKeyName
CodeValue with CodeKeyId, ValueName, ValueDescription
DB:
(Diagram of the CodeKey and CodeValue tables: http://lh3.ggpht.com/_cNmigBr3EkA/SeZnmHcgHZI/AAAAAAAAAFU/2OTzmtMNqFw/codetables_1.JPG)
Class Code:
public class Code
{
    public const string Gender = "gender";
    public const string MaritalStatus = "maritalStatus";
}
Class CodeKey:
public class CodeKey
{
    public Guid Id { get; set; }
    public string CodeName { get; set; }
    public IList<CodeValue> CodeValues { get; set; }
}
Class CodeValue:
public class CodeValue
{
    public Guid Id { get; set; }
    public CodeKey Code { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}
I find this by far the easiest and most efficient way:
All code data can be displayed in an identical manner (in the same view/presenter)
I don't need to create tables and classes for every code table that's to come
But I can still get them out of the database easily and use them easily with the CodeKey constants...
NHibernate can handle this easily too
The only thing I'm still considering is throwing out the GUID Ids and using string (nchar) codes for usability in the business logic.
Thanks for the answers! If there are any remarks on this approach, please do!
I lean towards using a table representation for this type of data. Ultimately, if you have a need to capture the data, you'll have a need to store it. For reporting purposes, it is better to have a place you can draw that data from via a key. For normalization purposes, I find single-purpose lookup tables to be easier than a multi-purpose lookup table.
That said, enumerations work pretty well for things that will not change, like gender etc.
Why does everyone want to complicate code tables? Yes, there are lots of them, but they are simple, so keep them that way. Just treat them like every other object. They are part of the domain, so model them as part of the domain, nothing special. If you don't, then when they inevitably need more attributes or functionality, you will have to undo all the code that currently uses them and rework it.
One table per code type, of course (for referential integrity and so that they are available for reporting).
For the classes, again one per code type, of course, because if I write a method to receive a "Gender" object, I don't want to be able to accidentally pass it a "MaritalStatus"! Let the compiler help you weed out runtime errors; that's why it's there. Each class can simply inherit from or contain a CodeTable class or whatever, but that's simply an implementation helper.
For the UI, if it does in fact use the inherited CodeTable, I suppose you could use that to help you out and just maintain it in one UI.
As a rule, don't mess up the database model and don't mess up the business model, but if you want to screw around a bit in the UI model, that's not so bad.
I'd like to consider simplifying this approach even more. Instead of two tables plus a constants class defining codes (Code, CodeKey and CodeValue), how about just one table which contains both the code types and the code values? After all, the code types are just another list of codes.
Perhaps a table definition like this:
CREATE TABLE [dbo].[Code](
    [CodeType] [int] NOT NULL,
    [Code] [int] NOT NULL,
    [CodeDescription] [nvarchar](40) NOT NULL,
    [CodeAbreviation] [nvarchar](10) NULL,
    [DateEffective] [datetime] NULL,
    [DateExpired] [datetime] NULL,
    CONSTRAINT [PK_Code] PRIMARY KEY CLUSTERED
    (
        [CodeType] ASC,
        [Code] ASC
    )
)
GO
There could be a root record with CodeType=0, Code=0 which represents the type for CodeType. All of the CodeType records will have a CodeType=0 and a Code>=1. Here is some sample data that might help clarify things:
SELECT CodeType, Code, CodeDescription AS Description FROM Code
Results:
CodeType Code Description
-------- ---- -----------
0        0    Type
0        1    Gender
0        2    Hair Color
1        1    Male
1        2    Female
2        1    Blonde
2        2    Brunette
2        3    Redhead
A check constraint could be added to the Code table to ensure that a valid CodeType is entered into the table:
ALTER TABLE [dbo].[Code] WITH CHECK ADD CONSTRAINT [CK_Code_CodeType]
CHECK (([dbo].[IsValidCodeType]([CodeType])=(1)))
GO
The function IsValidCodeType could be defined like this:
CREATE FUNCTION [dbo].[IsValidCodeType]
(
    @Code INT
)
RETURNS BIT
AS
BEGIN
    DECLARE @Result BIT
    IF EXISTS(SELECT * FROM dbo.Code WHERE CodeType = 0 AND Code = @Code)
        SET @Result = 1
    ELSE
        SET @Result = 0
    RETURN @Result
END
GO
One issue that has been raised is how to ensure that a table with a code column has a proper value for that code type. This too could be enforced by a check constraint using a function.
Here is a Person table which has a gender column. It could be a best practice to name all code columns with the description of the code type (Gender in this example) followed by the word Code:
CREATE TABLE [dbo].[Person](
    [PersonID] [int] IDENTITY(1,1) NOT NULL,
    [LastName] [nvarchar](40) NULL,
    [FirstName] [nvarchar](40) NULL,
    [GenderCode] [int] NULL,
    CONSTRAINT [PK_Person] PRIMARY KEY CLUSTERED ([PersonID] ASC)
)
GO
ALTER TABLE [dbo].[Person] WITH CHECK ADD CONSTRAINT [CK_Person_GenderCode]
CHECK (([dbo].[IsValidCode]('Gender',[GenderCode])=(1)))
GO
IsValidCode could be defined this way:
CREATE FUNCTION [dbo].[IsValidCode]
(
    @CodeTypeDescription NVARCHAR(40),
    @Code INT
)
RETURNS BIT
AS
BEGIN
    DECLARE @CodeType INT
    DECLARE @Result BIT

    SELECT @CodeType = Code
    FROM dbo.Code
    WHERE CodeType = 0 AND CodeDescription = @CodeTypeDescription

    IF (@CodeType IS NULL)
    BEGIN
        SET @Result = 0
    END
    ELSE
    BEGIN
        IF EXISTS(SELECT * FROM dbo.Code WHERE CodeType = @CodeType AND Code = @Code)
            SET @Result = 1
        ELSE
            SET @Result = 0
    END
    RETURN @Result
END
GO
Another function could be created to provide the code description when querying a table that has a code column. Here is an example of querying the Person table:
SELECT PersonID,
       LastName,
       FirstName,
       dbo.GetCodeDescription('Gender', GenderCode) AS Gender
FROM Person
This was all conceived from the perspective of preventing the proliferation of lookup tables in the database and providing one lookup table. I have no idea whether this design would perform well in practice.