How do I specify a primary key on table creation with HaskellDB?

Currently I am using something like this:
dbCreateTable db "MyTable" [ ("Col1", (StringT, False)), ("Col2", (StringT, False)) ]
which works fine, but I'd like to make "Col1" the primary key. Do I need to go back to raw SQL?
Edit:
This still seems to hold:
"The part of creating a database from Haskell itself is not very
useful, for example you cannot express foreign- and primary keys,
indexes and constraints. Even the most simple database will need
one of these."
From http://www.mijnadres.net/published/HaskellDB.pdf

As the edit notes, HaskellDB is not very good at creating tables at the moment. It's best to build a database first, and then extract the info.

Related

Querying a composite key with multiple IN values with jOOQ

I have the following query:
SELECT *
FROM table
WHERE (id, other_id, status)
IN (
(1, 'XYZ', 'OK'),
(2, 'ZXY', 'OK') -- , ...
);
Is it possible to construct this query in a type-safe manner using jOOQ, preferably without generating composite keys? Is it possible to do this using jOOQ 3.11?
My apologies, it seems my Google-fu was not up to par. The opposite of this question can be found here: Use JOOQ to do a delete specifying multiple columns in a "not in" clause
For completeness' sake, so that other Google searches might be more immediately helpful, the solution is:
// can be populated using DSL.row(...) for each entry
Collection<? extends Row3<Long, String, String>> values = ...
dslContext.selectFrom(TABLE)
.where(DSL.row(ID, OTHER_ID, STATUS).in(values))
.fetch();
Relevant jOOQ documentation: https://www.jooq.org/doc/3.14/manual/sql-building/conditional-expressions/in-predicate-degree-n/
Your own answer already shows how to do this with a 1:1 translation from SQL to jOOQ using the IN predicate for degrees > 1.
Starting with jOOQ 3.14, there is also the option of using the new <embeddablePrimaryKeys/> flag in the code generator, which produces embeddable types for all primary keys (and the foreign keys referencing them). This helps ensure you never forget a key column in these queries, which is especially useful for joins.
Your query would look like this:
ctx.selectFrom(TABLE)
.where(TABLE.PK_NAME.in(
new PkNameRecord(1, "XYZ", "OK"),
new PkNameRecord(2, "ZXY", "OK")))
.fetch();
The query generated behind the scenes is the same as yours, using the 3 constraint columns for the predicate. If you add or remove a constraint from the key, the query will no longer compile. A join would look like this:
ctx.select()
.from(TABLE)
.join(OTHER_TABLE)
.on(TABLE.PK_NAME.eq(OTHER_TABLE.FK_NAME))
.fetch();
Or an implicit join would look like this:
ctx.select(OTHER_TABLE.table().fields(), OTHER_TABLE.fields())
.from(OTHER_TABLE)
.fetch();

Join three table using library knex.js

I am using knex.js
Suppose we have three tables:
table1 -- id, name, address
table2 -- id, city, state, table1_id as fk
table3 -- id, housenumber, table1_id as fk
I want to join these three tables using the knex.js library with Node and Express,
so that I get output JSON like this:
{
"id":1,
"name":"abc",
"address:"xyz",
"table2":{"id":1,"city":"ttt","state":"www" }//i want check if table1.id == table2.table1_id then put table details
"table3":[]//if no relation found between table1.id === table3.table1.id then kept it as an array
}
tl;dr: knex is too low-level a tool for what you are trying to do; you should use an ORM for that kind of task.
However, you can do it with a lot of manual work.
First you have to write the query with the proper joins, creating aliases with table prefixes for each column, so that the result data comes back as a flat array, like this:
knex('table1 as t1')
  .join('table2 as t2', 't2.t1_id', 't1.id')
  .select(
    't1.id as t1_id',
    't1.other_column as t1_other_column',
    't2.id as t2_id'
    // ...more columns you want to extract
  )
The result looks something like:
[ { t1_id: 1, t1_other_column: 'foo', t2_id: 4 }, /* ...more rows of flat data... */ ]
Then you need to write JavaScript code to restructure the flat data into nested objects.
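For illustration, a minimal sketch of that restructuring step could look like the following, assuming the columns were aliased with t1_/t2_/t3_ prefixes as above (specific column names such as t1_name or t2_city are placeholders matching the question's schema, not anything knex produces on its own):

// Group flat, prefix-aliased rows into one nested object per table1 row.
function nestRows(flatRows) {
  const byId = new Map();

  for (const row of flatRows) {
    // Create the table1 entry once per id.
    if (!byId.has(row.t1_id)) {
      byId.set(row.t1_id, {
        id: row.t1_id,
        name: row.t1_name,
        address: row.t1_address,
        table2: null,
        table3: [], // stays an empty array when no table3 rows match
      });
    }
    const entry = byId.get(row.t1_id);

    // Attach related rows when the joined columns are present.
    if (row.t2_id != null) {
      entry.table2 = { id: row.t2_id, city: row.t2_city, state: row.t2_state };
    }
    if (row.t3_id != null) {
      entry.table3.push({ id: row.t3_id, housenumber: row.t3_housenumber });
    }
  }

  return [...byId.values()];
}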
But you should not do that kind of work manually. Knex-based ORMs have already implemented general solutions for writing these kinds of queries with far less effort.

Automating/Tracking Knex Migrations and Lucid Models

The Situation
I recently started working on a new project using Node.js. I have a background of using Python/Django and C#/.NET (not a huge fan of the latter). Node is awesome, but I must say I miss the ease of building models and automating migrations in Django. I am currently using the AdonisJS framework, which leverages Knex. Knex is a powerful library, but the migrations all need to be built manually. Additionally, the AdonisJS ORM that manages the Models is independent of Knex (the migration manager). You also do not define field attributes on the Models, which can have benefits for dynamically doing things in the front and back end. All things considered, there is a lot of room for human error, miscommunication, and a boatload more typing required. I know the hot thing these days is to keep it loose and fast, but for this specific project I am looking for a bit more structure than loosely defined models.
Current State
What I have landed on is building a new class called tableModel and a field class to define the fields within the table model. I have already completed this and am successfully writing the migration files leveraging mustache. I plan on also automatically writing the Models, which I shouldn't have a problem with (fingers crossed).
The Problem
Here is where it gets a little tough and where I need help...I need to track what has been added or removed via migration so I can effectively write ups and downs as the tableModels change over time.
So let's say I add a "tableModel" which creates a migration to create table Foo with fields {id (bigint), user_id(int), name(string255)}
Later I want to add a field called description so I would simply add it to my "tableModel" and then run a build command which would build out the migration.
How do I check what has already been created though so I only do an up() for description?
Then I want to remove the name field so I mark it out in my "tableModel" and run a build migration command. How do I check what has been migrated that now needs to be added in to the down().
Edit: I would add a remove field to the up and the corresponding roll back to the down.
Bonus Round
Let's say I want to change user_id from an int to a bigint, because who makes a foreign key just an int? How do I check not just what needs to be added to the up and down, but also whether I need to change a property on a field?
Edit: I would just write the up, and a corresponding rollback to the down.
The Big Question
Basically, how do I determine which "tableModel" classes are dirty?
Possible Solution?
I am thinking that maybe I should capture some type of registry or snapshot and then run the comparison when building the migrations and/or models, then recapture/snapshot. If this is the route, should I store it in a JSON file, write it to the DB itself, or is there another/better option?
If I create the tableModel instances as constants, could I actually write back to the JS file and capture the snapshot as an attribute? If this is an option, is Node's file system the way to go, and what's the best way to do this? Node keeps surprising me, so I wouldn't be baffled if any of these are an option.
Help!
If anyone has gone down this path before or knows of any tools I could leverage, I would greatly appreciate it and thank you in advance. Also, if I am headed in a completely wrong direction, then please let me know, I both handle and appreciate all types of feedback.
Example
Something to note: when I define the "tableModel" for a given migration or model, it is an instance of the class; I am not creating an extended class, since this is not my ORM.
class tableModel {
constructor(tableName, modelName = tableName, fields = []) {
this.tableName = tableName
this.modelName = modelName
this.fields = fields
}
// Bunch of other stuff
}
const fooTableModel = new tableModel('fooTable', 'fooModel', [
  new tableField.stringField('title'),
  new tableField.bigIntField('related_user_id'),
  new tableField.textField('description', 'Testing Default', false, true)
])
which equates to:
tableModel {
tableName: 'fooTable',
modelName: 'fooModel',
fields:
[ stringField {
name: 'title',
type: 'string',
_unique: false,
allow_null: null,
fieldAttributes: {},
default_value: null },
bigIntField {
name: 'related_user_id',
type: 'bigInteger',
_unique: false,
allow_null: null,
fieldAttributes: {},
default_value: 0 },
textField {
name: 'description',
type: 'text',
_unique: false,
allow_null: true,
fieldAttributes: {},
default_value: 'Testing Default' } ] }
You have the up and down notation mixed up. Those are for migrating to the "latest" (runs the up function) and doing rollbacks (runs the down function). Up and down do not relate specifically to adding or dropping table columns.
The migration's up is for any change, and the down is to reverse those changes. So if you wanted to drop a column from some table, you write the command in the up, then write the opposite in the down (you'd add it back in...), such that you can "rollback" and the change is effectively reversed. You have to be careful with such things though, as you can put yourself in a situation where you actually lose data.
Want to add a column? Write it in the up, and drop the column in the down.
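A minimal sketch of such a migration file, assuming a table named foo and a new description column (both names are just examples, not anything from your project):

// migrations/20200101120000_add_description_to_foo.js
exports.up = function (knex) {
  // Forward change: add the new column.
  return knex.schema.alterTable('foo', (table) => {
    table.text('description').nullable();
  });
};

exports.down = function (knex) {
  // Reverse change: drop the column again so the migration can be rolled back.
  return knex.schema.alterTable('foo', (table) => {
    table.dropColumn('description');
  });
};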
One of the major points behind the migrations mechanism is to track the state of changes of your database, as time goes forward. So generally, if you created a table in some migration, then a day or so later you realize you need to drop/add columns, you normally don't go back and edit the existing migration, especially if the migration has already been run. You'd just write a new migration to drop/add your column.
Since you're using knex, there are a couple of "knex" tables that get created. By default the one you're looking for is knex_migrations, unless someone specifically modified the settings to change its name. This table holds all the migrations that have run against your DB, per batch.
From the CLI, assuming you have knex.js installed globally, you can run knex migrate:latest, and that will push all the migrations that exist in your directory to the target database, if they have not yet been run. It does this by examining that knex_migrations table. If you roll out a change and don't like it, and assuming you've properly written the down function, you can invoke knex migrate:rollback to reverse the change.
If there are 3 migration files that have NOT yet been run, invoking knex migrate:latest will run all 3 of those migration files under a new batch #, which is 1 higher than the most recent batch number. Conversely, if you invoke knex migrate:rollback, it will find the highest batch number (there could be more than 1 migration in a batch...), and invoke the down function on all those files, effectively rolling back those changes.
All that said, knex is a "query builder" tool. It has a ton of helper functions to help build the SQL for you. Personally, I find this to be a major distraction. Why spend hours upon hours figuring out all the helper functions when I can just crank out raw SQL and run that? Thus, that's what we've done in our system: we use knex.raw('') and write our own DDL and DML. It works great and does exactly what we need it to. We don't need to go figure out the magic of the query building.
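For illustration, the same kind of migration written in that raw-SQL style might look roughly like this (the DDL is only an example, not your actual schema):

exports.up = function (knex) {
  // Hand-written DDL instead of the schema-builder helpers.
  return knex.raw('ALTER TABLE foo ADD COLUMN description TEXT');
};

exports.down = function (knex) {
  return knex.raw('ALTER TABLE foo DROP COLUMN description');
};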
The short answer is that knex will automatically know what has and has not been run for you (again, via that knex_migrations table it creates for you...).
Things can get weird though when it starts involving git and different branches. I recommend that if you're writing migrations on some branch, and you need to go do other work, always remember to first perform a rollback of any migrations you've done in that branch BEFORE switching branches. Otherwise you will be in weird DB states that don't coincide with the application code.
I would personally just deal with updating models independently of writing migrations. For example, if I'm adding a description column to some table, then I probably want to manually update the ORM to reflect the change of the new db schema. Generally, I've found trying to use a tool that automagically does that for you (rather, if I change the orm, stuff happens to write all the underlying sql...) usually winds me up in a heap of trouble and I just spend more time trying to un-fudge stuff. But, that's just my 2 cents :)
Here is where it gets a little tough and where I need help...I need to track what has been added or removed via migration so I can effectively write ups and downs as the tableModels change over time.
You could store changes in a DB/txt file and those can act as snapshots. So when you want to roll back to a particular migration, you would find the changes (up/down) made for that migration and adjust accordingly.
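A rough sketch of that snapshot idea, assuming the tableModel instances serialize cleanly to JSON (the file name and diff helper are illustrative, not part of knex or AdonisJS):

const fs = require('fs');

// Persist the current tableModel definitions so the next build can diff against them.
function saveSnapshot(tableModels, file = 'migration-snapshot.json') {
  fs.writeFileSync(file, JSON.stringify(tableModels, null, 2));
}

// Compare the previous snapshot with the current model to find added/removed fields.
function diffFields(previous, current) {
  const prevNames = new Set(previous.fields.map((f) => f.name));
  const currNames = new Set(current.fields.map((f) => f.name));
  return {
    added: current.fields.filter((f) => !prevNames.has(f.name)),    // goes into up()
    removed: previous.fields.filter((f) => !currNames.has(f.name)), // goes into down()
  };
}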
Later I want to add a field called description so I would simply add it to my "tableModel" and then run a build command which would build out the migration. How do I check what has already been created though so I only do an up() for description?
Here you can call the database itself directly and check which fields have already been created. If a field is already there and the attributes are the same, you can either ignore it or stop the transaction altogether.
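With knex specifically, one way to perform that check is the schema-inspection helper columnInfo(); a minimal sketch (table and column names are just examples):

// Only generate an "add column" step if the column does not already exist.
async function needsDescriptionColumn(knex) {
  const columns = await knex('foo').columnInfo(); // { columnName: { type, nullable, ... }, ... }
  return !('description' in columns);
}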
Bonus Round Let's say I want to change user_id from an int to a bigint, because who makes a foreign key just an int? How do I check not just what needs to be added to the up and down, but also checks if I need to change a property on a field.
Again, call the DB itself on the table in question. I know the SQL call would be:
describe [table_name];
After reading to the end, I think you answered this yourself, but I think capturing these changes would work best in a NoSQL database since you're using Node, or in Postgres with its JSON field.

Cassandra 2.0, CQL 3.x Update ...USING TIMESTAMP

I am planning to use the UPDATE ... USING TIMESTAMP ... statement to make sure that I do not overwrite fresh data with stale data, while avoiding having to do at least LOCAL_QUORUM writes.
Here is my table structure.
Table=DocumentStore
DocumentID (primaryKey, bigint)
Document(text)
Version(int)
If the service receives 2 write requests with Version=1 and Version=2, regardless of the order of arrival, the business requirement is that we end up with Version=2 in the database.
Can I use the following CQL Statement?
Update DocumentStore using <versionValue>
SET Document=<documentValue>,
Version=<versionValue>
where DocumentID=<documentIDValue>;
Has anybody used something like this? If so was the behavior as expected?
Yes, this is a known technique. Although it should be
UPDATE "DocumentStore" USING TIMESTAMP <versionValue>
SET "Document" = <documentValue>,
"Version" = <versionValue>
WHERE "DocumentID" = <documentIDValue>;
You missed the TIMESTAMP keyword, and also, since you are using case-sensitive names, you should enclose them in quotes.

Composite key in Cassandra with Pig and where_clause for part of the key in the where clause

I basically have the same problem as the following: Composite key in Cassandra with Pig. The only difference is that I try to query for a part of the composite key within the where_clause of Pig.
The data structure is similar to the one in the earlier-mentioned issue; I'll copy some code/context to minimize the reading of that issue.
We have a CQL table that looks something like this:
CREATE table data (
occurday text,
seqnumber int,
occurtimems bigint,
unique bigint,
fields map<text, text>,
primary key ((occurday, seqnumber), occurtimems, unique)
)
Instead of querying for both the seqnumber and the occurday (as was the case in the previously mentioned issue), I try to query for just one of the keys.
If I execute this query as part of a LOAD from within Pig, however, things don't work.
-- Need to URL encode the query
data = LOAD 'cql://ks/data?where_clause=occurday%3D%272013-10-01%27' USING CqlStorage();
gives
java.lang.RuntimeException
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:665)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:301)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:167)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:181)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:51017)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:50994)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:50933)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1756)
at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1742)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.prepareQuery(CqlPagingRecordReader.java:605)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:635)
... 7 more
Basically my question is, what am I doing wrong or what don't I understand?
As I understand from CqlPagingRecordReader used when partition key is explicitly stated,
I should be able to query with just part of the partition key?
Also while reading
Add CqlRecordReader to take advantage of native CQL pagination
I get the impression this should be possible, but I am swimming around with (in my opinion) no clear direction on how to accomplish this.
Any help is very very welcome at this point.
Regards,
Lennart Weijl
PS.
I am running on Cassandra 2.0.9 with Pig 0.13.0
According to CASSANDRA-6311, I believe you need to apply the 6311-v2-2.0-branch.txt patch, recompile Pig, and then update your LOAD statement to:
data = LOAD 'cql://ks/data?where_clause=occurday%3D%272013-10-01%27' USING CqlInputFormat();
The key change is USING CqlInputFormat(), which triggers the use of the new CqlRecordReader that was released in Cassandra 2.0.7.
Edit: Note that the exception is thrown from CqlPagingRecordReader which means you're still using the old record reader.
