Storing schema-less complex json object in Cassandra - cassandra

I have a schema-less JSON object that I want to store in Cassandra using spring-cassandra. I learned that Cassandra supports a map type, but it doesn't accept Map<String, Object> as a data model.
I need to query on the fields of the JSON, so storing it as a blob is out of the question. Is there any way I can do this?
PS: I've looked at "Storing JSON object in CASSANDRA"; the answer didn't seem applicable to my use case, as my JSON could be very complex.

Have you looked at UDTs (user-defined types)?
You can define a UDT like this:
CREATE TYPE my_json(
property1 text,
property2 int,
property3 map<text, text>,
property4 map<int, frozen<another_json_type>>,
...
)
And then in Java use Map<String, UserType>.
Note: UserType comes from the Java driver: https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/UserType.java
You cannot create a user type in Java; you can only get it from the metadata of your table, see this: https://github.com/datastax/java-driver/blob/3.0/driver-core/src/test/java/com/datastax/driver/core/UserTypesTest.java#L62-L81

1) One solution is to integrate Solr search and index this table first.
2) Then write a Solr analyzer that parses the JSON and puts its values into separate Solr fields during indexing.
3) Finally, use a Solr-backed query such as select * from table where solr_query = "{search expression syntax}"
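Step 2 amounts to turning a nested document into flat field names. As a rough illustration only (this is not a Solr analyzer, and the class JsonFlattener and its dotted/bracketed naming scheme are assumptions made up for this sketch), the following Java code flattens an already-parsed JSON document, represented as nested Map and List objects, into field/value pairs:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr, Cassandra, or any driver.
final class JsonFlattener {

    // Flattens nested maps and lists into flat field names,
    // e.g. {"address": {"city": "Paris"}} becomes {"address.city": "Paris"}.
    public static Map<String, String> flatten(Map<String, Object> json) {
        Map<String, String> out = new LinkedHashMap<>();
        walk("", json, out);
        return out;
    }

    private static void walk(String prefix, Object node, Map<String, String> out) {
        if (node instanceof Map) {
            // Nested object: extend the prefix with each key.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = prefix.isEmpty()
                        ? String.valueOf(e.getKey())
                        : prefix + "." + e.getKey();
                walk(key, e.getValue(), out);
            }
        } else if (node instanceof List) {
            // Array: append the element index in brackets.
            List<?> list = (List<?>) node;
            for (int i = 0; i < list.size(); i++) {
                walk(prefix + "[" + i + "]", list.get(i), out);
            }
        } else {
            // Leaf value: stored as text (a null leaf becomes the string "null").
            out.put(prefix, String.valueOf(node));
        }
    }
}
```

Each resulting pair could then be mapped to its own indexed field. Storing every value as text is a simplification; a real setup would probably want typed fields.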

Related

How do I write this query without using raw query in sequelize?

I would like to do a bulk update in Sequelize. Unfortunately, it seems that Sequelize does not support bulk updates. I am using sequelize-typescript, if that helps, with PostgreSQL 14.
My query in raw SQL looks like this:
UPDATE feed_items SET tags = val.tags FROM
(
VALUES ('ddab8ce7-afa3-824f-7b65-edfb53a71764'::uuid,ARRAY[]::VARCHAR(255)[]),
('ece9f2fc-2a09-4a95-16ce-07293b0a14d2'::uuid,ARRAY[]::VARCHAR(255)[])
) AS val(id, tags) WHERE feed_items.id = val.id
I would like to generate this query from a given array of id strings and tag-array values. The tags column is implemented as a string array in my table.
Is there a way to generate the above query without using raw query?
Or an SQL injection safe way of generating the above query?

JOOQ MYSQL QUERY

select question.*,
question_option.id
from question
left join question_option on question_option.question_id = question.id;
How do I write question.* in jOOQ instead of specifying all the entity variables?
You can use the fields() or asterisk() methods of the jOOQ-generated table objects, which extend TableImpl.
For example, if you just want to query the fields of a record:
dsl.select(QUESTION.fields()).from...
If you need fields from the join too:
dsl.select(QUESTION.asterisk(), QUESTION_OPTION.ID).from...
I assume that you generate the metamodel, so you can use:
dsl.select(QUESTION.fields()).select(QUESTION_OPTION.ID)...

How to fetch Primary Key/Clustering column names for a particular table using CQL statements?

I am trying to fetch the Primary Key/Clustering Key names for a particular table/entity and implement the same query in my JPA interface (which extends CassandraRepository).
I am not sure whether something like:
@Query("DESCRIBE TABLE <table_name>")
public Object describeTbl();
would work here, as DESCRIBE isn't a valid CQL statement. And if it did work, what would the type of the returned Object be?
Suggestions?
One thing you could try would be to query the system_schema.columns table. It is keyed by keyspace_name and table_name, and might be what you're looking for here:
> SELECT column_name,kind FROM system_schema.columns
WHERE keyspace_name='spaceflight_data'
AND table_name='astronauts_by_group';
 column_name       | kind
-------------------+---------------
 flights           | regular
 group             | partition_key
 name              | clustering
 spaceflight_hours | clustering
(4 rows)
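Once those rows are fetched, grouping them by kind on the client side is straightforward. Here is a minimal Java sketch, assuming the rows have already been copied into plain {column_name, kind} string pairs; the class KeyColumns and the method namesOfKind are hypothetical names, and the sample data simply mirrors the output above. It keeps the input order; for multi-column keys you would likely also want to select the position column of system_schema.columns and sort by it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of any Cassandra driver.
final class KeyColumns {

    // Returns the names of all columns whose kind matches, in input order.
    // Each row is a two-element array: {column_name, kind}.
    public static List<String> namesOfKind(List<String[]> rows, String kind) {
        List<String> names = new ArrayList<>();
        for (String[] row : rows) {
            if (kind.equals(row[1])) {
                names.add(row[0]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample rows copied from the cqlsh output above.
        List<String[]> rows = Arrays.asList(
                new String[] {"flights", "regular"},
                new String[] {"group", "partition_key"},
                new String[] {"name", "clustering"},
                new String[] {"spaceflight_hours", "clustering"});

        System.out.println("partition key: " + namesOfKind(rows, "partition_key"));
        System.out.println("clustering:    " + namesOfKind(rows, "clustering"));
    }
}
```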
DESCRIBE TABLE is supported only in Cassandra 4, which includes the fix for CASSANDRA-14825. But it may not help you much, because it just returns a text string representing the CREATE TABLE statement, and you would need to parse that text to extract the primary key definition. That's doable but could be tricky, depending on the structure of the primary key.
Alternatively, you can obtain the underlying Session object and, via its getMetadata function, get access to the actual metadata object, which exposes information about keyspaces and tables, including their schema.
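To give a sense of the text parsing mentioned for the DESCRIBE TABLE route, here is a rough Java sketch that pulls the key columns out of a CREATE TABLE string using only the standard library. The class PrimaryKeyParser, its methods, and the sample statement (including the column types) are hypothetical. It only handles a table-level PRIMARY KEY (...) clause with unquoted identifiers; an inline "column type PRIMARY KEY" declaration yields empty lists, and quoted names are returned with their quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Cassandra driver.
final class PrimaryKeyParser {

    private static final Pattern START =
            Pattern.compile("PRIMARY\\s+KEY\\s*\\(", Pattern.CASE_INSENSITIVE);

    // Returns the text inside the PRIMARY KEY (...) clause, or null if absent.
    static String keyClause(String ddl) {
        Matcher m = START.matcher(ddl);
        if (!m.find()) {
            return null;
        }
        int start = m.end();
        int depth = 1; // we are already inside the opening parenthesis
        for (int i = start; i < ddl.length(); i++) {
            char c = ddl.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return ddl.substring(start, i).trim();
                }
            }
        }
        return null; // unbalanced parentheses
    }

    public static List<String> partitionKey(String ddl) {
        String clause = keyClause(ddl);
        if (clause == null) {
            return new ArrayList<>();
        }
        if (clause.startsWith("(")) {
            // Composite partition key: ((a, b), c, d)
            return split(clause.substring(1, clause.indexOf(')')));
        }
        // Simple form: the first column is the partition key.
        List<String> all = split(clause);
        return all.isEmpty() ? all : new ArrayList<>(all.subList(0, 1));
    }

    public static List<String> clusteringColumns(String ddl) {
        String clause = keyClause(ddl);
        if (clause == null) {
            return new ArrayList<>();
        }
        if (clause.startsWith("(")) {
            // Everything after the inner parentheses is a clustering column.
            return split(clause.substring(clause.indexOf(')') + 1));
        }
        List<String> all = split(clause);
        return all.size() <= 1 ? new ArrayList<>() : new ArrayList<>(all.subList(1, all.size()));
    }

    // Splits a comma-separated list of names, dropping empty entries.
    private static List<String> split(String csv) {
        List<String> out = new ArrayList<>();
        for (String part : csv.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up statement reusing the column names from the example table.
        String ddl = "CREATE TABLE astronauts_by_group (group int, name text, "
                + "spaceflight_hours int, flights int, "
                + "PRIMARY KEY (group, name, spaceflight_hours))";
        System.out.println(partitionKey(ddl));
        System.out.println(clusteringColumns(ddl));
    }
}
```

Compared with querying system_schema.columns or the driver metadata, this is brittle, which is the trade-off the answer above points out.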

How to expose a native SQL function as a predicate

I have a table in my database which stores a list of string values as a jsonb field.
create table rabbits_json (
rabbit_id bigserial primary key,
name text,
info jsonb not null
);
insert into rabbits_json (name, info) values
('Henry','["lettuce","carrots"]'),
('Herald','["carrots","zucchini"]'),
('Helen','["lettuce","cheese"]');
I want to filter my rows checking if info contains a given value.
In SQL, I would use ? operator:
select * from rabbits_json where info ? 'carrots';
If my googling skills are fine today, I believe that this is not implemented yet in JOOQ:
https://github.com/jOOQ/jOOQ/issues/9997
How can I use a native predicate in my query to write an equivalent query in JOOQ?
For anything that's not supported natively in jOOQ, you should use plain SQL templating, e.g.
Condition condition = DSL.condition("{0} ? {1}", RABBITS_JSON.INFO, DSL.val("carrots"));
Unfortunately, in this specific case, you will run into this issue here. With a JDBC PreparedStatement, you still cannot use ? for anything other than bind variables. As a workaround, you can:
Use Settings.statementType == STATIC_STATEMENT to prevent using a PreparedStatement in this case
Use the jsonb_exists_any function (not indexable) instead of ?, see https://stackoverflow.com/a/38370973/521799

Waterline - Postgres - DataTypes

I am having difficulties with Waterline models and creating the Postgres tables related to those models.
No matter what I do to create a varchar(n) in the table through a model, it converts the attribute to text. And bigint also is being converted to integer!
Should I change the ORM?
Is there a way to do that?
A more pleasant approach is to use Waterline for the "RUD" in "CRUD" but not for the "C" (create). This is because Waterline can be very "bad" at creating intermediary tables, composite primary keys, and so on. So what I do today is this:
1) Write a complete .sql file that creates the tables and indexes.
2) Create the database once (and alter it if needed).
3) Declare all the tables as models. Just specify the types, the primary key (if it is a single column), and the lifecycle callbacks.
4) Make sure that config/models.js is set to migrate: 'safe'.
Conclusion: I can insert, read, and delete rows with Waterline, but I don't trust it (performance-wise) to create my tables. Sequelize, on the other hand, is a much more mature ORM and can be used if you need it. For me, the hybrid of Waterline + SQL is sufficient.
EDIT: My models don't have any aggregation (like my_pets: { model: pet }), just row names and types, as simple as possible.
Sails supports these data types:
String, text, integer, float, date, datetime, boolean, binary, array, json, mediumtext, longtext, objectid
If you need to specify an exact length, such as varchar(n), you either have to settle for one of the supported data types above or use the query option that Sails provides.
The Model.query() method lets you run any kind of query you want:
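// Assumes this runs inside a controller action where `res` is available.
var queryString = 'CREATE TABLE IF NOT EXISTS sailsusers.test (id INT NOT NULL, name VARCHAR(45) NULL, PRIMARY KEY (id))';
Test.query(queryString, function (err, result) {
  if (err) {
    return console.log(err);
  }
  // The original logged an undefined variable `b`; log the query result instead.
  console.log(result);
  res.ok();
});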
