CQL UPDATE a set<bigint> with join query - cassandra

I need to delete some data from a table using CQL, based on a condition that fetches data from another table, but I am unable to form the query.
Here are the details of the table I need to delete data from:
Table Name : xyz_group
Columns : dept_id [int] , sub_id [set<bigint>]
PRIMARY KEY (Partition key) : dept_id
The same sub_id can appear under multiple dept_id values. The data looks like this:
dept_id | sub_id
-------------------------------
1098 | 345678298, 24579123, 8790455308
2059 | 398534698, 24579123, 8447659928
3467 | 311209878, 24579123, 8790455308, 987654321
I need to remove only 24579123 and 8790455308 from all the rows.
Here is my SELECT query, which fetches from another table, abc_list, the data to be removed from xyz_group:
select sub_id from abc_list where sub_name='XYZ';
The output of the above query gives me a list of sub_id values that I want to remove from xyz_group. So basically I want to update the set by removing elements from it. Something like this:
UPDATE xyz_group SET sub_id = sub_id - [ query result from above select query ] WHERE dept_id in (1098, 2059, 3467, ...);
I have tried to remove one element from the set, but I am getting the error shown below the query:
UPDATE xyz_group SET sub_id = sub_id - [ 24579123 ] WHERE dept_id in (1098, 2059, 3467, ...);
Error : Column sub_id type set<bigint> is not compatible with type list<int>
The table has more than 50k records. Can anyone please help me form a single correct query for the update?

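The query below is working for me now:
UPDATE xyz_group SET sub_id = sub_id - { 24579123 } WHERE dept_id in (1098, 2059, 3467, ...);
The earlier error came from the square brackets, which CQL parses as a list literal; a set literal needs curly braces.
But I am still doing a two-step process to update the table: first collecting the required sub_id values, and then using a separate UPDATE query to update the table. I am not able to do it in a single query, since CQL supports neither joins nor subqueries.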
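The second step of that two-step process can be scripted. The following is a minimal Python sketch that only builds the UPDATE statements as strings; it assumes the sub_id values have already been collected from abc_list, and the function name and the batch_size parameter are hypothetical. Executing the statements through a driver is left out.

```python
from typing import Iterable, List


def build_set_removal_updates(dept_ids: Iterable[int],
                              sub_ids: Iterable[int],
                              batch_size: int = 100) -> List[str]:
    """Build CQL UPDATE statements that remove sub_ids from the sub_id set."""
    # Curly braces form a CQL set literal; square brackets would be a list.
    removal = "{" + ", ".join(str(s) for s in sorted(set(sub_ids))) + "}"
    ids = list(dept_ids)
    statements = []
    # Split the partition keys into chunks so no single IN clause grows too large.
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        in_list = ", ".join(str(d) for d in chunk)
        statements.append(
            f"UPDATE xyz_group SET sub_id = sub_id - {removal} "
            f"WHERE dept_id IN ({in_list});"
        )
    return statements


# Step 1 (not shown): run the SELECT on abc_list and collect the ids.
to_remove = [24579123, 8790455308]
# Step 2: generate the UPDATE statements for the partitions to touch.
statements = build_set_removal_updates([1098, 2059, 3467], to_remove, batch_size=2)
for stmt in statements:
    print(stmt)
```

Only integers are interpolated here, so the string building is safe for this data; for values of other types, bound parameters in a prepared statement would be the safer choice.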

Related

Cosmos db null value

I have two kinds of records, shown below, in my table staudentdetail in Cosmos DB. In the example below, previousSchooldetail is a nullable field that may or may not be present for a student.
Sample records:
{
"empid": "1234",
"empname": "ram",
"schoolname": "high school ,bankur",
"class": "10",
"previousSchooldetail": {
"prevSchoolName": "1763440",
"YearLeft": "2001"
} --(Nullable)
}
{
"empid": "12345",
"empname": "shyam",
"schoolname": "high school",
"class": "10"
}
I am trying to access the above records from Azure Databricks using PySpark or Scala code. But when we build the DataFrame by reading from Cosmos DB, it does not bring previousSchooldetail into the DataFrame. When we change the query to include the id of a record that has previousSchooldetail, the field does show up in the DataFrame.
Case 1:
val Query = "SELECT * FROM c "
Result when the query is fired directly:
empid
empname
schoolname
class
Case 2:
val Query = "SELECT * FROM c where c.empid=1234"
Result when the query is fired with a WHERE clause:
empid
empname
schoolname
class
previousSchooldetail
prevSchoolName
YearLeft
Could you please tell me why I am not able to get previousSchooldetail in case 1, and how I should proceed?
As @Jayendran mentioned in the comments, the first query returns previousSchooldetail only for the documents where it is present; otherwise, the column is absent.
You can make this column present in all cases by using the IS_DEFINED function. Try tweaking your query as below:
SELECT c.empid,
c.empname,
IS_DEFINED(c.previousSchooldetail) ? c.previousSchooldetail : null
as previousSchooldetail,
c.schoolname,
c.class
FROM c
If you are looking to get the result as a flat structure, it can be tricky and would need two separate queries, such as:
Query 1
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
p.prevSchoolName,
p.YearLeft
FROM c JOIN c.previousSchooldetail p
Query 2
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
null as prevSchoolName,
null as YearLeft
FROM c
WHERE not IS_DEFINED (c.previousSchooldetail) or
c.previousSchooldetail = null
Unfortunately, Cosmos DB does not support LEFT JOIN or UNION. Hence, I'm not sure if you can achieve this in a single query.
Alternatively, you can create a stored procedure to return the desired result.
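As another option that the answer does not mention, the flat shape that Query 1 and Query 2 produce together can be built on the client after fetching whole documents. The sketch below is a minimal Python illustration on already-loaded dictionaries; the function name flatten_student is hypothetical, and no Cosmos DB or Spark API is involved.

```python
from typing import Any, Dict, List, Optional


def flatten_student(doc: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Return one flat row; nested fields become None when the detail is absent."""
    # Treat both a missing key and an explicit null as "no previous school".
    prev = doc.get("previousSchooldetail") or {}
    return {
        "empid": doc.get("empid"),
        "empname": doc.get("empname"),
        "schoolname": doc.get("schoolname"),
        "class": doc.get("class"),
        "prevSchoolName": prev.get("prevSchoolName"),
        "YearLeft": prev.get("YearLeft"),
    }


# The two sample records from the question.
docs: List[Dict[str, Any]] = [
    {
        "empid": "1234",
        "empname": "ram",
        "schoolname": "high school ,bankur",
        "class": "10",
        "previousSchooldetail": {"prevSchoolName": "1763440", "YearLeft": "2001"},
    },
    {"empid": "12345", "empname": "shyam", "schoolname": "high school", "class": "10"},
]

rows = [flatten_student(d) for d in docs]
for row in rows:
    print(row)
```

Every row ends up with the same six keys, so a DataFrame built from these rows has a stable schema regardless of which documents carry previousSchooldetail.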

Save array of objects in cassandra

How can I save an array of objects in Cassandra?
I'm using a Node.js application with cassandra-driver to connect to Cassandra. I want to save records like the one below in my DB:
{
"id" : "5f1811029c82a61da4a44c05",
"logs" : [
{
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source1",
"destination" : "destination1",
"url" : "https://asdasdas.com",
"data" : "data1"
},
{
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source2",
"destination" : "destination2",
"url" : "https://afdvfbwadvsffd.com",
"data" : "data2"
}
],
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667"
}
In the above record, I can use the type "text" to save the values of the columns "id" and "conversationId". But I am not sure how to define the schema and save data for the field "logs".
With Cassandra, you'll want to store the data in the same way that you want to query it. As you mentioned querying by conversationid, that is going to influence how the PRIMARY KEY definition should look. Given this, conversationid should make a good partition key. As for the clustering columns, I had to make some guesses as to cardinality. The source column looked like it could be used to uniquely identify a log entry within a conversation, so I went with that next.
I thought about using id as the final clustering column, but it looks like all entries with the same conversationid would also have the same id. It might be a good idea to give each entry its own unique identifier, to help ensure uniqueness:
{
"uniqueid": "e53723ca-2ab5-441f-b360-c60eacc2c854",
"conversationId" : "e9b55229-f20c-4453-9c18-a1f4442eb667",
"source" : "source1",
"destination" : "destination1",
"url" : "https://asdasdas.com",
"data" : "data1"
},
This makes the final table definition look like this:
CREATE TABLE conversationlogs (
id TEXT,
conversationid TEXT,
uniqueid UUID,
source TEXT,
destination TEXT,
url TEXT,
data TEXT,
PRIMARY KEY (conversationid, source, uniqueid));
You have a few options depending on how you want to query this data.
The first is to stringify the JSON in the logs field, save that string to the database, and convert it back to JSON after querying the data.
The second option is similar to the first, but instead of stringifying the array, you store the data as a list in the database.
The third option is to define a new table for the logs, with the conversation as the partition key and a clustering key that identifies each element of the logs. This allows you to look up a single row by the full primary key, or query by just the partition key and retrieve all the rows for that conversation.
CREATE TABLE conversationlogs (
conversationid uuid,
logid timeuuid,
...
PRIMARY KEY ((conversationid), logid));
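To make the third option concrete, here is a minimal Python sketch (the question uses Node.js, but the idea is the same) that turns the nested document into one row per log entry. The columns elided by "..." in the table above are unknown; source, destination, url and data are assumed here purely for illustration, and INSERT_CQL and log_rows are hypothetical names. The sketch only prepares the parameter tuples and does not talk to a cluster.

```python
import uuid
from typing import Any, Dict, List, Tuple

# Assumed column list; the real table definition elides the non-key columns.
INSERT_CQL = (
    "INSERT INTO conversationlogs "
    "(conversationid, logid, source, destination, url, data) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)


def log_rows(document: Dict[str, Any]) -> List[Tuple[Any, ...]]:
    """Turn the nested "logs" array into one parameter tuple per table row."""
    rows = []
    for entry in document["logs"]:
        rows.append((
            uuid.UUID(entry["conversationId"]),  # partition key
            uuid.uuid1(),                        # version-1 UUID, i.e. a timeuuid
            entry["source"],
            entry["destination"],
            entry["url"],
            entry["data"],
        ))
    return rows


document = {
    "id": "5f1811029c82a61da4a44c05",
    "conversationId": "e9b55229-f20c-4453-9c18-a1f4442eb667",
    "logs": [
        {"conversationId": "e9b55229-f20c-4453-9c18-a1f4442eb667",
         "source": "source1", "destination": "destination1",
         "url": "https://asdasdas.com", "data": "data1"},
        {"conversationId": "e9b55229-f20c-4453-9c18-a1f4442eb667",
         "source": "source2", "destination": "destination2",
         "url": "https://afdvfbwadvsffd.com", "data": "data2"},
    ],
}

rows = log_rows(document)
for row in rows:
    print(row)
```

Each tuple lines up with the placeholders in INSERT_CQL, so it could be bound to a prepared statement by whichever driver is in use.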

How to run CQL in Zeppelin by taking input in user input format?

I was trying to run a CQL query that takes user input through dynamic forms in Zeppelin:
%cassandra
SELECT ${Select Fields Type=uuid ,uuid | created_by | email_verify| username} FROM
${Select Table=keyspace.table_name}
${WHERE email_verify="true" } ${ORDER BY='updated_date' }LIMIT ${limit = 10};
While running this query, I got this error:
line 4:0 mismatched input 'true' expecting EOF
(SELECT uuid FROM keyspace.table_name ["true"]...)
You need to move WHERE and ORDER BY out of the dynamic form declaration.
An input field is declared as ${field_name=default_value}. In your case, instead of a WHERE clause, you've got a field named WHERE email_verify whose default value is "true".
It should look like the following (not tested):
%cassandra
SELECT ${Select Fields Type=uuid ,uuid | created_by | email_verify| username} FROM
${Select Table=keyspace.table_name}
WHERE ${where_cond=email_verify='true'} ORDER BY ${order_by='updated_date'} LIMIT ${limit = 10};
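The effect of the field declaration can be illustrated with a small Python imitation of the substitution. This is not Zeppelin's parser, only a simplified sketch under the assumption that each ${name=default} form is replaced by its default value (and that, for select forms, the default is whatever precedes the first comma); FORM_PATTERN and render_defaults are hypothetical names.

```python
import re

# ${name=default} or ${name=default,opt1|opt2}; a simplified imitation only.
FORM_PATTERN = re.compile(r"\$\{([^=}]+)=([^}]*)\}")


def render_defaults(template: str) -> str:
    """Replace every dynamic form with its default value."""
    def substitute(match):
        # Everything before "=" is the field name and is dropped from the query.
        default = match.group(2).split(",", 1)[0]
        return default.strip()
    return FORM_PATTERN.sub(substitute, template)


broken = (
    'SELECT uuid FROM keyspace.table_name '
    '${WHERE email_verify="true" } LIMIT ${limit = 10};'
)
fixed = (
    "SELECT uuid FROM keyspace.table_name "
    "WHERE ${where_cond=email_verify='true'} LIMIT ${limit = 10};"
)

rendered_broken = render_defaults(broken)
rendered_fixed = render_defaults(fixed)
print(rendered_broken)
print(rendered_fixed)
```

In the broken template the keyword WHERE is swallowed as part of the field name, leaving a bare "true" after the table name, which matches the shape of the reported error. In the fixed template WHERE stays outside the form and only the condition is substituted.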
Update:
Here is a working example for a table with the following structure:
CREATE TABLE test.scala_test2 (
id int,
c int,
t text,
tm timestamp,
PRIMARY KEY (id, c)
) WITH CLUSTERING ORDER BY (c ASC)

How to get related field value from database in odoo 11 and postgresql?

I am trying to get a related field value from the database, but it shows: column 'column_name' does not exist.
This happens both when I try to read the value of product_id and when I use a join to find the common data between the sale.order and product.product models.
In the sale.order model, the field definition is:
product_id = fields.Many2one('product.product', related='order_line.product_id', string='Product')
But when I try to join the two tables to fetch the data per product, as in the query below:
select coalesce(p.name,'Unassigned Product'), count(*) from sale_order o left join product_product p on o.product_id = p.id where o.state = 'sale' group by p.name;
it shows the error below:
column o.product_id does not exist
LINE 1: ... from sale_order o left join product_product p on o.product_...
When I try to get the data from the sale_order table directly, as in the query below:
select product_id from sale_order;
it shows the error below:
column "product_id" does not exist
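Can anyone help me get that value?
To access a related field from the database, you have to use the store=True keyword argument. Related fields are not stored by default, so without it no product_id column is created in the sale_order table.
Rewrite your field definition as: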
product_id = fields.Many2one('product.product', related='order_line.product_id', string='Product', store=True)
and then uninstall and reinstall the module.

Postgres sorting on timestamp works on mac but not linux

Using Postgres 9.4
I have a posts table which relates to a users table. I'm querying for two users and the 3 most recent posts of each.
SELECT
"users"."id" AS "id",
"posts"."id" AS "posts__id",
"posts"."created_at" AS "posts__created_at"
FROM (
SELECT * FROM accounts
WHERE TRUE
ORDER BY "id" ASC
LIMIT 2
) AS "users"
LEFT JOIN LATERAL (
SELECT * FROM posts
WHERE "users".id = posts.author_id
ORDER BY "created_at" DESC, "id" DESC
LIMIT 3
) AS "posts" ON "users".id = "posts".author_id
On mac, the order is as expected.
"2016-04-17 18:49:15.942"
"2016-04-15 03:29:31.212"
"2016-04-13 15:07:15.119"
I get descending order on created_at, which is a timestamptz. However, when run on my Travis build, which is Ubuntu, the ordering is stable, but neither ascending nor descending:
"2016-04-15 03:29:31.212"
"2016-04-13 15:07:15.119"
"2016-04-17 18:49:15.942"
I made sure to create the databases with the same LC_COLLATE = en_US.UTF-8, with no luck. Why on earth isn't the ordering working on Travis?
To solve this, add an ORDER BY clause to the outer query, below your existing statements.
i.e.
SELECT
"users"."id" AS "id",
"posts"."id" AS "posts__id",
"posts"."created_at" AS "posts__created_at"
FROM (
SELECT * FROM accounts
WHERE TRUE
ORDER BY "id" ASC
LIMIT 2
) AS "users"
LEFT JOIN LATERAL (
SELECT * FROM posts
WHERE "users".id = posts.author_id
ORDER BY "created_at" DESC, "id" DESC
LIMIT 3
) AS "posts" ON "users".id = "posts".author_id
order by posts.created_at desc
The order of output in Postgres (and many other DBMSs) is not guaranteed without an ORDER BY clause.
While you do have ORDER BY clauses, they are inside subqueries; you need an ORDER BY on the outer query.
You may need to order the outer query too, because the order of the rows produced by the join between the two inner queries is not guaranteed, even when each of them is ordered.
SELECT
"users"."id" AS "id",
"posts"."id" AS "posts__id",
"posts"."created_at" AS "posts__created_at"
FROM (
SELECT * FROM accounts
WHERE TRUE
ORDER BY "id" ASC
LIMIT 2
) AS "users"
LEFT JOIN LATERAL (
SELECT * FROM posts
WHERE "users".id = posts.author_id
ORDER BY "created_at" DESC, "id" DESC
LIMIT 3
) AS "posts" ON "users".id = "posts".author_id
order by "posts"."created_at" DESC
This is because the actual sort order depends on both the order of id in the first table and the order of created_at and id in the second one prior to joining them. This means the order of the first table can produce unexpected results when computing the selected values from the joined table.
To fix the sort order, you should sort the final result set by the relevant columns as well.
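If the query itself cannot be changed, the same idea of sorting the final result set can be applied on the client. The following Python sketch is only an illustration on hypothetical rows (the post ids and the second user's post are made up); it orders by user id ascending, then by created_at and post id descending, using two stable sort passes.

```python
from datetime import datetime

# Hypothetical joined rows, deliberately in no particular order.
rows = [
    {"id": 2, "posts__id": 21, "posts__created_at": datetime(2016, 4, 16, 9, 0, 0)},
    {"id": 1, "posts__id": 12, "posts__created_at": datetime(2016, 4, 15, 3, 29, 31, 212000)},
    {"id": 1, "posts__id": 11, "posts__created_at": datetime(2016, 4, 13, 15, 7, 15, 119000)},
    {"id": 1, "posts__id": 13, "posts__created_at": datetime(2016, 4, 17, 18, 49, 15, 942000)},
]

# Pass 1: newest posts first, with the post id as a tie-breaker.
rows.sort(key=lambda r: (r["posts__created_at"], r["posts__id"]), reverse=True)
# Pass 2: group by user id; Python's sort is stable, so pass 1's order
# is preserved within each user.
rows.sort(key=lambda r: r["id"])

ordered_post_ids = [r["posts__id"] for r in rows]
print(ordered_post_ids)
```

Doing the ordering in SQL on the outer query remains the more direct fix; this is only a fallback for when the rows are already in application code.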

Resources