unable to upsert using java datastax driver - cassandra

I am unable to upsert a row using the DataStax driver.
The data in the Cassandra table is stored as follows:
tag | partition_info
------------+--------------------------------------------------
sometag | {{year: 2018, month: 1}, {year: 2018, month: 2}}
tag is the primary key and partition_info is a set of values of the following UDT:
CREATE TYPE codingjedi.tag_partitions (
year bigint,
month bigint
);
If a tag doesn't exist, I want it to be created. If the tag exists, the new UDT value should be added to the existing ones. I suppose I cannot use insert because it overwrites the previous value, i.e. this will not work:
QueryBuilder.insertInto(tableName).value("tag",model.tag)
.value("partition_info",setAsJavaSet(Set(partitionsInfo)))
I am trying to use update, but it isn't working. The DataStax driver throws java.lang.IllegalArgumentException for the following query:
QueryBuilder.update(tableName).`with`(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo))))
.where(QueryBuilder.eq("tag", id.tag))
I also tried using add and append for the primary key, but got the error PRIMARY KEY part tag found in SET part:
QueryBuilder.update(tableName).`with`(QueryBuilder.add("tag",id.tag))
.and(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo))))
.where(QueryBuilder.eq("tag", id.tag))

You're using the wrong operation in your update statement: append is meant for columns of list type. For a set column, use add if you're adding a single value (your case, so you won't even need to wrap the data in a Set explicitly), or addAll if you're adding multiple values.
QueryBuilder.update(tableName)
.`with`(QueryBuilder.add("partition_info", partitionsInfo))
.where(QueryBuilder.eq("tag", id.tag))
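For reference, the CQL that an add on a set column corresponds to can be sketched with plain string building. Because an UPDATE in Cassandra is an upsert, this statement also creates the row when the tag does not exist yet. The class name, the method name, and the table name "tags" are hypothetical, and real code should bind values through the driver rather than concatenate them.

```java
// Hypothetical helper (not part of the DataStax driver): shows the CQL shape
// of adding one tag_partitions value to the partition_info set column.
final class SetAddCql {
    static String addPartition(String table, String tag, long year, long month) {
        // UDT literal for tag_partitions, e.g. {year: 2018, month: 3}
        String udtLiteral = "{year: " + year + ", month: " + month + "}";
        // CQL escapes a single quote inside a string literal by doubling it
        String escapedTag = tag.replace("'", "''");
        // The outer braces make a one-element set that is merged into the column
        return "UPDATE " + table
             + " SET partition_info = partition_info + {" + udtLiteral + "}"
             + " WHERE tag = '" + escapedTag + "'";
    }

    public static void main(String[] args) {
        System.out.println(addPartition("tags", "sometag", 2018L, 3L));
    }
}
```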

Related

How to fetch Primary Key/Clustering column names for a particular table using CQL statements?

I am trying to fetch the Primary Key/Clustering Key names for a particular table/entity and implement the same query in my JPA interface (which extends CassandraRepository).
I am not sure whether something like:
@Query("DESCRIBE TABLE <table_name>")
public Object describeTbl();
would work here, since DESCRIBE isn't a valid CQL statement. And if it did work, what would be the type of the returned Object?
Suggestions?
One thing you could try is querying the system_schema.columns table. It is keyed by keyspace_name and table_name, and might be what you're looking for here:
> SELECT column_name,kind FROM system_schema.columns
WHERE keyspace_name='spaceflight_data'
AND table_name='astronauts_by_group';
column_name | kind
-------------------+---------------
flights | regular
group | partition_key
name | clustering
spaceflight_hours | clustering
(4 rows)
DESCRIBE TABLE is supported only in Cassandra 4, which includes the fix for CASSANDRA-14825. Even then it may not help you much, because it just returns the text of the CREATE TABLE statement, and you would need to parse that text to extract the primary key definition. That is doable but could be tricky, depending on the structure of the primary key.
Alternatively, you can obtain the underlying Session object and, via its getMetadata function, get access to the actual metadata object, which exposes information about keyspaces and tables, including their schema.
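As a rough illustration of the system_schema.columns approach, the rows returned by that query can be split by their kind value on the client. The sketch below uses hard-coded rows copied from the sample output instead of a live session; the class and method names are hypothetical. It keeps the order in which rows arrive, so for a composite key the real ordering would still have to come from the position column of system_schema.columns, which this sketch does not read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: groups rows of system_schema.columns by their "kind".
final class KeyColumns {
    // One result row of the query above, reduced to column_name and kind.
    static final class ColumnRow {
        final String name;
        final String kind;
        ColumnRow(String name, String kind) { this.name = name; this.kind = kind; }
    }

    // Returns the names of all columns with the given kind, in input order.
    static List<String> namesOfKind(List<ColumnRow> rows, String kind) {
        List<String> names = new ArrayList<>();
        for (ColumnRow row : rows) {
            if (row.kind.equals(kind)) {
                names.add(row.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for the rows a real query would return
        List<ColumnRow> rows = Arrays.asList(
            new ColumnRow("flights", "regular"),
            new ColumnRow("group", "partition_key"),
            new ColumnRow("name", "clustering"),
            new ColumnRow("spaceflight_hours", "clustering"));
        System.out.println("partition key: " + namesOfKind(rows, "partition_key"));
        System.out.println("clustering:    " + namesOfKind(rows, "clustering"));
    }
}
```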

Update a collection type of a custom type in cassandra

How can I append a new element to a set that is inside a custom type in Cassandra?
custom_type is:
CREATE TYPE custom_type (
normal_type TEXT,
set_type Set<TEXT>
);
and the table to be updated is:
CREATE TABLE test_table (
id TEXT,
my_type FROZEN<custom_type>,
clustering_key TEXT,
PRIMARY KEY ((id),clustering_key)
);
I tried the query below, but it did not work.
@Query("update test_table set my_type.set_type = my_type.set_type + {'newelement'} where id=?1 and clustering_key=?2")
Any idea how to do that?
Using [cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4]
When you declare the column as frozen, the whole value is treated as one piece (a blob), so you can't update parts of this field. The official documentation states:
When using the frozen keyword, you cannot update parts of a user-defined type value. The entire value must be overwritten. Cassandra treats the value of a frozen, user-defined type like a blob.
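In practice, overwriting the entire value means the application has to supply the complete custom_type value, including the elements the set already holds. The sketch below assumes the current value is already known to the application (for example from an earlier read) and renders the full UDT literal for the UPDATE. The class and method names are hypothetical, and a real application would more likely bind a UDT value through its driver than build a literal by hand.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical helper: rebuilds the whole frozen custom_type value for an overwrite.
final class FrozenUdtOverwrite {
    // Quotes a CQL string literal; a single quote is escaped by doubling it.
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    // Renders a complete custom_type literal: {normal_type: '...', set_type: {'a', 'b'}}
    static String udtLiteral(String normalType, Set<String> setType) {
        StringJoiner elements = new StringJoiner(", ", "{", "}");
        for (String element : setType) {
            elements.add(quote(element));
        }
        return "{normal_type: " + quote(normalType) + ", set_type: " + elements + "}";
    }

    // Copies the current set, adds the new element, and overwrites my_type as a whole.
    static String overwriteCql(String normalType, Set<String> currentSet, String newElement) {
        Set<String> updated = new LinkedHashSet<>(currentSet);
        updated.add(newElement);
        return "UPDATE test_table SET my_type = " + udtLiteral(normalType, updated)
             + " WHERE id = ? AND clustering_key = ?";
    }

    public static void main(String[] args) {
        Set<String> current = new LinkedHashSet<>();
        current.add("existing");
        System.out.println(overwriteCql("some text", current, "newelement"));
    }
}
```

If partial updates are needed often, the alternative design is a non-frozen UDT column, which Cassandra supports only when every field of the type is a non-collection type, so it would not fit custom_type as defined here.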

update cassandra field using string concatenation

I am trying to update an existing string column in a Cassandra table.
For example, I want to prepend a domain ID to the username.
This is my table:
id, username
1, agaikwad
2, xyz
I want to write CQL that updates the table above so that it looks like this:
id, username
1, homeoffice\\agaikwad
2, homeoffice\\xyz
This is what I have tried:
update users set username = 'homeoffice\\' + username where id = <id>
This is not allowed in C* because it implicitly requires a read before a write, which is bad practice with C* (and an expensive proposition in a distributed system). For similar behavior you could store this field as a list of strings: lists support the append operation, and you would be able to concatenate on the application side.
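If the table keeps a plain string column, the read and the concatenation have to happen in the client. A minimal sketch of that client-side step, assuming the rows have already been read into a map of id to username, could look like this. The class and method names are hypothetical, and the check that skips values already carrying the prefix is an added assumption so that a rerun does not prefix twice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: computes the new usernames on the application side.
final class UsernamePrefixer {
    // Returns id -> new username for rows the client has already read.
    // Values that already start with the prefix are left unchanged.
    static Map<Integer, String> withPrefix(Map<Integer, String> usernamesById, String prefix) {
        Map<Integer, String> updated = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> row : usernamesById.entrySet()) {
            String name = row.getValue();
            updated.put(row.getKey(), name.startsWith(prefix) ? name : prefix + name);
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<Integer, String> rows = new LinkedHashMap<>();
        rows.put(1, "agaikwad");
        rows.put(2, "xyz");
        // Each entry would then be bound to: UPDATE users SET username = ? WHERE id = ?
        withPrefix(rows, "homeoffice\\")
            .forEach((id, name) -> System.out.println(id + ", " + name));
    }
}
```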

Composite key in Cassandra with Pig and where_clause for part of the key in the where clause

I basically have the same problem as in the question "Composite key in Cassandra with Pig". The only difference is that I try to query on only part of the composite key in Pig's where_clause.
The data structure is similar to the one in that question; I'll copy some code and context to save you from reading it.
We have a CQL table that looks something like this:
CREATE table data (
occurday text,
seqnumber int,
occurtimems bigint,
unique bigint,
fields map<text, text>,
primary key ((occurday, seqnumber), occurtimems, unique)
)
Instead of querying on both seqnumber and occurday (as in the earlier question), I try to query on only one of the keys.
If I execute this query as part of a LOAD from within Pig, however, things don't work.
-- Need to URL encode the query
data = LOAD 'cql://ks/data?where_clause=occurday%3D%272013-10-01%27' USING CqlStorage();
gives
java.lang.RuntimeException
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:665)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:301)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:167)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:181)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:51017)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result$prepare_cql3_query_resultStandardScheme.read(Cassandra.java:50994)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:50933)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1756)
at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1742)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.prepareQuery(CqlPagingRecordReader.java:605)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:635)
... 7 more
Basically, my question is: what am I doing wrong, or what don't I understand?
From "CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated"
I understand that I should be able to query with just part of the partition key?
Also, after reading
"Add CqlRecordReader to take advantage of native CQL pagination"
I get the impression this should be possible, but I cannot find (in my opinion) any clear direction on how to accomplish it.
Any help is very welcome at this point.
Regards,
Lennart Weijl
PS.
I am running on Cassandra 2.0.9 with Pig 0.13.0
According to CASSANDRA-6311, I believe you need to apply the 6331-v2-2.0-branch.txt patch, recompile Pig, and then update your LOAD statement to:
data = LOAD 'cql://ks/data?where_clause=occurday%3D%272013-10-01%27' USING CqlInputFormat();
The key change is USING CqlInputFormat(), which triggers the use of the new CqlRecordReader that was released in Cassandra 2.0.7.
Edit: Note that the exception is thrown from CqlPagingRecordReader, which means you're still using the old record reader.
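The URL-encoded where_clause in the LOAD statement can be produced programmatically instead of by hand. The following is a small sketch using only the Java standard library; the class and method names are hypothetical. java.net.URLEncoder targets form encoding and writes a space as "+", so the sketch rewrites that to "%20" to match the style used in these LOAD URLs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the cql:// location string for a Pig LOAD statement.
final class WhereClauseEncoder {
    // Percent-encodes a CQL where clause for use as a URL query parameter.
    static String encode(String whereClause) {
        try {
            // A literal '+' in the input is emitted as %2B, so every '+' left in
            // the encoder output stands for a space and can safely become %20.
            return URLEncoder.encode(whereClause, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    static String loadUrl(String keyspace, String table, String whereClause) {
        return "cql://" + keyspace + "/" + table + "?where_clause=" + encode(whereClause);
    }

    public static void main(String[] args) {
        System.out.println(loadUrl("ks", "data", "occurday='2013-10-01'"));
    }
}
```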

Composite key in Cassandra with Pig

We have a CQL table that looks something like this:
CREATE table data (
occurday text,
seqnumber int,
occurtimems bigint,
unique bigint,
fields map<text, text>,
primary key ((occurday, seqnumber), occurtimems, unique)
)
I can query this table from cqlsh like this:
select * from data where seqnumber = 10 AND occurday = '2013-10-01';
This query works and returns the expected data.
If I execute this query as part of a LOAD from within Pig, however, things don't work.
-- Need to URL encode the query
data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage();
gives
InvalidRequestException(why:seqnumber cannot be restricted by more than one relation if it includes an Equal)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:39567)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1625)
at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1611)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.prepareQuery(CqlPagingRecordReader.java:591)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:621)
Shouldn't these behave the same? Why is the version through Pig failing where the straight cqlsh command works?
Hadoop is using CqlPagingRecordReader to try to load your data. As a result, the queries that actually run are not identical to the one you entered. The paging record reader tries to fetch small slices of Cassandra data at a time to avoid timeouts.
This means that your query is executed as
SELECT * FROM "data" WHERE token("occurday","seqnumber") > ? AND
token("occurday","seqnumber") <= ? AND occurday='A Great Day'
AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
And this is why you are seeing the repeated-key error: the partition key columns are restricted both by the token() range and by your equality conditions. I'll submit a bug to the Cassandra project.
Jira:
https://issues.apache.org/jira/browse/CASSANDRA-6151
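To make the conflict easier to see, the way the paged query is assembled can be imitated with plain string building. This is only an illustration of the query shape quoted above, not the actual CqlPagingRecordReader code; the class and method names are hypothetical.

```java
// Illustration only: imitates the shape of the paged query, not the real reader code.
final class PagingQuerySketch {
    static String pagedQuery(String table, String[] partitionKeys, String userWhere, int limit) {
        // Quote each partition key column and join them for token(...)
        StringBuilder columns = new StringBuilder();
        for (int i = 0; i < partitionKeys.length; i++) {
            if (i > 0) {
                columns.append(',');
            }
            columns.append('"').append(partitionKeys[i]).append('"');
        }
        String token = "token(" + columns + ")";
        // The token range is added first, then the user's where_clause is appended,
        // so a partition key named in where_clause ends up restricted twice.
        return "SELECT * FROM \"" + table + "\" WHERE " + token + " > ? AND "
             + token + " <= ? AND " + userWhere
             + " LIMIT " + limit + " ALLOW FILTERING";
    }

    public static void main(String[] args) {
        System.out.println(pagedQuery("data",
            new String[] {"occurday", "seqnumber"},
            "occurday='A Great Day' AND seqnumber=1", 1000));
    }
}
```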
