Composite keys and counters: I can create the column family but can't fill it (CQL)

I understood a counter family could have keys of any kind. Are composite keys unsupported?
cqlsh:goh_master> create columnfamily balance (kind ascii, corporation_id ascii, amount counter, primary key ( kind,corporation_id) ) with compact storage;
cqlsh:goh_master> insert into balance(kind,corporation_id,amount) values ('c',103,123456789);
Bad Request: invalid operation for commutative columnfamily balance

I solved it myself thanks to this answer.
You can't INSERT into a counter column, nor SET it to an absolute value.
You must always use the set counter = counter + n syntax:
cqlsh:goh_master> update balance set amount=amount+12 where kind='c' and corporation_id = 103;
That worked like a charm.
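As a minimal sketch of the rule above, assuming the balance table from the question and quoting corporation_id as '103' because the column is declared ascii (the amounts are illustrative): the first UPDATE creates the counter row implicitly, and a decrement uses the same form.

```cql
-- There is no INSERT for counters: the first increment creates the row.
UPDATE balance SET amount = amount + 123456789
 WHERE kind = 'c' AND corporation_id = '103';

-- A decrement uses the same syntax with a minus sign.
UPDATE balance SET amount = amount - 12
 WHERE kind = 'c' AND corporation_id = '103';

-- Read the current value back.
SELECT amount FROM balance WHERE kind = 'c' AND corporation_id = '103';
```

To force a counter to a specific value you would have to read it and apply the difference yourself, which is not atomic.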

Related

How to make a sequence of select, update and insert atomic in one single Cassandra statement?

I'm dealing with 1 million tweets (arriving at about 5K per second) and I would like to do something similar to the code below in Cassandra. Let's say that I'm using a Lambda Architecture.
I know the following code does not work; I just want to explain my logic through it.
DROP TABLE IF EXISTS hashtag_trend_by_week;
CREATE TABLE hashtag_trend_by_week(
shard_week timestamp,
hashtag text ,
counter counter,
PRIMARY KEY ( ( shard_week ), hashtag )
) ;
DROP TABLE IF EXISTS topten_hashtag_by_week;
CREATE TABLE topten_hashtag_by_week(
shard_week timestamp,
counter bigInt,
hashtag text ,
PRIMARY KEY ( ( shard_week ), counter, hashtag )
) WITH CLUSTERING ORDER BY ( counter DESC );
BEGIN BATCH
UPDATE hashtag_trend_by_week SET counter = counter + 22 WHERE shard_week='2021-06-15 12:00:00' and hashtag ='Gino';
INSERT INTO topten_hashtag_trend_by_week( shard_week, hashtag, counter) VALUES ('2021-06-15 12:00:00','Gino',
SELECT counter FROM hashtag_trend_by_week WHERE shard_week='2021-06-15 12:00:00' AND hashtag='Gino'
) USING TTL 7200;
APPLY BATCH;
Then the final query to satisfy my UI should be something like
SELECT hashtag, counter FROM topten_hashtag_by_week WHERE shard_week='2021-06-15 12:00:00' limit 10;
Any suggestions?
You can only have CQL counter columns in a counter table so you need to rethink the schema for the hashtag_trend_by_week table.
Batch statements are used for making writes atomic in Cassandra so including a SELECT statement does not make sense.
The final query for topten_hashtag_by_week looks fine to me. Cheers!
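A hedged sketch of how the flow could be split into separate statements, assuming hashtag_trend_by_week stays a dedicated counter table as defined in the question and that the application reads the count between the two writes (the value 22 in step 3 stands for whatever that SELECT returns). Counter updates cannot share a batch with regular writes and cannot take a TTL, so the three steps are not atomic as a whole.

```cql
-- 1. Counter increments may only be batched with other counter increments.
BEGIN COUNTER BATCH
  UPDATE hashtag_trend_by_week SET counter = counter + 22
   WHERE shard_week = '2021-06-15 12:00:00' AND hashtag = 'Gino';
APPLY BATCH;

-- 2. The application reads the current count.
SELECT counter FROM hashtag_trend_by_week
 WHERE shard_week = '2021-06-15 12:00:00' AND hashtag = 'Gino';

-- 3. The application writes that value (assumed here to be 22) into the
--    regular table; a TTL is allowed because this is not a counter table.
INSERT INTO topten_hashtag_by_week (shard_week, counter, hashtag)
VALUES ('2021-06-15 12:00:00', 22, 'Gino') USING TTL 7200;
```

Because counter is a clustering column in topten_hashtag_by_week, each new count for a hashtag is stored as a new row; older rows for the same hashtag remain until they are deleted or their TTL expires.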

Yugabyte YCQL check if a set contain a value?

Is there any way to query a SET type (or MAP/LIST) to find out whether it contains a value?
Something like this:
CREATE TABLE test.table_name(
id text,
ckk SET<INT>,
PRIMARY KEY((id))
);
Select * FROM table_name WHERE id = 1 AND ckk CONTAINS 4;
Is there any way to reach this query with YCQL api?
And can we use a SET type in a SECONDARY INDEX?
Is there any way to reach this query with YCQL api?
YCQL does not support the CONTAINS keyword yet (feel free to open an issue for this on the YugabyteDB GitHub).
One workaround can be to use MAP<INT, BOOLEAN> instead of SET<INT> and the [] operator.
For instance:
CREATE TABLE test.table_name(
id text,
ckk MAP<int, boolean>,
PRIMARY KEY((id))
);
SELECT * FROM table_name WHERE id = 'foo' AND ckk[4] = true;
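For completeness, a small sketch of how the map could be written so that it behaves like a set, assuming the test.table_name definition above. The key values are illustrative, and the statements use the standard CQL map literal and element-assignment syntax, so it is worth checking them against the YCQL version in use.

```cql
-- Store the "set" {4, 7} as map keys with a constant true value.
INSERT INTO test.table_name (id, ckk) VALUES ('foo', {4 : true, 7 : true});

-- Add a member by assigning a single map element.
UPDATE test.table_name SET ckk[9] = true WHERE id = 'foo';
```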
And can we use a SET type in a SECONDARY INDEX?
Generally, collection types cannot be part of the primary key, or an index key.
However, "frozen" collections (i.e. collections serialized into a single value internally) can actually be part of either primary key or index key.
For instance:
CREATE TABLE table2(
id TEXT,
ckk FROZEN<SET<INT>>,
PRIMARY KEY((id))
) WITH transactions = {'enabled' : true};
CREATE INDEX table2_idx on table2(ckk);
Another option is to use a compound primary key and define ckk as a clustering key:
cqlsh> CREATE TABLE ybdemo.tt(id TEXT, ckk INT, PRIMARY KEY ((id), ckk)) WITH CLUSTERING ORDER BY (ckk DESC);
cqlsh> SELECT * FROM ybdemo.tt WHERE id='foo' AND ckk=4;
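With this layout each former set member becomes its own row, so populating and listing it could look like the sketch below (values are illustrative; ybdemo.tt is the table defined above).

```cql
-- One row per (id, member) pair replaces the SET<INT> column.
INSERT INTO ybdemo.tt (id, ckk) VALUES ('foo', 4);
INSERT INTO ybdemo.tt (id, ckk) VALUES ('foo', 7);

-- All members for one id, returned in clustering order (ckk DESC).
SELECT ckk FROM ybdemo.tt WHERE id = 'foo';
```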

nested map in cassandra data modelling

I have the following requirement for my dataset and need to understand which datatype I should use and how to save my data accordingly:
CREATE TABLE events (
id text,
evntoverlap map<text, map<timestamp,int>>,
PRIMARY KEY (id)
)
evntoverlap = {
'Dig1': {{'2017-10-09 04:10:05', 0}},
'Dig2': {{'2017-10-09 04:11:05', 0},{'2017-10-09 04:15:05', 0}},
'Dig3': {{'2017-10-09 04:11:05', 0},{'2017-10-09 04:15:05', 0},{'2017-10-09 04:11:05', 0}}
}
This gives an error :-
Error from server: code=2200 [Invalid query] message="Non-frozen collections are not allowed inside collections: map<text, map<timestamp, int>>"
How should I store this type of data in a single column? Please suggest a datatype and an insert command for it.
Thanks,
This is a limitation of Cassandra: you can't nest a collection (or UDT) inside a collection without making it frozen. So you need to freeze one of the collections, either the nested one:
CREATE TABLE events (
id text,
evntoverlap map<text, frozen<map<timestamp,int>>>,
PRIMARY KEY (id)
);
or top-level:
CREATE TABLE events (
id text,
evntoverlap frozen<map<text, map<timestamp,int>>>,
PRIMARY KEY (id)
);
See documentation for more details.
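Since the question also asks for an insert command, here is a minimal sketch for the first variant (map<text, frozen<map<timestamp,int>>>). The id 'e1' and the replacement values are illustrative assumptions; the inner entries are written as key: value pairs.

```cql
-- Each outer key maps to a complete (frozen) inner map.
INSERT INTO events (id, evntoverlap) VALUES ('e1', {
  'Dig1': {'2017-10-09 04:10:05': 0},
  'Dig2': {'2017-10-09 04:11:05': 0, '2017-10-09 04:15:05': 0}
});

-- A frozen inner map can only be replaced as a whole, not edited in place.
UPDATE events
   SET evntoverlap['Dig1'] = {'2017-10-09 04:10:05': 0, '2017-10-09 04:12:05': 1}
 WHERE id = 'e1';
```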
CQL collections are limited to 64 KB, so if you put things like maps in maps you might hit that limit. With frozen maps in particular, you are deserializing the entire map, modifying it, and reinserting it. You might be better off with:
CREATE TABLE events (
id text,
evnt_key text,
value map<timestamp, int>,
PRIMARY KEY ((id), evnt_key)
);
Or even:
CREATE TABLE events (
id text,
evnt_key text,
evnt_time timestamp,
value int,
PRIMARY KEY ((id), evnt_key, evnt_time)
);
It would be more efficient and safer, while giving additional benefits such as being able to order the evnt_time values in ascending or descending order.
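A short sketch of how the fully flattened layout (id, evnt_key, evnt_time, value) might be used; the id 'e1' and the time window are illustrative assumptions. Each event becomes one row, and a time range can be read without touching the rest of the partition.

```cql
-- One row per event instead of one entry in a nested map.
INSERT INTO events (id, evnt_key, evnt_time, value)
VALUES ('e1', 'Dig2', '2017-10-09 04:11:05', 0);
INSERT INTO events (id, evnt_key, evnt_time, value)
VALUES ('e1', 'Dig2', '2017-10-09 04:15:05', 0);

-- Range scan on the last clustering column; rows come back in clustering order.
SELECT evnt_time, value FROM events
 WHERE id = 'e1' AND evnt_key = 'Dig2'
   AND evnt_time >= '2017-10-09 04:00:00'
   AND evnt_time <  '2017-10-09 05:00:00';
```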

com.datastax.driver.core.exceptions.InvalidQueryException: Invalid operator IN for PRIMARY KEY part

I have cassandra 2.1.15.
I have this table
CREATE TABLE ks_mobapp.messages (
pair_id text,
belong_to text,
message_id timeuuid,
cli_time bigint,
sender text,
text text,
time bigint,
PRIMARY KEY ((pair_id, belong_to), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC)
I was trying to delete multiple records like this:
instances.getCqlSession().execute(QueryBuilder.delete()
.from(AppConstants.KEYSPACE, "messages")
.where(QueryBuilder.eq("pair_id", pairId))
.and(QueryBuilder.eq("belong_to", currentUser.value("userId")))
.and(QueryBuilder.in("message_id", msgId)));
I am getting error:
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid operator IN for PRIMARY KEY part message_id
Then I tried:
Session session = instances.getCqlSession();
PreparedStatement statement = session.prepare("DELETE FROM ks_mobApp.messages WHERE pair_id = ? AND belong_to = ? AND message_id = ?;");
Iterator<String> iterator = msgId.iterator();
while(iterator.hasNext()) {
try {
session.executeAsync(statement.bind(pairId, currentUser.value("userId"), UUID.fromString(iterator.next())));
} catch(Exception ex) {
}
}
It works nicely. Is this the correct way? Can't I use IN within the same partition?
In Cassandra 2.x, the IN relation in a DELETE is only supported on the partition key.
There are some WHERE clause restrictions for the UPDATE and DELETE statements in Cassandra 2.x.
More specifically, you can only use the IN operator on the last partition key column. In your case the last partition key column is belong_to, so IN can only be used on that column.
However, these limitations are removed in Cassandra 3.0, which allows:
IN to be specified on any partition key column
IN to be specified on any clustering column
Here is the patch https://issues.apache.org/jira/browse/CASSANDRA-6237
Read this also http://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause
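To make the difference concrete, here is a hedged sketch in plain CQL against the ks_mobapp.messages table from the question. The key values and timeuuids are made-up placeholders. The first statement relies on the Cassandra 3.0 behaviour described above; the second is one possible 2.1-compatible alternative, grouping single-row deletes for the same partition in an unlogged batch, in addition to the asynchronous loop shown in the question.

```cql
-- Cassandra 3.0 and later: IN on the clustering column is accepted.
DELETE FROM ks_mobapp.messages
 WHERE pair_id = 'p1' AND belong_to = 'u1'
   AND message_id IN (50554d6e-29bb-11e5-b345-feff819cdc9f,
                      5b6962dd-3f90-11e4-9f8a-1b2c3d4e5f60);

-- Cassandra 2.1: one DELETE per message_id, all in the same partition.
BEGIN UNLOGGED BATCH
  DELETE FROM ks_mobapp.messages
   WHERE pair_id = 'p1' AND belong_to = 'u1'
     AND message_id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
  DELETE FROM ks_mobapp.messages
   WHERE pair_id = 'p1' AND belong_to = 'u1'
     AND message_id = 5b6962dd-3f90-11e4-9f8a-1b2c3d4e5f60;
APPLY BATCH;
```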

Cassandra/Hector: Add a counter on a composite primary key

I've created a table in CQL3 console (no single primary key constituent is unique, together they will be):
CREATE TABLE aggregate_logs (
bpid varchar,
jid int,
month int,
year int,
value counter,
PRIMARY KEY (bpid, jid, month, year));
and I have been able to update and query it using:
UPDATE aggregate_logs SET value = value + 1 WHERE bpid='1' and jid=1 and month=1 and year=2000;
This works as expected. I wanted to do the same update in Hector (in Scala):
val aggregateMutator:Mutator[Composite] = HFactory.createMutator(keyspace, compositeSerializer)
val compKey = new Composite()
compKey.addComponent(bpid, stringSerializer)
compKey.addComponent(new Integer(jid), intSerializer)
compKey.addComponent(new Integer(month), intSerializer)
compKey.addComponent(new Integer(year), intSerializer)
aggregateMutator.incrementCounter(compKey, LogsAggregateFamily, "value", 1)
but I get an error with the message:
...HInvalidRequestException: InvalidRequestException(why:String didn't validate.)
Running the query direct from hector with:
val query = new me.prettyprint.cassandra.model.CqlQuery(keyspace, compositeSerializer, stringSerializer, new IntegerSerializer())
query.setQuery("UPDATE aggregate_logs SET value = value + 1 WHERE 'bpid'=1 and jid=1 and month=1 and year=2000")
query.execute()
which gives me the error:
InvalidRequestException(why:line 1:59 mismatched input 'and' expecting EOF)
I've not seen any other examples which use a counter under a composite primary key. Is it even possible?
It's definitely possible using CQL directly (both via cqlsh and C++, at least):
cqlsh:goh_master> describe table daily_caps;
CREATE TABLE daily_caps
( caps_type ascii, id ascii, value counter, PRIMARY KEY
(caps_type, id) ) WITH COMPACT STORAGE AND comment='' AND
caching='KEYS_ONLY' AND read_repair_chance=0.100000 AND
gc_grace_seconds=864000 AND replicate_on_write='true' AND
compaction_strategy_class='SizeTieredCompactionStrategy' AND
compression_parameters:sstable_compression='SnappyCompressor';
cqlsh:goh_master> update daily_caps set value=value +1 where caps_type='xp' and id ='myid';
cqlsh:goh_master> select * from daily_caps;
caps_type | id | value
-----------+------+-------
xp | myid | 1
CQL3 and the thrift API are not compatible. So creating a column family with CQL3 and accessing it with Hector or another thrift based client will not work. For more information see:
https://issues.apache.org/jira/browse/CASSANDRA-4377
