PRIMARY KEY part colname cannot be restricted by IN relation - cassandra

My CQL3 table looks like this:
CREATE TABLE stringindice (
id text,
colname text,
colvalue blob,
PRIMARY KEY (id, colname, colvalue)
) WITH COMPACT STORAGE
and I have inserted some values into it. Now when I try to do something like this:
QueryBuilder.select().all().from(keySpace, indTable).where().and(QueryBuilder.eq("id", "rowKey")).and(QueryBuilder.in("colname", "string1", "string2"));
which is essentially
select * from stringindice where id = 'rowKey' and colname IN ('string1', 'string2')
I get the following exception:
com.datastax.driver.core.exceptions.InvalidQueryException: PRIMARY KEY part colname cannot be restricted by IN relation
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
at com.datastax.driver.core.Session.execute(Session.java:110)
In the documentation of CQL3, it is written that
"Moreover, the IN relation is only allowed on the last column of the
partition key and on the last column of the full primary key."
So it seems that this is not supported. If that is the case, how can I match a column against several values at once, the way IN would?

It is because you are using compact storage, so the composite column is colname:colvalue (and the value is empty). This means colname is not the last column of the full primary key.
If you don't use compact storage (which is the recommendation for all new data models), you can use the equivalent schema:
CREATE TABLE stringindice (
id text,
colname text,
colvalue blob,
PRIMARY KEY (id, colname)
);
Then your IN query will work:
cqlsh:ks> insert into stringindice (id, colname, colvalue) VALUES ('rowkey', 'string1', 0x01);
cqlsh:ks> insert into stringindice (id, colname, colvalue) VALUES ('rowkey', 'string2', 0x02);
cqlsh:ks> insert into stringindice (id, colname, colvalue) VALUES ('rowkey', 'string3', 0x03);
cqlsh:ks> select * from stringindice where id = 'rowkey' and colname IN ('string1', 'string2');
id | colname | colvalue
--------+---------+----------
rowkey | string1 | 0x01
rowkey | string2 | 0x02
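
The rule quoted from the CQL3 documentation can be modeled in a few lines. The Python sketch below is only an illustration of that rule as quoted in the question, not Cassandra's actual validation code: the function name in_allowed is hypothetical, and later Cassandra versions relax the restriction. It shows why colname qualifies under the two-column key but not under the three-column one.

```python
def in_allowed(partition_key, clustering_columns, column):
    """Return True if `column` may be restricted by IN under the quoted rule.

    Models only the documentation sentence quoted in the question: IN is
    allowed on the last column of the partition key and on the last column
    of the full primary key. This is a hypothetical helper, not a driver API.
    """
    full_primary_key = list(partition_key) + list(clustering_columns)
    return column == partition_key[-1] or column == full_primary_key[-1]


# Original table: PRIMARY KEY (id, colname, colvalue)
print(in_allowed(["id"], ["colname", "colvalue"], "colname"))  # False

# Suggested table: PRIMARY KEY (id, colname)
print(in_allowed(["id"], ["colname"], "colname"))  # True
```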

Related

Unable to delete data from a Cassandra CF

I have a CF whose schema looks something like this:
CREATE TABLE "emp" (
id text,
column1 text,
column2 text,
PRIMARY KEY (id, column1, column2)
)
I have an entry which looks like this, and I want to delete it:
20aff8144049 | name | someValue
So I tried this command:
Delete column2 from emp where id='20aff8144049';
It failed with below error:
no viable alternative at input '20aff8144049' (...column2 from emp where id=["20aff8144049]...)
Can someone help with where I'm going wrong? Thanks!
You can't delete a primary key column or set it to null.
You have to delete the entire row.
You can only delete an entry using a valid value for your primary key. You defined your primary key to include (id, column1, column2), which means that you have to put all the corresponding values in your WHERE clause.
However, I assume you wanted to be able to delete by id only. Therefore, I'd suggest you re-define your column family like this:
CREATE TABLE "emp" (
id text,
column1 text,
column2 text,
PRIMARY KEY ((id), column1, column2)
)
where id is your partition key and column1 and column2 are your clustering columns.

Invalid list literal for <column name> of type frozen<list<bigint>>

I am having an issue filtering by a column of type frozen<list<bigint>> that is part of a clustering key sorted in DESC order.
Context
This is the definition of my keyspace and tables
CREATE KEYSPACE hello WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE hello.table1 (
fn bigint,
et smallint,
st frozen<list<bigint>>,
tn bigint,
ts timestamp,
PRIMARY KEY ((fn, et), st, tn));
CREATE TABLE hello.table2 (
fn bigint,
et smallint,
st frozen<list<bigint>>,
tn bigint,
ts timestamp,
PRIMARY KEY ((fn, et), st, tn)) WITH CLUSTERING ORDER BY (st DESC, tn DESC);
What works...
Inserting records into table1 works fine:
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[23],1,0);
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[24],1,0);
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[25],1,0);
Selecting records and specifying a frozen column in the where clause also works fine
select * from hello.table1 where fn=1 and et=1 and st=[23];
What does not work...
Inserting records into table2 does not work:
INSERT INTO hello.table2(fn, et,st,tn, ts) VALUES ( 1,1,[23],1,0);
And if I already have records inserted (from my application), selecting records and specifying a frozen column in the where clause also does not work
select * from hello.table2 where fn=1 and et=1 and st=[23];

CQL IN set query

Have a table
CREATE TABLE IF NOT EXISTS tabletest (uuid text, uuidHotel text, uuidRoom text, uuidGuest text, bookedTimeStampSet set<text>, PRIMARY KEY (uuidHotel, uuidRoom));
Tried to select with IN:
select * from tabletest where uuidhotel = 'uuidHotel' and bookedtimestampset IN ('1460710800000');
Got
'bookedtimestampset' (set<text>) cannot be restricted by a 'IN' relation
Can I select elements by IN Set filter?
No, but you can put a secondary index on bookedtimestampset and use the CONTAINS operator:
aploetz@cqlsh:stackoverflow> CREATE INDEX timeset_idx ON tabletest(bookedtimestampset);
aploetz@cqlsh:stackoverflow> SELECT uuidhotel,uuidroom FROM tabletest
WHERE uuidhotel = 'uuidHotel1' and bookedtimestampset CONTAINS '1460710800000';
uuidhotel | uuidroom
------------+----------
uuidHotel1 | uuidroom1
(1 rows)
Normally I wouldn't recommend a secondary index, but as long as you are filtering by a partition key (uuidhotel) it should perform ok.
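
To make the difference from IN concrete, here is a small Python sketch of what the CONTAINS filter expresses: a row matches when its set holds the given element, rather than when the column equals one of several listed values. The sample rows and the helper name select_contains are made up for illustration; only uuidHotel1, uuidroom1 and the timestamp string come from the query above.

```python
# Hypothetical sample rows; only the first room mirrors the cqlsh output above.
rows = [
    {"uuidhotel": "uuidHotel1", "uuidroom": "uuidroom1",
     "bookedtimestampset": {"1460710800000", "1460714400000"}},
    {"uuidhotel": "uuidHotel1", "uuidroom": "uuidroom2",
     "bookedtimestampset": {"1460718000000"}},
    {"uuidhotel": "uuidHotel2", "uuidroom": "uuidroom1",
     "bookedtimestampset": {"1460710800000"}},
]


def select_contains(rows, hotel, element):
    """Mimic: WHERE uuidhotel = hotel AND bookedtimestampset CONTAINS element."""
    return [
        (row["uuidhotel"], row["uuidroom"])
        for row in rows
        if row["uuidhotel"] == hotel and element in row["bookedtimestampset"]
    ]


print(select_contains(rows, "uuidHotel1", "1460710800000"))
```

Restricting by the partition key first, as the query does, keeps the membership check confined to one hotel's rows.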
Can I select elements by IN Set filter?
You can't use an IN clause here, because bookedtimestampset is not part of your primary key. It is important to understand how strongly the data model influences query performance. Of course, you can add a secondary index on the bookedtimestampset column, but in that case be prepared for performance degradation.
CREATE TABLE IF NOT EXISTS tabletest (uuid text, uuidHotel text, uuidRoom text, uuidGuest text, bookedTimeStampSet set<text>, PRIMARY KEY (uuidHotel, uuidRoom));
Your compound primary key consists of one partition key, uuidHotel, and one clustering key, uuidRoom. This means that all rooms of a hotel are physically stored together on the same node, in order, so retrieving those rows is very efficient. bookedTimeStampSet is a regular column whose values are spread across the whole cluster, so it is impossible to restrict by this column without a secondary index on it.
Consequently, I would recommend designing your primary key around your future queries, even if you need to duplicate some data, which is common practice for a NoSQL database such as Cassandra.
e.g.
CREATE TABLE IF NOT EXISTS tabletest (uuid text, uuidHotel text,
uuidRoom text, uuidGuest text, bookedTimeStamp timestamp, PRIMARY KEY
(uuidHotel, bookedTimeStamp , uuidRoom))
it allows you to make a query like
select * from tabletest where uuidhotel = 'uuidHotel' and
bookedtimestamp > '1460710800000' and bookedtimestamp < '1460710900000'

Columns ordering in Cassandra

When I create a table in CQL, does the order of the columns that are NOT part of the primary key (neither partition key nor clustering columns) matter? For example, given:
CREATE TABLE user (
a ascii,
b ascii,
c ascii,
PRIMARY KEY (a)
);
Is it equivalent to this?
CREATE TABLE user (
a ascii,
c ascii, -- switched
b ascii, -- switched
PRIMARY KEY (a)
);
Thank you for your help
Both of those statements will fail, because of:
The extra comma.
You have not provided a primary key definition.
Assuming you had those fixed, then the answer is still "yes they are the same."
Cassandra applies its own order to your columns at table creation time. Consider this table as I have typed it:
CREATE TABLE testorder (
acolumn text,
jcolumn text,
dcolumn text,
bcolumn text,
apkey text,
bpkey text,
ackey text,
bckey text,
PRIMARY KEY ((bpkey,apkey),bckey,ackey));
After creating it, I'll describe the table so you can see the order that Cassandra has applied to the columns.
aploetz@cqlsh:stackoverflow> desc table testorder ;
CREATE TABLE stackoverflow.testorder (
bpkey text,
apkey text,
bckey text,
ackey text,
acolumn text,
bcolumn text,
dcolumn text,
jcolumn text,
PRIMARY KEY ((bpkey, apkey), bckey, ackey)
) WITH CLUSTERING ORDER BY (bckey ASC, ackey ASC)
Essentially, Cassandra will order the partition keys and the clustering keys (ordered by their precedence in the PRIMARY KEY definition), and then the columns follow in ascending order.
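
The ordering described above can be reproduced with a short Python sketch. This only models the observed DESCRIBE output for the testorder table; the function name describe_order is hypothetical and not part of any Cassandra tooling.

```python
def describe_order(columns, partition_key, clustering_columns):
    """Order columns the way the DESCRIBE output above lists them.

    Key columns keep their order from the PRIMARY KEY definition
    (partition keys first, then clustering keys); all remaining
    columns follow in ascending name order.
    """
    key_columns = list(partition_key) + list(clustering_columns)
    regular_columns = sorted(c for c in columns if c not in key_columns)
    return key_columns + regular_columns


# Columns in the order they were typed in the CREATE TABLE statement.
typed = ["acolumn", "jcolumn", "dcolumn", "bcolumn",
         "apkey", "bpkey", "ackey", "bckey"]

ordered = describe_order(typed, ["bpkey", "apkey"], ["bckey", "ackey"])
print(ordered)
# ['bpkey', 'apkey', 'bckey', 'ackey', 'acolumn', 'bcolumn', 'dcolumn', 'jcolumn']
```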

Cassandra Composite Column Family

I have a simple requirement. In the SQL world, I would want to create:
CREATE TABLE event_tracking (
key text,
trackingid timeuuid,
entityId bigint,
entityType text,
userid bigint,
PRIMARY KEY (key, trackingid)
)
I need a CLI create command for this, which I have not been able to write. I need to create the column family through the CLI because Pig cannot read a column family created through cqlsh (duh).
Here is what I tried, which didn't work:
create column family event_tracking
... WITH comparator='CompositeType(TimeUUIDType)'
... AND key_validation_class=UTF8Type
... AND default_validation_class = UTF8Type;
1) I don't know why it adds the value column when I look at it in cqlsh:
CREATE TABLE event_tracking (
key text,
trackingid timeuuid,
value text,
PRIMARY KEY (key, trackingid)
) WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
2) I am using Astyanax to insert the row.
OperationResult<CqlResult<Integer, String>> result = keyspace.prepareQuery(CQL3_CF)
.withCql("INSERT INTO event_tracking (key, column1, value) VALUES ("+System.currentTimeMillis()+","+TimeUUIDUtils.getTimeUUID(System.currentTimeMillis())+",'23232323');").execute();
but as soon as I try to add dynamic columns, they are not recognized:
OperationResult<CqlResult<Integer, String>> result = keyspace.prepareQuery(CQL3_CF)
.withCql("INSERT INTO event_tracking (key, column1, value, userId, event) VALUES ("+System.currentTimeMillis()+","+TimeUUIDUtils.getTimeUUID(System.currentTimeMillis())+",'23232323', 123455, 'view');").execute();
It looks like I cannot add dynamic columns through CQL3.
3) If I try to add new column through cql3
alter table event_tracking add eventid bigint;
it gives me
Bad Request: Cannot add new column to a compact CF
0) If you create the table with COMPACT STORAGE Pig should be able to see it, even if you create it from CQL3. But you would need to put entityId and entityType into the primary key too for that to work (compact storage basically means that the first column in the primary key becomes the row key and the following become a composite type used as the column key, and then there is only room for one more column which will be the value).
1) When you create tables the old way there will always be a value, it's the value of the column, and in CQL3 that is represented as a column called value. This is just how CQL3 maps the underlying storage model onto tables.
2) You have created a table whose columns are of the type CompositeType(TimeUUIDType), so you can only add columns that are TimeUUIDs. You can't tell C* to save a string as a TimeUUID column key.
3) Looping back to 0), use this table:
CREATE TABLE event_tracking (
key text,
trackingid timeuuid,
entityId bigint,
entityType text,
userid bigint,
PRIMARY KEY (key, trackingid, entityId, entityType)
) WITH COMPACT STORAGE
This one assumes that there can only be one trackingId/entityId/entityType combination for each userid (what's up with your inconsistent capitalization, btw?). If that's not the case, you need to go the full dynamic columns route, but then you can't have different data types for entityId and entityType (this would have been the case before CQL3 too). See this question for an example of how to do dynamic columns: Inserting arbitrary columns in Cassandra using CQL3
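
The mapping described in 0) can be sketched in Python. This is a simplified model of the explanation above, not Cassandra's storage code: the helper name to_compact_cell and the sample values are hypothetical. The first primary key column becomes the row key, the remaining primary key columns form the composite column name, and at most one other column is left to hold the value.

```python
def to_compact_cell(row, primary_key):
    """Map a CQL row to (row_key, composite_column_name, value).

    Simplified model of COMPACT STORAGE as described above: there is
    room for only one column outside the primary key.
    """
    row_key = row[primary_key[0]]
    composite_name = tuple(row[column] for column in primary_key[1:])
    remaining = [column for column in row if column not in primary_key]
    if len(remaining) > 1:
        raise ValueError("compact storage leaves room for only one value column")
    value = row[remaining[0]] if remaining else None
    return row_key, composite_name, value


# Hypothetical sample row for the suggested table.
row = {"key": "k1", "trackingid": "tid-1", "entityId": 42,
       "entityType": "page", "userid": 123455}

cell = to_compact_cell(row, ["key", "trackingid", "entityId", "entityType"])
print(cell)  # ('k1', ('tid-1', 42, 'page'), 123455)
```

With the original PRIMARY KEY (key, trackingid), the same row would leave three columns outside the key, and the sketch raises ValueError, which mirrors why entityId and entityType have to move into the primary key.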

Resources