Non-EQ relation error in Cassandra - how to fix the primary key?

I created a table named posts. When I run this SELECT query:
return $this->db->query('SELECT * FROM "posts" WHERE "id" IN(:id) LIMIT '.$this->limit_per_page, ['id' => $id]);
I get this error:
PRIMARY KEY column "id" cannot be restricted (preceding column
"post_at" is either not restricted or by a non-EQ relation)
My table dump is:
CREATE TABLE posts (
id uuid,
post_at timestamp,
user_id bigint,
name text,
category set<text>,
link varchar,
image set<varchar>,
video set<varchar>,
content map<text, text>,
private boolean,
PRIMARY KEY (user_id,post_at,id)
)
WITH CLUSTERING ORDER BY (post_at DESC);
I read some articles about primary and clustering keys and understood that when the primary key has several columns, the columns before the one I filter on must be restricted with the = operator before I can use IN. In my case, I cannot use a single-column primary key. What would you advise me to change in the table structure so that the error disappears?

Here is my dummy table structure, with id as the first column of the primary key:
CREATE TABLE posts (
id timeuuid,
post_at timestamp,
user_id bigint,
PRIMARY KEY (id,post_at,user_id)
)
WITH CLUSTERING ORDER BY (post_at DESC);
After inserting some dummy data, I ran the query below. Because id is now the partition key, it can be restricted with IN:
select * from posts where id in (timeuuid1,timeuuid2,timeuuid3);
I was using Cassandra 2.0 with CQL 3.0.
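One trade-off of the dummy table above is that posts can no longer be listed per user, since user_id is no longer the partition key. A possible sketch, assuming it is acceptable to write each post twice, is to keep the original posts table for per-user reads and add a second table keyed by id. The table name posts_by_id and all literal values below are hypothetical.

```cql
-- Hypothetical lookup table (the name posts_by_id is an assumption).
-- id alone is the partition key, so no preceding column needs restricting.
CREATE TABLE posts_by_id (
    id uuid,
    user_id bigint,
    post_at timestamp,
    name text,
    PRIMARY KEY (id)
);

-- IN on the partition key is accepted; the UUIDs are placeholders.
SELECT * FROM posts_by_id
WHERE id IN (4978f728-0f96-11e5-a6c0-1697f925ec7b,
             b0b328fa-0f96-11e5-a6c0-1697f925ec7b)
LIMIT 10;

-- Against the original posts table, id can only be restricted once
-- user_id and post_at are both restricted with =.
SELECT name FROM posts
WHERE user_id = 42
  AND post_at = '2015-06-10 12:00:00+0000'
  AND id = 4978f728-0f96-11e5-a6c0-1697f925ec7b;
```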

Related

Not able to run multiple where clause without Cassandra allow filtering

Hi I am new to Cassandra.
We are working on an IoT project where car sensor data will be stored in Cassandra.
Here is an example of one table, where I am going to store the data from one of the sensors.
This is some sample data.
I want to partition the data by organization_id so that each organization's data is kept in separate partitions.
Here is the create table command:
CREATE TABLE IF NOT EXISTS engine_speed (
id UUID,
engine_speed_rpm text,
position int,
vin_number text,
last_updated timestamp,
organization_id int,
odometer int,
PRIMARY KEY ((id, organization_id), vin_number)
);
This works fine. However, all my queries will look like the one below:
select * from engine_speed
where vin_number='xyz'
and organization_id = 1
and last_updated >='from time stamp' and last_updated <='to timestamp'
Almost all queries on all the tables will have a similar or identical WHERE clause.
I am getting an error that asks me to add ALLOW FILTERING.
Kindly let me know how I should partition the table and define the right primary key and indexes so that I don't have to add ALLOW FILTERING to the query.
Apologies for this basic question, but I'm just starting to use Cassandra (Apache Cassandra 3.11.12).
The columns you restrict in the WHERE clause must follow the order of the partition and clustering keys defined in your DDL: you cannot restrict a key column while skipping a part of the primary key that comes before it. Given the query pattern you described, you can try the DDL below:
CREATE TABLE IF NOT EXISTS autonostix360.engine_speed (
vin_number text,
organization_id int,
last_updated timestamp,
id UUID,
engine_speed_rpm text,
position int,
odometer int,
PRIMARY KEY ((vin_number, organization_id), last_updated)
);
But remember that the following two definitions
PRIMARY KEY ((vin_number, organization_id), last_updated)
PRIMARY KEY ((vin_number), organization_id, last_updated)
are different in Cassandra. In case 1, your data will be partitioned by the combination of vin_number and organization_id, while last_updated will act as the clustering (ordering) key. In case 2, your data will be partitioned only by vin_number, while organization_id and last_updated will act as clustering keys. You need to figure out which case suits your use case.
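To make the difference concrete, here is a rough sketch of the queries each definition accepts. The timestamp literals are placeholder values, and the table is referenced without a keyspace, as in the question.

```cql
-- Case 1: PRIMARY KEY ((vin_number, organization_id), last_updated)
-- Both partition key columns are restricted with =, and last_updated
-- is a clustering column, so a range on it needs no ALLOW FILTERING.
-- The timestamps are placeholder values.
SELECT * FROM engine_speed
WHERE vin_number = 'xyz'
  AND organization_id = 1
  AND last_updated >= '2022-01-01 00:00:00+0000'
  AND last_updated <= '2022-01-31 23:59:59+0000';

-- Case 2: PRIMARY KEY ((vin_number), organization_id, last_updated)
-- The same query is also accepted, and in addition a query on
-- vin_number alone reads one partition covering all organizations.
SELECT * FROM engine_speed
WHERE vin_number = 'xyz';
```

In case 1 the second query would restrict only part of the partition key, so it does not fit that definition as an ordinary query.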

Cassandra Invalid Query: Some cluster keys are missing

I'm using Cassandra 3.0.
My table was created with the query below, but when I try to insert data into it, I get the error: 'Some cluster keys are missing: created'
Table Structure:
CREATE TABLE db.feed (
action_object_id int,
owner_id int,
created timeuuid,
action_object text,
action_object_type int,
actor text,
feed_type text,
target text,
target_type int,
verb text,
PRIMARY KEY (action_object_id, owner_id, created)
) WITH CLUSTERING ORDER BY (owner_id ASC, created ASC)
You must provide values for all primary key columns: action_object_id, owner_id, and created must all be listed in your INSERT statement.
For example: insert into db.feed(action_object_id, owner_id, created, ...) values (?,?,?,...). You also cannot provide NULL values for primary key columns, so created cannot be null.
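As an illustration, a complete INSERT might look like the sketch below. All the values are made up, and now() is used here only as one convenient way to generate a timeuuid for created.

```cql
-- All three primary key columns are supplied; the values are placeholders.
-- now() generates a timeuuid on the coordinator for the created column.
INSERT INTO db.feed (action_object_id, owner_id, created, actor, verb)
VALUES (1, 42, now(), 'alice', 'liked');
```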

Am I using cassandra efficiently?

I have these tables:
CREATE TABLE user_info (
userId uuid PRIMARY KEY,
userName varchar,
fullName varchar,
sex varchar,
bizzCateg varchar,
userType varchar,
about text,
joined bigint,
contact text,
job set<text>,
blocked boolean,
emails set<text>,
websites set<text>,
professionTag set<text>,
location frozen<location>
);
create table publishMsg
(
rowKey uuid,
msgId timeuuid,
postedById uuid,
title text,
time bigint,
details text,
tags set<text>,
location frozen<location>,
blocked boolean,
anonymous boolean,
hasPhotos boolean,
esIndx boolean,
PRIMARY KEY(rowKey, msgId)
) with clustering order by (msgId desc);
create table publishMsg_by_user
(
rowKey uuid,
msgId timeuuid,
title text,
time bigint,
details text,
tags set<text>,
location frozen<location>,
blocked boolean,
anonymous boolean,
hasPhotos boolean,
PRIMARY KEY(rowKey, msgId)
) with clustering order by (msgId desc);
CREATE TABLE followers
(
rowKey UUID,
followedBy uuid,
time bigint,
PRIMARY KEY(rowKey, orderKey)
);
I run three INSERT statements in a BATCH to put data into the publishMsg, publishMsg_by_user, and followers tables.
To show a single message, I have to run three SELECT queries on different tables:
publishMsg - to get the details of a published message, given rowKey and msgId.
user_info - to get fullName based on postedById.
followers - to know whether a postedById is following a given topic or not.
Is this a suitable way of using Cassandra? Will it be efficient, given that in this scenario the data can't fit in a single table?
Sorry to ask this in an answer but I don't have the rep to comment.
Ignoring the tables for now, what information does your application need to ask for? Ideally in Cassandra, you only have to execute one query on one table to get the data you need to return to the client. You shouldn't have to execute 3 queries to get what you want.
Also, your followers table appears to be missing the orderKey column that its PRIMARY KEY refers to.
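One way to read the "one query on one table" advice is to copy the author's name into the message row at write time, so that displaying a message does not need a second read from user_info. The sketch below is only an illustration of that idea: the table name publishMsg_display and the column postedByFullName are hypothetical, and the follower check would still be a separate lookup.

```cql
-- Hypothetical denormalized variant of publishMsg: the author's full name
-- is copied in when the message is written.
CREATE TABLE publishMsg_display (
    rowKey uuid,
    msgId timeuuid,
    postedById uuid,
    postedByFullName text,
    title text,
    details text,
    PRIMARY KEY (rowKey, msgId)
) WITH CLUSTERING ORDER BY (msgId DESC);

-- One read returns the message together with the author's name.
-- The UUIDs are placeholders.
SELECT title, details, postedByFullName
FROM publishMsg_display
WHERE rowKey = 4978f728-0f96-11e5-a6c0-1697f925ec7b
  AND msgId = b0b328fa-0f96-11e5-a6c0-1697f925ec7b;
```

The cost of this design is that a change to fullName has to be propagated to the copied column by the application.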

Getting results in the same order they were added to the table

I have this table
CREATE TABLE tag_by_user (
userId uuid,
tagId uuid,
colId timeuuid,
tagLabel text,
PRIMARY KEY (userId, tagId,colId)
);
Here is my data:
insert into tag_by_user(userId,tagId,colId,tagLabel) values(4978f728-0f96-11e5-a6c0-1697f925ec7b
,b0b328fa-0f96-11e5-a6c0-1697f925ec7b,now(),'html');
insert into tag_by_user(userId,tagId,colId,tagLabel) values(4978f728-0f96-11e5-a6c0-1697f925ec7b
,b0b330d4-0f96-11e5-a6c0-1697f925ec7b,now(),'java');
insert into tag_by_user(userId,tagId,colId,tagLabel) values(4978f728-0f96-11e5-a6c0-1697f925ec7b
,c0f22450-0f96-11e5-a6c0-1697f925ec7b,now(),'javascript');
insert into tag_by_user(userId,tagId,colId,tagLabel) values(4978f728-0f96-11e5-a6c0-1697f925ec7b
,c0f226b2-0f96-11e5-a6c0-1697f925ec7b,now(),'scala pro');
insert into tag_by_user(userId,tagId,colId,tagLabel) values(4978f728-0f96-11e5-a6c0-1697f925ec7b
,c0f22ab8-0f96-11e5-a6c0-1697f925ec7b,now(),'c++');
Now I want to get the tags of a given user in the same order they were added to the row (i.e. in ascending order of the time they were added, which here is colId).
cqlsh:ks_demo> select taglabel from tag_by_user where userid= 77c4d46c-0f96-11e5-a6c0-1697f925ec7b order by colid;
It gives this error:
Bad Request: Order by currently only support the ordering of columns following their declared order in the PRIMARY KEY
What changes do I have to make to the schema or the query? I am using cqlsh 4.1.1 | Cassandra 2.0.8.
You need to leave only userId and colId in the PRIMARY KEY:
CREATE TABLE tag_by_user (
userId uuid,
colId timeuuid,
tagId uuid,
tagLabel text,
PRIMARY KEY (userId, colId)
);
And then use
SELECT * FROM tag_by_user WHERE userId={yourUserId}
to get the tags of a given user in ascending order of time.
If you need to avoid duplicate tags, you can create an index on tagId and use it to find out whether a tag already exists for a given user, then handle it accordingly. Note, though, that you cannot modify colId once the data is inserted, because it is part of the primary key.
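A minimal sketch of that duplicate check, assuming the revised table with PRIMARY KEY (userId, colId) so that tagId is a regular column. The index name and the UUID values are hypothetical.

```cql
-- Hypothetical index name; tagId is a regular column in the revised table.
CREATE INDEX tag_by_user_tagid_idx ON tag_by_user (tagId);

-- Check whether this user already has the tag before inserting it again.
SELECT colId, tagLabel FROM tag_by_user
WHERE userId = 4978f728-0f96-11e5-a6c0-1697f925ec7b
  AND tagId = b0b328fa-0f96-11e5-a6c0-1697f925ec7b;
```

If the query returns a row, the application can skip the insert. This is a read-then-write check, so two concurrent writers could still both insert the same tag.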
As the message suggests, ORDER BY only works on clustering columns in the order they are declared in the PRIMARY KEY, so colid has to come before tagid:
CREATE TABLE tag_by_user (
userid uuid,
colid timeuuid,
tagid uuid,
taglabel text,
PRIMARY KEY (userid, colid, tagid)
);
select taglabel from tag_by_user where userid = 4978f728-0f96-11e5-a6c0-1697f925ec7b order by colid;
taglabel
------------
html
java
javascript
scala pro
c++
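With colid as the first clustering column, the reverse order should also be available. The sketch below assumes the same schema, PRIMARY KEY (userid, colid, tagid), and uses an arbitrary LIMIT.

```cql
-- Same schema as above: PRIMARY KEY (userid, colid, tagid).
-- colid is the first clustering column, so ordering by it is allowed
-- in either direction; LIMIT 3 is an arbitrary example value.
SELECT taglabel FROM tag_by_user
WHERE userid = 4978f728-0f96-11e5-a6c0-1697f925ec7b
ORDER BY colid DESC
LIMIT 3;
```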

How to design the cassandra table for one query with a ordering and limit?

I have created this table:
CREATE TABLE posts_by_user(
user_id bigint,
post_id uuid,
post_at timestamp,
PRIMARY KEY (user_id,post_id)
);
I want to select the last 10 rows using the IN operator on user_id, ordered by the post_at field.
I also read a good article:
http://planetcassandra.org/blog/the-in-operator-in-cassandra-cql/
I cannot use a query like WHERE post_at = time AND user_id IN (1,2), because I need all records, not only those for a specific date.
How can I change my schema design? Thank you.
I changed it to:
CREATE TABLE posts_by_user (
user_id bigint,
post_id uuid,
post_at timestamp,
PRIMARY KEY (user_id, post_at)
) WITH CLUSTERING ORDER BY (post_at DESC);
I think it is a good design...
How about using this approach: http://www.datastax.com/documentation/cql/3.1/cql/cql_using/use-slice-partition.html
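With the revised posts_by_user table, whose PRIMARY KEY is (user_id, post_at) with post_at DESC, a rough sketch for the "last 10 rows" requirement is one query per user, merged on the client. This is an assumption about how the application could work, not something stated in the thread. With IN over several users, LIMIT applies to the combined result, and rows are returned partition by partition rather than globally sorted by post_at.

```cql
-- Rows in a partition are already stored newest first, so LIMIT 10
-- returns the 10 most recent posts of one user without ORDER BY.
SELECT post_id, post_at FROM posts_by_user
WHERE user_id = 1
LIMIT 10;

-- Repeat for each other user and merge the per-user results
-- by post_at on the client.
SELECT post_id, post_at FROM posts_by_user
WHERE user_id = 2
LIMIT 10;
```

Note that with post_at as the only clustering column, two posts by the same user with an identical timestamp would overwrite each other. Declaring PRIMARY KEY (user_id, post_at, post_id) would keep them distinct.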

Resources