Three range queries in Cassandra

I have a requirement to fetch records using three range conditions from a Cassandra DB. I used the following query:
SELECT * FROM keyspace.test
WHERE partitionkey = 'x'
  AND indexedColumn = 'y'
  AND column1 < 'value1'
  AND column2 > 'value2'
  AND column3 > 'value3';
It failed with the following error:
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Cannot execute this query as it might involve data filtering
and thus may have unpredictable performance. If you want to execute
this query despite the performance unpredictability, use ALLOW
FILTERING"
Please help if you know how to express three range conditions in a single fetch query.
Table model:
{
  id,
  childstatus,
  startDate,
  endDate,
  childId,
  childRatingValue
}
Primary key: (id, childstatus, startdate, childRatingValue)
Clustering keys, in order: (childstatus, startdate, childRatingValue)
Index on childstatus.
Thanks in advance
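For context (not part of the original question): Cassandra only allows a range (non-EQ) relation on the last restricted clustering column, so three simultaneous range predicates on different columns cannot be served efficiently from this table. The error message itself points at the escape hatch; a sketch of the same query with ALLOW FILTERING (potentially slow, as rows are filtered server-side):

```cql
-- Same query as above; ALLOW FILTERING tells Cassandra to scan and
-- filter rows server-side, which can be slow on large partitions.
SELECT * FROM keyspace.test
WHERE partitionkey = 'x'
  AND indexedColumn = 'y'
  AND column1 < 'value1'
  AND column2 > 'value2'
  AND column3 > 'value3'
ALLOW FILTERING;
```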

Related

Select query in CQL issue

I am trying to get the row count of a table in a Cassandra DB. I am running the below query:
select count(*)
from bssapi.call_detail_records
WHERE
year = 2020
and month = 3
and event_at > '2020-03-01 07:45:51+0000' ALLOW FILTERING;
The error I am getting is:
InvalidRequest: code=2200 [Invalid query] message="Partitioning column "year" cannot be restricted because the preceding column ("ColumnDefinition{name=subscription_id, type=org.apache.cassandra.db.marshal.UTF8Type, kind=PARTITION_KEY, componentIndex=0, indexName=null, indexType=null}") is either not restricted or is restricted by a non-EQ relation"
When I filter the query by adding a subscription ID, the system returns the count correctly; this issue appears only when I run the count over the whole table.
I found the solution for this issue; I used the below query instead:
select COUNT(*) from bssapi.call_detail_records WHERE event_at > '2020-03-22 07:45:51+0000' ALLOW FILTERING;
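A hedged note on that workaround: ALLOW FILTERING makes the count work, but it scans every partition, which gets expensive as the table grows. If this count runs often, a common pattern is a query table whose partition key matches the time filter. The layout below is a hypothetical sketch (column names taken from the question):

```cql
-- Hypothetical query table: partitioned by day, so a time-range count
-- touches only the relevant partitions instead of the whole table.
CREATE TABLE call_detail_records_by_day (
    year int,
    month int,
    day int,
    event_at timestamp,
    subscription_id text,
    PRIMARY KEY ((year, month, day), event_at, subscription_id)
);
```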

SELECT COLUMN which has null values (Cassandra 3.11.3)

I have a table (table1) with 14 columns, of which 10 have data. I need to import data into the remaining 4 columns from another table (for now these 4 columns are empty/null).
My teammate has written code to import the data from the other table, but he is facing some issues and has asked me for a SELECT query that will display the columns which have a null/empty dataset (in this example, the 4 empty columns).
I have tried the below SELECT query, where I used DISTINCT with the partition key:
SELECT distinct host_name from table1 WHERE empty_column = '' ;
Note: host_name is the primary/partition key and empty_column is the column which does not have any value (null/empty dataset).
Getting error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="SELECT DISTINCT with WHERE clause only supports restriction by partition key and/or static columns."
Please help...
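A hedged suggestion (this thread contains no accepted answer): as the error says, SELECT DISTINCT cannot be restricted by a regular column, so one option is to drop DISTINCT and filter with ALLOW FILTERING instead. Note that this matches only rows storing an actual empty string; Cassandra cannot match missing (null) cells with = '':

```cql
-- Sketch only: full-table scan, use with care on large tables.
SELECT host_name FROM table1 WHERE empty_column = '' ALLOW FILTERING;
```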

Cassandra queries performance, ranges

I'm quite new to Cassandra, and I was wondering if there would be any performance difference between querying with "date = '2015-01-01'" and with "date >= '2015-01-01' AND date <= '2015-01-01'".
The only reason I want to use the ranges like that is because I need to make multiple queries and I want to have them prepared (as in prepared statements). This way the prepared statements number is cut by half.
The keys used are ((key1, key2), date) and (key1, date, key2) in the two tables I want to use this. The query for the first table is similar to:
SELECT * FROM table1
WHERE key1 = val1
AND key2 = val2
AND date >= date1 AND date <= date2
For a PRIMARY KEY (key1, date, key2) that type of query just isn't possible. If you do, you'll see an error like this:
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column
"key2" cannot be restricted (preceding column "date" is either not
restricted or by a non-EQ relation)"
Cassandra won't allow you to filter by a PRIMARY KEY component if the preceding column(s) are filtered by anything other than the equals operator.
On the other hand, your queries for PRIMARY KEY ((key1, key2), date) will work and perform well. The reason is that Cassandra uses the clustering key(s) (date in this case) to specify the on-disk sort order of data within a partition. As you are specifying the partition keys (key1 and key2), your result set will be sorted by date, allowing Cassandra to satisfy your query with a continuous read from disk.
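For reference, a minimal table sketch matching the key layout described above (the value column is a hypothetical payload):

```cql
CREATE TABLE table1 (
    key1 text,
    key2 text,
    date timestamp,
    value text,  -- hypothetical payload column
    PRIMARY KEY ((key1, key2), date)
);
```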
Just to test that out, I'll even run two queries on a table with a similar key, and turn tracing on:
SELECT * FROM log_date2 WHERE userid=1001
AND time > 32671010-f588-11e4-ade7-21b264d4c94d
AND time < a3e1f750-f588-11e4-ade7-21b264d4c94d;
Returns 1 row and completes in 4068 microseconds.
SELECT * FROM log_date2 WHERE userid=1001
AND time=74ad4f70-f588-11e4-ade7-21b264d4c94d;
Returns 1 row and completes in 4001 microseconds.

cassandra, select via a non primary key

I'm new to Cassandra and I've hit a problem. I created a keyspace demodb and a table users. The table has 3 columns: id (int, primary key), firstname (varchar), and name (varchar).
This query returns the expected result:
SELECT * FROM demodb.users WHERE id = 3;
but this one:
SELECT * FROM demodb.users WHERE firstname = 'francois';
doesn't work and I get the following error message:
InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: "
This request also doesn't work:
SELECT * FROM users WHERE firstname = 'francois' ORDER BY id DESC LIMIT 5;
InvalidRequest: code=2200 [Invalid query] message="ORDER BY with 2ndary indexes is not supported."
Thanks in advance.
This request also doesn't work:
That's because you are misunderstanding how sort order works in Cassandra. Instead of using a secondary index on firstname, create a table specifically for this query, like this:
CREATE TABLE usersByFirstName (
    id int,
    firstname text,
    lastname text,
    PRIMARY KEY (firstname, id)
);
This query should now work:
SELECT * FROM usersByFirstName WHERE firstname='francois'
ORDER BY id DESC LIMIT 5;
Note that I have created a compound primary key on firstname and id. This will partition your data by firstname (allowing you to query by it), while also clustering your data by id. By default, your data will be clustered by id in ascending order. To alter this behavior, you can specify a CLUSTERING ORDER in your table creation statement:
WITH CLUSTERING ORDER BY (id DESC)
...and then you won't even need an ORDER BY clause.
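Putting that fragment into the full table definition:

```cql
CREATE TABLE usersByFirstName (
    id int,
    firstname text,
    lastname text,
    PRIMARY KEY (firstname, id)
) WITH CLUSTERING ORDER BY (id DESC);
```

With this definition, rows within each firstname partition come back in descending id order by default.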
I recently wrote an article on how clustering order works in Cassandra (We Shall Have Order). It explains this, and covers some ordering strategies as well.
There is one constraint in Cassandra: any field you want to use in the WHERE clause has to be part of the primary key of the table, or there must be a secondary index on it. So you have to create an index on firstname; only after that can you use firstname in the WHERE condition and get the result you were expecting.

How can I use the second column in this primary key both in the ORDER BY clause and update it using an UPDATE command?

Below is the table.
CREATE TABLE threadpool (
    threadtype int,
    threadid bigint,
    jobcount bigint,
    valid boolean,
    PRIMARY KEY (threadtype, jobcount, threadid)
);
I want to run the below 2 queries on this table.
SELECT * FROM threadpool WHERE threadtype = 1 ORDER BY jobcount ASC LIMIT 1;
UPDATE threadpool SET valid = false WHERE threadtype = 1 and threadid = 4;
The second query fails with the below reason.
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "threadid" cannot be restricted (preceding column "jobcount" is either not restricted or by a non-EQ relation)"
Can anybody please help me in modelling the data to support both of the above queries?
Your described data model can't work, because:
only values of datatype counter can be incremented using a CQL statement
counter tables can only have the counter as a single column beside the PK
you cannot sort by counter values
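One common workaround (a hedged sketch, not taken from the answer above) is to keep two denormalized tables and write to both: one keyed for the ORDER BY jobcount read, and one keyed by threadid for the UPDATE:

```cql
-- Read path: rows clustered by jobcount for the ORDER BY query.
CREATE TABLE threadpool_by_jobcount (
    threadtype int,
    jobcount bigint,
    threadid bigint,
    valid boolean,
    PRIMARY KEY (threadtype, jobcount, threadid)
);

-- Write path: one row per thread, addressable by threadid for UPDATEs.
CREATE TABLE threadpool_by_id (
    threadtype int,
    threadid bigint,
    jobcount bigint,
    valid boolean,
    PRIMARY KEY (threadtype, threadid)
);
```

The application then updates both tables on every change (optionally in a logged batch to keep them consistent).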