I am using Cassandra 1.2.3 and can execute a select query with LIMIT 10.
If I want records 10 to 20, I cannot do "LIMIT 10,20".
The query below gives me an error:
select * from page_view_counts limit 10,20
How can this be achieved?
Thanks
Nikhil
You can't do skips like this in CQL. You have to do paging by specifying a start place, e.g.
select * from page_view_counts where field >= 'x' limit 10;
to get the next 10 elements starting from x.
I wrote a full example in this answer: Cassandra pagination: How to use get_slice to query a Cassandra 1.2 database from Python using the cql library.
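As a rough illustration of the start-place idea, here is a minimal in-memory sketch in Python. It does not talk to Cassandra: `ROWS`, `fetch_page` and `iter_pages` are hypothetical names, and `fetch_page` only mimics what a `WHERE field >= 'x' LIMIT n` query would return on a sorted key. Each query asks for one extra row, and that extra row becomes the start of the next page.

```python
from bisect import bisect_left

# Hypothetical stand-in for a table whose rows are sorted by `field`.
ROWS = sorted(f"key{i:03d}" for i in range(25))

def fetch_page(start, limit):
    """Mimic: SELECT * FROM t WHERE field >= start LIMIT limit."""
    i = bisect_left(ROWS, start)
    return ROWS[i:i + limit]

def iter_pages(page_size):
    """Yield pages by restarting each query at a remembered key."""
    start = ""  # the empty string sorts before every key
    while True:
        # Ask for one extra row; it is the first row of the next page.
        rows = fetch_page(start, page_size + 1)
        yield rows[:page_size]
        if len(rows) <= page_size:
            return  # no extra row came back, so this was the last page
        start = rows[page_size]

pages = list(iter_pages(10))
```

With 25 rows and a page size of 10, this yields pages of 10, 10 and 5 rows, with no row repeated across pages.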
For that, you first have to plan your data model so that it can return records according to your requirement.
Can you tell us what sort of example you are working on?
And are you using the Hector client or another one?
Sorry, I did it using the Hector client and Java, but given your requirement I suggest planning your data model like this:
Use the time span as the row key in yyyyMMddHH format, and within that row store column names as a composite key made up of a UTF8Type and a TimeUUID (e.g. C1+timeUUID).
Note: the first component of the composite key is also the column name in a counter column family (e.g. C1).
Store only a limited number of records, say 20, under C1, incrementing the C1 counter up to 20. When a new record arrives for the same time span, insert it with the key C2+timeUUID and increment the C2 counter, again up to 20 records.
To fetch records, pass the values C1, C2, etc. along with the row key, e.g. 2013061116.
This gives you 20 records, then another 20, and so on. You have to implement this programmatically. I hope this helps.
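The bucket scheme described above could be modelled roughly as follows. This is a plain Python sketch, not Hector code: the class `BucketedRow`, its attributes and `BUCKET_SIZE` are hypothetical names, and the dictionaries only stand in for the data row and the counter column family.

```python
from collections import defaultdict

BUCKET_SIZE = 20  # records per bucket, as in the answer above

class BucketedRow:
    """In-memory model of one time-span row (e.g. row key '2013061116')."""

    def __init__(self):
        self.counters = defaultdict(int)  # plays the counter column family
        self.columns = defaultdict(list)  # bucket prefix -> stored records
        self.current = 1                  # index of the bucket being filled

    def insert(self, record):
        if self.counters[f"C{self.current}"] >= BUCKET_SIZE:
            self.current += 1             # bucket is full, move to the next
        name = f"C{self.current}"
        self.columns[name].append(record)
        self.counters[name] += 1

    def page(self, n):
        """Return the records stored under prefix Cn (1-based page number)."""
        return list(self.columns.get(f"C{n}", []))

row = BucketedRow()
for i in range(45):
    row.insert(i)
```

After 45 inserts, `row.page(1)` and `row.page(2)` each hold 20 records and `row.page(3)` holds the remaining 5. In a real cluster the counter check and the insert are separate operations, so concurrent writers would need extra care that this sketch ignores.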
Related
I have a table customer_info in a Cassandra DB, and it contains a column billing_due_date, which is a date field (dd-MMM-yy, e.g. 17-AUG-21). I need to fetch certain fields from the customer_info table based on billing_due_date, where billing_due_date should be equal to the system date + 1.
Can anyone suggest a Cassandra DB query for this?
transaction_id is the primary key. It is just generated through uuid().
Unfortunately, there really isn't going to be a good way to do this. Right now, the data in the customer_info table is distributed across all nodes in the cluster based on a hash of the transaction_id. Essentially, any query based on something other than transaction_id is going to read from multiple nodes, which is a query anti-pattern in Cassandra.
In Cassandra, you need to design your tables based on the queries that they need to support. For example, choosing transaction_id as the sole primary key may distribute well, but it doesn't offer much in the way of query flexibility.
Therefore, the best way to solve for this query is to create a query table containing the data from customer_info with a key definition of PRIMARY KEY (billing_due_date, transaction_id). Then, a query like this should work:
> SELECT * FROM customer_info_by_date
  WHERE billing_due_date = toDate(now()) + 2d;

 billing_due_date | transaction_id                       | name
------------------+--------------------------------------+---------
       2021-08-20 | 2fe82360-e314-4d5b-aa33-5deee9f03811 | Rinzler
       2021-08-20 | 92cb9ee5-dee6-47fe-b372-0829f2e384cd |     Clu

(2 rows)
Note that for this example, I am using the system date plus 2 days out. So in your case, you'll want to adjust the "duration" aspect from 2d down to 1d. Cassandra 4.0 allows date arithmetic, so this should work just fine if you are on that version. If you are not, you'll have to do the "system date plus one" calculation on the app side.
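For versions without date arithmetic, the "system date plus one" value can be computed in the application and passed to the query as a bound parameter. A minimal Python sketch, where the helper name `due_date_for` is hypothetical:

```python
from datetime import date, timedelta

def due_date_for(today=None, days_ahead=1):
    """Return the date `days_ahead` days after `today` (default: system date)."""
    today = today or date.today()
    return today + timedelta(days=days_ahead)

# Value to bind to billing_due_date in the WHERE clause.
tomorrow = due_date_for()
```

The result would then be bound to something like `WHERE billing_due_date = ?` rather than concatenated into the query string.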
Another way to go about this would be to create a secondary index on billing_due_date, but I don't recommend that path, as it will query multiple nodes to build the result set.
I have a table with millions of records, and I would like to purge the old data from this Cassandra table.
The following is my table definition.
CREATE TABLE "Openmind".mep_notification (
id uuid PRIMARY KEY,
campaign uuid,
created timestamp,
flight uuid,
read boolean,
type text,
user uuid
);
CREATE INDEX mep_notification_user_idx ON "Openmind".mep_notification (user);
How can I get the first X rows at a time using CQL, then the next X rows, and so on until I have all the rows from the table?
I would appreciate any help.
thank you
You can use a limit to get X rows. See the example below, which uses PER PARTITION LIMIT to cap the rows returned from each partition:
SELECT *
FROM cycling.rank_by_year_and_name
PER PARTITION LIMIT 2;
DataStax documentation:
https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/useQueryColumnsSort.html
If you are trying to do pagination, you can use paging. Depending on the programming language you are using, check the matching documentation for paging.
DataStax documentation:
https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlshPaging.html
In CQL prompt: https://docs.datastax.com/en/dse/6.8/cql/cql/cql_using/search_index/cursorsDeepPaging.html
In Java:
https://docs.datastax.com/en/developer/java-driver/3.11/manual/paging/
In PHP : https://datastax.github.io/php-driver/features/result_paging/
In Spring Boot :
https://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#reference
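The paging loop those drivers implement can be pictured with a small simulation. This is not driver code: `fetch_page` and `read_all` are hypothetical helpers, the "table" is a Python list, and the paging state is a plain integer here, whereas real drivers hand back an opaque token.

```python
def fetch_page(rows, fetch_size, paging_state=None):
    """Return (page, next_state); next_state is None on the last page."""
    start = paging_state or 0
    page = rows[start:start + fetch_size]
    end = start + len(page)
    return page, (end if end < len(rows) else None)

def read_all(rows, fetch_size):
    """Read the whole table X rows at a time, collecting each batch."""
    state, batches = None, []
    while True:
        page, state = fetch_page(rows, fetch_size, state)
        batches.append(page)
        if state is None:
            return batches

batches = read_all(list(range(7)), 3)
```

Here 7 rows read 3 at a time come back as batches of 3, 3 and 1. For a purge job, each batch would be processed before the next one is requested.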
I am new to CQL. Using Cassandra 3.x
I have a basic university class table as
Class_ID INT
Class_Name VARCHAR
Class_Date TIMESTAMP
Class_TimeHour INT
Sample entries are
{1,"Bio 1","01/01/2018","700"}
{2,"MC 1" ,"01/01/2018","700"}
{3,"Bio 2","01/01/2018","815"}
{3,"MC 2" ,"01/01/2018","1100"}
700 represents 0700 hours in 24 hour notation.
I need to answer some basic queries, please advise on how to best setup the table and queries.
1. Can I get a Class_TimeHour desc ordered list of classes?
2. Can I get a Class_TimeHour desc ordered list of a particular class? Does this mean I need to set up a partition key differently from #1?
3. Can I get a list of all Class_Name that are within a 60 min window of each other? My results from above should be
{1,"Bio 1","01/01/2018","700"}
{2,"MC 1","01/01/2018","700"}
{3,"Bio 2","01/01/2018","815"}
4. How many times does "Bio 1" occur per day?
5. Count of how many Class_Name contain the "MC" literal.
Thanks!
I need to select the N-th row from a Cassandra table based on a number produced by my logic, i.e. if the logic outputs 23, I need to get the details of the 23rd row. Since there is no auto-increment in Cassandra, I can't match on an ID key. In SQL this is done using OFFSET and LIMIT. I don't know how to achieve this feat in Cassandra.
Can we achieve this using a UDF? Thanks in advance.
Table Schema :
CREATE TABLE new_card (
customer_id bigint,
card_number text,
active tinyint,
auto_pay int,
available_credit_limit double,
average_card_spend_half_yearly double,
average_card_spend_monthly double,
average_card_spend_quarterly double,
average_card_spend_yearly double,
avg_half_yearly_spend_mcc double,
PRIMARY KEY (customer_id, card_number)
);
If you are using the Java driver, refer to Paging.
Note that Cassandra does not support direct offsets; pages are read sequentially. If you have to use offsets in your queries, you might want to revisit your data model. You could create a composite partition key that includes the row number as an additional column on top of your existing partition key column.
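Because pages are read sequentially, an offset can only be emulated by reading and discarding the rows before it. A minimal Python sketch of that idea, assuming the rows arrive through some iterator (here a hypothetical generator standing in for a paged result set; `nth_row` is an illustrative helper, not a driver function):

```python
from itertools import islice

def nth_row(row_iter, n):
    """Return the n-th row (1-based) from an iterator, or None if it is too short."""
    return next(islice(row_iter, n - 1, n), None)

# Hypothetical stand-in for rows streamed back page by page.
rows = ({"customer_id": 1, "card_number": f"card-{i}"} for i in range(1, 101))
row_23 = nth_row(rows, 23)
```

The cost grows with N, since every earlier row is still fetched, which is why a key that encodes the row number is usually the better design.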
You simply can't select the N-th row from a table, because a Cassandra table is made of partitions: you can order rows within a partition, but not the partitions themselves. Paging will go through the whole table, but there will be no chronological order to the rows selected using such an approach (disregarding the fact that the partitions can change while you are going through the pages).
If you want to select row number N from Cassandra, you need to implement an auto-increment field at the application level and use it as the key.
There are ways to do it with Cassandra, using lightweight transactions for example, but they have a high cost from a performance perspective. See several solutions here:
How to create auto increment IDs in Cassandra
I have a cassandra table structure as follows
create table demo (user_id text , comment_id text , timestamp timeuuid , PRIMARY KEY (user_id , comment_id , timestamp))
Now in the UI I want pagination such that on clicking the next button I get values 10 to 20, then 20 to 30, and so on.
I know we can't issue a query in Cassandra as
select * from demo limit 10,20
So if I create a query as
select * from demo where timestamp = 'sometimestampvalue' limit 10;
This will give 10 values starting from 'sometimestampvalue'.
Then store the timestamp value of the last row in a variable (say X) and issue the next query as
select * from demo where timestamp = 'X' limit 10;
And so on. Will this work? Or can something better be done? I'm ready to change the structure of the table, including adding counter columns, as basically I should be able to do pagination based on any column.
See this answer:
Cassandra Limit 10,20 clause
Basically you will have to handle it in your app code. Your suggestion looks like it will work, with a little tweaking.
Pagination is more easily done in the driver where you can set the fetch size. For example, in Java:
cluster.getConfiguration().getQueryOptions().setFetchSize(10);
See DataStax Java Driver for Apache Cassandra, Features, Paging