how to check all null values in a particular column in cassandra table? - cassandra

by using the below query i am not getting anyrecords
SELECT * FROM test_table where prty_ad_line3_tx = '' and prty_role_type_cd='Co-Borrower' and prty_role_sq_nb =5 limit 10 ALLOW FILTERING;
need help from your end.

Related

Cassandra(Amazon keyspace) Query Error on clustered columns

I am trying execute query on clustering columns on amazon keyspace, since I don't want to use ALLOW FILTERING with my native query I have created 4-5 clustering columns for better performance.
But while trying to filter it based on >= and <= with on 2 clustering columns, I am getting error with below message
message="Clustering column "start_date" cannot be restricted (preceding column "segment_id" is restricted by a non-EQ relation)"
I had also tried with multiple columns query but I am getting not supported error
message="MultiColumn relation is not yet supported."
Query for the reference
select * from table_name where shard_id = 568 and division = '10' and customer_id = 568113 and (segment_id, start_date,end_date)>= (-1, '2022-05-16','2017-03-28') and flag = 1;
or
select * from table_name where shard_id = 568 and division = '10' and customer_id = 568113 and segment_id > -1 and start_date >='2022-05-16';
I am assuming that the your table has the following primary key:
CREATE TABLE table_name (
...
PRIMARY KEY(shard_id, division, customer_id, segment_id, start_date, end_date)
)
In any case, your CQL query is invalid because you can only apply an inequality operator on the last clustering column in your query. For example, these are valid queries based on your table schema:
SELECT * FROM table_name
WHERE shard_id = ? AND division = ?
AND customer_id <= ?
SELECT SELECT * FROM table_name \
WHERE shard_id = ? AND division = ? \
AND customer_id = ? AND segment_id > ?
SELECT SELECT * FROM table_name \
WHERE shard_id = ? AND division = ? \
AND customer_id = ? AND segment_id = ? AND start_date >= ?
All preceding columns must be filtered by an equality operator except for the very last clustering column in your query.
If you require a complex predicate for your queries, you will need to index your Cassandra data with tools such as Elasticsearch or Apache Solr. They will allow you to run complex search parameters to retrieve data from your database. Cheers!
ALLOW Filtering gets a bad rap sometimes. It all depends on how many rows you end up scanning. It's good to understand how many rows per partition will be scanned and work backwards from there. Only the last column can contain inequality statements to bound ranges. Try to order your columns to eliminate the most columns first, which reduce the number of rows 'Filtered'.
In the example below we used the index for keys up to start date and filtered on end_data, segment_id, and flag.
select * from table_name where shard_id = 568 and division = '10' and customer_id = 568113 and start_date >= '2022-05-16' and end_date > '2017-03-28') and (segment_id > -1 flag = 1;```

check if table is empty in cassandra DB

I am trying to find a way to determine if the table is empty in Cassandra DB.
cqlsh> SELECT * from examples.basic ;
key | value
-----+-------
(0 rows)
I am running count(*) to get the value of the number of rows , but I am getting warning message, So I wanted to know if there is any better way to check if the table is empty(zero rows).
cqlsh> SELECT count(*) from examples.basic ;
count
-------
0
(1 rows)
Warnings :
Aggregation query used without partition key
cqlsh>
Aggregations, like count, can be an overkill for what you are trying to accomplish, specially with the star wildcard, as if there is any data on your table, the query will need to do a full table scan. This can be quite expensive if you have several records.
One way to get the result you are looking for is the query
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
Empty table:
The resultset will be empty
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
key
-----
(0 rows)
Table with data:
The resultset will have a record
cqlsh> SELECT key FROM keyspace1.table1 LIMIT 1;
key
----------------------------------
uL24bhnsHYRX8wZItWM6xKdS0WLvDsgi
(1 rows)

CQLSH - Check for null in where clause for MAP Data type

CASSANDRA Version : 2.1.10
CREATE TABLE customer_raw_data (
id uuid,
hash_prefix bigint,
profile_data map<varchar,varchar>
PRIMARY KEY (hash_prefix,id));
I have an index on profile_data and I have row where profile_data is null.
How to write a select query to retrieve the rows where profile_data is null ?
I tried the following
select count(*) from customer_raw_data where profile_data=null;
select count(*) from customer_raw_data where profile_data CONTAINS KEY null;
With Reference to : https://issues.apache.org/jira/browse/CASSANDRA-3783
There is currently no select support for indexed nulls, and given the design of Cassandra, is considered a difficult/prohibitive problem.
Basic problem.
where condition column has to be either primary key or secondary index so make your column what-ever is suitable and then try below query.
Try this..
select count(*) from customer_raw_data where profile_data='';
SELECT * FROM TableName WHERE colName > 5000 ALLOW FILTERING; //Work fine
SELECT * FROM TableName WHERE colName > 5000 limit 10 ALLOW FILTERING;
https://cassandra.apache.org/doc/old/CQL-3.0.html
Check the "ALLOW FILTERING" Part.

Select specific columns from Cassandra based on their name

I have a cassandra database in which columns can be added or removed based on the application need. The column names start with the prefix RSSI. I was wondering if it is possible to select all columns where the column name is like %RSSI%. In MYSQL you can do something like select count(*) FROM INFORMATION_SCHEMA.COLUMNS WHERE table_name ='MACTrain' AND column_name LIKE '%RSSI%'. Is it possible in cassandra ? If not what can be a solution to select columns based on a specific wildcard.
You can obtain the columns metadata of a table by querying the system keyspace:
select * from system.schema_columns
where keyspace_name = 'yourks' and columnfamily_name = 'yourtable';
For Cassandra v3.0 and above, you can use the new system_schema keyspace:
select * from system_schema.columns
where keyspace_name = 'yourks' and table_name = 'yourtable';

cassandra 2.0.9: query for undefined column

Using Cassandra 2.0.9 CQL, how does one query for rows that don't have a particular column defined? For example:
create table testtable ( id int primary key, thing int );
create index on testtable ( thing );
# can now select rows by thing
insert into testtable( id, thing ) values ( 100, 100 );
# row values will be persistent
update testtable using TTL 30 set thing=1 where id=100;
# wait 30 seconds, thing column will go away for the row
select * from testtable;
Ideally I'd like to be able to do something like this:
select * from testtable where NOT DEFINED thing;
or some such and have the row with the id==1 returned. Is there any way to search for rows that do not have a particular column value assigned?
I'm afraid I've been through the Datastax 2.0 manual, as well as the CQLSH help with no luck trying to find an operator or syntax for this. Thanks.
Doesn't appear to be available yet
https://issues.apache.org/jira/browse/CASSANDRA-3783

Resources