Cassandra: Undefined name in where clause

I'm querying a Cassandra table with the following command:
select * from oap.purchase_events where clientNumber = '100'
The table contains a row with clientNumber 100, but I get this error:
InvalidRequest: code=2200 [Invalid query] message="Undefined name clientnumber in where clause ('clientnumber = 100')"
The table definition:
CREATE TABLE oap.purchase_events (
"parentId" text,
"childId" text,
"clientNumber" text,
cost double,
description text,
"eventDate" timestamp,
"logDate" timestamp,
message text,
"operationalChannel" text,
"productDuration" bigint,
"productId" text,
"transactionId" text,
volume double,
"volumeUnit" text,
PRIMARY KEY ("parentId", "childId")
) WITH CLUSTERING ORDER BY ("childId" ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX purchase_events_clientNumber_idx ON gestor.purchase_events ("clientNumber");
Any help?

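The column was created with a quoted name ("clientNumber"), so it is case-sensitive. Unquoted identifiers are folded to lowercase, which is why the error mentions clientnumber. Enclose clientNumber in double quotes.
Example: select * from purchase_events where "clientNumber" = '100';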
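To make the case rule concrete, here is a minimal Python sketch of how CQL treats an identifier as typed in a query. The function name normalize_identifier is hypothetical and only models the rule (unquoted names are lowercased, quoted names keep their case); it is not part of any Cassandra driver.

```python
def normalize_identifier(name):
    """Return the column name CQL would look up for an identifier as typed."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted identifier: case is preserved; a doubled quote stands for one quote.
        return name[1:-1].replace('""', '"')
    # Unquoted identifier: folded to lowercase before lookup.
    return name.lower()


# The unquoted form no longer matches the column that was created as "clientNumber".
unquoted = normalize_identifier('clientNumber')
quoted = normalize_identifier('"clientNumber"')
```

With the table above, only the quoted form resolves to the stored column name, which matches the "Undefined name clientnumber" message in the question.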

Related

Cassandra queries perform a full table scan if no rows exist for a specific partition key

I have a very large table like
CREATE TABLE IF NOT EXISTS profile (
account_id text,
user_id uuid,
user_data text,
creation_date timestamp,
update_date timestamp,
PRIMARY KEY ((account_id, user_id))
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': '10'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
The following query runs a full table scan if the table has no rows matching the partial partition key (account_id = 'D-F-8CX7PGX'):
SELECT * FROM profile WHERE account_id = 'D-F-8CX7PGX' AND user_id = '123e4567-e89b-12d3-a456-426614174000';
I would expect Cassandra to return quickly with no rows found, not scan the full table.
Someone suggested that inserting a dummy row with (account_id = 'D-F-8CX7PGX' AND user_id = '00000000-0000-0000-0000-000000000000') would avoid the full table scan, but I don't understand why that would be needed.
Has anyone encountered a similar issue?
A single partition query does not do a full table scan.
Since the partition key is (account_id, user_id) and your query filters on a single partition, Cassandra will attempt to retrieve the partition from the relevant replica(s) without scanning the whole table. Cheers!

How to do pagination and sorting on post table in cassandra database?

I am using Cassandra v3.0 as my database.
My table schema is as below:
CREATE TABLE db_name.post (
postcreatedby timeuuid,
contenttype text,
createdat bigint,
friendid timeuuid,
posttype text,
id timeuuid,
PRIMARY KEY (postcreatedby, contenttype, createdat, friendid, posttype, id)
) WITH CLUSTERING ORDER BY (contenttype ASC, createdat ASC, friendid ASC, posttype ASC, id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
And I am trying to run the following query:
SELECT * from post WHERE postcreatedby = timeuuid AND contenttype IN ('text', 'text') AND createdat < bigint AND friendid = timeuuid AND posttype = 'text';
And I get the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Clustering column "friendid" cannot be restricted (preceding column "createdat" is restricted by a non-EQ relation)"
My question is:
I need to filter on all of the columns and also sort the results. I am using createdat to drive sorting and pagination.
The problem: if I make createdat the last clustering key, I can filter on all of the columns but cannot sort the data. If I put createdat before other clustering columns, I cannot filter on the columns that follow it.

toTimestamp function is not working in select statement cassandra

I have table as
cqlsh> DESC relation.students;
CREATE TABLE relation.students (
student_id uuid,
created_at timeuuid,
name text,
PRIMARY KEY (student_id, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX index_students_student_id ON relation.students (student_id);
The list of timeuuid functions includes toTimestamp, but when I try to use it I get an error.
cqlsh> select student_id, name, toTimestamp(created_at) from relation.students where student_id=c2e160e6-27bb-4e25-91ca-e33dc7538d25;
InvalidRequest: code=2200 [Invalid query] message="Unknown function 'toTimestamp'"
I am using version 2.1
cqlsh> select peer, release_version from system.peers;
peer | release_version
-------------+-----------------
127.0.0.1 | 2.1.14.1272
toTimestamp was introduced in Cassandra 2.2.
Check the Cassandra release news.
You could also create your own function with a UDF, but UDFs were also introduced in Cassandra 2.2.
If you use a Cassandra version lower than 2.2, you can't use this function.
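On a 2.1 cluster, one workaround is to select created_at as-is and convert it on the client (the older dateOf() function, used elsewhere on this page, is another option). The following is a minimal Python sketch using only the standard library; the helper name timeuuid_to_datetime is hypothetical, and it assumes the value is a version 1 (time-based) UUID, which is what a timeuuid column holds.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(value):
    """Convert a version 1 UUID (or its string form) to a UTC datetime."""
    u = value if isinstance(value, uuid.UUID) else uuid.UUID(str(value))
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    # u.time counts 100 ns ticks since the UUID epoch; convert to microseconds.
    micros = (u.time - UUID_EPOCH_OFFSET) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)


# This UUID encodes the Unix epoch itself.
epoch = timeuuid_to_datetime("13814000-1dd2-11b2-8080-808080808080")
```

The conversion truncates to microseconds, which is the finest resolution a Python datetime can hold.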

Cassandra table schema changing

I am using DataStax Cassandra 3.0.
When I create a table in Cassandra using cqlsh, the schema changes: the column names are rearranged into alphabetical order. Please see below.
This is the structure when creating the table:
cqlsh> CREATE TABLE tutorialspoint.SupplierItemData_input15(partnumber BIGINT PRIMARY KEY,
... supplier text,
... monthyear varchar,
... allocation int,
... evdate date,
... paymentterms int,
... actualdays int,
... percentageofpayment int,
... variation int,
... paymenttermsummary text,
... copq int,
... year int,
... month int,
... postingdate date);
But when I check the table with DESCRIBE, the structure has changed:
cqlsh> DESCRIBE tutorialspoint.SupplierItemData_input15;
CREATE TABLE tutorialspoint.supplieritemdata_input15 (
partnumber bigint PRIMARY KEY,
actualdays int,
allocation int,
copq int,
evdate date,
month int,
monthyear text,
paymentterms int,
paymenttermsummary text,
percentageofpayment int,
postingdate date,
supplier text,
variation int,
year int
)
WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
Please help me with this.
Thank you,
Ravi
If you want to import data using cqlsh COPY from a csv file then you should add your column names as a header at the top of the csv file. That way it doesn't matter what order they are by default.
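As a rough illustration, the Python sketch below writes a CSV with a header row and builds a matching cqlsh COPY command. It goes one step further than the advice above by also listing the columns explicitly in the COPY command, in file order, so the import does not depend on the order DESCRIBE prints. The helper names, the file name items.csv, and the shortened column list are assumptions made for the example.

```python
import csv
import io

# Column order as it appears in the CSV file, not the order DESCRIBE prints.
# Shortened here; a real file would list every column it contains.
CSV_COLUMNS = ["partnumber", "supplier", "monthyear", "allocation"]


def write_csv(rows, columns):
    """Return CSV text with a header row and columns in the given order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def copy_command(table, columns, path):
    """Build a cqlsh COPY FROM command that names the columns in file order."""
    return "COPY {} ({}) FROM '{}' WITH HEADER = TRUE;".format(
        table, ", ".join(columns), path
    )


rows = [{"partnumber": 1, "supplier": "acme", "monthyear": "2017-01", "allocation": 5}]
csv_text = write_csv(rows, CSV_COLUMNS)
command = copy_command("tutorialspoint.supplieritemdata_input15", CSV_COLUMNS, "items.csv")
```

The generated command is meant to be pasted into cqlsh after saving csv_text to the named file.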

Cassandra Delete Records

I'm new to Cassandra and I've been having some issues trying to delete records. I have a table defined as follows:
CREATE TABLE wire_journal (
persistence_id text,
partition_nr bigint,
sequence_nr bigint,
timestamp timeuuid,
timebucket text,
event blob,
event_manifest text,
message blob,
ser_id int,
ser_manifest text,
tag1 text,
tag2 text,
tag3 text,
used boolean static,
writer_uuid text,
PRIMARY KEY ((persistence_id, partition_nr), sequence_nr, timestamp, timebucket)
) WITH CLUSTERING ORDER BY (sequence_nr ASC, timestamp ASC, timebucket ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'bucket_high': '1.5', 'bucket_low': '0.5', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'enabled': 'true', 'max_threshold': '32', 'min_sstable_size': '50', 'min_threshold': '4', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2', 'unchecked_tombstone_compaction': 'false'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
And Indexes defined as follows:
CREATE CUSTOM INDEX timestamp_idx ON wire_journal (timestamp) USING 'org.apache.cassandra.index.sasi.SASIIndex';
CREATE CUSTOM INDEX manifest_idx ON wire_journal (event_manifest) USING 'org.apache.cassandra.index.sasi.SASIIndex';
I would like to be able to delete by timestamp and event_manifest.
I can query by an event manifest for example:
select event_manifest, dateOf(timestamp) from wire_journal where event_manifest = '011000028';
The query above works. However, if I try to delete using the same criteria as follows:
delete from wire_journal where event_manifest = '011000028';
I get the following error:
InvalidRequest: code=2200 [Invalid query] message="Some partition key parts are missing: persistence_id, partition_nr"
I've tried including those columns in my delete as follows:
delete persistence_id, partition_nr from wire_journal where event_manifest = 'aba:011000028';
and I get the following error:
invalidRequest: code=2200 [Invalid query] message="Invalid identifier persistence_id for deletion (should not be a PRIMARY KEY part)"
How can I go about deleting all the records that match that condition?
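Your partition key is (persistence_id, partition_nr), and Cassandra can only delete rows when the WHERE clause specifies the full partition key. A DELETE also cannot filter on a non-primary-key column such as event_manifest, so first select the primary key columns of the matching rows:
select persistence_id, partition_nr, sequence_nr, timestamp, timebucket from wire_journal where event_manifest = 'aba:011000028';
Then delete each returned row by its key:
delete from wire_journal where persistence_id = x AND partition_nr = y AND sequence_nr = s AND timestamp = t AND timebucket = b;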
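Because a DELETE has to name the partition key, one way to remove every row matching an indexed value is to read the primary keys of the matching rows first and then delete row by row. The Python sketch below only builds the statements and parameters; it assumes rows arrive as dicts and uses %s placeholders in the style of the DataStax Python driver. The names SELECT_KEYS, DELETE_ROW and delete_statements are hypothetical.

```python
# Primary key columns of wire_journal: partition key first, then clustering columns.
KEY_COLUMNS = ("persistence_id", "partition_nr", "sequence_nr", "timestamp", "timebucket")

# Step 1: use the index on event_manifest to find the keys of matching rows.
SELECT_KEYS = (
    "SELECT " + ", ".join(KEY_COLUMNS) + " FROM wire_journal WHERE event_manifest = %s"
)

# Step 2: delete one row at a time, identified by its full primary key.
DELETE_ROW = (
    "DELETE FROM wire_journal WHERE " + " AND ".join(c + " = %s" for c in KEY_COLUMNS)
)


def delete_statements(rows):
    """Yield (statement, params) pairs, one per row returned by SELECT_KEYS."""
    for row in rows:
        yield DELETE_ROW, tuple(row[c] for c in KEY_COLUMNS)


# Example input shaped like one result row of SELECT_KEYS (values are made up).
sample = [{
    "persistence_id": "p1",
    "partition_nr": 0,
    "sequence_nr": 7,
    "timestamp": "13814000-1dd2-11b2-8080-808080808080",
    "timebucket": "20170101",
}]
pending = list(delete_statements(sample))
```

Each pair can then be handed to whatever client executes the statements; on a large result set the deletes create many tombstones, so batching by partition may be worth considering.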

Resources