Our Java application performs batch inserts on one of the tables.
The table schema is something like this:
CREATE TABLE "My_KeySpace"."my_table" (
key text,
column1 varint,
column2 bigint,
column3 text,
column4 boolean,
value blob,
PRIMARY KEY (key, column1, column2, column3, column4)
) WITH CLUSTERING ORDER BY ( column1 DESC, column2 DESC, column3 ASC, column4 ASC )
AND COMPACT STORAGE
AND bloom_filter_fp_chance = 0.1
AND comment = ''
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.1
AND speculative_retry = 'NONE'
AND caching = {
'keys' : 'ALL',
'rows_per_partition' : 'NONE'
}
AND compression = {
'chunk_length_in_kb' : 64,
'class' : 'LZ4Compressor',
'enabled' : true
}
AND compaction = {
'class' : 'LeveledCompactionStrategy',
'sstable_size_in_mb' : 5
};
Note that gc_grace_seconds = 0 in the schema above. Because of this, I am getting the following warning:
2019-02-05 01:59:53.087 WARN [SharedPool-Worker-5 - org.apache.cassandra.cql3.statements.BatchStatement:97] Executing a LOGGED BATCH on table [My_KeySpace.my_table], configured with a gc_grace_seconds of 0. The gc_grace_seconds is used to TTL batchlog entries, so setting gc_grace_seconds too low on tables involved in an atomic batch might cause batchlog entries to expire before being replayed.
I have looked at the Cassandra code; this warning is emitted for obvious reasons at this line.
Is there any solution that doesn't require changing the batch code in the application?
Should I increase gc_grace_seconds?
In Cassandra, batches aren't the way to optimize inserts into the database; they are mostly used for coordinating writes into multiple tables, etc. If you use batches to insert into multiple partitions, you'll actually get worse performance.
You can get better insert throughput by executing statements asynchronously (via executeAsync), and/or by using batches only for inserts that target the same partition (see the sketch below).
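As a minimal sketch of the second option, here is an UNLOGGED batch in which every insert targets the same partition of the table above (the key and values are invented for illustration). An unlogged batch never writes to the batchlog, so the gc_grace_seconds warning does not apply to it:
BEGIN UNLOGGED BATCH
INSERT INTO "My_KeySpace"."my_table" (key, column1, column2, column3, column4, value)
VALUES ('some-key', 1, 100, 'a', true, 0xCAFE);
INSERT INTO "My_KeySpace"."my_table" (key, column1, column2, column3, column4, value)
VALUES ('some-key', 2, 200, 'b', false, 0xBEEF);
APPLY BATCH;
Because every statement shares the same partition key ('some-key' here), the batch is applied as a single mutation on one partition.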
My Cassandra server died, and I tried to restore it on another computer. Following this article https://community.datastax.com/questions/4818/backup-and-restore-cassandra-keyspace.html, I moved the data folder to the new server. All of the tables were restored properly except one of them.
When I try to read data from it, I get the exception:
ERROR [ReadStage-2] 2022-10-19 07:47:55,026 AbstractLocalAwareExecutorService.java:166 - Uncaught exception on thread Thread[ReadStage-2,10,main]
java.lang.AssertionError: Lower bound [INCL_END_BOUND(2022-10-15 15:23Z) ]is bigger than first returned value [Row: utcdate=2022-10-15 11:07Z | data={t:0.880347,g:0.530729,a:180.0,v:11.7,d:5.896}] for sstable /var/lib/cassandra/data/Telematics_Energo/devicecoordinate-1005f1704edc11ed81fb63e346a603ff/mc-2158-big-Data.db
at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:127) ~[apache-cassandra-3.11.11.jar:3.11.11]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:48) ~[apache-cassandra-3.11.11.jar:3.11.11]
at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.11.jar:3.11.11]
at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374) ~[apache-cassandra-3.11.11.jar:3.11.11]
< ... cut ... >
My query is like this:
SELECT
utcdate, data
FROM
devicecoordinate
WHERE
id = 00a8efb3-7815-e911-a830-00155d03c802
and period = 1663
and utcdate >= '2022-10-09 08:55:00+0000'
and utcdate <= '2022-10-09 08:56:00+0000'
order by
utcdate
;
For other periods the query runs without the exception; it also runs fine without the ORDER BY clause.
The table structure is:
CREATE TABLE devicecoordinate (
id uuid,
period int,
utcdate timestamp,
data text,
PRIMARY KEY (( id, period ), utcdate)
) WITH CLUSTERING ORDER BY ( utcdate DESC )
AND bloom_filter_fp_chance = 0.01
AND comment = ''
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE'
AND caching = {
'keys' : 'ALL',
'rows_per_partition' : 'NONE'
}
AND compression = {
'chunk_length_in_kb' : 64,
'class' : 'LZ4Compressor',
'crc_check_chance' : 1.0,
'enabled' : true
}
AND compaction = {
'base_time_seconds' : 14400,
'class' : 'DateTieredCompactionStrategy',
'enabled' : true,
'max_sstable_age_days' : 5,
'max_threshold' : 32,
'min_threshold' : 4,
'timestamp_resolution' : 'MICROSECONDS',
'tombstone_compaction_interval' : 86400,
'tombstone_threshold' : 0.2,
'unchecked_tombstone_compaction' : false
};
Cassandra version: 3.11.11
cql_version: 3.4.4
native_protocol_version: 4
How can I fix the exception?
This AssertionError is thrown by UnfilteredRowIteratorWithLowerBound.computeNext() because a clustering column value is "larger" than expected, meaning the clustering rows are out of sequence:
ERROR [ReadStage-2] 2022-10-19 07:47:55,026 AbstractLocalAwareExecutorService.java:166 - Uncaught exception on thread Thread[ReadStage-2,10,main]
java.lang.AssertionError: Lower bound [INCL_END_BOUND(2022-10-15 15:23Z) ]is bigger than first returned value [Row: utcdate=2022-10-15 11:07Z | data={t:0.880347,g:0.530729,a:180.0,v:11.7,d:5.896}] for sstable /var/lib/cassandra/data/Telematics_Energo/devicecoordinate-1005f1704edc11ed81fb63e346a603ff/mc-2158-big-Data.db
at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:127) ~[apache-cassandra-3.11.11.jar:3.11.11]
...
In your case, 2022-10-15 15:23Z is larger than utcdate = 2022-10-15 11:07Z.
The last time I came across this problem was when a DBA cloned a table to another cluster but didn't create the schema correctly.
In the schema you posted, the clustering order for utcdate should be in descending (DESC) order:
CREATE TABLE devicecoordinate (
...
) WITH CLUSTERING ORDER BY (utcdate DESC)
but there's a good chance the table was created withOUT specifying the clustering order so it defaulted to ascending (ASC) order:
WITH CLUSTERING ORDER BY (utcdate ASC)
When the schema does not match the data in the SSTables, Cassandra will not be able to read the data because the rows are out-of-sequence.
To fix it, you will need to:
Drop the table with DROP TABLE devicecoordinate.
Recreate the table with the correct clustering order (see the sketch below).
Restore the SSTables to the new table.
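As a minimal sketch, steps 1 and 2 could look like this in cqlsh (only the clustering order is shown here; reapply the remaining table options from the schema you posted):
DROP TABLE devicecoordinate;
CREATE TABLE devicecoordinate (
id uuid,
period int,
utcdate timestamp,
data text,
PRIMARY KEY (( id, period ), utcdate)
) WITH CLUSTERING ORDER BY ( utcdate DESC );
The crucial part is the explicit CLUSTERING ORDER BY (utcdate DESC), so that the schema matches the order in which the existing SSTables were written.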
This should fix your problem. Cheers!
I have a column family as follows:
CREATE TABLE keyspace.tableName (
key text PRIMARY KEY,
A blob,
B text,
C text,
D blob,
"E" text
H int
) WITH COMPACT STORAGE
AND bloom_filter_fp_chance = 0.2
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 7776000
AND gc_grace_seconds = 3600
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.1
AND speculative_retry = '99.0PERCENTILE';
While making read requests with query:
SELECT A, B, C, D, E, WRITETIME(A), WRITETIME(B) FROM TABLE_NAME WHERE KEY = ?;
I'm getting the following exception:
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:88)
This is a huge cluster handling billions of requests per day. I only get this in less than 0.1% of the requests. My 99th percentile is fine. GC is less than 1 sec but I do see some READ messages being dropped in tpstats. Median SSTable count is less than 3. My queries are such that I don't read the same key again for some time, so my key cache hit rate is less than 10%. My read request timeout is 200ms.
Could anyone please give me any pointers on how to debug this?
In my Spark job I am reading data from Cassandra using the Java Cassandra util. My query looks like this:
JavaRDD<CassandraRow> cassandraRDD = functions.cassandraTable("keyspace", "column_family")
    .select("timeline_id", "shopper_id", "product_id")
    .where("action=?", "Viewed");
The row key (partition key) is the action column. When I run my Spark job it causes CPU over-utilisation, but when I remove the filter on the action column it works fine.
Please find below the CREATE TABLE script for the column family:
CREATE TABLE keyspace.column_family (
action text,
timeline_id timeuuid,
shopper_id text,
product_id text,
publisher_id text,
referer text,
remote_ip text,
seed_product text,
strategy text,
user_agent text,
PRIMARY KEY (action, timeline_id, shopper_id)
) WITH CLUSTERING ORDER BY (timeline_id DESC, shopper_id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
What I suspect is that, since action is the row key, all the data is being served from a single node (a hot spot), and that is why that node's CPU might be shooting up. Also, while reading, only a single RDD partition is created in the Spark job. Any help will be appreciated.
OK, you have a data model issue here. action is the partition key, so all rows with the same action are stored in a single partition (one node plus its replicas).
How many distinct actions do you have in total? Your intuition about the hotspot is justified.
You probably need a different partition key, OR you need to add an extra column to the partition key to let Cassandra distribute the data evenly across the cluster; one bucketing option is sketched after the link below.
Read this blog post: http://www.planetcassandra.org/blog/the-most-important-thing-to-know-in-cassandra-data-modeling-the-primary-key/
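As a hedged sketch of the second option, one hypothetical change is to add a synthetic bucket column to the partition key (the column name and type are my assumption, not part of your schema), for example a day number or a hash of shopper_id that the application computes on every write:
CREATE TABLE keyspace.column_family (
action text,
bucket int,
timeline_id timeuuid,
shopper_id text,
product_id text,
publisher_id text,
referer text,
remote_ip text,
seed_product text,
strategy text,
user_agent text,
PRIMARY KEY ((action, bucket), timeline_id, shopper_id)
) WITH CLUSTERING ORDER BY (timeline_id DESC, shopper_id ASC);
Reads for a given action then fan out over the known bucket values, which spreads the load across nodes and also lets Spark create more than one RDD partition.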
I'm having performance issues using NetworkTopologyStrategy on a production keyspace with replication factor 4 across multiple datacenters (DCs in 4 worldwide locations). Each DC has 3 nodes with pretty good hardware (70 GB RAM, 5 TB SSDs, etc.).
The same keyspace performs well with SimpleStrategy on a 4-node cluster in AWS, but running the same queries in the production environment results in poor query times (select * from my_table takes 6 ms in AWS and 271 ms in production).
Table "my_table" (name changed for privacy) is defined as:
CREATE TABLE my_table (
rec_type text,
group_id int,
rec_id timeuuid,
user_id int,
content text,
created_on timestamp,
PRIMARY KEY ((rec_type, group_id), rec_id)
) WITH bloom_filter_fp_chance = 0.1
AND comment = ''
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE'
AND caching = {
'keys' : 'ALL',
'rows_per_partition' : 'NONE'
}
AND compression = {
'sstable_compression' : 'LZ4Compressor'
}
AND compaction = {
'class' : 'LeveledCompactionStrategy'
};
This is a newly created table in production with occasional updates and a low tombstone count.
Query trace is below:
It looks like range requests are taking the most time. What could be the cause of the delay?
Network latency between nodes in the same DC is <1ms and latency between DCs is around 50-60ms.
EDIT:
Below is the query trace for a select * from my_table where rec_type = 'abc' and group_id = 1 LIMIT 300 query (two screenshots because the trace is so long):
I am receiving an OperationTimedOut error while running an ALTER TABLE command in cqlsh. How is that possible? Since this is just a table metadata update, shouldn't the operation complete almost instantaneously?
Specifically, this is an excerpt from my cqlsh session
cqlsh:metric> alter table metric with gc_grace_seconds = 86400;
OperationTimedOut: errors={}, last_host=sandbox73vm230
The metric table currently has a gc_grace_seconds of 864000. I am seeing this behavior in a 2-node cluster and in a 6-node 2-datacenter cluster. My nodes seem to be communicating fine in general (e.g. I can insert in one and read from the other). Here is the full table definition (a cyanite 0.1.3 schema with DateTieredCompactionStrategy, clustering and caching changes):
CREATE TABLE metric.metric (
tenant text,
period int,
rollup int,
path text,
time bigint,
data list<double>,
PRIMARY KEY ((tenant, period, rollup, path), time)
) WITH CLUSTERING ORDER BY (time ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'timestamp_resolution': 'SECONDS', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
I realize that at this point the question is pretty old, and you may have either figured out the answer or otherwise moved on, but I wanted to post this in case others stumble upon it.
The default cqlsh request timeout is 10 seconds. You can adjust this by starting up cqlsh with the --request-timeout option set to some value that allows your ALTER TABLE to run to completion, e.g.:
cqlsh --request-timeout=1000000