DB used: DataStax Cassandra Community 3.0.9
Cluster: 3 x (8-core, 64 GB AWS instances) with 300 GB io1 volumes at 3000 IOPS.
Java heap allocated: 8 GB
Write consistency: QUORUM, read consistency: ONE, replication factor: 3
Problem: I loaded our servers with 50,000 users, and each user initially had 1000 records; after some time, 20 more records were added to each user. I wanted to fetch the 20 additional records that were added later (query: select * from table where userID = 'xyz' and timestamp > 123, spelled out in CQL after the schema below).
CREATE TABLE tbl (
userID text,
timestamp timestamp,
....
PRIMARY KEY (userID, timestamp)
);
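The fetch from the problem statement, written against this schema (the userID value and the timestamp literal below are placeholders):

-- fetch only the records added after a known cut-off time for one user
SELECT * FROM tbl
WHERE userID = 'xyz'
  AND timestamp > '2016-11-01 00:00:00+0000';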
I have added an additional 200 GB of data to tbl on top of the original data for the 50,000 users.
Heap memory usage stays in the range of 2-4 GB, but almost all of the remaining off-heap memory (~56 GB) is consumed by Cassandra.
From this point onwards, if more data is added to the table, a sharp decline in read throughput is observed due to the unavailable memory.
Although it meets the read-throughput SLA, it does not seem like a scalable solution: 3 x 64 GB of RAM for 200 GB of data.
Note:
In the load test experiment, records for only the initial 50,000 users are being fetched.
Row cache is disabled.
It is a read-intensive application: 2000 reads/sec.
What could be the possible reason for the high off-heap memory usage?
If your partitions are very large, reads take longer. Since your data is partitioned by userID, all data associated with each userID is stored in a single partition on disk (within the partition, the data is ordered by timestamp, your clustering key). When reading, Cassandra must traverse the partition to find the data you are requesting. If each userID has a lot of data associated with it, you can end up with quite large partitions on disk, which take longer to read.
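One common way to bound partition size in a case like this (a sketch of an alternative table, not something taken from the question) is to add a time bucket, such as a month string, to the partition key so that a user's records are spread across many smaller partitions:

-- hypothetical variant of tbl: the partition key is (userID, month), where month
-- (e.g. '2016-11') is derived by the client from the record's timestamp
CREATE TABLE tbl_by_month (
  userID text,
  month text,
  timestamp timestamp,
  payload text,    -- stands in for the other columns elided in the original schema
  PRIMARY KEY ((userID, month), timestamp)
);

-- reads must now supply the bucket as well
SELECT * FROM tbl_by_month
WHERE userID = 'xyz' AND month = '2016-11'
  AND timestamp > '2016-11-01 00:00:00+0000';

The trade-off is that a read spanning a long time range may need to touch more than one bucket.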
I have a Cassandra cluster whose read latency increases during writes. The writes mostly happen via Spark jobs during the night, in huge bursts. Writes use LOCAL_QUORUM and reads use LOCAL_ONE. Is there a way to reduce read latency while the writes are happening?
Cassandra cluster configuration:
10-node Cassandra cluster (5 in DC1, 5 in DC2)
CPU: 8 cores
Memory: 32 GB
Grafana metrics: [dashboard screenshots not included]
I can give some advice:
Use the LeveledCompactionStrategy (LCS) compaction strategy (a CQL sketch follows this list).
Prefer a round-robin load-balancing policy for reads.
Choose the partition key wisely so that requests are not concentrated on a single partition.
Partition size also plays an important role. Cassandra recommends keeping partitions small; however, I have tested partitions of 10,000 rows (each row about 800 bytes) and they performed better than partitions of 3,000 rows (or even 1 row). Very tiny partitions tend to increase CPU usage when the stored data is large in terms of row count, but very large partitions should be avoided as well.
The replication factor should be chosen strategically, and the write consistency level should be decided considering the replication of all keyspaces.
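For the first point, switching an existing table to LCS is a one-line schema change; ks.events below is a placeholder keyspace/table name, not one from the question:

-- LCS does more compaction work up front in exchange for touching fewer SSTables per read
ALTER TABLE ks.events
WITH compaction = {'class': 'LeveledCompactionStrategy'};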
We have some large partitions in Cassandra and I would like to see what caused them. Is there a tool to get the partition data out of Cassandra and analyze it? Right now the cqlsh query times out even when selecting a single row from the partition.
It depends on why you get the timeout, but there are a few options:
Increase column_index_size_in_kb in your cassandra.yaml to something like 1024 and rebuild the SSTables. This works around the object-allocation issues that a wide partition index introduces.
Increase the heap size.
Increase read_request_timeout_in_ms.
Increase the key cache size (nodetool setcachecapacity 1000 0 0), then make the read. Watch the read stage until it is down to zero, then try the request again. This is hard unless the cluster is essentially unused. The read continues even after the timeout; once the index has been read it is cached, so the following read will skip that part (generally the worst), which speeds it up significantly.
Pull the raw data from the SSTables with sstabledump or sstabletools.
I'm working on designing a Cassandra column family.
I ran into higher GC while SELECTing after loading a higher density of data, i.e., the amount of data in a partition increased. With low-density data it works fine.
I want to know how Cassandra executes a SELECT query when both the partition key and the clustering key are specified.
Is the whole set of data in a partition loaded into memory when we execute a SELECT?
Will a large number of partition keys affect performance?
Cassandra does not load the entire partition into memory, but it does load IndexInfo objects, which help Cassandra find the relevant CQL rows within the partition. These are short-lived Java objects that can create quite a bit of heap pressure (GC pauses); this is a design issue that will be addressed in CASSANDRA-9754 (known as Birch, a B-tree implementation of the index data structure).
Until Cassandra 4.0 is released, you should target 100 MB as your maximum partition size and break larger partitions into smaller pieces.
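One way to break a large partition into smaller pieces, sketched here with hypothetical names rather than anything from the question, is to add a synthetic bucket column to the partition key and have the client spread writes across a fixed number of buckets:

-- 'bucket' is computed client-side, e.g. hash(<some field>) % 4, so one logical
-- partition becomes 4 physical partitions of roughly a quarter of the size
CREATE TABLE events_by_key (
  pkey text,
  bucket int,
  ckey timeuuid,
  payload text,
  PRIMARY KEY ((pkey, bucket), ckey)
);

-- reading the whole logical partition then means querying every bucket,
-- either with IN or with one asynchronous query per bucket
SELECT * FROM events_by_key
WHERE pkey = 'abc' AND bucket IN (0, 1, 2, 3);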
I have 4 Cassandra nodes in a cluster and one column family that has 10 columns, where a row cannot grow very wide (maybe 1000 columns at most).
I have "peak" writes where I insert up to 500,000 records within a 5-10 minute window.
I use the Node.js driver node-cassandra-cql.
Three nodes are working fine, but one node crashes every time during heavy writes.
All nodes currently hold around 1.5 GB of data, and the problematic node holds 1.9 GB.
All nodes have the max heap set to 1 GB (the machines have 4 GB of RAM, so the default Cassandra configuration calculated this amount of heap).
I use the default Cassandra configuration except that I increased the write/read timeouts.
Question: Does anyone know what could be the reason for this?
Is the heap size really that small?
What, and how, should I configure the Cassandra cluster for this use case (heavy writes within a small time window, and the rest of the time doing nothing or just small writes)?
I haven't tried increasing the heap size manually; first I would like to know whether there is something else to configure instead of just increasing it.
We have a 32-node Cassandra cluster with around 100 GB per node, using the Murmur3 partitioner. It holds time-series data, and we have built secondary indexes on two columns to perform range queries. Currently the cluster is stable, with all the data bulk loaded and all the secondary indexes rebuilt. The issue occurs when we perform range queries using a CQL client or Hector: just the query for the count of rows takes a huge amount of time, and in most cases it causes nodes to fail due to memory issues. The nodes have 8 GB of memory, and the Cassandra max heap is set to 4 GB. Has anyone else faced such an issue? Is there a better way to do count queries?
I've had similar issues, and most often this can be solved by redesigning the schema with the queries you plan to execute against the data in mind. For time-series data it is better to have wide tables with a granularity that matches your queries. If your query requires data at a granularity of one hour, it is best to have a wide table where all timestamped data points for an hour are stored within a single row, so you can get all the required data for that hour by reading just one row.
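As a sketch of that layout, with hypothetical table and column names, one partition per source per hour might look like this:

-- 'hour' is the event timestamp truncated to the hour by the client, so all points
-- for one source in that hour live in a single partition (one wide row)
CREATE TABLE metrics_by_hour (
  source_id text,
  hour timestamp,
  event_time timestamp,
  metric_value double,
  PRIMARY KEY ((source_id, hour), event_time)
);

-- everything needed for one hour comes back from a single partition
SELECT event_time, metric_value FROM metrics_by_hour
WHERE source_id = 'sensor-1' AND hour = '2014-05-01 10:00:00+0000';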
Since you say the data is bulk loaded, I am assuming you may have put all the data into a single table, which is why the get_count query is taking an enormous amount of time. We have a cluster with 8 GB of RAM but have set the heap size to 3 GB, because at 4 GB the RAM utilization is almost always at 8 GB [full utilization].
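On the count question, one option under that kind of layout (my assumption, not something stated above) is to scope counts to a single partition instead of the whole table:

-- counting within one hour bucket of the sketch above touches only one partition
SELECT COUNT(*) FROM metrics_by_hour
WHERE source_id = 'sensor-1' AND hour = '2014-05-01 10:00:00+0000';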