How do I find the partition key of Cassandra partitions with size greater than 100MB? - cassandra

I want to get a list of partitions with size greater than 100 MB for analysis. How do I achieve this?

Cassandra logs a WARN with the details of partitions being compacted when the partition size is larger than compaction_large_partition_warning_threshold. The default in cassandra.yaml is 100 MiB:
# Log a warning when compacting partitions larger than this value
compaction_large_partition_warning_threshold: 100MiB
You can parse the system.log on the Cassandra nodes and look for log entries which contain the string Writing large partition. It looks something like:
WARN [CompactionExecutor:#] BigTableWriter.java:258 maybeLogLargePartitionWarning \
Writing large partition ks_name/tbl_name:pk (###.###MiB) to sstable /path/to/.../...-big-Data.db
It should be easy enough to write a shell script that would extract the table name and partition key from the logs. Cheers!
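For example, here is a minimal Python sketch of that idea (a shell grep/awk works just as well). It assumes the log line format shown above with the size reported in MiB; older releases report the size in bytes, so the regex may need adjusting for your version:
import re
import sys
from collections import defaultdict

# Matches: Writing large partition ks/tbl:key (###.###MiB) to sstable ...
LARGE_PARTITION_RE = re.compile(
    r"Writing large partition (?P<ks>[^/]+)/(?P<tbl>[^:]+):(?P<key>.+?) "
    r"\((?P<size>[\d.]+)\s*MiB\)"
)

def find_large_partitions(log_path):
    """Yield (keyspace, table, partition_key, size_mib) for each warning."""
    with open(log_path, errors="replace") as log:
        for line in log:
            m = LARGE_PARTITION_RE.search(line)
            if m:
                yield m.group("ks"), m.group("tbl"), m.group("key"), float(m.group("size"))

if __name__ == "__main__":
    # Usage: python large_partitions.py /var/log/cassandra/system.log
    largest = defaultdict(float)
    for ks, tbl, key, size_mib in find_large_partitions(sys.argv[1]):
        largest[(ks, tbl, key)] = max(largest[(ks, tbl, key)], size_mib)
    for (ks, tbl, key), size_mib in sorted(largest.items(), key=lambda kv: -kv[1]):
        print(f"{size_mib:10.1f} MiB  {ks}.{tbl}  key={key}")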

Related

Would repartitioning in Spark change the number of Spark partitions whilst reading data from a Cassandra source?

I am reading a table from Cassandra in Spark. I have big partitions in Cassandra, and when a Cassandra partition exceeds 64 MB it still ends up mapped to a single Spark partition. Because of these large partitions I am getting memory issues in Spark.
My question is: if I repartition right after reading the data from Cassandra, would the number of Spark partitions change, and would that avoid the Spark memory issues?
My assumption is that Spark first reads the data from Cassandra, so at that stage a large Cassandra partition won't be split by a repartition; the repartition only works on the data already loaded from Cassandra.
I am just wondering whether repartitioning could change the data distribution while reading from Cassandra, rather than partitioning again afterwards?
If you repartition your data using some arbitrary key then yes, it will be redistributed among the Spark partitions.
Technically, Cassandra partitions do not get split into Spark partitions when you retrieve the data but once you're done reading, you can repartition on a different key to break up the rows of a large Cassandra partition.
For the record, it doesn't avoid the memory issues of reading large Cassandra partitions in the first place because the default input split size of 64MB is just a notional target that Spark uses to calculate how many Spark partitions are required based on the estimated Cassandra table size and C* partition sizes. But since the calculation is based on estimates, the Spark partitions don't actually end up being 64MB in size.
If you are interested, I've explained in detail how Spark partitions are calculated in this post -- https://community.datastax.com/questions/11500/.
To illustrate with an example, let's say that based on the estimated table size and estimated number of C* partitions, each Spark partition is mapped to 200 token ranges in Cassandra.
For the first Spark partition, the token range might only contain 2 Cassandra partitions of size 3MB and 15MB, so the actual size of the data in the Spark partition is just 18MB.
But in the next Spark partition, the token range contains 28 Cassandra partitions that are mostly 1 to 4MB but there is one partition that is 56MB. The total size of this Spark partition ends up being a lot more than 64MB.
In these 2 cases, one Spark partition was just 18MB in size while the other is bigger than the 64MB target size. I've explained this issue in a bit more detail in this post -- https://community.datastax.com/questions/11565/. Cheers!
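To make the read-then-repartition approach concrete, here is a rough PySpark sketch. It assumes the spark-cassandra-connector is on the classpath; ks_name, tbl_name and some_clustering_column are placeholders for your own schema, not names from the question:
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("wide-partition-read")
         .config("spark.cassandra.connection.host", "127.0.0.1")
         .getOrCreate())

# Read the table; at this point each Spark partition still holds whole
# Cassandra partitions, grouped by the ~64MB input split size estimate.
df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="ks_name", table="tbl_name")
      .load())

# Repartitioning on another column redistributes the rows of a large
# Cassandra partition across many Spark partitions for downstream work,
# but the initial read above has already happened.
redistributed = df.repartition(200, "some_clustering_column")
print(redistributed.rdd.getNumPartitions())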

Size of cassandra partitions

What is the best tool to find the number of rows in each Cassandra partition? I have a big partition and I want to know how many records are there in that partition.
nodetool tablehistograms <keyspace> <table> will give you the distribution of the cell counts and partition sizes for the table, but it does not tell you anything about a specific partition. To get a specific partition you must run count(*) in a SELECT query that specifies the partition key in the WHERE clause. On a very large partition that query can fail (time out), though.
Since 4.0, sstablemetadata is based on the describe command in sstable-tools. If you provide -s to scan the sstable, it will give you the partitions largest in size, largest in number of rows, and the partitions with the most tombstones. This works against 3.0 and 3.11 sstables; I think 2.1 sstables cannot be processed though.
...
Partitions: 22515
Rows: 13579337
Tombstones: 0
Cells: 13579337
Widest Partitions:
[12345] 999999
[99049] 62664
[99007] 60437
[99017] 59728
[99010] 59555
Largest Partitions:
[12345] 189888705
[99049] 2965017
[99007] 2860391
[99017] 2826094
[99010] 2818038
...
The above example has an int partition key; with a text partition key it will print out keys like:
Widest Partitions:
[frodo] 1
Largest Partitions:
[frodo] 104
You can find the (estimated) total number of partitions for a table with the nodetool command ./nodetool cfstats <keyspace>.<table>.
If you know the partition key, you can run a select count(*) for that partition to get the number of records in it. Such count queries can time out on big partitions, so increase the cqlsh request-timeout before executing the query.
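If you prefer to do it from code, here is a small sketch with the DataStax Python driver; ks_name, tbl_name and pk are placeholders, and the 120-second timeout is just an example (very large partitions may still fail, in which case paging through the rows and counting client-side is the fallback):
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("ks_name")

# Count the rows of a single partition with a generous per-request timeout.
stmt = SimpleStatement("SELECT count(*) FROM tbl_name WHERE pk = %s")
row = session.execute(stmt, ("some_partition_key",), timeout=120.0).one()
print(row.count)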
To understand how to calculate the physical partition size, go through the DataStax DS220: Data Modeling material on partition size.
Instaclustr has a tool to find the partition size. However, this does not show the number of records in each partition:
https://github.com/instaclustr/cassandra-sstable-tools
As mentioned above, you can use the built-in nodetool, found in the bin directory of the Cassandra installation, and run it in a terminal:
nodetool toppartitions
Additionally, you can also use an online tool such as https://www.cqlguru.io/, but this needs some prior information, such as the average number of rows per partition and the average length of text in varchar columns. It is good for a rough estimation, though.

Regarding Cassandra Table Size

How to calculate the total size of a keyspace in cassandra?
I have tried the nodetool cfstats and nodetool tablestats commands. They give a lot of information, but I am not sure which field provides the exact answer.
Can anybody suggest any method to find out the size of a keyspace and a table in Cassandra?
"nodetool tablestats" replaces the older command "nodetool cfstats". In other words both are the same. Output of this command lists the size of each of the tables within a keyspace.
Amongst the output, you are looking for "Space used (total)" value. Its the Total number of bytes of disk space used by SSTables belonging to this table, including obsolete SSTables waiting to be GCd.
Since there could be multiple tables within a keyspace, you need to sum up "Space used (total)" for all tables belonging to a keyspace to get size occupied by keyspace.
Another alternative, if you have SSH access to the nodes, is to go to the Cassandra data directory and issue "du -h" to get the size of each keyspace directory. Again, sum up the directory sizes on all nodes for that keyspace (ignoring the snapshot sizes).
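As a sketch of the summing step, the snippet below runs nodetool tablestats for a keyspace on one node and adds up the "Space used (total)" values; the label and layout can differ slightly between versions, and you would still need to run it on every node:
import re
import subprocess
import sys

def keyspace_bytes(keyspace):
    # Parse `nodetool tablestats <keyspace>` output and sum "Space used (total)".
    out = subprocess.run(
        ["nodetool", "tablestats", keyspace],
        capture_output=True, text=True, check=True
    ).stdout
    return sum(int(m) for m in re.findall(r"Space used \(total\):\s*(\d+)", out))

if __name__ == "__main__":
    ks = sys.argv[1]
    total = keyspace_bytes(ks)
    print(f"{ks}: {total} bytes ({total / 1024**3:.2f} GiB) on this node")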

How to retrieve wide partition from cassandra?

We have some large partitions in Cassandra and I would like to see what caused the large partition. Is there a tool to get the partition data out of Cassandra and analyze it? Right now the cqlsh query times out even when selecting a single row from the partition.
It can depend on why you get the timeout, but there are a number of options:
Increase column_index_size_in_kb in your cassandra.yaml to something like 1024 and rebuild the sstables. This works around the object allocation issues that a wide partition index introduces.
Increase the heap size.
Increase read_request_timeout_in_ms.
Increase the key cache size (nodetool setcachecapacity 1000 0 0), then make the read. Watch the read stage until it is down to zero, then try the request again. This is hard unless the cluster is essentially unused. The read continues even after the timeout; once it finishes reading the index it will cache it, so the following read will skip that part (generally the worst), which speeds things up significantly.
Pull the raw data from the sstable with sstabledump or sstable-tools (see the sketch below).
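For the last option, a rough Python sketch of dumping and summarising one partition is shown below. It assumes sstabledump is on the PATH and that -k limits the dump to the given partition key; the JSON layout (a list of partitions, each with a "rows" array) matches 3.x/4.x output but may need tweaking for your version:
import json
import subprocess
import sys

def summarise_partition(sstable_path, partition_key):
    # Dump only the requested partition as JSON and count its rows/cells.
    out = subprocess.run(
        ["sstabledump", sstable_path, "-k", partition_key],
        capture_output=True, text=True, check=True
    ).stdout
    for p in json.loads(out):
        rows = p.get("rows", [])
        cells = sum(len(r.get("cells", [])) for r in rows)
        print(f"key={p['partition']['key']}  rows={len(rows)}  cells={cells}")

if __name__ == "__main__":
    # Usage: python dump_partition.py /path/to/...-big-Data.db some_key
    summarise_partition(sys.argv[1], sys.argv[2])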

Spark Data Frame write to parquet table - slow at updating partition stats

When I write data from a dataframe into a Parquet table (which is partitioned), after all the tasks succeed the process is stuck at updating partition stats.
16/10/05 03:46:13 WARN log: Updating partition stats fast for:
16/10/05 03:46:14 WARN log: Updated size to 143452576
16/10/05 03:48:30 WARN log: Updating partition stats fast for:
16/10/05 03:48:31 WARN log: Updated size to 147382813
16/10/05 03:51:02 WARN log: Updating partition stats fast for:
df.write.format("parquet").mode("overwrite").partitionBy(part1).insertInto(db.tbl)
My table has > 400 columns and > 1000 partitions.
Please let me know if we can optimize and speed up updating the partition stats.
I feel the problem here is that there are too many partitions for a table with > 400 columns. Every time you overwrite a table in Hive, the statistics are updated. In your case it will try to update statistics for 1000 partitions, and each partition has data with > 400 columns.
Try reducing the number of partitions (use another partition column or if it is a date column consider partitioning by month) and you should be able to see a significant change in performance.
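As a rough PySpark sketch of that suggestion, derive a coarser partition column (month) from a date column so the write touches far fewer partitions; event_date, event_month and db.tbl_by_month are placeholders, not names from the question:
from pyspark.sql import functions as F

# Derive a month column and partition by it instead of the raw date.
df_by_month = df.withColumn("event_month", F.date_format("event_date", "yyyy-MM"))

(df_by_month.write
 .format("parquet")
 .mode("overwrite")
 .partitionBy("event_month")
 .saveAsTable("db.tbl_by_month"))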
