What is the best tool to find the number of rows in each Cassandra partition? I have a big partition and I want to know how many records are in that partition.
nodetool tablehistograms <keyspace> <table> will give you the distribution of cell counts and partition sizes for the table, but it will not tell you about one specific partition. To get the count for a specific partition you have to run count(*) in a select query that specifies the partition key in the where clause. On a very large partition that query can fail, though.
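For example, a minimal sketch of such a query (the keyspace, table and key value here are made up):
SELECT COUNT(*) FROM my_keyspace.my_table WHERE pk = 12345;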
sstablemetadata in 4.0 and later is based on the describe command from sstable-tools. If you pass the -s flag to scan the sstable, it will report the partitions largest in size, largest in number of rows, and the partitions with the most tombstones. It can be used against 3.0 and 3.11 sstables; I think 2.1 sstables cannot be processed, though.
...
Partitions: 22515
Rows: 13579337
Tombstones: 0
Cells: 13579337
Widest Partitions:
[12345] 999999
[99049] 62664
[99007] 60437
[99017] 59728
[99010] 59555
Largest Partitions:
[12345] 189888705
[99049] 2965017
[99007] 2860391
[99017] 2826094
[99010] 2818038
...
The above example has an int partition key; with a text partition key it will print the key like:
Widest Partitions:
[frodo] 1
Largest Partitions:
[frodo] 104
You can find the total number of partitions for a table with the nodetool command: ./nodetool cfstats <keyspace>.<table>.
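In its output, look for the partition estimate line (the value is derived from the SSTables, so it is an estimate rather than an exact count):
Number of partitions (estimate): <n>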
If you know the partition key, you can run a select count(*) for the partition to get the number of records in it. It's possible that count queries on big partitions time out, so set the cqlsh request timeout before executing the query.
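For example, a rough sketch (the timeout is in seconds and the names are placeholders):
cqlsh --request-timeout=3600
cqlsh> SELECT COUNT(*) FROM my_keyspace.my_table WHERE pk = 12345;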
To understand how to calculate the physical partition size, go through the DataStax DS220: Data Modeling material on partition size.
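Roughly, the estimate taught there is (all sizes in bytes, ignoring compression and per-cell metadata overhead, so treat this as a ballpark only):
partition size ≈ sum(partition key column sizes) + sum(static column sizes) + Nr * (sum(clustering column sizes) + sum(regular column sizes)) + Nv * 8
where Nr is the number of rows in the partition and Nv is the number of cells, each of which carries roughly an 8-byte timestamp.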
Instaclustr has a tool to find the partition size; however, it does not show the number of records in each partition:
https://github.com/instaclustr/cassandra-sstable-tools
As mentioned above, you can use the built-in nodetool utility, which can be found in the Cassandra installation directory, and run it from a terminal:
nodetool toppartitions
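A rough usage sketch (on Cassandra 3.x it samples reads and writes for a duration given in milliseconds; the keyspace and table names are placeholders, and on 4.0+ nodetool profileload replaces it):
nodetool toppartitions <keyspace> <table> 10000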
Additionally, you can use an online tool such as https://www.cqlguru.io/, but this needs some prior information such as the average number of rows per partition, the average length of text in varchar columns, and so on. This tool is good for a rough estimation, though.
Related
The command I ran was
nodetool tablehistograms <keyspace> <table>
The error was
No SSTables exists, unable to calculate 'Partition Size' and 'Cell Count' percentiles
I was trying to calculate partition sizes to make better choices of partition keys, but the nodetool command did not work; the partition size is not reported because of the error above.
SSTables are immutable as far as I know, and I do not know whether I should (or how to) create SSTables based on the existing ones.
Experts, please help me solve this problem; I'd really appreciate it.
Best
How exact do you need to be when measuring the partition sizes?
For a quick estimate, 'nodetool tablestats <keyspace.table>' will give you the min, max and avg partition size.
If a more accurate measurement is needed, you could download DSBulk and run its count option to pull the largest n partitions for a table, which will also print the keys. For example:
dsbulk count --stats.modes partitions --stats.numPartitions <n> -k myKeyspace -t myTable
There are no histograms available for the command to report if there are no SSTables on disk.
The nodetool tablehistograms command collects its metrics from the SSTables, but if there are none stored on disk then there is nothing for the command to report.
Make sure that the table contains data in the data/ directory then try again. Cheers!
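If the table has only just been written to, the data may still be sitting in memtables; flushing writes it out as SSTables that the command can then read (keyspace and table names are placeholders):
nodetool flush <keyspace> <table>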
How to calculate the total size of a keyspace in cassandra?
I have tried the nodetool cfstats and nodetool tablestats command. It is giving a lot of information, but I am not sure which field provides the exact information.
Can anybody suggest any method to find out the size of a keyspace and a table in Cassandra?
"nodetool tablestats" replaces the older command "nodetool cfstats". In other words both are the same. Output of this command lists the size of each of the tables within a keyspace.
In the output, you are looking for the "Space used (total)" value. It is the total number of bytes of disk space used by SSTables belonging to this table, including obsolete SSTables waiting to be GC'd.
Since there can be multiple tables within a keyspace, you need to sum up "Space used (total)" for all tables belonging to the keyspace to get the space occupied by that keyspace.
Another alternative, if you have SSH access to the nodes, is to go to the Cassandra data directory and run "du -h" to get the size of each keyspace directory. Again, sum up the directory sizes on all nodes for that keyspace (ignoring the snapshot sizes).
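For example, assuming the default data directory location (adjust the path for your install; the exclude pattern needs GNU du):
du -sh /var/lib/cassandra/data/<keyspace>
du -sh --exclude='*snapshots*' /var/lib/cassandra/data/<keyspace>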
Through OpsCenter and nodetool cfstats I was able to find that one of the partitions of a keyspace table is 560 MB, but I couldn't find out which partition that is. How can we trace which partition of the table is that big?
The fastest possible way is to look for messages in the log about compacting large partitions. Sort of a cheat, but it often works.
Short of that, you'll need to dump the sstables to JSON and then inspect the JSON. A number of people have written tools for this; https://github.com/BrianGallew/cassandra_tools is one example.
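As a concrete sketch of the log-based approach (the exact message wording varies by version; in 3.x a warning is logged when a compacted partition exceeds compaction_large_partition_warning_threshold_mb):
grep -i 'large partition' /var/log/cassandra/system.log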
How does the number of partitions influence repair time in a Cassandra cluster?
Is it correct that the fewer partitions there are, the faster the Merkle tree algorithm and the repair procedure run?
Will repair be faster for
CREATE TABLE ks.t1 (
id2 bigint,
id1 bigint,
name text,
PRIMARY KEY (id2, id1, name)
);
than for
CREATE TABLE ks.t1 (
id2 bigint,
id1 bigint,
name text,
PRIMARY KEY ((id2, id1), name)
);
if count(id2, id1) > count(id1)?
When triggering a repair, Cassandra will:
read all SSTables locally on disk into memory
compute the Merkle tree
exchange the Merkle tree between the different replicas
if there is a mismatch, send a block of partitions over the network
Because the Merkle tree resolution only allows 32768 leaf nodes, if there are more than 32768 partitions on a single replica, many partitions will hash into the same leaf node. So if a single partition mismatches, the whole block of partitions has to be sent. That's what I call over-repair.
This issue is solved, more or less, by sub-range repair, where instead of repairing the whole token range of a table, Cassandra only attempts to repair a portion of the token range. The direct result is that the Merkle tree resolution is higher, since there are fewer partitions to repair.
So yes, it seems that having fewer partitions will reduce over-repair.
But ....
In your example, fewer partitions == wider partitions, which is not ideal either.
Why? Because if there is a single cell mismatch in a wide partition, Cassandra will need to repair the entire partition, which is a waste of resources.
Furthermore, wide partitions make the read path slower because the data is likely to span many SSTables.
Conclusion, I would personally prefer PRIMARY KEY ((id2, id1), name) and use sub-range repair.
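A sketch of what a sub-range repair invocation looks like (the token values are placeholders; tools such as Cassandra Reaper automate splitting the ring into sub-ranges for you):
nodetool repair -st <start_token> -et <end_token> ks t1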
We modeled our data in a Cassandra table with a partition key, let's say "pk". We have a total of 100 unique values for pk and our cluster size is 160 nodes. We are using the random partitioner. When we added data to Cassandra (with a replication factor of 3) for all 100 partitions, I noticed that those 100 partitions are not distributed evenly. One node has as many as 7 partitions and a lot of nodes have only 1 partition or none. Given that we are using the random partitioner, I expected the distribution to be reasonably even. Because 7 partitions are on the same node, that's creating a hotspot for us. Is there a better way to distribute partitions evenly?
Any input is appreciated.
Thanks
I suspect the problem is the low cardinality of your partition key. With only 100 possible values, it's not unexpected that several values end up hashing to the same nodes.
If you have 160 nodes, then only having 100 possible values for your partition key will mean you aren't using all 160 nodes effectively. An even distribution of data comes from inserting a lot of data with a high cardinality partition key.
So I'd suggest you figure out a way to increase the cardinality of your partition key. One way to do this is to use a compound partition key by including part of your clustering columns or data fields in your partition key.
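For example, a sketch of bucketing the partition key to raise its cardinality (the table and column names here are hypothetical):
CREATE TABLE ks.t_bucketed (
pk text,
bucket int,
ts timestamp,
payload text,
PRIMARY KEY ((pk, bucket), ts)
);
Readers then need to query every bucket for a given pk (or be able to compute the bucket), which is the usual trade-off of this approach.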
You might also consider switching to the Murmur3Partitioner, which generally gives better performance and is the current default partitioner on the newest releases. But you'd still need to address the low cardinality problem.
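For reference, the partitioner is configured in cassandra.yaml:
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Note that the partitioner of an existing cluster cannot simply be changed in place; the data would have to be reloaded into a cluster created with the new partitioner.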