Using Cassandra, how do I see how many partitions were created based on how I created the primary key? I have been following a tutorial that says to go to bin/cassandra-cli and use the LIST command. However, the latest Cassandra install does not come with this, and I have read other articles online indicating that the CLI is now deprecated.
Is there any way for me to see the partitions that were created using cqlsh?
Thanks in advance!
First of all, you have to look at your cassandra.yaml file to see the number of tokens that are currently configured. This tells you how many partition ranges each node will own:
$ grep num_tokens conf/cassandra.yaml
...
num_tokens: 128
...
$ grep initial_token conf/cassandra.yaml
...
# initial_token: 1
...
If initial_token is commented out, the node will figure out its own partition ranges during start-up.
Next, you can check the partition ranges using the nodetool ring command:
$ bin/nodetool ring
Datacenter: DC1
==========
Address Rack Status State Load Owns Token
9167006318991683417
127.0.0.2 r1 Down Normal ? ? -9178420363247798328
127.0.0.2 r1 Down Normal ? ? -9127364991967065057
127.0.0.3 r1 Down Normal ? ? -9063041387589326037
This shows you which partition range belongs to which node in the cluster.
In the example above, each node owns 128 partition ranges. The range from -9178420363247798327 to -9127364991967065057 belongs to the node 127.0.0.2.
You can use this simple SELECT to see the token of each row's partition key:
cqlsh:mykeyspace> select token(key), key, added_date, title from mytable;
system.token(key) | key | added_date | title
----------------------+-----------+--------------------------+----------------------
-1651127669401031945 | first | 2013-10-16 00:00:00+0000 | Hello World
-1651127669401031945 | first | 2013-04-16 00:00:00+0000 | Bye World
356242581507269238 | second | 2014-01-29 00:00:00+0000 | Lorem Ipsum
356242581507269238 | second | 2013-03-17 00:00:00+0000 | Today tomorrow
356242581507269238 | second | 2012-04-03 00:00:00+0000 | It's good to meet you
(5 rows)
Matching a row's token against the partition ranges tells you which node stores the record.
You can also use nodetool to do the same in one simple step:
$ bin/nodetool getendpoints mykeyspace mytable 'first'
127.0.0.1
127.0.0.2
This tells you where the records with the partition key 'first' are located.
NOTE: If some of the nodes are down, the getendpoints command won't list them, even though they should store the record according to the replication settings.
cassandra-cli is not the same thing as cqlsh. Read this for more information: https://wiki.apache.org/cassandra/CassandraCli
The simplest way to get the number of partitions (keys) is with nodetool.
nodetool tablestats <keyspace>.<table>
The keyspace and table arguments are optional. The number of partitions is listed under Number of keys (estimate).
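If you would rather enumerate the partitions themselves from cqlsh, a SELECT DISTINCT on the partition key returns one row per partition; a minimal sketch, reusing the mykeyspace.mytable example from the first answer (like SELECT *, this scans the whole table across the cluster, so keep it to small tables):
-- One row per partition key; counting the rows counts the partitions.
SELECT DISTINCT key FROM mykeyspace.mytable;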
If you want the number of rows, then Chris's answer is correct.
SELECT * FROM <keyspace>.<table>;
This will show you all rows in the table. Just remember that this is a very costly operation, as Cassandra has to fetch this data from all nodes in the cluster that have any data for that table.
You can just do a SELECT * FROM the table; if you look at the headers, the partition key and clustering key columns are colored differently in cqlsh, so you can figure it out that way.
I have a 6-node cluster; each node is 1000 GB in size. The size of one node suddenly reached 1000 GB. On analysis I found that only one keyspace gets filled, and only one table of this keyspace grew from 200 GB to 800 GB (in 24 hours), which means someone executed operations on this table only. I want to figure out what operations were performed on this node that led to this size increase.
Are there any logs which can be looked at to see what operations were performed?
The way I would do this is to use "nodetool tablehistograms" to prove that you have large partitions for the table. Then I would go to the table directory and run "sstablemetadata" on some of the data files, locating the ones that display large partition sizes.
One trick you could do once you find sstables that have larger partitions is:
sstabledump <sstable> | grep -n "\"key\" :"
That will show you the line number every time the key switches; the larger the gap between line numbers, the more rows there are in that partition.
Here is an example:
sstabledump aa-483-bti-Data.db | grep -n "\"key\" :"
4: "key" : [ "PROCESSING" ],
65605: "key" : [ "PENDING" ],
8552007: "key" : [ "COMPLETED" ],
As you can see, the gap between PENDING and COMPLETED was much larger than the gap between PROCESSING and PENDING (65k lines vs. 8M lines). This tells me that the PROCESSING partition is relatively small compared to PENDING. The only mystery is how large the COMPLETED one is, as there is no "ending" key line. To get the total line count, run:
sstabledump aa-483-bti-Data.db | wc -l
16316029
The total line count is 16M, so COMPLETED goes from about 8.5M to 16M, or roughly 8M lines. The COMPLETED partition is therefore large as well, about as large as the PENDING partition.
Looking at the sstablemetadata output to see if it matches up, I see that it does:
sstablemetadata aa-483-bti-Data.db
Partition Size:
Size (bytes) | Count (%) Histogram
943127 (921.0 kB) | 1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
129557750 (123.6 MB) | 1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
155469300 (148.3 MB) | 1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
I see two relatively large partitions and one small one. Bingo.
Maybe some of those can help you get to the bottom of your large partition(s).
With DataStax Enterprise, you should be able to turn on the Database Auditing feature. In fact, by configuring a logger class of CassandraAuditWriter, all activity gets written to the audit_log table in the dse_audit keyspace.
The data is organized by this PRIMARY KEY: ((date, node, day_partition), event_time), and it has columns like username, table_name, keyspace_name, operation, and others.
Check out the DataStax docs on that for configuration and query options.
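As a sketch of what a query against that table might look like (the date, node, and day_partition values here are illustrative assumptions, not from a real cluster; the full partition key must be restricted):
-- Hypothetical query against dse_audit.audit_log; the partition key
-- values below are made up for illustration.
SELECT event_time, username, keyspace_name, table_name, operation
FROM dse_audit.audit_log
WHERE date = '2019-06-10'
  AND node = '10.0.110.1'
  AND day_partition = 0;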
As for (open source) Apache Cassandra, we use Ericsson's Cassandra Audit plugin for this functionality. By adding in the project's JAR, and making a couple of adjustments to the cassandra.yaml file, you can view the audit.logs for records like:
15:42:41.655 - client:'10.0.110.1'|user:'flynn'|status:'ATTEMPT'|operation:'DELETE FROM ecks.ectbl WHERE partk = ?'
I have a data model like the one below:
CREATE TABLE appstat.nodedata (
nodeip text,
timestamp timestamp,
flashmode text,
physicalusage int,
readbw int,
readiops int,
totalcapacity int,
writebw int,
writeiops int,
writelatency int,
PRIMARY KEY (nodeip, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
where nodeip is the partition key and timestamp is the clustering key (sorted in descending order to get the latest).
Sample data in this table:
SELECT * from nodedata WHERE nodeip = '172.30.56.60' LIMIT 2;
nodeip | timestamp | flashmode | physicalusage | readbw | readiops | totalcapacity | writebw | writeiops | writelatency
--------------+---------------------------------+-----------+---------------+--------+----------+---------------+---------+-----------+--------------
172.30.56.60 | 2017-12-08 06:13:07.161000+0000 | yes | 34 | 57 | 19 | 27 | 8 | 89 | 57
172.30.56.60 | 2017-12-08 06:12:07.161000+0000 | yes | 70 | 6 | 43 | 88 | 79 | 83 | 89
This is properly available, and whenever I need the statistics I am able to get the data using the partition key as below.
(The above looks similar to my previous question, Aggregation in Cassandra across partitions, but the expectation is different.)
I have values for each column (like readbw, latency, etc.) populated every minute on all 4 nodes.
Now, if I need to get the max value of a column (for example, readbw), it is possible using the following query:
SELECT max(readbw) FROM nodedata WHERE nodeip IN ('172.30.56.60','172.30.56.61','172.30.56.62','172.30.56.63') AND timestamp < 1512652272989 AND timestamp > 1512537899000;
1) First question: Is there a way to perform a max aggregation on a column (readbw) across all nodes without using an IN query?
2) Second question: Is there a way in Cassandra to have the data, whenever I insert it for Node 1, Node 2, Node 3, and Node 4, aggregated and stored in another table, so that I can collect the aggregated value of each column from the aggregated table?
If any of my points are not clear, please let me know.
Thanks,
Harry
If you are on DSE Cassandra, you can enable Spark and write the aggregation queries.
Disclaimer: in your question you should state your query-speed requirements. Readers do not know whether you're trying to show this in real time or whether it is more for analytical purposes. It's also not clear how much data you're operating on, and the answers might depend on that.
First, decide whether you want to aggregate on read or on write. This largely depends on your read/write patterns.
1) First question: (aggregation on read)
The short answer is no, it's not possible. If you want to use Cassandra for this, the best approach would be to do the aggregation in your application, reading each nodeip with a timestamp restriction. That would be slow, but Cassandra aggregations are also potentially slow. This warning exists for a reason:
Warnings :
Aggregation query used without partition key
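For illustration, a minimal sketch of that read-side approach: the application issues one single-partition query per nodeip and takes the max of the four results itself (the timestamp bounds here are illustrative):
-- One query per nodeip, each restricted to a single partition; the
-- application computes the overall max across the four results.
SELECT max(readbw) FROM appstat.nodedata
WHERE nodeip = '172.30.56.60'
  AND timestamp > '2017-12-06 05:00:00+0000'
  AND timestamp < '2017-12-07 13:00:00+0000';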
I found the C++ Cassandra driver to be the fastest option, if you're into that.
If your data size allows, I'd look into using other databases. Regular old MySQL or Postgres will do the job just fine, unless you have terabytes of data. There's also InfluxDB if you want a more exotic one. But I'm getting off-topic here.
2) Second question: (aggregation on write)
That's the approach I've been using for a while. Whenever I need some aggregations, I do them in memory (Redis) and then flush the results to Cassandra. Remember, Cassandra is super efficient at writing data, so don't be afraid to create some extra tables for your aggregations. I can't say exactly how to do this for your data, as it all depends on your requirements. It doesn't seem feasible to provide results for arbitrary timestamp intervals when aggregating on write.
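As an illustration only, a pre-aggregated table for the schema in the question might look like the sketch below; the hourly bucketing and the column names are my assumptions, not a prescription:
-- Sketch: one row per node per hour; the application (or an in-memory
-- store like Redis) computes the max and writes it here on flush.
CREATE TABLE appstat.nodedata_hourly (
    nodeip text,
    hour timestamp,   -- the minute-level timestamp truncated to the hour
    max_readbw int,
    max_writebw int,
    PRIMARY KEY (nodeip, hour)
) WITH CLUSTERING ORDER BY (hour DESC);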
Just don't try to put large sets of data into a single partition. You're better off with a traditional SQL database in that case.
I have a 6-node cluster and set the replication factor to 3 and 3 for the two data centers. I inserted a single row as a test, and later more rows. As the total replication factor is 6, I want to check whether the data is written to all nodes. How can I individually check whether the data is present on a given node? The worst option I have is shutting down the remaining 5 nodes, running a SELECT statement against the one remaining node, and repeating the same on all nodes. Is there a better way to check this? Thanks for your time.
You can use nodetool getendpoints for this. Here is a sample table to keep track of Blade Runners.
aploetz@cqlsh:stackoverflow> SELECT * FROM bladerunners;
id | type | datetime | data
--------+--------------+--------------------------+---------------------------------------------
B25881 | Blade Runner | 2015-02-16 18:00:03+0000 | Holden- Fine as long as nobody unplugs him.
B26354 | Blade Runner | 2015-02-16 18:00:03+0000 | Deckard- Filed and monitored.
(2 rows)
Now if I exit back out to my command prompt, I can use nodetool getendpoints, followed by my keyspace, table, and a partition key value. The list of nodes containing data for that key should be displayed underneath:
aploetz@dockingBay94:~$ nodetool getendpoints stackoverflow bladerunners 'B26354'
127.0.0.1
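Alternatively, if you want to verify that every replica has the row without shutting nodes down, one option is to read at CONSISTENCY ALL with tracing enabled in cqlsh; the trace output lists each replica that answered. A sketch against the same example table:
-- Forces every replica to respond; the trace shows which nodes were read.
CONSISTENCY ALL;
TRACING ON;
SELECT * FROM stackoverflow.bladerunners WHERE id = 'B26354';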
I have encountered this problem recently. When I populated my tables (called event and index) with more than 1 million records and tried to truncate them for new tests, the tables were not empty after the truncation. CQL shows something like:
cqlsh> select count(*) from event limit 100000000;
count
---------
2033492
cqlsh> truncate event;
cqlsh> select count(*) from event limit 100000000;
count
-------
25
(1 rows)
cqlsh> select count(*) from event limit 100000000;
count
-------
27
(1 rows)
cqlsh> select count(*) from event limit 100000000;
count
-------
34
(1 rows)
cqlsh> select event_id, dateOf(time_token), unixTimestampOf(time_token), writetime(time_token) from event limit 100000000;
event_id | dateOf(time_token) | unixTimestampOf(time_token) | writetime(time_token)
--------------------------------------+--------------------------+-----------------------------+-----------------------
567c4f2b-c86a-4663-a8ec-50f70d183b62 | 2014-07-22 22:29:04-0400 | 1406082544416 | 1406082544416000
20a2f9e7-cdcb-4c2d-93e7-a646d0910e6b | 2014-07-22 15:12:29-0400 | 1406056349772 | 1406056349774000
... ...
0d983cec-4ba5-4df8-ada8-eb347add57bf | 2014-07-22 22:20:53-0400 | 1406082053926 | 1406082053930000
(34 rows)
cqlsh>
After the "truncate" command, the "select count(*)" returned numbers quickly changing, and stabilized at 34. To be sure there is no other program inserting records at the time, I ran a CQL statement showing all records were created on July 22 or 23, which is 4 - 5 days ago.
I tried "truncate" command several times, and the results were the same.
This happened in two environments. The first environment is on my laptop, where I created a 3-instance Cassandra cluster using localhost IPs (127.0.0.2, 127.0.0.3, and 127.0.0.4); the second environment is a 3-node Cassandra cluster with each node on a separate Linux CentOS 6.5 machine. I am using Cassandra 2.0.6.
Could someone help me to figure out what is going on? Thanks in advance.
It is a bug in Cassandra 2.0.6 that was fixed at least as of 2.0.10.
Apparently, it is not a well-known (or well-publicized) bug, as many DataStax experts did not know about it either when I reproduced it for them at the Cassandra Summit 2014. They were also puzzled until the CQL architect dropped by and said he had fixed a mysterious bug in a recent release. He asked me to upgrade to 2.0.10, and the problem is gone; there are no more lingering records after "truncate" in 2.0.10.
Truncate doesn't truncate hints, so hints awaiting delivery will still get delivered. This could be causing your issue, especially if you inserted lots of rows quickly, which could have caused a few dropped mutations. However, hints are normally delivered in minutes, not days, so there must be something else wrong if hints are causing your issue. You can see from the logs when hints are delivered.
The safest way to delete all data is to drop the table and recreate it under a different name (or in a different keyspace).
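For example, a sketch assuming a keyspace named mykeyspace and only the two columns visible in the output above (your real schema will have more):
-- Recreate under a new name so stale hints for the old table cannot be
-- replayed into it, then drop the original.
CREATE TABLE mykeyspace.event_v2 (
    event_id uuid PRIMARY KEY,
    time_token timeuuid
);
DROP TABLE mykeyspace.event;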
One thing you absolutely have to make sure of before truncating is that all the nodes are up.
If you are using Astyanax
/* keyspace variable is Keyspace Type */
keyspace.truncateColumnFamily(ColumnFamilyName);
Note: even after truncating, you will have to manually delete all of the table metadata.
I have a column family in Cassandra 1.2 as below:
time | class_name | level_log | message | thread_name
-----------------+-----------------------------+-----------+---------------+-------------
121118135945759 | ir.apk.tm.test.LoggerSimple | DEBUG | This is DEBUG | main
121118135947310 | ir.apk.tm.test.LoggerSimple | ERROR | This is ERROR | main
121118135947855 | ir.apk.tm.test.LoggerSimple | WARN | This is WARN | main
121118135946221 | ir.apk.tm.test.LoggerSimple | DEBUG | This is DEBUG | main
121118135951461 | ir.apk.tm.test.LoggerSimple | WARN | This is WARN | main
When I use this query:
SELECT * FROM LogTM WHERE token(time) > token(0);
I get nothing! But as you can see, all of the time values are greater than zero!
This is the CF schema:
CREATE TABLE logtm(
time bigint PRIMARY KEY ,
level_log text ,
thread_name text ,
class_name text ,
msg text
);
Can anybody help?
Thanks :)
If you're not using an ordered partitioner (if you don't know what that means, you aren't), that query doesn't do what you think. Just because two timestamps sort one way doesn't mean that their tokens do. The token is the (Murmur3) hash of the partition key value (unless you've changed the partitioner).
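You can see this directly in cqlsh by selecting the value and its token side by side; the token order bears no relation to the value order:
-- Against the schema from the question: time values that sort one way
-- hash to tokens that can sort a completely different way.
SELECT time, token(time) FROM logtm LIMIT 5;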
If you need to do range queries, you can't do them on the partition key, only on clustering keys. One way to do it is to use a schema like this:
CREATE TABLE LogTM (
shard INT,
time BIGINT,
class_name ASCII,
level_log ASCII,
thread_name ASCII,
message TEXT,
PRIMARY KEY (shard, time, class_name, level_log, thread_name)
)
If you always set shard to zero, the schema will be roughly equivalent to what you have now, and the query SELECT * FROM LogTM WHERE shard = 0 AND time > 0 will give you the results you expect.
However, the performance will be awful. With a single value of shard only a single partition/row will be created, and you will only use a single node of your cluster (and that node will be very busy trying to compact that single row).
So you need to figure out a way to spread the load across more nodes. One way is to pick a random shard between something like 0 and 359 (or 0 and 255 if you like powers of two; the exact range isn't important, it just needs to be an order of magnitude or so larger than the number of nodes) for each insert, and read from all shards when you read back: SELECT * FROM LogTM WHERE shard IN (0,1,2,...) (you need to include all shards in the list, in place of ...).
You can also pick the shard by hashing the message; that way you don't have to worry about duplicates. A sketch of both sides is below.
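A sketch of both the write and read sides, assuming 16 shards (the exact count is up to you, as noted above):
-- Write side: the application picks shard = hash(message) % 16
-- (or a random value in [0, 16)) before inserting.
INSERT INTO LogTM (shard, time, class_name, level_log, thread_name, message)
VALUES (7, 121118135945759, 'ir.apk.tm.test.LoggerSimple', 'DEBUG', 'main', 'This is DEBUG');

-- Read side: fan out over every shard, then range-restrict on time.
SELECT * FROM LogTM
WHERE shard IN (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
  AND time > 0;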
You need to tell us more about exactly what you're trying to do, especially how you intend to query the data. Don't go and do the thing I described above; it is probably completely wrong for your use case. I just wanted to give you an example so that I could explain what is going on inside Cassandra.