Cassandra StatefulSet in Kubernetes

I've been trying to set up a redundant StatefulSet in Kubernetes with the Google Cassandra image, as described in the Kubernetes 1.7 documentation.
According to the image used, it's a StatefulSet with a consistency level of ONE.
In my testing example I'm using SimpleStrategy replication with a replication factor of 3, as I have set up 3 replicas in the StatefulSet in one datacenter only.
I've defined cassandra-0, cassandra-1 and cassandra-2 as seeds, so all are seeds.
I've created a keyspace and a table:
"create keyspace if not exists testing with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }"
"create table testing.test (id uuid primary key, name text, age int, properties map<text,text>, nickames set<text>, goals_year map<int,int>, current_wages float, clubs_season tuple<text,int>);"
I am testing by inserting data from another, unrelated pod using the cqlsh binary, and I can see that the data ends up in every container, so replication is successful.
nodetool status on all pods comes up with:
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.16.0.161 71.04 KiB 32 100.0% 4ad4e1d3-f984-4f0c-a349-2008a40b7f0a Rack1-K8Demo
UN 10.16.0.162 71.05 KiB 32 100.0% fffca143-7ee8-4749-925d-7619f5ca0e79 Rack1-K8Demo
UN 10.16.2.24 71.03 KiB 32 100.0% 975a5394-45e4-4234-9a97-89c3b39baf3d Rack1-K8Demo
...and all Cassandra pods have the same data in the table created earlier:
id | age | clubs_season | current_wages | goals_year | name | nickames | properties
--------------------------------------+-----+--------------+---------------+------------+----------+----------+--------------------------------------------------
b6d6f230-c0f5-11e7-98e0-e9450c2870ca | 26 | null | null | null | jonathan | null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
5fd02b70-c0f8-11e7-8e29-3f611e0d5e94 | 26 | null | null | null | jonathan | null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
5da86970-c0f8-11e7-8e29-3f611e0d5e94 | 26 | null | null | null | jonathan | null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
But then I delete one of those DB replica pods (cassandra-0). A new pod springs up again as expected, a new cassandra-0 (thanks, Kubernetes!), and I now see that all the pods have lost one of those 3 rows:
id | age | clubs_season | current_wages | goals_year | name | nickames | properties
--------------------------------------+-----+--------------+---------------+------------+----------+----------+--------------------------------------------------
5fd02b70-c0f8-11e7-8e29-3f611e0d5e94 | 26 | null | null | null | jonathan | null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
5da86970-c0f8-11e7-8e29-3f611e0d5e94 | 26 | null | null | null | jonathan | null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
...and nodetool status now comes up with:
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.16.0.161 71.04 KiB 32 81.7% 4ad4e1d3-f984-4f0c-a349-2008a40b7f0a Rack1-K8Demo
UN 10.16.0.162 71.05 KiB 32 78.4% fffca143-7ee8-4749-925d-7619f5ca0e79 Rack1-K8Demo
DN 10.16.2.24 71.03 KiB 32 70.0% 975a5394-45e4-4234-9a97-89c3b39baf3d Rack1-K8Demo
UN 10.16.2.28 85.49 KiB 32 69.9% 3fbed771-b539-4a44-99ec-d27c3d590f18 Rack1-K8Demo
... shouldn't the Cassandra ring replicate all the data into the newly created pod, so that all Cassandra pods still have the 3 rows?
... this exercise is documented on GitHub.
... has anyone tried this exercise? What might be wrong in this testing context?
Many thanks in advance.

I think that after bringing down the node, you need to inform the other peers in the cluster that the node is dead and needs replacing.
I would recommend some reading in order to put together a correct test case.
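As a sketch only (not from the original answer): since the old node still shows as DN with its host ID in the nodetool output above, one common approach is to remove it from any live node and then repair the keyspace on the replacement pod, e.g. via kubectl exec:
kubectl exec cassandra-1 -- nodetool removenode 975a5394-45e4-4234-9a97-89c3b39baf3d
kubectl exec cassandra-0 -- nodetool repair testing
kubectl exec cassandra-1 -- nodetool status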

Related

Cassandra how to see active user connections

In Cassandra (I am using DSE),
how do I check how many users are connected to the database? Is there any way to check node-wise?
Is there any auditing info stored which will tell me which users connected, along with info such as IP address, driver used, etc.?
In OpsCenter there is a metric called "Native clients"; where is this info stored in the database so it can be queried? Does this include internal communication between the nodes, backups, etc.?
How do I check how many users are connected to the database? Is there any way to check node-wise?
Is there any auditing info stored which will tell me which users connected, along with info such as IP address, driver used, etc.?
DSE has a performance service feature which you can enable to make this information available via CQL. To enable this particular capability, configure the following in dse.yaml as described in the docs:
user_level_latency_tracking_options:
    enabled: true
With this enabled, you can now query a variety of tables, for example:
cqlsh> select * from dse_perf.user_io;
node_ip | conn_id | last_activity | read_latency | total_reads | total_writes | user_ip | username | write_latency
-----------+-----------------+---------------------------------+--------------+-------------+--------------+-----------+-----------+---------------
127.0.0.1 | 127.0.0.1:55116 | 2019-01-14 14:08:19.399000+0000 | 1000 | 1 | 0 | 127.0.0.1 | anonymous | 0
127.0.0.1 | 127.0.0.1:55252 | 2019-01-14 14:07:39.399000+0000 | 0 | 0 | 1 | 127.0.0.1 | anonymous | 1000
(2 rows)
cqlsh> select * from dse_perf.user_object_io;
node_ip | conn_id | keyspace_name | table_name | last_activity | read_latency | read_quantiles | total_reads | total_writes | user_ip | username | write_latency | write_quantiles
-----------+-----------------+---------------+------------+---------------------------------+--------------+----------------+-------------+--------------+-----------+-----------+---------------+-----------------
127.0.0.1 | 127.0.0.1:55252 | s | t | 2019-01-14 14:07:39.393000+0000 | 0 | null | 0 | 1 | 127.0.0.1 | anonymous | 1000 | null
127.0.0.1 | 127.0.0.1:55116 | s | t | 2019-01-14 14:08:19.393000+0000 | 1000 | null | 1 | 0 | 127.0.0.1 | anonymous | 0 | null
Note that there is a cost to enabling the performance service, and it can be enabled and disabled selectively using dsetool perf userlatencytracking [enable|disable].
In a future release of Apache Cassandra (4.0+) and DSE (likely 7.0+), there will be a nodetool clientstats command (CASSANDRA-14275), and a corresponding system_views.clients table (CASSANDRA-14458) that includes connection info. This will include the driver name, if the driver client provides one (newer ones do).
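Once you are on a version that ships those features, the calls would look roughly like this (names taken from the tickets above; output omitted):
nodetool clientstats
cqlsh> select * from system_views.clients;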
In OpsCenter there is a metric called "Native clients"; where is this info stored in the database so it can be queried? Does this include internal communication between the nodes, backups, etc.?
I'm not too up to speed on OpsCenter. From what I know, OpsCenter typically stores its data in the OpsCenter keyspace; you can configure the data collection parameters by following this doc.

Counting partition size in cassandra

I'm implementing a unique entry counter in Cassandra. The counter may be represented simply as a set of tuples:
counter_id = broadcast:12345, token = user:123
counter_id = broadcast:12345, token = user:321
where the value of counter broadcast:12345 can be computed as the size of the corresponding set of entries. Such a counter can be stored effectively as a table with counter_id as the partition key. My first thought was that, since a single counter value is basically the size of a partition, I could do a count(1) WHERE counter_id = ? query, which wouldn't need to read the data and would be super fast. However, I see the following trace output:
cqlsh > select count(1) from token_counter_storage where id = '1';
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------------+----------------------------+------------+----------------
Execute CQL3 query | 2016-06-10 11:22:42.809000 | 172.17.0.2 | 0
Parsing select count(1) from token_counter_storage where id = '1'; [SharedPool-Worker-1] | 2016-06-10 11:22:42.809000 | 172.17.0.2 | 260
Preparing statement [SharedPool-Worker-1] | 2016-06-10 11:22:42.810000 | 172.17.0.2 | 565
Executing single-partition query on token_counter_storage [SharedPool-Worker-2] | 2016-06-10 11:22:42.810000 | 172.17.0.2 | 1256
Acquiring sstable references [SharedPool-Worker-2] | 2016-06-10 11:22:42.810000 | 172.17.0.2 | 1350
Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2016-06-10 11:22:42.810000 | 172.17.0.2 | 1465
Merging data from memtables and 0 sstables [SharedPool-Worker-2] | 2016-06-10 11:22:42.810000 | 172.17.0.2 | 1546
Read 10 live and 0 tombstone cells [SharedPool-Worker-2] | 2016-06-10 11:22:42.811000 | 172.17.0.2 | 1826
Request complete | 2016-06-10 11:22:42.811410 | 172.17.0.2 | 2410
I guess this trace confirms that the data is being read from disk. Am I right in that conclusion, and if so, is there any way to simply fetch the partition size from an index, without any excessive disk hits?
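For reference, a minimal sketch of the table layout described above (the token column name is an assumption; the table and id names come from the traced query):
CREATE TABLE token_counter_storage (
    id text,
    token text,
    PRIMARY KEY (id, token)
);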

How to get tombstone count for a cql query?

I am trying to evaluate the number of tombstones getting created in one of the tables in our application. For that I am trying to use nodetool cfstats. Here is how I am doing it:
create table demo.test(a int, b int, c int, primary key (a));
insert into demo.test(a, b, c) values(1,2,3);
Now I make the same insert as above again, so I expect 3 tombstones to be created. But on running cfstats for this column family, I still see that no tombstones have been created.
nodetool cfstats demo.test
Average live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Now I tried deleting the record, but I still don't see any tombstones getting created. Is there anything that I am missing here? Please suggest.
BTW, a few other details:
* We are using version 2.1.1 of the Java driver
* We are running against Cassandra 2.1.0
For tombstone counts on a query, your best bet is to enable tracing. This will give you the in-depth history of a query, including how many tombstones had to be read to complete it. This won't give you the total tombstone count, but it is most likely more relevant for performance tuning.
In cqlsh you can enable this with:
cqlsh> tracing on;
Now tracing requests.
cqlsh> SELECT * FROM ascii_ks.ascii_cs where pkey = 'One';
pkey | ckey1 | data1
------+-------+-------
One | One | One
(1 rows)
Tracing session: 2569d580-719b-11e4-9dd6-557d7f833b69
activity | timestamp | source | source_elapsed
--------------------------------------------------------------------------+--------------+-----------+----------------
execute_cql3_query | 08:26:28,953 | 127.0.0.1 | 0
Parsing SELECT * FROM ascii_ks.ascii_cs where pkey = 'One' LIMIT 10000; | 08:26:28,956 | 127.0.0.1 | 2635
Preparing statement | 08:26:28,960 | 127.0.0.1 | 6951
Executing single-partition query on ascii_cs | 08:26:28,962 | 127.0.0.1 | 9097
Acquiring sstable references | 08:26:28,963 | 127.0.0.1 | 10576
Merging memtable contents | 08:26:28,963 | 127.0.0.1 | 10618
Merging data from sstable 1 | 08:26:28,965 | 127.0.0.1 | 12146
Key cache hit for sstable 1 | 08:26:28,965 | 127.0.0.1 | 12257
Collating all results | 08:26:28,965 | 127.0.0.1 | 12402
Request complete | 08:26:28,965 | 127.0.0.1 | 12638
http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
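As a further sketch (not from the original answer), deleting an individual column and then tracing a read of the same row should surface the tombstone in the "Read N live and M tombstone cells" line, using the asker's own table:
cqlsh> tracing on;
cqlsh> delete c from demo.test where a = 1;
cqlsh> select * from demo.test where a = 1;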

Cassandra CQL: different SELECT results

I am using the latest Cassandra 2.1.0 and get different results for the following queries.
select * from zzz.contact where user_id = 53528c87-0691-46f7-81a1-77173fd8390f
and contact_id = 5ea82764-ce42-45f3-8724-e121c8b7d32e;
returns the one desired record, but
select * from zzz.contact where user_id = 53528c87-0691-46f7-81a1-77173fd8390f;
returns 6 other rows, but not the row that is returned by the first SELECT.
The structure of the keyspace/table is:
CREATE KEYSPACE zzz
WITH replication = { 'class' : 'NetworkTopologyStrategy', 'DC1' : '2' };
CREATE TABLE IF NOT EXISTS contact (
user_id uuid,
contact_id uuid,
approved boolean,
ignored boolean,
adding_initiator boolean,
PRIMARY KEY ( user_id, contact_id )
);
Both instances are in the keyspace's datacenter and show as UN:
d:\Tools\apache-cassandra-2.1.0\bin>nodetool status
Starting NodeTool
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.0.146 135.83 KB 256 51.7% 6d035991-3471-498b-8051-55f99a2fdfed RAC1
UN 192.168.0.216 3.26 MB 256 48.3% d82f3a69-c6f8-4237-b50e-d2f370ac644a RAC1
I have two Cassandra instances.
Tried command "nodetool repair" - didn't help.
Tried to add ALLOW FILTERING in the end of the queries - didn't help.
Any help is highly appreciated.
UPD:
Here is the result of the queries:
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
d:\Tools\apache-cassandra-2.1.0\bin>cqlsh 192.168.0.216
Connected to ClusterZzz at 192.168.0.216:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> select * from zzz.contact where user_id = 53528c87-0691-46f7-81a1-77173fd8390f and contact_id = 5ea82764-ce42-45f3-8724-e121c8b7d32e;
user_id | contact_id | adding_initiator | approved | ignored
--------------------------------------+--------------------------------------+------------------+----------+---------
53528c87-0691-46f7-81a1-77173fd8390f | 5ea82764-ce42-45f3-8724-e121c8b7d32e | False | True | False
(1 rows)
cqlsh> select * from zzz.contact where user_id = 53528c87-0691-46f7-81a1-77173fd8390f;
user_id | contact_id | adding_initiator | approved | ignored
--------------------------------------+--------------------------------------+------------------+----------+---------
53528c87-0691-46f7-81a1-77173fd8390f | 6fc7f6e4-ac48-484e-9660-128476ca5bf9 | False | False | False
53528c87-0691-46f7-81a1-77173fd8390f | 7a240937-8b28-4424-9772-8c4c8e381432 | False | False | False
53528c87-0691-46f7-81a1-77173fd8390f | 8e6cb13a-96e7-45af-b9d8-40ea459df996 | False | False | False
53528c87-0691-46f7-81a1-77173fd8390f | 938af09a-0fe3-4cdd-b02e-cbdfb078335c | False | True | False
53528c87-0691-46f7-81a1-77173fd8390f | d84d9e7a-e81d-42a2-87b3-f163f7a9a646 | False | True | False
53528c87-0691-46f7-81a1-77173fd8390f | fd2ec705-1661-4cf8-98ef-46f627a9a382 | False | False | False
(6 rows)
cqlsh>
UPD #2:
It's worth mentioning that my nodes are on Windows 7 machines. In production we use Linux, and there we have no problems like the one I'm seeing with the Windows nodes.
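Not part of the original post, but one way to check whether the two replicas actually disagree is to rerun both queries at a higher read consistency in cqlsh, which forces both nodes to be consulted for this RF=2 keyspace:
cqlsh> consistency all;
cqlsh> select * from zzz.contact where user_id = 53528c87-0691-46f7-81a1-77173fd8390f;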

Is it possible to tell cassandra to run a query only on the local node

I've got two nodes that are fully replicated. When I run a query on a table that contains 30 rows, the cqlsh trace seems to indicate it is fetching some rows from one server and some rows from the other server.
So even though all the rows are available on both nodes, the query takes 250ms+, rather than ~1ms like other queries.
I've already got the consistency level set to "one" at the protocol level. What else do you have to do to make it use only one node for the query?
select * from organisation:
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------------+--------------+--------------+----------------
execute_cql3_query | 04:21:03,641 | 10.1.0.84 | 0
Parsing select * from organisation LIMIT 10000; | 04:21:03,641 | 10.1.0.84 | 68
Preparing statement | 04:21:03,641 | 10.1.0.84 | 174
Determining replicas to query | 04:21:03,642 | 10.1.0.84 | 307
Enqueuing request to /10.1.0.85 | 04:21:03,642 | 10.1.0.84 | 1034
Sending message to /10.1.0.85 | 04:21:03,643 | 10.1.0.84 | 1402
Message received from /10.1.0.84 | 04:21:03,644 | 10.1.0.85 | 47
Executing seq scan across 0 sstables for [min(-9223372036854775808), min(-9223372036854775808)] | 04:21:03,644 | 10.1.0.85 | 461
Read 1 live and 0 tombstoned cells | 04:21:03,644 | 10.1.0.85 | 560
Read 1 live and 0 tombstoned cells | 04:21:03,644 | 10.1.0.85 | 611
………..etc….....
It turns out that there was a bug in Cassandra versions 2.0.5-2.0.9 that made Cassandra more likely to request data from two nodes when it only needed to talk to one.
Upgrading to 2.0.10 or greater resolves this problem.
Refer: CASSANDRA-7535
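A quick way to confirm which version a node is actually running before and after the upgrade (not from the original answer):
nodetool version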
