Why does hinted handoff not work sometimes? - cassandra

I was testing the effect of varying some of the parameters relevant to hinted handoff.
I have 3 nodes in the datacenter with SimpleStrategy as the replication strategy. I wrote 200k entries (12 MB) to 2 nodes while the 3rd node was down. After the write succeeded, I brought the 3rd node back up.
I then left the system undisturbed for 3 minutes, after which I turned off the 2 nodes that had been up since the beginning and queried the 3rd node via cql.
The above procedure was repeated three times. The configurations were exactly the same in the 1st and 3rd iterations.
The parameters I played with were hinted_handoff_throttle_in_kb and max_hints_delivery_threads.
In the 1st and 3rd iterations, I set hinted_handoff_throttle_in_kb: 2048 and max_hints_delivery_threads: 4; in the 2nd iteration, I set hinted_handoff_throttle_in_kb: 1024 and max_hints_delivery_threads: 2.
The observations:
In the 1st iteration, node 3 contained more than 195k rows.
In the 2nd iteration, node 3 contained more than 60k rows.
In the 3rd iteration, node 3 contained 0 rows.
I'm not able to understand what makes hinted handoff work in the first 2 cases but not in the 3rd, given that the configurations in the 1st and 3rd iterations were exactly the same.
System: RHEL
Cassandra Version: 3.0.14
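One detail that matters when reading counts off the surviving node: with the other 2 nodes down and RF=3, the read has to run at consistency level ONE (cqlsh's default), otherwise the query fails instead of reporting what node 3 actually holds. A minimal sketch with the DataStax Java driver 3.x (contemporary with Cassandra 3.0.14); the contact point, keyspace, and table name are placeholders:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class CheckNode3 {
    public static void main(String[] args) {
        // Connect directly to the recovered node (address is a placeholder).
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.3").build();
             Session session = cluster.connect()) {
            // CL=ONE lets the read succeed with only node 3 alive, so the count
            // reflects exactly what hinted handoff managed to replay onto it.
            Statement stmt = new SimpleStatement(
                    "SELECT count(*) FROM my_ks.my_table") // hypothetical keyspace/table
                    .setConsistencyLevel(ConsistencyLevel.ONE);
            ResultSet rs = session.execute(stmt);
            System.out.println("rows on node 3: " + rs.one().getLong(0));
        }
    }
}
```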

Related

Adding a new node to the topology after a given time interval

I am writing an algorithm for which I want to add a new node to the topology every minute for 5 minutes. Initially the topology contains 5 nodes, so after 5 minutes it should have 10 nodes in total. How can I implement this in the simulation script? Which behaviour would be best suited to do this?
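The question doesn't name the simulator, so this is a generic illustration only: a minimal plain-Java sketch that grows a 5-node topology by one node per minute on a scheduler. The node list here is a stand-in for whatever addNode call your simulation script actually exposes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class GrowingTopology {
    public static void main(String[] args) throws InterruptedException {
        List<String> nodes = new ArrayList<>();
        for (int i = 1; i <= 5; i++) nodes.add("node-" + i); // initial 5 nodes

        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // Schedule one addition per minute, starting one minute in.
        for (int i = 1; i <= 5; i++) {
            final int id = 5 + i;
            timer.schedule(() -> {
                synchronized (nodes) {
                    nodes.add("node-" + id); // replace with the simulator's add-node call
                    System.out.println("added node-" + id + ", total=" + nodes.size());
                }
            }, i, TimeUnit.MINUTES);
        }
        timer.shutdown();                               // already-scheduled tasks still run
        timer.awaitTermination(6, TimeUnit.MINUTES);    // let all 5 additions complete
    }
}
```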

Can we modify the number of tablets/shards per table after we create a universe?

As described in this example, each tserver started with 12 tablets because we set the number of shards to 4.
When we added a new node, the number of tablets per tserver became 9; it seems the total number of tablets, which is 36, does not increase.
My questions are:
How many nodes could we add while we have 36 total tablets (in this example)?
And is it possible to increase the shard count in a running universe to be able to add more nodes?
How many nodes could we add while we have 36 total tablets (in this example)?
In this example, you can expand to 12 nodes (each node would end up with 1 leader and 2 followers).
Reasoning: there are 36 total tablets for this table and the replication factor is 3, so there will be 12 tablet leaders and 24 tablet followers. Leaders are responsible for handling writes and reads (unless you're doing follower reads; let's assume that is not the case). If you go to 12 nodes, each node would have at least one leader and be doing some work.
Ideally, you should create enough tablets upfront so that you end up with 4 tablets per node eventually.
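Restating that arithmetic as a sketch (the numbers are just the ones from this example):

```java
public class TabletMath {
    public static void main(String[] args) {
        int totalTablets = 36;      // total tablet replicas for the table in this example
        int replicationFactor = 3;

        int leaders   = totalTablets / replicationFactor; // 12 tablet leaders
        int followers = totalTablets - leaders;           // 24 tablet followers
        // Every node should host at least one leader, so the ceiling is:
        int maxNodes = leaders;                           // 12 nodes
        int tabletsPerNode = totalTablets / maxNodes;     // 3 (1 leader + 2 followers)

        System.out.printf("leaders=%d followers=%d maxNodes=%d tabletsPerNode=%d%n",
                leaders, followers, maxNodes, tabletsPerNode);
    }
}
```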
And is it possible to increase the shard count in a running universe to be able to add more nodes?
This is currently not possible, but it is being worked on and is close to finished; it is expected to be released in Q1 2020. If you are interested in this feature, please subscribe to this GitHub issue for updates.
Until that is ready, as a workaround, you can create the table pre-split into a sufficient number of tablets.

How to avoid selecting too much data

What we are doing is pretty much like:
putting time series data into Cassandra
running a Spark aggregation job every hour and putting the aggregated data back into Cassandra
One problem we found is that if the hourly job fails repeatedly, for example for 1 AM–2 AM, 2 AM–3 AM, and 3 AM–4 AM (or more), then the next run aggregates the data from 1 AM to 5 AM (the last success time is recorded in Cassandra). The issue comes at that hour, because the run now covers 4 (or more) hours of data, far more than one hour's worth, and selecting that much data from Cassandra into a dataframe results in an OutOfMemory exception.
Adding memory to the Spark executors is one way to fix this. However, since it's an edge case, I'm wondering if there's any mature pattern or architecture for dealing with this issue.
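One common pattern (a hedged sketch, not a drop-in fix) is to cap each run at a fixed window and loop hour by hour from the recorded last-success time, checkpointing after every window, so a multi-hour backlog is processed as several one-hour reads instead of one giant one. Sketch in Java with the spark-cassandra-connector; the keyspace, table, column name, and checkpoint helpers are hypothetical:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class HourlyBackfill {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("hourly-backfill").getOrCreate();

        long lastSuccessMillis = readCheckpoint(); // last success time, recorded in Cassandra
        long nowMillis = System.currentTimeMillis();
        long hourMillis = 3600_000L;

        // Process one hour at a time, checkpointing after each window, so a
        // 4-hour (or 40-hour) backlog never has to fit in memory at once.
        for (long start = lastSuccessMillis; start + hourMillis <= nowMillis; start += hourMillis) {
            long end = start + hourMillis;
            Dataset<Row> hour = spark.read()
                    .format("org.apache.spark.sql.cassandra")
                    .option("keyspace", "metrics")   // hypothetical keyspace
                    .option("table", "raw_events")   // hypothetical table
                    .load()
                    .filter(col("event_time").geq(start).and(col("event_time").lt(end)));

            aggregateAndWriteBack(hour);  // the existing hourly aggregation
            writeCheckpoint(end);         // advance the recorded last-success time
        }
    }

    // Stubs standing in for the checkpoint bookkeeping the question describes.
    static long readCheckpoint() { return 0L; }
    static void writeCheckpoint(long t) {}
    static void aggregateAndWriteBack(Dataset<Row> df) {}
}
```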

Hazelcast list, partition data loss

When I have a 2-node cluster, list.add(xxx) adds data by splitting it across the two nodes.
E.g.: Node-1 carries 3 instances,
Node-2 carries 3 instances,
for a total of 6.
When I perform an abrupt shutdown of one of the nodes, the data/instances on that node are lost and only 3 remain.
Is this the expected behaviour, or can we tune it so the other 3 survive as well?
Any help would be appreciated!
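One thing worth checking is the list's backup configuration: Hazelcast keeps a collection's entries on a partition owned by one member, and whether another member holds a synchronous backup copy is controlled by the collection's backup-count. A minimal sketch (the list name is hypothetical) that configures one synchronous backup so a single abrupt shutdown shouldn't lose entries:

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class BackedUpList {
    public static void main(String[] args) {
        Config config = new Config();
        // Keep one synchronous backup of the list on another member, so an
        // abrupt shutdown of a single node does not lose its entries.
        config.getListConfig("my-list")   // hypothetical list name
              .setBackupCount(1)
              .setAsyncBackupCount(0);

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        hz.getList("my-list").add("xxx");
    }
}
```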

Write Request metric

I'm currently using a 1-node cluster with DataStax OpsCenter 5.2.1 (Cassandra 2.2.3) installed on Windows.
Not much data is being sent to the cluster, and here is the graph (last 20 minutes) of write requests that I can see in OpsCenter. The graph looks normal and expected to me:
write_requests(20min)
However, when I switched the date range to the last 1 hour, it turns out there were many more write requests (according to the cluster(max) line):
write_requests(1h)
I'm confused; could someone clarify what cluster(max) means in my case? Why are these values so big in comparison with cluster(total) or cluster(min)?
The first graph (20 minutes) uses an average. The 1-hour graph has 3 lines: min per sample, average, and max per sample.
What you're likely seeing is that something (perhaps OpsCenter itself) is doing a flood of writes, about 700/second for a few seconds; on the 20-minute graph that gets averaged out, but with the min/max lines you'll see the outliers.
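To make the averaging effect concrete, a tiny back-of-the-envelope sketch (the 3-second burst length is an assumption, just to show the scale):

```java
public class SpikeAveraging {
    public static void main(String[] args) {
        // A hypothetical 3-second burst of 700 writes/second inside a
        // 20-minute window, with ~0 writes/second the rest of the time.
        double burstSeconds = 3, burstRate = 700, windowSeconds = 20 * 60;

        double average = (burstSeconds * burstRate) / windowSeconds; // ~1.75 writes/s
        double maxPerSample = burstRate;                             // 700 writes/s

        System.out.printf("average=%.2f/s, max=%.0f/s%n", average, maxPerSample);
        // The 20-minute graph plots only the average, so the burst vanishes;
        // the 1-hour graph's cluster(max) line keeps the 700/s outlier visible.
    }
}
```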
