synchronous replication in hazelcast - hazelcast

We are evaluating Hazelcast for one of our use cases, and I have a doubt regarding replication in Hazelcast.
It is mentioned in http://docs.hazelcast.org/docs/latest-development/manual/html/Distributed_Data_Structures/Map/Backing_Up_Maps.html that "Backup operations are synchronous, so when a map.put(key, value) returns, it is guaranteed that the map entry is replicated to one other member".
But in another page http://docs.hazelcast.org/docs/latest-development/manual/html/Consistency_and_Replication_Model.html, it is mentioned "Two types of backup replication are available: sync and async. Despite what their names imply, both types are still implementations of the lazy (async) replication model".
These two statements look contradictory. Can someone please shed some light on this?
Is replication in Hazelcast truly synchronous? I need to have the values updated in both the owner and backup nodes together.

The explanation on the second page (Consistency and Replication Model) is the more accurate one. In terms of the CAP theorem, Hazelcast is an AP product, so replication aims for best-effort consistency, and both sync and async backups are implementations of the lazy replication model. As that page explains, the difference between the two options is:
with sync backups, the caller blocks until the backup replicas have applied the update and sent acknowledgments back to the caller;
async backups work as fire and forget.
Below, please see the part from Hazelcast Reference Manual:
Hazelcast's replication technique enables Hazelcast clusters to offer high throughput. However, due to temporary situations in the system, such as network interruption, backup replicas can miss some updates and diverge from the primary. Backup replicas can also hit long GC pauses or VM pauses, and fall behind the primary, which is a situation called as replication lag. If a Hazelcast partition primary replica member crashes while there is a replication lag between itself and the backups, strong consistency of the data can be lost.
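To make the caller-visible difference concrete, here is a minimal sketch in plain Python. It is a toy model, not Hazelcast code: the `Primary` and `Backup` classes and the `inbox` queue are hypothetical names invented for illustration, and the model ignores timeouts, lost acknowledgments, and the divergence cases described above.

```python
from collections import deque


class Backup:
    """Toy backup replica that applies updates from an inbox queue."""

    def __init__(self):
        self.data = {}
        self.inbox = deque()

    def apply_pending(self):
        # Drain every queued update into the replica's local state.
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value


class Primary:
    """Toy primary replica; `sync` selects the backup mode."""

    def __init__(self, backup, sync):
        self.data = {}
        self.backup = backup
        self.sync = sync

    def put(self, key, value):
        self.data[key] = value
        self.backup.inbox.append((key, value))
        if self.sync:
            # Sync backup: the caller blocks until the backup has applied
            # the update (standing in for "applied and acknowledged").
            self.backup.apply_pending()
        # Async backup: fire and forget; put() returns while the update
        # is still queued, so the backup lags behind the primary.


sync_backup = Backup()
Primary(sync_backup, sync=True).put("k", "v")

async_backup = Backup()
Primary(async_backup, sync=False).put("k", "v")

print(sync_backup.data)   # {'k': 'v'}
print(async_backup.data)  # {} until apply_pending() runs
```

If the primary in the async case crashed before the backup drained its inbox, the backup would never see the update, which is the replication-lag scenario from the manual.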

Related

Cassandra availability penalty in strong consistency mode

As I understand it, Cassandra has an ALL consistency level, which provides "the highest consistency and the lowest availability". Does this level provide strong consistency?
What is the availability penalty for it? I don't see a case in which data would not be available. Could anyone give an example of such a case?
If you use a consistency level of ALL then the coordinator must receive a response from all nodes. This means that:
After a successful write, nobody will read the previous state (high consistency).
If even a single node fails to respond, the whole read/write operation will fail (low availability).
For further reading, see the CAP theorem.
Could anyone give an example of such a case?
A node is disconnected for maintenance.
A node crashes.
The power goes out in the server room / datacentre.
A node becomes unresponsive due to high load.
The network connection to a node goes down or becomes too slow.
Data has not yet propagated to all nodes.
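As a rough illustration of why any one of these cases breaks an ALL request, here is a small sketch in plain Python. It only models the counting rule described above; `coordinator_result` and the node names are hypothetical and are not part of any Cassandra API.

```python
def coordinator_result(replicas, responding, acks_needed):
    """Return 'ok' if enough replicas answered, else 'unavailable'."""
    answered = [node for node in replicas if node in responding]
    return "ok" if len(answered) >= acks_needed else "unavailable"


replicas = ["n1", "n2", "n3"]   # replication factor 3
responding = {"n1", "n3"}       # n2 crashed, is overloaded, or is unreachable

# Consistency level ALL: every replica must answer.
print(coordinator_result(replicas, responding, acks_needed=len(replicas)))  # unavailable

# Consistency level ONE: a single answer is enough.
print(coordinator_result(replicas, responding, acks_needed=1))  # ok
```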

Can a Cassandra cluster serve as a replacement for an in-memory Redis key-value store?

My application crawls a user's mailbox and saves it to an RDBMS database. I started using Redis as a cache (a simple key-value store) for the RDBMS database, but gradually I started storing crawler states and other data in Redis that need to be persistent. Losing this data means a few hours of downtime. I must ensure airtight consistency for this data. The data should not be lost in node failures or split-brain scenarios. Strong consistency is a must. Sharding is done by my application. One Redis process runs on each of ten EC2 m4.large instances. On each of these instances I am doing up to 20K IOPS to Redis. I am doing more writes than reads, though I have not determined the actual percentage of each. All my data is completely in memory, not backed by disk.
My only problem is that each of these instances is a SPOF. I cannot use Redis Cluster as it does not guarantee consistency. I have evaluated a few more tools such as Aerospike; none gives a 'no data loss' guarantee.
Cassandra looks promising because I can tune the consistency level I want. I plan to use Cassandra with a replication factor of 2, and a write must be written to both replicas before it is considered committed. This gives a 'no data loss' guarantee.
By launching enough Cassandra nodes (SSD-backed), can I replace my Redis key-value store and still get similar read/write IOPS and latency? Will open-source Cassandra suffice for my use case? If not, will the DataStax Enterprise In-Memory version solve it?
EDIT 1:
A bit of clarification:
I think I need to use write consistency level ALL and read consistency level ONE. I understand that with these consistency levels my cluster will not tolerate any failure. That is OK for me. A few minutes of downtime occasionally is not a problem, as long as my data is consistent. In my present setup, one Redis instance failure causes a few hours of downtime.
I must ensure airtight consistency for this data.
Cassandra deals with failure better when there are more nodes. Assuming your case allows for having more nodes, this is my suggestion.
So, if you have 5 nodes, use a CL of QUORUM for both reads and writes. This means that every write must be acknowledged by at least 3 replicas and every read consults 3 replicas. (QUORUM is computed from the replication factor; with a replication factor of 5, QUORUM is 3.)
This ensures a very high level of consistency.
It also limits downtime: even if a node is down, your reads and writes won't break.
If you use CL ALL, then even if one node is down or overloaded, you will have to take a full downtime.
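The arithmetic behind this suggestion can be sketched in a few lines of Python. The helper names are hypothetical; the formulas are the usual ones (a quorum is a strict majority of the replicas, and reads see the latest write when read replicas plus write replicas exceed the replication factor).

```python
def quorum(replication_factor):
    """Quorum size: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(write_acks, read_acks, replication_factor):
    """True when every read set overlaps every write set (R + W > RF)."""
    return write_acks + read_acks > replication_factor


def tolerated_failures(acks_needed, replication_factor):
    """Replicas that may be down while the request still succeeds."""
    return replication_factor - acks_needed


rf = 5
q = quorum(rf)
print(q)                                 # 3
print(reads_see_latest_write(q, q, rf))  # True
print(tolerated_failures(q, rf))         # 2
print(tolerated_failures(rf, rf))        # 0  (consistency level ALL)
```

Write ALL with read ONE at a replication factor of 2 also satisfies the overlap rule (2 + 1 > 2), but it tolerates zero replica failures on writes, which is the trade-off described above.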
I hope it helps!

When does a Cassandra node fail?

How does Cassandra guarantee that no node fails at any given point in time? I know data is replicated, so there might not be any issue with losing data.
Cassandra nodes can fail for a lot of reasons, such as very heavy write load, out-of-memory errors, hardware failure, the 100k tombstone limit error, compaction failures, network errors, and so on.
Cassandra cannot guarantee that a node never fails because, like any other software, it is vulnerable to failures of the components and hardware it depends on.
What it does guarantee is that you won't lose data as long as the minimum number of required nodes, based on the replication factor, is up and running.
Like any other system, Cassandra cannot guarantee that nodes never fail. But with a correctly set up cluster, with enough nodes and replicas configured, the cluster as a whole stays available and no data is lost even when some nodes are down. This can be transparent to clients, which will not notice the failure.

Cassandra consistency model performance evaluation

Hi, I am a student trying to evaluate the latency (insert, read, and upsert) of Cassandra for different consistency levels and different replication factors.
I am using VirtualBox on my host system, with 10 Ubuntu VMs forming a cluster.
When I run the tests, the average latency sometimes comes out lower for a stronger consistency level.
Also, in some cases the latency does not increase as I increase the replication factor, which is not the expected result either.
What could be the possible reasons for such behavior?
There are a few things:
Performance benchmarks run on VirtualBox on a single system will give you very different results from a live cluster. For instance, network latencies are considerably lower. In a real cluster each node has its own resources, whereas the VirtualBox instances share the resources of one host. Even on a cloud platform, you'd see different numbers.
When a write request comes in, the coordinator sends a write request to all required replicas in parallel. They all process the write and respond. If your lower-consistency write went to a busy node, and the higher-consistency write went to enough "faster / available" nodes to make a quorum, then the latter will have lower latency. Also, increasing the replication factor means the data is available on more nodes, so reads can be faster (depending on the consistency level).
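A small sketch of the second point, in plain Python with made-up latency numbers: if the coordinator waits for the fastest replies, the latency the client sees is the k-th smallest replica latency. The function name and the sample values are hypothetical and only illustrate the ordering effect.

```python
def request_latency(replica_latencies_ms, acks_needed):
    """Latency seen by the client when the coordinator contacts all
    replicas in parallel and returns after `acks_needed` replies."""
    return sorted(replica_latencies_ms)[acks_needed - 1]


quiet = [2.0, 35.0, 1.5]   # one slow replica, two fast ones
busy = [30.0, 28.0, 33.0]  # every replica is slow (shared host under load)

print(request_latency(quiet, 1))  # 1.5   ONE
print(request_latency(quiet, 2))  # 2.0   QUORUM with replication factor 3
print(request_latency(quiet, 3))  # 35.0  ALL
print(request_latency(busy, 1))   # 28.0  ONE during a busy moment
```

For a single request the stronger level is never faster, but a ONE request issued during a busy moment (28.0 ms here) can easily be slower than a QUORUM request issued during a quiet one (2.0 ms), so averages over short runs on a shared host can come out in the "wrong" order.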

Configuring Apache Cassandra for Disaster Recovery

How do you configure Apache Cassandra to allow for disaster recovery, to allow for one of two data-centres to fail?
The DataStax documentation talks about using a replication strategy that ensures at least one replication is written to each of your two data-centres. But I don't see how that helps once the disaster has actually happened. If you switch to the remaining data-centre, all your writes will fail because those writes will not be able to replicate to the other data-centre.
I guess you would want your software to operate in two modes: normal mode, for which writes must replicate across both data-centres, and disaster mode, for which they need not. But changing replication strategy does not seem possible.
What I really want is two data-centres that are over provisioned, and during normal operations use the resources of both data-centres, but use the resources of only the one remaining data-centre (with reduced performance) when only one data-centre is functioning.
The trick is to vary the consistency setting given through the API for writes, instead of varying the replication factor. Use the LOCAL_QUORUM setting for writes during a disaster, when only one data-centre is available. During normal operation use EACH_QUORUM to ensure both data-centres have a copy of the data. Reads can use LOCAL_QUORUM all the time.
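As a sketch of that mode switch, assuming the application already has some way of knowing whether the other data-centre is reachable (the `remote_dc_available` flag and the function name are hypothetical), the choice could look like this in Python:

```python
def write_consistency(remote_dc_available):
    """Pick the write consistency level for the current operating mode."""
    # Normal mode: require a quorum in every data-centre.
    # Disaster mode: require a quorum only in the local data-centre.
    return "EACH_QUORUM" if remote_dc_available else "LOCAL_QUORUM"


READ_CONSISTENCY = "LOCAL_QUORUM"  # reads stay local in both modes

print(write_consistency(remote_dc_available=True))   # EACH_QUORUM
print(write_consistency(remote_dc_available=False))  # LOCAL_QUORUM
```

The returned name would then be passed to whatever client driver is in use when issuing each write; the replication settings of the keyspace stay unchanged.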
Here is a summary of the DataStax documentation for multiple data centers and the older but still conceptually relevant disaster recovery documentation (0.7).
Make a recipe to suit your needs with the two consistency levels LOCAL_QUORUM and EACH_QUORUM.
Here, “local” means local to a single data center, while “each” means consistency is strictly maintained at the same level in each data center.
Suppose you have 2 datacenters, one used strictly for disaster recovery then you could set the replication factor to...
3 for the primary write/read center, and two for the failover data center
Now depending how critical it is that your data is actually written to the disaster recovery nodes, you can either use EACH_QUORUM or LOCAL_QUORUM. Assuming you are using a replication placement strategy NetworkTopologyStrategy (NTS),
With LOCAL_QUORUM, a write blocks the client only until a quorum of replicas in the local data center (DC1) has acknowledged it; replication to the recovery node(s) in DC2 happens asynchronously.
EACH_QUORUM will ensure that all data is replicated but will delay writes until both DCs confirm successful operations.
For reads it's likely best to just use LOCAL_QUORUM to avoid inter-data center latency.
There are catches to this approach! If you choose to use EACH_QUORUM on your writes you increase the potential failure points (DC2 is down, DC1-DC2 link is down, DC1 quorum can't be met).
The bonus is once your DC1 goes down, you have a valid DC2 disaster recovery. Also note in the 2nd link it talks about custom snitch settings for routing your IPs properly.
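To see how the failure points play out with the replication factors suggested above (3 in DC1, 2 in DC2), here is a rough Python sketch. It only models the counting rule of a per-data-center quorum; `write_succeeds` and its parameters are hypothetical names, not a driver API.

```python
def quorum(replication_factor):
    """Quorum size: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def write_succeeds(level, rf_by_dc, live_by_dc, local_dc):
    """Check whether enough replicas are reachable for the given level.

    rf_by_dc:   replication factor per data center, e.g. {"DC1": 3, "DC2": 2}
    live_by_dc: reachable replicas per data center
    """
    if level == "LOCAL_QUORUM":
        dcs = [local_dc]
    elif level == "EACH_QUORUM":
        dcs = list(rf_by_dc)
    else:
        raise ValueError(f"unsupported level: {level}")
    return all(live_by_dc.get(dc, 0) >= quorum(rf_by_dc[dc]) for dc in dcs)


rf = {"DC1": 3, "DC2": 2}

# DC2 unreachable (DC2 is down or the DC1-DC2 link is down):
live = {"DC1": 3, "DC2": 0}
print(write_succeeds("EACH_QUORUM", rf, live, "DC1"))   # False
print(write_succeeds("LOCAL_QUORUM", rf, live, "DC1"))  # True

# One replica lost in DC2: a quorum of 2 replicas is 2, so EACH_QUORUM fails.
live = {"DC1": 3, "DC2": 1}
print(write_succeeds("EACH_QUORUM", rf, live, "DC1"))   # False
```

Note that with a replication factor of 2 in the failover data center, EACH_QUORUM tolerates no replica failure there, which is one more reason to weigh it against LOCAL_QUORUM.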

Resources