GridGain open source datacenter topology specification

GRIDGAIN DATA-CENTER REPLICATION
A few specific questions regarding the recently open-sourced GridGain code. The gridgain.org support link says data center replication is not enabled for the open-source version. Is this true or false?
More importantly, assuming the open-source version has the data center feature enabled, how do we go about specifying the topology and activating the replication?
For example, the official documentation suggests creating/setting a GridDrSenderCacheConfiguration and GridDrSenderHubConfiguration with details of the topology. I did this, but it didn't seem to enable any cross-data-center replication.
More specifically, I did the following:
Assign a dataCenterId byte parameter in the config.xml for GridGain.
...
Define those nodes that are part of that data center under the
... add ip addresses of nodes
Define the above for each node in each data center appropriately. In the GridGain Java client code, initiate a GridGain instance and set the GridDrSenderCacheConfiguration and GridDrSenderHubConfiguration (along with the GridDrSenderHubConnectionConfiguration) as specified in the docs for each node in each data center, and also use a dummy GridDrReceiverHubConfiguration object (all defaults).
However, this does not seem to do any replication across the data centers.
Would someone from the GridGain team please give some examples of setting up data center replication: how to set up the config.xml, and how to enable it in the Java code when instantiating a GridGain instance?
Also, I am trying to avoid intra-datacenter replication by setting the gridDrSenderHubConnectionConfiguration.setIgnoredDataCenterIds(localDC); parameter to avoid replicating if the data center is the local one.

Just confirmed: since data center replication is not present in the open-source version, no replication will happen in this case. Please download an evaluation version of GridGain Enterprise Edition and try it out.
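For anyone trying this against the Enterprise edition instead, the sender-side wiring the question describes would look roughly like the sketch below. This is against the GridGain 6.x DR API; the setter names beyond those quoted in the question, the package paths, and the addresses are assumptions to verify against the official docs.

    import org.gridgain.grid.GridConfiguration;
    import org.gridgain.grid.GridException;
    import org.gridgain.grid.GridGain;
    import org.gridgain.grid.cache.GridCacheConfiguration;
    // DR classes live under org.gridgain.grid.dr.* in 6.x (exact paths assumed)
    import org.gridgain.grid.dr.cache.sender.GridDrSenderCacheConfiguration;
    import org.gridgain.grid.dr.hub.sender.GridDrSenderHubConfiguration;
    import org.gridgain.grid.dr.hub.sender.GridDrSenderHubConnectionConfiguration;

    public class DrSenderNode {
        public static void main(String[] args) throws GridException {
            byte localDc = 1;

            GridConfiguration cfg = new GridConfiguration();
            cfg.setDataCenterId(localDc); // same value as the dataCenterId byte in config.xml

            // Mark the cache as a DR sender cache.
            GridCacheConfiguration cacheCfg = new GridCacheConfiguration();
            cacheCfg.setName("partitioned");
            cacheCfg.setDrSenderConfiguration(new GridDrSenderCacheConfiguration()); // assumed setter

            // Connection from this sender hub to the remote DC's receiver hub.
            GridDrSenderHubConnectionConfiguration conn = new GridDrSenderHubConnectionConfiguration();
            conn.setDataCenterId((byte) 2);                 // remote DC id (illustrative)
            conn.setReceiverHubAddresses("10.1.0.5:49000"); // remote receiver hub (illustrative)
            conn.setIgnoredDataCenterIds(localDc);          // skip intra-DC replication, as in the question

            GridDrSenderHubConfiguration senderHub = new GridDrSenderHubConfiguration();
            senderHub.setConnectionConfiguration(conn);     // assumed setter

            cfg.setCacheConfiguration(cacheCfg);
            cfg.setDrSenderHubConfiguration(senderHub);     // assumed setter

            GridGain.start(cfg);
        }
    }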

Related

Migrate Data from one Riak cluster to another

I have a situation where we need to migrate data from one Riak cluster to another and then remove the old cluster. The ring size will be the same; even the region will be the same. We need to do this to upgrade the instances to AL2. Is there a clean approach to do so on Prod, without realtime data loss?
The answer to this may be tied to your version of Riak KV. If you have the open source version of Riak KV 2.2.3 or earlier, this will require an in-situ upgrade to Riak KV 2.2.6 before progressing. See https://www.tiot.jp/riak-docs/riak/kv/2.2.6/setup/upgrading/version/ with packages at https://files.tiot.jp/riak/kv/2.2/2.2.6/
For the Enterprise Edition of Riak KV 2.2.3 and earlier, or the open-source edition of Riak KV 2.2.6 or higher, you can use multi-data centre replication (MDC).
Use both of these at the same time for proper replication and to prevent data loss:
fullsync replication will copy across all stored data on its first run and then any missing data on subsequent runs.
realtime replication will replicate all transactions in almost realtime.
If you then set this up as bidirectional replication (get each cluster to replicate to the other for both fullsync and realtime), you will be able to seamlessly switch your production environment from one cluster to the other without any issues. Once you are happy everything is working as expected, you can kill the old cluster.
Please see the documentation for replication at https://www.tiot.jp/riak-docs/riak/kv/2.2.6/using/cluster-operations/v3-multi-datacenter/
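As a rough sketch of what that setup looks like with the riak-repl tooling (the cluster names and the address below are illustrative; run the mirror-image commands on the other cluster to make it bidirectional):

    # Name each cluster once
    riak-repl clustername old_cluster    # run on the old cluster
    riak-repl clustername new_cluster    # run on the new cluster

    # On the old cluster: connect to the new cluster's cluster manager
    riak-repl connect 10.0.1.10:9080

    # Enable and start both replication modes towards the new cluster
    riak-repl fullsync enable new_cluster
    riak-repl fullsync start new_cluster
    riak-repl realtime enable new_cluster
    riak-repl realtime start new_cluster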

Hazelcast Mancenter enable/disable snapshot

I am trying to configure WAN replication using Hazelcast Mancenter, but I am not getting the option to select the snapshot enable/disable feature here, as the option is not listed in the dropdown. Is there a way to achieve this through Mancenter?
Mancenter version 3.9.4
Hazelcast version 3.9.3
Thanks
You can add a WAN replication configuration dynamically to a cluster. This is intended for one-off WAN sync operations, not continuous replication. The added configuration has two caveats:
It is not persistent, so it will not survive a member restart.
It cannot be used as a target for regular WAN replication. It can only be used for WAN sync.
That's why the snapshot setting is not there either. To add a persistent WAN config, it must be defined in the member configurations.
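For reference, a persistent WAN config defined in the member's Java configuration would look roughly like the sketch below (Hazelcast 3.9 Enterprise; the group name, endpoint address, and the snapshot.enabled property key are assumptions to check against the WAN replication docs):

    import java.util.HashMap;
    import java.util.Map;

    import com.hazelcast.config.Config;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.config.WanPublisherConfig;
    import com.hazelcast.config.WanReplicationConfig;
    import com.hazelcast.config.WanReplicationRef;

    public class WanMemberConfig {
        public static Config build() {
            // Publisher towards the target cluster (Enterprise batch implementation).
            WanPublisherConfig publisher = new WanPublisherConfig();
            publisher.setGroupName("tokyo");   // target cluster group name (illustrative)
            publisher.setClassName("com.hazelcast.enterprise.wan.replication.WanBatchReplication");

            Map<String, Comparable> props = new HashMap<String, Comparable>();
            props.put("endpoints", "203.0.113.10:5701"); // target addresses (illustrative)
            props.put("snapshot.enabled", "true");       // the setting Mancenter does not expose
            publisher.setProperties(props);

            WanReplicationConfig wan = new WanReplicationConfig();
            wan.setName("london-to-tokyo");
            wan.addWanPublisherConfig(publisher);

            // Reference the WAN config from a map so its updates get replicated.
            WanReplicationRef ref = new WanReplicationRef();
            ref.setName("london-to-tokyo");
            ref.setMergePolicy("com.hazelcast.map.merge.PassThroughMergePolicy");

            Config config = new Config();
            config.addWanReplicationConfig(wan);
            config.addMapConfig(new MapConfig("replicated-map").setWanReplicationRef(ref));
            return config;
        }
    }

Because this lives in the member configuration, it survives restarts and can be used as a target for regular WAN replication, unlike the dynamically added variant.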

Hub-Spoke model with Cassandra

I'm trying to create a hub-spoke topology with Cassandra. I want to have one centralised C* server and many spoke C* servers. Whenever a new record comes in to any of the spokes, it should be moved to the hub C* server. I tried replication strategies, but they seem to be bidirectional: if I insert a record in node1, I am able to see the record on all the nodes in my cluster. Any suggestions/guidance will be highly appreciated.
This is a feature introduced in DataStax Enterprise 5.0. You can find all the details in the docs, but briefly summarized: DSE Advanced Replication provides unidirectional replication from remote clusters to a central hub, and also supports prioritization of data streams.

Creating new datacenter with Datastax OpsCenter

I'd like to enable vnodes on my Cassandra cluster, which has an Analytics DC and a regular Cassandra DC. I am using OpsCenter 5.0.1 and DSE 4.5. My question is: how can I create a new DC with OpsCenter, with vnodes enabled, so I can transfer my data over from my existing DCs? I am following the instructions on this page, but surely I don't have to manually edit the config file on every node to enable a new datacenter, right? Any help much appreciated.
Unfortunately OpsCenter's automated provisioning doesn't currently support creating multi-dc clusters or adding data centers to existing clusters. We know this is important functionality that's missing, and are working on making that available as soon as we can.
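Until then, the manual per-node route the question mentions is the workable one. On each node of the new DC, the relevant settings would look something like this (a sketch assuming GossipingPropertyFileSnitch; the DC and rack names are illustrative):

    # cassandra.yaml -- enable vnodes on the new nodes
    num_tokens: 256
    # initial_token: leave unset when using vnodes
    endpoint_snitch: GossipingPropertyFileSnitch

    # cassandra-rackdc.properties -- place the node in the new datacenter
    dc=Cassandra2
    rack=RAC1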

Dynamically adding new nodes in Cassandra

Is it possible to add new hosts to a Cassandra cluster dynamically?
What I'm trying to do is set up a program that can:
Set up a local version of the database for each user
Each user's machine will become part of the cluster (the machines will be hosts)
Data will be replicated across all the clusters
Building a cluster of multiple hosts usually entails configuring the cassandra.yaml to store the seeds, listen_address and rpc_address of each host.
My idea is to edit these files through Java and insert the new host addresses as required, but making sure that the data is accurate across each user's cassandra.yaml file would be challenging.
I'm wondering if someone has done something similar or has any advice on a better way to achieve this.
Yes, it is possible. Look at Netflix's Priam for a complete example of dynamic Cassandra cluster management (though it is designed to work with Amazon EC2).
For rpc_address and listen_address, you can set up a startup script that configures the cassandra.yaml if it's not right.
For seeds, you can configure a custom seed provider. Look at the seed provider used by Netflix's Priam for some ideas on how to implement it.
The most difficult part will be managing the tokens assigned to each node in an efficient way. Cassandra 1.2 is around the corner and will include a feature called virtual nodes that, IMO, will work well in your case. See the Acunu presentation about it.
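To make the seed provider idea concrete, here is a minimal sketch for Cassandra 1.2 (the class name and the parameter key are illustrative; a real implementation would query a shared registry the way Priam does on EC2):

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import org.apache.cassandra.locator.SeedProvider;

    // Cassandra instantiates this class reflectively, passing the parameters
    // map declared under seed_provider in cassandra.yaml.
    public class SharedRegistrySeedProvider implements SeedProvider {
        private final String[] hosts;

        public SharedRegistrySeedProvider(Map<String, String> args) {
            // For brevity the seed list comes from a static parameter; swapping
            // this for a call to a central registry service is the interesting part.
            hosts = args.get("seeds").split(",");
        }

        @Override
        public List<InetAddress> getSeeds() {
            List<InetAddress> seeds = new ArrayList<InetAddress>(hosts.length);
            for (String host : hosts) {
                try {
                    seeds.add(InetAddress.getByName(host.trim()));
                } catch (UnknownHostException e) {
                    // Skip unresolvable hosts rather than failing node startup.
                }
            }
            return seeds;
        }
    }

It would be registered in each node's cassandra.yaml under seed_provider with class_name set to SharedRegistrySeedProvider, so only the registry, rather than every user's yaml file, has to stay accurate.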
