Issue with Hazelcast active-passive WAN Replication

I have several active Hazelcast clusters and one passive cluster (no wan-replication element).
When an item is added to the globally WAN-replicated map, I see the following message in the log of the passive cluster's instance:
Received wan merge but no merge policy defined!
However, as I understand from 'hazelcast-fullconfig.xml', there is a default merge policy for the map (hz.ADD_NEW_ENTRY). I also tried setting it explicitly.
So, as I understand it, the wan-replication merge policy and the map merge policy are different things.
According to the manual, a passive endpoint should not have a wan-replication element.
Any ideas how I can configure WAN replication for the passive endpoints? Have I missed something?

In version 2.x, you should define a wan-replication-ref (and merge policy) on the passive side as well.
See the testWANClusteringActivePassive test in
https://github.com/hazelcast/hazelcast/blob/maintenance-2.x/hazelcast/src/test/java/com/hazelcast/impl/WanReplicationTest.java
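For illustration, a minimal sketch of what the passive side's map configuration might look like in 2.x (the map and wan-replication-ref names here are hypothetical; the ref name is assumed to match the wan-replication name used on the active side):

<hazelcast>
    <map name="default">
        <wan-replication-ref name="my-wan-cluster">
            <!-- merge policy applied when a WAN merge arrives -->
            <merge-policy>hz.ADD_NEW_ENTRY</merge-policy>
        </wan-replication-ref>
    </map>
</hazelcast>

Note that the passive side still carries no wan-replication (target) element; the wan-replication-ref on the map is what tells the receiving member which merge policy to apply to incoming WAN merges.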

Related

Cassandra multi-datacenter cluster health check from F5 load balancer

I have a working Cassandra cluster across two data centers. Each DC has 3 nodes, with a replication factor of 3 and READ/WRITE consistency of LOCAL_QUORUM.
I want to stop traffic to a particular DC when two nodes in that DC are down, because quorum is no longer met. I expected this to be handled by my application (client), i.e., connecting to the other DC's Cassandra when local quorum is not met, but that is not possible from there.
Can we set up some kind of rule at the F5 load balancer to achieve this?
You can set up an external monitor on the BIG-IP to run a script that determines cluster health, and then load balance on the results. If you're using BIG-IP 11.x+, you create your script and import it, adding any arguments it may require. Then you create a monitor profile that calls that external monitor.
If you have a DevCentral account, check out this page:
DevCentral Wiki: External Monitor
Scroll down and you'll see a ton of examples to build from. The MySQL monitors are worth noting. This is the path I would recommend for cluster health checks on BIG-IP.
Alternatively, you can simply query a web page for a success/failure message. If you already have a cluster health status page, you can have an HTTP monitor validate the message: customize the receive string to look for specific content, or use a regex to match a specific string (such as clusterFailure or whatnot), and make the appropriate LB decisions from there. I ran a similar monitor that read a Nagios status page; if it read a failure on a specific message, it would LB connections away from that node.
Here's some info on regex with http monitors.
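As a rough illustration of the external-monitor approach, here is a minimal sketch of such a script, assuming each node exposes a hypothetical /cluster/health status page and that the failure string is clusterFailure (both are assumptions, not Cassandra or F5 defaults):

#!/bin/sh
# BIG-IP passes the pool member address (possibly IPv6-mapped, ::ffff:<ip>) and port as arguments.
node_ip=$(echo "$1" | sed 's/::ffff://')
node_port="$2"
# Query the hypothetical health page served by each node.
status=$(curl -s --max-time 5 "http://${node_ip}:${node_port}/cluster/health")
# External monitor convention: any output on stdout marks the member UP; silence marks it DOWN.
if [ -n "$status" ] && ! echo "$status" | grep -q "clusterFailure"; then
    echo "up"
fi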

Changes needed in a web application to use Hazelcast

In my project, the web application runs on several servers. On each server, the application caches data during startup in Ehcache and in maps. I want to migrate to Hazelcast. What changes do I need to make to use Hazelcast?
Three steps will get you started:
You need Hazelcast on the classpath.
For example, include http://repo1.maven.org/maven2/com/hazelcast/hazelcast/3.7.4/hazelcast-3.7.4.jar in your webapp's lib folder.
Create a Hazelcast instance in that webapp.
For example, HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance();
Obtain a map reference, to use as if it were a local map.
For example, java.util.Map<?, ?> map = hazelcastInstance.getMap("name");
Step 2 is where you'll likely have the most difficulty: you want the Hazelcast instances to find each other. They will do this using multicast by default, but if multicast is blocked in your network you will need to be more explicit in the config and specify host addresses, as in the sketch below.
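For example, a minimal sketch of an explicit TCP/IP join (the member addresses here are hypothetical):

import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class HazelcastStartup {
    public static void main(String[] args) {
        Config config = new Config();
        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false);   // multicast is the default; disable it
        join.getTcpIpConfig().setEnabled(true)
            .addMember("10.0.0.1")                     // hypothetical host addresses
            .addMember("10.0.0.2");
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        Map<String, String> map = hz.getMap("name");   // used as if it were a local map
        map.put("key", "value");
    }
}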
Send me a DM if you need help.

Kafka 0.9.1 authorization

I am exploring the security capabilities of Kafka 0.9.1 but am unable to use them successfully.
I have set the below configuration in my server.properties:
allow.everyone.if.no.acl.found=false
super.users=User:root;User:kafka
I created an ACL using the below command:
./kafka-acls.sh --authorizer-properties zookeeper.connect=<zookeeper-host:port> --add --allow-principal User:imit --allow-host <host> --topic imit --producer --consumer --group imit-consumer-group
and I see the below response for it:
Current ACLs for resource Topic:imit:
User:imit has Allow permission for operations: Describe from hosts: <host>
User:imit has Allow permission for operations: Read from hosts: <host>
User:imit has Allow permission for operations: Write from hosts: <host>
Note: values mentioned in <> are replaced with dummy placeholders in this question; the correct values were used while actually creating the ACL.
I have the following observations:
a) Though I defined the rule for the imit topic to allow access for a particular user from a given host, I can write to the topic from any host, using any user account.
b) I am unable to read messages from the topic from any host or any user account (even using the one for which I defined the rules).
I am running Kafka on RHEL 6.7 and all the users are local.
I would appreciate it if someone could point out whether I am missing any configuration parameters or commands to manage authorization, or whether Kafka is behaving in a weird way.
Also, where can I get authorization-related logs in Kafka?
Thanks & Regards,
Sudeep
You are probably missing the below setting in your server.properties:
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
Adding this line enables ACL enforcement via SimpleAclAuthorizer. Kafka ships with kafka.security.auth.SimpleAclAuthorizer, but it only takes effect once it is set as the authorizer.class.name parameter.
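Put together, the relevant server.properties section would look something like this (the super.users values are the ones from the question):

# enable the ACL authorizer (no authorizer is active by default)
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
# deny access when no ACL matches, instead of allowing everyone
allow.everyone.if.no.acl.found=false
# super users bypass ACL checks entirely
super.users=User:root;User:kafka

On the logging question: authorization decisions go through the kafka.authorizer.logger defined in config/log4j.properties (written to a separate kafka-authorizer.log by default); raising its level to DEBUG shows individual allow/deny decisions.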
You can also try the setup below, which might give a more complete picture:
https://github.com/Symantec/kafka-security-0.9

Datastax java-driver LoadBalancingPolicy

I would like to understand how to decide on the load balancing policy for a heavy batch workload in Cassandra using the DataStax Java driver. I have two datacenters, and I would like to write into the cluster with consistency ONE, as quickly and reliably as possible.
How do I go about choosing the load balancing options? I see TokenAwarePolicy, LatencyAwarePolicy, and DCAwareRoundRobinPolicy. Can I just use all of them?
Thanks
Srivatsan
The default LoadBalancingPolicy in the java-driver should be perfect for this scenario. It is defined as follows (from Policies):
public static LoadBalancingPolicy defaultLoadBalancingPolicy() {
    return new TokenAwarePolicy(new DCAwareRoundRobinPolicy());
}
This will keep all requests local to the datacenter that the contact points you provide are in and will direct your requests to replicas (using round robin to balance) that have the data you are reading/inserting.
You can nest LoadBalancingPolicies, so if you would like to use all three of these policies you can simply do:
LoadBalancingPolicy policy = LatencyAwarePolicy
    .builder(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
    .build();
If you are willing to use consistency level ONE, you do not care which data centre is used, so there is no need for DCAwareRoundRobinPolicy. If you want the write to be as quick as possible, you want to minimise latency, so you ought to use LatencyAwarePolicy; in practice this will normally select a node in the local data centre, but will use a remote node if it is likely to provide better performance, such as when a local node is overloaded. You also want to minimise the number of network hops, so you want one of the storage nodes holding the data to act as the coordinator for the write, which is what TokenAwarePolicy gives you. You can chain policies together by passing one to the constructor call of another.
Unfortunately, the Cassandra driver does not provide any directly useful base policy to use as the child policy of LatencyAwarePolicy or TokenAwarePolicy; the choices are DCAwareRoundRobinPolicy, RoundRobinPolicy and WhiteListPolicy. However, if you use RoundRobinPolicy as the child policy, LatencyAwarePolicy should, after the first few queries, acquire the latency information it needs; a sketch of this combination follows.
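For instance (the contact point address is hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.policies.LatencyAwarePolicy;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class BatchClusterFactory {
    public static Cluster build() {
        return Cluster.builder()
            .addContactPoint("10.0.0.1")                           // hypothetical contact point
            .withLoadBalancingPolicy(
                LatencyAwarePolicy.builder(                        // track per-node latencies
                    new TokenAwarePolicy(new RoundRobinPolicy()))  // route to replicas owning the token
                    .build())
            .withQueryOptions(new QueryOptions()
                .setConsistencyLevel(ConsistencyLevel.ONE))        // consistency ONE, per the question
            .build();
    }
}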

How to make Astyanax prefer the local DC?

I know that Astyanax has options to make it use only the local DC, but according to this link, the client will then fail if the nodes in the local DC go down. I was wondering if there is something similar (a configuration setting) where requests would go to nodes in the local DC if the data exists on one of them, and only go to cross-data-center nodes when absolutely necessary.
Not a configuration setting, but you could achieve it with the following workaround: initialize two drivers, driver_dc1 and driver_dc2, in your setup, where each one connects to the nodes of the pertinent data center, and fall back from one to the other:
try {
    // perform operation using driver_dc1 (local DC)
} catch (ConnectionException e) {
    // fall back: perform operation using driver_dc2 (remote DC)
}
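A fleshed-out sketch of that fallback, assuming two Astyanax Keyspace handles (keyspaceDc1 and keyspaceDc2, hypothetical names) already initialized with host lists restricted to their respective data centers, and a hypothetical column family my_cf:

import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.serializers.StringSerializer;

public class DcFailoverWriter {
    // hypothetical column family used for illustration
    private static final ColumnFamily<String, String> CF =
        new ColumnFamily<>("my_cf", StringSerializer.get(), StringSerializer.get());

    private final Keyspace keyspaceDc1;  // pool limited to local DC nodes
    private final Keyspace keyspaceDc2;  // pool limited to remote DC nodes

    public DcFailoverWriter(Keyspace dc1, Keyspace dc2) {
        this.keyspaceDc1 = dc1;
        this.keyspaceDc2 = dc2;
    }

    public void write(String rowKey, String column, String value) throws ConnectionException {
        try {
            execute(keyspaceDc1, rowKey, column, value);   // try the local DC first
        } catch (ConnectionException e) {
            execute(keyspaceDc2, rowKey, column, value);   // local DC unavailable: go cross-DC
        }
    }

    private void execute(Keyspace ks, String rowKey, String column, String value)
            throws ConnectionException {
        MutationBatch m = ks.prepareMutationBatch();
        m.withRow(CF, rowKey).putColumn(column, value, null);  // null TTL = no expiry
        m.execute();
    }
}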
