How to configure GemFire in HA mode - Pivotal GemFire

How do I configure GemFire in HA mode?
In cache.xml:
<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://geode.apache.org/schema/cache"
       xsi:schemaLocation="http://geode.apache.org/schema/cache http://geode.apache.org/schema/cache/cache-1.0.xsd"
       version="1.0" lock-lease="120" lock-timeout="60" search-timeout="300"
       is-server="false" copy-on-read="false"/>
<!-- Run one secondary server -->
<cache>
  <pool name="red1" subscription-enabled="true" subscription-redundancy="1">
    <locator host="node5" port="41111"/>
    <locator host="node6" port="41111"/>
  </pool>
</cache>

To get HA, you need to have multiple GemFire/Geode locators and servers running.
gfsh>start locator --name=loc1 --port=10334
gfsh>start locator --name=loc2 --port=10335
gfsh>start server --name=serv1 --server-port=40404
gfsh>start server --name=serv2 --server-port=40405
gfsh>start server --name=serv3 --server-port=40406
You then need to make sure that your region has redundant copies. For a Partition Region this can be defined as follows:
gfsh>create region --name=myPR --type=PARTITION_REDUNDANT
This guarantees that you can tolerate the loss of one Geode server. You can configure up to three redundant copies for a partitioned region; if you need those copies placed on different racks, etc., see the docs for how to accomplish that. A replicated region has the same data on all servers, so it is always highly available.
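If you configure the server cache programmatically instead of through gfsh or cache.xml, a rough Java equivalent looks like this (a sketch only; the locator addresses and region name are taken from the examples above, and the class/main structure is illustrative):
// Sketch: creating a redundant partitioned region with the Geode Java API.
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.PartitionAttributesFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class ServerRegionSetup {
    public static void main(String[] args) {
        Cache cache = new CacheFactory()
                .set("locators", "node5[41111],node6[41111]")
                .create();

        Region<String, String> region = cache
                .<String, String>createRegionFactory(RegionShortcut.PARTITION_REDUNDANT)
                .setPartitionAttributes(new PartitionAttributesFactory<String, String>()
                        .setRedundantCopies(1)   // one redundant copy: survives the loss of one server
                        .create())
                .create("myPR");
    }
}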
Once you have the server side configured, point your client connection pool at the locators. The client pool establishes connections to the available servers; if a server fails, the pool automatically retries the operation on another server. To configure a pool, simply point it at the locators and then reference the pool in the region definition.
<client-cache>
  <pool name="publisher" subscription-enabled="true">
    <locator host="lucy" port="41111"/>
    <locator host="lucy" port="41111"/>
  </pool>
  ...
  <region name="clientRegion" ...
    <region-attributes pool-name="publisher" ...
Please refer to the docs for more details.
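For reference, a roughly equivalent programmatic client setup (a sketch only; the locator host/port and region name are the ones from the XML above, and each additional locator would normally point at a different host):
// Sketch: Geode/GemFire client with a locator-based pool, subscription redundancy,
// and a PROXY client region bound to that pool.
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ClientSetup {
    public static void main(String[] args) {
        ClientCache clientCache = new ClientCacheFactory()
                .addPoolLocator("lucy", 41111)        // add each locator here
                .setPoolSubscriptionEnabled(true)
                .setPoolSubscriptionRedundancy(1)     // keep a backup subscription queue
                .create();

        Region<String, String> clientRegion = clientCache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("clientRegion");

        clientRegion.put("key", "value");             // operations fail over between servers automatically
    }
}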

Related

Spring Boot app can't connect to Cassandra cluster, driver returning "AllNodesFailedException: Could not reach any contact point"

I've updated my spring-boot to v3.0.0 and spring-data-cassandra to v4.0.0, and as a result I'm unable to connect to the Cassandra cluster that is deployed in our staging environment; it runs on an IPv6 address and uses a datacenter other than DC1.
I've added a config class that accepts the local DC programmatically:
@Bean(destroyMethod = "close")
public CqlSession session() {
    CqlSession session = CqlSession.builder()
            .addContactPoint(InetSocketAddress.createUnresolved("[240b:c0e0:1xx:xxx8:xxxx:x:x:x]", port))
            .withConfigLoader(
                    DriverConfigLoader.programmaticBuilder()
                            .withString(DefaultDriverOption.LOAD_BALANCING_LOCAL_DATACENTER, localDatacenter)
                            .withString(DefaultDriverOption.AUTH_PROVIDER_PASSWORD, password)
                            .withString(DefaultDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, "10s")
                            .withString(DefaultDriverOption.CONNECTION_CONNECT_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.REQUEST_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.CONTROL_CONNECTION_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.SESSION_KEYSPACE, keyspace)
                            .build())
            // .addContactPoint(InetSocketAddress.createUnresolved(InetAddress.getByName(contactPoints).getHostName(), port))
            .build();
    return session;
}
And this is my application.yml file:
spring:
  data:
    cassandra:
      keyspace-name: xxx
      contact-points: [xxxx:xxxx:xxxx:xxx:xxx:xxx]
      port: xxx
      local-datacenter: xxxx
      use-dc-aware: true
      username: xxxxx
      password: xxxxx
      ssl: true
      SchemaAction: CREATE_IF_NOT_EXISTS
So locally I was able to connect to Cassandra (by default it points to localhost), but in the stg env my application is not able to connect to that cluster.
Logs in my stg env:
Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node (endPoint=/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx, hostId=null, hashCode=4e9ba6a8): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x984419ed, L:/[240b:c0e0:102:5dd7:xxxx:x:x:xxx]:4xxx - R:/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
Network
You appear to have a networking issue. The driver can't connect to any of the nodes because they are unreachable from a network perspective as it states in the error message:
... AllNodesFailedException: Could not reach any contact point ...
You need to check that:
you have configured the correct IP addresses,
you have configured the correct CQL port, and
there is network connectivity between your app and the cluster (a quick probe is sketched below).
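One simple way to sanity-check that last item is to attempt a plain TCP connection from the app host to the contact point (a sketch only; the host and port below are placeholders, not your real values):
// Sketch: plain TCP connectivity probe to the Cassandra contact point.
import java.net.InetSocketAddress;
import java.net.Socket;

public class CqlPortProbe {
    public static void main(String[] args) throws Exception {
        String host = "stg-cassandra.example.com";  // placeholder: your contact point (an IPv6 literal works too)
        int port = 9042;                            // placeholder: default CQL port, use the one your cluster exposes
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5_000);
            System.out.println("TCP connect to " + host + ":" + port + " succeeded");
        }
    }
}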
Security
I also noted that you configured the driver to use SSL:
ssl: true
but I don't see where you've configured the certificate credentials, and this could explain why the driver can't initiate connections.
Check that the cluster has client-to-node encryption enabled. If it does then you need to prepare the client certificates and configure SSL on the driver.
Driver build
This post appears to be a duplicate of another question you posted but is now closed due to lack of clarity and details.
In that question it appears you are running a version of the Java driver not produced by DataStax, as pointed out by @absurdface:
Specifically I note that java-driver-core-4.11.4-yb-1-RC1.jar isn't a Java driver artifact released by DataStax (there isn't even a 4.11.4 Java driver release). This could be relevant for reasons we'll get into ...
We are not aware of where this build came from and without knowing much about it, it could be the reason you are not able to connect to the cluster.
We recommend that you switch to one of the supported builds of the Java driver. Cheers!
A hearty +1 to everything @erick-ramirez mentioned above. I would also expand on his answer with an observation or two.
Normally spring-data-cassandra is used to automatically configure a CqlSession and make it available for injection (or for use in CqlTemplate etc.). That's what you'd normally be configuring with your application.yml file. But you're apparently creating the CqlSession directly in code, which means that spring-data-cassandra isn't involved... and therefore what's in your application.yml likely isn't being used.
This analysis strongly suggests that your CqlSession is not being configured to use SSL. My understanding is that your testing sequence went as follows:
Tested app locally on a local server, everything worked
Tested app against test environment, observed the errors above
If this sequence is correct and you have SSL enabled in your test environment but not on your local Cassandra instance, that could very easily explain the behaviour you're describing.
This explanation could also account for the specific error you cite in the error message. "Lost connection to remote peer" indicates that something is unexpectedly killing your socket connection before any protocol messages are exchanged... and an SSL issue would cause almost exactly that behaviour.
I would recommend checking the SSL configuration for both servers involved in your testing. I would also suggest consulting the SSL-related documentation referenced by Erick above and confirm that you have all the relevant materials when building your CqlSession.
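Coming back to the spring-data-cassandra point above: if you want application.yml to drive the connection, the usual pattern is to not build the CqlSession yourself and instead let Spring Boot's Cassandra auto-configuration create it and inject it where needed. A minimal sketch (the class and method names here are illustrative, not from the original question):
// Sketch: relying on Spring Boot's Cassandra auto-configuration (spring.data.cassandra.*)
// and simply injecting the CqlSession it builds from application.yml.
import com.datastax.oss.driver.api.core.CqlSession;
import org.springframework.stereotype.Component;

@Component
public class CassandraProbe {

    private final CqlSession session; // created by Spring Boot from application.yml, no manual @Bean needed

    public CassandraProbe(CqlSession session) {
        this.session = session;
    }

    public String connectedKeyspace() {
        return session.getKeyspace().map(Object::toString).orElse("<none>");
    }
}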
I added the certificate in my Spring application:
@Bean(destroyMethod = "close")
public CqlSession session() throws IOException, CertificateException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
    // Load the CA certificate from the classpath and build a trust store around it.
    Resource resource = new ClassPathResource("root.crt");
    InputStream inputStream = resource.getInputStream();
    CertificateFactory cf = CertificateFactory.getInstance("X.509");
    Certificate cert = cf.generateCertificate(inputStream);
    TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    keyStore.load(null);
    keyStore.setCertificateEntry("ca", cert);
    trustManagerFactory.init(keyStore);

    // Build an SSL context backed by that trust store and hand it to the driver.
    SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
    sslContext.init(null, trustManagerFactory.getTrustManagers(), null);

    return CqlSession.builder()
            .withSslContext(sslContext)
            .addContactPoint(new InetSocketAddress(contactPoints, port))
            .withAuthCredentials(username, password)
            .withLocalDatacenter(localDatacenter)
            .withKeyspace(keyspace)
            .build();
}
Adding the cert file to the CqlSession builder configuration is what allowed me to connect to the remote Cassandra cluster.

Is there a memory limit for User Code Deployment on Hazelcast Cloud? (free version)

I'm currently playing with Hazelcast Cloud. My use case requires me to upload about 50 MB of jar file dependencies to the Hazelcast Cloud servers. I found that the upload seems to give up after about a minute or so: I get an upload rate of about 1 MB per second, it drops after a while, and then stops. I have repeated it a few times and the same thing happens.
Here is the config code I'm using:
ClientConfig config = new ClientConfig();
ClientUserCodeDeploymentConfig clientUserCodeDeploymentConfig =
        new ClientUserCodeDeploymentConfig();
// added many jars here...
clientUserCodeDeploymentConfig.addJar("jar dependency path..");
clientUserCodeDeploymentConfig.addJar("jar dependency path..");
clientUserCodeDeploymentConfig.addJar("jar dependency path..");
clientUserCodeDeploymentConfig.setEnabled(true);
config.setUserCodeDeploymentConfig(clientUserCodeDeploymentConfig);
ClientNetworkConfig networkConfig = new ClientNetworkConfig();
networkConfig.setConnectionTimeout(9999999);        // i.e. don't time out
networkConfig.setConnectionAttemptPeriod(9999999);  // i.e. don't time out
config.setNetworkConfig(networkConfig);
Any idea what the cause is? Is there perhaps a limit on the free cloud cluster?
I'd suggest using smaller jars, because this feature (client user code deployment) was designed for somewhat different use cases (a small sketch of the intended pattern follows below):
You have objects that run on the cluster via the clients such as Runnable, Callable and Entry Processors.
You have new or amended user domain objects (in-memory format of the IMap set to Object) which need to be deployed into the cluster.
Please see more info here.
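As a rough illustration of that intended pattern (a sketch only; the class and executor names are made up, and this plain ClientConfig is not your actual Hazelcast Cloud configuration):
// Sketch: deploying one small task class via client user code deployment
// and executing it on the cluster, instead of shipping large jar dependencies.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.client.config.ClientUserCodeDeploymentConfig;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

public class UserCodeDeploymentSketch {

    // Small task that runs on a member, not on the client.
    public static class PingTask implements Callable<String>, Serializable {
        @Override
        public String call() {
            return "pong";
        }
    }

    public static void main(String[] args) throws Exception {
        ClientConfig config = new ClientConfig();
        ClientUserCodeDeploymentConfig ucd = config.getUserCodeDeploymentConfig();
        ucd.setEnabled(true);
        ucd.addClass(PingTask.class); // ship just this class rather than a 50 MB jar

        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
        IExecutorService executor = client.getExecutorService("default");
        Future<String> result = executor.submit(new PingTask());
        System.out.println(result.get());
        client.shutdown();
    }
}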

Kafka ZooKeeper Security Authentication & Authorization (JAAS) Using SASL

Regarding Kafka-ZooKeeper security using DIGEST-MD5 authentication, I am trying to rotate/change the credentials/password in both the server (ZooKeeper) and client (Kafka) JAAS config files.
We have a cluster of 3 ZooKeeper nodes and 3 Kafka broker nodes with the JAAS configuration files below.
kafka.conf
Client {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    username="super"
    password="password";
};
zookeeper.conf
Server {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    user_super="password";
};
To rotate the credentials, we do a rolling restart of the server (ZooKeeper) instances after updating the password. Then, while rolling-restarting the clients (Kafka instances) one at a time after updating the same credential/password for the super user, we notice
[2019-06-15 17:17:38,929] INFO [ZooKeeperClient] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2019-06-15 17:17:38,929] INFO [ZooKeeperClient] Connected. (kafka.zookeeper.ZooKeeperClient)
these INFO-level messages in the server logs, which eventually result in an unclean shutdown and restart of the broker, impacting reads and writes for longer than expected. I have tried commenting out requireClientAuthScheme=sasl in the ZooKeeper zoo.cfg (https://cwiki.apache.org/confluence/display/ZOOKEEPER/Client-Server+mutual+authentication) to allow any client to authenticate to ZooKeeper, but with no success.
Also, as an alternative approach, I tried to update the credential/password in the JAAS config dynamically using sasl.jaas.config, and I get the same exception documented in this JIRA (reference: https://issues.apache.org/jira/browse/KAFKA-8010).
Does anyone have any suggestions? Thanks in advance.

Migration of Spring Partition Batch Jobs from Spring Batch 2.2 to Spring Batch 4.0.2

I am migrating Spring Batch partition jobs (XML configuration) from Spring Batch 2.2.7 / Spring 3.2 to Spring Batch 4.0.2 / Spring 5.0.12. The war file is deployed on WildFly 11 with ActiveMQ Artemis. The overall approach uses x clustered application servers and divides partitioned jobs into y partitions, with each server having y/x listeners so the load is evenly distributed around the cluster.
We utilize 1 queue for outgoing messages and 1 queue for incoming messages across all partitioned batch jobs. All jobs share a single JmsInboundGateway like:
<int-jms:inbound-gateway
id="springbatch.master.inbound.gateway"
connection-factory="springbatch.listener.jmsConnectionFactory"
request-channel="springbatch.slave.jms.request"
request-destination="springbatch.partition.jms.requestsQueue"
concurrent-consumers="${springbatch.partition.concurrent.consumers}"
max-concurrent-consumers="${springbatch.partition.concurrent.maxconsumers}"
max-messages-per-task="${springbatch.partition.concurrent.maxmessagespertask}"/>
<int:service-activator
input-channel="springbatch.slave.jms.request"
output-channel="springbatch.slave.jms.response"
ref="springbatch.stepExecutionRequestHandler"/>
Each Job has an outbound gateway defined like:
<int-jms:outbound-gateway
connection-factory="springbatch.jmsConnectionFactory"
request-channel="partitioned.jms.requests"
request-destination="partition.jms.requestsQueue"
reply-channel="partitioned.jms.reply"
reply-destination="partition.jms.repliesQueue"
receive-timeout="${partitioned.timeout}"
correlation-key="JMSCorrelationID" >
<int-jms:reply-listener cache-level="0" />
</int-jms:outbound-gateway>
<int:aggregator
input-channel="partitioned.jms.reply"
ref="partitioned.jms.handler"/>
Based on integration schema changes we removed the JMSCorrelationId and reply listener from the inbound gateway.
For the initial integration effort I only defined the inbound gateway, and WildFly throws the following exception:
[org.springframework.jms.listener.DefaultMessageListenerContainer] (springbatch.master.inbound.gateway.container-2) Setup of JMS message listener invoker failed for destination 'ActiveMQQueue[partitionRequestQueue]' - trying to recover. Cause: Only allowed one session per connection. See the J2EE spec, e.g. J2EE1.4 Section 6.6: javax.jms.IllegalStateException: Only allowed one session per connection. See the J2EE spec, e.g. J2EE1.4 Section 6.6
at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.allocateConnection(ActiveMQRASessionFactoryImpl.java:817)
at org.apache.activemq.artemis.ra.ActiveMQRASessionFactoryImpl.createSession(ActiveMQRASessionFactoryImpl.java:531)
at org.springframework.jms.support.JmsAccessor.createSession(JmsAccessor.java:208)
at org.springframework.jms.listener.DefaultMessageListenerContainer.access$1500(DefaultMessageListenerContainer.java:125)
Is there a different approach to defining the number of listeners because of this error?
Only allowed one session per connection. See the J2EE spec, e.g. J2EE1.4 Section 6.6
UPDATE to Questions Below
When Wildfly starts up I see this line
WFLYJCA0002: Bound JCA ConnectionFactory [java:/JmsXA]
I am using java:/JmsXA
This is a Spring Boot Application and it is working without errors in the logs for the 5 JmsListeners defined.
2019-01-16 06:30:54,667 DEBUG [DefaultMessageListenerContainer] (ServerService Thread Pool -- 67) Resumed paused task: org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker@a877e28
When I add the definition of the Jms-inbound-gateway, I start seeing the errors listed above.
Schema Question
The XML defining the inbound gateway has the following schema definition:
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/jms http://www.springframework.org/schema/integration/jms/spring-integration-jms.xsd
The previous code (batch 2.2) had the following schema definition:
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration-2.2.xsd
http://www.springframework.org/schema/integration/jms http://www.springframework.org/schema/integration/jms/spring-integration-jms-2.2.xsd
I just updated it with:
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration-5.0.xsd
http://www.springframework.org/schema/integration/jms http://www.springframework.org/schema/integration/jms/spring-integration-jms-5.0.xsd
and I can add JMSCorrelationID back in, so that issue is solved.
However, I still get the ActiveMQ error on server startup when I include the jms-inbound-gateway.
To resolve the J2EE spec issue ("Only allowed one session per connection"), I needed to add cache-level="0" to the inbound gateway:
<int-jms:inbound-gateway
id="springbatch.master.inbound.gateway"
connection-factory="springbatch.jmsConnectionFactory"
request-channel="springbatch.slave.jms.request"
request-destination="springbatch.partition.jms.requestsQueue"
reply-channel="springbatch.slave.jms.response"
concurrent-consumers="${springbatch.partition.concurrent.consumers}"
max-concurrent-consumers="${springbatch.partition.concurrent.maxconsumers}"
max-messages-per-task="${springbatch.partition.concurrent.maxmessagespertask}"
reply-time-to-live="${springbatch.partition.reply.time.to.live}"
cache-level="0"
/>

Hazelcast Eureka Cloud Discovery Plugin not working

We have implemented Hazelcast as an embedded cache in our Spring Boot app, and need a way for Hazelcast members within a "cluster group" to discover each other dynamically, so that we don't have to provide the possible IP addresses/ports where Hazelcast might be running.
We came across this Hazelcast plugin on GitHub:
https://github.com/hazelcast/hazelcast-eureka, which seems to provide exactly this, using Eureka as the discovery/registration tool.
As described in the plugin's GitHub documentation, the hazelcast-eureka-one library is included on our Boot app's classpath; we also disabled TCP-IP and multicast discovery and added the discovery strategy below in hazelcast.xml:
<discovery-strategies>
  <discovery-strategy class="com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy" enabled="true">
    <properties>
      <property name="self-registration">true</property>
      <property name="namespace">hazelcast</property>
    </properties>
  </discovery-strategy>
</discovery-strategies>
Our application also provides a configured EurekaClient, which we autowire and inject into this plugin implementation:
Config hazelcastConfig = new FileSystemXmlConfig(hazelcastConfigFilePath);
EurekaOneDiscoveryStrategyFactory.setEurekaClient(eurekaClient);
hazelcastInstance = Hazelcast.newHazelcastInstance(hazelcastConfig);
Problem:
We are able to start 2 instances of our Spring Boot app on the same machine, and each app starts an embedded Hazelcast instance on a separate port (5701, 5702). But they don't seem to recognize each other as running within a cluster; this is what we see in the app logs when the 2nd instance is starting:
Members [1] {
Member [10.41.70.143]:5702 - 7c42eb24-3fa0-45cb-9394-17175cc92b9c this
}
17-12-13 12:22:44.480 WARN [main] c.h.i.Node.log(LoggingServiceImpl.java:168) - [10.41.70.143]:5702 [domain-services] [3.8.2] Config seed port is 5701 and cluster size is 1. Some of the ports seem occupied!
which seems to indicate that both Hazelcast instances are running independently and don't recognize the other instance in the cluster/group.
Also, immediately after restart we see this exception thrown frequently on both the nodes:
java.lang.ClassCastException: com.hazelcast.nio.tcp.MemberWriteHandler cannot be cast to com.hazelcast.nio.ascii.TextWriteHandler
at com.hazelcast.nio.ascii.TextReadHandler.<init>(TextReadHandler.java:109) ~[hazelcast-3.8.2.jar:3.8.2]
at com.hazelcast.nio.tcp.SocketReaderInitializerImpl.init(SocketReaderInitializerImpl.java:89) ~[hazelcast-3.8.2.jar:3.8.2]
which seems to indicate an incompatibility among the Hazelcast libraries on the classpath?
It seems like your Eureka service returns the wrong ports. Hazelcast tries to connect to 8080 and other ports in the same range, whereas Hazelcast actually listens on 5701. I'm not exactly sure why this happens, but it feels like you're requesting the wrong service name from Eureka, which ends up returning the HTTP (Tomcat?) ports instead of the separate Hazelcast service that should be registered.
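If the plugin is indeed resolving the application's HTTP port, one thing worth trying (a sketch only; the use-metadata-for-host-and-port property name is based on my reading of the plugin's README, so treat it as an assumption to verify against the version you use) is to have the plugin carry the Hazelcast host and port in Eureka instance metadata instead of reusing the app's registration:
// Sketch: programmatic equivalent of the hazelcast.xml discovery-strategy above,
// with an extra property so the Hazelcast port travels in Eureka metadata.
import com.hazelcast.config.Config;
import com.hazelcast.config.DiscoveryStrategyConfig;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class EurekaDiscoverySketch {
    public static HazelcastInstance newInstance() {
        Config config = new Config();
        config.setProperty("hazelcast.discovery.enabled", "true");

        JoinConfig join = config.getNetworkConfig().getJoin();
        join.getMulticastConfig().setEnabled(false);
        join.getTcpIpConfig().setEnabled(false);

        DiscoveryStrategyConfig strategy = new DiscoveryStrategyConfig(
                "com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy");
        strategy.addProperty("self-registration", true);
        strategy.addProperty("namespace", "hazelcast");
        // Assumption: register/resolve the Hazelcast host and port through Eureka
        // instance metadata instead of the application's HTTP port.
        strategy.addProperty("use-metadata-for-host-and-port", true);

        join.getDiscoveryConfig().addDiscoveryStrategyConfig(strategy);
        return Hazelcast.newHazelcastInstance(config);
    }
}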
