Hazelcast Eureka Cloud Discovery Plugin not working

We have implemented Hazelcast as an embedded cache in our Spring Boot app and need a way for Hazelcast members within a "cluster group" to discover each other dynamically, so that we don't have to provide the possible IP addresses/ports where Hazelcast might be running.
We came across this Hazelcast plugin on GitHub:
https://github.com/hazelcast/hazelcast-eureka which seems to provide exactly this feature, using Eureka as the discovery/registration tool.
As described in the GitHub documentation, the hazelcast-eureka-one library is included on our Boot app's classpath; we also disabled TCP-IP and multicast discovery and added the discovery strategy below in hazelcast.xml:
<discovery-strategies>
    <discovery-strategy class="com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy" enabled="true">
        <properties>
            <property name="self-registration">true</property>
            <property name="namespace">hazelcast</property>
        </properties>
    </discovery-strategy>
</discovery-strategies>
Our application also provides a configured EurekaClient, which we autowire and inject into the plugin implementation:
Config hazelcastConfig = new FileSystemXmlConfig(hazelcastConfigFilePath);
EurekaOneDiscoveryStrategyFactory.setEurekaClient(eurekaClient);
hazelcastInstance = Hazelcast.newHazelcastInstance(hazelcastConfig);
Problem:
We are able to start two instances of our Spring Boot app on the same machine, and each app starts its embedded Hazelcast instance on a separate port (5701, 5702). However, the instances do not recognize each other as members of one cluster; this is what we see in the app logs when the second instance starts:
Members [1] {
Member [10.41.70.143]:5702 - 7c42eb24-3fa0-45cb-9394-17175cc92b9c this
}
17-12-13 12:22:44.480 WARN [main] c.h.i.Node.log(LoggingServiceImpl.java:168) - [10.41.70.143]:5702 [domain-services] [3.8.2] Config seed port is 5701 and cluster size is 1. Some of the ports seem occupied!
which seems to indicate that both Hazelcast instances are running independently and do not recognize each other as members of a cluster/group.
Also, immediately after a restart we see this exception thrown frequently on both nodes:
java.lang.ClassCastException: com.hazelcast.nio.tcp.MemberWriteHandler cannot be cast to com.hazelcast.nio.ascii.TextWriteHandler
    at com.hazelcast.nio.ascii.TextReadHandler.<init>(TextReadHandler.java:109) ~[hazelcast-3.8.2.jar:3.8.2]
    at com.hazelcast.nio.tcp.SocketReaderInitializerImpl.init(SocketReaderInitializerImpl.java:89) ~[hazelcast-3.8.2.jar:3.8.2]
which seems to indicate an incompatibility between Hazelcast libraries on the classpath?

It seems like your Eureka service returns the wrong ports. Hazelcast tries to connect to 8080 and other ports in that range, whereas Hazelcast itself listens on 5701. I'm not exactly sure why this happens, but it looks like you're requesting the wrong service name from Eureka, which ends up returning the HTTP (Tomcat?!) ports instead of a separate Hazelcast service that should be registered.
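One way to rule this out is to configure the discovery strategy programmatically and let the plugin publish Hazelcast's own host and port to Eureka instead of reusing the web application's registration. The sketch below is only illustrative: the use-metadata-for-host-and-port property name is taken from the hazelcast-eureka plugin's README and should be verified against the plugin version on your classpath.
// imports from com.hazelcast.config.* and com.hazelcast.core.*
Config config = new FileSystemXmlConfig(hazelcastConfigFilePath);
config.setProperty("hazelcast.discovery.enabled", "true"); // required to activate the discovery SPI in Hazelcast 3.x
JoinConfig join = config.getNetworkConfig().getJoin();
join.getMulticastConfig().setEnabled(false); // multicast discovery stays disabled
join.getTcpIpConfig().setEnabled(false);     // TCP-IP discovery stays disabled

DiscoveryStrategyConfig eureka =
        new DiscoveryStrategyConfig("com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy");
eureka.addProperty("self-registration", "true");
eureka.addProperty("namespace", "hazelcast");
// Assumption: this plugin property makes the Hazelcast host/port travel via Eureka metadata
// rather than resolving to the service's registered HTTP port.
eureka.addProperty("use-metadata-for-host-and-port", "true");
join.getDiscoveryConfig().addDiscoveryStrategyConfig(eureka);

EurekaOneDiscoveryStrategyFactory.setEurekaClient(eurekaClient);
HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);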

Related

Vertx cluster member connection breaks on hazelcast uuid reset

I am working on a project composed of multiple Vert.x microservices, where each service runs in a different container on an OpenShift platform. The event bus is used for communication between services.
Sometimes when a request is made via the event bus there is no response, and it fails with the errors below:
[vert.x-eventloop-thread-2] DEBUG io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=ea60ebe0-1d81-4041-80d5-79cbe1d2a11c Not connected to server
[vert.x-eventloop-thread-2] WARN io.vertx.core.eventbus.impl.clustered.ConnectionHolder - tx.id=97441ebe-8ce9-42b2-996d-35455a5b32f2 Connecting to server 65c9ab20-43f8-4c59-8455-ecca376b71ac failed
Whenever this happens, I can see the error below on the destination server to which the request was made:
message=WARNING: [192.168.33.42]:5701 [cdart] [5.0.3] Resetting local member UUID. Previous: 65c9ab20-43f8-4c59-8455-ecca376b71ac, new: 8dd74cdf-e4c4-443f-a38e-3f6c36721795
Could this be because the reset event raised by Hazelcast is not handled in Vert.x?
Vert.x 4.3.5 is used in this project.
This is a known issue that will be fixed in the forthcoming 4.4 release.

Spring Boot app can't connect to Cassandra cluster, driver returning "AllNodesFailedException: Could not reach any contact point"

I've updated Spring Boot to v3.0.0 and spring-data-cassandra to v4.0.0, and as a result I am unable to connect to the Cassandra cluster that is deployed in the stg env; it runs on an IPv6 address and uses a datacenter other than DC1.
I've added a config class which sets the local DC programmatically:
@Bean(destroyMethod = "close")
public CqlSession session() {
    CqlSession session = CqlSession.builder()
            .addContactPoint(InetSocketAddress.createUnresolved("[240b:c0e0:1xx:xxx8:xxxx:x:x:x]", port))
            .withConfigLoader(
                    DriverConfigLoader.programmaticBuilder()
                            .withString(DefaultDriverOption.LOAD_BALANCING_LOCAL_DATACENTER, localDatacenter)
                            .withString(DefaultDriverOption.AUTH_PROVIDER_PASSWORD, password)
                            .withString(DefaultDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, "10s")
                            .withString(DefaultDriverOption.CONNECTION_CONNECT_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.REQUEST_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.CONTROL_CONNECTION_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.SESSION_KEYSPACE, keyspace)
                            .build())
            //.addContactPoint(InetSocketAddress.createUnresolved(InetAddress.getByName(contactPoints).getHostName(), port))
            .build();
    return session;
}
and this is my application.yml file
spring:
  data:
    cassandra:
      keyspace-name: xxx
      contact-points: [xxxx:xxxx:xxxx:xxx:xxx:xxx]
      port: xxx
      local-datacenter: xxxx
      use-dc-aware: true
      username: xxxxx
      password: xxxxx
      ssl: true
      SchemaAction: CREATE_IF_NOT_EXISTS
So locally I was able to connect to Cassandra (by default it points to localhost), but in the stg env my application is not able to connect to that cluster.
Logs in my stg env:
caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node (endPoint=/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx, hostId=null, hashCode=4e9ba6a8): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x984419ed, L:/[240b:c0e0:102:5dd7:xxxx:x:x:xxx]:4xxx - R:/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
Network
You appear to have a networking issue. The driver can't connect to any of the nodes because they are unreachable from a network perspective as it states in the error message:
... AllNodesFailedException: Could not reach any contact point ...
You need to check that:
you have configured the correct IP addresses,
you have configured the correct CQL port, and
there is network connectivity between your app and the cluster (a quick check is sketched below).
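For the last point, a plain TCP connection test from the application host can rule out basic network problems before involving the driver. This is only a minimal sketch (a socket connect, not a CQL handshake); the host and port values below are placeholders for your own node address and CQL port:
import java.net.InetSocketAddress;
import java.net.Socket;

public class ContactPointCheck {
    public static void main(String[] args) throws Exception {
        String host = "240b:c0e0:...";   // placeholder: one of your Cassandra nodes
        int port = 9042;                 // placeholder: the CQL port your cluster actually exposes
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5000); // 5 second timeout
            System.out.println("TCP connection to the contact point succeeded");
        }
    }
}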
Security
I also noted that you configured the driver to use SSL:
ssl: true
but I don't see anywhere where you've configured the certificate credentials and this could explain why the driver can't initiate connections.
Check that the cluster has client-to-node encryption enabled. If it does then you need to prepare the client certificates and configure SSL on the driver.
Driver build
This post appears to be a duplicate of another question you posted, which is now closed due to lack of clarity and details.
In that question it appears you are running a version of the Java driver not produced by DataStax, as pointed out by @absurdface:
Specifically I note that java-driver-core-4.11.4-yb-1-RC1.jar isn't a Java driver artifact released by DataStax (there isn't even a 4.11.4 Java driver release). This could be relevant for reasons we'll get into ...
We are not aware of where this build came from and without knowing much about it, it could be the reason you are not able to connect to the cluster.
We recommend that you switch to one of the supported builds of the Java driver. Cheers!
A hearty +1 to everything @erick-ramirez mentioned above. I would also expand on his answer with an observation or two.
Normally spring-data-cassandra is used to automatically configure a CqlSession and make it available for injection (or for use in CqlTemplate etc.). That's what you'd normally be configuring with your application.yml file. But you're apparently creating the CqlSession directly in code, which means that spring-data-cassandra isn't involved... and therefore what's in your application.yml likely isn't being used.
This analysis strongly suggests that your CqlSession is not being configured to use SSL. My understanding is that your testing sequence went as follows:
Tested app locally on a local server, everything worked
Tested app against test environment, observed the errors above
If this sequence is correct and you have SSL enabled in your test environment but not on your local Cassandra instance, that could very easily explain the behaviour you're describing.
It could also explain the specific error you cite. "Lost connection to remote peer" indicates that something is unexpectedly killing your socket connection before any protocol messages are exchanged... and an SSL issue would cause almost exactly that behaviour.
I would recommend checking the SSL configuration for both servers involved in your testing. I would also suggest consulting the SSL-related documentation referenced by Erick above and confirm that you have all the relevant materials when building your CqlSession.
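If you do want application.yml to keep driving the connection settings, one option (a sketch, assuming Spring Boot's Cassandra auto-configuration is on the classpath) is to let Spring Boot build the CqlSession for you and only customize what the YAML cannot express, such as the SSLContext:
import javax.net.ssl.SSLContext;

import org.springframework.boot.autoconfigure.cassandra.CqlSessionBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CassandraSslCustomizer {

    // The SSLContext bean is assumed to be built elsewhere (for example from root.crt,
    // much like the follow-up below does); Spring injects it here.
    @Bean
    public CqlSessionBuilderCustomizer sslCustomizer(SSLContext sslContext) {
        return builder -> builder.withSslContext(sslContext);
    }
}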
I added the certificate in my Spring application:
public CqlSession session() throws IOException, CertificateException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
    // Load the CA certificate from the classpath
    Resource resource = new ClassPathResource("root.crt");
    InputStream inputStream = resource.getInputStream();
    CertificateFactory cf = CertificateFactory.getInstance("X.509");
    Certificate cert = cf.generateCertificate(inputStream);

    // Put the CA certificate into a trust store and build a TrustManagerFactory from it
    TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    keyStore.load(null);
    keyStore.setCertificateEntry("ca", cert);
    trustManagerFactory.init(keyStore);

    // Create an SSLContext that trusts the cluster's CA
    SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
    sslContext.init(null, trustManagerFactory.getTrustManagers(), null);

    return CqlSession.builder()
            .withSslContext(sslContext)
            .addContactPoint(new InetSocketAddress(contactPoints, port))
            .withAuthCredentials(username, password)
            .withLocalDatacenter(localDatacenter)
            .withKeyspace(keyspace)
            .build();
}
So I added the certificate to the CqlSession builder configuration, and this allowed me to connect to the remote Cassandra cluster.

Spring Data GemFire Server java.net.BindException in Linux

I have a Spring Boot app that I am using to start a Pivotal GemFire CacheServer.
When I jar up the file and run it locally:
java -jar gemfire-server-0.0.1-SNAPSHOT.jar
It runs fine without issue. The server is using the default properties
spring.data.gemfire.cache.log-level=info
spring.data.gemfire.locators=localhost[10334]
spring.data.gemfire.cache.server.port=40404
spring.data.gemfire.name=CacheServer
spring.data.gemfire.cache.server.bind-address=localhost
spring.data.gemfire.cache.server.host-name-for-clients=localhost
If I deploy this to a CentOS distribution and run it with the same script but passing the "test" profile:
java -jar gemfire-server-0.0.1-SNAPSHOT.jar -Dspring.profiles.active=test
with my test profile application-test.properties looking like this:
spring.data.gemfire.cache.server.host-name-for-clients=server.centralus.cloudapp.azure.com
I can see during startup that the server finds the Locator already running on the host (I start it through a separate process with Gfsh).
The server even joins the cluster for about a minute. But then it shuts down because of a bind exception.
I have checked to see if there is anything running on that port (40404) - and nothing shows up
EDIT
Apparently I DO get this exception locally - it just takes a lot longer.
It is almost instant when I start it up on the CentOS distribution. On my Mac it takes around 2 minutes before the process throws the exception:
Adding a few more images of this:
Two bash windows - left is monitoring GF locally and right I use to check the port and start the Java process:
The server is added to the cluster. Note the timestamp of 16:45:05.
Here is the server added and it appears to be running:
Finally, the exception after about two minutes - again look at the timestamp on the exception - 16:47:09. The server is stopped and dropped from the cluster.
Did you start other servers using Gfsh? That is, with a Gfsh command similar to...
gfsh>start server --name=ExampleGfshServer --log-level=config
Gfsh will start CacheServers listening on the default CacheServer port of 40404.
You have a few options.
1) First, you can disable the default CacheServer when starting a server with Gfsh like so...
gfsh>start server --name=ExampleGfshServer --log-level=config --disable-default-server
2) Alternatively, you can change the CacheServer port when starting other servers using Gfsh...
gfsh>start server --name=ExampleGfshServer --log-level=config --server-port=50505
3) If you are starting multiple instances of your Spring Boot, Pivotal GemFire CacheServer class, then you can vary the spring.data.gemfire.cache.server.port property by declaring it as a System property at startup.
For instance, you can, in the Spring Boot application.properties, do...
#application.properties
...
spring.data.gemfire.cache.server.port=${gemfire.cache.server.port:40404}
And then when starting the application from the command-line...
java -Dgemfire.cache.server.port=48484 -jar ...
Of course, you could just set the SDG property from the command line too...
java -Dspring.data.gemfire.cache.server.port=48484 -jar ...
Anyway, I guarantee you that you have another process (e.g. a Pivotal GemFire CacheServer) running with a ServerSocket listening on port 40404. netstat -a | grep 40404 should give you better results.
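If the netstat output is inconclusive, a quick way to confirm the same thing from Java (a minimal sketch, independent of GemFire) is to try binding the port yourself; a java.net.BindException means another process already owns it:
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortProbe {
    public static void main(String[] args) throws Exception {
        try (ServerSocket probe = new ServerSocket()) {
            probe.bind(new InetSocketAddress("localhost", 40404)); // same port the CacheServer wants
            System.out.println("Port 40404 is free");
        } catch (BindException e) {
            System.out.println("Port 40404 is already in use: " + e.getMessage());
        }
    }
}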
Hope this helps.
Regards,
John

Payara - Hazelcast cluster node picks the wrong network interface

When starting a Payara cluster, one of the nodes binds to the wrong IP address (the internal IP address of Docker, which is installed locally on that node).
What is the proper way of telling a Payara cluster instance node which address it should bind to?
Node 1 log:
[2017-12-04T11:35:06.512+0800] [Payara 4.1] [INFO] [] [com.hazelcast.internal.cluster.impl.MulticastJoiner] [tid: _ThreadID=16 _ThreadName=RunLevelControllerThread-1512358500010] [timeMillis: 1512358506512] [levelValue: 800] [[
[172.17.0.1]:5900 [dev] [3.8.5]
Members [1] {
Member [172.17.0.1]:5900 - 9be6669e-b853-44c0-9656-8488d3e1031b this
}
]]
Node 2 log:
[2017-12-04T11:35:06.771+0800] [Payara 4.1] [INFO] [] [com.hazelcast.internal.cluster.impl.MulticastJoiner] [tid: _ThreadID=17 _ThreadName=RunLevelControllerThread-1512358500129] [timeMillis: 1512358506771] [levelValue: 800] [[
[10.4.0.86]:5900 [dev] [3.8.5]
Members [1] {
Member [10.4.0.86]:5900 - e3f9dd48-58b9-45f9-88fc-6b0feaedd78f this
}
]]
I have tested the cluster itself and it works properly on machines with only one interface (without Docker installed).
I have found issues related to my case, but was not able to adapt them to the Payara cluster setup:
Hazelcast cluster over AWS using Docker
Configuring a two node hazelcast cluster - avoiding multicast
That is, the suggestion to use the local property -Dhazelcast.local.localAddress=[yourCorrectIpGoesHere] works, but in a cluster environment with centralized management of the node configs I do not see how to set different JVM properties for each of the nodes.
Submitting a custom hazelcast-config.xml via the "Override configuration file" option could work, but it means the full configuration has to be done via this file, which makes it not very handy to manage; still, currently this looks like the only option that could potentially help here.
Thanks!
Payara server doesn't expose this configuration option directly. Using the system property hazelcast.local.localAddress is the preferred option. However, you shouldn't set it as a JVM option like you did with
-Dhazelcast.local.localAddress=...
Instead, add the system property via the server page in the Admin Console: on the Properties tab, go to the System Properties tab and add a new property with the variable name hazelcast.local.localAddress and the override value set to the IP address of the interface you want Hazelcast to bind to.
This way the configuration is applied at runtime, without a server restart, and the same approach works for the other instances in the cluster if you set the property for them as well. For those, instead of the server page, go to the configuration of each instance and set the system property there.

How to Register Node app with Spring Cloud and Netflix's Eureka

I am trying hard to register my Node app with Netflix's Eureka, and after googling a lot I am still looking for a solution. The most I could figure out is that there is Prana, but I hit an issue which is still in Prana's issue list (I guess it means my RestTemplate is not able to discover this app).
Sidecar is another suggested solution, for which I am not getting any response. Apart from these two I have found a node module, Eureka-Node-client, but none of them serve my purpose.
You can create a Sidecar application like you would create any spring-boot app:
@EnableSidecar
@SpringBootApplication
public class SideCarApplication {
    public static void main(final String[] args) {
        SpringApplication.run(SideCarApplication.class, args);
    }
}
The important thing is that you have to configure it to correctly register your actual service. Your application.yml should look like this:
server:
  port: 9999  # the port your spring-boot sidecar is running on
spring:
  application:
    name: nodeapplication  # the name will be your id in eureka
sidecar:
  port: 8000  # the node application's port
  health-uri: http://localhost:8000/health.json  # the exposed health endpoint of your node application
It is important to note that the health endpoint should return UP so your service's status will be correct in Eureka. The returned JSON for a healthy service:
{
  "status": "UP"
}
If you are having trouble setting up a spring-boot app, use https://start.spring.io/ to configure a project. Sadly there isn't a sidecar option to tick, but you will get the idea. You can do the same from STS (Spring Tool Suite).
The maven dependency for Sidecar (with spring-cloud as a parent):
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-netflix-sidecar</artifactId>
</dependency>
Check out eureka-js-client, which can be used directly in the Node application to register with Eureka.
I used the npm package generator-express-sidecar to generate a gradlew sidecar project with all of the out-of-the-box configuration.
Please refer to the reference link shared below:
https://www.npmjs.com/package/generator-express-sidecar
