Payara - Hazelcast cluster node picks the wrong network interface - hazelcast

When starting the Payara cluster, one of the nodes binds to the wrong IP address (the internal IP address of Docker, which is installed locally on that node).
What is the proper way to tell a Payara Cluster instance which address it should bind to?
Node 1 log:
[2017-12-04T11:35:06.512+0800] [Payara 4.1] [INFO] [] [com.hazelcast.internal.cluster.impl.MulticastJoiner] [tid: _ThreadID=16 _ThreadName=RunLevelControllerThread-1512358500010] [timeMillis: 1512358506512] [levelValue: 800] [[
[172.17.0.1]:5900 [dev] [3.8.5]
Members [1] {
Member [172.17.0.1]:5900 - 9be6669e-b853-44c0-9656-8488d3e1031b this
}
]]
Node 2 log:
[2017-12-04T11:35:06.771+0800] [Payara 4.1] [INFO] [] [com.hazelcast.internal.cluster.impl.MulticastJoiner] [tid: _ThreadID=17 _ThreadName=RunLevelControllerThread-1512358500129] [timeMillis: 1512358506771] [levelValue: 800] [[
[10.4.0.86]:5900 [dev] [3.8.5]
Members [1] {
Member [10.4.0.86]:5900 - e3f9dd48-58b9-45f9-88fc-6b0feaedd78f this
}
]]
I have tested the cluster itself and it works properly on machines with only one interface (without Docker installed).
I have found issues that are related to my case, but I was not able to adapt them to the Payara Cluster setup:
Hazelcast cluster over AWS using Docker
Configuring a two node hazelcast cluster - avoiding multicast
Meaning, the suggestion to use the property -Dhazelcast.local.localAddress=[yourCorrectIpGoesHere] works, but in a cluster environment with centralized management of the node configuration I do not see how to set different JVM properties for each of the nodes.
Submitting a custom hazelcast-config.xml via the "Override configuration file" option could work, but it means the full configuration has to be maintained in that file, which is not very handy to manage; still, this currently looks like the only option that could potentially help here.
Thanks!

Payara Server doesn't expose this configuration option directly. Using the system property hazelcast.local.localAddress is the preferred option. However, you shouldn't set it as a JVM option like you did with
-Dhazelcast.local.localAddress=...
Instead, add it as a system property via the server page in the Admin Console. On the Properties tab, go to the System Properties tab and add a new property with the variable name hazelcast.local.localAddress and the override value set to the IP address of the interface you want Hazelcast to bind to.
This way the configuration is applied at runtime without any server restart, and it will also reach the other cluster members if you set the property for the clustered instances as well. For those, instead of the server page, go to the configuration of each instance and set the system property there.
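If the nodes are managed with scripts rather than the Admin Console, the same thing can be done per instance with the asadmin CLI; a minimal sketch, assuming two instances named instance-1 and instance-2 (the names and addresses are placeholders):
asadmin create-system-properties --target instance-1 hazelcast.local.localAddress=10.4.0.86
asadmin create-system-properties --target instance-2 hazelcast.local.localAddress=10.4.0.87
Because the property is resolved per target, each instance can bind to its own correct interface while the rest of the cluster configuration stays centralized.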

Related

Spring Boot app can't connect to Cassandra cluster, driver returning "AllNodesFailedException: Could not reach any contact point"

I've updated my spring-boot to v3.0.0 and spring-data-cassandra to v4.0.0, after which I am unable to connect to the Cassandra cluster that is deployed in the stg env, runs on an IPv6 address, and is in a datacenter other than DC1.
I've added a config file which sets the local DC programmatically:
@Bean(destroyMethod = "close")
public CqlSession session() {
    CqlSession session = CqlSession.builder()
            .addContactPoint(InetSocketAddress.createUnresolved("[240b:c0e0:1xx:xxx8:xxxx:x:x:x]", port))
            .withConfigLoader(
                    DriverConfigLoader.programmaticBuilder()
                            .withString(DefaultDriverOption.LOAD_BALANCING_LOCAL_DATACENTER, localDatacenter)
                            .withString(DefaultDriverOption.AUTH_PROVIDER_PASSWORD, password)
                            .withString(DefaultDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, "10s")
                            .withString(DefaultDriverOption.CONNECTION_CONNECT_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.REQUEST_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.CONTROL_CONNECTION_TIMEOUT, "20s")
                            .withString(DefaultDriverOption.SESSION_KEYSPACE, keyspace)
                            .build())
            // .addContactPoint(InetSocketAddress.createUnresolved(InetAddress.getByName(contactPoints).getHostName(), port))
            .build();
    return session;
}
and this is my application.yml file
spring:
  data:
    cassandra:
      keyspace-name: xxx
      contact-points: [xxxx:xxxx:xxxx:xxx:xxx:xxx]
      port: xxx
      local-datacenter: xxxx
      use-dc-aware: true
      username: xxxxx
      password: xxxxx
      ssl: true
      SchemaAction: CREATE_IF_NOT_EXISTS
So locally I was able to connect to Cassandra (by default it points to localhost), but in the stg env my application is not able to connect to that cluster.
Logs in my stg env:
Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node (endPoint=/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx, hostId=null, hashCode=4e9ba6a8): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x984419ed, L:/[240b:c0e0:102:5dd7:xxxx:x:x:xxx]:4xxx - R:/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
Network
You appear to have a networking issue. The driver can't connect to any of the nodes because they are unreachable from a network perspective as it states in the error message:
... AllNodesFailedException: Could not reach any contact point ...
You need to check that:
you have configured the correct IP addresses,
you have configured the correct CQL port, and
there is network connectivity between your app and the cluster.
Security
I also noted that you configured the driver to use SSL:
ssl: true
but I don't see where you've configured the certificate credentials, which could explain why the driver can't initiate connections.
Check that the cluster has client-to-node encryption enabled. If it does then you need to prepare the client certificates and configure SSL on the driver.
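If client-to-node encryption is enabled, a minimal sketch of the driver-side SSL configuration in the driver's application.conf could look like this (the paths and password are placeholders; a programmatic SSLContext, as shown further down, works just as well):
datastax-java-driver {
  advanced.ssl-engine-factory {
    class = DefaultSslEngineFactory
    truststore-path = /path/to/client.truststore
    truststore-password = truststore_password
    # hostname-validation = false  # only if the node certificates don't include hostnames
  }
}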
Driver build
This post appears to be a duplicate of another question you posted, which is now closed due to lack of clarity and details.
In that question it appears you are running a version of the Java driver not produced by DataStax, as pointed out by @absurdface:
Specifically I note that java-driver-core-4.11.4-yb-1-RC1.jar isn't a Java driver artifact released by DataStax (there isn't even a 4.11.4 Java driver release). This could be relevant for reasons we'll get into ...
We are not aware of where this build came from and without knowing much about it, it could be the reason you are not able to connect to the cluster.
We recommend that you switch to one of the supported builds of the Java driver. Cheers!
A hearty +1 to everything @erick-ramirez mentioned above. I would also expand on his answer with an observation or two.
Normally spring-data-cassandra is used to automatically configure a CqlSession and make it available for injection (or for use in CqlTemplate etc.). That's what you'd normally be configuring with your application.yml file. But you're apparently creating the CqlSession directly in code, which means that spring-data-cassandra isn't involved... and therefore what's in your application.yml likely isn't being used.
This analysis strongly suggests that your CqlSession is not being configured to use SSL. My understanding is that your testing sequence went as follows:
Tested app locally on a local server, everything worked
Tested app against test environment, observed the errors above
If this sequence is correct and you have SSL enabled in your test environment but not on your local Cassandra instance, that could very easily explain the behaviour you're describing.
It would also account for the specific error in the message you cite: "Lost connection to remote peer" indicates that something is unexpectedly killing your socket connection before any protocol messages are exchanged... and an SSL issue would cause almost exactly that behaviour.
I would recommend checking the SSL configuration for both servers involved in your testing. I would also suggest consulting the SSL-related documentation referenced by Erick above and confirm that you have all the relevant materials when building your CqlSession.
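If you'd rather keep the contact points, datacenter and credentials in application.yml, one option (just a sketch, assuming Spring Boot's Cassandra auto-configuration is on the classpath; the class and bean names below are made up) is to let spring-data-cassandra build the CqlSession for you and only customize what the yml cannot express, such as the SSLContext:
import javax.net.ssl.SSLContext;

import org.springframework.boot.autoconfigure.cassandra.CqlSessionBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CassandraSslConfig {

    // assumes an SSLContext bean exists elsewhere (it could be built from root.crt,
    // much like in the follow-up answer below)
    @Bean
    public CqlSessionBuilderCustomizer sslCustomizer(SSLContext sslContext) {
        return builder -> builder.withSslContext(sslContext);
    }
}
With this approach the auto-configured session picks up both the application.yml settings and the customizer, so there is no hand-built CqlSession that silently ignores the yml.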
I added the certificate in my Spring application:
public CqlSession session() throws IOException, CertificateException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
    // load the cluster's CA certificate from the classpath
    Resource resource = new ClassPathResource("root.crt");
    InputStream inputStream = resource.getInputStream();
    CertificateFactory cf = CertificateFactory.getInstance("X.509");
    Certificate cert = cf.generateCertificate(inputStream);
    // build a trust store containing only that CA certificate
    TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    keyStore.load(null);
    keyStore.setCertificateEntry("ca", cert);
    trustManagerFactory.init(keyStore);
    // create an SSLContext that trusts the cluster's CA
    SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
    sslContext.init(null, trustManagerFactory.getTrustManagers(), null);
    return CqlSession.builder()
            .withSslContext(sslContext)
            .addContactPoint(new InetSocketAddress(contactPoints, port))
            .withAuthCredentials(username, password)
            .withLocalDatacenter(localDatacenter)
            .withKeyspace(keyspace)
            .build();
}
So I added the cert file to the configuration of the CqlSession builder, and this allowed me to connect to the remote Cassandra cluster.

'The requested address is not valid in its context' while trying to connect to ArangoDB server on LAN

I have two machines on a LAN, and I'd like to connect to the ArangoDB server on one of them from the other one.
The first one has the address 192.168.0.105 and this arangod.conf:
[server]
endpoint = tcp://0.0.0.0:8529
storage-engine = auto
The other one has the address 192.168.0.100 and this arangod.conf:
[server]
endpoint = tcp://192.168.0.105:8529
storage-engine = auto
ArangoDB on the first machine is working. When I try to start ArangoDB on the second machine, I see the following error:
2018-08-21T09:46:15Z [2724] INFO {authentication} Jwt secret not specified, generating...
2018-08-21T09:46:15Z [2724] INFO ArangoDB 3.3.12 [win64] 64bit, using build tags/v3.3.12-0-g225095d762, VPack 0.1.30, RocksDB 5.6.0, ICU 58.1, V8 5.7.492.77, OpenSSL 1.0.2a 19 Mar 2015
2018-08-21T09:46:15Z [2724] INFO using storage engine mmfiles
2018-08-21T09:46:15Z [2724] INFO {cluster} Starting up with role SINGLE
2018-08-21T09:46:15Z [2724] INFO {authentication} Authentication is turned on (system only)
2018-08-21T09:46:18Z [2724] INFO using endpoint 'http+tcp://192.168.0.105:8529' for non-encrypted requests
2018-08-21T09:46:18Z [2724] ERROR {communication} unable to bind to endpoint 'http+tcp://192.168.0.105:8529': The requested address is not valid in its context
2018-08-21T09:46:18Z [2724] WARNING {communication} failed to open endpoint 'http+tcp://192.168.0.105:8529' with error: The requested address is not valid in its context
2018-08-21T09:46:18Z [2724] FATAL failed to bind to endpoint 'http+tcp://192.168.0.105:8529'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
I've already created rules in the windows firewall and in the router.
Test-NetConnection results are:
PS C:\Users\> Test-NetConnection -ComputerName 192.168.0.105 -Port 8529
ComputerName : 192.168.0.105
RemoteAddress : 192.168.0.105
RemotePort : 8529
SourceAddress : 192.168.0.100
TcpTestSucceeded : True
What else should I do?
I'm not sure what you are trying to do here... connect one server to another server? This is bound to fail. Don't you rather want to run a server on one machine and connect to it from another computer on the local network using arangosh, or simply use the web interface?
The endpoint must be an address used by a network interface of your local computer. It can't be the address of another machine.
Setups like clusters require a lot more configuration (if done bare-metal).
For an overview of deployment modes including multi-machine setups you may want to check the work-in-progress documentation: https://docs.arangodb.com/devel/Manual/Deployment/
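For example (a sketch, assuming the second machine only needs to act as a client and does not have to run its own externally reachable server), keep its endpoint bound to a local interface and point arangosh at the first machine instead:
[server]
# must be an address of a local interface (or 0.0.0.0 for all of them)
endpoint = tcp://0.0.0.0:8529
storage-engine = auto
and then, from the second machine:
arangosh --server.endpoint tcp://192.168.0.105:8529 --server.username root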

Unable to start Kudu master

While starting kudu-master, I am getting the error below and am unable to start the Kudu cluster.
F0706 10:21:33.464331 27576 master_main.cc:71] Check failed: _s.ok() Bad status: Invalid argument: Unable to initialize catalog manager: Failed to initialize sys tables async: on-disk master list (hadoop-master:7051, slave2:7051, slave3:7051) and provided master list (:0) differ. Their symmetric difference is: :0, hadoop-master:7051, slave2:7051, slave3:7051
It is a cluster of 8 nodes, and I have provided 3 masters in master.gflagfile on the master nodes as given below:
--master_addresses=hadoop-master,slave2,slave3
TL;DR
If this is a new installation, and assuming the master IP addresses are correct, I believe the easiest solution is to:
Stop kudu masters
Nuke the <kudu-data-dir>/master directory
Start kudu masters
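On a typical packaged installation this might look roughly like the following (the service name and data path are assumptions; check your own init scripts and the fs_wal_dir/fs_data_dirs flags in your gflagfile):
# run on each master node
sudo service kudu-master stop
# <kudu-data-dir> is whatever fs_wal_dir / fs_data_dirs point to
sudo rm -rf <kudu-data-dir>/master
sudo service kudu-master start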
Explanation
I believe the most common (if not the only) cause of this error (Failed to initialize sys tables async: on-disk master list (hadoop-master:7051, slave2:7051, slave3:7051) and provided master list (:0) differ.) is a kudu master node being added incorrectly. The error suggests that kudu-master thinks it is running on a single node rather than in a 3-node cluster.
Maybe you did not intend to "add a node", but that's most likely what happened. I'm saying this because I had the same problem; after some googling and debugging, I discovered that during the installation I had started kudu-master before putting the correct IP addresses in master.gflagfile, so kudu-master was spun up thinking it was running on a single node, not a 3-node cluster. Using the steps above to reinstall kudu-master cleanly solved my problem.

Cordapps peers list returning own node info in azure corda deployment

I have created a Corda network in the Azure portal following this documentation:
Documentation: https://docs.corda.net/azure-vm.html
The CorDapp jar I used is from the link below.
Yo CorDapp jar: http://ci-artifactory.corda.r3cev.com/artifactory/cordapp-showcase/yo-4.jar
I have installed the same jar on 3 Corda nodes.
Now the web application is running at ipaddress:10004, but http://ipaddress:10004/api/yo/peers returns
{
"peers" : [ "C=GB,L=London,O=Organisation 4 (Corda 2.0.0)" ]
}
and http://ipaddress:10004/api/yo/me returns
{
"me" : "C=GB,L=London,O=Organisation 4 (Corda 2.0.0)"
}
I am not sure if I missed anything on the network map node. Any suggestions? Thanks in advance.
In the Yo! CorDapp, the peers endpoint is defined to return a list of all the peers on the network, including yourself. See https://github.com/corda/samples/blob/release-V3/yo-cordapp/src/main/kotlin/net/corda/yo/Yo.kt#L72.
The response you're getting implies you can't see the network map node. A few things to try:
Check you can ping the network map node's machine from your node's machine
Ensure the CorDapp is running correctly on the network map node's machine
Ensure your node's node.conf file lists the correct address and port for the network map node
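For the last point, on a pre-V3 node the relevant node.conf section looks roughly like this (the host, port and legal name below are placeholders; use the values of your network map VM):
networkMapService {
    address = "networkmap-vm-hostname:10002"
    legalName = "O=NetworkMapService,L=London,C=GB"
}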

Hazelcast Eureka Cloud Discovery Plugin not working

We have implemented Hazelcast as an embedded cache in our Spring Boot app, and need a way for Hazelcast members within a "cluster group" to discover each other dynamically, so that we don't have to list every possible IP address/port where Hazelcast might be running.
We came across this Hazelcast plugin on GitHub:
https://github.com/hazelcast/hazelcast-eureka which seems to provide exactly that, using Eureka as the discovery/registration tool.
As mentioned in the GitHub documentation, the hazelcast-eureka-one library is included in our Boot app's classpath; we also disabled TCP-IP and multicast discovery and added the discovery strategy below in hazelcast.xml:
<discovery-strategies>
  <discovery-strategy class="com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy" enabled="true">
    <properties>
      <property name="self-registration">true</property>
      <property name="namespace">hazelcast</property>
    </properties>
  </discovery-strategy>
</discovery-strategies>
Our application also provides a configured EurekaClient, which we autowire and inject into the plugin implementation:
Config hazelcastConfig = new FileSystemXmlConfig(hazelcastConfigFilePath);
EurekaOneDiscoveryStrategyFactory.setEurekaClient(eurekaClient);
hazelcastInstance = Hazelcast.newHazelcastInstance(hazelcastConfig);
Problem:
We are able to start 2 instances of our Spring Boot app on the same machine, and we notice that each app starts its embedded Hazelcast instance on a separate port (5701, 5702). But they don't seem to recognize each other as members of one cluster; this is what we see in the app logs when the 2nd instance is starting:
Members [1] {
Member [10.41.70.143]:5702 - 7c42eb24-3fa0-45cb-9394-17175cc92b9c this
}
17-12-13 12:22:44.480 WARN [main] c.h.i.Node.log(LoggingServiceImpl.java:168) - [10.41.70.143]:5702 [domain-services] [3.8.2] Config seed port is 5701 and cluster size is 1. Some of the ports seem occupied!
which seems to indicate that both Hazelcast instances are running independently and don't recognize the other instance running in a cluster/group.
Also, immediately after restart we see this exception thrown frequently on both nodes:
java.lang.ClassCastException: com.hazelcast.nio.tcp.MemberWriteHandler cannot be cast to com.hazelcast.nio.ascii.TextWriteHandler
at com.hazelcast.nio.ascii.TextReadHandler.<init>(TextReadHandler.java:109) ~[hazelcast-3.8.2.jar:3.8.2]
at com.hazelcast.nio.tcp.SocketReaderInitializerImpl.init(SocketReaderInitializerImpl.java:89) ~[hazelcast-3.8.2.jar:3.8.2]
which seems to indicate an incompatibility between Hazelcast libraries on the classpath?
It seems like your Eureka service returns the wrong ports. Hazelcast tries to connect to 8080 and other ports in the same range, whereas Hazelcast itself listens on 5701. I'm not exactly sure why this happens, but it feels like you are requesting the wrong service name from Eureka, which ends up returning the HTTP (Tomcat?) ports instead of the separate Hazelcast service that should be registered.
