Standalone Infinispan and JGroups not forming a cluster with UDP multicast

I am using the standalone Infinispan server on Linux. We have bundled Infinispan version 7.1.1.Final and provided the UDP multicast IP and port through the cluster.xml file. I have 8 nodes in the cluster, but JGroups is not forming the cluster: each node runs Infinispan on its own with a different bind address. The same setup is used for a different deployment and runs normally there. There are no errors or warnings in the Infinispan logs.
Each node has only one member in its cluster view, with its own host name and IP address, e.g.:
2021-06-26 00:18:51,697 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-25) ISPN000094: Received new cluster view for channel clustered: [server1/clustered|0] (1) [server1/clustered]
Below are the things I have investigated, suspecting a multicast issue:
I am able to reach all the JGroups IPs, and they are on the same subnet.
I ran tcpdump to check the communication between the bind address and the multicast address, and the UDP packets are received.
All the bind addresses are on bridge br3, so I checked whether br3 on every node is connected to the same interface; it is.
I checked whether multicast snooping is disabled (it should be disabled for the cluster to form); it was disabled.
Kindly provide some insights for this investigation.
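One way to take Infinispan and JGroups out of the equation is to run a bare multicast sender/receiver pair between two of the nodes, using the same group and port configured in cluster.xml. Below is a minimal Java sketch of such a check; the group 228.6.7.8, the port 46655, and the interface name br3 are placeholders for whatever your cluster.xml and hosts actually use.

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.nio.charset.StandardCharsets;

// Run with argument "recv" on one node and "send" on another node.
public class McastCheck {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("228.6.7.8");   // placeholder: group from cluster.xml
        int port = 46655;                                         // placeholder: port from cluster.xml
        NetworkInterface nif = NetworkInterface.getByName("br3"); // bridge carrying the bind address

        if (args.length > 0 && args[0].equals("recv")) {
            MulticastSocket sock = new MulticastSocket(port);
            sock.joinGroup(new InetSocketAddress(group, port), nif);
            byte[] buf = new byte[1024];
            while (true) {
                DatagramPacket p = new DatagramPacket(buf, buf.length);
                sock.receive(p); // blocks until a datagram arrives
                System.out.println("got \"" + new String(p.getData(), 0, p.getLength(), StandardCharsets.UTF_8)
                        + "\" from " + p.getAddress());
            }
        } else {
            MulticastSocket sock = new MulticastSocket();
            sock.setNetworkInterface(nif); // send via br3, not via the default route
            byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
            sock.send(new DatagramPacket(msg, msg.length, group, port));
            sock.close();
        }
    }
}

If the receiver on one node never sees a datagram sent from another node, the problem is in the network path (switch IGMP handling, bridge settings, reverse-path filtering) rather than in Infinispan; if the datagram does arrive, the JGroups UDP stack configuration itself becomes the main suspect.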

Related

OPC UA Multicast Discovery

I am a beginner in OPC UA, exploring the discovery mechanisms mentioned in Part 12 of the specification. I have a couple of queries.
In the multicast extension discovery, the server registers with its Local Discovery Server (LDS-ME), and when the client does its registration with its LDS-ME, the client-side LDS-ME issues a multicast probe to which the server-side LDS-ME responds with an announcement, thus allowing the client to know the list of servers in the network.
My question here is: why is this process referred to as a multicast probe and multicast announcement? As per the mDNS specification, probe and announcement are used initially to secure unique ownership of a resource record. Could anybody tell me why it is referred to as probe and announce?
In the open62541 stack, with the discovery examples, running server_lds.c I get a log message saying "Multicast DNS: outbound interface 0.0.0.0, it means that the first OS interface is used (you can explicitly set the interface by using 'discovery.mdnsInterfaceIP' config parameter)".
Now, the mDNS specification says the multicast DNS address should be 224.0.0.251:5353.
Why is it being set to 0.0.0.0? Could anyone please let me know?
Regards,
Rakshan
There is no relation to the words "probe" and "announce" as used in the mDNS spec. Here "probe" simply means a look-up or query, and "announce" means something like "here are the following results related to your probe request".
Here, 0.0.0.0 means that every IPv4 interface is used (bound), so every capable interface in your system will be configured for mDNS. It should work the way you mentioned.
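To make the distinction concrete: 224.0.0.251:5353 is the group and port that mDNS traffic is sent to, while the 0.0.0.0 in the log only describes which local interface(s) the socket is bound to for that traffic. A small Java illustration (not open62541 code; "eth0" is a placeholder interface name):

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;

public class MdnsBindDemo {
    public static void main(String[] args) throws Exception {
        InetAddress mdnsGroup = InetAddress.getByName("224.0.0.251");
        int mdnsPort = 5353; // may already be taken by a local mDNS daemon; illustration only

        // Bound to the wildcard address: mDNS traffic is accepted on every IPv4 interface.
        MulticastSocket anyIf = new MulticastSocket(new InetSocketAddress("0.0.0.0", mdnsPort));
        anyIf.joinGroup(new InetSocketAddress(mdnsGroup, mdnsPort), null); // null = default interface

        // Restricted to a single interface, which is what a setting such as
        // discovery.mdnsInterfaceIP accomplishes in the open62541 config.
        NetworkInterface eth0 = NetworkInterface.getByName("eth0");
        MulticastSocket oneIf = new MulticastSocket(mdnsPort);
        oneIf.joinGroup(new InetSocketAddress(mdnsGroup, mdnsPort), eth0);

        anyIf.close();
        oneIf.close();
    }
}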
"0.0.0.0" => have a look here https://en.wikipedia.org/wiki/0.0.0.0

Cassandra client connection to multiple addresses

I have a question about Cassandra.
Is it possible to open the Cassandra client connection on several IPs?
On my server I have two network cards (eth0 and eth1), with the IPs 10.197.11.21 (eth0), 192.168.0.45 (eth1), and 127.0.0.1 (lo).
I want my client to connect to the Cassandra database with these three IPs: localhost, 10.197.11.21 and 192.168.0.45.
For the moment I can choose only one IP. What do I need to modify in cassandra.yaml?
You need to set rpc_address: 0.0.0.0 in cassandra.yaml
Note that when you set rpc_address to 0.0.0.0, you also must set broadcast_rpc_address to something other than 0.0.0.0 (e.g. 10.197.11.21).
rpc_address is the address that Cassandra listens on for connections from clients
listen_address is the address that Cassandra listens on for connections from other Cassandra nodes (not client connections)
broadcast_rpc_address is the address that Cassandra broadcasts to clients that are attempting to discover the other nodes in the cluster. When an application first connects to a Cassandra cluster, the cluster sends the application a list of all the nodes in the cluster and their IP addresses. The IP address sent to the application is the broadcast_rpc_address (side note: Cassandra actually sends all IP addresses; this is just the one it tells the client to connect on). This allows the application to auto-discover all the nodes in the cluster, even if only one IP address was given to the application. It also allows applications to handle situations like a node going offline or new nodes being added.
Even though your broadcast_rpc_address can only point to one of those two IP addresses, your application can still connect to either one. However, your application will also attempt to connect to other nodes via the broadcast_rpc_addresses sent back by the cluster. You can get around this by providing a full list of the addresses of every node in the cluster to your application, but the best solution is to build a driver-side address translator.
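For that last point, an address translator on the driver side rewrites the broadcast_rpc_address the cluster returns into whatever address the client should actually use. A rough sketch, assuming the DataStax Java driver 3.x API (the interface and registration method differ in other driver versions and languages), with a purely hypothetical mapping between the two NICs from the question:

import java.net.InetSocketAddress;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.AddressTranslator;

// Rewrites addresses advertised by the cluster (broadcast_rpc_address) into
// addresses reachable from this particular client.
public class LanAddressTranslator implements AddressTranslator {
    @Override
    public void init(Cluster cluster) { /* nothing to set up in this sketch */ }

    @Override
    public InetSocketAddress translate(InetSocketAddress address) {
        // Hypothetical rule: the node broadcasts 10.197.11.21, but this client should use 192.168.0.45.
        if ("10.197.11.21".equals(address.getAddress().getHostAddress())) {
            return new InetSocketAddress("192.168.0.45", address.getPort());
        }
        return address; // leave every other address untouched
    }

    @Override
    public void close() { }
}

The translator would then be registered when building the Cluster object, e.g. Cluster.builder().addContactPoint("192.168.0.45").withAddressTranslator(new LanAddressTranslator()).build().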

What is spark.local.ip, spark.driver.host, spark.driver.bindAddress and spark.driver.hostname?

What is the difference between, and the use of, all of these?
spark.local.ip
spark.driver.host
spark.driver.bindAddress
spark.driver.hostname
How do I fix a machine as the Driver in a Spark standalone cluster?
Short Version
The ApplicationMaster connects to the Spark Driver via spark.driver.host.
The Spark Driver binds to spark.driver.bindAddress on the client machine.
By example:
1. Example of port binding
.config('spark.driver.port','50243')
then netstat -ano on Windows shows:
TCP 172.18.1.194:50243 0.0.0.0:0 LISTENING 15332
TCP 172.18.1.194:50243 172.18.7.122:54451 ESTABLISHED 15332
TCP 172.18.1.194:50243 172.18.7.124:37412 ESTABLISHED 15332
TCP 172.18.1.194:50243 172.18.7.142:41887 ESTABLISHED 15332
TCP [::]:4040 [::]:0 LISTENING 15332
The nodes in the cluster (172.18.7.1xx) are in the same network as my development machine (172.18.1.194), as my netmask is 255.255.248.0.
2. Example of specifying the IP used by the ApplicationMaster to reach the Driver
.config('spark.driver.host','192.168.132.1')
then netstat -ano shows:
TCP 192.168.132.1:58555 0.0.0.0:0 LISTENING 9480
TCP 192.168.132.1:58641 0.0.0.0:0 LISTENING 9480
TCP [::]:4040 [::]:0 LISTENING 9480
however, the ApplicationMaster cannot connect and reports the error
Caused by: java.net.NoRouteToHostException: No route to host
because this IP is a VM bridge on my development machine
3. Example of IP binding
.config('spark.driver.host','172.18.1.194')
.config('spark.driver.bindAddress','192.168.132.1')
then netstat -ano shows:
TCP 172.18.1.194:63937 172.18.7.101:8032 ESTABLISHED 17412
TCP 172.18.1.194:63940 172.18.7.102:9000 ESTABLISHED 17412
TCP 172.18.1.194:63952 172.18.7.121:50010 ESTABLISHED 17412
TCP 192.168.132.1:63923 0.0.0.0:0 LISTENING 17412
TCP [::]:4040 [::]:0 LISTENING 17412
Detailed Version
Before explaining in detail: there are only these three related conf variables:
spark.driver.host
spark.driver.port
spark.driver.bindAddress
There are NO variables like spark.driver.hostname or spark.local.ip, but there IS an environment variable called SPARK_LOCAL_IP.
Before explaining the variables, we first have to understand the application submission process.
Main Roles of computers:
development machine
master node (YARN / Spark Master)
worker node
There is an ApplicationMaster for each application, which takes care of resource requests from the cluster and of monitoring the status of jobs (stages).
The ApplicationMaster is always in the cluster.
Location of the Spark Driver:
development machine: client mode
within the cluster: cluster mode, same place as the ApplicationMaster
Let's say we are talking about client mode.
The Spark application can be submitted from a development machine, which acts as the client machine of the application as well as a client machine of the cluster.
The Spark application can alternatively be submitted from a node within the cluster (a master node, a worker node, or just a specific machine with no resource-manager role).
The client machine might not be in the same subnet as the cluster, and this is one of the cases these variables try to deal with. Think about your internet connection: it is usually not possible for your laptop to be reached from anywhere around the globe the way google.com can.
At the beginning of the application submission process, spark-submit on the client side uploads the necessary files to the Spark master or YARN and negotiates the resource requests. In this step the client connects to the cluster, and the cluster address is the destination address that the client tries to connect to.
Then the ApplicationMaster starts on the allocated resource.
The resources allocated for the ApplicationMaster are by default chosen at random and cannot be controlled by these variables; they are controlled by the scheduler of the cluster, if you're curious about this.
Then the ApplicationMaster tries to connect BACK to the Spark Driver. This is where these conf variables take effect.
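Putting the client-mode case together: spark.driver.bindAddress is the local address the Driver actually binds to on the client machine, while spark.driver.host is the address it advertises so that the ApplicationMaster can connect back. A sketch in Java mirroring example 3 above (the IP addresses are specific to that network, and the session would still be launched against the cluster via spark-submit or an explicit master setting):

import org.apache.spark.sql.SparkSession;

// Client mode: bind locally to the VM-bridge address, advertise the LAN address
// so the ApplicationMaster inside the cluster can reach the Driver.
public class DriverAddressExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("driver-address-example")
                .config("spark.driver.bindAddress", "192.168.132.1") // local address the Driver binds to
                .config("spark.driver.host", "172.18.1.194")         // address the ApplicationMaster connects back to
                .config("spark.driver.port", "50243")                // fixed port instead of a random one
                .getOrCreate();

        spark.range(10).count(); // trivial job, just to exercise the connection
        spark.stop();
    }
}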

TCP-IP Join in Hazelcast not working in ServiceMix

According to the Hazelcast article http://docs.hazelcast.org/docs/2.4/manual/html/ch12s02.html, I added the hostname of another PC to hazelcast.xml (which is generated in SERVICEMIX_HOME/etc) like below:
<tcp-ip enabled="true">
<hostname>FABLRDT061:5702</hostname>
<interface>127.0.0.1</interface>
</tcp-ip>
If I start ServiceMix, it is not able to connect to the hostname I specified because of the following connection refusal. The log message on the other PC is as below:
[172.16.25.64]:5702 [cellar] 5702 is accepting socket connection from /172.16.25.71:60770
[172.16.25.64]:5702 [cellar] 5702 accepted socket connection from /172.16.25.71:60770
[172.16.25.64]:5702 [cellar] Wrong bind request from Address[127.0.0.1]:5701! This node is not requested endpoint: Address[FABLRDT061]:5702
[172.16.25.64]:5702 [cellar] Connection [/172.16.25.71:60770] lost. Reason: Explicit close
What could be the reason? Can someone help me out?
Hazelcast has a configuration file in which the discovery of nodes can be configured.
Even though the tutorials explain the following points, based on the hands-on work I did, my understanding is:
Multicast is for auto-discovery of Cellar nodes on the same system.
If Cellar nodes are present on different systems over the network, we use the tcp-ip configuration.
For multicasting we don't need to change anything unless we are using different multicast groups.
For discovering nodes using tcp-ip we need to specify IP addresses (as explained by many tutorials, but not exactly how):
Under the tcp-ip tag, create a tag called hostname in which the hostname or IP address of the other system should be mentioned. In the interface tag, specify the current system's IP address.
The same should be done on the other nodes.
I would stay away from using hostnames and replace them with IP addresses.
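Concretely, that means: disable multicast, enable the tcp-ip join with the other members' IP addresses, and pin the local interface to this machine's own IP. A sketch using Hazelcast's programmatic Java configuration (written against the 3.x API; in a Cellar/ServiceMix setup the same settings go into the tcp-ip and interfaces sections of hazelcast.xml, and the IPs below are the ones from the question):

import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.config.NetworkConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// TCP-IP join between two machines; run the mirror-image configuration on the other node.
public class TcpIpJoinExample {
    public static void main(String[] args) {
        Config config = new Config();
        NetworkConfig network = config.getNetworkConfig();
        network.setPort(5701);

        // Bind to this machine's own LAN address, not to 127.0.0.1.
        network.getInterfaces().setEnabled(true).addInterface("172.16.25.71");

        JoinConfig join = network.getJoin();
        join.getMulticastConfig().setEnabled(false);   // multicast discovery off
        join.getTcpIpConfig().setEnabled(true)
            .addMember("172.16.25.64:5702");           // the other node, by IP address

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        System.out.println("cluster members: " + hz.getCluster().getMembers());
    }
}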

Selecting an Interface when Multicasting on Linux

I'm working with a cluster of about 40 nodes running Debian 4. Each node runs a daemon which sits and listens on a multicast IP.
I wrote some client software to send out a multicast over the LAN with a client computer on the same switch as the cluster, so that each node in the cluster would receive the packet and respond.
It works great, except when I run the client software on a computer that has both LAN and WAN interfaces. If there is a WAN interface, the multicast doesn't work. So obviously, I figure the multicast is incorrectly going over the WAN interface (eth0) rather than the LAN (eth1). So I use the SO_BINDTODEVICE socket option to force the multicast socket to use eth1, and all is well.
But I thought that the kernel's routing table should determine that the LAN (eth1) is obviously a lower cost destination for the multicast. Is there some reason I have to explicitly force the socket to use eth1? And, is there some way (perhaps an ioctl call) that I can have the application automatically determine if a particular interface is a LAN or WAN?
"If you don't explicitly bind to an interface, I believe Linux uses the interface for the default unicast route for multicast sending."
Linux needs a multicast route; if none exists you will get an EHOSTUNREACH or ENETUNREACH error. The LCM project documents this possible problem. The routing will be overridden if you use the socket option IP_MULTICAST_IF or IPV6_MULTICAST_IF. You are supposed to be able to specify the interface via the scope-id field in IPv6 addresses, but not all platforms properly support it. As dameiss points out, Stevens' Unix Network Programming book covers these details; you can browse most of the chapter on multicast via Google Books for free.
If you don't explicitly bind to an interface, I believe Linux uses the interface for the default unicast route for multicast sending. So my guess is that your default route is via the WAN interface.
Richard Stevens' "Unix Network Programming, Vol. 1", chapter 17 (at least in the 3rd edition), has some good information and examples of how to enumerate the network interfaces.
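For reference, here is the same idea expressed through Java's socket API, since it wraps exactly these options: enumerate the interfaces, then force the outgoing multicast interface (the equivalent of setting IP_MULTICAST_IF in C). The group 239.255.0.1, the port 9000, and the interface name eth1 are placeholders.

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.util.Collections;

public class MulticastInterfacePick {
    public static void main(String[] args) throws Exception {
        // Enumerate candidate interfaces (roughly what Stevens does with SIOCGIFCONF in C).
        for (NetworkInterface ni : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            if (ni.isUp() && ni.supportsMulticast() && !ni.isLoopback()) {
                System.out.println(ni.getName() + " -> " + Collections.list(ni.getInetAddresses()));
            }
        }

        // Send a multicast packet out of eth1 regardless of the default unicast route.
        InetAddress group = InetAddress.getByName("239.255.0.1"); // placeholder group
        MulticastSocket sock = new MulticastSocket();
        sock.setNetworkInterface(NetworkInterface.getByName("eth1")); // maps to IP_MULTICAST_IF
        byte[] msg = "ping".getBytes();
        sock.send(new DatagramPacket(msg, msg.length, group, 9000)); // placeholder port
        sock.close();
    }
}

Deciding whether an interface is "LAN" or "WAN" still has to come from your own heuristic (for example, checking whether its addresses fall in a private range); nothing in the socket API labels interfaces that way.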
