Spark Worker - Change web UI host in standalone mode

When I view the master node's web ui, it shows all my current workers attached to the cluster.
https://spark.apache.org/docs/3.0.0-preview/web-ui.html
The issue I'm having, though, is that the IP address shown for the worker nodes in the web UI is incorrect. Is there a way to change the worker's web UI host/IP that is used in the master's web UI?
Reading through the documentation, there appears to be a "SPARK_WORKER_WEBUI_PORT" variable, which sets the port for the worker's web UI, but there doesn't seem to be a "SPARK_WORKER_WEBUI_HOST".
http://spark.apache.org/docs/latest/spark-standalone.html
To provide more context, I currently have a Spark cluster deployed in standalone mode. The cluster (master and slaves) is entirely behind a router (NAT). The workers bind to the master using their internal IP addresses, and I set up port forwarding to route external traffic to the master and to each slave. The issue is that because the workers register with the master using their internal IP addresses, those internal addresses are what appear in the master node's web UI. This makes the worker web UIs inaccessible to anyone outside my NAT. If there were a way to explicitly set the IP address used for each worker's web UI, that would resolve the problem. Thanks!

After more research, I determined that the environment variable I was looking for was: SPARK_PUBLIC_DNS
http://spark.apache.org/docs/latest/spark-standalone.html
This allowed me to set a different external host name for my workers.
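For example, a minimal sketch of what that looks like in conf/spark-env.sh on each worker; worker1.example.com is a placeholder for the worker's externally reachable hostname:

    export SPARK_PUBLIC_DNS=worker1.example.com   # hostname advertised in the master's web UI links
    export SPARK_WORKER_WEBUI_PORT=8081           # default worker UI port; match whatever you forward through the NAT

After restarting the worker, the master's web UI links to the worker via the public hostname instead of the internal IP.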

Related

Cassandra create pool with hosts on different ports

I have 10 Cassandra nodes running on Kubernetes on my server and 1 contact point that exposes the service on port 10023.
However, when the DataStax driver tries to establish a connection with the other nodes of the cluster, it uses the exposed port instead of the default one, and I get the following error:
com.datastax.driver.core.ConnectionException: [/10.210.1.53:10023] Pool was closed during initialization
Is there a way to expose one single contact point and have it communicate with the other nodes on the standard port (9042)?
I checked the DataStax documentation for anything related to this but didn't find much.
This is how I connect to the cluster:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SocketOptions;

// Build a single Cluster instance and connect through the exposed contact point.
Cluster cluster = Cluster.builder()
        .addContactPoints(address)
        .withPort(10023)                          // the exposed (forwarded) port
        .withCredentials(user, password)
        .withMaxSchemaAgreementWaitSeconds(600)
        .withSocketOptions(new SocketOptions()
                .setConnectTimeoutMillis(Integer.valueOf(timeout))
                .setReadTimeoutMillis(Integer.valueOf(timeout)))
        .withoutJMXReporting()
        .build();
Session session = cluster.connect();
After the driver contacts the first node, it fetches information about the cluster and uses that information for subsequent connections; this information includes the ports on which Cassandra listens.
To implement what you want, Cassandra itself has to listen on the corresponding port - this is configured via the native_transport_port parameter in cassandra.yaml.
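As a rough illustration (an assumption about your intent, not a tested configuration), the relevant line in cassandra.yaml on every node would be:

    native_transport_port: 10023    # default is 9042; must match the port clients connect on

Each node needs this change and a restart before the driver can reach it on that port.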
Also, by default the Cassandra driver will try to connect to all nodes in the cluster, because it uses a DCAware/TokenAware load balancing policy. If you want to use only one node, you need to use WhiteListPolicy instead of the default policy, but that is not optimal from a performance point of view.
I would suggest rethinking how you expose Cassandra to clients.
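For reference, a minimal sketch of the WhiteListPolicy approach with the DataStax Java driver 3.x API; address and the port are taken from the question, and this assumes you accept losing token-aware routing:

import java.net.InetSocketAddress;
import java.util.Collections;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;

// Restrict the driver to the single exposed contact point instead of the whole cluster.
WhiteListPolicy whiteList = new WhiteListPolicy(
        DCAwareRoundRobinPolicy.builder().build(),
        Collections.singletonList(new InetSocketAddress(address, 10023)));

Cluster cluster = Cluster.builder()
        .addContactPoints(address)
        .withPort(10023)
        .withLoadBalancingPolicy(whiteList)
        .build();

Every query then goes through that one node as coordinator, which is exactly the performance trade-off mentioned above.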

kubernetes master failing to join cluster

We're using k8s 1.9.3 managed via kops 1.9.3 in AWS, with gossip-based DNS and the Weave CNI network plugin.
I was doing a rolling update of the master instance groups to enable some additional admission controllers (PodNodeSelector and PodTolerationRestriction). I had done this in two other clusters with no problems. When the rolling update got to the third master (we run a 3-master setup), it brought down the instance and tried to bring up a replacement, but the new master instance failed to join the cluster.
On further research and subsequent attempts to roll the third master, I found that the failing master keeps trying to join the cluster under the old master's IP address, even though its own IP address is different. Watching kubectl get nodes | grep master shows that the cluster still thinks the node has the old IP address, and the join fails because the node no longer has that IP. It seems that, for some reason, the cluster's gossip-based DNS is not being notified of the new master's IP address.
This is causing problems because the kubernetes svc still has the old master's ip address in it, which is causing any api requests that get directed to that non-existent backend master to fail. It is also causing problems for etcd which keeps trying to contact it on the old ip address. Lots of logs like this:
2018-10-29 22:25:43.326966 W | etcdserver: failed to reach the peerURL(http://etcd-events-f.internal.kops-prod.k8s.local:2381) of member 3b7c45b923efd852 (Get http://etcd-events-f.internal.kops-prod.k8s.local:2381/version: dial tcp 10.34.6.51:2381: i/o timeout)
2018-10-29 22:25:43.327088 W | etcdserver: cannot get the version of member 3b7c45b923efd852 (Get http://etcd-events-f.internal.kops-prod.k8s.local:2381/version: dial tcp 10.34.6.51:2381: i/o timeout)
One odd thing: if I run etcdctl cluster-health on the available masters' etcd instances, they all show the unhealthy member ID as f90faf39a4c5d077, but when I look at the etcd-events logs I see the unhealthy member ID reported as 3b7c45b923efd852. So there seems to be some inconsistency within etcd.
Since we are running a three-node master setup with one master down, we don't want to restart any of the other masters to try to fix the problem, because we're afraid of losing quorum on the etcd cluster.
We use weave 2.3.0 as our network CNI provider.
I noticed on the failing master that the Weave CNI config /etc/cni/net.d/10-weave.conf isn't getting created, and the /etc/hosts files on the working masters aren't being updated with the new master's IP address. It seems like kube-proxy isn't getting the update for some reason.
We're running the default Debian 8 (jessie) image that is provided with kops 1.9.
How can we get the master to properly update DNS with its new IP address?
My co-worker found that the fix was restarting the kube-dns and kube-dns-autoscaler pods. We're still not sure why they were failing to update DNS with the new master IP, but after restarting them, adding the new master to the cluster worked fine.
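For anyone hitting the same thing, the restart amounted to deleting the pods so their deployments recreate them (a sketch; the kube-system labels shown are the usual kops defaults and may differ in your cluster):

    kubectl -n kube-system delete pods -l k8s-app=kube-dns
    kubectl -n kube-system delete pods -l k8s-app=kube-dns-autoscaler

Once the replacement pods came up, DNS picked up the new master's IP and the node joined normally.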

Apache Cassandra Server and Datastax Client - Changing IP Addresses

We are using the latest Apache Cassandra database server, and the Datastax Node.js client, running in the cloud.
When our Cassandra servers are rebuilt, they get new IP addresses. Any running service clients then can't find the new servers; the client driver apparently caches the IP addresses instead of resolving DNS again.
Is there some way around this problem, other than shutting down the client and creating a new one in our services whenever we encounter an error accessing the database?
If you only have 1 server, there is nothing you can do.
Otherwise, when a node is rebuilt (if it is a single node in a cluster of many), it will advertise the new IP to the cluster and the cluster topology is updated. The peers table will then be updated, and the driver can register this event (AFAIK).
But why not use private static addresses for your Cassandra nodes?

Will the DataStax Cluster class ever refresh IP address from the hostname given to builder.addContactPoint() if DNS changes?

I have a problem: once the host name is set, the cluster won't update its IP, even when DNS changes.
Alternatively, what is the recommended way of making the application resilient to the fact that more nodes can be added to the DNS round robin and old nodes decommissioned?
I had the same thing with the Astyanax driver. For me, it looks like it works this way:
The DNS name is used only when the initial connection to the cluster is created. At that point the driver collects data about the cluster nodes. This information is already kept in terms of IP addresses, and DNS names are not used any more. Subsequent changes in the cluster topology are also propagated to the client using IP addresses.
So when you add more nodes to the cluster, you do not actually have to assign domain names to them. Just adding a node to the cluster propagates its IP address into the cluster topology table, and this info is distributed among all cluster members and smart clients like the Java driver (some third-party clients might not have this info and will route queries only through the seed nodes).
When you decommission a node it works the same way. All cluster nodes and smart clients receive the information that the node with a particular IP is no longer in the cluster. It can even be the initial seed node.
The domain name matters only for clients that haven't yet established a cluster connection.
In case you really need to switch an IP, you have to (see the nodetool sketch after this list):
Join node with new IP
Decommission node with old IP
Assign DNS name to new IP
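A rough sketch of that procedure with nodetool (where each command runs is an assumption; adapt it to your deployment):

    # 1. Configure the new node with the new IP, start Cassandra, and let it bootstrap into the ring.
    # 2. On the node with the old IP, stream its data away and leave the ring:
    nodetool decommission
    # 3. Confirm only the new IP remains in the ring, then repoint the DNS name at it:
    nodetool status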

IIS network load balancing

I have a clustered setup with 4 nodes running Windows Server 2008 R2 with IIS 7.
Failover kicks in when one of the nodes fails, but is there a way to have it distribute incoming calls round-robin across the servers?
This works when incoming requests come from different clients, but our investigation shows that if a single client makes many requests, they all go to the same server.
I would like the cluster to round-robin requests so that node 1 receives the first request, node 2 receives the second request, and so on.
Each request can take a long time, and having all requests go to the same node while the other 3 sit idle is causing us performance issues. Thanks
NLB port rules have a couple of properties that control how requests are routed. The relevant properties seem to be:
Filtering mode - specifies whether a single host or multiple hosts in the cluster handle traffic for the given port
Affinity - controls how traffic is routed to hosts in the cluster
It is likely you need to set the Affinity value to none, which allows requests to be routed to multiple hosts within the cluster. The docs do not state whether round-robin or another algorithm is used for load balancing.
For more on Filtering Mode and Affinity: Network Load Balancing Manager Properties
How to: Edit a Network Load Balancing Port Rule
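If you prefer to script the change instead of using the NLB Manager GUI, the NetworkLoadBalancingClusters PowerShell module exposes the same setting; this is only a rough sketch, so verify the exact cmdlet and parameter names with Get-Help Set-NlbClusterPortRule before running it:

    # Set affinity to None so requests from a single client can be spread across hosts
    Get-NlbClusterPortRule | Set-NlbClusterPortRule -NewAffinity None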
Round-robin load balancing will not distribute traffic coming from a single source. You will need to configure your load balancer for 'Least Connections'.
Basically, the NLB passes a new connection to the pool member or node that has the least number of active connections.
