Disable Spark master's check for hostname equality - apache-spark

I have a Spark master running in a Docker container, which in turn is executed on a remote server. Next to the Spark master there are containers running Spark slaves on the same Docker host.
Server <---> Docker Host <---> Docker Container
In order to let the slaves find the master, I give the master container the Docker hostname SPARKMASTER, which the slaves use to connect to the master. So far, so good.
I use the SPARK_MASTER_IP environment variable to let the master bind to that name.
I also exposed the Spark port 7077 to the Docker host and forwarded this port on the physical server host. The port is open and available.
Now on my machine I can connect to the Server using its IP, say 192.168.1.100. When my Spark program connects to the server on port 7077 I get a connection, which is disassociated by the master:
15/10/09 17:13:47 INFO AppClient$ClientEndpoint: Connecting to master spark://192.168.1.100:7077...
15/10/09 17:13:47 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@192.168.1.100:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
I already learned that the reason for this disconnection is that the host IP 192.168.1.100 doesn't match the hostname SPARKMASTER.
I could add an entry to my /etc/hosts file, which would probably work. But I don't want to do that. Is there a way to completely disable this check for hostname equality?
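For context, a minimal sketch of the kind of master container setup described above; the image name and Spark installation path are placeholders, not the actual ones used:

# placeholder image and path; SPARK_MASTER_IP mirrors the setup described above
docker run -d --name spark-master --hostname SPARKMASTER \
    -e SPARK_MASTER_IP=SPARKMASTER \
    -p 7077:7077 \
    my-spark-image \
    /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master --host SPARKMASTER --port 7077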

Related

Running a Taurus command from the master node on Azure containers, which is unable to reach the slave node due to a java.rmi.MarshalException

Error at master node trying to connect to remote JMeter slave node in same network
You need to ensure that at least port 1099 is open; check out the "How to open ports to a virtual machine with the Azure portal" article for more details.
Apart from port 1099 you need to open the following (see the example below):
The port you specify as the server.rmi.localport on slaves
The port you specify as the client.rmi.localport on master
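For example, assuming arbitrary port choices of 50000 and 60000, the corresponding jmeter.properties entries would look like this:

# on each slave (jmeter-server)
server.rmi.localport=50000
# on the master (client)
client.rmi.localport=60000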
More information:
Remote hosts and RMI configuration
JMeter Distributed Testing with Docker
JMeter Remote Testing: Using a different port

Docker-compose: connect from Docker container to linux host

In "naked" Docker, I can use --network=host to make the host reachable from the container at 127.0.0.1, or I can reach the host through the docker0 bridge at 172.17.0.1.
Ostensibly, docker-compose supports the host networking driver through network_mode: host. Unfortunately, this prevents me from using exposed ports. Containers created with docker-compose don't live on the docker0 bridge, and hence would reach the host at some address like 172.18.0.1 or 172.19.0.1, decided at startup.
Even once I know that address, it doesn't seem to be reachable from the container.
How can I connect from the container to the host?
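One commonly used workaround, assuming Docker Engine 20.10 or newer and a placeholder service/image name, is to map a stable name to the host's gateway address with extra_hosts:

services:
  app:
    image: my-app-image                          # placeholder
    extra_hosts:
      - "host.docker.internal:host-gateway"

The container can then reach the host as host.docker.internal regardless of which bridge subnet Compose picks at startup.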

Bind docker container loopback to the host loopback

I pull a Docker image (I will use Python 3 as an example):
docker pull python:3.6
Then I launch a docker container
docker run -it -p 127.0.0.1:8000:8000 python:3.6 bash
(note that 127.0.0.1 in 127.0.0.1:8000:8000 specifies the destination host IP, but not the source)
So if I launch a server inside the container at 0.0.0.0:
python -m http.server 8000 --bind 0.0.0.0
then I can access the container's server from the host without any problem by going to http://127.0.0.1:8000 on the host machine.
However, if the server in the container binds to 127.0.0.1 instead of 0.0.0.0:
python -m http.server 8000 --bind 127.0.0.1
then accessing http://127.0.0.1:8000 from the host does not work.
What's the proper way of binding the container's loopback 127.0.0.1 to the host loopback?
On Linux, this can be done by configuring your Docker container to use the host's network namespace, i.e.:
docker run --network=host
This only works on Linux, because on Linux your machine is the host and the containers run directly in your machine's OS. On Windows/OSX, the Docker host runs as a virtual machine, with the containers running inside that virtual machine, so they can't share your machine's network namespace.
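For example, on Linux, reusing the python:3.6 image from the question:

docker run --rm --network=host python:3.6 python -m http.server 8000 --bind 127.0.0.1
# then, from another shell on the host:
curl http://127.0.0.1:8000/

Because the container shares the host's network namespace, 127.0.0.1 inside the container is the host's loopback.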
What's the proper way of binding the container's loopback 127.0.0.1 to the host loopback?
You can't do that. The loopback interface inside a container means "only this container", just like on the host it means "only this host". If a service binds to 127.0.0.1, then there is no way -- from your host or from another container -- to reach that service.
The only way to do what you want is either:
Modify the application configuration to listen on all interfaces (or eth0 specifically), or
Run a proxy inside your container that binds to all interfaces and forwards connections to the localhost address (see the sketch below).
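A minimal sketch of the second option, assuming socat is installed in the container and the service listens on 127.0.0.1:8000 (the port numbers are examples):

# inside the container: accept connections on all interfaces at 8001
# and forward them to the loopback-only service on 8000
socat TCP-LISTEN:8001,fork,reuseaddr TCP:127.0.0.1:8000

Port 8001 can then be published to the host with -p 127.0.0.1:8001:8001, as in the question.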

UnknownHostException in prestodb

My Hive metastore and HDFS cluster are not directly accessible from my local machine, so I use SSH port forwarding to access them, with a dynamic SOCKS proxy and SSH listening on local port 1080. But the error is: query fail: UnknownHostException linux-hostname
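For reference, that kind of dynamic SOCKS proxy is typically opened with something like the following (the user and gateway host are placeholders):

ssh -N -D 1080 user@gateway-host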
I have solved the problem myself. Previously, I used the Ethernet IP by adding the network configuration in the system settings; finally, I set a static IP in the file /etc/network/interface, and then the program ran successfully.
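For reference, a static address on a Debian-style system is configured with a stanza like the following (the interface name and addresses are examples only):

auto eth0
iface eth0 inet static
    address 192.168.1.50
    netmask 255.255.255.0
    gateway 192.168.1.1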

Spark Clusters: worker info doesn't show on web UI

I have installed Spark standalone on a set of cluster nodes. I tried to launch the cluster through the cluster launch scripts. I have added the workers' IP addresses to the conf/slaves file. The master connects to all slaves through password-less SSH.
After running ./bin/start-slaves.sh script, I get the following message:
starting org.apache.spark.deploy.worker.Worker, logging to /root/spark-0.8.0-incubating/bin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-jbosstest2.out
But the web UI of the master (localhost:8080) does not show any information about the worker. However, when I add a localhost entry to my conf/slaves file, the worker info for localhost is shown.
There are no error messages; the terminal says the worker is started, but the web UI does not show any workers.
I had the same problem. I noticed it when I could not telnet to master:port from the slaves. In my /etc/hosts file (on the master) I had a 127.0.0.1 master entry (before my 192.168.0.x master entry). When I removed the 127.0.0.1 entry from /etc/hosts, I could telnet, and when I ran start-slaves.sh (from the master) my slaves connected.
When you run the cluster, run the jps command on the worker nodes to check whether the worker is up correctly, and check its logs using the worker's PID.
or
set the following, then run the cluster and check whether your configured ports are up:
export SPARK_MASTER_WEBUI_PORT=5050
export SPARK_WORKER_WEBUI_PORT=4040
Check your /etc/hosts file and look at the bindings for the master.
If your master hostname is bound to localhost as well as an IP address (e.g. 192.168.x.x), remove the localhost entry. If the localhost mapping is left intact, the master will be bound to localhost, which won't allow the slaves to connect to the master's IP address.
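For example, assuming the master's hostname is master and its address is 192.168.0.10 (both placeholders), /etc/hosts on the master should look like this:

127.0.0.1     localhost            # do not also map the master hostname here
192.168.0.10  master               # only this entry should resolve the master hostname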
You can use ./start-master.sh --host 192.168.x.x instead of changing the /etc/hosts file.
I met the same issue and finally solved it by adding the following line to $SPARK_HOME/conf/spark-env.sh:
SPARK_MASTER_HOST=your_master_ip_address
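That is, with example addresses, conf/spark-env.sh would contain something like:

# bind the standalone master to the cluster-facing address instead of localhost
SPARK_MASTER_HOST=192.168.0.10
# optionally, on each node, pin that node's own bind address as well
SPARK_LOCAL_IP=192.168.0.10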
