Spark: Unable to Limit WebUI to localhost interface

I've searched around the mailing list and the SO spark tag, but it seems that (nearly?) everyone has the opposite problem to mine. I took a stab at looking in the source for an answer, but I figured I might as well see if anyone else has run into the same problem.
I'm trying to limit my Master/Worker UI to run only on localhost. As it stands, I have the following two environment variables set in my spark-env.sh:
SPARK_LOCAL_IP=127.0.0.1
SPARK_MASTER_IP=127.0.0.1
and my slaves file contains one line: 127.0.0.1
The problem is that when I run start-all.sh, I can nmap my box's public interface and get the following:
PORT STATE SERVICE
22/tcp open ssh
8080/tcp open http-proxy
8081/tcp open blackice-icecap
Furthermore, I can go to my box's public IP at port 8080 in my browser and get the master node's UI. The UI even reports the URL/REST URLs as 127.0.0.1:
Spark Master at spark://127.0.0.1:7077
URL: spark://127.0.0.1:7077
REST URL: spark://127.0.0.1:6066 (cluster mode)
I'd rather not have Spark available in any way to the outside world without an explicit SSH tunnel.
There are variables to do with setting the Web UI port, but I'm not concerned with the port, only the network interface to which the Web UI binds.
Any help would be greatly appreciated.

For Spark 1.6, do the following:
open core/src/main/scala/org/apache/spark/ui/WebUI.scala
find the line serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf, name))
change "0.0.0.0" in that line to 127.0.0.1 (loopback only) or to a hostname defined on your LAN
After this you can access the WebUI through the LAN or an SSH tunnel.
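If you keep the UI bound to loopback, an explicit SSH tunnel (as the question asks for) is enough to reach it from outside. A minimal sketch, where user, your-server, and the default 8080 master UI port are placeholders:
ssh -N -L 8080:127.0.0.1:8080 user@your-server
# then browse to http://localhost:8080 on the local machine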

Related

Cannot connect from windows to redis linux server

I cannot connect to a Redis server (Ubuntu Server 16.04 LTS 64-bit on a separate PC) from Windows 8.1 64-bit. Redis is well documented; however, I found very little information on how to connect to a Redis server from a separate machine.
I have installed the latest version of Redis on Linux, and locally everything works fine. I start the server via redis-server, then start redis-cli, and after that I am able to add information to the server and retrieve it. The situation is the same on Windows - everything works locally.
To connect from Windows to the Linux Redis server, I made the following changes.
In linux I set the static local IP via sudo nano /etc/network/interfaces
address 192.168.xxx.xxx
netmask 255.255.255.0
network 192.168.xxx.xxx
broadcast 192.168.xxx.xxx
gateway 192.168.xxx.xxx
dns-nameservers 8.8.8.8
In the redis.conf file I bind my Windows PC's IP, which is assigned by my internet service provider. I also opened TCP port 6379 in my router's GUI. On Windows I modified the redis.windows-service.conf and redis.windows.conf files; in both of them I bind the IP address assigned by my internet service provider. After this I cannot start redis-cli properly (only an empty black cmd window is visible).
What am I doing wrong? I would be very grateful for any help.
You should modify the Redis conf; my Redis conf is located at /etc/redis/6379.conf.
Comment out the line "bind 127.0.0.1", or change it to bind 0.0.0.0.
The bind directive specifies which network interfaces the Redis server listens on. The default is localhost.
Also change protected-mode to no:
Protected mode is a layer of security protection, in order to avoid that
Redis instances left open on the internet are accessed and exploited.
When protected mode is on and if:
1) The server is not binding explicitly to a set of addresses using the
"bind" directive.
2) No password is configured.
The server only accepts connections from clients connecting from the
IPv4 and IPv6 loopback addresses 127.0.0.1 and ::1, and from Unix domain
sockets.
By default protected mode is enabled. You should disable it only if
you are sure you want clients from other hosts to connect to Redis
even if no authentication is configured, nor a specific set of interfaces
are explicitly listed using the "bind" directive.
protected-mode yes
If you don't disable protected mode, your Redis server will not accept connections on a public IP interface; see the config comments above for more detail.
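Putting the two changes together, a minimal sketch of the edited redis.conf lines (binding to 0.0.0.0 exposes every interface; narrow it to a specific LAN IP where possible, and set requirepass if untrusted hosts can reach the box):
bind 0.0.0.0
protected-mode no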
If you can access the remote server from your machine, your problem is most probably with the Redis security config; read the Securing Redis section in this document.
I found that most of the time people don't change the "bind" directive value in the Redis config. You can test that by setting bind 0.0.0.0 and restarting the Redis server; if that's the issue, you can then allow whatever subnets need to access the server.
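A quick way to test this from the Windows machine after restarting the server (redis-cli ships with the Windows builds; 192.168.1.10 is a placeholder for the Linux box's LAN IP):
redis-cli -h 192.168.1.10 -p 6379 ping
# a PONG reply means the bind/protected-mode settings are no longer blocking you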
I have also experienced the same issue trying to connect to Redis (MSOpenTech 3.0.5 and 3.2.1). By default, if no binding is stated, then Redis (according to the comments in the conf file) will listen on all available interfaces. That said, v3.2.1 does have 'bind 127.0.0.1' already set; in 3.0.5, setting the binding to 'bind 127.0.0.1' still allows the redis-cli to be used. Binding to 192.168.1.2 renders the redis-cli unusable with both versions - there is no IP and port prompt, simply a caret, and the cli does not accept keyboard input. Binding to an external IP, the MSOpenTech fork service will not restart and throws an error (nice). Clearing all bindings and reverting back to the original state, the redis-cli becomes usable again. Also, on the MSOpenTech fork there is no 'protected-mode' setting in either config file, so I am not sure whether this can actually be set.
I have raised this as an issue on the MSOpenTech fork via GitHub, but I expect silence to be the only reply...
I'm not sure this helps you in any way other than knowing that you are not alone. I am trying to pub from PHP to AS3 subscribers - it works great in the Flash IDE, but from the localhost browser Redis appears to go decidedly deaf.

Cassandra 3.9, how to remote access [duplicate]

I have built Cassandra server 2.0.3 and run it. It starts and then stops with these messages:
X:\MyProjects\cassandra\apache-cassandra-2.0.3-src\bin>cassandra.bat >log.txt
java.lang.RuntimeException: Unable to gossip with any seeds
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:576)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:475)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)
What can I change to make it run?
I had a similar problem with my cassandra v2.0.4 cluster running a single node.
Check your cassandra.yaml and make sure that your "listen_address" and "seeds" values match, with the exception that the seeds value requires quotes around it.
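For example, a minimal sketch of matching values, where 192.168.1.10 stands in for your node's address:
listen_address: 192.168.1.10
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.10"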
You might get this problem if your private IP address is different than the public one (like on AWS). For example, the host thinks it's "172.31.0.2" when it's visible as "55.70.33.10".
The solution to this problem is:
listen_address: 172.31.0.2
broadcast_address: 55.70.33.10
in cassandra.yaml
Make sure your cluster_name entry matches on all the nodes in the cluster
(you may need to delete your storage if you changed the cluster name)
Verify that all nodes can ping to each other
broadcast_rpc_address and listen_address should be set to local IP
(not localhost or 127.0.0.1)
seeds should point to the IP address of the seed(s)
If you are on AWS and use the Ec2MultiRegionSnitch you will need to set the seeds to the public IP addresses rather than the private IPs.
I had the same problem on Ubuntu 16.04. I'm not sure which of these changes made it work. Below are selections from cassandra.yaml, where XXX.XXX.XXX.XXX is your public-facing IP address:
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "XXX.XXX.XXX.XXX"
listen_address: XXX.XXX.XXX.XXX
broadcast_address: XXX.XXX.XXX.XXX
broadcast_rpc_address: XXX.XXX.XXX.XXX
listen_on_broadcast_address: true
start_rpc: true
rpc_address: XXX.XXX.XXX.XXX
I also needed to restart my Virtual Machine for some reason. ¯\_(ツ)_/¯
For a quick single node setup on RHEL, I did the following:
Get info about your network interface setup:
# /sbin/ifconfig -a
It will list the interfaces and the IP addresses they are attached to.
Usually it will show an "Ethernet" interface and a "Local Loopback".
Note the associated IP addresses.
Then edit conf/cassandra.yaml:
rpc_address: [Local Loopback address]
broadcast_rpc_address: [Ethernet address]
listen_address: [Local Loopback address]
broadcast_address: [Ethernet address]
listen_on_broadcast_address: true
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "[Ethernet address]"
Then also open the correct ports in the Linux firewall: 9042, 7000 and 7001. More info about opening ports on Linux here:
http://ask.xmodulo.com/open-port-firewall-centos-rhel.html
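On RHEL/CentOS 7 and later with firewalld, opening those ports would look something like this (a sketch for the default zone; adjust if you use custom zones):
sudo firewall-cmd --permanent --add-port=9042/tcp
sudo firewall-cmd --permanent --add-port=7000-7001/tcp
sudo firewall-cmd --reload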
In cassandra.yaml, I updated the seeds from a domain name to an IP address, and it worked.
This happened to me because the "initial_token" setting was specified in my configuration (I think because I just copied the configuration file over from another cluster member). After clearing the data directory, commenting out the setting, and restarting the node, it worked fine for me.
I experienced this error today...
I could not find any reason for the error other than timing issues.
I restarted many times, and after a while it stuck. It looks like they expect bi-directional communication on the gossip channel, and if it does not happen quickly enough (which looks like a very small amount of time to me), then they drop the line and generate that error.
In my case I had just upgraded my software and restarted the computer, so it was clearly not a connection issue between the computers (I have firewalls and SSL, to complicate matters) and the node was connected before... So the one entry I found in that regard from DataStax did not apply:
https://support.datastax.com/hc/en-us/articles/209691483-Bootstap-fails-with-Unable-to-gossip-with-any-seeds-yet-new-node-can-connect-to-seed-nodes
I got the same error. There can be more than one solution; I hope my mistake is the same as yours.
I had my localhost IP pointing to a domain name (I did that so that my Spring Boot application's server context would be a domain name like www.example.com:8080 instead of localhost:8080), and I had the following entry in my hosts file on my Windows system:
127.0.0.1 www.example.com
Meanwhile my Cassandra batch file was looking for localhost, which it didn't find. So I made another entry for localhost too in my hosts file:
127.0.0.1 localhost
127.0.0.1 www.example.com
After adding it, I opened a new command prompt and ran the Cassandra batch file from the Cassandra bin directory, and it then worked.
Disable the firewall and SELinux, and try again.
In our case SSL was enabled, and the cassandra.yaml configuration looked fine per the comments above. We then enabled SSL debugging by adding the JVM parameter -Djavax.net.debug=ssl:handshake in cassandra-env.sh.
After starting the node again, we noticed the following in the Cassandra log file:
MessagingService-Outgoing-geo2_host/xx.xx.xx.xx, Exception while waiting for close javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
After further investigating the SSL debug logs, we learned that the certificate was not valid. After fixing this SSL issue the node was able to join the cluster.
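For reference, the debug flag goes into cassandra-env.sh as an ordinary JVM option; a sketch, using the JVM_OPTS variable that file already builds up:
JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl:handshake"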
Thanks to elvingt.
His answer reminded me that I needed to verify that all nodes can talk to each other.
https://support.datastax.com/hc/en-us/articles/209691483-Bootstap-fails-with-Unable-to-gossip-with-any-seeds-yet-new-node-can-connect-to-seed-nodes
Gossip communications must be bi-directional.
To verify, use this command; you need to test from BOTH sides:
nc -vz {your_node_ip} 7000
Then I recollected that I had turned on my Ubuntu firewall the night before. I opened the port with:
sudo ufw allow 7000/tcp
And it is working now
Getting the error "Unable to gossip with any seeds" during startup/bootstrap indicates there is some issue with broadcast_address. broadcast_address is responsible for communication with other nodes, not with clients.
This address must be set on the seed node (it is mandatory for the seed node). If you are using cloud VMs you might have different IPs (public and private); hence it is recommended to use your private IPs for broadcast_address - this will save your network cost as well.
# Address to broadcast to other Cassandra nodes
# Leaving this blank will set it to the same value as listen_address
broadcast_address: 10.11.xx.xxx
In my scenario I was using IBM, and once I set broadcast_address on the seed nodes the issue got resolved.
Please make sure you start your seed node first and then the other nodes; this order is mandatory.
In cassandra.yaml, changing the listen_address value from localhost to the domain name solved my issue.
I had the same issue. I checked ports and used tcpdump and netcat to test connections, and finally it came down to expired SSL certificates for internode_encryption. I modified internode_encryption to 'none' and restarted all nodes, and it worked.
Before that, all neighbor nodes were down, and the node repair command was failing with:
"Did not get positive replies from all endpoints"
P.S. Don't leave internode_encryption as none for a long time; just regenerate the certs and enable it back.
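In cassandra.yaml the setting lives under server_encryption_options; a minimal sketch of the temporary change (none disables internode encryption entirely, so treat it as a stopgap):
server_encryption_options:
    internode_encryption: none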

I can't connect to CouchDB UI in other computer

After I loaded the couch database and confirmed I could connect to localhost on port 5984, I wanted to access this web console from another computer, but it doesn't work. I changed to various other ports and checked the firewall, but found no problems there. Has anybody had the same experience?
Thanks in advance.
Another question: I changed the web port in local.ini and killed the previously loaded application, but why is the previous one still alive? Is there any command to unload/stop the application? I can't find the command in the bin directory.
Change the parameter bind_address in the config from 127.0.0.1 (accessible from localhost only) to 0.0.0.0.
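In local.ini that change looks like the sketch below; the [chttpd] section applies to CouchDB 2.x, while on 1.x the same key sits under [httpd]:
[chttpd]
bind_address = 0.0.0.0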

connection exception when using hadoop2 (YARN)

I have set up Hadoop (YARN) on Ubuntu. The resource manager appears to be running. When I run the hadoop fs -ls command, I receive the following error:
14/09/22 15:52:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Call From ubuntu-8.abcd/xxx.xxx.xxx.xxxx to ubuntu-8.testMachine:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I checked the URL suggested in the error message but could not figure out how to resolve the issue. I have tried setting the external IP address (as opposed to localhost) in my core-site.xml file (in etc/hadoop), but that has not resolved the issue. IPv6 has been disabled on the box. I am running the process as hduser (which has read/write access to the directory). Any thoughts on fixing this? I am running this on a single node.
bashrc
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_INSTALL=/usr/local/hadoop/hadoop-2.5.1
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export HADOOP_YARN_HOME=$HADOOP_INSTALL ##added because I was not sure about the line below
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
Your issue is not related to YARN; it is an HDFS connectivity issue.
Here is a question with a similar situation - the asker had port 9000 listening on an external IP interface, but the configuration was pointing to localhost. I'd advise first checking whether anything is listening on port 9000 at all, and on which interface. It looks like you have the service listening on an IP interface which differs from the one where you look for it. Looking at your logs, your client is trying ubuntu-8.testMachine:9000. To what IP is that name resolved? If it is assigned to 127.0.0.1 in /etc/hosts, you could have the situation from the question I mentioned - the client tries 127.0.0.1 but the service is waiting on the external IP. Or you could have the reverse. Here is a good default port mapping table for Hadoop services.
Indeed, many similar cases have the same root cause - wrongly configured host interfaces. People often configure their workstation hostname and assign this hostname to localhost in /etc/hosts. Moreover, they write the short name first and only then the FQDN. But this means the IP is resolved into the short hostname while the FQDN is resolved into the IP (non-symmetric).
This in turn provokes a number of situations where services are started on the local 127.0.0.1 interface and people have serious connectivity issues (are you surprised? :-) ).
The right approach (at least I encourage it, based on experience):
Assign at least one external interface that is visible to your cluster clients. If you have DHCP and don't want a static IP, bind your IP to the MAC address so you keep a 'constant' IP value.
Write the local hostname into /etc/hosts to match the external interface, FQDN first and then the short name.
If you can, make your DNS resolver resolve your FQDN into your IP. Don't worry about the short name.
For example, if you have an external IP interface 1.2.3.4 and the FQDN (fully qualified domain name) set to myhost.com, your /etc/hosts record MUST look like:
1.2.3.4 myhost.com myhost
And yes, it's better if your DNS resolver knows about your name. Check both forward and reverse resolution with:
host myhost.com
host 1.2.3.4
Yes, clustering is not so easy in terms of network administration ;-). It never has been and never shall be.
Be sure that you have started all the necessary services: type start-all.sh; this command will start all the services needed for the connection to Hadoop.
After that, you can type jps; with this command you can see all the services running under Hadoop. Finally, check the open ports of these services with netstat -plnet | grep java.
Hope this solves your issue.

How to use the webUI for Heritrix remotely

Hello, I have been playing with Heritrix and would like to include it on a website / allow remote web access to it.
I have a Linux based server where I have a hosted webpage, and I have built a version of Heritrix.
The issue is I am at home now and would like to be able to offer access to the webUI in Heritrix via the hosted webpage.
I looked through the manual and discovered the -b option for binding to other hosts; however, the documentation could be better.
So what I was hoping for was a little explanation/elaboration of how this option works, and whether it would be possible to bind the web UI to the existing webpage.
Thanks for your time in advance.
(Here is a link to the documentation I'm working from: https://webarchive.jira.com/wiki/display/Heritrix/HOWTO+Launch+Heritrix )
You should use -b <public ip address>, e.g. -b 192.168.1.1.
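In context, the flag goes on the launch command line; a sketch, assuming a Heritrix 3 layout and the -a admin:admin login option from the linked HOWTO:
./bin/heritrix -a admin:admin -b 192.168.1.1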
If you don't want to use a public IP, you can use SSH port forwarding to do this. When creating a PuTTY session, under Connection > SSH > Tunnels enter the following:
Source port: 8443 (or the port Heritrix is installed on, if different)
Destination: localhost:8443 (it's good practice to match the port you're forwarding)
Back on the Session window, make sure you save the session. Now whenever you SSH onto your server you can access the Heritrix web UI by hitting https://localhost:8443
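The plain OpenSSH equivalent of that PuTTY tunnel, for anyone connecting from a terminal (user and your-server are placeholders):
ssh -N -L 8443:localhost:8443 user@your-server
# then browse to https://localhost:8443 locally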
