How to use the history server in Apache Spark

I logged one of my Apache Spark jobs using the following configuration snippet:
from pyspark import SparkConf
# Enable event logging; event logs are written to the directory "log".
conf = SparkConf()\
.setAppName("Ex").set("spark.eventLog.enabled", "true")\
.set("spark.eventLog.dir", "log")
After the job completed, I copied the log file app-20170416171823-0000 to my local system and ran the following command to browse the recorded Spark Web UI:
sbin/start-history-server.sh ~/Downloads/log/app-20170416171823-0000
But the history server terminated with the following error:
failed to launch: nice -n 0 /usr/local/Cellar/apache-spark/2.1.0/libexec/bin/spark-class org.apache.spark.deploy.history.HistoryServer /Users/sk/Downloads/log/app-20170416171823-0000
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:77)
... 6 more
full log in /usr/local/Cellar/apache-spark/2.1.0/libexec/logs/spark-sk-org.apache.spark.deploy.history.HistoryServer-1-Sk-MacBook-Pro.local.out
Contents of the output of the history server:
17/04/16 17:44:52 INFO SecurityManager: Changing modify acls groups to:
17/04/16 17:44:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(skaran); groups with view permissions: Set(); users with modify permissions: Set(skaran); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.IllegalArgumentException: Logging directory specified is not a directory: file:/Users/sk/Downloads/log/app-20170416171823-0000
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:198)
at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:153)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:149)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:77)
... 6 more

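The error message says "Logging directory specified is not a directory", so the argument should be the folder containing the log, not the log file itself. Here that means passing ~/Downloads/log rather than ~/Downloads/log/app-20170416171823-0000.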
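As a minimal sketch of that fix, the helper below maps a path to the directory that should be handed to start-history-server.sh: if the path is an event log file, it returns the parent directory. The function name history_log_dir is hypothetical, and the sketch assumes the event log is a plain file on the local filesystem, as in the question.

```python
from pathlib import Path


def history_log_dir(path: str) -> Path:
    """Return the directory to pass to the history server.

    Hypothetical helper: if `path` is an event log file such as
    app-20170416171823-0000, return its parent directory instead.
    """
    p = Path(path).expanduser()
    if p.is_dir():
        return p
    if p.is_file():
        return p.parent
    raise FileNotFoundError(f"No such file or directory: {p}")
```

For the path in the question, this would return the expanded form of ~/Downloads/log, which is the value to pass instead of the log file.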

Related

How to start Spark History Server pointing to Google Cloud?

I'm using the following command to start the server:
sh /usr/local/Cellar/apache-spark/3.2.1/libexec/sbin/start-history-server.sh --properties-file /usr/local/Cellar/apache-spark/3.2.1/custom-configs/cloud.properties
cloud.properties file contents
spark.history.fs.logDirectory=gs://bucket/spark-history-server
Got the following error
Spark Command: /Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/bin/java -cp /usr/local/Cellar/apache-spark/3.2.1/libexec/conf/:/usr/local/Cellar/apache-spark/3.2.1/libexec/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer --properties-file /usr/local/Cellar/apache-spark/3.2.1/custom-configs/cloud.properties
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/05/03 00:58:50 INFO HistoryServer: Started daemon with process name: XXXX
22/05/03 00:58:50 INFO SignalUtils: Registering signal handler for TERM
22/05/03 00:58:50 INFO SignalUtils: Registering signal handler for HUP
22/05/03 00:58:50 INFO SignalUtils: Registering signal handler for INT
22/05/03 00:58:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/05/03 00:58:50 INFO SecurityManager: Changing view acls to: xxx
22/05/03 00:58:50 INFO SecurityManager: Changing modify acls to: xxx
22/05/03 00:58:50 INFO SecurityManager: Changing view acls groups to:
22/05/03 00:58:50 INFO SecurityManager: Changing modify acls groups to:
22/05/03 00:58:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xxx); groups with view permissions: Set(); users with modify permissions: Set(xxx); groups with modify permissions: Set()
22/05/03 00:58:50 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:305)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "gs"
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3443)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:116)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:88)
... 6 more
How to set the FileSystem configuration here?
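One common cause of No FileSystem for scheme "gs" is that Hadoop's Google Cloud Storage connector is either missing from the history server's classpath or not registered for the gs scheme. The sketch below renders a properties file for that case. It assumes the GCS connector jar has already been placed in Spark's jars directory; the helper name is hypothetical, the bucket path is the one from the question, and authentication settings are omitted.

```python
def render_gcs_history_properties(log_dir: str) -> str:
    """Render history-server properties that register the gs:// scheme.

    Assumption: the GCS connector jar is already on the classpath.
    """
    props = {
        "spark.history.fs.logDirectory": log_dir,
        # Hadoop FileSystem implementation for the gs:// scheme
        "spark.hadoop.fs.gs.impl":
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
        # AbstractFileSystem binding used by Hadoop's FileContext API
        "spark.hadoop.fs.AbstractFileSystem.gs.impl":
            "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


print(render_gcs_history_properties("gs://bucket/spark-history-server"))
```

The printed text could replace the contents of cloud.properties; whether the explicit fs.gs.impl entries are needed depends on the connector version, so treat them as a starting point.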

Network error log on spark docker(bitnami/spark) cluster

Server 1 : Master, Slave Node
Server 2 : Slave Node
Server 3 : Slave Node
When I submit the pi.py example to the master node, many jobs finish with exit code 1.
The worker node log shows the same failure, as below.
I don't know the exact reason. Could you give me some advice?
20/03/12 13:21:54 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 7571@803acbaf5fbf
20/03/12 13:21:54 INFO SignalUtils: Registered signal handler for TERM
20/03/12 13:21:54 INFO SignalUtils: Registered signal handler for HUP
20/03/12 13:21:54 INFO SignalUtils: Registered signal handler for INT
20/03/12 13:21:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/03/12 13:21:55 INFO SecurityManager: Changing view acls to: spark,root
20/03/12 13:21:55 INFO SecurityManager: Changing modify acls to: spark,root
20/03/12 13:21:55 INFO SecurityManager: Changing view acls groups to:
20/03/12 13:21:55 INFO SecurityManager: Changing modify acls groups to:
20/03/12 13:21:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark, root); groups with view permissions: Set(); users with modify permissions: Set(spark, root); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:285)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to 67f75f899bac:43487
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: 67f75f899bac
at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
at java.security.AccessController.doPrivileged(Native Method)
at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:202)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:48)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:182)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:168)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:985)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:505)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:416)
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:475)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
This means that the worker nodes are not able to reach the master node. Because you are running Spark inside Docker, the worker and master containers must all be able to communicate with each other. Update /etc/hosts on all nodes with the correct IP addresses.
You can also update /etc/hosts on the Docker host and mount it into the container as a volume with -v /etc/hosts:/etc/hosts:rw.
Add --network host to your run command so that the container uses the Docker host's network directly. Change the hostname of the containers.
Add -e SPARK_MASTER_URL=spark://YOUR_HOST:7077 to your run command.
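The java.net.UnknownHostException: 67f75f899bac in the trace is a name-resolution failure, so before changing the cluster setup it may help to confirm which hostnames each container can actually resolve. A minimal sketch using only the Python standard library follows; the function name is hypothetical, and the script is meant to be run inside each container.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if this machine can resolve `hostname` to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # 67f75f899bac is the container hostname from the stack trace above.
    for name in ("localhost", "67f75f899bac"):
        print(name, "resolves" if can_resolve(name) else "does NOT resolve")
```

If the container hostname does not resolve from a worker, the /etc/hosts or hostname changes suggested above are the place to start.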
I found this thread with a similar problem.
Could you check your ACLs?

What configuration is required to run Apache Spark in an OpenShift (driver) + local machine (master and executor) setup?

I want to run an Apache Camel + Apache Spark program in which the Camel route accepts messages from ActiveMQ and routes them to Apache Spark. The Apache Camel route and the Apache Spark driver program run in Red Hat Fuse Integration Services (Spring Boot) on OpenShift (through Oracle VirtualBox), and the Apache Spark master and worker nodes run in a cluster on the local machine.
ActiveMQ also runs in OpenShift.
During execution, the message from ActiveMQ is consumed successfully, but the Spark program does not run the tasks on the executors.
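import org.apache.spark.SparkConf;
// Point the driver at the standalone master running on the local machine
SparkConf conf = new SparkConf()
.setMaster("spark://<master URL 192.XXC.56.XX>:7077");
Error in the executor (note that 172.17.0.6:35985 in the log below corresponds to the host in OpenShift where the driver is running):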
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/04/24 10:48:38 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 4296@12HW000634
18/04/24 10:48:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/04/24 10:48:39 INFO SecurityManager: Changing view acls to: Administrator,jboss
18/04/24 10:48:39 INFO SecurityManager: Changing modify acls to: Administrator,jboss
18/04/24 10:48:39 INFO SecurityManager: Changing view acls groups to:
18/04/24 10:48:39 INFO SecurityManager: Changing modify acls groups to:
18/04/24 10:48:39 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator, jboss); groups with view permissions: Set(); users with modify permissions: Set(Administrator, jboss); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:202)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to /172.17.0.6:35985
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection timed out: no further information: /172.17.0.6:35985
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:640)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 more
The executors time out, and the master spawns new executors one after the other.
What configuration is necessary to run the driver in OpenShift and connect to an Apache Spark cluster running on the local machine in a separate JVM?
You are missing configurations for spark.kubernetes.container.image and spark.submit.deployMode, at the very least.
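Independently of which settings are missing, the trace shows the executor timing out while connecting back to the driver at 172.17.0.6:35985. A quick way to check whether that address is reachable from the worker machine is a plain TCP probe. This is a diagnostic sketch, not part of Spark; the function name and the default timeout are assumptions.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False
```

Calling can_connect("172.17.0.6", 35985) on the worker machine while the driver is running would show whether the driver's address is routable from outside OpenShift.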

The worker's connection to a client inside a NAT network is refused

I set up a spark standalone cluster environment as below:
master: on cloud, has public IP address: master_ip_address
worker: on cloud, has public IP address
client: inside of NAT network
I run the following command on the client machine:
spark-shell --master spark://master_ip_address:7077
In the worker web UI, the stderr log shows:
17/10/17 00:54:39 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 1060@jrWS2016-mSpark
17/10/17 00:54:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/17 00:54:40 INFO SecurityManager: Changing view acls to: dbadmin,jshen
17/10/17 00:54:40 INFO SecurityManager: Changing modify acls to: dbadmin,jshen
17/10/17 00:54:40 INFO SecurityManager: Changing view acls groups to:
17/10/17 00:54:40 INFO SecurityManager: Changing modify acls groups to:
17/10/17 00:54:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dbadmin, jshen); groups with view permissions: Set(); users with modify permissions: Set(dbadmin, jshen); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:202)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
Caused by: java.io.IOException: Failed to connect to /10.154.10.3:38572
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /10.154.10.3:38572
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 more
The IP address 10.154.10.3 is the internal address of the client machine, not its external one, so the worker cannot connect to the client machine. That's the problem.
My question is: is there a way, through configuration or otherwise, to make this environment work?
Thanks.
Since your client is behind NAT, no machine can connect to it from outside. If feasible, I would advise using a bridged network.
If that's not an option, look at a host-only adapter, or use at least one VM that has both networks (NAT and bridged/host-only), then forward a specific port to work as a tunnel.
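If port forwarding is the route taken, the driver's address and ports need to be predictable, because by default the driver picks random ports (such as 38572 in the log above). The sketch below builds a spark-shell command that pins them with spark.driver.host, spark.driver.bindAddress (Spark 2.1 and later), spark.driver.port, and spark.blockManager.port. The helper name, the public address, and the port numbers are placeholders, and the forwarding rules on the NAT gateway still have to be created separately.

```python
from typing import List


def spark_shell_command(master_url: str, public_host: str,
                        driver_port: int, block_manager_port: int) -> List[str]:
    """Build a spark-shell command with fixed, forwardable driver ports."""
    conf = {
        # Address that executors use to connect back to the driver
        "spark.driver.host": public_host,
        # Local address the driver binds its listening sockets to
        "spark.driver.bindAddress": "0.0.0.0",
        "spark.driver.port": str(driver_port),
        "spark.blockManager.port": str(block_manager_port),
    }
    cmd = ["spark-shell", "--master", master_url]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    return cmd


# Placeholder values: replace with the gateway's public address and forwarded ports.
print(" ".join(spark_shell_command(
    "spark://master_ip_address:7077", "client_public_ip", 40000, 40001)))
```

Whether this is enough depends on the network: both pinned ports must be forwarded from the public address to the client machine.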

How to enable the Spark history server for a standalone cluster in non-HDFS mode

I have set up a Spark 2.1.1 cluster (1 master, 2 slaves) in standalone mode, following http://paxcel.net/blog/how-to-setup-apache-spark-standalone-cluster-on-multiple-machine/.
I do not have a pre-existing Hadoop setup on any of the machines. I wanted to start the Spark history server.
I run it as follows:
roshan@bolt:~/spark/spark_home/sbin$ ./start-history-server.sh
and in the spark-defaults.conf I set this:
spark.eventLog.enabled true
But it fails with the error:
17/06/29 22:59:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(roshan); groups with view permissions: Set(); users with modify permissions: Set(roshan); groups with modify permissions: Set()
17/06/29 22:59:03 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: file:/tmp/spark-events Did you configure the correct one through spark.history.fs.logDirectory?
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:214)
What should I set spark.history.fs.logDirectory and spark.eventLog.dir to?
Update 1:
spark.eventLog.enabled true
spark.history.fs.logDirectory file:////home/roshan/spark/spark_home/logs
spark.eventLog.dir file:////home/roshan/spark/spark_home/logs
but I am always getting this error:
java.lang.IllegalArgumentException: Codec [1] is not available. Consider setting spark.io.compression.codec=snappy at org.apache.spark.io.Co
By default, Spark uses file:/tmp/spark-events as the log directory for the history server, and your log clearly shows that spark.history.fs.logDirectory is not configured.
First of all, you need to create the spark-events folder in /tmp (which is not a good idea, as /tmp is cleared every time the machine is rebooted) and then add spark.history.fs.logDirectory to spark-defaults.conf, pointing to that directory. But I suggest you create another folder that the spark user has access to and update the spark-defaults.conf file accordingly.
You need to define two more variables in the spark-defaults.conf file:
spark.eventLog.dir file:path to where you want to store your logs
spark.history.fs.logDirectory file:same path as above
Suppose you want to store the logs in /opt/spark-events, which the spark user has access to. Then the parameters above in spark-defaults.conf would be:
spark.eventLog.enabled true
spark.eventLog.dir file:/opt/spark-events
spark.history.fs.logDirectory file:/opt/spark-events
You can find more information in the Monitoring and Instrumentation page of the Spark documentation.
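The steps above can be sketched as a small script that creates the event log directory and returns the matching spark-defaults.conf lines. The helper name is hypothetical, and the sketch assumes it runs as a user allowed to create the directory; it writes nothing to spark-defaults.conf itself.

```python
from pathlib import Path


def event_log_settings(directory: str) -> str:
    """Create `directory` if needed and return spark-defaults.conf lines for it."""
    path = Path(directory).resolve()
    path.mkdir(parents=True, exist_ok=True)
    uri = path.as_uri()  # absolute file: URI for the directory
    return "\n".join([
        "spark.eventLog.enabled true",
        f"spark.eventLog.dir {uri}",
        f"spark.history.fs.logDirectory {uri}",
    ]) + "\n"
```

Calling event_log_settings("/opt/spark-events") and appending the result to spark-defaults.conf keeps spark.eventLog.dir and spark.history.fs.logDirectory pointing at the same location, which is what lets the history server find the logs that applications write.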
Try setting
spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
in spark-defaults.conf
