PySpark WARN messages - apache-spark

How can I disable the following WARN messages when running PySpark code:
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/06/08 21:04:55 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
18/06/08 21:04:55 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
I spent some time playing with log4j.properties, but cannot figure out exactly which class logs these.

Put this right after you create the Spark context. Note that INFO is more verbose than WARN, so use ERROR if you want to hide WARN messages:
sc.setLogLevel("ERROR")
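The startup warnings in the question are printed before sc.setLogLevel can run, so another option is to lower the level of just the two loggers involved in log4j.properties. Below is a minimal sketch, assuming Spark 2.x with log4j 1.x and a conf/log4j.properties file. The helper name render_logger_overrides is hypothetical, and the fully qualified logger names are an assumption based on Spark's package layout, so check them against your version:

```python
# Hypothetical helper: builds per-logger override lines in log4j 1.x
# properties syntax, to be appended to $SPARK_HOME/conf/log4j.properties.


def render_logger_overrides(levels):
    """Return log4j 1.x lines of the form log4j.logger.<name>=<LEVEL>."""
    lines = []
    for logger_name, level in sorted(levels.items()):
        lines.append("log4j.logger.{}={}".format(logger_name, level.upper()))
    return "\n".join(lines)


# Assumed fully qualified names of the loggers behind "WARN SparkConf" and
# "WARN Utils"; verify them for your Spark version before relying on this.
overrides = render_logger_overrides({
    "org.apache.spark.SparkConf": "error",
    "org.apache.spark.util.Utils": "error",
})
print(overrides)
```

Spark releases that ship Log4j 2 use a different properties syntax, so these lines would need adapting there.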

Related

I want to run spark with yarn, but I get a java.net.ConnectException error

I want to run spark on yarn
root@server01:/export/server/spark# bin/spark-shell --master yarn
But it fails with an error like this:
root@server01:/export/server/spark# bin/spark-shell --master yarn
22/12/06 07:51:42 WARN Utils: Your hostname, server01 resolves to a loopback address: 127.0.1.1; using 192.168.40.133 instead (on interface ens33)
22/12/06 07:51:42 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
22/12/06 07:51:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/12/06 07:57:58 ERROR YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the Application Master may have failed to start. Check the YARN application logs for more details.
22/12/06 07:57:58 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Application application_1670312235175_0001 failed 2 times due to Error launching appattempt_1670312235175_0001_000002. Got exception: java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:41647 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node1:8020/sparklog/
spark.eventLog.compress true
# spark-yarn jar package
spark.yarn.jars hdfs://node1:8020/spark/jars/*
# spark and yarn history server
spark.yarn.historyServer.address node1:18080
yarn-site.xml
<property>
<name>yarn.log.server.url</name>
<value>http://node1:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
I changed spark.yarn.historyServer.address to node1:19888 and restarted Spark, but it still doesn't work.

How to forward spark log to jupyter notebook?

I know I can set the log level via spark.sparkContext.setLogLevel('INFO'). Logs such as the following appear in the terminal, but not in the Jupyter notebook.
2019-03-25 11:42:37 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2019-03-25 11:42:37 WARN SparkConf:66 - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
2019-03-25 11:42:38 WARN Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
The spark session is created in local mode in the jupyter notebook cell.
spark = SparkSession \
.builder \
.master('local[7]') \
.appName('Notebook') \
.getOrCreate()
Is there any way to forward the logs to the jupyter notebook?

Error when starting the Spark shell on Ubuntu : Resource temporarily unavailable

I installed Spark 2.3.0 on Ubuntu 18.04 with Java 1.8, following these steps:
https://github.com/ashishtam/apache-spark-multi-node-installation/blob/master/index.md
I have a simple cluster made of 1 master and 1 slave, the /etc/hosts file being
# For Spark cluster
172.16.10.20 master
172.16.10.30 slave01
#127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
Once Spark was installed and configured, I started the master and the slaves:
./sbin/start-master.sh
./sbin/start-slaves.sh spark://master-pc:7077
checking the status with jps, on the master node:
> jps
13440 Master
13701 Worker
13963 Jps
and on the slave node:
> jps
4202 Master
4332 Worker
4958 Jps
On the master node, when opening the spark-shell I got:
18/08/30 19:12:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://master:4040
Spark context available as 'sc' (master = spark://master:7077, app id = app-20180830191300-0000).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.0
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.
scala> Exception in thread "main" java.io.IOException: Resource temporarily unavailable
at java.io.FileInputStream.read0(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:207)
at jline.internal.NonBlockingInputStream.read(NonBlockingInputStream.java:169)
at jline.internal.NonBlockingInputStream.read(NonBlockingInputStream.java:137)
at jline.internal.NonBlockingInputStream.read(NonBlockingInputStream.java:246)
at jline.internal.InputStreamReader.read(InputStreamReader.java:261)
at jline.internal.InputStreamReader.read(InputStreamReader.java:198)
at jline.console.ConsoleReader.readCharacter(ConsoleReader.java:2145)
at jline.console.ConsoleReader.readLine(ConsoleReader.java:2349)
at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269)
at scala.tools.nsc.interpreter.jline.InteractiveReader.readOneLine(JLineReader.scala:57)
at scala.tools.nsc.interpreter.InteractiveReader$class.readLine(InteractiveReader.scala:38)
at scala.tools.nsc.interpreter.jline.InteractiveReader.readLine(JLineReader.scala:28)
at scala.tools.nsc.interpreter.ILoop.readOneLine(ILoop.scala:404)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:413)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:76)
at org.apache.spark.repl.Main$.main(Main.scala:56)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
and it broke...
Out of curiosity, I did the same on the slave node:
18/08/30 19:18:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
18/08/30 19:18:43 ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.0
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
I ended up with a Spark shell, but with errors about the 'sparkDriver' service.
I have checked the IP in /etc/hosts, the SPARK_HOME variable on the master node, the Spark configuration files, the ssh keys...
The Spark local IP is also defined (on the master node) as:
export SPARK_LOCAL_IP=172.16.10.20
Does anyone see what I am doing wrong here?

Why does pyspark fail with "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'"?

For the life of me I cannot figure out what is wrong with my PySpark install. I have installed all dependencies, including Hadoop, but PySpark can't find it. Am I diagnosing this correctly?
See the full error message below; it ultimately fails in PySpark SQL:
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
nickeleres@Nicks-MBP:~$ pyspark
Python 2.7.10 (default, Feb 7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/spark-2.2.0/jars/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
17/10/24 21:21:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
Traceback (most recent call last):
File "/opt/spark/python/pyspark/shell.py", line 45, in <module>
spark = SparkSession.builder\
File "/opt/spark/python/pyspark/sql/session.py", line 179, in getOrCreate
session._jsparkSession.sessionState().conf().setConfString(key, value)
File "/opt/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/opt/spark/python/pyspark/sql/utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
>>>
tl;dr Close all the other Spark processes and start over.
The following WARN messages say that there is another process (or multiple processes) that holds the ports.
I'm sure that the process(es) are Spark processes, e.g. pyspark sessions or Spark applications.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
17/10/24 21:21:59 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
That's why, after Spark/pyspark found that port 4043 was free to use for the web UI, it tried to instantiate HiveSessionStateBuilder and failed.
pyspark failed as you cannot have more than one Spark application up and running that uses the same local Hive metastore.
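To see whether other Spark sessions are still holding the UI ports, you can probe them before starting a new one. This is only a rough sketch: the function names port_in_use and busy_ui_ports are made up for illustration, and it checks local TCP listeners only, starting from 4040 (Spark's default UI port) as in the WARN lines above. It cannot tell you whether the listener is actually a Spark process.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ui_ports(first=4040, count=4):
    """List the ports in [first, first + count) that are already taken."""
    return [p for p in range(first, first + count) if port_in_use(p)]


print("Spark UI ports in use:", busy_ui_ports())
```

If the list is not empty, look for leftover pyspark shells, notebooks, or applications and stop them before retrying.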
Why does this happen?
Because we try to create a new session more than once, in different browser tabs of the Jupyter notebook.
Solution:
Start the new session in a single Jupyter notebook tab and avoid creating new sessions in different tabs.
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('EXAMPLE').getOrCreate()
We received the same error while trying to create a Spark session from a Jupyter notebook.
In our case the user did not have permission on the Spark scratch directory, i.e. the directory set by the Spark property "spark.local.dir". We changed the permissions on the directory so that the user has full access to it, and the issue was resolved. This directory usually resides somewhere like "/tmp/user".
Note that, per the Spark documentation, the scratch directory is the "Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks".
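A quick way to rule out this permission problem is to check the scratch directory as the same user that runs the notebook. A minimal sketch follows; scratch_dir_problems is a hypothetical helper, and the path passed in is an assumption that should be replaced with your actual spark.local.dir value:

```python
import os
import tempfile


def scratch_dir_problems(path):
    """Return a list of human-readable problems with a scratch directory."""
    problems = []
    if not os.path.isdir(path):
        problems.append("{} is not an existing directory".format(path))
        return problems
    # Creating files requires both write and search (execute) permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append("{} is not writable by the current user".format(path))
    return problems


# spark.local.dir defaults to /tmp; the platform temp dir is used here only
# as a stand-in for whatever value your deployment actually sets.
print(scratch_dir_problems(tempfile.gettempdir()))
```

An empty list means the current user can create files there; it does not prove that Spark is configured to use that directory.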
Another possible cause is that the Spark application failed to start because the cluster could not satisfy the resources it requested.
In the Application history tab:
Diagnostics:Uncaught exception: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=5, maxVirtualCores=4

spark submit "Service 'Driver' could not bind on port" error

I used the following command to run the spark java example of wordcount:-
time spark-submit --deploy-mode cluster --master spark://192.168.0.7:6066 --class org.apache.spark.examples.JavaWordCount /home/pi/Desktop/example/new/target/javaword.jar /books_50.txt
When I run it, the following is the output:-
Running Spark using the REST application submission protocol.
16/07/18 03:55:41 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://192.168.0.7:6066.
16/07/18 03:55:44 INFO rest.RestSubmissionClient: Submission successfully created as driver-20160718035543-0000. Polling submission state...
16/07/18 03:55:44 INFO rest.RestSubmissionClient: Submitting a request for the status of submission driver-20160718035543-0000 in spark://192.168.0.7:6066.
16/07/18 03:55:44 INFO rest.RestSubmissionClient: State of driver driver-20160718035543-0000 is now RUNNING.
16/07/18 03:55:44 INFO rest.RestSubmissionClient: Driver is running on worker worker-20160718041005-192.168.0.12-42405 at 192.168.0.12:42405.
16/07/18 03:55:44 INFO rest.RestSubmissionClient: Server responded with CreateSubmissionResponse:
{
"action" : "CreateSubmissionResponse",
"message" : "Driver successfully submitted as driver-20160718035543-0000",
"serverSparkVersion" : "1.6.2",
"submissionId" : "driver-20160718035543-0000",
"success" : true
}
I checked the particular worker (192.168.0.12) for its log and it says:-
Launch Command: "/usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre/bin/java" "-cp" "/opt/spark/conf/:/opt/spark/lib/spark-assembly-1.6.2-hadoop2.6.0.jar:/opt/spark/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark/lib/datanucleus-core-3.2.10.jar:/opt/spark/lib/datanucleus-rdbms-3.2.9.jar" "-Xms1024M" "-Xmx1024M" "-Dspark.driver.supervise=false" "-Dspark.app.name=org.apache.spark.examples.JavaWordCount" "-Dspark.submit.deployMode=cluster" "-Dspark.jars=file:/home/pi/Desktop/example/new/target/javaword.jar" "-Dspark.master=spark://192.168.0.7:7077" "-Dspark.executor.memory=10M" "org.apache.spark.deploy.worker.DriverWrapper" "spark://Worker@192.168.0.12:42405" "/opt/spark/work/driver-20160718035543-0000/javaword.jar" "org.apache.spark.examples.JavaWordCount" "/books_50.txt"
========================================
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/07/18 04:10:58 INFO SecurityManager: Changing view acls to: pi
16/07/18 04:10:58 INFO SecurityManager: Changing modify acls to: pi
16/07/18 04:10:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(pi); users with modify permissions: Set(pi)
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
16/07/18 04:11:00 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'Driver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'Driver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
My spark-env.sh file (for master) contains:-
export SPARK_MASTER_WEBUI_PORT="8080"
export SPARK_MASTER_IP="192.168.0.7"
export SPARK_EXECUTOR_MEMORY="10M"
My spark-env.sh file (for worker) contains:-
export SPARK_WORKER_WEBUI_PORT="8080"
export SPARK_MASTER_IP="192.168.0.7"
export SPARK_EXECUTOR_MEMORY="10M"
Please help...!!
I had the same issue when trying to run the shell, and was able to get this working by setting the SPARK_LOCAL_IP environment variable. You can assign this from the command line when running the shell:
SPARK_LOCAL_IP=127.0.0.1 ./bin/spark-shell
For a more permanent solution, create a spark-env.sh file in the conf directory of your Spark root. Add the following line:
SPARK_LOCAL_IP=127.0.0.1
Give execute permissions to the script using chmod +x ./conf/spark-env.sh, and this will set this environment variable by default.
I am using Maven/SBT to manage dependencies and the Spark core is contained in a jar file.
You can override the SPARK_LOCAL_IP at runtime by setting the "spark.driver.bindAddress" (here in Scala):
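import org.apache.spark.{SparkConf, SparkContext}
// Bind the driver to the loopback address instead of relying on SPARK_LOCAL_IP
val config = new SparkConf()
config.setMaster("local[*]")
config.setAppName("Test App")
config.set("spark.driver.bindAddress", "127.0.0.1")
val sc = new SparkContext(config)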
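For PySpark users, the same setting can be passed through the session builder. The sketch below mirrors the Scala snippet above; local_driver_conf and build_session are hypothetical names, and the pyspark import is done lazily inside build_session so the settings can be inspected even where pyspark is not installed:

```python
def local_driver_conf(bind_address="127.0.0.1"):
    """Settings mirroring the Scala snippet, as plain key/value pairs."""
    return {
        "spark.master": "local[*]",
        "spark.app.name": "Test App",
        "spark.driver.bindAddress": bind_address,
    }


def build_session(conf):
    """Create a SparkSession from a dict of settings (needs pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


conf = local_driver_conf()
print(conf["spark.driver.bindAddress"])
# spark = build_session(conf)  # uncomment where pyspark is available
```

Binding to 127.0.0.1 only makes sense for local mode; on a cluster the driver must bind to an address the workers can reach.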
I also had this issue.
The reason (for me) was that the IP of my local system was not reachable from my local system.
I know that statement makes no sense, but please read the following.
My host name (uname -n) is "sparkmaster".
In my /etc/hosts file, I have assigned a fixed IP address for the sparkmaster system as "192.168.1.70". There were additional fixed IP addresses for sparknode01 and sparknode02 at ...1.71 & ...1.72 respectively.
Due to some other problems I had, I needed to change all of my network adapters to DHCP. This meant that they were getting addresses like 192.168.90.123.
The DHCP addresses were not in the same network as the ...1.70 range and there was no route configured.
When Spark starts, it seems to try to connect to the host named by the system's host name (i.e. sparkmaster in my case). That name mapped to the IP 192.168.1.70, but there was no way to connect to it because that address was in an unreachable network.
My solution was to change one of my Ethernet adapters back to a fixed static address (i.e. 192.168.1.70) and voila - problem solved.
So the issue seems to be that when Spark starts in "local mode", it attempts to connect to a system named after your system's host name (rather than localhost).
I guess this makes sense if you are wanting to setup a cluster (Like I did) but it can result in the above confusing message.
Possibly putting your system's host name on the 127.0.0.1 entry in /etc/hosts may also solve this problem, but I did not try it.
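You need to add your hostname to your /etc/hosts file.
Something like this, where "hostname" is a placeholder for the output of the hostname command (written without the quotes):
127.0.0.1 localhost "hostname"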
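Before editing /etc/hosts, it can help to see what the machine's own host name currently resolves to. The following is a small sketch using only the Python standard library; is_loopback and resolve_own_hostname are illustrative names, not part of Spark:

```python
import ipaddress
import socket


def is_loopback(address):
    """True for loopback addresses such as 127.0.0.1 or 127.0.1.1."""
    return ipaddress.ip_address(address).is_loopback


def resolve_own_hostname():
    """Return (hostname, ip) as this machine resolves itself; ip may be None."""
    name = socket.gethostname()
    try:
        return name, socket.gethostbyname(name)
    except OSError:
        # The host name has no usable entry in /etc/hosts or DNS.
        return name, None


name, ip = resolve_own_hostname()
if ip is None:
    print("{} does not resolve; add it to /etc/hosts".format(name))
elif is_loopback(ip):
    print("{} resolves to loopback address {}".format(name, ip))
else:
    print("{} resolves to {}".format(name, ip))
```

A loopback result is fine for local mode but, as other answers here point out, not for a cluster where other nodes must reach this host.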
This is possibly a duplicate of "Spark 1.2.1 standalone cluster mode spark-submit is not working".
I tried the same steps and was able to run the job. Kindly post the full spark-env.sh and spark-defaults if possible.
I had this problem, and it was caused by replacing my machine's real IP address with a different one in /etc/hosts.
This issue is related to the IP address alone. The error messages in the log file are not informative.
Check the following 3 steps:
1. Check your IP address. You can find it with the ifconfig or ip commands. If your service is not a public service, an IP address in the 192.168 range should be good enough. 127.0.0.1 cannot be used if you are planning a cluster.
2. Check your environment variable SPARK_MASTER_HOST. Make sure there are no typos in the name of the variable or in the actual IP address.
env | grep SPARK_
3. Check with the netstat command that the port you are planning to use for the Spark master is free. Do not use a port below 1024. For example:
netstat -a | grep 9123
After your Spark master starts running, if you are not able to see the web UI from a different machine, open the web UI port with the iptables command.
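The checks above can also be scripted. The "Cannot assign requested address" in the BindException is the operating system's EADDRNOTAVAIL error, which a plain socket bind reproduces when the address does not belong to any local interface. This is a rough sketch with hypothetical helper names (spark_env, can_bind); the address you test is an assumption you should replace with the one from your own configuration:

```python
import errno
import os
import socket


def spark_env():
    """Return the SPARK_* environment variables, roughly `env | grep SPARK_`."""
    return {k: v for k, v in os.environ.items() if k.startswith("SPARK_")}


def can_bind(address, port=0):
    """Try to bind a TCP socket; port 0 lets the OS pick any free port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
            return True
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                print("{} is not assigned to a local interface".format(address))
            return False


print(spark_env())
print(can_bind("127.0.0.1"))
```

Calling can_bind with the value of SPARK_LOCAL_IP or SPARK_MASTER_HOST on the affected node shows whether that address is usable there at all.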
When working with DataFrames, set it on the SparkSession builder as below:
val spark=SparkSession.builder.appName("BinarizerExample").master("local[*]").config("spark.driver.bindAddress", "127.0.0.1").getOrCreate()
First Option :-
Following steps might help:
Get your hostname by using "hostname" command.
xxxxxx.ssssss (e) base ~ hostname
xxxxxx.ssssss.net
Make an entry in the /etc/hosts file for your hostname if not present as follows:
127.0.0.1 xxxxxx.ssssss.net
Second Option:-
You can set the spark.driver.bindAddress value in your spark.conf file:
spark.driver.bindAddress=127.0.0.1
Thanks!!
I solved this problem by modifying the slave file. It is spark-2.4.0-bin-hadoop2.7/conf/slave.
Please check your configuration.

Resources