Livy Server on Amazon EMR hangs on Connecting to ResourceManager

I'm trying to deploy a Livy Server on Amazon EMR. First I built the Livy master branch:
mvn clean package -Pscala-2.11 -Pspark-2.0
Then I uploaded it to the EMR cluster's master node and set the following configuration:
livy-env.sh
SPARK_HOME=/usr/lib/spark
HADOOP_CONF_DIR=/etc/hadoop/conf
livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
When I start Livy, it hangs indefinitely while connecting to the YARN ResourceManager (XX.XX.XXX.XX is the IP address):
16/10/28 17:56:23 INFO RMProxy: Connecting to ResourceManager at /XX.XX.XXX.XX:8032
However, when I netcat port 8032, it connects successfully:
nc -zv XX.XX.XXX.XX 8032
Connection to XX.XX.XXX.XX 8032 port [tcp/pro-ed] succeeded!
I think I'm probably missing some step. Does anyone have an idea of what that step might be?

I made the following changes to the config files after unzipping the livy-server-0.2.0.zip file:
livy-env.sh
export SPARK_HOME=/usr/hdp/current/spark-client
export HADOOP_HOME=/usr/hdp/current/hadoop-client/bin/
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_CONF_DIR=$SPARK_HOME/conf
export LIVY_LOG_DIR=/jobserver-livy/logs
export LIVY_PID_DIR=/jobserver-livy
export LIVY_MAX_LOG_FILES=10
export HBASE_HOME=/usr/hdp/current/hbase-client/bin
livy.conf
livy.rsc.rpc.server.address=<Loop Back address>
Add spark.master yarn-cluster to the spark-defaults.conf file under the Spark conf folder.
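As a rough sketch of that last step (install paths are assumptions; adjust them to your own layout), the change and restart look like this:
# Point Spark at YARN cluster mode (SPARK_CONF_DIR as set in livy-env.sh above)
echo "spark.master yarn-cluster" >> $SPARK_CONF_DIR/spark-defaults.conf
# Stop any running Livy instance, then start it again so livy.conf and
# spark-defaults.conf are re-read
cd /path/to/livy-server-0.2.0
./bin/livy-server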
Please let me know if you still have issues.

You can use the following in your log4j.properties. Please post the resulting log file.
log4j.rootCategory=DEBUG, NotConsole
log4j.appender.NotConsole=org.apache.log4j.RollingFileAppender
log4j.appender.NotConsole.File=/<LIVY SERVER INSTALL PATH>/logs/livy.log
log4j.appender.NotConsole.maxFileSize=20MB
log4j.appender.NotConsole.layout=org.apache.log4j.PatternLayout
log4j.appender.NotConsole.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
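With that appender, the DEBUG output goes to the file configured above instead of the console; you can follow it while reproducing the hang (the install path below is the same placeholder used in the appender config):
tail -f /<LIVY SERVER INSTALL PATH>/logs/livy.log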

Looking at the GitHub repo, it looks like the master branch is under development and there is a separate release branch for version 0.2. The straightforward way (which worked for me) to install Livy is to follow the steps on the quickstart page: http://livy.io/quickstart.html
Download the Livy Server distribution:
wget http://archive.cloudera.com/beta/livy/livy-server-0.2.0.zip
Unzip it:
unzip livy-server-0.2.0.zip
Start it:
$ cd livy-server-0.2.0
$ ./bin/livy-server
16/11/07 20:32:51 INFO LivyServer: Using spark-submit version 2.0.0
16/11/07 20:32:51 WARN RequestLogHandler: !RequestLog
16/11/07 20:32:51 INFO WebServer: Starting server on http://ip-xx-xx-xx-xxx.us-west-2.compute.internal:8998
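Once the server is up, one quick way to confirm it can actually reach YARN is to create a session through the Livy REST API and poll its state. A minimal sketch, assuming the default port 8998 and that curl is available on the master:
# Create an interactive Spark session
curl -s -X POST -H "Content-Type: application/json" -d '{"kind": "spark"}' http://localhost:8998/sessions
# List sessions; the new one should move from "starting" to "idle"
curl -s http://localhost:8998/sessions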

Related

spark docker-image-tool Cannot find docker image

I deployed Spark on Kubernetes:
helm install microsoft/spark --version 1.0.0
(I also tried the Bitnami chart with the same result.)
Then, as described at https://spark.apache.org/docs/latest/running-on-kubernetes.html#submitting-applications-to-kubernetes, I go to $SPARK_HOME/bin and run:
docker-image-tool.sh -r -t my-tag build
This returns:
Cannot find docker image. This script must be run from a runnable distribution of Apache Spark.
But all the Spark runnables are in this directory:
bash-4.4# cd $SPARK_HOME/bin
bash-4.4# ls
beeline find-spark-home.cmd pyspark.cmd spark-class spark-shell.cmd spark-sql2.cmd sparkR
beeline.cmd load-spark-env.cmd pyspark2.cmd spark-class.cmd spark-shell2.cmd spark-submit sparkR.cmd
docker-image-tool.sh load-spark-env.sh run-example spark-class2.cmd spark-sql spark-submit.cmd sparkR2.cmd
find-spark-home pyspark run-example.cmd spark-shell spark-sql.cmd spark-submit2.cmd
Any suggestions on what I'm doing wrong?
I haven't made any other Spark configuration changes. Am I missing something? Should I install Docker myself, or any other tools?
You are mixing things up here.
When you run helm install microsoft/spark --version 1.0.0, you're deploying Spark with all prerequisites inside Kubernetes. Helm is doing all the hard work for you. After you run this, Spark is ready to use.
Then, after deploying Spark using Helm, you are trying to deploy Spark again from inside a Spark pod that is already running on Kubernetes.
These are two different things that are not meant to be mixed. The guide you linked explains how to run Spark on Kubernetes by hand, but fortunately it can also be done using Helm, as you did before.
When you run helm install myspark microsoft/spark --version 1.0.0, the output tells you how to access your Spark web UI:
NAME: myspark
LAST DEPLOYED: Wed Apr 8 08:01:39 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
1. Get the Spark URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace default -w myspark-webui'
export SPARK_SERVICE_IP=$(kubectl get svc --namespace default myspark-webui -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SPARK_SERVICE_IP:8080
2. Get the Zeppelin URL to visit by running these commands in the same shell:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get svc --namespace default -w myspark-zeppelin'
export ZEPPELIN_SERVICE_IP=$(kubectl get svc --namespace default myspark-zeppelin -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$ZEPPELIN_SERVICE_IP:8080
Let's check it:
$ export SPARK_SERVICE_IP=$(kubectl get svc --namespace default myspark-webui -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ echo http://$SPARK_SERVICE_IP:8080
http://34.70.212.182:8080
If you open this URL, your Spark web UI is ready.
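If you do later want to build the Spark container images yourself (not needed when using the Helm chart), docker-image-tool.sh has to be run from an unpacked Spark distribution, and -r needs a registry/repository value. A minimal sketch, with the Spark version and download URL as assumptions:
wget https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
tar -xzf spark-2.4.5-bin-hadoop2.7.tgz
cd spark-2.4.5-bin-hadoop2.7
./bin/docker-image-tool.sh -r <your-registry> -t my-tag build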

Spark on K8s - getting error: kube mode does not support referencing app dependencies in local

I am trying to set up a Spark cluster on k8s. I've managed to create and set up a cluster with three nodes by following this article:
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
After that, when I tried to deploy Spark on the cluster, it failed at the spark-submit step.
I used this command:
~/opt/spark/spark-2.3.0-bin-hadoop2.7/bin/spark-submit \
--master k8s://https://206.189.126.172:6443 \
--deploy-mode cluster \
--name word-count \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=5 \
--conf spark.kubernetes.container.image=docker.io/garfiny/spark:v2.3.0 \
—-conf spark.kubernetes.driver.pod.name=word-count \
local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
And it gives me this error:
Exception in thread "main" org.apache.spark.SparkException: The Kubernetes mode does not yet support referencing application dependencies in the local file system.
at org.apache.spark.deploy.k8s.submit.DriverConfigOrchestrator.getAllConfigurationSteps(DriverConfigOrchestrator.scala:122)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:229)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:227)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2585)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:227)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:192)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-06-04 10:58:24 INFO ShutdownHookManager:54 - Shutdown hook called
2018-06-04 10:58:24 INFO ShutdownHookManager:54 - Deleting directory /private/var/folders/lz/0bb8xlyd247cwc3kvh6pmrz00000gn/T/spark-3967f4ae-e8b3-428d-ba22-580fc9c840cd
Note: I followed this article for installing Spark on k8s:
https://spark.apache.org/docs/latest/running-on-kubernetes.html
The error message comes from commit 5d7c4ba4d73a72f26d591108db3c20b4a6c84f3f, which includes the page you mention ("Running Spark on Kubernetes") and the note you refer to:
// TODO(SPARK-23153): remove once submission client local dependencies are supported.
if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
  throw new SparkException("The Kubernetes mode does not yet support referencing application " +
    "dependencies in the local file system.")
}
This is described in SPARK-18278:
it wouldn't accept running a local: jar file, e.g. local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar, on my spark docker image (allowsMixedArguments and isAppResourceReq booleans in SparkSubmitCommandBuilder.java get in the way).
And this is linked to Kubernetes issue 34377.
The issue SPARK-22962 "Kubernetes app fails if local files are used" mentions:
This is the resource staging server use-case. We'll upstream this in the 2.4.0 timeframe.
In the meantime, that error message was introduced in PR 20320.
It includes the comment:
The manual tests I did actually use a main app jar located on gcs and http.
To be specific and for record, I did the following tests:
Using a gs:// main application jar and a http:// dependency jar. Succeeded.
Using a https:// main application jar and a http:// dependency jar. Succeeded.
Using a local:// main application jar. Succeeded.
Using a file:// main application jar. Failed.
Using a file:// dependency jar. Failed.
That issue should have been fixed by now, and the OP garfiny confirms in the comments:
I used the newest spark-kubernetes jar to replace the one in spark-2.3.0-bin-hadoop2.7 package. The exception is gone.
According to the mentioned documentation:
Dependency Management
If your application's dependencies are all hosted in remote locations like HDFS or HTTP servers, they may be referred to by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images. Those dependencies can be added to the classpath by referencing them with local:// URIs and/or setting the SPARK_EXTRA_CLASSPATH environment variable in your Dockerfiles. The local:// scheme is also required when referring to dependencies in custom-built Docker images in spark-submit.
Note that using application dependencies from the submission client's local file system is currently not yet supported.
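Following that guidance, a submission that avoids client-local dependencies would look roughly like the sketch below; the API server address, image name, and jar URI are placeholders, not values from the original post:
bin/spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<your-registry>/spark:v2.3.0 \
  https://<some-host>/jars/spark-examples_2.11-2.3.0.jar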

Spark cleanup job not running

Whenever I do a dse spark-submit <jarname>, it copies the jar into SPARK_WORKER_DIR (in my case /var/lib/spark-worker/worker-0). I want the jar to be deleted automatically once the Spark job completes successfully. Based on this, I changed SPARK_WORKER_OPTS in spark-env.sh as follows:
export SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1800"
But the jar is still not getting deleted. Am I doing something wrong? What should I do?
Adding this line to spark-env.sh and restarting the dse service worked for me:
export SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=3600 -Dspark.worker.cleanup.appDataTtl=172800 "
I restarted the dse service with:
nodetool drain
sudo service dse restart
This deletes the application's worker directory (including the copied jar and logs) two days after the job is complete.
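For reference, here is the same setting broken out with comments on what each worker-cleanup property controls (the defaults noted are Spark's standalone-mode defaults):
# spark.worker.cleanup.enabled:    enable periodic cleanup of finished applications' worker dirs (default: false)
# spark.worker.cleanup.interval:   how often, in seconds, the worker checks for dirs to clean (default: 1800)
# spark.worker.cleanup.appDataTtl: how long, in seconds, each application dir is kept (default: 604800; 172800 = 2 days)
export SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=3600 -Dspark.worker.cleanup.appDataTtl=172800"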

How to enable Spark mesos docker executor?

I'm working on integration between Mesos and Spark. For now, I can start SlaveMesosDispatcher in a Docker container, and I would also like to run the Spark executor in a Mesos Docker container. I did the following configuration for it, but I got an error; any suggestions?
Configuration:
Spark: conf/spark-defaults.conf
spark.mesos.executor.docker.image ubuntu
spark.mesos.executor.docker.volumes /usr/bin:/usr/bin,/usr/local/lib:/usr/local/lib,/usr/lib:/usr/lib,/lib:/lib,/home/test/workshop/spark:/root/spark
spark.mesos.executor.home /root/spark
#spark.executorEnv.SPARK_HOME /root/spark
spark.executorEnv.MESOS_NATIVE_LIBRARY /usr/local/lib
NOTE: Spark is installed in /home/test/workshop/spark, and all dependencies are installed.
After submitting SparkPi to the dispatcher, the driver job starts but fails. The error message is:
I1015 11:10:29.488456 18697 exec.cpp:134] Version: 0.26.0
I1015 11:10:29.506619 18699 exec.cpp:208] Executor registered on slave b7e24114-7585-40bc-879b-6a1188cb65b6-S1
WARNING: Your kernel does not support swap limit capabilities, memory limited without swap.
/bin/sh: 1: ./bin/spark-submit: not found
Does anyone know how to map/set the Spark home in Docker for this case?
I think the issue you're seeing here is that the container's current working directory isn't where Spark is installed. When you specify a Docker image for Spark to use with Mesos, it expects the default working directory of the container to be inside $SPARK_HOME, where it can find ./bin/spark-submit.
You can see that logic here.
It doesn't look like you're able to configure the working directory through Spark configuration itself, which means you'll need to build a custom image on top of ubuntu that simply does a WORKDIR /root/spark.
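A minimal Dockerfile sketch of such an image, assuming the volume mounts from the spark-defaults.conf above (so Spark appears at /root/spark inside the container):
# Dockerfile
FROM ubuntu
# Spark is volume-mounted at /root/spark via spark.mesos.executor.docker.volumes,
# so the only change needed is the default working directory.
WORKDIR /root/spark
Build and tag it, then point spark.mesos.executor.docker.image at that tag instead of plain ubuntu.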

Apache Zeppelin does not load libmesos.so

I'm evaluating Apache Zeppelin with the current release, v0.5. I have a Mesos cluster with Spark registered as a framework, and I need to configure Zeppelin to connect to the remote Spark cluster on Mesos.
My config in conf/zeppelin-env.sh is:
export MASTER=mesos://<mesos_ip>:5050
export MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=http://<public_host_url>/spark-1.4.0-bin-hadoop2.6.tgz"
But when I execute the boot command and run the demo notebook, the log shows some errors and the query doesn't work:
------ Create new SparkContext mesos://172.23.0.135:5050 -------
Failed to load native Mesos library from /usr/lib/libmesos.so
------ Create new SparkContext mesos://172.23.0.135:5050 -------
Failed to load native Mesos library from /usr/lib/libmesos.so
I can't find any documentation or source code about this error message. And I don't understand the reason, because I have libmesos.so in /usr/lib, and when I run spark-submit separately from my host everything works fine.
According to the docs, you should set the MESOS_NATIVE_JAVA_LIBRARY and SPARK_EXECUTOR_URI environment variables:
export MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so
export SPARK_EXECUTOR_URI={YOUR_SPARK_DOWNLOAD_LOCATION}
Can you try setting something like the following?
export MESOS_NATIVE_LIBRARY=/usr/lib/libmesos.so
export SPARK_EXECUTOR_URI=http://<public_host_url>/spark-1.4.0-bin-hadoop2.6.tgz
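As a quick sanity check before restarting Zeppelin (a sketch; the tarball URL is the placeholder from the question), confirm that the native library exists and the executor URI is reachable:
ls -l /usr/lib/libmesos.so
curl -I http://<public_host_url>/spark-1.4.0-bin-hadoop2.6.tgz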
