Why does a Spark executor need to connect with the Worker? - apache-spark

When I kick off a Spark job, I find the Executor startup command line looks like the following:
bin/java -cp /opt/conf/:/opt/jars/* -Xmx1024M -Dspark.driver.port=56559
org.apache.spark.executor.CoarseGrainedExecutorBackend
--driver-url spark://CoarseGrainedScheduler@10.1.140.2:56559
--executor-id 1 --hostname 10.1.140.5 --cores 2
--app-id app-20161221132517-0000
--worker-url spark://Worker@10.1.140.5:56451
From the command above you can see the option --worker-url spark://Worker@10.1.140.5:56451, which is what I'm curious about: why does the Executor need to communicate with the Worker? In my mind an executor only needs to talk to other executors and the Driver.

As the cluster overview diagram in the Spark documentation (linked as Source below) shows, Executors are part of worker nodes.
Application: User program built on Spark. Consists of a driver program and executors on the cluster.
Worker node: Any node that can run application code in the cluster.
Executor: A process launched for an application on a worker node, that runs tasks and keeps data in memory or disk storage across them. Each application has its own executors.
Source

An executor's fate is tied to its worker's fate: if the worker is terminated abnormally, its executors have to be able to detect that fact and stop themselves, and the connection identified by --worker-url is what makes that detection possible. Without this mechanism you could end up with "ghost" executors.

Related

Why is the executors entry not visible in spark web ui

I am running a Spark job, and even though I've set the --num-executors parameter to 3, I can't see any executors in the web UI's Executors tab. Why is this happening?
Spark in local mode is non-distributed: the Spark process runs in a single JVM and the driver also behaves as an executor.
You can only define the number of threads in the master URL, as in the example below.
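For instance, a minimal sketch (the class name and jar path are the same placeholders used elsewhere on this page):
./bin/spark-submit --master local[3] --class Main /jarPath
Here local[3] means the driver and the executor logic share one JVM that uses 3 worker threads.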
You can switch to standalone mode.
Start the master using the command below:
spark-class org.apache.spark.deploy.master.Master
And the worker using:
spark-class org.apache.spark.deploy.worker.Worker spark://<host>:7077
Now run the spark-submit command.
If you have 6 cores, just specifying --executor-cores 2 will create 3 executors, and you can check them on the Spark UI.
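For example, a submit command along these lines (the hostname, class name, and jar path are placeholders):
./bin/spark-submit --master spark://<host>:7077 --executor-cores 2 --class Main /jarPath
With 6 worker cores available and 2 cores per executor, the application is given 3 executors, which show up in the Executors tab of the web UI.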

how many jvm processes are used for a spark worker?

Each Spark executor runs in its own JVM process, which means that on each worker (slave) there will be multiple JVMs running. Is it safe to say that each worker runs as many JVMs as there are executors assigned to that machine, plus at least one more JVM (because Spark needs at least one more JVM for the BlockManager on each worker)? In other words, does the BlockManager on each worker run in a different JVM process?
Which cluster manager are you using?
Spark uses cluster managers such as Kubernetes, Mesos, or YARN for resource allocation. Where each JVM runs is decided by the cluster manager; Spark, as a client, requests resources from these cluster managers.
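For illustration, you choose the cluster manager with the --master URL passed to spark-submit; the hostnames, ports, class name, and jar path below are placeholders, and the Kubernetes variant would additionally need a container image and related configuration:
./bin/spark-submit --master spark://<standalone-master>:7077 --class Main /jarPath
./bin/spark-submit --master yarn --deploy-mode cluster --class Main /jarPath
./bin/spark-submit --master k8s://https://<k8s-apiserver>:6443 --deploy-mode cluster --class Main /jarPath
In every case the chosen cluster manager decides on which hosts the executor JVMs are launched; the BlockManager itself runs as a component inside each executor JVM (and the driver JVM), not as a separate process.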

Spark multiple jobs error

I am trying to submit multiple applications on Spark.
After the first application completes, Spark allocates all the worker cores to the drivers of the remaining applications. As a result no cores are left for execution.
My environment: 2 worker nodes, each with 1 core and 2 GB RAM; the drivers run on the worker nodes.
Spark submit command: ./spark-submit --class Main --master spark://ip:6066 --deploy-mode cluster /jarPath
So if I submit 3 jobs, after the first completes, the second and third each get one core for their drivers and no cores are left for execution.
Please suggest a way to resolve this.
Try killing old instances of Spark:
http://spark.apache.org/docs/latest/spark-standalone.html#launching-spark-applications
./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver ID>
You can find the driver ID through the standalone Master web UI at http://<master-host>:8080.
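A hypothetical invocation (the master URL is a placeholder, and the driver ID is a made-up value in the usual format; substitute the real ID you copy from the Master UI):
./bin/spark-class org.apache.spark.deploy.Client kill spark://<master-host>:7077 driver-20161221132517-0001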

Running a distributed Spark Job Server with multiple workers in a Spark standalone cluster

I have a Spark standalone cluster running on a few machines. All workers are using 2 cores and 4GB of memory. I can start a job server with ./server_start.sh --master spark://ip:7077 --deploy-mode cluster --conf spark.driver.cores=2 --conf spark.driver.memory=4g, but whenever I try to start a server with more than 2 cores, the driver's state gets stuck at "SUBMITTED" and no worker takes the job.
I tried starting the spark-shell on 4 cores with ./spark-shell --master spark://ip:7077 --conf spark.driver.cores=4 --conf spark.driver.memory=4g and the job gets shared between 2 workers (2 cores each). The spark-shell gets launched as an application and not a driver though.
Is there any way to run a driver split between multiple workers? Or can I run the job server as an application rather than a driver?
The problem was resolved in the chat
You have to change your JobServer .conf file to set the master parameter to point to your cluster:
master = "spark://ip:7077"
Also, the memory that the JobServer program uses can be set in the settings.sh file.
After setting these parameters, you can start JobServer with a simple call:
./server_start.sh
Then, once the service is running, you can create your context via REST, which will ask the cluster for resources and will receive an appropriate number of executors/cores:
curl -d "" '[hostname]:8090/contexts/cassandra-context?context-factory=spark.jobserver.context.CassandraContextFactory&num-cpu-cores=8&memory-per-node=2g'
Finally, every job sent via POST to JobServer on this created context will be able to use the executors allocated to the context and will be able to run in a distributed way.
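As a sketch of that last step (assuming the standard spark-jobserver REST API, a test jar already uploaded under the app name test, and the WordCountExample job class that ships with the job-server test examples; all of these names are illustrative):
curl -d "input.string = a b c a b see" '[hostname]:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample&context=cassandra-context&sync=true'
The context query parameter ties the job to the context created above, so it reuses that context's executors instead of spinning up a new one.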

Role of the Executors on the Spark master machine

In a Spark standalone cluster, does the Master node run tasks as well? I wasn't sure whether Executor processes are spun up on the Master node to do work, alongside the Worker nodes.
Thanks!
Executors are only started on nodes where at least one worker daemon is running, i.e., no executor will be started on a node that does not serve as a Worker.
However, where to start the Master and Workers is entirely up to you; there is no limitation that the Master and a Worker cannot co-locate on the same node.
To start a worker daemon on the same machine as your master, you can either edit the conf/slaves file to add the master's IP and use start-all.sh at start time, or start a worker at any time on the master node with start-slave.sh, supplying the Spark master URL spark://master-host:7077.
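For example, on the master host (master-host stands in for your own hostname, and a master is assumed to be listening on the default port 7077):
./sbin/start-slave.sh spark://master-host:7077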
Update (based on Daniel Darabos's suggestion):
In the Application Detail UI's Executors tab you will also find a row with <driver> as its Executor ID. The driver it denotes is the process where your job is scheduled and monitored; it runs the main program you submitted to the Spark cluster, slices your transformations and actions on RDDs into stages, schedules the stages as TaskSets, and arranges for executors to run the tasks.
This <driver> is started on the node from which you call spark-submit in client deploy mode, or on one of the worker nodes in cluster deploy mode.
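For instance (the master host, class name, and jar path are placeholders):
./bin/spark-submit --master spark://master-host:7077 --deploy-mode client --class Main /jarPath
./bin/spark-submit --master spark://master-host:7077 --deploy-mode cluster --class Main /jarPath
With client mode the driver runs inside the spark-submit process on the machine where you typed the command; with cluster mode the Master picks a Worker and launches the driver there.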

Resources