So I've deployed a cluster with Apache Mesos and Apache Spark and I've several jobs that I need to execute on the cluster. I would like to be sure that a job has enough resources to be successfully executed, and if it is not the case, it must return an error.
Spark provides several settings like spark.cores.max and spark.executor.memory in order to limit the resources used by the job, but there is no settings for the lower bound (e.g. set the minimal number of core to 8 for the job).
I'm looking for a way to be sure that a job has enough resources before it is executed (during the resource allocation for instance), do you know if it is possible to get this information with Apache Spark on Mesos?
Related
As I understand reading through mesos documentation, is a resource offer is done to a application/framework and it is upto the application to accept/reject offer.
I have a "never-ending" spark streaming app where I configured the executors/cores/memory I need parallelism. Aren't these resources acquired only once when my spark-app starts-up. That is, lets say, if my executors are idle are they handed back to mesos?
Does resource offer and acceptance happens only once in case of spark-streaming?
The same question can be extended for other long-running framework such as cassandra or YARN on mesos.
my understanding is that when spark-streaming is run on coarse-grained model resource exchanges happens once and resources are dedicated to executors for lifetime of spark app
The best source for Spark on Mesos would be the spark docs site here.
In the coarse-grained section you can see the following which answers your question:
The benefit of coarse-grained mode is much lower startup overhead, but at the cost of reserving Mesos resources for the complete duration of the application. To configure your job to dynamically adjust to its resource requirements, look into Dynamic Allocation.
If you look into Dynamic resource allocation you can potentially move around executor resources via a Spark Shuffle service. This can be achieved via the provided script by the Spark service, or via Marathon.
How does spark choose nodes to run executors?(spark on yarn)
We use spark on yarn mode, with a cluster of 120 nodes.
Yesterday one spark job create 200 executors, while 11 executors on node1,
10 executors on node2, and other executors distributed equally on the other nodes.
Since there are so many executors on node1 and node2, the job run slowly.
How does spark select the node to run executors?
according to yarn resourceManager?
As you mentioned Spark on Yarn:
Yarn Services choose executor nodes for spark job based on the availability of the cluster resource. Please check queue system and dynamic allocation of Yarn. the best documentation https://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/
Cluster Manager allocates resources across the other applications.
I think the issue is with bad optimized configuration. You need to configure Spark on the Dynamic Allocation. In this case Spark will analyze cluster resources and add changes to optimize work.
You can find all information about Spark resource allocation and how to configure it here: http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
Are all 120 nodes having identical capacity?
Moreover the jobs will be submitted to a suitable node manager based on the health and resource availability of the node manager.
To optimise spark job, You can use dynamic resource allocation, where you do not need to define the number of executors required for running a job. By default it runs the application with the configured minimum cpu and memory. Later it acquires resource from the cluster for executing tasks. It will release the resources to the cluster manager once the job has completed and if the job is idle up to the configured idle timeout value. It reclaims the resources from the cluster once it starts again.
We are currently running parallel Spark jobs on an EMR cluster using HadoopActivity task from Datapipeline. By default, the newer versions of EMR clusters sets spark dynamic allocation to true which will increase/ reduce the number of executors required based on the load. So do we need to set any other property along with spark-submit e.g. number of cores, executor memory etc. or its best to have EMR cluster handle it dynamically?
This always depends of how you application is working. I can give you an good example of how I work here. For the Data Scientists in general they use the default configuration and it works pretty well due to they use Jupyter here to run their models. The only thing that we setup that can be useful for you is the conf spark.dynamicAllocation.minExecutors this allow to setup at least two or one worker for the job. To not be without any executor. That is what we do with the Data Scientists.
But, EMR has one specific type of configuration for each type of machine you choose. So in general it is optimized for the most common activities. But sometimes you need to change according your request, if you need more memory and less cores for skewed data that is better to change.
Let's assume that I have 4 NM and I have configured spark in yarn-client mode. Then, I set dynamic allocation to true to automatically add or remove a executor based on workload. If I understand correctly, each Spark executor runs as a Yarn container.
So, if I add more NM will the number of executors increase ?
If I remove a NM while a Spark application is running, something will happen to that application?
Can I add/remove executors based on other metrics ? If the answer is yes, there is a function, preferably in python,that does that ?
If I understand correctly, each Spark executor runs as a Yarn container.
Yes. That's how it happens for any application deployed to YARN, Spark including. Spark is not in any way special to YARN.
So, if I add more NM will the number of executors increase ?
No. There's no relationship between the number of YARN NodeManagers and Spark's executors.
From Dynamic Resource Allocation:
Spark provides a mechanism to dynamically adjust the resources your application occupies based on the workload. This means that your application may give resources back to the cluster if they are no longer used and request them again later when there is demand.
As you may have guessed correctly by now, it is irrelevant how many NMs you have in your cluster and it's by the workload when Spark decides whether to request new executors or remove some.
If I remove a NM while a Spark application is running, something will happen to that application?
Yes, but only when Spark uses that NM for executors. After all, NodeManager gives resources (CPU and memory) to a YARN cluster manager that will in turn give them to applications like Spark applications. If you take them back, say by shutting the node down, the resource won't be available anymore and the process of a Spark executor simply dies (as any other process with no resources to run).
Can I add/remove executors based on other metrics ?
Yes, but usually it's Spark job (no pun intended) to do the calculation and requesting new executors.
You can use SparkContext to manage executors using killExecutors, requestExecutors and requestTotalExecutors methods.
killExecutor(executorId: String): Boolean Request that the cluster manager kill the specified executor.
requestExecutors(numAdditionalExecutors: Int): Boolean Request an additional number of executors from the cluster manager.
requestTotalExecutors(numExecutors: Int, localityAwareTasks: Int, hostToLocalTaskCount: Map[String, Int]): Boolean Update the cluster manager on our scheduling needs.
I have an analytics node running, with Spark Sql Thriftserver running on it. Now I can't run another Spark Application with spark-submit.
It says it doesn't have resources. How to configure the dse node, to be able to run both?
The SparkSqlThriftServer is a Spark application like any other. This means it requests and reserves all resources in the cluster by default.
There are two options if you want to run multiple applications at the same time:
Allocate only part of your resources to each application.
This is done by setting spark.cores.max to a smaller value than the max resources in your cluster.
See Spark Docs
Dynamic Allocation
Which allows applications to change the amount of resources they use depending on how much work they are trying to do.
See Spark Docs