Auto scaling in a Spark Cassandra cluster with zero downtime - apache-spark

How can I add or remove Spark Cassandra cluster resources (workers, executors, cores, memory, etc.) dynamically based on workload? Can the required cluster resources be predicted before deploying? And what should be done with the data when scaling down / removing nodes from the cluster?

Spark supports dynamic allocation of executors with many configuration options; please refer to https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
Short version:
Spark can request additional executors when there is a backlog of pending tasks
Spark will deallocate executors when they have been idle for some time
Executors are allocated with the amount of cores/memory that has been set at startup, so choose wisely
Caching is affected by dynamic allocation (mostly through executors being deallocated); a configuration sketch follows below
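As a minimal sketch (assuming Spark 3.x with shuffle tracking; all values are illustrative placeholders rather than recommendations), these settings could be wired into a SparkSession like this:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: enable dynamic allocation of executors.
    // Values are illustrative placeholders; tune them for your workload.
    val spark = SparkSession.builder()
      .appName("dynamic-allocation-sketch")
      // Turn on dynamic allocation.
      .config("spark.dynamicAllocation.enabled", "true")
      // Dynamic allocation needs either an external shuffle service or,
      // on Spark 3.x, shuffle tracking, so executors can be removed safely.
      .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
      // Executor size is fixed at startup, as noted above.
      .config("spark.executor.cores", "4")
      .config("spark.executor.memory", "8g")
      // Remove executors that have been idle for this long.
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
      .getOrCreate()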

Related

Cluster creation on HDInsight and core assignment. Tuning Apache Spark submit

I want to process a 250 GB gzip file (filename.json.gzip) in an Azure HDInsight cluster with Spark, but I could not do it.
I guess it is because of a bad relationship between cores, RAM and vCPU(s), so I would like to know the best cluster to create and the Spark configuration to submit.
Currently I'm using this instance:
6 nodes of a cluster E8a v4 (8 Cores, 64 GB RAM)
And my Spark configuration is:
Driver Memory: 10Gb
Driver Cores: 7
Executor Memory: 10Gb
Executor Cores: 7
Num Executors: 7
So, is there a better choice among the Azure HDInsight clusters (link to all the available clusters I can create) and in the Spark submission configuration?
The performance of your Apache Spark jobs depends on multiple factors. These performance factors include: how your data is stored, how the cluster is configured, and the operations that are used when processing the data.
Common challenges you might face include: memory constraints due to improperly sized executors, long-running operations, and tasks that result in cartesian operations.
There are also many optimizations that can help you overcome these challenges, such as caching, and allowing for data skew.
For more details, refer to Optimize Apache Spark jobs in HDInsight.
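Purely as an illustration of how executor sizing could look on the 6-node E8a v4 cluster above (8 cores, 64 GB per node), here is a hedged sketch; the numbers, the storage path and the partition count are assumptions, not a verified recommendation:

    import org.apache.spark.sql.SparkSession

    // Illustrative sizing for 6 nodes x (8 cores, 64 GB RAM), assuming roughly
    // 1 core and some memory per node are left for the OS/YARN daemons and one
    // node's worth of capacity is kept free for the driver / application master.
    val spark = SparkSession.builder()
      .appName("hdinsight-sizing-sketch")
      .config("spark.executor.instances", "5")     // about one executor per worker node
      .config("spark.executor.cores", "7")         // 8 cores minus 1 for the OS
      .config("spark.executor.memory", "40g")      // leaves headroom under 64 GB
      .config("spark.executor.memoryOverhead", "6g")
      .config("spark.driver.memory", "10g")
      .getOrCreate()

    // A .json.gzip file is not splittable, so a single task reads the whole
    // 250 GB file; repartitioning right after the read (hypothetical path and
    // partition count) lets the rest of the job use all executors.
    val df = spark.read.json("wasbs:///data/filename.json.gzip").repartition(200)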

Apache Spark executors and data locality

The Spark literature says:
Each application gets its own executor processes, which stay up for the duration of the whole application and run tasks in multiple threads.
If I understand this correctly, with static allocation the executors are acquired by the Spark application on all nodes in the cluster when the SparkContext is created (in cluster mode). I have a couple of questions:
If executors are acquired on all nodes and stay allocated to this application for the duration of the whole application, isn't there a chance a lot of nodes remain idle?
What is the advantage of acquiring resources when the SparkContext is created and not in the DAGScheduler? I mean the application could be arbitrarily long and it is just holding the resources.
So when the DAGScheduler tries to get the preferred locations and the executors on those nodes are running tasks, would it relinquish the executors on other nodes?
I have checked a related question, Does Spark on yarn deal with Data locality while launching executors, but I'm not sure there is a conclusive answer.
If executors are acquired on all nodes and stay allocated to this application for the duration of the whole application, isn't there a chance a lot of nodes remain idle?
Yes, there is a chance. If you have data skew this will happen. The challenge is to tune the executors and executor cores so that you get maximum utilization. Spark also provides dynamic resource allocation, which removes idle executors.
What is the advantage of acquiring resources when the SparkContext is created and not in the DAGScheduler? I mean the application could be arbitrarily long and it is just holding the resources.
Spark tries to keep data in memory while doing transformations, unlike the MapReduce model, which writes to disk after every map operation. Spark can keep data in memory only if it can ensure the code is executed on the same machine. This is the reason for allocating resources beforehand.
So when the DAGScheduler tries to get the preferred locations and the executors in those nodes are running the tasks, would it relinquish the executors on other nodes?
Spark can't start a task on an executor unless the executor is free. The Spark application master negotiates with YARN to get the preferred location; it may or may not get it. If it doesn't, it will start the task on a different executor (see the locality-wait sketch below).
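How long the scheduler waits for a preferred (data-local) slot before falling back to a less-local executor is governed by the spark.locality.wait settings; a minimal sketch, with purely illustrative values:

    import org.apache.spark.sql.SparkSession

    // Sketch: how long the task scheduler waits for a data-local slot before
    // falling back through node-local, rack-local and any. Values are placeholders.
    val spark = SparkSession.builder()
      .appName("locality-wait-sketch")
      // Base wait applied at each locality level.
      .config("spark.locality.wait", "3s")
      // Optional per-level overrides.
      .config("spark.locality.wait.process", "3s")
      .config("spark.locality.wait.node", "3s")
      .config("spark.locality.wait.rack", "3s")
      .getOrCreate()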

How does Spark choose nodes to run executors? (Spark on YARN)

We use Spark in YARN mode, with a cluster of 120 nodes.
Yesterday one Spark job created 200 executors, with 11 executors on node1, 10 executors on node2, and the other executors distributed evenly across the remaining nodes.
Since there are so many executors on node1 and node2, the job ran slowly.
How does Spark select the nodes to run executors on?
According to the YARN ResourceManager?
As you mentioned Spark on YARN:
YARN chooses executor nodes for a Spark job based on the availability of cluster resources. Please check YARN's queue system and dynamic allocation. A good reference is https://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/
The cluster manager allocates resources across applications.
I think the issue is a badly optimized configuration. You need to configure Spark with dynamic allocation; in that case Spark will analyze the cluster resources and adjust the allocation to optimize the work.
You can find all information about Spark resource allocation and how to configure it here: http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
Do all 120 nodes have identical capacity?
Moreover, jobs are submitted to a suitable NodeManager based on the health and resource availability of that NodeManager.
To optimise a Spark job, you can use dynamic resource allocation, where you do not need to define the number of executors required to run the job. By default it starts the application with the configured minimum CPU and memory, later acquires resources from the cluster to execute tasks, and releases the resources back to the cluster manager once the job has completed or has been idle up to the configured idle timeout. It reclaims resources from the cluster once work starts again; the relevant settings are sketched below.
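A minimal sketch of the bounds and timeouts behind that behaviour (assuming YARN with the external shuffle service; every value below is an illustrative placeholder):

    import org.apache.spark.sql.SparkSession

    // Sketch of the dynamic allocation bounds and timeouts described above.
    // All values are placeholders to be tuned per cluster and workload.
    val spark = SparkSession.builder()
      .appName("dynamic-allocation-bounds-sketch")
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.shuffle.service.enabled", "true")               // external shuffle service on YARN
      .config("spark.dynamicAllocation.minExecutors", "2")           // never shrink below this
      .config("spark.dynamicAllocation.initialExecutors", "2")       // the configured minimum to start with
      .config("spark.dynamicAllocation.maxExecutors", "50")          // upper bound when scaling up
      .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")  // request more after tasks back up this long
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s")     // release executors idle this long
      .getOrCreate()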

Is it possible to run multiple Spark applications on a mesos cluster?

I have a Mesos cluster with 1 Master and 3 slaves (with 2 cores and 4GB RAM each) that has a Spark application already up and running. I wanted to run another application on the same cluster, as the CPU and Memory utilization isn't high. Regardless, when I try to run the new Application, I get the error:
16/02/25 13:40:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
I guess the new process is not getting any CPU as the old one occupies all 6.
I have tried enabling dynamic allocation, making the Spark app fine-grained, and assigning numerous combinations of executor cores and number of executors. What am I missing here? Is it possible to run a Mesos cluster with multiple Spark frameworks at all?
You can try setting spark.cores.max to limit the number of CPUs used by each Spark driver, which will free up some resources.
Docs: https://spark.apache.org/docs/latest/configuration.html#scheduling
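For example, here is a hedged sketch of capping the first application so that the 6-core Mesos cluster keeps capacity free for a second one; the cap of 4 cores and the 2g executor memory are illustrative assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Sketch: limit the total cores this application may claim from Mesos,
    // leaving the remaining cores free for other frameworks.
    val conf = new SparkConf()
      .setAppName("first-app")
      .set("spark.cores.max", "4")        // claim at most 4 of the 6 cores
      .set("spark.executor.memory", "2g") // leave memory for the other application too

    val spark = SparkSession.builder().config(conf).getOrCreate()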

Configuring Executor memory and number of executors per Worker node

How do I configure the executor memory in a Spark cluster? Also, how do I configure the number of executors per worker node?
Is there any way to know how much executor memory is free to cache or persist new RDDs?
Configuring Spark executor memory: use the parameter spark.executor.memory or the --executor-memory flag when submitting the job.
Configuring the number of executors per node depends on which cluster manager you use for Spark. With YARN and Mesos you don't have direct control over this; you can just set the number of executors. With a Spark Standalone cluster, you can tune the SPARK_WORKER_INSTANCES parameter.
You can check the amount of free memory in the web UI of the Spark driver. Refer to How to set Apache Spark Executor memory to see why this is not equal to the total executor memory you've set.
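A minimal sketch of those settings (sizes are placeholders), together with the approximate unified-memory arithmetic that explains why the memory shown in the web UI is smaller than spark.executor.memory:

    import org.apache.spark.sql.SparkSession

    // Sketch: setting executor memory and, on YARN, the number of executors.
    // Sizes are placeholders. On a Standalone cluster the executors per worker
    // are driven by SPARK_WORKER_INSTANCES in conf/spark-env.sh instead.
    val spark = SparkSession.builder()
      .appName("executor-memory-sketch")
      .config("spark.executor.memory", "4g")
      .config("spark.executor.instances", "6")
      .getOrCreate()

    // Approximate memory shared by execution and storage in the unified memory manager:
    //   (heap - 300 MB reserved) * spark.memory.fraction (default 0.6)
    // e.g. for a 4 GB heap: (4096 - 300) MB * 0.6 = ~2.2 GB,
    // which is why the web UI shows less than the 4 GB configured above.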
