Spark num-executors - apache-spark

I have set up a 10-node HDP platform on AWS. Below is my configuration:
2 Servers - Name Node and Standby Name Node
7 Data Nodes, each with 40 vCPUs and 160 GB of memory.
I am trying to calculate the number of executors to use when submitting Spark applications, and after going through different blogs I am confused about what this parameter actually means.
Looking at the blog below, it seems that num-executors is the total number of executors across all nodes:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
But looking at this blog, it seems that num-executors is per node or server:
https://blogs.aws.amazon.com/bigdata/post/Tx578UTQUV7LRP/Submitting-User-Applications-with-spark-submit
Can anyone please clarify and review the below:
Is the num-executors value per node, or is it the total number of executors across all the data nodes?
I am using the below calculation to come up with the core count, executor count and memory per executor:
Number of cores <= 5 (assuming 5)
Num executors = (40-1)/5 = 7
Memory = (160-1)/7 = 22 GB
With the above calculation, which would be the correct way?
--master yarn-client --driver-memory 10G --executor-memory 22G --num-executors 7 --executor-cores 5
OR
--master yarn-client --driver-memory 10G --executor-memory 22G --num-executors 49 --executor-cores 5
Thanks,
Jayadeep

Can anyone please clarify and review the below:
Is the num-executors value per node, or is it the total number of executors across all the data nodes?
You first need to understand that the executors run on the NodeManagers (you can think of these as the workers in Spark standalone). A number of containers (each bundling vCPU, memory, network, disk, etc.) equal to the number of executors you specify will be allocated for your Spark application on YARN. These executor containers will run on multiple NodeManagers, and how they are placed depends on the CapacityScheduler (the default scheduler in HDP).
So, to sum up, the total number of executors is the number of resource containers you specify for your application to run.
Refer to this blog to understand it better.
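For example, applying your own per-node arithmetic cluster-wide: 7 executors per data node across 7 data nodes gives 7 x 7 = 49 executors in total, which is the value you would pass to --num-executors. A sketch using the numbers from your question (the class and jar names are placeholders, not a tuned recommendation):
spark-submit --master yarn-client --driver-memory 10G \
  --num-executors 49 --executor-cores 5 --executor-memory 22G \
  --class your.main.Class your-application.jar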
I am using the below calculation to come up with the core count, executor count and memory per executor:
Number of cores <= 5 (assuming 5); Num executors = (40-1)/5 = 7; Memory = (160-1)/7 = 22 GB
There is no rigid formula for calculating the number of executors. Instead, you can try enabling dynamic allocation in YARN for your application.
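A minimal sketch of what enabling dynamic allocation might look like (the executor sizes and application jar are placeholders; on YARN this also requires the external shuffle service to be enabled on the NodeManagers):
# Let YARN grow and shrink the executor count between the min and max bounds
spark-submit --master yarn-client \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=49 \
  --executor-cores 5 --executor-memory 22G \
  your-application.jar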

There is a hiccup with the CapacityScheduler: as far as I understand, by default it schedules by memory only. You will first need to switch it to the DominantResourceCalculator (the yarn.scheduler.capacity.resource-calculator property in capacity-scheduler.xml), which lets YARN account for combinations of memory and cores. Once you change that, you should be able to request both CPU and memory for your Spark application.
As for the --num-executors flag, you can even keep it at a very high value such as 1000. YARN will still allocate only as many containers as can actually be launched on each node, and as your cluster resources increase, the number of containers attached to your application will increase accordingly. The number of containers you can launch per node is limited by the amount of resources allocated to the NodeManagers on those nodes.
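As a rough illustration of that (the jar name is a placeholder): you can over-ask and let YARN cap the actual allocation based on what the NodeManagers advertise via yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores:
# YARN will only launch as many of these containers as the NodeManagers can actually host
spark-submit --master yarn-client --num-executors 1000 \
  --executor-cores 5 --executor-memory 22G your-application.jar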

Related

yarn allocate containers for Spark in AWS

I was able to create YARN containers for my Spark jobs.
I have come across various blogs and YouTube videos on using --executor-cores efficiently (values from 4-6 for good throughput) and --executor-memory after reserving 1 CPU core and 1 GB of RAM for the Hadoop daemons, and I determined the right values for each executor.
I also came across articles like these.
I am checking how many containers are created by YARN from the Spark shell, and I am not able to understand how the containers are allocated.
For example, I have created an EMR cluster with 1 master node of instance type m5.xlarge (4 vCores, 16 GiB) and 1 core node of instance type c5.2xlarge (8 vCores and 16 GiB of RAM).
When I start the Spark shell with the following command:
spark-shell --num-executors=6 --executor-cores=5 --conf spark.executor.memoryOverhead=1G --executor-memory 1G --driver-memory 1G
I see that 6 executors, including the driver, are created, with 5 cores for each executor, for a total of 25 cores.
However, the metrics from the Hadoop history server do not reflect the right calculations.
I am very confused how, in the Spark UI, more cores than are available were allocated to each executor. The total vCores in the cluster is 8, considering the core node, yet a total of 25 cores are allocated to the executors.
Can someone please explain what I am missing?
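One thing worth checking (the config path is the usual EMR location, so treat it as an assumption): whether the CapacityScheduler is using the DefaultResourceCalculator, which schedules by memory only and does not enforce vCores, in which case Spark can report more executor cores than the cluster physically has:
# If this shows DefaultResourceCalculator, YARN is only scheduling by memory,
# so Spark-side core counts are not capped by the physical vCores.
grep -A 2 'resource-calculator' /etc/hadoop/conf/capacity-scheduler.xml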

Spark on YARN resource manager: Relation between YARN Containers and Spark Executors

I'm new to Spark on YARN and don't understand the relation between the YARN containers and the Spark executors. I tried out the following configuration, based on the results of the yarn-utils.py script, which can be used to find an optimal cluster configuration.
The Hadoop cluster (HDP 2.4) I'm working on:
1 Master Node:
CPU: 2 CPUs with 6 cores each = 12 cores
RAM: 64 GB
SSD: 2 x 512 GB
5 Slave Nodes:
CPU: 2 CPUs with 6 cores each = 12 cores
RAM: 64 GB
HDD: 4 x 3 TB = 12 TB
HBase is installed (this is one of the parameters for the script below)
So I ran python yarn-utils.py -c 12 -m 64 -d 4 -k True (c=cores, m=memory, d=hdds, k=hbase-installed) and got the following result:
Using cores=12 memory=64GB disks=4 hbase=True
Profile: cores=12 memory=49152MB reserved=16GB usableMem=48GB disks=4
Num Container=8
Container Ram=6144MB
Used Ram=48GB
Unused Ram=16GB
yarn.scheduler.minimum-allocation-mb=6144
yarn.scheduler.maximum-allocation-mb=49152
yarn.nodemanager.resource.memory-mb=49152
mapreduce.map.memory.mb=6144
mapreduce.map.java.opts=-Xmx4915m
mapreduce.reduce.memory.mb=6144
mapreduce.reduce.java.opts=-Xmx4915m
yarn.app.mapreduce.am.resource.mb=6144
yarn.app.mapreduce.am.command-opts=-Xmx4915m
mapreduce.task.io.sort.mb=2457
I made these settings via the Ambari interface and restarted the cluster. The values also roughly match what I had calculated manually before.
I now have problems:
to find the optimal settings for my spark-submit script, i.e. the parameters --num-executors, --executor-cores & --executor-memory;
to understand the relation between the YARN containers and the Spark executors;
to understand the hardware information in my Spark History UI (less memory is shown than I set, when extrapolated to overall memory by multiplying by the number of worker nodes);
to understand the concept of vCores in YARN, for which I couldn't find any useful examples yet.
However, I found the post What is a container in YARN?, but it didn't really help, as it doesn't describe the relation to the executors.
Can someone help me solve one or more of these questions?
I will report my insights here step by step:
The first important thing is this fact (source: this Cloudera documentation):
When running Spark on YARN, each Spark executor runs as a YARN container. [...]
This means the number of containers will always be the same as the number of executors created by a Spark application, e.g. via the --num-executors parameter of spark-submit.
Every container always allocates at least the amount of memory set by yarn.scheduler.minimum-allocation-mb. This means that if the --executor-memory parameter is set to e.g. only 1g, but yarn.scheduler.minimum-allocation-mb is e.g. 6g, the container is much bigger than the Spark application needs.
The other way round, if the --executor-memory parameter is set to something higher than the yarn.scheduler.minimum-allocation-mb value, e.g. 12g, the container will allocate more memory dynamically, but only if the requested amount of memory is smaller than or equal to the yarn.scheduler.maximum-allocation-mb value.
The value of yarn.nodemanager.resource.memory-mb determines how much memory can be allocated in total by all containers on one host!
=> So setting yarn.scheduler.minimum-allocation-mb to a small value allows you to run smaller containers, e.g. for smaller executors (otherwise memory would be wasted).
=> Setting yarn.scheduler.maximum-allocation-mb to the maximum value (e.g. equal to yarn.nodemanager.resource.memory-mb) allows you to define bigger executors (more memory is allocated if needed, e.g. via the --executor-memory parameter).
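To make that concrete, a small sketch using the values from the yarn-utils.py output above (the application jar is a placeholder): with yarn.scheduler.minimum-allocation-mb=6144, an executor request below ~6 GB still occupies a 6144 MB container, while a 12g request (plus overhead) is granted as long as it stays below the 49152 MB maximum:
# each executor below becomes one YARN container of at least 6144 MB
spark-submit --master yarn-client --num-executors 10 \
  --executor-cores 4 --executor-memory 12g your-application.jar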

How to set executor number by memory in YARN mode?

I did some testing on an r3.8xlarge cluster; each instance has 32 cores and 244 GB of memory.
If I set spark.executor.cores=16 and spark.executor.memory=94G, there are 2 executors per instance, but when I set spark.executor.memory larger than 94G, there is only one executor per instance.
If I set spark.executor.cores=8 and spark.executor.memory=35G, there are 4 executors per instance, but when I set spark.executor.memory larger than 35G, there are no more than 3 executors per instance.
So, my question is: how is the number of executors derived from the memory setting? What's the formula? I thought Spark simply allocates 70% of the physical memory to the executors, but it seems I'm wrong...
In YARN mode you need to set the number of executors with num-executors and the executor memory with executor-memory. Here's an example:
spark-submit --master yarn-cluster --executor-memory 6G --num-executors 31 --executor-cores 32 example.jar Example
Now each executor requests a container from YARN with 6G + memory overhead and 1 core.
More info in the Spark documentation.
Regarding the behavior you're seeing, it sounds like the amount of memory available to your YARN NodeManagers is actually less than the 244 GB available to the OS. To verify this, take a look at your YARN ResourceManager web UI, where you can see how much memory is available in total across the cluster. This is determined by the yarn.nodemanager.resource.memory-mb setting in yarn-site.xml.
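If you prefer the command line over the web UI, the stock YARN CLI shows the same per-node capacities (the node ID below is a placeholder):
# list the NodeManagers, then inspect one node's memory and vCore capacity
yarn node -list
yarn node -status ip-10-0-0-1.ec2.internal:8041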
To answer your question about how the number of executors is determined: in YARN, if you're running Spark with spark.dynamicAllocation.enabled set to true, the number of executors is bounded below by spark.dynamicAllocation.minExecutors and above by spark.dynamicAllocation.maxExecutors.
Other than that, you're subject to YARN's resource allocation, which, for most schedulers, allocates resources to fill up the queue that your job runs in.
If you have a completely idle cluster with one YARN queue and you submit a job to it, the Spark job will continue to add executors with the given number of cores and amount of memory until the entire cluster is full (or there are not enough cores/memory left for an additional executor to be allocated).
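As a worked example of the memory math you asked about (the NodeManager capacity below is an assumption; check yarn.nodemanager.resource.memory-mb on your cluster): each executor request is its heap plus the YARN memory overhead, which by default is the larger of 384 MB and 10% of the executor memory, and the number of executors per node is roughly the NodeManager memory divided by that container size:
# rough per-node executor count, assuming ~230 GB advertised by each NodeManager
NODE_MB=235520                    # assumed yarn.nodemanager.resource.memory-mb
EXEC_MB=$((94 * 1024))            # spark.executor.memory = 94G
OVERHEAD_MB=$((EXEC_MB / 10))     # default overhead: max(384 MB, 10% of executor memory)
CONTAINER_MB=$((EXEC_MB + OVERHEAD_MB))
echo $((NODE_MB / CONTAINER_MB))  # prints 2, matching the 2 executors per instance you observed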

unreasonable yarn cluster metrics

I have been using Spark and YARN for quite a while, and mostly have a handle on all the spark-submit parameters. I am currently using a 5-node EMR cluster, 1 master and 4 workers, all m3.xlarge, which is spec'ed at 4 vCores. (Actually, when I SSHed into the machines and checked, there were only 3 cores.)
However, when I spark-submit a job to the EMR cluster:
spark-submit --master yarn --class myclass --num-executors 9 --executor-cores 2 --executor-memory 500M my.jar
The YARN console always shows that I have 32 vCores total and 4 vCores used, and that the number of active nodes is 4.
So this total number of vCores is a real mystery. How can there be 32 vCores? Even if you count the master node, that is 5 * 4 vCores = 20. Not counting the master node, there are indeed 4 active worker nodes, which would make the total vCore count 16, not 32. Can anyone explain this?
The hardware you are running on uses hyperthreading, which allows each physical core to work as two virtual cores. Each of your four worker machines has 4 physical cores, which corresponds to 8 virtual cores, so YARN reports 4 nodes * 8 vCores = 32 vCores in total.
See:
https://aws.amazon.com/ec2/instance-types/
and https://en.wikipedia.org/wiki/Hyper-threading

Using all resources in Apache Spark with Yarn

I am using Apache Spark with the YARN client.
I have 4 worker PCs in my Spark cluster, each with 8 vCPUs and 30 GB of RAM.
I set my executor memory to 2G and the number of executor instances to 33.
My job takes 10 hours to run, and all machines are about 80% idle.
I don't understand the correlation between executor memory and executor instances. Should I have one instance per vCPU? Should I set the executor memory to the machine's memory divided by the number of executors per machine?
I believe that you have to use the following command:
spark-submit --num-executors 4 --executor-memory 7G --driver-memory 2G --executor-cores 8 --class "YourClassName" --master yarn-client
The number of executors should be 4, since you have 4 workers. The executor memory should be close to the maximum memory that each YARN node has allocated, roughly ~5-6 GB (I assume you have 30 GB of total RAM).
You should take a look at the spark-submit parameters and fully understand them.
We were using Cassandra as our data source for Spark. The problem was that there were not enough partitions; we needed to split up the data more. Our mapping of Cassandra partitions to Spark partitions was not fine-grained enough, and we would only generate 10 or 20 tasks instead of hundreds of tasks.
