Optimize Spark and Yarn configuration - apache-spark

We have a cluster of 4 nodes with the characteristics above.
Spark jobs take a lot of time to process. How can we optimize this time, given that our jobs run from RStudio and a lot of memory is still left unused?

To add more context to the answer above, I would like to explain how to set the parameters --num-executors, --executor-memory and --executor-cores appropriately.
The following answer covers the 3 main aspects mentioned in the title: number of executors, executor memory and number of cores.
There may be other parameters, like driver memory, that I have not addressed in this answer.
Case 1 Hardware - 6 nodes, each with 16 cores and 64 GB RAM
Each executor is a JVM instance, so we can have multiple executors on a single node.
First, 1 core and 1 GB are needed for the OS and Hadoop daemons, so 15 cores and 63 GB RAM are available on each node.
Let's go through how to choose these parameters one by one.
Number of cores:
Number of cores = concurrent tasks an executor can run
So we might think that more concurrent tasks per executor will give better performance. But research shows that any application with more than 5 concurrent tasks per executor performs poorly, so stick to 5.
This number comes from an executor's ability to run parallel tasks, not from how many cores the machine has. So the number 5 stays the same even if the CPU has double the cores (32).
Number of executors:
Moving to the next step: with 5 cores per executor and 15 available cores per node, we get 3 executors per node.
So with 6 nodes and 3 executors per node, we get 18 executors. Out of these 18, we need 1 executor (Java process) for the Application Master in YARN, which leaves 17 executors.
This 17 is the number we give to Spark via --num-executors when running spark-submit.
Memory for each executor:
From the step above, we have 3 executors per node, and the available RAM is 63 GB.
So the memory for each executor is 63/3 = 21 GB.
However, a small amount of overhead memory also counts toward the full memory request to YARN for each executor.
The formula for that overhead is max(384 MB, 0.07 * spark.executor.memory).
Calculating that overhead: 0.07 * 21 GB (21 being the 63/3 from above) = 1.47 GB.
Since 1.47 GB > 384 MB, the overhead is 1.47 GB.
Subtracting that from the 21 GB: 21 - 1.47 ≈ 19 GB.
So executor memory - 19 GB
Final numbers - Executors: 17 in total, Cores: 5 per executor, Executor Memory: 19 GB
Assigning resources to Spark jobs this way speeds them up while making efficient use of the cluster's available resources.
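As a rough sketch of the same arithmetic (the helper function below is hypothetical and only mirrors the steps above; the constants are the 1 core / 1 GB reservation, the 5-core rule and the max(384 MB, 7%) overhead formula from the example):

# Hypothetical helper that reproduces the arithmetic above; not part of Spark.
def suggest_resources(nodes, cores_per_node, ram_gb_per_node, cores_per_executor=5):
    usable_cores = cores_per_node - 1                         # 1 core for OS/Hadoop daemons
    usable_ram_gb = ram_gb_per_node - 1                       # 1 GB for OS/Hadoop daemons
    executors_per_node = usable_cores // cores_per_executor   # 15 // 5 = 3
    num_executors = executors_per_node * nodes - 1            # minus 1 for the YARN Application Master
    mem_per_executor = usable_ram_gb / executors_per_node     # 63 / 3 = 21 GB
    overhead_gb = max(0.384, 0.07 * mem_per_executor)         # max(384 MB, 7% of executor memory)
    executor_memory_gb = int(mem_per_executor - overhead_gb)  # 21 - 1.47 ~ 19 GB
    return num_executors, cores_per_executor, executor_memory_gb

print(suggest_resources(nodes=6, cores_per_node=16, ram_gb_per_node=64))  # (17, 5, 19)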

I recommend you have a look at these parameters:
--num-executors : controls how many executors will be allocated
--executor-memory : RAM for each executor
--executor-cores : cores for each executor
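If you set resources in code rather than on the spark-submit command line, the equivalent configuration keys can be passed when building the session. A minimal sketch (the app name is made up and the numbers are just the example values from above; tune them for your own cluster):

from pyspark.sql import SparkSession

# Equivalent configuration keys for the flags above:
#   spark.executor.instances -> --num-executors
#   spark.executor.memory    -> --executor-memory
#   spark.executor.cores     -> --executor-cores
spark = (SparkSession.builder
         .appName('resource-allocation-example')
         .config('spark.executor.instances', '17')
         .config('spark.executor.memory', '19g')
         .config('spark.executor.cores', '5')
         .getOrCreate())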

Related

spark-submit criteria to set parameter values

I am so confused about the right criteria to use when it comes to setting the following spark-submit parameters, for example:
spark-submit --deploy-mode cluster --name 'CoreLogic Transactions Curated ${var_date}' \
--driver-memory 4G --executor-memory 4G --num-executors 10 --executor-cores 4 \
/etl/scripts/corelogic/transactions/corelogic_transactions_curated.py \
--from_date ${var_date} \
--to_date ${var_to_date}
One person told me that I am using too many executors and cores, but he did not explain why.
Can someone explain to me the right criteria to use when it comes to setting these parameters (--driver-memory 4G --executor-memory 4G --num-executors 10 --executor-cores 4) according to my dataset?
The same applies in the following case:
spark = SparkSession.builder \
.appName('DemoEcon PEP hist stage') \
.config('spark.sql.shuffle.partitions', args.shuffle_partitions) \
.enableHiveSupport() \
.getOrCreate()
I am not quite sure what criteria should be used to set the parameter "spark.sql.shuffle.partitions".
Can someone help me get this clear in my mind?
Thank you in advance
This website has the answer I needed, an excellent explanation with some examples:
http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
Here is one of those examples:
Case 1 Hardware – 6 nodes, each with 16 cores and 64 GB RAM
First, on each node, 1 core and 1 GB are needed for the operating system and Hadoop daemons, so we have 15 cores and 63 GB RAM per node
We start with how to choose the number of cores:
Number of cores = concurrent tasks an executor can run
So we might think that more concurrent tasks for each executor will give better performance. But research shows that any application with more than 5 concurrent tasks performs poorly. So the optimal value is 5.
This number comes from the ability of an executor to run parallel tasks and not from how many cores a system has. So the number 5 stays the same even if we have double (32) the cores in the CPU
Number of executors:
Coming to the next step: with 5 cores per executor and 15 total available cores in one node (CPU) – we come to 3 executors per node, which is 15/5. We need to calculate the number of executors on each node and then get the total number for the job.
So with 6 nodes and 3 executors per node – we get a total of 18 executors. Out of these 18 we need 1 executor (Java process) for the Application Master in YARN. So the final number is 17 executors
This 17 is the number we give to Spark using --num-executors while running the spark-submit shell command
Memory for each executor:
From the above step, we have 3 executors per node. And the available RAM on each node is 63 GB
So the memory for each executor on each node is 63/3 = 21 GB.
However, a small amount of overhead memory is also needed to determine the full memory request to YARN for each executor.
The formula for that overhead is max(384 MB, 0.07 * spark.executor.memory)
Calculating that overhead: 0.07 * 21 (here 21 is calculated as above, 63/3) = 1.47
Since 1.47 GB > 384 MB, the overhead is 1.47 GB
Subtracting that from each of the 21 GB => 21 – 1.47 ≈ 19 GB
So executor memory – 19 GB
Final numbers – Executors – 17, Cores 5, Executor Memory – 19 GB

How to calculate the executor memory, number of executors, number of executor cores and driver memory to read a file of 40 GB using Spark?

Yarn Cluster Configuration:
8 Nodes
8 cores per Node
8 GB RAM per Node
1TB HardDisk per Node
Executor memory & No of Executors
Executor memory and the number of executors per node are interlinked, so start by choosing either the executor memory or the number of executors, and then, based on that choice, follow this to set the properties and get the desired result.
In YARN, these properties affect the number of containers (i.e. executors in Spark) that can be instantiated on a NodeManager, based on the spark.executor.cores and spark.executor.memory property values (along with the executor memory overhead).
For example, take a cluster with 10 nodes (RAM: 16 GB, cores: 6) set with the following YARN properties:
yarn.scheduler.maximum-allocation-mb=10GB
yarn.nodemanager.resource.memory-mb=10GB
yarn.scheduler.maximum-allocation-vcores=4
yarn.nodemanager.resource.cpu-vcores=4
Then with the Spark properties spark.executor.cores=2 and spark.executor.memory=4GB, you can expect 2 executors per node, so in total you'll get 19 executors + 1 container for the driver.
If the Spark properties are spark.executor.cores=3 and spark.executor.memory=8GB, then you will get 9 executors (only 1 executor per node) + 1 container for the driver.
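That container-fitting logic can be sketched as follows (the helper is hypothetical and only illustrates the reasoning above; it assumes the older default overhead of max(384 MB, 7% of executor memory)):

import math

# Hypothetical helper: how many executor containers fit on one NodeManager,
# limited by both the node's memory and its vcores.
def executors_per_node(node_mem_gb, node_vcores, executor_mem_gb, executor_cores):
    overhead_gb = max(0.384, 0.07 * executor_mem_gb)        # assumed overhead formula
    container_mem_gb = executor_mem_gb + overhead_gb        # full YARN request per executor
    by_memory = math.floor(node_mem_gb / container_mem_gb)
    by_cores = node_vcores // executor_cores
    return min(by_memory, by_cores)

# 10 nodes, yarn.nodemanager.resource.memory-mb = 10 GB, cpu-vcores = 4 (as above)
print(executors_per_node(10, 4, executor_mem_gb=4, executor_cores=2) * 10 - 1)  # 19 executors (+1 driver)
print(executors_per_node(10, 4, executor_mem_gb=8, executor_cores=3) * 10 - 1)  # 9 executors (+1 driver)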
Driver memory
spark.driver.memory - Maximum size of each Spark driver's Java heap memory.
spark.yarn.driver.memoryOverhead - Amount of extra off-heap memory that can be requested from YARN per driver. Together with spark.driver.memory, this is the total memory that YARN can use to create a JVM for a driver process.
Spark driver memory does not impact performance directly, but it ensures that the Spark jobs run without memory constraints at the driver. Adjust the total amount of memory allocated to the Spark driver using the following rule, where X is the value of yarn.nodemanager.resource.memory-mb:
12 GB when X is greater than 50 GB
4 GB when X is between 12 GB and 50 GB
1 GB when X is between 1GB and 12 GB
256 MB when X is less than 1 GB
These numbers are for the sum of spark.driver.memory and spark.yarn.driver.memoryOverhead. The overhead should be 10-15% of the total.
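As a small sketch of that rule of thumb (the helper is hypothetical; it just encodes the thresholds above and splits off roughly 10% of the total as overhead):

# Hypothetical helper encoding the rule of thumb above:
# total = spark.driver.memory + spark.yarn.driver.memoryOverhead,
# chosen from X = yarn.nodemanager.resource.memory-mb.
def total_driver_memory_gb(x_gb):
    if x_gb > 50:
        return 12
    elif x_gb >= 12:
        return 4
    elif x_gb >= 1:
        return 1
    else:
        return 0.25                      # 256 MB

total = total_driver_memory_gb(64)       # e.g. X = 64 GB -> 12 GB total
overhead = round(total * 0.10, 2)        # ~10% of the total as memoryOverhead
driver_memory = total - overhead         # the rest goes to spark.driver.memory
print(total, overhead, driver_memory)    # 12 1.2 10.8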
You can also follow this Cloudera link for tuning Spark jobs

Does reducing the number of executor-cores consume less executor-memory?

My Spark job failed with the YARN error Container killed by YARN for exceeding memory limits 10.0 GB of 10 GB physical memory used.
Intuitively, I decreased the number of cores from 5 to 1 and the job ran successfully.
I did not increase the executor-memory because 10g was the max for my YARN cluster.
I just wanted to confirm my intuition: does reducing the number of executor-cores consume less executor-memory? If so, why?
spark.executor.cores = 5, spark.executor.memory=10G
This means an executor can run 5 tasks in parallel, so the 10 GB needs to be shared by 5 tasks. Effectively, each task has about 2 GB available on average. If the tasks consume more than 2 GB each, the JVM as a whole ends up consuming more than 10 GB, and YARN kills the container.
spark.executor.cores = 1, spark.executor.memory=10G
This means an executor can run only 1 task, so the full 10 GB is available to that one task. So if the task uses more than 2 GB but less than 10 GB, it works fine. That was the case in your job, and so it worked.
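The rough per-task arithmetic from that explanation, written out (the even-split assumption is the simplification used above, not an exact Spark formula):

# Approximate heap share per task, assuming tasks split the executor heap evenly.
executor_memory_gb = 10

executor_cores = 5                                  # 5 concurrent tasks per executor
print(executor_memory_gb / executor_cores)          # 2.0 GB per task on average

executor_cores = 1                                  # a single task gets the whole heap
print(executor_memory_gb / executor_cores)          # 10.0 GB for that task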
Yes, each executor uses an extra 7% of memoryOverhead.
This calculation assumes you have two nodes, with three executors on one node and two executors on the other.
Memory per executor on the first node = 10 GB / 3 = 3.33 GB
Subtracting the off-heap overhead = 7% of 3.33 GB = 0.23 GB,
your executor-memory should be 3.33 GB - 0.23 GB ≈ 3.1 GB per executor on that node.
You can read another explanation here:
https://spoddutur.github.io/spark-notes/distribution_of_executors_cores_and_memory_for_spark_application.html

Spark increasing the number of executors in yarn mode

I am running Spark over YARN on a 4-node cluster. Each machine in the cluster has 128 GB of memory and a 24-core CPU. I run Spark using this command:
spark-shell --master yarn --num-executors 19 --executor-memory 18g --executor-cores 4 --driver-memory 4g
But Spark only launches 16 executors at most. I have the maximum vcore allocation in YARN set to 80 (out of the 94 cores I have). So I was under the impression that this would launch 19 executors, but it only goes up to 16. Also, I don't think even these executors are using the allocated vcores completely.
These are my questions
Why isn't Spark creating 19 executors? Is there a computation behind the scenes that's limiting it?
What is the optimal configuration for running spark-shell given my cluster configuration, if I want the best possible Spark performance?
driver-cores is set to 1 by default. Will increasing it improve performance?
Here is my Yarn Config
yarn.nodemanager.resource.memory-mb: 106496
yarn.scheduler.minimum-allocation-mb: 3584
yarn.scheduler.maximum-allocation-mb: 106496
yarn.scheduler.minimum-allocation-vcores: 1
yarn.scheduler.maximum-allocation-vcores: 20
yarn.nodemanager.resource.cpu-vcores: 20
Ok so going by your configurations we have:
(I am also a newbie at Spark, but below is what I speculate in this scenario.)
We have 24 cores and 128 GB RAM per node, and 4 nodes in the cluster.
We allocate 1 core and 1 GB of memory per node for overhead, and consider that you're running your cluster in YARN client mode.
That leaves us with 127 GB RAM and 23 cores on each of the 4 nodes.
As mentioned in the Cloudera blog, YARN runs at optimal performance when at most 5 cores are allocated per executor.
So, 23 x 4 = 92 cores.
If we allocate 5 cores per executor, then 18 executors get 5 cores each and 1 executor gets 2 cores, or some similar split.
So let's assume we have 18 executors in our application and 5 cores per executor.
Spark distributes these 18 executors across the 4 nodes; suppose they are distributed as:
1st node : 4 executors
2nd node : 4 executors
3rd node : 5 executors
4th node : 5 executors
Now, as 'yarn.nodemanager.resource.memory-mb: 106496' is set to 104 GB in your configuration, each node can have at most 104 GB of memory allocated (I would suggest increasing this parameter).
For nodes with 4 executors: 104/4 = 26 GB per executor
For nodes with 5 executors: 104/5 ≈ 21 GB per executor
Now, leaving out 7% of that memory for overhead, we get roughly 24 GB and 20 GB.
So I would suggest using the following configuration:
--num-executors : 18
--executor-memory : 20G
--executor-cores : 5
Also, this assumes that you're running your cluster in client mode; if you run your cluster in YARN cluster mode, 1 node will be allocated for the driver program and the calculations will need to be done differently.
I still cannot comment, so I'll post this as an answer.
See this question. Could you please decrease the executor memory and try running this again?

How to tune spark executor number, cores and executor memory?

Where do you start when tuning the above-mentioned parameters? Do we start with executor memory and derive the number of executors, or do we start with cores and derive the number of executors? I followed the link, but it only gave me a high-level idea; I'm still not sure how or where to start and how to arrive at a final conclusion.
The following answer covers the 3 main aspects mentioned in the title - number of executors, executor memory and number of cores. There may be other parameters, like driver memory, that I have not addressed in this answer but would like to add in the near future.
Case 1 Hardware - 6 nodes, each with 16 cores and 64 GB RAM
Each executor is a JVM instance, so we can have multiple executors on a single node.
First, 1 core and 1 GB are needed for the OS and Hadoop daemons, so 15 cores and 63 GB RAM are available on each node.
Start with how to choose the number of cores:
Number of cores = concurrent tasks an executor can run
So we might think that more concurrent tasks for each executor will give better performance. But research shows that any application with more than 5 concurrent tasks performs poorly, so stick to 5.
This number comes from the ability of an executor to run parallel tasks, not from how many cores a system has. So the number 5 stays the same even if you have double the cores (32) in the CPU.
Number of executors:
Moving to the next step: with 5 cores per executor and 15 total available cores in one node (CPU), we come to 3 executors per node.
So with 6 nodes and 3 executors per node, we get 18 executors. Out of these 18, we need 1 executor (Java process) for the AM in YARN, which leaves 17 executors.
This 17 is the number we give to Spark using --num-executors while running the spark-submit shell command.
Memory for each executor:
From the above step, we have 3 executors per node, and the available RAM is 63 GB.
So the memory for each executor is 63/3 = 21 GB.
However, a small amount of overhead memory is also needed to determine the full memory request to YARN for each executor.
The formula for that overhead is max(384 MB, 0.07 * spark.executor.memory).
Calculating that overhead: 0.07 * 21 (here 21 is calculated as above, 63/3) = 1.47.
Since 1.47 GB > 384 MB, the overhead is 1.47 GB.
Subtracting that from each of the 21 GB => 21 - 1.47 ≈ 19 GB.
So executor memory - 19 GB
Final numbers - Executors - 17, Cores 5, Executor Memory - 19 GB
Case 2 Hardware: the same 6 nodes, but 32 cores and 64 GB RAM each
5 cores per executor stays the same, for good concurrency.
Number of executors for each node = 31/5 ≈ 6 (again reserving 1 core per node)
So total executors = 6 * 6 nodes = 36. Subtracting 1 for the AM, the final number is 36 - 1 = 35.
Executor memory: with 6 executors per node, 63/6 ≈ 10 GB. The overhead is 0.07 * 10 = 700 MB. Rounding the overhead up to 1 GB, we get 10 - 1 = 9 GB.
Final numbers - Executors - 35, Cores 5, Executor Memory - 9 GB
Case 3
The above scenarios start by accepting the number of cores per executor as fixed and then move on to the number of executors and the memory.
Now for the first case, if we think we don't need 19 GB and 10 GB is sufficient, then the numbers are as follows:
cores: 5
number of executors for each node = 3
At this stage, the calculation would again lead to 21 GB and then 19 GB, as in our first case. But since we decided 10 GB is enough (allowing a little overhead), we can't simply switch the number of executors per node to 6 (as 63/10 would suggest), because with 6 executors per node and 5 cores each we would need 30 cores per node, when we only have 16. So we also need to change the number of cores per executor.
So, calculating again:
the magic number 5 comes down to 3 (any number less than or equal to 5 works). So with 3 cores per executor and 15 available cores, we get 5 executors per node, and (5 * 6) - 1 = 29 executors.
So memory is 63/5 ≈ 12 GB. The overhead is 12 * 0.07 = 0.84 GB, so rounding to 1 GB,
executor memory is 12 - 1 GB = 11 GB.
Final numbers: 29 executors, 3 cores per executor, 11 GB of executor memory.
Dynamic Allocation:
Note: the upper bound for the number of executors when dynamic allocation is enabled is, by default, unlimited, which means a Spark application can eat up all the resources in the cluster if needed. In a cluster where other applications are running and also need cores for their tasks, make sure resources are shared at the cluster level. For example, you can allocate a specific number of cores in YARN per user: create a spark_user, say, and give that user a min/max number of cores. These limits are for sharing between Spark and the other applications that run on YARN.
spark.dynamicAllocation.enabled - when this is set to true, we need not specify the number of executors. The reason is below:
The static numbers we give at spark-submit apply for the entire job duration. With dynamic allocation, however, there are different stages:
What to start with:
The initial number of executors (spark.dynamicAllocation.initialExecutors) to start with.
How many:
Then, based on load (pending tasks), how many executors to request. This would eventually be the number we would otherwise give at spark-submit in a static way. Once the initial executor count is set, the count moves between the min (spark.dynamicAllocation.minExecutors) and max (spark.dynamicAllocation.maxExecutors) numbers.
When to ask or give:
When do we request new executors (spark.dynamicAllocation.schedulerBacklogTimeout)? When there have been pending tasks for this long, new executors are requested. The number of executors requested in each round increases exponentially from the previous round: an application adds 1 executor in the first round, then 2, 4, 8 and so on in subsequent rounds. At some point the max above comes into the picture.
When do we give away an executor (spark.dynamicAllocation.executorIdleTimeout)? When an executor has been idle for this long, it is released.
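For reference, here is a minimal sketch (assuming PySpark; the app name and the numeric values are illustrative, not recommendations) of wiring these dynamic-allocation properties together:

from pyspark.sql import SparkSession

# The dynamic-allocation properties discussed above, with illustrative values.
spark = (SparkSession.builder
         .appName('dynamic-allocation-example')
         .config('spark.dynamicAllocation.enabled', 'true')
         .config('spark.dynamicAllocation.initialExecutors', '2')           # what to start with
         .config('spark.dynamicAllocation.minExecutors', '2')               # lower bound
         .config('spark.dynamicAllocation.maxExecutors', '17')              # upper bound
         .config('spark.dynamicAllocation.schedulerBacklogTimeout', '1s')   # when to ask for more
         .config('spark.dynamicAllocation.executorIdleTimeout', '60s')      # when to give one back
         # On YARN, dynamic allocation typically also requires the external shuffle service:
         .config('spark.shuffle.service.enabled', 'true')
         .getOrCreate())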
Please correct me if I missed anything. The above is my understanding based on the blog I shared in the question and some online resources. Thank you.
References:
http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation
http://spark.apache.org/docs/latest/job-scheduling.html#resource-allocation-policy
Also, it depends on your use case; an important config parameter is:
spark.memory.fraction (the fraction of (heap space - 300 MB) used for execution and storage), from http://spark.apache.org/docs/latest/configuration.html#memory-management.
If you don't use cache/persist, set it to 0.1 so you have all the memory for your program.
If you use cache/persist, you can check the memory taken (the total minus the remaining cache memory per executor, summed in GB) with:
sc.getExecutorMemoryStatus.map(a => (a._2._1 - a._2._2)/(1024.0*1024*1024)).sum
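And a minimal sketch (assuming PySpark; the app name is made up) of lowering spark.memory.fraction when cache/persist is not used, as suggested above:

from pyspark.sql import SparkSession

# 0.1 is the value proposed in the answer above for jobs that never cache/persist,
# not a general recommendation.
spark = (SparkSession.builder
         .appName('memory-fraction-example')
         .config('spark.memory.fraction', '0.1')
         .getOrCreate())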
Do you read data from HDFS or from HTTP?
Again, the tuning depends on your use case.

Resources