Over-utilization of YARN resources with Spark - apache-spark

I have an EMR cluster with the configuration below.
Data Nodes: 6
RAM per Node: 56 GB
Cores per Node: 32
Instance Type: m4.4xlarge
I am running the spark-sql commands below to execute 5 Hive scripts in parallel.
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive1.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive2.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive3.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive4.hql &
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G -f hive5.hql
But 270 GB of memory is being utilized by YARN.
As per the parameters in the command above, the five jobs together should utilize only 120 GB of RAM:
1 executor * 20 GB + 4 GB driver = 24 GB per job
5 jobs = 5 * 24 = 120 GB
So why is YARN utilizing 270 GB of RAM? (No other Hadoop jobs are running in the cluster.)
Do I need to include any extra parameters to limit YARN resource utilization?

Set "spark.dynamicAllocation.enabled" to false in spark-defaults.conf (../../spark/spark-x.x.x/conf/spark-defaults.conf).
This should help you limit/avoid dynamic allocation of resources.
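For reference, a minimal sketch of what that entry in spark-defaults.conf could look like (the path above is only a placeholder for your actual Spark installation):

# launch only the executors explicitly requested on the command line
spark.dynamicAllocation.enabled    false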

Even though we set executor memory in the command, Spark allocates additional executors (and hence memory) dynamically if resources are available in the cluster. To restrict memory usage to only the requested executor memory, the Spark dynamic allocation parameter should be set to false.
You can change it directly in the Spark config file or pass it as a --conf parameter on the command line.
spark-sql --master yarn --num-executors 1 --executor-memory 20G --executor-cores 20 --driver-memory 4G --conf spark.dynamicAllocation.enabled=false -f hive1.hql
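Note that each YARN container is also somewhat larger than the requested --executor-memory because of the memory overhead YARN adds on top (by default roughly 10% of the executor memory, at least 384 MB), so even with dynamic allocation off the five jobs will use a bit more than 120 GB. If you would rather keep dynamic allocation on, a hedged alternative is to cap how far it can grow instead of disabling it, for example:

spark-sql --master yarn --executor-memory 20G --executor-cores 20 --driver-memory 4G --conf spark.dynamicAllocation.maxExecutors=1 -f hive1.hql

(Whether an explicit --num-executors coexists with dynamic allocation depends on the Spark version, so it is dropped in this sketch.)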

Related

Spark Submit Command Configurations

Can someone please help me understand the spark-submit configuration below in detail? What is the use of each configuration, and how exactly does Spark use these configurations for better performance? (An annotated example follows the list.)
--num-executors 3
--master yarn
--deploy-mode cluster
--driver-cores 3
--driver-memory 1G
--executor-cores 3
--executor-memory 1G
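As a hedged illustration of how these flags fit together (the main class and JAR below are placeholders, not from the question):

# --master yarn          : submit the application to a YARN ResourceManager
# --deploy-mode cluster  : run the driver inside a YARN container instead of on the client machine
# --driver-cores 3       : CPU cores given to the driver (honoured in cluster deploy mode)
# --driver-memory 1G     : heap size of the driver process
# --num-executors 3      : number of executor containers requested from YARN
# --executor-cores 3     : number of tasks each executor can run concurrently
# --executor-memory 1G   : heap size of each executor
spark-submit --master yarn --deploy-mode cluster \
    --driver-cores 3 --driver-memory 1G \
    --num-executors 3 --executor-cores 3 --executor-memory 1G \
    --class com.example.MyApp my-app.jar

With these numbers the job asks YARN for roughly 3 * 1 GB of executor memory plus 1 GB for the driver (each container slightly larger due to overhead), and can run 3 * 3 = 9 tasks in parallel.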

Spark fail if not all resources are allocated

Does Spark or YARN have any flag to fail the job fast if we can't allocate all resources?
For example, if I run
spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --num-executors 7 \
    --driver-memory 512m \
    --executor-memory 4g \
    --executor-cores 1 \
    /usr/hdp/current/spark2-client/examples/jars/spark-examples_*.jar 1000
For now, if Spark can allocate only 5 executors, it just goes ahead with 5. Can we make it run only with 7, or fail otherwise?
You can set the spark.dynamicAllocation.minExecutors config in your job. For that you need to set spark.dynamicAllocation.enabled=true, as detailed in this doc.
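A hedged sketch of what that could look like for the command above (dynamic allocation on YARN normally also needs the external shuffle service, and --num-executors is dropped here since dynamic allocation sizes the executor count itself):

spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-client \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=7 \
    --driver-memory 512m --executor-memory 4g --executor-cores 1 \
    /usr/hdp/current/spark2-client/examples/jars/spark-examples_*.jar 1000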

spark scala memory management issues

I am trying to submit a Spark Scala job with the configuration below:
spark-submit --class abcd --queue new --master yarn --executor-cores 1 --executor-memory 4g --driver-memory 2g --num-executors 1
The allocated space for the queue is 700 GB, and the job is taking the entire 700 GB while running.
Is there a way to restrict it to 100 GB only?
Thanks in advance.
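Assuming the excess usage comes from the same dynamic allocation behaviour discussed in the first question above (an assumption, not something stated in this question), a hedged sketch that keeps the job near 100 GB is to cap the executor count at roughly 25 executors * 4 GB ≈ 100 GB, plus driver memory and per-container overhead (the JAR name is a placeholder):

spark-submit --class abcd --queue new --master yarn \
    --executor-cores 1 --executor-memory 4g --driver-memory 2g \
    --conf spark.dynamicAllocation.maxExecutors=25 \
    your-app.jar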

Spark executor GC taking long

I am running a Spark job on a standalone cluster, and I noticed that after some time GC starts taking long and the scary red color begins to show.
Here are the resources available:
Cores in use: 80 Total, 76 Used
Memory in use: 312.8 GB Total, 292.0 GB Used
Job details:
spark-submit --class com.mavencode.spark.MonthlyReports \
    --master spark://192.168.12.14:7077 \
    --deploy-mode cluster --supervise \
    --executor-memory 16G --executor-cores 4 \
    --num-executors 18 --driver-cores 8 \
    --driver-memory 20G montly-reports-assembly-1.0.jar
How do I fix the GC taking so long?
I had the same problem and could resolve it by using Parallel GC instead of G1GC. You may add the following options to the executors' additional Java options in the submit request:
-XX:+UseParallelGC -XX:+UseParallelOldGC
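For example, a hedged sketch of passing those options through spark.executor.extraJavaOptions in the same submit command as above:

spark-submit --class com.mavencode.spark.MonthlyReports \
    --master spark://192.168.12.14:7077 \
    --deploy-mode cluster --supervise \
    --executor-memory 16G --executor-cores 4 \
    --num-executors 18 --driver-cores 8 --driver-memory 20G \
    --conf "spark.executor.extraJavaOptions=-XX:+UseParallelGC -XX:+UseParallelOldGC" \
    montly-reports-assembly-1.0.jar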

Show number of executors and executor memory

I am running a PySpark job using the command
spark-submit ./exp-1.py --num-executors 8 --executor-memory 4G
Is there a way to confirm that these configurations are reflected during execution?
There is a --verbose flag for checking the configuration when a Spark job runs. Note that spark-submit options must come before the application script; anything placed after ./exp-1.py is passed to the script as application arguments instead:
spark-submit --verbose --num-executors 8 --executor-memory 4G ./exp-1.py
