Dynamic allocation with Spark Streaming on YARN not scaling down executors

I'm using Spark Streaming (Spark version 2.2) on a YARN cluster and am trying to enable dynamic allocation of executors for my application.
The number of executors scales up as required, but once executors are assigned they are not scaled down even when the traffic drops, i.e., an executor, once allocated, is not released. I have also enabled the external shuffle service on YARN, as mentioned here:
https://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service
The configs that I have set in the spark-submit command are:
--conf spark.dynamicAllocation.enabled=false \
--conf spark.streaming.dynamicAllocation.enabled=true \
--conf spark.streaming.dynamicAllocation.scalingInterval=30 \
--conf spark.shuffle.service.enabled=true \
--conf spark.streaming.dynamicAllocation.initialExecutors=15 \
--conf spark.streaming.dynamicAllocation.minExecutors=10 \
--conf spark.streaming.dynamicAllocation.maxExecutors=30 \
--conf spark.streaming.dynamicAllocation.executorIdleTimeout=60s \
--conf spark.streaming.dynamicAllocation.cachedExecutorIdleTimeout=60s \
Can someone help me figure out if there is any particular config that I'm missing?
Thanks

The document added as part of this JIRA helped me out: https://issues.apache.org/jira/browse/SPARK-12133.
The key point to note was that the number of executors is scaled down when the ratio (batch processing time / batch duration) is less than 0.5 (the default value), which essentially means that the executors are idle for half of the time. The config that can be used to alter this default value is spark.streaming.dynamicAllocation.scalingDownRatio.
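For reference, a minimal sketch of how that ratio could be tuned on top of the flags from the question (the 0.3 value here is only an illustrative choice, not a recommendation):
--conf spark.streaming.dynamicAllocation.enabled=true \
--conf spark.streaming.dynamicAllocation.scalingDownRatio=0.3 \
A lower ratio makes scale-down more conservative: executors are released only when batches finish in an even smaller fraction of the batch interval.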

Related

Spark-submit with properties file for python

I am trying to load the Spark configuration through a properties file, using the following command on Spark 2.4.0:
spark-submit --properties-file props.conf sample.py
It gives the following error:
org.apache.spark.SparkException: Dynamic allocation of executors requires the external shuffle service. You may enable this through spark.shuffle.service.enabled.
The props.conf file contains this:
spark.master yarn
spark.submit.deployMode client
spark.authenticate true
spark.sql.crossJoin.enabled true
spark.dynamicAllocation.enabled true
spark.driver.memory 4g
spark.driver.memoryOverhead 2048
spark.executor.memory 2g
spark.executor.memoryOverhead 2048
Now, when I try to run the same job by adding all the arguments to the command itself, it works fine:
spark2-submit \
--conf spark.master=yarn \
--conf spark.submit.deployMode=client \
--conf spark.authenticate=true \
--conf spark.sql.crossJoin.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.driver.memory=4g \
--conf spark.driver.memoryOverhead=2048 \
--conf spark.executor.memory=2g \
--conf spark.executor.memoryOverhead=2048 \
sample.py
This works as expected.
I don't think Spark supports --properties-file; one workaround is to make the changes in $SPARK_HOME/conf/spark-defaults.conf, which Spark will auto-load.
You can refer https://spark.apache.org/docs/latest/submitting-applications.html#loading-configuration-from-a-file
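For illustration, a minimal spark-defaults.conf along those lines could carry the same keys as props.conf (only a subset shown here); per the error message above, either adding spark.shuffle.service.enabled true or dropping the dynamic-allocation line would likely be needed as well:
spark.master yarn
spark.submit.deployMode client
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.driver.memory 4g
spark.executor.memory 2g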

Spark executor configuration priority

I saw a spark-submit command with the following parameters:
spark-submit --class ${my_class} \
--master yarn \
--deploy-mode cluster \
--executor-cores 2 \ <--- executor cores
--driver-cores 2 \ <--- driver cores
--num-executors 12 \ <--- number of executors
--files hdfs:///blah.xml \
--conf spark.executor.instances=15 \ <--- number of executors again?
--conf spark.executor.cores=4 \ <--- executor cores again?
--conf spark.driver.cores=4 \ <--- driver cores again?
It seems like there are multiple ways to set the number of cores and instances for the executor and driver nodes. Just wondering: in the above setting, which way takes priority and overwrites the other one, the -- parameters or the --conf parameters? Eventually, how many cores and instances are given to the Spark job?
Configurations are picked up in order of precedence.
Priority-wise, config defined in the application through set() gets the highest priority.
Second priority is given to spark-submit parameters, and then the next priority is given to default config parameters.
--executor-cores 2 \ <--- executor cores
--driver-cores 2 \ <--- driver cores
--num-executors 12 \ <--- number of executors
The above flags will take priority over the --conf parameters, as these dedicated options are applied on top of (and override) the conf-based settings.
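A small sketch of how that precedence plays out for the command in the question (per the answer above; the jar name here is a placeholder):
spark-submit --class ${my_class} \
--master yarn \
--deploy-mode cluster \
--executor-cores 2 \
--num-executors 12 \
--conf spark.executor.cores=4 \
--conf spark.executor.instances=15 \
my-app.jar
With the dedicated flags winning, the job would run with 12 executors of 2 cores each; a value applied through set() inside the application itself would override both.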

Difference between requested and allocated memory - Spark on Kubernetes

I am running a Spark job on a Kubernetes cluster using the spark-submit command as below:
bin/spark-submit \
--master k8s://https://api-server-host:443 \
--deploy-mode cluster \
--name spark-job-name \
--conf spark.kubernetes.namespace=spark \
--conf spark.kubernetes.container.image=docker-repo/pyspark:55 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-submit \
--conf spark.kubernetes.pyspark.pythonVersion=3 \
--conf spark.executor.memory=4G \
--files local:///mnt/conf.json \
local:///mnt/ingest.py
When I check the request and limit for the executor pod, it shows the values below. Almost 1700 MB extra got allocated for the pod.
Limits:
  memory: 5734Mi
Requests:
  cpu: 4
  memory: 5734Mi
Why is that?
In addition to CptDolphin's answer, be aware that Spark always allocates spark.executor.memoryOverhead extra memory (the max of 10% of spark.executor.memory or 384MB, unless explicitly configured), and may allocate additional spark.executor.pyspark.memory if you defined that in your configuration.
What you define the pod (as an individual system) can use is one thing; what you define Spark, Java, or any other app running inside that system (pod) can use is another thing. Think of it as a normal computer with limits, and then your application with its own limits.
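One plausible reconciliation of the exact 5734Mi figure, assuming spark.kubernetes.memoryOverheadFactor falls back to its non-JVM default of 0.4 for this PySpark job (an assumption; the 10%/384MB rule above gives a smaller number):
4G executor memory = 4096 MiB
overhead at factor 0.4: 4096 * 0.4 ≈ 1638 MiB
pod request/limit: 4096 + 1638 = 5734 MiB, matching the 5734Mi shown on the pod.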

Java heap space OutOfMemoryError while running join query in Spark SQL shell

Here is my cluster configuration:
Master nodes: 1 (16 vCPU, 64 GB memory)
Worker nodes: 2 (total of 64 vCPU, 256 GB memory)
Here is the Hive query I'm trying to run on the Spark SQL shell:
select a.*,b.name as name from (
small_tbl b
join
(select *
from large_tbl where date = '2019-01-01') a
on a.id = b.id);
Here is the query execution plan as shown on the Spark UI:
The configuration properties set while launching the shell are as follows:
spark-sql --conf spark.driver.maxResultSize=30g \
--conf spark.broadcast.compress=true \
--conf spark.rdd.compress=true \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=304857600 \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.instances=12 \
--conf spark.executor.memory=16g \
--conf spark.executor.cores=5 \
--conf spark.driver.memory=32g \
--conf spark.yarn.executor.memoryOverhead=512 \
--conf spark.executor.extraJavaOptions=-Xms20g \
--conf spark.executor.heartbeatInterval=30s \
--conf spark.shuffle.io.preferDirectBufs=true \
--conf spark.memory.fraction=0.5
I have tried most of the solutions suggested here and here, which is evident in the properties set above. As far as I know, it's not a good idea to increase the maxResultSize property on the driver side, since datasets may grow beyond the driver's memory size and the driver shouldn't be used to store data at this scale.
I have executed the query successfully on the Tez engine, where it took around 4 minutes, whereas Spark takes more than 15 minutes to execute and terminates abruptly with a Java heap space error.
I strongly believe there must be a way to speed up the query execution on Spark. Please suggest a solution that works for this kind of query.

Spark: Entire dataset concentrated in one executor

I am running a Spark job with 3 files, each 100 MB in size, and for some reason my Spark UI shows the entire dataset concentrated in 2 executors. This is making the job run for 19 hours, and it is still running.
Below is my Spark configuration. Spark 2.3 is the version used.
spark2-submit --class org.mySparkDriver \
--master yarn-cluster \
--deploy-mode cluster \
--driver-memory 8g \
--num-executors 100 \
--conf spark.default.parallelism=40 \
--conf spark.yarn.executor.memoryOverhead=6000mb \
--conf spark.dynamicAllocation.executorIdleTimeout=6000s \
--conf spark.executor.cores=3 \
--conf spark.executor.memory=8G \
I tried repartitioning inside the code, which works, as this makes the file go into 20 partitions (I used rdd.repartition(20)). But why should I repartition? I believe specifying spark.default.parallelism=40 in the script should let Spark divide the input file across 40 executors and process the file on 40 executors.
Can anyone help?
Thanks,
Neethu
I am assuming you're running your jobs on YARN. If yes, you can check the following properties.
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-vcores
yarn.nodemanager.resource.cpu-vcores
In YARN, these properties affect the number of containers that can be instantiated on a NodeManager, based on the spark.executor.cores and spark.executor.memory property values (along with the executor memory overhead).
For example, for a cluster with 10 nodes (RAM: 16 GB, cores: 6) set with the following YARN properties,
yarn.scheduler.maximum-allocation-mb=10GB
yarn.nodemanager.resource.memory-mb=10GB
yarn.scheduler.maximum-allocation-vcores=4
yarn.nodemanager.resource.cpu-vcores=4
Then with the Spark properties spark.executor.cores=2 and spark.executor.memory=4GB, you can expect 2 executors/node, so in total you'll get 19 executors + 1 container for the driver (rough math sketched below).
If the Spark properties are spark.executor.cores=3 and spark.executor.memory=8GB, then you will get 9 executors (only 1 executor/node) + 1 container for the driver.
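A rough per-node calculation for the first case, assuming the default executor memory overhead of max(384 MB, 10% of spark.executor.memory):
executor container size ≈ 4 GB + 410 MB ≈ 4.4 GB
per-node memory limit: floor(10 GB / 4.4 GB) = 2 containers
per-node vcore limit: floor(4 vcores / 2 cores per executor) = 2 containers
10 nodes * 2 containers = 20; one container goes to the driver, leaving 19 executors.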
You can refer to the link for more details.
Hope this helps.
