How to configure Yarn to use all vcores? - apache-spark

We are running a Spark Streaming job with YARN as the cluster manager. I have dedicated 7 cores per node via yarn-site.xml, as shown in the picture below.
While the job is running it uses only 2 vcores; the other 5 sit idle, and the job is slow, with a lot of batches queued up.
How can we make it use all 7 vcores available to it, so that the job speeds up? The picture shows the usage while the job is running.
We would greatly appreciate help from any of the experts in the community, as we are new to YARN and Spark.

I searched through many answers to this question. It finally worked after changing one YARN config file, capacity-scheduler.xml:
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
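DominantResourceCalculator makes the scheduler account for vcores as well as memory; the default, DefaultResourceCalculator, considers memory only. Don't forget to restart YARN afterwards.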

At the Spark level you can control the cores of the YARN application master with the spark.yarn.am.cores parameter.
For Spark executors, pass --executor-cores to spark-submit.
However, from Spark you cannot control what (vcores/memory) YARN chooses to allocate to the containers it spawns, which is as it should be, since you are running Spark on top of YARN.
To control that, you need to change YARN vcore parameters such as yarn.nodemanager.resource.cpu-vcores and yarn.scheduler.minimum-allocation-vcores. You can find more here: https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html#configuring_in_cm
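As a rough illustration of how those YARN settings interact with --executor-cores, the sketch below estimates how many executor containers fit on one node. It assumes the scheduler accounts for both memory and vcores (DominantResourceCalculator); the function name and the memory figures are hypothetical, and only the 7 vcores per node come from the question.

```python
def executors_per_node(node_vcores, node_memory_mb, executor_cores, container_memory_mb):
    """Estimate how many executor containers fit on one NodeManager.

    A container needs both its vcores and its memory, so the scarcer
    of the two resources sets the limit.
    """
    if executor_cores <= 0 or container_memory_mb <= 0:
        raise ValueError("executor_cores and container_memory_mb must be positive")
    by_cpu = node_vcores // executor_cores
    by_memory = node_memory_mb // container_memory_mb
    return min(by_cpu, by_memory)


# 7 vcores per node as in the question; 16384 MB per node is an assumed value.
print(executors_per_node(7, 16384, executor_cores=3, container_memory_mb=4096))
print(executors_per_node(7, 16384, executor_cores=7, container_memory_mb=8192))
```

With 3 cores per executor only 6 of the 7 vcores can be used on a node, so the choice of --executor-cores matters as much as the YARN limits.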

Related

Why spark num-executors is not equal to yarn containers?

According to "Spark on YARN resource manager: Relation between YARN Containers and Spark Executors", the number of YARN containers should be equal to num-executors for a Spark application. However, in one run the num-executors shown in the Spark UI Environment tab was 60, while the number of containers shown in YARN was only 37. I was using Spark 2.2 with spark.dynamicAllocation.enabled set to false, on an Azure HDInsight cluster. Can anyone explain this?
The Spark UI also shows terminated executors.
They may have been removed by Spark dynamic allocation
or through YARN preemption.
You can normally tell whether an executor is still alive or not.
Another reason for the two numbers to differ is the Spark driver.
In ‘yarn-cluster’ mode the driver occupies a YARN container too,
so you will see a difference of one extra container in that case.
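Applied to the numbers in the question, the bookkeeping can be sketched as follows. The helper name is hypothetical, and the sketch assumes exactly one container is taken by the application master.

```python
def live_executors(yarn_containers, am_containers=1):
    """Executors actually running, given the YARN container count.

    One container (by assumption) hosts the application master,
    which also hosts the driver in yarn-cluster mode.
    """
    return yarn_containers - am_containers


requested = 60        # num-executors reported in the Spark UI
containers = 37       # containers reported by YARN
running = live_executors(containers)
print(running, requested - running)
```

The extra container therefore explains a difference of one, not the whole gap; the remaining executors were either never granted or have since been removed.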

Specify spark driver for spark-submit

I'm submitting a Spark job from a shell script that has a bunch of env vars and parameters to pass to Spark. Strangely, the driver host is not one of these parameters (there are driver cores and memory, however). So if I have 3 machines in the cluster, a driver will be chosen randomly. I don't want this behaviour since 1) the jar I'm submitting is only on one of the machines and 2) the driver machine should often be smaller than the other machines, which is not the case if the choice is random.
So far, I have found no way to specify this param on the command line to spark-submit. I've tried --conf SPARK_DRIVER_HOST="172.30.1.123", --conf spark.driver.host="172.30.1.123" and many other things, but nothing has any effect. I'm using Spark 2.1.0. Thanks.
I assume you are running on a YARN cluster. In brief, YARN uses containers to launch and run tasks, and the ResourceManager decides where to run each container based on the availability of resources. In Spark's case the driver and the executors are also launched as containers, each with a separate JVM. The driver is dedicated to splitting tasks among the executors and collecting the results from them. If the node from which you launch your application is part of the cluster, it will also be used as a shared resource for launching the driver or executors.
From the documentation: http://spark.apache.org/docs/latest/running-on-yarn.html
When running on a standalone or Mesos cluster, the master URL is given with:
--master <master-url> #e.g. spark://23.195.26.187:7077
When using YARN it works a little differently. Here the value is simply yarn:
--master yarn
With yarn, the ResourceManager address is taken from the Hadoop configuration. For how to set this up, see this interesting guide: https://dqydj.com/raspberry-pi-hadoop-cluster-apache-spark-yarn/ . Basically, that means hdfs-site.xml for HDFS and yarn-site.xml for YARN.

Number of Executors is less than what is assigned for a Spark job

I have two Hadoop clusters with 15 (big) and 3 (small) nodes respectively. Both are managed by Cloudera Manager. I am running a Spark job on YARN with --num-executors set to 6. The Spark UI of the big cluster shows the 6 executors, but the Spark UI of the small cluster shows only 3 executors. What are the probable reasons for this, and how can I overcome the issue?
Thanks in advance.

Spark with Yarn: Point in providing spark-resource related parameters?

I am reading through the literature about Spark and resource management, i.e. YARN in my case.
I think I have understood the basic concept and how YARN encapsulates the Spark master/workers in containers.
Is there any point in still providing resource parameters such as --driver-memory, --executor-memory or --num-executors? Shouldn't the YARN application master (Spark master) figure out the demand and request new resources accordingly?
Or is it wise to interfere in the resource negotiation process by providing these parameters?
Spark needs to negotiate the resources from YARN. Providing the resource-parameters tells Spark how many resources to request from YARN.
For executors on YARN:
Spark applications use a fixed number of executors (default = 2) unless dynamic allocation is enabled.
The --num-executors flag for spark-submit, spark-shell, etc. sets the number of executors as expected.
For memory and core management on YARN:
Set the memory used by each executor using --executor-memory.
Setting --executor-cores tells Spark how many cores to claim from YARN.
Set the amount of memory for the driver process with --driver-memory.
Some general Spark-on-YARN notes:
Use the --queue option if your YARN cluster schedules applications into queues.
Spark is optimized for in-memory computation, so ask YARN for a smaller number of memory-heavy executors (with multiple cores and more memory). Be careful if you have set memory caps within YARN.
The Spark on YARN Documentation has more details.
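To make the link between --executor-memory and YARN's memory caps concrete, here is a minimal sketch of how large a container an executor ends up requesting. It rests on assumptions that are not in the answer above: an overhead of max(384 MB, 10% of executor memory), which is the default in recent Spark versions, and a scheduler that rounds requests up to a multiple of yarn.scheduler.minimum-allocation-mb. The function name is hypothetical.

```python
import math


def yarn_container_mb(executor_memory_mb, min_allocation_mb=1024,
                      overhead_fraction=0.10, overhead_floor_mb=384):
    """Approximate container size YARN grants for one Spark executor.

    Assumed rules: Spark adds an off-heap overhead on top of the executor
    memory, and YARN rounds the request up to a multiple of its minimum
    allocation.
    """
    overhead = max(overhead_floor_mb, int(executor_memory_mb * overhead_fraction))
    requested = executor_memory_mb + overhead
    return math.ceil(requested / min_allocation_mb) * min_allocation_mb


print(yarn_container_mb(4096))   # a 4g executor
print(yarn_container_mb(2048))   # a 2g executor
```

Under these assumptions a 4g executor occupies noticeably more than 4 GB of a node's YARN memory, which is why a request that looks as if it fits can still be refused or reduce the number of executors per node.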

Performance issues for spark on YARN

We are trying to run our Spark cluster on YARN, and we are having some performance issues, especially when compared to standalone mode.
We have a cluster of 5 nodes, each with 16GB RAM and 8 cores. We have configured the minimum container size as 3GB and the maximum as 14GB in yarn-site.xml. When submitting the job to yarn-cluster we request 10 executors with 14GB of memory each. As I understand it, our job should be allocated 4 containers of 14GB, but the Spark UI shows only 3 containers of 7.2GB each.
We are unable to control the number of containers and the resources allocated to them. This hurts performance compared to standalone mode.
Can you give any pointers on how to optimize YARN performance?
This is the command I use for submitting the job:
$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 10 --executor-memory 14g target/scala-2.10/my-application_2.10-1.0.jar
Following the discussion, I changed my yarn-site.xml file and also the spark-submit command.
Here is the new yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hm41</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>14336</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2560</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>13312</value>
</property>
And the new spark-submit command is:
$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 4 --executor-memory 10g --executor-cores 6 target/scala-2.10/my-application_2.10-1.0.jar
With this I am able to get 6 cores on each machine, but the memory usage of each node is still around 5GB. I have attached screenshots of the Spark UI and htop.
The memory (7.2GB) you see in the Spark UI is the storage memory, which is the fraction of the executor heap set by spark.storage.memoryFraction (0.6 by default). As for your missing executors, you should look in the YARN ResourceManager logs.
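As a rough check of that figure, the sketch below multiplies the executor heap by the storage fraction. It assumes the legacy memory manager of older Spark releases, where spark.storage.safetyFraction (default 0.9) is applied on top of spark.storage.memoryFraction; the function name is hypothetical.

```python
def legacy_storage_memory_gb(executor_heap_gb, memory_fraction=0.6, safety_fraction=0.9):
    """Storage memory under the assumed legacy model: heap * fraction * safety."""
    return executor_heap_gb * memory_fraction * safety_fraction


print(round(legacy_storage_memory_gb(14), 2))
```

The value shown in the UI is somewhat lower than this estimate because the heap size the JVM reports as usable is smaller than the configured -Xmx.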
Within yarn-site.xml, check that yarn.nodemanager.resource.memory-mb is set correctly. From my understanding of your cluster it should be set to 14GB. This setting tells YARN how much memory it can use on that specific node.
If you have this set right and you have 5 servers running a YARN NodeManager, then your job submission command is wrong. First, --num-executors is the number of YARN containers that will be started for execution on the cluster. You specify 10 containers with 14GB RAM each, but you don't have that many resources on your cluster! Second, you specify --master yarn-cluster, which means that the Spark driver runs inside the YARN Application Master, which requires a separate container.
In my opinion it shows 3 containers because, of the 5 nodes in the cluster, only 4 are running a YARN NodeManager, and you request 14GB for each container. YARN first starts the Application Master, then polls the NodeManagers for available resources and sees that it can start only 3 containers. Regarding the heap size you see: after starting Spark, find its JVM containers and look at the parameters they were started with. You should see several -Xmx flags on a single line, one correct and one wrong; trace the wrong one back to its origin in the config files (Hadoop or Spark).
Before submitting an application to the cluster, start spark-shell with the same settings (replace yarn-cluster with yarn-client) and check how it is started: look at the web UI and the JVMs that were launched.
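The container-counting argument in this answer can be sketched in a few lines. This is a simplification under stated assumptions: the Application Master is placed first on one node, memory overhead is ignored, and the 1024 MB Application Master size is a made-up figure; the function name is hypothetical.

```python
def executors_that_fit(node_memory_mb, num_nodes, executor_container_mb, am_container_mb):
    """Count executor containers that fit once the AM has taken its share.

    Assumes every node offers the same memory to YARN and the
    Application Master is started first on one of them.
    """
    free = [node_memory_mb] * num_nodes
    free[0] -= am_container_mb          # AM container lands on the first node
    return sum(mb // executor_container_mb for mb in free)


# 14336 MB per NodeManager, 14g executors, assumed 1024 MB AM container.
print(executors_that_fit(14336, 4, 14336, 1024))   # 4 NodeManagers running
print(executors_that_fit(14336, 5, 14336, 1024))   # all 5 NodeManagers running
```

In this model the node that hosts the Application Master can no longer take a 14GB executor, so 4 NodeManagers yield 3 executors, which matches what the question reports.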
Just because YARN "thinks" it has 70GB (14GB x 5) doesn't mean that 70GB is actually available on the cluster at run time. You could be running other Hadoop components (Hive, HBase, Flume, Solr, your own app, etc.) that consume memory. So the run-time decision YARN makes is based on what is currently available, and it had only 42GB (3 x 14GB) available to you. By the way, the GB numbers are approximate because they are really computed at 1024MB per GB, so you will see decimals.
Use nmon or top to see what else is using memory on each node.
