Performance issues for Spark on YARN

We are trying to run our Spark cluster on YARN. We are having performance issues, especially when compared to standalone mode.
We have a cluster of 5 nodes, each with 16GB RAM and 8 cores. We have configured the minimum container size as 3GB and the maximum as 14GB in yarn-site.xml. When submitting the job to yarn-cluster we pass number of executors = 10 and executor memory = 14GB. According to my understanding our job should be allocated 4 containers of 14GB each, but the Spark UI shows only 3 containers of 7.2GB each.
We are unable to control the number of containers or the resources allocated to them, and this hurts performance compared to standalone mode.
Can anyone give pointers on how to optimize YARN performance?
This is the command I use for submitting the job:
$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 10 --executor-memory 14g target/scala-2.10/my-application_2.10-1.0.jar
Following the discussion I changed my yarn-site.xml file and also the spark-submit command.
Here is the new yarn-site.xml configuration:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hm41</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>14336</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>2560</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>13312</value>
</property>
And the new spark-submit command is:
$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 4 --executor-memory 10g --executor-cores 6 target/scala-2.10/my-application_2.10-1.0.jar
With this I am able to get 6 cores on each machine, but the memory usage of each node is still only around 5GB. I have attached screenshots of the Spark UI and htop.

The memory (7.2GB) you see in the Spark UI is governed by spark.storage.memoryFraction, which defaults to 0.6. As for your missing executors, you should look in the YARN ResourceManager logs.
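If you want to confirm or adjust that fraction, it can be passed explicitly at submit time. A minimal sketch reusing the submit command from the question (spark.storage.memoryFraction only applies to the legacy, pre-1.6 memory manager; the 0.6 value simply makes the default explicit):
$SPARK_HOME/bin/spark-submit --class "MyApp" --master yarn-cluster --num-executors 4 --executor-memory 10g --conf spark.storage.memoryFraction=0.6 target/scala-2.10/my-application_2.10-1.0.jar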

Within yarn-site.xml, check that yarn.nodemanager.resource.memory-mb is set correctly. In my understanding of your cluster it should be set to 14GB. This setting tells YARN how much memory it may use on that specific node.
If you have this set right and you have 5 servers running the YARN NodeManager, then your job submission command is wrong. First, --num-executors is the number of YARN containers that will be started to execute on the cluster. You ask for 10 containers with 14GB RAM each, but you don't have that many resources in your cluster! Second, you specify --master yarn-cluster, which means the Spark driver runs inside the YARN ApplicationMaster, and that requires a separate container.
In my opinion it shows 3 containers because, out of the 5 nodes in the cluster, only 4 are running the YARN NodeManager, and you request 14GB for each container; YARN first starts the ApplicationMaster, then polls the NodeManagers for available resources and finds it can start only 3 containers. Regarding the heap size you see: after starting Spark, find its JVM containers and look at the parameters they were started with - you will probably find more than one -Xmx flag on a single command line, one correct and one wrong, and you should track down where the wrong one comes from in the config files (Hadoop or Spark).
Before submitting an application to the cluster, start the spark-shell with the same settings (replacing yarn-cluster with yarn-client) and check how it starts up, check the Web UI and the JVMs that were started.
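For example, something along these lines should bring up the same executors interactively so you can inspect them in the Web UI (the flags simply mirror the spark-submit command from the question):
$SPARK_HOME/bin/spark-shell --master yarn-client --num-executors 4 --executor-memory 10g --executor-cores 6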

Just because YARN "thinks" it has 70GB (14GB x 5) doesn't mean that at run time there is 70GB available on the cluster. You could be running other Hadoop components (Hive, HBase, Flume, Solr, your own apps, etc.) which consume memory. So the run-time decision YARN makes is based on what's currently available - and it had only 42GB (3 x 14GB) available for you. By the way, the GB numbers are approximate, because a GB is really computed as 1024MB, so you will see decimals.
Use nmon or top to see what else is using memory on each node.
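You can also ask YARN itself how much each NodeManager has left. A quick sketch with the stock YARN CLI (the node id is a placeholder; copy it from the -list output):
yarn node -list
yarn node -status <node-id>
The -status output shows Memory-Used vs. Memory-Capacity and CPU-Used vs. CPU-Capacity for that node.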

Related

Spark: huge number of threads get created

Spark version 2.1, Hadoop 2.7.3
I have a Spark job with only 1 stage and 100 partitions; my application itself doesn't create any threads. But after I submit it as
spark-submit --class xxx --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 1g --num-executors 7 --executor-cores 1 ./my.jar
I found that on every server it uses about 400 threads. Why are so many threads being used? The cluster has 6 servers, so one of the servers gets 2 executors, and that uses about 800 threads in the Spark process. When I actually run this and give it a lot of cores, I get a "cannot create native thread" error once the system reaches 32,000 threads, which is the limit from the system ulimit setting. Even if I can assign fewer cores and get around this error, using so many threads won't be efficient anyway. Can someone give some hints?
Update:
It's the connection to HBase causing the problem, not Spark itself using those threads.
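If you need to confirm where the threads come from, counting and naming the threads in an executor JVM usually makes the culprit obvious. A rough sketch (the pid is a placeholder for the executor process id on a worker node):
ls /proc/<executor-pid>/task | wc -l
jstack <executor-pid> | grep '^"'
The first command counts the live threads; the second lists their names, so connection-pool threads (for example from the HBase client) stand out.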
Check the scheduler XML configuration in the conf directory.
Check which scheduler is used.
Check the weight configured.
If no pool is set, try setting a pool:
sc.setLocalProperty("spark.scheduler.pool", "test")
and configure the following values for it (e.g. in conf/fairscheduler.xml):
<pool name="test">
  <schedulingMode>FAIR</schedulingMode>
  <weight>1</weight>
  <minShare>2</minShare>
</pool>
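For the pool to take effect, the scheduler mode must be FAIR and Spark must be pointed at the allocation file containing it. A minimal sketch at submit time (the file path is a placeholder for wherever the fairscheduler.xml above lives):
spark-submit --class xxx --master yarn --deploy-mode cluster --conf spark.scheduler.mode=FAIR --conf spark.scheduler.allocation.file=/path/to/fairscheduler.xml ./my.jar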

what is the relationship between spark executor and yarn container when using spark on yarn

what is the relationship between spark executor and yarn container when using spark on yarn?
For example, when I set executor-memory = 20G and the YARN container memory limit is 10G, does 1 executor contain 2 containers?
A Spark executor runs within a YARN container. A YARN container is provided by the ResourceManager on demand, and each executor runs inside one such container, so an executor's memory request has to fit into a single container.
Spark executors are the processes that run the tasks.
A Spark executor is started on a worker node (DataNode).
In your case, when you set executor-memory = 20G, you are asking YARN for a container of roughly 20GB (plus overhead) in which one executor will run; YARN can only grant this if it fits within yarn.scheduler.maximum-allocation-mb. You may run one or more such executors per worker node.
So, for example, if you have a cluster of 8 nodes and run one such executor per node, that is 8 * 20GB of total memory for your job.
Below are the 3 config options available in yarn-site.xml with which you can play around and see the differences.
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.resource.memory-mb
When running Spark on YARN, each Spark executor runs as a YARN container. This means the number of containers will always be the same as the number of executors created by a Spark application, e.g. via the --num-executors parameter in spark-submit.
https://stackoverflow.com/a/38348175/9605741
In YARN mode, each executor runs in one container. The number of executors is the same as the number of containers allocated from YARN (except in cluster mode, which will allocate another container to run the driver).
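As a concrete illustration (class and jar names are placeholders), a submission like
spark-submit --class MyApp --master yarn --deploy-mode cluster --num-executors 4 --executor-memory 4g my.jar
should result in 5 YARN containers: 4 containers of roughly 4GB plus overhead, one per executor, and 1 extra container for the ApplicationMaster that hosts the driver.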

AWS EMR Spark - CloudWatch

I was running an application on AWS EMR Spark. Here is the spark-submit job:
Arguments : spark-submit --deploy-mode cluster --class com.amazon.JavaSparkPi s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar s3://spark-config-test/2017-08-08
So, AWS uses YARN for resource management. I had a couple of doubts around this while observing the CloudWatch metrics:
1) What does "containers allocated" imply here? I am using 1 master and 3 slave/executor nodes (all 4 have 8-core CPUs).
2) I changed my command to:
spark-submit --deploy-mode cluster --executor-cores 4 --class com.amazon.JavaSparkPi s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar s3://spark-config-test/2017-08-08
Here the number of cores running is 3. Should it not be 3(number of executors)*4(number of cores) = 12?
1) "Containers allocated" here basically represents the number of Spark executors. Spark executor cores are more like executor tasks, meaning that you could configure your app to run one executor per physical CPU and still ask it to have 3 executor cores per CPU (think hyper-threading).
What happens by default on EMR, when you don't specify the number of Spark executors, is that dynamic allocation is assumed and Spark will only ask YARN for what it thinks it needs in terms of resources. I tried explicitly setting the number of executors to 10 and the containers allocated went up only to 6 (the max partitions of data). Also, under the "Application history" tab, you can get a detailed view of the YARN/Spark executors.
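If you want a fixed number of executors instead of letting EMR's dynamic allocation decide, you can turn it off at submit time. A sketch based on the command in the question (the executor and core counts are illustrative):
spark-submit --deploy-mode cluster --conf spark.dynamicAllocation.enabled=false --num-executors 10 --executor-cores 4 --class com.amazon.JavaSparkPi s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar s3://spark-config-test/2017-08-08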
2) "cores" here refer to EMR core nodes and are not the same as spark executor cores. Same for "task" that in the monitoring tab refer to EMR task nodes. That is consistent with my setup, as I have 3 EMR slave nodes.

How to configure Yarn to use all vcores?

We are running a Spark Streaming job using YARN as the cluster manager. I have dedicated 7 cores per node via yarn-site.xml, as shown in the pic below.
When the job is running, it only uses 2 vcores; the other 5 vcores sit idle and the job is slow, with a lot of batches queued up.
How can we make it use all 7 vcores that are available to it, so that it speeds up our job? (This is the usage when running.)
Would greatly appreciate it if any of the experts in the community could help out, as we are new to YARN and Spark.
I searched through many answers to this question. Finally, it worked after changing a YARN config file: capacity-scheduler.xml
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
Don't forget to restart YARN.
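A rough sketch of the restart, assuming a plain Apache Hadoop installation (on a managed distribution such as EMR, CDH or HDP, restart the services through its own manager instead):
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh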
At the Spark level you can control the YARN ApplicationMaster's cores with the parameter spark.yarn.am.cores.
For Spark executors you need to pass --executor-cores to spark-submit.
However, from Spark you cannot control what (vcores/memory) YARN chooses to allocate to the containers it spawns, which is as it should be, since you are running Spark on top of YARN.
In order to control that you will need to change YARN vcore parameters like yarn.nodemanager.resource.cpu-vcores and yarn.scheduler.minimum-allocation-vcores. You can find more here: https://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html#configuring_in_cm
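On the Spark side, a sketch of requesting the 7 cores per node mentioned in the question (class and jar names are placeholders; spark.yarn.am.cores applies in client mode, while in cluster mode the equivalent is spark.driver.cores):
spark-submit --master yarn --executor-cores 7 --conf spark.yarn.am.cores=2 --class MyStreamingApp my-streaming.jar
With the DominantResourceCalculator enabled as shown above, YARN will then size containers by vcores as well as memory.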

Apache Spark: setting executor instances does not change the executors

I have an Apache Spark application running on a YARN cluster (Spark has 3 nodes on this cluster) in cluster mode.
When the application is running, the Spark UI shows 2 executors (each running on a different node) and the driver running on the third node.
I want the application to use more executors, so I tried adding the argument --num-executors to spark-submit and set it to 6.
spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar <arg1> <arg2> <arg3> ...
However, the number of executors remains 2.
On the Spark UI I can see that the parameter spark.executor.instances is 6, just as I intended, but somehow there are still only 2 executors.
I even tried setting this parameter from the code
sparkConf.set("spark.executor.instances", "6")
Again, I can see that the parameter was set to 6, but still there are only 2 executors.
Does anyone know why I couldn't increase the number of my executors?
yarn.nodemanager.resource.memory-mb is 12g in yarn-site.xml
Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml
With 12g per node you can only launch the driver (3g) and 2 executors (11g).
Node1 - driver 3g (+7% overhead)
Node2 - executor1 11g (+7% overhead)
Node3 - executor2 11g (+7% overhead)
Now you are requesting a third executor of 11g, and no node has 11g of memory available.
For the 7% overhead, refer to spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html
Note that yarn.nodemanager.resource.memory-mb is total memory that a single NodeManager can allocate across all containers on one node.
In your case, since yarn.nodemanager.resource.memory-mb = 12G, if you add up the memory allocated to all YARN containers on any single node, it cannot exceed 12G.
You have requested 11G (--executor-memory 11G) for each Spark executor container. Though 11G is less than 12G, this still won't work. Why?
Because you have to account for spark.yarn.executor.memoryOverhead, which is max(executorMemory * 0.10, 384 MB) by default, unless you override it.
So, following math must hold true:
spark.executor.memory + spark.yarn.executor.memoryOverhead <= yarn.nodemanager.resource.memory-mb
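Plugging in the numbers from this question with the default overhead formula: 11264 MB + max(11264 * 0.10, 384) MB = 11264 + 1126 = 12390 MB, which exceeds the 12288 MB (12G) that yarn.nodemanager.resource.memory-mb allows on a node, so an 11G executor container cannot be placed.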
See: https://spark.apache.org/docs/latest/running-on-yarn.html for latest documentation on spark.yarn.executor.memoryOverhead
Moreover, spark.executor.instances is merely a request. The Spark ApplicationMaster for your application asks the YARN ResourceManager for that number of containers (= spark.executor.instances). The request will be granted by the ResourceManager on a NodeManager node based on:
Resource availability on the node. YARN scheduling has its own nuances - this is a good primer on how YARN FairScheduler works.
Whether yarn.nodemanager.resource.memory-mb threshold has not been exceeded on the node:
(number of spark containers running on the node * (spark.executor.memory + spark.yarn.executor.memoryOverhead)) <= yarn.nodemanager.resource.memory-mb
If the request is not granted, it will be queued and granted when the above conditions are met.
To utilize the Spark cluster to its full capacity, you need to set values for --num-executors, --executor-cores and --executor-memory according to your cluster:
--num-executors command-line flag or the spark.executor.instances configuration property controls the number of executors requested;
--executor-cores command-line flag or the spark.executor.cores configuration property controls the number of concurrent tasks an executor can run;
--executor-memory command-line flag or the spark.executor.memory configuration property controls the heap size.
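A sketch of a submission sized for the 12G NodeManager limit from this question: with roughly 10% memory overhead, a 10G executor needs about 11G per container, which fits under 12G (the core count is illustrative; class and jar names are taken from the question):
spark-submit --driver-memory 3G --num-executors 2 --executor-cores 4 --executor-memory 10G --master yarn-cluster --class main.Application myJar.jar <arg1> <arg2> <arg3>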
You only have 3 nodes in the cluster, and one will be used by the driver, so you have only 2 nodes left - how can you create 6 executors?
I think you confused --num-executors with --executor-cores.
To increase concurrency you need more cores; you want to utilize all the CPUs in your cluster.
