YARN shows more resources than the cluster has - apache-spark

I started an EMR cluster with 3 m3.xlarge instances (1 master & 2 slaves) and I have some trouble.
According to the AWS documentation, an m3.xlarge instance has 4 vCPUs ( https://aws.amazon.com/ec2/instance-types/ ). What does that mean? Is it 4 threads, or 4 cores with 2 threads each? I ask because when I open the Hadoop UI (port 8088), there appear to be 8 available vcores per instance, but from what I have experienced, the cluster behaves like 2 instances with 4 vcores each. Am I wrong? Or is it a bug in Amazon or YARN?

The value of 8 vcores comes from the default YARN property:
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
  <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of physical cores used by YARN containers.</description>
</property>
Although it is set to a higher value than the actual number of vcores in the instance, containers will still be created based on the vcores that are actually available on each NodeManager instance.
Modify the value of this property in yarn-site.xml to match the instance's vcores.
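For example, on an m3.xlarge node (4 vCPUs) the override in yarn-site.xml could look like the sketch below; the value of 4 is an assumption that you want YARN to advertise exactly the instance's vCPU count rather than the default of 8.
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
  <description>Advertise one vcore per physical vCPU of the m3.xlarge instance.</description>
</property>
Restart the NodeManager for the change to take effect.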

Related

Spark: I see more executors than available cores in the cluster

I'm working with Spark and YARN on an Azure HDInsight cluster, and I have some trouble understanding the relationship between the workers' resources, executors and containers.
My cluster has 10 D13 v2 workers (8 cores and 56 GB of memory each), so I should have 80 cores available for Spark applications. However, when I try to start an application with the following parameters
"executorMemory": "6G",
"executorCores": 5,
"numExecutors": 20,
I see 100 cores available in the YARN UI (that is, 20 more than I should have). I ran a heavy query, and on the executors page of the YARN UI I see all 20 executors working, with 4 or 5 active tasks in parallel each. I also tried pushing numExecutors to 25, and I do see all 25 working, again with several tasks in parallel for each executor.
It was my understanding that 1 executor core = 1 cluster core, but this is not consistent with what I observe. The official Microsoft documentation (for instance, here) is not really helpful. It states:
An Executor runs on the worker node and is responsible for the tasks
for the application. The number of worker nodes and worker node size
determines the number of executors, and executor sizes.
but it does not say what the relationship is. I suspect YARN is bound only by memory limits (i.e., I can run as many executors as I want as long as I have enough memory), but I don't understand how this works in relation to the CPUs available in the cluster.
Do you know what I am missing?
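If the suspicion about memory-only limits is right, the usual knob is YARN's resource calculator: with the CapacityScheduler, the default DefaultResourceCalculator schedules on memory alone, so requested vcores are recorded but not enforced. A sketch of the capacity-scheduler.xml change that makes CPU count as well, assuming the cluster uses the CapacityScheduler:
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  <description>Take both memory and vcores into account when allocating containers, instead of memory only.</description>
</property>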

Should the number of executor cores for Apache Spark be set to 1 in YARN mode?

My question: Is it true that when running Apache Spark applications on YARN, with deploy-mode set to either client or cluster, executor-cores should always be set to 1?
I am running an application that processes millions of records on a cluster with 200 data nodes, each having 14 cores. It runs perfectly when I use 2 executor-cores and 150 executors on YARN, but one of the cluster admins is asking me to use 1 executor-core. He is adamant that Spark on YARN should be used with 1 executor core, because otherwise it will steal resources from other users. He points me to this page in the Apache docs, where it says the default value for executor-cores is 1 for YARN.
https://spark.apache.org/docs/latest/configuration.html
So, is it true that we should use only 1 for executor-cores?
If the executors use 1 core, aren't they single-threaded?
Kind regards,
When we run a Spark application using a cluster manager like YARN, several daemons run in the background, such as the NameNode, Secondary NameNode, DataNode, JobTracker and TaskTracker. So, while specifying num-executors, we need to make sure we leave aside enough cores (~1 core per node) for these daemons to run smoothly.
The ApplicationMaster is responsible for negotiating resources from the ResourceManager and for working with the NodeManagers to execute and monitor the containers and their resource consumption. If we are running Spark on YARN, then we also need to budget in the resources that the AM needs.
Example
Cluster Config:
200 Nodes
14 cores per Node
Leave 1 core per node for Hadoop/Yarn daemons => Num cores available per node = 14-1 = 13
So, total available cores in the cluster = 13 x 200 = 2600
Let's assign 5 cores per executor => --executor-cores = 5 (for good HDFS throughput)
Number of available executors = (total cores / cores per executor) = 2600 / 5 = 520
Leaving 1 executor for the ApplicationMaster => --num-executors = 519
Please note: this is just a sample recommended configuration; you may wish to revise it based upon the performance of your application.
A better practice is also to monitor the node resources while you execute your job; this gives a better picture of the resource utilisation in your cluster.
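Put together as a spark-submit invocation, the above works out roughly as follows. The executor memory, class name and jar are placeholders that were not part of the original calculation; size the memory to your nodes' RAM and YARN's per-node memory limit.
# Sketch only: memory, class and jar names below are hypothetical.
spark-submit --master yarn --deploy-mode cluster \
  --num-executors 519 \
  --executor-cores 5 \
  --executor-memory 10G \
  --class com.example.MyApp my-app.jar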

AWS EMR Spark - CloudWatch

I was running an application on AWS EMR Spark. Here is the spark-submit job:
Arguments: spark-submit --deploy-mode cluster --class com.amazon.JavaSparkPi s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar s3://spark-config-test/2017-08-08
So, AWS uses YARN for resource management. I had a couple of doubts about this while observing the CloudWatch metrics:
1)
What does "containers allocated" imply here? I am using 1 master and 3 slave/executor nodes (all 4 have 8-core CPUs).
2)
I changed my command to:
spark-submit --deploy-mode cluster --executor-cores 4 --class com.amazon.JavaSparkPi s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar s3://spark-config-test/2017-08-08
Here the number of cores running is 3. Should it not be 3 (number of executors) * 4 (cores per executor) = 12?
1) "Containers allocated" here basically represents the number of Spark executors. Spark executor-cores are more like executor tasks, meaning you could configure your app to run one executor per physical CPU and still ask for 3 executor-cores per CPU (think hyper-threading).
What happens by default on EMR, when you don't specify the number of Spark executors, is that dynamic allocation is assumed and Spark will ask YARN only for what it thinks it needs in terms of resources. I tried setting the number of executors explicitly to 10, and the containers allocated went up to 6 (the maximum number of data partitions). Also, under the "Application history" tab, you can get a detailed view of the YARN/Spark executors.
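For instance, if you would rather pin the executor count than let dynamic allocation decide, a sketch of the same job could look like this; the executor and core counts are illustrative, not a recommendation:
# Sketch only: executor and core counts are illustrative.
spark-submit --deploy-mode cluster \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 10 --executor-cores 4 \
  --class com.amazon.JavaSparkPi \
  s3://spark-config-test/SWALiveOrderModelSpark-1.0.assembly.jar \
  s3://spark-config-test/2017-08-08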
2) "cores" here refer to EMR core nodes and are not the same as spark executor cores. Same for "task" that in the monitoring tab refer to EMR task nodes. That is consistent with my setup, as I have 3 EMR slave nodes.

Hadoop 2.7.2: How to manage resources

I use a server with 16 cores, 64 GB of RAM and a 2.5 TB disk, and I want to execute a Giraph program. I have installed Hadoop 2.7.2 and I don't know how to configure Hadoop to use only part of the server's resources, because the server is used by many users.
Requirements: Hadoop must use at most 12 cores (4 cores for the NameNode, DataNode, JobTracker and TaskTracker, and at most 8 for tasks) and at most 28 GB of RAM (i.e., 4*3 GB + 8*2 GB).
My yarn-site.xml resource configuration:
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>28672</value>
  <description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>12</value>
  <description>Number of CPU cores that can be allocated for containers.</description>
</property>
When I try to execute the Giraph program, at http://localhost:8088 the YARN application state is: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
I think some configuration is missing from my yarn-site.xml in order to meet the above requirements.
Before assigning resources to the services, take a look at the YARN tuning guide from Cloudera; it will give you an idea of how many resources should be allocated to the OS, Hadoop daemons, etc.
As you mentioned:
Yarn Application state is: ACCEPTED: waiting for AM container to be allocated, launched and register with RM
If there are no available resources for a job, it will stay in the ACCEPTED state until it gets resources. So in your case, check how many jobs are being submitted at the same time and check the resource utilisation of those jobs.
If you want your jobs not to have to wait, you should consider creating scheduler queues.
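As an illustration of that last point, a minimal capacity-scheduler.xml sketch that splits capacity between two queues; the queue names and percentages are made up for the example and assume the CapacityScheduler is in use:
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>giraph,default</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.giraph.capacity</name>
  <value>50</value>
  <description>Percent of cluster resources guaranteed to the (hypothetical) giraph queue.</description>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>50</value>
</property>
Jobs can then be submitted to a specific queue (for MapReduce-based jobs, via mapreduce.job.queuename).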

DC/OS SMACK cluster resource management

I am trying to set up a DC/OS Spark-Kafka-Cassandra cluster using 1 master and 3 private AWS m3.xlarge instances (each with 4 processors and 15 GB of RAM).
I have questions regarding some strange behaviour I encountered in a spike I did several days ago.
On each of the private nodes I have the following fixed resources reserved (I am speaking about CPU usage; memory is not the issue):
0.5 CPUs for Cassandra on each node
0.3 - 0.5 CPUs for Kafka on each node
0.5 CPUs of Mesos overhead (I simply see in the DC/OS UI that 0.5 CPUs more are occupied than the sum of all the services running on a node -> this is probably some sort of Mesos overhead)
the rest of the resources are available for running Spark jobs (around 2.5 CPUs)
Now, I want to run 2 streaming jobs, so that they run on every node of the cluster. This requires me to set, in the dcos spark run command, the number of executors to 3 (since I have 3 nodes in the cluster), as well as the number of CPU cores to 3 (it is impossible to set 1 or 2, because as far as I can see the minimum number of CPUs per executor is 1). Of course, for each of the streaming jobs, 1 CPU in the cluster is occupied by the driver program.
The first strange situation I see is that, instead of running 3 executors with 1 core each, Mesos launches 2 executors on 2 nodes, where one has 2 CPUs and the other has 1 CPU. Nothing is launched on the 3rd node, even though there were enough resources. How do I force Mesos to run 3 executors on the cluster?
Also, when I run 1 pipeline with 3 CPUs, I see that those CPUs are blocked and cannot be reused by the other streaming pipeline, even though they are not doing any work. Why can't Mesos shift available resources between applications? Isn't that the main benefit of using Mesos? Or maybe there are simply not enough resources to be shifted?
EDITED
Also, the question is: can I assign less than one CPU per executor?
Kindest regards,
Srdjan
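For reference, a hedged sketch of how such a submission could be expressed so that coarse-grained Mesos spreads three 1-core executors; the class name and jar URL are placeholders, this assumes spark.cores.max caps the total cores while spark.executor.cores sets the size of each executor, and whether the executors actually land on different nodes still depends on the offers Mesos makes:
dcos spark run --submit-args='--conf spark.cores.max=3 --conf spark.executor.cores=1 --class com.example.StreamingJob http://example.com/streaming-job.jar'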
