Spark Yarn Memory configuration - apache-spark

I have a Spark application that keeps failing with the following error:
"Diagnostics: Container [pid=29328,containerID=container_e42_1512395822750_0026_02_000001] is running beyond physical memory limits. Current usage: 1.5 GB of 1.5 GB physical memory used; 2.3 GB of 3.1 GB virtual memory used. Killing container."
I have seen many different parameters suggested for increasing the physical memory. Could someone explain the following parameters?
mapreduce.map.memory.mb (currently set to 0, so it is supposed to take the default of 1 GB. Why do we see 1.5 GB? Changing it did not affect the number either.)
mapreduce.reduce.memory.mb (currently set to 0, so it is supposed to take the default of 1 GB. Why do we see 1.5 GB? Changing it did not affect the number either.)
mapreduce.map.java.opts/mapreduce.reduce.java.opts (set to 80% of the previous number)
yarn.scheduler.minimum-allocation-mb=1GB (changing this does affect the maximum physical memory, but with a value of 1 GB the limit is still 1.5 GB)
yarn.app.mapreduce.am.resource.mb/spark.yarn.executor.memoryOverhead (I can't find these in the configuration at all)
We are running on YARN (yarn-cluster deployment mode) using Cloudera CDH 5.12.1.

spark.driver.memory
spark.executor.memory
These control the base amount of memory Spark will try to allocate for its driver and for each of the executors. These are probably the ones you want to increase if you are running out of memory.
// options before Spark 2.3.0
spark.yarn.driver.memoryOverhead
spark.yarn.executor.memoryOverhead
// options from Spark 2.3.0 onward
spark.driver.memoryOverhead
spark.executor.memoryOverhead
This value is an additional amount of memory to request when you are running Spark on YARN. It is intended to account for the extra RAM the YARN container hosting your Spark executor needs beyond the JVM heap.
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
When Spark asks YARN to reserve a block of RAM for an executor, it requests the base memory plus the overhead memory. However, YARN may not grant a container of exactly that size. These parameters control the smallest and the largest container size that YARN will grant. If you are only using the cluster for one job, I find it easiest to set these to very small and very large values, and then use the Spark memory settings mentioned above to set the true container size.
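To make the "YARN may not grant exactly that size" point concrete, here is a minimal sketch of how a request can end up larger than heap plus overhead. It assumes the scheduler rounds each request up to a multiple of a step (the minimum allocation, or a separate increment setting on some schedulers). The function name, the 512 MB step and the 8192 MB maximum are illustrative assumptions, not values taken from the question.

```python
def yarn_container_mb(heap_mb, overhead_mb, min_alloc_mb, max_alloc_mb, step_mb=None):
    """Estimate the container size YARN grants for a heap + overhead request.

    Assumption: the request is rounded up to a multiple of step_mb, which
    falls back to the minimum allocation when no separate step is given.
    """
    step = step_mb if step_mb is not None else min_alloc_mb
    requested = heap_mb + overhead_mb
    rounded = -(-requested // step) * step  # integer ceiling to the next multiple
    granted = max(min_alloc_mb, rounded)
    if granted > max_alloc_mb:
        raise ValueError(f"{granted} MB exceeds the maximum allocation of {max_alloc_mb} MB")
    return granted


# 1 GB heap + 384 MB overhead, rounded in 512 MB steps (hypothetical step size)
print(yarn_container_mb(1024, 384, 1024, 8192, step_mb=512))  # 1536
# Same request, rounded to multiples of a 1 GB minimum allocation
print(yarn_container_mb(1024, 384, 1024, 8192))  # 2048
```

Under these assumptions a 1 GB heap would land in a 1.5 GB container, which is one possible reading of the 1.5 GB limit in the error message; the actual rounding rule depends on the scheduler in use.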
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts/mapreduce.reduce.java.opts
I don't think these have any bearing on your Spark/Yarn job.

Related

Spark does not use all configured storage memory capacity

My Spark task uses image data for prediction. I am working on a standalone Spark cluster, but I cannot use all the available memory. The available storage memory is 2.7 GB (the executor memory is configured as 5 GB, and 5 GB * 0.6 * 0.9 = 2.7 GB, which is as expected), but only 342 MB is ever used. Once usage reaches that value my Spark session crashes, and I do not know why it is this specific value.
I tested my application both in local mode and on a standalone cluster. Whatever executor memory I configure, the memory used for execution never goes beyond 342 MB. As shown below, a data size of 290691 KB crashes my Spark session, while it works fine if I decrease the number of images.
Screenshots of the issue:
The error output for the crash with a data size of 290691 KB
The Spark UI, where Storage Memory did not exceed 342 MB
Is there any advice, or what is the correct Spark configuration?
It's a warning, initially.
The general gist is that you need to repartition to get more, smaller partitions, which gives you more parallelism and higher throughput. Many similar issues are discussed elsewhere on the Internet.

Required executor memory is above the max threshold of this cluster

I am running Spark on an 8 node cluster with yarn as a resource manager. I have 64GB memory per node, and I set the executor memory to 25GB, but I get the error:
Required executor memory (25600MB) is above the max threshold (16500 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
I then set yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb to 25600, but nothing changes.
Executor memory is only the heap portion of the memory. You still have to run a JVM plus allocate the non-heap portion of memory inside a container and have that fit in YARN. Refer to the image from How-to: Tune Your Apache Spark Jobs (Part 2) by Sandy Ryza.
If you want to use executor memory set at 25GB, I suggest you bump up yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb to something higher like 42GB.
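As a rough sketch of why limits equal to the executor memory are not enough, the container request can be estimated as heap plus overhead. This assumes the default overhead of 10% of executor memory with a 384 MB minimum, as quoted elsewhere on this page; the function name is hypothetical and older Spark versions may use a different factor.

```python
def executor_container_request_mb(executor_memory_mb, overhead_percent=10, min_overhead_mb=384):
    """Heap plus overhead, i.e. what Spark asks YARN for per executor.

    Assumption: overhead = max(384 MB, 10% of executor memory).
    """
    overhead_mb = max(min_overhead_mb, executor_memory_mb * overhead_percent // 100)
    return executor_memory_mb + overhead_mb


request_mb = executor_container_request_mb(25600)  # 25 GB executor heap
max_allocation_mb = 25600                          # limit the asker configured
print(request_mb, request_mb <= max_allocation_mb)  # 28160 False
```

Under these assumptions a 25 GB executor needs a container of about 27.5 GB, so both YARN limits have to sit above that figure, not at 25600.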

Allocated Memory and Reserved Memory of Application Master

I am trying to understand the 'Allocated Memory' and 'Reserved Memory' columns that are present in the screenshot. Screenshot from Application Master in YARN UI
The cluster settings that I have configured in YARN are:
yarn_nodemanager_resource_memory-mb: 16GB
yarn_scheduler_minimum-allocation-mb: 256MB
yarn_scheduler_increment-allocation-mb: 500MB
yarn_scheduler_maximum-allocation-mb: 16GB
It is a single node cluster having 32GB of memory in total and 6 vCores.
Now, you can see from the screenshot that the 'Allocated Memory' is 8500MB. I would like to know how this is getting calculated.
One more thing - the driver memory specified is spark.driver.memory=10g
Allocated memory is either determined by:
The memory available in your cluster
The memory available in your queue
vCores * (executor memory + executor overhead)
In your case it looks like your allocated memory is limited by the third option. I'm guessing you didn't set spark.executor.memory or spark.executor.memoryOverhead because the memory you are getting is right in line with the default values. The Spark docs show that default values are:
spark.executor.memory = 1g
spark.executor.memoryOverhead = 0.1 * executor memory with a minimum of 384mb
This gives about 1400 MB per core, which multiplied by your 6 cores lines up with the Allocated Memory you are seeing.
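The arithmetic behind that estimate can be written out as a short sketch. It uses only the defaults quoted above and the 6 vCores from the question; the variable names are illustrative, and any rounding YARN applies to each container is ignored.

```python
EXECUTOR_MEMORY_MB = 1024  # spark.executor.memory default of 1g
MIN_OVERHEAD_MB = 384      # minimum executor memory overhead
VCORES = 6                 # vCores on the single node in the question

# Overhead is 10% of executor memory, but never less than the minimum.
overhead_mb = max(MIN_OVERHEAD_MB, EXECUTOR_MEMORY_MB // 10)
per_core_mb = EXECUTOR_MEMORY_MB + overhead_mb
total_mb = VCORES * per_core_mb

print(per_core_mb, total_mb)  # 1408 8448
```

The result, 8448 MB, is close to the 8500 MB shown as Allocated Memory.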

Spark on YARN resource manager: Relation between YARN Containers and Spark Executors

I'm new to Spark on YARN and don't understand the relation between the YARN Containers and the Spark Executors. I tried out the following configuration, based on the results of the yarn-utils.py script, that can be used to find optimal cluster configuration.
The Hadoop cluster (HDP 2.4) I'm working on:
1 Master Node:
CPU: 2 CPUs with 6 cores each = 12 cores
RAM: 64 GB
SSD: 2 x 512 GB
5 Slave Nodes:
CPU: 2 CPUs with 6 cores each = 12 cores
RAM: 64 GB
HDD: 4 x 3 TB = 12 TB
HBase is installed (this is one of the parameters for the script below)
So I ran python yarn-utils.py -c 12 -m 64 -d 4 -k True (c=cores, m=memory, d=hdds, k=hbase-installed) and got the following result:
Using cores=12 memory=64GB disks=4 hbase=True
Profile: cores=12 memory=49152MB reserved=16GB usableMem=48GB disks=4
Num Container=8
Container Ram=6144MB
Used Ram=48GB
Unused Ram=16GB
yarn.scheduler.minimum-allocation-mb=6144
yarn.scheduler.maximum-allocation-mb=49152
yarn.nodemanager.resource.memory-mb=49152
mapreduce.map.memory.mb=6144
mapreduce.map.java.opts=-Xmx4915m
mapreduce.reduce.memory.mb=6144
mapreduce.reduce.java.opts=-Xmx4915m
yarn.app.mapreduce.am.resource.mb=6144
yarn.app.mapreduce.am.command-opts=-Xmx4915m
mapreduce.task.io.sort.mb=2457
I made these settings via the Ambari interface and restarted the cluster. The values also roughly match what I calculated manually before.
I now have trouble:
finding the optimal settings for my spark-submit script parameters --num-executors, --executor-cores & --executor-memory
understanding the relation between the YARN container and the Spark executors
understanding the hardware information in my Spark History UI (it shows less memory than I set, when I calculate the overall memory by multiplying by the number of worker nodes)
understanding the concept of vcores in YARN; I couldn't find any useful examples yet
I did find the post "What is a container in YARN?", but it didn't really help, as it doesn't describe the relation to the executors.
Can someone help with one or more of these questions?
I will report my insights here step by step:
The first important thing is this fact (source: this Cloudera documentation):
When running Spark on YARN, each Spark executor runs as a YARN container. [...]
This means the number of executor containers always equals the number of executors created by a Spark application, e.g. via the --num-executors parameter in spark-submit. (YARN additionally starts one container for the application master.)
Every container always allocates at least the amount of memory set by yarn.scheduler.minimum-allocation-mb. This means that if --executor-memory is set to e.g. only 1g but yarn.scheduler.minimum-allocation-mb is e.g. 6g, the container is much bigger than the Spark application needs.
The other way round, if --executor-memory is set to something higher than the yarn.scheduler.minimum-allocation-mb value, e.g. 12g, the container will allocate more memory dynamically, but only if the requested amount of memory is smaller than or equal to the yarn.scheduler.maximum-allocation-mb value.
The value of yarn.nodemanager.resource.memory-mb determines how much memory can be allocated in total by all containers on one host!
=> So setting yarn.scheduler.minimum-allocation-mb to a low value allows you to run smaller containers, e.g. for smaller executors (otherwise memory would be wasted).
=> Setting yarn.scheduler.maximum-allocation-mb to the maximum value (e.g. equal to yarn.nodemanager.resource.memory-mb) allows you to define bigger executors (more memory is allocated if needed, e.g. by --executor-memory parameter).
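The rules above can be sketched with the values from the yarn-utils.py output in the question (6144 MB minimum, 49152 MB maximum and per-node memory). This is a simplified model: the function names are hypothetical, and it deliberately ignores the memory overhead Spark adds to each request and any rounding the scheduler applies.

```python
def container_mb(request_mb, min_alloc_mb, max_alloc_mb):
    """Container size for a request: at least the minimum, never above the maximum."""
    if request_mb > max_alloc_mb:
        raise ValueError(f"request of {request_mb} MB exceeds the maximum of {max_alloc_mb} MB")
    return max(request_mb, min_alloc_mb)


def containers_per_node(node_memory_mb, size_mb):
    """How many containers of the given size fit into yarn.nodemanager.resource.memory-mb."""
    return node_memory_mb // size_mb


MIN_MB, MAX_MB, NODE_MB = 6144, 49152, 49152

small = container_mb(1024, MIN_MB, MAX_MB)   # --executor-memory 1g
large = container_mb(12288, MIN_MB, MAX_MB)  # --executor-memory 12g

print(small, containers_per_node(NODE_MB, small))  # 6144 8
print(large, containers_per_node(NODE_MB, large))  # 12288 4
```

In this model the 1g executor still occupies a 6144 MB container, so lowering the minimum allocation is what frees that memory for additional executors.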

How does Spark occupy the memory

If my server has 50 GB of memory and HBase is using 40 GB, and I run Spark with --executor-memory 30G, will Spark take some memory from HBase, since there is only 10 GB left?
Another question: if Spark only needs 1 GB of memory but I give it 10 GB, will Spark occupy all 10 GB?
The behavior differs depending on the deployment mode. If you are using local mode, --executor-memory will not change anything, as you only have one executor and that is your driver, so you need to increase the memory of your driver instead.
If you are using Standalone mode and submitting your job in cluster mode, the following applies:
--executor-memory is the memory required per executor. It is the executor's heap size, equivalent to -Xms and -Xmx. By default, 60% of the configured --executor-memory is used to cache RDDs. The remaining 40% is available for any objects created during task execution. If you request more memory than is available, your executors will show errors regarding insufficient memory.
When you give a Spark executor 30G of memory, the OS will not give it actual physical memory up front. But as your executor requires actual memory for caching or processing, your other processes, like HBase, will be pushed into swap. If your system's swap is set to zero, you will face an OOM error.
The OS swaps out idle parts of a process, which can make that process very slow.
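For a sense of scale, the 60/40 split described in this answer can be worked through for the 30G executor from the question. This is only a sketch of the answer's description: the function name is hypothetical, and real Spark versions apply further safety margins or reserved memory that are not modeled here.

```python
def heap_split_mb(executor_memory_mb, storage_percent=60):
    """Split the executor heap into RDD cache space and everything else.

    Assumption: a fixed percentage of the heap is used for caching,
    following the 60% default described in the answer.
    """
    storage_mb = executor_memory_mb * storage_percent // 100
    return {
        "storage_mb": storage_mb,
        "task_objects_mb": executor_memory_mb - storage_mb,
    }


print(heap_split_mb(30 * 1024))
# {'storage_mb': 18432, 'task_objects_mb': 12288}
```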

Resources