Physical mem used % and Physical Vcores Used % in Spark 3 YARN - apache-spark

I would like to understand what "Physical mem used %" and "Physical Vcores Used %" mean in Spark 3 on YARN.
I don't see these metrics in Spark 2.4, but I can see these new metrics in Spark 3 on YARN.
What is Physical mem used %?
What is Physical Vcores Used %?
Even when the cluster is at 100%, I see Physical mem used at 48% and Physical Vcores used at 86%.
Please help me understand the logic behind these metrics.

Related

How spark manages physical memory, virtual memory and executor memory?

I have been working with Spark for a few days and I am confused about Spark memory management. I see terms like physical memory, virtual memory, executor memory, and memory overhead, and these values don't add up properly as per my current understanding. Can someone explain these things in terms of Spark in a simple way?
For example, I'm running a Spark job with the following configuration in cluster mode:
from pyspark import SparkConf

spark_conf = SparkConf() \
    .set("spark.executor.memory", "10g") \
    .set("spark.executor.cores", 4) \
    .set("spark.executor.instances", 30) \
    .set("spark.dynamicAllocation.enabled", False)
But I get an error like this:
Failing this attempt.Diagnostics: [2020-08-18 11:57:54.479]
Container [pid=96571,containerID=container_1588672785288_540114_02_000001]
is running 62357504B beyond the 'PHYSICAL' memory limit.
Current usage: 1.6 GB of 1.5 GB physical memory used;
3.7 GB of 3.1 GB virtual memory used. Killing container.
How are physical memory and virtual memory allocations done with respect to executor memory and memory overhead?
Also, when I run the same job in client mode with the same configuration, it runs successfully. Why is that? The only thing that changes in client mode is the driver, and I don't have any code that aggregates data to the driver.
As the option value shows,
yarn.nodemanager.vmem-pmem-ratio 2.1
the default ratio between virtual and physical memory is 2.1. You can estimate the physical memory per container by dividing the total memory managed by the YARN ResourceManager by the number of containers, i.e. the executors, leaving the driver aside. Here is an article, but there are other good articles on how YARN allocates physical memory.
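To make the limits in the error message above concrete, here is a minimal sketch. It assumes the killed container is the driver/AM container with the default spark.driver.memory of 1g, the default overhead of max(384 MB, 10% of the heap), and YARN rounding the request up to a multiple of a 512 MB minimum allocation; these are illustrative assumptions, not values read from this cluster.

# Hedged sketch: how the "1.5 GB physical / 3.1 GB virtual" limits could arise.
def yarn_round_up(mb, min_alloc_mb=512):
    # YARN normalizes a container request up to a multiple of the minimum allocation.
    return ((mb + min_alloc_mb - 1) // min_alloc_mb) * min_alloc_mb

driver_memory_mb = 1024                               # spark.driver.memory default (1g), assumed
overhead_mb = max(384, int(0.10 * driver_memory_mb))  # default memoryOverhead -> 384 MB
requested_mb = driver_memory_mb + overhead_mb         # 1408 MB

physical_limit_mb = yarn_round_up(requested_mb)       # 1536 MB, reported as "1.5 GB"
virtual_limit_mb = physical_limit_mb * 2.1            # yarn.nodemanager.vmem-pmem-ratio
print(physical_limit_mb, virtual_limit_mb)            # 1536, 3225.6 (~3.1 GB)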

Allocated Memory and Reserved Memory of Application Master

I am trying to understand the 'Allocated Memory' and 'Reserved Memory' columns that are present in the screenshot (screenshot from the Application Master in the YARN UI).
The cluster settings that I have done in YARN are:
yarn_nodemanager_resource_memory-mb: 16GB
yarn_scheduler_minimum-allocation-mb: 256MB
yarn_scheduler_increment-allocation-mb: 500MB
yarn_scheduler_maximum-allocation-mb: 16GB
It is a single node cluster having 32GB of memory in total and 6 vCores.
Now, you can see from the screenshot that the 'Allocated Memory' is 8500MB. I would like to know how this is getting calculated.
One more thing - the driver memory specified is spark.driver.memory=10g
Allocated memory is determined by one of the following:
The memory available in your cluster
The memory available in your queue
vCores * (executor memory + executor overhead)
In your case it looks like your allocated memory is limited by the third option. I'm guessing you didn't set spark.executor.memory or spark.executor.memoryOverhead because the memory you are getting is right in line with the default values. The Spark docs show that default values are:
spark.executor.memory = 1g
spark.executor.memoryOverhead = 0.1 * executor memory, with a minimum of 384 MB
This gives about 1400 MB per core, which, multiplied by your 6 vCores, lines up with the Allocated Memory you are seeing.
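For reference, here is a minimal sketch of that arithmetic (the defaults below are the documented Spark defaults quoted above, not values read from this cluster; the remaining gap to 8500 MB comes from the scheduler rounding each container request up):

# Hedged sketch of the allocated-memory estimate described in this answer.
executor_memory_mb = 1024                                  # spark.executor.memory default (1g)
overhead_mb = max(384, int(0.10 * executor_memory_mb))     # spark.executor.memoryOverhead default
per_container_mb = executor_memory_mb + overhead_mb        # 1408 MB, "about 1400 MB per core"

vcores = 6                                                  # one container per vCore in this setup
print(vcores * per_container_mb)                            # 8448 MB, in line with the ~8500 MB shown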

Spark Yarn Memory configuration

I have a spark application that keeps failing on error:
"Diagnostics: Container [pid=29328,containerID=container_e42_1512395822750_0026_02_000001] is running beyond physical memory limits. Current usage: 1.5 GB of 1.5 GB physical memory used; 2.3 GB of 3.1 GB virtual memory used. Killing container."
I have seen lots of different parameters that it was suggested to change in order to increase the physical memory. Can I please have some explanation of the following parameters?
mapreduce.map.memory.mb (currently set to 0, so it should take the default, which is 1 GB, so why do we see it as 1.5 GB? Changing it also didn't affect the number)
mapreduce.reduce.memory.mb (currently set to 0, so it should take the default, which is 1 GB, so why do we see it as 1.5 GB? Changing it also didn't affect the number)
mapreduce.map.java.opts/mapreduce.reduce.java.opts (set to 80% of the previous number)
yarn.scheduler.minimum-allocation-mb=1GB (when I change this I see the effect on the maximum physical memory, but for the value 1 GB it is still 1.5 GB)
yarn.app.mapreduce.am.resource.mb/spark.yarn.executor.memoryOverhead (I can't find these in the configuration at all)
We are running YARN (with the yarn-cluster deploy mode) on Cloudera CDH 5.12.1.
spark.driver.memory
spark.executor.memory
These control the base amount of memory Spark will try to allocate for its driver and for each of the executors. These are probably the ones you want to increase if you are running out of memory.
// options before Spark 2.3.0
spark.yarn.driver.memoryOverhead
spark.yarn.executor.memoryOverhead
// options after Spark 2.3.0
spark.driver.memoryOverhead
spark.executor.memoryOverhead
These values specify an additional amount of memory to request when you are running Spark on YARN. They are intended to account for the extra RAM needed by the YARN containers that host your Spark executors (and driver).
yarn.scheduler.minimum-allocation-mb
yarn.scheduler.maximum-allocation-mb
When Spark asks YARN to reserve a block of RAM for an executor, it requests the base memory plus the overhead memory. However, YARN may not give back a container of exactly that size. These parameters control the smallest and largest container sizes that YARN will grant. If you are only using the cluster for one job, I find it easiest to set these to very small and very large values and then use the Spark memory settings mentioned above to set the true container size.
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
mapreduce.map.java.opts/mapreduce.reduce.java.opts
I don't think these have any bearing on your Spark/YARN job.
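To tie this back to Spark, here is a minimal PySpark sketch of the knobs discussed above (post-2.3.0 option names; the sizes are placeholders for illustration, not recommendations):

from pyspark import SparkConf

# Note: driver memory generally has to be set on the spark-submit command line
# (--driver-memory / --conf spark.driver.memoryOverhead=...), since the driver JVM
# is already running by the time application code builds a SparkConf.
conf = (
    SparkConf()
    .set("spark.executor.memory", "4g")          # base heap for each executor
    .set("spark.executor.memoryOverhead", "1g")  # extra RAM for the executor's YARN container
)
# YARN is asked for roughly spark.executor.memory + spark.executor.memoryOverhead per
# executor container, then rounds the request within its minimum/maximum-allocation-mb bounds.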

Why "memory in use" = 1g in Spark Standalone?

I'm running Apache Spark in standalone mode, and when I connect to myip:8080 I always see something like "Memory in use: 120.0 GB Total, 1.0 GB Used". Why is only 1 GB used if much more memory is available? Is it possible (or desirable) to increase the amount of memory that is actually used?

Spark on YARN: Less executor memory than set via spark-submit

I'm using Spark in a YARN cluster (HDP 2.4) with the following settings:
1 Masternode
64 GB RAM (48 GB usable)
12 cores (8 cores usable)
5 Slavenodes
64 GB RAM (48 GB usable) each
12 cores (8 cores usable) each
YARN settings
memory of all containers (of one host): 48 GB
minimum container size = maximum container size = 6 GB
vcores in cluster = 40 (5 x 8 cores of workers)
minimum #vcores/container = maximum #vcores/container = 1
When I run my Spark application with the command spark-submit --num-executors 10 --executor-cores 1 --executor-memory 5g ..., Spark should give each executor 5 GB of RAM, right? (I set the memory to only 5g because of the ~10% overhead memory.)
But when I had a look in the Spark UI, I saw that each executor only gets 3.4 GB of memory, see screenshot:
Can someone explain why so little memory is allocated?
The Storage Memory column in the UI displays the amount of memory used for execution and RDD storage. By default, this equals (HEAP_SPACE - 300 MB) * 75%. The rest of the memory is used for internal metadata, user data structures and other things.
You can control this amount by setting spark.memory.fraction (not recommended). See more in Spark's documentation.
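As a rough check against the screenshot, here is a minimal sketch of the formula from this answer for a 5g executor (the 0.75 fraction is the one quoted above; the JVM exposes a usable heap slightly below -Xmx, which is why the UI shows a bit less than this estimate):

# Hedged estimate of the "Storage Memory" column for --executor-memory 5g.
executor_heap_mb = 5 * 1024          # requested executor heap
reserved_mb = 300                    # memory reserved by Spark internals
memory_fraction = 0.75               # fraction quoted in this answer (spark.memory.fraction)

unified_mb = (executor_heap_mb - reserved_mb) * memory_fraction
print(unified_mb / 1024)             # ~3.5 GB, close to the ~3.4 GB shown in the UI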
