java.lang.OutOfMemoryError: Java heap space error in spark-submit - apache-spark

I am running a Spark application with spark-submit and the JVM parameters defined below. With this set of parameters I get a Java heap space error:
EXTRA_JVM_FLAGS="-server -XX:+UseG1GC
-XX:ReservedCodeCacheSize=384m
-XX:MaxDirectMemorySize=2G
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
--master "local[4]"
--driver-memory 2G
--driver-java-options "${EXTRA_JVM_FLAGS}"
I tried to increase the driver memory, but it caused a JVM crash. I also tried to increase the max direct memory size, which did not help in any way. Which options should I change to fix the heap space error?

You should start with the most basic option, -Xmx, which sets the maximum heap size.
Code cache and direct memory size are native memory areas and don't affect the size of the heap.
By default, the JVM takes 1/4 of the RAM available on the box as the maximum heap size. You can safely increase that if the machine is dedicated to this one JVM process.
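For reference, here is a minimal sketch of the submit command with a larger driver heap. Since the job runs with --master "local[4]", the driver JVM is the one whose heap matters, and --driver-memory sets its maximum heap size; the 4g value and the application jar name are placeholders:

spark-submit \
  --master "local[4]" \
  --driver-memory 4g \
  --driver-java-options "${EXTRA_JVM_FLAGS}" \
  your-application.jar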

Related

Spark UI Showing Wrong Memory Allocation

We are currently running into an issue where Spark shows that each of our nodes has only 4GB of memory. However, we have allocated 10GB of memory by setting spark-worker.jvmOptions = -Xmx10g. We cannot figure out what is causing this unusual limitation/incorrect memory allocation.
When we run Spark jobs, they run as if there were only 4GB of memory per worker.
Any help would be great! Thanks!
Screenshot of SOLR UI
You should set the worker memory using --executor-memory in your spark-submit command (a sketch follows below).
Alternatively, try setting the following parameter in the conf/spark-defaults.conf file:
spark.executor.memory 10g
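For illustration, a minimal sketch of the spark-submit form; the class and jar names are placeholders:

spark-submit \
  --class com.example.MyJob \
  --executor-memory 10g \
  my-job.jar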

Allocated Memory and Reserved Memory of Application Master

I am trying to understand the 'Allocated Memory' and 'Reserved Memory' columns shown in the screenshot, taken from the Application Master page of the YARN UI.
The cluster settings that I have done in YARN are:
yarn_nodemanager_resource_memory-mb: 16GB
yarn_scheduler_minimum-allocation-mb: 256MB
yarn_scheduler_increment-allocation-mb: 500MB
yarn_scheduler_maximum-allocation-mb: 16GB
It is a single node cluster having 32GB of memory in total and 6 vCores.
Now, you can see from the screenshot that the 'Allocated Memory' is 8500MB. I would like to know how this is getting calculated.
One more thing - the driver memory specified is spark.driver.memory=10g
Allocated memory is determined by one of the following:
The memory available in your cluster
The memory available in your queue
vCores * (executor memory + executor overhead)
In your case it looks like your allocated memory is limited by the third option. I'm guessing you didn't set spark.executor.memory or spark.executor.memoryOverhead, because the memory you are getting is right in line with the defaults. The Spark docs give these default values:
spark.executor.memory = 1g
spark.executor.memoryOverhead = 0.1 * executor memory with a minimum of 384mb
This gives about 1400MB per core, which, multiplied by your 6 vCores, lines up with the Allocated Memory you are seeing; a rough check of the arithmetic follows below.
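Using the defaults quoted above (0.1 * 1024MB is below the 384MB minimum, so the minimum applies):

1024MB + 384MB = 1408MB per executor
1408MB * 6 vCores = 8448MB, i.e. roughly the 8500MB reported as Allocated Memory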

Where to set "spark.yarn.executor.memoryOverhead"

I am getting the following error while running my Spark Scala program.
YarnSchedulerBackend$YarnSchedulerEndpoint: Container killed by YARN for exceeding memory limits. 2.6GB of 2.5GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
I have set spark.yarn.executor.memoryOverhead in the program while creating SparkSession.
My question is - is it ok to set "spark.yarn.executor.memoryOverhead" while creating SparkSession or should it be passed during runtime with spark-submit?
You have to set spark.yarn.executor.memoryOverhead at the time of SparkSession creation. This parameter is the amount of off-heap memory (in megabytes) to be allocated per executor. It accounts for things like VM overheads, interned strings, and other native overheads, and it tends to grow with the executor size (typically 6-10% of it).
This allocation can only be done when the executors are allocated, not at runtime.
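If the property were instead passed with spark-submit, the other route the question mentions, it would look like this sketch; the overhead value, executor memory, and jar name are placeholders:

spark-submit \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --executor-memory 2g \
  my-job.jar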

What memory do spark.*.memory properties control - RAM or disk?

You can set spark.driver.memory and spark.executor.memory that are described as follows:
spark.driver.memory 1g Amount of memory to use for the driver process
spark.executor.memory 1g Amount of memory to use per executor process (e.g. 2g, 8g).
The above configuration says memory. So Is it RAM memory or disk?
(I must admit it's a very intriguing question)
In short, it's RAM (and, honestly, Spark does not support disk as a resource to accept/request from a cluster manager).
From the official documentation, under Application Properties:
Amount of memory to use for the driver process, i.e. where SparkContext is initialized. (e.g. 1g, 2g).
Note: In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-memory command line option or in your default properties file.
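A minimal sketch of the two ways the note describes for setting the driver memory in client mode; the sizes and jar name are placeholders:

# on the command line
spark-submit --driver-memory 2g my-app.jar

# or in conf/spark-defaults.conf
spark.driver.memory 2g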

How does Spark occupy the memory

My server has 50GB of memory and HBase is using 40GB of it. When I run Spark I set --executor-memory 30G. Will Spark grab some memory from HBase, since there is only 10GB left?
Another question: if Spark only needs 1GB of memory but I give it 10GB, will it occupy the full 10GB?
The behavior depends on the deployment mode. If you are using local mode, --executor-memory will not change anything, because you only have one executor and that is your driver, so you need to increase the driver's memory instead.
If you are using standalone mode and submitting your job in cluster mode, the following applies:
--executor-memory is the memory required per executor; it is the executor's heap size. By default, 60% of the configured --executor-memory is used to cache RDDs, and the remaining 40% is available for objects created during task execution. It is equivalent to setting -Xms and -Xmx, so if you request more memory than is available, your executors will show errors about insufficient memory.
When you give a Spark executor 30GB of memory, the OS does not hand it actual physical memory right away. But as the executor needs real memory for caching or processing, it will push your other processes, such as HBase, into swap. If your system's swap is set to zero, you will face an OOM error.
The OS swaps out idle parts of a process, which can make that process very slow.
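A sketch of that distinction in spark-submit terms; the master URL, sizes, and jar name are placeholders:

# local mode: only the driver JVM exists, so size the driver
spark-submit --master local[4] --driver-memory 8g my-app.jar

# standalone cluster mode: size the executors
spark-submit --master spark://master-host:7077 --deploy-mode cluster --executor-memory 8g my-app.jar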
