What is the minimum driver memory for a Spark application? - apache-spark

The Spark docs show that the default driver memory is 1 GB. Can we set it to less than 1 GB? I am providing 634 MB, but it gives the error "Application is running beyond physical limit".

Yes, we can set it below 1 GB. I have run my app with 512m of memory and it worked fine.
The error you mentioned occurs because your application requires more memory than you have allocated.
Could you please share the full stack trace of the error?
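To make the arithmetic concrete: on YARN the container request is the driver heap plus a memory overhead, so 634m asks for noticeably more than 634 MB. The Python sketch below assumes the documented defaults for spark.driver.memoryOverhead (10% of the driver memory, with a 384 MB minimum). The function name is made up for illustration, and the asker's cluster may be configured differently.

```python
# Assumed defaults, taken from the Spark configuration docs; a cluster may override them.
MIN_OVERHEAD_MB = 384    # floor for the memory overhead
OVERHEAD_FACTOR = 0.10   # default overhead fraction for JVM jobs


def driver_container_request_mb(driver_memory_mb: int) -> int:
    """Heap plus overhead that Spark would request from YARN (hypothetical helper)."""
    overhead_mb = max(MIN_OVERHEAD_MB, int(driver_memory_mb * OVERHEAD_FACTOR))
    return driver_memory_mb + overhead_mb


print(driver_container_request_mb(634))  # 1018
print(driver_container_request_mb(512))  # 896
```

Separately, Spark's unified memory manager reserves 300 MB and refuses to start when the heap is below 1.5 times that (about 450 MB), so that is likely the practical floor for spark.driver.memory.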

Related

Spark Insufficient Memory

My Spark job fails with the following error:
java.lang.IllegalArgumentException: Required executor memory (33792 MB), offHeap memory (0) MB, overhead (8192 MB), and PySpark memory (0 MB)
is above the max threshold (24576 MB) of this cluster!
Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
I have defined executor memory to be 33g and executor memory overhead to be 8g. However, the total should be less than or equal to 24g as per the error log. Can someone help me understand what exactly 24g refers to? Is it the RAM on the master node or something else? Why is it capped at 24g?
Once I figure it out, I can programmatically calculate my other values to not run into this issue again.
Setup: a make command that wraps multiple spark-submit commands runs on Jenkins, which launches them on an AWS EMR cluster running Spark 3.x.
This error happens because you're requesting more resources than are available on the cluster (see the org.apache.spark.deploy.yarn.Client source). For your case specifically (AWS EMR), I think you should check the value of yarn.nodemanager.resource.memory-mb, as the message says (in yarn-site.xml or via the NodeManager Web UI), and not try to allocate more than this value per YARN container.
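Since the asker wants to calculate the values programmatically, here is a minimal Python sketch of the check the error message describes: the four memory components are summed and compared with the cluster's maximum container size. The function names are hypothetical, and the 24576 MB threshold is simply the number from the log above; on a real cluster it would come from the YARN settings named in the message.

```python
def yarn_request_mb(executor_mb: int, overhead_mb: int,
                    offheap_mb: int = 0, pyspark_mb: int = 0) -> int:
    """Total memory requested per executor container (sum of the parts in the error)."""
    return executor_mb + overhead_mb + offheap_mb + pyspark_mb


def max_executor_memory_mb(max_allocation_mb: int, overhead_mb: int,
                           offheap_mb: int = 0, pyspark_mb: int = 0) -> int:
    """Largest executor memory that still fits under the container limit."""
    return max_allocation_mb - overhead_mb - offheap_mb - pyspark_mb


MAX_ALLOCATION_MB = 24576  # threshold reported in the error log

requested = yarn_request_mb(33792, 8192)
print(requested, requested <= MAX_ALLOCATION_MB)         # 41984 False
print(max_executor_memory_mb(MAX_ALLOCATION_MB, 8192))   # 16384
```

With an 8g overhead, the executor memory would have to drop to 16g or less to fit under this threshold.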

Spark Job Fails on yarn with memory error

My Spark job fails with the following error:
Diagnostics: Container [pid=7277,containerID=container_1528934459854_1736_02_000001] is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical memory used; 3.1 GB of 6.9 GB virtual memory used. Killing container.
Your containers are getting killed. This happens when the memory available to YARN is less than the task requires, so a possible solution is to increase YARN memory.
You have two choices:
Either increase the current memory size of your NodeManager,
or add a new NodeManager on one more DataNode.
Either option will increase the YARN memory; make sure it is at least around 2 GB.
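Before changing anything, it can help to read the numbers in the diagnostics line: the "1.4 GB of 1.4 GB physical memory" part is the usage and limit of that one container, not of the whole node. A small Python sketch for pulling those figures out of the message follows; the function name and dictionary keys are hypothetical, and the regular expression only assumes the message format shown in the question.

```python
import re

# Conversion factors to MB for the units that appear in the message.
UNIT_MB = {"KB": 1 / 1024, "MB": 1, "GB": 1024, "TB": 1024 * 1024}
SIZE = re.compile(r"([\d.]+) (KB|MB|GB|TB)")


def parse_usage_mb(diagnostics: str) -> dict:
    """Extract used/limit sizes (in MB) from a YARN 'Current usage' diagnostics line."""
    usage = diagnostics.split("Current usage:", 1)[1]
    sizes = [float(value) * UNIT_MB[unit] for value, unit in SIZE.findall(usage)]
    keys = ("physical_used", "physical_limit", "virtual_used", "virtual_limit")
    return dict(zip(keys, sizes[:4]))


message = (
    "Container [pid=7277,containerID=container_1528934459854_1736_02_000001] "
    "is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB "
    "physical memory used; 3.1 GB of 6.9 GB virtual memory used. Killing container."
)
usage = parse_usage_mb(message)
print(usage["physical_used"] >= usage["physical_limit"])  # True
```

When the physical usage has reached the physical limit, as here, the container itself needs a larger allocation.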

GraphX Disk size running low

I am currently using Apache Spark with GraphX. I have noticed lately that when I run my application with a lot of data, it uses a large part of my disk. For example, before I start the program the free disk space is around 8 GB, and while the application runs it goes down to 1 GB. When I close the application the space is restored, but not in full; I have lost a few GB. At first I thought that it had to do with swap memory and logs, but I cannot find what is stored on my disk after the application has finished.
Can someone explain why this is happening?

Spark streaming on yarn - Container running beyond physical memory limits

I'm running a Spark Streaming application on YARN. It works well for several days, and after that I encountered a problem. The error message from YARN is listed below:
Application application_1449727361299_0049 failed 2 times due to AM Container for appattempt_1449727361299_0049_000002 exited with exitCode: -104
For more detailed output, check application tracking page: https://sccsparkdev03:26001/cluster/app/application_1449727361299_0049 Then, click on links to logs of each attempt.
Diagnostics: Container [pid=25317,containerID=container_1449727361299_0049_02_000001] is running beyond physical memory limits. Current usage: 3.5 GB of 3.5 GB physical memory used; 5.3 GB of 8.8 GB virtual memory used. Killing container.
And here is my memory configuration:
spark.driver.memory = 3g
spark.executor.memory = 3g
mapred.child.java.opts -Xms1024M -Xmx3584M
mapreduce.map.java.opts -Xmx2048M
mapreduce.map.memory.mb 4096
mapreduce.reduce.java.opts -Xmx3276M
mapreduce.reduce.memory.mb 4096
This OOM error is strange because I don't keep any data in memory, since it's a streaming program. Has anyone encountered the same problem, or does anyone know what causes it?
Check the memory on the box/VM instance you're running it on. My guess is the host machine is redlining.
...due to, it appears, over-allocating memory.
Where do you think the streaming gets executed, regardless of whether you store anything there? Yup, memory. Not cats or dancing Viking either (add "e").
Guess what? You're allocating 7 GB of memory that is heavily weighted towards physical over virtual memory.
Check your logging, as that would have a similar build-up time.
What's the value of spark.yarn.am.memory?
Get your VM and container memory allocation in balance :)
Another thought is to adjust memoryOverhead so that physical and virtual memory are more proportional.
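On the memoryOverhead suggestion, a rough Python sketch of how the 3.5 GB limit in the error could arise, and how a larger overhead would change it. It assumes the application runs in cluster mode (so the AM container hosts the 3g driver), the default overhead of 10% with a 384 MB minimum, and a YARN allocation increment of 512 MB. The increment is a guess, not something stated in the question, and the function name is hypothetical.

```python
import math
from typing import Optional


def container_limit_mb(heap_mb: int, overhead_mb: Optional[int] = None,
                       increment_mb: int = 512) -> int:
    """Heap plus overhead, rounded up to the assumed YARN allocation increment."""
    if overhead_mb is None:
        # Assumed Spark default: 10% of the heap, but never less than 384 MB.
        overhead_mb = max(384, int(heap_mb * 0.10))
    requested = heap_mb + overhead_mb
    return math.ceil(requested / increment_mb) * increment_mb


print(container_limit_mb(3072) / 1024)                    # 3.5
print(container_limit_mb(3072, overhead_mb=1024) / 1024)  # 4.0
```

Under these assumptions the JVM heap alone takes 3 GB of the 3.5 GB container, which leaves about 0.5 GB for everything off-heap. Raising the overhead widens that margin without changing the heap size.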

Java OutOfMemoryError in Windows Azure Virtual Machine

When I run my Java application on a Windows Azure Ubuntu 12.04 VM
with four 1.6 GHz cores and 7 GB of RAM, I get the following out-of-memory error after a few minutes:
java.lang.OutOfMemoryError: GC overhead limit exceeded
I have a swap size of 15 GB, and the max heap size is set to 2 GB. I am using Oracle Java 1.6. Increasing the max heap size only delays the out-of-memory error.
It seems the JVM is not doing garbage collection.
However, when I run the same Java application on my local Windows 8 PC (Core i7) with the same JVM parameters, it runs fine. The heap size never exceeds 1 GB.
Is there any extra setting on a Windows Azure Linux VM for running Java apps?
On the Azure VM, I used the following JVM parameter
-XX:+HeapDumpOnOutOfMemoryError
to get a heap dump. The heap dump shows that an actor mailbox and Camel messages are taking up all of the 2 GB.
In my Akka application, I have used Akka Camel Redis to publish processed messages to a Redis channel.
The out-of-memory error goes away when I stub out the above Camel actor. It looks as though the Akka Camel Redis actor
is not performant on the VM, which has a slower CPU clock speed than my Xeon CPU.
Shing
The JVM throws this error when too much time is spent in garbage collection while very little memory is reclaimed. I believe the default thresholds are 98% of CPU time spent on GC with less than 2% of the heap recovered.
This is to prevent applications from running for an extended period of time while making no progress because the heap is too small.
You can turn this check off with the command-line option -XX:-UseGCOverheadLimit
