How to resolve EMR Spark Out Of Memory Error - apache-spark

I have a Spark job which I am trying to execute on EMR. It is giving me the below error:
java.lang.OutOfMemoryError: Java heap space
-XX:OnOutOfMemoryError="kill -9 %p"
Executing /bin/sh -c "kill -9 22611"...
I have tried it with even 10 core instances of type m5.12xlarge but still the same issue. My code is working fine as I have tested it via AWS Glue and that has succeeded with G1.X and 20 DPUs (takes around 3 hours to complete the job). Any recommendation regarding how I choose the EMR instance type?

So changing the instance types only won't always help, we need to play with spark configurations as well. I followed something like mentioned here and the job is successful on EMR.

Related

AWS EMR: java.lang.OutOfMemoryError: Java heap space

I have AWS EMR Spark job which has to process some 1.3 TB of data and I have read that HDFS replicates data size three times in the nodes. I am using 20 core nodes of m5.24xlarge (started trying from lower but did not succeed) with 768 GB EBS volume each. I am receiving the below error:
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 27571"...
For this, I have tried increasing the HADOOP_HEAPSIZE configuration to 2048 and then to 6144 but I am getting the same error after around 1 hour 30 minutes of the job run. I have tried several combinations of instance type, instance count, EBS volume and HADOOP_HEAPSIZE configuration but nothing seems to work here. I have read about this from the forum and AWS docs and applied the above changes. My SparkConfigurations are as below:
sparkConf.set("spark.driver.memory", 10g);
sparkConf.set("spark.executor.memory", 16g);
Any suggestion of how to achieve this? My program is simple and while doing spark.write.parquet (that is the last step of the job), the EMR step fails. Thanks!

Spark is dropping all executors at the beginning of a job

I'm trying to configure a spark job to run with fixed resources on a Dataproc cluster, however after the job was running for 6 minutes I noticed that all but 7 executors had been dropped. 45 minutes later the job has not progressed at all, and I cannot find any errors or logs to explain.
When I check the timeline in the job details it shows all but 7 executors being removed at the 6 minute mark, with the message Container [really long number] exited from explicit termination request..
The command I am running is:
gcloud dataproc jobs submit spark --region us-central1 --cluster [mycluster] \
--class=path.to.class.app --jars="gs://path-to-jar-file" --project=my-project \
--properties=spark.executor.instances=72,spark.driver.memory=28g,spark.executor.memory=28g
My cluster is 1 + 24 n2-highmem16 instances if that helps.
EDIT: I terminated the job, reset, and tried again. The exact same thing happened at the same point in the job (Job 9 Stage 9/12)
Typically that message is expected to be associated with Spark Dynamic Allocation; if you want to always have a fixed number of executors, you can try to add the property:
...
--properties=spark.dynamicAllocation.enabled=false,spark.executor.instances=72...
However, that probably won't address the root problem in your case aside from seeing idle executors continue to stick around; if the dynamic allocation was relinquishing those executors, that would be due to those tasks having completed already but where your remaining executors for whatever reason are not yet done for a long time. This often indicates some kind of data skew where the remaining executors have a lot more work to do than the ones that already completed for whatever reason, unless the remaining executors were simply all equally loaded as part of a smaller phase of the pipeline, maybe in a "reduce" phase.
If you're seeing lagging tasks out of a large number of equivalent tasks, you might consider adding a repartition() step to your job to chop it up more fine-grained in the hopes of spreading out those skewed partitions, or otherwise changing the way your group or partition your data through other means.
Fixed. The job was running out of resources. Allocated some more executors to the job and it completed.

Databricks Spark: java.lang.OutOfMemoryError: GC overhead limit exceeded i

I am executing a Spark job in Databricks cluster. I am triggering the job via a Azure Data Factory pipeline and it execute at 15 minute interval so after the successful execution of three or four times it is getting failed and throwing with the exception "java.lang.OutOfMemoryError: GC overhead limit exceeded".
Though there are many answer with for the above said question but in most of the cases their jobs are not running but in my cases it is getting failed after successful execution of some previous jobs.
My data size is less than 20 MB only.
My cluster configuration is:
So the my question is what changes I should make in the server configuration. If the issue is coming from my code then why it is getting succeeded most of the time. Please advise and suggest me the solution.
This is most probably related to executor memory being bit low .Not sure what is current setting and if its default what is the default value in this particular databrics distribution. Even though it passes but there would lot of GCs happening because of low memory hence it would keep failing once in a while . Under spark configuration please provide spark.executor.memory and also some other params related to num of executors and cores per executor . In spark-submit the config would be provided as spark-submit --conf spark.executor.memory=1g
You may try increasing memory of driver node.
Sometimes the Garbage Collector is not releasing all the loaded objects in the driver's memory.
What you can try is to force the GC to do that. You can do that by executing the following:
spark.catalog.clearCache()
for (id, rdd) in spark.sparkContext._jsc.getPersistentRDDs().items():
rdd.unpersist()
print("Unpersisted {} rdd".format(id))

Spark Job not getting any cores on EC2

I use flintrock 0.9.0 with spark 2.2.0 to start my cluster on EC2. the code is written in pyspark I have been doing this for a while now and run a couple of successful jobs. In the last 2 days I encountered a problem that when I start a cluster on certain instances I don't get any cores. I observed this behavior on c1.medium and now on r3.xlarge the code to get the spark and spark context objects is this
conf = SparkConf().setAppName('the_final_join')\
.setMaster(master)\
.set('spark.executor.memory','29G')\
.set('spark.driver.memory','29G')
sc = SparkContext(conf=conf)
spark = SparkSession.builder.config(conf=conf).getOrCreate()
on c1.medium is used .set('spark.executor.cores', '2') and it seemed to work. But now I tried to run my code on a bigger cluster of r3.xlarge instances and my Job doesn't get any code no matter what I do. All workers are alive and I see that each of them should have 4 cores. Did something change in the last 2 months or am I missing something in the startup process? I launch the instances in us-east-1c I don't know if this has something to do with it.
Part of your issue may be that your are trying to allocate more memory to the Driver/Executors than you have access to.
yarn.nodemanager.resource.memory-mb controls the maximum sum of memory used by the containers on each node (cite)
You can look up this value for various instances here. r3.xlarge have access to 23,424M, but your trying to give your driver/executor 29G. Yarn is not launching Spark, ultimately, because it doesn't have access to enough memory to run your job.

"GC Overhead limit exceeded" on Hadoop .20 datanode

I've searched and not finding much information related to Hadoop Datanode processes dying due to GC overhead limit exceeded, so I thought I'd post a question.
We are running a test where we need to confirm our Hadoop cluster can handle having ~3million files stored on it (currently a 4 node cluster). We are using a 64bit JVM and we've allocated 8g to the namenode. However, as my test program writes more files to DFS, the datanodes start dying off with this error:
Exception in thread "DataNode: [/var/hadoop/data/hadoop/data]" java.lang.OutOfMemoryError: GC overhead limit exceeded
I saw some posts about some options (parallel GC?) I guess which can be set in hadoop-env.sh but I'm not too sure of the syntax and I'm kind of a newbie, so I didn't quite grok how it's done.
Thanks for any help here!
Try to increase the memory for datanode by using this: (hadoop restart required for this to work)
export HADOOP_DATANODE_OPTS="-Xmx10g"
This will set the heap to 10gb...you can increase as per your need.
You can also paste this at the start in $HADOOP_CONF_DIR/hadoop-env.sh file.
If you are running a map reduce job from command line, you can increase the heap using the parameter -D 'mapreduce.map.java.opts=-Xmx1024m' and/or -D 'mapreduce.reduce.java.opts=-Xmx1024m'. Example:
hadoop --config /etc/hadoop/conf jar /usr/lib/hbase-solr/tools/hbase-indexer-mr-*-job.jar --conf /etc/hbase/conf/hbase-site.xml -D 'mapreduce.map.java.opts=-Xmx1024m' --hbase-indexer-file $HOME/morphline-hbase-mapper.xml --zk-host 127.0.0.1/solr --collection hbase-collection1 --go-live --log4j /home/cloudera/morphlines/log4j.properties
Note that in some Cloudera documentation, they still use the old parameters mapred.child.java.opts, mapred.map.child.java.opts and mapred.reduce.child.java.opts. These parameters don't work anymore for Hadoop 2 (see What is the relation between 'mapreduce.map.memory.mb' and 'mapred.map.child.java.opts' in Apache Hadoop YARN?).
This post solved the issue for me.
So, the key is to "Prepend that environment variable" (1st time seen this linux command syntax :) )
HADOOP_CLIENT_OPTS="-Xmx10g" hadoop jar "your.jar" "source.dir" "target.dir"
GC overhead limit indicates that your (tiny) heap is full.
This is what often happens in MapReduce operations when u process a lot of data. Try this:
< property >
< name > mapred.child.java.opts < /name >
< value > -Xmx1024m -XX:-UseGCOverheadLimit < /value >
< /property >
Also, try these following things:
Use combiners, the reducers shouldn't get any lists longer than a small multiple of the number of maps
At the same time, you can generate heap dump from OOME and analyze with YourKit, etc adn analyze it

Resources