I set up a standalone cluster and wanted to find the fastest way to run my app. My machine has 12G of RAM. Here are the results of some configurations I tried.
Test A (took 15 mins)
1 worker node
spark.executor.memory = 8g
spark.driver.memory = 6g

Test B (took 8 mins)
2 worker nodes
spark.executor.memory = 4g
spark.driver.memory = 6g

Test C (took 6 mins)
2 worker nodes
spark.executor.memory = 6g
spark.driver.memory = 6g

Test D (took 6 mins)
3 worker nodes
spark.executor.memory = 4g
spark.driver.memory = 6g

Test E (took 6 mins)
3 worker nodes
spark.executor.memory = 6g
spark.driver.memory = 6g
Compared with Test A, Test B just added one more worker (with the same total executor memory, 4G * 2 = 8G), yet it made the app much faster. Why did that happen?
Tests C, D, and E tried to use more memory than the machine has (e.g. 2 workers * 6G executors + a 6G driver = 18G > 12G), but they still worked, and were even faster. Is the configured memory size just an upper limit?
Also, adding worker nodes does not keep speeding things up indefinitely. How should I find the best number of workers and the best executor memory size?
In Test B, your application was running in parallel on two CPUs, so the total time was almost halved.
Regarding memory: the memory setting defines an upper limit. Setting it too small will make your app perform more GC, and if the heap eventually fills up you'll get an OutOfMemoryError.
Regarding the most suitable configuration: well, it depends. If your task does not consume much RAM, configure Spark to have as many executors as you have CPU cores.
Otherwise, configure your executors with the amount of RAM your task actually requires.
Keep in mind that these limits are not constants, and may need to change as your application's requirements change.
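For example, on a standalone cluster a starting point might look something like this (a minimal sketch; the master URL, app file, and values are illustrative placeholders, not taken from the question):

spark-submit \
  --master spark://master:7077 \
  --driver-memory 4G \
  --executor-memory 4G \
  --total-executor-cores 8 \
  your_app.py

Here --total-executor-cores caps how many cores the app grabs across the whole standalone cluster, which is the knob to match against your CPU count.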
Let's say I have a Spark cluster whose nodes have 32GB of RAM. 1G of executor memory is enough for processing any of my data.
There is a Linux shell program (program) that I need to run for each partition. This would be easy if it were a simple Linux pipe script, but the program requires 10GB of memory for each run. My initial assumption was that I could just increase executor memory to 11GB, and Spark would use 1G per executor for the partition and leave the other 10G for the program running in the context of that executor. But that's not what happens: Spark uses all 11GB for 1G of Spark data, and then runs the 10GB program in whatever node memory is left.
So I changed the executor memory back to 1GB and decided to play with cores, instances, and YARN. I tried using:
--executor-memory 1G
--driver-memory 1G
--executor-cores 1
--num-executors 1
and for YARN: 32GB - (10GB * 2 running programs per node) = 12GB; 12GB - 4GB for the OS = 8GB = 8192MB:
"yarn.nodemanager.resource.memory-mb": "8192",
"yarn.scheduler.maximum-allocation-mb": "8192"
Because I'm using 1G per executor, Spark starts 8192 / (1024 * 1.18 of overhead) ~ 6 executors per node. Clearly, if each executor starts a 10GB program, there will not be enough RAM for all of them. So I increased the executor memory to 3GB to reduce the number of executors per node to 2.
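To make that arithmetic explicit (a small sketch; the overhead fractions are approximations inferred from the numbers above, not exact Spark defaults):

import math

# How many executor containers fit into a NodeManager's memory budget.
def executors_per_node(node_mb, executor_mb, overhead_fraction):
    container_mb = math.ceil(executor_mb * (1 + overhead_fraction))
    return node_mb // container_mb

print(executors_per_node(8192, 1024, 0.18))  # -> 6 executors at 1G each
print(executors_per_node(8192, 3072, 0.10))  # -> 2 executors at 3G each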
Now it runs 2 executors per node, but the program still fails with an out-of-memory error.
I added code to check the available memory right before starting the program:

import os

# Parse the "Mem:" row of `free -t -m` into its six columns (in MB).
total_memory, used_memory, free_memory, shared_memory, cache, available_memory = map(
    int, os.popen('free -t -m | grep Mem:').readlines()[0].split()[1:])
But even when available_memory is > 10G, the program starts and then runs out of memory in the middle (it runs for about 4 minutes).
Is there a way to allocate memory for an external script on the executor nodes? Or maybe there is a workaround for this?
I would appreciate any HELP!!!
Thanks in advance,
Orka
The answer is simple. My resource calculation was correct. All I changed was:
spark.dynamicAllocation.enabled=false
It is true by default, so Spark was trying to start as many executors as it could on each node.
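At submit time that looks something like this (a sketch; the app file is a placeholder, the other values are the ones from the question):

spark-submit \
  --conf spark.dynamicAllocation.enabled=false \
  --executor-memory 1G \
  --driver-memory 1G \
  --executor-cores 1 \
  --num-executors 1 \
  your_job.py

With dynamic allocation off, Spark asks YARN for exactly the requested number of executors, leaving the rest of each node's memory free for the external 10GB program.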
I have set up a 10-node HDP platform on AWS. Below is my configuration:
2 servers - NameNode and standby NameNode
7 data nodes, each with 40 vCPUs and 160 GB of memory
I am trying to calculate the number of executors when submitting Spark applications, and after going through different blogs I am confused about what this parameter actually means.
Looking at the blog below, it seems the num-executors value is the total number of executors across all nodes:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
But looking at the blog below, it seems that num-executors is per node or server:
https://blogs.aws.amazon.com/bigdata/post/Tx578UTQUV7LRP/Submitting-User-Applications-with-spark-submit
Can anyone please clarify and review the below:
Is the num-executors value per node, or the total number of executors across all the data nodes?
I am using the calculation below to come up with the core count, executor count, and memory per executor:
Number of cores <= 5 (assuming 5)
Num executors = (40-1)/5 = 7
Memory = (160-1)/7 = 22 GB
With the above calculation, which of these would be the correct way?
--master yarn-client --driver-memory 10G --executor-memory 22G --num-executors 7 --executor-cores 5
OR
--master yarn-client --driver-memory 10G --executor-memory 22G --num-executors 49 --executor-cores 5
Thanks,
Jayadeep
Can anyone please clarify and review the below:
Is the num-executors value per node, or the total number of executors across all the data nodes?
You need to first understand that the executors run on the NodeManagers (you can think of them like workers in Spark standalone). A number of containers (each bundling vCPU, memory, network, disk, etc.) equal to the number of executors specified will be allocated for your Spark application on YARN. These executor containers will run on multiple NodeManagers, and how they are distributed depends on the CapacityScheduler (the default scheduler in HDP).
So to sum up, the total number of executors is the number of resource containers you specify for your application to run.
Refer to this blog to understand better.
I am using the below calculation to come up with the core count, executor count and memory per executor
Number of cores <= 5 (assuming 5)
Num executors = (40 - 1) / 5 = 7
Memory = (160 - 1) / 7 = 22 GB
There is no rigid formula for calculating the number of executors; note, though, that by your own math 7 executors per node across 7 data nodes would mean --num-executors 49 cluster-wide. Instead, you can try enabling Dynamic Allocation in YARN for your application.
There is a hiccup with the CapacityScheduler: as far as I understand, by default it only lets you schedule by memory. You will first need to change the scheduling type to the DominantResourceCalculator. That will allow you to request combinations of memory and cores. Once you change that, you should be able to ask for both CPU and memory for your Spark application.
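The switch is the resource-calculator property of the CapacityScheduler, set in capacity-scheduler.xml (a sketch, shown here in the same key-value style as the YARN settings earlier on this page):

"yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator"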
As for the --num-executors flag, you can even keep it at a very high value such as 1000. YARN will still allocate only the number of containers that can actually be launched on each node. As your cluster resources increase, the number of containers attached to your application will increase. The number of containers you can launch per node is limited by the amount of resources allocated to the NodeManagers on those nodes.
I did some testing on an r3.8xlarge cluster; each instance has 32 cores and 244G of memory.
If I set spark.executor.cores=16 and spark.executor.memory=94G, there are 2 executors per instance, but when I set spark.executor.memory larger than 94G, there is only one executor per instance.
If I set spark.executor.cores=8 and spark.executor.memory=35G, there are 4 executors per instance, but when I set spark.executor.memory larger than 35G, there are no more than 3 executors per instance.
So my question is: how does the executor count follow from the memory setting? What's the formula? I thought Spark simply allocated 70% of the physical memory to the executors, but it seems I'm wrong...
In YARN mode you need to set the number of executors with num-executors and the executor memory with executor-memory. Here's an example:
spark-submit --master yarn-cluster --executor-memory 6G --num-executors 31 --executor-cores 32 example.jar Example
Now each executor requests a container from YARN with 6G plus memory overhead, and 1 core.
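As a rough sketch of that per-container request (assuming the common default of spark.yarn.executor.memoryOverhead = max(384 MB, 10% of executor memory); the exact fraction varies by Spark version):

def container_request_mb(executor_mb):
    # Default overhead in many Spark-on-YARN versions: max(384 MB, 10%).
    overhead_mb = max(384, executor_mb // 10)
    return executor_mb + overhead_mb

print(container_request_mb(6 * 1024))  # 6G executor -> 6758 MB requested

YARN then rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb; executors per node is simply how many such containers fit into yarn.nodemanager.resource.memory-mb.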
More info in the Spark documentation.
Regarding the behavior you're seeing: it sounds like the amount of memory available to your YARN NodeManagers is actually less than the 244GB available to the OS. To verify this, take a look at your YARN ResourceManager web UI, where you can see how much memory is available in total across the cluster. This is determined by yarn.nodemanager.resource.memory-mb in yarn-site.xml.
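For example, to let each NodeManager hand out most of a 244GB node while leaving headroom for the OS and daemons (an illustrative value, in the same key-value style used earlier on this page):

"yarn.nodemanager.resource.memory-mb": "233472"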
To answer your question about how the number of executors is determined: on YARN, if you're running Spark with spark.dynamicAllocation.enabled set to true, the number of executors is bounded below by spark.dynamicAllocation.minExecutors and above by spark.dynamicAllocation.maxExecutors.
Other than that, you're subject to YARN's resource allocation which, for most schedulers, will allocate resources to fill up the queue your job runs in.
In the situation where you have a totally unutilized cluster with one YARN queue and you submit a job to it, the Spark job will continue to add executors with the given number of cores and memory until the entire cluster is full (or there are not enough cores/memory for an additional executor to be allocated).
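Putting that together, bounding the executor count under dynamic allocation looks something like this (a sketch; the values and the app file are illustrative):

spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=20 \
  --executor-memory 6G \
  --executor-cores 8 \
  app.jar

Note that dynamic allocation requires the external shuffle service, hence the spark.shuffle.service.enabled flag.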
Below are my configurations:
Hadoop 2.x (1 master, 2 slaves)
yarn.nodemanager.resource.memory-mb = 7096
yarn.scheduler.maximum-allocation-mb = 2560
Spark - 1.5.1
spark/conf details on all three nodes:
spark.driver.memory 4g
spark.executor.memory 2g
spark.executor.instances 2
spark-sql> CREATE TABLE demo
USING org.apache.spark.sql.json
OPTIONS (path "...")
The path points to 32 GB of compressed data. It takes 25 minutes to create the table demo. Is there any way to optimize this and bring it down to a few minutes? Am I missing something here?
Usually each executor should correspond to one core of your CPU. Also note that the master is the least relevant of all your machines, because it only assigns tasks to the slaves, which do the actual data processing. Your setup would therefore be correct if your slaves were single-core machines, but in most cases you would do something like:
spark.driver.memory // this may be up to the whole memory of your master
spark.executor.instances // the sum of all CPU cores across your slaves
spark.executor.memory // (sum of all slaves' memory) / (spark.executor.instances)
That's the easiest formula, and it will work for the vast majority of Spark jobs.
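Plugged into a concrete setup as an illustration (assuming, say, two 4-core slaves with 16g of RAM each; these numbers are hypothetical, not from the question):

spark.driver.memory 8g // up to the master's RAM
spark.executor.instances 8 // 2 slaves * 4 cores
spark.executor.memory 4g // (2 * 16g) / 8 executors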
I am processing data with Spark. It works on a day's worth of data (40G) but fails with OOM on a week's worth:
import pyspark
import datetime
import operator

sc = pyspark.SparkContext()
sqc = pyspark.sql.SQLContext(sc)

# One RDD of (id, foo) pairs per hour; union them all, then sum foo by id.
sc.union([sqc.parquetFile(hour.strftime('.....'))
             .map(lambda row: (row.id, row.foo))
          for hour in myrange(beg, end, datetime.timedelta(0, 3600))]) \
  .reduceByKey(operator.add) \
  .saveAsTextFile("myoutput")
The number of distinct IDs is less than 10k.
Each ID is a smallish int.
The job fails because too many executors fail with OOM.
When the job succeeds (on small inputs), "myoutput" is about 100k.
What am I doing wrong?
I tried replacing saveAsTextFile with collect (because I actually want to do some slicing and dicing in Python before saving); there was no difference in behavior, the same failure. Is this to be expected?
I used to have reduce(lambda x, y: x.union(y), [sqc.parquetFile(...)...]) instead of sc.union - which is better? Does it make any difference?
The cluster has 25 nodes with 825GB RAM and 224 cores among them.
Invocation is spark-submit --master yarn --num-executors 50 --executor-memory 5G.
A single RDD has ~140 columns and covers one hour of data, so a week is a union of 168 (= 7 * 24) RDDs.
Spark very often suffers from out-of-memory errors when scaling out. In these cases, fine tuning should be done by the programmer. Also recheck your code to make sure you don't do anything excessive, such as collecting all the big data in the driver, which is very likely to exceed the memoryOverhead limit no matter how big you set it.
To understand what is happening, you should know when YARN decides to kill a container for exceeding memory limits: that happens when the container goes beyond the memoryOverhead limit.
In the Scheduler UI you can check the Event Timeline to see what happened with the containers. If YARN has killed a container, it will appear red, and when you hover over or click on it you will see a message like:
Container killed by YARN for exceeding memory limits. 16.9 GB of 16 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
So in that case, what you want to focus on are these configuration properties (the values are examples from my cluster):
# More executor memory overhead
spark.yarn.executor.memoryOverhead 4096
# More driver memory overhead
spark.yarn.driver.memoryOverhead 8192
# Max on my nodes
#spark.executor.cores 8
#spark.executor.memory 12G
# For the executors
spark.executor.cores 6
spark.executor.memory 8G
# For the driver
spark.driver.cores 6
spark.driver.memory 8G
The first thing to do is to increase the memoryOverhead.
In the driver or in the executors?
When you are looking at your cluster in the UI, you can click on the Attempt ID and check the Diagnostics Info, which should mention the ID of the container that was killed. If it is the same as your AM container, then it's the driver; otherwise it's the executor(s).
That didn't resolve the issue, now what?
You have to fine tune the number of cores and the heap memory you are providing. You see, PySpark does most of its work in off-heap memory, so you don't want to give too much space to the heap, since that would be wasted. But you don't want to give too little either, because then the garbage collector will have issues. Recall that these are JVMs.
As described here, a worker can host multiple executors, so the number of cores used affects how much memory every executor has; decreasing the number of cores might help.
I have written this up in more detail in "memoryOverhead issue in Spark" and "Spark – Container exited with a non-zero exit code 143", mostly so that I won't forget! Other options are spark.default.parallelism and/or spark.storage.memoryFraction which, based on my experience, didn't help.
You can pass configuration flags as sds mentioned, or like this:
spark-submit --properties-file my_properties
where "my_properties" is something like the attributes I list above.
For non numerical values, you could do this:
spark-submit --conf spark.executor.memory='4G'
It turned out that the problem was not with Spark, but with YARN.
The solution is to run spark with
spark-submit --conf spark.yarn.executor.memoryOverhead=1000
(or modify the YARN config).